

[Notes] Runtime Basics of the Python Virtual Machine

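Start by understanding a program's run-time environment. Personally, I find Chapter 7 of the Chinese edition of the Dragon Book quite accessible.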

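To maintain the run-time environment, Python defines the PyFrameObject structure (the counterpart of the "activation record" in the Dragon Book) and adopts the idea of the "access link" (the Dragon Book covers both "access links" and "displays") to resolve variable access across scopes.
In addition, PyFrameObject keeps three members that point to the three most frequently used symbol tables: the built-in, global, and local symbol tables: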
    PyObject *f_builtins;     /* builtin symbol table (PyDictObject) */
    PyObject *f_globals;      /* global symbol table (PyDictObject) */
    PyObject *f_locals;       /* local symbol table (any mapping) */

This avoids having to walk back along the "access link" whenever a global or built-in variable is accessed.
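The three tables can also be inspected from Python code. The following is a minimal sketch, assuming CPython (sys._getframe is a CPython implementation detail); the function name show_tables and the variable local_value are made up for illustration.

```python
import sys


def show_tables():
    local_value = 1
    frame = sys._getframe()  # the frame object currently executing show_tables
    # Copy the local table right away so the snapshot does not depend on
    # how a particular CPython version exposes f_locals.
    return frame.f_builtins, frame.f_globals, dict(frame.f_locals)


f_builtins, f_globals, f_locals = show_tables()

print("len" in f_builtins)                    # built-in names live in f_builtins
print(f_globals is show_tables.__globals__)   # module-level names live in f_globals
print(sorted(f_locals))                       # names bound inside the function
```

Each attribute mirrors the C member of the same name, so the frame reaches its global and built-in tables directly.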
PyFrameObject maintains the "access link" (also called the "symbol table chain" or "namespace chain") through the following member:
struct _frame *f_back;    /* previous frame, or NULL */
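The f_back member is visible from Python as the frame attribute of the same name, and for an ordinary function call it refers to the frame of the caller. A small sketch, assuming CPython; the helper names call_chain, inner and outer are hypothetical.

```python
import sys


def call_chain():
    """Collect function names by following f_back until it is None."""
    frame = sys._getframe()
    names = []
    while frame is not None:
        names.append(frame.f_code.co_name)
        frame = frame.f_back  # previous frame, or None at the bottom of the stack
    return names


def inner():
    return call_chain()


def outer():
    return inner()


print(outer()[:3])  # the three innermost frames: call_chain, inner, outer
```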

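Python's scoping follows a few rules.
Innermost nested scope rule: a name introduced by an assignment statement is visible in the scope containing that assignment and in every scope nested inside it, unless it is shadowed by another assignment that introduces the same name in a nested scope.
LEGB: symbol tables are searched in the order Local -> Enclosing Function -> Global -> Built-in.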
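The lookup order can be seen in a short example. This is only an illustration; every name in it is invented for the purpose.

```python
name = "global"


def outer():
    name = "enclosing"

    def inner_local():
        name = "local"   # found in the Local scope
        return name

    def inner_enclosing():
        return name      # not local, so the Enclosing function scope is used

    return inner_local(), inner_enclosing()


def read_global():
    return name          # not local, no enclosing function: Global scope


def read_builtin():
    return len("abc")    # len is not defined in L, E or G, so Built-in is used


print(outer(), read_global(), read_builtin())
```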

A common and classic case is UnboundLocalError, as in the following code:
x = 10
def foo():
    print(x)
    x += 1
foo()

This code fails with the following error:
UnboundLocalError: local variable 'x' referenced before assignment

The following two passages explain the problem:
This is because when you make an assignment to a variable in a scope, that variable becomes local to that scope and shadows any similarly named variable in the outer scope. Since the last statement in foo assigns a new value to x, the compiler recognizes it as a local variable. Consequently when the earlier print x attempts to print the uninitialized local variable and an error results.
URL:https://docs.python.org/faq/programming.html
Otherwise, all variables found outside of the innermost scope are read-only (an attempt to write to such a variable will simply create a new local variable in the innermost scope, leaving the identically named outer variable unchanged).
URL:https://docs.python.org/tutorial/classes.html
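The compiler's decision can be checked on the code object, and declaring the name global is one way to avoid the error. A minimal sketch, assuming CPython; broken, fixed and raises_unbound are illustrative names, and the wording of the error message varies between Python versions.

```python
x = 10


def broken():
    print(x)   # x is compiled as a local name because of the assignment below
    x += 1


def fixed():
    global x   # tell the compiler that x refers to the module-level name
    x += 1
    return x


def raises_unbound():
    try:
        broken()
    except UnboundLocalError:
        return True
    return False


print("x" in broken.__code__.co_varnames)  # x is listed among broken's locals
print("x" in fixed.__code__.co_varnames)   # x is not a local of fixed
print(raises_unbound())
```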

The second URL, the official documentation, also describes the LEGB rule:
  • the innermost scope, which is searched first, contains the local names
  • the scopes of any enclosing functions, which are searched starting with the nearest enclosing scope, contains non-local, but also non-global names
  • the next-to-last scope contains the current module’s global names
  • the outermost scope (searched last) is the namespace containing built-in names


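So far we have covered the frame object PyFrameObject, scopes, and symbol tables. Next comes the bigger picture: the run-time environment of the Python virtual machine.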

The virtual machine is implemented in the PyEval_EvalFrameEx function in ceval.c.
The function begins by declaring the following variables:
    register PyObject **stack_pointer;  /* Next free slot in value stack */
    register unsigned char *next_instr;
    register int opcode;        /* Current opcode */
    register int oparg;         /* Current opcode argument, if any */
    register enum why_code why; /* Reason for block stack unwind */

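The comments explain their meaning: for example, next_instr points to the next instruction, and why records the reason for unwinding the block stack.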

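PyEval_EvalFrameEx is a very large function, built around a huge switch statement whose many cases execute the individual instructions.
It provides several macros for accessing instructions: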
/* Code access macros */

#define INSTR_OFFSET()  ((int)(next_instr - first_instr))
#define NEXTOP()        (*next_instr++)
#define NEXTARG()       (next_instr += 2, (next_instr[-1]<<8) + next_instr[-2])
#define PEEKARG()       ((next_instr[2]<<8) + next_instr[1])
#define JUMPTO(x)       (next_instr = first_instr + (x))
#define JUMPBY(x)       (next_instr += (x))
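To make the macros concrete, here is a rough Python model of them. It assumes the bytecode layout these macros imply: a one-byte opcode, followed by a two-byte little-endian argument when the opcode is at least HAVE_ARGUMENT. The class InstructionReader, the function decode and the SAMPLE bytes are hypothetical, and the opcode numbers are meant to follow Python 2.7's opcode.h but should be treated as illustrative.

```python
HAVE_ARGUMENT = 90   # opcodes >= this value carry a 2-byte argument
RETURN_VALUE = 83
LOAD_CONST = 100


class InstructionReader:
    """Index-based stand-in for the first_instr / next_instr pointers."""

    def __init__(self, code):
        self.code = code
        self.next_instr = 0          # first_instr corresponds to index 0

    def instr_offset(self):          # INSTR_OFFSET()
        return self.next_instr

    def nextop(self):                # NEXTOP()
        op = self.code[self.next_instr]
        self.next_instr += 1
        return op

    def nextarg(self):               # NEXTARG(): low byte first, then high byte
        self.next_instr += 2
        return (self.code[self.next_instr - 1] << 8) + self.code[self.next_instr - 2]

    def peekarg(self):               # PEEKARG(): argument of the instruction at next_instr
        return (self.code[self.next_instr + 2] << 8) + self.code[self.next_instr + 1]

    def jumpto(self, x):             # JUMPTO(x): absolute jump
        self.next_instr = x

    def jumpby(self, x):             # JUMPBY(x): relative jump
        self.next_instr += x


def decode(code):
    """Return (offset, opcode, oparg) triples for a byte string."""
    reader = InstructionReader(code)
    result = []
    while reader.instr_offset() < len(code):
        offset = reader.instr_offset()
        opcode = reader.nextop()
        oparg = reader.nextarg() if opcode >= HAVE_ARGUMENT else None
        result.append((offset, opcode, oparg))
    return result


# Hand-made bytes: LOAD_CONST with argument 0x0102, then RETURN_VALUE.
SAMPLE = bytes([LOAD_CONST, 0x02, 0x01, RETURN_VALUE])
print(decode(SAMPLE))
```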

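The run-time environment also involves threads and processes. Python uses the operating system's native threads and processes, and abstracts and manages them with PyThreadState and PyInterpreterState objects.
At the start of PyEval_EvalFrameEx, a tstate variable is also defined and set to the current thread state: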
PyThreadState *tstate = PyThreadState_GET();

Next, the frame is stored in the thread state object:
    tstate->frame = f;
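From Python, the per-thread "current frame" can be observed with sys._current_frames(), which maps each thread identifier to the topmost frame active in that thread. The sketch below assumes CPython and Python 3 (threading.get_ident); the helper names are hypothetical.

```python
import sys
import threading


def top_frame_name_of_current_thread():
    frames = sys._current_frames()          # {thread id: topmost frame}
    frame = frames[threading.get_ident()]   # entry for the calling thread
    return frame.f_code.co_name


def run_in_thread():
    """Run the helper in a separate thread and report what it saw."""
    result = {}

    def worker():
        result["name"] = top_frame_name_of_current_thread()
        result["ident"] = threading.get_ident()

    thread = threading.Thread(target=worker)
    thread.start()
    thread.join()
    return result


print(top_frame_name_of_current_thread())
print(run_in_thread()["name"])
```

Every thread has its own entry, in line with the frame being recorded on the thread state rather than globally.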

Then several local variables are initialized from the frame:
    co = f->f_code;
    names = co->co_names;
    consts = co->co_consts;
    fastlocals = f->f_localsplus;
    freevars = f->f_localsplus + co->co_nlocals;
    first_instr = (unsigned char*) PyString_AS_STRING(co->co_code);
    next_instr = first_instr + f->f_lasti + 1;
    stack_pointer = f->f_stacktop;
    assert(stack_pointer != NULL);
    f->f_stacktop = NULL;       /* remains NULL unless yield suspends frame */
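Most of the fields read here are exposed as attributes of Python code and frame objects. A short sketch, assuming CPython and Python 3, where co_code is a bytes object rather than the Python 2 string that PyString_AS_STRING reads; the functions sample and current_offset are made up for illustration.

```python
import sys


def sample(a, b):
    total = a + b
    return len(str(total))


code = sample.__code__

print(code.co_names)      # global and built-in names referenced by the body
print(code.co_consts)     # constants used by the body
print(code.co_varnames)   # local variable names, arguments first
print(code.co_nlocals)    # number of local variables
print(type(code.co_code)) # raw bytecode


def current_offset():
    # f_lasti is the bytecode offset of the last instruction executed in this frame.
    return sys._getframe().f_lasti


print(current_offset())
```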

Finally, execution enters the switch/case:
        switch (opcode) {
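The overall shape of the loop (fetch an opcode, fetch its argument if it has one, dispatch on the opcode, operate on a value stack) can be imitated in a few lines of Python. This is a toy model, not CPython's implementation: eval_toy and PROGRAM are invented for illustration, only four instructions are handled, and the opcode numbers are meant to follow Python 2.7's opcode.h but should be treated as illustrative.

```python
HAVE_ARGUMENT = 90
BINARY_MULTIPLY = 20
BINARY_ADD = 23
RETURN_VALUE = 83
LOAD_CONST = 100


def eval_toy(code, consts):
    """Execute a tiny byte-encoded program on a value stack."""
    stack = []          # plays the role of the value stack behind stack_pointer
    next_instr = 0      # index of the next byte to fetch
    while True:
        opcode = code[next_instr]
        next_instr += 1
        oparg = None
        if opcode >= HAVE_ARGUMENT:
            # 2-byte little-endian argument, as in the NEXTARG macro
            oparg = code[next_instr] + (code[next_instr + 1] << 8)
            next_instr += 2

        # The if/elif chain stands in for the C switch statement.
        if opcode == LOAD_CONST:
            stack.append(consts[oparg])
        elif opcode == BINARY_ADD:
            right = stack.pop()
            left = stack.pop()
            stack.append(left + right)
        elif opcode == BINARY_MULTIPLY:
            right = stack.pop()
            left = stack.pop()
            stack.append(left * right)
        elif opcode == RETURN_VALUE:
            return stack.pop()
        else:
            raise ValueError("unknown opcode: %d" % opcode)


# (2 + 3) * 4, encoded by hand
PROGRAM = bytes([
    LOAD_CONST, 0, 0,
    LOAD_CONST, 1, 0,
    BINARY_ADD,
    LOAD_CONST, 2, 0,
    BINARY_MULTIPLY,
    RETURN_VALUE,
])
print(eval_toy(PROGRAM, (2, 3, 4)))  # → 20
```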

P.S. The "access link" is formed in PyFrame_New, where the new frame's f_back member is set to point to the frame member of the current thread state object.


JasonLee     2011.08.20     20:18

Last updated: 2017-04-02 22:16:33
