Let's walk through how Python translates your code from plain text to bytecode, step by step. We'll use this example:
    a = 6
    b = 4
    print(a + b)
This is the code you write in a .py file. It's just text, readable by humans.
Python first breaks your code into tokens—the smallest meaningful elements (like names, numbers, and operators).
Example tokens:
| Token Type | Token Value |
|---|---|
| NAME | 'a' |
| OP | '=' |
| NUMBER | '6' |
| NEWLINE | '\n' |
| NAME | 'b' |
| OP | '=' |
| NUMBER | '4' |
| NEWLINE | '\n' |
| NAME | 'print' |
| OP | '(' |
| NAME | 'a' |
| OP | '+' |
| NAME | 'b' |
| OP | ')' |
| NEWLINE | '\n' |
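The table above can be reproduced with the standard library's `tokenize` module. The snippet below is a minimal sketch: `SOURCE` and the helper name `list_tokens` are chosen here for illustration and are not part of any Python API. Note that `tokenize` also emits a final `ENDMARKER` token, which the table leaves out.

```python
import io
import tokenize

# The three-line example program, held as a plain string.
SOURCE = "a = 6\nb = 4\nprint(a + b)\n"


def list_tokens(source: str) -> list[tuple[str, str]]:
    """Return (token type name, token text) pairs for a source string.

    Illustrative helper, not a standard-library function.
    """
    # generate_tokens() expects a readline-style callable that returns str.
    readline = io.StringIO(source).readline
    return [
        (tokenize.tok_name[tok.type], tok.string)
        for tok in tokenize.generate_tokens(readline)
    ]


if __name__ == "__main__":
    for type_name, text in list_tokens(SOURCE):
        print(f"{type_name:<10} {text!r}")
```

Running it prints one row per token, in the same order as the table. You can get a similar listing from the command line with `python -m tokenize yourfile.py`.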
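Next, Python's parser organizes the tokens into a tree that represents the structure of your code. This is called the abstract syntax tree (AST).
AST Outline:
    Module
      Assign (target: Name 'a', value: Constant 6)
      Assign (target: Name 'b', value: Constant 4)
      Expr
        Call (func: Name 'print')
          BinOp (left: Name 'a', op: Add, right: Name 'b')
This tree shows how each part of your code relates to the others: the addition is nested inside the call to print because its result is the call's argument.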
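To inspect the tree yourself, the standard `ast` module can parse the same source and dump the result. This is a small sketch; `SOURCE` and `top_level_nodes` are names made up for this example, and the `indent` argument to `ast.dump` assumes Python 3.9 or newer.

```python
import ast

SOURCE = "a = 6\nb = 4\nprint(a + b)\n"


def top_level_nodes(source: str) -> list[str]:
    """Return the class names of the top-level statements in the AST.

    Illustrative helper, not part of the ast module.
    """
    tree = ast.parse(source)
    return [type(stmt).__name__ for stmt in tree.body]


tree = ast.parse(SOURCE)

# Full nested view of the tree (indent= needs Python 3.9+).
print(ast.dump(tree, indent=2))

# Short view: one node type per top-level statement.
print(top_level_nodes(SOURCE))
```

The third statement is an `Expr` node wrapping a `Call`, and the `BinOp` for `a + b` sits inside that call's argument list.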
Python then compiles the AST into bytecode—a set of instructions for the Python Virtual Machine (PVM).
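Example bytecode (simplified output from the dis module; the exact opcodes and offsets vary between Python versions, and PRECALL, for instance, appears only in Python 3.11):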
    1      0 LOAD_CONST    0 (6)
           2 STORE_NAME    0 (a)
    2      4 LOAD_CONST    1 (4)
           6 STORE_NAME    1 (b)
    3      8 LOAD_NAME     2 (print)
          10 LOAD_NAME     0 (a)
          12 LOAD_NAME     1 (b)
          14 BINARY_OP     0 (+)
          16 PRECALL       1
          18 CALL          1
          20 POP_TOP
          22 LOAD_CONST    2 (None)
          24 RETURN_VALUE
Each line is a low-level instruction that the PVM can execute.
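A listing like the one above can be generated with the built-in `compile()` function and the `dis` module. The sketch below assumes the source is held in a string named `SOURCE` and uses the placeholder filename `"<example>"`; both are choices made for this example. The instructions you see depend on your Python version, so expect differences from the listing above.

```python
import dis

SOURCE = "a = 6\nb = 4\nprint(a + b)\n"

# compile() runs the tokenizer, parser, and bytecode compiler in one call
# and returns a code object. "<example>" is only a label used in tracebacks.
code = compile(SOURCE, "<example>", "exec")

# Human-readable listing: source line, offset, opcode name, argument.
dis.dis(code)

# The same instructions as data, useful for inspecting them in code.
opnames = [ins.opname for ins in dis.get_instructions(code)]
print(opnames)

# Names referenced by the bytecode; STORE_NAME/LOAD_NAME index into this.
print(code.co_names)
```

The numeric argument after `STORE_NAME` or `LOAD_NAME` is an index into `code.co_names`, which is why the listing shows the resolved name in parentheses.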
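The Python Virtual Machine runs the bytecode one instruction at a time, using a value stack: LOAD_NAME pushes a and b onto the stack, BINARY_OP pops both and pushes their sum, and CALL passes that result to print, which writes 10.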
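The last two stages can be driven by hand: `compile()` accepts an AST as well as a string, and `exec()` hands the resulting code object to the virtual machine. The following is a sketch under a few assumptions: `run_source` is a hypothetical helper written for this example, output is captured with `contextlib.redirect_stdout` so it can be checked, and `exec()` should only ever be used on source you trust.

```python
import ast
import contextlib
import io

SOURCE = "a = 6\nb = 4\nprint(a + b)\n"


def run_source(source: str) -> tuple[str, dict]:
    """Parse, compile, and execute source; return (stdout text, namespace).

    Illustrative helper, not a standard-library function.
    """
    tree = ast.parse(source, filename="<example>")   # text -> AST
    code = compile(tree, "<example>", "exec")        # AST -> bytecode
    namespace: dict = {}                              # globals for the run
    buffer = io.StringIO()
    with contextlib.redirect_stdout(buffer):
        exec(code, namespace)                         # bytecode -> execution
    return buffer.getvalue(), namespace


output, namespace = run_source(SOURCE)
print(output, end="")                 # prints 10
print(namespace["a"], namespace["b"]) # prints 6 4
```

After the run, the variables assigned by `STORE_NAME` are ordinary entries in the namespace dictionary.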
| Step | What Happens | Example Output/Structure |
|---|---|---|
| Source Code | Human-readable text | a = 6 |
| Tokenization | Break into tokens | NAME, OP, NUMBER, ... |
| AST | Build tree of code structure | Assign, BinOp, Call, ... |
| Bytecode | Compile to VM instructions | LOAD_CONST, STORE_NAME, ... |
| Execution | Python VM runs the bytecode | Output: 10 |
In short: Python takes your code through tokenization, parsing (AST), compilation to bytecode, and finally execution. Each stage hands a more structured form of the program to the next, so the virtual machine only has to execute simple, low-level instructions.