From the pystats doc (pystats-2023-02-05-python-5a2b984.md), I found that the pair LOAD_CONST + RETURN_VALUE occurs very frequently (because a function returns None by default).
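To see where this pair comes from, here is a minimal sketch using the standard `dis` module. The function names `implicit_return`, `returns_constant`, and `opnames` are made up for illustration, and the exact opcodes printed depend on the interpreter version: builds without a fused instruction should show LOAD_CONST followed by RETURN_VALUE, while builds that include RETURN_CONST should show that single instruction instead.

```python
import dis


def implicit_return():
    # No explicit return statement: the compiler still emits "return None".
    pass


def returns_constant():
    return 10000


def opnames(func):
    """Return the opcode names of func's bytecode, in order."""
    return [ins.opname for ins in dis.get_instructions(func)]


for func in (implicit_return, returns_constant):
    print(func.__name__, opnames(func))
```

Both functions end by returning a constant, so both contribute to the LOAD_CONST → RETURN_VALUE counts in the tables below.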
Successors for LOAD_CONST:

| Successors | Count | Percentage |
| --- | --- | --- |
| RETURN_VALUE | 969,173,651 | 21.8% |
| BINARY_OP_ADD_INT | 418,647,997 | 9.4% |
| LOAD_CONST | 403,185,774 | 9.1% |
| COMPARE_AND_BRANCH_INT | 314,633,792 | 7.1% |
| STORE_FAST | 295,563,626 | 6.6% |
And predecessors for RETURN_VALUE:

| Predecessors | Count | Percentage |
| --- | --- | --- |
| LOAD_CONST | 969,173,651 | 29.9% |
| LOAD_FAST | 505,933,343 | 15.6% |
| RETURN_VALUE | 382,698,373 | 11.8% |
| BUILD_TUPLE | 328,532,240 | 10.1% |
| COMPARE_OP | 107,210,803 | 3.3% |
This means that if we add a RETURN_CONST instruction, we can reduce the number of RETURN_VALUE instructions executed by about 30% and the number of LOAD_CONST instructions executed by about 20%.
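The arithmetic behind that estimate can be sketched as follows. The pair count and the two percentages are taken from the tables above; the totals are back-computed from those rounded percentages, so they are approximations, and the variable names are only illustrative.

```python
# Number of times LOAD_CONST was immediately followed by RETURN_VALUE.
pair_count = 969_173_651

# Share of that pair among all LOAD_CONST successors and
# among all RETURN_VALUE predecessors (rounded values from the tables).
share_of_load_const = 0.218
share_of_return_value = 0.299

# Approximate totals implied by the rounded percentages.
load_const_total = pair_count / share_of_load_const
return_value_total = pair_count / share_of_return_value

print(f"LOAD_CONST executed:   ~{load_const_total:,.0f}")
print(f"RETURN_VALUE executed: ~{return_value_total:,.0f}")

# Fusing each pair into one RETURN_CONST removes one LOAD_CONST and one
# RETURN_VALUE and adds one RETURN_CONST: a net saving of one instruction
# dispatch per pair.
print(f"Dispatches saved:      {pair_count:,}")
```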
```
./bin/python3 -m pyperf timeit -w 3 --compare-to ../python-3.12/bin/python3 -s "
def test():
    return 10000
" "test()"
/python-3.12/bin/python3: ..................... 27.0 ns +- 0.3 ns
/cpython/bin/python3: ..................... 25.0 ns +- 0.5 ns
Mean +- std dev: [/python-3.12/bin/python3] 27.0 ns +- 0.3 ns -> [/cpython/bin/python3] 25.0 ns +- 0.5 ns: 1.08x faster
```
```
./bin/python3 -m pyperf timeit -w 3 --compare-to ../python-3.12/bin/python3 -s "
def test():
    return None
" "test()"
/python-3.12/bin/python3: ..................... 27.2 ns +- 1.3 ns
/cpython/bin/python3: ..................... 25.1 ns +- 0.6 ns
Mean +- std dev: [/python-3.12/bin/python3] 27.2 ns +- 1.3 ns -> [/cpython/bin/python3] 25.1 ns +- 0.6 ns: 1.08x faster
```
The microbenchmarks show an improvement of roughly 8% (1.08x faster). Since the measurement also includes function-call overhead, I think a gain of around 10% is realistic. That is not very large, but it should be an optimization without adverse effects.
Linked PRs