54 Commits

Author SHA1 Message Date
Ken Jin 6f7bb297db gh-146261: JIT: protect against function version changes (#146300)
Co-authored-by: Kumar Aditya <kumaraditya@python.org>
2026-04-13 02:23:47 +08:00
Sacul 83f33dccf2 gh-148374: Fix a bug in _Py_uop_sym_get_type (GH-148375) 2026-04-11 23:03:13 +08:00
Donghee Na 853dafe23a gh-148083: Prevent constant folding when lhs is container types (gh-148090) 2026-04-05 00:40:12 +09:00
Donghee Na 289f19adb0 gh-148083: Constant-fold _CONTAINS_OP_SET for frozenset (gh-148084) 2026-04-04 12:32:12 +00:00
Kumar Aditya e7bf8eac0f gh-131798: split recursion check to _CHECK_RECURSION_LIMIT and combine checks (GH-148070) 2026-04-04 17:23:03 +08:00
Donghee Na 5992238986 gh-146381: Constant-fold frozendict subscript lookups via REPLACE_OPCODE_IF_EVALUATES_PURE (gh-146490) 2026-03-28 09:48:53 +09:00
Brandon c8ee196030 gh-146388: Add null check for sym_new(ctx) in make_bottom (GH-146389) 2026-03-27 22:50:29 +08:00
reiden e36f8db7e5 gh-143414: Implement unique reference tracking for JIT, optimize unpacking of such tuples (GH-144300)
Co-authored-by: Ken Jin <kenjin4096@gmail.com>
2026-03-23 00:57:23 +08:00
Sacul 1ceb1fb284 gh-146261: Fix bug in _Py_uop_sym_set_func_version (GH-146291) 2026-03-23 00:11:37 +08:00
Ken Jin acfb4528de gh-146099: Optimize _GUARD_CODE_VERSION+IP via function version symbols (GH-146101) 2026-03-20 12:26:41 +00:00
Donghee Na 6908372fb8 gh-145214: Narrow _GUARD_TOS_ANY_{SET,DICT} by using probable type (gh-145215) 2026-03-03 09:58:38 +09:00
Mark Shannon 3f37b94c73 GH-144651: Optimize the new uops added when recording values during tracing. (GH-144948)
* Handle dependencies in the optimizer, not the tracer
* Strengthen some checks to avoid relying on optimizer for correctness
2026-02-19 11:52:57 +00:00
Mark Shannon b53fc7caa6 GH-144179: Use recorded values to make optimizer more robust (GH-144437)
* Add three new symbol kinds
* Do not smuggle code object in _PUSH_FRAME operand
* Fix small bug in predicate analysis
2026-02-05 08:58:41 +00:00
Hai Zhu ebbb2ca81f gh-144145: Revert PR#144122 for performance and potential bugs. (GH-144391)
Revert "gh-144145: Track nullness of properties in the Tier 2 JIT optimizer (GH-144122)"

This reverts commit 1dc12b2883.
2026-02-02 14:09:54 +00:00
reiden a7048327ed gh-144280: Add missing predicate symbol to case-switch (GH-144298) 2026-01-30 16:43:27 +00:00
Hai Zhu 1dc12b2883 gh-144145: Track nullness of properties in the Tier 2 JIT optimizer (GH-144122) 2026-01-30 15:25:19 +00:00
reiden 6d972e0104 gh-130415: Narrow types to constants in branches involving specialized comparisons with a constant (GH-144150) 2026-01-24 10:02:08 +00:00
reiden 0b08438ea6 gh-130415: Narrowing to constants in branches involving is comparisons with a constant (GH-143895) 2026-01-22 09:37:45 +00:00
Hai Zhu 61ec66acd5 gh-143946: Show JitOptSymbol on abstract stack when set PYTHON_OPT_DEBUG > 4 (GH-143957) 2026-01-17 15:20:35 +00:00
Ken Jin f7a03bb944 gh-134584: JIT: Remove redundant refcount from STORE_FAST (GH-143336) 2026-01-02 18:22:21 +00:00
Ken Jin 4fa80ce74c gh-139109: A new tracing JIT compiler frontend for CPython (GH-140310)
This PR changes the current JIT model from trace projection to trace recording. Benchmarking: better pyperformance (about 1.7% overall) geomean versus current https://raw.githubusercontent.com/facebookexperimental/free-threading-benchmarking/refs/heads/main/results/bm-20251108-3.15.0a1%2B-7e2bc1d-JIT/bm-20251108-vultr-x86_64-Fidget%252dSpinner-tracing_jit-3.15.0a1%2B-7e2bc1d-vs-base.svg, 100% faster Richards on the most improved benchmark versus the current JIT. Slowdown of about 10-15% on the worst benchmark versus the current JIT. **Note: the fastest version isn't the one merged, as it relies on fixing bugs in the specializing interpreter, which is left to another PR**. The speedup in the merged version is about 1.1%. https://raw.githubusercontent.com/facebookexperimental/free-threading-benchmarking/refs/heads/main/results/bm-20251112-3.15.0a1%2B-f8a764a-JIT/bm-20251112-vultr-x86_64-Fidget%252dSpinner-tracing_jit-3.15.0a1%2B-f8a764a-vs-base.svg

Stats: 50% more uops executed, 30% more traces entered the last time we ran them. It also suggests our trace lengths for a real trace recording JIT are too short, as a lot of trace too long aborts https://github.com/facebookexperimental/free-threading-benchmarking/blob/main/results/bm-20251023-3.15.0a1%2B-eb73378-CLANG%2CJIT/bm-20251023-vultr-x86_64-Fidget%252dSpinner-tracing_jit-3.15.0a1%2B-eb73378-pystats-vs-base.md .

This new JIT frontend is already able to record/execute significantly more instructions than the previous JIT frontend. In this PR, we are now able to record through custom dunders, simple object creation, generators, etc. None of these were done by the old JIT frontend. Some custom dunders uops were discovered to be broken as part of this work gh-140277

The optimizer stack space check is disabled, as it's no longer valid to deal with underflow.

Pros:
* Ignoring the generated tracer code as it's automatically created, this is only additional 1k lines of code. The maintenance burden is handled by the DSL and code generator.
* `optimizer.c` is now significantly simpler, as we don't have to do strange things to recover the bytecode from a trace.
* The new JIT frontend is able to handle a lot more control-flow than the old one.
* Tracing is very low overhead. We use the tail calling interpreter/computed goto interpreter to switch between tracing mode and non-tracing mode. I call this mechanism dual dispatch, as we have two dispatch tables dispatching to each other. Specialization is still enabled while tracing.
* Better handling of polymorphism. We leverage the specializing interpreter for this.

Cons:
* (For now) requires tail calling interpreter or computed gotos. This means no Windows JIT for now :(. Not to fret, tail calling is coming soon to Windows though https://github.com/python/cpython/pull/139962

Design:
* After each instruction, the `record_previous_inst` function/label is executed. This does as the name suggests.
* The tracing interpreter lowers bytecode to uops directly so that it can obtain "fresh" values at the point of lowering.
* The tracing version behaves nearly identical to the normal interpreter, in fact it even has specialization! This allows it to run without much of a slowdown when tracing. The actual cost of tracing is only a function call and writes to memory.
* The tracing interpreter uses the specializing interpreter's deopt to naturally form the side exit chains. This allows it to side exit chain effectively, without repeating much code. We force a re-specializing when tracing a deopt.
* The tracing interpreter can even handle goto errors/exceptions, but I chose to disable them for now as it's not tested.
* Because we do not share interpreter dispatch, there is should be no significant slowdown to the original specializing interpreter on tailcall and computed got with JIT disabled. With JIT enabled, there might be a slowdown in the form of the JIT trying to trace.
* Things that could have dynamic instruction pointer effects are guarded on. The guard deopts to a new instruction --- `_DYNAMIC_EXIT`.
2025-11-13 18:08:32 +00:00
Victor Stinner 166cdaa6fb gh-111489: Remove _PyTuple_FromArray() alias (#139973)
Replace _PyTuple_FromArray() with PyTuple_FromArray().
Remove pycore_tuple.h includes.
2025-10-11 22:58:14 +02:00
Mark Shannon 3b83257366 GH-138378: Move globals-to-consts pass into main optimizer pass (GH-138379) 2025-09-18 10:09:59 +01:00
AN Long 1ff2cbbac8 gh-137136: Suppress build warnings when build on Windows with --experimental-jit-interpreter (GH-137137) 2025-09-03 15:42:26 +01:00
Ken Jin 7fda8b66de gh-137728 gh-137762: Fix bugs in the JIT with many local variables (GH-137764) 2025-08-20 22:53:54 +08:00
Ken Jin ff7b5d44a0 gh-132732: Fix up pure types in JIT (GH-136050)
Fix up pure types in JIT
2025-06-28 18:30:30 +08:00
Ken Jin c419af9e27 gh-132732: JIT: Only allow compact ints in pure evaluation (GH-136040) 2025-06-28 00:18:44 +08:00
Ken Jin 695ab61351 gh-132732: Automatically constant evaluate pure operations (GH-132733)
This adds a "macro" to the optimizer DSL called "REPLACE_OPCODE_IF_EVALUATES_PURE", which allows automatically constant evaluating a bytecode body if certain inputs have no side effects upon evaluations (such as ints, strings, and floats).


Co-authored-by: Tomas R. <tomas.roun8@gmail.com>
2025-06-27 19:37:44 +08:00
Ken Jin 569fc6870f gh-134584: Specialize POP_TOP by reference and type in JIT (GH-135761) 2025-06-24 00:57:14 +08:00
Ken Jin 0243260284 gh-135379: Move PyLong_CheckCompact to private header and rename it (GH-135707) 2025-06-19 13:09:09 +00:00
Mark Shannon 9731dd2c8d GH-135379: Specialize int operations for compact ints only (GH-135668) 2025-06-19 11:10:29 +01:00
Ken Jin fba5dded6d gh-134584: Decref elimination for float ops in the JIT (GH-134588)
This PR adds a PyJitRef API to the JIT's optimizer that mimics the _PyStackRef API. This allows it to track references and their stack lifetimes properly. Thus opening up the doorway to refcount elimination in the JIT.
2025-06-17 23:25:53 +08:00
Brandt Bucher ec736e7dae GH-131798: Optimize cached class attributes and methods in the JIT (GH-134403) 2025-05-22 11:15:03 -04:00
Brandt Bucher 2f0570caf4 GH-131798: Narrow types more aggressively in the JIT (GH-134373) 2025-05-20 18:09:51 -04:00
Bénédikt Tran 883c2f682b GH-131331: Rename "not" to "invert" (GH-131334) 2025-03-20 16:59:41 -07:00
Mark Shannon 7ebd71ee14 GH-131498: Remove conditional stack effects (GH-131499)
* Adds some missing #includes
2025-03-20 15:39:38 +00:00
Mark Shannon a45f25361d GH-131238: More refactoring of core header files (GH-131351)
Adds new pycore_stats.h header file to help break dependencies involving the pycore_code.h header.
2025-03-17 14:41:05 +00:00
Amit Lavon 691354ccb0 GH-130415: Narrow str to "" based on boolean tests (GH-130476) 2025-03-04 13:20:17 -08:00
Klaus117 c989e74446 GH-130415: Narrow int to 0 based on boolean tests (GH-130772) 2025-03-04 12:44:09 -08:00
Brandt Bucher 7afa476874 GH-130415: Use boolean guards to narrow types to values in the JIT (GH-130659) 2025-03-02 13:21:34 -08:00
Mark Shannon f0f7b978be GH-128939: Refactor JIT optimize structs (GH-128940) 2025-01-20 15:49:15 +00:00
Victor Stinner 9e4a81f00f gh-120642: Move private PyCode APIs to the internal C API (#120643)
* Move _Py_CODEUNIT and related functions to pycore_code.h.
* Move _Py_BackoffCounter to pycore_backoff.h.
* Move Include/cpython/optimizer.h content to pycore_optimizer.h.
* Remove Include/cpython/optimizer.h.
* Remove PyUnstable_Replace_Executor().

Rename functions:

* PyUnstable_GetExecutor() => _Py_GetExecutor()
* PyUnstable_GetOptimizer() => _Py_GetOptimizer()
* PyUnstable_SetOptimizer() => _Py_SetTier2Optimizer()
* PyUnstable_Optimizer_NewCounter() => _PyOptimizer_NewCounter()
* PyUnstable_Optimizer_NewUOpOptimizer() => _PyOptimizer_NewUOpOptimizer()
2024-06-26 13:54:03 +02:00
Saul Shanabrook 55402d3232 gh-119258: Eliminate Type Guards in Tier 2 Optimizer with Watcher (GH-119365)
Co-authored-by: parmeggiani <parmeggiani@spaziodati.eu>
Co-authored-by: dpdani <git@danieleparmeggiani.me>
Co-authored-by: blurb-it[bot] <43283697+blurb-it[bot]@users.noreply.github.com>
Co-authored-by: Brandt Bucher <brandtbucher@microsoft.com>
Co-authored-by: Ken Jin <kenjin@python.org>
2024-06-08 17:41:45 +08:00
Mark Shannon f5c6b9977a GH-118910: Less boilerplate in the tier 2 optimizer (#118913) 2024-05-10 17:43:23 +01:00
Mark Shannon 72867c962c GH-118095: Unify the behavior of tier 2 FOR_ITER branch micro-ops (GH-118420)
* Target _FOR_ITER_TIER_TWO at POP_TOP following the matching END_FOR

* Modify _GUARD_NOT_EXHAUSTED_RANGE, _GUARD_NOT_EXHAUSTED_LIST and _GUARD_NOT_EXHAUSTED_TUPLE so that they also target the POP_TOP following the matching END_FOR
2024-05-02 16:17:59 +01:00
Guido van Rossum 7d83f7bcc4 gh-118335: Configure Tier 2 interpreter at build time (#118339)
The code for Tier 2 is now only compiled when configured
with `--enable-experimental-jit[=yes|interpreter]`.

We drop support for `PYTHON_UOPS` and -`Xuops`,
but you can disable the interpreter or JIT
at runtime by setting `PYTHON_JIT=0`.
You can also build it without enabling it by default
using `--enable-experimental-jit=yes-off`;
enable with `PYTHON_JIT=1`.

On Windows, the `build.bat` script supports
`--experimental-jit`, `--experimental-jit-off`,
`--experimental-interpreter`.

In the C code, `_Py_JIT` is defined as before
when the JIT is enabled; the new variable
`_Py_TIER2` is defined when the JIT *or* the
interpreter is enabled. It is actually a bitmask:
1: JIT; 2: default-off; 4: interpreter.
2024-04-30 18:26:34 -07:00
Mark Shannon a6647d16ab GH-115480: Reduce guard strength for binary ops when type of one operand is known already (GH-118050) 2024-04-22 13:34:06 +01:00
Mark Shannon 0c81ce1360 GH-115819: Eliminate Boolean guards when value is known (GH-116355) 2024-03-05 15:06:00 +00:00
Mark Shannon cbf3d38cbe GH-115685: Optimize TO_BOOL and variants based on truthiness of input. (GH-116311) 2024-03-05 11:23:46 +00:00
Guido van Rossum 0656509033 gh-116088: Insert bottom checks after all sym_set_...() calls (#116089)
This changes the `sym_set_...()` functions to return a `bool` which is `false`
when the symbol is `bottom` after the operation.

All calls to such functions now check this result and go to `hit_bottom`,
a special error label that prints a different message and then reports
that it wasn't able to optimize the trace. No executor will be produced
in this case.
2024-02-29 18:55:29 +00:00