Commit Graph

1000 Commits

Author SHA1 Message Date
Miss Islington (bot) f5231469b5 [3.15] gh-148829: Make sentinels' repr and module customizable (GH-149654) (#150092)
Implementation of python/peps#4968.
(cherry picked from commit 08218030a5)

Co-authored-by: Jelle Zijlstra <jelle.zijlstra@gmail.com>
2026-05-22 07:44:34 -07:00
Miss Islington (bot) 63a4007d25 [3.15] gh-149685: Use the _Py prefix for private C macros (GH-149686) (GH-149790)
(cherry picked from commit 125f26358a)

Co-authored-by: Petr Viktorin <encukou@gmail.com>
2026-05-13 23:29:08 +02:00
Miss Islington (bot) a5f77a13fd [3.15] gh-148829: Add PySentinel_CheckExact() (GH-149725) (#149766)
gh-148829: Add PySentinel_CheckExact() (GH-149725)
(cherry picked from commit 94df62542c)

Co-authored-by: scoder <stefan_ml@behnel.de>
2026-05-13 10:39:38 +00:00
Peter Bierma 2b7c28a440 gh-149101: Implement PEP 788 (GH-149116)
Co-authored-by: Petr Viktorin <encukou@gmail.com>
Co-authored-by: Sam Gross <colesbury@gmail.com>
2026-05-06 17:39:30 -04:00
Alex Malyshev 646853df13 gh-145559: Add PyUnstable_DumpTraceback() and PyUnstable_DumpTracebackThreads() (#148145)
Co-authored-by: Petr Viktorin <encukou@gmail.com>
Co-authored-by: Victor Stinner <vstinner@python.org>
2026-05-06 15:01:12 +00:00
Mark Shannon 70bd1c2dd2 GH-143732: SEND specialization (GH-148963)
* SEND specialization. Adds 2 new specialized instructions:

* SEND_VIRTUAL: for sends to virtual iterators e.g lists and tuples
* SEND_ASYNC_GEN: for sends to async generators

Tweak FOR_ITER_VIRTUAL so that SEND_VIRTUAL and FOR_ITER_VIRTUAL use equivalent guards
2026-05-05 15:19:16 +01:00
Petr Viktorin 6ca5cdba18 gh-149225: Expose Py_CriticalSection in Stable ABI (GH-149227) 2026-05-04 17:32:17 +02:00
Victor Stinner ce51c18818 gh-146063: Add PyObject_CallFinalizerFromDealloc() to the limited C API (#146172)
Co-authored-by: Petr Viktorin <encukou@gmail.com>
2026-05-04 15:37:45 +02:00
Diego Russo c7b7ca2cd5 GH-126910: Add gdb support for unwinding JIT frames (#146071)
Co-authored-by: Pablo Galindo Salgado <pablogsal@gmail.com>
2026-05-02 13:42:03 +00:00
Victor Stinner f0e90d78eb gh-148829: Move sentinelobject.h to Include/cpython/ (#149186)
This C API is not part of the limited C API, so move it to the
CPython C API.
2026-04-30 16:57:29 +02:00
scoder db0ee44b93 gh-142186: Revert the unintended value change in the PY_MONITORING_EVENT_* values from gh-146182 (gh-148955)
https://github.com/python/cpython/pull/146182 left an unintended change in the `PY_MONITORING_*` macro values. This change reverts that part to avoid a user visible impact.
2026-04-25 09:05:03 +02:00
Hai Zhu 618b726d68 gh-146073: Add fitness/exit quality mechanism for JIT trace frontend (GH-148089)
* Replaces ad-hoc logic for ending traces with a simple inequality: `fitness < exit_quality`
* Fitness starts high and is reduced for branches, backward edges, calls and trace length
* Exit quality reflect how good a spot that instruction is to end a trace. Closing a loop is very, specializable instructions are very low and the others in between.
2026-04-24 10:37:01 +01:00
Gabriele N. Tornetta 858e69eab0 gh-142186: Allow all PEP-669 events to be per-code object and disableable (GH-146182)
* Make the `PY_UNWIND` monitoring event available as a code-local
event to allow trapping on function exit events when an exception
bubbles up. This complements the PY_RETURN event by allowing to
catch any function exit event.

* Allow `PY_UNWIND`  to be `DISABLE`d; disabling it disables the event for the whole code object.

* Do the above for `PY_THROW`, `RAISE`, `EXCEPTION_HANDLED`, and `RERAISE` events.
2026-04-22 09:08:23 +01:00
Dino Viehland c0af5c024b gh-146031: Allow keeping specialization enabled when specifying eval frame function (#146032)
Allow keeping specialization enabled when specifying eval frame function
2026-04-16 09:44:26 -07:00
Mark Shannon 600f4dbd54 GH-145668: Add FOR_ITER specialization for virtual iterators. Specialize GET_ITER. (GH-147967)
* Add FOR_ITER_VIRTUAL to specialize FOR_ITER for virtual iterators
* Add GET_ITER_SELF to specialize GET_ITER for iterators (including generators)
* Add GET_ITER_VIRTUAL to specialize GET_ITER for iterables as virtual iterators
* Add new (internal) _tp_iteritem function slot to PyTypeObject
* Put limited RESUME at start of genexpr for free-threading. Fix up exception handling in genexpr
2026-04-16 15:22:22 +01:00
Petr Viktorin 8923ca418c gh-145921: Add "_DuringGC" functions for tp_traverse (GH-145925)
There are newly documented restrictions on tp_traverse:

    The traversal function must not have any side effects.
    It must not modify the reference counts of any Python
    objects nor create or destroy any Python objects.

* Add several functions that are guaranteed side-effect-free,
  with a _DuringGC suffix.
* Use these in ctypes
* Consolidate tp_traverse docs in gcsupport.rst, moving unique
  content from typeobj.rst there

Co-authored-by: Lysandros Nikolaou <lisandrosnik@gmail.com>
Co-authored-by: Victor Stinner <vstinner@python.org>
2026-04-08 09:15:11 +02:00
Petr Viktorin fbc1a5b076 gh-146636: abi3t: Define Py_GIL_DISABLED but do not use it (GH-148142)
When compiling for abi3t, define Py_GIL_DISABLED, so that users who
check it to enable additional locking aren't broken.

But also avoid using Py_GIL_DISABLED in Python headers themselves
-- abi3 and abi3t ought to be the same except
the _Py_OPAQUE_PYOBJECT differences.

A check for this is coming in a later PR.
It will require rewriting some preprocessor conditions, some of these
changes are included in this PR.
For _Py_IsOwnedByCurrentThread & supporting functions
I opted to move them to a cpython/ header, as they're rather self-contained.
2026-04-07 09:06:17 +02:00
Ken Jin 328da67b2c gh-146073: Revert "gh-146073: Add fitness/exit quality mechanism for JIT trace frontend (GH-147966)" (#148082)
This reverts commit 198b04b75f.
2026-04-04 11:56:40 +00:00
Hai Zhu 198b04b75f gh-146073: Add fitness/exit quality mechanism for JIT trace frontend (GH-147966) 2026-04-03 23:54:30 +08:00
Victor Stinner c1a4112c22 gh-147988: Initialize digits in long_alloc() in debug mode (#147989)
When Python is built in debug mode:

* long_alloc() now initializes digits with a pattern to detect usage of
  uninitialized digits.
* _PyLong_CompactValue() now makes sure that the digit is zero when the
  sign is zero.
* PyLongWriter_Finish() now raises SystemError if it detects uninitialized
  digits

Co-authored-by: Serhiy Storchaka <storchaka@gmail.com>
2026-04-02 11:55:34 +00:00
Karolina Surma 1887a95f51 gh-128341: Use _Py_ABI_SLOT in stdlib modules (#145770)
Rename from _Py_INTERNAL_ABI_SLOT to _Py_ABI_SLOT
and define the macro using _PyABIInfo_DEFAULT.

Use the ABI slot in stdlib extension modules to enable running
a check of ABI version compatibility.

_tkinter, _tracemalloc and readline don't use the slots, hence they need
explicit handling.

Co-authored-by: Victor Stinner <vstinner@python.org>
2026-03-24 17:47:55 +00:00
Sam Gross 1eff27f2c0 gh-146227: Fix wrong type in _Py_atomic_load_uint16 in pyatomic_std.h (gh-146229)
Also fix a few related issues in the pyatomic headers:

* Fix _Py_atomic_store_uint_release in pyatomic_msc.h to use __stlr32
  on ARM64 instead of a plain volatile store (which is only relaxed on
  ARM64).

* Add missing _Py_atomic_store_uint_release to pyatomic_gcc.h.

* Fix pseudo-code comment for _Py_atomic_store_ptr_release in
  pyatomic.h.
2026-03-20 15:38:35 -04:00
Serhiy Storchaka becd7a967f gh-146143: Fix the PyUnicodeWriter_WriteUCS4() signature (GH-146144)
It now accepts a pointer to constant buffer of Py_UCS4.
2026-03-19 08:23:01 +00:00
T. Wouters 706fd4ec08 gh-142183: Cache one datachunk per tstate to prevent alloc/dealloc thrashing (#145789)
Cache one datachunk per tstate to prevent alloc/dealloc thrashing when repeatedly hitting the same call depth at exactly the wrong boundary.

---------

Co-authored-by: blurb-it[bot] <43283697+blurb-it[bot]@users.noreply.github.com>
2026-03-11 15:46:16 +01:00
Victor Stinner 9159287f58 gh-144175: Add PyArg_ParseArray() function (#144283)
Add PyArg_ParseArray() and PyArg_ParseArrayAndKeywords()
functions to parse arguments of functions using the METH_FASTCALL
calling convention.

Co-authored-by: Bénédikt Tran <10796600+picnixz@users.noreply.github.com>
2026-03-06 21:57:44 +00:00
Victor Stinner 4fce98a920 gh-141510: Change marshal version to 6 (#145551)
Fix SliceTestCase: test also that version 4 fails with ValueError.
2026-03-06 10:23:11 +01:00
Victor Stinner 31343cf2bc gh-142417: Restore private _Py_InitializeMain() function (#145472)
This reverts commit 07c3518ffb.

Co-authored-by: Petr Viktorin <encukou@gmail.com>
2026-03-04 11:00:08 +01:00
Hai Zhu 107863ee17 gh-144569: Avoid creating temporary objects in BINARY_SLICE for list, tuple, and unicode (GH-144590)
* Scalar replacement of BINARY_SLICE for list, tuple, and unicode
2026-03-02 17:02:38 +00:00
Victor Stinner 696cdfc0a2 gh-141510, PEP 814: Add built-in frozendict type (#144757)
Add TYPE_FROZENDICT to the marshal module.

Add C API functions:

* PyAnyDict_Check()
* PyAnyDict_CheckExact()
* PyFrozenDict_Check()
* PyFrozenDict_CheckExact()
* PyFrozenDict_New()

Add PyFrozenDict_Type C type.

Co-authored-by: Hugo van Kemenade <1324225+hugovk@users.noreply.github.com>
Co-authored-by: Adam Johnson <me@adamj.eu>
Co-authored-by: Benedikt Johannes <benedikt.johannes.hofer@gmail.com>
2026-02-17 10:54:41 +01:00
Pablo Galindo Salgado 46d5106cfa gh-142349: Implement PEP 810 - Explicit lazy imports (#142351)
Co-authored-by: T. Wouters <twouters@meta.com >
Co-authored-by: Brittany Reynoso <breynoso@meta.com>
Co-authored-by: Dino Viehland <dinoviehland@meta.com>
2026-02-12 00:15:33 +00:00
Kumar Aditya 347d3594d3 gh-143300: implement PyUnstable_SetImmortal for marking objects as immortal (#144543) 2026-02-11 20:59:31 +05:30
Pablo Galindo Salgado 96e4cd698a gh-144319: Fix huge page safety in pymalloc arenas (#144331)
The pymalloc huge page support had two problems. First, on
architectures where the default huge page size exceeds the arena
size (e.g. 32 MiB on PPC, 512 MiB on ARM64 with 64 KB base
pages), mmap with MAP_HUGETLB silently allocates a full huge page
even when the requested size is smaller. The subsequent munmap
with the original arena size then fails with EINVAL, permanently
leaking the entire huge page. Second, huge pages were always
attempted when compiled in, with no way to disable them at
runtime. On Linux, if the huge page pool is exhausted, page
faults including copy-on-write faults after fork deliver SIGBUS
and kill the process.

The arena allocator now queries the system huge page size from
/proc/meminfo and skips MAP_HUGETLB when the arena size is not a
multiple of it. Huge pages also now require explicit opt-in at
runtime via the PYTHON_PYMALLOC_HUGEPAGES environment variable,
which is read through PyConfig and respects -E and -I flags.
The config field pymalloc_hugepages is propagated to the runtime
allocators struct so the low-level arena allocator can check it
without calling getenv directly.
2026-01-30 18:18:56 +00:00
Sergey B Kirpichev 4c7ec78092 gh-143869: Add PEP 757 functions to the limited API (#143906)
Co-authored-by: Petr Viktorin <encukou@gmail.com>
2026-01-21 14:47:14 +01:00
Peter Bierma 19e64afddf gh-141070: Rename PyUnstable_Object_Dump to PyObject_Dump (GH-142848) 2026-01-16 09:19:43 -05:00
Sam Gross 08bc03ff2a gh-120321: Make gi_frame_state transitions atomic in FT build (gh-142599)
This makes generator frame state transitions atomic in the free
threading build, which avoids segfaults when trying to execute
a generator from multiple threads concurrently.

There are still a few operations that aren't thread-safe and may crash
if performed concurrently on the same generator/coroutine:

 * Accessing gi_yieldfrom/cr_await/ag_await
 * Accessing gi_frame/cr_frame/ag_frame
 * Async generator operations
2025-12-19 19:10:37 +00:00
Victor Stinner 7aa353c414 gh-142217: Deprecate the private _Py_Identifier C API (#142221)
Deprecate functions:

* _PyObject_CallMethodId()
* _PyObject_GetAttrId()
* _PyUnicode_FromId()
2025-12-12 14:10:25 +01:00
Sam Gross 0a62f8277e gh-142534: Avoid TSan warnings in dictobject.c (gh-142544)
There are places we use "relaxed" loads where C11 requires "consume" or
stronger. Unfortunately, compilers don't really implement "consume" so
fake it for our use in a way that avoids upsetting TSan.
2025-12-11 16:23:19 -05:00
Mark Shannon 469f191a85 GH-135379: Top of stack caching for the JIT. (GH-135465)
Uses three registers to cache values at the top of the evaluation stack
This significantly reduces memory traffic for smaller, more common uops.
2025-12-11 10:32:52 +00:00
Ken Jin ebf3427615 gh-141976: Protect against non-progressing specializations in tracing JIT (GH-141989) 2025-12-10 19:39:11 +00:00
dr-carlos ff2577f56e gh-141732: Fix ExceptionGroup repr changing when original exception sequence is mutated (#141736) 2025-12-07 21:04:04 +00:00
Pablo Galindo Salgado d6d850df89 gh-138122: Don't sample partial frame chains (#141912) 2025-12-07 15:53:48 +00:00
Pablo Galindo Salgado 572c780aa8 gh-138122: Implement frame caching in RemoteUnwinder to reduce memory reads (#142137)
This PR implements frame caching in the RemoteUnwinder class to significantly reduce memory reads when profiling remote processes with deep call stacks.

When cache_frames=True, the unwinder stores the frame chain from each sample and reuses unchanged portions in subsequent samples. Since most profiling samples capture similar call stacks (especially the parent frames), this optimization avoids repeatedly reading the same frame data from the target process.

The implementation adds a last_profiled_frame field to the thread state that tracks where the previous sample stopped. On the next sample, if the current frame chain reaches this marker, the cached frames from that point onward are reused instead of being re-read from remote memory.

The sampling profiler now enables frame caching by default.
2025-12-06 22:37:34 +00:00
Victor Stinner 7fe1a18b77 gh-130396: Remove _Py_ReachedRecursionLimitWithMargin() function (#141951)
Move the private function to the internal C API (pycore_ceval.h).
2025-11-27 12:32:00 +01:00
da-woods afa0badcc5 gh-141726: Add PyDict_SetDefaultRef() to the Stable ABI (#141727)
Co-authored-by: Stan Ulbrych <89152624+StanFromIreland@users.noreply.github.com>
2025-11-19 11:38:10 +00:00
Victor Stinner 600f3feb23 gh-141070: Add PyUnstable_Object_Dump() function (#141072)
* Promote _PyObject_Dump() as a public function.
* Keep _PyObject_Dump() alias to PyUnstable_Object_Dump()
  for backward compatibility.
* Replace _PyObject_Dump() with PyUnstable_Object_Dump().

Co-authored-by: Peter Bierma <zintensitydev@gmail.com>
Co-authored-by: Kumar Aditya <kumaraditya@python.org>
Co-authored-by: Petr Viktorin <encukou@gmail.com>
2025-11-18 16:13:13 +00:00
Pablo Galindo Salgado 89a914c58d gh-135953: Add GIL contention markers to sampling profiler Gecko format (#139485)
This commit enhances the Gecko format reporter in the sampling profiler
to include markers for GIL acquisition events.
2025-11-17 12:46:26 +00:00
SubbaraoGarlapati 7800b78067 fix memory order of _Py_atomic_store_uint_release (#141562) 2025-11-17 16:53:12 +05:30
Victor Stinner 3bacae5598 gh-131510: Use PyUnstable_Unicode_GET_CACHED_HASH() (GH-141520)
Replace code that directly accesses PyASCIIObject.hash with
PyUnstable_Unicode_GET_CACHED_HASH().

Remove redundant "assert(PyUnicode_Check(op))" from
PyUnstable_Unicode_GET_CACHED_HASH(), _PyASCIIObject_CAST() already
implements the check.
2025-11-14 11:13:24 +01:00
Itamar Oren 1e4e59bb37 gh-116146: Add C-API to create module from spec and initfunc (GH-139196)
Co-authored-by: Kumar Aditya <kumaraditya@python.org>
Co-authored-by: Petr Viktorin <encukou@gmail.com>
Co-authored-by: Victor Stinner <vstinner@python.org>
2025-11-14 10:43:25 +01:00
Ken Jin 4fa80ce74c gh-139109: A new tracing JIT compiler frontend for CPython (GH-140310)
This PR changes the current JIT model from trace projection to trace recording. Benchmarking: better pyperformance (about 1.7% overall) geomean versus current https://raw.githubusercontent.com/facebookexperimental/free-threading-benchmarking/refs/heads/main/results/bm-20251108-3.15.0a1%2B-7e2bc1d-JIT/bm-20251108-vultr-x86_64-Fidget%252dSpinner-tracing_jit-3.15.0a1%2B-7e2bc1d-vs-base.svg, 100% faster Richards on the most improved benchmark versus the current JIT. Slowdown of about 10-15% on the worst benchmark versus the current JIT. **Note: the fastest version isn't the one merged, as it relies on fixing bugs in the specializing interpreter, which is left to another PR**. The speedup in the merged version is about 1.1%. https://raw.githubusercontent.com/facebookexperimental/free-threading-benchmarking/refs/heads/main/results/bm-20251112-3.15.0a1%2B-f8a764a-JIT/bm-20251112-vultr-x86_64-Fidget%252dSpinner-tracing_jit-3.15.0a1%2B-f8a764a-vs-base.svg

Stats: 50% more uops executed, 30% more traces entered the last time we ran them. It also suggests our trace lengths for a real trace recording JIT are too short, as a lot of trace too long aborts https://github.com/facebookexperimental/free-threading-benchmarking/blob/main/results/bm-20251023-3.15.0a1%2B-eb73378-CLANG%2CJIT/bm-20251023-vultr-x86_64-Fidget%252dSpinner-tracing_jit-3.15.0a1%2B-eb73378-pystats-vs-base.md .

This new JIT frontend is already able to record/execute significantly more instructions than the previous JIT frontend. In this PR, we are now able to record through custom dunders, simple object creation, generators, etc. None of these were done by the old JIT frontend. Some custom dunders uops were discovered to be broken as part of this work gh-140277

The optimizer stack space check is disabled, as it's no longer valid to deal with underflow.

Pros:
* Ignoring the generated tracer code as it's automatically created, this is only additional 1k lines of code. The maintenance burden is handled by the DSL and code generator.
* `optimizer.c` is now significantly simpler, as we don't have to do strange things to recover the bytecode from a trace.
* The new JIT frontend is able to handle a lot more control-flow than the old one.
* Tracing is very low overhead. We use the tail calling interpreter/computed goto interpreter to switch between tracing mode and non-tracing mode. I call this mechanism dual dispatch, as we have two dispatch tables dispatching to each other. Specialization is still enabled while tracing.
* Better handling of polymorphism. We leverage the specializing interpreter for this.

Cons:
* (For now) requires tail calling interpreter or computed gotos. This means no Windows JIT for now :(. Not to fret, tail calling is coming soon to Windows though https://github.com/python/cpython/pull/139962

Design:
* After each instruction, the `record_previous_inst` function/label is executed. This does as the name suggests.
* The tracing interpreter lowers bytecode to uops directly so that it can obtain "fresh" values at the point of lowering.
* The tracing version behaves nearly identical to the normal interpreter, in fact it even has specialization! This allows it to run without much of a slowdown when tracing. The actual cost of tracing is only a function call and writes to memory.
* The tracing interpreter uses the specializing interpreter's deopt to naturally form the side exit chains. This allows it to side exit chain effectively, without repeating much code. We force a re-specializing when tracing a deopt.
* The tracing interpreter can even handle goto errors/exceptions, but I chose to disable them for now as it's not tested.
* Because we do not share interpreter dispatch, there is should be no significant slowdown to the original specializing interpreter on tailcall and computed got with JIT disabled. With JIT enabled, there might be a slowdown in the form of the JIT trying to trace.
* Things that could have dynamic instruction pointer effects are guarded on. The guard deopts to a new instruction --- `_DYNAMIC_EXIT`.
2025-11-13 18:08:32 +00:00