Commit Graph

1667 Commits

Author SHA1 Message Date
Peter Bierma d5d84c3f13 gh-127791: Fix, document, and test PyUnstable_AtExit (#127793) 2024-12-11 12:14:04 +01:00
Peter Bierma 12680ec5bd gh-127314: Don't mention the GIL when calling without a thread state on the free-threaded build (#127315)
Co-authored-by: Victor Stinner <vstinner@python.org>
2024-12-06 16:58:19 +01:00
Sam Gross f4f530804b gh-127582: Make object resurrection thread-safe for free threading. (GH-127612)
Objects may be temporarily "resurrected" in destructors when calling
finalizers or watcher callbacks. We previously undid the resurrection
by decrementing the reference count using `Py_SET_REFCNT`. This was not
thread-safe because other threads might be accessing the object
(modifying its reference count) if it was exposed by the finalizer,
watcher callback, or temporarily accessed by a racy dictionary or list
access.

This adds internal-only thread-safe functions for temporary object
resurrection during destructors.
2024-12-05 16:07:31 -05:00
Brandt Bucher 94b8f8b409 GH-126795: Increase the JIT side-exit threshold from 64 to 4096 (GH-127155) 2024-12-04 15:01:28 -08:00
mpage dabcecfd6d gh-115999: Enable specialization of CALL instructions in free-threaded builds (#127123)
The CALL family of instructions were mostly thread-safe already and only required a small number of changes, which are documented below.

A few changes were needed to make CALL_ALLOC_AND_ENTER_INIT thread-safe:

Added _PyType_LookupRefAndVersion, which returns the type version corresponding to the returned ref.

Added _PyType_CacheInitForSpecialization, which takes an init method and the corresponding type version and only populates the specialization cache if the current type version matches the supplied version. This prevents potentially caching a stale value in free-threaded builds if we race with an update to __init__.

Only cache __init__ functions that are deferred in free-threaded builds. This ensures that the reference to __init__ that is stored in the specialization cache is valid if the type version guard in _CHECK_AND_ALLOCATE_OBJECT passes.
Fix a bug in _CREATE_INIT_FRAME where the frame is pushed to the stack on failure.

A few other miscellaneous changes were also needed:

Use {LOCK,UNLOCK}_OBJECT in LIST_APPEND. This ensures that the list's per-object lock is held while we are appending to it.

Add missing co_tlbc for _Py_InitCleanup.

Stop/start the world around setting the eval frame hook. This allows us to read interp->eval_frame non-atomically and preserves the behavior of _CHECK_PEP_523 documented below.
2024-12-03 11:20:20 -08:00
Daniele Parmeggiani 979bf2489d gh-117657: TSAN Fix races in PyMember_Get and PyMember_Set for C extensions (GH-123211) 2024-12-03 09:41:53 -05:00
mpage c4303763da gh-127411: Fix invalid conversion of load of TLBC array when compiled in C++ (#127466)
Cast the result of the load to the correct type
2024-12-02 10:13:30 -08:00
Donghee Na 7c2bd9b226 gh-115999: Use light-weight lock for UNPACK_SEQUENCE_LIST (gh-127514) 2024-12-03 00:14:40 +09:00
Mark Shannon a8dd821d5b GH-126491: GC: Mark objects reachable from roots before doing cycle collection (GH-127110)
* Mark almost all reachable objects before doing collection phase

* Add stats for objects marked

* Visit new frames before each increment

* Update docs

* Clearer calculation of work to do.
2024-12-02 10:12:17 +00:00
Donghee Na e2713409cf gh-115999: Add partial free-thread specialization for BINARY_SUBSCR (gh-127227) 2024-12-02 10:38:17 +09:00
Sergey B Kirpichev 987311d42e gh-69639: Add mixed-mode rules for complex arithmetic (C-like) (GH-124829)
"Generally, mixed-mode arithmetic combining real and complex variables should
be performed directly, not by first coercing the real to complex, lest the sign
of zero be rendered uninformative; the same goes for combinations of pure
imaginary quantities with complex variables." (c) Kahan, W: Branch cuts for
complex elementary functions.

This patch implements mixed-mode arithmetic rules, combining real and
complex variables as specified by C standards since C99 (in particular,
there is no special version for the true division with real lhs
operand).  Most C compilers implementing C99+ Annex G have only these
special rules (without support for imaginary type, which is going to be
deprecated in C2y).
2024-11-26 17:57:39 +02:00
Jelle Zijlstra dcf629213b gh-119180: Add VALUE_WITH_FAKE_GLOBALS format to annotationlib (#124415) 2024-11-26 15:40:13 +00:00
mpage 193890c1cc gh-126612: Include stack effects of uops when computing maximum stack depth (#126894) 2024-11-26 00:53:49 +00:00
Sam Gross 4759ba6eec gh-127022: Simplify PyStackRef_FromPyObjectSteal (#127024)
This gets rid of the immortal check in `PyStackRef_FromPyObjectSteal()`.
Overall, this improves performance about 2% in the free threading
build.

This also renames `PyStackRef_Is()` to `PyStackRef_IsExactly()` because
the macro requires that the tag bits of the arguments match, which is
only true in certain special cases.
2024-11-22 12:55:33 -05:00
Kirill Podoprigora 27486c3365 gh-115999: Add free-threaded specialization for UNPACK_SEQUENCE (#126600)
Add free-threaded specialization for `UNPACK_SEQUENCE` opcode.
`UNPACK_SEQUENCE_TUPLE/UNPACK_SEQUENCE_TWO_TUPLE` are already thread safe since tuples are immutable.
`UNPACK_SEQUENCE_LIST` is not thread safe because of nature of lists (there is nothing preventing another thread from adding items to or removing them the list while the instruction is executing). To achieve thread safety we add a critical section to the implementation of `UNPACK_SEQUENCE_LIST`, especially around the parts where we check the size of the list and push items onto the stack.


---------

Co-authored-by: Matt Page <mpage@meta.com>
Co-authored-by: mpage <mpage@cs.stanford.edu>
2024-11-22 19:00:35 +02:00
Eric Snow 5ba67af006 gh-127117: Ensure the Correct Last Fields of PyInterpreterState and of _PyRuntimeState (gh-127118)
We add some comments so contributors can be aware and we move one out-of-place field up.
2024-11-22 09:37:02 -07:00
Donghee Na 78a530a578 gh-115999: Add free-threaded specialization for `TO_BOOL` (gh-126616) 2024-11-22 07:52:16 +09:00
mpage 09c240f20c gh-115999: Specialize LOAD_GLOBAL in free-threaded builds (#126607)
Enable specialization of LOAD_GLOBAL in free-threaded builds.

Thread-safety of specialization in free-threaded builds is provided by the following:

A critical section is held on both the globals and builtins objects during specialization. This ensures we get an atomic view of both builtins and globals during specialization.
Generation of new keys versions is made atomic in free-threaded builds.
Existing helpers are used to atomically modify the opcode.
Thread-safety of specialized instructions in free-threaded builds is provided by the following:

Relaxed atomics are used when loading and storing dict keys versions. This avoids potential data races as the dict keys versions are read without holding the dictionary's per-object lock in version guards.
Dicts keys objects are passed from keys version guards to the downstream uops. This ensures that we are loading from the correct offset in the keys object. Once a unicode key has been stored in a keys object for a combined dictionary in free-threaded builds, the offset that it is stored in will never be reused for a different key. Once the version guard passes, we know that we are reading from the correct offset.
The dictionary read fast-path is used to read values from the dictionary once we know the correct offset.
2024-11-21 11:22:21 -08:00
Eric Snow 9dabace39d gh-114940: Add _Py_FOR_EACH_TSTATE_UNLOCKED(), and Friends (gh-127077)
This is a precursor to the actual fix for gh-114940, where we will change these macros to use the new lock.  This change is almost entirely mechanical; the exceptions are the loops in codeobject.c and ceval.c, which now hold the "head" lock.  Note that almost all of the uses of _Py_FOR_EACH_TSTATE_UNLOCKED() here will change to _Py_FOR_EACH_TSTATE_BEGIN() once we add the new per-interpreter lock.
2024-11-21 11:08:38 -07:00
Dino Viehland bf542f8bb9 gh-124470: Fix crash when reading from object instance dictionary while replacing it (#122489)
Delay free a dictionary when replacing it
2024-11-21 10:41:19 -06:00
Victor Stinner 3c2bd66e21 gh-126316: Make grp.getgrall() thread-safe: add a mutex (#127055)
grpmodule.c is no longer built with the limited C API, since PyMutex
is excluded from the limited C API.
2024-11-21 15:47:24 +01:00
Victor Stinner 0c5556fcb7 gh-112136: Remove unused #include "pycore_lock.h" (#127093)
pycore_modsupport.h no longer needs pycore_lock.h.
2024-11-21 13:50:11 +01:00
Mark Shannon aea0c586d1 GH-127010: Don't lazily track and untrack dicts (GH-127027) 2024-11-20 16:41:20 +00:00
Eric Snow 1c0a104eca gh-126914: Store the Preallocated Thread State's Pointer in a PyInterpreterState Field (gh-126989)
This approach eliminates the originally reported race. It also gets rid of the deadlock reported in gh-96071, so we can remove the workaround added then.
2024-11-19 12:59:19 -07:00
Pablo Galindo Salgado 30aeb00d36 gh-126076: Account for relocated objects in tracemalloc (#126077) 2024-11-19 10:35:17 +00:00
Hugo van Kemenade 899fdb213d Revert "GH-126491: GC: Mark objects reachable from roots before doing cycle collection (GH-126502)" (#126983) 2024-11-19 11:25:09 +02:00
Eric Snow d6b3e78504 gh-126986: Drop _PyInterpreterState_FailIfNotRunning() (gh-126988)
We replace it with _PyErr_SetInterpreterAlreadyRunning().
2024-11-19 00:11:12 +00:00
Brandt Bucher 4cd10762b0 GH-126795: Increase the JIT threshold from 16 to 4096 (GH-126816) 2024-11-18 11:11:23 -08:00
Mark Shannon b0fcc2c47a GH-126491: GC: Mark objects reachable from roots before doing cycle collection (GH-126502)
* Mark almost all reachable objects before doing collection phase

* Add stats for objects marked

* Visit new frames before each increment

* Remove lazy dict tracking

* Update docs

* Clearer calculation of work to do.
2024-11-18 14:31:26 +00:00
Sam Gross 5610860840 gh-126688: Reinit import lock after fork (#126692)
The PyMutex implementation supports unlocking after fork because we
clear the list of waiters in parking_lot.c. This doesn't work as well
for _PyRecursiveMutex because on some systems, such as SerenityOS, the
thread id is not preserved across fork().
2024-11-12 15:53:58 -05:00
Eric Snow 73cf069099 gh-76785: Improved Subinterpreters Compatibility with 3.12 (2/2) (gh-126707)
These changes makes it easier to backport the _interpreters, _interpqueues, and _interpchannels modules to Python 3.12.

This involves the following:

* add the _PyXI_GET_STATE() and _PyXI_GET_GLOBAL_STATE() macros
* add _PyXIData_lookup_context_t and _PyXIData_GetLookupContext()
* add _Py_xi_state_init() and _Py_xi_state_fini()
2024-11-12 10:41:51 -07:00
Eric Snow a6d48e8f83 gh-76785: Improved Subinterpreters Compatibility with 3.12 (1/2) (gh-126704)
These changes makes it easier to backport the _interpreters, _interpqueues, and _interpchannels modules to Python 3.12.

This involves the following:

* rename several structs and typedefs
* add several typedefs
* stop using the PyThreadState.state field directly in parking_lot.c
2024-11-11 15:58:46 -07:00
Eric Snow b697d8c48e gh-76785: Minor Cleanup of Exception-related Cross-interpreter State (gh-126602)
This change makes it easier to backport the _interpreters, _interpqueues, and _interpchannels modules to Python 3.12.
2024-11-11 14:49:41 -07:00
Ken Jin 6293d00e72 gh-120619: Strength reduce function guards, support 2-operand uop forms (GH-124846)
Co-authored-by: Brandt Bucher <brandtbucher@gmail.com>
2024-11-09 11:35:33 +08:00
Mark Shannon fa40922597 GH-126547: Pre-assign version numbers for a few common classes (GH-126551) 2024-11-08 16:44:44 +00:00
Serhiy Storchaka 061e50f196 gh-122943: Add the varpos parameter in _PyArg_UnpackKeywords (GH-126564)
Remove _PyArg_UnpackKeywordsWithVararg.
Add comments for integer arguments of _PyArg_UnpackKeywords.
2024-11-08 14:23:50 +02:00
Serhiy Storchaka 1f777396f5 gh-122943: Rework support of var-positional parameter in Argument Clinic (GH-122945)
Move creation of a tuple for var-positional parameter out of
_PyArg_UnpackKeywordsWithVararg().
Merge _PyArg_UnpackKeywordsWithVararg() with _PyArg_UnpackKeywords().
Add a new parameter in _PyArg_UnpackKeywords().

The "parameters" and "converters" attributes of ParseArgsCodeGen no
longer contain the var-positional parameter. It is now available as the
"varpos" attribute. Optimize code generation for var-positional
parameter and reuse the same generating code for functions with and without
keyword parameters.

Add special converters for var-positional parameter. "tuple" represents it as
a Python tuple and "array" represents it as a continuous array of PyObject*.
"object" is a temporary alias of "tuple".
2024-11-07 23:40:03 +02:00
Eric Snow 9357fdcaf0 gh-76785: Minor Cleanup of "Cross-interpreter" Code (gh-126457)
The primary objective here is to allow some later changes to be cleaner. Mostly this involves renaming things and moving a few things around.

* CrossInterpreterData -> XIData
* crossinterpdatafunc -> xidatafunc
* split out pycore_crossinterp_data_registry.h
* add _PyXIData_lookup_t
2024-11-07 09:32:42 -07:00
Mark Shannon 85036c8d61 GH-126222: Fix _PyUop_num_popped (GH-126507) 2024-11-07 10:48:27 +00:00
Serhiy Storchaka d3840503b0 gh-126303: Fix pickling and copying of os.sched_param objects (GH-126336) 2024-11-05 08:23:17 +02:00
mpage 2e95c5ba3b gh-115999: Implement thread-local bytecode and enable specialization for BINARY_OP (#123926)
Each thread specializes a thread-local copy of the bytecode, created on the first RESUME, in free-threaded builds. All copies of the bytecode for a code object are stored in the co_tlbc array on the code object. Threads reserve a globally unique index identifying its copy of the bytecode in all co_tlbc arrays at thread creation and release the index at thread destruction. The first entry in every co_tlbc array always points to the "main" copy of the bytecode that is stored at the end of the code object. This ensures that no bytecode is copied for programs that do not use threads.

Thread-local bytecode can be disabled at runtime by providing either -X tlbc=0 or PYTHON_TLBC=0. Disabling thread-local bytecode also disables specialization.

Concurrent modifications to the bytecode made by the specializing interpreter and instrumentation use atomics, with specialization taking care not to overwrite an instruction that was instrumented concurrently.
2024-11-04 11:13:32 -08:00
sobolevn c806cd5af6 gh-126220: Adapt _lsprof to Argument Clinic (#126233)
Co-authored-by: Erlend E. Aasland <erlend.aasland@protonmail.com>
2024-11-04 19:18:21 +03:00
Sergey B Kirpichev 8477951a1c gh-120026: soft deprecate Py_HUGE_VAL macro (#120027)
Co-authored-by: Adam Turner <9087854+AA-Turner@users.noreply.github.com>
2024-11-01 22:04:31 +00:00
Xuanteng Huang 35df4eb959 gh-126072: do not add None to co_consts if there is no docstring (GH-126101) 2024-10-30 09:01:09 +00:00
Mark Shannon faa3272fb8 GH-125837: Split LOAD_CONST into three. (GH-125972)
* Add LOAD_CONST_IMMORTAL opcode

* Add LOAD_SMALL_INT opcode

* Remove RETURN_CONST opcode
2024-10-29 11:15:42 +00:00
Mark Shannon 25441592db GH-125515: Reduce number of compiler warnings in generated code (GH-125697) 2024-10-28 10:30:31 +00:00
Sam Gross 332356b880 gh-125900: Clean-up logic around immortalization in free-threading (#125901)
* Remove `@suppress_immortalization` decorator
* Make suppression flag per-thread instead of per-interpreter
* Suppress immortalization in `eval()` to avoid refleaks in three tests
  (test_datetime.test_roundtrip, test_logging.test_config8_ok, and
   test_random.test_after_fork).
* frozenset() is constant, but not a singleton. When run multiple times,
  the test could fail due to constant interning.
2024-10-24 18:09:59 -04:00
Shantanu 500f5338a8 gh-123930: Better error for "from imports" when script shadows module (#123929) 2024-10-24 12:11:12 -07:00
Sam Gross 3c4a7fa617 gh-124218: Avoid refcount contention on builtins module (GH-125847)
This replaces `_PyEval_BuiltinsFromGlobals` with
`_PyDict_LoadBuiltinsFromGlobals`, which returns a new reference
instead of a borrowed reference. Internally, the new function uses
per-thread reference counting when possible to avoid contention on the
refcount fields on the builtins module.
2024-10-24 12:44:38 -04:00
Sam Gross e545ead66c gh-125859: Fix crash when gc.get_objects is called during GC (#125882)
This fixes a crash when `gc.get_objects()` or `gc.get_referrers()` is
called during a GC in the free threading build.

Switch to `_PyObjectStack` to avoid corrupting the `struct worklist`
linked list maintained by the GC. Also, don't return objects that are frozen
(`gc.freeze()`) or in the process of being collected to more closely match
the behavior of the default build.
2024-10-24 09:33:11 -04:00