Commit Graph

1694 Commits

Author SHA1 Message Date
Mark Shannon 517dc65ffc GH-128682: Stronger checking of PyStackRef_CLOSE and DEAD. (GH-128683) 2025-01-13 12:37:48 +00:00
Mark Shannon 39fc7ef4fe GH-124483: Mark Py_DECREF, etc. as escaping for the JIT (GH-128678) 2025-01-13 11:42:45 +00:00
Mark Shannon ddd959987c GH-128685: Specialize (rather than quicken) LOAD_CONST into LOAD_CONST_[IM]MORTAL (GH-128708) 2025-01-13 10:30:28 +00:00
Hood Chatham d0ecbdd838 gh-128627: Emscripten: Use wasm-gc based call adaptor if available (#128628)
Replaces the trampoline mechanism in Emscripten with an implementation that uses a
recently added feature of wasm-gc instead of JS type reflection, when that feature is
available.
2025-01-13 07:09:39 +08:00
Pieter Eendebak ff39e3ff7b gh-126703: Add freelist for PyMethodObject (#128594) 2025-01-12 18:31:49 +05:30
Kumar Aditya 7dc41ad6a7 gh-128002: fix asyncio.all_tasks against concurrent deallocations of tasks (#128541) 2025-01-09 21:26:00 +05:30
Mark Shannon f826beca0c GH-128375: Better instrument for FOR_ITER (GH-128445) 2025-01-06 17:54:47 +00:00
Jelle Zijlstra d50fa0541b gh-128089: Add PYC magic number for VALUE_WITH_FAKE_GLOBALS (#128097)
Assign 3610 PYC magic number to VALUE_WITH_FAKE_GLOBALS
format of annotationlib.
2025-01-06 10:30:42 +00:00
Ken Jin 7ef4907412 gh-128262: Allow specialization of calls to classes with __slots__ (GH-128263) 2024-12-31 12:24:17 +08:00
T. Wouters 180d417e9f gh-114203: Optimise simple recursive critical sections (#128126)
Add a fast path to (single-mutex) critical section locking _iff_ the mutex
is already held by the currently active, top-most critical section of this
thread. This can matter a lot for indirectly recursive critical sections
without intervening critical sections.
2024-12-23 13:31:33 +01:00
Mark Shannon 128cc47fbd GH-127705: Add debug mode for _PyStackRefs inspired by HPy debug mode (GH-128121) 2024-12-20 16:52:20 +00:00
Mark Shannon df46c780fe GH-122548: Correct magic number comment (GH-128115)
Correct magic number comment
2024-12-20 11:57:44 +00:00
mpage 255762c09f gh-127274: Defer nested methods (#128012)
Methods (functions defined in class scope) are likely to be cleaned
up by the GC anyway.

Add a new code flag, `CO_METHOD`, that is set for functions defined
in a class scope. Use that when deciding to defer functions.
2024-12-19 13:03:14 -08:00
Neil Schemenauer 1b15c89a17 gh-115999: Specialize STORE_ATTR in free-threaded builds. (gh-127838)
* Add `_PyDictKeys_StringLookupSplit` which does locking on dict keys and
  use in place of `_PyDictKeys_StringLookup`.

* Change `_PyObject_TryGetInstanceAttribute` to use that function
  in the case of split keys.

* Add `unicodekeys_lookup_split` helper which allows code sharing
  between `_Py_dict_lookup` and `_PyDictKeys_StringLookupSplit`.

* Fix locking for `STORE_ATTR_INSTANCE_VALUE`.  Create
  `_GUARD_TYPE_VERSION_AND_LOCK` uop so that object stays locked and
  `tp_version_tag` cannot change.

* Pass `tp_version_tag` to `specialize_dict_access()`, ensuring
  the version we store on the cache is the correct one (in case of
  it changing during the specalize analysis).

* Split `analyze_descriptor` into `analyze_descriptor_load` and
  `analyze_descriptor_store` since those don't share much logic.
  Add `descriptor_is_class` helper function.

* In `specialize_dict_access`, double check `_PyObject_GetManagedDict()`
  in case we race and dict was materialized before the lock.

* Avoid borrowed references in `_Py_Specialize_StoreAttr()`.

* Use `specialize()` and `unspecialize()` helpers.

* Add unit tests to ensure specializing happens as expected in FT builds.

* Add unit tests to attempt to trigger data races (useful for running under TSAN).

* Add `has_split_table` function to `_testinternalcapi`.
2024-12-19 10:21:17 -08:00
Mark Shannon d2f1d917e8 GH-122548: Implement branch taken and not taken events for sys.monitoring (GH-122564) 2024-12-19 16:59:51 +00:00
Donghee Na 48c70b8f7d gh-115999: Enable BINARY_SUBSCR_GETITEM for free-threaded build (gh-127737) 2024-12-19 11:08:17 +09:00
Kumar Aditya 91c55085a9 gh-128033: change PyMutex_LockFast to take PyMutex as argument (#128054)
Change `PyMutex_LockFast` to take `PyMutex` as argument.
2024-12-18 20:49:00 +05:30
Bénédikt Tran 7303f06846 gh-126742: Add _PyErr_SetLocaleString, use it for gdbm & dlerror messages (GH-126746)
- Add a helper to set an error from locale-encoded `char*`
- Use the helper for gdbm & dlerror messages

Co-authored-by: Victor Stinner <vstinner@python.org>
2024-12-17 12:12:45 +01:00
Peter Bierma 3b766824fe gh-126907: make atexit thread safe in free-threading (#127935)
Co-authored-by: Victor Stinner <vstinner@python.org>
Co-authored-by: Kumar Aditya <kumaraditya@python.org>
2024-12-16 19:31:44 +00:00
Ed Nutting ab05beb8ce gh-127599: Fix _Py_RefcntAdd missing calls to _Py_INCREF_STAT_INC/_Py_INCREF_IMMORTAL_STAT_INC (#127717)
Previously, `_Py_RefcntAdd` hasn't called 
`_Py_INCREF_STAT_INC/_Py_INCREF_IMMORTAL_STAT_INC` which is incorrect. 

Now it has been fixed.
2024-12-15 15:51:03 +02:00
mpage 2de048ce79 gh-115999: Specialize loading attributes from modules in free-threaded builds (#127711)
We use the same approach that was used for specialization of LOAD_GLOBAL in free-threaded builds:

_CHECK_ATTR_MODULE is renamed to _CHECK_ATTR_MODULE_PUSH_KEYS; it pushes the keys object for the following _LOAD_ATTR_MODULE_FROM_KEYS (nee _LOAD_ATTR_MODULE). This arrangement avoids having to recheck the keys version.

_LOAD_ATTR_MODULE is renamed to _LOAD_ATTR_MODULE_FROM_KEYS; it loads the value from the keys object pushed by the preceding _CHECK_ATTR_MODULE_PUSH_KEYS at the cached index.
2024-12-13 10:17:16 -08:00
Mark Shannon e62e1ca455 GH-126833: Dumps graphviz representation of executor graph. (GH-126880) 2024-12-13 11:00:00 +00:00
Pieter Eendebak 5fc6bb2754 gh-126868: Add freelist for compact int objects (GH-126865) 2024-12-13 10:06:26 +00:00
Sam Gross f8dcb82006 gh-127879: Fix data race in _PyFreeList_Push (#127880)
Writes to the `ob_tid` field need to use atomics because it may be
concurrently read by a non-locking dictionary, list, or structmember
read.
2024-12-12 12:59:13 -05:00
Mark Shannon 487fdbed40 GH-125174: Fix compiler warning (GH-127860)
Fix compiler warning
2024-12-12 11:22:20 +00:00
Mark Shannon bc262de06b GH-125174: Mark objects as statically allocated. (#127797)
* Set a bit in the unused part of the refcount on 64 bit machines and the free-threaded build.

* Use the top of the refcount range on 32 bit machines
2024-12-11 17:37:38 +00:00
Mark Shannon 5a23994a3d GH-127058: Make PySequence_Tuple safer and probably faster. (#127758)
* Use a small buffer, then list when constructing a tuple from an arbitrary sequence.
2024-12-11 14:02:59 +00:00
Peter Bierma d5d84c3f13 gh-127791: Fix, document, and test PyUnstable_AtExit (#127793) 2024-12-11 12:14:04 +01:00
Peter Bierma 12680ec5bd gh-127314: Don't mention the GIL when calling without a thread state on the free-threaded build (#127315)
Co-authored-by: Victor Stinner <vstinner@python.org>
2024-12-06 16:58:19 +01:00
Sam Gross f4f530804b gh-127582: Make object resurrection thread-safe for free threading. (GH-127612)
Objects may be temporarily "resurrected" in destructors when calling
finalizers or watcher callbacks. We previously undid the resurrection
by decrementing the reference count using `Py_SET_REFCNT`. This was not
thread-safe because other threads might be accessing the object
(modifying its reference count) if it was exposed by the finalizer,
watcher callback, or temporarily accessed by a racy dictionary or list
access.

This adds internal-only thread-safe functions for temporary object
resurrection during destructors.
2024-12-05 16:07:31 -05:00
Brandt Bucher 94b8f8b409 GH-126795: Increase the JIT side-exit threshold from 64 to 4096 (GH-127155) 2024-12-04 15:01:28 -08:00
mpage dabcecfd6d gh-115999: Enable specialization of CALL instructions in free-threaded builds (#127123)
The CALL family of instructions were mostly thread-safe already and only required a small number of changes, which are documented below.

A few changes were needed to make CALL_ALLOC_AND_ENTER_INIT thread-safe:

Added _PyType_LookupRefAndVersion, which returns the type version corresponding to the returned ref.

Added _PyType_CacheInitForSpecialization, which takes an init method and the corresponding type version and only populates the specialization cache if the current type version matches the supplied version. This prevents potentially caching a stale value in free-threaded builds if we race with an update to __init__.

Only cache __init__ functions that are deferred in free-threaded builds. This ensures that the reference to __init__ that is stored in the specialization cache is valid if the type version guard in _CHECK_AND_ALLOCATE_OBJECT passes.
Fix a bug in _CREATE_INIT_FRAME where the frame is pushed to the stack on failure.

A few other miscellaneous changes were also needed:

Use {LOCK,UNLOCK}_OBJECT in LIST_APPEND. This ensures that the list's per-object lock is held while we are appending to it.

Add missing co_tlbc for _Py_InitCleanup.

Stop/start the world around setting the eval frame hook. This allows us to read interp->eval_frame non-atomically and preserves the behavior of _CHECK_PEP_523 documented below.
2024-12-03 11:20:20 -08:00
Daniele Parmeggiani 979bf2489d gh-117657: TSAN Fix races in PyMember_Get and PyMember_Set for C extensions (GH-123211) 2024-12-03 09:41:53 -05:00
mpage c4303763da gh-127411: Fix invalid conversion of load of TLBC array when compiled in C++ (#127466)
Cast the result of the load to the correct type
2024-12-02 10:13:30 -08:00
Donghee Na 7c2bd9b226 gh-115999: Use light-weight lock for UNPACK_SEQUENCE_LIST (gh-127514) 2024-12-03 00:14:40 +09:00
Mark Shannon a8dd821d5b GH-126491: GC: Mark objects reachable from roots before doing cycle collection (GH-127110)
* Mark almost all reachable objects before doing collection phase

* Add stats for objects marked

* Visit new frames before each increment

* Update docs

* Clearer calculation of work to do.
2024-12-02 10:12:17 +00:00
Donghee Na e2713409cf gh-115999: Add partial free-thread specialization for BINARY_SUBSCR (gh-127227) 2024-12-02 10:38:17 +09:00
Sergey B Kirpichev 987311d42e gh-69639: Add mixed-mode rules for complex arithmetic (C-like) (GH-124829)
"Generally, mixed-mode arithmetic combining real and complex variables should
be performed directly, not by first coercing the real to complex, lest the sign
of zero be rendered uninformative; the same goes for combinations of pure
imaginary quantities with complex variables." (c) Kahan, W: Branch cuts for
complex elementary functions.

This patch implements mixed-mode arithmetic rules, combining real and
complex variables as specified by C standards since C99 (in particular,
there is no special version for the true division with real lhs
operand).  Most C compilers implementing C99+ Annex G have only these
special rules (without support for imaginary type, which is going to be
deprecated in C2y).
2024-11-26 17:57:39 +02:00
Jelle Zijlstra dcf629213b gh-119180: Add VALUE_WITH_FAKE_GLOBALS format to annotationlib (#124415) 2024-11-26 15:40:13 +00:00
mpage 193890c1cc gh-126612: Include stack effects of uops when computing maximum stack depth (#126894) 2024-11-26 00:53:49 +00:00
Sam Gross 4759ba6eec gh-127022: Simplify PyStackRef_FromPyObjectSteal (#127024)
This gets rid of the immortal check in `PyStackRef_FromPyObjectSteal()`.
Overall, this improves performance about 2% in the free threading
build.

This also renames `PyStackRef_Is()` to `PyStackRef_IsExactly()` because
the macro requires that the tag bits of the arguments match, which is
only true in certain special cases.
2024-11-22 12:55:33 -05:00
Kirill Podoprigora 27486c3365 gh-115999: Add free-threaded specialization for UNPACK_SEQUENCE (#126600)
Add free-threaded specialization for `UNPACK_SEQUENCE` opcode.
`UNPACK_SEQUENCE_TUPLE/UNPACK_SEQUENCE_TWO_TUPLE` are already thread safe since tuples are immutable.
`UNPACK_SEQUENCE_LIST` is not thread safe because of nature of lists (there is nothing preventing another thread from adding items to or removing them the list while the instruction is executing). To achieve thread safety we add a critical section to the implementation of `UNPACK_SEQUENCE_LIST`, especially around the parts where we check the size of the list and push items onto the stack.


---------

Co-authored-by: Matt Page <mpage@meta.com>
Co-authored-by: mpage <mpage@cs.stanford.edu>
2024-11-22 19:00:35 +02:00
Eric Snow 5ba67af006 gh-127117: Ensure the Correct Last Fields of PyInterpreterState and of _PyRuntimeState (gh-127118)
We add some comments so contributors can be aware and we move one out-of-place field up.
2024-11-22 09:37:02 -07:00
Donghee Na 78a530a578 gh-115999: Add free-threaded specialization for `TO_BOOL` (gh-126616) 2024-11-22 07:52:16 +09:00
mpage 09c240f20c gh-115999: Specialize LOAD_GLOBAL in free-threaded builds (#126607)
Enable specialization of LOAD_GLOBAL in free-threaded builds.

Thread-safety of specialization in free-threaded builds is provided by the following:

A critical section is held on both the globals and builtins objects during specialization. This ensures we get an atomic view of both builtins and globals during specialization.
Generation of new keys versions is made atomic in free-threaded builds.
Existing helpers are used to atomically modify the opcode.
Thread-safety of specialized instructions in free-threaded builds is provided by the following:

Relaxed atomics are used when loading and storing dict keys versions. This avoids potential data races as the dict keys versions are read without holding the dictionary's per-object lock in version guards.
Dicts keys objects are passed from keys version guards to the downstream uops. This ensures that we are loading from the correct offset in the keys object. Once a unicode key has been stored in a keys object for a combined dictionary in free-threaded builds, the offset that it is stored in will never be reused for a different key. Once the version guard passes, we know that we are reading from the correct offset.
The dictionary read fast-path is used to read values from the dictionary once we know the correct offset.
2024-11-21 11:22:21 -08:00
Eric Snow 9dabace39d gh-114940: Add _Py_FOR_EACH_TSTATE_UNLOCKED(), and Friends (gh-127077)
This is a precursor to the actual fix for gh-114940, where we will change these macros to use the new lock.  This change is almost entirely mechanical; the exceptions are the loops in codeobject.c and ceval.c, which now hold the "head" lock.  Note that almost all of the uses of _Py_FOR_EACH_TSTATE_UNLOCKED() here will change to _Py_FOR_EACH_TSTATE_BEGIN() once we add the new per-interpreter lock.
2024-11-21 11:08:38 -07:00
Dino Viehland bf542f8bb9 gh-124470: Fix crash when reading from object instance dictionary while replacing it (#122489)
Delay free a dictionary when replacing it
2024-11-21 10:41:19 -06:00
Victor Stinner 3c2bd66e21 gh-126316: Make grp.getgrall() thread-safe: add a mutex (#127055)
grpmodule.c is no longer built with the limited C API, since PyMutex
is excluded from the limited C API.
2024-11-21 15:47:24 +01:00
Victor Stinner 0c5556fcb7 gh-112136: Remove unused #include "pycore_lock.h" (#127093)
pycore_modsupport.h no longer needs pycore_lock.h.
2024-11-21 13:50:11 +01:00
Mark Shannon aea0c586d1 GH-127010: Don't lazily track and untrack dicts (GH-127027) 2024-11-20 16:41:20 +00:00