gh-137814: [3.14] Fix __qualname__ of __annotate__ functions in the interpreter
I'd still like to do #137842 on 3.15+, but that requires changing bytecode and we can't
really afford to do that in 3.14. So to fix this in 3.14, let's patch things up in the
ceval loop instead.
This is safe because the compiler only sets __annotate__ to just-created dedicated
annotate functions.
The watcher-bits read in _PyDict_NotifyEvent needs to use acquire to
synchronize with the release from PyDict_Watch so that the callback
publication is visible before the callback is invoked.
(cherry picked from commit 19f96f99fe)
Co-authored-by: Sam Gross <colesbury@gmail.com>
Fixes data races between dict mutation and watch/unwatch on the same dict.
(cherry picked from commit 3ab94d6842)
Co-authored-by: Sam Gross <colesbury@gmail.com>
gh-148144: Initialize visited on copied interpreter frames (GH-148143)
_PyFrame_Copy() copied interpreter frames into generator and
frame-object storage without initializing the visited byte. Incremental
GC later reads frame->visited in mark_stacks() on non-start passes, so
copied frames could expose an uninitialized value once they became live
on a thread stack again.
Reset visited when copying a frame so copied frames start with defined
GC bookkeeping state. Preserve lltrace in Py_DEBUG builds.
(cherry picked from commit fbfc6ccb0a)
Co-authored-by: Pablo Galindo Salgado <Pablogsal@gmail.com>
Align the QSBR thread state array to a 64-byte cache line boundary
and add padding at the end of _PyThreadStateImpl. Depending on heap
layout, the QSBR array could end up sharing a cache line with a
thread's tlbc_index, causing QSBR quiescent state updates to contend
with reads of tlbc_index in RESUME_CHECK. This is sensitive to
earlier allocations during interpreter init and can appear or
disappear with seemingly unrelated changes.
Either change alone is sufficient to fix the specific issue, but both
are worthwhile to avoid similar problems in the future.
(cherry picked from commit 6577d870b0)
gh-143050: Correct PyLong_FromString() to use _PyLong_Negate() (GH-145901)
The long_from_string_base() might return a small integer, when the
_pylong.py is used to do conversion. Hence, we must be careful here to
not smash it "small int" bit by using the _PyLong_FlipSign().
(cherry picked from commit db5936c5b8)
Co-authored-by: Sergey B Kirpichev <skirpichev@gmail.com>
Co-authored-by: Victor Stinner <vstinner@python.org>
Also fix a few related issues in the pyatomic headers:
* Fix _Py_atomic_store_uint_release in pyatomic_msc.h to use __stlr32
on ARM64 instead of a plain volatile store (which is only relaxed on
ARM64).
* Add missing _Py_atomic_store_uint_release to pyatomic_gcc.h.
* Fix pseudo-code comment for _Py_atomic_store_ptr_release in
pyatomic.h.
(cherry picked from commit 1eff27f2c0)
Add special cases for classmethod and staticmethod descriptors in
_PyObject_GetMethodStackRef() to avoid calling tp_descr_get, which
avoids reference count contention on the bound method and underlying
callable. This improves scaling when calling classmethods and
staticmethods from multiple threads.
Also refactor method_vectorcall in classobject.c into a new
_PyObject_VectorcallPrepend() helper so that it can be used by
PyObject_VectorcallMethod as well.
(cherry picked from commit e0f7c1097e)
Cache one datachunk per tstate to prevent alloc/dealloc thrashing when repeatedly hitting the same call depth at exactly the wrong boundary.
Move new _ts member to the end to not mess up remote debuggers' ideas of the
struct's layout. (The struct is only created by the runtime, and the new
field only used by the runtime, so it should be safe.)
(cherry picked from commit 706fd4ec08)
Co-authored-by: T. Wouters <thomas@python.org>
Co-authored-by: blurb-it[bot] <43283697+blurb-it[bot]@users.noreply.github.com>
Avoid locking in the PyType_Lookup cache-miss path if the type's
tp_version_tag is already valid.
(cherry picked from commit cd52172831)
Co-authored-by: Sam Gross <colesbury@gmail.com>
This undoes a change made as a part of PR 137470, for compatibility with EMSDK
4.0.19. It adds `emscripten_trampoline` field in `pycore_runtime_structs.h`
and initializes it from JS initialization code with the wasm-gc based trampoline
if possible. Otherwise we fall back to the JS trampoline.
(cherry picked from commit 43fdb7037e)
Co-authored-by: Hood Chatham <roberthoodchatham@gmail.com>
gh-143057: avoid locking in `tracemalloc` C-APIs when it is not enabled (GH-143065)
(cherry picked from commit e728b006de)
Co-authored-by: Kumar Aditya <kumaraditya@python.org>
There are places we use "relaxed" loads where C11 requires "consume" or
stronger. Unfortunately, compilers don't really implement "consume" so
fake it for our use in a way that avoids upsetting TSan.
(cherry picked from commit 0a62f8277e)
Co-authored-by: Sam Gross <colesbury@gmail.com>
gh-140222: Increase stack margin on debug build (GH-142452)
Increase _PyOS_MIN_STACK_SIZE if Python is built in debug mode.
(cherry picked from commit 49207a5226)
Co-authored-by: Victor Stinner <vstinner@python.org>
On m68k, an fmove instruction accessing %fpcr may only move from
or to a data register or a memory operand. The constraint "g" also
permits the use of address registers, which is invalid. The correct
constraint is "dm". Beginning with GCC 15, the register allocator
picks an address register in the code which causes SIGILL during
runtime.
(cherry picked from commit 02c085d48b)
Co-authored-by: John Paul Adrian Glaubitz <glaubitz@physik.fu-berlin.de>
Co-authored-by: Michael Karcher <github@mkarcher.dialup.fu-berlin.de>
gh-116008: Detect freed thread state in faulthandler (GH-141988)
Add _PyMem_IsULongFreed() function.
(cherry picked from commit d5d9e89dde)
Co-authored-by: Victor Stinner <vstinner@python.org>
* gh-125434: Display thread name in faulthandler on Windows (#140675)
(cherry picked from commit 313145eab5)
* gh-125434: Fix non-ASCII thread names in faulthandler on Windows (#140700)
Add _Py_DumpWideString() function to dump a wide string as ASCII. It
supports surrogate pairs.
Replace _Py_EncodeLocaleRaw() with _Py_DumpWideString()
in write_thread_name().
(cherry picked from commit 80f20f58b2)
Added atomic operations to `scanner_begin()` and `scanner_end()` to prevent
race conditions on the `executing` flag in free-threaded builds. Also added
tests for concurrent usage of the `re` module.
Without the atomic operations, `test_scanner_concurrent_access()` triggers
`assert(self->executing)` failures, or a thread sanitizer run emits errors.
(cherry picked from commit bc9e63dd9d)
Co-authored-by: Alper <alperyoney@fb.com>
Only raises if the stack pointer is both below the limit *and* above the stack base.
This prevents false positives for user-space threads, as the stack pointer will be outside those bounds
if the stack has been swapped.
Cherry-picked from commit c25a070759
Co-authored-by: Mark Shannon <mark@hotpy.org>
* [3.14] GH-139914: Handle stack growth direction on HPPA (GH-140028)
Adapted from a patch for Python 3.14 submitted to the Debian BTS by John David Anglin https://bugs.debian.org/1105111#20
* Forgot to update test_call
* WTF typo