Commit Graph

137 Commits

Author SHA1 Message Date
mpage 2e95c5ba3b gh-115999: Implement thread-local bytecode and enable specialization for BINARY_OP (#123926)
Each thread specializes a thread-local copy of the bytecode, created on the first RESUME, in free-threaded builds. All copies of the bytecode for a code object are stored in the co_tlbc array on the code object. Threads reserve a globally unique index identifying its copy of the bytecode in all co_tlbc arrays at thread creation and release the index at thread destruction. The first entry in every co_tlbc array always points to the "main" copy of the bytecode that is stored at the end of the code object. This ensures that no bytecode is copied for programs that do not use threads.

Thread-local bytecode can be disabled at runtime by providing either -X tlbc=0 or PYTHON_TLBC=0. Disabling thread-local bytecode also disables specialization.

Concurrent modifications to the bytecode made by the specializing interpreter and instrumentation use atomics, with specialization taking care not to overwrite an instruction that was instrumented concurrently.
2024-11-04 11:13:32 -08:00
Victor Stinner ef4b69d2be gh-123747: Avoid static_assert() in internal header files (#123779) 2024-09-06 15:52:07 +02:00
Mark Shannon a3d8c0542e GH-123197: Only count an instruction as deferred if it hasn't deopted first. (GH-123222)
Only count an instruction as deferred if hasn't deopted first.
2024-08-22 14:17:10 +01:00
Mark Shannon bb1d30336e GH-118093: Make CALL_ALLOC_AND_ENTER_INIT suitable for tier 2. (GH-123140)
* Convert CALL_ALLOC_AND_ENTER_INIT to micro-ops such that tier 2 supports it

* Allow inexact arguments for CALL_ALLOC_AND_ENTER_INIT.
2024-08-20 16:52:58 +01:00
Mark Shannon c13e7d98fb GH-118093: Specialize CALL_KW (GH-123006) 2024-08-16 17:11:24 +01:00
Mark Shannon 7a65439b93 GH-122390: Replace _Py_GetbaseOpcode with _Py_GetBaseCodeUnit (GH-122942) 2024-08-13 14:22:57 +01:00
Victor Stinner a2bec77d25 gh-120642: Move _PyCode_CODE() to the internal C API (#121644)
Move _PyCode_CODE() and _PyCode_NBYTES() macros to the internal C API
since they use _Py_CODEUNIT which is only part of the internal C API.
2024-07-13 23:07:49 +02:00
Brandt Bucher 33903c53db GH-116017: Get rid of _COLD_EXITs (GH-120960) 2024-07-01 13:17:40 -07:00
Ken Jin 22b0de2755 gh-117139: Convert the evaluation stack to stack refs (#118450)
This PR sets up tagged pointers for CPython.

The general idea is to create a separate struct _PyStackRef for everything on the evaluation stack to store the bits. This forces the C compiler to warn us if we try to cast things or pull things out of the struct directly.

Only for free threading: We tag the low bit if something is deferred - that means we skip incref and decref operations on it. This behavior may change in the future if Mark's plans to defer all objects in the interpreter loop pans out.

This implies a strict stack reference discipline is required. ALL incref and decref operations on stackrefs must use the stackref variants. It is unsafe to untag something then do normal incref/decref ops on it.

The new incref and decref variants are called dup and close. They mimic a "handle" API operating on these stackrefs.

Please read Include/internal/pycore_stackref.h for more information!

---------

Co-authored-by: Mark Shannon <9448417+markshannon@users.noreply.github.com>
2024-06-27 03:10:43 +08:00
Victor Stinner 9e4a81f00f gh-120642: Move private PyCode APIs to the internal C API (#120643)
* Move _Py_CODEUNIT and related functions to pycore_code.h.
* Move _Py_BackoffCounter to pycore_backoff.h.
* Move Include/cpython/optimizer.h content to pycore_optimizer.h.
* Remove Include/cpython/optimizer.h.
* Remove PyUnstable_Replace_Executor().

Rename functions:

* PyUnstable_GetExecutor() => _Py_GetExecutor()
* PyUnstable_GetOptimizer() => _Py_GetOptimizer()
* PyUnstable_SetOptimizer() => _Py_SetTier2Optimizer()
* PyUnstable_Optimizer_NewCounter() => _PyOptimizer_NewCounter()
* PyUnstable_Optimizer_NewUOpOptimizer() => _PyOptimizer_NewUOpOptimizer()
2024-06-26 13:54:03 +02:00
Sam Gross 723d4d2fe8 gh-118527: Intern code consts in free-threaded build (#118667)
We already intern and immortalize most string constants. In the
free-threaded build, other constants can be a source of reference count
contention because they are shared by all threads running the same code
objects.
2024-05-06 20:12:39 -04:00
Guido van Rossum 4c7bfdff90 Remove more remnants of deepfreeze (#118159) 2024-04-22 12:17:57 -07:00
Jeff Glass acf69e09c6 gh-115178: Add Counts of UOp Pairs to pystats (GH-115181) 2024-04-16 14:27:18 +01:00
Guido van Rossum 060a96f1a9 gh-116968: Reimplement Tier 2 counters (#117144)
Introduce a unified 16-bit backoff counter type (``_Py_BackoffCounter``),
shared between the Tier 1 adaptive specializer and the Tier 2 optimizer. The
API used for adaptive specialization counters is changed but the behavior is
(supposed to be) identical.

The behavior of the Tier 2 counters is changed:
- There are no longer dynamic thresholds (we never varied these).
- All counters now use the same exponential backoff.
- The counter for ``JUMP_BACKWARD`` starts counting down from 16.
- The ``temperature`` in side exits starts counting down from 64.
2024-04-04 15:03:27 +00:00
Mark Shannon c32dc47aca GH-115776: Embed the values array into the object, for "normal" Python objects. (GH-116115) 2024-04-02 11:59:21 +01:00
Michael Droettboom 50369e6c34 gh-116996: Add pystats about _Py_uop_analyse_and_optimize (GH-116997) 2024-03-22 01:27:46 +08:00
Ken Jin 7114cf20c0 gh-116381: Specialize CONTAINS_OP (GH-116385)
* Specialize CONTAINS_OP

* 📜🤖 Added by blurb_it.

* Add PyAPI_FUNC for JIT

---------

Co-authored-by: blurb-it[bot] <43283697+blurb-it[bot]@users.noreply.github.com>
2024-03-07 03:30:11 +08:00
Brett Simmers 339c8e1c13 gh-115999: Disable the specializing adaptive interpreter in free-threaded builds (#116013)
For now, disable all specialization when the GIL might be disabled.
2024-02-29 21:53:32 -05:00
Eric Snow 514b1c91b8 gh-76785: Improved Subinterpreters Compatibility with 3.12 (gh-115424)
For the most part, these changes make is substantially easier to backport subinterpreter-related code to 3.12, especially the related modules (e.g. _xxsubinterpreters). The main motivation is to support releasing a PyPI package with the 3.13 capabilities compiled for 3.12.

A lot of the changes here involve either hiding details behind macros/functions or splitting up some files.
2024-02-13 14:56:49 -07:00
Michael Droettboom ea3cd0498c gh-114312: Collect stats for unlikely events (GH-114493) 2024-01-25 11:10:51 +00:00
Victor Stinner a74902a14c gh-106550: Fix sign conversion in pycore_code.h (#112613)
Fix sign conversion in pycore_code.h: use unsigned integers and cast
explicitly when needed.
2023-12-04 11:42:58 +01:00
Michael Droettboom 2bc01cc0c7 gh-111652: Fix --enable-pystats build (GH-111653) 2023-11-03 15:21:16 +00:00
Michael Droettboom 84b4533e84 gh-109329: Count tier2 opcode misses (#110561)
This keeps a separate 'miss' counter for each micro-opcode, incremented whenever a guard uop takes a deoptimization side exit.
2023-10-30 17:02:45 -07:00
Michael Droettboom e561e98058 GH-109329: Add tier 2 stats (GH-109913) 2023-10-04 14:52:28 -07:00
Brandt Bucher 22e65eecaa GH-105848: Replace KW_NAMES + CALL with LOAD_CONST + CALL_KW (GH-109300) 2023-09-13 10:25:45 -07:00
Mark Shannon 15d4c9fabc GH-108716: Turn off deep-freezing of code objects. (GH-108722) 2023-09-08 10:34:40 +01:00
Dong-hee Na 3bfa24e29f gh-107265: Remove all ENTER_EXECUTOR when execute _Py_Instrument (gh-108539) 2023-09-07 09:53:54 +09:00
Victor Stinner a0773b89df gh-108753: Enhance pystats (#108754)
Statistics gathering is now off by default. Use the "-X pystats"
command line option or set the new PYTHONSTATS environment variable
to 1 to turn statistics gathering on at Python startup.

Statistics are no longer dumped at exit if statistics gathering was
off or statistics have been cleared.

Changes:

* Add PYTHONSTATS environment variable.
* sys._stats_dump() now returns False if statistics are not dumped
  because they are all equal to zero.
* Add PyConfig._pystats member.
* Add tests on sys functions and on setting PyConfig._pystats to 1.
* Add Include/cpython/pystats.h and Include/internal/pycore_pystats.h
  header files.
* Rename '_py_stats' variable to '_Py_stats'.
* Exclude Include/cpython/pystats.h from the Py_LIMITED_API.
* Move pystats.h include from object.h to Python.h.
* Add _Py_StatsOn() and _Py_StatsOff() functions. Remove
  '_py_stats_struct' variable from the API: make it static in
  specialize.c.
* Document API in Include/pystats.h and Include/cpython/pystats.h.
* Complete pystats documentation in Doc/using/configure.rst.
* Don't write "all zeros" stats: if _stats_off() and _stats_clear()
  or _stats_dump() were called.
* _PyEval_Fini() now always call _Py_PrintSpecializationStats() which
  does nothing if stats are all zeros.

Co-authored-by: Michael Droettboom <mdboom@gmail.com>
2023-09-06 15:54:59 +00:00
Victor Stinner 21c0844742 gh-108220: Internal header files require Py_BUILD_CORE to be defined (#108221)
* pycore_intrinsics.h does nothing if included twice
  (add #ifndef and #define).
* Update Tools/cases_generator/generate_cases.py to generate the
  Py_BUILD_CORE test.
* _bz2, _lzma, _opcode and zlib extensions now define the
  Py_BUILD_CORE_MODULE macro to use internal headers
  (pycore_code.h, pycore_intrinsics.h and pycore_blocks_output_buffer.h).
2023-08-21 19:15:52 +02:00
Mark Shannon 2ba7c7f7b1 Add some GC stats to Py_STATS (GH-107581) 2023-08-04 10:34:23 +01:00
Irit Katriel 6ef8f8ca88 gh-105481: the ENABLE_SPECIALIZATION flag does not need to be generated by the build script, or exposed in opcode.py (#107534) 2023-08-01 17:05:00 +00:00
Victor Stinner c5b13d6f80 gh-107211: No longer export internal functions (4) (#107217)
No longer export these 2 internal C API functions:

* _PyEval_SignalAsyncExc()
* _PyEval_SignalReceived()

Add also comments explaining why some internal functions have to be
exported, and update existing comments.
2023-07-25 03:16:28 +00:00
Victor Stinner 11306a91bd gh-107211: No longer export internal functions (1) (#107213)
No longer export these 49 internal C API functions:

* _PyArgv_AsWstrList()
* _PyCode_New()
* _PyCode_Validate()
* _PyFloat_DebugMallocStats()
* _PyFloat_FormatAdvancedWriter()
* _PyImport_CheckSubinterpIncompatibleExtensionAllowed()
* _PyImport_ClearExtension()
* _PyImport_GetModuleId()
* _PyImport_SetModuleString()
* _PyInterpreterState_IDDecref()
* _PyInterpreterState_IDIncref()
* _PyInterpreterState_IDInitref()
* _PyInterpreterState_LookUpID()
* _PyWideStringList_AsList()
* _PyWideStringList_CheckConsistency()
* _PyWideStringList_Clear()
* _PyWideStringList_Copy()
* _PyWideStringList_Extend()
* _Py_ClearArgcArgv()
* _Py_DecodeUTF8Ex()
* _Py_DecodeUTF8_surrogateescape()
* _Py_EncodeLocaleRaw()
* _Py_EncodeUTF8Ex()
* _Py_GetEnv()
* _Py_GetForceASCII()
* _Py_GetLocaleEncoding()
* _Py_GetLocaleEncodingObject()
* _Py_GetLocaleconvNumeric()
* _Py_ResetForceASCII()
* _Py_device_encoding()
* _Py_dg_dtoa()
* _Py_dg_freedtoa()
* _Py_dg_strtod()
* _Py_get_blocking()
* _Py_get_env_flag()
* _Py_get_inheritable()
* _Py_get_osfhandle_noraise()
* _Py_get_xoption()
* _Py_open()
* _Py_open_osfhandle()
* _Py_open_osfhandle_noraise()
* _Py_read()
* _Py_set_blocking()
* _Py_str_to_int()
* _Py_wfopen()
* _Py_wgetcwd()
* _Py_wreadlink()
* _Py_wrealpath()
* _Py_write()
2023-07-25 03:44:11 +02:00
Mark Shannon b03755a234 GH-104909: Break LOAD_GLOBAL specializations in micro-ops. (GH-106677) 2023-07-12 14:34:14 +01:00
Brandt Bucher 7b2d94d875 GH-106008: Make implicit boolean conversions explicit (GH-106003) 2023-06-29 13:49:54 -07:00
Mark Shannon 7199584ac8 GH-100987: Allow objects other than code objects as the "executable" of an internal frame. (GH-105727)
* Add table describing possible executable classes for out-of-process debuggers.

* Remove shim code object creation code as it is no longer needed.

* Make lltrace a bit more robust w.r.t. non-standard frames.
2023-06-14 13:46:37 +01:00
Mark Shannon 4bfa01b9d9 GH-104584: Plugin optimizer API (GH-105100) 2023-06-02 11:46:18 +01:00
Carl Meyer 77262458fe gh-87729: improve hit rate of LOAD_SUPER_ATTR specialization (#104270) 2023-05-11 08:08:13 -06:00
Carl Meyer c3b595e73e gh-97933: (PEP 709) inline list/dict/set comprehensions (#101441)
Co-authored-by: Irit Katriel <1055913+iritkatriel@users.noreply.github.com>
Co-authored-by: Erlend E. Aasland <erlend.aasland@protonmail.com>
2023-05-09 11:02:14 -06:00
Carl Meyer ebf97c50f2 gh-103978: avoid using 'class' as an identifier (#103979) 2023-04-28 19:20:50 +00:00
Carl Meyer ef25febcf2 gh-87729: specialize LOAD_SUPER_ATTR_METHOD (#103809) 2023-04-25 17:45:51 +00:00
Mark Shannon 411b169281 GH-103082: Implementation of PEP 669: Low Impact Monitoring for CPython (GH-103083)
* The majority of the monitoring code is in instrumentation.c

* The new instrumentation bytecodes are in bytecodes.c

* legacy_tracing.c adapts the new API to the old sys.setrace and sys.setprofile APIs
2023-04-12 12:04:55 +01:00
Brandt Bucher b4978ff872 GH-88691: Shrink the CALL caches (GH-103230) 2023-04-05 14:15:49 -07:00
Brandt Bucher 121057aa36 GH-89987: Shrink the BINARY_SUBSCR caches (GH-103022) 2023-03-29 15:53:30 -07:00
Brandt Bucher 0444ae2487 GH-100982: Break up COMPARE_AND_BRANCH (GH-102801) 2023-03-23 15:25:09 -07:00
Brandt Bucher 08b67fb34f GH-90997: Shrink the LOAD_GLOBAL caches (#102569) 2023-03-10 17:01:16 -08:00
Mark Shannon 160f2fe2b9 GH-87849: Simplify stack effect of SEND and specialize it for generators and coroutines. (GH-101788) 2023-02-13 11:24:55 +00:00
Mark Shannon 7b14c2ef19 GH-100982: Add COMPARE_AND_BRANCH instruction (GH-100983) 2023-01-16 12:35:21 +00:00
Mark Shannon 6e4e14d98f GH-100923: Embed jump mask in COMPARE_OP oparg (GH-100924) 2023-01-11 20:40:43 +00:00
Mark Shannon 6997e77bdf GH-100222: Redefine _Py_CODEUNIT as a union to clarify structure of code unit. (GH-100223) 2022-12-14 11:12:53 +00:00