cpython

mirror of https://github.com/python/cpython.git synced 2026-05-07 21:20:55 -04:00

Author	SHA1	Message	Date
Mark Shannon	ddd959987c	GH-128685: Specialize (rather than quicken) LOAD_CONST into LOAD_CONST_[IM]MORTAL (GH-128708)	2025-01-13 10:30:28 +00:00
Brandt Bucher	65ae3d5a73	GH-127809: Fix the JIT's understanding of ** (GH-127844)	2025-01-07 17:25:48 -08:00
Mark Shannon	f826beca0c	GH-128375: Better instrument for `FOR_ITER` (GH-128445)	2025-01-06 17:54:47 +00:00
Yan Yanchii	30efede33c	gh-128195: Add `_REPLACE_WITH_TRUE` to the tier2 optimizer (GH-128203)	2024-12-24 05:17:47 +08:00
Neil Schemenauer	1b15c89a17	gh-115999: Specialize `STORE_ATTR` in free-threaded builds. (gh-127838) * Add `_PyDictKeys_StringLookupSplit` which does locking on dict keys and use in place of `_PyDictKeys_StringLookup`. * Change `_PyObject_TryGetInstanceAttribute` to use that function in the case of split keys. * Add `unicodekeys_lookup_split` helper which allows code sharing between `_Py_dict_lookup` and `_PyDictKeys_StringLookupSplit`. * Fix locking for `STORE_ATTR_INSTANCE_VALUE`. Create `_GUARD_TYPE_VERSION_AND_LOCK` uop so that object stays locked and `tp_version_tag` cannot change. * Pass `tp_version_tag` to `specialize_dict_access()`, ensuring the version we store on the cache is the correct one (in case of it changing during the specialize analysis). * Split `analyze_descriptor` into `analyze_descriptor_load` and `analyze_descriptor_store` since those don't share much logic. Add `descriptor_is_class` helper function. * In `specialize_dict_access`, double check `_PyObject_GetManagedDict()` in case we race and dict was materialized before the lock. * Avoid borrowed references in `_Py_Specialize_StoreAttr()`. * Use `specialize()` and `unspecialize()` helpers. * Add unit tests to ensure specializing happens as expected in FT builds. * Add unit tests to attempt to trigger data races (useful for running under TSAN). * Add `has_split_table` function to `_testinternalcapi`.	2024-12-19 10:21:17 -08:00
Mark Shannon	d2f1d917e8	GH-122548: Implement branch taken and not taken events for sys.monitoring (GH-122564)	2024-12-19 16:59:51 +00:00
Donghee Na	48c70b8f7d	gh-115999: Enable BINARY_SUBSCR_GETITEM for free-threaded build (gh-127737)	2024-12-19 11:08:17 +09:00
mpage	2de048ce79	gh-115999: Specialize loading attributes from modules in free-threaded builds (#127711) We use the same approach that was used for specialization of LOAD_GLOBAL in free-threaded builds: _CHECK_ATTR_MODULE is renamed to _CHECK_ATTR_MODULE_PUSH_KEYS; it pushes the keys object for the following _LOAD_ATTR_MODULE_FROM_KEYS (nee _LOAD_ATTR_MODULE). This arrangement avoids having to recheck the keys version. _LOAD_ATTR_MODULE is renamed to _LOAD_ATTR_MODULE_FROM_KEYS; it loads the value from the keys object pushed by the preceding _CHECK_ATTR_MODULE_PUSH_KEYS at the cached index.	2024-12-13 10:17:16 -08:00
Ken Jin	6293d00e72	gh-120619: Strength reduce function guards, support 2-operand uop forms (GH-124846) Co-authored-by: Brandt Bucher <brandtbucher@gmail.com>	2024-11-09 11:35:33 +08:00
mpage	2e95c5ba3b	gh-115999: Implement thread-local bytecode and enable specialization for `BINARY_OP` (#123926) Each thread specializes a thread-local copy of the bytecode, created on the first RESUME, in free-threaded builds. All copies of the bytecode for a code object are stored in the co_tlbc array on the code object. Each thread reserves a globally unique index identifying its copy of the bytecode in all co_tlbc arrays at thread creation and releases the index at thread destruction. The first entry in every co_tlbc array always points to the "main" copy of the bytecode that is stored at the end of the code object. This ensures that no bytecode is copied for programs that do not use threads. Thread-local bytecode can be disabled at runtime by providing either -X tlbc=0 or PYTHON_TLBC=0. Disabling thread-local bytecode also disables specialization. Concurrent modifications to the bytecode made by the specializing interpreter and instrumentation use atomics, with specialization taking care not to overwrite an instruction that was instrumented concurrently.	2024-11-04 11:13:32 -08:00
Mark Shannon	faa3272fb8	GH-125837: Split `LOAD_CONST` into three. (GH-125972) * Add LOAD_CONST_IMMORTAL opcode * Add LOAD_SMALL_INT opcode * Remove RETURN_CONST opcode	2024-10-29 11:15:42 +00:00
Brandt Bucher	b5b06349eb	GH-125912: Teach the JIT's optimizer about _BINARY_OP_INPLACE_ADD_UNICODE (GH-125935)	2024-10-28 14:37:16 -07:00
mpage	f978fb4f8d	gh-115999: Refactor `LOAD_GLOBAL` specializations to avoid reloading {globals, builtins} keys (gh-124953) Each of the `LOAD_GLOBAL` specializations is implemented roughly as: 1. Load keys version. 2. Load cached keys version. 3. Deopt if (1) and (2) don't match. 4. Load keys. 5. Load cached index into keys. 6. Load object from (4) at offset from (5). This is not thread-safe in free-threaded builds; the keys object may be replaced in between steps (3) and (4). This change refactors the specializations to avoid reloading the keys object and instead pass the keys object from guards to be consumed by downstream uops.	2024-10-09 15:18:25 +00:00
Mark Shannon	da071fa3e8	GH-119866: Spill the stack around escaping calls. (GH-124392) * Spill the evaluation stack around escaping calls in the generated interpreter and JIT. * The code generator tracks live, cached values so they can be saved to memory when needed. * Spills the stack pointer around escaping calls, so that the exact stack is visible to the cycle GC.	2024-10-07 14:56:39 +01:00
Ken Jin	b84a763dca	gh-120619: Optimize through `_Py_FRAME_GENERAL` (GH-124518) * Optimize through _Py_FRAME_GENERAL * refactor	2024-10-03 01:10:51 +08:00
Savannah Ostrowski	65f1237098	GH-123516: Improve JIT memory consumption by invalidating cold executors (GH-124443) Co-authored-by: Bénédikt Tran <10796600+picnixz@users.noreply.github.com>	2024-09-27 00:35:42 +00:00
Ken Jin	8810e286fa	gh-121459: Deferred LOAD_GLOBAL (GH-123128) Co-authored-by: Bénédikt Tran <10796600+picnixz@users.noreply.github.com> Co-authored-by: Sam Gross <655866+colesbury@users.noreply.github.com>	2024-09-14 00:23:51 +08:00
Mark Shannon	4ed7d1d6ac	GH-123996: Explicitly mark 'self_or_null' as an array of size 1 to ensure that it is kept in memory for calls (GH-124003)	2024-09-12 15:32:45 +01:00
Mark Shannon	a4fd7aa4a6	GH-115776: Allow any fixed sized object to have inline values (GH-123192)	2024-08-21 15:52:04 +01:00
Mark Shannon	bb1d30336e	GH-118093: Make `CALL_ALLOC_AND_ENTER_INIT` suitable for tier 2. (GH-123140) * Convert CALL_ALLOC_AND_ENTER_INIT to micro-ops such that tier 2 supports it * Allow inexact arguments for CALL_ALLOC_AND_ENTER_INIT.	2024-08-20 16:52:58 +01:00
Mark Shannon	c13e7d98fb	GH-118093: Specialize `CALL_KW` (GH-123006)	2024-08-16 17:11:24 +01:00
Mark Shannon	eec7bdaf01	GH-120024: Remove `CHECK_EVAL_BREAKER` macro. (GH-122968) * Factor some instructions into micro-ops to isolate CHECK_EVAL_BREAKER for escape analysis * Eliminate CHECK_EVAL_BREAKER macro	2024-08-14 12:04:05 +01:00
Mark Shannon	1795d6ceba	GH-122869: Add missing tier two optimizer cases (GH-122936)	2024-08-12 10:35:52 -07:00
Mark Shannon	df13a1821a	GH-118095: Add tier two support for BINARY_SUBSCR_GETITEM (GH-120793)	2024-08-01 16:19:05 -07:00
Mark Shannon	a9d56e38a0	GH-122155: Track local variables between pops and pushes in cases generator (GH-122286)	2024-08-01 09:27:26 +01:00
Mark Shannon	1ca99ed240	Manually override bytecode definition in optimizer, to avoid build error (GH-122316)	2024-07-26 18:38:52 +01:00
Brandt Bucher	64857d849f	GH-122294: Burn in the addresses of side exits (GH-122295)	2024-07-26 09:40:15 -07:00
Mark Shannon	95a73917cd	GH-122029: Break INSTRUMENTED_CALL into micro-ops, so that its behavior is consistent with CALL (GH-122177)	2024-07-26 14:35:57 +01:00
Mark Shannon	afb0aa6ed2	GH-121131: Clean up and fix some instrumented instructions. (GH-121132) * Add support for 'prev_instr' to code generator and refactor some INSTRUMENTED instructions	2024-07-26 12:24:12 +01:00
Brandt Bucher	d9efa45d74	GH-118093: Add tier two support for BINARY_OP_INPLACE_ADD_UNICODE (GH-122253)	2024-07-25 14:45:07 -07:00
Brandt Bucher	5f6001130f	GH-118093: Add tier two support for LOAD_ATTR_PROPERTY (GH-122283)	2024-07-25 10:45:28 -07:00
Mark Shannon	2e14a52cce	GH-122160: Remove BUILD_CONST_KEY_MAP opcode. (GH-122164)	2024-07-25 16:24:29 +01:00
Brandt Bucher	7b36b67b1e	GH-118093: Add tier two support to several instructions (GH-121884)	2024-07-18 14:24:58 -07:00
Brandt Bucher	33903c53db	GH-116017: Get rid of _COLD_EXITs (GH-120960)	2024-07-01 13:17:40 -07:00
Ken Jin	22b0de2755	gh-117139: Convert the evaluation stack to stack refs (#118450) This PR sets up tagged pointers for CPython. The general idea is to create a separate struct _PyStackRef for everything on the evaluation stack to store the bits. This forces the C compiler to warn us if we try to cast things or pull things out of the struct directly. Only for free threading: We tag the low bit if something is deferred - that means we skip incref and decref operations on it. This behavior may change in the future if Mark's plans to defer all objects in the interpreter loop pan out. This implies a strict stack reference discipline is required. ALL incref and decref operations on stackrefs must use the stackref variants. It is unsafe to untag something then do normal incref/decref ops on it. The new incref and decref variants are called dup and close. They mimic a "handle" API operating on these stackrefs. Please read Include/internal/pycore_stackref.h for more information! Co-authored-by: Mark Shannon <9448417+markshannon@users.noreply.github.com>	2024-06-27 03:10:43 +08:00
Mark Shannon	8f5a01707f	GH-120982: Add stack check assertions to generated interpreter code (GH-120992)	2024-06-25 16:42:29 +01:00
Nadeshiko Manju	f385d99f57	gh-120437: Fix `_CHECK_STACK_SPACE` optimization problems introduced in gh-118322 (GH-120712) Co-authored-by: Ken Jin <kenjin4096@gmail.com>	2024-06-19 23:34:39 +08:00
Mark Shannon	9cefcc0ee7	GH-120507: Lower the `BEFORE_WITH` and `BEFORE_ASYNC_WITH` instructions. (#120640) * Remove BEFORE_WITH and BEFORE_ASYNC_WITH instructions. * Add LOAD_SPECIAL instruction * Reimplement `with` and `async with` statements using LOAD_SPECIAL	2024-06-18 12:17:46 +01:00
Mark Shannon	274f844830	GH-120619: Clean up `RETURN_VALUE` instruction (GH-120624) * Rename _POP_FRAME to _RETURN_VALUE as it returns a value as well as popping a frame. * Remove remaining _POP_FRAMEs	2024-06-17 14:40:11 +01:00
Saul Shanabrook	55402d3232	gh-119258: Eliminate Type Guards in Tier 2 Optimizer with Watcher (GH-119365) Co-authored-by: parmeggiani <parmeggiani@spaziodati.eu> Co-authored-by: dpdani <git@danieleparmeggiani.me> Co-authored-by: blurb-it[bot] <43283697+blurb-it[bot]@users.noreply.github.com> Co-authored-by: Brandt Bucher <brandtbucher@microsoft.com> Co-authored-by: Ken Jin <kenjin@python.org>	2024-06-08 17:41:45 +08:00
Jelle Zijlstra	80a4e38994	gh-119821: Support non-dict globals in LOAD_FROM_DICT_OR_GLOBALS (#119822) The implementation basically copies LOAD_GLOBAL. Possibly it could be deduplicated, but that seems like it may get hairy since the two operations have different operands. This is important to fix in 3.14 for PEP 649, but it's a bug in earlier versions too, and we should backport to 3.13 and 3.12 if possible.	2024-05-31 14:05:24 -07:00
Brandt Bucher	5cd3ffd6b7	GH-119258: Handle STORE_ATTR_WITH_HINT in tier two (GH-119481)	2024-05-28 12:47:54 -07:00
Brandt Bucher	cfcc054dee	GH-119476: Split _CHECK_FUNCTION_VERSION out of _CHECK_FUNCTION_EXACT_ARGS (GH-119510)	2024-05-28 12:45:11 -07:00
Jelle Zijlstra	98e855fcc1	gh-119180: Add LOAD_COMMON_CONSTANT opcode (#119321) The PEP 649 implementation will require a way to load NotImplementedError from the bytecode. @markshannon suggested implementing this by converting LOAD_ASSERTION_ERROR into a more general mechanism for loading constants. This PR adds this new opcode. I will work on the rest of the implementation of the PEP separately. Co-authored-by: Irit Katriel <1055913+iritkatriel@users.noreply.github.com>	2024-05-22 00:46:39 +00:00
Mark Shannon	f5c6b9977a	GH-118910: Less boilerplate in the tier 2 optimizer (#118913)	2024-05-10 17:43:23 +01:00
Mark Shannon	1ab6356ebe	GH-118095: Use broader specializations of CALL in tier 1, for better tier 2 support of calls. (GH-118322) * Add CALL_PY_GENERAL, CALL_BOUND_METHOD_GENERAL and CALL_NON_PY_GENERAL specializations. * Remove CALL_PY_WITH_DEFAULTS specialization * Use CALL_NON_PY_GENERAL in more cases when otherwise failing to specialize	2024-05-04 12:11:11 +01:00
Mark Shannon	da2cfc4cb6	GH-113464: Remove the extra jump via `_SIDE_EXIT` in `_EXIT_TRACE` (GH-118545)	2024-05-04 08:50:24 +01:00
Mark Shannon	67bba9dd0f	GH-117442: Check eval-breaker at start (rather than end) of tier 2 loops (GH-118482)	2024-05-02 13:10:31 +01:00
Mark Shannon	5b05d452cd	GH-118095: Add tier 2 support for YIELD_VALUE (GH-118380)	2024-04-30 11:33:13 +01:00
Mark Shannon	ab6eda0ee5	GH-118095: Allow a variant of RESUME_CHECK in tier 2 (GH-118286)	2024-04-29 07:54:05 +01:00
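
The race described in the `LOAD_GLOBAL` refactoring commit (f978fb4f8d) can be modelled in a few lines of Python. This is only an illustrative sketch, not CPython's C implementation: `Keys`, `Namespace`, `Deopt`, and all function names are hypothetical, and the `interleave` callback stands in for another thread replacing the keys object between the version check and the load.

```python
from dataclasses import dataclass


class Deopt(Exception):
    """Raised when a guard fails and the specialized path must be abandoned."""


@dataclass(frozen=True)
class Keys:
    # Hypothetical stand-in for a dict keys object with a version tag.
    version: int
    entries: tuple


class Namespace:
    # Hypothetical stand-in for a globals/builtins dict holding a keys object.
    def __init__(self, keys):
        self.keys = keys


def load_global_reloading(ns, cached_version, cached_index, interleave=None):
    """Old shape: check the version, then reload the keys before indexing."""
    if ns.keys.version != cached_version:   # steps 1-3: compare versions, deopt
        raise Deopt()
    if interleave is not None:
        interleave()                         # another thread swaps the keys here
    keys = ns.keys                           # step 4: reload (may be a new object)
    return keys.entries[cached_index]        # steps 5-6: index the unvalidated keys


def guard_keys_version_push_keys(ns, cached_version):
    """New shape, guard half: load the keys once, validate, and hand them on."""
    keys = ns.keys
    if keys.version != cached_version:
        raise Deopt()
    return keys


def load_from_keys(keys, cached_index):
    """New shape, load half: consume the keys object that the guard validated."""
    return keys.entries[cached_index]


def demo():
    ns = Namespace(Keys(version=1, entries=("old_value",)))

    def replace_keys():
        ns.keys = Keys(version=2, entries=("new_value",))

    # Reloading variant: the guard passes, yet the load reads the replaced keys.
    racy = load_global_reloading(ns, 1, 0, interleave=replace_keys)

    # Guard-passes-keys variant: the load uses exactly what the guard checked.
    ns.keys = Keys(version=1, entries=("old_value",))
    keys = guard_keys_version_push_keys(ns, 1)
    replace_keys()
    safe = load_from_keys(keys, 0)
    return racy, safe
```

In this toy model the reloading variant ends up indexing a keys object whose version was never checked, while the second variant only ever touches the object the guard validated. That is the property the commit message attributes to passing the keys object from guards to downstream uops.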

80 Commits
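
The low-bit tagging scheme described in the stack refs commit (22b0de2755) can be illustrated with a toy model. This is a sketch under stated assumptions, not the contents of `Include/internal/pycore_stackref.h`: integers stand in for pointers, `ToyHeap` and every function name below are hypothetical, and only the idea that `dup` and `close` skip reference counting for deferred references comes from the commit message.

```python
DEFERRED_TAG = 0b1  # low bit marks a deferred reference in this toy model


class ToyHeap:
    """Hypothetical heap mapping aligned integer addresses to reference counts."""

    def __init__(self):
        self.refcounts = {}

    def new_object(self, address):
        if address & DEFERRED_TAG:
            raise ValueError("addresses must be aligned so the low bit is free")
        self.refcounts[address] = 1
        return address


def from_address(address, deferred=False):
    """Wrap an address as a stack reference, optionally tagging it as deferred."""
    return address | DEFERRED_TAG if deferred else address


def is_deferred(ref):
    return bool(ref & DEFERRED_TAG)


def as_address(ref):
    """Strip the tag bit to recover the underlying address."""
    return ref & ~DEFERRED_TAG


def dup(heap, ref):
    """Stackref-style incref: a no-op on the count for deferred references."""
    if not is_deferred(ref):
        heap.refcounts[as_address(ref)] += 1
    return ref


def close(heap, ref):
    """Stackref-style decref: a no-op on the count for deferred references."""
    if not is_deferred(ref):
        heap.refcounts[as_address(ref)] -= 1
```

The model also shows why the commit message calls mixing APIs unsafe: stripping the tag from a deferred reference and then applying an ordinary decrement would lower a count that the matching `dup` never raised.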