cpython

mirror of https://github.com/python/cpython.git synced 2026-06-27 13:20:40 -04:00

Author	SHA1	Message	Date
Victor Stinner	10f616cf39	[3.15] gh-151253: Dump the Python path configuration on _PyCodec_InitRegistry() failure (#151250) (#151269) If "import encodings" fails at Python startup, dump the Python path configuration to help users debug their configuration. The encodings module is the first module imported during Python startup. (cherry picked from commit `7b6e98911e`)	2026-06-10 22:03:27 +02:00
Stan Ulbrych	12805ef9da	`Python/codecs.c`: Remove unused forward declaration (#139511)	2025-10-03 13:33:49 +02:00
Serhiy Storchaka	af58a6f883	gh-88886: Remove excessive encoding name normalization (GH-137167) The codecs lookup function now performs only minimal normalization of the encoding name before passing it to the search functions: all ASCII letters are converted to lower case, spaces are replaced with hyphens. Excessive normalization broke third-party codecs providers, like python-iconv. Revert "bpo-37751: Fix codecs.lookup() normalization (GH-15092)" This reverts commit `20f59fe1f7`.	2025-09-09 21:07:21 +03:00
Peter Bierma	082f370cdd	gh-137514: Add a free-threading wrapper for mutexes (GH-137515) Add `FT_MUTEX_LOCK`/`FT_MUTEX_UNLOCK`, which call `PyMutex_Lock` and `PyMutex_Unlock` on the free-threaded build, and no-op otherwise.	2025-08-07 11:24:50 -04:00
Victor Stinner	ce1b747ff6	gh-58124: Avoid CP_UTF8 in UnicodeDecodeError (#137415) Fix name of the Python encoding in Unicode errors of the code page codec: use "cp65000" and "cp65001" instead of "CP_UTF7" and "CP_UTF8" which are not valid Python code names.	2025-08-06 14:35:27 +02:00
Inada Naoki	4e294f6feb	gh-133036: Deprecate codecs.open (#133038) Co-authored-by: Hugo van Kemenade <1324225+hugovk@users.noreply.github.com> Co-authored-by: Victor Stinner <vstinner@python.org>	2025-04-30 10:11:09 +09:00
Victor Stinner	20c5f969dd	gh-131238: Remove more includes from pycore_interp.h (#131480)	2025-03-19 23:01:32 +01:00
Victor Stinner	978e37bb5f	gh-131238: Add explicit includes to pycore headers (#131257)	2025-03-17 12:32:43 +01:00
Sergey Miryanov	3a7f17c7e2	gh-130790: Remove references about unicode's readiness from comments (#130801)	2025-03-03 19:18:09 +00:00
Bénédikt Tran	3146a25e97	gh-129173: refactor `PyCodec_BackslashReplaceErrors` into separate functions (#129895) The logic of `PyCodec_BackslashReplaceErrors` is now split into separate functions, each of which handles a specific exception type.	2025-03-03 13:58:15 +01:00
Bénédikt Tran	f693f84227	gh-129173: simplify `PyCodec_XMLCharRefReplaceErrors` logic (#129894) Writing the decimal representation of a Unicode codepoint only requires knowing the number of digits. Co-authored-by: Petr Viktorin <encukou@gmail.com>	2025-03-03 11:43:22 +00:00
Bénédikt Tran	fa6a8140dd	gh-129173: refactor `PyCodec_ReplaceErrors` into separate functions (#129893) The logic of `PyCodec_ReplaceErrors` is now split into separate functions, each of which handles a specific exception type.	2025-02-25 14:24:46 +01:00
Bénédikt Tran	e24a1ac17c	gh-129173: Use `_PyUnicodeError_GetParams` in `PyCodec_SurrogateEscapeErrors` (GH-129175)	2025-02-20 13:18:47 +00:00
Bénédikt Tran	1775091dc1	gh-129173: Use `_PyUnicodeError_GetParams` in `PyCodec_SurrogatePassErrors` (GH-129134)	2025-02-14 18:34:32 +01:00
Bénédikt Tran	a56ead089c	gh-129173: Use `_PyUnicodeError_GetParams` in `PyCodec_NameReplaceErrors` (GH-129135)	2025-02-08 16:01:57 +01:00
Bénédikt Tran	36bb229933	gh-129173: Use `_PyUnicodeError_GetParams` in `PyCodec_IgnoreErrors` (#129174) We also clean up `PyCodec_StrictErrors` and the error message rendered when an object of incorrect type is passed to codec error handlers.	2025-01-24 11:25:03 +01:00
Bénédikt Tran	25a614a502	gh-126004: Fix positions handling in `codecs.backslashreplace_errors` (#127676) This fixes how `PyCodec_BackslashReplaceErrors` handles the `start` and `end` attributes of `UnicodeError` objects via the `_PyUnicodeError_GetParams` helper.	2025-01-23 14:28:33 +01:00
Bénédikt Tran	225296cd5b	gh-126004: Fix positions handling in `codecs.replace_errors` (#127674) This fixes how `PyCodec_ReplaceErrors` handles the `start` and `end` attributes of `UnicodeError` objects via the `_PyUnicodeError_GetParams` helper.	2025-01-23 11:44:18 +01:00
Bénédikt Tran	70dcc847df	gh-126004: Fix positions handling in `codecs.xmlcharrefreplace_errors` (#127675) This fixes how `PyCodec_XMLCharRefReplaceErrors` handles the `start` and `end` attributes of `UnicodeError` objects via the `_PyUnicodeError_GetParams` helper.	2025-01-23 11:42:38 +01:00
Victor Stinner	b9a8ca0a6a	gh-115754: Use Py_GetConstant(Py_CONSTANT_EMPTY_STR) (#125194) Replace PyUnicode_New(0, 0), PyUnicode_FromString("") and PyUnicode_FromStringAndSize("", 0) with Py_GetConstant(Py_CONSTANT_EMPTY_STR).	2024-10-09 17:15:23 +02:00
Bénédikt Tran	c00964ecd5	gh-124665: Add `_PyCodec_UnregisterError` and `_codecs._unregister_error` (#124677)	2024-09-29 02:25:23 +02:00
Petr Viktorin	6f1d448bc1	gh-113993: Allow interned strings to be mortal, and fix related issues (GH-120520) * Add an InternalDocs file describing how interning should work and how to use it. * Add internal functions to explicitly request what kind of interning is done: - `_PyUnicode_InternMortal` - `_PyUnicode_InternImmortal` - `_PyUnicode_InternStatic` * Switch uses of `PyUnicode_InternInPlace` to those. * Disallow using `_Py_SetImmortal` on strings directly. You should use `_PyUnicode_InternImmortal` instead: - Strings should be interned before immortalization, otherwise you're possibly interning a immortalizing copy. - `_Py_SetImmortal` doesn't handle the `SSTATE_INTERNED_MORTAL` to `SSTATE_INTERNED_IMMORTAL` update, and those flags can't be changed in backports, as they are now part of public API and version-specific ABI. * Add private `_only_immortal` argument for `sys.getunicodeinternedsize`, used in refleak test machinery. * Make sure the statically allocated string singletons are unique. This means these sets are now disjoint: - `_Py_ID` - `_Py_STR` (including the empty string) - one-character latin-1 singletons Now, when you intern a singleton, that exact singleton will be interned. * Add a `_Py_LATIN1_CHR` macro, use it instead of `_Py_ID`/`_Py_STR` for one-character latin-1 singletons everywhere (including Clinic). * Intern `_Py_STR` singletons at startup. * For free-threaded builds, intern `_Py_LATIN1_CHR` singletons at startup. * Beef up the tests. Cover internal details (marked with `@cpython_only`). * Add lots of assertions Co-Authored-By: Eric Snow <ericsnowcurrently@gmail.com>	2024-06-21 17:19:31 +02:00
Brett Simmers	f8290df63f	gh-116738: Make `_codecs` module thread-safe (#117530) The module itself is a thin wrapper around calls to functions in `Python/codecs.c`, so that's where the meaningful changes happened: - Move codecs-related state that lives on `PyInterpreterState` to a struct declared in `pycore_codecs.h`. - In free-threaded builds, add a mutex to `codecs_state` to synchronize operations on `search_path`. Because `search_path_mutex` is used as a normal mutex and not a critical section, we must be extremely careful with operations called while holding it. - The codec registry is explicitly initialized as part of `_PyUnicode_InitEncodings` to simplify thread-safety.	2024-05-02 18:25:36 -04:00
Kirill Podoprigora	0785c68559	gh-111972: Make Unicode name C API capsule initialization thread-safe (#112249)	2023-11-30 11:12:49 +01:00
Serhiy Storchaka	aa438bdd6d	gh-111789: Use PyDict_GetItemRef() in Python/codecs.c (gh-112082)	2023-11-27 18:53:43 +01:00
Victor Stinner	03c4080c71	gh-108765: Python.h no longer includes <ctype.h> (#108831) Remove <ctype.h> in C files which don't use it; only sre.c and _decimal.c still use it. Remove _PY_PORT_CTYPE_UTF8_ISSUE code from pyport.h: * Code added by commit `b5047fd019` in 2004 for MacOSX and FreeBSD. * Test removed by commit `52ddaefb6b` in 2007, since Python str type now uses locale independent functions like Py_ISALPHA() and Py_TOLOWER() and the Unicode database. Modules/_sre/sre.c replaces _PY_PORT_CTYPE_UTF8_ISSUE with new functions: sre_isalnum(), sre_tolower(), sre_toupper(). Remove unused includes: * _localemodule.c: remove <stdio.h>. * getargs.c: remove <float.h>. * dynload_win.c: remove <direct.h>, it no longer calls _getcwd() since commit `fb1f68ed7c` (in 2001).	2023-09-03 18:54:27 +02:00
Victor Stinner	4dc9f48930	gh-108308: Replace _PyDict_GetItemStringWithError() (#108372) Replace _PyDict_GetItemStringWithError() calls with PyDict_GetItemStringRef() which returns a strong reference to the item. Co-authored-by: Serhiy Storchaka <storchaka@gmail.com>	2023-08-23 22:59:00 +02:00
Victor Stinner	615f6e946d	gh-106320: Remove _PyDict_GetItemStringWithError() function (#108313) Remove private _PyDict_GetItemStringWithError() function of the public C API: the new PyDict_GetItemStringRef() can be used instead. * Move private _PyDict_GetItemStringWithError() to the internal C API. * _testcapi get_code_extra_index() uses PyDict_GetItemStringRef(). Avoid using private functions in _testcapi which tests the public C API.	2023-08-22 18:17:25 +00:00
Serhiy Storchaka	be1b968dc1	gh-106521: Remove _PyObject_LookupAttr() function (GH-106642)	2023-07-12 08:57:10 +03:00
Victor Stinner	bc7eb17084	gh-106320: Use _PyInterpreterState_GET() (#106336) Replace PyInterpreterState_Get() with inlined _PyInterpreterState_GET().	2023-07-02 16:37:37 +00:00
Irit Katriel	55c99d97e1	gh-77757: replace exception wrapping by PEP-678 notes in typeobject's __set_name__ (#103402)	2023-04-11 11:53:06 +01:00
Irit Katriel	76350e85eb	gh-102406: replace exception chaining by PEP-678 notes in codecs (#102407)	2023-03-21 21:36:31 +00:00
Victor Stinner	8211cf5d28	gh-99300: Replace Py_INCREF() with Py_NewRef() (#99530) Replace Py_INCREF() and Py_XINCREF() using a cast with Py_NewRef() and Py_XNewRef().	2022-11-16 18:34:24 +01:00
Victor Stinner	d8f239d86e	gh-99300: Use Py_NewRef() in Python/ directory (#99302) Replace Py_INCREF() and Py_XINCREF() with Py_NewRef() and Py_XNewRef() in C files of the Python/ directory.	2022-11-10 09:03:39 +01:00
Eric Snow	81c72044a1	bpo-46541: Replace core use of _Py_IDENTIFIER() with statically initialized global objects. (gh-30928) We're no longer using _Py_IDENTIFIER() (or _Py_static_string()) in any core CPython code. It is still used in a number of non-builtin stdlib modules. The replacement is: PyUnicodeObject (not pointer) fields under _PyRuntimeState, statically initialized as part of _PyRuntime. A new _Py_GET_GLOBAL_IDENTIFIER() macro facilitates lookup of the fields (along with _Py_GET_GLOBAL_STRING() for non-identifier strings). https://bugs.python.org/issue46541#msg411799 explains the rationale for this change. The core of the change is in: * (new) Include/internal/pycore_global_strings.h - the declarations for the global strings, along with the macros * Include/internal/pycore_runtime_init.h - added the static initializers for the global strings * Include/internal/pycore_global_objects.h - where the struct in pycore_global_strings.h is hooked into _PyRuntimeState * Tools/scripts/generate_global_objects.py - added generation of the global string declarations and static initializers I've also added a --check flag to generate_global_objects.py (along with make check-global-objects) to check for unused global strings. That check is added to the PR CI config. The remainder of this change updates the core code to use _Py_GET_GLOBAL_IDENTIFIER() instead of _Py_IDENTIFIER() and the related _PyId functions (likewise for _Py_GET_GLOBAL_STRING() instead of _Py_static_string()). This includes adding a few functions where there wasn't already an alternative to _PyId(), replacing the _Py_Identifier * parameter with PyObject *. The following are not changed (yet): * stop using _Py_IDENTIFIER() in the stdlib modules * (maybe) get rid of _Py_IDENTIFIER(), etc. entirely -- this may not be doable as at least one package on PyPI is using this (private) API * (maybe) intern the strings during runtime init https://bugs.python.org/issue46541	2022-02-08 13:39:07 -07:00
Kumar Aditya	41026c3155	bpo-45855: Replaced deprecated `PyImport_ImportModuleNoBlock` with PyImport_ImportModule (GH-30046)	2021-12-12 10:45:20 +02:00
Victor Stinner	d943d19172	bpo-45439: Move _PyObject_CallNoArgs() to pycore_call.h (GH-28895) * Move _PyObject_CallNoArgs() to pycore_call.h (internal C API). * _ssl, _sqlite and _testcapi extensions now call the public PyObject_CallNoArgs() function, rather than _PyObject_CallNoArgs(). * _lsprof extension is now built with Py_BUILD_CORE_MODULE macro defined to get access to internal _PyObject_CallNoArgs().	2021-10-12 08:38:19 +02:00
Victor Stinner	ce3489cfdb	bpo-45439: Rename _PyObject_CallNoArg() to _PyObject_CallNoArgs() (GH-28891) Fix typo in the private _PyObject_CallNoArg() function name: rename it to _PyObject_CallNoArgs() to be consistent with the public function PyObject_CallNoArgs().	2021-10-12 00:42:23 +02:00
Victor Stinner	920cb647ba	bpo-42157: unicodedata avoids references to UCD_Type (GH-22990) * UCD_Check() uses PyModule_Check() * Simplify the internal _PyUnicode_Name_CAPI structure: * Remove size and state members * Remove state and self parameters of getcode() and getname() functions * Remove global_module_state	2020-10-26 19:19:36 +01:00
Victor Stinner	47e1afd2a1	bpo-1635741: _PyUnicode_Name_CAPI moves to internal C API (GH-22713) The private _PyUnicode_Name_CAPI structure of the PyCapsule API unicodedata.ucnhash_CAPI moves to the internal C API. Moreover, the structure gets a new state member which must be passed to the getcode() and getname() functions. * Move Include/ucnhash.h to Include/internal/pycore_ucnhash.h * unicodedata module is now built with Py_BUILD_CORE_MODULE. * unicodedata: move hashAPI variable into unicodedata_module_state.	2020-10-26 16:43:47 +01:00
Hai Shi	c9f696cb96	bpo-41919, test_codecs: Move codecs.register calls to setUp() (GH-22513) * Move the codecs' (un)register operation to testcases. * Remove _codecs._forget_codec() and _PyCodec_Forget()	2020-10-16 10:34:15 +02:00
Hai Shi	d332e7b816	bpo-41842: Add codecs.unregister() function (GH-22360) Add codecs.unregister() and PyCodec_Unregister() functions to unregister a codec search function.	2020-09-28 23:41:11 +02:00
Victor Stinner	e5014be049	bpo-40268: Remove a few pycore_pystate.h includes (GH-19510)	2020-04-14 17:52:15 +02:00
Victor Stinner	81a7be3fa2	bpo-40268: Rename _PyInterpreterState_GET_UNSAFE() (GH-19509) Rename _PyInterpreterState_GET_UNSAFE() to _PyInterpreterState_GET() for consistency with _PyThreadState_GET() and to have a shorter name (which helps it fit into 80 columns). Also add "assert(tstate != NULL);" to the function.	2020-04-14 15:14:01 +02:00
Victor Stinner	4a3fe08353	bpo-40268: Include explicitly pycore_interp.h (GH-19505) pycore_pystate.h no longer includes pycore_interp.h: it's now included explicitly in files accessing PyInterpreterState.	2020-04-14 14:26:24 +02:00
Serhiy Storchaka	cd8295ff75	bpo-39943: Add the const qualifier to pointers on non-mutable PyUnicode data. (GH-19345)	2020-04-11 10:48:40 +03:00
Victor Stinner	ff4584caca	bpo-39947: Use _PyInterpreterState_GET_UNSAFE() (GH-18978) Replace _PyInterpreterState_Get() function call with _PyInterpreterState_GET_UNSAFE() macro which is more efficient but doesn't check if tstate or interp is NULL. _Py_GetConfigsAsDict() now uses _PyThreadState_GET().	2020-03-13 18:03:56 +01:00
Andy Lester	7386a70746	closes bpo-39630: Update pointers to string literals to be const char *. (GH-18510)	2020-02-13 20:42:56 -08:00
Petr Viktorin	ffd9753a94	bpo-39245: Switch to public API for Vectorcall (GH-18460) The bulk of this patch was generated automatically with: for name in PyObject_Vectorcall Py_TPFLAGS_HAVE_VECTORCALL PyObject_VectorcallMethod PyVectorcall_Function PyObject_CallOneArg PyObject_CallMethodNoArgs PyObject_CallMethodOneArg; do echo $name; git grep -lwz _$name | xargs -0 sed -i "s/\b_$name\b/$name/g"; done; old=_PyObject_FastCallDict; new=PyObject_VectorcallDict; git grep -lwz $old | xargs -0 sed -i "s/\b$old\b/$new/g" and then cleaned up: - Revert changes to in docs & news - Revert changes to backcompat defines in headers - Nudge misaligned comments	2020-02-11 17:46:57 +01:00
Victor Stinner	a102ed7d2f	bpo-39573: Use Py_TYPE() macro in Python and Include directories (GH-18391) Replace direct access to PyObject.ob_type with Py_TYPE().	2020-02-07 02:24:48 +01:00

170 Commits