6688 Commits

Author SHA1 Message Date
Julien Danjou 64366fa9b3 bpo-41435: Add sys._current_exceptions() function (GH-21689)
This adds a new function named sys._current_exceptions() which is equivalent to
sys._current_frames() except that it returns the exceptions currently handled
by other threads. It is equivalent to calling sys.exc_info() for each running
thread.
2020-11-02 16:16:25 +02:00
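A minimal sketch of how the new function might be used next to sys._current_frames(). The worker thread, the two events, and the variable names are illustrative assumptions; the function is a private CPython API, and the shape of each value (an exc_info-style tuple or the exception object itself) may differ between releases, so the sketch accepts both.

```python
import sys
import threading

inside_handler = threading.Event()
release = threading.Event()

def worker():
    try:
        raise ValueError("demo")
    except ValueError:
        inside_handler.set()        # signal: this thread is inside an except block
        release.wait(timeout=5)     # stay in the handler until the snapshot is taken

thread = threading.Thread(target=worker)
thread.start()
inside_handler.wait(timeout=5)

# Private CPython API, so guard against interpreters that lack it.
getter = getattr(sys, "_current_exceptions", None)
snapshot = getter() if getter is not None else {}
frames = sys._current_frames()
worker_id = thread.ident

release.set()
thread.join()

entry = snapshot.get(worker_id)
# The value may be an exc_info-style tuple or the exception itself,
# depending on the release.
handled = entry[1] if isinstance(entry, tuple) else entry
print(worker_id in frames, type(handled).__name__)
```

Both mappings are keyed by thread identifier, so the two snapshots can be joined to see which frame is handling which exception.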
Victor Stinner e662c398d8 bpo-42236: Use UTF-8 encoding if nl_langinfo(CODESET) fails (GH-23086)
If the nl_langinfo(CODESET) function returns an empty string, Python
now uses UTF-8 as the filesystem encoding.

In May 2010 (commit b744ba1d14), I
modified Python to log a warning and use UTF-8 as the filesystem
encoding (instead of None) if nl_langinfo(CODESET) returns an empty
string.

In August 2020 (commit 94908bbc15), I
modified Python startup to fail with a fatal error and a specific
error message if nl_langinfo(CODESET) returns an empty string. The
intent was to prevent guessing the encoding and also investigate user
configuration where this case happens.

In 10 years (2010 to 2020), I saw zero user reports about the error
message related to nl_langinfo(CODESET) returning an empty string.

Today, UTF-8 has become the de facto standard and it's safe to make the
assumption that the user expects UTF-8. For example,
nl_langinfo(CODESET) can return an empty string on macOS if the
LC_CTYPE locale is not supported, and UTF-8 is the default encoding
on macOS.

While this change is unlikely to affect anyone in practice, it
should make UTF-8 lovers happy ;-)

Also rewrite the documentation explaining how Python selects the
filesystem encoding and error handler.
2020-11-01 23:07:23 +01:00
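The fallback described in this commit can be sketched at the Python level. The helper choose_fs_encoding() is hypothetical and only mirrors the rule "empty CODESET means UTF-8"; the real logic lives in the C startup code and covers more cases than this.

```python
import locale
import sys

def choose_fs_encoding(codeset: str) -> str:
    """Hypothetical helper mirroring the fallback rule from the commit message."""
    # An empty CODESET means the locale could not name its encoding,
    # so assume UTF-8 instead of failing.
    return codeset if codeset else "utf-8"

# nl_langinfo() is only available on Unix-like platforms.
if hasattr(locale, "nl_langinfo"):
    reported = locale.nl_langinfo(locale.CODESET)
else:
    reported = ""

print("nl_langinfo(CODESET):", repr(reported))
print("sketch would pick:   ", choose_fs_encoding(reported))
print("interpreter uses:    ", sys.getfilesystemencoding())
```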
Victor Stinner 82458b6cdb bpo-42236: Enhance _locale._get_locale_encoding() (GH-23083)
* Rename _Py_GetLocaleEncoding() to _Py_GetLocaleEncodingObject()
* Add _Py_GetLocaleEncoding() which returns a wchar_t* string to
  share code between _Py_GetLocaleEncodingObject()
  and config_get_locale_encoding().
* _Py_GetLocaleEncodingObject() now decodes nl_langinfo(CODESET)
  from the current locale encoding with surrogateescape,
  rather than using UTF-8.
2020-11-01 20:59:35 +01:00
Victor Stinner 710e826307 bpo-42208: Add _Py_GetLocaleEncoding() (GH-23050)
_io.TextIOWrapper no longer calls getpreferredencoding(False) of
_bootlocale to get the locale encoding, but calls
_Py_GetLocaleEncoding() instead.

Add config_get_fs_encoding() sub-function. Also reorganize
config_get_locale_encoding() code.
2020-10-31 01:02:09 +01:00
Victor Stinner eba5bf2f56 bpo-42208: Call GC collect earlier in PyInterpreterState_Clear() (GH-23044)
The last GC collection is now done before clearing the builtins and sys
dictionaries. Also add assertions to ensure that gc.collect() is no
longer called after _PyGC_Fini().

Also pass the tstate to PyInterpreterState_Clear() to pass the
correct tstate to _PyGC_CollectNoFail() and _PyGC_Fini().
2020-10-30 22:51:02 +01:00
Victor Stinner dff1ad5090 bpo-42208: Move _PyImport_Cleanup() to pylifecycle.c (GH-23040)
Move _PyImport_Cleanup() to pylifecycle.c, rename it to
finalize_modules(), split it (200 lines) into many smaller
sub-functions and clean up the code.
2020-10-30 18:03:28 +01:00
Victor Stinner 8b3414818f bpo-42208: Pass tstate to _PyGC_CollectNoFail() (GH-23038)
Move private _PyGC_CollectNoFail() to the internal C API.

Remove the private _PyGC_CollectIfEnabled() which was just an alias
to the public PyGC_Collect() function since Python 3.8.

Rename functions:

* collect() => gc_collect_main()
* collect_with_callback() => gc_collect_with_callback()
* collect_generations() => gc_collect_generations()
2020-10-30 17:00:00 +01:00
Neil Schemenauer 0564aafb71 bpo-42099: Fix reference to ob_type in unionobject.c and ceval (GH-22829)
* Use Py_TYPE() rather than o->ob_type.
2020-10-27 18:55:52 +00:00
Victor Stinner c9bc290dd6 bpo-42161: Use _PyLong_GetZero() and _PyLong_GetOne() (GH-22995)
Use _PyLong_GetZero() and _PyLong_GetOne()
in Objects/ and Python/ directories.
2020-10-27 02:24:34 +01:00
Victor Stinner 920cb647ba bpo-42157: unicodedata avoids references to UCD_Type (GH-22990)
* UCD_Check() uses PyModule_Check()
* Simplify the internal _PyUnicode_Name_CAPI structure:

  * Remove size and state members
  * Remove state and self parameters of getcode() and getname()
    functions

* Remove global_module_state
2020-10-26 19:19:36 +01:00
Victor Stinner 47e1afd2a1 bpo-1635741: _PyUnicode_Name_CAPI moves to internal C API (GH-22713)
The private _PyUnicode_Name_CAPI structure of the PyCapsule API
unicodedata.ucnhash_CAPI moves to the internal C API. Moreover, the
structure gets a new state member which must be passed to the
getcode() and getname() functions.

* Move Include/ucnhash.h to Include/internal/pycore_ucnhash.h
* unicodedata module is now built with Py_BUILD_CORE_MODULE.
* unicodedata: move hashAPI variable into unicodedata_module_state.
2020-10-26 16:43:47 +01:00
Serhiy Storchaka b510e101f8 bpo-42152: Use PyDict_Contains and PyDict_SetDefault if appropriate. (GH-22986)
If PyDict_GetItemWithError is only used to check whether the key is in dict,
it is better to use PyDict_Contains instead.

And if it is used in combination with PyDict_SetItem, PyDict_SetDefault can
replace the combination.
2020-10-26 12:47:57 +02:00
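The same two patterns exist at the Python level, which may make the intent of the C change easier to see. This is an analogy under the assumption that the `in` operator and dict.setdefault() stand in for PyDict_Contains and PyDict_SetDefault; the function names are made up for illustration.

```python
def has_key_by_fetching(table, key):
    # Fetches the value only to throw it away; also wrong if the value is None.
    return table.get(key) is not None

def has_key(table, key):
    # Asks the membership question directly.
    return key in table

def get_or_insert_two_steps(table, key, default):
    # Lookup followed by a separate store.
    if key not in table:
        table[key] = default
    return table[key]

def get_or_insert(table, key, default):
    # One call performs the lookup and the conditional store.
    return table.setdefault(key, default)

table = {"a": None}
print(has_key_by_fetching(table, "a"), has_key(table, "a"))
print(get_or_insert(table, "b", 2), get_or_insert(table, "b", 99))
```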
Serhiy Storchaka fb5db7ec58 bpo-42006: Stop using PyDict_GetItem, PyDict_GetItemString and _PyDict_GetItemId. (GH-22648)
These functions are considered unsafe because they suppress all internal errors
and can return a wrong result. PyDict_GetItemString and _PyDict_GetItemId can
also silence the current exception in rare cases.

Remove no longer used _PyDict_GetItemId.
Add _PyDict_ContainsId and rename _PyDict_Contains into
_PyDict_Contains_KnownHash.
2020-10-26 08:43:39 +02:00
TIGirardi f2312037e3 bpo-38324: Fix test__locale.py Windows failures (GH-20529)
Use wide-char _W_* fields of lconv structure on Windows
Remove "ps_AF" from test__locale.known_numerics on Windows
2020-10-20 12:39:52 +01:00
Pablo Galindo 109826c850 bpo-42093: Add opcode cache for LOAD_ATTR (GH-22803) 2020-10-20 06:22:44 +01:00
Kevin Adler 1dd6d956a3 closes bpo-42030: Remove legacy AIX dynload support (GH-22717)
Since c19c5a6, AIX builds have defaulted to using dynload_shlib over
dynload_aix when dlopen is available. This function has been available
since AIX 4.3, which went out of support in 2003, the same year the
previously referenced commit was made. It has been nearly 20 years
since a version of AIX has been supported which has not used
dynload_shlib so there's no reason to keep this legacy code around.
2020-10-16 13:03:28 -05:00
Hai Shi c9f696cb96 bpo-41919, test_codecs: Move codecs.register calls to setUp() (GH-22513)
* Move the codecs' (un)register operation to testcases.
* Remove _codecs._forget_codec() and _PyCodec_Forget()
2020-10-16 10:34:15 +02:00
Kevin Adler 2d2af320d9 bpo-41894: Fix UnicodeDecodeError while loading native module (GH-22466)
When running in a non-UTF-8 locale, if an error occurs while importing a
native Python module (say because a dependent shared library is missing),
the error message string returned may contain non-ASCII code points
causing a UnicodeDecodeError.

PyUnicode_DecodeFSDefault is used for buffers which may contain
filesystem paths. For consistency with os.strerror(),
PyUnicode_DecodeLocale is used for buffers which contain system error
messages. While the shortname parameter is always encoded in ASCII
according to PEP 489, it is left decoded using PyUnicode_FromString to
minimize the changes and since it should not affect the decoding (albeit
_potentially_ slower).

In dynload_hpux, since the error buffer contains a message generated
from a static ASCII string and the module filesystem path,
PyUnicode_DecodeFSDefault is used instead of PyUnicode_DecodeLocale as
is used elsewhere.

* bpo-41894: Fix bugs in dynload error msg handling

For both dynload_aix and dynload_hpux, properly handle the possibility
that decoding strings may return NULL and when such an error happens,
properly decrement any previously decoded strings and return early.

In addition, in dynload_aix, ensure that we pass the decoded string
*object* pathname_ob to PyErr_SetImportError instead of the original
pathname buffer.

Co-authored-by: Serhiy Storchaka <storchaka@gmail.com>
2020-10-15 10:53:27 +09:00
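The failure mode is easy to reproduce in pure Python. In this sketch the message text is invented and Latin-1 stands in for "the current non-UTF-8 locale encoding"; it only illustrates why decoding a system error message with the wrong codec fails, not the C code paths touched by the commit.

```python
# Bytes as a Latin-1 locale might produce them for an error message.
message = "bibliothèque introuvable".encode("latin-1")

try:
    message.decode("utf-8")
    failure = None
except UnicodeDecodeError as exc:
    failure = exc               # the non-ASCII byte is not valid UTF-8

# Decoding with the encoding that produced the bytes works.
decoded = message.decode("latin-1")

# surrogateescape keeps undecodable bytes recoverable instead of raising.
escaped = message.decode("utf-8", "surrogateescape")
round_trip = escaped.encode("utf-8", "surrogateescape")

print(type(failure).__name__, decoded, round_trip == message)
```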
Kevin Adler 0cafcd3c56 closes bpo-42029: Remove dynload_dl (GH-22687)
All references to this dynamic loading method were removed in b9949db,
when support for this method was dropped, but the implementation code
was not dropped (seemingly an oversight).
2020-10-13 20:49:24 -05:00
Kyle Evans 7992579cd2 bpo-40422: Move _Py_closerange to fileutils.c (GH-22680)
This API is relatively lightweight and organizationally, given that it's
used by multiple modules, it makes sense to move it to fileutils.

Requires making sure that _posixsubprocess is compiled with the appropriate
Py_BUILD_CORE_BUILTIN macro.
2020-10-13 22:04:44 +02:00
Serhiy Storchaka 8287aadb75 bpo-41993: Fix possible issues in remove_module() (GH-22631)
* PyMapping_HasKey() is not safe because it silences all exceptions and can return an incorrect result.
* Informative exceptions from PyMapping_DelItem() are overridden with RuntimeError and
  the original exception raised before calling remove_module() is lost.
* There is a race condition between PyMapping_HasKey() and PyMapping_DelItem().
2020-10-11 16:51:07 +03:00
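The check-then-delete race in the last bullet has a direct Python analogue. The sketch below shows one common way to avoid it (delete first, tolerate only the "missing key" error); it is not taken from the actual patch, and both function names are hypothetical.

```python
def remove_entry_racy(mapping, name):
    # Check-then-act: another thread may delete `name` between the two steps,
    # and the membership test may hide unrelated errors.
    if name in mapping:
        del mapping[name]

def remove_entry(mapping, name):
    # Single operation; only the "key is already gone" case is ignored,
    # any other exception propagates unchanged.
    try:
        del mapping[name]
    except KeyError:
        pass

modules = {"spam": object()}
remove_entry(modules, "spam")
remove_entry(modules, "spam")   # second call is a harmless no-op
print(modules)
```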
Serhiy Storchaka fa1d83db62 bpo-42002: Clean up initialization of the sys module. (GH-22642)
Makes the code clearer and error handling more correct.
2020-10-11 15:30:43 +03:00
Batuhan Taskaya 22220ae216 bpo-38605: bump the magic number for 'annotations' future (#22630) 2020-10-10 15:19:46 -07:00
Serhiy Storchaka 98c4433a81 bpo-41991: Remove _PyObject_HasAttrId (GH-22629)
It can silence arbitrary exceptions.
2020-10-10 22:23:42 +03:00
Batuhan Taskaya 02a1603f91 bpo-42000: Cleanup the AST related C-code (GH-22641)
- Use the proper asdl sequence when creating empty arguments
- Remove redundant casts (thanks to new typed asdl_sequences)
- Remove MarshalPrototypeVisitor and some utilities from asdl generator
- Fix the header of `Python/ast.c` (kept from pgen times)

Automerge-Triggered-By: @pablogsal
2020-10-10 10:14:59 -07:00
Vladimir Matveev 037245c5ac bpo-41756: Add PyIter_Send function (#22443) 2020-10-09 17:15:15 -07:00
Batuhan Taskaya 044a1048ca bpo-38605: Make 'from __future__ import annotations' the default (GH-20434)
The hard part was making all the tests pass; there are some subtle issues here, because apparently the future import wasn't tested very thoroughly in previous Python versions.

For example, `inspect.signature()` returned type objects normally (except for forward references), but strings with the future import. We changed it to try and return type objects by calling `typing.get_type_hints()`, but fall back on returning strings if that function fails (which it may do if there are future references in the annotations that require passing in a specific namespace to resolve).
2020-10-06 13:03:02 -07:00
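The difference between string annotations and resolved type objects can be seen with an explicit future import. The snippet compiles a small source string so that the import applies only to that code; the function scale() and the variable names are illustrative assumptions.

```python
import inspect
import typing

SOURCE = '''
from __future__ import annotations

def scale(value: int, factor: float = 2.0) -> float:
    return value * factor
'''

namespace = {}
exec(compile(SOURCE, "<annotations-demo>", "exec"), namespace)
scale = namespace["scale"]

# With the future import, annotations are stored as plain strings.
raw = inspect.signature(scale).parameters["value"].annotation

# typing.get_type_hints() evaluates those strings back into type objects.
resolved = typing.get_type_hints(scale)

print(repr(raw), resolved)
```

If an annotation refers to a name that is not visible from the function's globals, get_type_hints() raises, which is the situation in which falling back to the string form is useful.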
Serhiy Storchaka dcc54215ac bpo-41936: Remove macros Py_ALLOW_RECURSION/Py_END_ALLOW_RECURSION (GH-22552) 2020-10-05 12:32:00 +03:00
Victor Stinner bd0a08ea90 bpo-21955: Change my nickname in BINARY_ADD comment (GH-22481) 2020-10-01 18:57:37 +02:00
Mark Shannon 17b5be0c0a bpo-41670: Remove outdated predict macro invocation. (GH-22026)
Remove PREDICTion of POP_BLOCK from FOR_ITER.
2020-09-29 10:09:13 +01:00
Hai Shi d332e7b816 bpo-41842: Add codecs.unregister() function (GH-22360)
Add codecs.unregister() and PyCodec_Unregister() functions
to unregister a codec search function.
2020-09-28 23:41:11 +02:00
Mark Shannon 02d126aa09 bpo-39934: Account for control blocks in 'except' in compiler. (GH-22395)
* Account for control blocks in 'except' in compiler. Fixes #39934.
2020-09-25 14:04:19 +01:00
Victor Stinner b7d8d8dbfe bpo-40941: Fix stackdepth compiler warnings (GH-22377)
Explicitly cast a difference of two pointers to int:
PyFrameObject.f_stackdepth is an int.
2020-09-23 14:07:16 +02:00
Victor Stinner 71f2ff4ccf bpo-40941: Fix fold_tuple_on_constants() compiler warnings (GH-22378)
Add explicit casts to fix compiler warnings in
fold_tuple_on_constants().

The limit of constants per code is now INT_MAX, rather than UINT_MAX.
2020-09-23 14:06:55 +02:00
Victor Stinner 19c3ac92bf bpo-41834: Remove _Py_CheckRecursionLimit variable (GH-22359)
Remove the global _Py_CheckRecursionLimit variable: it has been
replaced by ceval.recursion_limit of the PyInterpreterState
structure.

There is no need to keep the variable for the stable ABI, since
Py_EnterRecursiveCall() and Py_LeaveRecursiveCall() were not usable
in Python 3.8 and older: these macros accessed PyThreadState members,
whereas the PyThreadState structure is opaque in the limited C API.
2020-09-23 14:04:57 +02:00
Samuel Marks c322948892 bpo-41819: Fix compiler warning in init_dump_ascii_wstr() (GH-22332)
Fix the compiler warning:

format specifies type `wint_t` (aka `int`) but the argument has type `unsigned int`
2020-09-21 10:35:17 +02:00
Vladimir Matveev 2b05361bf7 bpo-41756: Introduce PyGen_Send C API (GH-22196)
The new API allows efficiently sending values into native generators
and coroutines, avoiding the use of StopIteration exceptions to signal
returns.

ceval loop now uses this method instead of the old "private"
_PyGen_Send C API. This translates to 1.6x increased performance
of 'await' calls in micro-benchmarks.

Aside from CPython core improvements, this new API will also allow 
Cython to generate more efficient code, benefiting high-performance
IO libraries like uvloop.
2020-09-18 18:38:38 -07:00
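For context, this is what the Python-level protocol looks like: a generator's return value only surfaces as a StopIteration exception, which the caller has to catch. The sketch is pure Python and does not use the C API; averager() and drive() are illustrative names.

```python
def averager():
    total = 0
    count = 0
    while True:
        value = yield
        if value is None:
            # A return inside a generator is delivered as StopIteration(value).
            return total / count
        total += value
        count += 1

def drive(gen, values):
    next(gen)                    # advance to the first yield
    for value in values:
        gen.send(value)
    try:
        gen.send(None)           # ask the generator to finish
    except StopIteration as stop:
        return stop.value        # the return value travels on the exception
    raise RuntimeError("generator did not finish")

print(drive(averager(), [2, 4, 6]))
```

Creating and catching that exception for every completed `await` is the overhead the commit message says the new API avoids.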
Pablo Galindo a5634c4067 bpo-41746: Add type information to asdl_seq objects (GH-22223)
* Add new capability to the PEG parser to type variable assignments. For instance:
```
       | a[asdl_stmt_seq*]=';'.small_stmt+ [';'] NEWLINE { a }
```

* Add new sequence types from the asdl definition (automatically generated)
* Make `asdl_seq` type a generic aliasing pointer type.
* Create a new `asdl_generic_seq` for the generic case using `void*`.
* The old `asdl_seq_GET`/`asdl_seq_SET` macros are now typed.
* New `asdl_seq_GET_UNTYPED`/`asdl_seq_SET_UNTYPED` macros for dealing with generic sequences.
* Change all possible `asdl_seq` types to use specific versions everywhere.
2020-09-16 19:42:00 +01:00
Victor Stinner e5fbe0cbd4 bpo-41631: _ast module uses again a global state (#21961)
Partially revert commit ac46eb4ad6:
"bpo-38113: Update the Python-ast.c generator to PEP384 (gh-15957)".

Using a module state per module instance is causing subtle practical
problems.

For example, the Mercurial project replaces the __import__() function
to implement lazy import, whereas Python expected that "import _ast"
always returns a fully initialized _ast module.

Add _PyAST_Fini() to clear the state at exit.

The _ast module has no state (set _astmodule.m_size to 0). Remove
astmodule_traverse(), astmodule_clear() and astmodule_free()
functions.
2020-09-15 18:03:34 +02:00
Victor Stinner 640e8e1d5f Fix compiler warnings in init_dump_ascii_wstr() (GH-22150)
Fix GCC 9.3 (using -O3) warnings on x86:

initconfig.c: In function ‘init_dump_ascii_wstr’:
initconfig.c:2679:34: warning: format ‘%lc’ expects argument of type
‘wint_t’, but argument 2 has type ‘wchar_t’ {aka ‘long int’}
 2679 |             PySys_WriteStderr("%lc", ch);
initconfig.c:2682:38: warning: format ‘%x’ expects argument of type
‘unsigned int’, but argument 2 has type ‘wchar_t’ {aka ‘long int’}
 2682 |             PySys_WriteStderr("\\x%02x", ch);
initconfig.c:2686:38: warning: format ‘%x’ expects argument of type
‘unsigned int’, but argument 2 has type ‘wchar_t’ {aka ‘long int’}
 2686 |             PySys_WriteStderr("\\U%08x", ch);
initconfig.c:2690:38: warning: format ‘%x’ expects argument of type
‘unsigned int’, but argument 2 has type ‘wchar_t’ {aka ‘long int’}
 2690 |             PySys_WriteStderr("\\u%04x", ch);
2020-09-09 12:07:17 +02:00
Serhiy Storchaka 58de1dd6a8 bpo-41525: Make the Python program help ASCII-only (GH-21836) 2020-09-09 01:28:02 +01:00
Victor Stinner f315142ddc bpo-1635741: Port marshal module to multi-phase init (#22149)
Port the 'marshal' extension module to the multi-phase initialization
API (PEP 489).
2020-09-08 15:33:52 +02:00
han-solo 0d6aa7f0ee bpo-41681: Fix for f-string/str.format error description when using 2 , in format specifier (GH-22036)
* Fixed `f-string/str.format` error description when using two `,` in format specifier.

Co-authored-by: millefalcon <hanish0019@hmail.com>
2020-09-01 10:34:29 -04:00
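The situation this commit addresses can be triggered with a doubled thousands separator. The helper try_format() is an illustrative assumption; the exact wording of the error message depends on the interpreter version, so the sketch only reports it.

```python
def try_format(value, spec):
    """Return (formatted, None) on success or (None, error message) on failure."""
    try:
        return format(value, spec), None
    except ValueError as exc:
        return None, str(exc)

ok, _ = try_format(1234567, ",")
_, error = try_format(1234567, ",,")   # two ',' in the format specifier

print(ok)
print(error)
```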
Tony Solomonik 75c80b0bda closes bpo-41533: Fix a potential memory leak when allocating a stack (GH-21847)
Free the stack allocated in va_build_stack if do_mkstack fails
and the stack is not a small_stack
2020-08-29 23:53:08 -05:00
wmeehan 97eaf2b5e5 bpo-41524: fix pointer bug in PyOS_mystr{n}icmp (GH-21845)
* bpo-41524: fix pointer bug in PyOS_mystr{n}icmp

The existing implementations of PyOS_mystrnicmp and PyOS_mystricmp
can increment pointers beyond the end of a string.

This commit fixes those cases by moving the mutation out of the condition.

* 📜🤖 Added by blurb_it.

* Address comments

Co-authored-by: blurb-it[bot] <43283697+blurb-it[bot]@users.noreply.github.com>
2020-08-27 14:45:25 +09:00
Hai Shi 8aa163eea6 bpo-1635741: Explicit GC collect after PyInterpreterState_Clear() (GH-21902)
Fix a reference cycle by triggering an explicit GC collection
after calling PyInterpreterState_Clear().
2020-08-17 22:36:19 +02:00
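Why an explicit collection is needed for a reference cycle can be shown with a small Python sketch. The Node class and the weak reference probe are illustrative; automatic collection is switched off temporarily so the result does not depend on when the collector happens to run.

```python
import gc
import weakref

class Node:
    pass

was_enabled = gc.isenabled()
gc.disable()                     # make the outcome deterministic

a, b = Node(), Node()
a.peer, b.peer = b, a            # two objects that keep each other alive
probe = weakref.ref(a)
del a, b                         # reference counts never reach zero

alive_before = probe() is not None
gc.collect()                     # the cycle collector finds and frees the pair
alive_after = probe() is not None

if was_enabled:
    gc.enable()

print(alive_before, alive_after)
```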
Pablo Galindo c51db0ea40 bpo-41531: Fix compilation of dict literals with more than 0xFFFF elements (GH-21850) 2020-08-13 09:48:41 +01:00
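A quick way to exercise this case is to generate a dict literal with one element more than 0xFFFF and compile it. The sketch assumes an interpreter that includes the fix; the variable names are illustrative.

```python
N = 0xFFFF + 1   # one element more than 0xFFFF

# Build the text of a literal such as {0: 0, 1: 1, ...} and compile it,
# so the compiler (not a loop) has to construct the dictionary.
source = "{" + ", ".join(f"{i}: {i}" for i in range(N)) + "}"
big = eval(compile(source, "<big-dict>", "eval"))

print(len(big))
```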
Hai Shi 8ecc0c4d39 bpo-1635741: Clean sysdict and builtins of interpreter at exit (GH-21605) 2020-08-12 23:23:30 +02:00
Mark Shannon 582aaf19e8 bpo-41463: Generate information about jumps from 'opcode.py' rather than duplicating it in 'compile.c' (GH-21714)
2020-08-04 17:30:11 +01:00
Mark Shannon 6e8128f02e bpo-41323: Perform 'peephole' optimizations directly on the CFG. (GH-21517)
* Move 'peephole' optimizations into compile.c and perform them directly on the CFG.
2020-07-30 10:03:00 +01:00