Commit Graph

569 Commits

Author SHA1 Message Date
Serhiy Storchaka 39dea212f4 [3.12] gh-121905: Consistently use "floating-point" instead of "floating point" (GH-121907) (GH-122013)
(cherry picked from commit 1a0c7b9ba4)
2024-07-19 09:08:33 +00:00
Serhiy Storchaka 874eed6cfe [3.12] gh-121153: Fix some errors with use of _PyLong_CompactValue() (GH-121154)
* The result has type Py_ssize_t, not intptr_t.
* Type cast between unsigned and signed integer types should be explicit.
* Downcasting should be explicit.
* Fix integer overflow check in sum().
(cherry picked from commit 1801545)
2024-07-17 07:58:25 +00:00
Miss Islington (bot) 4ea9d15761 [3.12] gh-110819: Fix ‘kind’ may be used uninitialized warning in longobject (GH-116599) (#116648)
gh-110819: Fix ‘kind’ may be used uninitialized warning in `longobject` (GH-116599)
(cherry picked from commit eb947cdc13)

Co-authored-by: Nikita Sobolev <mail@sobolevn.me>
2024-03-12 11:09:15 +00:00
Miss Islington (bot) 8f080a290b [3.12] gh-102509: Start initializing ob_digit of _PyLongValue (GH-102510) (#107464)
gh-102509: Start initializing `ob_digit` of `_PyLongValue` (GH-102510)
(cherry picked from commit fc130c47da)

Co-authored-by: Illia Volochii <illia.volochii@gmail.com>
2023-07-31 14:27:59 +02:00
Mark Shannon 93923793f6 GH-101291: Add low level, unstable API for pylong (GH-101685)
Co-authored-by: Petr Viktorin <encukou@gmail.com>
2023-05-21 14:45:48 +01:00
Eric Snow fdd878650d gh-94673: Properly Initialize and Finalize Static Builtin Types for Each Interpreter (gh-104072)
Until now, we haven't been initializing nor finalizing the per-interpreter state properly.
2023-05-01 19:36:00 -06:00
Eric Snow 59bc36aacd gh-84436: Immortalize in _PyStructSequence_InitBuiltinWithFlags() (gh-104054)
This also does some cleanup.
2023-05-01 15:08:34 -06:00
Eric Snow d2e2e53f73 gh-94673: Ensure Builtin Static Types are Readied Properly (gh-103940)
There were cases where we do unnecessary work for builtin static types. This also simplifies some work necessary for a per-interpreter GIL.
2023-04-27 16:19:43 -06:00
Eddie Elizondo ea2c001650 gh-84436: Implement Immortal Objects (gh-19474)
This is the implementation of PEP683

Motivation:

The PR introduces the ability to immortalize instances in CPython which bypasses reference counting. Tagging objects as immortal allows up to skip certain operations when we know that the object will be around for the entire execution of the runtime.

Note that this by itself will bring a performance regression to the runtime due to the extra reference count checks. However, this brings the ability of having truly immutable objects that are useful in other contexts such as immutable data sharing between sub-interpreters.
2023-04-22 13:39:37 -06:00
Mark Shannon 7559f5fda9 GH-101291: Rearrange the size bits in PyLongObject (GH-102464)
* Eliminate all remaining uses of Py_SIZE and Py_SET_SIZE on PyLongObject, adding asserts.

* Change layout of size/sign bits in longobject to support future addition of immortal ints and tagged medium ints.

* Add functions to hide some internals of long object, and for setting sign and digit count.

* Replace uses of IS_MEDIUM_VALUE macro with _PyLong_IsCompact().
2023-03-22 14:49:51 +00:00
Sergey B Kirpichev 4624987b29 gh-101825: Clarify that as_integer_ratio() output is always normalized (#101843)
Make docstrings for `as_integer_ratio` consistent across types, and document that
the returned pair is always normalized (coprime integers, with positive denominator).

---------

Co-authored-by: Owain Davies <116417456+OTheDev@users.noreply.github.com>
Co-authored-by: Mark Dickinson <dickinsm@gmail.com>
2023-02-27 19:11:28 +00:00
Mark Dickinson 39017e04b5 gh-101266: Fix __sizeof__ for subclasses of int (#101394)
Fix the behaviour of the `__sizeof__` method (and hence the results returned by `sys.getsizeof`) for subclasses of `int`. Previously, `int` subclasses gave identical results to the `int` base class, ignoring the presence of the instance dictionary.

<!-- gh-issue-number: gh-101266 -->
* Issue: gh-101266
<!-- /gh-issue-number -->
2023-02-05 10:02:53 +00:00
Mark Shannon c1b1f51cd1 GH-101291: Refactor the PyLongObject struct into object header and PyLongValue struct. (GH-101292) 2023-01-30 10:03:04 +00:00
Mark Dickinson 401fdf9c85 gh-101037: Fix potential memory underallocation for zeros of int subtypes (#101038)
This PR fixes object allocation in long_subtype_new to ensure that there's at least one digit in all cases, and makes sure that the value of that digit is copied over from the source long.

Needs backport to 3.11, but not any further: the change to require at least one digit was only introduced for Python 3.11.

Fixes #101037.
2023-01-21 10:23:59 +00:00
Ionite d7e7f79ca7 gh-100637: Fix int and bool __sizeof__ calculation to include the 1 element ob_digit array for 0 and False (#100663)
Fixes behaviour where int (and subtypes like bool) __sizeof__ under-reports true size as it did not take into account the size 1 `ob_digit` array for the zero int.

Co-authored-by: Mark Dickinson <dickinsm@gmail.com>
2023-01-02 21:11:49 +00:00
Shantanu 3e46f9fe05 gh-100268: Add is_integer method to int (#100439)
This improves the lives of type annotation users of `float` - which type checkers implicitly treat as `int|float` because that is what most code actually wants. Before this change a `.is_integer()` method could not be assumed to exist on things annotated as `: float` due to the method not existing on both types.
2022-12-23 18:30:27 -08:00
Ned Batchelder 935ef59321 clarify the 4300-digit limit on int-str conversion (#100175) 2022-12-12 13:39:54 +02:00
Victor Stinner 5556d3e02c gh-98724: Fix warnings on Py_SETREF() usage (#99781)
Cast argument to the expected type.
2022-11-26 00:30:37 +01:00
Victor Stinner 20d9749a0f gh-99537: Use Py_SETREF() function in longobject C code (#99655)
Replace "Py_DECREF(var); var = new;" with "Py_SETREF(var, new);"
in longobject.c and _testcapi/long.c.
2022-11-22 13:04:19 +01:00
Victor Stinner 1960eb005e gh-99300: Use Py_NewRef() in Objects/ directory (#99351)
Replace Py_INCREF() and Py_XINCREF() with Py_NewRef() and
Py_XNewRef() in C files of the Objects/ directory.
2022-11-10 23:40:31 +01:00
jonasdlindner ede6cb2615 Correct some typos in comments (GH-98194)
Automerge-Triggered-By: GH:AlexWaygood
2022-11-06 08:54:44 -08:00
Victor Stinner 387f72588d gh-90716: Fix pylong_int_from_string() refleak (#99094)
Fix validated by:

    $ ./python -m test -R 3:3 test_int
    Tests result: SUCCESS
2022-11-04 14:24:10 +01:00
Gregory P. Smith 4c4b5ce2e5 gh-90716: bugfixes and more tests for _pylong. (#99073)
* Properly decref on _pylong import error.
* Improve the error message on _pylong TypeError.
* Fix the assertion error in pydebug builds to be a TypeError.
* Tie the return value comments together.

These are minor followups to issues not caught among the reviewers on
https://github.com/python/cpython/pull/96673.
2022-11-03 16:18:38 -07:00
Neil Schemenauer de6981680b gh-90716: add _pylong.py module (#96673)
Add Python implementations of certain longobject.c functions. These use
asymptotically faster algorithms that can be used for operations on
integers with many digits. In those cases, the performance overhead of
the Python implementation is not significant since the asymptotic
behavior is what dominates runtime. Functions provided by this module
should be considered private and not part of any public API.

Co-author: Tim Peters <tim.peters@gmail.com>
Co-author: Mark Dickinson <dickinsm@gmail.com>
Co-author: Bjorn Martinsson
2022-10-25 22:00:50 -07:00
Eric Snow 9c8dde0fa5 gh-98417: Store int_max_str_digits on the Interpreter State (GH-98418) 2022-10-19 13:27:46 -07:00
Michael 07b8e85d0e gh-96526: Clarify format and __format__ docstrings (gh-96648) 2022-10-03 15:28:02 -07:00
Gregory P. Smith b0f89cb431 gh-96512: Move int_max_str_digits setting to PyConfig (#96944)
It had to live as a global outside of PyConfig for stable ABI reasons in
the pre-3.12 backports.

This removes the `_Py_global_config_int_max_str_digits` and gets rid of
the equivalent field in the internal `struct _is PyInterpreterState` as
code can just use the existing nested config struct within that.

Adds tests to verify unique settings and configs in subinterpreters.
2022-10-03 13:55:45 -07:00
Oscar Benjamin 817fa28f81 gh-90716: Refactor PyLong_FromString to separate concerns (GH-96808)
This is a preliminary PR to refactor `PyLong_FromString` which is currently quite messy and has spaghetti like code that mixes up different concerns as well as duplicating logic.

In particular:

- `PyLong_FromString` now only handles sign, base and prefix detection and calls a new function `long_from_string_base` to parse the main body of the string.
- The `long_from_string_base` function handles all string validation and then calls `long_from_binary_base` or a new function `long_from_non_binary_base` to construct the actual `PyLong`.
- The existing `long_from_binary_base` function is simplified by factoring duplicated logic to `long_from_string_base`.
- The new function `long_from_non_binary_base` factors out much of the code from `PyLong_FromString` including in particular the quadratic algorithm reffered to in gh-95778 so that this can be seen separately from unrelated concerns such as string validation.
2022-09-25 10:09:50 +01:00
Victor Stinner e841ffc915 gh-95778: Mention sys.set_int_max_str_digits() in error message (#96874)
When ValueError is raised if an integer is larger than the limit,
mention sys.set_int_max_str_digits() in the error message.
2022-09-16 20:04:37 +02:00
Mark Dickinson b126196838 gh-95778: Correctly pre-check for int-to-str conversion (#96537)
Converting a large enough `int` to a decimal string raises `ValueError` as expected. However, the raise comes _after_ the quadratic-time base-conversion algorithm has run to completion. For effective DOS prevention, we need some kind of check before entering the quadratic-time loop. Oops! =)

The quick fix: essentially we catch _most_ values that exceed the threshold up front. Those that slip through will still be on the small side (read: sufficiently fast), and will get caught by the existing check so that the limit remains exact.

The justification for the current check. The C code check is:
```c
max_str_digits / (3 * PyLong_SHIFT) <= (size_a - 11) / 10
```

In GitHub markdown math-speak, writing $M$ for `max_str_digits`, $L$ for `PyLong_SHIFT` and $s$ for `size_a`, that check is:
$$\left\lfloor\frac{M}{3L}\right\rfloor \le \left\lfloor\frac{s - 11}{10}\right\rfloor$$

From this it follows that
$$\frac{M}{3L} < \frac{s-1}{10}$$
hence that
$$\frac{L(s-1)}{M} > \frac{10}{3} > \log_2(10).$$
So
$$2^{L(s-1)} > 10^M.$$
But our input integer $a$ satisfies $|a| \ge 2^{L(s-1)}$, so $|a|$ is larger than $10^M$. This shows that we don't accidentally capture anything _below_ the intended limit in the check.

<!-- gh-issue-number: gh-95778 -->
* Issue: gh-95778
<!-- /gh-issue-number -->

Co-authored-by: Gregory P. Smith [Google LLC] <greg@krypto.org>
2022-09-04 09:21:18 -07:00
Gregory P. Smith 511ca94520 gh-95778: CVE-2020-10735: Prevent DoS by very large int() (#96499)
Integer to and from text conversions via CPython's bignum `int` type is not safe against denial of service attacks due to malicious input. Very large input strings with hundred thousands of digits can consume several CPU seconds.

This PR comes fresh from a pile of work done in our private PSRT security response team repo.

Signed-off-by: Christian Heimes [Red Hat] <christian@python.org>
Tons-of-polishing-up-by: Gregory P. Smith [Google] <greg@krypto.org>
Reviews via the private PSRT repo via many others (see the NEWS entry in the PR).

<!-- gh-issue-number: gh-95778 -->
* Issue: gh-95778
<!-- /gh-issue-number -->

I wrote up [a one pager for the release managers](https://docs.google.com/document/d/1KjuF_aXlzPUxTK4BMgezGJ2Pn7uevfX7g0_mvgHlL7Y/edit#). Much of that text wound up in the Issue. Backports PRs already exist. See the issue for links.
2022-09-02 09:35:08 -07:00
Eric Snow 4a1dd73431 gh-94673: Add _PyStaticType_InitBuiltin() (#95152)
This is the first of several precursors to storing tp_subclasses (and tp_weaklist) on the interpreter state for static builtin types.

We do the following:

* add `_PyStaticType_InitBuiltin()`
* add `_Py_TPFLAGS_STATIC_BUILTIN`
* set it on all static builtin types in `_PyStaticType_InitBuiltin()`
* shuffle some code around to be able to use _PyStaticType_InitBuiltin()
    * rename `_PyStructSequence_InitType()` to `_PyStructSequence_InitBuiltinWithFlags()`
    * add `_PyStructSequence_InitBuiltin()`.
2022-07-25 12:47:31 -06:00
Dennis Sweeney 5fcfdd87c9 GH-91432: Specialize FOR_ITER (GH-91713)
* Adds FOR_ITER_LIST and FOR_ITER_RANGE specializations.

* Adds _PyLong_AssignValue() internal function to avoid temporary boxing of ints.
2022-06-21 11:19:26 +01:00
oda-gitso 854db1a606 Remove unnecessary for loop initializer in long_lshift1() (GH-93071)
* Remove unnecessary for loop initialization.
2022-05-25 18:57:33 +01:00
Victor Stinner f62ad4f2c4 gh-89653: Use int type for Unicode kind (#92704)
Use the same type that PyUnicode_FromKindAndData() kind parameter
type (public C API): int.
2022-05-13 12:41:05 +02:00
Mark Dickinson 0ed91a26fe gh-90213: Speed up right shifts of negative integers (GH-30277) 2022-05-02 11:19:03 -06:00
Victor Stinner 7cdaf87ec5 gh-91731: Replace Py_BUILD_ASSERT() with static_assert() (#91730)
Python 3.11 now uses C11 standard which adds static_assert()
to <assert.h>.

* In pytime.c, replace Py_BUILD_ASSERT() with preprocessor checks on
  SIZEOF_TIME_T with #error.
* On macOS, py_mach_timebase_info() now accepts timebase members with
  the same size than _PyTime_t.
* py_get_monotonic_clock() now saturates GetTickCount64() to
  _PyTime_MAX: GetTickCount64() is unsigned, whereas _PyTime_t is
  signed.
2022-04-20 19:26:40 +02:00
Dennis Sweeney d7d7e6c007 Cast to (destructor) to fix compiler warnings (GH-91711) 2022-04-20 16:15:45 +01:00
Dennis Sweeney da6c78584b gh-90667: Add specializations of Py_DECREF when types are known (GH-30872) 2022-04-19 19:02:19 +01:00
Dennis Sweeney 8be8949116 gh-91117: Ensure integer mod and pow operations use cached small ints (GH-31843) 2022-04-11 16:07:09 -04:00
Mark Dickinson c60e6b6ad7 bpo-46311: Clean up PyLong_FromLong and PyLong_FromLongLong (GH-30496) 2022-03-01 14:20:52 +00:00
Eric Snow 81c72044a1 bpo-46541: Replace core use of _Py_IDENTIFIER() with statically initialized global objects. (gh-30928)
We're no longer using _Py_IDENTIFIER() (or _Py_static_string()) in any core CPython code.  It is still used in a number of non-builtin stdlib modules.

The replacement is: PyUnicodeObject (not pointer) fields under _PyRuntimeState, statically initialized as part of _PyRuntime.  A new _Py_GET_GLOBAL_IDENTIFIER() macro facilitates lookup of the fields (along with _Py_GET_GLOBAL_STRING() for non-identifier strings).

https://bugs.python.org/issue46541#msg411799 explains the rationale for this change.

The core of the change is in:

* (new) Include/internal/pycore_global_strings.h - the declarations for the global strings, along with the macros
* Include/internal/pycore_runtime_init.h - added the static initializers for the global strings
* Include/internal/pycore_global_objects.h - where the struct in pycore_global_strings.h is hooked into _PyRuntimeState
* Tools/scripts/generate_global_objects.py - added generation of the global string declarations and static initializers

I've also added a --check flag to generate_global_objects.py (along with make check-global-objects) to check for unused global strings.  That check is added to the PR CI config.

The remainder of this change updates the core code to use _Py_GET_GLOBAL_IDENTIFIER() instead of _Py_IDENTIFIER() and the related _Py*Id functions (likewise for _Py_GET_GLOBAL_STRING() instead of _Py_static_string()).  This includes adding a few functions where there wasn't already an alternative to _Py*Id(), replacing the _Py_Identifier * parameter with PyObject *.

The following are not changed (yet):

* stop using _Py_IDENTIFIER() in the stdlib modules
* (maybe) get rid of _Py_IDENTIFIER(), etc. entirely -- this may not be doable as at least one package on PyPI using this (private) API
* (maybe) intern the strings during runtime init

https://bugs.python.org/issue46541
2022-02-08 13:39:07 -07:00
Ken Jin 768569325a bpo-46407: Fix long_mod refleak (GH-31025) 2022-01-31 11:41:14 +01:00
Crowthebird f10dafc430 bpo-46407: Optimizing some modulo operations (GH-30653)
Added new internal functions to compute mod without also computing the quotient.

The loops can be leaner then, which leads to modestly but reliably faster execution in contexts that know they don't need the quotient.

Code by Jeremiah Vivian (Pascual).
2022-01-27 18:46:45 -06:00
Tim Peters 7c26472d09 bpo-46504: faster code for trial quotient in x_divrem() (GH-30856)
* bpo-46504: faster code for trial quotient in x_divrem()

This brings x_divrem() back into synch with x_divrem1(), which was changed
in bpo-46406 to generate faster code to find machine-word division
quotients and remainders. Modern processors compute both with a single
machine instruction, but convincing C to exploit that requires writing
_less_ "clever" C code.
2022-01-24 19:06:00 -06:00
Gregory P. Smith c7f20f1cc8 bpo-46406: Faster single digit int division. (#30626)
* bpo-46406: Faster single digit int division.

This expresses the algorithm in a more basic manner resulting in better
instruction generation by todays compilers.

See https://mail.python.org/archives/list/python-dev@python.org/thread/ZICIMX5VFCX4IOFH5NUPVHCUJCQ4Q7QM/#NEUNFZU3TQU4CPTYZNF3WCN7DOJBBTK5
2022-01-23 10:00:41 +00:00
Victor Stinner 1781d55eb3 bpo-46417: _curses uses PyStructSequence_NewType() (GH-30736)
The _curses module now creates its ncurses_version type as a heap
type using PyStructSequence_NewType(), rather than using a static
type.

* Move _PyStructSequence_FiniType() definition to pycore_structseq.h.
* test.pythoninfo: log curses.ncurses_version.
2022-01-21 03:30:20 +01:00
Victor Stinner e9e3eab0b8 bpo-46417: Finalize structseq types at exit (GH-30645)
Add _PyStructSequence_FiniType() and _PyStaticType_Dealloc()
functions to finalize a structseq static type in Py_Finalize().
Currrently, these functions do nothing if Python is built in release
mode.

Clear static types:

* AsyncGenHooksType: sys.set_asyncgen_hooks()
* FlagsType: sys.flags
* FloatInfoType: sys.float_info
* Hash_InfoType: sys.hash_info
* Int_InfoType: sys.int_info
* ThreadInfoType: sys.thread_info
* UnraisableHookArgsType: sys.unraisablehook
* VersionInfoType: sys.version
* WindowsVersionType: sys.getwindowsversion()
2022-01-21 01:42:25 +01:00
Brandt Bucher 5cd9a162cd bpo-46361: Fix "small" int caching (GH-30583) 2022-01-16 16:06:37 +00:00
Tim Peters fc05e6bfce bpo-46020: Optimize long_pow for the common case (GH-30555)
This cuts a bit of overhead by not initializing the table of small
odd powers unless it's needed for a large exponent.
2022-01-12 12:55:02 -06:00