Commit Graph

268 Commits

Author SHA1 Message Date
Serhiy Storchaka a83a6a3275 Issue #28701: _PyUnicode_EqualToASCIIId and _PyUnicode_EqualToASCIIString now
require ASCII right argument and assert this condition in debug build.
2016-11-16 20:02:44 +02:00
Serhiy Storchaka dddec81b2d Issue #21449: Removed private function _PyUnicode_CompareWithId. 2016-11-16 15:56:27 +02:00
Serhiy Storchaka fab6acd9f5 Issue #28701: Replace _PyUnicode_CompareWithId with _PyUnicode_EqualToASCIIId.
The latter function is more readable, faster and doesn't raise exceptions.

Based on patch by Xiang Zhang.
2016-11-16 15:41:11 +02:00
Serhiy Storchaka f5894dd646 Issue #28701: Replace _PyUnicode_CompareWithId with _PyUnicode_EqualToASCIIId.
The latter function is more readable, faster and doesn't raise exceptions.

Based on patch by Xiang Zhang.
2016-11-16 15:40:39 +02:00
Serhiy Storchaka 3b73ea1278 Issue #28701: Replace PyUnicode_CompareWithASCIIString with _PyUnicode_EqualToASCIIString.
The latter function is more readable, faster and doesn't raise exceptions.
2016-11-16 10:19:20 +02:00
Serhiy Storchaka f4934ea77d Issue #28701: Replace PyUnicode_CompareWithASCIIString with _PyUnicode_EqualToASCIIString.
The latter function is more readable, faster and doesn't raise exceptions.
2016-11-16 10:17:58 +02:00
Eric V. Smith 5646648678 Issue 28128: Print out better error/warning messages for invalid string escapes. Backport to 3.6. 2016-10-31 14:46:26 -04:00
Serhiy Storchaka 0093907f0e Issue #28426: Deprecated undocumented functions PyUnicode_AsEncodedObject(),
PyUnicode_AsDecodedObject(), PyUnicode_AsDecodedUnicode() and
PyUnicode_AsEncodedUnicode().
2016-10-27 21:05:49 +03:00
Serhiy Storchaka b3648576cd Issue #28295: Fixed the documentation and added tests for PyUnicode_AsUCS4().
Original patch by Xiang Zhang.
2016-10-02 21:30:35 +03:00
Serhiy Storchaka cc164232aa Issue #28295: Fixed the documentation and added tests for PyUnicode_AsUCS4().
Original patch by Xiang Zhang.
2016-10-02 21:29:26 +03:00
Martin Panter d508d00919 Issue #28139: Merge indentation fixes from 3.5 into 3.6 2016-09-17 07:59:14 +00:00
Martin Panter 6d57fe1c23 Issue #28139: Fix messed up indentation
Also update the classmethod and staticmethod doc strings and comments to
match the RST documentation.
2016-09-17 03:26:16 +00:00
Christian Heimes f051e43b22 Issue #28126: Replace Py_MEMCPY with memcpy(). Visual Studio can properly optimize memcpy(). 2016-09-13 20:22:02 +02:00
Serhiy Storchaka 9fab79bcb5 Issue #26900: Excluded underscored names and other private API from limited API. 2016-09-11 11:03:14 +03:00
Benjamin Peterson a13e367778 simplify Py_UCSN definitions with stdint types 2016-09-08 11:38:28 -07:00
Steve Dower cc16be85c0 Issue #27781: Change file system encoding on Windows to UTF-8 (PEP 529) 2016-09-08 10:35:16 -07:00
Steve Dower f5aba58480 Issue #27959: Adds oem encoding, alias ansi to mbcs, move aliasmbcs to codec lookup 2016-09-06 19:42:27 -07:00
Serhiy Storchaka ea525a2d1a Issue #27078: Added BUILD_STRING opcode. Optimized f-strings evaluation. 2016-09-06 22:07:53 +03:00
Martin Panter 02b75abf73 Merge spelling and grammar fixes from 3.5 2016-08-05 01:51:39 +00:00
Martin Panter 69332c1a64 Fix spelling and grammar in documentation and code comments 2016-08-04 13:07:31 +00:00
Serhiy Storchaka b6a9c9761c Issue #26778: Fixed "a/an/and" typos in code comment, documentation and error
messages.
2016-04-17 09:39:28 +03:00
Serhiy Storchaka 6a7b3a77b4 Issue #26778: Fixed "a/an/and" typos in code comment and documentation. 2016-04-17 08:32:47 +03:00
Martin Panter cda80940ed Issue #15984: Merge PyUnicode doc from 3.5 2016-04-15 02:27:11 +00:00
Martin Panter 20d325574e Issue #15984: Correct PyUnicode_FromObject() and _FromEncodedObject() docs 2016-04-15 00:56:21 +00:00
Martin Panter 6245cb3c01 Correct “an” → “a” with “Unicode”, “user”, “UTF”, etc
This affects documentation, code comments, and a debugging messages.
2016-04-15 02:14:19 +00:00
Martin Panter c86c91aab0 Merge typo fixes from 3.5 2016-04-05 06:20:32 +00:00
Martin Panter cc71a795df Fix typos in documentation and comments 2016-04-05 06:19:42 +00:00
Serhiy Storchaka 4a7c03aab4 Issue #25523: Merge a-to-an corrections from 3.5. 2015-11-02 14:44:29 +02:00
Serhiy Storchaka a84f6c3dd3 Issue #25523: Merge a-to-an corrections from 3.4. 2015-11-02 14:39:05 +02:00
Serhiy Storchaka d65c9496da Issue #25523: Further a-to-an corrections. 2015-11-02 14:10:23 +02:00
Victor Stinner fdfbf78114 Issue #25318: Add _PyBytesWriter API
Add a new private API to optimize Unicode encoders. It uses a small buffer
allocated on the stack and supports overallocation.

Use _PyBytesWriter API for UCS1 (ASCII and Latin1) and UTF-8 encoders. Enable
overallocation for the UTF-8 encoder with error handlers.

unicode_encode_ucs1(): initialize collend to collstart+1 to not check the
current character twice, we already know that it is not ASCII.
2015-10-09 00:33:49 +02:00
Victor Stinner ca9381ea01 Issue #24870: Add _PyUnicodeWriter_PrepareKind() macro
Add a macro which ensures that the writer has at least the requested kind.
2015-09-22 00:58:32 +02:00
Raymond Hettinger ac2ef65c32 Make the unicode equality test an external function rather than in-lining it.
The real benefit of the unicode specialized function comes from
bypassing the overhead of PyObject_RichCompareBool() and not
from being in-lined (especially since there was almost no shared
data between the caller and callee).  Also, the in-lining was
having a negative effect on code generation for the callee.
2015-07-04 16:04:44 -07:00
Serhiy Storchaka 7e9d1d1a1b Issue #23908: os functions now reject paths with embedded null character
on Windows instead of silently truncate them.

Removed no longer used _PyUnicode_HasNULChars().
2015-04-20 10:12:28 +03:00
Victor Stinner ce2c584ea5 Merge 3.4 (typo) 2015-02-11 18:18:10 +01:00
Victor Stinner 22fabe218d Fix typo: PyMem_Alloc => PyMem_Malloc 2015-02-11 18:17:56 +01:00
Ethan Furman b95b56150f Issue20284: Implement PEP461 2015-01-23 20:05:18 -08:00
Benjamin Peterson 82f34ada45 fix instances of consecutive articles (closes #23221)
Patch by Karan Goel.
2015-01-13 09:17:24 -05:00
Serhiy Storchaka b757c83ec6 Issue #22581: Use more "bytes-like object" throughout the docs and comments. 2014-12-05 22:25:22 +02:00
Antoine Pitrou 8c6f8dc527 Issue #19537: Fix PyUnicode_DATA() alignment under m68k. Patch by Andreas Schwab. 2014-03-23 22:55:03 +01:00
Martin v. Löwis 1c0689c613 Issue #19526: Exclude all new API from the stable ABI. 2014-01-03 21:36:49 +01:00
Victor Stinner a726192181 oops, remove _PyObject_ReprWriter() definition (unwanted change) 2013-11-19 13:18:45 +01:00
Victor Stinner 4a58707a34 Add _PyUnicodeWriter_WriteASCIIString() function 2013-11-19 12:54:53 +01:00
Victor Stinner ad14ccd047 Issue #19512: add _PyUnicode_CompareWithId() function
_PyUnicode_CompareWithId() is faster than PyUnicode_CompareWithASCIIString()
when both strings are equal and interned.

Add also _PyId_builtins identifier for "builtins" common string.
2013-11-07 00:46:04 +01:00
Antoine Pitrou 9ed5f27266 Issue #18722: Remove uses of the "register" keyword in C code. 2013-08-13 20:18:52 +02:00
Victor Stinner f476405503 fix typo in a comment 2013-04-18 23:21:19 +02:00
Victor Stinner 8f674ccd64 Close #17694: Add minimum length to _PyUnicodeWriter
* Add also min_char attribute to _PyUnicodeWriter structure (currently unused)
 * _PyUnicodeWriter_Init() has no more argument (except the writer itself):
   min_length and overallocate must be set explicitly
 * In error handlers, only enable overallocation if the replacement string
   is longer than 1 character
 * CJK decoders don't use overallocation anymore
 * Set min_length, instead of preallocating memory using
   _PyUnicodeWriter_Prepare(), in many decoders
 * _PyUnicode_DecodeUnicodeInternal() checks for integer overflow
2013-04-17 23:02:17 +02:00
Victor Stinner a0dd0213cc Close #17693: Rewrite CJK decoders to use the _PyUnicodeWriter API instead of
the legacy Py_UNICODE API.

Add also a new _PyUnicodeWriter_WriteChar() function.
2013-04-11 22:09:04 +02:00
Victor Stinner cfc4c13b04 Add _PyUnicodeWriter_WriteSubstring() function
Write a function to enable more optimizations:

 * If the substring is the whole string and overallocation is disabled, just
   keep a reference to the string, don't copy characters
 * Avoid a call to the expensive _PyUnicode_FindMaxChar() function when
   possible
2013-04-03 01:48:39 +02:00
Victor Stinner d45c7f8d74 Issue #16455: On FreeBSD and Solaris, if the locale is C, the
ASCII/surrogateescape codec is now used, instead of the locale encoding, to
decode the command line arguments. This change fixes inconsistencies with
os.fsencode() and os.fsdecode() because these operating systems announces an
ASCII locale encoding, whereas the ISO-8859-1 encoding is used in practice.
2012-12-04 01:34:47 +01:00