chardet 2.3.0
Universal encoding detector for Python 2 and 3
Chardet: The Universal Character Encoding Detector
Detects:
- ASCII, UTF-8, UTF-16 (2 variants), UTF-32 (4 variants)
- Big5, GB2312, EUC-TW, HZ-GB-2312, ISO-2022-CN (Traditional and Simplified Chinese)
- EUC-JP, SHIFT_JIS, CP932, ISO-2022-JP (Japanese)
- EUC-KR, ISO-2022-KR (Korean)
- KOI8-R, MacCyrillic, IBM855, IBM866, ISO-8859-5, windows-1251 (Cyrillic)
- ISO-8859-2, windows-1250 (Hungarian)
- ISO-8859-5, windows-1251 (Bulgarian)
- windows-1252 (English)
- ISO-8859-7, windows-1253 (Greek)
- ISO-8859-8, windows-1255 (Visual and Logical Hebrew)
- TIS-620 (Thai)
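The "variants" counted for UTF-16 and UTF-32 in the list above presumably correspond to byte orders, which a byte order mark (BOM) at the start of the data can reveal. The following is a minimal sketch of BOM sniffing using only the standard library; it is not chardet's actual implementation, and the `sniff_bom` helper and its labels are hypothetical names chosen for illustration.

```python
import codecs

# (BOM bytes, label) pairs. Four-byte marks come first so that the
# UTF-32 LE mark (FF FE 00 00) is not mistaken for the UTF-16 LE mark
# (FF FE), and FE FF 00 00 is not mistaken for the UTF-16 BE mark (FE FF).
# The labels are illustrative, not the names chardet reports.
BOMS = [
    (codecs.BOM_UTF32_LE, "UTF-32LE"),
    (codecs.BOM_UTF32_BE, "UTF-32BE"),
    (b"\xfe\xff\x00\x00", "UCS-4 (unusual byte order 3412)"),
    (b"\x00\x00\xff\xfe", "UCS-4 (unusual byte order 2143)"),
    (codecs.BOM_UTF8, "UTF-8"),
    (codecs.BOM_UTF16_LE, "UTF-16LE"),
    (codecs.BOM_UTF16_BE, "UTF-16BE"),
]


def sniff_bom(data):
    """Return a label for the BOM that starts `data`, or None if there is none."""
    for bom, label in BOMS:
        if data.startswith(bom):
            return label
    return None
```

Data without a BOM returns `None` here; telling such encodings apart requires statistical analysis of the byte content, which is what the detector is for.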
Requires Python 2.6 or later
Command-line Tool
chardet comes with a command-line script which reports on the encodings of one or more files:
    % chardetect somefile someotherfile
    somefile: windows-1252 with confidence 0.5
    someotherfile: ascii with confidence 1.0
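A similar report can be produced from Python with `chardet.detect`, which takes a byte string and returns a dictionary with `encoding` and `confidence` keys. The sketch below mimics the script's output format; the `describe` and `main` helpers are hypothetical names, not part of chardet, and this is not the source of the `chardetect` script itself.

```python
import sys


def describe(name, result):
    """Format a detection result in the style of the chardetect output."""
    return "{0}: {1} with confidence {2}".format(
        name, result["encoding"], result["confidence"]
    )


def main(paths):
    # Imported here so that describe() can be used without chardet installed.
    import chardet

    for path in paths:
        # Read in binary mode: detection works on raw bytes, not decoded text.
        with open(path, "rb") as handle:
            print(describe(path, chardet.detect(handle.read())))


if __name__ == "__main__":
    main(sys.argv[1:])
```

Saved as, say, `report.py` (an assumed file name), it would be run as `python report.py somefile someotherfile`.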
About
This is a continuation of Mark Pilgrim’s excellent chardet. Previously, two versions needed to be maintained: one that supported Python 2.x and one that supported Python 3.x. We’ve recently merged with Ian Cordasco’s charade fork, so now we have one coherent version that works on Python 2.6+.
Maintainer: Dan Blanchard
| File | Type | Py Version | Uploaded on | Size |
|---|---|---|---|---|
| chardet-2.3.0.tar.gz (md5) | Source | | 2014-10-07 | 160KB |
- Author: Ian Cordasco
- Home Page: https://github.com/chardet/chardet
- Keywords: encoding,i18n,xml
- License: LGPL
- Categories:
  - Development Status :: 4 - Beta
  - Intended Audience :: Developers
  - License :: OSI Approved :: GNU Library or Lesser General Public License (LGPL)
  - Operating System :: OS Independent
  - Programming Language :: Python
  - Programming Language :: Python :: 2
  - Programming Language :: Python :: 2.6
  - Programming Language :: Python :: 2.7
  - Programming Language :: Python :: 3
  - Programming Language :: Python :: 3.2
  - Programming Language :: Python :: 3.3
  - Topic :: Software Development :: Libraries :: Python Modules
  - Topic :: Text Processing :: Linguistic
- Package Index Owner: ajung, MarkPilgrim, erikrose, Daniel.Blanchard, graffatcolmingov
- DOAP record: chardet-2.3.0.xml
