The Classical Language Toolkit (CLTK)
Contents
- About
- Citation
- Installation
- Importing Corpora
- Akkadian
- Arabic
- Bengali
- Chinese
- Coptic
- Ancient Egyptian
- Old English
- Middle English
- French
- Greek
  - Accentuation and diacritics
  - Alphabet
  - Converting Beta Code to Unicode
  - Converting TLG texts with TLGU
  - Information Retrieval
  - Lemmatization
  - Named Entity Recognition
  - Normalization
  - POS tagging
  - Prosody Scanning
  - Sentence Tokenization
  - Stopword Filtering
  - TEI XML
  - Text Cleanup
  - TLG Indices
  - Transliteration
  - Word2Vec
- Hebrew
- Hindi
- Javanese
- Latin
  - Clausulae Analysis
  - Converting J to I, V to U
  - Converting PHI texts with TLGU
  - Information Retrieval
  - Declining
  - Lemmatization
  - Lemmatization, backoff method
  - Line Tokenization
  - Macronizer
  - Making POS training sets
  - Named Entity Recognition
  - PHI Indices
  - POS tagging
  - Prosody Scanning
  - Scansion of Poetry
  - Sentence Tokenization
  - Stemming
  - Stopword Filtering
  - Syllabifier
  - Text Cleanup
  - Transliteration
  - Word Tokenization
  - Word2Vec
- Malayalam
- Marathi
- Multilingual
- Old Norse
- Pali
- Persian
- Prakrit
- Punjabi
- Sanskrit
- Telugu
- Tibetan
- Urdu