Linguistic Data
The Wolfram Language has not only convenient built-in multilingual dictionaries, but also built-in information on word meaning, structure, and usage, as well as the relationships between words. Together with the Wolfram Language's tightly integrated string manipulation functions, visualization, and data import and export, this provides a uniquely powerful platform for natural language computing.
DictionaryLookup — look up words in English and other dictionaries using string patterns
WordList — lists of words of various types in many languages
RandomWord — random word of specified type
DictionaryWordQ — test if a word is a correctly spelled dictionary word
SpellingCorrectionList — list of spelling suggestions for misspelled words
WordFrequencyData — data on typical current and historical word frequencies
PartOfSpeech — possible parts of speech for a word
WordDefinition — definitions of words
WordTranslation — translations of words in many languages
WordData — properties of words and networks of relationships between them
IntegerName — words for integers in many languages
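As a minimal sketch of how the word-level functions above might be called, the following uses arbitrary sample words that are not part of the original page. Most of these functions fetch data from Wolfram servers on first use, so results depend on connectivity and on the data version installed.

```wolfram
(* Dictionary words that start with "ling", found with a string pattern *)
DictionaryLookup["ling" ~~ ___]

(* Test the spelling of a deliberately misspelled sample word,
   then ask for suggested corrections *)
DictionaryWordQ["lingustic"]
SpellingCorrectionList["lingustic"]

(* Possible parts of speech and dictionary definitions *)
PartOfSpeech["run"]
WordDefinition["lexicon"]

(* Translate a single word; the target language is an arbitrary example *)
WordTranslation["word", "French"]

(* Synonyms from WordData, flattened into a plain list *)
WordData["quick", "Synonyms", "List"]

(* A random noun, and an integer spelled out in words *)
RandomWord["Noun"]
IntegerName[42, "Words"]
```

Each call is independent, so any line can be evaluated on its own in a notebook.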
Textual Analysis
Classify — classify text based on language, topic, sentiment, or arbitrary training
StringSplit ▪ StringCases ▪ StringCount ▪ Counts ▪ Nearest ▪ ...
TextTranslation — translate text using an integrated external service
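The functions in this group can be combined into a small word-frequency pipeline. The sketch below assumes a made-up sample sentence and the variable names text and words, none of which come from the original page. The built-in "Language" and "Sentiment" classifiers may download resources on first use, and TextTranslation depends on an external service being available.

```wolfram
(* Hypothetical sample text *)
text = "The quick brown fox jumps over the lazy dog. The dog sleeps.";

(* Lowercase the text and split it at runs of non-letter characters *)
words = StringSplit[ToLowerCase[text], Except[LetterCharacter] ..];

(* Tally each distinct word, most frequent first *)
ReverseSort[Counts[words]]

(* Count occurrences of a substring and extract runs of word characters *)
StringCount[text, "dog"]
StringCases[text, WordCharacter ..]

(* Built-in classifiers for language and sentiment *)
Classify["Language", text]
Classify["Sentiment", "I really enjoyed this book."]

(* Translate the whole text; requires access to the translation service *)
TextTranslation[text, "French"]
```

Splitting on Except[LetterCharacter] .. discards punctuation and digits along with whitespace, which is a deliberate simplification for this example.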
Importing Data
Import — import or "scrape" text from all standard formats
"HTML" ▪ "PDF" ▪ "RTF" ▪ "XML" ▪ "TeX" ▪ "Text" ▪ "String" ▪ ...
Alphabet — alphabets for many languages
Transliterate ▪ AlphabeticOrder
LanguageData — data on 6000+ languages
ResourceData — access textual data in the Wolfram Data Repository
ExampleData — standard sample texts, including complete books
WebSearch — integrated web search in all languages, with snippets etc.
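A short sketch of obtaining text and working with alphabets follows. ImportString is used here as an assumption so that the example runs without a file on disk; it is not listed on this page. The HTML fragment, the file path, and the variable names are hypothetical.

```wolfram
(* Parse an in-memory HTML fragment and keep only its readable text *)
html = "<html><body><h1>Title</h1><p>Some body text.</p></body></html>";
ImportString[html, {"HTML", "Plaintext"}]

(* Import works the same way on files and URLs; this path is hypothetical *)
(* Import["path/to/document.pdf", "Plaintext"] *)

(* Alphabets: English by default, or a named language *)
Alphabet[]
Alphabet["Greek"]

(* Romanize non-Latin text *)
Transliterate["Москва"]

(* 1 if the strings are in alphabetical order, -1 if not, 0 if identical *)
AlphabeticOrder["apple", "banana"]

(* Load a complete sample book and find its ten most frequent words *)
alice = ExampleData[{"Text", "AliceInWonderland"}];
TakeLargest[
 Counts[StringSplit[ToLowerCase[alice], Except[LetterCharacter] ..]],
 10]
```

ExampleData ships sample texts as plain strings, so the same string functions apply to them as to imported text.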
Proper Names & Linguistic Entities
PersonData ▪ CityData ▪ ChemicalData ▪ SpeciesData
Interpreter ▪ SemanticInterpretation ▪ SemanticImportString
WikipediaData — full information from Wikipedia
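As a hedged illustration of resolving names and free-form text to structured data, the sketch below uses arbitrary sample inputs that are not from the original page. All three calls need network access, and the interpretation returned can vary with the input and the data version.

```wolfram
(* Interpret a free-form string as a city entity *)
Interpreter["City"]["paris"]

(* Interpret free-form natural language as a Wolfram Language expression *)
SemanticInterpretation["3 feet"]

(* Plain-text summary of a Wikipedia article *)
WikipediaData["Linguistics", "SummaryPlaintext"]
```

Interpreter restricts the result to a stated type, while SemanticInterpretation accepts any type it can recognize, so the former is usually the safer choice when the kind of input is known in advance.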