TextRecognize

TextRecognize[image]

recognizes text in image and returns it as a string.

TextRecognize[image,level]

returns a list of strings at the specified structural level.

TextRecognize[image,level,prop]

returns prop for text at the given level.

Details and Options

TextRecognize works with arbitrary grayscale and multichannel images, operating on the intensity value of each pixel.
TextRecognize[{image₁,image₂,…}] returns recognition results for each imageᵢ.
By default, the recognized text is returned as a single string for the whole image. Specifying a level splits the result into a list of structural elements instead.
Structural elements specified in level include:

	Automatic	text found in the whole image as a single string (default)
	"Block"	a list of results for each block of text
	"Line"	a list of results for each line
	"Word"	a list of results for each word
	"Character"	a list of results for each character

TextRecognize[image,level,prop] computes prop at the given level and returns the result as a list {val₁,val₂,…}.
Possible settings for prop include:

	"Text"	recognized text (default)
	"BoundingBox"	bounding box around the text as a Rectangle
	"Strength"	strength of the recognized text
	"Image"	cropped image containing the recognized text
	{prop₁,prop₂,…}	a list of properties
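
The level and prop arguments described above can be sketched as follows. This is a minimal illustration, not an example from the original page: the test image is produced with Rasterize, and the sample text and font size are assumptions chosen only so the code is self-contained. Recognition results depend on the trained model, so no particular output is shown.

```wolfram
(* Test image; the Rasterize step and sample text are illustrative assumptions *)
img = Rasterize[Style["Hello world\nSecond line", 36], "Image"];

(* Level only: one string per line, then one string per word *)
lines = TextRecognize[img, "Line"]
words = TextRecognize[img, "Word"]

(* Level plus property: strength and cropped image of each word *)
strengths = TextRecognize[img, "Word", "Strength"]
crops = TextRecognize[img, "Word", "Image"]

(* Several properties requested in one call *)
both = TextRecognize[img, "Word", {"Text", "Strength"}]
```

Requesting a finer level such as "Character" follows the same pattern and returns one result per recognized character.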

The following options can be specified:

	Language	$Language	the language to recognize
	Masking	All	the region of interest that includes text
	RecognitionPrior	Automatic	assumption about text in each masked area

TextRecognize accepts a Language option. By default, Language:>$Language is used. Use Language->{lang₁,lang₂,…} to perform multi-language recognition.
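
As a rough sketch of the Language option, the following compares a single-language setting with a multi-language list. The test image and its English and French sample text are assumptions made for illustration, and the quality of the result will depend on the installed version and its language resources.

```wolfram
(* Test image mixing English and French; the text is an illustrative assumption *)
img = Rasterize[Style["The cat sleeps\nLe garçon dort", 36], "Image"];

(* Recognize assuming a single language *)
french = TextRecognize[img, Language -> "French"]

(* Recognize assuming either of two languages *)
mixed = TextRecognize[img, Language -> {"English", "French"}]
```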
The following Language settings can be used:

"Afrikaans"	"Albanian"	"Azerbaijani"	"Belarusian"
"Bosnian"	"Bulgarian"	"Catalan"	"Cebuano"
"Chinese"	"Croatian"	"Czech"	"Danish"
"Dutch"	"English"	"Esperanto"	"Estonian"
"Finnish"	"French"	"Galician"	"Georgian"
"German"	"Greek"	"Haitian"	"Hungarian"
"Icelandic"	"Indonesian"	"Irish"	"Italian"
"Japanese"	"Kazakh"	"Kirghiz"	"Korean"
"Lao"	"Latin"	"Lithuanian"	"Macedonian"
"Malay"	"Norwegian"	"Polish"	"Portuguese"
"Romanian"	"Russian"	"Serbian"	"Slovak"
"Slovenian"	"Spanish"	"Swahili"	"Swedish"
"Tajik"	"Turkish"	"Ukrainian"	"Uzbek"
"Vietnamese"	"Welsh"

RecognitionPrior makes an assumption about the kind of text present in the whole image or in each masked area. Possible settings include:

	Automatic	automatic structure recognition (default)
	"Column"	a single column of text
	"Line"	a single line of text
	"Word"	a single word
	"Character"	a single character
	"SparseText"	text in no particular structure

TextRecognize uses machine learning. Its training set and methods may change between versions of the Wolfram Language, so results can differ from one version to another.
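
The RecognitionPrior settings listed above can be sketched as follows. This is only an illustration: the single-line and single-word test images are assumptions created with Rasterize, and, as noted, the recognized text may vary between versions.

```wolfram
(* Test images; the sample text is an illustrative assumption *)
lineImg = Rasterize[Style["Hello world", 36], "Image"];
wordImg = Rasterize[Style["Hello", 36], "Image"];

(* Tell the recognizer that the image holds exactly one line of text *)
lineText = TextRecognize[lineImg, RecognitionPrior -> "Line"]

(* Tell the recognizer that the image holds exactly one word *)
wordText = TextRecognize[wordImg, RecognitionPrior -> "Word"]
```

A prior that matches the actual layout gives the recognizer less to infer, which is why it is worth setting when the structure of the input is known in advance.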

Examples


Basic Examples (2)

Recognize text in an image:
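
The example image from the original page is not available here, so the following is a stand-in sketch. It assumes a test image generated with Rasterize from an arbitrary sample string.

```wolfram
(* Stand-in test image; the sample text is an illustrative assumption *)
img = Rasterize[Style["Hello world", 36], "Image"];

(* With no level given, the whole image yields a single string *)
text = TextRecognize[img]
```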


Recognize lines of text and their corresponding bounding boxes:
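
A minimal sketch of this step, again using a generated stand-in image rather than the original example image. The two-line sample text is an assumption.

```wolfram
(* Stand-in test image with two lines; the sample text is an illustrative assumption *)
img = Rasterize[Style["First line\nSecond line", 36], "Image"];

(* Request the text and the bounding box of every recognized line *)
res = TextRecognize[img, "Line", {"Text", "BoundingBox"}]
```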


Highlight the bounding box of each recognized line:
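
One way to do this, sketched under the assumption that the bounding boxes are passed directly to HighlightImage as regions. The stand-in image is generated with Rasterize, and the sample text is arbitrary.

```wolfram
(* Stand-in test image; the sample text is an illustrative assumption *)
img = Rasterize[Style["First line\nSecond line", 36], "Image"];

(* Bounding box of each recognized line, returned as Rectangle objects *)
boxes = TextRecognize[img, "Line", "BoundingBox"];

(* Overlay the rectangles on the source image *)
highlighted = HighlightImage[img, boxes]
```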


Scope (14)

Options (8)

Applications (6)

Properties & Relations (2)

Possible Issues (8)

Neat Examples (1)

See Also

LanguageIdentify BarcodeRecognize ImageIdentify Classify TextWords TextSentences TextPosition WordCloud WordTranslation

Related Guides

Related Links

An Elementary Introduction to the Wolfram Language: Machine Learning

Introduced in 2010 (8.0) | Updated in 2017 (11.2)
