apply_deep_ocr — Apply a Deep OCR model on a set of images for inference.
apply_deep_ocr(Image : : DeepOcrHandle, Mode : DeepOcrResult)
apply_deep_ocr applies the Deep OCR model given by DeepOcrHandle on the tuple of input images Image. The operator returns DeepOcrResult, a tuple with a result dictionary for every input image.
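A minimal HDevelop sketch of the basic workflow (the image file name is a placeholder; a model created with the default parameters of create_deep_ocr, containing both components, is assumed):

  create_deep_ocr ([], DeepOcrHandle)
  * Placeholder file name; replace with an actual image containing text.
  read_image (Image, 'my_text_image')
  * Run detection and recognition in one call.
  apply_deep_ocr (Image, DeepOcrHandle, 'auto', DeepOcrResult)
  * Access the recognized words of the (single) result dictionary.
  get_dict_tuple (DeepOcrResult, 'words', WordsDict)
  get_dict_tuple (WordsDict, 'word', Words)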
The operator apply_deep_ocr poses requirements on the input Image:
Image type: byte or real.
Number of channels, depending on the Mode:
'auto': 1 or 3
'detection': 1 or 3
'recognition': 1
Further, the operator apply_deep_ocr preprocesses the given Image to match the model specifications. This means the input image is converted to type real; byte images are also normalized. In addition, for Mode = 'auto' or 'detection', the input image Image is padded to the model input dimensions and, in case it has only one channel, converted into a three-channel image.
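For 'recognition', in contrast, the single-channel requirement must be met by the caller, e.g., by converting a color crop to a gray-value image beforehand. An illustrative sketch (WordImage is a hypothetical tight crop of a single word):

  count_channels (WordImage, NumChannels)
  if (NumChannels == 3)
      * The recognition component expects a single-channel image.
      rgb1_to_gray (WordImage, WordImageGray)
  else
      copy_obj (WordImage, WordImageGray, 1, 1)
  endif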
The parameter Mode specifies which component of the model is executed. Supported values:
'auto': Perform both parts, the detection of the words and their recognition.
'detection': Perform only the detection part. Hence, the model merely localizes the word regions within the image.
'recognition': Perform only the recognition part. Hence, the model requires that the image contains solely a tight crop of the word.
Note that the model must have been created with the desired component, see create_deep_ocr.
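A sketch of the two single-component modes (Image and WordImageGray are hypothetical inputs, the latter being a tight single-channel crop of one word, e.g., prepared as shown above):

  * Localize word regions only.
  apply_deep_ocr (Image, DeepOcrHandle, 'detection', DetectionResult)
  get_dict_tuple (DetectionResult, 'words', WordsDict)
  get_dict_tuple (WordsDict, 'row', Rows)
  get_dict_tuple (WordsDict, 'col', Cols)
  * Recognize a single, tightly cropped word.
  apply_deep_ocr (WordImageGray, DeepOcrHandle, 'recognition', RecognitionResult)
  get_dict_tuple (RecognitionResult, 'word', Word)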
The output dictionary DeepOcrResult can have the following entries, depending on the applied Mode (marked by its abbreviation: A = 'auto', DET = 'detection', REC = 'recognition'):
image (A, DET, REC): Preprocessed image.
score_maps (A, DET): Scores given as an image with four channels:
Character score: Score for the character detection.
Link score: Score for the connection of detected character centers to a connected word.
Orientation 1: Sine component of the predicted word orientation.
Orientation 2: Cosine component of the predicted word orientation.
words (A, DET): Dictionary containing the following entries. Each entry is a tuple with one value for every found word.
word (A): Recognized word.
word_image (A): Preprocessed image part containing the word.
row (A, DET): Localized word: Center point, row coordinate.
col (A, DET): Localized word: Center point, column coordinate.
phi (A, DET): Localized word: Angle phi.
length1 (A, DET): Localized word: Half length of edge 1.
length2 (A, DET): Localized word: Half length of edge 2.
line_index (A, DET): Line index of the localized word if 'detection_sort_by_line' is set to 'true'.
The word localization is given by the parameters of an oriented rectangle; see gen_rectangle2 for further information and the sketch after this list.
word (REC): Recognized word.
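As referenced above, a sketch for accessing the entries of a single result dictionary after an 'auto' run and visualizing the localized words as oriented rectangles (dictionary keys as listed above; the display calls are HDevelop convenience operators):

  * Iconic entries of the result dictionary.
  get_dict_object (PreprocessedImage, DeepOcrResult, 'image')
  get_dict_object (ScoreMaps, DeepOcrResult, 'score_maps')
  * Channel 1: character score, channel 2: link score.
  access_channel (ScoreMaps, CharacterScore, 1)
  access_channel (ScoreMaps, LinkScore, 2)
  * Word-level entries.
  get_dict_tuple (DeepOcrResult, 'words', WordsDict)
  get_dict_tuple (WordsDict, 'word', Words)
  get_dict_tuple (WordsDict, 'row', Rows)
  get_dict_tuple (WordsDict, 'col', Cols)
  get_dict_tuple (WordsDict, 'phi', Phis)
  get_dict_tuple (WordsDict, 'length1', Lengths1)
  get_dict_tuple (WordsDict, 'length2', Lengths2)
  * One oriented rectangle per detected word.
  gen_rectangle2 (WordRegions, Rows, Cols, Phis, Lengths1, Lengths2)
  dev_display (PreprocessedImage)
  dev_display (WordRegions)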
System requirements:
To run this operator on GPU (see get_deep_ocr_param), cuDNN and cuBLAS are required.
For further details, please refer to the “Installation Guide”,
paragraph “Requirements for Deep Learning and Deep-Learning-Based Methods”.
Alternatively, this operator can also be run on CPU.
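A sketch for selecting an available GPU before inference; the parameter name 'device' is an assumption here and should be verified with get_deep_ocr_param:

  * Query GPU devices; the tuple stays empty if no suitable GPU is found.
  query_available_dl_devices ('runtime', 'gpu', DLDeviceHandles)
  if (|DLDeviceHandles| > 0)
      * Assumed parameter name; check the parameters supported by the model first.
      set_deep_ocr_param (DeepOcrHandle, 'device', DLDeviceHandles[0])
  endif
  apply_deep_ocr (Image, DeepOcrHandle, 'auto', DeepOcrResult)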
This operator returns a handle. Note that the state of an instance of this handle type may be changed by specific operators even though the handle is used as an input parameter by those operators.
This operator supports cancelling timeouts and interrupts.
This operator supports breaking timeouts and interrupts.
Image (input_object) (multichannel-)image(-array) → object (byte / real)
Input image.
DeepOcrHandle (input_control) deep_ocr → (handle)
Handle of the Deep OCR model.
Mode (input_control) string → (string)
Inference mode.
Default value: []
List of values: 'auto', 'detection', 'recognition'
DeepOcrResult (output_control) dict(-array) → (handle)
Tuple of result dictionaries.
If the parameters are valid, the operator apply_deep_ocr returns the value 2 (H_MSG_TRUE). If necessary, an exception is raised.
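A raised exception can be handled in HDevelop with a try/catch block, for example to report an unsupported input image (sketch):

  try
      apply_deep_ocr (Image, DeepOcrHandle, 'auto', DeepOcrResult)
  catch (Exception)
      * Retrieve the error message of the raised exception.
      dev_get_exception_data (Exception, 'error_message', ErrorMessage)
  endtry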
get_deep_ocr_param, set_deep_ocr_param, create_deep_ocr
Module: OCR/OCV