Name
create_ocr_class_knn, T_create_ocr_class_knn, CreateOcrClassKnn: Create an OCR classifier using a k-Nearest Neighbor (k-NN) classifier.
void CreateOcrClassKnn(const HTuple& WidthCharacter, const HTuple& HeightCharacter, const HTuple& Interpolation, const HTuple& Features, const HTuple& Characters, const HTuple& GenParamName, const HTuple& GenParamValue, HTuple* OCRHandle)
void HOCRKnn::HOCRKnn(Hlong WidthCharacter, Hlong HeightCharacter, const HString& Interpolation, const HTuple& Features, const HTuple& Characters, const HTuple& GenParamName, const HTuple& GenParamValue)
void HOCRKnn::HOCRKnn(Hlong WidthCharacter, Hlong HeightCharacter, const HString& Interpolation, const HString& Features, const HTuple& Characters, const HTuple& GenParamName, const HTuple& GenParamValue)
void HOCRKnn::HOCRKnn(Hlong WidthCharacter, Hlong HeightCharacter, const char* Interpolation, const char* Features, const HTuple& Characters, const HTuple& GenParamName, const HTuple& GenParamValue)
void HOCRKnn::CreateOcrClassKnn(Hlong WidthCharacter, Hlong HeightCharacter, const HString& Interpolation, const HTuple& Features, const HTuple& Characters, const HTuple& GenParamName, const HTuple& GenParamValue)
void HOCRKnn::CreateOcrClassKnn(Hlong WidthCharacter, Hlong HeightCharacter, const HString& Interpolation, const HString& Features, const HTuple& Characters, const HTuple& GenParamName, const HTuple& GenParamValue)
void HOCRKnn::CreateOcrClassKnn(Hlong WidthCharacter, Hlong HeightCharacter, const char* Interpolation, const char* Features, const HTuple& Characters, const HTuple& GenParamName, const HTuple& GenParamValue)
static void HOperatorSet.CreateOcrClassKnn(HTuple widthCharacter, HTuple heightCharacter, HTuple interpolation, HTuple features, HTuple characters, HTuple genParamName, HTuple genParamValue, out HTuple OCRHandle)
public HOCRKnn(int widthCharacter, int heightCharacter, string interpolation, HTuple features, HTuple characters, HTuple genParamName, HTuple genParamValue)
public HOCRKnn(int widthCharacter, int heightCharacter, string interpolation, string features, HTuple characters, HTuple genParamName, HTuple genParamValue)
void HOCRKnn.CreateOcrClassKnn(int widthCharacter, int heightCharacter, string interpolation, HTuple features, HTuple characters, HTuple genParamName, HTuple genParamValue)
void HOCRKnn.CreateOcrClassKnn(int widthCharacter, int heightCharacter, string interpolation, string features, HTuple characters, HTuple genParamName, HTuple genParamValue)
create_ocr_class_knn creates an OCR classifier that uses a
k-Nearest Neighbor (k-NN) classifier. The handle of the k-NN classifier is
returned in OCRHandle.
For a description of how a k-NN works, see create_class_knn.
The length of the feature vector of the k-NN is determined from the
features that are used for the OCR, which are passed in Features.
The features are described below. The number of classes is determined
from the names of the characters, which are passed in Characters.
Features can contain a tuple of several
feature names. Each of these names results in one or more
features to be calculated for the classifier. Some of the feature
names compute gray value features (e.g., 'pixel_invar').
Because a classifier requires a constant number of features (input
variables), a character to be classified is transformed to a
standard size, which is determined by WidthCharacter and
HeightCharacter. The interpolation to be used for the
transformation is determined by Interpolation. It has the
same meaning as in affine_trans_image. The interpolation
should be chosen such that no aliasing effects occur in the
transformation. For most applications, Interpolation =
'constant' should be used. Note that the
size of the transformed character should not be chosen too large, because
the generalization properties of the classifier may degrade for
large sizes. In particular, large sizes will cause
small segmentation errors to have a large influence on the
computed features if gray value features are used. This happens
because segmentation errors change the smallest enclosing
rectangle of the regions, which causes the characters
to be zoomed differently than the characters in the training set. In
most applications, sizes between 6x8 and
10x14 should be used.
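The influence of the standard size on this effect can be illustrated with a small calculation. The following Python sketch is not part of HALCON: the function name target_shift, the 20-pixel example character, and the plain linear zoom model are assumptions made for illustration only. It reports how far the center column of a character moves in the zoomed image when the smallest enclosing rectangle is one pixel too wide.

```python
def target_shift(source_width, target_width, error_px=1):
    """Shift (in zoomed pixels) of the character's center column when the
    enclosing rectangle is error_px pixels too wide.

    Assumes a plain linear zoom from the rectangle to the standard size;
    this is an illustrative model, not HALCON's actual implementation.
    """
    center = source_width / 2.0
    # Position of the center column after zooming the correct rectangle.
    correct = center * target_width / source_width
    # Position of the same column after zooming the enlarged rectangle.
    wrong = center * target_width / (source_width + error_px)
    return correct - wrong


if __name__ == "__main__":
    # Hypothetical character that is 20 pixels wide in the input image.
    for target_width in (8, 10, 20, 40):
        shift = target_shift(20, target_width)
        print(f"WidthCharacter={target_width:2d}: shift of {shift:.2f} zoomed pixels")
```

Under this simplified model the shift grows in proportion to the standard size, which is consistent with the recommendation to keep the standard size small.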
The parameter Features can contain the following feature
names for the classification of the characters.
- 'default': 'ratio' and 'pixel_invar' are selected.
- 'pixel': Gray values of the character (WidthCharacter x HeightCharacter features).
- 'pixel_invar': Gray values of the character with maximum scaling of the gray values (WidthCharacter x HeightCharacter features).
- 'pixel_binary': Region of the character as a binary image zoomed to a size of WidthCharacter x HeightCharacter (WidthCharacter x HeightCharacter features).
- 'gradient_8dir': Gradients are computed on the character image. The gradient directions are discretized into 8 directions. The amplitude image is decomposed into 8 channels according to these discretized directions. 25 samples on a 5x5 grid are extracted from each channel. These samples are used as features (200 features).
- 'projection_horizontal': Horizontal projection of the gray values (see gray_projections, HeightCharacter features).
- 'projection_horizontal_invar': Maximally scaled horizontal projection of the gray values (HeightCharacter features).
- 'projection_vertical': Vertical projection of the gray values (see gray_projections, WidthCharacter features).
- 'projection_vertical_invar': Maximally scaled vertical projection of the gray values (WidthCharacter features).
- 'ratio': Aspect ratio of the character (see height_width_ratio, 1 feature).
- 'anisometry': Anisometry of the character (see eccentricity, 1 feature).
- 'width': Width of the character before scaling the character to the standard size (not scale-invariant, see height_width_ratio, 1 feature).
- 'height': Height of the character before scaling the character to the standard size (not scale-invariant, see height_width_ratio, 1 feature).
- 'zoom_factor': Difference in size between the character and the values WidthCharacter and HeightCharacter (not scale-invariant, 1 feature).
- 'foreground': Fraction of pixels in the foreground (1 feature).
- 'foreground_grid_9': Fraction of pixels in the foreground in a 3x3 grid within the smallest enclosing rectangle of the character (9 features).
- 'foreground_grid_16': Fraction of pixels in the foreground in a 4x4 grid within the smallest enclosing rectangle of the character (16 features).
- 'compactness': Compactness of the character (see compactness, 1 feature).
- 'convexity': Convexity of the character (see convexity, 1 feature).
- 'moments_region_2nd_invar': Normalized 2nd moments of the character (see moments_region_2nd_invar, 3 features).
- 'moments_region_2nd_rel_invar': Normalized 2nd relative moments of the character (see moments_region_2nd_rel_invar, 2 features).
- 'moments_region_3rd_invar': Normalized 3rd moments of the character (see moments_region_3rd_invar, 4 features).
- 'moments_central': Normalized central moments of the character (see moments_region_central, 4 features).
- 'moments_gray_plane': Normalized gray value moments and the angle of the gray value plane (see moments_gray_plane, 4 features).
- 'phi': Sine and cosine of the orientation (angle) of the character (see elliptic_axis, 2 features).
- 'num_connect': Number of connected components (see connect_and_holes, 1 feature).
- 'num_holes': Number of holes (see connect_and_holes, 1 feature).
- 'cooc': Values of the binary cooccurrence matrix (see gen_cooc_matrix, 8 features).
- 'num_runs': Number of runs in the region normalized by the height (1 feature).
- 'chord_histo': Frequency of the runs per row (not scale-invariant, HeightCharacter features).
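Because the length of the feature vector follows from the selected feature names and the standard size, it can be estimated before the classifier is created. The following Python sketch simply adds up the feature counts given in the list above. The function name feature_vector_length is hypothetical and not part of HALCON, and the sketch assumes that each listed name contributes its full count (how the operator treats a name that is passed more than once is not covered here).

```python
def feature_vector_length(features, width_character, height_character):
    """Sum the per-feature counts from the list above for the given names.

    Illustrative helper only; it does not call HALCON.
    """
    w, h = width_character, height_character
    counts = {
        "pixel": w * h,
        "pixel_invar": w * h,
        "pixel_binary": w * h,
        "gradient_8dir": 200,
        "projection_horizontal": h,
        "projection_horizontal_invar": h,
        "projection_vertical": w,
        "projection_vertical_invar": w,
        "ratio": 1,
        "anisometry": 1,
        "width": 1,
        "height": 1,
        "zoom_factor": 1,
        "foreground": 1,
        "foreground_grid_9": 9,
        "foreground_grid_16": 16,
        "compactness": 1,
        "convexity": 1,
        "moments_region_2nd_invar": 3,
        "moments_region_2nd_rel_invar": 2,
        "moments_region_3rd_invar": 4,
        "moments_central": 4,
        "moments_gray_plane": 4,
        "phi": 2,
        "num_connect": 1,
        "num_holes": 1,
        "cooc": 8,
        "num_runs": 1,
        "chord_histo": h,
    }
    # 'default' selects 'ratio' and 'pixel_invar'.
    counts["default"] = counts["ratio"] + counts["pixel_invar"]

    if isinstance(features, str):
        features = [features]
    return sum(counts[name] for name in features)


if __name__ == "__main__":
    # Default features at the default size 8x10: 1 + 8*10 = 81 features.
    print(feature_vector_length("default", 8, 10))
```

With the default values (WidthCharacter = 8, HeightCharacter = 10, Features = 'default'), the sum is 1 + 80 = 81 features.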
After the classifier has been created, it is trained using
trainf_ocr_class_knn. After this, the classifier can be
saved using write_ocr_class_knn. Alternatively, the
classifier can be used immediately after training to classify
characters using do_ocr_single_class_knn or
do_ocr_multi_class_knn.
A comparison of the k-NN and the support vector machine (SVM) (see
create_ocr_class_svm) typically shows that SVMs are
slower at training, especially for huge training sets, but
achieve slightly better recognition rates than k-NNs. Please note that
this guideline assumes optimal tuning of the parameters of the SVM.
- Multithreading type: reentrant (runs in parallel with non-exclusive operators).
- Multithreading scope: global (may be called from any thread).
- Processed without parallelization.
This operator returns a handle. Note that the state of an instance of this handle type may be changed by specific operators even though the handle is used as an input parameter by those operators.
WidthCharacter: Width of the rectangle to which the gray values
of the segmented character are zoomed.
Default value: 8
Suggested values: 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 14, 16, 20
Typical range of values: 4 ≤ WidthCharacter ≤ 20
HeightCharacter: Height of the rectangle to which the gray values
of the segmented character are zoomed.
Default value: 10
Suggested values: 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 14, 16, 20
Typical range of values: 4 ≤ HeightCharacter ≤ 20
Interpolation: Interpolation mode for the zooming of the characters.
Default value: 'constant'
List of values: 'bicubic', 'bilinear', 'constant', 'nearest_neighbor', 'weighted'
Features: Features to be used for classification.
Default value: 'default'
List of values: 'anisometry', 'chord_histo', 'compactness', 'convexity', 'cooc', 'default', 'foreground', 'foreground_grid_16', 'foreground_grid_9', 'gradient_8dir', 'height', 'moments_central', 'moments_gray_plane', 'moments_region_2nd_invar', 'moments_region_2nd_rel_invar', 'moments_region_3rd_invar', 'num_connect', 'num_holes', 'num_runs', 'phi', 'pixel', 'pixel_binary', 'pixel_invar', 'projection_horizontal', 'projection_horizontal_invar', 'projection_vertical', 'projection_vertical_invar', 'ratio', 'width', 'zoom_factor'
Characters: All characters of the character set to be read.
Default value: ['0','1','2','3','4','5','6','7','8','9']
GenParamName: This parameter is not yet supported.
Default value: []
List of values: []
GenParamValue: This parameter is not yet supported.
Default value: []
List of values: []
OCRHandle: Handle of the k-NN classifier.
read_image (Image, 'letters')
* Segment the image.
binary_threshold (Image, Region, 'otsu', 'dark', UsedThreshold)
dilation_circle (Region, RegionDilation, 3.5)
connection (RegionDilation, ConnectedRegions)
intersection (ConnectedRegions, Region, RegionIntersection)
sort_region (RegionIntersection, Characters, 'character', 'true', 'row')
* Generate the training file.
count_obj (Characters, Number)
Classes := []
for J := 0 to 25 by 1
Classes := [Classes,gen_tuple_const(20,chr(ord('a')+J))]
endfor
Classes := [Classes,gen_tuple_const(20,'.')]
write_ocr_trainf (Characters, Image, Classes, 'letters.trf')
* Generate and train the classifier.
read_ocr_trainf_names ('letters.trf', CharacterNames, CharacterCount)
create_ocr_class_knn (8, 10, 'constant', 'default', CharacterNames, \
[], [], OCRHandle)
trainf_ocr_class_knn (OCRHandle, 'letters.trf', [], [])
* Re-classify the characters in the image.
do_ocr_multi_class_knn (Characters, Image, OCRHandle, Class, Confidence)
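The Classes tuple in the example above implies that the sorted regions consist of 20 samples of each lowercase letter from 'a' to 'z', followed by 20 periods. As a cross-check of the expected tuple length, the same label sequence can be sketched in Python. This is only an illustration of the loop logic; the function name build_class_labels is hypothetical and the code does not call HALCON.

```python
import string


def build_class_labels(samples_per_class=20):
    """Mirror the HDevelop loop: repeat each class name samples_per_class
    times, for the letters 'a'..'z' and then for '.'."""
    labels = []
    for letter in string.ascii_lowercase:
        labels.extend([letter] * samples_per_class)
    labels.extend(["."] * samples_per_class)
    return labels


if __name__ == "__main__":
    labels = build_class_labels()
    # 27 classes with 20 samples each.
    print(len(labels))
```

The resulting sequence has 27 x 20 = 540 entries, so the segmentation must yield the same number of regions, in the same order, for the training file to be labeled correctly.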
If the parameters are valid, the operator
create_ocr_class_knn returns the value 2 (H_MSG_TRUE). If necessary,
an exception is raised.
Possible successors: trainf_ocr_class_knn
Alternatives: create_ocr_class_svm
See also: do_ocr_single_class_knn,
do_ocr_multi_class_knn,
clear_class_knn,
create_class_knn,
trainf_ocr_class_knn,
classify_class_knn
Module: OCR/OCV