create_class_gmm — Create a Gaussian Mixture Model for classification
create_class_gmm( : : NumDim, NumClasses, NumCenters, CovarType, Preprocessing, NumComponents, RandSeed : GMMHandle)
create_class_gmm creates a Gaussian Mixture Model (GMM) for classification. NumDim specifies the number of dimensions of the feature space, NumClasses specifies the number of classes. A GMM consists of NumCenters Gaussian centers per class. NumCenters can either give the exact number of centers to be used or, depending on the number of values passed, specify lower and upper bounds for the number of centers:
Exactly one value: The value determines the exact number of centers to be used for all classes.
Exactly two values: The first value determines the minimum number of centers, the second determines the maximum number of centers for all classes.
Exactly 2*NumClasses values: The values alternate per class. Every first value determines the minimum number of centers for a class and every second value determines the maximum number of centers for that class.
When lower and upper bounds are specified, the optimum number of centers is determined with the help of the Minimum Message Length Criterion (MML). In general, we recommend starting the training with (too) many centers as the maximum and the expected number of centers as the minimum.
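The three ways of reading NumCenters can be sketched in a few lines of Python. This is only an illustration of the rules stated above, not HALCON code: the function name `center_bounds` and the validation `1 <= min <= max` are assumptions made for this example.

```python
def center_bounds(num_centers, num_classes):
    """Expand a NumCenters-style value into one (min, max) pair per class.

    Hypothetical helper for illustration; not part of HALCON.
    """
    values = [num_centers] if isinstance(num_centers, int) else list(num_centers)
    if len(values) == 1:
        # One value: exact number of centers for every class.
        bounds = [(values[0], values[0])] * num_classes
    elif len(values) == 2:
        # Two values: shared lower and upper bound for every class.
        bounds = [(values[0], values[1])] * num_classes
    elif len(values) == 2 * num_classes:
        # 2*NumClasses values: alternating minimum and maximum per class.
        bounds = [(values[2 * i], values[2 * i + 1]) for i in range(num_classes)]
    else:
        raise ValueError("expected 1, 2, or 2*num_classes values")
    for lower, upper in bounds:
        # Assumed sanity check: at least one center, and min <= max.
        if lower < 1 or upper < lower:
            raise ValueError("need 1 <= min <= max for every class")
    return bounds


if __name__ == "__main__":
    print(center_bounds(3, 2))             # exact number for all classes
    print(center_bounds([1, 5], 3))        # shared bounds, as in the example call
    print(center_bounds([1, 5, 2, 4], 2))  # individual bounds per class
```

With bounds such as `[1, 5]`, the training would then be free to settle on any number of centers between 1 and 5 for each class.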
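Each center j is described by the parameters center $\mu_j$, covariance matrix $\Sigma_j$, and mixing coefficient $P(j)$. These parameters are calculated from the training data by means of the Expectation Maximization (EM) algorithm. A GMM can approximate an arbitrary probability density, provided that enough centers are used. The covariance matrices have the dimensions NumDim x NumDim (NumComponents x NumComponents if preprocessing is used) and are symmetric. Further constraints can be given by CovarType. In the following, $d$ denotes the dimension of the feature vector $x$:
For CovarType = 'spherical', $\Sigma_j$ is a scalar multiple of the identity matrix, $\Sigma_j = \sigma_j^2 I$. The center density function p(x|j) is
$$p(x|j) = \frac{1}{(2\pi\sigma_j^2)^{d/2}} \exp\left(-\frac{\|x-\mu_j\|^2}{2\sigma_j^2}\right)$$
For CovarType = 'diag', $\Sigma_j$ is a diagonal matrix, $\Sigma_j = \mathrm{diag}(\sigma_{j,1}^2, \ldots, \sigma_{j,d}^2)$. The center density function p(x|j) is
$$p(x|j) = \frac{1}{(2\pi)^{d/2} \prod_{i=1}^{d} \sigma_{j,i}} \exp\left(-\sum_{i=1}^{d} \frac{(x_i-\mu_{j,i})^2}{2\sigma_{j,i}^2}\right)$$
For CovarType = 'full', $\Sigma_j$ is a positive definite matrix. The center density function p(x|j) is
$$p(x|j) = \frac{1}{(2\pi)^{d/2} |\Sigma_j|^{1/2}} \exp\left(-\frac{1}{2}(x-\mu_j)^T \Sigma_j^{-1} (x-\mu_j)\right)$$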
The complexity of the calculations increases from CovarType = 'spherical' via CovarType = 'diag' to CovarType = 'full'. At the same time, the flexibility of the centers increases. In general, 'spherical' therefore needs higher values for NumCenters than 'full'.
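To make the difference between the three covariance types concrete, the following Python sketch evaluates the standard Gaussian density for each of them. It uses only the standard library and is not HALCON's implementation. The function names are chosen for this example, and the Cholesky factorization in the 'full' case is just one possible way to obtain the determinant and the quadratic form.

```python
import math


def density_spherical(x, mu, sigma2):
    """Gaussian density with covariance sigma2 * I (one variance in total)."""
    d = len(x)
    sq_dist = sum((xi - mi) ** 2 for xi, mi in zip(x, mu))
    norm = (2.0 * math.pi * sigma2) ** (d / 2.0)
    return math.exp(-sq_dist / (2.0 * sigma2)) / norm


def density_diag(x, mu, sigma2):
    """Gaussian density with covariance diag(sigma2) (one variance per axis)."""
    d = len(x)
    exponent = sum((xi - mi) ** 2 / (2.0 * s) for xi, mi, s in zip(x, mu, sigma2))
    norm = (2.0 * math.pi) ** (d / 2.0) * math.sqrt(math.prod(sigma2))
    return math.exp(-exponent) / norm


def density_full(x, mu, cov):
    """Gaussian density with a symmetric positive definite covariance matrix."""
    d = len(x)
    # Cholesky factorization cov = L * L^T (L is lower triangular).
    L = [[0.0] * d for _ in range(d)]
    for i in range(d):
        for k in range(i + 1):
            s = cov[i][k] - sum(L[i][m] * L[k][m] for m in range(k))
            if i == k:
                if s <= 0.0:
                    raise ValueError("covariance matrix is not positive definite")
                L[i][k] = math.sqrt(s)
            else:
                L[i][k] = s / L[k][k]
    # Solve L * y = (x - mu); then |y|^2 equals (x-mu)^T cov^-1 (x-mu).
    diff = [xi - mi for xi, mi in zip(x, mu)]
    y = []
    for i in range(d):
        y.append((diff[i] - sum(L[i][m] * y[m] for m in range(i))) / L[i][i])
    mahalanobis = sum(v * v for v in y)
    sqrt_det = math.prod(L[i][i] for i in range(d))  # equals sqrt(det(cov))
    return math.exp(-0.5 * mahalanobis) / ((2.0 * math.pi) ** (d / 2.0) * sqrt_det)


if __name__ == "__main__":
    x, mu = [1.0, 2.0], [0.0, 0.0]
    # All three agree when the covariance is a multiple of the identity.
    print(density_spherical(x, mu, 2.0))
    print(density_diag(x, mu, [2.0, 2.0]))
    print(density_full(x, mu, [[2.0, 0.0], [0.0, 2.0]]))
```

The sketch also shows where the additional cost comes from: 'spherical' needs one variance per center, 'diag' needs d variances, and 'full' needs a matrix factorization per center.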
The procedure for using a GMM is as follows: First, a GMM is created with create_class_gmm. Then, training vectors are added with add_sample_class_gmm; afterwards, they can be written to disk with write_samples_class_gmm. With train_class_gmm, the center parameters of the classifier (defined above) are determined. Furthermore, the GMM can be saved with write_class_gmm for later classifications.
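What the training step estimates can be illustrated with a minimal Expectation Maximization loop for a one-dimensional mixture. The sketch below is a textbook EM iteration in plain Python, not the algorithm as implemented in train_class_gmm. The function names, the stopping rule, and the variance floor `min_var` are assumptions made for this example, and the code is not hardened against numerical underflow.

```python
import math


def normal_pdf(x, mu, var):
    """Density of a one-dimensional Gaussian with mean mu and variance var."""
    return math.exp(-(x - mu) ** 2 / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)


def em_1d(data, mus, variances, weights, max_iter=100, tol=1e-6, min_var=1e-6):
    """Fit a 1D Gaussian mixture with EM, starting from the given parameters."""
    mus, variances, weights = list(mus), list(variances), list(weights)
    n, k = len(data), len(mus)
    prev_ll = None
    iteration = 0
    for iteration in range(1, max_iter + 1):
        # E-step: responsibility of each center for each sample.
        resp = []
        log_likelihood = 0.0
        for x in data:
            joint = [weights[j] * normal_pdf(x, mus[j], variances[j]) for j in range(k)]
            total = sum(joint)
            log_likelihood += math.log(total)
            resp.append([v / total for v in joint])
        # M-step: re-estimate center, variance, and mixing coefficient.
        for j in range(k):
            nj = sum(r[j] for r in resp)
            mus[j] = sum(r[j] * x for r, x in zip(resp, data)) / nj
            var = sum(r[j] * (x - mus[j]) ** 2 for r, x in zip(resp, data)) / nj
            variances[j] = max(var, min_var)  # assumed floor against collapse
            weights[j] = nj / n
        # Stop when the log-likelihood no longer changes noticeably.
        if prev_ll is not None and abs(log_likelihood - prev_ll) < tol * abs(prev_ll):
            break
        prev_ll = log_likelihood
    return mus, variances, weights, iteration


if __name__ == "__main__":
    samples = [0.9, 1.0, 1.1, 4.9, 5.0, 5.1]
    mus, variances, weights, n_iter = em_1d(samples, [0.0, 6.0], [1.0, 1.0], [0.5, 0.5])
    print(mus, variances, weights, n_iter)
```

For the two well-separated groups in the usage example, the estimated centers move from the initial guesses 0 and 6 to the group means near 1 and 5, and each mixing coefficient approaches 0.5.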
From the mixing probabilities P(j) and the center density functions p(x|j), the probability density function p(x) is calculated as the weighted sum over all n centers:
$$p(x) = \sum_{j=1}^{n} P(j) \, p(x|j)$$
The probability density function p(x) can be evaluated with evaluate_class_gmm for a feature vector x. classify_class_gmm sorts the values of p(x) obtained for the individual classes and thereby determines the most probable class of the feature vector.
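The weighted sum and the subsequent sorting can be sketched as follows. This is a simplified illustration in plain Python, not the behavior of evaluate_class_gmm or classify_class_gmm in detail: it assumes spherical centers, treats all classes as equally likely a priori, and uses function names invented for this example.

```python
import math


def spherical_density(x, mu, sigma2):
    """Center density p(x|j) for a covariance of sigma2 times the identity."""
    d = len(x)
    sq_dist = sum((xi - mi) ** 2 for xi, mi in zip(x, mu))
    return math.exp(-sq_dist / (2.0 * sigma2)) / (2.0 * math.pi * sigma2) ** (d / 2.0)


def mixture_density(x, centers):
    """p(x) as the sum of P(j) * p(x|j); centers are (P_j, mu_j, sigma2_j) tuples."""
    return sum(w * spherical_density(x, mu, s2) for w, mu, s2 in centers)


def classify(x, class_models):
    """Return the class indices sorted by decreasing p(x), plus the densities.

    Assumption for this sketch: equal class priors, so sorting the densities
    is enough to rank the classes.
    """
    densities = [mixture_density(x, centers) for centers in class_models]
    order = sorted(range(len(densities)), key=lambda i: densities[i], reverse=True)
    return order, densities


if __name__ == "__main__":
    models = [
        [(1.0, [0.0, 0.0], 1.0)],                            # class 0: one center
        [(0.5, [5.0, 5.0], 1.0), (0.5, [8.0, 8.0], 1.0)],    # class 1: two centers
    ]
    order, densities = classify([7.5, 8.0], models)
    print(order[0], densities)
```

The first entry of `order` is the most probable class under these assumptions; the remaining entries give the ranking of the other classes.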
The parameters Preprocessing and NumComponents can be used to preprocess the training data and reduce its dimensions. These parameters are explained in the description of the operator create_class_mlp.
create_class_gmm initializes the coordinates of the centers with random numbers. To ensure that the results of training the classifier with train_class_gmm are reproducible, the seed value of the random number generator is passed in RandSeed.
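Why a fixed seed makes the training reproducible can be seen in a small Python sketch. It only mimics the idea of seeded random initialization; the function name `init_centers` and the value range [-1, 1] are assumptions for this example and say nothing about how HALCON draws its initial values.

```python
import random


def init_centers(num_centers, num_dim, seed):
    """Draw initial center coordinates from a generator with a fixed seed."""
    rng = random.Random(seed)  # private generator, independent of global state
    return [[rng.uniform(-1.0, 1.0) for _ in range(num_dim)]
            for _ in range(num_centers)]


if __name__ == "__main__":
    first = init_centers(3, 2, seed=42)
    second = init_centers(3, 2, seed=42)
    other = init_centers(3, 2, seed=43)
    print(first == second)  # same seed, same starting point
    print(first == other)   # different seed, different starting point
```

Because an iterative training procedure is deterministic once its starting point is fixed, the same seed and the same training data lead to the same trained classifier.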
This operator returns a handle. Note that the state of an instance of this handle type may be changed by specific operators even though the handle is used as an input parameter by those operators.
NumDim: Number of dimensions of the feature space.
Default value: 3
Suggested values: 1, 2, 3, 4, 5, 8, 10, 15, 20, 30, 40, 50, 60, 70, 80, 90, 100
Restriction: NumDim >= 1
NumClasses: Number of classes of the GMM.
Default value: 5
Suggested values: 1, 2, 3, 4, 5, 6, 7, 8, 9, 10
Restriction: NumClasses >= 1
NumCenters: Number of centers per class.
Default value: 1
Suggested values: 1, 2, 3, 4, 5, 8, 10, 15, 20, 30
Restriction: NumCenters >= 1
CovarType: Type of the covariance matrices.
Default value: 'spherical'
List of values: 'diag', 'full', 'spherical'
Preprocessing: Type of preprocessing used to transform the feature vectors.
Default value: 'normalization'
List of values: 'canonical_variates', 'none', 'normalization', 'principal_components'
NumComponents: Preprocessing parameter: Number of transformed features (ignored for Preprocessing = 'none' and Preprocessing = 'normalization').
Default value: 10
Suggested values: 1, 2, 3, 4, 5, 8, 10, 15, 20, 30, 40, 50, 60, 70, 80, 90, 100
Restriction: NumComponents >= 1
RandSeed: Seed value of the random number generator that is used to initialize the GMM with random values.
Default value: 42
GMMHandle: GMM handle.
* Classification with Gaussian Mixture Models
create_class_gmm (NumDim, NumClasses, [1,5], 'full', 'none',\
                  NumComponents, 42, GMMHandle)
* Add the training data
for J := 0 to NumData-1 by 1
    * Features := [...]
    * ClassID := [...]
    add_sample_class_gmm (GMMHandle, Features, ClassID, Randomize)
endfor
* Train the GMM
train_class_gmm (GMMHandle, 100, 0.001, 'training', 0.0001, Centers, Iter)
* Classify unknown data in 'Features'
classify_class_gmm (GMMHandle, Features, 1, ID, Prob, Density, KSigmaProb)
If the parameters are valid, the operator create_class_gmm returns the value 2 (H_MSG_TRUE). If necessary, an exception is raised.
Possible successors: add_sample_class_gmm, add_samples_image_class_gmm
Alternatives: create_class_mlp, create_class_svm
See also: clear_class_gmm, train_class_gmm, classify_class_gmm, evaluate_class_gmm, classify_image_class_gmm
Christopher M. Bishop: “Neural Networks for Pattern Recognition”; Oxford University Press, Oxford; 1995.
Mario A.T. Figueiredo: “Unsupervised Learning of Finite Mixture Models”; IEEE Transactions on Pattern Analysis and Machine Intelligence, Vol. 24, No. 3; March 2002.
Foundation