train_class_gmm — Train a Gaussian Mixture Model.
train_class_gmm( : : GMMHandle, MaxIter, Threshold, ClassPriors, Regularize : Centers, Iter)
train_class_gmm trains the Gaussian Mixture Model (GMM) referenced by GMMHandle. Before the GMM can be trained, all training samples to be used for the training must be stored in the GMM using add_sample_class_gmm, add_samples_image_class_gmm, or read_samples_class_gmm. After the training, new training samples can be added to the GMM and the GMM can be trained again.
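A minimal sketch of this workflow (the tuples FeatureData and ClassLabels are illustrative placeholders, assumed to hold two-dimensional feature vectors and the corresponding class IDs):

create_class_gmm (2, 2, [1,3], 'full', 'none', 0, 42, GMMHandle)
* Store the training samples in the GMM (two features per sample).
for I := 0 to |ClassLabels| - 1 by 1
    add_sample_class_gmm (GMMHandle, FeatureData[2*I:2*I+1], ClassLabels[I], 0.0)
endfor
* Train the GMM; further samples can be added and the GMM can be
* trained again afterwards.
train_class_gmm (GMMHandle, 100, 0.001, 'training', 0.0001, Centers, Iter)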
During training, the error that results from applying the GMM to the training vectors is minimized with the expectation maximization (EM) algorithm.
MaxIter specifies the maximum number of iterations per class for the EM algorithm. In practice, values between 20 and 200 are sufficient for most problems. Threshold specifies a threshold for the relative change of the error. If the relative change of the error still exceeds the threshold after MaxIter iterations, the algorithm is canceled for this class. Because the algorithm starts with the maximum number of centers specified in NumCenters (see create_class_gmm), the number of centers and the error for this class are not optimal in the case of such a premature termination. In this case, a new training with different parameters (e.g., another value for RandSeed in create_class_gmm) can be tried.
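A sketch of how such a premature termination can be detected directly after training; Iter is the second output of train_class_gmm (see below), and the parameter values are illustrative:

train_class_gmm (GMMHandle, 200, 0.001, 'training', 0.0001, Centers, Iter)
if (max(Iter) == 200)
    * At least one class hit the iteration limit and was therefore
    * terminated prematurely; a retraining with different parameters
    * (e.g. another RandSeed in create_class_gmm) could be tried.
    PrematurelyTerminated := true
endif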
ClassPriors specifies how the a-priori probabilities (priors) of the classes are determined. If 'training' is specified, the priors are set to the proportions of the corresponding classes among the training samples. If 'uniform' is specified, the priors of all classes are set equal to 1/NumClasses.
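For example, with 300 training samples of class 0 and 100 of class 1, 'training' yields priors of 0.75 and 0.25, whereas 'uniform' yields 0.5 for both classes. A short, purely illustrative snippet for the case where the class proportions in the training data are not representative of the application:

* Use uniform priors when the sample proportions do not reflect the
* real class frequencies (parameter values are illustrative).
train_class_gmm (GMMHandle, 100, 0.001, 'uniform', 0.0001, Centers, Iter)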
Regularize is used to regularize (nearly) singular covariance matrices during the training. A covariance matrix might collapse to singularity if it is trained with linearly dependent data. To avoid this, a small value specified by Regularize is added to each main diagonal element of the covariance matrix, which prevents this element from becoming smaller than Regularize. A recommended value for Regularize is 0.0001. If Regularize is set to 0.0, no regularization is performed.
The centers are initially distributed randomly. In individual cases, the algorithm may yield a relatively high error because the initial random values determined by RandSeed in create_class_gmm lead to a local minimum. In this case, a new GMM with a different value for RandSeed should be generated to test whether a significantly smaller error can be obtained.
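A sketch of such a comparison over several seeds; since the training error itself is not returned by the operator, the classification rate on the stored training samples is used here as a stand-in criterion. NumDim, NumClasses, and the file name are placeholders, and the samples are assumed to have been added to GMMHandle beforehand:

* Save the samples once so every candidate GMM sees the same data.
write_samples_class_gmm (GMMHandle, 'samples.gsf')
BestRate := -1.0
BestSeed := 0
for Seed := 0 to 4 by 1
    create_class_gmm (NumDim, NumClasses, [1,5], 'full', 'none', 0, Seed, TestHandle)
    read_samples_class_gmm (TestHandle, 'samples.gsf')
    train_class_gmm (TestHandle, 100, 0.001, 'training', 0.0001, Centers, Iter)
    * Rate the candidate by its classification rate on the stored samples.
    get_sample_num_class_gmm (TestHandle, NumSamples)
    Correct := 0
    for I := 0 to NumSamples - 1 by 1
        get_sample_class_gmm (TestHandle, I, Features, ClassID)
        classify_class_gmm (TestHandle, Features, 1, FoundID, ClassProb, Density, KSigmaProb)
        if (FoundID == ClassID)
            Correct := Correct + 1
        endif
    endfor
    if (real(Correct) / NumSamples > BestRate)
        BestRate := real(Correct) / NumSamples
        BestSeed := Seed
    endif
    clear_class_gmm (TestHandle)
endfor
* Recreate and retrain the GMM with BestSeed for further use.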
It should be noted that, depending on the number of centers, the type of covariance matrix, and the number of training samples, the training can take from a few seconds to several hours.
On output, train_class_gmm returns in Centers the number of centers per class that have been found to be optimal by the EM algorithm. These values can be used as a reference for NumCenters (in create_class_gmm) when creating future GMMs. If the number of centers found when training a GMM on integer training data is unexpectedly high, this can often be corrected by adding noise to the training data via the Randomize parameter of add_sample_class_gmm. Iter contains the number of iterations performed per class. If a value in Iter equals MaxIter, the training has been terminated prematurely for that class (see above).
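A sketch of both follow-ups; NumDim, NumClasses, Features, and ClassID are placeholders, and the Randomize value of 2.0 is only an example:

* Use the optimal center counts returned in Centers as guidance for
* NumCenters of a future GMM (here simply their maximum).
create_class_gmm (NumDim, NumClasses, max(Centers), 'full', 'none', 0, 42, NewGMMHandle)
* For integer feature data, add a small amount of noise when storing
* the samples (Randomize parameter of add_sample_class_gmm).
add_sample_class_gmm (NewGMMHandle, Features, ClassID, 2.0)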
This operator modifies the state of the following input parameter: GMMHandle. The value of this parameter may not be shared across multiple threads without external synchronization.

Parameters

GMMHandle (input_control)
    GMM handle.
MaxIter (input_control)
    Maximum number of iterations of the expectation maximization algorithm.
    Default value: 100
    Suggested values: 10, 20, 30, 50, 100, 200
Threshold (input_control)
    Threshold for relative change of the error for the expectation maximization algorithm to terminate.
    Default value: 0.001
    Suggested values: 0.001, 0.0001
    Restriction: Threshold >= 0.0 && Threshold <= 1.0
ClassPriors (input_control)
    Mode to determine the a-priori probabilities of the classes.
    Default value: 'training'
    List of values: 'training', 'uniform'
Regularize (input_control)
    Regularization value for preventing covariance matrix singularity.
    Default value: 0.0001
    Restriction: Regularize >= 0.0 && Regularize < 1.0
Centers (output_control)
    Number of found centers per class.
Iter (output_control)
    Number of executed iterations per class.
Example (HDevelop)

create_class_gmm (NumDim, NumClasses, [1,5], 'full', 'none', 0, 42, GMMHandle)
* Add the training data
read_samples_class_gmm (GMMHandle, 'samples.gsf')
* Train the GMM
train_class_gmm (GMMHandle, 100, 1e-4, 'training', 1e-4, Centers, Iter)
* Write the Gaussian Mixture Model to file
write_class_gmm (GMMHandle, 'gmmclassifier.gmm')
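A possible continuation that reads the stored classifier back and classifies a new feature vector (Features is a placeholder for application data):

read_class_gmm ('gmmclassifier.gmm', GMMHandle)
classify_class_gmm (GMMHandle, Features, 1, ClassID, ClassProb, Density, KSigmaProb)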
Result

If the parameters are valid, the operator train_class_gmm returns the value 2 (H_MSG_TRUE). If necessary, an exception is raised.
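In HDevelop, such an exception can be handled with a try/catch block if the program should not abort; a minimal sketch with illustrative parameter values:

try
    train_class_gmm (GMMHandle, 100, 0.001, 'training', 0.0001, Centers, Iter)
catch (Exception)
    * Inspect the exception data, e.g. the error code.
    dev_get_exception_data (Exception, 'error_code', ErrorCode)
endtry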
Possible Predecessors

add_sample_class_gmm, read_samples_class_gmm

Possible Successors

evaluate_class_gmm, classify_class_gmm, write_class_gmm, create_class_lut_gmm
References

Christopher M. Bishop: “Neural Networks for Pattern Recognition”; Oxford University Press, Oxford; 1995.
Mario A.T. Figueiredo: “Unsupervised Learning of Finite Mixture Models”; IEEE Transactions on Pattern Analysis and Machine Intelligence, Vol. 24, No. 3; March 2002.
Module

Foundation