set_feature_lengths_class_train_data — Define subfeatures in training data.
set_feature_lengths_class_train_data( : : ClassTrainDataHandle, SubFeatureLength, Names : )
set_feature_lengths_class_train_data defines subfeatures in the training data in ClassTrainDataHandle. SubFeatureLength holds a tuple of lengths that partitions the previously added columns, in order, into consecutive groups: the first entry covers the first columns, the second entry covers the columns that follow, and so on. Columns that are not adjacent cannot be grouped into one subfeature. The sum of all entries in SubFeatureLength must equal the number of dimensions set with the parameter NumDim in create_class_train_data. Optionally, a name for each subfeature can be passed in Names.
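The grouping rule can be illustrated with a minimal Python sketch. It does not call HALCON; the helper subfeature_ranges is a hypothetical name introduced only for illustration, and it merely models the stated constraint that the lengths must sum to NumDim and that each subfeature covers consecutive columns.

```python
from itertools import accumulate


def subfeature_ranges(sub_feature_length, num_dim):
    """Map a list of subfeature lengths to half-open column ranges.

    Hypothetical helper: it mirrors the documented rule that the
    lengths must add up to NumDim, nothing more.
    """
    if sum(sub_feature_length) != num_dim:
        raise ValueError(
            "sum of subfeature lengths must equal the number of dimensions"
        )
    # Running totals give the end column of each group.
    stops = list(accumulate(sub_feature_length))
    # Each group starts where the previous one ended.
    starts = [0] + stops[:-1]
    return list(zip(starts, stops))


# Lengths [3, 2] over 5 columns: columns 0..2 and columns 3..4.
print(subfeature_ranges([3, 2], 5))  # [(0, 3), (3, 5)]
```

Because each range starts where the previous one ends, a group of non-adjacent columns cannot be expressed this way, which matches the restriction described above.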
This operator is helpful, for example, in the following situation: Two different data sources are available, and each provides a vector of a fixed length. The first data source provides data of length n and the second of length m. To decide automatically which data source is more valuable for a given classification problem, training data can be created that contains both. For example, if create_class_train_data was called with NumDim = n+m = w, then set_feature_lengths_class_train_data can be called with [n,m] in SubFeatureLength and [Name1, Name2] in Names to describe this layout for later use by operators such as select_feature_set_knn or select_feature_set_svm. The classification problem is then specified by calls of add_sample_class_train_data, each passing a vector from the first data source followed by a vector from the second data source as one combined feature vector of length w. The result of select_feature_set_knn would then be [Name1] if the first data source is more relevant, [Name2] if the second is more relevant, or [Name1, Name2] if both are necessary.
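As a rough illustration of the two-source layout, the following Python sketch builds a combined vector of length w = n+m and splits it back by subfeature name. It is not HALCON code; combine_sources and split_by_names are hypothetical helpers, and the sample values are made up for the illustration.

```python
def combine_sources(first, second):
    """Concatenate the two source vectors into one feature vector."""
    return list(first) + list(second)


def split_by_names(sample, lengths, names):
    """Return a dict mapping each subfeature name to its columns."""
    if len(lengths) != len(names):
        raise ValueError("one name is needed per subfeature length")
    if sum(lengths) != len(sample):
        raise ValueError("lengths must sum to the sample dimension")
    parts = {}
    start = 0
    for name, length in zip(names, lengths):
        # Each subfeature occupies the next `length` consecutive columns.
        parts[name] = sample[start:start + length]
        start += length
    return parts


first = [1, 1, 1]   # data source 1, n = 3
second = [2, 1]     # data source 2, m = 2
sample = combine_sources(first, second)
print(sample)       # [1, 1, 1, 2, 1]

parts = split_by_names(sample, [3, 2], ["Name1", "Name2"])
print(parts["Name1"])  # [1, 1, 1]
print(parts["Name2"])  # [2, 1]
```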
This operator modifies the state of the following input parameter: ClassTrainDataHandle. The value of this parameter may not be shared across multiple threads without external synchronization.
ClassTrainDataHandle: Handle of the training data that should be partitioned into subfeatures.
SubFeatureLength: Length of the subfeatures.
Names: Names of the subfeatures.
* Find out which of the two features distinguishes two Classes
NameFeature1 := 'Good Feature'
NameFeature2 := 'Bad Feature'
LengthFeature1 := 3
LengthFeature2 := 2
* Create training data
create_class_train_data (LengthFeature1+LengthFeature2,\
  ClassTrainDataHandle)
* Define the features which are in the training data
set_feature_lengths_class_train_data (ClassTrainDataHandle, [LengthFeature1,\
  LengthFeature2], [NameFeature1, NameFeature2])
* Add training data
*                                                         |Feat1| |Feat2|
add_sample_class_train_data (ClassTrainDataHandle, 'row', [1,1,1,  2,1  ], 0)
add_sample_class_train_data (ClassTrainDataHandle, 'row', [2,2,2,  2,1  ], 1)
add_sample_class_train_data (ClassTrainDataHandle, 'row', [1,1,1,  3,4  ], 0)
add_sample_class_train_data (ClassTrainDataHandle, 'row', [2,2,2,  3,4  ], 1)
* Add more data
* ...
* Select the better feature
select_feature_set_knn (ClassTrainDataHandle, 'greedy', [], [], KNNHandle,\
  SelectedFeature, Score)
classify_class_knn (KNNHandle, [1,1,1], Result, Rating)
classify_class_knn (KNNHandle, [2,2,2], Result, Rating)
* Use the classifier
* ...
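In the example above, the classifier is called with 3-element vectors although the training samples have 5 columns. This suggests that, after selection, only the columns of the selected subfeature are passed for classification. The Python sketch below illustrates that reduction under this assumption. It does not call HALCON, and reduce_to_selected is a hypothetical helper, not part of the operator set.

```python
def reduce_to_selected(sample, lengths, names, selected):
    """Keep only the columns of the subfeatures listed in `selected`.

    Assumption: the reduced vector is built from the selected
    subfeatures in the order in which they are listed.
    """
    offsets = {}
    start = 0
    for name, length in zip(names, lengths):
        offsets[name] = (start, start + length)
        start += length
    if start != len(sample):
        raise ValueError("lengths must sum to the sample dimension")
    reduced = []
    for name in selected:
        begin, end = offsets[name]
        reduced.extend(sample[begin:end])
    return reduced


full_sample = [1, 1, 1, 2, 1]
reduced = reduce_to_selected(
    full_sample,
    [3, 2],
    ["Good Feature", "Bad Feature"],
    ["Good Feature"],
)
print(reduced)  # [1, 1, 1]
```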
If the parameters are valid, the operator set_feature_lengths_class_train_data returns the value 2 (H_MSG_TRUE). If necessary, an exception is raised.
create_class_train_data, add_sample_class_train_data
select_feature_set_knn, select_feature_set_svm, select_feature_set_mlp, select_feature_set_gmm
Foundation