This chapter explains how to use Deep 3D Matching.
Deep 3D Matching is used to accurately detect objects in a scene and compute their 3D pose. This approach is particularly effective for complex scenarios where traditional 3D matching techniques (like shape-based 3D matching) may struggle due to variations in object appearance, occlusions, or noisy data. Compared to surface-based matching, Deep 3D Matching works with a calibrated multi-view setup and does not require data from a 3D sensor.
The Deep 3D Matching model consists of two components, each dedicated to a distinct task: the detection, which localizes objects in the images, and the estimation of their 3D poses. For a Deep 3D Matching application, both components need to be trained on the 3D CAD model of the object to be found in the application scenes.
Note: For now, only inference is possible in HALCON; custom training of a model will be available in a future version of HALCON. If you want to use the feature for your applications, please contact your HALCON sales partner for further information.
Once trained, the deep learning model can be used to infer the pose of the object in new application scenes. During the inference process, images from different angles are used as input.
This section describes how to determine a 3D pose using the Deep 3D Matching method. An application scenario can be seen in the HDevelop example deep_3d_matching_workflow.hdev. The workflow consists of the following steps; a condensed code sketch is shown after the list.
1. Read the trained Deep 3D Matching model using read_deep_matching_3d.
2. Optimize the deep learning networks for the use with AI²-interfaces:
   - Extract the detection network from the Deep 3D Matching model using get_deep_matching_3d_param.
   - Optimize the extracted network for inference with optimize_dl_model_for_inference.
   - Set the optimized detection network using set_deep_matching_3d_param.
   - Repeat these steps for the 3D pose estimation network.
   - Save the optimized model using write_deep_matching_3d.
   Note that the optimization has a significant impact on the runtime if it is performed with every inference run. Writing the optimized model to file therefore saves time during inference.
3. Set the camera parameters using set_deep_matching_3d_param.
4. Apply the model using the operator apply_deep_matching_3d.
5. Visualize the resulting 3D poses.
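The following minimal HDevelop sketch condenses these steps. The Deep 3D Matching operator names are those listed in this chapter; however, the generic parameter names used here ('detection_network', 'camera_parameter', 'camera_pose'), the file names, and the device selection are assumptions for illustration only. Please check the operator reference for the actual parameter names.

* Read the trained Deep 3D Matching model (file name is a placeholder).
read_deep_matching_3d ('my_object.hdm', DeepMatchingHandle)
* Select a device for the optimization with the AI²-interfaces.
query_available_dl_devices ('runtime', 'gpu', DLDeviceHandles)
DLDevice := DLDeviceHandles[0]
* Extract the detection network ('detection_network' is an assumed name),
* optimize it for inference, and write it back into the model.
get_deep_matching_3d_param (DeepMatchingHandle, 'detection_network', DLDetectionNetwork)
create_dict (OptimizeParams)
optimize_dl_model_for_inference (DLDetectionNetwork, DLDevice, 'float32', [], OptimizeParams, DLDetectionOptimized)
set_deep_matching_3d_param (DeepMatchingHandle, 'detection_network', DLDetectionOptimized)
* Repeat the three calls above for the 3D pose estimation network.
* Save the optimized model so the optimization is not redone every run.
write_deep_matching_3d (DeepMatchingHandle, 'my_object_optimized.hdm')
* Set the calibrated camera parameters and poses (assumed parameter names).
set_deep_matching_3d_param (DeepMatchingHandle, 'camera_parameter', CameraParams)
set_deep_matching_3d_param (DeepMatchingHandle, 'camera_pose', CameraPoses)
* Apply the model to the multi-view images of one acquisition.
apply_deep_matching_3d (Images, DeepMatchingHandle, DeepMatchingResults)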
This section gives information on the camera setup and the data that need to be provided for the inference as well as the training and evaluation of a Deep 3D Matching model.
As a basic concept, the model handles data using dictionaries, meaning it receives the input data from a dictionary DLSample and returns a dictionary DeepMatchingResults. More information on the data handling can be found in the chapter Deep Learning / Model.
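As a purely hypothetical illustration of this dictionary-based data handling, the following lines access the result dictionary with the generic dict operators. The key names 'detections', 'pose', and 'score' are invented for this sketch; the actual keys are documented with apply_deep_matching_3d.

* Apply the model; the result is returned as a dictionary.
apply_deep_matching_3d (Images, DeepMatchingHandle, DeepMatchingResults)
* 'detections' is an assumed key holding one dictionary per found object.
get_dict_tuple (DeepMatchingResults, 'detections', Detections)
for Index := 0 to |Detections| - 1 by 1
    * 'pose' and 'score' are assumed keys for the 3D pose and confidence.
    get_dict_tuple (Detections[Index], 'pose', ObjectPose)
    get_dict_tuple (Detections[Index], 'score', Score)
endfor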
In order to use Deep 3D Matching with high accuracy, you need a calibrated stereo or multi-view camera setup. In comparison to stereo reconstruction, Deep 3D Matching can deal with more strongly varying camera constellations and distances. Also, there is no need for 3D sensors in the setup. For information on how to calibrate the setup, please refer to the chapter Calibration / Multi-View.
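How to obtain such a calibrated setup is described in the referenced chapter; as a condensed sketch only, the following lines calibrate a two-camera setup with the standard multi-view calibration operators. The start camera parameters, the calibration plate description file, and the number of plate poses are placeholders.

* Create a calibration data model for 2 cameras and 1 calibration object.
create_calib_data ('calibration_object', 2, 1, CalibDataID)
* StartCamParam: placeholder start values for the camera parameters.
set_calib_data_cam_param (CalibDataID, 'all', [], StartCamParam)
* Placeholder description file of the used calibration plate.
set_calib_data_calib_object (CalibDataID, 0, 'calplate_320mm.cpd')
* Observe the calibration plate with both cameras in several poses.
for PoseIndex := 0 to NumPoses - 1 by 1
    for CameraIndex := 0 to 1 by 1
        grab_image (Image, AcqHandles[CameraIndex])
        find_calib_object (Image, CalibDataID, CameraIndex, 0, PoseIndex, [], [])
    endfor
endfor
* Calibrate all cameras simultaneously.
calibrate_cameras (CalibDataID, Error)
* Query the calibrated parameters and the pose relative to the reference camera.
get_calib_data (CalibDataID, 'camera', 1, 'params', CamParam1)
get_calib_data (CalibDataID, 'camera', 1, 'pose', CamPose1)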
The objects to be detected must be captured from two or more different perspectives in order to calculate the 3D poses.
The training data is used to train and evaluate a Deep 3D Matching model specifically for your application.
The required training data is generated using CAD models: synthetic images of the object are created from various angles, under different lighting conditions, and with varying backgrounds. Note that no real images are required; the data is generated based on the CAD model.
The data needed for this is a CAD model and corresponding information on material, surface finish and color. Information about possible axial and radial symmetries can significantly improve the generated training data.
apply_deep_matching_3d
get_deep_matching_3d_param
read_deep_matching_3d
set_deep_matching_3d_param
write_deep_matching_3d