This chapter explains how to use 3D Gripping Point Detection.
3D Gripping Point Detection is used to find suitable gripping points on the surface of arbitrary objects in a 3D scene. The results can be used to target the gripping points with a robot arm and pick up the objects using vacuum grippers with suction cups.
HALCON provides a pretrained model that is ready for inference without an additional training step. 3D Gripping Point Detection also works on objects that were not seen during training, so there is no need to provide a 3D model of the objects to be targeted. Furthermore, it can cope with scenes containing several different objects at once, with partly occluded objects, and with cluttered 3D data.
The inference workflow is described in the following section.
This section describes how to determine suitable gripping points on arbitrary object surfaces using a 3D Gripping Point Detection model. An application scenario is shown in the HDevelop example 3d_gripping_point_detection_workflow.hdev.
1. Read the pretrained 3D Gripping Point Detection model using the operator read_dl_model.
2. Set model parameters, e.g., regarding the used device or the image dimensions, using set_dl_model_param.
3. Generate a data dictionary DLSample for each 3D scene. This can be done using the procedure gen_dl_samples_3d_gripping_point_detection, which can cope with different kinds of 3D data. For further information on the data requirements, see the section “Data” below.
4. Preprocess the data before the inference. For this, you can use the procedure preprocess_dl_samples. The required preprocessing parameters can be generated from the model with create_dl_preprocess_param_from_model or set manually using create_dl_preprocess_param. Note that the preprocessing of the data has a significant impact on the inference. See the section “3D scenes” below for further details.
5. Apply the model using the operator apply_dl_model.
6. Perform a post-processing step on the resulting DLResult to retrieve gripping points for your scene using the procedure gen_dl_3d_gripping_points_and_poses.
7. Visualize the 2D and 3D results using the procedure dev_display_dl_data.
These steps are sketched in the code example below.
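The following HDevelop sketch outlines these steps for a single scene. It is a minimal sketch, not the example program: the model file name, the variable names, and the parameter lists of the procedures (which may differ between HALCON versions) are assumptions; see 3d_gripping_point_detection_workflow.hdev for the exact calls.

  * Read the pretrained model (the file name is an assumption).
  read_dl_model ('pretrained_3d_gripping_point_detection.hdl', DLModelHandle)
  * Prefer a GPU device, fall back to the CPU.
  query_available_dl_devices (['runtime','runtime'], ['gpu','cpu'], DLDeviceHandles)
  set_dl_model_param (DLModelHandle, 'device', DLDeviceHandles[0])
  * Generate a DLSample from the 3D scene (sketched parameter order).
  gen_dl_samples_3d_gripping_point_detection (ImageRGB, ImageX, ImageY, ImageZ, DLSample)
  * Derive the preprocessing parameters from the model and preprocess.
  create_dl_preprocess_param_from_model (DLModelHandle, 'none', 'full_domain', [], [], [], DLPreprocessParam)
  preprocess_dl_samples (DLSample, DLModelHandle, DLPreprocessParam, DLPreprocessInfo)
  * Inference.
  apply_dl_model (DLModelHandle, DLSample, [], DLResult)
  * Postprocessing: derive gripping points and poses (sketched signature).
  gen_dl_3d_gripping_points_and_poses (DLSample, DLResult, [])
  * Visualize the 2D and 3D results with dev_display_dl_data.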
This section gives information on the data that needs to be provided for the inference with a 3D Gripping Point Detection model.
As a basic concept, the model handles data by dictionaries, meaning it receives the input data from a dictionary DLSample and returns a dictionary DLResult. More information on the data handling can be found in the chapter Deep Learning / Model.
3D Gripping Point Detection processes 3D scenes, which consist of regular 2D images and depth information.
To adapt this 3D data to the input requirements of the network, a preprocessing step is necessary for the inference. See the section “Specific Preprocessing Parameters” below for information on certain preprocessing parameters. To ensure the necessary data quality, it is recommended to use a high-resolution 3D sensor. The following data are needed:
- RGB image or intensity (gray value) image
- X-image (values need to increase from left to right)
- Y-image (values need to increase from top to bottom)
- Z-image (values need to increase from points close to the sensor to far points; this is, for example, the case if the data is given in the camera coordinate system)
[Figure: example input data; 2D mappings (3-channel image).]
In order to restrict the search area, the domain of the RGB/intensity image can be reduced. For details, see the section “Specific Preprocessing Parameters” below. Note that the domains of the XYZ-images and of the (optional) normals images need to be identical. Furthermore, for all input data, only valid pixels may be part of the used domain.
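As an illustration, the following sketch excludes pixels without valid depth data from the domain before the sample is generated. It assumes that invalid pixels are marked with Z = 0, which depends on the sensor; the variable names are placeholders.

  * Keep only pixels with a valid depth value in the domain.
  threshold (ImageZ, ValidRegion, 0.001, 100000.0)
  * Apply the identical domain to all coordinate images.
  reduce_domain (ImageX, ValidRegion, ImageXValid)
  reduce_domain (ImageY, ValidRegion, ImageYValid)
  reduce_domain (ImageZ, ValidRegion, ImageZValid)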
As inference output, the model will return a dictionary DLResult for every sample. This dictionary includes the following entries:
- 'gripping_map': Binary image, indicating for each pixel of the scene whether the model predicted a gripping point (pixel value = 1.0) or not (0.0).
- 'gripping_confidence': Image containing raw, uncalibrated confidence values for every point in the scene.
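A short sketch of how these entries can be accessed in HDevelop (variable names are placeholders):

  * Extract the gripping map and segment the predicted gripping regions.
  get_dict_object (GrippingMap, DLResult, 'gripping_map')
  threshold (GrippingMap, GrippingPixels, 1.0, 1.0)
  connection (GrippingPixels, GrippingRegions)
  * The raw confidence image can be read out the same way.
  get_dict_object (GrippingConfidence, DLResult, 'gripping_confidence')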
The model results DLResult can be postprocessed with gen_dl_3d_gripping_points_and_poses in order to generate gripping points. Furthermore, this procedure can be parameterized in order to reject small gripping regions using min_area_size, or it can serve as a template to define custom selection criteria. The procedure adds the following entry to the dictionary DLResult:
- 'gripping_points': Tuple of dictionaries containing information on suitable gripping points in a scene:
  - 'region': Connected region of potential gripping points. The determined gripping point lies inside this region.
  - 'row': Row coordinate of the gripping point in the preprocessed RGB/intensity image.
  - 'column': Column coordinate of the gripping point in the preprocessed RGB/intensity image.
  - 'pose': 3D pose of the gripping point (relative to the coordinate system of the XYZ-images, i.e., of the camera), which can be used by the robot.
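For illustration, the gripping points could be read out as follows in HDevelop; the variable names are placeholders, and the transformation into the robot coordinate system is application-specific:

  * Iterate over all gripping point candidates of the scene.
  get_dict_tuple (DLResult, 'gripping_points', GrippingPoints)
  for Index := 0 to |GrippingPoints| - 1 by 1
      * Each entry is a dictionary describing one gripping point.
      get_dict_tuple (GrippingPoints[Index], 'pose', GrippingPose)
      get_dict_tuple (GrippingPoints[Index], 'row', Row)
      get_dict_tuple (GrippingPoints[Index], 'column', Column)
      * GrippingPose is given in the camera coordinate system; compose
      * it, e.g., with a hand-eye calibration pose (pose_compose) to
      * obtain the pose in robot base coordinates.
  endfor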
In the preprocessing step, preprocessing parameters need to be passed to preprocess_dl_samples along with the data. Two pairs of these preprocessing parameters have a particularly significant impact:
- 'image_width', 'image_height': Determine the dimensions of the images to be inferred. With larger image dimensions, and thus a better resolution, smaller gripping surfaces can be detected. However, the runtime and memory consumption of the application increase.
- 'min_z', 'max_z': Determine the allowed distance of 3D points from the camera, based on the Z-image. These parameters can therefore help to reduce erroneous outliers and thus increase the robustness of the application.
An example of setting these parameters is sketched below.
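A minimal sketch, assuming the parameters are set directly on the preprocessing parameter dictionary; the values are placeholders that depend on the sensor setup and the unit of the depth data:

  * Accept only 3D points between 0.3 and 1.5 (in the unit of the Z-image).
  set_dict_tuple (DLPreprocessParam, 'min_z', 0.3)
  set_dict_tuple (DLPreprocessParam, 'max_z', 1.5)
  * Larger dimensions resolve smaller gripping surfaces but increase
  * runtime and memory consumption.
  set_dict_tuple (DLPreprocessParam, 'image_width', 1024)
  set_dict_tuple (DLPreprocessParam, 'image_height', 768)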
A restriction of the search area can be done by reducing the domain of the input images (using reduce_domain). The way preprocess_dl_samples handles the domain is set using the preprocessing parameter 'domain_handling'. The parameter 'domain_handling' should be used in a way that only essential information is passed on to the network for inference.
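A sketch of restricting the search area, assuming a rectangular region of interest; the region coordinates and the chosen 'domain_handling' value are placeholders (see create_dl_preprocess_param for the valid values):

  * Restrict the search area, e.g., to the area of a bin.
  gen_rectangle1 (BinRegion, 100, 100, 700, 900)
  reduce_domain (ImageRGB, BinRegion, ImageReduced)
  * Control how the reduced domain is handled during preprocessing.
  set_dict_tuple (DLPreprocessParam, 'domain_handling', 'crop_domain')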
The following figure shows how an input image with reduced domain is passed on after the preprocessing step, depending on the set 'domain_handling'.
[Figure: an input image with reduced domain and the resulting preprocessed images for different 'domain_handling' settings (four panels).]