1. New Pattern Matching Algorithm
Pattern matching is a commonly used technique to locate regions of an image that match a known reference pattern, referred to as a template. Pattern matching algorithms are among the most important functions in machine vision because they are used in a wide range of applications, including alignment, gauging, and inspection. The NI Vision Development Module 2013 adds a new pattern matching algorithm called pyramidal matching, which improves performance in images with blur or low contrast.
Figure 1: Example of pattern matching with blur and low contrast
Pyramidal matching reduces the computation time of pattern matching by shrinking both the image and the template. In pyramidal matching, the image and the template are resampled to lower spatial resolutions using Gaussian pyramids. Each successive pyramid 'level' keeps every other pixel in each dimension, so the image and the template are both reduced to one-fourth the size of the preceding level.
Figure 2: Pyramidal matching uses multiple levels to quickly refine searches.
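The downsampling step can be illustrated with a minimal sketch in plain Python. It is not the NI Vision implementation: the function names `blur_and_halve` and `build_pyramid` are hypothetical, and the separable [1, 2, 1]/4 kernel is an assumed stand-in for whatever Gaussian filter the library actually applies. Images are represented as lists of rows of pixel values.

```python
def blur_and_halve(image):
    """Smooth with a separable [1, 2, 1]/4 kernel, then keep every other pixel.

    The kernel is an assumption (a common binomial approximation of a Gaussian).
    """
    h, w = len(image), len(image[0])

    def smooth(values):
        n = len(values)
        # Edge pixels are replicated so the output keeps the same length.
        return [(values[max(i - 1, 0)] + 2 * values[i] + values[min(i + 1, n - 1)]) / 4.0
                for i in range(n)]

    # Smooth along rows, then along columns.
    rows = [smooth(row) for row in image]
    cols = [smooth([rows[y][x] for y in range(h)]) for x in range(w)]
    smoothed = [[cols[x][y] for x in range(w)] for y in range(h)]

    # Keeping every other row and column leaves one-fourth of the pixels.
    return [row[::2] for row in smoothed[::2]]


def build_pyramid(image, levels):
    """Return [full resolution, level 1, ..., level `levels`]."""
    pyramid = [image]
    for _ in range(levels):
        pyramid.append(blur_and_halve(pyramid[-1]))
    return pyramid
```

For an 8 x 8 image, `build_pyramid(image, 2)` yields images of 8 x 8, 4 x 4, and 2 x 2 pixels, so each level holds one-fourth as many pixels as the level before it.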
In the learning phase, the algorithm automatically computes the maximum pyramid level that can be used for the given template, and learns the data needed to represent the template and its rotated versions across all pyramid levels. The algorithm attempts to find an 'optimal' pyramid level (based on an analysis of template data) that gives the fastest and most accurate match. During matching, the algorithm iterates through each level of the pyramid, refining the match at each stage until the full resolution is used to give the best match while still achieving a speed boost. You can also enable one final refinement stage that locates the match candidates with sub-pixel position accuracy and sub-degree angle accuracy. This stage relies on specially extracted edge and pixel information from the template and employs interpolation techniques to get a highly accurate match location and angle.
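The coarse-to-fine idea can be sketched as follows. This is an illustration under stated assumptions, not the NI Vision algorithm: it ignores rotation, the learning phase, and sub-pixel refinement, it scores candidates with a plain sum of absolute differences, and it uses 2 x 2 averaging in place of a Gaussian pyramid step. All function names are hypothetical.

```python
def halve(img):
    """Average each 2x2 block (a simple stand-in for a Gaussian pyramid step)."""
    return [[(img[y][x] + img[y][x + 1] + img[y + 1][x] + img[y + 1][x + 1]) / 4.0
             for x in range(0, len(img[0]) - 1, 2)]
            for y in range(0, len(img) - 1, 2)]


def sad(image, template, top, left):
    """Sum of absolute differences between the template and one image window."""
    return sum(abs(image[top + y][left + x] - template[y][x])
               for y in range(len(template))
               for x in range(len(template[0])))


def best_match(image, template, candidates):
    """Return the candidate (top, left) with the lowest score, skipping
    candidates whose window would fall outside the image."""
    max_top = len(image) - len(template)
    max_left = len(image[0]) - len(template[0])
    valid = [(t, l) for t, l in candidates
             if 0 <= t <= max_top and 0 <= l <= max_left]
    return min(valid, key=lambda p: sad(image, template, p[0], p[1]))


def coarse_to_fine_match(image, template, levels):
    images, templates = [image], [template]
    for _ in range(levels):
        images.append(halve(images[-1]))
        templates.append(halve(templates[-1]))

    # Exhaustive search only at the coarsest (smallest) level.
    coarse_img, coarse_tpl = images[-1], templates[-1]
    candidates = [(t, l)
                  for t in range(len(coarse_img) - len(coarse_tpl) + 1)
                  for l in range(len(coarse_img[0]) - len(coarse_tpl[0]) + 1)]
    top, left = best_match(coarse_img, coarse_tpl, candidates)

    # At each finer level, search only a 3x3 neighborhood around the
    # previous result projected to the new resolution.
    for level in range(levels - 1, -1, -1):
        cy, cx = 2 * top, 2 * left
        candidates = [(cy + dy, cx + dx) for dy in (-1, 0, 1) for dx in (-1, 0, 1)]
        top, left = best_match(images[level], templates[level], candidates)
    return top, left
```

The speed-up comes from the search pattern: the full set of positions is evaluated only on the smallest image, and every finer level evaluates at most nine positions. The sketch assumes image and template dimensions that divide evenly by two at every level.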
2. Object Tracking
The NI Vision Development Module 2013 introduces a new algorithm for object tracking, which tracks the location of an object over a sequence of images to determine how it is moving relative to other objects in the image. Object tracking has many uses in application areas such as:
- Security and surveillance - In the surveillance industry, objects of interest such as people and vehicles can be tracked. Object tracking can be used for detecting trespassing or observing anomalies like unattended baggage.
- Traffic management - The flow of traffic can be analyzed, and collisions detected.
- Medicine - Cells can be tracked in medical images.
- Industry - Defective items can be detected and tracked.
- Robotics and navigation - Robots can follow the trajectory of an object. Robotic assistants can maneuver in a factory (for example, when de-palletizing objects).
- Human-computer interaction (HCI) - Users can be tracked in a gaming environment.
- Object modeling - An object tracked from multiple perspectives can be used to create a partial 3D model of the object.
- Bio-mechanics - Body parts can be tracked to interpret gestures or movements.
Figure 3: Example of object tracking for a traffic monitoring application
NI Vision implements two object tracking algorithms: mean shift and EM-based mean shift. Mean shift tracks user-defined objects by iteratively updating the location of the object. EM-based mean shift tracks the location and also adapts the shape and scale of the object in each frame. Both algorithms tolerate gradual changes in the tracked object, including geometric transformations such as shifting, rotation, and scaling, as well as partial occlusion of the object.
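The iterative location update at the heart of mean shift can be sketched in a few lines. This is a simplified illustration, not the NI Vision API: the function name `mean_shift` and its parameters are hypothetical, and the sketch assumes a per-pixel weight map is already available (in a real tracker these weights would typically come from comparing image colors with a model of the object). It covers plain mean shift only, not the EM-based variant that also adapts shape and scale.

```python
def mean_shift(weights, center, half_size, max_iter=20):
    """Move a square window toward the weighted centroid of the pixels inside it.

    weights[y][x] is an assumed likelihood that pixel (y, x) belongs to the object.
    center is the (row, column) starting position; half_size is the window radius.
    """
    h, w = len(weights), len(weights[0])
    cy, cx = center
    for _ in range(max_iter):
        total = sum_y = sum_x = 0.0
        for y in range(max(cy - half_size, 0), min(cy + half_size + 1, h)):
            for x in range(max(cx - half_size, 0), min(cx + half_size + 1, w)):
                wt = weights[y][x]
                total += wt
                sum_y += wt * y
                sum_x += wt * x
        if total == 0:
            break  # nothing object-like inside the window; keep the last position
        new_cy, new_cx = round(sum_y / total), round(sum_x / total)
        if (new_cy, new_cx) == (cy, cx):
            break  # converged
        cy, cx = new_cy, new_cx
    return cy, cx
```

To track across a sequence, the position returned for one frame would be passed as `center` for the next frame, which is why the approach handles gradual motion but can lose an object that jumps outside the window between frames.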
3. OCR Improvements
Optical Character Recognition (OCR) provides machine vision functions you can use in an application to read text or characters in an image. The NI Vision Development Module 2013 brings improvements to OCR functionality, including multiline detection, tolerance of slight rotation, and better segmentation.
Multiline detection allows a user to set a region of interest (ROI) enclosing multiple lines of text rather than needing to specify an ROI for each expected line. Multiline detection uses particle analysis and clustering based on vertical overlap to detect the lines in a specified ROI. Users can explicitly set the number of lines expected, or the algorithm can automatically detect the number of lines and apply character segmentation to all of them. If the expected number of lines is specified and more lines than that are detected, the lines with the highest classification scores are returned.
Figure 4: Multiline support reduces the need for a separate ROI for each line of text and detects the highest scoring lines.
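Clustering by vertical overlap can be sketched as follows. This is an assumed, simplified reading of the description above rather than the NI Vision implementation: particles are represented by hypothetical bounding boxes of the form (left, top, right, bottom), and the per-line scores passed to `keep_best_lines` are placeholder values supplied by the caller.

```python
def group_into_lines(boxes):
    """Cluster particle bounding boxes into text lines by vertical overlap.

    boxes: list of (left, top, right, bottom) tuples.
    Returns a list of lines from top to bottom, each sorted left to right.
    """
    lines = []  # each entry tracks the vertical extent of one line
    for box in sorted(boxes, key=lambda b: b[1]):  # process from top to bottom
        left, top, right, bottom = box
        for line in lines:
            overlap = min(bottom, line["bottom"]) - max(top, line["top"])
            if overlap > 0:
                line["boxes"].append(box)
                line["top"] = min(line["top"], top)
                line["bottom"] = max(line["bottom"], bottom)
                break
        else:
            # No existing line overlaps vertically, so start a new one.
            lines.append({"top": top, "bottom": bottom, "boxes": [box]})
    return [sorted(line["boxes"]) for line in lines]


def keep_best_lines(lines, scores, expected):
    """Keep the `expected` highest-scoring lines, preserving top-to-bottom order."""
    ranked = sorted(range(len(lines)), key=lambda i: scores[i], reverse=True)
    return [lines[i] for i in sorted(ranked[:expected])]
```

With three detected lines scored 0.9, 0.2, and 0.8 and two lines expected, `keep_best_lines` returns the first and third lines, which mirrors the behavior described for the case where more lines are detected than expected.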
OCR reading functionality has also been improved to support detection and reading of lines and characters with slight rotations (±20°) and differing character heights. Character segmentation refers to the process of locating and separating each character in the image from the background and from other characters. This process applies to both the training and reading procedures and has a significant impact on the performance of the OCR application. OCR includes multiple threshold methods to separate the characters from the background and an AutoSplit algorithm to segment slanted, or italic, characters. A shortest segment algorithm is also implemented to ensure valid segmentation even when the characters are merged. The algorithm works in three steps:
- Attempt to divide the characters by applying multiple shortest cut paths.
- Choose the cuts that are closest to the maximum character width.
- Intelligently choose the cuts which segment a character correctly based on classification during reading.
Figure 5: Segmentation improvements ensure robust reading for OCR applications.
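The first step of the shortest segment approach, finding a cut path between merged characters, can be approximated with a small dynamic-programming sketch. This is one plausible interpretation, not the documented NI Vision method: it assumes a binarized image in which 1 marks character pixels, defines the cost of a cut as the number of character pixels it crosses, and lets the cut drift by at most one column per row. The function name `cheapest_cut` is hypothetical, and the later steps (filtering cuts by character width and by classification) are not shown.

```python
def cheapest_cut(ink, col_min, col_max):
    """Find the top-to-bottom path that crosses the fewest character pixels.

    ink[y][x] is 1 for character pixels and 0 for background.
    The path is restricted to columns col_min..col_max.
    Returns (cost, path), where path[y] is the column of the cut on row y.
    """
    cols = range(col_min, col_max + 1)
    cost = {x: ink[0][x] for x in cols}
    back = []  # back[y - 1][x] is the column used on row y - 1 to reach (y, x)
    for y in range(1, len(ink)):
        new_cost, step = {}, {}
        for x in cols:
            # Prefer staying in the same column when costs tie.
            prev = min((p for p in (x, x - 1, x + 1) if p in cost),
                       key=lambda p: cost[p])
            new_cost[x] = cost[prev] + ink[y][x]
            step[x] = prev
        back.append(step)
        cost = new_cost

    # Walk back from the cheapest end column to recover the path.
    end = min(cols, key=lambda x: cost[x])
    path = [end]
    for step in reversed(back):
        path.append(step[path[-1]])
    path.reverse()
    return cost[end], path
```

For two characters joined by a thin bridge of pixels, the cheapest path runs through the bridge, because that is where the fewest character pixels have to be cut. Running the search over several column ranges would produce the multiple candidate cuts that the remaining steps choose between.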