
Computer Vision Toolbox™ Release Notes

How to Contact MathWorks

Latest news: www.mathworks.com

Sales and services: www.mathworks.com/sales_and_services

User community: www.mathworks.com/matlabcentral

Technical support: www.mathworks.com/support/contact_us

Phone: 508-647-7000

The MathWorks, Inc.


1 Apple Hill Drive
Natick, MA 01760-2098
Computer Vision Toolbox™ Release Notes
© COPYRIGHT 2010–2023 by The MathWorks, Inc.
The software described in this document is furnished under a license agreement. The software may be used or copied
only under the terms of the license agreement. No part of this manual may be photocopied or reproduced in any form
without prior written consent from The MathWorks, Inc.
FEDERAL ACQUISITION: This provision applies to all acquisitions of the Program and Documentation by, for, or through
the federal government of the United States. By accepting delivery of the Program or Documentation, the government
hereby agrees that this software or documentation qualifies as commercial computer software or commercial computer
software documentation as such terms are used or defined in FAR 12.212, DFARS Part 227.72, and DFARS 252.227-7014.
Accordingly, the terms and conditions of this Agreement and only those rights specified in this Agreement, shall pertain
to and govern the use, modification, reproduction, release, performance, display, and disclosure of the Program and
Documentation by the federal government (or other entity acquiring for or through the federal government) and shall
supersede any conflicting contractual terms or conditions. If this License fails to meet the government's needs or is
inconsistent in any respect with federal procurement law, the government agrees to return the Program and
Documentation, unused, to The MathWorks, Inc.
Trademarks
MATLAB and Simulink are registered trademarks of The MathWorks, Inc. See
www.mathworks.com/trademarks for a list of additional trademarks. Other product or brand names may be
trademarks or registered trademarks of their respective holders.
Patents
MathWorks products are protected by one or more U.S. patents. Please see www.mathworks.com/patents for
more information.
Contents

R2023b

Ground Truth Labeling . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1-2

Collaborative team labeling: Distribute, monitor, and review labeling tasks across a team . . . . . . . . 1-2
Image Labeler: Merge ground truth objects . . . . . . . . . . . . . . . . . . . . . . . 1-2
Labeler Enhancements: Labeling interactions and other enhancements . . 1-2

Feature Detection and Extraction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1-4

Visualization Color Specification: Specify RGB color values in the range [0,
1] . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1-4

Recognition, Object Detection, and Semantic Segmentation . . . . . . . . . . 1-5

Object Detection Evaluation: Evaluate detection results with comprehensive metrics, including size-based metrics . . . . . . . . 1-5
Instance Segmentation Evaluation: Specify size-based evaluation metrics
...................................................... 1-5
Deep Learning Instance Segmentation: Train SOLOv2 networks . . . . . . . . 1-5
Automated Visual Inspection: Detect objects in image using YOLOX deep
learning network . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1-5
Automated Visual Inspection: Use anomaly detection and small object
detection techniques . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1-6
YOLO v4 Object Detector: Rotated rectangle bounding box support . . . . . 1-6
HRNet Object Keypoint Detector: Detect object keypoints in image using
pretrained HRNet deep learning network . . . . . . . . . . . . . . . . . . . . . . . 1-6
Object Keypoint Detection Visualizations: Visualize object keypoints on
images . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1-7
Vision Transformer Networks: Pretrained ViT Neural Network . . . . . . . . . 1-7
Vision Transformer Networks: Create and train neural networks containing
patch embedding layers . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1-8
Functionality being removed or changed . . . . . . . . . . . . . . . . . . . . . . . . . 1-8

Structure from Motion and Visual SLAM . . . . . . . . . . . . . . . . . . . . . . . . . 1-10

Monocular vSLAM: Implement complete feature-based monocular visual SLAM workflow with the monovslam object . . . . . . . . 1-10
Monocular vSLAM Example: Build a map of an indoor environment and
estimate the trajectory of the camera . . . . . . . . . . . . . . . . . . . . . . . . . 1-10
Code Generation for Monocular vSLAM Example: Code generation for the
Monocular vSLAM example . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1-10
Monocular Visual-Inertial SLAM Example: Perform SLAM from monocular
images with measurements obtained from IMU Sensor . . . . . . . . . . . . 1-10
Visual SLAM with ROS Example: Build and deploy visual SLAM algorithm
with ROS in MATLAB . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1-10

Quaternions: Represent orientation and rotations efficiently for localization
..................................................... 1-11

Point Cloud Processing . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1-12

Point Cloud Viewer: Rotate point cloud around selected point . . . . . . . . . 1-12
ICP Point Cloud Registration: Register two point clouds accounting for color
information . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1-12
ICP Point Cloud Registration: Improved performance . . . . . . . . . . . . . . . 1-12
Functionality being removed or changed . . . . . . . . . . . . . . . . . . . . . . . . 1-12

Tracking and Motion Estimation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1-14

Tracking with Re-Identification Example: Track people in video sequence using deep learning ReID Network . . . . . . . . 1-14

Code Generation, GPU, and Third-Party Support . . . . . . . . . . . . . . . . . . . 1-15

Generate C and C++ Code Using MATLAB Coder: Support for functions
..................................................... 1-15
Generate CUDA code for NVIDIA Using GPU Coder: Support for function
..................................................... 1-15

R2023a

Ground Truth Labeling . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2-2

Pixel Label File Naming: Unique filenames for image files related to pixel
labels . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2-2
Labeler Enhancements: Labeling interactions and other enhancements . . 2-2

Recognition, Object Detection, and Semantic Segmentation . . . . . . . . . . 2-4

OCR: Recognize text using deep learning . . . . . . . . . . . . . . . . . . . . . . . . . 2-4


Automated Visual Inspection: Train FastFlow and PatchCore anomaly
detection networks . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2-4
Automated Visual Inspection: Split anomaly data sets for training,
validation, and testing . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2-5
Automated Visual Inspection: Load detectors for generating C, C++, and
CUDA code . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2-5

Point Cloud Processing . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2-6

Point Cloud Viewer: View, navigate through, and interact with a large point
cloud . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2-6
Point Cloud Processing: Cylindrical filtering for point cloud data . . . . . . . . 2-6
ICP Point Cloud Registration: Exclude outlier points with distance
thresholding . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2-6
Velodyne File Reader: Read GPS and IMU data from Velodyne PCAP files
...................................................... 2-7
Cylinder and Sphere Geometric Models: Set color for parametric plots . . . 2-7

Code Generation, GPU, and Third-Party Support . . . . . . . . . . . . . . . . . . . . 2-8

Generate C and C++ Code Using MATLAB Coder: Support for functions
...................................................... 2-8
Generate CUDA code for NVIDIA GPU Coder: Support for function . . . . . . 2-8

Computer Vision with Simulink . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2-9

Computer Vision with Simulink: Visualize and navigate through a point cloud sequence . . . . . . . . 2-9

R2022b

Geometric Transformations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-2

Geometric Transformations: New functionality uses premultiply matrix convention . . . . . . . . 3-2
Geometric Transformations: Current functionality updated to support
premultiply matrix convention . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-3

Ground Truth Labeling . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-4

Labeler Enhancements: Labeling interactions and other enhancements . . 3-4

Recognition, Object Detection, and Semantic Segmentation . . . . . . . . . . 3-5

3-D Object Detection: Training data and visualization support for 3-D object
detection . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-5
Automated Visual Inspection: Train FCDD anomaly detector using the
Automated Visual Inspection Library . . . . . . . . . . . . . . . . . . . . . . . . . . 3-5
Deep Learning Instance Segmentation: Evaluate segmentation results . . . 3-6
Deep Learning Instance Segmentation: Segment objects in image
datastores . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-6
Semantic Segmentation: Write output files to specified unique folder . . . . 3-6
Image Classification: Create scene label training data . . . . . . . . . . . . . . . 3-6
YOLO v2 Object Detector: Detect objects in image using pretrained YOLO
v2 deep learning networks . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-6
YOLO v3 Object Detector: Computer Vision Toolbox includes
yolov3ObjectDetector object and its object functions . . . . . . . . . . . . . . 3-6
Deep Learning Object Detector Block: Support for YOLO v3 and YOLO v4
object detectors . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-7
Seven-Segment Digit Recognition: Recognize digits in a seven-segment
display . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-7
Object Detection Example: Object detection on large satellite imagery using
deep learning . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-7
Functionality Being Removed or Changed . . . . . . . . . . . . . . . . . . . . . . . . . 3-7

Structure from Motion and Visual SLAM . . . . . . . . . . . . . . . . . . . . . . . . . . 3-8

Visual SLAM: Query indirectly connected camera views in an imageviewset . . . . . . . . 3-8

Visual SLAM: Store and manage additional attributes of 3-D world points in
a worldpointset . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-8
Epipolar Line: epipolarLine function accepts feature point objects . . . . . . 3-8

Point Cloud Processing . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-9

Point Cloud Reading: Velodyne file reader enhancements . . . . . . . . . . . . . 3-9


Point Cloud Registration: Generalized-ICP registration algorithm added to
pcregistericp . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-9
Point Cloud Processing: Support for 16-bit color in point clouds . . . . . . . . 3-9
Point Cloud Processing: Generate point cloud from depth image . . . . . . . . 3-9
Point Cloud Visualization: Point cloud viewer control and interaction
enhancements . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-9
vision.VideoFileReader system object removed in future release . . . . . . . 3-10

Code Generation, GPU, and Third-Party Support . . . . . . . . . . . . . . . . . . . 3-11

Generate C and C++ Code Using MATLAB Coder: Support for functions
..................................................... 3-11
Generate CUDA code for NVIDIA GPU Coder: Support for function . . . . . 3-11

Computer Vision with Simulink . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-12

Computer Vision with Simulink: Retrieve Simulink image attributes . . . . 3-12


Image From Workspace block default value changes . . . . . . . . . . . . . . . . 3-12

R2022a

Ground Truth Labeling . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4-2

Labeler Enhancements: Labeling interactions and other enhancements . . 4-2

Feature Detection and Extraction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4-3

Functionality Being Removed or Changed: Support for GPU removed for detectFASTFeatures function . . . . . . . . 4-3

Recognition, Object Detection, and Semantic Segmentation . . . . . . . . . . 4-4

Deep Learning Instance Segmentation: Train Mask R-CNN networks . . . . 4-4


YOLO v4 Object Detector: Detect objects in image using YOLO v4 deep
learning network . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4-4
SSD Object Detector Enhancements: Create pretrained or custom SSD
object detector . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4-4
Text Detection: Detect natural scene texts using CRAFT deep learning
model . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4-5
Automated Visual Inspection: Use anomaly detection and classification
techniques . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4-5
Bounding Box Coordinates: Data augmentation for object detection using
spatial coordinates . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4-5
Functionality being removed or changed . . . . . . . . . . . . . . . . . . . . . . . . . 4-5
Experiment Manager Example: Find optimal training options . . . . . . . . . . 4-6

Multiclass Object Detection Example: Train multiclass object detector using
deep learning . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4-6
Datastore Support: Use datastores with ACF and cascade object detectors
...................................................... 4-6

Structure from Motion and Visual SLAM . . . . . . . . . . . . . . . . . . . . . . . . . . 4-7

Stereo Vision Rectification Parameters: Access stereo camera rectification parameters . . . . . . . . 4-7
Bundle Adjustment Data Management: Integration of bundle adjustment
functions with data management objects . . . . . . . . . . . . . . . . . . . . . . . 4-7
Visual SLAM Example: Process image data from RGB-D camera to build
dense map . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4-7
Functionality Being Removed or Changed: Support for GPU removed for
disparityBM function . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4-7

Point Cloud Processing . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4-8

Point Cloud Preprocessing: Preserve organized structure of point cloud . . . . . . . . 4-8
Velodyne Organized Point Cloud Support: Velodyne file reader can return
organized point clouds . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4-8
Point Cloud: uint16 data type support for intensity . . . . . . . . . . . . . . . . . . 4-8

Code Generation, GPU, and Third-Party Support . . . . . . . . . . . . . . . . . . . . 4-9

OpenCV Interface: Integrate OpenCV version 4.5.0 projects with MATLAB . . . . . . . . 4-9
Generate C and C++ Code Using MATLAB Coder: Support for functions
...................................................... 4-9
Generate CUDA code for NVIDIA GPU Coder: Support for functions . . . . . 4-9
Functionality being removed or changed . . . . . . . . . . . . . . . . . . . . . . . . 4-10

Computer Vision with Simulink . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4-11

Computer Vision with Simulink: Specify image data type in Simulink model
..................................................... 4-11

R2021b

Ground Truth Labeling . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5-2

Labeler Enhancements: Labeling interactions and other enhancements . . 5-2

Feature Detection and Extraction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5-3

SIFT Feature Detector: Scale-invariant feature transform detection and feature extraction . . . . . . . . 5-3

Recognition, Object Detection, and Semantic Segmentation . . . . . . . . . . 5-4

Experiment Manager App Support: Track progress of deep learning object
detector training . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5-4
Deep Learning Activity Recognition: Video classification using deep learning
...................................................... 5-4
Create Training Data for Video Classifier: Extract video clips for labeling
and training workflow . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5-5
Deep Learning Instance Segmentation: Create and configure pretrained
Mask R-CNN neural networks . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5-5
Deep Learning ROI Pooling: Nonquantized ROI pooling . . . . . . . . . . . . . . 5-5
Deep Learning Object Detector Block: Simulate and generate code for deep
learning object detectors in Simulink . . . . . . . . . . . . . . . . . . . . . . . . . . 5-5
Pretrained Deep Learning Models on GitHub: Perform object detection and
segmentation using latest pretrained models on GitHub . . . . . . . . . . . . 5-6

Camera Calibration . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5-7

Camera Calibration: Circle grid calibration pattern detection . . . . . . . . . . 5-7


Camera Calibration: Custom pattern detection . . . . . . . . . . . . . . . . . . . . . 5-7
Rigid 3-D Support: Pass rigid 3-D transformation object to calibration
functions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5-7
OpenCV Camera Parameters: Relay camera intrinsics and stereo
parameters to and from OpenCV . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5-7

Structure from Motion and Visual SLAM . . . . . . . . . . . . . . . . . . . . . . . . . . 5-8

Bundle Adjustment Solver: Specify optimization solver . . . . . . . . . . . . . . . 5-8


Rigid 3-D Support: Pass 3-D rigid transformation object to camera
parameter functions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5-8
Image View Set Support: Find views and view connections . . . . . . . . . . . . 5-8
Bag of Features: Support for binary features . . . . . . . . . . . . . . . . . . . . . . . 5-8
Bag of Features Search Index: Support for Visual Simultaneous Localization
and Mapping (vSLAM) loop closure detection . . . . . . . . . . . . . . . . . . . . 5-8

Point Cloud Processing . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5-9

Point Cloud Simultaneous Localization and Mapping (SLAM): Detect loop closures . . . . . . . . 5-9
Point Cloud View Set Support: Find views and view connections . . . . . . . . 5-9
Multiquery Radius Search: Optimized radius search for point cloud
segmentation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5-9
Point Cloud Viewers: Modify background color programmatically . . . . . . . 5-9

Code Generation, GPU, and Third-Party Support . . . . . . . . . . . . . . . . . . . 5-10

Generate C and C++ Code Using MATLAB Coder: Support for functions
..................................................... 5-10
Generate C and C++ Code Using MATLAB Coder: Compiler links to OpenCV
libraries . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5-10
Generate CUDA code for NVIDIA GPUs using GPU Coder . . . . . . . . . . . . 5-10
Computer Vision Toolbox Interface for OpenCV in MATLAB (September
2021, Version 21.2): Call OpenCV functions from MATLAB . . . . . . . . . 5-10
Computer Vision Toolbox Interface for OpenCV in Simulink: Specify image
data type in Simulink model . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5-11

R2021a

Camera Calibration: Checkerboard detector for fisheye camera calibration and partial checkerboard detection . . . . . . . . 6-2

Visual Simultaneous Localization and Mapping (SLAM): Enhancements to the SLAM workflow . . . . . . . . 6-2

Point Cloud Simultaneous Localization and Mapping (SLAM): Support for point cloud SLAM with NDT map representation . . . . . . . . 6-2

Point Cloud Processing: Set cluster density limits, get indices of bins, and registration enhancements . . . . . . . . 6-2

Deep Learning Training: Pass training options to detectors, perform cutout data augmentation, and balance labels of blocked images . . . . . . . . 6-3

Semantic Segmentation Enhancements: Support for dlnetwork objects and specified classes . . . . . . . . 6-3

Evaluate Image Segmentation: Calculate generalized Dice similarity coefficient . . . . . . . . 6-3

Labeler Enhancements: Instance segmentation labeling, super pixel flood fill, labeling large images, and additional features . . . . . . . . 6-3

YOLO v3 Object Detector: Computer Vision Toolbox Model for YOLO v3 Object Detection (Version 21.1.0) . . . . . . . . 6-4

Extended Capability: Perform GPU and C/C++ code generation . . . . . . . . 6-5


Generate C and C++ Code Using MATLAB Coder . . . . . . . . . . . . . . . . . . . 6-5
Generate CUDA code for NVIDIA GPUs using GPU Coder . . . . . . . . . . . . . 6-5

Functionality being removed or changed . . . . . . . . . . . . . . . . . . . . . . . . . . 6-6


GPU support for disparityBM, detectFASTFeatures, and the interface for
OpenCV in MATLAB will be removed in a future release . . . . . . . . . . . . 6-6
Computer Vision Toolbox Support Package for Xilinx Zynq-Based Hardware
has been moved to Vision HDL Toolbox Support Package for Xilinx Zynq-
Based Hardware . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6-6

R2020b

Mask R-CNN: Train Mask R-CNN networks for instance segmentation using deep learning . . . . . . . . 7-2

Visual SLAM Data Management: Manage 3-D world points and projection
correspondences to 2-D image points . . . . . . . . . . . . . . . . . . . . . . . . . . . 7-2

AprilTag Pose Estimation: Detect and estimate pose for AprilTags in an
image . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7-2

Point Cloud Registration: Register point clouds using phase correlation . . . . . . . . 7-2

Point Cloud Loop Closure Detection: Compute Point cloud feature descriptor for loop closure detection . . . . . . . . 7-2

Triangulation Accuracy Improvements: Filter triangulated 3-D world points behind camera view . . . . . . . . 7-2

Geometric Transforms: Estimate 2-D and 3-D geometric transformations from matching point pairs . . . . . . . . 7-3

Labeler Enhancements: Label objects in images and video using projected 3-D bounding boxes, load custom image formats, use additional keyboard shortcuts, and more . . . . . . . . 7-3

Object Detection Visualizations: Visualize shapes on images and point clouds . . . . . . . . 7-4

Evaluate Pixel-Level Segmentation: Compute a confusion matrix of multiclass pixel-level image segmentation . . . . . . . . 7-4

Focal Loss Layer Improvements: Add a focal loss layer to a semantic segmentation or image classification deep learning network . . . . . . . . 7-4

focalCrossEntropy function: Compute focal cross-entropy loss in custom training loops . . . . . . . . 7-4

Computer Vision Examples: Explore deep learning workflows, explore camera calibration using AprilTags, and compute segmentation metrics . . . . . . . . 7-5

Extended Capability: Perform GPU and C/C++ code generation . . . . . . . . 7-5


Generate C and C++ Code Using MATLAB Coder . . . . . . . . . . . . . . . . . . . 7-5
Generate CUDA code for NVIDIA GPUs using GPU Coder . . . . . . . . . . . . . 7-5

OpenCV Interface: Integrate OpenCV version 4.2.0 projects with MATLAB . . . . . . . . 7-5

Functionality being removed or changed . . . . . . . . . . . . . . . . . . . . . . . . . . 7-5


focalLossLayer input arguments, alpha and gamma now have default values
...................................................... 7-5
yolov2ReorgLayer will be removed . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7-6
estimateGeometricTransform will be removed . . . . . . . . . . . . . . . . . . . . . 7-6

R2020a

Point Cloud Deep Learning: Detect and classify objects in 3-D point
clouds . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8-2

Deep Learning with Big Images: Train and use deep learning object
detectors and semantic segmentation networks on very large images
.......................................................... 8-2

Simultaneous Localization and Mapping (SLAM): Perform point cloud and visual SLAM . . . . . . . . 8-2

Bar Code Reader: Detect and decode 1-D and 2-D barcodes . . . . . . . . 8-3

SSD Object Detection: Detect objects in images using a single shot multibox object detector (SSD) . . . . . . . . 8-3

Velodyne Point Cloud Reader: Store start time for each point cloud frame
.......................................................... 8-3

Labelers: Rename scene labels, select ROI color, and show ROI label
names . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8-3

Validate Deep Learning Networks: Specify training options to validate deep learning networks during training . . . . . . . . 8-4

YOLO v2 Enhancements: Import and export pretrained YOLO v2 object detectors . . . . . . . . 8-4

YOLO v3 Deep Learning: Perform object detection using YOLO v3 deep learning network . . . . . . . . 8-4

Computer Vision Examples: Explore object detection with deep learning workflows, structure from motion, and point cloud processing . . . . . . . . 8-4

Code Generation: Generate C/C++ code using MATLAB Coder . . . . . . . . . 8-5

Computer Vision Toolbox Interface for OpenCV in Simulink: Import OpenCV code into Simulink . . . . . . . . 8-6

Functionality being removed or changed . . . . . . . . . . . . . . . . . . . . . . . . . . 8-6


pcregisterndt, pcregistericp, and the pcregistercpd functions return a
rigid3d object . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8-6
New imageviewset replaces viewSet . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8-6

R2019b

Video and Image Labeler: Copy and paste pixel labels, improved pan and
zoom, improved frame navigation, and line ROI, label attributes, and
sublabels added to Image Labeler . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9-2

Data Augmentation for Object Detectors: Transform image and bounding box . . . . . . . . 9-2

Semantic Segmentation: Classify individual pixels in images and 3-D volumes using DeepLab v3+ and 3-D U-Net networks . . . . . . . . 9-2

Deep Learning Object Detection: Perform faster R-CNN end-to-end training, anchor box estimation, and use multichannel image data . . . . . . . . 9-3

Deep Learning Acceleration: Optimize YOLO v2 and semantic segmentation using MEX acceleration . . . . . . . . 9-3

Multiview Geometry: Reconstruct 3-D scenes and camera poses from multiple cameras . . . . . . . . 9-3

Velodyne Point Cloud Reader: Read lidar data from VLS-128 device
model . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9-3

Point Cloud Normal Distribution Transform (NDT): Register point clouds using NDT with improved performance . . . . . . . . 9-4

Code Generation: Generate C/C++ code using MATLAB Coder . . . . . . . . . 9-4

Functionality Being Removed or Changed . . . . . . . . . . . . . . . . . . . . . . . . . 9-4


The NumOutputChannels argument of unetLayers function has been
renamed to NumFirstEncoderFilters . . . . . . . . . . . . . . . . . . . . . . . . . . . 9-4

R2019a

YOLO v2 Object Detection: Train a "you-only-look-once" (YOLO) v2 deep learning object detector . . . . . . . . 10-2

3-D Semantic Segmentation: Classify pixel regions in 3-D volumes using deep learning . . . . . . . . 10-2

Code Generation: Generate C code for point cloud processing, ORB, disparity, and ACF functionality using MATLAB Coder . . . . . . . . 10-2

ORB Features: Detect and extract oriented FAST and rotated BRIEF
(ORB) features . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10-3

Velodyne Point Cloud Reader: Read lidar data from Puck LITE and Puck
Hi-Res device models . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10-3

GPU Acceleration for Stereo Disparity: Compute stereo disparity maps on
GPUs . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10-3

Ground Truth Data: Select labels by group, type, and attribute . . . . . . . 10-3

Projection Matrix Estimation: Use direct linear transform (DLT) to compute projection matrix . . . . . . . . 10-4

Organized Point Clouds: Perform faster approximate search using camera projection matrix . . . . . . . . 10-4

Point Cloud Viewers: Modify color display and view data tips . . . . . . . . . 10-4

Image and Video Labeling: Organize labels by logical groups, use assisted
freehand for pixel labeling, and other label management
enhancements . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10-4

DeepLab v3+, deep learning, and lidar tracking examples . . . . . . . . . . . 10-5

Relative camera pose computed from homography matrix . . . . . . . . . . . 10-5

Functionality being removed or changed . . . . . . . . . . . . . . . . . . . . . . . . . 10-5


disparity function will be removed . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10-5
selectLabels object function will be removed . . . . . . . . . . . . . . . . . . . . . 10-5

R2018b

Video Labeler App: Interactive and semi-automatic labeling of ground truth data in a video, image sequence, or custom data source . . . . . . . . 11-2

Lidar Segmentation: Segment ground points from organized 3-D lidar data and organize point clouds into clusters . . . . . . . . 11-2

Point Cloud Registration: Align 3-D point clouds using coherent point
drift (CPD) registration . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11-2

MSAC Fitting: Find a polynomial that best fits noisy data using the M-
estimator sample consensus (MSAC) . . . . . . . . . . . . . . . . . . . . . . . . . . . 11-2

Faster R-CNN Enhancements: Train Faster R-CNN object detectors using DAG networks such as ResNet-50 and Inception-v3 . . . . . . . . 11-2

Semantic Segmentation Using Deep Learning: Create U-Net network . . . . . . . . 11-3

Velodyne Point Cloud Reader: Support for VLP-32 device . . . . . . . . . . . . 11-3

Labeler Apps: Create a definition table, change file path, and assign data
attributes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11-3

OpenCV Interface: Integrate OpenCV version 3.4.0 projects with MATLAB
......................................................... 11-3

Functionality Being Removed or Changed . . . . . . . . . . . . . . . . . . . . . . . . 11-3


ClassNames property of PixelClassificationLayer will be removed . . . . . . 11-4

R2018a

Lidar Segmentation: Segment lidar point clouds using Euclidean distance . . . . . . . . 12-2

Lidar Registration: Register multiple lidar point clouds using normal distributions transform (NDT) . . . . . . . . 12-2

Image Labeler App: Mark foreground and background for pixel labeling
......................................................... 12-2

Fisheye Calibration: Interactively calibrate fisheye lenses using the Camera Calibrator app . . . . . . . . 12-2

Stereo Baseline Estimation: Estimate baseline of a stereo camera with known intrinsic parameters . . . . . . . . 12-2

Interactively rotate point cloud around any point . . . . . . . . . . . . . . . . . . 12-2

Multiclass nonmaxima suppression (NMS) . . . . . . . . . . . . . . . . . . . . . . . . 12-2

pcregrigid name changed to pcregistericp . . . . . . . . . . . . . . . . . . . . . . . . 12-2

Efficiently read and preprocess pixel-labeled images for deep learning training and prediction . . . . . . . . 12-3

Code Generation Support for KAZE Detection . . . . . . . . . . . . . . . . . . . . . 12-3

Functionality Being Removed or Changed . . . . . . . . . . . . . . . . . . . . . . . . 12-3

R2017b

Semantic Segmentation Using Deep Learning: Classify pixel regions in images, evaluate, and visualize segmentation results . . . . . . . . 13-2

Image Labeling App: Interactively label individual pixels for semantic segmentation and label regions using bounding boxes for object detection . . . . . . . . 13-2

Fisheye Camera Calibration: Calibrate fisheye cameras to estimate
intrinsic camera parameters . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13-2

KAZE Features: Detect and extract KAZE features for object recognition
or image registration workflows . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13-2

Code generation for camera intrinsics . . . . . . . . . . . . . . . . . . . . . . . . . . . 13-3

Image Labeler app replaces Training Image Labeler app . . . . . . . . . . . . 13-3

Ground Truth Labeling Utilities . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13-3

Computer Vision Example . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13-3

R2017a

Deep Learning for Object Detection: Detect objects using Fast R-CNN and
Faster R-CNN object detectors . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 14-2

Object Detection Using ACF: Train object detectors using aggregate channel features . . . . . . . . 14-2

Object Detector Evaluation: Evaluate object detector performance, including precision and miss-rate metrics . . . . . . . . 14-2

OpenCV Interface: Integrate OpenCV version 3.1.0 projects with MATLAB . . . . . . . . 14-2

Object for storing intrinsic camera parameters . . . . . . . . . . . . . . . . . . . . 14-2

Disparity function updated to fix inconsistent results between multiple invocations . . . . . . . . 14-2

Improved algorithm to calculate intrinsics in Camera Calibration apps . . . . . . . . 14-2

R2016b

Deep Learning for Object Detection: Detect objects using region-based convolution neural networks (R-CNN) . . . . . . . . 15-2

Structure from Motion: Estimate the essential matrix and compute camera pose from 3-D to 2-D point correspondences . . . . . . . . 15-2

Point Cloud File I/O: Read and write PCD files using Point Cloud File I/O
Functions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 15-2

Code Generation for ARM Example: Detect and track faces on a
Raspberry Pi 2 target . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 15-2

Visual Odometry Example: Estimate camera locations and trajectory from an ordered sequence of images . . . . . . . . 15-2

cameraPose function renamed to relativeCameraPose . . . . . . . . . . . . . . 15-2

New capabilities for Training Image Labeler app . . . . . . . . . . . . . . . . . . . 15-2

Train cascade object detector function takes tables and uses imageDatastore . . . . . . . . 15-3

Project 3-D world points into image . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 15-3

Code generation support . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 15-3

Plot camera function accepts a table of camera poses . . . . . . . . . . . . . . 15-3

Eliminate 3-D points input from extrinsics function . . . . . . . . . . . . . . . . 15-3

Simpler way to call System objects . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 15-3

R2016a

OCR Trainer App: Train an optical character recognition (OCR) model to recognize a specific set of characters . . . . . . . . 16-2

Structure from Motion: Estimate the camera poses and 3-D structure of a
scene from multiple images . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 16-2

Pedestrian Detection: Locate pedestrians in images and video using aggregate channel features (ACF) . . . . . . . . 16-2

Bundle Adjustment: Refine estimated locations of 3-D points and camera poses for the structure from motion (SFM) framework . . . . . . . . 16-2

Multiview Triangulation: Triangulate 3-D locations of points matched across multiple images . . . . . . . . 16-2

Rotate matrix to vector and vector to matrix . . . . . . . . . . . . . . . . . . . . . . 16-2

Select spatially uniform distribution of feature points . . . . . . . . . . . . . . 16-2

Single camera and stereo camera calibration app enhancements . . . . . 16-2

Point cloud from Kinect V2 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 16-3

Point cloud viewer enhancements . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 16-3

Support package for Xilinx Zynq-based hardware . . . . . . . . . . . . . . . . . . 16-3

C code generation support . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 16-3

Future removal warning of several System objects . . . . . . . . . . . . . . . . . 16-4

R2015b

3-D Shape Fitting: Fit spheres, cylinders, and planes into 3-D point
clouds using RANSAC . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 17-2

Streaming Point Cloud Viewer: Visualize streaming 3-D point cloud data
from sensors such as the Microsoft Kinect . . . . . . . . . . . . . . . . . . . . . . 17-2

Point Cloud Normal Estimation: Estimate normal vectors of a 3-D point cloud . . . . . . . . 17-2

Farneback Optical Flow: Estimate optical flow vectors using the Farneback method . . . . . . . . 17-2

LBP Feature Extraction: Extract local binary pattern features from a grayscale image . . . . . . . . 17-2

Multilanguage Text Insertion: Insert text into image data, with support
for multiple languages . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 17-2

3-D point cloud extraction from Microsoft Kinect . . . . . . . . . . . . . . . . . . 17-2

3-D point cloud displays . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 17-3

Downsample point cloud using nonuniform box grid filter . . . . . . . . . . . 17-3

Compute relative rotation and translation between camera poses . . . . . 17-3

Warp block . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 17-3

GPU support for FAST feature detection . . . . . . . . . . . . . . . . . . . . . . . . . . 17-3

Camera calibration optimization options . . . . . . . . . . . . . . . . . . . . . . . . . 17-3

C code generation support . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 17-3

Examples for face detection, tracking, 3-D reconstruction, and point cloud registration and display . . . . . . . . 17-4

Example using Vision HDL Toolbox for noise removal and image
sharpening . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 17-4

Removed video package from Computer Vision System Toolbox . . . . . . . 17-4

Morphological System objects future removal warning . . . . . . . . . . . . . . 17-4

No edge smoothing in outputs of undistortImage and rectifyStereoImages . . . . . . . . 17-5

VideoFileReader play count . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 17-5

R2015a

3-D point cloud functions for registration, denoising, downsampling, geometric transformation, and PLY file reading and writing . . . . . . . . 18-2

Image search and retrieval using bag of visual words . . . . . . . . . . . . . . . 18-2

User-defined feature extractor for bag-of-visual-words framework . . . . 18-2

C code generation for eight functions, including rectifyStereoImages and vision.DeployableVideoPlayer on Mac . . . . . . . . 18-2

Mac support for vision.DeployableVideoPlayer and To Video Display block . . . . . . . . 18-3

Plot camera figure in 3-D coordinate system . . . . . . . . . . . . . . . . . . . . . . 18-3

Line width for insertObjectAnnotation . . . . . . . . . . . . . . . . . . . . . . . . . . . 18-3

Upright option for extractFeatures . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 18-3

Rotate integral images in integralImage, integralKernel, and integralFilter functions . . . . . . . . 18-3

Performance improvements . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 18-3

Optical flow functions and object . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 18-3

Examples for image retrieval, 3-D point cloud registration and stitching,
and code generation for depth estimation from stereo video . . . . . . . 18-4

R2014b

Stereo camera calibration app . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 19-2

imageSet class for handling large collections of image files . . . . . . . . . . 19-2

Bag-of-visual-words suite of functions for image category classification . . . . . . . . 19-2

Approximate nearest neighbor search method for fast feature matching
......................................................... 19-2

3-D point cloud visualization function . . . . . . . . . . . . . . . . . . . . . . . . . . . . 19-3

3-D point cloud extraction from Kinect . . . . . . . . . . . . . . . . . . . . . . . . . . . 19-3

Kinect color image to depth image alignment . . . . . . . . . . . . . . . . . . . . . 19-3

Point locations from stereo images using triangulation . . . . . . . . . . . . . 19-3

Red-cyan anaglyph from stereo images . . . . . . . . . . . . . . . . . . . . . . . . . . . 19-3

Point coordinates correction for lens distortion . . . . . . . . . . . . . . . . . . . . 19-3

Camera projection matrix . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 19-3

Calculation of calibration standard errors . . . . . . . . . . . . . . . . . . . . . . . . 19-3

Live image capture in Camera Calibrator app . . . . . . . . . . . . . . . . . . . . . 19-3

Region of interest (ROI) copy and paste support for Training Image
Labeler app . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 19-4

Non-maximal suppression of bounding boxes for object detection . . . . . 19-4

Linux support for deployable video player . . . . . . . . . . . . . . . . . . . . . . . . 19-4

GPU support for Harris feature detection . . . . . . . . . . . . . . . . . . . . . . . . . 19-4

Extended language support package for optical character recognition (OCR) . . . . . . . . 19-4

Support package for OpenCV Interface . . . . . . . . . . . . . . . . . . . . . . . . . . . 19-4

Convert format of rectangle to a list of points . . . . . . . . . . . . . . . . . . . . . 19-5

Bag-of-visual-words, stereo vision, image stitching, and tracking examples . . . . . . . . 19-5

R2014a

Stereo vision functions for rectification, disparity calculation, scene reconstruction, and stereo camera calibration . . . . . . . . 20-2

Optical character recognition (OCR) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 20-2

Binary Robust Invariant Scalable Keypoints (BRISK) feature detection and extraction . . . . . . . . 20-2

App for labeling images for training cascade object detectors . . . . . . . . 20-2

C code generation for Harris and minimum eigenvalue corner detectors using MATLAB Coder . . . . . . . . 20-3

Line width control for insertShape function and Draw Shapes block . . . 20-3

Replacing vision.CameraParameters with cameraParameters . . . . . . . . 20-3

Output view modes and fill value selection added to undistortImage function . . . . . . . . 20-3

Generated code optimized for the matchFeatures function and vision.ForegroundDetector System object . . . . . . . . 20-3

Merging mplay viewer into implay viewer . . . . . . . . . . . . . . . . . . . . . . . . . 20-3

MPEG-4 and JPEG2000 file formats added to vision.VideoFileWriter System object and To Multimedia File block . . . . . . . . 20-4

Region of interest (ROI) support added to detectMSERFeatures and detectSURFFeatures functions . . . . . . . . 20-4

MATLAB code script generation added to Camera Calibrator app . . . . . 20-4

Featured examples for text detection, OCR, 3-D reconstruction, 3-D dense
reconstruction, code generation, and image search . . . . . . . . . . . . . . 20-4

Play count default value updated for video file reader . . . . . . . . . . . . . . . 20-4

R2013b

Camera intrinsic, extrinsic, and lens distortion parameter estimation using camera calibration app . . . . . . . . 21-2

Camera calibration functions for checkerboard pattern detection, camera parameter estimation, correct lens distortion, and visualization of results . . . . . . . . 21-2

Histogram of Oriented Gradients (HOG) feature extractor . . . . . . . . . . . 21-2

C code generation support for 12 additional functions . . . . . . . . . . . . . . 21-2

System objects matlab.system.System warnings . . . . . . . . . . . . . . . . . . . 21-3

Restrictions on modifying properties in System object Impl methods . . . . . . . . 21-3

R2013a

Cascade object detector training using Haar, Histogram of Oriented Gradients (HOG), and Local Binary Pattern (LBP) features . . . . . . . . 22-2

Fast Retina Keypoint (FREAK) algorithm for feature extraction . . . . . . 22-2

Hamming distance method for matching features . . . . . . . . . . . . . . . . . . 22-2

Multicore support in matchFeatures function and ForegroundDetector System object . . . . . . . . 22-2

Functions for corner detection, geometric transformation estimation, and text and graphics overlay, augmenting similar System objects . . . . . . . . 22-2

Error-out condition for old coordinate system . . . . . . . . . . . . . . . . . . . . . 22-2

Support for nonpersistent System objects . . . . . . . . . . . . . . . . . . . . . . . . 22-3

New method for action when System object input size changes . . . . . . . 22-3

Scaled double data type support for System objects . . . . . . . . . . . . . . . . 22-3

Scope Snapshot display of additional scopes in Simulink Report Generator . . . . . . . . 22-3

R2012b

Kalman filter and Hungarian algorithm for multiple object tracking . . 23-2

Image and video annotation for detected or tracked objects . . . . . . . . . 23-2

Kanade-Lucas-Tomasi (KLT) point tracker . . . . . . . . . . . . . . . . . . . . . . . . 23-2

HOG-based people detector . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 23-2

Video file reader support for H.264 codec (MPEG-4) on Windows 7 . . . . 23-2

Show matched features display . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 23-2

Matching methods added for match features function . . . . . . . . . . . . . . 23-2

Kalman filter for tracking tutorial . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 23-3

Motion-based multiple object tracking example . . . . . . . . . . . . . . . . . . . 23-3

Face detection and tracking examples . . . . . . . . . . . . . . . . . . . . . . . . . . . . 23-3

Stereo image rectification example . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 23-3

System object tunable parameter support in code generation . . . . . . . . 23-3

save and load for System objects . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 23-3

Save and restore SimState not supported for System objects . . . . . . . . . 23-3

R2012a

Dependency on DSP System Toolbox and Signal Processing Toolbox Software Removed . . . . . . . . 24-2
Audio Output Sampling Mode Added to the From Multimedia File Block
..................................................... 24-2
Kalman Filter and Variable Selector Blocks Removed from Library . . . . . 24-2
2-D Median and 2-D Histogram Blocks Replace Former Median and
Histogram Blocks . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 24-2
Removed Sample-based Processing Checkbox from 2-D Maximum, 2-D
Minimum, 2-D Variance, and 2-D Standard Deviation Blocks . . . . . . . . 24-2

New Viola-Jones Cascade Object Detector . . . . . . . . . . . . . . . . . . . . . . . . . 24-2

New MSER Feature Detector . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 24-3

New CAMShift Histogram-Based Tracker . . . . . . . . . . . . . . . . . . . . . . . . . 24-3

New Integral Image Computation and Box Filtering . . . . . . . . . . . . . . . . 24-3

New Demo to Detect and Track a Face . . . . . . . . . . . . . . . . . . . . . . . . . . . 24-3

Improved MATLAB Compiler Support . . . . . . . . . . . . . . . . . . . . . . . . . . . . 24-3

Code Generation Support . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 24-3

Conversion of Error and Warning Message Identifiers . . . . . . . . . . . . . . 24-3

System Object Updates . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 24-4


Code Generation for System Objects . . . . . . . . . . . . . . . . . . . . . . . . . . . . 24-4
New System Object Option on File Menu . . . . . . . . . . . . . . . . . . . . . . . . 24-4
Variable-Size Input Support for System Objects . . . . . . . . . . . . . . . . . . . 24-4
Data Type Support for User-Defined System Objects . . . . . . . . . . . . . . . . 24-4
New Property Attribute to Define States . . . . . . . . . . . . . . . . . . . . . . . . . 24-4
New Methods to Validate Properties and Get States from System Objects
..................................................... 24-4
matlab.system.System changed to matlab.System . . . . . . . . . . . . . . . . . . 24-4

R2011b

Conventions Changed for Indexing, Spatial Coordinates, and Representation of Geometric Transforms . . . . . . . . 25-2
Running your Code with New Conventions . . . . . . . . . . . . . . . . . . . . . . . 25-2
One-Based Indexing . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 25-2
Coordinate System Convention . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 25-2
Migration to [x y] Coordinate System . . . . . . . . . . . . . . . . . . . . . . . . . . . 25-3
Updated Blocks, Functions, and System Objects . . . . . . . . . . . . . . . . . . . 25-4

New SURF Feature Detection, Extraction, and Matching Functions . . . . 25-9

New Disparity Function for Depth Map Calculation . . . . . . . . . . . . . . . . . 25-9

Added Support for Additional Video File Formats for Non-Windows Platforms . . . . . . . . . . 25-9

Variable-Size Support for System Objects . . . . . . . . . . . . . . . . . . . . . . . . . 25-9

New Demo to Retrieve Rotation and Scale of an Image Using Automated Feature Matching . . . . . . . . . . 25-9

Apply Geometric Transformation Block Replaces Projective Transformation Block . . . . . . . . . . 25-9

Trace Boundaries Block Replaced with Trace Boundary Block . . . . . . . . 25-9

FFT and IFFT Support for Non-Power-of-Two Transform Length with FFTW Library . . . . . . . . . . 25-10

vision.BlobAnalysis Count and Fill-Related Properties Removed . . . . . 25-10

vision.CornerDetector Count Output Removed . . . . . . . . . . . . . . . . . . . . 25-10

vision.LocalMaximaFinder Count Output and CountDataType Property Removed . . . . . . . . . . 25-10

vision.GeometricTransformEstimator Default Properties Changed . . . 25-11

Code Generation Support . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 25-11

vision.MarkerInserter and vision.ShapeInserter Properties Not Tunable . . . . . . . . . . 25-11

Custom System Objects . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 25-11

System Object DataType and CustomDataType Properties Changes . . . 25-12

R2011a

Product Restructuring . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 26-2


System Object Name Changes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 26-2

New Computer Vision Functions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 26-2


Extract Features . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 26-2
Feature Matching . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 26-2
Uncalibrated Stereo Rectification . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 26-2
Determine if Image Contains Epipole . . . . . . . . . . . . . . . . . . . . . . . . . . . 26-3
Epipolar Lines for Stereo Images . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 26-3
Line-to-Border Intersection Points . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 26-3

New Foreground Detector System Object . . . . . . . . . . . . . . . . . . . . . . . . . 26-3

New Tracking Cars Using Gaussian Mixture Models Demo . . . . . . . . . . . 26-3

Expanded To Video Display Block with Additional Video Formats . . . . . 26-3

New Printing Capability for the mplay Function and Video Viewer Block . . . . . . . . . . 26-3

Improved Display Updates for mplay Function, Video Viewer Block and vision.VideoPlayer System Object . . . . . . . . . . 26-3

Improved Performance of FFT Implementation with FFTW library . . . . 26-3

Variable Size Data Support . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 26-4

System Object Input and Property Warnings Changed to Errors . . . . . . 26-4

System Object Code Generation Support . . . . . . . . . . . . . . . . . . . . . . . . . 26-4

MATLAB Compiler Support for System Objects . . . . . . . . . . . . . . . . . . . . 26-4

R2010a MAT Files with System Objects Load Incorrectly . . . . . . . . . . . . 26-4

Documentation Examples Renamed . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 26-5

R2010b

New Estimate Fundamental Matrix Function for Describing Epipolar Geometry . . . . . . . . . . 27-2

New Histogram System Object Replaces Histogram2D Object . . . . . . . . 27-2

New System Object release Method Replaces close Method . . . . . . . . . . 27-2


Compatibility Considerations . . . . . . . . . . 27-2

Expanded Embedded MATLAB Support . . . . . . . . . . . . . . . . . . . . . . . . . . . 27-2
Supported Image Processing Toolbox Functions . . . . . . . . . . . . . . . . . . . 27-2
Supported System objects . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 27-2

Data Type Assistant and Ability to Specify Design Minimums and Maximums Added to More Fixed-Point Blocks . . . . . . . . . . 27-3

Data Types Pane Replaces the Data Type Attributes and Fixed-Point Panes on Fixed-Point Blocks . . . . . . . . . . 27-3

Enhanced Fixed-Point and Integer Data Type Support with System Objects . . . . . . . . . . 27-3
Compatibility Considerations . . . . . . . . . . 27-3

Variable Size Data Support . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 27-3

Limitations Removed from Video and Image Processing Blockset Multimedia Blocks and Objects . . . . . . . . . . 27-4

R2010a

New System Objects Provide Video and Image Processing Algorithms for use in MATLAB . . . . . . . . . . 28-2

Intel Integrated Performance Primitives Library Support Added to 2-D Correlation, 2-D Convolution, and 2-D FIR Filter Blocks . . . . . . . . . . 28-2

Variable Size Data Support . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 28-2

Expanded From and To Multimedia File Blocks with Additional Video Formats . . . . . . . . . . 28-2

New Simulink Demos . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 28-3


New Modeling a Video Processing System for an FPGA Target Demo . . . 28-3

New System Object Demos . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 28-3


New Image Rectification Demo . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 28-3
New Stereo Vision Demo . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 28-3
New Video Stabilization Using Point Feature Matching . . . . . . . . . . . . . . 28-3

SAD Block Obsoleted . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 28-3

1

R2023b

Version: 23.2

New Features

Compatibility Considerations

Ground Truth Labeling

Collaborative team labeling: Distribute, monitor, and review labeling tasks across a team

This release introduces an interface for collaborative, multiuser, team-based labeling using the
Image Labeler app. This new mode of the app supports a complete labeling project workflow in which
multiple individuals perform assigned labeling tasks. The app dynamically adjusts the view based on
the role of the user.

The new interface features include:

• Create label definitions for distribution.


• Create and assign separate tasks for labeling ground truth data and for reviewing the labeled
data.
• Include feedback within the app for task owners or project manager.
• Track the progress of any of the tasks.
• Create an executable labeling app, which team members can use to label or review tasks without
a MATLAB® license.
• Export a ground truth object.

For more details, see “Get Started with Team-Based Labeling”.

Image Labeler: Merge ground truth objects


Use the merge object function of the groundTruth object to merge two or more ground truth
objects.
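
A minimal sketch, assuming gTruth1 and gTruth2 are groundTruth objects exported from two labeling sessions over related data, and that the merge object function accepts them as separate arguments:

    % Merge ground truth objects from two labeling sessions (sketch).
    mergedGTruth = merge(gTruth1,gTruth2);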

Labeler Enhancements: Labeling interactions and other enhancements


This table describes enhancements for these labeling apps:

• Image Labeler


• Video Labeler
• Ground Truth Labeler
• Lidar Labeler
• Medical Image Labeler

Feature (availability shown as Image Labeler | Video Labeler | Ground Truth Labeler | Lidar Labeler | Medical Image Labeler):

• New Image Labeler interface (Yes | No | No | No | No)
• Rotated rectangle ROI support (Yes | Yes | Yes | No | No)
• Rotated rectangle ROI support for temporal interpolation and point tracker automation algorithms (No | Yes | Yes | No | No)
• Merge ground truth objects (Yes | No | No | No | Yes)
• Load pointCloud objects from workspace (No | No | No | Yes | No)
• Visualize point cloud intensity using colormap value (No | No | No | Yes | No)
• Cluster point cloud using DBSCAN clustering algorithm (No | No | No | Yes | No)
• Use burst mode to label static objects in all point cloud frames at once (No | No | No | Yes | No)
• Use pretrained deep learning models to generate automated labels in a point cloud (No | No | No | Yes | No)
• Use a custom automation function to generate labels (Yes | Yes | Yes | Yes | Yes)
• Specify image display window level and window width in the app toolstrip (No | No | No | No | Yes)
• New Preferences dialog box for image display settings (No | No | No | No | Yes)


Feature Detection and Extraction

Visualization Color Specification: Specify RGB color values in the range [0, 1]
Starting in R2023b, you can now specify color using RGB values in the range [0, 1] for these
visualization functions:

• insertText
• insertObjectAnnotation
• insertShape
• insertMarker
• insertObjectMask
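
For example, a small sketch (the image, text, and location are placeholders) passing an RGB triplet in the [0, 1] range to insertText:

    I = imread("peppers.png");
    J = insertText(I,[20 20],"Computer Vision Toolbox", ...
        TextColor=[0 0.6 0.2],BoxColor=[1 1 1]);
    imshow(J)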


Recognition, Object Detection, and Semantic Segmentation


Object Detection Evaluation: Evaluate detection results with
comprehensive metrics, including size-based metrics
Use the evaluateObjectDetection function to evaluate the quality of object detection results
using these metrics: confusion matrix, normalized confusion matrix, average precision, and mean
average precision. You can also specify that the function return additional metrics such as miss rate
(MR), log-average miss rate (LAMR), false positives per image (FPPI), and average orientation
similarity (AOS, for rotated bounding boxes only). The objectDetectionMetrics object stores the
metrics. Use the metricsByArea object function of the objectDetectionMetrics object to
extract subsets of evaluation metrics by object size range.
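
A minimal sketch of this workflow; detectionResults and groundTruthData stand in for a detection results table and a box label datastore, the area ranges are illustrative, and the exact form of the metricsByArea range argument is an assumption:

    % Evaluate detections against ground truth (sketch).
    metrics = evaluateObjectDetection(detectionResults,groundTruthData);
    % Extract metrics for small, medium, and large objects (assumed ranges in pixels^2).
    areaRanges = [0 32^2; 32^2 96^2; 96^2 1e8];
    metricsBySize = metricsByArea(metrics,areaRanges);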

Instance Segmentation Evaluation: Specify size-based evaluation metrics
Use the metricsByArea object function of the instanceSegmentationMetrics object to extract
subsets of evaluation metrics by object mask bounding box area ranges.

Deep Learning Instance Segmentation: Train SOLOv2 networks


Create a SOLOv2 instance segmentation network using the solov2 object. This network is trained on
the COCO data set with a ResNet-50 or ResNet-18 network as the feature extractor. You can segment
object instances in an image using the pretrained network, or you can configure the network to
perform transfer learning.

To train the SOLOv2 network, use the trainSOLOV2 function. For more details, see “Get Started with
SOLOv2 for Instance Segmentation”.

For an example, see “Perform Instance Segmentation Using SOLOv2”.
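
A brief inference sketch; the pretrained model name "resnet50-coco" and the Threshold name-value argument are assumptions about the support package options:

    detector = solov2("resnet50-coco");                 % pretrained SOLOv2 (model name assumed)
    [masks,labels,scores] = segmentObjects(detector,I,Threshold=0.5);
    imshow(insertObjectMask(I,masks))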

This functionality requires the Computer Vision Toolbox Model for SOLOv2 Instance Segmentation
and a Deep Learning Toolbox™ license. You can install the Computer Vision Toolbox Model for
SOLOv2 Instance Segmentation from Add-On Explorer. For more information about installing add-
ons, see Get and Manage Add-Ons.

Automated Visual Inspection: Detect objects in image using YOLOX deep learning network
The yoloxObjectDetector object creates a YOLOX object detector to detect objects in images. You
can either create a pretrained YOLOX object detector or a custom YOLOX object detector. Use the
detect object function to detect objects in a test image by using the trained YOLOX object detector.
To train a YOLOX object detection network, use the trainYOLOXObjectDetector function.
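
A short pretrained-inference sketch; the pretrained model name "small-coco" is an assumption about the names provided by the library:

    detector = yoloxObjectDetector("small-coco");       % pretrained YOLOX (model name assumed)
    [bboxes,scores,labels] = detect(detector,I);
    imshow(insertObjectAnnotation(I,"rectangle",bboxes,string(labels)))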

The “Detect Defects on Printed Circuit Boards Using YOLOX Network” example shows how to use a
pretrained YOLOX object detector to detect small objects in a sample image and configure a
pretrained YOLOX object detector to perform transfer learning.

This functionality requires the Computer Vision Toolbox Automated Visual Inspection Library and a
Deep Learning Toolbox license. You can install the Computer Vision Toolbox Automated Visual
Inspection Library from Add-On Explorer. For more information about installing add-ons, see Get and
Manage Add-Ons.

To load a YOLOX object detector model for code generation, use the
vision.loadYOLOXObjectDetector function.

Automated Visual Inspection: Use anomaly detection and small object detection techniques
These examples show how to use anomaly detection and small object detection techniques.

• The “Localize Industrial Defects Using PatchCore Anomaly Detector” example detects and
localizes defects on printed circuit boards (PCBs) using semi-supervised learning of a single-class
classification neural network. For more details, see “Getting Started with Anomaly Detection
Using Deep Learning”.
• The “Detect Defects on Printed Circuit Boards Using YOLOX Network” example uses a YOLOX network to
perform object detection of small industrial defects on PCBs. This example replaces an example
from the previous release that used the YOLO v4 network.

YOLO v4 Object Detector: Rotated rectangle bounding box support


The deep learning YOLO v4 object detector and supporting functionality now support rotated
rectangle bounding boxes. The functions that now support rotated rectangle bounding boxes as
inputs are:

• Training — trainYOLOv4ObjectDetector
• Detection — yolov4ObjectDetector, estimateAnchorBoxes
• Visualization — insertShape, insertObjectAnnotation, showShape
• Augmentation and Preprocessing — balanceBoxLabels, bbox2points
• Simulink block — Deep Learning Object Detector

The insertObjectAnnotation and showShape functions add the logical ShowOrientation
name-value argument, enabling you to specify whether to visually display the orientation of the
rotated rectangle.

HRNet Object Keypoint Detector: Detect object keypoints in image using pretrained HRNet deep learning network
Use the hrnetObjectKeypointDetector object to create an object keypoint detector from a
pretrained HRNet deep learning network. Then, use the detect object function to detect object
keypoints in an image by using the pretrained HRNet object detector. You can also extract visible
keypoints detected by the HRNet detector by using the visibleKeypoints object function.

The “Hand Pose Estimation Using HRNet Deep Learning” example shows how to detect hand pose
keypoints using an HRNet network.
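
A hedged sketch of the inference step; it assumes the detector loads default pretrained weights and that detect takes the image together with object bounding boxes (for example, person boxes from a separate detector):

    keypointDetector = hrnetObjectKeypointDetector;          % default pretrained HRNet
    keypoints = detect(keypointDetector,I,personBoxes);      % personBoxes: [x y w h] rows (assumed input)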

To load an HRNet detector for code generation, use the loadHRNETObjectKeypointDetector function.

You can use the pretrained HRNet deep learning networks provided in the Computer Vision Toolbox
Model for Object Keypoint Detection support package. There are two variants of pretrained HRNet
deep learning network: one with 32 channels and one with 48 channels. These networks have been
trained on the COCO keypoint detection data set. You can download the Computer Vision Toolbox
Model for Object Keypoint Detection from the Add-On Explorer. For more information, see “Get and
Manage Add-Ons”.

For more details on HRNet, see “Getting Started with HRNet”.

Object Keypoint Detection Visualizations: Visualize object keypoints on images
Use the insertObjectKeypoints function to insert detected object keypoints on their respective
objects in an image.

Vision Transformer Networks: Pretrained ViT Neural Network


Load a pretrained vision transformer (ViT) neural network using the visionTransformer function
provided in the Computer Vision Toolbox Model for Vision Transformer Network support package.
The returned object is a dlnetwork object. You can fine-tune the network to classify new images
using transfer learning. To classify images, use the predict object function.
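
A minimal sketch of running the pretrained ViT network; the 384-by-384 input size is inferred from the "-384" model names and the dlarray preprocessing reflects the usual dlnetwork workflow rather than a documented recipe:

    net = visionTransformer;                                 % loads a pretrained ViT model
    X = dlarray(single(imresize(im,[384 384])),"SSCB");      % input size assumed for the -384 variants
    scores = predict(net,X);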

In the support package, there are three variants of the pretrained ViT deep learning network:

• "base-16-imagenet-384" — Base-sized model (86.8 million parameters) with a patch size of 16
and fine-tuned using the ImageNet 2012 data set at a resolution of 384-by-384.
• "small-16-imagenet-384" — Small-sized model (22.1 million parameters) with a patch size of
16 and fine-tuned using the ImageNet 2012 data set at a resolution of 384-by-384.
• "tiny-16-imagenet-384" — Tiny-sized model (5.7 million parameters) with a patch size of 16
and fine-tuned using the ImageNet 2012 data set at a resolution of 384-by-384.

You can download the Computer Vision Toolbox Model for Vision Transformer Network from the Add-
On Explorer. For more information, see “Get and Manage Add-Ons”.

This feature requires Deep Learning Toolbox.

For an example, see “Train Vision Transformer Network for Image Classification”.


Vision Transformer Networks: Create and train neural networks containing patch embedding layers
A patch embedding layer maps patches of pixels to vectors. Use this layer in vision transformer
neural networks such as ViT to encode information about patches in images.

To create a patch embedding layer, use the patchEmbeddingLayer function to create a
PatchEmbeddingLayer object.

This feature requires Deep Learning Toolbox.

Functionality being removed or changed


anchorBoxLayer function will be removed
Warns

The anchorBoxLayer function will be removed in a future release. Use the ssdObjectDetector
object to specify the anchor boxes for training a single shot detector (SSD) multibox object detection
network instead.

ssdLayers function will be removed


Warns

The ssdLayers function will be removed in a future release. Use the ssdObjectDetector object to
create an SSD object detection network instead.

imageSet object will be removed


Warns

The imageSet object will be removed in a future release. Use the imageDatastore object to
manage collections of image data instead.

evaluateDetectionPrecision function will be removed


Still runs

The evaluateDetectionPrecision function will be removed in a future release. Use the
evaluateObjectDetection function to evaluate object detection results with metrics such as
average precision instead.

evaluateDetectionMissRate function will be removed


Still runs

The evaluateDetectionMissRate function will be removed in a future release. Use the
evaluateObjectDetection function to evaluate object detection results with metrics such as the
log-average miss rate instead.

evaluateDetectionAOS function will be removed


Still runs

The evaluateDetectionAOS function will be removed in a future release. Use the
evaluateObjectDetection function to evaluate object detection results with metrics such as the
AOS instead.


TargetCategories name-value argument of focalCrossEntropy is not recommended


Still runs

The TargetCategories name-value argument of the focalCrossEntropy function is not
recommended; use the ClassificationMode name-value argument instead.

In your code, replace instances of TargetCategories="exclusive" and
TargetCategories="independent" with ClassificationMode="single-label" and
ClassificationMode="multilabel", respectively.


Structure from Motion and Visual SLAM

Monocular vSLAM: Implement complete feature-based monocular visual SLAM workflow with the monovslam object
The monovslam object and supporting object functions enable you to implement a complete visual
simultaneous localization and mapping (vSLAM) workflow.

The monovslam object functions enable you to extract ORB features from images and track the
features to estimate camera poses, identify key frames, and reconstruct a 3-D environment. The
vSLAM algorithm also searches for loop closures using the bag-of-features algorithm and optimizes
the camera poses using pose graph optimization. The monovslam object improves performance
outcomes, and supports C/C++ code generation for both host and non-host target platforms, which
you can use for hardware deployment of vSLAM code.

To use the monovslam object, you must have a Navigation Toolbox™ license.
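
A condensed sketch of this workflow; the camera intrinsics and the image datastore are placeholders, and the simple frame loop is illustrative rather than a complete pipeline:

    vslam = monovslam(intrinsics);              % intrinsics: cameraIntrinsics object
    for i = 1:numel(imds.Files)
        addFrame(vslam,readimage(imds,i));      % track ORB features, add key frames as needed
    end
    xyzPoints = mapPoints(vslam);               % reconstructed 3-D map points
    camPoses  = poses(vslam);                   % estimated key-frame camera poses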

Monocular vSLAM Example: Build a map of an indoor environment and estimate the trajectory of the camera
The “Monocular Visual Simultaneous Localization and Mapping” example shows two implementations
of processing image data from a monocular camera to build a map of an indoor environment and
estimate the trajectory of the camera.

Code Generation for Monocular vSLAM Example: Code generation for the Monocular vSLAM example
The “Code Generation for Monocular Visual Simultaneous Localization and Mapping” example shows
you how to use MATLAB Coder™ to generate C++ code for the Monocular Visual Simultaneous
Localization and Mapping example.

Monocular Visual-Inertial SLAM Example: Perform SLAM from monocular images with measurements obtained from IMU Sensor

The “Monocular Visual-Inertial SLAM” example demonstrates how to effectively perform SLAM by combining
images captured by a monocular camera with measurements obtained from an IMU sensor.

Visual SLAM with ROS Example: Build and deploy visual SLAM
algorithm with ROS in MATLAB
The “Build and Deploy Visual SLAM Algorithm with ROS in MATLAB” example shows you how to
implement a visual simultaneous localization and mapping (vSLAM) algorithm to estimate the camera
poses for the TUM RGB-D Benchmark data set.


Quaternions: Represent orientation and rotations efficiently for localization
The quaternion object enables efficient representation of orientation and rotations. You can convert
quaternions to other rotation formats, such as Euler angles and rotation matrices.
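
For example, a small sketch of common conversions:

    q = quaternion([45 0 0],"eulerd","ZYX","frame");   % from Euler angles in degrees
    R = rotmat(q,"frame");                             % 3-by-3 rotation matrix
    e = eulerd(q,"ZYX","frame");                       % back to Euler angles in degrees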

Previously, the quaternion object required Automated Driving Toolbox™.


Point Cloud Processing


Point Cloud Viewer: Rotate point cloud around selected point
Use the Rotate Around Point button in pcviewer to enable point cloud rotation around a selected
point of the display.

ICP Point Cloud Registration: Register two point clouds accounting for
color information
The pcregistericp function now enables you to take color information into account when
registering two point clouds using the ICP algorithm. Including color information refines the
registration and reduces drift in ICP point cloud registration. To include point cloud color
information for registration, specify the Metric name-value argument as
"pointToPlaneWithColor" or "planeToPlaneWithColor".

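For example, a minimal sketch with two colored point clouds, moving and fixed:

    tform = pcregistericp(moving,fixed,Metric="pointToPlaneWithColor");
    movingAligned = pctransform(moving,tform);
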
ICP Point Cloud Registration: Improved performance


The iterative closest point (ICP) point cloud registration function pcregistericp is now faster
when using the "pointToPoint" and "pointToPlane" metrics.

Functionality being removed or changed


pcregrigid function will be removed
Still runs

The pcregrigid function will be removed in a future release. Use the pcregistericp function to
perform ICP point cloud registration instead.

Extrapolate name-value argument of the pcregistericp function will be removed


Warns

The Extrapolate name-value argument of the pcregistericp function will be removed in a future
release. You can increase the value of the MaxIterations name-value argument, or use the
"pointToPlane" or "planeToPlane" values of the Metric name-value argument, for similar
accuracy.


Tracking and Motion Estimation

Tracking with Re-Identification Example: Track people in video sequence using deep learning ReID Network
The “Reidentify People Throughout a Video Sequence Using ReID Network” example shows you how
to track people throughout a video sequence using a re-identification (ReID) deep learning network.


Code Generation, GPU, and Third-Party Support

Generate C and C++ Code Using MATLAB Coder: Support for functions
These functions and objects now support C/C++ code generation for both host and non-host target
platforms.

• monovslam
• The "planeToPlane" value of the Metric name-value argument of the pcregistericp
function.

Generate CUDA code for NVIDIA GPUs using GPU Coder: Support for function
The pcregistercorr function now supports CUDA® code generation using GPU Coder™.

2

R2023a

Version: 10.4

New Features

Compatibility Considerations

Ground Truth Labeling

Pixel Label File Naming: Unique filenames for image files related to
pixel labels
Labelers now provide a naming scheme that associates pixel label filenames with their corresponding
image files. Once you create pixel labels, the labeler names each pixel label file as
Label_#_<original image filename>.png, where # indicates the file index and <original image
filename> is the name of the image file the pixel labels are based on.
For example:
Label_1_<original image filename>.png
Label_2_<original image filename>.png
Label_3_<original image filename>.png
...
Label_N_<original image filename>.png

Compatibility Considerations
If you export pixel labels from a session created prior to R2023a, the exported pixel label filenames
use the new naming scheme. You may need to modify any code that relies on pixel label filenames
created prior to R2023a.

Labeler Enhancements: Labeling interactions and other enhancements


This table describes enhancements for these labeling apps:

• Image Labeler
• Video Labeler
• Ground Truth Labeler (Automated Driving Toolbox)
• Lidar Labeler (Lidar Toolbox)
• Medical Image Labeler (Medical Imaging Toolbox)

Feature (availability shown as Image Labeler | Video Labeler | Ground Truth Labeler | Lidar Labeler | Medical Image Labeler):

• New Image Labeler interface (Yes | No | No | No | No)
• Use the Point ROI label to mark one or more keypoints in objects (Yes | Yes | Yes | No | No)
• New naming scheme for pixel label file names when you export ground truth data; for more details, see the Pixel Label File Naming: Unique filenames for image files related to pixel labels release note (Yes | Yes | Yes | No | No)
• Use the Smart Voxel tool to refine point cloud labeling by marking the foreground and background within a region of interest (No | No | No | Yes | No)
• Label a planar region within a point cloud by selecting any three points on the plane (No | No | No | Yes | No)
• Configure, preview, and export 2-D and 3-D animations of medical image data (No | No | No | No | Yes)
• View and navigate 2-D slices of a medical volume using crosshair navigation (No | No | No | No | Yes)


Recognition, Object Detection, and Semantic Segmentation

OCR: Recognize text using deep learning


Updated OCR language models for English, Japanese, seven-segment displays, and 62 other
languages now leverage deep learning. Computer Vision Toolbox also provides an OCR labeling and
training workflow for creating custom OCR language models. New functionalities include:

Labeling

• Image Labeler — Use the Image Labeler app to interactively label text in images.
• ocrTrainingData — Create OCR training data to use to train and evaluate an OCR model from
ground truth.

Training

• ocrTrainingOptions — Set OCR training options.


• trainOCR — Train an OCR model to recognize text in images.

Quantization and Evaluation

• quantizeOCR — Quantize the weights of an OCR model to create a faster, but less accurate,
model.
• evaluateOCR — Compute metrics to evaluate the quality of OCR results.

Text Recognition

• ocr — Updated OCR functionalities that include:

A new evaluation syntax to run an OCR model on a data set.

The Model name-value argument, enabling you to specify an OCR model, replaces the Language
name-value argument, and the LayoutAnalysis name-value argument, enabling you to specify
the type of layout analysis to perform for text segmentation, replaces the TextLayout name-value
argument. You can continue to use the Language and TextLayout arguments with their
corresponding value options.
• ocrText — New TextLines, TextLineBoundingBoxes, and TextLineConfidences
properties to support text lines.

Compatibility Considerations
The OCR Trainer app will be removed in a future release. Instead, use the Image Labeler app for
labeling and the trainOCR function for training.

Automated Visual Inspection: Train FastFlow and PatchCore anomaly detection networks
Use the fastFlowAnomalyDetector object to detect anomalies using a FastFlow model. You can
train the detector using the trainFastFlowAnomalyDetector function.


Use the patchCoreAnomalyDetector object to detect anomalies using a PatchCore model. You can
train the detector using the trainPatchCoreAnomalyDetector function.

This functionality requires the Computer Vision Toolbox Automated Visual Inspection Library and a
Deep Learning Toolbox license. You can install the Computer Vision Toolbox Automated Visual
Inspection Library from Add-On Explorer. For more information about installing add-ons, see Get and
Manage Add-Ons.

Automated Visual Inspection: Split anomaly data sets for training, validation, and testing
Use the splitAnomalyData function to partition an anomaly data set into training, validation, and
test datastores. You can create the datastores with normal images and optionally include anomalous
training images.

This functionality requires the Computer Vision Toolbox Automated Visual Inspection Library and a
Deep Learning Toolbox license.

Automated Visual Inspection: Load detectors for generating C, C++, and CUDA code
These functions enable you to load anomaly detectors in a format compatible with the generation of
C, C++, and CUDA code:

• vision.loadFastFlowAnomalyDetector
• vision.loadFCDDAnomalyDetector
• vision.loadPatchCoreAnomalyDetector

The predict, classify, and anomalyMap functions now support the generation of C, C++, and
CUDA code.


Point Cloud Processing

Point Cloud Viewer: View, navigate through, and interact with a large
point cloud
The pcviewer object provides these features:

• Rendering, performance, and interactions — Render and color a point cloud with a large (20–40
million) number of points. Smoothly rotate, pan, and zoom into the point cloud.
• Camera — View the point cloud in perspective or orthographic projection. Modify camera
parameters such as position, target, zoom, up-direction vector, and the view angle for perspective
projection. You can also restore a camera view, change the vertical axis, and choose a custom
camera line of sight.
• Color — Change point cloud color, color source, color map, and background color.
• Navigation controls — Navigate around a point cloud using two types of controls, orbital and first
person. Orbital control enables you to navigate around the point cloud. First-person control
enables you to navigate through the point cloud. You can use keyboard shortcuts and mouse
interactions to control these navigations modes.
• Data exploration — Determine the axis orientation of the point cloud. Toggle between axes on and
axes off. Set the display size of points, or switch to a zoomed display of the points.

Point Cloud Processing: Cylindrical filtering for point cloud data


Use the findPointsInCylinder object function of the cylinderModel to find points within a
cylindrical region of a point cloud.

ICP Point Cloud Registration: Exclude outlier points with distance thresholding
The pcregistericp function now includes these enhancements (a usage sketch follows this list):

• The new InlierDistance name-value argument enables you to specify the distance at which the
function considers matched nearest neighbor points to be inliers.
• The default value for the InitialTransform name-value argument is now a rigidtform3d
object that contains a translation aligning the centroid of the moving point cloud to that of the
fixed point cloud.
• Returns the root mean square error (RMSE) of the Euclidean distance between aligned point
clouds.
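
The sketch below shows the new argument; the distance value is illustrative:

    % Treat matched neighbors farther apart than 0.2 units as outliers (value is illustrative).
    [tform,movingReg,rmse] = pcregistericp(moving,fixed,InlierDistance=0.2);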

Compatibility Considerations
Starting in R2023a, the pcregistericp function now returns the RMSE of all points. Previously, the
function returned the RMSE of the inliers. The RMSE of the inliers is still available at each iteration
when you set the Verbose argument to true.

Starting in R2023a, the pcregistericp function now uses a modified default value for the
InitialTransform name-value argument. The default value is now a rigidtform3d object that
contains a translation aligning the centroid of the moving point cloud to that of the fixed point cloud.
Prior to R2023a, the default value was rigidtform3d().

Velodyne File Reader: Read GPS and IMU data from Velodyne PCAP
files
The velodyneFileReader object now reads the Global Positioning System (GPS) and the movement
inertial measurement unit (IMU) data from packet capture (PCAP) data files. The object returns the
positionData structure, which contains positional GPS, gyroscope, and accelerometer information,
along with the corresponding point indices and GPS timestamps, for each frame.

Cylinder and Sphere Geometric Models: Set color for parametric plots
Use the plot object function of the cylinderModel and the sphereModel objects to set the color
of the point cloud parametric model plot.


Code Generation, GPU, and Third-Party Support

Generate C and C++ Code Using MATLAB Coder: Support for functions
These functions and objects now support C/C++ code generation for both host and non-host target
platforms.

• indexImages function
• retrieveImages function
• evaluateImageRetrieval function
• invertedImageIndex object and its addImages, removeImages, and addImageFeatures
functions
• fcddAnomalyDetector object and its predict, classify, and anomalyMap functions

Generate CUDA code for NVIDIA GPUs using GPU Coder: Support for functions
These functions now support code generation using GPU Coder.

• fcddAnomalyDetector object and its predict, classify, and anomalyMap functions.


Computer Vision with Simulink

Computer Vision with Simulink: Visualize and navigate through a point cloud sequence
Starting in R2023a, you can use the Point Cloud Viewer block to:

• Visualize and navigate through streaming point cloud sequence using play, pause, step forward,
and stop simulation controls
• Limit the display of the 3-D point cloud sequence by setting the x-, y-, and z- axis limits
• Set the viewing plane to change the viewing angle of the point cloud data.
• Analyze point cloud data in the sequence

3

R2022b

Version: 10.3

New Features

Compatibility Considerations

Geometric Transformations

Geometric Transformations: New functionality uses premultiply matrix convention
Starting in R2022b, most Computer Vision Toolbox functions use the premultiply convention to create
and perform geometric transformations. The toolbox includes new functions that enable geometric
transformations using the premultiply convention.

The old geometric transformation functions that use a postmultiply convention are not recommended.
Although there are no plans to remove the old functions at this time, you can streamline your
geometric transformation workflows by switching to the new functions that use the premultiply
convention. For more information, see Migrate Geometric Transformations to Premultiply
Convention.

This table lists the new functions that use the premultiply convention and the corresponding
discouraged functions that use the postmultiply convention.

New Function (replaces Discouraged Function): Description

• cameraProjection (replaces cameraMatrix): Estimate camera projection matrix from intrinsic parameters and extrinsic parameters
• estimateCameraProjection (replaces estimateCameraMatrix): Estimate camera projection matrix from world-to-image point correspondences
• estimateExtrinsics (replaces extrinsics): Estimate extrinsic parameters from intrinsic parameters and world-to-image point correspondences
• estimateStereoRectification (replaces estimateUncalibratedRectification): Estimate geometric transformations for rectifying stereo images
• estgeotform2d (replaces estimateGeometricTransform2D): Estimate 2-D geometric transformation from matching point pairs
• estgeotform3d (replaces estimateGeometricTransform3D): Estimate 3-D geometric transformation from matching point pairs
• estrelpose (replaces relativeCameraPose): Estimate camera pose relative to another pose
• estworldpose (replaces estimateWorldCameraPose): Estimate camera pose in world coordinates
• extr2pose (replaces extrinsicsToCameraPose): Convert extrinsic parameters to camera pose
• img2world2d (replaces pointsToWorld): Determine world coordinates of image points
• pose2extr (replaces cameraPoseToExtrinsics): Convert camera pose to extrinsic parameters
• rotmat2vec3d (replaces rotationMatrixToVector): Convert 3-D rotation matrix to rotation vector
• rotvec2mat3d (replaces rotationVectorToMatrix): Convert 3-D rotation vector to rotation matrix
• world2img (replaces worldToImage): Project world points into image
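
As a small sketch of the premultiply workflow, estimating a 2-D similarity transformation from matched feature points (the matched point and image variables are placeholders):

    tform = estgeotform2d(matchedPoints1,matchedPoints2,"similarity");
    outputView = imref2d(size(fixedImage));
    registered = imwarp(movingImage,tform,OutputView=outputView);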

Geometric Transformations: Current functionality updated to support premultiply matrix convention
These objects and functions have been updated to support the premultiply matrix convention.

• bboxwarp
• bundleAdjustment
• bundleAdjustmentMotion
• bundleAdjustmentStructure
• cameraIntrinsics
• cameraParameters
• estimateCameraParameters
• estimateStereoBaseline
• findPose
• imageviewset and its object functions (addView, updateView, addConnection,
updateConnection, connectedViews, and poses)
• normalRotation
• pcalign
• pcregistercorr
• pcregistericp
• pcregistercpd
• pcregisterndt
• pctransform
• pcviewset and its object functions (addView, updateView, addConnection,
updateConnection, connectedViews, and poses)
• plotCamera
• readAprilTag
• rectifyStereoImages
• showExtrinsics
• stereoParameters
• triangulate
• triangulateMultiview


Ground Truth Labeling

Labeler Enhancements: Labeling interactions and other enhancements


This table describes enhancements for these labeling apps:

• Image Labeler
• Video Labeler
• Ground Truth Labeler (Automated Driving Toolbox)
• Lidar Labeler (Lidar Toolbox)
• Medical Image Labeler (Medical Imaging Toolbox) — Introduced in R2022b

Feature (availability shown as Image Labeler | Video Labeler | Ground Truth Labeler | Lidar Labeler | Medical Image Labeler):

• The cuboid2img function returns rectangular projected cuboids to create data compatible with the labeling apps (Yes | Yes | Yes | No | No)
• Automate projected cuboid labeling with the temporal interpolator automation algorithm (No | Yes | Yes | No | No)
• The boxLabelDatastore object, the gatherLabelData (Automated Driving Toolbox) function, and the objectDetectorTrainingData function, used in creating training data for object detection, now support 2-D projected cuboid labels (Yes | Yes | Yes | No | No)
• Visualize the color information of the point cloud using the updated colormap (No | No | No | Yes | No)
• Add background color for the point cloud (No | No | Yes | Yes | No)
• Visualize the XY, YZ, and ZX views of the point cloud (No | No | Yes | Yes | No)
• Label 2-D and 3-D medical images for semantic segmentation (No | No | No | No | Yes)


Recognition, Object Detection, and Semantic Segmentation

3-D Object Detection: Training data and visualization support for 3-D
object detection
• Use the cuboid2img function to project 3-D cuboids into an image using the image data and
camera intrinsic parameters specified by a projection matrix. You can use the showShape,
insertShape, and the insertObjectAnnotation functions to visualize the projected cuboids.
The cuboid2img function returns rectangular projected cuboids to create data compatible with
the Image Labeler, Video Labeler, and Ground Truth Labeler (Automated Driving Toolbox) apps.
• The boxLabelDatastore object, gatherLabelData (Automated Driving Toolbox) function, and
objectDetectorTrainingData object used in creating training data for object detection, now
support 2-D projected cuboid labels.
• The objectDetectorTrainingData function now returns extracted attributes and sublabels as
a third output. The attributes and sublabels are packaged as an array datastore.

Automated Visual Inspection: Train FCDD anomaly detector using the Automated Visual Inspection Library
Perform anomaly detection and evaluate anomaly classification results by using the functions in the
Computer Vision Toolbox Automated Visual Inspection Library. For more details, see Getting Started
with Anomaly Detection Using Deep Learning.

Use the fcddAnomalyDetector object to detect anomalies using a fully convolutional data
description (FCDD) network. You can train an FCDD network using the
trainFCDDAnomalyDetector function. Tune the network by finding the optimal anomaly threshold
for a set of anomaly scores and corresponding labels using the anomalyThreshold function.

Use the object functions of the fcddAnomalyDetector object for training and inference workflows.
The classify function performs a binary classification of an image as normal or anomalous. The
predict function calculates a single anomaly score for an image. The anomalyMap function
calculates a per-pixel anomaly score for an image.
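
A brief inference sketch; the trained detector and the test image are placeholders:

    score = predict(detector,I);       % scalar anomaly score
    tf    = classify(detector,I);      % true for anomalous, false for normal
    heat  = anomalyMap(detector,I);    % per-pixel anomaly scores
    imshow(anomalyMapOverlay(I,heat))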

Use the evaluateAnomalyDetection function to evaluate the quality of the anomaly detection
results using metrics such as the confusion matrix and average precision. The
anomalyDetectionMetrics object stores the metrics.

Use the function anomalyMapOverlay to overlay per-pixel anomaly scores as a heatmap on a
grayscale or RGB background image. Use the viewAnomalyDetectionResults function to open an
interactive figure window that enables you to explore the classification results and anomaly score
maps for a set of images.

This functionality requires the Computer Vision Toolbox Automated Visual Inspection Library and
Deep Learning Toolbox. You can install the Computer Vision Toolbox Automated Visual Inspection
Library from Add-On Explorer. For more information about installing add-ons, see Get and Manage
Add-Ons.


Deep Learning Instance Segmentation: Evaluate segmentation results


Use the evaluateInstanceSegmentation function to evaluate the quality of instance
segmentation results using metrics such as the confusion matrix and average precision. The
instanceSegmentationMetrics object stores the metrics.

Deep Learning Instance Segmentation: Segment objects in image datastores
The segmentObjects function now supports specifying test images as a datastore, such as an
imageDatastore or a CombinedDatastore. New name-value arguments enable more options for
performing instance segmentation of images in a datastore.
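
A minimal sketch, assuming a trained instance segmentation detector and a folder of test images; the folder name and the Threshold value are illustrative:

    imdsTest  = imageDatastore("testImages");
    dsResults = segmentObjects(detector,imdsTest,Threshold=0.5);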

Semantic Segmentation: Write output files to specified unique folder


The semanticseg function now writes output files to the folder specified by the WriteLocation
and OutputFolderName name-value arguments as <WriteLocation>/<OutputFolderName>.

Compatibility Considerations
Prior to R2022b, the function wrote output files directly into the location specified by
WriteLocation. To get the same results as in previous releases, set the OutputFolderName name-
value argument to "".

Image Classification: Create scene label training data


The sceneLabelTrainingData function creates an image datastore from scene-labeled ground
truth data, which you can use to train a classification or anomaly detection network.

YOLO v2 Object Detector: Detect objects in image using pretrained YOLO v2 deep learning networks
You can now use the yolov2ObjectDetector object to create an object detector from pretrained
YOLO v2 deep learning networks. Then, use the detect object function to detect objects in a test
image by using the pretrained YOLO v2 object detector.

To create a pretrained YOLO v2 object detector, use the pretrained DarkNet-19 and tiny YOLO v2
deep learning networks in the Computer Vision Toolbox Model for YOLO v2 Object Detection support
package. These networks are trained on the COCO data set.

You can download the Computer Vision Toolbox Model for YOLO v2 Object Detection from the Add-On
Explorer. For more information, see Get and Manage Add-Ons.
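
A short sketch; the pretrained model name "darknet19-coco" is an assumption about the names provided by the support package:

    detector = yolov2ObjectDetector("darknet19-coco");
    [bboxes,scores,labels] = detect(detector,I);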

YOLO v3 Object Detector: Computer Vision Toolbox includes yolov3ObjectDetector object and its object functions

Starting in R2022b, Computer Vision Toolbox includes the yolov3ObjectDetector object and all its
object functions. Use of this object and its object functions does not require installation of the
Computer Vision Toolbox Model for YOLO v3 Object Detection add-on.


Deep Learning Object Detector Block: Support for YOLO v3 and YOLO
v4 object detectors
The Deep Learning Object Detector block now supports YOLO v3 and YOLO v4 object detectors. You
can use this block to load yolov3ObjectDetector and yolov4ObjectDetector objects into a
Simulink® model to perform object detection.

Seven-Segment Digit Recognition: Recognize digits in a seven-segment display
You can now use the ocr function to recognize the digits in a seven-segment display image. To
identify the seven-segment digits, specify the value of the Language name-value argument as
"seven-segment".

The Recognize Seven-Segment Digits Using OCR example shows how to detect and recognize seven-
segment digits in an image by using optical character recognition (OCR).
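
For example, a minimal sketch with a seven-segment display image I:

    results = ocr(I,Language="seven-segment");
    recognizedDigits = results.Text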

Object Detection Example: Object detection on large satellite imagery using deep learning
The Object Detection In Large Satellite Imagery Using Deep Learning example shows you how to
perform object detection on a large satellite image using a pretrained SSD object detector.

Functionality Being Removed or Changed


yolov2ReorgLayer function being removed in future release
Warns

The yolov2ReorgLayer function will be removed in a future release. Use the spaceToDepthLayer
function to add a reorganization layer to the YOLO v2 deep learning network instead.

To update your code, replace all instances of the yolov2ReorgLayer function with the
spaceToDepthLayer function.


Structure from Motion and Visual SLAM

Visual SLAM: Query indirectly connected camera views in an imageviewset
The connectedViews object function of the imageviewset object includes a new name-value
argument, MaxDistance. Use MaxDistance to find indirectly connected views by setting it to a
value greater than 1.
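
For example, a sketch that retrieves views within two connections of a given view (vSet and viewId are placeholders):

    nearbyViews = connectedViews(vSet,viewId,MaxDistance=2);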

Visual SLAM: Store and manage additional attributes of 3-D world points in a worldpointset
The worldpointset object includes additional properties and object functions to manage the
relationship between 3-D world points and camera views.

Properties:

• ViewingDirection — Provides an estimate of the view angle from which a 3-D point can be
observed.
• DistanceLimits — Estimates what 3-D world points can potentially be observed when a new
camera is introduced into a visual SLAM scenario.
• RepresentativeViewId — Representative view ID of the world points.
• RepresentativeFeatureIndex — Index of the feature associated with the representative view.

Object functions:

• updateLimitsAndDirection — Update mean viewing direction and distance range.


• updateRepresentativeView — Update major view ID and major feature index.

Epipolar Line: epipolarLine function accepts feature point objects


The epipolarLine function accepts feature point objects. For more information on feature point
objects, see Point Feature Types.


Point Cloud Processing

Point Cloud Reading: Velodyne file reader enhancements


The velodyneFileReader object supports reading point cloud data from a Velarray H800
Velodyne® device model. The readFrame function of the velodyneFileReader object additionally
returns the timestamps for all the points in the point cloud as a duration vector or matrix.

Point Cloud Registration: Generalized-ICP registration algorithm added to pcregistericp
The pcregistericp function now includes the Generalized-ICP (G-ICP) registration algorithm, also
known as the plane-to-plane algorithm. This algorithm can provide greater accuracy and robustness
to ICP point cloud registration. To use this algorithm, specify the Metric name-value argument as
"planeToPlane".

Compatibility Considerations
The default value for the InitialTransform name-value argument is now a rigidtform3d object
that contains a translation aligning the center of the moving point cloud to that of the fixed point
cloud. Prior to R2023a, the default value was a rigidtform3d() object.

The pcregistericp function now returns the RMSE of all points. Previously, the function returned
the RMSE of the inliers. The RMSE of the inliers is still available at each iteration when you set the
Verbose argument to true.

Point Cloud Processing: Support for 16-bit color in point clouds


The Color property for the pointCloud object now supports the uint16 data type.

Point Cloud Processing: Generate point cloud from depth image


Use the pcfromdepth function to convert a depth image taken from an RGB-D camera to a point
cloud using camera intrinsic parameters.
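
A minimal sketch, assuming a depth image, a scale factor that converts pixel values to world units, and cameraIntrinsics from the RGB-D sensor; the argument order reflects my understanding of the signature:

    ptCloud = pcfromdepth(depthImage,depthScaleFactor,intrinsics);
    pcshow(ptCloud)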

Point Cloud Visualization: Point cloud viewer control and interaction enhancements
Added support for these point cloud viewer controls and interactions for the pcshow, pcshowpair,
pcplayer functions:

• Keyboard shortcut to switch between rotate and pan.


• Specify perspective or orthographic projection view using the Projection name-value argument.
• Navigate through a point cloud scene using keyboard shortcuts. Navigation capabilities include
moving left, right, forward, and back, looking around the scene, and rotating and rolling the
scene.


• Configure point cloud viewer with a specific view plane, color source, axes visibility, and axes
projection programmatically using the ViewPlane, ColorSource, AxesVisibility, and the
Perspective name-value arguments, respectively.
• Change the view plane, background color, axes visibility, projection, and vertical axis by using the
new axes toolbar options.

Compatibility Considerations
The default projection for the camera 3-D view display has been changed from orthographic to
perspective. You can use the new Projection name-value argument to specify the projection as
"perspective" or "orthographic". Prior to R2022b, the pcshow display rendered the point cloud
in an orthographic view only.

vision.VideoFileReader system object removed in future release

Compatibility Considerations
The vision.VideoFileReader system object will be removed in a future release. Use the
VideoReader object instead.


Code Generation, GPU, and Third-Party Support

Generate C and C++ Code Using MATLAB Coder: Support for functions
These functions now support C/C++ code generation for host target platforms.

• bundleAdjustment
• bundleAdjustmentMotion
• bundleAdjustmentStructure
• pcfromdepth
• pcregistericp for the pointToPlane and the pointToPoint metrics.

The bagOfFeatures object and the encode function now support C/C++ code generation for both
host and non-host target platforms.

Generate CUDA code for NVIDIA GPUs using GPU Coder: Support for function
This function now supports code generation using GPU Coder.

• pcfromdepth


Computer Vision with Simulink

Computer Vision with Simulink: Retrieve Simulink image attributes


Starting in R2022b, you can use the Image Attributes block to retrieve image attributes from a
Simulink image signal to:

• Display image attributes for debugging purposes


• Use image attributes in the control or data flow of an image processing algorithm

Add the Image Attributes block from the Simulink Library Browser window in the Computer Vision
Toolbox > Utilities library. The output ports of the Image Attributes block output these attributes of
a signal that is of Simulink.ImageType data type:

• rows — Number of rows in image data


• columns — Number of columns in image data
• channels — Number of color channels or samples for each pixel in array
• class — Data type of underlying image data
• color — Color format of underlying image data
• layout — Array layout of image data

Image From Workspace block default value changes

Compatibility Considerations
This release changes the default value for the Image From Workspace block to rand(100,100,3).
Previously, the default value was checker_board.

4

R2022a

Version: 10.2

New Features

Compatibility Considerations

Ground Truth Labeling

Labeler Enhancements: Labeling interactions and other enhancements


The following table describes enhancements for these labeling apps:

• Image Labeler
• Video Labeler
• Ground Truth Labeler
• Lidar Labeler (Lidar Toolbox)

Enhancement (availability shown as Image Labeler | Video Labeler | Ground Truth Labeler | Lidar Labeler):

• Draw, visualize, and export semantic labels in the point cloud (No | No | No | Yes)
• Create training data for object detection in point clouds by using the lidarObjectDetectorTrainingData function (No | No | No | Yes)
• Import multiframe DICOM images (Yes | No | No | No)
• Create 3-D line ROI for point cloud data (No | No | Yes | Yes)
• Create voxel ROI for point cloud data (No | No | No | Yes)
• Show or hide pixel labels in a labeled image or video (Yes | Yes | Yes | No)


Feature Detection and Extraction

Functionality Being Removed or Changed: Support for GPU removed for detectFASTFeatures function
The detectFASTFeatures function no longer supports GPU. You can use the
detectHarrisFeatures function on the GPU to detect corner points instead. Though
detectHarrisFeatures does not provide identical results, it might be suitable based on the
application.


Recognition, Object Detection, and Semantic Segmentation

Deep Learning Instance Segmentation: Train Mask R-CNN networks


The trainMaskRCNN function trains a Mask R-CNN network represented by a maskrcnn object. You
can perform transfer learning on a pretrained Mask R-CNN network. This functionality requires the
Computer Vision Toolbox Model for Mask R-CNN Instance Segmentation and a Deep Learning
Toolbox license.

YOLO v4 Object Detector: Detect objects in image using YOLO v4 deep learning network
The yolov4ObjectDetector object creates a YOLO v4 object detector to detect objects in images.
You can either create a pretrained YOLO v4 object detector or a custom YOLO v4 object detector.
Then, use the detect object function to detect objects in a test image by using the trained YOLO v4
object detector. To train a YOLO v4 object detection network, use the
trainYOLOv4ObjectDetector function.

Create Pretrained YOLO v4 Object Detector

• To create a pretrained YOLO v4 object detector, use the pretrained DarkNet-53 and tiny YOLO v4
deep learning networks in the Computer Vision Toolbox Model for YOLO v4 Object Detection
support package. These networks are trained on the COCO data set.

You can download the Computer Vision Toolbox Model for YOLO v4 Object Detection from the
Add-On Explorer. For more information, see Get and Manage Add-Ons.
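
For example, a minimal pretrained-inference sketch; "csp-darknet53-coco" is one of the support package model names as I understand them:

    detector = yolov4ObjectDetector("csp-darknet53-coco");
    [bboxes,scores,labels] = detect(detector,I);
    imshow(insertObjectAnnotation(I,"rectangle",bboxes,string(labels)))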

Create Custom YOLO v4 Object Detector

• You can create a custom YOLO v4 object detector by using an untrained or pretrained YOLO v4
deep learning network.
• You can also create a custom YOLO v4 object detector by using a base network for feature
extraction. The yolov4ObjectDetector object adds detection heads to the feature extraction
layers of the base network and creates a YOLO v4 object detector. You must specify the source
layers to which to add the detection heads.

The Object Detection Using YOLO v4 Deep Learning example shows how to create and train a
custom YOLO v4 object detector for performing object detection.

SSD Object Detector Enhancements: Create pretrained or custom SSD object detector

Starting in R2022a, use the ssdObjectDetector object to create a pretrained or custom SSD object
detection network. Then, use the detect object function to detect objects in a test image by using
the trained SSD object detector. To train an SSD object detector, use the trainSSDObjectDetector
function.


Text Detection: Detect natural scene texts using CRAFT deep learning
model
The detectTextCRAFT function detects text in natural scene images by using a pretrained character
region awareness for text detection (CRAFT) model. The pretrained CRAFT model can detect text in
these nine languages: Chinese, Japanese, Korean, Italian, English, French, Arabic, German, and
Bangla (Indian).
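
For example, a small sketch with a natural scene image I:

    bboxes = detectTextCRAFT(I);
    imshow(insertShape(I,"rectangle",bboxes,LineWidth=3))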

Automated Visual Inspection: Use anomaly detection and classification techniques
These examples show how to use anomaly detection and classification techniques.

• The Classify Defects on Wafer Maps Using Deep Learning example classifies manufacturing
defects on wafer maps.
• The Detect Image Anomalies Using Explainable One-Class Classification Neural Network example detects and localizes anomalies using a one-class classification neural network.
• The Detect Image Anomalies Using Pretrained ResNet-18 Feature Embeddings example trains a
similarity-based anomaly detector using one-class learning of feature embeddings extracted from
a pretrained ResNet-18 convolutional neural network.

For more details, see Getting Started with Anomaly Detection Using Deep Learning.

Bounding Box Coordinates: Data augmentation for object detection using spatial coordinates
The bboxresize, bboxcrop, bboxwarp, and showShape functions assume the input bounding box
coordinates for axis-aligned rectangles are specified in spatial coordinates and return the
transformed bounding boxes in spatial coordinates.

Use non-integer coordinates to specify a bounding box using the bboxresize, bboxcrop,
bboxwarp, or showShape function.
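
For example, a minimal sketch that resizes an image and its bounding box together; the box values here are arbitrary non-integer spatial coordinates:

    I = imread("peppers.png");
    bboxA = [20.5 30.5 50 60];        % [x y width height] in spatial coordinates
    scale = 2;
    J = imresize(I,scale);
    bboxB = bboxresize(bboxA,scale);  % the box scales with the image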

Functionality being removed or changed


anchorBoxLayer object being removed in future release
Still runs

The anchorBoxLayer object will be removed in a future release. Use the ssdObjectDetector object to specify the anchor boxes for training a single shot detector (SSD) multibox object detection network instead.

ssdLayers function being removed in future release


Still runs

The ssdLayers function will be removed in a future release. Use the ssdObjectDetector object to create an SSD multibox object detector instead.

pixelLabelImageDatastore object being removed in future release


Errors

The pixelLabelImageDatastore object will be removed in a future release. Use the imageDatastore and pixelLabelDatastore objects and the combine function instead.

Support for LayerGraph object as input to trainSSDObjectDetector function being removed in future release
Still runs

Starting in R2022a, using a LayerGraph (Deep Learning Toolbox) object to specify the SSD object detection network as input to the trainSSDObjectDetector function is not recommended and will be removed in a future release.

If your SSD object detection network is a LayerGraph (Deep Learning Toolbox) object, configure the network as an ssdObjectDetector object by using the ssdObjectDetector function. Then, use the ssdObjectDetector object as input to the trainSSDObjectDetector function for training.

Experiment Manager Example: Find optimal training options


The Experiment Manager (Deep Learning Toolbox) app enables you to create deep learning
experiments to train object detectors under multiple initial conditions and compare the results. The
Train Object Detectors in Experiment Manager example uses the Experiment Manager (Deep
Learning Toolbox) to find optimal training options for object detectors.

Multiclass Object Detection Example: Train multiclass object detector using deep learning
The Multiclass Object Detection Using YOLO v2 Deep Learning example shows you how to train a
YOLO v2 multiclass object detector.

Datastore Support: Use datastores with ACF and cascade object detectors

Use datastores with the trainACFObjectDetector and trainCascadeObjectDetector functions, as well as with the detect object function of the acfObjectDetector object and the step object function of the vision.CascadeObjectDetector object.


Structure from Motion and Visual SLAM

Stereo Vision Rectification Parameters: Access stereo camera rectification parameters
Use these enhancements to the rectifyStereoImages and reconstructScene functions to
access stereo camera rectification parameters.

• The rectifyStereoImages function can now return a reprojection matrix for reprojecting a 2-D point in a disparity map to a 3-D point. The function can also return the camera projection matrices for the rectified cameras from a pair of stereo images. You can use these matrices to project 3-D world points from the coordinate system of the primary camera into the image plane of the rectified images.
• The reconstructScene function can now reconstruct a 3-D scene from a disparity map and a reprojection matrix.
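
A minimal sketch of the new outputs, assuming stereoParams is a stereoParameters object for your calibrated stereo pair and I1 and I2 are the corresponding images:

    % Rectify, compute disparity, then reconstruct 3-D points from the reprojection matrix.
    [J1,J2,reprojectionMatrix] = rectifyStereoImages(I1,I2,stereoParams);
    disparityMap = disparitySGM(im2gray(J1),im2gray(J2));
    xyzPoints = reconstructScene(disparityMap,reprojectionMatrix);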

Bundle Adjustment Data Management: Integration of bundle adjustment functions with data management objects
Use these bundle adjustment enhancements with data management objects:

• Use the bundleAdjustment function to refine world points stored in a worldpointset and camera view poses stored in an imageviewset.
• Use the bundleAdjustmentStructure function to refine world points in a worldpointset.

Visual SLAM Example: Process image data from RGB-D camera to build dense map
The Visual SLAM with an RGB-D Camera example shows you how to process image data from an
RGB-D camera to build a dense map of an indoor environment and estimate the trajectory of the
camera. The example uses ORB-SLAM2, which is a feature-based vSLAM algorithm that supports
RGB-D cameras.

Functionality Being Removed or Changed: Support for GPU removed for disparityBM function
The disparityBM function no longer supports GPU. You can use the enhanced disparitySGM function on the GPU instead. The disparitySGM function does not produce identical results, but its semi-global matching algorithm is normally recommended over the block-matching algorithm in the disparityBM function.


Point Cloud Processing

Point Cloud Preprocessing: Preserve organized structure of point cloud
Use the PreserveStructure name-value argument of the pcdownsample and pcdenoise
functions to preserve the organized structure of a point cloud. Additionally, the pcdownsample
function now returns the linear indices of points in the downsampled point cloud.
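
A minimal sketch, assuming ptCloud is an organized point cloud with an M-by-N-by-3 Location array; pcdownsample accepts the same name-value argument:

    % Denoise while keeping the organized M-by-N grid structure.
    ptCloudDenoised = pcdenoise(ptCloud,"PreserveStructure",true);
    disp(size(ptCloudDenoised.Location))   % still M-by-N-by-3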

Velodyne Organized Point Cloud Support: Velodyne file reader can return organized point clouds
Use the OrganizePoints property of the velodyneFileReader object to return an organized
point cloud.
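
A minimal sketch, assuming the PCAP file name and device model below stand in for your own recording and that the property can be set as a name-value argument at construction:

    veloReader = velodyneFileReader("lidarData.pcap","VLP16","OrganizePoints",true);
    ptCloud = readFrame(veloReader,1);   % frames are returned as organized point clouds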

Point Cloud: uint16 data type support for intensity


The pointCloud object now supports the uint16 data type for the Intensity property.


Code Generation, GPU, and Third-Party Support

OpenCV Interface: Integrate OpenCV version 4.5.0 projects with MATLAB
Integrate OpenCV projects with MATLAB using OpenCV version 4.5.0.

Generate C and C++ Code Using MATLAB Coder: Support for functions
These functions now support portable C code generation for non-host target platforms.

• pcdownsample (only when using the grid average downsample method)
• pcmerge
• pcregisterndt

These functions and objects now support C/C++ code generation for both host and non-host target
platforms.

• pointTrack
• worldToImage
• scanContextLoopDetector
• pcregistericp
• pointsToWorld
• selectStrongestBboxMulticlass
• undistortPoints
• worldpointset
• bboxOverlapRatio
• copy function of the pointCloud object.
• pcviewset object and its functions, excluding the optimizePoses function.
• imageviewset object and its functions, excluding the optimizePoses function.
• The optimizePoses function of the imageviewset and pcviewset objects supports code generation for host target platforms.
• yolov3ObjectDetector object and its detect function.
• yolov4ObjectDetector object and its detect function.

Generate CUDA code for NVIDIA GPUs using GPU Coder: Support for functions
These functions now support code generation using GPU Coder.

• yolov3ObjectDetector object and its detect function.
• yolov4ObjectDetector object and its detect function.


Functionality being removed or changed


GPU support removed for OpenCV interface with MATLAB
Errors

The MATLAB functions in the Computer Vision Toolbox Interface for OpenCV in MATLAB support package no longer support GPU.


Computer Vision with Simulink

Computer Vision with Simulink: Specify image data type in Simulink model
Prior to R2022a, Computer Vision Toolbox supported importing an image into a Simulink model only
as matrix data. In some cases, Simulink implemented the imported image as separate signals (for
example, separate R, G, and B signals). In other cases, it was implemented as a single signal.
Simulink did not have a universal standard to model the image data across Simulink blocks and
MATLAB functions. You needed the Computer Vision Toolbox Interface for OpenCV in Simulink to
import the images specified as Simulink.ImageType data type in your OpenCV code.

In R2022a, you can now use Computer Vision Toolbox to directly specify an image of the
Simulink.ImageType data type and generate code for the model. Specify this data type for signals
and other data in your model. The Simulink.ImageType data type is an encapsulated object that
defines an image with fixed meta-attributes specific to this data type.

To specify a signal as the Simulink.ImageType data type for a supported block, on the Signal Attributes tab, specify Data type, or the equivalent parameter for the block, as Simulink.ImageType(480,640,3,'ColorFormat','RGB','Layout','ColumnMajor','ClassUnderlying','uint8'). This specification assigns a default Simulink.ImageType data type to the signal. You can also use the Data Type Assistant to customize the image attributes. For more information, see Specify an Image Data Type (Simulink).

To integrate images of the Simulink.ImageType data type in existing image processing algorithms
that operate on matrix data only, use the new To Simulink Image and From Simulink Image blocks to
convert the image data to and from the Simulink.ImageType data type, respectively. For more
information, see Track Marker Using Simulink Images.

If your model includes image signals of the Simulink.ImageType data type, you can use the Color
Space Conversion block to convert color information between color spaces by setting the Conversion
parameter to R'G'B' to intensity or one of these new options:

• R'G'B' to B'G'R'
• B'G'R' to R'G'B'
• B'G'R' to intensity

R2021b

Version: 10.1

New Features

Ground Truth Labeling

Labeler Enhancements: Labeling interactions and other enhancements


The following list describes enhancements for these labeling apps and the apps that support each one:

• Image Labeler
• Video Labeler
• Ground Truth Labeler
• Lidar Labeler (Lidar Toolbox)

• Show or hide labels and sublabels of type Rectangle, Line, Polygon, and Projected cuboid in a labeled image or video. (Image Labeler: Yes, Video Labeler: Yes, Ground Truth Labeler: Yes, Lidar Labeler: No)
• Show or hide labels of type Cuboid in a labeled point cloud or point cloud sequence. (Image Labeler: No, Video Labeler: No, Ground Truth Labeler: Yes, Lidar Labeler: Yes)
• View and edit cuboid ROI labels using top, side, and front 2-D view projections by selecting Projected View. (Image Labeler: No, Video Labeler: No, Ground Truth Labeler: Yes, Lidar Labeler: Yes)
• Segment ground from lidar data using the simple morphological filter (SMRF) algorithm. For more information about the algorithm parameters, see the segmentGroundSMRF (Lidar Toolbox) function. (Image Labeler: No, Video Labeler: No, Ground Truth Labeler: Yes, only with Lidar Toolbox™ license, Lidar Labeler: Yes)
• Extract video scenes and corresponding labels from a groundTruth or groundTruthMultisignal object. (Image Labeler: No, Video Labeler: Yes, Ground Truth Labeler: Yes, Lidar Labeler: No)
• Digital Imaging and Communication in Medicine (DICOM) image format. (Image Labeler: Yes, Video Labeler: No, Ground Truth Labeler: No, Lidar Labeler: No)


Feature Detection and Extraction

SIFT Feature Detector: Scale-invariant feature transform detection and feature extraction

Use the detectSIFTFeatures function to detect scale-invariant features. You can use the SIFTPoints object to store SIFT feature keypoints. In R2021b, the extractFeatures function also supports the SIFT descriptor.
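
For example, a minimal sketch of detection followed by extraction on a grayscale image:

    I = im2gray(imread("cameraman.tif"));
    points = detectSIFTFeatures(I);                       % returns a SIFTPoints object
    [features,validPoints] = extractFeatures(I,points);   % SIFT descriptors for the keypoints
    imshow(I); hold on
    plot(validPoints.selectStrongest(50))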


Recognition, Object Detection, and Semantic Segmentation

Experiment Manager App Support: Track progress of deep learning object detector training
Use the Experiment Manager (Deep Learning Toolbox) app to track the progress of deep learning
object detector training. Set the ExperimentMonitor name-value argument in the Computer Vision
Toolbox training functions to specify an experiments.Monitor (Deep Learning Toolbox) object to
use with the app. You can use the app with these training functions:

• trainRCNNObjectDetector
• trainFastRCNNObjectDetector
• trainFasterRCNNObjectDetector
• trainSSDObjectDetector
• trainYOLOv2ObjectDetector

Deep Learning Activity Recognition: Video classification using deep learning
Analyze, classify, and track activity contained in visual data sources, such as a video stream, using
deep learning techniques. Use the Inflated-3D, SlowFast, or R(2+1)D video classifiers with their
supporting software packages.

You can install software packages from the Add-On Explorer. For more information about installing add-ons, see Get and Manage Add-Ons. To run these functionalities, you need Deep Learning Toolbox.


• The inflated3dVideoClassifier model contains two subnetworks: the video network and the
optical flow network. These networks are trained on the Kinetics-400 data set with RGB data and
optical flow data, respectively. This functionality requires the Computer Vision Toolbox Model for
Inflated-3D Video Classification.
• The slowFastVideoClassifier model is pretrained on the Kinetics-400 data set and uses the residual network ResNet-50 as its backbone architecture, with slow and fast pathways. This functionality requires the Computer Vision Toolbox Model for SlowFast Video Classification.
• The r2plus1dVideoClassifier model is pretrained on the Kinetics-400 data set and contains 18 spatio-temporal (ST) residual layers. This functionality requires the Computer Vision Toolbox Model for R(2+1)D Video Classification.

For more details, see Getting Started with Video Classification Using Deep Learning.

Create Training Data for Video Classifier: Extract video clips for labeling and training workflow
Use the sceneTimeRanges and writeVideoScenes functions to extract video clips and their corresponding labels from ground truth data.

Deep Learning Instance Segmentation: Create and configure pretrained Mask R-CNN neural networks
Create a Mask R-CNN instance segmentation object detector using the maskrcnn object. This
network is trained on the COCO dataset with a ResNet-50 network as the feature extractor. You can
detect objects in an image using the pretrained network or you can configure the network to perform
transfer learning. This functionality requires the Computer Vision Toolbox Model for Mask R-CNN
Instance Segmentation and Deep Learning Toolbox.

For more details, see Getting Started with Mask R-CNN for Instance Segmentation.

Deep Learning ROI Pooling: Nonquantized ROI pooling


Use the roialign function for nonquantized ROI pooling of dlarray (Deep Learning Toolbox) data.

Deep Learning Object Detector Block: Simulate and generate code for deep learning object detectors in Simulink
Simulate and generate code for deep learning object detectors in Simulink. The Analysis &
Enhancement block library now includes the Deep Learning Object Detector block. This block
predicts bounding boxes, class labels, and scores for the input image data by using a specified trained
object detector. This block enables you to load a pretrained object detector into the Simulink model
from a MAT file or from a MATLAB function.

For more information about working with the Deep Learning Object Detector block, see Lane and
Vehicle Detection in Simulink Using Deep Learning (Deep Learning Toolbox). To learn more about
generating code for Simulink models containing the Deep Learning Object Detector block, see Code
Generation for a Deep Learning Simulink Model that Performs Lane and Vehicle Detection (GPU
Coder).


Pretrained Deep Learning Models on GitHub: Perform object detection and segmentation using latest pretrained models on GitHub
The MATLAB Deep Learning GitHub repository provides the latest pretrained deep learning networks
to download and use for performing object detection and segmentation. For example:

• To perform object detection by using the pretrained you-only-look-once (YOLO) v4 deep learning
network, see the Object Detection Using Pretrained YOLO v4 Deep Learning Network GitHub
repository. You can download the model and use it to perform out-of-the-box inference.
• To perform scene text detection by using a pretrained character region awareness for text
detection (CRAFT) model, see the Pretrained Character Region Awareness For Text Detection
Model GitHub repository.
• To perform segmentation by using a pretrained DeepLabv3+ network, see the Pretrained
DeepLabv3+ Semantic Segmentation Network GitHub repository.


Camera Calibration

Camera Calibration: Circle grid calibration pattern detection


Support is available for asymmetric and symmetric circle grid patterns in the Camera Calibrator and
Stereo Camera Calibrator apps or programmatically when you use the detectCircleGridPoints
and generateCircleGridPoints functions.

For more information about camera calibration, using the calibration app, and circle grid patterns,
see Using the Single Camera Calibrator App, Select Calibration Pattern and Set Properties, and
Calibration Patterns.

Camera Calibration: Custom pattern detection


The Camera Calibrator and the Stereo Camera Calibrator apps support calibration using custom
patterns. You can create a custom detector and load it into the app as one of the selectable pattern
choices. To see how to use a custom calibration pattern, see the Camera Calibration Using AprilTag
Markers example.

Rigid 3-D Support: Pass rigid 3-D transformation object to calibration functions
Pass a rigid3d object as an input to the cameraPoseToExtrinsics and
extrinsicsToCameraPose functions to return the output transformations as rigid3d objects.

OpenCV Camera Parameters: Relay camera intrinsics and stereo parameters to and from OpenCV
Use OpenCV camera intrinsics and stereo parameters with Computer Vision Toolbox by using these
functions:

• cameraIntrinsicsFromOpenCV
• cameraIntrinsicsToOpenCV
• stereoParametersFromOpenCV
• stereoParametersToOpenCV
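
For example, a minimal sketch that converts OpenCV-style calibration results into a MATLAB intrinsics object; the numeric values below are placeholders:

    cameraMatrix = [800 0 320; 0 800 240; 0 0 1];    % OpenCV 3-by-3 intrinsic matrix
    distCoeffs   = [0.1 -0.05 0 0 0];                % OpenCV distortion coefficients
    imageSize    = [480 640];                        % [mrows ncols]
    intrinsics   = cameraIntrinsicsFromOpenCV(cameraMatrix,distCoeffs,imageSize);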


Structure from Motion and Visual SLAM

Bundle Adjustment Solver: Specify optimization solver


Use the Solver name-value argument of the bundleAdjustment function to select between the preconditioned-conjugate-gradient solver for high sparsity, and the sparse-linear-algebra solver. High sparsity indicates that each camera view observes only a small selection of world points.

Rigid 3-D Support: Pass 3-D rigid transformation object to camera parameter functions
Pass a rigid3d object as an input to the pointsToWorld, cameraMatrix, stereoParameters,
and worldToImage functions and objects.

Image View Set Support: Find views and view connections


Use the findView and findConnection object functions of the imageviewset object to find views
and connections in image view sets.

Bag of Features: Support for binary features


You can now use the custom feature extraction function extractorFcn to return a
binaryFeatures object.

Use the TreeProperties property of the bagOfFeatures object to control the amount of
vocabulary in the tree at successive levels.

The TreeProperties property replaces the VocabularySize property of the bagOfFeatures object. VocabularySize continues to work.

Bag of Features Search Index: Support for Visual Simultaneous Localization and Mapping (vSLAM) loop closure detection
Enhancements have been made to the invertedImageIndex object and retrieveImages to
support using bag-of-features for loop closure detection.

• Use the addImages object function of the invertedImageIndex object to add an image to the
image index with an image identifier.
• Use the addImageFeatures object function of the invertedImageIndex to add the features of
an image to the image index with an image identifier.
• Use the Metric name-value argument of retrieveImages to set the similarity metric for
ranking image retrieval results.


Point Cloud Processing

Point Cloud Simultaneous Localization and Mapping (SLAM): Detect loop closures
Use the scanContextLoopDetector object to manage the database of scan context descriptors and
detect loop closures. For details, see Implement Point Cloud SLAM in MATLAB.

Point Cloud View Set Support: Find views and view connections
Use the findView and findConnection object functions of the pcviewset object to find views and
connections in point cloud view sets.

Multiquery Radius Search: Optimized radius search for point cloud segmentation

Use the ParallelNeighborSearch name-value argument of the pcsegdist function to improve segmentation speed.

Point Cloud Viewers: Modify background color programmatically


Use the BackgroundColor name-value argument to modify the axis, figure, and title background
colors for the pcshow, pcshowpair, and pcplayer functions.
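
For example, a minimal sketch that displays a sample point cloud on a white background:

    ptCloud = pcread("teapot.ply");                  % sample point cloud shipped with the toolbox
    pcshow(ptCloud,"BackgroundColor",[1 1 1]);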


Code Generation, GPU, and Third-Party Support

Generate C and C++ Code Using MATLAB Coder: Support for functions
The readAprilTag and readBarcode functions now support code generation for host target
platforms.

These functions now support code generation for both host and nonhost target platforms:

• fisheyeIntrinsics
• fisheyeParameters
• pcalign

Generate C and C++ Code Using MATLAB Coder: Compiler links to OpenCV libraries
You can generate C code for your target by using any of the supported functions. The support enables
you to build the application for your target using a C++ compiler. The C++ compiler links to OpenCV
libraries that you provide for the particular target. You can also build a standalone application by
using the packngo function and setting the packType name-value argument to "hierarchical".

Generate CUDA code for NVIDIA GPUs using GPU Coder


These Computer Vision Toolbox functions now support code generation using the GPU Coder:

• pcnormals
• pctransform
• pcdenoise
• pcfitplane
• pcmapndt
• pcmerge

Computer Vision Toolbox Interface for OpenCV in MATLAB (September 2021, Version 21.2): Call OpenCV functions from MATLAB
Computer Vision Toolbox Interface for OpenCV in MATLAB provides a prebuilt MATLAB interface to the OpenCV library. You can directly call OpenCV functions from MATLAB without having to write a C++ MEX file. The prebuilt interface provides these utility functions for moving data between OpenCV and MATLAB:

• createMat
• createUMat
• getBasePtr
• getImage
• keyPointsToStruct
• rectToBbox


Computer Vision Toolbox Interface for OpenCV in Simulink: Specify image data type in Simulink model
Prior to R2021a, the Computer Vision Toolbox Interface for OpenCV in Simulink only supported
importing an image into a Simulink model as matrix data. In some cases, Simulink implemented the
imported image as separate signals (for example, separate R, G, and B signals). In other cases, it was
implemented as a single signal. Simulink did not have a universal standard to model the image data
across Simulink blocks and MATLAB functions.

In R2021b, you can now use the Computer Vision Toolbox Interface for OpenCV in Simulink to specify
an image of the Simulink.ImageType data type and generate code for the model. Specify this data
type for signals and other data in the model. The Simulink.ImageType data type is an encapsulated
object that defines an image with fixed meta-attributes specific to this data type.

When importing OpenCV code into a Simulink model, you can use the OpenCV Importer app to
configure default values for the Simulink.ImageType data type. Specify these configurations on the
Create Simulink Library page:

• Select Configure library to use Simulink.ImageType signals.


• Specify Default Color Format of Simulink.ImageType signal as BGR, RGB, or Grayscale.
• Specify Default Array layout of Simulink.ImageType signal as Row-major or Column-major.

The OpenCV Importer app generates library and ToOpenCV and FromOpenCV blocks inside the
subsystem and configures them to use the Simulink.ImageType data type.

To integrate images of Simulink.ImageType data type in existing image processing algorithms that
operate on matrix data only, use the new blocks Image To Matrix and Matrix To Image to convert the
image data. For more information, see Convert Between Simulink Image Type and Matrices.

You can configure the From Multimedia File block to generate signals of the Simulink.ImageType
data type. When the Output color format parameter is set to RGB or Intensity, specify the Image
signal parameter as Simulink image signal.

The Video Viewer block supports visualization for Simulink.ImageType signals.

These blocks support simulation and code generation of a Simulink.ImageType object:

• Sources: Ground, Inport, Outport
• Signal Routing: Goto, From, Data Store Read, Data Store Write, Data Store Memory, Switch, Multiport Switch, Merge, Variant Source, Variant Merge (internal block added during code generation), Mux, Demux, Vector Concatenate, Matrix Concatenate, Selector
• Sink: Terminator
• Ports & Subsystems: Subsystem, Atomic Subsystem, CodeReuse Subsystem, Enabled Subsystem, Triggered Subsystem, Function-Call Subsystem, If, Switch Case, Resettable Subsystem, For Iterator Subsystem, Model
• Discrete: Unit Delay
• Signal Attributes: Signal Conversion, Signal Specification
• User-Defined Functions: Initialize Function, Reset Function, and Terminate Function; Simulink Function block with side I/O and arguments

R2021a

Version: 10.0

New Features

Compatibility Considerations

Camera Calibration: Checkerboard detector for fisheye camera calibration and partial checkerboard detection
• Calibrate cameras with up to a 195 degree field of view (FOV) using the Camera Calibrator app.
Perform programmatic calibration of wide field of view cameras by enabling the
HighDistortion name-value argument of the detectCheckerboardPoints function.
• All functions in the single camera calibration workflow now support partially detected
checkerboards: detectCheckerboardPoints, estimateCameraParameters,
estimateFisheyeParameters, and showExtrinsics. Perform detection of partial
checkerboards by enabling the PartialDetections name-value argument of the
detectCheckerboardPoints function.
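
A minimal sketch of the programmatic workflow, assuming the image file name stands in for one of your own wide-FOV calibration images and that both name-value arguments can be combined in a single call:

    I = imread("fisheyeCalibrationImage.jpg");       % placeholder file name
    [imagePoints,boardSize] = detectCheckerboardPoints(I, ...
        "HighDistortion",true,"PartialDetections",true);   % enables detection of partially visible checkerboards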

Visual Simultaneous Localization and Mapping (SLAM): Enhancements to the SLAM workflow
The visual SLAM and structure from motion (SfM) workflow includes these enhancements:

• The matchFeaturesInRadius function returns the indices of features most likely to correspond between two input feature matrices, within the specified spatial constraints.
• The optimizePoses function now supports 3-D similarity transformations within imageviewset
objects.
• The worldToImage function can now return the indices of the points within a boundary in an
image.
• Use the minNumMatches argument of the connectedViews function to select strongly connected
views.

For more details, see Visual SLAM Overview.

The new Stereo Visual Simultaneous Localization and Mapping example shows you how to process
image data from a stereo camera to build a map of an outdoor environment and estimate the
trajectory of the camera. The example uses ORB-SLAM2, which is a feature-based vSLAM algorithm
supporting stereo cameras.

Point Cloud Simultaneous Localization and Mapping (SLAM): Support for point cloud SLAM with NDT map representation
Use the pcmapndt object to create a normal distribution transform (NDT) map from a prebuilt point
cloud map of the environment. The NDT map is a compressed, memory-efficient representation,
suitable for point cloud localization in SLAM. For more details, see Point Cloud SLAM Overview.

Point Cloud Processing: Set cluster density limits, get indices of bins, and registration enhancements
The point cloud segmentation and registration workflow includes these enhancements:

• New NumClusterPoints name-value argument for pcsegdist and segmentLidarData functions to specify the minimum and maximum number of points in each point cloud cluster.
• The pcbin function returns bin indices as a vector or a matrix using the 'BinOutput' argument.


• The pcregistercorr function can now measure the quality of a registration as the peak
correlation value of the phase difference between two occupancy grids. Use the 'Window' name-
value argument to increase the stability of the registration results.

Deep Learning Training: Pass training options to detectors, perform cutout data augmentation, and balance labels of blocked images
Computer Vision Toolbox training functions for deep learning detectors support the 'OutputFcn'
(Deep Learning Toolbox) option of the trainingOptions (Deep Learning Toolbox) function. Use this
option to display or plot progress information, or to stop training.

The bboxerase function removes image data and bounding boxes that lie within a specified region of
interest (ROI). Use the bboxerase function to update the bounding box information while you
perform cutout augmentation of training data.

The balanceBoxLabels and balancePixelLabels functions now accept a collection of large images specified as blockedImage objects.

Semantic Segmentation Enhancements: Support for dlnetwork objects and specified classes
The semanticseg function now accepts networks specified as a dlnetwork (Deep Learning
Toolbox) object. You can also specify the classes into which pixels or voxels are classified by using the
'Classes' name-value argument of the semanticseg function.
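
A minimal sketch, assuming dlnet is a trained dlnetwork for semantic segmentation, I is a test image, and the class names below match the order of the network's output channels:

    classNames = ["road" "sky" "vegetation"];        % hypothetical class list
    C = semanticseg(I,dlnet,"Classes",classNames);   % categorical label image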

Evaluate Image Segmentation: Calculate generalized Dice similarity coefficient
Use the generalizedDice function to compute a generalized Dice similarity coefficient for two
segmented images.

Labeler Enhancements: Instance segmentation labeling, superpixel flood fill, labeling large images, and additional features
The following list describes enhancements for these labeling apps and the apps that support each one:

• Image Labeler
• Video Labeler
• Ground Truth Labeler
• Lidar Labeler (Lidar Toolbox)

• Label distinct instances of objects belonging to the same class using a polygon label. For more details, see Label Objects Using Polygons. (Image Labeler: Yes, Video Labeler: Yes, Ground Truth Labeler: Yes, Lidar Labeler: No)
• Use superpixel automation to quickly pixel label regions of an image with similar pixel values. For more details, see Label Pixels Using Superpixel Tool. (Image Labeler: Yes, Video Labeler: Yes, Ground Truth Labeler: Yes, Lidar Labeler: No)
• Automate the labeling of multiple signals together within a single automation run. For an example, see Automate Ground Truth Labeling Across Multiple Signals. (Image Labeler: No, Video Labeler: No, Ground Truth Labeler: Yes, Lidar Labeler: No)
• Label very large images (with at least one dimension greater than 8K) that previously could not be loaded into memory. Load these images as blocked images. For more details, see Label Large Images in Image Labeler. (Image Labeler: Yes, Video Labeler: No, Ground Truth Labeler: No, Lidar Labeler: No)
• Use a custom reader function to import any point cloud. For more details, see Use Custom Point Cloud Source Reader for Labeling (Lidar Toolbox). (Image Labeler: No, Video Labeler: No, Ground Truth Labeler: No, Lidar Labeler: Yes)
• Define and view a region of interest (ROI) in the point cloud and label objects in it. For more details, see ROI View (Lidar Toolbox). (Image Labeler: No, Video Labeler: No, Ground Truth Labeler: No, Lidar Labeler: Yes)
• Control the point dimension of the point cloud. (Image Labeler: No, Video Labeler: No, Ground Truth Labeler: No, Lidar Labeler: Yes)

YOLO v3 Object Detector: Computer Vision Toolbox Model for YOLO v3 Object Detection (Version 21.1.0)
Perform object detection with you only look once version 3 (YOLO v3) deep learning networks by
using the functions in the Computer Vision Toolbox Model for YOLO v3 Object Detection support
package.

The yolov3ObjectDetector object creates a YOLO v3 object detector to detect objects in images.
You can either create a pretrained YOLO v3 object detector or a custom YOLO v3 object detector.

Create Pretrained YOLO v3 Object Detector

• To create a pretrained YOLO v3 object detector, use the pretrained DarkNet-53 and tiny YOLO v3
deep learning networks included in the support package. These networks are trained on the COCO
dataset.

Create Custom YOLO v3 Object Detector

• You can create a custom YOLO v3 object detector by using an untrained or pretrained YOLO v3
deep learning network.


• You can also create a custom YOLO v3 object detector by using a base network for feature
extraction. The yolov3ObjectDetector object adds detection heads to the feature extraction
layers of the base network and creates a YOLO v3 object detector. You must specify the source
layers to which to add the detection heads.

You must specify the anchor boxes and the class names to use to train the YOLO v3 object detector
on a custom dataset.

The yolov3ObjectDetector object configures the detector for transfer learning if you use a
pretrained deep learning network and specify the new anchor boxes and class names.

The Object Detection Using YOLO v3 Deep Learning example shows how to create and train a
custom YOLO v3 object detector for performing object detection.

Use the yolov3ObjectDetector object functions for training and inference workflows:

• The detect function performs object detection using the pretrained YOLO v3 object detector and
a test image.
• The preprocess function resizes the training data and the test data to the nearest network input
size and rescales the intensity values to the range [0, 1]. You can use this function to preprocess
the input images before training and detection.
• The forward function computes the network outputs during forward pass while training the
network. You can use the network outputs to model the gradient losses.
• The predict function computes the network outputs for inference.

You can download the Computer Vision Toolbox Model for YOLO v3 Object Detection from the Add-On
Explorer. For more information, see Get and Manage Add-Ons.

Extended Capability: Perform GPU and C/C++ code generation


Generate C and C++ Code Using MATLAB Coder

Computer Vision Toolbox functions now support code generation in host and nonhost target platforms:

pccat
scanContextDescriptor
scanContextDistance
detect of acfObjectDetector

You can generate C code for your target by using any of the supported functions. The support enables
you to build the application for your target using a C++ compiler. The C++ compiler links to OpenCV
libraries that you provide for the particular target. You can also build a standalone application by
using the packngo function and using the 'packType' name-value pair with a value of
'hierarchical'.

Generate CUDA code for NVIDIA GPUs using GPU Coder

These Computer Vision Toolbox functions now support code generation using GPU Coder:

pcbin


pcregisterndt
segmentGroundFromLidarData

Functionality being removed or changed


GPU support for disparityBM, detectFASTFeatures, and the interface for OpenCV in MATLAB
will be removed in a future release
Still runs

GPU support will be removed in a future release for:

• The detectFASTFeatures and disparityBM functions.


• The Computer Vision Toolbox OpenCV Interface in MATLAB support package. For more details see
Install and Use Computer Vision Toolbox Interface for OpenCV in MATLAB.

Computer Vision Toolbox Support Package for Xilinx Zynq-Based Hardware has been moved
to Vision HDL Toolbox Support Package for Xilinx Zynq-Based Hardware
Behavior change

Starting in R2021a, the Computer Vision Toolbox Support Package for Xilinx® Zynq®-Based Hardware has been renamed to Vision HDL Toolbox™ Support Package for Xilinx Zynq-Based Hardware. To use this support package in R2021a, you must have Vision HDL Toolbox installed. See Vision HDL Toolbox.

R2020b

Version: 9.3

New Features

Compatibility Considerations

Mask R-CNN: Train Mask R-CNN networks for instance segmentation using deep learning
Use these features to support a Mask R-CNN network:

• Use the roiAlignLayer object to output fixed-size feature maps for every rectangular ROI within
an input feature map.
• Use the insertObjectMask object to insert masks in an image or video stream.

Visual SLAM Data Management: Manage 3-D world points and projection correspondences to 2-D image points
Use the worldpointset object to store correspondences between 3-D world points and 2-D image
points across camera views.

AprilTag Pose Estimation: Detect and estimate pose for AprilTags in an image
The readAprilTag function detects AprilTags in an image and returns the encoded ID, tag family,
and estimated pose for the tag in the scene with respect to the camera. For details on how to use this
function for camera calibration, see the Camera Calibration Using AprilTag Markers example.
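
A minimal sketch of pose estimation, assuming intrinsics is a cameraIntrinsics object for your camera; the image file name and tag size below are placeholders:

    I = imread("aprilTagScene.jpg");    % placeholder image containing tag36h11 tags
    tagSize = 0.04;                     % tag edge length, in the world units you want the pose in
    [id,loc,pose] = readAprilTag(I,"tag36h11",intrinsics,tagSize);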

Point Cloud Registration: Register point clouds using phase correlation
Use the pcregistercorr function to register two point clouds using a phase correlation algorithm.
The function performs registration on the occupancy grids of the two point clouds.

Use the normalRotation function to correct the ground plane such that it is parallel to the XY-plane
and has a normal vector of [0 0 1].

Point Cloud Loop Closure Detection: Compute point cloud feature descriptor for loop closure detection
Use the scanContextDescriptor function to compute a 2-D global feature descriptor that captures
the distinctiveness of a view from point cloud scans.

Use the scanContextDistance function to compute the distance between two scan context
descriptors.

Use the pcalign function to align an array of point clouds and the pccat function to concatenate an
array of point clouds.
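
A minimal sketch, assuming ptCloud1 and ptCloud2 are lidar scans stored as pointCloud objects:

    descriptor1 = scanContextDescriptor(ptCloud1);
    descriptor2 = scanContextDescriptor(ptCloud2);
    dist = scanContextDistance(descriptor1,descriptor2);   % small distances suggest a possible loop closure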

Triangulation Accuracy Improvements: Filter triangulated 3-D world points behind camera view
Use the validIndex output argument of the triangulate and triangulateMultiview functions
to filter and remove triangulated 3-D world points that appear behind the camera.


Geometric Transforms: Estimate 2-D and 3-D geometric transformations from matching point pairs
Use the estimateGeometricTransform2D function to estimate 2-D rigid, affine, similarity, and
projective transformations.

Use the estimateGeometricTransform3D function to estimate 3-D rigid and similarity transformations.
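
A minimal sketch, assuming matchedPoints1 and matchedPoints2 are matched feature points (for example, from matchFeatures) between two images of the same scene:

    [tform,inlierIdx] = estimateGeometricTransform2D( ...
        matchedPoints1,matchedPoints2,"similarity");    % also accepts "rigid", "affine", and "projective"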

Labeler Enhancements: Label objects in images and video using projected 3-D bounding boxes, load custom image formats, use additional keyboard shortcuts, and more
This list describes enhancements for these labeling apps and the apps that support each one:

• Image Labeler
• Video Labeler
• Ground Truth Labeler
• Lidar Labeler — Introduced in R2020b

• Load images with custom image formats using an imageDatastore object (Image Labeler: Supported, Video Labeler: Not supported, Ground Truth Labeler: Not supported, Lidar Labeler: Not supported)
• Draw projected 3-D bounding boxes around objects in images and video using the projected cuboid label type (Image Labeler: Supported, Video Labeler: Supported, Ground Truth Labeler: Supported, Lidar Labeler: Not supported)
• Delete pixel labels (Image Labeler: Supported, Video Labeler: Supported, Ground Truth Labeler: Supported, Lidar Labeler: Not supported)
• Undo and redo drawing a pixel label an increased number of times (Image Labeler: Supported, Video Labeler: Supported, Ground Truth Labeler: Supported, Lidar Labeler: Not supported)
• Use keyboard shortcuts for selecting drawn labels and resizing bounding boxes (Image Labeler: Supported, Video Labeler: Supported, Ground Truth Labeler: Supported, Lidar Labeler: Not supported)
• Specify attributes for cuboid ROI labels (Image Labeler: Not supported, Video Labeler: Not supported, Ground Truth Labeler: Supported, Lidar Labeler: Supported)
• Visualize point cloud clusters across all frames, not just individual frames, when Snap to Cluster option is selected, by using a new Cluster Settings option (Image Labeler: Not supported, Video Labeler: Not supported, Ground Truth Labeler: Supported, Lidar Labeler: Supported)
• Use keyboard shortcuts for panning across the point cloud frame and moving multiple selected cuboids (Image Labeler: Not supported, Video Labeler: Not supported, Ground Truth Labeler: Supported, Lidar Labeler: Supported)

Object Detection Visualizations: Visualize shapes on images and point clouds

Use the showShape function to highlight detected objects in an image or point cloud using shapes such as rectangles, cuboids, polygons, and circles.

Use the insertObjectMask function to insert a color mask of segmented shapes into an image or
video.
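
A minimal sketch that highlights one detection on a displayed image; the box values below are placeholders:

    I = imread("peppers.png");
    imshow(I)
    bbox = [100 100 120 80];       % [x y width height]
    showShape("rectangle",bbox,"Label","object","Color","green");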

Evaluate Pixel-Level Segmentation: Compute a confusion matrix of multiclass pixel-level image segmentation
Use the segmentationConfusionMatrix function to compute a confusion matrix from the
predicted pixel labels and ground truth pixel labels from labeled images.

Focal Loss Layer Improvements: Add a focal loss layer to a semantic segmentation or image classification deep learning network
You can now use the focalLossLayer object to add a focal loss layer to a semantic segmentation or
image classification deep learning network. Use the focal loss layer to train a deep learning network
on class-imbalanced data sets.

focalCrossEntropy function: Compute focal cross-entropy loss in custom training loops
Use the focalCrossEntropy function to compute the focal cross-entropy loss in order to train a
custom deep learning network on class-imbalanced datasets.


Computer Vision Examples: Explore deep learning workflows, explore camera calibration using AprilTags, and compute segmentation metrics
Explore these Computer Vision Toolbox examples:

• The Estimate Body Pose Using Deep Learning example shows how to estimate the body pose of
one or more people using the OpenPose algorithm and a pretrained network.
• The Generate Image from Segmentation Map Using Deep Learning example shows how to
generate a synthetic image of a scene from a semantic segmentation map using a Pix2PixHD
conditional generative adversarial network (CGAN).
• The Activity Recognition from Video and Optical Flow Data Using Deep Learning example shows
how to train an Inflated-3D (I3D) two-stream convolutional neural network for activity recognition
using RGB and optical flow data from videos.
• The Camera Calibration Using AprilTag Markers example shows how to detect and localize
AprilTags in a calibration pattern.

Extended Capability: Perform GPU and C/C++ code generation


Generate C and C++ Code Using MATLAB Coder

These Computer Vision Toolbox functions now support code generation in host and nonhost target platforms:

• ransac
• pcbin

You can generate C code for your target by using either of the supported functions. The support
enables you to build the application for your target using a C++ compiler. The C++ compiler links to
OpenCV libraries that you provide for the particular target. You can also build a standalone
application by using the packNGo function and using the 'packType' name-value pair with a value
of 'hierarchical'.

Generate CUDA code for NVIDIA GPUs using GPU Coder

The pcsegdist function now supports code generation using the GPU Coder.

OpenCV Interface: Integrate OpenCV version 4.2.0 projects with MATLAB
Integrate OpenCV projects with MATLAB using OpenCV version 4.2.0.

Functionality being removed or changed


focalLossLayer input arguments, alpha and gamma now have default values
Still runs

The focalLossLayer object's input arguments alpha and gamma are no longer required to specify
the balancing and focusing parameter values, respectively. These properties now have default values:


• Alpha: 0.25
• Gamma: 2.0

To change the balancing value, set the Alpha property. To change the focusing value, set the Gamma
property. For example, focalLossLayer('Alpha',0.7,'Gamma',1) sets the Alpha and Gamma
properties to 0.7 and 1, respectively, upon creation of the object.

yolov2ReorgLayer will be removed


Still runs

The yolov2ReorgLayer function will be removed in a future release. Use the spaceToDepthLayer
function to add a reorganization layer to the YOLO v2 deep learning network.

estimateGeometricTransform will be removed


Still runs

The estimateGeometricTransform function will be removed in a future release. Use the estimateGeometricTransform2D function instead.

R2020a

Version: 9.2

New Features

Compatibility Considerations

Point Cloud Deep Learning: Detect and classify objects in 3-D point clouds
Use the boxLabelDatastore object with cuboid bounding box support. Preprocess point cloud data
using the pcbin function.

To build and evaluate point cloud based object detectors, use these functions, which now support
rotated rectangle box formats.

• pcbin
• bboxwarp
• bboxcrop
• bboxresize
• bboxOverlapRatio
• selectStrongestBbox
• selectStrongestBboxMulticlass
• bboxPrecisionRecall
• evaluateDetectionAOS

The Point Cloud Classification Using PointNet Deep Learning example trains a PointNet network for
point cloud classification.

Deep Learning with Big Images: Train and use deep learning object detectors and semantic segmentation networks on very large images
Balance and store big image data by using these added features.

• The balanceBoxLabels function balances the distribution of detector training data from a
collection of very large images.
• The balancePixelLabels function balances the distribution of pixel labeled training data from a
collection of very large images.
• The boxLabelDatastore and blockLocationSet objects load multiple blocks of data for
training and evaluation.

Simultaneous Localization and Mapping (SLAM): Perform point cloud and visual SLAM
Use these objects, functions, and properties to manage SLAM and point cloud processing.

• The imageviewset object manages visual odometry and structure from motion (SfM) data.
• The pcviewset object manages point cloud odometry data.
• The rigid3d object stores a 3-D rigid transformation. You can use the rigid3d object with point
cloud processing functions like pctransform and pcregisterndt.
• The optimizePoses function optimizes absolute poses using relative pose constraints.

• Specify an absolute pose as a rigid3d object in the plotCamera and triangulateMultiview functions. You can use the 'AbsolutePose' name-value pair argument instead of the combination of the 'Location' and 'Orientation' name-value pairs.


• Refine camera poses using the bundleAdjustmentMotion function. Refine 3-D points using the
bundleAdjustmentStructure function.

The Monocular Visual Simultaneous Localization and Mapping example processes image data from a
monocular camera to build a map of an indoor environment and estimate the trajectory of the camera
using ORB-SLAM, a feature-based vSLAM algorithm.

Bar Code Reader: Detect and decode 1-D and 2-D barcodes
Read linear (1-D) and matrix (2-D) barcodes using the readBarcode function.

The Localize and Read Multiple Barcodes in Image example demonstrates preprocessing steps that
can be used to improve the detection of 1-D and 2-D barcodes in an image.
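
A minimal sketch; the image file name is a placeholder for your own image containing a barcode:

    I = imread("barcodeImage.jpg");        % placeholder file name
    [msg,format,loc] = readBarcode(I);     % decoded text, detected format, and corner locations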

SSD Object Detection: Detect objects in images using a single shot multibox object detector (SSD)
Use these functions, objects, and layers to detect objects in images using an SSD object detector.

• The trainSSDObjectDetector function trains a deep learning SSD object detector.


• The ssdObjectDetector object detects objects using the SSD-based detector.
• The ssdLayers function creates an SSD object detection network.
• The anchorBoxLayer layer stores anchor boxes for object detection.
• The focalLossLayer layer is a classification layer that uses focal loss for object detection.
• The ssdMergeLayer layer merges activations from several feature maps.

The Object Detection Using SSD Deep Learning example trains a single shot object detector using a
deep learning network architecture.

The Code Generation for Object Detection by Using Single Shot Multibox Detector example generates
CUDA code for an SSD network.

Velodyne Point Cloud Reader: Store start time for each point cloud frame
The velodyneFileReader object has a new property named Timestamps. The Timestamps
property stores the start time for each point cloud frame in the input sequence.

Labelers: Rename scene labels, select ROI color, and show ROI label names
The Video Labeler and the Image Labeler apps now support these features.

• Rename scene labels
• Set custom colors for ROIs
• Hover over an ROI to display its label name


Validate Deep Learning Networks: Specify training options to validate deep learning networks during training
The trainFastRCNNObjectDetector, trainFasterRCNNObjectDetector,
trainSSDObjectDetector, and trainYOLOv2ObjectDetector functions now support validation
during training. Specify validation options for network training by using the input argument
options.

• The value of options must be a TrainingOptionsADAM, TrainingOptionsRMSProp, or TrainingOptionsSGDM object returned by the trainingOptions function.
• Use the name-value pair arguments 'ValidationData', 'ValidationFrequency', and 'ValidationPatience' of a trainingOptions function to set validation options for network training.
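
For example, a minimal sketch, assuming validationData is a datastore in the same format as your detector training data, and trainingData and lgraph stand in for your own training set and network:

    options = trainingOptions("sgdm", ...
        "MaxEpochs",20, ...
        "ValidationData",validationData, ...
        "ValidationFrequency",50, ...
        "ValidationPatience",5);
    % detector = trainYOLOv2ObjectDetector(trainingData,lgraph,options);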

YOLO v2 Enhancements: Import and export pretrained YOLO v2 object detectors
Create a you-only-look-once (YOLO) v2 object detector from a deep learning framework, such as
ONNX™ or Keras. Import the pretrained network using the network input to the
yolov2ObjectDetector object. The Import Pretrained ONNX YOLO v2 Object Detector example
shows how to import a pretrained YOLO v2 network in ONNX model format to
yolov2ObjectDetector object and perform object detection.

Additionally, you can create a YOLO v2 object detector that was trained with the
trainYOLOv2ObjectDetector function, and then export it to ONNX using the
exportONNXNetwork function. The Export YOLO v2 Object Detector to ONNX example shows how
to:

• Export a YOLO v2 object detector to ONNX model format.


• Perform object detection using the exported YOLO v2 network in ONNX model format.

You can now specify Classes as input to the yolov2OutputLayer function. Use the name-value
pair argument 'Classes' to input the object classes in training data to the output layer.

YOLO v3 Deep Learning: Perform object detection using YOLO v3 deep learning network
The Object Detection Using YOLO v3 Deep Learning example shows how to design, train, and use a
YOLO v3 network for object detection.

Computer Vision Examples: Explore object detection with deep learning workflows, structure from motion, and point cloud processing
• The Object Detection Using SSD Deep Learning example trains a single shot object detector using
a deep learning network architecture.
• The Code Generation for Object Detection by Using Single Shot Multibox Detector example
generates CUDA code for an SSD network.
• The Point Cloud Classification Using PointNet Deep Learning example trains a PointNet network
for point cloud classification.


• The Monocular Visual Simultaneous Localization and Mapping example processes image data
from a monocular camera to build a map of an indoor environment and estimate the trajectory of
the camera using ORB-SLAM, a feature-based vSLAM algorithm.
• The Import Pretrained ONNX YOLO v2 Object Detector example imports a pretrained YOLO v2
object detector from an ONNX deep learning framework.
• The Export YOLO v2 Object Detector to ONNX example exports a pretrained YOLO v2 object
detector to an ONNX deep learning framework.
• The Object Detection Using YOLO v3 Deep Learning example shows how to design, train, and use
a YOLO v3 network for object detection.
• The Localize and Read Multiple Barcodes in Image example demonstrates preprocessing steps
that can be used to improve the detection of 1-D and 2-D barcodes in an image.

Code Generation: Generate C/C++ code using MATLAB Coder


More Computer Vision Toolbox functions and objects now support portable C code generation. The
segmentGroundFromLidarData function now supports code generation in host and nonhost target
platforms. These lidar, point cloud, and tracking functions now support code generation in nonhost
platforms:

Lidar and Point Cloud Processing


pcdenoise
pcdownsample
pcnormals
pcmerge
pcsegdist
segmentLidarData
pctransform
pcregistercpd
pcregisterndt
pcfitcylinder
pcfitsphere
pcfitplane
Tracking and Motion Estimation
insertShape
insertMarker

You can generate C code for your specific target by using any of the supported functions. The support
enables you to build the application for your target using a C++ compiler. The C++ compiler links to
OpenCV libraries that you provide for the particular target. You can also build a standalone
application by using the packNGo function and setting the 'packType' name-value pair to
'hierarchical'.


Computer Vision Toolbox Interface for OpenCV in Simulink: Import OpenCV code into Simulink
The Computer Vision Toolbox Interface for OpenCV in Simulink support package enables you to
import OpenCV code into a Simulink model. To install the support package, first click Add-Ons on the
MATLAB Home tab. In the Add-On Explorer window, find and click the support package, and then
click Install. This support package requires Computer Vision Toolbox. After installing the support
package, you can import your OpenCV code and create Simulink library by using the OpenCV
Importer app. The importer uses two OpenCV conversion blocks ToOpenCV and FromOpenCV. You
can generate C++ code from the created Simulink model and deploy the code into your target
hardware. For more information, see Install and Use Computer Vision Toolbox OpenCV Interface for
Simulink.

Functionality being removed or changed


pcregisterndt, pcregistericp, and the pcregistercpd functions return a rigid3d object
Behavior change

The pcregisterndt and pcregistericp functions now return a rigid3d object as their rigid
transformation output. Previously, the functions returned an affine3d object.

The pcregistercpd function can return a rigid3d object, an affine3d object, or a displacement
field as its transformation object.

New imageviewset replaces viewSet

Use the imageviewset object in place of the viewSet object for managing data for structure from
motion, visual odometry, and visual SLAM workflows.

R2019b

Version: 9.1

New Features

Compatibility Considerations

Video and Image Labeler: Copy and paste pixel labels, improved pan and zoom, improved frame navigation, and line ROI, label attributes, and sublabels added to Image Labeler
The Image Labeler and Video Labeler apps now support these features:

• Copy and paste pixel labels (Video Labeler: New, Image Labeler: New)
• Pan and zoom more easily within the labeling window (Video Labeler: New, Image Labeler: New)
• Create a line region of interest (ROI) (Video Labeler: Introduced in R2018b, Image Labeler: New)
• Create label attributes and sublabels (Video Labeler: Introduced in R2018b, Image Labeler: New)
• Use scrubber for video tracking (Video Labeler: New, Image Labeler: Not supported)
• Click on Visual Summary timeline to go to corresponding (timestamp) frame (Video Labeler: New, Image Labeler: New)

Data Augmentation for Object Detectors: Transform image and bounding box
Use bounding box transformations and datastore support for deep learning workflows.

• The bboxresize, bboxwarp, and bboxcrop functions support bounding box transformations.
• The boxLabelDatastore creates a datastore for bounding box label data. The object can contain
different tables for labeled bounding boxes of various classes.
• The detect object functions for the trainFastRCNNObjectDetector,
trainFasterRCNNObjectDetector, and trainYOLOv2ObjectDetector object trainers now
support the use of a datastore.
• The evaluateDetectionPrecision and evaluateDetectionMissRate functions now
support the use of a datastore.

Semantic Segmentation: Classify individual pixels in images and 3-D volumes using DeepLab v3+ and 3-D U-Net networks.
Create convolutional neural networks with added layer and datastore support:

• The dicePixelClassificationLayer layer creates a pixel classification layer by using generalized dice loss for semantic segmentation.
• The deeplabv3plusLayers function creates a DeepLab v3+ convolutional neural network for
semantic segmentation.
• The unet3dLayers function creates a 3-D U-Net convolutional neural network for semantic
segmentation of volumetric images.
• The unetLayers function now supports a padding style for convolution layers in the encoder and
the decoder subnetworks. Use the 'ConvolutionPadding' name-value pair to specify the
padding.


• The semanticseg and the evaluateSemanticSegmentation functions now support use of a datastore.

Deep Learning Object Detection: Perform faster R-CNN end-to-end training, anchor box estimation, and use multichannel image data
Enhancements to training deep learning object detectors functions and new faster R-CNN layer:

• Create network architecture for Faster R-CNN by using the fasterRCNNLayers function.
• Use the trainFastRCNNObjectDetector function end-to-end training method to train a Fast R-
CNN or Faster R-CNN detector.
• Use datastores with the trainFastRCNNObjectDetector and trainFasterRCNNObjectDetector functions.
• Use the estimateAnchorBoxes function to automatically estimate anchor boxes based on a training data set and a k-means clustering algorithm, as shown in the sketch after this list.
• Use a multichannel image to train an R-CNN, Fast R-CNN, or Faster R-CNN detector. You can still
use grayscale or RGB images as well.
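For example, a hedged sketch of anchor box estimation, assuming trainingData is a boxLabelDatastore of labeled boxes:

numAnchors = 5;
anchorBoxes = estimateAnchorBoxes(trainingData, numAnchors);  % k-means clustering of box sizes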

Compatibility Considerations
Starting in R2019b, by default, the trainFasterRCNNObjectDetector function uses the end-to-
end method for training a detector.

In previous releases, the default training method used the four-step method. To preserve
compatibility, set the TrainingMethod property to 'four-step'.

Deep Learning Acceleration: Optimize YOLO v2 and semantic segmentation using MEX acceleration
Computer Vision Toolbox now supports performance optimization in both CPU and GPU execution
environments for these functions.

• The detect function for YOLO v2 object detection. Use the 'Acceleration','mex' name-value
pair.
• The semanticseg function.
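For example, a minimal sketch for a YOLO v2 detector, assuming detector is a trained yolov2ObjectDetector and I is a test image:

[bboxes, scores] = detect(detector, I, 'Acceleration', 'mex');  % generates and reuses a MEX function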

Multiview Geometry: Reconstruct 3-D scenes and camera poses from multiple cameras
The triangulateMultiview and bundleAdjustment functions support images from multiple
(pinhole) cameras. The Intrinsics property of the cameraParameters object enables you to pass
intrinsics to related functions in the structure from motion (SfM) workflow.

Velodyne Point Cloud Reader: Read lidar data from VLS-128 device model
The velodyneFileReader object now supports VLS-128 Velodyne LiDAR® device models.


Point Cloud Normal Distribution Transform (NDT): Register point clouds using NDT with improved performance
The pcregisterndt function, which registers two point clouds using the NDT algorithm, now has improved performance.
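For example, a minimal sketch, assuming moving and fixed are pointCloud objects and using an illustrative 0.5 m voxel size:

gridStep = 0.5;                               % NDT voxel size in meters
tform = pcregisterndt(moving, fixed, gridStep);
movingAligned = pctransform(moving, tform);   % apply the estimated rigid transformation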

Code Generation: Generate C/C++ code using MATLAB Coder


The pcregisterndt function supports code generation in host target platforms.

These point cloud functions now support code generation in nonhost target platforms.

• pointCloud
• findNearestNeighbors
• findNeighborsInRadius
• findPointsInROI
• removeInvalidPoints
• select

Functionality Being Removed or Changed


The NumOutputChannels argument of the unetLayers function has been renamed to NumFirstEncoderFilters
Still runs

The NumOutputChannels name-value pair argument of the unetLayers function has been renamed to NumFirstEncoderFilters. To update your code, replace all instances of the NumOutputChannels argument name with NumFirstEncoderFilters.


R2019a

Version: 9.0

New Features

Compatibility Considerations

YOLO v2 Object Detection: Train a "you-only-look-once" (YOLO) v2 deep learning object detector
Use YOLO v2, a deep convolutional neural network, for object detection.

• Construct a YOLO v2 network by using the yolov2Layers function. Alternatively, you can use the
yolov2TransformLayer, yolov2ReorgLayer, and the yolov2OutputLayer functions to
manually construct a YOLO v2 network.
• Use the trainYOLOv2ObjectDetector function to train the YOLO v2 network.
• Use the yolov2ObjectDetector object and the detect function to detect objects in a target
image using the trained YOLO v2 network.

Use of these objects requires Deep Learning Toolbox.

• yolov2TransformLayer supports GPU array inputs and GPU code generation.


• yolov2ReorgLayer supports GPU array inputs.
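For example, a hedged sketch of building and training a YOLO v2 network; the input size, class count, anchor boxes, backbone, and feature layer name are illustrative assumptions (ResNet-50 requires its support package):

imageSize = [224 224 3];
numClasses = 1;
anchorBoxes = [43 59; 19 27; 86 116];
featureLayer = 'activation_40_relu';          % a layer of the pretrained ResNet-50
lgraph = yolov2Layers(imageSize, numClasses, anchorBoxes, resnet50, featureLayer);
% detector = trainYOLOv2ObjectDetector(trainingData, lgraph, options);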

3-D Semantic Segmentation: Classify pixel regions in 3-D volumes using deep learning
These functions now support 3-D semantic segmentation:

• semanticseg
• evaluateSemanticSegmentation
• pixelClassificationLayer
• pixelLabelDatastore

Code Generation: Generate C code for point cloud processing, ORB, disparity, and ACF functionality using MATLAB Coder
The Computer Vision Toolbox objects and functions in this table now support code generation.

Objects
pointCloud
ORBPoints
acfObjectDetector
Lidar and Point Cloud Processing Functions
findNearestNeighbors
findNeighborsInRadius
findPointsInROI
removeInvalidPoints
select
pcnormals
pcdownsample


pctransform
pcregistercpd
pcmerge
pcfitcylinder
pcfitplane
pcfitsphere
pcdenoise
pcsegdist
segmentLidarData
Feature Detection and Extraction Functions
detectORBFeatures
Camera Calibration and 3-D Vision Functions
disparityBM
disparitySGM

ORB Features: Detect and extract oriented FAST and rotated BRIEF
(ORB) features
For object recognition or image registration workflows, use the detectORBFeatures function, the ORBPoints object, and the extractFeatures function with the 'Method' name-value pair set to 'ORB'.
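For example, a minimal sketch using a grayscale test image shipped with MATLAB:

I = imread('cameraman.tif');
points = detectORBFeatures(I);
[features, validPoints] = extractFeatures(I, points);  % ORB descriptors are used for ORBPoints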

Velodyne Point Cloud Reader: Read lidar data from Puck LITE and
Puck Hi-Res device models
The velodyneFileReader object now supports Puck LITE and Puck Hi-Res Velodyne LiDAR device
models.

GPU Acceleration for Stereo Disparity: Compute stereo disparity maps on GPUs
These functions compute a disparity map from a stereo-pair image. Both functions support GPU
processing.

• disparityBM — Uses the block matching method to compute a disparity map


• disparitySGM — Uses the semiglobal matching method to compute a disparity map
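For example, a minimal sketch, assuming I1 and I2 are rectified grayscale stereo images and using an illustrative disparity range:

disparityMap = disparitySGM(I1, I2, 'DisparityRange', [0 128]);
imshow(disparityMap, [0 128])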

Ground Truth Data: Select labels by group, type, and attribute


The groundTruth object now includes these object functions for selecting labels by group, label
type, or attribute.

• selectLabelsByName
• selectLabelsByType


• selectLabelsByGroup

Projection Matrix Estimation: Use direct linear transform (DLT) to compute projection matrix
Use the estimateCameraMatrix function to compute a projection matrix. The function uses the
DLT algorithm to compute a 2-D projection matrix from 3-D correspondences for depth sensors and
cameras. The use of a camera projection matrix speeds up the nearest neighbors search in a point
cloud that is generated by an RGBD sensor, such as Microsoft® Kinect®. You can use the
estimateCameraMatrix function with the findNearestNeighbors function to speed up the
search.

Organized Point Clouds: Perform faster approximate search using camera projection matrix
You can now pass a camera matrix as an input argument to the findNearestNeighbors, findNeighborsInRadius, and findPointsInROI functions. The functions use the camera projection matrix to identify the relationship between adjacent points, speeding up the nearest neighbor search in organized point clouds.
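For example, a hedged sketch, assuming imagePoints and worldPoints are 2-D/3-D correspondences from a depth sensor and ptCloud is an organized point cloud; the query point and neighbor count are illustrative:

camMatrix = estimateCameraMatrix(imagePoints, worldPoints);     % DLT projection matrix
[indices, dists] = findNearestNeighbors(ptCloud, [0.4 0.3 0.2], 20, camMatrix);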

Point Cloud Viewers: Modify color display and view data tips
The pcshow function, pcplayer object, and pcshowpair function provide figure options for viewing
data and changing colormaps.

• Data tips — Select any point to view (x, y, z) point data and additional data value properties. For
example, this table shows the data value properties available for a depth image and lidar point
cloud.

  Depth image (RGB-D sensor): Color, row, column
  Lidar: Intensity, range, azimuth angle, elevation angle, row, column
• Background color — Change background color.
• Colormap value — Map a color in the current colormap to a data point.
• View angle — Change the viewing axis angle to an xz-, zx-, yz-, zy-, xy-, or yx-plane.

Image and Video Labeling: Organize labels by logical groups, use assisted freehand for pixel labeling, and other label management enhancements
With the Image Labeler and Video Labeler apps, you can now:

• Create groups for organizing label definitions. You can also move labels between groups by
dragging them.
• Use the assisted freehand feature to create pixel regions of interest (ROIs) for semantic
segmentation. This tool automatically finds edges between selected points in an image.


• Move multiple selected ROIs in an image.


• Edit previously created label definitions.
• Add additional list items to a previously created attribute (Video Labeler only).

DeepLab v3+, deep learning, and lidar tracking examples


• Semantic Segmentation Using Deep Learning updated to use DeepLab v3+.
• Object Detection Using YOLO v2 Deep Learning
• Track Vehicles Using Lidar: From Point Cloud to Track List
• Code Generation for Object Detection Using YOLO v2

Relative camera pose computed from homography matrix


The relativeCameraPose function can now compute the relative camera pose based on a
homography matrix, specified as a projective2d object. The relative camera pose can now be
computed from a homography matrix in addition to an essential or a fundamental matrix.

Functionality being removed or changed


disparity function will be removed
Still runs

The disparity function will be removed in a future release. Use the disparityBM or
disparitySGM functions instead. Use disparityBM to compute a disparity map by using block
matching method. Use disparitySGM to compute disparity map using the semi-global matching
method.

selectLabels object function will be removed


Still runs

The selectLabels object function will be removed in a future release. Use the
selectLabelsByGroup, selectLabelsByType , and selectLabelsByName functions instead.


R2018b

Version: 8.2

New Features

Compatibility Considerations

Video Labeler App: Interactive and semi-automatic labeling of ground truth data in a video, image sequence, or custom data source
The Video Labeler app enables you to label ground truth in a video, image sequence, or custom data source. Use the app to interactively specify regions of interest. You can export marked labels from the
app and use them to train an object detector or to compare against ground truth data. The app
includes computer vision algorithms to automate the labeling of ground truth by using detection and
tracking algorithms.

For more details, see Get Started with the Video Labeler.

Lidar Segmentation: Segment ground points from organized 3-D lidar data and organize point clouds into clusters
Use the segmentGroundFromLidarData function to segment ground points from organized lidar
data. Use the segmentLidarData function to organize 3-D point cloud range data into clusters.
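For example, a minimal sketch, assuming ptCloud is an organized point cloud such as one frame read with velodyneFileReader; the 0.5 m distance threshold is illustrative:

groundPtsIdx = segmentGroundFromLidarData(ptCloud);        % logical index of ground points
[labels, numClusters] = segmentLidarData(ptCloud, 0.5);    % cluster the organized point cloud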

Point Cloud Registration: Align 3-D point clouds using coherent point
drift (CPD) registration
Use the pcregistercpd function for point cloud registration based on the coherent point drift
algorithm.

MSAC Fitting: Find a polynomial that best fits noisy data using the M-
estimator sample consensus (MSAC)
Use the ransac and fitPolynomialRANSAC functions to find the coefficients of a polynomial that
best fits input noisy data.
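For example, a minimal sketch of fitting a second-degree polynomial to noisy data with a few injected outliers:

x = linspace(-2, 2, 100)';
y = x.^2 + 0.05*randn(size(x));
y(10:15) = 3;                              % outliers
P = fitPolynomialRANSAC([x y], 2, 0.1);    % coefficients, ignoring points farther than 0.1
yFit = polyval(P, x);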

Faster R-CNN Enhancements: Train Faster R-CNN object detectors using DAG networks such as ResNet-50 and Inception-v3
You can use the trainFasterRCNNObjectDetector, trainFastRCNNObjectDetector, or trainRCNNObjectDetector functions with a pretrained deep learning network to create a CNN model.

Use the new layers to train directed acyclic graph (DAG) networks using the faster R-CNN model:

• roiInputLayer
• roiMaxPooling2dLayer
• rpnSoftmaxLayer
• rpnClassificationLayer
• rcnnBoxRegressionLayer
• regionProposalLayer


Semantic Segmentation Using Deep Learning: Create U-Net network


Use the unetLayers function to create U-Net semantic segmentation networks.

Velodyne Point Cloud Reader: Support for VLP-32 device


The velodyneFileReader now supports the VLP-32C Velodyne LiDAR device.

Labeler Apps: Create a definition table, change file path, and assign
data attributes
Use the labelDefinitionCreator to create a label definitions table to use with the Ground Truth
Labeler (requires Automated Driving System Toolbox™), Image Labeler and Video Labeler apps.

Use changeFilePaths to change file paths in the data source and pixel label data of a
groundTruth object.

Use attributeType to specify the type of attributes in the labelDefinitionCreator.

OpenCV Interface: Integrate OpenCV version 3.4.0 projects with MATLAB
Integrate OpenCV projects with MATLAB using OpenCV version 3.4.0.

Functionality Being Removed or Changed

• trainFasterRCNNObjectDetector, trainFastRCNNObjectDetector: You must set the training option MiniBatchSize property to 1, or these functions produce an error. Use the NumRegionsToSample property instead to specify the number of regions to sample per training iteration. Compatibility considerations: remove the MiniBatchSize setting from your code, and set the NumRegionsToSample property in these functions to the value you had the MiniBatchSize property (in the trainingOptions function) set to before R2018b.
• fastRCNNObjectDetector: The Network property changed to a DAGNetwork. No changes to your code are needed.
• fasterRCNNObjectDetector: The Network property changed to a DAGNetwork and now contains the complete Faster R-CNN network. The RegionProposalNetwork property was removed. The MinBoxSizes, BoxPyramidScale, and NumBoxPyramidLevels properties were removed; use the AnchorBoxes property instead. No other changes to your code are needed.

ClassNames property of PixelClassificationLayer will be removed


Still runs

ClassNames property of PixelClassificationLayer will be removed. Use Classes instead. To update your code, replace all instances of the ClassNames property with Classes. There are some differences between the properties that require additional updates to your code.

The ClassNames property contains a cell array of character vectors. The Classes property contains a categorical array. To use the Classes property with functions that require cell array input, convert the classes using the cellstr function.


R2018a

Version: 8.1

New Features

Compatibility Considerations

Lidar Segmentation: Segment lidar point clouds using Euclidean distance
Use the pcsegdist function to segment a point cloud into clusters based on the Euclidean distance
between individual points.
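For example, a minimal sketch, assuming ptCloud is a pointCloud object; the 0.5 m minimum distance is illustrative:

minDistance = 0.5;
[labels, numClusters] = pcsegdist(ptCloud, minDistance);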

Lidar Registration: Register multiple lidar point clouds using normal distributions transform (NDT)
Use the pcregisterndt function to register multiple lidar point clouds using NDT.

Image Labeler App: Mark foreground and background for pixel labeling
In the Image Labeler app, the Smart Polygon tool now enables you to refine the segmentation within
a polygonal region of interest by marking pixels as foreground or background.

Fisheye Calibration: Interactively calibrate fisheye lenses using the Camera Calibrator app
The Camera Calibrator app now includes fisheye lens calibration.

Stereo Baseline Estimation: Estimate baseline of a stereo camera with known intrinsic parameters
Use the estimateStereoBaseline function or the Stereo Camera Calibrator app to estimate the
baseline of a stereo camera when intrinsics of the individual cameras are known.

Interactively rotate point cloud around any point


In the point cloud viewer functions, pcplayer, pcshow, and pcshowpair, you can now rotate a
point cloud around any point.

Multiclass nonmaxima suppression (NMS)


Use the selectStrongestBboxMulticlass function to select the strongest multiclass bounding
boxes from overlapping clusters. The function uses greedy nonmaximal suppression (NMS) to
eliminate overlapping bounding boxes.
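For example, a minimal sketch with hypothetical detections from two classes:

bboxes = [10 10 50 50; 12 12 50 50; 100 100 40 40];
scores = [0.9; 0.6; 0.8];
labels = categorical({'car'; 'car'; 'person'});
[selectedBboxes, selectedScores, selectedLabels] = ...
    selectStrongestBboxMulticlass(bboxes, scores, labels, 'OverlapThreshold', 0.5);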

pcregrigid name changed to pcregistericp


The pcregrigid function has been renamed to pcregistericp. The pcregistericp function
supports angular difference in degrees. The prior version, pcregrigid, used radians. You can still
use pcregrigid.


Efficiently read and preprocess pixel-labeled images for deep learning training and prediction
The pixelLabelImageDatastore object preprocesses pixel-labeled data for training semantic
segmentation networks. Preprocessing operations include resizing, rotation, reflection, and cropping.

Compatibility Considerations
In the previous release, you could preprocess pixel-labeled training images by using a
pixelLabelImageSource object. The pixelLabelImageSource function now creates a
pixelLabelImageDatastore object instead. This new object has similar behavior as a
pixelLabelImageSource, with additional properties and methods to assist with data
preprocessing.

You can use pixelLabelImageDatastore for both training and prediction. In the previous release,
you could use pixelLabelImageSource for training but not prediction.

Code Generation Support for KAZE Detection


The detectKAZEFeatures function and the KAZEPoints object that it returns now supports code
generation.

Functionality Being Removed or Changed

• pixelLabelImageSource: Still runs. Use pixelLabelImageDatastore instead. In R2018a, you cannot create a pixelLabelImageSource object; the pixelLabelImageSource function now creates a pixelLabelImageDatastore object.
• vision.VideoFileWriter, To Multimedia File: Support for the .wmv and .wma file formats was removed, and these formats are no longer supported. Use the .avi file format for writing audio and video, or .mp4 for highly compressed video.
• trainFastRCNNObjectDetector, trainFasterRCNNObjectDetector: Use the same functions, but increase the InitialLearnRate specified by trainingOptions to achieve similar training results. For example, if the InitialLearnRate was between 1e-5 and 1e-6, set it to a value between 1e-3 and 1e-4 in R2018a.
• pcshow, pcshowpair, pcplayer: Use the same functions. For single-color point clouds, the MarkerSize property now approximates the marker diameter with a finer scale. To achieve the same size as in previous releases, use the square root of the value used before. For example, if you used 100 for MarkerSize in previous versions, you must now use 10.


R2017b

Version: 8.0

New Features

Compatibility Considerations

Semantic Segmentation Using Deep Learning: Classify pixel regions in images, evaluate, and visualize segmentation results
Several new features support semantic segmentation using deep learning techniques:

• semanticseg: Perform semantic image segmentation on images and image collections.
• segnetLayers and fcnLayers: Create SegNet and fully convolutional network (FCN) segmentation networks.
• pixelLabelImageSource: Provides training data for semantic segmentation networks.
Additionally supports several on-the-fly data augmentation techniques during training.
• pixelLabelDatastore: Provides a data store object that can be used to read pixel label data.
• evaluateSemanticSegmentation: Evaluate semantic segmentation results using intersection-
over-union (IoU) and other common metrics.
• crop2dLayer: Provides a layer for center cropping an input feature map.
• pixelClassificationLayer: Creates a pixel classification layer.
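For example, a minimal sketch, assuming net is a trained semantic segmentation network and I is a test image (labeloverlay is an Image Processing Toolbox function):

C = semanticseg(I, net);        % categorical label for every pixel
B = labeloverlay(I, C);
imshow(B)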

Image Labeling App: Interactively label individual pixels for semantic segmentation and label regions using bounding boxes for object detection
Use the Image Labeler app for interactive image labeling. You can label rectangular regions of
interest (ROI) for object detection, pixels for semantic segmentation, and scenes for image
classification. You can also define and execute custom label automation algorithms with the app.

Fisheye Camera Calibration: Calibrate fisheye cameras to estimate intrinsic camera parameters
The following functions support calibration of a wide-angle fisheye camera:

• fisheyeCalibrationErrors
• fisheyeParameters
• fisheyeIntrinsics
• fisheyeIntrinsicsEstimationErrors
• undistortFisheyeImage and undistortFisheyePoints

In addition, the showExtrinsics and showReprojectionErrors functions now accommodate fisheye data.

KAZE Features: Detect and extract KAZE features for object recognition or image registration workflows
Detect KAZE points from an image with the detectKAZEFeatures function, which returns a
KAZEPoints object. Extract the KAZE features with the new KAZE method selection of the
extractFeatures function.


Code generation for camera intrinsics


The cameraIntrinsics object now supports code generation.

Image Labeler app replaces Training Image Labeler app


Use the Image Labeler app in place of the Training Image Labeler app. The
trainingImageLabeler function now opens the Image Labeler app.

Ground Truth Labeling Utilities


The groundTruth, groundTruthDataSource, and objectDetectorTrainingData functions are
added to support image labeling.

• groundTruth: Object for storing ground truth labels
• groundTruthDataSource: Create a ground truth data source
• objectDetectorTrainingData: Create training data from ground truth data for an object detector

Computer Vision Example


• Semantic Segmentation Using Deep Learning


R2017a

Version: 7.3

New Features

Bug Fixes

Compatibility Considerations

Deep Learning for Object Detection: Detect objects using Fast R-CNN
and Faster R-CNN object detectors
Use trainFastRCNNObjectDetector to train a Fast R-CNN deep learning object detector. Use
trainFasterRCNNObjectDetector to train a Faster R-CNN deep learning object detector. You can
train a Faster R-CNN detector to detect multiple object classes. Also new this release are the
fastRCNNObjectDetector and fasterRCNNObjectDetector RCNN support functions.

Object Detection Using ACF: Train object detectors using aggregate channel features
Use trainACFObjectDetector to train a classifier to recognize rigid objects. Use
acfObjectDetector to detect objects in images.

Object Detector Evaluation: Evaluate object detector performance, including precision and miss-rate metrics
Use the evaluateDetectionPrecision function to return the average precision as a measure of detection performance. Use the evaluateDetectionMissRate function to evaluate the miss rate metric for object detection.
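For example, a hedged sketch, assuming detectionResults and groundTruthData are tables with one row per test image:

[ap, recall, precision] = evaluateDetectionPrecision(detectionResults, groundTruthData);
plot(recall, precision)
xlabel('Recall'); ylabel('Precision');
title(sprintf('Average Precision = %.2f', ap))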

OpenCV Interface: Integrate OpenCV version 3.1.0 projects with MATLAB
Integrate OpenCV projects with MATLAB using OpenCV version 3.1.0.

Object for storing intrinsic camera parameters


Use the cameraIntrinsics object to store information about a camera’s intrinsic calibration
parameters, including the lens distortion parameters.

Disparity function updated to fix inconsistent results between multiple invocations
In prior releases, the disparity function sporadically returned different results for depth estimation using the SemiGlobal method.

Compatibility Considerations
When you use the disparity function’s SemiGlobal method, the results will be different. Examine
the results from your code carefully to see if you need to make any adjustments.

Improved algorithm to calculate intrinsics in Camera Calibration apps


This release improves the stability of estimating the principal point in the Camera Calibration apps.


Compatibility Considerations
To reproduce prior results, use the estimateCameraParameters function for camera calibration and do not specify the ImageSize property.

14-3
15

R2016b

Version: 7.2

New Features

Bug Fixes

Compatibility Considerations

Deep Learning for Object Detection: Detect objects using region-based convolutional neural networks (R-CNN)
Use the trainRCNNObjectDetector function and the rcnnObjectDetector object to train an R-
CNN deep learning object detector.

Structure from Motion: Estimate the essential matrix and compute camera pose from 3-D to 2-D point correspondences
Use the estimateEssentialMatrix, estimateWorldCameraPose, extrinsicsToCameraPose, and cameraPoseToExtrinsics functions to estimate the 3-D structure of a scene from a set of 2-D images. This release also adds a worldToImage method to the cameraParameters class to project world points into an image.

Point Cloud File I/O: Read and write PCD files using Point Cloud File
I/O Functions
The pcread and pcwrite functions now support PCD (point cloud data) format files.

Code Generation for ARM Example: Detect and track faces on a Raspberry Pi 2 target
This release adds two examples that detail the steps for generating code for detecting and tracking
faces on the Raspberry Pi 2 hardware.

• Detect Face (Raspberry Pi2)


• Track Face (Raspberry Pi2)

Visual Odometry Example: Estimate camera locations and trajectory from an ordered sequence of images
This release adds a visual odometry example, Monocular Visual Odometry, that details the steps for
estimating camera locations and camera trajectory.

cameraPose function renamed to relativeCameraPose


The cameraPose function has been renamed to the more descriptive relativeCameraPose.
Additionally, the function can now accept an essential matrix from the new
estimateEssentialMatrix function.

New capabilities for Training Image Labeler app


You can now use the Training Image Labeler app to:

• Create a full-image region of interest (ROI).


• Add multiple ROI labels (categories).
• Import ROIs from a MAT file or from the workspace.


• Output a table if there are multiple ROI labels.

Train cascade object detector function takes tables and uses imageDatastore
The trainCascadeObjectDetector function can now take positive instances as a table or as a
struct array. It can also take negative images using imageDatastore.

Project 3-D world points into image


The cameraParameters object now provides a worldToImage method that projects 3-D world
points into an image.
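For example, a hedged sketch, assuming cameraParams, rotationMatrix, and translationVector come from calibration and extrinsics estimation, and worldPoints is an M-by-3 matrix:

imagePoints = worldToImage(cameraParams, rotationMatrix, translationVector, worldPoints);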

Code generation support
The following functions and methods now support code generation:

• cameraPoseToExtrinsics
• extrinsicsToCameraPose
• worldToImage method of the cameraParameters object
• estimateEssentialMatrix
• estimateWorldCameraPose
• relativeCameraPose

Plot camera function accepts a table of camera poses


The plotCamera function can now accept and plot a table of camera poses.

Eliminate 3-D points input from extrinsics function


The extrinsics function no longer accepts 3-D x,y,z points as an input. Instead, use the
estimateWorldCameraPose function.

Compatibility Considerations
When you try to input x,y,z points, the extrinsics function issues a warning.

Simpler way to call System objects


Instead of using the step method to perform the operation defined by a System object™, you can call
the object with arguments, as if it were a function. The step method will continue to work. This
feature improves the readability of scripts and functions that use many different System objects.

For example, if you create a vision.Pyramid System object named gaussPyramid, then you call
the System object as a function with that name.

gaussPyramid = vision.Pyramid('PyramidLevel',2);
gaussPyramid(x);


The equivalent operation using the step method is:

gaussPyramid = vision.Pyramid('PyramidLevel',2);
step(gaussPyramid,x);

When the step method has the System object as its only argument, the function equivalent has no
arguments. This function must be called with empty parentheses. For example, step(sysobj) and
sysobj() perform equivalent operations.


R2016a

Version: 7.1

New Features

Bug Fixes

Compatibility Considerations

OCR Trainer App: Train an optical character recognition (OCR) model to recognize a specific set of characters
This release adds the OCR Trainer app.

Structure from Motion: Estimate the camera poses and 3-D structure
of a scene from multiple images
This release adds a collection of functions and objects to support structure from motion.

• bundleAdjustment
• pointTrack
• viewSet with several supporting methods for finding tracks and storing camera poses.
• triangulateMultiview

Pedestrian Detection: Locate pedestrians in images and video using aggregate channel features (ACF)
This release adds the detectPeopleACF function to detect people in a scene.

Bundle Adjustment: Refine estimated locations of 3-D points and camera poses for the structure from motion (SFM) framework
This release adds the bundleAdjustment function to estimate camera poses and 3-D points
simultaneously.

Multiview Triangulation: Triangulate 3-D locations of points matched across multiple images
This release adds the triangulateMultiview function to recover the location of a 3-D world point
from its projections into 2-D images.

Rotate matrix to vector and vector to matrix


This release adds the rotationMatrixToVector and the rotationVectorToMatrix functions. These functions implement the Rodrigues transform.

Select spatially uniform distribution of feature points


This release adds the selectUniform method to the SURFPoints, cornerPoints, and
BRISKPoints objects.

Single camera and stereo camera calibration app enhancements


This release continues to enhance the single camera and stereo camera calibrator apps. The
enhancements include:

16-2
Code Generation, GPU, and Third-Party Support

• Ability to select multiple outlier images that correspond to a high mean reprojection error.
• Minimized analysis charts.
• Removed Viewing tab and placed viewing controls in the toolbar.
• Speeded up calibration for the Single Calibration App.

Point cloud from Kinect V2


This release adds support for extracting point clouds from Kinect V2 using the pcfromkinect
function.

Point cloud viewer enhancements


This release adds enhancements to the pcshow, pcplayer, and pcshowpair viewers. The functions now rotate the point cloud around the center of the axes and show the rotation axis. The display of point clouds is downsampled for large data ranges. The downsampling is for display only; it does not modify the data. This change makes display faster. The functions also now support subplot.

Support package for Xilinx Zynq-based hardware


The Computer Vision System Toolbox™ Support Package for Xilinx Zynq-Based Hardware supports
verification and prototyping of vision algorithms on Zynq boards. HDL Coder™ is required for
customizing the algorithms running on the FPGA fabric of the Zynq device. Embedded Coder® is
required for customizing the algorithms running on the ARM® processor of the Zynq device.

• Target your video processing algorithms to Zynq hardware from Simulink


• Stream HDMI signals into Simulink to explore designs with real data
• Generate HDL vision IP cores using HDL Coder
• Deploy algorithms and visualize using HDMI output on a screen

For additional information, see Computer Vision System Toolbox Support Package for Xilinx Zynq-
Based Hardware .

C code generation support


This release continues to add C code generation support to new and existing functions and objects.

• rotationVectorToMatrix
• rotationMatrixToVector
• insertObjectAnnotation

This release also adds new support for portable C code generation. You can generate C code for your
specific target using any of the newly supported functions. The new support allows you to build the
application for your target using a C++ compiler. The C++ compiler links to OpenCV (Version 2.4.9)
libraries that you provide for the particular target. To build a standalone application, use packNGo
with 'Hierarchical' packType.

The newly supported functions for portable C code generation:

• vision.CascadeObjectDetector


• detectBRISKFeatures
• detectFASTFeatures
• detectMSERFeatures
• disparity
• extractFeatures for BRISK, FREAK, and SURF methods.
• detectSURFFeatures
• vision.PeopleDetector
• vision.PointTracker
• matchFeatures
• opticalFlowFarneback

Future removal warning of several System objects


The Computer Vision System Toolbox begins removal of overlapping functionality with equivalent
functions.

Compatibility Considerations
Starting in this release, when you use any of the System objects listed in the table, MATLAB issues a
warning. Replace the use of the System object with the corresponding function.

Computer Vision System Toolbox System object: Equivalent function

Analysis and Enhancement
• vision.ContrastAdjuster: imadjust, stretchlim
• vision.CornerDetector: detectHarrisFeatures, detectMinEigenFeatures, cornerPoints
• vision.EdgeDetector: edge
• vision.HistogramEqualizer: histeq
• vision.OpticalFlow: opticalFlowLKDoG, opticalFlowLK, opticalFlowFarneback, opticalFlowHS
• vision.BoundaryTracer: bwtraceboundary, bwboundaries

Conversions
• vision.Autothresholder: graythresh, multithresh
• vision.ColorSpaceConverter: rgb2gray, rgb2ycbcr, makecform, applycform
• vision.DemosaicInterpolator: demosaic
• vision.GammaCorrector: imadjust
• vision.ImageComplementer: imcomplement
• vision.ImageDataTypeConverter: im2double, im2single, im2uint8, im2int16, im2uint16

Filtering
• vision.ImageFilter: imfilter
• vision.MedianFilter2D: medfilt2

Geometric Transformations
• vision.GeometricTransformer: imwarp
• vision.GeometricTransformEstimator: fitgeotrans
• vision.GeometricScaler: imresize
• vision.GeometricRotator: imrotate
• vision.GeometricTranslator: imtranslate

Sinks and Sources
• vision.BinaryFileWriter: No support
• vision.BinaryFileReader: No support

Statistics
• vision.Histogram: imhist
• vision.PSNR: psnr

Text & Graphics
• vision.MarkerInserter: insertMarker
• vision.ShapeInserter: insertShape
• vision.TextInserter: insertText

Transforms
• vision.HoughTransform: hough

Utilities
• vision.ImagePadder: padarray


R2015b

Version: 7.0

New Features

Bug Fixes

Compatibility Considerations

3-D Shape Fitting: Fit spheres, cylinders, and planes into 3-D point
clouds using RANSAC
This release adds 3-D point cloud processing functions and classes to fit a sphere, cylinder, or plane
to a point cloud.

• pcfitsphere
• pcfitplane
• pcfitcylinder
• planeModel
• sphereModel
• cylinderModel
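For example, a minimal sketch of plane fitting, assuming ptCloud is a pointCloud object; the 2 cm inlier tolerance is illustrative:

maxDistance = 0.02;
[model, inlierIndices] = pcfitplane(ptCloud, maxDistance);
plane = select(ptCloud, inlierIndices);    % points that support the fitted plane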

Streaming Point Cloud Viewer: Visualize streaming 3-D point cloud data from sensors such as the Microsoft Kinect
This release adds the pcplayer function for visualizing streaming 3-D point cloud data.

Point Cloud Normal Estimation: Estimate normal vectors of a 3-D point cloud
This release adds the pcnormals function to estimate normal vectors for point clouds.

Farneback Optical Flow: Estimate optical flow vectors using the Farneback method
This release adds the opticalFlowFarneback object that allows you to compute optical flow. This
object supports C code generation.

LBP Feature Extraction: Extract local binary pattern features from a grayscale image
This release adds the extractLBPFeatures function that extracts local binary patterns (LBP) from
a grayscale image.

Multilanguage Text Insertion: Insert text into image data, with support for multiple languages
This release adds the ability to insert a TrueType font with the insertText function. Also new, is the
listTrueTypeFonts function to list available TrueType fonts on your system.

3-D point cloud extraction from Microsoft Kinect


This release adds the pcfromkinect function to convert a Microsoft Kinect depth and RGB image to
a 3-D point cloud. To use this function you must have the Image Acquisition Toolbox™ installed.


3-D point cloud displays


This release adds the pcshow function for plotting a 3-D point cloud. The release also adds the
pcshowpair function to visualize the differences between two point clouds.

Downsample point cloud using nonuniform box grid filter


This release adds a new syntax to the pcdownsample function which returns a downsampled point
cloud using a nonuniform box grid filter. The box grid filter can be used as a preprocessing step for
point cloud registration.

Compute relative rotation and translation between camera poses


This release adds the cameraPose function to compute the relative pose of a calibrated camera
based on two views.

Warp block
This release adds the Warp block. Use the Warp block to apply projective or affine transforms to an
image. You can use this block to transform the entire image or portions of the image with either a
polygon or rectangular region of interest (ROI). This block replaces the Apply Geometric
Transformation block.

Compatibility Considerations
Apply Geometric Transformation block will be removed in a future release. Use the Warp block
instead.

GPU support for FAST feature detection


This release adds GPU acceleration for the detectFASTFeatures function. GPU acceleration for
this function requires Parallel Computing Toolbox™.

Camera calibration optimization options


This release adds new properties to the estimateCameraParameters function that provide the
ability to supply an initial guess for calibration parameters prior to optimization.

C code generation support


This release continues to add C code generation support to new and existing functions and objects.

• cameraPose
• detectCheckerboardPoints
• extractLBPFeatures
• generateCheckerboardPoints
• insertText


• opticalFlowFarneback

Examples for face detection, tracking, 3-D reconstruction, and point cloud registration and display
This release adds the following new featured examples:

• The Face Detection and Tracking Using Live Video Acquisition example shows how to
automatically detect and track a face in a live video stream, using the KLT algorithm.
• The Tracking Pedestrians from a Moving Car example shows how to perform automatic detection
and tracking of people in a video from a moving camera.
• The Structure From Motion From Two Views example shows you how to generate a point cloud
from features matched between two images taken with a calibrated camera.
• The 3-D Point Cloud Registration and Stitching example shows how to combine multiple point
clouds to reconstruct a 3-D scene using iterative closest point (ICP) algorithm.

Example using Vision HDL Toolbox for noise removal and image
sharpening
This release, the Vision HDL Toolbox product adds the Noise Removal and Image Sharpening
featured example. This example shows how to implement a front-end module of an image processing
design. This front-end module removes noise and sharpens the image to provide a better initial
condition for the subsequent processing.

Removed video package from Computer Vision System Toolbox


This release removes the use of the video package. It was replaced with the use of the vision
package name.

Compatibility Considerations
Replace the use of the video name with vision.

Morphological System objects future removal warning


The Computer Vision System Toolbox begins removal of overlapping functionality with equivalent
functions in the Image Processing Toolbox™.

Compatibility Considerations
Starting in this release, when you use any of the morphological System objects, MATLAB issues a
warning. Replace the use of the System object with the corresponding Image Processing Toolbox
function.

Computer Vision System Toolbox System object: Equivalent Image Processing Toolbox function

• vision.MorphologicalDilate: imdilate
• vision.MorphologicalOpen: imopen
• vision.MorphologicalClose: imclose
• vision.MorphologicalErode: imerode
• vision.MorphologicalBottomHat: imbothat
• vision.MorphologicalTopHat: imtophat
• vision.ConnectedComponentLabeler: bwlabel or bwlabeln (n-D version)

No edge smoothing in outputs of undistortImage and rectifyStereoImages
The undistortImage and rectifyStereoImages functions were modified in version 7.0 of the
Computer Vision System Toolbox (MATLAB R2015b).

Compatibility Considerations
Previous versions of these functions smoothed the edges of the output image. The current versions do
not smooth the image borders in order to increase speed.

VideoFileReader play count


The vision.VideoFileReader PlayCount default value was changed from inf to 1.

Compatibility Considerations
Any vision.VideoFileReader objects saved in previous versions and loaded into R2015b (or later)
will loop continuously.


R2015a

Version: 6.2

New Features

Bug Fixes

Compatibility Considerations

3-D point cloud functions for registration, denoising, downsampling, geometric transformation, and PLY file reading and writing
This release adds 3-D point cloud processing functions. It also adds an object for storing a point
cloud, and functions to read and write point cloud files in a PLY format.

• pctransform
• pcregrigid
• pcmerge
• pcdownsample
• pcdenoise
• pointCloud
• pcread
• pcwrite

Image search and retrieval using bag of visual words


This release adds functionality for searching large image collections. It adds the retrieveImages,
evaluateImageRetrieval, and indexImages functions, and the invertedImageIndex object.
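For example, a minimal sketch; the image folder is hypothetical:

imgSet = imageSet(fullfile('path','to','imageFolder'));
imageIndex = indexImages(imgSet);                    % builds an invertedImageIndex
queryImage = read(imgSet, 1);
imageIDs = retrieveImages(queryImage, imageIndex);   % ranked indices of similar images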

User-defined feature extractor for bag-of-visual-words framework


This release adds the ability to define a custom feature extractor to use with the bagOfFeatures
framework.

C code generation for eight functions, including rectifyStereoImages and vision.DeployableVideoPlayer on Mac
This release adds C code generation support to eight functions, one System object, and one block.
Additionally, the new optical flow functions support C code generation.

• rectifyStereoImages
• reconstructScene
• undistortImage
• triangulate
• extrinsics
• cameraMatrix
• cameraParameters
• stereoParameters
• vision.DeployableVideoPlayer
• To Video Display block on the Mac
• opticalFlow
• opticalFlowHS
• opticalFlowLK


• opticalFlowLKDoG

Mac support for vision.DeployableVideoPlayer and To Video Display block
This release enables the vision.DeployableVideoPlayer System object and the To Video Display
block on the Mac. These players are capable of displaying high-definition video at high frame rates.

Compatibility Considerations
The player displays frames at the rate of the video input. Setting the FrameRate property has no
effect on changing the rate of the display. The FrameRate property will be removed in a future
release.

Plot camera figure in 3-D coordinate system


This release adds the plotCamera function. The function returns a camera visualization object to
plot a camera figure in the current axes.

Line width for insertObjectAnnotation


This release introduces a LineWidth property for the insertObjectAnnotation function. Setting
this property enables you to modify the line width of inserted annotations.

Upright option for extractFeatures


This release introduces the Upright property for the extractFeatures function. This property
enables you to restrict extracted features to an upright orientation only.

Rotate integral images in integralImage, integralKernel, and integralFilter functions
This release introduces rotated summed area table (RSAT) support to the integralImage,
integralKernel, and integralFilter functions.

Performance improvements
• detectCheckerboardPoints and estimateCameraParameters functions
• integralFilter function
• vision.DeployableVideoPlayer System object
• To Video Display block

Optical flow functions and object


This release adds the opticalFlowHS, opticalFlowLK, and opticalFlowLKDoG functions. These
functions enable you to compute optical flow using the Horn-Schunck and Lucas-Kanade methods.


This release also adds the opticalFlow object for storing optical flow velocity matrices. The
functions and the object support C code generation.

Examples for image retrieval, 3-D point cloud registration and stitching, and code generation for depth estimation from stereo video
• Image Retrieval Using Customized Bag-of-Features
• 3-D Point Cloud Registration and Stitching
• Code Generation for Depth Estimation From Stereo Video


R2014b

Version: 6.1

New Features

Bug Fixes

Stereo camera calibration app


This release adds a stereo calibration app. You can use the app to calibrate a stereo camera. Use the
stereoCameraCalibrator function to invoke the app. See the Stereo Calibration Using the Stereo
Camera Calibrator App tutorial.

imageSet class for handling large collections of image files


This release adds the imageSet class for managing large collections of image files. Use the
imageSet partition method to create subsets of image sets. You can use these subsets to provide
training and validation images for classification.

Bag-of-visual-words suite of functions for image category classification
This release adds a suite of functions to support the bag-of-features workflow. The workflow allows
you to manage your image collections and partition them into training and validation sets. It
constructs a bag of visual words for use in image category classification. The training and
classification includes support for Parallel Computing Toolbox.

• imageSet
• bagOfFeatures
• trainImageCategoryClassifier
• imageCategoryClassifier

Approximate nearest neighbor search method for fast feature matching
This release provides updates to the matchFeatures function. The update replaces previous
matching methods with 'Exhaustive' and 'Approximate' Nearest Neighbor methods. It also
adds the Unique match logical property to only return unique matches from the input feature set.

As a result of this update, the following methods and properties were removed:

• 'NearestNeighborRatio', 'NearestNeighborSymmetric', and 'Threshold' matching methods
• 'normxcorr' normalized cross-correlation metric and the 'Prenormalized' properties

Use the following new methods to match the behavior of the removed properties.

• 'NearestNeighborRatio': Set the Method property to 'Exhaustive' and the Unique property to false.
• 'NearestNeighborSymmetric': Set the Method property to 'Exhaustive', the Unique property to true, and the MaxRatio to 1.


3-D point cloud visualization function


This release adds the showPointCloud function for plotting point clouds.

3-D point cloud extraction from Kinect


This release adds the depthToPointCloud function to convert a Kinect depth image to a 3-D point
cloud. This function requires the Image Acquisition Toolbox.

Kinect color image to depth image alignment


This release adds the alignColorToDepth function for registering a Kinect color image to a depth
image. This function requires the Image Acquisition Toolbox.

Point locations from stereo images using triangulation


This release adds the triangulate function. You can use this function to find 3-D locations of
matching points in stereo images.

Red-cyan anaglyph from stereo images


This release adds the stereoAnaglyph function. Use this function to combine stereo images to form
an anaglyph, which can be viewed with red-blue stereo glasses.

Point coordinates correction for lens distortion


This release adds the undistortPoints function. Use this function to remove the effects of lens
distortion from individual point locations.

Camera projection matrix


This release adds the cameraMatrix function. You can use the matrix returned by this function to
project 3-D world points in homogeneous coordinates into an image.

Calculation of calibration standard errors


This release adds the ability to return the standard errors incurred during the camera calibration
process. The estimateCameraParameters function returns the errors. You can use the errors to
evaluate the quality of the camera calibration. You can return errors for both single and stereo
camera calibration.

Live image capture in Camera Calibrator app


You can now do live camera calibration using the Camera Calibrator app. The new Image Capture
tab allows you to bring live images from USB Webcams into the app. Previously, you had to save your
images to disk and manually add them into the app.

The image capture functionality in the Camera Calibrator app allows you to:

19-3
R2014b

• Capture live images from USB Webcams


• Browse the captured images
• Save acquired images
• Integrate between image acquisition and calibration
• Control camera properties, such as brightness and contrast.

Use the cameraCalibrator function to open the app. Then select Add Images > From camera to
open the Image Capture tab. Select your device, set any properties, and define the capture settings.
You can then capture images and calibrate the camera.

Region of interest (ROI) copy and paste support for Training Image
Labeler app
This release adds the ability to copy-and-paste regions of interest within the Training Image Labeler
app.

Non-maximal suppression of bounding boxes for object detection


This release adds the bboxOverlapRatio and the selectStrongestBbox functions. Use
bboxOverlapRatio to compute the overlap ratio between pairs of bounding boxes. Use
selectStrongestBbox to select the strongest bounding boxes from overlapping clusters.
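For example, a minimal sketch with two hypothetical boxes in [x y width height] format:

bboxA = [10 10 100 100];
bboxB = [50 50 100 100];
overlapRatio = bboxOverlapRatio(bboxA, bboxB);   % intersection over union by default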

Linux support for deployable video player


This release adds Linux® support for the To Video Display block and the
vision.DeployableVideoPlayer System object. This added support includes the ability to
generate code.

GPU support for Harris feature detection


This release adds GPU acceleration for the detectHarrisFeatures function. GPU acceleration for
this function requires Parallel Computing Toolbox.

Extended language support package for optical character recognition (OCR)
This release adds the ability to download additional language support for the ocr function. You can
use the visionSupportPackages function to download the language support package.

Support package for OpenCV Interface


This release adds a support package to help you integrate your OpenCV C++ code into MATLAB. It
lets you build MEX files that calls OpenCV functions. You can use the visionSupportPackages
function to download the OpenCV Interface support package.


Convert format of rectangle to a list of points


This release adds the bbox2points function. You can use this function to convert a rectangle,
specified as [x, y, width, height], into a list of [x, y] points.

Bag-of-visual-words, stereo vision, image stitching, and tracking examples
This release adds several new examples.

• Pedestrian tracking from a moving car


• Image classification using bag-of-visual-words workflow
• Face tracking from a web cam
• Evaluate camera calibration results
• Image stitching
• Depth estimation from a stereo video
• Code generation with PackNGo


R2014a

Version: 6.0

New Features

Bug Fixes

Compatibility Considerations

Stereo vision functions for rectification, disparity calculation, scene reconstruction, and stereo camera calibration
This release adds a suite of stereo vision algorithms to the Computer Vision System Toolbox.

• rectifyStereoImages for stereo rectification.


• reconstructScene for computing dense 3-D reconstruction based on a disparity map.
• extrinsics for computing location of a calibrated camera.
• stereoParameters object for storing stereo system parameters.
• detectCheckerboardPoints extended to support stereo calibration
• disparity adds new method for semi-global block matching.
• estimateCameraParameters extended to calibrate stereo cameras.
• cameraParameters object for storing camera parameters.
• showExtrinsics extended to support stereo cameras.
• showReprojectionErrors extended to support stereo cameras.

Compatibility Considerations
This release modifies the disparity function’s default matching method. The new SemiGlobal default method may produce different results in code that relied on the previous BlockMatching default method. To obtain the same results, set the 'Method' property to 'BlockMatching'.

Optical character recognition (OCR)


This release adds the ocr function and ocrText object. You can use the ocr function to recognize
text using optical character recognition. The ocrText object stores optical character recognition
results.
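For example, a minimal sketch using a sample image shipped with the toolbox:

I = imread('businessCard.png');
results = ocr(I);
recognizedText = results.Text;   % the ocrText object also stores word boxes and confidences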

Binary Robust Invariant Scalable Keypoints (BRISK) feature detection and extraction
This release adds the detectBRISKFeatures function. You can use the Binary Robust Invariant
Scalable Keypoints (BRISK) algorithm to detect multi-scale corner features. This release also adds the
BRISKPoints object to store the BRISK detection results. This release also adds the BRISK descriptor to the extractFeatures function.

App for labeling images for training cascade object detectors


This release adds a Training Image Labeler app. The app can be used to select regions of interest in
images for the purpose of training a classifier. You can invoke the app by using the
trainingImageLabeler function. See the Label Images for Classification Model Training tutorial.


C code generation for Harris and minimum eigenvalue corner detectors using MATLAB Coder
This release adds C code generation support for the detectHarrisFeatures and
detectMinEigenFeatures functions. This release also adds C code generation to the
estimateGeometricTransform function.

Line width control for insertShape function and Draw Shapes block
This release adds line thickness control to the insertShape function and the Draw Shapes block.

Replacing vision.CameraParameters with cameraParameters


This release replaces the vision.CameraParameters object with the cameraParameters object.
The new object contains identical functionality.

Compatibility Considerations
You must replace the vision.CameraParameters with cameraParameters object in your code. If
you attempt to create a vision.CameraParameters object, MATLAB returns an error.

Output view modes and fill value selection added to undistortImage function
This release adds new output view modes and fill value selection to the undistortImage function.
You can control the output view size by setting the OutputView property. You can also set the fill
value with the FillValues property.

Generated code optimized for the matchFeatures function and vision.ForegroundDetector System object
This release provides generated code optimization for the matchFeatures function and the vision.ForegroundDetector System object on Windows®, Linux, and macOS platforms.

Merging mplay viewer into implay viewer


This release merges the mplay viewer function from the Computer Vision System Toolbox into the
implay function in Image Processing Toolbox.

Compatibility Considerations
Use the implay function with functionality identical to mplay. The mplay function will be removed
in a future release.


MPEG-4 and JPEG2000 file formats added to vision.VideoFileWriter System object and To Multimedia File block
This release adds support for writing MPEG-4 and JPEG 2000 file formats with the
vision.VideoFileWriter object and the To Multimedia File block.

Region of interest (ROI) support added to detectMSERFeatures and detectSURFFeatures functions
This release adds region of interest (ROI) support to the detectMSERFeatures and
detectSURFFeatures functions.

MATLAB code script generation added to Camera Calibrator app


This release adds MATLAB code script generation to the Camera Calibrator app.

Featured examples for text detection, OCR, 3-D reconstruction, 3-D dense reconstruction, code generation, and image search
This release, the Computer Vision System Toolbox adds several new featured examples:

• Automatically Detect and Recognize Text in Natural Images


• Image Search using Point Features
• Recognize Text Using Optical Character Recognition (OCR)
• Code Generation for Feature Matching and Registration (updated)
• Stereo Calibration and Scene Reconstruction
• Sparse 3-D Reconstruction From Multiple Views

Play count default value updated for video file reader


This release modifies the default value of the PlayCount property of the VideoFileReader System object from inf to 1. This change allows proper functionality while using the isDone method.


R2013b

Version: 5.3

New Features

Bug Fixes

Compatibility Considerations

Camera intrinsic, extrinsic, and lens distortion parameter estimation using camera calibration app
This release adds a camera calibration app. The app can be used to estimate camera intrinsic and
extrinsic parameters, and to compute parameters needed to remove the effects of lens distortion from
an image. You can invoke the calibrator using the cameraCalibrator function. See the Find
Camera Parameters with the Camera Calibrator tutorial.

Camera calibration functions for checkerboard pattern detection, camera parameter estimation, correct lens distortion, and visualization of results
This release adds a suite of functions that, when used together, provide a workflow to calibrate a
camera:

• detectCheckerboardPoints
• estimateCameraParameters
• generateCheckerboardPoints
• showExtrinsics
• showReprojectionErrors
• undistortImage
• vision.CameraParameters

Histogram of Oriented Gradients (HOG) feature extractor


This release adds the extractHOGFeatures descriptor function. The extracted features encode
local shape information from regions within an image. You can use this function for many tasks
including classification, detection, and tracking.
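For example, a minimal sketch using a grayscale test image shipped with MATLAB:

I = imread('cameraman.tif');
[features, visualization] = extractHOGFeatures(I, 'CellSize', [8 8]);
imshow(I); hold on
plot(visualization)              % overlay the HOG visualization on the image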

C code generation support for 12 additional functions


This release adds C code generation support for several Computer Vision System Toolbox functions,
classes, and System objects.

• extractHOGFeatures
• extractFeatures
• detectSURFFeatures
• disparity
• detectMSERFeatures
• detectFASTFeatures
• vision.CascadeObjectDetector
• vision.PointTracker
• vision.PeopleDetector
• MSERRegions
• cornerPoints

• SURFPoints
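
A minimal code generation sketch under these assumptions: MATLAB Coder is installed, and the entry-point function name and input size are hypothetical.

% featuresFromImage.m -- hypothetical entry-point function
% Generate C code with: codegen featuresFromImage -args {zeros(480,640,'uint8')}
function [features, locations] = featuresFromImage(I) %#codegen
points = detectSURFFeatures(I);
[features, validPoints] = extractFeatures(I, points);
locations = validPoints.Location;    % return plain matrices from the entry point
end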

System objects matlab.system.System warnings


The System object base class, matlab.system.System has been replaced by matlab.System. If
you use matlab.system.System when defining a new System object, a warning message results.

Compatibility Considerations
Change all instances of matlab.system.System in your System objects code to matlab.System.

Restrictions on modifying properties in System object Impl methods


When defining a new System object, certain restrictions affect your ability to modify a property.

You cannot use any of the following methods to modify the properties of an object:

• cloneImpl
• getDiscreteStateImpl
• getDiscreteStateSpecificationImpl
• getNumInputsImpl
• getNumOutputsImpl
• getOutputDataTypeImpl
• getOutputSizeImpl
• isInputDirectFeedthroughImpl
• isOutputComplexImpl
• isOutputFixedSizeImpl
• validateInputsImpl
• validatePropertiesImpl

This restriction is required by code generation, which assumes that these methods do not change any
property values. These methods are validation and querying methods that are expected to be
constant and should not impact the algorithm behavior.

Also, if either of the following conditions exists:

• You plan to generate code for the object


• The object will be used in the MATLAB System block

you cannot modify tunable properties for any of the following runtime methods:

• outputImpl
• processTunedPropertiesImpl
• resetImpl
• setupImpl
• stepImpl

• updateImpl

This restriction prevents tunable parameter updates within the object from interfering with updates
from outside the generated code. Tunable parameters can only be changed from outside the
generated code.

Compatibility Considerations
If any of your class definition files contain code that changes a property in one of the above Impl
methods, move that property code into an allowable Impl method. Refer to the System object Impl
method reference pages for more information.


R2013a

Version: 5.2

New Features

Bug Fixes

Compatibility Considerations

Cascade object detector training using Haar, Histogram of Oriented Gradients (HOG), and Local Binary Pattern (LBP) features

This release adds the trainCascadeObjectDetector function for Haar, Histogram of Oriented
Gradients (HOG), and Local Binary Pattern (LBP) features. The function creates a custom
classification model to use with the vision.CascadeObjectDetector cascade object detector.
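
A minimal sketch, assuming positiveInstances is a struct array with imageFilename and objectBoundingBoxes fields, negativeFolder contains background images, and the output file name is hypothetical:

trainCascadeObjectDetector('myDetector.xml', positiveInstances, negativeFolder, ...
    'FeatureType', 'HOG', 'NumCascadeStages', 5);
detector = vision.CascadeObjectDetector('myDetector.xml');
bboxes = step(detector, I);           % M-by-4 [x y width height] detections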

Fast Retina Keypoint (FREAK) algorithm for feature extraction


This release adds the Fast Retina Keypoint (FREAK) descriptor algorithm to the extractFeatures
function. This function now supports the FREAK method for descriptor extraction.

Hamming distance method for matching features


This release adds the Hamming distance method to the matchFeatures function in support of binary
features produced by descriptors such as the FREAK method for extraction. It also adds the new
binaryFeatures object, which is an output of the extractFeatures function and serves as an
input to the matchFeatures function.
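
A minimal sketch combining the two features, assuming I1 and I2 are grayscale images of the same scene:

points1 = detectSURFFeatures(I1);
points2 = detectSURFFeatures(I2);
[f1, vpts1] = extractFeatures(I1, points1, 'Method', 'FREAK');   % binaryFeatures output
[f2, vpts2] = extractFeatures(I2, points2, 'Method', 'FREAK');
indexPairs = matchFeatures(f1, f2);   % Hamming distance is used for binaryFeatures
showMatchedFeatures(I1, I2, vpts1(indexPairs(:,1)), vpts2(indexPairs(:,2)));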

Multicore support in matchFeatures function and ForegroundDetector System object

This release brings multicore performance improvements for the matchFeatures function and the
vision.ForegroundDetector detector.

Functions for corner detection, geometric transformation estimation, and text and graphics overlay, augmenting similar System objects

This release adds several new functions. For corner detection, it adds the detectHarrisFeatures,
detectMinEigenFeatures, and detectFASTFeatures functions. It also adds the insertText,
insertMarker, and insertShape functions for inserting text, markers, and shapes into images and
video. Lastly, it adds the estimateGeometricTransform function for estimating a geometric
transformation from putatively matched point pairs.
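
A minimal sketch using several of these functions together, assuming I1 and I2 are grayscale images of the same scene:

corners1 = detectHarrisFeatures(I1);
corners2 = detectHarrisFeatures(I2);
[f1, vpts1] = extractFeatures(I1, corners1);
[f2, vpts2] = extractFeatures(I2, corners2);
indexPairs = matchFeatures(f1, f2);
tform = estimateGeometricTransform(vpts1(indexPairs(:,1)), vpts2(indexPairs(:,2)), 'affine');
J = insertMarker(I1, vpts1(indexPairs(:,1)).Location, 'plus');
J = insertText(J, [10 10], 'Matched corners');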

Error-out condition for old coordinate system


This release ends support for the row-column coordinate system in the Computer Vision System
Toolbox algorithms. All blocks are replaced with blocks that use [x y] coordinates, and all functions and
System objects are updated to use the one-based [x y] convention. MATLAB and Simulink algorithms
that rely on RC-based functions or blocks now error out.

Compatibility Considerations
Conventions for indexing, spatial coordinates, and representation of geometric transforms were
changed in R2011b to provide improved interoperability with the Image Processing Toolbox product.
Beginning in this release, all Computer Vision System Toolbox blocks, functions, classes, and System
objects will only operate in the [x y] coordinate system. Blocks affected by the [x y] coordinate system
should be replaced with blocks of the same name from the Vision library. Adjust your models, code,
and data as necessary.


For extended details on the coordinate system change, see “Conventions Changed for Indexing,
Spatial Coordinates, and Representation of Geometric Transforms” in the R2011b Release Notes.

Support for nonpersistent System objects


You can now generate code for local variables that contain references to System objects. In previous
releases, you could not generate code for these objects unless they were assigned to persistent
variables.

New method for action when System object input size changes
The new processInputSizeChangeImpl method allows you to specify actions to take when an
input to a System object you defined changes size. If an input changes size after the first call to step,
the actions defined in processInputSizeChangeImpl occur when step is next called on that
object.
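
A minimal sketch of a user-defined System object that reacts to an input size change; the class name and behavior are hypothetical:

classdef FrameBufferer < matlab.System
    properties (Access = private)
        Buffer
    end
    methods (Access = protected)
        function setupImpl(obj, u)
            obj.Buffer = zeros(size(u), class(u));
        end
        function processInputSizeChangeImpl(obj, u)
            % Reallocate state when the input size changes between calls to step
            obj.Buffer = zeros(size(u), class(u));
        end
        function y = stepImpl(obj, u)
            y = obj.Buffer;          % return the previously buffered frame
            obj.Buffer = u;
        end
    end
end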

Scaled double data type support for System objects


System objects now support scaled double data types.

Scope Snapshot display of additional scopes in Simulink Report Generator

Using Simulink Report Generator™ software, you can include snapshots of the display produced by a
Scope block in a generated report. The Scope Snapshot component, which inserts images of the
Simulink Scope block and XY Graph block, now supports the Video Viewer block in Computer Vision
System Toolbox software.

Note This feature requires that you have a license for the Simulink Report Generator product.

For more information, see the Simulink Report Generator product documentation.


R2012b

Version: 5.1

New Features

Bug Fixes

Compatibility Considerations

Kalman filter and Hungarian algorithm for multiple object tracking


The vision.KalmanFilter object is designed for object tracking. You can use it to predict an
object's future location, to reduce noise in the detected location, or to help associate multiple objects
with their corresponding tracks. The configureKalmanFilter function helps you to set up the
Kalman filter object.

The assignDetectionsToTracks function assigns detections to tracks in the context of multiple
object tracking using the James Munkres variant of the Hungarian assignment algorithm. The
function also determines which tracks are missing, and which detections should begin a new track.
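
A minimal sketch, assuming detectedLocation is an [x y] measurement from a detector and costMatrix and costOfNonAssignment come from your tracking logic:

kalmanFilter = configureKalmanFilter('ConstantVelocity', detectedLocation, ...
    [200 50], [100 25], 100);                 % assumed noise settings
predictedLocation = predict(kalmanFilter);    % predict before the next detection arrives
trackedLocation = correct(kalmanFilter, detectedLocation);

[assignments, unassignedTracks, unassignedDetections] = ...
    assignDetectionsToTracks(costMatrix, costOfNonAssignment);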

Image and video annotation for detected or tracked objects


The insertObjectAnnotation function inserts labels and corresponding circles or rectangles into
an image or video to easily display tracked objects. You can use it with either a grayscale or true color
image input.
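
A minimal sketch, assuming bboxes is an M-by-4 [x y width height] matrix and labels is a cell array of strings:

J = insertObjectAnnotation(I, 'rectangle', bboxes, labels);
figure; imshow(J);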

Kanade-Lucas-Tomasi (KLT) point tracker


The vision.PointTracker object tracks a set of points using the Kanade-Lucas-Tomasi (KLT)
feature-tracking algorithm. You can use the point tracker for video stabilization, camera motion
estimation, and object tracking.
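
A minimal sketch, assuming points is an M-by-2 [x y] matrix from a feature detector and frame1 and frame2 are consecutive video frames:

tracker = vision.PointTracker('MaxBidirectionalError', 2);
initialize(tracker, points, frame1);
[trackedPoints, validity] = step(tracker, frame2);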

HOG-based people detector


The vision.PeopleDetector object detects people in an input image using the Histogram of
Oriented Gradient (HOG) features and a trained Support Vector Machine (SVM) classifier. The object
detects unoccluded people in an upright position.
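
A minimal sketch, assuming I is an image that contains upright people:

peopleDetector = vision.PeopleDetector;       % HOG features + trained SVM
[bboxes, scores] = step(peopleDetector, I);
J = insertObjectAnnotation(I, 'rectangle', bboxes, scores);
figure; imshow(J);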

Video file reader support for H.264 codec (MPEG-4) on Windows 7


This release adds H.264 codec (MPEG-4) video formats for Windows 7 operating systems.

Show matched features display


The showMatchedFeatures function displays corresponding feature points. It displays a false-color
overlay of two images with a color-coded plot of the corresponding points connected by a line.

Matching methods added for match features function


This release enhances the matchFeatures function for applications in computing the fundamental
matrix, stereo vision, registration, and object detection. It provides three different matching methods:
simple threshold match, unique matches, and unambiguous matches.

Compatibility Considerations
The new implementation of matchFeatures uses a different default value for the Method parameter.
If you need the same results as those produced by the previous implementation, set the Method
parameter with this syntax:


matchFeatures(FEATURES1, FEATURES2, 'Method', 'NearestNeighbor_old', ...).

Kalman filter for tracking tutorial


The Kalman filter is a popular tool for object tracking. The Using Kalman Filter for Object Tracking
example helps you to understand how to setup and use the vision.KalmanFilter object and the
configureKalmanFilter function to track objects.

Motion-based multiple object tracking example


The Motion-Based Multiple Object Tracking example shows you how to perform automatic detection
and motion-based tracking of moving objects in a video from a stationary camera.

Face detection and tracking examples


The Face Detection and Tracking example shows you how to automatically detect and track a face.
The Face Detection and Tracking Using the KLT Algorithm example uses the Kanade-Lucas-Tomasi
(KLT) algorithm and shows you how to automatically detect a face and track it using a set of feature
points.

Stereo image rectification example


This release enhances the Stereo Image Rectification example. It uses SURF feature detection with
the estimateFundamentalMatrix, estimateUncalibratedRectification, and
detectSURFFeatures functions to compute the rectification of two uncalibrated images, where the
camera intrinsics are unknown.

System object tunable parameter support in code generation


You can change tunable properties in user-defined System objects at any time, regardless of whether
the object is locked. For System objects predefined in the software, the object must be locked. In
previous releases, you could tune System object properties only for a limited number of predefined
System objects in generated code.

save and load for System objects


You can use the save method to save System objects to a MAT file. If the object is locked, its state
information is saved, also. You can recall and use those saved objects with the load method.

You can also create your own save and load methods for a System object you create. To do so, use
the saveObjectImpl and loadObjectImpl, respectively, in your class definition file.

Save and restore SimState not supported for System objects


The Save and Restore Simulation State as SimState option is no longer supported for any System
object in a MATLAB Function block. This option was removed because it prevented parameter
tunability for System objects, which is important in code generation.


Compatibility Considerations
If you need to save and restore simulation states, you may be able to use a corresponding Simulink
block, instead of a System object.


R2012a

Version: 5.0

New Features

Bug Fixes

Compatibility Considerations

Dependency on DSP System Toolbox and Signal Processing Toolbox Software Removed

The DSP System Toolbox™ and Signal Processing Toolbox™ software are no longer required products
for using Computer Vision System Toolbox software. As a result, a few blocks have been modified or
removed.

Audio Output Sampling Mode Added to the From Multimedia File Block

The From Multimedia File block now includes a new parameter, which allows you to select frame- or
sample-based audio output. If you do not have a DSP System Toolbox license and you set this
parameter for frame-based processing, your model will return an error. The Computer Vision System
Toolbox software uses only sample-based processing.

Kalman Filter and Variable Selector Blocks Removed from Library

This release removes the Kalman Filter and Variable Selector Blocks from the Computer Vision
System Toolbox block library.

Compatibility Considerations
To use these blocks or to run a model containing these blocks, you must have a DSP System Toolbox
license.

2-D Median and 2-D Histogram Blocks Replace Former Median and Histogram Blocks

The Median and Histogram blocks have been removed. You can replace these blocks with the 2-D
Median and the 2-D Histogram blocks.

Compatibility Considerations
Replace these blocks in your models with the new 2-D blocks from the Computer Vision System
Toolbox library.

Removed Sample-based Processing Checkbox from 2-D Maximum, 2-D Minimum, 2-D
Variance, and 2-D Standard Deviation Blocks

This release removes the Treat sample-based row input as a column checkbox from the 2-D
Maximum, 2-D Minimum, 2-D Variance, and 2-D Standard Deviation statistics blocks.

Compatibility Considerations
Modify your code accordingly.

New Viola-Jones Cascade Object Detector


The vision.CascadeObjectDetector System object uses the Viola-Jones algorithm to detect
objects in an image. This detector includes Haar-like features and a cascade of classifiers. The
cascade object detector is pretrained to detect faces, noses and other objects.
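
A minimal sketch, assuming I is an image containing frontal faces (the default pretrained model):

faceDetector = vision.CascadeObjectDetector;   % frontal-face model by default
bboxes = step(faceDetector, I);                % M-by-4 [x y width height] detections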


New MSER Feature Detector


The detectMSERFeatures function detects maximally stable extremal regions (MSER) features in a
grayscale image. You can use the MSERRegions object, returned by the function, to manipulate and
plot MSER features.
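
A minimal sketch, assuming I is a grayscale image:

regions = detectMSERFeatures(I);
figure; imshow(I); hold on;
plot(regions, 'showPixelList', true, 'showEllipses', false);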

New CAMShift Histogram-Based Tracker


The vision.HistogramBasedTracker System object uses the continuously adaptive mean shift
(CAMShift) algorithm for tracking objects. It uses the histogram of pixel values to identify the object.
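
A minimal sketch, assuming hueChannel is the hue plane of a video frame and initialBBox is an [x y width height] region around the object to track:

tracker = vision.HistogramBasedTracker;
initializeObject(tracker, hueChannel, initialBBox);
bbox = step(tracker, hueChannel);   % call again on each new frame's hue channel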

New Integral Image Computation and Box Filtering


The integralKernel object with the integralImage and integralFilter functions use integral
images for filtering an image with box filters. The speed of the filtering operation is independent of
the filter size, making it ideally suited for fast analysis of images at different scales.
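
A minimal sketch applying a 7-by-7 box (averaging) filter through an integral image; the input image name is an assumed example:

I = im2single(imread('pout.tif'));
intImage = integralImage(I);
boxFilter = integralKernel([1 1 7 7], 1/49);   % [x y width height], uniform weights
J = integralFilter(intImage, boxFilter);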

New Demo to Detect and Track a Face


This release provides a new demo, Face Detection and Tracking Using CAMShift. This
example shows you how to develop a simple face tracking system by detecting a face, identifying its
facial features, and tracking it.

Improved MATLAB Compiler Support


MATLAB Compiler™ now supports detectSURFFeatures and disparity functions.

Code Generation Support


The vision.HistogramBasedTracker and vision.CornerDetector System objects now
support code generation. See About MATLAB Coder for more information about code generation.

Conversion of Error and Warning Message Identifiers


This release changes error and warning message identifiers.

Compatibility Considerations
If you have scripts or functions using message identifiers that have changed, you must update the
code to use the new identifiers. Typically, you use message identifiers to turn off specific warning
messages. You can also use them in code that uses a try/catch statement and performs an action
based on a specific error identifier.

For example, the <'XXXXX:old:ID'> identifier has changed to <'new:similar:ID'>. If your code checks
for <'XXXXX:old:ID'>, you must update it to check for <'new:similar:ID'> instead.

To determine the identifier for a warning, run the following command just after you see the warning:

[MSG,MSGID] = lastwarn;


This command saves the message identifier to the variable MSGID.

To determine the identifier for an error that appears at the MATLAB prompt, run the following
command just after you see the error.

exception = MException.last;
MSGID = exception.identifier;

Note Warning messages indicate a potential issue with your code. While you can turn off a warning,
a suggested alternative is to change your code without producing a warning.

System Object Updates


Code Generation for System Objects

System objects defined by users now support C code generation. To generate code, you must have the
MATLAB Coder product.

New System Object Option on File Menu

The File menu on the MATLAB desktop now includes a New > System object menu item. This option
opens a System object class template, which you can use to define a System object class.

Variable-Size Input Support for System Objects

System objects that you define now support inputs that change size at runtime.

Data Type Support for User-Defined System Objects

System objects that you define now support all MATLAB data types as inputs and outputs.

New Property Attribute to Define States

R2012a adds the new DiscreteState attribute for properties in your System object class definition
file. Discrete states are values calculated during one step of an object’s algorithm that are needed
during future steps.

New Methods to Validate Properties and Get States from System Objects

The following methods have been added:

• validateProperties – Checks that the System object is in a valid configuration. This applies
only to objects that have a defined validatePropertiesImpl method
• getDiscreteState – Returns a struct containing a System object’s properties that have the
DiscreteState attribute

matlab.system.System changed to matlab.System

The base System object class name has changed from matlab.system.System to matlab.System.


Compatibility Considerations
The previous matlab.system.System class will remain valid for existing System objects. When you
define new System objects, your class file should inherit from the matlab.System class.


R2011b

Version: 4.1

New Features

Bug Fixes

Compatibility Considerations

Conventions Changed for Indexing, Spatial Coordinates, and Representation of Geometric Transforms

Conventions for indexing, spatial coordinates, and representation of geometric transforms have been
changed to provide improved interoperability with the Image Processing Toolbox product.

Running your Code with New Conventions

• Code written with R2011b or later (new user): You can safely ignore the warning, and turn it off. Your code will use the one-based [x y] coordinate system. To turn the warning off, place the following command in your startup.m file:

warning('off','vision:transition:usesOldCoordinates')

• Code written prior to R2011b: To run your pre-R2011b code using the zero-based [row column] conventions, invoke the vision.setCoordinateSystem('RC') command before running your code. Support for the pre-R2011b coordinate system will be removed in a future release, so you should update your code to use the R2011b coordinate system conventions. To turn the warning off, place the same command in your startup.m file:

warning('off','vision:transition:usesOldCoordinates')

One-Based Indexing

The change from zero-based to one-based indexing simplifies the ability to blend Image Processing
Toolbox functionality with Computer Vision System Toolbox algorithms and visualization functions.

Coordinate System Convention

Image locations in the Computer Vision System Toolbox are now expressed in [x y] coordinates, not in
[row column]. The orientation of matrices containing image locations has changed. In previous
releases, the orientation was a 2-by-N matrix of zero-based [row column] point coordinates. Effective
in R2011b, the orientation is an M-by-2 matrix of one-based [x y] point coordinates. Rectangular ROI
representation changed from [r c height width] to [x y width height].

Example: Convert a point represented in the [r c] coordinate system to a point in the [x y] coordinate system

Convert your data to be consistent with the MATLAB and Image Processing Toolbox coordinate
systems by switching the order of indexing and adding 1 to each dimension. The row index dimension
corresponds to the y index, and the column index corresponds to the x index. The following figure
shows the equivalent row-column and x-y coordinates for a pixel location in an image.


The following MATLAB code converts point coordinates from an [r c] coordinate system to the [x y]
coordinate system:
ptsRC = [2 0; 3 5] % Two RC points at [2 3] and [0 5]
ptsXY = fliplr(ptsRC'+1) % RC points converted to XY

Example: Convert a bounding box represented in the [r c] coordinate system to the [x y] coordinate system

% Two bounding boxes represented as [r c height width]
% First box is [2 3 10 5] and the second box is [0 5 15 10]
bboxRC = [2 0; 3 5; 10 15; 5 10]
% Convert the boxes to XY coordinate system format [x y width height]
bboxXY = [fliplr(bboxRC(1:2,:)'+1) fliplr(bboxRC(3:4,:)')]

Example: Convert an affine geometric transformation matrix represented in the [r c] coordinate system to the [x y] coordinate system

% Transformation matrix [h1 h2 h3; h4 h5 h6] represented in RC coordinates
tformRC = [5 2 3; 7 8 13]
% Transformation matrix [h5 h2; h4 h1; h6 h3] represented in XY coordinates
temp = rot90(tformRC,3);
tformXY = [flipud(temp(1:2,:)); temp(3,:)]

Note: You cannot use this code to remap a projective transformation matrix. You must derive the
tformXY matrix from your data.

See Expressing Image Locations for an explanation of pixel and spatial coordinate systems.

Migration to [x y] Coordinate System

By default, all Computer Vision System Toolbox blocks, functions, and System objects are set to
operate in the [x y] coordinate system. Use the vision.setCoordinateSystem and
vision.getCoordinateSystem functions to help you migrate your code, by enabling you to revert
to the previous coordinate system until you can update your code. Call
vision.setCoordinateSystem('RC') to set the coordinate system back to the zero-based [r c]
conventions.

For Simulink users, blocks affected by the [x y] coordinate system should be replaced with blocks of
the same name from the Vision library. Old blocks are marked with a red “Replace” badge. The
following figure shows the Hough Lines block, as it would appear with the Replace badge, indicating
that it should be replaced with the Hough Lines block from the R2011b version.

Support for the pre-R2011b coordinate system will be removed in a future release.

Updated Blocks, Functions, and System Objects

The following lists provide specifics for the functions, System objects, and blocks that were affected
by this update, comparing the pre-R2011b format with the R2011b format.

Functions

• epipolarLine — The output A, B, C line parameters now follow A*x + B*y + C with one-based [x y] coordinates (previously A*row + B*col + C). Accepts the fundamental matrix in [x y] format.
• estimateFundamentalMatrix — Points are now M-by-2 one-based [x y] (previously 2-by-N zero-based [r;c]); the fundamental matrix is formatted to work with [x y] one-based coordinates.
• estimateUncalibratedRectification — The fundamental matrix, matched points, and output projective transformation matrices are provided in the new format: one-based [x y] coordinates and M-by-2 points (previously zero-based [r;c] coordinates and 2-by-N points).
• extractFeatures — Accepts M-by-2 one-based [x y] points (previously 2-by-N zero-based [r;c] points).
• isEpipoleInImage — The fundamental matrix is now formatted for the one-based [x y] coordinate system (previously zero-based [r;c]).
• lineToBorderPoints — The input A, B, C line parameters now follow A*x + B*y + C in an M-by-3 matrix of one-based [x y] values (previously A*row + B*col + C in a 3-by-N zero-based matrix). The intersection points are returned in an M-by-4 matrix in the one-based [x1 y1 x2 y2] format (previously a 4-by-M matrix in the zero-based [r1;c1;r2;c2] format).
• matchFeatures — The output index pairs are returned in an M-by-2 one-based format (previously 2-by-M zero-based), and input feature vectors are stored in rows (previously columns).

System Objects

• vision.AlphaBlender — The Location property takes one-based [x y] coordinates (previously zero-based [r;c]).
• vision.BlobAnalysis — The centroid output is M-by-2 in the one-based [x1 y1; x2 y2] format (previously 2-by-M zero-based [r1 r2; c1 c2]); the bounding box output is an M-by-4 one-based [x y width height] matrix (previously a 4-by-N zero-based [r;c;height;width] matrix).
• vision.BoundaryTracer — Accepts and outputs M-by-2 one-based [x y] points (previously 2-by-N zero-based [r c]).
• vision.CornerDetector — Corner locations are M-by-2 one-based [x y] coordinates (previously 2-by-N zero-based [r c]).
• vision.GeometricScaler — The ROI input is a one-based [x y width height] matrix (previously zero-based [r c height width]).
• vision.GeometricTransformer — The transformation matrix takes the one-based [x y] format; the ROI is in the one-based [x y width height] format (previously zero-based [r;c;height;width]).
• vision.GeometricTransformEstimator — Input points are [x1 y1; x2 y2] (previously [r1 r2; c1 c2]); the transformation matrix now matches the Image Processing Toolbox format (previously formatted only for zero-based [r;c] coordinates).
• vision.HoughLines — Lines are output as an M-by-4 one-based matrix [x11 y11 x12 y12; x21 y21 x22 y22] (previously a 4-by-N zero-based matrix [r11 r21; c11 c21; r12 r22; c12 c22]).
• vision.LocalMaximaFinder — Maxima locations are M-by-2 one-based [x y] coordinates (previously 2-by-N zero-based [r c]).
• vision.MarkerInserter — Locations are M-by-2 one-based [x y] coordinates (previously 2-by-N zero-based [r c]).
• vision.Maximum, vision.Mean, vision.Minimum, vision.StandardDeviation, vision.Variance — The line ROI is [x1 y1 x2 y2 x3 y3] (previously [r1 c1 r2 c2 r3 c3]); the rectangle ROI is [x y width height] (previously [r c height width]).
• vision.ShapeInserter — Rectangles: one-based [x y width height] (previously zero-based [r; c; height; width]); lines: one-based [x1 y1 x2 y2] (previously zero-based [r1 c1 r2 c2]); polygons: M-by-4 one-based matrix (previously 4-by-M zero-based); circles: one-based [x y radius] (previously zero-based [r c radius]). Input image intensity values use the M-by-N and M-by-N-by-P one-based [x y] format (previously N-by-M and N-by-M-by-P zero-based [r c]).
• vision.TemplateMatcher — The location output is in the one-based [x y] format (previously zero-based [r; c]); ROI processing uses the one-based [x y width height] format (previously zero-based [r c height width]).
• vision.TextInserter — Locations are M-by-2 one-based [x y] (previously 2-by-N zero-based [r;c]); color is M-by-numColorPlanes one-based (previously numColorPlanes-by-N zero-based).

Blocks

• Apply Geometric Transformation — The transformation matrix takes the one-based [x y] format; the ROI is in the one-based [x y width height] format (previously zero-based [r;c;height;width]).
• Blob Analysis — The centroid output is M-by-2 in the one-based [x1 y1; x2 y2] format (previously 2-by-M zero-based [r1 r2; c1 c2]); the bounding box output is an M-by-4 one-based [x y width height] matrix (previously a 4-by-N zero-based [r;c;height;width] matrix).
• Compositing — The Location property takes one-based [x y] coordinates (previously zero-based [r;c]).
• Corner Detection — Corner locations are M-by-2 one-based [x y] coordinates (previously 2-by-N zero-based [r c]).
• Draw Markers — Locations are M-by-2 one-based [x y] coordinates (previously 2-by-N zero-based [r c]).
• Draw Shapes — Rectangles: one-based [x y width height] (previously zero-based [r; c; height; width]); lines: one-based [x1 y1 x2 y2] (previously zero-based [r1 c1 r2 c2]); polygons: M-by-4 one-based matrix (previously 4-by-M zero-based); circles: one-based [x y radius] (previously zero-based [r c radius]).
• Estimate Geometric Transformation — Input points are [x1 y1; x2 y2] (previously [r1 r2; c1 c2]); the transformation matrix now matches the Image Processing Toolbox format (previously T = [t22 t12 t32; t21 t11 t31; t23 t13 t33]).
• Find Local Maxima — Maxima locations are M-by-2 one-based [x y] coordinates (previously 2-by-N zero-based [r c]).
• Hough Lines — Lines are output as an M-by-4 one-based matrix [x11 y11 x12 y12; x21 y21 x22 y22] (previously a 4-by-N zero-based matrix [r11 r21; c11 c21; r12 r22; c12 c22]).
• Template Matching — The location output is in the one-based [x y] format (previously zero-based [r; c]); ROI processing uses the one-based [x y width height] format (previously zero-based [r c height width]).
• Insert Text — Locations are M-by-2 one-based [x y] (previously 2-by-N zero-based [r;c]); color is M-by-numColorPlanes one-based (previously numColorPlanes-by-N zero-based).
• 2-D Maximum, 2-D Mean, 2-D Minimum, 2-D Standard Deviation, 2-D Variance — The line ROI is [x1 y1 x2 y2 x3 y3] (previously [r1 c1 r2 c2 r3 c3]); the rectangle ROI is [x y width height] (previously [r c height width]).
• Resize — The ROI input is a one-based [x y width height] matrix (previously zero-based [r c height width]).
• Trace Boundary — Accepts and outputs M-by-2 one-based [x y] points (previously 2-by-N zero-based [r c]).

Compatibility Considerations
Blocks affected by the [x y] coordinate system should be replaced with blocks of the same name from
the Vision library. Old blocks are marked with a red “Replace” badge.

Adjust your model and data as necessary. All functions and System objects are updated to use the
one-based [x y] convention.

By default, all Computer Vision System Toolbox blocks, functions, and System objects are set to
operate in the [x y] coordinate system. Use the vision.setCoordinateSystem and
vision.getCoordinateSystem functions to help migrate your code containing System objects and
functions to the [x y] coordinate system. Call vision.setCoordinateSystem('RC') to
temporarily set the coordinate system to the old conventions.

When you invoke an affected block, object, or function, a one-time warning appears per MATLAB
session.

See the section, Expressing Image Locations for a description of the coordinate systems now used by
the Computer Vision System Toolbox product.


New SURF Feature Detection, Extraction, and Matching Functions


This release introduces a new Speeded Up Robust Features (SURF) detector with functions
supporting interest feature detection, extraction and matching. The detectSURFFeatures function
returns information about SURF features detected in a grayscale image. You can use the
SURFPoints object returned by the detectSURFFeatures function to manipulate and plot SURF
features.
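
A minimal sketch, assuming I is a grayscale image:

points = detectSURFFeatures(I);
figure; imshow(I); hold on;
plot(points.selectStrongest(10));   % plot the 10 strongest SURF points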

New Disparity Function for Depth Map Calculation


The new disparity function provides the disparity map between a pair of stereo images. You can
use the disparity function to find relative depth of the scene for tasks such as, segmentation, robot
navigation, or 3-D scene reconstruction.
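
A minimal sketch, assuming I1 and I2 are rectified grayscale stereo images and the disparity range is an assumed value:

disparityMap = disparity(I1, I2, 'DisparityRange', [0 64]);
figure; imshow(disparityMap, [0 64]); colormap jet; colorbar;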

Added Support for Additional Video File Formats for Non-Windows Platforms

The From Multimedia File block and the vision.VideoFileReader now support many compressed
video file formats on Linux and Macintosh OS X platforms.

Variable-Size Support for System Objects


Computer Vision System Toolbox System objects support inputs that change their size at run time.

New Demo to Retrieve Rotation and Scale of an Image Using Automated Feature Matching

This release provides a new demo, Find Image Rotation and Scale Using Automated
Feature Matching. This demo shows you how to use the
vision.GeometricTransformEstimator System object and the new detectSURFFeatures
function to find the rotation angle and scale factor of a distorted image.

Apply Geometric Transformation Block Replaces Projective Transformation Block

The Projective Transformation block will be removed in a future release. It is recommended that you
replace this block with the combination of Apply Geometric Transformation and the Estimate
Geometric Transformation blocks to apply projective or affine transform to an image.

Trace Boundaries Block Replaced with Trace Boundary Block


This release provides a replacement block for the Trace Boundaries block. The Trace Boundary block
now returns variable size data. See Working with Variable-Size Signals for more information about
variable size data.

Note Unlike the Trace Boundaries block, the new Trace Boundary block only traces a single
boundary.


The Trace Boundaries block will be removed in a future release.

Compatibility Considerations
The new Trace Boundary block no longer provides the Count output port that the older Trace
Boundaries block provided. Instead, the new Trace Boundary block and the corresponding
vision.BoundaryTracer System object now return variable size data.

FFT and IFFT Support for Non-Power-of-Two Transform Length with FFTW Library

The 2-D FFT and 2-D IFFT blocks and the vision.IFFT and vision.FFT System objects include the
use of the FFTW library. The blocks and objects now support non-power-of-two transform lengths.

vision.BlobAnalysis Count and Fill-Related Properties Removed


The blob analysis System object now supports variable-size outputs. Therefore, the Count output
and the NumBlobsOutputPort, FillEmptySpaces, and FillValues properties related to fixed-size
outputs were removed from the object.

Compatibility Considerations
Remove these properties from your code, and update accordingly. If you require an explicit blob
count, call size on one of the object’s outputs, such as AREA.
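
For example, a minimal sketch assuming bw is a binary image:

blobAnalyzer = vision.BlobAnalysis;
[area, centroid, bbox] = step(blobAnalyzer, bw);
numBlobs = size(area, 1);   % explicit count, replacing the removed Count output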

vision.CornerDetector Count Output Removed


The corner detector System object now supports variable-size outputs. Therefore, the Count output
related to fixed-size outputs was removed from the object.

Compatibility Considerations
Update your code accordingly. If you require an explicit count, call size on the object METRIC
output.

vision.LocalMaximaFinder Count Output and CountDataType Property Removed

The local maxima finder System object now supports variable-size outputs. Therefore, the Count
output and the CountDataType property related to fixed-size outputs were removed from the
object.

Compatibility Considerations
Remove the property from your code, and update accordingly.


vision.GeometricTransformEstimator Default Properties Changed


The following default property values for the vision.GeometricTransformEstimator System
object have been changed to provide more reliable outputs.

Property                       Previous default    New default
Transform                      Projective          Affine
AlgebraicDistanceThreshold     1.5                 2.5
PixelDistanceThreshold         1.5                 2.5
NumRandomSamplings             100                 500
MaximumRandomSamples           200                 1000

Compatibility Considerations
The effect of these changes makes the object’s default-value computations more reliable. If your code
relies on the previous default values, you might need to update the affected property values.

Code Generation Support


The vision.IFFT System object now supports code generation. See About MATLAB Coder for more
information about code generation.

vision.MarkerInserter and vision.ShapeInserter Properties Not Tunable

The following vision.MarkerInserter and vision.ShapeInserter properties are now
nontunable:

• FillColor
• BorderColor

When objects are locked (for instance, after calling the step method), you cannot change any
nontunable property values.

Compatibility Considerations
Review any code that changes any vision.MarkerInserter or vision.ShapeInserter property
value after calling the step method. You should update the code to use property values that do not
change.

Custom System Objects


You can now create custom System objects in MATLAB. This capability allows you to define your own
System objects for time-based and data-driven algorithms, I/O, and visualizations. The System object
API provides a set of implementation and service methods that you incorporate into your code to
implement your algorithm. See Define New System Objects for more information.


System Object DataType and CustomDataType Properties Changes


When you set a System object, fixed-point <xxx>DataType property to 'Custom', it activates a
dependent Custom<xxx>DataType property. If you set that dependent Custom<xxx>DataType
property before setting its <xxx>DataType property, a warning message displays. <xxx> differs for
each object.

Compatibility Considerations
Previously, setting the dependent Custom<xxx>DataType property would automatically change its
<xxx>DataType property to 'Custom'. If you have code that sets the dependent property first,
avoid warnings by updating your code. Set the <xxx>DataType property to 'Custom' before setting
its Custom<xxx>DataType property.

Note If you have a Custom<xxx>DataType in your code, but do not explicitly update your code to
change <xxx>DataType to 'Custom', you may see different numerical output.


R2011a

Version: 4.0

New Features

Bug Fixes

Compatibility Considerations

Product Restructuring
The Video and Image Processing Blockset has been renamed to Computer Vision System Toolbox.
This product restructuring reflects the broad expansion of computer vision capabilities for the
MATLAB and Simulink environments. The Computer Vision System Toolbox software requires the
Image Processing Toolbox and DSP System Toolbox software.

You can access archived documentation for the Video and Image Processing Blockset™ products on
the MathWorks website.

System Object Name Changes

Package Name Change

The System object package name has changed from video to vision. For example,
video.BlobAnalysis is now vision.BlobAnalysis.

Object Name Changes

The 2D System object names have changed. They no longer have 2D in the name and now use the
new package name.

Old Name New Name


video.Autocorrelator2D vision.Autocorrelator
video.Convolver2D vision.Convolver
video.Crosscorrelator2D vision.Crosscorrelator
video.DCT2D vision.DCT
video.FFT2D vision.FFT
video.Histogram2D vision.Histogram
video.IDCT2D vision.IDCT
video.IFFT2D vision.IFFT
video.MedianFilter2D vision.MedianFilter

New Computer Vision Functions


Extract Features

The extractFeatures function extracts feature vectors, also known as descriptors, from an image.

Feature Matching

The matchFeatures function takes a pair of feature vectors, as returned by the extractFeatures
function, and finds the features which are most likely to correspond.

Uncalibrated Stereo Rectification

The estimateUncalibratedRectification function returns projective transformations for
rectifying stereo images.


Determine if Image Contains Epipole

The isEpipoleInImage function determines whether an image contains an epipole. This function
supports the estimateUncalibratedRectification function.

Epipolar Lines for Stereo Images

The epipolarLine computes epipolar lines for stereo images.

Line-to-Border Intersection Points

The lineToBorderPoints function calculates the location of the point of intersection of line in an
image with the image border. This function supports the epipolarLine function.
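
A minimal sketch using the current one-based [x y] conventions (adopted in R2011b), assuming fMatrix is a fundamental matrix, matchedPoints1 contains points in the first image, and I2 is the second image:

lines = epipolarLine(fMatrix, matchedPoints1);   % epipolar lines in the second image
pts = lineToBorderPoints(lines, size(I2));       % one row [x1 y1 x2 y2] per line
figure; imshow(I2); hold on;
line(pts(:, [1 3])', pts(:, [2 4])');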

New Foreground Detector System Object


The vision.ForegroundDetector object computes a foreground mask using Gaussian mixture
models (GMM).
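
A minimal sketch, assuming the example video file ships with the product:

videoSource = vision.VideoFileReader('viptraffic.avi');
detector = vision.ForegroundDetector('NumTrainingFrames', 50);
while ~isDone(videoSource)
    frame = step(videoSource);
    foregroundMask = step(detector, frame);   % logical mask of moving pixels
end
release(videoSource);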

New Tracking Cars Using Gaussian Mixture Models Demo


The new Tracking Cars Using Gaussian Mixture Models demo illustrates the use of Gaussian mixture
models for detection and tracking of cars. The algorithm detects and tracks the cars in a video by
separating them from their background.

Expanded To Video Display Block with Additional Video Formats


The To Video Display block now supports 4:2:2 YCbCr video input format.

New Printing Capability for the mplay Function and Video Viewer Block
You can now print the display information from the GUI interface of the mplay function and the Video
Viewer block.

Improved Display Updates for mplay Function, Video Viewer Block, and vision.VideoPlayer System Object
R2011a introduces the capability to improve the performance of mplay, the Video Viewer block and
the vision.VideoPlayer System object by reducing the frequency with which the display updates.
You can now choose between this new enhanced performance mode and the old behavior. By default,
all scopes operate in the new enhanced performance mode.

Improved Performance of FFT Implementation with FFTW library


The 2-D FFT and 2-D IFFT blocks now use the FFTW library.


Variable Size Data Support


The Resize block now supports variable size data. See Working with Variable-Size Signals for more
information about variable size data.

System Object Input and Property Warnings Changed to Errors


When a System object is locked (e.g., after the step method has been called), the following situations
now produce an error. This change prevents the loss of state information.

• Changing the input data type


• Changing the number of input dimensions
• Changing the input complexity from real to complex
• Changing the data type, dimension, or complexity of tunable property
• Changing the value of a nontunable property

Compatibility Considerations
Previously, the object issued a warning for these situations. The object then unlocked, reset its state
information, relocked, and continued processing. To update existing code so that it does not error, use
the release method before changing any of the items listed above.

System Object Code Generation Support


The following System objects now support code generation:

• vision.GeometricScaler
• vision.ForegroundDetector

MATLAB Compiler Support for System Objects


The Computer Vision System Toolbox supports the MATLAB Compiler for all objects except
vision.VideoPlayer. With this capability, you can use the MATLAB Compiler to take MATLAB
files, which can include System objects, as input and generate standalone applications.

R2010a MAT Files with System Objects Load Incorrectly


If you saved a System object to a MAT file in R2010a and load that file in R2011a, MATLAB may
display a warning that the constructor must preserve the class of the returned object. This occurs
because an aspect of the class definition changed for that object in R2011a. The object's saved
property settings may not restore correctly.

Compatibility Considerations
MAT files containing a System object saved in R2010a may not load correctly in R2011a. You should
recreate the object with the desired property values and save the MAT file.


Documentation Examples Renamed


In previous releases, the examples used throughout the Video and Image Processing Blockset™
documentation were named with a doc_ prefix. In R2011a, this changed to a ex_ prefix. For example,
in R2010b, you could launch an example model using the Video Viewer block by typing
doc_thresholding at the MATLAB command line. To launch the same model in R2011a, you must
type ex_thresholding at the command line.

Compatibility Considerations
You can no longer launch Video and Image Processing Blockset™ documentation example models
using the doc_ prefix name. To open these models in R2011a, you must replace the doc_ prefix in the
model name with ex_.


R2010b

Version: 3.1

New Features

Bug Fixes

Compatibility Considerations

New Estimate Fundamental Matrix Function for Describing Epipolar Geometry

This release adds the estimateFundamentalMatrix function for describing epipolar geometry.
Epipolar geometry applies to the geometry of stereo vision, where you can calculate depth information
based on corresponding points in stereo image pairs. The function supports the generation of
embeddable C code.

New Histogram System Object Replaces Histogram2D Object


The new video.Histogram System object replaces the video.Histogram2D System object. The name
change was made to align this object with its corresponding block.

Compatibility Considerations
The video.Histogram2D System object now issues a warning. Update code that uses the 2D-
Histogram object to use the new Histogram object.

New System Object release Method Replaces close Method


The close method has been replaced by the new release method, which unlocks the object and
releases memory and other resources, including files, used by the object. The new release method
includes the functionality of the old close method, which only closed files used by the object.

Compatibility Considerations

The close method now issues a warning. Update code that uses the close method to use the new
release method.

Expanded Embedded MATLAB Support


Embedded MATLAB® now supports the generation of embeddable C code for two Image Processing
Toolbox functions and additional Video and Image Processing Blockset System objects. The generated
C code meets the strict memory and data type requirements of embedded target environments. Video
and Image Processing Blockset provides Embedded MATLAB support for these Image Processing
Toolbox functions. See Code Generation for details, including limitations.

Supported Image Processing Toolbox Functions

label2rgb
fspecial

Supported System objects

Video and Image Processing Blockset objects now support code generation:
video.CornerDetector
video.GeometricShearer
video.Histogram
video.MorphologicalBottomHat
video.MorphologicalTopHat
video.MultimediaFileReader
video.MultimediaFileWriter

Data Type Assistant and Ability to Specify Design Minimums and Maximums Added to More Fixed-Point Blocks

The following blocks now offer a Data Type Assistant to help you specify fixed-point data types on
the block mask. Additionally, you can now enable simulation range checking for certain data types on
these blocks. To do so, specify appropriate minimum and maximum values on the block dialog box.
The blocks that support these features are:

• 2-D DCT
• 2-D FFT
• 2-D IDCT
• 2-D IFFT
• 2-D FIR Filter

For more information on these features, see the following sections in the Simulink documentation:

• Using the Data Type Assistant


• Signal Ranges

Data Types Pane Replaces the Data Type Attributes and Fixed-Point Panes on Fixed-Point Blocks
In previous releases, some fixed-point blocks had a Data type attributes pane, and others had a
Fixed-point pane. The functionality of these panes remains the same, but the pane now appears as
the Data Types pane on all fixed-point Computer Vision System Toolbox blocks.

Enhanced Fixed-Point and Integer Data Type Support with System Objects

For nonfloating point input, System objects now output the data type you specify. Previously, the
output was always a fixed-point, numeric fi object.

Compatibility Considerations

Update any code that takes nonfloating point input, where you expect the object to output a fi
object.

Variable Size Data Support


Several Video and Image Processing Blockset blocks now support changes in signal size during
simulation. The following blocks support variable size data as of this release:

PSNR
2-D Correlation
Median Filter
2-D Convolution
Block Processing
2-D Autocorrelation
Image Complement
Deinterlacing
Gamma Correction

See Working with Variable-Size Signals for more information about variable size data.

Limitations Removed from Video and Image Processing Blockset Multimedia Blocks and Objects

This release adds support for reading interleaved AVI data and for reading AVI files larger than 2 GB
on UNIX platforms; previously, this was only possible on Windows platforms. The following blocks and
System objects have the limitation removed:
From Multimedia File block
video.MultimediaFileReader System object

This release also adds support for writing AVI files larger than 2 GB on UNIX platforms, which was
previously only possible on Windows platforms. The following blocks and System objects have the
limitation removed:
To Multimedia File block
video.MultimediaFileWriter System object


R2010a

Version: 3.0

New Features

Bug Fixes

New System Objects Provide Video and Image Processing Algorithms for use in MATLAB

System Objects are algorithms that provide stream processing, fixed-point modeling, and code
generation capabilities for use in MATLAB programs. These new objects allow you to use video and
image processing algorithms in MATLAB, providing the same parameters, numerics and performance
as corresponding Video and Image Processing Blockset blocks. System objects can also be used in
Simulink models via the Embedded MATLAB Function block.
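
A minimal sketch using one of the R2010a object names (the package was renamed to vision in R2011a); the input image is an assumed example:

hmedian = video.MedianFilter2D;       % 2-D median filter System object
I = imread('pout.tif');
J = step(hmedian, I);                 % stream-style call with block-equivalent numerics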

Intel Integrated Performance Primitives Library Support Added to 2-D Correlation, 2-D Convolution, and 2-D FIR Filter Blocks

The 2-D Correlation, 2-D Convolution, and 2-D FIR Filter blocks now take advantage of the Intel SSE
instruction set and multicore processor capabilities for double and single data types.

Variable Size Data Support


Several Video and Image Processing Blockset blocks now support changes in signal size during
simulation. The following blocks support variable size data as of this release:

2-D FFT
2-D FIR Filter
Apply Geometric Transformation
Autothreshold
Bottom-hat
Chroma Resampling
Closing
Color Space Conversion
Compositing
Contrast Adjustment
Dilation
Edge Detection
Erosion
Estimate Geometric Transformation
Find Local Maxima
Frame Rate Display
Gaussian Pyramid
Hough Transform
Image Data Type Conversion
Image Pad
Insert Text
Label
2-D Maximum
2-D Mean
2-D Minimum
Opening
Rotate
2-D Standard Deviation
Template Matching
To Video Display
Top-hat
2-D Variance
Video Viewer

See Working with Variable-Size Signals for more information about variable size data.

Expanded From and To Multimedia File Blocks with Additional Video Formats

The To Multimedia File and From Multimedia File blocks now support 4:2:2 YCbCr video formats.


The To Multimedia File block now supports WMV, WMA, and WAV file formats on Windows platforms.
This block now supports broadcasting WMV and WMA streams over the network.

New Simulink Demos


The Video and Image Processing Blockset contain new and enhanced demos.

New Modeling a Video Processing System for an FPGA Target Demo

This demo uses the Video and Image Processing Blockset in conjunction with Simulink HDL Coder™
to show a design workflow for generating Hardware Design Language (HDL) code suitable for
targeting video processing application on an FPGA. The demo reviews how to design a system that
can operate on hardware.

New System Object Demos


New Image Rectification Demo

This demo shows how to rectify two uncalibrated images where the camera intrinsics are unknown.
Rectification is a useful procedure in many computer vision applications. For example, in stereo
vision, it can be used to reduce a 2-D matching problem to a 1-D search. This demo is a prerequisite
for the Stereo Vision demo.

New Stereo Vision Demo

This demo computes the depth map between two rectified stereo images using block matching, which
is the standard algorithm for high-speed stereo vision in hardware systems. It further explores
dynamic programming to improve accuracy, and image pyramiding to improve speed.

New Video Stabilization Using Point Feature Matching Demo

This demo uses a point feature matching approach for video stabilization, which does not require
knowledge of a feature or region of the image to track. The demo automatically searches for the
background plane in a video sequence, and uses its observed distortion to correct for camera motion.
This demo presents a more advanced algorithm in comparison to the existing Video Stabilization
demo in Simulink.

SAD Block Obsoleted


The new Template Matching block, introduced in the previous release, supports the sum of absolute
differences (SAD) algorithm. Consequently, the SAD block has been obsoleted.
