Matlab Tutorial : Video Processing 2 - Face detection and CamShift Tracking

Note

This tutorial is based on the MathWorks example Face Detection and Tracking Using CAMShift. I hope to add some value to the reference material.




First Input Video

verona.wmv

This one fails; we'll see why below.


Face Detection

For the technical details, please visit my OpenCV page, Image object detection : Face detection using Haar Cascade Classifiers.

Before we start tracking a face, we should be able to detect it.

In Matlab, we use vision.CascadeObjectDetector() to detect the location of a face in a video frame acquired with the step() function. The detector uses the Viola-Jones algorithm (a cascade of classifiers evaluated over scaled image windows) and a trained classification model. By default, it is configured to detect faces, but it can be configured for other object types.


% Read a video frame and run the detector.
videoFileReader = vision.VideoFileReader('verona.wmv');
videoFrame      = step(videoFileReader);

% Create a cascade detector object.
faceDetector = vision.CascadeObjectDetector();
bbox = step(faceDetector, videoFrame);

% Draw the returned bounding box around the detected face.
videoOut = insertObjectAnnotation(videoFrame,'rectangle',bbox,'Face');
figure, imshow(videoOut), title('Detected face');
DetectedFace.png

The Matlab doc says: "You can use the cascade object detector to track a face across successive video frames. However, when the face tilts or the person turns their head, you may lose tracking. This limitation is due to the type of trained classification model used for detection. To avoid this issue, and because performing face detection for every video frame is computationally intensive, this example uses a simple facial feature for tracking."

Indeed, when I was working with OpenCV, I saw face detection fail for tilted faces and for faces not facing front. So, as the doc says, to handle the general case well we need a well-trained detector.




Extract features from the face

We can locate a face in our video. Now what?

The next step is to identify a feature that will help us track the face: for example, shape or color. We need a feature that is unique to our target and remains invariant even as the target moves.

In this example, following the Matlab doc, we'll use skin tone as the feature to track. The doc says "the skin tone provides a good deal of contrast between the face and the background and does not change as the face rotates or moves."

skin_tone.png

When I googled "skin tone face detection", I found that skin tone is indeed widely used for face detection. In the same context, it is also used for face segmentation.
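One reason hue is a convenient channel for this: it is insensitive to illumination changes that scale R, G, and B together, while the raw channel values are not. A small Python sketch (my own illustration, using an arbitrary skin-tone-like pixel value, not one sampled from the tutorial's video):

```python
import colorsys

# A representative skin-tone pixel (R, G, B in [0, 1]); the value is
# illustrative, not taken from the video.
r, g, b = 0.9, 0.7, 0.6

h1, s1, v1 = colorsys.rgb_to_hsv(r, g, b)
# The same surface under half the illumination: R, G, B scale together.
h2, s2, v2 = colorsys.rgb_to_hsv(r * 0.5, g * 0.5, b * 0.5)

print(round(h1, 6), round(h2, 6))   # identical: hue survives the dimming
print(v1, v2)                       # 0.9 0.45: value (brightness) is halved
```

This is why a hue histogram of the face region stays usable as the lighting on the face varies.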

HueChannelData.png

Here is the code responsible for the image above:

% Read a video frame and run the detector.
videoFileReader = vision.VideoFileReader('verona.wmv');
videoFrame      = step(videoFileReader);

% Create a cascade detector object.
faceDetector = vision.CascadeObjectDetector();
bbox = step(faceDetector, videoFrame);

% Draw the returned bounding box around the detected face.
videoOut = insertObjectAnnotation(videoFrame,'rectangle',bbox,'Face');
figure, imshow(videoOut), title('Detected face');

% Get the skin tone information by converting the video frame
% to the HSV color space and extracting the Hue channel.
[hueChannel,~,~] = rgb2hsv(videoFrame);

% Display the Hue Channel data 
% and draw the bounding box around the face.
figure, imshow(hueChannel), title('Hue channel data');
rectangle('Position',bbox(1,:),'LineWidth',2,'EdgeColor',[1 1 0])

Tracking the face

"With the skin tone selected as the feature to track, you can now use the vision.HistogramBasedTracker for tracking. The histogram based tracker uses the CAMShift algorithm, which provides the capability to track an object using a histogram of pixel values. In this example, the Hue channel pixels are extracted from the nose region of the detected face. These pixels are used to initialize the histogram for the tracker. The example tracks the object over successive video frames using this histogram."

The final code:


% Read a video frame and run the detector.
videoFileReader = vision.VideoFileReader('verona.wmv');
videoFrame      = step(videoFileReader);

% Create a cascade detector object.
faceDetector = vision.CascadeObjectDetector();
bbox = step(faceDetector, videoFrame);

% Draw the returned bounding box around the detected face.
videoOut = insertObjectAnnotation(videoFrame,'rectangle',bbox,'Face');
figure, imshow(videoOut), title('Detected face');

% Get the skin tone information by converting the video frame
% to the HSV color space and extracting the Hue channel.
[hueChannel,~,~] = rgb2hsv(videoFrame);

% Display the Hue Channel data 
% and draw the bounding box around the face.
figure, imshow(hueChannel), title('Hue channel data');
rectangle('Position',bbox(1,:),'LineWidth',2,'EdgeColor',[1 1 0]);

% Detect the nose within the face region. 
% The nose provides a more accurate
% measure of the skin tone 
% because it does not contain any background pixels.
noseDetector = vision.CascadeObjectDetector('Nose');
faceImage    = imcrop(videoFrame,bbox(1,:));
noseBBox     = step(noseDetector,faceImage);

% The nose bounding box is defined relative to the cropped face image.
% Adjust the nose bounding box so that it is relative to the original video
% frame.
noseBBox(1,1:2) = noseBBox(1,1:2) + bbox(1,1:2);

% Create a tracker object.
tracker = vision.HistogramBasedTracker;

% Initialize the tracker histogram 
% using the Hue channel pixels from the nose.
initializeObject(tracker, hueChannel, noseBBox(1,:));

% Create a video player object for displaying video frames.
videoInfo    = info(videoFileReader);
videoPlayer  = vision.VideoPlayer('Position',[300 300 videoInfo.VideoSize+30]);

% Track the face over successive video frames 
% until the video is finished.
while ~isDone(videoFileReader)

    % Extract the next video frame
    videoFrame = step(videoFileReader);

    % RGB -> HSV
    [hueChannel,~,~] = rgb2hsv(videoFrame);

    % Track using the Hue channel data
    bbox = step(tracker, hueChannel);

    % Insert a bounding box around the object being tracked
    videoOut = insertObjectAnnotation(videoFrame,'rectangle',bbox,'Face');

    % Display the annotated video frame 
    % using the video player object
    step(videoPlayer, videoOut);

end

% Release resources
release(videoFileReader);
release(videoPlayer);

However, when I ran this, I got the following error:

Index exceeds matrix dimensions.

Error in test (line 34)
noseBBox(1,1:2) = noseBBox(1,1:2) + bbox(1,1:2);

The failure is caused by an assumption we made when designing the code: that a nose can always be found inside the detected face region (nose detection is just a zoom-in within the face, unlike the initial face detection). When the nose detector returns no bounding box, noseBBox is empty, and indexing noseBBox(1,1:2) fails with "Index exceeds matrix dimensions".
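The defensive version of that step can be sketched in Python (a hypothetical helper of my own, not the Matlab API; boxes are (x, y, width, height) tuples and detections are a possibly-empty list): check for an empty detection before indexing, then offset the nose box by the face crop's origin, which is what the noseBBox(1,1:2) + bbox(1,1:2) line does in Matlab.

```python
def nose_box_in_frame(face_box, nose_boxes):
    # face_box: the face bounding box in full-frame coordinates.
    # nose_boxes: nose detections in cropped-face coordinates (may be empty).
    if not nose_boxes:
        return None  # the empty case that crashed the Matlab script above
    fx, fy, _, _ = face_box
    nx, ny, nw, nh = nose_boxes[0]
    # The crop's origin is the face box's top-left corner, so offset by it.
    return (nx + fx, ny + fy, nw, nh)

print(nose_box_in_frame((100, 50, 80, 80), [(30, 40, 20, 15)]))   # (130, 90, 20, 15)
print(nose_box_in_frame((100, 50, 80, 80), []))                   # None
```

The caller then has to decide what to do when None comes back, e.g. fall back to initializing the tracker from the whole face box.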





Input 2

I cut about one second from the beginning of the input video (verona.wmv) and made a new one: verona2.wmv


Output 2

Here are the outputs with the new input video:

DetectedFace2.png

HueChannelData2.png

And the tracking video: this one was successful, at least for a couple of seconds; after that it lost track and stayed lost to the end. That was fully expected, because the scene changes drastically after the initial seconds.


Oddly, there is a moment in the video when our tracker marks the whole body of a woman as a face. That is probably because we used skin tone to track the face.

Lots of things to learn to make a smart detector!






Matlab Image and Video Processing Tutorial

  1. Vectors and Matrices
  2. m-Files (Scripts)
  3. For loop
  4. Indexing and masking
  5. Vectors and arrays with audio files
  6. Manipulating Audio I
  7. Manipulating Audio II
  8. Introduction to FFT & DFT
  9. Discrete Fourier Transform (DFT)
  10. Digital Image Processing 2 - RGB image & indexed image
  11. Digital Image Processing 3 - Grayscale image I
  12. Digital Image Processing 4 - Grayscale image II (image data type and bit-plane)
  13. Digital Image Processing 5 - Histogram equalization
  14. Digital Image Processing 6 - Image Filter (Low pass filters)
  15. Video Processing 1 - Object detection (tagging cars) by thresholding color
  16. Video Processing 2 - Face Detection and CAMShift Tracking









Copyright © 2024, bogotobogo