In this post we will use the OpenCV library for face detection. Here is an example using my webcam as an input:
The best part is that it can be done using fewer than 20 lines of code:
import cv2

BLUE_COLOR = (255, 0, 0)
STROKE = 2
xml_path = 'haarcascade_frontalface_alt2.xml'
clf = cv2.CascadeClassifier(xml_path)

cap = cv2.VideoCapture(0)
while not (cv2.waitKey(20) & 0xFF == ord('q')):
    ret, frame = cap.read()
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    faces = clf.detectMultiScale(gray)
    for x, y, w, h in faces:
        cv2.rectangle(frame, (x, y), (x+w, y+h), BLUE_COLOR, STROKE)
    cv2.imshow('frame', frame)
cap.release()
cv2.destroyAllWindows()
The rest of this post will be dedicated to developing and understanding this code step by step on the Python console. It will be divided into four parts:
- Prerequisites.
- How to capture a frame from your webcam.
- How to detect faces in a captured frame.
- An exercise in which you complete the code for detecting faces in real time (the solution is given at the end of the article).
First, let's install the prerequisites.
For this tutorial, you will need Python 3. You will also need to have OpenCV installed. Here's my suggestion:
- Install Python 3.7 using Anaconda. This way, the most important Data Science packages will already be included.
- Open your terminal (Linux/Mac) or CMD (Windows).
- Create a new environment for installing OpenCV (highly suggested, because some packages might be downgraded for compatibility):
$ conda create -n image-detection anaconda
- Activate this environment. On Linux/Mac the command is:
$ source activate image-detection
On Windows:
$ activate image-detection
- Install OpenCV in this environment:
$ conda install opencv
Done! It is important to install OpenCV using conda instead of pip, since there are non-Python library dependencies (for more info, see the difference between pip and conda).
To build intuition, we will develop the code little by little in the IPython console (you can also use Jupyter if you are familiar with it). With your terminal (or CMD) open, activate your environment and then open ipython:
$ source activate image-detection
$ ipython
The IPython console should open. Next, let's import the only dependency of this tutorial:
In [1]: import cv2
Now, let's initialize your webcam. This can be done using the command cv2.VideoCapture(0):
In [2]: cap = cv2.VideoCapture(0)
The LED on your webcam will probably light up.
Note: if you do not have a webcam, try using a video path as input, for example cap = cv2.VideoCapture("c:\\my_video.mp4").
Now, let's learn how to take a “photo” with your webcam. This can be done with the method cap.read, followed by the release method to turn the camera off:
In [3]: ret, frame = cap.read()
In [4]: cap.release()
The captured image will be stored in frame. ret is a boolean that indicates whether an image was successfully retrieved. Now, to view this image, we use cv2.imshow combined with waitKey:
In [5]: cv2.imshow('frame', frame)
In [6]: cv2.waitKey(1)
Out [6]: -1
Note that we need the call cv2.waitKey(1), which basically waits for a key to be pressed, in order to update or close the image display. To close the image, we do the following:
In [7]: cv2.destroyAllWindows()
In [8]: cv2.waitKey(1)
Out [8]: -1
More info about waitKey: note that its usage is mandatory for both displaying and closing images. This function is the only one with access to the graphical interface, and it waits for a key press for the specified number of milliseconds. Since we want the image to be displayed or closed regardless of whether a key is pressed, we use the 1 ms delay. waitKey is also required for frame-by-frame display when looping over videos.
In order to display the webcam's content in real time, we need to place all of these previous commands inside a while loop:
In [10]: cap = cv2.VideoCapture(0)
    ...: while not (cv2.waitKey(20) & 0xFF == ord('q')):
    ...:     ret, frame = cap.read()
    ...:     cv2.imshow('frame', frame)
    ...: cap.release()
    ...: cv2.destroyAllWindows()
    ...: cv2.waitKey(1)
    ...:
Suggestion: you can copy and paste the code from this tutorial exactly as it is presented. IPython handles non-Python syntax such as In [10]: and ...: automatically.
The condition cv2.waitKey(20) & 0xFF == ord('q') waits for the q key to be pressed in the video window (note: pressing q in the terminal will not work) and then ends the display.
Now, we have finished the first part of this tutorial. In the next section, we'll learn how to recognize faces from any image, including frames that are captured in the webcam.
For face detection, we are going to import a pre-trained model known as a Haar cascade. As defined in Wikipedia, a Haar cascade model considers adjacent rectangular regions at a specific location in a detection window, sums up the pixel intensities in each region, and calculates the difference between these sums. This difference is then used to categorize subsections of an image. These models usually come as XML files located in a subfolder of the OpenCV installation directory.
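To make the "difference of rectangular sums" idea concrete, here is a toy two-rectangle Haar-like feature computed with NumPy. The patch and region coordinates are made up for illustration; a real cascade evaluates thousands of such features at many positions and scales:

```python
import numpy as np

# Toy 6x6 grayscale patch: dark top half, bright bottom half
patch = np.array([[20] * 6] * 3 + [[220] * 6] * 3, dtype=np.float32)

# A two-rectangle "edge" feature: sum of pixel intensities in the top
# region minus the bottom region. A large absolute difference suggests
# a horizontal edge, like the brow/eye boundary on a face.
top = patch[0:3, :].sum()      # 3*6*20  = 360
bottom = patch[3:6, :].sum()   # 3*6*220 = 3960
feature_value = top - bottom
print(feature_value)           # -3600.0
```

The cascade chains many such weak tests, quickly rejecting windows that fail early ones, which is what makes detection fast enough for real time.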
As our goal is to identify faces, the model file we want to import is called haarcascade_frontalface_alt2.xml. As its location may vary depending on the OS being used, we can either download the model from the source code or define a function that searches for the file in the OpenCV directory:
In [11]: import os
In [12]: cv2path = os.path.dirname(cv2.__file__)
In [13]: def find(name, path):  # recursive search function
    ...:     for root, dirs, files in os.walk(path):
    ...:         if (name in files) or (name in dirs):
    ...:             return os.path.join(root, name)
    ...:     return find(name, os.path.dirname(path))
In [14]: xml_path = find('haarcascade_frontalface_alt2.xml', cv2path)
In [15]: xml_path
Out[15]: '/Users/.../anaconda/envs/image-detection/share/OpenCV/haarcascades/haarcascade_frontalface_alt2.xml'
In the previous code, we defined a recursive function find, which searches for a file given the filename and a starting directory. As inputs, we passed the OpenCV installation directory and the name of the XML file that contains the face detector. In Out[15] we confirm that the file was found in the directory anaconda/envs/image-detection/share/OpenCV/haarcascades/. As a bonus, we can also check which other feature detectors are available by passing the directory haarcascades to the find function:
In [16]: haar_path = find('haarcascades', cv2path)
In [17]: os.listdir(haar_path)
Out[17]:
['haarcascade_eye.xml',
'haarcascade_eye_tree_eyeglasses.xml',
'haarcascade_frontalcatface.xml',
'haarcascade_frontalcatface_extended.xml',
'haarcascade_frontalface_alt.xml',
'haarcascade_frontalface_alt2.xml',
'haarcascade_frontalface_alt_tree.xml',
'haarcascade_frontalface_default.xml',
'haarcascade_fullbody.xml',
'haarcascade_lefteye_2splits.xml',
'haarcascade_licence_plate_rus_16stages.xml',
'haarcascade_lowerbody.xml',
'haarcascade_profileface.xml',
'haarcascade_righteye_2splits.xml',
'haarcascade_russian_plate_number.xml',
'haarcascade_smile.xml',
'haarcascade_upperbody.xml']
For example, if we wanted to identify eyes instead of faces, we could use the model haarcascade_eye.xml.
Now, we will initialize the classifier using the previously found XML filepath:
In [18]: clf = cv2.CascadeClassifier(xml_path)
Now that the classifier is initialized, we can identify faces in images. However, this classifier only accepts grayscale images as input. The conversion of the frame to grayscale can be done as follows:
In [19]: gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
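Under the hood, COLOR_BGR2GRAY applies the standard luma weights (0.299 for red, 0.587 for green, 0.114 for blue). Here is a small NumPy sketch of the same conversion on a synthetic image, just to show what happens to the array; in practice, use cvtColor:

```python
import numpy as np

# Synthetic 2x2 BGR image (OpenCV stores channels in B, G, R order)
bgr = np.array([[[255, 0, 0], [0, 255, 0]],
                [[0, 0, 255], [255, 255, 255]]], dtype=np.uint8)

# Luma weights in B, G, R order, matching the BGR2GRAY formula
weights = np.array([0.114, 0.587, 0.299])
gray = (bgr @ weights).round().astype(np.uint8)

print(bgr.shape, '->', gray.shape)  # (2, 2, 3) -> (2, 2)
print(gray)
```

Note that the three color channels collapse into a single intensity channel, which is exactly the input shape the cascade classifier expects.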
And finally face detection can be performed on top of the image converted to grayscale:
In [20]: faces = clf.detectMultiScale(gray)
The variable faces contains a list of rectangle coordinates, one per detected face (if any were identified). To visualize these coordinates, we can draw rectangles using cv2.rectangle:
In [21]: for x, y, w, h in faces:
    ...:     cv2.rectangle(frame, (x, y), (x+w, y+h), (255, 0, 0))
In [22]: cv2.imshow('frame', frame); cv2.waitKey(1)
The coordinates x and y correspond to the top-left point of the rectangle (in image coordinates, y grows downward), while w and h correspond to the rectangle's width and height. We've chosen blue as the color of the rectangle to be drawn, given in BGR format, i.e. (255, 0, 0). And finally, we can close the image using:
In [23]: cv2.destroyAllWindows(); cv2.waitKey(1)
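To make the box geometry explicit, here is a tiny sketch converting one hypothetical (x, y, w, h) detection tuple into its corner points and center (pure Python, no webcam required; the numbers are made up):

```python
# Hypothetical detection: top-left corner at (40, 30), 100x120 pixels
x, y, w, h = 40, 30, 100, 120

top_left = (x, y)                  # x grows right, y grows down
bottom_right = (x + w, y + h)      # the second corner passed to cv2.rectangle
center = (x + w // 2, y + h // 2)  # handy if you later want to crop or track

print(top_left, bottom_right, center)  # (40, 30) (140, 150) (90, 90)
```

These two corners are exactly the arguments we passed to cv2.rectangle above.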
Here's an exercise that brings together the ideas from the two previous sections (reading the webcam and locating faces in its frames). I have provided the routine for reading the webcam; you just have to complete the #TODO parts for classifying the frames:
import cv2, os

# TODO: Import Haar XML model
# TODO: Initialize classifier

# Initialize webcam
cap = cv2.VideoCapture(0)

# Loop for capturing the webcam's content
while not (cv2.waitKey(20) & 0xFF == ord('q')):
    # Capture next frame
    ret, frame = cap.read()
    # TODO: Convert frame to grayscale
    # TODO: Classify frame
    # TODO: Draw rectangles
    # Visualize it
    cv2.imshow('frame', frame)

# Turn off webcam
cap.release()
# Destroy all windows
cv2.destroyAllWindows()
The simplest answer is at the top of the page. Here is a more complete solution, which includes an automatic search for the XML model:
import cv2, os

# Find files given a starting path
def find(name, path):
    for root, dirs, files in os.walk(path):
        if (name in files) or (name in dirs):
            return os.path.join(root, name)
    # If not found, recurse into the parent directory
    return find(name, os.path.dirname(path))

# Import Haar XML model
cv2path = os.path.dirname(cv2.__file__)
haar_path = find('haarcascades', cv2path)
xml_name = 'haarcascade_frontalface_alt2.xml'
xml_path = os.path.join(haar_path, xml_name)

# Initialize classifier
clf = cv2.CascadeClassifier(xml_path)

# Initialize webcam
cap = cv2.VideoCapture(0)

# Loop for capturing the webcam's content
while not (cv2.waitKey(20) & 0xFF == ord('q')):
    # Capture next frame
    ret, frame = cap.read()
    # Convert frame to grayscale
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    # Classify frame
    faces = clf.detectMultiScale(gray)
    # Draw rectangles
    for x, y, w, h in faces:
        cv2.rectangle(frame, (x, y), (x+w, y+h), (255, 0, 0))
    # Visualize it
    cv2.imshow('frame', frame)

# Turn off webcam
cap.release()
# Destroy all windows
cv2.destroyAllWindows()