Seeing With OpenCV, Part 5: Implementing Eigenface
(Continued)
This article originally appeared in
SERVO Magazine, May 2007.
Reprinted by permission of T & L Publications, Inc.
The recognition phase
Figure 8 shows the recognize() function, which implements the recognition phase of the Eigenface program. It has just three steps. Two of them, loading the face images and projecting them onto the subspace, are already familiar.
Figure 8. The recognize() function implements the recognition phase of the Eigenface program.
As described above, the face images for recognition testing should be listed in a file named test.txt, using the same format as in train.txt. At line 8, the call to loadFaceImgArray() loads these images into faceImgArr and stores the ground-truth person ID numbers in personNumTruthMat. This step is similar to line 6 of the learn() function, in Figure 4. Here, the number of face images is stored in the local variable nTestFaces.
We also need to load the global variable nTrainFaces as well as most of the other training data - nEigens, eigenVectArr, pAvgTrainImg, and so on. The function loadTrainingData(), in Figure 9, does that for us. Again, OpenCV's persistence functions make this step easy. To open file storage for reading, use the CV_STORAGE_READ flag. Then, simply call the appropriate Read() function for each variable. OpenCV locates and loads each data value in the XML file by name. When the variable is a CvMat type, OpenCV creates a new matrix for you automatically, then sets its data values.
Figure 9. OpenCV's persistence functions make it easy to load the saved training data from the XML file.
The last parameter in the Read() function's interface is a default value. If a named variable is missing from the XML file, it will be set to the default. For pointer types, such as the matrices, it's a good idea to set the default to 0. You can then add a validation check to make sure these pointers have a non-zero value before you use them. To simplify the example code, I've omitted these, and similar, validation steps from the loadTrainingData() function.
After all the data are loaded, the final step in the recognition phase is to project each test image onto the PCA subspace and locate the closest projected training image. The for loop, at lines 16-34 of the recognize() function (Figure 8), implements this final step. The call to cvEigenDecomposite(), which projects the test image, is similar to the face-projection code in the learn() function.
As before, we pass it the number of eigenvalues (nEigens), and the array of eigenvectors (eigenVectArr). This time, however, we pass a test image, instead of a training image, as the first parameter. The output from cvEigenDecomposite() is stored in a local variable - projectedTestFace. Because there's no need to store the projected test image, I've used a C array for projectedTestFace, rather than an OpenCV matrix.
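To make the projection step more concrete, here is a minimal plain-C sketch of the arithmetic that projecting an image onto the eigenvectors involves: subtract the average image, then take the dot product with each eigenvector. It doesn't call OpenCV, and it isn't the code from Figure 8. The function name projectOntoSubspace() and the flat float arrays are assumptions I've made for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper (not an OpenCV function): project one image onto a
 * set of eigenvectors. The image, the average image, and each eigenvector
 * are flat arrays of nPixels floats. eigenVects holds nEigens eigenvectors
 * stored one after another. The result goes into coeffs[0..nEigens-1]. */
void projectOntoSubspace(const float *img, const float *avgImg,
                         const float *eigenVects, int nEigens, int nPixels,
                         float *coeffs)
{
    assert(nEigens > 0 && nPixels > 0);
    for (int i = 0; i < nEigens; i++) {
        const float *vect = eigenVects + (size_t)i * nPixels;
        float sum = 0.0f;
        for (int p = 0; p < nPixels; p++) {
            /* Subtract the average image, then accumulate the dot product. */
            sum += (img[p] - avgImg[p]) * vect[p];
        }
        coeffs[i] = sum;
    }
}
```

The coeffs array plays the same role here as projectedTestFace: one number per eigenvector, which together locate the image in the PCA subspace.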
Finding the nearest neighbor
As last month's article explained, eigenface "recognizes" a face image by looking for the training image that's closest to it in the PCA subspace. Finding the closest training example in a learned subspace is a very common AI technique. It's called Nearest Neighbor matching.
Figure 10 shows the code for the findNearestNeighbor() function. It computes distance from the projected test image to each projected training example. The distance basis here is "Squared Euclidean Distance." As last month's column explained, to calculate Euclidean distance between two points, you'd add up the squared distance in each dimension, then take the square root of that sum. Here, we take the sum, but skip the square root step. The final result is the same, because the neighbor with the smallest distance also has the smallest squared distance, so we can save some computation time by comparing squared values.
Figure 10. The findNearestNeighbor() function computes the distance from the projected test image to each projected training example to find the closest training image.
The for loop at lines 6-22 computes the squared distance to each projected training image, and keeps track (at lines 18-21) of which training image is closest.
The return value is the index of the closest training image. In the recognize() function (Figure 8), this return value is used, at line 31, to look up the person ID number associated with the nearest training image.
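If you'd like to experiment with nearest-neighbor matching outside of OpenCV, the idea can be sketched in plain C along these lines. This is a stand-in I've written for illustration, not the listing from Figure 10. The name findNearestSketch() and the flat array layout for the projected training faces are assumptions.

```c
#include <assert.h>
#include <float.h>

/* Hypothetical stand-in for the function in Figure 10. projectedTrain holds
 * nTrain projected training faces, each with nEigens coefficients, stored
 * one after another. Returns the index of the closest training face. */
int findNearestSketch(const float *projectedTest, const float *projectedTrain,
                      int nTrain, int nEigens)
{
    assert(nTrain > 0 && nEigens > 0);
    double leastDistSq = DBL_MAX;
    int iNearest = 0;
    for (int iTrain = 0; iTrain < nTrain; iTrain++) {
        double distSq = 0.0;
        for (int i = 0; i < nEigens; i++) {
            double d_i = (double)projectedTest[i]
                       - (double)projectedTrain[iTrain * nEigens + i];
            distSq += d_i * d_i;  /* squared distance; no square root needed */
        }
        /* Remember the closest training face seen so far. */
        if (distSq < leastDistSq) {
            leastDistSq = distSq;
            iNearest = iTrain;
        }
    }
    return iNearest;
}
```

For example, with three projected training faces at (0,0), (3,4), and (10,10), and a test projection of (2,3), the squared distances are 13, 2, and 113, so the function returns index 1.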
Here is the print output from the recognize() function:
nearest = 1, Truth = 1
nearest = 2, Truth = 2
nearest = 4, Truth = 4
nearest = 1, Truth = 1
nearest = 2, Truth = 2
nearest = 4, Truth = 4
nearest = 1, Truth = 1
nearest = 2, Truth = 2
nearest = 4, Truth = 4
nearest = 1, Truth = 1
nearest = 2, Truth = 2
nearest = 4, Truth = 4
nearest = 1, Truth = 1
nearest = 2, Truth = 2
nearest = 4, Truth = 4
nearest = 1, Truth = 1
nearest = 2, Truth = 2
nearest = 1, Truth = 4
Not bad! We only have one mismatch: the last test image was misrecognized as Subject 1 instead of 4.
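If you want to turn a list of results like this into a single accuracy figure, a small helper along these lines would do it. It's a plain-C sketch of my own, not part of the Eigenface program, and the name recognitionAccuracy() is hypothetical.

```c
#include <assert.h>

/* Hypothetical helper: fraction of test images whose nearest-neighbor
 * person ID matches the ground-truth person ID. */
double recognitionAccuracy(const int *nearest, const int *truth, int nTestFaces)
{
    assert(nTestFaces > 0);
    int nCorrect = 0;
    for (int i = 0; i < nTestFaces; i++) {
        if (nearest[i] == truth[i]) {
            nCorrect++;
        }
    }
    return (double)nCorrect / nTestFaces;
}
```

For the 18 test images above, 17 match, so the accuracy is 17/18, or about 94.4%.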
Improving Eigenface
Having a framework like this for training and testing will make it easier for you to add improvements to Eigenface and to test their effects.
One of the first improvements you might want to add is to change the way distance is measured. The original eigenface paper used Euclidean distances between points, and that's the distance basis I've used in findNearestNeighbor(). But a different basis, called Mahalanobis distance (after its inventor), usually gives better results.
One of the things that happens when you project a face image onto the PCA subspace is that each dimension receives a certain amount of stretch. The amount of stretch isn't the same, though, in every direction. The directions that correspond to the largest eigenvalues get stretched far more than the directions associated with smaller eigenvalues. Because Euclidean distance ignores this stretching, using it to measure distance is approximately the same as using only one eigenvector and ignoring the rest!
It's easy to switch from Euclidean to Mahalanobis distance. Just change line 15, in findNearestNeighbor(), from
distSq += d_i*d_i;
to
distSq += d_i*d_i/eigenValMat->data.fl[i];
Switching to Mahalanobis distance eliminates the mismatch error mentioned above, bringing recognition accuracy up to 100% for these three subjects.
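To see how dividing by the eigenvalues can change which neighbor wins, here is a small plain-C sketch of nearest-neighbor matching with the eigenvalue-weighted distance. It's an illustration only, and it doesn't use OpenCV. The function name findNearestMahalanobis(), the flat array layout, and the eigenVals array (standing in for eigenValMat->data.fl) are assumptions.

```c
#include <assert.h>
#include <float.h>

/* Hypothetical sketch: nearest neighbor using squared distance with each
 * dimension divided by its eigenvalue. projectedTrain holds nTrain projected
 * faces of nEigens coefficients each, stored one after another. */
int findNearestMahalanobis(const float *projectedTest,
                           const float *projectedTrain,
                           const float *eigenVals, int nTrain, int nEigens)
{
    assert(nTrain > 0 && nEigens > 0);
    double leastDistSq = DBL_MAX;
    int iNearest = 0;
    for (int iTrain = 0; iTrain < nTrain; iTrain++) {
        double distSq = 0.0;
        for (int i = 0; i < nEigens; i++) {
            double d_i = (double)projectedTest[i]
                       - (double)projectedTrain[iTrain * nEigens + i];
            assert(eigenVals[i] > 0.0f);       /* avoid dividing by zero */
            distSq += d_i * d_i / eigenVals[i]; /* undo the per-axis stretch */
        }
        if (distSq < leastDistSq) {
            leastDistSq = distSq;
            iNearest = iTrain;
        }
    }
    return iNearest;
}
```

Suppose the eigenvalues are 100 and 1, with one training point at (5, 0) and another at (0, 2). A test point at the origin is closer to the second point by squared Euclidean distance (4 versus 25), but closer to the first once each term is divided by its eigenvalue (0.25 versus 4). Passing eigenvalues that are all 1 reproduces the plain squared Euclidean result.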
Where to Go From Here?
This article introduced several new OpenCV concepts. You can gain a deeper understanding of these from the OpenCV documentation. The persistence functions, the CvTermCriteria struct, and the CvMat datatype are described in detail in the CXCORE documentation. The eigenface functions are described in the CVAUX documentation. The CVAUX documentation isn't linked from the documentation index page, but you can find it in the documentation subdirectory named ref.
I've also set up a page here with links to all documentation for OpenCV, version 1.0.
If you want to incorporate Eigenface into a system that detects faces in live video, you'll first need to detect the face, then extract it into a separate image. Since each face image must be exactly the same size, the easiest way to do that is to define a standard size, say 50x50 pixels, ahead of time. Then, when you detect a face, you can use code like this to extract and resize it:
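// Take the first detected face rectangle and restrict pImg to that region.
// pImg must be an 8-bit, single-channel image to match pFaceImg.
CvRect * pFaceRect = (CvRect*)cvGetSeqElem(pRectSeq, 0);
cvSetImageROI(pImg, *pFaceRect);
// Create a grayscale image of the standard size, then scale the face into it.
IplImage * pFaceImg = cvCreateImage( STD_SIZE, IPL_DEPTH_8U, 1 );
cvResize(pImg, pFaceImg, CV_INTER_AREA );
// Restore pImg so later operations see the whole frame again.
cvResetImageROI(pImg);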
There are more capabilities built into OpenCV, and many, many more computer vision programs one can create using this library. I hope this short series of articles has given you a taste of what's possible with OpenCV, and perhaps motivated you to explore more of its capabilities.
Be seeing you!