Part 1: Introduction to OpenCV
by Robin Hewitt
This article originally appeared in
SERVO Magazine, January 2007.
Reprinted by permission of T & L Publications, Inc.
OpenCV, Intel's free, open-source computer-vision library, can greatly simplify computer-vision programming. It includes advanced capabilities - face detection, face tracking, face recognition, Kalman filtering, and a variety of artificial-intelligence (AI) methods - in ready-to-use form. In addition, it provides many basic computer-vision algorithms via its lower-level APIs.
A good understanding of how these methods work is the key to getting good results when using OpenCV. In this five-part series, I'll introduce you to OpenCV and show you how to use it to implement face detection, face tracking, and face recognition. Then, I'll take you behind the scenes to explain how each of these methods works and give you tips and tricks for getting the most out of them.
This first article in the series introduces OpenCV. I'll tell you how to get it and give you a few pointers for setting it up on your computer. You'll learn how to read and write image files, capture video, convert between color formats, and access pixel data - all through OpenCV interfaces.
OpenCV is a free, open-source computer vision library for C/C++ programmers. You can download it from
Intel released the first version of OpenCV in 1999. Initially, it required Intel's Image Processing Library. That dependency was eventually removed, and you can now use OpenCV as a standalone library.
OpenCV is multi-platform. It supports both Windows and Linux and, more recently, Mac OS X. With one exception (CVCAM, which I'll describe later in this article), its interfaces are platform independent.
Figure 1. Among OpenCV's many capabilities are face detection (top left), contour detection (top right), and edge detection (bottom)
OpenCV has so many capabilities it can seem overwhelming at first. Fortunately, you'll need only a few to get started. I'll walk you through a useful subset in this series.
Here's a summary of the major functionality categories in OpenCV, version 1.0, which was just released at the time of this writing:
General computer-vision and image-processing algorithms (mid- and low-level APIs)
Using these interfaces, you can experiment with many standard computer-vision algorithms without having to code them yourself. These include edge, line, and corner detection, ellipse fitting, image pyramids for multiscale processing, template matching, various transforms (Fourier, discrete cosine, and distance transforms), and more.
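To give a feel for what one of these low-level algorithms does, here is a minimal plain-C sketch of edge detection using 3x3 Sobel gradients on an 8-bit grayscale image stored row by row. It does not call OpenCV, and the function name `sobel_edges` is hypothetical; it approximates the gradient magnitude as |gx| + |gy| and simply zeroes the border pixels, which are simplifying assumptions rather than a description of how OpenCV implements edge detection.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical helper, not an OpenCV function.
 * Writes an approximate gradient magnitude (|gx| + |gy|, clamped to 255)
 * for every interior pixel of a row-major 8-bit grayscale image.
 * Border pixels have no full 3x3 neighborhood, so they are set to 0. */
void sobel_edges(const unsigned char *gray, unsigned char *edges,
                 int width, int height)
{
    assert(gray != NULL && edges != NULL && width > 0 && height > 0);

    for (int y = 0; y < height; ++y) {
        for (int x = 0; x < width; ++x) {
            int idx = y * width + x;

            if (x == 0 || y == 0 || x == width - 1 || y == height - 1) {
                edges[idx] = 0;
                continue;
            }

            const unsigned char *p = gray + idx;

            /* Horizontal gradient: right column minus left column. */
            int gx = (p[-width + 1] + 2 * p[1] + p[width + 1])
                   - (p[-width - 1] + 2 * p[-1] + p[width - 1]);

            /* Vertical gradient: bottom row minus top row. */
            int gy = (p[width - 1] + 2 * p[width] + p[width + 1])
                   - (p[-width - 1] + 2 * p[-width] + p[-width + 1]);

            int mag = abs(gx) + abs(gy);
            edges[idx] = (unsigned char)(mag > 255 ? 255 : mag);
        }
    }
}
```

Calling `sobel_edges(image, result, width, height)` on an image with a sharp vertical boundary produces large values along that boundary and zeros in flat regions. Having this kind of routine already written and tuned is exactly what the library saves you from coding yourself.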
High-level computer-vision modules
OpenCV includes several high-level capabilities. In addition to face detection, recognition, and tracking, it includes optical flow (using camera motion to determine 3D structure), camera calibration, and stereo.
AI and machine-learning methods
Computer-vision applications often require machine learning or other AI methods. Some of these are available in OpenCV's Machine Learning package.
Image sampling and view transformations
It's often useful to process a group of pixels as a unit. OpenCV includes interfaces for extracting image subregions, random sampling, resizing, warping, rotating, and applying perspective effects.
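As an illustration of subregion extraction, the following plain-C sketch copies a rectangle out of a row-major 8-bit grayscale image into a tightly packed buffer. It is written without OpenCV; the function name `copy_subregion` and its parameter layout are assumptions made for this example only.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical helper, not an OpenCV function.
 * Copies the w x h rectangle whose top-left corner is (x, y) from src
 * (src_width pixels per row, src_height rows) into dst, which must hold
 * at least w * h bytes. Returns 0 on success, or -1 if the rectangle is
 * empty or does not lie entirely inside the source image. */
int copy_subregion(const unsigned char *src, int src_width, int src_height,
                   int x, int y, int w, int h, unsigned char *dst)
{
    assert(src != NULL && dst != NULL);

    if (x < 0 || y < 0 || w <= 0 || h <= 0 ||
        x > src_width - w || y > src_height - h) {
        return -1;
    }

    /* Each output row is one contiguous run of w bytes in the source. */
    for (int row = 0; row < h; ++row) {
        memcpy(dst + (size_t)row * (size_t)w,
               src + (size_t)(y + row) * (size_t)src_width + (size_t)x,
               (size_t)w);
    }
    return 0;
}
```

Once a region has been copied out this way, it can be processed as a unit (for example, passed to a detector) without touching the rest of the image.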
Methods for creating and analyzing binary (two-valued) images
Binary images are frequently used in inspection systems that scan for shape defects or count parts. A binary representation is also convenient when locating an object to grasp.
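A binary image is typically produced by thresholding a grayscale one. The sketch below shows the idea in plain C, assuming an 8-bit grayscale buffer and a fixed threshold chosen by the caller. OpenCV provides its own thresholding function, `cvThreshold`; this example does not call it, and the name `binarize` is hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper, not an OpenCV function.
 * Converts an 8-bit grayscale image to a two-valued image: pixels
 * strictly brighter than thresh become 255, all others become 0.
 * Returns the number of foreground (255) pixels, which is the kind of
 * count an inspection system might use to measure a part's area. */
size_t binarize(const unsigned char *gray, unsigned char *binary,
                size_t count, unsigned char thresh)
{
    size_t foreground = 0;

    assert(gray != NULL && binary != NULL);

    for (size_t i = 0; i < count; ++i) {
        binary[i] = (gray[i] > thresh) ? 255 : 0;
        if (binary[i] != 0) {
            ++foreground;
        }
    }
    return foreground;
}
```

A fixed threshold works only when lighting is controlled, as it usually is on an inspection line; that is an assumption of this sketch, not a general recommendation.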
Methods for computing 3D information
These functions are useful for mapping and localization - either with a stereo rig or with multiple views from a single camera.
Math routines for image processing, computer vision, and image interpretation
OpenCV includes commonly used math algorithms from linear algebra, statistics, and computational geometry.
OpenCV's drawing interfaces let you write text and draw on images. In addition to various fun and creative possibilities, these functions are useful for labeling and marking. For example, if you write a program that detects objects, it's helpful to label the images with the objects' sizes and locations.
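Marking a detected object usually means drawing a box around it. The plain-C sketch below draws a one-pixel rectangle outline directly into a row-major 8-bit grayscale buffer. It stands in for what a drawing routine does and does not use OpenCV's drawing API; the name `draw_box` and its argument order are assumptions made for this example.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper, not an OpenCV function.
 * Draws a one-pixel-wide rectangle outline with inclusive corners
 * (x0, y0) and (x1, y1) into a row-major 8-bit grayscale image.
 * Any part of the outline that falls outside the image is skipped. */
void draw_box(unsigned char *gray, int width, int height,
              int x0, int y0, int x1, int y1, unsigned char value)
{
    assert(gray != NULL && width > 0 && height > 0);
    assert(x0 <= x1 && y0 <= y1);

    for (int y = y0; y <= y1; ++y) {
        for (int x = x0; x <= x1; ++x) {
            int on_outline = (x == x0 || x == x1 || y == y0 || y == y1);
            int in_image = (x >= 0 && x < width && y >= 0 && y < height);

            if (on_outline && in_image) {
                gray[y * width + x] = value;
            }
        }
    }
}
```

A detector that reports an object's bounding rectangle could pass those coordinates straight to a routine like this so the result is visible when the image is displayed or saved.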
OpenCV includes its own windowing interfaces. While these are limited compared to what can be done on each platform, they provide a simple, multi-platform API to display images, accept user input via mouse or keyboard, and implement slider controls.
Data structures and algorithms
With these interfaces, you can efficiently store, search, save, and manipulate large lists, collections (also called sets), graphs, and trees.
OpenCV's data-persistence methods provide convenient interfaces for storing various types of data to disk and retrieving them later.
Figure 1 shows a few examples of OpenCV's capabilities in action: face detection, contour detection, and edge detection.