Skin Color Detection in Images and Video

by Stephen Wylie
stev-o@u.northwestern.edu
EECS 349 - Machine Learning
Northwestern University
Prof. Doug Downey

Skin detection is an important problem in image analysis. By detecting regions of skin, one can isolate the presence of faces, arms, hands, and gestures. A well-written skin detector used with other image analysis techniques such as Hough transforms or eigenvalue-based approaches for pattern recognition could have powerful applications in security, human-computer interaction, or gaming. Imagine drawing a picture on your computer simply by moving your finger over a particular color on an easel, then pressing down on a white surface. The computer would need to know where your finger is to begin with, and this is facilitated by a special case of color segmentation: skin detection.

There are several ways to build a skin detector. A basic method is to construct a histogram that keeps track of how many pixels have particular attributes (e.g. particular values for Hue and Saturation). Another method is to train neural networks. Perceptrons can be combined to form a classifier that can learn a non-linear region, which makes such a network well suited to classifying skin. Both of these methods were used to classify pixels as skin or non-skin, and each was trained on pairs of attributes of a pixel, taken from either RGB (red/green/blue) or HSV (hue/saturation/value).
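To make the histogram method concrete, here is a minimal sketch of a Hue/Saturation histogram classifier. It is an illustration only, not the code used in this project: the function names, the 32-bin resolution, and the `min_count` threshold are all assumptions chosen for the example.

```python
import colorsys
from collections import Counter

BINS = 32  # assumed histogram resolution per axis (not from the report)


def hs_bin(rgb, bins=BINS):
    """Map an (R, G, B) pixel with 0-255 channels to a (hue, saturation) bin."""
    r, g, b = (c / 255.0 for c in rgb)
    h, s, _v = colorsys.rgb_to_hsv(r, g, b)
    # min() keeps a value of exactly 1.0 inside the last bin
    return (min(int(h * bins), bins - 1), min(int(s * bins), bins - 1))


def train_histogram(skin_pixels, bins=BINS):
    """Count how many training skin pixels fall into each (hue, saturation) bin."""
    return Counter(hs_bin(p, bins) for p in skin_pixels)


def is_skin(rgb, hist, min_count=1, bins=BINS):
    """Call a pixel skin if its bin was seen at least min_count times in training."""
    return hist[hs_bin(rgb, bins)] >= min_count
```

A histogram built this way needs only positive (skin) examples. Raising the hypothetical `min_count` threshold would trade recall for precision.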

Training took place on over 200,000 pixels, all labeled as skin colors. The histograms simply count how many of these pixels have each combination of attribute values, but the neural networks also had to be trained on examples of non-skin tones. Four different classifiers were tested on each color space attribute pair and scored on their precision and recall. Recalling what is actually skin without also flagging non-skin tones is very important, so the classifiers were ranked by the sum of their precision and recall.
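The scoring described above can be sketched as follows. This is a hedged illustration rather than the project's evaluation code: the function names and the dictionary-of-predictions input format are assumptions, and per-pixel labels are represented as booleans (True for skin).

```python
def precision_recall(predicted, actual):
    """Return (precision, recall) for two equal-length lists of booleans."""
    pairs = list(zip(predicted, actual))
    tp = sum(1 for p, a in pairs if p and a)        # skin correctly detected
    fp = sum(1 for p, a in pairs if p and not a)    # non-skin flagged as skin
    fn = sum(1 for p, a in pairs if not p and a)    # skin that was missed
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


def rank_classifiers(results, actual):
    """Sort classifiers by precision + recall, best first.

    results maps a classifier name to its list of per-pixel predictions.
    """
    scored = []
    for name, predicted in results.items():
        p, r = precision_recall(predicted, actual)
        scored.append((name, p + r))
    return sorted(scored, key=lambda item: item[1], reverse=True)
```

Summing the two measures rewards a classifier only when it both finds most skin pixels and avoids labeling non-skin pixels as skin.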

Some of the resulting solutions perform quite well in detecting skin, provided that the camera used to take the picture is of relatively good quality and no one in the picture is overexposed. Several neural network methods outperformed the baseline methods. The choice of color space matters much more to the performance of the skin detector than the number of perceptrons in the hidden layer does. Most skin does get recognized, but the system is also prone to picking up objects similar in color, and possibly hair. Morphological operations can be used to help remove this extra noise from the classified images.
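As an example of the kind of morphological cleanup mentioned above, the sketch below applies an opening (erosion followed by dilation) to a binary skin mask. The summary does not say which operations were used, so the choice of opening, the 3x3 neighborhood, and the function names are all assumptions made for illustration.

```python
NEIGHBORS = [(dy, dx) for dy in (-1, 0, 1) for dx in (-1, 0, 1)]  # assumed 3x3 window


def erode(mask):
    """Keep a pixel only if its whole 3x3 neighborhood is skin.

    Pixels outside the image are treated as non-skin.
    """
    h, w = len(mask), len(mask[0])
    return [
        [
            all(
                0 <= y + dy < h and 0 <= x + dx < w and mask[y + dy][x + dx]
                for dy, dx in NEIGHBORS
            )
            for x in range(w)
        ]
        for y in range(h)
    ]


def dilate(mask):
    """Mark a pixel as skin if any pixel in its 3x3 neighborhood is skin."""
    h, w = len(mask), len(mask[0])
    return [
        [
            any(
                0 <= y + dy < h and 0 <= x + dx < w and mask[y + dy][x + dx]
                for dy, dx in NEIGHBORS
            )
            for x in range(w)
        ]
        for y in range(h)
    ]


def opening(mask):
    """Erode then dilate: removes specks smaller than the window."""
    return dilate(erode(mask))
```

Opening removes isolated false positives while roughly preserving larger skin regions. It cannot remove a large object that happens to be skin-colored.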

Curious to read more? Read the full report here!

Or, take a gander at the poster.

As seen here, the HS baseline classifier does quite well extracting the skin tone from the picture of this hand.