Computer vision is an extremely difficult subject because it tries to mimic human cognitive faculties. Technically, computer vision is about mimicking the human visual system, and not just the act of seeing, for the camera is akin to the eye. The real work is in interpreting what we see: the way we relate it to our surroundings and draw on past knowledge of situations, associations and so on to understand it. That is the hardest part to put into an algorithm. Strictly speaking, emotion recognition is not part of computer vision, but it is closely related. The difference is that we are not merely measuring the geometry of her face to ascertain whether she was smiling; we are using deeper knowledge of how that geometry relates to emotions to figure out which emotions were being displayed! This background is necessary to convey the complexity and uniqueness of this experiment, which is surely one of a kind.
It is unique because the chances of error are very high. Robustness is another issue: the method cannot work only on the Mona Lisa. It should work on many, if not all, other human faces.
Coming back to emotion recognition: the "recognition" part is easy to understand, but "emotion" is really a colloquial term and needs a formal approach. Research in psychology has shown that human emotions can be classified into six archetypal emotions: surprise, fear, disgust, anger, happiness and sadness. Facial motion plays an integral part in expressing these emotions. The other part that completes the expression is speech, but that is outside the scope here, for no one has Mona Lisa's audio tapes!
One interesting line of psychology research looked at the roles that speech and facial motion play in conveying each of the emotions. The findings showed that sadness and fear can be made out from speech data, whereas video, that is, the facial cues, gave better clues to anger and happiness.
Fans of The Da Vinci Code know that the Mona Lisa's smile isn't the only mystery associated with Leonardo's masterpiece. Leonardo illustrated De divina proportione, a book about the golden ratio by the mathematician Fra Luca Pacioli, published in 1509, which also draws on work by the artist Piero della Francesca. The golden ratio is defined as follows: if a line is divided into two unequal lengths in such a way that the ratio of the longer segment to the shorter segment is the same as the ratio of the whole line to the longer segment, that ratio is approximately 1.618. Some art historians say that within the painting, the relationship between the Mona Lisa's right shoulder and cheek and her left shoulder and cheek forms a golden triangle whose shortest sides are in divine proportion to its base.
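The golden-ratio definition above can be checked numerically. The following is a minimal sketch in Python; the function names are made up for illustration and have nothing to do with the emotion-recognition software discussed in this post. It uses the closed form (1 + sqrt(5)) / 2, which is what you get by solving the "whole to longer equals longer to shorter" condition.

```python
import math


def golden_ratio() -> float:
    """Positive solution of x*x = x + 1, i.e. (1 + sqrt(5)) / 2."""
    return (1 + math.sqrt(5)) / 2


def split_in_golden_ratio(total: float) -> tuple[float, float]:
    """Split a line of length `total` into (longer, shorter) segments
    so that total / longer equals longer / shorter."""
    longer = total / golden_ratio()
    shorter = total - longer
    return longer, shorter


if __name__ == "__main__":
    longer, shorter = split_in_golden_ratio(1.0)
    print("longer / shorter:", longer / shorter)
    print("whole / longer:  ", 1.0 / longer)
```

Both printed ratios should agree with each other and with the figure of roughly 1.618 quoted above.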
Advances in computer vision have now made possible a whole new generation of software. A case in point is an algorithm that can map a person's face onto a computer mesh model and calculate facial expressions from facial points such as lip curvature, eyebrow position and cheek contraction. Its developers claim it detects happiness, disgust, fear, anger, surprise and sadness with 85 percent accuracy, but researchers don't yet have the technology to detect more subtle emotions.
So, any guesses as to what happened when the algorithm was applied to the famed Mona Lisa painting? It analyzed the painting and found that the Mona Lisa's expression is 83 percent happy, 9 percent disgusted, 6 percent fearful and 2 percent angry!
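To make a breakdown like this concrete, here is a small Python sketch. It is only an illustration: the reports on the software do not say how it turns its measurements into percentages, so the simple "divide by the total" step and the function names below are assumptions, not the researchers' method. The only real data in it are the four figures quoted above.

```python
# Percentages quoted above for the Mona Lisa.
MONA_LISA = {"happy": 83, "disgusted": 9, "fearful": 6, "angry": 2}


def to_percentages(scores: dict[str, float]) -> dict[str, float]:
    """Rescale non-negative per-emotion scores so that they sum to 100.

    Assumption: plain proportional scaling; the real software may
    combine its measurements quite differently.
    """
    total = sum(scores.values())
    if total <= 0:
        raise ValueError("scores must have a positive sum")
    return {emotion: 100 * value / total for emotion, value in scores.items()}


def dominant_emotion(scores: dict[str, float]) -> str:
    """Return the emotion with the highest score."""
    return max(scores, key=scores.get)


if __name__ == "__main__":
    print("total:", sum(MONA_LISA.values()))
    print("dominant:", dominant_emotion(MONA_LISA))
```

The four quoted figures add up to 100, so they already read as a complete breakdown across the emotions detected.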
The researchers also found that George Bush was feeling surprise, fear and sadness during a speech regarding the war in Iraq. Michael Jackson was 33-percent fearful in his mug shot and angry and disgusted as the press snapped pictures after his trial.
Any invention has to lead to practical use, and this algorithm can become a true innovation if used appropriately. For example, emotion-recognition technology might be used to detect that a driver is getting sleepy at the wheel and sound an alert, or to gauge how you feel about certain items while you're shopping. It is proof that it sometimes takes a look at the past to pave the way for the future. Another possible application of emotion-recognition software is to detect terror suspects on the basis of their emotions, not just their physical characteristics.
The inventors of this same program were hired by Unilever, the food and consumer goods giant, to work on a project that could well change the face of marketing. At Unilever outlets in six European cities, around 300 women volunteered to have their facial expressions photographed while tasting five types of food: vanilla ice cream, chocolate, cereal bars, yogurt and apples. Not surprisingly, ice cream and chocolate produced the most happy expressions.
The software also registered fewer smiley faces for healthy foods. Apples produced 87 percent neutral expressions, with Italians and Swedes registering disappointment when eating them; yogurt didn't fare much better, evoking "sad" expressions in 28 percent of the Europeans tested.
This is not necessarily new research, but it has been picking up over the last three to four years. It was interesting to report because of its "fun" and "educational" elements. Serious research can be quite a challenge, and finding innovative applications in areas hitherto unexplored may be even trickier.