This talk is about technology that goes under various names: emotional robots, affective computing, human-centered computing. But behind all these is actually the technology for automatic understanding of human behavior, and more specifically human facial behavior.

The human face is simply fascinating. It serves as our primary means to identify other members of our species. It also serves to judge other people’s age, gender, beauty, or even personality. But more important is that the face is a constant flow of facial expressions. We react and emote to external stimuli all the time. And it is exactly this flow of expressions that is the observable window to our inner self. Our emotions, our intentions, attitudes, moods.

Why is this important? Because we can use it in a very wide variety of applications. So, everybody wants to know who the person is and what the meaning of his or her expression is, and use it for various applications. When it comes to analysis of our faces in static face images, that is, identification of faces, this problem is actually considered solved. Similarly, we can say that for facial expression analysis in frontal-view videos the problem is more or less solved.

As you can see from these videos, we can accurately track faces in frontal views, and even judge expressions such as frowns or smiles, and high-level behaviors like intensity of joy or intensity of interest, even in outdoor environments. However, when it comes to completely unconstrained environments where we have large changes in head pose, and when we have occurrences of large occlusions, then we are facing a challenge. We call this problem automatic face and facial expression analysis in videos uploaded [to] social media like YouTube and Facebook.

We have to collect a lot of data in the wild, annotate this in terms of where the face is and where the parts of the face are. And then build these multi-view models that will be able to actually handle these large changes in head pose. We also need to take the context in which a facial expression is expressed into account in order to be able to deal with the subtle facial behavior, or with occlusions of facial expressions. So, context-sensitive machine learning models are the future.

Nonetheless, the technology as it is now is still very much applicable to a wide variety of applications. A good example is market analysis, where we could use the reaction of people to products and adverts in order to judge the success of these products and adverts. The software is commercially available [from] Realeyes, and we are working with this company in order to include verbal feedback about products and adverts as well, and to build another tool for skill enhancement, such as conflict resolution skills.

Another very important field in which we work quite a lot is the medical field, and we currently have the technology available for automatic analysis of pain and intensity of pain from facial expressions. We use this in visitor physiotherapeutical environments, but we could also use that for intensive care.

Another important project for us is the European Commission project on work with autistic children, where we would like to help the kids to understand the facial expressivity of themselves and others by using social robots with which they interact. These robots will have a camera which will watch them, and the software that will interpret these expressions and give them feedback.

In any case, once this technology really becomes mature and we can truly do face and facial expression analysis in the wild, we would be able to have a lot of applications, such as, for example, a system for analysis of negotiation styles or managing styles. Or simply measuring the stress in job interviews, in car environments, in entertaining environments, and then increase the stress if people find this entertaining, or decrease it in order to increase the safety of the driver and the patient. So that’s just to mention a few examples.

Thank you very much for your attention. 

