The development of facial recognition – much harder than we think!

Facial recognition is a notably hyped (if not over-hyped) technology. FRT developers and vendors are keen to talk up the far-reaching capabilities of their products, while critics are equally ready to fear the worst of what might potentially be done with them. As with most emerging technologies, facial recognition is the focus of ongoing, exaggerated ‘booster’ and ‘doomster’ depictions of a very powerful technology.

As such, there are reasonable grounds to suspect that FRT will never be capable of working to the full extent (and effect) that many people would have us believe. For example, visions of ubiquitous, infallible ‘one-to-many’ FRT are technically impossible to realise. Consequently, the dystopian promise of all-seeing surveillant cities that can track an individual across a 24-hour period falls short of the mark. Similarly, the prospect of real-time emotion monitoring based on facial expressions is fundamentally far-fetched. As many commentators have pointed out, emotions are simply not detectable or recognisable via the face (or any other physical feature).
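One way to see why ‘infallible’ one-to-many matching is implausible is to consider how false matches accumulate as a watchlist grows. The short Python sketch below is purely illustrative: the 0.1% per-comparison false match rate is an assumed figure rather than a measurement of any real system, and real deployments are messier still.

```python
# Back-of-the-envelope sketch: why 'one-to-many' matching gets harder
# as the watchlist grows. All figures below are illustrative assumptions,
# not measurements of any real facial recognition system.

false_match_rate = 0.001  # assumed per-comparison false match rate (0.1%)
gallery_sizes = [1_000, 100_000, 1_000_000, 10_000_000]

for n in gallery_sizes:
    # Probability that a single probe face triggers at least one false match
    # somewhere in a gallery of n enrolled faces (independence assumed).
    p_any_false_match = 1 - (1 - false_match_rate) ** n
    expected_false_matches = false_match_rate * n
    print(f"gallery={n:>10,}  "
          f"P(at least one false match)={p_any_false_match:.4f}  "
          f"expected false matches per probe={expected_false_matches:,.0f}")
```

At a gallery of one million faces, even this optimistic error rate implies around a thousand false matches for every face scanned – a useful corrective to claims of flawless city-wide identification.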

Nevertheless, as Rob Horning (2021, n.p.) puts it, our concerns with these technologies “need not be premised on the technology actually working”. The continued enthusiasm for deploying expression-detection technology suggests that important decisions will be made using such systems regardless of the flawed assumptions about the relationship between muscle movements and inner feelings and intentions that underpin them.

As such, our main concern with these more speculative claims made on behalf of FRT is that they result in actions falsely premised on the notion that the technology is capable of doing what is claimed on its behalf. For example, there is a real danger that the continued use of FRT to ‘detect’ emotions ends up producing emotions. In other words, these technologies are capable of rationalising a judgement that a particular emotion is present, and then triggering subsequent actions accordingly.

Similarly, the use of FRT to ‘predict’ a future event might end up triggering the consequences of that event having taken place. If a risk assessment falsely identifies someone as posing a threat during a confrontation, this could contribute to forms of escalation that then make that threat materialise.

Thus, rather than anticipating the worst-case consequences of the enduring hype and exaggerated expectations that surround FRT, much more attention needs to be paid to the inherent limitations of facial recognition technologies and the AI that lies behind them. To paraphrase Melanie Mitchell (2021) – a computer science professor at the Santa Fe Institute – more attention needs to be given to why the development of AI technologies such as FRT “is harder than we think”, and to developing more realistic expectations (and with this, more realistic hopes and fears) about the technology.

Mitchell outlines a number of fallacies that she feels are commonplace in the way AI research is currently approached, and which are certainly apparent in current FRT research and development. One is that many current discussions of FRT have lost sight of the fact that ‘easy things are hard’. Here, Mitchell laments the loss of the shared understanding amongst twentieth-century computer science researchers that the most difficult and complex tasks to automate are actually things that humans can do with little thought.

Mitchell gives the example of the AI pioneer Herbert Simon, who back in the 1970s would often make the point that he could recognise his mother with ease, whereas programming a computer to do the same was incredibly complex. In contrast to these split-second cognitive processes, it was generally understood that things humans find difficult (such as solving a complex mathematical problem or playing Go or chess) were actually far easier to automate from a computational point of view.

These understandings, Mitchell argues, have been flipped by computer science’s recent turn away from ‘symbolic’ AI approaches and toward the ‘statistical’ AI approach of deep learning. She now sees a growing belief amongst the current generation of AI researchers that anything a human can do through unconscious thought is likely to be computationally modelled and automated in the near future.

Seen in this light, there is a statistically driven confidence that AI will soon be developed to recognise Herbert Simon’s mother – as well as every other mother in the world. Conversely, getting a machine to play a board game such as Go is now heralded by Google DeepMind researchers as “the most challenging of domains”. Nevertheless, this logic overestimates the capacity of current AI to codify human thought. As Mitchell (2021, p.4) concludes, AI continues to be harder than we think “because we are largely unconscious of the complexity of our own thought processes”.

This tendency to presume that FRT will soon be capable of recognising human emotions, thoughts and intentions is perpetuated by the misleading terminology alluding to human intelligence that now tends to be attached to contemporary AI. Again, Mitchell argues that this gives a false sense of equivalence between machine and human intelligence.

For example, the idea of ‘computer vision’ might be loosely inspired by – but is certainly not comparable to – the visual processes that take place within a biological brain. Facial ‘recognition’ certainly does not resemble the ‘recognition’ undertaken by humans and non-humans, not least in the capacity to transfer what is recognised in one domain to another. As Ruha Benjamin (2019, p.45) reminds us, FRT only offers the promise of ‘viewing’, ‘recognising’ and ‘reading’ a person at a surface level – a ‘thin description’ of a person’s skin, hair and a few other facial features – and therefore marginalises the thicker descriptions that would come with a human gaze.

Thus, any talk of FRT in anthropomorphic terms of ‘seeing’, ‘recognising’, ‘remembering’ and ‘noticing’ clearly raises expectations and perceptions of what is technically possible. Even if facial recognition developers and researchers are simply using such terms as shorthand, and fully understand that machines do not ‘see’ and ‘think’ in the same way that humans do, such distinctions are not necessarily made by system vendors, their customers or the general public. Indeed, over time these connotations are likely to unconsciously shape how facial recognition experts themselves think about their systems. As Nora Khan (2019) puts it: “We must stop with understanding the machine’s seeing as anything like human seeing. This comparison is a fallacy”.

**

References

Benjamin, R. (2019). Race after technology. Polity.

Horning, R. (2021). Twitter post. https://twitter.com/robhorning/status/1356675880328847360

Khan, N. (2019). Seeing, naming, knowing. The Brooklyn Rail, March.

Mitchell, M. (2021). Why AI is harder than we think. arXiv preprint arXiv:2104.12871.
