THE FIELDS OF artificial intelligence and machine learning are moving so quickly that any notion of ethics is lagging decades behind, or left to works of science fiction. This might explain a new study out of Shanghai Jiao Tong University, which says computers can tell whether you will be a criminal based on nothing more than your facial features.
The bankrupt attempt to infer moral qualities from physiology was a popular pursuit for millennia, particularly among those who wanted to justify the supremacy of one racial group over another. But phrenology, which involved studying the cranium to determine someone’s character and intelligence, was debunked around the time of the Industrial Revolution, and few outside of the pseudo-scientific fringe would still claim that the shape of your mouth or size of your eyelids might predict whether you’ll become a rapist or thief.
Not so in the modern age of Artificial Intelligence, apparently: In a paper titled “Automated Inference on Criminality using Face Images,” two Shanghai Jiao Tong University researchers say they fed “facial images of 1,856 real persons” into computers and found “some discriminating structural features for predicting criminality, such as lip curvature, eye inner corner distance, and the so-called nose-mouth angle.” They conclude that “all four classifiers perform consistently well and produce evidence for the validity of automated face-induced inference on criminality, despite the historical controversy surrounding the topic.”
Though long ago rejected by the scientific community, phrenology and other forms of physiognomy have reappeared throughout dark chapters of history. A 2009 article in Pacific Standard on the racial horrors of colonial Rwanda might’ve been good background material for the pair:
In the 1920s and 1930s, the Belgians, in their role as occupying power, put together a national program to try to identify individuals’ ethnic identity through phrenology, an abortive attempt to create an ethnicity scale based on measurable physical features such as height, nose width and weight, with the hope that colonial administrators would not have to rely on identity cards.
This can’t be overstated: The authors of this paper — in 2016 — believe computers are capable of scanning images of your lips, eyes, and nose to detect future criminality. It’s enough to make phrenology seem quaint.
The study contains virtually no discussion of why there is a “historical controversy” over this kind of analysis, namely that it was debunked hundreds of years ago. Rather, the authors trot out another discredited argument to support their main claims: that computers can’t be racist, because they’re computers.
Unlike a human examiner/judge, a computer vision algorithm or classifier has absolutely no subjective baggages, having no emotions, no biases whatsoever due to past experience, race, religion, political doctrine, gender, age, etc., no mental fatigue, no preconditioning of a bad sleep or meal. The automated inference on criminality eliminates the variable of meta-accuracy (the competence of the human judge/examiner) all together. Besides the advantage of objectivity, sophisticated algorithms based on machine learning may discover very delicate and elusive nuances in facial characteristics and structures that correlate to innate personal traits and yet hide below the cognitive threshold of most untrained nonexperts.
This misses the fact that no computer or software is created in a vacuum. Software is designed by people, and people who set out to infer criminality from facial features are not free from inherent bias.
Absent, too, is any discussion of the incredible potential for abuse of this software by law enforcement. Kate Crawford, an AI researcher with Microsoft Research New York, MIT, and NYU, told The Intercept, “I’d call this paper literal phrenology, it’s just using modern tools of supervised machine learning instead of calipers. It’s dangerous pseudoscience.”
Crawford cautioned that “as we move further into an era of police body cameras and predictive policing, it’s important to critically assess the problematic and unethical uses of machine learning to make spurious correlations,” adding that it’s clear the authors “know it’s ethically and scientifically problematic, but their ‘curiosity’ was more important.”
Given the explosive, excited growth of AI as a field of study and a hot commodity, don’t be surprised if this curiosity is contagious.
IF this is real it won’t go far, because cops, politicians, and lawyers are by FAR the worst criminals. Imagine coupling cameras mounted in public places using this technology with an automated warrant-writing machine. How many cops and politicians and lawyers would be targeted?
Yep. Plonk one of these machines down in Washington D.C. and watch it start bleeping away.
Bleeebledeeebleeepbleeepbleeep!!!
Then the machine burns out from criminal sensory overload.
If you don’t smile we will drag you off to a camp and harvest your organs.
“…The study contains virtually no discussion of why there is a ‘historical controversy’ over this kind of analysis — namely, that it was debunked hundreds of years ago….”
Debunked by whom? There’s a lot of “debunking” of truth done for the purpose of reinforcing politically convenient lies, and I assume that’s been the case for a long time.
One thing that’s overlooked by this article is that people who share “criminal” facial features probably also share certain DNA characteristics that may also predispose them to be criminals.
One problem with the study as presented is that the definition of “criminal” varies according to what laws are on the books, and those laws can’t change people’s facial features. In some tyrannical societies of the past, the only morally upright citizens were criminals. Hell, America’s Founders were criminals.
That said, it’s my understanding that certain facial features ARE correlated with aggressive tendencies, possibly due to certain genes and/or levels of testosterone exposure in the womb or in later stages of development. The thing is, not everyone with aggressive tendencies acts on them. And even moderate correlations imply many exceptions — far too many for an analysis of facial features to be truly predictive.
It’s worth noting here that previous studies have shown that finger-length ratio can be predictive of traits that include aggression. The reasons are believed to be the same as mentioned above. But the correlation is far from perfect.
In a word, the science here might be real, but it’s almost certainly too weak to be of any practical use.
“… the science here might be real, but it’s almost certainly too weak to be of any practical use.”
Which makes it ideal for their purpose.
I’d much rather have something like this not work than actually do what it claims. The pigs don’t need pseudoscience to frame people — they can just plant drugs, etc. — so this doesn’t really add anything to their arsenal in that regard. But if this predictive software did somehow work, then they could use it to identify potential rebels.
Fortunately, it’s extremely likely to be garbage science.
As mentioned, “criminal” is a floating signifier that depends on the laws in a particular country and time. The exact same human being might be regarded as a “criminal” in one society and as a “heroic revolutionary” in another, even though his biology stays exactly the same.
There are also some very good objections raised in the comments at the Intercept (linked above), especially by the poster named “Ben Whitmore.” Among other things, he notes:
*** The study is also probably suffering from “overfitting”, because they are testing on the same dataset that they trained on. When you demand that your algorithm learn the difference between two groups, it can “overfit” to the point that it is largely just learning the features of all the individuals in one group and all the individuals in the other group. When you then try to use the trained model on other individuals not seen during training, the results are pretty much random, because the model doesn’t recognize them. This is why any study such as this should always use a training set for the actual training and a test set for validating and measuring accuracy. Their failure to do so is amateur in the extreme. They have used 10-fold cross-validation, which is something at least, but this still allows them to achieve apparent accuracy scores approaching 90% based on overfitting alone, rather than based on any real accuracy. No surprise their reported accuracy is 89.5%. ***
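The general point about memorization can be made concrete with a minimal Python sketch. It is not the paper's method or data: the dataset here is pure random noise with random labels, and the 1-nearest-neighbour classifier and the names make_noise_dataset, nearest_label, and accuracy are all illustrative assumptions. It only shows why a score measured on the training examples says nothing about examples the model has not seen.

```python
import random

def make_noise_dataset(n, dims, rng):
    """Random features paired with random 0/1 labels: there is no real signal."""
    return [([rng.random() for _ in range(dims)], rng.randint(0, 1))
            for _ in range(n)]

def nearest_label(train, point):
    """1-nearest-neighbour: return the label of the closest training example."""
    closest = min(
        train,
        key=lambda ex: sum((a - b) ** 2 for a, b in zip(ex[0], point)),
    )
    return closest[1]

def accuracy(train, examples):
    """Fraction of `examples` whose label the 1-NN model predicts correctly."""
    hits = sum(nearest_label(train, x) == y for x, y in examples)
    return hits / len(examples)

rng = random.Random(0)  # fixed seed so the run is repeatable
data = make_noise_dataset(600, 10, rng)
train, test = data[:400], data[400:]  # hold out 200 examples for testing

train_acc = accuracy(train, train)  # scored on examples the model memorized
test_acc = accuracy(train, test)    # scored on examples it has never seen

print(f"training-set accuracy: {train_acc:.2f}")
print(f"held-out accuracy:     {test_acc:.2f}")
```

Scored on its own training examples, the model is always right, since every point is its own nearest neighbour. Scored on the held-out examples, it should land near chance (about 50 percent), because there was never any signal to learn.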