The Economist features face recognition on its front page, reporting that deep neural networks can now tell whether you’re straight or gay better than humans can just by looking at your face. The research they cite is a preprint, available here.
Its authors Kosinski and Wang downloaded thousands of photos from a dating site, ran them through a standard feature-extraction program, then classified gay vs straight using a standard statistical classifier, which they found could tell the men seeking men from the men seeking women. My students pretty well instantly called this out as selection bias; if gay men consider boyish faces to be cuter, then they will upload their most boyish photo. The paper authors suggest their finding may support a theory that sexuality is influenced by fetal testosterone levels, but when you don’t control for such biases your results may say more about social norms than about phenotypes.
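For readers curious about the shape of such a method, here is a minimal sketch in Python/scikit-learn of that kind of pipeline: pretrained face features fed into an ordinary linear classifier and scored with cross-validated AUC. The feature extractor, function names and label encoding are my own illustration under those assumptions, not the authors’ code, and note that nothing in such a pipeline corrects for the selection bias just described.

```python
# Illustrative only: pretrained-network features + a standard linear classifier,
# evaluated with cross-validated AUC. This is not the paper's actual code.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score


def extract_face_features(image_paths):
    """Placeholder for a pretrained face-embedding model (e.g. a VGG-style
    network): return one fixed-length feature vector per photo."""
    raise NotImplementedError("plug in a real feature extractor here")


def evaluate_pipeline(image_paths, labels):
    """Fit logistic regression on face features and return cross-validated AUC,
    the headline number such studies tend to quote."""
    X = extract_face_features(image_paths)   # shape (n_photos, n_features)
    y = np.asarray(labels)                   # e.g. 1 = men seeking men, 0 = men seeking women
    clf = LogisticRegression(max_iter=1000)
    scores = cross_val_score(clf, X, y, cv=5, scoring="roc_auc")
    return scores.mean()
```

Any systematic difference in how the two groups choose and groom their photos will show up as classifier accuracy in such a pipeline just as readily as any physiological difference would.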
Quite apart from the scientific value of the research, which is perhaps best assessed by specialists, I’m concerned with the ethics and privacy aspects. I am surprised that the paper doesn’t report having been through ethical review; the authors consider that photos on a dating website are public information and appear to assume that privacy issues simply do not arise.
Yet UK courts decided, in Campbell v Mirror, that privacy could be violated even by photos taken on the public street, and European courts have come to similar conclusions in I v Finland and elsewhere. For example, a Catholic woman is entitled to object to the use of her medical record in research on abortifacients and contraceptives even if the proposed use is fully anonymised and presents no privacy risk whatsoever. The dating site users would be similarly entitled to object to their photos being used in research to which they might have an ethical objection, even if they could not be identified from their photos. There are surely going to be people who object to research in any nature vs nurture debate, especially on a charged topic such as sexuality. And the whole point of the Economist’s coverage is that face-recognition technology is now good enough to work at population scale.
What do LBT readers think?
As a gay man (for some reason excluded from the invitation of your last sentence), I laugh out loud at your fantasy that gays like boyish looks. But Andrew Ng has said that NNs are good at classification tasks that humans can do in less than a second, and this is a case in point. Not for any reason of phenotype, but merely because gays and straights follow different conventions about how they present themselves, it is easy for those who look out for these things to tell them apart.
As for the privacy, I wish that scientists doing experiments on photos we upload to dating sites was all we had to worry about.
In this context, “LBT” = “Light Blue Touchpaper”.
LOL LBT.
Big data is big data. AI knows all, even the things politically incorrect to note. AI will attribute extra disposable income to homosexuals just like it will deny you a loan because of your race. Get over it.
“I laugh out loud at your fantasy that gays like boyish looks”.
If you read the article carefully, you’ll notice a small but very important word: IF.
Ross wrote:
“*IF* gay men consider boyish faces to be cuter…” (Emphasis mine)
Ross isn’t saying gay men prefer boyish faces; he was just using it as a possible example.
That’s still not quite right though. It should be
“If gay men believe that *other* gay men consider boyish faces to be cuter…”
It’s nothing to do with how accurate these beliefs might be. Compare:
“If straight men believe that straight women consider X to be cuter…”
I’ve used dating sites in the past, and one of the things I can say with confidence is that many people put up misleading photos. I turned up to some dates only to discover the other person didn’t look like their profile pictures.
You have to use a certain amount of skill to work out from their profile details if the picture is likely to be accurate.
The paper uses the word “consent” twice when talking about situations without consent.
There is no mention of obtaining informed consent from the users of the dating site. In Europe you can’t give away this consent through a EULA.
Then there is also the fact that pictures contain racial information, which is extra sensitive personal information. They even combine that with sexual orientation, which has the same status.
I totally understand your astonishment that no ethics committee was involved.
Even more so after we’ve already had an uproar over a user publishing data from a dating site last year: http://fortune.com/2016/05/18/okcupid-data-research/
While I’ve not sat on an IRB, I have acted as an advisor to an IRB member on projects involving privacy issues, and I would have advised that this research did not meet appropriate practice. While using a dating site is easy, there are alternative sources of data which can be gathered with informed consent. The quality of the data would also potentially be much higher with a better method of collection.
I’d also have recommended rejection of the paper on ethical grounds had I been a reviewer or editor.
I’m not LGBT in any way, but surely this is just like the good old “should we publicly release a lockpicking book” debate. It may be unethical and quite dangerous if this turns out to be true and falls into the wrong hands, but it’s still entirely possible, and research shouldn’t be shunned. It’s similar to deciding whether to disclose an exploit you’ve found: hackers may still use it for personal gain, but the good guys can fix the issue.
The title page of the preprint states “The study has been approved by the IRB at Stanford University”, which suggests that an ethics committee was involved. Whether we should agree with its approval is of course a different question.
I am guessing the answer is “No human subjects are involved, therefore the IRB cannot review this.”
Something has to change in this strict interpretation of rules around IRBs.
I have been discussing this paper on the University of Manchester’s ‘Health Care Ethics and Law’ Facebook page. The first issue that concerns me, as it does you and your students, is that the research is seriously flawed because it mixes biological data with social data. The authors need to exclude the ‘expressions and “grooming styles”’ data from the photographs if they are going to make any biological/genetic claims about the ‘causes’ of homosexuality. Of course, this would be extremely difficult if not impossible to do even if the photographs were taken by the researchers to avoid any expressions and grooming styles. It beggars belief that the authors used photos from dating sites, because the individuals in the photos would perhaps be trying to project themselves as drop-dead gorgeous people, i.e., expressions and grooming styles will be accentuated. By using this data it is not possible to make any meaningful hypothesis about testosterone levels in the womb influencing sexuality. Errors of this type in neural net research go back decades and are no less common now.
As I have explained on the HCEL Facebook page, debates about the ethics raised by flawed and incorrect AI research only help to legitimise it, and have started a new wave of ‘AI hype’ and ‘AI myths’. To my mind, there are quite enough ethical issues raised by the use of deep neural nets without having to invent them. The privacy issue you raise is a legitimate concern and I am in agreement that the research should have been reviewed by an ethics committee. (The Stanford IRB regulations appear to allow the use of this type of information, so the researchers would presumably have had their ‘HSR Determination’ form rubber-stamped.) The individuals on the dating websites did not give their consent for their information to be used for research. Indeed, the authors seem more than happy to raise the privacy issues the technology might create in the future, but seem for the most part oblivious to them in their own research. Sometimes I wonder whether – wittingly or unwittingly – some researchers still believe everybody has a ‘duty’ to participate in medical research (this research could count as medical research). (John Harris would say they do have a duty and that privacy issues should not hamper scientific progress.) Unfortunately, although this attitude has always been wrong, it will be increasingly difficult to maintain privacy in an age of rapidly increasing machine intelligence.
The thing is, the authors have not weighed in on the nature/nurture thing, but limited the scope to “Can a computer recognize a gay person?”, and chances are it’s using the same social/presentational cues a human does. Excluding those would break the question. The research isn’t “Can we use phrenology to recognize queer folk?”, because that’s silly; of course you can’t.
the most basic flaw is that they assume that there is a definable category of “gay men”. they actually acknowledge this in relation to gay women, who they excluded on the basis that “female sexuality is more fluid”. what absolutely bl**dy nonsense! i am no expert sexologist but surely since somewhere in the 70s if not earlier everyone has acknowledged that all humans’ sexuality is “fluid”. and if you cannot clearly define a category, you cannot reliably determine whether a particular target (here a person) fits into that category. surely that in itself torpedoes this rubbish? and of course, as i said earlier, bl**dy dangerous rubbish!
Comments by a sociologist on the research substance here
Yes, I will go along with most of Mattson’s review. I hope others will join in.
Worth noting that the analyzed “facial features” include such mutable things as hairstyle, facial hair, makeup. This could amount to little more than a neural network discovering that fashion exists.