Good security and cybercrime research often has an impact, and we want to ensure that the impact is positive. This week I will discuss three papers on ethics in computer security research in the run-up to next week’s Security and Human Behaviour workshop (SHB): Ethical issues in research using datasets of illicit origin (Thomas, Pastrana, Hutchings, Clayton, Beresford) from IMC 2017; Measuring eWhoring (Pastrana, Hutchings, Thomas, Tapiador) from IMC 2019; and An Ethics Framework for Research into Heterogeneous Systems (Happa, Nurse, Goldsmith, Creese, Williams).
Ethical issues in research using datasets of illicit origin (blog post) came about because, in prior work, we had noticed ethical complexities in using data that had “fallen off the back of a lorry”, such as the backend databases of hacked booter services. We took a broad look at existing published guidance to synthesise the issues that particularly apply to using data of illicit origin and that we expected to see researchers discuss:
- Identification of stakeholders: If you don’t realise that stakeholders could be impacted by your research then you can’t account for that (e.g. compromising ongoing police investigations).
- Informed consent: This is usually not possible with data of illicit origin, as the data normally concerns many people who can be hard to identify, and even where they could be identified, doing so normally involves processing the data in the first place.
- Identify harms: Work out what could go wrong in order to avoid it going wrong.
- Safeguards: Measures to prevent harm.
- Justice: Not unfairly disadvantaging particular social or cultural groups. Diversity on the research team makes it easier to spot this.
- Public interest: As there are risks to this research the benefits need to outweigh them.
We then reviewed papers from a range of disciplines to see what ethical issues they discussed when they used various kinds of data of illicit origin, including leaked government secrets, company financial/legal records, passwords, and malware. We categorised and summarised the issues discussed in these papers to pull out the superset of the justifications, safeguards, harms, and benefits considered. In general, researchers don’t talk about ethics enough, so the shared understanding of best practice is much more limited than it could be. This was already changing when we wrote the paper and has continued to change since, with much greater awareness in the community of the importance of considering and communicating the ethical case.
In this paper we took a pretty utilitarian view of ethics and that is by no means the best or only way of thinking about ethics and security. However, it is simple to apply in simple cases. When things get more complicated the variety of perspectives on the relevant Research Ethics Board (ethics committee/IRB) should draw out other ways of thinking about the issues.
Measuring eWhoring was one of those papers where the ethics committee initially said “No, but…”. By using their suggestions and thinking rather harder about minimising harm, we developed a much safer framework for working with images and found that our initial assumption about the safety of the images was false.
This paper measures eWhoring (the offenders’ name for the practice), where offenders fraudulently claim to their customers to be young women and offer virtual sexual encounters in exchange for payment. In this work we used posts from an underground forum dedicated to the practice. While there were some issues to take care of in the analysis of the text, the key ethical challenge concerned the processing of the images. A variety of images were posted on the forum, both directly in posts and via links to third-party websites. Of interest to us were screenshots showing purported earnings from this fraud (which we wanted to analyse to understand earnings) and images of models (which we wanted to run through reverse image search to find out where they had come from but, crucially, didn’t want to actually view).
The key risks were psychological harm to researchers from exposure to images used in virtual sexual encounters (and, where these had been obtained without consent, further harm to the victims whose images had been abused) and legal risk to researchers if any indecent images of children were obtained, as possession is a strict liability offence in the UK, where the work was conducted. We initially thought it extremely unlikely that any indecent images of children would be collected, but at the ethics committee’s suggestion we collaborated with the IWF, got access to their API, and fed all the images through it. This identified that indecent images of children were being shared within the eWhoring community (all such images were reported and deleted).
We developed an automatic processing pipeline to separate the images we wanted to look at (proof of earnings) from those we didn’t, so that researchers didn’t need to look at any not-safe-for-work images.
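To make the shape of such a pipeline concrete, here is a minimal sketch in Python. It is not the code used in the paper. The function `triage`, the `Route` categories, the two callables `is_known_illegal` and `nsfw_score`, and the threshold value are all hypothetical names chosen for illustration. In particular, `is_known_illegal` is only a stand-in for an external check such as the one the IWF provided, whose real interface is not reproduced here.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Callable, Iterable, List, Tuple


class Route(Enum):
    REPORT_AND_DELETE = "report_and_delete"  # flagged by the external check
    AUTOMATED_ONLY = "automated_only"        # processed by software, never viewed
    RESEARCHER_REVIEW = "researcher_review"  # judged safe for a person to view


@dataclass(frozen=True)
class Decision:
    image_id: str
    route: Route


def triage(
    images: Iterable[Tuple[str, bytes]],
    is_known_illegal: Callable[[bytes], bool],  # hypothetical external check
    nsfw_score: Callable[[bytes], float],       # hypothetical classifier, 0.0 to 1.0
    nsfw_threshold: float = 0.1,                # assumed, deliberately conservative
) -> List[Decision]:
    """Route each image without any person having to open it first."""
    decisions = []
    for image_id, data in images:
        # The external check runs first. If it raises, the error propagates
        # and the run stops, rather than continuing without that safeguard.
        if is_known_illegal(data):
            route = Route.REPORT_AND_DELETE
        else:
            try:
                score = nsfw_score(data)
            except Exception:
                # Fail closed: an image that cannot be scored is treated
                # as unsafe to view.
                score = 1.0
            if score < nsfw_threshold:
                route = Route.RESEARCHER_REVIEW
            else:
                route = Route.AUTOMATED_ONLY
        decisions.append(Decision(image_id, route))
    return decisions
```

The design choice worth noting in this sketch is that every uncertain case defaults to the more cautious route, so a classifier failure keeps an image away from researchers instead of exposing them to it.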
eWhoring has seen a boom during lockdown but it was and remains illegal.
An Ethics Framework for Research into Heterogeneous Systems (Happa, Nurse, Goldsmith, Creese, Williams) appeared in Living in the Internet of Things: Cybersecurity of the IoT – 2018 (IET). The authors survey and systematise the ethical and legal issues in developing IoT systems and present a framework for reducing the number of ethical mistakes made and for learning from those mistakes. Following this framework would be likely to produce safer systems with fewer ethical issues. However, I think the key difficulty with delivering this in practice is the current lack of economic incentive to do so. Taking this much care costs time and money, and in both academia and industry the incentives are against it. As the authors note, if this level of care becomes the industry norm then other members of that industry might be held to it. Even though there has been some progress in the last few years, we are still a very long way from the standard advocated in this paper. Speaking of time, a more thorough review was prevented by time shortages resulting from lockdown.
Dr Daniel R. Thomas, Chancellor’s Fellow (lecturer/assistant professor), Computer and Information Sciences, University of Strathclyde.