Data ordering attacks

Most deep neural networks are trained by stochastic gradient descent. Now “stochastic” is a fancy Greek word for “random”; it means that the training data are fed into the model in random order.

So what happens if the bad guys can cause the order to be not random? You guessed it – all bets are off. Suppose for example a company or a country wanted to have a credit-scoring system that’s secretly sexist, but still be able to pretend that its training was actually fair. Well, they could assemble a set of financial data that was representative of the whole population, but start the model’s training on ten rich men and ten poor women drawn from that set – then let initialisation bias do the rest of the work.

Does this generalise? Indeed it does. Previously, people had assumed that in order to poison a model or introduce backdoors, you needed to add adversarial samples to the training data. Our latest paper shows that’s not necessary at all. If an adversary can manipulate the order in which batches of training data are presented to the model, they can undermine both its integrity (by poisoning it) and its availability (by causing training to be less effective, or take longer). This is quite general across models that use stochastic gradient descent.
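To make the intuition concrete, here is a minimal sketch (illustrative only, not the paper's code): the same training set, fed to plain SGD in two different orders, leaves the model with different parameters after the first pass. An adversary who controls only the ordering can front-load chosen samples and steer the early trajectory.

```python
# Minimal sketch (not the paper's code): the same training set, presented in
# two different orders, drives SGD to different parameters early in training.
import numpy as np

rng = np.random.default_rng(0)

# Toy dataset: two features, binary label (think "creditworthy or not").
X = rng.normal(size=(200, 2))
y = (X[:, 0] + 0.5 * rng.normal(size=200) > 0).astype(float)

def sgd(order, lr=0.5):
    """Run one pass of plain SGD over the data in the given index order."""
    w = np.zeros(2)
    for i in order:
        p = 1 / (1 + np.exp(-X[i] @ w))    # sigmoid prediction
        w -= lr * (p - y[i]) * X[i]        # logistic-loss gradient step
    return w

honest = rng.permutation(len(y))           # genuinely random order
biased = np.concatenate([np.argsort(-y)[:20],        # 20 chosen samples first,
                         rng.permutation(len(y))])   # then a full random pass

print("random order :", sgd(honest))
print("biased order :", sgd(biased))       # visibly different trajectory
```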

This work helps remind us that computer systems with DNN components are still computer systems, and vulnerable to a wide range of well-known attacks. A lesson that cryptographers have learned repeatedly in the past is that if you rely on random numbers, they had better actually be random (remember preplay attacks) and you’d better not let an adversary anywhere near the pipeline that generates them (remember injection attacks). It’s time for the machine-learning community to carefully examine their assumptions about randomness.

Infrastructure – the Good, the Bad and the Ugly

Infrastructure used to be regulated and boring; the phones just worked and water just came out of the tap. Software has changed all that, and the systems our society relies on are ever more complex and contested. We have seen Twitter silencing the US president, Amazon switching off Parler and the police closing down mobile phone networks used by crooks. The EU wants to force chat apps to include porn filters, India wants them to tell the government who messaged whom and when, and the US Department of Justice has launched antitrust cases against Google and Facebook.

Infrastructure – the Good, the Bad and the Ugly analyses the security economics of platforms and services. The existence of platforms such as the Internet and cloud services enabled startups like YouTube and Instagram to soar to huge valuations almost overnight, with only a handful of staff. But criminals also build infrastructure, from botnets to malware-as-a-service. There’s also dual-use infrastructure, from Tor to bitcoins, with entangled legitimate and criminal applications. So crime can scale too. And even “respectable” infrastructure has disruptive uses. Social media enabled both Barack Obama and Donald Trump to outflank the political establishment and win power; they have also been used to foment communal violence in Asia. How are we to make sense of all this?

I argue that this is not simply a matter for antitrust lawyers, but that computer scientists also have some insights to offer, and the interaction between technical and social factors is critical. I suggest a number of principles to guide analysis. First, what actors or technical systems have the power to exclude? Such control points tend to be at least partially social, as social structures like networks of friends and followers have more inertia. Even where control points exist, enforcement often fails because defenders are organised in the wrong institutions, or otherwise fail to have the right incentives; many defenders, from payment systems to abuse teams, focus on process rather than outcomes.

There are implications for policy. The agencies often ask for back doors into systems, but these help intelligence more than interdiction. To really push back on crime and abuse, we will need institutional reform of regulators and other defenders. We may also want to complement our current law-enforcement strategy of decapitation – taking down key pieces of criminal infrastructure such as botnets and underground markets – with pressure on maintainability. It may make a real difference if we can push up offenders’ transaction costs, as online criminal enterprises rely more on agility than on long-lived, critical, redundant platforms.

This was a Dertouzos Distinguished Lecture at MIT in March 2021.

Three Paper Thursday: Subverting Neural Networks via Adversarial Reprogramming

This is a guest post by Alex Shepherd.

Five years after Szegedy et al. demonstrated the capacity for neural networks to be fooled by crafted inputs containing adversarial perturbations, Elsayed et al. introduced adversarial reprogramming as a novel attack class for adversarial machine learning. Their findings demonstrated the capacity for neural networks to be reprogrammed to perform tasks outside their original scope via crafted adversarial inputs, opening a new line of inquiry for the fields of AI and cybersecurity.

Their discovery raised important questions regarding trustworthy AI, such as what the unintended limits of functionality are in machine learning models, and whether the complexity of their architectures can be advantageous to an attacker. For this Three Paper Thursday, we explore three of the most prominent papers concerning this emerging threat in the field of adversarial machine learning.

Adversarial Reprogramming of Neural Networks, Gamaleldin F. Elsayed, Ian Goodfellow, and Jascha Sohl-Dickstein, International Conference on Learning Representations, 2018.

In their seminal paper, Elsayed et al. demonstrated their proof-of-concept for adversarial reprogramming by successfully repurposing six pre-trained ImageNet classifiers to perform three alternate tasks via crafted inputs containing adversarial programs. Their threat model considered an attacker with white-box access to the target models, whose objective was to subvert the models by repurposing them to perform tasks they were not originally intended to do. For the purposes of their hypothesis testing, adversarial tasks included counting squares and classifying MNIST digits and CIFAR-10 images.
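To make the mechanism concrete, here is a hedged PyTorch sketch of the core idea; the victim model, dimensions, data and training loop are toy stand-ins rather than the authors’ setup. The attacker freezes the classifier and optimises a single additive “program” that frames each embedded task input.

```python
# Illustrative sketch of adversarial reprogramming (not Elsayed et al.'s code):
# a frozen classifier is repurposed by learning one fixed "program" W that
# frames a small task input embedded in the centre of the image.
import torch
import torch.nn as nn

torch.manual_seed(0)

victim = nn.Sequential(                  # toy stand-in for a frozen classifier
    nn.Flatten(), nn.Linear(3 * 64 * 64, 10))
for p in victim.parameters():
    p.requires_grad_(False)              # the attacker cannot change the model

W = torch.zeros(3, 64, 64, requires_grad=True)   # the adversarial program
mask = torch.ones(3, 64, 64)
mask[:, 18:46, 18:46] = 0                # centre region holds the task input

def reprogram(x_small):
    """Embed a 28x28 task input (e.g. an MNIST digit) and add the program."""
    x = torch.zeros(x_small.size(0), 3, 64, 64)
    x[:, :, 18:46, 18:46] = x_small
    return x + torch.tanh(W) * mask      # tanh keeps the perturbation bounded

opt = torch.optim.Adam([W], lr=0.05)
loss_fn = nn.CrossEntropyLoss()

# Dummy batch standing in for task data; the host model's 10 output classes
# are simply reused as the adversarial task's 10 classes.
x_small = torch.rand(32, 3, 28, 28)
labels = torch.randint(0, 10, (32,))

for step in range(100):                  # train only the program W
    opt.zero_grad()
    loss = loss_fn(victim(reprogram(x_small)), labels)
    loss.backward()
    opt.step()
```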

Friendly neighbourhood cybercrime: online harm in the pandemic and the futures of cybercrime policing

As cybercrime researchers we’re often focused on the globalised aspects of online harms – how the Internet connects people and services around the world, opening up opportunities for crime, risk, and harm on a global scale. However, as we argue in open access research published this week in the Journal of Criminal Psychology – a collaboration between the Cambridge Cybercrime Centre (CCC), Edinburgh Napier University, the University of Edinburgh, and Abertay University – although we have seen an enormous rise in reported cybercrime in the pandemic, this rise has paradoxically been dominated by issues with a much more local character. Our paper sketches a past: of cybercrime in a turbulent 2020, and a future: of the roles which state law enforcement might play in tackling online harm in a post-pandemic world.


An exploration of the cybercrime ecosystem around Shodan

By Maria Bada & Ildiko Pete

Internet of Things (IoT) solutions, which have permeated our everyday life, present a wide attack surface. They are present in our homes in the form of smart home solutions, and in industrial use cases where they provide automation. The potentially profound effects of IoT attacks have attracted much research attention. We decided to analyse the IoT landscape from a novel perspective, that of the hacking community. 

Our recent paper published at the 7th IEEE International Conference on Internet of Things: Systems, Management and Security (IOTSMS 2020) presents an analysis of underground forum discussions around Shodan, one of the most popular search engines for Internet-facing devices and services. In particular, we explored the role Shodan plays in the cybercriminal ecosystem of IoT hacking and exploitation, the main motivations for using Shodan, and the popular targets of exploits in scenarios where Shodan is used.

To answer these questions, we followed a qualitative approach and performed a thematic analysis of threads and posts extracted from 19 underground forums, spanning discussions from 2009 to 2020. The data were extracted from the CrimeBB dataset, collected and made available to researchers through a legal agreement by the Cambridge Cybercrime Centre (CCC). Specifically, the majority of posts we analysed stem from Hackforums (HF), one of the largest general-purpose hacking forums covering a wide range of topics, including IoT. HF is also notable for being the platform where the source code of the Mirai malware was released in 2016 (Chen and Luo, 2017).

The analysis revealed that Shodan provides easier access to targets and simplifies IoT hacking. This is demonstrated, for example, by discussions centred on selling and buying Shodan exports – search results that can be readily used to target vulnerable devices and services. Forum members also expressed this view directly:

‘… Shodan and other tools, such as exploit-db make hacking almost like a recipe that you can follow.’

From the perspective of hackers, a significant factor determining the utility of Shodan is whether the targets it returns can indeed be utilised – for example, whether all scanned hosts in the results are active, and whether they can be used for exploitation. The value of Shodan as a hacking tool is thus determined by the attacker’s intended use case.

The discussions were rife with tutorials on various aspects of hacking, which provided a glimpse into the methodology of hacking in general, hacking IoT devices, and the role Shodan plays in IoT attacks. The discussions show that Shodan and similar tools, such as Censys and Zoomeye, play a key role in passive information gathering and reconnaissance. The majority of users agree that Shodan is a useful tool that provides value, and recommend its use. They mention Shodan both in the context of searching for targets and exploiting devices or services with known vulnerabilities. As to the targets of information gathering and exploitation, we found multiple devices and services, including web cameras, industrial control systems and open databases, to mention a few.

Shodan is a versatile tool and plays a prominent role in various use cases. Since IoT devices can potentially expose personally identifiable information, such as health records, user names and passwords, members of underground forums actively discuss utilising Shodan for gathering such data. In particular, this can be achieved by exploiting open databases.

Members of forums discuss accessing remote devices for various reasons. In some cases it is for fun, while more maliciously inclined actors can use such exploits to collect images and videos and use them, for example, in extortion. Previous research has shown that camera systems represent easy targets for hackers. Accordingly, our findings highlight that these systems are one of the most popular targets, widely discussed in the context of watching the video stream or listening to the audio stream of a compromised camera, or exposing someone through their camera recordings. Users frequently discuss IP camera trolling, and we found posts sharing leaked video footage and websites that list hacked cameras.

Shodan, and in particular the Shodan API, can be used to automate scanning for devices that could be used to build a botnet:

‘…you don’t need fancy exploits to get bots just look for bad configurations on shodan.’

And finally, a major use case members discuss is utilising Shodan in Distributed Reflection Denial of Service (DRDoS) attacks – specifically in the first step, where Shodan can be used to gather a list of reflectors, for example NTP servers.
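To illustrate the degree of automation forum members describe, here is a minimal sketch using the official shodan Python library (pip install shodan); the API key and query string are placeholders, not material taken from the forums.

```python
# Sketch of the kind of automation discussed on the forums, using the
# official `shodan` Python library. Key and query are placeholders.
import shodan

api = shodan.Shodan("YOUR_API_KEY")   # most searches consume query credits

try:
    # e.g. enumerate hosts answering on the NTP port, usable as reflectors
    results = api.search("port:123")
    for match in results["matches"][:10]:
        print(match["ip_str"], match.get("org", "n/a"))
except shodan.APIError as e:
    print("Shodan error:", e)
```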

Discussions around selling or buying Shodan accounts show that forum members trade these accounts and associated assets because of Shodan’s credit model, which limits its use. To utilise the output of Shodan queries effectively, premium accounts are required, as they provide the necessary scan, query and export credits.

Although Shodan and similar search engines attract malicious actors, they are widely used by security professionals and for penetration testing to unveil IoT security issues. Raising awareness of vulnerabilities provides invaluable help in alleviating these issues. Shodan provides a variety of services, including Malware Hunter, a specialised Shodan crawler aimed at discovering malware command-and-control (C&C) servers. The service is of great value to security professionals in the fight against malware, reducing its impact and its ability to compromise targeted victims. This study contributes to IoT security research by highlighting the need for action towards securing the IoT ecosystem, based on members’ discussions on underground forums. The findings suggest that more focus needs to be placed on security considerations when developing IoT devices, as a measure to prevent their malicious use.

Reference

F. Chen and Y. Luo (Eds.), Industrial IoT Technologies and Applications: Second EAI International Conference, Industrial IoT 2017, Wuhu, China, March 25–26, 2017, Proceedings, ser. Lecture Notes of the Institute for Computer Sciences, Social Informatics and Telecommunications Engineering. Springer International Publishing, 2017.

WEIS 2020 – Liveblog

I’ll be trying to liveblog the nineteenth Workshop on the Economics of Information Security (WEIS), which is being held online today and tomorrow (December 14/15) and streamed live on the CEPS channel on YouTube. The event was introduced by the general chair, Lorenzo Pupillo of CEPS, and the program chair, Nicolas Christin of CMU. My summaries of the sessions will appear as follow-ups to this post, and videos will be linked here in a few days.

Pushing the limits: acoustic side channels

How far can we go with acoustic snooping on data?

Seven years ago we showed that you could use a phone camera to measure the phone’s motion while typing and use that to recover PINs. Four years ago we showed that you could use interrupt timing to recover text entered using gesture typing. Last year we showed how a gaming app can steal your banking PIN by listening to the vibration of the screen as your finger taps it. In that attack we used the on-phone microphones, as they are conveniently located next to the screen and can hear the reverberations of the screen glass.

This year we wondered whether voice assistants can hear the same taps on a nearby phone as the on-phone microphones could. We knew that voice assistants could do acoustic snooping on nearby physical keyboards, but everyone had assumed that virtual keyboards were so quiet as to be invulnerable.

Almos Zarandy, Ilia Shumailov and I discovered that attacks are indeed possible. In Hey Alexa what did I just type? we show that when sitting up to half a meter away, a voice assistant can still hear the taps you make on your phone, even in the presence of noise. Modern voice assistants have two to seven microphones, so they can do directional localisation, just as human ears do, but with greater sensitivity. We assess the risk and show that a lot more work is needed to understand the privacy implications of the always-on microphones that are increasingly infesting our work spaces and our homes.
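As a toy illustration of the directional idea (not our attack pipeline): with two microphones a known distance apart, cross-correlating the channels recovers the time difference of arrival of a tap, which in turn constrains its bearing.

```python
# Minimal sketch of time-difference-of-arrival (TDOA) localisation with two
# microphones. Synthetic signals stand in for real recordings.
import numpy as np

fs = 44_100                              # sample rate (Hz)
c = 343.0                                # speed of sound (m/s)
d = 0.1                                  # microphone spacing (m)

rng = np.random.default_rng(1)
tap = rng.normal(size=256)               # a short broadband "tap"
delay = 5                                # true inter-mic delay in samples

mic1 = np.concatenate([tap, np.zeros(200)])
mic2 = np.concatenate([np.zeros(delay), tap, np.zeros(200 - delay)])

# Cross-correlate the channels and take the peak to estimate the delay.
xcorr = np.correlate(mic2, mic1, mode="full")
est_delay = xcorr.argmax() - (len(mic1) - 1)

tdoa = est_delay / fs
angle = np.degrees(np.arcsin(np.clip(tdoa * c / d, -1, 1)))
print(f"estimated delay: {est_delay} samples, bearing ≈ {angle:.1f} degrees")
```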

Three Paper Thursday: Vēnī, Vīdī, Vote-y – Election Security

With the recent quadrennial instantiation of the US presidential election, discussions of election security have predictably resurged across much of the world. Indeed, news cycles in the US, UK, and EU abound with talking points surrounding the security of elections. In light of this context, we will use this week’s Three Paper Thursday to shed light on the technical challenges, solutions, and opportunities in designing secure election systems.

This post will focus on the technical security of election systems. That said, the topic of voter manipulation techniques such as disinformation campaigns, although out of scope here, is also an open area of research.

At first glance, voting may not seem like a challenging problem. If we are to consider a simple majority vote, surely a group of young schoolchildren could reach a consensus in minutes via hand-raising. Striving for more efficient vote tallying, though, perhaps we may opt to follow the IETF in consensus through humming. As we seek a solution that can scale to large numbers of voters, practical limitations will force us to select a multi-location, asynchronous process. Whether we choose in-person polling stations or mail-in voting, challenges quickly develop: how do we know a particular vote was counted, its contents kept secret, and the final tally correct?

National Academies of Sciences, Engineering, and Medicine (U.S.), Ed., Securing the vote: protecting American democracy, The National Academies Press (2018)

The first paper is particularly prominent due to its unified, no-nonsense, and thorough analysis. The report is specific to the United States, but its key themes apply generally. Written in response to accusations of international interference in the US 2016 presidential election, the National Academies provide 41 recommendations to strengthen the US election system.

These recommendations are extremely straightforward, and as such serve as a reminder that adversaries most often penetrate large systems by targeting the “weakest link.” Among other things, the authors recommend creating standardized ballot data formats, regularly validating voter registration lists, evaluating the accessibility of ballot formats, ensuring access to absentee ballots, conducting appropriate audits, and providing adequate funding for elections.

It’s important to get the basics right. While there are many complex, stimulating proposals that utilize cutting-edge algorithms, cryptography, and distributed systems techniques to strengthen elections, many of these proposals are moot if the basic logistics are mishandled.

Some of these low-tech recommendations are, to the surprise of many passionate technologists, quite common among election security specialists. For example, requiring a paper ballot trail and avoiding internet voting with current technology are recommendations also cited in our next paper.

Matthew Bernhard et al., Public Evidence from Secret Ballots, arXiv:1707.08619 (2017)

Governance aside, the second paper offers a comprehensive survey of the key technical challenges in election security and common tools used to solve them. The paper motivates the difficulty of election systems by attesting that all actors involved in an election are mutually distrustful, meaningful election results require evidence, and voters require ballot secrecy.

Ballot secrecy is more than a nicety; it is key to a properly functioning election system. Implemented correctly, ballot secrecy prevents voter coercion. If a voter’s ballot is not secret, or indeed if there is any way a voter can prove after the fact that they cast a certain vote, malicious actors may pressure the voter to provide proof that they voted as directed. This can be insidiously difficult to prevent if not considered thoroughly.

Bernhard et al. discuss risk-limiting audits (RLAs) as an efficient yet powerful way to limit uncertainty in election results. By sampling and recounting a subset of votes, RLAs enable the use of statistical methods to increase confidence in a correct ballot count. Employed properly, RLAs can enable the high-probability validation of election tallies with effort inversely proportional to the expected margin. RLAs are now being used in real-world elections, and many RLA techniques exist in practice. 
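As a flavour of how such an audit proceeds, here is a toy ballot-polling sketch in the spirit of the BRAVO procedure; it is illustrative only, ignoring invalid ballots, multiple contests and the procedural safeguards a real audit requires.

```python
# Toy ballot-polling audit in the spirit of BRAVO (a sketch, not a certified
# procedure): sample ballots at random and accumulate a sequential likelihood
# ratio until the risk limit is met or a full hand count is triggered.
import random

random.seed(0)

reported_share = 0.55   # reported vote share for the winner
risk_limit = 0.05       # at most 5% risk of confirming a wrong outcome

# Simulated "true" ballots; in a real audit these are drawn by hand.
ballots = [1] * 5500 + [0] * 4500
random.shuffle(ballots)

T, drawn = 1.0, 0
for b in ballots:
    drawn += 1
    # Wald sequential ratio: reported share vs. the tied-election hypothesis.
    T *= (reported_share / 0.5) if b == 1 else ((1 - reported_share) / 0.5)
    if T >= 1 / risk_limit:
        print(f"outcome confirmed after sampling {drawn} ballots")
        break
else:
    print("audit exhausted the ballots: full hand count required")
```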

Refreshingly, this paper establishes that blockchain-based voting is a bad idea. Blockchains inherently lack a central authority, so enforcing election rules would be a challenge. Furthermore, a computationally powerful adversary could control which votes get counted.

The paper also discusses high-level cryptographic tools that can be useful in elections. This leads us to our third and final paper.

Josh Benaloh, ElectionGuard Specification v0.95, Microsoft GitHub (2020)

Our final paper is slightly different from the others in this series; it’s a snapshot of a formal specification that is actively being developed, largely based on the author’s 1996 Yale doctoral thesis.

The specification describes ElectionGuard, a system being built by Microsoft to enable verifiable election results (disclaimer: the author of this post holds a Microsoft affiliation). It uses a combination of exponential ElGamal additively-homomorphic encryption, zero-knowledge proofs, and Shamir’s secret sharing to conduct publicly-verifiable, secret-ballot elections.

When a voter casts a ballot, they are given a tracking code which can be used to verify that the ballot’s votes were counted, via cryptographic proofs published with the final tally. Voters can achieve high confidence that their ballot represents a proper encryption of their desired votes by optionally spoiling an unlimited number of ballots, each spoiled ballot being decrypted at the time of voting. Ballots are tallied homomorphically in encrypted form by the election authorities, and the number of authorities that participate in tallying must meet the threshold set for the election, to protect against malicious authorities.
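To see why the homomorphic tally works, here is a toy sketch of exponential ElGamal with deliberately insecure parameters; the real specification uses large standardised primes, zero-knowledge proofs and threshold decryption, all omitted here.

```python
# Toy exponential ElGamal, showing the additive homomorphism ElectionGuard
# relies on. Parameters are insecure and purely illustrative.
import random

random.seed(0)

p, g = 2 ** 127 - 1, 3          # small Mersenne prime; toy generator
x = random.randrange(2, p - 1)  # secret key (threshold-shared in the spec)
h = pow(g, x, p)                # public key

def encrypt(vote):
    """Encrypt a 0/1 vote as (g^r, g^vote * h^r)."""
    r = random.randrange(2, p - 1)
    return pow(g, r, p), (pow(g, vote, p) * pow(h, r, p)) % p

def add(c1, c2):
    """Component-wise product of ciphertexts adds the plaintext votes."""
    return (c1[0] * c2[0]) % p, (c1[1] * c2[1]) % p

votes = [1, 0, 1, 1, 0]
tally = encrypt(0)
for v in votes:
    tally = add(tally, encrypt(v))

# Decrypt to g^sum, then recover the small sum by brute force.
a, b = tally
g_sum = (b * pow(a, p - 1 - x, p)) % p   # b / a^x mod p
total = next(t for t in range(len(votes) + 1) if pow(g, t, p) == g_sum)
print("homomorphic tally:", total)       # -> 3
```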

The specification does not require that the system be used exclusively for internet-based or polling-station-based elections; rather, it is a framework for users to consume as they wish. Indeed, one of the draws of ElectionGuard is that it does not mandate a specific UI, ballot-marking device, or even API. This flexibility allows election authorities to leverage the system in the manner that best fits their jurisdiction. The open source implementation can be found on GitHub.

There are many pieces of voting software available, but ElectionGuard is the new kid on the block that addresses many of the concerns raised in our earlier papers.

Key Themes

Designing secure election systems is difficult.

Often, election systems fall short on the basics; improper voting lists, postage issues, and poorly formatted ballots can disrupt elections as much as some adversaries. Ensuring that the foundational components of an election are handled well currently involves seemingly mundane but important things such as paper ballot trails, chains of custody, and voter ID verification.

High-tech election proposals are not new; indeed, key insights into the use of cryptographic techniques in elections were being discussed in the academic literature well over two decades ago. That said, recent years have seen an apparent increase in investment in implementing cryptographic election systems, and although many problems remain to be solved, the future in this area looks promising.

Three Paper Thursday: Attacking Machine Vision Models In Real Life

This is a guest post by Alex Shepherd.

There is a growing body of research literature concerning the potential threat of physical-world adversarial attacks against machine-vision models. By applying adversarial perturbations to physical objects, machine-vision models may be vulnerable to images containing these perturbed objects, resulting in an increased risk of misclassification. The potential impacts could be significant and have been identified as risk areas for autonomous vehicles and military UAVs.

For this Three Paper Thursday, we examine the following papers exploring the potential threat of physical-world adversarial attacks, with a focus on the impact for autonomous vehicles.

Alexey Kurakin, Ian Goodfellow, and Samy Bengio. Adversarial examples in the physical world, arXiv:1607.02533 (2016)

In this seminal paper, Kurakin et al. report their findings of an experiment conducted using adversarial images taken from a phone camera as input for a pre-trained ImageNet Inceptionv3 image classification model. Methodology was based on a white-box threat model, with adversarial images crafted from the ImageNet validation dataset using the Inceptionv3 model.
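For context, here is a minimal sketch of the fast gradient sign method (FGSM) family on which these attacks build; the model is a toy stand-in rather than Inception v3, and the numbers are illustrative.

```python
# Sketch of the fast gradient sign method (FGSM) underlying these attacks
# (illustrative; the paper's experiments used Inception v3 and printed photos).
import torch
import torch.nn as nn

torch.manual_seed(0)

model = nn.Sequential(nn.Flatten(), nn.Linear(3 * 32 * 32, 10))  # toy stand-in
loss_fn = nn.CrossEntropyLoss()

x = torch.rand(1, 3, 32, 32, requires_grad=True)   # "clean" image
y = torch.tensor([3])                               # its true label
eps = 8 / 255                                       # perturbation budget

loss = loss_fn(model(x), y)
loss.backward()

# One signed gradient step increases the loss; clamp back to valid pixels.
x_adv = (x + eps * x.grad.sign()).clamp(0, 1).detach()
print("perturbation max:", (x_adv - x).abs().max().item())
```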

Three paper Thursday: Online Extremism and Radicalisation

With the recent United States presidential election, I have chosen to focus this Three Paper Thursday on extremism and radicalisation. The topic has received increasing media attention during the past six years in the United States, through both a general rise in the public prominence of far-right, racist rhetoric in political culture (often attributed to the Trump presidency) and a series of high-profile violent events associated with far-right extremism. These events range from the riots in Charlottesville, Virginia – which turned violent when rally attendees clashed with counter-protesters and a vehicle drove into a crowd marching through downtown, killing one protester (Heim, Silverman, Shapiro, & Brown, 2017) – to the recent arrest of individuals plotting to kidnap the Governor of Michigan. This far-right violence brought to light the continued existence of right-wing extremism in the United States, which has historical roots in well-known organisations such as the Ku Klux Klan (KKK), a secretive, racist, terrorist organisation founded in 1865 during Reconstruction as part of a backlash against the acquisition of civil rights by African-American people in the South (Bowman-Grieve, 2009; Martin, 2006).

In contemporary online societies, the landscape and dynamics of right-wing extremist communities have changed. These communities have learned how to exploit the capacities of online social networks for recruitment, information sharing, and community building. The sophistication and reach of online platforms has evolved rapidly from the bulletin board system (BBS) to online forums and now social media platforms, which incorporate powerful technologies for marketing, targeting, and disseminating information. However, the use of these platforms for right-wing radicalisation (the process through which an individual develops and/or accepts extreme ideologies and beliefs) remains under-examined in academic scholarship. This Three Paper Thursday pulls together some key current literature on radicalisation in online contexts.

Maura Conway, Determining the role of the internet in violent extremism and terrorism: Six suggestions for progressing research. Studies in Conflict & Terrorism, 40(1), 77-98. https://www.tandfonline.com/doi/full/10.1080/1057610X.2016.1157408.

The first paper comments on future directions for research in understanding and determining the role of the Internet in violent extremism and terrorism. After guiding readers through an overview of current research, the author argues that there is a lack of both descriptive and explanatory work on the topic, as the field remains divided. Some view the Internet as a mere speech platform and argue that participation in online radicalised communities is often the most extreme behaviour in which most individuals engage. Others acknowledge the affordances of the Internet but are uncertain of its role in replacing or strengthening other radicalisation processes. The author concludes that two major research questions remain to be answered: whether radicalisation can occur in a purely online context, and if so, whether it contributes to violence – in which case the mechanisms merit further exploration. The author makes six suggestions for future researchers: a) widening current research to include movements beyond jihadism, b) conducting comparison research (e.g., between platforms and/or organisations), c) studying individual users in extremist communities and groups, d) using large-scale datasets, e) adopting an interdisciplinary approach, and f) examining the role of gender.

Yi Ting Chua, Understanding radicalization process in online far-right extremist forums using social influence model. PhD thesis, Michigan State University, 2019. Available from https://d.lib.msu.edu/etd/48077.

My doctoral dissertation examines the impact of participation in online far-right extremist groups on radicalisation. In this research, I applied social network analysis and integrated theories from criminology (social learning theory) and political science (the idea of the echo chamber) to understand the process of attitudinal change within social networks. It draws on a longitudinal database of threads saved from eight online far-right extremist forums. With the social influence model, a regression model with a network factor, I was able to include the number of interactions and the attitudinal beliefs of user pairs when examining attitudinal changes across time. This model allows us to determine if, and how, active interactions result in the expression of more radical ideological beliefs. Findings suggested that online radicalisation occurred to varying degrees in six of the seven forums, with a generally lowered level of expressed extremism towards the end of the observed time period. The study found strong support for the proposition that active interactions with forum members and connectedness are predictors of radicalisation, while suggesting that other mechanisms, such as self-radicalisation and users’ prior beliefs, were also important. This research highlighted the need for theory integration, detailed measures of online peer association, and cross-platform comparisons (e.g., Telegram and Gab) to address the complex phenomenon of online radicalisation.
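For readers unfamiliar with this model family, here is a toy sketch of a network autocorrelation regression of the kind described; the data, weights and variable names are synthetic illustrations, not the dissertation’s.

```python
# Toy sketch of a social influence (network autocorrelation) model: a user's
# expressed extremism at time t+1 is regressed on their own prior score and
# the interaction-weighted scores of their peers. All data are synthetic.
import numpy as np

rng = np.random.default_rng(0)
n = 100

W = rng.random((n, n)) * (rng.random((n, n)) < 0.05)     # who replies to whom
W = W / np.maximum(W.sum(axis=1, keepdims=True), 1e-9)   # row-normalise

attitude_t = rng.normal(size=n)                          # extremism score at t
rho, beta = 0.4, 0.8                                     # true influence weights
attitude_t1 = (rho * W @ attitude_t + beta * attitude_t
               + 0.1 * rng.normal(size=n))

# Estimate the influence parameters by least squares on [W y, y].
X = np.column_stack([W @ attitude_t, attitude_t])
rho_hat, beta_hat = np.linalg.lstsq(X, attitude_t1, rcond=None)[0]
print(f"peer influence rho ~ {rho_hat:.2f}, inertia beta ~ {beta_hat:.2f}")
```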

Magdalena Wojcieszak, ‘Don’t talk to me’: effects of ideologically homogeneous online groups and politically dissimilar offline ties on extremism. New Media & Society, 12(4) (2010) pp 637-655. https://journals.sagepub.com/doi/abs/10.1177/1461444809342775.

In this article, the author is interested in answering two questions: 1) does participation in ideologically homogeneous online groups increase extreme beliefs, and 2) how do offline strong and weak ties with dissimilar beliefs affect extreme beliefs? The author uses online survey data and posts from neo-Nazi online forums. The outcome is measured by respondents’ responses to 10 ideology-specific statements. Other variables in the analysis included the level of participation in online groups, perceived dissimilarity of offline ties, news media exposure and demographics. Findings from a multivariate regression model indicate that participation in online groups was a strong predictor of support for racial violence after controlling for demographic factors and news media exposure. Forum members’ attitudes are subjected to normative influences via punitive or rewarding replies. For individuals with politically dissimilar offline ties, the author finds a weakened participation effect.

Together, these papers highlight the complexity of assessing the role played by the Internet in the radicalisation process. The first paper encourages researchers to tackle the question of whether online violent radicalisation occurs, via six suggested approaches. The other two papers find support for online radicalisation while calling attention to the effect of other variables, such as the influence of offline relationships and users’ baseline beliefs prior to online participation. All of these papers cross academic disciplines, highlighting the importance of an interdisciplinary perspective.

References

Bowman-Grieve, L. (2009). Exploring “Stormfront”: A virtual community of the radical right. Studies in Conflict & Terrorism, 32(11), 989-1007.

Heim, J., Silverman, E., Shapiro, T. R., Brown, E. (2017, August 13). One dead as car strikes crowds amid protests of white nationalist gathering in Charlottesville; two police die in helicopter crash. The Washington Post. Retrieved from https://www.washingtonpost.com/local/fights-in-advance-of-saturday-protest-in-charlottesville/2017/08/12/155fb636-7f13-11e7-83c7-5bd5460f0d7e_story.html?utm_term=.33b6686c7838.

Martin, G. (2006). Understanding Terrorism: Challenges, Perspectives, and Issues. Thousand Oaks, California: Sage Publications.