Category Archives: News coverage

Media reports that may interest you

Bugs in our pockets?

In August, Apple announced a system to check all our iPhones for illegal images, then delayed its launch after widespread pushback. Yet some governments continue to press for just such a surveillance system, and the EU is due to announce a new child protection law at the start of December.

Now, in Bugs in our Pockets: The Risks of Client-Side Scanning, colleagues and I take a long hard look at the options for mass surveillance via software embedded in people’s devices, as opposed to the current practice of monitoring our communications. Client-side scanning, as the agencies’ new wet dream is called, has a range of possible missions. While Apple and the FBI talked about finding still images of sex abuse, the EU was talking last year about videos and text too, and of targeting terrorism once the argument had been won on child protection. It can also use a number of possible technologies; in addition to the perceptual hash functions in the Apple proposal, there’s talk of machine-learning models. And, as a leaked EU internal report made clear, the preferred outcome for governments may be a mix of client-side and server-side scanning.

In our report, we provide a detailed analysis of scanning capabilities at both the client and the server, the trade-offs between false positives and false negatives, and the side effects – such as the ways in which adding scanning systems to citizens’ devices will open them up to new types of attack.

We did not set out to praise Apple’s proposal, but we ended up concluding that it was probably about the best that could be done. Even so, it did not come close to providing a system that a rational person might consider trustworthy.

Even if the engineering on the phone were perfect, a scanner brings within the user’s trust perimeter all those involved in targeting it – in deciding which photos go on the naughty list, or how to train any machine-learning models that riffle through your texts or watch your videos. Even if it starts out trained on images of child abuse that all agree are illegal, it’s easy for both insiders and outsiders to manipulate images to create both false negatives and false positives. The more we look at the detail, the less attractive such a system becomes. The measures required to limit the obvious abuses so constrain the design space that you end up with something that could not be very effective as a policing tool; and if the European institutions were to mandate its use – and there have already been some legislative skirmishes – they would open up their citizens to quite a range of avoidable harms. And that’s before you stop to remember that the European Court of Justice struck down the Data Retention Directive on the grounds that such bulk surveillance, without warrant or suspicion, was a grossly disproportionate infringement on privacy, even in the fight against terrorism. A client-side scanning mandate would invite the same fate.

But ‘if you build it, they will come’. If device vendors are compelled to install remote surveillance, the demands will start to roll in. Who could possibly be so cold-hearted as to argue against the system being extended to search for missing children? Then President Xi will want to know who has photos of the Dalai Lama, or of men standing in front of tanks; and copyright lawyers will get court orders blocking whatever they claim infringes their clients’ rights. Our phones, which have grown into extensions of our intimate private space, will be ours no more; they will be private no more; and we will all be less secure.

Is Apple’s NeuralMatch searching for abuse, or for people?

Apple stunned the tech industry on Thursday by announcing that the next version of iOS and macOS will contain a neural network to scan photos for sex abuse. Each photo will get an encrypted ‘safety voucher’ saying whether or not it’s suspect, and if more than about ten suspect photos are backed up to iCloud, then a clever cryptographic scheme will unlock the keys used to encrypt them. Apple staff or contractors can then look at the suspect photos and report them.
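
For readers curious how “more than about ten” can act as a cryptographic switch, here is a minimal sketch of the threshold idea using plain Shamir secret sharing: each suspect photo’s voucher carries one share of a decryption key, and the server learns nothing until it holds at least ten shares, at which point it can recombine them. The field size, parameters and names are illustrative only; Apple’s actual voucher construction is considerably more elaborate.

```python
# A minimal sketch of threshold unlocking with Shamir secret sharing.
# This illustrates the general idea, not Apple's protocol.
import secrets

P = 2**127 - 1   # a Mersenne prime; the finite field we work in

def split_secret(secret: int, t: int, n: int):
    """Split `secret` into n shares; any t of them reconstruct it."""
    coeffs = [secret] + [secrets.randbelow(P) for _ in range(t - 1)]
    shares = []
    for x in range(1, n + 1):
        y = 0
        for c in reversed(coeffs):       # Horner evaluation of the polynomial
            y = (y * x + c) % P
        shares.append((x, y))
    return shares

def reconstruct(shares):
    """Lagrange interpolation at x = 0 recovers the secret."""
    secret = 0
    for xi, yi in shares:
        num, den = 1, 1
        for xj, _ in shares:
            if xj != xi:
                num = (num * -xj) % P
                den = (den * (xi - xj)) % P
        secret = (secret + yi * num * pow(den, -1, P)) % P
    return secret

key = secrets.randbelow(P)                  # the per-account decryption key
vouchers = split_secret(key, t=10, n=30)    # one share per flagged photo
assert reconstruct(vouchers[:10]) == key    # ten matches: the key falls out
# nine matches reveal essentially nothing: interpolation gives a random value
assert reconstruct(vouchers[:9]) != key
```

The point of the sketch is simply that the threshold is enforced by mathematics rather than policy: below it, the server can decrypt nothing; at or above it, everything flagged becomes readable.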

We’re told that the neural network was trained on 200,000 images of child sex abuse provided by the US National Center for Missing and Exploited Children. Neural networks are good at spotting images “similar” to those in their training set, and people unfamiliar with machine learning may assume that Apple’s network will recognise criminal acts. The police might even be happy if it recognises a sofa on which a number of acts took place. (You might be less happy, if you own a similar sofa.) Then again, it might learn to recognise naked children, and flag up a snap of your three-year-old child on the beach. So what the new software in your iPhone actually recognises is really important.

Now the neural network described in Apple’s documentation appears very similar to the networks used in face recognition (hat tip to Nicko van Someren for spotting this). So it seems a fair bet that the new software will recognise people whose faces appear in the abuse dataset on which it was trained.

So what will happen when someone’s iPhone flags ten pictures as suspect, and the Apple contractor who looks at them sees an adult with their clothes on? There’s a real chance that they’re either a criminal or a witness, so they’ll have to be reported to the police. In the case of a survivor who was victimised ten or twenty years ago, and whose pictures still circulate in the underground, this could mean traumatic secondary victimisation. It might even be their twin sibling, or a genuine false positive in the form of someone who just looks very much like them. What processes will Apple use to manage this? Not all US police forces are known for their sensitivity, particularly towards minority suspects.

But that’s just the beginning. Apple’s algorithm, NeuralMatch, stores a fingerprint of each image in its training set as a short string called a NeuralHash, so new pictures can easily be added to the list. Once the tech is built into your iPhone, your MacBook and your Apple Watch, and can scan billions of photos a day, there will be pressure to use it for other purposes. The other part of NCMEC’s mission is missing children. Can Apple resist demands to help find runaways? Could Tim Cook possibly be so cold-hearted as to refuse to add Madeleine McCann to the watch list?
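
For readers who have not met perceptual hashing, the toy “average hash” below illustrates the general idea behind a fingerprint such as NeuralHash: visually similar images map to nearby bit strings, so a match is a small Hamming distance rather than an exact collision, and adding a new picture to the watch list is just one more short hash. This is not Apple’s algorithm, and the file names are made up.

```python
# A toy perceptual hash (the classic "average hash"), to show how image
# fingerprints allow fuzzy matching. Illustrative only; not NeuralHash.
from PIL import Image

def average_hash(path: str, size: int = 8) -> int:
    img = Image.open(path).convert("L").resize((size, size))
    pixels = list(img.getdata())
    mean = sum(pixels) / len(pixels)
    bits = 0
    for p in pixels:                 # one bit per pixel: brighter than average?
        bits = (bits << 1) | (1 if p > mean else 0)
    return bits

def hamming(a: int, b: int) -> int:
    return bin(a ^ b).count("1")

# Hypothetical usage: flag a photo whose hash lies within a few bits of any
# entry on the watch list.
watch_list = {average_hash("listed_image.jpg")}
flagged = any(hamming(average_hash("my_photo.jpg"), h) <= 5 for h in watch_list)
```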

After that, your guess is as good as mine. Depending on where you are, you might find your photos scanned for dissidents, religious leaders or the FBI’s most wanted. It also reminds me of the Rasterfahndung in 1970s Germany – the dragnet search of all digital data in the country for clues to the Baader-Meinhof gang. Only now it can be done at scale, and not just for the most serious crimes either.

Finally, there’s adversarial machine learning. Neural networks are fairly easy to fool in that an adversary can tweak images so they’re misclassified. Expect to see pictures of cats (and of Tim Cook) that get flagged as abuse, and gangs finding ways to get real abuse past the system. Apple’s new tech may end up being a distributed person-search machine, rather than a sex-abuse prevention machine.
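
The sort of tweak involved is easy to sketch. The fast gradient sign method below nudges each pixel a small amount in the direction that increases a classifier’s loss, which is usually enough to change its decision while leaving the image visually unchanged to a human. The model and labels are placeholders; attacking a deployed hash-based matcher takes rather more work than this.

```python
# A minimal sketch of an adversarial perturbation (FGSM). `model` is any
# differentiable image classifier; this is illustrative, not a recipe for
# fooling any particular deployed system.
import torch
import torch.nn.functional as F

def fgsm_perturb(model, images, labels, epsilon=0.01):
    """Return images nudged to increase the classifier's loss.

    `images` is a batch of tensors in [0, 1]; `labels` are the true classes.
    """
    images = images.clone().detach().requires_grad_(True)
    loss = F.cross_entropy(model(images), labels)
    loss.backward()
    # step each pixel by +/- epsilon along the sign of the loss gradient
    adversarial = images + epsilon * images.grad.sign()
    return adversarial.clamp(0, 1).detach()
```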

Such a technology requires public scrutiny, and as the possession of child sex abuse images is a strict-liability offence, academics cannot work with them. While the crooks will dig out NeuralMatch from their devices and play with it, we cannot. It is possible in theory for Apple to get NeuralMatch to ignore faces; for example, it could blur all the faces in the training data, as Google does for photos in Street View. But they haven’t claimed they did that, and if they did, how could we check? Apple should therefore publish full details of NeuralMatch plus a set of NeuralHash values trained on a public dataset with which we can legally work. It also needs to explain how the system it deploys was tuned and tested; and how dragnet searches of people’s photo libraries will be restricted to those conducted by court order so that they are proportionate, necessary and in accordance with the law. If that cannot be done, the technology must be abandoned.

A new way to detect ‘deepfake’ picture editing

Common graphics software now offers powerful tools for inpainting – using machine-learning models to reconstruct missing pieces of an image. They are widely used for picture editing and retouching, but like many sophisticated tools they can also be abused. They can remove someone from a picture of a crime scene, or remove a watermark from a stock photo. Could we make such abuses more difficult?

We introduce Markpainting, which uses adversarial machine-learning techniques to fool the inpainter into making its edits evident to the naked eye. An image owner can modify their image in subtle ways which are not themselves very visible, but will sabotage any attempt to inpaint it by adding visible information determined in advance by the markpainter.
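
As a rough sketch of how such a defence can be built with standard adversarial-ML machinery (a simplified illustration under my own assumptions, not the paper’s exact method): search for a small, bounded perturbation such that, when a region of the image is masked out and a differentiable inpainting model fills it back in, the fill is pulled towards a target pattern the image owner chose in advance.

```python
# A simplified markpainting-style sketch: optimise a bounded perturbation so
# that a differentiable inpainter, asked to fill the masked region, produces
# the owner's chosen target pattern instead of a seamless reconstruction.
# `inpainter`, `mask` and `target` are placeholders, not the paper's models.
import torch

def markpaint(image, mask, target, inpainter, eps=0.03, steps=200, lr=0.01):
    delta = torch.zeros_like(image, requires_grad=True)
    opt = torch.optim.Adam([delta], lr=lr)
    for _ in range(steps):
        # the inpainter sees the perturbed image with the masked region removed
        filled = inpainter((image + delta) * (1 - mask), mask)
        # pull the filled-in region towards the owner's target pattern
        loss = ((filled * mask - target * mask) ** 2).mean()
        opt.zero_grad()
        loss.backward()
        opt.step()
        with torch.no_grad():
            delta.clamp_(-eps, eps)   # keep the perturbation imperceptible
    return (image + delta).clamp(0, 1).detach()
```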

One application is tamper-resistant marks. For example, a photo agency that makes stock photos available on its website with copyright watermarks can markpaint them in such a way that anyone using common editing software to remove a watermark will fail; the copyright mark will be markpainted right back. So watermarks can be made a lot more robust.

In the fight against fake news, markpainting news photos would mean that anyone trying to manipulate them would risk visible artefacts. So bad actors would have to check and retouch photos manually, rather than trying to use inpainting tools to automate forgery at scale.

This paper has been accepted at ICML.

Infrastructure – the Good, the Bad and the Ugly

Infrastructure used to be regulated and boring; the phones just worked and water just came out of the tap. Software has changed all that, and the systems our society relies on are ever more complex and contested. We have seen Twitter silencing the US president, Amazon switching off Parler and the police closing down mobile phone networks used by crooks. The EU wants to force chat apps to include porn filters, India wants them to tell the government who messaged whom and when, and the US Department of Justice has launched antitrust cases against Google and Facebook.

Infrastructure – the Good, the Bad and the Ugly analyses the security economics of platforms and services. The existence of platforms such as the Internet and cloud services enabled startups like YouTube and Instagram to soar to huge valuations almost overnight, with only a handful of staff. But criminals also build infrastructure, from botnets through malware-as-a-service. There’s also dual-use infrastructure, from Tor to bitcoins, with entangled legitimate and criminal applications. So crime can scale too. And even “respectable” infrastructure has disruptive uses. Social media enabled both Barack Obama and Donald Trump to outflank the political establishment and win power; they have also been used to foment communal violence in Asia. How are we to make sense of all this?

I argue that this is not simply a matter for antitrust lawyers, but that computer scientists also have some insights to offer, and the interaction between technical and social factors is critical. I suggest a number of principles to guide analysis. First, what actors or technical systems have the power to exclude? Such control points tend to be at least partially social, as social structures like networks of friends and followers have more inertia. Even where control points exist, enforcement often fails because defenders are organised in the wrong institutions, or otherwise fail to have the right incentives; many defenders, from payment systems to abuse teams, focus on process rather than outcomes.

There are implications for policy. The agencies often ask for back doors into systems, but these help intelligence more than interdiction. To really push back on crime and abuse, we will need institutional reform of regulators and other defenders. We may also want to complement our current law-enforcement strategy of decapitation – taking down key pieces of criminal infrastructure such as botnets and underground markets – with pressure on maintainability. It may make a real difference if we can push up offenders’ transaction costs, as online criminal enterprises rely more on agility than on long-lived, critical, redundant platforms.

This was a Dertouzos Distinguished Lecture at MIT in March 2021.

Our new “Freedom of Speech” policy

Our beloved Vice-Chancellor proposes a “free speech” policy under which all academics must treat other academics with “respect”. This is no doubt meant well, but the drafting is surprisingly vague and authoritarian for a university where the VC, the senior pro-VC, the HR pro-VC and the Registrary are all lawyers. The bottom line is that in future we might face disciplinary charges and even dismissal for mockery of ideas and individuals with which we disagree.

The policy was slipped out in March, when nobody was paying attention. There was a Discussion in June, at which my colleague Arif Ahmad spelled out the problems.

Vigorous debate is intrinsic to academia and it should be civil, but it is unreasonable to expect people to treat all opposing views with respect. Oxford’s policy spells this out. At the Discussion, Arif pointed out that “respect” must be changed to “tolerance” if we are to uphold the liberal culture that we have not just embraced but developed over several centuries.

At its first meeting this term, the University Council considered these arguments but decided to press ahead anyway. We are therefore calling a ballot on three amendments to the policy. If you’re a senior member of the University, we invite you to support them by signing the flysheets. The first amendment changes “respect” to “tolerance”; the second makes it harder to force university societies to disinvite speakers whose remarks may be controversial; and the third restricts the circumstances in which the university itself can ban speakers.

Liberalism is coming under attack from authoritarians of both left and right, yet it is the foundation on which modern academic life is built, and our own university has contributed more than any other to its development over the past 811 years. If academics can face discipline for using tactics such as scorn, ridicule and irony to criticise folly, how does that sit with having such alumni as John Maynard Keynes and Charles Darwin, not to mention Bertrand Russell, Douglas Adams and Salman Rushdie?

Is science being set up to take the blame?

Yesterday’s publication of the minutes of the government’s Scientific Advisory Group for Emergencies (SAGE) raises some interesting questions. An initial summary in yesterday’s Guardian has a timeline suggesting that it was the distinguished medics on SAGE rather than the Prime Minister who went from complacency in January and February to panic in March, and who ignored the risk to care homes until it was too late.

Is this a Machiavellian conspiracy by Dominic Cummings to blame the scientists, or is it business as usual? Having spent a dozen years on the university’s governing body and various of its subcommittees, I can absolutely get how this happened. Once a committee gets going, it can become very reluctant to change its opinion on anything. Committees can become sociopathic, worrying about their status, ducking liability, and finding reasons why problems are either somebody else’s or not practically soluble.

So I spent a couple of hours yesterday reading the minutes, and indeed we see the group worried about its power: on February 13th it wants the messaging to emphasise that official advice is both efficacious and sufficient, to “reduce the likelihood of the public adopting unnecessary or contradictory behaviours”. Turf is defended: Public Health England (PHE) ruled on February 18th that it can cope with 5 new cases a week (meaning tracing 800 contacts) and hoped this might be increased to 50; they’d already decided the previous week that it wasn’t possible to accelerate diagnostic capacity. So far, so much as one might expect.

The big question, though, is why nobody thought of protecting people in care homes. The answer seems to be that SAGE dismissed the problem early on as “too hard” or “not our problem”. On March 5th they note that social distancing for over-65s could save a lot of lives and would be most effective for those living independently: but it would be “a challenge to implement this measure in communal settings such as care homes”. They appear more concerned that “Many of the proposed measures will be easier to implement for those on higher incomes” and the focus is on getting PHE to draft guidance. (This is the meeting at which Dominic Cummings makes his first appearance, so he cannot dump all the blame on the scientists.)

Contact Tracing in the Real World

There have recently been several proposals for pseudonymous contact tracing, including from Apple and Google. To both cryptographers and privacy advocates, this might seem the obvious way to protect public health and privacy at the same time. Meanwhile other cryptographers have been pointing out some of the flaws.

There are also real systems being built by governments. Singapore has already deployed and open-sourced one that uses contact tracing based on bluetooth beacons. Most of the academic and tech industry proposals follow this strategy, as the “obvious” way to tell who’s been within a few metres of you and for how long. The UK’s National Health Service is working on one too, and I’m one of a group of people being consulted on the privacy and security.

But contact tracing in the real world is not quite as many of the academic and industry proposals assume.

First, it isn’t anonymous. Covid-19 is a notifiable disease so a doctor who diagnoses you must inform the public health authorities, and if they have the bandwidth they call you and ask who you’ve been in contact with. They then call your contacts in turn. It’s not about consent or anonymity, so much as being persuasive and having a good bedside manner.

I’m relaxed about doing all this under emergency public-health powers, since this will make it harder for intrusive systems to persist after the pandemic than if they have some privacy theatre that can be used to argue that the whizzy new medi-panopticon is legal enough to be kept running.

Second, contact tracers have access to all sorts of other data such as public transport ticketing and credit-card records. This is how a contact tracer in Singapore is able to phone you and tell you that the taxi driver who took you yesterday from Orchard Road to Raffles has reported sick, so please put on a mask right now and go straight home. This must be controlled; Taiwan lets public-health staff access such material in emergencies only.

Third, you can’t wait for diagnoses. In the UK, you only get a test if you’re a VIP or if you get admitted to hospital. Even so, the results take 1–3 days to come back. While the VIPs share their status on Twitter or Facebook, the other diagnosed patients are often too sick to operate their phones.

Fourth, the public health authorities need geographical data for purposes other than contact tracing – such as to tell the army where to build more field hospitals, and to plan shipments of scarce personal protective equipment. There are already apps that do symptom tracking but more would be better. So the UK app will ask for the first three characters of your postcode, which is about enough to locate which hospital you’d end up in.

Fifth, although the cryptographers – and now Google and Apple – are discussing more anonymous variants of the Singapore app, that’s not the problem. Anyone who’s worked on abuse will instantly realise that a voluntary app operated by anonymous actors is wide open to trolling. The performance art people will tie a phone to a dog and let it run around the park; the Russians will use the app to run service-denial attacks and spread panic; and little Johnny will self-report symptoms to get the whole school sent home.

Sixth, there’s the human aspect. On Friday, when I was coming back from walking the dogs, I stopped to chat for ten minutes to a neighbour. She stood halfway between her gate and her front door, so we were about 3 metres apart, and the wind was blowing from the side. The risk that either of us would infect the other was negligible. If we’d been carrying bluetooth apps, we’d have been flagged as mutual contacts. It would be quite intolerable for the government to prohibit such social interactions, or to deploy technology that would punish them via false alarms. And how will things work with an orderly supermarket queue, where law-abiding people stand patiently six feet apart?

Bluetooth also goes through plasterboard. If undergraduates return to Cambridge in October, I assume there will still be small-group teaching, but with protocols for distancing, self-isolation and quarantine. A supervisor might sit in a teaching room with two or three students, all more than 2m apart and maybe wearing masks, and the window open. The bluetooth app will flag up not just the others in the room but people in the next room too.
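
Part of the problem is that these apps do not measure distance at all; they infer it from received signal strength, typically with a log-distance path-loss model like the sketch below. The constants shift with phone model, pocket versus hand, intervening bodies and walls, so “two metres away” and “in the next room” produce overlapping readings. The parameters here are illustrative defaults, not any real app’s calibration.

```python
# A rough sketch of RSSI-based distance estimation, to show why Bluetooth
# proximity is such a blunt instrument. Parameters are illustrative only.

def estimated_distance_m(rssi_dbm, tx_power_dbm=-59, path_loss_exponent=2.0):
    """Invert the log-distance model: rssi = tx_power - 10 * n * log10(d)."""
    return 10 ** ((tx_power_dbm - rssi_dbm) / (10 * path_loss_exponent))

# The same -75 dBm reading maps to very different distances depending on the
# assumed environment:
print(estimated_distance_m(-75, path_loss_exponent=2.0))   # ~6.3 m, free space
print(estimated_distance_m(-75, path_loss_exponent=2.8))   # ~3.7 m, cluttered indoors
```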

How is this to be dealt with? I expect the app developers will have to fit a user interface saying “You’re within range of device 38a5f01e20. Within infection range (y/n)?” But what happens when people get an avalanche of false alarms? They learn to click them away. A better design might be to invite people to add a nickname and a photo so that contacts could see who they are. “You are near to Ross [photo] and have been for five minutes. Are you maintaining physical distance?”

When I discussed this with a family member, the immediate reaction was that she’d refuse to run an anonymous app that might suddenly say “someone you’ve been near in the past four days has reported symptoms, so you must now self-isolate for 14 days.” A call from a public health officer is one thing, but not knowing who it was would just creep her out. It’s important to get the reactions of real people, not just geeks and wonks! And the experience of South Korea and Taiwan suggests that transparency is the key to public acceptance.

Seventh, on the systems front, decentralised systems are all very nice in theory but are a complete pain in practice as they’re too hard to update. We’re still using Internet infrastructure from 30 years ago (BGP, DNS, SMTP…) because it’s just too hard to change. Watch Moxie Marlinspike’s talk at 36C3 if you don’t get this. Relying on cryptography tends to make things even more complex, fragile and hard to change. In the pandemic, the public health folks may have to tweak all sorts of parameters weekly or even daily. You can’t do that with apps on 169 different types of phone and with peer-to-peer communications.

Personally I feel conflicted. I recognise the overwhelming force of the public-health arguments for a centralised system, but I also have 25 years’ experience of the NHS being incompetent at developing systems and repeatedly breaking their privacy promises when they do manage to collect some data of value to somebody else. The Google Deepmind scandal was just the latest of many and by no means the worst. This is why I’m really uneasy about collecting lots of lightly-anonymised data in a system that becomes integrated into a whole-of-government response to the pandemic. We might never get rid of it.

But the real killer is likely to be the interaction between privacy and economics. If the app’s voluntary, nobody has an incentive to use it, except tinkerers and people who religiously comply with whatever the government asks. If uptake remains at 10-15%, as in Singapore, it won’t be much use and we’ll need to hire more contact tracers instead. Apps that involve compulsion, such as those for quarantine geofencing, will face a more adversarial threat model; and the same will be true in spades for any electronic immunity certificate. There the incentive to cheat will be extreme, and we might be better off with paper serology test certificates, like the yellow fever vaccination certificates you needed for the tropics, back in the good old days when you could actually go there.

All that said, I suspect the tracing apps are really just do-something-itis. Most countries now seem past the point where contact tracing is a high priority; even Singapore has had to go into lockdown. If it becomes a priority during the second wave, we will need a lot more contact tracers: last week, 999 calls in Cambridge had a 40-minute wait and it took ambulances six hours to arrive. We cannot field an app that will cause more worried well people to phone 999.

The real trade-off between surveillance and public health is this. For years, a pandemic has been at the top of Britain’s risk register, yet far less was spent preparing for one than on anti-terrorist measures, many of which were ostentatious rather than effective. Worse, the rhetoric of terror puffed up the security agencies at the expense of public health, predisposing the US and UK governments to disregard the lesson of SARS in 2003 and MERS in 2015 — unlike the governments of China, Singapore, Taiwan and South Korea, who paid at least some attention. What we need is a radical redistribution of resources from the surveillance-industrial complex to public health.

Our effort should go into expanding testing, making ventilators, retraining everyone with a clinical background from vet nurses to physiotherapists to use them, and building field hospitals. We must call out bullshit when we see it, and must not give policymakers the false hope that techno-magic might let them avoid the hard decisions. Otherwise we can serve best by keeping out of the way. The response should not be driven by cryptographers but by epidemiologists, and we should learn what we can from the countries that have managed best so far, such as South Korea and Taiwan.

Symposium on Post-Bitcoin Cryptocurrencies

I am at the Symposium on Post-Bitcoin Cryptocurrencies in Vienna and will try to liveblog the talks in follow-ups to this post.

The introduction was by Bernhard Haslhofer of AIT, who maintains the graphsense.info toolkit and runs the Titanium project on bitcoin forensics jointly with Rainer Boehme of Innsbruck. Rainer then presented an economic analysis arguing that criminal transactions were pretty well the only logical app for bitcoin as it’s permissionless and trustless; if you have access to the courts then there are better ways of doing things. However in the post-bitcoin world of ICOs and smart contracts, it’s not just the anti-money-laundering agencies who need to understand cryptocurrency but the securities regulators and the tax collectors. Yet there is a real policy tension. Governments hype blockchains; Austria uses them to auction sovereign bonds. Yet the only way in for the citizen is through the swamp. How can the swamp be drained?

Privacy for Tigers

As mobile phone masts went up across the world’s jungles, savannas and mountains, so did poaching. Wildlife crime syndicates can not only coordinate better but can mine growing public data sets, often of geotagged images. Privacy matters for tigers, for snow leopards, for elephants and rhinos – and even for tortoises and sharks. Animal data protection laws, where they exist at all, are oblivious to these new threats, and no-one seems to have started to think seriously about information security.
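
To see how little effort this mining takes: a consumer photo routinely carries GPS coordinates in its EXIF metadata, which anyone scraping public images can read in a couple of lines. The snippet below uses the standard EXIF GPS tags; the file name and example output are of course hypothetical.

```python
# Read the GPS coordinates embedded in a photo's EXIF metadata (if present).
# Illustrative only; the file name is hypothetical.
from PIL import Image
from PIL.ExifTags import GPSTAGS

def gps_from_photo(path: str) -> dict:
    exif = Image.open(path).getexif()
    gps_ifd = exif.get_ifd(0x8825)       # 0x8825 is the EXIF GPSInfo IFD
    return {GPSTAGS.get(tag, tag): value for tag, value in gps_ifd.items()}

print(gps_from_photo("camera_trap_photo.jpg"))
# e.g. {'GPSLatitudeRef': 'N', 'GPSLatitude': (1.0, 23.0, 45.6), ...}
```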

So we have been doing some work on this, and presented some initial ideas via an invited talk at Usenix Security in August. A video of the talk is now online.

The most serious poaching threats involve insiders: game guards who go over to the dark side, corrupt officials, and (now) the compromise of data and tools assembled for scientific and conservation purposes. Aggregation of data makes things worse; I might not care too much about a single geotagged photo, but a corpus of thousands of such photos tells a poacher where to set his traps. Cool new AI tools for recognising individual animals can make his work even easier. So people developing systems to help in the conservation mission need to start paying attention to computer security. Compartmentation is necessary, but there are hundreds of conservancies and game reserves, many of which are mutually mistrustful; there is no central authority at Fort Meade to manage classifications and clearances. Data sharing is haphazard and poorly understood, and the limits of open data are only now starting to be recognised. What sort of policies do we need to support, and what sort of tools do we need to create?

This is joint work with Tanya Berger-Wolf of Wildbook, one of the wildlife data aggregation sites, which is currently redeveloping its core systems to incorporate and test the ideas we describe. We are also working to spread the word to both conservators and online service firms.

BBC Horizon documentary: A Week without lying, the honesty experiment

Together with Ronald Poppe, Paul Taylor, and Gordon Wright, Sophie van der Zee (previously employed at the Cambridge Computer Laboratory) took the plunge and tested their automated lie detection methods in the real world. How well do the lie detection methods that we develop and test under very controlled circumstances in the lab perform in the real world? And what happens to you and your social environment when you constantly feel monitored and attempt to live a truthful life? Is living a truthful life actually something we should desire?