I’m liveblogging the Workshop on Security and Human Behaviour which is being held here in Cambridge. The programme is here. For background, see the liveblogs for SHB 2008-15 which are linked here. Blog posts summarising the talks at the workshop sessions will appear as followups below.
The first SHB talk actually took place the previous day when my co-organiser Angela Sasse gave the Wheeler Lecture, the Computer Lab’s annual distinguished invited talk.
Her lecture was on “Can we make people value IT security?”. She recalled stumbling into information security by accident in the late 1990s when working on early VoIP and videoconferencing tools. Their telco partner had found the cost of resetting passwords had trebled over three years as they had separate passwords for different services; she found that the firm was asking people to perform unfeasible tasks, which led to her 1999 paper Users are not the enemy. Impossible workloads led to workarounds, leading to a downward spiral in security culture as users came to disbelieve and disrespect the security team. We now know in detail that complex systems cause mistakes while conflicts with primary tasks lead to noncompliance, but still many measures annoy people for little discernible benefit. SSL warnings are an example; any designer can tell you that warnings should be reserved for genuine exceptions. Yet Microsoft has habituated almost everyone to swat warnings away. Akhawe and others estimated a false positive rate of 15,000 to 1; anything above 3% damages response. Felt showed that warnings work better if they’re brief and to the point. Another approach is to try to stop habituation; fMRI studies show that changing the warnings can do this; but is such a bullying approach at all justifiable when so many warnings are false alarms? There is unfortunately a nagging paternalism in security, often justified by “nudge” behavioural economics. It’s the old cartoon of why people choose the murder car. At CHI this year, there were the beginnings of a pushback against this, with a workshop on Batya Friedman’s value-sensitive design. The idea is that both intended and included values should be negotiated at the design stage, particularly when the embodied values aren’t transparent in the product. For her view on click-to-consent, see the Biggest Lie website; see also Turow’s survey of what people feel about all this. But training users often doesn’t work; Whitten’s attempts to train people to understand public-key cryptography using the LIME tutorial met significant resistance, with users expecting “that sort of thing” to be handled automatically. More recently, Abu-Salma interviewed 60 people who had downloaded encrypted chat tools; 50 of them had given up. As Phil Hallam-Baker put it, people want to protect themselves, not join a crypto-cult. And most secure tools are like a car that doesn’t go to most of the places you want to visit. Turning to desktop sandboxing, Dodier-Lazaro found that most users preferred to retain their plugins and other features; utility won over security. Needham pointed out in his Clifford Paterson lecture that security isn’t the only place where an ordinary person has a problem and a friendly mathematician solves a neighbouring problem. Yet again and again it’s assumed that people are the problem, not technology. The recent Denver manifesto, which Angela supports, calls for computer science educators not just to consider values but to empower students with the tools necessary for discussing and evaluating relevant values and the tensions between them, as well as an understanding of externalities and risk evaluation. More broadly, the word “security” is often unhelpful, as “security” problems are generally IT design problems. We should not have to tell people not to open unexpected emails – nor should the police have to warn people about ransomware using an email with an attachment called ransomware.pdf!
For people to be credible, they need to be competent and to have appropriate motivation, and the fracas over WannaCry revealed few actors with both attributes. Molotch pointed out that security is often best improved by investing in other things, such as proper staffing levels and PA systems that work. In conclusion: first, the categorical imperative of human-centred security is “don’t waste people’s time and attention”; second, security paternalism often masks incompetence, vested interests and unwillingness to change – instead, we should understand and support user activities and values; and finally, fixing this will need serious attitude change as well as much broader skills.
Questions included whether information security matters that much, if it’s awful and the world still doesn’t end; whether regulation is needed to fix the problem; and why bank account numbers are still presented as sixteen digits at the customer’s risk, when we’ve known for years that they should be presented in chunks that fit human short-term memory; whether we need mandatory design reviews, even at the cost of slowing innovation, as Bosch has done; whether we’re teaching our kids the right things or transmitting fundamental misconceptions (the latter, which is hard to fix with ten different security education bodies in the UK with self-appointed experts); practical things we can do, such as going through our programming course materials and removing all the examples of potentially vulnerable code; whether we’re making structural and social responsibilities into individual ones, and whether indeed it’s reasonable to expect all cryptographers to be warm and fuzzy experts, or whether it’s mostly risk shifting (it is); what we can practically do when the risk can’t be eliminated (in most cases there isn’t a trade-off between risk and usability – it’s an excuse for not bothering with usability, or even with what people want); and finally whether the trend to third-party authentication by Google and Facebook is good or bad (Angela’s worried when the data are collected by companies who’ll then do behavioural analysis and use it for advertising).
Alice Hutchings kicked off the regular sessions, talking about crime in the sky. People think of plane crime nowadays in terms of shoe bombs, laptop bombs and other terrorist threats; yet the reality is volume crime – of drug and cigarette smuggling, people trafficking, and fraud. Tying offences together is travel fraud: cheap tickets, obtained by credit card fraud and other types of cybercrime, and often sold at 25% of market value. Alice has been interviewing airlines, banks, travel agents and others; it’s a bit like the story of the elephant, where each of the blind men had a quite different perspective. The police response has been aimed at the traveller, and that doesn’t work well as they are either victims or claim to be.
Marc Schuilenburg has written extensively about the management of security and safety, which are contested concepts. Marc is both a philosopher and a professor of criminal law, and sees security as a way of ordering our lives; it’s essentially anthropological, and non-human things like badges and fences historically played essentially passive roles. Nowadays, however, we see an emergent “system” in which big brother controls us, augmented by surveillance capitalism where the “little sisters” Google and Facebook monetize us. However, they face a lot of difficulties in implementing their programmes; rather than just looking at their aims and effects, we need to pay more attention to their problems and limitations in detail. What about the uncertainty principle (of context and consequences), what about black swans, and what about unintended functionality such as planes that can be used as weapons? In short, we need to think a lot more about complexity.
Yi-Ting Chua uses social network analysis to understand cybercrime. One topic is how trust works in stolen data markets. Where both buyer and seller can be untrustworthy, the dyadic aspects are more interesting, and social network analysis lets you go beyond the purely financial aspects of an evolutionary game model to the propagation of beliefs and how people learn criminal techniques. There are interesting questions about how similar these markets are both to legitimate markets and to the process of radicalization.
Monica Whitty has worked in the past on romance scams, and now works with Gumtree on detecting and preventing mass marketing fraud more generally. She studies the psychology of victims and the effectiveness of warnings; as the criminals get better, it’s ever harder to tell genuine offers from fake ones. One surprising finding from her work is an apparently perverse effect of guardianship attempts: that people who read sites like getsafeonline seem more likely to be scammed, and also to become repeat victims.
Richard Clayton described being a victim of a travel scam where the crooks phone up conference speakers and offer to arrange hotel bookings, which turn out not to exist. However, if the booking had been genuine, that would just have been a rather entrepreneurial travel agency. A large chunk of cybercrime is not about interesting technical or psychological stuff; it’s just businesses that don’t keep their promises because their victims are overseas and the police can’t get it together. Richard has large numbers of such dodgy offers, which are available to researchers via the Cambridge Cybercrime Centre; the way to study this stuff is at scale.
Jeff Yan talked about detecting various kinds of fraud and sharp practice on the Hong Kong stock market but asked us not to blog the details.
Discussion started on what are good signals for companies; Richard suggested looking at whether the annual report has photos of company staff, and actually has their names. Monica discussed further why people who pay attention to fraud advice become victims more often; this really needs to be picked apart! To what extent are anti-fraud warnings just risk dumping? To what extent are victim behaviours masked, complemented or otherwise affected by other traits such as impulsivity? And to what extent does adversarial behaviour inevitably defeat static models, or is it just nimble crooks avoiding slow-moving police bureaucracies? Criminals are certainly usually better at timing, and are often seen as sexy. Social tools such as messaging enable crooks to build strong trust relationships with targets. This is why the posters in Western Union saying “Don’t send money to strangers” don’t work; by the time you’re ready to pay, the scammer isn’t a stranger any more! You have to explain the modus operandi in much more detail, and we’re not good at that. You can’t just teach three things, or the bad guys will design their scams to avoid them. In Japan, the biggest fraud problem is old people being scammed that their grandchildren are in trouble; the roots of the trust problem there go back to changing social structures. And smart lottery fraudsters give free lottery tickets to their victims first to create trust. Many signals of trust are easy to mimic. As one criminal said, “There’s a scam out there for everyone!” Much of this is not new; it’s documented in books of a generation or two ago such as “The Big Con”.
Richard Harper is working on the security of habit. Years ago he discovered that prepaid mobile phone contracts were popular in part because people didn’t trust themselves not to talk too much. This led to a long-term interest in the case where the person I don’t trust is me. However, there are even worse problems around dealing with material managed by others, such as photos on the web. People want to delete photos of old girlfriends and boyfriends but can’t when these are curated by others. So we end up dividing the world into stuff we can’t trust ourselves to curate, and stuff we can’t trust others to curate. He wonders if it might be possible to come up with AI tools that people trust; how they could evaluate them; and what this even means.
Sarah Gold is a designer who directs a studio called If whose speciality is building systems that people can trust. In general, people don’t trust the things we build; too many things have gone wrong, with Uber and Cayla just being the most recent. Trust should be a design goal like accessibility, as digital rights become mainstream consumer rights. Things like the GDPR are a natural regulatory response to the way EULAs rob us of our rights; we should see the rights in it as opportunities to do things better. She has come up with design patterns to support this. Trust must be seen as a matter of design, not compliance.
Ame Elliott is design director of Simply Secure. She is annoyed by designers who mislead users about security properties – often because they don’t understand them. Design decisions shape behaviour; forcing people to do odd things to protect themselves puts them into a different kind of behaviour loop, and the same happens if people have to stop, think and make complex inferences. She showed a number of examples of bad practice, many from Chinese service firms but not all. For example, Slackbot reads everything but doesn’t reply to direct messages – misleading most users into thinking that it’s not reading them.
Jean Camp argues that security expectations are unreasonable: asking people to understand public-key crypto, or even memorise a dozen hard passwords, is like expecting people to understand the Bernoulli principle before they fly. She has been looking for alternative metaphors, testing a number of cartoons to communicate concepts such as boundaries and the risk of theft. This works where effective cartoons can be found (which isn’t always easy) but implementation is hard; you can’t do anything much with Facebook, which hosts an awful lot of content.
Frank Stajano has worked for years on trying to fix the problem of the unusability of passwords, but chose to talk today about his other passion: the need to train a lot of skilled cybersecurity defenders. This is not primarily about training (though we do that of course) but inspiration. How do we attract more smart people, and particularly women? The idea is to make it interesting, challenging and fun. Last March we ran a competition, Cambridge2Cambridge, where students from Cambridge and MIT competed in a capture-the-flag event. This has spawned national and international competitions.
The last speaker of the morning was Angela Sasse, talking about how you actually change password practices in a real university. People were already complaining vigorously about a 150-day password expiry cycle, and yet the institution got the idea of incentivising stronger passwords by allowing people who chose them to keep them for longer. It turned out that one admin had found this idea on a website, and had disregarded the newer GCHQ advice not to expire passwords; he recommended “seecateek” as an easier version of “correct horse battery staple”, having phoned someone at GCHQ who said “best current advice is just to string three words together”. Password reset was expensive as people have to go to a physical help desk, so they decided to make registration of a phone number mandatory. Password lifetime actually dropped slightly; people changed passwords because of various technical and usability issues. Password changes usually make the password slightly weaker, and the mean time to reset is 11 minutes. Most resets are now self-service; people need to get a 4-digit code on their mobile, and there’s no signal in the basement where the helpdesk is. There was no prototyping or evaluation. So the new design wastes a lot of staff time (but perhaps cuts the cost to the university).
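For readers who want to see what the “string three words together” advice amounts to in practice, here is a minimal sketch; the wordlist path is a placeholder of my own, not anything GCHQ or the university specified.

```python
import secrets

def three_word_passphrase(wordlist_path="wordlist.txt", n_words=3, sep=""):
    """Pick n random words from a wordlist using a CSPRNG.

    Minimal illustration of the 'three random words' advice;
    wordlist.txt is a placeholder for any large dictionary file.
    """
    with open(wordlist_path) as f:
        words = [w.strip() for w in f if w.strip().isalpha()]
    return sep.join(secrets.choice(words) for _ in range(n_words))

# Example: three_word_passphrase() might return "plasticcaravanthunder"
```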
Discussion started on the overload of the word “trust”: the old geek view that a trusted system is one that can harm me sits poorly with a design approach. A female-friendly approach was also discussed; competitions can put women off as they’re less self-confident when learning a new skill. Thus Frank organised his contests as team events, with each team having one capable person from each of the UK and the USA and randomly selected other members. As well as the input, we need to think of the output: will a competitive approach select a cadre of cyber defenders who don’t care about people at all? A problem with the security community is that there are too many security people in it. Silicon Valley corporations generally don’t employ trainers all round the world when they roll something new out; the training budget is generally zero, and so stuff has to be discovered. So a key question is what options or assets a user has to discover quickly in order not to be harmed. A designer may prefer to simply show that they have followed best practice, so there’s an issue about how the art moves forward. A long-term issue is that developers in general and security folks in particular are very WEIRD compared with the users; multidisciplinary teams add complexity but allow some scope for redressing the bias and doing design better. Finally, the framing of “cyber” is almost exclusively military; security as caring for others, as protection and as solidarity is a different and better worldview.
Alessandro Acquisti’s job was to keep us awake after lunch. We are bombarded all the time by weapons of mass distraction; social media interruptions are reckoned to cost industry hundreds of billions a year, leading to tools such as RescueTime and the Pomodoro technique. So “freedom apps” are designed to restrain you – a modern version of Odysseus having himself bound to the mast to survive the sirens’ song. But do they work? Alessandro did an experiment with 456 mTurkers using Freedom, as it works across platforms. Subjects worked for four weeks; after two weeks they were assigned to a placebo (blocking random sites for random amounts of time), blocking Facebook and YouTube (for 6h per day), or blocking the subject’s own choice of websites. Performance measures included total mTurk earnings and a proofreading task. There was no difference in hours worked, but earnings increased for the group blocked from Facebook and YouTube. The strongest effect was on medium users of social media; heavy users were actually harmed by blocking.
Laura Brandimarte acknowledges that people will trade information for services, but the data market is opaque and companies have zero incentive to make it transparent. How can this be fixed? She’s working on TD-chain, a kind of blockchain intended to make information trading auditable and to support a reputation system maintaining a point-based score based on previous access decisions. She hopes that such a reputation system would appeal to firms as a way of winning user trust.
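The talk described TD-chain only at a conceptual level, so the following is purely an illustrative toy: an append-only, hash-chained log of access decisions with an invented point-scoring rule, just to make the idea of an auditable trail plus a reputation score concrete. None of the names or rules here come from the actual system.

```python
import hashlib
import json
import time

class AccessLedger:
    """Toy append-only log of data-access decisions with a point score.

    Illustrative only: the record format and the +1/-5 scoring rule
    are invented for clarity, not taken from TD-chain.
    """
    def __init__(self):
        self.entries = []
        self.scores = {}          # company -> running reputation score

    def record(self, company, purpose, user_consented):
        prev_hash = self.entries[-1]["hash"] if self.entries else "0" * 64
        body = {"company": company, "purpose": purpose,
                "consented": user_consented, "time": time.time(),
                "prev": prev_hash}
        body["hash"] = hashlib.sha256(
            json.dumps(body, sort_keys=True).encode()).hexdigest()
        self.entries.append(body)
        # naive point rule: +1 for consented access, -5 otherwise
        self.scores[company] = self.scores.get(company, 0) + (1 if user_consented else -5)
        return body["hash"]
```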
Serge Egelman has built instrumentation into Android for his Haystack and Lumen projects and is now using it to measure where app data goes, how often, what persistent identifiers are used and whether they’re shared across apps. He uses the exerciser monkey to drive the apps. It turns out that if you opt out of tracking, Google still sends the same information to advertisers, but sends a flag telling them to ignore it – almost certainly a violation of COPPA. Many apps also send wifi hotspot data to advertisers (InMobi settled with the FTC for $4m for doing this); in this case an app developer in China whom they notified acknowledged the problem and updated their app, which involved changing the library they used. So it appears that at least some of the time developers are breaching privacy law inadvertently by using bad libraries. Transmitting an advertising ID along with a persistent identifier such as a router MAC address is not just a privacy violation for kids’ apps but a violation of federal law.
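To make the measurement concrete, here is a toy filter of the kind one could run over captured request metadata, flagging requests that pair the resettable advertising ID with a persistent identifier such as a wifi MAC address. The record format and field names are invented for illustration; this is not Lumen’s actual output or Serge’s code.

```python
import re

MAC_RE = re.compile(r"\b([0-9A-Fa-f]{2}:){5}[0-9A-Fa-f]{2}\b")

def flag_id_bridging(requests):
    """Yield requests that send an advertising ID together with a
    persistent identifier (here, a wifi MAC address).

    `requests` is assumed to be an iterable of dicts with 'host' and
    'payload' keys -- an invented format, not the real tool's output.
    """
    for req in requests:
        payload = req.get("payload", "")
        has_ad_id = "ad_id=" in payload or "advertising_id=" in payload
        has_mac = bool(MAC_RE.search(payload))
        if has_ad_id and has_mac:
            yield req["host"], "advertising ID bridged to MAC address"
```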
Elissa Redmiles worries about the ecological validity of many security studies. In America 40% of people have less than a high school education and 44% are over 50; mTurk isn’t good at attracting either. You can use census-representative web panels, but they’re people who’re comfortable with the Internet. The survey community prefers probabilistic phone and mail surveys where it matters, but this costs $20-100k per sample, and you can’t ask people to use an online tool by phone or mail. Where we need work is: first, understanding the bias (so we need to think about technical literacy, not just demographics); second, understanding non-demographic weighting better; and third, watching out for unfounded assumptions by combining sampling mechanisms to get at hard-to-reach populations. We would also like scalable alternatives to self-reports and logs, and perhaps new measurement strategies altogether.
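As a concrete example of the simplest case, demographic weighting of a sample looks like the sketch below. It is a generic post-stratification example, not Elissa’s method, and it deliberately ignores the harder non-demographic factors (such as technical literacy) that she highlights.

```python
from collections import Counter

def poststratification_weights(sample, population_shares, key):
    """Per-respondent weights so the weighted sample matches known
    population shares on one attribute (e.g. age band or education).

    Minimal illustration of demographic weighting only.
    """
    counts = Counter(r[key] for r in sample)
    n = len(sample)
    weights = []
    for r in sample:
        sample_share = counts[r[key]] / n
        weights.append(population_shares[r[key]] / sample_share)
    return weights

# Example with invented numbers: weight a toy sample to assumed census shares.
sample = [{"education": "no_degree"}, {"education": "degree"}, {"education": "degree"}]
weights = poststratification_weights(sample, {"no_degree": 0.6, "degree": 0.4}, "education")
```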
Adam Joinson is interested in different ways of understanding other people. Sam Gosling noted that people could assess each other by how tidy their rooms are, and what sort of stuff was scattered around; our digital residue potentially tells very much more. That’s after all what Cambridge Analytica does when helping politicians decide which ad to send to which user. He’s done a meta-analysis of 9000 articles exploring whether humans and computers can make accurate judgments of personality, and from what. He concludes that you can do the Big Five, and that computers are generally better than humans, especially for attributes that are hidden. Then he did a lab study. The Zelig quotient measures how much you move your language towards or away from that of others; he ran lab experiments with people pitching Kickstarter ideas to others. Low-power people were more accommodating, and even more so if Machiavellian (and it worked; they got higher payouts). There was a lot of variety in terms of the influence mechanisms used. He’s also been looking at the effect of interruptions: they can make people more ready to click carelessly. Pulling it all together, it’s complicated. You can profile people accurately, but actually exploiting this is harder work. Context, cognitive load and the message itself are much more likely to influence decisions than innate susceptibility.
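The Zelig quotient itself wasn’t specified in the talk; the sketch below computes a generic linguistic style matching score over function words, just to show the kind of accommodation measure involved. The word list and formula are mine, not the study’s.

```python
FUNCTION_WORDS = {"i", "you", "we", "the", "a", "an", "and", "but", "of",
                  "to", "in", "that", "it", "is", "not"}

def style_matching(text_a, text_b):
    """Crude linguistic style matching score in [0, 1].

    Compares the rate of function-word use in two texts; 1 means identical
    rates. A generic LSM-style measure, not the specific Zelig quotient.
    """
    def rate(text):
        tokens = text.lower().split()
        if not tokens:
            return 0.0
        return sum(t in FUNCTION_WORDS for t in tokens) / len(tokens)
    ra, rb = rate(text_a), rate(text_b)
    return 1 - abs(ra - rb) / max(ra + rb, 1e-9)
```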
David Modic was the last. He asked who in the audience checked code hashes, email signatures, certificate chains and other crypto stuff? About five people do, in an audience skewed to security geeks. He’s been sending out lightly encoded messages in bogus GPG signatures for months now. Now similar stuff has turned up in a bad email; the investigation is ongoing.
In discussion, it was noted that leaks of geolocation information through apps are often useful to cybercrime investigators, and such information is basically a matter of notification and transparency rather than being illegal per se. Why does someone not sue the library authors, or get the FTC to sue them? People said that changing all the code was a lot to ask; Serge replied that asking people not to break the criminal law was not a very high bar and asked for Google’s official position; Ben replied, to general laughter, that he thought the company didn’t approve of illegal advertising. Serge added that most of the bad libraries do in fact disclose what information they collect; why don’t developers pay attention to the need not to disclose it from children’s apps? The Google Play store even brings COPPA to the attention of people who upload apps there, but perhaps the developers treat this as yet another annoying warning to click away. A further issue is why only children get such serious protection; that’s down to politics. There was discussion of other ways to collect survey data, including people on trains and people queuing at the local courthouse; Alessandro tries to vary methodologies between field, lab and natural experiments and argues that this is the only way to be sure your sampling is robust. Alice divided one DDoS survey into two, with half being offered an online survey and half an interactive one. Elissa noted that more personal interview techniques may give more data, but people may withhold more sensitive data.
Sascha Fahl started Thursday’s last session talking about usability for programmers. The crypto APIs exposed to programmers by Microsoft, Oracle and Android have unsafe defaults, such as ECB encryption, and the APIs in Eclipse and Android Studio for X.509 certificate customization are basically unusable. As most developers aren’t security experts, they have to find workarounds, so they ask Google and get to Stack Overflow where they pick up snippets of insecure code. Five years ago, he found that over 95% of Android apps that implement custom certificate validation do it wrong (that’s 8% of all apps). In follow-up work, he contacted 80 developers of insecure apps and got 15 to agree to an interview. A typical comment was “we just implemented the first working solution we found on the Internet”. Android 7 implemented part of what he recommended. He’s rerun the experiment and human carelessness hasn’t changed.
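Sascha’s numbers concern Java and Android code, but the same class of mistake is easy to show in Python: the “first working solution” found online typically silences certificate errors by disabling verification altogether. A hedged sketch of the anti-pattern and the safe alternative:

```python
import ssl
import urllib.request

# The "first working solution you find on the Internet": silence certificate
# errors by turning verification off. This accepts any certificate, including
# an attacker's, and is exactly the class of mistake the study found.
insecure_ctx = ssl._create_unverified_context()          # DON'T do this

# The safe default: a context that verifies the chain and the hostname.
secure_ctx = ssl.create_default_context()

def fetch(url, ctx=secure_ctx):
    """Fetch a URL over HTTPS with the given SSL context."""
    with urllib.request.urlopen(url, context=ctx) as resp:
        return resp.read()
```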
Yasemin Acar hopes we know that copying and pasting code from the Internet is bad, so she set out to survey how many people do it. Mostly people don’t look at books or official Android docs, but go to Stack Overflow via Google. She set four groups a programming task and found that as far as functionality was concerned, people allowed only the official documentation fared worst; then came people allowed to use books; coders using Stack Overflow did best. Security was different: books worked best, then official documentation, with Stack Overflow users making the most errors. She concludes that what we really need is more usable documentation. Now there are APIs designed for usability (libsodium, Keyczar): in a second study she asked whether they actually work. She gave people various Python coding tasks. The functionality was worst for Keyczar, the supposedly usable library; PyCrypto and M2Crypto worked best as they have the most code samples online. When it came to security, Keyczar did way better. She concludes that both crypto coding and designing crypto APIs are hard; in the latter case you need to test the thing properly to understand how people will use it, and supply lots of examples of good code.
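As an illustration of what a “usable” API looks like, here is libsodium’s Python binding (PyNaCl) doing authenticated encryption; the construction, nonce handling and authentication are fixed by the library, so there is no mode for the developer to get wrong. This is my own example, not one from the study.

```python
import nacl.secret
import nacl.utils

# Generate a random key of the right length; SecretBox pins the algorithm,
# the nonce handling and the authentication for you.
key = nacl.utils.random(nacl.secret.SecretBox.KEY_SIZE)
box = nacl.secret.SecretBox(key)

ciphertext = box.encrypt(b"attack at dawn")   # nonce generated and prepended
plaintext = box.decrypt(ciphertext)           # raises on tampering
assert plaintext == b"attack at dawn"
```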
Charles Weir has been investigating what an outsider might do or say to make a programmer write more secure code. Options include a half-day threat modelling workshop for a developer team to sensitise everyone and get them on the same page; a “you’re doomed” incentivisation workshop versus continuous reminders; component choice; old-fashioned paternalistic developer training; static analysis; code review by a person; and penetration testing. Two of these cost money (pen testing and training) and one costs discipline (code review). Discipline is hard, so forget code review – and probably pen testing and training too. That leaves five options, of which three are purely social and component choice almost so. Static analysis is the only “geek” solution.
Nathan Malkin starts with the “mud puddle” test: suppose you fall, bang your head and forget all your passwords. Which accounts and devices can you get back? Nathan’s working on social authentication: getting your friends to vouch for you. Facebook now allows this. This has all sorts of great things going for it; so why isn’t it happening a lot more than it is? What are the real barriers to adoption? Is it usability? Is it speed? He has a prototype of an asynchronous system. Is it a social factor, such as not wanting to bother your friends about something you think unimportant? This throws us back to the questions of methodology raised in the previous session.
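For concreteness, a toy k-of-n vouching check is sketched below; it is invented for illustration (the codes, threshold and flow are mine) and is not how Facebook’s trusted-contacts recovery or Nathan’s prototype actually work.

```python
import hmac
import secrets

def issue_vouch_codes(trustees):
    """Give each designated trustee a one-time code (sent out of band)."""
    return {t: secrets.token_hex(8) for t in trustees}

def recover(submitted, issued, threshold=3):
    """Toy k-of-n social recovery check: the account is recovered if at
    least `threshold` trustees return the exact code they were issued.
    """
    valid = sum(1 for t, code in submitted.items()
                if t in issued and hmac.compare_digest(code, issued[t]))
    return valid >= threshold

# Example: issue codes to five friends; any three of them can vouch you back in.
issued = issue_vouch_codes(["ana", "bo", "cai", "dee", "eli"])
```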
Eliot Lear has been working on the Internet of Things, where the limit isn’t the number of things but the number of types of thing. The industries that make these things have their own expertise, but not network security. The people who plug in the lighting or the coke machine don’t think to talk to the CISO. And everybody’s complaining about the weather, as the man said, but nobody’s doing anything about it. Who can help? And what does this mean for the evolution of the Internet? We need to be able to install stuff at scale without bothering the user. Eliot is trying to help by working to standardize Manufacturer Usage Descriptions, and he’d like people to kick the tyres.
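The underlying idea is that the vendor publishes a machine-readable description of what its device class legitimately needs to talk to, and the network enforces it. The sketch below is a toy of my own, not the real standard’s format; it just shows an allowlist plus an enforcement check.

```python
# A toy manufacturer usage description: the vendor declares which endpoints
# and ports its device class legitimately needs. This is an invented,
# simplified structure, not the standard's actual format.
MUD_PROFILE = {
    "device": "example-lightbulb",
    "allow": [
        {"host": "firmware.example-vendor.com", "port": 443},
        {"host": "time.example-vendor.com", "port": 123},
    ],
}

def permits(profile, host, port):
    """Return True if the profile allows the device to contact host:port."""
    return any(rule["host"] == host and rule["port"] == port
               for rule in profile["allow"])

# A home gateway could drop anything the profile doesn't permit:
assert permits(MUD_PROFILE, "firmware.example-vendor.com", 443)
assert not permits(MUD_PROFILE, "evil.example.net", 80)
```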
Thursday’s last speaker was Elizabeth Stobert, who has looked at whether 15 security experts could manage passwords any better than 27 others. The non-experts talked about their personal strategies, such as reusing passwords with slight variants; they kept and reused passwords for a long period of time, and had favourite passwords with personal significance; and they tried to save good passwords for important accounts. The experts were much the same; they were more aware of risks, used specific defence strategies for specific threats, and combined multiple strategies; yet they still encountered usability problems. Some used password managers to generate random passwords for unimportant accounts, others for important ones. In this context expertise may map not so much to skilled decision-making as to situational awareness. As for what we can do, perhaps we can tell normal users more about real risks and give them better heuristics to reason about threats. Education isn’t all the answer, but we somehow have to communicate, perhaps by design, what various tools are trying to do.
In discussion, Ben admitted that Keyczar was the public version of an internal Google library but said it wasn’t designed to support more functionality; Sascha replied that his study aimed at finding out how hard it was for programmers to figure out what it could and could not do. The problems that social authentication attempts to solve are hard in all the (important and frequent) edge cases, ranging from attacks by about-to-be-ex-partners through decision-making at times of stress to what happens to our accounts on death (or when our friends die or become demented). Jeunese Payne noted that there’s also shame attached to losing an important credential such as access to a bank account. Zinaida agreed that for most people passwords are too unimportant to bother your friends with. Contextual integrity may make it sensible to use Facebook friends to recover a Facebook account, but even so people tend not to defriend, and the attacks you mostly care about are from people close to you. Friends make little sense in the context of a bank account, either as a help or as a particularly severe threat. Richard recommended that anyone wanting to do research in this field should go and intern with the abuse team at Yahoo or Facebook to understand the real contentious cases against which their recovery systems have evolved. Jean Camp mentioned the problems with vulnerable minorities such as gay kids in high school, only some of whom are out; forcing them to trust their parents is not just unreasonable but dangerous. Nick Humphrey noted that leaving keys with neighbours is perfectly sensible and secure; it’s normal human behaviour and you’re not going to get divorced from them. John Lyle pointed out that even passwords are vulnerable to spousal attacks; what else is reasonably available in the last ditch (such as in a bitter divorce) other than social means? Facebook lets you upload your passport, but that can be gamed too. The cross-product of this diversity with the Internet of Things could be a real problem. Bruce Schneier has counted 22 different IoT security standards; Eliot Lear noted that the definition of a “Thing” varies from one company to another depending on what they are trying to sell you. The authentication standards might drop from two handfuls to one handful over the next five years, but the functionality is all bespoke, and all we can hope for is some consolidation in terms of the code base thanks to the platforms being pushed by Google, Intel and ARM. And how can users swamped with all this complexity even adopt tools like password managers to manage all the stuff in their home? As it is, even moving your web-based accounts to a manager will take half a day, by the time you reset all the passwords you’ve forgotten. Elizabeth Stobert said that in future we’d have to get used to spending maybe ten minutes here and there on security management, just as we might currently spend ten minutes dusting every few days.
Maryam Mehrnezhad started Friday’s sessions by asking who didn’t have a smartphone; about ten percent of the audience confessed, a surprisingly large number. Most of the sensors on a mobile phone aren’t understood by most users; a minority know what a gyroscope is and only a handful understand a Hall effect sensor. Maryam tried tagging them with icons, and found that as user understanding increased, their level of concern dropped sharply; it takes a lot more work to teach people that the gyroscope can be used by a malicious app to steal your bank PIN. But with a bit of nudging, users could identify overprivileged apps. Some might block camera and microphone access for an app that didn’t need them while others would use it anyway.
Howard Bowman started off with a demo of rapid serial visual presentation, testing to see whether audience members could spot Nigel Farage or Donald Trump as salient in a rapid jumble of pictures. An EEG variant, the “fringe P3 method”, detects whether pictures cut through the fringe of awareness; it’s one of the biggest signals you get in EEG. This can be used to apply a concealed knowledge test even without the cooperation of the subject, and it is resistant to conscious countermeasures. This much Howard presented last year. The new work is a discovery, made in collaboration with the West Midlands police, that the method works for email addresses too; if a suspect pretends that an email address isn’t his, they can verify this claim 70% of the time. His next project is to see whether a suspect has a fringe-P3 response to content on a dark web forum.
Tony Vance responded to Angela’s criticism of Adrienne Porter Felt’s and his work on warnings. Adrienne found that false positives can be reduced by 75% using server resources but only 25% if the work’s done by the client; now 30% of Windows users ignore warnings but only 13% of Android users. So what about habituation? This happens with all studied animals, right down to the sea slug. He has a meta-analysis accepted for MISQ that found gaps in the literature, including on recovery once a stimulus is withdrawn. He studied subjects for five days, finding that polymorphic warnings had a sustained advantage; he also found that this could be measured as easily by eye-gaze tracking as by fMRI. He followed this with a field study in which subjects downloaded three apps a day for three weeks, with four cases of bad warnings with which users were told to comply. He found once more that polymorphic warnings worked best, and concludes that polymorphic warnings help provided the volume isn’t excessive.
Kami Vaniea has been studying how people interact with software updates. Experts think that updates are the most important security measure while normal users don’t; Rob Reeder’s survey didn’t put them in the top 100. She’s found that twice as many people found updates unsatisfactory; reasons for installing one anyway range from habit through perceived importance to trust in the vendor. The process itself can be long and complex; on an iPhone you need free memory to update, so you may have to move photos or pay for Internet storage. You might also find the phone now chews more battery. Quite a few people believe that “updates can contain viruses”: for example, if you update Flash it installs McAfee unless you uncheck a box.
Zinaida Benenson has been experimenting on what causes students to click on links. She’s been sending messages by email or Facebook saying “Hey John, Here are the pictures from last week!” (56% of email contacts clicked vs 38% for FB) or “Hey, the party was great, here are the pictures!” (where it was reversed: 20% for email vs 42.5% for FB). So addressing people by name seems more important for email, while in the party context people socialise with strangers. Stated reasons for clicking included curiosity (34%), a fit to a real New Year party (27%), and other reasons in the teens. Could researchers be vulnerable? Zinaida got an email from a “CNN journalist” asking for an exclusive after she had a paper accepted at Black Hat. She replied, and it turned out to be genuine despite having all the signals of a phish on later inspection (unknown sender, dodgy domain, all message context in the public domain, and so on). Good mood, intuition, creativity and gullibility are one cluster, while sadness, vigilance, suspicion and an analytic approach are another, according to Kahneman. Perhaps James Bond can be relaxed and vigilant at the same time, but most of us can’t, and we probably don’t want our sales, support and PR folks to be in James Bond mode all the time.
Rick Wash has been working on human interdependencies with Emilee Rader. There are plenty of attacks whereby an attacker compromises one system, and some technical attacks that can propagate from one machine to another (as with worms), or exploit the dependency of many systems on a handful (such as DNS servers). Yet increasingly people use many systems, and we are starting to see more and more attacks that use humans to transport attacks from one machine to another. The easy example is password reuse, and people talk about weak passwords in this context. Rick and Emilee studied all the passwords used by a student population for several weeks and found that password strength had nothing to do with it; it was how many times people had to enter passwords and how many times people were asked to come up with new passwords. At their university they have a 20-minute logout, so people end up entering passwords dozens of times a day; this leads to the university password being reused like crazy. Curiously, naive users hear “don’t write down your passwords” from friends and relatives, while experts don’t offer such advice. To understand phenomena like this, we have to study how people learn to use computers in a real ecosystem rather than in isolated lab studies.
Discussion started on the fact that polygraphs aren’t admissible in evidence because they don’t measure deception but anxiety, which is easy to induce; and whether the fringe P3 method might be even worse, because of the constellation of brain regions that may be activated. Howard thinks its proper application is in investigation. Could it be used in, or to justify, torture? Howard said the glib answer would be that with fringe P3 you wouldn’t need torture, but admitted it’s more complex than that. We then moved to whether updates should be automatic and how easy an opt-out should be; Kami described the history of autoupdate, which most people accept, though people should be able to turn it off. The controversy comes not from technical security updates but from functionality “upgrades” with fundamental UI changes or moves to more predatory business models. If there are too many horrid upgrades, people will start to resist patches; similarly, if there are too many predatory emails or Facebook posts, people will move from the nice relaxed innovative space to the paranoid vigilant one, which imposes excessive social costs. Living in a state of heightened alert is just bad for your health; so we really ought to find technical solutions to this stuff. Relying on humans as the first line of defence is just too expensive. Jean asked whether it’s not just a business-model problem; people don’t have a chance to take just the security patches without the functionality changes. Kami answered that from the viewpoint of learning, the predatory updates poisoned the well; but with universal patching by default, the new issue will be whether you get unexpected functionality changes. It is possible to do better: the discussion by Apple users is different in fascinating ways from that of Microsoft users. Microsoft users have lower expectations; some Apple users look forward breathlessly to the latest update while others will airgap an old laptop so it can’t update and screw their precious first-generation iPod. Finally, Angela replied to Tony by pointing out that the current false alarm rate is utterly dysfunctional given the cost of interruptions, and the only rational response by users is to ignore warnings. Most of us have to work 90/10 to get through the day, and if we didn’t habituate we wouldn’t get through the day. Tony responded that depriving the user of warnings would be paternalism; Angela’s reply was that vendors should find out what users actually want, and in the absence of that it is irresponsible to help the vendors frame the problem as being the users’ fault.
Rachel Greenstadt is working on how to keep people safe in collaborative working. An example of a problem is that Wikipedia doesn’t let people edit pages using Tor; yet Bassel Khartabil, a contributor of images of Palmyra as well as software, has disappeared. She has interviewed a number of Wikipedia contributors who are concerned about safety, asking about the threats they perceived and the mitigation strategies they adopted. Tor users’ threats were surveillance, employment, personal or family safety, and harassment, in declining order; Wikipedians saw safety as the main threat. As well as being shot or beaten up, both groups were worried about digital harm such as being doxxed or having their head photoshopped on to porn. People who perceived no threats tended to explain this as being employed white males or otherwise privileged. Some avoided sensitive topics such as women’s health or sexuality because they didn’t want a backlash. Privacy measures short of Tor included multiple accounts and asking others to post. Does all this create a new digital divide between people privileged enough to speak in their own name, and the rest?
John Lyle works with a large team on account security at Facebook. He’s worked on systems that detect attempts at large-scale account takeover; if a logon appears suspicious they add an extra authentication check, and Facebook has more ways of doing this than just about anyone else on the Internet. The requirements are that the account holder should pass, scripted bad actors should not pass, and it should work at scale (i.e. for everyone, with no tokens, and without intervention by Facebook staff except occasionally). Their methodology when assessing a proposed method (a recent example being to ask about recent comments) is to do a technical security review, then screen eligibility (how many people made recent comments?), then test it (did the audience go up or down?) and then look at the economics (did support costs go up or down?) – then look for other benefits. In the case of identifying recent comments, it was just as good or bad as the others on the above criteria, but it worked much better for the visually impaired. This led them to try a related idea, of choosing the people you’d messaged in the last few weeks, which turned out to work even better. Small margins in usability really matter with 1.9bn people!
Emilee Rader has been thinking about derived data from things like phones, fitbits and smart car locks; such things give away not just traffic congestion but more sensitive data, from your income bracket to where your kids go to school. Such side-effects are near impossible to predict when you set device preferences; they’re more of a social dilemma, and perhaps we might think of them as a common-pool resource like a fishery. To manage such a thing you need to be aware of other users, measure what they do, and agree rules that can then be enforced. Such resources have low excludability and high subtractability. How might these ideas go across? We already have rules and norms for managing privacy, such as closing the office door – but IoT trackers are often invisible by design and the externalities are even better hidden. Figuring out how to link norms to such technology is something we all need to work on.
Bekah Overdorf is working on Tor security and in particular on website fingerprinting. There’s a ton of literature on possible attacks; Bekah’s been studying the error patterns and what makes them fail in practice on real websites. Size matters: big sites are easier to fingerprint, as are static sites. She’s trying to train an AI to predict website fingerprintability.
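The talk didn’t give the feature set or model, so the sketch below is generic: invented coarse traffic features and an off-the-shelf classifier, just to show what “predicting fingerprintability” might look like in code. The toy data and labels are placeholders, not Bekah’s dataset.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

def traffic_features(packet_sizes):
    """Coarse per-site features from a list of observed packet sizes.

    Invented feature set for illustration; the study's actual features
    are not described in the talk.
    """
    sizes = np.asarray(packet_sizes, dtype=float)
    return [sizes.size, sizes.sum(), sizes.mean(), sizes.std(),
            np.percentile(sizes, 90)]

# Toy training set: X holds features per site, y says whether an existing
# fingerprinting attack identified that site reliably (1) or not (0).
X = np.array([traffic_features(s) for s in ([1500] * 40,
                                            [600, 1500, 300] * 10,
                                            [200] * 5)])
y = np.array([1, 0, 1])

model = RandomForestClassifier(n_estimators=100, random_state=0).fit(X, y)
```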
Arvind Narayanan is interested not in how AI can help security, but in how security analysis can critique machine learning systems. AI/ML systems are becoming pervasive without much thought; we see headlines like “A beauty contest was judged by AI and the robots didn’t like dark skin”. This is perfectly understandable if AIs just learn from previous human history; they inhale the prejudices. Arvind has been looking at the prejudices embedded in the words of a language. His team started from the observation that machine translation programs introduce gender bias (doctors are male, nurses are female), and invented a version of the implicit association test for natural language processing systems. They used the GloVe algorithm trained on a standard corpus and found very strong associations between European names and pleasant words, and between African-American names and unpleasant words; the same pattern held for gender and arts/science; age, religion, physical illness and mental illness all showed near-perfect stereotyping. He believes that all human biases are embedded in language and lurk there. What do we want from our machine learning, and how do we fix bias? We need to alter action, not perception, and use many metrics.
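The implicit-association-style test for embeddings is easy to sketch: given an embedding, compare how strongly two sets of target words (e.g. names) associate with two sets of attribute words (e.g. pleasant vs unpleasant). The code below is a compact reimplementation of that idea, with the word sets and the embedding loading left to the caller; it is not the authors’ released code.

```python
import numpy as np

def cosine(u, v):
    return float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v)))

def association(word, attr_a, attr_b, vec):
    """Mean cosine similarity of `word` to attribute set A minus set B."""
    return (np.mean([cosine(vec[word], vec[a]) for a in attr_a])
            - np.mean([cosine(vec[word], vec[b]) for b in attr_b]))

def weat_effect(targets_x, targets_y, attr_a, attr_b, vec):
    """Effect size: do X-words associate with A more strongly than Y-words do?

    `vec` maps words to embedding vectors (e.g. loaded from GloVe);
    choosing the word sets and loading the vectors is left to the caller.
    """
    x = [association(w, attr_a, attr_b, vec) for w in targets_x]
    y = [association(w, attr_a, attr_b, vec) for w in targets_y]
    pooled = np.std(x + y, ddof=1)
    return (np.mean(x) - np.mean(y)) / pooled
```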
Discussion started off with the problem that arguments between good people and bad people tend just to promote bad discourse in the social media rankings. It moved to what might work as affirmative action online; one problem is that when people aim for a single metric, the bias may creep in there; also, a single metric will drive people to avoid secondary considerations such as editing text to mitigate prejudices. Technical fixes such as editing the embeddings are unlikely to work; we have to acknowledge history, but maybe we need humans in the loop in some places. Humans deal with biases in very different ways; we are very good at not acting on these biases and machines are nowhere close. There is a separate and burgeoning literature on employment discrimination, such as in the targeting of ads for high-paying jobs. Majorities can bully: the majority of Wikipedians can’t be bothered to maintain exceptions for Chinese editors who used to use Tor, against a background of Tor being associated with too many abusive edits from elsewhere. The geographic scale of social decision-making has many more ramifications and is usually ignored. When it comes to aggregating data, for example, it might be nice to allow local aggregation in a particular neighbourhood, but not across the country; yet no deployed IoT system appears to offer this. People also need coping mechanisms; someone who loves chocolate biscuits but whose wife is a health freak might hide the biscuits in the shed, and if that privacy vanishes their marriage might break down. Teasing out the difference between rational inferences and social norms is really hard because people’s privacy requirements are so hugely different. David Murakami Wood pointed out that the entire drift of western law has been about enclosing the commons and making it private, so the creation of collaboratively managed resources goes against the grain – although there are mechanisms such as commons councils and a whole literature on club goods thanks to Ostrom and others. Security folks are starting to pay attention, as are Internet governance folks. Nick Humphrey suggested that choice blindness might explain why people are not good at recognising their own comments; there are lots of other interesting biases to look at. Finally we discussed patterns of Facebook logon: people in less developed countries are more likely to log on a lot as they share devices, so it’s more important to make logon easy if you want to serve poor people.
Lydia Wilson uses anthropological methods to ask questions of terrorists in conflict zones. The Manchester attack was straight out of the Al-Qaida and Daesh textbooks; killing young girls evokes horror, gets the troops on the streets and draws the infidel into a war he cannot win. As an anthropologist she finds it easy to understand terrorists’ motives: just ask them! They are happy to talk. At the same time as the horror in the west there is joy among the rebels at the success, coupled with psychological mechanisms to minimise the guilt at killing vulnerable civilians. She’s interested in pathways into, and out of, armed conflict, and has been travelling to Kurdistan for six years now. The PKK remain the most extremist actors, on a level with some ISIS people. Her main methodology is to use flash cards to assess, first, how well people are fused to particular identities; at the highest levels, people identify so thoroughly with a group, and are so bonded to its other members, that all insults are felt personally. Second, she has flash cards to assess the perceived physical and spiritual strength of out-groups, and to compare willingness for extreme sacrifice with in-group and out-group feelings. She developed the questionnaires initially in fieldwork in Lebanon, which still has strong group antagonisms, but in a stalemate that’s lasted decades. She had hoped to find that antagonisms would die away after a conflict, but this was not at all the case.
Richard John has been modelling the value of deterrence. The rational actor model is that the adversary maximises expected consequences, and we often assume that the terrorist’s utility is the negative of ours. That’s really dumb, but it pervades research in this area. People try to create hierarchies of terrorists’ objectives and do game theory while ignoring the other side’s risk preferences. Deterrence can work by denial, by monitoring or by threat of punishment, but it’s extremely hard to assess whether we’re spending the optimum amount. Perhaps some ineffective attacks were shifted from better targets or methods that were adequately defended, but we don’t know. Psychology is the hard part.
Harvey Molotch is an antiquarian interested in the translation of threat, which goes back to Shakespeare and even the Greeks. Who punishes whom, and how does society respond institutionally to threat? In ambiguous situations there may be a compulsion to do something; as Reagan said, “Don’t just do something; stand there!” Doing nothing is often the rational thing to do, but it’s radical and takes courage. Another irrationality is the way the fear of infection has led to most public toilets being closed in both the UK and the USA; others have lighting that makes it hard for addicts to see their veins, but this makes the environment unpleasant for all. Even in universities, toilet stalls leave open space above and below the private area to combine surveillance with the minimum needed privacy. The mechanisms adopted for solving such dilemmas – the social sweet spots – tell us a lot about a society. In New York subways, the “cheese-slicer” exit gates slow egress so much that people are always using the emergency exits, so the alarms are ignored, so kids who want to cheat send one in with money who holds the exit open for everybody else. Compare this with the honour system in Vienna!
Melody Zhifang Ni studied the July 2005 London bombings, which led to a fall in subway use for two months, coupled with an increase in bike use and accidents. It took about six months for things to get back to normal. She designed an experiment to test four interventions. 600 London commuters were recruited and asked about their transport choices, then again after seeing video clips of the bombings, and then after one of four interventions: a video of Blair promising a “vigorous and intensive” response (“security”), a request to plan ahead for the next seven days of meetings (“plan”), asking them the cost of either travelling by cab or paying the congestion charge to drive in (“cost”), and asking them to look up the risks of travelling by underground, taxi, car, bike or scooter (“risk”). 71% preferred public transport before the attacks. Cost didn’t hack it; security helped a little, plan better still, and the best of all was “risk”. In conclusion: you can break the dread cycle and debias people, and the best way is to educate them about risk.
John Mueller is working on a book on aviation security which will come out later this year. The 2010 NRC report concluded that the DHS had absolutely no idea how to assess whether security measures justified their cost. John has been trying to model the many countermeasures that might stop hijacking or bombing attacks and to assess the likelihood that they will deter or disrupt an attack; he concludes that attacks on aircraft are not a good idea, as bombings will be stopped with odds of 50 to 1 and hijackings with odds of 180 to 1. The risk of being killed by a terrorist in the air is about one in 120 million, while the “acceptable risk” of being killed by cancer following a single exposure to backscatter x-ray screening is 1 in 15 million. There are some interesting substitutions; for example, if we reduced air marshals by two-thirds (they are ineffective as there are very few of them) and put half the money into armed pilots instead, we’d do better. The airline response was “we don’t want to offend the Thin Skinned Agency”. Increasing the pre-check lines would be another win. We could not only save billions; improving the passenger experience could save 500 people a year who otherwise die on the roads. In fact, the TSA has killed more Americans by now than Al-Qaida.
Sander van der Linden is interested in the spread of fake news. What might a psychological vaccine look like? He’s been studying climate change rather than terrorism, as it may be more important; also, most scientists agree but many people are induced to doubt the science. There has been a concerted disinformation campaign for many years; there may be a parallel with information on security, but climate change is better documented. There are fake experts, as in infosec, but the agenda is not to present an alternative view but just to make people doubt the facts. He did an experiment where attitudes were measured before and after seeing straight facts (97% of climate scientists agree), plus, in the treatment group, a bogus petition against climate change. The bogus petition neatly cancelled out the effect of the straight facts; introducing doubt is enough. The vaccine is to expose people to a bit of the fake news, then debunk it, and then present the straight news. He tested a partial vaccine – a straightforward general warning against fake news – and a full vaccine that showed the actual fake petition. The partial vaccine got back a third of the raw belief and the full vaccine recouped two-thirds. Future work might include stopping people becoming habituated to warnings about fake news, or just clicking them away; there’s a ton of stuff to do with the sort of ideas discussed at this workshop.
Discussion started on the rational actor model not being the right one for deterrence and the question of whether terrorism “works” at all in a rational sense. However, there has been no shift in US attitudes towards terrorism since 2004; all the money hasn’t reduced the fear. It’s all basically bottom-up; the politicians follow the public’s fear, and when you suggest sensible measures in Congress they say “would you say that outside?” The same thing happens in medicine, though; people were resistant for a very long time to arguments about measuring the value of human life in quality-adjusted life years (QALYs), until eventually people figured out how to reframe the question and create appropriate institutions. We also need to be able to make non-financial trade-offs, such as whether you put laptops in holds, making them less likely to be bombs but making it impossible to deal with battery fires; and whether arming pilots would increase the rate of pilots killing passengers in a suicide. And when it comes to inoculation, can’t the purveyors of fake news just as easily inoculate people against the facts? The original work on inoculation theory didn’t take a view on the validity of the arguments on either side; there is related work on cultural truisms which also predates social networks.
The last session started with David Murakami Wood promoting his new open-access journal, Surveillance and Society. From Erdogan through Duterte to Trump we see a wave of populist authoritarianism, and Mark Zuckerberg’s Facebook manifesto Building global community in its own way is also dangerous: private rather than public surveillance, supportive and safe, informed and exposing individuals to new ideas, civically engaged and inclusive. David sees global political ambition to provide an alternative to governments that are failing, based on assumptions about networks and globalisation. David thinks Zuck is correct in some ways, and Fukuyama-ish in promising the end of history, but the end is Facebook rather than liberal democracy; so David opposes it. Zuck’s view of Facebook as social infrastructure, replacing many things government fails to do, is also “community” in an antipolitical sense that excludes most forms of radical or revolutionary politics. In that sense it is one of the most conservative, confining political manifestos ever. Both authoritarianism and Facebook communitarianism depend on the deep surveillance that’s available now; politics is recast as your choice of surveillance system. Facebook’s offering is so much nicer than Trump’s or Duterte’s, which makes it a seductively dangerous option.
Bob Axelrod has been wondering whether machines can become persons. The law already recognises corporations as legal persons, so could a machine become a company? Incorporation is easy enough; corporations can have assets and credit ratings, and can be sued or fined if they break the law. So a university or company could build an autonomous machine, register it as the “Kosher Incorporated Machine”, lend it some money and set it goals such as maximising its longevity in business. It could outsource things it can’t do on its own and acquire assets to work for it, such as autonomous cars to deliver packages. If smart, it would develop subgoals such as respecting the law, providing good customer service and meeting its loan obligations so it could borrow more. Corporations can of course have accidents, as Union Carbide did in Bhopal, and commit crimes, as Volkswagen did. The real question is what regulation might be appropriate.
My talk started off from Ame Elliott’s argument yesterday that “cybersecurity” is an unhelpful and indeed militaristic reframing of what we do. This resonates with the last 25 years of my life, through the crypto wars, the birth of security economics and my book on security engineering. More recently I’ve been working on what safety regulation means in the Internet of Things, with a short video anticipating a paper we have had accepted at WEIS. But how do we shift the language? It’s not enough to denounce the word “cyber”; what else do we need to do? We need a whole countervailing narrative. Most European languages have a single neutral word covering both safety and security (Sicherheit, sûreté, seguridad, sicurezza), but what word should we use in English? What are the stories we need to weave around it, and what’s our campaign strategy? Do we go for elite adoption first, or try to mass-market it?
Bruce Schneier recently built a company to help firms manage incidents, from which he learned that protection and detection alone aren’t enough; you need response too. The response can be high-speed (catching a virus in a link you carelessly clicked on) or slower, but as it slows down the cost in terms of people goes up. The reason is that flexibility requires people; big data was supposed to work by giving us enough data to make decisions with certainty, but it hasn’t worked out that way. McMaster spoke at Harvard about how the Revolution in Military Affairs was supposed to bring certainty to the battlefield, and the USA changed its command structures on that assumption, but it didn’t work out that way either. You never have a good enough plan, so instead of focusing on synchronisation you have to focus on innovation; instead of command you have to focus on delegation. This has direct relevance to the problems we’re worried about here. Rather than automation we need orchestration, with automation playing the correct role, subservient to the people who are in charge. In future we’ll need to think harder about agility when building systems, and look for orchestration of people, process and technology.
Nick Humphrey recalled the Edge question from 2007 about optimism; most people thought the best was yet to come, at least in the arts. Sometimes Nick thinks we may be past our best, thanks to the Internet; creativity comes from having to absorb lots of stuff, internalise it, worry about it and dream about it. New creations come from so much stuff bouncing around inside someone’s head. Children nowadays internalise much less; one student about to get a first at Oxford had not read a book for six years. (He warned this student that he might never get a girlfriend; he now has one, and she’s never read a book either.) The machines that undermine our creativity have every incentive to go on doing it – it’s in their interest to make us dependent. So could we become immune to the Internet’s bad effects, as with an endemic disease? Well, Nick is also interested in the past, in how our minds evolved, and in bizarre phenomena such as the placebo effect. If a shaman’s chant or a sugar pill gets us to marshal our resources, why don’t we just do it anyway? Because you don’t always want to get better as quickly as possible, given the resource drain; things like pain, vomiting and fever are often defences. The immune system is like a natural health management system that tries to forecast risks and allocate resources sensibly. Placebos trick it about the safety of the environment – it is still programmed to assume we’re in 20,000 BC – so we can respond to placebos with effects that are usually strongly positive. However, it is emphatically not the case that our computer environment is more benign than that of 20,000 years ago. Rationally, we should develop nocebo responses – ways of tricking ourselves into believing that things are actually very much worse than they seem. So we should not regard users as enemies, but we still ought to be paternalistic on this; just as doctors debate the ethics of placebos, the ethical debate in information security about scaring users might follow similar lines.
The last speaker at SHB17 was Sophie van der Zee, on how our research can impact society. Blogs work for academia and tweets can spread the news further; Frank’s hackathon between Cambridge, England and Cambridge, Massachusetts gets national media attention and inspires students to compete; and in her SHB 2016 talk Sophie recounted how interviews with lie-detection practitioners found that they learn from TV, not from academic papers. Since then she’s been aiming more at the kind of investigation the practitioners described to her. In short, good research comes from real problems. She’s been looking at how to analyse testimony about a hit and run, and has collected data from mock terrorist attack drills, but has not had time to analyse it as she got married on Saturday. She has produced a booklet summarising research in our field. Finally, she’s researching the Dutch guidelines for getting reimbursed if you’re a fraud victim. Most people say they comply with most rules, except that only 49% say they never let others use their card, and only 22% abide by all five rules. When you stare at the fine print, and specifically all the password rules, only 3.9% of people comply. In other words, a rule book that seems to favour the customer actually favours the bank instead.
Discussion started on creativity: the knowledge has to be all in one head for the connections to be made. Perhaps artistic productivity will decline in line with Internet penetration. Yet television and the telephone have already upset society, changing art rather than trashing it; and the piano was central to everything Beethoven did. The artists of the future will have a very different understanding of the world. And many art forms are physical and emotional rather than purely cerebral. Once autonomous machines come along, many conversations will change; rights and punishments will make little sense. Another topic was whether surveillance capitalism is sustainable (see peak ads for some of the arguments); perhaps we should club together, buy Twitter and turn it into a commons. On speaking about security, the international relations people use “positive security”; however this may just be interpreted as a velvet glove round the mailed fist of the state. On the question of language appropriation there was some discussion and no real consensus; many were unhappy with “cyber” but some had to use it for now as it had become embedded in their discipline. Language change campaigns can have real effect, though: computer scientists’ detestation of the term “ICT” eventually led to a school curriculum change that has boosted the numbers of youngsters starting computer science degrees. Harvey Molotch directed us to look at other disciplines with conflicts of language, such as medicine, where people debate the logic of care versus the logic of choice. Finally he warned us against being too pessimistic about the Internet: fifty years ago, people were pessimistic about advertising, which was supposed to capture our free will and rob us of the power of choice. But that didn’t happen, because the world is just queer.
Somewhere in there is a typo, I think you meant “club together and *buy* Twitter”.
There’s another reason for buying Twitter: pull the plug on Trump’s account and free us from this constant distraction. We need never hear much from him again.
I fixed the bug, but I am not so sure about the censorship…
Correction, June 19th: I had accidentally omitted the liveblog text for the seventh session, so I have edited it in at the start of the text for the eighth session so as to keep it in context.