Bitter Harvest: Systematically Fingerprinting Low- and Medium-interaction Honeypots at Internet Scale

Next week we will present a new paper at USENIX WOOT 2018, in which we show that we can find low- and medium-interaction honeypots on the Internet with a few packets. So if you are running such a honeypot (Cowrie, Glastopf, Conpot etc.), then “we know where you live” and the bad guys might soon as well.

In total, we identify 7,605 honeypot instances across nine different honeypot implementations for the most important network protocols SSH, Telnet, and HTTP.

These honeypots rely on standard libraries to implement large parts of the transport layer, but they were never intended to provide identical behaviour to the systems being impersonated. We show that fixing the identity string pretending to be OpenSSH or Apache and not “any” library or fixing other common identifiers such as error messages is not enough. The problem is that there are literally thousands of distinguishing protocol interactions, part of the contribution of the paper is to show how to pick the “best” one. Even worse, to fingerprint these honeypots, we do not need to send any credentials so it will be hard to tell from the logging that you have been detected.

We also find that many honeypots are deployed and forgotten about because part of the fingerprinting has been to determine how many people are not actively patching their systems! We find that  27% of the SSH honeypots have not been updated within the last 31 months and only 39% incorporate improvements from 7 months ago. It turns out that security professionals are as bad as anyone.

We argue that our method is a  ‘class break’ in that trivial patches cannot address the issue. Thus we need to move on from the current dominant honeypot architecture of python libraries and python programs for low- and medium-interaction honeypots. We also have developed a modified version of the OpenSSH daemon (sshd) which can front-end a Cowrie instance so that the protocol layer distinguishers will no longer work.

The paper is available here.

The two-time pad: midwife of information theory?

The NSA has declassified a fascinating account by John Tiltman, one of Britain’s top cryptanalysts during world war 2, of the work he did against Russian ciphers in the 1920s and 30s.

In it, he reveals (first para, page 8) that from the the time the Russians first introduced one-time pads in 1928, they actually allowed these pads to be used twice.

This was still a vast improvement on the weak ciphers and code books the Russians had used previously. Tiltman notes ruefully that “We were hardly able to read anything at all except in the case of one or two very stereotyped proforma messages”.

Now after Gilbert Vernam developed encryption using xor with a key tape, Joseph Mauborgne suggested using it one time only for security, and this may have seemed natural in the context of a cable company. When the Russians developed their manual system (which may have been inspired by the U.S. work or a German one-time pad developed earlier in the 1920s) they presumably reckoned that using them twice was safe enough.

They were spectacularly wrong. The USA started Operation Venona in 1943 to decrypt messages where one-time pads had been reused, and this later became one of the first applications of computers to cryptanalysis, leading to the exposure of spies such as Blunt and Cairncross. The late Bob Morris, chief scientist at the NSA, used to warn us enigmatically of “The Two-time pad”. The story up till now was that the Russians must have reused pads under pressure of war, when it became difficult to get couriers through to embassies. Now it seems to have been Russian policy all along.

Many people have wondered what classified war work might have inspired Claude Shannon to write his stunning papers at the end of WW2 in which he established the mathematical basis of cryptography, and of information more generally.

Good research usually comes from real problems. And here was a real problem, which demanded careful clarification of two questions. Exactly why was the one-time pad good and the two-time pad bad? And how can you measure the actual amount of information in an English (or Russian) plaintext telegram: is it more or less than half the amount of information you might squeeze into that many bits? These questions are very much sharper for the two-time pad than for rotor machines or the older field ciphers.

That at least was what suddenly struck me on reading Tiltman. Of course this is supposition; but perhaps there are interesting documents about Shannon’s war work to be flushed out with freedom of information requests. (Hat tip: thanks to Dave Banisar for pointing us at the Tiltman paper.)

Graphical Models of Security (GraMSec 2018)

I was at The Fifth International Workshop on Graphical Models for Security (part of FLoC 2018) this weekend where I presented a paper. Following is a summarized account of the talks that took place there. Slides can be found here.

The first speaker was Mike Fisk who was giving an invited talk on Intrusion Tolerance in Complex Cyber Systems. Mike started off by elaborating the differences in the construction of physically secure systems such as forts versus the way software engineers go about creating so-called secure systems. He then made the case for thinking in terms of intrusion tolerance rather than just intrusion resistance – even if an intruder gets in, your system should be designed in such a way that it impedes the intruder’s exploration of your network. He then instantiated this idea by talking about credentials for accessing network resource and how they’re stored. He noted that normal users (with the notable exceptions of sysadmins) show predictable access patterns whereas attackers show wildly different access patterns; an intrusion tolerant system should take these into account and ask for re-authentication in case of abnormal patterns. He then talked about metrics for figuring out which nodes in a network are most interesting to an attacker. While some of these are expected – say, the ActiveDirectory server – others are quite surprising such as regular desktops with very high network centrality. He concluded by giving advise on how to use these metrics to direct resources for intrusion resistance most effectively.

Sabarathinam Chockalingam gave a talk on using Bayesian networks and fishbone diagrams to distinguish between intentional attacks and accidental technical failures in cyber-physical systems. His work focused specifically on water level sensors used in floodgates. He first gave an introduction to fishbone diagrams highlighting their salient features such as the ability to facilitate brainstorming sessions while showcasing all the relevant factors in a problem. He then presented a way to translate fishbone diagrams into Bayesian networks. He utilized this technique to convert the risk factor fishbone diagram for the water level sensors into a Bayesian network and generated some predictions. These predictions were mostly based on expert knowledge and literature review. He concluded by pointing at some possible future research directions primary of which was exploring the conversion of fishbone diagrams into conditional probability tables.

I gave a talk on visualizing the diffusion of stolen bitcoins. This works builds upon our previous work on applying the FIFO algorithm to tainting bitcoins, presented at SPW2018. Here, I focused on the challenges facing effective visualization of the tainting dataset. I highlighted the size of the dataset (>450 GB for just 56 kinds of taint), the unbounded number of inputs and outputs as well as the unbounded number of hops a satoshi can take. All these make visualization without abstraction challenging. We refused to use lossy abstractions since what is interesting to the user might be something that we abstract away. Instead, we made two prototypes that, for the most part, convey the underlying information in an accessible manner to the end-user without using any abstractions. The first provides a static map of the taint-graph, useful for getting a global view of the graph; the second provides an interactive way to explore individual transactions. I concluded by pointing out that this is a much more general problem since what we are trying to do is visualize a large subset of transactions in a massive dataset – something that is encountered in many other domains.

Ross Horne presented a specialization of attack trees where he took into consideration of an attacker about the underlying system that they are trying to compromise. He pointed out that existing attack trees assume perfect knowledge on the part of the attacker whereas this is not realistic. The attacker often acts under uncertainty. To model this, he introduced a new operator to act between branches of an attack tree that conveys ignorance on the effectiveness and possible outcomes in case the attacker chooses to traverse that sub-tree. He then introduced a way of reasoning about the specialization of such trees and showed how the placement of the newly introduced operator has varying impact on the capabilities of the attacker. He concluded by remarking how these new attack trees could be used for moving target defence.

Harley Eades III gave a talk on applying linear logic to attack trees. He started off by pointing out that when understanding the difficulty of execution of an attack, we only care about the weights assigned to the leaves of the tree, the root nodes only serve as combinatorial operators. He then presented an exhaustive list of operators and provided a representation to convert attack trees into linear logic statements. He then introduced Maude, a quarternary semantics of attack trees followed by the introduction of Lina, an embedded domain specific programming language. Lina is used to do automated reasoning about attack trees using Maude. He presented Lina’s functionalities and showed an example application of Lina: automated threat analysis. He concluded by talking about future work conjecturing different formal models of causal attack trees specifically mentioning a petri net model.

Responsible vulnerability disclosure in Europe

There is a report out today from the European economics think-tank CEPS on how responsible vulnerability disclosure might be harmonised across Europe. I was one of the advisers to this effort which involved not just academics and NGOs but also industry.

It was inspired in part by earlier work reported here on standardisation and certification in the Internet of Things. What happens to car safety standards once cars get patched once a month, like phones and laptops? The answer is not just that safety becomes a moving target, rather than a matter of pre-market testing; we also need a regime whereby accidents, hazards, vulnerabilities and security breaches get reported. That will mean responsible disclosure not just to OEMs and component vendors, but also to safety regulators, standards bodies, traffic police, insurers and accident victims. If we get it right, we could have a learning system that becomes steadily safer and more secure. But we could also get it badly wrong.

Getting it might will involve significant organisational and legal changes, which we discussed in our earlier report and which we carry forward here. We didn’t get everything we wanted; for example, large software vendors wouldn’t support our recommendation to extend the EU Product Liability Directive to services. Nonetheless, we made some progress, so today’s report can be seen a second step on the road.

Raising a new generation of cyber defenders

 

Over the past few years we launched and ran two university-level hacking competitions in  order to attract bright students to our field, with the long term goal of addressing the skills gap in cyber security.

Analysts estimate that, globally, over the next few years, in the field of cyber security there will be a gap of over a million people between the positions that need filling and the people with the skills to fill those positions.

In 2015 we founded the international Cambridge2Cambridge cyber security challenge, in collaboration with MIT CSAIL, which first took place at MIT, and then in 2016 the UK-level Inter-ACE among the UK ACE-CSRs, which first took place at the University of Cambridge. The Inter-ACE has now expanded beyond the ACEs and the C2C admits university students from anywhere in the world. None of this would have been possible without strong cooperation between academia, government and industry. We are grateful to our many supporters, who are all credited in the report.

After three years, my precious collaborators Graham Rymer and Michelle Houghton have moved on to new jobs and it is time for someone else to pick up the torch. To help our successors, today we publish a comprehensive technical report distilling our experience running these events for the past three years. We wrote it for all those who share
our vision and goals and who wish to take these competitions forward: we hope they will find it useful and it will help them make future editions even better. It contains a detailed chronicle of what we did and an extensive list of lessons learnt. Attendees of the Security and Human Behavior 2018 workshop will have heard me speak about some of the associated challenges, from fostering cooperation to redressing gender balance to preventing cheating, with detours into Japanese swordsmanship and Plato.

The extensive appendices contain a wealth of training material including write-ups of our practice CTFs and of the Inter-ACE 2018 for which we developed the problems in-house, as well as the latest course notes for the binary reverse engineering training seminar that we ran in Cambridge several times over the years, initially for our own students and then for hundreds of ACE-CSR participants.

We hope you will enjoy our report and that it will inspire you to contribute to future events in this series, whether as a participant, host or supporting institution, and keep the momentum going.

Frank Stajano, Graham Rymer, Michelle Houghton. “Raising a new generation of cyber defenders—The first three years of the Cambridge2Cambridge and Inter-ACE cyber security competitions”. University of Cambridge Technical Report UCAM-CL-TR-922, June 2018, 307 pages. http://www.cl.cam.ac.uk/techreports/UCAM-CL-TR-922.pdf

 

Third Annual Cybercrime Conference

The Cambridge Cybercrime Centre is organising another one day conference on cybercrime on Thursday, 12th July 2018.

We have a stellar group of invited speakers who are at the forefront of their fields:

They will present various aspects of cybercrime from the point of view of criminology, policy, security economics, law and industry.

This one day event, to be held in the Faculty of Law, University of Cambridge will follow immediately after (and will be in the same venue as) the “11th International Conference on Evidence Based Policing” organised by the Institute of Criminology which runs on the 10th and 11th July 2018.

Full details (and information about booking) is here.

Hiring for the Cambridge Cybercrime Centre

We have three open positions in the Cambridge Cybercrime Centre: https://www.cambridgecybercrime.uk.

We wish to fill at least one of the three posts with someone from a computer science, data science, or similar technical background.

BUT we’re not just looking for computer science people: to continue our multi-disciplinary approach, we wish to fill at least one of the three posts with someone from a criminology, sociology, psychology or legal background.

Details of the posts, and what we’re looking for are in the job advert here: http://www.jobs.cam.ac.uk/job/17827/.

Bitcoin Redux: crypto crime, and how to tackle it

Bitcoin Redux explains what’s going wrong in the world of cryptocurrencies. The bitcoin exchanges are developing into a shadow banking system, which do not give their customers actual bitcoin but rather display a “balance” and allow them to transact with others. However if Alice sends Bob a bitcoin, and they’re both customers of the same exchange, it just adjusts their balances rather than doing anything on the blockchain. This is an e-money service, according to European law, but is the law enforced? Not where it matters. We’ve been looking at the details.

In March we wrote about how to trace stolen bitcoin, describing new tools that enable us to track crime proceeds on the blockchain with more precision than before. We waited for victims of bitcoin theft and fraud to come to us, so we could test our tools on real cases. However in most of them it was not clear that the victims had ever owned any bitcoin at all.

There are basically three ways you could try to hold a bitcoin. You could buy one from an exchange and get them to send it to a wallet you host yourself, but almost nobody does that.

You could buy one from an exchange and get the exchange to keep the keys for you, so that the asset was unique to you and they were only guarding it for you – just like when you buy gold and the bullion merchant then charges you a fee to guard your gold in his vault. If the merchant goes bust, you can turn up at the vault with your receipt and demand your gold back.

Or you could buy one from an exchange and have them owe you a bitcoin – just as when you put your money in the bank. The bank doesn’t have a stack of banknotes in the vault with your name on it; and if it goes bust you have to stand in line with the other creditors.

It seems that most people who buy bitcoin think that they’re operating under the gold merchant model, while most exchanges operate under the bank model. This raises a whole host of issues around solvency, liquidity, accounting practices, money laundering, risk and trust. The details matter, and the more we look at them, the worse it seems.

This paper will appear at the Workshop on the Economics of Information Security later this month. It contains eight recommendations for what governments should be doing to clean up this mess.