Currently the performance of the Tor anonymity network is quite poor. This problem is frequently stated as a reason for people not using anonymizing proxies, so improving performance is a high priority for their developers. There are only about 1 000 Tor nodes and many are on slow Internet connections, so in aggregate there is about 1 Gbit/s shared between 100 000 or so users. One way to improve the experience of Tor users is to increase the number of Tor nodes (especially high-bandwidth ones). Some means to achieve this goal are discussed in Challenges in Deploying Low-Latency Anonymity, but here I want to explore what will happen when Tor’s total bandwidth increases.
If Tor’s bandwidth doubled tomorrow, the naïve hypothesis is that users would experience twice the throughput. Unfortunately this is not true, because it assumes that the number of users does not vary with the available bandwidth. In fact, as the supply of the Tor network’s bandwidth increases, there will be a corresponding increase in the demand for bandwidth from Tor users. This applies equally to other networks, but for the purposes of this post, I’ll use Tor as an example. Simple economics shows that the performance of Tor is controlled by how the number of users scales with available bandwidth, which can be represented by a demand curve.
I don’t claim this is a new insight; in fact between me starting this draft and now, Andreas Pfitzmann made a very similar observation while answering a question following the presentation of Performance Comparison of Low-Latency Anonymisation Services from a User Perspective at the PET Symposium. He said, as I recall, that the performance of the anonymity network is the slowest tolerable speed for people who care about their privacy. Despite this, I couldn’t find anyone who had written a succinct description anywhere, perhaps because it is too obvious. Equally, I have heard the naïve version stated occasionally, so I think it’s helpful to publish something people can point at. The rest of this post will discuss the consequences of modelling Tor user behaviour in this way, and the limitations of the technique.
[ R source code ]
The figure above is the typical supply and demand graph from economics textbooks, except with long-term throughput per user substituted for price and number of users substituted for quantity of goods sold. Also, it is inverted, because users prefer higher throughput, whereas consumers prefer lower prices. Similarly, as the number of users increases, the bandwidth supplied by the network falls, whereas suppliers will produce more goods if the price is higher. In drawing the supply curve, I’ve assumed the network’s bandwidth is constant and shared equally over as many users as needed. The shape of the demand curve is much harder to even approximate, but for the sake of discussion, I’ve drawn three alternatives. We will return to these assumptions later. The number of Tor users and the throughput they each get is the intersection between the supply and demand curves — the equilibrium. If the number of users is below this point, more users will join and the throughput per user will fall to the lowest tolerable level. Similarly, if the number of users is too high, some will be getting lower throughput than their minimum, so will give up, improving the network for the rest of the users.
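To make the equilibrium concrete, here is a minimal Python sketch. The equal-share supply curve follows the assumption stated above; the linear demand curve and its slope (`users_per_kbit`) are purely hypothetical, chosen only so that about 1 Gbit/s of capacity settles at roughly 100 000 users. All function names are illustrative.

```python
def supply_throughput(total_bandwidth_kbit, users):
    """Equal-share supply curve: constant bandwidth split over all users."""
    return total_bandwidth_kbit / users


def demand_users(throughput_kbit, users_per_kbit=10_000.0):
    """HYPOTHETICAL linear demand curve: how many users tolerate this throughput.

    The slope is an assumption for illustration, not a measured property of Tor.
    """
    return users_per_kbit * throughput_kbit


def equilibrium(total_bandwidth_kbit, demand=demand_users, lo=1.0, hi=1e9):
    """Bisect on the user count until demand at the supplied throughput matches it.

    Assumes demand does not decrease as throughput rises, so the excess
    demand(supply(n)) - n is positive below the equilibrium and negative above.
    """
    for _ in range(200):
        mid = (lo + hi) / 2.0
        excess = demand(supply_throughput(total_bandwidth_kbit, mid)) - mid
        if excess > 0:
            lo = mid  # more users would still join
        else:
            hi = mid  # some users would give up
    users = (lo + hi) / 2.0
    return users, supply_throughput(total_bandwidth_kbit, users)


users, throughput = equilibrium(1_000_000.0)  # about 1 Gbit/s, in kbit/s
print(f"{users:.0f} users at {throughput:.1f} kbit/s each")
```

Any other non-decreasing demand function can be passed as `demand` to see how the intersection moves; the bisection only relies on the excess demand changing sign once.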
Now let’s assume Tor’s bandwidth grows by 50%, so the supply curve shifts, as shown in the figure. By comparing how the equilibrium moves, we can see how the shape of the demand curve affects the performance improvement that Tor users see. If the number of users is independent of performance, shown in curve A, then everyone gets a 50% improvement, which matches the naïve hypothesis. More realistically, the number of users increases, so the performance gain is less, and the shallower the curve gets, the smaller the performance increase will be. For demand curve B, there is an 18% increase in the number of Tor users and a 27% increase in throughput; whereas with curve C there are 33% more users and so only a 13% increase in throughput for each user.
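The percentages quoted for the three curves follow directly from the equal-sharing assumption: per-user throughput scales by the bandwidth factor divided by the user-growth factor. A short check, where the function name `throughput_gain` is just an illustrative helper:

```python
def throughput_gain(bandwidth_factor, user_factor):
    """Fractional change in per-user throughput under equal sharing."""
    return bandwidth_factor / user_factor - 1.0


# User growth for each demand curve, as quoted in the text.
for label, user_growth in [("A", 0.00), ("B", 0.18), ("C", 0.33)]:
    gain = throughput_gain(1.5, 1.0 + user_growth)
    print(f"curve {label}: {user_growth:.0%} more users, "
          f"{gain:.0%} more throughput per user")
# curve A: 0% more users, 50% more throughput per user
# curve B: 18% more users, 27% more throughput per user
# curve C: 33% more users, 13% more throughput per user
```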
In an extreme case where the demand curve points down (not shown), as the network bandwidth increases, performance for users will fall. Products exhibiting this type of demand curve, such as designer clothes, are known as Veblen goods. As the price increases, their value as status symbols grows, so more people want to buy them. I don’t think it is likely to be the case with Tor, but there could be a few users who might think that the slower the network is, the better it is for anonymity.
To keep the explanation simple, I’ve made quite a few assumptions, some more reasonable than others. For the supply curve, I assume that all of Tor’s bandwidth goes into servicing user requests, it is shared fairly between users, there is no overhead when the number of Tor clients grows, and the performance bottleneck is the network, not clients. I don’t think any of these are true, but the difference between the ideal case and reality might not be significant enough to nullify the analysis. The demand curves are basically guesswork; it’s unlikely that the true one is as nicely behaved as the ideal ones shown. It will more likely be a combination of the different classes, as different user communities become relevant.
I glossed over the aspect of reaching equilibrium — in fact it could take some time between the network bandwidth changing and the user population reaching stability. If this period is sufficiently long and network bandwidth is sufficiently volatile it might never reach equilibrium. I’ve also ignored effects which shift the demand curve. In normal economics, marketing makes people buy a product even though they considered it too expensive. Similarly, a Slashdot article or news of a privacy scandal could make Tor users more tolerant of the poor performance. Finally, the user perception of performance is an interesting and complex topic, which I’ve not covered here. I’ve assumed that performance is equivalent to throughput, but actually latency, packet loss, predictability, and their interaction with TCP/IP congestion control are important components too.
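The approach to equilibrium can be sketched as a simple adjustment process. Everything in this toy model is an assumption for illustration: the linear demand curve and its slope `users_per_kbit`, the `adjust_rate` (the fraction of the gap between willing and actual users that closes per time step), and the starting population.

```python
def simulate_adjustment(total_bandwidth_kbit, start_users, steps=200,
                        adjust_rate=0.1, users_per_kbit=10_000.0):
    """Track the user population as it drifts towards equilibrium.

    HYPOTHETICAL model: each step, the population moves a fixed fraction
    of the way towards the number of users willing to tolerate the
    current equal-share throughput.
    """
    users = float(start_users)
    history = [users]
    for _ in range(steps):
        throughput = total_bandwidth_kbit / users   # equal-share supply
        willing = users_per_kbit * throughput       # assumed linear demand
        users += adjust_rate * (willing - users)    # partial adjustment
        history.append(users)
    return history


# Bandwidth jumps by 50% while the population starts at the old equilibrium.
history = simulate_adjustment(1_500_000.0, start_users=100_000)
print(f"after 10 steps: {history[10]:.0f} users; "
      f"after 200 steps: {history[-1]:.0f} users")
```

In this toy model a small `adjust_rate` gives a slow, smooth approach, while a rate above 0.5 overshoots the equilibrium and oscillates, which is one way the population could fail to settle if bandwidth also keeps changing.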
In summary, I’ve shown how the relationship between network bandwidth and user-perceived performance is more subtle than it might at first seem. The dominant factor behind Tor’s performance is the number of potential users who are willing to tolerate a certain throughput. Until this relationship is better understood, it remains unclear how much faster Tor will become as the network grows. It would be an interesting research project to establish the shape of the supply and demand curves, through modelling Tor’s scalability and predicting user tolerance. For the latter quantity, it might be more helpful to consider tolerance in terms of latency rather than throughput, which would lead to a non-inverted supply and demand chart. However, the relationship between the number of network users and latency is even less clear than the relationship with throughput. Finally, the application of more advanced economic techniques could give more insight than the rudimentary approach discussed here.
This reminds me of a discussion going on here in Vancouver, BC, Canada with respect to building new bridges and expanding the capacity of the highway and road network. Some are arguing that this will not result in a decrease in congestion because it encourages more private vehicle use. They say the money should be put into public transit instead …
I didn’t read all that long and complicated stuff, but were you just trying to say that Tor’s performance is a case of Jevons paradox?
@R
Jevons paradox is a special case of the demand/supply model where a fall in the price of a good, caused by a rise in conversion efficiency, results in a rise in demand high enough to increase the amount of raw good consumed. In my model of Tor I’ve assumed the total bandwidth available is static, so Jevons paradox doesn’t apply.
“If Tor’s bandwidth doubled tomorrow, the naïve hypothesis is that users would experience twice the throughput. Unfortunately this is not true, because it assumes that the number of users does not vary with bandwidth available. In fact, as the supply of the Tor network’s bandwidth increases, there will be a corresponding increase in the demand for bandwidth from Tor users.”
This sounded a lot like Jevons paradox to me.
It’s a long, long time since I had to get involved with studying “Coal Supply” but I do vaguely remember Jevons paradox.
One explanation we were given (as students) for it was that as the effective cost of the “good” came down due to more efficient utilisation of it, it became viable for others to use where it had previously been too expensive. It was further pointed out that demand usually does not go up in proportion to the effective reduction in cost but according to a power law, so you effectively get a watershed point after which Jevons paradox applies.
You sometimes hear this referred to as the “cost of entry” or “cost of technology switch” or even “cost of (S) curve jump” effects (by marketing/management-speak persons 😉) when applied to new technology take-up.
On another note, with regard to increasing “efficiency”, it generally has an inverse relationship to security…
That is, the more efficiently a resource is utilized, the more likely it is to leak information about its usage. This is seen in practice with things like timing and side channel attacks.
An example of this is implementations of AES being “unrolled”, which causes timing differences due to the CPU caching involved. The resulting changes due to cache hits can be seen on the network and, as has been demonstrated, the secret key can be extracted from the timing differences.
Various papers have shown that Tor and equivalents are prone to this sort of effect. Which raises the question:
“If the current quite poor performance of the Tor anonymity network is improved such that it reaches a take-up watershed, will the security decrease proportionately to the usage or worse?”
First off, interesting post. I’m glad there is a succinct version of this argument.
A few comments on the comments: This is not a case of Jevons paradox and perhaps the best way to understand this is to consider what would make it apply. If Tor were to change its protocol and use more efficient cryptography to do simple, repetitive tasks like circuit building (maybe use Pairing-based crypto, as suggested at PET), the speed to process a user’s data would hypothetically increase. In other words, Tor’s efficiency is being increased.
W.S. Jevons (who btw came an inch away from inventing RSA a hundred years before) originally would have predicted that this would result in less demand for Tor. If I can do all my web browsing in a fraction of the time it used to take, then I will be using Tor less. However, Jevons soon realized, as Steven points out in his article, that this assumes the user base does not change. First, I might surf longer with Tor since it’s faster. And second, new users who would have considered it too slow before will start using it. And so increasing the efficiency of Tor has the exact same effect as giving it more bandwidth (and perhaps an equally articulated fallacy in the PET community is that if we could just increase the efficiency of Tor, with crypto improvements or whatever, it will make Tor faster).
This doesn’t mean that increases in bandwidth or efficiency are valueless. What is being increased is the number of users who can use the technology.
Also, I wanted to mention that the comparison to transportation infrastructure is a good one. However, for what it’s worth, roads are an even worse case of the diminishing returns of throwing more bandwidth at the problem because roads also exhibit a second property: peak-load. If you think about it, roads sit nearly empty for most of the day and all night. It is terrible economic efficiency to have a resource be vacant for most of the day, and completely congested for a few hours in the morning and late afternoon.