Papers tagged ‘Information leakage’

k-Indistinguishable Traffic Padding in Web Applications

A Web application is a piece of software whose execution is split between the client and the server, and it sends data between the two sides every time you do something significant—even, perhaps, on every single keystroke. Inevitably, the volume of data has some correlation with the action performed. Since most Web applications are offered to the general public at low or no cost, an adversary can map out much of a target application's internal state machine and develop a model that lets them deduce what a user is doing from an observed sequence of packets. [1] [2] This has been successfully applied (in the lab) to Web applications for highly sensitive tasks, such as tax preparation and medical advice.

The only known solution to this problem is to pad the amount of data passed over the network in each direction so that the adversary cannot learn anything about the state of the web application from an observed sequence of packets. Unfortunately, the obvious way to do this involves unacceptably high levels of overhead—this paper quotes [1] as observing 21074% overhead for a well-known online tax system. Today’s paper proposes that techniques from the field of privacy-preserving data publishing (PPDP, [3]) be brought to bear on the problem.
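To fix ideas before going further, here is a toy sketch, in Python, of what k-indistinguishable padding means in practice (this is my own illustration, not the paper's algorithm, and the response sizes are invented): group the possible response sizes into clusters of at least k and pad every response in a cluster up to the cluster maximum, so that any size an eavesdropper observes corresponds to at least k different application states.

    def k_indistinguishable_padding(sizes, k):
        """Greedily group the distinct sizes into clusters of at least k and pad
        every member of a cluster up to the cluster maximum. Returns a dict
        mapping true size -> padded size. A toy heuristic only; finding the
        optimal grouping is the hard part (see below)."""
        ordered = sorted(set(sizes))
        clusters = [ordered[i:i + k] for i in range(0, len(ordered), k)]
        if len(clusters) > 1 and len(clusters[-1]) < k:
            clusters[-2].extend(clusters.pop())  # fold a short tail into the previous cluster
        return {s: max(cluster) for cluster in clusters for s in cluster}

    # Invented per-state response sizes, in bytes:
    sizes = [310, 320, 350, 900, 905, 1400]
    plan = k_indistinguishable_padding(sizes, k=2)
    overhead = sum(plan[s] - s for s in sizes) / sum(sizes)
    print(plan)                        # each padded size now hides at least k = 2 true sizes
    print(f"overhead: {overhead:.1%}")

Even on this tiny example the greedy grouping makes questionable choices (350 gets padded all the way to 900); doing better than that is exactly where the paper's machinery comes in.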

The paper is mathematically heavy going and I’m not going to go into details of their exact proposal, because it’s really just a starting point. Instead, I’m going to pull out some interesting observations:

  • More padding is not always better. They show an example case where padding messages to a multiple of 128 bytes is exactly as good as padding them to a multiple of 512 bytes, and padding them to a multiple of 520 bytes is worse. (A toy illustration of how that can happen appears just after this list.)

  • Computing the optimal amount of padding for a web application (as they define it) is NP-hard even with complete, ideal knowledge of its behavior. However, there are reasonably efficient and speedy approximations—if I understood them correctly, it might even be feasible to apply one of their algorithms on the fly with each client.

  • Because optimality is defined over an entire sequence of user interactions with an application, knowing how much padding to send in response to any given request may require the server to predict the future. Again, however, there are approximations that avoid this problem.

  • PPDP offers a whole bunch of different ways to model the degree to which one has mitigated the information leak: k-anonymity is the model they spend the most time on, but there's also l-diversity, (α, k)-anonymity, t-closeness, etc. etc. I don't follow that literature but I have the impression people are still actively making up new models and haven't spent a whole lot of time on figuring out which is best in which situations.
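Here is the kind of thing the first bullet is talking about, as a toy Python calculation (my own made-up sizes, not the example from the paper): with these three response sizes, rounding up to a multiple of 128 and rounding up to a multiple of 512 produce the same grouping of indistinguishable responses, while rounding up to a multiple of 520 separates sizes that the smaller blocks had merged, so the eavesdropper learns more despite the extra padding.

    import math

    def pad_to_multiple(sizes, block):
        """Round each size up to the next multiple of `block`."""
        return [math.ceil(s / block) * block for s in sizes]

    sizes = [400, 1030, 1045]   # invented response sizes
    for block in (128, 512, 520):
        padded = pad_to_multiple(sizes, block)
        print(block, padded, "distinguishable sizes:", len(set(padded)))
    # 128 -> [512, 1152, 1152]   2 distinguishable sizes
    # 512 -> [512, 1536, 1536]   2 distinguishable sizes (same grouping, more bytes)
    # 520 -> [520, 1040, 1560]   3 distinguishable sizes (bigger block, worse privacy)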

The great failing of this and all the other papers I've ever read in this area is operationalization. This paper's algorithms need a complete, or nearly complete, map of all possible sequences of states that a Web application can enter, and the authors shrug off the question of how you create this map in the first place. (I very strongly suspect that task can't be mechanized in the general case—it will turn out to be the Halting Problem in another of its many disguises.) They also don't seem to realize that things like Google search suggestions (the running example in the paper) are constantly changing. (Of course, that makes life harder for the adversary too.) This doesn't mean there is no hope; it only means you need more approximations. I think it ought to be possible to build good-enough automatic padding into CMSes, and maybe even into web application frameworks. Trying that should be a priority for future research.

I Know What You’re Buying: Privacy Breaches on eBay

eBay intends not to let anyone else figure out what you're in the habit of buying on the site. Because of that, lots of people consider eBay the obvious place to buy things you'd rather your neighbors not know you bought (a survey in this very paper confirms as much). However, this paper demonstrates that a determined adversary can figure out what you bought anyway.

(Caveat: This paper is from 2014. I do not know whether eBay has made any changes since it was published.)

eBay encourages both buyers and sellers to leave feedback on each other, the idea being to promote fair dealing by attaching a persistent reputation to everyone. Feedback is associated with specific transactions, and anyone (whether logged into eBay or not) can see each user's complete feedback history. Items sold are visible, items bought are not, and buyers' identities are obscured. The catch is, you can match up buyer feedback with seller feedback by the timestamps, using the obscured buyer identities as a disambiguator, and thus learn what was bought. It involves crawling a lot of user pages, but it's possible to do this in a couple of days without coming to eBay's notice.
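The linking step itself is simple enough to sketch in a few lines of Python (this is my own illustration of the idea, not the authors' crawler; the data formats are invented):

    def link_purchases(buyer_feedback, seller_feedback):
        """buyer_feedback: (timestamp, obscured_alias) pairs scraped from one buyer's
        feedback page. seller_feedback: (timestamp, obscured_alias, item_title)
        entries scraped from sellers' pages. Returns the items that match uniquely."""
        purchases = []
        for ts, alias in buyer_feedback:
            candidates = [item for (ts2, alias2, item) in seller_feedback
                          if ts2 == ts and alias2 == alias]
            if len(candidates) == 1:       # unique timestamp + alias match
                purchases.append(candidates[0])
        return purchases

    print(link_purchases(
        [("2014-03-02 18:41", "j***n")],
        [("2014-03-02 18:41", "j***n", "home pregnancy test"),
         ("2014-03-02 18:41", "a***e", "phone case")]))
    # ['home pregnancy test']

The real attack has to cope with coarse timestamps, collisions, and the sheer volume of pages to crawl, but the principle is no more complicated than this.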

They demonstrate that this is a serious problem by identifying users who purchased gun holsters (eBay does not permit the sale of actual firearms), pregnancy tests, and HIV tests. As an additional fillip, they show that people commonly use the same handle on eBay as on Facebook, so purchase histories can be correlated with all the personal information one can typically extract from a Facebook profile.

Solutions are pretty straightforward and obvious—obscured buyer identities shouldn’t be correlated with their real handles; feedback timestamps should be quantized to weeks or even months; feedback on buyers might not be necessary anymore; eBay shouldn’t discourage use of various enhanced-privacy modes, or should maybe even promote them to the default. (Again, I don’t know whether any of these solutions has been implemented.) The value of the paper is perhaps more in reminding website developers in general that cross-user correlations are always a privacy risk.

Students Who Don’t Understand Information Flow Should Be Eaten: An Experience Paper

On a completely different and much lighter note from Wednesday, today we’re going to look at a paper about teaching students information-flow theory with Werewolf.

Werewolf (also known as Mafia) is a game in which the players are divided into two groups, the werewolves (or mafiosi) and the villagers, and each is trying to wipe the other out at a rate of one murder per turn. There are a whole bunch of variants. Normally this is played around a table, as a game of strategic deception: only the werewolves know who the werewolves are, and they participate as villagers on the villagers’ turn. In this paper, it becomes a game played online, of surveillance and countersurveillance: the villagers are actively encouraged to exploit information leaks in the game server and discover who the werewolves are. (In a normal game this would be cheating.)

The authors don’t report how this teaching method compares to traditional lectures on any quantitative basis (e.g. final exam scores, class grades). However, they do say that the students loved the exercise and met up outside of official class hours to discuss strategy and plan their moves, and that over the term the exploits and countermeasures grew steadily more sophisticated, in some cases requiring adjustments to the game server to ensure that both sides could still win. It’s hard to imagine this level of student engagement not leading to better grades, better retention, and deeper interest in the material.

I think this is a brilliant idea and not just for teaching information flow. One of the hardest things to teach in a security course is what Bruce Schneier calls the security mindset: intentionally thinking about how a system can be caused to fail, to do something it’s not intended to do. In particular, it is in tension with the usual engineering mindset, which focuses on verifying that something works when used as designed. (Safety engineering and failure analysis have more of the security mindset about them, but usually focus on failures due to accident rather than malice.) But it is exactly how you need to think to successfully cheat at a game, or to notice when someone else is cheating—and in that context, it’s going to be familiar to most college students. Using a game of strategic deception as the backdrop will also encourage students to think along the right lines. I’d like to see this idea adapted to teach other challenging notions in security—penetration testing is the obvious one, but maybe also code exploitation and key management?

Robust and Efficient Elimination of Cache and Timing Side Channels

We’re closing out Unpublished Results Week here with a paper that was submitted to CCS 2015 last May, but doesn’t appear to have made it.

I’ve talked about side channels a little before, but let me repeat why time and cache channels keep coming up and why they’re hard to get rid of: they’re side effects of hardware-design choices that, most of the time, are exactly what you want, if you want your computer to go fast. Think about long versus short division, for instance. It takes less mental effort, and less scratch paper, to manually divide 123456789 by 3 than by 29; the same is true (sort of) for your computer. So if it’s your job to design the CPU, why shouldn’t the integer divide instruction finish quicker in the former case? Especially if you have profiling information that says that most of the time integer divide gets used with small divisors. Memory caches are even more vital—the other day I saw a graph suggesting that it takes on the order of 17,000 clock cycles for the CPU to come all the way back up to full number-crunching speed after a context switch; most of that is probably due to stuff falling out of the cache and having to be pulled back in.

So: side channels are hard to get rid of in hardware because that would involve making design tradeoffs that would hurt performance for the supermajority of code that isn’t crunching sensitive data, and they’re hard to get rid of in software because the hardware, and the compiler, are actively fighting you. (The paper says several times that side-channel-free cryptographic primitives pretty much have to be done in hand-written assembly language, and that getting them right is extremely difficult even by the standards of people who write assembly language by hand.) Wouldn’t it be nice to find a big hammer that would make the problem just … go away? That’s this paper.

A principal design goal for the big hammer in this paper is ease of use. The only thing the application programmer has to do is label protected functions that need to have their side channels eliminated. And it’s not a compiler paper; they haven’t come up with a way to make a compiler do the extremely difficult assembly tuning for you. All the magic happens at runtime. Given all that, it works pretty much the only way it could work: it measures how much the timing varies and then it pads out the fast cases to the same length as the slow cases. It also flushes state from the protected function out of as many CPU caches as possible (memory, TLB, branch target buffer, …) after the function is done, and takes steps (page coloring, CPU and memory pinning, realtime scheduling priority) to ensure that the protected function is not perturbed by what malicious code has done before it starts executing. The authors demonstrate reliability by testing several different crypto primitives that definitely do have side-channel information leaks, and showing that the leak is rendered unavailable in all cases. They also claim the overhead is acceptable (8.5%).
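The time-padding half of that is easy to sketch. Here is a minimal Python illustration of the idea (mine, not the paper's code), assuming we already have a trustworthy worst-case duration for the protected function; the real system also does the cache flushing, page coloring, and pinning described above, none of which a user-space sketch like this can reproduce:

    import time

    def run_padded(protected_fn, worst_case_seconds, *args, **kwargs):
        """Run protected_fn, then busy-wait until worst_case_seconds have elapsed,
        so the caller-visible duration does not depend on the secret inputs."""
        start = time.perf_counter()
        result = protected_fn(*args, **kwargs)
        # Busy-wait rather than sleep: sleep granularity and wake-up jitter would
        # themselves leak timing information.
        while time.perf_counter() - start < worst_case_seconds:
            pass
        return result

Getting worst_case_seconds right, and making sure the padding loop itself has no secret-dependent behavior, is where all the difficulty actually lives.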

I really like the concept and hope to see more in this vein in the future. I can’t speculate on why this particular paper didn’t get accepted at CCS, but I do feel that the overhead measurements were inadequate: they don’t consider indirect costs of flushing caches and pinning stuff to CPUs all the time. I’d like to see whole-system benchmarking of, say, a shared-hosting HTTPS server. And at least one round of adversarial testing would be nice. It is one thing to demonstrate that your measurements show that the side channels you already knew about have been closed; it is another thing if someone who spends all their time thinking of side-channel attacks cannot find any in your system.

Limitations of End-to-End Encryption in Secure Computer Networks

Today we’re going to go back in time, all the way to the dawn of computer networks. When this technical report was filed, the largest operational internetwork was still called ARPAnet and it still ran on NCP; people were still talking about hosts and communications subnetwork processors as if they were two different physical devices, and about security levels as if those were the only possible way to conceptualize access control; and I was five months old.

(I apologize for the poor quality of the linked PDF—to the best of my knowledge, this is the only version to be found online. Also, if anyone knows D.W. Snow’s first name, please tell me.)

To the best of my knowledge, this is the very first published article to discuss the things that end-to-end encryption (what we would now call a secure channel protocol) does not protect from an eavesdropper. Everyone doing computer security in 1978 was thinking mostly about protecting classified government secrets, so the authors frame the problem in terms of a Trojan Horse program with access to such secrets, but forbidden by the OS from sending messages to anyone who isn’t cleared to access the same secrets: if all outgoing network traffic is encrypted end-to-end to its (legitimate) recipient, can this Trojan Horse still exfiltrate information to someone who isn’t a legitimate recipient?

They point out that, of necessity, a packet-switched network has to reveal the destination address, transmission time, and length of every packet in cleartext. They model each of these as Shannonian communication channels, and determine sending rates on the order of 100 bits per second for each—more than adequate to leak a text document. (They observe, by way of comparison, that the standard military teletype runs at 75 bps.)
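It's worth seeing how little it takes to get to numbers like that. If the Trojan Horse can choose among N distinguishable packet lengths and the network will carry r packets per second, the length field alone gives it about log2(N) bits per packet, or r * log2(N) bits per second. The figures below are my own round numbers, not the report's:

    import math

    def covert_capacity(levels, packets_per_second):
        """Upper bound on the leak through one field: bits per packet times packet rate."""
        return packets_per_second * math.log2(levels)

    print(covert_capacity(levels=16, packets_per_second=25))   # 100.0 bits per second

Destination address and transmission timing each give the Trojan Horse another channel of roughly the same shape.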

Nowadays, this threat model might seem quaint, even silly—we put a lot more effort into preventing untrusted code from seeing secret information in the first place. The information leak, however, is real, still exists, and can be used for other purposes. The most terrifying example I know is Hookt on fon-iks, in which a completely passive eavesdropper can reconstruct the words spoken in an encrypted VoIP phone conversation, just from the length and timing of each packet. Different syllables compress to different length packets, and every natural language has rules about which syllables can follow which; the rules can be modeled with a Markov chain, and there you are.

The countermeasures and conclusions sections of this paper are much more embarrassing in retrospect than the dated threat model. They say, there’s nothing one can practically do about this end-to-end, but we can close the hole if we make every single intermediate relay a trusted arbiter of the (one true) security policy, at which point we don’t need end-to-end encryption… I feel quite confident in saying, even in 1978 it ought to have been obvious that that was never going to happen. What’s sad is, if people hadn’t given up on end-to-end countermeasures back then, perhaps we would actually have some by now. (It’s easy for VoIP! All you have to do is use a constant-bitrate compression algorithm. Shame none of the widely deployed VoIP programs bother.)

Cache Timing Attacks Revisited: Efficient and Repeatable Browser History, OS, and Network Sniffing

Cache timing attacks use the time some operation takes to learn whether or not a datum is in a cache. They’re inherently hard to fix, because the entire point of a cache is to speed up access to data that was used recently. They’ve been studied as far back as Multics. [1] In the context of the Web, the cache in question (usually) is the browser’s cache of resources retrieved from the network, and the attacker (usually) is a malicious website that wants to discover something about the victim’s activity on other websites. [2] [3] This paper gives examples of using cache timing to learn information needed for phishing, discover the victim’s location, and monitor the victim’s search queries. It’s also known to be possible to use the victim’s browsing habits as an identifying fingerprint [4].

The possibility of this attack is well-known, but to date it has been dismissed as an actual risk for two reasons: it was slow, and probing the cache was a destructive operation, i.e. the attacker could only probe any given resource once, after which it would be cached whether or not the victim had ever loaded it. This paper overcomes both of those limitations. It uses Web Workers to parallelize cache probing, and it sets a very short timeout on all its background network requests—so short that the request can only succeed if it’s cached. Otherwise, it will be cancelled and the cache will not be perturbed. (Network requests will always take at least a few tens of milliseconds, whereas the cache can respond in less than a millisecond.) In combination, these achieve two orders of magnitude speedup over the best previous approach, and, more importantly, they render the attack repeatable.

What do we do about it? Individual sites can mark highly sensitive data not to be cached at all, but this slows the web down, and you’ll never get everyone to do it for everything that could be relevant. Browsers aren’t about to disable caching; it’s too valuable. A possibility (not in this paper) is that the browser could notice the first time a cached resource is requested by a page from a new domain, and deliberately delay satisfying that request from the cache by approximately the amount of time the resource originally took to load off the network. I’d want to implement and test that to make sure it didn’t leave a hole, but it seems like it would let us have our cake and eat it too.

In closing, I want to point out that this is a beautiful example of the maxim that attacks always get better. Cryptographers have been banging that drum for decades, trying to get people to move away from cipher primitives that are only a little weak before it’s too late, without much impact. The same, it seems, goes for information leaks.

On Subnormal Floating Point and Abnormal Timing

Most of the time, people think the floating-point unit has no security significance, simply because you don’t use the floating-point unit for anything with security significance. Cryptography, authentication, access control, process isolation, virtualization, trusted paths, it’s all done with integers. Usually not even negative integers.

Today’s paper presents some situations where that is not the case: where the low-level timing behavior of floating-point arithmetic is, in fact, security-critical. The most elemental of these situations involves displaying stuff on the screen—nowadays, everything on the screen gets run through a 3D rendering pipeline even if it looks completely flat, because you have the hardware just sitting there, and that process intrinsically involves floating point. And there’s an API to tell you how long it took to render the previous frame of animation, because if you were actually animating something you would genuinely need to know that. So if something is being displayed on the screen that you, a malicious program, are not allowed to see, but you can influence how it is displayed, you may be able to make the secret affect the rendering time, and thus extract it—slowly, but not too slowly to be practical. There is also a scenario involving differentially private databases, where you’re allowed to know an approximation to an aggregate value but not see individual table rows; the aggregate is computed with floating point, and, again, computation time can reveal the secret values.

In both cases, the floating-point computations are uniform, with no data-dependent branches or anything like that, so how does timing variation sneak in? It turns out that on all tested CPUs and GPUs, primitive floating-point arithmetic operations—add, multiply, divide—don’t always take the same amount of time to execute. Certain combinations of input values are slower than others, and predictably so. As the authors point out, this is a well-known problem for numerical programmers. It has to do with a feature of IEEE floating point known as subnormal numbers. These allow IEEE floating point to represent numbers that are very close to zero, so close that they don’t fit into the bit representation without bending the rules. This is mathematically desirable because it means that if two floating-point values are unequal, subtracting one from the other will never produce zero. However, subnormals are awkward to implement in hardware; so awkward that CPUs in the 1990s were notorious for suffering a slowdown of 100x or more for basic arithmetic on subnormals. Nowadays it’s not so bad; if I’m reading the (poorly designed) charts in this paper correctly, it’s only a 2–4x slowdown on modern hardware. But that’s still enough to detect and build a timing channel out of.
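The effect is easy to observe from any language that does bulk floating-point math. Here is a quick Python/NumPy check (my own demonstration, not the paper's benchmark); the size of the gap depends on the CPU, and if the runtime enables flush-to-zero/denormals-are-zero you may see no gap at all:

    import time
    import numpy as np

    def time_multiply(value, n=10_000_000):
        """Time an elementwise multiply over n copies of `value`."""
        a = np.full(n, value, dtype=np.float64)
        start = time.perf_counter()
        _ = a * 1.5                      # result stays subnormal if the input was
        return time.perf_counter() - start

    print("normal:   ", time_multiply(1e-300))   # tiny, but still a normal double
    print("subnormal:", time_multiply(1e-310))   # below ~2.2e-308: subnormal

On hardware with a subnormal penalty the second line takes noticeably longer, which is exactly the signal the attacks in this paper amplify.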

Timing channels are a perennial problem because they tend to be side effects of something desirable. Algorithms are often easier to understand if you dispose of special cases up-front—this paper also talks about how division by zero might be much faster than division by a normal value, presumably because the CPU doesn’t bother running through the divide circuit in that case. Running fast most of the time, slow in unusual cases, is often an excellent algorithmic choice for overall performance: hash tables, quicksort, etc. The papers that defined the concept of covert channels [1] [2] discuss timing channels introduced by data caching; without caches, Moore’s Law would have been hamstrung by fundamental physics decades ago.

However, I don’t think there’s any excuse for variable-time arithmetic or logic operations, even in floating point. (I might be persuaded to allow it for division, which is genuinely algorithmically hard.) In particular I have never understood why subnormal numbers are a problem for hardware. Now I don’t know beans about hardware design, but I have written floating-point emulation code, in assembly language, and here’s the thing: in software, subnormals are straightforward to implement. You need larger internal exponent precision, an additional test in both decode and encode, a count-leading-zeroes operation on the way in and an extra bit-shift on the way out. I didn’t try to make my emulator constant-time, but I can imagine doing it without too much trouble, at least for addition and multiplication. In hardware, that seems like it ought to translate to a wider internal data bus, a couple of muxes, a count-leading-zeroes widget, and a barrel shifter that you probably need anyway, and it should be easy to make it constant-time across both subnormals and normals, without sacrificing speed at all.
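To make the software case concrete, here is what the decode step looks like for IEEE single precision in Python (an illustration of the point, not code from my emulator): the only difference for a subnormal input is the leading-zero count and the extra shift, plus an exponent that can go below the architectural minimum, which is why you want the wider internal exponent.

    import struct

    def decode_binary32(x):
        """Unpack a float (as binary32) into sign, unbiased exponent, and a
        normalized 24-bit significand; value = (-1)**sign * significand * 2**(exponent - 23)."""
        bits = struct.unpack(">I", struct.pack(">f", x))[0]
        sign = bits >> 31
        exp_field = (bits >> 23) & 0xFF
        frac = bits & 0x7FFFFF
        if exp_field != 0:                    # normal number: implicit leading 1
            return sign, exp_field - 127, frac | (1 << 23)
        # Subnormal (exponent field is zero; zero itself lands here too):
        shift = 24 - frac.bit_length()        # the count-leading-zeroes step
        return sign, -126 - shift, frac << shift   # extra shift, wider exponent range

    print(decode_binary32(1.0))      # (0, 0, 8388608)
    print(decode_binary32(1e-42))    # a subnormal input, decoded the same way

Making both paths take the same time looks like a matter of always doing the count and the shift, and letting them be no-ops for normal inputs.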

Making floating-point operations constant-time for all inputs and outputs does strike me as harder, but only because IEEE floating point is specified to generate hardware faults and/or out-of-band exception bits under some circumstances. This was a mistake. A floating-point standard that has neither control nor status registers, and generates exclusively in-band results, is decades overdue. It would require sacrificing a bit of mantissa for a sticky “this number is inexact” flag, but I think that should be acceptable.

As a final headache, how long do you suppose it’ll be before someone comes up with an exploit that can only be fixed by making all of the transcendental function library (sin, cos, log, exp, …) constant-time? Never mind performance, I doubt this can be done without severely compromising accuracy.

A Survey on Encrypted Traffic Classification

This paper gives a brief overview of techniques for extracting information from encrypted network traffic, covering the past five or so years, mostly focusing on the (perceived) network-management need to know what application is communicating within an encrypted tunnel. It’s not very good—it’s a big list of other papers, each earning a couple of sentences of description, and not much else. I would not be surprised to learn that the authors have completely missed important pieces of the literature in this area.

I’m writing it up here for two reasons. First, it’s useful to read mediocre papers in order to learn how to do better. In this case, the missing element is synthesis. A good survey paper is research upon the existing published research and, like primary research, has a result of its own to demonstrate. Typically, that result will have something to do with the connections among the articles reviewed. It will highlight and compare lines of research, or it will demonstrate how people’s thinking about the topic has changed over time, or it will highlight gaps where new work would be useful, or perhaps all of the above. A big list of articles, briefly described and classified, is only the skeleton of such a survey.

Second, survey papers are often an entry point into the literature for people outside a subfield. Those people are less likely to share assumptions with the people deeply embedded in a subfield. For instance: in this paper, the authors consistently assume, with only a brief handwave in the direction of a justification, that it is necessary for network management tools to be able to identify at least the protocol and perhaps also something about the data being carried over an encrypted channel. Now, me, I tend to think that if a box in the middle can extract any information from an encrypted channel, that’s a bug in the cryptosystem. And I am working in very nearly this subfield, so I already know why the network-management people think they need to be able to extract data from an encrypted channel. Someone coming at this paper from an application-security, end-user privacy, or theory-of-crypto perspective might have an even stronger negative reaction. And that’s the thing: the authors should have anticipated that people who don’t share their assumptions would read their paper, so they should have taken steps to back up those assumptions.

Identifying Webbrowsers in Encrypted Communications

If you are a website, it is fairly easy to identify the web browser in use by each of your visitors, even if they take steps to suppress the blatant things like the User-Agent header. [1] [2] It is so easy, in fact, that researchers typically try to make it harder for themselves, trying instead to identify individual users even as they move around, change IP addresses, flush their cookies, etc. [3] [4]

If you are a passive eavesdropper in between the browser and the website, and the network traffic is encrypted, and particularly if you are isolated from the client’s IP address by anonymizing relays (e.g. Tor), the task should logically be much harder. Or is it? The authors of this short paper did the most obvious thing: capture packet traces and throw them at an off-the-shelf machine-learning classifier. The feature vectors seen by the classifier are not described as clearly as I’d like, but I think they divided each trace into equal-length intervals and aggregated packet sizes in each direction in each interval; this is also one of the most basic and obvious things to do (the future-work section talks a little about better feature engineering). Despite the lack of tuning, they report 70–90% classification accuracy on a four-way choice among browsers (Chrome, Firefox, IE, Tor Browser) and 30–80% accuracy for a 13-way choice among browser and plugin combinations (by which they seem to mean whether or not JavaScript and Flash were enabled; note that for a 13-way choice, chance performance would be 7.7% accuracy).
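For what it's worth, here is roughly what I understand that pipeline to be, as a Python sketch (the slot count, trace duration, and choice of classifier are my guesses, not details from the paper):

    import numpy as np
    from sklearn.ensemble import RandomForestClassifier

    def trace_features(packets, n_slots=30, duration=10.0):
        """packets: (timestamp_seconds, size_bytes, direction) tuples, direction +1
        for client->server and -1 for server->client. Chops the capture into
        n_slots equal time slots and totals the bytes per direction per slot."""
        up, down = np.zeros(n_slots), np.zeros(n_slots)
        for ts, size, direction in packets:
            slot = min(int(ts / duration * n_slots), n_slots - 1)
            (up if direction > 0 else down)[slot] += size
        return np.concatenate([up, down])

    # X = np.array([trace_features(t) for t in captures]); y = browser_labels
    # clf = RandomForestClassifier().fit(X_train, y_train)
    # print(clf.score(X_test, y_test))

Even a crude feature vector like this gives a classifier plenty to chew on, which is rather the point of the paper.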

This is a short workshop paper, so it is to be expected that the results are a little crude and have missing pieces. The authors already know they need to tune their classifier. I hope someone has told them about ROC curves; raw accuracies make me grind my teeth. Besides that, the clear next step is to figure out what patterns the classifiers are picking up on, and then how to efface those patterns. I think it’s quite likely that the signal they see depends on gross characteristics of the different HTTP implementations used by each browser; for instance, at time of publication, Chrome implemented SPDY and QUIC, and the others didn’t.

The paper contains some handwaving in the direction of being able to fingerprint individual users with this information, but I’d want to see detectable variation among installations of the same browser before I’d be convinced that’s possible.