Papers published in 2013

Stealthy Dopant-Level Hardware Trojans

Most of the time we treat silicon chips—CPUs, for instance—as black boxes. The manufacturer publishes specifications for integrating a chip into a larger widget and making use of it; we code to those specifications, and the chip does its job. But inside the package is a fiendishly complicated machine, and there have been plenty of incidents where it didn’t work quite right, e.g. the infamous F00F and FDIV bugs in the Pentium. This is not to pick on Intel; every CPU manufacturer has had similar troubles, but only the x86 is sufficiently famous that its troubles make Wikipedia.

Anything that can happen by accident can happen on purpose. Most chip designers have to contract manufacture out to a fab plant run by a separate company, and the manufacturing process is opaque to them. It might occur to you to wonder whether the manufacturer can tamper with the design. It’s possible to disassemble a chip under an electron microscope and make sure all the wires are where they’re supposed to be, but this is expensive and destroys the chip, and it can only detect some of the possible changes to the design.

This paper presents a technique a fab plant could use to sabotage a chip design, one that can’t be detected by any current method for inspecting the chip after the fact. Basically, by changing the dopant mask, the fab can turn individual transistors into shorts to ground or Vcc. This is invisible to a microscope: all the wires are where they’re supposed to be; it’s just that a block of semiconductor has the wrong electronic properties. The authors demonstrate that this can be used to introduce subtle bugs that will not be caught by functional testing, such as making the output of a hardware RNG predictable, or introducing a power-consumption side channel into a cryptographic accelerator.
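To see why functional testing doesn’t help, here is a toy software model (my own illustration, not the paper’s experiment) of an RNG whose 32-bit outputs carry only 8 bits of genuine entropy, with the rest derived from them. A naive frequency test still sees balanced output, but an attacker who knows the derivation has only 256 possibilities to enumerate.

```python
# Toy model: a "trojaned" RNG with 8 bits of real entropy per 32-bit word.
# Everything here is illustrative; the derivation constants are arbitrary.
import os

def trojaned_rng_word() -> int:
    """32-bit output whose upper 24 bits are a fixed, attacker-known
    function of the 8 genuinely random low bits."""
    k = os.urandom(1)[0]                      # the only real entropy
    return k | (k ^ 0xA5) << 8 | (k ^ 0x5A) << 16 | ((k * 7) & 0xFF) << 24

samples = [trojaned_rng_word() for _ in range(10_000)]
ones = sum(bin(w).count("1") for w in samples)
print(f"fraction of 1 bits: {ones / (32 * len(samples)):.3f}")  # ~0.5, looks fine
print(f"distinct outputs:   {len(set(samples))}")               # at most 256
```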

No solutions are presented; I’m not much of a hardware person and I have no idea what solutions might be possible. This is the hardware equivalent of a malicious compiler, and people are working on solving that problem—but they rely on the fact that you can inspect the output of a compiler in detail, because it’s just another string of bits. How do you detect that a block of silicon has been doped with boron instead of phosphorus? X-ray crystallography, maybe?

Security Analysis of Pseudo-Random Number Generators with Input

Coming up in this year’s CCS is a paper with the provocative subtitle /dev/random is not Robust, and thanks to widespread availability of online preprints, it’s already scored quite a bit of online attention, e.g. at Bruce Schneier’s blog and on Hacker News.

I’m going to pass over the alleged attacks on the Linux kernel CSPRNG, which are adequately covered in both of the above discussions; suffice it to say that in this paper robust is a precisely defined theoretical security property, and there’s a solid case that it’s stronger than necessary to guarantee practical security. What I’m interested in is the simple and very efficient PRNG construction that is provably robust, advertised in the abstract. CSPRNG-robustness (as defined here) may not be necessary for practical security, but if you can have it essentially for free, why wouldn’t you adopt it? Unfortunately it turns out that there’s a major palmed card in their definition of a CSPRNG with input, and another one in the construction itself, rendering the algorithm useless in practice.

The first problem is that the definition of a CSPRNG with input includes a setup phase that produces a random seed which is not known to the adversary. Despite consistently calling this item a public parameter, the text of the paper makes it clear that it must not be known to the adversary, and they consider this both essential and unproblematic:

Unfortunately, it is well known that no deterministic extractor is capable to simultaneously extract good randomness from all efficiently samplable high-entropy distributions (e.g. consider nearly full entropy distribution $I$ which is random, except the first bit of $\operatorname{Extract}(I)$ is zero).

… [Therefore], we chose … to assume the existence of the setup procedure … this will allow one to consider a seeded extractor … which can now extract entropy from all high-entropy distributions. As a warning, this nice extra generality comes at a price that the [seed] is not passed to the [adversarial] distribution sampler …

But this is impossible to achieve in real life. If the seed is generated on the device that will use it, we have a chicken-and-egg problem, because initial startup is precisely when environmental randomness is least available; the authors should have been aware of actual, catastrophic exploits of systems with little or no access to randomness on first boot, e.g. Mining your Ps and Qs. If the seed is not generated on the device that will use it, then the adversary can probably learn it; in particular, if the seed for all devices of a particular model is embedded in a firmware image that anyone can download off the manufacturer’s website, you have accomplished precisely nothing.

The second problem is in the technical details of their proposed CSPRNG construction, which I will now quote verbatim.

Let $\mathbf{G}: \{0,1\}^m \to \{0,1\}^{n+l}$ be a (deterministic) pseudorandom generator where $m < n$. We use the notation $[y]_1^m$ to denote the first $m$ bits of $y \in \{0,1\}^n$. Our construction of PRNG with input has parameters $n$ (state length), $l$ (output length), and $p = n$ (sample length), and is defined as follows:

  • $\operatorname{setup}$: Output $\text{seed} = (X, X') \leftarrow \{0,1\}^{2n}$.
  • $S' = \operatorname{refresh}(S, I)$: Given $\text{seed} = (X, X')$, current state $S \in \{0,1\}^n$, and a sample $I \in \{0,1\}^n$, output $S' = S \cdot X + I$, where all operations are over $\mathbb{F}_{2^n}$.
  • $(S', R) = \operatorname{next}(S)$: Given $\text{seed} = (X, X')$ and a state $S \in \{0,1\}^n$, first compute $U = [X' \cdot S]_1^m$. Then output $(S', R) = \mathbf{G}(U)$.

It’s a blink-and-you-miss-it sort of thing: by “all operations are over $\mathbb{F}_{2^n}$” they are specifying that both the refresh and next steps make use of arithmetic over finite, nonprime fields (more usually referred to as $\operatorname{GF}(2^n)$ in crypto literature). $n$ has to be large—section 6.1 of the paper suggests specific fields with $n$ = 489, 579, and 705. The known techniques for this kind of arithmetic are all very slow, require large look-up tables (and therefore permit side-channel attacks), or cannot be used for a software implementation. [1] [2] This makes them impractical for a cryptographic component of an operating system that cannot rely on special-purpose hardware.
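To make the objection concrete, here is a toy sketch of the construction in Python, with $n$ shrunk to 8 bits so the field multiplication is easy to follow and a hash standing in for the pseudorandom generator $\mathbf{G}$. The field size, the stand-in PRG, and the sample values are my own illustrative choices, not the paper’s; the point is just where the $\mathbb{F}_{2^n}$ multiplications land.

```python
# Toy instantiation of the quoted construction with n = 8 (GF(2^8)).
# In the paper n is ~500-700 bits, which is exactly what makes it slow.
import hashlib
import secrets

N = 8                       # toy state length in bits
M = 4                       # toy PRG seed length, m < n
IRRED = 0b1_0001_1011       # x^8 + x^4 + x^3 + x + 1, a GF(2^8) modulus

def gf_mul(a: int, b: int) -> int:
    """Carry-less multiplication of a and b, reduced mod IRRED."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        b >>= 1
        a <<= 1
        if a >> N:          # reduce whenever a overflows n bits
            a ^= IRRED
    return result

def prg(u: int) -> tuple[int, int]:
    """Stand-in for G: {0,1}^m -> {0,1}^(n+l); a hash, purely so this runs."""
    digest = hashlib.sha256(u.to_bytes(1, "big")).digest()
    return digest[0], digest[1]          # (new state S', output R)

def setup() -> tuple[int, int]:
    """seed = (X, X') <- {0,1}^{2n}: the 'public parameter' the adversary
    must nevertheless not know."""
    return secrets.randbits(N), secrets.randbits(N)

def refresh(seed: tuple[int, int], s: int, i: int) -> int:
    """S' = S * X + I over GF(2^n); '+' in GF(2^n) is XOR."""
    x, _ = seed
    return gf_mul(s, x) ^ i

def next_output(seed: tuple[int, int], s: int) -> tuple[int, int]:
    """U = first m bits of X' * S, then (S', R) = G(U)."""
    _, x_prime = seed
    u = gf_mul(x_prime, s) >> (N - M)    # take the high m bits
    return prg(u)

# Usage: absorb a few entropy samples, then draw one output block.
seed = setup()
state = 0
for sample in (0x5A, 0x3C, 0x99):        # stand-ins for environmental samples
    state = refresh(seed, state, sample)
state, output = next_output(seed, state)
print(f"output block: {output:#04x}")
```

Every call to refresh or next performs a full-width multiplication in the field; at $n \approx 700$ that is exactly the operation the paragraph above complains about.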

Both problems arise from the same source, namely wanting to build the algorithm around a keyed hash so that the entropy extractor is fully general, and then taking a mathematical construct with convenient theoretical but inconvenient practical properties as that hash. If we back down to the use of an unkeyed, but computationally secure, hash as the entropy extractor, we’d have to redo the analysis, but I think it ought to be possible to arrange that, without any secret unknown to the adversary, it is still computationally infeasible for the adversary to bias the entropy pool; and we could then pick that hash for its practical utility.
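For what it’s worth, here is a minimal sketch of the kind of thing I have in mind, assuming SHA-256 as the unkeyed extractor; this is my speculation, not the paper’s construction, and the names are made up for illustration.

```python
# Hypothetical unkeyed-extractor pool: no secret setup parameter at all.
import hashlib
import os

class HashPoolPRNG:
    """Mix samples into a pool with SHA-256; derive output by hashing the pool."""

    def __init__(self) -> None:
        self.pool = bytes(32)

    def refresh(self, sample: bytes) -> None:
        # The adversary may know everything here; biasing the pool would
        # still require finding structured SHA-256 preimages.
        self.pool = hashlib.sha256(self.pool + sample).digest()

    def next(self, nbytes: int = 32) -> bytes:
        out = hashlib.sha256(b"output" + self.pool).digest()[:nbytes]
        # Re-key the pool so the output does not reveal the next state.
        self.pool = hashlib.sha256(b"rekey" + self.pool).digest()
        return out

prng = HashPoolPRNG()
prng.refresh(os.urandom(16))      # stand-in for an environmental sample
print(prng.next(16).hex())
```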

… Oddly enough, this hypothetical construction looks an awful lot like the existing Linux CSPRNG.

Analysis of the HTTPS Certificate Ecosystem

The Internet Measurement Conference brings us an attempt to figure out just how X.509 server certificates are being used in the wild, specifically for HTTPS servers. Yet more specifically, they are looking for endemic operational problems that harm security. The basic idea is to scan the entire IPv4 number space for servers responding on port 443, make note of the certificate(s) presented, and then analyze them.
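The per-host step of such a scan is conceptually simple; here is a rough sketch (not the authors’ tooling) of grabbing whatever certificate a host presents on port 443 without validating it, since a survey wants the broken certificates too. The host name and timeout below are placeholders.

```python
# Sketch of one probe in a certificate survey: connect, record, move on.
import socket
import ssl
from typing import Optional

def grab_certificate(host: str, port: int = 443, timeout: float = 5.0) -> Optional[bytes]:
    """Return the server's leaf certificate in DER form, or None on failure."""
    context = ssl.SSLContext(ssl.PROTOCOL_TLS_CLIENT)
    context.check_hostname = False
    context.verify_mode = ssl.CERT_NONE       # accept expired/self-signed certs
    try:
        with socket.create_connection((host, port), timeout=timeout) as sock:
            with context.wrap_socket(sock, server_hostname=host) as tls:
                return tls.getpeercert(binary_form=True)
    except OSError:                           # host down, not TLS, handshake failed
        return None

der = grab_certificate("example.org")
print(len(der) if der else "no certificate presented")
```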

This research question is nothing new; the EFF famously ran a similar study back in 2010, the SSL Observatory. And operational concerns about the way certificates are used in the wild go back decades; see Peter Gutmann’s slide deck Everything you Never Wanted to Know about PKI but were Forced to Find Out (PDF). What makes this study interesting is, first, it’s three years later; things can change very fast in Internet land (although, in this case, they have not). Second, the scale: the authors claim to have successfully contacted 178% more TLS hosts (per scan) and harvested 736% more certificates (in total, over the course of 110 scans spanning a little more than a year) than any previous such study.

What do we learn? Mostly that yeah, the TLS PKI is a big mess, and it hasn’t gotten any better since 2010. There are too many root CAs. There are far too many unconstrained intermediate certificates, and yet, at the same time, there are too few intermediates! (The point of intermediates is that they’re easy to replace, so if they get compromised you don’t have a catastrophe on your hands. Well, according to this paper, some 26% of all currently valid HTTPS server certificates are signed by one intermediate. No way is that going to be easy to replace if it gets compromised.) Lots of CAs ignore the baseline policies for certificate issuance and get away with it. (Unfortunately, the paper doesn’t say whether there are similar problems with the EV policies.)

Zoom out: when you have a piece of critical infrastructure with chronic operational issues, it’s a safe bet that they’re symptoms and the real problem is with operator incentives. The paper doesn’t discuss this at all, unfortunately, so I’ll throw in some speculation here. The browser vendors are notionally in the best position to Do Something about this mess, but they don’t, because the only real sanction available to them is to delete root certificates from the Official List. Not only does this tend to put the offending CA out of business, it also causes some uncertain-but-large number of websites (most or all of which didn’t do anything wrong) to stop working. Such a drastic sanction is almost never seen to be appropriate. Browsers have hardly any positive incentives to offer the CAs for doing things right; note that EV certificates, which get special treatment in the browser UI and can therefore be sold at a premium, do come with a tighter set of CA requirements (stronger crypto, reliable OCSP, that sort of thing) which are, as far as I’m aware, followed.

Zoom out again: there’s no shortage of technical suggestions that could turn into less drastic sanctions and incentives for the CAs, but they never get implemented. Why? Well, if you ask me, it’s because both OpenSSL and NSS are such terrible code that nobody wants to hack on them, and the brave souls who do it anyway are busy chipping away at the mountain of technical debt and/or at features that are even more overdue. This, though, we know how to fix. It only takes money.