Papers tagged ‘Memory isolation’

Using Memory Errors to Attack a Virtual Machine

All modern operating systems include some form of process isolation—a mechanism for insuring that two programs can run simultaneously on the same machine without interfering with each other. Standard techniques include virtual memory and static verification. Whatever the technique, though, it relies on the axiom that the computer faithfully executes its specified instruction set (as the authors of today’s paper put it). In other words, a hardware fault can potentially ruin everything. But how practical is it to exploit a hardware fault? They are unpredictable, and the most common syndrome is for a single bit in RAM to flip. How much damage could one bit-flip do, when you have no control over where or when it will happen?

Today’s paper demonstrates that, if you are able to run at least one program yourself on a target computer, a single bit-flip can give you total control. Their exploit has two pieces. First, they figured out a way to escalate a single bit-flip to total control of a Java virtual machine; basically, once the bit flips, they have two pointers to the same datum but with different types (integer, pointer) and that allows them to read and write arbitrary memory locations, which in turn enables them to overwrite the VM’s security policy with one that lets the program do whatever it wants. That’s the hard part. The easy part is, their attack program fills up the computer’s memory with lots of copies of the data structure that will give them total control if a bit flips inside it. That way, the odds of the bit flip being useful are greatly increased. (This trick is known as heap spraying and it turns out to be useful for all sorts of exploits, not just the ones where you mess with the hardware.)

This is all fine and good, but you’re still waiting for a cosmic ray to hit the computer and flip a bit for you, aren’t you? Well, maybe there are other ways to induce memory errors. Cosmic rays are ionizing radiation, which you can also get from radioactivity. But readily available radioactive material (e.g. in smoke detectors) isn’t powerful enough, and powerful enough radioactive material is hard to get (there’s a joke where the paper suggests that the adversary could purchase a small oil drilling company in order to get their hands on a neutron source). But there’s also heat, and that turns out to work quite well, at least on the DRAM that was commonly available in 2003. Just heat the chips to 80–100˚C and they start flipping bits. The trick here is inducing enough bit-flips to exploit, but not so many that the entire computer crashes; they calculate an ideal error rate at which the exploit program has a 71% chance of succeeding before the computer crashes.

The attack in this paper probably isn’t worth worrying about in real life. But attacks only ever get nastier. For instance, about a decade later some people figured out how to control where the memory errors happen, and how to induce them without pointing a heat gun at the hardware: that was An Experimental Study of DRAM Disturbance Errors which I reviewed back in April. That attack is probably practical, if the hardware isn’t taking any steps to prevent it.

Flipping Bits in Memory Without Accessing Them: An Experimental Study of DRAM Disturbance Errors

A few weeks ago, a new exploit technique called rowhammering was making the rounds of all the infosec news outlets. Turns out you can flip bits in RAM by reading over and over and over again from bits that are physically adjacent on the RAM chip; turns out that with sufficient cleverness you can leverage that to escape security restrictions being enforced in software. Abstractly, it’s as simple as toggling the this process is allowed to do whatever it wants bit, which is a real bit in RAM on many operating systems—most of the cleverness is in grooming the layout of physical RAM so that that bit is physically adjacent to a group of bits that you’re allowed to read from.

This paper describes and analyzes the underlying physical flaw that makes this exploit possible. It is (as far as I know) the first suggestion in the literature that this might be a security vulnerability; chip manufacturers have been aware of it as a reliability problem for much longer. [1] [2] If you’re not familiar with the details of how DRAM works, there are two important things to understand. First, DRAM deliberately trades stability for density. You can (presently) pack more bits into a given amount of chip surface area with DRAM than with any other design, but the bits don’t retain their values indefinitely; each bit has an electrical charge that slowly leaks away. DRAM is therefore refreshed continuously; every few tens of milliseconds, each bit will be read and written back, restoring the full charge.

Second, software people are accustomed to thinking of memory at the lowest level of abstraction as a one-dimensional array of bytes, an address space. But DRAM is physically organized as a two-dimensional array of bits, laid out as a grid on the two-dimensional surface of the chip. To read data from memory, the CPU first instructs the memory bank to load an entire row into a buffer (unimaginatively called the row buffer), and then to transmit columns (which are the width of entire cache lines) from that row. Rows are huge—32 to 256 kilobits, typically; larger than the page granularity of memory access isolation. This means, bits which are far apart in the CPU’s address spaces (even the so-called physical address space!) may be right next to each other on a RAM chip. And bits which are right next to each other on a RAM chip may belong to different memory-isolation regions, so it’s potentially a security problem if they can affect each other.

Now, the physical flaw is simply that loading one row of a DRAM chip into the row-buffer can cause adjacent rows to leak their charges faster, due to unavoidable electromagnetic phenomena, e.g. capacitative coupling between rows. Therefore, if you force the memory bank to load a particular row over and over again, enough charge might leak out of an adjacent row to make the bits change their values. (Reading a row implicitly refreshes it, so the attack row itself won’t be affected.) It turns out to be possible to do this using only ordinary CPU instructions.

The bulk of this paper is devoted to studying exactly when and how this can happen on a wide variety of different DRAM modules from different vendors. Important takeaways include: All vendors have affected products. Higher-capacity chips are more vulnerable, and newer chips are more vulnerable—both of these are because the bit-cells in those chips are smaller and packed closer together. The most common error-correcting code used for DRAM is inadequate to protect against or even detect the attack, because it can only correct one-bit errors and detect two-bit errors within any 64-bit word, but under the attack, several nearby bits often flip at the same time. (You should still make a point of buying error-correcting RAM, plus a motherboard and CPU that actually take advantage of it. One-bit errors are the normal case and they can happen as frequently as once a day. See Updates on the need to use error-correcting memory for more information.)

Fortunately, the authors also present a simple fix: the memory controller should automatically refresh nearby rows with some low, fixed probability whenever a row is loaded. This will have no visible effect in normal operation, but if a row is being hammered, nearby rows will get refreshed sufficiently more often that bit flips will not happen. It’s not clear to anyone whether this has actually been implemented in the latest motherboards, vendors being infamously closemouthed about how their low-level hardware actually works. However, several of the authors here work for Intel, and I’d expect them to take the problem seriously, at least.