A couple of weeks ago, Google's Project Zero team demonstrated a very clever exploit of modern DDR3 memory. Their inspiration came from a research paper, "Flipping Bits in Memory Without Accessing Them: An Experimental Study of DRAM Disturbance Errors," written by Yoongu Kim, Ross Daly, Jeremie Kim, Chris Fallin, Ji Hye Lee, Donghyuk Lee, Chris Wilkerson, Konrad Lai, and Onur Mutlu. Despite the title, the paper doesn't expose a critical design flaw in DDR3. Instead, it reveals that the march toward ever-smaller lithography processes can have unintended consequences.
Kim's technique for flipping bits without directly touching them involves accessing adjacent memory rows over and over again. Writing to those adjacent rows isn't required; repeated reads can be sufficient to cause a disturbance. How many reads? Around 540,000 within 64 milliseconds, according to Ars Technica. A simple code loop can accomplish this end:
hammer:
  mov (X), %eax // Read from address X
  mov (Y), %ebx // Read from address Y
  clflush (X) // Flush cache for address X
  clflush (Y) // Flush cache for address Y
  jmp hammer // Repeat; X and Y must map to different rows of the same bank
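To put the quoted figure in perspective, here is a rough back-of-the-envelope calculation in Python. It only restates the numbers above: 540,000 accesses inside a 64 ms window, which is the standard DDR3 refresh interval. The function and constant names are my own, chosen for illustration.

```python
# Back-of-the-envelope timing for the access count quoted above.
# The names here are illustrative; the two figures come from the text.

REFRESH_WINDOW_S = 64e-3      # 64 ms window in which the accesses must land
ACCESSES_IN_WINDOW = 540_000  # approximate access count quoted above


def access_budget_ns(accesses=ACCESSES_IN_WINDOW, window_s=REFRESH_WINDOW_S):
    """Average time available per access, in nanoseconds."""
    return window_s / accesses * 1e9


def accesses_per_second(accesses=ACCESSES_IN_WINDOW, window_s=REFRESH_WINDOW_S):
    """Sustained access rate needed to fit the count into one window."""
    return accesses / window_s


if __name__ == "__main__":
    print(f"{access_budget_ns():.1f} ns per access")
    print(f"{accesses_per_second():,.0f} accesses per second")
```

That works out to roughly 119 ns per access, or about 8.4 million accesses per second, which is why the loop has to flush the cache and go all the way to DRAM on every iteration.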
The structure of DRAM doesn't tend to be a hot topic even within enthusiast circles, so I should clarify what a row is with a quick explanation. The DRAM chips on memory modules are grouped into one or more constructs called ranks. In DDR3, these ranks are made up of a series of layered banks. Each bank has 1024 columns and thousands of rows (the density of the module determines the exact number of rows). This graphic at AnandTech shows the arrangement visually, and here's a mathematical breakdown, assuming each row-and-column location holds one byte:
65,536 rows x 1024 columns x 8 banks = 512MiB per chip
8 chips per rank x 2 ranks x 512MiB per chip = 8GiB
32,768 rows x 1024 columns x 8 banks = 256MiB per chip
8 chips per rank x 2 ranks x 256MiB per chip = 4GiB
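The same arithmetic can be checked with a few lines of Python. This is a minimal sketch that assumes x8 chips, meaning each row-and-column location stores 8 bits; that width is what makes the byte totals above come out. The function names are mine.

```python
# Capacity arithmetic for the two example modules above.
# Assumption: x8 chips, i.e. 8 bits stored per row/column location.

MIB = 2 ** 20
GIB = 2 ** 30


def chip_bytes(rows, columns=1024, banks=8, bits_per_location=8):
    """Capacity of a single DRAM chip in bytes."""
    return rows * columns * banks * bits_per_location // 8


def module_bytes(rows, chips_per_rank=8, ranks=2):
    """Capacity of a whole module built from identical chips."""
    return chips_per_rank * ranks * chip_bytes(rows)


if __name__ == "__main__":
    for rows in (65_536, 32_768):
        print(rows, "rows:",
              chip_bytes(rows) // MIB, "MiB per chip,",
              module_bytes(rows) // GIB, "GiB per module")
```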
Each of the tightly packed bits in those rows (65,536 or 32,768 rows per bank in our example) is stored in a tiny capacitor. At small process sizes, the resulting density faces an uphill battle against leakage and noise. Kim's testing concludes that some DDR3 modules made in the last three years are susceptible to having their bits flipped. This development appears to coincide with a move to 42-nm production. At the time, DRAM engineers noted the pitfalls of further shrinkage:
Already, at 4x, the distance between two access transistors is less than 35 nm in the wordline direction, and the distance between the storage node and metal-1 (bitline) is less than 15 nm. Those numbers suggest that leakage issues may become predominant for future nodes.
Kim more explicitly details the noise challenges faced by DRAM built on finer nodes.
First, a small cell can hold only a limited amount of charge, which reduces its noise margin and renders it more vulnerable to data loss [14, 47, 72]. Second, the close proximity of cells introduces electromagnetic coupling effects between them, causing them to interact with each other in undesirable ways [14, 42, 47, 55]. Third, higher variation in process technology increases the number of outlier cells that are exceptionally susceptible to inter-cell crosstalk, exacerbating the two effects described above.
Memory manufacturers will have to tackle this reliability issue. One might even point to these difficulties and say that all memory will inevitably have to adopt ECC, not unlike the disk sectors that make up modern storage.
Although the engineers who develop DRAM were aware of these pitfalls, the Project Zero researchers postulate that no one imagined the flaws would be used to compromise a system. That brings us to the next part of our tale, the clever exploit that turned a reliability issue into an attack.
Two proofs of concept were devised around the ability to flip bits in adjacent rows. One attacked Chrome and escaped the browser's Native Client sandbox. The other showed that it's possible to escalate privileges in Linux to become a root user.
The attacks require three major steps. The first is to have the malicious application fragment the contents of memory across the large breadth of rows available to the system. This fragmentation can be achieved through calls like mmap and munmap. The goal here is to isolate the code targeted for attack in a physical row without any adjacent neighbors.
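As a rough illustration of the kind of allocation pattern described here, the Python sketch below maps a batch of anonymous regions, touches every page so the kernel actually backs them with physical memory, and then unmaps every other region. It is not Project Zero's exploit code, and the region counts and function name are arbitrary choices of mine; it only shows what "calls like mmap and munmap" look like in practice.

```python
import mmap

# Illustrative only: map, touch, and selectively unmap anonymous regions.
# Region counts and sizes are arbitrary assumptions for this sketch.


def spray_and_release(regions=16, pages_per_region=4):
    """Map `regions` anonymous regions, touch each page, unmap the odd ones.

    Returns the list of mappings that are still open.
    """
    size = pages_per_region * mmap.PAGESIZE
    mappings = []
    for _ in range(regions):
        m = mmap.mmap(-1, size)  # -1 requests an anonymous mapping
        for offset in range(0, size, mmap.PAGESIZE):
            m[offset] = 1        # write one byte so the page gets backed
        mappings.append(m)

    kept = []
    for index, m in enumerate(mappings):
        if index % 2:
            m.close()            # closing the object unmaps the region
        else:
            kept.append(m)
    return kept


if __name__ == "__main__":
    survivors = spray_and_release()
    print(len(survivors), "regions still mapped")
    for m in survivors:
        m.close()
```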
Step two involves tracking down the targeted code. Project Zero abused the ping command to execute shellcode as root. Ping runs with root privileges (it is typically installed setuid root) so that it can send and listen for raw network packets.
Tracking down the target code in physical memory is one of the many challenges associated with this attack. Linux eases the burden so long as there's access to /proc/PID/pagemap. The pagemap helps to show which physical row holds the target data after the attack's fragmentation efforts.
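For the curious, here is a minimal Python sketch of a pagemap lookup. It relies on the layout described in the Linux kernel's pagemap documentation: one 64-bit entry per virtual page, with the page frame number in bits 0-54 and a "present" flag in bit 63. The helper names are mine. Note that kernels from 4.0 onward restrict this information: without CAP_SYS_ADMIN the open fails on some versions, and on later ones the frame number reads as zero, so expect to need root. Translating the resulting physical address into a DRAM row depends on the memory controller and is not attempted here.

```python
import mmap
import struct

# Sketch of a /proc/PID/pagemap lookup (Linux only).
# Helper names are illustrative, not part of any library.

ENTRY_BYTES = 8                # one 64-bit entry per virtual page
PFN_MASK = (1 << 55) - 1       # bits 0-54: page frame number
PRESENT_BIT = 1 << 63          # bit 63: page is present in RAM


def decode_entry(entry):
    """Split a raw pagemap entry into (present, page_frame_number)."""
    return bool(entry & PRESENT_BIT), entry & PFN_MASK


def physical_address(pid, vaddr, page_size=mmap.PAGESIZE):
    """Return the physical address behind `vaddr`, or None if unavailable."""
    with open(f"/proc/{pid}/pagemap", "rb") as pagemap:
        pagemap.seek((vaddr // page_size) * ENTRY_BYTES)
        raw = pagemap.read(ENTRY_BYTES)
    if len(raw) != ENTRY_BYTES:
        return None
    (entry,) = struct.unpack("=Q", raw)  # native-endian unsigned 64-bit
    present, pfn = decode_entry(entry)
    if not present or pfn == 0:          # pfn is zeroed without privileges
        return None
    return pfn * page_size + vaddr % page_size
```

A caller would pass a process ID (or the string "self") and a virtual address taken from its own mappings, then treat a None result as "not resident or not permitted."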
This brings us to the third and final step: to execute a loop, like Kim's above, which hammers the rows adjacent to our victim (ping). Sandwiching the target by hammering the rows on both sides increases the chance that the trapped row will suffer flipped bits.
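The "sandwich" idea is simple enough to express in a few lines of Python. This sketch assumes that physically adjacent rows carry consecutive row numbers within a bank, which is a simplification; real modules may remap rows internally. The function name is hypothetical.

```python
# Picking aggressor rows for double-sided hammering.
# Assumption: row n is physically next to rows n - 1 and n + 1.


def aggressor_rows(victim_row, rows_per_bank):
    """Return the in-range rows directly above and below the victim."""
    if not 0 <= victim_row < rows_per_bank:
        raise ValueError("victim row is outside the bank")
    neighbors = (victim_row - 1, victim_row + 1)
    return [row for row in neighbors if 0 <= row < rows_per_bank]


if __name__ == "__main__":
    print(aggressor_rows(100, 65_536))  # two neighbors: double-sided
    print(aggressor_rows(0, 65_536))    # edge row: only one neighbor
```

A victim on the first or last row of a bank has only one neighbor, so it can only be hammered from one side.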
What can be done to protect against this so-called "rowhammer" attack?
- If you run multi-user machines, make sure to virtualize those users.
- As always, be mindful of the code you execute on a machine.
- ECC makes this vulnerability more difficult to exploit, but it is not infallible.
- Run DDR4 memory, which is immune according to Real World Tech's David Kanter.
- The Project Zero team notes at least one brand of DDR3 is also immune, but it doesn't name names.
- Google Chrome protects the sanctity of its sandbox for Native Client code by blacklisting the use of CLFLUSH.
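The ECC caveat in the list above is easier to see with a toy example. The Python sketch below uses a Hamming(7,4) code, which is a textbook single-error-correcting code and a stand-in I picked for illustration; it is not the wider code that ECC DIMMs actually use. The principle carries over, though: a code corrects up to some number of flipped bits per word, and more flips than that can slip through or be "corrected" into the wrong data.

```python
# Toy Hamming(7,4) code: corrects one flipped bit, is fooled by two.
# A stand-in for illustration, not the code used on real ECC DIMMs.

DATA_POSITIONS = (3, 5, 6, 7)    # 1-indexed positions holding data bits
PARITY_POSITIONS = (1, 2, 4)     # 1-indexed positions holding parity bits


def syndrome(code):
    """XOR of the 1-indexed positions of all set bits (0 means valid)."""
    result = 0
    for position, bit in enumerate(code, start=1):
        if bit:
            result ^= position
    return result


def encode(data):
    """Encode 4 data bits into a 7-bit codeword."""
    code = [0] * 7
    for position, bit in zip(DATA_POSITIONS, data):
        code[position - 1] = bit
    s = syndrome(code)
    for position in PARITY_POSITIONS:
        if s & position:
            code[position - 1] = 1
    return code


def decode(code):
    """Correct at most one flipped bit, then return the 4 data bits."""
    code = list(code)
    s = syndrome(code)
    if s:
        code[s - 1] ^= 1         # the syndrome names the bit to flip back
    return [code[position - 1] for position in DATA_POSITIONS]


def flip(code, *positions):
    """Return a copy of `code` with the given 1-indexed bits inverted."""
    code = list(code)
    for position in positions:
        code[position - 1] ^= 1
    return code


if __name__ == "__main__":
    data = [1, 0, 1, 1]
    word = encode(data)
    print("one flip recovered: ", decode(flip(word, 6)) == data)
    print("two flips recovered:", decode(flip(word, 3, 6)) == data)
```

With one flipped bit the original data comes back intact. With two, the decoder confidently repairs the wrong bit and hands back corrupted data, which is the sense in which ECC raises the bar without eliminating the problem.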
This is a very complex attack that brings together many different elements in order to succeed. I think typical hackers would have access to other, easier exploits. My gut feeling is that rowhammer might interest sophisticated attackers like the Equation Group, which has been linked to the NSA, but that it won't become a staple of attacks driven by petty crime.