Nickolai Zeldovich and his colleagues at MIT have created a file system they claim is mathematically proven to be impervious to any unexpected system crash. The team built the file system on the basis of formal verification, which by itself isn't entirely new. What's different about the MIT team's work is that their formal proofs are made against the final code they wrote, and not against some high-level flowchart describing the system.
Zeldovich described the difficulties encountered in designing a crash-resilient filesystem:
Making sure that the file system can recover from a crash at any point is tricky because there are so many different places that you could crash. You literally have to consider every instruction or every disk operation and think, ‘Well, what if I crash now? What now? What now?’ And so empirically, people have found lots of bugs in file systems that have to do with crash recovery, and they keep finding them, even in very well tested file systems, because it’s just so hard to do.
The performance of the resulting file system is reported to be rather slow, but the team is confident their work can be applied to other designs. Zeldovich and his colleagues say they rewrote the project "probably 10 times", but they also estimate they spent 90% of their time defining components, the relationships between them, and the actual formal proof. From the sound of it, they may not have had much time for optimization. For the curious, the team used the Coq Proof Assistant.
The MIT team's work has attracted notice from some members of industry. Úlfar Erlingsson, Google's lead manager for security, said that existing formal designs are so specific to a given filesystem that there is little change of reusing them. In contrast, Erlingsson believes MIT's work can be used as a foundation for many different applications. The team will present their work in October at the ACM Symposium on Operating System Principles.