One of the toughest challenges in computing is building large-scale systems like those used in the world of high-performance computing (HPC). Currently, such systems typically communicate between nodes over an interconnect like Ethernet or InfiniBand, but Intel intends to replace those technologies with a new interconnect known as Omni-Path Architecture. This week at the Hot Interconnects conference, the firm is unveiling detailed information about how Omni-Path works and how it will be deployed alongside Intel's other products.
Omni-Path is interesting in part because of the intense difficulty of the problem it has to address: allowing the hundreds or thousands of nodes in a supercomputer to communicate with one another at high speed while keeping delays at a minimum. Intel says Omni-Path is meant to scale from "small clusters to the largest supercomputers." To make that happen, the chipmaker has taken on the task of designing nearly every piece of the puzzle needed to establish a new interconnect fabric, from the host adapters to switches, software, and tools. Some of the core technology comes from Intel's acquisition of Aries IP from Cray, and other bits come from its acquisition of True Scale InfiniBand IP from QLogic. Much of the software comes from the existing OpenFabrics Alliance. At the end of the day, though, Omni-Path looks to be Intel's own product and very much a rival for other standards, and for products from firms like Mellanox.
Omni-Path has an inside line because it comes from Intel, maker of the Xeon lineup (the dominant CPUs in the data center) and of the Xeon Phi parallel processor. The first Omni-Path products will include discrete host adapters that fit into PCIe slots, but Intel plans to integrate Omni-Path connectivity onto the same package as the Xeon and Xeon Phi, and eventually directly into the processors' silicon. The purported benefits of this integration will be familiar to anyone who's been watching CPUs in the past ten years: lower point-to-point latency thanks to the elimination of a chip-to-chip "hop" in the network, lower power use, a smaller physical footprint, and, of course, lower overall costs. Integration also has the side effect of making life very difficult for competing interconnect standards that are not integrated into the processor.
At a press preview event during last week's Intel Developer Forum, we got a first look at an Omni-Path host adapter in the form of a PCI Express x16 card. This card has a single port onboard and can support transfer rates up to 100 Gbps. Cards based on this Intel network chip can have one or two ports. The single-port cards typically require 8W of power, with a maximum of 12W.
Omni-Path supports four "lanes" per connection, and each of those lanes has a 25 Gbps transfer rate, for 100 Gbps in aggregate. Those connections can happen over optical links or over copper cables up to three meters in length. Even though it's a next-generation interconnect, Omni-Path supports copper cabling in order to keep costs in check.
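To put those numbers in perspective, here is a back-of-the-envelope sketch in Python. It uses only the nominal figures quoted above (four lanes at 25 Gbps each); the function names are made up for illustration, and the result ignores encoding and protocol overhead, so treat it as an idealized lower bound rather than a measured figure.

```python
LANES = 4        # lanes per connection, as quoted above
LANE_GBPS = 25   # nominal per-lane rate in Gbps, as quoted above


def link_gbps(lanes=LANES, lane_gbps=LANE_GBPS):
    """Aggregate nominal link rate in Gbps."""
    return lanes * lane_gbps


def wire_time_ns(payload_bytes, gbps=None):
    """Idealized time in nanoseconds to put a payload on the wire.

    Assumes the full nominal link rate and no encoding or protocol
    overhead. Bits divided by Gbps comes out directly in nanoseconds.
    """
    if gbps is None:
        gbps = link_gbps()
    return payload_bytes * 8 / gbps


if __name__ == "__main__":
    print(link_gbps(), "Gbps aggregate")
    print(wire_time_ns(8192), "ns for an 8 KB payload")
```

Even at 100 Gbps, a multi-kilobyte payload occupies the link for hundreds of nanoseconds, which is a long wait for a small, urgent message queued behind it.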
Most of Omni-Path's unique magic happens at the lowest layers of the network stack, traditionally known as layers 1 and 2 in the OSI model. Above that, the interconnect will support existing transport and application standards.
The most interesting innovation in Omni-Path may be the insertion of a layer "1.5," known as a link transport layer or LTL, that grew out of work formerly being done at Cray on the Aries interconnect. According to Omni-Path Chief System Architect Phil Murphy, this additional layer breaks packets into 65-bit units known as "flits," and a set of 16 flits together with a CRC comprises a packet. Busting packets into smaller units gives Omni-Path the ability to keep transfer latencies low by allowing high-priority packets to pre-empt large, low-priority packets that "we've already started putting on the wire," in Murphy's words. Flits from different packets can be interleaved with one another as needed in order to make sure that critical messages reach their destinations as quickly as possible. Which is very slick.
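The pre-emption idea is easier to see in a toy model. The Python sketch below is not Intel's implementation, and the article does not describe the real arbitration rules. It simply assumes that the link sends one flit per time slot and always picks the highest-priority packet that has flits waiting. The `Packet` fields, the `interleave` function, and the "higher number wins" priority scheme are all hypothetical; only the 65-bit flit size comes from the description above.

```python
from dataclasses import dataclass

FLIT_BITS = 65  # flit size quoted in the article


@dataclass
class Packet:
    name: str
    priority: int      # hypothetical: higher number means more urgent
    size_bits: int
    arrival_slot: int  # hypothetical: slot at which the packet is ready

    def flit_count(self) -> int:
        # Ceiling division: a partial final flit still takes a full slot.
        return -(-self.size_bits // FLIT_BITS)


def interleave(packets):
    """Return the order of flits on the wire as (packet name, flit index)."""
    pending = sorted(packets, key=lambda p: p.arrival_slot)
    active = []  # entries of [packet, index of its next flit]
    wire = []
    slot = 0
    while pending or active:
        # Admit every packet that has arrived by this slot.
        while pending and pending[0].arrival_slot <= slot:
            active.append([pending.pop(0), 0])
        if not active:
            # Link is idle: skip ahead to the next arrival.
            slot = pending[0].arrival_slot
            continue
        # Highest priority wins; ties go to the packet admitted first.
        entry = max(active, key=lambda e: e[0].priority)
        pkt, idx = entry
        wire.append((pkt.name, idx))
        entry[1] += 1
        if entry[1] == pkt.flit_count():
            active.remove(entry)
        slot += 1
    return wire


if __name__ == "__main__":
    bulk = Packet("bulk", priority=0, size_bits=650, arrival_slot=0)
    urgent = Packet("urgent", priority=1, size_bits=130, arrival_slot=3)
    for name, idx in interleave([bulk, urgent]):
        print(name, idx)
```

In this run, the urgent packet's two flits cut in after the bulk packet has sent only three of its ten flits, and the bulk transfer then resumes where it left off. Without flit-level interleaving, the urgent packet would have had to wait for all ten.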
I'm barely scratching the surface here. If you'd like more info, Intel has released a paper on the OPA fabric, but you have to register at this website in order to download it. I believe IEEE Spectrum will be publishing a paper on Omni-Path, as well.
This interconnect is clearly aimed at a particular mission in a relatively small but lucrative market. Still, the fact that Intel intends to integrate this tech into its CPUs immediately raises intriguing prospects for future uses of this technology—or something very much like it—as part of future system architectures generally, which makes it very much worth watching.