An intro to all things ARM

Processors designed by the folks at ARM have been around for ages, but they’ve mostly inhabited computing devices you probably didn’t particularly like: sluggish GPS units, slow-as-molasses in-flight entertainment systems, digital picture frames, and the like. For a time, these devices were relatively cheap and becoming more common thanks to the magic of Moore’s Law, but they didn’t have much else to recommend them.

Then, of course, the iPhone happened, and everything changed seemingly overnight.

With the rise of smartphones and tablets, the arc of consumer computing has been radically altered. Now, ARM processors power some of the computers many of us like the most: ridiculously powerful smartphones, tablets whose stunning displays are wildly superior to the average PC’s, and e-readers so well-suited for their roles that they’re capable of inspiring poetry.

ARM’s fortunes have risen, too, as have the firm’s ambitions. ARM-based chips dominate the mobile device market, despite repeated attempts by traditional PC players like Intel to make inroads there. Meanwhile, the market for consumer PCs has grown soft, its prospects dimmed by competition from tablets. With the erosion of AMD’s prominence in the traditional PC space, a new contest is shaping up that may determine the course of computing for the next decade or more, pitting the ARM ecosystem against Intel, the world’s biggest chipmaker.

You know Andy Grove’s famous declaration that only the paranoid survive? Well, the current source of Intel’s paranoia is undoubtedly ARM and its partners. Intel has acted on its fear by steering its product offerings directly toward the places where ARM is strongest. The recently introduced Haswell processors can squeeze into power envelopes as small as six watts—and those are the big chips. Intel revealed the more direct threat to ARM in the form of its upcoming Silvermont microarchitecture a couple of months ago. Not only does Silvermont promise to improve the aging Atom CPU core with something much more potent, but it also comes with a renewed commitment. Intel is bringing its vaunted “tick-tock” development rhythm to the Atom, interleaving the introduction of new chip fabrication technologies and revised microarchitectures in a series of annual updates.

That’s the formula that left AMD in the dust, and it’s no doubt formidable. In ARM, though, Intel faces a very different sort of competition. Indeed, I get the sense folks at ARM were a little bit peeved by some of the claims Intel made about Silvermont’s competitive stature. That may be one reason why ARM summoned some of the world’s most technically inclined journalists and analysts to a confab at its headquarters in Cambridge, England recently. In an obvious case of poor judgment, I was also allowed to attend.

This is Cambridge, England. ARM’s offices do not look like this.

Happily, I now get the chance to tell you another side of the story of the next generation of low-power processors, one that began with Intel’s Silvermont and AMD’s Kabini and extends to a host of different chips based on ARM’s instruction set architecture and CPU microarchitectures. There’s no doubt now that a new front has opened in the CPU market.

How anyone can build an ARM-based chip
The most important technology ARM has brought to the looming fight against Intel may not be a CPU architecture or a feature set. Instead, ARM’s biggest advantage may be the legal and technical framework it has established that allows a host of different companies to build ARM-compatible chips with relative ease.

ARM doesn’t make any chips itself. Instead, it designs components of chips, like CPU cores, and licenses this intellectual property (IP) to semiconductor companies for incorporation into all sorts of different products. Those products are usually SoCs, or systems on a chip, with relatively modest power budgets and an entire computer’s worth of components integrated together.

Ian Thornton, ARM’s VP of Communications and Investor Relations, describes ARM’s role as an R&D outsourcing operation. ARM concentrates on developing new CPU microarchitectures, while its clients like Samsung and Nvidia focus on everything else that goes into building a complete SoC. By contrast, Intel’s x86 instruction set architecture, on which the Windows PC ecosystem relies, is closely guarded. Although anyone can write x86-compatible software, only Intel itself and a handful of companies with some historical patent leverage, like AMD and Via, can produce x86-compatible processor silicon.

Thornton offered us a sketch of how ARM’s licensing model works. At its core, the approach is fairly simple. Customers who wish to incorporate ARM’s technology into a chip pay for the privilege in two ways: an upfront licensing fee and a per-chip royalty. The fee and royalty can vary depending on the type of customer and the sort of technology being licensed, but that’s the basic structure.

According to Thornton, the license fee typically ranges from $1 million to $10 million. This payment is made at the outset of a licensing deal, and it gets the customer access to ARM’s technology and to support from ARM personnel, so the process of developing a product can begin. Semiconductor development projects usually take three to four years to complete. The license fee may be the only revenue ARM realizes for a while.

Once a product containing ARM IP begins shipping, the licensee must pay a royalty to ARM, which is typically 1% to 2% of the selling price of each chip. The royalty is intended to be small enough to allow ARM’s customers to realize a profit on their creations, but one can imagine how two percent of the price of every chip going into nearly every smartphone in the market might add up over time.
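To make the fee-plus-royalty arithmetic concrete, here is a rough sketch in Python. The fee and royalty rate fall within the ranges Thornton described, but the chip price and shipment volume are made-up illustrative figures, not numbers from ARM or any licensee, and the function name is purely hypothetical.

```python
def licensing_revenue(license_fee, chip_price, units_shipped, royalty_rate):
    """Total paid to the IP vendor: upfront fee plus per-chip royalties."""
    royalties = chip_price * units_shipped * royalty_rate
    return license_fee + royalties

# Hypothetical licensee (all four inputs are assumptions for illustration):
# $5M upfront fee, $20 chip, 50 million units shipped, 1.5% royalty.
total = licensing_revenue(5_000_000, 20.0, 50_000_000, 0.015)
print(f"${total:,.0f}")  # prints $20,000,000
```

In this invented case, the royalties ($15 million) end up worth three times the upfront fee, which hints at why shipment volume matters so much to ARM.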

Thus, ARM benefits when its customers succeed, and it realizes higher profits when its IP is licensed widely across multiple products that ship in high volumes. That means ARM’s incentives ought to map well to its customers’ needs, which are dictated by the economics of the semiconductor industry, where producing large volumes of chips is the surest path to success.

An unnecessarily complex diagram of ARM’s licensing scheme. Source: ARM.

Within this simple fee-plus-royalty structure, ARM offers a range of options tailored toward different sorts of customers. Corporate customers can license a specific bit of IP to be used in a single product, or they can choose to subscribe to ARM’s entire library for use across an array of products. ARM also offers heavily discounted options for start-ups and academics.

What ARM then hands over to a customer will depend on that company’s needs and capabilities. The usual procedure is for ARM to hand off a description of, say, a CPU core in register transfer level (RTL) format. The semiconductor firm then takes the RTL and feeds it into logic synthesis software in order to create a gate-level description of the hardware. Those results are then converted into a physical level circuit design using placement and routing tools. Most semiconductor companies don’t have their own manufacturing capability, so they’ll work with a foundry like TSMC or GlobalFoundries to produce the chips, adapting the physical design to match the requirements of the foundry’s manufacturing process.

I believe that’s the exact path that, say, Nvidia’s Tegra SoCs have taken to market. The Tegra 4 incorporates five copies of ARM’s Cortex-A15 CPU microarchitecture and is manufactured at TSMC on a 28-nm fabrication process. Contrast that with something like Haswell, which went from architecture to design to final silicon to integrated platform entirely inside of Intel.

For firms that prefer not to handle the physical design themselves, ARM offers pre-baked physical designs known as processor optimization packs (POPs) in cooperation with the major foundries. These are optimized implementations of popular ARM cores geared toward a specific fabrication process, and they are sold by the foundries to semiconductor firms. ARM receives a royalty of between 1% and 2.5% of the wafer price from the foundry for the use of its POPs.

More intriguing is a third option, known as an architecture license. This type of license allows a third-party firm to design its own processor core that is compatible with ARM’s instruction set architecture, or ISA. This setup is akin to the agreement that Intel has with AMD, allowing AMD to make x86-compatible CPUs. The difference is that ARM offers ISA licenses openly, and they have become relatively popular in recent years. Some of the most prominent CPU cores shipping in today’s phones and tablets are the fruit of ISA licenses, including Qualcomm’s Krait and Apple’s Swift. With this option on the table, CPU performance in ARM’s key markets can advance even when ARM itself isn’t as quick as it should be in producing a new CPU architecture. In fact, that’s arguably just what happened in the latest generation of smartphones.

Of course, PC processor performance also advanced via a licensee when Intel stumbled with the Pentium 4, but that didn’t turn out so well. In this case, ARM collects a similar royalty from architecture licensees as from its other customers, so the success of Krait and Swift doesn’t sap its revenues.

Success in complexity
Royalties and fees do vary depending on several factors. Newer and more complex IP (like, say, the Cortex-A15 CPU core used in high-end Android devices) tends to command a bit of a premium, although ARM has built in a number of provisions to ease the pain. Multiple copies of the same core on a chip usually don’t cost any more, for instance. Discounts are applied to the royalties on chips that mix two different core types, so that the total royalty remains fairly modest. Some enabling IP, like ARM’s AMBA interconnect and its memory controllers, is included as part of the package for no additional cost. I understand ARM has also been quite aggressive in discounting its Mali graphics IP for its CPU licensees in order to spur wider adoption.

Also not ARM’s offices.

That’s all quite complicated, but the thing to know about ARM’s licensing model is that it works. In fact, licensed IP is arguably the lifeblood of today’s semiconductor industry outside of a few traditional PC players. ARM’s AMBA interconnect and its CPU cores are de facto standards in the world of SoCs, but a number of other players compete with ARM pretty directly. For instance, the graphics IP in all of Apple’s iOS devices is supplied by Imagination Technologies, which abandoned the desktop PC market after the Kyro II. The success of this model has even prompted Nvidia to open up its current and future GPU portfolio for licensing by third-party device makers.

Thornton offered some numbers to illustrate the extent of ARM’s, er, reach. (Yes, I did that.) Currently, the firm has roughly 1000 processor licenses active—that is, those licenses still have the potential to generate a royalty. A total of 320 companies are partnered with ARM via license agreements, and about half of those are currently shipping ARM-based chips. Each year, the firm adds 30 to 40 new licensees, and Thornton estimates that about 80% of those end up building a chip successfully. (The other 20% are start-ups that wind up being acquired.) All told, ARM’s partners presently ship about 2.5 billion chips every quarter, and ARM estimates that its IP has shipped in a cumulative total of 45 billion chips over the years. Those are, obviously, some very big numbers. We’re only three orders of magnitude shy of the scale of government debt.
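For a sense of scale, the figures Thornton quoted can be run through some quick arithmetic. This is only a back-of-the-envelope sketch in Python; it assumes the current quarterly shipment rate holds steady, which is my simplification and not something ARM claimed.

```python
chips_per_quarter = 2.5e9   # quoted current shipment rate
cumulative_chips = 45e9     # quoted lifetime total
partners = 320              # companies with license agreements

chips_per_year = chips_per_quarter * 4
# Years needed to match the lifetime total at today's rate (assumed constant)
years_at_current_rate = cumulative_chips / chips_per_year
# "About half" of the partners are currently shipping chips
shipping_partners = partners // 2

print(chips_per_year, years_at_current_rate, shipping_partners)
```

Put another way, at the present pace ARM’s partners would match the entire historical total in about four and a half years.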

So, this is… different
The fact that ARM (along with other IP houses) doesn’t produce its own chips presents some difficulties for those of us well attuned to the traditional PC market. We’re generally accustomed to a certain level of openness about things like future CPU roadmaps, die sizes, and transistor counts. ARM shared quite a bit with those of us who attended its recent event for the tech press, including juicy details of its current architectures that we’ll be writing about in future articles. But the nature of the information ARM will disclose is limited by its business model, oftentimes because the answers to common questions depend on how and when ARM’s partners choose to implement its technology.

For instance, ARM’s engineers presented some compelling arguments about the merits of its big.LITTLE power-management scheme, which uses dual CPU architectures and intelligent thread scheduling to boost power efficiency. Trouble is, there isn’t yet a really strong implementation of big.LITTLE in the wild, and ARM doesn’t feel at liberty to disclose what it knows of its partners’ plans. As a result, it’s hard to gauge big.LITTLE’s prospects for widespread adoption. By contrast, we know definitively that Intel’s Bay Trail platform based on the Silvermont microarchitecture is slated for release later this year, and we can reasonably expect Silvermont’s power-saving features to be implemented uniformly at that time.

This one difference between ARM and Intel isn’t a big deal in itself, but it illustrates a larger reality that permeates any discussion of ARM-based solutions. Because it’s an upstream provider of technologies, ARM’s control over how its creations are implemented is limited. That fact can be a strength, given the sheer diversity of its partners’ products and how those products are tailored to specific applications. But it can also be a detriment, for example when ARM’s partners choose to pursue higher core counts and clock speeds at the expense of architectural efficiency.

I expect this drawback will be something of a persistent challenge for ARM. On the PC, Intel addressed this problem years ago by delivering near-complete platforms to PC makers, as it did with Centrino and continues to do with ultrabooks. ARM has less leverage, so it must count on its licensees to do the right thing.

Then again, have you tried to find an ultrabook with a decent touchpad? Even Intel’s clout doesn’t solve every problem.

ARM tackles this challenge in part by offering a “lead licensing” program when it’s ready to bring, say, a new CPU core to market in an SoC for the first time. The firm will choose several partners and work closely with them on the first few implementations of its new microarchitecture. These partners generally must serve different markets, and they must be able to dedicate substantial engineering resources to the project. Their reward, of course, is being one of the first to offer a new core, which seems like a pretty good incentive. This program helps ARM pass the hurdle of getting a new core etched into silicon, at least.

The issues created by the separation of ARM’s R&D efforts from its partners’ specific implementations come into sharper focus when the time comes to make performance assessments. Claims made about a microarchitecture’s instruction throughput only mean so much without a chip to test. Also, the inherent complexity of the ARM ecosystem can make it difficult to eyeball a chip and estimate its performance.

For example, ARM reckons the performance of the ultra-popular Cortex-A9 CPU core more than tripled during its lifetime. Changes in the process technology and power envelopes used by ARM partners contributed some of those gains, but so did four separate revisions made to the Cortex-A9’s RTL. We’re talking fundamental architectural tuning here, such as tweaks to the branch prediction and cache pre-fetching algorithms. Similarly, the Cortex-A15 is already on its third rev. And we haven’t even talked about the many possibilities for varying cache and uncore configurations in ARM-based SoCs. Knowing that your tablet has a Cortex-A9 at 800MHz inside of it doesn’t provide much of a basis for comparison.

So, as the contest between Intel and ARM heats up and the performance claims are inevitably bandied about, we have much to consider. I expect we’ll hear some hare-brained marketing claims from all sides of this contest. As always, the ultimate verification will come from high-quality benchmarking that relies on real applications and takes the user’s perceived experience into account. That we can do.

ARM presented us with a formidable amount of detailed info about its CPU and graphics cores, so we have lots more to say about these things. For now, though, I think we have a start on understanding how ARM’s business works and why it’s such a threat to the traditional dominance of x86 processors in high-end computing devices. That should be enough to chew on, since it pretty much changes everything.

Scott Wasson