
When I was first introduced to semiconductor manufacturing (I can't even remember when now), there was always talk of this mythical process called the tapeout. I always wondered why it was called that, because I couldn't fathom where a tape was ever involved in semiconductor design or manufacturing. I fathomed right, because it's a legacy term from back in the day, when the final data required for manufacturing was actually delivered on a tape or tapes. These days, the data is transferred electronically.

So what are the constituent parts of a tapeout as far as the chip design goes? The big part is the GDS2, produced by the synthesis and layout steps. It's sent to a mask house to be turned into a literal photon-blocking mask. More about why in the next part.

The mask, or normally a set of masks for today's designs, is the most critical part of the manufacturing process. In principle, the foundry can modify the mask set after its creation, but in practice it's generally treated as set in stone, so it's critical that the mask house gets it right.

The delivery of GDS2 to the mask company is digital, but the mask is obviously a physical object. It looks really cool in person, if you're ever lucky enough to see one; it's big enough that you can see the constituent blocks of the design in good detail without a microscope.

The mask set and some associated metadata are then sent to the foundry for manufacturing.

An AMD employee shows off a 300-mm wafer.

This part is a series of books on its own, and I'm no expert, so I'm going to be brief. Silicon dice, and I presume it'll be the same with the replacement materials for silicon when they eventually arrive, have circuits etched into them by lithography. And it's not really just silicon either; the actual material is a mixture of silicon and dopants to give the resulting transistors certain electrical and switching properties.

Laser light, these days specific short wavelengths of ultraviolet and I believe usually 193-nm UV, is shone through the mask created in the previous step. It then passes through complex optics that focus and steady the beam, projecting the mask's pattern onto the wafer. Strictly speaking, the light exposes a light-sensitive resist layer on the wafer, and the exposed pattern is then etched into the material underneath. Some foundry processes even pass the beam through water, using the water as a lens! Since the transistor feature size is smaller than the light wavelength, the process is called sub-wavelength lithography.

The wafer is moved underneath the laser light to manufacture each individual die on the wafer. Apparently, there's an incredibly complex set of computations happening in real time to correct the optical assembly and the laser emission in order to ensure the dice are etched without defects. Some process nodes require multiple exposures through the mask set per chip, increasing time and cost.

Then the metal stack is laid on top and the full die assembly comes together (a gross oversimplification, but I don't really understand the metal stack assembly and how it works). The wafers are then cut so that each individual die can be taken out. I believe modern wafers, even for tiny dice, are cut with what's effectively a saw rather than something like a laser. The margin for error is incredibly small given how tightly the dice are packed together on the wafer.

Testing and packaging
After the dice are cut from the wafer, they need to be packaged into something that can be placed into the final device. Packaging at this point depends entirely on the chip and the target device market. For big PC chips like CPUs and GPUs, that usually means placement onto an organic substrate that connects the metal pins on the die to larger balls or pins underneath the package.

Unless you own your own foundry, as Intel and Samsung do, packaging tends to happen externally, via third-party companies. The supply chain for semiconductors is really quite long. Most people assume that to create the final packaged chip, the foundry does everything after receiving the design from the designer, but that's not true. External packaging adds some latency to the production process on top of the time taken for manufacturing by the foundry. The cut dice are sent to the packaging house for that step, then the packaging house sends them to another place for testing. There's been some recent consolidation of this part of the production process, with packaging and testing houses becoming the same entity by merger or acquisition. Geographically, almost all of those for hire are in Taiwan.

A BGA-style Haswell SoC package with separate processor and platform chips onboard

Testing is the point in the process where you figure out if the chip is going to work or not. Certain tests can be performed on the full die, ahead of packaging. But there are some that obviously can't, where you need the chip to be completely functional, powered on fully, and running certain external or self tests to determine operational functionality.

There are obviously certain other steps in testing, usually longer completely functional tests with full software stacks. Here, the packages are placed into form-factor devices and run through long run-time tests in varied operating conditions to ensure the chip can run in all of the environments it will ever find itself in. Those kinds of tests tend to be done by the chip vendor, with the chip in situ in a device form factor that's representative of what you'll finally buy.

If the chip vendor is happy at this point and testing completes properly, the design is signed off for limited production.

Yields and binning
Yields are computable at this point. You know how many wafer starts you had, you know how many chips came back working, and you can start to bin those chips at various grades. Because of the inherent nature of the physical manufacturing process, and despite the high degree of control over the whole process from the individual wafers upwards, not every chip is the same.

You want every chip to be identical, but there are inherent things stopping that from being the case. Sometimes you have functional defects, where blocks of logic on the chip just don't work, but where you can suffer the loss and sell the chip as a slightly different SKU with the defective blocks turned off, and with different performance.

Sometimes you have process variation, where everything works the same as another copy of the same chip, but it won't clock as high at the same voltage. So you have to test the chip functionally for defects and then test it operationally to find out where on the voltage/frequency/power curve it sits. Binning is inherently time consuming and therefore adds significant cost, but it's the only way to guarantee you can sell as many dice as possible from a given production run.

Otherwise, if your products demand uniform performance so your customers know exactly what they're buying, and you only have a couple of performance levels to sell at, say a tablet and a phone, then you'll have to discard some of your production run unless your foundry is fantastic. It's all a big trade-off between the complexity of the chip, the complexity of manufacturing, and the target devices the chip is supposed to go into in the end.
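To make the yield and binning idea concrete, here is a minimal Python sketch. Everything in it is an assumption made up for illustration: the `DieResult` fields, the bin names, and the thresholds in `BINS` don't come from any real vendor's test flow. It only shows the shape of the logic: test each die for defective blocks and for the clock it reaches at a reference voltage, drop it into the most demanding bin it qualifies for, and count what's sellable.

```python
from dataclasses import dataclass


@dataclass
class DieResult:
    """Hypothetical per-die test result."""
    defective_blocks: int  # logic blocks that failed functional test
    max_mhz: int           # highest stable clock at the reference voltage


# Hypothetical SKU table: (bin name, max defective blocks tolerated, min MHz),
# ordered from most demanding to least demanding.
BINS = [
    ("full", 0, 1000),
    ("full-slow", 0, 800),
    ("cut-down", 1, 800),
]


def assign_bin(die):
    """Return the first (best) bin the die qualifies for, or None to discard."""
    for name, max_defects, min_mhz in BINS:
        if die.defective_blocks <= max_defects and die.max_mhz >= min_mhz:
            return name
    return None


def summarize(dice):
    """Count dice per bin and return (counts, sellable fraction of tested dice)."""
    if not dice:
        raise ValueError("no dice were tested")
    counts = {}
    for die in dice:
        bin_name = assign_bin(die) or "discard"
        counts[bin_name] = counts.get(bin_name, 0) + 1
    sellable = len(dice) - counts.get("discard", 0)
    return counts, sellable / len(dice)
```

Calling `summarize` on a list of `DieResult` values gives the per-bin counts and the sellable fraction. Removing the "full-slow" and "cut-down" rows from `BINS` models the uniform-performance case described above: the same production run then yields fewer sellable dice.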

There are various stages of production in a chip's lifetime. The first part already happened, if you've been following along. There have been enough wafer runs to produce enough chips to sign off on basic functional and operational tests. That's hundreds of chips, but usually not thousands.

Then modern chips usually go into a wider—but still not full—production run to create enough chips for device vendors, which then make sure the chip behaves properly in final devices. More on that soon.

Then there's full mass production. At this point, there's a lot of money at stake for the chip vendor. They place an order with the foundry that usually can't be altered; because of how the foundries work, which is related in part to the manufacturing section earlier, the chip being produced in a foundry can't be changed quickly. Remember that this whole process takes place in completely clean environments, with no contaminants that could land on the chip and spoil the lithography, and every swap of the mask set and wafers is a chance to introduce something that could compromise yields. To amortize the cost of swapping one set of masks and wafers out, you either have to place a big order, or you have to wait for the foundry to be doing something special with the production run for some reason.

That does happen from time to time. For example, when a new fab building comes online, the foundry will dedicate time and energy to swapping out the wafer types, optics, and mask sets more often than normal in order to produce a bunch of different designs, test out the production pipeline, and make sure the fab is operational. They'll often produce different designs on the same wafer at this point! For certain kinds of wafer starts, Vendor A's chips might be right next to Vendor B's chips on the wafer, without either one ever knowing.

Mass production is usually on the order of at least hundreds of thousands of chips for consumer device designs, if not tens or even the low hundreds of millions over the production lifetime of some devices. Economies of scale kick in big time here, especially for the bigger chip designers. Some companies are able to keep an entire fab building consumed with a single chip design, for a single target market, for extended periods of weeks or longer at a time.

The economics of production mean that it's not financially viable for a fab to run cold with no wafer starts, or for it to be constantly swapping and changing design starts. So chip vendors that can guarantee volumes and longer runs of production get priority over the smaller vendors that don't need as much or that need many more designs to go into production.
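The amortization argument can be sketched with some back-of-the-envelope arithmetic. The following Python snippet is only an illustration under assumed numbers: the `changeover_cost`, `wafer_cost`, and `good_dice_per_wafer` parameters are hypothetical and don't reflect any real foundry's pricing. It spreads a fixed changeover cost over a production run to show why longer runs are cheaper per chip.

```python
def cost_per_chip(changeover_cost, wafer_cost, good_dice_per_wafer, wafers):
    """Per-chip cost once a fixed changeover cost is spread over a run.

    All inputs are hypothetical: changeover_cost is the one-off cost of
    swapping masks and wafers in, wafer_cost is the cost of processing one
    wafer, and good_dice_per_wafer is the number of sellable dice per wafer.
    """
    if wafers <= 0 or good_dice_per_wafer <= 0:
        raise ValueError("wafers and good_dice_per_wafer must be positive")
    total_cost = changeover_cost + wafer_cost * wafers
    return total_cost / (good_dice_per_wafer * wafers)


# Made-up figures: the same design ordered as a short run and as a long run.
short_run = cost_per_chip(100_000, 5_000, 500, wafers=10)
long_run = cost_per_chip(100_000, 5_000, 500, wafers=1_000)
```

With these made-up figures, the per-wafer cost alone works out to 10 per chip, and the changeover cost adds to that in inverse proportion to the run length. It's the same pressure, in miniature, that pushes foundries to favor vendors who can guarantee volume.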