AMD puts on the ritz with six-core Opteron demo

AUSTIN, TEXAS - With Intel’s Nehalem-based Xeons gathering like a storm on the horizon, AMD today gave the first working demonstration of its potential counterpunch: a six-core Opteron processor code-named “Istanbul.” Istanbul is a fairly straightforward upgrade over current “Shanghai” Opterons: a 45nm processor with 6MB of L3 cache that fits into Socket F-style motherboards, only with six cores rather than four. As a result, the upcoming Istanbul-based Opterons will serve as drop-in upgrades for existing Socket F systems. The chips will take advantage of the same 2P, 4P, and 8P infrastructure as today’s Opterons, with HyperTransport and two channels of DDR2 memory per socket.

AMD has previously stated that Istanbul processors will become available in the second half of this year, and the firm hasn’t yet provided any more specific guidance about when to expect Istanbul-based systems. However, the presence of working silicon would seem to indicate that Istanbul Opterons could be introduced much earlier in that broad “second half” time-frame than originally anticipated.

AMD showed us several demonstrations of Istanbul silicon in action. The first was a simple showing of Task Manager on the Windows Server 2008 desktop, in which the utility showed activity indicators for each of the 24 cores in a quad-socket system.

Simple, yet impressive for what it indicated. The second demo was conducted on a dual-socket system with 12 cores. The main OS was Windows Server 2008, but the system also hosted three separate virtual machines: one each for Windows Server 2003, Red Hat Linux, and SLES 11 x64. Each VM had four cores dedicated to it.

The third demo was the most interesting for a couple of reasons. First, because it was intended to show how Istanbul can serve as a drop-in upgrade for Socket F systems. The only requirements: the system must support split power planes, and it must have a BIOS upgrade to operate with the new processors. Second, the demo was impressive because it included a performance test. Two otherwise-identical systems were situated side by side: one with a quartet of Shanghai Opterons, the other with four Istanbul chips. Both systems were running with HyperTransport 3 active—a capability coming soon to Shanghai Opterons but not yet available in current products. To illustrate the performance difference between the two boxes, the AMD tech ran a Stream benchmark. The 16-core Shanghai system produced throughput numbers in the range of 25,000 MB/s. The 24-core Istanbul box, by contrast, hit about 42,000 MB/s. The tech then swapped the processor-and-memory daughtercards between the two boxes, and of course, the performance characteristics moved with them.

That’s one heck of an in-place upgrade, but the bigger question may be: Why the huge performance gain with the addition of more cores, given that Stream is typically considered, at least partially, a bandwidth-bound benchmark? And why the magnitude of the gain, with only 50% more cores and (although they were not disclosed) likely lower per-core clock frequencies for Istanbul?
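As a rough check on the size of that gain, the arithmetic can be worked through from the approximate figures quoted in the demo. This is only a sketch: the function name `scaling_summary` is our own, and the inputs are the rounded numbers reported above, not independent measurements.

```python
def scaling_summary(base_cores, base_mbps, new_cores, new_mbps):
    """Compare two systems by core count and Stream throughput.

    Returns the ratio of core counts, the ratio of total throughput,
    and the ratio of per-core throughput (new system over base system).
    """
    core_ratio = new_cores / base_cores
    throughput_ratio = new_mbps / base_mbps
    per_core_ratio = (new_mbps / new_cores) / (base_mbps / base_cores)
    return {
        "core_ratio": core_ratio,
        "throughput_ratio": throughput_ratio,
        "per_core_ratio": per_core_ratio,
    }


# Approximate figures from the demo: 16-core Shanghai vs. 24-core Istanbul.
summary = scaling_summary(16, 25_000, 24, 42_000)
for name, value in summary.items():
    print(f"{name}: {value:.2f}")
```

With these inputs, the Istanbul box has 1.5 times the cores but roughly 1.68 times the throughput, so throughput per core rises by about 12% instead of falling, which is the opposite of what one would expect from a purely bandwidth-bound test.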

Part of the answer, it seems, may be a feature new to Istanbul that AMD calls HT assist (presumably for HyperTransport assist). This feature is what the company calls a probe filter (and may more commonly be called a snoop filter) that functions to reduce traffic on socket-to-socket HyperTransport links by storing an index of all caches and preventing unnecessary coherency synchronization requests. Current Opteron systems use a broadcast-based probe protocol, sending probe requests to all sockets. Istanbul, instead, either knows that no probes are required or is able to do a directed probe to a single socket. (Although it may still use broadcasts in certain, specific situations.) Istanbul’s probe filter stores its data in the processor’s L3 cache. The amount of cache space dedicated to probe filter storage, AMD says, will be configurable in the BIOS, and the more space dedicated to probe filter storage, the more granular its operation will be.
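To make the difference between broadcast and directed probes concrete, here is a deliberately simplified toy model. It is not AMD's implementation: the function `count_probes`, the single-owner assumption (each cache line is held by at most one socket at a time), and the lack of any capacity limit or broadcast fallback are all our own simplifications for illustration.

```python
def count_probes(num_sockets, accesses, use_filter):
    """Count coherency probes for a sequence of (socket, line) accesses.

    Toy model: `owner` tracks which socket currently caches each line.
    Without a filter, any access that is not a local hit broadcasts a
    probe to every other socket. With a filter, the requester consults
    the directory and sends either no probe (line cached nowhere) or
    one directed probe to the owning socket.
    """
    owner = {}  # line -> socket currently caching it (ground truth)
    probes = 0
    for socket, line in accesses:
        current = owner.get(line)
        if current == socket:
            continue  # local hit: no probe traffic in either scheme
        if use_filter:
            probes += 0 if current is None else 1
        else:
            probes += num_sockets - 1
        owner[line] = socket  # requester now holds the line
    return probes


# Four sockets; socket 1 reads a line that socket 0 cached earlier.
trace = [(0, "a"), (1, "a"), (1, "a"), (2, "b")]
print("broadcast:", count_probes(4, trace, use_filter=False))
print("filtered: ", count_probes(4, trace, use_filter=True))
```

On this short trace the broadcast scheme sends nine probes and the filtered scheme sends one. The gap widens with socket count, which is consistent with the expectation that the feature matters most in systems with four or more sockets.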

AMD didn’t handicap the exact performance impact of HT assist for us, but the quad-socket Stream test may have been an extreme case. The probe filter capability will be unique to the six-core Istanbul and will not be incorporated in a future revision of Shanghai. However, with fewer cores, Shanghai Opterons will be able to reach higher clock speeds within the same power envelopes as Istanbul, and AMD expects the clock frequency advantage to somewhat offset the lack of a probe filter. Regardless, one would expect Istanbul to be especially popular in systems with four or more sockets, where coherency traffic is a thornier issue and HyperTransport bandwidth is at a premium.

AMD plans a full lineup of six-core Opterons based on Istanbul, including low-power HE versions and high-performance SE models, all within the customary Opteron power and thermal envelopes. Although the percentage of server installations that take advantage of the drop-in upgrade opportunity is relatively small—AMD estimates it at around 5%—the firm hopes that tighter IT budgets and a strong value proposition might prompt a change in habits among some IT decision-makers. Beyond that, AMD expects system vendors to treat Istanbul very much like any other new Opteron speed grade, with a much easier qualification path than an all-new product. That should mean fairly quick and widespread adoption of six-core Opterons among vendors shipping Shanghai-based systems today, if all goes as planned.

Responses to “AMD puts on the ritz with six-core Opteron demo”

  1. I had a thought about this from listening to the Podcast:

    Wouldn’t having 3 servers with 4 cores each be a bad idea in this case? At least one of those servers would have its cores spread across multiple CPUs, which could play havoc on the cache/memory side of things. I’d think that having two servers share one chip with 3 cores each, and then giving the most demanding one all 6 cores of the other chip, would work better in many cases. I guess I don’t really have a feel for how much logic is built into the chips for cache and memory association.

  2. Just that if you were to buy today, Opterons are a very compelling product, and in many cases have better performance/watt or performance/$$ than Xeons. Nehalem-based ones should level the playing field considerably, but those aren’t out yet.

  3. 24 cores in task manager?
    How about 64 cores in task manager?
    8-core Nehalem-EX, quad socket, with Hyper-Threading is going to be exactly that.

  4. well and you’re like 50, so it’s obviously older. We agree, not sure why you insist on arguing about it.

  5. I had heard it as a kid, played on actual vinyl records. I was at a summer festival where they played it prior to releasing it on an album, and my friends were surprised I already knew the words to a “new” song (and assumed I had heard it by calling their “dial-a-song” number in NY). But long after that I heard people referring to it as a TMBG song.

  6. Yeah, a real shame that we’re limited to 60+ frames per second on current games on today’s hardware (with 2 or 3 exceptions).

  7. Don’t much care what a game site editor prints.
    We delivered Dunnington systems to many tier 1 customers over 1 1/2 years ago. One was a four-box unit, with a quad-socket board per box filled with Dunningtons, delivered to Microsoft as a 96-core unit.

  8. Maybe you can act as his translator. I don’t know what “features” they’re missing. These 6-core Opterons aren’t available in the open market any more than Nehalem Xeons, and Nehalem Xeons are (supposedly) coming in March. Last I checked, that’s as little as 10 days away.

  9. It never occurred to me that they might have written it. I always assumed it was much older than that.

  10. Ahem, no.

    Intel is the one that still needs to play catch-up in the enterprise market in terms of features.

    They still need to deliver their Nehalem-based Xeons. I suspect that they will be significantly faster than their Core 2 predecessors in multi-socket setups and can beat or rival Opterons.

  11. Obviously you didn’t look at AMD’s roadmap and see the MCM version of Istanbul is Magny-Cours, which is 12 cores in a socket.

  12. Count me with #20 then. (Though I am amazed how many people think that song was written by TMBG)

  13. Intel has 6- and 8-core models near in the pipe, and 32-nm hitting later this year.

    AMD may have some chips that kinda-sorta look like Intel’s chips in a block diagram, but AMD will never be “competitive”.

    And what with AMD’s new cost structure, where it pays a profit to its partners for every chip it makes, its margins just took a dive.

  14. Most likely this is an MCM made of two triple-core dice. AMD is pretty good at getting 75% of its fabbed cores to work.

  15. Eight socket systems are rare and disproportionately expensive (it’s usually cheaper, and sometimes more performant, and always better from a fail-over standpoint, to buy two 4P systems than one 8P system). The 4P systems are the sweet spot. And not many users of this class of hardware do “drop in” upgrades — they typically swap the whole box. But a drop-in upgrade certainly reduces the validation costs and time-to-market for the OEMs.

    There are some clusters that will take an in-place upgrade though, and that should be a big boost for them.

    The high end 8P system fight is going to be entertaining to watch, but the real meat of the market is 2P and 4P systems. That’s where AMD has to have a competitive offering to keep their business alive.

  16. Didn’t the article say that 6-core Dunnington Xeon processors were released about 4-5 months ago? How is this 2-3 years behind? Also, there are no Xeons (this will change on March 29th) with integrated memory controllers or high-speed links between processors. They all connect directly to the northbridge. Now of course Intel has other architectural advantages too, but this doesn’t make either company 2-3 years behind the other. I guess I just don’t understand your comment or you don’t understand the server/workstation processor market very well.

  17. LOL @ ignorance!

    100% of Fortune 100 companies use VMware. Opterons wipe the floor with Xeons in virtualization benchmarks. Opterons are the top systems in each category except 24 cores since they have no 24 core config. Doesn’t matter though since a 16 core Opteron box is faster than a 24 core Xeon box.

  18. Everyone seems to be missing the other shoe here. Istanbul will also be a drop in replacement in existing 8-socket Barcelona and Shanghai systems, as well as support the new HT3 star topology with 1-hop 8-bit wide HT paths between all of the cores. But let’s stick with existing 8-socket systems from Sun and HP among others. The new topology will require new hardware designs, but Istanbul will be a drop-in for existing designs, whether done as an upgrade, or done by the OEMs as new options for existing hardware.

    The HT assist should result in these 48 core systems scaling pretty well. At that level, the real competition is IBM Power, SPARC, and possibly Itanium 2.

  19. AMD is about 2 years behind Intel with this product. By the time AMD releases it, they will be almost 3 years behind.

  20. Yes, people seem to forget that in terms of a stable server image, people currently using AMD systems are likely to keep on using them, since the same base system can be used. In addition, the platform is well known and works.

    Intel is moving to their new platform, with untested motherboards, new CPU platforms, etc. I’m sure they’ll market the hell out of their solution.

    AMD do have a good upgrade proposition. 50% more cores without needing new servers, which may be used by some companies to stretch their server lives during these times.

    I do think this processor has some worries – 1MB L3 per core for a start, and some of that is taken up by the probe functionality too.

    The CPU must be around 350mm^2 as well. Hopefully the 32nm version will have 12MB L3 cache and 6-8 cores. Except that’s 18 months away for AMD…

  21. Istanbul was Constantinople
    Now it’s Istanbul, not Constantinople
    Been a long time gone, Constantinople
    Now it’s a Turkish delight on a moonlit night.

    The fact that it’s a drop-in replacement for dual-core Socket F platforms should get this thing at least SOME sort of audience. That is, if it’s really a drop-in replacement for all server boards and not some half-assed “maybe” upgrade like the Phenom, which could go in some AM2 boards and not others.

    I get really annoyed when the socket is the same but the CPU upgrade path isn’t there. Just like LGA 775 Pentium Ds on Intel 925 boards couldn’t be upgraded to LGA 775 Core 2 CPUs.

  22. At least AMD still has a chance to become competitive.

    The reason is as follows. Intel’s 8 cores + 8 virtual cores = ~10-12 real cores. As shown in various multicore benchmarks, there is a boost of 20-40%.

    So in a dual-socket setup, Intel has a 20-24 “real core” system. AMD on a quad has 24 real cores. Granted, Intel is more efficient. So a direct comparison (when factoring in MHz and efficiency) would be a 26-30 core Intel system to AMD’s 24.

    AMD is behind, there is no doubt about it, but it is at least in the game (Intel dual socket vs. AMD quad socket, and Intel quad socket vs. AMD octal socket).

    So overall, it will all depend on the prices at which Intel and AMD release their platforms. I don’t care who has the best tech, as long as there is good competition. This 6-core chip should give AMD at least that.

    And for me this is the most important.

  23. Until AMD has a new architecture that will be true. We all know that. I may have missed it in the article, but is this a native six core design or MCM? Pretty cool we can pack 4,6 or more cores onto one die.

  24. A 4S Dunnington system was already tested at AnandTech some time ago, and scaling was rather poor for a 24-core array. Part of the reason was that the L3 cache runs at half the core clock rate, just to maintain the TDP limit.

  25. It’d be interesting to see how Istanbul fared against last September’s six-core Dunnington Xeon 74xx processors in a four-socket configuration, with a 24-core box on either side.

  26. Intel is making an 8 core/16 threads Xeon with 24MB L3 cache

    So, Intel will still hold the performance crown.

  27. This is pleasant news indeed. I’ve been waiting for news on this but didn’t expect it to come out this early. It looks like AMD may have a decent chance of surviving next year!