Knights Corner supercomputer on track for 2013

Remember Larrabee? Intel’s ill-fated desktop GPU has been reborn as Knights Corner, a high-performance computing chip targeted at servers and supercomputers. It’s being built on the same 22-nm process used to fabricate Ivy Bridge CPUs and features “more than 50” cores that purportedly combine to offer a teraflop of double-precision computing power. According to an Intel spokesman quoted by Xbit Labs, the first supercomputer based on Knights Corner is still on track to be turned on early next year.

Dubbed Stampede, the system will reportedly consist of “several thousand” servers, each with dual eight-core Sandy Bridge-EP CPUs and 32GB of memory. The story doesn’t say how many Knights Corner chips will be incorporated into each of those servers, but it does note that the rig will feature 128 Kepler-based Quadro GPUs for “remote visualization.” The supercomputer will also sport a collection of servers with a terabyte of shared memory devoted to analyzing large data sets. A high-performance disk subsystem will be part of the package, too.

Intel has yet to release specifics on when Knights Corner will be available outside the Stampede supercomputer, and it’s unclear whether standalone products will come first. Knights Corner is expected to be sold as a PCI Express add-in card that competes directly with GPU computing products from AMD and Nvidia.

Comments closed
    • ish718
    • 8 years ago

    [quote<]features "more than 50" cores that purportedly combine to offer a teraflop of double-precision computing power.[/quote<] Knights Corner is equal to a HD 7970 in double precision FP performance. I wonder how much power it consumes at 22nm... AMD has yet to release a 7000 series based FireStream card..

      • DavidC1
      • 8 years ago

      Not at all. The HD 7970 offers 940 GFLOPS theoretical, while Knights Corner’s 1 TFLOPS is a measured figure. OK, the latter isn’t coming until next year, but the difference is important.

    • HighTech4US2
    • 8 years ago

    [quote<]some of the discussions around programming the upcoming MIC chips leave me scratching my head – particularly the notion that, because MIC runs the x86 instruction set, there’s no need to change your existing code, and your port will come for free.[/quote<]
    [quote<]No “Magic” Compiler: The reality is that there is no such thing as a “magic” compiler that will automatically parallelize your code. No future processor or system (from Intel, NVIDIA, or anyone else) is going to relieve today’s programmers from the hard work of preparing their applications for the future.[/quote<]
    Food for thought for those who post quick sound bites about how easy they believe it will be to program for Knights Corner just because it is x86-based.
    [url<]http://blogs.nvidia.com/2012/04/no-free-lunch-for-intel-mic-or-gpus[/url<]

      • chuckula
      • 8 years ago

      I don’t think Intel is saying that MIC magically eliminates the need for good programming, but they are saying that their architecture has advantages for certain problem sets that are not well addressed by existing GPU compute solutions.

      I don’t expect MIC to wipe out GPUs at raw number crunching, but I wouldn’t be surprised if it has advantages in code that requires conditional execution and other more complex logic than simply performing massive sets of add/mul operations on numbers.
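
      To make that concrete, here is a minimal sketch, in plain C99, of the kind of data-dependent branching I have in mind. It is purely illustrative (the function name and the particular branches are made up, not anyone's actual kernel): on a wide SIMT GPU, lanes that take different paths get serialized, while general-purpose x86 cores each just follow their own branch.

      [code<]
      #include <stddef.h>

      /* Each element picks one of three code paths depending on its value.
         On a GPU, neighbouring work-items taking different paths cause warp
         divergence; on a many-core CPU, each core simply branches. */
      double process(const double *x, size_t n)
      {
          double acc = 0.0;
          for (size_t i = 0; i < n; i++) {
              if (x[i] < 0.0)
                  acc -= x[i] * x[i];      /* path A */
              else if (x[i] < 1.0)
                  acc += x[i];             /* path B */
              else
                  acc += 1.0 / x[i];       /* path C */
          }
          return acc;
      }
      [/code<]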

        • dpaus
        • 8 years ago

        [quote<]I think.... they {Intel} are saying that their architecture has advantages for certain problem sets that are not well addressed by existing GPU compute solutions.[/quote<]
        I think they're saying "You want a GPU solution? Here's a GPU solution. From Intel. Now, why you wanna risk your good relationship with us by buying sumptin' from dose udder guys, hunh? Wassa matter for you?"

      • HighTech4US2
      • 8 years ago

      The Perils of Parallel

      [url<]http://perilsofparallel.blogspot.com/2011/10/mic-and-knights.html[/url<]
      [quote<]Right. Now, for all those readers screaming at me “OK, it runs, but does it perform?” – Well, not necessarily.[/quote<]
      [quote<]Returning to the main discussion, Intel’s MIC has the great advantage that you immediately get a simply ported, working program; and, in the cases that don’t require SIMD operations to hum, that may be all you need. Intel is pushing this notion hard. One IDF session presentation was titled “Program the SAME Here and Over There” (caps were in the title). This is a very big win, and can be sold easily because customers want to believe that they need do little. Furthermore, you will probably always need less SIMD / vector width with MIC than with GPGPU-style accelerators. Only experience over time will tell whether that really matters in a practical sense, but I suspect it does.[/quote<]
      [quote<]Hope aside, a lot of very difficult hardware and software still has to come together to make MIC work. And… Larrabee was supposed to be real, too.[/quote<]

      • TurtlePerson2
      • 8 years ago

      Intel of all people should know that you can’t just shift the burden to the compiler. Part of the idea behind Itanium was that the compiler was going to do a ton of scheduling work for you. That idea failed, kind of like Itanium.

      By the way, is “HighTech4US2” a shill? I’ve never seen him before and he posts a bunch of quotes against Intel. It’s too well sourced and produced to simply be a normal person.

        • OneArmedScissor
        • 8 years ago

        He’s been here a while. Why does someone have to be a “shill” just because you don’t like what they have to say?

        The internets are amazing. If you post incoherently, you’re a troll. If you post coherently, you’re a shill.

        • dpaus
        • 8 years ago

        [quote<]It's too.....{whatever, whatever} to simply be a normal person[/quote<]
        Let he among us who is most 'normal' cast the first stone. SSK? LiquidSpace? Deanjo? NeelyCam? Meadows? derFunk? Krogoth? Even UberGerbil doesn't qualify as 'normal'.

          • UberGerbil
          • 8 years ago

          Uhhhh, thanks. I think.

            • dpaus
            • 8 years ago

            Like Sheldon Cooper, you’re just out standing in your field.

            • UberGerbil
            • 8 years ago

            [url=http://us.123rf.com/400wm/400/400/jannyjus/jannyjus1010/jannyjus101000035/7982210-man-with-laptop-standing-in-a-field.jpg<]Indeed[/url<]

          • Forge
          • 8 years ago

          I claim this title! I. Am. Normal!

          Rawr! Ph33r!

        • NeelyCam
        • 8 years ago

        HighTech4US2, if I remember right, tends to write mostly pro-Nvidia (or rather, anti-Nvidia-competitor) posts. AMD in particular has been on the receiving end.

      • WillBach
      • 8 years ago

      That post is hosted by NVIDIA, which reminds me: does NVIDIA even have a working C or C++ compiler for their Quadro or Tesla GPGPUs? They don’t. They have provisional support for FORTRAN, but I haven’t looked at it and don’t know how extensive it is. I don’t recall anyone saying that a port from general-purpose x86 to MIC would be “free,” but there’s a big gulf between that and the total code rewrite you would need going to CUDA, or Stream, or even OpenCL.

      Even if porting code to MIC were as intensive as porting it to CUDA (or CUDA provided much better performance for the effort), your end product is written in a standards-compliant language. It’ll run on MIC, Xeon, Opteron, Power7/Power6/Cell, SPARC, or, hell, those MIPS-based processors coming out of research labs in China; i.e., anything with a C, C++, or Fortran compiler. If you port your code to CUDA, it’ll run on anything NVIDIA makes a CUDA compiler for, and that’s the same company that swore up and down that SLI would never, never work without special NVIDIA chips on the motherboard.
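
      As a rough illustration (my own sketch, with made-up names, not code from either vendor), this is what "standards-compliant" buys you: the plain C loop below builds unchanged with any conforming compiler, whereas the CUDA route means restructuring it into a __global__ kernel plus cudaMalloc/cudaMemcpy/launch boilerplate that only NVIDIA's toolchain consumes.

      [code<]
      #include <stddef.h>

      /* Plain C99: compiles as-is for x86, MIC, POWER, SPARC, MIPS, etc.
         A CUDA port of the same loop requires rewriting it as a device
         kernel and adding explicit host/device memory management. */
      void saxpy(float a, const float *x, float *y, size_t n)
      {
          for (size_t i = 0; i < n; i++)
              y[i] = a * x[i] + y[i];
      }
      [/code<]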

      Disclaimer: I run an NVIDIA GPU and an Intel CPU. Used to run NVIDIA SLI, but a single card serves my needs now.

      Edit: typos.

      • Lans
      • 8 years ago

      To be fair, as others have said, that is from Nvidia (a clear competitor in this field) and is a blog. And even author concedes it could work (albeit with reservations as you pointed out):

      [quote<]Just recompile with the –mmic flag, and your existing MPI or OpenMP code will run natively on the MIC processor! (In other words, ignore the Xeon, and just use the MIC chip as a big multi-core processor.) ... Functionally, a simple recompile may work...[/quote<]
      Of course, anyone with decent programming knowledge will scratch their head at "a magic compiler is going to automatically parallelize my code, any code I write." But in the context of existing MPI or OpenMP code, that claim seems only natural. I agree with some points made in the blog, but it remains to be seen how much of the problem space is solved by which approach.
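
      To spell out what "existing OpenMP code" means here, a minimal sketch (the source file name and exact icc command lines are my assumptions; only the -mmic flag itself comes from the quoted blog):

      [code<]
      #include <stdio.h>
      #include <omp.h>

      /* The parallelism is already expressed by the programmer via OpenMP;
         the compiler is not being asked to discover it. */
      int main(void)
      {
          double sum = 0.0;
          #pragma omp parallel for reduction(+:sum)
          for (int i = 1; i <= 10000000; i++)
              sum += 1.0 / ((double)i * i);
          printf("max threads = %d, sum = %.6f\n", omp_get_max_threads(), sum);
          return 0;
      }

      /* Host build (assumed):        icc -openmp sum.c
         MIC-native build (assumed):  icc -mmic -openmp sum.c
         Same source either way, which is the point the quote is making. */
      [/code<]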

        • willmore
        • 8 years ago

        The assumption seems to be that you’ve already done the work to parallelize your code so the compiler doesn’t have to do more than the usual magic.

        But the base argument is that the parallelization step is the hard part. So "just recompile" skips right over the hard part; that's no answer.
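
        A concrete example of why that step can't be skipped (my own illustrative C99, not from the blog): the recurrence below has a loop-carried dependence, since each y[i] needs y[i-1], so no compiler for MIC, GPUs, or anything else can simply farm the iterations out to cores. Restructuring it, say as a parallel scan, is the programmer's job.

        [code<]
        #include <stddef.h>

        /* Exponential smoothing: every iteration depends on the previous
           one, so this loop is inherently serial as written. */
        void smooth(float *y, const float *x, size_t n, float alpha)
        {
            y[0] = x[0];
            for (size_t i = 1; i < n; i++)
                y[i] = alpha * x[i] + (1.0f - alpha) * y[i - 1];
        }
        [/code<]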

    • HighTech4US2
    • 8 years ago

    [quote<]note that the rig will feature 128 Kepler-based Quadro GPUs for "remote visualization."[/quote<]
    So is Knights Corner not good at doing "remote visualization", or is Intel's software stack lacking?

      • chuckula
      • 8 years ago

      Considering MIC isn’t a graphics card, it would be hard for it to perform any “visualization” at all….

        • Scrotos
        • 8 years ago

        Ok, but why do you need 128 of them spread across several thousand servers? I don’t know much about HPC but it seems like those GPUs aren’t going to be combining powers to drive a 128-way 3D display at 5 million by 5 million pixels. And if those Quadros are just going to be used to crunch numbers to dump to a regular ol’ display, why not use KC?

        I think it’s a valid question. Are all the visualization techniques written for Cg or somethin’?

          • chuckula
          • 8 years ago

          You don’t have 128 of them “spread” over the system like salt or something. You have a large compute cluster (thousands of nodes) that churns out lots and lots of data that is then piped to a smaller number of “visualization” systems (a few dozen nodes where the graphics cards are located) that turn the results of the compute cluster into the awesome CSI-esque visual output.

          Edit: Wow I post a factual response and I still get downmodded. Looks like some fanbois can’t fathom the fact that a supercomputer might include Intel chips in it… ohs noes it’s the end of the worldz!

            • Scrotos
            • 8 years ago

            Ok, so the GPUs are still not being really used to drive a display, right? They are being used as compute nodes that basically compute the data into a visual representation? So why can’t you get away with, say, 4 GPUs to drive a fancy 3D model of the data and keep using KC to do the visualization computation?

            You see where I’m coming from, yes? Am I just not understanding how HPC visualization works? I’ve seen the results of stuff like this before: [url<]http://arstechnica.com/science/news/2012/02/a-brief-history-of-the-multiverse.ars[/url<]
            Even weather models and models of nuclear explosions seem to have simplistic visualizations like that. Is it even antialiased, or do mathematical curves move in discrete chunks like those spikes seem to do?
            I just don't quite understand why you need 128 GPUs for "visualization" if they aren't really displaying anything special. If that's the case, like HT4US2 was asking, why not just keep using KC? Intel's not got the software goods for that kinda stuff? What's the deal?

            • chuckula
            • 8 years ago

            [quote<]Ok, so the GPUs are still not being really used to drive a display, right? They are being used as compute nodes that basically compute the data into a visual representation?[/quote<]
            No, I did not say that, and I don't see where you are coming from, because you are not correctly describing how this system works. How difficult is this to understand? There's a problem, say a complex weather simulation. The main compute cluster churns through lots and lots of data to produce a model of the weather. The model is the "result" of the simulation, and it includes lots of data. The "visualization" machines then turn that model into 3D pictures and animations... this isn't that hard a concept to grasp....
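
            If it helps, here is a toy sketch of that division of labor in generic MPI and C. It is purely illustrative (rank 0 stands in for a visualization node; the array sizes and names are made up), not Stampede's actual software:

            [code<]
            #include <mpi.h>
            #include <stdio.h>

            #define CELLS 1024  /* slab of the model each compute rank owns (made-up size) */

            int main(int argc, char **argv)
            {
                int rank, size;
                MPI_Init(&argc, &argv);
                MPI_Comm_rank(MPI_COMM_WORLD, &rank);
                MPI_Comm_size(MPI_COMM_WORLD, &size);

                if (rank == 0) {
                    /* Rank 0 plays the visualization node: it only gathers model
                       data, which it would then hand to GPUs for rendering. */
                    double slab[CELLS];
                    for (int src = 1; src < size; src++) {
                        MPI_Recv(slab, CELLS, MPI_DOUBLE, src, 0, MPI_COMM_WORLD,
                                 MPI_STATUS_IGNORE);
                        printf("viz node received a slab from compute rank %d\n", src);
                    }
                } else {
                    /* Every other rank is a compute node: run the simulation
                       (faked here), then ship the results to the viz node. */
                    double slab[CELLS];
                    for (int i = 0; i < CELLS; i++)
                        slab[i] = rank + i * 1e-3;
                    MPI_Send(slab, CELLS, MPI_DOUBLE, 0, 0, MPI_COMM_WORLD);
                }

                MPI_Finalize();
                return 0;
            }
            [/code<]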

            • Scrotos
            • 8 years ago

            No, I get it. In which case, why use nvidia GPUs for that visualization instead of just more KC? That’s what HT asked. You came back to him and said it wasn’t a graphics card. Ok. Well, those Quadros aren’t displaying anything on the screen, are they? They are number crunching the raw data into a visual representation. So again, why switch vendors and APIs instead of using only KC? Does that mean that KC’s support for some reason can’t handle crunching raw data into 3D pictures and animations?

            I understand it’s up to the group building the supercomputer to dictate why they want to use Cg or CUDA or whatever for their (assuming here) premade visualization clusters. But this IS the first public deployment of KC and you’d think Intel would bend over backwards to NOT be bundled with a competitor. What’s the message? Our chips are ok for HPC stuff but you’ll still need to go to another vendor for some of your stuff because we suck at it?

            To me, that’s like Intel pimping an Itanium-based supercomputer and, oh, yeah there’s a few hundred AMD chips sorting the data because Xeons for some reason couldn’t handle that task.

            • HighTech4US2
            • 8 years ago

            [quote<]That's what HT asked. You came back to him and said it wasn't a graphics card. Ok. Well, those Quadros aren't displaying anything on the screen, are they? They are number crunching the raw data into a visual representation. So again, why switch vendors and APIs instead of using only KC? Does that mean that KC's support for some reason can't handle crunching raw data into 3D pictures and animations? What's the message? Our chips are ok for HPC stuff but you'll still need to go to another vendor for some of your stuff because we suck at it? To me, that's like Intel pimping an Itanium-based supercomputer and, oh, yeah there's a few hundred AMD chips sorting the data because Xeons for some reason couldn't handle that task.[/quote<]
            Well stated. Now we wait for Chuck's response.

    • HighTech4US2
    • 8 years ago

    Knights Corner, i.e. Larrabee II

    [quote<]"Knights Corner is in great shape and is exactly where it has to be according to our internal schedule.[/quote<] So Intel is on schedule based on a non-public schedule that can and does change whenever they want it to. Intel Internal Discussion: Lets see "Knights Corner" isn't ready yet. Hurry up and make up some internal schedule and then post a statement that it is exactly where it has to be according to our internal schedule.

      • chuckula
      • 8 years ago

      How is this different than Nvidia’s schedule for parts? Are you upset that Intel is copying Nvidia now?

        • dpaus
        • 8 years ago

        [quote<]Are you upset that Intel is copying Nvidia now?[/quote<] No, that AMD [i<]isn't[/i<] copying Intel on this advanced manufacturing technique.

        • Deanjo
        • 8 years ago

        The one difference I can think of is that Nvidia does eventually bring out its products. There have been a few Intel projects that gave roadmaps to release and then dropped the project stone cold.

      • superjawes
      • 8 years ago

      Can I tell my boss it’s okay that I missed a deadline because the schedule wasn’t public?

        • smilingcrow
        • 8 years ago

        If you want to get fired then Yes. 🙂

          • superjawes
          • 8 years ago

          Thanks a lot captain obvious!

          • crabjokeman
          • 8 years ago

          [url<]http://www.thefreedictionary.com/rhetorical+question[/url<]

        • UberGerbil
        • 8 years ago

        I’m pretty sure Wally used that explanation successfully with the PHB

        • WillBach
        • 8 years ago

        My boss used to work on Larrabee, I should ask him that…
