FirePro S7100 graphics cards bring hardware GPU virtualization to life

AMD first revealed its plans for hardware-based GPU virtualization, called Multiuser GPU or MxGPU, in September of last year. Today, the company is taking the wraps off the first two FirePro server cards that incorporate those hardware features: the FirePro S7150 and the FirePro S7150 X2. AMD says these cards are ideal for delivering workstation-class graphics in virtual desktop infrastructure (or VDI) environments.

AMD believes the S7100-series cards will help organizations solve a couple of growing problems with workstation computing. For one, the company says organizations are dealing with increasingly large data sets that also need to remain secure. In turn, it makes sense for those data sets to live in the datacenter rather than on a client workstation. Using virtualized graphics also lets workers use thin-and-light machines on the go instead of a desktop workstation that's tied to one place.

What's in that "hardware-based virtualization" name? In short, AMD uses the SR-IOV standard to present the physical graphics card as multiple virtual devices on the PCIe bus. The hardware then uses time-slicing to switch between each of 16 virtual contexts in a round-robin fashion, performing computations and returning the results of that work to the client before moving on to the next context. AMD says this approach delivers more consistent performance and more secure computing, since no one user can tie up the entire graphics card and each virtual client has its own distinct slice of the GPU's memory.
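
To make that scheduling model concrete, here's a minimal Python sketch of the round-robin time-slicing described above. The context count matches the 16 virtual functions AMD describes, but the queue structure and fixed slice length are illustrative assumptions for this sketch, not details of AMD's actual scheduler.

```python
# A toy round-robin scheduler: visit each virtual context in turn, run its
# pending work for one slice, return results, then move to the next context.
from collections import deque

NUM_VFS = 16     # SR-IOV virtual functions presented on the PCIe bus
SLICE_MS = 1.0   # hypothetical, fixed-length time slice per context

def schedule(work_queues):
    contexts = deque(range(NUM_VFS))
    while any(work_queues.values()):        # run until every queue drains
        vf = contexts[0]
        contexts.rotate(-1)                 # round-robin: next context up
        if work_queues[vf]:
            job = work_queues[vf].pop(0)
            print(f"VF {vf}: ran '{job}' for {SLICE_MS} ms, results returned")

# Example: two busy clients, fourteen idle ones
queues = {vf: [] for vf in range(NUM_VFS)}
queues[0] = ["draw call A"]
queues[5] = ["draw call B", "draw call C"]
schedule(queues)
```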

Both S7100-series cards are built using AMD's Tonga GPU. The S7150 is a single-slot, single-GPU card with 2,048 stream processors that will be available in actively or passively cooled versions. This card is expected to operate in a 150W thermal envelope. The S7150 X2 is a passively cooled, dual-GPU card that'll only be available in a dual-slot, full-height configuration. It'll offer 4,096 stream processors across its two GPUs, and it'll dissipate 265W in operation.

Since both of these cards are designed for use in VDI environments, neither features built-in display outputs. Both cards are also less than 10.5" long for compatibility with common server chassis, and they both feature out-of-band temperature monitoring support.

On the host side, MxGPU is compatible with VMware's ESXi and vSphere solutions from version 5.5 on. AMD will support Windows 7 and 8.1 as guest operating systems, using the same graphics driver it provides for non-virtualized desktop operating systems. Each GPU can handle up to 16 users, so the S7150 will support up to 16 users per card while the S7150 X2 can handle up to 32. Graphics performance will scale inversely with the number of users. AMD says 2-6 designers or engineers can share an S7100-series GPU, while 6-10 "CAD viewers" or up to 16 "knowledge workers" can make use of the card.
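
As a rough illustration of that inverse scaling, the snippet below simply divides one S7150's 2,048 stream processors evenly across AMD's suggested user counts. It's plain division for illustration only; real time-sliced workloads won't share this linearly.

```python
# Per-user share of one S7150 (2,048 stream processors) at AMD's suggested
# user counts. Assumes a perfectly even, linear split, which is a
# simplification of how time-sliced sharing behaves in practice.
TOTAL_SPS = 2048

for users in (2, 6, 10, 16):
    print(f"{users:2d} users -> ~{TOTAL_SPS // users} stream processors' "
          f"worth each ({100 / users:.1f}% of GPU time)")
```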

Nvidia offers a similar solution with its Grid virtual graphics product for businesses, but AMD's MxGPU could offer compelling advantages over Grid in some cases. For one, S7100-series GPUs can provide OpenCL support to all virtual users without relying on pass-through mode, which dedicates the resources of an entire graphics card to a single virtual user. AMD's MxGPU also doesn't rely on per-user licenses or profiles as Grid does, so system administrators are free to provision an entire S7100-series virtual graphics card among their VDI VMs as they see fit.
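
Because each virtual function is meant to look like an ordinary GPU to its guest, a guest application should be able to find it through the standard OpenCL APIs. Here's a minimal sketch using the pyopencl package; treating the MxGPU virtual function as a normal OpenCL device is an assumption based on AMD's description, and the names and sizes printed depend entirely on what the guest driver reports.

```python
# Minimal sketch: enumerate OpenCL GPUs from inside a guest VM. An MxGPU
# virtual function should appear here like any other GPU, with its own
# distinct slice of the card's memory. Requires the pyopencl package and
# an OpenCL-capable guest driver.
import pyopencl as cl

for platform in cl.get_platforms():
    for device in platform.get_devices():
        if device.type & cl.device_type.GPU:
            print(f"{platform.name}: {device.name}, "
                  f"{device.global_mem_size // (1024 ** 2)} MB visible to this VM")
```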

The S7150 will carry a $2399 MSRP when it becomes available in servers from major vendors in the first half of this year, while the S7150 X2 will be priced at $3999.

Comments closed
    • ronch
    • 4 years ago

    Is the market for these things that big?

      • helix
      • 4 years ago

      It’s a lucrative niche.
      If you can replace under-the-desk, one-per-user workstations with big multi-user remote workstations, the company saves money and employees can be more mobile. So you accept a high price tag.

    • Buzzard44
    • 4 years ago

    Uhhh….what?

    [quote<]organizations are dealing with increasingly large data sets[/quote<]

    If you're actually working with a big data set where you need the full power and parallelism of a GPU, you probably want a full GPU or a bank of GPUs - not a tiny sliver of a GPU. I don't see this being a particularly good use for it.

    [quote<]no one user can tie up the entire graphics card and each virtual client has its own distinct slice of the GPU's memory[/quote<]

    So what you're saying is that despite the fact that I have a virtual GPU, I have no computing elasticity and can't scale up or down - one of the main advantages of virtualization?

    [quote<]Using virtualized graphics also lets workers use thin-and-light computers on the go rather than a desktop workstation that's tied to one place.[/quote<]

    How does this make sense? You can already do this with a physical GPU on a remote machine or VM using IOMMU.

    The only niche I see this filling is a multi-headed prosumer or gaming rig (though it's largely priced out of that market) or to service modest video requirements on a bank of remote VMs that people remotely connect to with a console.

    Good first steps, but give me an implementation where you can provision stream processors and memory per vGPU, and then you have a really interesting product. This time-slicing implementation is a bit crude.

      • BaronMatrix
      • 4 years ago

      Companies with a need tend to buy a LOT MORE than one… The point is that for consultants moving in and out, a bank of 12 would handle a full complement of devs…

      Plus render time can be placed into slices since multiple devs won’t always be rendering loads at once…

        • Buzzard44
        • 4 years ago

        But that’s my point – all your users won’t always be rendering loads at once. So if one dev needs to render something while the other n devs are sleeping, that dev can’t take advantage of the whole bank of GPUs – they’re restricted to their little slice, while the rest of the GPU time slices sit idle. Extremely inefficient, as opposed to most cloud computing solutions, where the elasticity of the compute means you can have less hardware servicing more people, because the peak demand from each user is spread out over time.

          • slowriot
          • 4 years ago

          I don’t see these being useful for rendering.

          If anything, they're serving, say, editor/modeler/dev/designer VMs that need dedicated but minimal GPU resources to accelerate an app front-end. The rendering would be done on a server.

          You could then potentially have a single server with this new FirePro inside serving VMs for all your editors, while all rendering is handled on a dedicated server. The VM host server and render server would then be connected via some high-bandwidth interface(s).

            • Buzzard44
            • 4 years ago

            I completely agree with you. You just more clearly stated what I meant in my original comment by “to service modest video requirements on a bank of remote VMs that people remotely connect to with a console.”

            I definitely agree that this product serves a niche (I'm assuming this is a niche use case/market, but have no data to back that up – could be wrong), but am Krogoth'ed by the current limitations of it, especially after reading "gpu virtualization." Also, while this is the use we both see for it, that's not at all the intended use case I got from reading the article.

          • semitope
          • 4 years ago

          Like slowriot said, this would allow better efficiency, from what I can understand. A single user not tying up the whole GPU would mean being able to spread GPU resources around as needed, rather than a user who is doing less work using a whole GPU anyway.

            • Buzzard44
            • 4 years ago

            In some cases, yes – but in other cases, you have the same level of inefficiency in the opposite direction.

            If you have n users who simultaneously just need an nth of the GPU, then this is a great product. But for the times when you have one user needing all the GPU while the others don't need any, that user is restricted to an nth of the GPU despite the rest of it being idle, wasting the other (n-1)/n of the GPU, which is very inefficient.

            This of course goes back to slowriot's point that for a particular use case, this makes sense. I still think there's more to go before this makes sense for the lion's share of the market who could be interested in virtual GPUs.

        • chuckula
        • 4 years ago

        What country do you come from where sentences end in ellipses and not periods?

      • slowriot
      • 4 years ago

      [quote<]How does this make sense? You can already do this with a physical GPU on a remote machine or VM using IOMMU.[/quote<]

      This card is about improving the efficiency in that space. With VMs and IOMMU you're assigning the entire video card to the VM. The goal here is to provide only a slice of the GPU resources to each VM. I imagine there will be the ability to split up the stream processors and memory per vGPU. Hard to speak on the granularity, but it does seem the thinnest slices are 1/16 of the GPU resources currently.

    • lycium
    • 4 years ago

    Is this just a driver/firmware change with existing Tonga silicon? If so, that’s some *expensive* software.

      • Laykun
      • 4 years ago

      They’ll have a support plan you simply don’t get with consumer-level cards. You’ll be able to contact AMD directly for support and actually get a response back. You also have a lot of ‘unlocked’ CAD features and probably also full DP speed. On top of all this, these will probably be the better-binned Tonga chips.

        • tipoo
        • 4 years ago

        I think the DP speed is the same; AMD’s consumer cards are pretty good at compute. Nvidia cut all that out of them for more efficient gaming performance. The silicon is largely the same as the consumer end. I think the high-end Quadros have better DP performance than the Nvidia consumer cards, though, as the consumer cards had more of it culled. I think the first Titan was like a Quadro in that respect, but not sure about the subsequent ones.

          • Laykun
          • 4 years ago

          Historically, consumer-level versions of GPUs have had artificial limitations on DP performance imposed at the software level, not in hardware (at least I presume it’s done at the BIOS level; it’s also likely to be laser-cut in some way). The hardware for performing it would be there, but they’d just section it off for professional users who pay the big bucks. In the case of Fermi, consumer GeForce cards were limited to 1/8th DP performance whereas professional-level cards were able to use the full 1/2 DP performance of the chip. You are however right that the latest batch of Nvidia Maxwell cards ditched much of their DP hardware to fit more SP into the transistor budget for the sake of gaming. I believe with Kepler only the Tesla cards are capable of decent DP performance (save for the Titan Black), with it being 1/3 of SP on Tesla and 1/24 of SP on GeForce/Quadro.

            • derFunkenstein
            • 4 years ago

            So you said it’s probably done somewhere between the software, the BIOS, and laser etching. Way to drop some knowledge bombs and narrow it down for us. 😆

            • Laykun
            • 4 years ago

            You’re welcome 🙂

      • tipoo
      • 4 years ago

      As always, the FirePro/Quadro upcharge is largely due to customer support being built in, plus driver testing against professional apps and certification for those apps to use the GPU. In olden times they used to cut out some of the compute functions on consumer GPUs (or remember the jumper trick to turn a Radeon into a FirePro?), but now it’s largely the same silicon with different drivers.

      In a way it feels like a ripoff, but when a business with hundreds of thousands of dollars running through it weekly loses productivity due to lack of consumer driver support, or an individual working on such things as a business does, that can cost more than was saved by buying consumer-grade hardware.

      Incidentally, that’s how Apple got away with the Mac Pro, calling them FirePros and providing them fairly cheap, for FirePros. Since Apple covers the support and writes large parts of the driver, they can call them FirePros, while dual-booting into Windows just shows them as Radeons.

    • terminalrecluse
    • 4 years ago

    Wish this wasn’t so expensive; I would like it for my home lab.

      • willmore
      • 4 years ago

      These cards have no extra features that would make them more useful than any other cards in a home lab.

    • Mr Bill
    • 4 years ago

    Letting many people securely subdivide a single graphics card’s resources is expensive!

      • UberGerbil
      • 4 years ago

      Assuming you already have the server infrastructure, this plus thin clients is still cheaper than buying 16 GPU-equipped laptops.

        • Mr Bill
        • 4 years ago

        Can this also be the basis for GPU based supercomputing?

          • BobbinThreadbare
          • 4 years ago

          Sure, but you don’t need virtualization for that. You just assign them jobs like supercomputing has been doing for years.

          That means you can’t subdivide each GPU, but when you’re doing supercomputing that’s hardly a problem either.

            • semitope
            • 4 years ago

            It shouldn’t be hard to understand why being able to more finely assign resources is better than throwing whole blocks at a single task that may not require it.

      • lycium
      • 4 years ago

      I love how we say the same thing (well, I also asked a question), yet you get an upvote and I get a downvote.

      Oh well, should be doing other things than commenting on TR ¯\_(ツ)_/¯

        • DrDominodog51
        • 4 years ago

        You were probably downvoted for saying it is expensive software whereas in reality that’s extremely cheap for software.

    • DPete27
    • 4 years ago

    150W passive single-slot cooler….? Yeah right.

      • deruberhanyok
      • 4 years ago

      I’ll believe it. It’s probably designed specifically for use in 4U rackmount chassis where someone would be hosting a VDI environment, in an air-conditioned server room with proper front-to-back airflow and all that, so it won’t need its own fans or a massive heatsink.

      The tradeoff is that most of those chassis are loud enough to break glass.

        • slowriot
        • 4 years ago

        Exactly. It’s designed to make use of the airflow from chassis fans. Having a fan on the GPU card itself would be undesirable. In the inevitable event the fan fails, it would likely mean downtime for the entire chassis and certainly put the card out of service. Relying on the chassis fans means a much easier replacement process (just put in a new chassis fan; they’re designed to be hot-swapped) and may not even require downtime at all in the event of a chassis fan failure.

      • UberGerbil
      • 4 years ago

      As soon as I saw that in the listing I knew there’d be comments from folks who had never seen a server room.

        • Firestarter
        • 4 years ago

        or heard or felt one for that matter

          • moose17145
          • 4 years ago

          Glad I wasn’t the only one who thought the exact same thing.

      • BlackDove
      • 4 years ago

      Nvidia’s K80 is only passively cooled, too. Passive cooling means it depends on the server’s high-RPM fans to move air.

      • kalelovil
      • 4 years ago

      Why not? It already exists in the form of [url<]http://www.amd.com/en-us/products/graphics/workstation/firepro-3d/7100#[/url<] The S7150 is basically that, minus the display ports and plus the deactivated stream processors. It certainly won't be quiet, but that is of little concern to its target market.

    • chubbyhorse
    • 4 years ago

    But will it play Half Life 3?

      • morphine
      • 4 years ago

      Only virtually.

        • willmore
        • 4 years ago

        At this point, that’s the only way to play it anyway. 🙁

          • tipoo
          • 4 years ago

          Or as Tales from the Half-Life. Heck, that’ll probably come out before HL3.

        • ronch
        • 4 years ago

        As in VR? Do these come with VR goggles? 🙂

        • kuttan
        • 4 years ago

        Yeah can do so with up to 16 users/GPU…

      • BIF
      • 4 years ago

      Likewise, I’m sure it tears through those F@H virtual Work Units….

      • NTMBK
      • 4 years ago

      Yes, but unfortunately it’s free-to-play.
