Trinity to boost performance, save power with resonant clock mesh

The Piledriver CPU cores in AMD’s upcoming Trinity APU will feature some special sauce courtesy of Cyclos Semiconductor, a company spun out of the University of Michigan in 2006. Cyclos’ specialty? Resonant clock mesh technology that purportedly offers higher performance and lower power consumption.

According to this Cyclos whitepaper (PDF), performance can be improved by 5-10% using a clock mesh instead of the clock trees typically employed by microprocessors. The mesh distributes the clock signal over a uniform grid that covers the entire chip, reducing the clock skew associated with tree-based designs and allowing the chip to utilize more of its clock cycle.
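
To see roughly how lower skew translates into performance, consider a minimal back-of-the-envelope sketch in Python. The picosecond figures below are illustrative assumptions, not numbers published by Cyclos or AMD:

    period_ps = 250.0        # 4GHz clock -> 250 ps cycle time
    setup_margin_ps = 20.0   # flip-flop setup + jitter allowance (assumed)

    def usable_time(skew_ps):
        """Cycle time left for logic after skew and setup margin."""
        return period_ps - skew_ps - setup_margin_ps

    tree_skew_ps = 30.0      # assumed skew of a conventional clock tree
    mesh_skew_ps = 10.0      # assumed skew of a clock mesh

    gain = usable_time(mesh_skew_ps) / usable_time(tree_skew_ps) - 1
    print(f"Extra logic time per cycle: {gain:.1%}")  # ~10% with these numbers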

The problem with meshes is that they tend to consume more power than clock trees. Cyclos’ solution is a resonant clock mesh that uses on-chip capacitors and inductors to create a tank circuit that acts as a sort of electronic pendulum. The charge flowing between the capacitors and inductors is largely self-sustaining, generating an effective clock cycle that needs only a “nudge” from an external source to keep the virtual pendulum swinging in time. Cyclos claims this approach can lower total chip power consumption by up to 30% without compromising the performance benefits of the mesh.
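
The “electronic pendulum” is a parallel LC resonator whose natural frequency is 1/(2π√(LC)). A quick Python sketch, using an assumed order-of-magnitude mesh capacitance rather than any published figure, shows the tiny inductance needed to resonate such a mesh at a ~4GHz clock:

    import math

    mesh_capacitance_f = 1e-9   # ~1 nF of total clock-mesh capacitance (assumed)
    target_clock_hz = 4e9       # ~4GHz Piledriver-class clock

    def inductance_for(f0_hz, c_f):
        """Inductance that resonates capacitance c_f at frequency f0_hz."""
        return 1.0 / ((2 * math.pi * f0_hz) ** 2 * c_f)

    l_henries = inductance_for(target_clock_hz, mesh_capacitance_f)
    # In practice this would be realized as many small on-chip inductors
    # distributed across the mesh, not one lumped component.
    print(f"Required inductance: {l_henries * 1e12:.2f} pH")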

In a Piledriver core running at over 4GHz, Cyclos’ resonant clock mesh is said to reduce “clock distribution power” by 24%. That’s a far cry from cutting total chip power by 30%, but it’s a reduction nonetheless, and an important one considering AMD intends to offer a 17W version of Trinity. What’s more, AMD says Cyclos’ technology was easy to integrate without increasing the die size of the chip or tweaking the manufacturing process.
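
As a back-of-the-envelope check, here is how a 24% clock-distribution saving maps to total chip power. The clock-distribution share of total power below is an assumption for illustration, not a figure AMD has disclosed:

    clock_share_of_total = 0.30   # assumed: clock distribution is ~30% of total chip power
    distribution_saving = 0.24    # Cyclos' figure for Piledriver at >4GHz

    total_saving = clock_share_of_total * distribution_saving
    print(f"Total chip power saving: {total_saving:.1%}")  # ~7% under these assumptions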

Comments closed
    • Rageypoo
    • 7 years ago

    [battlefield earth reference] No one else here is interested that the technology came from “cyclos?” [/battlefield earth reference]

    • grndzro
    • 7 years ago

    I’m pretty sure AMD was targeted for this technology since it can lower power consumption up to 30% and increase overclocking by even more.

    Maybe in the long run AMD will get even more mileage out of it than Intel, since it can apply to GPUs as well.

    AMD probably was given the technology for peanuts. Now Intel and Nvidia will want it.
    Intel or Nvidia ain’t getting it for peanuts. Intel and Nvidia will have to pay through the nose. Really good marketing strategy, IMO. Who knows.

    • Bensam123
    • 7 years ago

    The curious side in me wonders how this will interact with overclocking…

    • phileasfogg
    • 7 years ago

    This may not be particularly germane to this discussion, but no one has yet pointed out that Intel bought Fulcrum Microsystems in July last year. [url<]http://www.theregister.co.uk/2011/07/19/intel_acquires_fulcrum_microsystems/[/url<] Fulcrum came out of a team at Caltech that worked on asynchronous (i.e. clockless) processor designs. As the link above says, they'd burned through many rounds of capital before succumbing to Intel's charms. One can be fairly certain that a 32nm (?) 'clockless' server networking chip (not necessarily a Xeon family CPU) will issue forth from Intel in the next couple of years.

    • bcronce
    • 8 years ago

    Patents are great in theory, but how about joining forces between these “meshes” and tri-gate?

    Anyway… AWESOME. GO AMD!

    I love how there are so many ways still left to make things faster and more efficient.

    • cegras
    • 8 years ago

    Can someone explain what a clock tree is, and what a clock mesh is?

      • flip-mode
      • 8 years ago

      Tanks on a skewed circuit pendulum.

      • Dissonance
      • 8 years ago

      Follow the whitepaper link 😉

        • cegras
        • 8 years ago

        That was surprisingly easy to follow. Thanks 🙂

    • TurtlePerson2
    • 8 years ago

    It’s always funny to see how long it takes industry to catch up with the ideas in academia. When I googled “resonant clock mesh,” I found a paper from 2004. From what I can tell, the only reason this wasn’t adopted sooner is that it involves complicated transmission line calculations and there needs to be tuning for process variation.

    It will be interesting to see if using this new clocking scheme will make the chips less robust as far as operating temperature and voltage range are concerned.

    By the way, Marios C. Papaefthymiou (one of the founders of Cyclos Semiconductor) was my professor for Digital Logic Design.

      • OneArmedScissor
      • 8 years ago

      Intel developed tri-gates 10 years ago and they showed off functional 3D CPUs years back.

      In the not-so-distant future, things will be so much more specialized that mass production won’t be an issue. Even if we won’t have things like quantum computers or chips with internal liquid coolant in our pockets, we’ll all still benefit. Or we’ll all be killed by Skynet. It’s definitely one or the other.

      • sschaem
      • 8 years ago

      Funny? I would say expected.

      Marios founded Cyclos in 2006 to commercialize the idea, once it was ready for commercial applications.
      And AMD's Piledriver is entering production, so the new clocking scheme was finalized a long time ago.
      So that leaves ~3 years to go to silicon on a 2-billion-transistor-plus 32nm chip once the contract was approved.
      That's assuming AMD got the contract the day Cyclos was formed… this most likely happened no earlier than late 2007.

      So if you look closer at this, it's actually amazing how fast AMD was able to integrate this into their production design using GF's 32nm fab.

      If the potential is real, I’m surprised Apple didn’t buy Cyclos and all its IP for itself…

        • Voldenuit
        • 8 years ago

        [quote<]If the potential is real, I'm surprised Apple didn't buy Cyclos and all its IP for itself...[/quote<] They'll just steal the idea, then sue Cyclos for "patent infringement".

    • wingless
    • 8 years ago

    It sounds like overclocking one of these would cause an instability in the resonant field and either rip a hole in the space-time continuum or destabilize the warp core. I saw that on Star Trek Voyager.

    Sounds like a good idea and I sure as hell hope it works out for AMD, though.

      • NarwhaleAu
      • 7 years ago

      You forgot to add that it may cause a reflection on the event horizon, which is why AMD claims 8 cores when only 4 are really present.

    • ImSpartacus
    • 8 years ago

    Woosh! Right over my head.

      • Wirko
      • 8 years ago

      Sorry, I didn’t mean to scare you but I had to test my prototype somewhere. The blue glow that you probably noticed comes from the resonant pipeline. I admit that the resonant multipliers are still quite noisy but the clockless cache is dead silent.

      • crabjokeman
      • 8 years ago

      Just read the last paragraph for the “executive” summary.

        • ImSpartacus
        • 8 years ago

        Or get an Electrical Engineering degree, whichever's faster.

          • dpaus
          • 7 years ago

          Oh, I think that ‘whoooshed’ right over NeelyCam’s head… 🙂

    • OneArmedScissor
    • 8 years ago

    Great. Now sell it – and without it breaking something.

    I’ve lost track of how many amazing new things AMD recently talked up, just before they were supposed to go on sale, only to find their completely illogical implementation actually screwed up the new CPU or just led to it being completely cancelled.

    Just make something, [i<]anything[/i<], that isn't so weird that it can't be mass produced and widely adopted. Please?

    • internetsandman
    • 8 years ago

    Tying everything in to a single base clock speed in order to boost efficiency? Isn’t that what intel started with SB? Correct me if I’m wrong but the two seem quite similar

      • OneArmedScissor
      • 8 years ago

      That’s not what either of them did.

      Cores with only dedicates caches have worked that way for a long time. Sandy Bridge additionally tied the [i<]shared[/i<] cache and ring bus to the core clock, but if you don't even use those things, as is the case with Trinity, no big deal. There are still numerous power planes, clock speeds, and corresponding types of transistors in Sandy Bridge/Ivy Bridge, and Trinity will have roughly just as many. But you still have to [b<]distribute[/b<] the clock to every part of the chip. As chips become more and more tightly integrated and complex, that becomes an increasingly inaccurate, inefficient process. The resonant clock speed mesh attempts to deal with it. As for how, and more importantly, [b<]if[/b<], it works, that's wait and see.

    • Farting Bob
    • 8 years ago

    Trinity is shaping up to look pretty good, right up until you compare it with similarly priced IB and SB chips, where it will no doubt be crushed on power consumption and performance.

      • Yeats
      • 8 years ago

      ..and SB/IB will lose to Trinity in GPU performance. Choose for your needs, like always.

        • theonespork
        • 8 years ago

        You fool!!!!

        Toe the line Yeats, otherwise you are gonna be called out as a fanboy (the horror, THE HORROR). AMD is unable to execute, is perpetually mismanaged, and exists despite itself. This is nothing more than undelivered promises by a failing company. It has to be, or, or, or, well, I don’t know, but it is 2012 and something bad will certainly occur.

      • Voldenuit
      • 8 years ago

      IB IGPs won’t be competitive with Trinity’s APU. And SB’s IGP doesn’t support DX11 or OpenCL.

      Not a big deal on desktop, but it’s going to make the Ultrabook vs Ultrathin battle interesting, assuming Piledriver performs.

        • sweatshopking
        • 8 years ago

        yes it does support both of those. dx11 AND opencl see [url<]http://www.anandtech.com/show/4792/and-now-ivy-bridge-gpu-architectures-detailed[/url<] for details. that's old news, going past idf.

          • DancinJack
          • 8 years ago

          [quote<]And SB's IGP doesn't support DX11 or OpenCL.[/quote<] He said SNB, not IB. Calm down there buddy.

          • Voldenuit
          • 8 years ago

          Did you misread my post? I clearly said that [b<]SB[/b<] doesn't support DX11 or OpenCL. I have read that the reason SB doesn't support OpenCL has more to do with intel not exposing the API than technical limitations, but I'm uncertain how accurate those reports are.

            • sweatshopking
            • 8 years ago

            that’s fair. i did misread your post. i assumed that you were talking about ivy bridge, since it’s the competing chip to trinity. and i expect that ivy bridge’s gpu will be more competitive than most people think.

        • bcronce
        • 8 years ago

        I’m not saying that you’re “wrong”, but I’m curious about the logic behind Trinity’s APU being much stronger than IB’s IGP.

        I haven’t read of any rumors about actual specs of IB’s IGP, but I would like to know if anything has come up.

        I would also assume AMD’s GPU will be more powerful, but I wonder by how much. Right now SB’s IGP is only about 1/2 the speed in games, which is easy to make up in one generation, but Intel hasn’t really done much for GPUs.

          • Voldenuit
          • 8 years ago

          Everything I’ve read points to intel claiming a 60% performance increase with IB IGP. That won’t even put it on par with Llano, let alone Trinity. And let’s not forget that intel is rendering at much lower quality than AMD and nvidia to achieve their framerates.

          Intel could be playing their cards close to their chest and under-promising on IB, but graphics is one area they are a long way from catching up in.

      • OneArmedScissor
      • 8 years ago

      Similarly priced will likely be Pentium level, so probably not. That’s how AMD has always hobbled along.

    • ronch
    • 8 years ago

    Ok, a bit like cheating to achieve better performance/watt, but if it means AMD gets some leverage, I guess I’ll take it. Oh, and another thing, if AMD has access to this technology, what’s stopping Intel from also using it?

    EDIT – I didn’t say they were actually cheating, I said it’s [u<]a bit like cheating[/u<].

    EDIT 2 – Whoa, I didn’t expect my comment to get this many thumbs down. You’d think that I’ve said bad things against your mothers. Extreme fanboyism.

    EDIT 3 – Just to clarify to everyone why I think this is [u<]a bit like cheating[/u<]. It’s [u<]not[/u<] about the technology per se. All sorts of technological tricks have been used [u<]throughout[/u<] the history of processors and so many other things, [u<]and there’s no problem with that[/u<]. The reason why I said this, and which you all seem to have failed to realize, is that AMD has resorted to using technology from [u<]outside[/u<] their company to save their troubled architecture. It’s a technology that I believe [u<]anyone[/u<] can quite easily use (AMD themselves admitted implementing it wouldn’t be much trouble), not just AMD, to get an advantage. And if Intel also uses this technology, what advantage would a company like AMD have over their competition? It’s a technology from outside their company that will probably grant a short-lived advantage to AMD because it can easily be used by others as well. It’s like playing chess. You make the moves, but when someone watching gives you a hint or moves for you, isn’t that [u<]a bit like cheating?[/u<]

      • Yeats
      • 8 years ago

      “Cheating”? How?

        • Meadows
        • 8 years ago

        +1, because I’d like to know the same.

        • Palek
        • 8 years ago

        Because it’s witchcraft!!! Burn them at the stake!

          • dpaus
          • 7 years ago

          Hit them with very small rocks!

      • Goty
      • 8 years ago

      [quote<]if AMD has access to this technology, what's stopping Intel from also using it?[/quote<] Nothing. Intel will probably be licensing the same technology in the coming years.

        • OneArmedScissor
        • 8 years ago

        Yeah, like, when it actually works as intended lol. AMD has a history of jumping the gun, and one of the only times it really worked out in their favor was with integrated memory controllers. They seem a little hung up on scratching away at more lottery tickets to find the next one.

        Some other firsts of theirs that didn’t backfire, like x64, are really more about the licensing than the benefit of early implementation.

          • Goty
          • 8 years ago

          Mind giving a list of AMD firsts that DID backfire?

            • Voldenuit
            • 8 years ago

            Not a backfire, but 3DNow! (adding SIMD support for floating point a year before intel did with SSE) certainly fizzled.

            EDIT: It’s worth pointing out that AMD had a lot of successful firsts (at least in the consumer space) which were later emulated by intel.
            Integrated memory controller.
            Point-to-point Bus.
            x86-64.
            True dual-core dies (as opposed to MCMs).

            Bulldozer is probably their highest profile backfire (I guess I’m not counting overpaying for ATI, since that’s not a technical decision); their decision to run two execution cores off a single decode and issue unit hasn’t paid off in desktop and workstation applications.

            And it’s not like they haven’t copied from intel. After all, they started off making exact copies of intel CPUs!

            • Yeats
            • 8 years ago

            Mmm, I remember downloading the 3DNow! patch for Quake 2, got a big boost, too.

            • OneArmedScissor
            • 8 years ago

            How about a list of things that didn’t? I have all of these down votes, as if there are so many successful firsts for AMD, and yet, there’s not one reply actually naming one.

            I named two, though, so why can’t anyone else?

            But I’ll play your game. Here’s a few “firsts” that didn’t work out from just recently:

            Phenom, first mainstream CPU with L3 cache – Huge latency hit, but with less capacity than Intel’s L2. Amounted to little benefit even in servers vs. Intel’s CPUs that didn’t even have an integrated memory controller.

            Phenom, first monolithic quad-core – Again, trounced in very nearly every scenario by Intel’s dual die, much less integrated quad-cores.

            Phenom, first with individual power states for numerous cores – What was supposed to cut power use screwed with it and also caused potential performance issues, as there was no regard to OS scheduling of the time. Sounds familiar… :p

            Llano, first APU – …Except then it wasn’t. Massive delays. Ended up slower per clock than the Athlon II, despite larger caches and integrated PCIe controller. Just too much, too soon.

            Bulldozer, first…uh… – What is this I don’t even…the Opteron version hardly even makes sense.

            Bobcat 2, first x86 SoC and 28nm HKMG CPU from TSMC – Not anymore! Overambitious, dead in the water.

            That fiasco is particularly concerning, much as people here may not care. All they really needed to do was shrink the existing Bobcat, leave it socket compatible for OEMs, and run with it for another year. That would have eventually lowered the cost and also allowed it to make early headway into tablets.

            Instead, they nearly committed suicide. AMD’s greatest success in years – and way out of the limited PC market – will now go unsucceeded until [i<]after[/i<] the Windows 8 release, leaving the door wide open for the influx of ARM laptops and tablets. The Phenom II / Athlon II and Bobcat worked out well for AMD, none of which were trying to do anything outlandish before anyone else. It's too bad they didn't learn from their own successes.

            • Yeats
            • 8 years ago

            [quote<]Llano, first APU - ...Except then it wasn't. Massive delays. [b<]Ended up slower per clock than the Athlon II[/b<], despite larger caches and integrated PCIe controller. Just too much, too soon.[/quote<] It did? Anandtech disagrees with you. [url<]http://www.anandtech.com/show/4448/amd-llano-desktop-performance-preview/2[/url<]

            • JMccovery
            • 8 years ago

            I think the reason the ‘Enhanced Bobcat’ 28nm products were canned has to do with the delays/costs/limited capacity of TSMC’s 28nm process. AMD could have been looking at actually losing money on each chip sold (this is just a guess).

            It is possible that Krishna, Wichita and the (original) Hondo could have been saved if GF had a working 32/28nm HKMG process or TSMC didn’t have issues.

          • Squeazle
          • 8 years ago

          Yeah man, early adopters are idiots, not innovators. Who needs progress?

        • helboy
        • 7 years ago

        “if AMD has access to this technology, what’s stopping Intel from also using it?”

        “Nothing. Intel will probably be licensing the same technology in the coming years.”

        or better, Intel will buy the company itself 😉

      • basket687
      • 8 years ago

      By that logic, having a cache inside the processor would be considered “cheating” in order to achieve better performance, and turbo boost could also be considered cheating to improve single-threaded performance; you could even go as far as saying that EIST/Cool’n’Quiet is a form of cheating to improve power efficiency…

      The bottom line: Any feature that improves performance and/or power efficiency is welcome regardless of how it works.

        • bcronce
        • 8 years ago

        Computers are cheating, I do math in my head.

      • flip-mode
      • 8 years ago

      Obligatory car analogy: diesel fuel is cheating to achieve internal combustion.

      • NeelyCam
      • 8 years ago

      It’s not cheating any more than turbo is.

      • Meadows
      • 7 years ago

      [quote<]EDIT - I didn't say they were actually cheating, I said it's a bit like cheating.[/quote<] It's [u<]not[/u<] like cheating.

      • no51
      • 7 years ago

      I thumbed you down cause it’s the cool thing to do.

        • ronch
        • 7 years ago

        Very funny.

      • NeelyCam
      • 7 years ago

      Re: EDIT 2/3 – you really like those thumbdowns, eh?

        • ronch
        • 7 years ago

        Look at your rear view mirror, I’m gonna overtake you soon enough. 🙂

          • NeelyCam
          • 7 years ago

          I haven’t pressed the Turbo Boost button yet..

            • ronch
            • 7 years ago

            I thought you had Turbo Core.

            • NeelyCam
            • 7 years ago

            No – mine came with the turbo core functionality disabled (market differentiation, I presume). I had to add an aftermarket turbo on it

      • Rageypoo
      • 7 years ago

      I thumbed you +1 because I think what you’re saying is sound, logical, and has nothing to do with being a fan of AMD or Nvidia, or even Intel, you’re just voicing an opinion and that’s fine.

      I think anyone who did thumb you down didn’t even take into consideration what you said, they just wanna follow their boyfriends and feel connected to a community, either because they are completely insecure, or they have a really tiny…confidence range = )

      • Rageypoo
      • 7 years ago

      (double post)

      • crabjokeman
      • 7 years ago

      Please continue to make more edits and dig yourself further into the thumbs down hole. It’s really entertaining!

    • dpaus
    • 8 years ago

    Sounds fascinating… I’d love to see David Kanter’s take on it.

      • NeelyCam
      • 8 years ago

      How about NeelyCam’s take on it?

      This is a good idea. I’ve seen it used in clock distribution networks before (some ISSCC papers from Rambus and Intel), but those were for high-speed I/Os – not for a mesh replacing a logic clock tree. I’m sure this will help with clock skew.

      There is one downside to it. LC-resonance peaks at one frequency, while modern chips are expected to scale the clock frequency up and down significantly based on load. If the clock is running much slower than the ‘optimum’, the resonance loses its power-saving effect, and if the inductance is simply placed in parallel with the network (like the whitepaper says, although their picture has a series LC-tank, which doesn’t quite make sense..), it would present a very low impedance to the clock drivers (unless you somehow switch off the resonator inductance… which may be exactly what they do).

      So, this scheme works great if the clock frequency doesn’t change. If clock frequency changes, other schemes are probably needed to keep the mesh network operating well.
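
      To put the off-resonance problem in numbers, here is a small Python sketch with assumed component values (not Cyclos figures): away from its resonant frequency, an ideal parallel LC tank's impedance collapses and starts loading the clock drivers.

          import math

          # Ideal parallel LC tank: Z(f) = j*2*pi*f*L / (1 - (f/f0)^2),
          # with f0 = 1 / (2*pi*sqrt(L*C)). Values below are assumptions.
          L_h = 2e-12    # 2 pH resonator inductance (assumed)
          C_f = 0.8e-9   # 0.8 nF mesh capacitance (assumed)
          f0 = 1.0 / (2 * math.pi * math.sqrt(L_h * C_f))

          def tank_impedance(f_hz):
              """Impedance magnitude of an ideal parallel LC tank at f_hz."""
              return abs((2 * math.pi * f_hz * L_h) / (1 - (f_hz / f0) ** 2))

          print(f"Resonance near {f0 / 1e9:.2f} GHz")
          for f_hz in (0.5 * f0, 0.9 * f0, 0.999 * f0):
              print(f"{f_hz / 1e9:5.2f} GHz -> |Z| = {tank_impedance(f_hz):7.3f} ohm")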

        • wierdo
        • 8 years ago

        It should be useful at high clockspeeds, which seems to be a good fit for a Speed-racer type CPU design like the Trinity.

        The PDF says the benefits are tangible at roughly 1GHz and up; below that it'll work fine but won't offer tangible gains over the "clock tree" approach, more or less.

        That shouldn’t matter with Trinity, probably even in throttled operation, but it makes it less ideal for something like a low clocked chip used in a smartphone.

          • NeelyCam
          • 8 years ago

          [quote<]The PDF says the benefits are tangible at 1ghz and up mostly, below that it'll work fine but wont offer tangible gains over the "clock tree" approach more or less. [/quote<]

          The whitepaper means that this technique works well if your clock frequency is 1GHz or above, but chooses not to mention how well it works if the [i<]same[/i<] network runs at 1GHz, 2GHz [i<]and[/i<] 4GHz.

          Generally you have to pick the resonant frequency for your clock mesh network when you physically build it - each resonant network is built for one given frequency. The equation for an LC resonator is shown in the PDF - the C is set by the physical network, and the L is added to tune out the C. It's difficult to change the L (and, consequently, the resonant frequency) of such a network "on the fly" to accommodate lower operating frequencies once it's built for, say, 4GHz. I'm not saying it can't be done - I'm just saying it's difficult. I can imagine a couple of approaches to try to handle a wide frequency range with this sort of a scheme, but they all sacrifice something... then again, maybe that's already baked into the power savings numbers they are mentioning.

          I'm also wondering how solid the 'reference' case is that's mentioned in the whitepaper. Startups promoting their technology tend to hand-pick comparisons that make themselves look good for obvious reasons. I'm wondering if Intel and AMD are already using some highly power-optimized clock distribution schemes, making the power savings from this less impressive. This might end up being a nice marketing gimmick that in the real world doesn't result in more than 5-10% of power savings compared to the 'baseline' (whatever that may be).

          I'm interested to find out more about this, though.

            • BobbinThreadbare
            • 8 years ago

            Isn’t a 10% power savings, just from switching how the chip is clocked, an enormous savings?

            • NeelyCam
            • 7 years ago

            I’d say so, unless it comes with a 20% power increase penalty at 1GHz and below (you know, the frequencies that truly determine the battery life of a laptop). And what if it saves “only” 5% of power compared to other clocking techniques, but comes with a 2% royalty fee?

            Don’t get me wrong – I think it’s a clever approach, and works exceedingly well for a subset of system conditions. I’m just not yet convinced that it’ll save a significant amount of power in real-life use.

            EDIT: Just started wondering how this would work with various power-gating schemes.. if any part of the mesh network is to be clocked, the whole mesh network needs to be powered up. It seems to me that it’s much easier to establish fine-grain clock gating schemes with the conventional clock trees.

            I also read the other link (previously I read just the PDF). They explicitly mention they’ve figured out dynamic frequency scaling. Maybe they are using switched inductors after all.. I’m getting more and more curious. I wonder if AMD is presenting something on Trinity at the VLSI Symposium..?

            • CBHvi7t
            • 7 years ago

            At lower frequencies the saving is lower but the need is lower to begin with. For the first steps down they could add 50% inductance and then simply turn it off. They have to disconnect them anyhow if the frequency gets low to avoid a short.
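
            A quick check of that suggestion in Python (treating the tank as ideal; the 50% figure is just the example above): since f0 scales as 1/sqrt(L), switching in 50% more inductance only lowers the resonant frequency by about 18%.

                import math

                # f0 is proportional to 1 / sqrt(L) for an LC tank, so the
                # absolute L and C values cancel out of the ratio.
                extra_inductance = 0.5   # add 50% inductance, per the comment above
                ratio = 1.0 / math.sqrt(1.0 + extra_inductance)
                print(f"New resonant frequency: {ratio:.1%} of the original")  # ~81.6%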

            • NeelyCam
            • 7 years ago

            Yep – that’s pretty much what I was thinking.

            • ImSpartacus
            • 8 years ago

            Are you an engineer?

            • NeelyCam
            • 7 years ago

            Yes. Are you Spartacus?

            • ImSpartacus
            • 7 years ago

            Yes, how’d you know?

            • NeelyCam
            • 7 years ago

            You looked so buff, it was a lucky guess

        • helix
        • 8 years ago

        What if they have more than one inductance and can choose which one to feed the mesh with? It would obviously cost even more die area, but could it work?

          • NeelyCam
          • 7 years ago

          That’s the “not impossible but difficult” part..

          You could switch on/off inductors with big transistor switches, but the switch resistance would have to be in the same order of magnitude as the other impedances (2*pi*f*L, 1/(2*pi*f*C)) or it can affect the quality factor (Q) of the resonance and reduce the power-saving effect. Also, when the switch is off, its parasitic capacitance forms a series LC-resonator with the inductor alone, affecting the mesh resonance.

          One would have to have a set of inductors with good switches and tune the network for each operating frequency, including all those LC series resonators (inductors+switches) in it. It can be done, but it’s a bit messy..
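
          For a sense of scale, here is that estimate in Python with assumed L and C values (not Cyclos numbers). At resonance the two reactances are equal, and with these values they land in the tens-of-milliohms range, so a series switch would need a very low on-resistance to avoid spoiling the Q:

              import math

              f_hz = 4e9     # assumed operating frequency
              L_h = 2e-12    # 2 pH resonator inductance (assumed)
              C_f = 0.8e-9   # 0.8 nF mesh capacitance (assumed)

              x_l = 2 * math.pi * f_hz * L_h        # inductive reactance, 2*pi*f*L
              x_c = 1 / (2 * math.pi * f_hz * C_f)  # capacitive reactance, 1/(2*pi*f*C)

              print(f"2*pi*f*L     = {x_l * 1e3:.1f} milliohm")
              print(f"1/(2*pi*f*C) = {x_c * 1e3:.1f} milliohm")
              # A switch on-resistance comparable to these values would
              # noticeably degrade the resonator's Q.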

        • Bensam123
        • 7 years ago

        Holy shit, did you just contribute something useful to the community? D:

          • NeelyCam
          • 7 years ago

          It was a moment of weakness… sorry about that.

            • dpaus
            • 7 years ago

            OK, but… You know; don’t let it happen too often.

            Seriously; good explanation (not quite Kanter-level, but who is??), your membership is renewed for another year.

            • Bensam123
            • 7 years ago

            *sniffles* I actually read it without getting any sort of extreme bias or cut throat undertones… I’m proud of you Neely. *cries*
