Anonymous Gerbil |
#33 - AMD might just have to come up with something new if P3 syndrome strikes and they get stuck at X.XGHz. The P3 is a good processor, it just ran out of headroom like the K7 and P4 eventually will - the K7 will run out before P4 though, the question is if it will matter when that does happen.
Ryu Connor |
[q]Sure the PPro got MMX. They then called it a P2. It got SSE later, and they started calling it a P3. They put the cache from ondie to off when they stopped calling it PPro, but they put it back on when they started calling the PPro+MMX+SSE a 'coppermine'.[/q]
It is a tad bit more complex than that. http://www.sandpile.org/impl/p6.htm http://www.sandpile.org/impl/p2.htm http://www.sandpile.org/impl/p3.htm Each chip also saw some restructuring or increase in aspects of its functional units. The Pentium II gained additional registers in order to accommodate MMX support. The ALU execution units were also altered to allow execution of the MMX SIMD instructions. Furthermore, aspects of power management support and L1 cache size and function were changed. The Pentium Pro and Pentium II actually shared the exact same L2 cache design. The PPro's was backsided within the [b]package[/b] (not the die) and designed for operation at full clock, while the Pentium II's was backsided externally using various commercial SRAM modules that ran at half clock. The latter was far cheaper than the former, and at the time just as good.
Anonymous Gerbil |
Originally Posted by mctwin2kman
I hate to see people say, look, Intel won, their P-4 1.8 beats AMD's 1.4. Well, it should: it is more than twice the price and 400 MHz faster. The point should be: look at how far AMD has come to get a 1.4 to beat a 1.8 at less than half the price. I mean, come on, people. Compare the 1.4 P-4 to a 1.4 AMD, either T-Bird or Pally. I bet the AMD proc wins. Now that is something to brag about. As far as AMD needing to come up with something new, why would they? The Athlon is a good proc as it is, and still has a lot of room to go. It really bothers me when everyone says that the 1.8 P-4 is all that. Well, it should be for the price, but the sad part is that the Athlon can be bought for less than half the price and perform just as well. I guess that is all for now.
Forge |
AG #31 - Sure the PPro got MMX. They then called it a P2. It got SSE later, and they started calling it a P3. They put the cache from ondie to off when they stopped calling it PPro, but they put it back on when they started calling the PPro+MMX+SSE a 'coppermine'.
See how AG #29 meant it? |
Anonymous Gerbil |
poster 29 - the p-pro had no mmx, nor sse. the p-2 had mmx, the p-3 had both.
Anonymous Gerbil |
#24 again. Don't get all riled, I'm not saying the P4 is god's gift to the world.. I'm just trying to explain why it's not necessarily a crap core, and why maybe Intel did what it did :-)
The main architectural way (not the only one) for a CPU design to scale to higher clock speeds is to increase pipeline depth at the sacrificial altar of IPC. Now if AMD don't follow the 'to increase MHz, decrease IPC' school of thought, then why did they increase the pipeline depth over the K6? To increase scalability. They offset the problems of a longer pipeline very well by adding a lot more useful units, so they have far better IPC than the previous K6, taking advantage of the much increased transistor count today's manufacturing processes allow. So in essence, the unwanted side effect of the increased pipeline depth is being overcompensated for, thus giving greater IPC.
This is what has always happened in introducing a new CPU generation: make a new CPU that scales well (better than the old one) and add enough new 'clever stuff' (tm) to more than make up for the inevitable loss in IPC that brings, in order to efficiently increase performance for minimum r+d cost and maximum profit out of scaling the clock speed for X years. Example: the transition from P5 to P6, 9 million transistors against 3 million and no major performance benefit (not on the order of 3 times anyway) because the built-in IPC sacrifice had to be made up with more execution units. The P7 over P6 does the same thing as P6 over P5, just not very well, because its pipeline is too long for present CPU design methodology to add enough features to offset the IPC loss completely. That's why the P4 doesn't perform - yet - and why I said the time for the P4 isn't quite here, because it just makes Intel look stupid to anybody who looks at enough comparisons. If Intel actually make real architectural improvements (not just 'add more cache', which isn't very clever if you ask me, more like brute force) it'll reach the point where it is producing not only the same/better IPC as the P3/K7 but more clock cycles.
Consider also that performance does not scale linearly with increased clock speed - it'll level out; how fast depends on the architecture. The real architectural comparison is the question of which one begins to fall flat with ever increasing multipliers as time goes on, and like it or not the P4/Rambus has a headstart there. The P4's longevity is likely to be cut short with the proliferation of 64-bit CPUs however - how soon they get onto the desktop is what will determine if AMD need to introduce another capable 32-bit core to counter a P4 3 to 5 years down the line. Forge, I assumed numbers based on a 150MHz Pentium Pro as a base figure (x8 = 1200MHz); near enough for my point I think.
Anonymous Gerbil |
i guess something's wrong here: the p6 architecture is at an end. the p-pro was outfitted with two mmx units, two sse units and several different L2 cache versions. now it is old. intel HAD to make something new.
and what about amd? the palomino is still the good old k7 with the thunderbird cache, sse and a data prefetch unit. if i understand the reviews correctly, the p4 1.8 IS in most cases faster than the thunderbird 1.4. the only drawbacks are that it is much more expensive and one must use rambust. soon the p4 will be mature, 300mm wafers and the northwood core will make the prices decrease in the next few months, ddram platforms will be available, and intel is trying to establish pc400 ddram in the future. besides, the p4 is designed to scale well on further clock speed increases and takes advantage of fast memory. now look at the athlon: the palomino got a data prefetch unit to make proper use of ddram because the original k7 design was made for sdram. now amd will add this and that to the athlon core to stay competitive, just like intel did with the p6. sooner or later, amd will have to make something completely new, just like intel did. |
Anonymous Gerbil |
AG11, thanks for your thoughts on the comparisons between celery and p3. Yeh, I think I'd be better off getting a celly 850 for this machine rather than a pentium3 800, to help even out any scores. Too much moolah for the p3. I'm surprised that they've been so popular, in fact. Guess that's what incessant advertising is for.
Anonymous Gerbil |
Poster 23,26 here: Forge - the pentium was the p5 core. It started at 60 mhz. The pentium-3 uses the same core as the p-2, and the p-pro. The first p-pros were 133 mhz. Only a few of those (p-pros) were made. 0.5 micron I think - 1995. The ones made at .35 micron were more popular - 166, 180, 200 mhz. So about 10x in clock, .133 to 1.3 Ghz. The p-3 looks to go to 1.6 Ghz also. Even if intel never offers that speed, the current chips look to go that high. As I said below - I'll take the p-3 over the p-4 any day. I'm biased, I admit. I never appreciated the forced (attempted) obsolescence of x87 by Intel. Fortunately AMD has other ideas, as do most programmers (who look to p-3 optimization as the foundation for x86 compatibility, and maximum user base).
Anonymous Gerbil |
poster 24 - 23 here. The p-4 is the way to go - even for AMD?? - Well, the p-4 is too far to the other extreme. They neglected IPC too much. If a 1.2 GHz chip outperforms a 1.7 GHz chip, then that same chip at 1.5 will outperform a 2.0 chip. The k-7 will continue to scale with the p-4, and without a horrid x87. The p-4's FPU is a disgrace. It equals that of the old winchip-2 and the k-6 (assuming same clock). The p-4's x87 is 60 percent that of the k-7. So, assuming the Athlon-4 (horrid name) will add 400 MHz to the k-7, it should max out at 1.9 GHz on the .18 core (the k-7 will go to 1.5 - some reach 1.6). Well, intel may be able to reach 2.2 on their present .18 core ------ so WRT both integer (k-7 is faster here) and FPU (the one that is used in apps - x87), we need a p-4 at ? clock to equal it in performance, if a p-4 is 60 percent the speed of the Athlon-4 clock for clock. ----- You need a 3.167 GHz p-4 to equal an Athlon at 1.9 GHz. The AMD chip doesn't need parity in clock with that performance lead. And who is going to bother with SSE-2, esp. after viewing the AceHardware article on the next compilers coming out. M$'s are x87 optimized and have no SSE-2, and they boost the k-7 WAY up from today's p-3 optimized apps. x87 on the k-7 is shown to AT LEAST equal (IMO - beats) the SSE-2 optimized Intel compiler. Most use M$ compilers - and the Intel one is still buggy. SSE-2 is not relevant, and I don't think it ever will be. The next p-4 needs a good x87 FPU, period. I'll never buy one anyway - 3d-studio on k-7 stomps the p-4 - even my k-7 750!!
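The 3.167 GHz figure in the post above is plain division. A minimal sketch, assuming performance is simply per-clock work times clock speed (a linear model that other posters in this thread dispute) and taking the 60 percent x87 figure at face value; the function name is made up for illustration:

```python
def clock_for_parity(rival_clock_ghz, per_clock_ratio):
    """Clock needed to match a rival chip.

    per_clock_ratio is our per-clock speed as a fraction of the rival's.
    Assumes performance = per-clock work x clock (linear scaling).
    """
    return rival_clock_ghz / per_clock_ratio

# Figures from the post: Athlon at 1.9 GHz, P4 x87 at 60% per clock.
needed = clock_for_parity(1.9, 0.60)
print(f"P4 clock needed for x87 parity: {needed:.3f} GHz")
```

Changing the 0.60 ratio shows how sensitive the conclusion is to that one number: at 0.80 per clock the same Athlon would be matched at roughly 2.4 GHz.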
Forge |
Athlon is ready to scale well past 2Ghz with nothing but a single die shrink. AMD has lots of tricks left in the K7 bag, and none of them involve compromising IPC in any way. IPC on Palomino is the same as Thunderbird. AMD doesn't follow the 'to increase Mhz, decrease IPC' school of thought.
Also, the Pentium line is closer to 20X its original speed. Pentium was introduced at 60Mhz, and there are 1.13Ghz Tualatins (still P6), and I believe even higher (1.26Ghz, was it?). 8X would only get 480Mhz, or if you want to compare only PPro-P3 cores, it's only approaching 8X. Whichever.
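The multiples being argued over here are quick to check. A small sketch using the clock speeds quoted in the post; the 150 MHz Pentium Pro base is an assumption borrowed from another poster in this thread, not something Forge states:

```python
def clock_multiple(start_mhz, end_mhz):
    """How many times faster the end clock is than the start clock."""
    return end_mhz / start_mhz

# Pentium (P5) launch clock vs. the P6 speeds mentioned in the post.
pentium_to_1130 = clock_multiple(60, 1130)
pentium_to_1260 = clock_multiple(60, 1260)

# PPro-to-P3 only, assuming a 150 MHz Pentium Pro as the starting point.
ppro_to_1130 = clock_multiple(150, 1130)

print(f"60 MHz -> 1.13 GHz: {pentium_to_1130:.1f}x")
print(f"60 MHz -> 1.26 GHz: {pentium_to_1260:.1f}x")
print(f"8x of 60 MHz: {60 * 8} MHz")
print(f"150 MHz -> 1.13 GHz: {ppro_to_1130:.1f}x")
```

Both readings hold up: counted from the original Pentium the line is near 20X, and counted from the Pentium Pro it is a bit under 8X.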
Anonymous Gerbil |
The big problem with the P4 isn't that it's a 'dud' - it just goes after performance in a different way, 'speedracer' (clock speed) as opposed to 'brainiac' (actually doing work). SSE2 isn't necessarily needed to make the thing fast. The P4 does less work per clock than anything since the Pentium, but it does it a hell of a lot more often due to its inflated clock frequency.
The P3's problem is that architecturally, it's EOL - its clock frequency is up almost 8 times on what it was introduced at, and it's not efficient to shrink the die every time you want another increase in clock speed, which is what would end up being the case, and why Coppermines at 1.13GHz [i]were[/i] duds on the mass market; it couldn't make the grade at .18µ. Now Intel has the efficiently scaling P4 around, but it's done so at the partial expense of performance per clock - instruction pipeline fully twice as long. Long term the P4 [i]is[/i] the way to go, even for AMD; the problem for Intel is that this time isn't quite yet, and AMD is the x86 performance leader (with the P3, technically speaking, not far behind - given a die shrink headstart) with a 12 stage pipeline CPU that does more work per cycle as a result, for a given transistor count, than the P3. P4's weapon.. more clock cycles. I'm reasonably confident in saying Intel went for such an exclusively scalable architecture as the P4 because they have no intention of introducing another 32 bit x86 CPU (just revisions, like PPro-P2-P3). AMD will have to introduce some hefty revisions of the K7 (like G4/G4+) which will cut its IPC to keep at/around performance parity, either that or introduce another core which = $X bn of r+d money. erm.. maybe not the correct thread for this, excuse me for going off topic..
Anonymous Gerbil |
poster 22 - methinks that is logical - only problem is we're talking Intel here ;-)
poster 19 - the "ancient" core i'll take over that castrated FPUed "modern?" slow ass core of the p-4. Esp. since SSE-2 ain't gonna happen IMO. K7 is king, but the p6 core is good too - unlike that p-4 crap core. I'll take "ancient" thanks ;-) |
Anonymous Gerbil |
let's see what intel does. in q4, there will probably be a 2ghz northwood on the market, and let's see if it's only a die shrink compared with the willamette, or if it gets a bigger cache, too (which is what i suspect after the tualatin got 512 kb l2...).
if that is the case, the northwood would be a good performer and intel wouldn't be afraid that any p6 could outrun it - even at 1.4 ghz. so making a celeron with 256 kb l2 and a server pIII with 512 kb l2 would be more logical in order to attack amd on the low end. and if northwood is only a smaller willamette, the new celeron could hardly be faster with only half the clock speed.
Anonymous Gerbil |
Originally Posted by dissonance
The p4 is apparently set to go mobile next year, Q2 IIRC. |
Anonymous Gerbil |
Will the P4 ever go mobile??
Anonymous Gerbil |
Originally Posted by Robert
Well, saying that the P6 architecture is "mature" is a bit of an understatement... "Ancient" and "stretched beyond its limits" would be more appropriate. They've been trying to get the good ol' Pentium Pro/Pentium II/Pentium III core to run at over 1.13 GHz for a few months now without much success, so they probably thought that it's time to let it go...
Forge |
Well, if you notice, it was 'I wouldn't mind' and 'Hopefully' and not 'I'm going to' and 'They will'.
Intel is going to continue shooting themselves in the foot till their investors form a lynch mob, methinks. I'm thinking dual 700E@933 on another P2B-D. BX133 = kickass for memory bandwidth, and dual at 933 ought to be quite a bit of horsepower for a fileserver box. Hell, if it works out well enough, maybe I'll offer Twofer the 933s for his 800s. I feel guilty over how much that rig depreciated after I sold it to him. Of course, my lingering desire to have a P2B-D dually box around should tell you something about how particularly slick that rig was. The only thing out of whack on that box was AGP, which was at 88Mhz. Hmmm.... 750Es are only running about 125$ retail boxed, 700E retail is almost 150$... Intel is smoking some really good crack these days. Uh oh. Anybody know where I can get a P2B-D still? I can only find the P2B-DS on pricewatch, and 450$ for a fileserver mobo is a little high for me. |
Anonymous Gerbil |
Thanks Forge, for the detail. But.... you say you hope to "see the new 'Celeron' being a full P3, no crippled cache." To which I reiterate: "Of COURSE they're going to neuter the poor little cache -- that's what a Celeron is to Intel, for years and years now." The only way they'd let it stay a REAL P3 is if the competing P4 were faster. But it's NOT, so they WON'T! -- AC/AG/Whatever |
cRock |
It seems that 512k cache P3s will be pretty pricey if you can lay hands on them. I doubt you'll be building a cheap dually with them any time soon.
I'm seriously starting to consider biting the bullet and buying a Tyan Thunder K7... waiting on my next paycheck....... |
Forge |
P3/Celeron shows a wider delta than Athlon/Duron. The Duron was designed for 64K L2, while the Celeron is a castrated P3. Makes a huge difference in real world usage.
P3 = 256bit L2/CPU interface, 256K of 4 way associative cache. Celeron, designed for same, but only gets 128bit and 2 way associative. Athlon - 64bit L2/CPU interface, 256K of 16way associative cache. Duron - 64bit L2 interface, 64k of 16way associative. The K7s have a weaker cache design, but they both get the full cache they were designed for. The Celeron has a stronger design, but it's crippled. Hopefully we'll see the new 'Celeron' being a full P3, no crippled cache. I wouldn't mind getting P3s with 512K cache for a cheap dually box to replace my aging single Celery file server. |
Anonymous Gerbil |
#13
that comparison wouldn't be valid because of the size and nature of each company's cache architecture. P3s and celeries all have an inclusive cache, meaning that everything in L1 will be duped to L2. Granted the L1 is small (16K... right?), it also has a wider bus (256-bit) than an athlon/duron. Athlon/Durons have an exclusive cache. their L1 is 128K and L2 is 256K and 64K respectively. Also, since the caches are exclusive, the L1 doesn't have to be duped to L2. the only drawback is the 64-bit bus that goes from L1 to L2
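The inclusive/exclusive point can be put in numbers. A rough sketch using only the sizes quoted in the post (including its unconfirmed 16K guess for the P3's L1) and a deliberately simplified model that ignores associativity, bus width and latency; the function name is hypothetical:

```python
def effective_cache_kb(l1_kb, l2_kb, exclusive):
    """Unique data the two cache levels can hold together (simplified)."""
    if exclusive:
        # Exclusive: no line lives in both levels, so capacities add up.
        return l1_kb + l2_kb
    # Inclusive: everything in L1 is duplicated in L2, so the larger
    # level sets the limit.
    return max(l1_kb, l2_kb)

# (L1 KB, L2 KB, exclusive?) as stated in the post.
chips = {
    "P3 (inclusive)": (16, 256, False),
    "Athlon (exclusive)": (128, 256, True),
    "Duron (exclusive)": (128, 64, True),
}

for name, (l1, l2, excl) in chips.items():
    print(f"{name}: {effective_cache_kb(l1, l2, excl)} KB effective")
```

On this model the Duron's small L2 costs it less than the raw 64K figure suggests, because its 128K L1 counts in full.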
Anonymous Gerbil |
For a little more info regarding cache size importance, it should be easier to find comparisons of the Athlon/Duron chips -- they are the same except the Athlon has 256K of cache and the Duron has just 64K. (Yes, it's a bigger difference than Celeron/PIII, but...) As to the bus speed, I get noticeable speedup with FlaskMPEG in overclocking my mobo bus from 100 to 110mhz. Of course it all depends on what apps you're running. If you just do email/light office work, for example, then don't sweat it too much. :)
Anonymous Gerbil |
So sorry. Make that "-- AG". |
Anonymous Gerbil |
Salient? There's a word you don't see every day. Of COURSE they're going to neuter the poor little cache -- that's what a Celeron is to Intel, for years and years now. The current Celerons differ from the PIII only in the size, and the nature (4-way versus 2-way associative I think), of their cache. Re: comparing Celeron 800 v. PIII-800 -- I've looked and a direct comparison is surprisingly hard to find. The Celeron 800 being the first 100 Mhz FSB Celeron makes things a little better (unless of course you were going to overclock it anyway), but I think the performance difference is somewhere between 10 and 20%. Worth double the money? I don't think so. The only direct benchmark comparisons I found were based on Intel SPECmarks, so I didn't trust them too much. All I wanted was a business Winstone of the two (Celeron 800 and PIII 800), since I'm not a gamer. Of course most new cheap enthusiast builders are going to go Athlon/Duron anyhow. -- AC
Khopesh |
Forge,
Your point about pushing people to AMD is salient to probably less than one percent of Intel's market, hell probably less than 1/10 of one percent. This will blow by almost everyone without notice. It's only geeks like us who see the technical/political strategies such as this, and ultimately it will probably benefit Intel, and not anyone else. The Celeron had a crippled branch predictor from what I recall too, so I doubt the core would be changed beyond what's needed to make a die shrink. |
Anonymous Gerbil |
Originally Posted by djspitfire
I'd imagine they're going to cripple it somehow, wouldn't make much sense to have the Celeron be a better performer than the P4. Still though, it's just sad that they won't just go all out with the PIII and get as much as they can out of it, make it the best product they can.
Anonymous Gerbil |
If they did that, Intel would REALLY be tightening their own noose then.
Anonymous Gerbil |
I wonder if that means Intel will be crippling the PIII (Err... "New" Celeron) for dual processor usage. That would really suck seeing as how the P4 Xeons would be the only DP capable Intel processor.
Good news for AMD maybe? |
That all sounds like 'adding MMX' to me. Sure, it wasn't just duct taped on, but it wasn't a massive core redesign.
Furthermore aspects of power management support and L1 cache size and function were changed.
OK, those are changes, but I think you'd have to admit that they're pretty minor.
There haven't been any sweeping redesigns or large fundamental changes to the P6 core since the PPro. That was the point I was trying to make. We need [sarcasm] tags. Where is Wumpus?