Now don't get me wrong: when I said dead cards, I meant that the seller specified they were not functional and the price was adjusted accordingly. I'll do the same for mine. I'll price it lower than what this card normally sells for second-hand and tell the buyer that it has TDR issues.
So long as they know what they're buying, so be it.
I do blame NVIDIA, since they allowed their board partners to modify the reference design and did not make them do a mass recall/replacement when the problem showed up back in 2011. I presume they thought they could patch it up with drivers, but as we found out, it was not to be.
The vendors did replace cards during said timeframe (you can find plenty of individual stories from that era). As you note though, apparently your warranty system prevents you from getting direct service with the vendor. So again, not NVIDIA's fault.
(You should have a large pile of evidence sitting at "C:\Windows\LiveKernelReports\WATCHDOG". That's where TDR dump reports are written.)
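If you'd rather count that pile than eyeball it, here is a minimal sketch that lists the dump files in that folder, oldest first. The `.dmp` extension filter and the function name `list_tdr_dumps` are my assumptions, not something stated above, and reading the folder may need an elevated prompt.

```python
from datetime import datetime
from pathlib import Path

# Folder quoted in the post above; reading it may require admin rights.
WATCHDOG_DIR = Path(r"C:\Windows\LiveKernelReports\WATCHDOG")


def list_tdr_dumps(folder=WATCHDOG_DIR):
    """Return (name, modified time, size in bytes) per dump file, oldest first."""
    folder = Path(folder)
    if not folder.is_dir():
        # Folder missing (or not running on Windows): nothing to report.
        return []
    dumps = []
    for path in folder.iterdir():
        # Assumption: the reports are written with a .dmp extension.
        if path.is_file() and path.suffix.lower() == ".dmp":
            info = path.stat()
            modified = datetime.fromtimestamp(info.st_mtime)
            dumps.append((path.name, modified, info.st_size))
    return sorted(dumps, key=lambda entry: entry[1])


if __name__ == "__main__":
    found = list_tdr_dumps()
    print(f"{len(found)} dump file(s) in {WATCHDOG_DIR}")
    for name, when, size in found:
        print(f"{when:%Y-%m-%d %H:%M}  {size:>10}  {name}")
```

The timestamps are the useful part: line them up against when you changed drivers and you can see whether the resets cluster around a particular update or just keep coming regardless.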
This issue wasn't big enough for a general recall. NVIDIA would only issue a recall for a defective chip that impacted all models. Again, this wasn't that sort of issue. A driver update simply started leveraging the chip better, and that exposed defective hardware for what it was.
There is no fixing this in software. You're falling into an all-too-common trap I see junior technicians fall into: believing that it's always just a software bug. It's not the software, it's the hardware.
There was a real software TDR issue that got conflated with the unveiling of all this defective hardware. That issue has been fixed. There is nothing left to fix at this point (and frankly, what remains can't be "fixed" without a performance and efficiency regression). You don't regress your software for defective hardware. Defective hardware has its own servicing method for this problem.
You're also falling into the trap of taking Internet threads as evidence of a larger issue than actually exists. The people with defective hardware who are still experiencing this fall within a typical industry failure rate.
I suspect, though, that the only thing that's going to get you to face reality is buying defective AMD hardware. Once the shoe is on the other foot, I think things will finally be clearer.
To that end, I have a defective 4870X2 with the infamous black square problem that I'll sell you for real cheap.
I mean look at all those hits from Google. Surely that implies it's a widespread software problem or hardware problem. Why hasn't AMD done a recall? Why haven't they fixed it in their drivers? It even happens in newer chips! What are their engineers doing? How many years will it take to fix this? Clearly they don't care about their customers.
"Welcome back my friends to the show that never ends. We're so glad you could attend. Come inside! Come inside!"