*CPU: i7 7700K
*GPU: Asus 1080 Ti Strix x2
*Motherboard: MSI Z270 Titanium
*RAM: Corsair Vengeance LPX 32 GB
*PSU: Seasonic 1050W Snow Silent
I am pulling my hair out at this point. I will try to provide as much information as possible.
I bought a second Asus 1080 Ti Strix about a week ago. While eagerly waiting for the matching EK waterblock to arrive, I plugged it into my rig to test the SLI configuration and the much-anticipated improvement in gaming performance. From here on, for convenience, I will call the old, water-cooled GPU Oldie and the new GPU with the stock cooler Newie.
All was good and smooth until I noticed a sharp FPS drop while playing DOTA 2 (a very undemanding game, btw) from 178 FPS to 38 FPS, along with severe screen tearing. I have dealt with my fair share of graphics card problems. My first instinct was that either the GPU wasn't outputting enough frames for the monitor or the reverse, and since the screen tearing was coupled with a significant drop in FPS, I concluded that it had to be the former. I initially thought it was probably a non-issue, so I removed Newie and kept on gaming with only Oldie. Life was good and fun for about 3 hours, and then the issue happened again: my FPS would drop without warning, with the core clock falling to 145 MHz, and restarting the PC would not fix it (please note that in almost 2 years I had never had this issue).
The funny thing was that when I ran the FurMark benchmark, the clock speed would fluctuate randomly between normal and low at irregular intervals.
I opened MSI Afterburner and realized that Oldie was running at only 145 MHz (GPU-Z shows 100% load at 300 MHz!! like WHAT??), while the memory clock was running fine at 5005 MHz; Newie, however, was running in the normal 1750~1830 MHz range. So here comes the frustrating part that has had me rolling in the dirt for the past 4 days: I have exhausted every last bit of the scarce knowledge I have, and I just couldn't for the **** sake of me figure out how to restore the clock to its normal level.
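For anyone who wants to catch the moment a card gets stuck like this, here is a minimal Python sketch. It assumes clock and load samples have been logged as CSV, for example with `nvidia-smi --query-gpu=index,clocks.gr,utilization.gpu --format=csv,noheader,nounits -l 1`. The function name and the thresholds are hypothetical and chosen only for illustration; it simply flags samples where the load is high but the core clock is near idle.

```python
import csv
import io

# Hypothetical thresholds for illustration only, not vendor guidance.
LOW_CLOCK_MHZ = 400   # at or below this, the core clock is treated as "stuck near idle"
HIGH_LOAD_PCT = 90    # at or above this, the GPU is treated as "under load"


def find_stuck_samples(csv_text, low_clock_mhz=LOW_CLOCK_MHZ, high_load_pct=HIGH_LOAD_PCT):
    """Return (gpu_index, core_mhz, load_pct) tuples for samples where the
    GPU reports high load while its core clock sits near idle.

    Expects rows of the form "index, core clock in MHz, load in percent".
    Rows that do not parse as three integers (headers, "[N/A]") are skipped.
    """
    stuck = []
    for row in csv.reader(io.StringIO(csv_text)):
        if len(row) != 3:
            continue
        try:
            index, core_mhz, load_pct = (int(field.strip()) for field in row)
        except ValueError:
            continue
        if load_pct >= high_load_pct and core_mhz <= low_clock_mhz:
            stuck.append((index, core_mhz, load_pct))
    return stuck


# Made-up sample resembling the readings described above:
# GPU 0 at 145 MHz under full load, GPU 1 at a normal boost clock.
sample = """0, 145, 100
1, 1790, 98
0, 1823, 97
"""
print(find_stuck_samples(sample))  # prints [(0, 145, 100)]
```

Logging once per second during a FurMark run and feeding the file to this function would show how often, and on which card, the clock collapses under load.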
My core problem is this: I never had any problem with Oldie before Saturday. I am in a state of denial at this point; I kind of refuse to believe that my faithful companion Oldie would just die on me like that.
There are some accompanying issues starting to show up that may or may not be related:
1) My monitor gets no signal when plugged into Newie; it only works when plugged into Oldie (from prior experience, I don't think this is necessarily related, because I think it can be solved by reinstalling the driver and manually disabling Oldie in Device Manager).
2) My monitor will not display the BIOS setup page (it does show an image once the system gets into Windows). I know the system successfully booted into the BIOS because the Z270 motherboard shows a distinct message on its onboard display; there was just no image on the monitor.
The methods I have tried:
1) The first thing I ruled out was thermal throttling. Oldie never runs hotter than 55 degrees under 100% load. I also opened up the waterblock to make sure that all the thermal pads were correctly placed, and I put a heat sensor on the VRAM plates to confirm that temperatures were in the normal range.
2) Uninstalled the driver with DDU in safe mode and performed a clean installation of the newest driver available.
3) Uninstalled the driver with DDU in safe mode and performed a clean installation of an archived driver dating back to early last year.
4) Set the Nvidia Control Panel to prefer maximum performance and disabled G-Sync.
5) Switched among 3 new sets of PCIe cables, with and without extensions.
6) Unplugged the PSU cable from the UPS and plugged it directly into the wall socket.
7) Uninstalled Windows, formatted all the storage drives, and reinstalled Windows 10 (I don't have Windows 7).
UPDATE 04/06/2018
Hi, sorry for the 2-week hiatus. I have tried the methods that previous posters mentioned during these 2 weeks, and I am very grateful for them.
So here are the new methods I have tried (thank you to the people in this thread for the advice).
1) I have tried swapping the PCIe slots of Oldie and Newie. The result was surprising: Newie runs fine, but Oldie still has the same problem of randomly dropping its core clock speed despite being moved to the primary slot. This indicates that the PCIe slot might not be the problem.
2) I have tried using 4 independent cables to power the 2 cards. This noticeably improves performance when the card works but does nothing when Oldie's core clock gets stuck, so it doesn't solve the problem. However, this method brought up an interesting observation: Oldie's voltage is consistently about 5~6% lower than Newie's (780 mV idle vs 820 mV idle).
3) I have reinstalled Windows so many times my brain hurts. Each time, I thoroughly wiped my drive using diskpart before installation. This made me question whether the Nvidia driver was really the culprit, as I had previously suspected. However, Windows forces me to install a bunch of crap after installation, so I am not sure whether those forced updates interfere with the driver.
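As a quick sanity check on the voltage gap from item 2, here is a small Python sketch that works out how much lower one reading is relative to another. The helper name is hypothetical, and the two inputs are just the idle readings quoted above, so this is a single sample rather than a measurement series.

```python
def percent_lower(value_mv, reference_mv):
    """Return how much lower value_mv is than reference_mv, as a percentage of reference_mv."""
    return (reference_mv - value_mv) / reference_mv * 100.0


oldie_idle_mv = 780  # idle reading quoted for Oldie
newie_idle_mv = 820  # idle reading quoted for Newie

print(f"{percent_lower(oldie_idle_mv, newie_idle_mv):.1f}%")  # prints 4.9%
```

For this one pair of idle readings the gap comes out just under 5%, at the low end of the 5~6% range seen over time.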
Remember I said that I found a driver that kept Oldie working? Well, it worked for about a week and then the problem started showing up again.....
I have moved from denial to acceptance in the stages of grief. Before officially declaring Oldie dead, I bought a 17-dollar Windows 7 thumb drive yesterday and decided to go for one last test.
I have a question though: have you guys ever seen a card die by getting stuck at a low clock speed? I have never seen a card DIE while still outputting a monitor signal fine but staying stuck at idle clock speed. I will post pictures during the weekend. I am sorry for the lack of screenshots; it is a lot of work constantly removing and reinstalling water loops.
Final diagnosis
The card is broken.
