Failure to recognize SATA disk drives


Posted on Fri Jan 11, 2008 7:22 pm

So today I tried to install Fedora 8 on a SATA drive connected to one of the on-board SATA ports of an Asus A8V-VM motherboard. The Fedora installer would consistently hang for several minutes at the "Loading AHCI driver" screen, then fail to find any hard drives to install on. This was quite puzzling, since I'd installed Fedora Core 6 on this motherboard previously without incident.

After much head-scratching, flashing the BIOS to the latest version (this had no effect), and Googling around, I finally got it sorted. Turns out newer Linux kernels enable something called MSI (Message Signaled Interrupts) by default. Apparently this is a new(ish) way for PCI and PCIe devices to signal interrupts to the CPU, which supposedly does away with the concept of IRQ lines.
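
If you're curious whether a given device is actually using MSI, you can usually tell from lspci and /proc/interrupts. The exact output wording varies with your lspci and kernel versions (and lspci needs to be run as root to show the capability lines), so treat this as a rough sketch:

  # Look for an MSI capability on each device, and whether it's enabled (Enable+ vs Enable-)
  lspci -vv | grep -iA1 "message signal"

  # Interrupts actually being delivered via MSI show up with a PCI-MSI type here
  grep -i msi /proc/interrupts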

The problem is, MSI is broken. I'm not sure if the problem is with the kernel itself, the MSI implementation on this motherboard (it has a VIA 8251 southbridge, so hardware bugs would not be surprising), or a combination of the two. But apparently other people have been having similar problems ever since MSI was enabled by default in the Linux kernel a few months back.

So anyhow, adding pci=nomsi to the boot options fixed the problem. Just thought I'd share that, in case anyone else has been trying to install a new(ish) Linux distro on a SATA drive, and is having similar issues.
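
For anyone who needs the workaround both during and after the install: at the installer's boot prompt/menu you can append it to the kernel arguments, and once the system is installed the same option goes on the kernel line in GRUB's config. Something along these lines -- the kernel version and root= arguments below are only placeholders, your own entry will differ:

  # /boot/grub/grub.conf (GRUB legacy, as used by Fedora at the time)
  title Fedora (2.6.23.x)
          root (hd0,0)
          kernel /vmlinuz-2.6.23.x ro root=LABEL=/ rhgb quiet pci=nomsi
          initrd /initrd-2.6.23.x.img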
(this space intentionally left blank)
just brew it!
Administrator
Gold subscriber
 
 
Posts: 37846
Joined: Tue Aug 20, 2002 10:51 pm
Location: Somewhere, having a beer

Posted on Fri Jan 11, 2008 11:34 pm

This is useful info. I haven't encountered it on any of my machines or had it come up at any of our recent LUG installfests, but it's good to know how to fix it regardless.

I was looking around on LKML and there's some interesting information regarding MSI. Benjamin Herrenschmidt pointed out that PCIe only requires MSI, and old-style interrupt support is optional. The only thing keeping x86 PCIe implementations supporting the old style is the fact that Windows doesn't support MSI:
Note that the PCIe spec mandates MSIs while old style interrupts (LSIs) are optional... The result is that there are already platforms (though not x86 just yet) being designed with simply no support for LSIs on PCIe and I've heard of devices doing that too (yeah, that's weird, they wouldn't work in windows I suppose).


Linus has some strong thoughts (as usual) about this:
http://lkml.org/lkml/2006/11/14/333

When somebody can actually say what the huge advantages to MSI are that it's worth using when

(a) several motherboards are apparently known broken

(b) microsoft apparently is of the same opinion and _also_ doesn't use it

(c) the old non-MSI code works fine

(d) there is apparently no fool-proof way to tell when it works and when it doesn't.

then please holler. Btw, I'm not even _interested_ in any advantages unless you also have a solution for (d). Not a "it should work". I want to hear something that is _guaranteed_ to work.
bitvector
Grand Gerbil Poohbah
 
Posts: 3234
Joined: Wed Jun 22, 2005 4:39 pm
Location: Mountain View, CA

Posted on Sat Jan 12, 2008 12:48 am

I believe this is a subset / side effect of PnP interrupt sharing - since PCI(e) AHCI usually does interrupt sharing off of IRQ 9, a standard IRQ-based system wouldn't work after the BIOS enumeration. On a PnP machine, each PCI(e) card is assigned a hardware IRQ by the AHCI BIOS based upon the PCI slot at hard boot. The PnP-aware OS then queries, configures, shuts down, and then restarts each card in order. If I understand this correctly, a PnP OS will therefore set each PCI(e) card to IRQ 9 and then switch over to MSI, because LSI no longer functions once *all* PCI(e) cards have been reconfigured to IRQ 9.

Sounds like some BIOS / MB combos, with Linux, don't get it quite right. I would assume that this is due to most PCI(e) card & chipset suppliers, and BIOS manufacturers, working more on Windows problems than xNIX. It is possible, like the old Intel NIC NWAY 100BaseT autonegotiation flipbit error, that they "cover up" the problem by fixing it in the Windows drivers (but not in Linux, for example).
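
One way to poke at the BIOS-versus-kernel IRQ question is to compare lspci's two views of the same devices; the -b switch shows IRQs and addresses as programmed into the cards (i.e. what the BIOS left behind) rather than as the kernel sees them. Rough sketch, assuming a reasonably current pciutils:

  # IRQs as the kernel sees them
  lspci -v

  # Bus-centric view: IRQs as seen by the cards themselves (BIOS-programmed)
  lspci -b -v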
Snake
Gerbil Elite
 
Posts: 740
Joined: Sun Aug 28, 2005 1:46 pm
Location: Somewhere...(if you say "over the rainbow", I'll kill ya!)

Posted on Sat Jan 12, 2008 12:58 am

I agree with Linus on this one. Enabling something by default that causes the system to break on a lot of hardware is dumb. I wonder if he changed his mind since that post, or got overruled, or perhaps it is something that only certain distros enable by default? It wouldn't be the first time I've gotten burned by Fedora enabling some bleeding edge feature that isn't quite ready for prime time yet. But then I guess that's to be expected, given that Fedora is effectively the beta test program for RHEL.
(this space intentionally left blank)
just brew it!
Administrator
Gold subscriber
 
 
Posts: 37846
Joined: Tue Aug 20, 2002 10:51 pm
Location: Somewhere, having a beer

Posted on Sat Jan 12, 2008 1:09 am

Snake wrote: Sounds like some BIOS / MB combos, with Linux, don't get it quite right. I would assume that this is due to most PCI(e) card & chipset suppliers, and BIOS manufacturers, working more on Windows problems than xNIX. It is possible, like the old Intel NIC NWAY 100BaseT autonegotiation flipbit error, that they "cover up" the problem by fixing it in the Windows drivers (but not in Linux, for example).

Except that according to bitvector's first link, Windows doesn't support MSI (as of about a year ago... maybe Vista supports it now?). So it would seem that this is more an issue of Linux trying to use a feature that probably hasn't been tested on many motherboards -- especially older ones -- since the feature isn't needed to run Windows.

Enabling MSI by default sounds like one of those things which may be the "right" thing to do from a purely technical standpoint, but is a terrible decision from a practical standpoint, if your goal is to be compatible with the existing installed base of hardware. Linux distros really need to pay more attention to this sort of stuff; most people wouldn't have the patience to figure out stuff like the pci=nomsi option.
Last edited by just brew it! on Sat Jan 12, 2008 1:13 am, edited 1 time in total.
(this space intentionally left blank)
just brew it!
Administrator
Gold subscriber
 
 
Posts: 37846
Joined: Tue Aug 20, 2002 10:51 pm
Location: Somewhere, having a beer

Posted on Sat Jan 12, 2008 1:11 am

I find it..."interesting"...that Linus claims MS does not use MSI, when Vista DOES prefer it:

http://www.microsoft.com/whdc/system/bus/PCI/MSI.mspx

As noted, apparently it was supported in XP but wasn't the default. But I believe I HAVE seen some MSI PCI cards (I spend a lot of time debugging PCI card conflicts).

[edit - oops, missed your post! Yep, MS now prefers it for Vista]
Snake
Gerbil Elite
 
Posts: 740
Joined: Sun Aug 28, 2005 1:46 pm
Location: Somewhere...(if you say "over the rainbow", I'll kill ya!)

Posted on Sat Jan 12, 2008 1:15 am

Snake wrote: I find it..."interesting"...that Linus claims MS does not use MSI, when Vista DOES prefer it:

http://www.microsoft.com/whdc/system/bus/PCI/MSI.mspx

As noted, apparently it was supported in XP but wasn't the default. But I believe I HAVE seen some MSI PCI cards (I spend a lot of time debugging PCI card conflicts).

The posts bitvector linked were more than a year old. So at the time, it wasn't the default, even if it was (supposedly) supported in XP.

Edit: I skimmed the whitepaper linked from that MS page, and didn't see where it says it was supported (but not enabled by default) in XP? It does say that it was part of the PCI 2.2 spec; but based on my experience today, apparently some vendors (*cough* *cough* VIA...) got it wrong.
(this space intentionally left blank)
just brew it!
Administrator
Gold subscriber
 
 
Posts: 37846
Joined: Tue Aug 20, 2002 10:51 pm
Location: Somewhere, having a beer

Posted on Sat Jan 12, 2008 1:23 am

just brew it! wrote: I agree with Linus on this one. Enabling something by default that causes the system to break on a lot of hardware is dumb. I wonder if he changed his mind since that post, or got overruled, or perhaps it is something that only certain distros enable by default?

It looks like they've kept it enabled by default, with a blacklist of broken devices. They discussed changing it to a whitelist around May 2007. Eric Biederman pointed out, "what I see is a steadily growing black list that looks like it should include every non-Intel x86 pci-express chipset." It looks like Greg KH sort of vetoed merging that patch because a) non-x86 arches don't seem to have the problem with buggy MSI implementations, and b) they believe support should get better over time (particularly with Vista supporting it), thus ending their ever-growing blacklist nightmare. Reading about the problems MSI is causing them is just nasty. :lol:
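
If anyone wants to check where their own kernel stands, a few quick things to look at (Fedora drops the kernel config in /boot; other distros may expose /proc/config.gz instead, and the dmesg grep is just a broad net rather than a specific message to expect):

  # Is MSI support compiled in at all?
  grep PCI_MSI /boot/config-$(uname -r)

  # Was pci=nomsi actually passed at boot?
  cat /proc/cmdline

  # Any mention of MSI being enabled, disabled, or quirked in the boot log?
  dmesg | grep -i msi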
bitvector
Grand Gerbil Poohbah
 
Posts: 3234
Joined: Wed Jun 22, 2005 4:39 pm
Location: Mountain View, CA

Posted on Sat Jan 12, 2008 1:36 am

Well, all I can say for certain is that something changed between the release of Fedora Core 6 and Fedora 8. Either it was disabled by default in 6, or the blacklisting mechanism is broken in 8, because 6 will install on this motherboard without the nomsi option but 8 won't.

Ultimately, it's probably VIA's fault in this case... but the kernel and/or Fedora ought not to be enabling the feature by default, especially if they know it is broken on a lot of hardware.

Buggy MSI implementations should actually be less of an issue for Vista, since most people installing Vista will be doing so on new(er) hardware.
(this space intentionally left blank)
just brew it!
Administrator
Gold subscriber
 
 
Posts: 37846
Joined: Tue Aug 20, 2002 10:51 pm
Location: Somewhere, having a beer

