Personal computing discussed

Moderators: SecretSquirrel, notfred

 
just brew it!
Gold subscriber
Administrator
Topic Author
Posts: 48762
Joined: Tue Aug 20, 2002 10:51 pm
Location: Somewhere, having a beer

Down the Linux kernel rabbit hole we go...

Tue Jul 11, 2017 10:21 am

So we've encountered an unusual use case at my day job where our software misbehaves, but only on systems running recent Linux kernels and a particular (no longer widely used) configuration of our software. I've been thrown into the mix of developers trying to chase this down to root cause.

I now have a clone of the entire Linux kernel git repository (source code for every release and release candidate going back to v2.6.12... all 2GB of it), a spare machine to test custom-built kernels on, and a recipe for reproducing the slowdown.

If I'm not back in a week send out a search party.
Nostalgia isn't what it used to be.
 
chuckula
Gold subscriber
Gerbil Jedi
Posts: 1512
Joined: Wed Jan 23, 2008 9:18 pm
Location: Probably where I don't belong.

Re: Down the Linux kernel rabbit hole we go...

Tue Jul 11, 2017 10:46 am

I'm not expecting it to be easy, but git-bisect is your friend: https://git-scm.com/docs/git-bisect
4770K @ 4.7 GHz; 32GB DDR3-2133; GTX-1080; 512GB 840 Pro (2x); Fractal Define XL-R2; NZXT Kraken-X60
--Many thanks to the TR Forum for advice in getting it built.
 
notfred
Maximum Gerbil
Posts: 4352
Joined: Tue Aug 10, 2004 10:10 am
Location: Ottawa, Canada

Re: Down the Linux kernel rabbit hole we go...

Tue Jul 11, 2017 11:30 am

chuckula beat me to it. If it used to work and now doesn't, git-bisect should narrow things down to the commit (or small range of commits) where the behavior changed.
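
For anyone who hasn't used it: git-bisect is basically a binary search over history, with git doing the bookkeeping. Here's a rough Python sketch of the idea. The commit names and the `is_bad` check are made up for illustration; in practice "is it bad?" means building the kernel, booting it, and running the repro.

```python
from typing import Callable, Sequence


def first_bad(commits: Sequence[str], is_bad: Callable[[str], bool]) -> str:
    """Return the first commit for which is_bad() is true.

    Assumes commits are ordered oldest to newest, commits[0] is known good,
    commits[-1] is known bad, and every commit after the first bad one is
    also bad (the same assumption a bisection relies on).
    """
    good, bad = 0, len(commits) - 1
    while bad - good > 1:
        mid = (good + bad) // 2
        if is_bad(commits[mid]):  # stand-in for: build, boot, run the repro
            bad = mid
        else:
            good = mid
    return commits[bad]


# Hypothetical linear history where the slowdown first appears in "c6".
history = [f"c{i}" for i in range(10)]
culprit = first_bad(history, lambda c: int(c[1:]) >= 6)
print(culprit)
```

The point is that the number of test builds grows roughly with log2 of the number of commits, which is what makes this feasible on a tree the size of the kernel. Real kernel history is a graph with merges rather than a straight line, so git's choice of the next commit to test is more involved than the midpoint pick above.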
 
whm1974
Maximum Gerbil
Posts: 4806
Joined: Fri Dec 05, 2014 5:29 am

Re: Down the Linux kernel rabbit hole we go...

Tue Jul 11, 2017 12:01 pm

Not to be an ass, but maybe your company shouldn't be using a software configuration that is no longer widely used to begin with. But anyway, good luck in solving your problem.
 
Glorious
Gold subscriber
Grand Admiral Gerbil
Posts: 10033
Joined: Tue Aug 27, 2002 6:35 pm

Re: Down the Linux kernel rabbit hole we go...

Tue Jul 11, 2017 12:04 pm

whm1974 you misread it.

It's a configuration of HIS COMPANY'S software that is no longer widely used, but is presumably still used by a few paying customers and therefore supported.

And even then, yeah, thanks for the "advice". :roll: :roll: :roll: :roll: :roll: :roll: :roll:
 
just brew it!
Gold subscriber
Administrator
Topic Author
Posts: 48762
Joined: Tue Aug 20, 2002 10:51 pm
Location: Somewhere, having a beer

Re: Down the Linux kernel rabbit hole we go...

Tue Jul 11, 2017 12:47 pm

Glorious wrote:
whm1974 you misread it.

It's a configuration of HIS COMPANY'S software that is no longer widely used, but is presumably still used by a few paying customers and therefore supported.

Bingo. And moving those legacy customers to a newer system configuration is hairy enough (involving conversion of petabytes of data) that it is a non-starter from a logistics and cost standpoint.

But those same customers also need to be on a currently supported release of our software, in order to get bug fixes and new features. This in turn means rolling out a newer kernel to them, because that's what the later versions of our software require. So we need current versions of our software, in a legacy configuration, to work on newer Linux kernels without the performance degradation.

Long-term support of large applications in an enterprise environment is a non-trivial thing. You can't just tell people "oh well, I'm afraid you're just going to have to export all of your data, reformat, reinstall everything from scratch, and re-import all of your data" like you can in the consumer world.

Edit: And if you have to prefix your comment with "Not to be an ass, but...", maybe you need to think for another 10 seconds before hitting Submit? :lol:
Nostalgia isn't what it used to be.
 
whm1974
Maximum Gerbil
Posts: 4806
Joined: Fri Dec 05, 2014 5:29 am

Re: Down the Linux kernel rabbit hole we go...

Tue Jul 11, 2017 1:56 pm

just brew it! wrote:
Glorious wrote:
whm1974 you misread it.

It's a configuration of HIS COMPANY'S software that is no longer widely used, but is presumably still used by a few paying customers and therefore supported.

Bingo. And moving those legacy customers to a newer system configuration is hairy enough (involving conversion of petabytes of data) that it is a non-starter from a logistics and cost standpoint.

But those same customers also need to be on a currently supported release of our software, in order to get bug fixes and new features. This in turn means rolling out a newer kernel to them, because that's what the later versions of our software require. So we need current versions of our software, in a legacy configuration, to work on newer Linux kernels without the performance degradation.

Long-term support of large applications in an enterprise environment is a non-trivial thing. You can't just tell people "oh well, I'm afraid you're just going to have to export all of your data, reformat, reinstall everything from scratch, and re-import all of your data" like you can in the consumer world.

Edit: And if you have to prefix your comment with "Not to be an ass, but...", maybe you need to think for another 10 seconds before hitting Submit? :lol:

OK I'm an ASS so be it. Personally I think companies should have plans to keep their systems current and not still be using way out-of-date hardware and software that belongs in some museum somewhere, such as CRT displays and punch cards. This is 2017 after all.
 
Glorious
Gold subscriber
Grand Admiral Gerbil
Posts: 10033
Joined: Tue Aug 27, 2002 6:35 pm

Re: Down the Linux kernel rabbit hole we go...

Tue Jul 11, 2017 1:58 pm

You are *totally* misunderstanding the situation JBI is talking about.

Stop.
 
Waco
Gold subscriber
Gerbil Jedi
Posts: 1953
Joined: Tue Jan 20, 2009 4:14 pm
Location: Los Alamos, NM

Re: Down the Linux kernel rabbit hole we go...

Tue Jul 11, 2017 2:13 pm

whm1974 wrote:
OK I'm an ASS so be it. Personally I think companies should have plans to keep their systems current and not still be using way out-of-date hardware and software that belongs in some museum somewhere, such as CRT displays and punch cards. This is 2017 after all.

You have literally zero idea what he's talking about, please stop.

JBI - I feel your pain. :(
Z170A Gaming Pro Carbon | 6700K @ 4.5 | 16 GB | GTX Titan X | Seasonic Gold 850 | XSPC RX360 | Heatkiller R3 | D5 + RP-452X2 | Cosmos II | Samsung 4K 40" | 480 + 240 + LSI 9207-8i (128x8) SSDs
 
notfred
Maximum Gerbil
Posts: 4352
Joined: Tue Aug 10, 2004 10:10 am
Location: Ottawa, Canada

Re: Down the Linux kernel rabbit hole we go...

Tue Jul 11, 2017 2:43 pm

It's not just the Enterprise space. In the embedded space we are fixing bugs in software that shipped 10 years ago, customers have support contracts for it so we have to fix it.
 
Glorious
Gold subscriber
Grand Admiral Gerbil
Posts: 10033
Joined: Tue Aug 27, 2002 6:35 pm

Re: Down the Linux kernel rabbit hole we go...

Tue Jul 11, 2017 2:52 pm

To re-iterate what JBI is actually saying and why Waco and I are like "STAWP":

We're not talking about hardware revision or software version, we're talking about a configuration option that (presumably) many different versions of JBI's software support on (potentially) many different revisions/types of hardware.

That is, something like a compression option for a filesystem that JBI's software has supported since version 1.0 up to version 3.2 (current), something that no new customer could or would ever pick now, but that certain customers have petabytes of data compressed with on mission-critical production systems because they chose it back in version 1.0 or whatever.

It's not a bug, nor would upgrading anything (hardware, OS, software) "fix" it. So long as JBI's software supports that compression, it's still, well, supported. And if he DOESN'T support it, customers get angry and instead of buying version 4.0 without that compression and doing painful migration, they flip the bird and just go with JBI's competitor. Why wouldn't they? They have to start from scratch anyway BECAUSE of JBI!

So, no, "Upgrade" is not a solution. In fact, as JBI said, this is a problem BECAUSE people are upgrading to a new version of his software (and therefore newer kernels), and using that compression suddenly has performance issues and no one knows why.

Now, that compression idea is just an example, I have no idea what it is, but this is the kind of situation that JBI is talking about.
 
whm1974
Maximum Gerbil
Posts: 4806
Joined: Fri Dec 05, 2014 5:29 am

Re: Down the Linux kernel rabbit hole we go...

Tue Jul 11, 2017 3:03 pm

OK, sorry that I have been an ass about this and went off half-cocked.
 
just brew it!
Gold subscriber
Administrator
Topic Author
Posts: 48762
Joined: Tue Aug 20, 2002 10:51 pm
Location: Somewhere, having a beer

Re: Down the Linux kernel rabbit hole we go...

Tue Jul 11, 2017 4:44 pm

notfred wrote:
It's not just the Enterprise space. In the embedded space we are fixing bugs in software that shipped 10 years ago, customers have support contracts for it so we have to fix it.

Yup, been there too. Heck, with military stuff, the hardware is probably gonna be several generations obsolete before you get done with all the revisions, qualification testing, and red tape of Pentagon procurement. Then you've gotta support it for 10-20 years; once a piece of equipment gets fielded, they expect to use it for a long time. Parts going EOL by the time you hit volume production was even a worry (and this burned us a couple of times) when I worked in that industry.
Nostalgia isn't what it used to be.
 
whm1974
Maximum Gerbil
Posts: 4806
Joined: Fri Dec 05, 2014 5:29 am

Re: Down the Linux kernel rabbit hole we go...

Tue Jul 11, 2017 4:47 pm

just brew it! wrote:
notfred wrote:
It's not just the Enterprise space. In the embedded space we are fixing bugs in software that shipped 10 years ago, customers have support contracts for it so we have to fix it.

Yup, been there too. Heck, with military stuff, the hardware is probably gonna be several generations obsolete before you get done with all the revisions, qualification testing, and red tape of Pentagon procurement. Then you've gotta support it for 10-20 years; once a piece of equipment gets fielded, they expect to use it for a long time. Parts going EOL by the time you hit volume production was even a worry (and this burned us a couple of times) when I worked in that industry.

OUCH!!!
 
just brew it!
Gold subscriber
Administrator
Topic Author
Posts: 48762
Joined: Tue Aug 20, 2002 10:51 pm
Location: Somewhere, having a beer

Re: Down the Linux kernel rabbit hole we go...

Tue Jul 11, 2017 4:55 pm

whm1974 wrote:
just brew it! wrote:
notfred wrote:
It's not just the Enterprise space. In the embedded space we are fixing bugs in software that shipped 10 years ago, customers have support contracts for it so we have to fix it.

Yup, been there too. Heck, with military stuff, the hardware is probably gonna be several generations obsolete before you get done with all the revisions, qualification testing, and red tape of Pentagon procurement. Then you've gotta support it for 10-20 years; once a piece of equipment gets fielded, they expect to use it for a long time. Parts going EOL by the time you hit volume production was even a worry (and this burned us a couple of times) when I worked in that industry.

OUCH!!!

Welcome to the real world!

Edit: And while we're talking legacy military hardware... the communications bus used on many military aircraft is still MIL-STD-1553. This is a multi-drop serial bus designed in the early 1970s that runs at a blazing fast (NOT!) 1 Mbit/sec. The transceiver chips for this bus are horrendously expensive (hundreds of $) because only 2 companies make them, and they have a captive market. Ethernet was still considered a new, novel (and unproven!) concept for military avionics as recently as the mid-'00s; and if you need data from any legacy aircraft systems you'll still need to implement a 1553 interface on your device today!
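
To put that 1 Mbit/sec in perspective, here's a back-of-the-envelope sketch in Python. The 100 MB payload is a made-up number, and the 16/20 efficiency factor reflects my understanding that each 16-bit data word goes out as a 20-bit word on the wire (sync plus parity). Command/status words and inter-message gaps are ignored, so real throughput would be lower still.

```python
def transfer_seconds(payload_bytes: int, line_rate_bits_per_s: float,
                     efficiency: float = 1.0) -> float:
    """Best-case time to move payload_bytes over a serial link."""
    return payload_bytes * 8 / (line_rate_bits_per_s * efficiency)


payload = 100 * 1000 * 1000  # hypothetical 100 MB data load

# 1 Mbit/s bus, with at most 16 payload bits per 20 bits on the wire.
bus_1553 = transfer_seconds(payload, 1_000_000, efficiency=16 / 20)

# 100 Mbit/s Ethernet, treated as ideal (no framing overhead) for comparison.
fast_ethernet = transfer_seconds(payload, 100_000_000)

print(f"1553:     ~{bus_1553 / 60:.0f} minutes")
print(f"Ethernet: ~{fast_ethernet:.0f} seconds")
```

Even under these generous assumptions the gap is a couple of orders of magnitude, so anything that has to cross the legacy bus is going to feel slow.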
Nostalgia isn't what it used to be.
 
Captain Ned
Gold subscriber
Global Moderator
Posts: 26206
Joined: Wed Jan 16, 2002 7:00 pm
Location: Vermont, USA

Re: Down the Linux kernel rabbit hole we go...

Tue Jul 11, 2017 5:04 pm

just brew it! wrote:
Welcome to the real world!

The Apollo Guidance Computer survived long enough to be used in 1st-gen Fly By Wire tests and in the Navy's 2 DSRVs. Its child, the AP101, powered every Shuttle flight and may still live on in the B-52 and B-1B.
If the Earth were flat, cats would have pushed everything off of it by now.
 
whm1974
Maximum Gerbil
Posts: 4806
Joined: Fri Dec 05, 2014 5:29 am

Re: Down the Linux kernel rabbit hole we go...

Tue Jul 11, 2017 5:25 pm

Captain Ned wrote:
just brew it! wrote:
Welcome to the real world!

The Apollo Guidance Computer survived long enough to be used in 1st-gen Fly By Wire tests and in the Navy's 2 DSRVs. Its child, the AP101, powered every Shuttle flight and may still live on in the B-52 and B-1B.

And yet the specs required to run and use modern applications these days... Sometimes it seems that the Web was much faster back in the early and mid '90s, even when we were stuck with dial-up, than it is now. When did we go wrong?
 
Topinio
Graphmaster Gerbil
Posts: 1253
Joined: Mon Jan 12, 2015 9:28 am
Location: London

Re: Down the Linux kernel rabbit hole we go...

Tue Jul 11, 2017 5:59 pm

whm1974 wrote:
When did we go wrong?

September.
Desktop: E3-1270 v5, X11SAT-F, 32GB, RX Vega 56, 250GB BX100, 2TB Ultrastar, Xonar DGX, XL2730Z
HTPC: i5-2500K, DH67GD, 6GB, GT 1030 SC, 250GB BX100, 1.5TB Barracuda, Xonar DX
 
Glorious
Gold subscriber
Grand Admiral Gerbil
Posts: 10033
Joined: Tue Aug 27, 2002 6:35 pm

Re: Down the Linux kernel rabbit hole we go...

Tue Jul 11, 2017 6:00 pm

Wed Sep 8716 01:00:35 CEST 1993
 
whm1974
Maximum Gerbil
Posts: 4806
Joined: Fri Dec 05, 2014 5:29 am

Re: Down the Linux kernel rabbit hole we go...

Tue Jul 11, 2017 6:04 pm

I blame Microsoft, at least in part.
 
Waco
Gold subscriber
Gerbil Jedi
Posts: 1953
Joined: Tue Jan 20, 2009 4:14 pm
Location: Los Alamos, NM

Re: Down the Linux kernel rabbit hole we go...

Tue Jul 11, 2017 6:50 pm

whm1974 wrote:
I blame Microsoft, at least in part.

Of course you do. :lol:
Z170A Gaming Pro Carbon | 6700K @ 4.5 | 16 GB | GTX Titan X | Seasonic Gold 850 | XSPC RX360 | Heatkiller R3 | D5 + RP-452X2 | Cosmos II | Samsung 4K 40" | 480 + 240 + LSI 9207-8i (128x8) SSDs
 
just brew it!
Gold subscriber
Administrator
Topic Author
Posts: 48762
Joined: Tue Aug 20, 2002 10:51 pm
Location: Somewhere, having a beer

Re: Down the Linux kernel rabbit hole we go...

Tue Jul 11, 2017 6:58 pm

If you want to blame anyone, it should be the companies that have made vast amounts of compute power and storage available on the cheap. Software complexity always increases to consume the available resources.
Nostalgia isn't what it used to be.
 
Captain Ned
Gold subscriber
Global Moderator
Posts: 26206
Joined: Wed Jan 16, 2002 7:00 pm
Location: Vermont, USA

Re: Down the Linux kernel rabbit hole we go...

Tue Jul 11, 2017 7:10 pm

Glorious wrote:
Wed Sep 8716 01:00:35 CEST 1993

It's not Flash Player, it's not IE, it's not the first Linux kernel, it's not Win95 so just exactly what is my "D'Oh!" moment?

Ah, AOL granting access to USENET. Been a long time since I've dabbled around in that particular corner of the 'Net. Just checked and both GigaNews and Forte Agent still exist should I ever feel the need to find the really good tentacle-pr0n.

Not grokking Central European Summer Time, but what's a time zone or 4 these days?
If the Earth were flat, cats would have pushed everything off of it by now.
 
SuperSpy
Gold subscriber
Minister of Gerbil Affairs
Posts: 2204
Joined: Thu Sep 12, 2002 9:34 pm
Location: TR Forums

Re: Down the Linux kernel rabbit hole we go...

Wed Jul 12, 2017 7:59 am

I remember being dumbfounded when work took delivery of a new Cessna in 2007 and I asked the tech why it took so long to load mapping data on it (it had a new-fangled feature that allowed a laptop to load map/chart data on it over Ethernet), to which he replied that the bus all its computer equipment ran on was significantly older than me.
Desktop: i7-4790K @4.8 GHz | 32 GB | EVGA GeForce 1060 | Windows 10 x64
Laptop: i7 740QM | 12 GB | Mobility Radeon 5850 | Windows 10 x64
 
just brew it!
Gold subscriber
Administrator
Topic Author
Posts: 48762
Joined: Tue Aug 20, 2002 10:51 pm
Location: Somewhere, having a beer

Re: Down the Linux kernel rabbit hole we go...

Wed Jul 12, 2017 8:13 am

SuperSpy wrote:
I remember being dumbfounded when work took delivery of a new Cessna in 2007 and I asked the tech why it took so long to load mapping data on it (it had a new-fangled feature that allowed a laptop to load map/chart data on it over Ethernet), to which he replied that the bus all its computer equipment ran on was significantly older than me.

Once you get a piece of equipment through the FAA's certification process, you don't change it. Ever. Because that would require re-certification. And that can easily cost more than the original development cost.
Nostalgia isn't what it used to be.
 
SuperSpy
Gold subscriber
Minister of Gerbil Affairs
Posts: 2204
Joined: Thu Sep 12, 2002 9:34 pm
Location: TR Forums

Re: Down the Linux kernel rabbit hole we go...

Wed Jul 12, 2017 9:30 am

just brew it! wrote:
SuperSpy wrote:
I remember being dumbfounded when work took delivery of a new Cessna in 2007 and I asked the tech why it took so long to load mapping data on it (it had a new-fangled feature that allowed a laptop to load map/chart data on it over Ethernet), to which he replied that the bus all its computer equipment ran on was significantly older than me.

Once you get a piece of equipment through the FAA's certification process, you don't change it. Ever. Because that would require re-certification. And that can easily cost more than the original development cost.

Yeah, I always joked with the pilot over how a modern cell phone could easily handle all the duties of every electronic system in that aircraft if it had the I/O to connect to everything.
Desktop: i7-4790K @4.8 GHz | 32 GB | EVGA GeForce 1060 | Windows 10 x64
Laptop: i7 740QM | 12 GB | Mobility Radeon 5850 | Windows 10 x64
 
Waco
Gold subscriber
Gerbil Jedi
Posts: 1953
Joined: Tue Jan 20, 2009 4:14 pm
Location: Los Alamos, NM

Re: Down the Linux kernel rabbit hole we go...

Wed Jul 12, 2017 12:24 pm

SuperSpy wrote:
Yeah, I always joked with the pilot over how a modern cell phone could easily handle all the duties of every electronic system in that aircraft if it had the I/O to connect to everything.

Sure, minus the inevitable update or crash that causes all of the aircraft systems to die mid-flight. :lol:
Z170A Gaming Pro Carbon | 6700K @ 4.5 | 16 GB | GTX Titan X | Seasonic Gold 850 | XSPC RX360 | Heatkiller R3 | D5 + RP-452X2 | Cosmos II | Samsung 4K 40" | 480 + 240 + LSI 9207-8i (128x8) SSDs
 
just brew it!
Gold subscriber
Administrator
Topic Author
Posts: 48762
Joined: Tue Aug 20, 2002 10:51 pm
Location: Somewhere, having a beer

Re: Down the Linux kernel rabbit hole we go...

Wed Jul 12, 2017 12:35 pm

Waco wrote:
SuperSpy wrote:
Yeah, I always joked with the pilot over how a modern cell phone could easily handle all the duties of every electronic system in that aircraft if it had the I/O to connect to everything.

Sure, minus the inevitable update or crash that causes all of the aircraft systems to die mid-flight. :lol:

The only way a consumer device like that would fly (literally) as part of the aircraft electronics would be if it had no connection to anything that could potentially affect flight safety. FAA is very strict about that.

Edit: A few years ago, I was involved in a project where we sold the USAF a system based on a 1GHz Celeron single-board computer (which was ostensibly designed for industrial control applications, not avionics). They would've really preferred something based on an avionics-certified PowerPC SBC, but the Celeron was the only way to provide the capabilities they wanted within the space, power, and cooling budgets we had to work with. They eventually gave in, because A) it wasn't part of any critical flight control systems (if it malfunctioned and became a distraction the pilot could simply turn it off, and we also implemented a built-in killswitch which would turn it off if certain types of failures were detected); and B) unlike the FAA (who would prefer not to have to deal with figuring out how to regulate new tech), the military actually wants new toys.
Nostalgia isn't what it used to be.
 
Waco
Gold subscriber
Gerbil Jedi
Posts: 1953
Joined: Tue Jan 20, 2009 4:14 pm
Location: Los Alamos, NM

Re: Down the Linux kernel rabbit hole we go...

Wed Jul 12, 2017 1:55 pm

just brew it! wrote:
Waco wrote:
SuperSpy wrote:
Yeah, I always joked with the pilot over how a modern cell phone could easily handle all the duties of every electronic system in that aircraft if it had the I/O to connect to everything.

Sure, minus the inevitable update or crash that causes all of the aircraft systems to die mid-flight. :lol:

The only way a consumer device like that would fly (literally) as part of the aircraft electronics would be if it had no connection to anything that could potentially affect flight safety. FAA is very strict about that.

Oh absolutely, I'm aware of the restrictions. You don't mess with control systems on something that can kill you if it fails. Last I heard, all commercial avionics software/hardware essentially required a *real* RTOS to avoid even the smallest of hiccups.

The power is there on consumer devices, but the reliability of the software and hardware is something that makes the comparison a bit moot. :P
Z170A Gaming Pro Carbon | 6700K @ 4.5 | 16 GB | GTX Titan X | Seasonic Gold 850 | XSPC RX360 | Heatkiller R3 | D5 + RP-452X2 | Cosmos II | Samsung 4K 40" | 480 + 240 + LSI 9207-8i (128x8) SSDs
 
just brew it!
Gold subscriber
Administrator
Topic Author
Posts: 48762
Joined: Tue Aug 20, 2002 10:51 pm
Location: Somewhere, having a beer

Re: Down the Linux kernel rabbit hole we go...

Wed Jul 12, 2017 2:28 pm

Waco wrote:
Oh absolutely, I'm aware of the restrictions. You don't mess with control systems on something that can kill you if it fails. Last I heard, all commercial avionics software/hardware essentially required a *real* RTOS to avoid even the smallest of hiccups.

Yes, a hard real-time OS with appropriate "certification artifacts" documenting the software lifecycle is required. Certification artifacts need to be provided for the application code as well. The highest certification level (for critical flight systems) needs to achieve a (predicted) reliability of less than 1 failure per billion flight hours; this requires redundancy in the hardware, and extensive automatic fault detection and failover mechanisms.
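
A quick illustration of why redundancy is the only practical way to get to numbers like that. This is a toy Python calculation, not anything out of a certification standard: the per-channel failure probability is a made-up figure, and it assumes channel failures are completely independent, which real systems have to work hard to approximate.

```python
def all_channels_fail(per_channel_failure_prob: float, channels: int) -> float:
    """Probability that every redundant channel fails in the same flight hour,
    assuming the channels fail independently of one another."""
    return per_channel_failure_prob ** channels


target = 1e-9   # less than 1 failure per billion flight hours
single = 1e-4   # hypothetical per-hour failure probability of one channel

for n in (1, 2, 3):
    p = all_channels_fail(single, n)
    print(f"{n} channel(s): {p:.1e} per hour, meets target: {p < target}")
```

With these made-up numbers, one or two channels fall short and three get there. Anything that can take out all channels at once (a shared power feed, the same software bug on every channel) breaks the independence assumption, which is where the fault detection and failover mechanisms come in.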
Nostalgia isn't what it used to be.
