leor
Gold subscriber
Maximum Gerbil
Topic Author
Posts: 4861
Joined: Wed Dec 11, 2002 6:34 pm
Location: NYC
Contact:

Is Microsoft bad at math?

Fri Apr 12, 2019 5:16 pm

My NAS is getting full so I got a bigger one, and am in the process of transferring LOADS of files. I've been doing chunks of 200-900 GB, and observing the file transfer rates based on # of files and whatnot (lots of small files take waaaay longer than a few big ones), and I have a few observations.

The time remaining indicator is as dumb as it could possibly be. The system takes lots of time to determine the # of files and how big they are, but it's completely unresponsive to what's actually happening. If you happen to be transferring one large file and the rate jumps to 100 MB/second, the time shrinks; then if it hits a rough patch (lots of small files) and goes down to, say, 5 MB/second, the time remaining jumps to 20x what it was. My guess is that, given the system did an assessment of what was going to be transferred and the copious amount of experience they MUST have with this sort of thing, a more intelligent estimate could be arrived at when you start, but let's say that's not true. An easy number they COULD arrive at after the transfer rates spike up and down based on the files is an average of what to expect based on past performance (I know this is easy because I've built it into a software product recently), but no, they just re-extrapolate based on what's right in front of them, not unlike a goldfish and its 6-second working memory.
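To make the contrast concrete, here is a minimal sketch of the two approaches: re-extrapolating from the latest rate versus using the average rate of the whole run so far. The function name `eta_estimates` and the sample numbers are made up for illustration; this is not how Explorer actually computes its estimate, which isn't documented in this thread.

```python
def eta_estimates(total_bytes, samples):
    """Return (instant_eta, average_eta) in seconds after each sample.

    samples is a list of (interval_seconds, bytes_copied_in_interval).
    """
    done = 0
    elapsed = 0.0
    estimates = []
    for seconds, copied in samples:
        done += copied
        elapsed += seconds
        remaining = total_bytes - done
        instant_rate = copied / seconds   # only the latest interval
        average_rate = done / elapsed     # everything copied so far
        instant_eta = remaining / instant_rate if instant_rate > 0 else float("inf")
        average_eta = remaining / average_rate if average_rate > 0 else float("inf")
        estimates.append((instant_eta, average_eta))
    return estimates


MB = 1_000_000
# Made-up run: 10 s at 100 MB/s (one big file), then 10 s at 5 MB/s (small files).
samples = [(10, 1000 * MB), (10, 50 * MB)]
estimates = eta_estimates(10_000 * MB, samples)

for instant_eta, average_eta in estimates:
    print(f"instant: {instant_eta:7.0f} s   average: {average_eta:7.0f} s")
```

With these numbers the instant estimate jumps from 90 seconds to 1790 seconds (the roughly 20x swing), while the averaged one only goes from 90 seconds to about 170.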

Since it's the weekend, I decided to throw 10TB at it, and Windows just threw its hands in the air and said "More than 1 day." What's that? 36 hours? A week? Will it be done before Game of Thrones airs?

Math y'all... solving these problems since 6BC.
 
Redocbew
Gold subscriber
Gerbil Jedi
Posts: 1881
Joined: Sat Mar 15, 2014 11:44 am

Re: Is Microsoft bad at math?

Fri Apr 12, 2019 5:25 pm

The same kind of thing used to happen with printers. When someone complains about printer problems I'll often jokingly say "hey, at least it didn't freak out and print 80 pages of worthless garbage", but the joke is that it really did use to happen from time to time. The difference is we fixed the printers, but whoever has been in charge of those transfer rate timers is still just completely hopeless.
Do not meddle in the affairs of archers, for they are subtle and you won't hear them coming.
 
JustAnEngineer
Gold subscriber
Gerbil God
Posts: 18543
Joined: Sat Jan 26, 2002 7:00 pm
Location: The Heart of Dixie

Re: Is Microsoft bad at math?

Fri Apr 12, 2019 5:26 pm

Goldfish may have longer memories than product managers do:
https://economictimes.indiatimes.com/ma ... 604288.cms
i7-9700K, NH-D15, Z390M Pro4, 32 GiB, RX Vega64, Define Mini-C, SSR-850PX, C32HG70+U2410, RK-9000BR, MX518
 
Usacomp2k3
Gerbil God
Posts: 22701
Joined: Thu Apr 01, 2004 4:53 pm
Location: Orlando, FL
Contact:

Re: Is Microsoft bad at math?

Fri Apr 12, 2019 5:29 pm

They did just release the source code for Calculator...
 
just brew it!
Gold subscriber
Administrator
Posts: 52532
Joined: Tue Aug 20, 2002 10:51 pm
Location: Somewhere, having a beer

Re: Is Microsoft bad at math?

Fri Apr 12, 2019 6:28 pm

If it's any consolation, the estimated file transfer time in the KDE Linux desktop's file manager is at least as bad, if not worse. I've seen it bounce all over the place even when transferring relatively uniformly sized files.
Nostalgia isn't what it used to be.
 
leor
Gold subscriber
Maximum Gerbil
Topic Author
Posts: 4861
Joined: Wed Dec 11, 2002 6:34 pm
Location: NYC
Contact:

Re: Is Microsoft bad at math?

Fri Apr 12, 2019 7:16 pm

JustAnEngineer wrote:
Goldfish may have longer memories than product managers do:
https://economictimes.indiatimes.com/ma ... 604288.cms

Touché :-)
 
LostCat
Minister of Gerbil Affairs
Posts: 2039
Joined: Thu Aug 26, 2004 6:18 am
Location: Alphanumeric symbols.

Re: Is Microsoft bad at math?

Fri Apr 12, 2019 9:28 pm

just brew it! wrote:
If it's any consolation, the estimated file transfer time in the KDE Linux desktop's file manager is at least as bad, if not worse. I've seen it bounce all over the place even when transferring relatively uniformly sized files.

I've seen it explained that it's due to the hardware utilization constantly shifting so it's a much harder problem than you'd think.

I don't remember where I saw the best explanation of it though, and searching for it isn't pulling much up from beyond the XP era so eh.
Meow.
 
derFunkenstein
Gold subscriber
Gerbil God
Posts: 24961
Joined: Fri Feb 21, 2003 9:13 pm
Location: Comin' to you directly from the Mothership

Re: Is Microsoft bad at math?

Fri Apr 12, 2019 9:39 pm

Especially when you're trying to transfer small files, it seems Windows has a much harder time gauging how quickly the transfer will go. Guessing that JBI's Linux box is the same way.
I do not understand what I do. For what I want to do I do not do, but what I hate I do.
Twittering away the day at @TVsBen
 
Redocbew
Gold subscriber
Gerbil Jedi
Posts: 1881
Joined: Sat Mar 15, 2014 11:44 am

Re: Is Microsoft bad at math?

Fri Apr 12, 2019 9:43 pm

Yeah, I see short transfers being mis-measured all the time with both local disks and remote connections. I always figured it was being thrown off because the overhead of beginning the transfer was a significant portion of the total work to be done, and the timer hadn't been told how to separate that from the time spent in the actual transfer. I could be wrong though since I've never tested it.
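That hunch can be written down as a toy model: total time is a fixed cost per file plus bytes divided by throughput. This is only a sketch of the untested hypothesis above; the function name and both constants (10 ms per file, 100 MB/s) are assumptions picked for illustration, not measurements.

```python
def transfer_time(n_files, total_bytes, per_file_overhead_s, throughput_bps):
    """Estimated seconds: fixed cost per file plus time to stream the bytes."""
    return n_files * per_file_overhead_s + total_bytes / throughput_bps


GB = 1_000_000_000
THROUGHPUT = 100_000_000   # assumed: 100 MB/s sustained
OVERHEAD = 0.01            # assumed: 10 ms to open/create/close each file

# Same 10 GB payload, packaged two different ways.
one_big = transfer_time(1, 10 * GB, OVERHEAD, THROUGHPUT)
many_small = transfer_time(1_000_000, 10 * GB, OVERHEAD, THROUGHPUT)

print(f"one 10 GB file:      {one_big:8.1f} s")
print(f"a million 10 KB files: {many_small:8.1f} s")
```

Under these assumptions the per-file cost dominates the million-file case by about 100 to 1, so a timer that only looks at bytes per second would be badly off.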

I don't think random usage explains the problem, because random usage on a desktop PC is usually just a few bytes here and there. That's not enough to significantly impact performance, or any of the stats involved when transferring multiple gigabytes in one shot. Or at least, it shouldn't be, but clearly it is a much more difficult problem than they had thought. :lol:
Do not meddle in the affairs of archers, for they are subtle and you won't hear them coming.
 
just brew it!
Gold subscriber
Administrator
Posts: 52532
Joined: Tue Aug 20, 2002 10:51 pm
Location: Somewhere, having a beer

Re: Is Microsoft bad at math?

Fri Apr 12, 2019 10:41 pm

derFunkenstein wrote:
Especially when you're trying to transfer small files, it seems Windows has a much harder time gauging how quickly the transfer will go. Guessing that JBI's Linux box is the same way.

I'm pretty sure that's part of it. It also seems that Linux's disk caching algorithms can sometimes result in pretty bursty/inconsistent I/O performance as seen by the application layer; this is likely confusing the estimated time calculations.
Nostalgia isn't what it used to be.
 
Wirko
Gold subscriber
Gerbil First Class
Posts: 180
Joined: Fri Jun 15, 2007 4:38 am
Location: Central Europe

Re: Is Microsoft bad at math?

Sat Apr 13, 2019 1:18 am

There's an xkcd for that.
https://xkcd.com/612/
 
leor
Gold subscriber
Maximum Gerbil
Topic Author
Posts: 4861
Joined: Wed Dec 11, 2002 6:34 pm
Location: NYC
Contact:

Re: Is Microsoft bad at math?

Sat Apr 13, 2019 9:34 am

lulz

So let's see, I posted this at 5:16 last night, and it's showing 43% complete, so that's about 17 hours. It's still showing "More than a day," but it seems like it will be closer to 20 hours.
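A quick sanity check of that arithmetic, assuming the two post timestamps (5:16 pm and 9:34 am) are the start and "now", that 43% refers to bytes, and that throughput stays constant for the rest of the run:

```python
from datetime import datetime

start = datetime(2019, 4, 12, 17, 16)   # timestamp of the first post
now = datetime(2019, 4, 13, 9, 34)      # timestamp of this post
fraction_done = 0.43                    # progress shown by Windows

elapsed_h = (now - start).total_seconds() / 3600
total_h = elapsed_h / fraction_done     # linear extrapolation
remaining_h = total_h - elapsed_h

print(f"elapsed:   {elapsed_h:.1f} h")
print(f"total:     {total_h:.1f} h")
print(f"remaining: {remaining_h:.1f} h")
```

That works out to roughly 16.3 hours elapsed, about 38 hours total, and a bit under 22 hours to go, so "More than 1 day" is technically right for the whole job but not for what's left.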
 
Waco
Gold subscriber
Minister of Gerbil Affairs
Posts: 2826
Joined: Tue Jan 20, 2009 4:14 pm
Location: Los Alamos, NM

Re: Is Microsoft bad at math?

Mon Apr 15, 2019 10:08 am

As others have said, it's a much harder problem than you'd initially think. Windows 10 seems to be better than its predecessors in that it tries to estimate the time remaining from the overall average speed of the run so far. I've seen it be mostly reliable for getting a rough guess of "a few hours" versus "a few minutes". The exact time it says is rarely accurate.
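For comparison, a common middle ground between "latest rate only" and "average of the whole run" is an exponentially weighted moving average of the rate. To be clear, this is a different technique than the one described above, and nothing here says Windows uses it; the function name, the smoothing factor, and the sample rates are all assumptions for the sketch.

```python
def ema_rates(rates, alpha=0.2):
    """Smooth a series of rate samples with an exponential moving average.

    alpha near 1 tracks the latest sample; alpha near 0 changes slowly.
    """
    smoothed = rates[0]
    history = [smoothed]
    for rate in rates[1:]:
        smoothed = alpha * rate + (1 - alpha) * smoothed
        history.append(smoothed)
    return history


# Made-up samples in MB/s: big file, burst of small files, big file again.
rates = [100.0, 100.0, 5.0, 5.0, 100.0]
smoothed = ema_rates(rates)

for raw, smooth in zip(rates, smoothed):
    print(f"raw: {raw:6.1f} MB/s   smoothed: {smooth:6.2f} MB/s")
```

The smoothed rate dips when the small files hit but never collapses to 5 MB/s, so an ETA built on it would move without jumping 20x.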
Desktop: Z170A | 6700K @ 4.4 | 32 GB | Alphacool Eisblock Radeon VII | Heatkiller R3 | Samsung 4K 40" | 1 TB NVME + 2 TB SATA + LSI (128x8) RAID 0
NAS: 1950X | Designare EX | 32 GB ECC | 7x8 TB RAIDZ2 | 8x2 TB RAID10 | FreeNAS | ZFS | LSI SAS
 
Bauxite
Gerbil Elite
Posts: 781
Joined: Sat Jan 28, 2006 12:10 pm
Location: electrolytic redox smelting plant

Re: Is Microsoft bad at math?

Mon Apr 15, 2019 10:18 am

Ironically, most forensics tools I use are very accurate at how long it will be to finish imaging an entire drive.

When you are copying every raw block sequentially there is not much to get in the way other than bad block re-read attempts.
2018: at 120 Zen cores and counting, so pretty much done with intel on the desktop.
E5 2696v4 22c44t 2.2~3.7Ghz - The last great gleam of the pre-nerf HEDT era.
E5 1680v2 8c16t 4.5Ghz - "Yes Virginia, there were unlocked xeons" /weep for them.
 
Krogoth
Gold subscriber
Gerbil Elder
Posts: 5673
Joined: Tue Apr 15, 2003 3:20 pm
Location: somewhere on Core Prime
Contact:

Re: Is Microsoft bad at math?

Mon Apr 15, 2019 10:41 am

Due to the nature of how data is written onto HDDs and how each file system works, getting an accurate ETC on massive file/directory transfers is like trying to divine a 5-day weather forecast.
Gigabyte Z390 AORUS-PRO Coffee Lake R 9700K, 2x8GiB of G.Skill DDR4-3600, Sapphire RX Vega 64, Corsair CX-750M V2 and Fractal Define R4 (W)
Ivy Bridge 3570K, 2x4GiB of G.Skill RIPSAW DDR3-1600, Gigabyte Z77X-UD3H, Corsair CX-750M V2, and PC-7B
 
cphite
Graphmaster Gerbil
Posts: 1010
Joined: Thu Apr 29, 2010 9:28 am

Re: Is Microsoft bad at math?

Mon Apr 15, 2019 12:16 pm

Not really a Microsoft problem... it's just as bad in Linux and iOS - even worse in some cases.

It's one of those things that sounds like it ought to be easy but turns out to be really difficult to actually do.

It tends to be better for very large files - especially if there's only one - or in other cases where it has to do one, singular operation. For example, SQL backup and restore estimates tend to be right on the nose; as do drive cloning times.

As soon as you start throwing multiple files or multiple operations into the mix, it just kinda guesses.
 
UberGerbil
Grand Admiral Gerbil
Posts: 10363
Joined: Thu Jun 19, 2003 3:11 pm

Re: Is Microsoft bad at math?

Mon Apr 15, 2019 1:02 pm

If you're actually doing a lot of large copies, you really shouldn't be using Explorer anyway (if you care about performance). From this TechNet blog post:

Most file copy tools like the EXPLORER.EXE GUI, the COPY command in the shell, the XCOPY.EXE tool and the PowerShell Copy-Item cmdlet are not optimized for performance. They are single-threaded, one-file-at-a-time solutions that will do the job but are not designed to transfer files as fast as possible.

The best file copy tool included in Windows is actually the ROBOCOPY.EXE tool. It includes very useful options like /MT (for using multiple threads to copy multiple files at once) and /J (copy using unbuffered I/O, which is recommended for large files).

That tool got some love from the Performance Fundamentals team at Microsoft and it’s usually much faster than anything else in Windows.

It’s important to note that even ROBOCOPY with the /MT option won’t help if you’re copying a single file. Like most other file copy programs, it uses a common file copy API instead of custom code.
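As a rough illustration of those options, here is a small Python wrapper that builds a robocopy command line with /MT and /J. The helper names and the source and destination paths are hypothetical; /E (copy subdirectories, including empty ones) is an extra flag not mentioned in the quote, added on the assumption that you're copying a whole directory tree. Building the argument list works anywhere, but actually running it needs Windows.

```python
import subprocess


def robocopy_command(source, destination, threads=8, unbuffered=False):
    """Build a robocopy argument list; nothing is executed here."""
    if not 1 <= threads <= 128:
        raise ValueError("robocopy /MT accepts 1 to 128 threads")
    cmd = ["robocopy", source, destination, "/E", f"/MT:{threads}"]
    if unbuffered:
        cmd.append("/J")  # unbuffered I/O, suggested for large files
    return cmd


def run_robocopy(cmd):
    """Run on Windows only. Robocopy exit codes below 8 mean no copy failures."""
    return subprocess.run(cmd).returncode < 8


# Hypothetical source and destination paths.
cmd = robocopy_command(r"D:\media", r"\\newnas\media", threads=16, unbuffered=True)
print(" ".join(cmd))
```

Whether more threads actually help depends on the destination; for a single huge file, as the quote says, /MT makes no difference.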
 
Waco
Gold subscriber
Minister of Gerbil Affairs
Posts: 2826
Joined: Tue Jan 20, 2009 4:14 pm
Location: Los Alamos, NM

Re: Is Microsoft bad at math?

Mon Apr 15, 2019 1:47 pm

On the other hand, writing with multiple threads to a destination is a great way to end up with a lot of fragmented files. I do serial copies to avoid that.
Desktop: Z170A | 6700K @ 4.4 | 32 GB | Alphacool Eisblock Radeon VII | Heatkiller R3 | Samsung 4K 40" | 1 TB NVME + 2 TB SATA + LSI (128x8) RAID 0
NAS: 1950X | Designare EX | 32 GB ECC | 7x8 TB RAIDZ2 | 8x2 TB RAID10 | FreeNAS | ZFS | LSI SAS
 
UberGerbil
Grand Admiral Gerbil
Posts: 10363
Joined: Thu Jun 19, 2003 3:11 pm

Re: Is Microsoft bad at math?

Mon Apr 15, 2019 1:49 pm

Krogoth wrote:
Due to the nature of how data is written onto HDDs and how each file system works, getting an accurate ETC on massive file/directory transfers is like trying to divine a 5-day weather forecast.

And yet 5-day forecasts are exponentially better than they used to be. "A modern 5-day forecast is as accurate as a 1-day forecast was in 1980, and useful forecasts now reach 9 to 10 days into the future."

When I took Atmospheric Physics classes in the early 80s, the department had a forecasting competition in which people could use any technique they wanted. The guy who won most consistently always forecast tomorrow to have exactly the same weather as today. He was wrong a lot, but he was still right more often than the people who were using rudimentary numerical models. Things have improved markedly since then. Beyond three days out was a complete crapshoot in 1980; today the forecast is right far more often than it's wrong -- and nobody was even talking about meaningful 10 day forecasts back then.

There are still failures, of course, and that's what people remember -- not the hundreds of times the forecasts got it right. And there are still difficult places and conditions: the coastal Pacific Northwest (for example) has a complex local geography that creates many micro-climates, and in the winter it's right on the edge of the rain/snow line, so snowfall predictions can be wildly off (it doesn't help that the weather comes out of the northeast Pacific, where non-satellite data-gathering is sparse). And doing small-scale prediction for things like tornado tracks is incredibly difficult as well, and yet they've made massive strides in that also. Hurricane prediction is good enough now that it has saved countless lives and billions of dollars (and that's enough to pay for the entire weather prediction apparatus).

Which is to say: if Microsoft wanted to throw a lot of money and modelling (and these days deep learning) at the "how long is a copy going to take" problem, they almost certainly could solve it for most use cases. But it's hardly worth it. And they would still get it wrong occasionally, which is the only thing people would notice or remember.
 
UberGerbil
Grand Admiral Gerbil
Posts: 10363
Joined: Thu Jun 19, 2003 3:11 pm

Re: Is Microsoft bad at math?

Mon Apr 15, 2019 1:53 pm

Waco wrote:
On the other hand, writing with multiple threads to a destination is a great way to end up with a lot of fragmented files. I do serial copies to avoid that.

Which don't matter on SSDs. But if you're writing to hard drives, sure -- in that case you're likely limited by the HD throughput rather than threading anyway. (Though if this is a backup, you don't really care about fragmentation; and if your primary concern is copy speed, then go full tilt with many threads, and leave the defragmentation to happen in the background later).
 
just brew it!
Gold subscriber
Administrator
Posts: 52532
Joined: Tue Aug 20, 2002 10:51 pm
Location: Somewhere, having a beer

Re: Is Microsoft bad at math?

Mon Apr 15, 2019 2:02 pm

UberGerbil wrote:
Waco wrote:
On the other hand, writing with multiple threads to a destination is a great way to end up with a lot of fragmented files. I do serial copies to avoid that.

Which don't matter on SSDs. But if you're writing to hard drives, sure -- in that case you're likely limited by the HD throughput rather than threading anyway. (Though if this is a backup, you don't really care about fragmentation; and if your primary concern is copy speed, then go full tilt with many threads, and leave the defragmentation to happen in the background later).

If you're writing to hard drives, using multiple writers may actually slow it down because the heads will be seeking more.
Nostalgia isn't what it used to be.
 
Waco
Gold subscriber
Minister of Gerbil Affairs
Posts: 2826
Joined: Tue Jan 20, 2009 4:14 pm
Location: Los Alamos, NM

Re: Is Microsoft bad at math?

Mon Apr 15, 2019 2:59 pm

UberGerbil wrote:
Which don't matter on SSDs. But if you're writing to hard drives, sure -- in that case you're likely limited by the HD throughput rather than threading anyway. (Though if this is a backup, you don't really care about fragmentation; and if your primary concern is copy speed, then go full tilt with many threads, and leave the defragmentation to happen in the background later).

The filesystem you're using may care about fragmentation quite a bit regardless of underlying media.

Further, HDDs don't necessarily handle writers from multiple streams very well - so it completely depends on the filesystem you're using whether more threads are faster. Restore performance will be hurt if you're seeking around all over the place on HDDs too.
Desktop: Z170A | 6700K @ 4.4 | 32 GB | Alphacool Eisblock Radeon VII | Heatkiller R3 | Samsung 4K 40" | 1 TB NVME + 2 TB SATA + LSI (128x8) RAID 0
NAS: 1950X | Designare EX | 32 GB ECC | 7x8 TB RAIDZ2 | 8x2 TB RAID10 | FreeNAS | ZFS | LSI SAS
