DOS ain't dead

p7zip 9.20.1b "testing" (Users)

posted by Rugxulo Homepage, Usono, 26.08.2012, 04:59

> The completely surprising result is that of all those misc versions, the
> osi486fp.exe works best and fastest on my Atom netbook.

That's not too surprising: 486-style optimizations often come up in talk of Atom. Though I probably should've compiled one build with -fomit-frame-pointer and one without (as it bloats up the code and I have no idea whether it helps or hurts here).

> it is quite a bit
> faster than any other version of 7zip posted on this forum so far, by
> around 3% compression and 10-15% in decompression. Also, that version DOES
> work fine with .exe filters and -mx8 and -mx9.

-O2 and -O3 show the problem, but -Os doesn't. That's why I rebuilt 9.20.1b with -Os for the BCJ*.cpp files (only) while keeping -O2 for everything else. (I even changed the makefile default to -Os to be ultra safe so no one else would be confronted by the bug, though I hate doing that. I should've added a comment explaining why!)

> For my i7, this optimized
> for size i486 version actually compresses 0.5 to 1% faster than your other
> versions, and decompresses faster, around 5%. Of course decompression speed
> of your optimized 7zdec.exe 9.12 beats all these versions by around 50% on
> my netbook (yes, it decompresses the same archive in 14 sec that
> osi486fp.exe decompresses in 28 seconds, and o3i686.exe take 33 seconds to
> extract. This is just one archive, others have tested the same.

9.12 was the one I compiled with a bunch of compilers. 9.22 I only built via DJGPP with FOSS-friendly everything (i.e. no D3X). It's on iBiblio explicitly meant for FreeDOS purists. I can't remember what changed, if anything. (PPMD??)

> Deflate is also improved most by osi486fp.exe in -mx9 mode around 3-5% and
> in other modes around 7-10%! Wonder why?? This on both my systems. I'll
> also test on a Core2Duo E8400 chip...
>
> As expected, os-i386.exe is slowest by around 15% on both my systems.

Very strange. The 486 was very sensitive to alignment, way more so than the Pentium, hence "-march=i386" and "-march=i486" only differ (last I checked) in that one detail! I didn't know newer CPUs had gone back to being so sensitive. Seems like a bad decision, oh well.

It's almost ridiculous how much difference a bunch of optimizations can make in binary size!

> > FYI, these "misc" binaries mentioned in this post did not workaround such a
> > thing, so they probably still have that bug.
>
> But in my testing osi486.exe works perfectly on all settings, and
> o3-i686.exe fails on mx8 and -mx9 and exe filters. Very strange.

Yes, only -O2 and -O3 show this problem. I wasn't sure if these new attempts would even show the bug, much less present it correctly in benchmarking, so I didn't fiddle with anything.

> There seems to be quite a spread over the different compiles of 7zip! More
> than just 1 or 2% that you might suspect.

Ideally, there wouldn't be a 50% binary size increase just for a 5% speedup. But speed is probably more important than size (though I still say you can have both, but apparently GCC doesn't care).

> Also, there was a vastly increased speed of your AdvanceZIP recompile. For
> -z -4 method, a typical decrease in time was from 60.4 sec using the 2005
> compile to 41.7 sec using your compile. Bravo! What a great difference.

I'm honestly (somewhat) surprised that newer is better. It isn't always the case, but luckily it is here. I guess their testing has improved (or maybe the new backend is just that much better designed).

But yes, I had tried this years ago, so I knew it was actually faster in this case. (I remember testing on my old P166, heh.)

> This speed optimization holds for both my i7 and Atom systems. Thanks for
> making these compiles available.

Yes, I tested it later (on the huge gcc463s.zip file), and it was 20% faster for me with "advzip -z4" (Core i5), something like 4 mins. vs 5 mins.
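For anyone comparing the timings quoted in this thread, here is a minimal Python sketch of the arithmetic. The helper names (time_saved_pct, speedup) are made up for illustration, and the 5 min and 4 min figures are the rough values mentioned above, not exact measurements.

```python
def time_saved_pct(old_s, new_s):
    """Percent reduction in elapsed time going from old_s to new_s."""
    return (old_s - new_s) / old_s * 100.0


def speedup(old_s, new_s):
    """How many times faster the new run is (old time / new time)."""
    return old_s / new_s


# Timings quoted in this thread, in seconds (the minute figures are approximate).
cases = {
    "advzip -z -4, 2005 compile vs. recompile": (60.4, 41.7),
    "advzip -z4 on gcc463s.zip, ~5 min vs. ~4 min": (300.0, 240.0),
    "osi486fp.exe vs. 7zdec 9.12 on the Atom": (28.0, 14.0),
}

for label, (old_s, new_s) in cases.items():
    saved = time_saved_pct(old_s, new_s)
    factor = speedup(old_s, new_s)
    print(f"{label}: {saved:.1f}% less time, {factor:.2f}x speed")
```

Note that the same pair of timings can be described two ways: 5 min down to 4 min is 20% less time but a 1.25x speedup, so it helps to say which one is meant.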

> There was only around 2% difference in the
> generic and i686 and Atom versions, however. Times for that same archive
> ranged from 41.7 sec (fastest = generic) to 42.5 sec (slowest = i686).
> Wonder why with 7zip there is more variance.

Dunno, I'd be surprised if "generic" included Atom-friendly stuff, but you never know. But I did notice they were almost the same size, which for GCC usually means not much difference.

> By the way, all my tests were done on a RAMdrive to minimize variation in
> HD speed affecting the results.

Yes, that's fine, me too. :-)
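If you want to repeat this kind of comparison with less noise, a small timing harness can help. The following is only a sketch, assuming a host system where Python is available (so not plain DOS); the function names and the placeholder command are illustrative, and you would substitute the real archiver command lines yourself.

```python
import statistics
import subprocess
import sys
import time


def time_command(argv, runs=3):
    """Run argv `runs` times; return the wall-clock time of each run in seconds."""
    times = []
    for _ in range(runs):
        start = time.perf_counter()
        # check=True raises if the command fails, so a broken build
        # is not silently counted as a fast run.
        subprocess.run(argv, check=True,
                       stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL)
        times.append(time.perf_counter() - start)
    return times


def summarize(times):
    """Return (best, median) of the measured times."""
    return min(times), statistics.median(times)


if __name__ == "__main__":
    # Placeholder command: replace with the archiver invocation under test.
    demo = [sys.executable, "-c", "pass"]
    best, median = summarize(time_command(demo))
    print(f"best {best:.3f}s, median {median:.3f}s")
```

Reporting the best or median of several runs, with the files on a RAM drive as described above, should make small differences of 1 to 2% easier to trust.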

 
