DOS ain't dead

New kernel compression methods (Announce)

posted by ecm Homepage E-mail, Düsseldorf, Germany, 13.04.2020, 00:05

I just finished fixing the lzd port to my inicomp kernel compression stage. It is a rather direct port of the lzd reference implementation that is shipped along with lzip's manual.

It decompresses a specific variant of LZMA, called LZMA-302eos or LZMA-lzip. This presumably makes the decompressor somewhat simpler than one for the LZMA streams used by xz and 7-Zip.

LZMA-lzip beats, in final compressed executable size, all six formats I've previously implemented. These are (in order of implementation) BriefLZ, LZ4, Snappy, Exomizer 3, X compressor, and Heatshrink. LZMA's high compression ratios do incur CPU time costs though.

For the following numbers, note that I have not yet optimised the lzd decompressor. In particular, I have not yet added support for overlapping source and destination buffers. That accounts for only part of LZMA-lzip's high memory usage though; its probability tables need about 30 KiB of buffer space. To be honest, that is less than I expected before putting it together.

The X compressor, on the other hand, does have a memory need that large: 256 KiB of context tables, plus some more as a multi-layer decompression threshold. Its author has suggested variants to lessen this, but I only want to implement existing formats, as produced by the corresponding tools.

By the way, so far I always ignore checksums where the format includes them. I used the best compression settings for each format. Because the depacker reads from and writes to memory, the dictionary size doesn't matter to us; if there is enough memory to hold the (possibly overlapping) buffers while decompressing, then the entire history of the decompression output is available to the decompressor. Provided enough contiguous low memory is available, my decompressors all handle compressed and uncompressed file lengths exceeding 64 KiB well. (I think the FreeDOS kernel compression doesn't properly allow that.)

I used a scriptlet to gather some numbers to compare the decompressors. Here are the results when building the lDebug default branch, revision a6c6df3e2820. Size is the total executable file size. Ini size is the combined size of the iniload (loader) and inicomp (compression) stages. Alloc specifies how much memory is needed to run the file in EXE mode; this is detected dynamically during the build when the use_build_decomp_test option is in effect. The last line for each method gives the time in seconds for a quick test that runs the executable 512 times from a single batch file, in dosemu2 gb2414e9c7 running FreeCom version 0.84-pre7 - GNUC - XMS_Swap [Dec 29 2019 15:36:33] and FreeDOS kernel - SVN (build 2042 OEM:0xfd) [compiled Sep 22 2017].
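The arithmetic behind the ini size and alloc figures can be sketched in isolation. This is only an illustration of what the scriptlet computes; the helper names round_up_para, ini_size and paragraphs are made up here and appear in no build script, and the first two sample values are arbitrary.

```shell
# Round a byte count up to the next multiple of 16 (one x86 paragraph).
round_up_para() {
    echo $(( ($1 + 15) / 16 * 16 ))
}

# ini size = total file size minus the paragraph-aligned payload size.
ini_size() {
    total=$1
    payload=$2
    echo $(( total - $(round_up_para "$payload") ))
}

# Convert an allocation in bytes to 16-byte paragraphs.
paragraphs() {
    echo $(( $1 / 16 ))
}

round_up_para 33    # arbitrary example value, prints 48
ini_size 1000 33    # arbitrary example values, prints 952
paragraphs 96912    # prints 6057, as in the alloc line for method=none
```

The rounding reflects that the scriptlet rounds the payload size up to a whole paragraph before subtracting it from the total file size.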

ldebug/tmp$ hg id
a6c6df3e2820 tip
ldebug/tmp$ fname="ldebug.com"; fname2="debug"; fname3="debug.big"; fnameu="ldebugu.com"; arg="/C=Q"; upcase=0
# Set up the uncompressed reference case.
method=none; ((upcase)) && method="$(echo "$method" | tr 'a-z' 'A-Z')"
mkdir -p "$method"
cp -a ../bin/"$fnameu" "$method"/"$fname"
cp -a "$fname3" "$method/$fname2.$method"
for method in none blz lz4 sz exo x hs lz; do
  ((upcase)) && method="$(echo "$method" | tr 'a-z' 'A-Z')"
  echo -e "\nmethod=$method\nsize=$(stat -c %s "$method/$fname")"
  # ini size = total size minus the payload file size rounded up to a paragraph.
  echo -e "ini size=$(( $(stat -c %s "$method/$fname") - ( ($(stat -c %s "$method/$fname2.$method") + 15) / 16 * 16 ) ))"
  echo -e "alloc=$(exememls "$method/$fname")"
  # Write a batch file that runs the executable 512 times, then time it.
  echo "@echo off" > test.bat
  for jj in $(seq 0 511); do echo "@$method\\$fname $arg"; done >> test.bat
  (export TIMEFORMAT='%3R'; time dosemu -dumb -quiet -K "$PWD" -E "test.bat" 2> /dev/null > /dev/null)
done

method=none
size=82944
ini size=4944
alloc=96912 bytes = 6057 paragraphs
2.244

method=blz
size=64512
ini size=6064
alloc=96688 bytes = 6043 paragraphs
5.011

method=lz4
size=62976
ini size=5936
alloc=96672 bytes = 6042 paragraphs
3.378

method=sz
size=70144
ini size=6144
alloc=96864 bytes = 6054 paragraphs
2.858

method=exo
size=54784
ini size=5888
alloc=96688 bytes = 6043 paragraphs
6.111

method=x
size=71680
ini size=6384
alloc=343920 bytes = 21495 paragraphs
12.471

method=hs
size=63488
ini size=5536
alloc=343904 bytes = 21494 paragraphs
7.730

method=lz
size=53760
ini size=8336
alloc=344208 bytes = 21513 paragraphs
16.503
ldebug/tmp$
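
To compare methods more directly, the raw figures can be turned into a size percentage and a rough per-run time cost. The following is a minimal sketch, not part of the original scriptlet; the function names are hypothetical. It also assumes that the difference to the method=none timing is mostly decompression time, which is only an approximation since emulator startup and file loading are included in every measurement.

```shell
# Compressed size as a percentage of the uncompressed size.
size_percent() {
    awk -v c="$1" -v u="$2" 'BEGIN { printf "%.1f\n", 100 * c / u }'
}

# Rough extra time per run in milliseconds, relative to the uncompressed baseline.
extra_ms_per_run() {
    awk -v t="$1" -v base="$2" -v runs="$3" \
        'BEGIN { printf "%.1f\n", (t - base) / runs * 1000 }'
}

# lDebug figures from above: lz against none, then lz4 against none.
size_percent 53760 82944             # prints 64.8
extra_ms_per_run 16.503 2.244 512    # prints 27.8
size_percent 62976 82944             # prints 75.9
extra_ms_per_run 3.378 2.244 512     # prints 2.2
```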


Here are the same numbers for the RxDOS kernel revision 5161b8327e36.

rxdos/tmp$ hg id
5161b8327e36 tip
rxdos/tmp$ fname="RxDOS.COM"; fname2="RxDOS"; fname3="RxDOS.BIN"; fnameu="RxDOSU.COM"; arg="version"; upcase=1
# Set up the uncompressed reference case.
method=none; ((upcase)) && method="$(echo "$method" | tr 'a-z' 'A-Z')"
mkdir -p "$method"
cp -a ../bin/"$fnameu" "$method"/"$fname"
cp -a "$fname3" "$method/$fname2.$method"
for method in none blz lz4 sz exo x hs lz; do
  ((upcase)) && method="$(echo "$method" | tr 'a-z' 'A-Z')"
  echo -e "\nmethod=$method\nsize=$(stat -c %s "$method/$fname")"
  # ini size = total size minus the payload file size rounded up to a paragraph.
  echo -e "ini size=$(( $(stat -c %s "$method/$fname") - ( ($(stat -c %s "$method/$fname2.$method") + 15) / 16 * 16 ) ))"
  echo -e "alloc=$(exememls "$method/$fname")"
  # Write a batch file that runs the executable 512 times, then time it.
  echo "@echo off" > test.bat
  for jj in $(seq 0 511); do echo "@$method\\$fname $arg"; done >> test.bat
  (export TIMEFORMAT='%3R'; time dosemu -dumb -quiet -K "$PWD" -E "test.bat" 2> /dev/null > /dev/null)
done

method=NONE
size=101376
ini size=4528
alloc=98992 bytes = 6187 paragraphs
2.230

method=BLZ
size=56320
ini size=6240
alloc=100480 bytes = 6280 paragraphs
4.685

method=LZ4
size=54784
ini size=5712
alloc=99968 bytes = 6248 paragraphs
3.223

method=SZ
size=59904
ini size=5760
alloc=100000 bytes = 6250 paragraphs
3.137

method=EXO
size=47616
ini size=5776
alloc=100016 bytes = 6251 paragraphs
5.658

method=X
size=68608
ini size=6752
alloc=363152 bytes = 22697 paragraphs
24.592

method=HS
size=57344
ini size=5504
alloc=99744 bytes = 6234 paragraphs
6.824

method=LZ
size=47104
ini size=8336
alloc=170688 bytes = 10668 paragraphs
15.271
rxdos/tmp$
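
For side-by-side comparison, the per-method blocks printed above can be folded into one row per method. This is a small sketch that assumes the output has been saved as text in exactly the layout shown; the summarise function is a hypothetical helper, not something from the lDebug or RxDOS repositories.

```shell
# Turn "method= / size= / ini size= / alloc= / <seconds>" blocks
# into one tab-separated row per method.
summarise() {
    awk '
        /^method=/   { method = substr($0, 8) }
        /^size=/     { size = substr($0, 6) }
        /^ini size=/ { ini = substr($0, 10) }
        /^alloc=/    { split(substr($0, 7), a, " "); alloc = a[1] }
        /^[0-9]+\.[0-9]+$/ {
            # The bare timing line closes a block, so emit the row here.
            printf "%s\t%s\t%s\t%s\t%s\n", method, size, ini, alloc, $0
        }
    '
}

# Two of the RxDOS blocks from above as sample input.
summarise <<'EOF'
method=NONE
size=101376
ini size=4528
alloc=98992 bytes = 6187 paragraphs
2.230

method=LZ
size=47104
ini size=8336
alloc=170688 bytes = 10668 paragraphs
15.271
EOF
```

The columns are method, total size, ini size, alloc in bytes, and the time in seconds; prompt lines and blank lines are ignored.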
