The first comparison is that of a Linux kernel tarball (2.6.31). In all cases
the default options were used. Three other common compression apps were used
for comparison: 7z, which is an excellent all-round lzma based compression
app; gzip, which is the fast standard benchmark with good compression; and
bzip2, which is the most commonly used compression on Linux.

In the following tables, lrzip means lrzip with default options, lrzip -l
means lrzip using the lzo backend, lrzip -g means using the gzip backend,
lrzip -b means using the bzip2 backend, and lrzip -z means using the zpaq
backend.
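For reference, the variants above correspond to invocations along these lines
(a sketch only; the tarball name is simply the file used in this comparison):

    lrzip linux-2.6.31.tar        # default options (lzma backend)
    lrzip -l linux-2.6.31.tar     # lzo backend
    lrzip -g linux-2.6.31.tar     # gzip backend
    lrzip -b linux-2.6.31.tar     # bzip2 backend
    lrzip -z linux-2.6.31.tar     # zpaq backend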
linux-2.6.31.tar
These are benchmarks performed on a 3GHz quad-core Intel Core2 with 8GB of RAM
using lrzip v0.540.

Compression   Size        Percentage   Compress   Decompress
None          365711360   100
7z            53315279    14.6         1m58s      0m5.6s
lrzip         52724172    14.4         1m33s      0m13.5s
lrzip -z      43223954    11.8         3m32s      3m40s
lrzip -l      110893724   30.3         0m21s      0m12.1s
lrzip -g      72746424    19.9         0m25s      0m12.3s
lrzip -b      60774043    16.6         0m29s      0m15.2s
bzip2         62416571    17.1         0m44s      0m10.5s
gzip          80563601    22.0         0m14s      0m3.0s

It is interesting to note in these results that the compression of lrzip by
default is only slightly better than lzma, but it is significantly faster
thanks to its heavily multithreaded nature. Decompression is slower because of
the two stages. Zpaq offers by far the best compression, but at the cost of
extra time. However, with the heavily threaded nature of lrzip, it does not
take a lot longer given how much better its compression is.
Let's take six kernel trees one version apart as a tarball, linux-2.6.31 to
linux-2.6.36. These will contain lots of redundant information, but spread
hundreds of megabytes apart, which lrzip will be very good at compressing. For
simplicity, only 7z will be compared, since that's by far the best general
purpose compressor at the moment.

These are benchmarks performed on a 2.53GHz dual-core Intel Core2 with 4GB of
RAM using lrzip v0.5.1. Note that it was running with a 32-bit userspace, so
only 2GB of addressing was possible. However, the benchmark was also run with
the -U option, allowing the whole file to be treated as one large compression
window.
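As a sketch, the -U runs in the table below correspond to invocations along
these lines (the combined tarball name here is only an example):

    lrzip -U kernels-2.6.31-to-2.6.36.tar       # unlimited compression window
    lrzip -d kernels-2.6.31-to-2.6.36.tar.lrz   # decompress the resulting archive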
Tarball of 6 consecutive kernel trees.

Compression   Size         Percentage   Compress   Decompress
None          2373713920   100
7z            344088002    14.5         17m26s     1m22s
lrzip         104874109    4.4          11m37s     56s
lrzip -l      223130711    9.4          05m21s     1m01s
lrzip -U      73356070     3.1          08m53s     43s
lrzip -Ul     158851141    6.7          04m31s     35s
lrzip -Uz     62614573     2.6          24m42s     25m30s

Things start getting very interesting now, when lrzip really starts to shine.
Note how the archive is not that much larger for 6 kernel trees than it was
for one. That's because all the similar data across the kernel trees is
compressed as one copy and only the differences really make up the extra size.
All compression software does this, but not over such large distances. If you
copy the same data over multiple times, the resulting lrzip archive doesn't
get much larger at all. You might find this example interesting because the
-U option is actually faster as well as providing better compression. The
reason is that the window is not much larger than the amount of addressable
RAM (2GB), and it compresses so much more in the rzip stage that it makes up
the time by not needing to compress anywhere near as much data with the
backend compressor.
Using the first example (linux-2.6.31.tar) and simply copying the data over
multiple times gives these results with lrzip using the lzo backend:

Copies   Size         Compressed   Compress   Decompress
1        365711360    112151676    0m14.9s    0m5.1s
2        731422720    112151829    0m16.2s    0m6.5s
3        1097134080   112151832    0m17.5s    0m8.1s
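If you want to try this kind of test yourself, a rough sketch of how it could
be reproduced (assuming the kernel tarball is in the current directory; sizes
and times will of course differ between machines):

    cat linux-2.6.31.tar linux-2.6.31.tar > copies.tar   # a file holding 2 copies
    lrzip -l copies.tar                                  # compress with the lzo backend
    ls -l copies.tar.lrz                                 # compare with the single-copy archive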
I had the amusing thought that this compression software could be used as a
bullshit detector if you were to compress people's speeches because if their
talks were full of catchphrases and not much actual content, it would all be
compressed down. So the larger the final archive, the less bullshit =)
Now let's move on to the other special feature of lrzip: the ability to
compress massive amounts of data on machines with huge amounts of RAM by using
massive compression windows. This is a 10GB virtual image of an installed
operating system with some basic working software on it. The default options
on the 8GB machine meant that it was using a 5GB window.

10GB Virtual image:
These benchmarks were done on the quad-core machine with lrzip version 0.561.

Compression   Size           Percentage   Compress Time   Decompress Time
None          10737418240    100.0
gzip          2772899756     25.8         05m47s          2m46s
bzip2         2704781700     25.2         16m15s          6m19s
xz            2272322208     21.2         50m58s          3m52s
7z            2242897134     20.9         26m36s          5m41s
lrzip         1372218189     12.8         10m23s          2m53s
lrzip -U      1095735108     10.2         08m44s          2m45s
lrzip -l      1831894161     17.1         04m53s          2m37s
lrzip -lU     1414959433     13.2         04m48s          2m38s
lrzip -zU     1067075961     9.9          69m36s          69m35s
At this end of the spectrum things really start to heat up. The compression
advantage is massive, with the lzo backend even giving much better results
than 7z, and in a ridiculously short time. The improvements in version 0.530
in scalability with multiple CPUs have a huge impact on compression time here,
with zpaq being almost as fast on the quad-core machine as xz is, yet
producing a file less than half the size.

What appears to be a big disappointment here is actually zpaq, which takes
more than 6 times longer than lzma for a measly 0.3% improvement. The reason
is that most of the advantage here is achieved by the rzip first stage, since
there's a lot of redundant space over huge distances on a virtual image. The
-U option, which works the memory subsystem rather hard and has a noticeable
impact on the rest of the machine, does further wonders for the compression
(virtually always) and, in this particular case, even the times.
This should help govern what compression you choose. Small files are nicely
compressed with zpaq. Intermediate files are nicely compressed with lzma.
Large files get excellent results even with lzo, provided you have enough RAM.
(Small being < 100MB, intermediate < 1GB, large > 1GB.)
Or, to make things easier, just use the default settings all the time and be
happy, as lzma gives good results. :D
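As a rough sketch, that guidance translates into command lines like the
following (the filenames are only examples):

    lrzip -z small.tar    # small file (< 100MB): zpaq backend
    lrzip medium.tar      # intermediate file (< 1GB): default lzma backend
    lrzip -l large.img    # large file (> 1GB): even lzo does well given enough RAM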
Con Kolivas
Tue, 22 Feb 2011