Move the ram allocation phase into rzip_fd to be able to get a more accurate measure of percentage done.

Prevent failure when offset is not a multiple of page size. Add chunk percentage complete to output. Tweak output at various verbosities. Update documentation to reflect improved performance of unlimited mode. Update benchmark results. More tidying.
2026-04-04 13:57:40 +00:00 · 2010-11-05 23:02:58 +11:00 · 017ec9e85a
commit 017ec9e85a
parent a66dafe66a
5 changed files with 126 additions and 90 deletions
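
The commit message mentions preventing failure when an offset is not a multiple of the page size. The usual way to handle that with mmap is to round the offset down to an aligned boundary and skip the leftover bytes. The following is a minimal sketch in Python, not lrzip's actual C code; `map_window` and `read_at` are hypothetical names introduced only for this illustration, and the caller is assumed to keep `offset + length` within the file size.

```python
import mmap


def map_window(fileno, offset, length):
    """Map `length` bytes starting at an arbitrary byte `offset`.

    mmap only accepts offsets that are a multiple of the allocation
    granularity (the page size on most Unix systems), so the offset is
    rounded down and the difference is returned as `slack`.
    """
    gran = mmap.ALLOCATIONGRANULARITY
    aligned = offset - (offset % gran)   # round down to an aligned boundary
    slack = offset - aligned             # bytes to skip inside the mapping
    buf = mmap.mmap(fileno, length + slack,
                    access=mmap.ACCESS_READ, offset=aligned)
    return buf, slack


def read_at(path, offset, length):
    """Return `length` bytes of the file at `offset` through a small mapping."""
    with open(path, "rb") as f:
        buf, slack = map_window(f.fileno(), offset, length)
        try:
            return buf[slack:slack + length]
        finally:
            buf.close()
```

Calling `read_at` repeatedly with increasing offsets gives a rough picture of a small moving window over a large file, which is the idea the README text in the diff below describes for the -U option.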
--- a/28
+++ b/28
@@ -74,8 +74,7 @@ a compression window larger than your ramsize, if the file is that large. It
 does this (with the -U option) by implementing one large mmap buffer as per
 normal, and a smaller moving buffer to track which part of the file is
 currently being examined, emulating a much larger single mmapped buffer.
-Unfortunately this mode is 100 times slower once lrzip begins examining the
-ram beyond the larger base window.
+Unfortunately this mode is many times slower.

 See the file README.benchmarks in doc/ for performance examples and what kind
 of data lrzip is very good with.
@@ -105,7 +104,26 @@ Q. I want the absolute maximum compression I can possibly get, what do I do?
 A. Try the command line options -MUz. This will use all available ram and ZPAQ
 compression, and even use a compression window larger than you have ram.
 Expect serious swapping to occur if your file is larger than your ram and for
-it to take 1000 times longer. A more practical option is just -M.
+it to take many times longer.
+
+Q. How much slower is the unlimited mode?
+A. It depends on 2 things. First, just how much larger than your ram the file
+is, as the bigger the difference, the slower it will be. The second is how much
+redundant data there is. The more there is, the slower, but ultimately the
+better the compression. Using the example of a 10GB virtual image on a machine
+with 8GB ram, it would allocate about 5.5GB by default, yet is capable of
+allocating all the ram for the 10GB file in -M mode.
+
+Options		Size		Compress	Decompress
+-l		1793312108	05m13s		3m12s
+-lM		1413268368	04m18s		2m54s
+-lU		1413268368	06m05s		2m54s
+
+As you can see, the -U option gives the same compression in this case as the
+-M option, and for about 50% more time. The advantage to using -U is that it
+will work even when the size can't be encompassed by -M, but progressively
+slower. Why isn't it on by default? If the compression window is a LOT larger
+than ram, with a lot of redundant information it can be drastically slower.

 Q. Can I use your tool for even more compression than lzma offers?
 A. Yes, the rzip preparation of files makes them more compressible by every
@@ -288,6 +306,10 @@ A. Yes, that's the nature of the compression/decompression mechanism. The jump
 is because the rzip preparation makes the amount of data much smaller than the
 compression backend (lzma) needs to compress.

+Q. The percentage counter doesn't always get to 100%.
+A. It's quite hard to predict during the rzip phase how long it will take as
+lots of redundant data will not count towards the percentage.
+
 Q. Tell me about patented compression algorithms, GPL, lawyers and copyright.
 A. No
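
The benchmark table added in the second hunk can be checked with a little arithmetic. The sketch below copies the three rows from the diff and works out how much extra compression time -lU needs relative to -lM. The helper names are made up for this illustration. From the listed times the extra cost comes out at roughly 41%, which is in the same ballpark as the "about 50%" quoted in the README text.

```python
def to_seconds(stamp):
    """Convert a timing such as '05m13s' into whole seconds."""
    minutes, rest = stamp.split("m")
    return int(minutes) * 60 + int(rest.rstrip("s"))


# Rows copied from the table in the diff:
# options -> (compressed size in bytes, compress time)
rows = {
    "-l": (1793312108, "05m13s"),
    "-lM": (1413268368, "04m18s"),
    "-lU": (1413268368, "06m05s"),
}


def extra_time_percent(slow, fast):
    """Percentage of additional compress time `slow` needs over `fast`."""
    s = to_seconds(rows[slow][1])
    f = to_seconds(rows[fast][1])
    return 100.0 * (s - f) / f


if __name__ == "__main__":
    print(f"-lU vs -lM: {extra_time_percent('-lU', '-lM'):.1f}% more time")
    print(f"-lU vs -l:  {extra_time_percent('-lU', '-l'):.1f}% more time")
```

The compressed sizes for -lM and -lU are identical in the table, so in this case the only difference between the two modes is the time taken.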