diff --git a/README b/README deleted file mode 100644 index a9a6db7..0000000 --- a/README +++ /dev/null @@ -1,356 +0,0 @@ -lrzip README - -Long Range ZIP or Lzma RZIP - -This is a compression program optimised for large files. The larger the file -and the more memory you have, the better the compression advantage this will -provide, especially once the files are larger than 100MB. The advantage can -be chosen to be either size (much smaller than bzip2) or speed (much faster -than bzip2). - - -Quick lowdown of the most used options: - - lrztar directory -This will produce an archive directory.tar.lrz compressed with lzma. - - lrzuntar directory.tar.lrz -This will completely extract an archived directory. - - lrzip filename -This will produce an archive filename.lrz compressed with lzma (best all -round) giving slow compression and fast decompression - - lrzip -z filename -This will produce an archive filename.lrz compressed with ZPAQ giving extreme -compression but which takes ages to compress and decompress - - lrzip -l filename -This will produce an archive filename.lrz compressed with LZO giving very -fast compression and fast decompression - - lrunzip filename.lrz -This will decompress filename.lrz into filename - - -Lrzip uses an extended version of rzip which does a first pass long distance -redundancy reduction. The lrzip modifications make it scale according to -memory size. -The data is then either: -1. Compressed by lzma (default) which gives excellent compression -at approximately twice the speed of bzip2 compression -2. Compressed by a number of other compressors chosen for different reasons, -in order of likelihood of usefulness: -2a. ZPAQ: Extreme compression up to 20% smaller than lzma but ultra slow -at compression AND decompression. -2b. LZO: Extremely fast compression and decompression which on most machines -compresses faster than disk writing making it as fast (or even faster) than -simply copying a large file -2c. GZIP: Almost as fast as LZO but with better compression. -2d. BZIP2: A defacto linux standard of sorts but is the middle ground between -lzma and gzip and neither here nor there. -3. Leaving it uncompressed and rzip prepared. This form improves substantially -any compression performed on the resulting file in both size and speed (due to -the nature of rzip preparation merging similar compressible blocks of data and -creating a smaller file). By "improving" I mean it will either speed up the -very slow compressors with minor detriment to compression, or greatly increase -the compression of simple compression algorithms. - -The major disadvantages are: -1. The main lrzip application only works on single files so it requires the -lrztar wrapper to fake a complete archiver. -2. It requires a lot of memory to get the best performance out of, and is not -really usable (for compression) with less than 256MB. Decompression requires -less ram and works on smaller ram machines. Sometimes swap may need to be -enabled on these lower ram machines for the operating system to be happy. -3. STDIN/STDOUT works fine on both compression and decompression, but larger -files compressed in this manner will end up being less efficiently compressed. - -The unique feature of lrzip is that it tries to make the most of the available -ram in your system at all times for maximum benefit. It does this by default, -choosing the largest sized window possible without running out of memory. 
It -also has a unique "sliding mmap" feature which makes it possible to even use -a compression window larger than your ramsize, if the file is that large. It -does this (with the -U option) by implementing one large mmap buffer as per -normal, and a smaller moving buffer to track which part of the file is -currently being examined, emulating a much larger single mmapped buffer. -Unfortunately this mode can be many times slower. - -See the file README.benchmarks in doc/ for performance examples and what kind -of data lrzip is very good with. - - -Requires: -pthreads -liblzo2-dev -libbz2-dev -libz-dev -libm -tar -(nasm on 32bit x86) - -To build/install: -./configure -make -make install - -To build from the git repository do: -./autogen.sh -before the above steps. - - -FAQS. -Q. What encryption does lrzip use? -A. Lrzip uses the best current proven technologies to achieve high grade -password protected encryption. It uses a combination of sha512 to multiply -hash the password with random salt and aes128 to do block encryption of each -block of data with more random salt. The amount of initial hashing of the -password increases by the date an lrzip archive is encrypted according to -Moore's law, making it harder each year to brute force attack the password -to keep up with the increasing computing power each year. It is virtually -guaranteed that the same file encrypted with the same password will never -be the same twice. The weakest link in this encryption mode by far is the -password chosen by the user. There is currently no known attack or backdoor -for this encryption mechanism, and there is absolutely no way of retrieving -your password should you forget it. - -Q. How do I make a static build? -A. ./configure --enable-static-bin - -Q. I want the absolute maximum compression I can possibly get, what do I do? -A. Try the command line options "-Uzp 1 -L 9". This uses all available ram and -ZPAQ compression, and even uses a compression window larger than you have ram. -The -p 1 option disables multithreading which improves compression but at the -expense of speed. Expect it to take many times longer. - -Q. I want the absolute fastest decent compression I can possibly get. -A. Try the command line option -l. This will use the lzo backend compression, -and level 7 compression (1 isn't much faster). - -Q. How much slower is the unlimited mode? -A. It depends on 2 things. First, just how much larger than your ram the file -is, as the bigger the difference, the slower it will be. The second is how much -redundant data there is. The more there is, the slower, but ultimately the -better the compression. Why isn't it on by default? If the compression window is -a LOT larger than ram, with a lot of redundant information it can be drastically -slower. I may revisit this possibility in the future if I can make it any -faster. - -Q. Can I use your tool for even more compression than lzma offers? -A. Yes, the rzip preparation of files makes them more compressible by most -other compression technique I have tried. Using the -n option will generate -a .lrz file smaller than the original which should be more compressible, and -since it is smaller it will compress faster than it otherwise would have. - -Q. 32bit? -A. 32bit machines have a limit of 2GB sized compression windows due to -userspace limitations on mmap and malloc, so even if you have much more ram -you will not be able to use compression windows larger than 2GB. 
Also you -may be unable to decompress files compressed on 64bit machines which have -used windows larger than 2GB. - -Q. How about 64bit? -A. 64bit machines with their ability to address massive amounts of ram will -excel with lrzip due to being able to use compression windows limited only in -size by the amount of physical ram. - -Q. Other operating systems? -A. The code is POSIXy with GNU extensions. Patches are welcome. Version 0.43+ -should build on MacOSX 10.5+ - -Q. Does it work on stdin/stdout? -A. Yes it does. Compression and decompression work well to/from STDIN/STDOUT. -However because lrzip does multiple passes on the data, it has to store a -large amount in ram before it dumps it to STDOUT (and vice versa), thus it -is unable to work with the massive compression windows regular operation -provides. Thus the compression afforded on files larger than approximately -25% RAM size will be less efficient (though still benefiting compared to -traditional compression formats). - -Q. I have another compression format that is even better than zpaq, can you -use that? -A. You can use it yourself on rzip prepared files (see above). Alternatively -if the source code is compatible with the GPL license it can be added to the -lrzip source code. Libraries with functions similar to compress() and -decompress() functions of zlib would make the process most painless. Please -tell me if you have such a library so I can include it :) - -Q. What's this "Starting lzma back end compression thread..." message? -A. While I'm a big fan of progress percentage being visible, unfortunately -lzma compression can't currently be tracked when handing over 100+MB chunks -over to the lzma library. Therefore you'll see progress percentage until -each chunk is handed over to the lzma library. - -Q. What's this "lzo testing for incompressible data" message? -A. Other compression is much slower, and lzo is the fastest. To help speed up -the process, lzo compression is performed on the data first to test that the -data is at all compressible. If a small block of data is not compressible, it -tests progressively larger blocks until it has tested all the data (if it fails -to compress at all). If no compressible data is found, then the subsequent -compression is not even attempted. This can save a lot of time during the -compression phase when there is incompressible data. Theoretically it may be -possible that data is compressible by the other backend (zpaq, lzma etc) and not -at all by lzo, but in practice such data achieves only minuscule amounts of -compression which are not worth pursuing. Most of the time it is clear one way -or the other that data is compressible or not. If you wish to disable this -test and force it to try compressing it anyway, use -T. - -Q. I have truckloads of ram so I can compress files much better, but can my -generated file be decompressed on machines with less ram? -A. Yes. Ram requirements for decompression go up only by the -L compression -option with lzma and are never anywhere near as large as the compression -requirements. However if you're on 64bit and you use a compression window -greater than 2GB, it might not be possible to decompress it on 32bit machines. - -Q. Why are you including bzip2 compression? -A. To maintain a similar compression format to the original rzip (although the -other modes are more useful). - -Q. What about multimedia? -A. Most multimedia is already in a heavily compressed "lossy" format which by -its very nature has very little redundancy. 
This means that there is not -much that can actually be compressed. If your video/audio/picture is in a -high bitrate, there will be more redundancy than a low bitrate one making it -more suitable to compression. None of the compression techniques in lrzip are -optimised for this sort of data. However, the nature of rzip preparation -means that you'll still get better compression than most normal compression -algorithms give you if you have very large files. ISO images of dvds for -example are best compressed directly instead of individual .VOB files. ZPAQ is -the only compression format that can do any significant compression of -multimedia. - -Q. Is this multithreaded? -A. As of version 0.540, it is HEAVILY multithreaded with the back end -compression and decompression phase, and will continue to process the rzip -pre-processing phase so when using one of the more CPU intensive backend -compressions like lzma or zpaq, SMP machines will show massive speed -improvements. Lrzip will detect the number of CPUs to use, but it can be -overridden with the -p option if the slightly better compression is desired -more than speed. -p 1 will give the best compression but also be the slowest. - -Q. This uses heaps of memory, can I make it use less? -A. Well you can by setting -w to the lowest value (1) but the huge use of -memory is what makes the compression better than ordinary compression -programs so it defeats the point. You'll still derive benefit with -w 1 but -not as much. - -Q. What CFLAGS should I use? -A. With a recent enough compiler (gcc>4) setting both CFLAGS and CXXFLAGS to - -O2 -march=native -fomit-frame-pointer - -Q. What compiler does this work with? -A. It has been tested on gcc, ekopath and the intel compiler successfully -previously. Whether the commercial compilers help or not, I could not tell you. - -Q. What codebase are you basing this on? -A. rzip v2.1 and lzma sdk920, but it should be possible to stay in sync with -each of these in the future. - -Q. Do we really need yet another compression format? -A. It's not really a new one at all; simply a reimplementation of a few very -good performing ones that will scale with memory and file size. - -Q. How do you use lrzip yourself? -A. Three basic uses. I compress large files currently on my drive with the --l option since it is so quick to get a space saving. When archiving data for -permanent storage I compress it with the default options. When compressing -small files for distribution I use the -z option for the smallest possible -size. - -Q. I found a file that compressed better with plain lzma. How can that be? -A. When the file is more than 5 times the size of the compression window -you have available, the efficiency of rzip preparation drops off as a means -of getting better compression. Eventually when the file is large enough, -plain lzma compression will get better ratios. The lrzip compression will be -a lot faster though. The only way around this is to use as large compression -windows as possible with -U option. - -Q. Can I use swapspace as ram for lrzip with a massive window? -A. It will indirectly do this with -U (unlimited) mode enabled. This mode will -make the compression window as big as the file itself no matter how big it is, -but it will slow down proportionately more the bigger the file is than your ram. - -Q. Why do you nice it to +19 by default? Can I speed up the compression by -changing the nice value? -A. This is a common misconception about what nice values do. 
They only tell the -cpu process scheduler how to prioritise workloads, and if your application is -the _only_ thing running it will be no faster at nice -20 nor will it be any -slower at +19. - -Q. What is the LZO Testing option, -T? -A. LZO testing is normally performed for the slower back-end compression of LZMA -and ZPAQ. The reasoning is that if it is completely incompressible by LZO then -it will also be incompressible by them. Thus if a block fails to be compressed -by the very fast LZO, lrzip will not attempt to compress that block with the -slower compressor, thereby saving time. If this option is enabled, it will -bypass the LZO testing and attempt to compress each block regardless. - -Q. Compression and decompression progress on large archives slows down and -speeds up. There's also a jump in the percentage at the end? -A. Yes, that's the nature of the compression/decompression mechanism. The jump -is because the rzip preparation makes the amount of data much smaller than the -compression backend (lzma) needs to compress. - -Q. Tell me about patented compression algorithms, GPL, lawyers and copyright. -A. No - -Q. I receive an error "LZMA ERROR: 2. Try a smaller compression window." - what does this mean? -A. LZMA requests large amounts of memory. When a higher compression window is - used, there may not be enough contiguous memory for LZMA. LZMA may request - up to 25% of TOTAL ram depending on compression level. If contiguous blocks - of memory are not free, LZMA will return an error. This is not a fatal - error, and a backup mode of compression will be used. - -Q. Where can I get more information about the internals of LZMA? -A. See http://www.7-zip.org and http://www.p7zip.org. Also, see the file - ./lzma/C/lzmalib.h which explains the LZMA properties used and the LZMA - memory requirements and computation. - -Q. This version is much slower than the old version? -A. Make sure you have set CFLAGS and CXXFLAGS. An unoptimised build will be -almost 3 times slower. - -LIMITATIONS -Due to mmap limitations the maximum size a window can be set to is currently -2GB on 32bit unless the -U option is specified. Files generated on 64 bit -machines with windows >2GB in size might not be decompressible on 32bit -machines. Large files might not decompress on machines with less RAM if SWAP is -disabled. - -BUGS: -Probably lots. Tell me if you spot any :) Any known ones should be documented -in the file BUGS. - - -Links: -rzip: -http://rzip.samba.org/ -lzo: -http://www.oberhumer.com/opensource/lzo/ -lzma: -http://www.7-zip.org/ -zpaq: -http://mattmahoney.net/dc/ - -Thanks to Andrew Tridgell for rzip. Thanks to Markus Oberhumer for lzo. -Thanks to Igor Pavlov for lzma. Thanks to Jean-loup Gailly and Mark Adler -for the zlib compression library. Thanks to Christian Leber for lzma -compat layer, Michael J Cohen for Darwin support, Lasse Collin for fix -to LZMALib.cpp and for Makefile.in suggestions, and everyone else who coded -along the way. Huge thanks to Peter Hyman for most of the 0.19-0.24 changes, -and the update to the multithreaded lzma library and all sorts of other -features. Thanks to René Rhéaume for fixing executable stacks and -Ed Avis for various fixes. Thanks to Matt Mahoney for zpaq code. Thanks to -Jukka Laurila for Darwin support. Thanks to George Makrydakis for lrztar. -Thanks to Ulrich Drepper for his md5 implementation. Thanks to Michael -Blumenkrantz for new configuration tools. Thanks to the PolarSSL authors for -encryption code. 
Thanks to Serge Belyshev for extensive help, advice, and patches to implement encryption. Michael Blumenkrantz also for liblrzip.

Con Kolivas
Sat, 11 March 2011

Also documented by
Peter Hyman
Sun, 04 Jan 2009

diff --git a/README.md b/README.md
new file mode 100644
index 0000000..ed6bfff
--- /dev/null
+++ b/README.md
@@ -0,0 +1,476 @@

lrzip - Long Range ZIP or LZMA RZIP
===================================

A compression utility that excels at compressing large files (usually > 10-50 MB).
The larger the file and the more free RAM you have, the more effectively it can
compress (i.e. faster and/or smaller output), especially once file sizes exceed
100 MB. You can choose to optimize either for speed (fast compression /
decompression) or for size, but not both.


### haneefmubarak's TL;DR for the long explanation:

Just change the word `directory` to the name of the directory you wish to compress.

#### Compression:

```bash
lrzdir=directory; tar cvf $lrzdir.tar $lrzdir; lrzip -Ubvvp `nproc` -S .bzip2-lrz -L 9 $lrzdir.tar; rm -fv $lrzdir.tar; unset lrzdir
```

`tar`s the directory, then maxes out all of the system's processor cores
along with sliding window RAM to give the best **BZIP2** compression while being as fast as possible,
enables max verbosity output, attaches the extension `.bzip2-lrz`, and finally
gets rid of the temporary tarfile. Uses a temporary variable `lrzdir`, which is unset at the end.

#### Decompression for the kind of file from above:

```bash
lrzdir=directory; lrunzip -cdivvp `nproc` -o $lrzdir.tar $lrzdir.tar.bzip2-lrz; tar xvf $lrzdir.tar; rm -vf $lrzdir.tar; unset lrzdir
```

Checks integrity, then decompresses the archive using all of the
processor cores for max speed, enables max verbosity output, unpacks
the resulting tarfile, and finally gets rid of the temporary tarfile. Uses the same kind of temporary variable.


### lrzip build/install guide:

A quick guide on building and installing.

#### What you will need

 - gcc
 - bash or zsh
 - pthreads
 - tar
 - libc
 - libm
 - libz-dev
 - libbz2-dev
 - liblzo2-dev
 - coreutils
 - nasm on x86, not needed on x64
 - git if you want a repo-fresh copy
 - an OS with the usual *nix headers and libraries

#### Obtaining the source

Two different ways of doing this:

Stable: Packaged tarball that is known to work:

Go to the releases page at <https://github.com/ckolivas/lrzip/releases> and download the `tar.gz`
file from the top. `cd` to the directory you downloaded it to, and use `tar xvzf lrzip-X.X.tar.gz`
to extract the files (don't forget to replace `X.X` with the correct version). Finally, `cd`
into the directory you just extracted.
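In other words, something like the following sketch, where `X.X` is the placeholder version from above:

```bash
# unpack the downloaded release tarball; replace X.X with the real version
tar xvzf lrzip-X.X.tar.gz
cd lrzip-X.X
```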
Latest: `git clone -v https://github.com/ckolivas/lrzip.git; cd lrzip`

#### Build

```bash
./autogen.sh
./configure
make -j `nproc` # maxes out all cores
```

#### Install

Simple 'n Easy™: `sudo make install`

### lrzip 101:

|Command|Result|
|------|------|
|`lrztar directory`|An archive `directory.tar.lrz` compressed with **LZMA**.|
|`lrzuntar directory.tar.lrz`|A directory extracted from an `lrztar` archive.|
|`lrzip filename`|An archive `filename.lrz` compressed with **LZMA**, meaning slow compression and fast decompression.|
|`lrzip -z filename`|An archive `filename.lrz` compressed with **ZPAQ**, which can give extreme compression but takes a bit longer than forever to compress and decompress.|
|`lrzip -l filename`|An archive `filename.lrz` lightly compressed with **LZO**, meaning really, really fast compression and decompression.|
|`lrunzip filename.lrz`|Decompresses `filename.lrz` to `filename`.|

### lrzip internals

lrzip uses an extended version of [rzip](http://rzip.samba.org/), which does a first pass of long-distance
redundancy reduction. lrzip's modifications allow it to scale to accommodate various memory sizes.

Then, one of the following scenarios occurs:

 - Compressed
   - (default) **LZMA** gives excellent compression @ ~2x the speed of bzip2
   - **ZPAQ** gives extreme compression while taking forever
   - **LZO** gives insanely fast compression that can actually be faster than simply copying a large file
   - **GZIP** gives compression almost as fast as LZO but with better compression
   - **BZIP2** is a de facto Linux standard and hacker favorite which usually gives
     quite good compression (ZPAQ>LZMA>BZIP2>GZIP>LZO) while staying fairly fast (LZO>GZIP>BZIP2>LZMA>ZPAQ);
     in other words, a good middle ground and a good choice overall
 - Uncompressed, in the words of the software's original author:

> Leaving it uncompressed and rzip prepared. This form improves substantially
> any compression performed on the resulting file in both size and speed (due to
> the nature of rzip preparation merging similar compressible blocks of data and
> creating a smaller file). By "improving" I mean it will either speed up the
> very slow compressors with minor detriment to compression, or greatly increase
> the compression of simple compression algorithms.
>
> (Con Kolivas, from the original lrzip README)


The only real disadvantages:

 - The main program, lrzip, only works on single files, and therefore
   requires the use of the lrztar wrapper to fake a complete archiver.
 - lrzip requires quite a bit of memory along with a modern processor
   to get the best performance in reasonable time. This usually means that
   it is somewhat unusable with less than 256 MB. However, decompression
   usually requires less RAM and can work on less powerful machines with much
   less RAM. On machines with less RAM, it may be a good idea to enable swap
   if you want to keep your operating system happy.
 - Piping output to and/or from STDIN and/or STDOUT works fine with both
   compression and decompression, but larger files compressed this way will
   likely end up being compressed less efficiently. Decompression doesn't
   really have any issues with piping, though.

One of the more unique features of lrzip is that it will try to use all of
the available RAM as best it can at all times to provide maximum benefit. This
is the default operating method, where it will create and use the single
largest memory window that will still fit in available memory without freezing
up the system. It does this by `mmap`ing the portion of the file that
it is currently working on. However, it also has a unique "sliding `mmap`" feature, which
allows it to use compression windows that far exceed the size of your RAM if
the file you are compressing is large enough. It does this by using one large `mmap`
along with a smaller moving `mmap` buffer to track the part of the file that
is currently being examined. From a higher level, this can be seen as simply
emulating a single, large `mmap` buffer. The unfortunate thing about this
feature is that it can become extremely slow; the counterargument is that
it will usually give a better compression ratio.
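The `-U` (unlimited window) switch described in the FAQ below turns that sliding-`mmap` mode on explicitly. A minimal sketch, assuming a hypothetical disc image larger than your RAM:

```bash
# sliding mmap (-U): the window grows to the size of the file itself,
# regardless of RAM. Expect it to be much slower, but to compress better.
lrzip -U hugefile.iso
```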
The file `doc/README.benchmarks` has some performance examples to show
what kind of data lrzip is good with.



### FAQ

> Q: What kind of encryption does lrzip use?

> A: lrzip uses SHA2-512 repetitive hashing of the password along with a salt
to provide a key, which is used by AES-128 to do block encryption. Each block
has more random salts added to the block key. The amount of initial hashing
increases as the timestamp goes forward, in direct relation to Moore's law,
which means that the amount of time required to encrypt/decrypt the file
stays the same on a contemporary computer. It is virtually
guaranteed that the same file encrypted with the same password will never
be the same twice. The weakest link in this encryption mode by far is the
password chosen by the user. There is currently no known attack or backdoor
for this encryption mechanism, and there is absolutely no way of retrieving
your password should you forget it.

> Q: How do I make a static build?

> A: `./configure --enable-static-bin`

> Q: I want the absolute maximum compression I can possibly get, what do I do?

> A: Try the command line options `-Uzp 1 -L 9`. This uses all available RAM and
ZPAQ compression, and even uses a compression window larger than you have RAM.
The `-p 1` option disables multithreading, which improves compression but at the
expense of speed. Expect it to take many times longer.

> Q: I want the absolute fastest decent compression I can possibly get.

> A: Try the command line option `-l`. This will use the LZO backend compression
at level 7 (1 isn't much faster).

> Q: How much slower is the unlimited mode?

> A: It depends on two things. First, just how much larger than your RAM the file
is: the bigger the difference, the slower it will be. The second is how much
redundant data there is: the more there is, the slower, but ultimately the
better the compression. Why isn't it on by default? If the compression window is
a LOT larger than RAM, with a lot of redundant information it can be drastically
slower. I may revisit this possibility in the future if I can make it any
faster.

> Q: Can I use your tool for even more compression than lzma offers?

> A: Yes, the rzip preparation of files makes them more compressible by most
other compression techniques I have tried. Using the `-n` option will generate
a `.lrz` file smaller than the original, which should be more compressible, and
since it is smaller it will compress faster than it otherwise would have.
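A sketch of that two-stage trick; `xz` here is just an illustrative second-stage compressor, not something lrzip ships or recommends:

```bash
# rzip-prepare only (-n): no backend compression, output smaller than input
lrzip -n bigfile
# hand the prepared (and smaller) file to any other compressor you like
xz -9 bigfile.lrz   # hypothetical second stage
```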
> Q: 32bit?

> A: 32bit machines have a limit of 2GB sized compression windows due to
userspace limitations on mmap and malloc, so even if you have much more RAM
you will not be able to use compression windows larger than 2GB. Also, you
may be unable to decompress files compressed on 64bit machines which have
used windows larger than 2GB.

> Q: How about 64bit?

> A: 64bit machines, with their ability to address massive amounts of RAM, will
excel with lrzip due to being able to use compression windows limited in
size only by the amount of physical RAM.

> Q: Other operating systems?

> A: The code is POSIXy with GNU extensions. Patches are welcome. Version 0.43+
should build on Mac OS X 10.5+.

> Q: Does it work on stdin/stdout?

> A: Yes it does. Compression and decompression work well to/from STDIN/STDOUT.
However, because lrzip does multiple passes on the data, it has to store a
large amount in RAM before it dumps it to STDOUT (and vice versa), so it
is unable to work with the massive compression windows regular operation
provides. As a result, the compression afforded on files larger than approximately
25% of RAM size will be less efficient (though still beneficial compared to
traditional compression formats).
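A sketch of such a pipeline, assuming lrzip and lrunzip detect the piped streams as the answer above implies:

```bash
# stream a tarball through lrzip (window limited to a fraction of RAM)
tar cv directory | lrzip > directory.tar.lrz
# and stream it back out again
lrunzip < directory.tar.lrz | tar xv
```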
> Q: I have another compression format that is even better than zpaq, can you
use that?

> A: You can use it yourself on rzip-prepared files (see above). Alternatively,
if the source code is compatible with the GPL license, it can be added to the
lrzip source code. Libraries with functions similar to the compress() and
decompress() functions of zlib would make the process most painless. Please
tell me if you have such a library so I can include it :)

> Q: What's this "Starting lzma back end compression thread..." message?

> A: While I'm a big fan of progress percentage being visible, unfortunately
lzma compression can't currently be tracked when handing over 100+MB chunks
to the lzma library. Therefore you'll see progress percentage only until
each chunk is handed over to the lzma library.

> Q: What's this "lzo testing for incompressible data" message?

> A: Other compression is much slower, and LZO is the fastest. To help speed up
the process, LZO compression is performed on the data first to test that the
data is at all compressible. If a small block of data is not compressible, it
tests progressively larger blocks until it has tested all the data (if it fails
to compress at all). If no compressible data is found, then the subsequent
compression is not even attempted. This can save a lot of time during the
compression phase when there is incompressible data. Theoretically it may be
possible that data is compressible by the other backend (zpaq, lzma etc.) and not
at all by LZO, but in practice such data achieves only minuscule amounts of
compression, which are not worth pursuing. Most of the time it is clear one way
or the other whether data is compressible or not. If you wish to disable this
test and force it to try compressing the data anyway, use `-T`.

> Q: I have truckloads of RAM so I can compress files much better, but can my
generated file be decompressed on machines with less RAM?

> A: Yes. RAM requirements for decompression go up only with the `-L` compression
option with lzma and are never anywhere near as large as the compression
requirements. However, if you're on 64bit and you use a compression window
greater than 2GB, it might not be possible to decompress the file on 32bit machines.

> Q: Why are you including bzip2 compression?

> A: To maintain a similar compression format to the original rzip (although the
other modes are more useful).

> Q: What about multimedia?

> A: Most multimedia is already in a heavily compressed "lossy" format, which by
its very nature has very little redundancy. This means that there is not
much that can actually be compressed. If your video/audio/picture is in a
high bitrate, there will be more redundancy than in a low bitrate one, making it
more suitable to compression. None of the compression techniques in lrzip are
optimised for this sort of data. However, the nature of rzip preparation
means that you'll still get better compression than most normal compression
algorithms give you if you have very large files. ISO images of DVDs, for
example, are best compressed directly instead of as individual .VOB files. ZPAQ is
the only compression format that can do any significant compression of
multimedia.

> Q: Is this multithreaded?

> A: As of version 0.540, it is HEAVILY multithreaded in the back end
compression and decompression phase, and it will continue to process the rzip
pre-processing phase, so when using one of the more CPU-intensive backend
compressors like lzma or zpaq, SMP machines will show massive speed
improvements. lrzip will detect the number of CPUs to use, but it can be
overridden with the `-p` option if slightly better compression is desired
more than speed. `-p 1` will give the best compression but also be the slowest.

> Q: This uses heaps of memory, can I make it use less?

> A: Well, you can by setting `-w` to the lowest value (1), but the huge use of
memory is what makes the compression better than ordinary compression
programs, so it defeats the point. You'll still derive benefit with `-w 1`, but
not as much.

> Q: What CFLAGS should I use?

> A: With a recent enough compiler (gcc > 4), set both CFLAGS and CXXFLAGS to
`-O2 -march=native -fomit-frame-pointer`.

> Q: What compiler does this work with?

> A: It has previously been tested successfully with gcc, EKOPath and the Intel
compiler. Whether the commercial compilers help or not, I could not tell you.

> Q: What codebase are you basing this on?

> A: rzip v2.1 and LZMA SDK 9.20, but it should be possible to stay in sync with
each of these in the future.

> Q: Do we really need yet another compression format?

> A: It's not really a new one at all; it's simply a reimplementation of a few very
good performing ones that will scale with memory and file size.

> Q: How do you use lrzip yourself?

> A: Three basic uses. I compress large files currently on my drive with the
`-l` option since it is so quick to get a space saving. When archiving data for
permanent storage I compress it with the default options. When compressing
small files for distribution I use the `-z` option for the smallest possible
size.
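Those three habits map onto three one-liners (the filenames here are placeholders):

```bash
lrzip -l big.img        # fast LZO: quick space saving on local files
lrzip archive.tar       # default LZMA: good ratio for permanent storage
lrzip -z release.tar    # ZPAQ: smallest possible size for distribution
```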
> Q: I found a file that compressed better with plain lzma. How can that be?

> A: When the file is more than 5 times the size of the compression window
you have available, the efficiency of rzip preparation drops off as a means
of getting better compression. Eventually, when the file is large enough,
plain lzma compression will get better ratios. The lrzip compression will be
a lot faster, though. The only way around this is to use as large a compression
window as possible with the `-U` option.

> Q: Can I use swapspace as RAM for lrzip with a massive window?

> A: It will indirectly do this with `-U` (unlimited) mode enabled. This mode will
make the compression window as big as the file itself no matter how big it is,
but it will slow down proportionately more, the more the file size exceeds your RAM.

> Q: Why do you nice it to +19 by default? Can I speed up the compression by
changing the nice value?

> A: This is a common misconception about what nice values do. They only tell the
CPU process scheduler how to prioritise workloads; if your application is
the _only_ thing running, it will be no faster at nice -20, nor will it be any
slower at +19.

> Q: What is the LZO testing option, `-T`?

> A: LZO testing is normally performed before the slower back-end compression of LZMA
and ZPAQ. The reasoning is that if the data is completely incompressible by LZO, then
it will also be incompressible by them. Thus, if a block fails to be compressed
by the very fast LZO, lrzip will not attempt to compress that block with the
slower compressor, thereby saving time. If this option is enabled, it will
bypass the LZO testing and attempt to compress each block regardless.

> Q: Compression and decompression progress on large archives slows down and
speeds up. There's also a jump in the percentage at the end?

> A: Yes, that's the nature of the compression/decompression mechanism. The jump
is because the rzip preparation makes the amount of data much smaller than what the
compression backend (lzma) needs to compress.

> Q: Tell me about patented compression algorithms, GPL, lawyers and copyright.

> A: No

> Q: I receive an error "LZMA ERROR: 2. Try a smaller compression window."
What does this mean?

> A: LZMA requests large amounts of memory. When a higher compression window is
used, there may not be enough contiguous memory for LZMA. LZMA may request
up to 25% of TOTAL RAM depending on compression level. If contiguous blocks
of memory are not free, LZMA will return an error. This is not a fatal
error; a backup mode of compression will be used.

> Q: Where can I get more information about the internals of LZMA?

> A: See http://www.7-zip.org and http://www.p7zip.org. Also, see the file
./lzma/C/lzmalib.h, which explains the LZMA properties used and the LZMA
memory requirements and computation.

> Q: This version is much slower than the old version?

> A: Make sure you have set CFLAGS and CXXFLAGS. An unoptimised build will be
almost 3 times slower.
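Per the answers above, an optimised build might look like this sketch (flags taken verbatim from the CFLAGS answer; `nproc` assumes GNU coreutils):

```bash
# an unoptimised build is almost 3x slower, so set the flags before configuring
export CFLAGS="-O2 -march=native -fomit-frame-pointer"
export CXXFLAGS="$CFLAGS"
./configure
make -j `nproc`
sudo make install
```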
#### LIMITATIONS
Due to mmap limitations the maximum size a window can be set to is currently
2GB on 32bit unless the `-U` option is specified. Files generated on 64bit
machines with windows >2GB in size might not be decompressible on 32bit
machines. Large files might not decompress on machines with less RAM if swap is
disabled.

#### BUGS:
Probably lots. Tell me if you spot any :D Any known ones should be documented
in the file BUGS.



#### Backends:

rzip:
<http://rzip.samba.org/>

lzo:
<http://www.oberhumer.com/opensource/lzo/>

lzma:
<http://www.7-zip.org/>

zpaq:
<http://mattmahoney.net/dc/>

### Thanks (CONTRIBUTORS)

|Person(s)|Thanks for|
|---|---|
|`Andrew Tridgell`|`rzip`|
|`Markus Oberhumer`|`lzo`|
|`Igor Pavlov`|`lzma`|
|`Jean-Loup Gailly & Mark Adler`|`zlib`|
|***`Con Kolivas`***|***Original code, binding all of this together, managing the project, original `README`***|
|`Christian Leber`|`lzma` compatibility layer|
|`Michael J Cohen`|Darwin/OSX support|
|`Lasse Collin`|Fixes to `LZMALib.cpp` and `Makefile.in`|
|Everyone else who coded along the way (add yourself where appropriate if that's you)|Miscellaneous coding|
|**`Peter Hyman`**|Most of the `0.19` to `0.24` changes|
|`^^^^^^^^^^^`|Updating the multithreaded `lzma` lib|
|`^^^^^^^^^^^`|All sorts of other features|
|`René Rhéaume`|Fixing executable stacks|
|`Ed Avis`|Various fixes|
|`Matt Mahoney`|`zpaq` integration code|
|`Jukka Laurila`|Additional Darwin/OSX support|
|`George Makrydakis`|`lrztar` wrapper|
|`Ulrich Drepper`|*special* implementation of md5|
|**`Michael Blumenkrantz`**|New config tools|
|`^^^^^^^^^^^^^^^^^^^^`|`liblrzip`|
|Authors of `PolarSSL`|Encryption code|
|`Serge Belyshev`|Extensive help, advice, and patches to implement secure encryption|
|`Jari Aalto`|Fixing typos, esp. in code|
|`Carlo Alberto Ferraris`|Code cleanup|
|`Peter Hyman`|Additional documentation|
|`Haneef Mubarak`|Cleanup, rewrite, and GH Markdown of `README` --> `README.md`|

Persons above are listed in chronological order of first contribution to **lrzip**. Person(s) with names in **bold** have multiple major contributions, person(s) with names in *italics* have made massive contributions, and person(s) with names in ***both*** have made innumerable massive contributions.

#### README Authors

Con Kolivas (`ckolivas` on GitHub)
Sat, 11 March 2011: README

Also documented by
Peter Hyman
Sun, 04 Jan 2009: README

Mostly rewritten + GFMified:
Haneef Mubarak (`haneefmubarak` on GitHub)
Sun/Mon Sep 01-02 2013: README.md