Huge rewrite of buffer reading in rzip.c.

We use a wrapper instead of accessing the buffer directly, thus allowing us to have window sizes larger than available ram. This is implemented through the use of a "sliding mmap" implementation. Sliding mmap uses two mmapped buffers, one large one as previously, and one page sized smaller one. When an attempt is made to read beyond the end of the large buffer, the small buffer is remapped to the file area that's being accessed. While this implementation is 100x slower than direct mmapping, it allows us to implement unlimited sized compression windows.

* Implement the -U option with unlimited sized windows.
* Rework the selection of compression windows. Instead of trying to guess how much ram the machine might be able to access, we try to safely buffer as much ram as we can, and then use that to determine the file buffer size. Do not choose an arbitrary upper window limit unless -w is specified.
* Rework the -M option to try to buffer the entire file, reducing the buffer size until we succeed.
* Align buffer sizes to page size.
* Clean up lots of unneeded variables.
* Fix lots of minor logic issues to do with window sizes accepted/passed to rzip and the compression backends.
* More error handling.
* Change -L to affect rzip compression level directly as well as backend compression level and use 9 by default now.
* More cleanups of information output.
* Use 3 point release numbering in case one minor version has many subversions.
* Numerous minor cleanups and tidying.
* Updated docs and manpages.
2025-12-06 07:12:00 +01:00 · 2010-11-04 21:14:55 +11:00 · 29b166629a
parent c106128d1a
commit 29b166629a
12 changed files with 400 additions and 256 deletions
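
The "sliding mmap" idea from the commit message can be sketched roughly as below. This is a minimal illustration, not the code in rzip.c: the names `slide_map`, `slide_open` and `slide_get` are hypothetical, the page size is queried with `sysconf()` rather than hard-coded, and the low window is assumed to start at file offset 0.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <sys/mman.h>
#include <unistd.h>

/* Hypothetical state for the sketch; field names are not lrzip's. */
struct slide_map {
	unsigned char *low;	/* fixed mapping of the first low_size bytes */
	unsigned char *high;	/* small mapping that is moved on demand */
	int64_t low_size;
	int64_t high_off;	/* file offset where the high mapping starts */
	int64_t high_size;	/* 0 while nothing is mapped high */
	int64_t file_size;
	int64_t page;
	int fd;
};

/* Map the first low_size bytes of fd once. Returns 0 on success, -1 on failure. */
static int slide_open(struct slide_map *sm, int fd, int64_t file_size, int64_t low_size)
{
	if (low_size > file_size)
		low_size = file_size;
	sm->fd = fd;
	sm->file_size = file_size;
	sm->page = sysconf(_SC_PAGE_SIZE);
	sm->low = NULL;
	sm->low_size = low_size;
	sm->high = NULL;
	sm->high_off = 0;
	sm->high_size = 0;
	if (low_size > 0) {
		sm->low = mmap(NULL, (size_t)low_size, PROT_READ, MAP_SHARED, fd, 0);
		if (sm->low == MAP_FAILED) {
			perror("mmap low window");
			return -1;
		}
	}
	return 0;
}

/* Return a pointer to the byte at file offset p, or NULL if remapping fails. */
static const unsigned char *slide_get(struct slide_map *sm, int64_t p)
{
	assert(p >= 0 && p < sm->file_size);

	/* Fast path: the byte lives in the fixed low window. */
	if (p < sm->low_size)
		return sm->low + p;
	/* Second chance: the byte is inside the current high window. */
	if (sm->high_size && p >= sm->high_off && p < sm->high_off + sm->high_size)
		return sm->high + (p - sm->high_off);

	/* Miss: drop the old high window and map the page containing p. */
	if (sm->high_size && munmap(sm->high, (size_t)sm->high_size) != 0)
		return NULL;
	sm->high_off = p - (p % sm->page);	/* mmap offsets must be page aligned */
	sm->high_size = sm->page;
	if (sm->high_off + sm->high_size > sm->file_size)
		sm->high_size = sm->file_size - sm->high_off;	/* shrink at end of file */
	sm->high = mmap(NULL, (size_t)sm->high_size, PROT_READ, MAP_SHARED,
			sm->fd, sm->high_off);
	if (sm->high == MAP_FAILED) {
		sm->high_size = 0;
		return NULL;
	}
	return sm->high + (p - sm->high_off);
}
```

Every access beyond the low window that misses the current page costs a `munmap()` plus an `mmap()`, which is consistent with the large slowdown the commit message reports for this mode.
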
--- a/34
+++ b/34
@ -1,4 +1,38 @@
 lrzip ChangeLog
+NOVEMBER 2010, version 0.5.1 Con Kolivas
+* Fix Darwin build - Darwin doesn't support mremap so introduce a fake wrapper
+for it.
+* Fix the memopen routines, a wrongly implemented wrapper for Darwin equivalents
+was also using the faked versions on all builds.
+* Fix dodgy ordered includes.
+* Clean up excessive use of #ifdefs
+* Huge rewrite of buffer reading in rzip.c. We use a wrapper instead of
+accessing the buffer directly, thus allowing us to have window sizes larger than
+available ram. This is implemented through the use of a "sliding mmap"
+implementation. Sliding mmap uses two mmapped buffers, one large one as
+previously, and one page sized smaller one. When an attempt is made to read
+beyond the end of the large buffer, the small buffer is remapped to the file
+area that's being accessed. While this implementation is 100x slower than direct
+mmapping, it allows us to implement unlimited sized compression windows.
+* Implement the -U option with unlimited sized windows.
+* Rework the selection of compression windows. Instead of trying to guess how
+much ram the machine might be able to access, we try to safely buffer as much
+ram as we can, and then use that to determine the file buffer size. Do not
+choose an arbitrary upper window limit unless -w is specified.
+* Rework the -M option to try to buffer the entire file, reducing the buffer
+size until we succeed.
+* Align buffer sizes to page size.
+* Clean up lots of unneeded variables.
+* Fix lots of minor logic issues to do with window sizes accepted/passed to rzip
+and the compression backends.
+* More error handling.
+* Change -L to affect rzip compression level directly as well as backend
+compression level and use 9 by default now.
+* More cleanups of information output.
+* Use 3 point release numbering in case one minor version has many subversions.
+* Numerous minor cleanups and tidying.
+* Updated docs and manpages.
+
 NOVEMBER 2010, version 0.5 Con Kolivas
 * Changed offset encoding in rzip stage to use variable byte width offsets
 instead of 64 bits wide. Makes for better compression and slightly faster.
--- a/54
+++ b/54
@ -1,4 +1,4 @@
-lrzip v0.5
+lrzip v0.5.1

 Long Range ZIP or Lzma RZIP

@ -66,6 +66,17 @@ less ram and works on smaller ram machines.
 stdin/stdout work but in a very inefficient manner generating temporary files
 on disk so this method of using lrzip is not recommended.

+The unique feature of lrzip is that it tries to make the most of the available
+ram in your system at all times for maximum benefit. It does this by default,
+choosing the largest sized window possible without running out of memory. It
+also has a unique "sliding mmap" feature which makes it possible to even use
+a compression window larger than your ramsize, if the file is that large. It
+does this (with the -U option) by implementing one large mmap buffer as per
+normal, and a smaller moving buffer to track which part of the file is
+currently being examined, emulating a much larger single mmapped buffer.
+Unfortunately this mode is 100 times slower once lrzip begins examining the
+ram beyond the larger base window.
+
 See the file README.benchmarks in doc/ for performance examples and what kind
 of data lrzip is very good with.

@ -91,10 +102,10 @@ Q. How do I make a static build?
 A. make static

 Q. I want the absolute maximum compression I can possibly get, what do I do?
-A. Try the command line options -Mz. This will use all available ram and ZPAQ
-compression. Expect serious swapping to occur if your file is larger than your
-ram. It may even fail to run if you do not have enough swap space allocated.
-Why? Well the more ram lrzip uses the better the compression it can achieve.
+A. Try the command line options -MUz. This will use all available ram and ZPAQ
+compression, and even use a compression window larger than you have ram.
+Expect serious swapping to occur if your file is larger than your ram and for
+it to take 1000 times longer. A more practical option is just -M.

 Q. Can I use your tool for even more compression than lzma offers?
 A. Yes, the rzip preparation of files makes them more compressible by every
@ -111,11 +122,12 @@ used windows larger than 2GB.

 Q. How about 64bit?
 A. 64bit machines with their ability to address massive amounts of ram will
-excel with lrzip due to being able to use compresion windows limited only in
+excel with lrzip due to being able to use compression windows limited only in
 size by the amount of physical ram.

 Q. Other operating systems?
-A. Patches are welcome. Version 0.43+ should build on MacOSX 10.5+
+A. The code is POSIXy with GNU extensions. Patches are welcome. Version 0.43+
+should build on MacOSX 10.5+

 Q. Does it work on stdin/stdout?
 A. Yes it does. Compression from stdin works nicely.. However the other
@ -146,7 +158,7 @@ to compress at all). If no compressible data is found, then the subsequent
 compression is not even attempted. This can save a lot of time during the
 compression phase when there is incompressible data. Theoretically it may be
 possible that data is compressible by the other backend (zpaq, lzma etc) and not
-at all by lzo, but in practice such data achieves only miniscule amounts of
+at all by lzo, but in practice such data achieves only minuscule amounts of
 compression which are not worth pursuing. Most of the time it is clear one way
 or the other that data is compressible or not. If you wish to disable this
 test and force it to try compressing it anyway, use -T 0.
@ -156,8 +168,7 @@ generated file be decompressed on machines with less ram?
 A. Yes. Ram requirements for decompression go up only by the -L compression
 option with lzma and are never anywhere near as large as the compression
 requirements. However if you're on 64bit and you use a compression window
-greater than 2GB, it may NOT be possible to decompress it on 32bit machines.
-lrzip will warn you and fail if you try.
+greater than 2GB, it might not be possible to decompress it on 32bit machines.

 Q. I've changed the compression level with -L in combination with -l or -z and
 the file size doesn't vary?
@ -212,28 +223,21 @@ good performing ones that will scale with memory and file size.
 Q. How do you use lrzip yourself?
 A. Two basic uses. I compress large files currently on my drive with the
 -l option since it is so quick to get a space saving, and when archiving
-data for permament storage I compress it with the default options.
+data for permanent storage I compress it with the default options.

 Q. I found a file that compressed better with plain lzma. How can that be?
 A. When the file is more than 5 times the size of the compression window
 you have available, the efficiency of rzip preparation drops off as a means
 of getting better compression. Eventually when the file is large enough,
 plain lzma compression will get better ratios. The lrzip compression will be
-a lot faster though. Currently I have no way around this problem without
-throwing more and more ram at the compression because trying to do this off
-disk (whether directly on the file or from swap) will mean the file is read
-a ridulous number of times over and over again. It presents an interesting
-problem for which there is no perfect solution but it certainly has us
-thinking hard about how to tackle it.
+a lot faster though. The only way around this is to use as much ram as
+possible with the -M option, and going beyond that with the -U option.

 Q. Can I use swapspace as ram for lrzip with a massive window?
-A. No. To make lrzip work completely from disk would make the data be read
-off disk an unrealistic number of times over again and again. For example, if
-you have 1GB of ram and a 2GB file to compress, it might read the file a
-billion times off disk. Most hard drives would fail in that time :) See the
-previous question. Update; I have been informed that people have successfully
-done this without destroying their hard drives and they've been _very_ patient,
-but it didn't take as long as I had predicted.
+A. It will indirectly do this with -M mode enabled. If you want the windows
+even larger, -U (unlimited) mode will make the compression window as big as
+the file itself no matter how big it is, but it will slow down 100 times
+during the compression phase once it has reached your full ram.

 Q. Why do you nice it to +19 by default? Can I speed up the compression by
 changing the nice value?
@ -331,7 +335,7 @@ Ed Avis for various fixes. Thanks to Matt Mahoney for zpaq code. Thanks to
 Jukka Laurila for Darwin support. Thanks to George Makrydakis for lrztar.

 Con Kolivas <kernel@kolivas.org>
-Mon, 1 Nov 2010
+Thu, 4 Nov 2010

 Also documented by
 Peter Hyman <pete@peterhyman.com>
--- a/15
+++ b/15
@ -1,3 +1,18 @@
+lrzip-0.5.1
+
+Fixed the build on Darwin.
+Rewrote the rzip compression phase to make it possible to use unlimited sized
+windows now, not limited by ram. Unfortunately it's 100 times slower in this
+mode but you can compress a file of any size as one big compression window with
+it using the new -U option.
+Changed the memory selection system to simply find the largest reasonable sized
+window and use that by default instead of guessing the window size.
+Setting -M now only affects the window size, trying to find the largest
+unreasonably sized window that will still work.
+The default compression level is now 9 and affects the rzip compression stage
+as well as the backend compression.
+Changed to 3 point releases in case we get more than 9 subversions ;)
+
 lrzip-0.50

 Rewrote the file format to be up to 5% more compact and slightly faster.
--- a/22
+++ b/22
@ -1,6 +1,6 @@
 #! /bin/sh
 # Guess values for system-dependent variables and create Makefiles.
-# Generated by GNU Autoconf 2.67 for lrzip 0.5.
+# Generated by GNU Autoconf 2.67 for lrzip 0.5.1.
 #
 # Report bugs to <kernel@kolivas.org>.
 #
@ -551,9 +551,9 @@ MAKEFLAGS=

 # Identity of this package.
 PACKAGE_NAME='lrzip'
-PACKAGE_TARNAME='lrzip-0.5'
-PACKAGE_VERSION='0.5'
-PACKAGE_STRING='lrzip 0.5'
+PACKAGE_TARNAME='lrzip-0.5.1'
+PACKAGE_VERSION='0.5.1'
+PACKAGE_STRING='lrzip 0.5.1'
 PACKAGE_BUGREPORT='kernel@kolivas.org'
 PACKAGE_URL=''

@ -1221,7 +1221,7 @@ if test "$ac_init_help" = "long"; then
  # Omit some internal or obsolete options to make the list less imposing.
  # This message is too long to be a string in the A/UX 3.1 sh.
  cat <<_ACEOF
-\`configure' configures lrzip 0.5 to adapt to many kinds of systems.
+\`configure' configures lrzip 0.5.1 to adapt to many kinds of systems.

 Usage: $0 [OPTION]... [VAR=VALUE]...

@ -1269,7 +1269,7 @@ Fine tuning of the installation directories:
  --infodir=DIR           info documentation [DATAROOTDIR/info]
  --localedir=DIR         locale-dependent data [DATAROOTDIR/locale]
  --mandir=DIR            man documentation [DATAROOTDIR/man]
-  --docdir=DIR            documentation root [DATAROOTDIR/doc/lrzip-0.5]
+  --docdir=DIR            documentation root [DATAROOTDIR/doc/lrzip-0.5.1]
  --htmldir=DIR           html documentation [DOCDIR]
  --dvidir=DIR            dvi documentation [DOCDIR]
  --pdfdir=DIR            pdf documentation [DOCDIR]
@ -1286,7 +1286,7 @@ fi

 if test -n "$ac_init_help"; then
  case $ac_init_help in
-     short | recursive ) echo "Configuration of lrzip 0.5:";;
+     short | recursive ) echo "Configuration of lrzip 0.5.1:";;
   esac
  cat <<\_ACEOF

@ -1375,7 +1375,7 @@ fi
 test -n "$ac_init_help" && exit $ac_status
 if $ac_init_version; then
  cat <<\_ACEOF
-lrzip configure 0.5
+lrzip configure 0.5.1
 generated by GNU Autoconf 2.67

 Copyright (C) 2010 Free Software Foundation, Inc.
@ -2014,7 +2014,7 @@ cat >config.log <<_ACEOF
 This file contains any messages produced by compilers while
 running configure, to aid debugging if configure makes a mistake.

-It was created by lrzip $as_me 0.5, which was
+It was created by lrzip $as_me 0.5.1, which was
 generated by GNU Autoconf 2.67.  Invocation command line was

  $ $0 $@
@ -5324,7 +5324,7 @@ cat >>$CONFIG_STATUS <<\_ACEOF || ac_write_fail=1
 # report actual input values of CONFIG_FILES etc. instead of their
 # values after options handling.
 ac_log="
-This file was extended by lrzip $as_me 0.5, which was
+This file was extended by lrzip $as_me 0.5.1, which was
 generated by GNU Autoconf 2.67.  Invocation command line was

  CONFIG_FILES    = $CONFIG_FILES
@ -5386,7 +5386,7 @@ _ACEOF
 cat >>$CONFIG_STATUS <<_ACEOF || ac_write_fail=1
 ac_cs_config="`$as_echo "$ac_configure_args" | sed 's/^ //; s/[\\""\`\$]/\\\\&/g'`"
 ac_cs_version="\\
-lrzip config.status 0.5
+lrzip config.status 0.5.1
 configured by $0, generated by GNU Autoconf 2.67,
  with options \\"\$ac_cs_config\\"

--- a/configure.ac
+++ b/configure.ac
@ -1,5 +1,5 @@
 dnl Process this file with autoconf to produce a configure script.
-AC_INIT([lrzip],[0.5],[kernel@kolivas.org],[lrzip-0.5])
+AC_INIT([lrzip],[0.5.1],[kernel@kolivas.org],[lrzip-0.5.1])
 AC_CONFIG_HEADER(config.h)
 # see what our system is!
 AC_CANONICAL_HOST
--- a/doc/README.benchmarks
+++ b/doc/README.benchmarks
@ -89,12 +89,14 @@ gzip		2772899756	 25.8		7m52.667s	4m8.661s
 bzip2		2704781700	 25.2		20m34.269s	7m51.362s
 xz		2272322208	 21.2		58m26.829s	4m46.154s
 7z		2242897134	 20.9		29m28.152s	6m35.952s
-lrzip		1361276826	 12.7		27m45.874s	9m20.046
-lrzip(lzo)	1837206675	 17.1		4m48.167s	8m28.842s
+lrzip*		1354237684	 12.6		29m13.402s	6m55.441s
+lrzip(lzo)*	1828073980	 17.0		3m34.816s	5m06.266s
 lrzip(zpaq)	1341008779	 12.5		4h11m14s
 lrzip(zpaq)M	1270134391	 11.8		4h30m14
 lrzip(zpaq)MW	1066902006	  9.9

+(The benchmarks with * were done with version 0.5)
+
 At this end of the spectrum things really start to heat up. The compression
 advantage is massive, with the lzo backend even giving much better results
 than 7z, and over a ridiculously short time. Note that it's not much longer
@ -117,4 +119,4 @@ Or, to make things easier, just use the default settings all the time and be
 happy as lzma gives good results. :D

 Con Kolivas
-Sat, 19 Dec 2009
+Tue, 2nd Nov 2010
--- a/main.c
+++ b/main.c
@ -23,13 +23,13 @@ struct rzip_control control;

 static void usage(void)
 {
-	print_output("lrzip version %d.%d%d\n", LRZIP_MAJOR_VERSION, LRZIP_MINOR_VERSION, LRZIP_MINOR_SUBVERSION);
+	print_output("lrzip version %d.%d.%d\n", LRZIP_MAJOR_VERSION, LRZIP_MINOR_VERSION, LRZIP_MINOR_SUBVERSION);
 	print_output("Copyright (C) Con Kolivas 2006-2010\n\n");
 	print_output("Based on rzip ");
 	print_output("Copyright (C) Andrew Tridgell 1998-2003\n");
 	print_output("usage: lrzip [options] <file...>\n");
 	print_output(" Options:\n");
-	print_output("     -w size       compression window in hundreds of MB\n");
+	print_output("     -w size       maximum compression window in hundreds of MB\n");
 	print_output("                   default chosen by heuristic dependent on ram and chosen compression\n");
 	print_output("     -d            decompress\n");
 	print_output("     -o filename   specify the output file name and/or path\n");
@ -39,13 +39,14 @@ static void usage(void)
 	print_output("     -D            delete existing files\n");
 	print_output("     -P            don't set permissions on output file - may leave it world-readable\n");
 	print_output("     -q            don't show compression progress\n");
-	print_output("     -L level      set lzma/bzip2/gzip compression level (1-9, default 7)\n");
+	print_output("     -L level      set lzma/bzip2/gzip compression level (1-9, default 9)\n");
 	print_output("     -n            no backend compression - prepare for other compressor\n");
 	print_output("     -l            lzo compression (ultra fast)\n");
 	print_output("     -b            bzip2 compression\n");
 	print_output("     -g            gzip compression using zlib\n");
 	print_output("     -z            zpaq compression (best, extreme compression, extremely slow)\n");
-	print_output("     -M            Maximum window and level - (all available ram and level 9)\n");
+	print_output("     -M            Maximum window (all available ram)\n");
+	print_output("     -U            Use unlimited window size beyond ramsize (100x slower)\n");
 	print_output("     -T value      Compression threshold with LZO test. (0 (nil) - 10 (high), default 1)\n");
 	print_output("     -N value      Set nice value to value (default 19)\n");
 	print_output("     -v[v]         Increase verbosity\n");
@ -519,7 +520,7 @@ static void compress_file(void)
 }

 /*
- * Returns ram size on linux/darwin.
+ * Returns ram size in bytes on linux/darwin.
 */
 #ifdef __APPLE__
 static i64 get_ram(void)
@ -533,17 +534,14 @@ static i64 get_ram(void)
 	sysctl(mib, 2, NULL, &len, NULL, 0);
 	p = malloc(len);
 	sysctl(mib, 2, p, &len, NULL, 0);
-	ramsize = *p / 1024; // bytes -> KB
+	ramsize = *p;

-	/* Darwin can't overcommit as much as linux so we return half the ram
-	   size to fudge it to use smaller windows */
-	ramsize /= 2;
 	return ramsize;
 }
 #else
 static i64 get_ram(void)
 {
-	return (i64)sysconf(_SC_PHYS_PAGES) * (i64)sysconf(_SC_PAGE_SIZE) / 1024;
+	return (i64)sysconf(_SC_PHYS_PAGES) * (i64)sysconf(_SC_PAGE_SIZE);
 }
 #endif

@ -552,7 +550,7 @@ int main(int argc, char *argv[])
 	struct timeval start_time, end_time;
 	struct sigaction handler;
 	double seconds,total_time; // for timers
-	int c, i, maxwin = 0;
+	int c, i;
 	int hours,minutes;
 	extern int optind;
 	char *eptr; /* for environment */
@ -567,8 +565,8 @@ int main(int argc, char *argv[])
 	if (strstr(argv[0], "lrunzip"))
 		control.flags |= FLAG_DECOMPRESS;

-	control.compression_level = 7;
-	control.ramsize = get_ram() / 104858ull; /* hundreds of megabytes */
+	control.compression_level = 9;
+	control.ramsize = get_ram();
 	control.window = 0;
 	control.threshold = 1.0;	/* default lzo test compression threshold (level 1) with LZMA compression */
 	/* for testing single CPU */
@ -594,7 +592,7 @@ int main(int argc, char *argv[])
 	else if (!strstr(eptr,"NOCONFIG"))
 		read_config(&control);

-	while ((c = getopt(argc, argv, "L:hdS:tVvDfqo:w:nlbMO:T:N:gPzi")) != -1) {
+	while ((c = getopt(argc, argv, "L:hdS:tVvDfqo:w:nlbMUO:T:N:gPzi")) != -1) {
 		switch (c) {
 		case 'L':
 			control.compression_level = atoi(optarg);
@ -661,8 +659,10 @@ int main(int argc, char *argv[])
 			control.flags |= FLAG_NO_COMPRESS;
 			break;
 		case 'M':
-			control.compression_level = 9;
-			maxwin = 1;
+			control.flags |= FLAG_MAXRAM;
+			break;
+		case 'U':
+			control.flags |= FLAG_UNLIMITED;
 			break;
 		case 'O':
 			if (control.outname)	/* can't mix -o and -O */
@ -724,39 +724,13 @@ int main(int argc, char *argv[])
 	if (argc < 1)
 		control.flags |= FLAG_STDIN;

-	if (control.window > control.ramsize)
-		print_output("Compression window has been set to larger than ramsize, proceeding at your request. If you did not mean this, abort now.\n");
-
-	if (sizeof(long) == 4 && control.ramsize > 9) {
-		/* On 32 bit, the default high/lowmem split of 896MB lowmem
-		   means we will be unable to allocate contiguous blocks
-		   over 900MB between 900 and 1800MB. It will be less prone
-		   to failure if we limit the block size.
-		   */
-		if (control.ramsize < 18)
-			control.ramsize = 9;
-		else
-			control.ramsize -= 9;
-	}
-
-	/* The control window chosen is the largest that will not cause
-	   massive swapping on the machine (60% of ram). Most of the pages
-	   will be shared by lzma even though it uses just as much ram itself
-	   */
-	if (!control.window) {
-		if (maxwin)
-			control.window = (control.ramsize * 9 / 10);
-		else
-			control.window = (control.ramsize * 2 / 3);
-		if (!control.window)
-			control.window = 1;
-	}
-
+#if 0
 	/* malloc limited to 2GB on 32bit */
 	if (sizeof(long) == 4 && control.window > 20) {
 		control.window = 20;
 		print_verbose("Limiting control window to 2GB due to 32bit limitations.\n");
 	}
+#endif

 	/* OK, if verbosity set, print summary of options selected */
 	if (VERBOSE && !INFO) {
@ -801,8 +775,10 @@ int main(int argc, char *argv[])
 				       (control.threshold < 1.05 ? 21 - control.threshold * 20 : 0));
 			else if (NO_COMPRESS)
 				print_err("RZIP\n");
-			print_err("Compression Window: %lld = %lldMB\n", control.window, control.window * 100ull);
-			print_err("Compression Level: %d\n", control.compression_level);
+			if (control.window) {
+				print_verbose("Compression Window: %lld = %lldMB\n", control.window, control.window * 100ull);
+				print_verbose("Compression Level: %d\n", control.compression_level);
+			}
 		}
 		print_err("\n");
 	}
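
The reworked window selection described in the ChangeLog (map as much as possible, shrink until it succeeds, keep sizes page aligned) might look something like the sketch below. It is an assumption-laden illustration rather than the commit's implementation: `page_align` and `map_shrinking` are hypothetical names, rounding down and the roughly 10% shrink step are arbitrary choices made for the example.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <sys/mman.h>
#include <unistd.h>

/* Round size down to a whole number of pages, but never below one page. */
static int64_t page_align(int64_t size, int64_t page)
{
	int64_t aligned;

	assert(page > 0);
	aligned = size - (size % page);
	return aligned > 0 ? aligned : page;
}

/*
 * Try to map `want` bytes of fd read-only from offset 0. If mmap() fails,
 * retry with a window about 10% smaller (page aligned) until it succeeds or
 * even a single page cannot be mapped. The size actually mapped is stored
 * in *got. Returns NULL on total failure.
 */
static void *map_shrinking(int fd, int64_t want, int64_t page, int64_t *got)
{
	int64_t size = page_align(want, page);

	for (;;) {
		void *buf = mmap(NULL, (size_t)size, PROT_READ, MAP_SHARED, fd, 0);

		if (buf != MAP_FAILED) {
			*got = size;
			return buf;
		}
		if (size <= page) {
			perror("mmap");
			return NULL;
		}
		/* Always strictly smaller, so the loop terminates. */
		size = page_align(size - size / 10, page);
	}
}
```

A caller would pass the file size (or a -w limit, if one was given) as `want` and treat `*got` as the compression window actually in use.
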
--- a/man/lrzip.1
+++ b/man/lrzip.1
@ -1,4 +1,4 @@
-.TH "lrzip" "1" "May 2010" "" ""
+.TH "lrzip" "1" "November 2010" "" ""
 .SH "NAME"
 lrzip \- a large-file compression program
 .SH "SYNOPSIS"
@ -41,13 +41,14 @@ Here is a summary of the options to lrzip\&.
  \-D            delete existing files
  \-P            don't set permissions on output file. It may leave it world-readable
  \-q            don't show compression progress
-  \-L level      set lzma/bzip2/gzip compression level (1\-9, default 7)
+  \-L level      set rzip/lzma/bzip2/gzip compression level (1\-9, default 9)
  \-n            no backend compression. Prepare for other compressor
  \-l            lzo compression (ultra fast)
  \-b            bzip2 compression
  \-g            gzip compression using zlib
  \-z            zpaq compression (best, extreme compression, extremely slow)
-  \-M            Maximum window and level - (all available ram and level 9)
+  \-M            Maximum window (all available ram)
+  \-U            Use unlimited window size beyond ramsize (100x slower)
  \-T value      Compression threshold with LZO test. (0 (nil) - 10 (high), default 1)
  \-N value      Set nice value to value (default 19)
  \-v[v]         Increase verbosity
@ -73,29 +74,35 @@ Print the lrzip version number
 Increases verbosity. \-vv will print more messages than \-v.
 .IP
 .IP "\fB-w n\fP"
-Set the compression window size to n in hundreds of megabytes. This is the amount
-of memory lrzip will search during its first stage of pre-compression and is
-the main thing that will determine how much benefit lrzip will provide over
-ordinary compression with the 2nd stage algorithm. Because of buffers and
-compression overheads, the value chosen must be significantly smaller than
-your available ram or lrzip will induce a massive swap load. If not set
+Set the maximum allowable compression window size to n in hundreds of megabytes.
+This is the amount of memory lrzip will search during its first stage of
+pre-compression and is the main thing that will determine how much benefit lrzip
+will provide over ordinary compression with the 2nd stage algorithm. If not set
 (recommended), the value chosen will be determined by internal heuristic in
-lrzip which uses the most memory that is reasonable. It is limited to 2GB on
-32bit machines.
+lrzip which uses the most memory that is reasonable, without any hard upper
+limit. It is limited to 2GB on 32bit machines. lrzip will always reduce the
+window size to the biggest it can be without running out of memory.
 .IP
 .IP "\fB-L 1\&.\&.9\fP"
-Set the compression level from 1 to 9. The default is
-to use level 7, which is a reasonable compromise between speed and
-compression. The compression level is also strongly related to how much
-memory lrzip uses. See the \-w option for details.
+Set the compression level from 1 to 9. The default is to use level 9, which
+gives good all round compression. The compression level is also strongly related
+to how much memory lrzip uses. See the \-w option for details.
 .IP
 .IP "\fB-M \fP"
-Maximum compression\&. If this option is set, then lrzip ignores the heuristic
-mentioned for the default window and tries to set it to all available ram,
-and sets the compression level to maximum. This will cause a significant swap
-load on most machines, and may even fail without enough swap space allocated.
-Be prepared to walk away if you use this option. It is not recommended to use
-this as it hardly ever improves compression.
+Maximum window size\&. If this option is set, then lrzip tries to load the
+entire file into ram as one big compression window, and will reduce the size of
+the window until it does fit. This may induce a hefty swap load on your machine
+but can also give dramatic size advantages when your file is the size of your
+ram or larger. .IP
+.IP "\fB-U \fP"
+Unlimited window size\&. If this option is set, and the file being compressed
+does not fit into the available ram, lrzip will use a moving second buffer as a
+"sliding mmap" which emulates having infinite ram. This will provide the most
+possible compression in the first rzip stage which can improve the compression
+of ultra large files. However it also runs 100x slower than the regular first
+stage compression so it is worth trying the -M option first to see if the whole
+file can be accessed in one pass, and then if not, it should be used together
+with the -M option (if at all).
 .IP
 .IP "\fB-T 0\&.\&.10\fP"
 Sets the LZO compression threshold when testing a data chunk when slower
@ -192,15 +199,14 @@ if later blocks were compressible.
 .PP
 .SH "COMPRESSION ALGORITHM"
 .PP
-LRZIP operates in two stages. The first stage finds and encodes large
-chunks of duplicated data over potentially very long distances (limited
-only by your available ram) in the input file. The second stage is to
-use a compression algorithm to compress the output of the
-first stage. The compression algorithm can be chosen to be optimised
-for size (lzma - default), speed (lzo), legacy (bzip2) or (gzip)
-or can be omitted entirely doing only the first stage. A one stage only
-compressed file can almost always improve both the compression size and
-speed done by a subsequent compression program.
+LRZIP operates in two stages. The first stage finds and encodes large chunks of
+duplicated data over potentially very long distances (limited only by your
+available ram) in the input file. The second stage is to use a compression
+algorithm to compress the output of the first stage. The compression algorithm
+can be chosen to be optimised for extreme size (zpaq), size (lzma - default),
+speed (lzo), legacy (bzip2) or (gzip) or can be omitted entirely doing only the
+first stage. A one stage only compressed file can almost always improve both the
+compression size and speed done by a subsequent compression program.

 .PP
 The key difference between lrzip and other well known compression
@ -223,7 +229,7 @@ might achieve a much lower compression ratio than lrzip can achieve.
 .PP
 .SH "FILES"
 .PP
-LRZIP now recognizes a configuration file that contains default settings.
+LRZIP now recognises a configuration file that contains default settings.
 This configuration is searched for in the current directory, /etc/lrzip,
 and $HOME/.lrzip. The configuration filename must be \fBlrzip.conf\fP.
 .PP
--- a/rzip.c
+++ b/rzip.c
@ -69,7 +69,7 @@ struct rzip_state {
 	i64 hash_limit;
 	tag minimum_tag_mask;
 	i64 tag_clean_ptr;
-	uchar *last_match;
+	i64 last_match;
 	i64 chunk_size;
 	char chunk_bytes;
 	uint32_t cksum;
@ -86,6 +86,49 @@ struct rzip_state {
 	} stats;
 };

+struct sliding_buffer {
+	uchar *buf_low;	/* The low window buffer */
+	uchar *buf_high;/* "" high "" */
+	i64 orig_offset;/* Where the original buffer started */
+	i64 offset_high;/* The current offset of the high buffer */
+	i64 orig_size;	/* How big the full buffer would be */
+	i64 size_low;	/* How big the low buffer is */
+	i64 size_high;	/* How big the high buffer is */
+	int fd;		/* The fd of the mmap */
+} sb;	/* Sliding buffer */
+
+static void remap_high_sb(i64 p)
+{
+	if (munmap(sb.buf_high, sb.size_high) != 0)
+		fatal("Failed to munmap in remap_high_sb\n");
+	sb.size_high = 4096; /* In case we shrunk it when we hit the end of the file */
+	sb.offset_high = p;
+	if ((sb.offset_high + sb.orig_offset) % 4096)
+		sb.offset_high -= (sb.offset_high + sb.orig_offset) % 4096;
+	if (sb.offset_high + sb.size_high > sb.orig_size)
+		sb.size_high = sb.orig_size - sb.offset_high;
+	sb.buf_high = (uchar *)mmap(NULL, sb.size_high, PROT_READ, MAP_SHARED, sb.fd, sb.orig_offset + sb.offset_high);
+	if (sb.buf_high == MAP_FAILED)
+		fatal("Failed to re mmap in remap_high_sb\n");
+}
+
+/* We use a "sliding mmap" to effectively read more than we can fit into the
+ * compression window. This is done by using a maximally sized lower mmap at
+ * the beginning of the block, and a one-page-sized mmap block that slides up
+ * and down as is required for any offsets beyond the lower one. This is
+ * 100x slower than mmap but makes it possible to have unlimited sized
+ * compression windows. */
+static uchar *get_sb(i64 p)
+{
+	if (p < sb.size_low)
+		return (sb.buf_low + p);
+	if (p >= sb.offset_high && p < (sb.offset_high + sb.size_high))
+		return (sb.buf_high + (p - sb.offset_high));
+	/* p is beyond the low buffer and outside the current high buffer */
+	remap_high_sb(p);
+	return (sb.buf_high + (p - sb.offset_high));
+}
+
 static inline void put_u8(void *ss, int stream, uchar b)
 {
 	if (write_stream(ss, stream, &b, 1) != 0)
@ -117,7 +160,7 @@ static void put_header(void *ss, uchar head, i64 len)
 	put_vchars(ss, 0, len, 2);
 }

-static void put_match(struct rzip_state *st, uchar *p, uchar *buf, i64 offset, i64 len)
+static void put_match(struct rzip_state *st, i64 p, i64 offset, i64 len)
 {
 	do {
 		i64 ofs;
@ -125,7 +168,7 @@ static void put_match(struct rzip_state *st, uchar *p, uchar *buf, i64 offset, i
 		if (n > 0xFFFF)
 			n = 0xFFFF;

-		ofs = (p - (buf + offset));
+		ofs = (p - offset);
 		put_header(st->ss, 1, n);
 		put_vchars(st->ss, 0, ofs, st->chunk_bytes);

@ -137,10 +180,35 @@ static void put_match(struct rzip_state *st, uchar *p, uchar *buf, i64 offset, i
 	} while (len);
 }

-static void put_literal(struct rzip_state *st, uchar *last, uchar *p)
+/* Write some data to a stream, reading it through the sliding mmap. Return -1 on failure */
+int write_sbstream(void *ss, int stream, i64 p, i64 len)
+{
+	struct stream_info *sinfo = ss;
+
+	while (len) {
+		i64 n, i;
+
+		n = MIN(sinfo->bufsize - sinfo->s[stream].buflen, len);
+
+		for (i = 0; i < n; i++) {
+			memcpy(sinfo->s[stream].buf+sinfo->s[stream].buflen + i,
+			       get_sb(p + i), 1);
+		}
+		sinfo->s[stream].buflen += n;
+		p += n;
+		len -= n;
+		if (sinfo->s[stream].buflen == sinfo->bufsize) {
+			if (flush_buffer(sinfo, stream) != 0)
+				return -1;
+		}
+	}
+	return 0;
+}
+
+static void put_literal(struct rzip_state *st, i64 last, i64 p)
 {
 	do {
-		i64 len = (i64)(p - last);
+		i64 len = p - last;
 		if (len > 0xFFFF)
 			len = 0xFFFF;

@ -149,7 +217,7 @@ static void put_literal(struct rzip_state *st, uchar *last, uchar *p)

 		put_header(st->ss, 0, len);

-		if (len && write_stream(st->ss, 1, last, len) != 0)
+		if (len && write_sbstream(st->ss, 1, last, len) != 0)
 			fatal("Failed to write_stream in put_literal\n");
 		last += len;
 	} while (p > last);
@ -272,33 +340,33 @@ again:
 	goto again;
 }

-static inline tag next_tag(struct rzip_state *st, uchar *p, tag t)
+static inline tag next_tag(struct rzip_state *st, i64 p, tag t)
 {
-	t ^= st->hash_index[p[-1]];
-	t ^= st->hash_index[p[MINIMUM_MATCH - 1]];
+	t ^= st->hash_index[*get_sb(p - 1)];
+	t ^= st->hash_index[*get_sb(p + MINIMUM_MATCH - 1)];
 	return t;
 }

-static inline tag full_tag(struct rzip_state *st, uchar *p)
+static inline tag full_tag(struct rzip_state *st, i64 p)
 {
 	tag ret = 0;
 	int i;

 	for (i = 0; i < MINIMUM_MATCH; i++)
-		ret ^= st->hash_index[p[i]];
+		ret ^= st->hash_index[*get_sb(p + i)];
 	return ret;
 }

-static inline i64 match_len(struct rzip_state *st,
-			    uchar *p0, uchar *op, uchar *buf, uchar *end, i64 *rev)
+static inline i64 match_len(struct rzip_state *st, i64 p0, i64 op, i64 end,
+			    i64 *rev)
 {
-	uchar *p = p0;
+	i64 p = p0;
 	i64 len = 0;

 	if (op >= p0)
 		return 0;

-	while ((*p == *op) && (p < end)) {
+	while ((*get_sb(p) == *get_sb(op)) && (p < end)) {
 		p++;
 		op++;
 	}
@ -307,11 +375,11 @@ static inline i64 match_len(struct rzip_state *st,
 	p = p0;
 	op -= len;

-	end = buf;
+	end = 0;
 	if (end < st->last_match)
 		end = st->last_match;

-	while (p > end && op > buf && op[-1] == p[-1]) {
+	while (p > end && op > 0 && *get_sb(op - 1) == *get_sb(p-1)) {
 		op--;
 		p--;
 	}
@ -325,8 +393,7 @@ static inline i64 match_len(struct rzip_state *st,
 	return len;
 }

-static i64 find_best_match(struct rzip_state *st,
-			   tag t, uchar *p, uchar *buf, uchar *end,
+static i64 find_best_match(struct rzip_state *st, tag t, i64 p, i64 end,
 			   i64 *offset, i64 *reverse)
 {
 	i64 length = 0;
@ -343,8 +410,8 @@ static i64 find_best_match(struct rzip_state *st,
 		i64 mlen;

 		if (t == st->hash_table[h].t) {
-			mlen = match_len(st, p, buf+st->hash_table[h].offset,
-					 buf, end, &rev);
+			mlen = match_len(st, p, st->hash_table[h].offset, end,
+					 &rev);

 			if (mlen)
 				st->stats.tag_hits++;
@ -387,14 +454,13 @@ static void show_distrib(struct rzip_state *st)
 	       primary*100.0/total);
 }

-static void hash_search(struct rzip_state *st, uchar *buf,
-			double pct_base, double pct_multiple)
+static void hash_search(struct rzip_state *st, double pct_base, double pct_multiple)
 {
 	i64 cksum_limit = 0, pct, lastpct=0;
-	uchar *p, *end;
+	i64 p, end;
 	tag t = 0;
 	struct {
-		uchar *p;
+		i64 p;
 		i64 ofs;
 		i64 len;
 	} current;
@ -424,8 +490,8 @@ static void hash_search(struct rzip_state *st, uchar *buf,
 	st->cksum = 0;
 	st->hash_count = 0;

-	p = buf;
-	end = buf + st->chunk_size - MINIMUM_MATCH;
+	p = 0;
+	end = st->chunk_size - MINIMUM_MATCH;
 	st->last_match = p;
 	current.len = 0;
 	current.p = p;
@ -446,13 +512,13 @@ static void hash_search(struct rzip_state *st, uchar *buf,
 		if ((t & st->minimum_tag_mask) != st->minimum_tag_mask)
 			continue;

-		mlen = find_best_match(st, t, p, buf, end, &offset, &reverse);
+		mlen = find_best_match(st, t, p, end, &offset, &reverse);

 		/* Only insert occasionally into hash. */
 		if ((t & tag_mask) == tag_mask) {
 			st->stats.inserts++;
 			st->hash_count++;
-			insert_hash(st, t, (i64)(p - buf));
+			insert_hash(st, t, p);
 			if (st->hash_count > st->hash_limit)
 				tag_mask = clean_one_from_hash(st);
 		}
@ -467,29 +533,32 @@ static void hash_search(struct rzip_state *st, uchar *buf,
 		    && current.len >= MINIMUM_MATCH) {
 			if (st->last_match < current.p)
 				put_literal(st, st->last_match, current.p);
-			put_match(st, current.p, buf, current.ofs, current.len);
+			put_match(st, current.p, current.ofs, current.len);
 			st->last_match = current.p + current.len;
 			current.p = p = st->last_match;
 			current.len = 0;
 			t = full_tag(st, p);
 		}

-		if (SHOW_PROGRESS && (p - buf) % 100 == 0) {
-			pct = pct_base + (pct_multiple * (100.0 * (p - buf)) /
+		if (p % 100 == 0) {
+			pct = pct_base + (pct_multiple * (100.0 * p) /
 			      st->chunk_size);
 			if (pct != lastpct) {
 				struct stat s1, s2;

 				fstat(st->fd_in, &s1);
 				fstat(st->fd_out, &s2);
-				print_output("%2lld%%\r", pct);
+				if (!STDIN)
+					print_progress("%2lld%%\r", pct);
 				lastpct = pct;
 			}
 		}

-		if (p - buf > (i64)cksum_limit) {
-			i64 n = st->chunk_size - (p - buf);
-			st->cksum = CrcUpdate(st->cksum, buf + cksum_limit, n);
+		if (p > (i64)cksum_limit) {
+			i64 i, n = st->chunk_size - p;
+
+			for (i = 0; i < n; i++)
+				st->cksum = CrcUpdate(st->cksum, get_sb(cksum_limit + i), 1);
 			cksum_limit += n;
 		}
 	}
@ -498,16 +567,18 @@ static void hash_search(struct rzip_state *st, uchar *buf,
 	if (MAX_VERBOSE)
 		show_distrib(st);

-	if (st->last_match < buf + st->chunk_size)
-		put_literal(st, st->last_match, buf + st->chunk_size);
+	if (st->last_match < st->chunk_size)
+		put_literal(st, st->last_match, st->chunk_size);

 	if (st->chunk_size > cksum_limit) {
-		i64 n = st->chunk_size - cksum_limit;
-		st->cksum = CrcUpdate(st->cksum, buf+cksum_limit, n);
+		i64 i, n = st->chunk_size - cksum_limit;
+
+		for (i = 0; i < n; i++)
+			st->cksum = CrcUpdate(st->cksum, get_sb(cksum_limit + i), 1);
 		cksum_limit += n;
 	}

-	put_literal(st, NULL, 0);
+	put_literal(st, 0, 0);
 	put_u32(st->ss, 0, st->cksum);
 }

@ -558,6 +629,7 @@ static void mmap_stdin(uchar *buf, struct rzip_state *st)
 			if (buf == MAP_FAILED)
 				fatal("Failed to remap to smaller buf in mmap_stdin\n");
 			st->chunk_size = total;
+			control.st_size += total;
 			st->stdin_eof = 1;
 			break;
 		}
@ -566,60 +638,85 @@ static void mmap_stdin(uchar *buf, struct rzip_state *st)
 	}
 }

+static void init_sliding_mmap(struct rzip_state *st, int fd_in, i64 offset)
+{
+	i64 size = st->chunk_size;
+
+	print_verbose("Allocating sliding_mmap...\n");
+	sb.orig_offset = offset;
+retry:
+	/* Mmapping anonymously first will tell us how much ram we can use in
+	 * advance and zeroes it which has a defragmenting effect on ram
+	 * before the real read in. We can map a lot more file backed ram than
+	 * anonymous ram so do not do this preallocation in MAXRAM mode. Using
+	 * the larger mmapped window will cause a lot more ram thrashing of the
+	 * system so we do not use MAXRAM mode by default. */
+	if (!MAXRAM || STDIN) {
+		sb.buf_low = mmap(NULL, size, PROT_READ | PROT_WRITE, MAP_ANONYMOUS | MAP_PRIVATE, -1, 0);
+		/* Better to shrink the window to the largest size that works than fail */
+		if (sb.buf_low == MAP_FAILED) {
+			size = size / 10 * 9;
+			size -= size % 4096; /* Round to page size */
+			if (!size)
+				fatal("Unable to mmap any ram\n");
+			goto retry;
+		}
+		print_maxverbose("Succeeded in preallocating %lld sized mmap\n", size);
+		if (!STDIN) {
+			if (munmap(sb.buf_low, size) != 0)
+				fatal("Failed to munmap\n");
+		} else
+			st->chunk_size = size;
+	}
+	if (!STDIN) {
+		sb.buf_low = (uchar *)mmap(sb.buf_low, size, PROT_READ, MAP_SHARED, fd_in, offset);
+		if (sb.buf_low == MAP_FAILED) {
+			size = size / 10 * 9;
+			size -= size % 4096; /* Round to page size */
+			if (!size)
+				fatal("Unable to mmap any ram\n");
+			goto retry;
+		}
+	} else
+		mmap_stdin(sb.buf_low, st);
+	print_maxverbose("Succeeded in allocating %lld sized mmap\n", size);
+	if (size < st->chunk_size) {
+		if (UNLIMITED && !STDIN)
+			print_verbose("File is beyond window size, will proceed MUCH slower in unlimited mode beyond the window size\n");
+		else {
+			print_verbose("Needed to shrink window size to %lld\n", size);
+			st->chunk_size = size;
+		}
+	}
+	if (UNLIMITED && !STDIN) {
+		sb.buf_high = (uchar *)mmap(NULL, 4096, PROT_READ, MAP_SHARED, fd_in, offset);
+		if (sb.buf_high == MAP_FAILED)
+			fatal("Unable to mmap buf_high in init_sliding_mmap\n");
+		sb.size_high = 4096;
+		sb.offset_high = 0;
+	}
+	sb.size_low = size;
+	sb.orig_size = st->chunk_size;
+	sb.fd = fd_in;
+}
+
 /* compress a chunk of an open file. Assumes that the file is able to
   be mmap'd and is seekable */
 static void rzip_chunk(struct rzip_state *st, int fd_in, int fd_out, i64 offset,
-		       double pct_base, double pct_multiple, i64 limit)
+		       double pct_base, double pct_multiple)
 {
-	i64 prealloc_size = st->chunk_size;
-	uchar *buf = MAP_FAILED;
-
-	/* Mmapping first will tell us if we can allocate this much ram
-	 * faster than slowly reading in the file and then failing. Filling
-	 * it with zeroes has a defragmenting effect on ram before the real
-	 * read in. */
-	print_verbose("Preallocating ram...\n");
-	while (buf == MAP_FAILED) {
-		/* If we fail to mmap the full amount, it is worth trying to
-		 * mmap ever smaller sizes till we succeed as we may be able
-		 * to continue with file backed mmap in the presence of swap
-		 * and defragmentation */
-		buf = mmap(NULL, prealloc_size, PROT_READ | PROT_WRITE, MAP_ANONYMOUS | MAP_PRIVATE, -1, 0);
-		if (buf == MAP_FAILED) {
-			prealloc_size = prealloc_size / 10 * 9;
-			continue;
-		}
-		print_maxverbose("Preallocated %lld ram...\n", prealloc_size);
-		if (!STDIN) {
-			/* STDIN will use this already allocated ram */
-			if (munmap(buf, prealloc_size) != 0)
-				fatal("Failed to munmap in rzip_chunk\n");
-		} else
-			st->chunk_size = prealloc_size;
-	}
-	if (!STDIN) {
-		print_verbose("Reading file into mmapped ram...\n");
-retry:
-		buf = (uchar *)mmap(buf, st->chunk_size, PROT_READ, MAP_SHARED, fd_in, offset);
-		/* Better to shrink the window to the largest size that works than fail */
-		if (buf == MAP_FAILED) {
-			st->chunk_size = st->chunk_size / 10 * 9;
-			goto retry;
-		}
-		print_verbose("Mmapped %lld ram...\n", st->chunk_size);
-	} else {
-		/* We don't know how big the data will be so we add it up here */
-		mmap_stdin(buf, st);
-		control.st_size += st->chunk_size;
-	}
-
-	st->ss = open_stream_out(fd_out, NUM_STREAMS, limit);
+	init_sliding_mmap(st, fd_in, offset);
+	st->ss = open_stream_out(fd_out, NUM_STREAMS, st->chunk_size);
 	if (!st->ss)
 		fatal("Failed to open streams in rzip_chunk\n");
-	hash_search(st, buf, pct_base, pct_multiple);
+	hash_search(st, pct_base, pct_multiple);
 	/* unmap buffer before closing and reallocating streams */
-	if (munmap(buf, st->chunk_size) != 0)
+	if (munmap(sb.buf_low, sb.size_low) != 0)
 		fatal("Failed to munmap in rzip_chunk\n");
+	if (UNLIMITED && !STDIN) {
+		if (munmap(sb.buf_high, sb.size_high) != 0)
+			fatal("Failed to munmap in rzip_chunk\n");
+	}

 	if (close_stream_out(st->ss) != 0)
 		fatal("Failed to flush/close streams in rzip_chunk\n");
@ -661,9 +758,17 @@ void rzip_fd(int fd_in, int fd_out)
 	} else
 		control.st_size = 0;

-	chunk_window = control.window * CHUNK_MULTIPLE;
+	if (control.window)
+		chunk_window = control.window * CHUNK_MULTIPLE;
+	else {
+		if (STDIN)
+			chunk_window = control.ramsize;
+		else
+			chunk_window = len;
+	}
+	st->chunk_size = chunk_window;

-	st->level = &levels[MIN(9, control.window)];
+	st->level = &levels[control.compression_level];
 	st->fd_in = fd_in;
 	st->fd_out = fd_out;
 	st->stdin_eof = 0;
@ -678,7 +783,6 @@ void rzip_fd(int fd_in, int fd_out)

 	while (len > 0 || (STDIN && !st->stdin_eof)) {
 		double pct_base, pct_multiple;
-		i64 chunk, limit = 0;
 		int bits = 8;

 		/* Flushing the dirty data will decrease our chances of
@ -688,19 +792,16 @@ void rzip_fd(int fd_in, int fd_out)
 		if (last_chunk)
 			print_verbose("Flushing data to disk.\n");
 		fsync(fd_out);
-		chunk = chunk_window;
-		if (chunk > len && !STDIN)
-			chunk = len;
-		limit = chunk;
-		st->chunk_size = chunk;
-		print_maxverbose("Chunk size: %lld\n", chunk);
+		if (st->chunk_size > len && !STDIN)
+			st->chunk_size = len;
+		print_maxverbose("Chunk size: %lld\n", st->chunk_size);

 		/* Determine the chunk byte width and write it to the file
 		 * This allows archives of different chunk sizes to have
 		 * optimal byte width entries. When working with stdin we
 		 * won't know in advance how big it is so it will always be
 		 * rounded up to the window size. */
-		while (chunk >> bits > 0)
+		while (st->chunk_size >> bits > 0)
 			bits++;
 		st->chunk_bytes = bits / 8;
 		if (bits % 8)
@ -710,12 +811,12 @@ void rzip_fd(int fd_in, int fd_out)
 			fatal("Failed to write chunk_bytes size in rzip_fd\n");

 		pct_base = (100.0 * (s.st_size - len)) / s.st_size;
-		pct_multiple = ((double)chunk) / s.st_size;
+		pct_multiple = ((double)st->chunk_size) / s.st_size;
 		pass++;

 		gettimeofday(&current, NULL);
 		/* this will count only when size > window */
-		if (!STDIN && last.tv_sec > 0) {
+		if (last.tv_sec > 0) {
 			elapsed_time = current.tv_sec - start.tv_sec;
 			finish_time = elapsed_time / (pct_base / 100.0);
 			elapsed_hours = (unsigned int)(elapsed_time) / 3600;
@ -731,9 +832,10 @@ void rzip_fd(int fd_in, int fd_out)
 		}
 		last.tv_sec = current.tv_sec;
 		last.tv_usec = current.tv_usec;
-		last_chunk = chunk;
-		rzip_chunk(st, fd_in, fd_out, s.st_size - len, pct_base, pct_multiple, limit);
-		len -= chunk;
+		rzip_chunk(st, fd_in, fd_out, s.st_size - len, pct_base, pct_multiple);
+		/* st->chunk_size may be shrunk in rzip_chunk */
+		last_chunk = st->chunk_size;
+		len -= st->chunk_size;
 	}

 	gettimeofday(&current, NULL);
@ -751,13 +853,10 @@ void rzip_fd(int fd_in, int fd_out)
 	       (unsigned int)st->stats.inserts,
 	       (1.0 + st->stats.match_bytes) / st->stats.literal_bytes);

-	if (!STDIN) {
+	if (!STDIN)
 		print_progress("%s - ", control.infile);
-		print_progress("Compression Ratio: %.3f. Average Compression Speed: %6.3fMB/s.\n",
-		        1.0 * s.st_size / s2.st_size, chunkmbs);
-	}
+	print_progress("Compression Ratio: %.3f. Average Compression Speed: %6.3fMB/s.\n",
+		       1.0 * s.st_size / s2.st_size, chunkmbs);

-	if (st->hash_table)
-		free(st->hash_table);
 	free(st);
 }
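
The get_sb() lookup in the rzip.c changes above can be illustrated in isolation. The following is a minimal sketch, not the code from this commit: struct sliding_buf, sliding_open() and sliding_get() are hypothetical names, the page size is taken from sysconf() instead of the hard-coded 4096, and a failed remap returns NULL where the patch would presumably call fatal(). It assumes size_low is a multiple of the page size.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical stand-in for the global `sb` state used in the patch. */
struct sliding_buf {
	unsigned char *buf_low;  /* large mapping at the start of the chunk */
	unsigned char *buf_high; /* one-page mapping that slides on demand */
	int64_t size_low;        /* bytes covered by buf_low */
	int64_t offset_high;     /* chunk offset where buf_high currently starts */
	int64_t size_high;       /* bytes covered by buf_high (one page) */
	int fd;
};

/* Map the low window and an initial high page, both at file offset 0. */
static int sliding_open(struct sliding_buf *s, int fd, int64_t size_low)
{
	s->fd = fd;
	s->size_low = size_low;
	s->size_high = sysconf(_SC_PAGESIZE);
	s->offset_high = 0;
	s->buf_low = mmap(NULL, size_low, PROT_READ, MAP_SHARED, fd, 0);
	if (s->buf_low == MAP_FAILED)
		return -1;
	s->buf_high = mmap(NULL, s->size_high, PROT_READ, MAP_SHARED, fd, 0);
	if (s->buf_high == MAP_FAILED) {
		munmap(s->buf_low, size_low);
		return -1;
	}
	return 0;
}

/* Return a pointer to byte p, sliding the high page if p is in neither
 * window. Returns NULL if the remap fails. */
static unsigned char *sliding_get(struct sliding_buf *s, int64_t p)
{
	assert(p >= 0);
	if (p < s->size_low)
		return s->buf_low + p;
	if (p >= s->offset_high && p < s->offset_high + s->size_high)
		return s->buf_high + (p - s->offset_high);
	/* Miss: drop the old page and map the page-aligned region holding p. */
	if (munmap(s->buf_high, s->size_high) != 0)
		return NULL;
	s->offset_high = p - (p % s->size_high);
	s->buf_high = mmap(NULL, s->size_high, PROT_READ, MAP_SHARED, s->fd,
			   (off_t)s->offset_high);
	if (s->buf_high == MAP_FAILED)
		return NULL;
	return s->buf_high + (p - s->offset_high);
}
```

Every miss costs a munmap/mmap pair, which is where the slowdown described in the commit message comes from. Callers must keep p inside the file: touching a mapped page that lies wholly beyond end of file raises SIGBUS.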
--- a/rzip.h
+++ b/rzip.h
@ -19,7 +19,7 @@

 #define LRZIP_MAJOR_VERSION 0
 #define LRZIP_MINOR_VERSION 5
-#define LRZIP_MINOR_SUBVERSION 0
+#define LRZIP_MINOR_SUBVERSION 1

 #define NUM_STREAMS 2

@ -149,6 +149,8 @@ typedef uint32_t u32;
 #define FLAG_STDIN 16384
 #define FLAG_STDOUT 32768
 #define FLAG_INFO 65536
+#define FLAG_MAXRAM 131072
+#define FLAG_UNLIMITED 262144

 #define FLAG_VERBOSE (FLAG_VERBOSITY | FLAG_VERBOSITY_MAX)
 #define FLAG_NOT_LZMA (FLAG_NO_COMPRESS | FLAG_LZO_COMPRESS | FLAG_BZIP2_COMPRESS | FLAG_ZLIB_COMPRESS | FLAG_ZPAQ_COMPRESS)
@ -171,6 +173,8 @@ typedef uint32_t u32;
 #define STDIN		(control.flags & FLAG_STDIN)
 #define STDOUT		(control.flags & FLAG_STDOUT)
 #define INFO		(control.flags & FLAG_INFO)
+#define MAXRAM		(control.flags & FLAG_MAXRAM)
+#define UNLIMITED	(control.flags & FLAG_UNLIMITED)

 #define CTYPE_NONE 3
 #define CTYPE_BZIP2 4
@ -197,9 +201,24 @@ struct rzip_control {
 	int major_version;
 	int minor_version;
 	i64 st_size;
+} control;
+
+struct stream {
+	i64 last_head;
+	uchar *buf;
+	i64 buflen;
+	i64 bufp;
 };

-extern struct rzip_control control;
+struct stream_info {
+	struct stream *s;
+	int num_streams;
+	int fd;
+	i64 bufsize;
+	i64 cur_pos;
+	i64 initial_pos;
+	i64 total_read;
+};

 void fatal(const char *format, ...);
 void sighandler();
@ -212,11 +231,12 @@ int write_stream(void *ss, int stream, uchar *p, i64 len);
 i64 read_stream(void *ss, int stream, uchar *p, i64 len);
 int close_stream_out(void *ss);
 int close_stream_in(void *ss);
+int flush_buffer(struct stream_info *sinfo, int stream);
 void read_config(struct rzip_control *s);
 ssize_t write_1g(int fd, void *buf, i64 len);
 ssize_t read_1g(int fd, void *buf, i64 len);
-extern void zpipe_compress(FILE *in, FILE *out, FILE *msgout, long long int buf_len, int progress);
-extern void zpipe_decompress(FILE *in, FILE *out, FILE *msgout, long long int buf_len, int progress);
+void zpipe_compress(FILE *in, FILE *out, FILE *msgout, long long int buf_len, int progress);
+void zpipe_decompress(FILE *in, FILE *out, FILE *msgout, long long int buf_len, int progress);

 #define print_err(format, args...)	do {\
 	fprintf(stderr, format, ##args);	\
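
The MAXRAM and UNLIMITED flags added above gate the window sizing in init_sliding_mmap() in rzip.c, which retries a failed mmap with a window 10% smaller, rounded down to a page boundary. A minimal sketch of that retry policy follows, assuming a caller-supplied predicate in place of the real mmap attempt. shrink_window(), largest_window(), try_map_fn and fits_budget() are hypothetical names, and the page size is a parameter here, whereas the patch hard-codes 4096.

```c
#include <assert.h>
#include <stdint.h>

/* Shrink a failed window request by 10% and round it down to a multiple of
 * the page size, mirroring the `size / 10 * 9` step in the patch. Returns 0
 * once nothing usable is left. */
static int64_t shrink_window(int64_t size, int64_t page)
{
	assert(page > 0);
	size = size / 10 * 9;
	size -= size % page;
	return size;
}

/* Hypothetical callback standing in for the mmap() attempt: returns nonzero
 * if a window of `size` bytes could be mapped. */
typedef int (*try_map_fn)(int64_t size, void *ctx);

/* Keep shrinking until the attempt succeeds; 0 means no window fits. */
static int64_t largest_window(int64_t want, int64_t page, try_map_fn try_map,
			      void *ctx)
{
	while (want > 0 && !try_map(want, ctx))
		want = shrink_window(want, page);
	return want;
}

/* Example predicate for experiments: succeed only within a fixed budget. */
static int fits_budget(int64_t size, void *ctx)
{
	return size <= *(const int64_t *)ctx;
}
```

The patch treats a result of zero as fatal ("Unable to mmap any ram"). A caller of this sketch would need to make the same check.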
--- a/stream.c
+++ b/stream.c
@ -23,23 +23,6 @@

 #define STREAM_BUFSIZE (1024 * 1024 * 10)

-struct stream {
-	i64 last_head;
-	uchar *buf;
-	i64 buflen;
-	i64 bufp;
-};
-
-struct stream_info {
-	struct stream *s;
-	int num_streams;
-	int fd;
-	i64 bufsize;
-	i64 cur_pos;
-	i64 initial_pos;
-	i64 total_read;
-};
-
 /* just to keep things clean, declare function here
 * but move body to the end since it's a work function
 */
@ -590,6 +573,7 @@ void *open_stream_out(int f, int n, i64 limit)
 	if (!sinfo)
 		return NULL;

+	sinfo->bufsize = 0;
 	sinfo->num_streams = n;
 	sinfo->cur_pos = 0;
 	sinfo->fd = f;
@ -602,11 +586,11 @@ void *open_stream_out(int f, int n, i64 limit)
 	if (LZMA_COMPRESS) {
 		if (sizeof(long) == 4) {
 			/* Largest window supported on lzma 32bit is 600MB */
-			if (cwindow > 6)
+			if (!cwindow || cwindow > 6)
 				cwindow = 6;
 		}
 		/* Largest window supported on lzma 64bit is 4GB */
-		if (cwindow > 40)
+		if (!cwindow || cwindow > 40)
 			cwindow = 40;
 	}

@ -616,7 +600,9 @@ void *open_stream_out(int f, int n, i64 limit)
 		sinfo->bufsize = STREAM_BUFSIZE;

 	/* No point making the stream larger than the amount of data */
-	if (limit && limit < sinfo->bufsize)
+	if (sinfo->bufsize)
+		sinfo->bufsize = MIN(sinfo->bufsize, limit);
+	else
 		sinfo->bufsize = limit;
 	sinfo->initial_pos = lseek(f, 0, SEEK_CUR);

@ -741,7 +727,7 @@ failed:
 }

 /* flush out any data in a stream buffer. Return -1 on failure */
-static int flush_buffer(struct stream_info *sinfo, int stream)
+int flush_buffer(struct stream_info *sinfo, int stream)
 {
 	i64 c_len = sinfo->s[stream].buflen;
 	int c_type = CTYPE_NONE;
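
The reworked buffer sizing in open_stream_out() above reduces to a small rule: if a preferred buffer size was chosen, cap it at the amount of data; otherwise use the amount of data directly. A minimal sketch of that rule, with pick_bufsize() as a hypothetical helper name that does not exist in stream.c:

```c
#include <stdint.h>

/* Choose a stream buffer size. `preferred` is 0 when no size was picked
 * (bufsize is now zero-initialised in the patch); `limit` is the amount of
 * data in the chunk. */
static int64_t pick_bufsize(int64_t preferred, int64_t limit)
{
	if (preferred)
		return preferred < limit ? preferred : limit;
	return limit;
}
```

Unlike the old `limit && limit < bufsize` test, this form has no guard for a zero limit. That appears to rely on rzip_chunk() now passing st->chunk_size, which is nonzero for any chunk actually compressed.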
--- a/util.c
+++ b/util.c
@ -1,6 +1,7 @@
 /* 
   Copyright (C) Andrew Tridgell 1998
-   
+   Con Kolivas 2006-2010
+
   This program is free software; you can redistribute it and/or modify
   it under the terms of the GNU General Public License as published by
   the Free Software Foundation; either version 2 of the License, or
@ -51,23 +52,24 @@ void fatal(const char *format, ...)
 	}

 	/* Delete temporary files generated for testing or faking stdio */
-	if (control.flags & (FLAG_TEST_ONLY | FLAG_STDOUT))
+	if (TEST_ONLY || STDOUT)
 		unlink(control.outfile);

-	if (control.flags & FLAG_STDIN)
+	if (DECOMPRESS && STDIN)
 		unlink(control.infile);

-	fprintf(stderr, "Fatal error - exiting\n");
+	perror(NULL);
+	print_output("Fatal error - exiting\n");
 	exit(1);
 }

 void sighandler()
 {
 	/* Delete temporary files generated for testing or faking stdio */
-	if (control.flags & (FLAG_TEST_ONLY | FLAG_STDOUT))
+	if (TEST_ONLY || STDOUT)
 		unlink(control.outfile);

-	if (control.flags & FLAG_STDIN)
+	if (DECOMPRESS && STDIN)
 		unlink(control.infile);

 	exit(0);