From 28079083c15623931fbab814200ebc8ff929ada4 Mon Sep 17 00:00:00 2001
From: Con Kolivas
Date: Sun, 12 Dec 2010 10:46:22 +1100
Subject: [PATCH] Update docs.

---
 ChangeLog             | 25 +++++++++++++++++++++++++
 TODO                  |  6 ++++--
 WHATS-NEW             | 16 ++++++++++++++++
 doc/README.benchmarks | 12 ++++++------
 4 files changed, 51 insertions(+), 8 deletions(-)

diff --git a/ChangeLog b/ChangeLog
index 22e7639..2b2152a 100644
--- a/ChangeLog
+++ b/ChangeLog
@@ -1,4 +1,29 @@
 lrzip ChangeLog
+DECEMBER 2010, version 0.550 Con Kolivas
+* Move the threading on compression to higher up in the code, allowing the next
+stream to start using compression threads before the previous stream has
+finished. This speeds up compression on files that take more than one pass to
+compress.
+* Limit the number of threads decompressing stream 0 to just 1 since it's always
+followed by stream 1 chunks, and it may lead to failure to decompress due to
+running out of memory by running too many threads.
+* Default compression level and window size on lzma is set to 7 which is the
+highest it goes. Scale the 9 lrzip levels into 7, thus making the default lzma
+level 5 which uses a lot less memory and is substantially faster at the cost of
+some compression.
+* Rationalise the memory testing now that the default lzma settings use a lot
+less ram by default, and make all systems use no more than 1/3 ram in one mmap.
+This allows larger windows to be used by 32 bit at last without memory
+allocation errors.
+* Revert "Make threads spawn at regular intervals along chunk size thus speeding
+up compression" as it actually slowed it down instead of speeding it up.
+* Cope with compression/decompression threads failing by waiting till the
+previous thread has finished its work, thus serialising the work and using less
+ram, making success more likely.
+* Fix some dodgy callocs which weren't really allocating enough ram.
+* Destroy semaphores used in stream_in on closing the stream.
+* Minor output improvements.
+
 DECEMBER 2010, version 0.544 Con Kolivas
 * Make multiple stream 0 entry decompression more robust by creating separate
 thread groups for stream 0 and stream 1.
diff --git a/TODO b/TODO
index aaf9e69..18a3080 100644
--- a/TODO
+++ b/TODO
@@ -25,5 +25,7 @@ Make testing file integrity work without a temporary file.
 
 Fix darwin build since it doesn't support unnamed semamphores.
 
-Parallelise compression further by moving into next stream while backend
-threads still continue on earlier stream.
+Add error detection and correction with either reed-solomon or low density
+parity checking.
+
+Add password protection.
diff --git a/WHATS-NEW b/WHATS-NEW
index 0a379b7..473465a 100644
--- a/WHATS-NEW
+++ b/WHATS-NEW
@@ -1,3 +1,19 @@
+lrzip-0.550
+
+Speed up compression on large files that take more than one pass by overlapping
+work on successive streams, thus using multiple CPUs better.
+Fix for failures to decompress large files. Decompression will be slightly
+slower but more reliable.
+Faster lzma compression by default, less prone to memory failures, but at slight
+compression cost.
+Recover from multithreaded failures by serialising work that there isn't enough
+ram to do in parallel.
+Revert the "smooth out spacing" change in 0.544 as it slowed things down instead
+of speeding them up.
+Larger compression windows are back for 32 bits now that memory usage is kept
+under better control.
+Fixed some memory allocation issues which may have been causing subtle bugs.
+
 lrzip-0.544
 
 Hopefully a fix for corrupt decompression on large files with multiple stream 0
diff --git a/doc/README.benchmarks b/doc/README.benchmarks
index 81b9e3d..8e2eeac 100644
--- a/doc/README.benchmarks
+++ b/doc/README.benchmarks
@@ -94,7 +94,7 @@ system and some basic working software on it.
 
 The default options on the 10GB Virtual image:
 
-These benchmarks were done on the quad core with version 0.530
+These benchmarks were done on the quad core with version 0.550
 
 Compression  Size         Percentage  Compress Time  Decompress Time
 None         10737418240  100.0
@@ -102,10 +102,10 @@ gzip 2772899756 25.8 05m47s 2m46s
 bzip2        2704781700   25.2        16m15s         6m19s
 xz           2272322208   21.2        50m58s         3m52s
 7z           2242897134   20.9        26m36s         5m41s
-lrzip        1239219863   11.5        15m45s         3m07s
-lrzip -M     1079682231   10.1        12m03s         2m50s
-lrzip -l     1754694010   16.3        05m30s         2m23s
-lrzip -lM    1414958844   13.2        04m38s         2m20s
+lrzip        1372218189   12.8        11m03s         3m43s
+lrzip -M     1079682231   10.2        09m30s         3m02s
+lrzip -l     1831906483   17.1        05m38s         3m05s
+lrzip -lM    1414958844   13.2        05m24s         2m52s
 lrzip -zM    1066902006   9.9         71m20s         72m0s
 
 
@@ -116,7 +116,7 @@ scalability with multiple CPUs has a huge impact on compression time here, with
 zpaq almost being as fast on quad core as xz is, yet producing a file less than
 half the size.
 What appears to be a big disappointment is actually zpaq here which takes more
-than 6 times longer than lzma for a measly .2% improvement. The reason is that
+than 6 times longer than lzma for a measly .3% improvement. The reason is that
 most of the advantage here is achieved by the rzip first stage since there's a
 lot of redundant space over huge distances on a virtual image. The -M option
 which works the memory subsystem rather hard making noticeable impact on the