Compare commits

No commits in common. "master" and "v0.5.2" have entirely different histories.

87 changed files with 15748 additions and 16500 deletions


@ -1,29 +0,0 @@
name: check_build
on:
push:
branches: [ master ]
pull_request:
branches: [ master ]
jobs:
build:
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v2
- name: prepare repo
run: git fetch --prune --unshallow
- name: autogen
run: ./autogen.sh
- name: install liblzo2-dev
run: sudo apt install -y liblzo2-dev
- name: install liblz4-dev
run: sudo apt install -y liblz4-dev
- name: configure
run: ./configure
- name: make
run: make
- name: make check
run: make check

37
.gitignore vendored

@ -1,39 +1,2 @@
*~
*.o
*.lo
config.*
Makefile
Makefile.in
.deps
.libs
*.la
aclocal.m4
autom4te.cache/
configure
depcomp
install-sh
libtool
lrzip
lrzip*.tar.bz2
lrzip*.tar.gz
lrzip*.tar.lrz
ltmain.sh
missing
stamp-h1
libtool.m4
ltoptions.m4
ltsugar.m4
ltversion.m4
lt~obsolete.m4
compile
man/lrunzip.1
man/lrzcat.1
man/lrztar.1
man/lrzuntar.1
man/lrz.1
libzpaq/.dirstamp
lrzip.pc
regressiontest.out
decompress_demo
liblrzip_demo


@ -16,8 +16,3 @@ Jukka Laurila for newer Darwin support
George Makrydakis for lrztar, lrzuntar
Jari Aalto for documentation and typos and git help
Jon Tibble for nasm tests & Solaris support
Michael Blumenkrantz for updated autotools and liblrzip!
Serge Belyshev for encryption help and code
Ulrich Drepper for MD5 implementation
PolarSSL authors for sha512 + aes128 implementation
Fernando Auil for lrzip completion

9
BUGS

@ -1,8 +1,3 @@
BUGME
BUGME May 2010
Issues can be reported/tracked here:
https://github.com/ckolivas/lrzip/issues
Known issues:
Mac may not be able to work with STDIN/STDOUT on very large files.
MD5 is disabled on Mac due to not working properly.
Nil known.

42
COPYING

@ -1,12 +1,12 @@
GNU GENERAL PUBLIC LICENSE
Version 2, June 1991
GNU GENERAL PUBLIC LICENSE
Version 2, June 1991
Copyright (C) 1989, 1991 Free Software Foundation, Inc.,
51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
Copyright (C) 1989, 1991 Free Software Foundation, Inc.
675 Mass Ave, Cambridge, MA 02139, USA
Everyone is permitted to copy and distribute verbatim copies
of this license document, but changing it is not allowed.
Preamble
Preamble
The licenses for most software are designed to take away your
freedom to share and change it. By contrast, the GNU General Public
@ -15,7 +15,7 @@ software--to make sure the software is free for all its users. This
General Public License applies to most of the Free Software
Foundation's software and to any other program whose authors commit to
using it. (Some other Free Software Foundation software is covered by
the GNU Lesser General Public License instead.) You can apply it to
the GNU Library General Public License instead.) You can apply it to
your programs, too.
When we speak of free software, we are referring to freedom, not
@ -55,8 +55,8 @@ patent must be licensed for everyone's free use or not licensed at all.
The precise terms and conditions for copying, distribution and
modification follow.
GNU GENERAL PUBLIC LICENSE
GNU GENERAL PUBLIC LICENSE
TERMS AND CONDITIONS FOR COPYING, DISTRIBUTION AND MODIFICATION
0. This License applies to any program or other work which contains
@ -110,7 +110,7 @@ above, provided that you also meet all of these conditions:
License. (Exception: if the Program itself is interactive but
does not normally print such an announcement, your work based on
the Program is not required to print an announcement.)
These requirements apply to the modified work as a whole. If
identifiable sections of that work are not derived from the Program,
and can be reasonably considered independent and separate works in
@ -168,7 +168,7 @@ access to copy from a designated place, then offering equivalent
access to copy the source code from the same place counts as
distribution of the source code, even though third parties are not
compelled to copy the source along with the object code.
4. You may not copy, modify, sublicense, or distribute the Program
except as expressly provided under this License. Any attempt
otherwise to copy, modify, sublicense or distribute the Program is
@ -225,7 +225,7 @@ impose that choice.
This section is intended to make thoroughly clear what is believed to
be a consequence of the rest of this License.
8. If the distribution and/or use of the Program is restricted in
certain countries either by patents or by copyrighted interfaces, the
original copyright holder who places the Program under this License
@ -255,7 +255,7 @@ make exceptions for this. Our decision will be guided by the two goals
of preserving the free status of all derivatives of our free software and
of promoting the sharing and reuse of software generally.
NO WARRANTY
NO WARRANTY
11. BECAUSE THE PROGRAM IS LICENSED FREE OF CHARGE, THERE IS NO WARRANTY
FOR THE PROGRAM, TO THE EXTENT PERMITTED BY APPLICABLE LAW. EXCEPT WHEN
@ -277,9 +277,9 @@ YOU OR THIRD PARTIES OR A FAILURE OF THE PROGRAM TO OPERATE WITH ANY OTHER
PROGRAMS), EVEN IF SUCH HOLDER OR OTHER PARTY HAS BEEN ADVISED OF THE
POSSIBILITY OF SUCH DAMAGES.
END OF TERMS AND CONDITIONS
How to Apply These Terms to Your New Programs
END OF TERMS AND CONDITIONS
Appendix: How to Apply These Terms to Your New Programs
If you develop a new program, and you want it to be of the greatest
possible use to the public, the best way to achieve this is to make it
@ -291,7 +291,7 @@ convey the exclusion of warranty; and each file should have at least
the "copyright" line and a pointer to where the full notice is found.
<one line to give the program's name and a brief idea of what it does.>
Copyright (C) <year> <name of author>
Copyright (C) 19yy <name of author>
This program is free software; you can redistribute it and/or modify
it under the terms of the GNU General Public License as published by
@ -303,16 +303,16 @@ the "copyright" line and a pointer to where the full notice is found.
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
GNU General Public License for more details.
You should have received a copy of the GNU General Public License along
with this program; if not, write to the Free Software Foundation, Inc.,
51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA.
You should have received a copy of the GNU General Public License
along with this program; if not, write to the Free Software
Foundation, Inc., 675 Mass Ave, Cambridge, MA 02139, USA.
Also add information on how to contact you by electronic and paper mail.
If the program is interactive, make it output a short notice like this
when it starts in an interactive mode:
Gnomovision version 69, Copyright (C) year name of author
Gnomovision version 69, Copyright (C) 19yy name of author
Gnomovision comes with ABSOLUTELY NO WARRANTY; for details type `show w'.
This is free software, and you are welcome to redistribute it
under certain conditions; type `show c' for details.
@ -335,5 +335,5 @@ necessary. Here is a sample; alter the names:
This General Public License does not permit incorporating your program into
proprietary programs. If your program is a subroutine library, you may
consider it more useful to permit linking proprietary applications with the
library. If this is what you want to do, use the GNU Lesser General
library. If this is what you want to do, use the GNU Library General
Public License instead of this License.

432
ChangeLog

@ -1 +1,431 @@
Check git for changelog: https://github.com/ckolivas/lrzip/commits/master
lrzip ChangeLog
NOVEMBER 2010, version 0.5.2 Con Kolivas
* Fixed the Darwin build, again.
* Fixed cases of extreme ram usage on 32 bit failing by limiting zpaq to 600MB
windows as well.
* Check page size if we can instead of assuming it's always 4k.
* Improve the progress output.
* Change failure to chmod and failure to set nice level to warnings only.
* Standardise what's a stderr message and what's output.
NOVEMBER 2010, version 0.5.1 Con Kolivas
* Fix Darwin build - Darwin doesn't support mremap so introduce a fake wrapper
for it.
* Fix the memopen routines, a wrongly implemented wrapper for Darwin equivalents
was also using the faked versions on all builds.
* Fix dodgy ordered includes.
* Clean up excessive use of #ifdefs
* Huge rewrite of buffer reading in rzip.c. We use a wrapper instead of
accessing the buffer directly, thus allowing us to have window sizes larger than
available ram. This is implemented through the use of a "sliding mmap"
implementation. Sliding mmap uses two mmapped buffers, one large one as
previously, and one smaller one. When an attempt is made to read beyond the end
of the large buffer, the small buffer is remapped to the file area that's being
accessed, while the larger one is remapped as the search progresses along the
file. While this implementation is potentially much slower than direct mmapping,
it allows us to implement unlimited sized compression windows.
* Implement the -U option with unlimited sized compression windows.
* Rework the selection of compression windows. Instead of trying to guess how
much ram the machine might be able to access, we try to safely buffer as much
ram as we can, and then use that to determine the file buffer size. Do not
choose an arbitrary upper window limit unless -w is specified.
* Rework the -M option to try to buffer the entire file, reducing the buffer
size until we succeed.
* Align buffer sizes to page size.
* Clean up lots of unneeded variables.
* Fix lots of minor logic issues to do with window sizes accepted/passed to rzip
and the compression backends.
* More error handling.
* Change -L to affect rzip compression level directly as well as backend
compression level and use 9 by default now.
* Fix file size reporting on compressed files generated from stdin.
* More cleanups of information output and more information.
* Add chunk percentage to progress update.
* Reinstated the 2GB buffer limit on 32 bit machines during compression, though
the -U mode can work around it now.
* Code micro-optimisations.
* Use 3 point release numbering in case one minor version has many subversions.
* Numerous minor cleanups and tidying.
* Updated docs, manpages, and benchmarks.
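
The sliding mmap idea in the 0.5.1 entry above can be illustrated with a minimal sketch. It is written in Python rather than lrzip's C, and it models only the small movable window, not the large buffer that advances with the search. The names `SlidingReader`, `get_byte`, and `read_through_window` are hypothetical and chosen for illustration; they are not taken from rzip.c.

```python
import mmap
import tempfile


class SlidingReader:
    """Read bytes from a file through one small mmap window, remapped on demand."""

    def __init__(self, fileno, file_size, window_size):
        gran = mmap.ALLOCATIONGRANULARITY
        # mmap offsets must be multiples of the allocation granularity,
        # so round the window size up to a multiple of it.
        self.window_size = -(-window_size // gran) * gran
        self.fileno = fileno
        self.file_size = file_size
        self.base = 0
        self.window = None
        self.remaps = 0

    def get_byte(self, offset):
        """Return the byte at `offset`, moving the window if it is not covered."""
        if not 0 <= offset < self.file_size:
            raise IndexError(offset)
        inside = (self.window is not None
                  and self.base <= offset < self.base + len(self.window))
        if not inside:
            self._remap(offset)
        return self.window[offset - self.base]

    def _remap(self, offset):
        # Drop the old mapping and map the aligned region containing `offset`.
        if self.window is not None:
            self.window.close()
        self.base = offset - offset % self.window_size
        length = min(self.window_size, self.file_size - self.base)
        self.window = mmap.mmap(self.fileno, length,
                                access=mmap.ACCESS_READ, offset=self.base)
        self.remaps += 1

    def close(self):
        if self.window is not None:
            self.window.close()
            self.window = None


def read_through_window(data, window_size, offsets):
    """Write `data` to a temporary file and read `offsets` through a SlidingReader."""
    with tempfile.TemporaryFile() as f:
        f.write(data)
        f.flush()
        reader = SlidingReader(f.fileno(), len(data), window_size)
        try:
            values = [reader.get_byte(o) for o in offsets]
            return values, reader.remaps
        finally:
            reader.close()
```

Every access outside the current window costs a remap, which is why the entry describes this approach as potentially much slower than direct mmapping while removing the limit on window size.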
NOVEMBER 2010, version 0.5 Con Kolivas
* Changed offset encoding in rzip stage to use variable byte width offsets
instead of 64 bits wide. Makes for better compression and slightly faster.
* Write the byte width into the file before each block.
* Shrunk match lengths to maximum of 16 bits again as per original rzip as the
larger offsets did not achieve greater compression and made final size larger.
* New file format not backwards compatible due to variable byte widths.
* Rewrote memory initialisation to have a pre-allocation stage to try and
find the maximum memory usable and defragment ram.
* Use reduced window size if allocating memory fails at higher size.
* Change use of malloc to mmap to make it possible to address up to 44 bit
sized offsets even on 32 bit machines on decompression. Still unable to use
greater than 2GB windows on 32 bit machines and unsure if this is fixable.
* Reworked the STDIN code to use an anonymous mmap and read in stdin into this
to make it possible to compress from STDIN without the need for temporary
files. As the file size is not known in advance, memory allocation is set to
large and byte width to equivalent size.
* Reallocation of ram where possible to minimise risk of running out of memory
in the middle of a compression phase, and flushing to disk to empty dirty ram
for the same reason.
* More robust fatal warnings.
* Numerous cleanups and tidying of code and addition of comments.
* Updated documentation to reflect changes.
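
As a rough illustration of the variable byte width offsets introduced in 0.5, the sketch below picks the smallest whole number of bytes that can hold any offset in the data, stores that width once, and then writes every offset in exactly that many bytes. It is Python, not lrzip's C; the function names, the little-endian byte order, and the single leading width byte are assumptions made for the example and do not describe the actual lrz file layout.

```python
def byte_width(max_offset):
    """Smallest number of whole bytes that can hold max_offset (at least 1)."""
    return max(1, (max_offset.bit_length() + 7) // 8)


def encode_offsets(offsets, file_size):
    """Store the byte width once, then each offset in exactly that many bytes."""
    width = byte_width(file_size)
    out = bytearray([width])
    for off in offsets:
        out += off.to_bytes(width, "little")
    return bytes(out)


def decode_offsets(blob):
    """Read the width byte, then split the rest into fixed-width offsets."""
    width = blob[0]
    body = blob[1:]
    return [int.from_bytes(body[i:i + width], "little")
            for i in range(0, len(body), width)]
```

For a 100000-byte input each offset fits in 3 bytes, so two offsets take 7 bytes including the width byte, compared with 16 bytes for two fixed 64-bit offsets.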
OCTOBER 2010, version 0.47, Con Kolivas
* Fix the symlinking problem when DESTDIR is in use reported by a billion
people.
MAY 2010, version 0.46, Con Kolivas, Ed Avis.
* Suppress final [OK] message with -q flag EA
* Handle mkstemp() errors correctly EA
* Add lrzuntar manpage
* Update manpages
APRIL 2010, version 0.45, Con Kolivas, Jon Tibble, George Makrydakis
* Fixes the nasm program test (AC_CHECK_PROG doesn't overwrite a
variable that is already set so do it manually) JT
* Fix compiler flags as not all compilers accept -Wall -W (cc on
Solaris/OpenSolaris) JT
* Fix lrztar to not try to compress files already with the .lrz extension GM
* Fix lrztar to decompress files where the pathname is ../* GM
* Add lrzuntar symlink to call lrztar -d
MAR 2010, version 0.45, Con Kolivas, Jari Aalto
* Fixed reported window size
* Fixed 32bit windows being attempted to be larger than contiguous amounts
by taking into account VM kernel/userspace split of 896MB.
* Minor code cleanups
* Added lrztar and lrunzip docs
* Fix minor typos
* Added distclean and maintainer-clean make targets
DEC 2009, version 0.44, Con Kolivas, George Makrydakis
* Added lrztar wrapper to manage whole directories.
* Added -i option to provide information about a compressed file.
* Fixed "nan" showing as Compression speed on very small files.
* Fixed build for old bz library.
* Avoid overwriting output file if input doesn't exist.
* Implement signal handler to delete temporary files.
DEC 2009, version 0.43, Con Kolivas, Jukka Laurila
* Darwin support thanks to Jukka Laurila.
* Finally added stdin/stdout support due to popular demand. This is done
by basically using temporary files so is a low performance way of using
lrzip.
* Added test function. This just uses a temporary file during decompression.
* Config files should now accept zpaq options.
* Minor code style cleanups.
* Updated benchmarks in docs.
* Add a warning when attempting to decompress a file from a newer lrzip
version.
NOV 2009, version 0.42, Con Kolivas
* Changed progress update to show which of 2 chunks are being compressed
in zpaq.
* Fixed progress update in ZPAQ to not update with each byte which was
wasting heaps of CPU time.
NOV 2009, version 0.41, Con Kolivas
* Added zpaq compression backend for extremely good but extremely slow
compression (incompatible with previous versions if used).
* Limited chunk size passed to LZMA to 4GB to avoid library overflows.
* Minor changes to the formatting output
* Changed lower limit of -T threshold to 0 to allow disabling it.
* Added lzo_compresses check into zpaq and bzip2 as well since they're
slow.
NOV 2009, version 0.40, Con Kolivas
* Massive core code rewrite.
* All code moved to be 64bit based for compression block addressing and length
allowing compression windows to be limited by ram only.
* 64bit userspace should now have no restriction on compression window size,
32bit is still limited to 2GB windows due to userspace limitations.
* New file format using the new addressing and data types, incompatible with
versions prior to 0.40.
* Support for reading and decompressing older formats.
* Minor speedups in read/write routines.
* Countless minor code fixes throughout.
* Code style cleanups and consistency changes in core code.
* Configure script improvements.
NOV 2009, version 0.31, Con Kolivas
* Updated to be in sync with lzma SDK 9.07beta.
* Cleanups and fixes of the configure scripts to use the correct package version
name.
* Massive fixes to the memory management code fixing lots of 32bit overflow
errors. The window size limit is now 2GB on both 32bit and 64bit. While it
appears to be smaller than the old windows, only 900MB was being used on .30
even though it claimed to use more. This can cause huge improvements in the
compression of very large files.
* The offset when mmap()ing was not being set to a multiple of page size so
it would fail if the window size was not a multiple of it.
* Flushing of data to disk between compression windows was implemented to
minimise disk thrashing of read vs write.
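
The mmap offset fix in the 0.31 entry comes down to simple arithmetic: the mapping has to start on a page boundary, so the requested offset is rounded down and the difference is skipped inside the mapping. A minimal Python sketch follows; the function name and the returned tuple layout are hypothetical and not taken from lrzip.

```python
import mmap


def aligned_mapping(offset, length, page_size=mmap.ALLOCATIONGRANULARITY):
    """Return (base, map_length, delta) for mapping `length` bytes at `offset`.

    base       -- offset rounded down to a multiple of page_size
    map_length -- bytes to map starting at base so the wanted range is covered
    delta      -- position of the wanted data inside the mapping
    """
    base = offset - (offset % page_size)
    delta = offset - base
    return base, delta + length, delta
```

With a 4096-byte page, a request for 100 bytes at offset 10000 maps 1908 bytes starting at 8192 and reads from position 1808 within the mapping.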
NOV 2009, version 0.30, Con Kolivas
* Numerous bugfixes to try and make the most of 64bit environments with huge
memory and to barf less on 32bit environments.
* Executable stacks were fixed.
* Probably other weird and wonderful bugs have been introduced.
* -P option to not set permissions on output files allowing you to write to
braindead filesystems (eg fat32).
JAN 2009, version 0.24, Peter Hyman, pete@peterhyman.com
Happy New Year!
* Upgrade LZMA SDK to 4.63. Use new C Wrapper. Invalidates
LZMA archives created earlier due to new Magic property
bytes.
* New LZMA logic will allow LZMA
code to automatically determine optimal lc, lp, pb, fb, and dictionary
size settings. stream.c will only pass level and thread
information. Compress function will return encoded 5 byte
data with compression settings. This will be stored in lrz
file header.
* add error messages during LZMA compression. There are some
edge cases where LZMA cannot allocate memory. These errors
are reported and the user will be advised to use a lower
compression window setting.
* type changes in rzip_fd function for correctness.
* remove function *Realloc() since it was never used. Cleaned
in rzip.h and util.c.
* apply munmap prior to closing and compressing stream in
function rzip_chunk in rzip.c.
* add realloc function in close_stream_out in stream.c
to reclaim some ram and try and alleviate out of memory
conditions in LZMA compression.
* remove file acconfig.h and include DEFINE in configure.in.
* add lrzip.conf capability.
* add timer for compression including elapsed time and eta.
* add compression and decompression MB/s calculation.
* Updated WHATS-NEW, TODO and created BUGS file.
* Updated lrzip.1 manpage and created lrzip.conf.5 manpage.
* Added lrzip.conf.example file in doc directory.
MAR 2008, Con Kolivas, kernel@kolivas.org
* Numerous changes all over to place restrictions on window
size to work with 32 bit limitations.
* Various bugfixes with respect to detecting buffer sizes and
likelihood of compressibility.
* Fixed the inappropriate straight copying uncompressed data for
files larger than 4GB.
* Re-initiated the 10MB window limits for non-lzma compression.
I was unable to reproduce any file size savings.
* Allow compression windows larger than ramsize if people really
really want them.
* Decrease thresholds for the test function to a minimum of 5%
compressibility since the hanging in lzma compression bug has been
fixed.
JAN 2008, version 0.22, Peter Hyman, pete@peterhyman.com
* version update
lzma/LZMALib.cpp
Thanks to Lasse Collin for debugging the problem LZMA
had with hanging on uncompressible files.
Update for control parameters to both compress and
decompress functions.
Makefile.in
* use of @top_srcdir@ (Lasse Collin). Also moved away
more cruft.
main.c stream.c rzip.h LZMALib.cpp lzmalib.h
* addition of three new control structure members.
control.lc -- literal context bits
control.lp -- literal post state bits
control.pb -- post state bits
These are needed to ensure decompression will work.
These will now be stored along with control.compression_level
in the lrz file beginning at offset 0x16 for three bytes.
These will be passed to the functions lzma_compresses and
lzma_uncompress. Currently, only compression level is
needed or used, but the others are stored for possible future
use.
See magic file for more information.
stream.c
* Change to lzo_compresses function that will reject a chunk
without testing it if the size of the chunk is greater
than the compression window * threshold. This is to avoid
a low probability that lzma would still be passed a chunk
that contains uncompressible data or barely compressible
data. If after rzip hashing the chunk size is still close
to the window size, there is hardly anything worth
compressing. While there is no reason lzma cannot get the
chunk, this will save a lot of time.
magic.headers.txt
* updated file to show new layout that includes lzma
parameters.
README-NOT-BACKWARD-COMPATIBLE
* added warning about using lrzip-0.22 with earlier versions.
WHATS-NEW
* highlight of new features.
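
The early rejection added to lzo_compresses in 0.22 can be sketched as a single comparison. This is an illustrative Python fragment, not the stream.c code; the function name is hypothetical, and treating the threshold as a fraction of the window size between 0 and 1 is an assumption.

```python
def worth_testing(chunk_len, window_size, threshold):
    """Return False when the chunk is larger than window_size * threshold.

    A chunk that is still nearly as large as the window after rzip hashing
    is assumed to have little left to compress, so the test pass is skipped.
    """
    return chunk_len <= window_size * threshold
```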
DEC 2007, version 0.21. Peter Hyman, pete@peterhyman.com
* version update.
* Modified to use Assembler routines from lzma SDK for CRC
computation when hashing streams in rzip.c and runzip.c.
Added files 7zCrcT8.c and 7zCrcT8u.s to lzma tree.
Cleaned up source tree. Moved unused files out of the way.
Moved non-core docs to doc directory
configure.in
* correct AC_INIT to set program variables.
* modified to add check for nasm assembler.
* modified syntax of test for errno in error.h to use
echo $ECHO_N/$ECHO_C instead of $ac_n/$ac_c which
was incorrect.
Makefile.in, lzma/Makefile
* modified to add compile instructions for 7zCrcT8.c
and 7zCrcT8U.s and Assembler. Cleaned up to remove
targets that don't exist or sources that don't
exist.
Modified to properly set directories. Added doc install.
Add link command to symlink lrunzip to lrzip.
*main.c
Add CrcGenerateTable() function to init CRC tables.
This is needed for all crc routines including those
in MatchFinderMT.
rzip.c and runzip.c
* Updated source to change call to crc32_buffer to call
CrcUpdate in the assembler code. Changed parameter order
to conform.
stream.c
* Removed 10MB limit on streams for bzip, gzip, and lzo.
This, to improve efficiency of long range analysis. For
some files, this could improve results.
Current-Benchmarks.txt
* Added file to keep benchmarks current to version.
(probably need to update README too).
README.Assembler
* Explain how to remove default compile of Assembler
modules.
config.sub config.guess
* added files for system detection.
DEC 2007, version 0.20. Peter Hyman, pete@peterhyman.com
* Updated to LZMA SDK 4.57.
* Updated to p7zip POSIX version. (www.p7zip.org)
* Added multi-threading support (up to 2x speed with LZMA).
* Edited LZMADecompress.cpp for backward compatibility
with decompress function. Needed SetPropertiesRaw function.
* Repopulated source tree for distribution.
* Updated Makefile.in to reflect new source files.
Updated to include command to link lrunzip to lrzip because
lrzip will test if lrunzip was used on command line.
* Updated Makefile.in for new compile time and linking options.
* Updated LZMALibs.cpp to include new property members for
LZMAEncoders as well as changed default dictionaries to
level+16. This would make the default compression level
of 7 translate to a dictionary number of 23.
* Added output to show Nice Level when verbose mode set
Initial add of support for zlib which seems to give quite
excellent performance.
* configure.in added AC_CHECK for libz and libm.
Added AC_PROG_LN_S for Makefile symlink section.
* lrzip.1 updated man page for -g option
* main.c added option test for gzip
Added sysconf(_SC_NPROCESSORS_CONF) for CPU detection
for threading.
Updated verbose output to show whether or not
Threading will be used.
Added Timer for each file compressed.
* rzip.h added flags for GZIP compression.
Added control member for threads. Arg passed to
lzma_compress.
* stream.c update to accommodate gzip compress and decompress
functions. Cleaned up file by rearranging functions into
groups.
Removed include of lzmalib.h since it was causing a
compile time warning with zlib.h. Prototyped functions
manually.
Cleanup output from lzo_compresses function so that
unnecessary linefeeds are eliminated.
lzma_compress function call now uses threads as argument.
* Added README.benchmarks file to explain a method of
comparing results between different methods.
* LZMALib.cpp, lzmalib.h. Adjust function lzma_compress
prototype and function to include new argument threads.
This parameter is now placed in properties.
* lzma/Makefile. Updated to reflect new API library.
Updated to include Threading option.
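
The "level+16" dictionary rule from the 0.20 entry can be worked through in a couple of lines. The sketch assumes that the "dictionary number" is the base-2 logarithm of the dictionary size in bytes, which the ChangeLog does not state explicitly; the function names are hypothetical.

```python
def lzma_dictionary_bits(level):
    """Dictionary number for a compression level, per the level+16 rule."""
    return level + 16


def lzma_dictionary_bytes(level):
    """Dictionary size in bytes, assuming the number is a power-of-two exponent."""
    return 1 << lzma_dictionary_bits(level)
```

Under that assumption the default level of 7 gives a dictionary number of 23, or 8 MiB.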
DEC 2007, version 0.19. Con Kolivas.
* Added nice support, defaulting to nice 19.
DEC 2007, version 0.19. Peter Hyman, pete@peterhyman.com
* Major goal was to stop LZMA from hanging on some files.
Accomplished this with a threshold setting that is used by
the lzo_compresses function to better analyze chunk data.
Threshold makes it less likely that uncompressible data
will be passed to the LZMA compressor.
main.c
* Added Threshold option 1-10 to control LZMA compression attempt.
Default value=2. This means that anything over 10% compression
as reported by lzo_compresses will return a true value to
the LZMA compression function.
* Added verbosity option and more verbosity option (-v[v]).
* Added -O option to specify output directory.
* Updated compress_file and decompress_file functions to handle
output directories and better handle multi files and filename
extensions. Optimized some string handling routines.
Improved flexibility in determining location of output files
when using -O. Added fflush(stdout) to improve printf reliability.
* decompress_file will accept any filename and will automatically
append .lrz if not present. Won't automatically fail.
* Added logic to protect against conflicting options such as
-q and -v, -o and -O.
* Added printout to screen of options selected. Will display
only when -v or -vv used.
* Adjusted several printf statements to avoid compiler
warnings (use %ll for long long int types).
runzip.c
* Added decompression progress indicator.
Will show percent decompressed along with bytes decompressed
and total to be decompressed. Will show if -q option NOT used.
rzip.h
* Version incremented to 0.19.
* Added flag DEFINESs for verbosity and more verbosity.
* Updated control struct to include output directory and
threshold value. Removed verbosity member.
rzip.c
* Minor changes to handle display when verbosity set. Changed
number format in some printf statements to properly handle
unsigned data.
stream.c
* major overhaul of lzo_compresses function to use a threshold
value when testing a data chunk to see if it is suitable for
LZMA compression. Optimized test loop to improve performance
and reduce number of passes. Improved output reporting depending
on verbosity setting.
* Added print controls for verbosity option.
* Corrected if statements that tested for error condition of
some lzo functions that only return a true value regardless.
lrzip.1
* updated man page to show new options and explain -T threshold.
README
* updated README to explain -T threshold option.
README.lzo_compresses.test.txt
* Added this file to help explain the theory behind the rewrite
of the lzo_compresses function and how to use the -T option.
TODO
* wish list and future enhancements.
ChangeLog
* added file.


@ -1,14 +0,0 @@
FROM alpine as builder
RUN apk add --update git autoconf automake libtool gcc musl-dev zlib-dev bzip2-dev lzo-dev coreutils make g++ lz4-dev && \
git clone https://github.com/ckolivas/lrzip.git && \
cd /lrzip && ./autogen.sh && ./configure && make -j `nproc` && make install
FROM alpine
RUN apk add --update --no-cache lzo libbz2 libstdc++ lz4-dev && \
rm -rf /tmp/* /var/tmp/*
COPY --from=builder /usr/local/bin/lrzip /usr/local/bin/lrzip
CMD ["/bin/sh"]

365
INSTALL

@ -1,365 +0,0 @@
Installation Instructions
*************************
Copyright (C) 1994, 1995, 1996, 1999, 2000, 2001, 2002, 2004, 2005,
2006, 2007, 2008, 2009 Free Software Foundation, Inc.
Copying and distribution of this file, with or without modification,
are permitted in any medium without royalty provided the copyright
notice and this notice are preserved. This file is offered as-is,
without warranty of any kind.
Basic Installation
==================
Briefly, the shell commands `./configure; make; make install' should
configure, build, and install this package. The following
more-detailed instructions are generic; see the `README' file for
instructions specific to this package. Some packages provide this
`INSTALL' file but do not implement all of the features documented
below. The lack of an optional feature in a given package is not
necessarily a bug. More recommendations for GNU packages can be found
in *note Makefile Conventions: (standards)Makefile Conventions.
The `configure' shell script attempts to guess correct values for
various system-dependent variables used during compilation. It uses
those values to create a `Makefile' in each directory of the package.
It may also create one or more `.h' files containing system-dependent
definitions. Finally, it creates a shell script `config.status' that
you can run in the future to recreate the current configuration, and a
file `config.log' containing compiler output (useful mainly for
debugging `configure').
It can also use an optional file (typically called `config.cache'
and enabled with `--cache-file=config.cache' or simply `-C') that saves
the results of its tests to speed up reconfiguring. Caching is
disabled by default to prevent problems with accidental use of stale
cache files.
If you need to do unusual things to compile the package, please try
to figure out how `configure' could check whether to do them, and mail
diffs or instructions to the address given in the `README' so they can
be considered for the next release. If you are using the cache, and at
some point `config.cache' contains results you don't want to keep, you
may remove or edit it.
The file `configure.ac' (or `configure.in') is used to create
`configure' by a program called `autoconf'. You need `configure.ac' if
you want to change it or regenerate `configure' using a newer version
of `autoconf'.
The simplest way to compile this package is:
1. `cd' to the directory containing the package's source code and type
`./configure' to configure the package for your system.
Running `configure' might take a while. While running, it prints
some messages telling which features it is checking for.
2. Type `make' to compile the package.
3. Optionally, type `make check' to run any self-tests that come with
the package, generally using the just-built uninstalled binaries.
4. Type `make install' to install the programs and any data files and
documentation. When installing into a prefix owned by root, it is
recommended that the package be configured and built as a regular
user, and only the `make install' phase executed with root
privileges.
5. Optionally, type `make installcheck' to repeat any self-tests, but
this time using the binaries in their final installed location.
This target does not install anything. Running this target as a
regular user, particularly if the prior `make install' required
root privileges, verifies that the installation completed
correctly.
6. You can remove the program binaries and object files from the
source code directory by typing `make clean'. To also remove the
files that `configure' created (so you can compile the package for
a different kind of computer), type `make distclean'. There is
also a `make maintainer-clean' target, but that is intended mainly
for the package's developers. If you use it, you may have to get
all sorts of other programs in order to regenerate files that came
with the distribution.
7. Often, you can also type `make uninstall' to remove the installed
files again. In practice, not all packages have tested that
uninstallation works correctly, even though it is required by the
GNU Coding Standards.
8. Some packages, particularly those that use Automake, provide `make
distcheck', which can be used by developers to test that all other
targets like `make install' and `make uninstall' work correctly.
This target is generally not run by end users.
Compilers and Options
=====================
Some systems require unusual options for compilation or linking that
the `configure' script does not know about. Run `./configure --help'
for details on some of the pertinent environment variables.
You can give `configure' initial values for configuration parameters
by setting variables in the command line or in the environment. Here
is an example:
./configure CC=c99 CFLAGS=-g LIBS=-lposix
*Note Defining Variables::, for more details.
Compiling For Multiple Architectures
====================================
You can compile the package for more than one kind of computer at the
same time, by placing the object files for each architecture in their
own directory. To do this, you can use GNU `make'. `cd' to the
directory where you want the object files and executables to go and run
the `configure' script. `configure' automatically checks for the
source code in the directory that `configure' is in and in `..'. This
is known as a "VPATH" build.
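As a rough sketch of the VPATH setup described above, the shell function
below creates a separate build directory and runs the source tree's
`configure' from inside it. The function name `vpath_configure' and its
argument order are hypothetical, chosen only for illustration.

```shell
# Hypothetical helper: configure an out-of-tree (VPATH) build.
#   $1 = source directory that contains `configure'
#   $2 = build directory for this architecture (created if missing)
#   remaining arguments are passed through to `configure'
# The body runs in a subshell, so the caller's working directory is unchanged.
vpath_configure() (
    src=$(cd "$1" && pwd) || exit 1
    build=$2
    shift 2
    mkdir -p "$build" && cd "$build" || exit 1
    "$src/configure" "$@"
)

# Typical use, one build directory per architecture:
#   vpath_configure /path/to/source build-arch1 && (cd build-arch1 && make)
```

Object files and executables then end up in the build directory, while the
source directory stays untouched.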
With a non-GNU `make', it is safer to compile the package for one
architecture at a time in the source code directory. After you have
installed the package for one architecture, use `make distclean' before
reconfiguring for another architecture.
On MacOS X 10.5 and later systems, you can create libraries and
executables that work on multiple system types--known as "fat" or
"universal" binaries--by specifying multiple `-arch' options to the
compiler but only a single `-arch' option to the preprocessor. Like
this:
./configure CC="gcc -arch i386 -arch x86_64 -arch ppc -arch ppc64" \
CXX="g++ -arch i386 -arch x86_64 -arch ppc -arch ppc64" \
CPP="gcc -E" CXXCPP="g++ -E"
This is not guaranteed to produce working output in all cases; you
may have to build one architecture at a time and combine the results
using the `lipo' tool if you have problems.
Installation Names
==================
By default, `make install' installs the package's commands under
`/usr/local/bin', include files under `/usr/local/include', etc. You
can specify an installation prefix other than `/usr/local' by giving
`configure' the option `--prefix=PREFIX', where PREFIX must be an
absolute file name.
You can specify separate installation prefixes for
architecture-specific files and architecture-independent files. If you
pass the option `--exec-prefix=PREFIX' to `configure', the package uses
PREFIX as the prefix for installing programs and libraries.
Documentation and other data files still use the regular prefix.
In addition, if you use an unusual directory layout you can give
options like `--bindir=DIR' to specify different values for particular
kinds of files. Run `configure --help' for a list of the directories
you can set and what kinds of files go in them. In general, the
default for these options is expressed in terms of `${prefix}', so that
specifying just `--prefix' will affect all of the other directory
specifications that were not explicitly provided.
The most portable way to affect installation locations is to pass the
correct locations to `configure'; however, many packages provide one or
both of the following shortcuts of passing variable assignments to the
`make install' command line to change installation locations without
having to reconfigure or recompile.
The first method involves providing an override variable for each
affected directory. For example, `make install
prefix=/alternate/directory' will choose an alternate location for all
directory configuration variables that were expressed in terms of
`${prefix}'. Any directories that were specified during `configure',
but not in terms of `${prefix}', must each be overridden at install
time for the entire installation to be relocated. The approach of
makefile variable overrides for each directory variable is required by
the GNU Coding Standards, and ideally causes no recompilation.
However, some platforms have known limitations with the semantics of
shared libraries that end up requiring recompilation when using this
method, particularly noticeable in packages that use GNU Libtool.
The second method involves providing the `DESTDIR' variable. For
example, `make install DESTDIR=/alternate/directory' will prepend
`/alternate/directory' before all installation names. The approach of
`DESTDIR' overrides is not required by the GNU Coding Standards, and
does not work on platforms that have drive letters. On the other hand,
it does better at avoiding recompilation issues, and works well even
when some directory options were not specified in terms of `${prefix}'
at `configure' time.
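The difference between the two methods can be illustrated with a small
sketch. Both function names below are hypothetical. The first models a
`prefix' override, which only moves paths that were expressed in terms of
the old prefix; the second models `DESTDIR', which is simply prepended to
every installation name.

```shell
# Hypothetical model of `make install prefix=NEW':
#   $1 = prefix used at configure time, $2 = new prefix, $3 = configured path
# Paths outside the old prefix are printed unchanged.
reprefixed_path() {
    case "$3" in
        "$1"/*) printf '%s%s\n' "$2" "${3#"$1"}" ;;
        *)      printf '%s\n' "$3" ;;
    esac
}

# Hypothetical model of `make install DESTDIR=DIR':
#   $1 = DESTDIR value, $2 = configured absolute path
staged_path() {
    printf '%s%s\n' "$1" "$2"
}

# reprefixed_path /usr/local /alternate/directory /usr/local/bin
# staged_path /alternate/directory /usr/local/bin
```

A directory configured outside `${prefix}' is left where it was by the first
function but is still relocated by the second, which matches the behaviour
described above.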
Optional Features
=================
If the package supports it, you can cause programs to be installed
with an extra prefix or suffix on their names by giving `configure' the
option `--program-prefix=PREFIX' or `--program-suffix=SUFFIX'.
Some packages pay attention to `--enable-FEATURE' options to
`configure', where FEATURE indicates an optional part of the package.
They may also pay attention to `--with-PACKAGE' options, where PACKAGE
is something like `gnu-as' or `x' (for the X Window System). The
`README' should mention any `--enable-' and `--with-' options that the
package recognizes.
For packages that use the X Window System, `configure' can usually
find the X include and library files automatically, but if it doesn't,
you can use the `configure' options `--x-includes=DIR' and
`--x-libraries=DIR' to specify their locations.
Some packages offer the ability to configure how verbose the
execution of `make' will be. For these packages, running `./configure
--enable-silent-rules' sets the default to minimal output, which can be
overridden with `make V=1'; while running `./configure
--disable-silent-rules' sets the default to verbose, which can be
overridden with `make V=0'.
Particular systems
==================
On HP-UX, the default C compiler is not ANSI C compatible. If GNU
CC is not installed, it is recommended to use the following options in
order to use an ANSI C compiler:
./configure CC="cc -Ae -D_XOPEN_SOURCE=500"
and if that doesn't work, install pre-built binaries of GCC for HP-UX.
On OSF/1 a.k.a. Tru64, some versions of the default C compiler cannot
parse its `<wchar.h>' header file. The option `-nodtk' can be used as
a workaround. If GNU CC is not installed, it is therefore recommended
to try
./configure CC="cc"
and if that doesn't work, try
./configure CC="cc -nodtk"
On Solaris, don't put `/usr/ucb' early in your `PATH'. This
directory contains several dysfunctional programs; working variants of
these programs are available in `/usr/bin'. So, if you need `/usr/ucb'
in your `PATH', put it _after_ `/usr/bin'.
On Haiku, software installed for all users goes in `/boot/common',
not `/usr/local'. It is recommended to use the following options:
./configure --prefix=/boot/common
Specifying the System Type
==========================
There may be some features `configure' cannot figure out
automatically, but needs to determine by the type of machine the package
will run on. Usually, assuming the package is built to be run on the
_same_ architectures, `configure' can figure that out, but if it prints
a message saying it cannot guess the machine type, give it the
`--build=TYPE' option. TYPE can either be a short name for the system
type, such as `sun4', or a canonical name which has the form:
CPU-COMPANY-SYSTEM
where SYSTEM can have one of these forms:
OS
KERNEL-OS
See the file `config.sub' for the possible values of each field. If
`config.sub' isn't included in this package, then this package doesn't
need to know the machine type.
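To make the canonical form concrete, here is a minimal sketch that splits a
CPU-COMPANY-SYSTEM name into its fields. The function name is hypothetical,
and the sketch assumes a full canonical name with at least two dashes; short
aliases such as `sun4' are not handled and would need `config.sub' to expand
them first.

```shell
# Hypothetical helper: split a canonical name CPU-COMPANY-SYSTEM.
# Everything after the second dash is kept as SYSTEM, so both the
# OS and the KERNEL-OS forms come out intact.
split_system_type() (
    cpu=${1%%-*}
    rest=${1#*-}
    company=${rest%%-*}
    system=${rest#*-}
    printf 'cpu=%s company=%s system=%s\n' "$cpu" "$company" "$system"
)

# split_system_type x86_64-pc-linux-gnu
# split_system_type sparc-sun-solaris2.10
```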
If you are _building_ compiler tools for cross-compiling, you should
use the option `--target=TYPE' to select the type of system they will
produce code for.
If you want to _use_ a cross compiler, that generates code for a
platform different from the build platform, you should specify the
"host" platform (i.e., that on which the generated programs will
eventually be run) with `--host=TYPE'.
Sharing Defaults
================
If you want to set default values for `configure' scripts to share,
you can create a site shell script called `config.site' that gives
default values for variables like `CC', `cache_file', and `prefix'.
`configure' looks for `PREFIX/share/config.site' if it exists, then
`PREFIX/etc/config.site' if it exists. Or, you can set the
`CONFIG_SITE' environment variable to the location of the site script.
A warning: not all `configure' scripts look for a site script.
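The lookup order described above can be sketched as follows. The function
name is hypothetical, and the sketch only lists candidate paths; it does not
check whether the files exist or reproduce anything else `configure' does.

```shell
# Hypothetical helper: list the site scripts that would be considered.
#   $1 = installation prefix
# If CONFIG_SITE is set, it is used instead of the two default locations.
site_script_candidates() {
    if [ -n "${CONFIG_SITE:-}" ]; then
        printf '%s\n' "$CONFIG_SITE"
    else
        printf '%s\n' "$1/share/config.site" "$1/etc/config.site"
    fi
}

# site_script_candidates /usr/local
```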
Defining Variables
==================
Variables not defined in a site shell script can be set in the
environment passed to `configure'. However, some packages may run
configure again during the build, and the customized values of these
variables may be lost. In order to avoid this problem, you should set
them in the `configure' command line, using `VAR=value'. For example:
./configure CC=/usr/local2/bin/gcc
causes the specified `gcc' to be used as the C compiler (unless it is
overridden in the site shell script).
Unfortunately, this technique does not work for `CONFIG_SHELL' due to
an Autoconf bug. Until the bug is fixed you can use this workaround:
CONFIG_SHELL=/bin/bash /bin/bash ./configure CONFIG_SHELL=/bin/bash
`configure' Invocation
======================
`configure' recognizes the following options to control how it
operates.
`--help'
`-h'
Print a summary of all of the options to `configure', and exit.
`--help=short'
`--help=recursive'
Print a summary of the options unique to this package's
`configure', and exit. The `short' variant lists options used
only in the top level, while the `recursive' variant lists options
also present in any nested packages.
`--version'
`-V'
Print the version of Autoconf used to generate the `configure'
script, and exit.
`--cache-file=FILE'
Enable the cache: use and save the results of the tests in FILE,
traditionally `config.cache'. FILE defaults to `/dev/null' to
disable caching.
`--config-cache'
`-C'
Alias for `--cache-file=config.cache'.
`--quiet'
`--silent'
`-q'
Do not print messages saying which checks are being made. To
suppress all normal output, redirect it to `/dev/null' (any error
messages will still be shown).
`--srcdir=DIR'
Look for the package's source code in directory DIR. Usually
`configure' can determine that directory automatically.
`--prefix=DIR'
Use DIR as the installation prefix. *note Installation Names::
for more details, including other options available for fine-tuning
the installation locations.
`--no-create'
`-n'
Run the configure checks, but stop before creating any output
files.
`configure' also accepts some other, not widely useful, options. Run
`configure --help' for more details.


@ -1,108 +0,0 @@
ACLOCAL_AMFLAGS = -I m4
MAINTAINERCLEANFILES = \
Makefile.in \
aclocal.m4 \
config.guess \
config.h.in \
config.h.in~ \
config.sub \
configure \
depcomp \
install-sh \
ltconfig \
ltmain.sh \
missing \
$(PACKAGE_TARNAME)-$(PACKAGE_VERSION).tar.gz \
$(PACKAGE_TARNAME)-$(PACKAGE_VERSION).tar.bz2 \
$(PACKAGE_TARNAME)-$(PACKAGE_VERSION).tar.xz \
$(PACKAGE_TARNAME)-$(PACKAGE_VERSION).tar.lrz \
$(PACKAGE_TARNAME)-$(PACKAGE_VERSION)-doc.tar.bz2 \
m4/libtool.m4 \
m4/lt~obsolete.m4 \
m4/ltoptions.m4 \
m4/ltsugar.m4 \
m4/ltversion.m4
SUBDIRS = lzma man doc
AM_CFLAGS = -I. -I lzma/C -DNDEBUG
AM_CXXFLAGS = $(AM_CFLAGS)
lrztardir = $(bindir)
lrztar_SCRIPTS = lrztar
noinst_LTLIBRARIES = libtmplrzip.la
libtmplrzip_la_SOURCES = \
lrzip_private.h \
lrzip.c \
lrzip_core.h \
rzip.h \
rzip.c \
runzip.c \
runzip.h \
stream.c \
stream.h \
util.c \
util.h \
md5.c \
md5.h \
aes.c \
aes.h \
sha4.c \
sha4.h \
libzpaq/libzpaq.cpp \
libzpaq/libzpaq.h
libtmplrzip_la_LIBADD = lzma/C/liblzma.la
bin_PROGRAMS = lrzip
lrzip_SOURCES = \
main.c
nodist_EXTRA_lrzip_SOURCES = dummyy.cxx
lrzip_LDADD = libtmplrzip.la
if STATIC
lrzip_LDFLAGS = -all-static
endif
dist_doc_DATA = \
AUTHORS \
BUGS \
ChangeLog \
COPYING \
README.md \
README-NOT-BACKWARD-COMPATIBLE \
TODO \
WHATS-NEW
lrzipdir = $(includedir)
EXTRA_DIST = \
lrztar \
description-pak \
autogen.sh \
INSTALL \
$(dist_doc_DATA)
install-exec-hook:
$(LN_S) -f lrzip$(EXEEXT) $(DESTDIR)$(bindir)/lrunzip$(EXEEXT)
$(LN_S) -f lrzip$(EXEEXT) $(DESTDIR)$(bindir)/lrzcat$(EXEEXT)
$(LN_S) -f lrztar$(EXEEXT) $(DESTDIR)$(bindir)/lrzuntar$(EXEEXT)
$(LN_S) -f lrzip$(EXEEXT) $(DESTDIR)$(bindir)/lrz$(EXEEXT)
uninstall-local:
rm -f $(bindir)/lrunzip
rm -f $(bindir)/lrzcat
rm -f $(bindir)/lrzuntar
rm -f $(bindir)/lrz
.PHONY: doc
# Documentation
doc: all
@echo "entering doc/"
$(MAKE) -C doc doc

135
Makefile.in Normal file

@ -0,0 +1,135 @@
# Makefile for
# lrzip. This is processed by configure to produce the final
# Makefile
# See README.Assembler for notes about ASM module.
prefix=@prefix@
exec_prefix=@exec_prefix@
datarootdir=@datarootdir@
ASM_OBJ=@ASM_OBJ@
PACKAGE_TARNAME=@PACKAGE_TARNAME@
INSTALL_BIN=$(exec_prefix)/bin
INSTALL_MAN1=@mandir@/man1
INSTALL_MAN5=@mandir@/man5
INSTALL_DOC=@docdir@
INSTALL_DOC_LZMA=@docdir@/lzma
LIBS=@LIBS@
LDFLAGS=@LDFLAGS@
CC=@CC@
CXX=@CXX@
CFLAGS=@CFLAGS@ -I. -I$(srcdir) -c
CXXFLAGS=@CXXFLAGS@ -I. -I$(srcdir) -c
LZMA_CFLAGS=-I@top_srcdir@/lzma/C -DCOMPRESS_MF_MT -D_REENTRANT
INSTALLCMD=@INSTALL@
LN_S=@LN_S@
RM=rm -f
ASM=@ASM@
VPATH=@srcdir@
srcdir=@srcdir@
SHELL=/bin/sh
.SUFFIXES:
.SUFFIXES: .c .o
OBJS= main.o rzip.o runzip.o stream.o util.o \
@ASM_OBJ@ \
zpipe.o \
Threads.o \
LzFind.o \
LzFindMt.o \
LzmaDec.o \
LzmaEnc.o \
LzmaLib.o
DOCFILES= AUTHORS BUGS ChangeLog COPYING README README-NOT-BACKWARD-COMPATIBLE \
TODO WHATS-NEW \
doc/README.Assembler doc/README.benchmarks \
doc/README.lzo_compresses.test.txt \
doc/magic.header.txt doc/lrzip.conf.example
DOCFILES_LZMA= lzma/7zC.txt lzma/7zFormat.txt lzma/Methods.txt \
lzma/history.txt lzma/lzma.txt lzma/README lzma/README-Alloc
MAN1FILES= man/lrzip.1 man/lrunzip.1 man/lrztar.1 man/lrzuntar.1
MAN5FILES= man/lrzip.conf.5
#note that the -I. is needed to handle config.h when using VPATH
.c.o:
$(CC) $(CFLAGS) $(LZMA_CFLAGS) $< -o $@
all: lrzip make-man man doc
make-man:
$(MAKE) -C man
.PHONY: make-man
7zCrcT8.o: @top_srcdir@/lzma/C/7zCrcT8.c
$(CC) $(CFLAGS) $(LZMA_CFLAGS) @top_srcdir@/lzma/C/7zCrcT8.c
7zCrcT8U.o: @top_srcdir@/lzma/ASM/x86/7zCrcT8U.s
$(ASM) -o 7zCrcT8U.o @top_srcdir@/lzma/ASM/x86/7zCrcT8U.s
7zCrcT8U_64.o: @top_srcdir@/lzma/ASM/x86_64/7zCrcT8U_64.s
$(ASM) -o 7zCrcT8U_64.o @top_srcdir@/lzma/ASM/x86_64/7zCrcT8U_64.s
7zCrc.o: @top_srcdir@/lzma/C/7zCrc.c
$(CC) $(CFLAGS) $(LZMA_CFLAGS) @top_srcdir@/lzma/C/7zCrc.c
LzmaLib.o: @top_srcdir@/lzma/C/LzmaLib.c
$(CC) $(CFLAGS) $(LZMA_CFLAGS) @top_srcdir@/lzma/C/LzmaLib.c
LzmaDec.o: @top_srcdir@/lzma/C/LzmaDec.c
$(CC) $(CFLAGS) $(LZMA_CFLAGS) @top_srcdir@/lzma/C/LzmaDec.c
LzmaEnc.o: @top_srcdir@/lzma/C/LzmaEnc.c
$(CC) $(CFLAGS) $(LZMA_CFLAGS) @top_srcdir@/lzma/C/LzmaEnc.c
Threads.o: @top_srcdir@/lzma/C/Threads.c
$(CC) $(CFLAGS) $(LZMA_CFLAGS) @top_srcdir@/lzma/C/Threads.c
LzFind.o: @top_srcdir@/lzma/C/LzFind.c
$(CC) $(CFLAGS) $(LZMA_CFLAGS) @top_srcdir@/lzma/C/LzFind.c
LzFindMt.o: @top_srcdir@/lzma/C/LzFindMt.c
$(CC) $(CFLAGS) $(LZMA_CFLAGS) @top_srcdir@/lzma/C/LzFindMt.c
zpipe.o: zpipe.cpp
$(CXX) $(CXXFLAGS) -DNDEBUG zpipe.cpp
install: all
mkdir -p $(DESTDIR)${INSTALL_BIN}
${INSTALLCMD} -m 755 lrzip $(DESTDIR)${INSTALL_BIN}
${INSTALLCMD} -m 755 lrztar $(DESTDIR)${INSTALL_BIN}
(cd $(DESTDIR)${INSTALL_BIN} && ${LN_S} -f lrzip lrunzip )
(cd $(DESTDIR)${INSTALL_BIN} && ${LN_S} -f lrztar lrzuntar)
chmod 755 $(DESTDIR)${INSTALL_BIN}/lrzuntar
mkdir -p $(DESTDIR)${INSTALL_MAN1}
${INSTALLCMD} -m 644 $(MAN1FILES) $(DESTDIR)${INSTALL_MAN1}
mkdir -p $(DESTDIR)${INSTALL_MAN5}
${INSTALLCMD} -m 644 $(MAN5FILES) $(DESTDIR)${INSTALL_MAN5}
mkdir -p $(DESTDIR)${INSTALL_DOC}
${INSTALLCMD} -m 644 $(DOCFILES) $(DESTDIR)${INSTALL_DOC}
mkdir -p $(DESTDIR)${INSTALL_DOC_LZMA}
${INSTALLCMD} -m 644 $(DOCFILES_LZMA) $(DESTDIR)${INSTALL_DOC_LZMA}
uninstall:
rm -rf $(DESTDIR)${INSTALL_BIN}/{lrztar,lrzuntar,lrunzip,lrzip}
rm -rf $(DESTDIR)${INSTALL_DOC}
rm -rf $(DESTDIR)${INSTALL_MAN1}/{lrunzip.1,lrzip.1,lrztar.1,lrzuntar.1}
rm -rf $(DESTDIR)${INSTALL_MAN5}/lrzip.conf.5
lrzip: $(OBJS)
$(CXX) $(LDFLAGS) -o lrzip $(OBJS) $(LIBS)
static: $(OBJS)
$(CXX) $(LDFLAGS) -static -o lrzip $(OBJS) $(LIBS)
clean:
-${RM} *~ $(OBJS) lrzip config.cache config.log config.status *.o \
man/lrunzip.1 man/lrztar.1 man/lrzuntar.1
distclean: clean
-rm -rf autom4te.cache config.h Makefile
maintainer-clean: distclean
-rm -f configure

366
README Normal file

@ -0,0 +1,366 @@
lrzip README
Long Range ZIP or Lzma RZIP
This is a compression program optimised for large files. The larger the file
and the more memory you have, the better the compression advantage this will
provide, especially once the files are larger than 100MB. The advantage can
be chosen to be either size (much smaller than bzip2) or speed (much faster
than bzip2).
Quick lowdown of the most used options:
lrztar directory
This will produce an archive directory.tar.lrz compressed with lzma
lrzuntar directory.tar.lrz
This will completely extract an archived directory
lrzip filename
This will produce an archive filename.lrz compressed with lzma (best all
round) giving slow compression and fast decompression
lrzip -z filename
This will produce an archive filename.lrz compressed with ZPAQ giving extreme
compression but which takes ages to compress and decompress
lrzip -l filename
This will produce an archive filename.lrz compressed with LZO giving very
fast compression and fast decompression
lrunzip filename.lrz
This will decompress filename.lrz into filename
Lrzip uses an extended version of rzip which does a first pass long distance
redundancy reduction. The lrzip modifications make it scale according to
memory size.
The data is then either:
1. Compressed by lzma (default) which gives excellent compression
at approximately twice the speed of bzip2 compression
2. Compressed by a number of other compressors chosen for different reasons,
in order of likelihood of usefulness:
2a. ZPAQ: Extreme compression up to 20% smaller than lzma but ultra slow
at compression AND decompression.
2b. LZO: Extremely fast compression and decompression which on most machines
compresses faster than disk writing making it as fast as (or even faster than)
simply copying a large file
2c. GZIP: Almost as fast as LZO but with better compression.
2d. BZIP2: A de facto Linux standard of sorts but is the middle ground between
lzma and gzip and neither here nor there.
3. Leaving it uncompressed and rzip prepared. This form improves substantially
any compression performed on the resulting file in both size and speed (due to
the nature of rzip preparation merging similar compressible blocks of data and
creating a smaller file). By "improving" I mean it will either speed up the
very slow compressors with minor detriment to compression, or greatly increase
the compression of simple compression algorithms.
The major disadvantages are:
1. The main lrzip application only works on single files so it requires the
lrztar wrapper to fake a complete archiver.
2. It requires a lot of memory to get the best performance, and is not
really usable (for compression) with less than 256MB. Decompression requires
less ram and works on smaller ram machines.
3. Only compression from stdin works well. The other combinations of
stdin/stdout work but in a very inefficient manner generating temporary files
on disk so this method of using lrzip is not recommended.
The unique feature of lrzip is that it tries to make the most of the available
ram in your system at all times for maximum benefit. It does this by default,
choosing the largest sized window possible without running out of memory. It
also has a unique "sliding mmap" feature which makes it possible to even use
a compression window larger than your ramsize, if the file is that large. It
does this (with the -U option) by implementing one large mmap buffer as per
normal, and a smaller moving buffer to track which part of the file is
currently being examined, emulating a much larger single mmapped buffer.
Unfortunately this mode is many times slower.
See the file README.benchmarks in doc/ for performance examples and what kind
of data lrzip is very good with.
Requires:
pthreads
liblzo2-dev
libbz2-dev
libz-dev
libm
tar
(nasm on 32bit x86)
To build/install:
./configure
make
make install
FAQS.
Q. How do I make a static build?
A. make static
Q. I want the absolute maximum compression I can possibly get, what do I do?
A. Try the command line options -MUz. This will use all available ram and ZPAQ
compression, and even use a compression window larger than you have ram.
Expect serious swapping to occur if your file is larger than your ram and for
it to take many times longer.
Q. How much slower is the unlimited mode?
A. It depends on 2 things. First, just how much larger than your ram the file
is, as the bigger the difference, the slower it will be. The second is how much
redundant data there is. The more there is, the slower, but ultimately the
better the compression. Using the example of a 10GB virtual image on a machine
with 8GB ram, it would allocate about 5.5GB by default, yet is capable of
allocating all the ram for the 10GB file in -M mode.
Options   Size         Compress   Decompress
-l        1793312108   05m13s     3m12s
-lM       1413268368   04m18s     2m54s
-lU       1413268368   06m05s     2m54s
As you can see, the -U option gives the same compression in this case as the
-M option, but takes about 40% more time. The advantage to using -U is that it
will work even when the size can't be encompassed by -M, but progressively
slower. Why isn't it on by default? If the compression window is a LOT larger
than ram, with a lot of redundant information it can be drastically slower. I
may revisit this possibility in the future if I can make it any faster.
Q. Can I use your tool for even more compression than lzma offers?
A. Yes, the rzip preparation of files makes them more compressible by every
other compression technique I have tried. Using the -n option will generate
a .lrz file smaller than the original which should be more compressible, and
since it is smaller it will compress faster than it otherwise would have.
Q. 32bit?
A. 32bit machines have a limit of 2GB sized compression windows due to
userspace limitations on mmap and malloc, so even if you have much more ram
you will not be able to use compression windows larger than 2GB. Also you
may be unable to decompress files compressed on 64bit machines which have
used windows larger than 2GB.
Q. How about 64bit?
A. 64bit machines with their ability to address massive amounts of ram will
excel with lrzip due to being able to use compression windows limited only in
size by the amount of physical ram.
Q. Other operating systems?
A. The code is POSIXy with GNU extensions. Patches are welcome. Version 0.43+
should build on MacOSX 10.5+
Q. Does it work on stdin/stdout?
A. Yes it does. Compression from stdin works nicely. However the other
combinations of stdin and stdout use temporary files on disk because of
seeking requirements so the performance of these modes is low. Not recommended!
Q. I have another compression format that is even better than zpaq, can you
use that?
A. You can use it yourself on rzip prepared files (see above). Alternatively
if the source code is compatible with the GPL license it can be added to the
lrzip source code. Libraries with functions similar to compress() and
decompress() functions of zlib would make the process most painless. Please
tell me if you have such a library so I can include it :)
Q. What's this "Progress percentage pausing during lzma compression" message?
A. While I'm a big fan of progress percentage being visible, unfortunately
lzma compression can't currently be tracked when handing over 100+MB chunks
over to the lzma library. Therefore you'll see progress percentage until
each chunk is handed over to the lzma library. lzo, bzip2 or no compression
doesn't have this problem and shows progress continuously.
Q. What's this "lzo testing for incompressible data" message?
A. Other compression is much slower, and lzo is the fastest. To help speed up
the process, lzo compression is performed on the data first to test that the
data is at all compressible. If a small block of data is not compressible, it
tests progressively larger blocks until it has tested all the data (if it fails
to compress at all). If no compressible data is found, then the subsequent
compression is not even attempted. This can save a lot of time during the
compression phase when there is incompressible data. Theoretically it may be
possible that data is compressible by the other backend (zpaq, lzma etc) and not
at all by lzo, but in practice such data achieves only minuscule amounts of
compression which are not worth pursuing. Most of the time it is clear one way
or the other that data is compressible or not. If you wish to disable this
test and force it to try compressing it anyway, use -T 0.
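The idea of the pre-test can be sketched in shell. This is only an
illustration under stated assumptions: `gzip -1' stands in for LZO, and the
4096-byte starting block and the doubling step are made-up values, not the
ones lrzip uses. The function name is hypothetical as well.

```shell
# Hypothetical sketch: succeed if some leading block of file $1 shrinks
# under a fast compressor (gzip -1 is a stand-in for LZO here).
looks_compressible() (
    file=$1
    total=$(( $(wc -c < "$file") ))
    block=4096                      # assumed starting block size
    while :; do
        if [ "$block" -gt "$total" ]; then
            block=$total
        fi
        packed=$(( $(head -c "$block" "$file" | gzip -1 -c | wc -c) ))
        if [ "$packed" -lt "$block" ]; then
            exit 0                  # this block compresses, run the backend
        fi
        if [ "$block" -ge "$total" ]; then
            exit 1                  # all data tested, nothing compressible
        fi
        block=$(( block * 2 ))      # try a progressively larger block
    done
)

# looks_compressible somefile && echo "worth compressing"
```

The cheap test runs first, and the slow backend is skipped only when no
block at all could be shrunk.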
Q. I have truckloads of ram so I can compress files much better, but can my
generated file be decompressed on machines with less ram?
A. Yes. Ram requirements for decompression go up only by the -L compression
option with lzma and are never anywhere near as large as the compression
requirements. However if you're on 64bit and you use a compression window
greater than 2GB, it might not be possible to decompress it on 32bit machines.
Q. I've changed the compression level with -L in combination with -l or -z and
the file size doesn't vary?
A. That's right, -l and -z each have only one compression level.
Q. Why are you including bzip2 compression?
A. To maintain a similar compression format to the original rzip (although the
other modes are more useful).
Q. What about multimedia?
A. Most multimedia is already in a heavily compressed "lossy" format which by
its very nature has very little redundancy. This means that there is not
much that can actually be compressed. If your video/audio/picture is in a
high bitrate, there will be more redundancy than a low bitrate one making it
more suitable to compression. None of the compression techniques in lrzip are
optimised for this sort of data. However, the nature of rzip preparation
means that you'll still get better compression than most normal compression
algorithms give you if you have very large files. ISO images of dvds for
example are best compressed directly instead of individual .VOB files. ZPAQ is
the only compression format that can do any significant compression of
multimedia.
Q. Is this multithreaded?
A. As of version 0.21, the answer is yes for lzma compression only thanks to a
multithreaded lzma library. However I have not found the gains to scale well
with number of cpus, but there are definite performance gains with more cpus.
It is important to note that the multithreading actually decreases the
compression somewhat. It's a tradeoff either way.
Q. This uses heaps of memory, can I make it use less?
A. Well you can by setting -w to the lowest value (1) but the huge use of
memory is what makes the compression better than ordinary compression
programs so it defeats the point. You'll still derive benefit with -w 1 but
not as much.
Q. What CFLAGS should I use?
A. With a recent enough compiler (gcc>4), set both CFLAGS and CXXFLAGS to
-O3 -march=native -fomit-frame-pointer
Q. What compiler does this work with?
A. It has been tested successfully on gcc, ekopath and the Intel compiler.
Whether the commercial compilers help or not, I could not tell you.
Q. What codebase are you basing this on?
A. rzip v2.1 and lzma sdk907, but it should be possible to stay in sync with
each of these in the future.
Q. Do we really need yet another compression format?
A. It's not really a new one at all; simply a reimplementation of a few very
good performing ones that will scale with memory and file size.
Q. How do you use lrzip yourself?
A. Two basic uses. I compress large files currently on my drive with the
-l option since it is so quick to get a space saving, and when archiving
data for permanent storage I compress it with the default options.
Q. I found a file that compressed better with plain lzma. How can that be?
A. When the file is more than 5 times the size of the compression window
you have available, the efficiency of rzip preparation drops off as a means
of getting better compression. Eventually when the file is large enough,
plain lzma compression will get better ratios. The lrzip compression will be
a lot faster though. The only way around this is to use as much ram as
possible with the -M option, and going beyond that with the -U option.
Q. Can I use swapspace as ram for lrzip with a massive window?
A. It will indirectly do this with -M mode enabled. If you want the windows
even larger, -U (unlimited) mode will make the compression window as big as
the file itself no matter how big it is, but it will slow down 100 times
during the compression phase once it has reached your full ram.
Q. Why do you nice it to +19 by default? Can I speed up the compression by
changing the nice value?
A. This is a common misconception about what nice values do. They only tell the
cpu process scheduler how to prioritise workloads, and if your application is
the _only_ thing running it will be no faster at nice -20 nor will it be any
slower at +19.
Q. What is the Threshold option, -T ## (1-10)?
A. It is for adjusting the sensitivity of the LZO test that is used when LZMA
compression is selected. When highly random or already-compressed data chunks
are evaluated for LZMA compression, sometimes LZO compression actually will
create a larger chunk than the original.
The Threshold is used to determine a minimum compression amount relative to
the size of the data being evaluated. A value of 1 is the default. This
means that the compression threshold amount is >0% of the size of the
original data. If the threshold is not achieved, the LZMA compression will not
be done and the chunk will not be compressed. Values can be from 0 (bypass the
test) to 10 (maximum compression efficiency expected). The following table can
be used.
For LZO compressor test
T value   Compression %   Compression Ratio
0         Ignored
1         0-5%            1.00-1.05           very low compression expected
2         5-10%           1.05-1.10           default value
3         10-20%          1.12-1.25
4         20-30%          1.25-1.43
5         30-40%          1.43-1.66
6         40-50%          1.66-2.00
7         50-60%          2.00-2.50
8         60-70%          2.50-3.33
9         70-80%          3.33-5.00
10        80+%            5x+
Whenever the data chunk does not compress to the Threshold value, no LZMA
compression will be attempted. For example, if you select -T 5, LZMA
compression will be performed only if the projected compression ratio is
at least 1.43. Otherwise, data will be written in rzip format. Setting
a very high T value will result in a lot of uncompressed data in the lrzip
file. However, a lot of time will be saved. Most people should never
need to touch this.
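The two columns of the table are related by ratio = 1 / (1 - percent/100).
The sketch below works through that arithmetic and applies the rule that a
chunk which does not reach the threshold is not handed to LZMA. All three
function names are hypothetical, and the per-T minimum is simply the lower
bound of the "Compression %" column above, not a value read from the lrzip
source.

```shell
# Convert a compression percentage (space saved) into a compression ratio.
percent_to_ratio() {
    awk -v p="$1" 'BEGIN { printf "%.2f\n", 1 / (1 - p / 100) }'
}

# Lower bound of the "Compression %" column for a given T value (0-10).
threshold_min_percent() {
    case "$1" in
        0|1) echo 0 ;;
        2)   echo 5 ;;
        *)   echo $(( ($1 - 2) * 10 )) ;;
    esac
}

# Succeed if shrinking $2 bytes down to $3 bytes reaches the -T $1 threshold.
meets_threshold() {
    if [ "$1" -eq 0 ]; then
        return 0                    # -T 0 bypasses the test entirely
    fi
    awk -v o="$2" -v c="$3" -v m="$(threshold_min_percent "$1")" \
        'BEGIN { exit !((o - c) > 0 && (o - c) * 100 >= m * o) }'
}

# percent_to_ratio 30           (lower bound for -T 5)
# meets_threshold 5 1000 700    (30% saved, threshold reached)
```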
Q. Compression and decompression progress on large archives slows down and
speeds up. There's also a jump in the percentage at the end?
A. Yes, that's the nature of the compression/decompression mechanism. The jump
is because the rzip preparation makes the amount of data much smaller than the
compression backend (lzma) needs to compress.
Q. The percentage counter doesn't always get to 100%.
A. It's quite hard to predict during the rzip phase how long it will take as
lots of redundant data will not count towards the percentage.
Q. Tell me about patented compression algorithms, GPL, lawyers and copyright.
A. No
Q. I receive an error "LZMA ERROR: 2. Try a smaller compression window."
what does this mean?
A. LZMA requests large amounts of memory. When a higher compression window is
used, there may not be enough contiguous memory for LZMA. LZMA may request
up to 25% of TOTAL ram depending on compression level. If contiguous blocks
of memory are not free, LZMA will return an error. This is not a fatal
error. However, the current Stream will not be compressed.
Q. Where can I get more information about the internals of LZMA?
A. See http://www.7-zip.org and http://www.p7zip.org. Also, see the file
./lzma/C/lzmalib.h which explains the LZMA properties used and the LZMA
memory requirements and computation.
LIMITATIONS
Due to mmap limitations the maximum size a window can be set to is currently
2GB on 32bit unless the -U option is specified. Files generated on 64 bit
machines with windows >2GB in size might not be decompressible on 32bit
machines.
BUGS:
Probably lots. Tell me if you spot any :)
Links:
rzip:
http://rzip.samba.org/
lzo:
http://www.oberhumer.com/opensource/lzo/
lzma:
http://www.7-zip.org/
zpaq:
http://mattmahoney.net/dc/
Thanks to Andrew Tridgell for rzip. Thanks to Markus Oberhumer for lzo.
Thanks to Igor Pavlov for lzma. Thanks to Jean-loup Gailly and Mark Adler
for the zlib compression library. Thanks to Christian Leber for lzma
compat layer, Michael J Cohen for Darwin support, Lasse Collin for fix
to LZMALib.cpp and for Makefile.in suggestions, and everyone else who coded
along the way. Huge thanks to Peter Hyman for most of the 0.19-0.24 changes,
and the update to the multithreaded lzma library and all sorts of other
features. Thanks to René Rhéaume for fixing executable stacks and
Ed Avis for various fixes. Thanks to Matt Mahoney for zpaq code. Thanks to
Jukka Laurila for Darwin support. Thanks to George Makrydakis for lrztar.
Con Kolivas <kernel@kolivas.org>
Mon, 7 Nov 2010
Also documented by
Peter Hyman <pete@peterhyman.com>
Sun, 04 Jan 2009


@ -1,13 +1,6 @@
lrzip-0.60 update
All files created with lrzip 0.6x are not backward compatible with versions
prior to 0.60. v0.6x can read files generated with earlier versions.
Con Kolivas March 2011.
lrzip-0.50 update
All files created with lrzip 0.5x are not backward compatible with versions
All files created with lrzip 0.50+ are not backward compatible with versions
prior to 0.50. v0.50 can read earlier generated files.
lrzip-0.41 update

479
README.md

@ -1,479 +0,0 @@
lrzip - Long Range ZIP or LZMA RZIP
===================================
A compression utility that excels at compressing large files (usually > 10-50 MB).
Larger files and/or more free RAM means that the utility will be able to more
effectively compress your files (i.e. faster / smaller size), especially if the
filesize(s) exceed 100 MB. You can either choose to optimise for speed (fast
compression / decompression) or size, but not both.
### haneefmubarak's TL;DR for the long explanation:
Just change the word `directory` to the name of the directory you wish to compress.
#### Compression:
```bash
lrzdir=directory; tar cvf $lrzdir.tar $lrzdir; lrzip -Ubvvp `nproc` -S .bzip2-lrz -L 9 $lrzdir.tar; rm -fv $lrzdir.tar; unset lrzdir
```
`tar`s the directory, then maxes out all of the system's processor cores
along with sliding window RAM to give the best **BZIP2** compression while being as fast as possible,
enables max verbosity output, attaches the extension `.bzip2-lrz`, and finally
gets rid of the temporary tarfile. Uses a tempvar `lrzdir` which is unset automatically.
#### Decompression for the kind of file from above:
```bash
lrzdir=directory; lrunzip -cdivvp `nproc` -o $lrzdir.tar $lrzdir.tar.bzip2-lrz; tar xvf $lrzdir.tar; rm -vf $lrzdir.tar
```
Checks integrity, then decompresses the directory using all of the
processor cores for max speed, enables max verbosity output, unarchives
the resulting tarfile, and finally gets rid of the temporary tarfile. Uses the same kind of tempvar.
### lrzip build/install guide:
A quick guide on building and installing.
#### What you will need
- gcc
- bash or zsh
- pthreads
- tar
- libc
- libm
- libz-dev
- libbz2-dev
- liblzo2-dev
- liblz4-dev
- coreutils
- Optional nasm
- git if you want a repo-fresh copy
- an OS with the usual *nix headers and libraries
#### Obtaining the source
Two different ways of doing this:
Stable: Packaged tarball that is known to work:
Go to <https://github.com/ckolivas/lrzip/releases> and download the `tar.gz`
file from the top. `cd` to the directory you downloaded it to, and use `tar xvzf lrzip-X.X.tar.gz`
to extract the files (don't forget to replace `X.X` with the correct version). Finally, `cd`
into the directory you just extracted.
Latest: `git clone -v https://github.com/ckolivas/lrzip.git; cd lrzip`
#### Build
```bash
./autogen.sh
./configure
make -j `nproc` # maxes out all cores
```
#### Install
Simple 'n Easy™: `sudo make install`
### lrzip 101:
|Command|Result|
|------|------|
|`lrztar directory`|An archive `directory.tar.lrz` compressed with **LZMA**.|
|`lrzuntar directory.tar.lrz`|A directory extracted from a `lrztar` archive.|
|`lrzip filename`|An archive `filename.lrz` compressed with **LZMA**, meaning slow compression and fast decompression.|
|`lrzip -z filename`|An archive `filename.lrz` compressed with **ZPAQ** that can give extreme compression, but takes a bit longer than forever to compress and decompress.|
|`lrzip -l filename`|An archive lightly compressed with **LZO**, meaning really, really fast compression and decompression.|
|`lrunzip filename.lrz`|Decompress filename.lrz to filename.|
|`lrz filename`|As per lrzip above but with gzip compatible semantics (i.e. will be quiet and delete original file).|
|`lrz -d filename.lrz`|As per lrunzip above but with gzip compatible semantics (i.e. will be quiet and delete original file).|
### lrzip internals
lrzip uses an extended version of [rzip](http://rzip.samba.org/) which does a first pass long distance
redundancy reduction. lrzip's modifications allow it to scale to accommodate various memory sizes.
Then, one of the following scenarios occurs:
- Compressed
- (default) **LZMA** gives excellent compression @ ~2x the speed of bzip2
- **ZPAQ** gives extreme compression while taking forever
- **LZO** gives insanely fast compression that can actually be faster than simply copying a large file
- **GZIP** gives compression almost as fast as LZO but with better compression
- **BZIP2** is a de facto Linux standard and hacker favorite which usually gives
quite good compression (ZPAQ>LZMA>BZIP2>GZIP>LZO) while staying fairly fast (LZO>GZIP>BZIP2>LZMA>ZPAQ);
in other words, a good middle-ground and a good choice overall
- Uncompressed, in the words of the software's original author:
> Leaving it uncompressed and rzip prepared. This form improves substantially
> any compression performed on the resulting file in both size and speed (due to
> the nature of rzip preparation merging similar compressible blocks of data and
> creating a smaller file). By "improving" I mean it will either speed up the
> very slow compressors with minor detriment to compression, or greatly increase
> the compression of simple compression algorithms.
>
> (Con Kolivas, from the original lrzip README)
The only real disadvantages:
- The main program, lrzip, only works on single files, and therefore
requires the use of an lrztar wrapper to fake a complete archiver.
- lrzip requires quite a bit of memory along with a modern processor
to get the best performance in reasonable time. This usually means that
it is somewhat unusable with less than 256 MB. However, decompression
usually requires less RAM and can work on less powerful machines with much
less RAM. On machines with less RAM, it may be a good idea to enable swap
if you want to keep your operating system happy.
- Piping output to and/or from STDIN and/or STDOUT works fine with both
compression and decompression, but larger files compressed this way will
likely end up being compressed less efficiently. Decompression doesn't
really have any issues with piping, though.
One of the more unique features of lrzip is that it will try to use all of
the available RAM as best it can at all times to provide maximum benefit. This
is the default operating method, where it will create and use the single
largest memory window that will still fit in available memory without freezing
up the system. It does this by `mmap`ing the small portions of the file that
it is working on. However, it also has a unique "sliding `mmap`" feature, which
allows it to use compression windows that far exceed the size of your RAM if
the file you are compressing is large. It does this by using one large `mmap`
along with a smaller moving `mmap` buffer to track the part of the file that
is currently being examined. From a higher level, this can be seen as simply
emulating a single, large `mmap` buffer. The unfortunate thing about this
feature is that it can become extremely slow. The counter-argument to
being slower is that it will usually give a better compression factor.
The file `doc/README.benchmarks` has some performance examples to show
what kind of data lrzip is good with.
### FAQ
> Q: What kind of encryption does lrzip use?
> A: lrzip uses SHA2-512 repetitive hashing of the password along with a salt
> to provide a key which is used by AES-128 to do block encryption. Each block
> has more random salts added to the block key. The amount of initial hashing
> increases as the timestamp goes forward, in direct relation to Moore's law,
> which means that the amount of time required to encrypt/decrypt the file
> stays the same on a contemporary computer. It is virtually
> guaranteed that the same file encrypted with the same password will never
> be the same twice. The weakest link in this encryption mode by far is the
> password chosen by the user. There is currently no known attack or backdoor
> for this encryption mechanism, and there is absolutely no way of retrieving
> your password should you forget it.
> Q: How do I make a static build?
> A: `./configure --enable-static-bin`
> Q: I want the absolute maximum compression I can possibly get, what do I do?
> A: Try the command line options "-Uzp 1 -L 9". This uses all available ram and
> ZPAQ compression, and even uses a compression window larger than you have ram.
> The -p 1 option disables multithreading which improves compression but at the
> expense of speed. Expect it to take many times longer.
> Q: I want the absolute fastest decent compression I can possibly get.
> A: Try the command line option -l. This will use the lzo backend compression,
> and level 7 compression (1 isn't much faster).
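
The two answers above can be sketched as a small wrapper. The function name `lrzip_cmd` and its `max`/`fast` goal names are hypothetical and not part of lrzip; only the options themselves come from the answers above. It prints the command instead of running it, so you can review it first.

```bash
# Hypothetical helper (not shipped with lrzip): print the command line for a
# given goal rather than running it.
lrzip_cmd() {
    goal=$1
    file=$2
    case "$goal" in
        # Unlimited window, ZPAQ backend, a single thread, level 9.
        max)  echo "lrzip -U -z -p 1 -L 9 $file" ;;
        # LZO backend at the default compression level.
        fast) echo "lrzip -l $file" ;;
        *)    echo "usage: lrzip_cmd max|fast FILE" >&2; return 1 ;;
    esac
}

lrzip_cmd max big.tar    # prints: lrzip -U -z -p 1 -L 9 big.tar
lrzip_cmd fast big.tar   # prints: lrzip -l big.tar
```

To actually run the result, paste the printed line into your shell once you are happy with it.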
> Q: How much slower is the unlimited mode?
> A: It depends on 2 things. First, just how much larger than your ram the file
is, as the bigger the difference, the slower it will be. The second is how much
redundant data there is. The more there is, the slower, but ultimately the
better the compression. Why isn't it on by default? If the compression window is
a LOT larger than ram, with a lot of redundant information it can be drastically
slower. I may revisit this possibility in the future if I can make it any
faster.
> Q: Can I use your tool for even more compression than lzma offers?
> A: Yes, the rzip preparation of files makes them more compressible by most
other compression techniques I have tried. Using the -n option will generate
a .lrz file smaller than the original which should be more compressible, and
since it is smaller it will compress faster than it otherwise would have.
> Q: 32bit?
> A: 32bit machines have a limit of 2GB sized compression windows due to
userspace limitations on mmap and malloc, so even if you have much more ram
you will not be able to use compression windows larger than 2GB. Also you
may be unable to decompress files compressed on 64bit machines which have
used windows larger than 2GB.
> Q: How about 64bit?
> A: 64bit machines with their ability to address massive amounts of ram will
excel with lrzip due to being able to use compression windows limited only in
size by the amount of physical ram.
> Q: Other operating systems?
> A: The code is POSIXy with GNU extensions. Patches are welcome. Version 0.43+
should build on MacOSX 10.5+
> Q: Does it work on stdin/stdout?
> A: Yes it does. Compression and decompression work well to/from STDIN/STDOUT.
However because lrzip does multiple passes on the data, it has to store a
large amount in ram before it dumps it to STDOUT (and vice versa), thus it
is unable to work with the massive compression windows regular operation
provides. Thus the compression afforded on files larger than approximately
25% RAM size will be less efficient (though still benefiting compared to
traditional compression formats).
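
As a rough illustration of the 25% figure above, the sketch below checks whether an input is small enough to be piped without losing efficiency. Both function names are hypothetical, the threshold is simply the approximate figure quoted in the answer, and `total_ram_bytes` assumes a Linux-style `/proc/meminfo`.

```bash
# Hypothetical helper: succeed when SIZE (bytes) is at most 25% of RAM (bytes).
fits_stdio_window() {
    size=$1
    ram=$2
    [ "$size" -le $((ram / 4)) ]
}

# Total RAM in bytes. Assumption: Linux, where MemTotal is reported in kB.
total_ram_bytes() {
    kb=$(awk '/^MemTotal:/ { print $2 }' /proc/meminfo)
    echo $((kb * 1024))
}

# Example use (commented out because it depends on the local machine):
#   if fits_stdio_window "$(wc -c < big.tar)" "$(total_ram_bytes)"; then
#       echo "piping should compress about as well as a regular file"
#   fi
```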
> Q: I have another compression format that is even better than zpaq, can you
use that?
> A: You can use it yourself on rzip prepared files (see above). Alternatively
if the source code is compatible with the GPL license it can be added to the
lrzip source code. Libraries with functions similar to compress() and
decompress() functions of zlib would make the process most painless. Please
tell me if you have such a library so I can include it :)
> Q: What's this "Starting lzma back end compression thread..." message?
> A: While I'm a big fan of progress percentage being visible, unfortunately
lzma compression can't currently be tracked when handing over 100+MB chunks
over to the lzma library. Therefore you'll see progress percentage until
each chunk is handed over to the lzma library.
> Q: What's this "lz4 testing for incompressible data" message?
> A: Other compression is much slower, and lz4 is the fastest. To help speed up
the process, lz4 compression is performed on the data first to test that the
data is at all compressible. If a small block of data is not compressible, it
tests progressively larger blocks until it has tested all the data (if it fails
to compress at all). If no compressible data is found, then the subsequent
compression is not even attempted. This can save a lot of time during the
compression phase when there is incompressible data. Theoretically it may be
possible that data is compressible by the other backend (zpaq, lzma etc) and
not at all by lz4, but in practice such data achieves only minuscule amounts of
compression which are not worth pursuing. Most of the time it is clear one way
or the other that data is compressible or not. If you wish to disable this test
and force it to try compressing it anyway, use -T.
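
The idea of probing with a fast compressor before committing to a slow one can be sketched in shell. This is not lrzip's implementation: `gzip -1` stands in for lz4, the function name and the 4096-byte starting size are assumptions, and it tests progressively larger leading portions of a file rather than lrzip's internal blocks.

```bash
# Sketch only: gzip -1 stands in for lz4, and this is NOT lrzip's code.
# Exits 0 as soon as some leading block of FILE shrinks when compressed,
# and 1 if nothing shrinks even after the whole file has been tried.
looks_compressible() {
    file=$1
    size=$(( $(wc -c < "$file") ))    # arithmetic strips any padding from wc
    block=4096                         # hypothetical starting block size
    while :; do
        if [ "$block" -gt "$size" ]; then block=$size; fi
        packed=$(( $(head -c "$block" "$file" | gzip -1 | wc -c) ))
        if [ "$packed" -lt "$block" ]; then return 0; fi   # compressible: stop early
        if [ "$block" -ge "$size" ]; then return 1; fi     # tried everything
        block=$((block * 2))                               # try a larger block
    done
}
```

A caller would run the slow backend only when `looks_compressible file` succeeds, which is roughly the shortcut that `-T` turns off.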
> Q: I have truckloads of ram so I can compress files much better, but can my
generated file be decompressed on machines with less ram?
> A: Yes. Ram requirements for decompression go up only by the -L compression
option with lzma and are never anywhere near as large as the compression
requirements. However if you're on 64bit and you use a compression window
greater than 2GB, it might not be possible to decompress it on 32bit machines.
> Q: Why are you including bzip2 compression?
> A: To maintain a similar compression format to the original rzip (although the
other modes are more useful).
> Q: What about multimedia?
> A: Most multimedia is already in a heavily compressed "lossy" format which by
its very nature has very little redundancy. This means that there is not much
that can actually be compressed. If your video/audio/picture is in a high
bitrate, there will be more redundancy than a low bitrate one making it more
suitable to compression. None of the compression techniques in lrzip are
optimised for this sort of data. However, the nature of rzip preparation means
that you'll still get better compression than most normal compression
algorithms give you if you have very large files. ISO images of dvds for
example are best compressed directly instead of individual .VOB files. ZPAQ is
the only compression format that can do any significant compression of
multimedia.
> Q: Is this multithreaded?
> A: As of version 0.540, the back end compression and decompression phases
are HEAVILY multithreaded, and the rzip pre-processing phase keeps running
alongside them. When using one of the more CPU intensive backend compressions
like lzma or zpaq, SMP machines will show massive speed improvements. Lrzip
will detect the number of CPUs to use, but it can be overridden with the -p
option if the slightly better compression is desired more than speed. -p 1
will give the best compression but also be the slowest.
> Q: This uses heaps of memory, can I make it use less?
> A: Well you can by setting -w to the lowest value (1) but the huge use of
memory is what makes the compression better than ordinary compression
programs so it defeats the point. You'll still derive benefit with -w 1 but
not as much.
> Q: What CFLAGS should I use?
> A: With a recent enough compiler (gcc>4), set both CFLAGS and CXXFLAGS to
`-O2 -march=native -fomit-frame-pointer`.
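
One way to apply the flags from the answer above is to export them before configuring. This is a minimal sketch that assumes you are at the top of an lrzip source tree where `./autogen.sh` has already produced `./configure`; outside such a tree it only sets the variables.

```bash
# Flags taken from the answer above.
export CFLAGS="-O2 -march=native -fomit-frame-pointer"
export CXXFLAGS="$CFLAGS"

# Only configure and build when a configure script is actually present.
if [ -x ./configure ]; then
    ./configure && make
fi
```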
> Q: What compiler does this work with?
> A: It has been tested on gcc, ekopath and the intel compiler successfully
previously. Whether the commercial compilers help or not, I could not tell you.
> Q: What codebase are you basing this on?
> A: rzip v2.1 and lzma sdk920, but it should be possible to stay in sync with
each of these in the future.
> Q: Do we really need yet another compression format?
> A: It's not really a new one at all; simply a reimplementation of a few very
good performing ones that will scale with memory and file size.
> Q: How do you use lrzip yourself?
> A: Three basic uses. I compress large files currently on my drive with the
-l option since it is so quick to get a space saving. When archiving data for
permanent storage I compress it with the default options. When compressing
small files for distribution I use the -z option for the smallest possible
size.
> Q: I found a file that compressed better with plain lzma. How can that be?
> A: When the file is more than 5 times the size of the compression window
you have available, the efficiency of rzip preparation drops off as a means
of getting better compression. Eventually when the file is large enough,
plain lzma compression will get better ratios. The lrzip compression will be
a lot faster though. The only way around this is to use as large compression
windows as possible with -U option.
> Q: Can I use swapspace as ram for lrzip with a massive window?
> A: It will indirectly do this with -U (unlimited) mode enabled. This mode will
make the compression window as big as the file itself no matter how big it is,
but it will slow down proportionately more the bigger the file is than your ram.
> Q: Why do you nice it to +19 by default? Can I speed up the compression by
changing the nice value?
> A: This is a common misconception about what nice values do. They only tell the
cpu process scheduler how to prioritise workloads, and if your application is
the _only_ thing running it will be no faster at nice -20 nor will it be any
slower at +19.
> Q: What is the LZ4 Testing option, -T?
> A: LZ4 testing is normally performed for the slower back-end compression of
LZMA and ZPAQ. The reasoning is that if it is completely incompressible by LZ4
then it will also be incompressible by them. Thus if a block fails to be
compressed by the very fast LZ4, lrzip will not attempt to compress that block
with the slower compressor, thereby saving time. If this option is enabled, it
will bypass the LZ4 testing and attempt to compress each block regardless.
> Q: Compression and decompression progress on large archives slows down and
speeds up. There's also a jump in the percentage at the end?
> A: Yes, that's the nature of the compression/decompression mechanism. The jump
is because the rzip preparation makes the amount of data much smaller than the
compression backend (lzma) needs to compress.
> Q: Tell me about patented compression algorithms, GPL, lawyers and copyright.
> A: No
> Q: I receive an error "LZMA ERROR: 2. Try a smaller compression window."
what does this mean?
> A: LZMA requests large amounts of memory. When a higher compression window is
used, there may not be enough contiguous memory for LZMA: LZMA may request up
to 25% of TOTAL ram depending on compression level. If contiguous blocks of
memory are not free, LZMA will return an error. This is not a fatal error, and
a backup mode of compression will be used.
> Q: Where can I get more information about the internals of LZMA?
> A: See http://www.7-zip.org and http://www.p7zip.org. Also, see the file
./lzma/C/lzmalib.h which explains the LZMA properties used and the LZMA
memory requirements and computation.
> Q: This version is much slower than the old version?
> A: Make sure you have set CFLAGS and CXXFLAGS. An unoptimised build will be
almost 3 times slower.
> Q: Why not update to the latest version of libzpaq?
> A: For reasons that are unclear the later versions of libzpaq create
corrupt archives when included with lrzip.
#### LIMITATIONS
Due to mmap limitations the maximum size a window can be set to is currently
2GB on 32bit unless the -U option is specified. Files generated on 64 bit
machines with windows >2GB in size might not be decompressible on 32bit
machines. Large files might not decompress on machines with less RAM if SWAP is
disabled.
#### BUGS:
Probably lots. Please report them at <https://github.com/ckolivas/lrzip/issues>
if you spot any :D
Any known ones should be documented in the file BUGS.
#### Backends:
rzip:
<http://rzip.samba.org/>
lzo:
<http://www.oberhumer.com/opensource/lzo/>
lzma:
<http://www.7-zip.org/>
zpaq:
<http://mattmahoney.net/dc/>
### Thanks (CONTRIBUTORS)
|Person(s)|Thanks for|
|---|---|
|`Andrew Tridgell`|`rzip`|
|`Markus Oberhumer`|`lzo`|
|`Igor Pavlov`|`lzma`|
|`Jean-Loup Gailly & Mark Adler`|`zlib`|
|***`Con Kolivas`***|***Original Code, binding all of this together, managing the project, original `README`***|
|`Christian Leber`|`lzma` compatibility layer|
|`Michael J Cohen`|Darwin/OSX support|
|`Lasse Collin`|fixes to `LZMALib.cpp` and `Makefile.in`|
|Everyone else who coded along the way (add yourself where appropriate if that's you)|Miscellaneous Coding|
|**`Peter Hyman`**|Most of the `0.19` to `0.24` changes|
|`^^^^^^^^^^^`|Updating the multithreaded `lzma` lib|
|`^^^^^^^^^^^`|All sorts of other features|
|`René Rhéaume`|Fixing executable stacks|
|`Ed Avis`|Various fixes|
|`Matt Mahoney`|`zpaq` integration code|
|`Jukka Laurila`|Additional Darwin/OSX support|
|`George Makrydakis`|`lrztar` wrapper|
|`Ulrich Drepper`|*special* implementation of md5|
|**`Michael Blumenkrantz`**|New config tools|
|`^^^^^^^^^^^^^^^^^^^^`|`liblrzip`|
|Authors of `PolarSSL`|Encryption code|
|`Serge Belyshev`|Extensive help, advice, and patches to implement secure encryption|
|`Jari Aalto`|Fixing typos, esp. in code|
|`Carlo Alberto Ferraris`|Code cleanup|
|`Peter Hyman`|Additional documentation|
|`Haneef Mubarak`|Cleanup, Rewrite, and GH Markdown of `README` --> `README.md`|
Persons above are listed in chronological order of first contribution to **lrzip**. Person(s) with names in **bold** have multiple major contributions, person(s) with names in *italics* have made massive contributions, person(s) with names in ***both*** have made innumerable massive contributions.
#### README Authors
Con Kolivas (`ckolivas` on GitHub) <kernel@kolivas.org>
Tuesday, 16 February 2021: README
Also documented by
Peter Hyman <pete@peterhyman.com>
Sun, 04 Jan 2009: README
Mostly Rewritten + GFMified:
Haneef Mubarak (haneefmubarak on GitHub)
Sun/Mon Sep 01-02 2013: README.md

25
TODO

@ -1,13 +1,4 @@
MAYBE TODO for lrzip program
Upgrade to newer version of zpaq supporting 3 compression levels without
relying on open_memstream so it works without temporary files on apple.
Get MD5 working on apple.
Make sure STDIO works properly on large files on apple.
Make a liblrzip library.
TODO for lrzip program
Other posix/windows builds?? Need help there...
@ -21,3 +12,17 @@ Consider ncurses version or even GUI one.
Consider using LZMA Filters for processor-optimised
coding to increase compression.
Get the ASM working on 64bit.
Clean up the config system since it's a mystery to me.
Increased multi-threading.
Make stdout work without a temporary file.
Make stdin on decompression work without a temporary file.
Make testing file integrity work without a temporary file.
Stop breaking Darwin builds :P

390
WHATS-NEW

@ -1,393 +1,3 @@
lrzip-0.651
Remove redundant files
Revert locale dependent output
Add warnings for low memory and threads
lrzip-0.650
Minor optimisations.
Exit status fixes.
Update and beautify information output.
Fix Android build.
Enable MD5 on Apple build.
Deprecate and remove liblrzip which was unused and at risk of bitrot.
Fix failures with compressing to STDOUT with inadequate memory.
Fix possible race conditions.
Fix memory leaks.
Fix -q to only hide progress.
Add -Q option for very quiet.
lrzip-0.641
Critical bugfix for broken lz4 testing which would prevent secondary
compression from being enabled.
lrzip-0.640
Numerous bugfixes and build fixes.
lz4 now used for compressibility testing (only) making lz4-dev a build
requirement.
Fixes for handling of corrupt archives without crashing.
Fixes for creating small lzma based archives to stdout.
Incomplete files are now deleted on interrupting lrzip unless the keep-broken
option is enabled.
Version prints to stdout instead of stderr.
lrzip-0.631
Assembler code is back and works with x86_64
lrzip-0.621
Substantial speed ups for the rzip stage in both regular and unlimited modes.
Lrzip now supports long command line options.
Proper support for the various forms of TMPDIR environment variables.
More unix portability fixes.
OSX fixes.
Fixed order of lrzip.conf search.
Addressed all warnings created with pedantic compiler settings and clang
Fixes for some stderr messages being swallowed up.
Fixed being unable to decompress to STDOUT when in a non-writable directory.
Changed broken liblrzip callback function API to match lrzip proper.
lrzip-0.620
Fixes display output of lrzip -i for large files greater than one chunk.
Fixes for various failure to allocate memory conditions when dealing with
large files and STDIO.
Fixes for more unix portability.
Fixes for failure to decompress to STDOUT.
lrzip-0.616
Fixes for various issues with -O not working with trailing slashes and
outputting to directories that already exist.
lrzip-0.615
Fixed -O not working on lrztar.
Made it less likely to run out of ram when working with STDIN/OUT.
Fixed running out of ram when using -U on huge files.
Fixed corrupt archives being generated from incompressible data.
Fixed corrupt archives being generated from very small files.
Fixed endianness on various platforms for MD5 calculation to work.
Fixed rare corruption when compressing with lzma from STDIN.
Fixed all blank data being generated when compressing from STDIN on OSX.
Performance micro-optimisations.
Fixed corrupt archive being generated when all the same non-zero bytes exist on
large files.
lrzip-0.614
Fixed lrztar not working.
lrzip-0.613
Fixed the bug where massive files would show an incorrect md5 value on
decompression - this was a bug from the md5 code upstream.
Compressing ultra-small files to corrupt archives was fixed.
Compilation on various other platforms was fixed.
A crash with using -S was fixed.
lrzip-0.612
Updated to a new zpaq library back end which is faster and now supports three
different compression levels, which will be activated at lrzip levels -L 1+, 4+
and 8+. This significantly increases the maximum compression available by lrzip
with -L 9.
The include file Lrzip.h used by liblrzip is now properly installed into
$prefix/include.
lrzip-0.611
lrzcat and lrzuntar have been fixed.
The update counter will continue to update even when there is nothing being
matched (like a file full of zeroes).
Numerous optimisations in the rzip stage speeds up the faster compression modes
noticeably.
Checksumming is done in a separate thread during rzip compression for more
compression speed improvements.
lrzip-0.610
The new liblrzip library allows you to add lrzip compression and decompression
to other applications with either simple lrzip_compress and lrzip_decompress
functions or fine control over all the options with low level functions.
Faster rzip stage when files are large enough to require the sliding mmap
feature (usually >1/3 of ram) and in unlimited mode.
A bug where multiple files being compressed or decompressed from the one
command line could have gotten corrupted was fixed.
Modification date of the decompressed file is now set to that of the lrzip
archive (support for storing the original file's date would require modifying
the archive format again).
Compilation warning fixes.
Make lrztar work with directories with spaces in their names.
lrzip-0.608
Faster rzip stage through use of a selective get_sb function.
The bash completion script is no longer installed by default to not conflict
with distribution bash completion packages.
More compilation fixes for non-linux platforms.
lrzip-0.607
A rare case of not being able to decompress archives was fixed.
The lzma library was updated to version 920.
A bash completion script for lrzip was added.
More debugging info was added in maximum verbose mode.
Fewer messages occur without verbose mode.
FreeBSD and posix compilation fixes were committed.
lrzip-0.606
lrzuntar, which broke last version leaving behind an untarred .tar file, is
working properly again.
lrzip-0.605
Addition of lrzcat - automatically decompresses .lrz files to stdout.
lrzip and lrunzip will no longer automatically output to stdout due to
addition of lrzcat executable, and to be consistent with gzip.
lrzip progress output will no longer spam the output unless the percentage
has changed.
lrzip now has no lower limit on file sizes it will happily compress and is
able to work with zero byte sized files.
The percentage counter when getting file info on small files will not show
%nan.
The executable bit will not be enabled when compressing via a means that
can't preserve the original permissions (e.g. from STDIN).
lrzip-0.604
lrzip will no longer fail with a "resource temporarily unavailable" error
when compressing files over 100GB that require hundreds of threads to
complete.
lrzip-0.603
lrzip now supports stdout without requiring the '-o -' option. It detects when
output is being redirected without a filename and will automatically output to
stdout so you can do:
lrunzip patch-2.6.38.4.lrz | patch -p1
Apple builds will not have errors on compressing files >2GB in size which
broke with 0.600.
lrztar will properly support -o, -O and -S.
lrzip.conf file now supports encryption.
lrzip will now warn if it's inappropriately passed a directory as an argument
directly.
lrzip-0.602
Fixed wrong symlinks which broke some package generation.
Imposed limits for 32bit machines with way too much ram for their own good.
Disable md5 generation on Apple for now since it's faulty.
Displays full version with -V.
Checks for pod2man on ./configure
Now builds on Cygwin.
File permissions are better carried over instead of being only 0600.
lrzip-0.601
lrzuntar, lrunzip symlinks and the pod-based manpages are installed again.
Configuration clearly shows now that ASM isn't supported on 64bit.
lrzip-0.600
Compressing/decompressing to/from STDIN/STDOUT now works without generating
any temporary files. Very large files compressed in this way will be less
efficiently compressed than if the whole solid file is presented to lrzip,
but it is guaranteed not to generate temporary files on compression.
Decompressing files on a machine with the same amount of ram will also not
generate temporary files, but if a file was generated on a larger ram machine,
lrzip might employ temporary files, but they will not be the full size of the
final file.
Decompression should now be faster as the rzip reconstruction stage is mostly
performed in ram before being written to disk, and testing much faster.
Final file sizes should be slightly smaller as block headers are now also
compressed.
Heavy grade encryption is now provided with the -e option. A combination of
a time scaled multiply hashed sha512 password with random salt followed by
aes128 block encryption of all data, including the data headers, provides for
extremely secure encryption. Passwords up to 500 characters in length are
supported, and the same file encrypted with the same password is virtually
guaranteed to never produce the same data twice. All data beyond the basic
lrzip opening header is completely obscured. Don't lose your password!
Lrzip will not try to malloc a negative amount of ram on smaller ram machines,
preferring to decrease the number of threads used when compressing, and then
aborting to a nominal minimum.
A new build configuration system which should be more robust and provides
neater output during compilation.
lrzip should work again on big endian hardware.
lrztar / lrzuntar will no longer use temporary files.
lrzip-0.571
Avoid spurious errors on failing to mmap a file.
Free space will now be checked to ensure there is enough room for the
compressed or decompressed file and lrzip will abort unless the -f option is
passed to it.
The extra little chunk at the end of every large file should now be fixed.
The file lzma.txt now has unix end-of-lines.
There will be a more accurate summary of what compression window will be used
when lrzip is invoked with STDIN/STDOUT.
STDIN will now be able to show estimated time to completion and percentage
complete once lrzip knows how much file is left.
Temporary files are much less likely to be left lying around.
Less temporary file space will be used when decompressing to stdout.
File checking will not be attempted when it's meaningless (like to stdout).
Times displayed should avoid the nonsense thousands of seconds bug.
lrzip-0.570
Multi-threaded performance has been improved with a significant speed-up on
both compression and decompression. New benchmark results have been added to
the README.benchmarks file.
Visual output has been further improved, with an updated help menu and no
unrelated system errors on failure.
lrzip.conf supports the newer options available.
TMP environment is now respected when using temporary files and TMPDIR can be
set in lrzip.conf.
LRZIP=NOCONFIG environment variable setting can be used to bypass lrzip.conf.
The -M option has been removed as the -U option achieves more and has
understandable semantics.
Memory usage should be very tightly controlled on compression now by default,
using the most possible without running out of ram.
Temporary files generated when doing -t from stdin will no longer be left lying
around.
lrzip will no longer stupidly sit waiting to read from stdin/stdout when called
from a terminal without other arguments.
Executable size will be slightly smaller due to stripping symbols by default
now.
The -T option no longer takes an argument. It simply denotes that lzo testing
should be disabled.
Verbose added to -i now prints a lot more information about an lrzip archive.
lrzip-0.560
Implemented OSX multi-threading by converting all semaphores to pthread_mutexes.
Converted the integrity checking to also use md5 hash checking. As a bonus it
is still backwardly compatible by still storing the crc value, and yet is
faster on large files than the old one. On decompression it detects whether
the md5 value has been stored and chooses what integrity checking to use.
Implemented the -H feature which shows the md5 hash value on compression and
decompression. It is also shown in max verbose mode.
Added information about what integrity testing will be used in verbose mode,
and with the -i option.
Added the -c option which will perform a hash check on the file generated on
disk on decompression, comparing it to that from the archive to validate the
decompressed file.
Modified lrzip to delete broken or damaged files when lrzip is interrupted or
the file generated fails an integrity test.
Added the -k keep option to keep broken or damaged files.
Case reports of corruption have been confirmed to NOT BE DUE TO LRZIP.
lrzip-0.552
Fixed a potential silent corruption bug on decompression.
Fixed compilation on freebsd.
Fixed failures on incompressible blocks with bzip2 or gzip.
Fixed osx failing to work. It does not support threaded compression or
decompression but should work again.
lrzip-0.551
Compressing from stdin should be unbroken again.
Compression values returned at the end of stdin work.
lzma failing to compress a block will not cause a failure.
lrzip-0.550
Speed up compression on large files that take more than one pass by overlapping
work on successive streams, thus using multiple CPUs better.
Fix for failures to decompress large files. Decompression will be slightly
slower but more reliable.
Faster lzma compression by default, less prone to memory failures, but at slight
compression cost.
Recover from multithreaded failures by serialising work that there isn't enough
ram to do in parallel.
Revert the "smooth out spacing" change in 0.544 as it slowed things down instead
of speeding them up.
Larger compression windows are back for 32 bits now that memory usage is kept
under better control.
Fixed some memory allocation issues which may have been causing subtle bugs.
lrzip-0.544
Hopefully a fix for corrupt decompression on large files with multiple stream 0
entries.
Fix for use under uclibc.
Fix for memory allocation errors on large files on 32 bits.
Smooth out spacing of compression threads making better use of CPU on compress
and decompress.
Fix for using -U on ultra-small files.
Use bzip2 on blocks that lzma fails to compress to make sure they are still
compressed.
lrzip-0.543
A fix for when large files being decompressed fail with multithreaded
decompression.
Slight speedup on multithreaded workloads by decreasing the nice value of the
main process compared to the back end threads as it tends to be the rate
limiting component.
Fixed lzma compression windows being set way too small by default.
lrzip-0.542
Lrzip will now try to select sane defaults for memory usage in cases where the
virtual memory heavily overcommits (eg. Linux) as this seriously slows down
compression.
For compression windows larger than 2/3 ram, lrzip will now use a sliding mmap
buffer for better performance.
The progress output is more informative in max verbose mode, and will no longer
do more passes than it estimates.
32 bit machines should be able to use slightly larger windows.
The sliding mmap not working on 2nd pass onwards has been fixed which should
speed up the slowdown of death.
lrzip-0.540
MASSIVE MULTITHREADING on the decompression phase. Provided there are enough
chunks of data in the archived file, lrzip will use as many threads as there
are CPUs for the backend decompression. Much like the multithreading on the
compression side, it makes the slower compression algorithms speed up the most.
Fixed output from being scrambled and consuming a lot of CPU time on threaded
zpaq compression.
Further fixes to ensure window sizes work on 32 bit machines.
Be more careful about testing for how much ram lrzip can use.
Minor build warning fixes.
Minor tweaks to screen output.
Updated benchmarks.
lrzip-0.530
MASSIVE MULTITHREADING on the compression phase. Lrzip will now use as many
threads as you have CPU cores for the back end compression, and even continue
doing the rzip preprocessing stage as long as it can while the other threads
continue. This makes the slower compression algorithms (lzma and zpaq) much
faster on multicore machines, to the point of making zpaq compression almost
as fast as single threaded lzma compression.
-p option added to allow you to specify number of processors to override the
built-in test, or if you wish to disable threading.
-P option to not set permissions has now been removed since failing to set
permissions is only a warning now and not a failure.
Further improvements to the progress output.
Updated benchmarks and docs.
lrzip-0.520
Just changed version numbering back to 2 point.
lrzip-0.5.2
Fixed the Darwin build again.
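
The free-space check mentioned in the lrzip-0.571 notes above could be sketched roughly as follows. This is only an illustration, assuming a POSIX system with statvfs(); the helper name has_free_space is hypothetical and is not taken from the lrzip sources.

```c
#include <stdint.h>
#include <sys/statvfs.h>

/*
 * Hypothetical helper (not lrzip's own code): returns 1 if the filesystem
 * holding `path` has at least `needed` bytes available to unprivileged
 * users, 0 if it does not, and -1 if statvfs() fails.
 */
static int has_free_space(const char *path, uint64_t needed)
{
    struct statvfs fs;
    uint64_t avail;

    if (statvfs(path, &fs) != 0)
        return -1;

    /* f_bavail counts free blocks for non-root users, in units of f_frsize. */
    avail = (uint64_t)fs.f_bavail * (uint64_t)fs.f_frsize;
    return avail >= needed ? 1 : 0;
}
```

A caller would typically abort when the result is 0 unless the user asked to force the operation, which is how the notes describe the -f option.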

545
aes.c

@ -1,545 +0,0 @@
/*
* FIPS-197 compliant AES implementation
*
* Copyright (C) 2011, Con Kolivas <kernel@kolivas.org>
* Copyright (C) 2006-2010, Brainspark B.V.
*
* This file is part of PolarSSL (http://www.polarssl.org)
* Lead Maintainer: Paul Bakker <polarssl_maintainer at polarssl.org>
*
* All rights reserved.
*
* This program is free software; you can redistribute it and/or modify
* it under the terms of the GNU General Public License as published by
* the Free Software Foundation; either version 2 of the License, or
* (at your option) any later version.
*
* This program is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
* GNU General Public License for more details.
*
* You should have received a copy of the GNU General Public License along
* with this program; if not, write to the Free Software Foundation, Inc.,
* 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA.
*/
/*
* The AES block cipher was designed by Vincent Rijmen and Joan Daemen.
*
* http://csrc.nist.gov/encryption/aes/rijndael/Rijndael.pdf
* http://csrc.nist.gov/publications/fips/fips197/fips-197.pdf
*/
#include "aes.h"
#include <string.h>
/*
* 32-bit integer manipulation macros (little endian)
*/
#ifndef GET_ULONG_LE
#define GET_ULONG_LE(n,b,i) \
{ \
(n) = ( (unsigned long) (b)[(i) ] ) \
| ( (unsigned long) (b)[(i) + 1] << 8 ) \
| ( (unsigned long) (b)[(i) + 2] << 16 ) \
| ( (unsigned long) (b)[(i) + 3] << 24 ); \
}
#endif
#ifndef PUT_ULONG_LE
#define PUT_ULONG_LE(n,b,i) \
{ \
(b)[(i) ] = (unsigned char) ( (n) ); \
(b)[(i) + 1] = (unsigned char) ( (n) >> 8 ); \
(b)[(i) + 2] = (unsigned char) ( (n) >> 16 ); \
(b)[(i) + 3] = (unsigned char) ( (n) >> 24 ); \
}
#endif
/*
* Forward S-box & tables
*/
static unsigned char FSb[256];
static unsigned long FT0[256];
static unsigned long FT1[256];
static unsigned long FT2[256];
static unsigned long FT3[256];
/*
* Reverse S-box & tables
*/
static unsigned char RSb[256];
static unsigned long RT0[256];
static unsigned long RT1[256];
static unsigned long RT2[256];
static unsigned long RT3[256];
/*
* Round constants
*/
static unsigned long RCON[10];
/*
* Tables generation code
*/
#define ROTL8(x) ( ( ( (x) << 8 ) & 0xFFFFFFFF ) | ( (x) >> 24 ) )
#define XTIME(x) ( ( x << 1 ) ^ ( ( x & 0x80 ) ? 0x1B : 0x00 ) )
#define MUL(x,y) ( ( x && y ) ? pow[(log[x]+log[y]) % 255] : 0 )
static int aes_init_done = 0;
static void aes_gen_tables( void )
{
int i, x, y, z;
int pow[256];
int log[256];
/*
* compute pow and log tables over GF(2^8)
*/
for( i = 0, x = 1; i < 256; i++ )
{
pow[i] = x;
log[x] = i;
x = ( x ^ XTIME( x ) ) & 0xFF;
}
/*
* calculate the round constants
*/
for( i = 0, x = 1; i < 10; i++ )
{
RCON[i] = (unsigned long) x;
x = XTIME( x ) & 0xFF;
}
/*
* generate the forward and reverse S-boxes
*/
FSb[0x00] = 0x63;
RSb[0x63] = 0x00;
for( i = 1; i < 256; i++ )
{
x = pow[255 - log[i]];
y = x; y = ( (y << 1) | (y >> 7) ) & 0xFF;
x ^= y; y = ( (y << 1) | (y >> 7) ) & 0xFF;
x ^= y; y = ( (y << 1) | (y >> 7) ) & 0xFF;
x ^= y; y = ( (y << 1) | (y >> 7) ) & 0xFF;
x ^= y ^ 0x63;
FSb[i] = (unsigned char) x;
RSb[x] = (unsigned char) i;
}
/*
* generate the forward and reverse tables
*/
for( i = 0; i < 256; i++ )
{
x = FSb[i];
y = XTIME( x ) & 0xFF;
z = ( y ^ x ) & 0xFF;
FT0[i] = ( (unsigned long) y ) ^
( (unsigned long) x << 8 ) ^
( (unsigned long) x << 16 ) ^
( (unsigned long) z << 24 );
FT1[i] = ROTL8( FT0[i] );
FT2[i] = ROTL8( FT1[i] );
FT3[i] = ROTL8( FT2[i] );
x = RSb[i];
RT0[i] = ( (unsigned long) MUL( 0x0E, x ) ) ^
( (unsigned long) MUL( 0x09, x ) << 8 ) ^
( (unsigned long) MUL( 0x0D, x ) << 16 ) ^
( (unsigned long) MUL( 0x0B, x ) << 24 );
RT1[i] = ROTL8( RT0[i] );
RT2[i] = ROTL8( RT1[i] );
RT3[i] = ROTL8( RT2[i] );
}
}
/*
* AES key schedule (encryption)
*/
int aes_setkey_enc( aes_context *ctx, const unsigned char *key, int keysize )
{
int i;
unsigned long *RK;
#if !defined(POLARSSL_AES_ROM_TABLES)
if( aes_init_done == 0 )
{
aes_gen_tables();
aes_init_done = 1;
}
#endif
switch( keysize )
{
case 128: ctx->nr = 10; break;
case 192: ctx->nr = 12; break;
case 256: ctx->nr = 14; break;
default : return( POLARSSL_ERR_AES_INVALID_KEY_LENGTH );
}
#if defined(PADLOCK_ALIGN16)
ctx->rk = RK = PADLOCK_ALIGN16( ctx->buf );
#else
ctx->rk = RK = ctx->buf;
#endif
for( i = 0; i < (keysize >> 5); i++ )
{
GET_ULONG_LE( RK[i], key, i << 2 );
}
switch( ctx->nr )
{
case 10:
for( i = 0; i < 10; i++, RK += 4 )
{
RK[4] = RK[0] ^ RCON[i] ^
( (unsigned long) FSb[ ( RK[3] >> 8 ) & 0xFF ] ) ^
( (unsigned long) FSb[ ( RK[3] >> 16 ) & 0xFF ] << 8 ) ^
( (unsigned long) FSb[ ( RK[3] >> 24 ) & 0xFF ] << 16 ) ^
( (unsigned long) FSb[ ( RK[3] ) & 0xFF ] << 24 );
RK[5] = RK[1] ^ RK[4];
RK[6] = RK[2] ^ RK[5];
RK[7] = RK[3] ^ RK[6];
}
break;
case 12:
for( i = 0; i < 8; i++, RK += 6 )
{
RK[6] = RK[0] ^ RCON[i] ^
( (unsigned long) FSb[ ( RK[5] >> 8 ) & 0xFF ] ) ^
( (unsigned long) FSb[ ( RK[5] >> 16 ) & 0xFF ] << 8 ) ^
( (unsigned long) FSb[ ( RK[5] >> 24 ) & 0xFF ] << 16 ) ^
( (unsigned long) FSb[ ( RK[5] ) & 0xFF ] << 24 );
RK[7] = RK[1] ^ RK[6];
RK[8] = RK[2] ^ RK[7];
RK[9] = RK[3] ^ RK[8];
RK[10] = RK[4] ^ RK[9];
RK[11] = RK[5] ^ RK[10];
}
break;
case 14:
for( i = 0; i < 7; i++, RK += 8 )
{
RK[8] = RK[0] ^ RCON[i] ^
( (unsigned long) FSb[ ( RK[7] >> 8 ) & 0xFF ] ) ^
( (unsigned long) FSb[ ( RK[7] >> 16 ) & 0xFF ] << 8 ) ^
( (unsigned long) FSb[ ( RK[7] >> 24 ) & 0xFF ] << 16 ) ^
( (unsigned long) FSb[ ( RK[7] ) & 0xFF ] << 24 );
RK[9] = RK[1] ^ RK[8];
RK[10] = RK[2] ^ RK[9];
RK[11] = RK[3] ^ RK[10];
RK[12] = RK[4] ^
( (unsigned long) FSb[ ( RK[11] ) & 0xFF ] ) ^
( (unsigned long) FSb[ ( RK[11] >> 8 ) & 0xFF ] << 8 ) ^
( (unsigned long) FSb[ ( RK[11] >> 16 ) & 0xFF ] << 16 ) ^
( (unsigned long) FSb[ ( RK[11] >> 24 ) & 0xFF ] << 24 );
RK[13] = RK[5] ^ RK[12];
RK[14] = RK[6] ^ RK[13];
RK[15] = RK[7] ^ RK[14];
}
break;
default:
break;
}
return( 0 );
}
/*
* AES key schedule (decryption)
*/
int aes_setkey_dec( aes_context *ctx, const unsigned char *key, int keysize )
{
int i, j;
aes_context cty;
unsigned long *RK;
unsigned long *SK;
int ret;
switch( keysize )
{
case 128: ctx->nr = 10; break;
case 192: ctx->nr = 12; break;
case 256: ctx->nr = 14; break;
default : return( POLARSSL_ERR_AES_INVALID_KEY_LENGTH );
}
#if defined(PADLOCK_ALIGN16)
ctx->rk = RK = PADLOCK_ALIGN16( ctx->buf );
#else
ctx->rk = RK = ctx->buf;
#endif
ret = aes_setkey_enc( &cty, key, keysize );
if( ret != 0 )
return( ret );
SK = cty.rk + cty.nr * 4;
*RK++ = *SK++;
*RK++ = *SK++;
*RK++ = *SK++;
*RK++ = *SK++;
for( i = ctx->nr - 1, SK -= 8; i > 0; i--, SK -= 8 )
{
for( j = 0; j < 4; j++, SK++ )
{
*RK++ = RT0[ FSb[ ( *SK ) & 0xFF ] ] ^
RT1[ FSb[ ( *SK >> 8 ) & 0xFF ] ] ^
RT2[ FSb[ ( *SK >> 16 ) & 0xFF ] ] ^
RT3[ FSb[ ( *SK >> 24 ) & 0xFF ] ];
}
}
*RK++ = *SK++;
*RK++ = *SK++;
*RK++ = *SK++;
*RK++ = *SK++;
memset( &cty, 0, sizeof( aes_context ) );
return( 0 );
}
#define AES_FROUND(X0,X1,X2,X3,Y0,Y1,Y2,Y3) \
{ \
X0 = *RK++ ^ FT0[ ( Y0 ) & 0xFF ] ^ \
FT1[ ( Y1 >> 8 ) & 0xFF ] ^ \
FT2[ ( Y2 >> 16 ) & 0xFF ] ^ \
FT3[ ( Y3 >> 24 ) & 0xFF ]; \
\
X1 = *RK++ ^ FT0[ ( Y1 ) & 0xFF ] ^ \
FT1[ ( Y2 >> 8 ) & 0xFF ] ^ \
FT2[ ( Y3 >> 16 ) & 0xFF ] ^ \
FT3[ ( Y0 >> 24 ) & 0xFF ]; \
\
X2 = *RK++ ^ FT0[ ( Y2 ) & 0xFF ] ^ \
FT1[ ( Y3 >> 8 ) & 0xFF ] ^ \
FT2[ ( Y0 >> 16 ) & 0xFF ] ^ \
FT3[ ( Y1 >> 24 ) & 0xFF ]; \
\
X3 = *RK++ ^ FT0[ ( Y3 ) & 0xFF ] ^ \
FT1[ ( Y0 >> 8 ) & 0xFF ] ^ \
FT2[ ( Y1 >> 16 ) & 0xFF ] ^ \
FT3[ ( Y2 >> 24 ) & 0xFF ]; \
}
#define AES_RROUND(X0,X1,X2,X3,Y0,Y1,Y2,Y3) \
{ \
X0 = *RK++ ^ RT0[ ( Y0 ) & 0xFF ] ^ \
RT1[ ( Y3 >> 8 ) & 0xFF ] ^ \
RT2[ ( Y2 >> 16 ) & 0xFF ] ^ \
RT3[ ( Y1 >> 24 ) & 0xFF ]; \
\
X1 = *RK++ ^ RT0[ ( Y1 ) & 0xFF ] ^ \
RT1[ ( Y0 >> 8 ) & 0xFF ] ^ \
RT2[ ( Y3 >> 16 ) & 0xFF ] ^ \
RT3[ ( Y2 >> 24 ) & 0xFF ]; \
\
X2 = *RK++ ^ RT0[ ( Y2 ) & 0xFF ] ^ \
RT1[ ( Y1 >> 8 ) & 0xFF ] ^ \
RT2[ ( Y0 >> 16 ) & 0xFF ] ^ \
RT3[ ( Y3 >> 24 ) & 0xFF ]; \
\
X3 = *RK++ ^ RT0[ ( Y3 ) & 0xFF ] ^ \
RT1[ ( Y2 >> 8 ) & 0xFF ] ^ \
RT2[ ( Y1 >> 16 ) & 0xFF ] ^ \
RT3[ ( Y0 >> 24 ) & 0xFF ]; \
}
/*
* AES-ECB block encryption/decryption
*/
int aes_crypt_ecb( aes_context *ctx,
int mode,
const unsigned char input[16],
unsigned char output[16] )
{
int i;
unsigned long *RK, X0, X1, X2, X3, Y0, Y1, Y2, Y3;
#if defined(POLARSSL_PADLOCK_C) && defined(POLARSSL_HAVE_X86)
if( padlock_supports( PADLOCK_ACE ) )
{
if( padlock_xcryptecb( ctx, mode, input, output ) == 0 )
return( 0 );
// If padlock data misaligned, we just fall back to
// unaccelerated mode
//
}
#endif
RK = ctx->rk;
GET_ULONG_LE( X0, input, 0 ); X0 ^= *RK++;
GET_ULONG_LE( X1, input, 4 ); X1 ^= *RK++;
GET_ULONG_LE( X2, input, 8 ); X2 ^= *RK++;
GET_ULONG_LE( X3, input, 12 ); X3 ^= *RK++;
if( mode == AES_DECRYPT )
{
for( i = (ctx->nr >> 1) - 1; i > 0; i-- )
{
AES_RROUND( Y0, Y1, Y2, Y3, X0, X1, X2, X3 );
AES_RROUND( X0, X1, X2, X3, Y0, Y1, Y2, Y3 );
}
AES_RROUND( Y0, Y1, Y2, Y3, X0, X1, X2, X3 );
X0 = *RK++ ^ \
( (unsigned long) RSb[ ( Y0 ) & 0xFF ] ) ^
( (unsigned long) RSb[ ( Y3 >> 8 ) & 0xFF ] << 8 ) ^
( (unsigned long) RSb[ ( Y2 >> 16 ) & 0xFF ] << 16 ) ^
( (unsigned long) RSb[ ( Y1 >> 24 ) & 0xFF ] << 24 );
X1 = *RK++ ^ \
( (unsigned long) RSb[ ( Y1 ) & 0xFF ] ) ^
( (unsigned long) RSb[ ( Y0 >> 8 ) & 0xFF ] << 8 ) ^
( (unsigned long) RSb[ ( Y3 >> 16 ) & 0xFF ] << 16 ) ^
( (unsigned long) RSb[ ( Y2 >> 24 ) & 0xFF ] << 24 );
X2 = *RK++ ^ \
( (unsigned long) RSb[ ( Y2 ) & 0xFF ] ) ^
( (unsigned long) RSb[ ( Y1 >> 8 ) & 0xFF ] << 8 ) ^
( (unsigned long) RSb[ ( Y0 >> 16 ) & 0xFF ] << 16 ) ^
( (unsigned long) RSb[ ( Y3 >> 24 ) & 0xFF ] << 24 );
X3 = *RK++ ^ \
( (unsigned long) RSb[ ( Y3 ) & 0xFF ] ) ^
( (unsigned long) RSb[ ( Y2 >> 8 ) & 0xFF ] << 8 ) ^
( (unsigned long) RSb[ ( Y1 >> 16 ) & 0xFF ] << 16 ) ^
( (unsigned long) RSb[ ( Y0 >> 24 ) & 0xFF ] << 24 );
}
else /* AES_ENCRYPT */
{
for( i = (ctx->nr >> 1) - 1; i > 0; i-- )
{
AES_FROUND( Y0, Y1, Y2, Y3, X0, X1, X2, X3 );
AES_FROUND( X0, X1, X2, X3, Y0, Y1, Y2, Y3 );
}
AES_FROUND( Y0, Y1, Y2, Y3, X0, X1, X2, X3 );
X0 = *RK++ ^ \
( (unsigned long) FSb[ ( Y0 ) & 0xFF ] ) ^
( (unsigned long) FSb[ ( Y1 >> 8 ) & 0xFF ] << 8 ) ^
( (unsigned long) FSb[ ( Y2 >> 16 ) & 0xFF ] << 16 ) ^
( (unsigned long) FSb[ ( Y3 >> 24 ) & 0xFF ] << 24 );
X1 = *RK++ ^ \
( (unsigned long) FSb[ ( Y1 ) & 0xFF ] ) ^
( (unsigned long) FSb[ ( Y2 >> 8 ) & 0xFF ] << 8 ) ^
( (unsigned long) FSb[ ( Y3 >> 16 ) & 0xFF ] << 16 ) ^
( (unsigned long) FSb[ ( Y0 >> 24 ) & 0xFF ] << 24 );
X2 = *RK++ ^ \
( (unsigned long) FSb[ ( Y2 ) & 0xFF ] ) ^
( (unsigned long) FSb[ ( Y3 >> 8 ) & 0xFF ] << 8 ) ^
( (unsigned long) FSb[ ( Y0 >> 16 ) & 0xFF ] << 16 ) ^
( (unsigned long) FSb[ ( Y1 >> 24 ) & 0xFF ] << 24 );
X3 = *RK++ ^ \
( (unsigned long) FSb[ ( Y3 ) & 0xFF ] ) ^
( (unsigned long) FSb[ ( Y0 >> 8 ) & 0xFF ] << 8 ) ^
( (unsigned long) FSb[ ( Y1 >> 16 ) & 0xFF ] << 16 ) ^
( (unsigned long) FSb[ ( Y2 >> 24 ) & 0xFF ] << 24 );
}
PUT_ULONG_LE( X0, output, 0 );
PUT_ULONG_LE( X1, output, 4 );
PUT_ULONG_LE( X2, output, 8 );
PUT_ULONG_LE( X3, output, 12 );
return( 0 );
}
/*
* AES-CBC buffer encryption/decryption
*/
int aes_crypt_cbc( aes_context *ctx,
int mode,
long long int length,
unsigned char iv[16],
const unsigned char *input,
unsigned char *output )
{
int i;
unsigned char temp[16];
if( length % 16 )
return( POLARSSL_ERR_AES_INVALID_INPUT_LENGTH );
#if defined(POLARSSL_PADLOCK_C) && defined(POLARSSL_HAVE_X86)
if( padlock_supports( PADLOCK_ACE ) )
{
if( padlock_xcryptcbc( ctx, mode, length, iv, input, output ) == 0 )
return( 0 );
// If padlock data misaligned, we just fall back to
// unaccelerated mode
//
}
#endif
if( mode == AES_DECRYPT )
{
while( length > 0 )
{
memcpy( temp, input, 16 );
aes_crypt_ecb( ctx, mode, input, output );
for( i = 0; i < 16; i++ )
output[i] = (unsigned char)( output[i] ^ iv[i] );
memcpy( iv, temp, 16 );
input += 16;
output += 16;
length -= 16;
}
}
else
{
while( length > 0 )
{
for( i = 0; i < 16; i++ )
output[i] = (unsigned char)( input[i] ^ iv[i] );
aes_crypt_ecb( ctx, mode, output, output );
memcpy( iv, output, 16 );
input += 16;
output += 16;
length -= 16;
}
}
return( 0 );
}
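
The S-box construction in aes_gen_tables() above can be checked in isolation. The following is a minimal standalone sketch of the same approach (pow/log tables over GF(2^8) with generator 3, multiplicative inverse, then the affine transform); the names xtime, rotl1 and build_sbox are chosen here for illustration and do not appear in aes.c.

```c
#include <stdint.h>

/* Multiply by 2 in GF(2^8), reducing by the AES polynomial (0x11B). */
static uint8_t xtime(uint8_t x)
{
    return (uint8_t)((x << 1) ^ ((x & 0x80) ? 0x1B : 0x00));
}

/* Rotate an 8-bit value left by one bit. */
static uint8_t rotl1(uint8_t y)
{
    return (uint8_t)((y << 1) | (y >> 7));
}

/* Fill sbox[] with the AES forward S-box. */
static void build_sbox(uint8_t sbox[256])
{
    uint8_t pow_tab[256], log_tab[256];
    uint8_t x = 1;
    int i;

    /* pow_tab[i] = 3^i; the final pass sets log_tab[1] = 255, as in aes.c. */
    for (i = 0; i < 256; i++) {
        pow_tab[i] = x;
        log_tab[x] = (uint8_t)i;
        x = (uint8_t)(x ^ xtime(x));        /* multiply by the generator 3 */
    }

    sbox[0] = 0x63;                          /* zero has no inverse */
    for (i = 1; i < 256; i++) {
        uint8_t inv = pow_tab[255 - log_tab[i]]; /* multiplicative inverse */
        uint8_t y = rotl1(inv);
        uint8_t s = inv;

        /* Affine transform: inv ^ rot1 ^ rot2 ^ rot3 ^ rot4 ^ 0x63. */
        s ^= y; y = rotl1(y);
        s ^= y; y = rotl1(y);
        s ^= y; y = rotl1(y);
        s ^= (uint8_t)(y ^ 0x63);
        sbox[i] = s;
    }
}
```

Comparing a few entries against the table published in FIPS-197 (for example the entry for input 0x53) is a quick sanity check that the generated tables are right.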

140
aes.h

@ -1,140 +0,0 @@
/**
* \file aes.h
*
* Copyright (C) 2011, Con Kolivas <kernel@kolivas.org>
* Copyright (C) 2006-2010, Brainspark B.V.
*
* This file is part of PolarSSL (http://www.polarssl.org)
* Lead Maintainer: Paul Bakker <polarssl_maintainer at polarssl.org>
*
* All rights reserved.
*
* This program is free software; you can redistribute it and/or modify
* it under the terms of the GNU General Public License as published by
* the Free Software Foundation; either version 2 of the License, or
* (at your option) any later version.
*
* This program is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
* GNU General Public License for more details.
*
* You should have received a copy of the GNU General Public License along
* with this program; if not, write to the Free Software Foundation, Inc.,
* 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA.
*/
#ifndef POLARSSL_AES_H
#define POLARSSL_AES_H
#define AES_ENCRYPT 1
#define AES_DECRYPT 0
#define POLARSSL_ERR_AES_INVALID_KEY_LENGTH -0x0800
#define POLARSSL_ERR_AES_INVALID_INPUT_LENGTH -0x0810
/**
* \brief AES context structure
*/
typedef struct
{
int nr; /*!< number of rounds */
unsigned long *rk; /*!< AES round keys */
unsigned long buf[68]; /*!< unaligned data */
}
aes_context;
#ifdef __cplusplus
extern "C" {
#endif
/**
* \brief AES key schedule (encryption)
*
* \param ctx AES context to be initialized
* \param key encryption key
* \param keysize must be 128, 192 or 256
*
* \return 0 if successful, or POLARSSL_ERR_AES_INVALID_KEY_LENGTH
*/
int aes_setkey_enc( aes_context *ctx, const unsigned char *key, int keysize );
/**
* \brief AES key schedule (decryption)
*
* \param ctx AES context to be initialized
* \param key decryption key
* \param keysize must be 128, 192 or 256
*
* \return 0 if successful, or POLARSSL_ERR_AES_INVALID_KEY_LENGTH
*/
int aes_setkey_dec( aes_context *ctx, const unsigned char *key, int keysize );
/**
* \brief AES-ECB block encryption/decryption
*
* \param ctx AES context
* \param mode AES_ENCRYPT or AES_DECRYPT
* \param input 16-byte input block
* \param output 16-byte output block
*
* \return 0 if successful
*/
int aes_crypt_ecb( aes_context *ctx,
int mode,
const unsigned char input[16],
unsigned char output[16] );
/**
* \brief AES-CBC buffer encryption/decryption
* Length should be a multiple of the block
* size (16 bytes)
*
* \param ctx AES context
* \param mode AES_ENCRYPT or AES_DECRYPT
* \param length length of the input data
* \param iv initialization vector (updated after use)
* \param input buffer holding the input data
* \param output buffer holding the output data
*
* \return 0 if successful, or POLARSSL_ERR_AES_INVALID_INPUT_LENGTH
*/
int aes_crypt_cbc( aes_context *ctx,
int mode,
long long int length,
unsigned char iv[16],
const unsigned char *input,
unsigned char *output );
/**
* \brief AES-CFB128 buffer encryption/decryption.
*
* \param ctx AES context
* \param mode AES_ENCRYPT or AES_DECRYPT
* \param length length of the input data
* \param iv_off offset in IV (updated after use)
* \param iv initialization vector (updated after use)
* \param input buffer holding the input data
* \param output buffer holding the output data
*
* \return 0 if successful
*/
int aes_crypt_cfb128( aes_context *ctx,
int mode,
int length,
int *iv_off,
unsigned char iv[16],
const unsigned char *input,
unsigned char *output );
/**
* \brief Checkup routine
*
* \return 0 if successful, or 1 if the test failed
*/
int aes_self_test( int verbose );
#ifdef __cplusplus
}
#endif
#endif /* aes.h */
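
The CBC chaining behind the aes_crypt_cbc() interface declared above can be sketched without the AES tables. In this illustration a toy block transform stands in for aes_crypt_ecb() so that the example is self-contained; it is not secure, and the names toy_encrypt_block, toy_decrypt_block and cbc_crypt are hypothetical rather than part of aes.h.

```c
#include <string.h>

#define BLOCK 16

/* Toy stand-in for a block cipher: XOR with the key, then shift bytes by one. */
static void toy_encrypt_block(const unsigned char key[BLOCK],
                              const unsigned char in[BLOCK],
                              unsigned char out[BLOCK])
{
    unsigned char tmp[BLOCK];
    int i;

    for (i = 0; i < BLOCK; i++)
        tmp[(i + 1) % BLOCK] = (unsigned char)(in[i] ^ key[i]);
    memcpy(out, tmp, BLOCK);                 /* tmp allows in == out */
}

/* Inverse of toy_encrypt_block. */
static void toy_decrypt_block(const unsigned char key[BLOCK],
                              const unsigned char in[BLOCK],
                              unsigned char out[BLOCK])
{
    unsigned char tmp[BLOCK];
    int i;

    for (i = 0; i < BLOCK; i++)
        tmp[i] = (unsigned char)(in[(i + 1) % BLOCK] ^ key[i]);
    memcpy(out, tmp, BLOCK);
}

/* CBC over whole blocks; iv is updated so that calls can be chained. */
static int cbc_crypt(int encrypt, const unsigned char key[BLOCK], size_t length,
                     unsigned char iv[BLOCK],
                     const unsigned char *input, unsigned char *output)
{
    unsigned char temp[BLOCK];
    int i;

    if (length % BLOCK)
        return -1;                           /* only whole blocks are accepted */

    while (length > 0) {
        if (encrypt) {
            /* XOR plaintext with the previous ciphertext block, then encrypt. */
            for (i = 0; i < BLOCK; i++)
                output[i] = (unsigned char)(input[i] ^ iv[i]);
            toy_encrypt_block(key, output, output);
            memcpy(iv, output, BLOCK);
        } else {
            /* Save the ciphertext first: it is the IV for the next block. */
            memcpy(temp, input, BLOCK);
            toy_decrypt_block(key, input, output);
            for (i = 0; i < BLOCK; i++)
                output[i] = (unsigned char)(output[i] ^ iv[i]);
            memcpy(iv, temp, BLOCK);
        }
        input += BLOCK;
        output += BLOCK;
        length -= BLOCK;
    }
    return 0;
}
```

Because each block is mixed with the previous ciphertext block, two identical plaintext blocks produce different ciphertext blocks, and decryption must start from the same initial IV that encryption used.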


@ -1,17 +0,0 @@
#!/bin/sh
cwd="$PWD"
bs_dir="$(dirname "$(readlink -f "$0")")"
rm -rf "${bs_dir}"/autom4te.cache
rm -f "${bs_dir}"/aclocal.m4 "${bs_dir}"/ltmain.sh
echo 'Running autoreconf -if...'
autoreconf -if || exit 1
if test -z "$NOCONFIGURE" ; then
echo 'Configuring...'
cd "${bs_dir}" > /dev/null 2>&1
test "$?" = "0" || e=1
test "$cwd" != "$bs_dir" && cd "$bs_dir" > /dev/null 2>&1
./configure "$@"
test "$e" = "1" && exit 1
cd "$cwd"
fi

1502
config.guess vendored Normal file

File diff suppressed because it is too large

134
config.h.in Normal file

@ -0,0 +1,134 @@
/* config.h.in. Generated from configure.in by autoheader. */
/* Define to 1 if you have the <ctype.h> header file. */
#undef HAVE_CTYPE_H
/* Define to 1 if errno.h present */
#undef HAVE_ERRNO_DECL
/* Define to 1 if you have the <fcntl.h> header file. */
#undef HAVE_FCNTL_H
/* Define to 1 if you have the `getopt_long' function. */
#undef HAVE_GETOPT_LONG
/* Define to 1 if you have the <inttypes.h> header file. */
#undef HAVE_INTTYPES_H
/* */
#undef HAVE_LARGE_FILES
/* Define to 1 if you have the `bz2' library (-lbz2). */
#undef HAVE_LIBBZ2
/* Define to 1 if you have the `lzo2' library (-llzo2). */
#undef HAVE_LIBLZO2
/* Define to 1 if you have the `m' library (-lm). */
#undef HAVE_LIBM
/* Define to 1 if you have the `pthread' library (-lpthread). */
#undef HAVE_LIBPTHREAD
/* Define to 1 if you have the `z' library (-lz). */
#undef HAVE_LIBZ
/* Define to 1 if you have the <memory.h> header file. */
#undef HAVE_MEMORY_H
/* Define to 1 if you have the `mmap' function. */
#undef HAVE_MMAP
/* Define to 1 if you have the <stdint.h> header file. */
#undef HAVE_STDINT_H
/* Define to 1 if you have the <stdlib.h> header file. */
#undef HAVE_STDLIB_H
/* Define to 1 if you have the `strerror' function. */
#undef HAVE_STRERROR
/* Define to 1 if you have the <strings.h> header file. */
#undef HAVE_STRINGS_H
/* Define to 1 if you have the <string.h> header file. */
#undef HAVE_STRING_H
/* Define to 1 if you have the <sys/ioctl.h> header file. */
#undef HAVE_SYS_IOCTL_H
/* Define to 1 if you have the <sys/param.h> header file. */
#undef HAVE_SYS_PARAM_H
/* Define to 1 if you have the <sys/stat.h> header file. */
#undef HAVE_SYS_STAT_H
/* Define to 1 if you have the <sys/time.h> header file. */
#undef HAVE_SYS_TIME_H
/* Define to 1 if you have the <sys/types.h> header file. */
#undef HAVE_SYS_TYPES_H
/* Define to 1 if you have the <sys/unistd.h> header file. */
#undef HAVE_SYS_UNISTD_H
/* Define to 1 if you have the <sys/wait.h> header file. */
#undef HAVE_SYS_WAIT_H
/* Define to 1 if you have the <unistd.h> header file. */
#undef HAVE_UNISTD_H
/* Define to the address where bug reports for this package should be sent. */
#undef PACKAGE_BUGREPORT
/* Define to the full name of this package. */
#undef PACKAGE_NAME
/* Define to the full name and version of this package. */
#undef PACKAGE_STRING
/* Define to the one symbol short name of this package. */
#undef PACKAGE_TARNAME
/* Define to the version of this package. */
#undef PACKAGE_VERSION
/* The size of `int', as computed by sizeof. */
#undef SIZEOF_INT
/* The size of `long', as computed by sizeof. */
#undef SIZEOF_LONG
/* The size of `short', as computed by sizeof. */
#undef SIZEOF_SHORT
/* Define to 1 if you have the ANSI C header files. */
#undef STDC_HEADERS
/* Number of bits in a file offset, on hosts where this is settable. */
#undef _FILE_OFFSET_BITS
/* Define to make ftello visible on some hosts (e.g. HP-UX 10.20). */
#undef _LARGEFILE_SOURCE
/* Define for large files, on AIX-style hosts. */
#undef _LARGE_FILES
/* Define to make ftello visible on some hosts (e.g. glibc 2.1.3). */
#undef _XOPEN_SOURCE
/* Define to `__inline__' or `__inline' if that's what the C compiler
calls it, or to nothing if 'inline' is not supported under any name. */
#ifndef __cplusplus
#undef inline
#endif
/* Define to `long int' if <sys/types.h> does not define. */
#undef off_t
/* Define to `unsigned int' if <sys/types.h> does not define. */
#undef size_t
#define _GNU_SOURCE
#define _LARGEFILE64_SOURCE
#define _FILE_OFFSET_BITS 64

1714
config.sub vendored Normal file

File diff suppressed because it is too large

6095
configure vendored Executable file

File diff suppressed because it is too large


@ -1,49 +1,22 @@
##--##--##--##--##--##--##--##--##--##--##--##--##--##--##--##--##
##--##--##--##--##--##--##--##--##--##--##--##--##--##--##--##--##
m4_define([v_maj], [0])
m4_define([v_min], [6])
m4_define([v_mic], [51])
##--##--##--##--##--##--##--##--##--##--##--##--##--##--##--##--##
m4_define([v_v], m4_join([], v_min, v_mic))
m4_define([v_ver], [v_maj.v_v])
m4_define([lt_rev], m4_eval(v_maj + v_min))
m4_define([lt_cur], v_mic)
m4_define([lt_age], v_min)
##--##--##--##--##--##--##--##--##--##--##--##--##--##--##--##--##
##--##--##--##--##--##--##--##--##--##--##--##--##--##--##--##--##
dnl Process this file with autoconf to produce a configure script.
AC_INIT([lrzip],[v_ver],[kernel@kolivas.org])
AC_PREREQ([2.71])
AC_CONFIG_SRCDIR([configure.ac])
AC_CONFIG_MACRO_DIR([m4])
AC_CONFIG_HEADERS([config.h])
AM_INIT_AUTOMAKE([1.6 dist-bzip2 foreign subdir-objects])
m4_ifdef([AM_SILENT_RULES], [AM_SILENT_RULES([yes])])
AC_USE_SYSTEM_EXTENSIONS
LT_INIT
##--##--##--##--##--##--##--##--##--##--##--##--##--##--##--##--##
##--##--##--##--##--##--##--##--##--##--##--##--##--##--##--##--##
m4_ifdef([v_rev], , [m4_define([v_rev], [0])])
m4_ifdef([v_rel], , [m4_define([v_rel], [])])
AC_DEFINE_UNQUOTED(LRZIP_MAJOR_VERSION, [v_maj], [Major version])
AC_DEFINE_UNQUOTED(LRZIP_MINOR_VERSION, [v_min], [Minor version])
AC_DEFINE_UNQUOTED(LRZIP_MINOR_SUBVERSION, [v_mic], [Micro version])
version_info="lt_rev:lt_cur:lt_age"
release_info="v_rel"
AC_SUBST(version_info)
AC_SUBST(release_info)
##--##--##--##--##--##--##--##--##--##--##--##--##--##--##--##--##
##--##--##--##--##--##--##--##--##--##--##--##--##--##--##--##--##
VMAJ=v_maj
AC_SUBST(VMAJ)
AC_CANONICAL_BUILD
AC_INIT([lrzip],[0.5.2],[kernel@kolivas.org],[lrzip-0.5.2])
AC_CONFIG_HEADER(config.h)
# see what our system is!
AC_CANONICAL_HOST
AC_ARG_ENABLE(
asm,
[AC_HELP_STRING([--enable-asm],[Enable native Assembly code])],
ASM=$enableval,
ASM=yes
)
if test x"$ASM" = xyes; then
AC_CHECK_PROG( ASM_PROG, nasm, yes, no )
if test x"$ASM_PROG" = x"no"; then
ASM=no
fi
fi
dnl Checks for programs.
AC_PROG_CC
AC_PROG_CXX
@ -51,42 +24,18 @@ AC_PROG_INSTALL
AC_PROG_LN_S
AC_SUBST(SHELL)
AC_SYS_LARGEFILE
AC_FUNC_FSEEKO
AC_FUNC_ALLOCA
AC_CHECK_PROG([HAVE_POD2MAN], [pod2man], [yes])
AS_IF([test "$HAVE_POD2MAN" != "yes"],
AC_MSG_FAILURE([pod2man is needed to generate manual from POD]))
AC_ARG_ENABLE(
asm,
[AS_HELP_STRING([--enable-asm],[Enable native Assembly code])],
ASM=$enableval,
ASM=yes
)
if test x"$ASM" = x"yes"; then
AC_CHECK_PROG( ASM_PROG, nasm, nasm, no ) # fix to set ASM_PROG to nasm, not yes.
if test x"$ASM_PROG" = x"no"; then
ASM=no
fi
if test x"$GCC" = xyes; then
CFLAGS="$CFLAGS -Wall -W"
fi
static=no
AC_ARG_ENABLE([static-bin],
[AS_HELP_STRING([--enable-static-bin],[Build statically linked binary @<:@default=no@:>@])],
[static=$enableval]
)
AM_CONDITIONAL([STATIC], [test x"$static" = x"yes"])
AC_CHECK_HEADERS(fcntl.h sys/time.h unistd.h sys/mman.h)
AC_CHECK_HEADERS(ctype.h errno.h sys/resource.h)
AC_CHECK_HEADERS(endian.h sys/endian.h arpa/inet.h)
AC_CHECK_HEADERS(alloca.h pthread.h)
AC_CHECK_HEADERS(fcntl.h sys/time.h sys/unistd.h unistd.h)
AC_CHECK_HEADERS(sys/param.h ctype.h sys/wait.h sys/ioctl.h)
AC_CHECK_HEADERS(string.h stdlib.h sys/types.h)
AC_TYPE_OFF_T
AC_TYPE_SIZE_T
AC_C___ATTRIBUTE__
AC_CHECK_SIZEOF(int)
AC_CHECK_SIZEOF(long)
AC_CHECK_SIZEOF(short)
@ -100,9 +49,8 @@ if test x"$rzip_cv_HAVE_LARGE_FILES" = x"yes"; then
AC_DEFINE(HAVE_LARGE_FILES, 1, [ ])
fi
AC_C_INLINE
AC_C_BIGENDIAN
AC_C_INLINE
AC_CHECK_LIB(pthread, pthread_create, ,
AC_MSG_ERROR([Could not find pthread library - please install libpthread]))
@ -114,68 +62,34 @@ AC_CHECK_LIB(bz2, BZ2_bzBuffToBuffCompress, ,
AC_MSG_ERROR([Could not find bz2 library - please install libbz2-dev]))
AC_CHECK_LIB(lzo2, lzo1x_1_compress, ,
AC_MSG_ERROR([Could not find lzo2 library - please install liblzo2-dev]))
AC_CHECK_LIB(lz4, LZ4_compress_default, ,
AC_MSG_ERROR([Could not find lz4 library - please install liblz4-dev]))
AC_DEFINE([HAVE_ERRNO_DECL],[0],[Define to 1 if errno.h present])
echo $ECHO_N "checking for errno in errno.h... $ECHO_C"
AC_COMPILE_IFELSE([AC_LANG_PROGRAM([[#include <errno.h>]], [[int i = errno]])],[echo yes; AC_DEFINE(HAVE_ERRNO_DECL)],[echo no])
AC_CHECK_FUNCS(mmap strerror)
AC_CHECK_FUNCS(getopt_long)
AX_PTHREAD
LIBS="$PTHREAD_LIBS $LIBS"
CFLAGS="$CFLAGS $PTHREAD_CFLAGS"
CXXFLAGS="$CXXFLAGS $PTHREAD_CXXFLAGS"
# final checks for assembler
# ASM is back for x86_64 by using newer CRC code from p7zip-16.02
# object files handled in lzma/C/Makefile.am
if test x"$ASM" = x"yes"; then
ASM_OPT="-I../ASM/x86/"
# final checks for x86 and/or assembler
if test x"$ASM" = x"no"; then
ASM_OBJ=7zCrc.o
ASM=
else
case $host in
i?86-*)
ASM_OPT="$ASM_OPT -g -f elf" ;;
x86_64-*)
ASM_OPT="$ASM_OPT -Dx64 -g -f elf64" ;;
*) ASM_OPT= ;;
ASM_OBJ="7zCrcT8.o 7zCrcT8U.o"
ASM="nasm -f elf" ;;
# x86_64 code is broken still
# x86_64-*)
# ASM_OBJ="7zCrcT8.o 7zCrcT8U_64.o"
# ASM="nasm -f elf64" ;;
*) ASM_OBJ=7zCrc.o ;;
esac
else
ASM_OPT=
fi
AM_CONDITIONAL([USE_ASM], [test x"$ASM" = x"yes"])
AC_SUBST([ASM_OPT])
AC_SUBST([ASM_CMD])
EFL_CHECK_DOXYGEN([build_doc="yes"], [build_doc="no"])
AC_CONFIG_FILES([
Makefile
lzma/Makefile
lzma/C/Makefile
lzma/ASM/x86/Makefile
doc/Makefile
man/Makefile
])
AC_SUBST([ASM_OBJ])
AC_SUBST([ASM])
AC_CONFIG_FILES([Makefile])
AC_OUTPUT
echo
echo
echo
echo "------------------------------------------------------------------------"
echo "$PACKAGE $VERSION"
echo "------------------------------------------------------------------------"
echo
echo
echo "Configuration Options Summary:"
echo
echo " ASM................: $ASM"
echo " Static binary......: $static"
echo
echo "Documentation..........: ${build_doc}"
echo
echo "Compilation............: make (or gmake)"
echo " CPPFLAGS.............: $CPPFLAGS"
echo " CFLAGS...............: $CFLAGS"
echo " CXXFLAGS.............: $CXXFLAGS"
echo " LDFLAGS..............: $LDFLAGS"
echo
echo "Installation...........: make install (as root if needed, with 'su' or 'sudo')"
echo " prefix...............: $prefix"
echo


@ -1,58 +0,0 @@
/*
Copyright (C) 2012 Con Kolivas
This program is free software; you can redistribute it and/or modify
it under the terms of the GNU General Public License as published by
the Free Software Foundation; either version 2 of the License, or
(at your option) any later version.
This program is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
GNU General Public License for more details.
You should have received a copy of the GNU General Public License
along with this program. If not, see <http://www.gnu.org/licenses/>.
*/
#ifdef HAVE_CONFIG_H
# include "config.h"
#endif
#undef NDEBUG
#include <stdlib.h>
#include <stdio.h>
#include <string.h>
#include <assert.h>
#include <Lrzip.h>
static const char *suffix_me(const char *file)
{
const char *p;
static char buf[4096];
p = strrchr(file, '.');
if (p && (strlen(p + 1) < 4))
strncat(buf, file, p - file);
else
strcat(buf, file);
return &buf[0];
}
int main(int argc, char *argv[])
{
Lrzip *lr;
if ((argc != 2) && (argc != 3)) {
fprintf(stderr, "Usage: %s file [outfile]\n", argv[0]);
exit(1);
}
lr = lrzip_new(LRZIP_MODE_DECOMPRESS);
assert(lr);
lrzip_config_env(lr);
assert(lrzip_filename_add(lr, argv[1]));
if (argc == 2)
lrzip_outfilename_set(lr, suffix_me(argv[1]));
else
lrzip_outfilename_set(lr, argv[2]);
assert(lrzip_run(lr));
return 0;
}


@ -1,217 +0,0 @@
DOXYFILE_ENCODING = UTF-8
PROJECT_NAME = Lrzip
PROJECT_NUMBER =
OUTPUT_DIRECTORY = .
CREATE_SUBDIRS = NO
OUTPUT_LANGUAGE = English
BRIEF_MEMBER_DESC = YES
REPEAT_BRIEF = YES
ABBREVIATE_BRIEF =
ALWAYS_DETAILED_SEC = NO
INLINE_INHERITED_MEMB = NO
FULL_PATH_NAMES = NO
STRIP_FROM_PATH =
STRIP_FROM_INC_PATH =
SHORT_NAMES = NO
JAVADOC_AUTOBRIEF = YES
QT_AUTOBRIEF = NO
MULTILINE_CPP_IS_BRIEF = NO
INHERIT_DOCS = YES
SEPARATE_MEMBER_PAGES = NO
TAB_SIZE = 2
ALIASES =
OPTIMIZE_OUTPUT_FOR_C = YES
OPTIMIZE_OUTPUT_JAVA = NO
OPTIMIZE_FOR_FORTRAN = NO
OPTIMIZE_OUTPUT_VHDL = NO
EXTENSION_MAPPING =
BUILTIN_STL_SUPPORT = NO
CPP_CLI_SUPPORT = NO
SIP_SUPPORT = NO
IDL_PROPERTY_SUPPORT = YES
DISTRIBUTE_GROUP_DOC = NO
SUBGROUPING = YES
TYPEDEF_HIDES_STRUCT = NO
SYMBOL_CACHE_SIZE = 0
EXTRACT_ALL = NO
EXTRACT_PRIVATE = NO
EXTRACT_STATIC = NO
EXTRACT_LOCAL_CLASSES = NO
EXTRACT_LOCAL_METHODS = NO
EXTRACT_ANON_NSPACES = NO
HIDE_UNDOC_MEMBERS = YES
HIDE_UNDOC_CLASSES = YES
HIDE_FRIEND_COMPOUNDS = YES
HIDE_IN_BODY_DOCS = NO
INTERNAL_DOCS = NO
CASE_SENSE_NAMES = YES
HIDE_SCOPE_NAMES = NO
SHOW_INCLUDE_FILES = NO
FORCE_LOCAL_INCLUDES = NO
INLINE_INFO = YES
SORT_MEMBER_DOCS = YES
SORT_BRIEF_DOCS = NO
SORT_MEMBERS_CTORS_1ST = NO
SORT_GROUP_NAMES = NO
SORT_BY_SCOPE_NAME = NO
GENERATE_TODOLIST = YES
GENERATE_TESTLIST = YES
GENERATE_BUGLIST = YES
GENERATE_DEPRECATEDLIST= YES
ENABLED_SECTIONS =
MAX_INITIALIZER_LINES = 30
SHOW_USED_FILES = NO
SHOW_DIRECTORIES = NO
SHOW_FILES = YES
SHOW_NAMESPACES = YES
FILE_VERSION_FILTER =
LAYOUT_FILE =
QUIET = NO
WARNINGS = YES
WARN_IF_UNDOCUMENTED = YES
WARN_IF_DOC_ERROR = YES
WARN_NO_PARAMDOC = NO
WARN_FORMAT = "$file:$line: $text"
WARN_LOGFILE =
INPUT = ../Lrzip.h
INPUT_ENCODING = UTF-8
FILE_PATTERNS =
RECURSIVE = YES
EXCLUDE =
EXCLUDE_SYMLINKS = NO
EXCLUDE_PATTERNS = */extras/* *private*
EXCLUDE_SYMBOLS =
EXAMPLE_PATH =
EXAMPLE_PATTERNS =
EXAMPLE_RECURSIVE = YES
IMAGE_PATH =
INPUT_FILTER =
FILTER_PATTERNS =
FILTER_SOURCE_FILES = NO
SOURCE_BROWSER = NO
INLINE_SOURCES = NO
STRIP_CODE_COMMENTS = YES
REFERENCED_BY_RELATION = YES
REFERENCES_RELATION = YES
REFERENCES_LINK_SOURCE = YES
USE_HTAGS = NO
VERBATIM_HEADERS = NO
ALPHABETICAL_INDEX = YES
COLS_IN_ALPHA_INDEX = 2
IGNORE_PREFIX =
GENERATE_HTML = YES
HTML_OUTPUT = html
HTML_FILE_EXTENSION = .html
HTML_TIMESTAMP = YES
HTML_ALIGN_MEMBERS = YES
HTML_DYNAMIC_SECTIONS = NO
GENERATE_DOCSET = NO
DOCSET_FEEDNAME = "Doxygen generated docs"
DOCSET_BUNDLE_ID = org.doxygen.Project
DOCSET_PUBLISHER_ID = org.doxygen.Publisher
DOCSET_PUBLISHER_NAME = Publisher
GENERATE_HTMLHELP = NO
CHM_FILE =
HHC_LOCATION =
GENERATE_CHI = NO
CHM_INDEX_ENCODING =
BINARY_TOC = NO
TOC_EXPAND = NO
GENERATE_QHP = NO
QCH_FILE =
QHP_NAMESPACE = org.doxygen.Project
QHP_VIRTUAL_FOLDER = doc
QHP_CUST_FILTER_NAME =
QHP_CUST_FILTER_ATTRS =
QHP_SECT_FILTER_ATTRS =
QHG_LOCATION =
GENERATE_ECLIPSEHELP = NO
ECLIPSE_DOC_ID = org.doxygen.Project
DISABLE_INDEX = YES
ENUM_VALUES_PER_LINE = 1
GENERATE_TREEVIEW = NO
USE_INLINE_TREES = NO
TREEVIEW_WIDTH = 250
EXT_LINKS_IN_WINDOW = NO
FORMULA_FONTSIZE = 10
FORMULA_TRANSPARENT = YES
USE_MATHJAX = NO
MATHJAX_RELPATH = http://www.mathjax.org/mathjax
SEARCHENGINE = NO
SERVER_BASED_SEARCH = NO
GENERATE_LATEX = YES
LATEX_OUTPUT = latex
LATEX_CMD_NAME = latex
MAKEINDEX_CMD_NAME = makeindex
COMPACT_LATEX = NO
PAPER_TYPE = a4wide
EXTRA_PACKAGES =
LATEX_HEADER =
PDF_HYPERLINKS = YES
USE_PDFLATEX = NO
LATEX_BATCHMODE = NO
LATEX_HIDE_INDICES = NO
LATEX_SOURCE_CODE = NO
GENERATE_RTF = NO
RTF_OUTPUT = rtf
COMPACT_RTF = NO
RTF_HYPERLINKS = NO
RTF_STYLESHEET_FILE =
RTF_EXTENSIONS_FILE =
GENERATE_MAN = YES
MAN_OUTPUT = man
MAN_EXTENSION = .3
MAN_LINKS = YES
GENERATE_XML = NO
XML_OUTPUT = xml
XML_SCHEMA =
XML_DTD =
XML_PROGRAMLISTING = YES
GENERATE_AUTOGEN_DEF = NO
GENERATE_PERLMOD = NO
PERLMOD_LATEX = NO
PERLMOD_PRETTY = YES
PERLMOD_MAKEVAR_PREFIX =
ENABLE_PREPROCESSING = YES
MACRO_EXPANSION = YES
EXPAND_ONLY_PREDEF = NO
SEARCH_INCLUDES = YES
INCLUDE_PATH =
INCLUDE_FILE_PATTERNS =
PREDEFINED =
EXPAND_AS_DEFINED =
SKIP_FUNCTION_MACROS = YES
TAGFILES =
GENERATE_TAGFILE =
ALLEXTERNALS = NO
EXTERNAL_GROUPS = YES
PERL_PATH = /usr/bin/perl
CLASS_DIAGRAMS = NO
MSCGEN_PATH =
HIDE_UNDOC_RELATIONS = YES
HAVE_DOT = NO
DOT_NUM_THREADS = 0
DOT_FONTNAME = FreeSans.ttf
DOT_FONTSIZE = 10
DOT_FONTPATH =
CLASS_GRAPH = NO
COLLABORATION_GRAPH = NO
GROUP_GRAPHS = YES
UML_LOOK = NO
TEMPLATE_RELATIONS = NO
INCLUDE_GRAPH = NO
INCLUDED_BY_GRAPH = NO
CALL_GRAPH = NO
CALLER_GRAPH = NO
GRAPHICAL_HIERARCHY = NO
DIRECTORY_GRAPH = YES
DOT_IMAGE_FORMAT = png
DOT_PATH =
DOTFILE_DIRS =
MSCFILE_DIRS =
DOT_GRAPH_MAX_NODES = 50
MAX_DOT_GRAPH_DEPTH = 0
DOT_TRANSPARENT = NO
DOT_MULTI_TARGETS = NO
GENERATE_LEGEND = YES
DOT_CLEANUP = YES


@ -1,38 +0,0 @@
MAINTAINERCLEANFILES = Makefile.in
dist_doc_DATA = \
README.Assembler \
README.benchmarks \
README.lzo_compresses.test.txt \
magic.header.txt \
lrzip.conf.example
PACKAGE_DOCNAME = $(PACKAGE_TARNAME)-$(PACKAGE_VERSION)-doc
.PHONY: doc
if EFL_BUILD_DOC
doc-clean:
rm -rf html/ latex/ man/ xml/ $(PACKAGE_DOCNAME).tar*
doc: all doc-clean
$(efl_doxygen)
rm -rf $(PACKAGE_DOCNAME).tar*
mkdir -p $(PACKAGE_DOCNAME)/doc
cp -R html/ latex/ man/ $(PACKAGE_DOCNAME)/doc
tar cf $(PACKAGE_DOCNAME).tar $(PACKAGE_DOCNAME)/
bzip2 -9 $(PACKAGE_DOCNAME).tar
rm -rf $(PACKAGE_DOCNAME)/
mv $(PACKAGE_DOCNAME).tar.bz2 $(top_srcdir)
clean-local: doc-clean
else
doc:
@echo "Documentation not built. Run ./configure --help"
endif
EXTRA_DIST = Doxyfile


@ -1,20 +1,5 @@
README.Assembler
Update November 2019
Assembler is enabled by
./configure --enable-asm
and disabled by
./configure --disable-asm
not
ASM=no ./configure
New files replace 32 and 64 bit assembler code.
fixes to lzma/C/Makefile.am permit libtool linking.
Original text follows.
==========================
Notes about CRC Assembly Language Coding.
lrzip-0.21 makes use of an x86 assembly language file


@ -1,9 +1,8 @@
The first comparison is that of a linux kernel tarball (2.6.37). In all cases
the default options were used. 4 other common compression apps were used for
The first comparison is that of a linux kernel tarball (2.6.31). In all cases
the default options were used. 3 other common compression apps were used for
comparison, 7z which is an excellent all-round lzma based compression app,
gzip which is the benchmark fast standard that has good compression, and bzip2
which is the most common linux used compression. xz was included for
completeness.
which is the most common linux used compression.
In the following tables, lrzip means lrzip default options, lrzip -l means
lrzip using the lzo backend, lrzip -g means using the gzip backend,
@ -11,30 +10,31 @@ lrzip -b means using the bzip2 backend and lrzip -z means using the zpaq
backend.
linux-2.6.37.tar
linux-2.6.31.tar
These are benchmarks performed on a 3GHz quad core Intel Core2 with 8GB ram
using lrzip v0.612 on an SSD drive.
using lrzip v0.42.
Compression Size Percentage Compress Decompress
None 430612480 100
7z 63636839 14.8 2m28s 0m6.6s
xz 63291156 14.7 4m02s 0m8.7s
lrzip 64561485 14.9 1m12s 0m4.3s
lrzip -z 51588423 12.0 2m02s 2m08s
lrzip -l 137515997 31.9 0m14s 0m2.7s
lrzip -g 86142459 20.0 0m17s 0m3.0s
lrzip -b 72103197 16.7 0m21s 0m6.5s
bzip2 74060625 17.2 0m48s 0m12.8s
gzip 94512561 21.9 0m17s 0m4.0s
None 365711360 100
7z 53315279 14.6 2m4s 0m5.4s
lrzip 52372722 14.3 2m48s 0m8.3s
lrzip -z 43455498 11.9 10m11s 10m14s
lrzip -l 112151676 30.7 0m14s 0m5.1s
lrzip -g 73476127 20.1 0m29s 0m5.6s
lrzip -b 60851152 16.6 0m43s 0m12.2s
bzip2 62416571 17.1 0m44s 0m9.8s
gzip 80563601 22.0 0m14s 0m2.8s
It is interesting to note from these results that the compression of lrzip by default is
about the same as 7z, but it's significantly faster thanks to its heavily
multithreaded nature. Zpaq offers by far the best compression but at the cost
of extra time. However with the heavily threaded nature of lrzip, it's not a lot
longer given how much better its compression is. It's actually faster than xz
on compression on a quad core machine.
only slightly better than lzma, but at some cost in time at the compress and
decompress end of the spectrum. Clearly zpaq compression is much better than any
other compression algorithm by far, but the speed cost on both compression and
decompression is extreme. At this size compression, lzo is interesting because
it's faster than simply copying the file but only offers modest compression.
What lrzip offers at this end of the spectrum is extreme compression if
desired.
Let's take six kernel trees one version apart as a tarball, linux-2.6.31 to
@ -45,7 +45,7 @@ purpose compressor at the moment:
These are benchmarks performed on a 2.53Ghz dual core Intel Core2 with 4GB ram
using lrzip v0.5.1. Note that it was running with a 32 bit userspace so only
2GB addressing was possible. However the benchmark was run with the -U option
2GB addressing was posible. However the benchmark was run with the -U option
allowing the whole file to be treated as one large compression window.
Tarball of 6 consecutive kernel trees.
@ -96,7 +96,7 @@ system and some basic working software on it. The default options on the
10GB Virtual image:
These benchmarks were done on the quad core with version 0.612
These benchmarks were done on the quad core with version 0.5.1
Compression Size Percentage Compress Time Decompress Time
None 10737418240 100.0
@ -104,44 +104,24 @@ gzip 2772899756 25.8 05m47s 2m46s
bzip2 2704781700 25.2 16m15s 6m19s
xz 2272322208 21.2 50m58s 3m52s
7z 2242897134 20.9 26m36s 5m41s
lrzip 1372218189 12.8 10m23s 2m53s
lrzip -U 1095735108 10.2 08m44s 2m45s
lrzip -l 1831894161 17.1 04m53s 2m37s
lrzip -lU 1414959433 13.2 04m48s 2m38s
lrzip -zU 1067169419 9.9 39m32s 39m46s
lrzip 1354237684 12.6 29m13s 6m55s
lrzip -M 1079528708 10.1 23m44s 4m05s
lrzip -l 1793312108 16.7 05m13s 3m12s
lrzip -lM 1413268368 13.2 04m18s 2m54s
lrzip -z 1299844906 12.1 04h32m14s 04h33m
lrzip -zM 1066902006 9.9 04h07m14s 04h08m
At this end of the spectrum things really start to heat up. The compression
advantage is massive, with the lzo backend even giving much better results than
7z, and over a ridiculously short time. The improvements in version 0.530 in
scalability with multiple CPUs has a huge impact on compression time here,
with zpaq almost being faster on quad core than xz is, yet producing a file
less than half the size.
What appears to be a big disappointment is actually zpaq here which takes more
than 4 times longer than r/lzma for a measly .3% improvement. The reason is that
most of the advantage here is achieved by the rzip first stage since there's a
lot of redundant space over huge distances on a virtual image. The -U option
which works the memory subsystem rather hard making noticeable impact on the
rest of the machine also does further wonders for the compression (virtually
always) and even the times in this particular case.
Finally testing the same 10GB image on a i7-3930K at 3.2GHz (12 thread CPU!)
with 32GB of ram so the whole image fits in ram with a fast SSD:
Compression Size Percentage Compress Time Decompress Time
None 10737418240 100.0
gzip 2772899756 25.8 3m56s 2m15s
pbzip2 2705814394 25.2 1m41s 1m46s
lrzip 1095337763 10.2 2m54s 2m21s
Note that with enough ram and CPU, lrzip is actually faster than gzip (which
does compression in place) and comparable on decompression, despite a huge
increase in compression. pbzip2 is faster than both but its compression is
almost no better than gzip.
7z, and over a ridiculously short time. The default lzma backend is slightly
slower than 7z, but provides a lot more compression. What appears to be a big
disappointment is actually zpaq here which takes more than 8 times longer than
lzma for a measly .2% improvement. The reason is that most of the advantage here
is achieved by the rzip first stage since there's a lot of redundant space over
huge distances on a virtual image. The -M option which works the memory
subsystem rather hard making noticeable impact on the rest of the machine also
does further wonders for the compression and times.
This should help govern what compression you choose. Small files are nicely
compressed with zpaq. Intermediate files are nicely compressed with lzma.
@ -151,4 +131,4 @@ Or, to make things easier, just use the default settings all the time and be
happy as lzma gives good results. :D
Con Kolivas
Saturday, 7th July 2012
Tue, 7th Nov 2010


@ -1,55 +1,45 @@
# lrzip.conf example file
# anything beginning with a # or whitespace will be ignored
# valid parameters are separated with an = and a value
# parameters and values are not case sensitive except where specified
# parameters and values are not case sensitive
#
# lrzip 0.24+, peter hyman, pete@peterhyman.com
# lrzip 0.24, peter hyman, pete@peterhyman.com
# ignored by earlier versions.
# Compression Window size in 100MB. Normally selected by program. (-w)
# WINDOW = 20
# Compression Level 1-9 (7 Default). (-L)
# COMPRESSIONLEVEL = 7
# Use -U setting, Unlimited ram. Yes or No
# UNLIMITED = NO
# Compression Method, rzip, gzip, bzip2, lzo, or lzma (default), or zpaq. (-n -g -b -l --lzma -z)
# May be overridden by command line compression choice.
# COMPRESSIONMETHOD = lzma
# Perform LZO Test. Default = YES (-T )
# LZOTEST = NO
# Hash Check on decompression, (-c)
# HASHCHECK = YES
# Show HASH value on Compression even if Verbose is off, YES (-H)
# SHOWHASH = YES
# Compression Window size in 100MB. Normally selected by program.
WINDOW = 5
# Default output directory (-O)
# Compression Level 1-9 (7 Default).
COMPRESSIONLEVEL = 7
# Compression Method, rzip, gzip, bzip2, lzo, or lzma (default).
# If specified here, command line options not usable.
# COMPRESSIONMETHOD = lzo
# Test Threshold value 1-10 (2 Default).
TESTTHRESHOLD = 2
# Default output directory
# OUTPUTDIRECTORY = location
# Verbosity, YES or MAX (v, vv)
# VERBOSITY = max
# Show Progress as file is parsed, YES or no (NO = -q option)
# SHOWPROGRESS = YES
# Set Niceness. 19 is default. -20 to 19 is the allowable range (-N)
# NICE = 19
# Verbosity, true or 1, or max or 2
VERBOSITY = max
# Keep broken or damaged output files, YES (-K)
# KEEPBROKEN = YES
# Show Progress as file is parsed, true or 1, false or 0
SHOWPROGRESS = true
# Delete source file after compression (-D)
# Set Niceness. 19 is default. -20 to 19 is the allowable range
NICE = 19
# Delete source file after compression
# this parameter and value are case sensitive
# value must be YES to activate
# DELETEFILES = NO
# Replace existing lrzip file when compressing (-f)
# Replace existing lrzip file when compressing
# this parameter and value are case sensitive
# value must be YES to activate
# REPLACEFILE = YES
# Override for Temporary Directory. Only valid when stdin/out or Test is used
# TMPDIR = /tmp
# Whether to use encryption on compression YES, NO (-e)
# ENCRYPT = NO
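
The comment block at the top of this file states the line rules: lines starting with `#` or whitespace are ignored, a parameter is separated from its value by `=`, and parameter names are not case sensitive. A minimal C++17 sketch of those rules follows. The names `trim` and `parse_conf_line` are hypothetical and are not taken from the lrzip sources. The value is returned exactly as written because some values (such as `YES` for DELETEFILES) are documented as case sensitive.

```cpp
#include <algorithm>
#include <cctype>
#include <optional>
#include <string>
#include <utility>

// Hypothetical helper: strip leading and trailing blanks.
inline std::string trim(const std::string& s) {
    const char* ws = " \t\r\n";
    const auto first = s.find_first_not_of(ws);
    if (first == std::string::npos) return "";
    const auto last = s.find_last_not_of(ws);
    return s.substr(first, last - first + 1);
}

// Hypothetical parser for one lrzip.conf line.
// Returns (PARAMETER in upper case, value as written), or nothing when the
// line is a comment, starts with whitespace, or has no "name = value" shape.
inline std::optional<std::pair<std::string, std::string>>
parse_conf_line(const std::string& line) {
    if (line.empty() || line[0] == '#' ||
        std::isspace(static_cast<unsigned char>(line[0])))
        return std::nullopt;
    const auto eq = line.find('=');
    if (eq == std::string::npos) return std::nullopt;
    std::string key = trim(line.substr(0, eq));
    std::string val = trim(line.substr(eq + 1));
    if (key.empty() || val.empty()) return std::nullopt;
    // Parameter names are matched without regard to case.
    std::transform(key.begin(), key.end(), key.begin(), [](unsigned char c) {
        return static_cast<char>(std::toupper(c));
    });
    return std::make_pair(key, val);
}
```

A caller would read the file line by line, skip every line for which `parse_conf_line` returns nothing, and compare the returned name against the known parameters.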


@ -1,86 +1,28 @@
lrzip-0.6x file format
March 2011
lrzip-0.50+ file header format
November 2010
Con Kolivas
Byte Content
0-23 Magic
---
24+ Rzip Chunk Data (RCD)
RCD+ Data blocks
--- repeat
(end-MD5_DIGEST_SIZE)->(end) md5 hash
Magic data:
0->3 LRZI
0-3 LRZI
4 LRZIP Major Version Number
5 LRZIP Minor Version Number
6->14 Source File Size or 0 if unknown, or salt in encrypted file
16->20 LZMA Properties Encoded (lc,lp,pb,fb, and dictionary size)
21 1 = md5sum hash is stored at the end of the archive
22 1 = data is encrypted with sha512/aes128
23 Unused
6-14 Source File Size
16-20 LZMA Properties Encoded (lc,lp,pb,fb, and dictionary size)
21-22 not used
23-48 Stream 1 header data
49-74 Stream 2 header data
Encrypted salt (bytes 6->14 in magic if encrypted):
0->1 Encoded number of loops to hash password
2->7 Random data
(RCD0 is set to 8 bytes always on encrypted files)
Rzip Chunk Data:
0 Data offsets byte width (meaning length is < 2^(8 * RCD0))
1 Flag that there is no chunk beyond this
(RCD0 bytes) Chunk decompressed size (not stored in encrypted file)
XX Stream 0 header data
XX Stream 1 header data
Stream Header Data:
Byte:
0 Compressed data type
(RCD0 bytes) Compressed data length
(RCD0 bytes) Uncompressed data length
(RCD0 bytes) Next block head
Data blocks:
0->(end-2) data
(end-1)->end crc data
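
The 24-byte magic block of the 0.6x format listed above can be decoded with a few fixed offsets. The following C++17 sketch is illustrative only: the struct and function names are hypothetical, and it assumes that the size field is an 8-byte integer starting at offset 6 stored least significant byte first, which the table does not state.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <optional>

// Hypothetical in-memory view of the 24-byte magic block.
struct LrzipMagic {
    std::uint8_t major;
    std::uint8_t minor;
    std::uint64_t source_size;              // 0 if unknown; salt when encrypted
    std::array<std::uint8_t, 5> lzma_props; // bytes 16->20
    bool has_md5;                           // byte 21
    bool encrypted;                         // byte 22
};

// Assumption: 8 bytes, least significant byte first.
inline std::uint64_t read_le64(const std::uint8_t* p) {
    std::uint64_t v = 0;
    for (int i = 7; i >= 0; --i) v = (v << 8) | p[i];
    return v;
}

// Returns nothing when the buffer is too short or does not start with "LRZI".
inline std::optional<LrzipMagic> parse_magic(const std::uint8_t* buf,
                                             std::size_t len) {
    if (len < 24 || std::memcmp(buf, "LRZI", 4) != 0) return std::nullopt;
    LrzipMagic m{};
    m.major = buf[4];
    m.minor = buf[5];
    m.source_size = read_le64(buf + 6);
    std::memcpy(m.lzma_props.data(), buf + 16, 5);
    m.has_md5 = buf[21] == 1;
    m.encrypted = buf[22] == 1;
    return m;
}
```

When `encrypted` is set, the bytes read into `source_size` are the salt described under "Encrypted salt" and should not be interpreted as a length.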
lrzip-0.5x file format
March 2011
Con Kolivas
Byte Content
0->23 Magic
--
24->74 Rzip chunk data
75+ Data blocks
-- repeat
(end-MD5_DIGEST_SIZE)->(end) md5 hash
Magic data:
0->3 LRZI
4 LRZIP Major Version Number
5 LRZIP Minor Version Number
6->14 Source File Size
16->20 LZMA Properties Encoded (lc,lp,pb,fb, and dictionary size)
21 Flag that md5sum hash is stored at the end of the archive
22-23 not used
Rzip chunk data:
0 Data offsets byte width
1-25 Stream 0 header data
26-50 Stream 1 header data
Stream Header Data:
Block Data:
Byte:
0 Compressed data type
1-8 Compressed data length
9-16 Uncompressed data length
17-24 Next block head
25 Data offsets byte width
26+ Data
Data blocks:
0->(end-2) data
(end-1)->end crc data
End:
0-1 crc data
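
The stream header layouts above share one shape: a one-byte compression type followed by three integers (compressed length, uncompressed length, next block head). In 0.5x each integer is 8 bytes wide; in 0.6x the width is the "data offsets byte width" (RCD0). The C++ sketch below decodes either case by taking the width as a parameter. It is a sketch under assumptions: all names are hypothetical, and the integers are assumed to be stored least significant byte first, which the tables do not state.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical decoded form of one stream header.
struct StreamHeader {
    std::uint8_t ctype;       // compressed data type
    std::uint64_t c_len;      // compressed data length
    std::uint64_t u_len;      // uncompressed data length
    std::uint64_t next_head;  // offset of the next block head
};

// Assumption: `width` bytes, least significant byte first.
inline std::uint64_t read_le_width(const std::uint8_t* p, int width) {
    std::uint64_t v = 0;
    for (int i = width - 1; i >= 0; --i) v = (v << 8) | p[i];
    return v;
}

// width is 8 for the fixed 0.5x layout, or RCD0 for 0.6x.
// Returns false if the width is out of range or the buffer is too short.
inline bool parse_stream_header(const std::uint8_t* buf, std::size_t len,
                                int width, StreamHeader& out) {
    if (width < 1 || width > 8) return false;
    const std::size_t w = static_cast<std::size_t>(width);
    if (len < 1 + 3 * w) return false;
    out.ctype = buf[0];
    out.c_len = read_le_width(buf + 1, width);
    out.u_len = read_le_width(buf + 1 + w, width);
    out.next_head = read_le_width(buf + 1 + 2 * w, width);
    return true;
}
```

With `width` set to 8 the header occupies 25 bytes, matching the 0-24 byte ranges in the 0.5x table.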
lrzip-0.40+ file header format
November 2009
@ -92,8 +34,8 @@ Byte Content
5 LRZIP Minor Version Number
6-14 Source File Size
16-20 LZMA Properties Encoded (lc,lp,pb,fb, and dictionary size)
21-24 not used
24-48 Stream 1 header data
21-22 not used
23-48 Stream 1 header data
49-74 Stream 2 header data
Block Data:
@ -119,7 +61,7 @@ Byte Content
6-9 Source File Size (no HAVE_LARGE_FILES)
6-14 Source File Size
16-20 LZMA Properties Encoded (lc,lp,pb,fb, and dictionary size)
21-23 not used
24-36 Stream 1 header data
21-22 not used
23-36 Stream 1 header data
37-50 Stream 2 header data
51 Compressed data type

238
install-sh Executable file

@ -0,0 +1,238 @@
#! /bin/sh
#
# install - install a program, script, or datafile
# This comes from X11R5.
#
# Calling this script install-sh is preferred over install.sh, to prevent
# `make' implicit rules from creating a file called install from it
# when there is no Makefile.
#
# This script is compatible with the BSD install script, but was written
# from scratch.
#
# set DOITPROG to echo to test this script
# Don't use :- since 4.3BSD and earlier shells don't like it.
doit="${DOITPROG-}"
# put in absolute paths if you don't have them in your path; or use env. vars.
mvprog="${MVPROG-mv}"
cpprog="${CPPROG-cp}"
chmodprog="${CHMODPROG-chmod}"
chownprog="${CHOWNPROG-chown}"
chgrpprog="${CHGRPPROG-chgrp}"
stripprog="${STRIPPROG-strip}"
rmprog="${RMPROG-rm}"
mkdirprog="${MKDIRPROG-mkdir}"
transformbasename=""
transform_arg=""
instcmd="$mvprog"
chmodcmd="$chmodprog 0755"
chowncmd=""
chgrpcmd=""
stripcmd=""
rmcmd="$rmprog -f"
mvcmd="$mvprog"
src=""
dst=""
dir_arg=""
while [ x"$1" != x ]; do
case $1 in
-c) instcmd="$cpprog"
shift
continue;;
-d) dir_arg=true
shift
continue;;
-m) chmodcmd="$chmodprog $2"
shift
shift
continue;;
-o) chowncmd="$chownprog $2"
shift
shift
continue;;
-g) chgrpcmd="$chgrpprog $2"
shift
shift
continue;;
-s) stripcmd="$stripprog"
shift
continue;;
-t=*) transformarg=`echo $1 | sed 's/-t=//'`
shift
continue;;
-b=*) transformbasename=`echo $1 | sed 's/-b=//'`
shift
continue;;
*) if [ x"$src" = x ]
then
src=$1
else
# this colon is to work around a 386BSD /bin/sh bug
:
dst=$1
fi
shift
continue;;
esac
done
if [ x"$src" = x ]
then
echo "install: no input file specified"
exit 1
else
true
fi
if [ x"$dir_arg" != x ]; then
dst=$src
src=""
if [ -d $dst ]; then
instcmd=:
else
instcmd=mkdir
fi
else
# Waiting for this to be detected by the "$instcmd $src $dsttmp" command
# might cause directories to be created, which would be especially bad
# if $src (and thus $dsttmp) contains '*'.
if [ -f $src -o -d $src ]
then
true
else
echo "install: $src does not exist"
exit 1
fi
if [ x"$dst" = x ]
then
echo "install: no destination specified"
exit 1
else
true
fi
# If destination is a directory, append the input filename; if your system
# does not like double slashes in filenames, you may need to add some logic
if [ -d $dst ]
then
dst="$dst"/`basename $src`
else
true
fi
fi
## this sed command emulates the dirname command
dstdir=`echo $dst | sed -e 's,[^/]*$,,;s,/$,,;s,^$,.,'`
# Make sure that the destination directory exists.
# this part is taken from Noah Friedman's mkinstalldirs script
# Skip lots of stat calls in the usual case.
if [ ! -d "$dstdir" ]; then
defaultIFS='
'
IFS="${IFS-${defaultIFS}}"
oIFS="${IFS}"
# Some sh's can't handle IFS=/ for some reason.
IFS='%'
set - `echo ${dstdir} | sed -e 's@/@%@g' -e 's@^%@/@'`
IFS="${oIFS}"
pathcomp=''
while [ $# -ne 0 ] ; do
pathcomp="${pathcomp}${1}"
shift
if [ ! -d "${pathcomp}" ] ;
then
$mkdirprog "${pathcomp}"
else
true
fi
pathcomp="${pathcomp}/"
done
fi
if [ x"$dir_arg" != x ]
then
$doit $instcmd $dst &&
if [ x"$chowncmd" != x ]; then $doit $chowncmd $dst; else true ; fi &&
if [ x"$chgrpcmd" != x ]; then $doit $chgrpcmd $dst; else true ; fi &&
if [ x"$stripcmd" != x ]; then $doit $stripcmd $dst; else true ; fi &&
if [ x"$chmodcmd" != x ]; then $doit $chmodcmd $dst; else true ; fi
else
# If we're going to rename the final executable, determine the name now.
if [ x"$transformarg" = x ]
then
dstfile=`basename $dst`
else
dstfile=`basename $dst $transformbasename |
sed $transformarg`$transformbasename
fi
# don't allow the sed command to completely eliminate the filename
if [ x"$dstfile" = x ]
then
dstfile=`basename $dst`
else
true
fi
# Make a temp file name in the proper directory.
dsttmp=$dstdir/#inst.$$#
# Move or copy the file name to the temp name
$doit $instcmd $src $dsttmp &&
trap "rm -f ${dsttmp}" 0 &&
# and set any options; do chmod last to preserve setuid bits
# If any of these fail, we abort the whole thing. If we want to
# ignore errors from any of these, just make sure not to ignore
# errors from the above "$doit $instcmd $src $dsttmp" command.
if [ x"$chowncmd" != x ]; then $doit $chowncmd $dsttmp; else true;fi &&
if [ x"$chgrpcmd" != x ]; then $doit $chgrpcmd $dsttmp; else true;fi &&
if [ x"$stripcmd" != x ]; then $doit $stripcmd $dsttmp; else true;fi &&
if [ x"$chmodcmd" != x ]; then $doit $chmodcmd $dsttmp; else true;fi &&
# Now rename the file to the real destination.
$doit $rmcmd -f $dstdir/$dstfile &&
$doit $mvcmd $dsttmp $dstdir/$dstfile
fi &&
exit 0


@ -1,737 +0,0 @@
# Documentation for libzpaq
#
# Copyright (C) 2012, Dell Inc. Written by Matt Mahoney.
#
# Permission is hereby granted, free of charge, to any person obtaining a copy
# of this software and associated documentation files (the "Software"), to deal
# in the Software without restriction, including without limitation the rights
# to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
# copies of the Software, and to permit persons to whom the Software is
# furnished to do so without restriction.
# This Software is provided "as is" without warranty.
#
# To create man page: pod2man libzpaq.3.pod > libzpaq.3
# To create HTML documentation: pod2html libzpaq.3.pod > libzpaq.html
=pod
=head1 NAME
libzpaq - ZPAQ compression API
=head1 SYNOPSIS
#include "libzpaq.h"
namespace libzpaq {
extern void error(const char* msg);
class Reader {
public:
virtual int get() = 0;
virtual int read(char* buf, int n); // optional
virtual ~Reader() {}
};
class Writer {
public:
virtual void put(int c) = 0;
virtual void write(const char* buf, int n); // optional
virtual ~Writer() {}
};
class SHA1 {
public:
SHA1();
void put(int c);
double size() const;
uint64_t usize() const;
const char* result();
};
class Compressor {
public:
Compressor();
void setOutput(Writer* out);
void writeTag();
void startBlock(int level);
void startBlock(const char* hcomp);
void startSegment(const char* filename = 0,
const char* comment = 0);
void setInput(Reader* i);
void postProcess(const char* pcomp = 0, int length = 0);
bool compress(int n = -1);
void endSegment(const char* sha1string = 0);
void endBlock();
};
class Decompresser {
public:
Decompresser();
void setInput(Reader* in);
bool findBlock(double* memptr = 0);
void hcomp(Writer* out);
bool findFilename(Writer* = 0);
void readComment(Writer* = 0);
void setOutput(Writer* out);
void setSHA1(SHA1* sha1ptr);
bool decompress(int n = -1);
bool pcomp(Writer* out);
void readSegmentEnd(char* sha1string = 0);
};
void compress(Reader* in, Writer* out, int level);
void decompress(Reader* in, Writer* out);
}
=head1 DESCRIPTION
I<libzpaq> is a C++ API for compressing or decompressing
files or objects in memory conforming to the ZPAQ level 1 and 2 standards
(see I<availability>). This document describes version 5.00
of the software. The software may be used without
restriction under a modified MIT license.
ZPAQ provides a high level of data compression in a streaming
(single pass) self-describing format that supports single or multiple
named objects (such as archives) with optional integrity checking.
The library provides 3 default compression levels but supports
custom algorithms. The performance of the default levels is
shown in the table below for the 14 file Calgary corpus as
a tar file. Compression and decompression times are in seconds
on a 2 GHz T3200 running on one of two cores. Memory required
to compress or decompress is in MB. Some popular formats
are shown for comparison.
Program Format Size Time (C, D) Memory
----------- ------ --------- ----------- ------
Uncompressed .tar 3,152,896
compress .tar.Z 1,319,521 1.6 0.2 .1 MB
gzip -9 .tar.gz 1,022,810 0.7 0.1 .1 MB
bzip2 -9 .tar.bz2 860,097 0.6 0.4 5 MB
7zip .tar.7z 824,573 1.5 0.1 195 MB
zpaq 1 (fast) .tar.zpaq 806,959 2 2 38 MB
zpaq 2 (mid) .tar.zpaq 699,191 8 8 112 MB
zpaq 3 (max) .tar.zpaq 644,190 20 20 246 MB
A ZPAQ stream consists of one or more blocks, possibly mixed with
other data, that can be decompressed independently in any order.
Each block consists of one or more segments that must be decompressed
in order from the beginning of the block. Each block header contains
a description of the decompression algorithm. Each segment consists
of an optional filename string, an optional comment string,
self delimiting compressed data, and an optional SHA-1 checksum.
If ZPAQ blocks are mixed with other data, they must be
preceded by an identifying 13 byte tag which does not otherwise
appear in that data.
ZPAQ compression is based on the PAQ context mixing model.
An array of components predict the probability of the next bit
of input, either independently or depending on the predictions
of earlier components. The final prediction is arithmetic coded.
Each component inputs a context computed from earlier input
by a program written in ZPAQL byte code which runs on a virtual
machine. Both the component array description and the ZPAQL
code are encoded in a string called HCOMP in each block header.
Data can also be stored uncompressed.
A block may optionally specify a post-processor, a program
(also in ZPAQL) which takes the decoded data as input and
outputs the decompressed output. This program, if present,
is encoded as a string called PCOMP which is compressed
in the first segment prior to the compressed data. The first
decoded byte from the first segment is a flag indicating
whether a PCOMP string is present. The user is responsible
for correctly pre-processing the data so that post-processing
restores the original data.
=head2 API Organization
The I<libzpaq> API consists of 2 files.
=over
=item libzpaq.h
Header file to include in your application.
=item libzpaq.cpp
Source code file to link to your application.
=back
An application would have the line C<#include "libzpaq.h"> and
link to libzpaq.cpp.
The API provides two classes, C<Compressor> and C<Decompresser>
which write or read respectively each of the syntactic elements
of a ZPAQ stream. The two functions C<compress()> and
C<decompress()> provide simple interfaces for the most common
uses. In either case, the user must create classes derived
from the abstract base classes C<Reader> and C<Writer> and
define methods C<get()> and C<put()> which the code
will use to read and write bytes. The user must also define
a callback error handler.
By default, libzpaq(3) uses just-in-time (JIT) acceleration
by translating ZPAQL code to x86-32 or x86-64 internally
and executing it. This feature can be disabled by compiling
with -DNOJIT. If enabled, it requires an x86 processor
capable of executing SSE2 instructions. SSE2 is supported
by most Intel processors since 2001 and AMD since 2003.
Run time checks (assertions) can be enabled with -DDEBUG
for debugging purposes.
All of the API code is contained in the namespace C<libzpaq>.
=head2 Callback Functions
The following three functions must be defined by the user.
=over
=item C<extern void libzpaq::error(const char* msg);>
This function must be defined by the user to handle errors
from libzpaq. The library will call the function with
an English language message passed to C<msg>. Errors may
result from bad input during decompression, out of memory,
or illegal arguments or calling sequences to libzpaq
functions. Errors should be considered unrecoverable.
=item C<int libzpaq::Reader::get() = 0;>
The user must create a class derived from Reader with an
implementation for C<get()> that reads one byte of input
and returns its value in the range 0...255, or returns
EOF (-1) at end of input. Objects of the derived type
would then be passed to functions that require a C<Reader>.
=item C<void libzpaq::Writer::put(int c) = 0;>
The user must create a class derived from Writer with
an implementation of C<put()> which is expected to take
a byte value C<c> in the range 0...255 and write it to
output. Objects of the derived type
would then be passed to functions that require a C<Writer>.
=back
The following two functions are optional. Defining them
can improve performance slightly.
=over
=item C<virtual int read(char* buf, int n);>
If defined, this function should input up to C<n> bytes into
the array C<buf> and return the number actually read, in
the range 0..n. A return value of 0 indicates end of input.
If C<read()> is not defined, then the default implementation
will call C<get()> n times.
=item C<virtual void write(const char* buf, int n);>
If defined, this function should output the elements C<buf[0]>
through C<buf[n-1]> in order. If not defined, then the default
implementation will call C<put()> n times.
=back
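As a minimal sketch of the callbacks described above, the following
classes read from and write to memory. The base classes here are
stand-ins written from this description so that the sketch compiles
without the library; in real use one would derive from
C<libzpaq::Reader> and C<libzpaq::Writer> instead. The names
C<MemReader> and C<MemWriter> are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string>

// Stand-in for libzpaq::Reader, written from the description above
// (an assumption; the real base class is declared in libzpaq.h).
class Reader {
public:
  virtual int get() = 0;                 // 0..255, or -1 at end of input
  virtual int read(char* buf, int n) {   // fallback: up to n calls to get()
    int i = 0, c;
    while (i < n && (c = get()) >= 0) buf[i++] = char(c);
    return i;                            // 0 means end of input
  }
  virtual ~Reader() {}
};

// Stand-in for libzpaq::Writer (same caveat).
class Writer {
public:
  virtual void put(int c) = 0;           // write one byte, 0..255
  virtual void write(const char* buf, int n) {  // fallback: n calls to put()
    for (int i = 0; i < n; ++i) put(static_cast<unsigned char>(buf[i]));
  }
  virtual ~Writer() {}
};

// Reads from a caller-owned memory range. read() is overridden so that
// a block is copied with one memcpy instead of n calls to get().
class MemReader : public Reader {
  const unsigned char* p;
  size_t left;
public:
  MemReader(const void* data, size_t n)
    : p(static_cast<const unsigned char*>(data)), left(n) {}
  int get() {
    if (left == 0) return -1;
    --left;
    return *p++;
  }
  int read(char* buf, int n) {
    assert(n >= 0);
    size_t k = size_t(n) < left ? size_t(n) : left;
    if (k) std::memcpy(buf, p, k);
    p += k;                              // advance past the copied bytes
    left -= k;
    return int(k);
  }
};

// Appends to a std::string. write() is overridden with a single append.
class MemWriter : public Writer {
public:
  std::string s;
  void put(int c) { s += char(c); }
  void write(const char* buf, int n) { s.append(buf, size_t(n)); }
};
```

Both C<get()> and C<read()> advance the same position, so a caller may
mix them freely.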
=head2 Simple Compression
In the remainder of this document, all classes and
functions are assumed to be in namespace C<libzpaq>.
=over
=item C<void compress(Reader* in, Writer* out, int mode);>
C<compress()> compresses from C<in> to C<out> until C<get()>
returns EOF. It writes a single segment in a single block
with empty filename, comment, and checksum fields. C<mode>
must be 1, 2, or 3, to select models I<fast>, I<mid>, or
I<max> respectively. Higher modes compress smaller but
take longer to compress and subsequently decompress.
=item C<void decompress(Reader* in, Writer* out);>
C<decompress()> decompresses any valid ZPAQ stream from
C<in> to C<out> until C<get()> returns EOF. Any
non-ZPAQ data in the input is ignored. Any ZPAQ blocks
following non-ZPAQ must be preceded by a marker tag
to be recognized. Each block is decoded according to the
instructions in the block header. The contents of the
filename, comment, and checksum fields are ignored.
Data with bad checksums will be decoded anyway. If there
is more than one segment, then all of the output
data will be concatenated.
=back
=head2 class SHA1
The SHA1 class is used to compute SHA-1 checksums for compression
and verify them for decompression. It is believed to be
computationally infeasible to find two different strings
with the same hash value. Its member functions
are as follows:
=over
=item C<SHA1();>
The constructor creates a new SHA1 object representing the
hash of an empty string.
=item C<void put(int c);>
Appends one byte c (0...255) to the string whose hash is represented.
=item C<double size() const;>
Returns the length (so far) of the string whose hash is represented.
The largest possible value returned is
2^61 - 1 = 2305843009213693951.0, but values larger than 2^53 =
9007199254740992.0
will not be exact on systems using IEEE 64 bit floating point
representation of type C<double>. The initial value is 0.0.
=item C<uint64_t usize() const;>
Returns the length (so far) as a 64 bit unsigned integer.
=item C<const char* result();>
Computes the 20 byte SHA-1 hash and resets the string back
to a size of 0.0. The returned pointer points to an array
inside the SHA1 object whose
contents remain unchanged until the next call to C<result()>.
=back
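The limits quoted for C<size()> follow from how the length is kept: as a
bit count in two 32 bit halves. The sketch below restates that
arithmetic outside the class and checks where C<double> stops being
exact. The function and parameter names are illustrative only and are
not part of libzpaq.

```cpp
#include <cassert>
#include <stdint.h>

// Byte count from a bit length held in two 32 bit halves (low, high).
// 2^32 bits make 2^29 bytes, hence the shift by 29.
inline uint64_t bytesExact(uint32_t lowBits, uint32_t highBits) {
  return lowBits / 8 + (uint64_t(highBits) << 29);
}

// The same value as a double. 536870912 is 2^29.
inline double bytesApprox(uint32_t lowBits, uint32_t highBits) {
  return lowBits / 8 + highBits * 536870912.0;
}

// True if n survives a round trip through double unchanged.
// Intended for n below 2^63, which covers every possible length.
inline bool exactAsDouble(uint64_t n) {
  return uint64_t(double(n)) == n;
}
```

With both halves at their maximum the exact count is 2^61 - 1, and
2^53 + 1 is the first length that a C<double> cannot represent, which is
why C<usize()> is the safer choice for very large inputs.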
=head2 class Compressor
The C<Compressor> class has member functions to write
each of the syntactic elements of a ZPAQ stream and to specify
their values. It will compress using either built-in or
user supplied models.
=over
=item C<Compressor();>
The constructor creates a Compressor object. No input source,
output destination, or compression model is specified.
=item C<void setOutput(Writer* out);>
Specifies a destination for output. Must be specified before
calling any function that writes data.
=item C<void writeTag();>
Writes a 13 byte marker tag which can be used to identify
the start of a block following non-ZPAQ data.
=item C<void startBlock(int level);>
Writes a block header and specifies a compression model.
If linked with F<libzpaqo.cpp>, then C<level> must be 1, 2, or 3
to specify I<fast>, I<mid>, or I<max> respectively. Higher numbers
compress smaller but more slowly. These models are compatible
with both the ZPAQ level 1 and 2 standards.
=item C<void startBlock(const char* hcomp);>
Writes a block header and specifies the HCOMP portion of the
compression model. The first two bytes of the string should
encode the length of the rest of the string as a 16 bit unsigned
number with the least significant byte first. The meaning of the
rest of the string is defined in the ZPAQ level 2 standard.
If the number of components (C<hcomp[8]>) is 0, then the block
is saved in ZPAQ level 2 format, which cannot be read by
older ZPAQ level 1 decoders. Otherwise the block is saved in
ZPAQ level 1 format, which is compatible with all decoders.
=item C<void startSegment(const char* filename = 0, const char* comment = 0);>
Writes a segment header. C<filename> and
C<comment> are NUL terminated strings. If specified, then their
values are stored. Normally, C<filename> would be a file name
when compressing to an archive or omitted otherwise. If a file
is split among segments, then by convention only the first segment
is named. C<comment> is normally the uncompressed size as a decimal
number which is displayed when listing the contents of an archive.
Omitting it does not affect decompression.
=item C<void postProcess(const char* pcomp = 0, int length = 0);>
Specifies the optional PCOMP string used for post-processing.
It must be called from within the first segment
of each block prior to compressing any data, but not from within
any other segment.
If C<pcomp> is 0 or no argument is passed, then the decompresser
will not post-process the data. The effect is to compress a
0 byte to indicate to the decompresser that no PCOMP string
is present.
If C<pcomp> is not 0, then I<length> bytes of the string I<pcomp>
are passed. If I<length> is 0 or omitted, then
the first two bytes must encode
the length of the rest of the string as a 16 bit unsigned number
with the least significant byte first. The format of the remainder
of the string is described in the ZPAQ level 2 standard.
The effect is to compress a 1 byte
to indicate the presence of PCOMP, followed by the two length
bytes and the string as passed. For example, either
C<postProcess("\x02\x00\x05\x08")> or C<postProcess("\x05\x08", 2)>
would compress the 5 bytes 1, 2, 0, 5, 8.
The user is responsible for pre-processing the input
prior to compression so that PCOMP restores the original data.
=item C<void setInput(Reader* in);>
Specifies the input source for compression. It must be set
prior to the first call to C<compress()>.
=item C<bool compress(int n = -1);>
Compress n bytes of data, or until EOF is input, whichever comes
first. If n < 0 or omitted, then compress until EOF.
Returns true if there is more input available, or false if EOF
was read.
=item C<void endSegment(const char* sha1string = 0);>
Stop compressing and write the end of a segment. If
C<sha1string> is specified, it should be a 20 byte string
as returned by C<SHA1::result()> on the input data for
this segment I<before> pre-processing.
=item C<void endBlock();>
Finish writing the current block.
=back
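Both C<startBlock(const char*)> and C<postProcess()> accept a string
that begins with a two byte length, least significant byte first. A
minimal sketch of building and reading that prefix follows. The helper
names C<withLengthPrefix> and C<readU16> are hypothetical and not part
of the library.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Prepend a 16 bit little-endian length to body.
inline std::string withLengthPrefix(const std::string& body) {
  assert(body.size() <= 65535);          // must fit in 16 bits
  std::string s;
  s += char(body.size() & 255);          // low byte first
  s += char(body.size() >> 8);           // then high byte
  return s + body;
}

// Decode the 16 bit little-endian number stored at p[0], p[1].
inline int readU16(const char* p) {
  return (p[0] & 255) + 256 * (p[1] & 255);
}
```

For the two byte body 5, 8 this yields the four bytes 2, 0, 5, 8, which
is the string form shown in the C<postProcess()> description.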
In order to create a valid ZPAQ stream, the components must
be written in the following order:
for each block do {
if any non-ZPAQ data then {
write non-ZPAQ data
writeTag()
}
startBlock()
for each segment do {
startSegment()
if first segment in block then {
postProcess()
}
while (compress(n)) ;
endSegment()
}
endBlock()
}
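The ordering above can also be written as a small state machine. The
following is a sketch only: the class C<OrderChecker> and its state
names are hypothetical and not part of libzpaq. It assumes that every
block holds at least one segment and that C<postProcess()> is called in
the first segment of each block, as in the listing.

```cpp
#include <cassert>

// Hypothetical helper, not part of libzpaq: tracks whether Compressor
// calls arrive in the order listed above. Each method returns false,
// leaving the state unchanged, if the call is out of order.
class OrderChecker {
  enum State {
    IDLE,          // outside any block
    BLOCK_START,   // after startBlock(), before the first segment
    FIRST_SEG,     // first segment started, postProcess() still due
    IN_SEG,        // inside a segment, ready to compress
    BETWEEN_SEGS   // inside a block, after endSegment()
  } st;
  bool step(bool ok, State next) { if (ok) st = next; return ok; }
public:
  OrderChecker(): st(IDLE) {}
  bool startBlock()     { return step(st == IDLE, BLOCK_START); }
  bool startSegment() {
    if (st == BLOCK_START) return step(true, FIRST_SEG);
    return step(st == BETWEEN_SEGS, IN_SEG);
  }
  bool postProcess()    { return step(st == FIRST_SEG, IN_SEG); }
  bool compress() const { return st == IN_SEG; }
  bool endSegment()     { return step(st == IN_SEG, BETWEEN_SEGS); }
  bool endBlock()       { return step(st == BETWEEN_SEGS, IDLE); }
};
```

A wrapper around C<Compressor> could consult such a checker before each
call and report a misuse early, rather than relying on the library's
own error handling.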
=head2 class Decompresser
The class Decompresser has member functions to read each of the
syntactic elements of a ZPAQ stream.
=over
=item C<Decompresser()>
The constructor creates a Decompresser object. No input source or
output destination is specified.
=item C<void setInput(Reader* in);>
Specifies where the ZPAQ stream will be read from. Must be called
before any function that reads the stream.
=item C<bool findBlock(double* memptr = 0);>
Scan the input to find the start of the next block. If a block
does not start immediately, then the block must be preceded by
a marker tag (written with C<Compressor::writeTag()>) or it will
not be found. If C<memptr> is not 0, then write the approximate
memory requirement (in bytes) to decompress to C<*memptr>. The
memory will be allocated by the first call to C<decompress()>.
It returns true if a block is found, or false if it reads to EOF
without finding a block.
=item C<void hcomp(Writer* out);>
Write the HCOMP string of the current block to C<out>.
It will be in a format suitable
for passing to C<Compressor::startBlock()>. The first 2 bytes will
encode the length of the rest of the string as a 16 bit unsigned
integer with the least significant byte first. The format of the
remainder of the string is described in the ZPAQ level 1
specification.
=item C<bool findFilename(Writer* out = 0);>
Find the start of the next segment. If another segment is found
within the current block then return true. If the end of the block
is found first, then return false. If a segment is found, the
filename field is not empty, and C<out>
is not 0, then write the filename (without a terminating NUL byte)
to C<out>.
=item C<void readComment(Writer* out = 0);>
Read or skip past the comment field following the filename field
in the segment header. If C<out> is not 0 and the comment field is
not empty, then write the comment
(without a terminating NUL byte) to C<out>.
=item C<void setOutput(Writer* out);>
Specify the destination for decompression. It must be set before
any data can be decompressed.
=item C<void setSHA1(SHA1* sha1ptr);>
Specify the address of a SHA1 object for computing the checksum
of the decompressed data (after post-processing). As each byte C<c>
is output, it is also passed to C<sha1ptr-E<gt>put(c)>. In order to
compute the correct checksum, the SHA1 object should be in its
initial state, either newly created, or by calling C<SHA1::result()>,
before the first call to C<decompress()>. When the end of the segment
is reached, the value returned by C<sha1ptr-E<gt>result()> should match
the stored checksum, if any.
=item C<bool decompress(int n = -1);>
Decode n bytes or until the end of segment, whichever comes
first. Return false if the end of segment is reached first. If
n < 0 or not specified, then decompress to the end of segment
and return false. C<n> is the number of bytes prior to post-processing.
If the data is post-processed, then the size of the output may
be different.
=item C<bool pcomp(Writer* out);>
Write the PCOMP string, if any, for the current block to C<out>.
If there is no PCOMP string (no post-processor) then return false.
Otherwise write the string to C<out> in a format suitable for
passing to C<Compressor::postProcess()> and return true. If written,
then the first 2 bytes will encode the length of the rest of the
string as a 16 bit unsigned integer with the least significant
byte first. The format of the rest of the string is described in
the ZPAQ level 1 standard.
C<pcomp()> is only valid after the first call to C<decompress()>
in the current block. To read the PCOMP string without decompressing any
data, call C<decompress(0)> first. It is not necessary to
call C<setOutput()> in this case.
=item C<void readSegmentEnd(char* sha1string = 0);>
Skip any compressed data in the current segment that has not yet
been decompressed and advance to the end of the segment.
Then if C<sha1string> is not 0 then write into
the 21 byte array that it points to. If a checksum is present,
then write a 1 into C<sha1string[0]> and write the stored checksum
in C<sha1string[1...20]>. Otherwise write a 0 in C<sha1string[0]>.
Note that it is not permitted to call decompress() if any compressed
data has been skipped in any earlier segments in the same block.
=back
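The 21 byte array filled in by C<readSegmentEnd()> carries a presence
flag followed by the stored hash. A minimal sketch of comparing it with
a hash computed during decompression is given below. The enum and the
function name C<checkSegment> are hypothetical and not part of libzpaq.

```cpp
#include <cassert>
#include <cstring>

// Outcome of comparing a stored checksum with a computed one.
enum ChecksumStatus { CHECKSUM_ABSENT, CHECKSUM_OK, CHECKSUM_BAD };

// stored:   the 21 byte array written by readSegmentEnd()
//           (stored[0] is 1 if a checksum is present, else 0)
// computed: the 20 byte hash returned by SHA1::result()
inline ChecksumStatus checkSegment(const char* stored, const char* computed) {
  if (stored[0] != 1) return CHECKSUM_ABSENT;   // nothing to verify
  return std::memcmp(stored + 1, computed, 20) == 0
    ? CHECKSUM_OK : CHECKSUM_BAD;
}
```

Treating an absent checksum as a separate outcome lets the caller
decide whether unverified data is acceptable.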
A valid sequence of calls is as follows:
while (findBlock()) {
while (findFilename()) {
readComment();
if first segment in block then { (optional)
decompress(0)
pcomp()
}
while (decompress(n)) ; (optional)
readSegmentEnd();
}
}
=head1 EXAMPLES
The following program F<listzpaq.cpp>
lists the contents of a ZPAQ archive
read from standard input.
#include <stdio.h>
#include <stdlib.h>
#include "libzpaq.h"
// Implement Reader and Writer interfaces for file I/O
class File: public libzpaq::Reader, public libzpaq::Writer {
FILE* f;
public:
File(FILE* f_): f(f_) {}
int get() {return getc(f);}
void put(int c) {putc(c, f);}
int read(char* buf, int n) {return fread(buf, 1, n, f);}
void write(const char* buf, int n) {fwrite(buf, 1, n, f);}
};
// Implement error handler
namespace libzpaq {
void error(const char* msg) {
fprintf(stderr, "Error: %s\n", msg);
exit(1);
}
}
// List the contents of an archive. For each block, show
// the memory required to decompress. For each segment,
// show the filename and comment.
void list(FILE* input, FILE* output) {
libzpaq::Decompresser d;
File in(input), out(output);
double memory;
d.setInput(&in);
for (int block=1; d.findBlock(&memory); ++block) {
printf("Block %d needs %1.0f MB\n", block, memory/1e6);
while (d.findFilename(&out)) { // print filename
printf("\t");
d.readComment(&out); // print comment
printf("\n");
d.readSegmentEnd(); // skip compressed data
}
}
}
int main() {
list(stdin, stdout);
return 0;
}
The program could be compiled as follows:
g++ listzpaq.cpp libzpaq.cpp
The following code compresses a list of files into one block
written to stdout. Each file is compressed to a separate
segment. For each segment, the filename, comment, and SHA-1
checksum are stored. The comment, as conventional, is the
file size as a decimal string.
// Compress one file to one segment
void compress_file(libzpaq::Compressor& c,
const char* filename,
bool first_segment) {
// Open input file
FILE* f;
f=fopen(filename, "rb");
if (!f) return;
// Compute SHA-1 checksum and file size
libzpaq::SHA1 sha1;
int ch;
while ((ch=getc(f))!=EOF)
sha1.put(ch);
// Write file size as a comment.
// The size can have at most 19 digits.
char comment[20];
sprintf(comment, "%1.0f", sha1.size());
// Compress segment
rewind(f);
File in(f);
c.startSegment(filename, comment);
if (first_segment)
c.postProcess();
c.setInput(&in);
c.compress();
c.endSegment(sha1.result());
// Close input file
fclose(f);
}
// Compress a list of argc files in argv[0...argc-1] into one
// ZPAQ block to stdout at level 2.
void compress_list(int argc, char** argv) {
libzpaq::Compressor c;
File out(stdout);
c.setOutput(&out);
c.startBlock(2);
for (int i=0; i<argc; ++i)
compress_file(c, argv[i], i==0);
c.endBlock();
}
The following function decompresses from stdin to stdout.
Filenames and comments are ignored, but checksums are verified
if present.
void decompress() {
libzpaq::Decompresser d;
File in(stdin), out(stdout);
d.setInput(&in);
while (d.findBlock()) {
while (d.findFilename()) {
d.readComment();
libzpaq::SHA1 sha1;
d.setSHA1(&sha1);
d.setOutput(&out);
d.decompress();
char sha1string[21];
d.readSegmentEnd(sha1string);
const char* sha1result = sha1.result();
if (sha1string[0]==1
&& memcmp(sha1string+1, sha1result, 20))
libzpaq::error("checksum verify error");
}
}
}
C<Compressor::compress()> and C<Decompresser::decompress()> can
be passed an argument n to display progress every n bytes,
for example:
for (int i=1; d.decompress(1000000); ++i)
fprintf(stderr, "Decompressed %d MB\n", i);
To compress or decompress to and from objects in memory, derive
appropriate classes from C<Reader> and C<Writer>. For example, it is
possible to compress or decompress to a C<std::string> using
the following class.
struct String: public libzpaq::Writer {
std::string s;
void put(int c) {s+=char(c);}
};
This class is also useful for reading the filename and comment
fields during decompression as follows:
String filename, comment;
while (d.findFilename(&filename)) {
d.readComment(&comment);
// ...
=head1 AVAILABILITY
I<libzpaq>, I<zpaq>, and the ZPAQ level 1 and 2 specifications are
available from L<http://mattmahoney.net/zpaq/>.
=head1 SEE ALSO
C<zpaq(1)>
C<sha1(1SSL)>
=cut

File diff suppressed because it is too large


@ -1,542 +0,0 @@
/* libzpaq.h - LIBZPAQ Version 5.00.
Copyright (C) 2011, Dell Inc. Written by Matt Mahoney.
Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so without restriction.
This Software is provided "as is" without warranty.
LIBZPAQ is a C++ library for compression and decompression of data
conforming to the ZPAQ level 2 standard. See http://mattmahoney.net/zpaq/
By default, LIBZPAQ uses JIT (just in time) acceleration. This only
works on x86-32 and x86-64 processors that support the SSE2 instruction
set. To disable JIT, compile with -DNOJIT. To enable run time checks,
compile with -DDEBUG. Both options will decrease speed.
The decompression code, when compiled with -DDEBUG and -DNOJIT,
comprises the reference decoder for the ZPAQ level 2 standard.
*/
#ifndef LIBZPAQ_H
#define LIBZPAQ_H
#ifndef DEBUG
#define NDEBUG 1
#endif
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>
namespace libzpaq {
// 1, 2, 4, 8 byte unsigned integers
typedef uint8_t U8;
typedef uint16_t U16;
typedef uint32_t U32;
typedef uint64_t U64;
// Standard library prototypes redirected to libzpaq.cpp
void* calloc(size_t, size_t);
void free(void*);
// Callback for error handling
extern void error(const char* msg);
// Virtual base classes for input and output
// get() and put() must be overridden to read or write 1 byte.
// read() and write() may be overridden to read or write n bytes more
// efficiently than calling get() or put() n times.
class Reader {
public:
virtual int get() = 0; // should return 0..255, or -1 at EOF
virtual int read(char* buf, int n); // read to buf[n], return no. read
virtual ~Reader() {}
};
class Writer {
public:
virtual void put(int c) = 0; // should output low 8 bits of c
virtual void write(const char* buf, int n); // write buf[n]
virtual ~Writer() {}
};
// Read 16 bit little-endian number
int toU16(const char* p);
// An Array of T is cleared and aligned on a 64 byte address
// with no constructors called. No copy or assignment.
// Array<T> a(n, ex=0); - creates n<<ex elements of type T
// a[i] - index
// a(i) - index mod n, n must be a power of 2
// a.size() - gets n
template <typename T>
class Array {
T *data; // user location of [0] on a 64 byte boundary
size_t n; // user size
int offset; // distance back in bytes to start of actual allocation
void operator=(const Array&); // no assignment
Array(const Array&); // no copy
public:
Array(size_t sz=0, int ex=0): data(0), n(0), offset(0) {
resize(sz, ex);} // [0..sz-1] = 0
void resize(size_t sz, int ex=0); // change size, erase content to zeros
~Array() {resize(0);} // free memory
size_t size() const {return n;} // get size
int isize() const {return int(n);} // get size as an int
T& operator[](size_t i) {assert(n>0 && i<n); return data[i];}
T& operator()(size_t i) {assert(n>0 && (n&(n-1))==0); return data[i&(n-1)];}
};
// Change size to sz<<ex elements of 0
template<typename T>
void Array<T>::resize(size_t sz, int ex) {
assert(size_t(-1)>0); // unsigned type?
while (ex>0) {
if (sz>sz*2) error("Array too big");
sz*=2, --ex;
}
if (n>0) {
assert(offset>0 && offset<=64);
assert((char*)data-offset);
free((char*)data-offset);
}
n=0;
if (sz==0) return;
n=sz;
const size_t nb=128+n*sizeof(T); // test for overflow
if (nb<=128 || (nb-128)/sizeof(T)!=n) error("Array too big");
data=(T*)calloc(nb, 1);
if (!data) error("Out of memory");
offset=64-(((char*)data-(char*)0)&63);
assert(offset>0 && offset<=64);
data=(T*)((char*)data+offset);
}
//////////////////////////// SHA1 ////////////////////////////
// For computing SHA-1 checksums
class SHA1 {
public:
void put(int c) { // hash 1 byte
U32& r=w[len0>>5&15];
r=(r<<8)|(c&255);
if (!(len0+=8)) ++len1;
if ((len0&511)==0) process();
}
double size() const {return len0/8+len1*536870912.0;} // size in bytes
uint64_t usize() const {return len0/8+(U64(len1)<<29);} // size in bytes
const char* result(); // get hash and reset
SHA1() {init();}
private:
void init(); // reset, but don't clear hbuf
U32 len0, len1; // length in bits (low, high)
U32 h[5]; // hash state
U32 w[80]; // input buffer
char hbuf[20]; // result
void process(); // hash 1 block
};
//////////////////////////// ZPAQL ///////////////////////////
// Symbolic constants, instruction size, and names
typedef enum {NONE,CONS,CM,ICM,MATCH,AVG,MIX2,MIX,ISSE,SSE} CompType;
extern const int compsize[256];
// A ZPAQL machine COMP+HCOMP or PCOMP.
class ZPAQL {
public:
ZPAQL();
~ZPAQL();
void clear(); // Free memory, erase program, reset machine state
void inith(); // Initialize as HCOMP to run
void initp(); // Initialize as PCOMP to run
double memory(); // Return memory requirement in bytes
void run(U32 input); // Execute with input
int read(Reader* in2); // Read header
bool write(Writer* out2, bool pp); // If pp write PCOMP else HCOMP header
int step(U32 input, int mode); // Trace execution (defined externally)
Writer* output; // Destination for OUT instruction, or 0 to suppress
SHA1* sha1; // Points to checksum computer
U32 H(int i) {return h(i);} // get element of h
void flush(); // write outbuf[0..bufptr-1] to output and sha1
void outc(int c) { // output byte c (0..255) or -1 at EOS
if (c<0 || (outbuf[bufptr]=c, ++bufptr==outbuf.isize())) flush();
}
// ZPAQ1 block header
Array<U8> header; // hsize[2] hh hm ph pm n COMP (guard) HCOMP (guard)
int cend; // COMP in header[7...cend-1]
int hbegin, hend; // HCOMP/PCOMP in header[hbegin...hend-1]
private:
// Machine state for executing HCOMP
Array<U8> m; // memory array M for HCOMP
Array<U32> h; // hash array H for HCOMP
Array<U32> r; // 256 element register array
Array<char> outbuf; // output buffer
int bufptr; // number of bytes in outbuf
U32 a, b, c, d; // machine registers
int f; // condition flag
int pc; // program counter
int rcode_size; // length of rcode
U8* rcode; // JIT code for run()
// Support code
int assemble(); // put JIT code in rcode
void init(int hbits, int mbits); // initialize H and M sizes
int execute(); // execute 1 instruction, return 0 after HALT, else 1
void run0(U32 input); // default run() when select==0
void div(U32 x) {if (x) a/=x; else a=0;}
void mod(U32 x) {if (x) a%=x; else a=0;}
void swap(U32& x) {a^=x; x^=a; a^=x;}
void swap(U8& x) {a^=x; x^=a; a^=x;}
void err(); // exit with run time error
};
///////////////////////// Component //////////////////////////
// A Component is a context model, indirect context model, match model,
// fixed weight mixer, adaptive 2 input mixer without or with current
// partial byte as context, adaptive m input mixer (without or with),
// or SSE (without or with).
struct Component {
size_t limit; // max count for cm
size_t cxt; // saved context
size_t a, b, c; // multi-purpose variables
Array<U32> cm; // cm[cxt] -> p in bits 31..10, n in 9..0; MATCH index
Array<U8> ht; // ICM/ISSE hash table[0..size1][0..15] and MATCH buf
Array<U16> a16; // MIX weights
void init(); // initialize to all 0
Component() {init();}
};
////////////////////////// StateTable ////////////////////////
// Next state table generator
class StateTable {
enum {N=64}; // sizes of b, t
int num_states(int n0, int n1); // compute t[n0][n1][1]
void discount(int& n0); // set new value of n0 after 1 or n1 after 0
void next_state(int& n0, int& n1, int y); // new (n0,n1) after bit y
public:
U8 ns[1024]; // state*4 -> next state if 0, if 1, n0, n1
int next(int state, int y) { // next state for bit y
assert(state>=0 && state<256);
assert(y>=0 && y<4);
return ns[state*4+y];
}
int cminit(int state) { // initial probability of 1 * 2^23
assert(state>=0 && state<256);
return ((ns[state*4+3]*2+1)<<22)/(ns[state*4+2]+ns[state*4+3]+1);
}
StateTable();
};
///////////////////////// Predictor //////////////////////////
// A predictor guesses the next bit
class Predictor {
public:
Predictor(ZPAQL&);
~Predictor();
void init(); // build model
int predict(); // probability that next bit is a 1 (0..4095)
void update(int y); // train on bit y (0..1)
int stat(int); // Defined externally
bool isModeled() { // n>0 components?
assert(z.header.isize()>6);
return z.header[6]!=0;
}
private:
// Predictor state
int c8; // last 0...7 bits.
int hmap4; // c8 split into nibbles
int p[256]; // predictions
U32 h[256]; // unrolled copy of z.h
ZPAQL& z; // VM to compute context hashes, includes H, n
Component comp[256]; // the model, includes P
// Modeling support functions
int predict0(); // default
void update0(int y); // default
int dt2k[256]; // division table for match: dt2k[i] = 2^12/i
int dt[1024]; // division table for cm: dt[i] = 2^16/(i+1.5)
U16 squasht[4096]; // squash() lookup table
short stretcht[32768];// stretch() lookup table
StateTable st; // next, cminit functions
U8* pcode; // JIT code for predict() and update()
int pcode_size; // length of pcode
// reduce prediction error in cr.cm
void train(Component& cr, int y) {
assert(y==0 || y==1);
U32& pn=cr.cm(cr.cxt);
U32 count=pn&0x3ff;
int error=y*32767-(cr.cm(cr.cxt)>>17);
pn+=(error*dt[count]&-1024)+(count<cr.limit);
}
// x -> floor(32768/(1+exp(-x/64)))
int squash(int x) {
assert(x>=-2048 && x<=2047);
return squasht[x+2048];
}
// x -> round(64*log((x+0.5)/(32767.5-x))), approx inverse of squash
int stretch(int x) {
assert(x>=0 && x<=32767);
return stretcht[x];
}
// bound x to a 12 bit signed int
int clamp2k(int x) {
if (x<-2048) return -2048;
else if (x>2047) return 2047;
else return x;
}
// bound x to a 20 bit signed int
int clamp512k(int x) {
if (x<-(1<<19)) return -(1<<19);
else if (x>=(1<<19)) return (1<<19)-1;
else return x;
}
// Get cxt in ht, creating a new row if needed
size_t find(Array<U8>& ht, int sizebits, U32 cxt);
// Put JIT code in pcode
int assemble_p();
};
//////////////////////////// Decoder /////////////////////////
// Decoder decompresses using an arithmetic code
class Decoder {
public:
Reader* in; // source
Decoder(ZPAQL& z);
int decompress(); // return a byte or EOF
int skip(); // skip to the end of the segment, return next byte
void init(); // initialize at start of block
int stat(int x) {return pr.stat(x);}
private:
U32 low, high; // range
U32 curr; // last 4 bytes of archive
Predictor pr; // to get p
enum {BUFSIZE=1<<16};
Array<char> buf; // input buffer of size BUFSIZE bytes
// of unmodeled data. buf[low..high-1] is input with curr
// remaining in sub-block.
int decode(int p); // return decoded bit (0..1) with prob. p (0..65535)
void loadbuf(); // read unmodeled data into buf to EOS
};
/////////////////////////// PostProcessor ////////////////////
class PostProcessor {
int state; // input parse state: 0=INIT, 1=PASS, 2..4=loading, 5=POST
int hsize; // header size
int ph, pm; // sizes of H and M in z
public:
ZPAQL z; // holds PCOMP
PostProcessor(): state(0), hsize(0), ph(0), pm(0) {}
void init(int h, int m); // ph, pm sizes of H and M
int write(int c); // Input a byte, return state
int getState() const {return state;}
void setOutput(Writer* out) {z.output=out;}
void setSHA1(SHA1* sha1ptr) {z.sha1=sha1ptr;}
};
//////////////////////// Decompresser ////////////////////////
// For decompression and listing archive contents
class Decompresser {
public:
Decompresser(): z(), dec(z), pp(), state(BLOCK), decode_state(FIRSTSEG) {}
void setInput(Reader* in) {dec.in=in;}
bool findBlock(double* memptr = 0);
void hcomp(Writer* out2) {z.write(out2, false);}
bool findFilename(Writer* = 0);
void readComment(Writer* = 0);
void setOutput(Writer* out) {pp.setOutput(out);}
void setSHA1(SHA1* sha1ptr) {pp.setSHA1(sha1ptr);}
bool decompress(int n = -1); // n bytes, -1=all, return true until done
bool pcomp(Writer* out2) {return pp.z.write(out2, true);}
void readSegmentEnd(char* sha1string = 0);
int stat(int x) {return dec.stat(x);}
private:
ZPAQL z;
Decoder dec;
PostProcessor pp;
enum {BLOCK, FILENAME, COMMENT, DATA, SEGEND} state; // expected next
enum {FIRSTSEG, SEG, SKIP} decode_state; // which segment in block?
};
/////////////////////////// decompress() /////////////////////
void decompress(Reader* in, Writer* out);
//////////////////////////////////////////////////////////////
//////////////////////////////////////////////////////////////
// Code following this point is not a part of the ZPAQ level 2 standard.
//////////////////////////// Encoder /////////////////////////
// Encoder compresses using an arithmetic code
class Encoder {
public:
Encoder(ZPAQL& z):
out(0), low(1), high(0xFFFFFFFF), pr(z) {}
void init();
void compress(int c); // c is 0..255 or EOF
int stat(int x) {return pr.stat(x);}
Writer* out; // destination
private:
U32 low, high; // range
Predictor pr; // to get p
Array<char> buf; // unmodeled input
void encode(int y, int p); // encode bit y (0..1) with prob. p (0..65535)
};
//////////////////////// Compressor //////////////////////////
class Compressor {
public:
Compressor(): enc(z), in(0), state(INIT) {}
void setOutput(Writer* out) {enc.out=out;}
void writeTag();
void startBlock(int level); // level=1,2,3
void startBlock(const char* hcomp);
void startSegment(const char* filename = 0, const char* comment = 0);
void setInput(Reader* i) {in=i;}
void postProcess(const char* pcomp = 0, int len = 0);
bool compress(int n = -1); // n bytes, -1=all, return true until done
void endSegment(const char* sha1string = 0);
void endBlock();
int stat(int x) {return enc.stat(x);}
private:
ZPAQL z;
Encoder enc;
Reader* in;
enum {INIT, BLOCK1, SEG1, BLOCK2, SEG2} state;
};
/////////////////////////// compress() ///////////////////////
void compress(Reader* in, Writer* out, int level);
} // namespace libzpaq
/////////////////////////// lrzip functions //////////////////
#include <stdio.h>
#ifndef uchar
#define uchar unsigned char
#endif
#define likely(x) __builtin_expect(!!(x), 1)
#define unlikely(x) __builtin_expect(!!(x), 0)
#define __maybe_unused __attribute__((unused))
typedef int64_t i64;
struct bufRead: public libzpaq::Reader {
uchar *s_buf;
i64 *s_len;
i64 total_len;
int *last_pct;
bool progress;
long thread;
FILE *msgout;
bufRead(uchar *buf_, i64 *n_, i64 total_len_, int *last_pct_, bool progress_, long thread_, FILE *msgout_):
s_buf(buf_), s_len(n_), total_len(total_len_), last_pct(last_pct_), progress(progress_), thread(thread_), msgout(msgout_) {}
int get() {
if (progress && !(*s_len % 128)) {
int pct = (total_len > 0) ?
(total_len - *s_len) * 100 / total_len : 100;
if (pct / 10 != *last_pct / 10) {
int i;
fprintf(msgout, "\r\t\t\tZPAQ\t");
for (i = 0; i < thread; i++)
fprintf(msgout, "\t");
fprintf(msgout, "%ld:%i%% \r",
thread + 1, pct);
fflush(msgout);
*last_pct = pct;
}
}
if (likely(*s_len > 0)) {
(*s_len)--;
return ((int)(uchar)*s_buf++);
}
return -1;
} // read and return byte 0..255, or -1 at EOF
int read(char *buf, int n) {
if (unlikely(n > *s_len))
n = *s_len;
if (likely(n > 0)) {
*s_len -= n;
memcpy(buf, s_buf, n);
s_buf += n; // advance past the bytes just copied, as get() does
}
return n;
}
};
struct bufWrite: public libzpaq::Writer {
uchar *c_buf;
i64 *c_len;
bufWrite(uchar *buf_, i64 *n_): c_buf(buf_), c_len(n_) {}
void put(int c) {
c_buf[(*c_len)++] = (uchar)c;
}
void write(const char *buf, int n) {
memcpy(c_buf + *c_len, buf, n);
*c_len += n;
}
};
extern "C" void zpaq_compress(uchar *c_buf, i64 *c_len, uchar *s_buf, i64 s_len, int level,
FILE *msgout, bool progress, long thread)
{
i64 total_len = s_len;
int last_pct = 100;
bufRead bufR(s_buf, &s_len, total_len, &last_pct, progress, thread, msgout);
bufWrite bufW(c_buf, c_len);
compress (&bufR, &bufW, level);
}
extern "C" void zpaq_decompress(uchar *s_buf, i64 *d_len, uchar *c_buf, i64 c_len,
FILE *msgout, bool progress, long thread)
{
i64 total_len = c_len;
int last_pct = 100;
bufRead bufR(c_buf, &c_len, total_len, &last_pct, progress, thread, msgout);
bufWrite bufW(s_buf, d_len);
decompress(&bufR, &bufW);
}
#endif // LIBZPAQ_H

1524
lrzip.c

File diff suppressed because it is too large


@ -1,50 +0,0 @@
/*
Copyright (C) 2006-2016,2022 Con Kolivas
Copyright (C) 2011 Peter Hyman
Copyright (C) 1998-2003 Andrew Tridgell
This program is free software; you can redistribute it and/or modify
it under the terms of the GNU General Public License as published by
the Free Software Foundation; either version 2 of the License, or
(at your option) any later version.
This program is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
GNU General Public License for more details.
You should have received a copy of the GNU General Public License
along with this program. If not, see <http://www.gnu.org/licenses/>.
*/
#ifndef LRZIP_CORE_H
#define LRZIP_CORE_H
#include "lrzip_private.h"
i64 get_ram(rzip_control *control);
i64 nloops(i64 seconds, uchar *b1, uchar *b2);
bool write_magic(rzip_control *control);
bool read_magic(rzip_control *control, int fd_in, i64 *expected_size);
bool preserve_perms(rzip_control *control, int fd_in, int fd_out);
int open_tmpoutfile(rzip_control *control);
bool flush_tmpout(rzip_control *control);
int open_tmpinfile(rzip_control *control);
bool read_tmpinfile(rzip_control *control, int fd_in);
bool decompress_file(rzip_control *control);
bool get_header_info(rzip_control *control, int fd_in, uchar *ctype, i64 *c_len, i64 *u_len, i64 *last_head);
bool get_fileinfo(rzip_control *control);
bool compress_file(rzip_control *control);
bool write_fdout(rzip_control *control, void *buf, i64 len);
bool write_fdin(rzip_control *control);
void close_tmpoutbuf(rzip_control *control);
void clear_tmpinbuf(rzip_control *control);
bool clear_tmpinfile(rzip_control *control);
void close_tmpinbuf(rzip_control *control);
bool initialise_control(rzip_control *control);
#define initialize_control(_control) initialise_control(_control)
extern void zpaq_compress(uchar *c_buf, i64 *c_len, uchar *s_buf, i64 s_len, int level,
FILE *msgout, bool progress, long thread);
extern void zpaq_decompress(uchar *s_buf, i64 *d_len, uchar *c_buf, i64 c_len,
FILE *msgout, bool progress, long thread);
#endif

@ -1,574 +0,0 @@
/*
Copyright (C) 2006-2016,2018,2021-2022 Con Kolivas
Copyright (C) 2011 Peter Hyman
Copyright (C) 1998-2003 Andrew Tridgell
This program is free software; you can redistribute it and/or modify
it under the terms of the GNU General Public License as published by
the Free Software Foundation; either version 2 of the License, or
(at your option) any later version.
This program is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
GNU General Public License for more details.
You should have received a copy of the GNU General Public License
along with this program. If not, see <http://www.gnu.org/licenses/>.
*/
#ifndef LRZIP_PRIV_H
#define LRZIP_PRIV_H
#include "config.h"
#define NUM_STREAMS 2
#define STREAM_BUFSIZE (1024 * 1024 * 10)
#include <stdlib.h>
#include <stdint.h>
#include <stdio.h>
#include <stdbool.h>
#include <stdarg.h>
#include <semaphore.h>
#ifdef HAVE_PTHREAD_H
# include <pthread.h>
#endif
#ifdef HAVE_STRING_H
# include <string.h>
#endif
#ifdef HAVE_MALLOC_H
# include <malloc.h>
#endif
#ifdef HAVE_ALLOCA_H
# include <alloca.h>
#elif defined __GNUC__
# define alloca __builtin_alloca
#elif defined _AIX
# define alloca __alloca
#elif defined _MSC_VER
# include <malloc.h>
# define alloca _alloca
#else
# include <stddef.h>
# ifdef __cplusplus
extern "C"
# endif
void *alloca (size_t);
#endif
#ifdef HAVE_ENDIAN_H
# include <endian.h>
#elif HAVE_SYS_ENDIAN_H
# include <sys/endian.h>
#endif
#ifndef __BYTE_ORDER
# ifndef __BIG_ENDIAN
# define __BIG_ENDIAN 4321
# define __LITTLE_ENDIAN 1234
# endif
# ifdef WORDS_BIGENDIAN
# define __BYTE_ORDER __BIG_ENDIAN
# else
# define __BYTE_ORDER __LITTLE_ENDIAN
# endif
#endif
#ifndef MD5_DIGEST_SIZE
# define MD5_DIGEST_SIZE 16
#endif
#define free(X) do { free((X)); (X) = NULL; } while (0)
#ifndef strdupa
# define strdupa(str) strcpy(alloca(strlen(str) + 1), str)
#endif
#ifndef strndupa
# define strndupa(str, len) strncpy(alloca(len + 1), str, len)
#endif
#ifndef uchar
#define uchar unsigned char
#endif
#ifndef int32
#if (SIZEOF_INT == 4)
#define int32 int
#elif (SIZEOF_LONG == 4)
#define int32 long
#elif (SIZEOF_SHORT == 4)
#define int32 short
#endif
#endif
#ifndef int16
#if (SIZEOF_INT == 2)
#define int16 int
#elif (SIZEOF_SHORT == 2)
#define int16 short
#endif
#endif
#ifndef uint32
#define uint32 unsigned int32
#endif
#ifndef uint16
#define uint16 unsigned int16
#endif
#ifndef MIN
#define MIN(a, b) ((a) < (b)? (a): (b))
#endif
#ifndef MAX
#define MAX(a, b) ((a) > (b)? (a): (b))
#endif
#if !HAVE_STRERROR
extern char *sys_errlist[];
#define strerror(i) sys_errlist[i]
#endif
#ifndef HAVE_ERRNO_H
extern int errno;
#endif
#define likely(x) __builtin_expect(!!(x), 1)
#define unlikely(x) __builtin_expect(!!(x), 0)
#define __maybe_unused __attribute__((unused))
#if defined(__MINGW32__) || defined(__CYGWIN__) || defined(__ANDROID__) || defined(__APPLE__) || defined(__OpenBSD__)
# define ffsll __builtin_ffsll
#endif
typedef int64_t i64;
typedef uint32_t u32;
typedef struct rzip_control rzip_control;
typedef struct md5_ctx md5_ctx;
/* ck specific unnamed semaphore implementations to cope with osx not
* implementing them. */
#ifdef __APPLE__
struct cksem {
int pipefd[2];
};
typedef struct cksem cksem_t;
#else
typedef sem_t cksem_t;
#endif
#if !defined(__linux)
#define mremap fake_mremap
#endif
#define bswap_32(x) \
((((x) & 0xff000000) >> 24) | (((x) & 0x00ff0000) >> 8) | \
(((x) & 0x0000ff00) << 8) | (((x) & 0x000000ff) << 24))
# define bswap_64(x) \
((((x) & 0xff00000000000000ull) >> 56) \
| (((x) & 0x00ff000000000000ull) >> 40) \
| (((x) & 0x0000ff0000000000ull) >> 24) \
| (((x) & 0x000000ff00000000ull) >> 8) \
| (((x) & 0x00000000ff000000ull) << 8) \
| (((x) & 0x0000000000ff0000ull) << 24) \
| (((x) & 0x000000000000ff00ull) << 40) \
| (((x) & 0x00000000000000ffull) << 56))
#ifdef leto32h
# define le32toh(x) leto32h(x)
# define le64toh(x) leto64h(x)
#endif
#ifndef le32toh
# if __BYTE_ORDER == __LITTLE_ENDIAN
# define htole32(x) (x)
# define le32toh(x) (x)
# define htole64(x) (x)
# define le64toh(x) (x)
# elif __BYTE_ORDER == __BIG_ENDIAN
# define htole32(x) bswap_32 (x)
# define le32toh(x) bswap_32 (x)
# define htole64(x) bswap_64 (x)
# define le64toh(x) bswap_64 (x)
#else
#error UNKNOWN BYTE ORDER
#endif
#endif
#define FLAG_SHOW_PROGRESS (1 << 0)
#define FLAG_KEEP_FILES (1 << 1)
#define FLAG_TEST_ONLY (1 << 2)
#define FLAG_FORCE_REPLACE (1 << 3)
#define FLAG_DECOMPRESS (1 << 4)
#define FLAG_NO_COMPRESS (1 << 5)
#define FLAG_LZO_COMPRESS (1 << 6)
#define FLAG_BZIP2_COMPRESS (1 << 7)
#define FLAG_ZLIB_COMPRESS (1 << 8)
#define FLAG_ZPAQ_COMPRESS (1 << 9)
#define FLAG_VERBOSITY (1 << 10)
#define FLAG_VERBOSITY_MAX (1 << 11)
#define FLAG_STDIN (1 << 12)
#define FLAG_STDOUT (1 << 13)
#define FLAG_INFO (1 << 14)
#define FLAG_UNLIMITED (1 << 15)
#define FLAG_HASH (1 << 16)
#define FLAG_MD5 (1 << 17)
#define FLAG_CHECK (1 << 18)
#define FLAG_KEEP_BROKEN (1 << 19)
#define FLAG_THRESHOLD (1 << 20)
#define FLAG_TMP_OUTBUF (1 << 21)
#define FLAG_TMP_INBUF (1 << 22)
#define FLAG_ENCRYPT (1 << 23)
#define FLAG_OUTPUT (1 << 24)
#define NO_MD5 (!(HASH_CHECK) && !(HAS_MD5))
#define BITS32 (sizeof(long) == 4)
#define CTYPE_NONE 3
#define CTYPE_BZIP2 4
#define CTYPE_LZO 5
#define CTYPE_LZMA 6
#define CTYPE_GZIP 7
#define CTYPE_ZPAQ 8
#define PASS_LEN 512
#define HASH_LEN 64
#define SALT_LEN 8
#define CBC_LEN 16
#define one_g (1000 * 1024 * 1024)
#if defined(NOTHREAD) || !defined(_SC_NPROCESSORS_ONLN)
# define PROCESSORS (1)
#else
# define PROCESSORS (sysconf(_SC_NPROCESSORS_ONLN))
#endif
#ifndef PAGE_SIZE
# ifdef _SC_PAGE_SIZE
# define PAGE_SIZE (sysconf(_SC_PAGE_SIZE))
# else
# define PAGE_SIZE (4096)
# endif
#endif
#define dealloc(ptr) do { \
free(ptr); \
ptr = NULL; \
} while (0)
/* Determine how many times to hash the password when encrypting, based on
* the date such that we increase the number of loops according to Moore's
* law relative to when the data is encrypted. It is then stored as a two
* byte value in the header */
#define MOORE 1.835 // world constant [TIMES per YEAR]
#define ARBITRARY 1000000 // number of sha2 calls per one second in 2011
#define T_ZERO 1293840000 // seconds since epoch in 2011
#define SECONDS_IN_A_YEAR (365*86400)
#define MOORE_TIMES_PER_SECOND pow (MOORE, 1.0 / SECONDS_IN_A_YEAR)
#define ARBITRARY_AT_EPOCH (ARBITRARY * pow (MOORE_TIMES_PER_SECOND, -T_ZERO))
#define FLAG_VERBOSE (FLAG_VERBOSITY | FLAG_VERBOSITY_MAX)
#define FLAG_NOT_LZMA (FLAG_NO_COMPRESS | FLAG_LZO_COMPRESS | FLAG_BZIP2_COMPRESS | FLAG_ZLIB_COMPRESS | FLAG_ZPAQ_COMPRESS)
#define LZMA_COMPRESS (!(control->flags & FLAG_NOT_LZMA))
#define SHOW_PROGRESS (control->flags & FLAG_SHOW_PROGRESS)
#define KEEP_FILES (control->flags & FLAG_KEEP_FILES)
#define TEST_ONLY (control->flags & FLAG_TEST_ONLY)
#define FORCE_REPLACE (control->flags & FLAG_FORCE_REPLACE)
#define DECOMPRESS (control->flags & FLAG_DECOMPRESS)
#define NO_COMPRESS (control->flags & FLAG_NO_COMPRESS)
#define LZO_COMPRESS (control->flags & FLAG_LZO_COMPRESS)
#define BZIP2_COMPRESS (control->flags & FLAG_BZIP2_COMPRESS)
#define ZLIB_COMPRESS (control->flags & FLAG_ZLIB_COMPRESS)
#define ZPAQ_COMPRESS (control->flags & FLAG_ZPAQ_COMPRESS)
#define VERBOSE (control->flags & FLAG_VERBOSE)
#define VERBOSITY (control->flags & FLAG_VERBOSITY)
#define MAX_VERBOSE (control->flags & FLAG_VERBOSITY_MAX)
#define STDIN (control->flags & FLAG_STDIN)
#define STDOUT (control->flags & FLAG_STDOUT)
#define INFO (control->flags & FLAG_INFO)
#define UNLIMITED (control->flags & FLAG_UNLIMITED)
#define HASH_CHECK (control->flags & FLAG_HASH)
#define HAS_MD5 (control->flags & FLAG_MD5)
#define CHECK_FILE (control->flags & FLAG_CHECK)
#define KEEP_BROKEN (control->flags & FLAG_KEEP_BROKEN)
#define LZ4_TEST (control->flags & FLAG_THRESHOLD)
#define TMP_OUTBUF (control->flags & FLAG_TMP_OUTBUF)
#define TMP_INBUF (control->flags & FLAG_TMP_INBUF)
#define ENCRYPT (control->flags & FLAG_ENCRYPT)
#define SHOW_OUTPUT (control->flags & FLAG_OUTPUT)
#define IS_FROM_FILE ( !!(control->inFILE) && !STDIN )
/* Structure to save state of computation between the single steps. */
struct md5_ctx
{
uint32_t A;
uint32_t B;
uint32_t C;
uint32_t D;
uint32_t total[2];
uint32_t buflen;
uint32_t buffer[32];
};
struct sliding_buffer {
uchar *buf_low; /* The low window buffer */
uchar *buf_high;/* "" high "" */
i64 orig_offset;/* Where the original buffer started */
i64 offset_low; /* The current offset of the low buffer */
i64 offset_high;/* "" high buffer "" */
i64 offset_search;/* Where the search is up to */
i64 orig_size; /* How big the full buffer would be */
i64 size_low; /* How big the low buffer is */
i64 size_high; /* "" high "" */
i64 high_length;/* How big the high buffer should be */
int fd; /* The fd of the mmap */
};
struct checksum {
uint32_t *cksum;
uchar *buf;
i64 len;
};
typedef i64 tag;
struct node {
void *data;
struct node *prev;
};
struct runzip_node {
struct stream_info *sinfo;
pthread_t *pthreads;
struct runzip_node *prev;
};
struct rzip_state {
void *ss;
struct node *sslist;
struct node *head;
struct level *level;
tag hash_index[256];
struct hash_entry *hash_table;
char hash_bits;
i64 hash_count;
i64 hash_limit;
tag minimum_tag_mask;
i64 tag_clean_ptr;
i64 last_match;
i64 chunk_size;
i64 mmap_size;
char chunk_bytes;
uint32_t cksum;
int fd_in, fd_out;
char stdin_eof;
struct {
i64 inserts;
i64 literals;
i64 literal_bytes;
i64 matches;
i64 match_bytes;
i64 tag_hits;
i64 tag_misses;
} stats;
};
struct rzip_control {
char *infile;
FILE *inFILE; // if a FILE is being read from
char *outname;
char *outfile;
FILE *outFILE; // if a FILE is being written to
char *outdir;
char *tmpdir; // when stdin, stdout, or test used
uchar *tmp_outbuf; // Temporary file storage for stdout
i64 out_ofs; // Output offset when tmp_outbuf in use
i64 hist_ofs; // History offset
i64 out_len; // Total length of tmp_outbuf
i64 out_maxlen; // The largest size the tmp_outbuf can be
i64 out_relofs; // Relative tmp_outbuf offset when stdout has been flushed
uchar *tmp_inbuf;
i64 in_ofs;
i64 in_len;
i64 in_maxlen;
FILE *msgout; //stream for output messages
FILE *msgerr; //stream for output errors
char *suffix;
uchar compression_level;
i64 overhead; // compressor overhead
i64 usable_ram; // the most ram we'll try to use on one activity
i64 maxram; // the largest chunk of ram to allocate
unsigned char lzma_properties[5]; // lzma properties, encoded
i64 window;
unsigned long flags;
i64 ramsize;
i64 max_chunk;
i64 max_mmap;
int threads;
char nice_val; // added for consistency
int current_priority;
char major_version;
char minor_version;
i64 st_size;
long page_size;
int fd_in;
int fd_out;
int fd_hist;
i64 encloops;
i64 secs;
void (*pass_cb)(void *, char *, size_t); /* callback to get password in lib */
void *pass_data;
uchar salt[SALT_LEN];
uchar *salt_pass;
int salt_pass_len;
uchar *hash;
char *passphrase;
pthread_mutex_t control_lock;
unsigned char eof;
unsigned char magic_written;
bool lzma_prop_set;
cksem_t cksumsem;
md5_ctx ctx;
uchar md5_resblock[MD5_DIGEST_SIZE];
i64 md5_read; // How far into the file the md5 has done so far
struct checksum checksum;
const char *util_infile;
char delete_infile;
const char *util_outfile;
char delete_outfile;
FILE *outputfile;
char library_mode;
int log_level;
void (*info_cb)(void *data, int pct, int chunk_pct);
void *info_data;
void (*log_cb)(void *data, unsigned int level, unsigned int line, const char *file, const char *func, const char *format, va_list args);
void *log_data;
char chunk_bytes;
struct sliding_buffer sb;
void (*do_mcpy)(rzip_control *, unsigned char *, i64, i64);
void (*next_tag)(rzip_control *, struct rzip_state *, i64, tag *);
tag (*full_tag)(rzip_control *, struct rzip_state *, i64);
i64 (*match_len)(rzip_control *, struct rzip_state *, i64, i64, i64, i64 *);
pthread_t *pthreads;
struct runzip_node *ruhead;
};
struct uncomp_thread {
uchar *s_buf;
i64 u_len, c_len;
i64 last_head;
uchar c_type;
int busy;
int streamno;
};
struct stream {
i64 last_head;
uchar *buf;
i64 buflen;
i64 bufp;
uchar eos;
long uthread_no;
long unext_thread;
long base_thread;
int total_threads;
i64 last_headofs;
};
struct stream_info {
struct stream *s;
uchar num_streams;
int fd;
i64 bufsize;
i64 cur_pos;
i64 initial_pos;
i64 total_read;
i64 ram_alloced;
i64 size;
struct uncomp_thread *ucthreads;
long thread_no;
long next_thread;
int chunks;
char chunk_bytes;
};
static inline void print_stuff(const rzip_control *control, int level, unsigned int line, const char *file, const char *func, const char *format, ...)
{
va_list ap;
if (control->library_mode && control->log_cb && (control->log_level >= level)) {
va_start(ap, format);
control->log_cb(control->log_data, level, line, file, func, format, ap);
va_end(ap);
} else if (control->msgout) {
va_start(ap, format);
vfprintf(control->msgout, format, ap);
va_end(ap);
fflush(control->msgout);
}
}
static inline void print_err(const rzip_control *control, unsigned int line, const char *file, const char *func, const char *format, ...)
{
va_list ap;
if (control->library_mode && control->log_cb && (control->log_level >= 0)) {
va_start(ap, format);
control->log_cb(control->log_data, 0, line, file, func, format, ap);
va_end(ap);
} else if (control->msgerr) {
va_start(ap, format);
vfprintf(control->msgerr, format, ap);
va_end(ap);
fflush(control->msgerr);
}
}
#define print_stuff(level, ...) do {\
print_stuff(control, level, __LINE__, __FILE__, __func__, __VA_ARGS__); \
} while (0)
#define print_output(...) do {\
if (SHOW_OUTPUT) \
print_stuff(1, __VA_ARGS__); \
} while (0)
#define print_progress(...) do {\
if (SHOW_PROGRESS) \
print_stuff(2, __VA_ARGS__); \
} while (0)
#define print_verbose(...) do {\
if (VERBOSE) \
print_stuff(3, __VA_ARGS__); \
} while (0)
#define print_maxverbose(...) do {\
if (MAX_VERBOSE) \
print_stuff(4, __VA_ARGS__); \
} while (0)
#define print_err(...) do {\
print_err(control, __LINE__, __FILE__, __func__, __VA_ARGS__); \
} while (0)
#endif
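
The Moore's-law macros in the header above (MOORE, ARBITRARY, T_ZERO, SECONDS_IN_A_YEAR, MOORE_TIMES_PER_SECOND, ARBITRARY_AT_EPOCH) compose into a single closed form. The worked equation below is a sketch that assumes the loop count at time t is ARBITRARY_AT_EPOCH multiplied by MOORE_TIMES_PER_SECOND raised to the power t. That final multiplication is not shown in this header, so treat it as an assumption.

```latex
\begin{align*}
Y &= \texttt{SECONDS\_IN\_A\_YEAR} = 365 \times 86400 = 31\,536\,000 \\
m &= \texttt{MOORE\_TIMES\_PER\_SECOND} = 1.835^{1/Y} \\
A_0 &= \texttt{ARBITRARY\_AT\_EPOCH} = 10^{6} \cdot m^{-T_0},
      \qquad T_0 = \texttt{T\_ZERO} = 1\,293\,840\,000 \\
\operatorname{loops}(t) &= A_0 \cdot m^{t} = 10^{6} \cdot 1.835^{(t - T_0)/Y} \\
\operatorname{loops}(T_0) &= 10^{6}, \qquad
\operatorname{loops}(T_0 + Y) = 1.835 \times 10^{6}, \qquad
\operatorname{loops}(T_0 + 2Y) \approx 3.367 \times 10^{6}
\end{align*}
```

Under that assumption the hash count starts at one million at the start of 2011 and is multiplied by 1.835 for every further year in the encryption timestamp.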

152
lrztar

@ -1,7 +1,7 @@
#!/bin/bash
# Copyright (C) George Makrydakis 2009-2011,2013
# Copyright (C) Con Kolivas 2011-2012,2016,2018,2021
# Copyright (C) George Makrydakis 2009, 2010
# Copyright (C) Con Kolivas 2010
# A bash wrapper for Con Kolivas' excellent lrzip utility. For the time
# being, lrzip does not like pipes, so we had to do this. It is kind of
@ -21,132 +21,50 @@
# along with this program. If not, see <http://www.gnu.org/licenses/>.
function lrztar_local() {
local hv="\
lrztar GNU/bash wrapper script for lrzip and tar input/output over directories.
Copyright (C) George Makrydakis 2009-2011,2013
Copyright (C) Con Kolivas 2011-2012,2016,2018,2021
Usage : lrztar [lrzip options] <directory>
Result: a lrzip tarball is produced.
Extras: when an lrzip tarball is used with -d, -O, it gets extracted:
-h: will display this message.
-d: <path1> will decompress a <path1> lrzip tarball to current directory.
-O: <path2> will decompress a -d specified lrzip tarball to <path2> path.
-f: will force overwrites.
Notice:
- The input argument is always last, all options and their arguments precede.
- The -O flag is an option flag, goes before: (-O <somedir> <input arg>).
- You can use the remaining options of lrzip as they were.
- lrzuntar is equivalent to lrztar [options] -d <filename>.
- This script exists because of how lrzip behaves.
- Beware the -f flag, it stands for what it says...
"
[[ $1 == "" ]] && {
printf "lrztar: no arguments given\n";
return 1;
}
local p=("${@:1:$(($#-1))}") s="${!#}" vopt=("lrz") \
v_w=0 v_S=0 v_D=0 v_p=0 v_q=0 v_L=0 \
v_n=0 v_l=0 v_b=0 v_g=0 v_z=0 v_U=0 \
v_T=0 v_N=0 v_v=0 v_f=0 v_d=0 v_h=0 \
v_H=0 v_c=0 v_k=0 v_o=0 v_O=0 v_m=0 x= i="$(pwd)"
tar --version &> /dev/null \
|| { printf "lrztar: no tar in your path\n"; return 1; }
lrzip --version &> /dev/null \
|| { printf "lrztar: no lrzip in your path\n"; return 1; }
lrzcat --version &> /dev/null \
|| { printf "lrztar: no lrzcat in your path\n"; return 1; }
while getopts w:O:S:DqL:nlbgzUm:TN:p:vfo:d:tVhHck x; do
[[ $x == [tV] ]] && {
printf "lrztar: invalid option for lrztar: %s\n" "$x";
return 1;
}
((v_$x=${#vopt[@]}))
vopt[${#vopt[@]}]="$OPTARG"
local p="${@:1:$(($#-1))}" s="${!#}" tname= fname= \
v_w=0 v_O=0 v_S=0 v_D=0 v_P=0 v_q=0 v_L=0 \
v_n=0 v_l=0 v_b=0 v_g=0 v_z=0 v_M=0 v_U=0 \
v_T=0 v_N=0 v_v=0 v_f=0 v_d=0 v_h=0 x= i=
OPTERR=0
trap '[[ -z $tname ]] || rm -rf "$tname" &> /dev/null' 1 2 3 15
which tar &> /dev/null || { printf "lrztar: no tar in your path\n"; return 1; }
which lrzip &> /dev/null || { printf "lrztar: no lrzip in your path\n"; return 1; }
while getopts w:O:S:DPqL:nlbgzMUT:N:vfodtVh x; do
[[ $x == [otV] ]] || ((v_$x=1)) &> /dev/null \
|| { printf "lrztar: invalid option for lrztar %s\n" "$x"; return 1; }
done
[[ $(basename "$0") == lrzuntar ]] \
&& { ((v_d=${#vopt[@]})); vopt[${#vopt[@]}]="$s"; }
&& { v_d=1; p="-d $p"; }
{ ! (($#)) || ((v_h)); } && {
printf "%s\n" "$hv"
printf "lrztar wrapper for compressing/decompressing whole directories with lrzip.\n"
printf "usage: lrztar [lrzip options] <directory> will compress directory to directory.tar.lrz\n"
printf "lrztar -d [lrzip options] <directory.tar.lrz> will extract directory from directory.tar.lrz\n"
printf "lrzuntar [lrzip options] <directory.tar.lrz> will extract directory from directory.tar.lrz\n"
printf "lrz[un]tar -h will display this help message\n"
printf "lrzip -h will display lrzip options\n"
return
}
((v_d)) && {
[[ -e ${vopt[v_d]} ]] || {
printf "lrztar: file does not exist: %s\n" \
"${vopt[v_d]}"
fname="$(basename "$s")"; tname="${fname%.lrz}";
! ((v_f)) && [[ -e ${tname%.tar} ]] && {
printf "lrztar: ${tname%.tar} already present, aborting\n"
return 1
}
i+="/${vopt[v_d]##*/}"
i="${i%.tar.*}"
if ((v_O)); then
for x in ${!p[@]};do
[[ ${p[x]} == "-O" ]] && {
p[x]=
p[$((x+1))]=
break;
}
done
i="${vopt[v_O]%/}"
x="${s##*/}"
if [[ -d "$i/${x%.tar.*}" ]] && ! ((v_f)); then
printf "lrztar: %s exists, use -f.\n" \
"$i/${x%.tar.*}"
return 1
fi
if ! [[ -d $i ]]; then
printf "lrztar: %s output path does not exist.\n" \
"$i"
return 1
fi
else
i="./"
fi
[ ! -z "$s" ] && {
lrzcat ${p[@]// /\\ } "$s" | tar x -C "$i"
x=$?
} || {
lrzcat ${p[@]// /\\ } | tar x -C "$i"
x=$?
}
[[ ${s%/*} != $s ]] && i="${s%/*}/" || i="./"
tname="$i$tname"
lrzip $p "$i$fname" && tar xf "$tname"
x=$?
} || {
if ((v_o)); then
! ((v_f)) && [[ -e ${vopt[$v_o]} ]] && {
printf "lrztar: %s exists, use -f to overwrite.\n" \
"${vopt[$v_o]}"
return 1
}
else
if ((v_O)); then
if ! [[ -d ${vopt[v_O]} ]]; then
printf "lrztar: %s output path does not exist.\n" \
"${vopt[v_O]}"
return 1
fi
for x in ${!p[@]};do
[[ ${p[x]} == "-O" ]] && {
p[x]=
i="${p[$((x+1))]%/}"
p[$((x+1))]=
s="${!#}"
break;
}
done
fi
s="${s%/}"
p+=(-o "$i/${s##*/}.tar.${vopt[v_S]}");
fi
if ! ((v_o)); then
! ((v_f)) && [[ -e $i/${s##*/}.tar.${vopt[v_S]} ]] && {
printf "lrztar: %s exists, use -f to overwrite\n" \
"$i/${s##*/}.tar.${vopt[v_S]}"
return 1
}
fi
tar c "$s" | lrzip "${p[@]}"
fname="$(basename "$s")"; tname="$fname.tar"
[[ $fname == *.lrz ]] && {
printf "lrztar: $fname is already a .lrz file, aborting\n"
return 1
}
tar cf "$tname" "$s" && lrzip $p "$tname"
x=$?
}
rm -rf "$tname" &> /dev/null
! ((x)) && ((v_D)) && rm -rf "$s" &> /dev/null
return $x
}

@ -1,100 +0,0 @@
; 7zAsm.asm -- ASM macros
; 2009-12-12 : Igor Pavlov : Public domain
; 2011-10-12 : P7ZIP : Public domain
%define NOT ~
%macro MY_ASM_START 0
SECTION .text
%endmacro
%macro MY_PROC 2 ; macro name:req, numParams:req
align 16
%define proc_numParams %2 ; numParams
global %1
global _%1
%1:
_%1:
%endmacro
%macro MY_ENDP 0
%ifdef x64
ret
; proc_name ENDP
%else
ret ; (proc_numParams - 2) * 4
%endif
%endmacro
%ifdef x64
REG_SIZE equ 8
%else
REG_SIZE equ 4
%endif
%define x0 EAX
%define x1 ECX
%define x2 EDX
%define x3 EBX
%define x4 ESP
%define x5 EBP
%define x6 ESI
%define x7 EDI
%define x0_L AL
%define x1_L CL
%define x2_L DL
%define x3_L BL
%define x0_H AH
%define x1_H CH
%define x2_H DH
%define x3_H BH
%ifdef x64
%define r0 RAX
%define r1 RCX
%define r2 RDX
%define r3 RBX
%define r4 RSP
%define r5 RBP
%define r6 RSI
%define r7 RDI
%else
%define r0 x0
%define r1 x1
%define r2 x2
%define r3 x3
%define r4 x4
%define r5 x5
%define r6 x6
%define r7 x7
%endif
%macro MY_PUSH_4_REGS 0
push r3
push r5
%ifdef x64
%ifdef CYGWIN64
push r6
push r7
%endif
%else
push r6
push r7
%endif
%endmacro
%macro MY_POP_4_REGS 0
%ifdef x64
%ifdef CYGWIN64
pop r7
pop r6
%endif
%else
pop r7
pop r6
%endif
pop r5
pop r3
%endmacro

@ -1,147 +0,0 @@
; 7zCrcOpt.asm -- CRC32 calculation : optimized version
; 2009-12-12 : Igor Pavlov : Public domain
%include "7zAsm.asm"
MY_ASM_START
%define rD r2
%define rN r7
%ifdef x64
%define num_VAR r8
%define table_VAR r9
%else
data_size equ (REG_SIZE * 7)
crc_table equ (REG_SIZE + data_size)
%define num_VAR [r4 + data_size]
%define table_VAR [r4 + crc_table]
%endif
%define SRCDAT rN + rD + 4 *
%macro CRC 4 ;CRC macro op:req, dest:req, src:req, t:req
%1 %2, DWORD [r5 + %3 * 4 + 0400h * %4] ; op dest, DWORD [r5 + src * 4 + 0400h * t]
%endmacro
%macro CRC_XOR 3 ; CRC_XOR macro dest:req, src:req, t:req
CRC xor, %1, %2, %3
%endmacro
%macro CRC_MOV 3 ; CRC_MOV macro dest:req, src:req, t:req
CRC mov, %1, %2, %3 ; CRC mov, dest, src, t
%endmacro
%macro CRC1b 0
movzx x6, BYTE [rD]
inc rD
movzx x3, x0_L
xor x6, x3
shr x0, 8
CRC xor, x0, r6, 0
dec rN
%endmacro
%macro MY_PROLOG 1 ; MY_PROLOG macro crc_end:req
MY_PUSH_4_REGS
%ifdef x64
%ifdef CYGWIN64
;ECX=CRC, RDX=buf, R8=size R9=table
; already in R8 : mov num_VAR,R8 ; LEN
; already in RDX : mov rD, RDX ; BUF
; already in R9 : mov table_VAR,R9; table
mov x0, ECX ; CRC
%else
;EDI=CRC, RSI=buf, RDX=size RCX=table
mov num_VAR,RDX ; LEN
mov rD, RSI ; BUF
mov table_VAR,RCX; table
mov x0, EDI ; CRC
%endif
%else
mov x0, [r4 + 20] ; CRC
mov rD, [r4 + 24] ; buf
%endif
mov rN, num_VAR
mov r5, table_VAR
test rN, rN
jz near %1 ; crc_end
%%sl:
test rD, 7
jz %%sl_end
CRC1b
jnz %%sl
%%sl_end:
cmp rN, 16
jb near %1; crc_end
add rN, rD
mov num_VAR, rN
sub rN, 8
and rN, NOT 7
sub rD, rN
xor x0, [SRCDAT 0]
%endmacro
%macro MY_EPILOG 1 ; MY_EPILOG macro crc_end:req
xor x0, [SRCDAT 0]
mov rD, rN
mov rN, num_VAR
sub rN, rD
%1: ; crc_end:
test rN, rN
jz %%end ; @F
CRC1b
jmp %1 ; crc_end
%%end:
MY_POP_4_REGS
%endmacro
MY_PROC CrcUpdateT8, 4
MY_PROLOG crc_end_8
mov x1, [SRCDAT 1]
align 16
main_loop_8:
mov x6, [SRCDAT 2]
movzx x3, x1_L
CRC_XOR x6, r3, 3
movzx x3, x1_H
CRC_XOR x6, r3, 2
shr x1, 16
movzx x3, x1_L
movzx x1, x1_H
CRC_XOR x6, r3, 1
movzx x3, x0_L
CRC_XOR x6, r1, 0
mov x1, [SRCDAT 3]
CRC_XOR x6, r3, 7
movzx x3, x0_H
shr x0, 16
CRC_XOR x6, r3, 6
movzx x3, x0_L
CRC_XOR x6, r3, 5
movzx x3, x0_H
CRC_MOV x0, r3, 4
xor x0, x6
add rD, 8
jnz main_loop_8
MY_EPILOG crc_end_8
MY_ENDP
; T4 CRC deleted
; end
%ifidn __OUTPUT_FORMAT__,elf
section .note.GNU-stack noalloc noexec nowrite progbits
%endif
%ifidn __OUTPUT_FORMAT__,elf32
section .note.GNU-stack noalloc noexec nowrite progbits
%endif
%ifidn __OUTPUT_FORMAT__,elf64
section .note.GNU-stack noalloc noexec nowrite progbits
%endif

102
lzma/ASM/x86/7zCrcT8U.s Normal file

@ -0,0 +1,102 @@
SECTION .text
%macro CRC1b 0
movzx EDX, BYTE [ESI]
inc ESI
movzx EBX, AL
xor EDX, EBX
shr EAX, 8
xor EAX, [EBP + EDX * 4]
dec EDI
%endmacro
data_size equ (28)
crc_table equ (data_size + 4)
align 16
global CrcUpdateT8
global _CrcUpdateT8
CrcUpdateT8:
_CrcUpdateT8:
push EBX
push ESI
push EDI
push EBP
mov EAX, [ESP + 20]
mov ESI, [ESP + 24]
mov EDI, [ESP + data_size]
mov EBP, [ESP + crc_table]
test EDI, EDI
jz sl_end
sl:
test ESI, 7
jz sl_end
CRC1b
jnz sl
sl_end:
cmp EDI, 16
jb NEAR crc_end
mov [ESP + data_size], EDI
sub EDI, 8
and EDI, ~ 7
sub [ESP + data_size], EDI
add EDI, ESI
xor EAX, [ESI]
mov EBX, [ESI + 4]
movzx ECX, BL
align 16
main_loop:
mov EDX, [EBP + ECX*4 + 0C00h]
movzx ECX, BH
xor EDX, [EBP + ECX*4 + 0800h]
shr EBX, 16
movzx ECX, BL
xor EDX, [EBP + ECX*4 + 0400h]
xor EDX, [ESI + 8]
movzx ECX, AL
movzx EBX, BH
xor EDX, [EBP + EBX*4 + 0000h]
mov EBX, [ESI + 12]
xor EDX, [EBP + ECX*4 + 01C00h]
movzx ECX, AH
add ESI, 8
shr EAX, 16
xor EDX, [EBP + ECX*4 + 01800h]
movzx ECX, AL
xor EDX, [EBP + ECX*4 + 01400h]
movzx ECX, AH
mov EAX, [EBP + ECX*4 + 01000h]
movzx ECX, BL
xor EAX,EDX
cmp ESI, EDI
jne main_loop
xor EAX, [ESI]
mov EDI, [ESP + data_size]
crc_end:
test EDI, EDI
jz fl_end
fl:
CRC1b
jnz fl
fl_end:
pop EBP
pop EDI
pop ESI
pop EBX
ret
%ifidn __OUTPUT_FORMAT__,elf
section .note.GNU-stack noalloc noexec nowrite progbits
%endif
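
The CrcUpdateT8 routine above consumes eight input bytes per loop iteration using eight 256-entry lookup tables (the 0000h through 01C00h byte offsets). A portable C sketch of that slicing-by-8 technique is given below for orientation. crc_generate_table, load_le32 and crc_update_t8 are illustrative names; the table-generation recurrence is the usual one for the reflected polynomial 0xEDB88320 and is assumed rather than taken from this diff, and no attempt is made to mirror the assembly's alignment handling or register allocation.

```c
#include <stddef.h>
#include <stdint.h>

#define CRC_POLY 0xEDB88320u
#define CRC_NUM_TABLES 8

static uint32_t crc_table[CRC_NUM_TABLES * 256];

/* Build table 0 bit by bit, then derive tables 1..7 from the previous one. */
static void crc_generate_table(void)
{
	uint32_t i;
	int j;

	for (i = 0; i < 256; i++) {
		uint32_t r = i;
		for (j = 0; j < 8; j++)
			r = (r >> 1) ^ (CRC_POLY & (0u - (r & 1u)));
		crc_table[i] = r;
	}
	for (i = 256; i < CRC_NUM_TABLES * 256; i++) {
		uint32_t r = crc_table[i - 256];
		crc_table[i] = crc_table[r & 0xFF] ^ (r >> 8);
	}
}

/* Read a 32-bit little-endian word without alignment requirements. */
static uint32_t load_le32(const unsigned char *p)
{
	return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
	       ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Update raw CRC state v over size bytes; no initial or final inversion. */
static uint32_t crc_update_t8(uint32_t v, const void *data, size_t size,
			      const uint32_t *table)
{
	const unsigned char *p = data;

	/* Main loop: fold the state into the first word, look up 8 bytes. */
	for (; size >= 8; size -= 8, p += 8) {
		uint32_t d = load_le32(p) ^ v;
		uint32_t e = load_le32(p + 4);

		v = table[0x700 + (d & 0xFF)] ^
		    table[0x600 + ((d >> 8) & 0xFF)] ^
		    table[0x500 + ((d >> 16) & 0xFF)] ^
		    table[0x400 + (d >> 24)] ^
		    table[0x300 + (e & 0xFF)] ^
		    table[0x200 + ((e >> 8) & 0xFF)] ^
		    table[0x100 + ((e >> 16) & 0xFF)] ^
		    table[0x000 + (e >> 24)];
	}
	/* Tail: one byte at a time, like the CRC1b macro. */
	for (; size > 0; size--, p++)
		v = table[(v ^ *p) & 0xFF] ^ (v >> 8);
	return v;
}
```

With the conventional CRC-32 framing (start from 0xFFFFFFFF and invert the result), the nine ASCII bytes "123456789" yield 0xCBF43926, which is a convenient self-check after calling crc_generate_table().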

@ -1,7 +0,0 @@
MAINTAINERCLEANFILES = Makefile.in
noinst_LTLIBRARIES = liblzmaasm.la
liblzmaasm_la_SOURCES = \
7zAsm.asm \
7zCrcOpt_asm.asm

@ -0,0 +1,105 @@
SECTION .text
%macro CRC1b 0
movzx EDX, BYTE [RSI]
inc RSI
movzx EBX, AL
xor EDX, EBX
shr EAX, 8
xor EAX, [RDI + RDX * 4]
dec R8
%endmacro
align 16
global CrcUpdateT8
CrcUpdateT8:
push RBX
push RSI
push RDI
push RBP
mov EAX, ECX
mov RSI, RDX
mov RDI, R9
test R8, R8
jz sl_end
sl:
test RSI, 7
jz sl_end
CRC1b
jnz sl
sl_end:
cmp R8, 16
jb crc_end
mov R9, R8
and R8, 7
add R8, 8
sub R9, R8
add R9, RSI
xor EAX, [RSI]
mov EBX, [RSI + 4]
movzx ECX, BL
align 16
main_loop:
mov EDX, [RDI + RCX*4 + 0C00h]
movzx EBP, BH
xor EDX, [RDI + RBP*4 + 0800h]
shr EBX, 16
movzx ECX, BL
xor EDX, [RSI + 8]
xor EDX, [RDI + RCX*4 + 0400h]
movzx ECX, AL
movzx EBP, BH
xor EDX, [RDI + RBP*4 + 0000h]
mov EBX, [RSI + 12]
xor EDX, [RDI + RCX*4 + 01C00h]
movzx EBP, AH
shr EAX, 16
movzx ECX, AL
xor EDX, [RDI + RBP*4 + 01800h]
movzx EBP, AH
mov EAX, [RDI + RCX*4 + 01400h]
add RSI, 8
xor EAX, [RDI + RBP*4 + 01000h]
movzx ECX, BL
xor EAX,EDX
cmp RSI, R9
jne main_loop
xor EAX, [RSI]
crc_end:
test R8, R8
jz fl_end
fl:
CRC1b
jnz fl
fl_end:
pop RBP
pop RDI
pop RSI
pop RBX
ret
%ifidn __OUTPUT_FORMAT__,elf
section .note.GNU-stack noalloc noexec nowrite progbits
%endif

@ -1,19 +1,18 @@
/* Alloc.h -- Memory allocation functions
2009-02-07 : Igor Pavlov : Public domain */
2008-03-13
Igor Pavlov
Public domain */
#ifndef __COMMON_ALLOC_H
#define __COMMON_ALLOC_H
#include <stddef.h>
#ifdef __cplusplus
extern "C" {
#endif
#ifdef _WIN32
void *MyAlloc(size_t size);
void MyFree(void *address);
#ifdef _WIN32
void SetLargePageSize();
@ -24,15 +23,15 @@ void BigFree(void *address);
#else
#define MidAlloc(size) MyAlloc(size)
#define MidFree(address) MyFree(address)
#define BigAlloc(size) MyAlloc(size)
#define BigFree(address) MyFree(address)
#include <stdlib.h> /* malloc */
#define MyAlloc(size) malloc(size)
#define MyFree(address) free(address)
#define MidAlloc(size) malloc(size)
#define MidFree(address) free(address)
#define BigAlloc(size) malloc(size)
#define BigFree(address) free(address)
#endif
#ifdef __cplusplus
}
#endif
#endif

@ -1,12 +1,10 @@
/* LzFindMt.c -- multithreaded Match finder for LZ algorithms
2009-09-20 : Igor Pavlov : Public domain */
2009-05-26 : Igor Pavlov : Public domain */
#include "LzHash.h"
#include "LzFindMt.h"
#include "lrzip_core.h"
void MtSync_Construct(CMtSync *p)
{
p->wasCreated = False;
@ -456,7 +454,7 @@ void MatchFinderMt_Destruct(CMatchFinderMt *p, ISzAlloc *alloc)
static unsigned MY_STD_CALL HashThreadFunc2(void *p) { HashThreadFunc((CMatchFinderMt *)p); return 0; }
static unsigned MY_STD_CALL BtThreadFunc2(void *p)
{
__maybe_unused Byte allocaDummy[0x180];
Byte allocaDummy[0x180];
int i = 0;
for (i = 0; i < 16; i++)
allocaDummy[i] = (Byte)i;
@ -713,47 +711,47 @@ UInt32 MatchFinderMt_GetMatches(CMatchFinderMt *p, UInt32 *distances)
return len;
}
#define SKIP_HEADER2_MT do { GET_NEXT_BLOCK_IF_REQUIRED
#define SKIP_HEADER_MT(n) SKIP_HEADER2_MT if (p->btNumAvailBytes-- >= (n)) { const Byte *cur = p->pointerToCurPos; UInt32 *hash = p->hash;
#define SKIP_FOOTER_MT } INCREASE_LZ_POS p->btBufPos += p->btBuf[p->btBufPos] + 1; } while (--num != 0);
#define SKIP_HEADER2 do { GET_NEXT_BLOCK_IF_REQUIRED
#define SKIP_HEADER(n) SKIP_HEADER2 if (p->btNumAvailBytes-- >= (n)) { const Byte *cur = p->pointerToCurPos; UInt32 *hash = p->hash;
#define SKIP_FOOTER } INCREASE_LZ_POS p->btBufPos += p->btBuf[p->btBufPos] + 1; } while (--num != 0);
void MatchFinderMt0_Skip(CMatchFinderMt *p, UInt32 num)
{
SKIP_HEADER2_MT { p->btNumAvailBytes--;
SKIP_FOOTER_MT
SKIP_HEADER2 { p->btNumAvailBytes--;
SKIP_FOOTER
}
void MatchFinderMt2_Skip(CMatchFinderMt *p, UInt32 num)
{
SKIP_HEADER_MT(2)
SKIP_HEADER(2)
UInt32 hash2Value;
MT_HASH2_CALC
hash[hash2Value] = p->lzPos;
SKIP_FOOTER_MT
SKIP_FOOTER
}
void MatchFinderMt3_Skip(CMatchFinderMt *p, UInt32 num)
{
SKIP_HEADER_MT(3)
SKIP_HEADER(3)
UInt32 hash2Value, hash3Value;
MT_HASH3_CALC
hash[kFix3HashSize + hash3Value] =
hash[ hash2Value] =
p->lzPos;
SKIP_FOOTER_MT
SKIP_FOOTER
}
/*
void MatchFinderMt4_Skip(CMatchFinderMt *p, UInt32 num)
{
SKIP_HEADER_MT(4)
SKIP_HEADER(4)
UInt32 hash2Value, hash3Value, hash4Value;
MT_HASH4_CALC
hash[kFix4HashSize + hash4Value] =
hash[kFix3HashSize + hash3Value] =
hash[ hash2Value] =
p->lzPos;
SKIP_FOOTER_MT
SKIP_FOOTER
}
*/

@ -1,5 +1,5 @@
/* LzmaDec.c -- LZMA Decoder
2009-09-20 : Igor Pavlov : Public domain */
2008-11-06 : Igor Pavlov : Public domain */
#include "LzmaDec.h"
@ -113,6 +113,12 @@
StopCompilingDueBUG
#endif
static const Byte kLiteralNextStates[kNumStates * 2] =
{
0, 0, 0, 0, 1, 2, 3, 4, 5, 6, 4, 5,
7, 7, 7, 7, 7, 7, 7, 10, 10, 10, 10, 10
};
#define LZMA_DIC_MIN (1 << 12)
/* First LZMA-symbol is always decoded.
@ -169,7 +175,6 @@ static int MY_FAST_CALL LzmaDec_DecodeReal(CLzmaDec *p, SizeT limit, const Byte
if (state < kNumLitStates)
{
state -= (state < 4) ? state : 3;
symbol = 1;
do { GET_BIT(prob + symbol, symbol) } while (symbol < 0x100);
}
@ -177,7 +182,6 @@ static int MY_FAST_CALL LzmaDec_DecodeReal(CLzmaDec *p, SizeT limit, const Byte
{
unsigned matchByte = p->dic[(dicPos - rep0) + ((dicPos < rep0) ? dicBufSize : 0)];
unsigned offs = 0x100;
state -= (state < 10) ? 3 : 6;
symbol = 1;
do
{
@ -192,6 +196,9 @@ static int MY_FAST_CALL LzmaDec_DecodeReal(CLzmaDec *p, SizeT limit, const Byte
}
dic[dicPos++] = (Byte)symbol;
processedPos++;
state = kLiteralNextStates[state];
/* if (state < 4) state = 0; else if (state < 10) state -= 3; else state -= 6; */
continue;
}
else
@ -371,6 +378,7 @@ static int MY_FAST_CALL LzmaDec_DecodeReal(CLzmaDec *p, SizeT limit, const Byte
else if (distance >= checkDicSize)
return SZ_ERROR_DATA;
state = (state < kNumStates + kNumLitStates) ? kNumLitStates : kNumLitStates + 3;
/* state = kLiteralNextStates[state]; */
}
len += kMatchMinLen;
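
The LzmaDec.c hunks above show two spellings of the same state transition: the kLiteralNextStates lookup table on one side and inline arithmetic on the other (state -= (state < 4) ? state : 3 for literal states, state -= (state < 10) ? 3 : 6 otherwise). A small C check of that equivalence follows. The constants kNumStates = 12 and kNumLitStates = 7 are the values used by the LZMA SDK and are an assumption here, since this diff does not show them; literal_next_arith, literal_table_matches_arith and match_table_matches_arith are illustrative helpers.

```c
#define kNumStates 12
#define kNumLitStates 7

static const unsigned char kLiteralNextStates[kNumStates * 2] = {
	0, 0, 0, 0, 1, 2, 3, 4, 5, 6, 4, 5,
	7, 7, 7, 7, 7, 7, 7, 10, 10, 10, 10, 10
};

/* Arithmetic form of the transition taken after decoding a literal. */
static unsigned literal_next_arith(unsigned state)
{
	if (state < 4)
		return 0;
	if (state < 10)
		return state - 3;
	return state - 6;
}

/* Return 1 if table entries 0..kNumStates-1 agree with the arithmetic. */
static int literal_table_matches_arith(void)
{
	unsigned s;

	for (s = 0; s < kNumStates; s++)
		if (kLiteralNextStates[s] != literal_next_arith(s))
			return 0;
	return 1;
}

/* Return 1 if the upper half agrees with the post-match expression. */
static int match_table_matches_arith(void)
{
	unsigned s;

	for (s = kNumStates; s < kNumStates * 2; s++) {
		unsigned expect = (s < kNumStates + kNumLitStates)
			? kNumLitStates : kNumLitStates + 3;
		if (kLiteralNextStates[s] != expect)
			return 0;
	}
	return 1;
}
```

Both helpers return 1 for the table as printed, so choosing between the table and the inline arithmetic is a matter of speed and code size rather than behaviour.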

@ -1,5 +1,5 @@
/* LzmaEnc.c -- LZMA Encoder
2010-04-16 : Igor Pavlov : Public domain */
2009-04-22 : Igor Pavlov : Public domain */
#include <string.h>
@ -13,12 +13,10 @@
#include "LzmaEnc.h"
#include "LzFind.h"
#ifndef _7ZIP_ST
#ifdef COMPRESS_MF_MT
#include "LzFindMt.h"
#endif
#include "lrzip_core.h"
#ifdef SHOW_STAT
static int ttt = 0;
#endif
@ -68,7 +66,7 @@ void LzmaEncProps_Normalize(CLzmaEncProps *p)
if (p->mc == 0) p->mc = (16 + (p->fb >> 1)) >> (p->btMode ? 0 : 1);
if (p->numThreads < 0)
p->numThreads =
#ifndef _7ZIP_ST
#ifdef COMPRESS_MF_MT
((p->btMode && p->algo) ? 2 : 1);
#else
1;
@ -174,7 +172,7 @@ typedef struct
#define kEndPosModelIndex 14
#define kNumPosModels (kEndPosModelIndex - kStartPosModelIndex)
#define kNumFullDistances (1 << (kEndPosModelIndex >> 1))
#define kNumFullDistances (1 << (kEndPosModelIndex / 2))
#ifdef _LZMA_PROB32
#define CLzmaProb UInt32
@ -261,14 +259,14 @@ typedef struct
IMatchFinder matchFinder;
void *matchFinderObj;
#ifndef _7ZIP_ST
#ifdef COMPRESS_MF_MT
Bool mtMode;
CMatchFinderMt matchFinderMt;
#endif
CMatchFinder matchFinderBase;
#ifndef _7ZIP_ST
#ifdef COMPRESS_MF_MT
Byte pad[128];
#endif
@ -397,7 +395,7 @@ SRes LzmaEnc_SetProps(CLzmaEncHandle pp, const CLzmaEncProps *props2)
LzmaEncProps_Normalize(&props);
if (props.lc > LZMA_LC_MAX || props.lp > LZMA_LP_MAX || props.pb > LZMA_PB_MAX ||
props.dictSize > ((UInt32)1 << kDicLogSizeMaxCompress) || props.dictSize > ((UInt32)1 << 30))
props.dictSize > (unsigned)(1 << kDicLogSizeMaxCompress) || props.dictSize > (1 << 30))
return SZ_ERROR_PARAM;
p->dictSize = props.dictSize;
p->matchFinderCycles = props.mc;
@ -430,7 +428,7 @@ SRes LzmaEnc_SetProps(CLzmaEncHandle pp, const CLzmaEncProps *props2)
p->writeEndMark = props.writeEndMark;
#ifndef _7ZIP_ST
#ifdef COMPRESS_MF_MT
/*
if (newMultiThread != _multiThread)
{
@ -806,7 +804,7 @@ static void MovePos(CLzmaEnc *p, UInt32 num)
{
#ifdef SHOW_STAT
ttt += num;
printf("\n MovePos %d", num);
fprintf(stderr, "\n MovePos %d", num);
#endif
if (num != 0)
{
@ -821,12 +819,12 @@ static UInt32 ReadMatchDistances(CLzmaEnc *p, UInt32 *numDistancePairsRes)
p->numAvail = p->matchFinder.GetNumAvailableBytes(p->matchFinderObj);
numPairs = p->matchFinder.GetMatches(p->matchFinderObj, p->matches);
#ifdef SHOW_STAT
printf("\n i = %d numPairs = %d ", ttt, numPairs / 2);
fprintf(stderr, "\n i = %d numPairs = %d ", ttt, numPairs / 2);
ttt++;
{
UInt32 i;
for (i = 0; i < numPairs; i += 2)
printf("%2d %6d | ", p->matches[i], p->matches[i + 1]);
fprintf(stderr, "%2d %6d | ", p->matches[i], p->matches[i + 1]);
}
#endif
if (numPairs > 0)
@ -1117,9 +1115,9 @@ static UInt32 GetOptimum(CLzmaEnc *p, UInt32 position, UInt32 *backRes)
if (position >= 0)
{
unsigned i;
printf("\n pos = %4X", position);
fprintf(stderr, "\n pos = %4X", position);
for (i = cur; i <= lenEnd; i++)
printf("\nprice[%4X] = %d", position - cur + i, p->opt[i].price);
fprintf(stderr, "\nprice[%4X] = %d", position - cur + i, p->opt[i].price);
}
#endif
@ -1679,7 +1677,7 @@ void LzmaEnc_Construct(CLzmaEnc *p)
{
RangeEnc_Construct(&p->rc);
MatchFinder_Construct(&p->matchFinderBase);
#ifndef _7ZIP_ST
#ifdef COMPRESS_MF_MT
MatchFinderMt_Construct(&p->matchFinderMt);
p->matchFinderMt.MatchFinder = &p->matchFinderBase;
#endif
@ -1718,7 +1716,7 @@ void LzmaEnc_FreeLits(CLzmaEnc *p, ISzAlloc *alloc)
void LzmaEnc_Destruct(CLzmaEnc *p, ISzAlloc *alloc, ISzAlloc *allocBig)
{
#ifndef _7ZIP_ST
#ifdef COMPRESS_MF_MT
MatchFinderMt_Destruct(&p->matchFinderMt, allocBig);
#endif
MatchFinder_Free(&p->matchFinderBase, allocBig);
@ -1774,7 +1772,7 @@ static SRes LzmaEnc_CodeOneBlock(CLzmaEnc *p, Bool useLimits, UInt32 maxPackSize
len = GetOptimum(p, nowPos32, &pos);
#ifdef SHOW_STAT2
printf("\n pos = %4X, len = %d pos = %d", nowPos32, len, pos);
fprintf(stderr, "\n pos = %4X, len = %d pos = %d", nowPos32, len, pos);
#endif
posState = nowPos32 & p->pbMask;
@ -1903,7 +1901,7 @@ static SRes LzmaEnc_Alloc(CLzmaEnc *p, UInt32 keepWindowSize, ISzAlloc *alloc, I
if (!RangeEnc_Alloc(&p->rc, alloc))
return SZ_ERROR_MEM;
btMode = (p->matchFinderBase.btMode != 0);
#ifndef _7ZIP_ST
#ifdef COMPRESS_MF_MT
p->mtMode = (p->multiThread && !p->fastMode && btMode);
#endif
@ -1928,7 +1926,7 @@ static SRes LzmaEnc_Alloc(CLzmaEnc *p, UInt32 keepWindowSize, ISzAlloc *alloc, I
if (beforeSize + p->dictSize < keepWindowSize)
beforeSize = keepWindowSize - p->dictSize;
#ifndef _7ZIP_ST
#ifdef COMPRESS_MF_MT
if (p->mtMode)
{
RINOK(MatchFinderMt_Create(&p->matchFinderMt, p->dictSize, beforeSize, p->numFastBytes, LZMA_MATCH_LEN_MAX, allocBig));
@ -2075,7 +2073,7 @@ SRes LzmaEnc_MemPrepare(CLzmaEncHandle pp, const Byte *src, SizeT srcLen,
void LzmaEnc_Finish(CLzmaEncHandle pp)
{
#ifndef _7ZIP_ST
#ifdef COMPRESS_MF_MT
CLzmaEnc *p = (CLzmaEnc *)pp;
if (p->mtMode)
MatchFinderMt_ReleaseStream(&p->matchFinderMt);
@ -2157,8 +2155,8 @@ static SRes LzmaEnc_Encode2(CLzmaEnc *p, ICompressProgress *progress)
{
SRes res = SZ_OK;
#ifndef _7ZIP_ST
__maybe_unused Byte allocaDummy[0x300];
#ifdef COMPRESS_MF_MT
Byte allocaDummy[0x300];
int i = 0;
for (i = 0; i < 16; i++)
allocaDummy[i] = (Byte)i;
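
Note on the `dictSize` bound check in the `LzmaEnc_SetProps` hunk of this file: the two sides differ in where the cast to an unsigned type is applied. In `(unsigned)(1 << n)` the shift is carried out in `int` and only the result is converted, so with a 32-bit `int` the expression is undefined for `n == 31`. In `(UInt32)1 << n` the shift itself is unsigned. A minimal sketch of the unsigned form, using `uint32_t` in place of the SDK's `UInt32`; the helper names and the limits passed to them are assumptions for illustration only.

```c
#include <assert.h>
#include <stdint.h>

/* Builds 2^log_size in an unsigned 32-bit type; valid for log_size 0..31. */
static uint32_t dict_size_limit(unsigned log_size)
{
    return (uint32_t)1 << log_size;
}

/* Hypothetical bound check in the spirit of the hunk; 30 mirrors "1 << 30". */
static int dict_size_ok(uint32_t dict_size, unsigned log_size_max)
{
    return dict_size <= dict_size_limit(log_size_max)
        && dict_size <= dict_size_limit(30);
}

/* Exercises the limit at the sign-bit boundary and on both sides of 2^30. */
static void dict_size_self_check(void)
{
    assert(dict_size_limit(31) == 2147483648u);
    assert(dict_size_ok((uint32_t)1 << 24, 27));
    assert(!dict_size_ok(((uint32_t)1 << 30) + 1, 31));
}
```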

View file

@ -8,8 +8,8 @@ Public domain */
#include "Alloc.h"
#include "LzmaLib.h"
static void *SzAlloc(void __attribute__((unused)) *p, size_t size) { return MyAlloc(size); }
static void SzFree(void __attribute__((unused)) *p, void *address) { MyFree(address); }
static void *SzAlloc(void *p, size_t size) { p = p; return MyAlloc(size); }
static void SzFree(void *p, void *address) { p = p; MyFree(address); }
static ISzAlloc g_Alloc = { SzAlloc, SzFree };
MY_STDAPI LzmaCompress(unsigned char *dest, size_t *destLen, const unsigned char *src, size_t srcLen,
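
Note on the `SzAlloc`/`SzFree` hunk above: both sides exist only to silence "unused parameter" warnings, one with `__attribute__((unused))` and one with the self-assignment `p = p;`. A third common idiom is a cast to `void`, sketched below. The `demo_alloc` struct is a hypothetical stand-in for `ISzAlloc`, and plain `malloc`/`free` stand in for `MyAlloc`/`MyFree`.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for the SDK's allocator callback pair. */
typedef struct {
    void *(*alloc)(void *ctx, size_t size);
    void (*release)(void *ctx, void *address);
} demo_alloc;

static void *demo_alloc_fn(void *ctx, size_t size)
{
    (void)ctx;  /* marks ctx as intentionally unused */
    return malloc(size);
}

static void demo_free_fn(void *ctx, void *address)
{
    (void)ctx;
    free(address);
}

static const demo_alloc g_demo_alloc = { demo_alloc_fn, demo_free_fn };

/* Allocates and frees one buffer through the callbacks; returns 1 on success. */
static int demo_roundtrip(size_t size)
{
    void *buf = g_demo_alloc.alloc(NULL, size);
    if (buf == NULL)
        return 0;
    g_demo_alloc.release(NULL, buf);
    return 1;
}

/* Small self-check so the callbacks are exercised at least once. */
static void demo_alloc_self_check(void)
{
    assert(demo_roundtrip(64) == 1);
}
```

The cast form needs no compiler extension, which is why it is often preferred in code that must build with compilers other than GCC and Clang.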

View file

@ -1,64 +0,0 @@
MAINTAINERCLEANFILES = Makefile.in
# Update -D
AM_CFLAGS = \
-D_REENTRANT \
-I@top_builddir@ \
-I@top_srcdir@
ASM_S =
ASM_7z =
C_S =
if USE_ASM
ASM_7z += 7zCrcOpt_asm
ASM_S += @abs_top_srcdir@/lzma/ASM/x86/$(ASM_7z).asm
C_S += 7zCrcT8.c
else
C_S += 7zCrc.c
endif
noinst_LTLIBRARIES = liblzma.la
# need separate variable for ASM so that make will compile later
# to prevent an error even if -j## is used.
liblzma_la_SOURCES = \
$(C_S) \
7zCrc.h \
LzmaDec.h \
LzmaEnc.h \
LzFind.c \
LzFind.h \
LzFindMt.c \
LzFindMt.h \
LzmaDec.c \
LzmaEnc.c \
LzmaLib.c \
LzmaLib.h \
Alloc.c \
Alloc.h \
Threads.c \
Threads.h \
Types.h \
LzHash.h \
windows.h \
basetyps.h \
MyWindows.h \
MyGuidDef.h
## hack to force asm compilation and to trick libtool with .lo file
if USE_ASM
liblzma_la_LIBADD = $(ASM_7z).lo
7ZIPASMLOFILE := \
\# $(ASM_7z).lo - a libtool object file\
\n\# Generated by libtool -- hack to allow asm linking\
\n\# Peter Hyman\
\npic_object='.libs/$(ASM_7z).o'\
\nnon_pic_object='$(ASM_7z).o'\
\n
$(ASM_7z).lo: $(ASM_S)
$(ASM_PROG) $(ASM_OPT) -o $(ASM_7z).o $(ASM_S)
mkdir -p .libs
cp $(ASM_7z).o .libs/
@printf "$(7ZIPASMLOFILE)" > $(ASM_7z).lo
endif

View file

@ -3,7 +3,7 @@
#ifndef GUID_DEFINED
#define GUID_DEFINED
#include "Types.h"
// #include "Types.h"
typedef int HRes; // from Types.h
typedef struct {

View file

@ -198,7 +198,7 @@ WRes Thread_Create(CThread *thread, THREAD_FUNC_RET_TYPE (THREAD_FUNC_CALL_TYPE
ret = pthread_attr_setdetachstate(&attr,PTHREAD_CREATE_JOINABLE);
if (ret) return ret;
ret = pthread_create(&thread->_tid, &attr, (void *)startAddress, parameter);
ret = pthread_create(&thread->_tid, &attr, (void * (*)(void *))startAddress, parameter);
/* ret2 = */ pthread_attr_destroy(&attr);
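
Note on the `pthread_create` hunk above: the two sides differ only in the cast applied to `startAddress`. The third parameter of `pthread_create` has type `void *(*)(void *)`, and ISO C does not define conversions between function pointers and `void *`, so the function-pointer cast is the more conventional spelling. The sketch below shows a start routine written with the native signature, which needs no cast at all. It calls the routine directly rather than creating a thread, and all names in it are illustrative.

```c
#include <assert.h>
#include <stddef.h>

/* Same shape as the start_routine parameter of pthread_create. */
typedef void *(*start_routine_fn)(void *);

/* A start routine with the native signature; no cast is required to pass it. */
static void *increment_worker(void *arg)
{
    int *counter = arg;
    *counter += 1;
    return arg;
}

/* Hypothetical stand-in for the thread launch: invokes fn(param) in place. */
static void *run_start_routine(start_routine_fn fn, void *param)
{
    return fn(param);
}

/* Runs the worker once and checks both its side effect and return value. */
static void start_routine_self_check(void)
{
    int counter = 41;
    void *ret = run_start_routine(increment_worker, &counter);
    assert(ret == &counter);
    assert(counter == 42);
}
```

A cast only changes the pointer's type, not the callee. If the wrapped function's real signature differs from `void *(*)(void *)`, calling it through the cast pointer is still not guaranteed to work.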

View file

@ -46,7 +46,6 @@ typedef int SRes;
typedef DWORD WRes;
#else
typedef int WRes;
typedef void * HANDLE;
#endif
#ifndef RINOK

View file

@ -1,14 +0,0 @@
SUBDIRS = C ASM/x86
MAINTAINERCLEANFILES = Makefile.in
lzmadocdir = @docdir@/lzma
lzmadoc_DATA = \
7zC.txt \
7zFormat.txt \
Methods.txt \
history.txt \
lzma.txt \
README \
README-Alloc
EXTRA_DIST = $(lzmadoc_DATA)

View file

@ -1,4 +1,4 @@
7-Zip method IDs (9.18)
7-Zip method IDs (4.61)
-----------------------
Each compression or crypto method in 7z has unique binary value (ID).
@ -24,22 +24,14 @@ List of defined IDs
-------------------
00 - Copy
03 - Delta
04 - x86 (BCJ)
05 - PPC (Big Endian)
06 - IA64
07 - ARM (little endian)
08 - ARM Thumb (little endian)
09 - SPARC
21 - LZMA2
02.. - Common
01 - Reserved
02 - Common
03 Swap
- 2 Swap2
- 4 Swap4
04 Delta (subject to change)
03.. - 7z
03 - 7z
01 - LZMA
01 - Version
@ -68,8 +60,11 @@ List of defined IDs
7F -
01 - experimental methods.
80 - reserved for independent developers
04.. - Misc
E0 - Random IDs
04 - Misc
00 - Reserved
01 - Zip
00 - Copy (not used). Use {00} instead
@ -77,13 +72,7 @@ List of defined IDs
06 - Implode
08 - Deflate
09 - Deflate64
10 - Imploding
12 - BZip2 (not used). Use {04 02 02} instead
14 - LZMA
60 - Jpeg
61 - WavPack
62 - PPMd
63 - wzAES
02 - BZip
02 - BZip2
03 - Rar
@ -102,7 +91,7 @@ List of defined IDs
02 - BZip2NSIS
06.. - Crypto
06 - Crypto
00 -
01 - AES
0x - AES-128
@ -129,7 +118,7 @@ List of defined IDs
07 - 7z
01 - AES-256 + SHA-256
07.. - Hash (subject to change)
07 - Hash (subject to change)
00 -
01 - CRC
02 - SHA-1

View file

@ -1,46 +1,6 @@
HISTORY of the LZMA SDK
-----------------------
9.18 beta 2010-11-02
-------------------------
- New small SFX module for installers (SfxSetup).
9.12 beta 2010-03-24
-------------------------
- The BUG in LZMA SDK 9.* was fixed: LZMA2 codec didn't work,
if more than 10 threads were used (or more than 20 threads in some modes).
9.11 beta 2010-03-15
-------------------------
- PPMd compression method support
9.09 2009-12-12
-------------------------
- The bug was fixed:
Utf16_To_Utf8 functions in UTFConvert.cpp and 7zMain.c
incorrectly converted surrogate characters (the code >= 0x10000) to UTF-8.
- Some bugs were fixed
9.06 2009-08-17
-------------------------
- Some changes in ANSI-C 7z Decoder interfaces.
9.04 2009-05-30
-------------------------
- LZMA2 compression method support
- xz format support
4.65 2009-02-03
-------------------------
- Some minor fixes
4.63 2008-12-31
-------------------------
- Some minor fixes

View file

@ -1,4 +1,4 @@
LZMA SDK 9.20
LZMA SDK 4.63
-------------
LZMA SDK provides the documentation, samples, header files, libraries,
@ -20,10 +20,6 @@ LICENSE
LZMA SDK is written and placed in the public domain by Igor Pavlov.
Some code in LZMA SDK is based on public domain code from another developers:
1) PPMd var.H (2001): Dmitry Shkarin
2) SHA-256: Wei Dai (Crypto++ library)
LZMA SDK Contents
-----------------
@ -37,7 +33,7 @@ LZMA SDK includes:
UNIX/Linux version
------------------
To compile C++ version of file->file LZMA encoding, go to directory
CPP/7zip/Bundles/LzmaCon
C++/7zip/Compress/LZMA_Alone
and call make to recompile it:
make -f makefile.gcc clean all
@ -53,7 +49,6 @@ lzma.txt - LZMA SDK description (this file)
7zC.txt - 7z ANSI-C Decoder description
methods.txt - Compression method IDs for .7z
lzma.exe - Compiled file->file LZMA encoder/decoder for Windows
7zr.exe - 7-Zip with 7z/lzma/xz support.
history.txt - history of the LZMA SDK
@ -71,7 +66,7 @@ C/ - C files
LzmaEnc.* - LZMA encoding
LzmaLib.* - LZMA Library for DLL calling
Types.h - Basic types for another .c files
Threads.* - The code for multithreading.
Threads.* - The code for multithreading.
LzmaLib - LZMA Library (.DLL for Windows)
@ -91,6 +86,12 @@ CPP/ -- CPP files
Compress - files related to compression/decompression
Copy - Copy coder
RangeCoder - Range Coder (special code of compression/decompression)
LZMA - LZMA compression/decompression on C++
LZMA_Alone - file->file LZMA compression/decompression
Branch - Filters for x86, IA-64, ARM, ARM-Thumb, PowerPC and SPARC code
Archive - files related to archiving
Common - common files for archive handling
@ -99,7 +100,6 @@ CPP/ -- CPP files
Bundles - Modules that are bundles of other modules
Alone7z - 7zr.exe: Standalone version of 7z.exe that supports only 7z/LZMA/BCJ/BCJ2
LzmaCon - lzma.exe: LZMA compression/decompression
Format7zR - 7zr.dll: Reduced version of 7za.dll: extracting/compressing to 7z/LZMA/BCJ/BCJ2
Format7zExtractR - 7zxr.dll: Reduced version of 7zxa.dll: extracting from 7z/LZMA/BCJ/BCJ2.
@ -369,8 +369,8 @@ Interface:
propData - LZMA properties (5 bytes)
propSize - size of propData buffer (5 bytes)
finishMode - It has meaning only if the decoding reaches output limit (*destLen).
LZMA_FINISH_ANY - Decode just destLen bytes.
LZMA_FINISH_END - Stream must be finished after (*destLen).
LZMA_FINISH_ANY - Decode just destLen bytes.
LZMA_FINISH_END - Stream must be finished after (*destLen).
You can use LZMA_FINISH_END, when you know that
current output buffer covers last bytes of stream.
alloc - Memory allocator.
@ -431,7 +431,7 @@ Memory Requirements:
{
...
int res = LzmaDec_DecodeToBuf(CLzmaDec *p, Byte *dest, SizeT *destLen,
const Byte *src, SizeT *srcLen, ELzmaFinishMode finishMode);
const Byte *src, SizeT *srcLen, ELzmaFinishMode finishMode);
...
}
@ -527,8 +527,7 @@ static ISzAlloc g_Alloc = { SzAlloc, SzFree };
LzmaEnc_Destroy(enc, &g_Alloc, &g_Alloc);
If callback function return some error code, LzmaEnc_Encode also returns that code
or it can return the code like SZ_ERROR_READ, SZ_ERROR_WRITE or SZ_ERROR_PROGRESS.
If callback function return some error code, LzmaEnc_Encode also returns that code.
Single-call RAM->RAM Compression
@ -550,8 +549,8 @@ Return code:
Defines
-------
LZMA Defines
------------
_LZMA_SIZE_OPT - Enable some optimizations in LZMA Decoder to get smaller executable code.
@ -563,9 +562,6 @@ _LZMA_UINT32_IS_ULONG - Define it if int is 16-bit on your compiler and long is
_LZMA_NO_SYSTEM_SIZE_T - Define it if you don't want to use size_t type.
_7ZIP_PPMD_SUPPPORT - Define it if you don't want to support PPMD method in ANSI-C .7z decoder.
C++ LZMA Encoder/Decoder
~~~~~~~~~~~~~~~~~~~~~~~~
C++ LZMA code use COM-like interfaces. So if you want to use it,

0
m4/.gitignore vendored
View file

View file

@ -1,43 +0,0 @@
dnl Copyright (C) 2004-2008 Kim Woelders
dnl Copyright (C) 2008 Vincent Torri <vtorri at univ-evry dot fr>
dnl That code is public domain and can be freely used or copied.
dnl Originally snatched from somewhere...
dnl Macro for checking if the compiler supports __attribute__
dnl Usage: AC_C___ATTRIBUTE__
dnl call AC_DEFINE for HAVE___ATTRIBUTE__ and __UNUSED__
dnl if the compiler supports __attribute__, HAVE___ATTRIBUTE__ is
dnl defined to 1 and __UNUSED__ is defined to __attribute__((unused))
dnl otherwise, HAVE___ATTRIBUTE__ is not defined and __UNUSED__ is
dnl defined to nothing.
AC_DEFUN([AC_C___ATTRIBUTE__],
[
AC_MSG_CHECKING([for __attribute__])
AC_CACHE_VAL([ac_cv___attribute__],
[AC_COMPILE_IFELSE([AC_LANG_PROGRAM([[
#include <stdlib.h>
int func(int x);
int foo(int x __attribute__ ((unused)))
{
exit(1);
}
]], [[]])],[ac_cv___attribute__="yes"],[ac_cv___attribute__="no"
])])
AC_MSG_RESULT($ac_cv___attribute__)
if test "x${ac_cv___attribute__}" = "xyes" ; then
AC_DEFINE([HAVE___ATTRIBUTE__], [1], [Define to 1 if your compiler has __attribute__])
AC_DEFINE([__UNUSED__], [__attribute__((unused))], [Macro declaring a function argument to be unused])
else
AC_DEFINE([__UNUSED__], [], [Macro declaring a function argument to be unused])
fi
])
dnl End of ac_attribute.m4
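
Note on `ac_attribute.m4` above: the macro defines `__UNUSED__` as `__attribute__((unused))` when a test program using the attribute compiles, and as nothing otherwise. The sketch below shows the same fallback pattern in a C source file. It swaps in a preprocessor test on `__GNUC__`/`__clang__` for the configure-time compile check, and uses the hypothetical name `DEMO_UNUSED` rather than `__UNUSED__`.

```c
#include <assert.h>

/* Assumption: a preprocessor test stands in for the configure-time check. */
#if defined(__GNUC__) || defined(__clang__)
#define DEMO_UNUSED __attribute__((unused))
#else
#define DEMO_UNUSED
#endif

/* The context parameter is part of the signature but deliberately ignored. */
static int demo_add_one(int value, void *context DEMO_UNUSED)
{
    return value + 1;
}

/* Confirms the annotated function still behaves like an ordinary one. */
static void demo_unused_self_check(void)
{
    assert(demo_add_one(41, 0) == 42);
}
```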

View file

@ -1,95 +0,0 @@
dnl Copyright (C) 2008 Vincent Torri <vtorri at univ-evry dot fr>
dnl That code is public domain and can be freely used or copied.
dnl Macro that check if doxygen is available or not.
dnl EFL_CHECK_DOXYGEN([ACTION-IF-FOUND [, ACTION-IF-NOT-FOUND]])
dnl Test for the doxygen program
dnl Defines efl_doxygen
dnl Defines the automake conditionnal EFL_BUILD_DOC
dnl
AC_DEFUN([EFL_CHECK_DOXYGEN],
[
dnl
dnl Disable the build of the documentation
dnl
AC_ARG_ENABLE([doc],
[AS_HELP_STRING([--disable-doc],[Disable documentation build @<:@default=enabled@:>@])],
[
if test "x${enableval}" = "xyes" ; then
efl_enable_doc="yes"
else
efl_enable_doc="no"
fi
],
[efl_enable_doc="yes"])
AC_MSG_CHECKING([whether to build documentation])
AC_MSG_RESULT([${efl_enable_doc}])
if test "x${efl_enable_doc}" = "xyes" ; then
dnl
dnl Specify the file name, without path
dnl
efl_doxygen="doxygen"
AC_ARG_WITH([doxygen],
[AS_HELP_STRING([--with-doxygen=FILE],[doxygen program to use @<:@default=doxygen@:>@])],
dnl
dnl Check the given doxygen program.
dnl
[efl_doxygen=${withval}
AC_CHECK_PROG([efl_have_doxygen],
[${efl_doxygen}],
[yes],
[no])
if test "x${efl_have_doxygen}" = "xno" ; then
echo "WARNING:"
echo "The doxygen program you specified:"
echo "${efl_doxygen}"
echo "was not found. Please check the path and make sure "
echo "the program exists and is executable."
AC_MSG_WARN([no doxygen detected. Documentation will not be built])
fi
],
[AC_CHECK_PROG([efl_have_doxygen],
[${efl_doxygen}],
[yes],
[no])
if test "x${efl_have_doxygen}" = "xno" ; then
echo "WARNING:"
echo "The doxygen program was not found in your execute path."
echo "You may have doxygen installed somewhere not covered by your path."
echo ""
echo "If this is the case make sure you have the packages installed, AND"
echo "that the doxygen program is in your execute path (see your"
echo "shell manual page on setting the \$PATH environment variable), OR"
echo "alternatively, specify the program to use with --with-doxygen."
AC_MSG_WARN([no doxygen detected. Documentation will not be built])
fi
])
fi
dnl
dnl Substitution
dnl
AC_SUBST([efl_doxygen])
if ! test "x${efl_have_doxygen}" = "xyes" ; then
efl_enable_doc="no"
fi
AM_CONDITIONAL(EFL_BUILD_DOC, test "x${efl_enable_doc}" = "xyes")
if test "x${efl_enable_doc}" = "xyes" ; then
m4_default([$1], [:])
else
m4_default([$2], [:])
fi
])
dnl End of doxygen.m4

1267
main.c

File diff suppressed because it is too large

39
man/Makefile Executable file
View file

@ -0,0 +1,39 @@
#!/usr/bin/make -f
#
# Copyright information
#
# Copyright (C) 2010 Con Kolivas
# Copyright (C) 2010 Jari Aalto
#
# License
#
# This program is free software; you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
# the Free Software Foundation; either version 2 of the License, or
# (at your option) any later version.
#
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
# GNU General Public License for more details.
#
# You should have received a copy of the GNU General Public License
# along with this program. If not, see <http://www.gnu.org/licenses/>.
PODCENTER = Lrzip
all: lrunzip.1 lrztar.1 lrzuntar.1
lrunzip.1: lrunzip.1.pod
podchecker $<
$(MAKE) -f pod2man.mk PACKAGE=lrunzip PODCENTER=$(PODCENTER) makeman
lrztar.1: lrztar.1.pod
podchecker $<
$(MAKE) -f pod2man.mk PACKAGE=lrztar PODCENTER=$(PODCENTER) makeman
lrzuntar.1: lrzuntar.1.pod
podchecker $<
$(MAKE) -f pod2man.mk PACKAGE=lrzuntar PODCENTER=$(PODCENTER) makeman
# End of file

View file

@ -1,13 +0,0 @@
MAINTAINERCLEANFILES = Makefile.in lrunzip.1 lrztar.1 lrzuntar.1 lrz.1
man1_MANS = lrzip.1 lrunzip.1 lrzcat.1 lrztar.1 lrzuntar.1 lrz.1
man5_MANS = lrzip.conf.5
BUILT_SOURCES = lrunzip.1 lrzcat.1 lrztar.1 lrzuntar.1 lrz.1
CLEANFILES = $(BUILT_SOURCES)
EXTRA_DIST = lrzip.1 lrunzip.1.pod lrzcat.1.pod lrztar.1.pod lrzuntar.1.pod lrz.1.pod $(man5_MANS)
SUFFIXES = .1 .1.pod
.1.pod.1:
pod2man $< $@

View file

@ -1,6 +1,6 @@
# Copyright
#
# Copyright (C) 2010-2016 Con Kolivas
# Copyright (C) 2010 Con Kolivas
# Copyright (C) 2009-2009 Jari Aalto
#
# License
@ -65,24 +65,21 @@ None.
=head1 SEE ALSO
lrzip.conf(5),
lrzip(1),
lrzcat(1),
lrztar(1),
lrzuntar(1),
lrz(1),
bzip2(1),
gzip(1),
lzop(1),
lrzip(1),
rzip(1),
zip(1)
lrztar(1),
lrzip.conf(5)
=head1 AUTHORS
Program was written by Con Kolivas.
This manual page was written by Jari Aalto <jari.aalto@cante.net> (but
may be used by others). Released under license GNU GPL version 2 or (at
may be used by others). Released under license GNU GPL version 2or (at
your option) any later version. For more information about license,
visit <http://www.gnu.org/copyleft/gpl.html>.

View file

@ -1,346 +0,0 @@
#!/usr/bin/perl -w
# Copyright
#
# Copyright (C) 2021 Con Kolivas
#
# License
#
# This program is free software; you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
# the Free Software Foundation; either version 2 of the License, or
# (at your option) any later version.
#
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
# GNU General Public License for more details.
#
# You should have received a copy of the GNU General Public License
# along with this program. If not, see <http://www.gnu.org/licenses/>.
#
# Description
#
# To learn what TOP LEVEL section to use in manual pages,
# see POSIX/Susv standard and "Utility Description Defaults" at
# http://www.opengroup.org/onlinepubs/009695399/utilities/xcu_chap01.html#tag_01_11
#
# This is manual page in Perl POD format. Read more at
# http://perldoc.perl.org/perlpod.html or run command:
#
# perldoc perlpod | less
#
# To check the syntax:
#
# podchecker *.pod
#
# Create manual page with command:
#
# pod2man PAGE.N.pod > PAGE.N
=pod
=encoding utf8
=head1 NAME
lrz - gzip compatible command line variant of lrzip
=head1 SYNOPSIS
B<lrz> [options] I<file>
=head1 DESCRIPTION
B<lrz> is identical to the B<lrzip> application, however, its command
line options and behaviour are made to be as compatible with B<gzip>
as possible.
=head1 OPTIONS
=head2 General options
=over 9
=item B<--stdout>
=item B<-c>
Output to STDOUT.
=item B<--check>
=item B<-C>
Check integrity of file written on decompression.
=item B<--decompress>
=item B<-d>
Decompress.
=item B<--encrypt>[=I<password>]
=item B<-e>
Password protect sha512/aes128 encryption on compression.
=item B<--help>
=item B<-h>
=item B<-?>
Show help.
=item B<--hash>
=item B<-H>
Display md5 hash integrity information.
=item B<--info>
=item B<-i>
Show compressed file information.
=item B<--license>
=item B<-L>
Display software version and license.
=item B<--progress>
=item B<-P>
Show compression progress.
=item B<--recursive>
=item B<-r>
Operate recursively on directories.
=item B<--test>
=item B<-t>
Test compressed file integrity.
=item B<--verbose>
=item B<-v[vv]>
Increase verbosity.
=item B<--version>
=item B<-V>
Show version.
=back
=head2 Options affecting output
=over 9
=item B<--force>
=item B<-f>
Force overwrite of any existing files.
=item B<--keep>
=item B<-k>
Don't delete source files on de/compression.
=item B<--keep-broken>
=item B<-K>
Keep broken or damaged output files.
=item B<--outfile> I<name>
=item B<-o> I<name>
Specify the output file name and/or path.
=item B<--outdir> I<dir>
=item B<-O> I<dir>
Specify the output directory when B<-o> is not used.
=item B<--suffix> I<suffix>
=item B<-S> I<suffix>
Specify compressed suffix (default '.lrz').
=back
=head2 Options affecting compression
=over 9
=item B<--bzip2>
=item B<-b>
Bzip2 compression.
=item B<--gzip>
=item B<-g>
Gzip compression using zlib.
=item B<--lzo>
=item B<-l>
Lzo compression (ultra fast).
=item B<--lzma>
Lzma compression (default).
=item B<--no-compress>
=item B<-n>
No backend compression - prepare for other compressor.
=item B<--zpaq>
=item B<-z>
Zpaq compression (best, extreme compression, extremely slow).
=back
=head2 Low level options
=over 9
=item B<-1> .. B<-9>
=item B<--level> I<level>
=item B<-L> I<level>
Set lzma/bzip2/gzip compression level (1-9, default 7).
=item B<--fast>
Alias for B<-1>.
=item B<--best>
Alias for B<-9>.
=item B<--nice-level> I<value>
=item B<-N> I<value>
Set nice value to I<value> (default 0).
=item B<--threads> I<value>
=item B<-P> I<value>
Set processor count to override number of threads.
=item B<--maxram> I<size>
=item B<-m> I<size>
Set maximum available ram as I<size> * 100 MB.
Overrides detected amount of available ram.
=item B<--threshold>
=item B<-T>
Disable LZ4 compressibility testing.
=item B<--unlimited>
=item B<-U>
Use unlimited window size beyond ramsize (potentially much slower).
=item B<--window> I<size>
=item B<-w> I<size>
Set maximum compression window as I<size> * 100 MB.
Default chosen by heuristic dependent on ram and chosen compression.
=back
See also lrzip(1)
=head1 ENVIRONMENT
lrz uses the same environment and configuration files as lrzip(1)
=head1 FILES
See lrzip(1)
=head1 SEE ALSO
lrzip.conf(5),
lrzip(1),
lrunzip(1),
lrztar(1),
lrzuntar(1),
bzip2(1),
gzip(1),
lzop(1),
rzip(1),
zip(1)
=head1 AUTHORS
This manual page was written by Con Kolivas <kernel@kolivas.org> (but
may be used by others). Released under license GNU GPL version 2 or (at
your option) any later version. For more information about license,
visit <http://www.gnu.org/copyleft/gpl.html>.
=cut

View file

@ -1,86 +0,0 @@
# Copyright
#
# Copyright (C) 2011-2016 Con Kolivas
#
# License
#
# This program is free software; you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
# the Free Software Foundation; either version 2 of the License, or
# (at your option) any later version.
#
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
# GNU General Public License for more details.
#
# You should have received a copy of the GNU General Public License
# along with this program. If not, see <http://www.gnu.org/licenses/>.
#
# Description
#
# To learn what TOP LEVEL section to use in manual pages,
# see POSIX/Susv standard and "Utility Description Defaults" at
# http://www.opengroup.org/onlinepubs/009695399/utilities/xcu_chap01.html#tag_01_11
#
# This is manual page in Perl POD format. Read more at
# http://perldoc.perl.org/perlpod.html or run command:
#
# perldoc perlpod | less
#
# To check the syntax:
#
# podchecker *.pod
#
# Create manual page with command:
#
# pod2man PAGE.N.pod > PAGE.N
=pod
=head1 NAME
lrzcat - Uncompress LRZ files to STDOUT
=head1 SYNOPSIS
lrzcat [options] FILE [... FILE]
=head1 DESCRIPTION
lrzcat is identical to C<lrzip -d -o -> used to decompress files to STDOUT.
=head1 OPTIONS
See lrzip(1).
=head1 ENVIRONMENT
None.
=head1 FILES
None.
=head1 SEE ALSO
lrzip.conf(5),
lrzip(1),
lrunzip(1),
lrztar(1),
lrzuntar(1),
lrz(1),
bzip2(1),
gzip(1),
lzop(1),
rzip(1),
zip(1)
=head1 AUTHORS
This manual page was written by Con Kolivas <kernel@kolivas.org> (but
may be used by others). Released under license GNU GPL version 2 or (at
your option) any later version. For more information about license,
visit <http://www.gnu.org/copyleft/gpl.html>.
=cut

View file

@ -1,4 +1,4 @@
.TH "lrzip" "1" "February 2022" "" ""
.TH "lrzip" "1" "November 2010" "" ""
.SH "NAME"
lrzip \- a large-file compression program
.SH "SYNOPSIS"
@ -9,16 +9,12 @@ lrzip \-d [OPTIONS] <file>
.br
lrunzip [OPTIONS] <file>
.br
lrzcat [OPTIONS] <file>
.br
lrztar [lrzip options] <directory>
.br
lrztar \-d [lrzip options] <directory>
.br
lrzuntar [lrzip options] <directory>
.br
lrz [lrz options] <directory>
.br
LRZIP=NOCONFIG [lrzip|lrunzip] [OPTIONS] <file>
.PP
.SH "DESCRIPTION"
@ -35,175 +31,94 @@ Here is a summary of the options to lrzip\&.
.nf
General options:
\-c, \-\-check check integrity of file written on decompression
\-d, \-\-decompress decompress
\-e, \-\-encrypt[=password] password protected sha512/aes128 encryption on compression
\-h, \-?, \-\-help show help
\-H, \-\-hash display md5 hash integrity information
\-i, \-\-info show compressed file information
\-q, \-\-quiet don't show compression progress
\-Q, \-\-very-quiet don't show any output
\-r, \-\-recursive operate recursively on directories
\-t, \-\-test test compressed file integrity
\-v[v], \-\-verbose Increase verbosity
\-V, \-\-version show version
Options affecting output:
\-D, \-\-delete delete existing files
\-f, \-\-force force overwrite of any existing files
\-k, \-\-keep-broken keep broken or damaged output files
\-o, \-\-outfile filename specify the output file name and/or path
\-O, \-\-outdir directory specify the output directory when -o is not used
\-S, \-\-suffix suffix specify compressed suffix (default '.lrz')
Options affecting compression:
\-b, \-\-bzip2 bzip2 compression
\-g, \-\-gzip gzip compression using zlib
\-l, \-\-lzo lzo compression (ultra fast)
\-n, \-\-no-compress no backend compression - prepare for other compressor
\-z, \-\-zpaq zpaq compression (best, extreme compression, extremely slow)
Low level options:
\-L, \-\-level level set lzma/bzip2/gzip compression level (1-9, default 7)
\-N, \-\-nice-level value Set nice value to value (default 19)
\-p, \-\-threads value Set processor count to override number of threads
\-m, \-\-maxram size Set maximum available ram in hundreds of MB
overrides detected amount of available ram
\-T, \-\-threshold Disable LZ4 compressibility testing
\-U, \-\-unlimited Use unlimited window size beyond ramsize (potentially much slower)
\-w, \-\-window size maximum compression window in hundreds of MB
default chosen by heuristic dependent on ram and chosen compression
LRZIP=NOCONFIG environment variable setting can be used to bypass lrzip.conf.
TMP environment variable will be used for storage of temporary files when needed.
TMPDIR may also be stored in lrzip.conf file.
If no filenames or "-" is specified, stdin/out will be used.
\-w size compression window in hundreds of MB
default chosen by heuristic dependent on ram and chosen compression
\-d decompress
\-o filename specify the output file name and/or path
\-O directory specify the output directory when \-o is not used
\-S suffix specify compressed suffix (default '.lrz')
\-f force overwrite of any existing files
\-D delete existing files
\-P don't set permissions on output file. It may leave it world-readable
\-q don't show compression progress
\-L level set rzip/lzma/bzip2/gzip compression level (1\-9, default 9)
\-n no backend compression. Prepare for other compressor
\-l lzo compression (ultra fast)
\-b bzip2 compression
\-g gzip compression using zlib
\-z zpaq compression (best, extreme compression, extremely slow)
\-M Maximum window (all available ram)
\-U Use unlimited window size beyond ramsize (potentially much slower)
\-T value Compression threshold with LZO test. (0 (nil) - 10 (high), default 1)
\-N value Set nice value to value (default 19)
\-v[v] Increase verbosity
\-V show version
\-t test compressed file integrity
\-i show compressed file information
If no filenames or "-" is specified, stdin/out will be used (stdin/out is
inefficient with lrzip and not recommended usage).
.fi
.PP
.SH "OPTIONS"
.PP
.SH "General options"
.IP "\fB-c\fP"
This option enables integrity checking of the file written to disk on
decompression. All decompression is tested internally in lrzip with either
crc32 or md5 hash checking depending on the version of the archive already.
However the file written to disk may be corrupted for other reasons to do with
other userspace problems such as faulty library versions, drivers, hardware
failure and so on. Enabling this option will make lrzip perform an md5 hash
check on the file that's written to disk. When the archive has the md5 value
stored in it, it is compared to this. Otherwise it is compared to the value
calculated during decompression. This offers an extra guarantee that the file
written is the same as the original archived.
.IP
.IP "\fB-d\fP"
Decompress. If this option is not used then lrzip looks at
the name used to launch the program. If it contains the string
"lrunzip" then the \-d option is automatically set. If it contains the string
"lrzcat" then the \-d \-o \- options are automatically set.
.IP
.IP "\fB-e\fP"
.IP "\fB\-\-encrypt\fP[=\fIpassword\fP]"
Encrypt. This option enables high grade password encryption using a combination
of a password hashed multiple times with sha512, a random salt and aes128 CBC
encryption. Passwords up to 500 characters long are supported, and the
encryption mechanism used virtually guarantees that two archives created from
the same file with the same password will never be identical. Furthermore, the
number of password hashing rounds is increased according to the date the file
is encrypted, increasing the number of CPU cycles required for each password
attempt in accordance with Moore's law, so that the difficulty of a brute force
attack keeps pace with the power of modern computers.
.IP
.IP "\fB-h|-?\fP"
.IP "\fB-h\fP"
Print an options summary page
.IP
.IP "\fB-H\fP"
This shows the md5 hash value calculated on compressing or decompressing an
lrzip archive. By default all compression has the md5 value calculated and
stored in all archives since version 0.560. On decompression, when an md5
value has been found, it will be calculated and used for integrity checking.
If the md5 value is not stored in the archive, it will not be calculated unless
explicitly specified with this option, or check integrity (see below) has been
requested.
.IP
.IP "\fB-i\fP"
This shows information about a compressed file. It shows the compressed size,
the decompressed size, the compression ratio, what compression was used and
what hash checking will be used for internal integrity checking.
Note that the compression mode is detected from the first block only and
it will show no compression used if the first block was incompressible, even
if later blocks were compressible. If verbose options \-v or \-vv are added,
a breakdown of all the internal blocks and progressively more information
pertaining to them will also be shown.
.IP
.IP "\fB-q\fP"
If this option is specified then lrzip will not show the
percentage progress while compressing. Note that compression happens in
bursts with lzma compression which is the default compression. This means
that it will progress very rapidly for short periods and then stop for
long periods.
.IP
.IP "\fB-Q\fP"
If this option is specified then lrzip will not show any output to the console
except for error messages.
.IP
.IP "\fB-r\fP"
If this option is specified, lrzip will recursively enter the directories
specified, compressing or decompressing every file individually in the same
directory. Note for better compression it is recommended to instead combine
files in a tar file rather than compress them separately, either manually
or with the lrztar helper.
.IP
.IP "\fB-t\fP"
This tests the compressed file integrity. It does this by decompressing it
to a temporary file and then deleting it.
.IP
.IP "\fB-v[v]\fP"
Increases verbosity. \-vv will print more messages than \-v.
.IP
.IP "\fB-V\fP"
Print the lrzip version number
.IP
.PP
.SH "Options affecting output"
.PP
.IP "\fB-D\fP"
If this option is specified then lrzip will delete the
source file after successful compression or decompression. When this
option is not specified then the source files are not deleted.
.IP "\fB-v[v]\fP"
Increases verbosity. \-vv will print more messages than \-v.
.IP
.IP "\fB-f\fP"
If this option is not specified (the default) then lrzip will not
overwrite any existing files. If you set this option then lrzip will
silently overwrite any files as needed.
.IP "\fB-w n\fP"
Set the maximum allowable compression window size to n in hundreds of megabytes.
This is the amount of memory lrzip will search during its first stage of
pre-compression and is the main thing that will determine how much benefit lrzip
will provide over ordinary compression with the 2nd stage algorithm. If not set
(recommended), the value chosen will be determined by an internal heuristic in
lrzip which uses the most memory that is reasonable, without any hard upper
limit. It is limited to 2GB on 32bit machines. lrzip will always reduce the
window size to the biggest it can be without running out of memory.
.IP
.IP "\fB-k\fP"
This option will keep broken or damaged files instead of deleting them.
When compression or decompression is interrupted either by user or error, or
a file decompressed fails an integrity check, it is normally deleted by LRZIP.
.IP "\fB-L 1\&.\&.9\fP"
Set the compression level from 1 to 9. The default is to use level 9, which
gives good all round compression. The compression level is also strongly related
to how much memory lrzip uses. See the \-w option for details.
.IP
.IP "\fB-o\fP"
Set the output file name. If this option is not set then
the output file name is chosen based on the input name and the
suffix. The \-o option cannot be used if more than one file name is
specified on the command line.
.IP "\fB-M \fP"
Maximum window size\&. If this option is set, then lrzip tries to load the
entire file into ram as one big compression window, and will reduce the size of
the window until it does fit. This may induce a hefty swap load on your machine
but can also give dramatic size advantages when your file is the size of your
ram or larger.
.IP
.IP "\fB-O\fP"
Set the output directory for the default filename. This option
cannot be combined with \-o.
.IP "\fB-U \fP"
Unlimited window size\&. If this option is set, and the file being compressed
does not fit into the available ram, lrzip will use a moving second buffer as a
"sliding mmap" which emulates having infinite ram. This will provide the most
possible compression in the first rzip stage which can improve the compression
of ultra large files when they're bigger than the available ram. However it runs
progressively slower the larger the difference between ram and the file size so
it is worth trying the -M option first to see if the whole file can be accessed
in one pass, and then if not, it should be used together with the -M option (if
at all).
.IP
.IP "\fB-S\fP"
Set the compression suffix. The default is '.lrz'.
.IP "\fB-T 0\&.\&.10\fP"
Sets the LZO compression threshold when testing a data chunk when slower
compression is used. The threshold level can be from 0 to 10.
This option is used to speed up compression by avoiding doing the slow
compression pass. The reasoning is that if a chunk is completely incompressible
by LZO then it will also be incompressible by the slower compressors, so
skipping it saves time.
The default is 1.
.IP
.PP
.SH "Options affecting compression"
.PP
.IP "\fB-b\fP"
Bzip2 compression. Uses bzip2 compression for the 2nd stage, much like
the original rzip does.
.IP "\fB-g\fP"
Gzip compression. Uses gzip compression for the 2nd stage. Uses libz compress
and uncompress functions.
.IP "\fB-d\fP"
Decompress. If this option is not used then lrzip looks at
the name used to launch the program. If it contains the string
"lrunzip" then the \-d option is automatically set.
.IP
.IP "\fB-l\fP"
LZO Compression. If this option is set then lrzip will use the ultra
@ -218,61 +133,68 @@ not compress any faster than LZO compression, it produces a smaller file
that then responds better to further compression (by eg another application),
also reducing the compression time substantially.
.IP
.IP "\fB-b\fP"
Bzip2 compression. Uses bzip2 compression for the 2nd stage, much like
the original rzip does.
.IP "\fB-g\fP"
Gzip compression. Uses gzip compression for the 2nd stage. Uses libz compress
and uncompress functions.
.IP
.IP "\fB-z\fP"
ZPAQ compression. Uses ZPAQ compression which is from the PAQ family of
compressors known for having some of the highest compression ratios possible
but at the cost of being extremely slow on both compress and decompress (4x
slower than lzma which is the default).
.IP
.PP
.SH "Low level options"
.PP
.IP "\fB-L 1\&.\&.9\fP"
Set the compression level from 1 to 9. The default is to use level 7, which
gives good all round compression. The compression level is also strongly related
to how much memory lrzip uses. See the \-w option for details.
.IP "\fB-o\fP"
Set the output file name. If this option is not set then
the output file name is chosen based on the input name and the
suffix. The \-o option cannot be used if more than one file name is
specified on the command line.
.IP
.IP "\fB-O\fP"
Set the output directory for the default filename. This option
cannot be combined with \-o.
.IP
.IP "\fB-S\fP"
Set the compression suffix. The default is '.lrz'.
.IP
.IP "\fB-f\fP"
If this option is not specified (the default) then lrzip will not
overwrite any existing files. If you set this option then lrzip will
silently overwrite any files as needed.
.IP
.IP "\fB-D\fP"
If this option is specified then lrzip will delete the
source file after successful compression or decompression. When this
option is not specified then the source files are not deleted.
.IP
.IP "\fB-P\fP"
If this option is specified then lrzip will not try to set the file
permissions on writing the file. This helps when writing to a brain
damaged filesystem like fat32 on windows.
.IP
.IP "\fB-q\fP"
If this option is specified then lrzip will not show the
percentage progress while compressing. Note that compression happens in
bursts with lzma compression which is the default compression. This means
that it will progress very rapidly for short periods and then stop for
long periods.
.IP "\fB-N value\fP"
The default nice value is 19. This option can be used to set the priority
scheduling for the lrzip backup or decompression. Valid nice values are
from \-20 to 19. Note this does NOT speed up or slow down compression.
.IP
.IP "\fB-p value\fP"
Set the processor count used to determine the number of threads to run.
Normally lrzip will scale according to the number of CPUs it detects. Using
this will override the value in case you wish to use fewer CPUs to either
decrease the load on your machine, or to improve compression. Setting it to
1 will maximise compression but will not attempt to use more than one CPU.
.IP "\fB-t\fP"
This tests the compressed file integrity. It does this by decompressing it
to a temporary file and then deleting it.
.IP
.IP "\fB-T\fP"
Disables the LZ4 compressibility threshold testing when a slower compression
back-end is used. LZ4 testing is normally performed for the slower back-end
compression of LZMA and ZPAQ. The reasoning is that if it is completely
incompressible by LZ4 then it will also be incompressible by them. Thus if a
block fails to be compressed by the very fast LZ4, lrzip will not attempt to
compress that block with the slower compressor, thereby saving time. If this
option is enabled, it will bypass the LZ4 testing and attempt to compress each
block regardless.
.IP
.IP "\fB-U \fP"
Unlimited window size\&. If this option is set, and the file being compressed
does not fit into the available ram, lrzip will use a moving second buffer as a
"sliding mmap" which emulates having infinite ram. This will provide the most
possible compression in the first rzip stage which can improve the compression
of ultra large files when they're bigger than the available ram. However it runs
progressively slower the larger the difference between ram and the file size,
so is best reserved for when the smallest possible size is desired on a very
large file, and the time taken is not important.
.IP
.IP "\fB-w n\fP"
Set the maximum allowable compression window size to n in hundreds of megabytes.
This is the amount of memory lrzip will search during its first stage of
pre-compression and is the main thing that will determine how much benefit lrzip
will provide over ordinary compression with the 2nd stage algorithm. If not set
(recommended), the value chosen will be determined by an internal heuristic in
lrzip which uses the most memory that is reasonable, without any hard upper
limit. It is limited to 2GB on 32bit machines. lrzip will always reduce the
window size to the biggest it can be without running out of memory.
.IP "\fB-i\fP"
This shows information about a compressed file. It shows the compressed size,
the decompressed size, the compression ratio and what compression was used.
Note that the compression mode is detected from the first block only and
it will show no compression used if the first block was incompressible, even
if later blocks were compressible.
.IP
.PP
.SH "INSTALLATION"
@ -285,7 +207,7 @@ LRZIP operates in two stages. The first stage finds and encodes large chunks of
duplicated data over potentially very long distances in the input file. The
second stage is to use a compression algorithm to compress the output of the
first stage. The compression algorithm can be chosen to be optimised for extreme
size (zpaq), size (lzma - default), speed (lzo), legacy (bzip2 or gzip) or can
size (zpaq), size (lzma - default), speed (lzo), legacy (bzip2) or (gzip) or can
be omitted entirely doing only the first stage. A one stage only compressed file
can almost always improve both the compression size and speed done by a
subsequent compression program.
@ -338,33 +260,23 @@ with increasing ram sizes.
.PP
.SH "BUGS"
.PP
Nil known.
Nil known. Probably lots.
.PP
.SH "SEE ALSO"
lrzip.conf(5),
lrunzip(1),
lrzcat(1),
lrztar(1),
lrzuntar(1),
lrz(1),
bzip2(1),
gzip(1),
lzop(1),
lrzip(1),
rzip(1),
zip(1)
.PP
.SH "DIAGNOSTICS"
.PP
Exit status is normally 0; if an error occurs, exit status is 1; for usage
errors it is 2.
lrztar(1),
lrzuntar(1)
.PP
.SH "AUTHOR and CREDITS"
.br
lrzip is being extensively bastardised from rzip by Con Kolivas.
.br
rzip was written by Andrew Tridgell.
.br
lzma was written by Igor Pavlov.
@ -373,12 +285,13 @@ lzo was written by Markus Oberhumer.
.br
zpaq was written by Matt Mahoney.
.br
lrzip was bastardised from rzip by Con Kolivas.
.br
Peter Hyman added informational output, updated LZMA SDK,
and added lzma multi-threading capabilities.
and added multi-threading capabilities.
.PP
If you wish to report a problem, or make a suggestion, then please consult the
git repository at:
https://github.com/ckolivas/lrzip
If you wish to report a problem or make a suggestion then please email Con at
kernel@kolivas.org
.PP
lrzip is released under the GNU General Public License version 2.
Please see the file COPYING for license details.

View file

@ -1,4 +1,4 @@
.TH "lrzip.conf" "5" "January 2009, updated May 2019" "" ""
.TH "lrzip.conf" "5" "January 2009" "" ""
.SH "NAME"
lrzip.conf \- Configuration File for lrzip
.SH "DESCRIPTION"
@ -13,63 +13,40 @@ three places\&:
.nf
$PWD \- Current Directory
/etc/lrzip
$HOME/\&.lrzip
$HOME/\&./lrzip
.PP
Parameters are set in \fBPARAMETER\&=VALUE\fP fashion where any line
beginning with a \fB#\fP or that is blank will be ignored\&.
Parameter values are not case sensitive except where specified\&.
.PP
.SH "CONFIG FILE EXAMPLE"
.nf
# This is a comment.
# Compression Window size in 100MB. Normally selected by program. (-w)
# WINDOW = 20
# Compression Level 1-9 (7 Default). (-L)
# COMPRESSIONLEVEL = 7
# Use -U setting, Unlimited ram. Yes or No
# UNLIMITED = NO
# Compression Method, rzip, gzip, bzip2, lzo, or lzma (default), or zpaq. (-n -g -b -l --lzma -z)
# If specified here, command line options not usable.
# COMPRESSIONMETHOD = lzma
# Perform LZO Test. Default = YES (-T )
# LZOTEST = NO
# Hash Check on decompression, (-c)
# HASHCHECK = YES
# Show HASH value on Compression even if Verbose is off, YES (-H)
# SHOWHASH = YES
# Default output directory (-O)
# OUTPUTDIRECTORY = location
# Verbosity, YES or MAX (v, vv)
# VERBOSITY = max
# Show Progress as file is parsed, YES or no (NO = -q option)
# SHOWPROGRESS = YES
# Set Niceness. 19 is default. -20 to 19 is the allowable range (-N)
# NICE = 19
# Keep broken or damaged output files, YES (-K)
# KEEPBROKEN = YES
# Delete source file after compression (-D)
# Compression Window size in 100MB. Normally selected by program.
WINDOW = 5
# Compression Level 1-9 (7 Default).
COMPRESSIONLEVEL = 7
# Compression Method, rzip, gzip, bzip2, lzo, or lzma (default).
COMPRESSIONMETHOD = lzma
# Test Threshold value 1-10 (2 Default).
TESTTHRESHOLD = 2
# Default output directory
OUTPUTDIRECTORY = location
# Verbosity, true or 1, or max or 2
VERBOSITY = max
# Show Progress as file is parsed, true or 1, false or 0
SHOWPROGRESS = true
# Set Niceness. 19 is default. \-20 to 19 is the allowable range
NICE = 19
# Delete source file after compression
# this parameter and value are case sensitive
# value must be YES to activate
# DELETEFILES = NO
# Replace existing lrzip file when compressing (-f)
# Replace existing lrzip file when compressing
# this parameter and value are case sensitive
# value must be YES to activate
# REPLACEFILE = YES
# Override for Temporary Directory. Only valid when stdin/out or Test is used
# TMPDIR = /tmp
# Whether to use encryption on compression YES, NO (-e)
# ENCRYPT = NO
# REPLACEFILE = NO
.fi
.PP
.SH "NOTES"

View file

@ -1,6 +1,6 @@
# Copyright
#
# Copyright (C) 2010-2016 Con Kolivas
# Copyright (C) 2010 Con Kolivas
# Copyright (C) 2009-2010 Jari Aalto
#
# License
@ -69,24 +69,20 @@ None.
=head1 SEE ALSO
lrzip.conf(5),
lrzuntar(1),
lrzip(1),
lrunzip(1),
lrzcat(1),
lrz(1),
bzip2(1),
gzip(1),
lzop(1),
lrzip(1),
rzip(1),
zip(1)
zip(1),
lrzip.conf(5)
=head1 AUTHORS
Program was written by Con Kolivas.
This manual page was written by Jari Aalto <jari.aalto@cante.net> (but
may be used by others). Released under license GNU GPL version 2 or (at
may be used by others). Released under license GNU GPL version 2or (at
your option) any later version. For more information about license,
visit <http://www.gnu.org/copyleft/gpl.html>.

View file

@ -1,6 +1,6 @@
# Copyright
#
# Copyright (C) 2010-2016 Con Kolivas
# Copyright (C) 2010 Con Kolivas
#
# License
#
@ -47,23 +47,17 @@ None.
=head1 SEE ALSO
lrzip.conf(5),
lrztar(1),
lrzip(1),
lrunzip(1),
lrzcat(1),
lrz(1),
bzip2(1),
gzip(1),
lzop(1),
lrzip(1),
rzip(1),
zip(1)
zip(1),
lrzip.conf(5)
=head1 AUTHORS
This manual page was written by Con Kolivas <kernel@kolivas.org> (but
may be used by others). Released under license GNU GPL version 2 or (at
your option) any later version. For more information about license,
visit <http://www.gnu.org/copyleft/gpl.html>.
Con Kolivas.
=cut

462
md5.c
View file

@ -1,462 +0,0 @@
/*
Copyright (C) 2012-2013 Con Kolivas
Copyright (C) 1995-2011 Ulrich Drepper.
Functions to compute MD5 message digest of files or memory blocks.
according to the definition of MD5 in RFC 1321 from April 1992.
Copyright (C) 1995-1997, 1999-2001, 2005-2006, 2008-2011 Free Software
Foundation, Inc.
This file is part of the GNU C Library.
This program is free software; you can redistribute it and/or modify it
under the terms of the GNU General Public License as published by the
Free Software Foundation; either version 3, or (at your option) any
later version.
This program is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
GNU General Public License for more details.
You should have received a copy of the GNU General Public License
along with this program; if not, write to the Free Software Foundation,
Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301, USA. */
/* Written by Ulrich Drepper <drepper@gnu.ai.mit.edu>, 1995. */
#ifdef HAVE_CONFIG_H
# include "config.h"
#endif
#include <stddef.h>
#include "lrzip_private.h"
#include "md5.h"
#if USE_UNLOCKED_IO
# include "unlocked-io.h"
#endif
#ifdef HAVE_ENDIAN_H
# include <endian.h>
#elif HAVE_SYS_ENDIAN_H
# include <sys/endian.h>
#endif
#ifdef HAVE_ARPA_INET_H
# include <arpa/inet.h>
#endif
/* We need to keep the namespace clean so define the MD5 function
protected using leading __ . */
# define md5_init_ctx __md5_init_ctx
# define md5_process_block __md5_process_block
# define md5_process_bytes __md5_process_bytes
# define md5_finish_ctx __md5_finish_ctx
# define md5_read_ctx __md5_read_ctx
# define md5_stream __md5_stream
# define md5_buffer __md5_buffer
#define BLOCKSIZE 32768
#if BLOCKSIZE % 64 != 0
# error "invalid BLOCKSIZE"
#endif
/* This array contains the bytes used to pad the buffer to the next
64-byte boundary. (RFC 1321, 3.1: Step 1) */
static const unsigned char fillbuf[64] = { 0x80, 0 /* , 0, 0, ... */ };
/* Initialize structure containing state of computation.
(RFC 1321, 3.3: Step 3) */
void
md5_init_ctx (struct md5_ctx *ctx)
{
ctx->A = 0x67452301;
ctx->B = 0xefcdab89;
ctx->C = 0x98badcfe;
ctx->D = 0x10325476;
ctx->total[0] = ctx->total[1] = 0;
ctx->buflen = 0;
}
/* Copy the 4 byte value from v into the memory location pointed to by *cp,
If your architecture allows unaligned access this is equivalent to
* (uint32_t *) cp = v */
static inline void
set_uint32 (char *cp, uint32_t v)
{
memcpy (cp, &v, sizeof v);
}
/* Put result from CTX in first 16 bytes following RESBUF. The result
must be in little endian byte order. */
void *
md5_read_ctx (const struct md5_ctx *ctx, void *resbuf)
{
char *r = resbuf;
set_uint32 (r + 0 * sizeof ctx->A, htole32 (ctx->A));
set_uint32 (r + 1 * sizeof ctx->B, htole32 (ctx->B));
set_uint32 (r + 2 * sizeof ctx->C, htole32 (ctx->C));
set_uint32 (r + 3 * sizeof ctx->D, htole32 (ctx->D));
return resbuf;
}
/* Process the remaining bytes in the internal buffer and the usual
prolog according to the standard and write the result to RESBUF. */
void *
md5_finish_ctx (struct md5_ctx *ctx, void *resbuf)
{
/* Take yet unprocessed bytes into account. */
uint32_t bytes = ctx->buflen;
size_t size = (bytes < 56) ? 64 / 4 : 64 * 2 / 4;
/* Now count remaining bytes. */
ctx->total[0] += bytes;
if (ctx->total[0] < bytes)
++ctx->total[1];
/* Put the 64-bit file length in *bits* at the end of the buffer. */
ctx->buffer[size - 2] = htole32 (ctx->total[0] << 3);
ctx->buffer[size - 1] = htole32 ((ctx->total[1] << 3) | (ctx->total[0] >> 29));
memcpy (&((char *) ctx->buffer)[bytes], fillbuf, (size - 2) * 4 - bytes);
/* Process last bytes. */
md5_process_block (ctx->buffer, size * 4, ctx);
return md5_read_ctx (ctx, resbuf);
}
/* Compute MD5 message digest for bytes read from STREAM. The
resulting message digest number will be written into the 16 bytes
beginning at RESBLOCK. */
int
md5_stream (FILE *stream, void *resblock)
{
struct md5_ctx ctx;
size_t sum;
char *buffer = malloc (BLOCKSIZE + 72);
if (!buffer)
return 1;
/* Initialize the computation context. */
md5_init_ctx (&ctx);
/* Iterate over full file contents. */
while (1)
{
/* We read the file in blocks of BLOCKSIZE bytes. One call of the
computation function processes the whole buffer so that with the
next round of the loop another block can be read. */
size_t n;
sum = 0;
/* Read block. Take care for partial reads. */
while (1)
{
n = fread (buffer + sum, 1, BLOCKSIZE - sum, stream);
sum += n;
if (sum == BLOCKSIZE)
break;
if (n == 0)
{
/* Check for the error flag IFF N == 0, so that we don't
exit the loop after a partial read due to e.g., EAGAIN
or EWOULDBLOCK. */
if (ferror (stream))
{
free (buffer);
return 1;
}
goto process_partial_block;
}
/* We've read at least one byte, so ignore errors. But always
check for EOF, since feof may be true even though N > 0.
Otherwise, we could end up calling fread after EOF. */
if (feof (stream))
goto process_partial_block;
}
/* Process buffer with BLOCKSIZE bytes. Note that
BLOCKSIZE % 64 == 0
*/
md5_process_block (buffer, BLOCKSIZE, &ctx);
}
process_partial_block:
/* Process any remaining bytes. */
if (sum > 0)
md5_process_bytes (buffer, sum, &ctx);
/* Construct result in desired memory. */
md5_finish_ctx (&ctx, resblock);
free (buffer);
return 0;
}
/* Compute MD5 message digest for LEN bytes beginning at BUFFER. The
result is always in little endian byte order, so that a byte-wise
output yields to the wanted ASCII representation of the message
digest. */
void *
md5_buffer (const char *buffer, size_t len, void *resblock)
{
struct md5_ctx ctx;
/* Initialize the computation context. */
md5_init_ctx (&ctx);
/* Process whole buffer but last len % 64 bytes. */
md5_process_bytes (buffer, len, &ctx);
/* Put result in desired memory area. */
return md5_finish_ctx (&ctx, resblock);
}
void
md5_process_bytes (const void *buffer, size_t len, struct md5_ctx *ctx)
{
/* When we already have some bits in our internal buffer concatenate
both inputs first. */
if (ctx->buflen != 0)
{
size_t left_over = ctx->buflen;
size_t add = 128 - left_over > len ? len : 128 - left_over;
memcpy (&((char *) ctx->buffer)[left_over], buffer, add);
ctx->buflen += add;
if (ctx->buflen > 64)
{
md5_process_block (ctx->buffer, ctx->buflen & ~63, ctx);
ctx->buflen &= 63;
/* The regions in the following copy operation cannot overlap. */
memcpy (ctx->buffer,
&((char *) ctx->buffer)[(left_over + add) & ~63],
ctx->buflen);
}
buffer = (const char *) buffer + add;
len -= add;
}
/* Process available complete blocks. */
if (len >= 64)
{
#if !_STRING_ARCH_unaligned
# define alignof(type) offsetof (struct { char c; type x; }, x)
# define UNALIGNED_P(p) (((size_t) p) % alignof (uint32_t) != 0)
if (UNALIGNED_P (buffer))
while (len > 64)
{
md5_process_block (memcpy (ctx->buffer, buffer, 64), 64, ctx);
buffer = (const char *) buffer + 64;
len -= 64;
}
else
#endif
{
md5_process_block (buffer, len & ~63, ctx);
buffer = (const char *) buffer + (len & ~63);
len &= 63;
}
}
/* Move remaining bytes in internal buffer. */
if (len > 0)
{
size_t left_over = ctx->buflen;
memcpy (&((char *) ctx->buffer)[left_over], buffer, len);
left_over += len;
if (left_over >= 64)
{
md5_process_block (ctx->buffer, 64, ctx);
left_over -= 64;
memcpy (ctx->buffer, &ctx->buffer[16], left_over);
}
ctx->buflen = left_over;
}
}
/* These are the four functions used in the four steps of the MD5 algorithm
and defined in RFC 1321. The first function is slightly optimized
(as found in Colin Plumb's public domain implementation). */
/* #define FF(b, c, d) ((b & c) | (~b & d)) */
#define FF(b, c, d) (d ^ (b & (c ^ d)))
#define FG(b, c, d) FF (d, b, c)
#define FH(b, c, d) (b ^ c ^ d)
#define FI(b, c, d) (c ^ (b | ~d))
/* Process LEN bytes of BUFFER, accumulating context into CTX.
It is assumed that LEN % 64 == 0. */
void
md5_process_block (const void *buffer, size_t len, struct md5_ctx *ctx)
{
uint32_t correct_words[16];
const uint32_t *words = buffer;
size_t nwords = len / sizeof (uint32_t);
const uint32_t *endp = words + nwords;
uint32_t A = ctx->A;
uint32_t B = ctx->B;
uint32_t C = ctx->C;
uint32_t D = ctx->D;
uint32_t lolen = len;
/* First increment the byte count. RFC 1321 specifies the possible
length of the file up to 2^64 bits. Here we only compute the
number of bytes. Do a double word increment. */
ctx->total[0] += lolen;
ctx->total[1] += (len >> 31 >> 1) + (ctx->total[0] < lolen);
/* Process all bytes in the buffer with 64 bytes in each round of
the loop. */
while (words < endp)
{
uint32_t *cwp = correct_words;
uint32_t A_save = A;
uint32_t B_save = B;
uint32_t C_save = C;
uint32_t D_save = D;
/* First round: using the given function, the context and a constant
the next context is computed. Because the algorithm's processing
unit is a 32-bit word and it is defined to work on words in
little endian byte order we may have to change the byte order
before the computation. To reduce the work for the next steps
we store the swapped words in the array CORRECT_WORDS. */
#define OP(a, b, c, d, s, T) \
do \
{ \
a += FF (b, c, d) + (*cwp++ = htole32 (*words)) + T; \
++words; \
CYCLIC (a, s); \
a += b; \
} \
while (0)
/* It is unfortunate that C does not provide an operator for
cyclic rotation. Hope the C compiler is smart enough. */
#define CYCLIC(w, s) (w = (w << s) | (w >> (32 - s)))
/* Before we start, one word to the strange constants.
They are defined in RFC 1321 as
T[i] = (int) (4294967296.0 * fabs (sin (i))), i=1..64
Here is an equivalent invocation using Perl:
perl -e 'foreach(1..64){printf "0x%08x\n", int (4294967296 * abs (sin $_))}'
*/
/* Round 1. */
OP (A, B, C, D, 7, 0xd76aa478);
OP (D, A, B, C, 12, 0xe8c7b756);
OP (C, D, A, B, 17, 0x242070db);
OP (B, C, D, A, 22, 0xc1bdceee);
OP (A, B, C, D, 7, 0xf57c0faf);
OP (D, A, B, C, 12, 0x4787c62a);
OP (C, D, A, B, 17, 0xa8304613);
OP (B, C, D, A, 22, 0xfd469501);
OP (A, B, C, D, 7, 0x698098d8);
OP (D, A, B, C, 12, 0x8b44f7af);
OP (C, D, A, B, 17, 0xffff5bb1);
OP (B, C, D, A, 22, 0x895cd7be);
OP (A, B, C, D, 7, 0x6b901122);
OP (D, A, B, C, 12, 0xfd987193);
OP (C, D, A, B, 17, 0xa679438e);
OP (B, C, D, A, 22, 0x49b40821);
/* For the second to fourth round we have the possibly swapped words
in CORRECT_WORDS. Redefine the macro to take an additional first
argument specifying the function to use. */
#undef OP
#define OP(f, a, b, c, d, k, s, T) \
do \
{ \
a += f (b, c, d) + correct_words[k] + T; \
CYCLIC (a, s); \
a += b; \
} \
while (0)
/* Round 2. */
OP (FG, A, B, C, D, 1, 5, 0xf61e2562);
OP (FG, D, A, B, C, 6, 9, 0xc040b340);
OP (FG, C, D, A, B, 11, 14, 0x265e5a51);
OP (FG, B, C, D, A, 0, 20, 0xe9b6c7aa);
OP (FG, A, B, C, D, 5, 5, 0xd62f105d);
OP (FG, D, A, B, C, 10, 9, 0x02441453);
OP (FG, C, D, A, B, 15, 14, 0xd8a1e681);
OP (FG, B, C, D, A, 4, 20, 0xe7d3fbc8);
OP (FG, A, B, C, D, 9, 5, 0x21e1cde6);
OP (FG, D, A, B, C, 14, 9, 0xc33707d6);
OP (FG, C, D, A, B, 3, 14, 0xf4d50d87);
OP (FG, B, C, D, A, 8, 20, 0x455a14ed);
OP (FG, A, B, C, D, 13, 5, 0xa9e3e905);
OP (FG, D, A, B, C, 2, 9, 0xfcefa3f8);
OP (FG, C, D, A, B, 7, 14, 0x676f02d9);
OP (FG, B, C, D, A, 12, 20, 0x8d2a4c8a);
/* Round 3. */
OP (FH, A, B, C, D, 5, 4, 0xfffa3942);
OP (FH, D, A, B, C, 8, 11, 0x8771f681);
OP (FH, C, D, A, B, 11, 16, 0x6d9d6122);
OP (FH, B, C, D, A, 14, 23, 0xfde5380c);
OP (FH, A, B, C, D, 1, 4, 0xa4beea44);
OP (FH, D, A, B, C, 4, 11, 0x4bdecfa9);
OP (FH, C, D, A, B, 7, 16, 0xf6bb4b60);
OP (FH, B, C, D, A, 10, 23, 0xbebfbc70);
OP (FH, A, B, C, D, 13, 4, 0x289b7ec6);
OP (FH, D, A, B, C, 0, 11, 0xeaa127fa);
OP (FH, C, D, A, B, 3, 16, 0xd4ef3085);
OP (FH, B, C, D, A, 6, 23, 0x04881d05);
OP (FH, A, B, C, D, 9, 4, 0xd9d4d039);
OP (FH, D, A, B, C, 12, 11, 0xe6db99e5);
OP (FH, C, D, A, B, 15, 16, 0x1fa27cf8);
OP (FH, B, C, D, A, 2, 23, 0xc4ac5665);
/* Round 4. */
OP (FI, A, B, C, D, 0, 6, 0xf4292244);
OP (FI, D, A, B, C, 7, 10, 0x432aff97);
OP (FI, C, D, A, B, 14, 15, 0xab9423a7);
OP (FI, B, C, D, A, 5, 21, 0xfc93a039);
OP (FI, A, B, C, D, 12, 6, 0x655b59c3);
OP (FI, D, A, B, C, 3, 10, 0x8f0ccc92);
OP (FI, C, D, A, B, 10, 15, 0xffeff47d);
OP (FI, B, C, D, A, 1, 21, 0x85845dd1);
OP (FI, A, B, C, D, 8, 6, 0x6fa87e4f);
OP (FI, D, A, B, C, 15, 10, 0xfe2ce6e0);
OP (FI, C, D, A, B, 6, 15, 0xa3014314);
OP (FI, B, C, D, A, 13, 21, 0x4e0811a1);
OP (FI, A, B, C, D, 4, 6, 0xf7537e82);
OP (FI, D, A, B, C, 11, 10, 0xbd3af235);
OP (FI, C, D, A, B, 2, 15, 0x2ad7d2bb);
OP (FI, B, C, D, A, 9, 21, 0xeb86d391);
/* Add the starting values of the context. */
A += A_save;
B += B_save;
C += C_save;
D += D_save;
}
/* Put checksum in context given as argument. */
ctx->A = A;
ctx->B = B;
ctx->C = C;
ctx->D = D;
}

118
md5.h
View file

@ -1,118 +0,0 @@
/*
Copyright (C) 2011 Con Kolivas
Copyright (C) 1995-2011 Ulrich Drepper.
Declaration of functions and data types used for MD5 sum computing
library functions.
Copyright (C) 1995-1997, 1999-2001, 2004-2006, 2008-2011 Free Software
Foundation, Inc.
This file is part of the GNU C Library.
This program is free software; you can redistribute it and/or modify it
under the terms of the GNU General Public License as published by the
Free Software Foundation; either version 3, or (at your option) any
later version.
This program is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
GNU General Public License for more details.
You should have received a copy of the GNU General Public License
along with this program; if not, write to the Free Software Foundation,
Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301, USA. */
#ifndef _MD5_H
#define _MD5_H 1
#include <stdint.h>
#include "lrzip_private.h"
#define MD5_DIGEST_SIZE 16
#define MD5_BLOCK_SIZE 64
#ifndef __GNUC_PREREQ
# if defined __GNUC__ && defined __GNUC_MINOR__
# define __GNUC_PREREQ(maj, min) \
((__GNUC__ << 16) + __GNUC_MINOR__ >= ((maj) << 16) + (min))
# else
# define __GNUC_PREREQ(maj, min) 0
# endif
#endif
#ifndef __THROW
# if defined __cplusplus && __GNUC_PREREQ (2,8)
# define __THROW throw ()
# else
# define __THROW
# endif
#endif
#ifndef _LIBC
# define __md5_buffer md5_buffer
# define __md5_finish_ctx md5_finish_ctx
# define __md5_init_ctx md5_init_ctx
# define __md5_process_block md5_process_block
# define __md5_process_bytes md5_process_bytes
# define __md5_read_ctx md5_read_ctx
# define __md5_stream md5_stream
#endif
# ifdef __cplusplus
extern "C" {
# endif
/*
* The following three functions are build up the low level used in
* the functions `md5_stream' and `md5_buffer'.
*/
/* Initialize structure containing state of computation.
(RFC 1321, 3.3: Step 3) */
extern void __md5_init_ctx (struct md5_ctx *ctx) __THROW;
/* Starting with the result of former calls of this function (or the
initialization function update the context for the next LEN bytes
starting at BUFFER.
It is necessary that LEN is a multiple of 64!!! */
extern void __md5_process_block (const void *buffer, size_t len,
struct md5_ctx *ctx) __THROW;
/* Starting with the result of former calls of this function (or the
initialization function update the context for the next LEN bytes
starting at BUFFER.
It is NOT required that LEN is a multiple of 64. */
extern void __md5_process_bytes (const void *buffer, size_t len,
struct md5_ctx *ctx) __THROW;
/* Process the remaining bytes in the buffer and put result from CTX
in first 16 bytes following RESBUF. The result is always in little
endian byte order, so that a byte-wise output yields to the wanted
ASCII representation of the message digest. */
extern void *__md5_finish_ctx (struct md5_ctx *ctx, void *resbuf) __THROW;
/* Put result from CTX in first 16 bytes following RESBUF. The result is
always in little endian byte order, so that a byte-wise output yields
to the wanted ASCII representation of the message digest. */
extern void *__md5_read_ctx (const struct md5_ctx *ctx, void *resbuf) __THROW;
/* Compute MD5 message digest for bytes read from STREAM. The
resulting message digest number will be written into the 16 bytes
beginning at RESBLOCK. */
extern int __md5_stream (FILE *stream, void *resblock) __THROW;
/* Compute MD5 message digest for LEN bytes beginning at BUFFER. The
result is always in little endian byte order, so that a byte-wise
output yields to the wanted ASCII representation of the message
digest. */
extern void *__md5_buffer (const char *buffer, size_t len,
void *resblock) __THROW;
# ifdef __cplusplus
}
# endif
#endif /* md5.h */
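
The comments in md5.h above say the digest is written as 16 bytes in little endian order so that a byte-wise dump gives the usual ASCII form. The following is a minimal sketch of that byte-wise dump only. `digest_to_hex` and `DIGEST_SIZE` are hypothetical names introduced for illustration; they are not part of md5.h, and the sketch does not compute MD5 itself.

```c
#include <assert.h>
#include <stddef.h>

#define DIGEST_SIZE 16 /* assumption: mirrors MD5_DIGEST_SIZE from md5.h */

/* Hypothetical helper, not part of md5.h: render a 16-byte digest as
 * 32 lowercase hex characters followed by a terminating NUL. */
static void digest_to_hex(const unsigned char digest[DIGEST_SIZE],
                          char out[2 * DIGEST_SIZE + 1])
{
    static const char hexdigits[] = "0123456789abcdef";
    size_t i;

    assert(digest != NULL && out != NULL);
    for (i = 0; i < DIGEST_SIZE; i++) {
        out[2 * i]     = hexdigits[digest[i] >> 4];   /* high nibble first */
        out[2 * i + 1] = hexdigits[digest[i] & 0x0f]; /* then low nibble */
    }
    out[2 * DIGEST_SIZE] = '\0';
}
```

In the removed runzip.c code the same dump is done inline with `print_output("%02x", ...)` over `MD5_DIGEST_SIZE` bytes; a helper like this one would only be a convenience when the string is needed more than once.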

@ -1,49 +0,0 @@
Test basic use
Test decompression in read-only dir
1000 1000 3893
this should be silent
man page for lrz should exist
0
compress stdin to stdout
Respect $TMPDIR
1000 1000 3893
Decompress in read only dir
1000 1000 3893
Test -cd
1000 1000 3893
Test -cfd should not remove testfile.lrz
1000 1000 3893
testfile.lrz
Test -1c
1002 1002 3975
Test -r
t10.lrz
t1.lrz
t2.lrz
t3.lrz
t4.lrz
t5.lrz
t6.lrz
t7.lrz
t8.lrz
t9.lrz
Test tar compatibility
t/
t/t8
t/t7
t/t3
t/t5
t/t2
t/t6
t/t10
t/t4
t/t9
t/t1
11
test compress of 1 GB data with parallel --pipe --compress
1073741824
test compress of 1 GB with sort --compress-program
1073741825
test should not lrz -dc removes file
OK
testfile.lrz

@ -1,119 +0,0 @@
#!/bin/bash
# Regression test.
#
# Copyright (C) 2016
# Ole Tange and Free Software Foundation, Inc.
#
# This program is free software; you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
# the Free Software Foundation; either version 3 of the License, or
# (at your option) any later version.
#
# This program is distributed in the hope that it will be useful, but
# WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
# General Public License for more details.
#
# You should have received a copy of the GNU General Public License
# along with this program; if not, see <http://www.gnu.org/licenses/>
# or write to the Free Software Foundation, Inc., 51 Franklin St,
# Fifth Floor, Boston, MA 02110-1301 USA
bash > regressiontest.out 2>&1 <<'_EOS'
rm -f testfile.lrz
seq 1000 > testfile
echo 'Test basic use'
lrz testfile
echo 'Test decompression in read-only dir'
mkdir -p ro
cp testfile.lrz ro
chmod 500 ro
cd ro
lrz -dc testfile.lrz | wc
cd ..
echo 'this should be silent'
lrz -d testfile.lrz
echo 'man page for lrz should exist'
man lrz >/dev/null
echo $?
echo 'compress stdin to stdout'
cat testfile | lrz | cat > testfile.lrz
echo 'Respect $TMPDIR'
mkdir -p t
chmod 111 t
cd t
TMPDIR=.. lrz -d < ../testfile.lrz | wc
cd ..
rm -rf t
echo 'Decompress in read only dir'
mkdir -p t
chmod 111 t
cd t
lrz -d < ../testfile.lrz | wc
cd ..
rm -rf t
echo 'Test -cd'
mkdir -p t
chmod 111 t
cd t
lrz -cd ../testfile.lrz | wc
cd ..
rm -rf t
echo 'Test -cfd should not remove testfile.lrz'
mkdir -p t
chmod 111 t
cd t
lrz -cfd ../testfile.lrz | wc
cd ..
rm -rf t
ls testfile.lrz
echo 'Test -1c'
lrz -1c testfile | wc
echo 'Test -r'
mkdir t
touch t/t{1..10}
lrz -r t
ls t
rm -r t
echo 'Test tar compatibility'
mkdir t
touch t/t{1..10}
tar --use-compress-program lrz -cvf testfile.tar.lrz t
tar --use-compress-program lrz -tvf testfile.tar.lrz | wc -l
rm -r t
echo 'test compress of 1 GB data with parallel --pipe --compress'
yes "`echo {1..100}`" |
head -c 1G |
parallel --pipe --block 100m --compress-program lrz cat |
wc -c
echo 'test compress of 1 GB with sort --compress-program'
yes "`echo {1..100}`" |
head -c 1G |
sort --compress-program lrz |
wc -c
echo 'test should not lrz -dc removes file'
rm testfile.lrz
echo OK > testfile
lrz testfile
lrz -dc testfile.lrz
ls testfile.lrz
_EOS
diff regressiontest.good regressiontest.out

437
runzip.c
@ -1,6 +1,6 @@
/*
Copyright (C) 2006-2016,2018,2021-2022 Con Kolivas
Copyright (C) 1998-2003 Andrew Tridgell
Copyright (C) Andrew Tridgell 1998-2003
Con Kolivas 2006-2010
This program is free software; you can redistribute it and/or modify
it under the terms of the GNU General Public License as published by
@ -13,246 +13,126 @@
GNU General Public License for more details.
You should have received a copy of the GNU General Public License
along with this program. If not, see <http://www.gnu.org/licenses/>.
along with this program; if not, write to the Free Software
Foundation, Inc., 675 Mass Ave, Cambridge, MA 02139, USA.
*/
/* rzip decompression algorithm */
#ifdef HAVE_CONFIG_H
# include "config.h"
#endif
#include "rzip.h"
#include <sys/types.h>
#ifdef HAVE_SYS_STAT_H
# include <sys/stat.h>
#endif
#ifdef HAVE_SYS_TIME_H
# include <sys/time.h>
#endif
#ifdef HAVE_UNISTD_H
# include <unistd.h>
#endif
#ifdef HAVE_ENDIAN_H
# include <endian.h>
#elif HAVE_SYS_ENDIAN_H
# include <sys/endian.h>
#endif
#ifdef HAVE_ARPA_INET_H
# include <arpa/inet.h>
#endif
#include "md5.h"
#include "runzip.h"
#include "stream.h"
#include "util.h"
#include "lrzip_core.h"
/* needed for CRC routines */
#include "lzma/C/7zCrc.h"
static inline uchar read_u8(rzip_control *control, void *ss, int stream, bool *err)
static inline uchar read_u8(void *ss, int stream)
{
uchar b;
if (unlikely(read_stream(control, ss, stream, &b, 1) != 1)) {
*err = true;
fatal_return(("Stream read u8 failed\n"), 0);
}
if (unlikely(read_stream(ss, stream, &b, 1) != 1))
fatal("Stream read u8 failed\n");
return b;
}
static inline u32 read_u32(rzip_control *control, void *ss, int stream, bool *err)
static inline u32 read_u32(void *ss, int stream)
{
u32 ret;
if (unlikely(read_stream(control, ss, stream, (uchar *)&ret, 4) != 4)) {
*err = true;
fatal_return(("Stream read u32 failed\n"), 0);
}
ret = le32toh(ret);
if (unlikely(read_stream(ss, stream, (uchar *)&ret, 4) != 4))
fatal("Stream read u32 failed\n");
return ret;
}
/* Read a variable length of chars dependant on how big the chunk was */
static inline i64 read_vchars(rzip_control *control, void *ss, int stream, int length)
static inline i64 read_vchars(void *ss, int stream, int length)
{
int bytes;
i64 s = 0;
if (unlikely(read_stream(control, ss, stream, (uchar *)&s, length) != length))
fatal_return(("Stream read of %d bytes failed\n", length), -1);
s = le64toh(s);
for (bytes = 0; bytes < length; bytes++) {
int bits = bytes * 8;
uchar sb = read_u8(ss, stream);
s |= (i64)sb << bits;
}
return s;
}
static i64 seekcur_fdout(rzip_control *control)
static i64 read_header(void *ss, uchar *head)
{
if (!TMP_OUTBUF)
return lseek(control->fd_out, 0, SEEK_CUR);
return (control->out_relofs + control->out_ofs);
int chunk_bytes = 2;
/* All chunks were unnecessarily encoded 8 bytes wide version 0.4x */
if (control.major_version == 0 && control.minor_version == 4)
chunk_bytes = 8;
*head = read_u8(ss, 0);
return read_vchars(ss, 0, chunk_bytes);
}
static i64 seekto_fdhist(rzip_control *control, i64 pos)
static i64 unzip_literal(void *ss, i64 len, int fd_out, uint32 *cksum)
{
if (!TMP_OUTBUF)
return lseek(control->fd_hist, pos, SEEK_SET);
control->hist_ofs = pos - control->out_relofs;
if (control->hist_ofs > control->out_len)
control->out_len = control->hist_ofs;
if (unlikely(control->hist_ofs < 0 || control->hist_ofs > control->out_maxlen)) {
print_err("Trying to seek outside tmpoutbuf to %lld in seekto_fdhist\n", control->hist_ofs);
return -1;
}
return pos;
}
static i64 seekcur_fdin(rzip_control *control)
{
if (!TMP_INBUF)
return lseek(control->fd_in, 0, SEEK_CUR);
return control->in_ofs;
}
static i64 seekto_fdin(rzip_control *control, i64 pos)
{
if (!TMP_INBUF)
return lseek(control->fd_in, pos, SEEK_SET);
if (unlikely(pos > control->in_len || pos < 0)) {
print_err("Trying to seek outside tmpinbuf to %lld in seekto_fdin\n", pos);
return -1;
}
control->in_ofs = pos;
return 0;
}
static i64 seekto_fdinend(rzip_control *control)
{
int tmpchar;
if (!TMP_INBUF)
return lseek(control->fd_in, 0, SEEK_END);
while ((tmpchar = getchar()) != EOF) {
control->tmp_inbuf[control->in_len++] = (char)tmpchar;
if (unlikely(control->in_len > control->in_maxlen))
failure_return(("Trying to read greater than max_len\n"), -1);
}
control->in_ofs = control->in_len;
return control->in_ofs;
}
static i64 read_header(rzip_control *control, void *ss, uchar *head)
{
bool err = false;
*head = read_u8(control, ss, 0, &err);
if (err)
return -1;
return read_vchars(control, ss, 0, control->chunk_bytes);
}
static i64 unzip_literal(rzip_control *control, void *ss, i64 len, uint32 *cksum)
{
i64 stream_read;
uchar *buf;
if (unlikely(len < 0))
failure_return(("len %lld is negative in unzip_literal!\n",len), -1);
fatal("len %lld is negative in unzip_literal!\n",len);
buf = (uchar *)malloc(len);
if (unlikely(!buf))
fatal_return(("Failed to malloc literal buffer of size %lld\n", len), -1);
/* We use anonymous mmap instead of malloc to allow us to allocate up
* to 2^44 even on 32 bits */
buf = (uchar *)mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_ANONYMOUS | MAP_PRIVATE, -1, 0);
if (unlikely(buf == MAP_FAILED))
fatal("Failed to allocate literal buffer of size %lld\n", len);
stream_read = read_stream(control, ss, 1, buf, len);
if (unlikely(stream_read == -1 )) {
dealloc(buf);
fatal_return(("Failed to read_stream in unzip_literal\n"), -1);
}
read_stream(ss, 1, buf, len);
if (unlikely(write_1g(fd_out, buf, (size_t)len) != (ssize_t)len))
fatal("Failed to write literal buffer of size %lld\n", len);
if (unlikely(write_1g(control, buf, (size_t)stream_read) != (ssize_t)stream_read)) {
dealloc(buf);
fatal_return(("Failed to write literal buffer of size %lld\n", stream_read), -1);
}
*cksum = CrcUpdate(*cksum, buf, len);
if (!HAS_MD5)
*cksum = CrcUpdate(*cksum, buf, stream_read);
if (!NO_MD5)
md5_process_bytes(buf, stream_read, &control->ctx);
dealloc(buf);
return stream_read;
}
static i64 read_fdhist(rzip_control *control, void *buf, i64 len)
{
if (!TMP_OUTBUF)
return read_1g(control, control->fd_hist, buf, len);
if (unlikely(len + control->hist_ofs > control->out_maxlen)) {
print_err("Trying to read beyond end of tmpoutbuf in read_fdhist\n");
return -1;
}
memcpy(buf, control->tmp_outbuf + control->hist_ofs, len);
munmap(buf, len);
return len;
}
static i64 unzip_match(rzip_control *control, void *ss, i64 len, uint32 *cksum, int chunk_bytes)
static i64 unzip_match(void *ss, i64 len, int fd_out, int fd_hist, uint32 *cksum, int chunk_bytes)
{
i64 offset, n, total, cur_pos;
uchar *buf;
if (unlikely(len < 0))
failure_return(("len %lld is negative in unzip_match!\n",len), -1);
fatal("len %lld is negative in unzip_match!\n",len);
total = 0;
cur_pos = seekcur_fdout(control);
cur_pos = lseek(fd_out, 0, SEEK_CUR);
if (unlikely(cur_pos == -1))
fatal_return(("Seek failed on out file in unzip_match.\n"), -1);
fatal("Seek failed on out file in unzip_match.\n");
/* Note the offset is in a different format v0.40+ */
offset = read_vchars(control, ss, 0, chunk_bytes);
if (unlikely(offset == -1))
return -1;
if (unlikely(seekto_fdhist(control, cur_pos - offset) == -1))
fatal_return(("Seek failed by %d from %d on history file in unzip_match\n",
offset, cur_pos), -1);
n = MIN(len, offset);
if (unlikely(n < 1))
fatal_return(("Failed fd history in unzip_match due to corrupt archive\n"), -1);
buf = (uchar *)malloc(n);
if (unlikely(!buf))
fatal_return(("Failed to malloc match buffer of size %lld\n", len), -1);
if (unlikely(read_fdhist(control, buf, (size_t)n) != (ssize_t)n)) {
dealloc(buf);
fatal_return(("Failed to read %d bytes in unzip_match\n", n), -1);
}
offset = read_vchars(ss, 0, chunk_bytes);
if (unlikely(lseek(fd_hist, cur_pos - offset, SEEK_SET) == -1))
fatal("Seek failed by %d from %d on history file in unzip_match - %s\n",
offset, cur_pos, strerror(errno));
while (len) {
uchar *buf;
n = MIN(len, offset);
if (unlikely(n < 1))
fatal_return(("Failed fd history in unzip_match due to corrupt archive\n"), -1);
if (unlikely(write_1g(control, buf, (size_t)n) != (ssize_t)n)) {
dealloc(buf);
fatal_return(("Failed to write %d bytes in unzip_match\n", n), -1);
}
buf = (uchar *)mmap(NULL, n, PROT_READ | PROT_WRITE, MAP_ANONYMOUS | MAP_PRIVATE, -1, 0);
if (unlikely(buf == MAP_FAILED))
fatal("Failed to allocate match buffer of size %lld\n", n);
if (!HAS_MD5)
*cksum = CrcUpdate(*cksum, buf, n);
if (!NO_MD5)
md5_process_bytes(buf, n, &control->ctx);
if (unlikely(read_1g(fd_hist, buf, (size_t)n) != (ssize_t)n))
fatal("Failed to read %d bytes in unzip_match\n", n);
if (unlikely(write_1g(fd_out, buf, (size_t)n) != (ssize_t)n))
fatal("Failed to write %d bytes in unzip_match\n", n);
*cksum = CrcUpdate(*cksum, buf, n);
len -= n;
munmap(buf, n);
total += n;
}
dealloc(buf);
return total;
}
/* decompress a section of an open file. Call fatal_return(() on error
/* decompress a section of an open file. Call fatal() on error
return the number of bytes that have been retrieved
*/
static i64 runzip_chunk(rzip_control *control, int fd_in, i64 expected_size, i64 tally)
static i64 runzip_chunk(int fd_in, int fd_out, int fd_hist, i64 expected_size, i64 tally)
{
uint32 good_cksum, cksum = 0;
i64 len, ofs, total = 0;
@ -261,7 +141,6 @@ static i64 runzip_chunk(rzip_control *control, int fd_in, i64 expected_size, i64
struct stat st;
uchar head;
void *ss;
bool err = false;
/* for display of progress */
unsigned long divisor[] = {1,1024,1048576,1073741824U};
@ -283,206 +162,74 @@ static i64 runzip_chunk(rzip_control *control, int fd_in, i64 expected_size, i64
/* Determine the chunk_byte width size. Versions < 0.4 used 4
* bytes for all offsets, version 0.4 used 8 bytes. Versions 0.5+ use
* a variable number of bytes depending on chunk size.*/
if (control->major_version == 0 && control->minor_version < 4)
if (control.major_version == 0 && control.minor_version < 4)
chunk_bytes = 4;
else if (control->major_version == 0 && control->minor_version == 4)
else if (control.major_version == 0 && control.minor_version == 4)
chunk_bytes = 8;
else {
print_maxverbose("Reading chunk_bytes at %lld\n", get_readseek(control, fd_in));
/* Read in the stored chunk byte width from the file */
if (unlikely(read_1g(control, fd_in, &chunk_bytes, 1) != 1))
fatal_return(("Failed to read chunk_bytes size in runzip_chunk\n"), -1);
if (unlikely(chunk_bytes < 1 || chunk_bytes > 8))
failure_return(("chunk_bytes %d is invalid in runzip_chunk\n", chunk_bytes), -1);
if (unlikely(read(fd_in, &chunk_bytes, 1) != 1))
fatal("Failed to read chunk_bytes size in runzip_chunk\n");
}
if (!tally && expected_size)
print_maxverbose("Expected size: %lld\n", expected_size);
print_maxverbose("Chunk byte width: %d\n", chunk_bytes);
if (!tally)
print_maxverbose("\nExpected size: %lld", expected_size);
print_maxverbose("\nChunk byte width: %d\n", chunk_bytes);
ofs = seekcur_fdin(control);
ofs = lseek(fd_in, 0, SEEK_CUR);
if (unlikely(ofs == -1))
fatal_return(("Failed to seek input file in runzip_fd\n"), -1);
fatal("Failed to seek input file in runzip_fd\n");
if (fstat(fd_in, &st) || st.st_size - ofs == 0)
return 0;
ss = open_stream_in(control, fd_in, NUM_STREAMS, chunk_bytes);
ss = open_stream_in(fd_in, NUM_STREAMS);
if (unlikely(!ss))
failure_return(("Failed to open_stream_in in runzip_chunk\n"), -1);
fatal("Failed to open_stream_in in runzip_chunk\n");
/* All chunks were unnecessarily encoded 8 bytes wide version 0.4x */
if (control->major_version == 0 && control->minor_version == 4)
control->chunk_bytes = 8;
else
control->chunk_bytes = 2;
while ((len = read_header(control, ss, &head)) || head) {
i64 u;
if (unlikely(len == -1))
return -1;
while ((len = read_header(ss, &head)) || head) {
switch (head) {
case 0:
u = unzip_literal(control, ss, len, &cksum);
if (unlikely(u == -1)) {
close_stream_in(control, ss);
return -1;
}
total += u;
total += unzip_literal(ss, len, fd_out, &cksum);
break;
default:
u = unzip_match(control, ss, len, &cksum, chunk_bytes);
if (unlikely(u == -1)) {
close_stream_in(control, ss);
return -1;
}
total += u;
total += unzip_match(ss, len, fd_out, fd_hist, &cksum, chunk_bytes);
break;
}
if (expected_size) {
p = 100 * ((double)(tally + total) / (double)expected_size);
if (p / 10 != l / 10) {
prog_done = (double)(tally + total) / (double)divisor[divisor_index];
print_progress("%3d%% %9.2f / %9.2f %s\r",
p, prog_done, prog_tsize, suffix[divisor_index] );
l = p;
}
p = 100 * ((double)(tally + total) / (double)expected_size);
if (p != l) {
prog_done = (double)(tally + total) / (double)divisor[divisor_index];
print_progress("%3d%% %9.2f / %9.2f %s\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b",
p, prog_done, prog_tsize, suffix[divisor_index] );
l = p;
}
}
if (!HAS_MD5) {
good_cksum = read_u32(control, ss, 0, &err);
if (unlikely(err)) {
close_stream_in(control, ss);
return -1;
}
if (unlikely(good_cksum != cksum)) {
close_stream_in(control, ss);
failure_return(("Bad checksum: 0x%08x - expected: 0x%08x\n", cksum, good_cksum), -1);
}
print_maxverbose("Checksum for block: 0x%08x\n", cksum);
}
good_cksum = read_u32(ss, 0);
if (unlikely(good_cksum != cksum))
fatal("Bad checksum 0x%08x - expected 0x%08x\n", cksum, good_cksum);
if (unlikely(close_stream_in(control, ss)))
if (unlikely(close_stream_in(ss)))
fatal("Failed to close stream!\n");
return total;
}
/* Decompress an open file. Call fatal_return(() on error
/* Decompress an open file. Call fatal() on error
return the number of bytes that have been retrieved
*/
i64 runzip_fd(rzip_control *control, int fd_in, int fd_hist, i64 expected_size)
i64 runzip_fd(int fd_in, int fd_out, int fd_hist, i64 expected_size)
{
uchar md5_stored[MD5_DIGEST_SIZE];
struct timeval start,end;
i64 total = 0, u;
double tdiff;
i64 total = 0;
if (!NO_MD5)
md5_init_ctx (&control->ctx);
gettimeofday(&start,NULL);
do {
u = runzip_chunk(control, fd_in, expected_size, total);
if (u < 1) {
if (u < 0 || total < expected_size) {
print_err("Failed to runzip_chunk in runzip_fd\n");
return -1;
}
}
total += u;
if (unlikely(!flush_tmpout(control))) {
print_err("Failed to flush_tmpout in runzip_fd\n");
return -1;
}
if (TMP_INBUF)
clear_tmpinbuf(control);
else if (STDIN && !DECOMPRESS) {
if (unlikely(!clear_tmpinfile(control))) {
print_err("Failed to clear_tmpinfile in runzip_fd\n");
return -1;
}
}
} while (total < expected_size || (!expected_size && !control->eof));
while (total < expected_size)
total += runzip_chunk(fd_in, fd_out, fd_hist, expected_size, total);
gettimeofday(&end,NULL);
if (!ENCRYPT) {
tdiff = end.tv_sec - start.tv_sec;
if (!tdiff)
tdiff = 1;
print_output("\nAverage DeCompression Speed: %6.3fMB/s\n",
(total / 1024 / 1024) / tdiff);
}
if (!NO_MD5) {
int i,j;
md5_finish_ctx (&control->ctx, control->md5_resblock);
if (HAS_MD5) {
i64 fdinend = seekto_fdinend(control);
if (unlikely(fdinend == -1))
failure_return(("Failed to seekto_fdinend in rzip_fd\n"), -1);
if (unlikely(seekto_fdin(control, fdinend - MD5_DIGEST_SIZE) == -1))
failure_return(("Failed to seekto_fdin in rzip_fd\n"), -1);
if (unlikely(read_1g(control, fd_in, md5_stored, MD5_DIGEST_SIZE) != MD5_DIGEST_SIZE))
fatal_return(("Failed to read md5 data in runzip_fd\n"), -1);
if (ENCRYPT)
if (unlikely(!lrz_decrypt(control, md5_stored, MD5_DIGEST_SIZE, control->salt_pass)))
return -1;
for (i = 0; i < MD5_DIGEST_SIZE; i++)
if (md5_stored[i] != control->md5_resblock[i]) {
print_output("MD5 CHECK FAILED.\nStored:");
for (j = 0; j < MD5_DIGEST_SIZE; j++)
print_output("%02x", md5_stored[j] & 0xFF);
print_output("\nOutput file:");
for (j = 0; j < MD5_DIGEST_SIZE; j++)
print_output("%02x", control->md5_resblock[j] & 0xFF);
failure_return(("\n"), -1);
}
}
if (HASH_CHECK || MAX_VERBOSE) {
print_output("MD5: ");
for (i = 0; i < MD5_DIGEST_SIZE; i++)
print_output("%02x", control->md5_resblock[i] & 0xFF);
print_output("\n");
}
if (CHECK_FILE) {
FILE *md5_fstream;
int i, j;
if (TMP_OUTBUF)
close_tmpoutbuf(control);
memcpy(md5_stored, control->md5_resblock, MD5_DIGEST_SIZE);
if (unlikely(seekto_fdhist(control, 0) == -1))
fatal_return(("Failed to seekto_fdhist in runzip_fd\n"), -1);
if (unlikely((md5_fstream = fdopen(fd_hist, "r")) == NULL))
fatal_return(("Failed to fdopen fd_hist in runzip_fd\n"), -1);
if (unlikely(md5_stream(md5_fstream, control->md5_resblock)))
fatal_return(("Failed to md5_stream in runzip_fd\n"), -1);
/* We don't close the file here as it's closed in main */
for (i = 0; i < MD5_DIGEST_SIZE; i++)
if (md5_stored[i] != control->md5_resblock[i]) {
print_output("MD5 CHECK FAILED.\nStored:");
for (j = 0; j < MD5_DIGEST_SIZE; j++)
print_output("%02x", md5_stored[j] & 0xFF);
print_output("\nOutput file:");
for (j = 0; j < MD5_DIGEST_SIZE; j++)
print_output("%02x", control->md5_resblock[j] & 0xFF);
failure_return(("\n"), -1);
}
print_output("MD5 integrity of written file matches archive\n");
if (!HAS_MD5)
print_output("Note this lrzip archive did not have a stored md5 value.\n"
"The archive decompression was validated with crc32 and the md5 hash was "
"calculated on decompression\n");
}
}
print_progress("\nAverage DeCompression Speed: %6.3fMB/s\n",
(total / 1024 / 1024) / (double)((end.tv_sec-start.tv_sec)? : 1));
return total;
}
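
Both sides of the runzip.c diff read chunk headers and match offsets as little endian integers of variable width: one side assembles the value byte by byte, the other reads the bytes in one call and applies `le64toh`. The sketch below shows the byte-by-byte form on an in-memory buffer, together with the width rule stated in the `runzip_chunk` comment. `decode_vchars` and `legacy_chunk_bytes` are hypothetical names for illustration, and returning 0 for "width is stored in the archive" is an assumption of this sketch, not the project's convention.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: decode `length` (1..8) little endian bytes from p
 * into a 64-bit value, least significant byte first. */
static int64_t decode_vchars(const unsigned char *p, int length)
{
    uint64_t s = 0;
    int bytes;

    assert(length >= 1 && length <= 8);
    for (bytes = 0; bytes < length; bytes++)
        s |= (uint64_t)p[bytes] << (bytes * 8);
    return (int64_t)s;
}

/* Hypothetical helper: fixed offset width used by old archive versions.
 * Versions before 0.4 used 4 bytes and version 0.4 used 8 bytes.
 * Returns 0 (sketch convention) for 0.5+, where the width is read
 * from the archive itself. */
static int legacy_chunk_bytes(int major, int minor)
{
    if (major == 0 && minor < 4)
        return 4;
    if (major == 0 && minor == 4)
        return 8;
    return 0;
}
```

Accumulating in an unsigned 64-bit value avoids shifting into the sign bit when all 8 bytes are used; the diffed code shifts a signed `i64` instead.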

@ -1,27 +0,0 @@
/*
Copyright (C) 2006-2011,2022 Con Kolivas
Copyright (C) 2011 Peter Hyman
Copyright (C) 1998-2003 Andrew Tridgell
This program is free software; you can redistribute it and/or modify
it under the terms of the GNU General Public License as published by
the Free Software Foundation; either version 2 of the License, or
(at your option) any later version.
This program is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
GNU General Public License for more details.
You should have received a copy of the GNU General Public License
along with this program. If not, see <http://www.gnu.org/licenses/>.
*/
#ifndef RUNZIP_H
#define RUNZIP_H
#include "lrzip_private.h"
i64 runzip_fd(rzip_control *control, int fd_in, int fd_hist, i64 expected_size);
#endif

1144
rzip.c

File diff suppressed because it is too large

291
rzip.h
@ -1,7 +1,6 @@
/*
Copyright (C) 2006-2016,2022 Con Kolivas
Copyright (C) 2011 Peter Hyman
Copyright (C) 1998 Andrew Tridgell
Copyright (C) Andrew Tridgell 1998,
Con Kolivas 2006-2010
This program is free software; you can redistribute it and/or modify
it under the terms of the GNU General Public License as published by
@ -14,13 +13,289 @@
GNU General Public License for more details.
You should have received a copy of the GNU General Public License
along with this program. If not, see <http://www.gnu.org/licenses/>.
along with this program; if not, write to the Free Software
Foundation, Inc., 675 Mass Ave, Cambridge, MA 02139, USA.
*/
#ifndef RZIP_H
#define RZIP_H
#include "lrzip_private.h"
#define LRZIP_MAJOR_VERSION 0
#define LRZIP_MINOR_VERSION 5
#define LRZIP_MINOR_SUBVERSION 2
void rzip_fd(rzip_control *control, int fd_in, int fd_out);
#define NUM_STREAMS 2
#include "config.h"
#include <sys/types.h>
#include <unistd.h>
#include <stdio.h>
#include <stddef.h>
#include <stdarg.h>
#include <stdlib.h>
#include <string.h>
#include <signal.h>
#include <bzlib.h>
#include <zlib.h>
#include <sys/resource.h>
#include <netinet/in.h>
#include <sys/time.h>
#include <sys/mman.h>
#include <sys/syscall.h>
#include <lzo/lzoconf.h>
#include <lzo/lzo1x.h>
/* LZMA C Wrapper */
#include "lzma/C/LzmaLib.h"
#ifdef HAVE_STRING_H
#include <string.h>
#endif
#ifdef HAVE_MALLOC_H
#include <malloc.h>
#endif
#include <fcntl.h>
#include <sys/stat.h>
#ifdef HAVE_CTYPE_H
#include <ctype.h>
#endif
#include <errno.h>
#include <sys/mman.h>
/* needed for CRC routines */
#include "lzma/C/7zCrc.h"
#ifndef uchar
#define uchar unsigned char
#endif
#ifndef int32
#if (SIZEOF_INT == 4)
#define int32 int
#elif (SIZEOF_LONG == 4)
#define int32 long
#elif (SIZEOF_SHORT == 4)
#define int32 short
#endif
#endif
#ifndef int16
#if (SIZEOF_INT == 2)
#define int16 int
#elif (SIZEOF_SHORT == 2)
#define int16 short
#endif
#endif
#ifndef uint32
#define uint32 unsigned int32
#endif
#ifndef uint16
#define uint16 unsigned int16
#endif
#ifndef MIN
#define MIN(a, b) ((a) < (b)? (a): (b))
#endif
#ifndef MAX
#define MAX(a, b) ((a) > (b)? (a): (b))
#endif
#if !HAVE_STRERROR
extern char *sys_errlist[];
#define strerror(i) sys_errlist[i]
#endif
#ifndef HAVE_ERRNO_DECL
extern int errno;
#endif
#define likely(x) __builtin_expect(!!(x), 1)
#define unlikely(x) __builtin_expect(!!(x), 0)
typedef long long int i64;
typedef uint16_t u16;
typedef uint32_t u32;
#ifndef MAP_ANONYMOUS
#define MAP_ANONYMOUS MAP_ANON
#endif
#if defined(NOTHREAD) || !defined(_SC_NPROCESSORS_ONLN)
#define PROCESSORS (1)
#else
#define PROCESSORS (sysconf(_SC_NPROCESSORS_ONLN))
#endif
#ifdef _SC_PAGE_SIZE
#define PAGE_SIZE (sysconf(_SC_PAGE_SIZE))
#else
#define PAGE_SIZE (4096)
#endif
#ifdef __APPLE__
#include <sys/sysctl.h>
#define fmemopen fake_fmemopen
#define open_memstream fake_open_memstream
#define memstream_update_buffer fake_open_memstream_update_buffer
#define mremap fake_mremap
static inline i64 get_ram(void)
{
int mib[2];
size_t len;
i64 *p, ramsize;
mib[0] = CTL_HW;
mib[1] = HW_MEMSIZE;
sysctl(mib, 2, NULL, &len, NULL, 0);
p = malloc(len);
sysctl(mib, 2, p, &len, NULL, 0);
ramsize = *p;
return ramsize;
}
#else /* __APPLE__ */
#define memstream_update_buffer(A, B, C) (0)
static inline i64 get_ram(void)
{
return (i64)sysconf(_SC_PHYS_PAGES) * (i64)sysconf(_SC_PAGE_SIZE);
}
#endif
#define FLAG_SHOW_PROGRESS 2
#define FLAG_KEEP_FILES 4
#define FLAG_TEST_ONLY 8
#define FLAG_FORCE_REPLACE 16
#define FLAG_DECOMPRESS 32
#define FLAG_NO_COMPRESS 64
#define FLAG_LZO_COMPRESS 128
#define FLAG_BZIP2_COMPRESS 256
#define FLAG_ZLIB_COMPRESS 512
#define FLAG_ZPAQ_COMPRESS 1024
#define FLAG_VERBOSITY 2048
#define FLAG_VERBOSITY_MAX 4096
#define FLAG_NO_SET_PERMS 8192
#define FLAG_STDIN 16384
#define FLAG_STDOUT 32768
#define FLAG_INFO 65536
#define FLAG_MAXRAM 131072
#define FLAG_UNLIMITED 262144
#define FLAG_VERBOSE (FLAG_VERBOSITY | FLAG_VERBOSITY_MAX)
#define FLAG_NOT_LZMA (FLAG_NO_COMPRESS | FLAG_LZO_COMPRESS | FLAG_BZIP2_COMPRESS | FLAG_ZLIB_COMPRESS | FLAG_ZPAQ_COMPRESS)
#define LZMA_COMPRESS (!(control.flags & FLAG_NOT_LZMA))
#define SHOW_PROGRESS (control.flags & FLAG_SHOW_PROGRESS)
#define KEEP_FILES (control.flags & FLAG_KEEP_FILES)
#define TEST_ONLY (control.flags & FLAG_TEST_ONLY)
#define FORCE_REPLACE (control.flags & FLAG_FORCE_REPLACE)
#define DECOMPRESS (control.flags & FLAG_DECOMPRESS)
#define NO_COMPRESS (control.flags & FLAG_NO_COMPRESS)
#define LZO_COMPRESS (control.flags & FLAG_LZO_COMPRESS)
#define BZIP2_COMPRESS (control.flags & FLAG_BZIP2_COMPRESS)
#define ZLIB_COMPRESS (control.flags & FLAG_ZLIB_COMPRESS)
#define ZPAQ_COMPRESS (control.flags & FLAG_ZPAQ_COMPRESS)
#define VERBOSE (control.flags & FLAG_VERBOSE)
#define VERBOSITY (control.flags & FLAG_VERBOSITY)
#define MAX_VERBOSE (control.flags & FLAG_VERBOSITY_MAX)
#define NO_SET_PERMS (control.flags & FLAG_NO_SET_PERMS)
#define STDIN (control.flags & FLAG_STDIN)
#define STDOUT (control.flags & FLAG_STDOUT)
#define INFO (control.flags & FLAG_INFO)
#define MAXRAM (control.flags & FLAG_MAXRAM)
#define UNLIMITED (control.flags & FLAG_UNLIMITED)
#define BITS32 (sizeof(long) == 4)
#define CTYPE_NONE 3
#define CTYPE_BZIP2 4
#define CTYPE_LZO 5
#define CTYPE_LZMA 6
#define CTYPE_GZIP 7
#define CTYPE_ZPAQ 8
struct rzip_control {
char *infile;
char *outname;
char *outfile;
char *outdir;
FILE *msgout; //stream for output messages
const char *suffix;
int compression_level;
unsigned char lzma_properties[5]; // lzma properties, encoded
double threshold;
unsigned long long window;
unsigned long flags;
unsigned long long ramsize;
unsigned long threads;
int nice_val; // added for consistency
int major_version;
int minor_version;
i64 st_size;
long page_size;
} control;
struct stream {
i64 last_head;
uchar *buf;
i64 buflen;
i64 bufp;
};
struct stream_info {
struct stream *s;
int num_streams;
int fd;
i64 bufsize;
i64 cur_pos;
i64 initial_pos;
i64 total_read;
};
void fatal(const char *format, ...);
void sighandler();
i64 runzip_fd(int fd_in, int fd_out, int fd_hist, i64 expected_size);
void rzip_fd(int fd_in, int fd_out);
void *open_stream_out(int f, int n, i64 limit);
void *open_stream_in(int f, int n);
int write_stream(void *ss, int stream, uchar *p, i64 len);
i64 read_stream(void *ss, int stream, uchar *p, i64 len);
int close_stream_out(void *ss);
int close_stream_in(void *ss);
int flush_buffer(struct stream_info *sinfo, int stream);
void read_config(struct rzip_control *s);
ssize_t write_1g(int fd, void *buf, i64 len);
ssize_t read_1g(int fd, void *buf, i64 len);
void zpipe_compress(FILE *in, FILE *out, FILE *msgout, long long int buf_len, int progress);
void zpipe_decompress(FILE *in, FILE *out, FILE *msgout, long long int buf_len, int progress);
const i64 two_gig;
#define print_err(format, args...) do {\
fprintf(stderr, format, ##args); \
} while (0)
#define print_output(format, args...) do {\
fprintf(control.msgout, format, ##args); \
fflush(control.msgout); \
} while (0)
#define print_progress(format, args...) do {\
if (SHOW_PROGRESS) \
print_output(format, ##args); \
} while (0)
#define print_verbose(format, args...) do {\
if (VERBOSE) \
print_output(format, ##args); \
} while (0)
#define print_maxverbose(format, args...) do {\
if (MAX_VERBOSE) \
print_output(format, ##args); \
} while (0)
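
The v0.5.2 rzip.h above keeps all options in one `flags` bitmask on a global `control` struct and tests them through macros such as `VERBOSE` and `LZMA_COMPRESS`. The sketch below reproduces that pattern on a local struct so it can be tried in isolation. The flag values are copied from the header; `struct demo_control`, `is_verbose` and `is_lzma` are hypothetical names used only here.

```c
#include <assert.h>

/* Flag values copied from the rzip.h hunk above. */
#define FLAG_NO_COMPRESS    64
#define FLAG_LZO_COMPRESS   128
#define FLAG_BZIP2_COMPRESS 256
#define FLAG_ZLIB_COMPRESS  512
#define FLAG_ZPAQ_COMPRESS  1024
#define FLAG_VERBOSITY      2048
#define FLAG_VERBOSITY_MAX  4096
#define FLAG_VERBOSE  (FLAG_VERBOSITY | FLAG_VERBOSITY_MAX)
#define FLAG_NOT_LZMA (FLAG_NO_COMPRESS | FLAG_LZO_COMPRESS | \
                       FLAG_BZIP2_COMPRESS | FLAG_ZLIB_COMPRESS | \
                       FLAG_ZPAQ_COMPRESS)

/* Hypothetical stand-in for the global `control` struct. */
struct demo_control {
    unsigned long flags;
};

/* True when either verbosity level is set (what VERBOSE tests). */
static int is_verbose(const struct demo_control *c)
{
    assert(c != 0);
    return (c->flags & FLAG_VERBOSE) != 0;
}

/* LZMA is the default: it is selected when no other backend flag is set
 * (what LZMA_COMPRESS tests). */
static int is_lzma(const struct demo_control *c)
{
    assert(c != 0);
    return !(c->flags & FLAG_NOT_LZMA);
}
```

Passing the struct by pointer rather than reading a global matches the direction of the master side of this diff, where functions take an `rzip_control *control` argument.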

324
sha4.c
@ -1,324 +0,0 @@
/*
* FIPS-180-2 compliant SHA-384/512 implementation
*
* Copyright (C) 2011, Con Kolivas <kernel@kolivas.org>
* Copyright (C) 2006-2010, Brainspark B.V.
*
* This file is part of PolarSSL (http://www.polarssl.org)
* Lead Maintainer: Paul Bakker <polarssl_maintainer at polarssl.org>
*
* All rights reserved.
*
* This program is free software; you can redistribute it and/or modify
* it under the terms of the GNU General Public License as published by
* the Free Software Foundation; either version 2 of the License, or
* (at your option) any later version.
*
* This program is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
* GNU General Public License for more details.
*
* You should have received a copy of the GNU General Public License along
* with this program; if not, write to the Free Software Foundation, Inc.,
* 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA.
*/
/*
* The SHA-512 Secure Hash Standard was published by NIST in 2002.
*
* http://csrc.nist.gov/publications/fips/fips180-2/fips180-2.pdf
*/
#include "sha4.h"
#include <string.h>
#include <stdio.h>
/*
* 64-bit integer manipulation macros (big endian)
*/
#ifndef GET_UINT64_BE
#define GET_UINT64_BE(n,b,i) \
{ \
(n) = ( (unsigned int64) (b)[(i) ] << 56 ) \
| ( (unsigned int64) (b)[(i) + 1] << 48 ) \
| ( (unsigned int64) (b)[(i) + 2] << 40 ) \
| ( (unsigned int64) (b)[(i) + 3] << 32 ) \
| ( (unsigned int64) (b)[(i) + 4] << 24 ) \
| ( (unsigned int64) (b)[(i) + 5] << 16 ) \
| ( (unsigned int64) (b)[(i) + 6] << 8 ) \
| ( (unsigned int64) (b)[(i) + 7] ); \
}
#endif
#ifndef PUT_UINT64_BE
#define PUT_UINT64_BE(n,b,i) \
{ \
(b)[(i) ] = (unsigned char) ( (n) >> 56 ); \
(b)[(i) + 1] = (unsigned char) ( (n) >> 48 ); \
(b)[(i) + 2] = (unsigned char) ( (n) >> 40 ); \
(b)[(i) + 3] = (unsigned char) ( (n) >> 32 ); \
(b)[(i) + 4] = (unsigned char) ( (n) >> 24 ); \
(b)[(i) + 5] = (unsigned char) ( (n) >> 16 ); \
(b)[(i) + 6] = (unsigned char) ( (n) >> 8 ); \
(b)[(i) + 7] = (unsigned char) ( (n) ); \
}
#endif
/*
* Round constants
*/
static const unsigned int64 K[80] =
{
UL64(0x428A2F98D728AE22), UL64(0x7137449123EF65CD),
UL64(0xB5C0FBCFEC4D3B2F), UL64(0xE9B5DBA58189DBBC),
UL64(0x3956C25BF348B538), UL64(0x59F111F1B605D019),
UL64(0x923F82A4AF194F9B), UL64(0xAB1C5ED5DA6D8118),
UL64(0xD807AA98A3030242), UL64(0x12835B0145706FBE),
UL64(0x243185BE4EE4B28C), UL64(0x550C7DC3D5FFB4E2),
UL64(0x72BE5D74F27B896F), UL64(0x80DEB1FE3B1696B1),
UL64(0x9BDC06A725C71235), UL64(0xC19BF174CF692694),
UL64(0xE49B69C19EF14AD2), UL64(0xEFBE4786384F25E3),
UL64(0x0FC19DC68B8CD5B5), UL64(0x240CA1CC77AC9C65),
UL64(0x2DE92C6F592B0275), UL64(0x4A7484AA6EA6E483),
UL64(0x5CB0A9DCBD41FBD4), UL64(0x76F988DA831153B5),
UL64(0x983E5152EE66DFAB), UL64(0xA831C66D2DB43210),
UL64(0xB00327C898FB213F), UL64(0xBF597FC7BEEF0EE4),
UL64(0xC6E00BF33DA88FC2), UL64(0xD5A79147930AA725),
UL64(0x06CA6351E003826F), UL64(0x142929670A0E6E70),
UL64(0x27B70A8546D22FFC), UL64(0x2E1B21385C26C926),
UL64(0x4D2C6DFC5AC42AED), UL64(0x53380D139D95B3DF),
UL64(0x650A73548BAF63DE), UL64(0x766A0ABB3C77B2A8),
UL64(0x81C2C92E47EDAEE6), UL64(0x92722C851482353B),
UL64(0xA2BFE8A14CF10364), UL64(0xA81A664BBC423001),
UL64(0xC24B8B70D0F89791), UL64(0xC76C51A30654BE30),
UL64(0xD192E819D6EF5218), UL64(0xD69906245565A910),
UL64(0xF40E35855771202A), UL64(0x106AA07032BBD1B8),
UL64(0x19A4C116B8D2D0C8), UL64(0x1E376C085141AB53),
UL64(0x2748774CDF8EEB99), UL64(0x34B0BCB5E19B48A8),
UL64(0x391C0CB3C5C95A63), UL64(0x4ED8AA4AE3418ACB),
UL64(0x5B9CCA4F7763E373), UL64(0x682E6FF3D6B2B8A3),
UL64(0x748F82EE5DEFB2FC), UL64(0x78A5636F43172F60),
UL64(0x84C87814A1F0AB72), UL64(0x8CC702081A6439EC),
UL64(0x90BEFFFA23631E28), UL64(0xA4506CEBDE82BDE9),
UL64(0xBEF9A3F7B2C67915), UL64(0xC67178F2E372532B),
UL64(0xCA273ECEEA26619C), UL64(0xD186B8C721C0C207),
UL64(0xEADA7DD6CDE0EB1E), UL64(0xF57D4F7FEE6ED178),
UL64(0x06F067AA72176FBA), UL64(0x0A637DC5A2C898A6),
UL64(0x113F9804BEF90DAE), UL64(0x1B710B35131C471B),
UL64(0x28DB77F523047D84), UL64(0x32CAAB7B40C72493),
UL64(0x3C9EBE0A15C9BEBC), UL64(0x431D67C49C100D4C),
UL64(0x4CC5D4BECB3E42B6), UL64(0x597F299CFC657E2A),
UL64(0x5FCB6FAB3AD6FAEC), UL64(0x6C44198C4A475817)
};
/*
* SHA-512 context setup
*/
void sha4_starts( sha4_context *ctx, int is384 )
{
ctx->total[0] = 0;
ctx->total[1] = 0;
if( is384 == 0 )
{
/* SHA-512 */
ctx->state[0] = UL64(0x6A09E667F3BCC908);
ctx->state[1] = UL64(0xBB67AE8584CAA73B);
ctx->state[2] = UL64(0x3C6EF372FE94F82B);
ctx->state[3] = UL64(0xA54FF53A5F1D36F1);
ctx->state[4] = UL64(0x510E527FADE682D1);
ctx->state[5] = UL64(0x9B05688C2B3E6C1F);
ctx->state[6] = UL64(0x1F83D9ABFB41BD6B);
ctx->state[7] = UL64(0x5BE0CD19137E2179);
}
else
{
/* SHA-384 */
ctx->state[0] = UL64(0xCBBB9D5DC1059ED8);
ctx->state[1] = UL64(0x629A292A367CD507);
ctx->state[2] = UL64(0x9159015A3070DD17);
ctx->state[3] = UL64(0x152FECD8F70E5939);
ctx->state[4] = UL64(0x67332667FFC00B31);
ctx->state[5] = UL64(0x8EB44A8768581511);
ctx->state[6] = UL64(0xDB0C2E0D64F98FA7);
ctx->state[7] = UL64(0x47B5481DBEFA4FA4);
}
ctx->is384 = is384;
}
static void sha4_process( sha4_context *ctx, const unsigned char data[128] )
{
int i;
unsigned int64 temp1, temp2, W[80];
unsigned int64 A, B, C, D, E, F, G, H;
#define SHR(x,n) (x >> n)
#define ROTR(x,n) (SHR(x,n) | (x << (64 - n)))
#define S0(x) (ROTR(x, 1) ^ ROTR(x, 8) ^ SHR(x, 7))
#define S1(x) (ROTR(x,19) ^ ROTR(x,61) ^ SHR(x, 6))
#define S2(x) (ROTR(x,28) ^ ROTR(x,34) ^ ROTR(x,39))
#define S3(x) (ROTR(x,14) ^ ROTR(x,18) ^ ROTR(x,41))
#define F0(x,y,z) ((x & y) | (z & (x | y)))
#define F1(x,y,z) (z ^ (x & (y ^ z)))
#define P(a,b,c,d,e,f,g,h,x,K) \
{ \
temp1 = h + S3(e) + F1(e,f,g) + K + x; \
temp2 = S2(a) + F0(a,b,c); \
d += temp1; h = temp1 + temp2; \
}
for( i = 0; i < 16; i++ )
{
GET_UINT64_BE( W[i], data, i << 3 );
}
for( ; i < 80; i++ )
{
W[i] = S1(W[i - 2]) + W[i - 7] +
S0(W[i - 15]) + W[i - 16];
}
A = ctx->state[0];
B = ctx->state[1];
C = ctx->state[2];
D = ctx->state[3];
E = ctx->state[4];
F = ctx->state[5];
G = ctx->state[6];
H = ctx->state[7];
i = 0;
do
{
P( A, B, C, D, E, F, G, H, W[i], K[i] ); i++;
P( H, A, B, C, D, E, F, G, W[i], K[i] ); i++;
P( G, H, A, B, C, D, E, F, W[i], K[i] ); i++;
P( F, G, H, A, B, C, D, E, W[i], K[i] ); i++;
P( E, F, G, H, A, B, C, D, W[i], K[i] ); i++;
P( D, E, F, G, H, A, B, C, W[i], K[i] ); i++;
P( C, D, E, F, G, H, A, B, W[i], K[i] ); i++;
P( B, C, D, E, F, G, H, A, W[i], K[i] ); i++;
}
while( i < 80 );
ctx->state[0] += A;
ctx->state[1] += B;
ctx->state[2] += C;
ctx->state[3] += D;
ctx->state[4] += E;
ctx->state[5] += F;
ctx->state[6] += G;
ctx->state[7] += H;
}
/*
* SHA-512 process buffer
*/
void sha4_update( sha4_context *ctx, const unsigned char *input, int ilen )
{
int fill;
unsigned int64 left;
if( ilen <= 0 )
return;
left = ctx->total[0] & 0x7F;
fill = (int)( 128 - left );
ctx->total[0] += ilen;
if( ctx->total[0] < (unsigned int64) ilen )
ctx->total[1]++;
if( left && ilen >= fill )
{
memcpy( (void *) (ctx->buffer + left),
(void *) input, fill );
sha4_process( ctx, ctx->buffer );
input += fill;
ilen -= fill;
left = 0;
}
while( ilen >= 128 )
{
sha4_process( ctx, input );
input += 128;
ilen -= 128;
}
if( ilen > 0 )
{
memcpy( (void *) (ctx->buffer + left),
(void *) input, ilen );
}
}
static const unsigned char sha4_padding[128] =
{
0x80, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0
};
/*
* SHA-512 final digest
*/
void sha4_finish( sha4_context *ctx, unsigned char output[64] )
{
int last, padn;
unsigned int64 high, low;
unsigned char msglen[16];
high = ( ctx->total[0] >> 61 )
| ( ctx->total[1] << 3 );
low = ( ctx->total[0] << 3 );
PUT_UINT64_BE( high, msglen, 0 );
PUT_UINT64_BE( low, msglen, 8 );
last = (int)( ctx->total[0] & 0x7F );
padn = ( last < 112 ) ? ( 112 - last ) : ( 240 - last );
sha4_update( ctx, (unsigned char *) sha4_padding, padn );
sha4_update( ctx, msglen, 16 );
PUT_UINT64_BE( ctx->state[0], output, 0 );
PUT_UINT64_BE( ctx->state[1], output, 8 );
PUT_UINT64_BE( ctx->state[2], output, 16 );
PUT_UINT64_BE( ctx->state[3], output, 24 );
PUT_UINT64_BE( ctx->state[4], output, 32 );
PUT_UINT64_BE( ctx->state[5], output, 40 );
if( ctx->is384 == 0 )
{
PUT_UINT64_BE( ctx->state[6], output, 48 );
PUT_UINT64_BE( ctx->state[7], output, 56 );
}
}
/*
* output = SHA-512( input buffer )
*/
void sha4( const unsigned char *input, int ilen,
unsigned char output[64], int is384 )
{
sha4_context ctx;
sha4_starts( &ctx, is384 );
sha4_update( &ctx, input, ilen );
sha4_finish( &ctx, output );
memset( &ctx, 0, sizeof( sha4_context ) );
}
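
Note on the padding arithmetic in sha4_finish() above: the padding length is chosen so that data, padding, and the 16-byte length field together end on a 128-byte block boundary, and the byte count kept in total[0]/total[1] is converted to a 128-bit bit count by shifting left by 3. The sketch below restates just that arithmetic in isolation. The helper names sha512_pad_len and sha512_bit_length are hypothetical and are not part of sha4.c.

```c
#include <stdint.h>

/* Hypothetical helper (not in sha4.c): number of padding bytes appended
 * before the 16-byte length field, given the total bytes hashed so far.
 * Mirrors the "padn" expression in sha4_finish(). */
static int sha512_pad_len(uint64_t total_bytes)
{
	/* Bytes already sitting in the current 128-byte block. */
	int last = (int)(total_bytes & 0x7F);

	/* Pad up to offset 112 of this block, or of the next block when
	 * fewer than 16 bytes remain for the length field. */
	return (last < 112) ? (112 - last) : (240 - last);
}

/* Hypothetical helper (not in sha4.c): convert a 128-bit byte count,
 * held as (lo, hi), into a 128-bit bit count, i.e. multiply by 8. */
static void sha512_bit_length(uint64_t lo, uint64_t hi,
			      uint64_t *bits_hi, uint64_t *bits_lo)
{
	*bits_hi = (lo >> 61) | (hi << 3);	/* top 3 bits of lo carry into hi */
	*bits_lo = lo << 3;
}
```

For any input length, total + sha512_pad_len(total) + 16 should be a multiple of 128, which is why sha4_finish() can call sha4_update() twice and leave no partial block behind.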

96
sha4.h

@ -1,96 +0,0 @@
/**
* \file sha4.h
*
* Copyright (C) 2011, Con Kolivas <kernel@kolivas.org>
* Copyright (C) 2006-2010, Brainspark B.V.
*
* This file is part of PolarSSL (http://www.polarssl.org)
* Lead Maintainer: Paul Bakker <polarssl_maintainer at polarssl.org>
*
* All rights reserved.
*
* This program is free software; you can redistribute it and/or modify
* it under the terms of the GNU General Public License as published by
* the Free Software Foundation; either version 2 of the License, or
* (at your option) any later version.
*
* This program is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
* GNU General Public License for more details.
*
* You should have received a copy of the GNU General Public License along
* with this program; if not, write to the Free Software Foundation, Inc.,
* 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA.
*/
#ifndef POLARSSL_SHA4_H
#define POLARSSL_SHA4_H
#if defined(_MSC_VER) || defined(__WATCOMC__)
#define UL64(x) x##ui64
#define int64 __int64
#else
#define UL64(x) x##ULL
#define int64 long long
#endif
/**
* \brief SHA-512 context structure
*/
typedef struct
{
unsigned int64 total[2]; /*!< number of bytes processed */
unsigned int64 state[8]; /*!< intermediate digest state */
unsigned char buffer[128]; /*!< data block being processed */
unsigned char ipad[128]; /*!< HMAC: inner padding */
unsigned char opad[128]; /*!< HMAC: outer padding */
int is384; /*!< 0 => SHA-512, else SHA-384 */
}
sha4_context;
#ifdef __cplusplus
extern "C" {
#endif
/**
* \brief SHA-512 context setup
*
* \param ctx context to be initialized
* \param is384 0 = use SHA512, 1 = use SHA384
*/
void sha4_starts( sha4_context *ctx, int is384 );
/**
* \brief SHA-512 process buffer
*
* \param ctx SHA-512 context
* \param input buffer holding the data
* \param ilen length of the input data
*/
void sha4_update( sha4_context *ctx, const unsigned char *input, int ilen );
/**
* \brief SHA-512 final digest
*
* \param ctx SHA-512 context
* \param output SHA-384/512 checksum result
*/
void sha4_finish( sha4_context *ctx, unsigned char output[64] );
/**
* \brief Output = SHA-512( input buffer )
*
* \param input buffer holding the data
* \param ilen length of the input data
* \param output SHA-384/512 checksum result
* \param is384 0 = use SHA512, 1 = use SHA384
*/
void sha4( const unsigned char *input, int ilen,
unsigned char output[64], int is384 );
#ifdef __cplusplus
}
#endif
#endif /* sha4.h */

1984
stream.c

File diff suppressed because it is too large


@ -1,46 +0,0 @@
/*
Copyright (C) 2006-2016 Con Kolivas
Copyright (C) 2011 Peter Hyman
Copyright (C) 1998-2003 Andrew Tridgell
This program is free software; you can redistribute it and/or modify
it under the terms of the GNU General Public License as published by
the Free Software Foundation; either version 2 of the License, or
(at your option) any later version.
This program is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
GNU General Public License for more details.
You should have received a copy of the GNU General Public License
along with this program. If not, see <http://www.gnu.org/licenses/>.
*/
#ifndef LRZIP_STREAM_H
#define LRZIP_STREAM_H
#include "lrzip_private.h"
#include <pthread.h>
bool create_pthread(rzip_control *control, pthread_t *thread, pthread_attr_t * attr,
void * (*start_routine)(void *), void *arg);
bool join_pthread(pthread_t th, void **thread_return);
bool init_mutex(rzip_control *control, pthread_mutex_t *mutex);
bool unlock_mutex(rzip_control *control, pthread_mutex_t *mutex);
bool lock_mutex(rzip_control *control, pthread_mutex_t *mutex);
ssize_t write_1g(rzip_control *control, void *buf, i64 len);
ssize_t read_1g(rzip_control *control, int fd, void *buf, i64 len);
i64 get_readseek(rzip_control *control, int fd);
bool prepare_streamout_threads(rzip_control *control);
bool close_streamout_threads(rzip_control *control);
void *open_stream_out(rzip_control *control, int f, unsigned int n, i64 chunk_limit, char cbytes);
void *open_stream_in(rzip_control *control, int f, int n, char cbytes);
void flush_buffer(rzip_control *control, struct stream_info *sinfo, int stream);
void write_stream(rzip_control *control, void *ss, int streamno, uchar *p, i64 len);
i64 read_stream(rzip_control *control, void *ss, int streamno, uchar *p, i64 len);
int close_stream_out(rzip_control *control, void *ss);
int close_stream_in(rzip_control *control, void *ss);
ssize_t put_fdout(rzip_control *control, void *offset_buf, ssize_t ret);
#endif

433
util.c

@ -1,8 +1,6 @@
/*
Copyright (C) 2006-2016,2021-2022 Con Kolivas
Copyright (C) 2011 Serge Belyshev
Copyright (C) 2008, 2011 Peter Hyman
Copyright (C) 1998 Andrew Tridgell
Copyright (C) Andrew Tridgell 1998
Con Kolivas 2006-2010
This program is free software; you can redistribute it and/or modify
it under the terms of the GNU General Public License as published by
@ -15,7 +13,8 @@
GNU General Public License for more details.
You should have received a copy of the GNU General Public License
along with this program. If not, see <http://www.gnu.org/licenses/>.
along with this program; if not, write to the Free Software
Foundation, Inc., 675 Mass Ave, Cambridge, MA 02139, USA.
*/
/*
@ -31,194 +30,81 @@
* Peter Hyman, December 2008
*/
#ifdef HAVE_CONFIG_H
# include "config.h"
#endif
#include "rzip.h"
#include <stdarg.h>
#ifdef HAVE_UNISTD_H
# include <unistd.h>
#endif
#include <termios.h>
#ifdef _SC_PAGE_SIZE
# define PAGE_SIZE (sysconf(_SC_PAGE_SIZE))
#else
# define PAGE_SIZE (4096)
#endif
#include <sys/types.h>
#include <sys/stat.h>
#include <sys/mman.h>
#include <fcntl.h>
#include "lrzip_private.h"
#include "util.h"
#include "sha4.h"
#include "aes.h"
#ifdef HAVE_CTYPE_H
# include <ctype.h>
#endif
/* Macros for testing parameters */
#define isparameter( parmstring, value ) (!strcasecmp( parmstring, value ))
#define iscaseparameter( parmvalue, value ) (!strcmp( parmvalue, value ))
void register_infile(rzip_control *control, const char *name, char delete)
void fatal(const char *format, ...)
{
control->util_infile = name;
control->delete_infile = delete;
}
va_list ap;
void register_outfile(rzip_control *control, const char *name, char delete)
{
control->util_outfile = name;
control->delete_outfile = delete;
}
void register_outputfile(rzip_control *control, FILE *f)
{
control->outputfile = f;
}
void unlink_files(rzip_control *control)
{
/* Delete temporary files generated for testing or faking stdio */
if (control->util_outfile && control->delete_outfile)
unlink(control->util_outfile);
if (control->util_infile && control->delete_infile)
unlink(control->util_infile);
}
void fatal_exit(rzip_control *control)
{
struct termios termios_p;
/* Make sure we haven't died after disabling stdin echo */
tcgetattr(fileno(stdin), &termios_p);
termios_p.c_lflag |= ECHO;
tcsetattr(fileno(stdin), 0, &termios_p);
unlink_files(control);
if (!STDOUT && !TEST_ONLY && control->outfile) {
if (!KEEP_BROKEN) {
print_verbose("Deleting broken file %s\n", control->outfile);
unlink(control->outfile);
} else
print_verbose("Keeping broken file %s as requested\n", control->outfile);
if (format) {
va_start(ap, format);
vfprintf(stderr, format, ap);
va_end(ap);
}
fprintf(control->outputfile, "Fatal error - exiting\n");
fflush(control->outputfile);
/* Delete temporary files generated for testing or faking stdio */
if (TEST_ONLY || STDOUT)
unlink(control.outfile);
if (DECOMPRESS && STDIN)
unlink(control.infile);
perror(NULL);
print_output("Fatal error - exiting\n");
exit(1);
}
void setup_overhead(rzip_control *control)
void sighandler()
{
/* Work out the compression overhead per compression thread for the
* compression back-ends that need a lot of ram */
if (LZMA_COMPRESS) {
int level = control->compression_level * 7 / 9;
/* Delete temporary files generated for testing or faking stdio */
if (TEST_ONLY || STDOUT)
unlink(control.outfile);
if (!level)
level = 1;
i64 dictsize = (level <= 5 ? (1 << (level * 2 + 14)) :
(level == 6 ? (1 << 25) : (1 << 26)));
if (DECOMPRESS && STDIN)
unlink(control.infile);
control->overhead = (dictsize * 23 / 2) + (6 * 1024 * 1024) + 16384;
/* LZMA spec shows memory requirements as 6MB, not 4MB and state size
* where default is 16KB */
} else if (ZPAQ_COMPRESS)
control->overhead = 112 * 1024 * 1024;
exit(0);
}
void setup_ram(rzip_control *control)
{
/* Use less ram when using STDOUT to store the temporary output file. */
if (STDOUT && ((STDIN && DECOMPRESS) || !(DECOMPRESS || TEST_ONLY)))
control->maxram = control->ramsize / 6;
else
control->maxram = control->ramsize / 3;
if (BITS32) {
/* Decrease usable ram size on 32 bits due to kernel /
* userspace split. Cannot allocate larger than a 1
* gigabyte chunk due to 32 bit signed long being
* used in alloc, and at most 3GB can be malloced, and
* 2/3 of that makes for a total of 2GB to be split
* into thirds.
*/
control->usable_ram = MAX(control->ramsize - 900000000ll, 900000000ll);
control->maxram = MIN(control->maxram, control->usable_ram);
control->maxram = MIN(control->maxram, one_g * 2 / 3);
} else
control->usable_ram = control->maxram;
round_to_page(&control->maxram);
}
void round_to_page(i64 *size)
{
*size -= *size % PAGE_SIZE;
if (unlikely(!*size))
*size = PAGE_SIZE;
}
size_t round_up_page(rzip_control *control, size_t len)
{
int rem = len % control->page_size;
if (rem)
len += control->page_size - rem;
return len;
}
bool get_rand(rzip_control *control, uchar *buf, int len)
{
int fd, i;
fd = open("/dev/urandom", O_RDONLY);
if (fd == -1) {
for (i = 0; i < len; i++)
buf[i] = (uchar)random();
} else {
if (unlikely(read(fd, buf, len) != len))
fatal_return(("Failed to read fd in get_rand\n"), false);
if (unlikely(close(fd)))
fatal_return(("Failed to close fd in get_rand\n"), false);
}
return true;
}
bool read_config(rzip_control *control)
void read_config( struct rzip_control *control )
{
/* check for lrzip.conf in ., $HOME/.lrzip and /etc/lrzip */
char *HOME, homeconf[255];
char *parametervalue;
char *parameter;
char line[255];
FILE *fp;
char *parameter;
char *parametervalue;
char *line, *s;
char *HOME, *homeconf;
line = malloc(255);
homeconf = malloc(255);
if (line == NULL || homeconf == NULL)
fatal("Fatal Memory Error in read_config");
fp = fopen("lrzip.conf", "r");
if (fp)
fprintf(control->msgout, "Using configuration file ./lrzip.conf\n");
if (fp == NULL) {
fp = fopen("/etc/lrzip/lrzip.conf", "r");
if (fp)
fprintf(control->msgout, "Using configuration file /etc/lrzip/lrzip.conf\n");
}
if (fp == NULL) {
HOME=getenv("HOME");
if (HOME) {
snprintf(homeconf, sizeof(homeconf), "%s/.lrzip/lrzip.conf", HOME);
strcpy(homeconf, HOME);
strcat(homeconf,"/.lrzip/lrzip.conf");
fp = fopen(homeconf, "r");
if (fp)
fprintf(control->msgout, "Using configuration file %s\n", homeconf);
}
}
if (fp == NULL) {
fp = fopen("/etc/lrzip/lrzip.conf", "r");
if (fp)
fprintf(control->msgout, "Using configuration file /etc/lrzip/lrzip.conf\n");
}
if (fp == NULL)
return false;
return;
/* if we get here, we have a file. read until no more. */
while ((fgets(line, 255, fp)) != NULL) {
while ((s = fgets(line, 255, fp)) != NULL) {
if (strlen(line))
line[strlen(line) - 1] = '\0';
parameter = strtok(line, " =");
@ -236,96 +122,71 @@ bool read_config(rzip_control *control)
/* have valid parameter line, now assign to control */
if (isparameter(parameter, "window"))
if (!strcasecmp(parameter, "window"))
control->window = atoi(parametervalue);
else if (isparameter(parameter, "unlimited")) {
if (isparameter(parametervalue, "yes"))
control->flags |= FLAG_UNLIMITED;
} else if (isparameter(parameter, "compressionlevel")) {
else if (!strcasecmp(parameter, "compressionlevel")) {
control->compression_level = atoi(parametervalue);
if ( control->compression_level < 1 || control->compression_level > 9 )
failure_return(("CONF.FILE error. Compression Level must be between 1 and 9"), false);
} else if (isparameter(parameter, "compressionmethod")) {
/* valid are rzip, gzip, bzip2, lzo, lzma (default), and zpaq */
fatal("CONF.FILE error. Compression Level must be between 1 and 9");
} else if (!strcasecmp(parameter, "compressionmethod")) {
/* valid are rzip, gzip, bzip2, lzo, lzma (default) */
if (control->flags & FLAG_NOT_LZMA)
failure_return(("CONF.FILE error. Can only specify one compression method"), false);
if (isparameter(parametervalue, "bzip2"))
fatal("CONF.FILE error. Can only specify one compression method");
if (!strcasecmp(parametervalue, "bzip2"))
control->flags |= FLAG_BZIP2_COMPRESS;
else if (isparameter(parametervalue, "gzip"))
else if (!strcasecmp(parametervalue, "gzip"))
control->flags |= FLAG_ZLIB_COMPRESS;
else if (isparameter(parametervalue, "lzo"))
else if (!strcasecmp(parametervalue, "lzo"))
control->flags |= FLAG_LZO_COMPRESS;
else if (isparameter(parametervalue, "rzip"))
else if (!strcasecmp(parametervalue, "rzip"))
control->flags |= FLAG_NO_COMPRESS;
else if (isparameter(parametervalue, "zpaq"))
else if (!strcasecmp(parametervalue, "zpaq"))
control->flags |= FLAG_ZPAQ_COMPRESS;
else if (!isparameter(parametervalue, "lzma")) /* oops, not lzma! */
failure_return(("CONF.FILE error. Invalid compression method %s specified\n",parametervalue), false);
} else if (isparameter(parameter, "lzotest")) {
/* default is yes */
if (isparameter(parametervalue, "no"))
control->flags &= ~FLAG_THRESHOLD;
} else if (isparameter(parameter, "hashcheck")) {
if (isparameter(parametervalue, "yes")) {
control->flags |= FLAG_CHECK;
control->flags |= FLAG_HASH;
}
} else if (isparameter(parameter, "showhash")) {
if (isparameter(parametervalue, "yes"))
control->flags |= FLAG_HASH;
} else if (isparameter(parameter, "outputdirectory")) {
else if (strcasecmp(parametervalue, "lzma"))
fatal("CONF.FILE error. Invalid compression method %s specified",parametervalue);
} else if (!strcasecmp(parameter, "testthreshold")) {
control->threshold = atoi(parametervalue);
if (control->threshold < 1 || control->threshold > 10)
fatal("CONF.FILE error. Threshold value out of range %s", parametervalue);
control->threshold = 1.05-control->threshold / 20;
} else if (!strcasecmp(parameter, "outputdirectory")) {
control->outdir = malloc(strlen(parametervalue) + 2);
if (!control->outdir)
fatal_return(("Fatal Memory Error in read_config"), false);
fatal("Fatal Memory Error in read_config");
strcpy(control->outdir, parametervalue);
if (strcmp(parametervalue + strlen(parametervalue) - 1, "/"))
strcat(control->outdir, "/");
} else if (isparameter(parameter,"verbosity")) {
} else if (!strcasecmp(parameter,"verbosity")) {
if (control->flags & FLAG_VERBOSE)
failure_return(("CONF.FILE error. Verbosity already defined."), false);
if (isparameter(parametervalue, "yes"))
fatal("CONF.FILE error. Verbosity already defined.");
if (!strcasecmp(parametervalue, "true") || !strcasecmp(parametervalue, "1"))
control->flags |= FLAG_VERBOSITY;
else if (isparameter(parametervalue,"max"))
else if (!strcasecmp(parametervalue,"max") || !strcasecmp(parametervalue, "2"))
control->flags |= FLAG_VERBOSITY_MAX;
else /* oops, unrecognized value */
print_err("lrzip.conf: Unrecognized verbosity value %s. Ignored.\n", parametervalue);
} else if (isparameter(parameter, "showprogress")) {
/* Yes by default */
if (isparameter(parametervalue, "NO"))
control->flags &= ~FLAG_SHOW_PROGRESS;
} else if (isparameter(parameter,"nice")) {
} else if (!strcasecmp(parameter,"nice")) {
control->nice_val = atoi(parametervalue);
if (control->nice_val < -20 || control->nice_val > 19)
failure_return(("CONF.FILE error. Nice must be between -20 and 19"), false);
} else if (isparameter(parameter, "keepbroken")) {
if (isparameter(parametervalue, "yes" ))
control->flags |= FLAG_KEEP_BROKEN;
} else if (iscaseparameter(parameter, "DELETEFILES")) {
/* delete files must be case sensitive */
if (iscaseparameter(parametervalue, "YES"))
fatal("CONF.FILE error. Nice must be between -20 and 19");
} else if (!strcasecmp(parameter, "showprogress")) {
/* true by default */
if (!strcasecmp(parametervalue, "false") || !strcasecmp(parametervalue, "0"))
control->flags &= ~FLAG_SHOW_PROGRESS;
} else if (!strcmp(parameter, "DELETEFILES")) {
/* delete files must be case sensitive */
if (!strcmp(parametervalue, "YES"))
control->flags &= ~FLAG_KEEP_FILES;
} else if (iscaseparameter(parameter, "REPLACEFILE")) {
} else if (!strcmp(parameter, "REPLACEFILE")) {
/* replace lrzip file must be case sensitive */
if (iscaseparameter(parametervalue, "YES"))
if (!strcmp(parametervalue, "YES"))
control->flags |= FLAG_FORCE_REPLACE;
} else if (isparameter(parameter, "tmpdir")) {
control->tmpdir = realloc(NULL, strlen(parametervalue) + 2);
if (!control->tmpdir)
fatal_return(("Fatal Memory Error in read_config"), false);
strcpy(control->tmpdir, parametervalue);
if (strcmp(parametervalue + strlen(parametervalue) - 1, "/"))
strcat(control->tmpdir, "/");
} else if (isparameter(parameter, "encrypt")) {
if (isparameter(parametervalue, "YES"))
control->flags |= FLAG_ENCRYPT;
} else
/* oops, we have an invalid parameter, display */
print_err("lrzip.conf: Unrecognized parameter value, %s = %s. Continuing.\n",\
parameter, parametervalue);
}
}
if (unlikely(fclose(fp)))
fatal_return(("Failed to fclose fp in read_config\n"), false);
/* clean up */
free(line);
free(homeconf);
/* fprintf(stderr, "\nWindow = %d \
\nCompression Level = %d \
@ -333,120 +194,4 @@ bool read_config(rzip_control *control)
\nOutput Directory = %s \
\nFlags = %d\n", control->window,control->compression_level, control->threshold, control->outdir, control->flags);
*/
return true;
}
static void xor128 (void *pa, const void *pb)
{
i64 *a = pa;
const i64 *b = pb;
a [0] ^= b [0];
a [1] ^= b [1];
}
static void lrz_keygen(const rzip_control *control, const uchar *salt, uchar *key, uchar *iv)
{
uchar buf [HASH_LEN + SALT_LEN + PASS_LEN];
mlock(buf, HASH_LEN + SALT_LEN + PASS_LEN);
memcpy(buf, control->hash, HASH_LEN);
memcpy(buf + HASH_LEN, salt, SALT_LEN);
memcpy(buf + HASH_LEN + SALT_LEN, control->salt_pass, control->salt_pass_len);
sha4(buf, HASH_LEN + SALT_LEN + control->salt_pass_len, key, 0);
memcpy(buf, key, HASH_LEN);
memcpy(buf + HASH_LEN, salt, SALT_LEN);
memcpy(buf + HASH_LEN + SALT_LEN, control->salt_pass, control->salt_pass_len);
sha4(buf, HASH_LEN + SALT_LEN + control->salt_pass_len, iv, 0);
memset(buf, 0, sizeof(buf));
munlock(buf, sizeof(buf));
}
bool lrz_crypt(const rzip_control *control, uchar *buf, i64 len, const uchar *salt, int encrypt)
{
/* Encryption requires CBC_LEN blocks so we can use ciphertext
* stealing to not have to pad the block */
uchar key[HASH_LEN], iv[HASH_LEN];
uchar tmp0[CBC_LEN], tmp1[CBC_LEN];
aes_context aes_ctx;
i64 N, M;
bool ret = false;
/* Generate unique key and IV for each block of data based on salt */
mlock(&aes_ctx, sizeof(aes_ctx));
mlock(key, HASH_LEN);
mlock(iv, HASH_LEN);
lrz_keygen(control, salt, key, iv);
M = len % CBC_LEN;
N = len - M;
if (encrypt == LRZ_ENCRYPT) {
print_maxverbose("Encrypting data \n");
if (unlikely(aes_setkey_enc(&aes_ctx, key, 128)))
failure_goto(("Failed to aes_setkey_enc in lrz_crypt\n"), error);
aes_crypt_cbc(&aes_ctx, AES_ENCRYPT, N, iv, buf, buf);
if (M) {
memset(tmp0, 0, CBC_LEN);
memcpy(tmp0, buf + N, M);
aes_crypt_cbc(&aes_ctx, AES_ENCRYPT, CBC_LEN,
iv, tmp0, tmp1);
memcpy(buf + N, buf + N - CBC_LEN, M);
memcpy(buf + N - CBC_LEN, tmp1, CBC_LEN);
}
} else {
if (unlikely(aes_setkey_dec(&aes_ctx, key, 128)))
failure_goto(("Failed to aes_setkey_dec in lrz_crypt\n"), error);
print_maxverbose("Decrypting data \n");
if (M) {
aes_crypt_cbc(&aes_ctx, AES_DECRYPT, N - CBC_LEN,
iv, buf, buf);
aes_crypt_ecb(&aes_ctx, AES_DECRYPT,
buf + N - CBC_LEN, tmp0);
memset(tmp1, 0, CBC_LEN);
memcpy(tmp1, buf + N, M);
xor128(tmp0, tmp1);
memcpy(buf + N, tmp0, M);
memcpy(tmp1 + M, tmp0 + M, CBC_LEN - M);
aes_crypt_ecb(&aes_ctx, AES_DECRYPT, tmp1,
buf + N - CBC_LEN);
xor128(buf + N - CBC_LEN, iv);
} else
aes_crypt_cbc(&aes_ctx, AES_DECRYPT, len,
iv, buf, buf);
}
ret = true;
error:
memset(&aes_ctx, 0, sizeof(aes_ctx));
memset(iv, 0, HASH_LEN);
memset(key, 0, HASH_LEN);
munlock(&aes_ctx, sizeof(aes_ctx));
munlock(iv, HASH_LEN);
munlock(key, HASH_LEN);
return ret;
}
void lrz_stretch(rzip_control *control)
{
sha4_context ctx;
i64 j, n, counter;
mlock(&ctx, sizeof(ctx));
sha4_starts(&ctx, 0);
n = control->encloops * HASH_LEN / (control->salt_pass_len + sizeof(i64));
print_maxverbose("Hashing passphrase %lld (%lld) times \n", control->encloops, n);
for (j = 0; j < n; j ++) {
counter = htole64(j);
sha4_update(&ctx, (uchar *)&counter, sizeof(counter));
sha4_update(&ctx, control->salt_pass, control->salt_pass_len);
}
sha4_finish(&ctx, control->hash);
memset(&ctx, 0, sizeof(ctx));
munlock(&ctx, sizeof(ctx));
}
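
Note on the memory sizing in the removed setup_overhead(), round_to_page() and round_up_page(): the arithmetic can be checked in isolation. The sketch below restates it as pure functions, under the assumption that the page size is passed in explicitly rather than read from sysconf() or control->page_size. The names round_down_to_page, round_up_to_page and lzma_overhead are hypothetical, chosen for this illustration only.

```c
#include <stddef.h>
#include <stdint.h>

/* Round size down to a multiple of page_size, but never below one page.
 * Same arithmetic as round_to_page() in the diff. */
static int64_t round_down_to_page(int64_t size, int64_t page_size)
{
	size -= size % page_size;
	if (size == 0)
		size = page_size;
	return size;
}

/* Round len up to the next multiple of page_size.
 * Same arithmetic as round_up_page() in the diff. */
static size_t round_up_to_page(size_t len, size_t page_size)
{
	size_t rem = len % page_size;

	if (rem)
		len += page_size - rem;
	return len;
}

/* Per-thread LZMA overhead estimate, following setup_overhead():
 * map the 1..9 compression level onto an LZMA level, derive the
 * dictionary size, then add 6MB plus a 16KB state size. */
static int64_t lzma_overhead(int compression_level)
{
	int level = compression_level * 7 / 9;
	int64_t dictsize;

	if (!level)
		level = 1;
	if (level <= 5)
		dictsize = (int64_t)1 << (level * 2 + 14);
	else if (level == 6)
		dictsize = (int64_t)1 << 25;
	else
		dictsize = (int64_t)1 << 26;

	return dictsize * 23 / 2 + 6 * 1024 * 1024 + 16384;
}
```

With these definitions, compression level 9 maps to LZMA level 7 and a 64MB dictionary, giving an estimate of 778059776 bytes per thread, while level 1 gives 7061504 bytes.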

171
util.h

@ -1,171 +0,0 @@
/*
Copyright (C) 2006-2016 Con Kolivas
Copyright (C) 2011 Peter Hyman
Copyright (C) 1998 Andrew Tridgell
This program is free software; you can redistribute it and/or modify
it under the terms of the GNU General Public License as published by
the Free Software Foundation; either version 2 of the License, or
(at your option) any later version.
This program is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
GNU General Public License for more details.
You should have received a copy of the GNU General Public License
along with this program. If not, see <http://www.gnu.org/licenses/>.
*/
#ifndef LRZIP_UTIL_H
#define LRZIP_UTIL_H
#include "lrzip_private.h"
#include <errno.h>
#include <semaphore.h>
#include <stdarg.h>
#include <unistd.h>
#include <fcntl.h>
void register_infile(rzip_control *control, const char *name, char delete);
void register_outfile(rzip_control *control, const char *name, char delete);
void unlink_files(rzip_control *control);
void register_outputfile(rzip_control *control, FILE *f);
void fatal_exit(rzip_control *control);
/* Failure when there is likely to be a meaningful error in perror */
static inline void fatal(const rzip_control *control, unsigned int line, const char *file, const char *func, const char *format, ...)
{
va_list ap;
va_start(ap, format);
if (!control->log_cb) {
vfprintf(stderr, format, ap);
perror(NULL);
} else
control->log_cb(control->log_data, 0, line, file, func, format, ap);
va_end(ap);
if (!control->library_mode)
fatal_exit((rzip_control*)control);
}
#ifdef fatal
# undef fatal
#endif
#define fatal(...) fatal(control, __LINE__, __FILE__, __func__, __VA_ARGS__)
#define fatal_return(stuff, ...) do { \
fatal stuff; \
return __VA_ARGS__; \
} while (0)
#define fatal_goto(stuff, label) do { \
fatal stuff; \
goto label; \
} while (0)
static inline void failure(const rzip_control *control, unsigned int line, const char *file, const char *func, const char *format, ...)
{
va_list ap;
va_start(ap, format);
if (!control->log_cb)
vfprintf(stderr, format, ap);
else
control->log_cb(control->log_data, 0, line, file, func, format, ap);
va_end(ap);
if (!control->library_mode)
fatal_exit((rzip_control*)control);
}
#ifdef failure
# undef failure
#endif
#define failure(...) failure(control, __LINE__, __FILE__, __func__, __VA_ARGS__)
#define failure_return(stuff, ...) do { \
failure stuff; \
return __VA_ARGS__; \
} while (0)
#define failure_goto(stuff, label) do { \
failure stuff; \
goto label; \
} while (0)
void setup_overhead(rzip_control *control);
void setup_ram(rzip_control *control);
void round_to_page(i64 *size);
size_t round_up_page(rzip_control *control, size_t len);
bool get_rand(rzip_control *control, uchar *buf, int len);
bool read_config(rzip_control *control);
void lrz_stretch(rzip_control *control);
void lrz_stretch2(rzip_control *control);
bool lrz_crypt(const rzip_control *control, uchar *buf, i64 len, const uchar *salt, int encrypt);
#define LRZ_DECRYPT (0)
#define LRZ_ENCRYPT (1)
static inline bool lrz_encrypt(const rzip_control *control, uchar *buf, i64 len, const uchar *salt)
{
return lrz_crypt(control, buf, len, salt, LRZ_ENCRYPT);
}
static inline bool lrz_decrypt(const rzip_control *control, uchar *buf, i64 len, const uchar *salt)
{
return lrz_crypt(control, buf, len, salt, LRZ_DECRYPT);
}
/* ck specific wrappers for true unnamed semaphore usage on platforms
* that support them and for apple which does not. We use a single byte across
* a pipe to emulate semaphore behaviour there. */
#ifdef __APPLE__
static inline void cksem_init(const rzip_control *control, cksem_t *cksem)
{
int flags, fd, i;
if (pipe(cksem->pipefd) == -1)
fatal("Failed pipe errno=%d", errno);
/* Make the pipes FD_CLOEXEC to allow them to close should we call
* execv on restart. */
for (i = 0; i < 2; i++) {
fd = cksem->pipefd[i];
flags = fcntl(fd, F_GETFD, 0);
flags |= FD_CLOEXEC;
if (fcntl(fd, F_SETFD, flags) == -1)
fatal("Failed to fcntl errno=%d", errno);
}
}
static inline void cksem_post(const rzip_control *control, cksem_t *cksem)
{
const char buf = 1;
int ret;
ret = write(cksem->pipefd[1], &buf, 1);
if (unlikely(ret != 1))
fatal("Failed to write in cksem_post errno=%d", errno);
}
static inline void cksem_wait(const rzip_control *control, cksem_t *cksem)
{
char buf;
int ret;
ret = read(cksem->pipefd[0], &buf, 1);
if (unlikely(ret != 1))
fatal("Failed to read in cksem_wait errno=%d", errno);
}
#else
static inline void cksem_init(const rzip_control *control, cksem_t *cksem)
{
int ret;
if ((ret = sem_init(cksem, 0, 0)))
fatal("Failed to sem_init ret=%d errno=%d", ret, errno);
}
static inline void cksem_post(const rzip_control *control, cksem_t *cksem)
{
if (unlikely(sem_post(cksem)))
fatal("Failed to sem_post errno=%d cksem=%p", errno, (void *)cksem);
}
static inline void cksem_wait(const rzip_control *control, cksem_t *cksem)
{
if (unlikely(sem_wait(cksem)))
fatal("Failed to sem_wait errno=%d cksem=%p", errno, (void *)cksem);
}
#endif
#endif
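
The __APPLE__ branch above emulates an unnamed semaphore by passing a single byte across a pipe. The following is a minimal, self-contained sketch of that idea, not the project's code: the names pipe_sem_t, pipe_sem_init, pipe_sem_post and pipe_sem_wait are hypothetical, and returning -1 instead of calling fatal() is an assumption made so the sketch compiles on its own. It treats anything other than a one-byte transfer as a failure and retries on EINTR.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stddef.h>
#include <unistd.h>

/* Hypothetical type for illustration only; this is not lrzip's cksem_t. */
typedef struct {
	int pipefd[2];	/* [0] is the read end, [1] is the write end */
} pipe_sem_t;

/* Create the pipe and mark both ends close-on-exec.
 * Returns 0 on success, -1 on failure (errno is set by the failing call). */
static int pipe_sem_init(pipe_sem_t *sem)
{
	int i;

	assert(sem != NULL);
	if (pipe(sem->pipefd) == -1)
		return -1;
	for (i = 0; i < 2; i++) {
		int flags = fcntl(sem->pipefd[i], F_GETFD, 0);

		if (flags == -1)
			return -1;
		if (fcntl(sem->pipefd[i], F_SETFD, flags | FD_CLOEXEC) == -1)
			return -1;
	}
	return 0;
}

/* Post: write one token byte, retrying if a signal interrupts the call. */
static int pipe_sem_post(pipe_sem_t *sem)
{
	const char token = 1;
	ssize_t ret;

	do {
		ret = write(sem->pipefd[1], &token, 1);
	} while (ret == -1 && errno == EINTR);
	return ret == 1 ? 0 : -1;
}

/* Wait: block until one token byte has been read. */
static int pipe_sem_wait(pipe_sem_t *sem)
{
	char token;
	ssize_t ret;

	do {
		ret = read(sem->pipefd[0], &token, 1);
	} while (ret == -1 && errno == EINTR);
	return ret == 1 ? 0 : -1;
}
```

Each post leaves one byte in the pipe and each wait consumes one, so the number of unread bytes plays the role of the semaphore count. Unlike a real semaphore, a post can block once the pipe buffer is full.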

View file

@ -1,105 +0,0 @@
#!/bin/bash
# Peter Hyman, pete@peterhyman.com
# December 2020
# This program will return commit references based on Tags and Annotated Tags from git describe
usage() {
cat >&2 <<EOF
$(basename "$0") command [-r]
all - entire git describe
commit - commit, omitting v
tagrev - tag revision count
major - major release version
minor - minor release version
micro - micro release version
version - M.mic + [tag release count-HEAD commit]
-r -- get release tag only
EOF
exit 1
}
# show message and usage
die() {
echo "$1" >&2
usage
}
# return variables
# everything, with leading `v' and leading `g' for commits
describe_tag=
# abbreviated commit
commit=
# count of commits from last tag
tagrev=
# major version
major=
# minor version
minor=
# micro version
micro=
# get release or tag?
tagopt="--tags"
# get whole commit and parse
# if tagrev > 0 then append it and the commit to the micro version
# Expected format of git describe --long is:
# v#.###-#-g#######
init() {
# $tagopt is deliberately unquoted so that an empty value adds no argument
describe_tag=$(git describe $tagopt --long --abbrev=7)
# strip the leading `v' of the tag and the `g' prefix of the commit
describe_tag=${describe_tag#v}
describe_tag=${describe_tag/-g/-}
commit=$(echo "$describe_tag" | cut -d- -f3)
tagrev=$(echo "$describe_tag" | cut -d- -f2)
version=$(echo "$describe_tag" | cut -d- -f1)
micro=${version: -2}
[ "$tagrev" -gt 0 ] && micro=$micro-$tagrev-$commit
minor=${version: -3:1}
major=$(echo "$version" | cut -d. -f1)
}
command -v git >/dev/null 2>&1 || die "Something very wrong: git not found."
[ $# -eq 0 ] && die "Must provide a command and optional argument."
# are we getting a release only?
if [ $# -eq 2 ]; then
if [ "$2" = "-r" ]; then
tagopt=""
else
die "Invalid option. Must be -r or nothing."
fi
fi
init
case "$1" in
"all" )
retval=$describe_tag
;;
"commit" )
retval=$commit
;;
"tagrev" )
retval=$tagrev
;;
"version" )
retval=$version
;;
"major" )
retval=$major
;;
"minor" )
retval=$minor
;;
"micro" )
retval=$micro
;;
* )
die "Invalid command."
;;
esac
echo "$retval"
exit 0

1771
zpipe.cpp Normal file

File diff suppressed because it is too large