Initial import

This commit is contained in:
Con Kolivas 2010-03-29 10:07:08 +11:00
parent 725e478e19
commit 6dcceb0b1b
69 changed files with 26485 additions and 0 deletions

16
AUTHORS Normal file
View file

@ -0,0 +1,16 @@
Original concept and programming by Con Kolivas
Beginning with version 0.19, Peter Hyman submitted bug fixes,
patches, multi-threading support, assembler integration,
SDK updating, and autoconf improvements.
Thanks to:
Andrew Tridgell for rzip.
Markus Oberhumer for lzo.
Igor Pavlov for lzma and CRC Assembler code.
Jean-loup Gailly and Mark Adler for the zlib compression library
Christian Leber for lzma compat layer
Lasse Collin for debugging the compat layer
Michael J Cohen for Darwin support
Jukka Laurila for newer Darwin support
George Makrydakis for lrztar

4
BUGS Normal file
View file

@ -0,0 +1,4 @@
BUGME November 2009
MacOSX build is broken as of v0.41. No easy fix is planned
at the moment.

339
COPYING Normal file
View file

@ -0,0 +1,339 @@
GNU GENERAL PUBLIC LICENSE
Version 2, June 1991
Copyright (C) 1989, 1991 Free Software Foundation, Inc.
675 Mass Ave, Cambridge, MA 02139, USA
Everyone is permitted to copy and distribute verbatim copies
of this license document, but changing it is not allowed.
Preamble
The licenses for most software are designed to take away your
freedom to share and change it. By contrast, the GNU General Public
License is intended to guarantee your freedom to share and change free
software--to make sure the software is free for all its users. This
General Public License applies to most of the Free Software
Foundation's software and to any other program whose authors commit to
using it. (Some other Free Software Foundation software is covered by
the GNU Library General Public License instead.) You can apply it to
your programs, too.
When we speak of free software, we are referring to freedom, not
price. Our General Public Licenses are designed to make sure that you
have the freedom to distribute copies of free software (and charge for
this service if you wish), that you receive source code or can get it
if you want it, that you can change the software or use pieces of it
in new free programs; and that you know you can do these things.
To protect your rights, we need to make restrictions that forbid
anyone to deny you these rights or to ask you to surrender the rights.
These restrictions translate to certain responsibilities for you if you
distribute copies of the software, or if you modify it.
For example, if you distribute copies of such a program, whether
gratis or for a fee, you must give the recipients all the rights that
you have. You must make sure that they, too, receive or can get the
source code. And you must show them these terms so they know their
rights.
We protect your rights with two steps: (1) copyright the software, and
(2) offer you this license which gives you legal permission to copy,
distribute and/or modify the software.
Also, for each author's protection and ours, we want to make certain
that everyone understands that there is no warranty for this free
software. If the software is modified by someone else and passed on, we
want its recipients to know that what they have is not the original, so
that any problems introduced by others will not reflect on the original
authors' reputations.
Finally, any free program is threatened constantly by software
patents. We wish to avoid the danger that redistributors of a free
program will individually obtain patent licenses, in effect making the
program proprietary. To prevent this, we have made it clear that any
patent must be licensed for everyone's free use or not licensed at all.
The precise terms and conditions for copying, distribution and
modification follow.
GNU GENERAL PUBLIC LICENSE
TERMS AND CONDITIONS FOR COPYING, DISTRIBUTION AND MODIFICATION
0. This License applies to any program or other work which contains
a notice placed by the copyright holder saying it may be distributed
under the terms of this General Public License. The "Program", below,
refers to any such program or work, and a "work based on the Program"
means either the Program or any derivative work under copyright law:
that is to say, a work containing the Program or a portion of it,
either verbatim or with modifications and/or translated into another
language. (Hereinafter, translation is included without limitation in
the term "modification".) Each licensee is addressed as "you".
Activities other than copying, distribution and modification are not
covered by this License; they are outside its scope. The act of
running the Program is not restricted, and the output from the Program
is covered only if its contents constitute a work based on the
Program (independent of having been made by running the Program).
Whether that is true depends on what the Program does.
1. You may copy and distribute verbatim copies of the Program's
source code as you receive it, in any medium, provided that you
conspicuously and appropriately publish on each copy an appropriate
copyright notice and disclaimer of warranty; keep intact all the
notices that refer to this License and to the absence of any warranty;
and give any other recipients of the Program a copy of this License
along with the Program.
You may charge a fee for the physical act of transferring a copy, and
you may at your option offer warranty protection in exchange for a fee.
2. You may modify your copy or copies of the Program or any portion
of it, thus forming a work based on the Program, and copy and
distribute such modifications or work under the terms of Section 1
above, provided that you also meet all of these conditions:
a) You must cause the modified files to carry prominent notices
stating that you changed the files and the date of any change.
b) You must cause any work that you distribute or publish, that in
whole or in part contains or is derived from the Program or any
part thereof, to be licensed as a whole at no charge to all third
parties under the terms of this License.
c) If the modified program normally reads commands interactively
when run, you must cause it, when started running for such
interactive use in the most ordinary way, to print or display an
announcement including an appropriate copyright notice and a
notice that there is no warranty (or else, saying that you provide
a warranty) and that users may redistribute the program under
these conditions, and telling the user how to view a copy of this
License. (Exception: if the Program itself is interactive but
does not normally print such an announcement, your work based on
the Program is not required to print an announcement.)
These requirements apply to the modified work as a whole. If
identifiable sections of that work are not derived from the Program,
and can be reasonably considered independent and separate works in
themselves, then this License, and its terms, do not apply to those
sections when you distribute them as separate works. But when you
distribute the same sections as part of a whole which is a work based
on the Program, the distribution of the whole must be on the terms of
this License, whose permissions for other licensees extend to the
entire whole, and thus to each and every part regardless of who wrote it.
Thus, it is not the intent of this section to claim rights or contest
your rights to work written entirely by you; rather, the intent is to
exercise the right to control the distribution of derivative or
collective works based on the Program.
In addition, mere aggregation of another work not based on the Program
with the Program (or with a work based on the Program) on a volume of
a storage or distribution medium does not bring the other work under
the scope of this License.
3. You may copy and distribute the Program (or a work based on it,
under Section 2) in object code or executable form under the terms of
Sections 1 and 2 above provided that you also do one of the following:
a) Accompany it with the complete corresponding machine-readable
source code, which must be distributed under the terms of Sections
1 and 2 above on a medium customarily used for software interchange; or,
b) Accompany it with a written offer, valid for at least three
years, to give any third party, for a charge no more than your
cost of physically performing source distribution, a complete
machine-readable copy of the corresponding source code, to be
distributed under the terms of Sections 1 and 2 above on a medium
customarily used for software interchange; or,
c) Accompany it with the information you received as to the offer
to distribute corresponding source code. (This alternative is
allowed only for noncommercial distribution and only if you
received the program in object code or executable form with such
an offer, in accord with Subsection b above.)
The source code for a work means the preferred form of the work for
making modifications to it. For an executable work, complete source
code means all the source code for all modules it contains, plus any
associated interface definition files, plus the scripts used to
control compilation and installation of the executable. However, as a
special exception, the source code distributed need not include
anything that is normally distributed (in either source or binary
form) with the major components (compiler, kernel, and so on) of the
operating system on which the executable runs, unless that component
itself accompanies the executable.
If distribution of executable or object code is made by offering
access to copy from a designated place, then offering equivalent
access to copy the source code from the same place counts as
distribution of the source code, even though third parties are not
compelled to copy the source along with the object code.
4. You may not copy, modify, sublicense, or distribute the Program
except as expressly provided under this License. Any attempt
otherwise to copy, modify, sublicense or distribute the Program is
void, and will automatically terminate your rights under this License.
However, parties who have received copies, or rights, from you under
this License will not have their licenses terminated so long as such
parties remain in full compliance.
5. You are not required to accept this License, since you have not
signed it. However, nothing else grants you permission to modify or
distribute the Program or its derivative works. These actions are
prohibited by law if you do not accept this License. Therefore, by
modifying or distributing the Program (or any work based on the
Program), you indicate your acceptance of this License to do so, and
all its terms and conditions for copying, distributing or modifying
the Program or works based on it.
6. Each time you redistribute the Program (or any work based on the
Program), the recipient automatically receives a license from the
original licensor to copy, distribute or modify the Program subject to
these terms and conditions. You may not impose any further
restrictions on the recipients' exercise of the rights granted herein.
You are not responsible for enforcing compliance by third parties to
this License.
7. If, as a consequence of a court judgment or allegation of patent
infringement or for any other reason (not limited to patent issues),
conditions are imposed on you (whether by court order, agreement or
otherwise) that contradict the conditions of this License, they do not
excuse you from the conditions of this License. If you cannot
distribute so as to satisfy simultaneously your obligations under this
License and any other pertinent obligations, then as a consequence you
may not distribute the Program at all. For example, if a patent
license would not permit royalty-free redistribution of the Program by
all those who receive copies directly or indirectly through you, then
the only way you could satisfy both it and this License would be to
refrain entirely from distribution of the Program.
If any portion of this section is held invalid or unenforceable under
any particular circumstance, the balance of the section is intended to
apply and the section as a whole is intended to apply in other
circumstances.
It is not the purpose of this section to induce you to infringe any
patents or other property right claims or to contest validity of any
such claims; this section has the sole purpose of protecting the
integrity of the free software distribution system, which is
implemented by public license practices. Many people have made
generous contributions to the wide range of software distributed
through that system in reliance on consistent application of that
system; it is up to the author/donor to decide if he or she is willing
to distribute software through any other system and a licensee cannot
impose that choice.
This section is intended to make thoroughly clear what is believed to
be a consequence of the rest of this License.
8. If the distribution and/or use of the Program is restricted in
certain countries either by patents or by copyrighted interfaces, the
original copyright holder who places the Program under this License
may add an explicit geographical distribution limitation excluding
those countries, so that distribution is permitted only in or among
countries not thus excluded. In such case, this License incorporates
the limitation as if written in the body of this License.
9. The Free Software Foundation may publish revised and/or new versions
of the General Public License from time to time. Such new versions will
be similar in spirit to the present version, but may differ in detail to
address new problems or concerns.
Each version is given a distinguishing version number. If the Program
specifies a version number of this License which applies to it and "any
later version", you have the option of following the terms and conditions
either of that version or of any later version published by the Free
Software Foundation. If the Program does not specify a version number of
this License, you may choose any version ever published by the Free Software
Foundation.
10. If you wish to incorporate parts of the Program into other free
programs whose distribution conditions are different, write to the author
to ask for permission. For software which is copyrighted by the Free
Software Foundation, write to the Free Software Foundation; we sometimes
make exceptions for this. Our decision will be guided by the two goals
of preserving the free status of all derivatives of our free software and
of promoting the sharing and reuse of software generally.
NO WARRANTY
11. BECAUSE THE PROGRAM IS LICENSED FREE OF CHARGE, THERE IS NO WARRANTY
FOR THE PROGRAM, TO THE EXTENT PERMITTED BY APPLICABLE LAW. EXCEPT WHEN
OTHERWISE STATED IN WRITING THE COPYRIGHT HOLDERS AND/OR OTHER PARTIES
PROVIDE THE PROGRAM "AS IS" WITHOUT WARRANTY OF ANY KIND, EITHER EXPRESSED
OR IMPLIED, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF
MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE. THE ENTIRE RISK AS
TO THE QUALITY AND PERFORMANCE OF THE PROGRAM IS WITH YOU. SHOULD THE
PROGRAM PROVE DEFECTIVE, YOU ASSUME THE COST OF ALL NECESSARY SERVICING,
REPAIR OR CORRECTION.
12. IN NO EVENT UNLESS REQUIRED BY APPLICABLE LAW OR AGREED TO IN WRITING
WILL ANY COPYRIGHT HOLDER, OR ANY OTHER PARTY WHO MAY MODIFY AND/OR
REDISTRIBUTE THE PROGRAM AS PERMITTED ABOVE, BE LIABLE TO YOU FOR DAMAGES,
INCLUDING ANY GENERAL, SPECIAL, INCIDENTAL OR CONSEQUENTIAL DAMAGES ARISING
OUT OF THE USE OR INABILITY TO USE THE PROGRAM (INCLUDING BUT NOT LIMITED
TO LOSS OF DATA OR DATA BEING RENDERED INACCURATE OR LOSSES SUSTAINED BY
YOU OR THIRD PARTIES OR A FAILURE OF THE PROGRAM TO OPERATE WITH ANY OTHER
PROGRAMS), EVEN IF SUCH HOLDER OR OTHER PARTY HAS BEEN ADVISED OF THE
POSSIBILITY OF SUCH DAMAGES.
END OF TERMS AND CONDITIONS
Appendix: How to Apply These Terms to Your New Programs
If you develop a new program, and you want it to be of the greatest
possible use to the public, the best way to achieve this is to make it
free software which everyone can redistribute and change under these terms.
To do so, attach the following notices to the program. It is safest
to attach them to the start of each source file to most effectively
convey the exclusion of warranty; and each file should have at least
the "copyright" line and a pointer to where the full notice is found.
<one line to give the program's name and a brief idea of what it does.>
Copyright (C) 19yy <name of author>
This program is free software; you can redistribute it and/or modify
it under the terms of the GNU General Public License as published by
the Free Software Foundation; either version 2 of the License, or
(at your option) any later version.
This program is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
GNU General Public License for more details.
You should have received a copy of the GNU General Public License
along with this program; if not, write to the Free Software
Foundation, Inc., 675 Mass Ave, Cambridge, MA 02139, USA.
Also add information on how to contact you by electronic and paper mail.
If the program is interactive, make it output a short notice like this
when it starts in an interactive mode:
Gnomovision version 69, Copyright (C) 19yy name of author
Gnomovision comes with ABSOLUTELY NO WARRANTY; for details type `show w'.
This is free software, and you are welcome to redistribute it
under certain conditions; type `show c' for details.
The hypothetical commands `show w' and `show c' should show the appropriate
parts of the General Public License. Of course, the commands you use may
be called something other than `show w' and `show c'; they could even be
mouse-clicks or menu items--whatever suits your program.
You should also get your employer (if you work as a programmer) or your
school, if any, to sign a "copyright disclaimer" for the program, if
necessary. Here is a sample; alter the names:
Yoyodyne, Inc., hereby disclaims all copyright interest in the program
`Gnomovision' (which makes passes at compilers) written by James Hacker.
<signature of Ty Coon>, 1 April 1989
Ty Coon, President of Vice
This General Public License does not permit incorporating your program into
proprietary programs. If your program is a subroutine library, you may
consider it more useful to permit linking proprietary applications with the
library. If this is what you want to do, use the GNU Library General
Public License instead of this License.

328
ChangeLog Normal file
View file

@ -0,0 +1,328 @@
lrzip ChangeLog
DEC 2009, version 0.44, Con Kolivas, George Makrydakis
* Added lrztar wrapper to manage whole directories.
* Added -i option to provide information about a compressed file.
* Fixed "nan" showing as Compression speed on very small files.
* Fixed build for old bz library.
* Avoid overwriting output file if input doesn't exist.
* Implement signal handler to delete temporary files.
DEC 2009, version 0.43, Con Kolivas, Jukka Laurila
* Darwin support thanks to Jukka Laurila.
* Finally added stdin/stdout support due to popular demand. This is done
by basically using temporary files so is a low performance way of using
lrzip.
* Added test function. This just uses a temporary file during decompression.
* Config files should now accept zpaq options.
* Minor code style cleanups.
* Updated benchmarks in docs.
* Add a warning when attempting to decompress a file from a newer lrzip
version.
NOV 2009, version 0.42, Con Kolivas
* Changed progress update to show which of 2 chunks are being compressed
in zpaq.
* Fixed progress update in ZPAQ to not update with each byte which was
wasting heaps of CPU time.
NOV 2009, version 0.41, Con Kolivas
* Added zpaq compression backend for extremely good but extremely slow
compression (incompatible with previous versions if used).
* Limited chunk size passed to LZMA to 4GB to avoid library overflows.
* Minor changes to the formatting output
* Changed lower limit of -T threshhold to 0 to allow disabling it.
* Added lzo_compresses check into zpaq and bzip2 as well since they're
slow.
NOV 2009, version 0.40, Con Kolivas
* Massive core code rewrite.
* All code moved to be 64bit based for compression block addressing and length
allowing compression windows to be limited by ram only.
* 64bit userspace should now have no restriction on compression window size,
32bit is still limited to 2GB windows due to userspace limitations.
* New file format using the new addressing and data types, incompatible with
versions prior to 0.40.
* Support for reading and decompressing older formats.
* Minor speedups in read/write routines.
* Countless minor code fixes throughout.
* Code style cleanups and consistency changes in core code.
* Configure script improvements.
NOV 2009, version 0.31, Con Kolivas
* Updated to be in sync with lzma SDK 9.07beta.
* Cleanups and fixes of the configure scripts to use the correct package version
name.
* Massive fixes to the memory management code fixing lots of 32bit overflow
errors. The window size limit is now 2GB on both 32bit and 64bit. While it
appears to be smaller than the old windows, only 900MB was being used on .30
even though it claimed to use more. This can cause huge improvements in the
compression of very large files.
* The offset when mmap()ing was not being set to a multiple of page size so
it would fail if the window size was not a multiple of it.
* Flushing of data to disk between compression windows was implemented to
minimise disk thrashing of read vs write.
NOV 2009, version 0.30, Con Kolivas
* Numerous bugfixes to try and make the most of 64bit environments with huge
memory and to barf less on 32bit environments.
* Executable stacks were fixed.
* Probably other weird and wonderful bugs have been introduced.
* -P option to not set permissions on output files allowing you to write to
braindead filesystems (eg fat32).
JAN 2009, version 0.24, Peter Hyman, pete@peterhyman.com
Happy New Year!
* Upgrade LZMA SDK to 4.63. Use new C Wrapper. Invalidates
LZMA archives created earlier due to new Magic property
bytes.
* New LZMA logic will automatically determine allow LZMA
code to determine optimal lc, lp, pb, fb, and dictionary
size settings. stream.c will only pass level and thread
information. Compress function will return encoded 5 byte
data with compression settings. This will be stored in lrz
file header.
* add error messages during LZMA compression. There are some
edge cases where LZMA cannot allocate memory. These errors
are reported and the user will be advised to use a lower
compression window setting.
* type changes in rzip_fd function for correctness.
* remove function *Realloc() since it was never used. Cleaned
in rzip.h and util.c.
* apply munmap prior to closing and compressing stream in
function rzip_chunk in rzip.c.
* add realloc function in close_stream_out in stream.c
to reclaim some ram and try and allieviate out of memory
conditions in LZMA compression.
* remove file acconfig.h and include DEFINE in configure.in.
* add lrzip.conf capability.
* add timer for compression including elapsed time and eta.
* add compression and decompression MB/s calculation.
* Updated WHATS-NEW, TODO and created BUGS file.
* Updated lrzip.1 manpage and created lrzip.conf.5 manpage.
* Added lrzip.conf.example file in doc directory.
MAR 2008, Con Kolivas, kernel@kolivas.org
* Numerous changes all over to place restrictions on window
size to work with 32 bit limitations.
* Various bugfixes with respect to detecting buffer sizes and
likelihood of compressibility.
* Fixed the inappropriate straight copying uncompressed data for
files larger than 4GB.
* Re-initiated the 10MB window limits for non-lzma compression.
I was unable to reproduce any file size savings.
* Allow compression windows larger than ramsize if people really
really want them.
* Decrease thresholds for the test function to a minimum of 5%
compressibility since the hanging in lzma compression bug has been
fixed.
JAN 2008, version 0.22, Peter Hyman, pete@peterhyman.com
* version update
lzma/LZMALib.cpp
Thanks to Lasse Collin for debugging the problem LZMA
had with hanging on uncompressable files.
Update for control parameters to both compress and
decompress functions.
Makefile.in
* use of @top_srcdir@ (Lasse Collin). Also moved away
more cruft.
main.c stream.c.rzip.h LZMALib.cpp lzmalib.h
* addition of three new control structure members.
control.lc -- literal context bits
control.lp -- literal post state bits
control.pb -- post state bits
These are needed to ensure decompression will work.
These will now be stored along with control.compression_level
in the lrz file beginning at offset 0x16 for three bytes.
These will be passed to the functions lzma_compresses and
lzma_uncompress. Currently, only compression level is
needed or used, but the others are stored for possible future
use.
See magic file for more information.
stream.c
* Change to lzo_compresses function that will reject a chunk
without testing it if the size of the chunk is greater
than the compression window * threshold. This is to avoid
a low probability that lzma would still be passed a chunk
that contains uncompressible data or barely compressible
data. If after rzip hashing the chunk size is still close
to the window size, there is hardly anything worth
compressing. While there is no reason lzma cannot get the
chunk, this will save a lot of time.
magic.headers.txt
* updated file to show new layout that includes lzma
parameters.
README-NOT-BACKWARD-COMPATIBLE
* added warning about using lrzip-0.22 with earlier versions.
WHATS-NEW
* highlight of new features.
DEC 2007, version 0.21. Peter Hyman, pete@peterhyman.com
* version update.
* Modified to use Assembler routines from lzma SDK for CRC
computation when hashing streams in rzip.c and runzip.c.
Added files 7zCrcT8.c and 7zCrcT8u.s to lzma tree.
Cleaned up source tree. Moved unused files out of the way.
Moved non-core docs to doc directory
configure.in
* correct AC_INIT to set program variables.
* modified to add check for nasm assembler.
* modified syntax of test for errno in error.h to use
echo $ECHO_N/$ECHO_C instead of $ac_n/$ac_c which
was incorrect.
Makefile.in, lzma/Makefile
* modified to add compile instructions for 7zCrcT8.c
and 7zCrcT8U.s and Assembler. Cleaned up to remove
targets that don't exist or sources that don't
exist.
Modified to properly set directories. Added doc install.
Add link command to symlink lrunzip to lrzip.
*main.c
Add CrcGenerateTable() function to init CRC tables.
This is needed for all crc routines including those
in MatchFinderMT.
rzip.c and runzip.c
* Updated source to change call to crc32_buffer to call
CrcUpdate in the assembler code. Changed parameter order
to conform.
stream.c
* Removed 10MB limit on streams for bzip, gzip, and lzo.
This, to improve effeciency of long range analysis. For
some files, this could improve results.
Current-Benchmarks.txt
* Added file to keep benchmarks current to version.
(probably need to update README too).
README.Assembler
* Explain how to remove default compile of Assembler
modules.
config.sub config.guess
* added files for system detection.
DEC 2007, version 0.20. Peter Hyman, pete@peterhyman.com
* Updated to LZMA SDK 4.57.
* Updated to p7zip POSIX version. (www.p7zip.org)
* Added multi-threading support (up to 2x speed with LZMA).
* Edited LZMADecompress.cpp for backward compatibility
with decompress function. Needed SetPropertiesRaw function.
* Repopulated source tree for distribution.
* Updated Makefile.in to reflect new source files.
Updated to include command to link lrunzip to lrzip because
lrzip will test if lrunzip was used on command line.
* Updated Makefile.in for new compile time and linking options.
* Updated LZMALibs.cpp to include new property members for
LZMAEncoders as well as changed default dictionaries to
level+16. This would make the default compression level
of 7 translate to a dictionary number of 23.
* Added output to show Nice Level when verbose mode set
Initial add of support for zlib which seems to give quite
excellent performance.
* configure.in added AC_CHECK for libz and libm.
Added AC_PROG_LN_S for Makefile symlink section.
* lrzip.1 updated man page for -g option
* main.c added option test for gzip
Added sysconf(_SC_NPROCESSORS_CONF) for CPU detection
for threading.
Updated verbose output to show whether or not
Threading will be used.
Added Timer for each file compressed.
* rzip.h added flags for GZIP compression.
Added control member for threads. Arg passed to
lzma_conpress.
* stream.c update to accomodate gzip compress and decompress
functions. Cleaned up file by rearranging functions into
groups.
Removed include of lzmalib.h since it was causing a
compile time warning with zlib.h. Prototyped functions
manually.
Cleanup output from lzo_compresses function so that
unnecessary linefeeds are eliminated.
lzma_compress function call now uses threads as argument.
* Added README.benchmarks file to explain a method of
comparing results between different methods.
* LZMALib.cpp, lzmalib.h. Adjust function lzma_compress
prototype and function to include new argument threads.
This parameter is now placed in properties.
* lzma/Makefile. Updated to reflect new API library.
Updated to include Threading option.
DEC 2007, version 0.19. Con Kolivas.
* Added nice support, defaulting to nice 19.
DEC 2007, version 0.19. Peter Hyman, pete@peterhyman.com
* Major goal was to stop LZMA from hanging on some files.
Accomplished this with a threhold setting that is used by
the lzo_compresses function to better analyze chunk data.
Threshold makes it less likely that uncompressible data
will be passed to the LZMA compressor.
main.c
* Added Threshold option 1-10 to control LZMA compression attempt.
Default value=2. This means that anything over 10% compression
as reported by lzo_compresses will return a true value to
the LZMA compression function.
* Added verbosity option and more verbosity option (-v[v]).
* Added -O option to specify output directory.
* Updated compress_file and decompress_file functions to handle.
output directories and better handle multi files and filename
extensions. Optimized some string handling routines.
Improved flexibility in determining location of output files
when using -O. Added fflush(stdout) to improve printf reliability.
* decompress_file will accept any filename and will automatically
append .lrz if not present. Won't automatically fail.
* Added logic to protect against conflicting options such as
-q and -v, -o and -O.
* Added printout to screen of options selected. Will display
only when -v or -vv used.
* Adjusted several printf statements to avoid compiler
warnings (use %ll for long long int types).
runzip.c
* Added decompression progress indicator.
Will show percent decompressed along with bytes decompressed
and total to be decompressed. Will show if -q option NOT used.
rzip.h
* Version incremented to 0.19.
* Added flag DEFINESs for verbosity and more verbosity.
* Updated control struct to include output directory and
threshold value. Removed verbosity member.
rzip.c
* Minor changes to handle display when verbosity set. Changed
number format in some printf statements to properly handle
unsigned data.
stream.c
* major overhaul of lzo_compresses function to use a threshold
value when testing a data chunk to see if it is suitable for
LZMA compression. Optimized test loop to improve performance
and reduce number of passes. Improved output reporting depending
on verbosity setting.
* Added print controls for verbosity option.
* Corrected if statements that tested for error condition of
some lzo functions that only return a true value regardless.
lrzip.1
* updated man page to show new options and explain -T threshold.
README
* updated README to explain -T threshold option.
README.lzo_compresses.test.txt
* Added this file to help explain the theory behind the rewrite
of the lzo_compresses function and how to use the -T option.
TODO
* wish list and future enhancements.
ChangeLog
* added file.

126
Makefile Normal file
View file

@ -0,0 +1,126 @@
# Makefile for
# lrzip. This is processed by configure to produce the final
# Makefile
# See README.Assembler for notes about ASM module.
prefix=/usr
exec_prefix=${prefix}
datarootdir=${prefix}/share
ASM_OBJ=7zCrc.o
PACKAGE_TARNAME=lrzip-0.44
INSTALL_BIN=$(exec_prefix)/bin
INSTALL_MAN1=/usr/share/man/man1
INSTALL_MAN5=/usr/share/man/man5
INSTALL_DOC=${datarootdir}/doc/${PACKAGE_TARNAME}
INSTALL_DOC_LZMA=${datarootdir}/doc/${PACKAGE_TARNAME}/lzma
LIBS=-llzo2 -lbz2 -lz -lm -lpthread
LDFLAGS=
CC=gcc
CXX=g++
CFLAGS=-O2 -march=native -fomit-frame-pointer -I. -I$(srcdir) -Wall -W -c
CXXFLAGS=-O2 -march=native -fomit-frame-pointer -I. -I$(srcdir) -Wall -W -c
LZMA_CFLAGS=-I./lzma/C -DCOMPRESS_MF_MT -D_REENTRANT
INSTALLCMD=/usr/bin/install -c
LN_S=ln -s
RM=rm -f
ASM=yes
srcdir=.
SHELL=/bin/sh
.SUFFIXES:
.SUFFIXES: .c .o
OBJS= main.o rzip.o runzip.o stream.o util.o \
7zCrc.o \
zpipe.o \
Threads.o \
LzFind.o \
LzFindMt.o \
LzmaDec.o \
LzmaEnc.o \
LzmaLib.o
DOCFILES= AUTHORS BUGS ChangeLog COPYING README README-NOT-BACKWARD-COMPATIBLE \
TODO WHATS-NEW \
doc/README.Assembler doc/README.benchmarks \
doc/README.lzo_compresses.test.txt \
doc/magic.header.txt doc/lrzip.conf.example
DOCFILES_LZMA= lzma/7zC.txt lzma/7zFormat.txt lzma/Methods.txt \
lzma/history.txt lzma/lzma.txt lzma/README lzma/README-Alloc
MAN1FILES= man/lrzip.1 man/lrztar.1 man/lrunzip.1
MAN5FILES= man/lrzip.conf.5
#note that the -I. is needed to handle config.h when using VPATH
.c.o:
$(CC) $(CFLAGS) $(LZMA_CFLAGS) $< -o $@
all: lrzip man doc
7zCrcT8.o: ./lzma/C/7zCrcT8.c
$(CC) $(CFLAGS) $(LZMA_CFLAGS) ./lzma/C/7zCrcT8.c
7zCrcT8U.o: ./lzma/ASM/x86/7zCrcT8U.s
$(ASM) -o 7zCrcT8U.o ./lzma/ASM/x86/7zCrcT8U.s
7zCrcT8U_64.o: ./lzma/ASM/x86_64/7zCrcT8U_64.s
$(ASM) -o 7zCrcT8U_64.o ./lzma/ASM/x86_64/7zCrcT8U_64.s
7zCrc.o: ./lzma/C/7zCrc.c
$(CC) $(CFLAGS) $(LZMA_CFLAGS) ./lzma/C/7zCrc.c
LzmaLib.o: ./lzma/C/LzmaLib.c
$(CC) $(CFLAGS) $(LZMA_CFLAGS) ./lzma/C/LzmaLib.c
LzmaDec.o: ./lzma/C/LzmaDec.c
$(CC) $(CFLAGS) $(LZMA_CFLAGS) ./lzma/C/LzmaDec.c
LzmaEnc.o: ./lzma/C/LzmaEnc.c
$(CC) $(CFLAGS) $(LZMA_CFLAGS) ./lzma/C/LzmaEnc.c
Threads.o: ./lzma/C/Threads.c
$(CC) $(CFLAGS) $(LZMA_CFLAGS) ./lzma/C/Threads.c
LzFind.o: ./lzma/C/LzFind.c
$(CC) $(CFLAGS) $(LZMA_CFLAGS) ./lzma/C/LzFind.c
LzFindMt.o: ./lzma/C/LzFindMt.c
$(CC) $(CFLAGS) $(LZMA_CFLAGS) ./lzma/C/LzFindMt.c
zpipe.o: zpipe.cpp
$(CXX) $(CXXFLAGS) -DNDEBUG zpipe.cpp
man: man/lrztar.1 man/lrunzip.1
man/lrztar.1: man/lrztar.1.pod
$(MAKE) -f pod2man.mk PACKAGE=lrztar PODCENTER=lrzip makeman
man/lrunzip.1: man/lrunzip.1.pod
$(MAKE) -f pod2man.mk PACKAGE=lrunzip PODCENTER=lrzip makeman
install: all
mkdir -p $(DESTDIR)${INSTALL_BIN}
${INSTALLCMD} -m 755 lrzip $(DESTDIR)${INSTALL_BIN}
( cd $(DESTDIR)${INSTALL_BIN} && ${LN_S} -f lrzip lrunzip )
${INSTALLCMD} -m 755 lrztar $(DESTDIR)${INSTALL_BIN}
mkdir -p $(DESTDIR)${INSTALL_MAN1}
${INSTALLCMD} -m 644 $(MAN1FILES) $(DESTDIR)${INSTALL_MAN1}
mkdir -p $(DESTDIR)${INSTALL_MAN5}
${INSTALLCMD} -m 644 $(MAN5FILES) $(DESTDIR)${INSTALL_MAN5}
mkdir -p $(DESTDIR)${INSTALL_DOC}
${INSTALLCMD} -m 644 $(DOCFILES) $(DESTDIR)${INSTALL_DOC}
mkdir -p $(DESTDIR)${INSTALL_DOC_LZMA}
${INSTALLCMD} -m 644 $(DOCFILES_LZMA) $(DESTDIR)${INSTALL_DOC_LZMA}
lrzip: $(OBJS)
$(CXX) $(LDFLAGS) -o lrzip $(OBJS) $(LIBS)
static: $(OBJS)
$(CXX) $(LDFLAGS) -static -o lrzip $(OBJS) $(LIBS)
clean:
-${RM} *~ $(OBJS) lrzip config.cache config.log config.status *.o

118
Makefile.in Normal file
View file

@ -0,0 +1,118 @@
# Makefile for
# lrzip. This is processed by configure to produce the final
# Makefile
# See README.Assembler for notes about ASM module.
prefix=@prefix@
exec_prefix=@exec_prefix@
datarootdir=@datarootdir@
ASM_OBJ=@ASM_OBJ@
PACKAGE_TARNAME=@PACKAGE_TARNAME@
INSTALL_BIN=$(exec_prefix)/bin
INSTALL_MAN1=@mandir@/man1
INSTALL_MAN5=@mandir@/man5
INSTALL_DOC=@docdir@
INSTALL_DOC_LZMA=@docdir@/lzma
LIBS=@LIBS@
LDFLAGS=@LDFLAGS@
CC=@CC@
CXX=@CXX@
CFLAGS=@CFLAGS@ -I. -I$(srcdir) -Wall -W -c
CXXFLAGS=@CXXFLAGS@ -I. -I$(srcdir) -Wall -W -c
LZMA_CFLAGS=-I@top_srcdir@/lzma/C -DCOMPRESS_MF_MT -D_REENTRANT
INSTALLCMD=@INSTALL@
LN_S=@LN_S@
RM=rm -f
ASM=@ASM@
VPATH=@srcdir@
srcdir=@srcdir@
SHELL=/bin/sh
.SUFFIXES:
.SUFFIXES: .c .o
OBJS= main.o rzip.o runzip.o stream.o util.o \
@ASM_OBJ@ \
zpipe.o \
Threads.o \
LzFind.o \
LzFindMt.o \
LzmaDec.o \
LzmaEnc.o \
LzmaLib.o
DOCFILES= AUTHORS BUGS ChangeLog COPYING README README-NOT-BACKWARD-COMPATIBLE \
TODO WHATS-NEW \
doc/README.Assembler doc/README.benchmarks \
doc/README.lzo_compresses.test.txt \
doc/magic.header.txt doc/lrzip.conf.example
DOCFILES_LZMA= lzma/7zC.txt lzma/7zFormat.txt lzma/Methods.txt \
lzma/history.txt lzma/lzma.txt lzma/README lzma/README-Alloc
MAN1FILES= man/lrzip.1
MAN5FILES= man/lrzip.conf.5
#note that the -I. is needed to handle config.h when using VPATH
.c.o:
$(CC) $(CFLAGS) $(LZMA_CFLAGS) $< -o $@
all: lrzip man doc
7zCrcT8.o: @top_srcdir@/lzma/C/7zCrcT8.c
$(CC) $(CFLAGS) $(LZMA_CFLAGS) @top_srcdir@/lzma/C/7zCrcT8.c
7zCrcT8U.o: @top_srcdir@/lzma/ASM/x86/7zCrcT8U.s
$(ASM) -o 7zCrcT8U.o @top_srcdir@/lzma/ASM/x86/7zCrcT8U.s
7zCrcT8U_64.o: @top_srcdir@/lzma/ASM/x86_64/7zCrcT8U_64.s
$(ASM) -o 7zCrcT8U_64.o @top_srcdir@/lzma/ASM/x86_64/7zCrcT8U_64.s
7zCrc.o: @top_srcdir@/lzma/C/7zCrc.c
$(CC) $(CFLAGS) $(LZMA_CFLAGS) @top_srcdir@/lzma/C/7zCrc.c
LzmaLib.o: @top_srcdir@/lzma/C/LzmaLib.c
$(CC) $(CFLAGS) $(LZMA_CFLAGS) @top_srcdir@/lzma/C/LzmaLib.c
LzmaDec.o: @top_srcdir@/lzma/C/LzmaDec.c
$(CC) $(CFLAGS) $(LZMA_CFLAGS) @top_srcdir@/lzma/C/LzmaDec.c
LzmaEnc.o: @top_srcdir@/lzma/C/LzmaEnc.c
$(CC) $(CFLAGS) $(LZMA_CFLAGS) @top_srcdir@/lzma/C/LzmaEnc.c
Threads.o: @top_srcdir@/lzma/C/Threads.c
$(CC) $(CFLAGS) $(LZMA_CFLAGS) @top_srcdir@/lzma/C/Threads.c
LzFind.o: @top_srcdir@/lzma/C/LzFind.c
$(CC) $(CFLAGS) $(LZMA_CFLAGS) @top_srcdir@/lzma/C/LzFind.c
LzFindMt.o: @top_srcdir@/lzma/C/LzFindMt.c
$(CC) $(CFLAGS) $(LZMA_CFLAGS) @top_srcdir@/lzma/C/LzFindMt.c
zpipe.o: zpipe.cpp
$(CXX) $(CXXFLAGS) -DNDEBUG zpipe.cpp
install: all
mkdir -p $(DESTDIR)${INSTALL_BIN}
${INSTALLCMD} -m 755 lrzip $(DESTDIR)${INSTALL_BIN}
( cd $(DESTDIR)${INSTALL_BIN} && ${LN_S} -f lrzip lrunzip )
${INSTALLCMD} -m 755 lrztar $(DESTDIR)${INSTALL_BIN}
mkdir -p $(DESTDIR)${INSTALL_MAN1}
${INSTALLCMD} -m 644 $(MAN1FILES) $(DESTDIR)${INSTALL_MAN1}
mkdir -p $(DESTDIR)${INSTALL_MAN5}
${INSTALLCMD} -m 644 $(MAN5FILES) $(DESTDIR)${INSTALL_MAN5}
mkdir -p $(DESTDIR)${INSTALL_DOC}
${INSTALLCMD} -m 644 $(DOCFILES) $(DESTDIR)${INSTALL_DOC}
mkdir -p $(DESTDIR)${INSTALL_DOC_LZMA}
${INSTALLCMD} -m 644 $(DOCFILES_LZMA) $(DESTDIR)${INSTALL_DOC_LZMA}
lrzip: $(OBJS)
$(CXX) $(LDFLAGS) -o lrzip $(OBJS) $(LIBS)
static: $(OBJS)
$(CXX) $(LDFLAGS) -static -o lrzip $(OBJS) $(LIBS)
clean:
-${RM} *~ $(OBJS) lrzip config.cache config.log config.status *.o

View file

@ -0,0 +1,92 @@
lrzip-0.41 update
Files created with lrzip 0.41 and selecting the -z option for
ZPAQ compression are not backwardly compatible.
lrzip-0.40 update!
FILES CREATED WITH LRZIP 0.40+ are not backward compatible with
versions prior to 0.40. The file format was completely changed
to 64bit addressing throughout to allow massive compression windows
on massive files. v0.40+ will detect older version files and
decompress them fine though, but will always generate files in
the new format.
Con Kolivas November 2009.
lrzip-0.24 update!
FILES CREATED WITH LRZIP 0.23 and earlier are NOT
BACKWARE COMPATIBLE if compressed with LZMA.
All other compression schemes are compatible.
The lrz file header is changed. It now stores the encoded
parameters LZMA uses in bytes 16-20. This is a departure
from the method used in lrzip-0.23.
Please preserve the binary of lrzip-0.23 or earlier if you
require access to lrzip files using LZMA compression created
with an earlier version.
FILES CREATED WITH LRZIP-0.22 MAY NOT BE BACKWARD COMPATIBLE!
lrzip-0.22 uses a slightly different and improved method of
compressing and decompressing files compared to lrzip-0.19 and
earlier versions.
ANY FILE COMPRESSED WITH LZMA USING A COMPRESSION LEVEL > 7
cannot be decompressed with any earlier version of lrzip.
ANY FILE COMPRESSED WITH LZMA USING A COMPRESSION LEVEL <=7
CAN be decompressed with earlier versions of lrzip.
ANY FILE COMPRESSED WITH AN EARLIER VERSION OF LRZIP CAN
be decompressed with lrzip-0.22
---------------------------------------------------------
Brief Technical discussion.
Earlier versions of lrzip used a variable dictionary buffer size
when compressing files with LZMA. It used a formula of
Compression Level + 14 bits. LZMA Dictionary buffer size was
computed as 2^(level+14). 2MB, 21 bits had been the default for
compression level 7. Level 8 was 4MB and level 9, 8MB.
The default decompression level was fixed at 23 bits, 8MB. This
was equal to the (then) largest possible dictionary buffer size,
9+14=23, 2^23=8MB. So all data regardless of compression level
could decompress.
Beginning in lrzip-0.22, the default dictionary buffer size is
Level + 16 bits (7+16=23 bits or 8MB). Files compressed with the
default level or lower CAN be decompressed with an earlier lrzip
version.
Since the the maximum dictionary buffer size for lrzip-0.22 is
now 25 bits, or 32MB. Files compressed using level 8 or level 9
(24 or 25 bits) cannot be decompressed with earlier versions of
lrzip since the fixed dictionary buffer size of 8MB used for
decompression in lrzip-0.19 and earlier cannot hold the data from
lrzip-0.22.
Here is a table to show what can and cannot be decompressed with
lrzip-0.19 and earlier
LRZIP-0.22 LRZIP-0.19
COMPRESSION CAN DICTIONARY
LEVEL DECOMPRESS? BUFFER SIZE
----------- ----------- -----------
<=7 YES <=8MB (2^23)
8 NO 16MB (2^24)
9 NO 32MB (2^25)
lrzip-0.22 can decompress all earlier files.
lrzip-0.22 uses three bytes in the compressed file to store the
compression level used. Thus, when decompressing, lrzip will read
the proper dictionary buffer size and use it when decompressing
the file. See the file magic.header.txt for more information.
January 2008
Peter Hyman
pete@peterhyman.com

18
TODO Normal file
View file

@ -0,0 +1,18 @@
TODO for lrzip program
Other posix/windows builds?? Need help there...
Add log file option so that output could be
saved for review.
Add test function that would only run lzo_compresses
for a current file without doing any writes.
Consider ncurses version or even GUI one.
Consider using LZMA Filters for processor-optimized
coding to increase compression.
Write a wrapper script for tarballing directories.
Get the ASM working on 64bit.

123
WHATS-NEW Normal file
View file

@ -0,0 +1,123 @@
lrzip-0.44
Added an lrztar wrapper to compress / decompress whole directories (finally).
Added -i option to give information about a compressed file.
lrzip-0.43
Darwin support updated. Should build on OSX v10.5+
Finally, stdin/stdout support.
Test archive integrity support.
ZPAQ support in config files.
lrzip-0.42
ZPAQ compression update now shows which rzip stream it's currently compressing
making the update more useful. It also doesn't update unnecessarily with every
byte compressed which was slowing it down a LOT.
lrzip-0.41
ZPAQ compression backend! ZPAQ is from the family of "paq" compressors that
have some of the best compression ratios around, but at the cost of extremely
long compression and equally long decompression times. This can be enabled
with the -z option and makes lrzip archives made with this not backwardly
compatible.
lrzip-0.40
Compression windows should be limited by available ram now on 64bit. The limit
on 32bit is still 2GB.
The compression advantages on large files on 64bit machines with large ram
should be substantially better.
The file format is no longer compatible with earlier versions of lrzip.
Support for decompressing older formats is present, but all new files will
be generated in the new format.
Minor speedups.
Decompression should no longer stall at 4GB boundaries for extended periods
making decompression much faster on files >4GB in size.
Documentation and benchmark updates galore.
lrzip-0.31
The window size limit is now 2GB on both 32bit and 64bit. While it appears to be
smaller than the old windows, only 900MB was being used on .30 even though it
claimed to use more. This can cause huge improvements in the compression of very
large files. Flushing of data to disk between compression windows was
implemented to minimise disk thrashing of read vs write.
Con Kolivas
November 2009
lrzip-0.30
-P option to not set permissions on output files allowing you to write to
braindead filesystems (eg fat32).
Probably other weird and wonderful bugs have been introduced.
Con Kolivas
November 2009
lrzip-0.24 has updated functionality
FEATURE ENHANCEMENTS
lrzip.conf file may be used to set default parameters.
Omit conf using environment: LRZIP=NOCONFIG lrzip.....
LRZIP environment variable may be used in the future
to store certain types of parameters.
LZMA SDK has been upgraded to version 4.63. This
version fixes some problems certain users observed,
and is much simpler using a C-only wrapper
interface.
lrzip now is able to compute an ETA for completion.
In order to do this, the file must be larger than
one compression window in size. That is, is the
compression window is 500MB, and the file is 1GB,
then after the first pass, an ETA will be computed.
If the file is smaller, then no estimate can be made.
lrzip is now able to compute MB/s transfer speeds
for both compression and decompression.
CLEANUPS
Some file cleanups have been done.
Peter Hyman
January 2009
pete@peterhyman.com
lrzip-0.22 update
FEATURE ENHANCEMENTS
-g option. Now supports gzip compression. Very fast!
Expanded dictionary buffer size in lzma compressor.
Variable, expanded dictionary size buffer in both lzma
compressor and decompressor.
Improved output during compression when using -vv.
Multi-threading supprt when using multiple processors
or dual core processors when using lzma compression.
This results in a nearly 2x speed improvement.
Assembler module support to speed up CRC checking.
Improvements in autotools usage, system detection
and Makefile enhancements.
Lrzip now has a timer that will print total time
at the end of a compression or decompression if
-q command line option is not used.
BUG FIX!!!
Even though lrzip uses a compression threshold to
prevent the lzma compressor from getting data that
may not be compressible, there was still a possibility
that lrzip could hang. This was because a data chunk
could contain an uncompressible segment and if the
lzma compressor got it, it would hang.
THANKS TO LASSE COLLIN for uncovering the error in
the lzma wrapper code that was causing the hangup.
January 2008
Peter Hyman
pete@peterhyman.com

76
aclocal.m4 vendored Normal file
View file

@ -0,0 +1,76 @@
#serial 12
# This file is used by aclocal to generate aclocal.m4
dnl By default, many hosts won't let programs access large files;
dnl one must use special compiler options to get large-file access to work.
dnl For more details about this brain damage please see:
dnl http://www.sas.com/standards/large.file/x_open.20Mar96.html
dnl Written by Paul Eggert <eggert@twinsun.com>.
dnl Internal subroutine of AC_SYS_LARGEFILE.
dnl AC_SYS_LARGEFILE_TEST_INCLUDES
AC_DEFUN([AC_SYS_LARGEFILE_TEST_INCLUDES],
[[#include <sys/types.h>
int a[(off_t) 9223372036854775807 == 9223372036854775807 ? 1 : -1];
]])
dnl Internal subroutine of AC_SYS_LARGEFILE.
dnl AC_SYS_LARGEFILE_MACRO_VALUE(C-MACRO, VALUE, CACHE-VAR, COMMENT, INCLUDES, FUNCTION-BODY)
AC_DEFUN([AC_SYS_LARGEFILE_MACRO_VALUE],
[AC_CACHE_CHECK([for $1 value needed for large files], $3,
[$3=no
AC_TRY_COMPILE(AC_SYS_LARGEFILE_TEST_INCLUDES
$5
,
[$6],
,
[AC_TRY_COMPILE([#define $1 $2]
AC_SYS_LARGEFILE_TEST_INCLUDES
$5
,
[$6],
[$3=$2])])])
if test "[$]$3" != no; then
AC_DEFINE_UNQUOTED([$1], [$]$3, [$4])
fi])
AC_DEFUN([AC_SYS_LARGEFILE],
[AC_ARG_ENABLE(largefile,
[ --disable-largefile omit support for large files])
if test "$enable_largefile" != no; then
AC_CACHE_CHECK([for special C compiler options needed for large files],
ac_cv_sys_largefile_CC,
[ac_cv_sys_largefile_CC=no
if test "$GCC" != yes; then
# IRIX 6.2 and later do not support large files by default,
# so use the C compiler's -n32 option if that helps.
AC_TRY_COMPILE(AC_SYS_LARGEFILE_TEST_INCLUDES, , ,
[ac_save_CC="$CC"
CC="$CC -n32"
AC_TRY_COMPILE(AC_SYS_LARGEFILE_TEST_INCLUDES, ,
ac_cv_sys_largefile_CC=' -n32')
CC="$ac_save_CC"])
fi])
if test "$ac_cv_sys_largefile_CC" != no; then
CC="$CC$ac_cv_sys_largefile_CC"
fi
AC_SYS_LARGEFILE_MACRO_VALUE(_FILE_OFFSET_BITS, 64,
ac_cv_sys_file_offset_bits,
[Number of bits in a file offset, on hosts where this is settable.])
AC_SYS_LARGEFILE_MACRO_VALUE(_LARGEFILE_SOURCE, 1,
ac_cv_sys_largefile_source,
[Define to make ftello visible on some hosts (e.g. HP-UX 10.20).],
[#include <stdio.h>], [return !ftello;])
AC_SYS_LARGEFILE_MACRO_VALUE(_LARGE_FILES, 1,
ac_cv_sys_large_files,
[Define for large files, on AIX-style hosts.])
AC_SYS_LARGEFILE_MACRO_VALUE(_XOPEN_SOURCE, 500,
ac_cv_sys_xopen_source,
[Define to make ftello visible on some hosts (e.g. glibc 2.1.3).],
[#include <stdio.h>], [return !ftello;])
fi
])

1516
config.guess vendored Normal file

File diff suppressed because it is too large Load diff

135
config.h Normal file
View file

@ -0,0 +1,135 @@
/* config.h. Generated from config.h.in by configure. */
/* config.h.in. Generated from configure.in by autoheader. */
/* Define to 1 if you have the <ctype.h> header file. */
#define HAVE_CTYPE_H 1
/* Define to 1 if errno.h present */
#define HAVE_ERRNO_DECL 1
/* Define to 1 if you have the <fcntl.h> header file. */
#define HAVE_FCNTL_H 1
/* Define to 1 if you have the `getopt_long' function. */
#define HAVE_GETOPT_LONG 1
/* Define to 1 if you have the <inttypes.h> header file. */
#define HAVE_INTTYPES_H 1
/* */
#define HAVE_LARGE_FILES 1
/* Define to 1 if you have the `bz2' library (-lbz2). */
#define HAVE_LIBBZ2 1
/* Define to 1 if you have the `lzo2' library (-llzo2). */
#define HAVE_LIBLZO2 1
/* Define to 1 if you have the `m' library (-lm). */
#define HAVE_LIBM 1
/* Define to 1 if you have the `pthread' library (-lpthread). */
#define HAVE_LIBPTHREAD 1
/* Define to 1 if you have the `z' library (-lz). */
#define HAVE_LIBZ 1
/* Define to 1 if you have the <memory.h> header file. */
#define HAVE_MEMORY_H 1
/* Define to 1 if you have the `mmap' function. */
#define HAVE_MMAP 1
/* Define to 1 if you have the <stdint.h> header file. */
#define HAVE_STDINT_H 1
/* Define to 1 if you have the <stdlib.h> header file. */
#define HAVE_STDLIB_H 1
/* Define to 1 if you have the `strerror' function. */
#define HAVE_STRERROR 1
/* Define to 1 if you have the <strings.h> header file. */
#define HAVE_STRINGS_H 1
/* Define to 1 if you have the <string.h> header file. */
#define HAVE_STRING_H 1
/* Define to 1 if you have the <sys/ioctl.h> header file. */
#define HAVE_SYS_IOCTL_H 1
/* Define to 1 if you have the <sys/param.h> header file. */
#define HAVE_SYS_PARAM_H 1
/* Define to 1 if you have the <sys/stat.h> header file. */
#define HAVE_SYS_STAT_H 1
/* Define to 1 if you have the <sys/time.h> header file. */
#define HAVE_SYS_TIME_H 1
/* Define to 1 if you have the <sys/types.h> header file. */
#define HAVE_SYS_TYPES_H 1
/* Define to 1 if you have the <sys/unistd.h> header file. */
#define HAVE_SYS_UNISTD_H 1
/* Define to 1 if you have the <sys/wait.h> header file. */
#define HAVE_SYS_WAIT_H 1
/* Define to 1 if you have the <unistd.h> header file. */
#define HAVE_UNISTD_H 1
/* Define to the address where bug reports for this package should be sent. */
#define PACKAGE_BUGREPORT "kernel@kolivas.org"
/* Define to the full name of this package. */
#define PACKAGE_NAME "lrzip"
/* Define to the full name and version of this package. */
#define PACKAGE_STRING "lrzip 0.44"
/* Define to the one symbol short name of this package. */
#define PACKAGE_TARNAME "lrzip-0.44"
/* Define to the version of this package. */
#define PACKAGE_VERSION "0.44"
/* The size of `int', as computed by sizeof. */
#define SIZEOF_INT 4
/* The size of `long', as computed by sizeof. */
#define SIZEOF_LONG 8
/* The size of `short', as computed by sizeof. */
#define SIZEOF_SHORT 2
/* Define to 1 if you have the ANSI C header files. */
#define STDC_HEADERS 1
/* Number of bits in a file offset, on hosts where this is settable. */
/* #undef _FILE_OFFSET_BITS */
/* Define to make ftello visible on some hosts (e.g. HP-UX 10.20). */
/* #undef _LARGEFILE_SOURCE */
/* Define for large files, on AIX-style hosts. */
/* #undef _LARGE_FILES */
/* Define to make ftello visible on some hosts (e.g. glibc 2.1.3). */
/* #undef _XOPEN_SOURCE */
/* Define to `__inline__' or `__inline' if that's what the C compiler
calls it, or to nothing if 'inline' is not supported under any name. */
#ifndef __cplusplus
/* #undef inline */
#endif
/* Define to `long int' if <sys/types.h> does not define. */
/* #undef off_t */
/* Define to `unsigned int' if <sys/types.h> does not define. */
/* #undef size_t */
#define _GNU_SOURCE
#define _LARGEFILE64_SOURCE
#define _FILE_OFFSET_BITS 64

134
config.h.in Normal file
View file

@ -0,0 +1,134 @@
/* config.h.in. Generated from configure.in by autoheader. */
/* Define to 1 if you have the <ctype.h> header file. */
#undef HAVE_CTYPE_H
/* Define to 1 if errno.h present */
#undef HAVE_ERRNO_DECL
/* Define to 1 if you have the <fcntl.h> header file. */
#undef HAVE_FCNTL_H
/* Define to 1 if you have the `getopt_long' function. */
#undef HAVE_GETOPT_LONG
/* Define to 1 if you have the <inttypes.h> header file. */
#undef HAVE_INTTYPES_H
/* */
#undef HAVE_LARGE_FILES
/* Define to 1 if you have the `bz2' library (-lbz2). */
#undef HAVE_LIBBZ2
/* Define to 1 if you have the `lzo2' library (-llzo2). */
#undef HAVE_LIBLZO2
/* Define to 1 if you have the `m' library (-lm). */
#undef HAVE_LIBM
/* Define to 1 if you have the `pthread' library (-lpthread). */
#undef HAVE_LIBPTHREAD
/* Define to 1 if you have the `z' library (-lz). */
#undef HAVE_LIBZ
/* Define to 1 if you have the <memory.h> header file. */
#undef HAVE_MEMORY_H
/* Define to 1 if you have the `mmap' function. */
#undef HAVE_MMAP
/* Define to 1 if you have the <stdint.h> header file. */
#undef HAVE_STDINT_H
/* Define to 1 if you have the <stdlib.h> header file. */
#undef HAVE_STDLIB_H
/* Define to 1 if you have the `strerror' function. */
#undef HAVE_STRERROR
/* Define to 1 if you have the <strings.h> header file. */
#undef HAVE_STRINGS_H
/* Define to 1 if you have the <string.h> header file. */
#undef HAVE_STRING_H
/* Define to 1 if you have the <sys/ioctl.h> header file. */
#undef HAVE_SYS_IOCTL_H
/* Define to 1 if you have the <sys/param.h> header file. */
#undef HAVE_SYS_PARAM_H
/* Define to 1 if you have the <sys/stat.h> header file. */
#undef HAVE_SYS_STAT_H
/* Define to 1 if you have the <sys/time.h> header file. */
#undef HAVE_SYS_TIME_H
/* Define to 1 if you have the <sys/types.h> header file. */
#undef HAVE_SYS_TYPES_H
/* Define to 1 if you have the <sys/unistd.h> header file. */
#undef HAVE_SYS_UNISTD_H
/* Define to 1 if you have the <sys/wait.h> header file. */
#undef HAVE_SYS_WAIT_H
/* Define to 1 if you have the <unistd.h> header file. */
#undef HAVE_UNISTD_H
/* Define to the address where bug reports for this package should be sent. */
#undef PACKAGE_BUGREPORT
/* Define to the full name of this package. */
#undef PACKAGE_NAME
/* Define to the full name and version of this package. */
#undef PACKAGE_STRING
/* Define to the one symbol short name of this package. */
#undef PACKAGE_TARNAME
/* Define to the version of this package. */
#undef PACKAGE_VERSION
/* The size of `int', as computed by sizeof. */
#undef SIZEOF_INT
/* The size of `long', as computed by sizeof. */
#undef SIZEOF_LONG
/* The size of `short', as computed by sizeof. */
#undef SIZEOF_SHORT
/* Define to 1 if you have the ANSI C header files. */
#undef STDC_HEADERS
/* Number of bits in a file offset, on hosts where this is settable. */
#undef _FILE_OFFSET_BITS
/* Define to make ftello visible on some hosts (e.g. HP-UX 10.20). */
#undef _LARGEFILE_SOURCE
/* Define for large files, on AIX-style hosts. */
#undef _LARGE_FILES
/* Define to make ftello visible on some hosts (e.g. glibc 2.1.3). */
#undef _XOPEN_SOURCE
/* Define to `__inline__' or `__inline' if that's what the C compiler
calls it, or to nothing if 'inline' is not supported under any name. */
#ifndef __cplusplus
#undef inline
#endif
/* Define to `long int' if <sys/types.h> does not define. */
#undef off_t
/* Define to `unsigned int' if <sys/types.h> does not define. */
#undef size_t
#define _GNU_SOURCE
#define _LARGEFILE64_SOURCE
#define _FILE_OFFSET_BITS 64

1579
config.sub vendored Normal file

File diff suppressed because it is too large Load diff

6134
configure vendored Executable file

File diff suppressed because it is too large Load diff

88
configure.ac Normal file
View file

@ -0,0 +1,88 @@
dnl Process this file with autoconf to produce a configure script.
AC_INIT([lrzip],[0.44],[kernel@kolivas.org],[lrzip-0.44])
AC_CONFIG_HEADER(config.h)
# see what our system is!
AC_CANONICAL_HOST
AC_ARG_ENABLE(
asm,
[AC_HELP_STRING([--enable-asm],[Enable native Assembly code])],
ASM=$enableval,
ASM=yes
)
if test x"$ASM" = xyes; then
AC_CHECK_PROG( ASM, nasm, yes, no )
fi
dnl Checks for programs.
AC_PROG_CC
AC_PROG_CXX
AC_PROG_INSTALL
AC_PROG_LN_S
AC_SUBST(SHELL)
AC_SYS_LARGEFILE
AC_CHECK_HEADERS(fcntl.h sys/time.h sys/unistd.h unistd.h)
AC_CHECK_HEADERS(sys/param.h ctype.h sys/wait.h sys/ioctl.h)
AC_CHECK_HEADERS(string.h stdlib.h sys/types.h)
AC_TYPE_OFF_T
AC_TYPE_SIZE_T
AC_CHECK_SIZEOF(int)
AC_CHECK_SIZEOF(long)
AC_CHECK_SIZEOF(short)
AC_CACHE_CHECK([for large file support],rzip_cv_HAVE_LARGE_FILES,[
AC_RUN_IFELSE([AC_LANG_SOURCE([[
#include <stdio.h>
#include <sys/types.h>
main() { return (sizeof(off_t) == 4); }]])],[rzip_cv_HAVE_LARGE_FILES=yes],[rzip_cv_HAVE_LARGE_FILES=no],[rzip_cv_HAVE_LARGE_FILES=cross])])
if test x"$rzip_cv_HAVE_LARGE_FILES" = x"yes"; then
AC_DEFINE(HAVE_LARGE_FILES, 1, [ ])
fi
AC_C_INLINE
AC_CHECK_LIB(pthread, pthread_create, ,
AC_MSG_ERROR([Could not find pthread library - please install libpthread]))
AC_CHECK_LIB(m, sqrt, ,
AC_MSG_ERROR([Could not find math library - please install libm]))
AC_CHECK_LIB(z, compress2, ,
AC_MSG_ERROR([Could not find zlib library - please install zlib-dev]))
AC_CHECK_LIB(bz2, BZ2_bzBuffToBuffCompress, ,
AC_MSG_ERROR([Could not find bz2 library - please install libbz2-dev]))
AC_CHECK_LIB(lzo2, lzo1x_1_compress, ,
AC_MSG_ERROR([Could not find lzo2 library - please install liblzo2-dev]))
AC_DEFINE([HAVE_ERRNO_DECL],[0],[Define to 1 if errno.h present])
echo $ECHO_N "checking for errno in errno.h... $ECHO_C"
AC_COMPILE_IFELSE([AC_LANG_PROGRAM([[#include <errno.h>]], [[int i = errno]])],[echo yes; AC_DEFINE(HAVE_ERRNO_DECL)],[echo no])
AC_CHECK_FUNCS(mmap strerror)
AC_CHECK_FUNCS(getopt_long)
# final checks for x86 and/or assembler
if test x"$ASM" = x"no"; then
ASM_OBJ=7zCrc.o
ASM=
else
case $host in
i?86-*)
ASM_OBJ="7zCrcT8.o 7zCrcT8U.o"
ASM="nasm -f elf" ;;
# x86_64 code is broken still
# x86_64-*)
# ASM_OBJ="7zCrcT8.o 7zCrcT8U_64.o"
# ASM="nasm -f elf64" ;;
*) ASM_OBJ=7zCrc.o ;;
esac
fi
AC_SUBST([ASM_OBJ])
AC_SUBST([ASM])
AC_CONFIG_FILES([Makefile])
AC_OUTPUT

1
description-pak Normal file
View file

@ -0,0 +1 @@
lrzip

44
doc/README.Assembler Normal file
View file

@ -0,0 +1,44 @@
README.Assembler
Notes about CRC Assembly Language Coding.
lrzip-0.21 makes use of an x86 assembly language file
that optimizes CRC computation used in lrzip. It includes
a wrapper C file, 7zCrcT8.c and the assembler code,
7zCrcT8U.s.
configure should detect your host system properly
and adjust the Makefile accordingly. If you don't
have the nasm assembler or have a ppc or other non-
x86 system, the standard C CRC routines will be
compiled and linked in.
If for any reason configure does not properly
detect your system type, or you do not want assembler
modules to be compiled, you can run
ASM=no ./configure
which will automatically not include the asm module or
change the line
ASM_OBJ=7zCrcT8.o 7zCrcT8U.o
to
ASM_OBJ=7zCrc.o
in Makefile. This will change the dependency tree.
To force assembly module compilation and linking (if
configure does not detect your system type properly),
type
ASM=yes ./configure
or change the Makefile to include the ASM_OBJ files
as described above.
Type `make clean' and then re-run make.
Peter Hyman
pete@peterhyman.com

120
doc/README.benchmarks Normal file
View file

@ -0,0 +1,120 @@
These are benchmarks performed on a 3GHz quad core Intel Core2 with 8GB ram
using lrzip v0.42.
The first comparison is that of a linux kernel tarball (2.6.31). In all cases
the default options were used. 3 other common compression apps were used for
comparison, 7z which is an excellent all-round lzma based compression app,
gzip which is the benchmark fast standard that has good compression, and bzip2
which is the most common linux used compression.
In the following tables, lrzip means lrzip default options, lrzip(lzo) means
lrzip using the lzo backend, lrzip(gzip) means using the gzip backend,
lrzip(bzip2) means using the bzip2 backend and lrzip(zpaq) means using the zpaq
backend.
linux-2.6.31.tar
Compression Size Percentage Compress Decompress
None 365711360 100
7z 53315279 14.6 2m4.770s 0m5.360s
lrzip 52372722 14.3 2m48.477s 0m8.336s
lrzip(zpaq) 43455498 11.9 10m11.335 10m14.296
lrzip(lzo) 112151676 30.7 0m14.913s 0m5.063s
lrzip(gzip) 73476127 20.1 0m29.628s 0m5.591s
lrzip(bzip2) 60851152 16.6 0m43.539s 0m12.244s
bzip2 62416571 17.1 0m44.493s 0m9.819s
gzip 80563601 22.0 0m14.343s 0m2.781s
These results are interesting to note the compression of lrzip by default is
only slightly better than lzma, but at some cost in time at the compress and
decompress end of the spectrum. Clearly zpaq compression is much better than any
other compression algorithm by far, but the speed cost on both compression and
decompression is extreme. At this size compression, lzo is interesting because
it's faster than simply copying the file but only offers modest compression.
What lrzip offers at this end of the spectrum is extreme compression if
desired.
Let's take two kernel trees one version apart as a tarball, linux-2.6.31 and
linux-2.6.32-rc8. These will show lots of redundant information, but hundreds
of megabytes apart, which lrzip will be very good at compressing. For
simplicity, only 7z will be compared since that's by far the best general
purpose compressor at the moment:
Tarball of two kernel trees, one version apart.
Compression Size Percentage Compress Decompress
None 749066240 100
7z 108710624 14.5 4m4.260s 0m11.133s
lrzip 57943094 7.7 3m08.788s 0m10.747s
lrzip(lzo) 124029899 16.6 0m18.997s 0m7.107s
Things start getting very interesting now when lrzip is really starting to
shine. Note how it's not that much larger for 2 kernel trees than it was for
one. That's because all the similar data in both kernel trees is being
compressed as one copy and only the differences really make up the extra size.
All compression software does this, but not over such large distances. If you
copy the same data over multiple times, the resulting lrzip archive doesn't
get much larger at all.
Using the first example (linux-2.6.31.tar) and simply copying the data multiple
times over gives these results with lrzip(lzo):
Copies Size Compressed Compress Decompress
1 365711360 112151676 0m14.913s 0m5.063s
2 731422720 112151829 0m16.174s 0m6.543s
3 1097134080 112151832 0m17.466s 0m8.115s
I had the amusing thought that this compression software could be used as a
bullshit detector if you were to compress peoples' speeches because if their
talks were full of catchphrases and not much actual content, it would all be
compressed down. So the larger the final archive, the less bullshit =)
Now let's move on to the other special feature of lrzip, the ability to
compress massive amounts of data on huge ram machines by using massive
compression windows. This is a 10GB virtual image of an installed operating
system and some basic working software on it. The default options on the
8GB machine meant that it was using a 5 GB window.
10GB Virtual image:
Compression Size Percentage Compress Time Decompress Time
None 10737418240 100.0
gzip 2772899756 25.8 7m52.667s 4m8.661s
bzip2 2704781700 25.2 20m34.269s 7m51.362s
xz 2272322208 21.2 58m26.829s 4m46.154s
7z 2242897134 20.9 29m28.152s 6m35.952s
lrzip 1361276826 12.7 27m45.874s 9m20.046
lrzip(lzo) 1837206675 17.1 4m48.167s 8m28.842s
lrzip(zpaq) 1341008779 12.5 4h11m14s
lrzip(zpaq)M 1270134391 11.8 4h30m14
lrzip(zpaq)MW 1066902006 9.9
At this end of the spectrum things really start to heat up. The compression
advantage is massive, with the lzo backend even giving much better results
than 7z, and over a ridiculously short time. Note that it's not much longer
than it takes to just *read* a 10GB file. Unfortunately at these large
compression windows, the decompression time is significantly longer, but
it's a fair tradeoff I believe :) What appears to be a big disappointment is
actually zpaq here which takes more than 8 times longer than lzma for a measly
.2% improvement. The reason is that most of the advantage here is achieved by
the rzip first stage. The -M option was included here for completeness to see
what the maximum possible compression was for this file on this machine, while
the MW run was with the options -W 200 (to make the window larger than the
file and the ram the machine has), and it still completed but induced a lot
of swap in the interim.
This should help govern what compression you choose. Small files are nicely
compressed with zpaq. Intermediate files are nicely compressed with lzma.
Large files get excellent results even with lzo provided you have enough ram.
(Small being < 100MB, intermediate <1GB, large >1GB).
Or, to make things easier, just use the default settings all the time and be
happy as lzma gives good results. :D
Con Kolivas
Sat, 19 Dec 2009

View file

@ -0,0 +1,118 @@
An explanation of the revised lzo_compresses function in stream.c.
The modifications to the lrzip program for 0.19 centered around an
attempt to catch data chunks that would cause lzma compression to either
take an inordinately long time or not complete at all. The files that
could cause problems for lzma are already-compressed files, multimedia
files, files that have compressed files in them, and files with
randomized data (such as an encrypted volume or file).
The lzo_compresses function is used to assess the data and return
a TRUE or FALSE to the lzma_compress_buf function based on whether or
not the function determined the data to be compressible or not. The
simple formula cdata < odata was used (c=compressed, o=original).
Some test cases were slipping through and caused the hangups. Beginning
with lrzip-0.19 a new option, -T, test compression threshold has been
introduced and sets configurable limits as to what is considered a
compressible data chunk and what is not.
In addition, with very large chunks of data, a small modification was
made to the initial test buffer size to make it more representative of
the entire sample.
To go along with this, increased verbosity was added to the function
so that the user/evaluator can better see what is going on. -v or -vv
can be used to increase informational output.
Functional Overview:
Data chunks are passed to the lzo_copresses function in two streams.
The first is the small data set in the primary hashing bucket which
can be seen when using the -v or -vv option. This is normally a small
sample. The second stream will be the rest. The size of the streams
are dependent on how the long range analysis that is performed on
the entire file and available memory.
After analysis of the data chunk, a value of TRUE or FALSE is returned
and lzma compression will either commence or be skipped. If skipped,
data written out to the .lrz file will simply be the rzip data which
is the reorganized data based on long range analysis.
The lzo_compresses function traverses through the data chunk comparing
larger and larger blocks. If suitable compression ratios are found,
the function ends and returns TRUE. If not, and the largest sample
block size has been reached, the function will traverse deeper into
the chunk and analyze that region. Anytime a compressible area is
found, the function returns TRUE. When the end of the data chunk has
been reached and no suitable compressible blocks found, the program
will return FALSE.
Under most circumstances, this logic was fine. However, if the test
found a chunk that could only achieve 2% compression, for example,
this type of result could adversely affect the lzma compression
routine. Hence, the concept of a limiting threshold.
The threshold option works as a limiter that forces the lzo_compresses
function to not just compare the estimated compressed size with the
original, but to add a limiting threshold. This ranges a very low
threshold, 1, to a very strict, 10. A threshold of 1 means that for
the function to return TRUE, the estimated compressed data size for
the current data chunk can be between 90-100% of the original size.
This means that almost no compressible data is observed or tested for.
A value of 2, means that the data MUST compress better than 90% of
the original size. However, if the observed compression of the data
chunk is over 90% of the original size, then lzo_compresses will fail.
Each additional threshold value will increase the strictness according
to the following formula
CDS = Observed Compressed Data Size from LZO
ODS = Original Data chunk size
T = Threshold
To return TRUE, CDS < ODS * (1.1-T/10)
At T=1, just 0.01% compression would be OK,
T=2, anything better than 10% would be OK, but under 10% compression would fail.
T=3, anything better 20% would be OK, but under 20% compression would fail.
...
T=10, I can't imagine a use for this. Anything better than 90% compression
would be OK. This would imply that LZO would need to get a 10x compression
ratio.
The following actual output from the lzo_compresses function will help
explain.
22501 in primary bucket (0.805%)
lzo testing for incompressible data...OK for chunk 43408.
Compressed size = 52.58% of chunk, 1 Passes
Progress percentage pausing during lzma compression...
lzo testing for incompressible data...FAILED - below threshold for chunk 523245383.
Compressed size = 98.87% of chunk, 50 Passes
This was for a video .VOB file of 1GB. A compression threshold of 2 was used.
-T 2 means that the estimated compression size of the data chunk had to be
better than 90% of the original size.
There were 43,408 bytes in the primary hash bucket and this chunk was
evaluated by lzo_compresses. The function estimated that the compressed
data size would be 52.58% of the original 43,408 byte chunk. This resulted
in LZMA compression occurring.
The second data chunk which included the rest of the data in the current hash,
523,245,383 bytes, failed the test. the lzo_compresses function made 50 passes
through the data using progressively larger samples until it reached the end
of the data chunk. It could not find better than a 1.2% compression benefit
and therefore FAILED, The result was NO LZMA compression and the data chunk
was written to the .lrz file in rzip format (no compression).
The higher the threshold option, the faster the LZMA compression will occur.
However, this could also cause some chunks that are compressible to be
omitted. After much testing, -T 2 seems to work very well in stopping data
which will cause LZMA to hang yet allow most compressible data to come
through.
Peter Hyman
pete@peterhyman.com
December 2007

45
doc/lrzip.conf.example Normal file
View file

@ -0,0 +1,45 @@
# lrzip.conf example file
# anything beginning with a # or whitespace will be ignored
# valid parameters are separated with an = and a value
# parameters and values are not case sensitive
#
# lrzip 0.24, peter hyman, pete@peterhyman.com
# ignored by earlier versions.
# Compression Window size in 100MB. Normally selected by program.
WINDOW = 5
# Compression Level 1-9 (7 Default).
COMPRESSIONLEVEL = 7
# Compression Method, rzip, gzip, bzip2, lzo, or lzma (default).
# If specified here, command line options not usable.
# COMPRESSIONMETHOD = lzo
# Test Threshold value 1-10 (2 Default).
TESTTHRESHOLD = 2
# Default output directory
# OUTPUTDIRECTORY = location
# Verbosity, true or 1, or max or 2
VERBOSITY = max
# Show Progress as file is parsed, true or 1, false or 0
SHOWPROGRESS = true
# Set Niceness. 19 is default. -20 to 19 is the allowable range
NICE = 19
# Delete source file after compression
# this parameter and value are case sensitive
# value must be YES to activate
# DELETEFILES = NO
# Replace existing lrzip file when compressing
# this parameter and value are case sensitive
# value must be YES to activate
# REPLACEFILE = YES

41
doc/magic.header.txt Normal file
View file

@ -0,0 +1,41 @@
lrzip-0.40+ file header format
November 2009
Con Kolivas
Byte Content
0-3 LRZI
4 LRZIP Major Version Number
5 LRZIP Minor Version Number
6-14 Source File Size
16-20 LZMA Properties Encoded (lc,lp,pb,fb, and dictionary size)
21-22 not used
23-48 Stream 1 header data
49-74 Stream 2 header data
Block Data:
Byte:
0 Compressed data type
1-8 Compressed data length
9-16 Uncompressed data length
17-24 Next block head
25+ Data
End:
0-1 crc data
lrzip-0.24+ file header format
January 2009
Peter Hyman, pete@peterhyman.com
Byte Content
0-3 LRZI
4 LRZIP Major Version Number
5 LRZIP Minor Version Number
6-9 Source File Size (no HAVE_LARGE_FILES)
6-14 Source File Size
16-20 LZMA Properties Encoded (lc,lp,pb,fb, and dictionary size)
21-22 not used
23-36 Stream 1 header data
37-50 Stream 2 header data
51 Compressed data type

238
install-sh Executable file
View file

@ -0,0 +1,238 @@
#! /bin/sh
#
# install - install a program, script, or datafile
# This comes from X11R5.
#
# Calling this script install-sh is preferred over install.sh, to prevent
# `make' implicit rules from creating a file called install from it
# when there is no Makefile.
#
# This script is compatible with the BSD install script, but was written
# from scratch.
#
# set DOITPROG to echo to test this script
# Don't use :- since 4.3BSD and earlier shells don't like it.
doit="${DOITPROG-}"
# put in absolute paths if you don't have them in your path; or use env. vars.
mvprog="${MVPROG-mv}"
cpprog="${CPPROG-cp}"
chmodprog="${CHMODPROG-chmod}"
chownprog="${CHOWNPROG-chown}"
chgrpprog="${CHGRPPROG-chgrp}"
stripprog="${STRIPPROG-strip}"
rmprog="${RMPROG-rm}"
mkdirprog="${MKDIRPROG-mkdir}"
transformbasename=""
transform_arg=""
instcmd="$mvprog"
chmodcmd="$chmodprog 0755"
chowncmd=""
chgrpcmd=""
stripcmd=""
rmcmd="$rmprog -f"
mvcmd="$mvprog"
src=""
dst=""
dir_arg=""
while [ x"$1" != x ]; do
case $1 in
-c) instcmd="$cpprog"
shift
continue;;
-d) dir_arg=true
shift
continue;;
-m) chmodcmd="$chmodprog $2"
shift
shift
continue;;
-o) chowncmd="$chownprog $2"
shift
shift
continue;;
-g) chgrpcmd="$chgrpprog $2"
shift
shift
continue;;
-s) stripcmd="$stripprog"
shift
continue;;
-t=*) transformarg=`echo $1 | sed 's/-t=//'`
shift
continue;;
-b=*) transformbasename=`echo $1 | sed 's/-b=//'`
shift
continue;;
*) if [ x"$src" = x ]
then
src=$1
else
# this colon is to work around a 386BSD /bin/sh bug
:
dst=$1
fi
shift
continue;;
esac
done
if [ x"$src" = x ]
then
echo "install: no input file specified"
exit 1
else
true
fi
if [ x"$dir_arg" != x ]; then
dst=$src
src=""
if [ -d $dst ]; then
instcmd=:
else
instcmd=mkdir
fi
else
# Waiting for this to be detected by the "$instcmd $src $dsttmp" command
# might cause directories to be created, which would be especially bad
# if $src (and thus $dsttmp) contains '*'.
if [ -f $src -o -d $src ]
then
true
else
echo "install: $src does not exist"
exit 1
fi
if [ x"$dst" = x ]
then
echo "install: no destination specified"
exit 1
else
true
fi
# If destination is a directory, append the input filename; if your system
# does not like double slashes in filenames, you may need to add some logic
if [ -d $dst ]
then
dst="$dst"/`basename $src`
else
true
fi
fi
## this sed command emulates the dirname command
dstdir=`echo $dst | sed -e 's,[^/]*$,,;s,/$,,;s,^$,.,'`
# Make sure that the destination directory exists.
# this part is taken from Noah Friedman's mkinstalldirs script
# Skip lots of stat calls in the usual case.
if [ ! -d "$dstdir" ]; then
defaultIFS='
'
IFS="${IFS-${defaultIFS}}"
oIFS="${IFS}"
# Some sh's can't handle IFS=/ for some reason.
IFS='%'
set - `echo ${dstdir} | sed -e 's@/@%@g' -e 's@^%@/@'`
IFS="${oIFS}"
pathcomp=''
while [ $# -ne 0 ] ; do
pathcomp="${pathcomp}${1}"
shift
if [ ! -d "${pathcomp}" ] ;
then
$mkdirprog "${pathcomp}"
else
true
fi
pathcomp="${pathcomp}/"
done
fi
if [ x"$dir_arg" != x ]
then
$doit $instcmd $dst &&
if [ x"$chowncmd" != x ]; then $doit $chowncmd $dst; else true ; fi &&
if [ x"$chgrpcmd" != x ]; then $doit $chgrpcmd $dst; else true ; fi &&
if [ x"$stripcmd" != x ]; then $doit $stripcmd $dst; else true ; fi &&
if [ x"$chmodcmd" != x ]; then $doit $chmodcmd $dst; else true ; fi
else
# If we're going to rename the final executable, determine the name now.
if [ x"$transformarg" = x ]
then
dstfile=`basename $dst`
else
dstfile=`basename $dst $transformbasename |
sed $transformarg`$transformbasename
fi
# don't allow the sed command to completely eliminate the filename
if [ x"$dstfile" = x ]
then
dstfile=`basename $dst`
else
true
fi
# Make a temp file name in the proper directory.
dsttmp=$dstdir/#inst.$$#
# Move or copy the file name to the temp name
$doit $instcmd $src $dsttmp &&
trap "rm -f ${dsttmp}" 0 &&
# and set any options; do chmod last to preserve setuid bits
# If any of these fail, we abort the whole thing. If we want to
# ignore errors from any of these, just make sure not to ignore
# errors from the above "$doit $instcmd $src $dsttmp" command.
if [ x"$chowncmd" != x ]; then $doit $chowncmd $dsttmp; else true;fi &&
if [ x"$chgrpcmd" != x ]; then $doit $chgrpcmd $dsttmp; else true;fi &&
if [ x"$stripcmd" != x ]; then $doit $stripcmd $dsttmp; else true;fi &&
if [ x"$chmodcmd" != x ]; then $doit $chmodcmd $dsttmp; else true;fi &&
# Now rename the file to the real destination.
$doit $rmcmd -f $dstdir/$dstfile &&
$doit $mvcmd $dsttmp $dstdir/$dstfile
fi &&
exit 0

48
lrztar Executable file
View file

@ -0,0 +1,48 @@
#!/bin/bash
# A basic skeleton for making a new lrzip tar wrapper place messages etc
# where you think you want them, it kind of self - documents itself.
# For the time being, lrzip does not like pipes.
# Public domain with no warranties or suitability whatsoever etc.
# G.M.
function lrztar_local() {
local p="${@:1:$(($#-1))}" s="${!#}" tname= fname= \
v_w=0 v_O=0 v_S=0 v_D=0 v_P=0 v_q=0 v_L=0 \
v_n=0 v_l=0 v_b=0 v_g=0 v_z=0 v_M=0 v_T=0 \
v_N=0 v_v=0 v_f=0 v_d=0 v_h=0 x=
OPTERR=0
trap '[[ -z $tname ]] || rm -rf "$tname" &> /dev/null' 1 2 3 15
which tar &> /dev/null || { printf "lrztar: no tar in your path\n"; return 1; }
which lrzip &> /dev/null || { printf "lrztar: no lrzip in your path\n"; return 1; }
while getopts w:O:S:DPqL:nlbgzMT:N:vfodtVh x; do
[[ $x == [otV] ]] || ((v_$x=1)) &> /dev/null \
|| { printf "lrztar: invalid option for lrztar %s\n" "$x"; return 1; }
done
{ ! (($#)) || ((v_h)); } && {
printf "lrztar wrapper for compressing/decompressing while directories with lrzip.\n"
printf "usage: lrztar [lrzip options] <directory> will compress directory to directory.tar.lrz\n"
printf "lrztar -d [lrzip options] <directory.tar.lrz> will extract directory from directory.tar.lrz\n"
printf "lrzip -h will list lrzip options\n"
return
}
((v_d)) && {
fname="$(basename "$s")"; tname="${fname%.lrz}"
! ((v_f)) && [[ -e ${tname%.tar} ]] && {
printf "lrztar: ${tname%.tar} already present, aborting\n"
return 1
}
lrzip $p "$fname" && tar xf "$tname"
x=$?
} || {
tname="$(basename "$s").tar"
tar cf "$tname" "$s" && lrzip $p "$tname"
x=$?
}
rm -rf "$tname" &> /dev/null
! ((x)) && ((v_D)) && rm -rf "$s" &> /dev/null
return $x
}
lrztar_local "${@}"

194
lzma/7zC.txt Normal file
View file

@ -0,0 +1,194 @@
7z ANSI-C Decoder 4.62
----------------------
7z ANSI-C provides 7z/LZMA decoding.
7z ANSI-C version is simplified version ported from C++ code.
LZMA is default and general compression method of 7z format
in 7-Zip compression program (www.7-zip.org). LZMA provides high
compression ratio and very fast decompression.
LICENSE
-------
7z ANSI-C Decoder is part of the LZMA SDK.
LZMA SDK is written and placed in the public domain by Igor Pavlov.
Files
---------------------
7zDecode.* - Low level 7z decoding
7zExtract.* - High level 7z decoding
7zHeader.* - .7z format constants
7zIn.* - .7z archive opening
7zItem.* - .7z structures
7zMain.c - Test application
How To Use
----------
You must download 7-Zip program from www.7-zip.org.
You can create .7z archive with 7z.exe or 7za.exe:
7za.exe a archive.7z *.htm -r -mx -m0fb=255
If you have big number of files in archive, and you need fast extracting,
you can use partly-solid archives:
7za.exe a archive.7z *.htm -ms=512K -r -mx -m0fb=255 -m0d=512K
In that example 7-Zip will use 512KB solid blocks. So it needs to decompress only
512KB for extracting one file from such archive.
Limitations of current version of 7z ANSI-C Decoder
---------------------------------------------------
- It reads only "FileName", "Size", "LastWriteTime" and "CRC" information for each file in archive.
- It supports only LZMA and Copy (no compression) methods with BCJ or BCJ2 filters.
- It converts original UTF-16 Unicode file names to UTF-8 Unicode file names.
These limitations will be fixed in future versions.
Using 7z ANSI-C Decoder Test application:
-----------------------------------------
Usage: 7zDec <command> <archive_name>
<Command>:
e: Extract files from archive
l: List contents of archive
t: Test integrity of archive
Example:
7zDec l archive.7z
lists contents of archive.7z
7zDec e archive.7z
extracts files from archive.7z to current folder.
How to use .7z Decoder
----------------------
Memory allocation
~~~~~~~~~~~~~~~~~
7z Decoder uses two memory pools:
1) Temporary pool
2) Main pool
Such scheme can allow you to avoid fragmentation of allocated blocks.
Steps for using 7z decoder
--------------------------
Use code at 7zMain.c as example.
1) Declare variables:
inStream /* implements ILookInStream interface */
CSzArEx db; /* 7z archive database structure */
ISzAlloc allocImp; /* memory functions for main pool */
ISzAlloc allocTempImp; /* memory functions for temporary pool */
2) call CrcGenerateTable(); function to initialize CRC structures.
3) call SzArEx_Init(&db); function to initialize db structures.
4) call SzArEx_Open(&db, inStream, &allocMain, &allocTemp) to open archive
This function opens archive "inStream" and reads headers to "db".
All items in "db" will be allocated with "allocMain" functions.
SzArEx_Open function allocates and frees temporary structures by "allocTemp" functions.
5) List items or Extract items
Listing code:
~~~~~~~~~~~~~
{
UInt32 i;
for (i = 0; i < db.db.NumFiles; i++)
{
CFileItem *f = db.db.Files + i;
printf("%10d %s\n", (int)f->Size, f->Name);
}
}
Extracting code:
~~~~~~~~~~~~~~~~
SZ_RESULT SzAr_Extract(
CArchiveDatabaseEx *db,
ILookInStream *inStream,
UInt32 fileIndex, /* index of file */
UInt32 *blockIndex, /* index of solid block */
Byte **outBuffer, /* pointer to pointer to output buffer (allocated with allocMain) */
size_t *outBufferSize, /* buffer size for output buffer */
size_t *offset, /* offset of stream for required file in *outBuffer */
size_t *outSizeProcessed, /* size of file in *outBuffer */
ISzAlloc *allocMain,
ISzAlloc *allocTemp);
If you need to decompress more than one file, you can send these values from previous call:
blockIndex,
outBuffer,
outBufferSize,
You can consider "outBuffer" as cache of solid block. If your archive is solid,
it will increase decompression speed.
After decompressing you must free "outBuffer":
allocImp.Free(outBuffer);
6) call SzArEx_Free(&db, allocImp.Free) to free allocated items in "db".
Memory requirements for .7z decoding
------------------------------------
Memory usage for Archive opening:
- Temporary pool:
- Memory for uncompressed .7z headers
- some other temporary blocks
- Main pool:
- Memory for database:
Estimated size of one file structures in solid archive:
- Size (4 or 8 Bytes)
- CRC32 (4 bytes)
- LastWriteTime (8 bytes)
- Some file information (4 bytes)
- File Name (variable length) + pointer + allocation structures
Memory usage for archive Decompressing:
- Temporary pool:
- Memory for LZMA decompressing structures
- Main pool:
- Memory for decompressed solid block
- Memory for temprorary buffers, if BCJ2 fileter is used. Usually these
temprorary buffers can be about 15% of solid block size.
7z Decoder doesn't allocate memory for compressed blocks.
Instead of this, you must allocate buffer with desired
size before calling 7z Decoder. Use 7zMain.c as example.
Defines
-------
_SZ_ALLOC_DEBUG - define it if you want to debug alloc/free operations to stderr.
---
http://www.7-zip.org
http://www.7-zip.org/sdk.html
http://www.7-zip.org/support.html

471
lzma/7zFormat.txt Normal file
View file

@ -0,0 +1,471 @@
7z Format description (2.30 Beta 25)
-----------------------------------
This file contains description of 7z archive format.
7z archive can contain files compressed with any method.
See "Methods.txt" for description for defined compressing methods.
Format structure Overview
-------------------------
Some fields can be optional.
Archive structure
~~~~~~~~~~~~~~~~~
SignatureHeader
[PackedStreams]
[PackedStreamsForHeaders]
[
Header
or
{
Packed Header
HeaderInfo
}
]
Header structure
~~~~~~~~~~~~~~~~
{
ArchiveProperties
AdditionalStreams
{
PackInfo
{
PackPos
NumPackStreams
Sizes[NumPackStreams]
CRCs[NumPackStreams]
}
CodersInfo
{
NumFolders
Folders[NumFolders]
{
NumCoders
CodersInfo[NumCoders]
{
ID
NumInStreams;
NumOutStreams;
PropertiesSize
Properties[PropertiesSize]
}
NumBindPairs
BindPairsInfo[NumBindPairs]
{
InIndex;
OutIndex;
}
PackedIndices
}
UnPackSize[Folders][Folders.NumOutstreams]
CRCs[NumFolders]
}
SubStreamsInfo
{
NumUnPackStreamsInFolders[NumFolders];
UnPackSizes[]
CRCs[]
}
}
MainStreamsInfo
{
(Same as in AdditionalStreams)
}
FilesInfo
{
NumFiles
Properties[]
{
ID
Size
Data
}
}
}
HeaderInfo structure
~~~~~~~~~~~~~~~~~~~~
{
(Same as in AdditionalStreams)
}
Notes about Notation and encoding
---------------------------------
7z uses little endian encoding.
7z archive format has optional headers that are marked as
[]
Header
[]
REAL_UINT64 means real UINT64.
UINT64 means real UINT64 encoded with the following scheme:
Size of encoding sequence depends from first byte:
First_Byte Extra_Bytes Value
(binary)
0xxxxxxx : ( xxxxxxx )
10xxxxxx BYTE y[1] : ( xxxxxx << (8 * 1)) + y
110xxxxx BYTE y[2] : ( xxxxx << (8 * 2)) + y
...
1111110x BYTE y[6] : ( x << (8 * 6)) + y
11111110 BYTE y[7] : y
11111111 BYTE y[8] : y
Property IDs
------------
0x00 = kEnd,
0x01 = kHeader,
0x02 = kArchiveProperties,
0x03 = kAdditionalStreamsInfo,
0x04 = kMainStreamsInfo,
0x05 = kFilesInfo,
0x06 = kPackInfo,
0x07 = kUnPackInfo,
0x08 = kSubStreamsInfo,
0x09 = kSize,
0x0A = kCRC,
0x0B = kFolder,
0x0C = kCodersUnPackSize,
0x0D = kNumUnPackStream,
0x0E = kEmptyStream,
0x0F = kEmptyFile,
0x10 = kAnti,
0x11 = kName,
0x12 = kCreationTime,
0x13 = kLastAccessTime,
0x14 = kLastWriteTime,
0x15 = kWinAttributes,
0x16 = kComment,
0x17 = kEncodedHeader,
7z format headers
-----------------
SignatureHeader
~~~~~~~~~~~~~~~
BYTE kSignature[6] = {'7', 'z', 0xBC, 0xAF, 0x27, 0x1C};
ArchiveVersion
{
BYTE Major; // now = 0
BYTE Minor; // now = 2
};
UINT32 StartHeaderCRC;
StartHeader
{
REAL_UINT64 NextHeaderOffset
REAL_UINT64 NextHeaderSize
UINT32 NextHeaderCRC
}
...........................
ArchiveProperties
~~~~~~~~~~~~~~~~~
BYTE NID::kArchiveProperties (0x02)
for (;;)
{
BYTE PropertyType;
if (aType == 0)
break;
UINT64 PropertySize;
BYTE PropertyData[PropertySize];
}
Digests (NumStreams)
~~~~~~~~~~~~~~~~~~~~~
BYTE AllAreDefined
if (AllAreDefined == 0)
{
for(NumStreams)
BIT Defined
}
UINT32 CRCs[NumDefined]
PackInfo
~~~~~~~~~~~~
BYTE NID::kPackInfo (0x06)
UINT64 PackPos
UINT64 NumPackStreams
[]
BYTE NID::kSize (0x09)
UINT64 PackSizes[NumPackStreams]
[]
[]
BYTE NID::kCRC (0x0A)
PackStreamDigests[NumPackStreams]
[]
BYTE NID::kEnd
Folder
~~~~~~
UINT64 NumCoders;
for (NumCoders)
{
BYTE
{
0:3 DecompressionMethod.IDSize
4:
0 - IsSimple
1 - Is not simple
5:
0 - No Attributes
1 - There Are Attributes
7:
0 - Last Method in Alternative_Method_List
1 - There are more alternative methods
}
BYTE DecompressionMethod.ID[DecompressionMethod.IDSize]
if (!IsSimple)
{
UINT64 NumInStreams;
UINT64 NumOutStreams;
}
if (DecompressionMethod[0] != 0)
{
UINT64 PropertiesSize
BYTE Properties[PropertiesSize]
}
}
NumBindPairs = NumOutStreamsTotal - 1;
for (NumBindPairs)
{
UINT64 InIndex;
UINT64 OutIndex;
}
NumPackedStreams = NumInStreamsTotal - NumBindPairs;
if (NumPackedStreams > 1)
for(NumPackedStreams)
{
UINT64 Index;
};
Coders Info
~~~~~~~~~~~
BYTE NID::kUnPackInfo (0x07)
BYTE NID::kFolder (0x0B)
UINT64 NumFolders
BYTE External
switch(External)
{
case 0:
Folders[NumFolders]
case 1:
UINT64 DataStreamIndex
}
BYTE ID::kCodersUnPackSize (0x0C)
for(Folders)
for(Folder.NumOutStreams)
UINT64 UnPackSize;
[]
BYTE NID::kCRC (0x0A)
UnPackDigests[NumFolders]
[]
BYTE NID::kEnd
SubStreams Info
~~~~~~~~~~~~~~
BYTE NID::kSubStreamsInfo; (0x08)
[]
BYTE NID::kNumUnPackStream; (0x0D)
UINT64 NumUnPackStreamsInFolders[NumFolders];
[]
[]
BYTE NID::kSize (0x09)
UINT64 UnPackSizes[]
[]
[]
BYTE NID::kCRC (0x0A)
Digests[Number of streams with unknown CRC]
[]
BYTE NID::kEnd
Streams Info
~~~~~~~~~~~~
[]
PackInfo
[]
[]
CodersInfo
[]
[]
SubStreamsInfo
[]
BYTE NID::kEnd
FilesInfo
~~~~~~~~~
BYTE NID::kFilesInfo; (0x05)
UINT64 NumFiles
for (;;)
{
BYTE PropertyType;
if (aType == 0)
break;
UINT64 Size;
switch(PropertyType)
{
kEmptyStream: (0x0E)
for(NumFiles)
BIT IsEmptyStream
kEmptyFile: (0x0F)
for(EmptyStreams)
BIT IsEmptyFile
kAnti: (0x10)
for(EmptyStreams)
BIT IsAntiFile
case kCreationTime: (0x12)
case kLastAccessTime: (0x13)
case kLastWriteTime: (0x14)
BYTE AllAreDefined
if (AllAreDefined == 0)
{
for(NumFiles)
BIT TimeDefined
}
BYTE External;
if(External != 0)
UINT64 DataIndex
[]
for(Definded Items)
UINT32 Time
[]
kNames: (0x11)
BYTE External;
if(External != 0)
UINT64 DataIndex
[]
for(Files)
{
wchar_t Names[NameSize];
wchar_t 0;
}
[]
kAttributes: (0x15)
BYTE AllAreDefined
if (AllAreDefined == 0)
{
for(NumFiles)
BIT AttributesAreDefined
}
BYTE External;
if(External != 0)
UINT64 DataIndex
[]
for(Definded Attributes)
UINT32 Attributes
[]
}
}
Header
~~~~~~
BYTE NID::kHeader (0x01)
[]
ArchiveProperties
[]
[]
BYTE NID::kAdditionalStreamsInfo; (0x03)
StreamsInfo
[]
[]
BYTE NID::kMainStreamsInfo; (0x04)
StreamsInfo
[]
[]
FilesInfo
[]
BYTE NID::kEnd
HeaderInfo
~~~~~~~~~~
[]
BYTE NID::kEncodedHeader; (0x17)
StreamsInfo for Encoded Header
[]
---
End of document

102
lzma/ASM/x86/7zCrcT8U.s Normal file
View file

@ -0,0 +1,102 @@
SECTION .text
%macro CRC1b 0
movzx EDX, BYTE [ESI]
inc ESI
movzx EBX, AL
xor EDX, EBX
shr EAX, 8
xor EAX, [EBP + EDX * 4]
dec EDI
%endmacro
data_size equ (28)
crc_table equ (data_size + 4)
align 16
global CrcUpdateT8
global _CrcUpdateT8
CrcUpdateT8:
_CrcUpdateT8:
push EBX
push ESI
push EDI
push EBP
mov EAX, [ESP + 20]
mov ESI, [ESP + 24]
mov EDI, [ESP + data_size]
mov EBP, [ESP + crc_table]
test EDI, EDI
jz sl_end
sl:
test ESI, 7
jz sl_end
CRC1b
jnz sl
sl_end:
cmp EDI, 16
jb NEAR crc_end
mov [ESP + data_size], EDI
sub EDI, 8
and EDI, ~ 7
sub [ESP + data_size], EDI
add EDI, ESI
xor EAX, [ESI]
mov EBX, [ESI + 4]
movzx ECX, BL
align 16
main_loop:
mov EDX, [EBP + ECX*4 + 0C00h]
movzx ECX, BH
xor EDX, [EBP + ECX*4 + 0800h]
shr EBX, 16
movzx ECX, BL
xor EDX, [EBP + ECX*4 + 0400h]
xor EDX, [ESI + 8]
movzx ECX, AL
movzx EBX, BH
xor EDX, [EBP + EBX*4 + 0000h]
mov EBX, [ESI + 12]
xor EDX, [EBP + ECX*4 + 01C00h]
movzx ECX, AH
add ESI, 8
shr EAX, 16
xor EDX, [EBP + ECX*4 + 01800h]
movzx ECX, AL
xor EDX, [EBP + ECX*4 + 01400h]
movzx ECX, AH
mov EAX, [EBP + ECX*4 + 01000h]
movzx ECX, BL
xor EAX,EDX
cmp ESI, EDI
jne main_loop
xor EAX, [ESI]
mov EDI, [ESP + data_size]
crc_end:
test EDI, EDI
jz fl_end
fl:
CRC1b
jnz fl
fl_end:
pop EBP
pop EDI
pop ESI
pop EBX
ret
%ifidn __OUTPUT_FORMAT__,elf
section .note.GNU-stack noalloc noexec nowrite progbits
%endif

View file

@ -0,0 +1,105 @@
SECTION .text
%macro CRC1b 0
movzx EDX, BYTE [RSI]
inc RSI
movzx EBX, AL
xor EDX, EBX
shr EAX, 8
xor EAX, [RDI + RDX * 4]
dec R8
%endmacro
align 16
global CrcUpdateT8
CrcUpdateT8:
push RBX
push RSI
push RDI
push RBP
mov EAX, ECX
mov RSI, RDX
mov RDI, R9
test R8, R8
jz sl_end
sl:
test RSI, 7
jz sl_end
CRC1b
jnz sl
sl_end:
cmp R8, 16
jb crc_end
mov R9, R8
and R8, 7
add R8, 8
sub R9, R8
add R9, RSI
xor EAX, [RSI]
mov EBX, [RSI + 4]
movzx ECX, BL
align 16
main_loop:
mov EDX, [RDI + RCX*4 + 0C00h]
movzx EBP, BH
xor EDX, [RDI + RBP*4 + 0800h]
shr EBX, 16
movzx ECX, BL
xor EDX, [RSI + 8]
xor EDX, [RDI + RCX*4 + 0400h]
movzx ECX, AL
movzx EBP, BH
xor EDX, [RDI + RBP*4 + 0000h]
mov EBX, [RSI + 12]
xor EDX, [RDI + RCX*4 + 01C00h]
movzx EBP, AH
shr EAX, 16
movzx ECX, AL
xor EDX, [RDI + RBP*4 + 01800h]
movzx EBP, AH
mov EAX, [RDI + RCX*4 + 01400h]
add RSI, 8
xor EAX, [RDI + RBP*4 + 01000h]
movzx ECX, BL
xor EAX,EDX
cmp RSI, R9
jne main_loop
xor EAX, [RSI]
crc_end:
test R8, R8
jz fl_end
fl:
CRC1b
jnz fl
fl_end:
pop RBP
pop RDI
pop RSI
pop RBX
ret
%ifidn __OUTPUT_FORMAT__,elf
section .note.GNU-stack noalloc noexec nowrite progbits
%endif

35
lzma/C/7zCrc.c Normal file
View file

@ -0,0 +1,35 @@
/* 7zCrc.c -- CRC32 calculation
2008-08-05
Igor Pavlov
Public domain */
#include "7zCrc.h"
#define kCrcPoly 0xEDB88320
UInt32 g_CrcTable[256];
void MY_FAST_CALL CrcGenerateTable(void)
{
UInt32 i;
for (i = 0; i < 256; i++)
{
UInt32 r = i;
int j;
for (j = 0; j < 8; j++)
r = (r >> 1) ^ (kCrcPoly & ~((r & 1) - 1));
g_CrcTable[i] = r;
}
}
UInt32 MY_FAST_CALL CrcUpdate(UInt32 v, const void *data, size_t size)
{
const Byte *p = (const Byte *)data;
for (; size > 0 ; size--, p++)
v = CRC_UPDATE_BYTE(v, *p);
return v;
}
UInt32 MY_FAST_CALL CrcCalc(const void *data, size_t size)
{
return CrcUpdate(CRC_INIT_VAL, data, size) ^ 0xFFFFFFFF;
}

30
lzma/C/7zCrc.h Normal file
View file

@ -0,0 +1,30 @@
/* 7zCrc.h -- CRC32 calculation
2009-02-07 : Igor Pavlov : Public domain */
#ifndef __7Z_CRC_H
#define __7Z_CRC_H
#include <stddef.h>
#include "Types.h"
#ifdef __cplusplus
extern "C" {
#endif
extern UInt32 g_CrcTable[];
void MY_FAST_CALL CrcGenerateTable(void);
#define CRC_INIT_VAL 0xFFFFFFFF
#define CRC_GET_DIGEST(crc) ((crc) ^ 0xFFFFFFFF)
#define CRC_UPDATE_BYTE(crc, b) (g_CrcTable[((crc) ^ (b)) & 0xFF] ^ ((crc) >> 8))
UInt32 MY_FAST_CALL CrcUpdate(UInt32 crc, const void *data, size_t size);
UInt32 MY_FAST_CALL CrcCalc(const void *data, size_t size);
#ifdef __cplusplus
}
#endif
#endif

43
lzma/C/7zCrcT8.c Normal file
View file

@ -0,0 +1,43 @@
/* 7zCrcT8.c -- CRC32 calculation with 8 tables
2008-03-19
Igor Pavlov
Public domain */
#include "7zCrc.h"
#define kCrcPoly 0xEDB88320
#define CRC_NUM_TABLES 8
UInt32 g_CrcTable[256 * CRC_NUM_TABLES];
void MY_FAST_CALL CrcGenerateTable()
{
UInt32 i;
for (i = 0; i < 256; i++)
{
UInt32 r = i;
int j;
for (j = 0; j < 8; j++)
r = (r >> 1) ^ (kCrcPoly & ~((r & 1) - 1));
g_CrcTable[i] = r;
}
#if CRC_NUM_TABLES > 1
for (; i < 256 * CRC_NUM_TABLES; i++)
{
UInt32 r = g_CrcTable[i - 256];
g_CrcTable[i] = g_CrcTable[r & 0xFF] ^ (r >> 8);
}
#endif
}
UInt32 MY_FAST_CALL CrcUpdateT8(UInt32 v, const void *data, size_t size, const UInt32 *table);
UInt32 MY_FAST_CALL CrcUpdate(UInt32 v, const void *data, size_t size)
{
return CrcUpdateT8(v, data, size, g_CrcTable);
}
UInt32 MY_FAST_CALL CrcCalc(const void *data, size_t size)
{
return CrcUpdateT8(CRC_INIT_VAL, data, size, g_CrcTable) ^ 0xFFFFFFFF;
}

127
lzma/C/Alloc.c Normal file
View file

@ -0,0 +1,127 @@
/* Alloc.c -- Memory allocation functions
2008-09-24
Igor Pavlov
Public domain */
#ifdef _WIN32
#include <windows.h>
#endif
#include <stdlib.h>
#include "Alloc.h"
/* #define _SZ_ALLOC_DEBUG */
/* use _SZ_ALLOC_DEBUG to debug alloc/free operations */
#ifdef _SZ_ALLOC_DEBUG
#include <stdio.h>
int g_allocCount = 0;
int g_allocCountMid = 0;
int g_allocCountBig = 0;
#endif
void *MyAlloc(size_t size)
{
if (size == 0)
return 0;
#ifdef _SZ_ALLOC_DEBUG
{
void *p = malloc(size);
fprintf(stderr, "\nAlloc %10d bytes, count = %10d, addr = %8X", size, g_allocCount++, (unsigned)p);
return p;
}
#else
return malloc(size);
#endif
}
void MyFree(void *address)
{
#ifdef _SZ_ALLOC_DEBUG
if (address != 0)
fprintf(stderr, "\nFree; count = %10d, addr = %8X", --g_allocCount, (unsigned)address);
#endif
free(address);
}
#ifdef _WIN32
void *MidAlloc(size_t size)
{
if (size == 0)
return 0;
#ifdef _SZ_ALLOC_DEBUG
fprintf(stderr, "\nAlloc_Mid %10d bytes; count = %10d", size, g_allocCountMid++);
#endif
return VirtualAlloc(0, size, MEM_COMMIT, PAGE_READWRITE);
}
void MidFree(void *address)
{
#ifdef _SZ_ALLOC_DEBUG
if (address != 0)
fprintf(stderr, "\nFree_Mid; count = %10d", --g_allocCountMid);
#endif
if (address == 0)
return;
VirtualFree(address, 0, MEM_RELEASE);
}
#ifndef MEM_LARGE_PAGES
#undef _7ZIP_LARGE_PAGES
#endif
#ifdef _7ZIP_LARGE_PAGES
SIZE_T g_LargePageSize = 0;
typedef SIZE_T (WINAPI *GetLargePageMinimumP)();
#endif
void SetLargePageSize()
{
#ifdef _7ZIP_LARGE_PAGES
SIZE_T size = 0;
GetLargePageMinimumP largePageMinimum = (GetLargePageMinimumP)
GetProcAddress(GetModuleHandle(TEXT("kernel32.dll")), "GetLargePageMinimum");
if (largePageMinimum == 0)
return;
size = largePageMinimum();
if (size == 0 || (size & (size - 1)) != 0)
return;
g_LargePageSize = size;
#endif
}
void *BigAlloc(size_t size)
{
if (size == 0)
return 0;
#ifdef _SZ_ALLOC_DEBUG
fprintf(stderr, "\nAlloc_Big %10d bytes; count = %10d", size, g_allocCountBig++);
#endif
#ifdef _7ZIP_LARGE_PAGES
if (g_LargePageSize != 0 && g_LargePageSize <= (1 << 30) && size >= (1 << 18))
{
void *res = VirtualAlloc(0, (size + g_LargePageSize - 1) & (~(g_LargePageSize - 1)),
MEM_COMMIT | MEM_LARGE_PAGES, PAGE_READWRITE);
if (res != 0)
return res;
}
#endif
return VirtualAlloc(0, size, MEM_COMMIT, PAGE_READWRITE);
}
void BigFree(void *address)
{
#ifdef _SZ_ALLOC_DEBUG
if (address != 0)
fprintf(stderr, "\nFree_Big; count = %10d", --g_allocCountBig);
#endif
if (address == 0)
return;
VirtualFree(address, 0, MEM_RELEASE);
}
#endif

37
lzma/C/Alloc.h Normal file
View file

@ -0,0 +1,37 @@
/* Alloc.h -- Memory allocation functions
2008-03-13
Igor Pavlov
Public domain */
#ifndef __COMMON_ALLOC_H
#define __COMMON_ALLOC_H
#include <stddef.h>
#ifdef _WIN32
void *MyAlloc(size_t size);
void MyFree(void *address);
void SetLargePageSize();
void *MidAlloc(size_t size);
void MidFree(void *address);
void *BigAlloc(size_t size);
void BigFree(void *address);
#else
#include <stdlib.h> /* malloc */
#define MyAlloc(size) malloc(size)
#define MyFree(address) free(address)
#define MidAlloc(size) malloc(size)
#define MidFree(address) free(address)
#define BigAlloc(size) malloc(size)
#define BigFree(address) free(address)
#endif
#endif

761
lzma/C/LzFind.c Normal file
View file

@ -0,0 +1,761 @@
/* LzFind.c -- Match finder for LZ algorithms
2009-04-22 : Igor Pavlov : Public domain */
#include <string.h>
#include "LzFind.h"
#include "LzHash.h"
#define kEmptyHashValue 0
#define kMaxValForNormalize ((UInt32)0xFFFFFFFF)
#define kNormalizeStepMin (1 << 10) /* it must be power of 2 */
#define kNormalizeMask (~(kNormalizeStepMin - 1))
#define kMaxHistorySize ((UInt32)3 << 30)
#define kStartMaxLen 3
static void LzInWindow_Free(CMatchFinder *p, ISzAlloc *alloc)
{
if (!p->directInput)
{
alloc->Free(alloc, p->bufferBase);
p->bufferBase = 0;
}
}
/* keepSizeBefore + keepSizeAfter + keepSizeReserv must be < 4G) */
static int LzInWindow_Create(CMatchFinder *p, UInt32 keepSizeReserv, ISzAlloc *alloc)
{
UInt32 blockSize = p->keepSizeBefore + p->keepSizeAfter + keepSizeReserv;
if (p->directInput)
{
p->blockSize = blockSize;
return 1;
}
if (p->bufferBase == 0 || p->blockSize != blockSize)
{
LzInWindow_Free(p, alloc);
p->blockSize = blockSize;
p->bufferBase = (Byte *)alloc->Alloc(alloc, (size_t)blockSize);
}
return (p->bufferBase != 0);
}
Byte *MatchFinder_GetPointerToCurrentPos(CMatchFinder *p) { return p->buffer; }
Byte MatchFinder_GetIndexByte(CMatchFinder *p, Int32 index) { return p->buffer[index]; }
UInt32 MatchFinder_GetNumAvailableBytes(CMatchFinder *p) { return p->streamPos - p->pos; }
void MatchFinder_ReduceOffsets(CMatchFinder *p, UInt32 subValue)
{
p->posLimit -= subValue;
p->pos -= subValue;
p->streamPos -= subValue;
}
static void MatchFinder_ReadBlock(CMatchFinder *p)
{
if (p->streamEndWasReached || p->result != SZ_OK)
return;
if (p->directInput)
{
UInt32 curSize = 0xFFFFFFFF - p->streamPos;
if (curSize > p->directInputRem)
curSize = (UInt32)p->directInputRem;
p->directInputRem -= curSize;
p->streamPos += curSize;
if (p->directInputRem == 0)
p->streamEndWasReached = 1;
return;
}
for (;;)
{
Byte *dest = p->buffer + (p->streamPos - p->pos);
size_t size = (p->bufferBase + p->blockSize - dest);
if (size == 0)
return;
p->result = p->stream->Read(p->stream, dest, &size);
if (p->result != SZ_OK)
return;
if (size == 0)
{
p->streamEndWasReached = 1;
return;
}
p->streamPos += (UInt32)size;
if (p->streamPos - p->pos > p->keepSizeAfter)
return;
}
}
void MatchFinder_MoveBlock(CMatchFinder *p)
{
memmove(p->bufferBase,
p->buffer - p->keepSizeBefore,
(size_t)(p->streamPos - p->pos + p->keepSizeBefore));
p->buffer = p->bufferBase + p->keepSizeBefore;
}
int MatchFinder_NeedMove(CMatchFinder *p)
{
if (p->directInput)
return 0;
/* if (p->streamEndWasReached) return 0; */
return ((size_t)(p->bufferBase + p->blockSize - p->buffer) <= p->keepSizeAfter);
}
void MatchFinder_ReadIfRequired(CMatchFinder *p)
{
if (p->streamEndWasReached)
return;
if (p->keepSizeAfter >= p->streamPos - p->pos)
MatchFinder_ReadBlock(p);
}
static void MatchFinder_CheckAndMoveAndRead(CMatchFinder *p)
{
if (MatchFinder_NeedMove(p))
MatchFinder_MoveBlock(p);
MatchFinder_ReadBlock(p);
}
static void MatchFinder_SetDefaultSettings(CMatchFinder *p)
{
p->cutValue = 32;
p->btMode = 1;
p->numHashBytes = 4;
p->bigHash = 0;
}
#define kCrcPoly 0xEDB88320
void MatchFinder_Construct(CMatchFinder *p)
{
UInt32 i;
p->bufferBase = 0;
p->directInput = 0;
p->hash = 0;
MatchFinder_SetDefaultSettings(p);
for (i = 0; i < 256; i++)
{
UInt32 r = i;
int j;
for (j = 0; j < 8; j++)
r = (r >> 1) ^ (kCrcPoly & ~((r & 1) - 1));
p->crc[i] = r;
}
}
static void MatchFinder_FreeThisClassMemory(CMatchFinder *p, ISzAlloc *alloc)
{
alloc->Free(alloc, p->hash);
p->hash = 0;
}
void MatchFinder_Free(CMatchFinder *p, ISzAlloc *alloc)
{
MatchFinder_FreeThisClassMemory(p, alloc);
LzInWindow_Free(p, alloc);
}
static CLzRef* AllocRefs(UInt32 num, ISzAlloc *alloc)
{
size_t sizeInBytes = (size_t)num * sizeof(CLzRef);
if (sizeInBytes / sizeof(CLzRef) != num)
return 0;
return (CLzRef *)alloc->Alloc(alloc, sizeInBytes);
}
int MatchFinder_Create(CMatchFinder *p, UInt32 historySize,
UInt32 keepAddBufferBefore, UInt32 matchMaxLen, UInt32 keepAddBufferAfter,
ISzAlloc *alloc)
{
UInt32 sizeReserv;
if (historySize > kMaxHistorySize)
{
MatchFinder_Free(p, alloc);
return 0;
}
sizeReserv = historySize >> 1;
if (historySize > ((UInt32)2 << 30))
sizeReserv = historySize >> 2;
sizeReserv += (keepAddBufferBefore + matchMaxLen + keepAddBufferAfter) / 2 + (1 << 19);
p->keepSizeBefore = historySize + keepAddBufferBefore + 1;
p->keepSizeAfter = matchMaxLen + keepAddBufferAfter;
/* we need one additional byte, since we use MoveBlock after pos++ and before dictionary using */
if (LzInWindow_Create(p, sizeReserv, alloc))
{
UInt32 newCyclicBufferSize = historySize + 1;
UInt32 hs;
p->matchMaxLen = matchMaxLen;
{
p->fixedHashSize = 0;
if (p->numHashBytes == 2)
hs = (1 << 16) - 1;
else
{
hs = historySize - 1;
hs |= (hs >> 1);
hs |= (hs >> 2);
hs |= (hs >> 4);
hs |= (hs >> 8);
hs >>= 1;
hs |= 0xFFFF; /* don't change it! It's required for Deflate */
if (hs > (1 << 24))
{
if (p->numHashBytes == 3)
hs = (1 << 24) - 1;
else
hs >>= 1;
}
}
p->hashMask = hs;
hs++;
if (p->numHashBytes > 2) p->fixedHashSize += kHash2Size;
if (p->numHashBytes > 3) p->fixedHashSize += kHash3Size;
if (p->numHashBytes > 4) p->fixedHashSize += kHash4Size;
hs += p->fixedHashSize;
}
{
UInt32 prevSize = p->hashSizeSum + p->numSons;
UInt32 newSize;
p->historySize = historySize;
p->hashSizeSum = hs;
p->cyclicBufferSize = newCyclicBufferSize;
p->numSons = (p->btMode ? newCyclicBufferSize * 2 : newCyclicBufferSize);
newSize = p->hashSizeSum + p->numSons;
if (p->hash != 0 && prevSize == newSize)
return 1;
MatchFinder_FreeThisClassMemory(p, alloc);
p->hash = AllocRefs(newSize, alloc);
if (p->hash != 0)
{
p->son = p->hash + p->hashSizeSum;
return 1;
}
}
}
MatchFinder_Free(p, alloc);
return 0;
}
static void MatchFinder_SetLimits(CMatchFinder *p)
{
UInt32 limit = kMaxValForNormalize - p->pos;
UInt32 limit2 = p->cyclicBufferSize - p->cyclicBufferPos;
if (limit2 < limit)
limit = limit2;
limit2 = p->streamPos - p->pos;
if (limit2 <= p->keepSizeAfter)
{
if (limit2 > 0)
limit2 = 1;
}
else
limit2 -= p->keepSizeAfter;
if (limit2 < limit)
limit = limit2;
{
UInt32 lenLimit = p->streamPos - p->pos;
if (lenLimit > p->matchMaxLen)
lenLimit = p->matchMaxLen;
p->lenLimit = lenLimit;
}
p->posLimit = p->pos + limit;
}
void MatchFinder_Init(CMatchFinder *p)
{
UInt32 i;
for (i = 0; i < p->hashSizeSum; i++)
p->hash[i] = kEmptyHashValue;
p->cyclicBufferPos = 0;
p->buffer = p->bufferBase;
p->pos = p->streamPos = p->cyclicBufferSize;
p->result = SZ_OK;
p->streamEndWasReached = 0;
MatchFinder_ReadBlock(p);
MatchFinder_SetLimits(p);
}
static UInt32 MatchFinder_GetSubValue(CMatchFinder *p)
{
return (p->pos - p->historySize - 1) & kNormalizeMask;
}
void MatchFinder_Normalize3(UInt32 subValue, CLzRef *items, UInt32 numItems)
{
UInt32 i;
for (i = 0; i < numItems; i++)
{
UInt32 value = items[i];
if (value <= subValue)
value = kEmptyHashValue;
else
value -= subValue;
items[i] = value;
}
}
static void MatchFinder_Normalize(CMatchFinder *p)
{
UInt32 subValue = MatchFinder_GetSubValue(p);
MatchFinder_Normalize3(subValue, p->hash, p->hashSizeSum + p->numSons);
MatchFinder_ReduceOffsets(p, subValue);
}
static void MatchFinder_CheckLimits(CMatchFinder *p)
{
if (p->pos == kMaxValForNormalize)
MatchFinder_Normalize(p);
if (!p->streamEndWasReached && p->keepSizeAfter == p->streamPos - p->pos)
MatchFinder_CheckAndMoveAndRead(p);
if (p->cyclicBufferPos == p->cyclicBufferSize)
p->cyclicBufferPos = 0;
MatchFinder_SetLimits(p);
}
static UInt32 * Hc_GetMatchesSpec(UInt32 lenLimit, UInt32 curMatch, UInt32 pos, const Byte *cur, CLzRef *son,
UInt32 _cyclicBufferPos, UInt32 _cyclicBufferSize, UInt32 cutValue,
UInt32 *distances, UInt32 maxLen)
{
son[_cyclicBufferPos] = curMatch;
for (;;)
{
UInt32 delta = pos - curMatch;
if (cutValue-- == 0 || delta >= _cyclicBufferSize)
return distances;
{
const Byte *pb = cur - delta;
curMatch = son[_cyclicBufferPos - delta + ((delta > _cyclicBufferPos) ? _cyclicBufferSize : 0)];
if (pb[maxLen] == cur[maxLen] && *pb == *cur)
{
UInt32 len = 0;
while (++len != lenLimit)
if (pb[len] != cur[len])
break;
if (maxLen < len)
{
*distances++ = maxLen = len;
*distances++ = delta - 1;
if (len == lenLimit)
return distances;
}
}
}
}
}
UInt32 * GetMatchesSpec1(UInt32 lenLimit, UInt32 curMatch, UInt32 pos, const Byte *cur, CLzRef *son,
UInt32 _cyclicBufferPos, UInt32 _cyclicBufferSize, UInt32 cutValue,
UInt32 *distances, UInt32 maxLen)
{
CLzRef *ptr0 = son + (_cyclicBufferPos << 1) + 1;
CLzRef *ptr1 = son + (_cyclicBufferPos << 1);
UInt32 len0 = 0, len1 = 0;
for (;;)
{
UInt32 delta = pos - curMatch;
if (cutValue-- == 0 || delta >= _cyclicBufferSize)
{
*ptr0 = *ptr1 = kEmptyHashValue;
return distances;
}
{
CLzRef *pair = son + ((_cyclicBufferPos - delta + ((delta > _cyclicBufferPos) ? _cyclicBufferSize : 0)) << 1);
const Byte *pb = cur - delta;
UInt32 len = (len0 < len1 ? len0 : len1);
if (pb[len] == cur[len])
{
if (++len != lenLimit && pb[len] == cur[len])
while (++len != lenLimit)
if (pb[len] != cur[len])
break;
if (maxLen < len)
{
*distances++ = maxLen = len;
*distances++ = delta - 1;
if (len == lenLimit)
{
*ptr1 = pair[0];
*ptr0 = pair[1];
return distances;
}
}
}
if (pb[len] < cur[len])
{
*ptr1 = curMatch;
ptr1 = pair + 1;
curMatch = *ptr1;
len1 = len;
}
else
{
*ptr0 = curMatch;
ptr0 = pair;
curMatch = *ptr0;
len0 = len;
}
}
}
}
static void SkipMatchesSpec(UInt32 lenLimit, UInt32 curMatch, UInt32 pos, const Byte *cur, CLzRef *son,
UInt32 _cyclicBufferPos, UInt32 _cyclicBufferSize, UInt32 cutValue)
{
CLzRef *ptr0 = son + (_cyclicBufferPos << 1) + 1;
CLzRef *ptr1 = son + (_cyclicBufferPos << 1);
UInt32 len0 = 0, len1 = 0;
for (;;)
{
UInt32 delta = pos - curMatch;
if (cutValue-- == 0 || delta >= _cyclicBufferSize)
{
*ptr0 = *ptr1 = kEmptyHashValue;
return;
}
{
CLzRef *pair = son + ((_cyclicBufferPos - delta + ((delta > _cyclicBufferPos) ? _cyclicBufferSize : 0)) << 1);
const Byte *pb = cur - delta;
UInt32 len = (len0 < len1 ? len0 : len1);
if (pb[len] == cur[len])
{
while (++len != lenLimit)
if (pb[len] != cur[len])
break;
{
if (len == lenLimit)
{
*ptr1 = pair[0];
*ptr0 = pair[1];
return;
}
}
}
if (pb[len] < cur[len])
{
*ptr1 = curMatch;
ptr1 = pair + 1;
curMatch = *ptr1;
len1 = len;
}
else
{
*ptr0 = curMatch;
ptr0 = pair;
curMatch = *ptr0;
len0 = len;
}
}
}
}
#define MOVE_POS \
++p->cyclicBufferPos; \
p->buffer++; \
if (++p->pos == p->posLimit) MatchFinder_CheckLimits(p);
#define MOVE_POS_RET MOVE_POS return offset;
static void MatchFinder_MovePos(CMatchFinder *p) { MOVE_POS; }
#define GET_MATCHES_HEADER2(minLen, ret_op) \
UInt32 lenLimit; UInt32 hashValue; const Byte *cur; UInt32 curMatch; \
lenLimit = p->lenLimit; { if (lenLimit < minLen) { MatchFinder_MovePos(p); ret_op; }} \
cur = p->buffer;
#define GET_MATCHES_HEADER(minLen) GET_MATCHES_HEADER2(minLen, return 0)
#define SKIP_HEADER(minLen) GET_MATCHES_HEADER2(minLen, continue)
#define MF_PARAMS(p) p->pos, p->buffer, p->son, p->cyclicBufferPos, p->cyclicBufferSize, p->cutValue
#define GET_MATCHES_FOOTER(offset, maxLen) \
offset = (UInt32)(GetMatchesSpec1(lenLimit, curMatch, MF_PARAMS(p), \
distances + offset, maxLen) - distances); MOVE_POS_RET;
#define SKIP_FOOTER \
SkipMatchesSpec(lenLimit, curMatch, MF_PARAMS(p)); MOVE_POS;
static UInt32 Bt2_MatchFinder_GetMatches(CMatchFinder *p, UInt32 *distances)
{
UInt32 offset;
GET_MATCHES_HEADER(2)
HASH2_CALC;
curMatch = p->hash[hashValue];
p->hash[hashValue] = p->pos;
offset = 0;
GET_MATCHES_FOOTER(offset, 1)
}
UInt32 Bt3Zip_MatchFinder_GetMatches(CMatchFinder *p, UInt32 *distances)
{
UInt32 offset;
GET_MATCHES_HEADER(3)
HASH_ZIP_CALC;
curMatch = p->hash[hashValue];
p->hash[hashValue] = p->pos;
offset = 0;
GET_MATCHES_FOOTER(offset, 2)
}
static UInt32 Bt3_MatchFinder_GetMatches(CMatchFinder *p, UInt32 *distances)
{
UInt32 hash2Value, delta2, maxLen, offset;
GET_MATCHES_HEADER(3)
HASH3_CALC;
delta2 = p->pos - p->hash[hash2Value];
curMatch = p->hash[kFix3HashSize + hashValue];
p->hash[hash2Value] =
p->hash[kFix3HashSize + hashValue] = p->pos;
maxLen = 2;
offset = 0;
if (delta2 < p->cyclicBufferSize && *(cur - delta2) == *cur)
{
for (; maxLen != lenLimit; maxLen++)
if (cur[(ptrdiff_t)maxLen - delta2] != cur[maxLen])
break;
distances[0] = maxLen;
distances[1] = delta2 - 1;
offset = 2;
if (maxLen == lenLimit)
{
SkipMatchesSpec(lenLimit, curMatch, MF_PARAMS(p));
MOVE_POS_RET;
}
}
GET_MATCHES_FOOTER(offset, maxLen)
}
static UInt32 Bt4_MatchFinder_GetMatches(CMatchFinder *p, UInt32 *distances)
{
UInt32 hash2Value, hash3Value, delta2, delta3, maxLen, offset;
GET_MATCHES_HEADER(4)
HASH4_CALC;
delta2 = p->pos - p->hash[ hash2Value];
delta3 = p->pos - p->hash[kFix3HashSize + hash3Value];
curMatch = p->hash[kFix4HashSize + hashValue];
p->hash[ hash2Value] =
p->hash[kFix3HashSize + hash3Value] =
p->hash[kFix4HashSize + hashValue] = p->pos;
maxLen = 1;
offset = 0;
if (delta2 < p->cyclicBufferSize && *(cur - delta2) == *cur)
{
distances[0] = maxLen = 2;
distances[1] = delta2 - 1;
offset = 2;
}
if (delta2 != delta3 && delta3 < p->cyclicBufferSize && *(cur - delta3) == *cur)
{
maxLen = 3;
distances[offset + 1] = delta3 - 1;
offset += 2;
delta2 = delta3;
}
if (offset != 0)
{
for (; maxLen != lenLimit; maxLen++)
if (cur[(ptrdiff_t)maxLen - delta2] != cur[maxLen])
break;
distances[offset - 2] = maxLen;
if (maxLen == lenLimit)
{
SkipMatchesSpec(lenLimit, curMatch, MF_PARAMS(p));
MOVE_POS_RET;
}
}
if (maxLen < 3)
maxLen = 3;
GET_MATCHES_FOOTER(offset, maxLen)
}
static UInt32 Hc4_MatchFinder_GetMatches(CMatchFinder *p, UInt32 *distances)
{
UInt32 hash2Value, hash3Value, delta2, delta3, maxLen, offset;
GET_MATCHES_HEADER(4)
HASH4_CALC;
delta2 = p->pos - p->hash[ hash2Value];
delta3 = p->pos - p->hash[kFix3HashSize + hash3Value];
curMatch = p->hash[kFix4HashSize + hashValue];
p->hash[ hash2Value] =
p->hash[kFix3HashSize + hash3Value] =
p->hash[kFix4HashSize + hashValue] = p->pos;
maxLen = 1;
offset = 0;
if (delta2 < p->cyclicBufferSize && *(cur - delta2) == *cur)
{
distances[0] = maxLen = 2;
distances[1] = delta2 - 1;
offset = 2;
}
if (delta2 != delta3 && delta3 < p->cyclicBufferSize && *(cur - delta3) == *cur)
{
maxLen = 3;
distances[offset + 1] = delta3 - 1;
offset += 2;
delta2 = delta3;
}
if (offset != 0)
{
for (; maxLen != lenLimit; maxLen++)
if (cur[(ptrdiff_t)maxLen - delta2] != cur[maxLen])
break;
distances[offset - 2] = maxLen;
if (maxLen == lenLimit)
{
p->son[p->cyclicBufferPos] = curMatch;
MOVE_POS_RET;
}
}
if (maxLen < 3)
maxLen = 3;
offset = (UInt32)(Hc_GetMatchesSpec(lenLimit, curMatch, MF_PARAMS(p),
distances + offset, maxLen) - (distances));
MOVE_POS_RET
}
UInt32 Hc3Zip_MatchFinder_GetMatches(CMatchFinder *p, UInt32 *distances)
{
UInt32 offset;
GET_MATCHES_HEADER(3)
HASH_ZIP_CALC;
curMatch = p->hash[hashValue];
p->hash[hashValue] = p->pos;
offset = (UInt32)(Hc_GetMatchesSpec(lenLimit, curMatch, MF_PARAMS(p),
distances, 2) - (distances));
MOVE_POS_RET
}
static void Bt2_MatchFinder_Skip(CMatchFinder *p, UInt32 num)
{
do
{
SKIP_HEADER(2)
HASH2_CALC;
curMatch = p->hash[hashValue];
p->hash[hashValue] = p->pos;
SKIP_FOOTER
}
while (--num != 0);
}
void Bt3Zip_MatchFinder_Skip(CMatchFinder *p, UInt32 num)
{
do
{
SKIP_HEADER(3)
HASH_ZIP_CALC;
curMatch = p->hash[hashValue];
p->hash[hashValue] = p->pos;
SKIP_FOOTER
}
while (--num != 0);
}
static void Bt3_MatchFinder_Skip(CMatchFinder *p, UInt32 num)
{
do
{
UInt32 hash2Value;
SKIP_HEADER(3)
HASH3_CALC;
curMatch = p->hash[kFix3HashSize + hashValue];
p->hash[hash2Value] =
p->hash[kFix3HashSize + hashValue] = p->pos;
SKIP_FOOTER
}
while (--num != 0);
}
static void Bt4_MatchFinder_Skip(CMatchFinder *p, UInt32 num)
{
do
{
UInt32 hash2Value, hash3Value;
SKIP_HEADER(4)
HASH4_CALC;
curMatch = p->hash[kFix4HashSize + hashValue];
p->hash[ hash2Value] =
p->hash[kFix3HashSize + hash3Value] = p->pos;
p->hash[kFix4HashSize + hashValue] = p->pos;
SKIP_FOOTER
}
while (--num != 0);
}
static void Hc4_MatchFinder_Skip(CMatchFinder *p, UInt32 num)
{
do
{
UInt32 hash2Value, hash3Value;
SKIP_HEADER(4)
HASH4_CALC;
curMatch = p->hash[kFix4HashSize + hashValue];
p->hash[ hash2Value] =
p->hash[kFix3HashSize + hash3Value] =
p->hash[kFix4HashSize + hashValue] = p->pos;
p->son[p->cyclicBufferPos] = curMatch;
MOVE_POS
}
while (--num != 0);
}
void Hc3Zip_MatchFinder_Skip(CMatchFinder *p, UInt32 num)
{
do
{
SKIP_HEADER(3)
HASH_ZIP_CALC;
curMatch = p->hash[hashValue];
p->hash[hashValue] = p->pos;
p->son[p->cyclicBufferPos] = curMatch;
MOVE_POS
}
while (--num != 0);
}
void MatchFinder_CreateVTable(CMatchFinder *p, IMatchFinder *vTable)
{
vTable->Init = (Mf_Init_Func)MatchFinder_Init;
vTable->GetIndexByte = (Mf_GetIndexByte_Func)MatchFinder_GetIndexByte;
vTable->GetNumAvailableBytes = (Mf_GetNumAvailableBytes_Func)MatchFinder_GetNumAvailableBytes;
vTable->GetPointerToCurrentPos = (Mf_GetPointerToCurrentPos_Func)MatchFinder_GetPointerToCurrentPos;
if (!p->btMode)
{
vTable->GetMatches = (Mf_GetMatches_Func)Hc4_MatchFinder_GetMatches;
vTable->Skip = (Mf_Skip_Func)Hc4_MatchFinder_Skip;
}
else if (p->numHashBytes == 2)
{
vTable->GetMatches = (Mf_GetMatches_Func)Bt2_MatchFinder_GetMatches;
vTable->Skip = (Mf_Skip_Func)Bt2_MatchFinder_Skip;
}
else if (p->numHashBytes == 3)
{
vTable->GetMatches = (Mf_GetMatches_Func)Bt3_MatchFinder_GetMatches;
vTable->Skip = (Mf_Skip_Func)Bt3_MatchFinder_Skip;
}
else
{
vTable->GetMatches = (Mf_GetMatches_Func)Bt4_MatchFinder_GetMatches;
vTable->Skip = (Mf_Skip_Func)Bt4_MatchFinder_Skip;
}
}

115
lzma/C/LzFind.h Normal file
View file

@ -0,0 +1,115 @@
/* LzFind.h -- Match finder for LZ algorithms
2009-04-22 : Igor Pavlov : Public domain */
#ifndef __LZ_FIND_H
#define __LZ_FIND_H
#include "Types.h"
#ifdef __cplusplus
extern "C" {
#endif
typedef UInt32 CLzRef;
typedef struct _CMatchFinder
{
Byte *buffer;
UInt32 pos;
UInt32 posLimit;
UInt32 streamPos;
UInt32 lenLimit;
UInt32 cyclicBufferPos;
UInt32 cyclicBufferSize; /* it must be = (historySize + 1) */
UInt32 matchMaxLen;
CLzRef *hash;
CLzRef *son;
UInt32 hashMask;
UInt32 cutValue;
Byte *bufferBase;
ISeqInStream *stream;
int streamEndWasReached;
UInt32 blockSize;
UInt32 keepSizeBefore;
UInt32 keepSizeAfter;
UInt32 numHashBytes;
int directInput;
size_t directInputRem;
int btMode;
int bigHash;
UInt32 historySize;
UInt32 fixedHashSize;
UInt32 hashSizeSum;
UInt32 numSons;
SRes result;
UInt32 crc[256];
} CMatchFinder;
#define Inline_MatchFinder_GetPointerToCurrentPos(p) ((p)->buffer)
#define Inline_MatchFinder_GetIndexByte(p, index) ((p)->buffer[(Int32)(index)])
#define Inline_MatchFinder_GetNumAvailableBytes(p) ((p)->streamPos - (p)->pos)
int MatchFinder_NeedMove(CMatchFinder *p);
Byte *MatchFinder_GetPointerToCurrentPos(CMatchFinder *p);
void MatchFinder_MoveBlock(CMatchFinder *p);
void MatchFinder_ReadIfRequired(CMatchFinder *p);
void MatchFinder_Construct(CMatchFinder *p);
/* Conditions:
historySize <= 3 GB
keepAddBufferBefore + matchMaxLen + keepAddBufferAfter < 511MB
*/
int MatchFinder_Create(CMatchFinder *p, UInt32 historySize,
UInt32 keepAddBufferBefore, UInt32 matchMaxLen, UInt32 keepAddBufferAfter,
ISzAlloc *alloc);
void MatchFinder_Free(CMatchFinder *p, ISzAlloc *alloc);
void MatchFinder_Normalize3(UInt32 subValue, CLzRef *items, UInt32 numItems);
void MatchFinder_ReduceOffsets(CMatchFinder *p, UInt32 subValue);
UInt32 * GetMatchesSpec1(UInt32 lenLimit, UInt32 curMatch, UInt32 pos, const Byte *buffer, CLzRef *son,
UInt32 _cyclicBufferPos, UInt32 _cyclicBufferSize, UInt32 _cutValue,
UInt32 *distances, UInt32 maxLen);
/*
Conditions:
Mf_GetNumAvailableBytes_Func must be called before each Mf_GetMatchLen_Func.
Mf_GetPointerToCurrentPos_Func's result must be used only before any other function
*/
typedef void (*Mf_Init_Func)(void *object);
typedef Byte (*Mf_GetIndexByte_Func)(void *object, Int32 index);
typedef UInt32 (*Mf_GetNumAvailableBytes_Func)(void *object);
typedef const Byte * (*Mf_GetPointerToCurrentPos_Func)(void *object);
typedef UInt32 (*Mf_GetMatches_Func)(void *object, UInt32 *distances);
typedef void (*Mf_Skip_Func)(void *object, UInt32);
typedef struct _IMatchFinder
{
Mf_Init_Func Init;
Mf_GetIndexByte_Func GetIndexByte;
Mf_GetNumAvailableBytes_Func GetNumAvailableBytes;
Mf_GetPointerToCurrentPos_Func GetPointerToCurrentPos;
Mf_GetMatches_Func GetMatches;
Mf_Skip_Func Skip;
} IMatchFinder;
void MatchFinder_CreateVTable(CMatchFinder *p, IMatchFinder *vTable);
void MatchFinder_Init(CMatchFinder *p);
UInt32 Bt3Zip_MatchFinder_GetMatches(CMatchFinder *p, UInt32 *distances);
UInt32 Hc3Zip_MatchFinder_GetMatches(CMatchFinder *p, UInt32 *distances);
void Bt3Zip_MatchFinder_Skip(CMatchFinder *p, UInt32 num);
void Hc3Zip_MatchFinder_Skip(CMatchFinder *p, UInt32 num);
#ifdef __cplusplus
}
#endif
#endif

793
lzma/C/LzFindMt.c Normal file
View file

@ -0,0 +1,793 @@
/* LzFindMt.c -- multithreaded Match finder for LZ algorithms
2009-05-26 : Igor Pavlov : Public domain */
#include "LzHash.h"
#include "LzFindMt.h"
void MtSync_Construct(CMtSync *p)
{
p->wasCreated = False;
p->csWasInitialized = False;
p->csWasEntered = False;
Thread_Construct(&p->thread);
Event_Construct(&p->canStart);
Event_Construct(&p->wasStarted);
Event_Construct(&p->wasStopped);
Semaphore_Construct(&p->freeSemaphore);
Semaphore_Construct(&p->filledSemaphore);
}
void MtSync_GetNextBlock(CMtSync *p)
{
if (p->needStart)
{
p->numProcessedBlocks = 1;
p->needStart = False;
p->stopWriting = False;
p->exit = False;
Event_Reset(&p->wasStarted);
Event_Reset(&p->wasStopped);
Event_Set(&p->canStart);
Event_Wait(&p->wasStarted);
}
else
{
CriticalSection_Leave(&p->cs);
p->csWasEntered = False;
p->numProcessedBlocks++;
Semaphore_Release1(&p->freeSemaphore);
}
Semaphore_Wait(&p->filledSemaphore);
CriticalSection_Enter(&p->cs);
p->csWasEntered = True;
}
/* MtSync_StopWriting must be called if Writing was started */
void MtSync_StopWriting(CMtSync *p)
{
UInt32 myNumBlocks = p->numProcessedBlocks;
if (!Thread_WasCreated(&p->thread) || p->needStart)
return;
p->stopWriting = True;
if (p->csWasEntered)
{
CriticalSection_Leave(&p->cs);
p->csWasEntered = False;
}
Semaphore_Release1(&p->freeSemaphore);
Event_Wait(&p->wasStopped);
while (myNumBlocks++ != p->numProcessedBlocks)
{
Semaphore_Wait(&p->filledSemaphore);
Semaphore_Release1(&p->freeSemaphore);
}
p->needStart = True;
}
void MtSync_Destruct(CMtSync *p)
{
if (Thread_WasCreated(&p->thread))
{
MtSync_StopWriting(p);
p->exit = True;
if (p->needStart)
Event_Set(&p->canStart);
Thread_Wait(&p->thread);
Thread_Close(&p->thread);
}
if (p->csWasInitialized)
{
CriticalSection_Delete(&p->cs);
p->csWasInitialized = False;
}
Event_Close(&p->canStart);
Event_Close(&p->wasStarted);
Event_Close(&p->wasStopped);
Semaphore_Close(&p->freeSemaphore);
Semaphore_Close(&p->filledSemaphore);
p->wasCreated = False;
}
#define RINOK_THREAD(x) { if ((x) != 0) return SZ_ERROR_THREAD; }
static SRes MtSync_Create2(CMtSync *p, unsigned (MY_STD_CALL *startAddress)(void *), void *obj, UInt32 numBlocks)
{
if (p->wasCreated)
return SZ_OK;
RINOK_THREAD(CriticalSection_Init(&p->cs));
p->csWasInitialized = True;
RINOK_THREAD(AutoResetEvent_CreateNotSignaled(&p->canStart));
RINOK_THREAD(AutoResetEvent_CreateNotSignaled(&p->wasStarted));
RINOK_THREAD(AutoResetEvent_CreateNotSignaled(&p->wasStopped));
RINOK_THREAD(Semaphore_Create(&p->freeSemaphore, numBlocks, numBlocks));
RINOK_THREAD(Semaphore_Create(&p->filledSemaphore, 0, numBlocks));
p->needStart = True;
RINOK_THREAD(Thread_Create(&p->thread, startAddress, obj));
p->wasCreated = True;
return SZ_OK;
}
static SRes MtSync_Create(CMtSync *p, unsigned (MY_STD_CALL *startAddress)(void *), void *obj, UInt32 numBlocks)
{
SRes res = MtSync_Create2(p, startAddress, obj, numBlocks);
if (res != SZ_OK)
MtSync_Destruct(p);
return res;
}
void MtSync_Init(CMtSync *p) { p->needStart = True; }
#define kMtMaxValForNormalize 0xFFFFFFFF
#define DEF_GetHeads2(name, v, action) \
static void GetHeads ## name(const Byte *p, UInt32 pos, \
UInt32 *hash, UInt32 hashMask, UInt32 *heads, UInt32 numHeads, const UInt32 *crc) \
{ action; for (; numHeads != 0; numHeads--) { \
const UInt32 value = (v); p++; *heads++ = pos - hash[value]; hash[value] = pos++; } }
#define DEF_GetHeads(name, v) DEF_GetHeads2(name, v, ;)
DEF_GetHeads2(2, (p[0] | ((UInt32)p[1] << 8)), hashMask = hashMask; crc = crc; )
DEF_GetHeads(3, (crc[p[0]] ^ p[1] ^ ((UInt32)p[2] << 8)) & hashMask)
DEF_GetHeads(4, (crc[p[0]] ^ p[1] ^ ((UInt32)p[2] << 8) ^ (crc[p[3]] << 5)) & hashMask)
DEF_GetHeads(4b, (crc[p[0]] ^ p[1] ^ ((UInt32)p[2] << 8) ^ ((UInt32)p[3] << 16)) & hashMask)
/* DEF_GetHeads(5, (crc[p[0]] ^ p[1] ^ ((UInt32)p[2] << 8) ^ (crc[p[3]] << 5) ^ (crc[p[4]] << 3)) & hashMask) */
void HashThreadFunc(CMatchFinderMt *mt)
{
CMtSync *p = &mt->hashSync;
for (;;)
{
UInt32 numProcessedBlocks = 0;
Event_Wait(&p->canStart);
Event_Set(&p->wasStarted);
for (;;)
{
if (p->exit)
return;
if (p->stopWriting)
{
p->numProcessedBlocks = numProcessedBlocks;
Event_Set(&p->wasStopped);
break;
}
{
CMatchFinder *mf = mt->MatchFinder;
if (MatchFinder_NeedMove(mf))
{
CriticalSection_Enter(&mt->btSync.cs);
CriticalSection_Enter(&mt->hashSync.cs);
{
const Byte *beforePtr = MatchFinder_GetPointerToCurrentPos(mf);
const Byte *afterPtr;
MatchFinder_MoveBlock(mf);
afterPtr = MatchFinder_GetPointerToCurrentPos(mf);
mt->pointerToCurPos -= beforePtr - afterPtr;
mt->buffer -= beforePtr - afterPtr;
}
CriticalSection_Leave(&mt->btSync.cs);
CriticalSection_Leave(&mt->hashSync.cs);
continue;
}
Semaphore_Wait(&p->freeSemaphore);
MatchFinder_ReadIfRequired(mf);
if (mf->pos > (kMtMaxValForNormalize - kMtHashBlockSize))
{
UInt32 subValue = (mf->pos - mf->historySize - 1);
MatchFinder_ReduceOffsets(mf, subValue);
MatchFinder_Normalize3(subValue, mf->hash + mf->fixedHashSize, mf->hashMask + 1);
}
{
UInt32 *heads = mt->hashBuf + ((numProcessedBlocks++) & kMtHashNumBlocksMask) * kMtHashBlockSize;
UInt32 num = mf->streamPos - mf->pos;
heads[0] = 2;
heads[1] = num;
if (num >= mf->numHashBytes)
{
num = num - mf->numHashBytes + 1;
if (num > kMtHashBlockSize - 2)
num = kMtHashBlockSize - 2;
mt->GetHeadsFunc(mf->buffer, mf->pos, mf->hash + mf->fixedHashSize, mf->hashMask, heads + 2, num, mf->crc);
heads[0] += num;
}
mf->pos += num;
mf->buffer += num;
}
}
Semaphore_Release1(&p->filledSemaphore);
}
}
}
void MatchFinderMt_GetNextBlock_Hash(CMatchFinderMt *p)
{
MtSync_GetNextBlock(&p->hashSync);
p->hashBufPosLimit = p->hashBufPos = ((p->hashSync.numProcessedBlocks - 1) & kMtHashNumBlocksMask) * kMtHashBlockSize;
p->hashBufPosLimit += p->hashBuf[p->hashBufPos++];
p->hashNumAvail = p->hashBuf[p->hashBufPos++];
}
#define kEmptyHashValue 0
/* #define MFMT_GM_INLINE */
#ifdef MFMT_GM_INLINE
#define NO_INLINE MY_FAST_CALL
Int32 NO_INLINE GetMatchesSpecN(UInt32 lenLimit, UInt32 pos, const Byte *cur, CLzRef *son,
UInt32 _cyclicBufferPos, UInt32 _cyclicBufferSize, UInt32 _cutValue,
UInt32 *_distances, UInt32 _maxLen, const UInt32 *hash, Int32 limit, UInt32 size, UInt32 *posRes)
{
do
{
UInt32 *distances = _distances + 1;
UInt32 curMatch = pos - *hash++;
CLzRef *ptr0 = son + (_cyclicBufferPos << 1) + 1;
CLzRef *ptr1 = son + (_cyclicBufferPos << 1);
UInt32 len0 = 0, len1 = 0;
UInt32 cutValue = _cutValue;
UInt32 maxLen = _maxLen;
for (;;)
{
UInt32 delta = pos - curMatch;
if (cutValue-- == 0 || delta >= _cyclicBufferSize)
{
*ptr0 = *ptr1 = kEmptyHashValue;
break;
}
{
CLzRef *pair = son + ((_cyclicBufferPos - delta + ((delta > _cyclicBufferPos) ? _cyclicBufferSize : 0)) << 1);
const Byte *pb = cur - delta;
UInt32 len = (len0 < len1 ? len0 : len1);
if (pb[len] == cur[len])
{
if (++len != lenLimit && pb[len] == cur[len])
while (++len != lenLimit)
if (pb[len] != cur[len])
break;
if (maxLen < len)
{
*distances++ = maxLen = len;
*distances++ = delta - 1;
if (len == lenLimit)
{
*ptr1 = pair[0];
*ptr0 = pair[1];
break;
}
}
}
if (pb[len] < cur[len])
{
*ptr1 = curMatch;
ptr1 = pair + 1;
curMatch = *ptr1;
len1 = len;
}
else
{
*ptr0 = curMatch;
ptr0 = pair;
curMatch = *ptr0;
len0 = len;
}
}
}
pos++;
_cyclicBufferPos++;
cur++;
{
UInt32 num = (UInt32)(distances - _distances);
*_distances = num - 1;
_distances += num;
limit -= num;
}
}
while (limit > 0 && --size != 0);
*posRes = pos;
return limit;
}
#endif
void BtGetMatches(CMatchFinderMt *p, UInt32 *distances)
{
UInt32 numProcessed = 0;
UInt32 curPos = 2;
UInt32 limit = kMtBtBlockSize - (p->matchMaxLen * 2);
distances[1] = p->hashNumAvail;
while (curPos < limit)
{
if (p->hashBufPos == p->hashBufPosLimit)
{
MatchFinderMt_GetNextBlock_Hash(p);
distances[1] = numProcessed + p->hashNumAvail;
if (p->hashNumAvail >= p->numHashBytes)
continue;
for (; p->hashNumAvail != 0; p->hashNumAvail--)
distances[curPos++] = 0;
break;
}
{
UInt32 size = p->hashBufPosLimit - p->hashBufPos;
UInt32 lenLimit = p->matchMaxLen;
UInt32 pos = p->pos;
UInt32 cyclicBufferPos = p->cyclicBufferPos;
if (lenLimit >= p->hashNumAvail)
lenLimit = p->hashNumAvail;
{
UInt32 size2 = p->hashNumAvail - lenLimit + 1;
if (size2 < size)
size = size2;
size2 = p->cyclicBufferSize - cyclicBufferPos;
if (size2 < size)
size = size2;
}
#ifndef MFMT_GM_INLINE
while (curPos < limit && size-- != 0)
{
UInt32 *startDistances = distances + curPos;
UInt32 num = (UInt32)(GetMatchesSpec1(lenLimit, pos - p->hashBuf[p->hashBufPos++],
pos, p->buffer, p->son, cyclicBufferPos, p->cyclicBufferSize, p->cutValue,
startDistances + 1, p->numHashBytes - 1) - startDistances);
*startDistances = num - 1;
curPos += num;
cyclicBufferPos++;
pos++;
p->buffer++;
}
#else
{
UInt32 posRes;
curPos = limit - GetMatchesSpecN(lenLimit, pos, p->buffer, p->son, cyclicBufferPos, p->cyclicBufferSize, p->cutValue,
distances + curPos, p->numHashBytes - 1, p->hashBuf + p->hashBufPos, (Int32)(limit - curPos) , size, &posRes);
p->hashBufPos += posRes - pos;
cyclicBufferPos += posRes - pos;
p->buffer += posRes - pos;
pos = posRes;
}
#endif
numProcessed += pos - p->pos;
p->hashNumAvail -= pos - p->pos;
p->pos = pos;
if (cyclicBufferPos == p->cyclicBufferSize)
cyclicBufferPos = 0;
p->cyclicBufferPos = cyclicBufferPos;
}
}
distances[0] = curPos;
}
void BtFillBlock(CMatchFinderMt *p, UInt32 globalBlockIndex)
{
CMtSync *sync = &p->hashSync;
if (!sync->needStart)
{
CriticalSection_Enter(&sync->cs);
sync->csWasEntered = True;
}
BtGetMatches(p, p->btBuf + (globalBlockIndex & kMtBtNumBlocksMask) * kMtBtBlockSize);
if (p->pos > kMtMaxValForNormalize - kMtBtBlockSize)
{
UInt32 subValue = p->pos - p->cyclicBufferSize;
MatchFinder_Normalize3(subValue, p->son, p->cyclicBufferSize * 2);
p->pos -= subValue;
}
if (!sync->needStart)
{
CriticalSection_Leave(&sync->cs);
sync->csWasEntered = False;
}
}
void BtThreadFunc(CMatchFinderMt *mt)
{
CMtSync *p = &mt->btSync;
for (;;)
{
UInt32 blockIndex = 0;
Event_Wait(&p->canStart);
Event_Set(&p->wasStarted);
for (;;)
{
if (p->exit)
return;
if (p->stopWriting)
{
p->numProcessedBlocks = blockIndex;
MtSync_StopWriting(&mt->hashSync);
Event_Set(&p->wasStopped);
break;
}
Semaphore_Wait(&p->freeSemaphore);
BtFillBlock(mt, blockIndex++);
Semaphore_Release1(&p->filledSemaphore);
}
}
}
void MatchFinderMt_Construct(CMatchFinderMt *p)
{
p->hashBuf = 0;
MtSync_Construct(&p->hashSync);
MtSync_Construct(&p->btSync);
}
void MatchFinderMt_FreeMem(CMatchFinderMt *p, ISzAlloc *alloc)
{
alloc->Free(alloc, p->hashBuf);
p->hashBuf = 0;
}
void MatchFinderMt_Destruct(CMatchFinderMt *p, ISzAlloc *alloc)
{
MtSync_Destruct(&p->hashSync);
MtSync_Destruct(&p->btSync);
MatchFinderMt_FreeMem(p, alloc);
}
#define kHashBufferSize (kMtHashBlockSize * kMtHashNumBlocks)
#define kBtBufferSize (kMtBtBlockSize * kMtBtNumBlocks)
static unsigned MY_STD_CALL HashThreadFunc2(void *p) { HashThreadFunc((CMatchFinderMt *)p); return 0; }
static unsigned MY_STD_CALL BtThreadFunc2(void *p)
{
Byte allocaDummy[0x180];
int i = 0;
for (i = 0; i < 16; i++)
allocaDummy[i] = (Byte)i;
BtThreadFunc((CMatchFinderMt *)p);
return 0;
}
SRes MatchFinderMt_Create(CMatchFinderMt *p, UInt32 historySize, UInt32 keepAddBufferBefore,
UInt32 matchMaxLen, UInt32 keepAddBufferAfter, ISzAlloc *alloc)
{
CMatchFinder *mf = p->MatchFinder;
p->historySize = historySize;
if (kMtBtBlockSize <= matchMaxLen * 4)
return SZ_ERROR_PARAM;
if (p->hashBuf == 0)
{
p->hashBuf = (UInt32 *)alloc->Alloc(alloc, (kHashBufferSize + kBtBufferSize) * sizeof(UInt32));
if (p->hashBuf == 0)
return SZ_ERROR_MEM;
p->btBuf = p->hashBuf + kHashBufferSize;
}
keepAddBufferBefore += (kHashBufferSize + kBtBufferSize);
keepAddBufferAfter += kMtHashBlockSize;
if (!MatchFinder_Create(mf, historySize, keepAddBufferBefore, matchMaxLen, keepAddBufferAfter, alloc))
return SZ_ERROR_MEM;
RINOK(MtSync_Create(&p->hashSync, HashThreadFunc2, p, kMtHashNumBlocks));
RINOK(MtSync_Create(&p->btSync, BtThreadFunc2, p, kMtBtNumBlocks));
return SZ_OK;
}
/* Call it after ReleaseStream / SetStream */
void MatchFinderMt_Init(CMatchFinderMt *p)
{
CMatchFinder *mf = p->MatchFinder;
p->btBufPos = p->btBufPosLimit = 0;
p->hashBufPos = p->hashBufPosLimit = 0;
MatchFinder_Init(mf);
p->pointerToCurPos = MatchFinder_GetPointerToCurrentPos(mf);
p->btNumAvailBytes = 0;
p->lzPos = p->historySize + 1;
p->hash = mf->hash;
p->fixedHashSize = mf->fixedHashSize;
p->crc = mf->crc;
p->son = mf->son;
p->matchMaxLen = mf->matchMaxLen;
p->numHashBytes = mf->numHashBytes;
p->pos = mf->pos;
p->buffer = mf->buffer;
p->cyclicBufferPos = mf->cyclicBufferPos;
p->cyclicBufferSize = mf->cyclicBufferSize;
p->cutValue = mf->cutValue;
}
/* ReleaseStream is required to finish multithreading */
void MatchFinderMt_ReleaseStream(CMatchFinderMt *p)
{
MtSync_StopWriting(&p->btSync);
/* p->MatchFinder->ReleaseStream(); */
}
void MatchFinderMt_Normalize(CMatchFinderMt *p)
{
MatchFinder_Normalize3(p->lzPos - p->historySize - 1, p->hash, p->fixedHashSize);
p->lzPos = p->historySize + 1;
}
void MatchFinderMt_GetNextBlock_Bt(CMatchFinderMt *p)
{
UInt32 blockIndex;
MtSync_GetNextBlock(&p->btSync);
blockIndex = ((p->btSync.numProcessedBlocks - 1) & kMtBtNumBlocksMask);
p->btBufPosLimit = p->btBufPos = blockIndex * kMtBtBlockSize;
p->btBufPosLimit += p->btBuf[p->btBufPos++];
p->btNumAvailBytes = p->btBuf[p->btBufPos++];
if (p->lzPos >= kMtMaxValForNormalize - kMtBtBlockSize)
MatchFinderMt_Normalize(p);
}
const Byte * MatchFinderMt_GetPointerToCurrentPos(CMatchFinderMt *p)
{
return p->pointerToCurPos;
}
#define GET_NEXT_BLOCK_IF_REQUIRED if (p->btBufPos == p->btBufPosLimit) MatchFinderMt_GetNextBlock_Bt(p);
UInt32 MatchFinderMt_GetNumAvailableBytes(CMatchFinderMt *p)
{
GET_NEXT_BLOCK_IF_REQUIRED;
return p->btNumAvailBytes;
}
Byte MatchFinderMt_GetIndexByte(CMatchFinderMt *p, Int32 index)
{
return p->pointerToCurPos[index];
}
UInt32 * MixMatches2(CMatchFinderMt *p, UInt32 matchMinPos, UInt32 *distances)
{
UInt32 hash2Value, curMatch2;
UInt32 *hash = p->hash;
const Byte *cur = p->pointerToCurPos;
UInt32 lzPos = p->lzPos;
MT_HASH2_CALC
curMatch2 = hash[hash2Value];
hash[hash2Value] = lzPos;
if (curMatch2 >= matchMinPos)
if (cur[(ptrdiff_t)curMatch2 - lzPos] == cur[0])
{
*distances++ = 2;
*distances++ = lzPos - curMatch2 - 1;
}
return distances;
}
UInt32 * MixMatches3(CMatchFinderMt *p, UInt32 matchMinPos, UInt32 *distances)
{
UInt32 hash2Value, hash3Value, curMatch2, curMatch3;
UInt32 *hash = p->hash;
const Byte *cur = p->pointerToCurPos;
UInt32 lzPos = p->lzPos;
MT_HASH3_CALC
curMatch2 = hash[ hash2Value];
curMatch3 = hash[kFix3HashSize + hash3Value];
hash[ hash2Value] =
hash[kFix3HashSize + hash3Value] =
lzPos;
if (curMatch2 >= matchMinPos && cur[(ptrdiff_t)curMatch2 - lzPos] == cur[0])
{
distances[1] = lzPos - curMatch2 - 1;
if (cur[(ptrdiff_t)curMatch2 - lzPos + 2] == cur[2])
{
distances[0] = 3;
return distances + 2;
}
distances[0] = 2;
distances += 2;
}
if (curMatch3 >= matchMinPos && cur[(ptrdiff_t)curMatch3 - lzPos] == cur[0])
{
*distances++ = 3;
*distances++ = lzPos - curMatch3 - 1;
}
return distances;
}
/*
UInt32 *MixMatches4(CMatchFinderMt *p, UInt32 matchMinPos, UInt32 *distances)
{
UInt32 hash2Value, hash3Value, hash4Value, curMatch2, curMatch3, curMatch4;
UInt32 *hash = p->hash;
const Byte *cur = p->pointerToCurPos;
UInt32 lzPos = p->lzPos;
MT_HASH4_CALC
curMatch2 = hash[ hash2Value];
curMatch3 = hash[kFix3HashSize + hash3Value];
curMatch4 = hash[kFix4HashSize + hash4Value];
hash[ hash2Value] =
hash[kFix3HashSize + hash3Value] =
hash[kFix4HashSize + hash4Value] =
lzPos;
if (curMatch2 >= matchMinPos && cur[(ptrdiff_t)curMatch2 - lzPos] == cur[0])
{
distances[1] = lzPos - curMatch2 - 1;
if (cur[(ptrdiff_t)curMatch2 - lzPos + 2] == cur[2])
{
distances[0] = (cur[(ptrdiff_t)curMatch2 - lzPos + 3] == cur[3]) ? 4 : 3;
return distances + 2;
}
distances[0] = 2;
distances += 2;
}
if (curMatch3 >= matchMinPos && cur[(ptrdiff_t)curMatch3 - lzPos] == cur[0])
{
distances[1] = lzPos - curMatch3 - 1;
if (cur[(ptrdiff_t)curMatch3 - lzPos + 3] == cur[3])
{
distances[0] = 4;
return distances + 2;
}
distances[0] = 3;
distances += 2;
}
if (curMatch4 >= matchMinPos)
if (
cur[(ptrdiff_t)curMatch4 - lzPos] == cur[0] &&
cur[(ptrdiff_t)curMatch4 - lzPos + 3] == cur[3]
)
{
*distances++ = 4;
*distances++ = lzPos - curMatch4 - 1;
}
return distances;
}
*/
#define INCREASE_LZ_POS p->lzPos++; p->pointerToCurPos++;
UInt32 MatchFinderMt2_GetMatches(CMatchFinderMt *p, UInt32 *distances)
{
const UInt32 *btBuf = p->btBuf + p->btBufPos;
UInt32 len = *btBuf++;
p->btBufPos += 1 + len;
p->btNumAvailBytes--;
{
UInt32 i;
for (i = 0; i < len; i += 2)
{
*distances++ = *btBuf++;
*distances++ = *btBuf++;
}
}
INCREASE_LZ_POS
return len;
}
UInt32 MatchFinderMt_GetMatches(CMatchFinderMt *p, UInt32 *distances)
{
const UInt32 *btBuf = p->btBuf + p->btBufPos;
UInt32 len = *btBuf++;
p->btBufPos += 1 + len;
if (len == 0)
{
if (p->btNumAvailBytes-- >= 4)
len = (UInt32)(p->MixMatchesFunc(p, p->lzPos - p->historySize, distances) - (distances));
}
else
{
/* Condition: there are matches in btBuf with length < p->numHashBytes */
UInt32 *distances2;
p->btNumAvailBytes--;
distances2 = p->MixMatchesFunc(p, p->lzPos - btBuf[1], distances);
do
{
*distances2++ = *btBuf++;
*distances2++ = *btBuf++;
}
while ((len -= 2) != 0);
len = (UInt32)(distances2 - (distances));
}
INCREASE_LZ_POS
return len;
}
#define SKIP_HEADER2 do { GET_NEXT_BLOCK_IF_REQUIRED
#define SKIP_HEADER(n) SKIP_HEADER2 if (p->btNumAvailBytes-- >= (n)) { const Byte *cur = p->pointerToCurPos; UInt32 *hash = p->hash;
#define SKIP_FOOTER } INCREASE_LZ_POS p->btBufPos += p->btBuf[p->btBufPos] + 1; } while (--num != 0);
void MatchFinderMt0_Skip(CMatchFinderMt *p, UInt32 num)
{
SKIP_HEADER2 { p->btNumAvailBytes--;
SKIP_FOOTER
}
void MatchFinderMt2_Skip(CMatchFinderMt *p, UInt32 num)
{
SKIP_HEADER(2)
UInt32 hash2Value;
MT_HASH2_CALC
hash[hash2Value] = p->lzPos;
SKIP_FOOTER
}
void MatchFinderMt3_Skip(CMatchFinderMt *p, UInt32 num)
{
SKIP_HEADER(3)
UInt32 hash2Value, hash3Value;
MT_HASH3_CALC
hash[kFix3HashSize + hash3Value] =
hash[ hash2Value] =
p->lzPos;
SKIP_FOOTER
}
/*
void MatchFinderMt4_Skip(CMatchFinderMt *p, UInt32 num)
{
SKIP_HEADER(4)
UInt32 hash2Value, hash3Value, hash4Value;
MT_HASH4_CALC
hash[kFix4HashSize + hash4Value] =
hash[kFix3HashSize + hash3Value] =
hash[ hash2Value] =
p->lzPos;
SKIP_FOOTER
}
*/
void MatchFinderMt_CreateVTable(CMatchFinderMt *p, IMatchFinder *vTable)
{
vTable->Init = (Mf_Init_Func)MatchFinderMt_Init;
vTable->GetIndexByte = (Mf_GetIndexByte_Func)MatchFinderMt_GetIndexByte;
vTable->GetNumAvailableBytes = (Mf_GetNumAvailableBytes_Func)MatchFinderMt_GetNumAvailableBytes;
vTable->GetPointerToCurrentPos = (Mf_GetPointerToCurrentPos_Func)MatchFinderMt_GetPointerToCurrentPos;
vTable->GetMatches = (Mf_GetMatches_Func)MatchFinderMt_GetMatches;
switch(p->MatchFinder->numHashBytes)
{
case 2:
p->GetHeadsFunc = GetHeads2;
p->MixMatchesFunc = (Mf_Mix_Matches)0;
vTable->Skip = (Mf_Skip_Func)MatchFinderMt0_Skip;
vTable->GetMatches = (Mf_GetMatches_Func)MatchFinderMt2_GetMatches;
break;
case 3:
p->GetHeadsFunc = GetHeads3;
p->MixMatchesFunc = (Mf_Mix_Matches)MixMatches2;
vTable->Skip = (Mf_Skip_Func)MatchFinderMt2_Skip;
break;
default:
/* case 4: */
p->GetHeadsFunc = p->MatchFinder->bigHash ? GetHeads4b : GetHeads4;
/* p->GetHeadsFunc = GetHeads4; */
p->MixMatchesFunc = (Mf_Mix_Matches)MixMatches3;
vTable->Skip = (Mf_Skip_Func)MatchFinderMt3_Skip;
break;
/*
default:
p->GetHeadsFunc = GetHeads5;
p->MixMatchesFunc = (Mf_Mix_Matches)MixMatches4;
vTable->Skip = (Mf_Skip_Func)MatchFinderMt4_Skip;
break;
*/
}
}

105
lzma/C/LzFindMt.h Normal file
View file

@ -0,0 +1,105 @@
/* LzFindMt.h -- multithreaded Match finder for LZ algorithms
2009-02-07 : Igor Pavlov : Public domain */
#ifndef __LZ_FIND_MT_H
#define __LZ_FIND_MT_H
#include "LzFind.h"
#include "Threads.h"
#ifdef __cplusplus
extern "C" {
#endif
#define kMtHashBlockSize (1 << 13)
#define kMtHashNumBlocks (1 << 3)
#define kMtHashNumBlocksMask (kMtHashNumBlocks - 1)
#define kMtBtBlockSize (1 << 14)
#define kMtBtNumBlocks (1 << 6)
#define kMtBtNumBlocksMask (kMtBtNumBlocks - 1)
typedef struct _CMtSync
{
Bool wasCreated;
Bool needStart;
Bool exit;
Bool stopWriting;
CThread thread;
CAutoResetEvent canStart;
CAutoResetEvent wasStarted;
CAutoResetEvent wasStopped;
CSemaphore freeSemaphore;
CSemaphore filledSemaphore;
Bool csWasInitialized;
Bool csWasEntered;
CCriticalSection cs;
UInt32 numProcessedBlocks;
} CMtSync;
typedef UInt32 * (*Mf_Mix_Matches)(void *p, UInt32 matchMinPos, UInt32 *distances);
/* kMtCacheLineDummy must be >= size_of_CPU_cache_line */
#define kMtCacheLineDummy 128
typedef void (*Mf_GetHeads)(const Byte *buffer, UInt32 pos,
UInt32 *hash, UInt32 hashMask, UInt32 *heads, UInt32 numHeads, const UInt32 *crc);
typedef struct _CMatchFinderMt
{
/* LZ */
const Byte *pointerToCurPos;
UInt32 *btBuf;
UInt32 btBufPos;
UInt32 btBufPosLimit;
UInt32 lzPos;
UInt32 btNumAvailBytes;
UInt32 *hash;
UInt32 fixedHashSize;
UInt32 historySize;
const UInt32 *crc;
Mf_Mix_Matches MixMatchesFunc;
/* LZ + BT */
CMtSync btSync;
Byte btDummy[kMtCacheLineDummy];
/* BT */
UInt32 *hashBuf;
UInt32 hashBufPos;
UInt32 hashBufPosLimit;
UInt32 hashNumAvail;
CLzRef *son;
UInt32 matchMaxLen;
UInt32 numHashBytes;
UInt32 pos;
Byte *buffer;
UInt32 cyclicBufferPos;
UInt32 cyclicBufferSize; /* it must be historySize + 1 */
UInt32 cutValue;
/* BT + Hash */
CMtSync hashSync;
/* Byte hashDummy[kMtCacheLineDummy]; */
/* Hash */
Mf_GetHeads GetHeadsFunc;
CMatchFinder *MatchFinder;
} CMatchFinderMt;
void MatchFinderMt_Construct(CMatchFinderMt *p);
void MatchFinderMt_Destruct(CMatchFinderMt *p, ISzAlloc *alloc);
SRes MatchFinderMt_Create(CMatchFinderMt *p, UInt32 historySize, UInt32 keepAddBufferBefore,
UInt32 matchMaxLen, UInt32 keepAddBufferAfter, ISzAlloc *alloc);
void MatchFinderMt_CreateVTable(CMatchFinderMt *p, IMatchFinder *vTable);
void MatchFinderMt_ReleaseStream(CMatchFinderMt *p);
#ifdef __cplusplus
}
#endif
#endif

54
lzma/C/LzHash.h Normal file
View file

@ -0,0 +1,54 @@
/* LzHash.h -- HASH functions for LZ algorithms
2009-02-07 : Igor Pavlov : Public domain */
#ifndef __LZ_HASH_H
#define __LZ_HASH_H
#define kHash2Size (1 << 10)
#define kHash3Size (1 << 16)
#define kHash4Size (1 << 20)
#define kFix3HashSize (kHash2Size)
#define kFix4HashSize (kHash2Size + kHash3Size)
#define kFix5HashSize (kHash2Size + kHash3Size + kHash4Size)
#define HASH2_CALC hashValue = cur[0] | ((UInt32)cur[1] << 8);
#define HASH3_CALC { \
UInt32 temp = p->crc[cur[0]] ^ cur[1]; \
hash2Value = temp & (kHash2Size - 1); \
hashValue = (temp ^ ((UInt32)cur[2] << 8)) & p->hashMask; }
#define HASH4_CALC { \
UInt32 temp = p->crc[cur[0]] ^ cur[1]; \
hash2Value = temp & (kHash2Size - 1); \
hash3Value = (temp ^ ((UInt32)cur[2] << 8)) & (kHash3Size - 1); \
hashValue = (temp ^ ((UInt32)cur[2] << 8) ^ (p->crc[cur[3]] << 5)) & p->hashMask; }
#define HASH5_CALC { \
UInt32 temp = p->crc[cur[0]] ^ cur[1]; \
hash2Value = temp & (kHash2Size - 1); \
hash3Value = (temp ^ ((UInt32)cur[2] << 8)) & (kHash3Size - 1); \
hash4Value = (temp ^ ((UInt32)cur[2] << 8) ^ (p->crc[cur[3]] << 5)); \
hashValue = (hash4Value ^ (p->crc[cur[4]] << 3)) & p->hashMask; \
hash4Value &= (kHash4Size - 1); }
/* #define HASH_ZIP_CALC hashValue = ((cur[0] | ((UInt32)cur[1] << 8)) ^ p->crc[cur[2]]) & 0xFFFF; */
#define HASH_ZIP_CALC hashValue = ((cur[2] | ((UInt32)cur[0] << 8)) ^ p->crc[cur[1]]) & 0xFFFF;
#define MT_HASH2_CALC \
hash2Value = (p->crc[cur[0]] ^ cur[1]) & (kHash2Size - 1);
#define MT_HASH3_CALC { \
UInt32 temp = p->crc[cur[0]] ^ cur[1]; \
hash2Value = temp & (kHash2Size - 1); \
hash3Value = (temp ^ ((UInt32)cur[2] << 8)) & (kHash3Size - 1); }
#define MT_HASH4_CALC { \
UInt32 temp = p->crc[cur[0]] ^ cur[1]; \
hash2Value = temp & (kHash2Size - 1); \
hash3Value = (temp ^ ((UInt32)cur[2] << 8)) & (kHash3Size - 1); \
hash4Value = (temp ^ ((UInt32)cur[2] << 8) ^ (p->crc[cur[3]] << 5)) & (kHash4Size - 1); }
#endif

1007
lzma/C/LzmaDec.c Normal file

File diff suppressed because it is too large Load diff

231
lzma/C/LzmaDec.h Normal file
View file

@ -0,0 +1,231 @@
/* LzmaDec.h -- LZMA Decoder
2009-02-07 : Igor Pavlov : Public domain */
#ifndef __LZMA_DEC_H
#define __LZMA_DEC_H
#include "Types.h"
#ifdef __cplusplus
extern "C" {
#endif
/* #define _LZMA_PROB32 */
/* _LZMA_PROB32 can increase the speed on some CPUs,
but memory usage for CLzmaDec::probs will be doubled in that case */
#ifdef _LZMA_PROB32
#define CLzmaProb UInt32
#else
#define CLzmaProb UInt16
#endif
/* ---------- LZMA Properties ---------- */
#define LZMA_PROPS_SIZE 5
typedef struct _CLzmaProps
{
unsigned lc, lp, pb;
UInt32 dicSize;
} CLzmaProps;
/* LzmaProps_Decode - decodes properties
Returns:
SZ_OK
SZ_ERROR_UNSUPPORTED - Unsupported properties
*/
SRes LzmaProps_Decode(CLzmaProps *p, const Byte *data, unsigned size);
/* ---------- LZMA Decoder state ---------- */
/* LZMA_REQUIRED_INPUT_MAX = number of required input bytes for worst case.
Num bits = log2((2^11 / 31) ^ 22) + 26 < 134 + 26 = 160; */
#define LZMA_REQUIRED_INPUT_MAX 20
typedef struct
{
CLzmaProps prop;
CLzmaProb *probs;
Byte *dic;
const Byte *buf;
UInt32 range, code;
SizeT dicPos;
SizeT dicBufSize;
UInt32 processedPos;
UInt32 checkDicSize;
unsigned state;
UInt32 reps[4];
unsigned remainLen;
int needFlush;
int needInitState;
UInt32 numProbs;
unsigned tempBufSize;
Byte tempBuf[LZMA_REQUIRED_INPUT_MAX];
} CLzmaDec;
#define LzmaDec_Construct(p) { (p)->dic = 0; (p)->probs = 0; }
void LzmaDec_Init(CLzmaDec *p);
/* There are two types of LZMA streams:
0) Stream with end mark. That end mark adds about 6 bytes to compressed size.
1) Stream without end mark. You must know exact uncompressed size to decompress such stream. */
typedef enum
{
LZMA_FINISH_ANY, /* finish at any point */
LZMA_FINISH_END /* block must be finished at the end */
} ELzmaFinishMode;
/* ELzmaFinishMode has meaning only if the decoding reaches output limit !!!
You must use LZMA_FINISH_END, when you know that current output buffer
covers last bytes of block. In other cases you must use LZMA_FINISH_ANY.
If LZMA decoder sees end marker before reaching output limit, it returns SZ_OK,
and output value of destLen will be less than output buffer size limit.
You can check status result also.
You can use multiple checks to test data integrity after full decompression:
1) Check Result and "status" variable.
2) Check that output(destLen) = uncompressedSize, if you know real uncompressedSize.
3) Check that output(srcLen) = compressedSize, if you know real compressedSize.
You must use correct finish mode in that case. */
typedef enum
{
LZMA_STATUS_NOT_SPECIFIED, /* use main error code instead */
LZMA_STATUS_FINISHED_WITH_MARK, /* stream was finished with end mark. */
LZMA_STATUS_NOT_FINISHED, /* stream was not finished */
LZMA_STATUS_NEEDS_MORE_INPUT, /* you must provide more input bytes */
LZMA_STATUS_MAYBE_FINISHED_WITHOUT_MARK /* there is probability that stream was finished without end mark */
} ELzmaStatus;
/* ELzmaStatus is used only as output value for function call */
/* ---------- Interfaces ---------- */
/* There are 3 levels of interfaces:
1) Dictionary Interface
2) Buffer Interface
3) One Call Interface
You can select any of these interfaces, but don't mix functions from different
groups for same object. */
/* There are two variants to allocate state for Dictionary Interface:
1) LzmaDec_Allocate / LzmaDec_Free
2) LzmaDec_AllocateProbs / LzmaDec_FreeProbs
You can use variant 2, if you set dictionary buffer manually.
For Buffer Interface you must always use variant 1.
LzmaDec_Allocate* can return:
SZ_OK
SZ_ERROR_MEM - Memory allocation error
SZ_ERROR_UNSUPPORTED - Unsupported properties
*/
SRes LzmaDec_AllocateProbs(CLzmaDec *p, const Byte *props, unsigned propsSize, ISzAlloc *alloc);
void LzmaDec_FreeProbs(CLzmaDec *p, ISzAlloc *alloc);
SRes LzmaDec_Allocate(CLzmaDec *state, const Byte *prop, unsigned propsSize, ISzAlloc *alloc);
void LzmaDec_Free(CLzmaDec *state, ISzAlloc *alloc);
/* ---------- Dictionary Interface ---------- */
/* You can use it, if you want to eliminate the overhead for data copying from
dictionary to some other external buffer.
You must work with CLzmaDec variables directly in this interface.
STEPS:
LzmaDec_Constr()
LzmaDec_Allocate()
for (each new stream)
{
LzmaDec_Init()
while (it needs more decompression)
{
LzmaDec_DecodeToDic()
use data from CLzmaDec::dic and update CLzmaDec::dicPos
}
}
LzmaDec_Free()
*/
/* LzmaDec_DecodeToDic
The decoding to internal dictionary buffer (CLzmaDec::dic).
You must manually update CLzmaDec::dicPos, if it reaches CLzmaDec::dicBufSize !!!
finishMode:
It has meaning only if the decoding reaches output limit (dicLimit).
LZMA_FINISH_ANY - Decode just dicLimit bytes.
LZMA_FINISH_END - Stream must be finished after dicLimit.
Returns:
SZ_OK
status:
LZMA_STATUS_FINISHED_WITH_MARK
LZMA_STATUS_NOT_FINISHED
LZMA_STATUS_NEEDS_MORE_INPUT
LZMA_STATUS_MAYBE_FINISHED_WITHOUT_MARK
SZ_ERROR_DATA - Data error
*/
SRes LzmaDec_DecodeToDic(CLzmaDec *p, SizeT dicLimit,
const Byte *src, SizeT *srcLen, ELzmaFinishMode finishMode, ELzmaStatus *status);
/* ---------- Buffer Interface ---------- */
/* It's zlib-like interface.
See LzmaDec_DecodeToDic description for information about STEPS and return results,
but you must use LzmaDec_DecodeToBuf instead of LzmaDec_DecodeToDic and you don't need
to work with CLzmaDec variables manually.
finishMode:
It has meaning only if the decoding reaches output limit (*destLen).
LZMA_FINISH_ANY - Decode just destLen bytes.
LZMA_FINISH_END - Stream must be finished after (*destLen).
*/
SRes LzmaDec_DecodeToBuf(CLzmaDec *p, Byte *dest, SizeT *destLen,
const Byte *src, SizeT *srcLen, ELzmaFinishMode finishMode, ELzmaStatus *status);
/* ---------- One Call Interface ---------- */
/* LzmaDecode
finishMode:
It has meaning only if the decoding reaches output limit (*destLen).
LZMA_FINISH_ANY - Decode just destLen bytes.
LZMA_FINISH_END - Stream must be finished after (*destLen).
Returns:
SZ_OK
status:
LZMA_STATUS_FINISHED_WITH_MARK
LZMA_STATUS_NOT_FINISHED
LZMA_STATUS_MAYBE_FINISHED_WITHOUT_MARK
SZ_ERROR_DATA - Data error
SZ_ERROR_MEM - Memory allocation error
SZ_ERROR_UNSUPPORTED - Unsupported properties
SZ_ERROR_INPUT_EOF - It needs more bytes in input buffer (src).
*/
SRes LzmaDecode(Byte *dest, SizeT *destLen, const Byte *src, SizeT *srcLen,
const Byte *propData, unsigned propSize, ELzmaFinishMode finishMode,
ELzmaStatus *status, ISzAlloc *alloc);
#ifdef __cplusplus
}
#endif
#endif

2268
lzma/C/LzmaEnc.c Normal file

File diff suppressed because it is too large Load diff

80
lzma/C/LzmaEnc.h Normal file
View file

@ -0,0 +1,80 @@
/* LzmaEnc.h -- LZMA Encoder
2009-02-07 : Igor Pavlov : Public domain */
#ifndef __LZMA_ENC_H
#define __LZMA_ENC_H
#include "Types.h"
#ifdef __cplusplus
extern "C" {
#endif
#define LZMA_PROPS_SIZE 5
typedef struct _CLzmaEncProps
{
int level; /* 0 <= level <= 9 */
UInt32 dictSize; /* (1 << 12) <= dictSize <= (1 << 27) for 32-bit version
(1 << 12) <= dictSize <= (1 << 30) for 64-bit version
default = (1 << 24) */
int lc; /* 0 <= lc <= 8, default = 3 */
int lp; /* 0 <= lp <= 4, default = 0 */
int pb; /* 0 <= pb <= 4, default = 2 */
int algo; /* 0 - fast, 1 - normal, default = 1 */
int fb; /* 5 <= fb <= 273, default = 32 */
int btMode; /* 0 - hashChain Mode, 1 - binTree mode - normal, default = 1 */
int numHashBytes; /* 2, 3 or 4, default = 4 */
UInt32 mc; /* 1 <= mc <= (1 << 30), default = 32 */
unsigned writeEndMark; /* 0 - do not write EOPM, 1 - write EOPM, default = 0 */
int numThreads; /* 1 or 2, default = 2 */
} CLzmaEncProps;
void LzmaEncProps_Init(CLzmaEncProps *p);
void LzmaEncProps_Normalize(CLzmaEncProps *p);
UInt32 LzmaEncProps_GetDictSize(const CLzmaEncProps *props2);
/* ---------- CLzmaEncHandle Interface ---------- */
/* LzmaEnc_* functions can return the following exit codes:
Returns:
SZ_OK - OK
SZ_ERROR_MEM - Memory allocation error
SZ_ERROR_PARAM - Incorrect paramater in props
SZ_ERROR_WRITE - Write callback error.
SZ_ERROR_PROGRESS - some break from progress callback
SZ_ERROR_THREAD - errors in multithreading functions (only for Mt version)
*/
typedef void * CLzmaEncHandle;
CLzmaEncHandle LzmaEnc_Create(ISzAlloc *alloc);
void LzmaEnc_Destroy(CLzmaEncHandle p, ISzAlloc *alloc, ISzAlloc *allocBig);
SRes LzmaEnc_SetProps(CLzmaEncHandle p, const CLzmaEncProps *props);
SRes LzmaEnc_WriteProperties(CLzmaEncHandle p, Byte *properties, SizeT *size);
SRes LzmaEnc_Encode(CLzmaEncHandle p, ISeqOutStream *outStream, ISeqInStream *inStream,
ICompressProgress *progress, ISzAlloc *alloc, ISzAlloc *allocBig);
SRes LzmaEnc_MemEncode(CLzmaEncHandle p, Byte *dest, SizeT *destLen, const Byte *src, SizeT srcLen,
int writeEndMark, ICompressProgress *progress, ISzAlloc *alloc, ISzAlloc *allocBig);
/* ---------- One Call Interface ---------- */
/* LzmaEncode
Return code:
SZ_OK - OK
SZ_ERROR_MEM - Memory allocation error
SZ_ERROR_PARAM - Incorrect paramater
SZ_ERROR_OUTPUT_EOF - output buffer overflow
SZ_ERROR_THREAD - errors in multithreading functions (only for Mt version)
*/
SRes LzmaEncode(Byte *dest, SizeT *destLen, const Byte *src, SizeT srcLen,
const CLzmaEncProps *props, Byte *propsEncoded, SizeT *propsSize, int writeEndMark,
ICompressProgress *progress, ISzAlloc *alloc, ISzAlloc *allocBig);
#ifdef __cplusplus
}
#endif
#endif

46
lzma/C/LzmaLib.c Normal file
View file

@ -0,0 +1,46 @@
/* LzmaLib.c -- LZMA library wrapper
2008-08-05
Igor Pavlov
Public domain */
#include "LzmaEnc.h"
#include "LzmaDec.h"
#include "Alloc.h"
#include "LzmaLib.h"
static void *SzAlloc(void *p, size_t size) { p = p; return MyAlloc(size); }
static void SzFree(void *p, void *address) { p = p; MyFree(address); }
static ISzAlloc g_Alloc = { SzAlloc, SzFree };
MY_STDAPI LzmaCompress(unsigned char *dest, size_t *destLen, const unsigned char *src, size_t srcLen,
unsigned char *outProps, size_t *outPropsSize,
int level, /* 0 <= level <= 9, default = 5 */
unsigned dictSize, /* use (1 << N) or (3 << N). 4 KB < dictSize <= 128 MB */
int lc, /* 0 <= lc <= 8, default = 3 */
int lp, /* 0 <= lp <= 4, default = 0 */
int pb, /* 0 <= pb <= 4, default = 2 */
int fb, /* 5 <= fb <= 273, default = 32 */
int numThreads /* 1 or 2, default = 2 */
)
{
CLzmaEncProps props;
LzmaEncProps_Init(&props);
props.level = level;
props.dictSize = dictSize;
props.lc = lc;
props.lp = lp;
props.pb = pb;
props.fb = fb;
props.numThreads = numThreads;
return LzmaEncode(dest, destLen, src, srcLen, &props, outProps, outPropsSize, 0,
NULL, &g_Alloc, &g_Alloc);
}
MY_STDAPI LzmaUncompress(unsigned char *dest, size_t *destLen, const unsigned char *src, size_t *srcLen,
const unsigned char *props, size_t propsSize)
{
ELzmaStatus status;
return LzmaDecode(dest, destLen, src, srcLen, props, (unsigned)propsSize, LZMA_FINISH_ANY, &status, &g_Alloc);
}

135
lzma/C/LzmaLib.h Normal file
View file

@ -0,0 +1,135 @@
/* LzmaLib.h -- LZMA library interface
2009-04-07 : Igor Pavlov : Public domain */
#ifndef __LZMA_LIB_H
#define __LZMA_LIB_H
#include "Types.h"
#ifdef __cplusplus
extern "C" {
#endif
#define MY_STDAPI int MY_STD_CALL
#define LZMA_PROPS_SIZE 5
/*
RAM requirements for LZMA:
for compression: (dictSize * 11.5 + 6 MB) + state_size
for decompression: dictSize + state_size
state_size = (4 + (1.5 << (lc + lp))) KB
by default (lc=3, lp=0), state_size = 16 KB.
LZMA properties (5 bytes) format
Offset Size Description
0 1 lc, lp and pb in encoded form.
1 4 dictSize (little endian).
*/
/*
LzmaCompress
------------
outPropsSize -
In: the pointer to the size of outProps buffer; *outPropsSize = LZMA_PROPS_SIZE = 5.
Out: the pointer to the size of written properties in outProps buffer; *outPropsSize = LZMA_PROPS_SIZE = 5.
LZMA Encoder will use defult values for any parameter, if it is
-1 for any from: level, loc, lp, pb, fb, numThreads
0 for dictSize
level - compression level: 0 <= level <= 9;
level dictSize algo fb
0: 16 KB 0 32
1: 64 KB 0 32
2: 256 KB 0 32
3: 1 MB 0 32
4: 4 MB 0 32
5: 16 MB 1 32
6: 32 MB 1 32
7+: 64 MB 1 64
The default value for "level" is 5.
algo = 0 means fast method
algo = 1 means normal method
dictSize - The dictionary size in bytes. The maximum value is
128 MB = (1 << 27) bytes for 32-bit version
1 GB = (1 << 30) bytes for 64-bit version
The default value is 16 MB = (1 << 24) bytes.
It's recommended to use the dictionary that is larger than 4 KB and
that can be calculated as (1 << N) or (3 << N) sizes.
lc - The number of literal context bits (high bits of previous literal).
It can be in the range from 0 to 8. The default value is 3.
Sometimes lc=4 gives the gain for big files.
lp - The number of literal pos bits (low bits of current position for literals).
It can be in the range from 0 to 4. The default value is 0.
The lp switch is intended for periodical data when the period is equal to 2^lp.
For example, for 32-bit (4 bytes) periodical data you can use lp=2. Often it's
better to set lc=0, if you change lp switch.
pb - The number of pos bits (low bits of current position).
It can be in the range from 0 to 4. The default value is 2.
The pb switch is intended for periodical data when the period is equal 2^pb.
fb - Word size (the number of fast bytes).
It can be in the range from 5 to 273. The default value is 32.
Usually, a big number gives a little bit better compression ratio and
slower compression process.
numThreads - The number of thereads. 1 or 2. The default value is 2.
Fast mode (algo = 0) can use only 1 thread.
Out:
destLen - processed output size
Returns:
SZ_OK - OK
SZ_ERROR_MEM - Memory allocation error
SZ_ERROR_PARAM - Incorrect paramater
SZ_ERROR_OUTPUT_EOF - output buffer overflow
SZ_ERROR_THREAD - errors in multithreading functions (only for Mt version)
*/
MY_STDAPI LzmaCompress(unsigned char *dest, size_t *destLen, const unsigned char *src, size_t srcLen,
unsigned char *outProps, size_t *outPropsSize, /* *outPropsSize must be = 5 */
int level, /* 0 <= level <= 9, default = 5 */
unsigned dictSize, /* default = (1 << 24) */
int lc, /* 0 <= lc <= 8, default = 3 */
int lp, /* 0 <= lp <= 4, default = 0 */
int pb, /* 0 <= pb <= 4, default = 2 */
int fb, /* 5 <= fb <= 273, default = 32 */
int numThreads /* 1 or 2, default = 2 */
);
/*
LzmaUncompress
--------------
In:
dest - output data
destLen - output data size
src - input data
srcLen - input data size
Out:
destLen - processed output size
srcLen - processed input size
Returns:
SZ_OK - OK
SZ_ERROR_DATA - Data error
SZ_ERROR_MEM - Memory allocation arror
SZ_ERROR_UNSUPPORTED - Unsupported properties
SZ_ERROR_INPUT_EOF - it needs more bytes in input buffer (src)
*/
MY_STDAPI LzmaUncompress(unsigned char *dest, size_t *destLen, const unsigned char *src, SizeT *srcLen,
const unsigned char *props, size_t propsSize);
#ifdef __cplusplus
}
#endif
#endif

55
lzma/C/MyGuidDef.h Normal file
View file

@ -0,0 +1,55 @@
// Common/MyGuidDef.h
#ifndef GUID_DEFINED
#define GUID_DEFINED
// #include "Types.h"
typedef int HRes; // from Types.h
typedef struct {
UInt32 Data1;
UInt16 Data2;
UInt16 Data3;
unsigned char Data4[8];
} GUID;
#ifdef __cplusplus
#define REFGUID const GUID &
#else
#define REFGUID const GUID *
#endif
#define REFCLSID REFGUID
#define REFIID REFGUID
#ifdef __cplusplus
inline int operator==(REFGUID g1, REFGUID g2)
{
for (int i = 0; i < (int)sizeof(g1); i++)
if (((unsigned char *)&g1)[i] != ((unsigned char *)&g2)[i])
return 0;
return 1;
}
inline int operator!=(REFGUID g1, REFGUID g2) { return !(g1 == g2); }
#endif
#ifdef __cplusplus
#define MY_EXTERN_C extern "C"
#else
#define MY_EXTERN_C extern
#endif
#endif // GUID_DEFINED
#ifdef DEFINE_GUID
#undef DEFINE_GUID
#endif
#ifdef INITGUID
#define DEFINE_GUID(name, l, w1, w2, b1, b2, b3, b4, b5, b6, b7, b8) \
MY_EXTERN_C const GUID name = { l, w1, w2, { b1, b2, b3, b4, b5, b6, b7, b8 } }
#else
#define DEFINE_GUID(name, l, w1, w2, b1, b2, b3, b4, b5, b6, b7, b8) \
MY_EXTERN_C const GUID name
#endif

226
lzma/C/MyWindows.h Normal file
View file

@ -0,0 +1,226 @@
// MyWindows.h
#ifndef __MYWINDOWS_H
#define __MYWINDOWS_H
#ifdef _WIN32
#include <windows.h>
#define CHAR_PATH_SEPARATOR '\\'
#define WCHAR_PATH_SEPARATOR L'\\'
#define STRING_PATH_SEPARATOR "\\"
#define WSTRING_PATH_SEPARATOR L"\\"
#else
#define CHAR_PATH_SEPARATOR '/'
#define WCHAR_PATH_SEPARATOR L'/'
#define STRING_PATH_SEPARATOR "/"
#define WSTRING_PATH_SEPARATOR L"/"
#include <stddef.h> // for wchar_t
#include <string.h>
#include "MyGuidDef.h"
typedef char CHAR;
typedef unsigned char UCHAR;
#undef BYTE
typedef unsigned char BYTE;
typedef short SHORT;
typedef unsigned short USHORT;
#undef WORD
typedef unsigned short WORD;
typedef short VARIANT_BOOL;
typedef int INT;
typedef Int32 INT32;
typedef unsigned int UINT;
typedef UInt32 UINT32;
typedef INT32 LONG; // LONG, ULONG and DWORD must be 32-bit
typedef UINT32 ULONG;
#undef DWORD
typedef UINT32 DWORD;
typedef Int64 LONGLONG;
typedef UInt64 ULONGLONG;
typedef struct LARGE_INTEGER { LONGLONG QuadPart; }LARGE_INTEGER;
typedef struct _ULARGE_INTEGER { ULONGLONG QuadPart;} ULARGE_INTEGER;
typedef const CHAR *LPCSTR;
typedef wchar_t WCHAR;
#ifdef _UNICODE
typedef WCHAR TCHAR;
#define lstrcpy wcscpy
#define lstrcat wcscat
#define lstrlen wcslen
#else
typedef CHAR TCHAR;
#define lstrcpy strcpy
#define lstrcat strcat
#define lstrlen strlen
#endif
typedef const TCHAR *LPCTSTR;
typedef WCHAR OLECHAR;
typedef const WCHAR *LPCWSTR;
typedef OLECHAR *BSTR;
typedef const OLECHAR *LPCOLESTR;
typedef OLECHAR *LPOLESTR;
typedef struct _FILETIME
{
DWORD dwLowDateTime;
DWORD dwHighDateTime;
}FILETIME;
#define HRESULT LONG
#define FAILED(Status) ((HRESULT)(Status)<0)
typedef ULONG PROPID;
typedef LONG SCODE;
#define S_OK ((HRESULT)0x00000000L)
#define S_FALSE ((HRESULT)0x00000001L)
#define E_NOTIMPL ((HRESULT)0x80004001L)
#define E_NOINTERFACE ((HRESULT)0x80004002L)
#define E_ABORT ((HRESULT)0x80004004L)
#define E_FAIL ((HRESULT)0x80004005L)
#define STG_E_INVALIDFUNCTION ((HRESULT)0x80030001L)
#define E_OUTOFMEMORY ((HRESULT)0x8007000EL)
#define E_INVALIDARG ((HRESULT)0x80070057L)
#ifdef _MSC_VER
#define STDMETHODCALLTYPE __stdcall
#else
#define STDMETHODCALLTYPE
#endif
#define STDMETHOD_(t, f) virtual t STDMETHODCALLTYPE f
#define STDMETHOD(f) STDMETHOD_(HRESULT, f)
#define STDMETHODIMP_(type) type STDMETHODCALLTYPE
#define STDMETHODIMP STDMETHODIMP_(HRESULT)
#define PURE = 0
#define MIDL_INTERFACE(x) struct
#ifdef __cplusplus
DEFINE_GUID(IID_IUnknown,
0x00000000, 0x0000, 0x0000, 0xC0, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x46);
struct IUnknown
{
STDMETHOD(QueryInterface) (REFIID iid, void **outObject) PURE;
STDMETHOD_(ULONG, AddRef)() PURE;
STDMETHOD_(ULONG, Release)() PURE;
#ifndef _WIN32
virtual ~IUnknown() {}
#endif
};
typedef IUnknown *LPUNKNOWN;
#endif
#define VARIANT_TRUE ((VARIANT_BOOL)-1)
#define VARIANT_FALSE ((VARIANT_BOOL)0)
enum VARENUM
{
VT_EMPTY = 0,
VT_NULL = 1,
VT_I2 = 2,
VT_I4 = 3,
VT_R4 = 4,
VT_R8 = 5,
VT_CY = 6,
VT_DATE = 7,
VT_BSTR = 8,
VT_DISPATCH = 9,
VT_ERROR = 10,
VT_BOOL = 11,
VT_VARIANT = 12,
VT_UNKNOWN = 13,
VT_DECIMAL = 14,
VT_I1 = 16,
VT_UI1 = 17,
VT_UI2 = 18,
VT_UI4 = 19,
VT_I8 = 20,
VT_UI8 = 21,
VT_INT = 22,
VT_UINT = 23,
VT_VOID = 24,
VT_HRESULT = 25,
VT_FILETIME = 64
};
typedef unsigned short VARTYPE;
typedef WORD PROPVAR_PAD1;
typedef WORD PROPVAR_PAD2;
typedef WORD PROPVAR_PAD3;
#ifdef __cplusplus
typedef struct tagPROPVARIANT
{
VARTYPE vt;
PROPVAR_PAD1 wReserved1;
PROPVAR_PAD2 wReserved2;
PROPVAR_PAD3 wReserved3;
union
{
CHAR cVal;
UCHAR bVal;
SHORT iVal;
USHORT uiVal;
LONG lVal;
ULONG ulVal;
INT intVal;
UINT uintVal;
LARGE_INTEGER hVal;
ULARGE_INTEGER uhVal;
VARIANT_BOOL boolVal;
SCODE scode;
FILETIME filetime;
BSTR bstrVal;
};
} PROPVARIANT;
typedef PROPVARIANT tagVARIANT;
typedef tagVARIANT VARIANT;
typedef VARIANT VARIANTARG;
MY_EXTERN_C HRESULT VariantClear(VARIANTARG *prop);
MY_EXTERN_C HRESULT VariantCopy(VARIANTARG *dest, VARIANTARG *src);
#endif
MY_EXTERN_C BSTR SysAllocStringByteLen(LPCSTR psz, UINT len);
MY_EXTERN_C BSTR SysAllocString(const OLECHAR *sz);
MY_EXTERN_C void SysFreeString(BSTR bstr);
MY_EXTERN_C UINT SysStringByteLen(BSTR bstr);
MY_EXTERN_C UINT SysStringLen(BSTR bstr);
/* MY_EXTERN_C DWORD GetLastError(); */
MY_EXTERN_C LONG CompareFileTime(const FILETIME* ft1, const FILETIME* ft2);
#define CP_ACP 0
#define CP_OEMCP 1
typedef enum tagSTREAM_SEEK
{
STREAM_SEEK_SET = 0,
STREAM_SEEK_CUR = 1,
STREAM_SEEK_END = 2
} STREAM_SEEK;
#endif
#endif

582
lzma/C/Threads.c Normal file
View file

@ -0,0 +1,582 @@
/* Threads.c */
#include "Threads.h"
#ifdef ENV_BEOS
#include <kernel/OS.h>
#else
#include <pthread.h>
#include <stdlib.h>
#endif
#include <errno.h>
#if defined(__linux__)
#define PTHREAD_MUTEX_ERRORCHECK PTHREAD_MUTEX_ERRORCHECK_NP
#endif
#ifdef ENV_BEOS
/* TODO : optimize the code and verify the returned values */
WRes Thread_Create(CThread *thread, THREAD_FUNC_RET_TYPE (THREAD_FUNC_CALL_TYPE *startAddress)(void *), LPVOID parameter)
{
thread->_tid = spawn_thread((int32 (*)(void *))startAddress, "CThread", B_LOW_PRIORITY, parameter);
if (thread->_tid >= B_OK) {
resume_thread(thread->_tid);
} else {
thread->_tid = B_BAD_THREAD_ID;
}
thread->_created = 1;
return 0; // SZ_OK;
}
WRes Thread_Wait(CThread *thread)
{
int ret;
if (thread->_created == 0)
return EINVAL;
if (thread->_tid >= B_OK)
{
status_t exit_value;
wait_for_thread(thread->_tid, &exit_value);
thread->_tid = B_BAD_THREAD_ID;
} else {
return EINVAL;
}
thread->_created = 0;
return 0;
}
WRes Thread_Close(CThread *thread)
{
if (!thread->_created) return SZ_OK;
thread->_tid = B_BAD_THREAD_ID;
thread->_created = 0;
return SZ_OK;
}
WRes Event_Create(CEvent *p, BOOL manualReset, int initialSignaled)
{
p->_index_waiting = 0;
p->_manual_reset = manualReset;
p->_state = (initialSignaled ? TRUE : FALSE);
p->_created = 1;
p->_sem = create_sem(1,"event");
return 0;
}
WRes Event_Set(CEvent *p) {
int index;
acquire_sem(p->_sem);
p->_state = TRUE;
for(index = 0 ; index < p->_index_waiting ; index++)
{
send_data(p->_waiting[index], '7zCN', NULL, 0);
}
p->_index_waiting = 0;
release_sem(p->_sem);
return 0;
}
WRes Event_Reset(CEvent *p) {
acquire_sem(p->_sem);
p->_state = FALSE;
release_sem(p->_sem);
return 0;
}
WRes Event_Wait(CEvent *p) {
acquire_sem(p->_sem);
while (p->_state == FALSE)
{
thread_id sender;
p->_waiting[p->_index_waiting++] = find_thread(NULL);
release_sem(p->_sem);
/* int msg = */ receive_data(&sender, NULL, 0);
acquire_sem(p->_sem);
}
if (p->_manual_reset == FALSE)
{
p->_state = FALSE;
}
release_sem(p->_sem);
return 0;
}
WRes Event_Close(CEvent *p) {
if (p->_created)
{
p->_created = 0;
delete_sem(p->_sem);
}
return 0;
}
WRes Semaphore_Create(CSemaphore *p, UInt32 initiallyCount, UInt32 maxCount)
{
p->_index_waiting = 0;
p->_count = initiallyCount;
p->_maxCount = maxCount;
p->_created = 1;
p->_sem = create_sem(1,"sem");
return 0;
}
WRes Semaphore_ReleaseN(CSemaphore *p, UInt32 releaseCount)
{
UInt32 newCount;
int index;
if (releaseCount < 1) return EINVAL;
acquire_sem(p->_sem);
newCount = p->_count + releaseCount;
if (newCount > p->_maxCount)
{
release_sem(p->_sem);
return EINVAL;
}
p->_count = newCount;
for(index = 0 ; index < p->_index_waiting ; index++)
{
send_data(p->_waiting[index], '7zCN', NULL, 0);
}
p->_index_waiting = 0;
release_sem(p->_sem);
return 0;
}
WRes Semaphore_Wait(CSemaphore *p) {
acquire_sem(p->_sem);
while (p->_count < 1)
{
thread_id sender;
p->_waiting[p->_index_waiting++] = find_thread(NULL);
release_sem(p->_sem);
/* int msg = */ receive_data(&sender, NULL, 0);
acquire_sem(p->_sem);
}
p->_count--;
release_sem(p->_sem);
return 0;
}
WRes Semaphore_Close(CSemaphore *p) {
if (p->_created)
{
p->_created = 0;
delete_sem(p->_sem);
}
return 0;
}
WRes CriticalSection_Init(CCriticalSection * lpCriticalSection)
{
lpCriticalSection->_sem = create_sem(1,"cc");
return 0;
}
#else /* !ENV_BEOS */
WRes Thread_Create(CThread *thread, THREAD_FUNC_RET_TYPE (THREAD_FUNC_CALL_TYPE *startAddress)(void *), LPVOID parameter)
{
pthread_attr_t attr;
int ret;
thread->_created = 0;
ret = pthread_attr_init(&attr);
if (ret) return ret;
ret = pthread_attr_setdetachstate(&attr,PTHREAD_CREATE_JOINABLE);
if (ret) return ret;
ret = pthread_create(&thread->_tid, &attr, (void * (*)(void *))startAddress, parameter);
/* ret2 = */ pthread_attr_destroy(&attr);
if (ret) return ret;
thread->_created = 1;
return 0; // SZ_OK;
}
WRes Thread_Wait(CThread *thread)
{
void *thread_return;
int ret;
if (thread->_created == 0)
return EINVAL;
ret = pthread_join(thread->_tid,&thread_return);
thread->_created = 0;
return ret;
}
WRes Thread_Close(CThread *thread)
{
if (!thread->_created) return SZ_OK;
pthread_detach(thread->_tid);
thread->_tid = 0;
thread->_created = 0;
return SZ_OK;
}
#ifdef DEBUG_SYNCHRO
#include <stdio.h>
static void dump_error(int ligne,int ret,const char *text,void *param)
{
fprintf(stderr, "\n##T%d#ERROR2 (l=%d) %s : param=%p ret = %d (%s)##\n",(int)pthread_self(),ligne,text,param,ret,strerror(ret));
// abort();
}
WRes Event_Create(CEvent *p, BOOL manualReset, int initialSignaled)
{
int ret;
pthread_mutexattr_t mutexattr;
memset(&mutexattr,0,sizeof(mutexattr));
ret = pthread_mutexattr_init(&mutexattr);
if (ret != 0) dump_error(__LINE__,ret,"Event_Create::pthread_mutexattr_init",&mutexattr);
ret = pthread_mutexattr_settype(&mutexattr,PTHREAD_MUTEX_ERRORCHECK);
if (ret != 0) dump_error(__LINE__,ret,"Event_Create::pthread_mutexattr_settype",&mutexattr);
ret = pthread_mutex_init(&p->_mutex,&mutexattr);
if (ret != 0) dump_error(__LINE__,ret,"Event_Create::pthread_mutexattr_init",&p->_mutex);
if (ret == 0)
{
ret = pthread_cond_init(&p->_cond,0);
if (ret != 0) dump_error(__LINE__,ret,"Event_Create::pthread_cond_init",&p->_cond);
p->_manual_reset = manualReset;
p->_state = (initialSignaled ? TRUE : FALSE);
p->_created = 1;
}
return ret;
}
WRes Event_Set(CEvent *p) {
int ret = pthread_mutex_lock(&p->_mutex);
if (ret != 0) dump_error(__LINE__,ret,"ES::pthread_mutex_lock",&p->_mutex);
if (ret == 0)
{
p->_state = TRUE;
ret = pthread_cond_broadcast(&p->_cond);
if (ret != 0) dump_error(__LINE__,ret,"ES::pthread_cond_broadcast",&p->_cond);
if (ret == 0)
{
ret = pthread_mutex_unlock(&p->_mutex);
if (ret != 0) dump_error(__LINE__,ret,"ES::pthread_mutex_unlock",&p->_mutex);
}
}
return ret;
}
WRes Event_Reset(CEvent *p) {
int ret = pthread_mutex_lock(&p->_mutex);
if (ret != 0) dump_error(__LINE__,ret,"ER::pthread_mutex_lock",&p->_mutex);
if (ret == 0)
{
p->_state = FALSE;
ret = pthread_mutex_unlock(&p->_mutex);
if (ret != 0) dump_error(__LINE__,ret,"ER::pthread_mutex_unlock",&p->_mutex);
}
return ret;
}
WRes Event_Wait(CEvent *p) {
int ret = pthread_mutex_lock(&p->_mutex);
if (ret != 0) dump_error(__LINE__,ret,"EW::pthread_mutex_lock",&p->_mutex);
if (ret == 0)
{
while ((p->_state == FALSE) && (ret == 0))
{
ret = pthread_cond_wait(&p->_cond, &p->_mutex);
if (ret != 0) dump_error(__LINE__,ret,"EW::pthread_cond_wait",&p->_mutex);
}
if (ret == 0)
{
if (p->_manual_reset == FALSE)
{
p->_state = FALSE;
}
ret = pthread_mutex_unlock(&p->_mutex);
if (ret != 0) dump_error(__LINE__,ret,"EW::pthread_mutex_unlock",&p->_mutex);
}
}
return ret;
}
WRes Event_Close(CEvent *p) {
if (p->_created)
{
int ret;
p->_created = 0;
ret = pthread_mutex_destroy(&p->_mutex);
if (ret != 0) dump_error(__LINE__,ret,"EC::pthread_mutex_destroy",&p->_mutex);
ret = pthread_cond_destroy(&p->_cond);
if (ret != 0) dump_error(__LINE__,ret,"EC::pthread_cond_destroy",&p->_cond);
}
return 0;
}
WRes Semaphore_Create(CSemaphore *p, UInt32 initiallyCount, UInt32 maxCount)
{
int ret;
pthread_mutexattr_t mutexattr;
memset(&mutexattr,0,sizeof(mutexattr));
ret = pthread_mutexattr_init(&mutexattr);
if (ret != 0) dump_error(__LINE__,ret,"SemC::pthread_mutexattr_init",&mutexattr);
ret = pthread_mutexattr_settype(&mutexattr,PTHREAD_MUTEX_ERRORCHECK);
if (ret != 0) dump_error(__LINE__,ret,"SemC::pthread_mutexattr_settype",&mutexattr);
ret = pthread_mutex_init(&p->_mutex,&mutexattr);
if (ret != 0) dump_error(__LINE__,ret,"SemC::pthread_mutexattr_init",&p->_mutex);
if (ret == 0)
{
ret = pthread_cond_init(&p->_cond,0);
if (ret != 0) dump_error(__LINE__,ret,"SemC::pthread_cond_init",&p->_mutex);
p->_count = initiallyCount;
p->_maxCount = maxCount;
p->_created = 1;
}
return ret;
}
WRes Semaphore_ReleaseN(CSemaphore *p, UInt32 releaseCount)
{
int ret;
if (releaseCount < 1) return EINVAL;
ret = pthread_mutex_lock(&p->_mutex);
if (ret != 0) dump_error(__LINE__,ret,"SemR::pthread_mutex_lock",&p->_mutex);
if (ret == 0)
{
UInt32 newCount = p->_count + releaseCount;
if (newCount > p->_maxCount)
{
ret = pthread_mutex_unlock(&p->_mutex);
if (ret != 0) dump_error(__LINE__,ret,"SemR::pthread_mutex_unlock",&p->_mutex);
return EINVAL;
}
p->_count = newCount;
ret = pthread_cond_broadcast(&p->_cond);
if (ret != 0) dump_error(__LINE__,ret,"SemR::pthread_cond_broadcast",&p->_cond);
if (ret == 0)
{
ret = pthread_mutex_unlock(&p->_mutex);
if (ret != 0) dump_error(__LINE__,ret,"SemR::pthread_mutex_unlock",&p->_mutex);
}
}
return ret;
}
WRes Semaphore_Wait(CSemaphore *p) {
int ret = pthread_mutex_lock(&p->_mutex);
if (ret != 0) dump_error(__LINE__,ret,"SemW::pthread_mutex_lock",&p->_mutex);
if (ret == 0)
{
while ((p->_count < 1) && (ret == 0))
{
ret = pthread_cond_wait(&p->_cond, &p->_mutex);
if (ret != 0) dump_error(__LINE__,ret,"SemW::pthread_cond_wait",&p->_mutex);
}
if (ret == 0)
{
p->_count--;
ret = pthread_mutex_unlock(&p->_mutex);
if (ret != 0) dump_error(__LINE__,ret,"SemW::pthread_mutex_unlock",&p->_mutex);
}
}
return ret;
}
WRes Semaphore_Close(CSemaphore *p) {
if (p->_created)
{
int ret;
p->_created = 0;
ret = pthread_mutex_destroy(&p->_mutex);
if (ret != 0) dump_error(__LINE__,ret,"Semc::pthread_mutex_destroy",&p->_mutex);
ret = pthread_cond_destroy(&p->_cond);
if (ret != 0) dump_error(__LINE__,ret,"Semc::pthread_cond_destroy",&p->_cond);
}
return 0;
}
WRes CriticalSection_Init(CCriticalSection * lpCriticalSection)
{
if (lpCriticalSection)
{
int ret;
pthread_mutexattr_t mutexattr;
memset(&mutexattr,0,sizeof(mutexattr));
ret = pthread_mutexattr_init(&mutexattr);
if (ret != 0) dump_error(__LINE__,ret,"CS I::pthread_mutexattr_init",&mutexattr);
ret = pthread_mutexattr_settype(&mutexattr,PTHREAD_MUTEX_ERRORCHECK);
if (ret != 0) dump_error(__LINE__,ret,"CS I::pthread_mutexattr_settype",&mutexattr);
ret = pthread_mutex_init(&lpCriticalSection->_mutex,&mutexattr);
if (ret != 0) dump_error(__LINE__,ret,"CS I::pthread_mutexattr_init",&lpCriticalSection->_mutex);
return ret;
}
return EINTR;
}
void CriticalSection_Enter(CCriticalSection * lpCriticalSection)
{
if (lpCriticalSection)
{
int ret = pthread_mutex_lock(&(lpCriticalSection->_mutex));
if (ret != 0) dump_error(__LINE__,ret,"CS::pthread_mutex_lock",&(lpCriticalSection->_mutex));
}
}
void CriticalSection_Leave(CCriticalSection * lpCriticalSection)
{
if (lpCriticalSection)
{
int ret = pthread_mutex_unlock(&(lpCriticalSection->_mutex));
if (ret != 0) dump_error(__LINE__,ret,"CS::pthread_mutex_unlock",&(lpCriticalSection->_mutex));
}
}
void CriticalSection_Delete(CCriticalSection * lpCriticalSection)
{
if (lpCriticalSection)
{
int ret = pthread_mutex_destroy(&(lpCriticalSection->_mutex));
if (ret != 0) dump_error(__LINE__,ret,"CS::pthread_mutex_destroy",&(lpCriticalSection->_mutex));
}
}
#else
WRes Event_Create(CEvent *p, BOOL manualReset, int initialSignaled)
{
pthread_mutex_init(&p->_mutex,0);
pthread_cond_init(&p->_cond,0);
p->_manual_reset = manualReset;
p->_state = (initialSignaled ? TRUE : FALSE);
p->_created = 1;
return 0;
}
WRes Event_Set(CEvent *p) {
pthread_mutex_lock(&p->_mutex);
p->_state = TRUE;
pthread_cond_broadcast(&p->_cond);
pthread_mutex_unlock(&p->_mutex);
return 0;
}
WRes Event_Reset(CEvent *p) {
pthread_mutex_lock(&p->_mutex);
p->_state = FALSE;
pthread_mutex_unlock(&p->_mutex);
return 0;
}
WRes Event_Wait(CEvent *p) {
pthread_mutex_lock(&p->_mutex);
while (p->_state == FALSE)
{
pthread_cond_wait(&p->_cond, &p->_mutex);
}
if (p->_manual_reset == FALSE)
{
p->_state = FALSE;
}
pthread_mutex_unlock(&p->_mutex);
return 0;
}
WRes Event_Close(CEvent *p) {
if (p->_created)
{
p->_created = 0;
pthread_mutex_destroy(&p->_mutex);
pthread_cond_destroy(&p->_cond);
}
return 0;
}
WRes Semaphore_Create(CSemaphore *p, UInt32 initiallyCount, UInt32 maxCount)
{
pthread_mutex_init(&p->_mutex,0);
pthread_cond_init(&p->_cond,0);
p->_count = initiallyCount;
p->_maxCount = maxCount;
p->_created = 1;
return 0;
}
WRes Semaphore_ReleaseN(CSemaphore *p, UInt32 releaseCount)
{
UInt32 newCount;
if (releaseCount < 1) return EINVAL;
pthread_mutex_lock(&p->_mutex);
newCount = p->_count + releaseCount;
if (newCount > p->_maxCount)
{
pthread_mutex_unlock(&p->_mutex);
return EINVAL;
}
p->_count = newCount;
pthread_cond_broadcast(&p->_cond);
pthread_mutex_unlock(&p->_mutex);
return 0;
}
WRes Semaphore_Wait(CSemaphore *p) {
pthread_mutex_lock(&p->_mutex);
while (p->_count < 1)
{
pthread_cond_wait(&p->_cond, &p->_mutex);
}
p->_count--;
pthread_mutex_unlock(&p->_mutex);
return 0;
}
WRes Semaphore_Close(CSemaphore *p) {
if (p->_created)
{
p->_created = 0;
pthread_mutex_destroy(&p->_mutex);
pthread_cond_destroy(&p->_cond);
}
return 0;
}
WRes CriticalSection_Init(CCriticalSection * lpCriticalSection)
{
return pthread_mutex_init(&(lpCriticalSection->_mutex),0);
}
#endif /* DEBUG_SYNCHRO */
#endif /* ENV_BEOS */
WRes ManualResetEvent_Create(CManualResetEvent *p, int initialSignaled)
{ return Event_Create(p, TRUE, initialSignaled); }
WRes ManualResetEvent_CreateNotSignaled(CManualResetEvent *p)
{ return ManualResetEvent_Create(p, 0); }
WRes AutoResetEvent_Create(CAutoResetEvent *p, int initialSignaled)
{ return Event_Create(p, FALSE, initialSignaled); }
WRes AutoResetEvent_CreateNotSignaled(CAutoResetEvent *p)
{ return AutoResetEvent_Create(p, 0); }

121
lzma/C/Threads.h Normal file
View file

@ -0,0 +1,121 @@
/* Threads.h -- multithreading library
2008-11-22 : Igor Pavlov : Public domain */
#ifndef __7Z_THRESDS_H
#define __7Z_THRESDS_H
#include "Types.h"
#include "windows.h"
#ifdef ENV_BEOS
#include <kernel/OS.h>
#define MAX_THREAD 256
#else
#include <pthread.h>
#endif
/* #define DEBUG_SYNCHRO 1 */
typedef struct _CThread
{
#ifdef ENV_BEOS
thread_id _tid;
#else
pthread_t _tid;
#endif
int _created;
} CThread;
#define Thread_Construct(thread) (thread)->_created = 0
#define Thread_WasCreated(thread) ((thread)->_created != 0)
typedef unsigned THREAD_FUNC_RET_TYPE;
#define THREAD_FUNC_CALL_TYPE MY_STD_CALL
#define THREAD_FUNC_DECL THREAD_FUNC_RET_TYPE THREAD_FUNC_CALL_TYPE
WRes Thread_Create(CThread *thread, THREAD_FUNC_RET_TYPE (THREAD_FUNC_CALL_TYPE *startAddress)(void *), LPVOID parameter);
WRes Thread_Wait(CThread *thread);
WRes Thread_Close(CThread *thread);
typedef struct _CEvent
{
int _created;
int _manual_reset;
int _state;
#ifdef ENV_BEOS
thread_id _waiting[MAX_THREAD];
int _index_waiting;
sem_id _sem;
#else
pthread_mutex_t _mutex;
pthread_cond_t _cond;
#endif
} CEvent;
typedef CEvent CAutoResetEvent;
typedef CEvent CManualResetEvent;
#define Event_Construct(event) (event)->_created = 0
#define Event_IsCreated(event) ((event)->_created)
WRes ManualResetEvent_Create(CManualResetEvent *event, int initialSignaled);
WRes ManualResetEvent_CreateNotSignaled(CManualResetEvent *event);
WRes AutoResetEvent_Create(CAutoResetEvent *event, int initialSignaled);
WRes AutoResetEvent_CreateNotSignaled(CAutoResetEvent *event);
WRes Event_Set(CEvent *event);
WRes Event_Reset(CEvent *event);
WRes Event_Wait(CEvent *event);
WRes Event_Close(CEvent *event);
typedef struct _CSemaphore
{
int _created;
UInt32 _count;
UInt32 _maxCount;
#ifdef ENV_BEOS
thread_id _waiting[MAX_THREAD];
int _index_waiting;
sem_id _sem;
#else
pthread_mutex_t _mutex;
pthread_cond_t _cond;
#endif
} CSemaphore;
#define Semaphore_Construct(p) (p)->_created = 0
WRes Semaphore_Create(CSemaphore *p, UInt32 initiallyCount, UInt32 maxCount);
WRes Semaphore_ReleaseN(CSemaphore *p, UInt32 num);
#define Semaphore_Release1(p) Semaphore_ReleaseN(p, 1)
WRes Semaphore_Wait(CSemaphore *p);
WRes Semaphore_Close(CSemaphore *p);
typedef struct {
#ifdef ENV_BEOS
sem_id _sem;
#else
pthread_mutex_t _mutex;
#endif
} CCriticalSection;
WRes CriticalSection_Init(CCriticalSection *p);
#ifdef ENV_BEOS
#define CriticalSection_Delete(p) delete_sem((p)->_sem)
#define CriticalSection_Enter(p) acquire_sem((p)->_sem)
#define CriticalSection_Leave(p) release_sem((p)->_sem)
#else
#ifdef DEBUG_SYNCHRO
void CriticalSection_Delete(CCriticalSection *);
void CriticalSection_Enter(CCriticalSection *);
void CriticalSection_Leave(CCriticalSection *);
#else
#define CriticalSection_Delete(p) pthread_mutex_destroy(&((p)->_mutex))
#define CriticalSection_Enter(p) pthread_mutex_lock(&((p)->_mutex))
#define CriticalSection_Leave(p) pthread_mutex_unlock(&((p)->_mutex))
#endif
#endif
#endif

228
lzma/C/Types.h Normal file
View file

@ -0,0 +1,228 @@
/* Types.h -- Basic types
2009-08-14 : Igor Pavlov : Public domain */
#ifndef __7Z_TYPES_H
#define __7Z_TYPES_H
#include <stddef.h>
#ifdef _WIN32
#include <windows.h>
#endif
#ifndef EXTERN_C_BEGIN
#ifdef __cplusplus
#define EXTERN_C_BEGIN extern "C" {
#define EXTERN_C_END }
#else
#define EXTERN_C_BEGIN
#define EXTERN_C_END
#endif
#endif
EXTERN_C_BEGIN
#define SZ_OK 0
#define SZ_ERROR_DATA 1
#define SZ_ERROR_MEM 2
#define SZ_ERROR_CRC 3
#define SZ_ERROR_UNSUPPORTED 4
#define SZ_ERROR_PARAM 5
#define SZ_ERROR_INPUT_EOF 6
#define SZ_ERROR_OUTPUT_EOF 7
#define SZ_ERROR_READ 8
#define SZ_ERROR_WRITE 9
#define SZ_ERROR_PROGRESS 10
#define SZ_ERROR_FAIL 11
#define SZ_ERROR_THREAD 12
#define SZ_ERROR_ARCHIVE 16
#define SZ_ERROR_NO_ARCHIVE 17
typedef int SRes;
#ifdef _WIN32
typedef DWORD WRes;
#else
typedef int WRes;
#endif
#ifndef RINOK
#define RINOK(x) { int __result__ = (x); if (__result__ != 0) return __result__; }
#endif
/* zconf.h defines Byte. Don't redefine if it's included. */
#ifndef ZCONF_H
#ifndef _ZCONF_H
typedef unsigned char Byte;
#endif
#endif
typedef short Int16;
typedef unsigned short UInt16;
#ifdef _LZMA_UINT32_IS_ULONG
typedef long Int32;
typedef unsigned long UInt32;
#else
typedef int Int32;
typedef unsigned int UInt32;
#endif
#ifdef _SZ_NO_INT_64
/* define _SZ_NO_INT_64, if your compiler doesn't support 64-bit integers.
NOTES: Some code will work incorrectly in that case! */
typedef long Int64;
typedef unsigned long UInt64;
#else
#if defined(_MSC_VER) || defined(__BORLANDC__)
typedef __int64 Int64;
typedef unsigned __int64 UInt64;
#else
typedef long long int Int64;
typedef unsigned long long int UInt64;
#endif
#endif
#ifdef _LZMA_NO_SYSTEM_SIZE_T
typedef UInt32 SizeT;
#else
typedef size_t SizeT;
#endif
typedef int Bool;
#define True 1
#define False 0
#ifdef _MSC_VER
#if _MSC_VER >= 1300
#define MY_NO_INLINE __declspec(noinline)
#else
#define MY_NO_INLINE
#endif
#define MY_CDECL __cdecl
#define MY_STD_CALL __stdcall
#define MY_FAST_CALL MY_NO_INLINE __fastcall
#else
#define MY_CDECL
#define MY_STD_CALL
#define MY_FAST_CALL
#endif
/* The following interfaces use first parameter as pointer to structure */
typedef struct
{
SRes (*Read)(void *p, void *buf, size_t *size);
/* if (input(*size) != 0 && output(*size) == 0) means end_of_stream.
(output(*size) < input(*size)) is allowed */
} ISeqInStream;
/* it can return SZ_ERROR_INPUT_EOF */
SRes SeqInStream_Read(ISeqInStream *stream, void *buf, size_t size);
SRes SeqInStream_Read2(ISeqInStream *stream, void *buf, size_t size, SRes errorType);
SRes SeqInStream_ReadByte(ISeqInStream *stream, Byte *buf);
typedef struct
{
size_t (*Write)(void *p, const void *buf, size_t size);
/* Returns: result - the number of actually written bytes.
(result < size) means error */
} ISeqOutStream;
typedef enum
{
SZ_SEEK_SET = 0,
SZ_SEEK_CUR = 1,
SZ_SEEK_END = 2
} ESzSeek;
typedef struct
{
SRes (*Read)(void *p, void *buf, size_t *size); /* same as ISeqInStream::Read */
SRes (*Seek)(void *p, Int64 *pos, ESzSeek origin);
} ISeekInStream;
typedef struct
{
SRes (*Look)(void *p, void **buf, size_t *size);
/* if (input(*size) != 0 && output(*size) == 0) means end_of_stream.
(output(*size) > input(*size)) is not allowed
(output(*size) < input(*size)) is allowed */
SRes (*Skip)(void *p, size_t offset);
/* offset must be <= output(*size) of Look */
SRes (*Read)(void *p, void *buf, size_t *size);
/* reads directly (without buffer). It's same as ISeqInStream::Read */
SRes (*Seek)(void *p, Int64 *pos, ESzSeek origin);
} ILookInStream;
SRes LookInStream_LookRead(ILookInStream *stream, void *buf, size_t *size);
SRes LookInStream_SeekTo(ILookInStream *stream, UInt64 offset);
/* reads via ILookInStream::Read */
SRes LookInStream_Read2(ILookInStream *stream, void *buf, size_t size, SRes errorType);
SRes LookInStream_Read(ILookInStream *stream, void *buf, size_t size);
#define LookToRead_BUF_SIZE (1 << 14)
typedef struct
{
ILookInStream s;
ISeekInStream *realStream;
size_t pos;
size_t size;
Byte buf[LookToRead_BUF_SIZE];
} CLookToRead;
void LookToRead_CreateVTable(CLookToRead *p, int lookahead);
void LookToRead_Init(CLookToRead *p);
typedef struct
{
ISeqInStream s;
ILookInStream *realStream;
} CSecToLook;
void SecToLook_CreateVTable(CSecToLook *p);
typedef struct
{
ISeqInStream s;
ILookInStream *realStream;
} CSecToRead;
void SecToRead_CreateVTable(CSecToRead *p);
typedef struct
{
SRes (*Progress)(void *p, UInt64 inSize, UInt64 outSize);
/* Returns: result. (result != SZ_OK) means break.
Value (UInt64)(Int64)-1 for size means unknown value. */
} ICompressProgress;
typedef struct
{
void *(*Alloc)(void *p, size_t size);
void (*Free)(void *p, void *address); /* address can be 0 */
} ISzAlloc;
#define IAlloc_Alloc(p, size) (p)->Alloc((p), size)
#define IAlloc_Free(p, a) (p)->Free((p), a)
EXTERN_C_END
#endif

228
lzma/C/Types.h~ Normal file
View file

@ -0,0 +1,228 @@
/* Types.h -- Basic types
2009-08-14 : Igor Pavlov : Public domain */
#ifndef __7Z_TYPES_H
#define __7Z_TYPES_H
#include <stddef.h>
#ifdef _WIN32
#include <windows.h>
#endif
#ifndef EXTERN_C_BEGIN
#ifdef __cplusplus
#define EXTERN_C_BEGIN extern "C" {
#define EXTERN_C_END }
#else
#define EXTERN_C_BEGIN
#define EXTERN_C_END
#endif
#endif
EXTERN_C_BEGIN
#define SZ_OK 0
#define SZ_ERROR_DATA 1
#define SZ_ERROR_MEM 2
#define SZ_ERROR_CRC 3
#define SZ_ERROR_UNSUPPORTED 4
#define SZ_ERROR_PARAM 5
#define SZ_ERROR_INPUT_EOF 6
#define SZ_ERROR_OUTPUT_EOF 7
#define SZ_ERROR_READ 8
#define SZ_ERROR_WRITE 9
#define SZ_ERROR_PROGRESS 10
#define SZ_ERROR_FAIL 11
#define SZ_ERROR_THREAD 12
#define SZ_ERROR_ARCHIVE 16
#define SZ_ERROR_NO_ARCHIVE 17
typedef int SRes;
#ifdef _WIN32
typedef DWORD WRes;
#else
typedef int WRes;
#endif
#ifndef RINOK
#define RINOK(x) { int __result__ = (x); if (__result__ != 0) return __result__; }
#endif
/* zconf.h defines Byte. Don't redefine if it's included. */
#ifndef ZCONF_H
#ifndef _ZCONF_H
ypedef unsigned char Byte;
#endif
#endif
typedef short Int16;
typedef unsigned short UInt16;
#ifdef _LZMA_UINT32_IS_ULONG
typedef long Int32;
typedef unsigned long UInt32;
#else
typedef int Int32;
typedef unsigned int UInt32;
#endif
#ifdef _SZ_NO_INT_64
/* define _SZ_NO_INT_64, if your compiler doesn't support 64-bit integers.
NOTES: Some code will work incorrectly in that case! */
typedef long Int64;
typedef unsigned long UInt64;
#else
#if defined(_MSC_VER) || defined(__BORLANDC__)
typedef __int64 Int64;
typedef unsigned __int64 UInt64;
#else
typedef long long int Int64;
typedef unsigned long long int UInt64;
#endif
#endif
#ifdef _LZMA_NO_SYSTEM_SIZE_T
typedef UInt32 SizeT;
#else
typedef size_t SizeT;
#endif
typedef int Bool;
#define True 1
#define False 0
#ifdef _MSC_VER
#if _MSC_VER >= 1300
#define MY_NO_INLINE __declspec(noinline)
#else
#define MY_NO_INLINE
#endif
#define MY_CDECL __cdecl
#define MY_STD_CALL __stdcall
#define MY_FAST_CALL MY_NO_INLINE __fastcall
#else
#define MY_CDECL
#define MY_STD_CALL
#define MY_FAST_CALL
#endif
/* The following interfaces use first parameter as pointer to structure */
typedef struct
{
SRes (*Read)(void *p, void *buf, size_t *size);
/* if (input(*size) != 0 && output(*size) == 0) means end_of_stream.
(output(*size) < input(*size)) is allowed */
} ISeqInStream;
/* it can return SZ_ERROR_INPUT_EOF */
SRes SeqInStream_Read(ISeqInStream *stream, void *buf, size_t size);
SRes SeqInStream_Read2(ISeqInStream *stream, void *buf, size_t size, SRes errorType);
SRes SeqInStream_ReadByte(ISeqInStream *stream, Byte *buf);
typedef struct
{
size_t (*Write)(void *p, const void *buf, size_t size);
/* Returns: result - the number of actually written bytes.
(result < size) means error */
} ISeqOutStream;
typedef enum
{
SZ_SEEK_SET = 0,
SZ_SEEK_CUR = 1,
SZ_SEEK_END = 2
} ESzSeek;
typedef struct
{
SRes (*Read)(void *p, void *buf, size_t *size); /* same as ISeqInStream::Read */
SRes (*Seek)(void *p, Int64 *pos, ESzSeek origin);
} ISeekInStream;
typedef struct
{
SRes (*Look)(void *p, void **buf, size_t *size);
/* if (input(*size) != 0 && output(*size) == 0) means end_of_stream.
(output(*size) > input(*size)) is not allowed
(output(*size) < input(*size)) is allowed */
SRes (*Skip)(void *p, size_t offset);
/* offset must be <= output(*size) of Look */
SRes (*Read)(void *p, void *buf, size_t *size);
/* reads directly (without buffer). It's same as ISeqInStream::Read */
SRes (*Seek)(void *p, Int64 *pos, ESzSeek origin);
} ILookInStream;
SRes LookInStream_LookRead(ILookInStream *stream, void *buf, size_t *size);
SRes LookInStream_SeekTo(ILookInStream *stream, UInt64 offset);
/* reads via ILookInStream::Read */
SRes LookInStream_Read2(ILookInStream *stream, void *buf, size_t size, SRes errorType);
SRes LookInStream_Read(ILookInStream *stream, void *buf, size_t size);
#define LookToRead_BUF_SIZE (1 << 14)
typedef struct
{
ILookInStream s;
ISeekInStream *realStream;
size_t pos;
size_t size;
Byte buf[LookToRead_BUF_SIZE];
} CLookToRead;
void LookToRead_CreateVTable(CLookToRead *p, int lookahead);
void LookToRead_Init(CLookToRead *p);
typedef struct
{
ISeqInStream s;
ILookInStream *realStream;
} CSecToLook;
void SecToLook_CreateVTable(CSecToLook *p);
typedef struct
{
ISeqInStream s;
ILookInStream *realStream;
} CSecToRead;
void SecToRead_CreateVTable(CSecToRead *p);
typedef struct
{
SRes (*Progress)(void *p, UInt64 inSize, UInt64 outSize);
/* Returns: result. (result != SZ_OK) means break.
Value (UInt64)(Int64)-1 for size means unknown value. */
} ICompressProgress;
typedef struct
{
void *(*Alloc)(void *p, size_t size);
void (*Free)(void *p, void *address); /* address can be 0 */
} ISzAlloc;
#define IAlloc_Alloc(p, size) (p)->Alloc((p), size)
#define IAlloc_Free(p, a) (p)->Free((p), a)
EXTERN_C_END
#endif

19
lzma/C/basetyps.h Normal file
View file

@ -0,0 +1,19 @@
#ifndef _BASETYPS_H
#define _BASETYPS_H
#ifdef HAVE_GCCVISIBILITYPATCH
#define DLLEXPORT __attribute__ ((visibility("default")))
#else
#define DLLEXPORT
#endif
#ifdef __cplusplus
#define STDAPI extern "C" DLLEXPORT HRESULT
#else
#define STDAPI extern DLLEXPORT HRESULT
#endif /* __cplusplus */
typedef GUID IID;
typedef GUID CLSID;
#endif

194
lzma/C/windows.h Normal file
View file

@ -0,0 +1,194 @@
/*
windows.h - main header file for the Win32 API
Written by Anders Norlander <anorland@hem2.passagen.se>
This file is part of a free library for the Win32 API.
This library is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
*/
#ifndef _WINDOWS_H
#define _WINDOWS_H
#include <stdarg.h>
/* BEGIN #include <windef.h> */
#include "MyWindows.h" // FIXED
#ifndef CONST
#define CONST const
#endif
#undef MAX_PATH
#define MAX_PATH 4096 /* Linux : 4096 - Windows : 260 */
#ifndef FALSE
#define FALSE 0
#endif
#ifndef TRUE
#define TRUE 1
#endif
#define WINAPI
#undef BOOL
typedef int BOOL;
/* BEGIN #include <winnt.h> */
/* BEGIN <winerror.h> */
#define NO_ERROR 0L
#define ERROR_ALREADY_EXISTS EEXIST
#define ERROR_FILE_EXISTS EEXIST
#define ERROR_INVALID_HANDLE EBADF
#define ERROR_PATH_NOT_FOUND ENOENT
#define ERROR_DISK_FULL ENOSPC
#define ERROR_NO_MORE_FILES 0x100123 // FIXME
/* see Common/WyWindows.h
#define S_OK ((HRESULT)0x00000000L)
#define S_FALSE ((HRESULT)0x00000001L)
#define E_INVALIDARG ((HRESULT)0x80070057L)
#define E_NOTIMPL ((HRESULT)0x80004001L)
#define E_NOINTERFACE ((HRESULT)0x80004002L)
#define E_ABORT ((HRESULT)0x80004004L)
#define E_FAIL ((HRESULT)0x80004005L)
#define E_OUTOFMEMORY ((HRESULT)0x8007000EL)
#define STG_E_INVALIDFUNCTION ((HRESULT)0x80030001L)
#define SUCCEEDED(Status) ((HRESULT)(Status) >= 0)
#define FAILED(Status) ((HRESULT)(Status)<0)
*/
#ifndef VOID
#define VOID void
#endif
typedef void *PVOID,*LPVOID;
typedef WCHAR *LPWSTR;
typedef CHAR *LPSTR;
typedef TCHAR *LPTSTR;
#ifdef UNICODE
/*
* P7ZIP_TEXT is a private macro whose specific use is to force the expansion of a
* macro passed as an argument to the macro TEXT. DO NOT use this
* macro within your programs. It's name and function could change without
* notice.
*/
#define P7ZIP_TEXT(q) L##q
#else
#define P7ZIP_TEXT(q) q
#endif
/*
* UNICODE a constant string when UNICODE is defined, else returns the string
* unmodified.
* The corresponding macros _TEXT() and _T() for mapping _UNICODE strings
* passed to C runtime functions are defined in mingw/tchar.h
*/
#define TEXT(q) P7ZIP_TEXT(q)
typedef BYTE BOOLEAN;
/* BEGIN #include <basetsd.h> */
#ifndef __int64
#define __int64 long long
#endif
typedef unsigned __int64 UINT64;
typedef __int64 INT64;
/* END #include <basetsd.h> */
#define FILE_ATTRIBUTE_READONLY 1
#define FILE_ATTRIBUTE_HIDDEN 2
#define FILE_ATTRIBUTE_SYSTEM 4
#define FILE_ATTRIBUTE_DIRECTORY 16
#define FILE_ATTRIBUTE_ARCHIVE 32
#define FILE_ATTRIBUTE_DEVICE 64
#define FILE_ATTRIBUTE_NORMAL 128
#define FILE_ATTRIBUTE_TEMPORARY 256
#define FILE_ATTRIBUTE_SPARSE_FILE 512
#define FILE_ATTRIBUTE_REPARSE_POINT 1024
#define FILE_ATTRIBUTE_COMPRESSED 2048
#define FILE_ATTRIBUTE_OFFLINE 0x1000
#define FILE_ATTRIBUTE_ENCRYPTED 0x4000
#define FILE_ATTRIBUTE_UNIX_EXTENSION 0x8000 /* trick for Unix */
/* END <winerror.h> */
#include <string.h>
#include <stddef.h>
/* END #include <winnt.h> */
/* END #include <windef.h> */
/* BEGIN #include <winbase.h> */
#define WAIT_OBJECT_0 0
#define INFINITE 0xFFFFFFFF
typedef struct _SYSTEMTIME {
WORD wYear;
WORD wMonth;
WORD wDayOfWeek;
WORD wDay;
WORD wHour;
WORD wMinute;
WORD wSecond;
WORD wMilliseconds;
} SYSTEMTIME;
#ifdef __cplusplus
extern "C" {
#endif
BOOL WINAPI DosDateTimeToFileTime(WORD,WORD,FILETIME *);
BOOL WINAPI FileTimeToDosDateTime(CONST FILETIME *,WORD *, WORD *);
BOOL WINAPI FileTimeToLocalFileTime(CONST FILETIME *,FILETIME *);
BOOL WINAPI FileTimeToSystemTime(CONST FILETIME *,SYSTEMTIME *);
BOOL WINAPI LocalFileTimeToFileTime(CONST FILETIME *,FILETIME *);
VOID WINAPI GetSystemTime(SYSTEMTIME *);
BOOL WINAPI SystemTimeToFileTime(const SYSTEMTIME*,FILETIME *);
DWORD WINAPI GetTickCount(VOID);
#ifdef __cplusplus
}
#endif
/* END #include <winbase.h> */
/* BEGIN #include <winnls.h> */
#define CP_ACP 0
#define CP_OEMCP 1
#define CP_UTF8 65001
/* #include <unknwn.h> */
#include <basetyps.h>
struct IEnumSTATPROPSTG;
typedef struct tagSTATPROPSTG {
LPOLESTR lpwstrName;
PROPID propid;
VARTYPE vt;
} STATPROPSTG;
#ifdef __cplusplus
extern "C" const IID IID_ISequentialStream;
struct ISequentialStream : public IUnknown
{
STDMETHOD(QueryInterface)(REFIID,PVOID*) PURE;
STDMETHOD_(ULONG,AddRef)(void) PURE;
STDMETHOD_(ULONG,Release)(void) PURE;
STDMETHOD(Read)(void*,ULONG,ULONG*) PURE;
STDMETHOD(Write)(void const*,ULONG,ULONG*) PURE;
};
#else
extern const IID IID_ISequentialStream;
#endif /* __cplusplus */
/* END #include <ole2.h> */
#endif

141
lzma/Methods.txt Normal file
View file

@ -0,0 +1,141 @@
7-Zip method IDs (4.61)
-----------------------
Each compression or crypto method in 7z has unique binary value (ID).
The length of ID in bytes is arbitrary but it can not exceed 63 bits (8 bytes).
If you want to add some new ID, you have two ways:
1) Write request for allocating IDs to 7-zip developers.
2) Generate 8-bytes ID:
3F ZZ ZZ ZZ ZZ ZZ MM MM
3F - Prefix for random IDs (1 byte)
ZZ ZZ ZZ ZZ ZZ - Developer ID (5 bytes). Use real random bytes.
MM MM - Method ID (2 bytes)
You can notify 7-Zip developers about your Developer ID / Method ID.
Note: Use new ID only if old codec can not decode data encoded with new version.
List of defined IDs
-------------------
00 - Copy
01 - Reserved
02 - Common
03 Swap
- 2 Swap2
- 4 Swap4
04 Delta (subject to change)
03 - 7z
01 - LZMA
01 - Version
03 - Branch
01 - x86
03 - BCJ
1B - BCJ2
02 - PPC
05 - PPC (Big Endian)
03 - Alpha
01 - Alpha
04 - IA64
01 - IA64
05 - ARM
01 - ARM
06 - M68
05 - M68 (Big Endian)
07 - ARM Thumb
01 - ARMT
08 - SPARC
05 - SPARC
04 - PPMD
01 - Version
7F -
01 - experimental methods.
80 - reserved for independent developers
E0 - Random IDs
04 - Misc
00 - Reserved
01 - Zip
00 - Copy (not used). Use {00} instead
01 - Shrink
06 - Implode
08 - Deflate
09 - Deflate64
12 - BZip2 (not used). Use {04 02 02} instead
02 - BZip
02 - BZip2
03 - Rar
01 - Rar15
02 - Rar20
03 - Rar29
04 - Arj
01 - Arj (1,2,3)
02 - Arj 4
05 - Z
06 - Lzh
07 - Reserved for 7z
08 - Cab
09 - NSIS
01 - DeflateNSIS
02 - BZip2NSIS
06 - Crypto
00 -
01 - AES
0x - AES-128
4x - AES-192
8x - AES-256
Cx - AES
x0 - ECB
x1 - CBC
x2 - CFB
x3 - OFB
07 - Reserved
0F - Reserved
F0 - Misc Ciphers (Real Ciphers without hashing algo)
F1 - Misc Ciphers (Combine)
01 - Zip
01 - Main Zip crypto algo
03 - RAR
02 -
03 - Rar29 AES-128 + (modified SHA-1)
07 - 7z
01 - AES-256 + SHA-256
07 - Hash (subject to change)
00 -
01 - CRC
02 - SHA-1
03 - SHA-256
04 - SHA-384
05 - SHA-512
F0 - Misc Hash
F1 - Misc
03 - RAR
03 - Rar29 Password Hashing (modified SHA1)
07 - 7z
01 - SHA-256 Password Hashing
---
End of document

31
lzma/README Normal file
View file

@ -0,0 +1,31 @@
JANUARY 2009
This is an updated LZMA library wrapper provided with SDK 4.63.
The SDK is available here: http://www.7-zip.org/sdk.html.
It is written completely in C and compilation and integration
is much simpler. To enable multithreading support, compile
with COMPRESS_MF_MT and _REENTRANT defined. MF=Match Finder,
MT=Multi Thread. In addition, link in pthread. This is default
behavior in lrzip. For single thread support, remove these
defines in the Makefile.
Some additional documentation is provided from the SDK.
File ./C/7zCrcT8.c is added to support ASM CRC code. Taken
from p7zip.org.
Original README text follows.
This is a zlib like library for the lzma encoder/decoder originally created
by Oleg I. Vdovikin <oleg@cs.msu.su> and modified for lrzip by Con Kolivas
<kernel@kolivas.org>
It is based on a stripped down source tree of the lzma SDK by Igor Pavlov.
http://www.7-zip.org
You can build a standalone library called liblzma.a which gives functions
equivalent to compress2() and uncompress() called lzma_compress() and
lzma_uncompress().
Updated for recent SDK 4.57 and added assembler routines for crc
using p7zip.org variant by Peter Hyman

31
lzma/README-Alloc Normal file
View file

@ -0,0 +1,31 @@
README for Memory Allocation Debugging
If it is necessary or desired to debug the memory allocation
process in LZMA, edit the file C/Alloc.c and uncomment the
line:
/* #define _SZ_ALLOC_DEBUG */
Then, add this to the Makefile and relink. This output will
show chunks of memory Alloc uses during LZMA compression.
Output will appear similar to this:
Alloc 284484 bytes, count = 0, addr = 44251008
Alloc 65536 bytes, count = 1, addr = 80636F0
Alloc 12288 bytes, count = 2, addr = 80736F8
Alloc 12288 bytes, count = 3, addr = 8076700
Alloc 4456448 bytes, count = 4, addr = 43E10008
Alloc 102877690 bytes, count = 5, addr = 3DBF3008
Alloc 604246024 bytes, count = 6, addr = 19BB1008
Free; count = 6, addr = 43E10008
Free; count = 5, addr = 19BB1008
Free; count = 4, addr = 3DBF3008
Free; count = 3, addr = 80736F8
Free; count = 2, addr = 8076700
Free; count = 1, addr = 80636F0
Free; count = 0, addr = 44251008
As you can see, LZMA takes large chunks of ram and sometimes
it can use more than what is available and return an
SZ_ERROR_MEM (2) code.

231
lzma/history.txt Normal file
View file

@ -0,0 +1,231 @@
HISTORY of the LZMA SDK
-----------------------
4.63 2008-12-31
-------------------------
- Some minor fixes
4.61 beta 2008-11-23
-------------------------
- The bug in ANSI-C LZMA Decoder was fixed:
If encoded stream was corrupted, decoder could access memory
outside of allocated range.
- Some changes in ANSI-C 7z Decoder interfaces.
- LZMA SDK is placed in the public domain.
4.60 beta 2008-08-19
-------------------------
- Some minor fixes.
4.59 beta 2008-08-13
-------------------------
- The bug was fixed:
LZMA Encoder in fast compression mode could access memory outside of
allocated range in some rare cases.
4.58 beta 2008-05-05
-------------------------
- ANSI-C LZMA Decoder was rewritten for speed optimizations.
- ANSI-C LZMA Encoder was included to LZMA SDK.
- C++ LZMA code now is just wrapper over ANSI-C code.
4.57 2007-12-12
-------------------------
- Speed optimizations in Ñ++ LZMA Decoder.
- Small changes for more compatibility with some C/C++ compilers.
4.49 beta 2007-07-05
-------------------------
- .7z ANSI-C Decoder:
- now it supports BCJ and BCJ2 filters
- now it supports files larger than 4 GB.
- now it supports "Last Write Time" field for files.
- C++ code for .7z archives compressing/decompressing from 7-zip
was included to LZMA SDK.
4.43 2006-06-04
-------------------------
- Small changes for more compatibility with some C/C++ compilers.
4.42 2006-05-15
-------------------------
- Small changes in .h files in ANSI-C version.
4.39 beta 2006-04-14
-------------------------
- The bug in versions 4.33b:4.38b was fixed:
C++ version of LZMA encoder could not correctly compress
files larger than 2 GB with HC4 match finder (-mfhc4).
4.37 beta 2005-04-06
-------------------------
- Fixes in C++ code: code could no be compiled if _NO_EXCEPTIONS was defined.
4.35 beta 2005-03-02
-------------------------
- The bug was fixed in C++ version of LZMA Decoder:
If encoded stream was corrupted, decoder could access memory
outside of allocated range.
4.34 beta 2006-02-27
-------------------------
- Compressing speed and memory requirements for compressing were increased
- LZMA now can use only these match finders: HC4, BT2, BT3, BT4
4.32 2005-12-09
-------------------------
- Java version of LZMA SDK was included
4.30 2005-11-20
-------------------------
- Compression ratio was improved in -a2 mode
- Speed optimizations for compressing in -a2 mode
- -fb switch now supports values up to 273
- The bug in 7z_C (7zIn.c) was fixed:
It used Alloc/Free functions from different memory pools.
So if program used two memory pools, it worked incorrectly.
- 7z_C: .7z format supporting was improved
- LZMA# SDK (C#.NET version) was included
4.27 (Updated) 2005-09-21
-------------------------
- Some GUIDs/interfaces in C++ were changed.
IStream.h:
ISequentialInStream::Read now works as old ReadPart
ISequentialOutStream::Write now works as old WritePart
4.27 2005-08-07
-------------------------
- The bug in LzmaDecodeSize.c was fixed:
if _LZMA_IN_CB and _LZMA_OUT_READ were defined,
decompressing worked incorrectly.
4.26 2005-08-05
-------------------------
- Fixes in 7z_C code and LzmaTest.c:
previous versions could work incorrectly,
if malloc(0) returns 0
4.23 2005-06-29
-------------------------
- Small fixes in C++ code
4.22 2005-06-10
-------------------------
- Small fixes
4.21 2005-06-08
-------------------------
- Interfaces for ANSI-C LZMA Decoder (LzmaDecode.c) were changed
- New additional version of ANSI-C LZMA Decoder with zlib-like interface:
- LzmaStateDecode.h
- LzmaStateDecode.c
- LzmaStateTest.c
- ANSI-C LZMA Decoder now can decompress files larger than 4 GB
4.17 2005-04-18
-------------------------
- New example for RAM->RAM compressing/decompressing:
LZMA + BCJ (filter for x86 code):
- LzmaRam.h
- LzmaRam.cpp
- LzmaRamDecode.h
- LzmaRamDecode.c
- -f86 switch for lzma.exe
4.16 2005-03-29
-------------------------
- The bug was fixed in LzmaDecode.c (ANSI-C LZMA Decoder):
If _LZMA_OUT_READ was defined, and if encoded stream was corrupted,
decoder could access memory outside of allocated range.
- Speed optimization of ANSI-C LZMA Decoder (now it's about 20% faster).
Old version of LZMA Decoder now is in file LzmaDecodeSize.c.
LzmaDecodeSize.c can provide slightly smaller code than LzmaDecode.c
- Small speed optimization in LZMA C++ code
- filter for SPARC's code was added
- Simplified version of .7z ANSI-C Decoder was included
4.06 2004-09-05
-------------------------
- The bug in v4.05 was fixed:
LZMA-Encoder didn't release output stream in some cases.
4.05 2004-08-25
-------------------------
- Source code of filters for x86, IA-64, ARM, ARM-Thumb
and PowerPC code was included to SDK
- Some internal minor changes
4.04 2004-07-28
-------------------------
- More compatibility with some C++ compilers
4.03 2004-06-18
-------------------------
- "Benchmark" command was added. It measures compressing
and decompressing speed and shows rating values.
Also it checks hardware errors.
4.02 2004-06-10
-------------------------
- C++ LZMA Encoder/Decoder code now is more portable
and it can be compiled by GCC on Linux.
4.01 2004-02-15
-------------------------
- Some detection of data corruption was enabled.
LzmaDecode.c / RangeDecoderReadByte
.....
{
rd->ExtraBytes = 1;
return 0xFF;
}
4.00 2004-02-13
-------------------------
- Original version of LZMA SDK
HISTORY of the LZMA
-------------------
2001-2008: Improvements to LZMA compressing/decompressing code,
keeping compatibility with original LZMA format
1996-2001: Development of LZMA compression format
Some milestones:
2001-08-30: LZMA compression was added to 7-Zip
1999-01-02: First version of 7-Zip was released
End of document

594
lzma/lzma.txt Normal file
View file

@ -0,0 +1,594 @@
LZMA SDK 4.63
-------------
LZMA SDK provides the documentation, samples, header files, libraries,
and tools you need to develop applications that use LZMA compression.
LZMA is default and general compression method of 7z format
in 7-Zip compression program (www.7-zip.org). LZMA provides high
compression ratio and very fast decompression.
LZMA is an improved version of famous LZ77 compression algorithm.
It was improved in way of maximum increasing of compression ratio,
keeping high decompression speed and low memory requirements for
decompressing.
LICENSE
-------
LZMA SDK is written and placed in the public domain by Igor Pavlov.
LZMA SDK Contents
-----------------
LZMA SDK includes:
- ANSI-C/C++/C#/Java source code for LZMA compressing and decompressing
- Compiled file->file LZMA compressing/decompressing program for Windows system
UNIX/Linux version
------------------
To compile C++ version of file->file LZMA encoding, go to directory
C++/7zip/Compress/LZMA_Alone
and call make to recompile it:
make -f makefile.gcc clean all
In some UNIX/Linux versions you must compile LZMA with static libraries.
To compile with static libraries, you can use
LIB = -lm -static
Files
---------------------
lzma.txt - LZMA SDK description (this file)
7zFormat.txt - 7z Format description
7zC.txt - 7z ANSI-C Decoder description
methods.txt - Compression method IDs for .7z
lzma.exe - Compiled file->file LZMA encoder/decoder for Windows
history.txt - history of the LZMA SDK
Source code structure
---------------------
C/ - C files
7zCrc*.* - CRC code
Alloc.* - Memory allocation functions
Bra*.* - Filters for x86, IA-64, ARM, ARM-Thumb, PowerPC and SPARC code
LzFind.* - Match finder for LZ (LZMA) encoders
LzFindMt.* - Match finder for LZ (LZMA) encoders for multithreading encoding
LzHash.h - Additional file for LZ match finder
LzmaDec.* - LZMA decoding
LzmaEnc.* - LZMA encoding
LzmaLib.* - LZMA Library for DLL calling
Types.h - Basic types for another .c files
Threads.* - The code for multithreading.
LzmaLib - LZMA Library (.DLL for Windows)
LzmaUtil - LZMA Utility (file->file LZMA encoder/decoder).
Archive - files related to archiving
7z - 7z ANSI-C Decoder
CPP/ -- CPP files
Common - common files for C++ projects
Windows - common files for Windows related code
7zip - files related to 7-Zip Project
Common - common files for 7-Zip
Compress - files related to compression/decompression
Copy - Copy coder
RangeCoder - Range Coder (special code of compression/decompression)
LZMA - LZMA compression/decompression on C++
LZMA_Alone - file->file LZMA compression/decompression
Branch - Filters for x86, IA-64, ARM, ARM-Thumb, PowerPC and SPARC code
Archive - files related to archiving
Common - common files for archive handling
7z - 7z C++ Encoder/Decoder
Bundles - Modules that are bundles of other modules
Alone7z - 7zr.exe: Standalone version of 7z.exe that supports only 7z/LZMA/BCJ/BCJ2
Format7zR - 7zr.dll: Reduced version of 7za.dll: extracting/compressing to 7z/LZMA/BCJ/BCJ2
Format7zExtractR - 7zxr.dll: Reduced version of 7zxa.dll: extracting from 7z/LZMA/BCJ/BCJ2.
UI - User Interface files
Client7z - Test application for 7za.dll, 7zr.dll, 7zxr.dll
Common - Common UI files
Console - Code for console archiver
CS/ - C# files
7zip
Common - some common files for 7-Zip
Compress - files related to compression/decompression
LZ - files related to LZ (Lempel-Ziv) compression algorithm
LZMA - LZMA compression/decompression
LzmaAlone - file->file LZMA compression/decompression
RangeCoder - Range Coder (special code of compression/decompression)
Java/ - Java files
SevenZip
Compression - files related to compression/decompression
LZ - files related to LZ (Lempel-Ziv) compression algorithm
LZMA - LZMA compression/decompression
RangeCoder - Range Coder (special code of compression/decompression)
C/C++ source code of LZMA SDK is part of 7-Zip project.
7-Zip source code can be downloaded from 7-Zip's SourceForge page:
http://sourceforge.net/projects/sevenzip/
LZMA features
-------------
- Variable dictionary size (up to 1 GB)
- Estimated compressing speed: about 2 MB/s on 2 GHz CPU
- Estimated decompressing speed:
- 20-30 MB/s on 2 GHz Core 2 or AMD Athlon 64
- 1-2 MB/s on 200 MHz ARM, MIPS, PowerPC or other simple RISC
- Small memory requirements for decompressing (16 KB + DictionarySize)
- Small code size for decompressing: 5-8 KB
LZMA decoder uses only integer operations and can be
implemented in any modern 32-bit CPU (or on 16-bit CPU with some conditions).
Some critical operations that affect the speed of LZMA decompression:
1) 32*16 bit integer multiply
2) Misspredicted branches (penalty mostly depends from pipeline length)
3) 32-bit shift and arithmetic operations
The speed of LZMA decompressing mostly depends from CPU speed.
Memory speed has no big meaning. But if your CPU has small data cache,
overall weight of memory speed will slightly increase.
How To Use
----------
Using LZMA encoder/decoder executable
--------------------------------------
Usage: LZMA <e|d> inputFile outputFile [<switches>...]
e: encode file
d: decode file
b: Benchmark. There are two tests: compressing and decompressing
with LZMA method. Benchmark shows rating in MIPS (million
instructions per second). Rating value is calculated from
measured speed and it is normalized with Intel's Core 2 results.
Also Benchmark checks possible hardware errors (RAM
errors in most cases). Benchmark uses these settings:
(-a1, -d21, -fb32, -mfbt4). You can change only -d parameter.
Also you can change the number of iterations. Example for 30 iterations:
LZMA b 30
Default number of iterations is 10.
<Switches>
-a{N}: set compression mode 0 = fast, 1 = normal
default: 1 (normal)
d{N}: Sets Dictionary size - [0, 30], default: 23 (8MB)
The maximum value for dictionary size is 1 GB = 2^30 bytes.
Dictionary size is calculated as DictionarySize = 2^N bytes.
For decompressing file compressed by LZMA method with dictionary
size D = 2^N you need about D bytes of memory (RAM).
-fb{N}: set number of fast bytes - [5, 273], default: 128
Usually big number gives a little bit better compression ratio
and slower compression process.
-lc{N}: set number of literal context bits - [0, 8], default: 3
Sometimes lc=4 gives gain for big files.
-lp{N}: set number of literal pos bits - [0, 4], default: 0
lp switch is intended for periodical data when period is
equal 2^N. For example, for 32-bit (4 bytes)
periodical data you can use lp=2. Often it's better to set lc0,
if you change lp switch.
-pb{N}: set number of pos bits - [0, 4], default: 2
pb switch is intended for periodical data
when period is equal 2^N.
-mf{MF_ID}: set Match Finder. Default: bt4.
Algorithms from hc* group doesn't provide good compression
ratio, but they often works pretty fast in combination with
fast mode (-a0).
Memory requirements depend from dictionary size
(parameter "d" in table below).
MF_ID Memory Description
bt2 d * 9.5 + 4MB Binary Tree with 2 bytes hashing.
bt3 d * 11.5 + 4MB Binary Tree with 3 bytes hashing.
bt4 d * 11.5 + 4MB Binary Tree with 4 bytes hashing.
hc4 d * 7.5 + 4MB Hash Chain with 4 bytes hashing.
-eos: write End Of Stream marker. By default LZMA doesn't write
eos marker, since LZMA decoder knows uncompressed size
stored in .lzma file header.
-si: Read data from stdin (it will write End Of Stream marker).
-so: Write data to stdout
Examples:
1) LZMA e file.bin file.lzma -d16 -lc0
compresses file.bin to file.lzma with 64 KB dictionary (2^16=64K)
and 0 literal context bits. -lc0 allows to reduce memory requirements
for decompression.
2) LZMA e file.bin file.lzma -lc0 -lp2
compresses file.bin to file.lzma with settings suitable
for 32-bit periodical data (for example, ARM or MIPS code).
3) LZMA d file.lzma file.bin
decompresses file.lzma to file.bin.
Compression ratio hints
-----------------------
Recommendations
---------------
To increase the compression ratio for LZMA compressing it's desirable
to have aligned data (if it's possible) and also it's desirable to locate
data in such order, where code is grouped in one place and data is
grouped in other place (it's better than such mixing: code, data, code,
data, ...).
Filters
-------
You can increase the compression ratio for some data types, using
special filters before compressing. For example, it's possible to
increase the compression ratio on 5-10% for code for those CPU ISAs:
x86, IA-64, ARM, ARM-Thumb, PowerPC, SPARC.
You can find C source code of such filters in C/Bra*.* files
You can check the compression ratio gain of these filters with such
7-Zip commands (example for ARM code):
No filter:
7z a a1.7z a.bin -m0=lzma
With filter for little-endian ARM code:
7z a a2.7z a.bin -m0=arm -m1=lzma
It works in such manner:
Compressing = Filter_encoding + LZMA_encoding
Decompressing = LZMA_decoding + Filter_decoding
Compressing and decompressing speed of such filters is very high,
so it will not increase decompressing time too much.
Moreover, it reduces decompression time for LZMA_decoding,
since compression ratio with filtering is higher.
These filters convert CALL (calling procedure) instructions
from relative offsets to absolute addresses, so such data becomes more
compressible.
For some ISAs (for example, for MIPS) it's impossible to get gain from such filter.
LZMA compressed file format
---------------------------
Offset Size Description
0 1 Special LZMA properties (lc,lp, pb in encoded form)
1 4 Dictionary size (little endian)
5 8 Uncompressed size (little endian). -1 means unknown size
13 Compressed data
ANSI-C LZMA Decoder
~~~~~~~~~~~~~~~~~~~
Please note that interfaces for ANSI-C code were changed in LZMA SDK 4.58.
If you want to use old interfaces you can download previous version of LZMA SDK
from sourceforge.net site.
To use ANSI-C LZMA Decoder you need the following files:
1) LzmaDec.h + LzmaDec.c + Types.h
LzmaUtil/LzmaUtil.c is example application that uses these files.
Memory requirements for LZMA decoding
-------------------------------------
Stack usage of LZMA decoding function for local variables is not
larger than 200-400 bytes.
LZMA Decoder uses dictionary buffer and internal state structure.
Internal state structure consumes
state_size = (4 + (1.5 << (lc + lp))) KB
by default (lc=3, lp=0), state_size = 16 KB.
How To decompress data
----------------------
LZMA Decoder (ANSI-C version) now supports 2 interfaces:
1) Single-call Decompressing
2) Multi-call State Decompressing (zlib-like interface)
You must use external allocator:
Example:
void *SzAlloc(void *p, size_t size) { p = p; return malloc(size); }
void SzFree(void *p, void *address) { p = p; free(address); }
ISzAlloc alloc = { SzAlloc, SzFree };
You can use p = p; operator to disable compiler warnings.
Single-call Decompressing
-------------------------
When to use: RAM->RAM decompressing
Compile files: LzmaDec.h + LzmaDec.c + Types.h
Compile defines: no defines
Memory Requirements:
- Input buffer: compressed size
- Output buffer: uncompressed size
- LZMA Internal Structures: state_size (16 KB for default settings)
Interface:
int LzmaDecode(Byte *dest, SizeT *destLen, const Byte *src, SizeT *srcLen,
const Byte *propData, unsigned propSize, ELzmaFinishMode finishMode,
ELzmaStatus *status, ISzAlloc *alloc);
In:
dest - output data
destLen - output data size
src - input data
srcLen - input data size
propData - LZMA properties (5 bytes)
propSize - size of propData buffer (5 bytes)
finishMode - It has meaning only if the decoding reaches output limit (*destLen).
LZMA_FINISH_ANY - Decode just destLen bytes.
LZMA_FINISH_END - Stream must be finished after (*destLen).
You can use LZMA_FINISH_END, when you know that
current output buffer covers last bytes of stream.
alloc - Memory allocator.
Out:
destLen - processed output size
srcLen - processed input size
Output:
SZ_OK
status:
LZMA_STATUS_FINISHED_WITH_MARK
LZMA_STATUS_NOT_FINISHED
LZMA_STATUS_MAYBE_FINISHED_WITHOUT_MARK
SZ_ERROR_DATA - Data error
SZ_ERROR_MEM - Memory allocation error
SZ_ERROR_UNSUPPORTED - Unsupported properties
SZ_ERROR_INPUT_EOF - It needs more bytes in input buffer (src).
If LZMA decoder sees end_marker before reaching output limit, it returns OK result,
and output value of destLen will be less than output buffer size limit.
You can use multiple checks to test data integrity after full decompression:
1) Check Result and "status" variable.
2) Check that output(destLen) = uncompressedSize, if you know real uncompressedSize.
3) Check that output(srcLen) = compressedSize, if you know real compressedSize.
You must use correct finish mode in that case. */
Multi-call State Decompressing (zlib-like interface)
----------------------------------------------------
When to use: file->file decompressing
Compile files: LzmaDec.h + LzmaDec.c + Types.h
Memory Requirements:
- Buffer for input stream: any size (for example, 16 KB)
- Buffer for output stream: any size (for example, 16 KB)
- LZMA Internal Structures: state_size (16 KB for default settings)
- LZMA dictionary (dictionary size is encoded in LZMA properties header)
1) read LZMA properties (5 bytes) and uncompressed size (8 bytes, little-endian) to header:
unsigned char header[LZMA_PROPS_SIZE + 8];
ReadFile(inFile, header, sizeof(header)
2) Allocate CLzmaDec structures (state + dictionary) using LZMA properties
CLzmaDec state;
LzmaDec_Constr(&state);
res = LzmaDec_Allocate(&state, header, LZMA_PROPS_SIZE, &g_Alloc);
if (res != SZ_OK)
return res;
3) Init LzmaDec structure before any new LZMA stream. And call LzmaDec_DecodeToBuf in loop
LzmaDec_Init(&state);
for (;;)
{
...
int res = LzmaDec_DecodeToBuf(CLzmaDec *p, Byte *dest, SizeT *destLen,
const Byte *src, SizeT *srcLen, ELzmaFinishMode finishMode);
...
}
4) Free all allocated structures
LzmaDec_Free(&state, &g_Alloc);
For full code example, look at C/LzmaUtil/LzmaUtil.c code.
How To compress data
--------------------
Compile files: LzmaEnc.h + LzmaEnc.c + Types.h +
LzFind.c + LzFind.h + LzFindMt.c + LzFindMt.h + LzHash.h
Memory Requirements:
- (dictSize * 11.5 + 6 MB) + state_size
Lzma Encoder can use two memory allocators:
1) alloc - for small arrays.
2) allocBig - for big arrays.
For example, you can use Large RAM Pages (2 MB) in allocBig allocator for
better compression speed. Note that Windows has bad implementation for
Large RAM Pages.
It's OK to use same allocator for alloc and allocBig.
Single-call Compression with callbacks
--------------------------------------
Check C/LzmaUtil/LzmaUtil.c as example,
When to use: file->file decompressing
1) you must implement callback structures for interfaces:
ISeqInStream
ISeqOutStream
ICompressProgress
ISzAlloc
static void *SzAlloc(void *p, size_t size) { p = p; return MyAlloc(size); }
static void SzFree(void *p, void *address) { p = p; MyFree(address); }
static ISzAlloc g_Alloc = { SzAlloc, SzFree };
CFileSeqInStream inStream;
CFileSeqOutStream outStream;
inStream.funcTable.Read = MyRead;
inStream.file = inFile;
outStream.funcTable.Write = MyWrite;
outStream.file = outFile;
2) Create CLzmaEncHandle object;
CLzmaEncHandle enc;
enc = LzmaEnc_Create(&g_Alloc);
if (enc == 0)
return SZ_ERROR_MEM;
3) initialize CLzmaEncProps properties;
LzmaEncProps_Init(&props);
Then you can change some properties in that structure.
4) Send LZMA properties to LZMA Encoder
res = LzmaEnc_SetProps(enc, &props);
5) Write encoded properties to header
Byte header[LZMA_PROPS_SIZE + 8];
size_t headerSize = LZMA_PROPS_SIZE;
UInt64 fileSize;
int i;
res = LzmaEnc_WriteProperties(enc, header, &headerSize);
fileSize = MyGetFileLength(inFile);
for (i = 0; i < 8; i++)
header[headerSize++] = (Byte)(fileSize >> (8 * i));
MyWriteFileAndCheck(outFile, header, headerSize)
6) Call encoding function:
res = LzmaEnc_Encode(enc, &outStream.funcTable, &inStream.funcTable,
NULL, &g_Alloc, &g_Alloc);
7) Destroy LZMA Encoder Object
LzmaEnc_Destroy(enc, &g_Alloc, &g_Alloc);
If callback function return some error code, LzmaEnc_Encode also returns that code.
Single-call RAM->RAM Compression
--------------------------------
Single-call RAM->RAM Compression is similar to Compression with callbacks,
but you provide pointers to buffers instead of pointers to stream callbacks:
HRes LzmaEncode(Byte *dest, SizeT *destLen, const Byte *src, SizeT srcLen,
CLzmaEncProps *props, Byte *propsEncoded, SizeT *propsSize, int writeEndMark,
ICompressProgress *progress, ISzAlloc *alloc, ISzAlloc *allocBig);
Return code:
SZ_OK - OK
SZ_ERROR_MEM - Memory allocation error
SZ_ERROR_PARAM - Incorrect paramater
SZ_ERROR_OUTPUT_EOF - output buffer overflow
SZ_ERROR_THREAD - errors in multithreading functions (only for Mt version)
LZMA Defines
------------
_LZMA_SIZE_OPT - Enable some optimizations in LZMA Decoder to get smaller executable code.
_LZMA_PROB32 - It can increase the speed on some 32-bit CPUs, but memory usage for
some structures will be doubled in that case.
_LZMA_UINT32_IS_ULONG - Define it if int is 16-bit on your compiler and long is 32-bit.
_LZMA_NO_SYSTEM_SIZE_T - Define it if you don't want to use size_t type.
C++ LZMA Encoder/Decoder
~~~~~~~~~~~~~~~~~~~~~~~~
C++ LZMA code use COM-like interfaces. So if you want to use it,
you can study basics of COM/OLE.
C++ LZMA code is just wrapper over ANSI-C code.
C++ Notes
~~~~~~~~~~~~~~~~~~~~~~~~
If you use some C++ code folders in 7-Zip (for example, C++ code for .7z handling),
you must check that you correctly work with "new" operator.
7-Zip can be compiled with MSVC 6.0 that doesn't throw "exception" from "new" operator.
So 7-Zip uses "CPP\Common\NewHandler.cpp" that redefines "new" operator:
operator new(size_t size)
{
void *p = ::malloc(size);
if (p == 0)
throw CNewException();
return p;
}
If you use MSCV that throws exception for "new" operator, you can compile without
"NewHandler.cpp". So standard exception will be used. Actually some code of
7-Zip catches any exception in internal code and converts it to HRESULT code.
So you don't need to catch CNewException, if you call COM interfaces of 7-Zip.
---
http://www.7-zip.org
http://www.7-zip.org/sdk.html
http://www.7-zip.org/support.html

872
main.c Normal file
View file

@ -0,0 +1,872 @@
/*
Copyright (C) Andrew Tridgell 1998-2003,
Con Kolivas 2006-2009
This program is free software; you can redistribute it and/or modify
it under the terms of the GNU General Public License as published by
the Free Software Foundation; either version 2 of the License, or
(at your option) any later version.
This program is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
GNU General Public License for more details.
You should have received a copy of the GNU General Public License
along with this program; if not, write to the Free Software
Foundation, Inc., 675 Mass Ave, Cambridge, MA 02139, USA.
*/
/* lrzip compression - main program */
#include "rzip.h"
struct rzip_control control;
static void usage(void)
{
fprintf(stderr, "lrzip version %d.%d%d\n", LRZIP_MAJOR_VERSION, LRZIP_MINOR_VERSION, LRZIP_MINOR_SUBVERSION);
fprintf(stderr, "Copyright (C) Con Kolivas 2006-2010\n\n");
fprintf(stderr, "Based on rzip ");
fprintf(stderr, "Copyright (C) Andrew Tridgell 1998-2003\n");
fprintf(stderr, "usage: lrzip [options] <file...>\n");
fprintf(stderr, " Options:\n");
fprintf(stderr, " -w size compression window in hundreds of MB\n");
fprintf(stderr, " default chosen by heuristic dependent on ram and chosen compression\n");
fprintf(stderr, " -d decompress\n");
fprintf(stderr, " -o filename specify the output file name and/or path\n");
fprintf(stderr, " -O directory specify the output directory when -o is not used\n");
fprintf(stderr, " -S suffix specify compressed suffix (default '.lrz')\n");
fprintf(stderr, " -f force overwrite of any existing files\n");
fprintf(stderr, " -D delete existing files\n");
fprintf(stderr, " -P don't set permissions on output file - may leave it world-readable\n");
fprintf(stderr, " -q don't show compression progress\n");
fprintf(stderr, " -L level set lzma/bzip2/gzip compression level (1-9, default 7)\n");
fprintf(stderr, " -n no backend compression - prepare for other compressor\n");
fprintf(stderr, " -l lzo compression (ultra fast)\n");
fprintf(stderr, " -b bzip2 compression\n");
fprintf(stderr, " -g gzip compression using zlib\n");
fprintf(stderr, " -z zpaq compression (best, extreme compression, extremely slow)\n");
fprintf(stderr, " -M Maximum window and level - (all available ram and level 9)\n");
fprintf(stderr, " -T value Compression threshold with LZO test. (0 (nil) - 10 (high), default 1)\n");
fprintf(stderr, " -N value Set nice value to value (default 19)\n");
fprintf(stderr, " -v[v] Increase verbosity\n");
fprintf(stderr, " -V show version\n");
fprintf(stderr, " -t test compressed file integrity\n");
fprintf(stderr, " -i show compressed file information\n");
fprintf(stderr, "\nIf no filenames or \"-\" is specified, stdin/out will be used.\n");
}
static void write_magic(int fd_in, int fd_out)
{
struct stat st;
char magic[24];
int i;
memset(magic, 0, sizeof(magic));
strcpy(magic, "LRZI");
magic[4] = LRZIP_MAJOR_VERSION;
magic[5] = LRZIP_MINOR_VERSION;
if (fstat(fd_in, &st) != 0)
fatal("bad magic file descriptor!?\n");
memcpy(&magic[6], &st.st_size, 8);
/* save LZMA compression flags */
if (LZMA_COMPRESS(control.flags)) {
for (i = 0; i < 5; i++)
magic[i + 16] = (char)control.lzma_properties[i];
}
if (lseek(fd_out, 0, SEEK_SET) != 0)
fatal("Failed to seek to BOF to write Magic Header\n");
if (write(fd_out, magic, sizeof(magic)) != sizeof(magic))
fatal("Failed to write magic header\n");
}
static void read_magic(int fd_in, i64 *expected_size)
{
uint32_t v;
char magic[24];
int i;
if (read(fd_in, magic, sizeof(magic)) != sizeof(magic))
fatal("Failed to read magic header\n");
*expected_size = 0;
if (strncmp(magic, "LRZI", 4) != 0) {
fatal("Not an lrzip file\n");
}
memcpy(&control.major_version, &magic[4], 1);
memcpy(&control.minor_version, &magic[5], 1);
/* Support the convoluted way we described size in versions < 0.40 */
if (control.major_version == 0 && control.minor_version < 4) {
memcpy(&v, &magic[6], 4);
*expected_size = ntohl(v);
memcpy(&v, &magic[10], 4);
*expected_size |= ((i64)ntohl(v)) << 32;
} else
memcpy(expected_size, &magic[6], 8);
/* restore LZMA compression flags only if stored */
if ((int) magic[16]) {
for (i = 0; i < 5; i++)
control.lzma_properties[i] = magic[i + 16];
}
if (control.flags & FLAG_VERBOSE) {
fprintf(stderr, "Detected lrzip version %d.%d file.\n", control.major_version, control.minor_version);
fflush(stderr);
}
if (control.major_version > LRZIP_MAJOR_VERSION ||
(control.major_version == LRZIP_MAJOR_VERSION && control.minor_version > LRZIP_MINOR_VERSION)) {
fprintf(stderr, "Attempting to work with file produced by newer lrzip version %d.%d file.\n", control.major_version, control.minor_version);
fflush(stderr);
}
}
/* preserve ownership and permissions where possible */
static void preserve_perms(int fd_in, int fd_out)
{
struct stat st;
if (fstat(fd_in, &st) != 0)
fatal("Failed to fstat input file\n");
if (fchmod(fd_out, (st.st_mode & 0777)) != 0)
fatal("Failed to set permissions on %s\n", control.outfile);
/* chown fail is not fatal */
fchown(fd_out, st.st_uid, st.st_gid);
}
/* Open a temporary outputfile for testing decompression or stdout */
static int open_tmpoutfile(void)
{
int fd_out;
if ((control.flags & FLAG_STDOUT) && (control.flags & FLAG_VERBOSE))
fprintf(stderr, "Outputting to stdout.\n");
control.outfile = realloc(NULL, 16);
strcpy(control.outfile, "lrzipout.XXXXXX");
if (!control.outfile)
fatal("Failed to allocate outfile name\n");
fd_out = mkstemp(control.outfile);
if (!fd_out)
fatal("Failed to create out tmpfile: %s\n", strerror(errno));
return fd_out;
}
/* Dump temporary outputfile to perform stdout */
static void dump_tmpoutfile(int fd_out)
{
FILE *tmpoutfp;
int tmpchar;
if (control.flags & FLAG_SHOW_PROGRESS)
fprintf(stderr, "Dumping to stdout.\n");
/* flush anything not yet in the temporary file */
fflush(NULL);
tmpoutfp = fdopen(fd_out, "r");
if (tmpoutfp == NULL)
fatal("Failed to fdopen out tmpfile: %s\n", strerror(errno));
rewind(tmpoutfp);
while ((tmpchar = fgetc(tmpoutfp)) != EOF)
putchar(tmpchar);
fflush(stdout);
}
/* Open a temporary inputfile to perform stdin */
static int open_tmpinfile(void)
{
int fd_in;
control.infile = malloc(15);
strcpy(control.infile, "lrzipin.XXXXXX");
if (!control.infile)
fatal("Failed to allocate infile name\n");
fd_in = mkstemp(control.infile);
if (!fd_in)
fatal("Failed to create in tmpfile: %s\n", strerror(errno));
return fd_in;
}
/* Read data from stdin into temporary inputfile */
static void read_tmpinfile(int fd_in)
{
FILE *tmpinfp;
int tmpchar;
if (control.flags & FLAG_SHOW_PROGRESS)
fprintf(stderr, "Copying from stdin.\n");
tmpinfp = fdopen(fd_in, "w+");
if (tmpinfp == NULL)
fatal("Failed to fdopen in tmpfile: %s\n", strerror(errno));
while ((tmpchar = getchar()) != EOF)
fputc(tmpchar, tmpinfp);
fflush(tmpinfp);
rewind(tmpinfp);
}
/*
decompress one file from the command line
*/
static void decompress_file(void)
{
int fd_in, fd_out = -1, fd_hist = -1;
i64 expected_size;
char *tmp, *tmpoutfile, *infilecopy = NULL;
if (!(control.flags & FLAG_STDIN)) {
if ((tmp = strrchr(control.infile, '.')) && strcmp(tmp,control.suffix)) {
/* make sure infile has an extension. If not, add it
* because manipulations may be made to input filename, set local ptr
*/
infilecopy = malloc(strlen(control.infile) + strlen(control.suffix) + 1);
if (infilecopy == NULL)
fatal("Failed to allocate memory for infile suffix\n");
else {
strcpy(infilecopy, control.infile);
strcat(infilecopy, control.suffix);
}
} else
infilecopy = strdup(control.infile);
/* regardless, infilecopy has the input filename */
}
if (!(control.flags & FLAG_STDOUT) && !(control.flags & FLAG_TEST_ONLY)) {
/* if output name already set, use it */
if (control.outname) {
control.outfile = strdup(control.outname);
} else {
/* default output name from infilecopy
* test if outdir specified. If so, strip path from filename of
* infilecopy, then remove suffix.
*/
if (control.outdir && (tmp = strrchr(infilecopy, '/')))
tmpoutfile = strdup(tmp + 1);
else
tmpoutfile = strdup(infilecopy);
/* remove suffix to make outfile name */
if ((tmp = strrchr(tmpoutfile, '.')) && !strcmp(tmp, control.suffix))
*tmp='\0';
control.outfile = malloc((control.outdir == NULL? 0: strlen(control.outdir)) + strlen(tmpoutfile) + 1);
if (!control.outfile)
fatal("Failed to allocate outfile name\n");
if (control.outdir) { /* prepend control.outdir */
strcpy(control.outfile, control.outdir);
strcat(control.outfile, tmpoutfile);
} else
strcpy(control.outfile, tmpoutfile);
free(tmpoutfile);
}
if ((control.flags & FLAG_SHOW_PROGRESS) && !(control.flags & FLAG_STDOUT)) {
fprintf(stderr, "Output filename is: %s...Decompressing...\n", control.outfile);
fflush(stderr);
}
}
if (control.flags & FLAG_STDIN) {
fd_in = open_tmpinfile();
read_tmpinfile(fd_in);
} else {
fd_in = open(infilecopy, O_RDONLY);
if (fd_in == -1) {
fatal("Failed to open %s: %s\n",
infilecopy,
strerror(errno));
}
}
if ((control.flags & (FLAG_TEST_ONLY | FLAG_STDOUT)) == 0) {
if (control.flags & FLAG_FORCE_REPLACE)
fd_out = open(control.outfile, O_WRONLY | O_CREAT | O_TRUNC, 0666);
else
fd_out = open(control.outfile, O_WRONLY | O_CREAT | O_EXCL, 0666);
if (fd_out == -1)
fatal("Failed to create %s: %s\n", control.outfile, strerror(errno));
if (!(control.flags & FLAG_NO_SET_PERMS))
preserve_perms(fd_in, fd_out);
} else
fd_out = open_tmpoutfile();
fd_hist = open(control.outfile, O_RDONLY);
if (fd_hist == -1)
fatal("Failed to open history file %s\n", control.outfile);
read_magic(fd_in, &expected_size);
if (control.flags & FLAG_SHOW_PROGRESS) {
fprintf(stderr, "Decompressing...");
fflush(stderr);
}
runzip_fd(fd_in, fd_out, fd_hist, expected_size);
if (control.flags & FLAG_STDOUT)
dump_tmpoutfile(fd_out);
/* if we get here, no fatal errors during decompression */
fprintf(stderr, "\r");
if (!(control.flags & (FLAG_STDOUT | FLAG_TEST_ONLY)))
fprintf(stderr, "Output filename is: %s: ", control.outfile);
fprintf(stderr, "[OK] - %lld bytes \n", (long long)expected_size);
fflush(stderr);
if (close(fd_hist) != 0 || close(fd_out) != 0)
fatal("Failed to close files\n");
if (control.flags & (FLAG_TEST_ONLY | FLAG_STDOUT)) {
/* Delete temporary files generated for testing or faking stdout */
if (unlink(control.outfile) != 0)
fatal("Failed to unlink tmpfile: %s\n", strerror(errno));
}
close(fd_in);
if (((control.flags & (FLAG_KEEP_FILES | FLAG_TEST_ONLY)) == 0) || (control.flags & FLAG_STDIN)) {
if (unlink(control.infile) != 0)
fatal("Failed to unlink %s: %s\n", infilecopy, strerror(errno));
}
free(control.outfile);
free(infilecopy);
}
static void get_fileinfo(void)
{
struct stat st;
int fd_in, ctype = 0;
i64 expected_size;
i64 infile_size;
int seekspot;
long double cratio;
char *tmp, *infilecopy = NULL;
if (!(control.flags & FLAG_STDIN)) {
if ((tmp = strrchr(control.infile, '.')) && strcmp(tmp,control.suffix)) {
infilecopy = malloc(strlen(control.infile) + strlen(control.suffix) + 1);
if (infilecopy == NULL)
fatal("Failed to allocate memory for infile suffix\n");
else {
strcpy(infilecopy, control.infile);
strcat(infilecopy, control.suffix);
}
} else
infilecopy = strdup(control.infile);
}
if (control.flags & FLAG_STDIN) {
fd_in = open_tmpinfile();
read_tmpinfile(fd_in);
} else {
fd_in = open(infilecopy, O_RDONLY);
if (fd_in == -1)
fatal("Failed to open %s: %s\n", infilecopy, strerror(errno));
}
/* Get file size */
if (fstat(fd_in, &st) != 0)
fatal("bad magic file descriptor!?\n");
memcpy(&infile_size, &st.st_size, 8);
/* Get decompressed size */
read_magic(fd_in, &expected_size);
/* Version < 0.4 had different file format */
if (control.major_version == 0 && control.minor_version < 4)
seekspot = 50;
else
seekspot = 74;
if (lseek(fd_in, seekspot, SEEK_SET) == -1)
fatal("Failed to lseek in get_fileinfo: %s\n", strerror(errno));
/* Read the compression type of the first block. It's possible that
not all blocks are compressed so this may not be accurate.
*/
read(fd_in, &ctype, 1);
cratio = (long double)expected_size / (long double)infile_size;
fprintf(stderr, "%s:\nlrzip version: %d.%d file\n", infilecopy, control.major_version, control.minor_version);
fprintf(stderr, "Compression: ");
if (ctype == CTYPE_NONE)
fprintf(stderr, "rzip alone\n");
else if (ctype == CTYPE_BZIP2)
fprintf(stderr, "rzip + bzip2\n");
else if (ctype == CTYPE_LZO)
fprintf(stderr, "rzip + lzo\n");
else if (ctype == CTYPE_LZMA)
fprintf(stderr, "rzip + lzma\n");
else if (ctype == CTYPE_GZIP)
fprintf(stderr, "rzip + gzip\n");
else if (ctype == CTYPE_ZPAQ)
fprintf(stderr, "rzip + zpaq\n");
fprintf(stderr, "Decompressed file size: %llu\n", expected_size);
fprintf(stderr, "Compressed file size: %llu\n", infile_size);
fprintf(stderr, "Compression ratio: %.3Lf\n", cratio);
if (control.flags & FLAG_STDIN) {
if (unlink(control.infile) != 0)
fatal("Failed to unlink %s: %s\n", infilecopy, strerror(errno));
}
free(control.outfile);
free(infilecopy);
}
/*
compress one file from the command line
*/
static void compress_file(void)
{
int fd_in, fd_out;
const char *tmp, *tmpinfile; /* we're just using this as a proxy for control.infile.
* Spares a compiler warning
*/
char header[24];
memset(header, 0, sizeof(header));
if (!(control.flags & FLAG_STDIN)) {
/* is extension at end of infile? */
if ((tmp = strrchr(control.infile, '.')) && !strcmp(tmp, control.suffix)) {
fprintf(stderr, "%s: already has %s suffix. Skipping...\n", control.infile, control.suffix);
return;
}
fd_in = open(control.infile, O_RDONLY);
if (fd_in == -1)
fatal("Failed to open %s: %s\n", control.infile, strerror(errno));
} else {
fd_in = open_tmpinfile();
read_tmpinfile(fd_in);
}
if (!(control.flags & FLAG_STDOUT)) {
if (control.outname) {
/* check if outname has control.suffix */
if (*(control.suffix) == '\0') /* suffix is empty string */
control.outfile = strdup(control.outname);
else if ((tmp=strrchr(control.outname, '.')) && strcmp(tmp, control.suffix)) {
control.outfile = malloc(strlen(control.outname) + strlen(control.suffix) + 1);
if (!control.outfile)
fatal("Failed to allocate outfile name\n");
strcpy(control.outfile, control.outname);
strcat(control.outfile, control.suffix);
fprintf(stderr, "Suffix added to %s.\nFull pathname is: %s\n", control.outname, control.outfile);
} else /* no, already has suffix */
control.outfile = strdup(control.outname);
} else {
/* default output name from control.infile
* test if outdir specified. If so, strip path from filename of
* control.infile
*/
if (control.outdir && (tmp = strrchr(control.infile, '/')))
tmpinfile = tmp + 1;
else
tmpinfile = control.infile;
control.outfile = malloc((control.outdir == NULL? 0: strlen(control.outdir)) + strlen(tmpinfile) + strlen(control.suffix) + 1);
if (!control.outfile)
fatal("Failed to allocate outfile name\n");
if (control.outdir) { /* prepend control.outdir */
strcpy(control.outfile, control.outdir);
strcat(control.outfile, tmpinfile);
} else
strcpy(control.outfile, tmpinfile);
strcat(control.outfile, control.suffix);
if ( control.flags & FLAG_SHOW_PROGRESS )
fprintf(stderr, "Output filename is: %s\n", control.outfile);
}
if (control.flags & FLAG_FORCE_REPLACE)
fd_out = open(control.outfile, O_WRONLY | O_CREAT | O_TRUNC, 0666);
else
fd_out = open(control.outfile, O_WRONLY | O_CREAT | O_EXCL, 0666);
if (fd_out == -1)
fatal("Failed to create %s: %s\n", control.outfile, strerror(errno));
} else
fd_out = open_tmpoutfile();
if (!(control.flags & FLAG_NO_SET_PERMS))
preserve_perms(fd_in, fd_out);
/* write zeroes to 24 bytes at beginning of file */
if (write(fd_out, header, sizeof(header)) != sizeof(header))
fatal("Cannot write file header\n");
rzip_fd(fd_in, fd_out);
/* write magic at end b/c lzma does not tell us properties until it is done */
write_magic(fd_in, fd_out);
if (control.flags & FLAG_STDOUT)
dump_tmpoutfile(fd_out);
if (close(fd_in) != 0 || close(fd_out) != 0)
fatal("Failed to close files\n");
if (control.flags & FLAG_STDOUT) {
/* Delete temporary files generated for testing or faking stdout */
if (unlink(control.outfile) != 0)
fatal("Failed to unlink tmpfile: %s\n", strerror(errno));
}
if ((control.flags & FLAG_KEEP_FILES) == 0 || (control.flags & FLAG_STDIN)) {
if (unlink(control.infile) != 0)
fatal("Failed to unlink %s: %s\n", control.infile, strerror(errno));
}
free(control.outfile);
}
/*
* Returns ram size on linux/darwin.
*/
#ifdef __APPLE__
static i64 get_ram(void)
{
int mib[2];
size_t len;
i64 *p, ramsize;
mib[0] = CTL_HW;
mib[1] = HW_MEMSIZE;
sysctl(mib, 2, NULL, &len, NULL, 0);
p = malloc(len);
sysctl(mib, 2, p, &len, NULL, 0);
ramsize = *p / 1024; // bytes -> KB
/* Darwin can't overcommit as much as linux so we return half the ram
size to fudge it to use smaller windows */
ramsize /= 2;
return ramsize;
}
#else
static i64 get_ram(void)
{
return (i64)sysconf(_SC_PHYS_PAGES) * (i64)sysconf(_SC_PAGE_SIZE) / 1024;
}
#endif
int main(int argc, char *argv[])
{
extern int optind;
int c, i, maxwin = 0;
struct timeval start_time, end_time;
struct sigaction handler;
int hours,minutes;
double seconds,total_time; // for timers
char *eptr; /* for environment */
memset(&control, 0, sizeof(control));
control.flags = FLAG_SHOW_PROGRESS | FLAG_KEEP_FILES;
control.suffix = ".lrz";
control.outdir = NULL;
if (strstr(argv[0], "lrunzip")) {
control.flags |= FLAG_DECOMPRESS;
}
control.compression_level = 7;
control.ramsize = get_ram() / 104858ull; /* hundreds of megabytes */
control.window = 0;
control.threshold = 1.0; /* default lzo test compression threshold (level 1) with LZMA compression */
/* for testing single CPU */
#ifndef NOTHREAD
control.threads = sysconf(_SC_NPROCESSORS_ONLN); /* get CPUs for LZMA */
#else
control.threads = 1;
#endif
control.nice_val = 19;
/* generate crc table */
CrcGenerateTable();
/* Get Preloaded Defaults from lrzip.conf
* Look in ., $HOME/.lrzip/, /etc/lrzip.
* If LRZIP=NOCONFIG is set, then ignore config
*/
eptr = getenv("LRZIP");
if (eptr == NULL)
read_config(&control);
else if (!strstr(eptr,"NOCONFIG"))
read_config(&control);
while ((c = getopt(argc, argv, "L:hdS:tVvDfqo:w:nlbMO:T:N:gPzi")) != -1) {
switch (c) {
case 'L':
control.compression_level = atoi(optarg);
if (control.compression_level < 1 || control.compression_level > 9)
fatal("Invalid compression level (must be 1-9)\n");
break;
case 'w':
control.window = atol(optarg);
break;
case 'd':
control.flags |= FLAG_DECOMPRESS;
break;
case 'S':
control.suffix = optarg;
break;
case 'o':
if (control.outdir)
fatal("Cannot have -o and -O together\n");
control.outname = optarg;
break;
case 'f':
control.flags |= FLAG_FORCE_REPLACE;
break;
case 'D':
control.flags &= ~FLAG_KEEP_FILES;
break;
case 't':
if (control.outname)
fatal("Cannot specify an output file name when just testing.\n");
if (!(control.flags & FLAG_KEEP_FILES))
fatal("Doubt that you want to delete a file when just testing.\n");
control.flags |= FLAG_TEST_ONLY;
break;
case 'v':
/* set verbosity flag */
if (!(control.flags & FLAG_VERBOSITY) && !(control.flags & FLAG_VERBOSITY_MAX))
control.flags |= FLAG_VERBOSITY;
else if ((control.flags & FLAG_VERBOSITY)) {
control.flags &= ~FLAG_VERBOSITY;
control.flags |= FLAG_VERBOSITY_MAX;
}
break;
case 'q':
control.flags &= ~FLAG_SHOW_PROGRESS;
break;
case 'V':
fprintf(stderr, "lrzip version %d.%d%d\n",
LRZIP_MAJOR_VERSION, LRZIP_MINOR_VERSION, LRZIP_MINOR_SUBVERSION);
exit(0);
break;
case 'l':
if (control.flags & FLAG_NOT_LZMA)
fatal("Can only use one of -l, -b, -g, -z or -n\n");
control.flags |= FLAG_LZO_COMPRESS;
break;
case 'b':
if (control.flags & FLAG_NOT_LZMA)
fatal("Can only use one of -l, -b, -g, -z or -n\n");
control.flags |= FLAG_BZIP2_COMPRESS;
break;
case 'n':
if (control.flags & FLAG_NOT_LZMA)
fatal("Can only use one of -l, -b, -g, -z or -n\n");
control.flags |= FLAG_NO_COMPRESS;
break;
case 'M':
control.compression_level = 9;
maxwin = 1;
break;
case 'O':
if (control.outname) /* can't mix -o and -O */
fatal("Cannot have options -o and -O together\n");
control.outdir = malloc(strlen(optarg) + 2);
if (control.outdir == NULL)
fatal("Failed to allocate for outdir\n");
strcpy(control.outdir,optarg);
if (strcmp(optarg+strlen(optarg) - 1, "/")) /* need a trailing slash */
strcat(control.outdir, "/");
break;
case 'T':
/* invert argument, a threshold of 1 means that the compressed result can be
* 90%-100% of the sample size
*/
control.threshold = atoi(optarg);
if (control.threshold < 0 || control.threshold > 10)
fatal("Threshold value must be between 0 and 10\n");
control.threshold = 1.05 - control.threshold / 20;
break;
case 'N':
control.nice_val = atoi(optarg);
if (control.nice_val < -20 || control.nice_val > 19)
fatal("Invalid nice value (must be -20..19)\n");
break;
case 'g':
if (control.flags & FLAG_NOT_LZMA)
fatal("Can only use one of -l, -b, -g, -z or -n\n");
control.flags |= FLAG_ZLIB_COMPRESS;
break;
case 'P':
control.flags |= FLAG_NO_SET_PERMS;
break;
case 'z':
if (control.flags & FLAG_NOT_LZMA)
fatal("Can only use one of -l, -b, -g, -z or -n\n");
control.flags |= FLAG_ZPAQ_COMPRESS;
break;
case 'i':
control.flags |= FLAG_INFO;
break;
case 'h':
usage();
return -1;
}
}
argc -= optind;
argv += optind;
if (control.outname && argc > 1)
fatal("Cannot specify output filename with more than 1 file\n");
if ((control.flags & FLAG_VERBOSE) && !(control.flags & FLAG_SHOW_PROGRESS)) {
fprintf(stderr, "Cannot have -v and -q options. -v wins.\n");
control.flags |= FLAG_SHOW_PROGRESS;
}
if (argc < 1)
control.flags |= FLAG_STDIN;
if (control.window > control.ramsize)
fprintf(stderr, "Compression window has been set to larger than ramsize, proceeding at your request. If you did not mean this, abort now.\n");
if (sizeof(long) == 4 && control.ramsize > 9) {
/* On 32 bit, the default high/lowmem split of 896MB lowmem
means we will be unable to allocate contiguous blocks
over 900MB between 900 and 1800MB. It will be less prone
to failure if we limit the block size.
*/
if (control.ramsize < 18)
control.ramsize = 9;
else
control.ramsize -= 9;
}
/* The control window chosen is the largest that will not cause
massive swapping on the machine (60% of ram). Most of the pages
will be shared by lzma even though it uses just as much ram itself
*/
if (!control.window) {
if (maxwin)
control.window = (control.ramsize * 9 / 10);
else
control.window = (control.ramsize * 2 / 3);
if (!control.window)
control.window = 1;
}
/* malloc limited to 2GB on 32bit */
if (sizeof(long) == 4 && control.window > 20) {
control.window = 20;
if (control.flags & FLAG_VERBOSE)
fprintf(stderr, "Limiting control window to 2GB due to 32bit limitations.\n");
}
/* OK, if verbosity set, print summary of options selected */
if ((control.flags & FLAG_VERBOSE) && !(control.flags & FLAG_INFO)) {
fprintf(stderr, "The following options are in effect for this %s.\n",
control.flags & FLAG_DECOMPRESS ? "DECOMPRESSION" : "COMPRESSION");
if (LZMA_COMPRESS(control.flags))
fprintf(stderr, "Threading is %s. Number of CPUs detected: %lu\n", control.threads > 1? "ENABLED" : "DISABLED",
control.threads);
fprintf(stderr, "Nice Value: %d\n", control.nice_val);
if (control.flags & FLAG_SHOW_PROGRESS)
fprintf(stderr, "Show Progress\n");
if (control.flags & FLAG_VERBOSITY)
fprintf(stderr, "Verbose\n");
else if (control.flags & FLAG_VERBOSITY_MAX)
fprintf(stderr, "Max Verbosity\n");
if (control.flags & FLAG_FORCE_REPLACE)
fprintf(stderr, "Overwrite Files\n");
if (!(control.flags & FLAG_KEEP_FILES))
fprintf(stderr, "Remove input files on completion\n");
if (control.outdir)
fprintf(stderr, "Output Directory Specified: %s\n", control.outdir);
else if (control.outname)
fprintf(stderr, "Output Filename Specified: %s\n", control.outname);
if (control.flags & FLAG_TEST_ONLY)
fprintf(stderr, "Test file integrity\n");
/* show compression options */
if (!(control.flags & FLAG_DECOMPRESS)) {
fprintf(stderr, "Compression mode is: ");
if (LZMA_COMPRESS(control.flags))
fprintf(stderr, "LZMA. LZO Test Compression Threshold: %.f\n",
(control.threshold < 1.05 ? 21 - control.threshold * 20 : 0));
else if (control.flags & FLAG_LZO_COMPRESS)
fprintf(stderr, "LZO\n");
else if (control.flags & FLAG_BZIP2_COMPRESS)
fprintf(stderr, "BZIP2. LZO Test Compression Threshold: %.f\n",
(control.threshold < 1.05 ? 21 - control.threshold * 20 : 0));
else if (control.flags & FLAG_ZLIB_COMPRESS)
fprintf(stderr, "GZIP\n");
else if (control.flags & FLAG_ZPAQ_COMPRESS)
fprintf(stderr, "ZPAQ. LZO Test Compression Threshold: %.f\n",
(control.threshold < 1.05 ? 21 - control.threshold * 20 : 0));
else if (control.flags & FLAG_NO_COMPRESS)
fprintf(stderr, "RZIP\n");
fprintf(stderr, "Compression Window: %lld = %lldMB\n", control.window, control.window * 100ull);
fprintf(stderr, "Compression Level: %d\n", control.compression_level);
}
fprintf(stderr, "\n");
}
if (setpriority(PRIO_PROCESS, 0, control.nice_val) == -1)
fatal("Unable to set nice value\n");
/* One extra iteration for the case of no parameters means we will default to stdin/out */
for (i = 0; i <= argc; i++) {
if (i < argc)
control.infile = argv[i];
else if (!(i == 0 && (control.flags & FLAG_STDIN)))
break;
if (control.infile && (strcmp(control.infile, "-") == 0))
control.flags |= FLAG_STDIN;
if (control.outname && (strcmp(control.outname, "-") == 0))
control.flags |= FLAG_STDOUT;
/* If we're using stdin and no output filename, use stdout */
if ((control.flags & FLAG_STDIN) && !control.outname)
control.flags |= FLAG_STDOUT;
/* Implement signal handler only once flags are set */
handler.sa_handler = &sighandler;
sigaction(SIGTERM, &handler, 0);
sigaction(SIGINT, &handler, 0);
gettimeofday(&start_time, NULL);
if (control.flags & (FLAG_DECOMPRESS | FLAG_TEST_ONLY))
decompress_file();
else if (control.flags & FLAG_INFO)
get_fileinfo();
else
compress_file();
/* compute total time */
gettimeofday(&end_time, NULL);
total_time = (end_time.tv_sec + (double)end_time.tv_usec / 1000000) -
(start_time.tv_sec + (double)start_time.tv_usec / 1000000);
hours = (int)total_time / 3600;
minutes = (int)(total_time - hours * 3600) / 60;
seconds = total_time - hours * 60 - minutes * 60;
if ((control.flags & FLAG_SHOW_PROGRESS) && !(control.flags & FLAG_INFO))
fprintf(stderr, "Total time: %02d:%02d:%06.3f\n", hours, minutes, seconds);
}
return 0;
}

83
man/lrunzip.1.pod Normal file
View file

@ -0,0 +1,83 @@
# Copyright
#
# Copyright (C) 2004-2009 Jari Aalto
#
# License
#
# This program is free software; you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
# the Free Software Foundation; either version 2 of the License, or
# (at your option) any later version.
#
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
# GNU General Public License for more details.
#
# You should have received a copy of the GNU General Public License
# along with this program. If not, see <http://www.gnu.org/licenses/>.
#
# Description
#
# To learn what TOP LEVEL section to use in manual pages,
# see POSIX/Susv standard and "tility Description Defaults" at
# http://www.opengroup.org/onlinepubs/009695399/utilities/xcu_chap01.html#tag_01_11
#
# This is manual page in Perl POD format. Read more at
# http://perldoc.perl.org/perlpod.html or run command:
#
# perldoc perlpod | less
#
# To check the syntax:
#
# podchecker *.pod
#
# Create manual page with command:
#
# pod2man PAGE.N.pod > PAGE.N
=pod
=head1 NAME
lrunzip - Uncompress LRZ files
=head1 SYNOPSIS
lrztar [options] FILE [... FILE]
=head1 DESCRIPTION
lrunzip is identical to C<lrzip -d> used to decompress files.
=head1 OPTIONS
See lrzip(1).
=head1 ENVIRONMENT
None.
=head1 FILES
None.
=head1 SEE ALSO
bzip2(1),
gzip(1),
lzop(1),
lrzip(1),
rzip(1),
zip(1)
=head1 AUTHORS
Program was written by Con Kolivas.
This manual page was written by Jari Aalto <jari.aalto@cante.net>, for
the Debian GNU system (but may be used by others). Released under license
GNU GPL version 2or (at your option) any later version. For more
information about license, visit <http://www.gnu.org/copyleft/gpl.html>.
=cut

276
man/lrzip.1 Normal file
View file

@ -0,0 +1,276 @@
.TH "lrzip" "1" "December 2009" "" ""
.SH "NAME"
lrzip \- a large-file compression program
.SH "SYNOPSIS"
.PP
lrzip [OPTIONS] <file>
.br
lrzip \-d [OPTIONS] <file>
.br
lrunzip [OPTIONS] <file>
.br
lrztar [lrzip options] <directory>
.br
lrztar \-d [lrzip options] <directory>
.br
LRZIP=NOCONFIG [lrzip|lrunzip] [OPTIONS] <file>
.PP
.SH "DESCRIPTION"
.PP
LRZIP is a file compression program designed to do particularly
well on very large files containing long distance redundancy\&.
lrztar is a wrapper for LRZIP to simplify compression and decompression
of directories.
.PP
.SH "OPTIONS SUMMARY"
.PP
Here is a summary of the options to lrzip\&.
.nf
\-w size compression window in hundreds of MB
default chosen by heuristic dependent on ram and chosen compression
\-d decompress
\-o filename specify the output file name and/or path
\-O directory specify the output directory when \-o is not used
\-S suffix specify compressed suffix (default '.lrz')
\-f force overwrite of any existing files
\-D delete existing files
\-P don't set permissions on output file. It may leave it world-readable
\-q don't show compression progress
\-L level set lzma/bzip2/gzip compression level (1\-9, default 7)
\-n no backend compression. Prepare for other compressor
\-l lzo compression (ultra fast)
\-b bzip2 compression
\-g gzip compression using zlib
\-z zpaq compression (best, extreme compression, extremely slow)
\-M Maximum window and level - (all available ram and level 9)
\-T value Compression threshold with LZO test. (0 (nil) - 10 (high), default 1)
\-N value Set nice value to value (default 19)
\-v[v] Increase verbosity
\-V show version
\-t test compressed file integrity
\-i show compressed file information
If no filenames or "-" is specified, stdin/out will be used.
.fi
.PP
.SH "OPTIONS"
.PP
.IP "\fB-h\fP"
Print an options summary page
.IP
.IP "\fB-V\fP"
Print the lrzip version number
.IP
.IP "\fB-v[v]\fP"
Increases verbosity. \-vv will print more messages than \-v.
.IP
.IP "\fB-w n\fP"
Set the compression window size to n in hundreds of megabytes. This is the amount
of memory lrzip will search during its first stage of pre-compression and is
the main thing that will determine how much benefit lrzip will provide over
ordinary compression with the 2nd stage algorithm. Because of buffers and
compression overheads, the value chosen must be significantly smaller than
your available ram or lrzip will induce a massive swap load. If not set
(recommended), the value chosen will be determined by internal heuristic in
lrzip which uses the most memory that is reasonable. It is limited to 2GB on
32bit machines.
.IP
.IP "\fB-L 1\&.\&.9\fP"
Set the compression level from 1 to 9. The default is
to use level 7, which is a reasonable compromise between speed and
compression. The compression level is also strongly related to how much
memory lrzip uses. See the \-w option for details.
.IP
.IP "\fB-M \fP"
Maximum compression\&. If this option is set, then lrzip ignores the heuristic
mentioned for the default window and tries to set it to all available ram,
and sets the compression level to maximum. This will cause a significant swap
load on most machines, and may even fail without enough swap space allocated.
Be prepared to walk away if you use this option. It is not recommended to use
this as it hardly ever improves compression.
.IP
.IP "\fB-T 0\&.\&.10\fP"
Sets the LZO compression threshold when testing a data chunk when slower
compression is used. The threshold level can be from 0 to 10.
This option is used to speed up compression by avoiding doing the slow
compression pass. The reasoning is that if it is completely incompressible
by LZO then it will also be incompressible by them, thereby saving time.
The default is 1.
.IP
.IP "\fB-d\fP"
Decompress. If this option is not used then lrzip looks at
the name used to launch the program. If it contains the string
"lrunzip" then the \-d option is automatically set.
.IP
.IP "\fB-l\fP"
LZO Compression. If this option is set then lrzip will use the ultra
fast lzo compression algorithm for the 2nd stage. This mode of compression
gives bzip2 like compression at the speed it would normally take to simply
copy the file, giving excellent compression/time value]&.
.IP
.IP "\fB-n\fP"
No 2nd stage compression. If this option is set then lrzip will only
perform the long distance redundancy 1st stage compression. While this does
not compress any faster than LZO compression, it produces a smaller file
that then responds better to further compression (by eg another application),
also reducing the compression time substantially.
.IP
.IP "\fB-b\fP"
Bzip2 compression. Uses bzip2 compression for the 2nd stage, much like
the original rzip does.
.IP "\fB-g\fP"
Gzip compression. Uses gzip compression for the 2nd stage, much like
the original rzip does. Uses libz compress and uncompress functions.
.IP
.IP "\fB-z\fP"
ZPAQ compression. Uses ZPAQ compression which is from the PAQ family of
compressors known for having some of the highest compression ratios possible
but at the cost of being extremely slow on both compress and decompress.
.IP
.IP "\fB-o\fP"
Set the output file name. If this option is not set then
the output file name is chosen based on the input name and the
suffix. The \-o option cannot be used if more than one file name is
specified on the command line.
.IP
.IP "\fB-O\fP"
Set the output directory for the default filename. This option
cannot be combined with \-o.
.IP
.IP "\fB-S\fP"
Set the compression suffix. The default is '.lrz'.
.IP
.IP "\fB-f\fP"
If this option is not specified (Default) then lrzip will not
overwrite any existing files. If you set this option then rzip will
silently overwrite any files as needed.
.IP
.IP "\fB-D\fP"
If this option is specified then lrzip will delete the
source file after successful compression or decompression. When this
option is not specified then the source files are not deleted.
.IP
.IP "\fB-P\fP"
If this option is specified then lrzip will not try to set the file
permissions on writing the file. This helps when writing to a brain
damaged filesystem like fat32 on windows.
.IP
.IP "\fB-q\fP"
If this option is specified then lrzip will not show the
percentage progress while compressing. Note that compression happens in
bursts with lzma compression which is the default compression. This means
that it will progress very rapidly for short periods and then stop for
long periods.
.IP "\fB-N value\fP"
The default nice value is 19. This option can be used to set the priority
scheduling for the lrzip backup or decompression. Valid nice values are
from \-20 to 19.
.IP
.IP "\fB-t\fP"
This tests the compressed file integrity. It does this by decompressing it
to a temporary file and then deleting it.
.IP
.IP "\fB-i\fP"
This shows information about a compressed file. It shows the compressed size,
the decompressed size, the compression ratio and what compression was used.
Note that the compression mode is detected from the first block only and
it will show no compression used if the first block was incompressible, even
if later blocks were compressible.
.IP
.PP
.SH "INSTALLATION"
.PP
"make install" or just install lrzip somewhere in your search path.
.PP
.SH "COMPRESSION ALGORITHM"
.PP
LRZIP operates in two stages. The first stage finds and encodes large
chunks of duplicated data over potentially very long distances (limited
only by your available ram) in the input file. The second stage is to
use a compression algorithm to compress the output of the
first stage. The compression algorithm can be chosen to be optimised
for size (lzma - default), speed (lzo), legacy (bzip2) or (gzip)
or can be omitted entirely doing only the first stage. A one stage only
compressed file can almost always improve both the compression size and
speed done by a subsequent compression program.
.PP
The key difference between lrzip and other well known compression
algorithms is its ability to take advantage of very long distance
redundancy. The well known deflate algorithm used in gzip uses a
maximum history buffer of 32k. The block sorting algorithm used in
bzip2 is limited to 900k of history. The history buffer in lrzip can be
any size long, limited only by available ram.
.
.PP
It is quite common these days to need to compress files that contain
long distance redundancies. For example, when compressing a set of
home directories several users might have copies of the same file, or
of quite similar files. It is also common to have a single file that
contains large duplicated chunks over long distances, such as pdf
files containing repeated copies of the same image. Most compression
programs won't be able to take advantage of this redundancy, and thus
might achieve a much lower compression ratio than lrzip can achieve.
.IP
.PP
.SH "FILES"
.PP
LRZIP now recognizes a configuration file that contains default settings.
This configuration is searched for in the current directory, /etc/lrzip,
and $HOME/.lrzip. The configuration filename must be \fBlrzip.conf\fP.
.PP
.SH "ENVIRONMENT"
By default, lrzip will search for and use a configuration file, lrzip.conf.
If the user wishes to bypass the file, a startup ENV variable may be set.
.br
.B LRZIP =
.I "NOCONFIG "
.B "[lrzip|lrunzip]"
[OPTIONS] <file>
.br
which will force lrzip to ignore the configuration file.
.PP
.SH "HISTORY - Notes on rzip by Andrew Tridgell"
.PP
The ideas behind rzip were first implemented in 1998 while I was
working on rsync. That version was too slow to be practical, and was
replaced by this version in 2003.
LRZIP was created by the desire to have better compression and/or speed
by Con Kolivas on blending the lzma and lzo compression algorithms with
the rzip first stage, and extending the compression windows to scale
with increasing ram sizes.
.PP
.SH "BUGS"
.PP
Probably lots.
.PP
.SH "SEE ALSO"
lrzip.conf(5)
.PP
.SH "AUTHOR and CREDITS"
.br
rzip was written by Andrew Tridgell.
.br
lzma was written by Igor Pavlov.
.br
lzo was written by Markus Oberhumer.
.br
zpaq was written by Matt Mahoney.
.br
lrzip was bastardised from rzip by Con Kolivas.
.br
Peter Hyman added informational output, updated LZMA SDK,
and aded multi-threading capabilities.
.PP
If you wish to report a problem or make a suggestion then please email Con at
kernel@kolivas.org
.PP
lrzip is released under the GNU General Public License version 2.
Please see the file COPYING for license details.

59
man/lrzip.conf.5 Normal file
View file

@ -0,0 +1,59 @@
.TH "lrzip.conf" "5" "January 2009" "" ""
.SH "NAME"
lrzip.conf \- Configuration File for lrzip
.SH "DESCRIPTION"
.PP
This file if used, will be read by the lrzip program\&, parsed\&,
and options passed to the program\&. Some options may be overridden
on the command line\&. Others are fixed\&.
.PP
The configuration file must be called \fBlrzip\&.conf\fP\&.
The lrzip program will search for the file automatically in one of
three places\&:
.nf
$PWD \- Current Directory
/etc/lrzip
$HOME/\&./lrzip
.PP
Parameters are set in \fBPARAMETER\&=VALUE\fP fashion where any line
beginning with a \fB#\fP or that is blank will be ignored\&.
.PP
.SH "CONFIG FILE EXAMPLE"
.nf
# This is a comment.
# Compression Window size in 100MB. Normally selected by program.
WINDOW = 5
# Compression Level 1-9 (7 Default).
COMPRESSIONLEVEL = 7
# Compression Method, rzip, gzip, bzip2, lzo, or lzma (default).
COMPRESSIONMETHOD = lzma
# Test Threshold value 1-10 (2 Default).
TESTTHRESHOLD = 2
# Default output directory
OUTPUTDIRECTORY = location
# Verbosity, true or 1, or max or 2
VERBOSITY = max
# Show Progress as file is parsed, true or 1, false or 0
SHOWPROGRESS = true
# Set Niceness. 19 is default. \-20 to 19 is the allowable range
NICE = 19
# Delete source file after compression
# this parameter and value are case sensitive
# value must be YES to activate
# DELETEFILES = NO
# Replace existing lrzip file when compressing
# this parameter and value are case sensitive
# value must be YES to activate
# REPLACEFILE = NO
.fi
.PP
.SH "NOTES"
.PP
Be careful when using \fBDELETEFILES\fP or \fBREPLACEFILE\fP as
no warning will be given and lrzip will simply delete the source
or replace the output file!
.PP
.SH "SEE ALSO"
lrzip(1)

85
man/lrztar.1.pod Normal file
View file

@ -0,0 +1,85 @@
# Copyright
#
# Copyright (C) 2004-2010 Jari Aalto
#
# License
#
# This program is free software; you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
# the Free Software Foundation; either version 2 of the License, or
# (at your option) any later version.
#
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
# GNU General Public License for more details.
#
# You should have received a copy of the GNU General Public License
# along with this program. If not, see <http://www.gnu.org/licenses/>.
#
# Description
#
# To learn what TOP LEVEL section to use in manual pages,
# see POSIX/Susv standard and "tility Description Defaults" at
# http://www.opengroup.org/onlinepubs/009695399/utilities/xcu_chap01.html#tag_01_11
#
# This is manual page in Perl POD format. Read more at
# http://perldoc.perl.org/perlpod.html or run command:
#
# perldoc perlpod | less
#
# To check the syntax:
#
# podchecker *.pod
#
# Create manual page with command:
#
# pod2man PAGE.N.pod > PAGE.N
=pod
=head1 NAME
lrztar - Directory wrapper for lrzip
=head1 SYNOPSIS
lrztar [options] DIRECTORY
lrztar -d [options] DIRECTORY.tar.lrz
=head1 DESCRIPTION
lrztar is a wrapper for compressing and decompressing while directories with lrzip(1)
to corresponding file C<DIRECTORY.tar.lrz>.
=head1 OPTIONS
See lrzip(1).
=head1 ENVIRONMENT
None.
=head1 FILES
None.
=head1 SEE ALSO
bzip2(1),
gzip(1),
lzop(1),
lrzip(1),
rzip(1),
zip(1)
=head1 AUTHORS
Program was written by Con Kolivas.
This manual page was written by Jari Aalto <jari.aalto@cante.net>, for
the Debian GNU system (but may be used by others). Released under license
GNU GPL version 2or (at your option) any later version. For more
information about license, visit <http://www.gnu.org/copyleft/gpl.html>.
=cut

66
pod2man.mk Normal file
View file

@ -0,0 +1,66 @@
# pod2man.mk -- Makefile portion to convert *.pod files to manual pages
#
# Copyright information
#
# Copyright (C) 2008-2010 Jari Aalto
#
# License
#
# This program is free software; you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
# the Free Software Foundation; either version 2 of the License, or
# (at your option) any later version.
#
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
# GNU General Public License for more details.
#
# You should have received a copy of the GNU General Public License
# along with this program. If not, see <http://www.gnu.org/licenses/>.
#
# Description
#
# Convert *.pod files to manual pages. Write this to 'install'
# target:
#
# install: build $(MANPAGE)
ifneq (,)
This makefile requires GNU Make.
endif
# This variable *must* be set when called
PACKAGE ?= package
# Optional variables to set
MANSECT ?= 1
PODCENTER =
PODDATE = $$(date "+%Y-%m-%d")
# Directories
MANSRC =
MANDEST = $(MANSRC)
MANPOD = $(MANSRC)$(PACKAGE).$(MANSECT).pod
MANPAGE = $(MANDEST)$(PACKAGE).$(MANSECT)
POD2MAN = pod2man
POD2MAN_FLAGS = --utf8
makeman: $(MANPAGE)
$(MANPAGE): $(MANPOD)
# make target - create manual page from a *.pod page
podchecker $(MANPOD)
LC_ALL= LANG=C $(POD2MAN) $(POD2MAN_FLAGS) \
--center="$(PODCENTER)" \
--date="$(PODDATE)" \
--name="$(PACKAGE)" \
--section="$(MANSECT)" \
$(MANPOD) \
| sed 's,[Pp]erl v[0-9.]\+,$(PACKAGE),' \
> $(MANPAGE) && \
rm -f pod*.tmp
# End of of Makefile part

274
runzip.c Normal file
View file

@ -0,0 +1,274 @@
/*
Copyright (C) Andrew Tridgell 1998-2003
Con Kolivas 2006-2009
This program is free software; you can redistribute it and/or modify
it under the terms of the GNU General Public License as published by
the Free Software Foundation; either version 2 of the License, or
(at your option) any later version.
This program is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
GNU General Public License for more details.
You should have received a copy of the GNU General Public License
along with this program; if not, write to the Free Software
Foundation, Inc., 675 Mass Ave, Cambridge, MA 02139, USA.
*/
/* rzip decompression algorithm */
#include "rzip.h"
static inline uchar read_u8(void *ss, int stream)
{
uchar b;
if (read_stream(ss, stream, &b, 1) != 1)
fatal("Stream read u8 failed\n");
return b;
}
static inline u16 read_u16(void *ss, int stream)
{
u16 ret;
if (read_stream(ss, stream, (uchar *)&ret, 2) != 2)
fatal("Stream read u16 failed\n");
return ret;
}
static inline u32 read_u32(void *ss, int stream)
{
u32 ret;
if (read_stream(ss, stream, (uchar *)&ret, 4) != 4)
fatal("Stream read u32 failed\n");
return ret;
}
static inline i64 read_i64(void *ss, int stream)
{
i64 ret;
if (read_stream(ss, stream, (uchar *)&ret, 8) != 8)
fatal("Stream read i64 failed\n");
return ret;
}
static u16 read_header_v03(void *ss, uchar *head)
{
*head = read_u8(ss, 0);
return read_u16(ss, 0);
}
static i64 read_header(void *ss, uchar *head)
{
if (control.major_version == 0 && control.minor_version < 4)
return read_header_v03(ss, head);
*head = read_u8(ss, 0);
return read_i64(ss, 0);
}
static i64 unzip_literal(void *ss, i64 len, int fd_out, uint32 *cksum)
{
uchar *buf;
if (len < 0)
fatal("len %lld is negative in unzip_literal!\n",len);
if (sizeof(long) == 4 && len > (1UL << 31))
fatal("Unable to allocate a chunk this big on 32bit userspace. Can't decompress this file on this hardware\n");
buf = malloc((size_t)len);
if (!buf)
fatal("Failed to allocate literal buffer of size %lld\n", len);
read_stream(ss, 1, buf, len);
if (write_1g(fd_out, buf, (size_t)len) != (ssize_t)len)
fatal("Failed to write literal buffer of size %lld\n", len);
*cksum = CrcUpdate(*cksum, buf, len);
free(buf);
return len;
}
static i64 unzip_match_v03(void *ss, i64 len, int fd_out, int fd_hist, uint32 *cksum)
{
u32 offset;
i64 n, total = 0;
i64 cur_pos = lseek(fd_out, 0, SEEK_CUR);
if (cur_pos == -1)
fatal("Seek failed on out file in unzip_match.\n");
offset = read_u32(ss, 0);
if (lseek(fd_hist, cur_pos - offset, SEEK_SET) == -1)
fatal("Seek failed by %d from %d on history file in unzip_match - %s\n",
offset, cur_pos, strerror(errno));
while (len) {
uchar *buf;
n = MIN(len, offset);
buf = malloc((size_t)n);
if (!buf)
fatal("Failed to allocate %d bytes in unzip_match\n", n);
if (read_1g(fd_hist, buf, (size_t)n) != (ssize_t)n)
fatal("Failed to read %d bytes in unzip_match\n", n);
if (write_1g(fd_out, buf, (size_t)n) != (ssize_t)n)
fatal("Failed to write %d bytes in unzip_match\n", n);
*cksum = CrcUpdate(*cksum, buf, n);
len -= n;
free(buf);
total += n;
}
return total;
}
static i64 unzip_match(void *ss, i64 len, int fd_out, int fd_hist, uint32 *cksum)
{
i64 cur_pos;
i64 offset, n, total;
if (len < 0)
fatal("len %lld is negative in unzip_match!\n",len);
if (control.major_version == 0 && control.minor_version < 4)
return unzip_match_v03(ss, len, fd_out, fd_hist, cksum);
total = 0;
cur_pos = lseek(fd_out, 0, SEEK_CUR);
if (cur_pos == -1)
fatal("Seek failed on out file in unzip_match.\n");
/* Note the offset is in a different format v0.40+ */
offset = read_i64(ss, 0);
if (lseek(fd_hist, cur_pos - offset, SEEK_SET) == -1)
fatal("Seek failed by %d from %d on history file in unzip_match - %s\n",
offset, cur_pos, strerror(errno));
while (len) {
uchar *buf;
n = MIN(len, offset);
buf = malloc((size_t)n);
if (!buf)
fatal("Failed to allocate %d bytes in unzip_match\n", n);
if (read_1g(fd_hist, buf, (size_t)n) != (ssize_t)n)
fatal("Failed to read %d bytes in unzip_match\n", n);
if (write_1g(fd_out, buf, (size_t)n) != (ssize_t)n)
fatal("Failed to write %d bytes in unzip_match\n", n);
*cksum = CrcUpdate(*cksum, buf, n);
len -= n;
free(buf);
total += n;
}
return total;
}
/* decompress a section of an open file. Call fatal() on error
return the number of bytes that have been retrieved
*/
static i64 runzip_chunk(int fd_in, int fd_out, int fd_hist, i64 expected_size, i64 tally)
{
uchar head;
i64 len;
struct stat st;
void *ss;
i64 ofs;
i64 total = 0;
uint32 good_cksum, cksum = 0;
int l = -1, p = 0;
/* for display of progress */
char *suffix[] = {"","KB","MB","GB"};
unsigned long divisor[] = {1,1024,1048576,1073741824U};
int divisor_index;
double prog_done, prog_tsize;
if (expected_size > (i64)10737418240ULL) /* > 10GB */
divisor_index = 3;
else if (expected_size > 10485760) /* > 10MB */
divisor_index = 2;
else if (expected_size > 10240) /* > 10KB */
divisor_index = 1;
else
divisor_index = 0;
prog_tsize = (long double)expected_size / (long double)divisor[divisor_index];
ofs = lseek(fd_in, 0, SEEK_CUR);
if (ofs == -1)
fatal("Failed to seek input file in runzip_fd\n");
if (fstat(fd_in, &st) != 0 || st.st_size-ofs == 0)
return 0;
ss = open_stream_in(fd_in, NUM_STREAMS);
if (!ss)
fatal(NULL);
while ((len = read_header(ss, &head)) || head) {
switch (head) {
case 0:
total += unzip_literal(ss, len, fd_out, &cksum);
break;
default:
total += unzip_match(ss, len, fd_out, fd_hist, &cksum);
break;
}
p = 100 * ((double)(tally+total) / (double)expected_size);
if (control.flags & FLAG_SHOW_PROGRESS) {
if ( p != l ) {
prog_done = (double)(tally+total) / (double)divisor[divisor_index];
fprintf(stderr, "%3d%% %9.2f / %9.2f %s\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b",
p, prog_done, prog_tsize, suffix[divisor_index] );
fflush(stderr);
l = p;
}
}
}
good_cksum = read_u32(ss, 0);
if (good_cksum != cksum)
fatal("Bad checksum 0x%08x - expected 0x%08x\n", cksum, good_cksum);
if (close_stream_in(ss) != 0)
fatal("Failed to close stream!\n");
return total;
}
/* decompress a open file. Call fatal() on error
return the number of bytes that have been retrieved
*/
i64 runzip_fd(int fd_in, int fd_out, int fd_hist, i64 expected_size)
{
i64 total = 0;
struct timeval start,end;
gettimeofday(&start,NULL);
while (total < expected_size)
total += runzip_chunk(fd_in, fd_out, fd_hist, expected_size, total);
gettimeofday(&end,NULL);
if (control.flags & FLAG_SHOW_PROGRESS)
fprintf(stderr, "\nAverage DeCompression Speed: %6.3fMB/s\n",
(total / 1024 / 1024) / (double)((end.tv_sec-start.tv_sec)? : 1));
return total;
}

659
rzip.c Normal file
View file

@ -0,0 +1,659 @@
/*
Copyright (C) Andrew Tridgell 1998,
Con Kolivas 2006-2009
Modified to use flat hash, memory limit and variable hash culling
by Rusty Russell copyright (C) 2003.
This program is free software; you can redistribute it and/or modify
it under the terms of the GNU General Public License as published by
the Free Software Foundation; either version 2 of the License, or
(at your option) any later version.
This program is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
GNU General Public License for more details.
You should have received a copy of the GNU General Public License
along with this program; if not, write to the Free Software
Foundation, Inc., 675 Mass Ave, Cambridge, MA 02139, USA.
*/
/* rzip compression algorithm */
#include "rzip.h"
#define CHUNK_MULTIPLE (100 * 1024 * 1024)
#define CKSUM_CHUNK 1024*1024
#define GREAT_MATCH 1024
#define MINIMUM_MATCH 31
/* Hash table works as follows. We start by throwing tags at every
* offset into the table. As it fills, we start eliminating tags
* which don't have lower bits set to one (ie. first we eliminate all
* even tags, then all tags divisible by four, etc.). This ensures
* that on average, all parts of the file are covered by the hash, if
* sparsely. */
typedef i64 tag;
/* All zero means empty. We might miss the first chunk this way. */
struct hash_entry {
i64 offset;
tag t;
};
/* Levels control hashtable size and bzip2 level. */
static struct level {
unsigned long mb_used;
unsigned initial_freq;
unsigned max_chain_len;
} levels[10] = {
{ 1, 4, 1 },
{ 2, 4, 2 },
{ 4, 4, 2 },
{ 8, 4, 2 },
{ 16, 4, 3 },
{ 32, 4, 4 },
{ 32, 2, 6 },
{ 64, 1, 16 }, /* More MB makes sense, but need bigger test files */
{ 64, 1, 32 },
{ 64, 1, 128 },
};
struct rzip_state {
void *ss;
struct level *level;
tag hash_index[256];
struct hash_entry *hash_table;
i64 hash_bits;
i64 hash_count;
i64 hash_limit;
tag minimum_tag_mask;
i64 tag_clean_ptr;
uchar *last_match;
i64 chunk_size;
uint32_t cksum;
int fd_in, fd_out;
struct {
i64 inserts;
i64 literals;
i64 literal_bytes;
i64 matches;
i64 match_bytes;
i64 tag_hits;
i64 tag_misses;
} stats;
};
static inline void put_u8(void *ss, int stream, uchar b)
{
if (write_stream(ss, stream, &b, 1) != 0)
fatal(NULL);
}
static inline void put_u32(void *ss, int stream, uint32_t s)
{
if (write_stream(ss, stream, (uchar *)&s, 4))
fatal(NULL);
}
static inline void put_i64(void *ss, int stream, i64 s)
{
if (write_stream(ss, stream, (uchar *)&s, 8))
fatal(NULL);
}
static void put_header(void *ss, uchar head, i64 len)
{
put_u8(ss, 0, head);
put_i64(ss, 0, len);
}
static void put_match(struct rzip_state *st, uchar *p, uchar *buf, i64 offset, i64 len)
{
do {
i64 ofs;
i64 n = len;
ofs = (p - (buf+offset));
put_header(st->ss, 1, n);
put_i64(st->ss, 0, ofs);
st->stats.matches++;
st->stats.match_bytes += n;
len -= n;
p += n;
offset += n;
} while (len);
}
static void put_literal(struct rzip_state *st, uchar *last, uchar *p)
{
do {
i64 len = (i64)(p - last);
st->stats.literals++;
st->stats.literal_bytes += len;
put_header(st->ss, 0, len);
if (len && write_stream(st->ss, 1, last, len) != 0)
fatal(NULL);
last += len;
} while (p > last);
}
/* Could give false positive on offset 0. Who cares. */
static int empty_hash(struct rzip_state *st, i64 h)
{
return !st->hash_table[h].offset && !st->hash_table[h].t;
}
static i64 primary_hash(struct rzip_state *st, tag t)
{
return t & ((1 << st->hash_bits) - 1);
}
static inline tag increase_mask(tag tag_mask)
{
/* Get more precise. */
return (tag_mask << 1) | 1;
}
static int minimum_bitness(struct rzip_state *st, tag t)
{
tag better_than_min = increase_mask(st->minimum_tag_mask);
if ((t & better_than_min) != better_than_min)
return 1;
return 0;
}
/* Is a going to be cleaned before b? ie. does a have fewer low bits
* set than b? */
static int lesser_bitness(tag a, tag b)
{
tag mask;
for (mask = 0; mask != (tag)-1; mask = ((mask<<1)|1)) {
if ((a & b & mask) != mask)
break;
}
return ((a & mask) < (b & mask));
}
/* If hash bucket is taken, we spill into next bucket(s). Secondary hashing
works better in theory, but modern caches make this 20% faster. */
static void insert_hash(struct rzip_state *st, tag t, i64 offset)
{
i64 h, victim_h = 0, round = 0;
/* If we need to kill one, this will be it. */
static i64 victim_round = 0;
h = primary_hash(st, t);
while (!empty_hash(st, h)) {
/* If this due for cleaning anyway, just replace it:
rehashing might move it behind tag_clean_ptr. */
if (minimum_bitness(st, st->hash_table[h].t)) {
st->hash_count--;
break;
}
/* If we are better than current occupant, we can't
jump over it: it will be cleaned before us, and
noone would then find us in the hash table. Rehash
it, then take its place. */
if (lesser_bitness(st->hash_table[h].t, t)) {
insert_hash(st, st->hash_table[h].t,
st->hash_table[h].offset);
break;
}
/* If we have lots of identical patterns, we end up
with lots of the same hash number. Discard random. */
if (st->hash_table[h].t == t) {
if (round == victim_round) {
victim_h = h;
}
if (++round == st->level->max_chain_len) {
h = victim_h;
st->hash_count--;
victim_round++;
if (victim_round == st->level->max_chain_len)
victim_round = 0;
break;
}
}
h++;
h &= ((1 << st->hash_bits) - 1);
}
st->hash_table[h].t = t;
st->hash_table[h].offset = offset;
}
/* Eliminate one hash entry with minimum number of lower bits set.
Returns tag requirement for any new entries. */
static tag clean_one_from_hash(struct rzip_state *st)
{
tag better_than_min;
again:
better_than_min = increase_mask(st->minimum_tag_mask);
if (control.flags & FLAG_VERBOSITY_MAX) {
if (!st->tag_clean_ptr)
fprintf(stderr, "\nStarting sweep for mask %u\n", (unsigned int)st->minimum_tag_mask);
}
for (; st->tag_clean_ptr < (1U<<st->hash_bits); st->tag_clean_ptr++) {
if (empty_hash(st, st->tag_clean_ptr))
continue;
if ((st->hash_table[st->tag_clean_ptr].t & better_than_min)
!= better_than_min) {
st->hash_table[st->tag_clean_ptr].offset = 0;
st->hash_table[st->tag_clean_ptr].t = 0;
st->hash_count--;
return better_than_min;
}
}
/* We hit the end: everthing in hash satisfies the better mask. */
st->minimum_tag_mask = better_than_min;
st->tag_clean_ptr = 0;
goto again;
}
static inline tag next_tag(struct rzip_state *st, uchar *p, tag t)
{
t ^= st->hash_index[p[-1]];
t ^= st->hash_index[p[MINIMUM_MATCH-1]];
return t;
}
static inline tag full_tag(struct rzip_state *st, uchar *p)
{
tag ret = 0;
int i;
for (i = 0; i < MINIMUM_MATCH; i++)
ret ^= st->hash_index[p[i]];
return ret;
}
static inline i64 match_len(struct rzip_state *st,
uchar *p0, uchar *op, uchar *buf, uchar *end, i64 *rev)
{
uchar *p = p0;
i64 len = 0;
if (op >= p0)
return 0;
while ((*p == *op) && (p < end)) {
p++;
op++;
}
len = p - p0;
p = p0;
op -= len;
end = buf;
if (end < st->last_match)
end = st->last_match;
while (p > end && op > buf && op[-1] == p[-1]) {
op--;
p--;
}
(*rev) = p0 - p;
len += p0 - p;
if (len < MINIMUM_MATCH)
return 0;
return len;
}
static i64 find_best_match(struct rzip_state *st,
tag t, uchar *p, uchar *buf, uchar *end,
i64 *offset, i64 *reverse)
{
i64 length = 0;
i64 rev;
i64 h, best_h;
rev = 0;
*reverse = 0;
/* Could optimise: if lesser goodness, can stop search. But
* chains are usually short anyway. */
h = primary_hash(st, t);
while (!empty_hash(st, h)) {
i64 mlen;
if (t == st->hash_table[h].t) {
mlen = match_len(st, p, buf+st->hash_table[h].offset,
buf, end, &rev);
if (mlen)
st->stats.tag_hits++;
else
st->stats.tag_misses++;
if (mlen >= length) {
length = mlen;
(*offset) = st->hash_table[h].offset - rev;
(*reverse) = rev;
best_h = h;
}
}
h++;
h &= ((1 << st->hash_bits) - 1);
}
return length;
}
static void show_distrib(struct rzip_state *st)
{
i64 i;
i64 total = 0;
i64 primary = 0;
for (i = 0; i < (1U << st->hash_bits); i++) {
if (empty_hash(st, i))
continue;
total++;
if (primary_hash(st, st->hash_table[i].t) == i)
primary++;
}
if (total != st->hash_count)
fprintf(stderr, "/tWARNING: hash_count says total %lld\n", st->hash_count);
fprintf(stderr, "\t%lld total hashes -- %lld in primary bucket (%-2.3f%%)\n", total, primary,
primary*100.0/total);
}
static void hash_search(struct rzip_state *st, uchar *buf,
double pct_base, double pct_multiple)
{
uchar *p, *end;
tag t = 0;
i64 cksum_limit = 0;
i64 pct, lastpct=0;
struct {
uchar *p;
i64 ofs;
i64 len;
} current;
tag tag_mask = (1 << st->level->initial_freq) - 1;
if (st->hash_table) {
memset(st->hash_table, 0, sizeof(st->hash_table[0]) * (1<<st->hash_bits));
} else {
i64 hashsize = st->level->mb_used *
(1024 * 1024 / sizeof(st->hash_table[0]));
for (st->hash_bits = 0; (1U << st->hash_bits) < hashsize; st->hash_bits++);
if (control.flags & FLAG_VERBOSITY_MAX)
fprintf(stderr, "hashsize = %lld. bits = %lld. %luMB\n",
hashsize, st->hash_bits, st->level->mb_used);
/* 66% full at max. */
st->hash_limit = (1 << st->hash_bits) / 3 * 2;
st->hash_table = calloc(sizeof(st->hash_table[0]), (1 << st->hash_bits));
}
if (!st->hash_table)
fatal("Failed to allocate hash table in hash_search\n");
st->minimum_tag_mask = tag_mask;
st->tag_clean_ptr = 0;
st->cksum = 0;
st->hash_count = 0;
p = buf;
end = buf + st->chunk_size - MINIMUM_MATCH;
st->last_match = p;
current.len = 0;
current.p = p;
current.ofs = 0;
t = full_tag(st, p);
while (p < end) {
i64 offset = 0;
i64 mlen;
i64 reverse;
p++;
t = next_tag(st, p, t);
/* Don't look for a match if there are no tags with
this number of bits in the hash table. */
if ((t & st->minimum_tag_mask) != st->minimum_tag_mask)
continue;
mlen = find_best_match(st, t, p, buf, end, &offset, &reverse);
/* Only insert occasionally into hash. */
if ((t & tag_mask) == tag_mask) {
st->stats.inserts++;
st->hash_count++;
insert_hash(st, t, (i64)(p - buf));
if (st->hash_count > st->hash_limit)
tag_mask = clean_one_from_hash(st);
}
if (mlen > current.len) {
current.p = p - reverse;
current.len = mlen;
current.ofs = offset;
}
if ((current.len >= GREAT_MATCH || p >= current.p + MINIMUM_MATCH)
&& current.len >= MINIMUM_MATCH) {
if (st->last_match < current.p)
put_literal(st, st->last_match, current.p);
put_match(st, current.p, buf, current.ofs, current.len);
st->last_match = current.p + current.len;
current.p = p = st->last_match;
current.len = 0;
t = full_tag(st, p);
}
if ((control.flags & FLAG_SHOW_PROGRESS) && (p - buf) % 100 == 0) {
pct = pct_base + (pct_multiple * (100.0 * (p - buf)) /
st->chunk_size);
if (pct != lastpct) {
struct stat s1, s2;
fstat(st->fd_in, &s1);
fstat(st->fd_out, &s2);
fprintf(stderr, "%2lld%%\r", pct);
fflush(stderr);
lastpct = pct;
}
}
if (p - buf > (i64)cksum_limit) {
i64 n = st->chunk_size - (p - buf);
st->cksum = CrcUpdate(st->cksum, buf + cksum_limit, n);
cksum_limit += n;
}
}
if (control.flags & FLAG_VERBOSITY_MAX)
show_distrib(st);
if (st->last_match < buf + st->chunk_size)
put_literal(st, st->last_match,buf + st->chunk_size);
if (st->chunk_size > cksum_limit) {
i64 n = st->chunk_size - cksum_limit;
st->cksum = CrcUpdate(st->cksum, buf+cksum_limit, n);
cksum_limit += n;
}
put_literal(st, NULL, 0);
put_u32(st->ss, 0, st->cksum);
}
static void init_hash_indexes(struct rzip_state *st)
{
int i;
for (i = 0; i < 256; i++)
st->hash_index[i] = ((random() << 16) ^ random());
}
/* compress a chunk of an open file. Assumes that the file is able to
be mmap'd and is seekable */
static void rzip_chunk(struct rzip_state *st, int fd_in, int fd_out, i64 offset,
double pct_base, double pct_multiple, i64 limit)
{
uchar *buf;
buf = (uchar *)mmap(NULL, st->chunk_size, PROT_READ, MAP_SHARED, fd_in, offset);
if (buf == (uchar *)-1)
fatal("Failed to map buffer in rzip_fd\n");
st->ss = open_stream_out(fd_out, NUM_STREAMS, limit);
if (!st->ss)
fatal("Failed to open streams in rzip_fd\n");
hash_search(st, buf, pct_base, pct_multiple);
/* unmap buffer before closing and reallocating streams */
munmap(buf, st->chunk_size);
if (close_stream_out(st->ss) != 0)
fatal("Failed to flush/close streams in rzip_fd\n");
}
/* compress a whole file chunks at a time */
void rzip_fd(int fd_in, int fd_out)
{
/* add timers for ETA estimates
* Base it off the file size and number of iterations required
* depending on compression window size
* Track elapsed time and estimated time to go
* If file size < compression window, can't do
*/
struct timeval current, start, last;
struct stat s, s2;
struct rzip_state *st;
i64 len, last_chunk = 0;
i64 chunk_window, pages;
int pass = 0, passes;
unsigned long page_size;
unsigned int eta_hours, eta_minutes, eta_seconds, elapsed_hours,
elapsed_minutes, elapsed_seconds;
double finish_time, elapsed_time, chunkmbs;
st = calloc(sizeof(*st), 1);
if (!st)
fatal("Failed to allocate control state in rzip_fd\n");
if (control.flags & FLAG_LZO_COMPRESS) {
if (lzo_init() != LZO_E_OK)
fatal("lzo_init() failed\n");
}
if (fstat(fd_in, &s))
fatal("Failed to stat fd_in in rzip_fd - %s\n", strerror(errno));
len = s.st_size;
/* Windows must be the width of _SC_PAGE_SIZE for offset to work in mmap */
chunk_window = control.window * CHUNK_MULTIPLE;
page_size = sysconf(_SC_PAGE_SIZE);
pages = chunk_window / page_size;
chunk_window = pages * page_size;
st->level = &levels[MIN(9, control.window)];
st->fd_in = fd_in;
st->fd_out = fd_out;
init_hash_indexes(st);
passes = 1 + s.st_size / chunk_window;
/* set timers and chunk counter */
last.tv_sec = last.tv_usec = 0;
gettimeofday(&start, NULL);
while (len) {
i64 chunk, limit = 0;
double pct_base, pct_multiple;
chunk = chunk_window;
if (chunk > len)
limit = chunk = len;
pct_base = (100.0 * (s.st_size - len)) / s.st_size;
pct_multiple = ((double)chunk) / s.st_size;
st->chunk_size = chunk;
pass++;
gettimeofday(&current, NULL);
/* this will count only when size > window */
if (last.tv_sec > 0) {
if (control.flags & FLAG_VERBOSE) {
elapsed_time = current.tv_sec - start.tv_sec;
finish_time = elapsed_time / (pct_base / 100.0);
elapsed_hours = (unsigned int)(elapsed_time) / 3600;
elapsed_minutes = (unsigned int)(elapsed_time - elapsed_hours * 3600) / 60;
elapsed_seconds = (unsigned int) elapsed_time - elapsed_hours * 60 - elapsed_minutes * 60;
eta_hours = (unsigned int)(finish_time - elapsed_time) / 3600;
eta_minutes = (unsigned int)((finish_time - elapsed_time) - eta_hours * 3600) / 60;
eta_seconds = (unsigned int)(finish_time - elapsed_time) - eta_hours * 60 - eta_minutes * 60;
chunkmbs=(last_chunk / 1024 / 1024) / (double)(current.tv_sec-last.tv_sec);
fprintf(stderr, "\nPass %d / %d -- Elapsed Time: %02d:%02d:%02d. ETA: %02d:%02d:%02d. Compress Speed: %3.3fMB/s.\n",
pass, passes, elapsed_hours, elapsed_minutes, elapsed_seconds,
eta_hours, eta_minutes, eta_seconds, chunkmbs);
}
}
last.tv_sec = current.tv_sec;
last.tv_usec = current.tv_usec;
last_chunk = chunk;
rzip_chunk(st, fd_in, fd_out, s.st_size - len, pct_base, pct_multiple, limit);
len -= chunk;
}
gettimeofday(&current, NULL);
chunkmbs = (s.st_size / 1024 / 1024) / ((double)(current.tv_sec-start.tv_sec)? : 1);
fstat(fd_out, &s2);
if (control.flags & FLAG_VERBOSITY_MAX) {
fprintf(stderr, "matches=%u match_bytes=%u\n",
(unsigned int)st->stats.matches, (unsigned int)st->stats.match_bytes);
fprintf(stderr, "literals=%u literal_bytes=%u\n",
(unsigned int)st->stats.literals, (unsigned int)st->stats.literal_bytes);
fprintf(stderr, "true_tag_positives=%u false_tag_positives=%u\n",
(unsigned int)st->stats.tag_hits, (unsigned int)st->stats.tag_misses);
fprintf(stderr, "inserts=%u match %.3f\n",
(unsigned int)st->stats.inserts,
(1.0 + st->stats.match_bytes) / st->stats.literal_bytes);
}
if (control.flags & FLAG_SHOW_PROGRESS) {
if (!(control.flags & FLAG_STDIN))
fprintf(stderr, "%s - ", control.infile);
fprintf(stderr, "Compression Ratio: %.3f. Average Compression Speed: %6.3fMB/s.\n",
1.0 * s.st_size / s2.st_size, chunkmbs);
}
if (st->hash_table)
free(st->hash_table);
free(st);
}

185
rzip.h Normal file
View file

@ -0,0 +1,185 @@
/*
Copyright (C) Andrew Tridgell 1998,
Con Kolivas 2006-2009
This program is free software; you can redistribute it and/or modify
it under the terms of the GNU General Public License as published by
the Free Software Foundation; either version 2 of the License, or
(at your option) any later version.
This program is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
GNU General Public License for more details.
You should have received a copy of the GNU General Public License
along with this program; if not, write to the Free Software
Foundation, Inc., 675 Mass Ave, Cambridge, MA 02139, USA.
*/
#define LRZIP_MAJOR_VERSION 0
#define LRZIP_MINOR_VERSION 4
#define LRZIP_MINOR_SUBVERSION 5
#define NUM_STREAMS 2
#include "config.h"
#include <sys/types.h>
#include <unistd.h>
#include <stdio.h>
#include <stddef.h>
#include <stdarg.h>
#include <stdlib.h>
#include <string.h>
#include <signal.h>
#include <sys/resource.h>
#include <netinet/in.h>
#include <sys/time.h>
#include <sys/mman.h>
#include <sys/syscall.h>
#ifdef __APPLE__
#include <sys/sysctl.h>
#endif
#include <lzo/lzoconf.h>
#include <lzo/lzo1x.h>
#ifdef HAVE_STRING_H
#include <string.h>
#endif
#ifdef HAVE_MALLOC_H
#include <malloc.h>
#endif
#include <fcntl.h>
#include <sys/stat.h>
#ifdef HAVE_CTYPE_H
#include <ctype.h>
#endif
#include <errno.h>
#include <sys/mman.h>
/* needed for CRC routines */
#include "lzma/C/7zCrc.h"
#ifndef uchar
#define uchar unsigned char
#endif
#ifndef int32
#if (SIZEOF_INT == 4)
#define int32 int
#elif (SIZEOF_LONG == 4)
#define int32 long
#elif (SIZEOF_SHORT == 4)
#define int32 short
#endif
#endif
#ifndef int16
#if (SIZEOF_INT == 2)
#define int16 int
#elif (SIZEOF_SHORT == 2)
#define int16 short
#endif
#endif
#ifndef uint32
#define uint32 unsigned int32
#endif
#ifndef uint16
#define uint16 unsigned int16
#endif
#ifndef MIN
#define MIN(a, b) ((a) < (b)? (a): (b))
#endif
#ifndef MAX
#define MAX(a, b) ((a) > (b)? (a): (b))
#endif
#if !HAVE_STRERROR
extern char *sys_errlist[];
#define strerror(i) sys_errlist[i]
#endif
#ifndef HAVE_ERRNO_DECL
extern int errno;
#endif
typedef long long int i64;
typedef uint16_t u16;
typedef uint32_t u32;
#define FLAG_SHOW_PROGRESS 2
#define FLAG_KEEP_FILES 4
#define FLAG_TEST_ONLY 8
#define FLAG_FORCE_REPLACE 16
#define FLAG_DECOMPRESS 32
#define FLAG_NO_COMPRESS 64
#define FLAG_LZO_COMPRESS 128
#define FLAG_BZIP2_COMPRESS 256
#define FLAG_ZLIB_COMPRESS 512
#define FLAG_ZPAQ_COMPRESS 1024
#define FLAG_VERBOSITY 2048
#define FLAG_VERBOSITY_MAX 4096
#define FLAG_NO_SET_PERMS 8192
#define FLAG_STDIN 16384
#define FLAG_STDOUT 32768
#define FLAG_INFO 65536
#define FLAG_VERBOSE (FLAG_VERBOSITY | FLAG_VERBOSITY_MAX)
#define FLAG_NOT_LZMA (FLAG_NO_COMPRESS | FLAG_LZO_COMPRESS | FLAG_BZIP2_COMPRESS | FLAG_ZLIB_COMPRESS | FLAG_ZPAQ_COMPRESS)
#define LZMA_COMPRESS(C) (!((C) & FLAG_NOT_LZMA))
#define CTYPE_NONE 3
#define CTYPE_BZIP2 4
#define CTYPE_LZO 5
#define CTYPE_LZMA 6
#define CTYPE_GZIP 7
#define CTYPE_ZPAQ 8
struct rzip_control {
char *infile;
char *outname;
char *outfile;
char *outdir;
const char *suffix;
int compression_level;
unsigned char lzma_properties[5]; // lzma properties, encoded
double threshold;
unsigned long long window;
unsigned long flags;
unsigned long long ramsize;
unsigned long threads;
int nice_val; // added for consistency
int major_version;
int minor_version;
};
extern struct rzip_control control;
void fatal(const char *format, ...);
void sighandler();
void err_msg(const char *format, ...);
i64 runzip_fd(int fd_in, int fd_out, int fd_hist, i64 expected_size);
void rzip_fd(int fd_in, int fd_out);
void *open_stream_out(int f, int n, i64 limit);
void *open_stream_in(int f, int n);
int write_stream(void *ss, int stream, uchar *p, i64 len);
i64 read_stream(void *ss, int stream, uchar *p, i64 len);
int close_stream_out(void *ss);
int close_stream_in(void *ss);
void read_config(struct rzip_control *s);
ssize_t write_1g(int fd, void *buf, i64 len);
ssize_t read_1g(int fd, void *buf, i64 len);
extern void zpipe_compress(FILE *in, FILE *out, long long int buf_len, int progress);
extern void zpipe_decompress(FILE *in, FILE *out, long long int buf_len, int progress);

1050
stream.c Normal file

File diff suppressed because it is too large Load diff

204
util.c Normal file
View file

@ -0,0 +1,204 @@
/*
Copyright (C) Andrew Tridgell 1998
This program is free software; you can redistribute it and/or modify
it under the terms of the GNU General Public License as published by
the Free Software Foundation; either version 2 of the License, or
(at your option) any later version.
This program is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
GNU General Public License for more details.
You should have received a copy of the GNU General Public License
along with this program; if not, write to the Free Software
Foundation, Inc., 675 Mass Ave, Cambridge, MA 02139, USA.
*/
/*
Utilities used in rzip
tridge, June 1996
*/
/*
* Realloc removed
* Functions added
* read_config()
* Peter Hyman, December 2008
*/
#include "rzip.h"
void err_msg(const char *format, ...)
{
va_list ap;
va_start(ap, format);
vfprintf(stderr, format, ap);
va_end(ap);
}
void fatal(const char *format, ...)
{
va_list ap;
if (format) {
va_start(ap, format);
vfprintf(stderr, format, ap);
va_end(ap);
}
/* Delete temporary files generated for testing or faking stdio */
if (control.flags & (FLAG_TEST_ONLY | FLAG_STDOUT))
unlink(control.outfile);
if (control.flags & FLAG_STDIN)
unlink(control.infile);
fprintf(stderr, "Fatal error - exiting\n");
exit(1);
}
void sighandler()
{
/* Delete temporary files generated for testing or faking stdio */
if (control.flags & (FLAG_TEST_ONLY | FLAG_STDOUT))
unlink(control.outfile);
if (control.flags & FLAG_STDIN)
unlink(control.infile);
exit(0);
}
void read_config( struct rzip_control *control )
{
/* check for lrzip.conf in ., $HOME/.lrzip and /etc/lrzip */
FILE *fp;
char *parameter;
char *parametervalue;
char *line, *s;
char *HOME, *homeconf;
line = malloc(255);
homeconf = malloc(255);
if (line == NULL || homeconf == NULL)
fatal("Fatal Memory Error in read_config");
fp = fopen("lrzip.conf", "r");
if (fp)
fprintf(stderr, "Using configuration file ./lrzip.conf\n");
if (fp == NULL) {
fp = fopen("/etc/lrzip/lrzip.conf", "r");
if (fp)
fprintf(stderr, "Using configuration file /etc/lrzip/lrzip.conf\n");
}
if (fp == NULL) {
HOME=getenv("HOME");
if (HOME) {
strcpy(homeconf, HOME);
strcat(homeconf,"/.lrzip/lrzip.conf");
fp = fopen(homeconf, "r");
if (fp)
fprintf(stderr, "Using configuration file %s\n", homeconf);
}
}
if (fp == NULL)
return;
/* if we get here, we have a file. read until no more. */
while ((s = fgets(line, 255, fp)) != NULL) {
if (strlen(line))
line[strlen(line) - 1] = '\0';
parameter = strtok(line, " =");
if (parameter == NULL)
continue;
/* skip if whitespace or # */
if (isspace(*parameter))
continue;
if (*parameter == '#')
continue;
parametervalue = strtok(NULL, " =");
if (parametervalue == NULL)
continue;
/* have valid parameter line, now assign to control */
if (!strcasecmp(parameter, "window"))
control->window = atoi(parametervalue);
else if (!strcasecmp(parameter, "compressionlevel")) {
control->compression_level = atoi(parametervalue);
if ( control->compression_level < 1 || control->compression_level > 9 )
fatal("CONF.FILE error. Compression Level must between 1 and 9");
} else if (!strcasecmp(parameter, "compressionmethod")) {
/* valid are rzip, gzip, bzip2, lzo, lzma (default) */
if (control->flags & FLAG_NOT_LZMA)
fatal("CONF.FILE error. Can only specify one compression method");
if (!strcasecmp(parametervalue, "bzip2"))
control->flags |= FLAG_BZIP2_COMPRESS;
else if (!strcasecmp(parametervalue, "gzip"))
control->flags |= FLAG_ZLIB_COMPRESS;
else if (!strcasecmp(parametervalue, "lzo"))
control->flags |= FLAG_LZO_COMPRESS;
else if (!strcasecmp(parametervalue, "rzip"))
control->flags |= FLAG_NO_COMPRESS;
else if (!strcasecmp(parametervalue, "zpaq"))
control->flags |= FLAG_ZPAQ_COMPRESS;
else if (strcasecmp(parametervalue, "lzma"))
fatal("CONF.FILE error. Invalid compression method %s specified",parametervalue);
} else if (!strcasecmp(parameter, "testthreshold")) {
control->threshold = atoi(parametervalue);
if (control->threshold < 1 || control->threshold > 10)
fatal("CONF.FILE error. Threshold value out of range %d", parametervalue);
control->threshold = 1.05-control->threshold / 20;
} else if (!strcasecmp(parameter, "outputdirectory")) {
control->outdir = malloc(strlen(parametervalue) + 2);
if (!control->outdir)
fatal("Fatal Memory Error in read_config");
strcpy(control->outdir, parametervalue);
if (strcmp(parametervalue + strlen(parametervalue) - 1, "/"))
strcat(control->outdir, "/");
} else if (!strcasecmp(parameter,"verbosity")) {
if (control->flags & FLAG_VERBOSE)
fatal("CONF.FILE error. Verbosity already defined.");
if (!strcasecmp(parametervalue, "true") || !strcasecmp(parametervalue, "1"))
control->flags |= FLAG_VERBOSITY;
else if (!strcasecmp(parametervalue,"max") || !strcasecmp(parametervalue, "2"))
control->flags |= FLAG_VERBOSITY_MAX;
} else if (!strcasecmp(parameter,"nice")) {
control->nice_val = atoi(parametervalue);
if (control->nice_val < -20 || control->nice_val > 19)
fatal("CONF.FILE error. Nice must be between -20 and 19");
} else if (!strcasecmp(parameter, "showprogress")) {
/* true by default */
if (!strcasecmp(parametervalue, "false") || !strcasecmp(parametervalue," 0"))
control->flags &= ~FLAG_SHOW_PROGRESS;
} else if (!strcmp(parameter, "DELETEFILES")) {
/* delete files must be case sensitive */
if (!strcmp(parametervalue, "YES"))
control->flags &= ~FLAG_KEEP_FILES;
} else if (!strcmp(parameter, "REPLACEFILE")) {
/* replace lrzip file must be case sensitive */
if (!strcmp(parametervalue, "YES"))
control->flags |= FLAG_FORCE_REPLACE;
}
}
/* clean up */
free(line);
free(homeconf);
/* fprintf(stderr, "\nWindow = %d \
\nCompression Level = %d \
\nThreshold = %1.2f \
\nOutput Directory = %s \
\nFlags = %d\n", control->window,control->compression_level, control->threshold, control->outdir, control->flags);
*/
}

1771
zpipe.cpp Normal file

File diff suppressed because it is too large Load diff