Re: Difference between gzip, zip, deflate



On Jul 2, 5:57 pm, "ying...@xxxxxxxxx" <ying...@xxxxxxxxx> wrote:
Can you please help me understand what are the differences between
gzip, zip, deflate?

Deflate is a compression method and compressed data format. It can be
called a "raw" compressed data format, since there is no additional
information in a deflate stream such as file names, lengths, check
values, etc. There are not even magic bytes at the start by which you
could recognize it as a deflate stream. It is simply the encoding of a
string of bytes. The deflate format is self-terminating, so when
decoding, you know when to stop without having an input length.
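
To illustrate the "raw" and self-terminating properties, here is a
minimal sketch using Python's zlib module. It assumes the module's
convention that a negative wbits value selects a raw deflate stream;
the helper names raw_deflate and raw_inflate are made up for this
example.

```python
import zlib

def raw_deflate(data: bytes) -> bytes:
    # wbits=-15 asks zlib for a raw deflate stream: no header, no trailer.
    c = zlib.compressobj(6, zlib.DEFLATED, -15)
    return c.compress(data) + c.flush()

def raw_inflate(stream: bytes):
    # The decoder is never told the stream length; it stops by itself
    # when it reaches the final deflate block.
    d = zlib.decompressobj(-15)
    out = d.decompress(stream)
    return out, d.eof, d.unused_data

payload = b"hello, hello, hello, deflate"
stream = raw_deflate(payload)

# Append unrelated bytes after the stream to show that decoding stops
# at the end of the deflate data and leaves the rest untouched.
out, finished, leftover = raw_inflate(stream + b"TRAILING")
```

Here finished reports that the decoder saw the end of the stream, and
leftover holds whatever followed it.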

gzip is a wrapper for the deflate format. It provides those missing
items mentioned, such as magic bytes, file name, modification time,
and check values with header and trailer bytes around a deflate
stream. Currently gzip only allows the deflate compression method.
A gzip stream holds one deflate stream (strictly, one per gzip member,
and members can be concatenated), and so effectively represents only
one file.
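
As a rough sketch of what the wrapper adds, the following Python
snippet picks apart a gzip stream produced by the standard gzip
module. The field offsets follow the gzip layout (magic bytes, method
byte, flags, then a CRC-32 and length in the last eight bytes); the
variable names are just for illustration.

```python
import gzip
import struct
import zlib

data = b"example payload " * 4
gz = gzip.compress(data)

magic = gz[:2]     # gzip magic bytes, 0x1f 0x8b
method = gz[2]     # compression method, 8 means deflate
flags = gz[3]      # optional header fields (file name, comment, ...)

# Trailer: CRC-32 of the uncompressed data, then its length mod 2**32,
# both little-endian.
crc32, isize = struct.unpack("<II", gz[-8:])

# With no optional fields the header is exactly 10 bytes, so whatever
# lies between header and trailer is a plain raw deflate stream.
body = gz[10:-8] if flags == 0 else None
```

The body, when the header has no optional fields, can be handed to a
raw deflate decoder such as zlib.decompress(body, -15).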

zip is an archive format that contains much more information than a
gzip stream, and allows storing multiple files in an archive. The
format contains path information as well as file names, more file
attribute information, and a central directory to facilitate random
access of the archive. The zip format allows many compressed data
formats, of which deflate is only one (though the most popular for
portability).
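
For comparison, a small sketch with Python's zipfile module shows the
archive side of this: several entries with paths, each with its own
compression method, and access to one entry by name through the
central directory. The entry names are invented for the example.

```python
import io
import zipfile

buf = io.BytesIO()
with zipfile.ZipFile(buf, "w") as zf:
    # One entry compressed with deflate, one stored without compression.
    zf.writestr("docs/readme.txt", b"read me " * 20,
                compress_type=zipfile.ZIP_DEFLATED)
    zf.writestr("data/raw.bin", b"\x00\x01\x02\x03",
                compress_type=zipfile.ZIP_STORED)

with zipfile.ZipFile(io.BytesIO(buf.getvalue())) as zf:
    names = zf.namelist()
    methods = {info.filename: info.compress_type for info in zf.infolist()}
    # Read the second entry directly, without decoding the first one.
    second = zf.read("data/raw.bin")
```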

You didn't mention zlib, but it is another wrapper around deflate
data, as well as the name of a software library for encoding and
decoding deflate, zlib, and gzip streams. The zlib wrapper format is
spartan, consisting of only six bytes in total, and provides magic bytes,
the compression method (only deflate is allowed currently, like gzip),
and a check value at the end.
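
The six bytes can be seen directly by compressing the same data with
and without the wrapper. This is a sketch using Python's zlib module,
assuming its wbits convention (15 for the zlib wrapper, -15 for raw
deflate); the helper name deflate is made up here.

```python
import struct
import zlib

data = b"spartan wrapper " * 8

def deflate(data: bytes, wbits: int) -> bytes:
    c = zlib.compressobj(6, zlib.DEFLATED, wbits)
    return c.compress(data) + c.flush()

raw = deflate(data, -15)      # raw deflate, no wrapper
wrapped = deflate(data, 15)   # same data inside the zlib wrapper

overhead = len(wrapped) - len(raw)

# Two-byte header: the low nibble of the first byte is the compression
# method (8 = deflate), and the two bytes read as a big-endian number
# are a multiple of 31.
method = wrapped[0] & 0x0F
header_ok = (wrapped[0] * 256 + wrapped[1]) % 31 == 0

# Four-byte trailer: Adler-32 of the uncompressed data, big-endian.
(adler,) = struct.unpack(">I", wrapped[-4:])
```

Stripping the first two and last four bytes of wrapped leaves the raw
deflate stream.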

There are other formats that incorporate deflate and/or zlib formatted
data, such as the PNG image format and the PDF document format.

These formats are all mutually incompatible. However, you can write
software to attempt to detect the format and extract the compressed
data. zlib can be used for the deflate decoding, though that only
covers one possible compression format in zip files.
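
A minimal sketch of such detection in Python follows. The function
names sniff and extract are hypothetical, and the checks are only
heuristics: raw deflate has no signature, so it is tried as a last
resort, and Python's zipfile module decodes only some of the
compression methods that zip allows.

```python
import io
import zipfile
import zlib

def sniff(blob: bytes) -> str:
    if blob[:2] == b"\x1f\x8b":
        return "gzip"
    # Local file header, or end-of-central-directory for an empty archive.
    if blob[:4] in (b"PK\x03\x04", b"PK\x05\x06"):
        return "zip"
    # zlib header: method nibble 8 and a 16-bit value divisible by 31.
    if (len(blob) >= 2 and (blob[0] & 0x0F) == 8
            and (blob[0] * 256 + blob[1]) % 31 == 0):
        return "zlib"
    return "raw-or-unknown"

def extract(blob: bytes) -> dict:
    kind = sniff(blob)
    if kind == "zip":
        with zipfile.ZipFile(io.BytesIO(blob)) as zf:
            return {name: zf.read(name) for name in zf.namelist()}
    # 31 = gzip wrapper, 15 = zlib wrapper, -15 = raw deflate.
    wbits = {"gzip": 31, "zlib": 15}.get(kind, -15)
    return {kind: zlib.decompress(blob, wbits)}

def pack(data: bytes, wbits: int) -> bytes:
    c = zlib.compressobj(6, zlib.DEFLATED, wbits)
    return c.compress(data) + c.flush()

payload = b"abc" * 10
zbuf = io.BytesIO()
with zipfile.ZipFile(zbuf, "w", zipfile.ZIP_DEFLATED) as zf:
    zf.writestr("a.txt", payload)

samples = [pack(payload, 31), pack(payload, 15),
           pack(payload, -15), zbuf.getvalue()]
kinds = [sniff(s) for s in samples]
results = [extract(s) for s in samples]
```

If the input is not actually deflate data, the last branch raises
zlib.error, which a caller would want to catch.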

Mark Adler
