Re: huffman compression
- From: "cr88192" <cr88192@xxxxxxxxxxxxxxxxxx>
- Date: Thu, 14 Jul 2005 02:41:34 +1000
"TVS" <tvsivakumar@xxxxxxxxx> wrote in message
news:1121142453.041514.319380@xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
>i want a full code of huffman data compression for text files in C++.
> I have tried it out but not successful enough to compress the exe
> files. Kindly give me the code so that i can understand the logic
> behind it.
>
huffman, by itself, will not offer that much (exe files might become, say,
70 or 80% or so their original size, you may not see much more). note, even
in the optimal case, at best, pure huffman can only get about 12.5% of the
original size (and this is a file of bytes with only a single value).
it does not sound that like you are approaching it from a standpoint of a
direct interest in the algo itself (though maybe something can be pointed
out, I don't know).
if you are hoping for a simple, yet still fairly effective, encoder for
typical data, then lz77 may be a better bet (lz77 is often combined with
other algos, eg, huffman, for some increase in total compression ratio). be
warned though, many simple lz77 variants (eg: lzss, ...) can use rather
small dictionaries (eg: 4kB), and thus get poor compression (I typically use
a 64kB window, as I get decent results with this).
in my case, I have written (though not posted online) a few simple huffman
coders (typically adaptive or "quasi-static"). I am often faced with the
issue that, even if I can get close, I can make a huffman coder that is
faster than I can get an arithmatic coder. in some cases stuff has to be
done, eg, by this point the huffman decoder makes use of an index table so
typical decodes can be done in a single step (even if the adaptive case, I
have found that the cost implied be periodically rebuilding the index is
less than that implied by using a recursive function or loop for decoding).
is a static huffman is desired, this is another issue (other issues pop up
with static huffman, eg, chunking and stream encoding concerns). static
huffman, however, can be made faster (maybe, or most likely, I guess it
depends on little costs like chunk size, encoding/decoding/rebuilding the
tree vs updating counts and occasionally rebuilding the tree, ...).
if no good links are provided by others, I may consider posting some source
here.
however, in general I have just been using plain bytewise lz77 for things
(if that, in reality it is rare that the desire to use compression for stuff
outweighs the cost of actually using it, in nearly all cases it is less work
to just dump raw bytes).
the cases where I have used compression (in my general code) have ended up
being cases where a lot of typically unreasonably bulky data is generated
(eg: part of what set me off on the whole compression quest to begin with:
lightmaps, and to a lesser extent video).
lightmaps are adequately handled by an older lz77+rle algo of mine, even if
it kind of sucks (and uses a terrible approach to hashing), I have not found
much reason to change it, and as of yet there has not been much other data
popping up that "seriously needs compression".
much of this data doesn't even need files much beyond raw line-based text
formats.
it is a stalemate really.
what I can come up with often somewhat exceeds my needs at a given point in
time...
or whatever...
me, also sitting around half-assedly recreating a good portion of an ode
like library. yes, I could just use ode, but I ended up doing my own
anyways. long standing problems asside (issues still remain with obb and
cylinder related collision behavior) it goes ok.
but, really then, what does it matter. if anything it was mostly just an
excersize in api design, or maybe make something possibly "decent" that I am
never able to utilize the full power of anyways (I also have a scripting
language, which is quite probably overkill).
everything else is just immitations of other things. some things I never had
reason or motivation to improve, others with only a subset of their "power"
being used.
quite possibly all this is where all these extra kloc's come from...
and what stands as a tribute to all this?
nothing of any mention.
I am arguably a terribly ineffective coder sometimes, I can write whatever I
want, but often do a poor job and can't make much good use of it all. but
then again: what am I aiming for anyways. no real goal, no real directed
progress?...
or whatever...
.
- Follow-Ups:
- Re: huffman compression
- From: TVS
- Re: huffman compression
- References:
- huffman compression
- From: TVS
- huffman compression
- Prev by Date: Re: Inventing store 1.12Gb raw video in 5Mb!
- Next by Date: Re: simple (?) gzip question
- Previous by thread: huffman compression
- Next by thread: Re: huffman compression
- Index(es):
Relevant Pages
|