Re: general compression with awk
- From: Loki Harfagr <loki@xxxxxxxxxxxxxxxxxx>
- Date: Fri, 02 Dec 2005 11:22:17 +0100
On Wed, 30 Nov 2005 13:09:53 +0100, Rufus T. Firefly wrote:
> Loki Harfagr wrote:
>>
>> If you still want to try awk, here's a very
>> timid starter.
>> The commented prints are for looking at the engine
>> being clumsy (It doesn't pair doubletons like "a b").
>>
> discovering (and compressing) recurrent patterns IS the whole point of
> this exercise
Oh yes! Indeed I missed your point. I thought you were lacking a
starting architecture for a compressor-type engine, while what you
really wanted was an algorithm !-) Sorry about that!
....
> Your code does `uniq -c` in an obscure way.
Mmm, I wouldn't say it is that obscure :D)
> $ uniq -c example
> 4 a
> 1 b
> 1 a
> 1 b
> 1 a
> 1 b
> 1 c
> 2 d
>
> Maybe I did not quite see your point.
It seems we both had difficulty reading each other. I think it is clear
now, and yes, it *has* to look like a 'uniq -c', which is a good start
for a compression engine (collect and count elements so as to build a hash
dictionary). It could be a basis for a Huffman compressor. Why don't you
use Huffman coding? Is your research for its own sake (intellectual
curiosity) or do you intend to actually use the thing in a real process?
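
For reference, the 'uniq -c' style counting discussed above can be sketched
in a few lines of awk. This is only an illustration of counting runs of
identical consecutive lines, not the script posted earlier in the thread
(which is not reproduced here). The shell function name `rle` is made up
for the example.

```shell
# rle: hypothetical helper that counts runs of identical consecutive
# lines on stdin, printing "count line" for each run (like uniq -c,
# but without the leading padding).
rle() {
  awk '
    # First line: start the first run. Appending "" forces a string,
    # so later comparisons are string comparisons, never numeric.
    NR == 1    { prev = $0 ""; n = 1; next }
    # Same line as the current run: extend the run.
    $0 == prev { n++; next }
    # Different line: flush the finished run and start a new one.
               { print n, prev; prev = $0 ""; n = 1 }
    # End of input: flush the last run, if there was any input at all.
    END        { if (NR > 0) print n, prev }
  '
}
```

Usage: `rle < example`. Each output pair (count, line) is the raw material
for a symbol table; a Huffman coder would need total frequencies per symbol
rather than per run, so the counts would still have to be summed.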
> - Looking forward to your
> improvement...
Well, I don't know if I'll have time to try, but if I can
get some relief from preparing Xmas presents, cards, apologies for parties,
I'll have a try :-)
> As Ed pointed out
>
> $ cat example example |awk -f RLEinawk.awk
>
> should print something like
> 2 example
>
> if you understand what I mean.
Yes, I do, of course, but you do realize that using this type of algorithm
on a "freely sized" dictionary will be a very resource-consuming
task for a computer. At best it will need a loop on the order of the
factorial of the number of elements, so if you have big and very
heterogeneous files you'll crash fairly quickly !-)
Of course, if you can define a closed context for the data you may find
some tricks to do it in a refined, optimized way :-)
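
As an illustration of such a closed context, here is a minimal sketch that
handles only one restricted case: the whole input is an exact repetition of
a single block of lines, as in the 'cat example example' case quoted above.
It is not a general pattern finder. The function name `period` and the
output format (repetition count, then block length in lines) are
assumptions made for the example.

```shell
# period: hypothetical helper that reads all lines from stdin and
# reports the smallest block of lines whose exact repetition makes up
# the whole input. Prints "repetitions block_length".
period() {
  awk '
    # Store every line as a string (the "" avoids numeric comparison).
    { line[NR] = $0 "" }
    END {
      # Try candidate block lengths from shortest to longest.
      for (p = 1; p <= NR; p++) {
        if (NR % p) continue          # block length must divide the input
        ok = 1
        # Every line must equal the line one block earlier.
        for (i = p + 1; i <= NR; i++)
          if (line[i] != line[i - p]) { ok = 0; break }
        if (ok) { print NR / p, p; exit }
      }
    }
  '
}
```

Usage: `cat example example | period`. The whole input is held in memory
and each candidate block length costs one pass over the lines, which stays
cheap only because the search is limited to whole-input repetition.
Patterns that may start anywhere and have any length are the expensive
case described above.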
See you.