Re: general compression with awk
- From: "Rufus T. Firefly" <mb.atelier@xxxxxx>
- Date: Wed, 30 Nov 2005 13:09:53 +0100
Loki Harfagr wrote:
>
> If you still want to try awk, here's a very
> timid starter.
> The commented prints are for looking at the engine
> being clumsy (It doesn't pair doubletons like "a b").
>
Discovering (and compressing) recurrent patterns such as "a b" IS the
whole point of this exercise.
> With your first original sample data it gives :
> a 4
> b 1
> a 1
> b 1
> a 1
> b 1
> c 1
> d 2
>
> Maybe someone (maybe me) will find some time to improve
> the poor thing :D)
>
> $ cat RLEinawk.awk
> BEGIN{
>     ### print "========="
> }
> (imp[$0]>0){
>     pot=$0
>     imp[pot]++;
>     ### print "["FNR"] put "$0" in accu, count "imp[$0]
>     next
> }
> {
>     ### print "["FNR"] We read a newt "$0", first depot "pot" ->"imp[pot]
>     if(FNR>1)print pot" "imp[pot]
>     imp[pot]=0
>     pot=$0
>     ### print "Then put the newt "pot" into accu"
>     imp[pot]++
>     next
> }
> END{
>     ### print "In the end, depot the rest"
>     if(FNR>1)print pot" "imp[pot]
>     ### print "========="
> }
>
Your code does `uniq -c` in an obscure way.
$ uniq -c example
4 a
1 b
1 a
1 b
1 a
1 b
1 c
2 d
Maybe I did not quite see your point. Looking forward to your
improvement...
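For comparison, here is a minimal sketch of the same run-length step in more conventional awk, wrapped in a shell function. The name `rle` is made up for illustration, and the sample input is reconstructed from the output quoted above, so treat both as assumptions rather than anything from the original posts.

```shell
# Run-length encode consecutive identical lines (count first, as uniq -c prints it).
rle() {
    awk '
        NR > 1 && $0 != prev { print count, prev; count = 0 }   # run ended: emit it
        { prev = $0; count++ }                                   # extend or start a run
        END { if (NR > 0) print count, prev }                    # flush the last run
    '
}

# Sample input reconstructed from the output quoted above (an assumption).
printf '%s\n' a a a a b a b a b c d d | rle
```

The END rule tests NR > 0, so a one-line input still produces a line of output, whereas the FNR>1 test in the quoted script prints nothing in that case.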
As Ed pointed out,
$ cat example example | awk -f RLEinawk.awk
should print something like
2 example
that is, the whole file recognized as one pattern that occurs twice.
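To make that concrete, here is a rough sketch that handles only the simplest case: the whole input is one block of lines repeated some number of times. It is nowhere near general compression. The names `period` and `sample` are hypothetical, and the sample data is again reconstructed from the quoted output.

```shell
# Find the shortest block of lines whose repetition reproduces the whole input.
period() {
    awk '
        { line[NR] = $0 }                       # slurp the input
        END {
            if (NR == 0) exit
            for (p = 1; p <= NR; p++) {         # candidate block length
                if (NR % p) continue            # must divide the input evenly
                ok = 1
                for (i = p + 1; i <= NR; i++)
                    if (line[i] != line[i - p]) { ok = 0; break }
                if (ok) break                   # smallest matching length wins
            }
            print NR / p                        # repetition count
            for (i = 1; i <= p; i++) print line[i]
        }
    '
}

# Stand-in for the example file (an assumption), fed twice as in the command above.
sample() { printf '%s\n' a a a a b a b a b c d d; }
{ sample; sample; } | period
```

The first output line is the repetition count and the rest is one copy of the block. Patterns repeated inside a larger file, such as the "a b" pairs, would need a search over substrings of lines and are not attempted here.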
RTF