Re: Maintaining the mapping of keys to values in array after sorting



On 31 , 01:13, Ed Morton <mor...@xxxxxxxxxxxxxx> wrote:
Vassilis wrote:
On 31 , 00:43, Ed Morton <mor...@xxxxxxxxxxxxxx> wrote:
<snip>
$ cat file
the the quick the brown
the fox the jumped the over
the the lazy the dog's the back
the the dog the then
the bit the fox
$
$ cat count.awk
{
    for (i=1;i<=NF;i++) {
        a[$i]++
    }
}

END {
    for (i in a) {
        idx=sprintf("%010d",a[i])
        print i,a[i]
        b[idx] = b[idx] i " "
    }
    for (j in b) {
        print j,b[j]
    }
}

$
$ awk -f count.awk file
fox 2
quick 1
over 1
brown 1
the 15
back 1
lazy 1
jumped 1
then 1
bit 1
dog's 1
dog 1
0000000001 quick over brown back lazy jumped then bit dog's dog
0000000002 fox
0000000015 the
$

Regards,

Ed.

Ok. Array indices are always strings. That's one.
Then again, why do I need to add 8 or 9 (%010d) leading zeros, and not
just 3 or 4 (%05d)?

You need to use whatever number of digits you think will accommodate
however high your count can get. If your per-word count can't exceed
9999 then you can use "%05d". You're just trying to make sure that when
the character-by-character string comparison occurs, it's comparing
digits with the same place value, so that "10" is considered larger
than "2", etc.
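
A minimal sketch of that point, assuming any POSIX awk: it compares 10 and 2 as strings, first unpadded and then padded with "%05d". The variable names (unpadded, padded, a, b) are only illustrative and are not part of count.awk.

```shell
# Unpadded: concatenating "" turns the numbers into the strings "10" and "2",
# so awk compares them character by character and '1' sorts before '2'.
unpadded=$(awk 'BEGIN { a = 10 ""; b = 2 ""; print (a > b ? "10 sorts after 2" : "10 sorts before 2") }')

# Padded: sprintf returns the strings "00010" and "00002", so digits with the
# same place value line up and the comparison matches the numeric order.
padded=$(awk 'BEGIN { a = sprintf("%05d", 10); b = sprintf("%05d", 2); print (a > b ? "10 sorts after 2" : "10 sorts before 2") }')

echo "unpadded: $unpadded"
echo "padded:   $padded"
```

The padding only affects how two indices compare as strings. It does not by itself determine the order in which a "for (j in b)" loop visits them.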


That I can understand.
But running the above code with my change on the same input doesn't give
the expected result. Check once more:

$ awk -f count.awk file
fox 2
quick 1
over 1
brown 1
the 15
back 1
lazy 1
jumped 1
then 1
bit 1
dog's 1
dog 1
00002 fox
00015 the
00001 quick over brown back lazy jumped then bit dog's dog

Do you get the right result on your machine?
Even running your code unchanged on nawk gives a different result:

$ nawk -f count.awk in
quick 1
the 15
brown 1
dog's 1
back 1
fox 2
then 1
dog 1
over 1
jumped 1
lazy 1
bit 1
0000000015 the
0000000001 quick brown dog's back then dog over jumped lazy bit
0000000002 fox

~$ uname -sr
Linux 2.6.21.3
~$ gawk --version
GNU Awk 3.1.5
Copyright (C) 1989, 1991-2005 Free Software Foundation.
[...]
~$ nawk --version
awk version 20070501


Vassilis (really lost).
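
The differing orders above are consistent with the fact that standard awk leaves the traversal order of "for (j in b)" unspecified, so gawk and nawk are both free to print the padded indices in any order. One way around that, sketched here under the assumption that an external sort is acceptable, is to pipe the padded lines through sort(1). The names tmp, sorted, count and words are illustrative and not from the original script.

```shell
# Sample input from the thread, written to a temporary file.
tmp=$(mktemp)
cat > "$tmp" <<'EOF'
the the quick the brown
the fox the jumped the over
the the lazy the dog's the back
the the dog the then
the bit the fox
EOF

# Count the words, group them under a zero-padded count, then let sort(1)
# order the lines; the padding makes the string sort agree with numeric order.
sorted=$(awk '
{ for (i = 1; i <= NF; i++) count[$i]++ }
END {
    for (w in count) {
        idx = sprintf("%05d", count[w])
        words[idx] = words[idx] w " "
    }
    for (idx in words)          # traversal order is unspecified here
        print idx, words[idx]
}' "$tmp" | sort)
rm -f "$tmp"

printf '%s\n' "$sorted"
```

The order of the words within the 00001 line can still differ between awk implementations, since they are appended in the (equally unspecified) order of the first for-in loop.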
