Re: Looping/Counting



Mike wrote:
Ed Morton wrote:


post some sample input and output if you need more help.


Ed.


Thanks for the tips Ed. I'm a bit slow...

Ah, that would explain why you posted the sample input, but not the expected output. Perhaps that's coming later ;-).


Here's an example of five
lines of input data:

1.5	0.01
1.5	0.03
1.5	0.06
1.5	0.14
1.5	0.23

The actual file could contain many tens of thousands of records.  The
data represent a length ($1) and a measurement ($2).  The ultimate goal
is to break down the data by deciles/percentiles and express some basic
statistical parameters about the data (e.g. length-weighted average,
summation of length*measurement, etc.).  The way I accomplished this
before was to sort the data file by column 2, then knowing the total
count of the file I calculated what percentage each record in the file
represented along with the cumulative percentage of the data and wrote
that out to a file using my first awk script.  Taking that output, my
second script bins the information for each 10 percent of the data.
The last decile (from 91 to 100) is further broken down into
percentiles.  All of this information is then written out in a table
form.  I do this for analyzing the distribution of gold in drill
samples.  It turns out that for many deposits, 1% of the sample data
may contain 40-50% of the entire gold content of the deposit.  This
decile analysis provides me with some decent info.

I ramble.... on.  I could certainly post my clumsy scripts, they're no
beauty, just brute force awkwardness.

Anyway thanks for your post, I'll check into doing it that way.

Try this too:

awk 'NR==FNR{numRecords++;totLength+=$1;totMeas+=$2;next}   # pass 1: count and total
{print $1 / numRecords}                                     # pass 2: totals are known here
END{print totLength,totMeas,totLength/numRecords,totMeas/numRecords}' file file


so you get a better feel for it. I don't think you'll find it too hard to modify that to do what you want. As for splitting the records up into 10% buckets, you could create an array indexed by percentile, and just print the contents of that at the end.
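
A minimal sketch of that array idea, under some assumptions that are mine rather than anything from your scripts: the file is already sorted by column 2 (e.g. with sort -k2,2n), a record's decile is decided purely by its rank in the file, and the names decile_table, cnt, len and lenMeas are just made up for illustration. It reads the file twice, the same way as the script above: once to count the records, once to fill the buckets.

```shell
# Hypothetical helper: decile_table FILE
# Assumes FILE holds "length measurement" per line and is already
# sorted by measurement (column 2), e.g. with: sort -k2,2n
decile_table() {
    awk '
    NR == FNR {                 # pass 1: only count the records
        numRecords++
        next
    }
    {                           # pass 2: bucket each record by its rank
        d = int((FNR - 1) * 10 / numRecords) + 1    # decile 1..10
        cnt[d]++
        len[d] += $1            # total length in this decile
        lenMeas[d] += $1 * $2   # sum of length*measurement
    }
    END {
        # columns: decile, count, sum(length), sum(length*measurement),
        # length-weighted average measurement
        for (d = 1; d <= 10; d++)
            if (cnt[d])
                printf "%d\t%d\t%.2f\t%.4f\t%.4f\n", d, cnt[d], len[d],
                       lenMeas[d], (len[d] ? lenMeas[d] / len[d] : 0)
    }' "$1" "$1"
}
```

You would call it as "decile_table sorted.txt". With fewer than ten records some deciles come out empty and are skipped. Breaking the last decile down into percentiles should work the same way with 100 buckets instead of 10, but I haven't tried that against your data.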

	Ed.