Re: Rexxtacy
- From: Peter Flass <Peter_Flass@xxxxxxxxx>
- Date: Sat, 14 Jun 2008 06:39:28 -0400
I'd store the lines in an auxiliary area and never move them. Then I'd store an array of pointers contiguously (50,000*4 = 200K, at 4 bytes per pointer) and sort those. This has the advantage that you don't need to assume a maximum line length, and each line takes only the storage it actually occupies rather than some theoretical maximum. As you read the lines, allocate their storage in chunks that are multiples of 64K.
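A minimal C sketch of that layout, under stated assumptions: the names `line_store`, `store_add`, `store_sort`, `store_free` and `CHUNK_SIZE` are made up for illustration, each chunk holds 64K of line data, and the standard `qsort()` stands in for whatever sort you prefer. The line text is copied once into a chunk and never moved; only the pointers get sorted.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define CHUNK_SIZE (64u * 1024u)  /* assumption: 64K of line data per chunk */

/* One block of line storage. Chunks are chained and never moved. */
struct chunk {
    struct chunk *next;
    size_t used;
    char data[CHUNK_SIZE];
};

struct line_store {
    struct chunk *chunks;  /* most recently allocated chunk first */
    char **lines;          /* contiguous array of pointers to the lines */
    size_t count;
    size_t capacity;
};

/* Copy one line into chunk storage and record a pointer to it.
   Returns 0 on success, -1 if out of memory or the line is too long. */
int store_add(struct line_store *s, const char *text)
{
    assert(s != NULL && text != NULL);
    size_t len = strlen(text) + 1;            /* include the '\0' */
    if (len > CHUNK_SIZE)
        return -1;
    if (s->chunks == NULL || CHUNK_SIZE - s->chunks->used < len) {
        struct chunk *c = malloc(sizeof *c);  /* start a new chunk */
        if (c == NULL)
            return -1;
        c->next = s->chunks;
        c->used = 0;
        s->chunks = c;
    }
    if (s->count == s->capacity) {            /* grow the pointer array */
        size_t ncap = s->capacity ? s->capacity * 2 : 1024;
        char **p = realloc(s->lines, ncap * sizeof *p);
        if (p == NULL)
            return -1;
        s->lines = p;
        s->capacity = ncap;
    }
    char *dst = s->chunks->data + s->chunks->used;
    memcpy(dst, text, len);
    s->chunks->used += len;
    s->lines[s->count++] = dst;
    return 0;
}

/* qsort() hands us pointers to the array elements, i.e. char ** values. */
static int cmp_lines(const void *a, const void *b)
{
    const char *const *pa = a;
    const char *const *pb = b;
    return strcmp(*pa, *pb);
}

/* Sort the pointers only; the line text itself stays where it is. */
void store_sort(struct line_store *s)
{
    if (s->count > 1)
        qsort(s->lines, s->count, sizeof *s->lines, cmp_lines);
}

/* Release all chunks and the pointer array. */
void store_free(struct line_store *s)
{
    while (s->chunks != NULL) {
        struct chunk *next = s->chunks->next;
        free(s->chunks);
        s->chunks = next;
    }
    free(s->lines);
    s->lines = NULL;
    s->count = s->capacity = 0;
}
```

Start from a zeroed `struct line_store`, call `store_add()` for each line you keep, then `store_sort()`. Line number 100 is then simply `s.lines[99]`. Everything lives on the heap, so the size of the stack doesn't matter.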
Mie wrote:
When I translate a Rexx app to C, I'll have a basic problem
with this part. The app sorts a text file. It also does more than
that, e.g. data checks, but I don't have a problem with the difficult
steps... :-)
Such a text file has a header line, followed by XXXXX lines with data.
A line never is longer than 86 bytes, but assuming 96 or 128 bytes per
line is safer. IRL the number of lines is about 50,000 per text file,
200 items*265 weekdays, but the number of items depends on the user.
Nevertheless it's okay to perform all operations in memory, about 6 MB
of RAM isn't that much nowadays.
There's an ugly work-around I do understand, but that involves a file
per item. I'd like to avoid that, just like I'd like to avoid using
existing tools. For one, because it's already a collection of many tools,
and data quality (compared to other OSes) is an important issue.
The Rexx version uses the simplest bubble sort. The data is reasonably
sorted already, but more important is that I'll understand that better
than a more complex sorting method. Feel free to improve that, but I
would like to see an understandable bubble sort too.
Not explained, but occurring in the Rexx version: skip empty lines, and
skip lines starting with "<" (excluding the header line). Left out: the
obvious fclose()'s, and so on.
So, basically: how do I store about 50,000 lines in an array of chars,
how do I check the first character of each line, how can I print,
e.g., line number 100, and how do I swap 2 lines?
/* Save the header line (no knowledge problem here) */
CALL LineOut target,LineIn(source)
/* I don't know how to do this _with about 6 MB of data / >stack_ */
count=0
DO WHILE Lines(source)>0
  line=Strip(LineIn(source))
  IF line<>'' THEN DO
    IF Left(line,1)<>'<' THEN DO
      count=count+1
      line.count=line
    END
  END
END
/* Bubble sort */
j=1
DO WHILE j<count
  k=j+1
  IF line.j>line.k THEN DO
    temp=line.j
    line.j=line.k
    line.k=temp
    j=0   /* after a swap, restart from the first line */
  END
  j=j+1
END
/* Append the result to the target file */
DO j=1 TO count
  CALL LineOut target,line.j
END j
/* Or, perhaps addressing the same data access-issue: */
SAY line.100
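
The filtering, swapping and sorting steps above might translate to C roughly as follows. This is a sketch, not a drop-in replacement: the names `keep_line`, `swap_lines` and `bubble_sort_lines` are invented here, the lines are assumed to be already stripped and held in an array of `char *`, and `strcmp()` compares raw bytes, which is not quite what the Rexx `>` operator does. The sort is an ordinary bubble sort that stops after a pass without swaps, instead of restarting from the first line after every swap as the Rexx loop does.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* True if an (already stripped) line should be kept:
   not empty and not starting with '<'. */
bool keep_line(const char *line)
{
    return line[0] != '\0' && line[0] != '<';
}

/* Swapping two lines only exchanges two pointers; no text is copied. */
void swap_lines(char **a, char **b)
{
    char *tmp = *a;
    *a = *b;
    *b = tmp;
}

/* Plain bubble sort on an array of line pointers, ascending by strcmp().
   Stops as soon as a full pass makes no swap, which is quick when the
   data is nearly sorted already. */
void bubble_sort_lines(char **lines, size_t count)
{
    bool swapped = true;
    while (swapped && count > 1) {
        swapped = false;
        for (size_t j = 0; j + 1 < count; j++) {
            if (strcmp(lines[j], lines[j + 1]) > 0) {
                swap_lines(&lines[j], &lines[j + 1]);
                swapped = true;
            }
        }
        count--;  /* the largest remaining line is now in its final place */
    }
}
```

With such an array, the Rexx `SAY line.100` becomes `puts(lines[99])`, since C arrays start at index 0, and the first character of line j is `lines[j][0]`.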