Re: read ahead or before
- From: Janis Papanagnou <Janis_Papanagnou@xxxxxxxxxxx>
- Date: Sun, 27 Jul 2008 23:40:48 +0200
Ted Davis wrote:
On Sat, 26 Jul 2008 12:02:48 -0700, Mag Gam wrote:
I have been trying to do this instead of placing everything in a hash/array
and comparing it in the END block.
For example, if I have a file like this:
111
2222
333
333
4445
3434
Notice there is a duplicate "333". How can I test whether the next line is
the same as the current line? I suppose I can use getline, but is there
another clever way of achieving this?
Also, how can I check the previous line?
Functionally, this is the same as PK's suggestion, just written out in a
fuller (C-like) and, hopefully, clearer form. Since you didn't say what you
want to do with the lines after suppressing adjacent duplicates, I wrote it
to print the non-duplicate lines as it encounters them. It should not be
sensitive to the file size, because it stores only one line at a time.
{
    if ($0 != Prev) print $0
    Prev = $0
}
In minimalist awk format, that's:
$0 != Prev {print}
{Prev = $0}
As a command-line program, that could be (minimalist format):
awk '$0!=Prev{print}{Prev=$0}' source > target
If we're going to go minimalist, maybe even...
awk '$0!=prev;{prev=$0}' source > target
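The shorter form works because a pattern with no action prints the record by default, so `$0!=prev;` behaves like `$0!=prev{print}`. Below is a slightly more defensive sketch of the same idea. It rests on two assumptions that are not from the posts above: that the first line should always be printed, even when it is empty, and that lines should be compared as strings, so that numeric-looking lines such as 1 and 1.0 are not treated as equal. The function name dedup_adjacent is hypothetical.

```shell
# Hypothetical wrapper; the name dedup_adjacent is illustrative only.
# Prints each line unless it is identical (as a string) to the line before it.
dedup_adjacent() {
    awk '
        # NR == 1: always print the first line, even an empty one.
        # Appending "" forces a string comparison rather than a numeric one.
        NR == 1 || ($0 "") != (prev "")   # no action given, so the default is print
        { prev = $0 }                     # remember this line for the next record
    ' "$@"
}
```

Usage would mirror the one-liners above, for example `dedup_adjacent source > target`. As in the original, only one line is held in memory at a time.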
Janis
(Tested with your sample data under Fedora and XP: under XP as a script
file, and in all variations under Linux.)
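For anyone who wants to repeat such a run, here is a minimal sketch using the sample data from the question. The helper name sample and the wording of the report message are hypothetical. The second command touches on the "previous line" part of the question: asking whether the next line equals the current one is the same test as asking, one record later, whether the current line equals the previous one, so remembering a single line is enough.

```shell
# Sample data from the original question, written to standard output.
sample() {
    printf '%s\n' 111 2222 333 333 4445 3434
}

# Suppress adjacent duplicates (the one-liner from the thread).
sample | awk '$0!=prev;{prev=$0}'
# prints 111, 2222, 333, 4445, 3434, one per line

# Hypothetical variant: report duplicates instead of suppressing them,
# using NR (the current record number) to name the two lines involved.
sample | awk 'NR > 1 && $0 == prev { printf "line %d repeats line %d: %s\n", NR, NR - 1, $0 }
              { prev = $0 }'
# prints: line 4 repeats line 3: 333
```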
BTW, "gigabytes" is usually abbreviated GB (Gb would be "gigabits").
The symbols for SI prefixes larger than kilo are all upper case, and those
smaller than mega are all lower case. The full prefix names are lower case
unless the language requires initial capitals. (k and K have an unofficial
usage in byte/bit contexts: k = 1000, K = 1024.)