Re: Fraudulent eBay listing



According to Steve Ackman <usenet2002@xxxxxxxxxxxxxxxxxxxxxxxxxxxx>:
In <f6gsn502g4o@xxxxxxxxxxxxxxxxx>, on 4 Jul 2007 19:33:57 GMT, DoN.
Nichols, dnichols@xxxxxxxxxxx wrote:

Of course the first works. Why would you think
it wouldn't? (These are done on FreeBSD, but grep is
fairly standard across all the unices)

O.K. I had not noticed that you were using plain grep, instead
of egrep. Plain grep does not use true REs. In plain grep, '*' simply
stands for "any number of any character", while in true REs (including
in egrep), '*' simply stands for "zero or more of the preceding
character", so you would need ".*" ('.' being "any character", so that
".*" matches any number of any character).
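
To illustrate the extended-RE half of that statement, here is a minimal sketch using "grep -E". The helper name count_ere and the four sample strings are invented for this example, and it assumes a grep that accepts -E and -c as POSIX describes; it says nothing about how any particular vendor's plain grep behaves.

```shell
# count_ere PATTERN: count how many of four sample lines match PATTERN
# when it is interpreted as an extended RE.
# The helper name and the sample lines are hypothetical.
count_ere() {
    printf '%s\n' 'ac' 'abc' 'abbbc' 'axyzc' | grep -E -c -- "$1"
}

count_ere '^ab*c$'   # '*' repeats only the preceding 'b': prints 3
count_ere '^a.*c$'   # '.*' allows any run of any characters: prints 4
```

Only the second pattern picks up 'axyzc', because only ".*" lets arbitrary characters sit between the 'a' and the 'c'.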

$ man egrep

GREP(1)

NAME
grep, egrep, fgrep, zgrep, zegrep, zfgrep, bzgrep, bzegrep,
bzfgrep -
print lines matching a pattern
[...]
egrep
is the same as grep -E.

Hmm ... that depends on the version of unix you are using. For
OpenBSD, I find the following:

======================================================================
Popocat:csu 20:27:35 # ls -li `which grep`
123705 -r-xr-xr-x 6 root bin 29776 Mar 1 2006 /usr/bin/grep
Popocat:csu 20:27:50 # ls -li `which egrep`
123705 -r-xr-xr-x 6 root bin 29776 Mar 1 2006 /usr/bin/egrep
Popocat:csu 20:27:54 # ls -li `which fgrep`
123705 -r-xr-xr-x 6 root bin 29776 Mar 1 2006 /usr/bin/fgrep
======================================================================

Note that all have the same inode, so they are links (three of six) to
the same executable.
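
The same check can be scripted rather than read off by eye. This is a rough sketch: the helper name same_inode and the scratch file names are made up, and it assumes both paths live on the same filesystem (inode numbers are only unique within one filesystem).

```shell
# same_inode A B: succeed if A and B report the same inode number.
# Hypothetical helper; assumes both paths are on the same filesystem.
same_inode() {
    a=$(ls -di -- "$1" | awk '{print $1}')
    b=$(ls -di -- "$2" | awk '{print $1}')
    [ -n "$a" ] && [ "$a" = "$b" ]
}

# Demonstrate on scratch files rather than on the real binaries.
tmp=$(mktemp -d)
echo data > "$tmp/grep_demo"
ln "$tmp/grep_demo" "$tmp/egrep_demo"   # hard link: shares the inode
cp "$tmp/grep_demo" "$tmp/fgrep_demo"   # copy: gets its own inode

same_inode "$tmp/grep_demo" "$tmp/egrep_demo" && echo "linked"
same_inode "$tmp/grep_demo" "$tmp/fgrep_demo" || echo "separate"
rm -rf "$tmp"
```

On a real system one would call it as same_inode "$(which grep)" "$(which egrep)".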

However, on Solaris 10 (which is what I was using):

======================================================================
Fuego:dnichols 20:24:20 > l -i `which grep`
291 -r-xr-xr-x 1 root bin 10212 Jan 22 2005 /bin/grep*
Fuego:dnichols 20:24:33 > l -i `which egrep`
267 -r-xr-xr-x 1 root bin 26688 Jan 22 2005 /bin/egrep*
Fuego:dnichols 20:25:56 > l -i `which fgrep`
274 -r-xr-xr-x 1 root bin 18376 Dec 16 2005 /bin/fgrep*
======================================================================

Each is a separate binary -- different sizes, and each at least somewhat
smaller than the single combined OpenBSD executable.

-E, --extended-regexp
Interpret PATTERN as an extended regular expression (see below).

And the Solaris man page for egrep shows:

======================================================================
NAME
egrep - search a file for a pattern using full regular
expressions
======================================================================

while the man page for grep shows:


======================================================================
NAME
grep, egrep, fgrep - print lines matching a pattern

[ ... ]

There are three major variants of grep, controlled by the
following options.
-G Interpret pattern as a basic regular expression (see
below). This is the default.
-E Interpret pattern as an extended regular expression
(see below).
-F Interpret pattern as a list of fixed strings, separated
by newlines, any of which is to be matched.
In addition, two variant programs egrep and fgrep are available.
Egrep is similar (but not identical) to grep -E, and is compatible
with the historical Unix egrep. Fgrep is the same as grep -F.
======================================================================

So according to this, on Solaris 10, egrep is *not* the same as
"grep -E".

[...]

According to the man page, it would seem I'm the one
using true (ok, "basic") regular expressions while you're
using *extended* regular expressions. Nowhere can I
find any reference to "true" regular expressions.

On Solaris, I am using what Solaris calls "Full regular
expressions" with egrep. With "grep -E" I would be using "extended
regular expressions".

The egrep man page on Solaris continues a bit deeper down:


======================================================================
/usr/bin/egrep
The /usr/bin/egrep utility accepts full regular expressions
as described on the regexp(5) manual page, except for \( and
\), \( and \), \{ and \}, \< and \>, and \n, and with the
addition of:

1. A full regular expression followed by + that matches one
or more occurrences of the full regular expression.

2. A full regular expression followed by ? that matches 0
or 1 occurrences of the full regular expression.

3. Full regular expressions separated by | or by a NEWLINE
that match strings that are matched by any of the
expressions.

SunOS 5.10 Last change: 24 Mar 2006 1

User Commands egrep(1)

4. A full regular expression that can be enclosed in
parentheses () for grouping.

Be careful using the characters $, *, [, ^, |, (, ), and \
in full regular expression, because they are also meaningful
to the shell. It is safest to enclose the entire full regular
expression in single quotes ('').

The order of precedence of operators is [], then *?+, then
concatenation, then | and NEWLINE.
======================================================================
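
The four additions listed in that excerpt can be tried out with a short sketch. It uses "grep -E" rather than Solaris /usr/bin/egrep, so it shows extended REs as POSIX describes them, not necessarily every detail of the Solaris utility; the helper name ere_matches and the sample strings are invented. Each pattern is single-quoted, as the man page advises.

```shell
# ere_matches PATTERN STRING: succeed if STRING matches PATTERN
# when PATTERN is interpreted as an extended RE.
# Hypothetical helper for illustration only.
ere_matches() {
    printf '%s\n' "$2" | grep -E -q -- "$1"
}

ere_matches '^ab+c$'      'abbc'  && echo '+ : one or more of the preceding RE'
ere_matches '^colou?r$'   'color' && echo '? : zero or one of the preceding RE'
ere_matches '^(cat|dog)$' 'dog'   && echo '| : alternation, () : grouping'
ere_matches '^ab+c$'      'ac'    || echo '+ needs at least one b'
```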

Obviously, given my .+ faux pas, they're easy enough
to confuse. ;-)

And REs have changed over time. But generally, anything using
the original REs would work in all of the later versions.

[ ... ]

258 IDs with zeros present out of 714 total IDs. So -- your
expression for grep (if you are including digits) may not work in a
Killfile, unless it is based on plain grep. :-)

Which is why I included the disclaimer about which
regexp and which newsreader you use. Patterns are
patterns. Syntax is syntax.

Leafnode filters use extended regular expressions
while all my scripts for searching the spool use
regular regular expressions.

O.K.

Let's see....
$ time egrep "^From: Ignoram.*" * | wc -l
1634

real 3m12.246s
user 0m0.936s
sys 0m4.252s

Adapted from your earlier post:
$ time egrep "^From: Ignoramus[0-9]*" * | wc -l
1634

real 4m25.169s
user 0m1.115s
sys 0m5.514s

You can include the extra digits if you want, but I'll
take speed over excessively narrow patterns. ;-)

Again -- on Solaris (Solaris 8 on an older system, but the one
on which the file being scanned resides so I avoid the overhead of NFS
access):

======================================================================
stromboli:root 20:42 # time egrep "^From ignoram.*" Rec.crafts.metalworking | wc -l
2422
egrep ^From ignoram.* Rec.crafts.metalworking 29.67s user 14.90s system 97% cpu 45.906 total
wc -l 0.01s user 0.04s system 0% cpu 45.826 total
======================================================================
stromboli:root 20:43 # time egrep "^From ignoramus[0-9]*" Rec.crafts.metalworking | wc -l
2422
egrep ^From ignoramus[0-9]* Rec.crafts.metalworking 29.25s user 15.19s system 97% cpu 45.811 total
wc -l 0.03s user 0.03s system 0% cpu 45.743 total
======================================================================

So this one gives separate time values for each component of the
pipeline, and the total time expended is hardly different between the
shorter RE and the more refined one.

So there are significant tradeoffs between differing flavors of
unix. For your system, there is an obvious benefit to using the shorter
and less selective RE. For mine, there does not seem to be much benefit
at all.
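
One thing worth checking alongside the timings is whether the two REs select different lines at all. A trailing ".*" or "[0-9]*" can match the empty string, so neither one narrows the selection beyond the fixed prefix in front of it; requiring digits would take something like "[0-9]+". A small sketch, assuming "grep -E"; the helper name count_from and the sample headers are invented:

```shell
# count_from PATTERN: count invented sample header lines that match
# PATTERN as an extended RE. Helper and addresses are hypothetical.
count_from() {
    printf '%s\n' \
        'From: Ignoramus12345 <a@example.invalid>' \
        'From: Ignoramus678 <b@example.invalid>' \
        'From: Ignoramus <c@example.invalid>' \
        'From: Ignatius <d@example.invalid>' |
        grep -E -c -- "$1"
}

count_from '^From: Ignoram.*'         # prints 3
count_from '^From: Ignoramus[0-9]*'   # prints 3: the digits are optional
count_from '^From: Ignoramus[0-9]+'   # prints 2: at least one digit required
```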

EVEN filtering "Ig" would do it, but then you risk some
day filtering a new participant named Ignatius or Igor...

Or -- someone who has a munged e-mail address of
"Ignore-this@not-my-domain". :-)

Right, so extending "Ig" out to "Ignoram" gets every
instance of Ignoramus, and no instances of anyone else,
or any other word in the dictionary. Anything beyond
"Ignoram" is unnecessarily narrow and time consuming.
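
How far the prefix has to be extended is easy to test against a list of names. A sketch on invented data, assuming "grep -E"; the helper name count_prefix and the four sample posters are made up for illustration:

```shell
# count_prefix PATTERN: count invented From: lines that match PATTERN
# as an extended RE. Helper and sample posters are hypothetical.
count_prefix() {
    printf '%s\n' \
        'From: Igor <a@example.invalid>' \
        'From: Ignatius <b@example.invalid>' \
        'From: Ignore-this@not-my-domain' \
        'From: Ignoramus12345 <c@example.invalid>' |
        grep -E -c -- "$1"
}

count_prefix '^From: Ig'        # prints 4: catches everyone
count_prefix '^From: Ignor'     # prints 2: still catches Ignore-this
count_prefix '^From: Ignoram'   # prints 1: only the intended poster
```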

Of course, this assumes that I *want* to filter him. I don't
particularly want to do so. But it is certainly proved that those who
do should be able to if they take the time to figure out how their
newsreader's killfile works. (At least for real newsreaders. No bets
about OE. :-)

Enjoy,
DoN.



--
Email: <dnichols@xxxxxxxxxxx> | Voice (all times): (703) 938-4564
(too) near Washington D.C. | http://www.d-and-d.com/dnichols/DoN.html
--- Black Holes are where God is dividing by zero ---