Re: Filtering Google Groups: I give up :-)



In news.software.readers on 3 Aug 2007 17:01:09 GMT, Blinky the Shark
<no.spam@xxxxxxxxxxx> wrote:

> I *think* most of the difference is that some stuff has to be escaped in
> Xnews but not slrn. I hope someone can add more info here.

Different things have to be escaped, because Xnews uses
Perl-compatible regular expressions and slrn uses S-lang-compatible
regular expressions, but some things have to be escaped in both
newsreaders' scorefiles.

There was some discussion here:
<http://groups.google.co.uk/group/news.software.readers/browse_thread/thread/674242aab04b7979/ed5dea552c5649b0>

The pjr.gotdns.org web page I referred to during the discussion isn't
currently available (because I want more than one page before
announcing my slrn help pages to the world, and because I've been too
lazy to add much), but this is what it said:


S-Lang and PCRE differences
===========================

Introduction
------------

Several newsreaders, such as Pan, Xnews and Tin, use Perl-Compatible
Regular Expressions (PCRE) in their scorefiles. Slrn is unusual in
using S-Lang's simple inbuilt regular expression routines instead. The
following is a guide for intermediate users of PCRE-enabled
newsreaders (not beginners or experts) who wish to convert the regular
expressions in their existing scorefiles for use in Slrn. Please note
that this is currently a work in progress.

Case-sensitivity
----------------

In PCRE, case-sensitivity is turned on with (?-i) and turned off with
(?i). In S-Lang the equivalents are \c and \C.

Examples:

% Pan or Xnews:
Score: =-9999
Subject: (?-i)HELP
Score: =9999
Subject: (?i)slrn

These rules kill posts whose subject includes "HELP" (but not "help"
or "Help"), and mark other posts whose subjects include "slrn", "SLRN"
or "Slrn" as interesting. The exact equivalent in Slrn is as follows:

% Slrn:
Score: =-9999
Subject: \cHELP
Score: =9999
Subject: \Cslrn

### added note: case-sensitivity is OFF by default in slrn scorefiles;
### I don't know what the default is for the PCRE newsreaders.
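
### added sketch: the flag mapping above could be automated roughly as
### follows. This is only an illustration in Python, not part of slrn,
### Pan or Xnews. The helper name convert_case_flags is made up, and it
### handles only the two bare flags (?-i) and (?i), not combined flags
### such as (?im) or an escaped literal like \(?i).

```python
# Hypothetical helper: rewrite PCRE inline case flags as S-Lang escapes.
PCRE_TO_SLANG_CASE = {
    "(?-i)": r"\c",  # case-sensitive from this point on
    "(?i)": r"\C",   # case-insensitive from this point on
}


def convert_case_flags(pattern: str) -> str:
    # Plain text substitution; everything else in the pattern is untouched.
    for pcre_flag, slang_escape in PCRE_TO_SLANG_CASE.items():
        pattern = pattern.replace(pcre_flag, slang_escape)
    return pattern


if __name__ == "__main__":
    print(convert_case_flags("(?-i)HELP"))  # prints \cHELP
    print(convert_case_flags("(?i)slrn"))   # prints \Cslrn
```

### The two printed lines correspond to the Subject lines in the Slrn
### example above.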


Word boundaries
---------------

In PCRE, word boundaries are matched with \b. In S-Lang, a distinction
is made between the beginning of a word and its end, \< being used for
the former and \> for the latter.

Examples:

% Pan or Xnews
Score: =-9999
From: \bfred\b

% Slrn
Score: =-9999
From: \<fred\>

Both rules match "fred" but not "alfred" or "frederick".
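
The claim is easy to check on the PCRE side, and the conversion can be
sketched in a few lines of Python. Two assumptions apply: Python's re
module is used as a stand-in for PCRE (for this simple pattern \b behaves
the same), and convert_word_boundaries is a made-up helper that guesses
the direction of each \b from the character that follows it. That guess
only works when \b sits next to a literal word character, so patterns
such as \b[a-z]+\b still need converting by hand.

```python
import re

WORD_CHAR = re.compile(r"\w")


def convert_word_boundaries(pattern: str) -> str:
    # Rewrite each PCRE \b as S-Lang \< (word start) or \> (word end).
    out = []
    i = 0
    while i < len(pattern):
        if pattern.startswith(r"\b", i):
            following = pattern[i + 2:i + 3]
            # A word character after \b means a word starts here;
            # anything else (or end of pattern) is treated as a word end.
            out.append(r"\<" if WORD_CHAR.match(following) else r"\>")
            i += 2
        elif pattern[i] == "\\" and i + 1 < len(pattern):
            out.append(pattern[i:i + 2])  # copy other escapes unchanged
            i += 2
        else:
            out.append(pattern[i])
            i += 1
    return "".join(out)


if __name__ == "__main__":
    names = ["fred", "alfred", "frederick"]
    # Only the whole word "fred" survives the \b...\b test.
    print([n for n in names if re.search(r"\bfred\b", n)])  # prints ['fred']
    print(convert_word_boundaries(r"\bfred\b"))             # prints \<fred\>
```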

Parentheses
-----------

In both S-Lang and PCRE, parentheses can either match literal
parenthesis characters or mark part of a regular expression as a
group, depending on whether they are escaped, but the escaping
convention is reversed between the two languages.

PCRE: ( and ) group a sub-expression; \( and \) are literal matches.
S-Lang: \( and \) group a sub-expression; ( and ) are literal matches.

Examples:

% Pan or Xnews
Score: =-9999
From: (kook)\1\1
Subject: \(off-topic\)

% Slrn
Score: =-9999
From: \(kook\)\1\1
Subject: (off-topic)

Both examples match messages written by "kookkookkook" whose Subjects
contain the literal string "(off-topic)".

Note that back references such as \1, \2, etc. are the only use for
grouping parentheses in S-Lang. Patterns such as (foo)+ have no S-Lang
equivalent. In S-Lang, ?, + and * apply only to the single preceding
character or character class.
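
Swapping the escaping of parentheses is mechanical, so it can be
sketched as a small Python function. The name swap_paren_escaping is
made up for this illustration. It assumes the pattern uses parentheses
only for plain groups and literals: it does not know about PCRE
constructs such as (?:...) or (?i), about quantified groups like (foo)+
(which have no S-Lang equivalent anyway), or about parentheses inside
character classes.

```python
def swap_paren_escaping(pattern: str) -> str:
    # Turn ( ) into \( \) and \( \) into ( ), leaving other escapes alone.
    out = []
    i = 0
    while i < len(pattern):
        ch = pattern[i]
        if ch == "\\" and i + 1 < len(pattern):
            escaped = pattern[i + 1]
            # An escaped parenthesis becomes bare; any other escape
            # (for example the back reference \1) is copied as-is.
            out.append(escaped if escaped in "()" else ch + escaped)
            i += 2
        elif ch in "()":
            out.append("\\" + ch)  # a bare parenthesis becomes escaped
            i += 1
        else:
            out.append(ch)
            i += 1
    return "".join(out)


if __name__ == "__main__":
    print(swap_paren_escaping(r"(kook)\1\1"))     # prints \(kook\)\1\1
    print(swap_paren_escaping(r"\(off-topic\)"))  # prints (off-topic)
```

Because the swap is symmetric, running the function on its own output
gives back the original pattern, so the same sketch works in either
direction.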



(The above text is GPL'ed, if anybody wants to copy and improve it.)


--
PJR :-)


