Re: Possible bug with StringScanner class



On 7/22/05, Eric Mahurin <eric_mahurin@xxxxxxxxx> wrote:
>
> --- John Halderman <jhalderman@xxxxxxxxx> wrote:
>
> > I'm not sure if this is a bug or intentional behavior, so I
> > thought I would
> > post it here to see what the community thought of what was
> > happening. If you
> > set up a StringScanner object to perform iterative matching
> > on a string the
> > behavior of \A and ^ seem to always match. It seems to me
> > that \A should
> > only match if it is the first match performed, and ^ should
> > only match if
> > bol? returns true, which should be after a \n or if it is the
> > first match
> > performed.
>
> You should think of the current position as the beginning of
> the string for matching. In addition, the regexp that scan
> gets is implicitly anchored to that spot. So specifying \A or
> ^ at the beginning of a regexp for scan is redundant.


There is nothing in the documentation to suggest that the current position
should be treated as the beginning of the string for matching purposes, only
that any match must start at that position. That would mean a regexp
beginning with ^ should match only when the current position is preceded by
\n or is at the beginning of the string. Furthermore, the existence of bol?
suggests that the current position is not to be considered the beginning of
a line. As for whether it should be considered the beginning of the string,
that remains ambiguous, although I believe it makes more sense for it not to
be.
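
A minimal sketch of the behavior under discussion, assuming a StringScanner
created with its default options. The helper name anchor_demo and the sample
string "foo bar" are made up for illustration and do not come from the
original report. After "foo " has been consumed, bol? should report false,
yet patterns starting with \A or ^ are still expected to match at the scan
pointer, while a pattern that only matches further along is rejected.

```ruby
require 'strscan'

# Hypothetical helper (not from the thread): consume "foo " and then record
# how anchors behave at the new scan position.
def anchor_demo(text = "foo bar")
  results = {}

  s = StringScanner.new(text)
  s.scan(/foo /)                  # pointer now sits just before "bar"
  results[:pos] = s.pos
  results[:bol] = s.bol?          # is the pointer at the beginning of a line?

  # check returns the matched string (or nil) without advancing the pointer,
  # so every pattern below is tried at the same position.
  results[:string_anchor] = s.check(/\Abar/)
  results[:line_anchor]   = s.check(/^bar/)

  # "ar" occurs one character past the pointer; an anchored match rejects it.
  results[:not_at_pointer] = s.check(/ar/)

  results
end

anchor_demo.each { |name, value| puts "#{name}: #{value.inspect}" }
```

If the anchors do match while bol? is false, that is exactly the mismatch
described above: the scanner knows it is not at the beginning of a line, but
the regexp engine is handed the remaining text as if it were a fresh string.
Later strscan releases accept a fixed_anchor: option to StringScanner.new
that changes how \A and ^ are anchored; the sketch does not use it.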


