Re: Possible bug with StringScanner class



--- John Halderman <jhalderman@xxxxxxxxx> wrote:

> On 7/22/05, Eric Mahurin <eric_mahurin@xxxxxxxxx> wrote:
> >
> > --- John Halderman <jhalderman@xxxxxxxxx> wrote:
> >
> > > I'm not sure if this is a bug or intentional behavior, so I
> > > thought I would post it here to see what the community thought
> > > of what was happening. If you set up a StringScanner object to
> > > perform iterative matching on a string, \A and ^ seem to always
> > > match. It seems to me that \A should only match if it is the
> > > first match performed, and ^ should only match if bol? returns
> > > true, which should be after a \n or if it is the first match
> > > performed.
> >
> > You should think of the current position as the beginning of
> > the string for matching. In addition, the regexp that scan
> > gets is implicitly anchored to that spot. So specifying \A or
> > ^ at the beginning of a regexp for scan is redundant.
>
>
> There is nothing in the documentation to suggest that the current
> position should be considered the beginning of a string for
> matching purposes, only that any match must start at that
> position. That would mean a regexp beginning with ^ would need the
> current position to be preceded by \n or be at the beginning of
> the string in order for it to match. Furthermore, the existence of
> bol? suggests that the current position is not to be considered
> the beginning of the line. As for whether it should be considered
> the beginning of the string, that remains ambiguous, although I
> believe it makes more sense for it not to be so.

I think it makes perfect sense. scan/scan_until/etc. can only
look at what is after the current position. They have no
visibility into what is before the current position. So, you
should consider it to be the beginning of the string for
matching purposes. Whether you like it or not, that is the way
it works and I think it is intentional.
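
Here is a minimal sketch of the behavior under discussion, assuming
Ruby's standard strscan library in its default mode. The sample string
and the variable names are made up for illustration; check is used for
the anchored patterns only so that the scan pointer does not move
between attempts.

```ruby
require 'strscan'

s = StringScanner.new("foo bar")

first = s.scan(/foo /)        # consumes "foo "; the pointer now sits before "bar"
at_bol = s.bol?               # false: the pointer is mid-line, not after a "\n"

# check matches at the current position without advancing the pointer.
with_a     = s.check(/\Abar/) # \A is satisfied at the current position
with_caret = s.check(/^bar/)  # so is ^, even though bol? returned false

plain = s.scan(/bar/)         # the unanchored pattern matches the same text

puts "first=#{first.inspect} bol?=#{at_bol}"
puts "\\A: #{with_a.inspect}  ^: #{with_caret.inspect}  plain: #{plain.inspect}"
```

All three patterns match "bar" at the same spot, which is why a
leading \A or ^ is redundant for scan. Note that bol? still reports
false there, so it is the method to call if you actually need to know
whether you are at the start of a line.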
