Re: RegExp as Finite State Machine
- From: Thomas 'PointedEars' Lahn <PointedEars@xxxxxx>
- Date: Mon, 05 Jan 2009 00:14:54 +0100
Lasse Reichstein Nielsen wrote:
Thomas 'PointedEars' Lahn <PointedEars@xxxxxx> writes:
Lasse Reichstein Nielsen wrote:
Classsical counter-example of regularness:In which way is that proof that ECMAScript Regular Expressions are not a
/^(11+)\1+$/
which recognizes composite numbers (non-primes) in unary notation.
regular language?
The language is not regular (It's composite, the language of "1"'s of
prime length, is not regular. This is failry easily proved by assuming
it's regular, and using pumping to derive a contradition).
Ah, but I think you confuse the language that a regular expression can
match, L_m(r) -- he above matched language is context-free (Chomsky Type-2)
but not regular (Type-3), if I'm not mistaken -- with the Regular Expression
language, L_r. In order to prove that the ECMAScript Regular Expression
language is not regular, you have to apply the pumping lemma for regular
languages to the expression itself, not to the strings it could match.
The expression is a valid Javascript regular expression that recognizes
a non-regular language, so Javascript regexps are not restricted to the
regular languages (and therefore not possible to implement using only
a finite state machine).
Sorry, I don't think you are correct here.
Of the current crop of browsers, AFAIK none of them use a FSM
based implementation of regular expressions.
So sure are you, despite other engines that also support backreferences
(like egrep) have been demonstrated to use FSM implementations?
<http://oreilly.com/catalog/regex/chapter/ch04.html>
Referring to this?
"You might, however, notice that GNU's version of egrep does support
backreferences. It does so by having two complete engines under the
hood! It first uses a DFA engine to see whether a match is likely,
and then uses an NFA engine (which supports the full flavor,
including backreferences) to confirm the match."
Yes, exactly.
I'm only talking about the current regexp implementations because I
have actually been looking at them.
We'll see.
What I probably /should/ have also said, is that by FSM I mean a
deterministic machine (i.e., a DFA, one that is run linearly over the
input). A nondeterministic machine can use any number of tricks to
implement the nondeterminism, including backtracking.
But your premise is incorrect. Nondeterministic Finite Automata (NFAs),
also called "nondeterministic finite state machines", are definitely a
subset of Finite State Machines (FSMs).
<http://en.wikipedia.org/wiki/Nondeterministic_finite_state_machine>
PointedEars
.
- Follow-Ups:
- Re: RegExp as Finite State Machine
- From: Lasse Reichstein Nielsen
- Re: RegExp as Finite State Machine
- From: Stevo
- Re: RegExp as Finite State Machine
- References:
- Re: RegExp as Finite State Machine
- From: Thomas 'PointedEars' Lahn
- Re: RegExp as Finite State Machine
- From: Lasse Reichstein Nielsen
- Re: RegExp as Finite State Machine
- From: Thomas 'PointedEars' Lahn
- Re: RegExp as Finite State Machine
- From: Lasse Reichstein Nielsen
- Re: RegExp as Finite State Machine
- Prev by Date: Re: RegExp as Finite State Machine
- Next by Date: Re: questions about form action= and onsubmit() handling
- Previous by thread: Re: RegExp as Finite State Machine
- Next by thread: Re: RegExp as Finite State Machine
- Index(es):
Relevant Pages
|