Re: Treetop parser (or PEG in general?) questions



On Jan 29, 2:53 pm, Clifford Heath <n...@xxxxxxxxxxxxxxx> wrote:
> Phrogz wrote:
> > For the archives, the original PEG/packrat paper defines:
> >  EndOfFile <- !.
>
> That would be a good thing to add, and I think the meta-grammar
> would still parse.

I suppose I'm suggesting that it isn't badly needed, since it's
trivial to add:
  rule EOF
    !.
  end
to any grammar where such a marker is useful. It probably would be
nice to have in the official language, in one form or another, though
I could see that starting to open the door to adding other
'convenience' rules built in and made available to every grammar.
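
For anyone reading along who hasn't met "!." before: it is a negative
lookahead on "any character", so it succeeds only when nothing is left
to read, and it consumes no input. A rough sketch of the same idea in
plain Ruby, using StringScanner rather than Treetop (the eof? helper is
a made-up name for illustration, not part of any library):

```ruby
require 'strscan'

# Hypothetical helper: succeeds only if NO character can be read at the
# current position, mirroring the PEG expression `!.`.
# Like any PEG predicate, it never advances the scan position.
def eof?(scanner)
  !scanner.check(/./m)   # check looks ahead without consuming; /m lets . match "\n"
end

scanner = StringScanner.new("ab")
eof?(scanner)        # false: "a" is still ahead
scanner.scan(/ab/)   # consume everything
eof?(scanner)        # true: nothing left to match
```

StringScanner already offers eos? for this; the point of the sketch is
only to show that end-of-input falls out of "not any character".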

> I think Nathan would oppose it, but I'd also like to add regexes
> as terminals for performance, so that a SyntaxNode isn't needed for
> every character.

I would be very, very in favor of this. Not just for performance, but
also for simplicity: there are stretches of input where I don't need
node-level granularity and where a regexp would be easier to write.
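
To make the node-count difference concrete, here is a small sketch in
plain Ruby. It is not Treetop code: Node is a stand-in struct I made up,
and the two methods only imitate "one node per character" versus "one
node for the whole regex match".

```ruby
require 'strscan'

# Hypothetical stand-in for a parse-tree node; not Treetop's SyntaxNode.
Node = Struct.new(:text)

# One node per character, as when a run of digits is assembled from
# single-character terminals.
def digits_per_char(input)
  scanner = StringScanner.new(input)
  nodes = []
  while (ch = scanner.scan(/[0-9]/))
    nodes << Node.new(ch)
  end
  nodes
end

# One node for the whole run, as a regex terminal could produce.
def digits_as_regex(input)
  scanner = StringScanner.new(input)
  text = scanner.scan(/[0-9]+/)
  text ? [Node.new(text)] : []
end

digits_per_char("2008").size  # 4 nodes
digits_as_regex("2008").size  # 1 node
```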

I read a snippet that made it sound like Perl6 combines regexps and
PEG in some way; haven't looked into it any further to find out,
though.

> I've also suggested to him that certain rules could be designated as
> "skip" rules, for which no SyntaxNode is built. Perhaps also that
> a normal rule could identify another rule as a skip rule, which is
> implicitly inserted between (but not around) each node of this rule.
> This would allow such rules to implement the whitespace behaviour of
> other parser generators, and perhaps also to build lists. Something
> like:
>
> rule statement skip whitespace
>     'if' expression statement ( 'else' statement )?
>     / etc...
> end
>
> where the whitespace rule is implicitly inserted, or
>
> rule parameter_list skip comma_white
>     item+
> end
>
> which would function as if I'd said
>
>     item (comma_white item)*
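
Just to check my understanding of the expanded form, item
(comma_white item)*, here is how I read it as plain Ruby. This is only
a sketch: the ITEM and COMMA_WHITE patterns are my guesses at what those
rules might match, and the separators are recognized but produce no
result, which is what I take "skip" to mean.

```ruby
require 'strscan'

ITEM        = /[a-z]+/    # assumed shape of `item`
COMMA_WHITE = /\s*,\s*/   # assumed shape of `comma_white`

# Recognizes  item (comma_white item)*  and returns only the items.
# Returns nil if the input is not a complete list.
def parameter_list(input)
  scanner = StringScanner.new(input)
  first = scanner.scan(ITEM)
  return nil unless first
  items = [first]
  loop do
    mark = scanner.pos
    break unless scanner.scan(COMMA_WHITE)
    item = scanner.scan(ITEM)
    unless item
      scanner.pos = mark   # backtrack: a separator with no item after it
      break
    end
    items << item
  end
  scanner.eos? ? items : nil
end

parameter_list("a, b ,c")  # ["a", "b", "c"]
parameter_list("a, ")      # nil (trailing separator)
```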

Hrm, not wild about implicit insertion into rules. (But then I'm just
a bumpkin.) I would think instead you'd want some way to label each
part of a rule, or a rule as a whole:

  rule some_name:no_node
    ...
  end

  rule some_name
    (item whitespace:no_node)+
  end

Additionally, something that I've wanted a few times (for the email
parser, for example) would be a parser command that skips ALL node
creation, determining only whether the parse is possible. Something like:
  @parser.parse( str, :nodes=>:omit )
or
  @parser.validate( str )

No idea how much memory or speed would be saved with this; my
assumption is "a good amount", simply given the proliferation of nodes
and modules and extending.
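
To show what I mean by a validate-only mode, here is a toy hand-written
parser for sums like "1+2+3". None of this is Treetop's API: SumParser,
its nodes: keyword and validate are all invented for the sketch. The
same matching code runs in both modes; the only difference is whether
Node objects get allocated along the way.

```ruby
require 'strscan'

# Hypothetical toy parser for sums like "1+2+3"; not Treetop's API.
class SumParser
  Node = Struct.new(:kind, :text)

  # With nodes: true, returns an array of Nodes (or nil on failure).
  # With nodes: false, returns only true/false and allocates no Nodes.
  def parse(input, nodes: true)
    scanner = StringScanner.new(input)
    tree = nodes ? [] : nil
    return failure(nodes) unless number(scanner, tree)
    while scanner.scan(/\+/)
      tree << Node.new(:plus, "+") if tree
      return failure(nodes) unless number(scanner, tree)
    end
    return failure(nodes) unless scanner.eos?
    nodes ? tree : true
  end

  def validate(input)
    parse(input, nodes: false)
  end

  private

  # Matches one number; records a Node only when a tree is being built.
  def number(scanner, tree)
    text = scanner.scan(/[0-9]+/)
    tree << Node.new(:number, text) if text && tree
    !text.nil?
  end

  def failure(nodes)
    nodes ? nil : false
  end
end

parser = SumParser.new
parser.validate("1+22+3")  # true
parser.validate("1+")      # false
parser.parse("1+2").size   # 3 (number, plus, number)
```

Whether a real generated parser would save much this way I can't say
without measuring; the sketch only shows that the recognizer logic does
not need the nodes to decide success or failure.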