Re: Treetop parser (or PEG in general?) questions
- From: Phrogz <phrogz@xxxxxxx>
- Date: Tue, 29 Jan 2008 18:16:20 -0800 (PST)
On Jan 29, 2:53 pm, Clifford Heath <n...@xxxxxxxxxxxxxxx> wrote:
> Phrogz wrote:
>> For the archives, the original PEG/packrat paper defines:
>>   EndOfFile <- !.
> That would be a good thing to add, and I think the meta-grammar
> would still parse.
I suppose I'm suggesting that it isn't badly needed, since it's
trivial to add:
  rule EOF
    !.
  end
to any grammar where such a marker is useful. It probably would be
nice to have in the official language, in one form or another, though
I could see that starting to open the door to adding other
'convenience' rules built in and made available to every grammar.
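For anyone reading along who hasn't met `!.` before, here is a minimal sketch in plain Ruby of what that rule checks. It does not use Treetop at all; `any_char?` and `eof?` are names I made up for illustration.

```ruby
# Hypothetical helpers, not part of Treetop: a hand-rolled reading of `!.`.
# `.` matches any single character at the current position.
def any_char?(input, pos)
  pos < input.length
end

# `!` is a negative lookahead: it succeeds only when its operand fails,
# and it never consumes input. So `!.` succeeds only at end of input.
def eof?(input, pos)
  !any_char?(input, pos)
end
```

So `eof?("abc", 3)` is true and `eof?("abc", 1)` is false, which is all the three-line grammar rule is saying.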
> I think Nathan would oppose it, but I'd also like to add regexes
> as terminals for performance, so that a SyntaxNode isn't needed for
> every character.
I would be very, very in favor of this. Not just for performance, but
also for simplicity when consuming a run of text where I don't need
per-character granularity and a regexp would be easier to write.
I read a snippet that made it sound like Perl6 combines regexps and
PEG in some way; haven't looked into it any further to find out,
though.
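To make the node-count difference concrete, here is a rough sketch in plain Ruby (StringScanner from the standard library, no Treetop). `Node` is a hypothetical stand-in for a parse-tree node, not Treetop's real SyntaxNode, and both method names are made up.

```ruby
require 'strscan'

# Hypothetical stand-in for a parse-tree node.
Node = Struct.new(:text, :children)

# One child node per character plus a parent, roughly the shape you get
# when a repetition is matched one character at a time.
def digits_per_char(input)
  scanner = StringScanner.new(input)
  children = []
  while (ch = scanner.scan(/[0-9]/))
    children << Node.new(ch, [])
  end
  children.empty? ? nil : Node.new(children.map(&:text).join, children)
end

# A single regexp terminal: one node for the whole run of digits.
def digits_regexp(input)
  text = StringScanner.new(input).scan(/[0-9]+/)
  text && Node.new(text, [])
end
```

For the input "2008" the first version allocates five nodes and the second allocates one, with the same matched text either way.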
> I've also suggested to him that certain rules could be designated as
> "skip" rules, for which no SyntaxNode is built. Perhaps also that
> a normal rule could identify another rule as a skip rule, which is
> implicitly inserted between (but not around) each node of this rule.
> This would allow such rules to implement the whitespace behaviour of
> other parser generators, and perhaps also to build lists. Something
> like:
>
>   rule statement skip whitespace
>     'if' expression statement ( 'else' statement )?
>     / etc...
>   end
>
> where the whitespace rule is implicitly inserted, or
>
>   rule parameter_list skip comma_white
>     item+
>   end
>
> which would function as if I'd said
>
>   item (comma_white item)*
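As I read the proposal, the expanded form `item (comma_white item)*` with a node-less separator would behave something like the following plain-Ruby sketch. The token patterns `ITEM` and `COMMA_WHITE` are assumptions of mine, and nothing here is Treetop code.

```ruby
require 'strscan'

# Assumed token shapes, chosen only for illustration.
ITEM        = /[a-z_]\w*/
COMMA_WHITE = /\s*,\s*/

# Behaves like: item (comma_white item)* followed by end of input.
# The separator is matched and discarded, so only the items are returned.
def parameter_list(input)
  scanner = StringScanner.new(input)
  first = scanner.scan(ITEM)
  return nil unless first
  items = [first]
  loop do
    mark = scanner.pos
    item = scanner.skip(COMMA_WHITE) && scanner.scan(ITEM)
    unless item
      scanner.pos = mark # backtrack past a dangling separator
      break
    end
    items << item
  end
  scanner.eos? ? items : nil
end
```

The separator is inserted between items but not around them, so a trailing comma leaves unconsumed input and the parse fails.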
Hrm, not wild about implicit insertion into rules. (But then I'm just
a bumpkin.) I would think instead you'd want some way to label each
part of a rule, or a rule as a whole:
  rule some_name:no_node
    ...
  end

  rule some_name
    (item whitespace:no_node)+
  end
Additionally, something that I've wanted a few times (like the email
parser) would be a parser command that skips ALL node creation,
determining only if the parse is possible. Something like:
  @parser.parse( str, :nodes=>:omit )
or
  @parser.validate( str )
No idea how much memory or speed would be saved with this; my
assumption is "a good amount", simply given the proliferation of nodes
and modules and extending.
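To show what I mean by a recognize-only mode, here is a toy sketch in plain Ruby. It is not Treetop and says nothing about how Treetop would implement this; `parse_sum`, `valid_sum?` and the `build:` keyword are all invented for the example, and the grammar is just number ('+' number)* followed by end of input.

```ruby
require 'strscan'

# Hypothetical mini-parser for: number ('+' number)* !.
# With build: false it only recognizes: no node arrays and no matched
# substrings are allocated, and the result is simply true or nil.
def parse_sum(input, build: true)
  scanner = StringScanner.new(input)
  nodes = build ? [] : nil
  loop do
    # scan returns the matched string; skip returns only its length.
    matched = build ? scanner.scan(/\d+/) : scanner.skip(/\d+/)
    return nil unless matched
    nodes << [:number, matched] if build
    break unless scanner.skip(/\+/)
  end
  return nil unless scanner.eos?
  build ? [:sum, nodes] : true
end

# The validate-style entry point: a plain yes/no answer.
def valid_sum?(input)
  !parse_sum(input, build: false).nil?
end
```

The control flow is identical in both modes; the only difference is whether anything gets allocated along the way, which is where I would expect the savings to come from.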