Re: Alternate Regular Expressions?



Ari Brown wrote:

Just randomly curious -

Is there an alternate RegExp "language" to the current one in Ruby and Perl?

I don't know. So here's a dissertation on where to start.

The good news is a RegExp is only two things at heart...

- a Domain-Specific Language for describing a pattern
- a state machine that the language programs.
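
To make the state-machine half concrete, here is a minimal sketch of a hand-rolled machine that should accept the same strings as /\Aab*c\z/. The state names and the machine_match? helper are made up for illustration; a real RegExp engine is far more elaborate than this.

```ruby
# Hand-rolled state machine meant to accept the same strings as /\Aab*c\z/.
# The state names (:start, :seen_a, :done) are invented for this sketch.
TRANSITIONS = {
  start:  { 'a' => :seen_a },                # must begin with "a"
  seen_a: { 'b' => :seen_a, 'c' => :done },  # any number of "b", then "c"
  done:   {}                                 # nothing may follow the "c"
}.freeze

def machine_match?(input)
  state = :start
  input.each_char do |ch|
    state = TRANSITIONS[state][ch]
    return false if state.nil?               # no transition defined: reject
  end
  state == :done                             # accept only in the final state
end
```

The RegExp string is the program; the table is roughly what it compiles down to.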

The bad news is, back in the day, people used to invent DSLs as long strings of easily parsed characters. For example, a language called LSYSTEM might describe turtle graphics like this:

s=[::cc!!!!&&[FFcccZ]^^^^FFcccZ] # upper spikes

The really bad news is RegExp is one of these string-oriented DSLs that stuck. It will always be useful, so programmers forget how much room it has for improvement.

The good news is Ruby excels at generating light DSLs. The equivalent expression for a modern implementation of LSYSTEM might look like this:

upper_spikes = push.twist(2).thinner(2).increase_angle(4)....

etc. Because Ruby gives your programming interfaces extreme notational flexibility, you can declare the interfaces most convenient for your domain.

So start writing, and research other DSLs as you go. For example, here's a RegExp DSL written with C++ template metaprogramming (Boost's xpressive):

http://boost-sandbox.sourceforge.net/libs/xpressive/doc/html/index.html

That language lets you slip back to raw RegExp syntax whenever you like. Your effort should have a similar shunt.
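
As a rough sketch of what such a shunt might look like in Ruby: the Pattern class below and all of its method names (literal, digits, raw, to_regexp) are hypothetical, invented for this post and not taken from any existing library. The only real API it leans on is Ruby's own Regexp class.

```ruby
# Hypothetical fluent pattern builder; every name here is invented for illustration.
class Pattern
  def initialize(source = '')
    @source = source
  end

  # Match the given text literally (special characters get escaped).
  def literal(text)
    append(Regexp.escape(text))
  end

  # Match exactly `count` digits, or one or more digits if no count is given.
  def digits(count = nil)
    append(count ? "\\d{#{count}}" : '\\d+')
  end

  # The shunt: splice raw RegExp source in unchanged.
  # Note that any options on the Regexp (such as /i) are dropped in this sketch.
  def raw(regexp)
    src = regexp.is_a?(Regexp) ? regexp.source : regexp.to_s
    append("(?:#{src})")
  end

  # Compile to an ordinary Regexp anchored at both ends of the string.
  def to_regexp
    Regexp.new("\\A#{@source}\\z")
  end

  private

  # Return a new Pattern so chained calls never mutate an earlier one.
  def append(fragment)
    Pattern.new(@source + fragment)
  end
end

date = Pattern.new.digits(4).literal('-').digits(2).raw(/-[0-3]\d/).to_regexp
```

The result is a plain Regexp, so date.match?("2007-04-15") works like any other. The design choice worth copying is that raw accepts ordinary RegExp source, so nobody has to wait for the DSL to grow a method for every feature.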

English is like a pseudo-random number generator - there are a bajillion rules to it, but nobody cares.

Of all the world's languages, English is both the ugliest and the beautifulest.

--
Phlip
http://www.oreilly.com/catalog/9780596510657/
"Test Driven Ajax (on Rails)"
assert_xpath, assert_javascript, & assert_ajax

