Packrat parsing – O meta, where art thou!

Diving deeper into language construction, you’ll have to learn the basics of compiler theory. I have no real practice crafting a parser by hand. Currently reading the ANTLR book from Terence Parr (not the Language Implementation Patterns – it really sounds very promising). What really got me attracted are packrat parsers. Parsing Expression Grammars (PEG) bacuse they are more concise for human brains.

 

What’s that buzz? Head over to wikipedia and read on what’s written about PEGs or have a look at the comparison of parser generators. I’ve corrected the link for OMeta and added a link to kathadin, a mutable PEG based interpreter runtime for interconnecting languages. PEGs got my attention some time ago as I was reading the Boo Language mailing list. On Sunday @ricksaurus posted a tweet about a PEG based system called katahdin. The interpreted system isn’t maintained. Chris Seaton is in the army since 2007 and relesed his very first 0.2 version to the public domain after writing his master thesis. It’s a mess running the source on Windows with dotnet because of its dynamic nature and security issues with dynamic created assemblies and DynamicMethod. Will have to digg into this security related stuff later on. My machine runs not with full trust. As off now I’m using VS2010 to edit and build the assemblies but mono to run them.

 

The runtime has a ridiculously long bootstrapping time. But this is described in his paper. I’ve done some eval and I’m really impressed. The runtime has a boostrapping parser with a low level set of instructions and a hosting language (*.kat) defining the glue code and operators for creating new languages. Chris created an proof-of-concept for python, fortran and sqlite. All you have to do is specify the language context for which grammar you’d like to write your statements and expressions.It’ll take some more time to optimize the parser/runtime of katahdin. Boo in its next incarnation will be based on OMeta, a PEG based parser. Rodrigo has allready built Boo.Peg and Boo.OMeta some time ago in his Boo-Extensions project an will merge them into the boo master next year. This will bring us an ometa, syntax and data (unions) macro. The compiler will be self hosted in OMeta (like Perl6 does). This is really exciting as the implementation of Boo.OMeta is very marture. Boo offers some more metaprogramming capabilities like other languages also do, namely splicing, quasi quotation and macros.

Never the less. I’m very excited about katahdin atm. Its really a nobrainer to write your own extensions to katahdin. If there are some talented compileristas I would appreciate getting help to get the katahdin runtime polished.

Python/Fortran fusion sample:

import "fortran.kat";
import "python.kat";

fortran
{
      SUBROUTINE RANDOM(SEED, RANDX)

      INTEGER SEED
      REAL RANDX

      SEED = 2045*SEED + 1
      SEED = SEED - (SEED/1048576)*1048576
      RANDX = REAL(SEED + 1)/1048577.0
      RETURN

      END
}

python
{
    seed = 128
    randx = 0

    for n in range(5):
        RANDOM(seed, randx)
        print randx
}

 

sqlite:

 

import "sqlite.kat";

database = new Mono.Data.SqliteClient.SqliteConnection(
    "version=3,URI=file::memory:");

database.Open();

database? create table films(title text, year integer, director text);

database? insert into films values("Raiders of the Lost Ark",
    1981, "Steven Spielberg");

database? insert into films values("Indiana Jones and the Temple of Doom",
    1984, "Steven Spielberg");

database? insert into films values("Indiana Jones and the Last Crusade",
    1989, "Steven Spielberg");

films = database? select year, title, director
    from films
    where year >= {args[0]};

print "Films made since " + args[0];

for ([title, year, director] in films)
    print " " + title + ", " + year + " (" + director + ")";

 

How about writing your own extensions to NHibernate replicating HQL as source (and no strings). There is a integrated test syntax, too:

 

import "test.kat";

// Positive and negative

test
{
    test
    {
        assert + 0 == 0;
        assert - 0 == 0;
    }

    test
    {
        assert + 14 == 14;
        assert - + 14 == - 14;
        assert + - + 14 == - 14;
        assert - + - + 14 == 14;
    }

    test
    {
        assert - 2 + 3 == (- 2) + 3;
        assert - (2 + 3) == - 5;
    }
}
DotNetKicks-DE Image
Published Dienstag, 30. November 2010 22:25 von Rainer Schuster

Kommentare

# re: Packrat parsing – O meta, where art thou!

Mittwoch, 1. Dezember 2010 09:56 von Lars

Hallo,

wie sind denn Deine Erfahrungen mit PEG-Parsern bzgl. Fehlerbehandlung ? Während ich die PEGs an sich recht eingängig finde (auch wenn die Behandlung von Whitespace gewöhnungsbedürftig ist), finde ich die Fehlerbehandlung sehr aufwändig.

Ein paar Erfahrungen zu diesem Thema finde ich interessant.

Lars

# re: Packrat parsing – O meta, where art thou!

Mittwoch, 1. Dezember 2010 11:15 von Rainer Schuster

Ja, die Fehlerbehandlung ist nicht besonders gut. Das liegt bisher an dem Packrat, wie er funktioniert. Ich habe recht gute Erfahrungen mit Boo.OMeta gemacht. Dennoch ist die Fehlerbehandlung etwas unhandlich.

Das gute an OMeta ist, dass der Parser Testgetrieben benutzt werden kann um Regel für Regel zu entwickeln.

# re: Packrat parsing – O meta, where art thou!

Sonntag, 19. Dezember 2010 10:40 von plumps

...irgendwie ulkig. Ich dachte der blog ist ist auf Englisch. Und doch findet die Diskussion auf Deutsch statt. Hmmm...

Frohe Weihnachten ihr nerds

Kommentar abgeben

(verpflichtend) 
(verpflichtend) 
(optional)
(verpflichtend)