Diving deeper into language construction, you’ll have to learn the basics of compiler theory. I have no real practice crafting a parser by hand. Currently reading the ANTLR book from Terence Parr (not the Language Implementation Patterns – it really sounds very promising). What really got me attracted are packrat parsers. Parsing Expression Grammars (PEG) bacuse they are more concise for human brains.
What’s that buzz? Head over to wikipedia and read on what’s written about PEGs or have a look at the comparison of parser generators. I’ve corrected the link for OMeta and added a link to kathadin, a mutable PEG based interpreter runtime for interconnecting languages. PEGs got my attention some time ago as I was reading the Boo Language mailing list. On Sunday @ricksaurus posted a tweet about a PEG based system called katahdin. The interpreted system isn’t maintained. Chris Seaton is in the army since 2007 and relesed his very first 0.2 version to the public domain after writing his master thesis. It’s a mess running the source on Windows with dotnet because of its dynamic nature and security issues with dynamic created assemblies and DynamicMethod. Will have to digg into this security related stuff later on. My machine runs not with full trust. As off now I’m using VS2010 to edit and build the assemblies but mono to run them.
The runtime has a ridiculously long bootstrapping time. But this is described in his paper. I’ve done some eval and I’m really impressed. The runtime has a boostrapping parser with a low level set of instructions and a hosting language (*.kat) defining the glue code and operators for creating new languages. Chris created an proof-of-concept for python, fortran and sqlite. All you have to do is specify the language context for which grammar you’d like to write your statements and expressions.It’ll take some more time to optimize the parser/runtime of katahdin. Boo in its next incarnation will be based on OMeta, a PEG based parser. Rodrigo has allready built Boo.Peg and Boo.OMeta some time ago in his Boo-Extensions project an will merge them into the boo master next year. This will bring us an ometa, syntax and data (unions) macro. The compiler will be self hosted in OMeta (like Perl6 does). This is really exciting as the implementation of Boo.OMeta is very marture. Boo offers some more metaprogramming capabilities like other languages also do, namely splicing, quasi quotation and macros.
Never the less. I’m very excited about katahdin atm. Its really a nobrainer to write your own extensions to katahdin. If there are some talented compileristas I would appreciate getting help to get the katahdin runtime polished.
Python/Fortran fusion sample:
import "fortran.kat";
import "python.kat";
fortran
{
SUBROUTINE RANDOM(SEED, RANDX)
INTEGER SEED
REAL RANDX
SEED = 2045*SEED + 1
SEED = SEED - (SEED/1048576)*1048576
RANDX = REAL(SEED + 1)/1048577.0
RETURN
END
}
python
{
seed = 128
randx = 0
for n in range(5):
RANDOM(seed, randx)
print randx
}
sqlite:
import "sqlite.kat";
database = new Mono.Data.SqliteClient.SqliteConnection(
"version=3,URI=file::memory:");
database.Open();
database? create table films(title text, year integer, director text);
database? insert into films values("Raiders of the Lost Ark",
1981, "Steven Spielberg");
database? insert into films values("Indiana Jones and the Temple of Doom",
1984, "Steven Spielberg");
database? insert into films values("Indiana Jones and the Last Crusade",
1989, "Steven Spielberg");
films = database? select year, title, director
from films
where year >= {args[0]};
print "Films made since " + args[0];
for ([title, year, director] in films)
print " " + title + ", " + year + " (" + director + ")";
How about writing your own extensions to NHibernate replicating HQL as source (and no strings). There is a integrated test syntax, too:
import "test.kat";
// Positive and negative
test
{
test
{
assert + 0 == 0;
assert - 0 == 0;
}
test
{
assert + 14 == 14;
assert - + 14 == - 14;
assert + - + 14 == - 14;
assert - + - + 14 == 14;
}
test
{
assert - 2 + 3 == (- 2) + 3;
assert - (2 + 3) == - 5;
}
}