[ANN] KawaDD Progress Report: Major milestone achieved, both lexer and parser generation done via templates

by Jonathan Revusky-3

(This was earlier posted to kawadd-devel, but I meant to post it here as
well.)

A major milestone has been achieved in KawaDD. In the KawaDD codebase,
the output of both the parser and lexer is now done via FreeMarker
templates.

This has allowed me to take stock of the situation. Summary:

(1) It was MUCH more work than I anticipated to get to this point. I
have to admit that if I had known in advance how much work it was, I
might not have done it. This was mostly because of how hopelessly
entangled the lexer generation code was. But now that it is done, I
think that some of the gains I anticipated are quite tangible. Once the
code that actually emits Java statements is removed
from the codebase and consolidated in external templates, one can much
more readily see what the algorithmic side of the code is actually
doing. Just as an example, the NfaState.java that is 3000 lines of code
in the JavaCC codebase is down to 1000 lines in KawaDD.

(2) One thing that I had no real idea of before actually doing this was
how much of a performance hit would be involved in using the templates.
Of course, within a reasonable margin, it doesn't matter very much. What
one cares about is the performance of the generated code, and the
generated code is basically the same as before. Anyway, your mileage
will vary, but I think it is about right to say that KawaDD is currently
2-3 times as slow as JavaCC. As a practical matter, I think the tradeoff
is well worth it. A 200% slowdown may sound bad, but the fact is that,
on recent hardware, most any grammar you throw at JavaCC is processed in
a second or less. Thus, even with a 100-200% slowdown, KawaDD will only
rarely take more than 2 or 3 seconds. In most projects, the parser
generation part is mostly run as one step in a full clean+build that
takes much longer than that. For example, a full clean build of
FreeMarker takes maybe 10 seconds using JavaCC and 12 seconds using
KawaDD. The tradeoff between flexibility and generator speed pretty
clearly favors flexibility here, since a system where the output is
based on templates can be customized fairly easily, while JavaCC, which
uses ostr.println() statements embedded directly in the code, cannot be
customized like this.
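
To make the contrast concrete, here is a minimal sketch of the two styles.
It is not taken from the KawaDD or JavaCC sources. The class name
TemplateSketch, the methods emitHardCoded and render, and the
${parserName} placeholder are all hypothetical, and the tiny
string-substitution "engine" is only a stand-in for a real template
engine such as FreeMarker, so that the example runs with the JDK alone.

```java
import java.util.Map;

public class TemplateSketch {

    // Hard-coded style: the shape of the generated source is buried in
    // Java statements, so changing the output means changing this code.
    static String emitHardCoded(String parserName) {
        StringBuilder out = new StringBuilder();
        out.append("public class ").append(parserName).append(" {\n");
        out.append("}\n");
        return out.toString();
    }

    // Template style: the shape of the output lives in external text and
    // the Java side only supplies values. This naive ${name} substitution
    // is an assumption for illustration, not FreeMarker's actual API.
    static String render(String template, Map<String, String> model) {
        String result = template;
        for (Map.Entry<String, String> entry : model.entrySet()) {
            result = result.replace("${" + entry.getKey() + "}", entry.getValue());
        }
        return result;
    }

    public static void main(String[] args) {
        // In a real setup this text would be read from a template file
        // that a user can edit without touching the generator.
        String template = "public class ${parserName} {\n}\n";
        System.out.print(render(template, Map.of("parserName", "MyParser")));
    }
}
```

Both methods produce the same text for the same parser name. The
difference is that in the second style the output can be changed by
editing the template text alone.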

Now, after such a huge refactoring of the code, it is more than
reasonable to wonder whether bugs were introduced. To be honest, it's
hard to be absolutely certain. For one thing, the coverage of the test
suite included with JavaCC is really quite poor, so the fact that KawaDD
passes those tests does not say much. OTOH, here are additional
functional tests that KawaDD currently passes:

(1) It can be used to build FreeMarker (both versions 2.3 and 2.4, which
differ significantly) and the resulting build passes all 60-odd unit
tests that are in the FreeMarker distro. This is a fairly significant
functional test, since FreeMarker has a quite large grammar that has
become extremely crufty after 6 years of continuous tweaks and so on.

(2) KawaDD passes the basic bootstrap test. KawaDD can be used to build
itself and the resulting build passes the aforementioned tests, all the
tests included with JavaCC, and FreeMarker versions 2.3 and 2.4.

(Actually, at points where I broke things (later fixed) in refactoring
the code, the build would frequently pass the JavaCC test suite but
fail one or both of the above two functional tests, building/testing
freemarker or bootstrapping KawaDD itself.)

Now, I think the above inspires at least a guarded sense of confidence,
but I would greatly appreciate independent affirmation from JavaCC users
that KawaDD works as a drop-in replacement in their projects. There is
no binary distro yet, but it is easy enough to get your hands on KawaDD.
It is just:

svn co http://svn.kawadd.googlecode.com/svn/trunk/kawadd
cd kawadd
ant

And KawaDD can be launched using the scripts in the bin directory or
longhand, it's something like:

java -classpath <kawadd-root>/kawadd.jar:<kawadd-root>/lib/freemarker.jar KawaDD MyGrammar.jj

Oh, final note: KawaDD does not support either static parsers or
reusable parsers, so if your project uses either of those brain-damaged
ideas, KawaDD won't be a drop-in replacement. This will probably be
manifested by the compiler complaining that the various ReInit(...)
methods are gone. And gone they are. (Good riddance to bad rubbish...
;-)) Anyway, in that case, instead of, say,

MyParser.ReInit(...);
MyParser.rootProduction(...);

you need to rewrite this as:

MyParser myParser = new MyParser(...);
myParser.rootProduction(...);

I had to do that in some of the test/example code that comes with JavaCC
so that it would work with KawaDD.
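
For code that used ReInit(...) to feed several inputs through one
parser, the same pattern extends naturally to a loop that creates a
fresh instance per input. The following is only a sketch under
assumptions: the nested MyParser class is a hypothetical stand-in for a
generated parser (a real one would take whatever constructor arguments
your grammar produces), and its rootProduction() just counts characters
so that the example runs on its own.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

public class FreshParserSketch {

    // Hypothetical stand-in for a generated, non-static parser class.
    static class MyParser {
        private final Reader input;

        MyParser(Reader input) {
            this.input = input;
        }

        // Placeholder "root production": counts the characters consumed.
        int rootProduction() throws IOException {
            int count = 0;
            while (input.read() != -1) {
                count++;
            }
            return count;
        }
    }

    // One new parser per input replaces the old ReInit-and-reuse idiom.
    static List<Integer> parseAll(List<String> sources) {
        List<Integer> results = new ArrayList<>();
        for (String source : sources) {
            MyParser parser = new MyParser(new StringReader(source));
            try {
                results.add(parser.rootProduction());
            } catch (IOException e) {
                throw new UncheckedIOException(e);
            }
        }
        return results;
    }

    public static void main(String[] args) {
        System.out.println(parseAll(List.of("a + b", "x")));
    }
}
```

Since each parser instance holds its own state, no state can leak from
one input to the next.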

Best Regards,

Jonathan Revusky
--
lead developer, FreeMarker project, http://freemarker.org/
KawaDD Parser Generator, http://code.google.com/p/kawadd


---------------------------------------------------------------------
To unsubscribe, e-mail: users-unsubscribe@...
For additional commands, e-mail: users-help@...


Re: [ANN] KawaDD Progress Report: Major milestone achieved, both lexer and parser generation done via templates

by Paul Wagland-3

Hi Jonathan,

Just one note;

> And KawaDD can be launched using the scripts in the bin directory or  
> longhand, it's something like:
>
> java -classpath <kawadd-root>/kawadd.jar:<kawadd-root>/lib/freemarker.jar KawaDD MyGrammar.jj

You will most likely get a higher buy-in rate if you set up your
ant task to repackage freemarker and to include it in the kawadd.jar.  
This will make it much closer to being a "drop-in" replacement, even  
within the limits that you have set ;-) I know that you do not see an  
additional jar as any burden, however if you actually want people to  
migrate, then you are going to have to make that migration as painless  
as possible.

Cheers,
Paul




Re: [ANN] KawaDD Progress Report: Major milestone achieved, both lexer and parser generation done via templates

by Jonathan Revusky-3

Paul Wagland wrote:

> Hi Jonathan,
>
> Just one note;
>
>> And KawaDD can be launched using the scripts in the bin directory or
>> longhand, it's something like:
>>
>> java -classpath <kawadd-root>/kawadd.jar:<kawadd-root>/lib/freemarker.jar KawaDD MyGrammar.jj
>
> You will most likely get a higher buy-in rate if you set up your ant
> task to repackage freemarker and to include it in the kawadd.jar.

I made a mistake earlier today. I honestly believed that you wrote this
note in a half-joking sort of way and I wrote a private response based
on that belief. Apparently, you're completely dead serious.

Well, look, for anybody who wants to merge multiple jar/zip files into a
single one, here is an easy way in ant:

<target name="merge-zips">
    <zip destfile="combined.jar">
        <zipfileset src="A.jar"/>
        <zipfileset src="B.jar"/>
        ...
    </zip>
</target>

Note that the ... in the above snippet should not be copied in
literally. What that means is that you can add more lines like the
preceding ones if you want.

That would be for the case in which you want to merge more than 2 files.
Take note also that the A.jar and B.jar would have to be replaced with
the actual filenames of the jars you want to merge.


> This
> will make it much closer to being a "drop-in" replacement, even within
> the limits that you have set ;-) I know that you do not see an
> additional jar as any burden, however if you actually want people to
> migrate, then you are going to have to make that migration as painless
> as possible.

Well, I honestly believed that most people would just use the launch
script provided in the bin directory, e.g.

http://code.google.com/p/kawadd/source/browse/trunk/kawadd/bin/kawadd

for Unix, or

http://code.google.com/p/kawadd/source/browse/trunk/kawadd/bin/kawadd.bat

on Windows.

Basically, you would need the following command:

path_to_kawadd/bin/kawadd MyGrammar.jj

Take note of the special gotcha here that, where I write above
path_to_kawadd, that should not be copied literally, but substituted
with the full path of the root directory where you checked out and built
the source.

Also, note that if, by some freak of chance, your grammar file is not
called MyGrammar.jj, it will not work unless you substitute in the
actual filename.

I hope that's helpful,

Jonathan Revusky
--
lead developer, FreeMarker project, http://freemarker.org/
KawaDD Parser Generator, http://code.google.com/p/kawadd

>
> Cheers,
> Paul

