@peter-leonov
Created May 31, 2013 16:39
> You're welcome!
>
> In answer to your questions and remarks:
>
> 1) I'm sorry, but jison is /not/ a /port/ of bison…
Understood. Who would want to just copy such a monster anyway?
>
> 1.a) jison combines a lexer and LALR(1) parser in one package.
That is cool and I like it. But, as you write below, the built-in lexer will not work in my case :(
> 1.b) jison may share 4 alphanumerics with bison and may support a large
> subset of the bison vocabulary but it does not, at least not YET, include
> support for GLR(1) / LR(1) grammars. Check wikipedia and bison manuals if
> these acro's don't say anything to you right now. Jison currently only
> 'does' LALR(1) which is equivalent to the abilities of classic yacc.
OK. Before this investigation I thought that LR vs. LALR was only about the resulting parser, the algorithm that evaluates the grammar. Now I realize that the grammar itself, the state optimization and the actual parser code all differ between those types of parsers. Maybe not completely different, but the difference is significant in the context of the Ruby parser problem.
> …
> The point being: when that ruby grammar is LR but NOT LALR, then, yes,
> expect lots of conflicts, both shift/reduce and reduce/reduce. If I recall
> my theory correctly, any LR grammar can be turned in a LALR one, but you
> will have to do that by hand and the result won't necessarily be 'nice' and
> 'human readable' if you get my drift.
The worst part is that, after all that effort, I would have to maintain this mess for years.
> …
>
> 2) porting the ruby grammar:
>
> …
>
> This should work for jison so that DOES NOT EXPLAIN the
> many conflicts; maybe the grammar was 'updated' and now uses LR (non-LALR)
> features provided by modern bison, but I'll have to see it and spend some
> serious time on that.
Problems are to be expected when parsing such a monstrous rule set with a tool it was never meant to be used with.
> (Read: I cannot spare the time for that right now and
> I don't expect this to become a paid consulting job
I would be glad to thank you with escudos for this answer. It saved me a lot of time. How can I do that?
> NOTE THAT I AM NOT REFERRING TO THE INLINE ACTION BLOCKS PROBLEM HERE (
> http://www.gnu.org/software/bison/manual/bison.html#Mid_002dRule-Actions );
> THIS IS ABOUT PARSER THEORY/TECHNOLOGY ITSELF.
Totally understood. Generating the tree is far simpler than kickstarting the parser.
>
> …
> If you're not yet fluent in LR parser theory, you may have a Long Walk
> ahead of you.
Yes, I know less about parsers than this task requires to be solved flawlessly. But I'm encouraged, since I can see projects like JRuby, IronRuby, Opal and others that have dealt with the parser.
I expect to spend a lot of time on this. Porting a complex real-world parser is an exciting adventure I really want to have :)
>
> 2.b) When you follow those links above, you'll see a few things. Also,
> there's https://github.com/ruby/ruby/blob/trunk/tool/id2token.rb which is
> part of that original ruby build action at
> https://github.com/ruby/ruby/blob/trunk/common.mk#L587
I'll convert it by hand, if not already.
> while full awareness
> of the parser generator process in general and bison in particular should
> have led you on an immediate hunt for lex/flex,
The first thing I was looking for, indeed.
> …
> https://github.com/ruby/ruby/blob/trunk/parse.y#L6852 ta!-da! custom lexer,
> using gperf assistance.
Sad, but true. 1300+ lines of hardcode… oh, my.
Yes, I have seen it already. Luckily, JavaScript is in the same language family as C, so porting the lexer should be long and boring, but doable.
>
> Time spent so far: ~ 20 minutes.
My time so far is around two weeks :)
>
> …
> if you were not able to follow along smoothly, combined with the initial problem
> you posted, this is a strong indication that you have no solid previous
> experience in the referred language theory and original tools which were
> used to create ruby; this means that the goal of porting ruby to JavaScript
> would be a *many*-hurdle learning experience for you, both in C programming
> (as you'll have to be able to 'decode' what the ruby people wrote and did,
> particularly where they went off-mainstream, such as their lexer) and in
> jison.
It is the first grammar I've seen since high school. I'm good at JavaScript. And I can read and write medium-sized programs in C.
And the two main ingredients: I'm passionate about it and I have time for it right now.
>
> …
> if Zaach has enough time available to do this Real Soon Now
I hope he has. Since there are only two of us in the thread, one could conclude that Jison is not in focus right now.
> but second is a 'not useful' item IFF you
> mean compiling the ruby grammar in bison and having bison spit out
> JSON for /jison/
No, I meant the other trick.
> …
> I'd rather suggest getting bison to produce
> the generated parser in JavaScript
I meant this one :)
> …
> 'today' it means you'll have to learn/use GNU M4 to produce a
> suitable bison 'skeleton' for the JavaScript target
Yep, I had a look at Bison's skeletons yesterday. They broke my eyes :)))
>
> A viable 'one off' alternative would be to take the bison generated C
> parser and port that code (including tables) over to JavaScript
I'll try a combination of the two above: skeleton + post-processing.
Since working with strings, arrays and hashes is much simpler in JS, I'll try to write a minimal skeleton that gets the resulting transition tables out of Bison before it transforms them for the needs of C, if possible :) Then it will be time for a JS converter to handle the actual parser generation.
I wish myself luck in such a challenging task.
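To make the "tables out of Bison, driver in JS" idea concrete, here is a minimal sketch of a table-driven LR driver in TypeScript. Everything in it is an assumption for illustration: the tables are hand-written for a toy grammar (`E -> E '+' n | n`), not produced by Bison, and the names `actionTable`, `gotoTable`, `rules` and `parse` are hypothetical. Bison's real tables use a packed layout, so a converter would have to unpack them or the driver would have to read them as they are.

```typescript
// Toy grammar (hand-written, NOT Bison output):
//   rule 1: E -> E '+' n
//   rule 2: E -> n
type Action =
  | { kind: "shift"; state: number }
  | { kind: "reduce"; rule: number }
  | { kind: "accept" };

interface Rule {
  lhs: string;                           // nonterminal produced by the rule
  length: number;                        // number of symbols on the right-hand side
  action: (values: number[]) => number;  // semantic action
}

interface Token {
  type: string;   // "n", "+", or "$" for end of input
  value: number;  // semantic value (unused for "+")
}

const rules: Record<number, Rule> = {
  1: { lhs: "E", length: 3, action: (v) => v[0] + v[2] },
  2: { lhs: "E", length: 1, action: (v) => v[0] },
};

// state -> lookahead token type -> action
const actionTable: Record<number, Record<string, Action>> = {
  0: { n: { kind: "shift", state: 2 } },
  1: { "+": { kind: "shift", state: 3 }, $: { kind: "accept" } },
  2: { "+": { kind: "reduce", rule: 2 }, $: { kind: "reduce", rule: 2 } },
  3: { n: { kind: "shift", state: 4 } },
  4: { "+": { kind: "reduce", rule: 1 }, $: { kind: "reduce", rule: 1 } },
};

// state -> nonterminal -> next state
const gotoTable: Record<number, Record<string, number>> = {
  0: { E: 1 },
};

function parse(tokens: Token[]): number {
  const input: Token[] = [...tokens, { type: "$", value: 0 }];
  const states: number[] = [0];  // state stack
  const values: number[] = [];   // semantic value stack
  let pos = 0;
  for (;;) {
    const state = states[states.length - 1];
    const token = input[pos];
    const row = actionTable[state];
    const act = row ? row[token.type] : undefined;
    if (!act) {
      throw new Error(`syntax error at token ${pos} (${token.type})`);
    }
    if (act.kind === "shift") {
      states.push(act.state);
      values.push(token.value);
      pos++;
    } else if (act.kind === "reduce") {
      const rule = rules[act.rule];
      // Pop the right-hand side, run the action, then follow the goto entry.
      const args = values.splice(values.length - rule.length, rule.length);
      states.length -= rule.length;
      states.push(gotoTable[states[states.length - 1]][rule.lhs]);
      values.push(rule.action(args));
    } else {
      return values[0];  // accept: the single remaining value is the result
    }
  }
}
```

The driver itself knows nothing about the grammar, so in principle only the tables and the semantic actions would need to change for a larger grammar.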
> , port the custom lexer over to JavaScript as well and then take it from there.
Sad, but no Ruby parser can avoid porting the lexer from parse.y.
Since the lexer is not context-free, I had better port it as is.
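As a rough illustration of why such a lexer cannot be expressed as a flat list of token regexes, here is a toy state-dependent lexer in TypeScript. It is a sketch under stated assumptions, not a port of anything in parse.y: the two state names are borrowed loosely from the `lex_state` values used there, and the token set, the regexes and the function `lex` are hypothetical. The only point is that `/` is read as a division operator or as the start of a regexp literal depending on the state left behind by the previous token.

```typescript
type LexState = "EXPR_BEG" | "EXPR_END";

interface Tok {
  type: "ident" | "number" | "regexp" | "op";
  text: string;
}

function lex(src: string): Tok[] {
  const out: Tok[] = [];
  let state: LexState = "EXPR_BEG";  // at the start, an expression may begin
  let i = 0;

  // Try to match `re` at the current position; consume and return the text on success.
  const take = (re: RegExp): string | null => {
    const m = re.exec(src.slice(i));
    if (m === null) return null;
    i += m[0].length;
    return m[0];
  };

  while (i < src.length) {
    if (take(/^\s+/) !== null) continue;  // whitespace does not change the state here

    const num = take(/^[0-9]+/);
    if (num !== null) {
      out.push({ type: "number", text: num });
      state = "EXPR_END";  // after an operand, "/" means division
      continue;
    }

    const ident = take(/^[A-Za-z_]\w*/);
    if (ident !== null) {
      out.push({ type: "ident", text: ident });
      state = "EXPR_END";
      continue;
    }

    // A regexp literal is only tried where an expression may begin.
    if (state === "EXPR_BEG") {
      const re = take(/^\/(?:[^\/\\\n]|\\.)*\//);
      if (re !== null) {
        out.push({ type: "regexp", text: re });
        state = "EXPR_END";
        continue;
      }
    }

    const op = take(/^[-+*\/=()]/);
    if (op !== null) {
      out.push({ type: "op", text: op });
      state = op === ")" ? "EXPR_END" : "EXPR_BEG";
      continue;
    }

    throw new Error(`unexpected character at offset ${i}`);
  }
  return out;
}
```

With this sketch, `lex("a / b")` yields an operator token for `/`, while `lex("x = /ab+c/")` yields a single regexp token, even though both inputs contain the same character. The real lexer tracks far more states and feedback from the parser, which is why porting it as is looks safer than re-deriving it.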
> …
> you may find the mentioned 'M4/skeleton
> for JavaScript target' a viable option and other people would be grateful
> for that output, I'm sure! :-)
Agreed.
If I can get help with those skeletons, the work could be done without a JavaScript engine ever being involved. But supposing someone adds the JS target with the help of a JS engine, I'll use it without a doubt and donate to that handsome someone :)
>
Thank you again! :)