Coding 2022-09-18
I was thinking pretty hard about trying to beef up my enum function, but initial efforts didn't work, and I decided that trying to combine enums and classes wasn't worth it for now.
Anyway, I then spent most of today in a haze, so I didn't get much else done before now. I have concluded that I want to diverge from the tutorial in some of what I do, not because I think I know better, but because I've already done things the tutorial's way and I want a change of pace.
So, I'm trying to figure out what it would take to parse Lox with an Earley parser instead of recursive descent.
Now, I tried to get some hands-on experience with Earley parsers at, um, the beginning of this year, apparently... That was based on this series, which I'll be working from again.
I'm going to take things as they come, so maybe I'll get stuff done tonight, but what I want most is to plan this stuff properly.
Let's see what the current Lox expression grammar looks like in Earley parser terms:
- expression -> equality (identity)
- equality -> equality ["!=" "=="] comparison (expr.binary)
- equality -> comparison (identity)
- comparison -> comparison [">" ">=" "<" "<="] term (expr.binary)
- comparison -> term (identity)
- term -> term ["+" "-"] factor (expr.binary)
- term -> factor (identity)
- factor -> factor ["/" "*"] unary (expr.binary)
- factor -> unary (identity)
- unary -> ["!" "-"] unary (expr.unary)
- unary -> primary (identity)
- primary -> NUMBER (identity)
- primary -> STRING (identity)
- primary -> ["true" "false" "nil"] (identity)
- primary -> "(" expression ")" (expr.grouping or just identity, except it has to discard the parentheses)
Basically, each rule is going to be a non-terminal symbol on the left-hand side, zero or more symbols on the right-hand side, and a callback to construct the expression once the parser knows it has a parse.
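As a rough sketch of that rule shape (not a settled design), here is one way it could look in Python. Everything named here is hypothetical: the `Rule` class, the `terminals` helper, the tuple-based expression nodes standing in for expr.binary and friends, and the choice to represent a bracketed group like ["+" "-"] as a frozenset of acceptable tokens. Only a few of the rules from the list above are included.

```python
from dataclasses import dataclass
from typing import Any, Callable, FrozenSet, Tuple, Union

# A right-hand-side symbol is either a non-terminal name (a str) or a set of
# acceptable tokens (a frozenset), mirroring the ["+" "-"] notation above.
Symbol = Union[str, FrozenSet[str]]


@dataclass(frozen=True)
class Rule:
    lhs: str                    # non-terminal on the left-hand side
    rhs: Tuple[Symbol, ...]     # zero or more symbols on the right-hand side
    action: Callable[..., Any]  # called with one argument per rhs symbol


def terminals(*names: str) -> FrozenSet[str]:
    """Hypothetical helper: a set of tokens that may fill one rhs position."""
    return frozenset(names)


# Placeholder callbacks. Real ones would build the interpreter's own
# expression nodes; tuples are used here only to keep the sketch standalone.
def identity(child):
    return child


def binary(left, op, right):
    return ("binary", left, op, right)


def unary(op, operand):
    return ("unary", op, operand)


def grouping(_lparen, inner, _rparen):
    # The parentheses are matched by the rule but discarded here.
    return ("grouping", inner)


GRAMMAR = [
    Rule("expression", ("equality",), identity),
    Rule("term", ("term", terminals("+", "-"), "factor"), binary),
    Rule("term", ("factor",), identity),
    Rule("unary", (terminals("!", "-"), "unary"), unary),
    Rule("primary", (terminals("NUMBER"),), identity),
    Rule("primary", (terminals("("), "expression", terminals(")")), grouping),
]
```

The callback takes one argument per right-hand-side symbol, which is what lets the grouping rule match the parentheses and still throw them away.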
(Now, if I can get all of this to work, that means I can drop the final "accumulate" step from the scanner and build everything in an iterator.)
Actually, to make this parse totally work, I do need to handle the EOF token somehow...
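One possible way to handle EOF, sketched here as an assumption rather than a decision, is to add a start rule like program -> expression EOF, so the parse only succeeds once the whole token stream has been consumed. The recognizer below is a minimal Earley recognizer over a cut-down grammar, just to check that idea. The `program` rule, the `RULES` table, and the `recognize` function are all made up for this sketch, tokens are plain strings, and it assumes no rule has an empty right-hand side (true of the expression grammar above), which lets it skip the usual nullable-rule handling. It only answers yes or no; it doesn't run callbacks or build expressions.

```python
# Each rule is (lhs, rhs). Any symbol that appears as an lhs is a
# non-terminal; every other symbol is matched directly against a token.
RULES = [
    ("program", ("expression", "EOF")),  # hypothetical augmented start rule
    ("expression", ("term",)),
    ("term", ("term", "+", "primary")),
    ("term", ("primary",)),
    ("primary", ("NUMBER",)),
    ("primary", ("(", "expression", ")")),
]
NONTERMINALS = {lhs for lhs, _ in RULES}


def recognize(tokens, start="program"):
    """Return True if the token list is derivable from `start`."""
    # sets[k] holds Earley items (rule_index, dot_position, origin_set).
    sets = [[] for _ in range(len(tokens) + 1)]

    def add(k, item):
        if item not in sets[k]:
            sets[k].append(item)

    for i, (lhs, _) in enumerate(RULES):
        if lhs == start:
            add(0, (i, 0, 0))

    for k in range(len(tokens) + 1):
        j = 0
        while j < len(sets[k]):  # sets[k] can grow while it is processed
            rule, dot, origin = sets[k][j]
            j += 1
            lhs, rhs = RULES[rule]
            if dot == len(rhs):
                # Completion: advance items in the origin set waiting on lhs.
                for r2, d2, o2 in list(sets[origin]):
                    rhs2 = RULES[r2][1]
                    if d2 < len(rhs2) and rhs2[d2] == lhs:
                        add(k, (r2, d2 + 1, o2))
            elif rhs[dot] in NONTERMINALS:
                # Prediction: start every rule for the expected non-terminal.
                for i, (lhs2, _) in enumerate(RULES):
                    if lhs2 == rhs[dot]:
                        add(k, (i, 0, k))
            elif k < len(tokens) and tokens[k] == rhs[dot]:
                # Scan: the next token matches the expected terminal.
                add(k + 1, (rule, dot + 1, origin))

    # Success means a finished start rule spanning the whole input.
    return any(
        RULES[r][0] == start and d == len(RULES[r][1]) and o == 0
        for r, d, o in sets[len(tokens)]
    )
```

With this setup, `recognize(["NUMBER", "+", "NUMBER", "EOF"])` succeeds, while the same tokens without the trailing "EOF" are rejected, because the start rule is still waiting on EOF.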
Anyway, I let this go too late, so I'm going to cut this here, and think about how to handle the data structures later.
Good night.