Coding 2022-01-14

By Max Woerner Chase

The good news is, I managed to construct a successful parse. The not-as-good news is, I had to fix some bugs before it went through. The worse news is, while almost all of the bugs made sense in retrospect, I think I'm going to need to do a close reading of the entire tutorial to understand where I went wrong with the final bug.

Easy stuff:

And now that I say that, I understand what went wrong with the last bug. This is a doozy, so let's see whether I can explain it.

All right, so my parse generators rely on several recursive functions. The function that I broke takes an item and an "end index" to a state set, and, looking in the range between the end index, and the item's "start index", yields all possible sub-divisions of that range into the symbols required by the rule. This function calls itself recursively for one of two reasons:

And that last bit is where I tripped up. I forgot that the state set indices go between tokens, so I tried to "skip over" an index, because the last token used by one symbol is the token before the first token used by the next symbol. This doesn't work, because "the ending of one symbol" and "the beginning of the next symbol" correspond to the same state set index. So I tried to "compensate" for something when I should have left well enough alone.

Next up, now that I sort of understand things, I see that I absolutely did not account for empty rules. (I first verified this experimentally.) So, before I can properly move on, I'll want to fix that.

Let's see what fixing that looks like.

Well, first off, I need a constructor helper for Parser objects.

...

I got distracted and did the fix. This basically consisted of implementing the algorithm that Jeffrey Kegler provided for detecting nullable symbols, undoing some premature optimizations I had done that interfered with actually using the data on nullability, fixing a code divergence because apparently that happens nearly immediately after I copy and paste some code less than ten lines down from its origin, and pondering whether I have it right, because the fixed code has more items in its state set than the code from the tutorial.

I think this is okay, because if I'm understanding things correctly, the tutorial is showing an intermediate stage of parsing. I could be totally wrong, though.

Hm, interesting question that just occurred to me: is my current parse generation code able to properly handle nullable sequences that can go arbitrarily deep? I'm skeptical, so maybe just don't do that.

I think it might actually act really pathological. But it's getting late, and I don't want to think about that too hard.

Since I've gotten a successful parse, maybe I should move on for now, and come back to this code later.

Good night.