Coding 2020-05-07

By Max Woerner Chase

Today, I'd like to do the planning/design for the environment matching in here.

I'm thinking a helper class that combines a phoneme sequence and an index; it can perform one of several checks on demand.

Anyway, the trickiest bit is the automaton design. Here are a few points to keep in mind:

This means that the automaton can be built in a single pass by starting at the end of the match, and if it's done arena-style (I can't stomach the idea of repeatedly hashing the entire state graph), then the start state has index -1, and the state at index 0 is the success state.

States need to be capable of:

Specific cases to check for: optionals or wildcards on their own should succeed immediately; this means that my "single-pass" idea might not pan out, unless the matching process is broken up very carefully:

Given an initial set of states (from the starting conditions or the previous step), generate a set of states that advance the string. If the success state is encountered in this process, the match succeeds. The following three states must be processed as part of this stage:

This can be viewed as a combination of "does this state have transitions that consume a phoneme?" and "what transitions does this state have that do not consume a phoneme?"

The stage can then be broken up into:

While there are candidates: process each candidate by adding its candidates to a new set, and possibly itself to a consumer set; if the success state is encountered, succeed; at some point filter out all candidates that have been seen from the new set; make the new set the candidate set; the output of this stage is the consumer set; if the consumer set is empty, then fail.

Although, if the consumer set is empty, then processing it will yield no candidates, so if the loop condition is the truthiness of the candidate set, then I could have an early return true inside the loops, and a fallback return false afterwards.

Wait, I'm confusing myself. This isn't a while loop, it's a for loop. Unless I make it a while loop and have the cursor handle advancing, which seems acceptable, and allows me to perform processing steps even when there aren't phonemes to process, which is necessary for the empty sequence to match the empty sequence.

I think I've got it all together. Now I just need to be focused on this stuff long enough to get it all into code.

Good night.