Instead of writing more tests, I thought about the datatypes involved in SCA², and I concluded that I wanted things to be less stringly typed before I went forward. I started redoing a bunch of stuff, and that was probably a mistake, because now it's kind of frankensteined together.
Hm. Yes. I'm going to save a diff, revert, fix some bugs I found in the rewrite, and start over with an actual plan.
Bits of a plan:
- Specific types for morphemes, lines, categories, environments, targets, and replacements. (There are valid replacements that aren't valid targets.)
- For environments, have explicit separation between "before" and "after", and check the "before" part backwards from the current index into the line. This should allow for some capabilities that don't exist in SCA².
- Targets are optionally a category, followed by phonemes
- Replacements are either a sequence of phonemes, plus optionally a category if the target started with one, or else metathesis or gemination
- Morphemes are wrappers around string constants
- Lines are wrappers or aliases for a sequence of morphemes (including word boundaries)
- Categories are wrappers for a sequence of morphemes, and can be combined.
- Environments allow for special sequences for word boundaries (the marker can't be a space, because a boundary also has to match at the beginning of the line), degemination, and wildcards
- Replacements operate on the input word, and construct an output word
- It may be possible to use some kind of enhanced iterator to implement the transformations as a stream, but given the expected length of a word, it's not going to be worth it.
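To make the plan a bit more concrete, here is a rough Python sketch of the morpheme, line, category, and environment pieces, including the backwards check of the "before" part. Every name in it (`Morpheme`, `Category`, `Environment`, `to_line`, and `#` as the boundary marker) is made up for illustration and is not taken from SCA² or from my actual code. It also assumes one character per morpheme, which is not something the plan requires.

```python
from dataclasses import dataclass
from typing import NewType, Tuple, Union

# All names here are hypothetical, chosen only for this sketch.
Morpheme = NewType("Morpheme", str)   # thin wrapper around a string constant
Line = Tuple[Morpheme, ...]           # a line is a sequence of morphemes

# Assumed boundary marker; deliberately not a space.
BOUNDARY = Morpheme("#")


@dataclass(frozen=True)
class Category:
    """A named-or-not set of morphemes, kept in order."""
    members: Tuple[Morpheme, ...]

    def __or__(self, other: "Category") -> "Category":
        # Combine two categories, keeping order and dropping duplicates.
        extra = tuple(m for m in other.members if m not in self.members)
        return Category(self.members + extra)


Pattern = Union[Morpheme, Category]


def fits(pattern: Pattern, item: Morpheme) -> bool:
    """True if a single line item satisfies a single pattern element."""
    if isinstance(pattern, Category):
        return item in pattern.members
    return pattern == item


@dataclass(frozen=True)
class Environment:
    before: Tuple[Pattern, ...]   # written left to right, as in the rule
    after: Tuple[Pattern, ...]

    def matches(self, line: Line, start: int, end: int) -> bool:
        """Check the environment around the target at line[start:end]."""
        # Walk "before" backwards from the item just left of the target.
        for offset, pattern in enumerate(reversed(self.before), start=1):
            idx = start - offset
            if idx < 0 or not fits(pattern, line[idx]):
                return False
        # Walk "after" forwards from the item just right of the target.
        for offset, pattern in enumerate(self.after):
            idx = end + offset
            if idx >= len(line) or not fits(pattern, line[idx]):
                return False
        return True


def to_line(word: str) -> Line:
    """Split a word into one-character morphemes, with explicit boundaries."""
    return (BOUNDARY, *(Morpheme(ch) for ch in word), BOUNDARY)


vowels = Category(tuple(Morpheme(ch) for ch in "aeiou"))
between_vowels = Environment(before=(vowels,), after=(vowels,))
word_initial = Environment(before=(BOUNDARY,), after=())

line = to_line("ata")
print(between_vowels.matches(line, 2, 3))  # the "t" sits between two vowels
print(word_initial.matches(line, 1, 2))    # the first "a" follows the boundary
```

Because the boundary is an ordinary item in the line, "word-initial" is just a "before" pattern like any other, and nothing needs a special case for index 0.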
I'm exhausted right now, but let's see if I can put any of this into action tomorrow.