Three Dollar Quill

Coding 2025-12-05

Fri 05 December 2025

By Max Woerner Chase

All right. I just finished queueing up more word formation for my future self, so let's look at what I want to do before that.

Aside from anything else, I'd like to rig up a generator for the simpler phonology I converted things to.

But also, I've been considering how I'm processing words at different times, and I think the current data layout is just wrong. Right now, the primary key is the pronunciation, and the "short gloss" disambiguates below that. This kind of makes sense in the context of a dictionary, but when I'm doing evolution, and in particular selecting particular words to combine in various ways, I think the "short gloss" should be the primary key, since it's generally stable across time except when I explicitly alter it. The structure for the lexicon is currently a map from pronunciation to a map from "short gloss" to "metadata". I think instead it should be a map from "short gloss" to a pair of pronunciation and "metadata". Then the drift structure can ignore pronunciation entirely, and maybe also make changing the short gloss optional.
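As a rough sketch of that re-keying, assuming the lexicon is held as plain Python dicts (the type names, the `rekey_by_gloss` helper, and the sample words are all made up for illustration and are not taken from the actual project):

```python
from typing import Any

# Old layout: pronunciation -> short gloss -> metadata
OldLexicon = dict[str, dict[str, dict[str, Any]]]
# New layout: short gloss -> (pronunciation, metadata)
NewLexicon = dict[str, tuple[str, dict[str, Any]]]


def rekey_by_gloss(old: OldLexicon) -> NewLexicon:
    """Invert the nesting so the short gloss becomes the primary key."""
    new: NewLexicon = {}
    for pronunciation, entries in old.items():
        for gloss, metadata in entries.items():
            if gloss in new:
                # A gloss only works as a key if it is unique in the lexicon.
                raise ValueError(f"duplicate short gloss: {gloss!r}")
            new[gloss] = (pronunciation, metadata)
    return new


# Invented sample data, purely for illustration.
old_lexicon: OldLexicon = {
    "kata": {"stone": {"pos": "noun"}, "strike": {"pos": "verb"}},
    "mi": {"water": {"pos": "noun"}},
}
new_lexicon = rekey_by_gloss(old_lexicon)
```

The duplicate check is a design assumption: two homophones with distinct glosses are fine in the new layout, but two entries sharing a gloss would silently overwrite each other without it.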

There are two independent strands to this. On the one hand, the lexicon structure has to be updated, the serialization updated, the deserialization partially updated, the proto lexicon regenerated, and then the deserialization updated the rest of the way. On the other hand, I'm going to have to update the drift file manually.

The point of all of this is to enable further processing in a more resilient manner than is currently possible. I want to be able to filter lexicons by metadata, write input files that concatenate specific words, and take products of lexicons. Let's focus on making the required changes first.
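To make those three operations concrete, here is a minimal sketch over a gloss-keyed lexicon, where each short gloss maps to a pair of pronunciation and metadata. Everything in it is hypothetical: the function names, the `pos` and `parts` metadata fields, the hyphenated compound glosses, and the sample words are illustrative assumptions, not the real code.

```python
from itertools import product
from typing import Any, Callable

Metadata = dict[str, Any]
# short gloss -> (pronunciation, metadata)
Lexicon = dict[str, tuple[str, Metadata]]


def filter_lexicon(lexicon: Lexicon, keep: Callable[[Metadata], bool]) -> Lexicon:
    """Keep only the entries whose metadata satisfies the predicate."""
    return {g: (p, m) for g, (p, m) in lexicon.items() if keep(m)}


def concatenate(lexicon: Lexicon, glosses: list[str]) -> str:
    """Join the pronunciations of specific words, selected by gloss."""
    return "".join(lexicon[g][0] for g in glosses)


def lexicon_product(left: Lexicon, right: Lexicon) -> Lexicon:
    """Pair every word on the left with every word on the right."""
    combined: Lexicon = {}
    for (lg, (lp, _)), (rg, (rp, _)) in product(left.items(), right.items()):
        # Assumed convention: compound gloss is the two glosses joined by "-".
        combined[f"{lg}-{rg}"] = (lp + rp, {"parts": [lg, rg]})
    return combined


# Invented sample data, purely for illustration.
lexicon: Lexicon = {
    "stone": ("kata", {"pos": "noun"}),
    "water": ("mi", {"pos": "noun"}),
    "strike": ("tok", {"pos": "verb"}),
}
nouns = filter_lexicon(lexicon, lambda m: m.get("pos") == "noun")
verbs = filter_lexicon(lexicon, lambda m: m.get("pos") == "verb")
compounds = lexicon_product(nouns, verbs)
```

Because words are selected by gloss rather than by pronunciation, none of these helpers need to change when a sound change rewrites the pronunciations.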

...

It wasn't as simple as I'd hoped. One of the things that Brassica does is handle sound changes that can result in multiple pronunciations. If the short gloss is the primary key, then that means that there could be multiple associated pronunciations. Now, it appears that Brassica properly handles inputs with multiple pronunciations, so I don't need any special logic for applying the sound changes, except that I'd like to store the pronunciations separately and concatenate them only to apply sound changes.
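One way to picture "store separately, concatenate only to apply sound changes" is the sketch below. It assumes each gloss holds a tuple of pronunciations, and it uses a plain Python callable as a stand-in for the sound change step. Nothing here reflects Brassica's real interface or input format; the separator, the function name, and the sample words are all hypothetical.

```python
from typing import Callable

# Hypothetical separator between alternative pronunciations of one word.
SEPARATOR = " "


def apply_changes(
    pronunciations: tuple[str, ...],
    sound_change: Callable[[str], str],
) -> tuple[str, ...]:
    """Join the alternatives, run the change once, then split them again."""
    joined = SEPARATOR.join(pronunciations)
    changed = sound_change(joined)
    # Assumed policy: drop alternatives that merged, keeping first-seen order.
    return tuple(dict.fromkeys(changed.split(SEPARATOR)))


def devoice_d(text: str) -> str:
    """Stand-in sound change for illustration only: d becomes t."""
    return text.replace("d", "t")


# Invented sample: one gloss with two stored pronunciations.
stone = ("kata", "kada")
merged = apply_changes(stone, devoice_d)
```

In this example the two alternatives collapse into one, which is why the result is deduplicated before being stored.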

...

All right, I've got things going, and it seems to all be working. I'm not going to try to do any more tonight, but here's what I'm missing that should now be easier:

Word formation via compounding
Paradigm generation

I need to sketch out what the flow for that should look like, but also I can no longer sweep the state of the sound changes file under the rug; it needs to be cleaned up for anything else to be possible. (Also, once it's cleaned up, I should try using it as a template for making that other naming language that I want to use to disambiguate sounds that collide.)

(Also also, there are some checks that I'm simply no longer doing, which I need to find some way to reintroduce.)

In any case, I'm tired.

Good night.