Algebraic Data Types 2019-09-05

By Max Woerner Chase

Looking more at the dataclasses plugin, I realize that, unsurprisingly, it's based on the implementation, so if I want my code to make sense, I'm going to have to mirror the dataclasses implementation in my __init_subclass__ methods.
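
For anyone who hasn't used it: __init_subclass__ is the standard hook that receives the keyword arguments from a class creation statement, and it's what everything below builds on. A minimal sketch, with purely illustrative names:

    class Base:
        def __init_subclass__(cls, *, flag=False, **kwargs):
            # Keyword arguments in the class statement land here, not in __init__.
            super().__init_subclass__(**kwargs)
            cls.flag = flag

    class Child(Base, flag=True):
        pass

    assert Child.flag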

Several hours later, I got sidetracked because, well, I've heard about mutation testing, and here's a big, well-tested code base for me to try it out on. It really puts my coverage numbers into perspective, in that over a hundred mutations got past the tests. Unfortunately, it's hard for me to act on this right now: with Cosmic Ray, the library I'm trying out, I haven't figured out how to narrow down the results, and it's not worth going through 700+ results to figure out which ones timed out, and why the mutations that survived did so. I'm probably going to try to get some kind of mutation testing done before I consider this 1.0, but it turns out it's not worth it right now, and I should focus on sketching out the way those __init_subclass__ methods work. So, let's see...

Sum.

Subclassing a subclass of Sum is an error. It first checks that eq and order are compatible. It then scans all annotations in the MRO, backwards, with non-Ctor values deleting the annotation. When it gets to the final result, it creates a constructor for each Ctor annotation. It creates a custom __new__, which we can model as "cannot instantiate this class except through its constructors". If there's no __setattr__ or __delattr__ defined, then those operations are disallowed. Equality checking and hashing use basically the same logic as in dataclasses, and ordering does too, pretty much.
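
To pin that down, here's a minimal sketch of the behavior just described. The names (Ctor as a bare marker class, the tag-and-args storage, the helper functions) are hypothetical stand-ins, not the real library's internals, and the dataclasses-style equality/hashing/ordering generation is omitted:

    class Ctor:
        """Stand-in marker for constructor annotations."""
        def __init__(self, *arg_types):
            self.arg_types = arg_types

    def _blocked_new(cls, *args, **kwargs):
        raise TypeError(f"cannot instantiate {cls.__name__} except through its constructors")

    def _blocked_setattr(self, name, value):
        raise AttributeError(f"{type(self).__name__} does not support assignment")

    def _blocked_delattr(self, name):
        raise AttributeError(f"{type(self).__name__} does not support deletion")

    def _make_constructor(cls, tag):
        def construct(*args):
            # Bypass the blocked __new__; the constructors are the only way in.
            self = object.__new__(cls)
            object.__setattr__(self, "_tag", tag)
            object.__setattr__(self, "_args", args)
            return self
        construct.__name__ = tag
        return construct

    class Sum:
        def __init_subclass__(cls, *, eq=True, order=False, **kwargs):
            super().__init_subclass__(**kwargs)
            # Subclassing a subclass of Sum is an error.
            for base in cls.__bases__:
                if base is not Sum and issubclass(base, Sum):
                    raise TypeError("cannot subclass a subclass of Sum")
            # eq and order have to be compatible.
            if order and not eq:
                raise ValueError("order requires eq")
            # Scan annotations across the MRO, backwards; a non-Ctor value
            # deletes any annotation of the same name seen so far.
            ctors = {}
            for klass in reversed(cls.__mro__):
                for name, value in vars(klass).get("__annotations__", {}).items():
                    if isinstance(value, Ctor):
                        ctors[name] = value
                    else:
                        ctors.pop(name, None)
            # Create a constructor for each surviving Ctor annotation.
            for name in ctors:
                setattr(cls, name, staticmethod(_make_constructor(cls, name)))
            cls.__new__ = _blocked_new
            # Unless the class defines its own, disallow these.
            if "__setattr__" not in vars(cls):
                cls.__setattr__ = _blocked_setattr
            if "__delattr__" not in vars(cls):
                cls.__delattr__ = _blocked_delattr
            # Equality/hashing/ordering generation (as in dataclasses) omitted.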

The big differences from dataclasses are that the arguments are passed in the class creation statement, there's no frozen or init, and the only annotations it counts are the ones in Ctor form.
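
With those class-creation arguments, usage would look something like this, continuing the sketch's hypothetical names:

    class Maybe(Sum, eq=True, order=False):
        Nothing: Ctor()
        Just: Ctor(int)

    just_three = Maybe.Just(3)
    # Maybe() raises TypeError: the constructors are the only way in.
    # just_three.extra = 1 raises AttributeError: no __setattr__ was defined.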

Product.

It first sets attributes on the class from the creation arguments, then relies on the values on the class for the remainder of the behavior. It checks that __eq and __order are compatible. It then scans all annotations in the MRO, backwards, with None and ClassVar values clearing the annotation. It determines the default values, and errors if there are gaps; it considers inspect.Parameter.empty to mean "leave this unset". It creates a custom __new__, which we can model as "if the class defines a __new__ method, use it as-is; otherwise, create a __new__ method based on the fields and defaults" (there's a sketch of this below). It then checks what behavior to use for the equality and ordering methods, at which point things get weird.

I think my informal specification for what this stuff does isn't sufficiently motivated by use cases. In plain English, this code is doing complicated, albeit consistent, stuff, possibly because I wanted to show off, and not for any better reason currently. I think the current behavior may simply be broken for advanced use cases. I could paper over the issues by making a new setter function that specifically checks for None to decide whether it's allowed to overwrite, but that's a bit hacky, especially the way I'd employ it. Maybe the thing to do is to make all of the various special methods behave the way the __new__ handling currently does. That'd be some effort, but my head wouldn't hurt as much, so it's probably a good idea. I'll have to revisit this idea when I'm better rested. (Maybe instead of class identity, it should key off the fields to determine whether to use the provided function?) (This would also mean that the "setter" functions would inspect only the local namespace, which would also remove some dependencies. I'm feeling good about this, but I can't do it right now.)
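
Here's the promised sketch of the Product mechanics, under the same caveats as before: the names are stand-ins, the "gaps" rule is my guess at a plausible reading (a field with no default after one that has one, as in dataclasses), and the equality and ordering parts are left out entirely since they're the part to rethink:

    import inspect
    import typing

    class Product:
        def __init_subclass__(cls, *, eq=True, order=False, **kwargs):
            super().__init_subclass__(**kwargs)
            # Set attributes on the class from the creation arguments; the rest
            # of the behavior reads them back off the class. (These identifiers
            # mangle to _Product__eq and _Product__order.)
            cls.__eq = eq
            cls.__order = order
            if order and not eq:
                raise ValueError("order requires eq")
            # Scan annotations across the MRO, backwards; None and ClassVar
            # values clear the field.
            fields = {}
            for klass in reversed(cls.__mro__):
                for name, value in vars(klass).get("__annotations__", {}).items():
                    if (value is None or value is typing.ClassVar
                            or typing.get_origin(value) is typing.ClassVar):
                        fields.pop(name, None)
                    else:
                        fields[name] = value
            # Determine defaults from the class namespace, erroring on gaps
            # (assumed meaning, see above); inspect.Parameter.empty means
            # "leave this unset".
            defaults = {}
            for name in fields:
                if hasattr(cls, name):
                    default = getattr(cls, name)
                    if default is not inspect.Parameter.empty:
                        defaults[name] = default
                elif defaults:
                    raise TypeError(f"field {name!r} lacks a default after one that has one")
            # If the class defines __new__, use it as-is; otherwise generate
            # one from the fields and defaults.
            if "__new__" not in vars(cls):
                def generated_new(klass, *args, **kwargs):
                    values = {**defaults, **dict(zip(fields, args)), **kwargs}
                    missing = [f for f in fields if f not in values]
                    if missing:
                        raise TypeError(f"missing fields: {missing}")
                    self = object.__new__(klass)
                    for field, value in values.items():
                        setattr(self, field, value)
                    return self
                cls.__new__ = generated_new
            # Equality and ordering generation omitted.

    class Point(Product):
        x: int
        y: int = 0

    assert Point(1, y=2).y == 2
    assert Point(1).y == 0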

This was some good progress, but it's time to wrap up for now. Good night.