Coding 2022-08-24

By Max Woerner Chase

I figured I'd talk some about something that's been on my mind, something that got me looking into Koka. It's this post. There are a bunch of interesting ideas in there, and I'm interested in seeing, more or less, what happens if I aim for a point in the design space somewhere in the convex hull of this post, Koka, Python, and the CLOS (n.b. I do not know Common Lisp, so I'm trying to get this through reading The Art of the Metaobject Protocol).

There are a lot of fiddly points to consider. Like, should there be a Python-style descriptor protocol? Koka-style (and Perl, I think, and presumably other things) nullary functions with implicit parens? Both, somehow?

But I want to focus on one bit from Eevee's post, where she's talking about Python, and I'm not sure she's correct.

Let's quote the relevant bit.

Magic methods work differently from other methods, in that they only work when assigned to the class and not when assigned to an instance. It turns out there’s not actually a good reason for this.

This is going to need a bit of unpacking.

To start with, let's talk specifically about infix operators, like +. Haskell and Koka implement infix operators with a bit of syntactic sugar that allows the operator to be named, and defined, in prefix form. So, if you crack open the definition of, say, list concatenation, you get

(++) []     ys = ys
(++) (x:xs) ys = x : xs ++ ys

or

// Append two lists.
pub fun (++)(xs : list<a>, ys : list<a> ) : list<a>
  append(xs,ys)

// Append two lists.
pub fun append(xs : list<a>, ys : list<a> ) : list<a>
  match xs
    Cons(x,xx) -> Cons(x,append(xx,ys))
    Nil -> ys

Unlike Haskell or Koka, Python implements "define an implementation of an infix operator" by having the class in question define a method named with __double_underscores__, like + with __add__. (And __radd__, and there's also __iadd__, which should do something similar to __add__ but not the same, and...)

Python also uses this kind of mechanism for unary operations, and various other bits of built-in functions and syntax. Now, what "works" here is the mapping between "fancy syntax" and "the actual method implementation". If you have

my_var + 3

then executing that code is going to result in the __add__ method being looked up on the type of my_var and called with my_var and 3. You can get similar behavior by invoking the method directly on the instance, although this is brittle in certain ways:

assert my_var.__add__(3) == my_var + 3

(If we used a user-defined type for the right-hand side, this would not necessarily hold. This is getting into that __radd__ stuff from earlier.)
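To make that fallback concrete, here's a minimal sketch; the class name Righty is mine, not from the post. When the left operand's __add__ doesn't know what to do with the right operand, Python tries the reflected method on the right operand's type.

```python
# Hypothetical type demonstrating the __radd__ fallback: int.__add__
# doesn't know about Righty, so it returns NotImplemented, and Python
# then tries the reflected method on the right operand's type.
class Righty:
    def __radd__(self, other):
        return f"Righty.__radd__ saw {other!r}"

print(3 + Righty())  # Righty.__radd__ saw 3
```

This is also why my_var.__add__(3) and my_var + 3 can diverge: the direct method call skips the whole NotImplemented negotiation.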

What Eevee is pondering is the equivalent of

my_var.__add__ = custom_add_function
assert my_var.__add__(3) == my_var + 3

This is not likely to work, because Python doesn't look up the "magic" method on the instance at all.
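A quick sketch of exactly that failure (Box and the lambda are illustrative names of mine): the operator machinery consults only the type, while plain attribute access consults the instance dictionary first, so the two visibly disagree.

```python
# Assigning __add__ on an instance has no effect on `+`, because the
# operator machinery looks the magic method up on the type only.
class Box:
    def __add__(self, other):
        return "class __add__"

b = Box()
b.__add__ = lambda other: "instance __add__"

print(b + 3)         # class __add__    -- instance dict is bypassed
print(b.__add__(3))  # instance __add__ -- ordinary lookup does see it
```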

As to why I'm not currently comfortable with the statement that "there's not actually a good reason for this", we'll have to look at the reasons that I do see for it.

If I'm understanding my experiments correctly, attribute access in Python essentially "looks" in two different directions.

On an instance that is not a type, instance.attr will check the instance dictionary for attr, and the instance dictionary of its type, and its type's ancestors. The behavior is eerily configurable if you're determined, and the default behavior is slightly more involved than I feel like summarizing currently.

On an instance that is a type, the my_type.attr situation is similar, except that now the instance has its own type ancestors to contend with, as well as the ancestors of its type. (The type of a type should be a subtype of the types of its supertypes, so there's no problem with only checking the type of the instance itself.)
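Here's a small sketch of those two directions, using illustrative class names of my own. A metaclass attribute is visible on the type, but not on the type's instances, because instance lookup only goes one level "up" into the type and its ancestors.

```python
# The two lookup directions: instances look in their own dict and their
# type's MRO; types additionally look in their metaclass's MRO.
class Meta(type):
    meta_attr = "from the metaclass"

class Base:
    base_attr = "from the base class"

class Derived(Base, metaclass=Meta):
    pass

d = Derived()
d.inst_attr = "from the instance dict"

print(d.inst_attr)        # found in the instance dict
print(d.base_attr)        # found on the type's ancestors
print(Derived.meta_attr)  # found on the *type's* type (the metaclass)...
print(hasattr(d, "meta_attr"))  # ...but instances don't see it: False
```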

Where things get hairy is not necessarily __add__, but one of the more commonly invoked operations of a type.

Suppose we want to create a type whose instances can be called like a function. This looks like

class MyClass:
    def __call__(self):
        ...

Now, one type whose instances are often called like a function is... type.

>>> class TalkyMeta(type):
...     def __call__(self, *args, **kwargs):
...             print("Hi there!")
...             return super().__call__(*args, **kwargs)
...
>>> class Talky(metaclass=TalkyMeta):
...     pass
...
>>> Talky()
Hi there!
<__main__.Talky object at 0x7fd4850de110>

As long as types act as the main entry point for constructing their instances, via the __call__ method, we can't look up all "magic" methods on the instance without making it highly inconvenient to construct an instance of a type that overrides __call__ so its instances can emulate functions.
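A sketch of the conflict (again with an illustrative class name): MyClass defines __call__ so its instances act like functions, and constructing one still works only because calling MyClass looks __call__ up on type(MyClass), not on MyClass itself.

```python
# MyClass.__call__ is meant for *instances*. Calling MyClass() still
# constructs an object, because Python resolves the call through
# type(MyClass).__call__ (i.e. type.__call__), ignoring the class's own
# __call__ attribute for that purpose.
class MyClass:
    def __call__(self):
        return "instance called"

obj = MyClass()  # resolved via type.__call__, so construction works
print(obj())     # instance called

# If class calls consulted the class object's own attributes the way
# instance-level magic-method lookup would, MyClass() would hit
# MyClass.__call__ instead, and construction would break.
```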

For what it's worth, Lua's metamethods avoid taking a stand on this by not exposing the metamethods on the value itself unless you specifically hook that up. There's one table for metamethods (the metatable), and another table for the attributes visible on the value, and maybe they're the same table, but they don't have to be.

It's getting late, so I'll just say that my gut feeling is that, instead of trying to make the magic method lookup take the instance dictionary into account, it should (somehow) get its own mapping on the type, distinct from the "stuff that the instance delegates to explicitly".

This raises a bunch of questions in the context of language design, so this idea is way more of a starting point than an ending point, and there's plenty more for me to think about, later.

Good night.