In this post:
- It's hard to strike a good balance in writing contracts to verify software, because you don't want to slow down the program too much.
- ... And we don't know offhand how much a given contract slows things down.
- ... But what if we could annotate things so that a static analysis tool could figure it out?
- Approximately 1.5 bad jokes that I don't have the decency to explain.
While I wait for all of the infrastructure I need for Python 3.7 CI to get set up, I'm thinking about other projects I could look into. I was thinking about something Hillel Wayne has talked about a few times, which is combining property-based testing and design-by-contract. One tension that arises in design-by-contract, is which contracts to be checking in production:
I have mixed feelings on this. On one hand, if you have control over your production environment, it’s better to fail fast than let an error propagate. On the other hand, contracts are a runtime overhead. On the other other hand, if you’re concerned about runtime overheads you really should be profiling your code instead of just turning off consistency checks and hoping that’s where the slowdown is.
On the other other other hand, many of the “full specification”-style contracts involve coming up with a “definition” of the function that’s more “obviously correct” than whatever implementation you have. In practice this usually means brute forcing the solution, which probably would be a source of slowdown in prod. So my personal rule is “some contracts stay up, some get turned off.”
It looks to me like one perspective to take on this, is to suppose that algorithmic characteristics are part of the code's interface, and therefore, in production, you don't want to be running any checks with worse asymptotic performance. So all we need to do is get everyone using big O notation, and they can verify the correctness of it all by hand.
It'd be really convenient if we could have some form of static analysis tool that took care of this stuff for us, to an extent. And, it turns out that type systems can encode some pretty sophisticated annotations of type values, via "Ghosts of Departed Proofs". I'm a little interested in trying to apply those techniques to Python, but what I have in mind here is kind of adjacent to it, and would rely on different behavior than a typical type checker has. I'll try to sketch something out, since it'll get really vague if I don't have something to point to.
T = typing.TypeVar('T') U = typing.TypeVar('U') def my_product(a: typing.List[T], b: typing.List[U]) -> typing.List[typing.Tuple[T, U]]: results =  for elem_a in a: for elem_b in b: results.append((elem_a, elem_b)) return results
We want to somehow express the idea that the runtime of my_product is dependent on the sizes of a and b, specifically, on their product. Something like...
import roger_smith T = typing.TypeVar('T') U = typing.TypeVar('U') M = typing.TypeVar('M') N = typing.TypeVar('N') def my_product(a: roger_smith.Bind[typing.List[T], roger_smith.Binding[roger_smith.Len, M]], b: roger_smith.Bind[typing.List[U], roger_smith.Binding[roger_smith.Len, N]]) -> roger_smith.Bound[typing.List[typing.Tuple[T, U]], roger_smith.Product[M, N]]: results =  for elem_a in a: for elem_b in b: results.append((elem_a, elem_b)) return results
This is obviously extremely un-ergonomic, and the name is a joke that I'm a louse for making, but hopefully the basic idea comes across: it's possible to generate one or more bindings from each argument to a type variable (TODO, check whether mypy has any objections to this that, like most of mypy's objections to writing my own type syntax, I did not foresee), and combine the bound variables into an expression in the output type that expresses the complexity bounds on the function. The time complexity should probably be worst case; it is, after all, the long tail that causes a lot of the trouble. To verify the complexities provided, the checker basically has to add up every line's algorithmic complexity within a suite, and multiply loops by the algorithmic bounds on the number of iterations. Ideally, the checker should be able to request further bindings in order to calculate such things, and to generate a complexity for the result.
To work with Python specifically, there are some further features needed. For example, methods should be annotated both with the algorithmic complexity of the specific implementation, and with the maximum algorithmic complexity that is acceptable in a subclass. Also, some except blocks should be excluded from the calculations, but I think that has to be an explicit annotation of some kind.
I feel like this is a little obnoxious because it makes you still do the work (unless it has all of the fill-in features and you just iterate over it), but at least it can handle checking and verification.
Anyway, a miracle occurs, and we have a mypy plugin for this. For the purpose of run-time type checkers, the "binding" functions should probably just return their first argument, because I can't think of any useful way to do the analysis at runtime.
Now that we have the ability to generate and verify algorithmic bounds, the next step is to have a contracts library that operates within the bounds.
It's late an I'm tired, so I'm just going to handwave my way through this idea: add to an existing or new contracts library the concept of "oracle" contracts, which are not checked at runtime, but are somehow signalled to run under test. Annotate the contracts using the Cast function to confirm that the complexity is as expected. Then, there you go: oracle contracts that (probably) don't slow down performance unacceptably at runtime.
I feel like if I tried to do this, I'd either get a useful tool, or learn about some really deep stuff. Maybe both. So, I think this is a good candidate for several weeks' worth of project sometime.