Diary 2019-09-25
A few things happened today. For one, I think I finally have proof that the recreational math idea I was messing with just doesn't work as far as I was trying to take it. For another, Coverage released another alpha, and I realized that limit-coverage was broken outside of my specific use case, so I pushed out some quick fixes that didn't break my stuff, and probably make things better; that it would be hard for them to make worse. Discovered something about Coverage that I'm going to report as an issue when I have time to put together minimal reproduction. It probably won't matter to most people. I'm putting together notes for something on software metrics, but I'm not rushing that, because that's a topic that's not going anywhere.
I figured I'd try to explain the recreational mathematics stuff, since I've mentioned it enough.
First, some background:
There are various conventions for representing numbers in base 10, using binary coding. Basically, different systems for turning a few binary digits into a single decimal digit. What inspired me was the POSTNET system, in which each digit is assigned a number, and the numbers from 0 to 9 are represented as sums of exactly two of these numbers, with just one special case (4 + 7 = 0). I found myself wondering under which circumstances a constant-weight code could have a number assigned to each digit as in POSTNET, but with no special cases.
Clearly, if just one digit is on, then the answer is to just give each digit a number, consecutively, up to the maximum. This is basically a way of thinking about normal positional systems that ignores the specific symbols: 0 is the first light, 1 is the second, 2 is the third, and so on. And if one digit is off, then it trivially corresponds to the case of one digit on, though getting the numbers to actually line up may require some fractional parts.
We can call these cases of representing N symbols with N bits the trivial solutions to this idea. What's of interest is representing more than N symbols with N bits. I have found two solutions to this (similar solutions can be obtained through symmetries, but they're the same structures): two of -1, 1, 2, 3, and three of -5, 2, 3, 4, 6, 9. From my attempts to find a solution for four of eight numbers, I doubt there are any further non-trivial solutions; I would love to be wrong, and I'd especially love it if there's literature on this that I just failed to find.
Before I get into the theory behind the computer search I did for the 4-of-8 case, I'll mention how things went doing stuff manually.
2-of-4 was simple enough to do in my head. I wasn't able to work out 3-of-6 by guessing, but I was able to express it as an extension of 2-of-4, and from there basically just solve some linear equations to find it. I attempted to extend this result to 4-of-8, and I only got negative results.
Unfortunately, my attempts weren't very rigorous, and it's hard for me to write them up in an airtight way, especially given what I found next:
Computer search for the 4-of-8 case...
Four out of eight digits would yield a base-70 code. Testing every candidate for 3-of-6 is more-or-less fine. Testing every candidate for 4-of-8... is not. Something must be done to narrow down the candidates we look at, to come up with standards from which we can generate only the candidates that meet those standards.
We can generalize the problem of finding numbers that add up to each number from 0 to 69 to adding up to 70 consecutive numbers, or further, to each number mod 70. If there is a 4-of-8 coding that sums four numbers to obtain a number between 0 and 69, then there is a 4-of-8 coding that sums four numbers to obtain every number mod 70.
Given every number mod 70, we can modulo them with much smaller numbers, to get every number mod 10, 7 times, or every number mod 14, 5 times. Looking for candidates in these spaces is much faster than the full 70, because all 8 digits have a much smaller range of values, so the search is exponentially quicker.
Supposing we had a solution, and we took its moduli mod 2, 5, and 7. From the three sequences that result, we could recover the original solution. Using 10 and 14 lets us optimize somewhat, because we know that evens have to match with evens, and odds with odds. This lets us divide up candidate sequences by the number of even numbers they contain, and solve four smaller problems instead of one big one. In addition, within each problem, we know that we can't match evens and odds, so that lets us discard more possibilities out of hand. It so happens that none of the possibilities mod 14 repeat a digit, so we just need all by-value unique permutations of the possibilities mod 10. A permutation of a 10, combined with a 14, uniquely identifies a candidate sequence mod 70, which can then be quickly checked.
I ran the numbers, had code calculate everything, and... nothing. No candidates meet the loosened standard of "each number mod 70".
Therefore, there's no coding for 4-of-8 that works like -1, 1, 2, 3, or -5, 2, 3, 4, 6, 9. I don't know about higher numbers. For all I know, the "problem" with 4-of-8 is just that 70 is congruent to 1 mod 3. I think I can gather enough low factors of the next number, 252, to have a chance of doing a search in a reasonable time, but I don't want to try right now.
So, that's how I spent my weekend.
It's late, I should stop writing.
Good night.