2011/04/26

Deterministic Random Testing

I've just written an implementation of a weighted random shuffle. Because the function is random, it won't return the same results every time. This makes it hard to unit-test; I can't just run it on some given input and assert it returns some given output. Indeed, by its nature, any ordering of the results is potentially valid. One way to solve this is to run it a large number of times, counting the frequency of various output results, and asserting that roughly the right proportion of results are returned. Of course, exactly how close "roughly right" is, is hard to determine.

This isn't a very satisfactory testing method, however. Get the tolerance too tight, and spurious random differences cause tests to fail unnecessarily. Too loose, and you might miss a subtle logic bug that skews the probabilities of some case.

The weighted random shuffle algorithm calls int rand $limit a number of times, each time passing in a small integer. This effectively uses the RNG like a dice roll, randomly selecting some integer 0 .. $limit - 1. Because each call takes a small integer, and because the algorithm is entirely deterministic for any given set of random results, this suggests a better testing method. If we could instead enumerate all possible returns of random numbers, each one exactly once, we can generate all possible results from the shuffle algorithm in their ideal proportions. Because this will be an exact deterministic count, free from randomness, we can assert exact values for the results.

So this is what I did. I've created a module, for now simply called Unrandom, which exports one function, unrandomly. It is used like the following:
use Unrandom 'unrandomly';

unrandomly {
my $da = 1 + int rand 6;
my $db = 1 + int rand 6;
say "Roll 2d6: $da + $db = " . ($da+$db);
};
The unrandomly function has the effect of replacing the rand function with one under its control while it runs the block of code. It runs the block of code a number of times, enumerating the entire tree of possible return values, in a given deterministic order:
Roll 2d6: 1 + 1 = 2
Roll 2d6: 1 + 2 = 3
Roll 2d6: 1 + 3 = 4
Roll 2d6: 1 + 4 = 5
Roll 2d6: 1 + 5 = 6
Roll 2d6: 1 + 6 = 7
Roll 2d6: 2 + 1 = 3
Roll 2d6: 2 + 2 = 4
...
Roll 2d6: 6 + 5 = 11
Roll 2d6: 6 + 6 = 12
Because each possible combination is returned exactly once, we can unit test that this dice-rolling algorithm does in fact give us the right distribution of results; there will be exactly one 2, two 3s, etc... We don't have to run it a few million times, and check that we got "roughly" the right number; we can be exact.

While I'm using this code in a unit-test, there's nothing directly test-related in the code. It could be useful anywhere that statistical modelling is used, or other problems involving random integer generation.

I'm now just looking for a good name to call it, so I can extract it from the unit tests and give it a life of its own on CPAN.

Suggestions anyone?

2011/04/09

Extended colour support, terminals, and ECMA-48

tl;dr summary: Terminal authors - please accept CSI 38:5:123 m as well as anything else, to set extended colours. NOTE THE COLON. Many terminals these days are starting to support extended colour modes; specifically things like 256 colours. They're all doing it wrong.
CSI 38;5;123 m
No no no no no. That is NOT what you think. It does not select colour 123 from palette 5. CSI arguments are separated by semicolons. This sets three unrelated SGRs; 38, 5, and 123. 38 sets a foreground colour to .. er.. nothing in particular. 5 is blinking mode, 123 has no defined meaning to my knowledge. All the other following are exactly identical:
CSI 5;38;123 m
CSI 38 m CSI 5 m CSI 123 m
ECMA-48 already defines a perfectly good way for parameters to take sub-parameters. It's the colon. The correct way to encode this concept is
CSI 38:5:123 m
Why does this matter? It matters for parsers not to have to understand what is going on. Consider
CSI 3;38:9:5:4:3:2:1;11;1
This is equivalent to
CSI 3 CSI 38:9:5:4:3:2:1 CSI 11; CSI 1
I.e. I have no idea what palette 9 is, but I can definitely parse out the contents of this SGR from the others, knowing exactly where it ends. I don't have to "just know" that palette 9 happens to take 5 parameters. The CSI encodes this. My own terminal emulator library, libvterm, already understands these colon-based arguments. It parses them correctly. This is important for other palettes that don't take just one value. For example, the RGB, RGBA, CMY, and CMYK palettes. It's vital to be able to parse out these sub-arguments from the single SGR 38 or 48. Standards exist for a reason, people. Please use them.
Edit 2012/12/20: actually, I have recently learned that xterm now supports this in its correct form as well, since xterm patch 282.