
Regression tests and regtest.erl - Luke's Weblog
Luke Gorrie


Regression tests and regtest.erl [May. 29th, 2007|01:22 pm]

My new favourite kind of program is one that tests whether two big and complex programs are substantially equivalent. With tools like this I can take one big and complex program, rewrite parts of it for clarity, and then have confidence that my new version preserves the original behaviour. I think this is much simpler than testing for correctness.

I wrote one particular Erlang regression test program called regtest.erl along these lines. The idea is first to execute a large and complex program on a lot of inputs and to record what happens in log files, then to test the logs from separate runs for equivalence. If the consequences in the logs are the same before and after my hacking then I know the overall behaviour is substantially the same.
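To make the record-then-compare idea concrete, here is a minimal sketch in Python rather than Erlang. It is not regtest.erl itself: the names `record_run`, `diff_logs`, `equivalent_on` and the two toy programs are hypothetical, and the one-JSON-line-per-input log format is an assumption made for illustration.

```python
import json
from itertools import zip_longest
from pathlib import Path
from tempfile import TemporaryDirectory


def record_run(program, inputs, log_path):
    """Run `program` on every input and write one JSON log line per call."""
    with open(log_path, "w") as log:
        for value in inputs:
            try:
                outcome = {"input": value, "result": program(value)}
            except Exception as exc:  # a crash is behaviour too, so log it
                outcome = {"input": value, "error": type(exc).__name__}
            log.write(json.dumps(outcome, sort_keys=True) + "\n")


def diff_logs(old_path, new_path):
    """Return (line_number, old_line, new_line) for every differing line."""
    old_lines = Path(old_path).read_text().splitlines()
    new_lines = Path(new_path).read_text().splitlines()
    return [
        (number, old, new)
        for number, (old, new) in enumerate(zip_longest(old_lines, new_lines), start=1)
        if old != new
    ]


def equivalent_on(old_program, new_program, inputs):
    """Log both programs on the same inputs and return the log differences."""
    inputs = list(inputs)  # both runs must see exactly the same inputs
    with TemporaryDirectory() as tmp:
        old_log = Path(tmp) / "old.log"
        new_log = Path(tmp) / "new.log"
        record_run(old_program, inputs, old_log)
        record_run(new_program, inputs, new_log)
        return diff_logs(old_log, new_log)


# Hypothetical "before" and "after" versions of the code being rewritten.
def original(n):
    return sum(range(n + 1))


def rewritten(n):
    return n * (n + 1) // 2
```

On the inputs 0 to 19 the two toy programs write identical logs, so `equivalent_on(original, rewritten, range(20))` returns an empty list. Adding -2 to the inputs exposes a difference, because the closed form does not match the loop for negative numbers. That is the kind of accidental behaviour change this style of test is meant to catch.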

Lately I write small programs like this any time I want to rewrite important code that has no explicit test suite. I'm finding it very useful!


From: darius
2007-05-29 06:53 pm (UTC)
I like the note for sophisticated readers.

Regression testing plus informal programming-by-contract has been my mainstay for my whole career -- I'm afraid it made me slow to get into this unit-testing fad. :-)
From: (Anonymous)
2007-05-31 11:08 am (UTC)
Whee, test methodologies! You can also construct a test suite that way: "I bet some doofus will make *this* error! Let's introduce it on purpose and see on which inputs the mutated program barfs." -> Instant test case. And if you can write a program that mutates your system ("Let's change this <= to a <, and add 1 to that constant over there..."), you can even automate the whole shebang.
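A minimal Python sketch of that "change this <= to a <" mutation, assuming the system under test is small enough to hold as source text. `shipping_cost`, `WeakenComparison`, `load_function` and `killing_inputs` are hypothetical names made up for the example; only the standard `ast` module is used.

```python
import ast

# Hypothetical function under test, kept as source so it can be mutated.
ORIGINAL_SOURCE = """
def shipping_cost(weight_kg):
    if weight_kg <= 5:
        return 10
    return 25
"""


class WeakenComparison(ast.NodeTransformer):
    """Mutation operator: turn every `<=` into `<`."""

    def visit_Compare(self, node):
        self.generic_visit(node)
        node.ops = [ast.Lt() if isinstance(op, ast.LtE) else op for op in node.ops]
        return node


def load_function(tree, name):
    """Compile a module AST and return the named function defined in it."""
    namespace = {}
    exec(compile(tree, "<mutation-sketch>", "exec"), namespace)
    return namespace[name]


def killing_inputs(source, name, inputs):
    """Return the inputs on which the mutant disagrees with the original."""
    original = load_function(ast.parse(source), name)
    mutant_tree = ast.fix_missing_locations(WeakenComparison().visit(ast.parse(source)))
    mutant = load_function(mutant_tree, name)
    return [value for value in inputs if original(value) != mutant(value)]
```

Here `killing_inputs(ORIGINAL_SOURCE, "shipping_cost", [1, 5, 9])` returns only the boundary value 5, which is the instant test case. An empty result means none of the chosen inputs can tell the mutant from the original.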

Can you also construct some harness that runs your two systems in lockstep, throwing you into the debugger when their output is different? (The name for that technique is bisimulation.)

Let's talk the next time we meet - I'm supposed to write a dissertation in that general area...

From: leon03
2007-06-01 12:46 am (UTC)
But what about race conditions? And what about corner cases that randomly generated test data may or may not hit, depending on the chosen distribution? Finally, what if the code to compare logs is very complicated? ;-)
From: (Anonymous)
2007-06-02 12:25 am (UTC)
Nondeterministic behavior is usually "handled" by having a third test run outcome (besides "success" and "failure"): "inconclusive", meaning "the app / module I'm meant to test got an input I didn't expect at that point, I can't tell whether its subsequent behavior is correct or not". If the _output_ of the module under test can change semirandomly, the test case has to take this into account when determining success or failure.
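A small Python sketch of such a three-valued verdict. It assumes the test observes a trace of `("input", ...)` and `("output", ...)` events and compares it with an expected script. The `Verdict` enum, the `judge` function and the event format are all hypothetical, chosen only to illustrate the idea.

```python
from enum import Enum
from itertools import zip_longest


class Verdict(Enum):
    PASS = "pass"
    FAIL = "fail"
    INCONCLUSIVE = "inconclusive"


def judge(expected, observed):
    """Compare an observed event trace with the expected script.

    Each event is a (kind, payload) tuple whose kind is "input" or "output".
    A deviating input means the environment left the script, so nothing can
    be concluded; a deviating output is a real failure of the module.
    """
    for want, got in zip_longest(expected, observed):
        if want == got:
            continue
        # Classify by the event that actually happened, or by the one that
        # is missing when the observed trace ends early.
        deviating = got if got is not None else want
        if deviating[0] == "input":
            return Verdict.INCONCLUSIVE
        return Verdict.FAIL
    return Verdict.PASS
```

For example, if the script expects a `("input", "logout")` event but the module receives `("input", "timeout")` instead, `judge` returns `Verdict.INCONCLUSIVE` rather than blaming the module for whatever it does next.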

Random test data does not lead to very interesting test cases: when you start walking with your eyes closed from your living room and have to start over each time you run into something, you almost certainly will not reach the nice bar downtown.