Testing with QuickCheck

With that in mind, in this book, we'll make extensive use of property-based testing, also called generative testing by some literature. Property-based testing has a slightly different, if similar, workflow to unit testing. When writing property tests, the programmer will:

  • Produce a method for generating valid inputs to a system under test
  • Write further code to demonstrate that for all valid inputs that a desirable property or property of the system holds

Property-based testing is exceptionally good at finding corner cases in software, wacky inputs that the programmer might not have dreamed up, or, as happens from time to time, even understand as potentially problematic. We'll use Andrew Gallant's QuickCheck, a tool that is patterned from Haskell QuickCheck, introduced by Koen Claessen and John Hughes in their 2000 QuickCheck: A Lightweight Tool for Random Testing of Haskell Programs. First, a little preamble to get us into the testing module:

#[cfg(test)]
mod test {
    extern crate quickcheck;

    use super::*;
    use quickcheck::{Arbitrary, Gen, QuickCheck, TestResult};

If we were to unit test our naive HashMap, we'd probably write a test where a particular arbitrary key/value pair would be inserted and then we'd assert the ability to retrieve the same value with that same key. How does this map to property-based testing? The property is that if we perform an insertion on an empty HashMap with a key k of value v and immediately perform a retrieval of k, we'll receive v back out. Here it is:

    #[test]
    fn get_what_you_give() {
        fn property(k: u16, v: u16) -> TestResult {
            let mut system_under_test = HashMap::new();

            assert_eq!(None, system_under_test.insert(k, v));
            assert_eq!(Some(&v), system_under_test.get(&k));

            TestResult::passed()
        }
        QuickCheck::new().quickcheck(property as fn(u16, u16) -> 
TestResult); }

The test is named get_what_you_give and is annotated #[test] as per usual Rust tests. What's different is this test has a property inner function that encodes the property we elaborated on earlier and a call out to QuickCheck for actually running the property test, by default 100 times, with different inputs. Exactly how QuickCheck manages this is pretty swell. Per the Claessen and Hughes paper, QuickCheck implements the Arbitrary trait for many of the types available in the standard library. QuickCheck defines Arbitrary like so, in quickcheck/src/arbitrary.rs :

pub trait Arbitrary : Clone + Send + 'static {
    fn arbitrary<G: Gen>(g: &mut G) -> Self;

    fn shrink(&self) -> Box<Iterator<Item=Self>> {
        empty_shrinker()
    }
}

The Gen parameter is a wrapper around rand:Rng with some extra machinery for controlling distributions. The two functions are arbitrary and shrink. The purpose of arbitrary is easy to explain: it generates new, random instances of a type that is Arbitrary. The shrink function is a little more complicated to get across. Let's say we've produced a function that takes a vector of u16 but just so happens to have a bug and will crash if a member of the vector is 0. A QuickCheck test will likely find this but the first arbitrary vector it generates may have a thousand members and a bunch of 0 entries. This is not a very useful failure case to begin diagnosis with. After a failing case is found by QuickCheck the case is shrunk, per the definition of Arbitrary::shrink. Our hypothetical vector will be cut in half and if that new case is also a failure then it'll be shrunk by half again and so on until it no longer fails, at which point—hopefully—our failing case is a much more viable diagnosis tool.

The implementation of Arbitrary for u16 is a touch complicated due to macro use, but if you squint some, it'll become clear:

macro_rules! unsigned_arbitrary {
    ($($ty:tt),*) => {
        $(
            impl Arbitrary for $ty {
                fn arbitrary<G: Gen>(g: &mut G) -> $ty {
                    #![allow(trivial_numeric_casts)]
                    let s = g.size() as $ty;
                    use std::cmp::{min, max};
                    g.gen_range(0, max(1, min(s, $ty::max_value())))
                }
                fn shrink(&self) -> Box<Iterator<Item=$ty>> {
                    unsigned_shrinker!($ty);
                    shrinker::UnsignedShrinker::new(*self)
                }
            }
        )*
    }
}

unsigned_arbitrary! {
    usize, u8, u16, u32, u64
}

Arbitrary instances are generated by unsigned_arbitrary!, each of which in turn generates a shrinker via unsigned_shrinker!. This macro is too long to reprint here but the basic idea is to remove half of the unsigned integer on every shrink again, until zero is hit, at which point give up shrinking.

Fifteen years after the original QuickCheck paper, John Hughes summarized his experience in the intervening 15 years with his 2016 Experiences with QuickCheck: Testing the Hard Stuff and Staying Sane. Hughes noted that many property-based tests in the wild don't generate primitive types. The domain of applications in which primitive types are sufficient tends to be those rare, blessed pure functions in a code base. Instead, as many functions in a program are inherently stateful, property-based tests tend to generate arbitrary actions against a stateful system, modeling the expected behavior and validating that the system under test behaves according to its model.

How does that apply to our naive HashMap? The system under test is the naive HashMap, that's clear enough. Our actions are:

  • INSERT key value
  • LOOKUP key

Hopefully, that's straightforward. What about a model? In this particular instance, we're in luck. Our model is written for us: the standard library's HashMap. We need only confirm that if the same actions are applied to both our naive HashMap and the standard HashMap in the same order then we'll get the same returns from both the model and system under test. How does that look? First, we need actions:

    #[derive(Clone, Debug)]
    enum Action<T>
    where
        T: Arbitrary,
    {
        Insert(T, u16),
        Lookup(T),
    }

Our Action<T> is parameterized on a T: Arbitrary and all values via the Insert are pegged as u16s. This is done primarily for convenience's sake. Both key and value could be arbitrary or concrete types, depending on the preference of the tester. Our Arbitrary definition is as follows:

    impl<T> Arbitrary for Action<T>
    where
        T: Arbitrary,
    {
        fn arbitrary<G>(g: &mut G) -> Action<T>
        where
            G: Gen,
        {
            let i: usize = g.gen_range(0, 100);
            match i {
                0...50 => Action::Insert(Arbitrary::arbitrary(g), 
u16::arbitrary(g)), _ => Action::Lookup(Arbitrary::arbitrary(g)), } } }

Either action is equally likely to happen and any instance of T or u16 is valid for use. The validation of our naive HashMap against the standard library's HashMap:

    #[test]
    fn sut_vs_genuine_article() {
        fn property<T>(actions: Vec<Action<T>>) -> TestResult
        where
            T: Arbitrary + Eq + Hash + ::std::fmt::Debug,
        {
            let mut model = ::std::collections::HashMap::new();
            let mut system_under_test = HashMap::new();

            for action in actions.into_iter() {
                match action {
                    Action::Insert(k, v) => {
                        assert_eq!(model.insert(k.clone(), v), 
system_under_test.insert(k, v)); } Action::Lookup(k) => { assert_eq!(model.get(&k),
system_under_test.get(&k)); } } } TestResult::passed() } QuickCheck::new().quickcheck(property as fn(Vec<Action<u8>>) ->
TestResult); }
}

Hopefully, this looks familiar to our previous QuickCheck test. Again, we have an inner property function that does the actual work and we request that QuickCheck run this property test over arbitrary vectors of Action<u8>. For every action present in the vector we apply them to both the model and the system under test, validating that the results are the same. The u8 type was intentionally chosen here compared to a large domain type such as u64. One of the key challenges of writing QuickCheck tests is probing for extremely unlikely events. QuickCheck is blind in the sense that it's possible for the same path to be chosen through the program for each run if that path is the most likely path to be chosen. While millions of QuickCheck tests can give high confidence in fitness for purpose the blind nature of the runs means that QuickCheck should also be paired with tools known as fuzzers. These do not check the correct function of a program against a model. Instead, their sole purpose is to validate the absence of program crashes.