Advanced testing

Up to now we have been testing functions whose output is entirely predictable. In these cases, a handful of tests is usually enough to provide confidence that the software is working as expected. In the real world, however, you might be developing a complex piece of software to implement an entirely new algorithm or model. In certain cases it might not even be clear what the expected outcome is meant to be. Things can be particularly challenging when the software involves a stochastic element.

Let us consider a class to simulate the behaviour of a dice. One is provided in the dice package. Let's import it and see how it works.

from dice import Dice
help(Dice)
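If you don't have the dice package to hand, the sketch below is a minimal stand-in that mimics the interface used in the tests that follow: a constructor that can be called with no arguments, and a roll() method. The class name SimpleDice and the sides parameter are assumptions made for illustration; the real Dice class may be implemented quite differently.

```python
import random


class SimpleDice:
    """Hypothetical stand-in for dice.Dice; the real class may differ."""

    def __init__(self, sides=6):
        # Assumption: the number of sides is configurable and defaults to six.
        if sides < 1:
            raise ValueError("A dice needs at least one side.")
        self._sides = sides

    def roll(self):
        """Return a random integer between 1 and the number of sides, inclusive."""
        return random.randint(1, self._sides)


# Roll a standard, six-sided dice once.
dice = SimpleDice()
print(dice.roll())
```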

How could we test that the dice is fair?

Well, first of all we could check that the value of a dice roll is in range.

# dice/test/test_dice.py
def test_valid_roll():
    """ Test that a dice roll is valid. """

    # Initialise a standard, six-sided dice.
    dice = Dice()

    # Roll the dice.
    roll = dice.roll()

    # Check that the value is valid.
    assert 1 <= roll <= 6

!pytest dice/test/test_dice.py::test_valid_roll

Great, that worked. Although, it could just be a fluke...

In practice, we need to check that the assertions hold repeatedly.

# dice/test/test_dice.py
def test_always_valid_roll():
    """ Test that a dice roll is "always" valid. """

    # Initialise a standard, six-sided dice.
    dice = Dice()

    # Roll the dice lots of times.
    for i in range(0, 10000):
        roll = dice.roll()

        # Check that the value is valid.
        assert 1 <= roll <= 6

!pytest dice/test/test_dice.py::test_always_valid_roll

Okay, that's better. Or is it...

xkcd: random

Not again!

Perhaps we should test the average value. We know that this should equal the sum of the faces of the dice, divided by the number of sides, i.e. 3.5 for a six-sided dice.

# dice/test/test_dice.py
import pytest

def test_average():
    """ Test that the average dice roll is correct. """

    # Initialise a standard, six-sided dice.
    dice = Dice()

    # Set the number of rolls.
    rolls = 100000

    # Work out the expected average roll.
    exp = sum(range(1, 7)) / 6

    # Calculate the sum of the dice rolls.
    total = 0
    for i in range(0, rolls):
        total += dice.roll()

    # Check that the average matches the expected value to within 1%.
    average = total / rolls
    assert average == pytest.approx(exp, rel=1e-2)

!pytest dice/test/test_dice.py::test_average

Good... Hang on, hold your horses!

(1 + 3 + 4 + 6) / 4
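To make the problem concrete, here is a sketch of a deliberately unfair dice that never rolls a 2 or a 5. The LoadedDice class is hypothetical and is not part of the dice package. It passes both the range check and the average check used so far, even though it is clearly not fair.

```python
import random


class LoadedDice:
    """Hypothetical unfair dice that never rolls a 2 or a 5."""

    FACES = (1, 3, 4, 6)

    def roll(self):
        # Each of the four remaining faces is equally likely.
        return random.choice(self.FACES)


dice = LoadedDice()
rolls = 100000
results = [dice.roll() for _ in range(rolls)]

# Every roll is in range, so the validity checks pass.
assert all(1 <= r <= 6 for r in results)

# The average is within 1% of 3.5, so the average check passes too.
average = sum(results) / rolls
assert abs(average - 3.5) <= 3.5 * 1e-2
```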

Dang! A dice that only ever rolls 1, 3, 4, or 6 would also average 3.5. We need to test that the distribution of outcomes is correct, i.e. that each of the six possible outcomes is equally likely.

# dice/test/test_dice.py
def test_fair():
    """ Test that a dice is fair. """

    # Initialise a standard, six-sided dice.
    dice = Dice()

    # Set the number of rolls.
    rolls = 1000000

    # Create a dictionary to hold the tally for each outcome.
    tally = {}
    for i in range(1, 7):
        tally[i] = 0

    # Roll the dice 'rolls' times.
    for i in range(0, rolls):
        tally[dice.roll()] += 1

    # Assert that the probability is correct.
    for i in range(1, 7):
        assert tally[i] / rolls == pytest.approx(1 / 6, rel=1e-2)

!pytest dice/test/test_dice.py::test_fair

Phew, thank goodness! Testing is hard.
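Part of what makes it hard is choosing the tolerance and the number of rolls together. As a rough sketch, assuming each roll is independent, the observed frequency of an outcome with probability p over N rolls has a standard error of sqrt(p(1 - p) / N). The helper below (a hypothetical name, not part of the dice package) expresses a relative tolerance in units of that standard error.

```python
import math


def tolerance_in_standard_errors(p, rolls, rel):
    """Express a relative tolerance on a proportion in units of its standard error."""
    # Standard error of an observed proportion, assuming independent rolls.
    standard_error = math.sqrt(p * (1 - p) / rolls)

    # A relative tolerance of 'rel' allows an absolute deviation of rel * p.
    return (rel * p) / standard_error


# The settings used in test_fair: one million rolls and a 1% tolerance.
print(tolerance_in_standard_errors(1 / 6, 1000000, 1e-2))

# The same tolerance with only ten thousand rolls.
print(tolerance_in_standard_errors(1 / 6, 10000, 1e-2))
```

With one million rolls the 1% tolerance is roughly 4.5 standard errors wide, so a fair dice should fail the test only very rarely. With ten thousand rolls the same tolerance is less than half a standard error, so even a perfectly fair dice would fail most of the time.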

Exercises

Exercise 1

The file dice/test/test_dice.py contains an empty function, test_double_roll, for checking that the distribution for the sum of two six-sided dice rolls is correct. Fill in the body of this function and run pytest to verify that your test passes.

Hints:

For any two n-sided dice, the probability that the two rolls sum to x is given by:

$$p(x) = \frac{n - |x - (n+1)|}{n^2},\quad\mathrm{for}\ x=2\ \mathrm{to}\ 2n$$

We've provided a helper function called prob_double_roll(x, n) that will calculate this probability for you, i.e.

prob = prob_double_roll(4, 6)

will return the probability of rolling a sum of 4 with two six-sided dice.
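To check your understanding of the formula, here is a sketch that evaluates it using exact fractions. The function name double_roll_probability is hypothetical and this is not the source of the provided helper, which may, for instance, return a float instead.

```python
from fractions import Fraction


def double_roll_probability(x, n):
    """Probability that two n-sided dice sum to x, using the formula above."""
    # Sums outside the range 2 to 2n are impossible.
    if x < 2 or x > 2 * n:
        return Fraction(0)
    return Fraction(n - abs(x - (n + 1)), n * n)


# Three of the 36 equally likely outcomes sum to 4: (1, 3), (2, 2) and (3, 1).
assert double_roll_probability(4, 6) == Fraction(3, 36)

# The probabilities of all possible sums add up to one.
assert sum(double_roll_probability(x, 6) for x in range(2, 13)) == 1
```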

Exercise 2

Parametrize your test so that it works for any pair of n-sided dice. Test it using pairs of five- and seven-sided dice.