Generate Your Tests

by Kevin Stone
October 12, 2015

When testing your application, you'll often need to check that it behaves correctly across a range of inputs or conditions. Instead of repetitively defining the same test scaffolding, let's use some meta-programming to build our test methods from a list of parameters. Nose has support for Generator Tests, but they don't work with Python's unittest framework. Let's walk through how to build our own test generator that does.

Our Test Case

So let's write a simple application for demonstration. We want to see if our input string is alphabetized.

def is_alphabetized(value):
    return ''.join(sorted(value)) == value

Now let's write a unit test case to check our implementation:

import unittest

class IsAlphabetizedTestCase(unittest.TestCase):
    def test_is_alphabetized(self):
        self.assertTrue(is_alphabetized('abcd'))

    def test_is_not_alphabetized(self):
        self.assertFalse(is_alphabetized('zyxw'))

We can go ahead and run these tests and they should pass:

> nosetests example
..
----------------------------------------------------------------------
Ran 2 tests in 0.000s

OK

But as you can see, our two test methods only tested simple examples. What we really want to test is our function over a wide range of inputs to ensure our implementation is robust.

At this point, your thought is usually to iterate over a range of cases in your method, and assert that each of them works as desired:

class IsAlphabetizedTestCase(unittest.TestCase):
    def test_is_alphabetized(self):
        inputs = [
            '',
            'a',
            'aaaaa',
            'ab',
            'abcd',
        ]
        for input in inputs:
            self.assertTrue(is_alphabetized(input))

But you lose the benefit of having a single assertion in your test. You can't isolate your test run to a single input. What if your test requires a setup or teardown after each case? You can't use all the advantages of the test framework.

Let's say you have this setup and you get a bug report that our naive implementation above doesn't work correctly for A-Cert, as well as one for iOS. Since we're big fans of TDD, we'll plug those into our test case's list of expected inputs:

from .alphabetical import is_alphabetized

import unittest


class IsAlphabetizedTestCase(unittest.TestCase):
    def test_is_alphabetized(self):
        inputs = [
            '',
            'a',
            'aaaaa',
            'ab',
            'abcd',
            'A-Cert',
            'iOS',
        ]
        for input in inputs:
            self.assertTrue(is_alphabetized(input))

And give it a run:

======================================================================
FAIL: test_is_alphabetized (example.test_alphabetical.IsAlphabetizedTestCase)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "example/test_alphabetical.py", line 34, in test_is_alphabetized
    self.assertTrue(is_alphabetized(input))
AssertionError: False is not true

----------------------------------------------------------------------

But as you can see, we don't know which input failed. We should update our test case to indicate which word failed:

    def test_is_alphabetized(self):
        inputs = [
            '',
            'a',
            'aaaaa',
            'ab',
            'abcd',
            'A-Cert',
            'iOS',
        ]
        for input in inputs:
            self.assertTrue(is_alphabetized(input), u'{} was not considered alphabetized'.format(input))

Running it again:

======================================================================
FAIL: test_is_alphabetized (example.test_alphabetical.IsAlphabetizedTestCase)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "example/test_alphabetical.py", line 46, in test_is_alphabetized
    self.assertTrue(is_alphabetized(input), u'{} was not considered alphabetized'.format(input))
AssertionError: A-Cert was not considered alphabetized

----------------------------------------------------------------------

Now we at least know which word failed. But what about iOS? Because the test exits on the first failure, we're unsure if that bug report was valid or not.

Before we go fixing our is_alphabetized implementation to handle the punctuation and capitalization, let's figure out how to get both inputs to fail.

Instead of looping in our test method, we could instead write separate test methods for each input:

class IsAlphabetizedTestCase(unittest.TestCase):

    def test_is_alphabetized_empty(self):
        self.assertTrue(is_alphabetized(''))

    def test_is_alphabetized_a(self):
        self.assertTrue(is_alphabetized('a'))

    def test_is_alphabetized_aaaaa(self):
        self.assertTrue(is_alphabetized('aaaaa'))

    ...

But boy, is it exhausting having to repeat ourselves for each test. Instead, we'd like to provide the list of inputs as before, but have individual test methods generated for us. Nose actually supports this out of the box with Generator Tests, but you can't use them with Python's unittest.TestCase, and we'd lose all of the help we get from using the framework.

But we can work around that limitation. We just need to define a new test method on our test class for each input and nose will run them just like any other test method.

What we need to do is transform that list of inputs into individual test methods. Instead of us manually writing all those test methods, what if we wrote a function that wrote the test class for us?

class IsAlphabetizedTestCase(unittest.TestCase):
    pass


def add_methods(klass, *inputs):
    """
    Take a TestCase and add a test method for each input
    """
    for input in inputs:
        def test_input(self):
            self.assertTrue(is_alphabetized(input), u'{} was not considered alphabetized'.format(input))

        klass.test_input = test_input


add_methods(
    IsAlphabetizedTestCase,
    '',
    'a',
    'aaaaa',
    'ab',
    'abcd',
    'A-Cert',
    'iOS',
)

Running those tests, we see:

======================================================================
FAIL: test_input (example.test_alphabetical.IsAlphabetizedTestCase)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "example/test_alphabetical.py", line 16, in test_input
    self.assertTrue(is_alphabetized(input), u'{} was not considered alphabetized'.format(input))
AssertionError: iOS was not considered alphabetized

----------------------------------------------------------------------
Ran 2 tests in 0.001s

FAILED (errors=1, failures=1)

What happened? Shouldn't there be more tests? Well, we kept clobbering our test in the klass. The trick is to give each test a unique name.
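
The clobbering is easy to reproduce on its own. Here's a minimal sketch, using a made-up Holder class in place of the test case: every pass through the loop rebinds the same attribute, so only one function is left on the class at the end.

```python
class Holder(object):
    """Stand-in for the TestCase; the name is hypothetical."""


def add_methods(klass, *inputs):
    for value in inputs:
        def test_input(self):
            return value

        # Same attribute name on every pass, so each assignment
        # overwrites the previous one.
        klass.test_input = test_input


add_methods(Holder, 'a', 'b', 'c')

# Collect the generated attributes that look like test methods.
generated = [name for name in vars(Holder) if name.startswith('test_')]
print(generated)  # ['test_input']
```

Three inputs went in, but the class ends up with a single test_input attribute.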

Let's assign the test to the class using setattr with a custom name:

def add_methods(klass, *inputs):
    """
    Take a TestCase and add a test method for each input
    """
    for input in inputs:
        test_name = 'test_alphabetical_{}'.format(input)

        def test_input(self):
            self.assertTrue(is_alphabetized(input), u'{} was not considered alphabetized'.format(input))

        setattr(klass, test_name, test_input)

Now running the tests:

======================================================================
FAIL: test_alphabetical_ (example.test_alphabetical.IsAlphabetizedTestCase)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "example/test_alphabetical.py", line 18, in test_input
    self.assertTrue(is_alphabetized(input), u'{} was not considered alphabetized'.format(input))
AssertionError: iOS was not considered alphabetized

======================================================================
FAIL: test_alphabetical_A-Cert (example.test_alphabetical.IsAlphabetizedTestCase)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "example/test_alphabetical.py", line 18, in test_input
    self.assertTrue(is_alphabetized(input), u'{} was not considered alphabetized'.format(input))
AssertionError: iOS was not considered alphabetized

[...]

----------------------------------------------------------------------
Ran 7 tests in 0.001s

FAILED (failures=7)

Why are all tests failing with iOS? Each test should be running with its own input. It turns out this is due to how closures work in Python. A for loop doesn't create a new scope, so every test_input we define refers to the same input variable inside add_methods, and that variable is looked up when the test runs, not when the function is defined. By the time our test methods are actually executed, the loop has finished and input is always equal to its last value (iOS).
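
The same behaviour shows up without unittest in the picture at all. In this sketch (make_callbacks and callback are names made up for illustration), each inner function only reads the loop variable when it is called, and by then the loop has already finished:

```python
def make_callbacks(values):
    callbacks = []
    for value in values:
        def callback():
            # `value` is looked up when callback() runs,
            # not when callback is defined.
            return value

        callbacks.append(callback)
    return callbacks


results = [cb() for cb in make_callbacks(['', 'a', 'iOS'])]
print(results)  # ['iOS', 'iOS', 'iOS']
```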

To fix that, we'll define yet another function to capture the value of input (each function call gets its own scope, so each test_input closes over its own input):

def make_method(input):
    test_name = 'test_alphabetical_{}'.format(input)

    def test_input(self):
        self.assertTrue(is_alphabetized(input), u'{} was not considered alphabetized'.format(input))

    return test_name, test_input


def add_methods(klass, *inputs):
    """
    Take a TestCase and add a test method for each input
    """
    for input in inputs:
        test_name, test_input = make_method(input)
        setattr(klass, test_name, test_input)

With these changes, we finally get the results we expected:

======================================================================
FAIL: test_alphabetical_A-Cert (example.test_alphabetical.IsAlphabetizedTestCase)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "example/test_alphabetical.py", line 14, in test_input
    self.assertTrue(is_alphabetized(input), u'{} was not considered alphabetized'.format(input))
AssertionError: A-Cert was not considered alphabetized

======================================================================
FAIL: test_alphabetical_iOS (example.test_alphabetical.IsAlphabetizedTestCase)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "example/test_alphabetical.py", line 14, in test_input
    self.assertTrue(is_alphabetized(input), u'{} was not considered alphabetized'.format(input))
AssertionError: iOS was not considered alphabetized

----------------------------------------------------------------------
Ran 7 tests in 0.001s

FAILED (failures=2)

There you go, we are able to generate N test methods to validate our alphabetical function. Adding a new test is as simple as extending the list of inputs.

Improving the Test Generation

So we now have the ability to generate tests from an input list, but it involves a few methods and our test is now deeply nested in a helper factory. Let's improve this so we can use this more universally.

First thing to notice is that add_methods takes a class as the first argument. That looks awfully like a decorator. Let's make a few tweaks so we can use it to decorate our test class:

def add_methods(*inputs):
    """
    Take a TestCase and add a test method for each input
    """
    def decorator(klass):
        for input in inputs:
            test_name, test_input = make_method(input)
            setattr(klass, test_name, test_input)
        return klass

    return decorator


@add_methods('', 'a', 'aaaaa', 'ab', 'abcd', 'A-Cert', 'iOS')
class IsAlphabetizedTestCase(unittest.TestCase):
    pass

All we did was wrap our method generation in a decorator function that takes the class as input, adds the methods, and returns the class.
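
If decorators that take arguments are unfamiliar, it may help to see that the @ syntax is just a function call. In this sketch, tag is a made-up decorator factory with the same shape as add_methods: calling it returns the real decorator, which is then applied to the class.

```python
def tag(*labels):
    """Hypothetical decorator factory: returns a class decorator."""
    def decorator(klass):
        klass.labels = labels
        return klass

    return decorator


@tag('fast', 'unit')
class Decorated(object):
    pass


class Manual(object):
    pass


# Equivalent to writing @tag('fast', 'unit') above the class.
Manual = tag('fast', 'unit')(Manual)

print(Decorated.labels == Manual.labels)  # True
```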

Naming the Test Functions

If you inspect the test class, even though we have methods stored under unique names based on the input (like test_alphabetical_iOS), we can't reference those method names when using nose because the functions themselves are all still just named test_input:

ipdb> pp IsAlphabetizedTestCase.test_alphabetical_iOS
<unbound method IsAlphabetizedTestCase.test_input>

To fix this, we need to set the __name__ attribute on the function when generating it. This has the extra benefit that we no longer need to return a separate name for the assignment; we can just reuse the function's __name__:

def make_method(input):
    def test_input(self):
        self.assertTrue(is_alphabetized(input), u'{} was not considered alphabetized'.format(input))

    test_input.__name__ = 'test_alphabetical_{}'.format(input)
    return test_input


def add_methods(*inputs):
    """
    Take a TestCase and add a test method for each input
    """
    def decorator(klass):
        for input in inputs:
            test_input = make_method(input)
            setattr(klass, test_input.__name__, test_input)
        return klass

    return decorator
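
To see the effect of the rename in isolation, here's a small sketch (make_function and generic are made-up names). A nested function starts out with the name from its def statement, whatever attribute it is later stored under, until we overwrite __name__ ourselves:

```python
def make_function(name):
    def generic(self):
        return None

    # Remember the name the def statement gave the function.
    before = generic.__name__
    generic.__name__ = name
    return before, generic


before, func = make_function('test_alphabetical_iOS')
print(before)         # generic
print(func.__name__)  # test_alphabetical_iOS
```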

Now with the function named accurately, we can specify a single test method to run:

nosetests example/test_alphabetical.py:IsAlphabetizedTestCase.test_alphabetical_iOS
F
======================================================================
FAIL: test_alphabetical_iOS (example.test_alphabetical.IsAlphabetizedTestCase)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "example/test_alphabetical.py", line 8, in test_input
    self.assertTrue(is_alphabetized(input), u'{} was not considered alphabetized'.format(input))
AssertionError: iOS was not considered alphabetized

----------------------------------------------------------------------
Ran 1 test in 0.001s

FAILED (failures=1)

Generalize the Test Method Factory

So our class decorator works well for our single test method generator. But what if we have multiple test methods we want to generate? Remember our initial example also included expected False results? Let's make it so we can pass in the test method:

from .alphabetical import is_alphabetized

import unittest


def assert_is_alphabetized(self, input):
    self.assertTrue(is_alphabetized(input), u'{} was not considered alphabetized'.format(input))


def assert_is_not_alphabetized(self, input):
    self.assertFalse(is_alphabetized(input), u'{} was considered alphabetized'.format(input))


def make_method(func, input):

    def test_input(self):
        func(self, input)

    test_input.__name__ = 'test_{func}_{input}'.format(func=func.__name__, input=input)
    return test_input


def generate(func, *inputs):
    """
    Take a TestCase and add a test method for each input
    """
    def decorator(klass):
        for input in inputs:
            test_input = make_method(func, input)
            setattr(klass, test_input.__name__, test_input)
        return klass

    return decorator


@generate(assert_is_alphabetized, '', 'a', 'aaaaa', 'ab', 'abcd', 'A-Cert', 'iOS')
@generate(assert_is_not_alphabetized, 'ba', 'aba', 'bob')
class IsAlphabetizedTestCase(unittest.TestCase):
    pass

Now we can run all of the generated tests:

======================================================================
FAIL: test_assert_is_alphabetized_A-Cert (example.test_alphabetical.IsAlphabetizedTestCase)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "example/test_alphabetical.py", line 17, in test_input
    func(self, input)
  File "example/test_alphabetical.py", line 7, in assert_is_alphabetized
    self.assertTrue(is_alphabetized(input), u'{} was not considered alphabetized'.format(input))
AssertionError: A-Cert was not considered alphabetized

======================================================================
FAIL: test_assert_is_alphabetized_iOS (example.test_alphabetical.IsAlphabetizedTestCase)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "example/test_alphabetical.py", line 17, in test_input
    func(self, input)
  File "example/test_alphabetical.py", line 7, in assert_is_alphabetized
    self.assertTrue(is_alphabetized(input), u'{} was not considered alphabetized'.format(input))
AssertionError: iOS was not considered alphabetized

----------------------------------------------------------------------
Ran 10 tests in 0.001s

FAILED (failures=2)
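
One thing the output shows is that the generated names are built straight from the input, so 'A-Cert' gives us a method name containing a hyphen, and the empty string gives a name ending in an underscore. setattr is happy with that, but such names aren't valid Python identifiers. If that bothers you, one option is to clean up the input before building the name. The helper below is a hypothetical sketch, not part of the code above:

```python
import re


def safe_test_name(prefix, value):
    """Hypothetical helper: build an identifier-safe test method name."""
    # Replace every run of non-word characters with a single underscore.
    slug = re.sub(r'\W+', '_', str(value)).strip('_')
    # Fall back to a readable placeholder for empty input.
    return 'test_{}_{}'.format(prefix, slug or 'empty')


names = [safe_test_name('alphabetized', v) for v in ['', 'A-Cert', 'iOS']]
print(names)
# ['test_alphabetized_empty', 'test_alphabetized_A_Cert', 'test_alphabetized_iOS']
```

Keep in mind that two different inputs (say 'A-Cert' and 'A_Cert') would now map to the same name and clobber each other, so a real version would need to guard against collisions.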

Conclusion

Now with these handy generated test methods, we can make the necessary fixes to our is_alphabetized function:

import re

def is_alphabetized(value):
    value = value.lower()
    value = re.sub(r'[^a-z]', '', value)
    return ''.join(sorted(value)) == value

Running the tests one more time:

----------------------------------------------------------------------
Ran 10 tests in 0.001s

OK

Ah, so satisfying....
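
For a quick sanity check outside the test runner, the fixed function can be exercised directly against the same two lists of inputs used in the decorators. This is just a sketch that repeats the function so it runs on its own:

```python
import re


def is_alphabetized(value):
    value = value.lower()
    # Drop everything that is not a lowercase ASCII letter.
    value = re.sub(r'[^a-z]', '', value)
    return ''.join(sorted(value)) == value


expected_true = ['', 'a', 'aaaaa', 'ab', 'abcd', 'A-Cert', 'iOS']
expected_false = ['ba', 'aba', 'bob']

print(all(is_alphabetized(v) for v in expected_true))   # True
print(any(is_alphabetized(v) for v in expected_false))  # False
```

Note that this version silently ignores anything outside a-z, so digits and accented letters don't affect the result.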

