Development

Integration and Simulation Tests in Python

One of my (many) tasks lately has been to rework unit and integration tests for Review Bot, our automated code review add-on for Review Board.

The challenge was providing a test suite that could test against real-world tools, but not require them. An ever-increasing list of compatible tools has threatened to become an ever-increasing burden on contributors. We wanted to solve that.

So here’s how we’re doing it.

First off, unit test tooling

This is all Python code, which you can find in the Review Bot repository on GitHub.

We make heavy use of kgb, a package we’ve written to add function spies to Python unit tests. This goes far beyond Mock, allowing nearly any function to be spied on without having to be replaced. This module is a key component of our solution, given our codebase and our needs, but it’s an implementation detail — it isn’t a requirement for the overall approach.

Still, if you’re writing complex Python test suites, check out kgb.
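
To give a quick feel for it, here’s a tiny, self-contained sketch of the kind of spying kgb enables (the function being spied on is made up for illustration):

import kgb
from unittest import TestCase


def fetch_tool_version():
    # Pretend this shells out to a slow external tool.
    return 'unknown'


class SpyExampleTests(kgb.SpyAgency, TestCase):
    def test_fetch_tool_version(self):
        # Swap in a canned return value without replacing the function.
        self.spy_on(fetch_tool_version,
                    op=kgb.SpyOpReturn('MyTool 1.2.3'))

        self.assertEqual(fetch_tool_version(), 'MyTool 1.2.3')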

Deciding on the test strategy

Review Bot can talk to many command line tools, which are used to perform checks and audits on code. Some are harder than others to install, or at least annoying to install.

We decided there are two types of tests we need:

  1. Integration tests — run against real command line tools
  2. Simulation tests — run against simulated output/results that would normally come from a command line tool

Our goal is to ease contribution, but we can’t err so far in that direction that it comes at the expense of a reliable test suite.

We decided to make these the same tests.

The strategy, therefore, would be this:

  1. Each test would contain common logic for integration and simulation tests. A test would set up state, perform the tool run, and then check results.
  2. Integration tests would build upon this by checking dependencies and applying configuration before the test run.
  3. Simulation tests would be passed fake output or setup data needed to simulate that tool.

This would be done without any code duplication between integration and simulation tests. There would be only one test function per expectation (e.g., a successful result or the handling of an error). We don’t want to worry about tests getting out of sync.

Regression in our code? Both types of tests should catch it.

Regression or change in behavior in an integrated tool? Any fixes we apply would update or build upon the simulation.

Regression in the simulation? Something went wrong, and we caught it early without having to run the integration test.

Making this all happen

We introduced three core testing components:

  1. @integration_test() — a decorator that defines and provides dependencies and input for an integration test
  2. @simulation_test() — a decorator that defines and provides output and results for a simulation test
  3. ToolTestCaseMetaClass — a metaclass that ties it all together

Any test class that needs to run integration and simulation tests will use ToolTestCaseMetaClass and then apply the @integration_test and/or @simulation_test decorators to the appropriate test functions.

When a decorator is applied, the test function is opted into that type of test. Data can be passed into the decorator, which is then passed into the parent test class’s setup_integration_test() or setup_simulation_test().

These can do whatever they need to set up that particular type of test. For example:

  • Integration test setup defaults to checking dependencies, skipping the test if they aren’t met.
  • Simulation test setup may write some files or spy on a subprocess.Popen() call to fake output.


For example:

class MyTests(kgb.SpyAgency, TestCase,
              metaclass=ToolTestCaseMetaClass):
    def setup_simulation_test(self, output):
        self.spy_on(execute, op=kgb.SpyOpReturn(output))

    def setup_integration_test(self, exe_deps):
        if not are_deps_found(exe_deps):
            raise SkipTest('Missing one or more dependencies')

    @integration_test(exe_deps=['mytool'])
    @simulation_test(output=(
        b'MyTool 1.2.3\n'
        b'Scanning code...\n'
        b'0 errors, 0 warnings, 1 file(s) checked\n'
    ))
    def test_execute(self):
        """Testing MyTool.execute"""
        ...

When applied, ToolTestCaseMetaClass will loop through each of the test_*() functions that have these decorators and split them up:

  • Test functions with @integration_test will be split out into a test_integration_<name>() function, with an [integration test] suffix appended to the docstring.
  • Test functions with @simulation_test will be split out into test_simulation_<name>(), with a [simulation test] suffix appended.

The above code ends up being equivalent to:

class MyTests(kgb.SpyAgency, TestCase):
    def setup_simulation_test(self, output):
        self.spy_on(execute, op=kgb.SpyOpReturn(output))

    def setup_integration_test(self, exe_deps):
        if not are_deps_found(exe_deps):
            raise SkipTest('Missing one or more dependencies')

    def test_integration_execute(self):
        """Testing MyTool.execute [integration test]"""
        self.setup_integration_test(exe_deps=['mytool'])
        self._test_common_execute()

    def test_simulation_execute(self):
        """Testing MyTool.execute [simulation test]"""
        self.setup_simulation_test(output=(
            b'MyTool 1.2.3\n'
            b'Scanning code...\n'
            b'0 errors, 0 warnings, 1 file(s) checked\n'
        ))
        self._test_common_execute()

    def _test_common_execute(self):
        ...

Pretty similar, but less to maintain in the end, especially as tests pile up.

And when we run it, we get something like:

Testing MyTool.execute [integration test] ... ok
Testing MyTool.execute [simulation test] ... ok

...

Or, you know, with a horrible, messy error.

Iterating on tests

It’s become really easy to maintain and run these tests.

We can now start by writing the integration test, modifying the code to log any data that might be produced by the command line tool, and then fake-failing the test to see that output.

class MyTests(kgb.SpyAgency, TestCase,
              metaclass=ToolTestCaseMetaClass):
    ...

    @integration_test(exe_deps=['mytool'])
    def test_process_results(self):
        """Testing MyTool.process_results"""
        self.setup_files({
            'filename': 'test.c',
            'content': b'int main() {return "test";}\n',
        })

        tool = MyTool()
        payload = tool.run(files=['test.c'])

        # XXX
        print(repr(payload))

        results = MyTool().process_results(payload)

        self.assertEqual(results, {
            ...
        })

        # XXX Fake-fail the test
        assert False

I can run that and get the results I’ve printed:

======================================================================
ERROR: Testing MyTool.process_results [integration test]
----------------------------------------------------------------------
Traceback (most recent call last):
    ...
-------------------- >> begin captured stdout << ---------------------
{"errors": [{"code": 123, "column": 13, "filename": "test.c", "line': 1, "message": "Expected return type: int"}]}

Now that I have that, and I know it’s all working right, I can feed that output into the simulation test and clean things up:

class MyTests(kgb.SpyAgency, TestCase,
              metaclass=ToolTestCaseMetaClass):
    ...

    @integration_test(exe_deps=['mytool'])
    @simulation_test(output=json.dumps({
        'errors': [
            {
                'filename': 'test.c',
                'code': 123,
                'line': 1,
                'column': 13,
                'message': 'Expected return type: int',
            },
        ]
    }).encode('utf-8'))
    def test_process_results(self):
        """Testing MyTool.process_results"""
        self.setup_files({
            'filename': 'test.c',
            'content': b'int main() {return "test";}\n',
        })

        tool = MyTool()
        payload = tool.run(files=['test.c'])
        results = MyTool().process_results(payload)

        self.assertEqual(results, {
            ...
        })

Once it’s running correctly in both tests, our job is done.

From then on, anyone working on this code can simply run the test suite and make sure their change hasn’t broken any simulation tests. If it has, and it wasn’t intentional, they’ll have a great starting point for diagnosing their issue, without having to install anything.

Anything that passes simulation tests can be considered a valid contribution. We can then test against the real tools ourselves before landing a change.

Development is made simpler, and there’s far less worry about regressions.

Going forward

We’re planning to apply this same approach to both Review Board and RBTools. Both currently require contributors to install a handful of command line tools or optional Python modules to make sure they haven’t broken anything, and that’s a bottleneck.

In the future, we’re looking at making use of python-nose’s attrib plugin, tagging integration and simulation tests and making it trivially easy to run just the suites you want.
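
As a rough sketch of how that could look (not how the suite is wired up today; the attribute names here are just examples):

from nose.plugins.attrib import attr
from unittest import TestCase


class MyToolTests(TestCase):
    @attr('integration')
    def test_integration_execute(self):
        ...

    @attr('simulation')
    def test_simulation_execute(self):
        ...

With that in place, nosetests -a simulation would run just the simulation tests, and nosetests -a '!integration' would run everything except the integration tests.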

We’re also considering pulling out the metaclass and decorators into a small, reusable Python package, making it easy for others to make use of this pattern.


Weird bugs: Django, timezones, and importing from eggs

Every so often you hit a bug that makes you question your sanity. The past several days have been spent chasing one of the more confusing ones I’ve seen in a long time.

Review Board 1.7 added the ability to set the server-wide timezone. During development, we found problems using SSH with a non-default timezone. This only happened when updating os.environ['TZ'] to something other than our default of UTC. We’d see the SSH process (rbssh, our wrapper for SSH communication) break due to an EOF on stdin and stdout, and then we’d see the development server reload itself.

Odd.

Since this originated with a Subversion repository, I first suspected libsvn. I spent some time going through their code to see if a timezone update would break something. Perhaps timeout logic. That didn’t turn up anything interesting, but I couldn’t rule it out.

Other candidates for suspicion were rbssh itself, paramiko (the SSH library), Django, and the trickster god Loki. We just had too many moving pieces to know for sure.

So I wrote a little script to get in-between a calling process and another process and log all communication between them. I tested this with rbssh and with plain ol’ ssh. rbssh was the only one that broke. Strange, since it wasn’t doing anything obviously wrong, and it worked with the default timezone. Unless it was Paramiko somehow…
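
That script is long gone, but conceptually it looked something like this minimal sketch (the names and log format here are invented):

import subprocess
import sys
import threading


def relay(src, dst, log, name):
    """Copy bytes from src to dst, logging everything that passes through."""
    while True:
        data = src.read(1)

        if not data:
            break

        dst.write(data)
        dst.flush()
        log.write('%s: %r\n' % (name, data))
        log.flush()


def main():
    proc = subprocess.Popen(sys.argv[1:],
                            stdin=subprocess.PIPE,
                            stdout=subprocess.PIPE)
    log = open('stream.log', 'w')

    # Relay our stdin to the wrapped process in the background...
    threading.Thread(target=relay,
                     args=(sys.stdin.buffer, proc.stdin, log, 'stdin'),
                     daemon=True).start()

    # ...and relay its stdout back to ours in the foreground.
    relay(proc.stdout, sys.stdout.buffer, log, 'stdout')

    proc.wait()


if __name__ == '__main__':
    main()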

For the heck of it, I tried copying some of rbssh’s imports into this new script. Ah-ha! It dropped its streams when importing Paramiko, same as rbssh. Interesting. Time to dig into that code.

The base paramiko module imports a couple dozen other modules, so I started by narrowing it down, reducing imports until I found the common one that broke things. Well, that turned out to be a module that imported Crypto.Random. Replacing the paramiko import in my wrapper with Crypto.Random verified that that was the culprit.

Getting closer…

I rinsed and repeated with Crypto.Random, digging through the code and seeing what could have broken. Hmm, that code’s pretty straightforward, but there are some native libraries in there. Well, all this is in a .egg file (not an extracted .egg directory), making it hard to look through, so I extracted it and replaced it with a .egg directory.

Woah! The problem went away!

I glance at the clock. 3AM. I’m not sure I can trust what I’m seeing anymore. Crypto.Random breaks rbssh, but only when installed as a .egg file and not a .egg directory. That made no sense, but I figured I’d deal with it in the morning.

My dreams that night were filled with people wearing “stdin” and “stdout” labels on their foreheads, not at all getting along.

Today, I considered just ripping out timezone support. I didn’t know what else to do. Though, since I’m apparently a bit of a masochist, I decided to look into this just a little bit more. And finally struck gold.

With my Django development server running, I opened up a separate, plain Python shell. In it, I typed “import Crypto.Random”. And suddenly saw my development server reload.

How could that happen, I wondered. I tried it again. Same result. And then… lightbulb!

Django reloads the dev server when modules change. Crypto is a self-contained .egg file with native files that must be extracted and added to the module path. Causing Django to reload. Causing it to drop the spawned rbssh process. Causing the streams to disconnect. Ah-ha. This had to be it.

One last piece of the puzzle. The timezone change.

I quickly located Django’s autoreload code and pulled it up. Yep, it’s comparing modified timestamps. We had two processes with two different ideas of what the current timezone was (one UTC, one US/Pacific, in my case), meaning that when rbssh launched and imported Crypto, we’d get a bunch of files extracted with US/Pacific-based timestamps instead of UTC, triggering the autoreload.
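
As a small, standalone illustration of the mechanism (not taken from the actual debugging session): the same wall-clock timestamp converts to different epoch times depending on the process’s timezone, which is all it takes for mtime comparisons between two processes to disagree.

import os
import time

# A zip-style local timestamp: year, month, day, hour, minute, second,
# plus placeholder weekday/yearday fields and isdst=-1 ("figure it out").
local_timestamp = (2012, 7, 20, 12, 0, 0, 0, 0, -1)

os.environ['TZ'] = 'UTC'
time.tzset()  # Unix-only
utc_mtime = time.mktime(local_timestamp)

os.environ['TZ'] = 'US/Pacific'
time.tzset()
pacific_mtime = time.mktime(local_timestamp)

# The results differ by several hours, so a file extracted under one
# timezone looks "modified" to a process comparing mtimes under the other.
print(utc_mtime != pacific_mtime)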

Now that the world makes sense again, I can finally fix the problem!

All told, that was about 4 or 5 days of debugging. Certainly not the longest debugging session I’ve had, but easily one of the more confusing ones in a while. Yet in the end, it’s almost obvious.


Django Development with Djblets: Unrooting your URLs

Typically, Django sites are designed with the assumption that they’ll have a domain or subdomain to themselves. Oftentimes this is fine, but if you’re developing a web application designed for redistribution, sometimes you can’t make that assumption.

During development of Review Board, many of our users wanted the ability to install Review Board into a subdirectory of one of their domains, rather than a subdomain.

There are a few rules that are important when making your site relocatable:

  • Always use MEDIA_URL when referring to your media directory, so that people can customize where they put their media files.
  • Don’t hard-code URLs in templates. Use the {% url %} tag or get_absolute_url() when possible.

These solve some of the issues, but they don’t address relocating a Django site to a subdirectory.

Djblets fills in this gap by providing a special URL handler designed to act as a prefix for all of your project’s URLs. To make use of this, you need to modify your settings.py file as follows:

settings.py

TEMPLATE_CONTEXT_PROCESSORS = (
    ...
    'djblets.util.context_processors.siteRoot',
)

SITE_ROOT_URLCONF = 'yourproject.urls'
ROOT_URLCONF = 'djblets.util.rooturl'

SITE_ROOT = '/'

SITE_ROOT specifies where the site is located. By default, this should be “/”, but this can be changed to point to any path. For example, “/myproject/”. Note that it should always have a leading and a trailing slash.

The custom template context processor (djblets.util.context_processors.siteRoot) will make SITE_ROOT accessible to templates.

SITE_ROOT should be used in templates when you need to refer to URLs that aren’t designed to respect SITE_ROOT (such as User.get_absolute_url). Your own custom applications should always respect SITE_ROOT whenever providing a URL.
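
For example, a template might manually prefix a URL like this (a hypothetical snippet; the path is made up):

<a href="{{SITE_ROOT}}users/{{user.username}}/">{{user.username}}</a>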

ROOT_URLCONF is typically what you would set to point to your project’s URLconf. However, in this case, you’ll be pointing it to djblets.util.rooturl. This in turn will forward all URLs to your project’s handler, defined in SITE_ROOT_URLCONF.

This is all you need to have a fully relocatable Django site!

To sum up:

  1. Add djblets.util.context_processors.siteRoot to your TEMPLATE_CONTEXT_PROCESSORS.
  2. Set SITE_ROOT_URLCONF to your project’s URL handler.
  3. Set ROOT_URLCONF to 'djblets.util.rooturl'.
  4. Prefix any URLs with SITE_ROOT in your templates, unless the URL would already take SITE_ROOT into account.

This is functionality that will hopefully make its way into Django at some point. For now, you have an easy way of unrooting your Django project.


Django Development with Djblets: Custom Tag Helpers

I’m planning to cover all of what Djblets can do, but for now, let’s start simple with something most Django developers spend way too much time creating: custom tags.

Django’s nice enough to provide a @register.simple_tag decorator for creating very basic tags that don’t take a context but do take parameters. This is great, but what if you want more? Many Django applications use the same boilerplate time and time again to create their tags. Djblets makes this much easier.

Introducing @basictag and @blocktag.

@basictag

Ever wanted to use Django’s @register.simple_tag but needed access to the context? I’ve found far too many cases where this would be useful, but Django doesn’t make this easy. Your tag code would end up looking like this:

class MyTagNode(template.Node):
    def __init__(self, arg1, arg2):
        self.arg1 = arg1
        self.arg2 = arg2

    def render(self, context):
        arg1 = template.Variable(self.arg1).resolve(context)
        arg2 = template.Variable(self.arg2).resolve(context)

        return context['user']

@register.tag
def mytag(parser, token):
    bits = token.split_contents()
    return MyTagNode(bits[1], bits[2])

Do this a few times and it’s bound to drive you nuts. How about this instead?

from djblets.util.decorators import basictag

@register.tag
@basictag(takes_context=True)
def mytag(context, arg1, arg2):
    return context['user']

Far less code and increased readability. Hooray!

@blocktag

@blocktag aims to do the same thing @basictag does, but for block tags. A block tag is a tag that contains nested content, like {% spaceless %} or {% for %}. This usually requires even more boilerplate than the above code fragment, with the added complexity of having to grab the contents of the block.

We’ve condensed it down to this:

from djblets.util.decorators import blocktag

@register.tag
@blocktag
def blinkblock(context, nodelist, arg1, arg2):
    return "<blink>%s</blink>" % nodelist.render(context)

If you’ve built block tags in the past, you’ll appreciate how simple that was.
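
In a template, that tag would then wrap its content like any other block tag (assuming @blocktag follows the usual end<tagname> convention for the closing tag, and that the tag library name here is hypothetical):

{% load mytags %}

{% blinkblock "first" "second" %}
    Pay attention to this!
{% endblinkblock %}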


Django Development with Djblets

Django is an awesome development platform for web applications. With such features as database abstraction, template/view/URL separation, and built-in authentication with interchangeable backends, it’s made web development much more enjoyable.

We use Django in Review Board with much success. Over time, as we’ve come to develop new features, we realized that much of our codebase was useful outside of Review Board and, bit by bit, moved pieces into a library we call Djblets.

What does Djblets do?

A bit of everything, really. Any time we have useful functionality that isn’t tied to Review Board, we put it here.

Djblets’ feature list currently consists of:

  • Authentication improvements, making it easy to register and log in seamlessly in one step, handle password recovery, and more.
  • Flexible datagrids for displaying data in a paginated list with user-specific column customization, ordering and sorting.
  • Decorators to drastically simplify creation of simple and block template tags.
  • Caching functions for calling a function and caching the result if the data isn’t already in the cache, and a special URL pattern matcher that prevents caching of any contained URLs.
  • Unit testing utility classes.

And of course more little things here and there.

Downloading Djblets

Djblets is not a released app, but it’s pretty stable and well tested. You can check out a copy from our SVN repository, or automatically include it in your own repository through an svn:externals entry.

Djblets is licensed under the MIT license, making it usable in most projects.

Using Djblets

Over time I’ll be writing articles on using the many features of Djblets. See the other posts in the series, or dig around the Djblets source code.


Best Bugs

I was having a chat with my brother earlier about software bugs, and I started trying to remember the best bugs I’ve encountered in software I’ve had a hand in. Below is my list of favorite bugs that I found entertaining. I’m kind of curious as to what other people would choose for their favorites.

My favorite bug would have to be Gaim’s flying buddies. When the rewrite of the Gaim buddy list was committed, in 0.60, we had a fun little bug where the drag-and-drops weren’t completed. This triggered a Gtk bug (I think it was Gtk’s bug?) where the nodes in the GtkTreeView would fly around the screen a bit from point A (where the node originally was) to point B (where the node is now). When the buddy list was off-screen, this made it particularly fun. As these were flying around, we quickly named them Flying Buddies.

Between classes one semester, I wrote a Snake game for my TI-89 calculator. It was rather easy to do, but I had an off-by-one error that generated what I called “snake droppings.” When the snake ate one of the blocks on the screen, it would of course extend. As soon as the tail started moving again, it would leave a pixel or two behind, hence the name.

My third favorite bug was during the development of my BilliardZ game for the Sharp Zaurus SL-5x00 PDA. Occasionally when hitting a ball, a big black hole would open up on the board, and the ball would disappear into it. Playing pool with black holes littering the table is a little inconvenient. I’m still not sure what caused it, but I ended up fixing it.

Speaking of bugs, something messed up on gnome-blog. I’ve spent over an hour trying to get it working again. Works now though.

(Update: I somehow lost the top paragraph. I don’t know how, but it’s back. That should provide more context.)

