I’ve worked with a lot of engineering and product managers over the years, and the best ones always know how to talk and think about software testing. That’s because the testing strategies you employ and the tests you write are crucial to the quality of your software. Understanding how your software is tested can help you make better-informed decisions about project scheduling, prioritization, and where to allocate engineering resources.
In this article I walk through the core concepts behind modern software testing and offer some perspective on how we at Bit Complete think about building and maintaining effective testing systems.
What is software testing?
Software testing is the process of validating that software works the way you expect it to. The most visceral example is something that anybody can do: interact with the software and observe whether it behaves as expected, noting discrepancies. Modern engineering practices eschew this kind of manual testing for the most part because it is laborious and error-prone.
By contrast, automated tests are themselves pieces of software that exercise the software being tested and validate correctness by inspecting the results produced. Nowadays, software engineers generally prefer to automate tests as much as possible. There are a few reasons for this:
- Automated tests eliminate sources of error and bias in judging results
- Tests need only be written once and then pay dividends over time
- Automated tests quickly identify problems in code during development
The last point above is particularly important as it changes the way that developers write code. When a suite of tests can quickly validate that changes to the code don’t break existing features, it becomes more efficient to produce many small changes than to make large changes and validate them in aggregate (e.g. through manual testing). This approach, known as continuous integration (CI), makes it possible to manage the quality of software that is evolving rapidly, the way many internet applications do today.
Unit tests
The most basic kind of automated test is the unit test. Unit tests do not run the whole application; they run the small pieces of code that compose it: the “units” of the application. Unit tests generally run very quickly and validate simple things about the code. For example, in a calculator application, there might be a unit test for each mathematical operation that the calculator can perform.
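To make the calculator example concrete, here is a minimal sketch using Python’s built-in unittest module. The add and divide functions are hypothetical stand-ins for the calculator’s operations, not code from a real project.

```python
import unittest


# Hypothetical calculator operations, defined here only for illustration.
def add(a, b):
    return a + b


def divide(a, b):
    if b == 0:
        raise ZeroDivisionError("cannot divide by zero")
    return a / b


class CalculatorTests(unittest.TestCase):
    """One small, fast test per behavior of each operation."""

    def test_add(self):
        self.assertEqual(add(2, 3), 5)

    def test_divide(self):
        self.assertEqual(divide(10, 4), 2.5)

    def test_divide_by_zero_is_rejected(self):
        with self.assertRaises(ZeroDivisionError):
            divide(1, 0)
```

Saved in a file, these tests can be run with `python -m unittest`. Each test exercises one function in isolation, which is why tests like this tend to run in milliseconds.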
Generally unit tests are run as part of local development on the software developer’s machine, enabling the iterative workflow shown below:
By continuously running unit tests locally, the developer has some assurance that they are not breaking existing functionality as they make changes. This cycle terminates once the developer is ready to merge their change into the mainline of development.
Although the developer workflow described above is adequate for a single developer working on a simple project, it quickly becomes problematic for larger teams and projects. There are three reasons for this:
- Developer machines are often configured differently from one another, and from the production environment
- It’s often prohibitively slow to run the entire suite of unit tests for a large piece of software locally
- Different developers often find themselves working in the same area of the codebase, and so while their changes may work as expected independently, the combined code may be broken
For these reasons, a more comprehensive CI system supports uploading changes to a central repository in the form of a review or pull request. In that state, the full suite of tests can be run in a faster, more consistent manner as depicted below:
Generally these systems also provide various ways to serialize code merges to avoid bad combinations of code.
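One way to picture serialized merges is as a queue: each change is tested on top of everything already merged, rather than on top of the code the developer started from. The sketch below is a simplified illustration under that assumption. The function names and the example changes are hypothetical, and real CI systems implement this in their own ways.

```python
from typing import Callable, List, Tuple


def process_merge_queue(
    mainline: List[str],
    queue: List[str],
    tests_pass: Callable[[List[str]], bool],
) -> Tuple[List[str], List[str]]:
    """Merge queued changes one at a time, testing each against the current mainline."""
    merged = list(mainline)
    rejected = []
    for change in queue:
        candidate = merged + [change]
        if tests_pass(candidate):
            merged = candidate  # the change is safe on top of what is already merged
        else:
            rejected.append(change)  # sent back to its author
    return merged, rejected


# Hypothetical stand-in for running the full test suite: these two changes
# each work on their own, but the combination is broken.
def combined_tests_pass(changes: List[str]) -> bool:
    return not {"rename_helper", "call_old_helper"} <= set(changes)


merged, rejected = process_merge_queue(
    ["base"], ["rename_helper", "call_old_helper"], combined_tests_pass
)
print(merged)    # ['base', 'rename_helper']
print(rejected)  # ['call_old_helper']
```

Because the second change is tested against a mainline that already contains the first, the bad combination is caught before it is merged.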
Integration tests
Unit testing enables you to validate many of the components that make up your software. However, when all of those components come together to make up the high-level functionality that users see, there is still no guarantee that everything will work as expected. Automated tests that validate this kind of behavior are generally not considered unit tests because they depend on the interplay of a number of systems (e.g. database, backend services, etc.) and are therefore not a “unit”.
These kinds of high-level tests are often called integration tests because they not only test some high-level code, but they also implicitly test the integrations between different systems. One variation on the integration test is the end-to-end (E2E) test, which validates the entire system by automating user behavior.
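As a rough illustration, the sketch below tests a hypothetical transfer function together with a database, so a failure could come from the application logic, the SQL, or the interaction between them. An in-memory SQLite database stands in for a real database here to keep the example self-contained. All table and function names are assumptions made for the example.

```python
import sqlite3
import unittest


# Hypothetical data-access and service code, defined here only for illustration.
def create_schema(conn):
    conn.execute(
        "CREATE TABLE accounts (name TEXT PRIMARY KEY, balance INTEGER NOT NULL)"
    )


def open_account(conn, name, balance):
    conn.execute(
        "INSERT INTO accounts (name, balance) VALUES (?, ?)", (name, balance)
    )


def get_balance(conn, name):
    row = conn.execute(
        "SELECT balance FROM accounts WHERE name = ?", (name,)
    ).fetchone()
    return row[0]


def transfer(conn, source, target, amount):
    # "with conn" commits on success and rolls back if an exception escapes.
    with conn:
        if get_balance(conn, source) < amount:
            raise ValueError("insufficient funds")
        conn.execute(
            "UPDATE accounts SET balance = balance - ? WHERE name = ?",
            (amount, source),
        )
        conn.execute(
            "UPDATE accounts SET balance = balance + ? WHERE name = ?",
            (amount, target),
        )


class TransferIntegrationTest(unittest.TestCase):
    def setUp(self):
        # Each test gets a fresh database with known starting data.
        self.conn = sqlite3.connect(":memory:")
        create_schema(self.conn)
        open_account(self.conn, "alice", 100)
        open_account(self.conn, "bob", 0)
        self.conn.commit()

    def tearDown(self):
        self.conn.close()

    def test_transfer_moves_money(self):
        transfer(self.conn, "alice", "bob", 30)
        self.assertEqual(get_balance(self.conn, "alice"), 70)
        self.assertEqual(get_balance(self.conn, "bob"), 30)

    def test_failed_transfer_leaves_balances_unchanged(self):
        with self.assertRaises(ValueError):
            transfer(self.conn, "alice", "bob", 500)
        self.assertEqual(get_balance(self.conn, "alice"), 100)
        self.assertEqual(get_balance(self.conn, "bob"), 0)
```

A production integration test would typically talk to a real database server and other backend services over the network, which is where much of the added power, slowness, and unreliability comes from.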
Unlike unit tests, integration tests tend to be somewhat flaky because there are many more failure modes that can affect the results. For this reason, and because integration tests also tend to be much slower than unit tests, it doesn’t make sense to run integration tests locally, or even as part of the continuous integration cycle. Instead, integration tests are often treated more as a smoke test during the deploy process as shown here:
Here we see integration tests gating the deployment of our software. Since integration tests are somewhat flaky, it’s often necessary to double check the results, and if the failure seems legitimate, to revert the offending code change.
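One possible way to automate part of that double check is to rerun a failing check a few times and block the deploy only if it fails every time. The sketch below assumes each integration test can be wrapped as a function returning True or False. The names are hypothetical, and the flaky check is simulated so the example is deterministic.

```python
from typing import Callable, Dict, List


def passes_with_retries(check: Callable[[], bool], attempts: int = 3) -> bool:
    """Return True as soon as the check passes, or False if every attempt fails."""
    for _ in range(attempts):
        if check():
            return True
    return False


def gate_deployment(
    checks: Dict[str, Callable[[], bool]], attempts: int = 3
) -> List[str]:
    """Return the names of checks that failed on every attempt.

    An empty list means the deploy may proceed.
    """
    return [
        name
        for name, check in checks.items()
        if not passes_with_retries(check, attempts)
    ]


def make_flaky_check(failures_before_success: int) -> Callable[[], bool]:
    """Simulate a flaky test: it fails a fixed number of times, then passes."""
    remaining = [failures_before_success]

    def check() -> bool:
        if remaining[0] > 0:
            remaining[0] -= 1
            return False
        return True

    return check


blocking = gate_deployment(
    {
        "login_flow": make_flaky_check(1),  # fails once, then passes
        "checkout_flow": lambda: False,     # fails consistently
    }
)
print(blocking)  # ['checkout_flow']
```

Automatic reruns are a tradeoff: they reduce noise from transient failures, but they can also hide real intermittent bugs, so consistently failing checks still need a human to decide whether to revert.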
You could be forgiven for thinking that integration and end-to-end tests sound like more trouble than they’re worth. However, they can also be very powerful, as they can detect high-level regressions that other tests cannot. Often they can identify issues that could only otherwise be uncovered through manual testing. Nonetheless, they should be used sparingly, testing your most critical application flows.
The Testing Pyramid
We’re starting to see a pattern emerge here. Low-level unit tests tend to be:
- Simple
- Reliable
- Fast
Whereas the higher-level integration and end-to-end tests tend to be:
- Powerful
- Complex
- Flaky
- Slow
So what’s the right mix of tests for your software project? A good rule of thumb is to think of your tests as composing a pyramid. The sturdy, reliable unit tests are plentiful, and make up the foundation.
As you ascend the pyramid, there are fewer tests and they run less frequently, but they are capable of detecting functional issues with your software that are not detectable further down.
Putting it all together
In this article I’ve outlined the most common kinds of tests you will need to ensure the quality of your software on an ongoing basis: unit, integration, and end-to-end tests. I’ve also described a model for thinking about the tradeoffs between the different kinds of tests, and how to balance them: a pyramid with reliable, plentiful unit tests on the bottom, and powerful but flakier integration tests on top.
To get the most out of your test suite you’ll also need a system for continuous integration and deployment that properly leverages this test suite. I shared one simple design above.
If you feel like you’re not getting the most out of your test suite or continuous integration/deployment system, Bit Complete can help. We have years of experience maintaining large testing and deployment systems, and we’re happy to help build or improve yours.