How to Extend CI Pipelines with Continuous Performance Testing
In previous posts we've talked about why Continuous Performance Testing (CPT) must be an integral part of managing the user experience. We've also discussed some of the reasons CPT is hard to implement. Now it's time to get into the nitty-gritty of making CPT part of your CI pipeline. As you will quickly discover, this isn’t rocket science once you understand the goals and the process.
The standard pipeline for application development can be logically split into two main sections:
1. A continuous CI loop that starts with every commit and is progressively applied to the most stable builds, and
2. Release-level testing and sign-off, which is executed against the release candidate before it goes into production.
A pre-release sign-off test happens infrequently and is heavily manual in nature. It involves deployment of a full test environment that typically includes various end-to-end (E2E) user scenarios combined with load, stress, and stability testing. This is a highly skilled job performed by the application engineers, and it is both exploratory and experimental in nature. They must set up the environment, configure tools to measure performance and utilization, run tests, measure results, analyze the findings, modify scenarios, adjust the infrastructure, reconfigure their tools, and try again.
Any performance issues found at this stage result in a tough dilemma for the release management team: Should they let the release through with known performance issues, or send the release back to engineering and miss the ship date?
Both options are far from ideal. It’s far more efficient to evaluate the performance characteristics of each build during active development, at the same time other quality issues are discovered and corrected. Getting this done right — cost-effectively and automatically — is the central idea behind CPT.
This can be done by taking the performance testing functions that can be automated and moving them into the CI pipeline.
Effectively, we apply the principles of multi-stage regression test pipelines to performance testing. This is typically done by creating a series of rather simple tests that check different aspects of the system against performance objectives in a selected set of scenarios, and inserting these tests into the appropriate phases of the CI pipeline.
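To make the idea of a simple test that checks an operation against a performance objective concrete, here is a minimal Python sketch. It is an illustration only: the function names, the choice of the median as the statistic, and the repeat count are assumptions, not something prescribed by any particular CI tool.

```python
import statistics
import time
from typing import Callable


def measure_ms(operation: Callable[[], object], repeats: int = 5) -> list[float]:
    """Run the operation several times and return each duration in milliseconds."""
    durations = []
    for _ in range(repeats):
        start = time.perf_counter()
        operation()
        durations.append((time.perf_counter() - start) * 1000.0)
    return durations


def meets_objective(durations_ms: list[float], budget_ms: float) -> bool:
    """Pass when the median duration stays within the budget.

    Using the median is an assumption made for this sketch; a team might
    prefer a high percentile or a comparison against a previous build.
    """
    return statistics.median(durations_ms) <= budget_ms
```

A CI step could call `measure_ms` on a real API request, pass the samples to `meets_objective`, and fail the build when the result is `False`.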
While not all types of performance testing can be reduced to the types of tests suitable for CI pipelines, here are a few common ones that can:
- Performance smoke testing: Applied to an isolated service via an API in a representative configuration, this test measures the response times of certain critical performance-sensitive operations — for example search queries, data exports or product updates. The environment configuration, test data, and operations are simplified to produce a quick, cheap, and effective pass/fail test that captures gross performance trends with respect to common performance-sensitive operations for each build. Simplicity and fast response times allow us to put these tests in the category of smoke tests.
- Performance regression testing: These tests measure simulated performance characteristics of typical user scenarios, such as placing an item in a shopping bag, placing a product order, or browsing a catalog. These tests are performed exclusively via APIs and exclude any UI operations. They aim to measure the effective performance characteristics of common business workflows and expose bottlenecks due to poor design or implementation. We implement each user scenario as a sequence of API calls and collect metrics for each scenario and each API call, focusing primarily on the performance of each step. Eliminating UI-related operations from the regression testing reduces the number of tests that need to be written, simplifies test maintenance between releases, and shortens test execution times.
- End-to-end (E2E) performance testing: Full user scenarios, including UI, backend, and all required external third-party services, configured in production-like environments with realistic data volumes. These are long-running, relatively “expensive” tests designed to realistically measure the user experience.
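The regression style described above, where a user scenario is a sequence of API calls with metrics collected per scenario and per call, might look roughly like the following Python sketch. The names `ScenarioResult`, `run_scenario`, and `slow_steps` are hypothetical, and each step is any zero-argument callable standing in for a real API request.

```python
import time
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class ScenarioResult:
    """Timings for one scenario: one entry per step, in execution order."""
    name: str
    step_ms: dict[str, float] = field(default_factory=dict)

    @property
    def total_ms(self) -> float:
        return sum(self.step_ms.values())


def run_scenario(name: str, steps: list[tuple[str, Callable[[], object]]]) -> ScenarioResult:
    """Execute the steps sequentially, timing each one separately."""
    result = ScenarioResult(name)
    for step_name, call in steps:
        start = time.perf_counter()
        call()
        result.step_ms[step_name] = (time.perf_counter() - start) * 1000.0
    return result


def slow_steps(result: ScenarioResult, budgets_ms: dict[str, float]) -> list[str]:
    """Return the names of steps that exceeded their budget (steps without a budget never fail)."""
    return [s for s, ms in result.step_ms.items() if ms > budgets_ms.get(s, float("inf"))]
```

Keeping a timing per step, rather than only a total, is what lets a failing build point at the specific call that regressed.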
All three types of tests lend themselves well to integration with different stages of the existing CI pipeline: performance smoke tests extend existing functional smoke tests; performance regression testing supplements other forms of regression testing; and performance E2E tests are best run side by side with integration testing.
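As a rough illustration of this mapping, the sketch below pairs each performance suite with the pipeline stage it extends. The stage and suite names are placeholders chosen for this example; substitute whatever your CI system actually calls them.

```python
# Hypothetical stage and suite names, used only for illustration.
PERF_SUITE_BY_STAGE = {
    "smoke": "performance_smoke",
    "regression": "performance_regression",
    "integration": "performance_e2e",
}


def suites_for_stage(stage: str, functional_suites: list[str]) -> list[str]:
    """Return the stage's functional suites plus its performance suite, if one is mapped."""
    perf = PERF_SUITE_BY_STAGE.get(stage)
    return functional_suites + ([perf] if perf else [])
```

Stages with no mapped performance suite are returned unchanged, so the existing pipeline keeps working while performance tests are added stage by stage.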
In our next post, we’ll dive deeper into the specifics of successfully extending existing CI pipelines with Continuous Performance Testing.
Victor Samoylov, Dmitry Latnikov, Mikhail Klokov