The Curiosity Blog

We Need to Talk About Test Data “Strategy”

Written by Huw Price | 14 October 2021 09:44:10 Z

For many organisations, test data “best practices” start and end with compliance. This reflects a tendency to focus on the problem immediately in front of us. “The business” or legislation has called for the removal of sensitive data from non-production environments, so that’s the fire that organisations strive to put out first.

Though typically necessary, removing sensitive data from non-production environments overlooks two of the biggest challenges associated with test data today. First, it does not help with the immense time that testers and developers spend waiting for, finding, and making test data. Second, it overlooks the impact low-variety production data has on overall test coverage. To solve all three test data challenges – speed, quality, and compliance – a new strategy is needed.

I’ll be joining Paul Hammersley of EPI-USE Labs to discuss how organisations can target all three of these test data challenges. You can sign up to watch on demand. This blog will highlight some of the test data pressures that we will help resolve on the live webinar, while indicating the solution that we will discuss then.

Test data: A problem that isn’t going away by itself

Many organisations today must re-think their strategy for test data “management”. Relying on a central team to anonymise and copy large production data sets will always be a game of catch-up. Meanwhile, it does nothing to improve the quality of the data for testing, nor does it reduce the time teams spend wading through large data sets or making missing combinations by hand.

[Figure: Some of the challenges associated with a typical test data strategy today.]

A range of factors have increased the demand for test data, adding to the urgency of a strategy re-think. These related trends have made it harder than ever for manual data provisioning to provide data of sufficient variety, at the speed demanded by parallel teams and frameworks. They include:

  1. “Agile”, DevOps and iterative delivery: With rapid, iterative development, changes and new release candidates arrive faster than ever. This demands continuous access to ever-changing data sets.
  2. Automated testing and CI/CD: Automated test execution has increased the volume and variety of tests being run, each requiring up-to-date data. Automated tests are also less forgiving than manual testers. If they are provisioned with inaccurate or inconsistent data, they simply fail, wasting time as those failures must be investigated.
  3. Parallelisation of teams and frameworks: Today, there are usually more teams and frameworks than ever trying to work in parallel. These parallel testers cannot rely on a limited number of production data copies. They need parallel data, as otherwise they use up or edit one another’s data.
  4. Parallelisation of tests: While executing faster than manual tests, automated tests are also capable of running in parallel. Often, two or more tests in a test suite will require similar data combinations. This increases the demand for data, as time-consuming test failures will mount if one test consumes another test’s data.
  5. System complexity and new technologies: As developers adopt new technologies and systems grow increasingly complex, it can be harder than ever to fulfil all the requisite dependencies in test environments. This is especially true when preparing integrated data manually. Data masking, for instance, must anonymise data consistently across a range of databases and files. Otherwise, integrated and end-to-end tests will fail.
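As a minimal sketch of what masking data “consistently” across sources can mean in practice, the Python below derives a stable pseudonym from each sensitive value using a keyed hash. It is an illustration under stated assumptions, not a description of any particular masking tool: the key, the field names, and the two sample sources are all hypothetical.

```python
import hashlib
import hmac

# Hypothetical secret; in practice this would come from a secrets store.
MASKING_KEY = b"example-key-not-for-production"


def mask_value(value: str, field: str) -> str:
    """Return a stable pseudonym for a value within a named field."""
    message = f"{field}:{value}".encode("utf-8")
    digest = hmac.new(MASKING_KEY, message, hashlib.sha256).hexdigest()
    return f"{field}_{digest[:12]}"


# Two hypothetical sources that reference the same customer.
crm_rows = [{"customer_id": "C-1001", "email": "jane@example.com"}]
billing_rows = [{"customer_id": "C-1001", "amount": 42.50}]

masked_crm = [
    {
        **row,
        "customer_id": mask_value(row["customer_id"], "customer_id"),
        "email": mask_value(row["email"], "email"),
    }
    for row in crm_rows
]
masked_billing = [
    {**row, "customer_id": mask_value(row["customer_id"], "customer_id")}
    for row in billing_rows
]

# The join key still matches after masking, so integrated tests can follow it.
assert masked_crm[0]["customer_id"] == masked_billing[0]["customer_id"]
```

Because the same input always produces the same output, a customer masked in one source still lines up with the same customer masked in another. A random replacement applied separately to each source would break that link.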

These related trends mean that today data of a greater variety is needed faster than ever before. They call for a modernisation of test data “best practices”, avoiding the significant bottlenecks that can arise in a world of rapid development, automated testing, and CI/CD.

Modernising test data

Test data practices today need to be brought into line with the “best practices” found across DevOps and CI/CD pipelines. “Provisioning” data must be automated and parallelised, as well as capable of responding to changing requests on the fly. Both automated and manual data requesters must further be capable of triggering the reusable processes on demand, easing the pressure on an overworked data provisioning team.

Fortunately, there are today many effective tools and techniques that address different problems associated with test data. They include data masking to support compliance, generation to boost data variety, and data cloning to make data available to parallel tests, testers, and environments. Database virtualisation has further minimised the time and costs associated with copying data, while data comparisons and analysis engines help testers and developers understand data.
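To make two of these techniques concrete, here is a small Python sketch of generation and cloning. It assumes a hypothetical “order” entity with three illustrative attributes, generates one record for every combination of their values, and then hands each test its own copy. The attribute names, values, and test names are invented for the example, and real tools will work quite differently under the hood.

```python
import copy
import itertools

# Hypothetical attributes of an order; the values are illustrative only.
ATTRIBUTES = {
    "customer_type": ["new", "returning"],
    "payment_method": ["card", "invoice", "voucher"],
    "basket_size": [0, 1, 50],
}


def generate_combinations(attributes):
    """Yield one record per combination of attribute values."""
    names = list(attributes)
    for values in itertools.product(*(attributes[n] for n in names)):
        yield dict(zip(names, values))


def clone_for(test_names, records):
    """Give each named test its own independent copy of the records."""
    return {name: copy.deepcopy(records) for name in test_names}


records = list(generate_combinations(ATTRIBUTES))
per_test = clone_for(["test_checkout", "test_refund"], records)

# Editing one test's copy leaves the other test's data untouched.
per_test["test_checkout"][0]["basket_size"] = 999
```

Exhaustive combination grows quickly as attributes are added, so this is only practical for small value sets. The point is simply that generated data can cover combinations that a production copy may never contain, while cloning stops parallel tests from editing one another’s records.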

You likely already have some of these solutions at your organisation, either built in-house or using commercial tools. The missing piece in many test data strategies is the process by which the different tools can be combined, reused, and made available on demand to manual and automated data requesters. Too often, responsibility is instead pushed back onto an over-worked provisioning team, who adjust and slowly run a set of linear processes for each data request.

A two-stage modernisation strategy for test data accordingly looks as follows:

[Figure: A two-stage modernisation strategy for test data]

In other words, a complete test data strategy must comprise all the technologies needed to create complete and compliant data in parallel and on demand. These techniques must furthermore be standardised and automated, while also being exposed to parallel teams, automation frameworks and CI/CD pipelines. Manual and automated data requesters must be capable of parameterising and triggering reusable test data processes on demand, receiving the data they need on the fly.
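One way to picture “parameterising and triggering reusable test data processes” is a small registry of named jobs that any requester can call with its own parameters. The Python below is a rough sketch under that assumption. The job name, its parameters, and the `request_data` entry point are all hypothetical and do not represent the interface of any specific product.

```python
import random
from typing import Callable, Dict, List

# Registry of reusable test data processes, keyed by a hypothetical job name.
JOBS: Dict[str, Callable[..., List[dict]]] = {}


def job(name: str):
    """Register a function as a reusable, parameterised data job."""
    def register(func):
        JOBS[name] = func
        return func
    return register


@job("customers")
def make_customers(count: int = 1, country: str = "GB", seed: int = 0):
    rng = random.Random(seed)  # seeded so repeated requests are reproducible
    return [
        {"id": f"CUST-{rng.randrange(10**6):06d}", "country": country}
        for _ in range(count)
    ]


def request_data(name: str, **params) -> List[dict]:
    """Entry point that a tester, framework, or pipeline step could call."""
    if name not in JOBS:
        raise KeyError(f"unknown test data job: {name}")
    return JOBS[name](**params)


rows = request_data("customers", count=3, country="DE", seed=7)
```

The same call could come from a tester’s script, an automation framework’s setup step, or a pipeline stage. That is the sense in which a process is defined once and then reused on demand, rather than re-run by hand for each request.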

Want to see this strategy in action?

I’ll be joining Paul Hammersley of EPI-USE Labs to explore how organisations can move from supporting test data compliance to implementing a modern test data strategy. To see how complete and compliant data can be made available on the fly, watch Testing across SAP and non-SAP systems: From test data compliance to continuous innovation.