The Curiosity Blog

5 Techniques for Overcoming Test Data Bottlenecks

Written by Mantas Dvareckas | 21 June 2022 08:19:20 Z

The demand for ever larger, more complex, and more varied data has created a situation where QA can no longer keep up. Burdened by outdated Test Data Management (TDM) practices and technical debt, agile teams are looking for new ways to overcome test data bottlenecks.

Modern TDM tools and techniques are also often absent from DevOps and CI/CD pipelines. While DevOps teams have focused on automating chunks of testing and development, this automation has frequently overlooked test data, hampering both speed and quality.

This blog will explore some of the key bottlenecks associated with outdated TDM practices, before providing an overview of 5 techniques for overcoming these common blockers. These techniques have been chosen to help you consider a new and transformative approach to test data.

This blog is part 2/4 in a series focusing on test data modernization. Check out the other three parts below:

  1. 5 Test Data Challenges That Every CTO Should Know About.
  2. 5 Solutions to Test Data Coverage Issues.
  3. 5 Ways to Keep Your Test Data Compliant.

Download our free Test Data-as-a-Service Solution Brief to learn how Test Data Automation can help you transform the relationship that your teams and frameworks share with data. The solution brief discusses how your organization can shift from slow and manual data “provisioning” to streaming rich test data in real-time!

The Persistence of Test Data Bottlenecks

Test data bottlenecks arise when teams and tests cannot gather the right data at the required speed and volume. Modern delivery frameworks and test execution automation have substantially increased the speed and variety of data requests, and manual test data provisioning can no longer keep up.

At some organisations, manual test data provisioning can take weeks or months, often longer than an iteration. Nor can it provide the variety or volume of data needed for in-sprint testing and development at the pace demanded by DevOps and CI/CD pipelines. In fact, 79% of testers still create test data manually with each run [1].

This persistence of slow and manual test data techniques comes at a time when the speed and complexity of development are growing. TDM needs to match the scale and rate of change, while catering to ever more complex systems and growing parallelisation.

Testing and development cannot rely on a limited number of production data copies; they need parallel, containerised data. They further require exact and accurately matched data sets, served on demand. Yet 49% of teams today are unable to manage the size and complexity of test data sets [1].

The test data bottlenecks that organisations face today have become a sinkhole for testers’ time, to the point where 44% of their time is spent searching for, managing and generating test data [2]. These bottlenecks can have a massive ripple effect across the whole SDLC. Fortunately, these test data problems can be solved through a range of modern test data automation techniques!

Accelerate Your Test Data

 

1. “Just in Time” Test Data Provisioning

For any organisation looking to upgrade its test data practices, moving beyond slow and manual data provisioning should be a priority. “Just in time” test data provisioning should be integrated into CI/CD and DevOps pipelines, matching their speed, automation, and flexibility.

Data provisioning must be rapid and capable of allocating data for every possible test. This includes new tests, which means integrating data generation into provisioning. Because testers and frameworks require data in parallel, provisioning should also integrate data cloning and allocation.

2. On-The-Fly Test Data Allocation

Test Data Allocation ensures that teams and tests in the same environment can work in parallel without interfering with one another. When data is allocated to a given test, it can be locked to prevent any other test from transacting against it. This avoids frustrating test failures caused by one test consuming or changing data that another test depends on. It is particularly valuable in agile environments or when running high volumes of tests.
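To make the locking idea concrete, here is a minimal Python sketch of an allocation pool. It is an illustration only, not how Curiosity’s Test Data Automation is implemented: the `DataPool` class, its `allocate` and `release` methods, and the sample records are all hypothetical names invented for this example.

```python
import threading
from typing import Dict, List, Optional


class DataPool:
    """Hypothetical in-memory pool that hands each record to one test at a time."""

    def __init__(self, records: List[dict]) -> None:
        self._records = records
        self._owner: Dict[int, str] = {}  # record index -> name of the test holding it
        self._guard = threading.Lock()    # protects the ownership table across threads

    def allocate(self, test_name: str, **criteria) -> Optional[dict]:
        """Lock and return the first free record matching the criteria, or None."""
        with self._guard:
            for index, record in enumerate(self._records):
                if index in self._owner:
                    continue  # already locked by another test
                if all(record.get(k) == v for k, v in criteria.items()):
                    self._owner[index] = test_name
                    return record
        return None

    def release(self, test_name: str) -> None:
        """Unlock every record held by the given test."""
        with self._guard:
            self._owner = {i: t for i, t in self._owner.items() if t != test_name}


# Illustrative data: two active customers shared by three tests.
pool = DataPool([
    {"id": 1, "status": "active"},
    {"id": 2, "status": "active"},
])
first = pool.allocate("test_checkout", status="active")
second = pool.allocate("test_refund", status="active")
third = pool.allocate("test_cancel", status="active")  # None: both records are locked
```

In this sketch a test that cannot be allocated data receives `None` rather than silently sharing a record, which is the point at which a real pipeline might queue the test or generate new data.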

 

3. Automated Find and Makes

Provisioning and allocating data “just in time” requires the ability to find and make data on-the-fly.

Automating find and makes eliminates the test and development time wasted finding, making, and waiting for data. “Finds” search automatically for the data combinations needed in testing and development, applying AI-based techniques like query parsing. Where a combination is missing, an integrated “make” creates it for the relevant test or development scenario. These integrated makes avoid test data bottlenecks in CI/CD, ensuring that rigorous tests run continuously with accurate data.
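The find-then-make pattern can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions: the data lives in a plain list, matching is by exact field equality rather than the query parsing mentioned above, and `find_or_make` and `make_customer` are hypothetical helpers invented for the example.

```python
from typing import Callable, Dict, List

Record = Dict[str, object]


def find_or_make(
    store: List[Record],
    criteria: Record,
    make: Callable[[Record], Record],
) -> Record:
    """Return a stored record matching the criteria, creating one if none exists."""
    for record in store:
        if all(record.get(k) == v for k, v in criteria.items()):
            return record        # "find": reuse data that already exists
    created = make(criteria)     # "make": generate the missing combination
    store.append(created)
    return created


def make_customer(criteria: Record) -> Record:
    """Hypothetical generator: fills any fields the criteria leave unset."""
    defaults: Record = {"country": "GB", "tier": "standard", "balance": 0}
    return {**defaults, **criteria}


# Illustrative data store holding a single customer.
customers: List[Record] = [{"country": "GB", "tier": "standard", "balance": 120}]

found = find_or_make(customers, {"country": "GB"}, make_customer)
made = find_or_make(customers, {"country": "DE", "tier": "gold"}, make_customer)
```

Here the first request is satisfied by existing data, while the second has no match and so triggers a make. Because the new record is added to the store, a later request for the same combination becomes a find.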

On-the-fly data “find and makes” from Curiosity’s Test Data Automation.

4. Database Virtualisation and On-Demand Orchestration

For test data to match the speed of development enabled by containerisation, parallelisation of teams, DevOps and automation, test data should itself be containerised and virtualised.

Virtualising databases allows testers, frameworks and CI/CD pipelines to spin up the databases they need in seconds, at a fraction of the cost of copying physical data. Integrating database virtualisation with on-demand orchestration further provides all the environments and databases needed for parallelised testing and development, spinning up and filling containerised databases on demand. As testers, frameworks, and developers stream the data they need, database orchestration spins up parallel and virtualised environments on-the-fly.

Data orchestration and virtualisation form part of a complete Test Data Automation toolkit, providing parallelised data sets on-the-fly.
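The isolation that on-demand databases give parallel tests can be illustrated with Python’s built-in `sqlite3` module. This is only a stand-in: it makes a full in-memory copy for each consumer, whereas database virtualisation tools typically avoid full copies. The `accounts` schema and the `build_template` and `spin_up_copy` helpers are assumptions made for this sketch.

```python
import sqlite3


def build_template() -> sqlite3.Connection:
    """Create a small template database; the schema and rows are illustrative only."""
    template = sqlite3.connect(":memory:")
    template.execute("CREATE TABLE accounts (id INTEGER PRIMARY KEY, balance INTEGER)")
    template.executemany("INSERT INTO accounts VALUES (?, ?)", [(1, 100), (2, 250)])
    template.commit()
    return template


def spin_up_copy(template: sqlite3.Connection) -> sqlite3.Connection:
    """Return a fresh, isolated in-memory database cloned from the template."""
    copy = sqlite3.connect(":memory:")
    template.backup(copy)  # copies the template's contents into the new database
    return copy


template = build_template()
db_a = spin_up_copy(template)  # database for one test or team
db_b = spin_up_copy(template)  # database for another, created independently

# A destructive change in one copy does not leak into the other.
db_a.execute("UPDATE accounts SET balance = 0 WHERE id = 1")
db_a.commit()
balance_a = db_a.execute("SELECT balance FROM accounts WHERE id = 1").fetchone()[0]
balance_b = db_b.execute("SELECT balance FROM accounts WHERE id = 1").fetchone()[0]
```

Each consumer works against its own copy, so the update made through `db_a` leaves both `db_b` and the template untouched.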

5. Synthetic Test Data

Finally, all of these techniques should be integrated with synthetic test data generation.

Synthetic test data can create missing combinations of test data needed in testing, so testers no longer create data manually or use sensitive production data. Synthetic test data fills gaps in test coverage, facilitating rigorous testing at speed.
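As a rough illustration of filling coverage gaps, the Python sketch below compares existing data against every combination of a few test dimensions and generates the ones that are missing. The dimension names, sample rows and the `missing_combinations` helper are hypothetical, and real synthetic data generation would also need to produce realistic values for every other field.

```python
from itertools import product
from typing import Dict, List


def missing_combinations(
    existing: List[Dict[str, str]],
    dimensions: Dict[str, List[str]],
) -> List[Dict[str, str]]:
    """Return one synthetic row for each dimension combination absent from the data."""
    keys = list(dimensions)
    seen = {tuple(row[k] for k in keys) for row in existing}
    return [
        dict(zip(keys, combo))
        for combo in product(*(dimensions[k] for k in keys))
        if combo not in seen
    ]


# Illustrative dimensions that the tests need to cover.
dimensions = {
    "account_type": ["personal", "business"],
    "currency": ["GBP", "EUR"],
}

# Illustrative existing data: only GBP accounts are present.
existing = [
    {"account_type": "personal", "currency": "GBP"},
    {"account_type": "business", "currency": "GBP"},
]

synthetic = missing_combinations(existing, dimensions)
```

In this example the existing data covers two of the four combinations, so the sketch generates the two EUR rows needed to complete coverage.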

Though synthetic test data can save time and enhance quality, just 30% of organisations use synthetic data during database development and testing [3]. This is a missed opportunity, because synthetic test data is a powerful tool for driving faster, more robust development. In combination with the other techniques set out in this article, it can form part of a complete Test Data Automation toolkit.

Curiosity’s Test Data Automation

The five techniques set out in this article combine to help organisations push beyond slow and manual test data provisioning. They facilitate a development environment where test data is available on demand, at all times, across the whole SDLC.

All five techniques are provided by Curiosity’s Test Data Automation. Test Data Automation helps testers and developers relieve test data bottlenecks by automatically preparing complete and compliant test data on demand. With database virtualisation and orchestration, this data further populates parallel environments on-the-fly.

With Test Data Automation, parallel teams and frameworks stream the data they need, when and where they need it. Data is furthermore anonymized or generated from scratch to simplify and support legislative compliance, while manual and automated requestors can self-provision the data they need on demand. With Test Data Automation, organisations can overcome test data bottlenecks, automating TDM to match the speed of CI/CD and DevOps pipelines.

Footnotes:

[1] Capgemini, Sogeti (2021), World Quality Report 2021-22. Retrieved from https://www.capgemini.com/gb-en/research/world-quality-report-wqr-2021-22/  

[2] Capgemini, Sogeti (2020), The Continuous Testing Report 2020. Retrieved from https://www.sogeti.com/explore/reports/continuous-testing-report-2020/

[3] Redgate (2021), The 2021 State of Database DevOps Report. Retrieved from https://www.red-gate.com/solutions/database-devops/report-2021