28 questions to ask yourself when picking a data generation tool


Data generation enables organisations to create data of the right variety, density, and volume for different testing and development scenarios, all while avoiding the compliance risks associated with using raw production data.

Creating data from scratch will be familiar to testers and developers, who often produce data when they find that the varied data they need is missing or unavailable. Today, further applications of data generation are emerging, such as the generation of training data sets for AI/ML.

Several factors might lead you to consider adopting a new data generation solution. These include evolving compliance requirements, growing data complexity, and an increasing demand for data across your organisation. In these instances, an automated, proven solution might promise to be more robust and scalable when compared to the ad hoc data creation that occurs manually during testing and development.

This article sets out 28 questions you should ask yourself when considering a new data generation solution. The questions aim to ensure that a data generation solution will be capable of creating fit-for-purpose data at scale. They aim to assess whether a solution will provide the data your teams need, without introducing unwanted complexity, bottlenecks, or overheads.

Extensibility and connectivity

The types of data used at an enterprise are constantly shifting, as teams and systems implement new technologies. This leads to a proliferation in interrelated data types, as legacy components sit alongside new technologies like cloud-based applications and Big Data systems.

Data generation must be capable of creating data consistently across these new and legacy technologies. It should be equipped with an easily extensible range of connectors, as well as a range of techniques for generating data across systems. Otherwise, generation will not scale over time, and will not be future-proofed as new technologies are adopted. You might then find yourself forced back to using copies of production data, because your data generation tool cannot generate the data you need.


Synthetic data generation must be capable of meeting a proliferation in interrelated data types needed in testing, development and CI/CD.

When considering approaches to data generation, ask yourself:

  1. Can the tool generate every type of data used at your organisation today?
  2. Does the data generation tool provide connectors for generating data into a wide range of systems?
  3. Can it generate data journeys consistently across different systems?
  4. Does it provide a quick, easy, and standardised methodology for adding new connectors as the types of data you use change?
  5. Can the data generation create data using a range of techniques, for instance going direct to a database, via a front-end, or via APIs, messages and files?
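
One way to picture a "standardised methodology for adding new connectors" is a small registry in which every connector has the same shape. The sketch below is illustrative only: the names `CONNECTORS`, `register`, and `publish` are hypothetical and do not come from any particular tool, and the two connectors simply serialise rows to CSV and JSON strings rather than writing to real systems.

```python
import csv
import io
import json
from typing import Callable, Dict, Iterable, List

Row = Dict[str, object]
# Assumption: a connector is any callable that turns rows into a payload string.
Connector = Callable[[Iterable[Row]], str]

# Hypothetical registry mapping a target name to its connector.
CONNECTORS: Dict[str, Connector] = {}


def register(name: str) -> Callable[[Connector], Connector]:
    """Decorator that adds a connector to the registry under `name`."""
    def wrap(fn: Connector) -> Connector:
        CONNECTORS[name] = fn
        return fn
    return wrap


@register("csv")
def to_csv(rows: Iterable[Row]) -> str:
    """Serialise rows as CSV, using the first row's keys as the header."""
    rows = list(rows)
    if not rows:
        return ""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=list(rows[0].keys()), lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buf.getvalue()


@register("json")
def to_json(rows: Iterable[Row]) -> str:
    """Serialise rows as a JSON array with sorted keys."""
    return json.dumps(list(rows), sort_keys=True)


def publish(rows: List[Row], targets: Iterable[str]) -> Dict[str, str]:
    """Send the same generated rows to every requested target."""
    return {name: CONNECTORS[name](rows) for name in targets}


if __name__ == "__main__":
    generated = [{"id": 1, "name": "Ada"}]
    for target, payload in publish(generated, ["csv", "json"]).items():
        print(target, repr(payload))
```

Under this assumption, supporting a new data type means writing one more decorated function, and the same generated rows reach every target in a consistent form.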

Ease-of-use and operational efficiency

A common concern with synthetic data generation is that defining generation for complex data will be too time-consuming and unreliable. A data generation tool should therefore be intuitive and easy to use, while maximising reusability to boost efficiency.

When considering synthetic data generation, ask yourself:

  1. Are data generation jobs, rules, and functions easily reusable and combinable?
  2. How quick, easy, and automated is data analysis, profiling, and modelling?
  3. Is data analysis easily reusable, for instance from central data dictionaries?
  4. How quick, easy, and intuitive is it to define data generation functions and rules for complex data?

Integration into DevOps and CI/CD toolchains

In addition to supporting the full range of data types at an organisation, data generation should integrate seamlessly into both manual and automated ways of working. A lack of integration risks creating dependencies and bottlenecks, as data generation will require manual intervention within otherwise automated pipelines.

Synthetic data generation should furthermore be easily combinable with other technologies and techniques for creating and manipulating data. Otherwise, disparate processes for creating and provisioning data will require manual configuration and alignment. Running these complex, interrelated processes manually in turn risks creating inconsistent and misaligned data sets, undermining effective testing and development.

When deciding on techniques for synthetic test data generation, ask yourself:

  1. Is the generation an open technology, capable of integrating with every tool, automated process, and manual process? This should include automated technologies like CI/CD pipelines and test automation frameworks, as well as the manual processes performed by teams at your organisation.
  2. Does generation integrate with all other requisite data management processes (such as masking, allocation, and subsetting)?

Data Automation vs Data Management

To ensure scalability and adoption, data generation should maximise automation and reusability, while minimising repetitive manual configuration.

Manual approaches and narrowly defined (non-event-driven) automation are typically limited in scope, requiring fresh configuration for each subtly different scenario. They further require rework and reconfiguration as data, scenarios, systems, and requests change. This continuous manual intervention is not sustainable, given the continuously increasing demand for data at enterprises today.


Data generation should form part of an automated service, capable of providing data for different requesters and scenarios on-the-fly.

When creating a strategy for synthetic data generation, ask yourself:

  1. Can data generation form part of a truly automated service, or does it remain within the test data management paradigm?
  2. Can generation be triggered on demand, meaning on-the-fly?
  3. Are previously configured data generation jobs flexible, reusable, and parameterisable?
  4. Can manual and automated data requesters parameterise and run the jobs on-the-fly?
  5. Is a pre-configured generation job capable of handling differences in data requests (parameters or inputs from the requestor)?
  6. Is data generation capable of handling changes over time in production data, environments, test scenarios, and more?
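
As a rough illustration of a pre-configured job that requesters can parameterise and run on-the-fly, consider the sketch below. Everything in it is an assumption made for the example: the `CustomerJob` class, its fields, and the `request_data` helper are hypothetical, and a real service would sit behind an API or pipeline step rather than a function call.

```python
import random
from dataclasses import dataclass, replace
from typing import Dict, List


@dataclass(frozen=True)
class CustomerJob:
    """Hypothetical reusable job: defaults are set once, requesters override them."""
    rows: int = 3
    region: str = "UK"
    seed: int = 0

    def run(self) -> List[Dict[str, object]]:
        rng = random.Random(self.seed)  # seeded so that a request is repeatable
        return [
            {
                "customer_id": i + 1,
                "region": self.region,
                "credit_limit": rng.randrange(500, 5001, 500),
            }
            for i in range(self.rows)
        ]


def request_data(job: CustomerJob, **params) -> List[Dict[str, object]]:
    """Run a pre-configured job on demand with per-request overrides."""
    return replace(job, **params).run()


if __name__ == "__main__":
    base_job = CustomerJob()
    # A manual tester and an automated pipeline can reuse the same job.
    print(len(request_data(base_job)))
    print(len(request_data(base_job, rows=10, region="US", seed=42)))
```

The point of the design is that the job is configured once and each request only supplies what differs, so a new scenario does not need fresh manual configuration.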

Data complexity

Data generation does not just need to create data for numerous interrelated data types; the logic that must be reflected in that data can also be immensely sophisticated.

This includes the logic and structure of common types of data, such as the varying logic and syntax for creating different types of Social Security Number. At the same time, accurate data for testing and development must reflect and respect system logic, for instance reflecting temporal trends and sequences.

When deciding on an approach for synthetic data generation, ask yourself:

  1. Can generation reflect complex logic, such as a series of events? For instance, if data is generated upstream, can the generated value easily feed into a function that simultaneously generates data into a downstream system?
  2. Is a comprehensive set of functions provided, including functions for generating sophisticated data like IBAN numbers?
  3. Can generation be set to generate data for different regions and geographies, reflecting the differences in syntax and data across states and countries?
  4. How easy is it to customise functions and add new ones?
  5. How easy is it to combine functions?
  6. Are consistent journeys easy to generate, for instance using visual techniques?
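
To make "sophisticated data like IBAN numbers" concrete, the sketch below generates and validates IBANs using the standard mod-97 check digit calculation. It is a minimal example rather than a full implementation: the function names are hypothetical, the validator does not check country-specific lengths or formats, and `random_gb_bban` produces a UK-shaped BBAN (four letters, a six-digit sort code, an eight-digit account number) whose bank code is random rather than a real bank.

```python
import random
import string


def _to_digits(text: str) -> str:
    """Map 0-9 to themselves and A-Z to 10-35, as the IBAN check requires."""
    return "".join(str(int(ch, 36)) for ch in text)


def iban_check_digits(country: str, bban: str) -> str:
    """Compute the two check digits for a country code and BBAN."""
    remainder = int(_to_digits(bban + country + "00")) % 97
    return f"{98 - remainder:02d}"


def make_iban(country: str, bban: str) -> str:
    """Assemble a full IBAN from a country code and BBAN."""
    return f"{country}{iban_check_digits(country, bban)}{bban}"


def is_valid_iban(iban: str) -> bool:
    """Mod-97 check only; lengths and national formats are not verified."""
    iban = iban.replace(" ", "").upper()
    return int(_to_digits(iban[4:] + iban[:4])) % 97 == 1


def random_gb_bban(rng: random.Random) -> str:
    """Hypothetical helper: a UK-shaped BBAN with a made-up bank code."""
    bank = "".join(rng.choice(string.ascii_uppercase) for _ in range(4))
    digits = "".join(rng.choice(string.digits) for _ in range(14))
    return bank + digits


if __name__ == "__main__":
    print(make_iban("GB", "WEST12345698765432"))  # GB82WEST12345698765432
    rng = random.Random(1)
    synthetic = make_iban("GB", random_gb_bban(rng))
    print(synthetic, is_valid_iban(synthetic))
```

Region-specific rules would then be a matter of adding one BBAN generator per country, with the check digit logic shared between them.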

Fit-for-purpose data

Today, data is required for a wide range of different purposes and scenarios. Testing and development require data of different volumes, density, and variety. Functional testing, for instance, might require high-variety data in lower volumes than the data needed for stress testing, while unit testing and smoke testing will require different data sets yet again.

Generating data of the wrong volume or contents will undermine the accuracy and rigour of testing and development, while introducing bottlenecks. Meanwhile, different data sets must be kept up-to-date, including aligning data for different versions of system components.


Data generation should be informed by granular analysis, such as data comparisons, coverage analysis, density analysis and kurtosis.

When deciding on an approach to synthetic data generation, ask yourself:

  1. Can data generation be informed by a range of different types of analysis and data modelling?
  2. Can data generation be based on coverage analysis, creating “covered” data sets?
  3. Can generation be based on comparisons between data and environments?
  4. Can generation be based on granular data analysis, such as density analysis and kurtosis?
  5. Can the analysis be re-run automatically, updating jobs as data and systems change?
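
Two of the analyses mentioned above can be sketched in a few lines. The example below is a simplified illustration, not a description of how any specific tool works: `coverage_gaps` lists the value combinations that an existing data set never exercises, which is one possible input for generating a "covered" data set, and `excess_kurtosis` computes the population excess kurtosis of a numeric column. The column names and domains are invented for the example.

```python
from itertools import product
from typing import Dict, List, Sequence, Tuple


def coverage_gaps(
    rows: Sequence[Dict[str, object]],
    domains: Dict[str, Sequence[object]],
) -> List[Tuple[object, ...]]:
    """Return value combinations from `domains` that no row contains."""
    columns = list(domains)
    seen = {tuple(row[c] for c in columns) for row in rows}
    return [
        combo
        for combo in product(*(domains[c] for c in columns))
        if combo not in seen
    ]


def excess_kurtosis(values: Sequence[float]) -> float:
    """Population excess kurtosis: fourth central moment / variance^2 - 3."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m4 = sum((v - mean) ** 4 for v in values) / n
    return m4 / (m2 ** 2) - 3.0


if __name__ == "__main__":
    existing = [
        {"plan": "basic", "status": "active"},
        {"plan": "pro", "status": "active"},
    ]
    domains = {"plan": ["basic", "pro"], "status": ["active", "closed"]}
    # Combinations to generate so that every plan/status pair is covered.
    print(coverage_gaps(existing, domains))
    print(round(excess_kurtosis([1, 2, 3, 4, 5]), 2))
```

Re-running functions like these whenever data or systems change is what would allow generation jobs to be updated automatically rather than by hand.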

Picking the right data generation solution

The types of data and integrated technologies used at enterprises continue to grow ever more complex, while the constant need for varied data is growing more demanding across teams and frameworks. Meanwhile, compliance and data privacy requirements continue to evolve.

Automatically generating fictitious data for a range of different testing and data scenarios offers to remove data bottlenecks, while avoiding the use of raw production data in non-production environments.

Any such solution must be scalable and robust, capable of quickly generating data that is equal in complexity to your systems and development needs. The 28 questions set out above are intended to help you identify such a solution.
