5 min read
Rich Jordan 08 February 2023 11:22:17 GMT
“We mustn’t use live data for testing”. This is the reason why most organizations start to look at superficial solutions to certain challenges that are ingrained in their DNA. For years, this aversion has driven the way that organizations have changed their “best” practices, struggling to wean themselves off deep-set habits.
These organizations often start with the low-hanging fruit, creating a capability to replace live data with either masked/obfuscated data or synthetic alternatives. They then believe that's "job done"! It isn't. It doesn't tackle or even reduce many of the core challenges associated with using production data in test, let alone the systemic problems that led the organization to test with production data in the first place.
Most teams do not take this narrow course of action by choice; using production data is typically born out of necessity. The live data being copied from the production system is considered the only way to understand the profile of our customers, the business rules we’ve lost control of, and the erroneous data that we know we’ve got in production but never had the time to remediate.
At the same time, the organization demands that (IT) change happens now. “Our customers demand feature x yesterday”, “we can’t get left behind by our competitors” – these are all very valid concerns.
Yet, when does the organization stop to take stock and address technical debt, assessing whether the merry-go-round of how they work and deliver change is compounding the risk of something going wrong?
In the case of data, that might mean data loss, incorrect data processing, or failed data/systems migrations. There have been a number of widely reported instances in the past few years of such problems occurring.
Should we really be surprised, if our system predates the EU GDPR being introduced, that it might not be compliant with that particular piece of regulation? Or, at the very least, that we'd need to do some work to demonstrate that it is?
In terms of GDPR compliance, ask yourself: Do you have an up-to-date and maintained data model? Do you track all of the data flows in your organization? Did you go back and retrospectively design and change your systems to ensure you could demonstrate "data security by design"?
More specifically, do you have a data dictionary or similar in your organization that both maps all of the data in your organization and classifies it against PII, PSI, or even PCI if you are going broader than GDPR?
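A data dictionary of this kind can start very simply: a catalogue that maps each field in each system to a sensitivity classification, making gaps in coverage visible. The sketch below is purely illustrative — the system, table, and column names are hypothetical, and a real dictionary would live in a governed catalogue rather than in code.

```python
# Hypothetical minimal data dictionary: each field is mapped to a
# sensitivity classification, so unclassified fields become visible.
from dataclasses import dataclass
from enum import Enum

class Sensitivity(Enum):
    PUBLIC = "public"
    PII = "pii"  # Personally Identifiable Information (GDPR scope)
    PCI = "pci"  # Payment card data (PCI DSS scope)

@dataclass(frozen=True)
class FieldEntry:
    system: str
    table: str
    column: str
    classification: Sensitivity

def unclassified_columns(observed, dictionary):
    """Return (system, table, column) tuples seen in production
    but missing from the data dictionary."""
    known = {(e.system, e.table, e.column) for e in dictionary}
    return sorted(set(observed) - known)

dictionary = [
    FieldEntry("crm", "customers", "email", Sensitivity.PII),
    FieldEntry("billing", "payments", "card_number", Sensitivity.PCI),
]
observed = [
    ("crm", "customers", "email"),
    ("crm", "customers", "date_of_birth"),  # present in prod, not yet classified
]
print(unclassified_columns(observed, dictionary))
# -> [('crm', 'customers', 'date_of_birth')]
```

The value of even a toy model like this is the gap report: every column that exists in production but has no classification is a compliance question waiting to be asked.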
All of these requirements are a massively tall order for change teams to grasp, let alone resolve, on top of the business demands of getting the next feature into the wild.
Data gets into every aspect of an organization and its working practices. When that data is live data, the challenges of control and subsequently removing that data must be seriously considered.
The "bow tie" diagram below is a technique used in risk management to visualize and articulate the complex, causal control weaknesses that probably exist in your organization.
This diagram indicates the types of prior causes and knock-on risks that are commonly associated with using live data. The final “cause” before the devastating non-compliance event lies with the decision to allow the use of live data in non-production environments.
Though often decided on a project-by-project basis, such decisions are typically thematic. They reflect a pattern of behavior in which management allow actions they perceive as inconsequential, creating a "slippery slope" in which risks pile up over time.
It then only takes one incremental risk to fail, causing a catastrophic event. A decision, in one project in one part of the organization, might have allowed the use of live data to expedite a project delivery. This decision was made to hit a date that wasn't really needed, while access controls were compromised as people moved transiently around projects.
Figure 1 – A bowtie diagram showing the systemic causes leading to a catastrophic “Event” of a data breach.
With loose access to various systems and a weak culture of retaining data management experts, ways of working and data movement become an organic ball of mud. This ball just gets bigger as time goes by. It will only stop if you recognize these sliding door events and take steps to address them, stopping them from overlapping.
So what can you do? The first step is to recognize the problem, perhaps using the bowtie above to identify these events happening in your organization. Seeing the potential impacts of risky decisions helps to crystallize their significance for leaders, especially if they are the material risk taker within your organization. This might include the Data Protection Officer (DPO), a role required by the GDPR for many organizations.
Next, understand the context of the specific challenge in your organization and why it occurs. This will allow you to get to the root of the problem. Remember that much of the time, where a team uses live data, the live data itself is a symptom of a deeper root cause.
Just as the individual problems converged over time to create your big ball of mud, fixing the problem must start by diverging the problems. You need to separate the challenges and work on solutions. Isolation is key and should be echoed through architecture best practices (like loose coupling) and controls (like RBAC).
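The RBAC control mentioned above can be sketched in a few lines. This is a minimal, assumed model (the user names, roles, and permission strings are invented for illustration): access to a data set is granted only through roles, never directly to individuals, so people moving transiently between projects cannot silently accumulate access.

```python
# Minimal RBAC sketch: permissions attach to roles, users hold roles.
# Role and permission names here are hypothetical examples.
ROLE_PERMISSIONS = {
    "test_engineer": {"read:masked_data"},
    "data_steward": {"read:masked_data", "read:live_data", "classify:data"},
}

USER_ROLES = {
    "asha": {"test_engineer"},
    "bram": {"data_steward"},
}

def is_allowed(user, permission):
    """A user holds a permission only via one of their assigned roles."""
    return any(
        permission in ROLE_PERMISSIONS.get(role, set())
        for role in USER_ROLES.get(user, set())
    )

print(is_allowed("asha", "read:live_data"))  # -> False
print(is_allowed("bram", "read:live_data"))  # -> True
```

The design point is the indirection: revoking the "data_steward" role from a person removes all of its permissions at once, which is exactly the kind of isolation the paragraph above argues for.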
When it comes to data and data flows, much of decoupling the problems comes down to analysis. Simply put, you have known unknowns, and need to perform analysis to surface them. This means becoming aware of the data within a system, its sensitivity classification, and whether it should be there at all. You must also know which systems that data flows to, and where that data is incorrect.
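One concrete form this analysis can take is scanning column values against patterns for sensitive data. The sketch below is deliberately simplistic — the two regex patterns are illustrative assumptions, and a production classifier would need far broader coverage (names, addresses, national identifiers) plus validation logic.

```python
import re

# Illustrative detection patterns only; real PII discovery needs
# far more coverage and checksum validation (e.g. Luhn for cards).
PII_PATTERNS = {
    "email": re.compile(r"[^@\s]+@[^@\s]+\.[^@\s]+"),
    "card16": re.compile(r"\b\d{16}\b"),
}

def scan_column(values):
    """Return the set of pattern names matched by any value in a column."""
    found = set()
    for value in values:
        for name, pattern in PII_PATTERNS.items():
            if pattern.search(str(value)):
                found.add(name)
    return found

column = ["alice@example.com", "n/a", "bob@example.org"]
print(scan_column(column))  # -> {'email'}
```

A scan like this turns "known unknowns" into a list of columns needing classification, which can then feed the data dictionary discussed earlier.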
This is not a problem that can be resolved in a day, and there is no AI silver bullet. It’s probably taken you years to build up this spiraling debt, and, just like anyone struggling with debt, you need to put a repayment plan in place.
Technology can help. You almost certainly have numerous untapped sources of information where a version of the truth exists. This might include production, out-of-date documentation and the numerous different understandings in different people’s heads. Data analysis tools, ML capabilities and modelling can start to build a picture of those disparate versions of the truth.
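Reconciling those disparate versions of the truth can start with something as simple as diffing what the documentation claims against what profiling of production actually observes. The column names below are hypothetical; the point is the shape of the comparison.

```python
# Sketch: reconcile two "versions of the truth" -- the documented
# schema and the schema actually observed by profiling production.
def schema_drift(documented, observed):
    """Report columns that exist in only one version of the truth."""
    doc, obs = set(documented), set(observed)
    return {
        "undocumented": sorted(obs - doc),  # in production, not in docs
        "missing": sorted(doc - obs),       # documented, never observed
    }

documented = ["customer_id", "email", "postcode"]
observed = ["customer_id", "email", "postcode", "legacy_flag"]
print(schema_drift(documented, observed))
# -> {'undocumented': ['legacy_flag'], 'missing': []}
```

Each item in the drift report is a conversation to have: either the documentation is out of date, or production contains data nobody can account for.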
As these pictures mature and grow, you converge on a single version of that truth, picturing how the system should actually work. You’ll be surprised that production doesn’t work how you think it does.
At this point, you have a living specification, or a Master Data Management system, for the organization. You now have a central point of control for data across your organization, even when your teams are federated and probably siloed. You are finally taking data seriously.
By this point, regulation isn't a hurdle; it's an accelerator. The structured, considered approach accelerates your ability to deliver – going slower to go faster really is true!
Figure 2 – A dedicated data capability for servicing federated development teams.
The fundamental principle of agile delivery (and DevOps) is small, iterative delivery. Facing into this data problem is in turn key if your organization is serious about going further with agile. Otherwise, you are probably doing agile in a silo, inevitably bumping into the big ball of mud that the rest of the organization is struggling with.
An interesting evolutionary journey is to culturally face into these challenges, using them as opportunities to turn your team or organization into one that is efficient and effective at learning. Data with context becomes knowledge that proliferates across team boundaries – much more than a means of scrubbing sensitive information from non-production environments.
Figure 3 – Data with context becomes knowledge and reduces the chaos across an organisation.
Learn more about the relationship between data, compliance and understanding in Rich's Test Data at the Enterprise video series.