Self-led training Module 3 - Synthetic Data Generation

Introduction

Curiosity Software's synthetic data generation capabilities allow you to create data for a database, either inserted directly or written to a file that is uploaded through another process. This training shows you how to set up the database synthetic data generation activity and how to configure the rule set ready for execution using multiple techniques, including our AI capabilities.

Each technique suits different scenarios, and the training briefly describes when each one is most useful. For instance, applying the default rules provides the synthetic data generation rules needed for most use cases, and these rules are customisable so that they can be adapted to the needs of the user.

Training overview:

This training course will take you through the journey of creating a database synthetic data generation activity using an existing connection and a profiled definition. We will also introduce you to the main techniques for populating a rule set and show you how to incorporate these into different execution methods such as test data pipelines, API calls and self-service forms.

Here is a visual model of the high-level process of creating a synthetic data generation rule set >

By the end of this self-led training, you will be able to:

Create a Database Synthetic Data Generation Activity
Create templates for synthetic data generation
Use Data Painter to populate rule sets
Incorporate the activity into a test data pipeline
Use pre and post actions

Pre-requisites for synthetic data generation training:

A Database Connection
A Data Definition
Completion of the ‘Definition’ section of Module 1

Before you begin:

Before starting, please download your Synthetic Data Generation training guide, which will act as your workbook for the training. The guide is split into two parts, available via the links below:

Part 1 - Sections 1 and 2 - Set-up and introduction to rule sets for synthetic data generation
Part 2 - Sections 3 and 4 - Methods of synthetic data generation and running the activity

Record your progress in the training form and submit it at the end of your training to ensure you receive your certificate of completion. The training form is here.

This training module should take approximately 1-1.5 days to complete. It is structured into 4 key sections, with exercises throughout the course based on the information you have just read. Complete the sections in the following order:

Section 1 - Set up the data generation activity - Includes Exercise 1
Section 2 - Generation rule set and accelerators - Includes Exercises 2 and 3
1. Synthetic data generation - Additional information
Section 3 - Methods of synthetic data generation
1. Method 1: Using defaults - Includes Exercise 4
2. Method 2: Using Data Painter - Includes Exercises 5-10
3. Method 3: Rule set accelerators
Section 4 - Execute the synthetic data generation activity - Includes Exercises 11 and 12

You can find the solutions to each exercise in the 'Exercise Solutions' page of this training course.

Proceed to Section 1 - Set up the data generation activity >

Synthetic data generation into databases, with Curiosity's Enterprise Test Data® Platform
