Synthetic data generation - Additional information

This section covers a few extra topics that relate to synthetic data generation: pre and post processes, foreign key rules, managing versions and managing data sources. There are no exercises in this section.

Pre and post processes 

You can view and edit the pre and post processes that are part of your generation activity. These let you run actions either before or after the generation routine, for instance to prepare the environment for generation or to kick off a stored procedure.

Each process can be an expression, a custom VIP flow or a link to another activity. When deciding whether to use a pre or post process, consider whether the step needs to run every time the generation routine is kicked off. If it does, use a pre or post process. If it does not, a test data pipeline is the better option.
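
The ordering described above can be sketched in a few lines of Python. This is only an illustration of the idea, not the product's API: run_generation, clear_staging_tables, generate_rows and call_stored_procedure are hypothetical names, and the steps simply write to a log instead of touching a real environment.

```python
from typing import Callable, List

Hook = Callable[[], None]


def run_generation(generate: Callable[[], int],
                   pre_processes: List[Hook],
                   post_processes: List[Hook]) -> int:
    """Run every pre process, then the generation routine, then every post process."""
    for hook in pre_processes:
        hook()
    rows = generate()
    for hook in post_processes:
        hook()
    return rows


# Hypothetical steps that record what they would do.
log: List[str] = []


def clear_staging_tables() -> None:
    log.append("pre: clear staging tables")


def generate_rows() -> int:
    log.append("generate: 100 rows")
    return 100


def call_stored_procedure() -> None:
    log.append("post: call stored procedure")


rows_created = run_generation(
    generate_rows,
    pre_processes=[clear_staging_tables],
    post_processes=[call_stored_procedure],
)
```

Because the hooks are attached to the routine itself, they run on every call, which is the case where a pre or post process fits better than a separate pipeline.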

Foreign key rules

You can view any active foreign key or soft key rules that can be used to generate the data. These rules are mainly detected through our profiling and discovery techniques, which are covered in the profiling and discovery training.

In this screen you can choose whether the relationship is active and used for generating data by clicking on the ‘Active’ toggle.

There is also an ‘Actions’ button in the top right-hand corner where you can activate or deactivate all the rules at once.
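
To illustrate what an active rule means for the generated data, here is a minimal Python sketch. It assumes a hypothetical ForeignKeyRule record and generate_orders function, neither of which is part of the product. When the rule is active, child values are drawn from existing parent keys; when it is inactive, the column is filled without regard to the parent table.

```python
import random
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class ForeignKeyRule:
    """Hypothetical stand-in for a foreign key or soft key rule."""
    child_column: str
    parent_column: str
    active: bool = True


def generate_orders(count: int,
                    customers: List[Dict[str, int]],
                    rule: ForeignKeyRule,
                    rng: random.Random) -> List[Dict[str, int]]:
    """Generate order rows, honouring the rule only when it is active."""
    parent_keys = [row[rule.parent_column] for row in customers]
    orders = []
    for order_id in range(1, count + 1):
        if rule.active:
            # Active rule: pick a key that exists in the parent table.
            value = rng.choice(parent_keys)
        else:
            # Inactive rule: the value is not tied to the parent table.
            value = rng.randint(1, 10_000)
        orders.append({"order_id": order_id, rule.child_column: value})
    return orders


customers = [{"customer_id": 101}, {"customer_id": 102}, {"customer_id": 103}]
rule = ForeignKeyRule(child_column="customer_id", parent_column="customer_id")
orders = generate_orders(5, customers, rule, random.Random(42))
```

Setting rule.active to False corresponds to switching the toggle off: the rows are still generated, but nothing guarantees that each customer_id exists in the parent table.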

Manage versions

On this screen you can manage the existing versions of the activity and create new ones. Further information can be found in the workplace fundamentals section.

  • New Version – Creates a new version of the ruleset
  • Clone – Creates a new version of the activity using the same assets
  • Upgrade – Creates a new version of the activity using new versions of the assets
  • Compare – Shows the differences between two different versions

You can change the version that the rule set screen displays by selecting a different version from the drop-down.

Data sources

You can also manage the data sources that the synthetic data generation draws on. This is useful for pulling reference data, such as product information from a master database, and for making sure that the generated data keeps its referential integrity across environments.

You can click the ‘+Add’ button to add another data source to the activity. You can also click the edit button to change a current data source. 
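
The benefit of a shared data source can be sketched as follows. This is a simplified illustration rather than how the product is implemented: an in-memory SQLite database stands in for the master database, and the products table, load_product_ids and generate_order_lines are all hypothetical names.

```python
import random
import sqlite3
from typing import Dict, List


def load_product_ids(conn: sqlite3.Connection) -> List[int]:
    """Read the product keys from the (hypothetical) master data source."""
    rows = conn.execute(
        "SELECT product_id FROM products ORDER BY product_id"
    ).fetchall()
    return [row[0] for row in rows]


def generate_order_lines(count: int,
                         product_ids: List[int],
                         seed: int) -> List[Dict[str, int]]:
    """Generate order lines whose product_id always comes from the data source."""
    rng = random.Random(seed)
    return [
        {
            "line_id": line_id,
            "product_id": rng.choice(product_ids),
            "quantity": rng.randint(1, 5),
        }
        for line_id in range(1, count + 1)
    ]


# An in-memory database stands in for the master database.
master = sqlite3.connect(":memory:")
master.execute("CREATE TABLE products (product_id INTEGER PRIMARY KEY, name TEXT)")
master.executemany(
    "INSERT INTO products VALUES (?, ?)",
    [(1, "Widget"), (2, "Gadget"), (3, "Gizmo")],
)

product_ids = load_product_ids(master)

# Data for two environments, both referencing the same master products.
dev_lines = generate_order_lines(5, product_ids, seed=1)
test_lines = generate_order_lines(5, product_ids, seed=2)
```

Both sets of rows differ in content, but every product_id in either environment resolves to a product in the same master source.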

Proceed to Section 3 - Methods of data generation >