2. Scan a database
In this section, you will learn how to use the Curiosity Platform to scan your database and store database metadata within a data catalogue. This section includes Exercise 2.
The process below scans the database and stores a version of its metadata within the Curiosity Platform’s catalogue. The scan collects data types, statistical properties of the data and more, all of which can be used in your data activities and test data pipelines.
- To scan the database, navigate to the 'Data Dictionary' and then to 'Databases' and click on the newly set-up database connection from Section 1.
- Click on Run Scan (VIP Server)
This opens a form asking you to select a process. Select the ‘Get Schema Metadata’ option. You can then also choose whether to scan tables and views.
When this job completes, you will have a scanned database to review. It will show schemas, tables and columns.
- To view the scan details, click on ‘Scan #1’.
If the database is updated, you can scan multiple times. You will then have multiple versions of scans.
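To illustrate why keeping multiple scan versions is useful, here is a minimal Python sketch that compares two scans. It is not the Curiosity Platform's own comparison logic: the `diff_scans` function and the `{table: {column: data_type}}` shape are assumptions made purely for this example.

```python
def diff_scans(old, new):
    """Compare two scans shaped as {table: {column: data_type}} (hypothetical shape)."""
    changes = {
        "added_tables": sorted(new.keys() - old.keys()),
        "removed_tables": sorted(old.keys() - new.keys()),
        "changed_columns": {},
    }
    # Only tables present in both scans can have column-level changes.
    for table in sorted(old.keys() & new.keys()):
        before, after = old[table], new[table]
        delta = {
            "added": sorted(after.keys() - before.keys()),
            "removed": sorted(before.keys() - after.keys()),
            "retyped": sorted(
                c for c in before.keys() & after.keys() if before[c] != after[c]
            ),
        }
        if any(delta.values()):
            changes["changed_columns"][table] = delta
    return changes


# Two made-up scan versions of the same database.
scan_1 = {"customers": {"id": "integer", "email": "text"}}
scan_2 = {
    "customers": {"id": "bigint", "email": "text", "phone": "text"},
    "orders": {"id": "integer"},
}
changes = diff_scans(scan_1, scan_2)
```

In this sketch, `changes` reports `orders` as a new table and, for `customers`, `phone` as an added column and `id` as a column whose type changed.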
The available schemas and some associated information will be presented.
In this case, we’d like to see the public schema in more detail.
- Click on ‘public’ to learn more.
The schema details with the column details, foreign keys and references are now displayed.
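For a sense of the kind of metadata a schema scan gathers, the sketch below collects columns, data types and foreign keys from a small in-memory SQLite database using Python's standard `sqlite3` module. This is an illustration only, not how the Curiosity Platform performs its scan: the `scan_schema` function, the example tables and the output shape are all assumptions.

```python
import sqlite3


def scan_schema(conn):
    """Return {table: {"columns": [...], "foreign_keys": [...]}} for a SQLite connection."""
    catalogue = {}
    tables = conn.execute(
        "SELECT name FROM sqlite_master "
        "WHERE type = 'table' AND name NOT LIKE 'sqlite_%' ORDER BY name"
    ).fetchall()
    for (table,) in tables:
        # PRAGMA table_info rows: (cid, name, type, notnull, dflt_value, pk)
        columns = [
            {
                "name": row[1],
                "type": row[2],
                "nullable": not row[3],
                "primary_key": bool(row[5]),
            }
            for row in conn.execute(f'PRAGMA table_info("{table}")')
        ]
        # PRAGMA foreign_key_list rows: (id, seq, table, from, to, on_update, on_delete, match)
        foreign_keys = [
            {"column": row[3], "references": f"{row[2]}.{row[4]}"}
            for row in conn.execute(f'PRAGMA foreign_key_list("{table}")')
        ]
        catalogue[table] = {"columns": columns, "foreign_keys": foreign_keys}
    return catalogue


# Hypothetical example schema, created in memory so the sketch is self-contained.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (id INTEGER PRIMARY KEY, email TEXT NOT NULL);
    CREATE TABLE orders (
        id INTEGER PRIMARY KEY,
        customer_id INTEGER NOT NULL REFERENCES customers(id),
        total REAL
    );
""")
catalogue = scan_schema(conn)
```

Here `catalogue["orders"]["foreign_keys"]` records that `customer_id` references `customers.id`, which mirrors the foreign key and reference details shown in the schema view.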
Clicking on any table will show further details.
Column information and data types often drive decisions about masking or data generation routines. You can also click on a column to view statistical information about its data.
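As a rough idea of what column-level statistics can look like, the following sketch profiles a single column with plain SQL via Python's `sqlite3` module. The statistics the Curiosity Platform actually reports may differ. The `column_stats` function, the `orders` table and the chosen measures are assumptions for illustration.

```python
import sqlite3


def column_stats(conn, table, column):
    """Compute basic profile statistics for one column.

    Table and column names are interpolated directly, so this sketch
    assumes they come from a trusted source such as a prior schema scan.
    """
    row = conn.execute(
        f'SELECT COUNT(*), COUNT("{column}"), COUNT(DISTINCT "{column}"), '
        f'MIN("{column}"), MAX("{column}") FROM "{table}"'
    ).fetchone()
    total, non_null, distinct, minimum, maximum = row
    return {
        "rows": total,
        "nulls": total - non_null,  # COUNT(column) skips NULL values
        "distinct": distinct,
        "min": minimum,
        "max": maximum,
    }


# Hypothetical example data, created in memory.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, total REAL)")
conn.executemany(
    "INSERT INTO orders (total) VALUES (?)",
    [(10.0,), (25.5,), (None,), (10.0,)],
)
stats = column_stats(conn, "orders", "total")
```

Measures like the null count and the number of distinct values are the sort of information that can inform whether a column needs masking or how synthetic values for it might be generated.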
Exercise 2
- Kick off a scan on the connection you set up in Exercise 1, using:
  - Run Scan (VIP Server)
  - Run Scan (Native)
Need help or want to check your work? Check the solution video here.