Load example datasets

From Fusion Registry Wiki
Revision as of 05:55, 8 August 2023 by Vmurrell (talk | contribs) (Load the EXR Exchange Rates data)

Overview

Fusion Registry is a 'virtual statistical data warehouse', meaning that the data it contains can either be loaded and held in storage managed by Fusion Registry, or retrieved dynamically from external sources such as SQL databases, SDMX REST web services or files.

In this example, we are going to load some data from the ECB's EXR - Exchange Rates dataset into SQL database storage managed by Fusion Registry.

This example assumes that you have already loaded the ECB's Exchange Rates Dataflow EXR (1.) as discussed in this article.

We first need to prepare the Dataflow to receive data:

  • Add a connection to our MySQL database to store the data
  • Create an SDMX Data Provider - this is the 'organisation' that submits the data in the SDMX Data Collection model
  • Create an SDMX Provision Agreement - this is the SDMX structure which authorises the Data Provider to submit data to the Dataflow
  • Link the Provision Agreement to the MySQL database connection

Once the Dataflow has been prepared, the data will be loaded directly from Metadata Technology's Fusion Registry demonstration service.

Add a database connection for data storage

  • Log in to the admin account
  • From the left-hand menu choose Admin > Data Manager
  • Choose the 'cogs' button and select Add Database Connection from the menu
  • Use the following settings:
    • Connection Id: MYSQL_LOCAL
    • Connection Type: Registry Managed
    • Database Cache: None
    • Database Platform: MySQL
    • Connection Settings: Simple
    • Database Server: localhost
    • Database Port: 3306
    • Database Schema: fusion_registry (or the name of the schema you created)
    • Database Username: (the name of the database user with access to the schema)
    • Database Password: (the password for that database user)

The 'MYSQL_LOCAL' connection Id will be used later when configuring the Provision Agreement.
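As a rough sanity check before entering these settings, the sketch below builds the JDBC-style URL that the Simple connection settings correspond to and offers a plain TCP check that the MySQL port is reachable. The class and function names are hypothetical, and the URL layout assumes the standard MySQL JDBC form; Fusion Registry may build its connection differently.

```python
import socket
from dataclasses import dataclass


@dataclass(frozen=True)
class MySqlSettings:
    """Mirror of the 'Simple' connection settings listed above (hypothetical helper)."""
    host: str = "localhost"
    port: int = 3306
    schema: str = "fusion_registry"

    def jdbc_url(self) -> str:
        # Assumption: the standard MySQL JDBC URL layout.
        return f"jdbc:mysql://{self.host}:{self.port}/{self.schema}"


def port_is_open(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


settings = MySqlSettings()
print(settings.jdbc_url())
```

Calling `port_is_open(settings.host, settings.port)` only confirms that something is listening on the port; it does not verify the username, password or schema.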

Create an SDMX Data Provider

  • Log in to the admin account (if not already logged in)
  • From the left-hand menu choose Organisations > Data Providers
  • Choose the 'cogs' button and select Create Data Provider
  • Under the 1 Details step of the wizard:
    • Id: MYDP
    • Language: en (this is the language for the Name and Description - you can choose other languages, but we will work with English for now)
    • Owning Agency: SDMX
    • Name: My Data Provider (you can choose any name you like)
    • Description: A test Data Provider (again, you can choose your own description)
  • Choose the Finish button
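For reference, the Data Provider created above can be identified by an SDMX URN. The sketch below composes it from the Owning Agency and Id used in the wizard; it assumes the SDMX 2.1 convention that Data Providers live in a scheme with the fixed Id DATA_PROVIDERS and version 1.0, and the function name is hypothetical.

```python
def data_provider_urn(agency_id: str, provider_id: str) -> str:
    """Compose the SDMX URN for a Data Provider.

    Assumption: SDMX 2.1 fixes the scheme Id to DATA_PROVIDERS and its version to 1.0.
    """
    return (
        "urn:sdmx:org.sdmx.infomodel.base.DataProvider="
        f"{agency_id}:DATA_PROVIDERS(1.0).{provider_id}"
    )


# Values taken from the wizard settings above.
print(data_provider_urn("SDMX", "MYDP"))
```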

Create an SDMX Provision Agreement

  • Log in to the admin account (if not already logged in)
  • From the left-hand menu choose Data > Dataflows
  • Select the 'EXR' ECB Exchange Rates Dataflow from the list
  • Choose the 'cogs' button and select Edit Selected Dataflow
  • On the Dataflow Wizard, use the Next button to move to Step 3 - Data Providers
  • Use the Add button to add a new Data Provider for the Dataflow
  • When the Choose Data Providers window appears, check the MYDP Data Provider and choose the Add button
  • Back on the Dataflow Wizard, choose the Finish button to save the changes

Link the Provision Agreement to the database connection

  • Log in to the admin account (if not already logged in)
  • From the left-hand menu choose Data > Provision Agreements
  • Select the 'EXR_SDMX_MYDP' Provision Agreement from the list
  • Locate the Linked Data Source field at the bottom of the page - choose 'MYSQL_LOCAL' in the select box, and click the Apply button
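The Provision Agreement Id selected above looks like it is composed from the Dataflow Id, the Data Provider's agency and the Data Provider Id. That pattern is inferred from the Ids in this article, not taken from documented behaviour, so treat the sketch below as an illustration only; the function names are hypothetical.

```python
def provision_agreement_id(dataflow_id: str, provider_agency: str, provider_id: str) -> str:
    """Join the three parts with underscores (pattern inferred from this article)."""
    return f"{dataflow_id}_{provider_agency}_{provider_id}"


def structure_reference(agency_id: str, structure_id: str, version: str) -> str:
    """Format a short SDMX reference of the form AGENCY:ID(VERSION)."""
    return f"{agency_id}:{structure_id}({version})"


pa_id = provision_agreement_id("EXR", "SDMX", "MYDP")
print(pa_id)
print(structure_reference("ECB", pa_id, "1.0"))
```

The second value matches the form in which the Provision Agreement is listed when publishing the data.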

Load the EXR Exchange Rates data

LoadDataFromUrl.PNG

The load process may take around 60 seconds while the data is retrieved from the demonstration web service and validated. Once loaded, the Dataset Details page shows the result of the validation, for instance whether the data is semantically compliant (the values comply with the rules specified in the Data Structure). You may see some Time Period Format errors, which can be safely ignored.

The loaded data now needs to be published to the EXR Dataflow.

  • Still on the Dataset Details page, locate the Provision Agreement field and choose the 'ECB:EXR_SDMX_MYDP(1.0)' Provision Agreement from the select box
  • Under Action, choose the Re-Verify Data button - this rechecks the data for compliance with any SDMX Reporting Constraints that may be defined for the Provision Agreement
  • Choose the Publish Data button
  • In the Publish Data popup, choose Append as the Action, and click the Upload button
  • You should receive a notification that the data has been successfully published
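One way to confirm that the data is available after publishing is to query it back through the SDMX REST data API. The sketch below builds a data query URL using the SDMX 2.1 REST layout (data/{agency,dataflow,version}/{key}). The base URL is an assumption about a local installation and should be replaced with the address of your own Fusion Registry; the function names are hypothetical.

```python
from urllib.parse import urlencode
from urllib.request import urlopen

# Assumption: a local Fusion Registry exposing its SDMX REST API at this path.
BASE_URL = "http://localhost:8080/ws/public/sdmxapi/rest"


def build_data_url(base_url, agency_id, dataflow_id, version, key="all", **params):
    """Build an SDMX 2.1 REST data query URL for one Dataflow."""
    flow_ref = f"{agency_id},{dataflow_id},{version}"
    url = f"{base_url.rstrip('/')}/data/{flow_ref}/{key}"
    if params:
        # Optional query parameters such as lastNObservations.
        url += "?" + urlencode(params)
    return url


def fetch_data(url, timeout=30):
    """Fetch the query result as text (needs a running registry)."""
    with urlopen(url, timeout=timeout) as response:
        return response.read().decode("utf-8")


# Ask for only the most recent observation of every series.
print(build_data_url(BASE_URL, "ECB", "EXR", "1.0", lastNObservations=1))
```

Passing the printed URL to `fetch_data` should return the published series if the load succeeded; an empty result or an error response suggests the publish step did not complete.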