Fusion Data Mapper - Add Dataset

From Fusion Registry Wiki

Overview

Mapping Dataflows

The job of the Fusion Data Mapper is to map Dataflows backed by multidimensional Data Structures to a Dataflow backed by a Data Structure with only one Dimension other than Time. The single-dimension Dataflow is referred to as the Source Dataflow, and the multidimensional Dataflow is referred to as the Target Dataflow.

Adding a Dataset links a Target Dataflow to the Source Dataflow via a Structure Map. The Source Dataflow must exist in the Fusion Registry before a Dataset can be added. The Target Dataflow does not have to exist: it can be created by the Data Mapper, or reused if it already exists. On adding a Dataset, the Fusion Data Mapper creates a new Structure Map linking the Source Dataflow to the Target Dataflow.

Forward Data Queries

Map Data Query

If the Source Dataflow has data loaded against it, the Target Dataflow can be linked to this source data. The Fusion Registry then forwards all data queries against the Target Dataflow to the data store of the Source Dataflow. When a data query is forwarded, the mapping rules are used to rewrite the multidimensional data query into the corresponding series identifiers understood by the Source Dataflow. When the dataset is extracted from the data source, the same mapping rules are applied to convert it from a single-dimension time series dataset to a multidimensional one.
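The two directions of the mapping described above can be illustrated with a small sketch. This is not the Fusion Registry implementation: the rule table, the dimension names (`FREQ`, `REF_AREA`, `INDICATOR`), the series identifiers, and both function names are hypothetical, chosen only to show a multidimensional filter being rewritten to source series identifiers and source series being returned with their mapped dimensions.

```python
from typing import Dict, List, Tuple

# Hypothetical mapping rules: each source series ID maps to one set of
# target dimension values. Names and codes are illustrative only.
MAPPING_RULES: Dict[str, Dict[str, str]] = {
    "SERIES_A": {"FREQ": "M", "REF_AREA": "UK", "INDICATOR": "CPI"},
    "SERIES_B": {"FREQ": "M", "REF_AREA": "FR", "INDICATOR": "CPI"},
    "SERIES_C": {"FREQ": "A", "REF_AREA": "UK", "INDICATOR": "GDP"},
}


def rewrite_query(target_filter: Dict[str, str]) -> List[str]:
    """Return the source series IDs whose mapped dimensions match the filter.

    A dimension that is absent from the filter acts as a wildcard.
    """
    return sorted(
        series_id
        for series_id, dims in MAPPING_RULES.items()
        if all(dims.get(dim) == value for dim, value in target_filter.items())
    )


def to_multidimensional(
    source_data: Dict[str, List[Tuple[str, float]]]
) -> List[dict]:
    """Attach the mapped dimension values to each source series."""
    return [
        {"dimensions": MAPPING_RULES[series_id], "observations": observations}
        for series_id, observations in source_data.items()
        if series_id in MAPPING_RULES
    ]
```

With these assumed rules, a query for `REF_AREA = UK` would be forwarded as a request for `SERIES_A` and `SERIES_C`, and the returned series would be relabelled with their target dimension values.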

The Data Query forwarding option is available on the form used to create the Dataflow. If this is set to YES then, in addition to creating a Structure Map, a Provision Agreement will be created and linked to the Structure Map as a source of Data.


Create Dataset Result

The result of creating a dataset, from the UI perspective, is an empty table with a column for the source dimension and a column for each Dimension and Series Attribute in the Target. This table can be used to define mapping rules.

[Figure: empty mapping table (Emptymappingtable.png)]
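The column layout of the empty table can be sketched as follows. The function name and the example dimension and attribute identifiers are hypothetical; the only thing taken from the description above is the ordering of one source column followed by one column per target Dimension and Series Attribute.

```python
from typing import List


def mapping_table_columns(
    source_dimension: str,
    target_dimensions: List[str],
    series_attributes: List[str],
) -> List[str]:
    """Build the header row: source column first, then target columns."""
    return [source_dimension, *target_dimensions, *series_attributes]


# Illustrative identifiers only; real values come from the Data Structures.
columns = mapping_table_columns("SERIES", ["FREQ", "REF_AREA"], ["UNIT"])
rows: List[List[str]] = []  # the table starts with no mapping rules
```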

Summary of Structures Created

Creating a new Dataflow

This process will create a new Dataflow linked to the Data Structure Definition specified. A Structure Map will be created linking the new Dataflow to the Source Dataflow. If Forward Data Queries is selected, a Provision Agreement will be created and linked to the Structure Map as a source of Data.

Map Existing Dataflow

This process will create a Structure Map linking the existing Dataflow to the Source Dataflow. If Forward Data Queries is selected, the Mapping tool will search for a Provision Agreement against the Target Dataflow, and this will be linked to the Structure Map as a source of Data. If a Provision Agreement cannot be found against the Target Dataflow, one will be created.

Note: Linking an existing Provision Agreement to the Structure Map as a data store will unlink it from any existing data store and may result in loss of data.
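The find-or-create behaviour for the Provision Agreement can be sketched as below. This is a minimal illustration under stated assumptions, not the Mapping tool's code: the in-memory dictionary standing in for the Registry, the `Resolution` type, and the function name are all hypothetical. The `relinked` flag marks the case in which an existing Provision Agreement is re-pointed and existing data may be lost.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass(frozen=True)
class Resolution:
    provision_agreement_id: str
    created: bool   # True when no Provision Agreement existed and one was made
    relinked: bool  # True when an existing one is reused (risk of data loss)


def resolve_provision_agreement(
    target_dataflow_id: str,
    agreements_by_dataflow: Dict[str, str],
) -> Resolution:
    """Reuse the Provision Agreement of the Target Dataflow, or create one."""
    existing = agreements_by_dataflow.get(target_dataflow_id)
    if existing is not None:
        # Reusing it unlinks it from its current data store.
        return Resolution(existing, created=False, relinked=True)
    # Assumption: a new Provision Agreement takes the Dataflow's Id.
    agreements_by_dataflow[target_dataflow_id] = target_dataflow_id
    return Resolution(target_dataflow_id, created=True, relinked=False)
```

A caller could check `relinked` before proceeding and ask the user to confirm, since that is the branch in which data may be lost.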

Dataflow Construction

The Dataflow Properties of Agency Id, Id, Version and Name are taken from the Form.

Provision Agreement Construction

Provision Agreements are created with the same Agency Id, Id and Version as the Target Dataflow. The system will look for a Data Provider with the same Id as the Agency Id of the Dataflow, under a Data Provider Scheme owned by the same Agency. For example, if the Dataflow has the following properties:

Agency Id : CB1
Id : CBS
Version : 1.0

The Provision Agreement will have the same Agency Id, Id, Version and Name as the Dataflow. The Data Provider for the Provision Agreement will have the following properties:

Data Provider Scheme Agency Id : CB1
Data Provider Id : CB1

If the Data Provider Scheme or Data Provider does not exist, it will be created.
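The derivation rules above can be expressed as a short sketch. The class and function names are hypothetical, and the Dataflow Name used in the example is invented for illustration because the example above does not give one; only the copy rules (same Agency Id, Id, Version and Name, with the Data Provider Id equal to the Agency Id) come from the text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Dataflow:
    agency_id: str
    id: str
    version: str
    name: str


@dataclass(frozen=True)
class ProvisionAgreement:
    agency_id: str
    id: str
    version: str
    name: str
    provider_scheme_agency_id: str
    data_provider_id: str


def derive_provision_agreement(flow: Dataflow) -> ProvisionAgreement:
    """Copy identity from the Dataflow; the provider is named after the Agency."""
    return ProvisionAgreement(
        agency_id=flow.agency_id,
        id=flow.id,
        version=flow.version,
        name=flow.name,
        provider_scheme_agency_id=flow.agency_id,
        data_provider_id=flow.agency_id,
    )


# The Name below is a placeholder, not taken from the documentation.
agreement = derive_provision_agreement(
    Dataflow(agency_id="CB1", id="CBS", version="1.0", name="Example Dataflow")
)
```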

Structure Map Construction

A Structure Set is created with the same Agency Id, Id, Version and Name as the Dataflow. The Structure Map has the same Id as the Dataflow.