Data Stores

Overview

The Fusion Registry can act as a Virtual Data Store. This means that the Registry is able to link to datasets across multiple physical data stores, and each data store can be of a different type (database, web service, file). The Fusion Registry is able to query multiple data stores for a single data query and combine the data before writing it back to the user in a response format of their choice.

Registry Managed Data Store

A Registry Managed Data Store is one which the Registry maintains. This means that as data is published to the Fusion Registry, the Registry will write the data to the store and manage the transaction. Registry managed data stores come in two flavours: a relational database store (MySQL or SQL Server), or a Fusion Store, which has been custom built by Metadata Technology for rapid storage and dissemination of data.

Fusion Data Store

The Fusion Data Store is an in-memory database, custom built by Metadata Technology for storing time series SDMX datasets. The data is ultimately persisted to the Registry database, so that when the Fusion Registry server is terminated and re-launched, the Fusion Data Stores are read back into memory. For the duration of the server's up-time, the data store lives in memory.

Registry Managed Database

A Registry Managed Database is supported on the SQL Server and MySQL database platforms, as well as MariaDB*. The Fusion Registry is responsible for creating the database tables when new datasets are loaded. The Fusion Registry will also update the table definitions if the Data Structure Definition is changed.

*Note: MariaDB re-writes queries for both lastNObservations and firstNObservations, optimising away a required part of the SQL query. MariaDB can be configured to run in MySQL compatibility mode, where this error does not occur.

The database table structure uses two or three tables per Dataflow. The table names are based on the Dataflow, but can be changed if required. The tables are:

  • a table for Series definitions
  • a table for Observations
  • a table for Dataset Attributes (only created if Dataset Attributes exist on the Data Structure Definition)

Column Names

The Series Table will have the following columns created:

  • ID (varchar) - A concatenation of the Dimension values that make up the series, for example M:FR:EMP. The width of the ID column is the sum of all the Dimension widths plus the number of Dimensions.
  • FR_GROUP_ID (varchar, optional) - Created if the Data Structure Definition defines Groups. This column enables the storage of Group-level Attributes.
  • FR_PROVIDER_ACY (varchar) - The Data Provider's Agency, used if the data query includes Data Provider filters.
  • FR_PROVIDER_ID (varchar) - The Data Provider's Id, used if the data query includes Data Provider filters.
  • FR_PROVISION (varchar) - The Provision Agreement short URN, e.g. ECB:EXR(1.0).
  • FR_ERROR (varchar) - Internal use only.
  • FR_UPDATED_DATE (varchar) - Last updated date for the series.
  • DIM_ID/ATT_ID (text) - A column for each Dimension and Series-level Attribute. The column name is the Id of the Dimension/Attribute and the column type depends on the Text Format.

It is important that the column width for each Dimension of the Data Structure Definition is kept small, i.e. if a Dimension's data type is String, place a Max Length restriction on the Dimension's definition so that the database table can be constructed with a primary key that does not exceed the limit set by the database vendor.
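As an illustration only, the following is a minimal MySQL-flavoured sketch of what a Registry-generated Series table could look like. The Dataflow ACME:EMP(1.0), its Dimensions (FREQ, REF_AREA, INDICATOR), the attribute UNIT_MULT, the table name and the widths of the FR_ columns are all assumptions for this example; the actual DDL the Registry generates may differ.

-- Hypothetical Series table for an assumed Dataflow ACME:EMP(1.0) with
-- Dimensions FREQ (longest code 1), REF_AREA (longest code 2) and
-- INDICATOR (longest code 12), plus a series-level attribute UNIT_MULT.
-- ID width = 1 + 2 + 12 + 3 (number of Dimensions) = 18.
CREATE TABLE ACME_EMP_1_0_KEY (
  ID              VARCHAR(18) PRIMARY KEY,  -- concatenated key, e.g. 'A:UK:EMP_TOTAL'
  FR_PROVIDER_ACY VARCHAR(50),              -- widths of the FR_ columns are illustrative
  FR_PROVIDER_ID  VARCHAR(50),
  FR_PROVISION    VARCHAR(100),
  FR_ERROR        VARCHAR(255),
  FR_UPDATED_DATE VARCHAR(30),
  FREQ            VARCHAR(1),               -- one column per Dimension
  REF_AREA        VARCHAR(2),
  INDICATOR       VARCHAR(12),
  UNIT_MULT       VARCHAR(2)                -- one column per series-level Attribute
);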

The Observation Table will have the following columns created:

  • ID (int) - Auto-increment primary key.
  • FR_KEY_ID (varchar) - Foreign key to the Series table's ID column.
  • FR_OBS_DATE (datetime) - A datetime representation of the TIME_PERIOD column.
  • OBS_VALUE (varchar) - The Observation value.
  • TIME_PERIOD (varchar) - The reported Time Period.
  • FR_ERROR (varchar) - Internal use only.
  • FR_EMBARGO (int) - Deprecated.
  • FR_IS_PUBLIC (int) - Deprecated.
  • FR_UPDATED_DATE_OBS (int) - The last time the observation was inserted or updated.
  • FR_VALID_FROM (datetime) - Used to support the includeHistory query parameter. A timestamp of when the observation is valid from, either taken from the header of the dataset or set to the time the observation was inserted.
  • FR_VALID_TO (datetime) - Used to support the includeHistory query parameter. A timestamp of when the observation is valid to (if the observation is updated, this is used to capture previous revisions).
  • FR_CHKSUM (varchar) - A checksum on the column (internal use).
  • ATT_ID (text) - A column for each Observation-level Attribute. The column name is the Id of the Attribute and the column type depends on the Text Format.
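Continuing the illustration above, a minimal sketch of the matching Observation table (again MySQL-flavoured, with illustrative widths and a hypothetical observation-level attribute OBS_STATUS):

-- Hypothetical Observation table matching the ACME_EMP_1_0_KEY sketch above.
CREATE TABLE ACME_EMP_1_0_OBS (
  ID                  INT AUTO_INCREMENT PRIMARY KEY,
  FR_KEY_ID           VARCHAR(18),   -- joins to ACME_EMP_1_0_KEY.ID
  FR_OBS_DATE         DATETIME,      -- datetime representation of TIME_PERIOD
  OBS_VALUE           VARCHAR(22),
  TIME_PERIOD         VARCHAR(30),
  FR_ERROR            VARCHAR(255),
  FR_EMBARGO          INT,           -- deprecated
  FR_IS_PUBLIC        INT,           -- deprecated
  FR_UPDATED_DATE_OBS INT,
  FR_VALID_FROM       DATETIME,
  FR_VALID_TO         DATETIME,
  FR_CHKSUM           VARCHAR(64),   -- width illustrative
  OBS_STATUS          VARCHAR(1)     -- hypothetical observation-level Attribute
);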

The Dataset Attribute Table, if required, will have the following columns created:

  • ID (varchar) - The Id of the Dataset-level Attribute.
  • VALUE (varchar) - The reported value for the attribute.
  • FR_ERROR (varchar) - Internal use only.
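For completeness, a sketch of the Dataset Attribute table on the same assumptions (the table name suffix and column widths are illustrative):

-- Hypothetical Dataset Attribute table; only created when the Data Structure
-- Definition defines Dataset-level Attributes.
CREATE TABLE ACME_EMP_1_0_DSATT (
  ID       VARCHAR(50),   -- Id of the Dataset-level Attribute
  VALUE    VARCHAR(250),  -- reported value for the attribute
  FR_ERROR VARCHAR(255)   -- internal use only
);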

Column Data Types

The column data types that are based on the Data Structure Definition's Dimensions and Attributes are set to varchar, with width restrictions based on the table below. If the width restriction exceeds 250 characters the column data type is set to TEXT.

Width restrictions are calculated as follows (Component representation, column width, example value):

  • String - the Max Length defined on the Component, or 250 if unspecified (e.g. "This is a series title")
  • Codelist - the length of the longest Code Id (e.g. UK would give a width of 2)
  • Boolean - 5 (e.g. true)
  • Short - 5 (e.g. 32767)
  • Integer - 10 (e.g. 123)
  • Long - 19 (e.g. 123456)
  • Float - 12 (e.g. 12.20)
  • Double - 22 (e.g. 12.20)
  • Date - 30 (e.g. 2002-01-01T12:15:00.000GMT)
  • Date Time - 30 (e.g. 2002-01-01T12:15:00.000GMT)
  • Time Period - 30 (e.g. 2002-Q1, or any valid Time Format)
  • Day - 2 (e.g. 31)
  • Month - 2 (e.g. 12)
  • Year_Month - 7 (e.g. 2002-01)
  • Year - 4 (e.g. 2001)
  • Time - 10 (e.g. 12:15:00)

Note: if the Dimension or Attribute is numerical and maximum and minimum values are set, then these will be used to determine the column width. For example, if the Attribute is of type Double, with a min value of 0, a max value of 10 and 3 decimals, then the maximum length is 6, allowing values such as 10.123 to be reported.
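To make the width rules concrete, a small hypothetical example (the column names are invented for illustration):

-- TITLE:      uncoded String with no Max Length         -> width 250
-- REF_AREA:   codelist whose longest Code Id is 2 chars -> width 2
-- CONF_VALUE: Double, min 0, max 10, 3 decimals         -> width 6 (e.g. '10.123')
CREATE TABLE WIDTH_EXAMPLE (
  TITLE      VARCHAR(250),
  REF_AREA   VARCHAR(2),
  CONF_VALUE VARCHAR(6)
);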

Externally Managed Database

An Externally Managed Database can be MySQL, SQL Server, or Oracle.

There are two options for the database table design when linking to an externally managed database; these are:

  1. Mirror the structure used by the Fusion Registry Managed database table structure (Series / Observation).
  2. A single database table for a Dataflow, where the Series and Observations are in one table.

Note: the table and column names must all be in UPPER CASE.

When the Provision Agreement is linked to an externally managed data store, the user interface provides the ability to specify which database tables/views are used for the Series table and the Observation table. The Series table is optional: if it is not provided, the Single Table Design will be used to query the database table; if it is provided, the Series/Obs table design will be used.

Series/Obs Table Design

See the Registry table design section above for the full structure. The full table structure is not required, but there are some rules for this table/view design (a minimal sketch follows the list):

  1. The Series table must have a primary key OR a column called ID. If there is a column called ID, it is expected to be the primary key column.
  2. If the Series table has an ID column AND it is of type varchar, the values are expected to be the concatenated series key (A:UK:EMP for example).
  3. The Obs table must either have a foreign key to the Series table, OR a column called FR_KEY_ID which can be used to join to the primary key of the Series table.
  4. The Series table is expected to have a column per Dimension; attribute columns are all optional.
  5. The Obs table is expected to have a column for the Obs Value and the Time Period; attribute columns are all optional.
  6. All columns prefixed with 'FR_' are optional.
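A minimal sketch satisfying these rules, assuming a hypothetical Dataflow with Dimensions FREQ, REF_AREA and INDICATOR (MySQL syntax; the table names, column widths and the omission of attribute columns are all illustrative):

-- Minimal externally managed Series table: primary key column ID holding the
-- concatenated series key, plus one column per Dimension.
CREATE TABLE MY_DATAFLOW_KEY (
  ID        VARCHAR(50) PRIMARY KEY,   -- e.g. 'A:UK:EMP'
  FREQ      VARCHAR(5),
  REF_AREA  VARCHAR(3),
  INDICATOR VARCHAR(25)
);

-- Minimal externally managed Obs table: FR_KEY_ID joins to MY_DATAFLOW_KEY.ID,
-- plus the Time Period and Obs Value. All FR_ and attribute columns are optional.
CREATE TABLE MY_DATAFLOW_OBS (
  FR_KEY_ID   VARCHAR(50),
  TIME_PERIOD VARCHAR(30),
  OBS_VALUE   VARCHAR(20)
);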

Single Table Design

The flat table structure is a denormalised structure where the Series and Observations are held in one table. This does lead to repetition of Dimension values where a series consists of multiple observations. The flat table structure does not support all the features of an SDMX query, such as firstNObservations, lastNObservations and includeHistory.

The table structure is as follows (column name, type, whether it is required, and purpose; a minimal sketch follows the list):

  • DIM_ID (varchar, required) - A column per Dimension, where the column name is the Dimension Id.
  • ATTR_ID (varchar, optional) - A column per Series or Observation Attribute, where the column name is the Attribute Id.
  • TIME_PERIOD (varchar, required) - The reported Time Period.
  • FR_VALID_FROM (datetime, optional) - Used to support the startPeriod, endPeriod, lastNObservations and firstNObservations query parameters.
  • FR_UPDATED_DATE (datetime, optional) - Used to support the updatedAfter query parameter.
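A minimal sketch of the single-table design under the same assumptions as above. Note that the column holding the Observation Value is not named in the list above; it is assumed here to be OBS_VALUE:

-- Minimal flat (single-table) design for a hypothetical Dataflow.
CREATE TABLE MY_DATAFLOW_FLAT (
  FREQ            VARCHAR(5),    -- one column per Dimension, named after the Dimension Id
  REF_AREA        VARCHAR(3),
  INDICATOR       VARCHAR(25),
  UNIT_MULT       VARCHAR(2),    -- optional attribute column
  TIME_PERIOD     VARCHAR(30),
  OBS_VALUE       VARCHAR(20),   -- assumed name for the Observation Value column
  FR_VALID_FROM   DATETIME,      -- optional
  FR_UPDATED_DATE DATETIME       -- optional
);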

Custom Table and Column Names

It is possible to override the Registry's own table naming and column naming conventions. To do this via the web service, use the following:

  • Entry Point: /ws/secure/datasource/tableMapping
  • Access: Private (admin)
  • HTTP Method: POST
  • Accepts: application/json
  • Compression: N/A
  • Content-Type: application/json
  • Response Format: application/json
  • Response Statuses:
    • 200 - Success
    • 400 - Bad JSON syntax
    • 401 - Unauthorized
    • 500 - Server Error

The JSON request has the following structure:

{
  "URN" : "urn:sdmx:org.sdmx.infomodel.datastructure.Dataflow=WB:GCI(1.0)",
  "DataSourceId" : "MY_STORE",
  "KeyTable" : "GCI_KEY",
  "ObsTable" : "GCI_OBS",
  "DSAttributeTable" : null,
  "MappedColumns" : {
    "REF_AREA" : "COUNTRY",
    "FR_UPDATE_DATE" : "UPDATED_DATE" 
  }
}

The URN field is the URN of the Dataflow to map the table/column names for. The MappedColumns field is optional, and is only required if column names need mapping as well as table names. The same web service can be used to update a mapping; the Dataflow URN is the key for the mapping.

To delete a mapping, call the same web service with the DELETE method and post the following JSON:

{
  "URN" : "urn:sdmx:org.sdmx.infomodel.datastructure.Dataflow=WB:GCI(1.0)",
  "DataSourceId" : "MY_STORE"
}

To retrieve a mapping, call the same web service with the following query parameters:

  • urn - the URN of the Dataflow to get the table mapping description for
  • dataSourceId - the Id of the data store to get the table mapping description for
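For example, a retrieval request might look like the following (the host is omitted and the URN value may need URL encoding; the Dataflow URN and data source Id are taken from the example JSON above):

/ws/secure/datasource/tableMapping?urn=urn:sdmx:org.sdmx.infomodel.datastructure.Dataflow=WB:GCI(1.0)&dataSourceId=MY_STORE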

Example Using Oracle

The following is a short example demonstrating how to use Oracle to set up two externally managed tables and connect them to the Fusion Registry, allowing the Registry to disseminate the data.

The steps to follow are:

  1. Using the database tools of your choice, create the Key Table and Obs Table and populate both with data
  2. Load the structures (attached below) into your Fusion Registry
  3. Navigate to the Data Sources Manager page and:
    • Set up an Oracle Connection where the Connection Type is "Externally Managed" and the Database Platform is "Oracle"
    • Select this Oracle data source in the table and click the cogs at top-right. Select "Link Provision Agreement" and select the Provision Agreement "WB_WDI_ECON_POL_DEBT"
    • If everything has gone correctly, then the table will show 3 Series (and an unknown number of Observations)
  4. The data can now be queried from the Registry using the appropriate Web Service and the dataflow "WDI_ECON_POL_DEBT". E.g.:
http://localhost:8080/FusionRegistry/sdmx/v2/data/dataflow/WB/WDI_ECON_POL_DEBT/1.0/?format=sdmx-compact-2.1


Key Table:

The following creates the Key Table: WB_WDI_ECON_POL_DEBT_1_0_KEY

CREATE TABLE WB_WDI_ECON_POL_DEBT_1_0_KEY (
 ID        VARCHAR2(250 BYTE),
 FREQ      VARCHAR2(5 BYTE),
 SERIES    VARCHAR2(25 BYTE),
 REF_AREA  VARCHAR2(3 BYTE)
);

Sample Data for Key Table:

INSERT INTO WB_WDI_ECON_POL_DEBT_1_0_KEY values ('A:AG_AGR_TRAC_NO:ABW', 'A', 'AG_AGR_TRAC_NO', 'ABW');
INSERT INTO WB_WDI_ECON_POL_DEBT_1_0_KEY values ('A:AG_AGR_TRAC_NO:ADO', 'A', 'AG_AGR_TRAC_NO', 'ADO');
INSERT INTO WB_WDI_ECON_POL_DEBT_1_0_KEY values ('A:AG_AGR_TRAC_NO:AFG', 'A', 'AG_AGR_TRAC_NO', 'AFG');

Observation Table:

The following creates the Observation Table: WB_WDI_ECON_POL_DEBT_1_0_OBS

CREATE TABLE WB_WDI_ECON_POL_DEBT_1_0_OBS (
 FR_KEY_ID       VARCHAR2(50 BYTE),
 FR_OBS_DATE     TIMESTAMP (6) NOT NULL,
 TIME_PERIOD     VARCHAR2(50 BYTE) NOT NULL,
 OBS_VALUE       VARCHAR2(10 BYTE) NOT NULL,
 UNIT_MULT       VARCHAR2(2  BYTE) NOT NULL,
 FR_VALID_FROM   TIMESTAMP (6) DEFAULT TO_TIMESTAMP( '2015-01-01 00:00:00', 'YYYY-MM-DD HH24:MI:SS') NOT NULL,
 FR_VALID_TO     TIMESTAMP (6) DEFAULT TO_TIMESTAMP( '2999-12-31 00:00:00', 'YYYY-MM-DD HH24:MI:SS') NOT NULL, 
 FR_UPDATED_DATE_OBS     TIMESTAMP (6) DEFAULT TO_TIMESTAMP( '2014-01-01 00:00:00', 'YYYY-MM-DD HH24:MI:SS') NOT NULL 
);

Sample Data for Obs Table:

INSERT INTO WB_WDI_ECON_POL_DEBT_1_0_OBS (FR_KEY_ID, FR_OBS_DATE, TIME_PERIOD, OBS_VALUE, UNIT_MULT) values ('A:AG_AGR_TRAC_NO:ABW', '01-JAN-1950', '1950', '1950.5001', 'C');
INSERT INTO WB_WDI_ECON_POL_DEBT_1_0_OBS (FR_KEY_ID, FR_OBS_DATE, TIME_PERIOD, OBS_VALUE, UNIT_MULT) values ('A:AG_AGR_TRAC_NO:ABW', '01-JAN-1951', '1951', '1951.5152', 'C');
INSERT INTO WB_WDI_ECON_POL_DEBT_1_0_OBS (FR_KEY_ID, FR_OBS_DATE, TIME_PERIOD, OBS_VALUE, UNIT_MULT) values ('A:AG_AGR_TRAC_NO:ABW', '01-JAN-1952', '1952', '1952.52', 'C');
INSERT INTO WB_WDI_ECON_POL_DEBT_1_0_OBS (FR_KEY_ID, FR_OBS_DATE, TIME_PERIOD, OBS_VALUE, UNIT_MULT) values ('A:AG_AGR_TRAC_NO:ADO', '01-JAN-1951', '1951', '19.51', 'C');
INSERT INTO WB_WDI_ECON_POL_DEBT_1_0_OBS (FR_KEY_ID, FR_OBS_DATE, TIME_PERIOD, OBS_VALUE, UNIT_MULT) values ('A:AG_AGR_TRAC_NO:AFG', '01-JAN-1950', '1950', '11950.50', 'C');
INSERT INTO WB_WDI_ECON_POL_DEBT_1_0_OBS (FR_KEY_ID, FR_OBS_DATE, TIME_PERIOD, OBS_VALUE, UNIT_MULT) values ('A:AG_AGR_TRAC_NO:AFG', '01-JAN-1951', '1951', '11951.51', 'C');
INSERT INTO WB_WDI_ECON_POL_DEBT_1_0_OBS (FR_KEY_ID, FR_OBS_DATE, TIME_PERIOD, OBS_VALUE, UNIT_MULT) values ('A:AG_AGR_TRAC_NO:AFG', '01-JAN-1952', '1952', '11952.999', 'C');


Dataflow, DSD and other related structures:

The following file ( File:ExampleStructures.zip ) contains the structures required by this demo. It includes:

  • The Data Structure: WB:WDI(1.0)
  • The Dataflow: WB:WDI_ECON_POL_DEBT(1.0)
  • The Provision Agreement: WB:WB_WDI_ECON_POL_DEBT(1.0)
  • Supporting structures such as Codelists and Concept Schemes