Data Stores

From Fusion Registry Wiki
Revision as of 09:11, 20 February 2020 by Mnelson (talk | contribs) (Created page with "= Overview = <p>The Fusion Registry can act as a Virtual Data Store. This means that the Registry is able to link to datasets accross multiple physical data stores, and each...")
(diff) ← Older revision | Latest revision (diff) | Newer revision → (diff)
Jump to navigation Jump to search

Overview

The Fusion Registry can act as a Virtual Data Store. This means that the Registry is able to link to datasets accross multiple physical data stores, and each data store can be of a diffent type (database, web service, file). The Fusion Registry is able to query multiple data stores for a single data query and then combine the data at the end, before writing it back to the user in a response format of their choice.

Registry Managed Data Store

A Registry Managed Data Store is one which the Registry maintains. This means as data is published to the Fusion Registry, the Registry wil write the data to the store and manage the transaction. Registry managed data stores come in two flavours, a relational database (MySQL or SQL Server) store, or a Fusion Store which has been custom built by Metadata Technology for rapid storage and dissemination of data.

Fusion Data Store

The Fusion Data Store is an in-memory database, custom built by Metadata Technology for storing time series SDMX datasets. The data is ultimately persisted to the Registry database, so that when the Fusion Registry server is terminated, and re-launched, the Fusion Data Stores will be read back into memory. For the duration of the server up-time, the data store will live in memory.

Registry Managed Database

A Registry Managed Database includes support for SQL Server and MySQL database platforms, MariaDB*. The Fusion Registry is responsible for creating the database tables when new datasets are loaded. The Fusion Registry will also update table definitions if the Data Structure Definition is changed.

The database table structure is a two or three tables per Dataflow. The table names are based on the Dataflow, but can be changed if required. The tables include:

  • Table for Series Definitions
  • Table for Observations
  • Table data dataset attributes

The column names are based on the DSD Dimensions and Attributes, and the column's data type is either Varchar or Text if the column width exceeds 250 characters. With columns width are calculated as follows:

Component's Representation Column Width Example Value
String Max Length defined on Component, or 250 if unspecified This is a series title
Codelist Maximum length of the longes Code Id (e.g UK would be length 2) UK
Boolean 5 true
Short 5 32767
Integer 10 123
Long 19 123456
Float 12 12.20
Double 22 12.20
Double 22 12.20
Date 30 2002-01-01T12:15:00.000GMT
Date Time 30 2002-01-01T12:15:00.000GMT
Time Period 30 2002-Q1 (any valid Time Format)
Day 2 31
Month 2 12
Year_Month 7 2002-01
Year 4 2001
Time 10 12:15:00

based on the data type of the Dimension or Attribute.

* Maria DB re-writes a query for both lastNObservation and firstNObservations to optimise away a required part of the SQL query. MariaDB can be configured to run in MySQL compatability mode, where this error does not occur.

Externally Managed Database

An Externally Managed Database can be MySQL, SQLServer, or Oracle. The database table structure

Flat Table

Multi Table