Data Formats

From Fusion Registry Wiki
Revision as of 01:59, 10 October 2023 by Plazarou (talk | contribs) (CSV Formats)

Overview

Fusion Registry accepts and outputs datasets in a number of formats. When consuming data, the Fusion Registry analyses the dataset to determine which data format it has received, so that it can direct the dataset to the right reader. All datasets are read by the Fusion Registry in exactly the same way, so any data processing performed on a dataset is the same regardless of the input data format.

When querying the Registry web service for data, or performing a Data Transformation via the web service, the required output data format is specified using the HTTP Accept Header.
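As a minimal sketch of the above, the following Python snippet builds (but does not send) a data query that asks for SDMX-CSV output. The URL is a hypothetical placeholder and not taken from this page; only the Accept header value comes from the tables below.

```python
from urllib.request import Request

# Hypothetical endpoint: replace with the data query URL of your own Registry.
DATA_URL = "https://registry.example.org/data/EXAMPLE_FLOW"


def build_data_request(url: str, accept: str) -> Request:
    """Return a GET request whose Accept header names the required output format."""
    return Request(url, headers={"Accept": accept}, method="GET")


request = build_data_request(DATA_URL, "application/vnd.sdmx.data+csv;version=1.0.0")
print(request.get_header("Accept"))
```

Passing the resulting object to urllib.request.urlopen would send the query; that step is left out here because the endpoint is only a placeholder.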


HTTP Accept Headers

SDMX Formats

SDMX Formats are supported as described by the web services specification.

Accept Headers

Accept Header Format
application/vnd.sdmx.data+json;version=1.0.0 SDMX JSON
application/vnd.sdmx.data+csv;version=1.0.0 SDMX CSV
application/vnd.sdmx.data+edi SDMX EDI
application/vnd.sdmx.structurespecificdata+xml;version=2.1 Structure Specific (2.1)
application/vnd.sdmx.structurespecificdata+xml;version=2.0 Compact (2.0/1.0)
application/vnd.sdmx.genericdata+xml;version=2.1 Generic (2.1/2.0/1.0)
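
For client code, the table above can be kept as a simple lookup so that header strings are not retyped by hand. This is only an illustrative sketch: the dictionary name and the helper function are hypothetical, while the header values are copied from the table.

```python
# Accept header values copied from the table above, keyed by the format label.
SDMX_ACCEPT_HEADERS = {
    "SDMX JSON": "application/vnd.sdmx.data+json;version=1.0.0",
    "SDMX CSV": "application/vnd.sdmx.data+csv;version=1.0.0",
    "SDMX EDI": "application/vnd.sdmx.data+edi",
    "Structure Specific (2.1)": "application/vnd.sdmx.structurespecificdata+xml;version=2.1",
    "Compact (2.0/1.0)": "application/vnd.sdmx.structurespecificdata+xml;version=2.0",
    "Generic (2.1/2.0/1.0)": "application/vnd.sdmx.genericdata+xml;version=2.1",
}


def accept_header_for(format_label: str) -> str:
    """Return the Accept header for a format label, or raise ValueError if unknown."""
    try:
        return SDMX_ACCEPT_HEADERS[format_label]
    except KeyError:
        known = ", ".join(sorted(SDMX_ACCEPT_HEADERS))
        raise ValueError(f"unknown format {format_label!r}; expected one of: {known}") from None


print(accept_header_for("SDMX EDI"))
```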

CSV Formats

Fusion software supports a number of CSV 'flavours', including SDMX-CSV, which is an official SDMX data format. If you cannot decide which format to use, SDMX-CSV version 2 is recommended.

The following formats and corresponding VND Headers are supported:

Each VND Header can take the additional arguments listed below. Note that SDMX-CSV supports only a subset of these arguments and of their supported values.

Argument | Description | Supported Values | Default | Supported Formats
version | The version of the format | 1.0.0, 2.0.0 (SDMX-CSV only) | 1.0.0 | SDMX-CSV, Fusion-CSV, Fusion-CSV-TS
timeFormat | Values are converted to the most granular ISO 8601 representation, taking into account the highest frequency of the data in the message | original or normalized | original | SDMX-CSV
labels | Output both code/concept ids and the respective labels in the specified language | both/id/name | id | SDMX-CSV (with the exception of labels=name), Fusion-CSV, Fusion-CSV-TS
delimiter | The delimiter to use | comma/tab/semicolon/space | comma | Fusion-CSV, Fusion-CSV-TS
serieskey | Include the series key as a column. A series key is the concatenation of the dimension values, for example A:UK:EMPLOYMENT | include/exclude | exclude | Fusion-CSV, Fusion-CSV-TS
bom (since 10.3.1) | Include or exclude the Byte Order Mark (BOM). The BOM helps Excel interpret non-Latin characters when opening a CSV file | include/exclude | exclude | SDMX-CSV, Fusion-CSV, Fusion-CSV-TS
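
The arguments above are appended to the VND Header as semicolon-separated name=value pairs. The sketch below assembles such a header and rejects values that the table does not list. The function and dictionary names are hypothetical, and the check is a client-side convenience only: it does not model which formats accept which arguments, and it is not a description of how the Fusion Registry validates requests.

```python
# Argument names and values as listed in the table above.
SUPPORTED_VALUES = {
    "version": {"1.0.0", "2.0.0"},
    "timeFormat": {"original", "normalized"},
    "labels": {"both", "id", "name"},
    "delimiter": {"comma", "tab", "semicolon", "space"},
    "serieskey": {"include", "exclude"},
    "bom": {"include", "exclude"},
}


def build_csv_accept(media_type: str, **arguments: str) -> str:
    """Join a media type and its arguments into one Accept header value."""
    parts = [media_type]
    for name, value in arguments.items():
        allowed = SUPPORTED_VALUES.get(name)
        if allowed is None:
            raise ValueError(f"unknown argument: {name}")
        if value not in allowed:
            raise ValueError(f"unsupported value for {name}: {value}")
        parts.append(f"{name}={value}")
    return ";".join(parts)


print(build_csv_accept("application/vnd.csv", delimiter="tab", serieskey="include"))
```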


Note: The labels parameter can be used in conjunction with the HTTP Accept-Language Header to indicate which language to resolve the labels in. If the labels are not available in the requested language, another language will be selected, defaulting to English.
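
To illustrate the note, the sketch below pairs a labels argument with an Accept-Language header. The helper name is hypothetical and the language code "fr" is only an example; the Accept value uses arguments listed in the table above.

```python
def label_request_headers(language: str) -> dict[str, str]:
    """Return headers asking for ids plus labels, resolved in the given language."""
    return {
        "Accept": "application/vnd.sdmx.data+csv;version=1.0.0;labels=both",
        "Accept-Language": language,
    }


headers = label_request_headers("fr")
print(headers)
```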

Examples

application/vnd.sdmx.data+csv
application/vnd.sdmx.data+csv;version=1.0.0;
application/vnd.sdmx.data+csv;version=2.0.0;
application/vnd.sdmx.data+csv;version=1.0.0;timeFormat=normalized
application/vnd.sdmx.data+csv;version=1.0.0;timeFormat=normalized;labels=both

application/vnd.csv
application/vnd.csv;delimiter=tab
application/vnd.csv;timeFormat=normalized;serieskey=include
application/vnd.csv;version=1.0.0;labels=both

application/vnd.csv-ts
application/vnd.csv-ts;version=1.0.0;
application/vnd.csv-ts;version=1.0.0;labels=name
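
The examples above all follow the same shape: a media type followed by optional name=value arguments separated by semicolons. A small parser such as the following can split a header back into those parts, for instance when logging or testing client code. It is an illustrative sketch with a hypothetical function name, not the parser used by the Fusion Registry.

```python
def parse_vnd_header(header: str) -> tuple[str, dict[str, str]]:
    """Split an Accept header value into its media type and its arguments."""
    media_type, *raw_arguments = header.split(";")
    arguments: dict[str, str] = {}
    for raw in raw_arguments:
        raw = raw.strip()
        if not raw:
            # Tolerate the trailing ';' that appears in some of the examples.
            continue
        name, _, value = raw.partition("=")
        arguments[name.strip()] = value.strip()
    return media_type.strip(), arguments


print(parse_vnd_header("application/vnd.csv-ts;version=1.0.0;"))
```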

Data Reporting Template

There are two types of output when converting data to a Data Reporting Template format. In the first, the Data Reporting Template is constructed in the usual way, with the Universe of Data derived from the Dataflow and its related Content Constraints. In the second, the Universe of Data is derived from the dataset being written into the Excel workbook. By default, the Report Template Universe is based on the constraints. To change this behaviour, use the +partial indicator in the VND Header.

Accept Header Description
application/vnd.reporttemplate Excel Report Template pre-populated with the data from a dataset.


The dataset should contain the Provision Agreement reference, to enable the Fusion Registry to determine the Data Provider.
The Excel file will be the same as a Report Template generated via the Reporting Template Web Service, but it will be pre-populated with observation and attribute values from the dataset.

application/vnd.reporttemplate.ACY:BANKING(1.0) This is an extension of application/vnd.reporttemplate that tells the Fusion Registry which Reporting Template to use. It is only required if there is more than one Reporting Template for the Dataflow(s) being written.
application/vnd.reporttemplate;DATA_PROVIDER=ONS This is an extension of application/vnd.reporttemplate that tells the Fusion Registry who the Data Provider is.

The Data Provider's Agency defaults to SDMX. If this is not true, use the syntax DATA_PROVIDER=ACY_ID.ONS

Can be used in conjunction with other VND arguments such as the Report Template identifier.

application/vnd.reporttemplate+partial This outputs the dataset conforming to the layout of the Report Template, but includes only the worksheets and observation cells for which there is data in the dataset. There is no main worksheet.

A Data Provider reference is not necessary; however, information about which Report Template to use can be provided using the .ACY:TEMPLATE_ID(1.0) syntax.
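
The header variants described above can be assembled programmatically. The following is a minimal sketch with a hypothetical helper name. It covers only the forms spelled out in this section. The text does not show exactly where a template reference sits relative to +partial, so this sketch deliberately refuses to combine +partial with other options rather than guess.

```python
from typing import Optional

BASE = "application/vnd.reporttemplate"


def report_template_accept(
    template: Optional[str] = None,
    data_provider: Optional[str] = None,
    partial: bool = False,
) -> str:
    """Build a Report Template Accept header from the documented variants.

    template is a reference such as "ACY:BANKING(1.0)".
    data_provider is an id such as "ONS", or "ACY_ID.ONS" when the
    Data Provider's Agency is not SDMX.
    """
    if partial:
        if template or data_provider:
            # Assumption of this sketch: the combined syntax is not guessed here.
            raise ValueError("this sketch does not combine +partial with other options")
        return BASE + "+partial"
    header = BASE
    if template:
        header += "." + template
    if data_provider:
        header += ";DATA_PROVIDER=" + data_provider
    return header


print(report_template_accept(template="ACY:BANKING(1.0)", data_provider="ONS"))
```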