Fusion-CSV

From Fusion Registry Wiki
Jump to navigation Jump to search

see Data Formats

Overview

The Fusion-CSV Data format predated SDMX-CSV and is not an official SDMX format. It is very similar in that the format is comma separated format where each row describes a single Observation value. The differences are:

  1. Whilst there can be a header row it is not mandatory
  2. There is no column for the Dataflow, in SDMX-CSV the first column must be for the Dataflow
  3. The order of the columns is important, unlike SDMX-CSV where labels are in the same column as the Id concatenated with a colon
  4. Fusion-CSV outputs labels in their own columns as oppose to SDMX-CSV which outputs a id:label pair in a single column concatenated with a semicolon (e.g. A:Annual)
  5. Fusion-CSV does not output columns for data queries if they are not part of the response based on the detail parameter, for example detail=seriesKeysOnly for a data query will only write columns for the Dimensions of the Data Structure in the output CSV, as oppose to SDMX-CSV which outputs a column for all the DSD components, regardless of whether they are used.

Fusion-CSV can be used as both an import and export format for the Fusion Registry, and an export format for the Fusion Edge Server and Fusion Data Browser.

Ordering of columns

Fusion-CSV does not have the same input and output format (which gives another reason to use SDMX-CSV). When Fusion-CSV is output, the columns are output in the following order:

  • Dimensions
  • Dataset Attributes
  • Group Attributes
  • Series Attributes
  • TIME_FORMAT
  • OBS_VALUE
  • Observation Attributes

When reading a Fusion-CSV file the order of the columns is:

  • Dimensions
  • TIME_FORMAT
  • Dataset Attributes
  • Series Attributes
  • Observation Attributes
  • OBS_VALUE

Group Attributes are not supported when reading Fusion-CSV.

Formatting Using Query Parameters

The following URL parameters can be used in a RESTful query for Fusion-CSV data.

Query Parameter Values Description
format csv Required to output the dataset in csv-ts format
delimiter tab | semicolon | space
default is comma
Defines which delimiter to use
labels id or name or both
default id
Defines if labels or ids should be used on output, if both is selected then there are 2 columns per coded component (one for Id one for the label)
serieskey include or exclude
default exclude
If true, a series key column will be included in the output
bom include or exclude (Include or Exclude the Byte Order Mark (BOM).
The BOM helps Excel interpret non Latin characters when opening a CSV file)

The output type "all-lang" will output both id and name and will add an element to the start of each row which will be the language. For each series and each language a row will be output where the name in the row is for the specified language. If there is no name value for that language, the name is simply not output. See below for an example output/

Example https://demo.metadatatechnology.com/FusionRegistry/ws/public/sdmxapi/rest/data/WB,GCI,1.0/GHA.GCI..?format=csv&labels=both&delimiter=tab

Note: The same formatting can be applied using HTTP Accept Headers as opposed to query parameters.

Example Output

An example query using the format request parameters, HTTP Accept Headers can also be used to define the same format.
https://demo.metadatatechnology.com/FusionRegistry/ws/public/sdmxapi/rest/data/WB,GCI,1.0/GHA.GCI..?format=csv&labels=both


An example dataset with IDs only, spaces have been added to this example to assist readability.

REF_AREA, INDICATOR, SUB_INDICATOR, FREQ, TIME_PERIOD, OBS_VALUE
GHA,      GCI,       RANK,          A,    2008,        102
GHA,      GCI,       RANK,          A,    2009,        114
GHA,      GCI,       RANK,          A,    2010,        114
GHA,      GCI,       RANK,          A,    2011,        114
GHA,      GCI,       RANK,          A,    2012,        103
GHA,      GCI,       RANK,          A,    2013,        114
GHA,      GCI,       RANK,          A,    2014,        111

The same dataset in Fusion-CSV with labels included. Note: labels columns are only included if the Dimension, Attribute, or Measure is Coded, if it is not, then only one column is output - this can be seen in the table below where both TIME_PERIOD and OBS_VALUE are only single columns.

REF_AREA, Reference Area, INDICATOR, Indicator,                    SUB_INDICATOR, Sub Indicator, FREQ, Frequency, TIME_PERIOD, OBS_VALUE
GHA,      Ghana,          GCI,       Global Competitiveness Index, RANK,          Rank,          A,    Annual,    2008,        102
GHA,      Ghana,          GCI,       Global Competitiveness Index, RANK,          Rank,          A,    Annual,    2009,        114
GHA,      Ghana,          GCI,       Global Competitiveness Index, RANK,          Rank,          A,    Annual,    2010,        114
GHA,      Ghana,          GCI,       Global Competitiveness Index, RANK,          Rank,          A,    Annual,    2011,        114
GHA,      Ghana,          GCI,       Global Competitiveness Index, RANK:          Rank,          A,    Annual,    2012,        103
GHA,      Ghana,          GCI,       Global Competitiveness Index, RANK,          Rank,          A,    Annual,    2013,        114
GHA,      Ghana,          GCI,       Global Competitiveness Index, RANK,          Rank,          A,    Annual,    2014,        111


An example dataset with the returned detail set to series keys only - fewer columns are written to the response.
https://demo.metadatatechnology.com/FusionRegistry/ws/public/sdmxapi/rest/data/WB,GCI,1.0/GHA.GCI..?format=csv&detail=serieskeysonly

REF_AREA, INDICATOR, SUB_INDICATOR, FREQ
GHA,      GCI,       RANK,          A
GHA,      GCI,       VALUE,         A    

The same dataset in Fusion-CSV with the label type specified as all-lang. Note how 6 lines below only represent 1 series (GHA:GCI:RANK:A) and 2 observations 2008 and 2009. This example demonstrates where there are 3 languages in use (en, fr and de), that there is a French name for GCI, but not a German name and that RANK has a name in en, fr and de.

LANG, REF_AREA, Reference Area, INDICATOR, Indicator,                        SUB_INDICATOR, Sub Indicator, FREQ, Frequency, TIME_PERIOD, OBS_VALUE
en,   GHA,      Ghana,          GCI,       Global Competitiveness Index,     RANK,          Rank,          A,    Annual,    2008,        102
fr,   GHA,      Ghana,          GCI,       Indice de compétitivité mondiale, RANK,          Rang,          A,    Annual,    2008,        102
de,   GHA,      Ghana,          GCI,                                       , RANK,          Rang (de),     A,    Annual,    2008,        102
en,   GHA,      Ghana,          GCI,       Global Competitiveness Index,     RANK,          Rank,          A,    Annual,    2009,        114
fr,   GHA,      Ghana,          GCI,       Indice de compétitivité mondiale, RANK,          Rang,          A,    Annual,    2009,        114
de,   GHA,      Ghana,          GCI,                                       , RANK,          Rang (de),     A,    Annual,    2009,        114