Fusion-CSV

From Fusion Registry Wiki

Revision as of 00:22, 20 October 2020

Overview

The Fusion-CSV data format predates SDMX-CSV and is not an official SDMX format. The two are very similar: both are comma-separated formats in which each row describes a single Observation value. The differences are:

  1. A header row is allowed but is not mandatory
  2. There is no column for the Dataflow, whereas in SDMX-CSV the first column must be for the Dataflow
  3. The order of the columns is important, unlike in SDMX-CSV, where labels are in the same column as the Id, concatenated with a colon
  4. Fusion-CSV outputs labels in their own columns, as opposed to SDMX-CSV, which outputs an id:label pair in a single column, concatenated with a colon (e.g. A:Annual)
  5. Fusion-CSV only outputs columns that are part of the response, as determined by the detail parameter of the data query. For example, detail=seriesKeysOnly will only write columns for the Dimensions of the Data Structure in the output CSV, as opposed to SDMX-CSV, which outputs a column for all the DSD components, regardless of whether they are used.
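Difference 4 can be illustrated with a minimal sketch that splits combined id:label cells, such as A:Annual, into the separate id and label columns that Fusion-CSV uses. The helper names split_id_label and to_fusion_columns are hypothetical, and the sketch assumes the id itself never contains a colon.

```python
def split_id_label(cell):
    """Split an 'id:label' cell into (id, label).

    Assumption: the id contains no colon, so the first colon is the separator.
    A cell without a colon is treated as an id with an empty label.
    """
    id_part, _, label = cell.partition(":")
    return id_part.strip(), label.strip()


def to_fusion_columns(cells):
    """Expand each id:label cell into two adjacent columns (id, then label)."""
    out = []
    for cell in cells:
        out.extend(split_id_label(cell))
    return out


print(to_fusion_columns(["GHA:Ghana", "A:Annual"]))
```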

Fusion-CSV can be used as both an import and export format for the Fusion Registry, and an export format for the Fusion Edge Server and Fusion Data Browser.

Formatting Using Query Parameters

The following URL parameters can be used in a RESTful query for Fusion-CSV data.

  • format = csv
  • delimiter = comma | tab | semicolon | space (comma is default)
  • labels = id | name | both | all-lang (id is default)
  • bom = include | exclude (include or exclude the Byte Order Mark (BOM);
    the BOM helps Excel interpret non-Latin characters when opening a CSV file)

The output type "all-lang" outputs both id and name, and adds an element to the start of each row giving the language. For each series and each language, a row is output in which the names are in the specified language. If there is no name for that language, the name is left empty. See below for an example output.

Example https://demo.metadatatechnology.com/FusionRegistry/ws/public/sdmxapi/rest/data/WB,GCI,1.0/GHA.GCI..?format=csv&labels=both&delimiter=tab
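As a minimal sketch, a query URL like the one above can be assembled from the parameters listed in this section. The function name fusion_csv_url and the BASE constant are hypothetical; only the parameter names and values come from the list above.

```python
from urllib.parse import urlencode

# Base data endpoint taken from the example URL on this page.
BASE = "https://demo.metadatatechnology.com/FusionRegistry/ws/public/sdmxapi/rest/data"

LABELS = {"id", "name", "both", "all-lang"}
DELIMITERS = {"comma", "tab", "semicolon", "space"}


def fusion_csv_url(flow, key, labels="id", delimiter="comma", bom=None):
    """Build a data query URL that requests Fusion-CSV output."""
    if labels not in LABELS:
        raise ValueError(f"unsupported labels value: {labels}")
    if delimiter not in DELIMITERS:
        raise ValueError(f"unsupported delimiter value: {delimiter}")
    params = {"format": "csv", "labels": labels, "delimiter": delimiter}
    if bom is not None:
        if bom not in {"include", "exclude"}:
            raise ValueError(f"unsupported bom value: {bom}")
        params["bom"] = bom
    return f"{BASE}/{flow}/{key}?{urlencode(params)}"


print(fusion_csv_url("WB,GCI,1.0", "GHA.GCI..", labels="both", delimiter="tab"))
```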

Note: The same formatting can be applied using HTTP Accept Headers as opposed to query parameters.

Example Output

An example query using the format request parameters. HTTP Accept Headers can also be used to define the same format.
https://demo.metadatatechnology.com/FusionRegistry/ws/public/sdmxapi/rest/data/WB,GCI,1.0/GHA.GCI..?format=csv&labels=both


An example dataset with IDs only. Spaces have been added to this example to assist readability.

REF_AREA, INDICATOR, SUB_INDICATOR, FREQ, TIME_PERIOD, OBS_VALUE
GHA,      GCI,       RANK,          A,    2008,        102
GHA,      GCI,       RANK,          A,    2009,        114
GHA,      GCI,       RANK,          A,    2010,        114
GHA,      GCI,       RANK,          A,    2011,        114
GHA,      GCI,       RANK,          A,    2012,        103
GHA,      GCI,       RANK,          A,    2013,        114
GHA,      GCI,       RANK,          A,    2014,        111

The same dataset in Fusion-CSV with labels included. Note: label columns are only included if the Dimension, Attribute, or Measure is Coded. If it is not, only one column is output; this can be seen in the table below, where TIME_PERIOD and OBS_VALUE each occupy a single column.

REF_AREA, Reference Area, INDICATOR, Indicator,                    SUB_INDICATOR, Sub Indicator, FREQ, Frequency, TIME_PERIOD, OBS_VALUE
GHA,      Ghana,          GCI,       Global Competitiveness Index, RANK,          Rank,          A,    Annual,    2008,        102
GHA,      Ghana,          GCI,       Global Competitiveness Index, RANK,          Rank,          A,    Annual,    2009,        114
GHA,      Ghana,          GCI,       Global Competitiveness Index, RANK,          Rank,          A,    Annual,    2010,        114
GHA,      Ghana,          GCI,       Global Competitiveness Index, RANK,          Rank,          A,    Annual,    2011,        114
GHA,      Ghana,          GCI,       Global Competitiveness Index, RANK,          Rank,          A,    Annual,    2012,        103
GHA,      Ghana,          GCI,       Global Competitiveness Index, RANK,          Rank,          A,    Annual,    2013,        114
GHA,      Ghana,          GCI,       Global Competitiveness Index, RANK,          Rank,          A,    Annual,    2014,        111
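The table above can be read with a standard CSV parser. The following is a minimal sketch, assuming a header row is present, the default comma delimiter is used, and the padding spaces shown above (added only for readability) are absent from real output. The function name read_observations is hypothetical.

```python
import csv
import io

# Two rows from the labelled example above, without the readability padding.
sample = (
    "REF_AREA,Reference Area,INDICATOR,Indicator,SUB_INDICATOR,Sub Indicator,"
    "FREQ,Frequency,TIME_PERIOD,OBS_VALUE\n"
    "GHA,Ghana,GCI,Global Competitiveness Index,RANK,Rank,A,Annual,2008,102\n"
    "GHA,Ghana,GCI,Global Competitiveness Index,RANK,Rank,A,Annual,2009,114\n"
)


def read_observations(text):
    """Return (area id, area label, period, value) for each observation row."""
    rows = csv.DictReader(io.StringIO(text))
    return [
        (r["REF_AREA"], r["Reference Area"], r["TIME_PERIOD"], float(r["OBS_VALUE"]))
        for r in rows
    ]


for obs in read_observations(sample):
    print(obs)
```

If the response was requested with bom=include, decoding the bytes with Python's "utf-8-sig" codec strips the Byte Order Mark before parsing.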


An example dataset with the returned detail set to series keys only - fewer columns are written to the response.
https://demo.metadatatechnology.com/FusionRegistry/ws/public/sdmxapi/rest/data/WB,GCI,1.0/GHA.GCI..?format=csv&detail=serieskeysonly

REF_AREA, INDICATOR, SUB_INDICATOR, FREQ
GHA,      GCI,       RANK,          A
GHA,      GCI,       VALUE,         A    
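Because this output contains only Dimension columns, each row can be turned back into a dot-separated series key of the kind used in the query path (for example GHA.GCI..). The sketch below assumes a header row is present and that the columns appear in Dimension order; the function name series_keys is hypothetical.

```python
import csv
import io

# The series-keys-only example above, without the readability padding.
sample = (
    "REF_AREA,INDICATOR,SUB_INDICATOR,FREQ\n"
    "GHA,GCI,RANK,A\n"
    "GHA,GCI,VALUE,A\n"
)


def series_keys(text):
    """Return the header and one dot-separated series key per data row."""
    reader = csv.reader(io.StringIO(text))
    header = next(reader)  # assumption: the optional header row is present
    keys = [".".join(row) for row in reader if row]
    return header, keys


header, keys = series_keys(sample)
print(header)
print(keys)
```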

The same dataset in Fusion-CSV with the label type specified as all-lang. Note that the 6 lines below represent only 1 series (GHA:GCI:RANK:A) and 2 observations (2008 and 2009). This example shows 3 languages in use (en, fr and de): GCI has a French name but no German name, while RANK has a name in en, fr and de.

LANG, REF_AREA, Reference Area, INDICATOR, Indicator,                        SUB_INDICATOR, Sub Indicator, FREQ, Frequency, TIME_PERIOD, OBS_VALUE
en,   GHA,      Ghana,          GCI,       Global Competitiveness Index,     RANK,          Rank,          A,    Annual,    2008,        102
fr,   GHA,      Ghana,          GCI,       Indice de compétitivité mondiale, RANK,          Rang,          A,    Annual,    2008,        102
de,   GHA,      Ghana,          GCI,                                       , RANK,          Rang (de),     A,    Annual,    2008,        102
en,   GHA,      Ghana,          GCI,       Global Competitiveness Index,     RANK,          Rank,          A,    Annual,    2009,        114
fr,   GHA,      Ghana,          GCI,       Indice de compétitivité mondiale, RANK,          Rang,          A,    Annual,    2009,        114
de,   GHA,      Ghana,          GCI,                                       , RANK,          Rang (de),     A,    Annual,    2009,        114
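Since all-lang output repeats every observation once per language, a consumer usually needs to collapse the rows again. The following is a minimal sketch, assuming a header row, the default comma delimiter, and no readability padding; the function names are hypothetical.

```python
import csv
import io

# The all-lang example above, without the readability padding.
HEADER = ("LANG,REF_AREA,Reference Area,INDICATOR,Indicator,SUB_INDICATOR,"
          "Sub Indicator,FREQ,Frequency,TIME_PERIOD,OBS_VALUE\n")
sample = HEADER + "".join(
    f"{lang},GHA,Ghana,GCI,{name},RANK,{rank},A,Annual,{period},{value}\n"
    for period, value in (("2008", "102"), ("2009", "114"))
    for lang, name, rank in (
        ("en", "Global Competitiveness Index", "Rank"),
        ("fr", "Indice de compétitivité mondiale", "Rang"),
        ("de", "", "Rang (de)"),  # no German name for GCI, so the cell is empty
    )
)

DIMENSIONS = ("REF_AREA", "INDICATOR", "SUB_INDICATOR", "FREQ")


def distinct_observations(text):
    """Map (series key, period) to the observation value, ignoring language."""
    result = {}
    for row in csv.DictReader(io.StringIO(text)):
        key = ":".join(row[d] for d in DIMENSIONS)
        result[(key, row["TIME_PERIOD"])] = row["OBS_VALUE"]
    return result


def indicator_names(text):
    """Map each INDICATOR id to its names by language, skipping empty names."""
    names = {}
    for row in csv.DictReader(io.StringIO(text)):
        if row["Indicator"]:
            names.setdefault(row["INDICATOR"], {})[row["LANG"]] = row["Indicator"]
    return names


print(distinct_observations(sample))
print(indicator_names(sample))
```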