Difference between revisions of "Data Query Web Service"

From Fusion Registry Wiki
Jump to navigation Jump to search
(Extended HTTP Query Parameters)
(Joining Dataflows)
Line 237: Line 237:
  
 
= Joining Dataflows =
 
= Joining Dataflows =
 +
'''Requires Fusion Registry v11.0.0 or higher.'''
 +
 
A special use case, and extension of the SDMX API, is the ability to perform a join across 2 or more Dataflows.  The result is a single dataset with each series containing the combination of Dimensions for all Dataflows in the Join.  If any Dimensions did not exist in the original series, it will have the value of _Z (not applicable).
 
A special use case, and extension of the SDMX API, is the ability to perform a join across 2 or more Dataflows.  The result is a single dataset with each series containing the combination of Dimensions for all Dataflows in the Join.  If any Dimensions did not exist in the original series, it will have the value of _Z (not applicable).
  

Revision as of 07:07, 25 May 2022

Overview

The Data Query web service offers both a RESTful GET API which conforms the SDMX Specification, and a RESTful POST API which supports advanced features such as joining mulitple datasets and selecting lists of series. Both the GET and POST web service support the SDMX standard parameters such as startPeriod and endPeriod, in addition to this, both APIs support advanced parameters to support features such as paging and calculations.

Both REST and POST APIs offer the same as valid reponse formats, this can be controlled via the format data query parameter, or by using the HTTP Accept Header.

This document describes the Syntax of both the GET and POST API. Both APIs share the same Header Parameters and Query Parameters, as such these sections are applicable to both APIs.

REST GET

Fusion Registry 10 supports only the legacy entry point (supporting the API specific at version 1.5.0).

Fusion Registry 11 supports both the legacy entry point and sdmx/v2 entry point which supports version 2.0.0 of the API specification.

The query parameters on this page are supported by both entry points.

Entry Point (legacy) /ws/public/sdmxapi/rest
Entry Point (SDMX 3.0 using v2 API spec) /sdmx/v2
Access Public
Http Method GET
Compression gzip supported
Response Format Determined by Accept Header - default SDMX 2.1 Structure Specific Data
Response Statuses

200 - Ok

400 - Bad URL Syntax

500 - Server Error

HTTP Headers

The Accept Header is used to define the output format, to transform the data to. The supported accept header values are

In addition, the following optional header parameters can be used to provide further details on the incoming dataset. If these details are not provided, the Fusion Registry will interrogate the dataset header to get the information. If the dataset is a non-SDMX format, or does not contain the required information in the header, then an error response will be returned.

HTTP Header Purpose Allowed Values
Accept This can be used to define the response format of the data See Accept Headers
Accept-Language

This optional header can be used to set the locale to return any multilingual text in (names and descriptions). If the text does not exist in the specified locale, then the default rules will be applied to find the next best appropriate locale.

The corresponding locale parameter can be used to override this HTTP Header

Any valid locale, example: en


SDMX API v2 Support

From the version 2.0.0 SDMX API specification

Component Filters

These appear as query parameters after the path. The parameters are in the syntax c[COMPONENT_ID]=COMPONENT_VALUE, for example:

 data/dataflow/ECB/EXR/1.0?c[FREQ]=A

Note 1: some web containers may not support the [ and ] characters in the REST API, these can be encoded with %5B and %5D respectively.

 data/dataflow/ECB/EXR/1.0?c%5BFREQ%5D=A

Functions

The following table shows the supported function from the SDMX specification

Function Meaning Supported
eq Equals Supported (default if not specified)
ne Not equal to Supported
lt Less than Supported only on TIME_PERIOD component
le Less than or equal to Supported only on TIME_PERIOD component
gt Greater than Supported only on TIME_PERIOD component
ge Greater than or equal to Supported only on TIME_PERIOD component
co Contains Not Supported
nc Does not contain Not Supported
sw Starts with Not Supported
se Ends with Not Supported

Series Key Filters

The second way to filter data is by providing one or more series keys in the path, for example

 data/dataflow/ECB/EXR/1.0/M.RON.EUR.SP00.E

Additional series key can be provided using the comma to separate

  data/dataflow/ECB/EXR/1.0/M.RON.EUR.SP00.E,D.RON.EUR.SP00.E

Or wider cubes can be selected by using the * to mean 'all values' for that Dimension

  data/dataflow/ECB/EXR/1.0/M.*.EUR.SP00.E

Extended Series Key Filters

Fusion software extends the series Key filter by supporting the plus operator to select more then one value for a Dimension (in keeping with the previous v1 SDMX specification)

  data/dataflow/ECB/EXR/1.0/M.RON+AUD+GBP.EUR.SP00.E

Standard HTTP Query Parameters

The standard query parameters are supported by both GET and POST queries, and are part of the official SDMX 1.5.0 API.

Request Parameter Purpose Allowed Values
Parameters to restrict observations per series
startPeriod Only return observations if the observation time if after the time specified Any valid SDMX Time Format
endPeriod Only return observations if the observation time if before the time specified Any valid SDMX Time Format
lastNObservations The last ‘n’ observations to return for each matched series number
firstNObservations The first ‘n’ observations to return for each matched series number
updatedAfter Only return data if it was updated after the time specified Any valid SDMX Time Format
Parameters to refine content
detail Specifies the amount of information to be returned

full (default) – return the complete dataset
dataonly – exclude attributes and groups
serieskeysonly – only include series, no observations
nodata – returns the groups and series, including attributes and annotations, without observations

includeHistory When observation values have been revised over time, this parameter is used to include previous revisions true/false
Parameters to change layout
dimensionAtObservation Groups the data by one of the Dimensions of the DSD

AllDimensions - used to request the observations are not grouped by any dimension
'dimension id' - used to group the observations by the specified dimenison id must be a valid Dimension Id for the DSD being queried
Note: This parameter is not supported by all output formats

Extended HTTP Query Parameters

The Extended query parameters are supported by both GET and POST queries, and are not part of the official SDMX API.

Additional format specific parameters can be found in the data format documentation.


Request Parameter Purpose Allowed Values
Parameters to modify the Header for SDMX data formats (since v10.8.6)
datasetAction Dataset Action append, replace, delete
datasetId ID of the dataset string
refId ID of the file string
recieverId ID of the reciever string
Parameters to restrict observations per series
obsLimit Limit with error if exceeded. Max number of obs that can be returned. HTTP response code of 413 (request entity too large) is returned if the query exceeds the limit number
seriesLimit Limit with error if exceeded. Max number of series that can be returned. HTTP response code of 413 (request entity too large) is returned if the query exceeds the limit number
max Limit for pagination, specifies the limit for the number of series, any additional series over this limit will not be returned number
offset Use with max to support pagination, the offset defines which series match to start writing number
Parameters for saving content
saveAs If provided the response will be saved to a file with the given name. The file name does not need to include the format postfix, however if .zip

is the type, the reponse will be in zip format

string
prettyPrint XML formats only, if true the XML will be formatted for easy of readability true/false
Parameters for language
locale Alternative to using the HTTP Accept-Language header. Locale is used to define which language to export labels in.

If the metadata does not include a label in the requested language, and alternative language is selected by the server.

For CSV formats the locale is also used to format numerical observation values

string
Meta information about the dataset
lastUpdated Use this to include an additional attribute for each Observation which indicates the last time the observation was updated. The attribute ID is the same as the argument value. Example lastUpdate=LAST_UPDATED will create a new attribute on the observation called LAST_UPDATED (LAST_UPDATED="2020-01-15T16:25:13.184") string
Calculating and transforming values
calculatemeasure

Creates a new Dimension to support the output of new calculated measures. Supported calculations:

  • Moving Average, syntax: mvavg,[number of periods],linear where the linear parameter is optional, if provided missing values will be interpolated using this function
  • Percentage Change, syntax: pch

Syntax: [Measure ID];[calculation 1];[calculation 2];[...]
Example: calculatemeasure=NEW_MEASURE;mvavg;pch

aggregate Removes one or more Dimensions from the output, collapsing multiple series and aggregating the values. The default behaviour of removing the specified Dimensions can be reversed (to keep only the specified Dimensions) by using the keep argument. The default aggregation function is sum if no argument is provided Syntax: [DIM_ID ARRAY;AGGREAGATION FUNCTION;KEEP DIMENSIONS]

Example: FREQ,REF_AREA;SUM;keep
Aggregation Functions: sum, mean, median, min, max, stddev, stddevsample, var, varsample, count

completecube Fills in missing values for each series based on the start and end period of the data query for example if the data query states startPeriod=2008 but the data for one series starts at 2010, this manager will create a writer which completes the cube by either filling in with blank values, or one which is passed in on the parameter the value to complete the cube with:
'completecube=0' fills missing values with zeros
'completecube=' fills missing values with blanks
'completecube=nan' fills missing values with the string 'nan'
interpolate Derive values for periods with no data (uses linear interpolation) linear
normalisefreq Output all series in the given Frequency, even if the data is held in a different frequency, the aggregation function is used to convert frequencies

Syntax: [FREQUENCY;AGGREGATION FUNCTION]
Example: normalisefreq=A;SUM
Aggregation Functions: sum, mean, median, min, max, stddev, stddevsample, var, varsample, count, firstperiod, lastperiod, midperiod


round Round observations to x decimal places. The default behaviour is to ensure all observations meet the given number of decimal places by either rounding the excess, or filling missing with zero values. For example rounding 1.232 to 2 would return 1.23, however rounding to 4 would return 1.2320. The default rounding behaviour can be changed by using the maxdp argument, which will not pad the decimals with zero values if they are shorter then the given number of decimal places

Syntax: round=[decimal places;ROUND_FUNCTION;maxdp]
Example: round=2;halfup;maxdp
Rounding Functions: ceil, floor, halfeven, halfup, halfdown, up, down

Joining Dataflows

Requires Fusion Registry v11.0.0 or higher.

A special use case, and extension of the SDMX API, is the ability to perform a join across 2 or more Dataflows. The result is a single dataset with each series containing the combination of Dimensions for all Dataflows in the Join. If any Dimensions did not exist in the original series, it will have the value of _Z (not applicable).

To join Dataflows, query a specifically Virtual Dataflow in the Fusion Registry

/sdmx/v2/data/dataflow/FUSION/VIRTUAL/1.0 

The sub arguments to this are the list of series keys

/sdmx/v2/data/dataflow/FUSION/VIRTUAL/1.0/BKN.M.GR.NC10.B.10P1.AS.S.Q,BSI.Q.EE.N.A.A30.A.1.U2.2000.Z04.E

Note that each Series key has a prefix, this is an identifier which is referenced later to tell the system what the Dataflow is for that series key

/sdmx/v2/data/dataflow/FUSION/VIRTUAL/1.0/BKN.M.GR.NC10.B.10P1.AS.S.Q,BSI.Q.EE.N.A.A30.A.1.U2.2000.Z04.E?dataflow=ECB:BKN(1.0),ECB:BSI(1.0)

The system will match the IDs of the Dataflows with the first part of each series key, and then query each Dataflow respectively.