Data Availability Web Service

From Fusion Registry Wiki

Latest revision as of 06:51, 16 February 2022


Overview

The Data Availability API is used to determine what Dimension values exist for specific Dataflows, and can take into account the state of the filters currently applied to the data query. This is very useful in supporting the use case of a data query builder which prevents a user from selecting a combination of Dimension filters that results in no data being returned.

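The API specification was updated in SDMX v3.0.0 to align with other changes made to the data query. This page documents the v1 specification (https://github.com/sdmx-twg/sdmx-rest/blob/v1.5.0/v2_1/ws/rest/docs/4_6_1_other_queries.md) and the v2 specification (https://github.com/sdmx-twg/sdmx-rest/blob/master/doc/availability.md) of the API.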

Example API Version 2.0.0

This API requires Fusion Registry 11 and has some extensions that are not part of the specification.

Data Availability across all Dataflows

This is not an official part of the specification, but it is useful because it provides information about every Dataflow in the system which has data, and enables the system to link every Code back to one or more Dataflows.

List all Dataflows which have data

FusionRegistry/sdmx/v2/availability/dataflow

Include the series count

FusionRegistry/sdmx/v2/availability/dataflow/?includeMetrics

Include Codelists and Concept Schemes

FusionRegistry/sdmx/v2/availability/dataflow/?includeMetrics&references=descendants&

Note that in the above query the Codelists and Concept Schemes contain only the items that have data related to them. For example, if a Codelist has 8000 Codes but only 300 have data, the response contains only those 300 Codes. The exception is a Code that has no data itself but is the parent of a Code that does have data: it is included in the response message.
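The pruning rule described in the note can be illustrated on the client side. The following is a minimal sketch, not the Registry's own implementation: the function name, the parent map, and the example codes are all hypothetical, and it only assumes that each Code has at most one parent.

```python
def prune_codelist(parents, codes_with_data):
    """Return the ids of codes to keep: codes with data plus all their ancestors.

    parents maps each code id to its parent id (None for a top-level code).
    This mirrors the rule in the note; it is an illustration only.
    """
    keep = set()
    for code in codes_with_data:
        # Walk up the hierarchy until the root or an already-kept code.
        while code is not None and code not in keep:
            keep.add(code)
            code = parents.get(code)
    return keep


# Hypothetical codelist: EU and ASIA are parents, only DE has data.
parents = {"EU": None, "DE": "EU", "FR": "EU", "ASIA": None, "JP": "ASIA"}
print(sorted(prune_codelist(parents, {"DE"})))
```

Here EU is kept even though it has no data of its own, because it is the parent of DE. FR, ASIA and JP are dropped.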


Data Availability for a specific Dataflow

A Single Dataflow

FusionRegistry/sdmx/v2/availability/dataflow/ECB/BKN/1.0

The references and includeMetrics parameters can also be used with this query.


Multiple Arguments (comma separated)

FusionRegistry/sdmx/v2/availability/dataflow/ECB/BKN,BSI/1.0
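The URL patterns in this section can be assembled programmatically. The helper below is a sketch only: the function name and its arguments are hypothetical, the relative base "FusionRegistry" is copied from the examples on this page, and includeMetrics is written as a bare flag because that is how the examples show it.

```python
def dataflow_availability_url(agency, dataflow_ids, version,
                              include_metrics=False, references=None,
                              base="FusionRegistry"):
    """Build an availability URL for one or more Dataflows of one agency."""
    # Multiple Dataflow ids are comma separated, as in ECB/BKN,BSI/1.0.
    ids = ",".join(dataflow_ids)
    path = f"{base}/sdmx/v2/availability/dataflow/{agency}/{ids}/{version}"
    params = []
    if include_metrics:
        params.append("includeMetrics")  # bare flag, as in the page's examples
    if references is not None:
        params.append(f"references={references}")
    return path + ("/?" + "&".join(params) if params else "")


print(dataflow_availability_url("ECB", ["BKN"], "1.0"))
print(dataflow_availability_url("ECB", ["BKN", "BSI"], "1.0"))
print(dataflow_availability_url("ECB", ["BKN"], "1.0",
                                include_metrics=True, references="descendants"))
```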

Determine which Codes remain valid based on Data Query state

FusionRegistry/sdmx/v2/availability/dataflow/ECB/EXR/1.0/?c[EXR_SUFFIX]=A&c[FREQ]=A,D&mode=available

The data query state (the filters) is applied after the dataflow path parameters, and must conform to version 2.0.0 of the REST API (https://github.com/sdmx-twg/sdmx-rest/blob/master/doc/data.md), which is part of the SDMX 3.0 release. Custom extensions to the path of the API are also supported.
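A query string of this shape can be built from a mapping of Dimension ids to selected Codes. This is a sketch under stated assumptions: the function name is hypothetical, and the square brackets and commas are left unencoded to match the example above, whereas some HTTP clients will percent-encode them.

```python
def filtered_availability_url(flow_path, filters, mode="available"):
    """Append c[DIMENSION]=code,code filters and a mode to a dataflow path.

    filters maps a Dimension id to the list of Codes selected for it.
    """
    parts = [f"c[{dim}]={','.join(codes)}" for dim, codes in filters.items()]
    parts.append(f"mode={mode}")
    return f"{flow_path}/?" + "&".join(parts)


url = filtered_availability_url(
    "FusionRegistry/sdmx/v2/availability/dataflow/ECB/EXR/1.0",
    {"EXR_SUFFIX": ["A"], "FREQ": ["A", "D"]},
)
print(url)
```

The printed URL is the same as the example query in this section.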

The mode=available query parameter tells the web service to return all the values which remain valid selections based on the current query state. A valid code selection is one which can be added to the current query and will still result in at least one series being returned. This does not take into account filters in the Time dimension, so it will not eliminate all possibilities of the user retrieving no data, but the chance is greatly reduced.

If mode=exact is used instead, the response only includes values which will be in the response Dataset. In the above example this would result in an EXR_SUFFIX of A, as the user has defined this as the only EXR_SUFFIX they want in the response. The service may know that other EXR_SUFFIX values could be selected in addition to this, but that information is not returned when the mode is exact.
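The difference between the two modes can be pictured with a toy in-memory model. This sketch is one interpretation of the behaviour described above, not the service's algorithm: the series list is invented, the function names are hypothetical, and it assumes that in available mode each Dimension is evaluated with its own filter ignored while the filters on the other Dimensions still apply.

```python
def matches(series, filters, skip=None):
    """True if the series satisfies every filter except the one on `skip`."""
    return all(series[dim] in allowed
               for dim, allowed in filters.items() if dim != skip)


def availability(all_series, filters, mode):
    """Per-Dimension values for mode 'available'; any other mode is strict."""
    result = {}
    for dim in all_series[0]:
        # In available mode a Dimension's own filter is ignored, so codes
        # that could still be added to the selection are reported.
        skip = dim if mode == "available" else None
        result[dim] = sorted({s[dim] for s in all_series
                              if matches(s, filters, skip)})
    return result


# Invented series keys, reduced to two Dimensions for brevity.
series = [
    {"FREQ": "A", "EXR_SUFFIX": "A"},
    {"FREQ": "D", "EXR_SUFFIX": "A"},
    {"FREQ": "D", "EXR_SUFFIX": "E"},
    {"FREQ": "M", "EXR_SUFFIX": "E"},
]
filters = {"EXR_SUFFIX": {"A"}, "FREQ": {"A", "D"}}

print(availability(series, filters, "exact"))
print(availability(series, filters, "available"))
```

In this toy data the strict mode reports only EXR_SUFFIX A, while available mode also reports E, because a series with suffix E exists for one of the selected frequencies.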

Example API Version 1.5.0

Availability of EUR exchange rate data for Danish Krone (from ECB EXR)

https://demo.metadatatechnology.com/FusionRegistry/ws/public/sdmxapi/rest/availableconstraint/EXR/.DKK...?references=none&mode=exact

The response is a content constraint which describes the values for each dimension that have data, given that CURRENCY is fixed to DKK in the query. In particular it shows there are series for frequencies A, Q, D, H and M. There’s also a ‘series_count’ annotation which tells us the query would return 8 series if run.

Retrieve valid code selections based on the current query

https://demo.metadatatechnology.com/FusionRegistry/ws/public/sdmxapi/rest/availableconstraint/EXR/.DKK...?references=none&mode=available

Switching from mode=exact to mode=available changes the response: it no longer shows what data will come back from the query, but instead shows which code options remain valid for all the Dimensions of the DSD. A valid choice is one which will result in data being returned in the context of the current query. This is useful to prevent a user from building a data query which results in 'no data' due to an invalid combination of dimension selections.

Availability of EUR exchange rate data for the Egyptian Pound (ECB EXR again)

https://demo.metadatatechnology.com/FusionRegistry/ws/public/sdmxapi/rest/availableconstraint/EXR/.EGP...?references=none&mode=exact

This is the same query as the Danish Krone example, but this time querying for EUR vs EGP exchange rates. Here only series with frequencies A, Q and M are available, and the ‘series_count’ annotation tells us we would get 3 series.
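The version 1 URLs in these examples can be generated from a Dimension selection. The sketch below rests on assumptions: the function names are hypothetical, and the Dimension order for EXR is assumed, consistent with CURRENCY being the second position in the keys above. An empty position in the key is a wildcard, and multiple Codes for one Dimension are joined with "+", as in SDMX REST series keys.

```python
# Assumed Dimension order for the EXR Dataflow (not stated on this page).
EXR_DIMENSIONS = ["FREQ", "CURRENCY", "CURRENCY_DENOM", "EXR_TYPE", "EXR_SUFFIX"]

BASE = "https://demo.metadatatechnology.com/FusionRegistry/ws/public/sdmxapi/rest"


def series_key(dimension_order, selections):
    """Build a dot-separated key; unselected Dimensions are left empty."""
    return ".".join("+".join(selections.get(dim, []))
                    for dim in dimension_order)


def available_constraint_url(base, flow, key, mode="exact", references="none"):
    """Build a version 1 availableconstraint URL for a Dataflow and key."""
    return f"{base}/availableconstraint/{flow}/{key}?references={references}&mode={mode}"


key = series_key(EXR_DIMENSIONS, {"CURRENCY": ["EGP"]})
print(key)
print(available_constraint_url(BASE, "EXR", key))
```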