Difference between revisions of "Refreshable Excel data spreadsheets"

From Fusion Registry Wiki
Jump to navigation Jump to search
(More on CSV Formats)
(More on CSV Formats)
Line 119: Line 119:
  
 
=More on CSV Formats=
 
=More on CSV Formats=
 +
====CSV Formats====
 +
Four different basic CSV formats are supported.
 +
{| class="wikitable"
 +
|-
 +
! Format !! Description
 +
|-
 +
| [[Fusion-CSV|Fusion CSV]] || Flat CSV with one observation per row
 +
|-
 +
| [[Fusion-CSV-TS|Fusion CSV Time Series]] || Time Series CSV with one series per row and time periods across the columns
 +
|-
 +
| [[Fusion-CSV-Series|Fusion CSV Vertical Time Series]] || Time Series CSV with one series per column
 +
|-
 +
| [[SDMX-CSV|SDMX CSV]] || The SDMX standard CSV format
 +
|}
 +
 
====IDs and Labels====
 
====IDs and Labels====
 
You can control whether the data returned to the worksheet includes IDs, Labels or both using URL parameters or [[Data_Formats#HTTP_Accept_Headers|HTTP Accept Headers]].
 
You can control whether the data returned to the worksheet includes IDs, Labels or both using URL parameters or [[Data_Formats#HTTP_Accept_Headers|HTTP Accept Headers]].

Revision as of 08:56, 24 June 2020


Overview

Refreshable data spreadsheets can be created using Excel's standard data management features with the data being retrieved directly from Fusion Registry's CSV REST web services.

Principles

Excel has built-in functionality for loading data from a range of sources including the Web which allows data to be retrieved directly from Fusion Registry's REST API. The REST API can provide data in a number of different formats including XML and JSON, but CSV is the best choice being easily interpreted and processed by Excel.

A simple refreshable sheet can be created using a Fusion Registry GET data query URL as the source.
Example https://demo.metadatatechnology.com/FusionRegistry/ws/public/sdmxapi/rest/data/WB,GCI,1.0/GHA.GCI..?format=csv

Basic Steps

  1. Open a new Excel workbook and choose the Data menu
  2. From the ribbon bar choose New Query > From Other Sources > From Web
  3. Enter the Fusion Registry REST URL for the data you are interested in

Excel will connect to the URL and download the data into the worksheet which can be saved and used as normal. The data can be refreshed as needed using Refresh All on the Data ribbon bar, and the Show Queries option opens the query sidebar with more information about the queries and data sources used in the workbook.

Generating data URLs for Excel Web GET queries

Fusion Registry's REST API follows the SDMX REST API GET syntax, with some additional parameters and CSV extensions which are useful here.

URLs can be crafted by hand if you understand the SDMX REST syntax and know what datasets you need. However, there are a couple of other ways to discover data and generate the URLs automatically.

Fusion Data Browser

  1. Launch the Fusion Data Browser
  2. Choose a data set and use the quick filters to select the data of interest.
  3. Choose Export to CSV, but rather than exporting the file use the Query Syntax function to get the URL

The URL for the query can be copied and pasted into Excel.

Note: Check that a GET query is generated. Complex data selections in Fusion Data Browser generate a HTTP POST which requires the Advanced Query Editor in Excel. There's more about this in the section on GET and POST queries

Fusion Registry Web Services Query Builder

  1. From the Fusion Registry user interface, choose Web Service > Data from the left-hand menu bar
  2. Choose the Dataflow and the sub-cube using the query builder
  3. Choose the CSV data format

The URL for the query can be copied and pasted into Excel.

Private Fusion Registry data services and authentication

Excel will need to authenticate with the REST API if the Registry is operating in private mode or Content Security rules are in place to control who has access to what data.

In private mode, Excel will detect authentication is required and will automatically prompt for a username and password. But in other cases, you may need to explicitly set the credentials that Excel will use when calling the API.

  1. In Excel, after entering the URL, choose the Transform Data option on the data preview window - this will show the Power Query Editor
  2. On the Power Query ribbon bar, choose the Data Source Setting option
  3. Choose the new data source from the list and select the Edit Permissions option
  4. Under Credentials choose Edit...
  5. Choose Basic and enter the User name and Password

Note: Excel only supports authentication on GET queries. POST queries will only work if the Fusion Registry API is public.

GET and POST queries

HTTP GET data queries largely follow the standard SDMX syntax and are suitable where regular a regular sub-cube from a single Dataflow is required. For more complex queries Fusion Registry provides a POST option which allows data to be retrieved from multiple Dataflows, and arbitrary selections of series which cannot be described using sub-cube syntax.

GET queries

All of the queries described so far use the HTTP GET method which is the default assumed by Excel's Web data query.

The format parameter or the HTTP Accept header must be set to specify the CSV format in which the data will be returned. While either are supported by Excel, the format parameter approach is generally recommended for simplicity. Use Excel's 'advanced' web query option to set the Accept and other headers.

Format Parameter Accept Header Meaning Format Parameter Example
csv application/vnd.csv Flat format with one row per observation https://demo.metadatatechnology.com/FusionRegistry/ws/public/sdmxapi/rest/data/WB,GCI,1.0/GHA.GCI..?format=csv&labels=both
csv-ts application/vnd.csv-ts Time Series format with one row per series and time periods across the columns https://demo.metadatatechnology.com/FusionRegistry/ws/public/sdmxapi/rest/data/WB,GCI,1.0/GHA.GCI..?format=csv-ts
csv-series application/vnd.csv-series Time Series format with one series per column and time periods down the rows https://demo.metadatatechnology.com/FusionRegistry/ws/public/sdmxapi/rest/data/WB,GCI,1.0/GHA.GCI..?format=csv-series


POST queries

POST queries work by sending JSON which describes the data required (equivalent to a SQL SELECT statement) to the Fusion Registry REST API using the HTTP POST method. In Excel, POST queries need to be coded in Microsoft's Power Query M formular language using the Power Query Editor advanced editing function.

The following example selects nine specific series identified by their series keys from the ECB's Exchange Rate dataset:

let
content = "{
     ""queries"": [{
     ""obj"": ""DataQueryDef"", 
     ""series"": [
       ""A:E5:PLN:ERD0:A"", 
       ""A:E5:PLN:ERC0:A"", 
       ""A:E5:MXN:EN00:A"", 
       ""A:E5:NOK:ERU1:A"",
       ""M:CZK:EUR:SP00:A"",
       ""M:CAD:EUR:SP00:A"",
       ""A:CZK:EUR:SP00:A"",
       ""A:CAD:EUR:SP00:E"",
       ""A:SEK:EUR:SP00:E""],
     ""dataflow"": {
       ""id"": ""EXR"",
       ""agencyId"": ""ECB"",
       ""version"": ""1.0""
     },
     ""type"": ""series""}]
   }",    
   Source = Csv.Document(Web.Contents("https://demo.metadatatechnology.com/FusionRegistry/ws/public/sdmxapi/rest/data", [Headers=[Accept="application/vnd.csv;labels=both", #"Content-Type"="application/json"],Content=Text.ToBinary(content)]
   ),[Delimiter=",", Columns=31, Encoding=65001, QuoteStyle=QuoteStyle.None]),
   #"Promoted Headers" = Table.PromoteHeaders(Source, [PromoteAllScalars=true])
in
   #"Promoted Headers"

To build a refreshable spreadsheet using this code:

  1. Create an new Excel worksheet and from the Data ribbon bar choose New Query > From Other Sources > Blank Query
  2. The Power Query Editor should be displayed - in the Home ribbon bar choose Advanced Editor and paste in the M code

Power Query should display a preview of the data. Choose Close and Load to complete the process and load the entire query result into the worksheet.

Generating the Microsoft M code for a POST query

Fusion Data Browser generated POST query JSON

Complex selections of series in the Fusion Data Browser will generate the JSON needed for the M code, in particular where lists of specific series are chosen or when using Series Baskets.

  1. Launch the Fusion Data Browser
  2. Choose the required datasets and series using Series Baskets if needed
  3. Choose Export to CSV, but rather than exporting the file use the Query Syntax function to generate the POST JSON
  4. Build the M code by hand setting the value of the content variable to the JSON, setting the URL of the Fusion Registry REST API in the Web.Contents function, and similarly setting the Accept header to the CSV data format required

Note: Double-quotes in the JSON need to be escaped in the M code by replacing each single " mark in the JSON with two marks: ""
Raw JSON

"agencyId": "ECB"

JSON in the M Code

""agencyId"": ""ECB""

Fusion Data Browser generated M code (forthcoming feature)

A planned Fusion Data Browser feature will allow the M code to be generated directly.

More on CSV Formats

CSV Formats

Four different basic CSV formats are supported.

Format Description
Fusion CSV Flat CSV with one observation per row
Fusion CSV Time Series Time Series CSV with one series per row and time periods across the columns
Fusion CSV Vertical Time Series Time Series CSV with one series per column
SDMX CSV The SDMX standard CSV format

IDs and Labels

You can control whether the data returned to the worksheet includes IDs, Labels or both using URL parameters or HTTP Accept Headers.

By default, only IDs are included so you need to specify explicitly if labels are also required.

URL Parameter Meaning Example
labels=id IDs only https://demo.metadatatechnology.com/FusionRegistry/ws/public/sdmxapi/rest/data/WB,GCI,1.0/GHA.GCI..?format=csv&labels=id
labels=name Names only https://demo.metadatatechnology.com/FusionRegistry/ws/public/sdmxapi/rest/data/WB,GCI,1.0/GHA.GCI..?format=csv&labels=name
labels=both Both names and IDs https://demo.metadatatechnology.com/FusionRegistry/ws/public/sdmxapi/rest/data/WB,GCI,1.0/GHA.GCI..?format=csv&labels=both
Accept Header Meaning
application/vnd.csv;labels=both Flat CSV (one row per observation) with separate columns for IDs and Names
application/vnd.csv-ts;labels=name Time Series CSV (one row per series) with Names only
application/vnd.csv-series;labels=both Vertical Time Series CSV (one column per series) with both IDs and Names