Data Transformation Web Service
Overview
Entry Point | /ws/public/data/transform |
Access | Public (default). Configurable to Private |
HTTP Method | POST |
Accepts | CSV, XLSX, SDMX-ML, SDMX-EDI (any format for which there is a Data Reader) |
Compression | Zip files are supported. When loading from a URL, gzip responses are also supported |
Content-Type | 1. multipart/form-data (if attaching a file); the attached file must be in the form field named uploadFile 2. application/text or application/xml (if submitting data in the body of the POST) |
Response Format | Determined by Accept Header - default SDMX 2.1 Structure Specific |
Response Statuses | 200 - Transformation performed; 400 - Transformation could not be performed (either an unreadable dataset, or an unresolvable reference to a required structure); 401 - Unauthorized (if access has been restricted); 500 - Server Error |
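As a minimal sketch of the multipart upload described in the table above, the following Python builds (but does not send) a POST request with the file attached in the uploadFile field. The function name, the base URL `https://registry.example.org`, and the file name are illustrative assumptions; the actual base URL depends on where the Fusion Registry is deployed.

```python
import uuid
import urllib.request


def build_transform_request(base_url, filename, content, extra_headers=None):
    """Build a multipart/form-data POST for the transform entry point.

    base_url is a hypothetical deployment URL; content is the raw file bytes.
    """
    boundary = uuid.uuid4().hex
    # The service expects the attached file in the form field "uploadFile".
    body = b"".join([
        f"--{boundary}\r\n".encode(),
        (
            'Content-Disposition: form-data; name="uploadFile"; '
            f'filename="{filename}"\r\n'
        ).encode(),
        b"Content-Type: application/octet-stream\r\n\r\n",
        content,
        f"\r\n--{boundary}--\r\n".encode(),
    ])
    headers = {"Content-Type": f"multipart/form-data; boundary={boundary}"}
    # Optional service headers (Accept, Structure, ...) are passed through as-is.
    headers.update(extra_headers or {})
    return urllib.request.Request(
        base_url.rstrip("/") + "/ws/public/data/transform",
        data=body,
        headers=headers,
        method="POST",
    )
```

The returned request can be passed to `urllib.request.urlopen`; a 400 or 401 status is then raised as `urllib.error.HTTPError`.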
HTTP Headers
The following header parameters can be used to provide further details on the incoming dataset. If these details are not provided, the Fusion Registry will interrogate the dataset header to get the information. If the dataset is in a non-SDMX format, or does not contain the required information in its header, an error response will be returned.
HTTP Header | Purpose | Allowed Values |
---|---|---|
Accept (Since v9.8) |
Optional. Tells the service which data format to write the output datasets in. Note: From Fusion Registry 11.8.0, the output format defaults to the input format if not specified. Previous versions defaulted to SDMX Structure Specific 2.1. |
See Accept Formats |
Data-Format | Used to inform the server when the data is in CSV format. | csv;delimiter=[delimiter]
Where [delimiter] is either:
|
Structure | (optional) Provides the structure to validate the data against. This is optional as this information may be present in the header of the DataSet. If provided this value will override the value in the dataset (if present). |
Valid SDMX URN for Provision Agreement, Dataflow, or Data Structure Definition |
Receiver-Id (Since v9.8) |
The ReceiverId may be included in the validation report. If not provided, the ReceiverId will be taken from the header of the dataset if it is present. If the dataset does not contain a ReceiverId (for example a non-SDMX format) then the validation report will not contain a ReceiverId in the header. |
The following characters are allowed: A-Z, a-z, 0-9, $, _, -, @, \ |
Dataset-Idx | If the loaded file contains multiple datasets, this argument can be used to indicate which dataset is transformed. If this argument is not present then all datasets will be in the output file (if the file formats permits multiple datasets). |
Zero indexed integer, example: 0 |
Dataset-Id (Since v9.8) |
An optional parameter which allows the user to specify the value of the DataSetID generated in the validation. |
The following characters are allowed:
A-Z, a-z
0-9
$, _, -, @, \
Specific variables permit the insertion of Data Structure / Data Flow values. These values are:
|
Dataset-Action (Since v9.8.1) |
An optional parameter which allows the user to specify the value of the DataSetAction generated in the validation report. If this parameter is not specified, the default value will be used. | May be one of the following:
|
Map-Structure (Since v9.2.13) |
An optional parameter that instructs the Fusion Registry to transform the structure of the dataset to conform to another Data Structure Definition. The value can be the URN of a Dataflow or Data Structure Definition to map the incoming data to. A Structure Map must exist in the Fusion Registry which maps between the incoming Data Structure/Dataflow and the mapped Data Structure/Dataflow. Alternatively, the value may be the URN of the Data Structure Map to use for the mapping (since v9.4.4). |
Valid SDMX URN for Dataflow or Data Structure Definition. |
Inc-Unmapped (Since v9.6.5) |
If the Map-Structure Header is used, including Inc-Unmapped will output a second dataset if there are unmapped series. The additional dataset contains the data that could not be mapped due to missing mapping rules or ambiguous outputs, and is written in the same format as the output dataset. Because the result may contain a second file, the response is either a multipart/mixed message with a boundary per file or, if the Zip header is set to true, a single zip file. The file names are 'out' and 'unmapped', with the file extension based on the output format. |
Boolean (true/false) |
Unmapped-Format (Since v11.0.0) |
If the Inc-Unmapped Header is true, the format of the unmapped dataset defaults to the format described by the Accept Header. This header can be used to specify a different format for the unmapped data. |
VND Header used to describe the format, e.g. application/vnd.sdmx.data+csv;version=2.0.0 |
Inc-Metrics (Since v9.6.5) |
Includes metrics on the transformation. The result will contain a separate file, either as a multipart/mixed message with a boundary per file, or if the Zip header is set to true, the output will be a single zip file. |
Boolean (true/false)
|
Fail-On-Error (Since v9.5.0) |
An optional parameter to tell the transformation process to fail if an error is detected in the dataset. |
Boolean (true/false) |
Zip (Since v9.6.5) |
Compresses the output as a zip file. If used in conjunction with Inc-Metrics or Inc-Unmapped, the zip will contain multiple files. |
Boolean (true/false) |
Merge-Datasets (Since v10.4.6) |
Merges datasets if the DSD for the Dataset (or the DSD referenced by the Dataflow) is the same across the datasets. Default is false. |
Boolean (true/false) |
Skip-Validation (Since v11.9.0) |
Allows the validation process to be skipped when transforming a file. Useful when the input file is well understood or large. |
May be one of the following:
|
Duplicate-Behaviour (Since v11.2) |
Specifies the behaviour when duplicate observations are encountered. The duplicates can be preserved, or either the first or the last value can be used. |
May be one of the following:
|
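The Map-Structure, Inc-Unmapped and Zip headers described above can be combined so that the mapped and unmapped datasets come back in a single zip file. The sketch below shows one possible set of request headers and a helper that splits such a zip response into its entries. The helper name `split_zip_output` and the placeholder Dataflow URN are assumptions for illustration; the 'out' and 'unmapped' file names come from the Inc-Unmapped description.

```python
import io
import zipfile

# Request headers for a mapped transformation; the URN is a placeholder.
MAPPING_HEADERS = {
    "Map-Structure": "urn:sdmx:org.sdmx.infomodel.datastructure.Dataflow=BIS:OUT_FLOW(1.0)",
    "Inc-Unmapped": "true",
    "Zip": "true",
}


def split_zip_output(payload: bytes) -> dict[str, bytes]:
    """Return the zip entries keyed by file name without its extension.

    With Inc-Unmapped and Zip set, the expected keys are 'out' and,
    when some series could not be mapped, 'unmapped'.
    """
    with zipfile.ZipFile(io.BytesIO(payload)) as archive:
        return {
            name.rsplit(".", 1)[0]: archive.read(name)
            for name in archive.namelist()
        }
```

Keying by the name without extension avoids depending on the output format, since the extension varies with the Accept header.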
Include Metrics
The following JSON is an example response when the Inc-Metrics header is set to true. RequestTime is Epoch time in milliseconds, and Duration is the number of milliseconds taken to complete the transformation.
{
  "Meta": {
    "RequestTime": 1559124708568,
    "Duration": 220
  },
  "SourceData": {
    "Datasets": [
      {
        "Structure": "urn:sdmx:org.sdmx.infomodel.datastructure.Dataflow=BIS:IN_FLOW(1.0)",
        "Series": 3118,
        "Observations": 3118,
        "Groups": 0
      }
    ]
  },
  "OutputData": {
    "Datasets": [
      {
        "Structure": "urn:sdmx:org.sdmx.infomodel.datastructure.Dataflow=BIS:OUT_FLOW(1.0)",
        "Series": 1753,
        "Observations": 1855,
        "Groups": 0
      }
    ]
  },
  "UnMappedData": {
    "Datasets": [
      {
        "Structure": "urn:sdmx:org.sdmx.infomodel.datastructure.Dataflow=BIS:IN_FLOW(1.0)",
        "Series": 1263,
        "Observations": 1263,
        "Groups": 0
      }
    ]
  }
}
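A client might reduce the metrics document above to a few headline figures. The following is a minimal sketch that assumes the JSON has already been parsed into a dictionary (for example with `json.loads`); the function name and the keys of the returned summary are illustrative, not part of the service.

```python
from datetime import datetime, timezone


def summarise_metrics(metrics: dict) -> dict:
    """Summarise a parsed Inc-Metrics document.

    Sections that are absent (for example UnMappedData when everything
    was mapped) are treated as containing zero series.
    """
    def total_series(section: str) -> int:
        datasets = metrics.get(section, {}).get("Datasets", [])
        return sum(ds.get("Series", 0) for ds in datasets)

    meta = metrics["Meta"]
    return {
        # RequestTime is Epoch time in milliseconds.
        "requested_at": datetime.fromtimestamp(
            meta["RequestTime"] / 1000, tz=timezone.utc
        ),
        "duration_ms": meta["Duration"],
        "source_series": total_series("SourceData"),
        "output_series": total_series("OutputData"),
        "unmapped_series": total_series("UnMappedData"),
    }
```

Applied to the example above, the summary reports 3118 source series, 1753 output series and 1263 unmapped series, with a duration of 220 milliseconds.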