Asynchronous Data Validation and Transformation
Revision as of 04:29, 5 August 2020
Overview
The Asynchronous Data Load web service consumes a submitted file for validation. On receipt of the file, the service returns a UID (token), which can be used to track the validation process and, once validation has completed, to perform further actions such as transforming the data to another format or publishing it to a database.
The Fusion Registry stores the data on the instance it was sent to, so in a load-balanced system all subsequent calls must reach the same server. If there is no activity on the file for 15 minutes, the Registry automatically removes the file from its cache.
Asynchronous Data Load
Entry Point | /ws/public/data/load |
Access | Public (default). Configurable to Private |
Http Method | POST |
Accepts | CSV, XLSX, SDMX-ML, SDMX-EDI (any format for which there is a Data Reader) |
Compression | Zip files supported; if loading from a URL, gzip responses are supported |
Content-Type | 1. multipart/form-data (if attaching a file); the attached file must be in the field named uploadFile. 2. application/text or application/xml (if submitting data in the body of the POST) |
Response Format | JSON |
Response Statuses | 200 - Data file received; 400 - Transformation could not be performed (either an unreadable dataset, or an unresolvable reference to a required structure); 401 - Unauthorized (if access has been restricted); 500 - Server Error |
Data Load Response
The response to a data load is a token, which can be used in subsequent calls to track the data load and validation process. Once validation is complete, the token can be used to perform actions such as publishing the data, obtaining the validation report, exporting in a different format, or exporting with mapping.
{ "Success" : true, "uid" : "unique token" }
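As a rough illustration of the load call, the following Python sketch builds a POST that submits data in the request body and extracts the token from the JSON response. The function names and the base URL used in the usage note are hypothetical; only the entry point, the Content-Type values and the response fields come from this page.

```python
import json
import urllib.request


def build_load_request(base_url: str, data: bytes,
                       content_type: str = "application/xml") -> urllib.request.Request:
    """Build (but do not send) a POST that submits data in the request body."""
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/ws/public/data/load",
        data=data,
        headers={"Content-Type": content_type},
        method="POST",
    )


def parse_load_response(body: str) -> str:
    """Return the token from a data load response, or raise if the load failed."""
    payload = json.loads(body)
    if not payload.get("Success"):
        raise ValueError("data load was not accepted: " + body)
    return payload["uid"]
```

The request could then be sent with `urllib.request.urlopen(build_load_request("http://localhost:8080", data))`, where the host is only a placeholder, and the decoded response body passed to `parse_load_response`.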
Request Load Status
Entry Point | /ws/public/data/loadStatus |
Access | Public (default). Configurable to Private |
Http Method | GET |
Response Format | JSON |
Query Parameters
uid | Unique identifier for the loaded dataset, returned from the data load operation |
Load Status Response
The response is a validation report, indicating either validation success or validation with errors.
The report Status indicates how far the server has progressed through the validation process. The following table shows the possible statuses:
Status | Description |
---|---|
Initialising | Initial status |
Analysing | The dataset is being analysed for series and observation counts, and for which DSDs it references |
Validating | The dataset is being validated |
Complete | The dataset validation process has finished; there may or may not be errors |
Consolidating | The dataset is being consolidated: duplicate series and observations are being merged into one final dataset |
IncorrectDSD | The dataset references a DSD that cannot be used to validate the data |
InvalidRef | The dataset references a DSD, Dataflow or Provision Agreement that does not exist in the Registry |
MissingDSD | The dataset does not reference any structure, so the system cannot read the dataset |
Error | The dataset cannot be read at all |
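Because validation is asynchronous, a client typically polls the Load Status service until a final status is reached. The sketch below shows one way to do that in Python. It assumes that the report exposes the status in a field named "Status" and that Complete, IncorrectDSD, InvalidRef, MissingDSD and Error are the statuses after which no further progress occurs; neither assumption is confirmed by this page. The `fetch_status` callable is a hypothetical stand-in for the HTTP GET to /ws/public/data/loadStatus.

```python
import time
from typing import Callable

# Assumption: these statuses from the table above end the validation process.
TERMINAL_STATUSES = {"Complete", "IncorrectDSD", "InvalidRef", "MissingDSD", "Error"}


def wait_for_validation(fetch_status: Callable[[str], dict], uid: str,
                        interval_s: float = 2.0, max_polls: int = 100) -> dict:
    """Poll until the report reaches a terminal status, then return the report.

    fetch_status(uid) must return the parsed JSON report for the token.
    """
    for _ in range(max_polls):
        report = fetch_status(uid)
        # Assumption: the status is carried in a top-level "Status" field.
        if report.get("Status") in TERMINAL_STATUSES:
            return report
        time.sleep(interval_s)
    raise TimeoutError("validation did not finish after %d polls" % max_polls)
```

Passing the fetch function in as a parameter keeps the polling logic independent of the HTTP client, and makes it straightforward to exercise with canned reports.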
Data export/transform
Entry Point | /ws/public/data/download |
Access | Public (default). Configurable to Private |
Http Method | GET |
Response Format | Determined by Accept Header |
Query Parameters
Query Parameter | Format | Description |
---|---|---|
uid | string | Unique identifier for the loaded dataset, returned from the data load operation |
datasetIndex | integer | If the data file contains multiple datasets, identifies which one to export (zero-indexed) |
map | string | URN of the Dataflow or Data Structure to map to; there must be a Structure Map which describes the mapping from the source to the target |
zip | boolean (default is false) | Zips the response |
includeMetrics | boolean (default is false) | Includes metrics in the response (see Data Transformation) |
unmapped | boolean (default is false) | If the map parameter is supplied and some series or observations cannot be mapped, the unmapped data will be included either in the zip file (if zip is true) or in a multipart-form boundary (if zip is false) |
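The query parameters above can be assembled into a download URL as in the following Python sketch. The function and argument names are hypothetical, and sending boolean flags as the literal string "true" is an assumption about what the service accepts. The output format is selected separately through the Accept header of the GET request.

```python
from typing import Optional
from urllib.parse import urlencode


def build_download_url(base_url: str, uid: str, dataset_index: int = 0,
                       map_urn: Optional[str] = None, zip_response: bool = False,
                       include_metrics: bool = False, unmapped: bool = False) -> str:
    """Build the GET URL for exporting a loaded dataset."""
    params = {"uid": uid, "datasetIndex": dataset_index}
    if map_urn is not None:
        params["map"] = map_urn
    # Flags default to false on the server, so only the enabled ones are sent.
    for name, enabled in (("zip", zip_response),
                          ("includeMetrics", include_metrics),
                          ("unmapped", unmapped)):
        if enabled:
            params[name] = "true"  # Assumption: booleans are passed as "true".
    return base_url.rstrip("/") + "/ws/public/data/download?" + urlencode(params)
```

`urlencode` percent-encodes the URN passed in `map`, which matters because URNs contain characters such as `:` and `=`.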
Revalidate
A revalidation service is provided for cases where more information can be supplied for the dataset(s) loaded against the token. This includes a change to the Dataflow or Provision Agreement link, which may result in a different set of Constraints being applied on validation. If a structure is linked to a different Dataflow or Provision Agreement, this link information is included in the exported dataset, provided the export format supports it.
Revalidation is an asynchronous action whose progress can be tracked using the same token and the Load Status web service.
Entry Point | /ws/public/data/revalidate |
Access | Public (default). Configurable to Private |
Http Method | POST |
Accepts | JSON |
Content-Type | application/json |
Response Format | JSON |
Response Statuses | 200 - Request received and being processed; 400 - Request could not be performed (possibly due to bad syntax of JSON POST); 401 - Unauthorized (if access has been restricted); 500 - Server Error |
Post Body
The POST request must contain the token of the dataset to revalidate; this is the same token that the server provided on data load. SRef is an array of URNs, one for each dataset that was present in the data file loaded to the server. Each URN identifies the structure to validate the corresponding dataset against, and can refer to a Data Structure Definition, a Dataflow, or a Provision Agreement.
{ "UID" : "datasetdetailsuid", "SRef" : ["urn1", "urn2", "urn3"] }
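A minimal Python sketch of building this POST body follows. The function name is hypothetical, and rejecting an empty URN list is a client-side check added for illustration rather than documented server behaviour.

```python
import json
from typing import Sequence


def build_revalidate_body(uid: str, structure_urns: Sequence[str]) -> str:
    """Serialise the revalidate POST body: the token plus one URN per dataset."""
    if not structure_urns:
        # Illustrative client-side check; the page does not describe this case.
        raise ValueError("SRef needs one URN for each dataset in the loaded file")
    return json.dumps({"UID": uid, "SRef": list(structure_urns)})
```

The resulting string would be sent to /ws/public/data/revalidate with a Content-Type of application/json. The order of the URNs presumably follows the order of the datasets in the loaded file, although this page does not say so explicitly.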
Revalidate Response
The response is a JSON message indicating success. Validation progress should be tracked by calling the Load Status web service with the token that was used in the revalidation request.
{ "Success" : true }
Data Publish
Entry Point | /ws/public/data/publish |
Access | Public (default). Configurable to Private |
Http Method | POST |
Accepts | JSON |
Response Format | JSON |
Response Statuses | 200 - Publish request accepted; 400 - Bad request; 401 - Unauthorized (if access has been restricted); 500 - Server Error |
Publishes the dataset loaded against the uid. The expected JSON is as follows:
{ "UID" : "datasetdetailsuid", "Action" : "Append|Replace|Delete", "DeleteAction" : "DEFAULT|OBSERVATIONS|SERIES", "Dataset" : int }
If the Action is Delete, there are three options for DeleteAction. DEFAULT uses the SDMX Delete rules, OBSERVATIONS deletes all the observations in the loaded dataset from the database, and SERIES deletes all the series in the dataset from the database (including all the observations that belong to those series).
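The publish body can be built and checked on the client side as in the Python sketch below. The function name is hypothetical. Two points are assumptions rather than documented behaviour: that Dataset is the zero-based index of the dataset within the loaded file, and that DeleteAction can be left out when the Action is not Delete.

```python
import json

ACTIONS = {"Append", "Replace", "Delete"}
DELETE_ACTIONS = {"DEFAULT", "OBSERVATIONS", "SERIES"}


def build_publish_body(uid: str, action: str, dataset: int = 0,
                       delete_action: str = "DEFAULT") -> str:
    """Serialise the publish POST body, validating the enumerated values."""
    if action not in ACTIONS:
        raise ValueError("Action must be one of " + ", ".join(sorted(ACTIONS)))
    # Assumption: Dataset is the zero-based index of the dataset in the file.
    body = {"UID": uid, "Action": action, "Dataset": dataset}
    if action == "Delete":
        if delete_action not in DELETE_ACTIONS:
            raise ValueError("DeleteAction must be one of "
                             + ", ".join(sorted(DELETE_ACTIONS)))
        # Assumption: DeleteAction is only needed for Delete requests.
        body["DeleteAction"] = delete_action
    return json.dumps(body)
```

For example, `build_publish_body("datasetdetailsuid", "Delete", delete_action="SERIES")` produces a body that asks for every series in the first dataset to be removed together with its observations.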