Overview – Fusion Data Mapper
This document provides guidance and operating procedures for creating and managing mapped
datasets using Fusion Registry 10 and the Fusion Data Mapper.
Use Case
The primary use case is transforming single-dimensional datasets to SDMX multi-dimensional
structures.
Single-dimensional datasets are those with a single unique identifier for each series (e.g. Series Code),
such as those created by FAME or similar time-series production systems.
Only one-to-one transformations are supported by this version of the Fusion Data Mapper.
The transformation is performed by Fusion Registry using SDMX Structure Mapping. Fusion Data
Mapper provides an easy-to-use user interface for defining and managing the mapping rules.
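As a rough illustration of the one-to-one idea, the sketch below models mapping rules as a plain Python dictionary from a series code to a set of dimension values, and checks that no two series codes share the same target key. The series codes, the dimension names (FREQ, REF_AREA, INDICATOR) and the function name are hypothetical, and reading "one-to-one" as "no shared target key" is an assumption. Fusion Registry holds its rules as SDMX structure maps, not in this form.

```python
from collections import Counter

# Hypothetical mapping rules: one source series code -> one multi-dimensional key.
MAPPING_RULES = {
    "GDP.Q.UK": {"FREQ": "Q", "REF_AREA": "UK", "INDICATOR": "GDP"},
    "CPI.M.UK": {"FREQ": "M", "REF_AREA": "UK", "INDICATOR": "CPI"},
}


def find_duplicate_targets(rules):
    """Return the target keys that more than one series code maps to."""
    # Dictionaries are not hashable, so each target is frozen into a sorted tuple.
    counts = Counter(tuple(sorted(target.items())) for target in rules.values())
    return [dict(key) for key, n in counts.items() if n > 1]
```

An empty result means every series code has its own target key; any entry returned points at rules that would collapse two source series into one.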
Audience
- Metadata Managers – those responsible for managing the metadata mappings on the Bank’s catalogue of time series on a day-to-day basis.
- Metadata Superusers – those responsible for managing the core structural metadata including Agencies, Concepts, Data Structure Definitions and Codelists.
- System Administrators – those responsible for administering Fusion Registry 10 as part of the integrated statistical data and metadata system, and managing the Time Series Database as the source of observation data.
Prerequisites
Readers are assumed to have an understanding of basic SDMX principles and the purpose of the
main SDMX structural metadata artefacts including Concepts, Codes and Codelists, Categories, Data
Structure Definitions (DSDs), Dataflows, Provision Agreements, Structure Sets and Dataflow Maps.
Terminology
- Dataset – a named collection of series that typically all fall under a specific topic, for instance ‘National Accounts’. In Fusion Registry, an SDMX Dataflow represents a dataset.
- Mapped Dataset – an SDMX Dataflow where data is taken from a ‘source’ Dataflow and transformed to a different dimensionality using defined mapping rules. The Fusion Data Mapper manages these mapping rules. In this document, the source Dataflow is assumed to be observation data from the Time Series Database, which is described by a Data Structure Definition having only SERIES_CODE, TIME_PERIOD and OBS_VALUE dimensions.
- Time Series Database – the source of time series observation data without metadata that Fusion Registry maps to Mapped Datasets using the defined mapping rules.
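To make the relationship between a source observation and a mapped observation concrete, here is a minimal sketch of applying one mapping rule to one source row. The row layout follows the SERIES_CODE, TIME_PERIOD and OBS_VALUE structure described above; the series code "S1", the target dimensions and the function name are hypothetical, and keeping SERIES_CODE in the output is an assumption about the target structure.

```python
def map_observation(row, rules):
    """Attach the mapped dimension values to a source row, keeping time and value."""
    target = rules[row["SERIES_CODE"]]  # raises KeyError if the series has no rule
    return {
        "SERIES_CODE": row["SERIES_CODE"],
        **target,
        "TIME_PERIOD": row["TIME_PERIOD"],
        "OBS_VALUE": row["OBS_VALUE"],
    }


# Hypothetical rule and source observation.
rules = {"S1": {"FREQ": "Q", "REF_AREA": "UK", "INDICATOR": "GDP"}}
source_row = {"SERIES_CODE": "S1", "TIME_PERIOD": "2020-Q1", "OBS_VALUE": 101.5}
mapped_row = map_observation(source_row, rules)
```

The observation value and time period pass through unchanged; only the descriptive dimensions are added, which is why the Time Series Database can remain a store of observations without metadata.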
Related Pages
For further guidance on Fusion Data Mapper:
Add a Mapped Dataset – Fusion Data Mapper
Add Series to a Dataset – Fusion Data Mapper
Browse Privileges – Fusion Data Mapper
Bulk Maintenance of Metadata Values using Excel Import / Export – Fusion Data Mapper
Changing the Dimensionality of a Dataset – Fusion Data Mapper
Clone a Dataset – Fusion Data Mapper
Codelists - Adding and Removing Codes – Fusion Data Mapper
Codelists – Adding and Changing Multilingual Code Names with Impact Analysis - Fusion Data Mapper
Content Security Caveats – Fusion Data Mapper
Content Security Metadata Management Use Cases – Fusion Data Mapper
Default Code Values – Fusion Data Mapper
Maintaining Metadata Values on Series Interactively using the Web Interface – Fusion Data Mapper
Maintenance Privileges – Fusion Data Mapper
Registering a Series – Fusion Data Mapper
Remove a Mapped Dataset – Fusion Data Mapper
Removing Series from a Dataset – Fusion Data Mapper
The Fusion Data Mapper User Interface