Fusion Data Mapper - Maintaining Mapping Rules

From Fusion Registry Wiki

Revision as of 08:40, 24 July 2020


Overview

The Fusion Data Mapper displays the Structure Mapping rules, which are ultimately maintained in and retrieved from the Fusion Registry. The Data Mapper provides the ability to load rules from Excel or CSV files, edit or delete the rules locally, and save them to the Fusion Registry server.

Importing mappings from Excel

Mapping rules can be expressed in an Excel file by creating a column whose heading is the Id of the Source Dimension, and subsequent columns whose headings match the Ids of the Target Dimensions or Series Attributes.

Mappingrules2.png

The order of the columns is not important, as long as the column headings match the Ids of the Dimensions. If the target Dataflow needs to include the Series Code, perhaps as a Series Attribute, ensure that the Series Attribute Id in the target is the same as the Id in the source. No further mapping is then required: the Series Code will be exposed via the target Dataflow. The Source Dataflow can contain other Series Attributes, for example LAST_UPDATED to indicate when the series was last updated. If the Target Dataflow has the same Series Attributes, the data will be mapped across automatically without changing the reported value.

The quickest way to create an Excel file is to export the empty table in Excel format from the Data Mapper - this provides a blank template for creating new rules.
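The column layout described above can be sketched as a small CSV file. This is only an illustration: the Ids SERIES_CODE, FREQ, REF_AREA and INDICATOR and the function build_template are hypothetical, and the real headings must match the Ids defined in your own Source and Target structures.

```python
import csv
import io

# Hypothetical Ids, for illustration only: one Source Dimension and
# the Target Dimensions / Series Attributes it maps to.
SOURCE_DIMENSION_ID = "SERIES_CODE"
TARGET_IDS = ["FREQ", "REF_AREA", "INDICATOR"]


def build_template(rows):
    """Return CSV text whose header row matches the Dimension Ids."""
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer,
        fieldnames=[SOURCE_DIMENSION_ID, *TARGET_IDS],
        lineterminator="\n",
    )
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


# One mapping rule: the series identifier and its target values.
template = build_template([
    {"SERIES_CODE": "XULADS", "FREQ": "A", "REF_AREA": "UK", "INDICATOR": "GDP"},
])
print(template)
```

Calling build_template with an empty list produces a header-only file, similar in spirit to the blank template exported from the Data Mapper.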

Import Action: Upsert

Importing mapping rules with the Upsert action creates new mappings where they did not previously exist and overwrites existing mappings. As the Source Dimension can map a unique identifier only ONCE to a set of target Dimensions, importing an identifier that is already mapped, but with different Dimension values, overwrites the existing rule.

Upsert will not remove any mappings; it only adds new rules or modifies existing ones.

Import Action: Full Replace

Full Replace will replace all the rules that exist against the Target - Source Dataflow with the ones in the loaded file.

Note: Fusion Registry preserves copies of old rules should rollback be required. Rollback is not supported in the Data Mapper, so it must be performed via the Fusion Registry User Interface.
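The difference between the two import actions can be sketched with plain dictionaries keyed by series identifier. This is a minimal model of the behaviour described above, not the Fusion Registry implementation; the function names upsert and full_replace and the sample Ids are assumptions made for the example.

```python
def upsert(existing, imported):
    """Add new rules and overwrite rules for identifiers already mapped."""
    merged = dict(existing)   # never removes anything from the existing set
    merged.update(imported)   # same identifier: the imported rule wins
    return merged


def full_replace(existing, imported):
    """Discard every existing rule and keep only the imported ones."""
    return dict(imported)


# Hypothetical rule sets: series identifier -> target Dimension values.
existing = {
    "XULADS": {"FREQ": "A", "REF_AREA": "UK"},
    "ABCD": {"FREQ": "M", "REF_AREA": "FR"},
}
imported = {
    "XULADS": {"FREQ": "Q", "REF_AREA": "UK"},
    "EFGH": {"FREQ": "A", "REF_AREA": "DE"},
}

after_upsert = upsert(existing, imported)
after_replace = full_replace(existing, imported)
```

After the upsert, ABCD is still mapped and XULADS carries the imported values. After the full replace, ABCD is gone.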

Manual Editing via the UI

Mapping Rules which exist in the Fusion Registry are viewed as a table in the Fusion Data Mapper, in the same style as Excel (column, row). Columns can be sorted ascending or descending by clicking on the column name. Filters can be applied to the columns.

Single Value Changes

To manually edit one or more values for a mapping, double click on a code id to load a drop down list of valid values. If the target Dimension is not backed by a Codelist, a free text input box will be displayed for entering a value manually. When a cell is edited, the row will be highlighted in orange to notify the user that an unsaved change was made to that row. To save all changes back to the Fusion Registry server, click on the Save button. This submits all changes to the Fusion Registry for validation and update.

Edit mapping rule.png

Local changes that have not been saved to the Fusion Registry server can be undone by clicking on the Undo button.

Filters

The series filter input box can be used to filter all series shown in the table. A series matches if any part of its identifier contains the text in the filter. The search is case insensitive. For example, the series identifier XULADS will match the filter ad.

Column filters are applied in addition to any other filters.

All filters are cleared using the clear filter control.
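The filter behaviour described above can be sketched as follows. The series filter is a case insensitive substring test, and column filters narrow the result further. How the Data Mapper compares column filter values is not stated here, so the exact-match comparison below is an assumption, as are the function names and the sample rows.

```python
def matches_series_filter(series_id, filter_text):
    """Case insensitive 'contains' test on the series identifier."""
    return filter_text.casefold() in series_id.casefold()


def apply_filters(rows, series_filter="", column_filters=None):
    """Keep rows that pass the series filter AND every column filter."""
    column_filters = column_filters or {}
    return [
        row for row in rows
        if matches_series_filter(row["series"], series_filter)
        # Assumption: a column filter requires an exact value match.
        and all(row.get(col) == value for col, value in column_filters.items())
    ]


# Hypothetical table rows.
rows = [
    {"series": "XULADS", "FREQ": "A"},
    {"series": "XULADQ", "FREQ": "Q"},
    {"series": "BKLM", "FREQ": "A"},
]

by_series = apply_filters(rows, "ad")
by_both = apply_filters(rows, "ad", {"FREQ": "A"})
cleared = apply_filters(rows)  # no filters: every row is shown
```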


Bulk Changes

Bulk Changes can be made by selecting one or more rows using the checkbox control on the row. When one or more rows are selected, click on the bulk change button next to the Column label for the column you want to change the value of. This will open a control allowing the selection of a single value which will be applied to all selected rows.

Bulk Deletes can be achieved using the same row selection checkboxes and clicking on the button to delete the selected series. This will update the mapping locally to remove the mapped series. A further Save is required to push these deletes to the server. Deleted rows and modified rows are pushed to the Fusion Registry server in one save command.

The select all checkbox will select all the checkboxes that match the current query filter, even if the rows are not all displayed on one page due to pagination.

A selected row remains selected until it is deselected. The selection will remain even if the row is no longer displayed due to the application of a filter or viewing a different page. Deselection can be achieved by clicking the same select checkbox, or via a select all followed by deselect all, or by navigating to a different Dataflow mapping.
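One way to picture this selection behaviour is a set of selected identifiers that is kept separately from the rows currently displayed. The class below is a hypothetical sketch, not the Data Mapper's code, and it assumes that deselect all clears the whole selection.

```python
class RowSelection:
    """Tracks selected series identifiers independently of what is displayed."""

    def __init__(self):
        self.selected = set()

    def toggle(self, series_id):
        # Clicking a row checkbox selects it; clicking it again deselects it.
        if series_id in self.selected:
            self.selected.remove(series_id)
        else:
            self.selected.add(series_id)

    def select_all(self, filtered_ids):
        # Every row matching the current filter is selected, on all pages.
        self.selected.update(filtered_ids)

    def deselect_all(self):
        # Assumption: this clears the entire selection.
        self.selected.clear()


selection = RowSelection()
selection.toggle("XULADS")

# A filter that hides XULADS changes what is displayed, not what is selected.
still_selected = "XULADS" in selection.selected

# Select all with a filter that matches two other (hypothetical) series.
selection.select_all(["BKLM", "BKLQ"])
```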


Enumerated Source Dimension

If the Source Dimension uses a Codelist to enforce its allowable content, this is used by the Data Mapper to allow series identifiers to be quickly added into the mapping.

Addseries.png

The add series button appears as a Plus Symbol [+] in the button bar if the Source Dimension is coded. This is true even if the Dimension was not Coded at the time the mapping was made (i.e. if the Source Dimension was Free text but was subsequently modified to be Coded, the button will appear). On clicking this button, a form will open displaying all the codes in this codelist, which can be filtered using the text entry field. The filter performs a case insensitive search for all Code Ids and Labels that include the text. Regular Expression searches are supported.
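The code filter described above can be approximated with Python's re module: the entered text is treated as a regular expression and tested, ignoring case, against both the Code Id and the Label. The function filter_codes and the sample codelist are hypothetical, and the Data Mapper's regular expression dialect may differ from Python's.

```python
import re


def filter_codes(codes, text):
    """Return the Code Ids whose Id or Label matches the filter text."""
    pattern = re.compile(text, re.IGNORECASE)
    return [
        code_id for code_id, label in codes.items()
        if pattern.search(code_id) or pattern.search(label)
    ]


# Hypothetical codelist: Code Id -> Label.
codes = {
    "XULADS": "UK GDP annual",
    "BKLM": "France CPI",
    "XUQADS": "UK GDP quarterly",
}

plain_text = filter_codes(codes, "gdp")           # substring, any case
anchored = filter_codes(codes, "^xu.ads$")        # regular expression on the Id
alternation = filter_codes(codes, "cpi|annual")   # matches Labels
```

In this sketch an invalid pattern raises re.error, so a caller would need to handle that case.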

Selecting one or more codes and then clicking add will add a mapping into the table, but it will not include any mapped values. These must be entered by manual editing, or by exporting to Excel, completing the form, and importing it again.

Partially Mapped Series and Duplicate Mappings

A series identifier does not have to map to a value for each Dimension in the target when the mapping is being created. However, a complete mapping is strongly recommended in a production system, as each series code should map to a unique set of Dimension values. If two different identifiers map to the same Dimension values for each Dimension, the mapping becomes ambiguous when the data is converted from the source to the target dataset. It is, however, possible to map the same series identifier against different Dataflow mappings.
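Both conditions can be checked before saving with a short script. The sketch below is an assumption-laden illustration: the helper names, the target Dimension Ids and the sample mappings are all hypothetical, and it runs on a local copy of the rules rather than against the Fusion Registry.

```python
from collections import defaultdict


def is_complete(values, target_dimensions):
    """True when every target Dimension has a non-empty mapped value."""
    return all(values.get(dim) for dim in target_dimensions)


def find_partial(mappings, target_dimensions):
    """Series identifiers that lack a value for at least one Dimension."""
    return [
        series_id for series_id, values in mappings.items()
        if not is_complete(values, target_dimensions)
    ]


def find_ambiguous(mappings, target_dimensions):
    """Groups of identifiers that map to identical Dimension values."""
    groups = defaultdict(list)
    for series_id, values in mappings.items():
        if is_complete(values, target_dimensions):
            key = tuple(values[dim] for dim in target_dimensions)
            groups[key].append(series_id)
    return {key: ids for key, ids in groups.items() if len(ids) > 1}


# Hypothetical target Dimensions and mapping rules.
target_dimensions = ["FREQ", "REF_AREA"]
mappings = {
    "AAA": {"FREQ": "A", "REF_AREA": "UK"},
    "BBB": {"FREQ": "A", "REF_AREA": "UK"},  # same target values as AAA
    "CCC": {"FREQ": "M"},                    # REF_AREA not mapped yet
    "DDD": {"FREQ": "M", "REF_AREA": "FR"},
}

partial = find_partial(mappings, target_dimensions)
ambiguous = find_ambiguous(mappings, target_dimensions)
```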

Modification Result

The result of modifying the mapping is the Fusion Registry server being updated with the new metadata. This may result in a re-index of the Source dataset if the Target Dataflow has been linked to the Source Dataflow for data retrieval. This will mean any data navigation or retrieval will reflect the changes of the mapping with immediate effect.