Fusion Data Mapper - Query Forwarding
Overview
Data Query forwarding is the process by which the Fusion Registry uses mapping rules to rewrite a Data Query against the Target Dataflow so that it conforms to the structure of the Source Dataflow. For example, a query for data where REF_AREA=UK may result in a query for 15 different series by the unique identifiers UK1, UK2, UK3, and so on. The Fusion Registry is able to rewrite any multidimensional query into a list of series that correspond to the data filters. It then forwards this query to the data store (or data stores, if there are multiple) that contains the data. When the Fusion Registry receives the response, it is able to rewrite the dataset into a multidimensional dataset in the format requested by the user. For example, extracting the time series UK1, UK2, UK3 from an Oracle database may result in series for FREQ=A, REF_AREA=UK, INDICATOR=EMP (along with other series) being written in CSV format to the client. When the source data is updated, the client immediately sees the updated data, as it is always retrieved from the source.
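The rewrite described above can be pictured with a minimal Python sketch. It is an illustration only and not the Fusion Registry's implementation: the MAPPING table, the function names rewrite_query and map_response, the row fields (series_id, period, value) and the dimension values other than those quoted above are all assumptions made for the example.

```python
# Hypothetical mapping rules: source series identifier -> target dimension values.
MAPPING = {
    "UK1": {"FREQ": "A", "REF_AREA": "UK", "INDICATOR": "EMP"},
    "UK2": {"FREQ": "A", "REF_AREA": "UK", "INDICATOR": "UNE"},
    "UK3": {"FREQ": "Q", "REF_AREA": "UK", "INDICATOR": "EMP"},
    "FR1": {"FREQ": "A", "REF_AREA": "FR", "INDICATOR": "EMP"},
}


def rewrite_query(filters, mapping=MAPPING):
    """Return the source series identifiers whose mapped dimensions match every filter.

    `filters` maps a dimension name to the set of values the client asked for.
    """
    return sorted(
        series_id
        for series_id, dims in mapping.items()
        if all(dims.get(dim) in allowed for dim, allowed in filters.items())
    )


def map_response(source_rows, mapping=MAPPING):
    """Attach the target dimension values to each row returned by the data store."""
    return [
        {
            **mapping[row["series_id"]],
            "TIME_PERIOD": row["period"],
            "OBS_VALUE": row["value"],
        }
        for row in source_rows
        if row["series_id"] in mapping  # unmapped series are dropped
    ]


# A query for REF_AREA=UK becomes a list of source series identifiers.
series_to_fetch = rewrite_query({"REF_AREA": {"UK"}})

# Rows coming back from the (assumed) data store are mapped to the target structure.
rows = [{"series_id": "UK1", "period": "2020", "value": 31.5}]
target_rows = map_response(rows)
```

In this sketch the mapping is held in memory and evaluated per query, so a change in the source rows is reflected the next time the query is forwarded.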
Single Data Source serving data for Multiple Dataflows
If the Fusion Registry is linked to the data store for the Source Dataflow (single Dimension Dataflow), it is able to forward queries from the mapped Dataflow to the data store and map the response back out. It is possible to map one subset of the source data to one Target Dataflow and another subset to a different Target Dataflow. In this way, many multidimensional Dataflows can be mapped to the same source of data, with each Dataflow representing only a small fraction of the total dataset.
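As a rough illustration of one source serving several Target Dataflows, the sketch below keeps only the series mapped to the requested Target. The Dataflow identifiers (DF_EMPLOYMENT, DF_PRICES), the series identifiers and the function name filter_for_target are hypothetical and chosen only for the example.

```python
# Hypothetical mappings: each Target Dataflow maps only a subset of the source series.
TARGET_MAPPINGS = {
    "DF_EMPLOYMENT": {"UK1", "UK2", "FR1"},
    "DF_PRICES": {"UK7", "FR7"},
}


def filter_for_target(target, source_rows, mappings=TARGET_MAPPINGS):
    """Keep only the source rows whose series is mapped to the given Target Dataflow."""
    mapped = mappings[target]
    return [row for row in source_rows if row["series_id"] in mapped]


# The same source rows serve both Target Dataflows; each sees only its own subset.
source_rows = [
    {"series_id": "UK1", "value": 31.5},
    {"series_id": "UK7", "value": 102.3},
    {"series_id": "DE9", "value": 7.0},  # not mapped to any Target Dataflow
]
employment_rows = filter_for_target("DF_EMPLOYMENT", source_rows)
prices_rows = filter_for_target("DF_PRICES", source_rows)
```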
For this use case, care must be taken when querying for all data against a Target Dataflow, as the mapped query is a query for all data for the Source Dataflow, resulting in a full data extraction from the Source. The correct subset of data will still be written out, because the series that have not been mapped are discarded. However, a query for all data may not be desirable if the Source Dataflow has a lot of data loaded against it. To protect against this behaviour, the Fusion Registry provides a setting under the administration configurations for Mapping called Explicit Data Query Conversion. Explicit Data Query Conversion defaults to Off, which means a query for all data against the Target Dataflow is sent to the Source Dataflow as a query for all data. When this setting is set to On, a query for all data is explicitly mapped into a list of the series identifiers that match the query. For example, if 500 series are mapped to the Target, a query for all data results in a list of 500 series identifiers being queried. The Fusion Registry splits large queries into smaller batches, as some databases restrict the number of parameters that can be passed to the IN statement, so one web query may result in more than one SQL statement being executed to fulfil the query.
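The effect of Explicit Data Query Conversion and of query batching can be sketched as follows. This is a simplified model under stated assumptions and not the Fusion Registry's actual code: the table and column names (SOURCE_OBSERVATIONS, SERIES_ID), the batch size of 200 and all function names are invented for the illustration.

```python
def convert_all_data_query(mapped_series_ids, explicit_conversion):
    """Model a query for all data against a Target Dataflow.

    With the setting Off, None is returned, standing for "no series filter"
    (the source is asked for all data). With the setting On, the query becomes
    the explicit list of series identifiers mapped to the Target.
    """
    if not explicit_conversion:
        return None
    return sorted(mapped_series_ids)


def split_into_batches(series_ids, max_params):
    """Split a list of identifiers into chunks of at most max_params items."""
    if max_params < 1:
        raise ValueError("max_params must be at least 1")
    return [series_ids[i:i + max_params] for i in range(0, len(series_ids), max_params)]


def build_statements(series_ids, max_params=200):
    """Build one parameterised SQL statement per batch (table and column are assumed)."""
    statements = []
    for chunk in split_into_batches(series_ids, max_params):
        placeholders = ", ".join("?" for _ in chunk)
        sql = f"SELECT * FROM SOURCE_OBSERVATIONS WHERE SERIES_ID IN ({placeholders})"
        statements.append((sql, chunk))
    return statements


# 500 mapped series with an assumed limit of 200 parameters per IN list.
mapped = {f"S{i:03d}" for i in range(500)}
series = convert_all_data_query(mapped, explicit_conversion=True)
statements = build_statements(series, max_params=200)
batch_sizes = [len(params) for _, params in statements]
```

With these assumed values, one web query for all data is fulfilled by three SQL statements, carrying 200, 200 and 100 identifiers respectively.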