Publication Table

From Fusion Registry Wiki
Revision as of 00:19, 29 March 2022 by Mnelson (talk | contribs) (Observation Key)
Jump to navigation Jump to search

Overview

A Publication Table provides the means to explicitly define the structure and content of a table of data. Publication Tables can make use of variable text placeholders in the table heading, and row and column headings, it can contain variable dimensions in the cells, and include calculated data and time arithmetic. Publication Tables can also display data from multiple SDMX Dataflows in the same table and can even perform calculations across Dataflows.

The following image shows a very simple Publication Table structure, where the table headings contain static text, and each cell describes an observation to show via the series key. Other parts of this table definition describe the table title, subtitle, and what the source Dataflow is.

Publication Table Definition.png
Publication Table.png

Table Content

Overview

A Publication Table is made up of a Table Head and Body Rows. Each Body Row has first cell which is the Row Title. The Table Head is made up of 1 or more rows, and each head cell may span 1 or more columns, and 1 or more rows. The Table Head content is made up of free text and / or variable placeholders whose values are resolved when the table is built.

A Body Row is made up of 1 or more rows. The Row Title has a fixed row span and column span of 1. The content is made up of free text and / or variable placeholders whose values are resolved when the table is built. The remainder of the Body Row is made is up Observation cells. An Observation Cell can reference a unique Observation by an Observation Key, the value for this Observation is resolved at table build time. The Observation Key may contain variable placeholders and can include calculation expressions. An Observation Cell may also contain free text.

Observation Key

The puspose of the Observation Key is to reference a Observation whose value will be resolved at table build time. The Observation Key can explicitly resolve to 1 observation value, if all parts of the Key are provided. However, it is possible to make parts of the key varaibles, allowing the values to be set at table build time.

An Observation Key is made up of 4 parts:

  • Dataflow Reference
  • Series Reference
  • Time Period Reference
  • Measure Reference

An example Observation Key with all of these parameters in place is:

DF_REG.M:UK:SUR:2008:BIRTHS

The key is broken down into the following parts:

  • DF_REG - This identifies the Dataflow which has the data. The identification uses the alias, which is assiged to the Dataflow in the Publication Table definition. The Dataflow Alias is only required if the Publication Table references more then one Dataflow. If the Publication Table references more then one Dataflow, it can assign a Dataflow to a Default Alias, which means that it is required as part of the Observation Key.
  • M:UK:SUR - This identifies the Series of Data for the Dataflow, by providing a single value for each of the Dimension Members of the Dataflow, separated by a ‘:’ character. In this example the Dataflow has 3 Dimensions (excluding time), the value for the first is 'M', the second 'UK' and the last 'SUR'. As the Dataflow in this example has a Time Dimension this Series Key will uniquely identify a Series of Observations where each Observation refers to a unique Time Period.
  • 2008 - This identifies the Time Period in the Series of Data identified by the previous section. The Combination of the Series and Time Period for a Time Series Dataflow will uniquely identify the Observation.
  • BIRTHS - If the Dataflow uses a Data Structure with multiple Measures, then the Measure ID must be included as the last part of the Observation Key. In this example the Dataflow has multiple measures (BIRTHS/DEATHS/MARRIAGES) so the Observation Key needs to include this value.

If the Publication Table only uses data from one Dataflow, and if the Dataflow only has 1 Measure Dimension then the Observation Key would look more simple:

M:UK:SUR:2008

Variable Placeholders It is possible to omit Dimension Values from the Observation Key, when this is the case that Dimension will become a variable – whose value can be set at table build time. For example, the second Dimension value of ‘UK’, from the previous example, can be omitted like this:

M::SUR:2008

The Publication Table can now be built for any Country by passing in the variable at table build time - i.e. UK, FR, DE. By passing in different variable values, the Observation Key will change, and as such it will reference a different Series in the Dataset.

It is also possible to omit the value for Time Period, this allows the Time Period to be set at table build time, defaulting to the last period for which data exists. Extending the above example to omit the Time value results in an Observation Key whose Country and Time values can set defined at table build time.

M::SUR

When an Observation Key creates a variable, it can be used in any text provided to the table. For example, if the ID of the 2nd Dimension is REF_AREA, the Table Column could be given a title of $(REF_AREA) which would resolve to 'United Kingdom', 'France', or 'Germany' at build time depending on which varaible value is used to build the table. Varaibles are covered later in this document.

Table Variables

Dataflow Variables

Publication Tables define data cells across one or more Dataflows. Each ‘imported’ Dataflow is given an alias, which is used to refer to the Dataflow in the table headings, and table body. It is possible to refer to the Dataflow name, description, and Id using the following variable syntax. The following examples are based on a Dataflow which has been given the alias of 'EXR'

Dataflow Description

$(EXR).desc 

Dataflow Name

$(EXR).name 

Dataflow ID

$(EXR).id

Dimension Variables

A Dimension becomes a variable in the Publication Table when one of more Observation Keys has a wildcard placeholder for the Dimension. For example, consider a Dataflow which has 4 Dimensions: Frequency, Reference Area, Indictor, and Time. An example Observation Key would look like this:

A:UK:EMP:2008

If we want to wildcard the Indicator Dimension, the value EMP would be omitted from the Observation Key

A:UK::2008

The absence of a value for this Dimension turns the Dimension into a variable whose value can be passed to the API when building an instance of the table, for example ‘build table instance with INDICATOR=EMP’. The table is built, with the observation values resolved against a cell key A:UK:EMP:2008.

This Dimension ‘INDICATOR’ can be referenced as a variable in the table headings, row headings, column headings, and in free text in the table cells or footnotes section.

The syntax for referring to the value used for this dimension is as follows:

$(INDICATOR)

This would resolve to the name of the Indicator value used in the table, for example Employment

If the Dimension is not a Variable in the Table, then the Concept Name for INDICATOR will be output. A specific Code name can be output for the Component, using the following syntax:

$(INDICATOR.EMP)

Coded Variables

Time Variables

When the Publication Table is displaying data for a Time Series Dataflow, the Time Period can be part of the Observation Key. However, time can also be a variable which is resolved at table build time. This variable can also be manipulated by moving forward or backwards ‘n’ number of periods from the ‘base time period’ for which the table is built.

  1. .
  1. A Value for Time Period. When the Publication Table is displaying data for a Time Series Dataflow, the Time Period can be part of the Observation Key. However, time can also be a variable which is resolved at table build time. This variable can also be manipulated by moving forward or backwards ‘n’ number of periods from the ‘base time period’ for which the table is built.
  1. A fixed pointer to a Component Value. This variable references a Component in a DSD or a Component value in a DSD. The variable is resolved to the name of the Component, i.e. ‘reporting Country’ or the a specific value for the Component, i.e. ‘United Kingdom’.
  1. A reference to an imported variable. Codelists and Valuelists which exist in the Fusion Registry can be imported into a Publication Table and given an alias. The table can then reference items in the Codleist/Valuelist which is imported.

Summary

Variable Type Description Syntax Example Example Output
Dataflow Name Displays the name of a Dataflow referenced by the Publication Table in the locale in which the Publication Table is built $([dataflow alias]) $(EDU) World Bank, Education
Dataflow Description Displays the description of a Dataflow referenced by the Publication Table in the locale in which the Publication Table is built $([dataflow alias].desc) $(EDU.desc) World Bank, Education
Dataflow Id Displays the ID of a Dataflow referenced by the Publication Table $([dataflow alias].id) $(EDU.id) DF_EDU
Variable Dimension Value Displays the name of the variable used when an Observation key has a wildcard member $([dimension id]) $(REF_AREA) United Kingdom
Specific Component Value Displays the name of the Component Value $([component id].[component value]) $(OBS_STATUS.C) Confidential
Component Name Displays the name of the DSD Component (this is the name of the Concept used by the Component) in the locale in which the Publication Table is built $([component id].name) $(OBS_STATUS) Confidentiality Status
Enumeration Value Displays the name of a Code in a Codelist or Value in a Valuelist $([list alias].[list item ID] $(SEX.M) Male
Conditional Variable Displays the variable value, which is conditional on the value of a variable Dimension $([conditional variable id][variable id]) $(BANKNOTES[NOTE1]) £5
Time Displays the base time period for which the table is built, or an offset from this time period / cast to a different frequency see table 1 $(P) 2008

Data Cell Syntax

Each data cell in a publication table is defined by a cell key, which resolves to an observation when the table is materialised.

  1. A fully formed cell key consists of Dimension Values for each Dimension including Time Period if the DSD has a Time Dimension.
  2. If there are multiple measures in the DSD the Measure Dimension should also be included.
  3. A cell key can leave out Dimension values, in doing so the Dimension value is treated as a variable, which can be passed in at 'run time' to build the table output
  4. A cell key can leave out Dimension values, in doing so the Dimension value is treated as a variable, which can be passed in at 'run time' to build the table output
  5. A cell key can treat time as a variable, and also provide arithmetic on the Time Dimension to move forwards and backwards in periods, or to change Frequency
  6. A cell key can include calculations

Examples:

|- ! Cell Key !! Description |- | A:UK:EMP:M || Series key without Time |- | A:UK:EMP:M:2008 || Series key with Time |- | A:UK:EMP:M:2008:BIRTHS || Series key with Time and ID of Measure Dimension |- | A:UK:EMP:M:P || Series key with variable Time |- | A::EMP:M:P || Series key with variable dimension and variable Time |- | A:UK:EMP:M:P-1 || Series key with variable Time, which is 1 period before the base period of the table |- | :=A:(UK+FR):EMP:M || A calculated value from 2 series |}


Time Arithmetic Syntax

Syntax Description Example Input Example Output
Example Example Example Example
Example Example Example Example
Example Example Example Example
Example Example Example Example
Example Example Example Example
Example Example Example Example
Example Example Example Example

Variable Syntax