Difference between revisions of "EdgeServerCompiler"

From Fusion Registry Wiki
Revision as of 00:14, 20 October 2021

Fusion Edge Compiler

Overview

The Fusion Edge Compiler is a command line client written in Java that runs on Windows or UNIX operating systems. Its responsibility is to compile SDMX data, structure, and metadata files for dissemination by the Fusion Edge Server. The Fusion Edge Compiler provides three functions:


  1. To pull content from SDMX web services (for example, Fusion Registry web services) in order to populate a local file system of content to publish
  2. To compile content in the local file system to create a new ‘environment’ which can be consumed by the Fusion Edge Server
  3. To publish the environment to an Amazon S3 bucket from which distributed Fusion Edge Servers can take their content, if configured to do so

The second function, compile, is the main function of the compiler. The other two functions can be performed manually if required; however, the Fusion Edge Compiler provides them so that the full data extract, transform, and load process can be automated.

General Arguments

The command line client provides three scripts for pull, compile, and publish. Each script has a UNIX (.sh) file and a Windows (.bat) file. Each script takes a number of command line arguments, some of which are common to all scripts:

  1. The properties file (prop argument). Each script can read one or more properties files, each of which is a JSON file containing configuration options. A properties file supports the same configuration options that can be passed directly to the script as command line arguments, so a script can read its arguments from a file, from the command line, or from both. When both are supplied, the script merges the arguments from the properties file with the command line arguments. If a configuration option is passed as a command line argument and also appears in the properties file, the command line argument takes precedence. Because command line arguments and properties files can be used together, the arguments are always marked as optional. However, this document notes which arguments are required and must therefore be supplied either as a command line argument or as a properties file argument.
  2. Ledger location (lgr argument). This must be the root location of the ledger. It can be provided as a path to a folder on a file system, as the http(s) URL of the root folder if the ledger is hosted on a web service, or in the form s3:bucketname if Amazon S3 is used as the file store.
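
The precedence rule described for the prop argument can be illustrated with a minimal Python sketch. It is not the compiler's own implementation (the compiler is written in Java). It assumes that the JSON keys mirror the argument names (for example api and tgt), and that a properties file listed later overrides one listed earlier. Neither detail is confirmed by this page.

```python
import json
from pathlib import Path


def load_properties(paths):
    """Merge one or more JSON properties files into a single dict.

    Assumption: when the same key appears in several files, the file
    listed later wins. This page does not specify that ordering.
    """
    merged = {}
    for path in paths:
        merged.update(json.loads(Path(path).read_text(encoding="utf-8")))
    return merged


def resolve_arguments(cli_options, property_files=()):
    """Combine properties-file options with command line options.

    Command line values take precedence over properties-file values,
    as described for the prop argument. Key names such as "api" and
    "tgt" are assumed to mirror the argument names.
    """
    resolved = load_properties(property_files)
    resolved.update(cli_options)  # the command line overrides the file
    return resolved
```

With a properties file that sets both api and tgt, calling resolve_arguments({"tgt": "/home/compiler/target"}, ["props.json"]) keeps api from the file and takes tgt from the command line.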


Pull Content

buildFileSystem.sh (UNIX) or buildFileSystem.bat (Windows)

The Fusion Edge Compiler queries an SDMX web service for structural metadata, data, and reference metadata content based on what it has been requested to pull. It can work against a Fusion Registry web service as well as any other SDMX web service that complies with the SDMX specification.

The Fusion Edge Compiler pulls the content to build a target directory of files in the correct structure for the compile process to operate.
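
To give a sense of what a data query against an SDMX web service can look like, the Python sketch below builds an SDMX REST data URL from a Dataflow reference in the AGENCY:ID(VERSION) form used by the df argument, optionally adding the updatedAfter query parameter mentioned for the upd argument. The function names are hypothetical, and the exact requests the compiler issues are not documented on this page, so this is an illustration only.

```python
import re
from urllib.parse import urlencode

# Matches a Dataflow reference written as AGENCY:ID(VERSION), e.g. ACY:DF_ID(1.0)
DATAFLOW_REF = re.compile(
    r"^(?P<agency>[^:()]+):(?P<id>[^:()]+)\((?P<version>[^()]+)\)$"
)


def parse_dataflow_ref(ref):
    """Split 'ACY:DF_ID(1.0)' into (agency, id, version)."""
    match = DATAFLOW_REF.match(ref)
    if match is None:
        raise ValueError(f"not a Dataflow reference: {ref!r}")
    return match.group("agency"), match.group("id"), match.group("version")


def data_query_url(api_root, dataflow_ref, updated_after=None):
    """Build an SDMX REST data query URL for one Dataflow.

    Assumption: the service exposes the standard SDMX REST path
    /data/{agency},{id},{version} under api_root.
    """
    agency, flow_id, version = parse_dataflow_ref(dataflow_ref)
    url = f"{api_root.rstrip('/')}/data/{agency},{flow_id},{version}"
    if updated_after is not None:
        # urlencode percent-encodes the colons in the timestamp
        url += "?" + urlencode({"updatedAfter": updated_after})
    return url
```

For example, data_query_url("https://yourorg.org/sdmx", "ACY:DF_ID(1.0)") returns "https://yourorg.org/sdmx/data/ACY,DF_ID,1.0".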

Command Line Arguments

Note: Arguments can be provided on the command line, via a JSON properties file, or as a mix of both.

{| class="wikitable"
|-
! Argument !! Example !! Description
|-
| prop || -prop "/home/props.json" || A reference to one or more properties files (separated by a space)
|-
| api || -api "https://yourorg.org/sdmx" || The URL of the web service to pull the content from
|-
| apict || -apict 500 || Connect timeout to the API in seconds (default 200)
|-
| apirt || -apirt 500 || Read timeout from the API in seconds (default 200)
|-
| apiua || -apiua "EdgeCL" || User Agent sent in the HTTP request header to the API
|-
| tgt || -tgt "/home/compiler/target" || The target directory to write the files and folders to
|-
| lgr || -lgr "s3:mybucket" || The location of the current live ledger. If this is provided, the last compile time of the ledger is used as the updated-after time when pulling data
|-
| df || -df "ACY:DF_ID(1.0)" "ACY:DF2(1.0)" || A reference to one or more Dataflows to pull data for (separated by a space). If Dimension filters are to be applied to the Dataflow, the properties file should be used. The keyword all can be used to pull data for all Dataflows. A Dataflow argument includes both the data and the structural metadata (Dataflow plus all descendants) in the output.
|-
| str || -str "Codelist=ACY:CL_FREQ(1.0),ACY:CL_AGE(1.0) CategoryScheme=ACY:*(*)"<br/><br/> -str all || A list of structures to include in the output, in addition to those that are included automatically based on the Dataflows included in the output. The structure and all descendants of that structure will be included in the output. <p>The * character is used to wildcard the Agency, Id, and Version parameters. All structures can be obtained by using the all keyword (-str all).</p>
|-
| upd || -upd "2020-01-30T00:00.00" || Pull data that was updated after this time. This is applied by using the updatedAfter web service query parameter against the target web service
|-
| replace || -replace || If present, all the files in the target directory will be deleted before the pull is run
|-
| metadata || -metadata || If present, the pull process will query for all Reference Metadata and include this in the output
|-
| zip || -zip || If present, the output files will all be in zip format
|-
| usr || -usr "myusername" || Username to authenticate with the REST API. If using the Fusion Registry, it should correspond to a user account in the Fusion Registry
|-
| pwd || -pwd "mypassword" || Password to authenticate with the REST API
|-
| s3rgn || -s3rgn "us-east-1" || Amazon S3 region (required if the Ledger is hosted on Amazon S3)
|-
| s3sec || -s3sec "azxzcvbnm" || Amazon S3 Secret (required if the Ledger is hosted on Amazon S3)
|-
| s3acc || -s3acc "azxzcvbnm" || Amazon S3 Access Key (required if the Ledger is hosted on Amazon S3)
|-
| tmp || -tmp /tmp || Temporary directory to use for transient files. If not provided, the java.io.tmpdir JVM system property is used, which usually defaults to the user tmp directory
|-
| rmtmp || -rmtmp || If present, all files in the tmp directory are deleted before the pull starts
|-
| report || -report || If present, the report is written to a file called report.json in the target (tgt) directory
|-
| h || -h || Display the help information
|}
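
When the pull is driven from another program, the arguments above can be assembled programmatically. The Python sketch below turns a dictionary of options into an argument list for the pull script. The helper names and the dictionary convention (True for a bare switch, a list for multiple values) are assumptions made for illustration and are not part of the compiler; only the script names and the argument names come from this page.

```python
import platform


def pull_script_name(system=None):
    """Return the pull script for the current (or given) operating system."""
    system = system or platform.system()
    return "buildFileSystem.bat" if system == "Windows" else "buildFileSystem.sh"


def build_pull_command(options, system=None):
    """Turn an options dict into an argument list for the pull script.

    Hypothetical convention used by this sketch:
    - True          -> a bare switch such as -zip
    - False or None -> the argument is omitted
    - list or tuple -> the flag followed by each value (several Dataflows)
    - anything else -> the flag followed by the value as a string
    """
    command = [pull_script_name(system)]
    for name, value in options.items():
        if value is None or value is False:
            continue
        command.append(f"-{name}")
        if value is True:
            continue
        if isinstance(value, (list, tuple)):
            command.extend(str(item) for item in value)
        else:
            command.append(str(value))
    return command
```

Building the command as a list rather than a single string means each value stays a separate argument, so values containing spaces or parentheses, such as ACY:DF_ID(1.0), do not need shell quoting if the list is later passed to a process launcher.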