Fusion Edge Server Audit

From Fusion Registry Wiki
Revision as of 09:08, 18 January 2024 by Plazarou (Response Formats)

This page details the Audit capabilities of Fusion Edge Server.

Audit File

All Audit information from Fusion Edge Server is written to Audit files which are located in the "Audit" sub-folder of your Edge Server Directory. Audit files are named according to the format:

 EdgeServerAudit_<launch time>_<log index>.json

For example:

 EdgeServerAudit_1704067199000_1.json

The "launch time" value represents the time at which the Edge Server was started. It is expressed as a long number: the time in milliseconds since 1 January 1970 (UTC).
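As a minimal sketch of how the file name can be decoded, the following Python splits a name into its launch time and log index and converts the launch time to a readable UTC date. The function name `parse_audit_filename` is hypothetical and not part of Fusion Edge Server; the pattern is taken from the naming format above.

```python
import re
from datetime import datetime, timezone

# Pattern derived from the documented format:
# EdgeServerAudit_<launch time>_<log index>.json
AUDIT_NAME = re.compile(r"^EdgeServerAudit_(\d+)_(\d+)\.json$")


def parse_audit_filename(name):
    """Return (launch time as a UTC datetime, log index) for an audit file name."""
    match = AUDIT_NAME.match(name)
    if match is None:
        raise ValueError(f"not an Edge Server audit file name: {name!r}")
    launch_ms = int(match.group(1))
    log_index = int(match.group(2))
    # The launch time is in milliseconds; datetime expects seconds.
    launched = datetime.fromtimestamp(launch_ms / 1000, tz=timezone.utc)
    return launched, log_index


launched, index = parse_audit_filename("EdgeServerAudit_1704067199000_1.json")
print(launched.isoformat(), index)  # 2023-12-31T23:59:59+00:00 1
```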

Audit information will be written to this file until either the Edge Server is terminated or the file reaches the size limit of 10 MB. If the limit is reached, a new file is created with the next incremental Log Index value. Audit files that are actively being written to may be "locked" by your Operating System until either the Edge Server is terminated or a new audit file is started.

Each Audit file contains JSON, but it is not formatted for readability. While a file is still being written it lacks the final closing bracket "]" needed to make its contents valid JSON. The file is only "completed" by termination of the Edge Server or by a new audit file being started. To inspect the Audit file that is currently being written, you may therefore have to:

  • Create a copy of the Audit file
  • Open this copy in the editor of your choice (for example Notepad++)
  • Add a final "]" character to the end of the file
  • Format the JSON for readability
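The manual steps above can also be scripted. The following is a minimal Python sketch, assuming the file is UTF-8 text and that the only thing missing from an in-progress file is the final "]" (a trailing comma after the last event is also tolerated). The function name `load_audit_events` is hypothetical. If the operating system has locked the live file, run it against a copy.

```python
import json
from pathlib import Path


def load_audit_events(path):
    """Load the events from an audit file, tolerating a file still being written."""
    text = Path(path).read_text(encoding="utf-8").strip()
    if not text:
        return []
    try:
        # A completed file is already a valid JSON array.
        return json.loads(text)
    except json.JSONDecodeError:
        # Assumption: the file is only missing its closing bracket.
        # If an event was half-written, this second attempt raises as well.
        return json.loads(text.rstrip(",") + "]")


# Example usage (the path is illustrative):
# events = load_audit_events("Audit/EdgeServerAudit_1704067199000_1.json")
# print(json.dumps(events, indent=2))  # formatted for readability
```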

Disabling Audit

Auditing can be disabled by modifying the Edge Server properties file and adding the entry:

 audit.disabled=true

See the properties page for more information.

Contents of the Audit File

A basic understanding of the JSON format is needed to parse the Audit file. The file contains a JSON array of items. Each item in the array is a JSON Object, delimited by "curly braces" { and }. Some JSON Objects contain other JSON Objects (e.g. "properties" in the example below). Each item in the top-level JSON array represents a unique audit event in the Edge Server. The following shows an example of one such event:

 {
   "uid": "e3de1d84-2413-4b7a-ae1d-754ad38d3a9f",
   "process_id": "REST_API",
   "thread": "http-nio-8084-exec-8",
   "event_type": "GET",
   "username": "guest",
   "process_start": 1699023147303,
   "process_end": 1699023147342,
   "duration": 39,
   "status": 200,
   "vmid": "a58a00a880f5f938:4d097009:18b95a67fb4:-8000",
   "machine_id": "DESKTOP-DSTGA0Q/192.168.1.14",
   "software_version": "4.7.2.0.0.0",
   "properties": {
     "QueryParameters": {
       "c[FREQ]": "A",
       "c[REF_AREA]": "BE"
     },
     "HttpHeaders": {
       "host": "localhost:8084",
       "user-agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:109.0) Gecko/20100101 Firefox/119.0",
       "accept": "text/html,application/xhtml+xml,application/xml;q=0.9,image/avif,image/webp,*/*;q=0.8",
       "accept-language": "en-GB,en;q=0.5",
       "connection": "keep-alive"
     },
     "IP": "127.0.0.1",
     "Path": "/sdmx/v2",
     "PathInfo": "/data/dataflow/BIS/BIS_CBPOL/1.0",
     "Locale": "en_GB",
     "HttpStatus": 200
   }
 }

The above JSON can be described in the following manner, where the items in brackets indicate the field that is being referred to.

A GET event ("event_type") was performed on Friday, 3 November 2023 at 14:52:27.303 GMT ("process_start" of 1699023147303) by user "guest" ("username"). This process completed successfully ("HttpStatus" of 200) and was a request against the Dataflow BIS:BIS_CBPOL(1.0) ("PathInfo"). The query was further constrained by a Frequency of "A" and a Reference Area of "BE" ("QueryParameters").

There is more information contained in this one JSON object, but that allows us to get a quick overview of what the request was for.
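The same reading of an event can be produced programmatically. The sketch below, in Python, builds a one-line summary from the fields described above. The function name `describe_event` and the layout of the summary are illustrative choices, and the event is a trimmed copy of the example.

```python
from datetime import datetime, timezone


def describe_event(event):
    """Build a one-line summary of a single audit event (a dict parsed from JSON)."""
    props = event.get("properties", {})
    # process_start is in milliseconds since 1970; datetime expects seconds.
    started = datetime.fromtimestamp(event["process_start"] / 1000, tz=timezone.utc)
    params = ", ".join(
        f"{key}={value}" for key, value in props.get("QueryParameters", {}).items()
    )
    return (
        f"{started:%Y-%m-%d %H:%M:%S} UTC {event.get('event_type')} "
        f"{props.get('Path', '')}{props.get('PathInfo', '')} "
        f"by {event.get('username')} -> HttpStatus {props.get('HttpStatus')} "
        f"in {event.get('duration')} ms [{params}]"
    )


# Trimmed copy of the example event shown above.
event = {
    "event_type": "GET",
    "username": "guest",
    "process_start": 1699023147303,
    "duration": 39,
    "properties": {
        "QueryParameters": {"c[FREQ]": "A", "c[REF_AREA]": "BE"},
        "Path": "/sdmx/v2",
        "PathInfo": "/data/dataflow/BIS/BIS_CBPOL/1.0",
        "HttpStatus": 200,
    },
}
print(describe_event(event))
# 2023-11-03 14:52:27 UTC GET /sdmx/v2/data/dataflow/BIS/BIS_CBPOL/1.0 by guest -> HttpStatus 200 in 39 ms [c[FREQ]=A, c[REF_AREA]=BE]
```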

Child events

Some audit events have a parent UID (element: "parent"). This shows that this audit event was created from the UID referenced in the "parent" value.

Elements

The following describes each of the elements that can be found in a JSON Object:

uid The Unique Identifier (UID) of the event
process_id The type of process requested. Values include "APPLICATION_START" (the Edge Server start event), "SDMX_GET" (seen on child events of a REST request, such as the one that records the response format) and "REST_API" (a query for structures or data)
thread The internal name of the thread that this ran on within the Edge Server
event_type The HTTP Event Type that was performed
username The identity of the user that performed this request
process_start The start time (in milliseconds since 1970) that this request was performed
process_end The end time (in milliseconds since 1970) that this request was performed
duration The total time (in milliseconds) to complete the request
status The returned HTTP status
vmid The VMID of the system running the request
machine_id The identity of the machine that was running the Edge Server that performed this request
software_version The version of the software of Fusion Edge Server
properties A JSON Object containing parameters submitted in the request
QueryParameters A JSON Object containing keys and values that were in the request (e.g. Key: "FREQ" and value "A")
HttpHeaders A JSON Object containing elements that were present in the header request
IP The IP of the originating request
Path The part of the request's URL that identifies the Web Service called. For example, "/sdmx/v2" indicates a version 2 SDMX request, and "/ws/public/sdmxapi/rest" indicates a request to the version 1 API
PathInfo The extra information added to the "Path". This identifies the individual structures or datasets being requested.
Locale The locale used in the request
HttpStatus The HTTP Status of the request

Specific Events in the Audit File

When specific requests are made to the Edge Server, the Audit file will contain specific elements. This section details what to expect in the Audit file when particular events occur.

Startup Information

On successful startup of Fusion Edge Server, 9 events will be written to the Audit Log. These can be identified by the Process Ids "LOAD_PROPERTIES", "APPLICATION_START" and "ENVIRONMENT".

The first event has a Process Id of "LOAD_PROPERTIES". This states that the Edge Server properties file has successfully loaded.

Next (although it will likely appear last in the list of 9) is an event with a Process Id of "APPLICATION_START", which contains all of the properties of the system. This spawns 5 child processes, all with a Process Id of "APPLICATION_START". Each of these children has a "properties" object with a single class within it. These classes are: SpringBeansContainer; EdgeServerAuditPersistenceManager; EdgeServerLedgerReaderManager; AuditEventManager; SDMXCacheManager.

The process "EdgeServerLedgerReaderManager" will also launch a child process which has a Process Id of "ENVIRONMENT" and an "event_type" of "UPDATE". This in turn launches a child process with Process Id of "ENVIRONMENT" and an "event_type" of "LIVE".

Since child processes are written before their parents, the logical tree structure of the audit events is as follows. The number at the start of the line shows the likely order it will be written to the Audit file:

 1. process_id: LOAD_PROPERTIES
 9. process_id: APPLICATION_START  - with all the system props
   2. process_id: APPLICATION_START  - Class: SpringBeansContainer
   3. process_id: APPLICATION_START  - Class: EdgeServerAuditPersistenceManager
   6. process_id: APPLICATION_START  - Class: EdgeServerLedgerReaderManager
     5. process_id: ENVIRONMENT    event_type: UPDATE
       4. process_id: ENVIRONMENT  event_type: LIVE
   7. process_id: APPLICATION_START  - Class: AuditEventManager
   8. process_id: APPLICATION_START  - Class: SDMXCacheManager
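A tree like the one above can be rebuilt from the "uid" and "parent" elements of the events. The following Python sketch is one way to do it. The function name `render_event_tree` and the sample UIDs ("u1" to "u5") are hypothetical, and the sample is a shortened startup sequence, not real audit output.

```python
from collections import defaultdict


def render_event_tree(events):
    """Return display lines that nest each event under its parent.

    The number on each line is the event's position in the audit file.
    """
    known_uids = {event["uid"] for event in events}
    children = defaultdict(list)
    for position, event in enumerate(events, start=1):
        parent = event.get("parent")
        # Events without a parent, or whose parent is not in this file, become roots.
        children[parent if parent in known_uids else None].append((position, event))

    lines = []

    def walk(parent_uid, depth):
        for position, event in children.get(parent_uid, []):
            label = f"{position}. {event['process_id']} {event.get('event_type', '')}"
            lines.append("  " * depth + label.rstrip())
            walk(event["uid"], depth + 1)

    walk(None, 0)
    return lines


# Shortened, hypothetical startup sequence in the order it would be written.
events = [
    {"uid": "u1", "process_id": "LOAD_PROPERTIES"},
    {"uid": "u2", "parent": "u3", "process_id": "ENVIRONMENT", "event_type": "LIVE"},
    {"uid": "u3", "parent": "u4", "process_id": "ENVIRONMENT", "event_type": "UPDATE"},
    {"uid": "u4", "parent": "u5", "process_id": "APPLICATION_START"},
    {"uid": "u5", "process_id": "APPLICATION_START"},
]
print("\n".join(render_event_tree(events)))
```

Because children are written before their parents, the positions printed inside a branch decrease as the nesting gets deeper, as in the listing above.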

Data Request

When a request for Data is performed, a single audit event is created with the following elements:

  • process_id: REST_API
  • event_type: GET
  • Path: /ws/public/sdmxapi/rest or /sdmx/v2
  • PathInfo: the dataflow requested such as /data/dataflow/BIS/BIS_CBPOL/1.0

The "status" element will likely have a value of 200, indicating that the request was handled correctly. However, to determine whether the request actually returned data, check the "HttpStatus" element located within the "properties" element. If this value is 404, no data was returned for the request.

The following, truncated, audit event shows a successful request (status and HttpStatus of 200):

 {
   "uid": "7dd76de6-895c-4f53-bb35-18519b85580e",
   "process_id": "REST_API",
   "thread": "http-nio-8084-exec-6",
   "event_type": "GET",
   "username": "guest",
   "process_start": 1705568446233,
   "process_end": 1705568450320,
   "duration": 4087,
   "status": 200,
   "vmid": "27bece96ce048b52:9c501c3:18d194af243:-8000",
   "machine_id": "DESKTOP-DSTGA0Q/192.168.1.16",
   "software_version": "4.8.2",
   "properties": {
     "QueryParameters": {
       "c[FREQ]": "A"
     },
     "HttpHeaders": {
       "host": "localhost:8084",
       "user-agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:121.0) Gecko/20100101 Firefox/121.0",
       ...
     },
     "IP": "127.0.0.1",
     "Path": "/sdmx/v2",
     "PathInfo": "/data/dataflow/BIS/BIS_CBPOL/1.0",
     "Locale": "en_GB",
     "HttpStatus": 200
   }
 }

The following (truncated) event shows a request which returned no data. It differs from the previous event in the UID, the properties (the requested parameters) and the HttpStatus of 404:

 {
   "uid": "0e55c832-8844-46a5-9fa5-6aa6875e66b6",
   "process_id": "REST_API",
   ...
   "status": 200,
   ...
   "properties": {
     "QueryParameters": {
       "c[FREQ]": "ZZZ"
     },
     "HttpHeaders": {
       ...
     },
     "IP": "127.0.0.1",
     "Path": "/sdmx/v2",
     "PathInfo": "/data/dataflow/BIS/BIS_CBPOL/1.0",
     "Locale": "en_GB",
     "HttpStatus": 404
   }
 }
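The difference between the two events above can be tested in code. The following Python sketch checks the nested "HttpStatus" and not the top-level "status". The function names are hypothetical. Treating a "PathInfo" that starts with "/data/" as a data request is an assumption based on the examples on this page.

```python
def is_data_request(event):
    """True for REST_API GET events whose PathInfo points at data."""
    props = event.get("properties", {})
    return (
        event.get("process_id") == "REST_API"
        and event.get("event_type") == "GET"
        # Assumption: data requests have a PathInfo beginning with /data/.
        and props.get("PathInfo", "").startswith("/data/")
    )


def returned_no_data(event):
    """True when a data request was handled but no data was returned."""
    # The top-level "status" can be 200 even when nothing was found,
    # so the check uses "HttpStatus" inside "properties".
    return is_data_request(event) and event["properties"].get("HttpStatus") == 404


# Trimmed copy of the "no data" example above.
empty_result = {
    "process_id": "REST_API",
    "event_type": "GET",
    "status": 200,
    "properties": {
        "QueryParameters": {"c[FREQ]": "ZZZ"},
        "PathInfo": "/data/dataflow/BIS/BIS_CBPOL/1.0",
        "HttpStatus": 404,
    },
}
print(returned_no_data(empty_result))  # True
```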

If the request specifies a response format (e.g. ?format=sdmx-json), a child audit event will also be written, which contains the "ResponseFormat" in a more explicit form, for example "sdmx-json v2.0.0". The child event is identifiable by its "parent" element, which holds the UID of the event that spawned it, and may look like this:

 {
   "uid": "84f98074-0176-48a2-95ba-95d3b1465aad",
   "parent": "7609d4fb-1583-4440-8a3e-4ec76c324455",
   "process_id": "SDMX_GET",
   "thread": "http-nio-8084-exec-7",
   "event_type": "structure",
   "username": "guest",
   "process_start": 1705527795701,
   "process_end": 1705527795798,
   "duration": 97,
   "status": 200,
   "properties": {
     "Cache": "miss",
     "ResponseFormat": "sdmx-json v2.0.0"
   }
 }

Structure Request

When a request for Structural Metadata is performed, a single audit event is created with the following elements:

  • process_id: REST_API
  • event_type: GET
  • Path: /ws/public/sdmxapi/rest or /sdmx/v2
  • PathInfo: the structure requested such as /structure/agencyscheme/all/all

This audit event is very similar to that of a data request. Likewise, if no information was returned, the "HttpStatus" will be 404, and if the request specifies a response format (e.g. ?format=sdmx-3.0), a child audit event will also be written, which contains the "ResponseFormat" in a more explicit form, for example "SDMX_V3_STRUCTURE_DOCUMENT".

Availability Request from Fusion Data Browser

If the Fusion Data Browser is being used against the Fusion Edge Server, availability requests may be audited. These can be identified by the following elements:

  • process_id: REST_API
  • event_type: GET
  • Path: /sdmx/v2
  • PathInfo: /availability followed by the dataflow being queried. For example: /availability/dataflow/BIS/BIS_CBPOL/1.0

The field "properties" -> "QueryParameters" -> "mode" will also contain the value "available".
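Data, structure and availability requests all share the same "process_id" and "event_type", so the "PathInfo" is what tells them apart. As a minimal sketch, the following Python classifies an event by the "PathInfo" prefixes seen in the examples on this page and counts each kind. The function name `request_kind` and the assumption that these prefixes are sufficient are illustrative only.

```python
from collections import Counter

# Prefixes taken from the PathInfo examples on this page.
PATH_INFO_KINDS = (
    ("/data/", "data"),
    ("/structure/", "structure"),
    ("/availability/", "availability"),
)


def request_kind(event):
    """Classify a REST_API GET event by its PathInfo; None if not recognised."""
    if event.get("process_id") != "REST_API" or event.get("event_type") != "GET":
        return None
    path_info = event.get("properties", {}).get("PathInfo", "")
    for prefix, kind in PATH_INFO_KINDS:
        if path_info.startswith(prefix):
            return kind
    return None


def count_request_kinds(events):
    """Count recognised request kinds across a list of audit events."""
    kinds = (request_kind(event) for event in events)
    return Counter(kind for kind in kinds if kind is not None)
```

For example, `count_request_kinds(events)` on a parsed audit file gives a quick view of how many data, structure and availability requests it contains.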


Search Request from Fusion Data Browser

If the Fusion Data Browser is being used against the Fusion Edge Server, search requests may be audited. These can be identified by the following elements:

  • process_id: REST_API
  • event_type: GET
  • Path: /ws/public
  • PathInfo: /datasearch

The field "properties" -> "QueryParameters" -> "query" will contain the search value. Since this request was created by the Data Browser, the "referer" field of the HTTP headers will state the location of the Data Browser.

Since the search field of Fusion Data Browser performs searches as the user types, you may see multiple similar or partial searches when a user makes a single search. For example, if the user wants to search for "CREDIT", you may see audit entries for searches for "C", "CR", "CRE", "CRED", "CREDI" and "CREDIT" as the user types.
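When analysing searches, it can help to collapse these as-you-type entries into the final search term. The following Python sketch is a simple heuristic, not a feature of the Edge Server: it drops any query that is a prefix of the query that follows it. It assumes the queries are already in the order they were audited and come from a single user. The function name `collapse_typeahead` is hypothetical.

```python
def collapse_typeahead(queries):
    """Keep only the queries that are not a prefix of the query that follows them."""
    kept = []
    for current, following in zip(queries, queries[1:] + [None]):
        is_partial = (
            following is not None
            and following != current
            and following.startswith(current)
        )
        if not is_partial:
            kept.append(current)
    return kept


typed = ["C", "CR", "CRE", "CRED", "CREDI", "CREDIT", "RATE"]
print(collapse_typeahead(typed))  # ['CREDIT', 'RATE']
```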

Distinguishing Between Different Request Mechanisms

Requests to the Edge Server can come from a number of different mechanisms, for example a Web Browser, a command-line tool such as curl, or the Fusion Data Browser.

The "user-agent" field within the "HttpHeaders" object can help here. Some tools, such as curl, supply an obvious and distinct "user-agent". The following snippet shows a request that has come from curl:

 "properties": {
   "HttpHeaders": {
     "host": "MyTestEdgeServer:8084",
     "user-agent": "curl/8.4.0",
     "accept": "*/*"
   },

Other values you may encounter depend upon the user's configuration. The following is the user-agent value for Firefox v121.0 running on Windows 10:

 "user-agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:121.0) Gecko/20100101 Firefox/121.0"

And the following is the user-agent for Microsoft Edge, running on Windows 10:

 "user-agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.36 Edg/120.0.0.0",

The Fusion Data Browser identifies itself in a slightly different manner. A "referer" field under "properties" -> "HttpHeaders" shows that the request originated from the Fusion Data Browser. The value of the "referer" field will contain the home location of the Fusion Data Browser:

 "properties": {
   "QueryParameters": {
     "locale": "en",
     "saveAs": "BIS_CBPOL",
     "format": "csv",
     "labels": "both"
   },
   "HttpHeaders": {
     ...
     "referer": "http://localhost:8084/FusionDataBrowser/"
   }
 }
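The checks described in this section can be combined into a single function. The following Python sketch uses only the header values shown in the examples above. The function name `request_source` is hypothetical. Matching "FusionDataBrowser" in the "referer" is an assumption based on the example URL, and your deployment may use a different location.

```python
def request_source(event):
    """Guess which mechanism made a request, from its audited HTTP headers."""
    headers = event.get("properties", {}).get("HttpHeaders", {})
    # Assumption: the Data Browser's home location contains "FusionDataBrowser".
    if "FusionDataBrowser" in headers.get("referer", ""):
        return "Fusion Data Browser"
    agent = headers.get("user-agent", "")
    if agent.startswith("curl/"):
        return "curl"
    # Edge must be tested before other browsers: its user-agent
    # also contains "Chrome/" and "Safari/".
    if "Edg/" in agent:
        return "Microsoft Edge"
    if "Firefox/" in agent:
        return "Firefox"
    return "other"


curl_event = {"properties": {"HttpHeaders": {"user-agent": "curl/8.4.0"}}}
print(request_source(curl_event))  # curl
```

The referer check comes first because a Data Browser request still carries the user-agent of the web browser that is running it.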

Response Formats

When a response format is specified on a data request, a child audit event will be created. Note that this will be output before the parent audit event. The child event contains a value under "properties" -> "ResponseFormat" stating what the output was.

The following table lists what will be output in the Audit file for each of the specified response formats (usually specified by format= in the request URL):

Format Value Output String
sdmx-compact-2.1 "Structure Specific (Compact) 2.1"
sdmx-generic-2.1 "Generic 2.1"
sdmx-compact-2.0 "Structure Specific (Compact) 2.0"
sdmx-generic-2.0 "Generic 2.0"
sdmx-csv-2.0.0, sdmx-csv-1.0.0, csv, csv-ts "csv"
sdmx-json "SDMX-JSON"
sdmx-edi "SDMX-EDI"
excel "Excel (XLSX)"
excel-table "fusion-excel-table"
excel-series "fusion-excel-series"
excel-ts "fusion-excel-ts"
excel-refresh "Excel Refreshable"

More detail of the response formats can be found here.
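For scripted analysis, the table above can be held as a lookup. The following Python sketch transcribes it, reading the four CSV-style format values as sharing the single output string "csv". The names `RESPONSE_FORMAT_OUTPUT` and `formats_for_output` are hypothetical. The examples earlier on this page show that the audited string can be more specific than the table entry (for example "sdmx-json v2.0.0"), so treat an exact match as a starting point only.

```python
# Format value (as given in format=...) -> string written to "ResponseFormat".
RESPONSE_FORMAT_OUTPUT = {
    "sdmx-compact-2.1": "Structure Specific (Compact) 2.1",
    "sdmx-generic-2.1": "Generic 2.1",
    "sdmx-compact-2.0": "Structure Specific (Compact) 2.0",
    "sdmx-generic-2.0": "Generic 2.0",
    "sdmx-csv-2.0.0": "csv",
    "sdmx-csv-1.0.0": "csv",
    "csv": "csv",
    "csv-ts": "csv",
    "sdmx-json": "SDMX-JSON",
    "sdmx-edi": "SDMX-EDI",
    "excel": "Excel (XLSX)",
    "excel-table": "fusion-excel-table",
    "excel-series": "fusion-excel-series",
    "excel-ts": "fusion-excel-ts",
    "excel-refresh": "Excel Refreshable",
}


def formats_for_output(output):
    """List the format values that the table maps to an audited output string."""
    return sorted(
        value for value, text in RESPONSE_FORMAT_OUTPUT.items() if text == output
    )


print(formats_for_output("csv"))
# ['csv', 'csv-ts', 'sdmx-csv-1.0.0', 'sdmx-csv-2.0.0']
```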