Rules Language

Semarchy xDG uses a domain-specific language to define rules for calculated badges and metrics.

The Semarchy xDG rules language uses a SemQL-like syntax to define Expressions and Conditions, which combine one or more Values, Operators, and Functions.

Overview

Expressions and Conditions are the phrases supported by the rules language.

  • Expressions return a value.

  • Conditions exclusively return a boolean value (true or false).

Values, operators, and functions are the tokens used in the expressions and conditions.

  • Values are simple expressions, such as:

    • Literals, that is, constant values.

    • Variables, which are asset properties. For example, the description, the metrics, or the owners of an asset.

  • Operators modify, combine, or compare expressions. They include arithmetic, comparison, and logical operators.

  • Functions combine other tokens to create new expressions.

Note that case-sensitivity differs for the language tokens:

  • Operators are not case-sensitive.

  • Functions are case-sensitive.

  • Variables are always case-sensitive.

Expression grouping using parentheses is supported. For example, (5+3)/2.

Operators

Arithmetic Operators

The following table lists the available arithmetic operators.

Table 1. Arithmetic operators
Operator Description

+

Addition

-

Subtraction

*

Multiplication

/

Division
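
As an illustration of arithmetic operators applied to variables, the following expression sketches an average row size in bytes for a dataset. It uses only the Profile measures listed in the Profiling Information section. How the language handles a rowCount of zero is not described here, so treat the result for an empty dataset as unspecified.

```
Profile.sizeInBytes / Profile.rowCount
```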

Comparison Operators

The following table lists the available comparison operators.

Table 2. Comparison operators
Operator Description

==

Equality

!=

Inequality

>, >=

Greater than, greater than or equal to

<, <=

Less than, less than or equal to

Logical Operators

The following table lists the available logical operators.

Table 3. Logical operators
Operator Description

AND

Returns true if both conditions are true.

OR

Returns true if at least one of the conditions is true.
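
A minimal sketch of a condition that combines comparison and logical operators: it should return true when the dataset has at least one row and the asset has at least one owner. It follows the same shape as Example 2 at the end of this page, where comparisons are evaluated before the logical operator. Whether both variables are populated on a given asset depends on that asset.

```
Profile.rowCount > 0 AND Metrics.nbOwners >= 1
```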

Functions

The following table lists the available functions.

Table 4. Built-in functions
Function Description

now()

Returns the current timestamp.

duration(<duration>, "<unit>")

Returns a duration of <duration> expressed in <unit>. The duration is a number, and the unit is a string surrounded by double quotes. For example, duration(5, "days").

You can also use shorthand for units, for example, "d" instead of "days". The units and shorthands are:

  • d: days

  • w: weeks

  • m: months

  • y: years

  • h: hours

  • m: minutes

  • s: seconds

  • ms: milliseconds

hasVariance(<metrics_array>)

Returns true if the metric has changed over time. Prefix the metric name with an underscore to get the metric values array. For example, hasVariance(Metrics._nbAssets).

nvl(<value>, <default>)

Returns the value if it is not null, otherwise returns the default value. For example, nvl(Metrics.nbAssets, 0).
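
The functions above can be combined in a single condition. As a sketch built only from the functions in this table: the metrics were captured within the last 24 hours, and the asset has no owner, with a missing nbOwners value treated as zero. The 24-hour window is an arbitrary value chosen for illustration.

```
Metrics.latestTimestamp + duration(24, "hours") >= now() AND nvl(Metrics.nbOwners, 0) == 0
```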

Literals

Literals are constant values of the following types:

  • Boolean: The value is true or false.

  • String: The value is surrounded by double quotes. For example, "mystring".

  • Number: The value is written as is. For example, 25.45.

Variables

Variables are asset properties, which include the properties, metrics, and profiling information attached to an asset.

Metrics

Asset metrics are accessible using the Metrics.<metricName> syntax. <metricName> refers to a built-in metric or a metric you have harvested.

Table 5. Built-in Metrics
Metric Name Description

nbAssets

Number of assets in a container.

nbOwners

Number of owners of an asset.

latestTimestamp

Latest timestamp of capture for the metrics.

nbGolden

Number of golden records. This metric only applies to Semarchy Data Management.

nbMaster

Number of master records. This metric only applies to Semarchy Data Management.

The metrics available on an asset are visible in the Metrics tab of the asset editor.
You can prefix the metric name with an underscore character (_) to return the array of historical values for that metric. Such an array is used with the hasVariance function.
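
To show the plain and underscore-prefixed forms of a metric side by side, here is a sketch of a condition that should return true when a container currently holds at least one asset and its asset count has changed over time. Wrapping the current value in nvl is a defensive assumption for assets on which nbAssets has not been captured.

```
nvl(Metrics.nbAssets, 0) > 0 AND hasVariance(Metrics._nbAssets)
```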

Profiling Information

Dataset profiling information is available using the Profile.<info> syntax. <info> refers to a built-in profiling measure.

Table 6. Built-in Profiling Information
Metric Name Description

columnCount

Number of columns of the dataset.

rowCount

Number of rows in the dataset.

sizeInBytes

Size in bytes of the data stored in the dataset.

timestampMillis

Timestamp of the latest profiling execution.

In addition, you can use the Profile.<columnName>.<info> syntax to access the profiling information of a specific column.

Table 7. Built-in Profiling Column Information
Metric Name Description

fieldPath

Column name.

uniqueCount

Number of unique values in the column.

uniqueProportion

Proportion of unique values in the column.

nullCount

Number of null values in the column.

nullProportion

Proportion of null values in the column.

min

Minimum value in the column.

max

Maximum value in the column.

mean

Mean value in the column.

median

Median value in the column.

stdev

Standard deviation of the column.

Example: Profile.member_id.nullProportion > 0 returns true if the column member_id has null values.
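
Column-level and dataset-level measures can be compared with each other. The following sketch checks whether member_id looks like a candidate key: no null values, and as many unique values as there are rows. It assumes that uniqueCount counts distinct values over the same rows that rowCount reports, which this page does not state.

```
Profile.member_id.nullCount == 0 AND Profile.member_id.uniqueCount == Profile.rowCount
```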

Properties

Asset properties are available using the Properties.<propertyName> syntax. For example, Properties.schema returns the schema name for a database asset.

The properties available on an asset are visible in the Properties tab of the asset editor.
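
A property can be compared to a literal to build a condition. In this sketch, "sales" is a hypothetical schema name used only for illustration; replace it with a value shown in the Properties tab of your asset.

```
Properties.schema == "sales"
```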

OwnersType

The owners of an asset are accessible by owner type using the OwnersType.<ownerType> syntax. For example, OwnersType.businessOwner returns the list of users who own the asset as business owners.

Example: nvl(OwnersType.businessOwner.length, 0) > 0 returns true if the asset has at least one business owner.

Existing owner types are:

  • Business Owner, noted as businessOwner

  • Technical Owner, noted as technicalOwner

  • Data Steward, noted as dataSteward

  • None, noted as none
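
Several owner types can be tested in one condition. As a sketch that reuses the nvl and length pattern from the example above, the following should return true if the asset has at least one business owner or at least one data steward.

```
nvl(OwnersType.businessOwner.length, 0) > 0 OR nvl(OwnersType.dataSteward.length, 0) > 0
```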

Examples

Example 1. Condition: The number of golden records of a Semarchy Data Management entity is equal to the number of rows in the physical table.
Metrics.nbGolden == Profile.rowCount
Example 2. Condition: The asset was modified within the last seven days and its number of owners has changed.
Metrics.latestTimestamp + duration(7,"d") >= now()
 and hasVariance(Metrics._nbOwners)