SemQL Syntax | ||
---|---|---|
Previous | Next | |
Introduction to SemQL | Attributes and Variables |
The SemQL syntax supports the equivalent of SQL Expressions, Conditions or Order By Clause, which are a combination of one or more Values, Operators, and Functions.
Values, operators and functions are the tokens in the SemQL language.
+
,
-
,
*
,
/
,
||
), comparison operators (
=
,
!=
,
>
,
>=
,
<
,
<=
,
IN
,
BETWEEN
,
LIKE
,
REGEXP_LIKE
,
IS NULL
) and logical operators (
AND
,
OR
,
NOT
).
Important: Operators and Functions are not case-sensitive. Values are case-sensitive .
Expressions, Conditions and Order By Clauses are the ‘phrases’ supported by the SemQL Language.
ASC
or
DESC
post-fix define the sort order. Default sort order is ascending.
FirstName
is an attribute.
'USA'
is a string literal.
Initcap(FirstName)
is a function using an attribute.
Initcap(FirstName) || ' ' || Initcap(LastName)
is a combination of operators, functions, attributes and literals.
1=1
compares two literals.
Country='USA'
compares an attribute and a literal.
Upper(Country) in ('USA', 'US', 'CANADA')
uses a function.
Country
sorts by the Country attribute (ascending by default)
Country DESC
sorts by the Country attribute (descending)
CASE PublisherID WHEN 'MKT' THEN 1 WHEN 'CRM' THEN 2 ELSE 99 END ASC
sorts records where PublisherID = ‘MKT’, then ‘CRM’, then the rest.
The following sections detail the elements of a SemQL clause.
Comments in SemQL are surrounded by
/*
and
*/
.
An example of code with comments is provided below.
ANY Contacts HAVE (1=1) /* customer with contacts */
AND
NOT ANY Contacts HAVE ( IsInfluencer = 1 ) /* and no contact is an influencer */
The values in a SemQL expression may be literals, attributes or model variables:
'
. Examples:
'John'
-42
'1994-11-07'
Note: For more information about Attributes and Model Variables, refer to the Attributes and Variables chapter.
Operators are used to:
This section details the operators supported in SemQL.
Operator | Description |
---|---|
+
|
Addition |
-
|
Subtraction |
*
|
Multiplication |
/
|
Division |
the
||
(double pipe) is used for string concatenation.
Operator | Description |
---|---|
=
|
Equality |
!= ,
<>
|
Inequality |
> ,
>=
|
Greater than, greater than or equal |
< ,
<=
|
Smaller than, smaller than or equal |
IN (value_1, ..., value_n)
|
Compares a value with each value in the list, returns true if one value matches. |
BETWEEN value_1 and value_2
|
Greater than or equal to value_1 and less than or equal to value_2 |
LIKE pattern
|
TRUE if value matches the pattern. Within the pattern, the character “%” matches any string of zero or more characters except null. The character “_” matches any single character. |
REGEXP_LIKE(string, pattern, parameter)
|
returns true if the string matches the regular expression pattern. The match parameter may contain one of more of the following options:
i : case-insensitive match,
c case sensitive match,
n allows the period (.) to match the ‘newline character’ instead of ‘any character’ ,
m treats the source string as a multiple lines input. Note that
REGEXP_LIKE has a boolean result and is considered a condition and not a function.
|
IS [NOT] NULL
|
Tests for nulls |
any child_entity_role have ( <condition_on_child_entity> )
|
Condition that returns true if any of child records – in a one to many relationship – meet the given condition. For more information, see Using Related Entities' Attributes. |
all <child_entity_role> have ( <condition_on_child_entity> )
|
Condition that returns true if all of child records – in a one to many relationship – meet the given condition. For more information, see Using Related Entities' Attributes |
Operator | Description |
---|---|
AND
|
Return true if both conditions are true. |
OR
|
Return true if one condition of the other is true. |
NOT
|
Returns true if the following condition is false. |
SemQL support functions and expression constructs, that is elements that return a value.
Note: Functions differ from comparison operators as they return a non-boolean value. They cannot be used as is in conditions unless used with a comparison operator. For example,
TO_CHAR(IsValidCustomer)
is a valid expression, but not a valid condition.TO_CHAR(IsValidCustomer)='1'
is a valid condition.
The functions available in Semarchy Convergence for MDM include functions in the following categories:
The following list contains noteworthy functions and expressions:
TO_CHAR
,
TO_DATE
,
TO_NUMBER
functions to perform conversion across data types.
TRIM
,
LTRIM
,
RTRIM
,
PAD
LPAD
,
RPAD
to trip or pad with blanks.
SUBSTR
to retrieve a part of a string.
REPLACE
,
REGEXP_REPLACE
to replace part of a strings.
INSTR
to find the location of a substring in a string.
NULLIF
,
COALESCE
and
NVL
to handle null values.
GREATEST
and
LEAST
to return the greatest and least of a list of expressions.
SYSDATE
to retrieve the system date.
The complete set of built-in functions with their description is available in Appendix A
Certain functions are key in a fuzzy matching process.
Functions for normalizing of transforming values to reduce the noise during fuzzy matching:
UPPER
,
LOWER
and
INITCAP
absorb the case-sensitivity differences in strings.
SOUNDEX
returns phonetic representations of strings, absorbing typos.
SEM_NORMALIZE
returns a string with non-ASCII characters transformed to ASCII-equivalent or a blank.
Functions that implement fuzzy matching capabilities:
SEM_EDIT_DISTANCE
and
SEM_EDIT_DISTANCE_SIMILARITY
respectively returns the distance and percentage of similarity between two strings according to the
Levenshtein distance algorithm.
SEM_JARO_WINKLER
and
SEM_JARO_WINKLER_SIMILARITY
respectively return the distance and percentage of similarity between two strings according to the
Jaro-Winkler distance algorithm.
SEM_SEM_NGRAMS_SIMILARITY
returns the percentage of similarity of two strings according to the
Dice’s coefficient similarity measure applied to the n-grams of the strings.
Note: The
*_SIMILARITY
functions return a value between 0 (no match) and 100 (perfect match). If one or both strings are null, the returned value is 0.
The
CASE
expression selects a result from one or more alternatives, and returns this result.
This syntax returns the first result for which the expression matches the selector. If none match, it returns the default result.
CASE selector
WHEN expression_1 THEN result_1
...
WHEN expression_n THEN result_n
[ELSE default_result]
END
This syntax returns the first result for which the condition is true. If none is true, it returns the default result.
CASE
WHEN condition_1 THEN result_1
...
WHEN condition_n THEN result_n
[ELSE default_result]
END
The following example from an Enricher transforms the CustomerName attribute according to the Publisher of the record.
CASE PublisherID
WHEN 'CRM' THEN Upper(CustomerName)
WHEN 'MKT' THEN Upper(Replace(CustomerName, '-', ' '))
ELSE CustomerName
END
The same example with the second syntax:
CASE
WHEN PublisherID='CRM' THEN Upper(CustomerName)
WHEN PublisherID='MKT' THEN Upper(Replace(CustomerName, '-', ' '))
ELSE CustomerName
END
SemQL allows you to access database functions implemented in the database instance hosting the hub.
You can access these functions as regular functions by prefixing them with the name of the schema.
Important: The database user of the schema hosting the hub must have sufficient privileges to execute the customized functions.
Note: Database functions process data with the database engine. For certain processing involving for example algorithms, libraries or services not easily implemented with the database capabilities, it is preferable to opt for the plugin option. See the "Semarchy Convergence for MDM Plug-in Development Guide" for more information.
Previous | Top | Next |
Introduction to SemQL | Attributes and Variables |