Semarchy xDM Plug-in Reference Guide

Welcome to Semarchy xDM.
This guide provides reference information about the plug-ins delivered with the Semarchy xDM Platform.

Preface

Overview

This guide provides reference information about the plug-ins delivered with the Semarchy xDM Platform.
Using this guide, you will learn how to use these plug-ins in your MDM projects.

Audience

This document is intended for integration architects and developers setting up an MDM hub as part of their enterprise integration architecture.

To discover Semarchy xDM, you can watch our tutorials.

The Semarchy xDM Documentation Library, including the development, administration and installation guides, is available online.

Document Conventions

This document uses the following formatting conventions:

Convention	Meaning
boldface	Boldface type indicates graphical user interface elements associated with an action, or a product-specific term or concept.
italic	Italic type indicates special emphasis or a placeholder variable that you need to provide.
`monospace`	Monospace type indicates code examples, text or commands that you enter.

Other Semarchy Resources

In addition to the product manuals, Semarchy provides other resources available on its web site: https://www.semarchy.com.

Obtaining Help

There are many ways to access Semarchy Technical Support. You can call or email our global Technical Support Center (support@semarchy.com). For more information, see https://www.semarchy.com.

Feedback

We welcome your comments and suggestions on the quality and usefulness of this documentation.
If you find any errors or have suggestions for improvement, please email support@semarchy.com and indicate the title of the documentation along with the chapter, section, and page number, if available. Please let us know if you want a reply.

Introduction to Semarchy xDM

Semarchy xDM is the Intelligent Data Hub platform for Master Data Management (MDM), Reference Data Management (RDM), Application Data Management (ADM), Data Quality, and Data Governance.
It provides all the features for data quality, data validation, data matching, de-duplication, data authoring, workflows, and more.

Semarchy xDM brings extreme agility for defining and implementing data management applications and releasing them to production. The platform can be used as the target deployment point for all the data in the enterprise or in conjunction with existing data hubs to contribute to data transparency and quality.
Its powerful and intuitive environment covers all use cases for setting up a successful data governance strategy.

Semarchy xDM Plug-ins

Semarchy xDM implements plug-ins that use external services or information systems to contribute to the master data processing and enrichment.

Plug-ins are used in Semarchy xDM in:

Enrichers: By adding new enrichers, you can perform record-level enrichment to update, augment or standardize existing attribute values, or create content in new attributes. For example, you can connect to an external web service to retrieve stock ticker symbols from company names.
Validations: By adding new validations, you can perform record-level checks, that is, check the values of attributes in a record against complex rules. For example, you can connect to an external provider to check whether a billing or shipping address is valid.

INFO: Using Plug-ins is explained in the Semarchy xDM Developer’s Guide, in the Certification Process Design chapter. Installing plug-ins to your Semarchy xDM instance is explained in the Semarchy xDM Administration Guide, in the Configuring the Platform chapter.

The plug-ins are designed using the Open Plug-In Architecture. Plug-in design is covered in the Semarchy xDM Plug-in Development Guide.

Text Normalization and Transliteration

This plug-in applies normalization, transliteration and phonetic transformations to text strings.

Semarchy Text Enricher

Plug-in ID

Semarchy Text Enricher - com.semarchy.engine.plugins.convergence.text

Description

This enricher applies normalization, transliteration and phonetic transformations to text strings. It takes an Input Text and applies an Input Filter to this text, for example to remove all characters but letters. Then it applies a series of transformations defined in the Transformation parameter and returns a Transformed Text.

This plug-in is thread-safe and supports parallel execution.

Plug-in Parameters

The following table lists the plug-in parameters.

Parameter Name	Mandatory	Type	Description
Input Filter	No	String	Filter applied to the input text before the transformation. Valid values for the Filter are: `NONE`, which applies no filter, `LETTERS`, which removes all non-letter characters from the input string and `STANDARD`, which tokenizes the input text by splitting words.
Transformation	Yes	String	A pipe-separated sequence of transformation definitions. Transformations include: `NORMALIZE`, `TRANSLITERATE [<id>]`, `PHONETIC <type> [<max_code_length>]`, `BEIDERMORSE [<split>] [<rule_type>] [<max_phonemes>] [<name_type>]` and `DOUBLEMETAPHONE [<max_code_length>] [<split>]`. See the Transformations section for a detailed description of each transformation.
Synonyms Separator	No	String	Separator used between the synonyms returned by the enricher. Default value is a pipe (|).

Plug-in Inputs

The following table lists the plug-in inputs.

Input Name	Mandatory	Type	Description
Input Text	Yes	String	Text to transform.

Plug-in Outputs

The following table lists the plug-in outputs.

Output Name	Type	Description
Transformed Text	String	Filtered and transformed text.
Secondary Transformed Text	String	Secondary transformed text. This text may contain the secondary result of a Beidermorse or Double Metaphone transformation. See Other Transformations for more information.

Input Filters

The following input filters are supported by the enricher:

NONE: No filter is applied to the input text.
LETTERS: Removes all non-letter characters from the input string.
STANDARD: Breaks words in the input text according to the rules from the Unicode Text Segmentation algorithm, as specified in Unicode Standard Annex #29.
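
To make the effect of the simpler filters concrete, here is a minimal Python sketch of NONE and LETTERS. It is an illustration only, not the plug-in's code: the function name is hypothetical, and the use of `str.isalpha()` as the definition of "letter" (which also drops whitespace) is an assumption. STANDARD is left out because Unicode text segmentation is not available in the Python standard library.

```python
def apply_input_filter(text: str, input_filter: str = "NONE") -> str:
    """Hypothetical illustration of the NONE and LETTERS input filters."""
    if input_filter == "NONE":
        # No filter: the text is passed through unchanged.
        return text
    if input_filter == "LETTERS":
        # Keep only characters that Unicode classifies as letters.
        # Digits, whitespace and punctuation are all dropped (assumption).
        return "".join(ch for ch in text if ch.isalpha())
    raise ValueError(f"Filter not covered by this sketch: {input_filter}")


print(apply_input_filter("O'Brien-Smith 3rd", "LETTERS"))
```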

Transformations

The following transformation definitions are supported by the enricher:

Normalization
- NORMALIZE: applies a Normalization.
Phonetic Transformation
- PHONETIC [SOUNDEX | REFINEDSOUNDEX | METAPHONE [<max_code_length>] | DOUBLEMETAPHONE [<max_code_length>] | CAVERPHONE | CAVERPHONE1 | NYSIIS | MRA | COLOGNE | BEIDERMORSE]: applies one of the Phonetic Transformations.
Other Transformations
- BEIDERMORSE [<split>] [<rule_type>] [<max_phonemes>] [<name_type>]
- DOUBLEMETAPHONE [<max_code_length>] [<split>]
Transliteration
- TRANSLITERATE [<id>]: applies a Transliteration to the string. The transliteration is identified by an ID. If no ID is provided, the Any-Latin transliteration is used.

It is possible to sequence transformations. Successive transformations are separated by a pipe | sign.
Examples of transformations:

Normalize and apply Phonetic Soundex: NORMALIZE | PHONETIC SOUNDEX
Normalize and then transliterate to Latin script: NORMALIZE | TRANSLITERATE Any-Latin
Normalize, transliterate to Latin script and then apply Metaphone with a maximum resulting length of 5 characters: NORMALIZE | TRANSLITERATE Any-Latin | PHONETIC METAPHONE 5
Perform a BEIDERMORSE transformation for family names with an approximate transformation on generic name types: BEIDERMORSE APPROX 10 FALSE GENERIC
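
As a rough illustration of how such a sequence breaks down, the following Python sketch splits a Transformation string into ordered steps. The function name and the returned tuple layout are hypothetical, and the plug-in's own parser may behave differently (for example in how it validates arguments).

```python
def parse_transformations(definition: str) -> list[tuple[str, list[str]]]:
    """Split a pipe-separated definition into (name, arguments) steps."""
    steps = []
    for part in definition.split("|"):
        tokens = part.split()
        if not tokens:
            raise ValueError("Empty transformation in sequence")
        # The first token is the transformation name, the rest are arguments.
        steps.append((tokens[0].upper(), tokens[1:]))
    return steps


for name, args in parse_transformations(
    "NORMALIZE | TRANSLITERATE Any-Latin | PHONETIC METAPHONE 5"
):
    print(name, args)
```

Steps are returned in the order in which they would be applied, from left to right.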

Normalization

The NORMALIZE transformation normalizes the string by applying a series of transformations, which map similar characters to a common target, to ignore certain distinctions between similar characters. This includes accent removal, case folding, etc.

Examples of normalization:

Original Text	Normalized Text	Comments
‒ – — ―	- - - -	Four different dashes converted to the same hyphen
AbSoLuteLy TRUE	absolutely true	Case folding
…	...	Ellipsis character converted to three dots
½ Tsp	1/2 tsp	Symbol folding
Æsop	aesop
Äsop	asop
Dürst	durst
Encyclopædia	encyclopaedia
œuvre	oeuvre
poſt	post
résumé français	resume francais	Accent removal and case folding
Straße	strasse
٣ is a magic number	3 is a magic number	Native digit folding
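
The following Python sketch approximates a small subset of these foldings (accent removal, case folding and some letterform folding) with the standard `unicodedata` module. It is only an approximation under stated assumptions: the function name is hypothetical, and it does not reproduce the full list of foldings applied by NORMALIZE (for example, it does not expand Æ to ae or fold native digits).

```python
import unicodedata


def fold(text: str) -> str:
    """Approximate accent removal and case folding (subset of NORMALIZE)."""
    # NFKD splits accented letters into a base letter plus combining marks
    # and replaces compatibility characters such as the long s.
    decomposed = unicodedata.normalize("NFKD", text)
    # Drop the combining marks (accents, diaeresis, cedilla, ...).
    without_marks = "".join(
        ch for ch in decomposed if not unicodedata.combining(ch)
    )
    # casefold() lowercases and also expands characters such as ß.
    return without_marks.casefold()


for sample in ("résumé français", "Dürst", "Straße", "AbSoLuteLy TRUE"):
    print(sample, "->", fold(sample))
```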

The complete list of transformations is given below:

Accent removal

Hebrew Alternates folding

Overline folding

Suzhou Numeral folding

Case folding

Jamo folding

Positional forms folding

Symbol folding

Canonical duplicates folding

Letterforms folding

Small forms folding

Underline folding

Dashes folding

Math symbol folding

Space folding

Vertical forms folding

Diacritic removal (including stroke, hook, descender)

Multigraph Expansions: All

Spacing Accents folding

Width folding

Greek letterforms folding

Native digit folding

Subscript folding

Han Radical folding

For more information about these transformations, see Unicode Technical Report #30 (UTR#30), Character Foldings.

Phonetic Transformations

A phonetic transformation applied to the string transforms it to a string corresponding to its pronunciation. The default phonetic transformation is PHONETIC METAPHONE.

Phonetic transformations include:

PHONETIC SOUNDEX and PHONETIC REFINEDSOUNDEX: Phonetic algorithms for indexing names by sound, as pronounced in English. The goal is for homophones to be encoded to the same representation so that they can be matched despite minor differences in spelling.
PHONETIC METAPHONE and PHONETIC DOUBLEMETAPHONE: Algorithms for indexing words by their English pronunciation. They are suitable for use with most English words, not just names. Double Metaphone can return both a primary and a secondary code for an input string; this accounts for some ambiguous cases as well as for multiple variants of surnames with common ancestry. These algorithms support a Max Code Length parameter, which defines the maximum length of the encoded result and defaults to 4.
PHONETIC CAVERPHONE and PHONETIC CAVERPHONE1: Algorithms for data matching on electoral rolls, optimized for accents present in parts of New Zealand.
PHONETIC NYSIIS: New York State Identification and Intelligence System (NYSIIS) algorithm, which maps similar phonemes to the same letter. The result is a string that can be pronounced by the reader without decoding.
PHONETIC MRA: Match Rating Approach developed by Western Airlines. This algorithm has an encoding and range comparison technique.
PHONETIC COLOGNE: Phonetic algorithm optimized for the German language (Kölner Phonetik).
PHONETIC BEIDERMORSE: Phonetic algorithm supporting greater accuracy in matching Slavic and Yiddish surnames with similar pronunciation but differences in spelling. It returns a list of tokens separated by the string specified in the Synonyms Separator parameter: first the transformed input text, then the transformed synonyms of the input text.
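
To show what "encoding homophones to the same representation" means in practice, here is a compact Python sketch of the classic American Soundex rules. It is not taken from the plug-in, the function name is hypothetical, and the plug-in's output may differ in edge cases (for example with non-ASCII letters, which this sketch simply ignores).

```python
# Consonant groups of classic Soundex; vowels, H, W and Y carry no digit.
_CODES = {}
for _letters, _digit in (
    ("BFPV", "1"), ("CGJKQSXZ", "2"), ("DT", "3"),
    ("L", "4"), ("MN", "5"), ("R", "6"),
):
    for _ch in _letters:
        _CODES[_ch] = _digit


def soundex(name: str) -> str:
    """Return the 4-character Soundex code of a name (ASCII letters only)."""
    letters = [ch for ch in name.upper() if "A" <= ch <= "Z"]
    if not letters:
        return ""
    code = letters[0]
    prev = _CODES.get(letters[0], "")
    for ch in letters[1:]:
        digit = _CODES.get(ch, "")
        # Emit a digit only when it differs from the previous coded letter.
        if digit and digit != prev:
            code += digit
        # H and W do not separate two letters that share the same digit.
        if ch not in "HW":
            prev = digit
    # Pad with zeros or truncate to one letter plus three digits.
    return (code + "000")[:4]


print(soundex("Robert"), soundex("Rupert"))
```

Both names in the usage line produce the same code, which is why a Soundex-transformed attribute can be used to match them.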

Other Transformations

These other transformations return a list of tokens which can be split into the Transformed Text and Secondary Transformed Text outputs.

These transformations should preferably be used at the end of the transformation sequence, as their secondary transformed text is not processed by subsequent transformations in the sequence.

Other transformations include:

BEIDERMORSE [<split>] [<rule_type>] [<max_phonemes>] [<name_type>]: The Beidermorse transformation returns a list of tokens: first the transformed input text, then the transformed synonyms of the input text. Beidermorse supports the following parameters:
- split: If this parameter is set to true, all synonyms after the first one are concatenated in the Secondary Transformed Text output. If it is set to false (default value), all synonyms are appended to the first token in the Transformed Text output.
- rule_type: EXACT for an exact or APPROX for an approximate phonetic transformation.
- max_phonemes: Maximum number of synonyms returned. Default is 20.
- name_type: Default value is GENERIC. Use ASHKENAZI or SEPHARDIC if you specifically want phonetic encodings optimized for Ashkenazi or Sephardic Jewish family names.
DOUBLEMETAPHONE [<max_code_length>] [<split>]: This transformation encodes the input string with the Double Metaphone algorithm and returns a primary code and a secondary code. If split is set to true, then the secondary code is pushed to the Secondary Transformed Text output. Otherwise, it is concatenated to the primary code in the Transformed Text output.
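
The effect of the split option on the two outputs can be pictured with the following Python sketch. The function name and the sample tokens are made up for illustration, and joining tokens with the Synonyms Separator (a pipe by default) in both outputs is an assumption based on the parameter description.

```python
from typing import Optional


def route_tokens(
    tokens: list[str], split: bool, separator: str = "|"
) -> tuple[str, Optional[str]]:
    """Return (Transformed Text, Secondary Transformed Text) for a token list."""
    if not tokens:
        return "", None
    if split:
        # First token goes to the primary output, the rest to the secondary.
        secondary = separator.join(tokens[1:]) or None
        return tokens[0], secondary
    # Without split, everything is concatenated in the primary output.
    return separator.join(tokens), None


# Hypothetical tokens, not real Beidermorse or Double Metaphone output.
print(route_tokens(["TM", "TMS", "TNS"], split=True))
print(route_tokens(["TM", "TMS", "TNS"], split=False))
```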

Transliteration

The TRANSLITERATE transformation transforms a text from one character script to another. For example, Traditional to Simplified Chinese, Japanese Hiragana to Katakana, Cyrillic to Latin script.
Each source/target transliteration is identified by an ID. The list of supported transliteration IDs is provided in the list below. If no ID is provided, the Any-Latin transliteration is used.

Each ID represents a transliteration from one script/language to another, for example Katakana-Latin or Latin-Thai. The special tag Any stands for any script/language. For example, Any-Latin converts any input script to Latin script.

Accents-Any

Any-Name

Devanagari-Bengali

Han-Latin

Latin-Greek

Pinyin-NumericPinyin

Amharic-Latin/BGN

Any-NFC

Devanagari-Gujarati

Han-Latin/Names

Latin-Greek/UNGEGN

pl_FONIPA-ja

Any-Accents

Any-NFD

Devanagari-Gurmukhi

Hangul-Latin

Latin-Gujarati

pl-ja

Any-am

Any-NFKC

Devanagari-Kannada

Hans-Hant

Latin-Gurmukhi

pl-pl_FONIPA

Any-Arabic

Any-NFKD

Devanagari-Latin

Hant-Hans

Latin-Han

Publishing-Any

Any-Armenian

Any-Null

Devanagari-Malayalam

Hebrew-Latin

Latin-Hangul

ro_FONIPA-ja

Any-Bengali

Any-Oriya

Devanagari-Oriya

Hebrew-Latin/BGN

Latin-Hebrew

ro-ja

Any-Bopomofo

Any-pl_FONIPA

Devanagari-Tamil

Hex-Any

Latin-Hiragana

ro-ro_FONIPA

Any-CaseFold

Any-Publishing

Devanagari-Telugu

Hex-Any/C

Latin-Jamo

ru-ja

Any-cs_FONIPA

Any-Remove

Digit-Tone

Hex-Any/Java

Latin-Kannada

ru-zh

Any-Cyrillic

Any-ro_FONIPA

es_419-ja

Hex-Any/Perl

Latin-Katakana

Russian-Latin/BGN

Any-Devanagari

Any-ru

es_419-zh

Hex-Any/Unicode

Latin-Malayalam

Serbian-Latin/BGN

Any-es_419_FONIPA

Any-sk_FONIPA

es_FONIPA-am

Hex-Any/XML

Latin-NumericPinyin

Simplified-Traditional

Any-es_FONIPA

Any-Syriac

es_FONIPA-es_419_FONIPA

Hex-Any/XML10

Latin-Oriya

sk_FONIPA-ja

Any-FCC

Any-Tamil

es_FONIPA-ja

Hiragana-Katakana

Latin-Syriac

sk-ja

Any-FCD

Any-Telugu

es_FONIPA-zh

Hiragana-Latin

Latin-Tamil

sk-sk_FONIPA

Any-Georgian

Any-Thaana

es-am

IPA-XSampa

Latin-Telugu

Syriac-Latin

Any-Greek

Any-Thai

es-es_FONIPA

it-am

Latin-Thaana

Tamil-Bengali

Any-Greek/UNGEGN

Any-Title

es-ja

it-ja

Latin-Thai

Tamil-Devanagari

Any-Gujarati

Any-Upper

es-zh

ja_Latn-ko

Macedonian-Latin/BGN

Tamil-Gujarati

Any-Gurmukhi

Any-zh

Fullwidth-Halfwidth

ja_Latn-ru

Malayalam-Bengali

Tamil-Gurmukhi

Any-Han

Arabic-Latin

Georgian-Latin

Jamo-Latin

Malayalam-Devanagari

Tamil-Kannada

Any-Hangul

Arabic-Latin/BGN

Georgian-Latin/BGN

JapaneseKana-Latin/BGN

Malayalam-Gujarati

Tamil-Latin

Any-Hans

Armenian-Latin

Greek-Latin

Kannada-Bengali

Malayalam-Gurmukhi

Tamil-Malayalam

Any-Hant

Armenian-Latin/BGN

Greek-Latin/BGN

Kannada-Devanagari

Malayalam-Kannada

Tamil-Oriya

Any-Hebrew

ASCII-Latin

Greek-Latin/UNGEGN

Kannada-Gujarati

Malayalam-Latin

Tamil-Telugu

Any-Hex

Azerbaijani-Latin/BGN

Gujarati-Bengali

Kannada-Gurmukhi

Malayalam-Oriya

Telugu-Bengali

Any-Hex/C

Belarusian-Latin/BGN

Gujarati-Devanagari

Kannada-Latin

Malayalam-Tamil

Telugu-Devanagari

Any-Hex/Java

Bengali-Devanagari

Gujarati-Gurmukhi

Kannada-Malayalam

Malayalam-Telugu

Telugu-Gujarati

Any-Hex/Perl

Bengali-Gujarati

Gujarati-Kannada

Kannada-Oriya

Maldivian-Latin/BGN

Telugu-Gurmukhi

Any-Hex/Plain

Bengali-Gurmukhi

Gujarati-Latin

Kannada-Tamil

Mongolian-Latin/BGN

Telugu-Kannada

Any-Hex/Unicode

Bengali-Kannada

Gujarati-Malayalam

Kannada-Telugu

Name-Any

Telugu-Latin

Any-Hex/XML

Bengali-Latin

Gujarati-Oriya

Katakana-Hiragana

NumericPinyin-Latin

Telugu-Malayalam

Any-Hex/XML10

Bengali-Malayalam

Gujarati-Tamil

Katakana-Latin

NumericPinyin-Pinyin

Telugu-Oriya

Any-Hiragana

Bengali-Oriya

Gujarati-Telugu

Kazakh-Latin/BGN

Oriya-Bengali

Telugu-Tamil

Any-ja

Bengali-Tamil

Gurmukhi-Bengali

Kirghiz-Latin/BGN

Oriya-Devanagari

Thaana-Latin

Any-Kannada

Bengali-Telugu

Gurmukhi-Devanagari

Korean-Latin/BGN

Oriya-Gujarati

Thai-Latin

Any-Katakana

Bopomofo-Latin

Gurmukhi-Gujarati

Latin-Arabic

Oriya-Gurmukhi

Tone-Digit

Any-ko

Bulgarian-Latin/BGN

Gurmukhi-Kannada

Latin-Armenian

Oriya-Kannada

Traditional-Simplified

Any-Latin (default)

cs_FONIPA-ja

Gurmukhi-Latin

Latin-ASCII

Oriya-Latin

Turkmen-Latin/BGN

Any-Latin/BGN

cs_FONIPA-ko

Gurmukhi-Malayalam

Latin-Bengali

Oriya-Malayalam

Ukrainian-Latin/BGN

Any-Latin/Names

cs-cs_FONIPA

Gurmukhi-Oriya

Latin-Bopomofo

Oriya-Tamil

Uzbek-Latin/BGN

Any-Latin/UNGEGN

cs-ja

Gurmukhi-Tamil

Latin-Cyrillic

Oriya-Telugu

XSampa-IPA

Any-Lower

cs-ko

Gurmukhi-Telugu

Latin-Devanagari

Pashto-Latin/BGN

zh_Latn_PINYIN-ru

Any-Malayalam

Cyrillic-Latin

Halfwidth-Fullwidth

Latin-Georgian

Persian-Latin/BGN
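
The IDs listed above follow a source-target pattern, sometimes followed by a slash and a suffix. The Python sketch below splits an ID into these parts. Reading the suffix as a "variant" (for example BGN or UNGEGN) is an assumption that is not stated in this guide, and the type and function names are hypothetical.

```python
from typing import NamedTuple, Optional


class TransliterationId(NamedTuple):
    source: str
    target: str
    variant: Optional[str]


def parse_transliteration_id(identifier: str = "Any-Latin") -> TransliterationId:
    """Split an ID such as Greek-Latin/UNGEGN into its parts."""
    # Everything after the first slash is treated as the variant (assumption).
    pair, _, variant = identifier.partition("/")
    source, sep, target = pair.partition("-")
    if not sep or not source or not target:
        raise ValueError(f"Not a source-target ID: {identifier}")
    return TransliterationId(source, target, variant or None)


print(parse_transliteration_id("Greek-Latin/UNGEGN"))
print(parse_transliteration_id())  # Any-Latin is the documented default
```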

Lookup

This plug-in performs a data lookup on a mapping table.

Semarchy Lookup Enricher

Plug-in ID

Semarchy Lookup Enricher - com.semarchy.engine.plugins.convergence.text

Description

This enricher performs a data lookup on a mapping table accessed via a JDBC datasource.

The mapping table is located in a datasource provided using the Datasource parameter, which defaults to the data location’s datasource. The mapping table is declared to the enricher:

By giving a Mapping Table as well as a Lookup Column and a list of (up to 20) Output Columns from this table. The input lookup value is searched in the Lookup Column and the corresponding values from the Output Columns are returned.
By giving a Custom SQL select statement executed on the datasource, which must return columns aliased LOOKUP_COLUMN and OUTPUT_COLUMN1, …, OUTPUT_COLUMN20. These columns will be used as the lookup and output columns.

You must set either the Mapping Table, Lookup Column and Output Columns parameters, or the Custom SQL parameter alone.

The lookup is performed on the mapping table with an optional memory cache configured with the Cache Lookup Data parameter.
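
The three values of the Cache Lookup Data parameter (NO_CACHE, LOAD_ON_START and LOAD_ON_DEMAND) can be pictured with the Python sketch below. It is a conceptual model, not the plug-in's implementation: the class, the two callables standing in for queries on the mapping table, and the choice to cache only values that were found are all assumptions.

```python
from typing import Callable, Dict, Iterable, Optional, Tuple


class LookupCache:
    """Conceptual model of the three cache modes."""

    def __init__(
        self,
        mode: str,
        load_all: Callable[[], Iterable[Tuple[str, str]]],
        query_one: Callable[[str], Optional[str]],
    ) -> None:
        self.mode = mode
        self._query_one = query_one
        self._cache: Dict[str, str] = {}
        if mode == "LOAD_ON_START":
            # Read the whole mapping table once, at initialization.
            self._cache = dict(load_all())

    def get(self, key: str) -> Optional[str]:
        if self.mode == "NO_CACHE":
            # Every lookup queries the mapping table.
            return self._query_one(key)
        if self.mode == "LOAD_ON_START":
            # Every lookup is served from memory.
            return self._cache.get(key)
        # LOAD_ON_DEMAND: try memory first, then the mapping table.
        if key not in self._cache:
            value = self._query_one(key)
            if value is None:
                return None
            self._cache[key] = value
        return self._cache[key]


table = {"FR": "France", "US": "United States"}  # stand-in mapping table
cache = LookupCache("LOAD_ON_DEMAND", table.items, table.get)
print(cache.get("FR"), cache.get("FR"))
```

With LOAD_ON_DEMAND, the second call for the same key is answered from memory without querying the table again.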

When a null value is passed as the Lookup Value, or when the lookup finds no matching value in the lookup column, the enricher returns the Fallback Value or the Lookup Value, depending on the Fallback Behavior parameter.
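
The fallback rule can be summarized by the following Python sketch, in which a dictionary stands in for the mapping table. The function and argument names are hypothetical. The sketch only mirrors the behavior described in this guide: on a miss, the same value is sent to every output.

```python
from typing import Dict, List, Optional


def lookup(
    mapping: Dict[str, List[Optional[str]]],
    lookup_value: Optional[str],
    output_count: int,
    fallback_behavior: str = "USE_FALLBACK",
    fallback_value: Optional[str] = None,
) -> List[Optional[str]]:
    """Return the output values for a lookup value, applying the fallback rule."""
    if lookup_value is not None and lookup_value in mapping:
        return mapping[lookup_value]
    # Null input or no match: pick the replacement value.
    if fallback_behavior == "USE_LOOKUP_VALUE":
        filler = lookup_value
    else:
        filler = fallback_value  # None when no fallback value is configured
    # The same value is returned for all output columns.
    return [filler] * output_count


countries = {"FR": ["France", "EUR"]}  # made-up sample data
print(lookup(countries, "FR", 2))
print(lookup(countries, "XX", 2, "USE_LOOKUP_VALUE"))
```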

The lookup value expected and output values emitted by this plug-in are string values. Any other datatype passed as the input should be converted to a string using SemQL, and outputs should be mapped to string attributes. Output values mapped to non-string output attributes rely on the database implicit conversion, which may give unexpected results.

This plug-in is thread-safe and supports parallel execution.

Plug-in Parameters

The following table lists the plug-in parameters.

Parameter Name	Mandatory	Type	Description
Cache Lookup Data	No	String	Use this parameter to optionally use a memory cache for the lookup process. Possible values are: `NO_CACHE`: Do not use a cache; the mapping table is queried for each lookup. `LOAD_ON_START` (default): Cache all lookup data in memory at initialization. All lookups are made using the memory cache. `LOAD_ON_DEMAND`: Cache data after it is looked for. Lookups are first attempted on the memory cache, then on the mapping table if the lookup value is not present in the cache. Use the cache only to process batches of records. Do not use it when processing one record at a time. For example, it is recommended to set this parameter to `NO_CACHE` for enrichers running in steppers. If you configure the cache in such a situation, it is loaded every time the stepper triggers the enricher, which causes poor performance.
Custom SQL	No	String	Leave this parameter empty to use a generated SQL query. Use this parameter instead of Mapping Table, Lookup Column and Output Columns to define the lookup dataset with a select statement in the following form: `select <lookup_column> LOOKUP_COLUMN, <output_column> OUTPUT_COLUMN1, <output_column> OUTPUT_COLUMN2, <output_column> OUTPUT_COLUMN3, ... from <mapping_table> where ...`. The number of `OUTPUT_COLUMN<N>` columns is limited to 20 (from `OUTPUT_COLUMN1` to `OUTPUT_COLUMN20`). This query must return a dataset with n+1 columns aliased `LOOKUP_COLUMN` and `OUTPUT_COLUMN1` to `OUTPUT_COLUMNn`. These columns are used instead of the Lookup Column and Output Columns.
Datasource	No	String	JNDI name of datasource containing the lookup data. If this parameter is not defined, the enricher uses the data location datasource. This parameter should contain the full path of the datasource, for example: `java:comp/env/jdbc/SEMARCHY_STAGING`.
Fallback Behavior	No	String	Behavior when the lookup value is not found in the lookup column. Possible values are: `USE_FALLBACK` (default): returns the fallback value, or null if the fallback value is not specified. `USE_LOOKUP_VALUE`: returns the lookup value. When multiple output columns are specified, the same value (the fallback or lookup value) is sent to all these columns.
Fallback Value	No	String	Value to return if the lookup value is not found in the lookup column. Default value: `NULL`.
Lookup Column	No	String	Physical name of the column containing the lookup values. Default value: `NONE`.
Mapping Table	No	String	Physical name of the mapping table containing the lookup and output columns. Default value: `NONE`.
Output Columns	No	String	Comma-separated list of the physical names of the columns containing the values returned by the enricher. Default value: `NONE`. The (singular) Output Column parameter available in previous versions of this plug-in is deprecated and replaced by this parameter.
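
To illustrate the column aliasing expected from a Custom SQL statement, the Python sketch below runs such a query against an in-memory SQLite database. SQLite only stands in for the JDBC datasource, and the COUNTRY_MAP table, its columns and its rows are hypothetical.

```python
import sqlite3

# In-memory database standing in for the lookup datasource (assumption).
conn = sqlite3.connect(":memory:")
conn.execute(
    "create table COUNTRY_MAP "
    "(ISO_CODE text, COUNTRY_NAME text, CURRENCY text, ACTIVE integer)"
)
conn.executemany(
    "insert into COUNTRY_MAP values (?, ?, ?, ?)",
    [
        ("FR", "France", "EUR", 1),
        ("US", "United States", "USD", 1),
        ("SU", "Soviet Union", "SUR", 0),
    ],
)

# Custom SQL in the documented form: one lookup column, n output columns.
CUSTOM_SQL = """
select
    ISO_CODE     LOOKUP_COLUMN,
    COUNTRY_NAME OUTPUT_COLUMN1,
    CURRENCY     OUTPUT_COLUMN2
from COUNTRY_MAP
where ACTIVE = 1
"""

cursor = conn.execute(CUSTOM_SQL)
column_names = [description[0] for description in cursor.description]
# Key each row by its lookup value; the remaining columns are the outputs.
lookup_data = {row[0]: list(row[1:]) for row in cursor}

print(column_names)
print(lookup_data)
```

The where clause shows one reason to prefer Custom SQL over the Mapping Table parameters: the lookup dataset can be restricted or derived rather than being a whole table.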

Plug-in Inputs

The following table lists the plug-in inputs.

Input Name	Mandatory	Type	Description
Lookup Value	Yes	String	Value to look for in the mapping table’s lookup column.

Plug-in Outputs

The following table lists the plug-in outputs.

Output Name	Type	Description
Output Value<N>	String	Nth Value returned by the lookup.

Translation

Google Translate Enricher

Plug-in ID

Google Translate Enricher - com.semarchy.engine.plugins.convergence.translate.v2

Description

This enricher translates an Input Text from a Source Language to a Target Language using the Google Translate service. The source language is automatically detected if unspecified. This enricher requires a valid Google Key.

This plug-in must be used in compliance with the Google Translate APIs Terms of Service.

This enricher uses the Google Translate Service, which must be accessible from the Semarchy xDM Application at the following URL: https://www.googleapis.com/language/translate/v2?<parameters>. Make sure that this URL is accessible through your firewalls.
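
When checking firewall rules, it can help to know what a request to this URL looks like. The Python sketch below only builds such a URL and sends nothing. The query parameter names (`key`, `q`, `target`, `source`) come from the public Google Translate v2 REST API rather than from this guide, the function name is hypothetical, and the exact request issued by the plug-in is not documented here.

```python
from typing import Optional
from urllib.parse import urlencode

ENDPOINT = "https://www.googleapis.com/language/translate/v2"


def build_translate_url(
    api_key: str, text: str, target: str, source: Optional[str] = None
) -> str:
    """Build a translate URL; the source language is optional."""
    params = {"key": api_key, "q": text, "target": target}
    if source:
        # When omitted, the service detects the source language itself.
        params["source"] = source
    return f"{ENDPOINT}?{urlencode(params)}"


# "KEY" is a placeholder, not a real Google API key.
print(build_translate_url("KEY", "hello world", "fr", "en"))
```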

This plug-in is thread-safe and supports parallel execution.

Plug-in Parameters

The following table lists the plug-in parameters.

Parameter Name	Mandatory	Type	Description
Application Name	Yes	String	Name of the client application accessing the Google Translate service. Application names should preferably have the format `<company-id>_<app-name>_<app-version>`. The name will be used by the Google servers to monitor the source of authentication.
Google Key	Yes	String	Google API Key. It is a unique key that you generate using the Google API Console.

Plug-in Inputs

The following table lists the plug-in inputs.

Input Name	Mandatory	Type	Description
Input Text	Yes	String	Text to translate.
Source Language	No	String	Language of the input text. If it is unspecified, it is detected from the input text.
Target Language	Yes	String	Target language for the translation.

Plug-in Outputs

The following table lists the plug-in outputs.

Output Name	Type	Description
Translated Text	String	Translated Text.

Name Processing

Semarchy Person Name Enricher

Plug-in ID

Semarchy Person Name Enricher - com.semarchy.engine.plugins.convergence.personname.PersonNameEnricher

Description

This enricher extracts the Given Name, Surname and Gender from a person’s full name. It parses the Input Name and identifies a Given Name and Surname (with a Names Parsing Score confidence percentage). Then the given name is searched in a database of names for the source country code provided in the input. If a given name is matched, a Gender and a Most Frequent Gender (if the given name is unisex) are returned.

This plug-in is thread-safe and supports parallel execution.

Plug-in Parameters

The following table lists the plug-in parameters.

Parameter Name	Mandatory	Type	Description
Surname Position	Yes	String	Position of the Surname. This parameter is used for parsing the input name to detect the first and last names, and for generating the Full Name output. Possible values: `SURNAME_LAST` and `SURNAME_FIRST`.
Case Transformation	Yes	String	Case transformation for the name. Possible values: `NONE`, `UPPER_CASE`, `LOWER_CASE` and `CAMEL_CASE`.
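
The role of these two parameters can be sketched in Python as follows. This is a deliberately naive illustration: the real enricher scores its parsing and uses name databases, whereas this sketch splits on whitespace only. The function names are hypothetical, and treating `CAMEL_CASE` as "capitalize each word" is an assumption.

```python
def split_full_name(
    full_name: str, surname_position: str = "SURNAME_LAST"
) -> tuple[str, str]:
    """Return (given name, surname) using a whitespace-only split."""
    tokens = full_name.split()
    if len(tokens) < 2:
        # A single word is treated as a given name in this sketch.
        return (tokens[0] if tokens else "", "")
    if surname_position == "SURNAME_FIRST":
        return " ".join(tokens[1:]), tokens[0]
    return " ".join(tokens[:-1]), tokens[-1]


def apply_case(name: str, case_transformation: str = "NONE") -> str:
    """Apply one of the Case Transformation values to a name."""
    if case_transformation == "UPPER_CASE":
        return name.upper()
    if case_transformation == "LOWER_CASE":
        return name.lower()
    if case_transformation == "CAMEL_CASE":
        # Assumption: capitalize the first letter of each word.
        return " ".join(word.capitalize() for word in name.split())
    return name


given, surname = split_full_name("CURIE marie", "SURNAME_FIRST")
print(apply_case(given, "CAMEL_CASE"), apply_case(surname, "CAMEL_CASE"))
```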

Plug-in Inputs

The following table lists the plug-in inputs.

Input Name	Mandatory	Type	Description
Input Name	Yes	String	Person full name to enrich.
Source Country Code	Yes	String	Code of the country of origin for the name. This code indicates the database of names to search to determine a gender for the given name. Built-in databases include `fr` for France, `us` for the USA and `ru` for Russia.

Plug-in Outputs

The following table lists the plug-in outputs.

Output Name	Type	Description
Full Name	String	The reconstructed full name, with the surname positioned according to the Surname Position parameter.
Gender	String	The gender of the Matched Given Name. One of MALE, FEMALE, UNISEX, UNKNOWN.
Gender Score	String	Confidence with which the Most Frequent Gender can be used [0-100].
Given Name	String	The part identified as Given Name in the input name.
Matched Given Name	String	Given name matched in the given name database.
Most Frequent Gender	String	The most frequent gender of the Matched Given Name for the given country. One of MALE, FEMALE, UNKNOWN.
Names Parsing Score	String	Names parsing confidence [0-100].
Surname	String	The part identified as Surname in the input name.
Surname Position	String	Position at which the surname was detected.

International Phone Numbers Plug-In

The International Phone Numbers Plug-In for Semarchy xDM provides two features:

An enricher to standardize and improve phone number formatting.
A validator to check the validity of phone numbers.

Semarchy Phone Enricher

Plug-in ID

Semarchy Phone Enricher - com.semarchy.engine.plugins.convergence.phone

Description

This enricher takes as the Input Phone Number either an international phone number (with the international prefix), or a national phone number provided with a Region Code. It returns a standardized Enriched Phone Number in the Enriched Phone Format. Geocoding Data is also returned and includes (depending on the country) the country, the region/state and the city name.

If a phone number is not valid, the enricher returns the original phone value in the Enriched Phone Number, a Status Code as well as a Status Text describing the issue with the input phone number.

This plug-in is thread-safe and supports parallel execution.

Plug-in Parameters

This plug-in does not use any parameter.

Plug-in Inputs

The following table lists the plug-in inputs.

Input Name	Mandatory	Type	Description
Input Phone Number	Yes	String	Input Phone Number.
Region Code	No	String	Two-letter region code for a national phone number, according to the ISO 3166-1 standard. If this input is left empty, the phone number provided in the Input Phone Number should include the international country calling code.
Enriched Phone Format	No	String	Format of the Enriched Phone Number. Possible values are `INTERNATIONAL` (default), `NATIONAL`, `E123_INTERNATIONAL`, `E123_NATIONAL`, `E164` and `RFC3966`. See Phone Formats for more information.
Region of Origin	No	String	Formats the phone output for international dialing from the country or region provided in this input. E.g.: `US`, `FR`, `GB`, `DE`. Use `ZZ` for unknown region. See this link for the list of codes.

Phone Formats

The following standards are supported to format the enriched phone number:

E123_INTERNATIONAL and E123_NATIONAL refer to the ITU-T Recommendation E.123 for national and international phone numbers.
INTERNATIONAL and NATIONAL use a format similar to the ITU-T Recommendation E.123 for national and international phone numbers, but use hyphens to separate blocks of numbers.
E164 refers to the ITU-T Recommendation E.164.
RFC3966 refers to IETF RFC 3966.

Phone Format Examples:

E123_NATIONAL (E.123 - National Notation): (042) 123 4594
E123_INTERNATIONAL (E.123 - International Notation): +31 42 123 4567
NATIONAL (E.123 - National Notation with hyphens): (042) 123-4594
INTERNATIONAL (E.123 - International Notation with hyphens): +31 42-123-4567
E164 (E.164 - International Notation): +31421234567 (equivalent to E.123 with no formatting)
RFC3966 (RFC3966 - International Notation): +31-42-123-4567 (equivalent to E.123 with hyphens instead of spaces)
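
As an illustration of how these notations differ, the Python sketch below formats one number from its parts. It only reproduces the separators shown in the examples above and is not how the enricher works internally. The `format_phone` function and its arguments are hypothetical, and the `0` national trunk prefix is an assumption that matches the sample number.

```python
from typing import Sequence


def format_phone(country_code: str, area_code: str,
                 blocks: Sequence[str], phone_format: str) -> str:
    """Format a number already split into country code, area code and blocks."""
    # Assumption: national notation prefixes the area code with a trunk "0".
    national_area = f"(0{area_code})"
    if phone_format == "E123_NATIONAL":
        return " ".join([national_area, *blocks])
    if phone_format == "E123_INTERNATIONAL":
        return " ".join([f"+{country_code}", area_code, *blocks])
    if phone_format == "NATIONAL":
        return f"{national_area} " + "-".join(blocks)
    if phone_format == "INTERNATIONAL":
        return f"+{country_code} " + "-".join([area_code, *blocks])
    if phone_format == "E164":
        return f"+{country_code}{area_code}" + "".join(blocks)
    if phone_format == "RFC3966":
        return "-".join([f"+{country_code}", area_code, *blocks])
    raise ValueError(f"Unknown phone format: {phone_format}")


if __name__ == "__main__":
    for fmt in ("E123_NATIONAL", "E123_INTERNATIONAL", "NATIONAL",
                "INTERNATIONAL", "E164", "RFC3966"):
        print(fmt, format_phone("31", "42", ["123", "4567"], fmt))
```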

Plug-in Outputs

The following table lists the plug-in outputs.

Output Name	Type	Description
Enriched Phone Number	String	Phone number returned by the enricher in the format specified in the Enriched Phone Format input. This string is null if the enricher was not able to process the input phone number. The Status Code and Status Text values help troubleshoot such issues.
Geocoding Data	String	Geocoding data computed for a given number and country. Depending on the country and phone number, this value includes the country, region/state and city information. This string is null if the enricher was not able to process the input phone number. The Status Code and Status Text values help troubleshoot such issues.
Status Code	String	Return code for the phone number processing. See Status Codes for more details.
Status Text	String	Text explaining the status code.
International Phone Prefix	String	International Phone Prefix for worldwide dialing.
National Number	String	National number part of a phone number in International format. It is often the International number without the Country Prefix.
Extension	String	Extension part of the phone number.
Country Code Source	String	Explains how the Country Code was retrieved. Possible values are `FROM_NUMBER_WITH_PLUS_SIGN`, `FROM_NUMBER_WITH_IDD`, `FROM_NUMBER_WITHOUT_PLUS_SIGN` and `FROM_DEFAULT_COUNTRY`.
Leading Zero	String	Returns 0 or 1 to specify if leading zero is mandatory for foreign calls.
Possible Phone Number	String	Returns 0 or 1 to indicate whether a phone number is a possible number, and the region where the number could be dialed from.
Possible Phone Number Reason	String	Detailed explanation of why a phone number is a possible number or not. Possible values are `INVALID_COUNTRY_CODE`, `IS_POSSIBLE`, `TOO_LONG` and `TOO_SHORT`.
Valid Phone Number	String	Returns 0 or 1 to indicate whether a phone number matches a valid pattern.
Valid Phone Number For Region	String	Returns 0 or 1 to indicate that a phone number is valid for the specified Region Code.
Phone Line Type	String	Provides the line type of a phone number. Possible values are `FIXED_LINE`, `FIXED_LINE_OR_MOBILE`, `MOBILE`, `PAGER`, `PERSONAL_NUMBER`, `PREMIUM_RATE`, `SHARED_COST`, `TOLL_FREE`, `UAN`, `UNKNOWN` and `VOIP`.
Region Code	String	Returns the region code for the Phone Number. See this link for the list of codes.
International Phone Number	String	Phone number formatted for international dialing.
Time Zones	String	List of corresponding time zones for a given number. For example: `Europe/Paris`. If the time zone is unknown, returns `Etc/Unknown`.
First Time Zone	String	First time zone from the list of corresponding time zones for a given number.
Carrier Name	String	Name of the carrier for the phone number.

Status Codes

The following status codes are returned by the enricher:

0 - OK: Optimal execution. No error detected.
1 - INPUT_WAS_NULL: Input phone number was not set.
2 - PARSING FAILED: The string supplied did not seem to be a phone number. Review the Status text for more information.
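
A consumer of the enricher outputs might gate on the Status Code before using the enriched value. The Python sketch below shows one way to do this. It assumes that the Status Code output holds the numeric code as a string, which this guide does not state explicitly, and the `usable_phone` helper is hypothetical.

```python
from typing import Optional

# Status codes as listed in this section.
STATUS_CODES = {0: "OK", 1: "INPUT_WAS_NULL", 2: "PARSING FAILED"}


def usable_phone(enriched_phone_number: Optional[str],
                 status_code: str) -> Optional[str]:
    """Return the enriched number only when the status is 0 (OK)."""
    # Assumption: the String-typed Status Code output carries "0", "1" or "2".
    code = int(status_code)
    if code not in STATUS_CODES:
        raise ValueError(f"Unexpected status code: {status_code}")
    return enriched_phone_number if code == 0 else None
```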

Semarchy Phone Extractor

Plug-in ID

Semarchy Phone Extractor - com.semarchy.engine.plugins.convergence.phone.extractor

Description

This enricher extracts a list of phone numbers from an Input Text and returns them as an Extracted Phone List, in a given Extraction Format.

This plug-in is thread-safe and supports parallel execution.

Plug-in Parameters

The following table lists the plug-in parameters.

Parameter Name	Mandatory	Type	Description
Matching Leniency	No	String	Defines the phone number extraction leniency. Possible values are `POSSIBLE` (default), `VALID_FOR_REGION` (according to the Accepted Region) and `VALID`.
Extraction Format	No	String	Format of the extracted phone numbers. Possible values are `RAW` (default), `INTERNATIONAL`, `NATIONAL`, `E164` and `RFC3966`.
List Separator	No	String	Defines the separator character used in the extracted phone list.
Maximum Invalid Numbers	No	String	Maximum number of invalid numbers allowed before the plug-in stops processing the text. This covers cases where the text contains many false positives.

Plug-in Inputs

The following table lists the plug-in inputs.

Input Name	Mandatory	Type	Description
Input Text	Yes	String	Input text to search for phone numbers.
Accepted Region	No	String	Defines the region used when Matching Leniency is set to `VALID_FOR_REGION`.

Plug-in Outputs

The following table lists the plug-in outputs.

Output Name	Type	Description
Extracted Phone List	String	List of phone numbers extracted.
Phone 1 to Phone 5	String	First to fifth extracted phone numbers in the list.

Semarchy Phone Validator

Plug-in ID

Semarchy Phone Validator - com.semarchy.engine.plugins.convergence.phone

Description

This validator takes as the Input Phone Number either an international phone number (with the international prefix), or a national phone number provided with a Country Code. The validator checks whether this phone number is a valid international or national phone number.

This plug-in is thread-safe and supports parallel execution.

Plug-in Parameters

The following table lists the plug-in parameters.

Parameter Name	Mandatory	Type	Description
Validation Leniency	No	String	Validation leniency applied to phone numbers. Possible values are `VALID` (default), `POSSIBLE` and `VALID_FOR_REGION`.

Plug-in Inputs

The following table lists the plug-in inputs.

Input Name	Mandatory	Type	Description
Input Phone Number	Yes	String	Input Phone Number.
Country Code	No	String	Two-letter country code for a national phone number, according to the ISO 3166-1 standard. If this input is left empty, the phone number provided in the Input Phone Number should include the international country calling code.

Email Plug-In

The Email Plug-In for Semarchy xDM provides an enricher to improve the quality of email addresses and a validator to check email validity.

Semarchy Email Enricher

Plug-in ID

Semarchy Email Enricher - com.semarchy.engine.plugins.convergence.email

Description

This enricher takes an Input Email Address and splits this address into the local-part (user name) and the domain name. Both these parts are checked syntactically and syntax errors are fixed automatically. The domain name validity is also checked using MX records lookup. The plug-in uses a Domain Name Cache for faster checks and automated fixes on domain names.

This plug-in is thread-safe and supports parallel execution.

Domain Name Cache

The plug-in uses several mechanisms for faster checks and automated fixes on domain names:

Domain names already checked as valid (MX record lookup) are persisted in a domain name cache stored in a JDBC Datasource. This avoids repeating the MX lookup.
A list of known domains (for example, hotmail.com and gmail.com) is automatically seeded in the host name validation cache.
Common domain mistakes are fixed using a seeded replace list. For example gmai.com is automatically fixed to gmail.com using the cache.
Invalid domains are automatically fixed to similar valid domains already present in the cache. For example, semarcyh.com is fixed to semarchy.com as semarchy.com was previously checked as a valid domain name.
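
The cache-based fixes listed above can be sketched in Python as follows. This is a minimal illustration, not the plug-in's algorithm: the sample cache content, the `fix_domain` helper and the use of `difflib` string similarity with a 0.85 cutoff are all assumptions. The guide does not describe how the plug-in measures similarity between domains.

```python
import difflib
from typing import Dict, Optional, Set

# Hypothetical sample content for the domain name cache and the replace list.
KNOWN_VALID_DOMAINS: Set[str] = {"gmail.com", "hotmail.com", "semarchy.com"}
REPLACE_LIST: Dict[str, str] = {"gmai.com": "gmail.com"}


def fix_domain(domain: str,
               valid_domains: Set[str] = KNOWN_VALID_DOMAINS,
               replace_list: Dict[str, str] = REPLACE_LIST,
               cutoff: float = 0.85) -> Optional[str]:
    """Return a valid domain for the input, or None if no fix is found."""
    domain = domain.lower()
    # 1. Domain already known as valid: no MX lookup or fix needed.
    if domain in valid_domains:
        return domain
    # 2. Common mistake listed in the seeded replace list.
    if domain in replace_list:
        return replace_list[domain]
    # 3. Closest valid domain already in the cache (similarity is an assumption).
    matches = difflib.get_close_matches(domain, sorted(valid_domains),
                                        n=1, cutoff=cutoff)
    return matches[0] if matches else None
```

With this sample data, `fix_domain("gmai.com")` returns `gmail.com` from the replace list, and `fix_domain("semarcyh.com")` returns `semarchy.com` from the similarity search.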

See Appendix A: Semarchy Email Enricher Domain Name Cache for more information about the domain name cache.

Plug-in Parameters

The following table lists the plug-in parameters.

Parameter Name	Mandatory	Type	Description
Datasource	No	String	Full name of the JDBC Datasource used to store the host name validation cache. If no datasource is specified then the data location’s datasource is used. For example: `java:comp/env/jdbc/email_cache`.
Lowercase User Name	No	String	Set to `1` to transform the local-part (user name) to lowercase in the cleansed email address.
Offline Mode	No	String	Set to `1` to query only the local domain cache. The plug-in does not perform the MX record lookup.
Processing Mode	No	String	Processing mode: `DATABASE` (default) or `MEMORY`. Memory mode is faster but requires more memory because it loads the entire host name validation cache into memory.

Plug-in Inputs

The following table lists the plug-in inputs.

Input Name	Mandatory	Type	Description
Input Email Address	Yes	String	Input email address to cleanse.

Plug-in Outputs

The following table lists the plug-in outputs.

Output Name	Type	Description
Cleansed Email Address	String	Cleansed email address returned by the enricher. This address may be valid or not. The syntactic validity or domain name validity of the email address is indicated in the other plug-in outputs.
Valid Domain	String	Flag (0 or 1) indicating whether the domain name is valid or not (based on syntax and MX records lookup) in the cleansed email address. In Offline mode, this output returns 1 or 0 if the domain name appears in the local domain cache as valid or invalid. It returns `null` if the domain name does not exist in the cache and the MX lookup was not issued.
Valid Domain Syntax	String	Flag (0 or 1) indicating whether the domain name syntax is valid or not in the cleansed email address.
Valid Email Syntax	String	Flag (0 or 1) indicating whether the cleansed email address is syntactically valid or not.
Valid Username Syntax	String	Flag (0 or 1) indicating whether the local-part (user name) syntax is valid or not in the cleansed email address.
Valid Input Domain	String	Flag (0 or 1) indicating whether the domain name is valid or not (based on syntax and MX records lookup) in the input email address. In Offline mode, this output returns 1 or 0 if the domain name appears in the local domain cache as valid or invalid. It returns `null` if the domain name does not exist in the cache and the MX lookup was not issued.
Valid Input Domain Syntax	String	Flag (0 or 1) indicating whether the domain name syntax is valid or not in the input email address.
Valid Input Email Syntax	String	Flag (0 or 1) indicating whether the input email address is syntactically valid or not.
Valid Input Username Syntax	String	Flag (0 or 1) indicating whether the local-part (user name) syntax is valid or not in the input email address.

Semarchy Email Validator

Plug-in ID

Semarchy Email Validator - com.semarchy.engine.plugins.convergence.email

Description

This validator takes an Input Email Address and checks its syntactic validity. The domain name validity is optionally also checked using MX records lookup.

The plug-in uses the same mechanisms as the Semarchy Email Enricher for checking the email validity, except that it does not modify the incoming email.

This plug-in is thread-safe and supports parallel execution.

Plug-in Parameters

The following table lists the plug-in parameters.

Parameter Name	Mandatory	Type	Description
Accepted Domains	No	String	Domains accepted for the email address. Possible values: `ALL_DOMAINS` accepts all syntactically valid domains. `VALID_DOMAINS` accepts only domains that are known to be valid (found in the local cache as being valid, or for which the MX lookup was successful). `VALID_AND_UNKNOWN` is used in Offline Mode to accept or reject records based on their status (valid/invalid) found in the local cache. Unknown domains (not found in the local cache) are accepted. Syntax checking is always performed, and an email with an invalid syntax is always rejected.
Offline Mode	No	String	Set to `1` to query only the local domain cache. The plug-in does not perform the MX record lookup.
Processing Mode	No	String	Processing mode: `DATABASE` (default) or `MEMORY`. Memory mode is faster but requires more memory because it loads the entire host name validation cache into memory.
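
The Accepted Domains values can be read as the decision rule sketched below in Python. This restates the parameter description and is not the validator's code. The `accept_email` function is hypothetical, and the domain status is modeled as `True` (known valid), `False` (known invalid) or `None` (unknown, for example not found in the local cache in Offline Mode).

```python
from typing import Optional


def accept_email(syntax_valid: bool, domain_valid: Optional[bool],
                 accepted_domains: str) -> bool:
    """Decide whether an email address passes validation."""
    # Syntax checking always applies, whatever the Accepted Domains value.
    if not syntax_valid:
        return False
    if accepted_domains == "ALL_DOMAINS":
        return True
    if accepted_domains == "VALID_DOMAINS":
        # Only domains known to be valid are accepted.
        return domain_valid is True
    if accepted_domains == "VALID_AND_UNKNOWN":
        # Known-invalid domains are rejected; unknown domains are accepted.
        return domain_valid is not False
    raise ValueError(f"Unknown Accepted Domains value: {accepted_domains}")
```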

Plug-in Inputs

The following table lists the plug-in inputs.

Input Name	Mandatory	Type	Description
Input Email Address	Yes	String	Input email address to check.

Melissa Plug-ins

The Melissa Plug-in for Semarchy xDM provides enrichers to fix and complete contact data for US/Canada using the Personator service, and to validate international addresses in 240 countries using the Global Address Verification service.

Melissa Global Address Enricher

Plug-in ID

Melissa Global Address Enricher - com.semarchy.engine.plugins.melissa.GlobalAddressVerificationEnricher

Description

The Melissa Global Address Enricher validates international addresses in 240 countries using the Global Address Verification service.

This plug-in requires a valid license string to access the Melissa service. Contact Melissa for the license.

For more details about the service, the parameters, inputs and outputs, refer to the Melissa Global Address Documentation.

This plug-in is thread-safe and supports parallel execution.

Plug-in Parameters

The following table lists the plug-in parameters.

Parameter Name	Mandatory	Type	Description
License String	Yes	String	Your license string. This must be valid for you to access the Melissa Service.
Delivery Lines	No	Boolean	Specifies whether the Address Lines 1-8 should contain only the delivery address or the entire address.
Line Separator	No	String	Possible values: SemiColon, Pipe, CR, LF, CRLF, Tab, BR. This is the line separator used for the FormattedAddress result.
Output Script	No	String	Possible values: NoChange, Latn, Native. This is the script type used for all applicable fields.
Country Of Origin	No	String	Must contain a valid ISO-3166-1 Alpha-2, ISO-3166-1 Alpha-3, or ISO-3166-1 Numeric code. This is used to determine whether or not to include the country name as the last line in FormattedAddress.
SSL Connection	No	Boolean	Default is true. Set to false if you don’t wish to use a secure connection.
Failure Error Codes	No	String	Comma-separated list of codes (AE01, AE02) or code families (AE). When one of these result codes is returned by the API, the enrichment fails.
Requests Limit	No	Number	When set, this numeric value limits the number of requests made to the Melissa API and the number of enriched records. Records after this limit are not enriched and the plug-in returns blank outputs. This parameter is intended for testing purposes only.
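
The Python sketch below illustrates one plausible reading of the Failure Error Codes parameter. It assumes that the API returns its result codes as a comma-separated string and that a code family such as AE matches every code starting with that prefix. The `is_failure` helper is hypothetical and is not part of the plug-in.

```python
def is_failure(result_codes: str, failure_error_codes: str) -> bool:
    """Return True if any returned code matches a failure code or family."""
    # Split both comma-separated lists and ignore blanks and extra spaces.
    failures = [c.strip() for c in failure_error_codes.split(",") if c.strip()]
    results = [c.strip() for c in result_codes.split(",") if c.strip()]
    # Assumption: a family (e.g. "AE") matches by prefix; a full code
    # (e.g. "AE01") matches itself, which the prefix test also covers.
    return any(r.startswith(f) for r in results for f in failures)
```

For example, `is_failure("AE03", "AE")` is true under the prefix assumption, while `is_failure("AV24,GS05", "AE01, AE02")` is false.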

Plug-in Inputs

The following table lists the plug-in inputs.

Input Name	Mandatory	Type	Description
AddressLine1	No	String	The input field for the address line 1. This should contain the delivery address information (house number, street, building, suite, etc.) but should not contain locality information (city, state, postal code, etc.) which have their own inputs.
AddressLine2	No	String	The input field for the address line 2. This can be a continuation of AddressLine1 (for example, a suite) or another address.
AddressLine3 … AddressLine8	No	String	The input field for the address. This should contain the delivery address information (house number, thoroughfare, building, suite, etc.) but should not contain locality information (locality, administrative area, postal code, etc.) which have their own inputs.
DependentLocality	No	String	The smaller population center data element. This depends on the Locality element.
DoubleDependentLocality	No	String	The smallest population center data element. This depends on the Locality and DependentLocality elements.
Locality	No	String	The most common population center data element.
PostalCode	No	String	The postal code.
SubAdministrativeArea	No	String	The smallest geographic data element.
SubNationalArea	No	String	The administrative region within a country on an arbitrary level below that of the sovereign state.
Country	No	String	The country.

Plug-in Outputs

The following table lists the plug-in outputs.

Output Name	Type	Description
AddressKey	String	Returns a unique identifier for an address. This key can be used with other current and future Melissa services.
AddressLine1 … AddressLine8	String	These are the string values that will return the standardized or corrected contents of the input address. These lines will include the entire address including the locality, administrative area, and postal code.
AddressType	String	Returns the address type for US and Canadian addresses.
AdministrativeArea	String	The most common geographic data element.
Building	String	Descriptive name identifying an individual location. This is a string value that is the parsed Building element from the output.
CountryISO3166_1_Alpha2	String	ISO 3166 2-character country code.
CountryISO3166_1_Alpha3	String	ISO 3166 3-character country code.
CountryISO3166_1_Numeric	String	ISO 3166 3-digit numeric country code.
CountryName	String	Returns the country name for the record.
DependentLocality	String	A dependent locality is a logical area unit that is smaller than a locality but larger than a double dependent locality or thoroughfare. It can often be associated with a neighborhood or sector. Great Britain is an example of a country that uses dependent locality. In the United States, this would correspond to Urbanization, which is used only in Puerto Rico.
DependentThoroughfare	String	Block data element or dependent street. This is used when there is more than one thoroughfare with the same name in one locality. An adjoining thoroughfare is used to uniquely identify the target thoroughfare. This is rarely used.
DependentThoroughfareLeadingType	String	Thoroughfare type at the beginning of the dependent thoroughfare. The leading type is parsed from the dependentThoroughfare parameter. For example, if the dependent thoroughfare is "St. Hickory E," the dependent thoroughfare leading type would be "St."
DependentThoroughfareName	String	Dependent thoroughfare name parsed from the dependentThoroughfare parameter. For example, if the dependent thoroughfare is "E Hickory Ln," the dependent thoroughfare name would be "Hickory."
DependentThoroughfarePostDirection	String	Cardinal directional at the end of the dependent thoroughfare. The postfix directional is parsed from the dependentThoroughfare parameter. For example, if the dependent thoroughfare is "Hickory Ln N," the dependent thoroughfare post direction would be "N."
DependentThoroughfarePreDirection	String	Cardinal directional at the beginning of the dependent thoroughfare. The prefix directional is parsed from the dependentThoroughfare parameter. For example, if the dependent thoroughfare is "W Hickory Ln," the dependent thoroughfare pre direction would be "W."
DependentThoroughfareTrailingType	String	Thoroughfare type at the end of the dependent thoroughfare. The trailing type is parsed from the dependentThoroughfare parameter. For example, if the dependent thoroughfare is "W Hickory Ln," the dependent thoroughfare trailing type would be "Ln."
DoubleDependentLocality	String	A double dependent locality is a logical area unit that is smaller than a dependent locality but bigger than a thoroughfare. This field is very rarely used. Great Britain is an example of a country that uses double dependent locality.
FormattedAddress	String	Mailing address. The full mailing address in the preferred format for the country of the address. This includes the Organization as the first line, one or more lines in the origin country’s format, and the destination country (if required). Separate lines will be delimited by what is specified in the option.
Latitude	String	Returns the geocoded latitude for the address entered in the AddressLine field.
Locality	String	This is the most common geographic area and used by virtually all countries. This is usually the value that is written on a mailing label and referred to by terms like City, Town, Postal Town, etc.
Longitude	String	Returns the geocoded longitude for the address entered in the AddressLine field.
Organization	String	This is a string value that matches the Organization request element. It is not modified or populated by the service.
PostBox	String	Post box information for a particular delivery point.
PostalCode	String	Returns the 9-digit postal code for U.S. addresses and 6-digit postal code for Canadian addresses.
PremisesNumber	String	Alphanumeric indicator within premises field. Parsed from the premises parameter.
PremisesType	String	Leading premise type indicator within premises field. Parsed from the premises parameter.
Results	String	String value containing a comma-separated list of status, error codes, and change codes for the record. Refer to the Melissa documentation for more details.
SubAdministrativeArea	String	The smallest geographic data element.
SubNationalArea	String	A sub-national area is a logical area unit that is larger than an administrative area but smaller than the country itself. It is extremely rarely used.
SubPremises	String	Alphanumeric code identifying an individual location. More specific than premises.
SubPremisesNumber	String	Sub premises number indicator within premises field. Parsed from the subPremises parameter.
SubPremisesType	String	Sub premises type indicator within premises field. Parsed from the subPremises parameter.
Thoroughfare	String	This value is a part of the address lines and contains all the sub-elements of the thoroughfare like trailing type, thoroughfare name, pre direction, post direction, etc.
ThoroughfareLeadingType	String	Leading thoroughfare type indicator parsed from the thoroughfare parameter. A leading type is a thoroughfare type that is placed before the thoroughfare. This value is a part of the Thoroughfare field. For example, the thoroughfare type of "Rue" in Canada and France is placed before the thoroughfare, making it a leading type.
ThoroughfareName	String	Name indicator parsed from the thoroughfare parameter.
ThoroughfarePostDirection	String	Postfix directional parsed from the thoroughfare parameter.
ThoroughfarePreDirection	String	Prefix directional parsed from the thoroughfare parameter.
ThoroughfareTrailingType	String	Trailing thoroughfare type indicator parsed from the thoroughfare parameter. A trailing type is a thoroughfare type that is placed after the thoroughfare. This value is a part of the Thoroughfare field. For example, the thoroughfare type of "Avenue" in the US is placed after the thoroughfare, making it a trailing type.
TransmissionResults	String	This is a string value that lists error codes from any errors caused by the most recent request as a whole.
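
Because Latitude and Longitude are returned as String outputs, a consumer that needs numeric coordinates has to convert them. The following is a minimal sketch of such a conversion, not part of the plug-in: the function name `parse_coordinates` is hypothetical, and the range check is an assumption about what a caller may want to enforce.

```python
from typing import Optional, Tuple


def parse_coordinates(latitude: str, longitude: str) -> Optional[Tuple[float, float]]:
    """Convert the String Latitude/Longitude outputs to floats.

    Hypothetical helper: returns None when either value is missing,
    blank, not numeric, or outside the valid coordinate range.
    """
    try:
        lat = float(latitude.strip())
        lon = float(longitude.strip())
    except (AttributeError, ValueError):
        # AttributeError covers None; ValueError covers blank or non-numeric text.
        return None
    # Assumed sanity check; NaN and infinite values also fail this test.
    if not (-90.0 <= lat <= 90.0 and -180.0 <= lon <= 180.0):
        return None
    return lat, lon
```

Returning None instead of raising keeps records without a geocode from interrupting downstream processing.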

Melissa Personator Enricher

Plug-in ID

Melissa Personator Enricher - com.semarchy.engine.plugins.melissa.PersonatorConsumerEnricher

Description

The Melissa Personator Enricher fixes and completes contact data for US/Canada using the Personator Consumer service.

This plug-in requires a valid license string to access the Melissa service. Contact Melissa for the license.

For more details about the service, the parameters, inputs and outputs, refer to the Melissa Personator Consumer Documentation.

This plug-in is thread-safe and supports parallel execution.

Plug-in Parameters

The following table lists the plug-in parameters.

Parameter Name	Mandatory	Type	Description
License String	Yes	String	Your license string. This must be valid for you to access the Melissa Service.
Action Append	No	Boolean	The Append Action will return elements based on the selected point of centricity which can either be the address, email or phone. For example, an address centric Append will return the name, company, phone and email associated with the given address. US only.
Action Check	No	Boolean	The Check Action will validate the individual input data pieces for validity and correct them if possible. If the data is correctable, additional information
Action Move	No	Boolean	The Move Action will return the latest address for an individual or business if a previous address was entered. Move requires either a Last Name and Address, or a Business/Company Name and Address as inputs. US only.
Action Verify	No	Boolean	The Verify Action will return to you the relationships between your different input data pieces. It can show you if your name,
Advanced Address Correction	No	Boolean	Uses the name input to perform more advanced address corrections. This can correct or append house numbers, street names, cities, states, and ZIP codes.
Append Options	No	String	Possible values: blank, checkError, always. Setting the Append option to Blank will cause the service to return information only when the input address, phone, email, name or company is blank.
Centric Hint	No	String	Possible values: auto, address, phone, email. Default value is Auto. When set to Auto, it first uses Address if available, followed by Phone if no Address is available, and lastly Email if neither Address nor Phone are available. Use this to tell the service which piece of information to use as the primary point of reference when appending or verifying data.
Columns	No	String	By default, the requested columns are restricted to the mapped outputs. This parameter specifies (forces) which columns are requested. See the Melissa documentation.
Diacritics	No	String	Possible values: auto, on, off. Determines whether or not French language characters are returned. If set to auto, those characters are only returned if they are in the input.
Failure Error Codes	No	String	Comma-separated list of codes (AE01, AE02) or code families (AE). When one of these result codes is returned by the API, the enrichment fails.
SSL Connection	No	Boolean	Default is true. Set to false if you don’t wish to use a secure connection.
Use Preferred City	No	Boolean	Each city has an official name that is preferred by the USPS, and there may be one or more unofficial "vanity" names in use. Normally, Personator allows you to verify addresses using known vanity names. Setting this to true returns the preferred city name.
Requests Limit	No	Number	When set, this numeric value limits the number of requests made to the Melissa API and the number of enriched records. Records after this limit are not enriched and the plug-in returns blank outputs. This parameter is intended for testing purposes only.
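
The fallback order described for the Centric Hint parameter (Address, then Phone, then Email when set to auto) can be illustrated with a short sketch. This only models the documented order and is not the service's implementation. The function name `resolve_centric_hint` is hypothetical, and returning an explicit hint unchanged, even when that input is blank, is an assumption.

```python
from typing import Optional


def resolve_centric_hint(hint: str, address: Optional[str],
                         phone: Optional[str], email: Optional[str]) -> Optional[str]:
    """Return the input used as the point of reference, or None if none is usable."""

    def present(value: Optional[str]) -> bool:
        return bool(value and value.strip())

    available = {
        "address": present(address),
        "phone": present(phone),
        "email": present(email),
    }
    normalized = hint.strip().lower()
    if normalized == "auto":
        # Documented order: Address first, then Phone, then Email.
        for candidate in ("address", "phone", "email"):
            if available[candidate]:
                return candidate
        return None
    if normalized not in available:
        raise ValueError(f"Unsupported Centric Hint: {hint!r}")
    # Assumption: an explicit hint is passed through as given.
    return normalized
```

For example, with a blank address and a populated phone number, `resolve_centric_hint("auto", "", "555-0100", None)` selects the phone as the point of reference.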

Plug-in Inputs

The following table lists the plug-in inputs.

Input Name	Mandatory	Type	Description
AddressLine1	No	String	The input field for the address line 1. This should contain the delivery address information (house number, street, building, suite, etc.) but should not contain locality information (city, state, postal code, etc.) which have their own inputs.
AddressLine2	No	String	The input field for the address line 2. This can be a continuation of AddressLine1 (ex: suite) or another address.
City	No	String	The city.
CompanyName	No	String	The company name.
Country	No	String	The country.
Email	No	String	The email address.
FirstName	No	String	The given (first) name.
FreeForm	No	String	Single line contact information. Address, phone, email could be all in a single field and they will be parsed out. Please don’t map any other fields if using FreeForm.
FullName	No	String	This field can contain a full name. The API will parse and check Names only if the First Name and Last Name fields are left blank.
LastLine	No	String	The city, state, and ZIP.
LastName	No	String	The family (last) name.
Phone	No	String	The phone number.
PostalCode	No	String	The postal code.
State	No	String	The US state.
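
The input table above carries two usage rules: FreeForm should not be combined with other inputs, and FullName is parsed only when FirstName and LastName are blank. The sketch below shows one way to check a record against these rules before enrichment. It is an illustration only: the function name `check_input_mapping` is hypothetical, and treating a non-blank value as a "mapped" input is an assumption.

```python
from typing import Dict, List


def check_input_mapping(inputs: Dict[str, str]) -> List[str]:
    """Return warnings for input combinations the input table advises against."""
    # Assumption: an input counts as mapped when it carries a non-blank value.
    mapped = {name for name, value in inputs.items() if value and value.strip()}
    warnings = []
    if "FreeForm" in mapped and len(mapped) > 1:
        others = ", ".join(sorted(mapped - {"FreeForm"}))
        warnings.append(f"FreeForm is used together with: {others}")
    if "FullName" in mapped and mapped & {"FirstName", "LastName"}:
        warnings.append("FullName is not parsed because FirstName or LastName is set")
    return warnings
```

An empty list means the record follows both rules. For instance, `check_input_mapping({"FreeForm": "John Doe 1 Main St", "Phone": "555-0100"})` returns one warning about the extra Phone input.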

Plug-in Outputs

The following table lists the plug-in outputs.

Output Name	Type	Description
AddressDeliveryInstallation	String	Returns the parsed delivery installation for the address entered in the AddressLine field.
AddressExtras	String	Any extra information that does not fit in the AddressLine fields.
AddressHouseNumber	String	Returns the parsed house number for the address entered in the AddressLine field.
AddressKey	String	Returns a unique identifier for an address. This key can be used with other current and future Melissa services.
AddressLine1	String	These are the string values that will return the standardized or corrected contents of the input address. These lines will include the entire address including the locality, administrative area, and postal code.
AddressLine2	String	These are the string values that will return the standardized or corrected contents of the input address. These lines will include the entire address including the locality, administrative area, and postal code.
AddressLockBox	String	Returns the parsed lock box number for the address entered in the AddressLine field.
AddressPostDirection	String	Returns the parsed post-direction for the address entered in the AddressLine field.
AddressPreDirection	String	Returns the parsed pre-direction for the address entered in the AddressLine field.
AddressPrivateMailboxName	String	Returns the parsed private mailbox name for the address entered in the AddressLine field.
AddressPrivateMailboxRange	String	Returns the parsed private mailbox range for the address entered in the AddressLine field.
AddressRouteService	String	Returns the parsed route service number for the address entered in the AddressLine field.
AddressStreetName	String	Returns the parsed street name for the address entered in the AddressLine field.
AddressStreetSuffix	String	Returns the parsed street suffix for the address entered in the AddressLine field.
AddressSuiteName	String	Returns the parsed suite name for the address entered in the AddressLine field.
AddressSuiteNumber	String	Returns the parsed suite number for the address entered in the AddressLine field.
AddressTypeCode	String	Returns a code for the address type in the AddressLine field.
CBSACode	String	Census Bureau’s Core Based Statistical Area (CBSA). Returns the 5-digit code for the CBSA associated with the requested record.
CBSADivisionCode	String	Returns the code for a division associated with the requested record, if any.
CBSADivisionLevel	String	Returns whether the CBSA division, if any, is metropolitan or micropolitan.
CBSADivisionTitle	String	Returns the title for the CBSA division, if any.
CBSALevel	String	Returns whether the CBSA is metropolitan or micropolitan.
CBSATitle	String	Returns the title for the CBSA.
CarrierRoute	String	Returns a 4-character code defining the carrier route for this record.
CensusBlock	String	Returns a 4-digit string containing the census block number associated with the requested record.
CensusTract	String	Returns a 4- to 6-digit string containing the census tract number associated with the requested record.
City	String	Returns the city entered in the City field.
CityAbbreviation	String	Returns an abbreviation for the city entered in the City field, if any.
CompanyName	String	Returns the company name.
CongressionalDistrict	String	Returns the 2-digit congressional district that belongs to the requested record.
CountryCode	String	Returns the country code for the country in the Country field.
CountryName	String	Returns the country name for the record.
DeliveryIndicator	String	Returns an indicator of whether an address is a business address or residential address.
DeliveryPointCheckDigit	String	Returns a string value containing the 1-digit delivery point check digit.
DeliveryPointCode	String	Returns a string value containing the 2-digit delivery point code.
EmailAddress	String	Returns the email address entered in the Email field.
EmailDomainName	String	Returns the parsed domain name for the email entered in the Email field.
EmailMailboxName	String	Returns the parsed mailbox name for the email entered in the Email field.
EmailTopLevelDomain	String	Returns the parsed top-level domain name for the email entered in the Email field.
FormattedAddress	String	Mailing address. The full mailing address in the preferred format for the country of the address. This includes the Organization as the first line, one or more lines in the origin country’s format, and the destination country (if required). Separate lines will be delimited by what is specified in the option.
Gender	String	Returns a gender for the name in the FullName field.
Gender2	String	Only used if 2 names are in the FullName field. Returns a gender for the second name in the FullName field.
Latitude	String	Returns the geocoded latitude for the address entered in the AddressLine field.
Longitude	String	Returns the geocoded longitude for the address entered in the AddressLine field.
NameFirst	String	Returns the first name in the FullName field.
NameFirst2	String	Only used if 2 names are in the FullName field. Returns the first name for the second name in the FullName field.
NameFull	String	Returns the full name for the record.
NameLast	String	Returns the last name in the FullName field.
NameLast2	String	Only used if 2 names are in the FullName field. Returns a last name for the second name in the FullName field.
NameMiddle	String	Returns a middle name for the name in the FullName field.
NameMiddle2	String	Only used if 2 names are in the FullName field. Returns a middle name for the second name in the FullName field.
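NamePrefix	String	Returns a prefix for the name in the FullName field.
NamePrefix2	String	Only used if 2 names are in the FullName field. Returns a prefix for the second name in the FullName field.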
NameSuffix	String	Returns a suffix for the name in the FullName field.
NameSuffix2	String	Only used if 2 names are in the FullName field. Returns a suffix for the second name in the FullName field.
PhoneAreaCode	String	Returns the parsed area code for the phone number entered in the Phone field.
PhoneExtension	String	Returns the parsed extension for the phone number entered in the Phone field.
PhoneNewAreaCode	String	Returns the parsed new area code for the phone number entered in the Phone field.
PhoneNumber	String	Returns the standardized phone number for the record.
PhonePrefix	String	Returns the parsed prefix for the phone number entered in the Phone field.
PhoneSuffix	String	Returns the parsed suffix for the phone number entered in the Phone field.
PlaceCode	String	When ZIP codes overlap, the City field will always return the city that covers most of the ZIP area. If the address is located outside of that city but within the ZIP Code, Place Code will refer to that area.
PlaceName	String	When ZIP codes overlap, the City field will always return the city that covers most of the ZIP area. If the address is located outside of that city but within the ZIP Code, Place Name will refer to that area.
PostalCode	String	Returns the 9-digit postal code for U.S. addresses and 6-digit postal code for Canadian addresses.
Results	String	String value containing a comma-separated list of status, error codes, and change codes for the record. Refer to the Melissa documentation for more details.
Salutation	String	Returns a salutation for the name in the FullName field.
State	String	Returns the state for the record.
StateName	String	Returns the full name of the state entered in the State field.
TransmissionResults	String	This is a string value that lists error codes from any errors caused by the most recent request as a whole.
UTC	String	Returns the time zone of the requested record. All Melissa products express time zones in UTC (Coordinated Universal Time).
UrbanizationName	String	Returns the urbanization name for the address entered in the AddressLine field. Usually only used if the address is in Puerto Rico.
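
The Results output is a comma-separated list of codes, and the Failure Error Codes parameter accepts both full codes and code families. The sketch below shows how the two could be compared. It is an assumption about the matching rule (a family matches every code that starts with it), not the plug-in's actual implementation, and the names `split_codes` and `failing_codes` are hypothetical.

```python
from typing import List


def split_codes(value: str) -> List[str]:
    """Split a comma-separated code list, dropping blanks and normalizing case."""
    return [code.strip().upper() for code in value.split(",") if code.strip()]


def failing_codes(results: str, failure_error_codes: str) -> List[str]:
    """Return the result codes that match a configured code or code family.

    Assumption: a family such as "AE" matches any code beginning with "AE",
    and a full code such as "AE01" matches itself.
    """
    configured = split_codes(failure_error_codes)
    return [
        code
        for code in split_codes(results)
        if any(code.startswith(entry) for entry in configured)
    ]
```

With the illustrative values `failing_codes("AS01,AE02,AC03", "AE")`, only `AE02` is reported, so a caller could treat a non-empty result as a failed enrichment.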

Output Name

Type

Description

AddressDeliveryInstallation

String

Returns the parsed delivery installation for the address entered in the AddressLine field.

AddressExtras

String

Any extra information that does not fit in the AddressLine fields.

AddressHouseNumber

String

Returns the parsed house number for the address entered in the AddressLine field.

AddressKey

String

Returns a unique identifier for an address. This key can be used with other current and future Melissa services.

AddressLine1

String

AddressLine2

String

AddressLockBox

String

Returns the parsed lock box number for the address entered in the AddressLine field.

AddressPostDirection

String

Returns the parsed post-direction for the address entered in the AddressLine field.

AddressPreDirection

String

Returns the parsed pre-direction for the address entered in the AddressLine field.

AddressPrivateMailboxName

String

Returns the parsed private mailbox name for the address entered in the AddressLine field.

AddressPrivateMailboxRange

String

Returns the parsed private mailbox range for the address entered in the AddressLine field.

AddressRouteService

String

Returns the parsed route service number for the address entered in the AddressLine field.

AddressStreetName

String

Returns the parsed street name for the address entered in the AddressLine field.

AddressStreetSuffix

String

Returns the parsed street suffix for the address entered in the AddressLine field.

AddressSuiteName

String

Returns the parsed suite name for the address entered in the AddressLine field.

AddressSuiteNumber

String

Returns the parsed suite number for the address entered in the AddressLine field.

AddressTypeCode

String

Returns a code for the address type in the AddressLine field.

CBSACode

String

Census Bureau’s Core Based Statistical Area (CBSA). Returns the 5-digit code for the CBSA associated with the requested record.

CBSADivisionCode

String

Returns the code for a division associated with the requested record, if any.

CBSADivisionLevel

String

Returns whether the CBSA division, if any, is metropolitan or micropolitan.

CBSADivisionTitle

String

Returns the title for the CBSA division, if any.

CBSALevel

String

Returns whether the CBSA is metropolitan or micropolitan.

CBSATitle

String

Returns the title for the CBSA.

CarrierRoute

String

Returns a 4-character code defining the carrier route for this record.

CensusBlock

String

Returns a 4-digit string containing the census block number associated with the requested record.

CensusTract

String

Returns a 4-to 6-digit string containing the census tract number associated with the requested record.

City

String

Returns the city entered in the City field.

CityAbbreviation

String

Returns an abbreviation for the city entered in the City field, if any.

CompanyName

String

Returns the company name.

CongressionalDistrict

String

Returns the 2-digit congressional district that belongs to the requested record.

CountryCode

String

Returns the country code for the country in the Country field.

CountryName

String

Returns the country name for the record.

DeliveryIndicator

String

Returns an indicator of whether an address is a business address or residential address.

DeliveryPointCheckDigit

String

Returns a string value containing the 1-digit delivery point check digit.

DeliveryPointCode

String

Returns a string value containing the 2-digit delivery point code.

EmailAddress

String

Returns the email address entered in the Email field.

EmailDomainName

String

Returns the parsed domain name for the email entered in the Email field.

EmailMailboxName

String

Returns the parsed mailbox name for the email entered in the Email field.

EmailTopLevelDomain

String

Returns the parsed top-level domain name for the email entered in the Email field.

FormattedAddress

String

Gender

String

Returns a gender for the name in the FullName field.

Gender2

String

Only used if 2 names are in the FullName field. Returns a gender for the second name in the FullName field.

Latitude

String

Returns the geocoded latitude for the address entered in the AddressLine field.

Longitude

String

Returns the geocoded longitude for the address entered in the AddressLine field.

NameFirst

String

Returns the first name in the FullName field.

NameFirst2

String

Only used if 2 names are in the FullName field. Returns the first name for the second name in the FullName field.

NameFull

String

Returns the full name for the record.

NameLast

String

Returns the last name in the FullName field.

NameLast2

String

Only used if 2 names are in the FullName field. Returns a last name for the second name in the FullName field.

NameMiddle

String

Returns a middle name for the name in the FullName field.

NameMiddle2

String

Only used if 2 names are in the FullName field. Returns a middle name for the second name in the FullName field.

NamePrefix

String

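Returns a prefix for the name in the FullName field.

NamePrefix2

String

Only used if 2 names are in the FullName field. Returns a prefix for the second name in the FullName field.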

NameSuffix

String

Returns a suffix for the name in the FullName field.

NameSuffix2

String

Only used if 2 names are in the FullName field. Returns a suffix for the second name in the FullName field.

PhoneAreaCode

String

Returns the parsed area code for the phone number entered in the Phone field.

PhoneExtension

String

Returns the parsed extension for the phone number entered in the Phone field.

PhoneNewAreaCode

String

Returns the parsed new area code for the phone number entered in the Phone field.

PhoneNumber

String

Returns the standardized phone number for the record.

PhonePrefix

String

Returns the parsed prefix for the phone number entered in the Phone field.

PhoneSuffix

String

Returns the parsed suffix for the phone number entered in the Phone field.

PlaceCode

String

When ZIP codes overlap, the City field will always return the city that covers most of the ZIP area. If the address is located outside of that city but within the ZIP Code, Place Code will refer to that area.

PlaceName

String

When ZIP codes overlap, the City field will always return the city that covers most of the ZIP area. If the address is located outside of that city but within the ZIP Code, Place Name will refer to that area.

PostalCode

String

Returns the 9-digit postal code for U.S. addresses and 6-digit postal code for Canadian addresses.

Results

String

String value containing a comma-separated list of status, error codes, and change codes for the record. Refer to the Melissa documentation for more details.

Salutation

String

Returns a salutation for the name in the FullName field.

State

String

Returns the state for the record.

StateName

String

Returns the full name of the state entered in the State field.

TransmissionResults

String

This is a string value that lists error codes from any errors caused by the most recent request as a whole.

UTC

String

Returns the time zone of the requested record. All Melissa products express time zones in UTC (Coordinated Universal Time).

UrbanizationName

String

Returns the urbanization name for the address entered in the AddressLine field. Usually only used if the address is in Puerto Rico.
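
The Results output described above is a comma-separated list of codes. As a minimal sketch, a consumer of this output could split it into individual codes as shown below. The `parseResultCodes` function is a hypothetical helper, not part of the plug-in, and the sample codes are illustrative values only; their meaning is defined in the Melissa documentation.

```javascript
// Hypothetical helper: splits the comma-separated Results string into a
// clean array of codes, ignoring surrounding spaces and empty entries.
function parseResultCodes(results) {
  if (!results) {
    return [];
  }
  return results
    .split(",")
    .map(function (code) { return code.trim(); })
    .filter(function (code) { return code.length > 0; });
}

// Sample value for illustration only.
var codes = parseResultCodes("AS01, GS05,,NS01");
console.log(codes.length); // 3
```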

Google Maps Plug-in

The Google Maps Plug-in for Semarchy xDM provides an enricher for international postal addresses. This enricher cleanses, standardizes and enriches the postal addresses with geocoding information.

Google Maps Enricher

Plug-in ID

Google Maps Enricher - com.semarchy.integration.rowTransformers.googleMapsEnricher

Description

This enricher takes an input address, then enriches and validates it using the Google Geocoding Service.

This plug-in must be used in compliance with the Google Maps/Google Earth APIs Terms of Service.

This enricher uses the Google Geocoding Service, which must be accessible from the Semarchy xDM Application at the following URL: http://maps.googleapis.com/maps/api/geocode/json?<parameters>. Make sure that this URL is accessible through your firewalls.

This plug-in is thread-safe and supports parallel execution.
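
To check that the Geocoding Service URL given in this description is reachable from the application server, it can help to build a request URL by hand and open it from that machine. The following is a sketch under stated assumptions: `buildGeocodeUrl` is a hypothetical helper, the address and key are placeholders, and `address`, `language` and `key` are query parameters of the Google Geocoding web service. How the plug-in itself assembles its requests is not described in this guide.

```javascript
// Sketch only: builds a Geocoding request URL for a manual connectivity test.
// buildGeocodeUrl is a hypothetical helper, not part of the plug-in.
function buildGeocodeUrl(baseUrl, address, apiKey, language) {
  var params = ["address=" + encodeURIComponent(address)];
  if (language) {
    params.push("language=" + encodeURIComponent(language));
  }
  if (apiKey) {
    params.push("key=" + encodeURIComponent(apiKey));
  }
  return baseUrl + "?" + params.join("&");
}

// Placeholder address and key, for illustration only.
var geocodeUrl = buildGeocodeUrl(
  "http://maps.googleapis.com/maps/api/geocode/json",
  "10 Main Street, Springfield",
  "YOUR_API_KEY",
  "en"
);
console.log(geocodeUrl);
```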

Plug-in Parameters

The following table lists the plug-in parameters.

Parameter Name	Mandatory	Type	Description
Client ID or API Key	No	String	This parameter may contain either an API Key (for Standard API usage) or the Client ID (for Premium Usage), both provided by Google. The Client ID should begin with the `gme-` prefix. When providing a Client ID, the signature (Private Key) is required.
Channel	No	String	This parameter assigns a specific channel name and allows tracking usage for this plug-in in the Google Maps usage reports.
Default Language	No	String	Code of the default language used for the returned results. For example, for the same address, "Rue Mathieu Misery" would appear in French and "Mathieu Misery Street" in English. This code can be overridden by the Language plug-in input. See the list of supported domain languages for more information.
Private Key	No	String	Cryptographic signature key provided by Google with the Client ID.
Request per Second	No	Integer	This parameter limits the number of requests per second made by the enricher to remain within the limits of the API. It defaults to 50 requests per second.

You can use the Google Maps service with one of the following authentication methods:

With an API Key, with the Standard Usage Limits and a pay-as-you-go plan above the limits.
With a Client ID and a Signature (private key) with a Google Maps Premium Plan.

Keyless access to this API is not supported by Google.
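
The rules above (API Key or Client ID, `gme-` prefix, Private Key required with a Client ID, no keyless access) can be sketched as a small configuration check. This is an illustration only: `resolveAuthMode` and the returned mode names are hypothetical and do not reflect the plug-in's internal implementation.

```javascript
// Hypothetical validation of the authentication parameters described above.
// Returns the authentication mode, or throws when the combination is invalid.
function resolveAuthMode(clientIdOrApiKey, privateKey) {
  if (!clientIdOrApiKey) {
    throw new Error("Keyless access is not supported: provide an API Key or a Client ID.");
  }
  // A Client ID is recognized by its gme- prefix and needs a Private Key.
  if (clientIdOrApiKey.indexOf("gme-") === 0) {
    if (!privateKey) {
      throw new Error("A Client ID requires the Private Key parameter.");
    }
    return "CLIENT_ID";
  }
  return "API_KEY";
}

// Placeholder values, for illustration only.
console.log(resolveAuthMode("gme-mycompany", "my-private-key"));
console.log(resolveAuthMode("my-api-key", null));
```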

Plug-in Inputs

The following table lists the plug-in inputs.

Input Name	Mandatory	Type	Description
Address Line	Yes	String	Address line to process. If the address is composed of multiple lines, then these lines must be provided as a comma-separated list of address lines.
Postal Code	No	String	Postal code of the address.
City	No	String	City of the address.
Country	No	String	Country of the address.
Language	No	String	Code of the language for the returned result for this record. This language overrides the Default Language parameter. See the list of supported domain languages for more information.

The state, region or province information can be passed in the City input, concatenated with the city name. For example: Address.City || ' ' || Address.State

The entire address, including the Address Line, Postal Code, City and Country values can be passed to the plug-in as a single concatenated string in the Address Line input. If the source data contains the address in a single string, then you can pass this string directly in the Address Line input.
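
As a minimal sketch of the two options above, the following hypothetical helper assembles a single string suitable for the Address Line input: address lines are joined with commas, and the state is concatenated to the city with a space. The function name and the comma used between the remaining parts are assumptions made for illustration.

```javascript
// Hypothetical helper: joins the non-empty address parts into one string.
// Address lines are comma-separated; the state is appended to the city.
function toSingleAddressLine(addressLines, postalCode, city, state, country) {
  var cityAndState = [city, state].filter(Boolean).join(" ");
  return [addressLines.filter(Boolean).join(", "), postalCode, cityAndState, country]
    .filter(Boolean)
    .join(", ");
}

// Placeholder address, for illustration only.
console.log(toSingleAddressLine(["10 Main Street", "Suite 4"], "12345", "Springfield", "IL", "USA"));
```

Empty or missing parts are skipped, so the same helper can be used when only some of the inputs are available.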

Plug-in Outputs

The following table lists the plug-in outputs. Outputs marked with an * appear in a Full and a Short form in the output list.

Output Name	Type	Description
Address Types	String	Comma-separated list of address types (see Address Types for more information).
Administrative Area Level 1*	String	First-order civil entity below the country level. Within the United States, these administrative levels are states. Not all countries exhibit these administrative levels.
Administrative Area Level 2*	String	Second-order civil entity below the country level. Within the United States, these administrative levels are counties. Not all countries exhibit these administrative levels.
Administrative Area Level 3*	String	Third-order civil entity below the country level. Not all countries exhibit these administrative levels.
Airport	String	Indicates an airport. NOTE: This output is deprecated.
Country*	String	The national political entity.
East Bound Longitude	String	Bounding box eastern limit.
Floor*	String	Indicates the floor of a building address.
Formatted Address	String	Human-readable version of the geocoded address.
Intersection	String	Major intersection, usually of two major roads. NOTE: This output is deprecated.
Latitude	String	Latitude of the address.
Locality*	String	Incorporated city or town political entity.
Longitude	String	Longitude of the address.
Natural Feature*	String	Prominent natural feature.
Neighborhood*	String	Named neighborhood.
North Bound Latitude	String	Bounding box northern limit.
Park*	String	Named park.
Point of Interest*	String	Named point of interest.
Post Box*	String	Specific postal box.
Postal Code*	String	Postal code as used to address postal mail within the country.
Premise*	String	Named location, usually a building or collection of buildings with a common name.
Quality	String	Granularity of the location described by the address, expressed as a value between 0 and 100 (100 being the best quality).
Room*	String	The room of a building address.
Route*	String	Named route (such as `US 401`).
South Bound Latitude	String	Bounding box southern limit.
Status	String	Status of the request. `OK` indicates that no error occurred and the address was geocoded. `ZERO_RESULTS` indicates that no error occurred but the address was not geocoded. See the API documentation for a list of status and error codes.
Street Address	String	Precise street address. NOTE: This output is deprecated.
Street Number*	String	Precise street number.
Sub-Locality*	String	First-order civil entity below a locality.
Sub-Premise*	String	First-order entity below a named location, usually a singular building within a collection of buildings with a common name.
West Bound Longitude	String	Bounding box western limit.
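
The Status and Quality outputs listed above can be combined to decide whether an enriched address is usable. The sketch below is one possible rule, not the plug-in's behavior: `classifyGeocoding`, the returned labels and the quality threshold are all assumptions. Because every output is a String, the Quality value is parsed before it is compared.

```javascript
// Hypothetical post-processing of the Status and Quality outputs.
function classifyGeocoding(status, quality, minQuality) {
  if (status === "ZERO_RESULTS") {
    return "NOT_GEOCODED"; // no error, but the address was not geocoded
  }
  if (status !== "OK") {
    return "ERROR"; // any other status is treated as an error here
  }
  var score = parseInt(quality, 10);
  if (isNaN(score) || score < minQuality) {
    return "LOW_QUALITY";
  }
  return "ACCEPTED";
}

// The threshold of 80 is an arbitrary example value.
console.log(classifyGeocoding("OK", "90", 80));
console.log(classifyGeocoding("ZERO_RESULTS", "", 80));
```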

Embedding a Google Map in a Form

The Google Geocoding service data must be used to display maps rendered with the Google Maps service.

You can display such a map in Semarchy xDM in a form, by embedding generated HTML and JavaScript.

Create a new form field with the SemQL expression given below.
In the SemQL expression, modify the following line to concatenate your address information:

var address= "' || AddressLine || ' ' || PostalCode || ' ' || City || '";

If you are a Google Maps API for Work customer, modify the URL to the Google Maps service in the code to include your Google Client ID. Note that the embedded map will stop working after you add the Client ID. To restore it, you must register authorized URLs with Google by following the instructions given on the Google Maps API for Work site:

<script src="https://maps.googleapis.com/maps/api/js?client=YOUR_CLIENT_ID&v=3.20&sensor=false"></script>

Edit the field:
- In the Display Properties, set the Component Type to Object, and in Data, set the Source Type to Content.
  This configuration tells Semarchy xDM to interpret this code as HTML and JavaScript in the browser.

'<!DOCTYPE html>
<html>
  <head>
    <meta charset="utf-8">
    <script src="https://maps.googleapis.com/maps/api/js?sensor=false"></script>
    <script>

/* Modify the line below */
var address= "' || AddressLine || ' ' || PostalCode || ' ' || City || '";

var zoom = 18;
var mapType = google.maps.MapTypeId.ROADMAP;
var useMarker = true;
var map;

function initialize() {
	var geocoder = new google.maps.Geocoder();
	geocoder.geocode( { "address": address}, function(results, status) {
	 if (status == google.maps.GeocoderStatus.OK) { displayMap(results[0].geometry.location); }
	});
	window.onresize = resize;
}

function displayMap(latlng) {
	var mapOptions = { zoom: zoom, center: latlng, mapTypeId: mapType }
	map = new google.maps.Map(document.getElementById("map_canvas"), mapOptions);
	if (useMarker) {
		var marker = new google.maps.Marker({ map: map, position: latlng});
	}
	resize("");
}

function resize(e) {
	if (!map) { return; }
	var center = map.getCenter();
	map.getDiv().style.height = window.innerHeight +"px";
	map.getDiv().style.width = window.innerWidth +"px";
	google.maps.event.trigger(map, ''resize'');
	map.setCenter(center);
}

google.maps.event.addDomListener(window, "load", initialize);
    </script>
  </head>
  <body style="margin:0px;">
    <div id="map_canvas" style="margin:0px;"></div>
  </body>
</html>'

OpenStreetMap Plug-in

The OpenStreetMap Plug-in for Semarchy xDM uses the OpenStreetMap API to provide an enricher for international postal addresses. This enricher cleanses, standardizes and enriches the postal address.

OpenStreetMap Enricher

Plug-in ID

OpenStreetMap Enricher - com.semarchy.engine.plugins.openstreetmap

Description

This enricher takes an input address, then enriches and validates it using the OpenStreetMap Service.

This enricher uses the OpenStreetMap Service, which must be accessible from the Semarchy xDM Application at the URL specified in the OpenStreetMap URL parameter. Make sure to make this URL accessible through your firewalls.

This plug-in is thread-safe and supports parallel execution.

Plug-in Parameters

The following table lists the plug-in parameters.

Parameter Name	Mandatory	Type	Description
OpenStreetMap URL	Yes	String	URL used to query OpenStreetMap API. Typically `http://nominatim.openstreetmap.org/`
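
To verify that the URL configured in the OpenStreetMap URL parameter is reachable through your firewalls, a search request can be built by hand and opened from the application server. The sketch below assumes the standard Nominatim free-form search endpoint (`/search` with the `q`, `format`, `addressdetails` and `limit` parameters). `buildNominatimUrl` is a hypothetical helper and the address is a placeholder; the requests actually issued by the plug-in are not described in this guide.

```javascript
// Sketch only: builds a Nominatim free-form search URL for a manual test.
// buildNominatimUrl is a hypothetical helper, not part of the plug-in.
function buildNominatimUrl(baseUrl, address) {
  var root = baseUrl.replace(/\/+$/, ""); // drop trailing slashes
  return root + "/search?q=" + encodeURIComponent(address) +
    "&format=json&addressdetails=1&limit=1";
}

// Placeholder address, for illustration only.
console.log(buildNominatimUrl("http://nominatim.openstreetmap.org/", "10 Main Street, Springfield"));
```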

Plug-in Inputs

The following table lists the plug-in inputs.

Input Name	Mandatory	Type	Description
Address Line	Yes	String	Address line to process. If the address is composed of multiple lines, then these lines must be provided as a comma-separated list of address lines.
Postal Code	No	String	Postal code of the address.
City	No	String	City of the address.
Country	No	String	Country of the address.

Plug-in Outputs

The following table lists the plug-in outputs.

Output Name	Type	Description
Address	String	Complete address of the location.
City	String	City of the location.
Country	String	Country of the location.
Country Code	String	Country code of the location.
County	String	County of the location.
Latitude	String	Latitude of the location.
Longitude	String	Longitude of the location.
Postal Code	String	Postal code of the location.
Process Code	String	Code that indicates the result status of the address processing.
State	String	State of the location.
Street Number	String	Street number of the location.
Street Name	String	Street name of the location.

Microsoft Bing Maps Plug-in

The Microsoft Bing Maps Plug-in for Semarchy xDM uses the Bing Location API to provide an enricher for international postal addresses. This enricher cleanses, standardizes and enriches the postal address with geocoding information.

Bing Maps Enricher

Plug-in ID

Bing Maps Enricher - com.semarchy.engine.plugins.bing.address

Description

This enricher takes an input address, then enriches and validates it using the Bing Maps Service.

This plug-in must be used in compliance with the Microsoft Bing Maps APIs Terms of Service.

This enricher uses the Bing Maps Service, which must be accessible from the Semarchy xDM Application at the URL specified in the Bing Location URL parameter. Make sure to make this URL accessible through your firewalls.

This plug-in is thread-safe and supports parallel execution.

Plug-in Parameters

The following table lists the plug-in parameters.

Parameter Name	Mandatory	Type	Description
Bing Maps Key	Yes	String	To use the Bing Maps Services, you must have a Bing Maps Key.
Bing Location URL	Yes	String	This URL will be used to query Bing Location API.

Plug-in Inputs

The following table lists the plug-in inputs.

Input Name	Mandatory	Type	Description
Address Line	Yes	String	Address line to process.
Postal Code	No	String	Postal code of the address.
City	No	String	City of the address.
Country	No	String	Country of the address.

Plug-in Outputs

The following table lists the plug-in outputs.

Output Name	Type	Description
Administrative District	String	The subdivision name within the country or region for an address, such as the abbreviation of a US state.
Administrative District 2	String	The subdivision name within the administrative district for an address.
Confidence	String	Defines the confidence of the location match found by the geocoding service. Possible values: High, Medium, Low.
Country or Region	String	The country or region name of the address.
Formatted Address	String	A string specifying the complete address. This address may not include the country or region.
Status Code	String	The HTTP Status code for the request.
Status Description	String	A description of the HTTP status code.
Latitude	String	Latitude of the location.
Locality	String	The locality, such as the primary city, that corresponds to an address.
Longitude	String	Longitude of the address.
Match Code	String	Defines the geocoding level of the location match found by the geocoder. One or more of the following values: Good, Ambiguous, UpHierarchy.
Postal Code	String	The postal code of the address.
Process Code	String	Code that indicates the result status of the process.
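
The Confidence and Match Code outputs listed above can be used together to keep only reliable matches. The following sketch shows one possible acceptance rule. It is an assumption, not the plug-in's behavior: `isReliableMatch` is a hypothetical helper, and a comma is assumed as the separator when Match Code holds several values.

```javascript
// Hypothetical acceptance rule based on the Confidence and Match Code outputs.
function isReliableMatch(confidence, matchCode) {
  // Match Code may hold several values; a comma separator is assumed.
  var codes = (matchCode || "")
    .split(",")
    .map(function (code) { return code.trim(); });
  return confidence === "High" &&
    codes.indexOf("Good") !== -1 &&
    codes.indexOf("Ambiguous") === -1;
}

console.log(isReliableMatch("High", "Good"));            // true
console.log(isReliableMatch("High", "Good, Ambiguous")); // false
```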

Appendices

Appendix A: Semarchy Email Enricher Domain Name Cache

The Semarchy Email Enricher uses a local cache to avoid repeating MX record lookups when checking the validity of an email domain.
This domain name cache takes priority: if a record is found in the cache, the enricher uses the locally available information and does not issue an MX record lookup.

The plug-in stores the cache in a table named EXT_EMAIL_DOMAINS. This table is created on the first run of the enricher, by default in the data location served by the enricher. You can use the Datasource enricher parameter to store this table in a specific datasource.
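
The cache-first behavior described above can be sketched as follows. This is an illustration under stated assumptions, not the enricher's code: a `Map` stands in for the EXT_EMAIL_DOMAINS table, `lookupMx` is a hypothetical callback that performs the MX record lookup and returns true or false, and lowercasing the domain and starting the hit count at 1 are choices made for the example.

```javascript
// Sketch of a cache-first domain check. The Map stands in for the
// EXT_EMAIL_DOMAINS table; lookupMx is a hypothetical MX lookup callback.
function isDomainValid(domain, cache, lookupMx) {
  var key = domain.toLowerCase(); // assumption: host names compared in lowercase
  var record = cache.get(key);
  if (record) {
    record.hitCount += 1; // the host name was processed once more
    return record.valid;  // cached answer: no MX record lookup is issued
  }
  var valid = lookupMx(key);
  cache.set(key, { valid: valid, hitCount: 1 });
  return valid;
}

// Usage with a fake lookup that counts how often it is called.
var lookups = 0;
var domainCache = new Map();
function fakeLookup(host) { lookups += 1; return host === "gmail.com"; }
isDomainValid("gmail.com", domainCache, fakeLookup);
isDomainValid("GMAIL.com", domainCache, fakeLookup);
console.log(lookups); // 1: the second call was served from the cache
```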

Domain Name Cache Table Structure

The structure of the EXT_EMAIL_DOMAINS table is the following:

Column Name	Description
`HOST_NAME`	Domain name, for example "gmail.com".
`PREFIX`	First 2 letters of the domain name, for example "gm".
`SUFFIX`	Last 2 letters of the domain name, for example "om".
`HIT_COUNT`	Number of times this host name was processed by the enricher. This value is automatically incremented by the enricher.
`SEED_DATA`	Indicates whether this record was part of the seeded data or was created by the enricher. The value is `1` for seeded data, `0` otherwise.
`VALID`	Indicates whether the domain name is valid (`1`) or invalid (`0`). The value is `N/A` if the validity is unknown (for example, when a new domain is added in the cache in offline mode).
`SUGGESTION`	Latest correction found for an invalid domain.
`FIRST_INVALID_DATE` `LAST_INVALID_DATE` `LAST_VALID_DATE`	Additional date information used to reconsider a domain validity after a certain period of time.
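
As a small illustration of the `PREFIX` and `SUFFIX` columns described above, the following sketch derives both values from a host name. `prefixAndSuffix` is a hypothetical helper introduced for this example.

```javascript
// Derives the PREFIX and SUFFIX column values from a host name,
// following the column descriptions above.
function prefixAndSuffix(hostName) {
  return {
    prefix: hostName.slice(0, 2), // first 2 letters
    suffix: hostName.slice(-2)    // last 2 letters
  };
}

var parts = prefixAndSuffix("gmail.com");
console.log(parts.prefix, parts.suffix); // gm om
```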

Fixing Domain Names

The enricher automatically fixes invalid domain names by finding the closest domain name in the cache using a built-in algorithm based on:

The Edit Distance between the invalid domain and cached domain.
The hit count of the cached domain.

A cached domain that is very similar to an invalid domain name and that is frequently processed by the enricher is more likely to be used as a fix for the invalid domain.
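
The built-in algorithm is not detailed in this guide. The sketch below only illustrates how the two criteria above could be combined: it computes a Levenshtein edit distance, then picks the closest cached domain and uses the hit count as a tie-breaker. The `suggestDomain` function, the ranking rule and the `maxDistance` cut-off are assumptions made for this example.

```javascript
// Classic dynamic-programming Levenshtein distance between two strings.
function editDistance(a, b) {
  var previous = [];
  for (var j = 0; j <= b.length; j++) {
    previous[j] = j;
  }
  for (var i = 1; i <= a.length; i++) {
    var current = [i];
    for (var k = 1; k <= b.length; k++) {
      var cost = a.charAt(i - 1) === b.charAt(k - 1) ? 0 : 1;
      current[k] = Math.min(
        previous[k] + 1,        // deletion
        current[k - 1] + 1,     // insertion
        previous[k - 1] + cost  // substitution
      );
    }
    previous = current;
  }
  return previous[b.length];
}

// Hypothetical ranking: smallest distance first, highest hit count on ties.
function suggestDomain(invalidDomain, cachedDomains, maxDistance) {
  var best = null;
  cachedDomains.forEach(function (candidate) {
    var distance = editDistance(invalidDomain, candidate.hostName);
    if (distance > maxDistance) {
      return; // too different to be a plausible fix
    }
    if (best === null || distance < best.distance ||
        (distance === best.distance && candidate.hitCount > best.hitCount)) {
      best = { hostName: candidate.hostName, hitCount: candidate.hitCount, distance: distance };
    }
  });
  return best === null ? null : best.hostName;
}

// Sample cache content, for illustration only.
var sampleCache = [
  { hostName: "gmail.com", hitCount: 900 },
  { hostName: "gmx.com", hitCount: 50 }
];
console.log(suggestDomain("gmial.com", sampleCache, 2)); // gmail.com
```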

Adding Records to the Cache

It is possible to force the creation of new records in the cache, for example to create new fix suggestions.

To manually insert a domain correction <host_name_replacement> for an invalid domain <invalid_host_name>, use the following sample query:

INSERT INTO EXT_EMAIL_DOMAINS (
	HOST_NAME,
	PREFIX,
	SUFFIX,
	HIT_COUNT,
	SEED_DATA,
	VALID,
	SUGGESTION,
	FIRST_INVALID_DATE,
	LAST_INVALID_DATE
	)
VALUES (
	<invalid_host_name>,
	SUBSTR(<invalid_host_name>, 1, 2),
	SUBSTR(<invalid_host_name>, -2, 2),
	0,
	'1',
	'0',
	<host_name_replacement>,
	CURRENT_TIMESTAMP,
	CURRENT_TIMESTAMP
	);

Cache Refresh

The Email enricher refreshes the local cache records after 3 months. This duration is not configurable. The cache stores date information for each record, and when a record is older than 3 months, the enricher makes a new call to the MX server to refresh it.

If there is good evidence that the cache is wrong about a domain’s validity, or if business users are certain they want to override the cache’s decision, the developer can manually set the VALID flag to 0 or 1. To prevent the cache from overriding this manual change, it is also important to set the date field to NULL so that the Email enricher does not refresh the cache for that domain.

It is safe for developers to periodically truncate the cache table if they want the cache to refresh its results sooner than the automatic 3-month refresh period. Developers can either drop the table entirely, or delete only the records they do not want and keep the seeded data and any crucial domains they have manually overridden.
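
The refresh rule described in this section can be sketched as follows. This is an illustration under stated assumptions: `needsRefresh` is a hypothetical helper, and `lastCheckDate` stands in for the date column used by the enricher, which this guide does not name precisely. A null date models a manually overridden record that must not be refreshed.

```javascript
// Hypothetical refresh rule: a record is refreshed once its last check is
// at least 3 months old; a null date disables the refresh for that domain.
function needsRefresh(lastCheckDate, now) {
  if (lastCheckDate === null) {
    return false; // manual override: the record is never refreshed
  }
  var expiry = new Date(lastCheckDate.getTime());
  expiry.setMonth(expiry.getMonth() + 3);
  return now.getTime() >= expiry.getTime();
}

// Record last checked on 15 January 2023, evaluated on 16 April 2023.
console.log(needsRefresh(new Date(2023, 0, 15), new Date(2023, 3, 16))); // true
```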