In this tutorial, you will learn how to implement data quality rules in Semarchy xDM in order to standardize your data and detect duplicates.

Learning track

This tutorial is the second unit within the Data Authoring track.

Before starting this tutorial, you must complete the first unit, Build your first data authoring application.

If you have not completed this prerequisite, return to the Tutorials menu.

Semarchy xDM allows you to create enrichers to normalize, standardize, and enrich data loaded or authored in your application. Several enrichers can be defined for an entity. They are executed in sequence.

You will learn how to define two types of enrichers: SemQL enrichers and plug-in enrichers.

In this section, you will:

  1. Create a SemQL enricher to standardize first and last names.
  2. Add a default value to the hire date if it is empty.
  3. Create a complex type to store additional phone attributes.
  4. Use a plug-in enricher to standardize phone numbers and populate these attributes.

Add the StandardizeNames enricher

You will use a SemQL enricher to perform basic data standardization on the employees' FirstName and LastName attributes. The objective is to ensure that these attributes always have the first letter capitalized and the subsequent letters in lowercase.

  1. Open your HRTutorial model in the Application Builder and navigate to Entities > Employee > Enrichers.
  2. Right-click the Enrichers node and then select Add SemQL Enricher.

  1. Set the Name to StandardizeNames and then click Next.

  1. Select the FirstName and LastName attributes in the Available Attributes list, click Add to add them to the Used Attributes list, and then click Finish.

  1. Enter the SemQL expression for each attribute in the Enricher Expressions section:

  1. Save your work.
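
The SemQL expressions themselves are not reproduced in this text. As a rough illustration of the stated objective (first letter capitalized, remaining letters in lowercase), the following Python sketch shows the intended transformation. The function name `standardize_name` is hypothetical, and applying the rule to each part of a hyphenated name is an assumption rather than confirmed behavior of the enricher.

```python
import re

# Matches a run of letters (Unicode-aware); digits and underscores are excluded.
_WORD = re.compile(r"[^\W\d_]+")


def standardize_name(value):
    """Capitalize the first letter of each alphabetic run and lowercase the rest.

    Hypothetical helper: it only illustrates the goal of the StandardizeNames
    enricher and is not the SemQL expression used in the model.
    """
    if value is None:
        return None
    return _WORD.sub(lambda m: m.group(0).capitalize(), value)
```

Under these assumptions, `standardize_name("jack")` returns `"Jack"`, and each part of a hyphenated last name is capitalized separately.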

Add the DefaultHireDate enricher

You will now create another SemQL enricher to assign a default date to the HireDate attribute if no date was provided.

  1. Navigate again to Entities > Employee > Enrichers.
  2. Right-click the Enrichers node and select Add SemQL Enricher.

  1. Set the Name to DefaultHireDate and click Next.
  2. Add HireDate to the Used Attributes list, and click Next.

  1. To ensure that the enricher is triggered only if HireDate is not set, enter the following SemQL in the Filter field: HireDate is null
  2. Click Finish.

  1. In the Enricher Expressions section, set the Expression of the HireDate attribute to the following expression: CURRENT_DATE()

  1. Save your work.
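
To make the interplay between the filter and the expression concrete, here is a minimal Python sketch of the same logic: the filter decides whether the enricher runs, and the expression supplies the new value. The function name `default_hire_date` and the `today` parameter are hypothetical; the parameter exists only so the sketch can be tested with a fixed date.

```python
from datetime import date


def default_hire_date(hire_date, today=None):
    """Return hire_date unchanged, or the current date when it is missing."""
    # Filter: the enricher only applies when HireDate is null.
    if hire_date is not None:
        return hire_date
    # Expression: the current date (injectable here to keep the sketch testable).
    return today if today is not None else date.today()
```

A record that already has a hire date is left untouched, which is the effect of the `HireDate is null` filter.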

Add the StandardizePhone enricher

In this section, you will use the Phone enricher plug-in to standardize and enrich phone numbers. This plug-in exposes several outputs given a phone number and a country of origin.

Add the PhoneType complex type

You first need to create a complex type that will be populated by the phone enricher.

  1. Right-click the Complex Types node in the Model Design view, and then select Add Complex Type.

  1. Enter PhoneType in the Name field, and click Finish.

  1. Select the Definition Attributes finger tab and then click the Add Definition Attribute button.

  1. Enter the following values, and then click Finish:

  1. Repeat the same steps to add all the following attributes.

| Attribute | Datatype | Notes |
|---|---|---|
| Country | String(2) | ISO-3166-1 country code for this phone |
| EnrichedPhone | String(128) | Standardized phone number |
| LineType | String(128) | Type of line guessed by the plug-in |
| Carrier | String(128) | Name of the carrier when available |
| Location | String(128) | Name of the guessed location of the phone |
| Timezones | String(128) | Time zones for this phone |
| IsPossible | String(128) | Whether this phone is possible |
| IsValid | String(128) | Whether this phone is valid |

  1. Check the result and save your work.
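
As a way to double-check the result, the complex type can be pictured as a simple record with one string field per definition attribute. The Python sketch below mirrors the attribute table above. It is not generated by or used by xDM, and the `too_long` helper is a hypothetical addition that flags values exceeding the declared lengths.

```python
from dataclasses import dataclass, fields
from typing import List, Optional

# Declared lengths, taken from the attribute table above.
MAX_LENGTHS = {
    "Country": 2,
    "EnrichedPhone": 128,
    "LineType": 128,
    "Carrier": 128,
    "Location": 128,
    "Timezones": 128,
    "IsPossible": 128,
    "IsValid": 128,
}


@dataclass
class PhoneType:
    """Illustrative mirror of the PhoneType complex type."""
    Country: Optional[str] = None
    EnrichedPhone: Optional[str] = None
    LineType: Optional[str] = None
    Carrier: Optional[str] = None
    Location: Optional[str] = None
    Timezones: Optional[str] = None
    IsPossible: Optional[str] = None
    IsValid: Optional[str] = None


def too_long(phone: PhoneType) -> List[str]:
    """Return the names of attributes whose value exceeds its declared length."""
    result = []
    for f in fields(phone):
        value = getattr(phone, f.name)
        if value is not None and len(value) > MAX_LENGTHS[f.name]:
            result.append(f.name)
    return result
```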

  1. Go back to the Details finger tab, click the Define button, and then click the Define Display Name button.
  2. Click Next in the first step of the wizard.
  3. Add the EnrichedPhone attribute to the Selected Attributes list, and then click Finish.

  1. Save your work.
  2. Under Entities in the Model Design view, expand the Employee entity, right-click Attributes, and select Add Complex Attribute.

  1. Enter the following values, and then click Finish.

  1. Save your work.

Add the phone enricher

The Phone enricher plug-in accepts several inputs and produces several outputs, such as the standardized phone number or the geolocation of the phone line.

  1. Navigate again to Entities > Employee in the Model Design view.
  2. Expand the Employee entity, right-click Enrichers, and select Add API Enricher.

  1. Enter the following values, and then click Finish:

  1. Scroll down to the Inputs section of the editor, and then click the Define Inputs button.

  1. Select the Region Code, Input Phone Number and Enriched Phone Format inputs, click the Add button to add them to the Used Inputs list, and then click Finish.

  1. For each input listed in the Inputs section, set the following expressions in the Expression column:

  1. Scroll down to the Outputs section and click the Define Outputs button:

  1. Add the following attributes to the Attributes Used list, and then click Finish:

  1. Select the following values in the Output Name column for each Attribute Name from the Outputs list:

| Attribute Name | Output Name |
|---|---|
| EnrichedPhone.Carrier | Carrier Name |
| EnrichedPhone.EnrichedPhone | Enriched Phone Number |
| EnrichedPhone.IsPossible | Possible Phone Number |
| EnrichedPhone.IsValid | Valid Phone Number |
| EnrichedPhone.LineType | Phone Line Type |
| EnrichedPhone.Location | Geocoding Data |
| EnrichedPhone.Timezones | Time Zones |

  1. Check the enricher configuration in the Inputs and Outputs sections, and then save your work.
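
The output mapping above can be read as a lookup from each EnrichedPhone sub-attribute to a plug-in output name. The Python sketch below expresses that reading. Representing the plug-in outputs as a dictionary keyed by output name is an assumption made for illustration, and `populate_enriched_phone` is a hypothetical helper, not part of the xDM API.

```python
# Sub-attribute of EnrichedPhone -> plug-in output name, as listed in the mapping above.
OUTPUT_MAPPING = {
    "Carrier": "Carrier Name",
    "EnrichedPhone": "Enriched Phone Number",
    "IsPossible": "Possible Phone Number",
    "IsValid": "Valid Phone Number",
    "LineType": "Phone Line Type",
    "Location": "Geocoding Data",
    "Timezones": "Time Zones",
}


def populate_enriched_phone(plugin_outputs):
    """Build the EnrichedPhone sub-attributes from outputs keyed by output name.

    Outputs that the plug-in did not return are stored as None.
    """
    return {
        attribute: plugin_outputs.get(output_name)
        for attribute, output_name in OUTPUT_MAPPING.items()
    }
```

Note that Country does not appear among the mapped outputs listed above.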

Congratulations!

You have successfully added data standardization rules to your model.

The next section will focus on adding validation and match rules.

Semarchy xDM applies a range of data validation rules during the data authoring process. These rules act as a protective barrier that prevents erroneous data from being entered or imported into the hub. Some rules are implicit, such as mandatory attributes, lists of values, or referential integrity. Others are defined explicitly, such as SemQL validations, plug-in validations, and match rules.

In this section, you will use SemQL to validate data and keep it consistent. You will be guided through the following tasks: validating end dates, validating emails, and preventing duplicates.

Validate end dates

An employee's contract end date should either be null if the employee is still in the company, or be later than their hire date.

You are now going to add a validation that applies this rule.

  1. Go to Entities > Employee > Validations in the Model Design view of the Application Builder.
  2. Right-click Validations and select Add SemQL Validation.

  1. Fill in the wizard with the following values and then click Finish:

  1. Save your work.
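
The wizard values are not reproduced in this text, but the rule stated at the start of this section can be sketched in Python as follows. The function name `is_end_date_valid` is hypothetical, and treating a missing hire date as valid is an assumption. The SemQL condition configured in the wizard may behave differently in that case.

```python
from datetime import date


def is_end_date_valid(hire_date, end_date):
    """True when the contract end date is absent or later than the hire date."""
    # An employee still in the company has no end date: always valid.
    if end_date is None:
        return True
    # Assumption: without a hire date there is nothing to compare against.
    if hire_date is None:
        return True
    return end_date > hire_date
```

An end date equal to the hire date is rejected, because the rule requires the end date to be strictly later.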

Validate emails

You will now add a rule that makes the email mandatory only if the employee is a contractor.

  1. Navigate again to Entities > Employee > Validations in the Model Design view.
  2. Right-click Validations, and then select Add SemQL Validation.

  1. In the wizard, enter the following values, and click Finish:

  1. Save your work.
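
A conditional mandatory rule of this kind reduces to a simple implication: if the employee is a contractor, the email must be present. The Python sketch below illustrates that logic. The function name `is_email_valid` is hypothetical, and treating a blank string as a missing email is an assumption.

```python
def is_email_valid(is_contractor, email):
    """Email is mandatory only when the employee is a contractor."""
    # Non-contractors pass regardless of whether an email is provided.
    if not is_contractor:
        return True
    # Assumption: a blank or whitespace-only value counts as missing.
    return email is not None and email.strip() != ""
```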

Prevent duplicates

Preventing users from creating duplicate records is considered a best practice. In this section, you will create a match rule intended to prevent the creation of an employee record if another employee with identical first and last names already exists within the same subsidiary.

  1. Go to Entities > Employee > Matcher in the Model Design view.
  2. Right-click Matcher and select Define SemQL Matcher.

  1. In the Description field, type Prevent duplicate employees, and click Finish.
  2. Click the Add Match Rule button in the Match Rules section of the SemQL Matcher Editor.

  1. Set the Name of your match rule to MatchOnName.

  1. Scroll down to the Matching section, and click the Edit Expression button.

  1. In the SemQL Condition field, copy the expression below and click OK.
Upper(Record1.FirstName) = Upper(Record2.FirstName)
and Upper(Record1.LastName) = Upper(Record2.LastName)
and Record1.Subsidiary = Record2.Subsidiary

  1. Save your work.
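
For readers less familiar with SemQL, the match condition above can be paraphrased in Python. This is only a sketch: records are represented as plain dictionaries, the function name `match_on_name` is hypothetical, and treating a missing value as a non-match mirrors the usual SQL handling of nulls, which is an assumption here.

```python
def _upper(value):
    """Uppercase a string, passing None through unchanged."""
    return value.upper() if value is not None else None


def match_on_name(record1, record2):
    """True when both records share first name, last name (case-insensitive) and subsidiary."""
    pairs = [
        (_upper(record1.get("FirstName")), _upper(record2.get("FirstName"))),
        (_upper(record1.get("LastName")), _upper(record2.get("LastName"))),
        (record1.get("Subsidiary"), record2.get("Subsidiary")),
    ]
    # Assumption: as in SQL, a comparison involving a missing value is not a match.
    return all(a is not None and b is not None and a == b for a, b in pairs)
```

The uppercase conversion makes the name comparison case-insensitive, while the subsidiary is compared as-is.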

Congratulations!

You have successfully added validation and match rules to your model.

The next section will focus on integrating enrichers and validation rules into the user interface, as well as deploying a new version of your application.

Semarchy xDM gives you the flexibility to control exactly which rules should apply when users are authoring data using a stepper.

In this section, you will learn how to make the enrichers and validation rules effective in the MDM application. You will modify the EmployeeForm form, modify the AuthorEmployee stepper, and deploy the model changes.

Modify the EmployeeForm form

You will now modify the EmployeeForm form to add the enriched phone attributes and reorganize the form fields.

  1. In the Application Builder, navigate to Entities > Employee > Forms, and double-click EmployeeForm.

  1. Select the Phone attribute in the form's tree view, and then use the Move Up button to move this attribute between LastName and Salutation.

  1. In the Attributes list, expand the EnrichedPhone attribute and select all the attributes it is composed of (hold the Shift key and select the first and last attribute).

  1. Drag and drop the selected attributes to the form's tree view between Phone and Salutation.

  1. Enter the following values in the Label column of these new attributes:

  1. Select the FDN_Department attribute and use the Move Up button to move it between Title and HireDate (or you can drag and drop the attribute to the target location).

  1. Make the EnrichedPhone_Country attribute read-only:

  1. Select the following attributes, and force them all to read-only authoring mode:

  1. Save your work.

Modify the AuthorEmployee stepper

Before you deploy your changes, you need to modify the AuthorEmployee stepper to enable the validations and enrichers that you defined in the previous sections.

  1. Go to Entities > Employee > Steppers in the Model Design view, and double-click AuthorEmployees.

  1. Configure the validations to perform on stepper finish:

  1. Scroll up to the Steps section and select the second step: Employee.

  1. In the Properties view, open the Step Transition Validations finger tab, select the first row of the table (DETECT_DUPS), and then select Warn in the On Step Exit column.

  1. Make all other validations blocking for step transitions:

  1. Configure all validations on data change:

  1. Configure validations on form opening:

  1. Set up the execution of enrichers:

  1. Save your work.

Deploy the model changes

You are now ready to deploy your changes, have a first glance at the employee creation form, and see the impact of your changes.

  1. In the Application Builder, right-click the root node of the Model Design view, which corresponds to the HRTutorial [0.0] model, and then select Validate. The validation report should show no errors.

  1. Go to the Management perspective of the Application Builder.

  1. Right-click your data location, EmployeeTutorial, and select Deploy Model Edition.

  1. Click Finish in the wizard to deploy your model changes.

Congratulations!

You have successfully integrated validations and enrichers, and deployed the changes to your application.

The next section will showcase all the rules in action within your application.

In this section of the tutorial, you will try several combinations for creating employee records, in order to view your enrichers and validations in action in the user interface.

Enrichers in action

Previously, you created three enrichers to standardize names, enrich phone information, and set the default hire date. To see them in action:

  1. Open your application from the Welcome page.

  1. In the navigation drawer, select Employees, open the Options menu, then select Create.

  1. Type jack in the First Name field and BOLT-HarriSson in the Last Name field.
    Notice how the StandardizeNames enricher re-formats values automatically.

  1. Enter 0478963556 in the Phone field and FR in the Country field.
    Observe the outputs of the phone enricher:

  1. Scroll down to the Hire Date field, which has automatically been set to the current date.
  2. Discard your changes: click again on Employees in the navigation drawer, and then click Discard all on the pop-up window that appears.

Validations in action

Previously, you added two custom validations on top of the built-in validations, and enabled them in the AuthorEmployee stepper. To see them in action:

  1. In the navigation drawer of your application, select Employees, open the Options menu, and select Create.
  2. Type Jack in the First Name field and Bolt in the Last Name field, then click Finish.

  1. Two validation issues are raised. Click Cancel.

  1. Fill in the following fields:
  1. Set the End Date value to a date preceding the hire date.
    Notice the error message.

  1. Select Is Contractor.
    Notice the error message due to the missing email address.

  1. Fix the erroneous data:
  1. Click Finish to submit the creation of your record.
  2. Wait until the toaster menu at the bottom left corner of the screen indicates "Changes successfully applied," and then select Click to refresh.

  1. Jack Bolt's record is now available in the list of employees.

Matching in action

Previously, you created a match rule to prevent the creation of duplicate employee records based on their name and subsidiary.

  1. In the navigation drawer, select Employees, open the Options menu, and select Create.
  2. Fill in the following fields and then click Finish.
  1. Notice the error message stating that a duplicate has been found.
  2. Select the record to see the details.

  1. Click Resolve duplicate issue.

  1. Select the second record (i.e., the existing employee with the same name and subsidiary) by clicking on its avatar, and then click Replace.

  1. The editing form for the existing record replaces the creation form.

  1. Complete this record's Phone and Phone Country fields with the values you entered on the other record, and then click Finish.
  1. The existing record has been updated with phone information, and the creation of a duplicate has been avoided.

Congratulations!

You have successfully used the employee record creation form to see the rules in action.

Great job! You have completed the second unit of the Data Authoring track by implementing your first rules for data quality and deduplication.

Next steps

In the next unit, Display cards, forms, and collections, you will learn how to customize the user interface by designing display cards, forms, and collections.

Thank you for completing this tutorial.