Post-upgrade actions

After completing the upgrade procedure, specific actions may be required based on the version of Semarchy xDM from which you are upgrading.

Make sure to read all the sub-sections on this page until you reach your current Semarchy xDM product version. If no action is specified in this document for your version, then no further steps are required after restarting the Semarchy xDM instances.

Certain actions resulting in changes to the model (such as adjustments to validations, enrichers, or any aspect of the certification process) require re-deploying the model.

Upgrade from versions before 5.3.24

This section applies to all installations upgrading from a version before version 5.3.24.

Integration process

Semarchy xDM version 5.3.24 includes a fix to the certification process (MDM-14406) that prevents golden references to fuzzy-matched entities on basic entity golden records from being temporarily reset to null during integration job execution. Applying this fix to the integration jobs requires redeploying all models.

Required actions
Platform administrators must redeploy all models to apply the fix.

Upgrade from versions before 5.3.9

This section applies to all installations upgrading from a version before version 5.3.9.

Logging

Starting with this version, logging in Semarchy xDM relies on Log4j 2, which is configured using an XML configuration format instead of the key-value property pairs previously used. The existing startup and platform logging configurations must therefore be migrated to the new Log4j 2 format.

Required actions
Platform administrators must perform the following actions to migrate their existing startup and platform logging configurations:

  1. Manually migrate the startup logging configuration into a new Log4j 2 XML configuration file.

  2. Upgrade the platform logging.
    When opening the Logging Configuration editor (Configuration > Logging Configuration) for the first time, administrators are prompted to choose one of the following options to migrate the platform logging configuration to the new format:

    • Let Semarchy perform an upgrade of the Log4j 1 configuration to Log4j 2 format.

      With this option, all the loggers are automatically upgraded. Only the default TOMCAT and FILE appenders are upgraded. Administrators must review, fix, and complete the configuration of other appenders when choosing this method.
    • Replace the previous Log4j 1 configuration with a default Log4j 2 configuration file and modify this configuration manually.

      When using this method, administrators must complete the configuration with their customized loggers and appenders.
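As an illustration of the manual part of this migration, the following sketch converts Log4j 1 logger properties (log4j.logger.<name>=<LEVEL>) into Log4j 2 <Logger> XML elements. It is a hypothetical helper, not a Semarchy tool: the function name is an assumption, it handles loggers only, and appenders still have to be rewritten by hand in the new XML format.

```python
from xml.sax.saxutils import quoteattr

PREFIX = "log4j.logger."


def loggers_to_log4j2(properties_text):
    """Convert Log4j 1 'log4j.logger.*' lines into Log4j 2 <Logger> elements."""
    elements = []
    for raw in properties_text.splitlines():
        line = raw.strip()
        # Skip blanks, comments, and anything that is not a logger definition.
        if not line or line.startswith(("#", "!")) or not line.startswith(PREFIX):
            continue
        key, _, value = line.partition("=")
        name = key[len(PREFIX):].strip()
        # In Log4j 1 the value is "LEVEL[, appender1, ...]"; keep only the level.
        level = value.split(",")[0].strip().upper()
        if not name or not level:
            continue
        elements.append(f"<Logger name={quoteattr(name)} level={quoteattr(level)}/>")
    return elements
```

The returned elements are meant to be pasted inside the <Loggers> section of the Log4j 2 configuration file and reviewed before use.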

Upgrade from versions before 5.3.0

This section applies to all installations upgrading from a version before version 5.3.0.

Integrated authentication

As mentioned in the pre-upgrade actions, after the upgrade, the Semarchy xDM platform is configured with only the Internal Identity Provider and the administrator user that was created during the upgrade process.

Required action
Using this user’s credentials, administrators must:

  • Reconfigure the users previously stored at the application server level.

  • Configure identity management, including:

    • The third-party authentication (single sign-on) systems as Identity Providers, optionally with their Role Mapping.

    • If applicable, the Role Lookup feature, to assign roles to users based on database table contents.

Authentication configured at the application server level can be deleted after this step. For Tomcat, the valve and realms configured for Semarchy xDM can be removed from the semarchy.xml file.

For a Tomcat installation, the identity providers (SSO) configuration is contained in the semarchy.xml configuration file, and the local users used for form authentication are in the tomcat-users.xml file.

Integrated datasources

As mentioned in the pre-upgrade actions, the datasources used by the data locations, xDM Dashboard, xDM Discovery, the variable value providers and the enrichment or validation plug-ins are no longer configured at the application server level and must be defined in the Semarchy xDM platform as platform datasources.

The upgrade process automatically creates empty datasources for the components that refer to application server datasources. However, it does not recover the datasource configurations.

The referenced JNDI datasources are upgraded into empty platform datasources with a normalized name. The normalized name is derived from the JNDI name by performing the following operations:

  • The java:comp/env/jdbc/ prefix is removed.

  • The remaining part is normalized by converting the characters that are not alphanumeric or underscore to _.

For example, in a data location, a reference to a JNDI datasource named java:comp/env/jdbc/DATA_LOCATION_1 is upgraded to a platform datasource named DATA_LOCATION_1. That datasource is empty and must be configured by the administrator.
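The normalization rule can be sketched as follows, which may help predict the names of the upgraded platform datasources before the upgrade. This is a minimal illustration, not the product's implementation: the function name is hypothetical, and the sketch assumes that "alphanumeric" means ASCII letters and digits.

```python
import re

JNDI_PREFIX = "java:comp/env/jdbc/"


def normalize_datasource_name(jndi_name):
    """Derive a platform datasource name from a JNDI datasource name."""
    # Step 1: drop the JNDI prefix when it is present.
    if jndi_name.startswith(JNDI_PREFIX):
        name = jndi_name[len(JNDI_PREFIX):]
    else:
        name = jndi_name
    # Step 2: replace every character that is not alphanumeric or underscore.
    return re.sub(r"[^A-Za-z0-9_]", "_", name)
```

With this sketch, java:comp/env/jdbc/DATA_LOCATION_1 gives DATA_LOCATION_1, and a hypothetical java:comp/env/jdbc/my-ds.prod gives my_ds_prod.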

Required action
Administrators must:

  • Review and reconfigure the datasources upgraded to platform datasources in Semarchy xDM Configuration. They should also review the usages defined for these platform datasources.

  • Review the components referencing these datasources (data locations, xDM Dashboard, and xDM Discovery datasources).

Datasources configured at the application server level can be deleted after this step. For Tomcat, they can be removed from the semarchy.xml file.

For a Tomcat installation, the original datasources are configured in the semarchy.xml file.

REST API authentication

The following changes have been made to the REST API authentication mechanism:

  • Authentication and the SemarchyConnect role are required for the OpenAPI documentation.

  • The REST API does not support stateful authentication.

Recommended action
Administrators should review the roles granted to users and API keys accessing the REST API OpenAPI documentation to make sure that they have the SemarchyConnect role.

Import restriction properties in steppers and workflows

The Enable Import, Enable Import-Update, and Restrict Import to Update properties available on workflow transitions, steppers, and actions are replaced with two properties:

  • The Enable Import checkbox enables or disables the import capability.

  • The Allow Import to property restricts import to Create and Update, Create Only, or Update Only records.

Recommended action
The upgrade process sets the new properties according to the legacy configuration. However, model designers should review their import experience to make sure that the behavior remains consistent with the new properties.

Plug-in batch update size defaults to 1,000

The Batch Update Size properties of enricher plug-ins now automatically take the value 1,000 when left empty. This value provides better performance in most situations.

Recommended action
Monitor the performance of the plug-in enrichers to make sure that this value change has no negative impact.

SemQL changes

The Master Records’ OldMatchGroupID attribute (B_OLDMATCHGRP column in the MI table) is deprecated.

Required action
Model designers and integration specialists should review any component that accesses or uses the deprecated attributes or columns in SQL or SemQL, and remove any use of these deprecated fields.

User and role repository tables

With the new Integrated authentication feature, the user and role tables (MTA_USER, MTA_USER_ROLES) stored in the repository have been moved and their structure has been changed. The new tables storing users and roles are named IDM_USER and IDM_USER_ROLE.

Recommended action

Model designers and integration specialists should review and update any component or script that accesses or uses the deprecated tables.

Upgrade from versions before 5.2.0

This section applies to all installations upgrading from a version before version 5.2.0.

Change in built-in variables

The following changes have occurred in the built-in variables:

  • Variable V_USER_DEPARTEMENT is renamed V_USER_DEPARTMENT.

  • V_DATAEDITION and V_DATALOCATION are no longer available.
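To locate expressions affected by these changes, a small helper along the following lines can be run over expressions exported from a model. It is a sketch only: the function name and the sample expression are hypothetical, and it assumes the variables appear as whole words in the expression text.

```python
import re

RENAMED = {"V_USER_DEPARTEMENT": "V_USER_DEPARTMENT"}
REMOVED = {"V_DATAEDITION", "V_DATALOCATION"}


def review_expression(expression):
    """Return (updated_expression, removed_variables_found)."""
    updated = expression
    # Apply the rename; whole-word matching avoids touching longer identifiers.
    for old, new in RENAMED.items():
        updated = re.sub(rf"\b{re.escape(old)}\b", new, updated)
    # Removed variables cannot be fixed automatically; report them for review.
    found = sorted(
        v for v in REMOVED if re.search(rf"\b{re.escape(v)}\b", expression)
    )
    return updated, found
```

Expressions for which the second element is not empty still reference a variable that is no longer available and need a manual rewrite.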

Database functions case-sensitivity

This release changes how case sensitivity is handled for database functions declared in the model.

Starting with this release, you can declare the function names and schemas in any case. However, a model validation prevents having two functions with the same qualified name and a different case.

PostgreSQL and Oracle are case-insensitive for all function calls. PostgreSQL assumes that all object names are in lowercase, and Oracle assumes that all object names are in uppercase. The product behavior does not change for models based on these two databases.

An SQL Server database may be configured to handle object names in a case-sensitive manner. Starting with this release, Semarchy xDM assumes, for SQL Server, case sensitivity for all function calls, including the SemQL built-in functions. Function calls performed in SemQL with a function name that does not match exactly the case of the built-in function definition (as it appears in the SemQL editor) now raise warnings in the model validation, since they may fail at run-time, depending on the database server configuration.
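The two checks described above can be pictured with the following sketch. It is not the model validation code: the function names are hypothetical, and the function names used as data are illustrative only.

```python
from collections import defaultdict


def find_case_conflicts(qualified_names):
    """Group declared functions whose qualified names differ only by case."""
    groups = defaultdict(set)
    for name in qualified_names:
        groups[name.lower()].add(name)
    return sorted(sorted(group) for group in groups.values() if len(group) > 1)


def find_case_mismatches(called_names, defined_names):
    """Return calls that match a defined function only when case is ignored."""
    exact = set(defined_names)
    by_lower = {name.lower(): name for name in defined_names}
    return [
        (called, by_lower[called.lower()])
        for called in called_names
        if called not in exact and called.lower() in by_lower
    ]
```

The first function corresponds to the rule preventing two functions with the same qualified name and a different case. The second corresponds to the SQL Server warning for calls whose case does not match the definition exactly.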

Required action

Model designers should validate their models based on SQL Server. The model validation report will raise undefined function warnings for case discrepancies. If such warnings appear, the SemQL expressions should be fixed and the model redeployed.

New Melissa Personator plug-in

A new Melissa plug-in for Semarchy xDM provides enrichers to fix and complete contact data for the US and Canada using the Personator service and to validate international addresses in 240 countries using the Global Address Verification service.

Existing enrichers using the legacy Personator enricher are automatically upgraded to use the new enricher as part of the automated model upgrade process.

The upgrade automatically configures the enricher to use the new plug-in capabilities, as described below:

  • The Debug parameter (true/false) is removed. This parameter was used to automatically configure the logging, without handling logging levels. Users willing to run the plug-in execution in debug mode should configure the following loggers:

    • log4j.logger.com.semarchy.platform.engine.PluginExecution

    • log4j.logger.com.semarchy.engine.plugins.melissa

    • log4j.logger.org.apache.cxf

    • log4j.logger.org.apache.cxf.services

    For more information, see Configure the platform logging.

  • The Passthrough parameter (true/false) is removed and replaced with the Requests Limit parameter to limit the number of enriched records for testing purposes.

  • The enricher optimization capabilities, including Thread Pool Size, Batch Update Size, and Processing Batch Size are now fully supported. Processing Batch Size is upgraded to a value of 100 if the original value is above 100.

  • The old plug-in code had a hardcoded error/retry handling with 10 retries. This code has been removed. The enricher configuration is automatically configured with the Max Retries property set to 10.
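The upgrade rules listed above can be summarized with the following sketch, which may help when reviewing an upgraded enricher. It is an illustration only: the dictionary representation and the function name are hypothetical, and because this page does not state which value Requests Limit receives, the sketch leaves that property unset.

```python
def upgrade_personator_config(legacy):
    """Apply the documented upgrade rules to a legacy enricher configuration."""
    upgraded = dict(legacy)
    # Debug and Passthrough no longer exist in the new plug-in.
    upgraded.pop("Debug", None)
    upgraded.pop("Passthrough", None)
    # Processing Batch Size is reduced to 100 when the original value is above 100.
    size = upgraded.get("Processing Batch Size")
    if size is not None and size > 100:
        upgraded["Processing Batch Size"] = 100
    # The former hardcoded retry count becomes an explicit property.
    upgraded["Max Retries"] = 10
    return upgraded
```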

Required action

Users using the legacy Personator enricher should review the upgraded enricher configuration after the upgrade.

Removed: GBGroup Matchcode Global plug-in

The Matchcode Global plug-in for Semarchy xDM, which uses GBGroup Matchcode Global to provide an enricher for international postal addresses, is removed in this release.

Required action

Users using this plug-in should switch to the REST client feature available in this release.

Deprecated: xDM Discovery: row sampling property

In xDM Discovery, the Row Sampling table property is deprecated and replaced with the Maximum Distinct Values Stored and Maximum Patterns Stored properties with default values of 50. The value set in Row Sampling is automatically copied to these new properties by the upgrade process.

Recommended action

Users should run the profiling process to benefit from these new properties.

xDM Discovery: unique and non-unique counts

In xDM Discovery, the following metrics are now persisted in the PRF_COLUMN table and no longer computed: Unique Count, Duplicate Count, Unique Values, Duplicate Values. Persisting these metrics prevents inaccurate values when sampling a subset of the table.
Queries (the Columns query) in the generated dashboards now use these new columns.

During the upgrade:

  • The new columns in PRF_COLUMN are created empty. They are loaded the next time the profiling process runs for a profiled table.
    These metrics may appear in the dashboards, but with empty values, if the profiling process is not re-executed.

  • Dashboard applications created using the Seed profiling app option are not modified.
    As a consequence, seeded profiling applications do not benefit from these columns and still use the old computed values.

Recommended action

xDM Discovery users should:

  • Re-profile their tables to load these new metrics columns.

  • Update their seeded profiling applications (the Columns query) or re-seed new applications to use these metrics.

New application documentation

Semarchy xDM introduces in this release an automated application documentation feature. This documentation is generated based on the model content and extensively uses the Documentation property set on objects in the model (attributes, form fields, etc.).

Besides, a new Documentation property is added on entities. This property is set to the value of the entity’s description during the upgrade.

Recommended action

Designers should consider configuring the application documentation and should review that documentation, possibly updating the Documentation property set on the entities and other objects of the model.

Clustered mode by default

The Semarchy xDM platform now starts by default in clustered mode to automatically propagate configuration and model deployments on all nodes. Administrators no longer need to add the com.semarchy.xdm.cluster.enabled=true system property to the application server startup for each node, as this is now the default behavior.

Upgrade from versions before 5.1.0

This section applies to all installations upgrading from a version before version 5.1.0.

New license management

Semarchy xDM introduces in this release a new system to manage licenses.

The repository upgrade process erases the existing license key in xDM instances when upgrading from a version before version 5.1.0. Such instances have a seven-day grace period during which they work normally. After that period, they will stop working.

Required action

Before the grace period is over, administrators should activate the instance, as described in Manage the license.

Our global Technical Support team (support@semarchy.com) is available for assistance during the upgrade process and to help plan the license change.

New Semarchy xDM Discovery component

Semarchy xDM Discovery enables data architects and business users to gather metrics and profile any source data to prepare a data management initiative.

When connecting to the welcome page, users with the appropriate privileges will see a new xDM Discovery link to access the user interface for Semarchy xDM Discovery.

Suggested actions
Teams willing to use Semarchy xDM Discovery in their projects should configure additional datasources for this component and grant the appropriate privileges to the platform roles to access this feature.

Select all in actions

In previous releases, only the Delete, Confirm Duplicates, Review and Confirm Duplicates, and Review Duplicates Suggestions actions had Enable All properties to execute the action on all records when no selection was made.

Starting with this release, this implicit select all is replaced with an explicit select all capability that applies to all actions. With this feature, users can explicitly select all the filtered records in a collection, possibly deselect some of them and then start an action with the selection.

The number of records that can be processed at the same time may be limited at design-time for each action using a new optional Items Limit property, available in the Action Configuration properties. A warning is raised to users when they trigger the action with a selection larger than this limit.

When running an action on a set of records selected by select all, the Condition defined on the action as well as the model privilege grants still apply to automatically filter the records when the action starts. The user is warned if the action cannot process all the selected records due to the condition or a missing privilege.

Actions are automatically upgraded to use this new feature:

  • The Items Limit is set to null (no limit) for Export actions.

  • The Items Limit is set to 1,000 for Copy and Edit actions.

  • The Items Limit is set to null (no limit) for existing Delete and duplicate management actions having Enable Delete All Records, Enable Confirm All, or Enable Review All selected.

  • The Items Limit is set to 1,000 for existing Delete and duplicate management actions having Enable Delete All Records, Enable Confirm All or Enable Review All deselected.

  • The Enable Delete All Records, Enable Confirm All, and Enable Review All properties are deprecated and no longer appear in the Application Builder. Designers willing to force actions to run only on single record selections should use the Support Multiple Selection property.
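The Items Limit values applied by the upgrade can be read as a small decision function, sketched below to help application designers anticipate the upgraded configuration. The function names and the string-based action types are hypothetical. None stands for "no limit", and enable_all represents the legacy Enable Delete All Records, Enable Confirm All, or Enable Review All property.

```python
# Actions that had a legacy "Enable ... All" property before the upgrade.
ENABLE_ALL_ACTIONS = {
    "Delete",
    "Confirm Duplicates",
    "Review and Confirm Duplicates",
    "Review Duplicates Suggestions",
}


def upgraded_items_limit(action_type, enable_all=False):
    """Return the Items Limit set by the upgrade, or None for no limit."""
    if action_type == "Export":
        return None
    if action_type in ("Copy", "Edit"):
        return 1000
    if action_type in ENABLE_ALL_ACTIONS:
        return None if enable_all else 1000
    raise ValueError(f"No upgrade rule documented for action type: {action_type}")


def exceeds_limit(selected_count, items_limit):
    """Tell whether a selection is larger than the limit (None means no limit)."""
    return items_limit is not None and selected_count > items_limit
```

For example, a Delete action that previously had Enable Delete All Records deselected is upgraded with a limit of 1,000, so a selection of 1,500 records would trigger the warning.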

Recommended actions
Application designers should review the configuration of the actions after the upgrade and inform the application users of this functional change.

Split duplicate propagation

A new Split Duplicate Propagation property is added to the references configuration to define how splitting records in the referenced entity affects the referencing records. This property applies when both the referenced and referencing entities are fuzzy-matched.

Setting this property to Reset Matching forces rematching child records when their parent is split in a duplicate manager.

The child records are re-matched, ignoring their previous match groups and user decisions. The resulting golden records are redistributed under their respective parent records. When re-matched, the referencing records are all re-merged into new golden records with new Golden IDs and lose all their value overrides.

Suggested actions
Model designers willing to use this feature should enable this option and redeploy their model.

SQL Server certified for production

This release introduces full support for SQL Server, including production certification. The configuration of the SQL Server databases used for the repository and data locations needs to be reviewed to make sure it complies with the requirements described in database-specific considerations.

The QUOTED_IDENTIFIER and READ_COMMITTED_SNAPSHOT options should be set to ON for these databases.

Required actions
Database administrators should check the configuration of these parameters for the databases using the following command:

SELECT NAME, IS_QUOTED_IDENTIFIER_ON, IS_READ_COMMITTED_SNAPSHOT_ON FROM SYS.DATABASES

They should possibly modify the databases configuration using the following commands:

ALTER DATABASE <database_name> SET QUOTED_IDENTIFIER ON;
ALTER DATABASE <database_name> SET READ_COMMITTED_SNAPSHOT ON;
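When several databases are involved, the result of the check query can be turned into the list of commands to run. The following sketch assumes the rows have already been fetched with any SQL Server client as (name, quoted identifier flag, read committed snapshot flag) tuples. The function name and the database names in the usage example are hypothetical.

```python
def build_alter_statements(rows):
    """Build ALTER DATABASE commands for databases with an option set to OFF.

    rows: iterable of (name, is_quoted_identifier_on,
    is_read_committed_snapshot_on) tuples, as returned by the check query.
    """
    statements = []
    for name, quoted_on, snapshot_on in rows:
        # Bracket-quote the database name, doubling any closing bracket.
        quoted_name = "[" + name.replace("]", "]]") + "]"
        if not quoted_on:
            statements.append(
                f"ALTER DATABASE {quoted_name} SET QUOTED_IDENTIFIER ON;"
            )
        if not snapshot_on:
            statements.append(
                f"ALTER DATABASE {quoted_name} SET READ_COMMITTED_SNAPSHOT ON;"
            )
    return statements
```

For instance, build_alter_statements([("SEMARCHY_REPO", True, False)]) returns a single command enabling READ_COMMITTED_SNAPSHOT on that database. The generated commands should be reviewed by a database administrator before they are executed.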

Upgrade from versions before 5.0.1

This section applies to all installations upgrading from a version before version 5.0.1.

Source reference validations

In previous releases, a child record referencing a parent record detected as erroneous during the pre-consolidation validation passed the certification process and was published with a null reference.

Starting with this release, when a child record references a parent record detected as erroneous during the pre-consolidation validation, then this child record is also rejected in the pre-consolidation validation by the reference validation. If a previous valid version of the parent record exists in the hub, then the child record is attached to that version instead. During post-consolidation validation, a reference is always considered valid if the referenced record exists (flagged as erroneous or not).

Required actions
Model designers should review their model validations and references for possible impact on error management. They must redeploy their models to update the jobs and apply this behavior change. Those willing to preserve the old behavior must define the -Dmdm.sourceFkValidation.allowRefToNewRecordsWithErrors=true system property.

Upgrade from versions before 5.0.0

This section applies to all installations upgrading from a version before version 5.0.0.

New Dashboard component

Semarchy xDM Dashboard is a new component for designing visualizations and dashboards that create insight by combining your business data and the data stored and managed in the Semarchy xDM intelligent data hubs.

When connecting to the welcome page, users with the appropriate privileges will see a new Dashboard Builder link to access the design-time user interface for dashboards.

Suggested actions
Teams willing to use xDM Dashboard in their projects should configure additional datasources for this component and grant the appropriate privileges to the platform roles to access this feature.

User interface reorganization

The user interfaces available in Semarchy xDM have been reorganized:

The Workbench is now split into two user interfaces:

  • Configuration for all platform-level administrative tasks.

  • The Application Builder, with two perspectives:

    • Design for model design.

    • Management for model editions and deployment/data locations management tasks.

A new Dashboard Builder user interface is available for designing dashboard applications.

Suggested actions
Teams with documented processes that use the previous platform user interface organization should review their documentation accordingly.

Platform privilege changes

Platform-level privileges have been reorganized to match the user interface reorganization in the Semarchy xDM platform as well as the security best practices.

The roles defined in previous releases are automatically upgraded to these new privileges, using the following rules:

  • Roles with the old Data Location privilege are granted the new Application Management privilege.

  • Roles with the old Model Design privilege are granted the new Application Design privilege.

  • Roles with one of the old Plug-ins Administration or Logging Configuration privileges are granted the new Platform Admin privilege.

  • Roles with one of the old Job and Job Log Administration or Execution Engine privileges are granted the new Application Management privilege.
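The upgrade rules above amount to a lookup from old privileges to new ones. The following sketch expresses them as a table, which can help when reviewing which roles end up with a given new privilege. The function name is hypothetical, and old privileges not listed on this page are ignored by the sketch.

```python
PRIVILEGE_UPGRADE = {
    "Data Location": "Application Management",
    "Model Design": "Application Design",
    "Plug-ins Administration": "Platform Admin",
    "Logging Configuration": "Platform Admin",
    "Job and Job Log Administration": "Application Management",
    "Execution Engine": "Application Management",
}


def upgrade_privileges(old_privileges):
    """Return the set of new privileges granted for a role's old privileges."""
    return {
        PRIVILEGE_UPGRADE[privilege]
        for privilege in old_privileges
        if privilege in PRIVILEGE_UPGRADE
    }
```

A role that only had Logging Configuration, for example, now holds Platform Admin, which is worth checking against the security requirements.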

Recommended actions
Administrators should review the new privilege grants and make sure they meet their security requirements and best practices.

Model privilege changes

In previous releases of Semarchy xDM, roles with a model privilege grant that had the Grant full access to the model option selected also had access to the REST API, as if the Grant access to integration web services option was selected too.

Starting with this release, Grant full access to the model does not automatically grant access to the REST API, and the Grant access to integration web services must be explicitly selected in the model privilege grant.
The upgrade process automatically takes care of checking this option for all model privilege grants with Grant full access to the model.

Recommended actions
Model designers should review the model privilege grants and update as needed the access to the integration API.

Lookup enricher multiple outputs support

The Semarchy Lookup enricher supports multiple outputs, which changes its configuration. The configuration of plug-in enrichers using this component is automatically upgraded.

Required actions
Designers using this enricher in their models must redeploy their model to update their job for the upgraded configuration.

REST API record update change

In previous releases, publishing a record update with the REST API for a fuzzy- and ID-matched entity incorrectly set to null the attributes not present in the payload.

This behavior has been aligned (in bug fix MDM-7819) with the basic entity behavior. Only the attributes present in the payload are taken into account to update the existing record. Attributes absent from the payload keep the value of the original record.
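The difference between the two behaviors can be illustrated with plain dictionaries standing in for the existing record and the update payload. This is a sketch of the semantics only, not the REST API implementation; the function names and attribute names are hypothetical.

```python
def apply_update(existing, payload):
    """Current behavior: only attributes present in the payload are updated."""
    return {**existing, **payload}


def apply_update_legacy(existing, payload):
    """Previous behavior for fuzzy- and ID-matched entities:
    attributes absent from the payload were set to null."""
    return {attr: payload.get(attr) for attr in existing.keys() | payload.keys()}
```

With an existing record {"FirstName": "Ann", "Phone": "555-0100"} and a payload {"Phone": "555-0199"}, the current behavior keeps FirstName, whereas the previous behavior reset it to null. Integration flows that relied on omitting attributes to clear them are the ones to review.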

Required actions
Integration designers relying on this behavior in their integration flows should review and update these flows to take into account the behavior change.

Streaming data import

To enable large file imports, a streaming data import feature has been implemented for CSV and XLSX file formats. The XLS format for import is deprecated since it does not support streaming. XLS files are processed as CSV files.

Suggested actions
Designers should inform the end users of these changes.

REST API fix and stateless mode

This release introduces a memory leak fix (MDM-8086) to the REST API and makes calls stateless to reduce resource consumption and increase security.

For Tomcat instances, this fix requires removing the references to semarchy.commons.web.configuration_*.jar from the tomcat.util.scan.StandardJarScanFilter.jarsToSkip property found in the Catalina/catalina.properties file.

Required actions
If upgrading an on-premises Tomcat instance in the cloud, examine the catalina.properties file. If necessary, remove the semarchy.commons.web.configuration_*.jar entry from the list in the tomcat.util.scan.StandardJarScanFilter.jarsToSkip property.
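The edit to the property value can be sketched as follows. The helper works on the value of the jarsToSkip property only, assuming it is a comma-separated list that may use backslash line continuations. The function name is hypothetical, and the other JAR patterns in the usage example are placeholders.

```python
ENTRY = "semarchy.commons.web.configuration_*.jar"


def remove_from_jars_to_skip(value, entry=ENTRY):
    """Remove one pattern from a comma-separated jarsToSkip value."""
    # Join backslash line continuations so the value can be split on commas.
    flat = value.replace("\\\n", "")
    patterns = [pattern.strip() for pattern in flat.split(",")]
    # Drop empty items and the unwanted entry, keeping the original order.
    kept = [pattern for pattern in patterns if pattern and pattern != entry]
    return ",".join(kept)
```

The result is returned on a single line. It is safer to keep a backup of catalina.properties and to compare the new value with the original before restarting Tomcat.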

Upgrade from versions before 4.4.0

This section applies to all installations upgrading from a version before version 4.4.0.

Search on business view child collections

In this version, business views support searching and filtering every collection in the business view. In previous releases, filtering was only supported on the root collection of the business view. This new feature implies the following changes:

  • Search configuration no longer appears in the business view editor directly.

  • Each business entity of the business view (available through the Transitions tree-table) has a field named Search Configurations in its Display Properties to define the search methods available for the business entity.

  • The business view search configurations configured in previous releases are automatically moved by the upgrade process to the root business entity of each business view.

Suggested actions
Designers willing to enable search in child collections should modify their business views accordingly.

New date datatype

In this version, a new Date datatype replaces DateTime to store dates with no time or timestamp. The Timestamp datatype remains to store dates and times with timestamp information. The upgrade process automatically converts DateTime attributes to the new Date datatype in the model and data locations. As a consequence, data columns using the old DateTime type are automatically trimmed to store a Date.
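The effect of this trimming on values that carry a time portion can be illustrated as follows. The sketch only shows the loss of information; it is not the conversion performed by the upgrade, and the function name is hypothetical.

```python
from datetime import date, datetime


def trim_to_date(value):
    """Drop the time portion, as when a date-and-time value is stored as a Date."""
    return value.date()


morning = datetime(2019, 3, 14, 9, 30)
evening = datetime(2019, 3, 14, 18, 45)
# Both values collapse to the same date once trimmed.
same_day = trim_to_date(morning) == trim_to_date(evening) == date(2019, 3, 14)
```

Two values recorded at different times on the same day become indistinguishable after the conversion, which is why attributes that need the time information should use Timestamp.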

Recommended actions
Designers using attributes with a DateTime datatype to store date and time information should modify their models before the data location upgrade to change the datatype of these attributes from Date to Timestamp.

Integration specialists should review their data flows reading from or writing to the MDM hub and make sure they correctly take into account this data type change.

The DateTime type is deprecated for plug-in enrichers and validators, and the new Date type is added.

Recommended actions
Plug-in developers should review their plug-in code and replace DateTime parameters, inputs, and outputs with Date or Timestamp.

Upgrade from versions before 4.3.0

This section applies to all installations upgrading from a version before version 4.3.0.

Data location purge schedules

In this version, the data purge is modified with the following changes:

  • The purge now takes into account all the data, including source data, source authoring data, deleted data traces, as well as data related to canceled external loads, direct authoring, duplicate managers, or workflows.
    For all these, the data purge is performed according to the retention policy defined for the entities in the model.

  • A purge job is always generated when deploying a model, to handle at least the data purge for canceled external loads, direct authoring, duplicate managers, or workflows.

  • Finally, the configuration of the purge schedule is moved to each data location.

Besides:

  • The upgrade process automatically creates under each data location a purge schedule equivalent to the old scheduled purges and then removes the old purge schedules.

  • For data locations for which no purge was previously configured, a purge is scheduled every Sunday at midnight to handle the purge of the canceled loads.

Recommended actions
Administrators should review the purge schedules and update them according to their production systems’ requirements and organization.

Master value picking

A new option is available for matching entities on duplicate managers and form steps. Enable Master Value Picking enables users to override the value of consolidated golden records to select values from the master records in addition to being able to enter custom values.

When upgrading to this version, this option is automatically activated for duplicate managers, but not for form steps.

Recommended actions
Designers should check whether master value picking is relevant for their users, and modify the settings of the duplicate managers and steppers’ form steps accordingly.

Hierarchy tree views

This version introduces a new visualization for business views. Hierarchy views provide a user-friendly way for business users to navigate hierarchies (of products, organizations, cost centers, employees) with an intuitive user interface.

Hierarchies are not configured by default during the upgrade.

Recommended actions
Designers willing to configure hierarchy tree views should review and update their business view configurations accordingly. See Configure hierarchies for more information about this feature.

Entity historization

In this version, it is possible to enable automated historization to keep track of all changes on master and golden data for each entity. Applications can be configured to allow browsing the history of the records.

The upgrade does not automatically enable historization for the entities.

Suggested actions
Model designers willing to enable this feature should alter the entities to support historization, and redeploy their model. See Data historization for more information about this feature.

ID- and fuzzy-matched entity delete

This version introduces the capability to perform soft and hard deletes for records in all entity types.

This option is not automatically enabled by the upgrade.

Suggested actions
Model designers willing to enable deletion for entities should modify their entities accordingly. See Record deletion for more information about this feature.

Authoring master records and errors

Starting with version 4.3.0, ID- and fuzzy-matched entities support authoring (creating, editing, importing) source or erroneous records on behalf of publishers. This option is enabled when configuring authoring actions for ID- and fuzzy-matched entities.

Existing action sets are not modified by the upgrade process to support this feature.

Suggested actions
Model designers willing to use this feature should modify their models accordingly. For more information about this feature, see Action sets and Data authoring patterns.

Integration

In this release, the following changes take place in the integration interfaces and should be reviewed by the integration specialists:

  • SDE and SDL views available via the REST API are respectively renamed SD4L and SD4LK.

  • The DeleteType built-in attribute, corresponding to the B_DELETETYPE column, now uses the LEGLESS_DELETE value to track golden records that are deleted when they lose all their master records.

  • New GH and MH tables and views store the golden and master history. New GH4B and MH4B views allow browsing golden and master data at the time a given batch was completed.

  • The type of the repository and data location columns storing boolean values is changed from VARCHAR(1) to CHAR(1) in Oracle.

Forms and collections with UUID in display cards

After upgrading from a version before 4.3.0, any form or collection for an entity with a Primary Key attribute of type UUID, and using this UUID in the display card, fails to load with the following exception:

java.sql.SQLSyntaxErrorException: ORA-00932: inconsistent datatypes: expected CHAR got BINARY

Display cards created after version 4.3.0 do not have this issue since the UUID is automatically wrapped in a SEM_UUID_TO_CHAR function call.

Required actions
Teams facing this issue should modify their display cards to wrap the UUID in a SEM_UUID_TO_CHAR function call, and redeploy the applications.

Upgrade from versions before 4.2.0

This section applies to all installations upgrading from a version before version 4.2.0.

Data entry workflows

After a model upgrade, the data entry workflows that existed in a version before version 4.0.0 are not automatically upgraded. They are preserved in a hidden form in the model.

The data authoring experience in version 4 supports two methods:

  • Direct Authoring, in which users edit and save records without having to create a workflow. This method uses a guided authoring UI based on a design-time object called a Stepper.

  • Workflow Authoring, in which users start a Workflow with tasks and transitions. Each task uses a stepper for modifying the data.

Workflows and steppers are triggered using actions defined in action sets.

For more information about steppers, workflows, and action sets, see Applications.

Required action
A wizard helps model designers upgrade their original workflows into steppers and workflows. Model designers must use this wizard to recover and convert their workflows.

Upgrading data entry workflows

To upgrade data entry workflows:

  1. In the Semarchy Application Builder, open the model edition containing the workflows to upgrade.

  2. Right-click the model node in the Model Design view and then select Upgrade v3.x Data Entry Workflows.

  3. Review the notice in the About Workflow Upgrade step and then click Next.

  4. The Workflow Upgrade step shows the existing data entry workflows, with the list of business views that they use for data entry.

    • Select Upgrade for each workflow that you want to upgrade.

    • Select the Upgrade Target for each of these workflows. This target defines how the workflow will be upgraded:

      • Stepper: this option is available only for single-task workflows using a single business view and having no validation or enrichment configured on the cancel transition. The wizard converts the business view into a stepper, reproducing its hierarchical structure and data entry forms. It also adds to this stepper the enrichers and validations defined on the submit transition.

      • Workflow: this option is available for all workflows. The wizard converts each business view used in a task into a stepper, as in the previous option. It also converts the workflow to the new format, using the newly created steppers for each task. Enrichment and validations are added to the stepper and workflow transitions to reproduce the original behavior.

    • Select Actions to automatically create, in the entity’s default action set, the authoring actions (create, edit, etc.) corresponding to the options enabled in the original workflow. These actions use either the stepper or workflow created by the wizard.

    • Select Delete to delete the original workflow. Note that this action cannot be undone.

  5. Click Finish. The upgrade wizard converts the original workflows according to your choices.

You can only upgrade workflows in open model editions.
You can run the workflow upgrade process again on any workflow that has not been deleted by the wizard.
The Delete option deletes the selected data authoring workflows. This operation cannot be undone.
When all version 3.x data entry workflows are converted and deleted, the File > Upgrade v3.x Data Entry Workflows menu option automatically disappears.

Recommended actions
Review and test the new workflows and steppers after the upgrade to make sure the new behavior corresponds to the original functional requirements.

Duplicate management workflows

After a model upgrade, the duplicate management workflows that existed in a version before version 4.0.0 are not automatically upgraded. They are preserved in a hidden form in the model.

The duplicate management experience now uses a dedicated design-time artifact, the Duplicate Manager, and duplicate management actions defined in action sets to review, confirm, merge, or split duplicates and suggestions.

For more information on duplicate managers and action sets, see Applications.

Required action
A wizard helps you upgrade original workflows into duplicate managers after completing the model upgrade. Model designers must use this wizard to recover and convert their workflows.

Upgrading duplicate management workflows

To upgrade duplicate management workflows:

  1. In the Semarchy Application Builder, open the model edition containing the workflows to upgrade.

  2. Right-click the model node in the Model Design view and then select Upgrade v3.x Duplicate Management Workflows.

  3. Review the notice in the About Workflow Upgrade step and then click Next.

  4. The Workflow Upgrade step shows the existing duplicate management workflows.

    • Select Upgrade for each workflow that you want to upgrade. Each workflow upgrades to a duplicate manager, using the default display card, form, and collection from the entity.

    • Select Delete to delete the original workflow. Note that this action cannot be undone.

    • Select Actions to automatically create, in the entity’s default action set, the duplicate management actions using the duplicate manager created by the wizard.

  5. Click Finish. The upgrade wizard converts the original workflows according to your choices.

You can only upgrade workflows in open model editions.
You can run the workflow upgrade process again on any workflow that has not been deleted by the wizard.
The Delete option deletes the selected duplicate management workflows. This operation cannot be undone.
When all v3.x duplicate management workflows are converted and deleted, the File > Upgrade v3.x Duplicate Management Workflows menu option automatically disappears.

Recommended actions
Review and test the new duplicate management actions after the upgrade to make sure the new behavior corresponds to the original functional requirements.

Survivorship and override for matching entities

ID- and fuzzy-matched entities (also known as UDPK and SDPK entities) now support overriding data values on golden records. When data authoring operations are performed in these entities, the data authored by the users (i.e., authoring data) is separated from the data loaded from external publishers (i.e., source data).

The consolidators have evolved into Survivorship rules, and support two phases:

  • Consolidation (using a consolidation rule) merges the data loaded from the publishers in the SD_ tables.

  • Override (using an override rule) applies possible data overrides authored by users on top of the consolidated data.

Survivorship rules provide a simple way to implement the pattern in which users fix consolidated values or directly create golden records in matching entities, without having to rely on matching and consolidation rules. Matching and consolidation rules are now used only for the data loaded from external publishers (i.e., source data).

For more information about survivorship and override, see Survivorship.

Consolidator upgrade

The upgrade process automatically performs the following changes in your models:

  • Consolidators upgrade to Survivorship Rules:

    • For Record-Level Consolidators, a single default survivorship rule is created with a consolidation rule corresponding to the original consolidator.

    • For Field-Level Consolidators, one survivorship rule is created per field, with a consolidation rule corresponding to the field’s original consolidator.

    • For all the upgraded survivorship rules, the override strategy is set by default to No Override.

  • Consolidation strategies upgrade:

    • The Any Value consolidation strategy is converted to a Custom Ranking strategy with no ranking expression.

    • Other strategies remain unchanged.

  • Additional Changes

    • An additional Master ID survivorship rule is automatically created to consolidate the master record IDs into the golden record, with the Custom Ranking consolidation strategy and no ranking expression. This corresponds to Any Value.

Recommended actions
Model designers should review the new survivorship rules and modify them according to their functional requirements, possibly adding override rules.

They should also consider:

  • Aggregating multiple single-attribute rules into multi-attribute rules, for attributes sharing the same consolidation strategy and sharing the same override strategy.
    Keep in mind that attributes within the same survivorship rule are always overridden simultaneously. As a consequence, an override on one of their attributes causes an override on all of them.

  • Creating a default survivorship rule per entity that applies to all attributes not taken into account by other rules.
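
The two survivorship phases, and the fact that attributes in the same rule are overridden together, can be pictured with a short sketch. This is a simplified mental model rather than the platform's implementation: the `SurvivorshipRule` class, the `apply_survivorship` function, and the sample attribute names are hypothetical. The sketch assumes that consolidation has already produced one value per attribute and that an override carries a value for every attribute of its rule.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SurvivorshipRule:
    """Hypothetical stand-in for a survivorship rule covering several attributes."""
    name: str
    attributes: tuple


def apply_survivorship(consolidated, overrides, rules):
    """Return golden values: consolidated data with user overrides applied on top.

    consolidated: attribute -> value produced by the consolidation phase.
    overrides: rule name -> {attribute -> value} authored by a user
               (assumed to hold a value for every attribute of the rule).
    """
    golden = dict(consolidated)
    for rule in rules:
        authored = overrides.get(rule.name)
        if authored is None:
            continue  # no override on this rule: keep the consolidated values
        for attribute in rule.attributes:
            # Every attribute of the rule takes the authored value at once.
            golden[attribute] = authored[attribute]
    return golden


rules = [
    SurvivorshipRule("Address", ("Street", "City")),
    SurvivorshipRule("Contact", ("Phone",)),
]
# The publishers now send a new street, but a user previously overrode the address.
consolidated = {"Street": "2 Main St", "City": "Boston", "Phone": "555-0100"}
overrides = {"Address": {"Street": "1 Main St", "City": "Cambridge"}}

golden = apply_survivorship(consolidated, overrides, rules)
print(golden)
```

In this model, the new street published by the source does not reach the golden record because `Street` shares a rule with the overridden `City`. This is the trade-off to weigh when aggregating attributes into multi-attribute rules.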

Changes for the certification process

Publishers dedicated to data entry are no longer supported. The authoring data is already separated from the publisher data. These publishers should no longer be used in any of the rules involved in the certification process.

Required action
Model designers should review and redesign matchers and consolidators handling a combination of consolidation and authoring. They should split the logic into the two elements of the survivorship rules:

  • Consolidation Rules must only handle data flows consolidated from the publishers.

  • Override Rules must only handle user changes performed after the consolidation phase.

Data authored by users is not enriched or validated for ID- and fuzzy-matching entities as part of the certification process. These records should be enriched and validated as part of the steppers in which users author the data.

Required action
Model designers should review the enrichers and validations applied to data from data entry publishers, and modify the steppers used for authoring this data to enforce enrichment and validation at this level, for example when the stepper completes.

Changes for data integration

  • SD_ tables now only contain source data from application publishers and should only be used in that sense. New SA_ tables contain records created for data authoring and overrides performed on golden data.

Required action
Integration specialists should review data integration flows intended to publish data on behalf of users (using a data entry publisher), and consider publishing data in the SA_ tables instead of the SD_ tables.

Documentation property

In this version, a Documentation property is added to entity attributes, complex types’ definition attributes, references, matching, and survivorship rules. The documentation property supports the Markdown syntax for rich text. It provides detailed information to end users using the generated application.

The Description property is now only used for describing/documenting model artifacts for model designers. Descriptions are no longer visible to end users.

The documentation of an attribute/reference is used by default in all form fields based on this attribute/reference. It is possible to override this inherited documentation in the form field.

The upgrade process automatically copies the content of the Description properties (which was visible to end users before) to the Documentation property.

Recommended actions
Model designers should review the documentation property on entity attributes, complex types’ definition attributes, references, matching, and survivorship rules, possibly enhancing it with markdown content. They should also review field documentation and possibly override the value inherited from the entity attributes.

Other changes

Changes for data integration

Designers and integration specialists should be aware of the following table-structure-breaking changes:

  • Basic entities’ SD_ and SE_ tables are renamed SA_ and AE_. Their structures remain unchanged.

  • GI_ tables no longer have a B_ERROR_STATUS column.

Required action
Model designers and integration specialists must review all components, scripts, or data integration flows using these tables and columns and update them accordingly.
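
As a starting point for this review, the rename of basic entities' tables can be sketched as a text rewrite over existing SQL. This is a minimal illustration, not a Semarchy tool: the `rename_basic_entity_tables` function, the sample table names, and the sample column names are hypothetical, and the caller is assumed to know which entities are basic entities.

```python
import re

# Prefix renames stated above for basic entities; other tables are untouched.
BASIC_ENTITY_RENAMES = {"SD_": "SA_", "SE_": "AE_"}


def rename_basic_entity_tables(sql, basic_entity_tables):
    """Rewrite SD_/SE_ table names in `sql` for the given basic entities.

    basic_entity_tables: upper-case physical table name suffixes of the
    basic entities (for example {"COUNTRY"}), supplied by the caller.
    """
    def replace(match):
        prefix, suffix = match.group(1).upper(), match.group(2)
        if suffix.upper() in basic_entity_tables:
            return BASIC_ENTITY_RENAMES[prefix] + suffix
        return match.group(0)  # not a basic entity: leave the name as-is

    return re.sub(r"\b(SD_|SE_)(\w+)", replace, sql, flags=re.IGNORECASE)


sql = "SELECT * FROM SD_COUNTRY c JOIN SD_CUSTOMER u ON u.F_COUNTRY = c.COUNTRY_CODE"
rewritten = rename_basic_entity_tables(sql, {"COUNTRY"})
print(rewritten)
```

A rewrite like this only finds candidates. Each hit still needs a manual check, and references to the removed B_ERROR_STATUS column of the GI_ tables have to be handled separately.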

Upgrade from versions before 4.0.0

This section applies to all installations upgrading from a version before version 4.0.0.

Data edition decommission

The data edition feature is removed in this release. A data location now hosts one deployed model edition at a time and provides access to the current state of the data in the hub, based on the deployed model edition.

Changes for deployment

  • The Data Editions view is renamed Data Locations and reorganized for a single deployed model and single data edition.

  • The deployment process for models and the management operations for data locations are simplified:

    • All operations related to data branches and data editions disappear.

    • Only one model edition is deployed at a time in a data location, and only one data edition relies on this model.

    • All model edition operations on a data location (that is, installing a new model edition, updating an open model edition, or switching to an old model edition) are replaced with a single Deploy Model Edition operation.

    • The Set Status to Maintenance and Set Status to Ready operations on the data location replace the Set Data Edition Status to Maintenance and Set Data Edition Status Back to Open operations.

  • Successive model edition deployments now appear in the Data Location view in a Deployment History node, showing the integration, installation, and purge jobs deployed with the model edition.

Recommended actions
Production teams should review and update their deployment process instructions to match the simplified process and administration console changes.

Changes for data locations

Existing data in closed data editions is removed from the data locations. Only data from the latest data edition is preserved in each data location.

Required action
Model designers and integration specialists should review any component that accesses or uses data using data editions, taking into account the removal of all previous data editions.

Changes for the certification process

  • SemQL still supports the FromEdition, ToEdition, and BranchID built-in attributes in existing clauses, but they are deprecated and always return BranchID=0, FromEdition=0, and ToEdition=null.

  • The :V_DATABRANCH and :V_DATAEDITION built-in platform variables remain but are deprecated and both return 0.

  • The DataEdition Job Notification property is deprecated and returns the default value 0.

Recommended actions
Designers should review their SemQL clauses and remove deprecated built-in attributes.

Changes for data integration

The B_BRANCHID, B_FROMEDITION, and B_TOEDITION columns are dropped from all tables.

Required action
Model designers and integration specialists should review any component that accesses or uses the data location tables in SQL, and remove any use of these columns: integration jobs, SQL statements, PL/SQL functions called from SemQL, etc.
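
One way to start this review is to scan existing SQL scripts for the dropped column names. The sketch below is only an illustration: the `find_dropped_columns` helper and the sample script are hypothetical, and a plain text scan can also flag names that appear in comments or string literals.

```python
import re

# Columns dropped from all tables, as listed above.
DROPPED_COLUMNS = ("B_BRANCHID", "B_FROMEDITION", "B_TOEDITION")
_PATTERN = re.compile(r"\b(" + "|".join(DROPPED_COLUMNS) + r")\b", re.IGNORECASE)


def find_dropped_columns(script_text):
    """Return (line_number, column_name) pairs for each dropped column used."""
    hits = []
    for number, line in enumerate(script_text.splitlines(), start=1):
        for match in _PATTERN.finditer(line):
            hits.append((number, match.group(1).upper()))
    return hits


script = """SELECT CUSTOMER_ID, B_BRANCHID
FROM GD_CUSTOMER
WHERE B_FROMEDITION <= 3 AND B_TOEDITION IS NULL"""

hits = find_dropped_columns(script)
print(hits)
```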

The PL/SQL INTEGRATION_LOAD integration package remains backward compatible. New versions of the functions, which do not use the data branch and edition, are added to the package and replace the old ones.

Recommended action
Integration specialists should review any component that uses these functions, and remove deprecated parameters.

For more information about table structures and the integration package, see Data hub table structures.

Post-consolidation validations

Post-consolidation validations, running on consolidated data, no longer remove golden records. They only flag these records as erroneous, yet keep them as golden records.

Changes for applications

Golden data now includes records failing post-consolidation validations. A new ErrorStatus built-in attribute indicates whether the golden record has successfully passed validations or not. Possible values include:

  • VALID if the record has no error.

  • ERROR if the record has errors.

  • a <NULL> value also indicates a record with no error.

Required action
Model designers should review their business views or queries to filter (using the ErrorStatus = 'VALID' OR ErrorStatus IS NULL SemQL clause) golden data and exclude golden records failing post-consolidation validation if such validations are used.

Changes for data integration

Designers and integration specialists should be aware of the following table-structure-breaking changes:

  • B_STATUS column is removed from the GI_ and MI_ tables.

  • A new B_ERROR_STATUS column is added to the GD_ table to flag the golden records failing a post-consolidation validation. Possible values include:

    • VALID if the record has no error.

    • ERROR if the record has errors.

    • a <NULL> value also indicates a record with no error.

Required action
Integration specialists should now assume that the GD_ tables must be filtered to only read the valid golden records (where B_ERROR_STATUS = 'VALID' or B_ERROR_STATUS IS NULL).
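
The filter can be tried out on an in-memory SQLite database. This is a sketch under stated assumptions: the `GD_CUSTOMER` table, its `CUSTOMER_ID` column, and the sample rows are hypothetical, and SQLite stands in for the actual data location database.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical, minimal golden table with the new error status column.
conn.execute("CREATE TABLE GD_CUSTOMER (CUSTOMER_ID INTEGER, B_ERROR_STATUS TEXT)")
conn.executemany(
    "INSERT INTO GD_CUSTOMER VALUES (?, ?)",
    [(1, "VALID"), (2, "ERROR"), (3, None)],
)

# Keep golden records that are explicitly valid or have no status at all.
valid_ids = [
    row[0]
    for row in conn.execute(
        "SELECT CUSTOMER_ID FROM GD_CUSTOMER "
        "WHERE B_ERROR_STATUS = 'VALID' OR B_ERROR_STATUS IS NULL "
        "ORDER BY CUSTOMER_ID"
    )
]
print(valid_ids)  # [1, 3]
```

The explicit IS NULL test matters. In standard SQL, a predicate such as B_ERROR_STATUS <> 'ERROR' evaluates to unknown for null values and would silently drop those valid records.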

Error storage

The storage of errors has changed in this release:

  • The error tables (SE_ and GE_) no longer host the entire erroneous record, but only the list of errors with a reference to the erroneous records (whose data is stored in the SD_ and GD_ tables).

  • Both SD_ and GD_ tables now have a B_ERROR_STATUS flag indicating the source or golden record status. Possible values are:

    • VALID if the record has no errors.

    • ERROR if the record has errors.

    • RECYCLED for source records only, if the record was recycled and considered valid.

    • OBSOLETE_ERROR for source records only, if this record had errors but a newer version of the record fixed them.

    • a <NULL> value also indicates a record with no error.

  • The B_STATUS columns are renamed B_ERROR_STATUS.

  • The SE_ and GE_ tables no longer store the entire history of errors, but only the latest error instances.

Required action
Integration specialists should review their error processing flows or scripts and respectively join data from the SE/SD tables (on B_PUBID, B_SOURCEID and B_LOADID) and GE/GD tables (on the Golden ID) to retrieve erroneous data along with the errors. They should also take into account the fact that the error tables only store the latest error instances.
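
The join between the error table and the source data table can be sketched as follows, again using an in-memory SQLite database. Only the B_PUBID, B_SOURCEID, B_LOADID, and B_ERROR_STATUS columns come from the text above. The `CUSTOMER` entity, the `CUSTOMER_NAME` column, and the `ERROR_DETAIL` column are hypothetical placeholders for the real table structures.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE SD_CUSTOMER (
    B_PUBID TEXT, B_SOURCEID TEXT, B_LOADID INTEGER,
    B_ERROR_STATUS TEXT, CUSTOMER_NAME TEXT
);
CREATE TABLE SE_CUSTOMER (
    B_PUBID TEXT, B_SOURCEID TEXT, B_LOADID INTEGER,
    ERROR_DETAIL TEXT  -- hypothetical column describing the error
);
INSERT INTO SD_CUSTOMER VALUES ('CRM', 'C1', 10, 'VALID', 'Acme');
INSERT INTO SD_CUSTOMER VALUES ('CRM', 'C2', 10, 'ERROR', NULL);
INSERT INTO SE_CUSTOMER VALUES ('CRM', 'C2', 10, 'Name is mandatory');
""")

# The error table holds only the error; the record data comes from SD_.
rows = conn.execute("""
    SELECT sd.B_SOURCEID, sd.CUSTOMER_NAME, se.ERROR_DETAIL
    FROM SE_CUSTOMER se
    JOIN SD_CUSTOMER sd
      ON sd.B_PUBID = se.B_PUBID
     AND sd.B_SOURCEID = se.B_SOURCEID
     AND sd.B_LOADID = se.B_LOADID
""").fetchall()
print(rows)
```

The same pattern applies to the GE/GD pair, with the join made on the Golden ID instead of the three source columns.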

Application upgrade

The structure and the content of the applications and application components have considerably changed between versions 3.x and 4.x. This section details how the repository upgrade converts original artifacts into new ones.

Display cards

Display cards replace the display name for entities.

The upgrade process automatically creates a default display card from the display name, reproducing in the primary text the concatenated string of the display name.

For more information, see Display cards.

Recommended actions
Model designers should enhance this display card, possibly configuring secondary text, an avatar, or an image.

Collections

Collections replace table views for entities.

The upgrade process automatically upgrades all table views to collections supporting only a table view. The table columns are upgraded to the new version 4.0 components with a configuration reproducing the original appearance of the table.

For more information, see Collections.

Recommended actions
Model designers should enhance the upgraded collections, possibly configuring a display card, and alternate list and grid views. They should also review the resulting tables and change the columns’ configuration to improve the appearance of their collections.

Form views

Forms replace form views for entities. In addition, forms now support a new layout system (using the flex model), multiple tables, embedded collections, and richer components.

The upgrade process automatically upgrades all form views to forms with a single tab. The form fields are upgraded to the new version 4 components with a configuration reproducing the original appearance.

The upgrade process preserves the organization of the forms into sections, but does not preserve the layout of fields and sections in the form.
Refer to Forms for more information.

Recommended actions
Model designers should review and enhance the upgraded forms, fixing the layout and splitting long forms into multiple tabs. They should also review the field components’ configuration to improve the appearance of their forms.

Business views

Business Objects and Business Object Views are now merged into a single Business View design-time object, moved under the entity. This new artifact supports the features of both business objects and business object views.

The upgrade process automatically converts business objects and business object views to business views, re-creating the references to the upgraded forms and collections.

Refer to Business views for more information.

Recommended actions
Model designers should review and enhance the upgraded business views. They can, for example, configure transitions for the embedded collections created when improving the upgraded forms.

Applications

Applications are now organized with folders and actions (such as browse, inbox, etc.) and a navigation drawer containing shortcuts to certain actions.

The upgrade process automatically upgrades the original application organization and re-creates the folder and actions structure to provide similar features.

Refer to Application actions and folders for more information.

Recommended actions
Model designers should review and enhance the experience in the upgraded applications by adding icons, images, and colors to the actions created, and by adding advanced actions such as error browsing or record creation.

Upgrade from versions before 3.3.0

This section applies to all installations upgrading from a version before version 3.3.0.

Incoming records not merging singleton golden records

Users who have upgraded to version 3.2 before version 3.2.4 may experience an issue with singleton golden records (for fuzzy-matching entities) already in the hub not being merged with incoming records that match them. This issue is caused by a defect in the upgrade mechanism.

Required action
Users upgrading from version 3.2.0 through 3.2.4 should refer to the knowledge base article that explains how to fix an existing upgrade.

New reserved keywords in SemQL

New capabilities added to the SemQL language introduce the following reserved keywords:

  • HAS: this token is used as an equivalent to the HAVE keyword as in any <child_entity_role> has ( <condition_on_child_entity> ).

  • NULLS, FIRST, and LAST: these keywords allow for NULLS FIRST and NULLS LAST clauses in custom-ranking consolidators.

Recommended action
After the upgrade, model designers should review and validate their models and make sure that no physical object is named using the new reserved keywords.
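
A quick check of a list of physical names against the new keywords could look like the sketch below. It assumes the reserved keywords are HAS, NULLS, FIRST, and LAST, as read from the list above, and that the comparison is case-insensitive. The `reserved_name_conflicts` helper and the sample names are hypothetical.

```python
# Keywords introduced in this version, as read from the list above.
NEW_RESERVED_KEYWORDS = {"HAS", "NULLS", "FIRST", "LAST"}


def reserved_name_conflicts(physical_names):
    """Return the names that are exactly a new reserved keyword (any case)."""
    return sorted(n for n in physical_names if n.upper() in NEW_RESERVED_KEYWORDS)


conflicts = reserved_name_conflicts(["CUSTOMER", "first", "LAST_NAME", "Has"])
print(conflicts)
```

Only exact matches are reported, so a name such as LAST_NAME, which merely contains a keyword, is not flagged.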

Integration job execution scheme

The integration job changes from a “by-entity” to a “by-phase” execution scheme. This means that instead of fully processing each entity in turn (e.g., Customers then Contact), it now fully processes each phase in turn (enrichment for all entities, then validation for all entities, etc.). This change does not impact the outcome of the integration job.

Required action
No action is required. However, database operations started outside of Semarchy that assume a specific step sequencing should be reviewed and updated as needed.
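
The difference between the two execution schemes comes down to the nesting order of two loops, as the following sketch shows. The function names are hypothetical, and only the two phases named above are used. The real job contains more phases.

```python
def by_entity_order(entities, phases):
    """Previous scheme: run every phase for one entity, then move to the next entity."""
    return [(entity, phase) for entity in entities for phase in phases]


def by_phase_order(entities, phases):
    """New scheme: run one phase for every entity, then move to the next phase."""
    return [(entity, phase) for phase in phases for entity in entities]


entities = ["Customer", "Contact"]
phases = ["enrichment", "validation"]

print(by_entity_order(entities, phases))
print(by_phase_order(entities, phases))
```

Both schemes run the same set of steps. Only their order differs, which is why external operations that rely on a given step having already run for a given entity deserve a review.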

Upgrade from versions before 3.2.0

This section applies to all installations upgrading from a version before version 3.2.0.

Matching process

In versions before 3.2, a matcher includes a single matching condition.
Starting with version 3.2, a matcher is composed of several matching rules and supports scoring for automated merge and confirmation. To reflect this new matcher and provide performance enhancements, the data certification job was redesigned and the data location structures have been modified.

Matching rule upgrade

When a model is upgraded, existing matchers are converted to the new matchers to deliver the same matching behavior:

  • The existing matcher for each entity is converted to a matcher with a single matching rule called Default, with a matching score of 100. This rule contains the old matcher’s binning expressions and matching conditions.

  • The Merge Policy is configured with all the thresholds set to 0, so that all duplicate groups detected automatically merge.

  • The Auto-Confirm Policy is configured as follows:

    • Auto-confirm golden records threshold is set to 100. Merged duplicate groups are never automatically confirmed.

    • Auto-confirm singletons is checked. Singletons are automatically confirmed.

Recommended action
After the upgrade, model designers should review their matchers to benefit from the new capabilities and enable automated merge and confirmation for the business users and data stewards.

Data location upgrade

When a data location is upgraded, the data structures are modified to support new matching rules, scoring, suggested merges, and auto-confirmation. For more information about these tables, see Data hub table structures.

Recommended action
Users consuming golden and master records from GD and MD tables and using duplicate management-related columns should review the new structures and propagate changes in their data integration flows.

Data Administrators should gather statistics on the data location schemas to make sure that subsequent processing uses statistics that reflect the updated structure and values.

Also, the existing records are upgraded with the following rules:

  • Golden records keep their existing IDs and confirmed/validated state.

  • Confirmed or unconfirmed golden records keep their status. This status is moved from the B_IS_VALIDATED column in GD tables to the new B_ISCONFIRMED column, and a new B_CONFIRMATIONSTATUS column provides a more detailed confirmation status in MD and GD tables.

  • Partially confirmed golden records (that is, golden records with only part of their master records confirmed) are upgraded to Not Confirmed (B_ISCONFIRMED=0 and B_CONFIRMATIONSTATUS=NOT_CONFIRMED).

  • The existing duplicate management workflows remain, but the duplicates listed in these workflows disappear.

  • Golden records split and confirmed by the user are upgraded in an exclusion group.

  • A confidence score is automatically computed.

Recommended action
Administrators and data stewards should review the upgrade rules listed above and should test the upgrade before applying it to their production environment. After the upgrade, data stewards should review the confirmation state of the duplicates and execute duplicate management workflows to confirm duplicates as needed.

Customized search forms

In versions before 3.2, all the default built-in search methods (full text, advanced, by form, and SemQL) were available in the business object views. Starting with version 3.2, it is possible to create customized forms and define which of the search methods or customized forms are available in applications.

Recommended action
After the upgrade, model designers should review and customize the search methods available in their applications.

Database functions

Starting with version 3.2, it is possible to declare user-defined PL/SQL functions. By declaring these functions, designers make them available in the SemQL editor and avoid SemQL warnings raised by these functions.

Recommended action
After the upgrade, model designers should declare all PL/SQL functions used in their SemQL clauses. By doing so, model validation will no longer raise a warning for these functions.

Reserved physical column names

Before version 3.2, the attributes’ physical column names could take any value. Starting with version 3.2.0, it is no longer possible to define physical column names starting with B_ as they may collide with reserved column names.

Models with physical column names violating this rule no longer pass model validation and each column will raise a validation error such as Invalid physical name: <column_name> matches the reserved pattern B_*.

Required action
Before the upgrade, administrators should refer to the knowledge base article that explains how to detect, rename and upgrade the offending attributes.
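
The rule itself is simple to express, as the sketch below shows. It is an illustration only and does not replace the procedure from the knowledge base article. The `reserved_prefix_violations` helper and the sample column names are hypothetical, and the case-insensitive comparison is an assumption.

```python
def reserved_prefix_violations(physical_column_names, prefix="B_"):
    """Return the column names that start with the reserved prefix (any case)."""
    return [
        name for name in physical_column_names
        if name.upper().startswith(prefix)
    ]


columns = ["CUSTOMER_NAME", "B_SCORE", "BIRTH_DATE", "b_flag"]
violations = reserved_prefix_violations(columns)
print(violations)
```

Names such as BIRTH_DATE are not affected, since only the B_ prefix with its underscore is reserved.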

Upgrade from versions before 3.1.0

This section applies to all installations upgrading from a version before version 3.1.0.

Model privilege grants for SemAdmin

In versions before 3.1, the SemAdmin role had hardcoded full access to the data on all models.
Starting with version 3.1, the privilege grant for this role is no longer hardcoded and can be modified.

When creating a new model, or upgrading an existing one, a model-level privilege grant is automatically created for the SemAdmin role with Grant full access to the model selected to keep the same level of privileges.

Recommended action
After the upgrade, model designers may want to modify this privilege grant to reduce the privileges of the SemAdmin role on the data and prevent platform administrators from accessing data in the hub.

Model privilege grants for the integration services

In versions before 3.1, accessing the integration services required a role having platform-level read/write privileges to the data location. Starting with version 3.1, a specific option called Grant access to integration services appears in the model privilege grants to define whether a role can access this model’s data via the services.

This privilege applies to the SOAP and REST service APIs.

When upgrading, roles that could access the integration services thanks to their privileges on the data locations will automatically have this option checked for the models deployed in the data location.

Recommended action
After the upgrade, model designers may want to review and modify this privilege grant to prevent certain roles from accessing the integration services.

Upgrade from versions before 3.0.0

Upgrading from a Semarchy xDM version before version 3.0 is not supported.