Semarchy xDM Installation Guide

Welcome to Semarchy xDM.
This guide contains information about installing Semarchy xDM to design and develop an MDM project.

Preface

Overview

Using this guide, you will learn how to:

Plan the configuration of Semarchy xDM for development and production environments.
Start and connect to Semarchy xDM.

Audience

This document is intended for administrators and project managers interested in installing Semarchy xDM for their data management initiatives.

To discover Semarchy xDM, you can watch our tutorials.

The Semarchy xDM Documentation Library, including the development, administration, and installation guides, is available online.

Document Conventions

This document uses the following formatting conventions:

Convention	Meaning
boldface	Boldface type indicates graphical user interface elements associated with an action, or a product-specific term or concept.
italic	Italic type indicates special emphasis or a placeholder variable that you need to provide.
`monospace`	Monospace type indicates code examples, text, or commands that you enter.

Other Semarchy Resources

In addition to the product manuals, Semarchy provides other resources available on its web site: http://www.semarchy.com.

Obtaining Help

There are several ways to reach Semarchy Technical Support. You can call or email our global Technical Support Center (support@semarchy.com). For more information, see http://www.semarchy.com.

Feedback

We welcome your comments and suggestions on the quality and usefulness of this documentation.
If you find any error or have any suggestion for improvement, please mail support@semarchy.com and indicate the title of the documentation along with the chapter, section, and page number, if available. Please let us know if you want a reply.

Introduction to Semarchy xDM

Semarchy xDM is the Intelligent Data Hub platform for Master Data Management (MDM), Reference Data Management (RDM), Application Data Management (ADM), Data Quality, and Data Governance.
It provides all the features for data quality, data validation, data matching, de-duplication, data authoring, workflows, and more.

Semarchy xDM brings extreme agility for defining and implementing data management applications and releasing them to production. The platform can be used as the target deployment point for all the data in the enterprise or in conjunction with existing data hubs to contribute to data transparency and quality.
Its powerful and intuitive environment covers all use cases for setting up a successful data governance strategy.

Architecture Overview

The Semarchy xDM architecture includes:

The Semarchy xDM Server, a Java EE application deployed and running in an application server. This application serves:
- The Application Builder, Dashboard Builder, xDM Discovery, Setup and Configuration user interfaces: These web applications are used by designers and administrators to create, manage and administer the models and applications designed in Semarchy xDM.
- Data Management Applications and Dashboard Applications: These web applications are used by business users to browse and manage data and visualize metrics dashboards.
- A REST API to programmatically perform data integration, management and administrative operations.
The Repository that stores all the metadata used by Semarchy xDM. This includes the data models as well as the definition of the data management and dashboard applications, the Discovery datasources and profiles. The repository is hosted in a database schema. A given Semarchy xDM server is always attached to a single repository.
The Data Locations that contain data for the data models. This data includes the golden, master and source data, with all the lineage and history. A data location is hosted in a database schema. Multiple data locations can be attached to a single repository.

Preparing to Install

Review the information in this section before you begin your installation.

System Requirements and Certification

Before installing Semarchy xDM, you should read the system requirements and certification documents to ensure that your environment meets the minimum installation requirements.

Hardware Requirements

The Semarchy xDM server runs as an application in a supported application server. The hardware requirements are those of the application server.
Refer to your application server documentation for more information about its hardware requirements.

Semarchy xDM runs on physical machines or virtual machines (VM).

Software Requirements

This section contains a list of software requirements for this release of Semarchy xDM.

It is recommended to ensure that all software involved in the installation has the current patches applied.

Server Requirements

Java

Supported Java Runtime Environment (JRE) or Development Kit (JDK) versions are:

Java 8 - version 1.8.x.
Java 11

Java 9 and 10 are not supported.

The JAVA_HOME (for a JDK) or JRE_HOME (for a JRE) environment variable must be configured to point to this installation of Java.
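
As a minimal sketch, the version rule above can be checked from a shell before deploying. The function names java_major_version and is_supported_java are hypothetical helpers introduced for illustration, not part of Semarchy xDM, and the parsing assumes the usual version string formats (1.8.0_x for Java 8, 11.0.x for Java 11).

```shell
#!/usr/bin/env bash
# Hypothetical helpers (not shipped with Semarchy xDM).

# Print the major Java version for a version string.
# Legacy strings such as "1.8.0_292" map to 8; "11.0.2" maps to 11.
java_major_version() {
  local version="$1"
  local first="${version%%.*}"
  if [ "$first" = "1" ]; then
    # Legacy scheme: the major version is the second component.
    local rest="${version#*.}"
    echo "${rest%%.*}"
  else
    echo "$first"
  fi
}

# Succeed only for the versions this guide lists as supported (8 and 11).
is_supported_java() {
  case "$(java_major_version "$1")" in
    8|11) return 0 ;;
    *)    return 1 ;;
  esac
}
```

With these definitions, is_supported_java "1.8.0_292" and is_supported_java "11.0.2" succeed, while is_supported_java "9.0.4" fails. The version string itself can typically be read from the output of "$JAVA_HOME/bin/java" -version.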

Application Server

Semarchy xDM Server is a JEE6 web application certified with the following application servers:

Apache Tomcat 8.5.x and 9.0.x
Eclipse Jetty 9.x
IBM WebSphere Liberty Profile 18.x
WildFly 8.x, 9.x, 10.x, 11.x, 12.x, 13.x
GlassFish 4.x, 5.x
Oracle WebLogic 12c Release 2 (12.2.1.3.0)

Database

Supported database versions for the repository and the data locations are:

Oracle Express, Standard or Enterprise edition.
- Oracle Database 11g Release 1: 11.1.0.6–11.1.0.7
- Oracle Database 11g Release 2: 11.2.0.1–11.2.0.4
- Oracle Database 12c Release 1: 12.1.0.1–12.1.0.2
- Oracle Database 12c Release 2: 12.2.0.1–12.2.1.0
PostgreSQL
- PostgreSQL version 9: 9.5.10 and above, 9.6.6 and above
- PostgreSQL version 10: 10.x
- PostgreSQL version 11: 11.x
Microsoft SQL Server
- Microsoft SQL Server 2017: v14.0

Cloud infrastructures such as Amazon Web Services or Microsoft Azure offer these databases for cloud installations.

Client Requirements

Supported browsers for Semarchy xDM:

Google Chrome 68 and later (incl. Chrome for Android),
Firefox 61 and later, Firefox ESR 60 and later,
Microsoft Edge 41 and later,
Safari 11 and later (macOS and iOS).

Browser changes and automatic upgrades may introduce issues when working with Semarchy xDM. Please check the release notes for an updated list of known issues.

License

A newly installed Semarchy xDM instance runs for a limited time (the grace period) without a license. Within that grace period, you must request an evaluation license from Semarchy to activate that instance.

When the grace period is over, the instance will stop working: the applications, as well as the data stored in Semarchy xDM, will no longer be available, and incoming data will no longer be processed. The data stored in the hub is preserved as-is.

Review the Managing the License section in the Semarchy xDM Administration Guide for more information and detailed instructions to activate an instance and manage the license.

Planning the Installation

Architecture Details

This section details the various components of the Semarchy xDM architecture and their interactions.

Semarchy xDM Server

The Semarchy xDM Server is a Java EE application deployed and running in a supported application server.

This server provides several access methods:

The Application Builder, Dashboard Builder, Discovery, and Configuration user interfaces: These web applications are used by designers and administrators to create, manage and administer the models and applications designed in Semarchy xDM, as well as profile and measure data.
Data Management Applications and Dashboard Applications: These web applications are used by business users to browse and manage data and visualize metrics dashboards.
A REST API to programmatically perform data integration, management and administrative operations.

The Semarchy xDM server stores its information in a repository. One application is always attached to a single repository, and connects to this repository using a JDBC datasource named SEMARCHY_REPOSITORY configured in the application server.

The Semarchy xDM application is used at design-time to design models and applications, and deploy them. At run-time, it also manages the processes involved to schedule and execute the certification process in the hub.

The application uses role-based security for accessing Semarchy xDM features. The users and roles used to connect to the application must be defined in the security realm of the application server. Configuring the roles and users is part of the application configuration.

Repository

The repository stores the following information:

Platform-level configuration elements: roles, server configuration, plugins, etc.
For Data Management applications and models:
- The models and applications metadata and their versions
- The configuration & security information: privileges, notification policies, preferences, etc.
- Data locations information: deployed model, jobs, etc.
- Run-time information: logs
For xDM Dashboards:
- The dashboard applications and their artifacts (queries, charts, etc.)
For xDM Discovery:
- The datasource definitions and the profiles.

A repository is stored in a database/schema accessed from the application using a JDBC datasource named SEMARCHY_REPOSITORY.

The repository data should never be accessed directly via SQL queries. Access to the Semarchy xDM information must be performed through the Semarchy user interfaces.

Repository Types

There are two types of repositories. The repository type is selected at creation time and cannot be modified afterwards.

The repository types are:

Design: All design-time and run-time operations are possible in this repository.
Deployment: With this repository type, you can only import closed model editions and cannot edit them.

The deployment repositories are suitable for production sites. Model transfer from design to deployment repositories is handled via incremental export/import of closed model editions.

Semarchy xDM Dashboards and Semarchy xDM Discovery work identically with Design and Deployment repositories. Both types of repositories support profiling, and the design of dashboard applications.

Data Locations

Data managed by Semarchy is stored in a data location. The data location contains data structured after the model entities and attributes, and runs certification jobs generated after the rules (enricher, validation, matching, etc.) defined in the model. A given model edition is deployed into a data location to host data matching the model structure and rules. At run-time, the data location contains the golden, master and source data, with all the lineage and history.

The data location is hosted in a database/schema and accessed via a JDBC datasource defined in the application server. A data location refers to the datasource via its JNDI URL.

Data locations are only used for Semarchy models designed in the Application Builder. Dashboard Applications created in Semarchy xDM Dashboards do not require a data location.

Data Location Contents

A Data Location contains the hub data, stored in the database/schema accessed using the data location’s datasource. This schema contains database tables and other objects generated from the model edition.

The data location also refers to three types of jobs (stored in the repository):

Installation Jobs: The jobs for creating or modifying in a non-destructive way the data structures in the data location.
Integration Jobs: The jobs for certifying data in these data structures, according to the model job definitions.
Purge Jobs: The jobs for purging the logs and data history according to the retention policies.

Data Locations, Repositories and Models

A data location is attached to a repository: You can declare as many data locations as you want in a repository, but a data location is always attached to a single repository. It is not possible to have a data location attached to two repositories at the same time.

You may deploy several model editions successively in a data location, but only one model edition is deployed and is active in the data location at a certain point in time.

Data Location Types

There are two types of data locations. The type is selected when the data location is created and cannot be changed afterwards.

The data location types are:

Development Data Location: A data location of this type supports deploying open or closed model editions. This type of data location is suitable for testing models in development and quality assurance environments.
Production Data Location: A data location of this type supports deploying only closed model editions. This type of data location is suitable for deploying MDM hubs in production environments.
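
The deployment rule for the two data location types can be summarized as a small check. This is only an illustrative sketch: can_deploy_edition is a hypothetical helper, and the lowercase keywords used for location types and edition states are assumptions made for this example, not Semarchy xDM identifiers.

```shell
# Hypothetical helper: succeed when a model edition in the given state
# ("open" or "closed") may be deployed to a data location of the given
# type ("development" or "production"), following the rules above.
can_deploy_edition() {
  local location_type="$1" edition_state="$2"
  case "$location_type:$edition_state" in
    development:open|development:closed) return 0 ;;  # Development accepts both
    production:closed)                   return 0 ;;  # Production accepts closed only
    *)                                   return 1 ;;  # Anything else is rejected
  esac
}
```

For example, can_deploy_edition production open fails, which reflects that open model editions cannot be deployed to a Production data location.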

Installation Patterns

This section provides patterns for deploying Semarchy xDM in real-life environments.

Pattern #1: Development, QA/UAT, and Production in Different Sites

This pattern assumes that the development, QA/UAT (Quality Assurance, User Acceptance Tests), and production sites are located on different networks or sites.

For this pattern, three repositories are created:

A REPO_DEV Design repository for the development site. A DEV Development data location is attached to this repository. This location is used by the development team to test their work.
A REPO_QA Deployment repository for the QA site. A QA Production data location is attached to this repository. This location contains QA data and allows for limit values testing.
A REPO_PROD Deployment repository for the production site. A PROD Production data location is attached to this repository. Only closed and production-ready model editions can be deployed in this location. The real production data is here.

With this configuration:

The entire development phase is performed in the REPO_DEV repository, with closed and open model editions. Model editions are deployed to the DEV data location, for the developers' tests.
When the development phase is finished, model editions are closed and exported to files from the REPO_DEV repository and imported in the REPO_QA repository. These closed model editions are deployed to the QA data location for testing.
QA teams perform testing to ensure the model is bug-free and ready for production.
Once the user testing/QA process is finished, the closed model editions are exported from the REPO_QA to the REPO_PROD repository and then deployed to the PROD data location.

With this configuration, you need to deploy three instances of Semarchy xDM, one per repository. These three instances are located on three different networks with possibly different security, scalability and high availability requirements and settings.

Pattern #2: Development/QA/UAT and Production in Different Sites

This pattern is similar to Pattern #1 but assumes that development and QA/UAT are co-located in one site, and production is located in a remote location.

For this pattern, two repositories are created:

A REPO_DEV Design repository for the development and QA site. A DEV Development data location and a QA Development data location are attached to this repository.
A REPO_PROD Deployment repository for the production site. The PROD Production data location is attached to this repository.

With this configuration:

Models are developed in the REPO_DEV repository and tested by developers in the DEV data location.
When a milestone is reached, model editions are closed. These closed model editions are deployed to the QA data location for testing.
Once the QA phase is finished, the closed model editions are exported to files from the REPO_DEV repository and imported in the REPO_PROD repository. From this repository, these closed model editions are deployed to the PROD data location.

With this configuration, you need to deploy two instances of Semarchy xDM, one per repository. These two instances are located on two different networks with possibly different security, scalability and high availability requirements and settings.

Pattern #3: Single Repository and Project

This pattern assumes that a single project is designed through a development/QA/Production lifecycle.

For this pattern:

A single Design repository is created.
Three data locations, DEV, QA and PROD are created:
- DEV is a Development data location into which open model editions are deployed during the development phase.
- QA is a Development data location into which the model editions closed by development are deployed for QA consumption. Possibly, open model editions can be deployed directly from development for immediate testing.
- PROD is a Production data location.

In this pattern, a single repository contains the development, QA and production editions of the models. Model versioning allows freezing and delivering to the next stage (and next deployment location) a model as it moves along its lifecycle.

Patterns #1 and #2 are more suitable for real enterprise environments. Pattern #3 is more suitable for prototyping or evaluation environments.

Pattern #4: Single Repository, Multiple Projects

This pattern is similar to the previous one but assumes that several projects/models are managed in the same repository.

For this pattern:

A single design repository is created.
Three data locations are created per project: DEV1, QA1 and PROD1, then DEV2, QA2, PROD2, etc.

The organization is the same as in pattern #3, but a set of data locations exists for each project managed in the single repository.

You can combine this pattern with patterns #1 and #2 to manage multiple projects across multiple development, QA and production environments.

High-Availability Configuration

Semarchy xDM can be configured to support enterprise-scale deployment and high-availability.

Semarchy xDM supports the clustered deployment of the Semarchy xDM web application for high-availability and failover. A clustered deployment can be set up for example to support a large number of concurrent users performing data access and authoring operations.

Reference Architecture for High-Availability

In a clustered deployment, only one instance of the Semarchy xDM application manages and runs the certification processes. This instance is the Active Instance. A set of clustered Semarchy xDM applications serves users accessing the user interfaces (Application Builder, Dashboard Builder, xDM Discovery, Configuration or the data management applications) as well as applications accessing data locations via integration points. These are Passive Instances. The back-end databases hosting the repository and data locations are deployed in a database cluster, and an HTTP load balancer is used as a front-end for users and applications.

The reference architecture for such a configuration is described in the following figure:

This architecture is composed of the following components:

HTTP Load Balancer: This component manages the sessions coming from within the enterprise network or from the Internet (typically via a Firewall). This component may be a dedicated hardware load balancer or a software solution, which distributes the incoming sessions on the passive instances running in the JEE application server cluster.
JEE Application Server Cluster + Passive Semarchy xDM Platforms: A Semarchy xDM application instance is deployed on each node of this cluster, which is scaled to manage the volume of incoming requests. In the case of a node failure, the other nodes remain able to serve the sessions. The Semarchy xDM applications deployed in the cluster are Passive Instances. Such an instance can process accesses to the Semarchy user interfaces or to the integration points, but cannot manage the batches and the jobs.
JEE Server + Active Semarchy xDM Platform: This single JEE server hosts the only complete Semarchy xDM platform of the architecture. This Active Instance is not accessible to users or applications. Its sole purpose is to poll the submitted batches and process the Jobs.
Database Cluster: This component hosts the Semarchy xDM Repository and the Data Locations databases/schemas in a clustered environment. Both active and passive instances of the Semarchy xDM Platform connect to this cluster using JDBC datasources.

In this architecture:

Design-time or administrative operations are processed by the passive instances in the JEE application server cluster.
Operations performed on the MDM Hubs (data access, steppers or external loads) are also processed by the passive instances, but the resulting batches and jobs are always processed by the single active instance.

Only one Active Instance must be configured. Multiple active instances are not supported.

The xDM Dashboards and xDM Discovery components run identically on active or passive instances. For example, the Discovery profiling processes can run even on passive instances.

Load Balancing

Load balancing ensures optimal usage of the resources for a large number of users and applications accessing Semarchy xDM simultaneously.

Load balancing is performed at two levels:

The HTTP load balancer distributes the incoming requests on the nodes of the JEE application server cluster.
The JDBC datasource configuration distributes database access to the repository and the data locations on the database cluster nodes. In PostgreSQL and SQL Server environments, use the JDBC connection configuration to enable load balancing on the multiple nodes of the cluster.
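
For PostgreSQL, one possible way to do this is the multi-host URL syntax of the PostgreSQL JDBC driver together with its loadBalanceHosts connection parameter. The sketch below only assembles such a URL. The function name build_pg_jdbc_url, the host names, and the database name are placeholders chosen for illustration, and whether this setting suits a given cluster topology (for example a single writable primary with read-only replicas) should be checked against the driver documentation.

```shell
# Hypothetical helper: build a multi-host PostgreSQL JDBC URL.
# Usage: build_pg_jdbc_url <database> <host:port> [<host:port> ...]
build_pg_jdbc_url() {
  local database="$1"
  shift
  local hosts=""
  local host
  for host in "$@"; do
    # Join the hosts with commas, as expected by the driver's URL syntax.
    hosts="${hosts:+$hosts,}$host"
  done
  # loadBalanceHosts=true asks the driver to pick hosts in random order
  # instead of always trying them in the order listed.
  echo "jdbc:postgresql://${hosts}/${database}?loadBalanceHosts=true"
}
```

Calling build_pg_jdbc_url semarchy db1:5432 db2:5432 prints jdbc:postgresql://db1:5432,db2:5432/semarchy?loadBalanceHosts=true, which could then be used as the connection URL of the datasource.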

Failure and Recovery

In the reference architecture, fail-over is managed for both user and application sessions.

The following table describes the behavior and the required recovery actions in case of a failure in the various points of the architecture.

Failure Type	Behavior and required actions
Database Failure	In the event of a database cluster node failure, other nodes are able to recover and process the incoming database requests.
Passive Instance Failure	If one of the nodes of the JEE application server cluster fails: Application sessions are moved to the other active nodes. User sessions to this node are automatically restarted on other active nodes. The only information not recovered is the content of the un-saved editors for the user sessions. All the other content is saved in the repository or the data locations. Transactions attached to steppers, for example, are saved in the data locations and not lost.
Active Instance Failure	The purpose of the active instance is to process batches and jobs. If this server fails: jobs running in the queues are halted, queued jobs remain in their queue, and incoming batches remain pending until the batch poller can process them. The active instance must be restarted, automatically or manually, to fully recover from a failure. When it is restarted, the platform resumes its normal course of activity with no user action required. A failure of the Active Instance does not impact the overall activity of users or applications, as these rely on the (clustered) Passive Instances. The only impact of such a failure may be a delay in the processing of data changes.

Configuring Semarchy xDM for High-Availability

Active vs. Passive Instances

The Semarchy xDM server comes in two flavors corresponding to two WAR (Web Application Archive) files:

The Active Server (semarchy.war) includes the active application to deploy on the single active server. This WAR includes the batch poller and the engine, and is able to trigger and process the submitted batches.
The Passive Server (semarchy-passive.war) includes the passive application to deploy on all the passive nodes of the cluster. This WAR does not include the batch poller and engine services, and is unable to trigger or process submitted batches.

Both these files are in the semarchy-xdm-install-<version tag>.zip archive file, in the mdm-server folder.

Install and Configure Semarchy xDM

The overall installation process for a high-availability configuration is similar to the general installation process:

Create the repository and data location databases/schemas in the database cluster.
Configure the application server security for both the cluster and the active node.
Configure the JDBC Datasources for the nodes/cluster.
Refer to your Oracle Database and Application Server documentation for more information about configuring RAC JDBC Datasources. If using PostgreSQL or SQL Server, refer to the JDBC driver documentation to configure the datasource for high-availability and load balancing.
Deploy the applications:
1. Deploy the active instance in the active node.
  The architecture supports only one active instance and there is no need to load balance it. The active instance can be deployed in the semarchy context using the semarchy.war and semarchy.xml files.
  The active instance will be available on the https://active-host:active-host-port/semarchy/ URL.
2. Deploy multiple passive instances behind the load balancer.
  You can deploy the passive instances using the same context as the active instance, or a different context.
  - Deploying with the same context
    When creating a passive node using semarchy-passive.war, rename this file to semarchy.war before deployment, and use the semarchy.xml file used for the active node deployment. This keeps the same deployment name (semarchy) for the active and the passive nodes, and usually simplifies load balancing configuration.
    In this configuration, the passive instances will be available behind the load balancer on the https://load-balancer-host:load-balancer-port/semarchy/ URL.
  - Deploying with a different context
    When creating a passive node using semarchy-passive.war, keep the semarchy-passive.war file name and use a copy of semarchy.xml, renamed to semarchy-passive.xml.
    In this configuration, the passive instances will be available behind the load balancer on the https://load-balancer-host:load-balancer-port/semarchy-passive/ URL.
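
The file naming for the two passive deployment options can be sketched as follows. The function stage_passive_instance, its mode keywords, and the single staging directory are assumptions made for this example. Where the WAR and the XML file actually go depends on your application server, so this only prepares correctly named copies.

```shell
# Hypothetical helper: copy the passive WAR and the semarchy.xml file
# into a staging directory under the names each option requires.
# Usage: stage_passive_instance <same-context|different-context> <src_dir> <staging_dir>
stage_passive_instance() {
  local mode="$1" src="$2" dest="$3"
  case "$mode" in
    same-context)
      # Same context as the active instance: the passive WAR is renamed.
      cp "$src/semarchy-passive.war" "$dest/semarchy.war" &&
        cp "$src/semarchy.xml" "$dest/semarchy.xml"
      ;;
    different-context)
      # Different context: the WAR keeps its name, the XML copy is renamed.
      cp "$src/semarchy-passive.war" "$dest/semarchy-passive.war" &&
        cp "$src/semarchy.xml" "$dest/semarchy-passive.xml"
      ;;
    *)
      echo "unknown mode: $mode" >&2
      return 1
      ;;
  esac
}
```

For example, stage_passive_instance different-context ./mdm-server ./staging leaves semarchy-passive.war and semarchy-passive.xml in ./staging, ready to be placed in the locations your application server expects.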

Configure the HTTP Load Balancer

Semarchy xDM requires that you configure HTTP Load Balancing with Sticky Sessions (also known as session persistence or session affinity). In this mode, requests from existing sessions are consistently routed to the same server. This is mandatory for the Semarchy xDM user interfaces, but not for integration points.
For example, for Amazon Web Services (AWS) deployments, sticky sessions are configured in the Load Balancer.

Configure the Clustered Model Deployment

By default, changes performed on a running node in a cluster only apply to that node and are not propagated to the other running nodes, since each node reads its configuration only at startup.
As a consequence, in the default configuration, a configuration change or a model deployment is effective only on the node where this change was performed, while other nodes still use old configurations or old deployed models.

A change in configuration or a model deployment mandates that all nodes in the cluster are restarted to read their entire configuration.

Restarting the nodes in this configuration is important to keep a consistent behavior across all nodes. For example, if you do not restart all the nodes after a model deployment, users may access the old or new version of the same application depending on the node that serves them when they connect to the cluster.

The Clustered Mode enables nodes to automatically retrieve configuration changes and model deployments.

To enable this mode, add the com.semarchy.xdm.cluster.enabled=true system property to your application server startup for each node. Nodes started with this flag automatically retrieve configuration changes from the repository without having to restart.

To configure this system property on a Tomcat server:

Open the <tomcat>/bin/setenv.sh (UNIX/Linux) or <tomcat>/bin/setenv.bat (Windows) file with a text editor.
Add the following property to the CATALINA_OPTS variable: -Dcom.semarchy.xdm.cluster.enabled=true
Save the file.
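
On UNIX/Linux, the resulting setenv.sh lines might look like the sketch below. It assumes you want to keep any options already present in CATALINA_OPTS and append the flag to them. Adapt it if your file sets the variable differently.

```shell
# <tomcat>/bin/setenv.sh (sketch)
# Append the clustered-mode flag while preserving any existing options.
CATALINA_OPTS="${CATALINA_OPTS:-} -Dcom.semarchy.xdm.cluster.enabled=true"
export CATALINA_OPTS
```

Tomcat reads this file at startup, so each node must be restarted once for the flag to take effect.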

With this flag enabled, the following changes are retrieved:

Model deployments in data locations. This automatically refreshes the applications and REST API.
Logging configuration changes.
New plugins deployment or plugin updates.
New or updated custom translations.

The engine, batch poller, purge schedules, continuous loads, notifications & notification server configurations are not affected since they run and can be configured only on the active instance.

Note that this new flag has an overhead since all nodes regularly poll the database for possible configuration changes. To avoid this overhead on non-clustered environments, it is not enabled by default.

Configuring the Databases and Schemas

This section explains how to configure the databases/schemas for the repository and data locations.

Configuring the Repository Storage

Before installing Semarchy xDM, you must create a storage for the repository. You can create it manually or use your database administration interface for this purpose. In this section, we provide sample scripts for creating this storage. Make sure to adapt these scripts to your database configuration.

Creating the repository schema (Oracle)

CREATE USER <repository_user_name> IDENTIFIED BY <repository_user_password>
 DEFAULT TABLESPACE USERS TEMPORARY TABLESPACE TEMP;
GRANT CONNECT, RESOURCE TO <repository_user_name>;

-- The following command should be used for Oracle 12c and above
GRANT UNLIMITED TABLESPACE TO <repository_user_name>;

Creating a database and the repository schema (PostgreSQL)

/* Create a database for the repository and data locations */

CREATE DATABASE <postgresql_database_name> WITH ENCODING 'UTF8';

/*
 * Disconnect and then reconnect using:
 *  the JDBC URL:  jdbc:postgresql://<host>:<port>/<postgresql_database_name>
 *  or using psql with the following command: psql -U postgres <postgresql_database_name>
 */

CREATE SCHEMA extensions;
GRANT USAGE ON SCHEMA extensions TO PUBLIC;
ALTER DEFAULT PRIVILEGES IN SCHEMA extensions GRANT EXECUTE ON FUNCTIONS TO PUBLIC;
ALTER DATABASE <postgresql_database_name> SET SEARCH_PATH TO "$user",public,extensions;

CREATE EXTENSION IF NOT EXISTS "uuid-ossp"     with schema extensions;
CREATE EXTENSION IF NOT EXISTS "fuzzystrmatch" with schema extensions;

/* Create the repository user and schema */

CREATE USER <repository_user_name> WITH PASSWORD '<repository_user_password>';

/* Use the following syntax for PostgreSQL 9 */
-- CREATE USER <repository_user_name> WITH UNENCRYPTED PASSWORD '<repository_user_password>';

/*
 * The following command is required only for PostgreSQL running on Amazon RDS.
 * It grants access to the repository to the RDS super user.
 */
-- GRANT <repository_user_name> TO <rds_superuser_name>

CREATE SCHEMA <repository_user_name> AUTHORIZATION <repository_user_name>;

Creating a repository database, login and user (SQL Server)

/* Create a database for the repository and data locations */

CREATE DATABASE <repository_database_name>
GO

/* Configuring the database */

ALTER DATABASE <repository_database_name> SET READ_COMMITTED_SNAPSHOT ON;
GO

ALTER DATABASE <repository_database_name> SET QUOTED_IDENTIFIER ON;
GO

/* Create a login to connect the database */

CREATE LOGIN <repository_user_name> WITH PASSWORD='<repository_user_password>', DEFAULT_DATABASE=<repository_database_name>
GO

/* Add a user for that login in the database */

USE <repository_database_name>
GO

CREATE USER <repository_user_name> FOR LOGIN <repository_user_name>
GO

/* Make this user database owner */

ALTER ROLE db_owner ADD MEMBER <repository_user_name>
GO

Store the values of the <repository_user_name> and <repository_user_password> as you will need them later for creating the datasource to access the repository.

Configuring the Data Locations Storage

You do not need to create the data locations' databases/schemas at installation time, but it is recommended to plan them as part of the installation and configuration effort. You can create them manually or use your database administration interface for this purpose. This section provides sample scripts for creating a data location database/schema. Make sure to adapt these scripts to your database configuration and duplicate them to create the storage for all data locations.

Creating a data location schema (Oracle)

CREATE USER <data_location_user_name> IDENTIFIED BY <data_location_user_password>
 DEFAULT TABLESPACE USERS TEMPORARY TABLESPACE TEMP;
GRANT CONNECT, RESOURCE TO <data_location_user_name>;

-- The following command should be used for Oracle 12c and above
GRANT UNLIMITED TABLESPACE TO <data_location_user_name>;

Creating a data location schema (PostgreSQL)

CREATE USER <data_location_user_name> WITH PASSWORD '<data_location_user_password>';

/* Use the following syntax for PostgreSQL 9 */
-- CREATE USER <data_location_user_name> WITH UNENCRYPTED PASSWORD '<data_location_user_password>';

/*
 * The following command is required only for PostgreSQL running on Amazon RDS.
 * It grants access to the data location to the RDS super user.
 */
-- GRANT <data_location_user_name> TO <rds_superuser_name>

CREATE SCHEMA <data_location_user_name> AUTHORIZATION <data_location_user_name>;

Creating a data location database, login and user (SQL Server)

/* Create a database for the repository and data locations */

CREATE DATABASE <data_location_database_name>
GO

/* Configuring the database */

ALTER DATABASE <data_location_database_name> SET READ_COMMITTED_SNAPSHOT ON;
GO

ALTER DATABASE <data_location_database_name> SET QUOTED_IDENTIFIER ON;
GO

/* Create a login to connect the database */

CREATE LOGIN <data_location_user_name> WITH PASSWORD='<data_location_user_password>', DEFAULT_DATABASE=<data_location_database_name>
GO

/* Add a user for that login in the database */

USE <data_location_database_name>
GO

CREATE USER <data_location_user_name> FOR LOGIN <data_location_user_name>
GO

/* Make this user database owner */

ALTER ROLE db_owner ADD MEMBER <data_location_user_name>
GO

Store the values of the <data_location_user_name> and <data_location_user_password> as you will need them later for creating the datasource to access the data location.

Database-Specific Considerations

General Considerations

Configure the database with a charset that supports all languages, such as AL32UTF8 for Oracle and UTF8 for PostgreSQL. Semarchy xDM uses specific characters for storing internal information, and applications in Semarchy xDM natively support multi-lingual data without preventing users from entering accented characters (or Cyrillic or Arabic or Chinese). A database configured for a specific language or with a limited charset may not function optimally with Semarchy xDM.

Oracle

Repositories and data locations should be located in separate schemas. However, they do not necessarily need to be located in the same database.

Semarchy xDM ships with an Oracle JDBC driver (ojdbc8.jar) for Oracle Database version 12c Release 2 (12.2.x). This driver is strongly recommended for all recent database versions. If you are using an older Oracle version (11g), it is recommended to review the compatibility of this driver with your Oracle database version and possibly install an older driver version instead (ojdbc6 or ojdbc7).

Oracle JDBC Datasource Configuration

The JDBC connections to an Oracle Database hosting a repository or data location must be made with the oracle.jdbc.J2EE13Compliant property set to true.

When this option is not set, errors such as the following one will be raised in the application log.

com.semarchy.mdm.runtime.data.InvalidDataAccessResourceUsageException: java.lang.RuntimeException: Unexpected DB value .... (Class oracle.sql.TIMESTAMP for logicalType TIMESTAMP)

Make sure that the datasources are configured with this property.
For example, for a Tomcat datasource, the resource must contain the following:

<Resource
	name="..."
	...
	connectionProperties="oracle.jdbc.J2EE13Compliant=true"
	...
/>

PostgreSQL

Repositories and data locations should be located in separate schemas.

SQL Server

The configuration presented above uses logins defined in the database. Change this script to use Windows or AD logins as needed.

Semarchy xDM does not support schemas for SQL Server. One database is required for each repository and data location, and each user used to connect to these databases should have the db_owner (dbo) role.

Be cautious of the collation when configuring the database instance:

The collation defines the code page, the case (CS/CI) and accent (AS/AI) sensitivity:
- It has a strong impact on comparison functions, order by clauses, and overall execution performance.
- During a comparison, a character that is out of the collation codepage is considered as an unknown character and is always different from another unknown character, which may cause issues in comparisons.
The collation selected for the databases hosting the repository and data locations should be chosen carefully to support all the characters that you plan to use or store in the hub. In addition, Semarchy xDM internally uses the following special characters (identified by their Unicode code points): £ (U+00A3), $ (U+0024), ¤ (U+00A4), • (U+2022). These characters should also be supported by the collation.
Semarchy xDM requires that the repository and data location databases have a case-sensitive collation. Using a case-insensitive collation may cause unexpected results in search operations.
Note that the SEM_NORMALIZE function is collation-proof: it forces the Latin1_General_100_CS_AS_KS_WS_SC collation.

SQL Server repository and data locations databases should also be configured as follows for Semarchy xDM:

QUOTED_IDENTIFIER should be set to ON to force SQL Server to follow the ISO rules for identifiers and literal values quoting, using the following command:
```
ALTER DATABASE <database_name> SET QUOTED_IDENTIFIER ON;
```
READ_COMMITTED_SNAPSHOT should be set to ON to allow connections to access previous (committed) version of records being modified instead of waiting for them to be unlocked, using the following command:
```
ALTER DATABASE <database_name> SET READ_COMMITTED_SNAPSHOT ON;
```

Sizing and Maintaining the Databases/Schemas

Repository

The following considerations should be taken into account when sizing the repository database/schema:

The repository contains a small volume of information if you exclude the execution log. It can be sized by default from 100 to 200 MB depending on the number of models and model versions stored in it.
The execution log is the largest part of the data stored in the repository. The volume of data generated in the log depends on the number of jobs executed daily. It is recommended to monitor the job executions and resize the repository according to the log history you want to preserve.
To keep the logs at a reasonable volume, it is recommended to purge the log regularly. To do so, configure reasonable Retention Policies in the models and schedule regular purges in the data locations.
Refer to the Semarchy xDM Developer’s Guide for retention policies and to the Semarchy xDM Application Management Guide for purge schedules.

Data Location

The following considerations should be taken into account when sizing the data location databases/schema:

For each entity of the deployed model, the data location contains several tables that correspond to the data in the various stages of the certification process. Some of these tables also contain the history of the previous iterations of the process.
The volume required in the data location depends on the following factors:
- Number of entities
- Number of source records pushed for these entities
- Number of new or updated source records pushed for these entities.

A recommended initial sizing is to add up the source data volume pushed for each entity by all publishers plus one data authoring (the overall input) and multiply it by a factor of 10. For example, an overall input of 5 GB gives an initial sizing of 50 GB. After this initial sizing, it is recommended to monitor the size of the data location in the normal course of operations and adjust the sizing accordingly.

The same sizing considerations apply to both the data and temporary tablespaces/filegroups in the case of the data locations, as the database engine is used for most of the processing effort in the certification process.

Data Retention Policies can be created to define the volume of data to retain in the data locations, and Data Purges can be scheduled to trigger the pruning of unnecessary data. Defining Retention Policies is covered in the Securing Data chapter of the Semarchy xDM Developer’s Guide. Data Purges are described in the Managing Execution chapter of the Semarchy xDM Administration Guide.

Deployment and Configuration Overview

The Semarchy xDM Server is a Java EE application that can be deployed to a number of environments. It requires a Java EE application server (for example: Tomcat, GlassFish, JBoss/WildFly, etc.). This section details the steps required to configure the application server and deploy the application in the application server.

The instructions provided in the following chapters detail one method and approach to configure the application server. Note that the method may vary depending on the application server version. Please refer to the application server documentation for the up-to-date instructions and instruction details. Approaches also differ depending on the best practices used in your information system. Please contact the application server administrator and review these instructions to adapt them to your practices.

Conventions

In the following chapters, the following variable names are used in the tasks:

The semarchy-xdm-install-xxx.zip file refers to the Semarchy xDM - Server Installation file you have downloaded to install Semarchy xDM. The name of this file varies as it includes the Semarchy xDM version and build number.
The semadmin user refers to the first user created for connecting to Semarchy xDM. This user is named by default semadmin. This name can be changed in the installation process.
<semadmin_password> refers to the password you want to set for the semadmin user. This password must comply with the password policy defined for your application server.
The temp folder refers to a temporary folder of your choice.

Security

The application configuration includes configuring security in the application server.
The goal of this task is to create:

A semarchyConnect role. This built-in role grants the privilege to connect to the application.
A semarchyAdmin role. This built-in role grants full privileges in the application.
A user called semadmin with semarchyConnect and semarchyAdmin roles. This user is the administrator for Semarchy xDM.

Depending on the application server, users are directly mapped to roles, or are mapped to roles via a Group concept. When an application server uses groups:

The semarchyConnectGroup and semarchyAdminGroup groups are created.
The semarchyConnectGroup group is mapped to the built-in semarchyConnect role.
The semarchyAdminGroup group is mapped to the built-in semarchyAdmin role.
The semadmin user is added to both the semarchyConnectGroup and semarchyAdminGroup groups.

This basic configuration allows you to connect with the semadmin user and have full privileges for the application.

It is recommended to tune and enhance the security by:

Creating new users, roles and groups dedicated to Semarchy xDM in the application server security realm according to your organization. Refer to your application server’s security guide for more information.
Declaring these roles in Semarchy xDM and granting privileges to the application’s features to these roles. For more information, refer to the Semarchy xDM Administration Guide.

Datasources

The configuration of the application includes creating datasources to connect to the repository and data locations that will be used by your MDM projects.

Datasources are also used to access datasets from Semarchy xDM Dashboards, to run queries and render charts and dashboards on top of these datasets. Similarly, datasources are used to gather profiling statistics with Semarchy xDM Discovery.

To configure the repository datasource, make sure you have the following information:

For Oracle:
- <oracle_instance_hostname>: host name or IP address of the database server
- <oracle_listener_port>: number of the port where the server listens for requests.
- <oracle_SID_name> or <oracle_service_name>: name of a database on the server. This is the SID or ServiceName in the Oracle terminology.
  Note that the JDBC URL used for Oracle varies depending on the value used:
  - SID: jdbc:oracle:thin:@<oracle_instance_hostname>:<oracle_listener_port>:<oracle_SID_name>
  - Service Name: jdbc:oracle:thin:@<oracle_instance_hostname>:<oracle_listener_port>/<oracle_service_name>
- <repository_user_name>: name for the user created when configuring the repository schema.
- <repository_user_password>: this user’s password.
For PostgreSQL:
- <postgresql_hostname>: host name or IP address of the database server
- <postgresql_port>: number of the port where the server listens for requests.
- <postgresql_database_name>: name of a database on the server.
- <repository_user_name>: name for the user created when configuring the repository schema.
- <repository_user_password>: this user’s password.
For SQL Server:
- <sqlserver_hostname>: host name or IP address of the database server
- <sqlserver_port> or <instance_name>: number of the port where the instance listens for requests, or the name of this instance.
- <repository_database_name>: name of a database on the server.
- <repository_user_name>: name for the user used to connect to the repository database.
- <repository_user_password>: this user’s password.

The JNDI Name of the repository datasource must always be jdbc/SEMARCHY_REPOSITORY. Make sure to set this name for the repository datasource as it is referred to with that name by the application.
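
As an illustration, the information above could translate into a Tomcat datasource declaration such as the following sketch, for an Oracle repository accessed by service name. The host, port, service name and credentials are hypothetical values standing in for the placeholders listed above, and the `factory` attribute assumes that the Tomcat JDBC connection pool is used. The sample configuration files shipped in the installation package should be preferred as a starting point.

```xml
<!--
	Sketch only, with hypothetical values:
	db.example.com stands for <oracle_instance_hostname>,
	1521 for <oracle_listener_port>, XDMPDB for <oracle_service_name>.
	The JNDI name jdbc/SEMARCHY_REPOSITORY is mandatory for the repository.
-->
<Resource
	name="jdbc/SEMARCHY_REPOSITORY"
	auth="Container"
	type="javax.sql.DataSource"
	factory="org.apache.tomcat.jdbc.pool.DataSourceFactory"
	driverClassName="oracle.jdbc.OracleDriver"
	url="jdbc:oracle:thin:@db.example.com:1521/XDMPDB"
	username="semarchy_repository"
	password="change_me"
	connectionProperties="oracle.jdbc.J2EE13Compliant=true"
/>
```

For a database identified by its SID, only the end of the URL changes (`:` followed by the SID instead of `/` followed by the service name).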

To configure the data location datasources, make sure you have the following information for each data location:

<data_location_name>: name for the data location. You can use this name as the JNDI name for the datasource in which this data location will be hosted.
For Oracle:
- <oracle_instance_hostname>: host name or IP address of the database server
- <oracle_listener_port>: number of the port where the server listens for requests.
- <oracle_SID_name> or <oracle_service_name>: name of a database on the server. This is the SID or ServiceName in the Oracle terminology.
- <data_location_user_name>: name for the user created when configuring the data location schema.
- <data_location_user_password>: this user’s password.
For PostgreSQL:
- <postgresql_hostname>: host name or IP address of the database server
- <postgresql_port>: number of the port where the server listens for requests.
- <postgresql_database_name>: name of a database on the server.
- <data_location_user_name>: name for the user created when configuring the data location schema.
- <data_location_user_password>: this user’s password.
For SQL Server:
- <sqlserver_hostname>: host name or IP address of the database server
- <sqlserver_port> or <sql_server_instancename>: number of the port where the instance listens for requests, or the name of this instance.
- <data_location_database_name>: name of a database on the server.
- <data_location_user_name>: name for the user used to connect to the data location database.
- <data_location_user_password>: this user’s password.

To configure datasources for xDM Dashboards or xDM Discovery, make sure that you have the same information as above to connect each database/schema containing the data you want to run queries against.
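
The data location information listed above maps to datasource declarations in a similar way. The sketch below shows two hypothetical data locations in a Tomcat context file, one on PostgreSQL and one on SQL Server. The JNDI names, hosts, ports, database names and credentials are illustrative assumptions and must be replaced with your own values.

```xml
<Context>
	<!-- Hypothetical PostgreSQL data location: pg.example.com stands for
	     <postgresql_hostname>, 5432 for <postgresql_port>, semarchy for
	     <postgresql_database_name>. -->
	<Resource
		name="jdbc/CUSTOMER_HUB"
		auth="Container"
		type="javax.sql.DataSource"
		driverClassName="org.postgresql.Driver"
		url="jdbc:postgresql://pg.example.com:5432/semarchy"
		username="customer_hub"
		password="change_me"
	/>

	<!-- Hypothetical SQL Server data location addressed by port.
	     To address a named instance instead, the URL takes the form
	     jdbc:sqlserver://sql.example.com\INSTANCE;databaseName=PRODUCT_HUB -->
	<Resource
		name="jdbc/PRODUCT_HUB"
		auth="Container"
		type="javax.sql.DataSource"
		driverClassName="com.microsoft.sqlserver.jdbc.SQLServerDriver"
		url="jdbc:sqlserver://sql.example.com:1433;databaseName=PRODUCT_HUB"
		username="product_hub"
		password="change_me"
	/>
</Context>
```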

Datasource Tuning Considerations

When configuring datasources for Semarchy xDM, especially in production environments, you must take into account the considerations listed below.

The following sections explain datasource tuning and configuration, with samples for some of the supported application servers. Refer to your application server documentation for more detailed information.

Connection Pool Sizing

Datasources configured for Semarchy should use connection pools, sized according to the expected usage.

In the normal course of operation of Semarchy:

Each simultaneous user session connected to and interacting with an MDM application makes one connection to the data location datasource and possibly one to the repository when working with steppers.
Each simultaneous integration job makes one connection to the data location datasource, plus one to the repository datasource.
Each chart rendered by xDM Dashboards makes one connection to the datasource containing the data when it renders.
Each profiling process running in xDM Discovery makes one connection to the datasource containing the data it profiles and one connection to the repository.
Each chart rendered on an xDM Discovery profiling chart makes one connection to the repository.

You must configure the pool size as a trade-off between two directions:

If the pool creates and keeps too many connections, it will be ready to serve any client immediately, but it will also overuse resources by pro-actively creating too many connections.
If the pool does not have enough connections available, then the pool may be exhausted at certain times, causing client requests to wait until a connection becomes available in the pool. In a nutshell, the more users try to access a small pool and the longer their queries, the more they will have to wait between each click.

To configure the pool size, the application servers provide a series of parameters.

Application Server	Relevant Parameters
Apache Tomcat	initialSize: initial number of connections when the pool is created. maxActive: the maximum number of active connections at the same time. maxIdle: the maximum number of idle connections kept in the pool at all times. minIdle: the minimum number of idle connections kept in the pool at all times. Other parameters to consider: timeBetweenEvictionRunsMillis, minEvictableIdleTimeMillis and maxWait.
Oracle WebLogic	Initial Capacity: number of connections created when the pool is created. Maximum Capacity: the maximum number of connections at the same time in the pool. Minimum Capacity: the minimum number of connections kept in the pool at all times. Other parameters to consider: Pool Resize Quantity and Shrink Frequency.
WildFly	max-pool-size: the maximum number of connections at the same time in the pool. min-pool-size: the minimum number of connections kept in the pool at all times. prefill: create min-pool-size connections when the pool is created. Other parameter to consider: idle-timeout-minutes.
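
As a rough illustration of the Tomcat parameters listed above, the following sketch sizes a pool for a hypothetical data location. All numeric values are assumptions chosen for the example, not recommendations. They must be derived from the expected number of simultaneous user sessions, integration jobs and charts. The `factory` attribute assumes the Tomcat JDBC connection pool, which is the implementation that exposes `maxActive` and `maxWait` under these names.

```xml
<!-- Sketch only: hypothetical JNDI name, URL, credentials and pool sizes. -->
<Resource
	name="jdbc/CUSTOMER_HUB"
	auth="Container"
	type="javax.sql.DataSource"
	factory="org.apache.tomcat.jdbc.pool.DataSourceFactory"
	driverClassName="org.postgresql.Driver"
	url="jdbc:postgresql://pg.example.com:5432/semarchy"
	username="customer_hub"
	password="change_me"
	initialSize="5"
	minIdle="5"
	maxIdle="20"
	maxActive="50"
	maxWait="30000"
	timeBetweenEvictionRunsMillis="30000"
	minEvictableIdleTimeMillis="60000"
/>
```

In this sketch, up to 50 connections can be in use at the same time, a request waits at most 30 seconds (`maxWait` is expressed in milliseconds) for a free connection, and idle connections above `minIdle` become eligible for eviction after one minute.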

Connection Management

When Semarchy takes a connection from the connection pool, it assumes that this connection is a valid one. In certain situations, the connection may be invalid at the database side (for maintenance reasons, etc.). It is a good practice to configure the pool to test connections before serving them.

This capability is typically enabled in application servers using a query that is executed before serving the connection from the pool.

Similarly, connections in the pool should not be configured in auto-commit mode. When a connection is released to the pool, the configuration should ensure that this connection rolls back any uncommitted statements.

Application Server	Relevant Parameters
Apache Tomcat	testOnBorrow: boolean indicating whether connections are tested when borrowed from the pool. validationQuery: query used to test the connection (e.g.: `select 1 from dual` for Oracle or `select 1` for PostgreSQL and SQL Server). validationInterval: minimum interval between two validation tests, to avoid excess validation. defaultAutoCommit: default auto-commit state for the connections. It should be set to `false`. rollbackOnReturn: rolls back the transaction when the connection is released to the pool. It should be set to `true`. Make sure to set defaultAutoCommit to `false` and rollbackOnReturn to `true` in PostgreSQL connection pools. Another parameter to consider: validationQueryTimeout.
Oracle WebLogic	Test Reserved Connections: indicates whether connections are tested when served from the pool. Test Table Name: Name of a table onto which a `select 1 from <table_name>` is issued for testing. Seconds to Trust an Idle Pool Connection: minimum interval between two validation tests, to avoid excess validation.
WildFly	validate-on-match: indicates whether connections are tested when served from the pool. valid-connection-checker: class implementing an optimized method to validate the connection. Use `org.jboss.jca.adapters.jdbc.extensions.oracle.OracleValidConnectionChecker` for Oracle or `org.jboss.jca.adapters.jdbc.extensions.postgres.PostgreSQLValidConnectionChecker` for PostgreSQL. check-valid-connection-sql: (Alternate method to valid-connection-checker) query used to test the connection (e.g.: `select 1 from dual` for Oracle, `select 1` for PostgreSQL or `SELECT 1 FROM sysobjects` for SQL Server).
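
The Tomcat parameters above could be combined as in the following sketch for a hypothetical PostgreSQL data location. The JNDI name, URL, credentials and timing values are assumptions made for the example. The `factory` attribute assumes the Tomcat JDBC connection pool, which provides `validationInterval` and `rollbackOnReturn`.

```xml
<!-- Sketch only: connections are validated when borrowed, never in
     auto-commit mode, and rolled back when returned to the pool. -->
<Resource
	name="jdbc/CUSTOMER_HUB"
	auth="Container"
	type="javax.sql.DataSource"
	factory="org.apache.tomcat.jdbc.pool.DataSourceFactory"
	driverClassName="org.postgresql.Driver"
	url="jdbc:postgresql://pg.example.com:5432/semarchy"
	username="customer_hub"
	password="change_me"
	testOnBorrow="true"
	validationQuery="select 1"
	validationInterval="30000"
	validationQueryTimeout="5"
	defaultAutoCommit="false"
	rollbackOnReturn="true"
/>
```

Here `validationInterval` is expressed in milliseconds and `validationQueryTimeout` in seconds. For an Oracle datasource, the validation query would be `select 1 from dual`.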

Long-Running Queries

Semarchy xDM sometimes executes long-running queries to the data location. For example, an integration job that processes large data volumes may have queries that run for several minutes or hours.

Application servers have a mechanism that considers a connection borrowed from the pool for a long time as "leaked" or "stale". When such a connection is detected, it is reclaimed by the pool and the running query fails. This mechanism must be disabled for Semarchy xDM long-running queries to work properly.

Application Server	Relevant Parameters
Apache Tomcat	removeAbandoned: boolean indicating whether connections should be considered stale and removed after removeAbandonedTimeout seconds. This parameter should preferably be set to false (it is its default value). removeAbandonedTimeout: time after which a connection is considered stale.
Oracle WebLogic	Inactive Connection Timeout: time after which a connection is considered leaked and is reclaimed by WebLogic for the pool. Set this parameter to '0' to disable this feature.
WildFly	stale-connection-checker: class implementing a method for properly checking stale connections. Use `org.jboss.jca.adapters.jdbc.extensions.oracle.OracleStaleConnectionChecker`.
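
For Tomcat, the following sketch makes this setting explicit on a hypothetical data location datasource. The JNDI name, URL and credentials are assumptions made for the example, and the `factory` attribute assumes the Tomcat JDBC connection pool. Since `false` is already the default value, the attribute mainly documents the intent and guards against the mechanism being enabled by a copied configuration.

```xml
<!-- Sketch only: abandoned-connection removal stays disabled so that
     long-running integration job queries are not interrupted by the pool. -->
<Resource
	name="jdbc/CUSTOMER_HUB"
	auth="Container"
	type="javax.sql.DataSource"
	factory="org.apache.tomcat.jdbc.pool.DataSourceFactory"
	driverClassName="org.postgresql.Driver"
	url="jdbc:postgresql://pg.example.com:5432/semarchy"
	username="customer_hub"
	password="change_me"
	removeAbandoned="false"
/>
```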

JavaMail Session

Semarchy xDM uses email servers, for example, to send email notifications.
For the email features to work, you must configure an email notification server.

A notification server can be configured either:

By entering the mail server SMTP configuration in Semarchy xDM.
By referring Semarchy xDM to a JavaMail Session resource configured in the application server.

This latter option is described in the next chapters for each application server.

If a JavaMail Session resource is already configured in the application server, you can skip these steps. You will reuse the existing JavaMail Session resource when configuring the notification server.

The resource name used for the JavaMail session resource is set to mail/Session. This value can be changed when running the installation process, and the changed value must be used when configuring the notification server.
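
As an indication of what such a resource looks like, the following sketch declares a JavaMail Session in a Tomcat context file under the default `mail/Session` name. The SMTP host and port are hypothetical values, and the `javax.mail.Session` type is an assumption that holds for Tomcat versions based on the `javax` namespace. Check the chapter dedicated to your application server for the exact procedure.

```xml
<!-- Sketch only: smtp.example.com and port 25 are hypothetical values. -->
<Resource
	name="mail/Session"
	auth="Container"
	type="javax.mail.Session"
	mail.smtp.host="smtp.example.com"
	mail.smtp.port="25"
/>
```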

Application Deployment

The application is deployed by default with the semarchy context. It is therefore accessible on the application server at the following URL: http://<application_server_host>:<application_server_port>/semarchy/.

During the application deployment, it is possible to use a different context than semarchy. If you use a different context, make sure to take it into account in the URL to test and connect to the application.

Downloading Semarchy xDM

The Server Installation files for Semarchy xDM can be downloaded from the Semarchy website, at the following URL: http://www.semarchy.com/get/semarchy-xdm-install/.
The Semarchy xDM Server Installation file you download is referred to as semarchy-xdm-install-xxx.zip.

Uncompress this file in the temp folder.

The semarchy-xdm-install-xxx.zip archive contains the following files and folders:

File/Folder	Description
`README.txt`	File describing the package
`mdm-server/`	This folder contains the installation files for Semarchy xDM.
`mdm-server/semarchy.war`	Semarchy xDM deployable WAR file.
`mdm-server/semarchy-passive.war`	Semarchy xDM deployable WAR file for Passive Instances. Use this version for deploying Semarchy xDM in an existing supported application server for High-Availability Configurations.
`mdm-server/samples/`	This folder contains sample configuration files.
`mdm-server/semarchy-oracle.xml`	Sample configuration file for deploying Semarchy xDM in Apache Tomcat with an Oracle database.
`mdm-server/semarchy-postgresql.xml`	Sample configuration file for deploying Semarchy xDM in Apache Tomcat with a PostgreSQL database.
`mdm-server/semarchy-sqlserver.xml`	Sample configuration file for deploying Semarchy xDM in Apache Tomcat with a SQL Server database.
`mdm-server/additional-libraries/`	This folder contains libraries used with Semarchy xDM, listed below.
`mdm-server/additional-libraries/ojdbc8.jar`	Oracle JDBC driver for Oracle Database version 12c. If you are using an older Oracle version, it is recommended to review the compatibility of this driver with your Oracle database version and possibly install an older driver version instead (ojdbc6 or ojdbc7).
`mdm-server/additional-libraries/org.postgresql.jdbc<postgresql_version>.jar`	PostgreSQL JDBC Driver.
`mdm-server/additional-libraries/com.microsoft.sqlserver.mssql-jdbc_<version>.jar`	SQL Server JDBC Driver.
`mdm-server/additional-libraries/com.sun.mail.jakarta.mail_<version>.jar`	Library to install to enable JavaMail for Apache Tomcat Servers. Ignore this file if you are using a different application server.
`mdm-server/additional-libraries/com.semarchy.tool.jee.tomcat-<tomcat_version>.jar`	Tomcat tools for authentication and role mapping. See Delegating Authentication and Authorization in Tomcat for more information. Note that this component is provided for a given version of Tomcat, identified by the first two digits of the `<tomcat_version>` (for example 8.5). Contact our support team if you are using a different version of Tomcat.
mdm-server/additional-libraries/ com.sun.activation.jakarta.activation_.jar com.sun.istack.commons-runtime_.jar com.sun.xml.fastinfoset.FastInfoset_.jar jakarta.jws-api_.jar jakarta.xml.bind-api_.jar jakarta.xml.soap-api_.jar org.apache.servicemix.specs.jaxws-api-.jar org.glassfish.jaxb.runtime_.jar org.glassfish.jaxb.txw2_.jar org.jvnet.staxex.stax-ex_.jar	Additional libraries required for Tomcat when running with a JDK 11.

Deploying and Configuring with Apache Tomcat

This section explains how to configure and deploy the Semarchy xDM Server with Apache Tomcat.

In this section, <tomcat> refers to the Apache Tomcat installation folder.

Refer to the Tomcat Documentation for more details about the deployment and configuration processes for Apache Tomcat.

Installing Additional Libraries

The installation file includes additional libraries that are not part of the Semarchy xDM code but are required by the application server where Semarchy xDM is running:

All installations need a database driver to connect to the repository and data locations.
Java 11 (JDK 11) does not include certain libraries which are included in Java 8 (JDK 8), so these files must be added manually.
Customers who wish to send email notification need an additional jar file.
Additionally, more files may be needed depending on any external authentication or single sign on (SSO) configuration that you require.

Before adding libraries, you must stop the Apache Tomcat server using <tomcat>/bin/shutdown.bat (Windows) or <tomcat>/bin/shutdown.sh (UNIX/Linux).
Similarly, after installing the libraries, restart the Apache Tomcat server using <tomcat>/bin/startup.bat (Windows) or <tomcat>/bin/startup.sh (UNIX/Linux).

Installing the Database JDBC Drivers

Install the JDBC drivers to connect the repository and data location databases, as well as the additional drivers required for the databases accessed by the xDM Dashboards charts and dashboards, or profiled by xDM Discovery.

To install the JDBC drivers:

Copy the appropriate database driver file from temp/mdm-server/additional-libraries/ to the <tomcat>/lib directory.
Copy additional drivers to the same directory.

Installing the Mail Session Libraries

This configuration is required for mail notifications using JEE Mail Session.

To install the Java Mail Libraries:

Copy the temp/mdm-server/additional-libraries/com.sun.mail.jakarta.mail_<version>.jar file to the <tomcat>/lib/ folder.

Installing additional libraries not included in JDK 11

If running Tomcat with a JDK 11, you must copy the libraries required by Semarchy xDM that are not provided in this version of the JDK to the <tomcat>/lib/ folder. This step is not required for a JDK 8.

You will find these libraries, listed below, in the `temp/mdm-server/additional-libraries/` folder.

com.sun.activation.jakarta.activation_*.jar
com.sun.istack.commons-runtime_*.jar
com.sun.xml.fastinfoset.FastInfoset_*.jar
jakarta.jws-api_*.jar
jakarta.xml.bind-api_*.jar
jakarta.xml.soap-api_*.jar
org.apache.servicemix.specs.jaxws-api-*.jar
org.glassfish.jaxb.runtime_*.jar
org.glassfish.jaxb.txw2_*.jar
org.jvnet.staxex.stax-ex_*.jar

Configuring the Security

To configure the Semarchy xDM administrator user:

Stop the Apache Tomcat server.
Edit the <tomcat>/conf/tomcat-users.xml file.
In the <tomcat-users> section, add the following line:

 <user username="semadmin" password="<semadmin_password>" roles="semarchyConnect,semarchyAdmin"/>

Save the file.
Restart the Apache Tomcat server.

This operation adds a semadmin user and its password to Apache Tomcat. This user has full privileges to the Semarchy xDM application. Make sure to use a strong password for this user.

Configuring the Logging

It is recommended to change the default configuration of the Tomcat server so that logging can be configured directly from the Semarchy xDM Configuration user interface, and to prevent useless logging at server startup.

To configure the logging:

Add Tomcat startup properties:
1. Open the <tomcat>/bin/setenv.sh (UNIX/Linux) or <tomcat>/bin/setenv.bat (Windows) file with a text editor.
2. Add the following properties to the CATALINA_OPTS variable: -Dorg.ops4j.pax.logging.DefaultServiceLog.level=WARN
3. Save the file.
Edit the bootstrap logging configuration:
1. Edit the <tomcat>/conf/logging.properties file with a text editor.
2. Append the properties listed below to the file then save it.

Logging.properties configuration for Tomcat to prevent verbose bootstrap logging.

com.sun.xml.level = INFO
javax.xml.bind.level = INFO
org.apache.cxf.level = WARNING
org.ops4j.pax.logging.internal.Activator.level = WARNING
org.apache.aries.blueprint.level = WARNING

After this initial configuration, you can fine-tune the logging. Refer to the Configuring the Logging chapter in the Semarchy xDM Administration Guide.

Configuration File

The Semarchy xDM application deployed in a Tomcat server is configured using a semarchy.xml that you modify and provide as part of the application deployment process.

The configuration file used after deployment by a running application is located in the <tomcat>/conf/Catalina/localhost/ folder. This file may be modified directly on the server machine, which automatically triggers an application restart with the updated configuration. Note that if you un-deploy the application, this file is deleted and lost.

It is recommended to always keep a backup copy of the latest configuration file. You will need it in the event of a re-deployment, or when upgrading the Semarchy xDM application.

Sample configuration files for various database technologies, named semarchy-<database-name>.xml, are available in the mdm-server/ folder of the server installation files you downloaded. Use these configuration files to get started with your instance configuration.

Setting Up the Datasources

To configure the repository datasource:

Edit the semarchy.xml file.
In the <context> configuration element, locate the jdbc/SEMARCHY_REPOSITORY datasource and edit the following parameters:
- driverClassName:
  - Oracle: oracle.jdbc.OracleDriver
  - PostgreSQL: org.postgresql.Driver
  - SQL Server: com.microsoft.sqlserver.jdbc.SQLServerDriver
- url:
  - Oracle: jdbc:oracle:thin:@<oracle_instance_hostname>:<oracle_listener_port>:<oracle_SID_name>
  - PostgreSQL: jdbc:postgresql://<postgresql_hostname>:<postgresql_port>/<postgresql_database_name>
  - SQL Server: jdbc:sqlserver://<sqlserver_hostname>:<sqlserver_port>;databaseName=<repository_database_name>;
    or jdbc:sqlserver://<sqlserver_hostname>;instanceName=<sqlserver_instancename>;databaseName=<repository_database_name>;
- username: <repository_user_name>
- password: <repository_user_password>
- If using PostgreSQL or SQL Server, replace the validationQuery value by SELECT 1.
Save the semarchy.xml file.

Do not change the name of the SEMARCHY_REPOSITORY datasource. The application refers to a datasource with this name for the repository.
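
As an illustration of the steps above, an edited repository datasource for PostgreSQL might look like the following sketch. The host name, port, database name and credentials are placeholder values, and the attribute set shown is the usual one for a Tomcat JNDI datasource rather than a copy of the shipped sample, which may define additional pool settings. Start from the sample file provided with the installation and keep any attributes it already contains.

```xml
<!-- Hypothetical example: repository datasource on PostgreSQL.
     Host, port, database, user and password are placeholders. -->
<Resource name="jdbc/SEMARCHY_REPOSITORY"
          auth="Container"
          type="javax.sql.DataSource"
          driverClassName="org.postgresql.Driver"
          url="jdbc:postgresql://pg-host.example.com:5432/semarchy_repository"
          username="semarchy_repository"
          password="change_me"
          validationQuery="SELECT 1" />
```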

To configure a data location datasource:

Edit the semarchy.xml file.
In the <context> configuration element, copy and un-comment the datasource sample definition called jdbc/DATA_LOCATION_1.
Rename and edit the copy of the datasource settings with the following parameters:
- name: jdbc/<data_location_datasource_name>
- driverClassName:
  - Oracle: oracle.jdbc.OracleDriver
  - PostgreSQL: org.postgresql.Driver
  - SQL Server: com.microsoft.sqlserver.jdbc.SQLServerDriver
- url:
  - Oracle: jdbc:oracle:thin:@<oracle_instance_hostname>:<oracle_listener_port>:<oracle_SID_name>
  - PostgreSQL: jdbc:postgresql://<postgresql_hostname>:<postgresql_port>/<postgresql_database_name>
  - SQL Server: jdbc:sqlserver://<sqlserver_hostname>:<sqlserver_port>;databaseName=<data_location_database_name>;
    or jdbc:sqlserver://<sqlserver_hostname>;instanceName=<sqlserver_instancename>;databaseName=<data_location_database_name>;
- username: <data_location_user_name>
- password: <data_location_user_password>
- If using PostgreSQL or SQL Server, replace the validationQuery value by SELECT 1.
Repeat the two previous steps for each data location’s datasource.
Save the semarchy.xml file.

The repository and data location datasources are now configured, pointing to the storage previously created.
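
For comparison with the repository datasource, a data location datasource on SQL Server could be sketched as follows. The datasource name jdbc/CUSTOMER_HUB, the host, the database name and the credentials are hypothetical values chosen for the example, and the shipped jdbc/DATA_LOCATION_1 sample may carry extra pool attributes that should be kept.

```xml
<!-- Hypothetical example: data location datasource on SQL Server.
     The name after "jdbc/" is chosen by you; all values are placeholders. -->
<Resource name="jdbc/CUSTOMER_HUB"
          auth="Container"
          type="javax.sql.DataSource"
          driverClassName="com.microsoft.sqlserver.jdbc.SQLServerDriver"
          url="jdbc:sqlserver://sql-host.example.com:1433;databaseName=customer_hub;"
          username="customer_hub_user"
          password="change_me"
          validationQuery="SELECT 1" />
```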

To configure a datasource for xDM Dashboards or xDM Discovery, use the same configuration steps that you used to configure a data location datasource.

Configuring JavaMail Session

This configuration is required for mail notifications using JEE Mail Session.

To configure JavaMail Session:

Edit the semarchy.xml file.
In the <context> configuration element add the entry given below and then save the semarchy.xml file. Change the entry below to match your SMTP server configuration. See the SMTP package documentation for a description of the properties.

<Resource name="mail/Session" auth="Container" type="javax.mail.Session"
 mail.smtp.host="<mail_server_host>"
 mail.smtp.port="<mail_server_port>"
 mail.smtp.user="<mail_user_name>"
 mail.transport.protocol="smtp"
 password="<mail_user_password>"
 mail.smtp.auth="true" />
 <!-- Add the following to the configuration in case of a SASL Authenticator error:
 mail.smtp.socketFactory.class="javax.net.ssl.SSLSocketFactory"
 mail.smtp.socketFactory.port="<mail_server_port>"
 mail.smtp.socketFactory.fallback="false"
 -->

Gmail users may have to allow access to "Less Secure Apps" if they face connection issues.

Deploying the Application

Deploying Passive Instances
The rest of the process describes the deployment of an active instance of Semarchy xDM. The process for deploying a passive instance in a high-availability/load-balancing setup is similar. For passive instances, rename semarchy-passive.war to semarchy.war and deploy it using the instructions below.
See Configuring Semarchy xDM for High-Availability for more information about active and passive instances.

To deploy the application:

Upload the semarchy.war file and the semarchy.xml configuration file to a temporary directory on the Tomcat server machine, for example /temp/.
Connect to the Apache Tomcat Manager (http://<tomcat_host>:<tomcat_port>/manager/).
In the Deploy directory or WAR file located on the server section:
- Enter the Context Path for this deployment. This context defines the URL to the deployed application: http://<application_server_host>:<application_server_port>/<context>/.
  Use for example semarchy for the context.
- In the XML Configuration file URL, enter the path to the configuration file in the Tomcat server, for example: /temp/semarchy.xml
- In the WAR or Directory URL, enter the path to the war file in the Tomcat server, for example: /temp/semarchy.war
Click the Deploy button.

The Semarchy xDM application is deployed in the server.

Testing the Application

To test the application:

Open a web browser.
In the URL, enter: http://<tomcat_host>:<tomcat_port>/<context>/.

The Semarchy xDM Login page appears.

Proceed directly to the Installing the Repository task.

Delegating Authentication and Authorization in Tomcat

In its default configuration, Tomcat stores users, passwords and roles on the server’s filesystem, in the <tomcat>/conf/tomcat-users.xml file.

This section provides advanced configuration information for delegating authentication and authorization in Apache Tomcat:

Delegating authentication lets an external system confirm who the user is (check the username and password). This system is, for example, your LDAP or Active Directory, or an external provider such as Google, Yahoo, OKTA, etc. The authenticating system may or may not provide the roles (or authorizations) of a given user.
Delegating authorization lets an external system provide the roles of an authenticated user. These roles must match those defined in Semarchy xDM to define what the user is authorized to do.

The following sections detail configuration aspects for:

LDAP, which also applies to Active Directory, to perform both authentication and authorization.
OpenID and OpenID Connect, which apply to providers such as Google, Yahoo, Microsoft Azure AD, Okta, etc., to perform authentication and possibly authorization with the ID Token. When only authentication is handled by the provider, it is typically complemented by an authorization provider such as LDAP.
Windows Authentication to enable passing your Windows user credentials to the Semarchy xDM session.
Role Mapping, to convert roles or groups returned by the authorization system into roles suitable for Semarchy xDM.
Using the Tomcat Wrapper Realm, to simplify the configuration of authentication and authorization.

A strong knowledge of Tomcat concepts and options is required to configure authentication and authorization according to each situation. Please read carefully the following documents to learn about these subjects:

LDAP

Semarchy xDM supports authentication as well as role retrieval from an external directory, such as LDAP or Active Directory. The information entered in the login form is passed to the external directory, which returns this user’s roles if the user is valid.

To delegate the authentication to an LDAP directory, add to the semarchy.xml configuration file a Tomcat JNDI Realm definition as shown in the example below. This configuration must be customized to match your LDAP directory configuration.

Sample Configuration: LDAP authentication and authorization

<Realm className="org.apache.catalina.realm.JNDIRealm"
	connectionURL="ldap://ldaphost.mydomain.com:389"		(1)
	userPattern="uid={0},ou=users,ou=people,dc=myCompany,dc=com"	(2)
	roleBase="ou=groups,ou=people,dc=myCompany,dc=com"		(3)
	roleName="cn"							(3)
	roleSearch="(member={0})" />					(3)

The parameters of the realm must be customized to your configuration:

1	`connectionURL`: Connection URL to the LDAP server
2	Users log in to Semarchy xDM with their LDAP UID. The password passed in the login form must be the one of the user found in the LDAP tree using the `userPattern`.
3	Roles are searched in the `roleBase` point in the LDAP tree. `roleSearch` defines the LDAP search filter used to search roles attached to a username (represented by `{0}`). The roles returned are the attribute identified by `roleName`.
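
Active Directory installations often cannot use a fixed userPattern, because users are spread over several organizational units and log in with their account name. A possible variant, given here only as a sketch, searches for the user instead. The attributes used (connectionName, connectionPassword, userBase, userSearch, userSubtree, roleSubtree) are standard Tomcat JNDI Realm attributes, but the host, the service account and the directory tree below are hypothetical and must be replaced with your own.

```xml
<!-- Hypothetical Active Directory variant of the JNDI Realm.
     Host, bind account and DNs are placeholders. -->
<Realm className="org.apache.catalina.realm.JNDIRealm"
       connectionURL="ldap://adhost.mydomain.com:389"
       connectionName="cn=svc-semarchy,ou=services,dc=myCompany,dc=com"
       connectionPassword="change_me"
       userBase="ou=users,dc=myCompany,dc=com"
       userSearch="(sAMAccountName={0})"
       userSubtree="true"
       roleBase="ou=groups,dc=myCompany,dc=com"
       roleName="cn"
       roleSearch="(member={0})"
       roleSubtree="true" />
```

With this form, the realm first binds with the service account, locates the user entry whose sAMAccountName matches the login, then authenticates by binding as that user.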

The JNDI Realm is one of several types of realms supported by Tomcat. Other frequently used realms include the Datasource Realm, which uses database storage, and the UserDatabase Realm, which uses file storage.
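
As an example of one of these alternatives, a Datasource Realm could be declared along the following lines. This is a sketch only: the datasource name jdbc/AUTH_DB and the table and column names are hypothetical, and the datasource itself would have to be declared as a Resource in the same configuration file (hence localDataSource="true").

```xml
<!-- Hypothetical Datasource Realm reading users and roles from database tables.
     Datasource, table and column names are placeholders. -->
<Realm className="org.apache.catalina.realm.DataSourceRealm"
       dataSourceName="jdbc/AUTH_DB"
       localDataSource="true"
       userTable="app_users"
       userNameCol="user_name"
       userCredCol="user_password"
       userRoleTable="app_user_roles"
       roleNameCol="role_name" />
```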

OpenID and OpenID Connect

OpenID is a mechanism used to delegate authentication to another provider. With OpenID configured, you can use Google, Yahoo, Facebook or Twitter accounts to authenticate to Semarchy xDM, instead of or in addition to the login form.

There are two main "implementations" of OpenID supported by the authentication providers: OpenID 2.0 and OpenID Connect.

The next sections explain how to configure OpenID 2.0 or OpenID Connect for Tomcat, and how to mix these authentication schemes with a Login Form.

Configuring OpenID 2.0

OpenID is used as an authentication service, which does not include the roles. A realm (LDAP, File, or other) is usually defined to manage and serve the roles of the users identified with OpenID.

Configuring OpenID 2.0 with Tomcat requires the Open ID Authenticator for Tomcat, which extends Tomcat authentication to support OpenID 2.0.

To configure authentication with OpenID:

Download the latest version of the Open ID Authenticator for Tomcat.
Copy the JAR file to the <tomcat>/lib directory.
Edit the semarchy.xml configuration file to define the authentication.

OpenID 2.0 Authentication and LDAP Authorization

In this configuration, you define:

An OpenIDAuthenticator valve for the authentication.
An OpenIDRealm, which contains a JNDI sub-realm for the authorizations.

Sample Configuration: OpenID (Yahoo) authentication, LDAP authorization

<!-- OpenID Valve configuration for Authentication -->
<Valve className="org.bsworks.catalina.authenticator.openid.OpenIDAuthenticator"
	singleProviderURI="https://me.yahoo.com"
	loginNameAttributeType="http://axschema.org/contact/email"
	allowedClaimedIDPattern="https://me.yahoo.com/a/.+"
	hostBaseURI="http://mdm_host:port" /> 	(1)

<!--Realms configuration -->
<Realm className="org.bsworks.catalina.authenticator.openid.OpenIDRealm">
	<!--
	This realm provides authorizations for the OpenID-authenticated users.
	It uses LDAP with each username equal to the OpenID username.
	Any Tomcat realm can be used for this purpose.
	-->
	<Realm  className="org.apache.catalina.realm.JNDIRealm"
		connectionURL="ldap://ldaphost.mydomain.com:389"			(2)
		userPattern="uid={0},ou=users,ou=people,dc=myCompany,dc=com"	(3)
		userPassword="uid"						(3)
		roleBase="ou=groups,ou=people,dc=myCompany,dc=com"
		roleName="cn"
		roleSearch="(member={0})" />
</Realm>

The parameters of the valve and realm must be customized to your configuration:

1	The OpenID connector will return to the original server after authentication. In most cases (when the server uses HTTPS), this mechanism works. If the server does not use HTTPS, it is recommended to set the host and port of the Semarchy xDM instance in `hostBaseURI`.
2	This LDAP host corresponds to the LDAP server containing your user and roles mappings.
3	LDAP Realms configuration. See the Tomcat JNDI Realm Documentation for more information. Note that in this realm configuration, the `userPassword` must point to the LDAP `uid` (user id). The LDAP user ID must be equal to the OpenID user ID (in that case, the email).

OpenID 2.0 and Form Authentication, LDAP Authorization

Mixing OpenID 2.0 with form authentication makes it possible to log in using either an OpenID account or a login form. Authentication via the login form is made against an authentication provider, for example an LDAP directory or a database of users stored on the application server’s disk.

Sample Configuration: OpenID (Yahoo) + Login Form (LDAP) authentication, LDAP authorization

<!-- OpenID Valve configuration for Authentication -->
<Valve className="org.bsworks.catalina.authenticator.openid.OpenIDAuthenticator"
	loginNameAttributeType="http://axschema.org/contact/email"
	singleProviderURI="https://me.yahoo.com"
	allowedClaimedIDPattern="https://me.yahoo.com/a/.+"
	hostBaseURI="http://mdm_host:myport" />

<!--
	Login form configuration, allowing to display a OpenID authentication button
	pointing to OpenID Provider (Yahoo) given by parameter value
-->
<Parameter name="SingleSignOn"
           value="openid_identifier=https://me.yahoo.com"
           override="true" />

<!--Realms configuration -->
<Realm className="org.bsworks.catalina.authenticator.openid.OpenIDRealm">
	<!--
	The first realm provides authorizations for the OpenID-authenticated users.
	This realm uses LDAP with each username equal to the OpenID username.
	Any Tomcat realm can be used for this purpose.
	-->
	<Realm  className="org.apache.catalina.realm.JNDIRealm"
		connectionURL="ldap://ldaphost.mydomain.com:389"
		userPattern="uid={0},ou=users,ou=people,dc=myCompany,dc=com"
		userPassword="uid"
		roleBase="ou=groups,ou=people,dc=myCompany,dc=com"
		roleName="cn"
		roleSearch="(member={0})" />

	<!--
	The second realm (and subsequent ones) are used as fallbacks for
	authentication and authorization.
	Local realms can be used as local authentication providers,
	or fallback providers if the OpenID authentication fails.
	-->
	<Realm 	className="org.apache.catalina.realm.JNDIRealm"
		connectionURL="ldap://ldaphost.mydomain.com:389"
		userPattern="uid={0},ou=users,ou=people,dc=myCompany,dc=com"
		roleBase="ou=groups,ou=people,dc=myCompany,dc=com"
		roleName="cn"
		roleSearch="(member={0})" />

</Realm>

Configuring OpenID Connect

OpenID Connect is a newer implementation, and authentication providers are moving over time from OpenID 2.0 to OpenID Connect. It is supported, for example, by Google, Amazon Cognito, Microsoft Azure AD, ADFS, and Okta.

Configuring OpenID Connect with Tomcat requires the OpenID Connect Authenticator for Tomcat, which extends Tomcat authentication to support OpenID Connect.

To configure authentication with OpenID Connect:

Download the latest version of the OpenID Connect Authenticator for Tomcat. The file is named tomcat8-oidcauth-<version_number>.jar.
Copy the JAR file to the <tomcat>/lib directory.
Edit the semarchy.xml configuration file to define the authentication.

OpenID Connect Authentication & LDAP Authorization

In this configuration, you define:

An OpenIDConnectAuthenticator valve for the authentication. This valve defines the various OpenID Connect providers in a providers JSON payload.
A realm for the authorizations (roles) of the authenticated users.

Sample Configuration: OpenID Connect (Google) authentication

<!-- Valve configuration for OpenID Connect authentication -->

<Valve className="org.bsworks.catalina.authenticator.oidc.tomcat85.OpenIDConnectAuthenticator"
	providers="[{				(1)
		name: Google,
		issuer: https://accounts.google.com,
		clientId:     xxxxx,	(2)
		clientSecret: xxxxxx	(2)
	}]"
	additionalScopes="email"	(3)
	usernameClaim="email"		(4)
	noForm="true" 			(5)
	hostBaseURI="http://mdm_host:port" (6)
	landingPage="/"/>
<!-- Optional attribute to add to the valve if the provider requires it: logoutUrl="http://myhost:myport/semarchy/logout.do" --> (7)

The parameters of the valve must be customized to your configuration:

1	The list of OpenID Connect providers. In this example, only Google is defined.
2	The Google Client ID and Google Client Secret are retrieved from the Google Developer Console. Refer to the Google Identity Platform site for detailed setup instructions.
3	The `additionalScopes` property lists the additional scopes to request from the provider.
4	The `usernameClaim` property indicates which of the claims returned by the provider should be used as a user id when looking for the authorizations.
5	This property indicates whether the login form should be available or not, in addition to the OpenID Connect providers.
6	The OpenID Connect connector will return to the original server after authentication. In most cases (when the server uses HTTPS), this mechanism works. If the server does not use HTTPS, it is recommended to set the host and port of the Semarchy xDM instance in `hostBaseURI`. You must declare this redirect URI in the Google Developer Console. See the Google Identity Platform site for detailed instructions.
7	Optional Logout URL. Specify this URL as required by the OpenID Connect provider.

Sample Configuration: LDAP authorization

<!--
The following LDAP realm provides the authorizations of the OpenID Connect-authenticated users.
Any Tomcat realm can be used for this purpose.
-->
<Realm className="org.apache.catalina.realm.JNDIRealm"	(1)
	connectionURL="ldap://ldaphost.mydomain.com:389"
	userPattern="uid={0},ou=users,ou=people,dc=myCompany,dc=com"
	roleBase="ou=groups,ou=people,dc=myCompany,dc=com"
	roleName="cn"
	roleSearch="(member={0})">

	<CredentialHandler className="com.semarchy.tool.jee.tomcat.CaseInsensitiveCredentialHandler"/> (2)
</Realm>

The parameters of the realm must be customized to your configuration:

1 This JNDI Realm (LDAP Server) contains the roles of the authenticated users. Roles are searched in the roleBase point in the LDAP tree. roleSearch defines the LDAP search filter used to search roles attached to a user id (represented by {0}). The roles returned are the attribute identified by roleName.
Note that the LDAP user id, referred to with {0}, must be equal to the value requested from the OpenID provider via the usernameClaim (in that case, the email).

2 This optional credential handler is configured in the realm to automatically handle the possible case-sensitivity differences between the authentication provider and the authorization provider. It may be used in other realm configurations.

For the list of all parameters and sample configurations for all providers, refer to the OpenID Connect Authenticator for Tomcat Documentation.

OpenID Connect Authentication & Roles from the ID Token

Semarchy xDM provides an extension to the OpenID Connect Authenticator for Tomcat that supports reading roles from the ID Token. This token is a JSON Web Token (JWT) returned after a successful authentication, with user profile information (user’s name, email, roles, etc.), represented in the form of claims.

If the OpenID Provider supports it, this configuration allows reading roles served directly by the OpenID provider.

To configure authentication with OpenID Connect and authorization with the ID Token:

Make sure that you have the latest version of the OpenID Connect Authenticator for Tomcat in the <tomcat>/lib directory.
Copy the temp/mdm-server/additional-libraries/com.semarchy.tool.jee.tomcat-<tomcat_version>.jar file to the <tomcat>/lib directory. This file is available from the semarchy-xdm-install-xxx.zip archive file you downloaded.
Edit the semarchy.xml configuration file to define the authentication.

In this configuration, you define:

An OpenIDConnectAuthenticator valve for the authentication. Note that this valve is based on the com.semarchy.tool.jee.tomcat.OpenIdConnectAuthenticator class, which supports the ID Token.
A realm using the com.semarchy.tool.jee.tomcat.OpenIdConnectRealm to use the roles extracted from the ID Token by the valve.

Sample Configuration: OpenID Connect (Microsoft Azure AD), Roles in the ID Token

<Valve className="com.semarchy.tool.jee.tomcat.OpenIdConnectAuthenticator"
	providers="[{		(1)
		name: 'Microsoft Azure AD',
		issuer:       xxxx,	(2)
		clientId:     xxxx,	(2)
		clientSecret: xxxxx (2)
	}]"
	usernameClaim="email"	(3)
	additionalScopes="email groups" (4)
	hostBaseURI="http://myhost:myport"
	noForm="true"
	groupClaim="groups" (5)
	groupSeparator=","

	roleMappingEnabled="true" (6)
	keepMappedRoles="false"
	keepUnmappedRoles="false"
	regexEnabled="true"

	landingPage="/"
	/>

<!-- The role mapping attributes (roleMappingEnabled to regexEnabled) are optional. -->
<!-- Optional attribute to add to the valve if the provider requires it: logoutUrl="http://myhost:myport/semarchy/logout.do" --> (7)

<!-- Realm using the roles extracted from the ID Token -->
<Realm className="com.semarchy.tool.jee.tomcat.OpenIdConnectRealm" />

1	The list of OpenID Connect providers. In this example, only Microsoft Azure AD is defined.
2	These parameters are configured in Azure AD. Refer to the OpenID Connect and Azure Active Directory documentation for more information.
3	The `usernameClaim` corresponds to the claim in the ID Token containing the user name.
4	The `additionalScopes` is a space-separated list of scopes that are requested for the ID Token. Note that one of them is groups.
5	The `groupClaim` property tells the valve which of the additionalScopes (in that case, groups) contains the list of groups. This list is split using the `groupSeparator` character.
6	Each group from the `groupClaim` list is optionally processed by the Semarchy xDM role mapper if this property is set to true, to create a list of roles meaningful for Semarchy xDM. See Using the Tomcat Role Mapper for more information about these parameters.
7	Optional Logout URL. Specify this URL as required by the OpenID Connect provider.

OpenID Connect and Form Authentication

Mixing OpenID Connect with form authentication makes it possible to log in using either an OpenID Connect account or a login form. Authentication via the login form is made against a realm, for example an LDAP directory or a database of users stored on the application server’s disk.

To configure OpenID Connect and Form authentication:

Make sure that you have the latest version of the OpenID Connect Authenticator for Tomcat in the <tomcat>/lib directory.
Copy the temp/mdm-server/additional-libraries/com.semarchy.tool.jee.tomcat-<tomcat_version>.jar file to the <tomcat>/lib directory. This file is available from the semarchy-xdm-install-xxx.zip archive file you downloaded.
Edit the semarchy.xml configuration file to define the authentication.

In this configuration, you define:

An OpenIDConnectAuthenticator valve for the authentication. This valve defines the various OpenID Connect providers in the providers JSON payload. Make sure to set noForm="false" in this configuration to enable the form.
A Tomcat Realm (in this example, a UserDatabase Realm stored in a server file) to authenticate users using the login form, and authorize all users.

Sample Configuration: OpenID Connect (Google) + Login Form and authorizations in a UserDatabase

<!-- Valve configuration for mixed OpenID Connect authentication -->

<Valve className="org.bsworks.catalina.authenticator.oidc.tomcat85.OpenIDConnectAuthenticator"
	providers="[{				(1)
		name: Google,
		issuer: https://accounts.google.com,
		clientId:     xxxxx,
		clientSecret: xxxxxx
	}]"
	usernameClaim="email"
	additionalScopes="email"
	noForm="false" 			(2)
	hostBaseURI="http://mdm_host:port"
	landingPage="/"/>

<!--
	This realm provides authentication for form-authenticated users as well as authorization
	for OpenID Connect and form-authenticated users.
	These users are stored in an "OpenIDDatabase" file resource, declared in server.xml.
	This file contains the usernames and roles of all users. Only users authenticated by form
	will use the passwords stored in this file.
-->
	<Realm className="org.apache.catalina.realm.UserDatabaseRealm" resourceName="OpenIDDatabase"/>

1	The list of OpenID Connect providers. In this example, only Google is defined. See OpenID Connect Authentication & LDAP Authorization for a more detailed example.
2	This property indicates that the login form should be available in addition to the OpenID Connect providers. If you set it to true, there is no login form. However, the `UserDatabaseRealm` will still be used for the OpenID Connect users’ authorizations.

Configuration in <tomcat>/conf/server.xml to declare the file resource containing the user database.

<!--
The resource named "OpenIDDatabase" used in the realm configuration
must be declared as the /conf/openid-users.xml file stored in the
application server's file system.
-->
<Resource name="OpenIDDatabase" auth="Container"
          type="org.apache.catalina.UserDatabase"
          description="User database"
          factory="org.apache.catalina.users.MemoryUserDatabaseFactory"
          pathname="conf/openid-users.xml" />

Sample /conf/openid-users.xml file containing users, passwords and roles.

<user username="john.doe@mydomain.com" password="xxxx" roles="semarchyConnect,businessUser"/>
<user username="local_admin" password="xxxx" roles="semarchyConnect,semarchyAdmin"/>

Automatic User Profile Seeding

The first time a user logs in using OpenID Connect, their profile is automatically seeded with the information that the OpenID Connect provider returns, according to the standard, in the UserInfo endpoint response.

This information includes the following profile properties: email, first name, last name, picture, primary phone, address, city, postal code, country, language and time zone.

Depending on the provider, you may need to add the `profile phone address` scopes to the additionalScopes parameter of the valve so that this information is returned by the UserInfo endpoint.

Profile information that does not exist in the standard claims may also be seeded in the user profile using the following custom claims that should be returned by the provider in the UserInfo response:

Profile property	Custom claim name
Company Name	`https://semarchy.com/company_name`
Job Title	`https://semarchy.com/job_title`
Department	`https://semarchy.com/departement`
Secondary Phone	`https://semarchy.com/secondary_phone`
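
The seeding described above can be pictured as a mapping from UserInfo claims to profile properties. The following is a minimal sketch, not the actual Semarchy xDM implementation: it assumes the standard OpenID Connect claim names (given_name, family_name, phone_number, the address object, locale, zoneinfo), and the seed_profile function and the profile keys are hypothetical names chosen for illustration.

```python
# Illustrative sketch only: seed_profile and the profile keys are hypothetical.
# The standard claim names are those of the OpenID Connect Core specification.

STANDARD_CLAIMS = {
    "email": "email",
    "given_name": "first_name",
    "family_name": "last_name",
    "picture": "picture",
    "phone_number": "primary_phone",
    "locale": "language",
    "zoneinfo": "time_zone",
}

# The standard "address" claim is a JSON object with its own members.
ADDRESS_MEMBERS = {
    "street_address": "address",
    "locality": "city",
    "postal_code": "postal_code",
    "country": "country",
}

# Custom claims listed in the table above.
CUSTOM_CLAIMS = {
    "https://semarchy.com/company_name": "company_name",
    "https://semarchy.com/job_title": "job_title",
    "https://semarchy.com/departement": "department",
    "https://semarchy.com/secondary_phone": "secondary_phone",
}


def seed_profile(userinfo):
    """Build a profile dictionary from a UserInfo response given as a dict."""
    profile = {}
    for claim, prop in {**STANDARD_CLAIMS, **CUSTOM_CLAIMS}.items():
        if claim in userinfo:
            profile[prop] = userinfo[claim]
    address = userinfo.get("address")
    if isinstance(address, dict):
        for member, prop in ADDRESS_MEMBERS.items():
            if member in address:
                profile[prop] = address[member]
    return profile


example_userinfo = {
    "email": "john.doe@mydomain.com",
    "given_name": "John",
    "family_name": "Doe",
    "address": {"locality": "Lyon", "country": "FR"},
    "https://semarchy.com/company_name": "My Company",
}
print(seed_profile(example_userinfo))
```

Claims that the provider does not return are simply left out of the resulting profile, which is why the scopes requested by the valve matter.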

Windows Authentication Using Waffle

Windows Authentication (SSO) is supported in Tomcat using the Waffle (Windows Authentication Framework) component.
Using this mechanism, the user logged in to the Windows machine is used to authenticate to Semarchy xDM.

The following configuration only works when the machine running the Semarchy xDM Server is a Windows server joined to the domain you want to use for authentication.

To configure Windows Authentication:

Download and uncompress Waffle.
Copy the following files from the Waffle archive to the $TOMCAT_HOME/lib directory:
- waffle-tomcat(version)-*.jar corresponding to your Tomcat version.
- waffle-jna-*.jar
- Other non-Waffle libraries: caffeine-*.jar, jna-*.jar, jna-platform-*.jar, logback-core-*.jar, logback-classic-*.jar, slf4j-api-*.jar, jcl-over-slf4j-*.jar.
Copy the temp/mdm-server/additional-libraries/com.semarchy.tool.jee.tomcat-<tomcat_version>.jar file to the $TOMCAT_HOME/lib directory.
Edit the semarchy.xml file and add the valve and realm configuration and then restart the server.

To enable Windows authentication, use the following configuration.

Sample Configuration: Windows authentication and authorization

<Valve className="com.semarchy.tool.jee.tomcat.RoleMappingNegotiateAuthenticator"
	principalFormat="fqn"
	roleFormat="fqn"
	protocols="NTLM"
	/>

<Realm className="waffle.apache.WindowsRealm" />

To mix Login Form and Windows authentication use the following configuration:

Sample Configuration: Windows + Login Form Authentication

<Parameter name="SingleSignOn"
           value="action=j_negotiate_check"
           override="true" />

<Valve className="com.semarchy.tool.jee.tomcat.RoleMappingMixedAuthenticator"
	principalFormat="fqn"
	roleFormat="fqn"
	protocols="NTLM"
	/>

<Realm className="waffle.apache.WindowsRealm" />

Both these configurations use specific extensions to the Waffle authenticator valves in order to support role mapping. You can configure a role mapping file as described in the Using the Tomcat Role Mapper section. If you do not need the role mapping, you can use the Waffle NegotiateAuthenticator or MixedAuthenticator valves instead.

These configurations are valid for Waffle versions starting with version 1.7. For versions before 1.7, replace the protocols="NTLM" attribute with disableNegociate="true".

Browser Configuration

Browsers must be specifically configured to work with Windows Authentication:

Edge: The Tomcat server must be considered an intranet host.
Google Chrome: The configuration is shared with Edge.
Firefox:
1. Open a new Tab.
2. Enter about:config in the address bar.
3. Enter network.negotiate-auth.trusted-uris in the Filter box.
4. Enter the Semarchy xDM server name as the value. If you have multiple servers, enter a comma-separated list.
5. Close the tab.

Using the Tomcat Role Mapper

When the service providing the authorizations returns groups of users (if it does not support the concept of Roles) or role names that cannot exactly match the roles declared in Semarchy xDM (for example, if they include spaces or special characters), you have to configure a mapping between these role names and the Semarchy xDM role names.

Semarchy xDM provides a specific component for this purpose, called the Role Mapper.

To configure the Role Mapper:

Copy the temp/mdm-server/additional-libraries/com.semarchy.tool.jee.tomcat-<tomcat_version>.jar file to the $TOMCAT_HOME/lib directory.
Create a $TOMCAT_HOME/conf/roles-mapping.properties file to define the role mappings.
Edit the semarchy.xml file and add a wrapper realm around the realm for which you want to perform role mapping, as shown below.

<Realm className="com.semarchy.tool.jee.tomcat.RoleMappingRealm">
	<!-- This is the Realm to which role mapping is applied. -->
	<Realm className="org.apache.catalina.realm.JNDIRealm"
	...
	/>
</Realm>

Role mapping is natively supported in the following cases:

With Windows Authentication Using Waffle, the RoleMappingNegotiateAuthenticator and RoleMappingMixedAuthenticator valves natively support role mapping and do not require additional realm configuration.
With OpenID Connect Authentication & Roles from the ID Token, the role mapping is configured as part of the OpenIDConnectAuthenticator valve configuration.

Role Mapping File Format

The role mappings are stored in a file that contains one line per role mapping, in the following format:

<directory_group>=<semarchy_role_1>,<semarchy_role_2>,...

Group names may contain spaces. In this case, keep the name as is and do NOT enclose it in single or double quotes.

Examples of Role Mapping

AdministratorsGroup=semarchyConnect,semarchyAdmin
UsersGroup=semarchyConnect
Data Stewards=demoDataStewards

In certain cases with Microsoft Active Directory, group names containing special characters such as spaces or backslashes require that you replace these characters with their Unicode escape sequences, as shown below:

Examples of Role Mapping with escaped spaces (\u0020) and backslashes (\u005C)

MDM\u0020Users=semarchyConnect
Global\u005CStewards=semarchySteward
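
To make the file format concrete, the sketch below reads mapping lines in Python. It is only an illustration of the format described above, under the assumptions that the group name is everything before the first = sign and that the escape sequences stand for single characters. The parse_mapping_line and decode_unicode_escapes names are hypothetical and are not part of Semarchy xDM.

```python
import re


def decode_unicode_escapes(text):
    """Replace escape sequences (backslash, u, four hex digits) with the character they denote."""
    return re.sub(
        r"\\u([0-9a-fA-F]{4})",
        lambda match: chr(int(match.group(1), 16)),
        text,
    )


def parse_mapping_line(line):
    """Split '<directory_group>=<role_1>,<role_2>' into (group, [roles])."""
    group, _, roles = line.partition("=")
    role_list = [role.strip() for role in roles.split(",") if role.strip()]
    return decode_unicode_escapes(group.strip()), role_list


# Lines taken from the examples above (raw strings keep the backslashes intact).
for line in (
    r"AdministratorsGroup=semarchyConnect,semarchyAdmin",
    r"Data Stewards=demoDataStewards",
    r"MDM\u0020Users=semarchyConnect",
    r"Global\u005CStewards=semarchySteward",
):
    print(parse_mapping_line(line))
```

Under these assumptions, the escaped and unescaped forms of a group name containing a space denote the same directory group.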

The role mapper can also use regular expressions with capture groups to convert input roles into different roles. This requires setting the role mapping realm’s regexEnabled attribute to true.

Example of a Role Mapping using Regular Expressions

SUPP_.* = Supplier 	(1)
SUPP_(.*) = $1		(2)
(.*) = $1		(3)

In the example above, an incoming role named `SUPP_Premium` is converted into the `Supplier`, `Premium`, and `SUPP_Premium` roles, due to the following rules:

1	If a role starts with `SUPP_`, then the `Supplier` role is mapped as the output.
2	If a role starts with `SUPP_`, then the rest of the role string captured as a group is returned as a role. For example: `SUPP_Premium` as an input role would return `Premium` as the output.
3	The entire input role is captured as a group and returned as an output.
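
The three rules can be replayed outside of Tomcat to check what a given input role produces. The sketch below is a simulation written for this guide, not the Role Mapper's code. It assumes that each pattern must match the whole input role, that whitespace around the = sign is ignored, and that every matching rule contributes one output role. The map_role name is hypothetical.

```python
import re

# Rules from the example above, as (pattern, replacement) pairs.
RULES = [
    (r"SUPP_.*", "Supplier"),
    (r"SUPP_(.*)", "$1"),
    (r"(.*)", "$1"),
]


def map_role(role, rules=RULES):
    """Return the output roles produced by every rule matching the whole input role."""
    outputs = []
    for pattern, replacement in rules:
        match = re.fullmatch(pattern, role)
        if match is None:
            continue
        # Substitute $1-style references with the text captured by the group.
        output = re.sub(
            r"\$(\d+)",
            lambda ref: match.group(int(ref.group(1))),
            replacement,
        )
        if output not in outputs:
            outputs.append(output)
    return outputs


print(map_role("SUPP_Premium"))  # ['Supplier', 'Premium', 'SUPP_Premium']
print(map_role("Auditor"))       # ['Auditor']
```

A role that does not start with `SUPP_` only matches the last rule, so it is returned unchanged.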

Advanced Role Mapping Configuration

You can use the following properties to configure the role mapper:

regexEnabled: Set to true to enable role replacement using regular expressions. This option defaults to "false".
keepMappedRoles: Set to true to preserve in the user roles list the original roles that have been mapped. Otherwise, these roles are replaced by their mapped value. This option defaults to "true". This option replaces the deprecated replaceRole option. If keepMappedRoles or keepUnmappedRoles is set, then replaceRole is ignored.
keepUnmappedRoles: Set to true to preserve in the user roles list the original roles that have not been mapped. Otherwise, these roles are removed from the list. This option defaults to "true".
rolesMappingPathName: Location of the role mapping file. By default, this file is located in $TOMCAT_HOME/conf/roles-mapping.properties.
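
The effect of keepMappedRoles and keepUnmappedRoles is easier to see on a small example. The sketch below models the behavior as described in the property list. It is an assumption-based illustration: the apply_mapping function is hypothetical, and the ordering and de-duplication of the resulting roles are choices made for this sketch, not documented behavior.

```python
def apply_mapping(user_roles, mappings, keep_mapped=True, keep_unmapped=True):
    """Return the roles of a user after role mapping.

    user_roles: roles or groups returned by the authorization service.
    mappings: dict of original role -> list of Semarchy xDM roles.
    """
    result = []

    def add(role):
        if role not in result:
            result.append(role)

    for role in user_roles:
        if role in mappings:
            if keep_mapped:
                add(role)  # original role preserved next to its mapped values
            for mapped in mappings[role]:
                add(mapped)
        elif keep_unmapped:
            add(role)  # original role without mapping is preserved
    return result


mappings = {"AdministratorsGroup": ["semarchyConnect", "semarchyAdmin"]}
roles = ["AdministratorsGroup", "Everyone"]

print(apply_mapping(roles, mappings))
print(apply_mapping(roles, mappings, keep_mapped=False, keep_unmapped=False))
```

With both options set to false, only the Semarchy xDM roles defined in the mapping file remain in the user's role list.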

Advanced Role Mapper configuration: Regular expressions are supported. All original roles are removed.

<Realm className="com.semarchy.tool.jee.tomcat.RoleMappingRealm"
       keepMappedRoles="false"
       keepUnmappedRoles="false"
       regexEnabled="true"
       rolesMappingPathName="/home/user/map.properties">
	<!-- The realm to which role mapping is applied is nested here. -->
	...
</Realm>

Using the Tomcat Wrapper Realm

To simplify the configuration of user authentication and authorizations in Apache Tomcat, Semarchy xDM comes with a Wrapper realm that supports two nested realms:

The first realm defines user authentication, checking that the login/password combination is valid and connecting the user.
The second realm defines how to retrieve user authorizations, that is, the list of roles associated with the connected user after authentication. This realm should contain only users defined with username = password, since it is only used for authorizations. In the example below, this realm uses a specific credential handler to automatically handle possible case-sensitivity differences between the authentication provider and the authorization provider.

Wrapper Realm configuration: Authenticate with LDAP, retrieve roles from a database

<Realm className="com.semarchy.tool.jee.tomcat.AuthAndRolesRealm">

    <!-- First realm is for authentication -->
    <Realm className="org.apache.catalina.realm.JNDIRealm"
           connectionURL="ldap://ldaphost.mydomain.com:389"
           userPattern="uid={0},ou=users,ou=people,dc=myCompany,dc=com"
           />

    <!-- Second realm is for authorizations -->
    <Realm className="org.apache.catalina.realm.JDBCRealm"
           driverName="oracle.jdbc.driver.OracleDriver"
           connectionURL="jdbc:oracle:thin:@dbserver:1521:ora11"
           userTable="users"
           userNameCol="user_name"
           userCredCol="user_name"
           userRoleTable="user_roles"
           roleNameCol="role_name">

        <!-- Credential handler for case-sensitivity differences -->
        <CredentialHandler className="com.semarchy.tool.jee.tomcat.CaseInsensitiveCredentialHandler"/>

    </Realm>
</Realm>

Configuring a JMS Destination in Tomcat

Semarchy xDM can use a JMS (Java Message Service) provider as a notification server, in order to send job completion notifications to other applications in the form of JMS messages.

In most cases, the resource definition is generic and follows the Tomcat generic guidelines.

Certain JMS servers do not precisely follow the JMS standard and require specific mechanisms and configuration to connect and access their JMS destinations. For these cases, Semarchy xDM includes a generic JNDI lookup factory to request JNDI connections and resources (Queue or Topic).

This component is available as a jar named com.semarchy.tool.jee.tomcat-<tomcat_version>.jar in your Semarchy xDM installation package.

Configuring JMS Using the JNDI Lookup Factory

To configure the JNDI Lookup Factory:

Copy the temp/mdm-server/additional-libraries/com.semarchy.tool.jee.tomcat-<tomcat_version>.jar file to the $TOMCAT_HOME/lib directory.
Copy the client libraries (.jar files) required for your JMS Server to the $TOMCAT_HOME/lib directory.
Edit the semarchy.xml file and add the following resource declarations:

Connection Factory resource definition

<!-- The Connection Factory encapsulates a set of connection configuration parameters.
     It is used to create a connection with the JMS provider. -->

<Resource name="jms/<connection_factory_name>"			(1)
	auth="Container"
	type="javax.jms.ConnectionFactory"
	factory="com.semarchy.tool.jee.tomcat.jndi.JndiLookupFactory"
	jndiKey="<connection_factory_jndi_location_in_provider>"(2)
	initialCtxFactory="<initial_context_factory>"		(3)
	providerUrl="<provider_url>"				(4)
	username = "<jms_server_login>"				(5)
	password = "<jms_server_password>"			(5)
/>
<!-- Depending on the JMS provider, you must provide additional parameters.
     For example:
     java.naming.security.protocol = "ssl"
-->

The Connection Factory resource definition uses the following parameters:

1	The Tomcat resource name for your JMS connection factory.
2	The location of the connection factory JNDI resource in the remote JNDI provider.
3	The initial context factory class, specific to the JNDI provider.
4	The URL of the JNDI provider.
5	The login and password required to access the JNDI resource, if required.

JMS Destination resource definition

<!-- A destination is a JMS Queue or Topic to send notifications to. -->

<Resource name="jms/<destination_name>" 			(1)
	auth="Container"
	type="javax.jms.Queue"					(2)
	factory="com.semarchy.tool.jee.tomcat.jndi.JndiLookupFactory"
	jndiKey="<destination_jndi_location_in_provider>" 	(3)
	initialCtxFactory="<initial_context_factory>" 		(4)
	providerUrl="<provider_url>" 				(5)
	username = "<jms_server_login>"				(6)
	password = "<jms_server_password>"			(6)
/>
<!-- Depending on the JMS provider, you must provide additional parameters.
     For example:
     java.naming.security.protocol = "ssl"
-->

The JMS destination resource definition uses the following parameters:

1	The resource name of your JMS destination in Tomcat.
2	Type of the JMS destination. Can be a Queue or Topic.
3	The location of the JMS destination JNDI resource in the remote JNDI provider.
4	The initial context factory class, specific to the JNDI provider.
5	The URL of the JNDI provider.
6	The login and password required to access the JNDI resource, if required.

Using JMS Destinations from the JNDI Lookup Factory

When configuring the Notification Server and Job Notification Policy, in order to use the JMS destination defined above:

Use the jms/<connection_factory_name> value in the notification server Connection Factory URL property.
Use the jms/<destination_name> value in the job notification JMS Destination property.

Both values must be prefixed with java:comp/env/ when used.
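
As an illustration of the prefix rule, the short sketch below builds the values to enter from Tomcat resource names. The resource names jms/xdmConnectionFactory and jms/xdmNotifications are hypothetical examples, and the jndi_name helper is not part of the product.

```python
JNDI_PREFIX = "java:comp/env/"


def jndi_name(resource_name):
    """Return the resource name prefixed with java:comp/env/ (added only once)."""
    if resource_name.startswith(JNDI_PREFIX):
        return resource_name
    return JNDI_PREFIX + resource_name


# Hypothetical resource names declared in semarchy.xml.
print(jndi_name("jms/xdmConnectionFactory"))  # java:comp/env/jms/xdmConnectionFactory
print(jndi_name("jms/xdmNotifications"))      # java:comp/env/jms/xdmNotifications
```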

You can configure the look and feel of the Semarchy xDM login, logout and error pages.

The following parameters can be set in the semarchy.xml file to configure these pages:

SingleSignOnButtonLabel: Label of the SSO provider shown on the SSO login button. For example, "Google", "Facebook", etc. This label is prefixed with a localized version of "Log in with ". This parameter is optional and defaults to SSO. If you have configured multiple providers, you can provide a comma-separated list of provider labels.
SingleSignOnButtonIcon: URL of the icon representing the provider on the single sign-on button. Use an HTTP URL to a 24x24px PNG file. Defaults to account.png. The application also includes facebook-box.png, google.png, google-plus.png and twitter.png icons, so you can refer to them directly. If you have configured multiple providers, you can provide a comma-separated list of provider icons.
SingleSignOnButtonColor: CSS background color of the SSO button. If you have configured multiple providers, you can provide a comma-separated list of color codes.
SignOnImageURL: Image that appears in the banner of the login/logout/error pages. Should be a 300x100px PNG file. If not specified, the Semarchy xDM logo is used.
SignOnBottomMessage: HTML text that appears at the bottom of the page (under the Log in button).

<!-- Enable SSO with Google along with the login form -->
<Parameter name="SingleSignOn" value="openid_identifier=https://www.google.com/accounts/o8/id" override="true"/>

<!-- Google Login button branding -->
<Parameter name="SingleSignOnButtonLabel" value="Google" />
<Parameter name="SingleSignOnButtonIcon" value="google.png" />
<Parameter name="SingleSignOnButtonColor" value="#4885ed" />

<!-- Login/Logout/Error pages image banner and footer text -->
<Parameter name="SignOnImageURL" value="http://localhost:80/staticAssets/banner.png" />
<Parameter name="SignOnBottomMessage" value="Please use this application &lt;b&gt;wisely&lt;/b&gt;" />

Deploying and Configuring with JBoss/WildFly

This section explains how to configure and deploy the Semarchy xDM Server with WildFly Application Server (formerly JBoss AS).

In this section, <wildfly_home> refers to the WildFly server installation folder.

Refer to the WildFly Documentation for your WildFly version for more details about the deployment and configuration processes in WildFly.

Installing Additional Libraries

Installing the Database JDBC Driver

Install the JDBC drivers to connect the repository and data location databases, as well as the additional drivers required for the datasources accessed by xDM Dashboards or xDM Discovery.

To install the JDBC drivers:

Copy the appropriate database driver file from temp/mdm-server/additional-libraries/ to the <wildfly_home>/standalone/deployments directory.
Copy additional drivers to the same directory.

Configuring the Security

The configuration in this section uses the UsersRolesLoginModule (properties file based login module) and may be changed to a stronger authentication mechanism.

The following example explains how to configure the semadmin user for WildFly in the default configuration.

To configure the security realm:

Go to the <wildfly_home>/bin folder and start the add-user.sh or add-user.bat script.
Select option b) Application User.
Select a realm or press Enter.
Enter semadmin for the Username and then press Enter.
Enter the password for this user and then press Enter.
Re-enter the password and then press Enter.
Enter the following list of roles: semarchyConnect,semarchyAdmin and then press Enter.
Enter y to confirm the user creation.
Enter n as this user is not used for one AS process to connect another AS process.

Setting up the Datasources

To configure the repository datasource:

Connect to the Administration Console.
In the Profile section, select Connector > Datasources.
Configure a datasource with the following parameters:
- Name: SEMARCHY_REPOSITORY
- JNDI: jdbc/SEMARCHY_REPOSITORY
- Driver: Select the driver for your database.
- Username: <repository_user_name>
- Password: <repository_user_password>
- Connection URL:
  - Oracle: jdbc:oracle:thin:@<oracle_instance_hostname>:<oracle_listener_port>:<oracle_SID_name>
  - PostgreSQL: jdbc:postgresql://<postgresql_hostname>:<postgresql_port>/<postgresql_database_name>
  - SQL Server: jdbc:sqlserver://<sqlserver_hostname>:<sqlserver_port>;databaseName=<repository_database_name>;
    or jdbc:sqlserver://<sqlserver_hostname>;instanceName=<sqlserver_instancename>;databaseName=<repository_database_name>;
Save this configuration and make sure it is enabled.
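
The connection URL templates above can be filled in programmatically, for example when scripting the setup of several datasources. The sketch below only assembles strings from the templates listed in this section. The helper names and the sample host, port and database values are hypothetical.

```python
def oracle_url(host, port, sid):
    """Oracle thin driver URL using the SID syntax shown above."""
    return f"jdbc:oracle:thin:@{host}:{port}:{sid}"


def postgresql_url(host, port, database):
    """PostgreSQL URL."""
    return f"jdbc:postgresql://{host}:{port}/{database}"


def sqlserver_url(host, database, port=None, instance=None):
    """SQL Server URL, addressed either by port or by named instance."""
    if port is not None:
        return f"jdbc:sqlserver://{host}:{port};databaseName={database};"
    if instance is not None:
        return f"jdbc:sqlserver://{host};instanceName={instance};databaseName={database};"
    raise ValueError("Provide either a port or an instance name")


# Hypothetical values, for illustration only.
print(oracle_url("dbhost", 1521, "ORCL"))
print(postgresql_url("dbhost", 5432, "semarchy_repository"))
print(sqlserver_url("dbhost", "semarchy_repository", port=1433))
print(sqlserver_url("dbhost", "semarchy_repository", instance="MDM"))
```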

To configure a data location datasource:

Connect to the Administration Console.
In the Profile section, select Connector > Datasources.
Configure a datasource with the following parameters:
- Name: <data_location_datasource_name>
- JNDI: jdbc/<data_location_datasource_name>
- Driver: Select the driver for your database.
- Username: <data_location_user_name>
- Password: <data_location_user_password>
- Connection URL:
  - Oracle: jdbc:oracle:thin:@<oracle_instance_hostname>:<oracle_listener_port>:<oracle_SID_name>
  - PostgreSQL: jdbc:postgresql://<postgresql_hostname>:<postgresql_port>/<postgresql_database_name>
  - SQL Server: jdbc:sqlserver://<sqlserver_hostname>:<sqlserver_port>;databaseName=<data_location_database_name>;
    or jdbc:sqlserver://<sqlserver_hostname>;instanceName=<sqlserver_instancename>;databaseName=<data_location_database_name>;
Save this configuration and make sure it is enabled.
Repeat this operation for each data location’s datasource.

The repository and data location datasources are now configured, pointing to the storage previously created.

To configure datasources for xDM Dashboards or xDM Discovery, use the same configuration steps that you used to configure the data location datasources.

Deploying the Application

To deploy the application:

Copy the temp/mdm-server/semarchy.war file to the <wildfly_home>/standalone/deployments/ folder.

The Semarchy xDM application is deployed in the server.

Configuring JavaMail Session

This configuration is required for mail notifications using JEE Mail Session.

To configure JavaMail Session:

Edit the domain.xml or standalone-full.xml configuration file and create the mail subsystem to match your configuration. See https://docs.jboss.org/author/display/AS71/Mail+Subsystem for more information.

Testing the Application

To test the application:

Open a web browser.
In the URL, enter: http://<wildfly_host>:<wildfly_port>/semarchy/.

Proceed directly to the Installing the Repository task.

Deploying and Configuring with GlassFish

This section explains how to configure and deploy the Semarchy xDM Server with GlassFish Application Server.

In this section, <glassfish_home> refers to the GlassFish server installation folder.

Refer to the GlassFish Documentation for more details about the deployment and configuration processes in GlassFish.

Installing Additional Libraries

Installing the Database JDBC Driver

To install the JDBC drivers:

Copy the appropriate database driver file from temp/mdm-server/additional-libraries/ to the <glassfish_home>/glassfish/lib directory.
Copy additional drivers to the same directory.

After installing the libraries, restart the GlassFish server.

Configuring the Security

The configuration in this section uses the default File Realm and may be changed to your enterprise’s type of realm.

Configuring the Security Realm and Semarchy xDM Administrator

To configure the security realm:

Open the GlassFish WebAdmin interface (http://<glassfish_host>:4848).
In the Common Tasks panel, select Configuration > Server-config > Security > Realms.
Click the New button to create a new realm with the following properties:
- Name: semarchyRealm
- ClassName: com.sun.enterprise.security.auth.realm.file.FileRealm
- JAAS Context: fileRealm
- Key file: ${com.sun.aas.instanceRoot}/config/keyfile
Click OK to save the new realm.

To configure the semadmin user:

Click the new semarchyRealm and then select the Manage Users button.
Click the New button to create a new user with the properties:
- User ID: semadmin
- Group List: semarchyAdminGroup,semarchyConnectGroup
- Password: <semadmin_password>
Click OK to save the new user.

Configuring Groups/Roles Mappings

The configured realm uses the default Java Authorization Contract for Containers (JACC) provider included in GlassFish. This JACC provider does not support dynamic roles, and requires that the mappings between Groups and Roles be defined in the deployed application descriptor file.

To define the groups/roles mappings:

Uncompress the temp/mdm-server/semarchy.war file into the temp/semarchy_war/ folder.
Edit the temp/semarchy_war/semarchy/WEB-INF/glassfish-web.xml file.
Add the section given below in the <glassfish-web-app> element and then save the file.

<security-role-mapping>
 <role-name>semarchyConnect</role-name>
 <group-name>semarchyConnectGroup</group-name>
</security-role-mapping>
<security-role-mapping>
 <role-name>semarchyAdmin</role-name>
 <group-name>semarchyAdminGroup</group-name>
</security-role-mapping>

To add new Semarchy xDM roles after the initial setup and map them to GlassFish groups, you must modify this file and redeploy the application.
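
When many roles have to be mapped, the security-role-mapping elements can be generated rather than typed by hand. The sketch below is a hypothetical helper, not part of the product. It uses Python's standard xml.etree.ElementTree module, and the dataStewardGroup group name is an assumption that follows the naming used in the example above.

```python
import xml.etree.ElementTree as ET


def role_mapping_elements(mappings):
    """Return one <security-role-mapping> XML string per role -> group pair."""
    elements = []
    for role, group in mappings.items():
        mapping = ET.Element("security-role-mapping")
        ET.SubElement(mapping, "role-name").text = role
        ET.SubElement(mapping, "group-name").text = group
        elements.append(ET.tostring(mapping, encoding="unicode"))
    return elements


# The two mappings from the example above, plus a hypothetical additional role.
mappings = {
    "semarchyConnect": "semarchyConnectGroup",
    "semarchyAdmin": "semarchyAdminGroup",
    "dataSteward": "dataStewardGroup",
}
for element in role_mapping_elements(mappings):
    print(element)
```

The generated elements still have to be pasted into the glassfish-web.xml file, and the application redeployed, as described above.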

Setting up the Datasources

To configure the repository datasource:

Open the GlassFish WebAdmin interface (http://<glassfish_host>:4848).
In the Common Tasks panel, select Resources > JDBC > JDBC Connection Pools.
Click the New button to create a new connection pool with the following properties:
- Pool Name: SEMARCHY_REPOSITORY
- Resource Type: java.sql.ConnectionPoolDataSource
- Database Driver Vendor: Select your database vendor.
Click Next. Set the following additional properties:
- User: <repository_user_name>
- Password: <repository_user_password>
- URL:
  - Oracle: jdbc:oracle:thin:@<oracle_instance_hostname>:<oracle_listener_port>:<oracle_SID_name>
  - PostgreSQL: jdbc:postgresql://<postgresql_hostname>:<postgresql_port>/<postgresql_database_name>
  - SQL Server: jdbc:sqlserver://<sqlserver_hostname>:<sqlserver_port>;databaseName=<repository_database_name>;
    or jdbc:sqlserver://<sqlserver_hostname>;instanceName=<sqlserver_instancename>;databaseName=<repository_database_name>;
- MaxStatements: 50
Click Finish.
Select the new connection pool and click the Ping button to test it.
In the Common Tasks panel, select Resources > JDBC > JDBC Resources.
Click the New button to create a new JDBC resource with the following properties:
- JNDI Name: jdbc/SEMARCHY_REPOSITORY
- Pool Name: SEMARCHY_REPOSITORY
- Status: Enabled
Click OK.

In the Pool Settings, it is recommended to tune the Minimum Pool Size and Maximum Pool Size properties according to your needs. Having a pool size between 1 and 8 connections is typically sufficient for testing purposes.

Do not change the JNDI Name of the SEMARCHY_REPOSITORY datasource. The application refers to a datasource with this name for the repository.

To configure a data location datasource:

Open the GlassFish WebAdmin interface (http://<glassfish_host>:4848).
In the Common Tasks panel, select Resources > JDBC > JDBC Connection Pools.
Click the New button to create a new connection pool with the following properties:
- Pool Name: <data_location_datasource_name>
- Resource Type: java.sql.ConnectionPoolDataSource
- Database Driver Vendor: Select your database vendor.
Click Next. Set the following additional properties:
- User: <data_location_user_name>
- Password: <data_location_user_password>
- URL:
  - Oracle: jdbc:oracle:thin:@<oracle_instance_hostname>:<oracle_listener_port>:<oracle_SID_name>
  - PostgreSQL: jdbc:postgresql://<postgresql_hostname>:<postgresql_port>/<postgresql_database_name>
  - SQL Server: jdbc:sqlserver://<sqlserver_hostname>:<sqlserver_port>;databaseName=<data_location_database_name>;
    or jdbc:sqlserver://<sqlserver_hostname>;instanceName=<sqlserver_instancename>;databaseName=<data_location_database_name>;
- MaxStatements: 50
Click Finish.
Select the new connection pool and click the Ping button to test it.
In the Common Tasks panel, select Resources > JDBC > JDBC Resources.
Click the New button to create a new JDBC resource with the following properties:
- JNDI Name: jdbc/<data_location_datasource_name>
- Pool Name: <data_location_datasource_name>
- Status: Enabled
Click OK.

Repeat this operation for each data location’s datasource.

The repository and data location datasources are now configured, pointing to the storage previously created.

To configure datasources for xDM Dashboards or xDM Discovery, use the same configuration steps that you used to configure the data location datasources.

Configuring JavaMail Session

This configuration is required for mail notifications using JEE Mail Session.

To configure JavaMail Session:

Open the GlassFish WebAdmin interface (http://<glassfish_host>:4848).
In the Common Tasks panel, select Resources > JavaMail Sessions.
Click the New button to create a new JavaMail Session with the following properties:
- JNDI Name: mail/Session
- Mail Host: <mail_server_host>
- Default User: <mail_user_name>
- Transport Protocol: smtp
Add the following additional properties to enable SMTP authentication:
- mail.smtp.auth: true
- password: <mail_user_password>
Click OK.

Deploying the Application

To deploy the application:

Open the GlassFish WebAdmin interface (http://<glassfish_host>:4848).
In the Common Tasks panel, select Applications.
Click the Deploy… button.
Select Local Packaged File or Directory …, and then click Browse Folders…
Select the temp/semarchy_war/semarchy/ folder in the folder browser and click Choose Folder.
Select Web Application for the Type and make sure the Status is Enabled.
Click OK to deploy the application.

The Semarchy xDM application is deployed in the server.

Testing the Application

To test the application:

Open a web browser.
In the URL, enter: http://<glassfish_host>:<glassfish_port>/semarchy/.

Proceed directly to the Installing the Repository task.

Deploying and Configuring with Jetty

This section explains how to configure and deploy the Semarchy xDM Server with the Eclipse Jetty Application Server.

In this section, <jetty_home> refers to the Jetty server installation folder.

Refer to the Jetty Documentation for more details about the deployment and configuration processes in Jetty.

Installing Additional Libraries

Installing the Database JDBC Driver

To install the JDBC drivers:

Copy the appropriate database driver file from temp/mdm-server/additional-libraries/ to the <jetty_home>/lib/ext directory.
Copy additional drivers to the same directory.

Installing the Pooling Libraries

This configuration uses connection pooling available in DBCP.

To install the Connection Pooling Libraries:

Download Apache Commons DBCP and Apache Commons Pool.
Uncompress both these files and then copy the commons-dbcp-1.4.jar and commons-pool-1.6.jar files to the <jetty_home>/lib/ext directory.

Configuring the Security

To create the Semarchy xDM Realm:

Edit the <jetty_home>/etc/jetty.xml file and add the realm definition as described in the example below.

<Call name="addBean">
  <Arg>
    <New class="org.eclipse.jetty.security.HashLoginService">
      <Set name="name">semarchyRealm</Set>
      <Set name="config"><SystemProperty name="jetty.home" default="."/>/etc/realm.properties</Set>
      <Set name="refreshInterval">0</Set>
    </New>
  </Arg>
</Call>

This realm uses file storage for the users and roles.

To add the semadmin user to the realm:

Create or edit the <jetty_home>/etc/realm.properties file and add the users in the following format:

<user_name>: <password>[,<role_1>, <role_2>]

An example is given below:

Creating a semadmin user.

semadmin: <semadmin_password>, semarchyAdmin,semarchyConnect
myuser: <my_password>,semarchyConnect, dataSteward
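
To double-check a realm.properties file before restarting Jetty, its entries can be parsed with a few lines of code. The sketch below follows the format given above and is only an illustration. It assumes that passwords contain no comma, and the parse_realm_line name and the sample passwords are hypothetical.

```python
def parse_realm_line(line):
    """Split '<user_name>: <password>[,<role_1>, <role_2>]' into (user, password, roles)."""
    user, _, rest = line.partition(":")
    fields = [field.strip() for field in rest.split(",")]
    return user.strip(), fields[0], fields[1:]


# Sample entries with placeholder passwords.
entries = [
    "semadmin: secret1, semarchyAdmin,semarchyConnect",
    "myuser: secret2,semarchyConnect, dataSteward",
]

for entry in entries:
    user, _password, user_roles = parse_realm_line(entry)
    # Every user of the application is expected to have the semarchyConnect role.
    print(user, user_roles, "semarchyConnect" in user_roles)
```

Because the user name ends at the first colon, a password that itself contains a colon is still read as a single field.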

All passwords configured in Jetty can be obfuscated. See the Secure Password Obfuscation chapter in the Jetty Documentation for more information.

Setting up the Datasources

To configure the datasources using DBCP connection pooling:

Edit the <jetty_home>/etc/jetty.xml file and add the new JNDI resources shown in the template below.

<!-- Repository Datasource -->
<New id="SEMARCHY_REPOSITORY" class="org.eclipse.jetty.plus.jndi.Resource">
   <Arg></Arg>
   <Arg>jdbc/SEMARCHY_REPOSITORY</Arg>
   <Arg>
    <New class="org.apache.commons.dbcp.BasicDataSource">
          <!-- Configuration for Oracle -->
          <Set name="driverClassName">oracle.jdbc.OracleDriver</Set>
          <Set name="url">jdbc:oracle:thin:@<oracle_instance_hostname>:<oracle_listener_port>:<oracle_SID_name></Set>

          <!-- Configuration for PostgreSQL -->
          <!-- <Set name="driverClassName">org.postgresql.Driver</Set> -->
          <!-- <Set name="url">jdbc:postgresql://<postgresql_hostname>:<postgresql_port>/<postgresql_database_name></Set> -->

          <!-- Configuration for SQL Server -->
          <!-- <Set name="driverClassName">com.microsoft.sqlserver.jdbc.SQLServerDriver</Set> -->
          <!-- <Set name="url">jdbc:sqlserver://<sqlserver_hostname>:<sqlserver_port>;databaseName=<repository_database_name></Set> -->

          <Set name="username"><repository_user_name></Set>
          <Set name="password"><repository_user_password></Set>
     </New>
   </Arg>
  </New>


<!-- Repeat the following element for each data location -->
<New id="<data_location_datasource_name>" class="org.eclipse.jetty.plus.jndi.Resource">
   <Arg></Arg>
   <Arg>jdbc/<data_location_datasource_name></Arg>
   <Arg>
    <New class="org.apache.commons.dbcp.BasicDataSource">
          <!-- Configuration for Oracle -->
          <Set name="driverClassName">oracle.jdbc.OracleDriver</Set>
          <Set name="url">jdbc:oracle:thin:@<oracle_instance_hostname>:<oracle_listener_port>:<oracle_SID_name></Set>

          <!-- Configuration for PostgreSQL -->
          <!-- <Set name="driverClassName">org.postgresql.Driver</Set> -->
          <!-- <Set name="url">jdbc:postgresql://<postgresql_hostname>:<postgresql_port>/<postgresql_database_name></Set> -->

          <!-- Configuration for SQL Server -->
          <!-- <Set name="driverClassName">com.microsoft.sqlserver.jdbc.SQLServerDriver</Set> -->
          <!-- <Set name="url">jdbc:sqlserver://<sqlserver_hostname>:<sqlserver_port>;databaseName=<data_location_database_name></Set> -->

          <Set name="username"><data_location_user_name></Set>
          <Set name="password"><data_location_user_password></Set>
     </New>
   </Arg>
  </New>
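
The url values in the template above follow one pattern per database. The following Python sketch builds them from their parts, which may help when preparing several datasources. The function name jdbc_url, the host names and the database names are hypothetical. The port numbers shown are only the usual defaults for each database and may differ in your environment.

```python
def jdbc_url(db_type, host, port, database):
    """Build a JDBC URL in the format used by the datasource template.

    For Oracle, `database` is the SID name.
    """
    templates = {
        "oracle": "jdbc:oracle:thin:@{host}:{port}:{database}",
        "postgresql": "jdbc:postgresql://{host}:{port}/{database}",
        "sqlserver": "jdbc:sqlserver://{host}:{port};databaseName={database}",
    }
    try:
        template = templates[db_type]
    except KeyError:
        raise ValueError(f"unsupported database type: {db_type!r}") from None
    return template.format(host=host, port=port, database=database)


# Hypothetical hosts and database names, for illustration only.
print(jdbc_url("oracle", "ora-host", 1521, "ORCL"))
print(jdbc_url("postgresql", "pg-host", 5432, "semarchy_repo"))
print(jdbc_url("sqlserver", "mssql-host", 1433, "semarchy_repo"))
```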

Configuring JavaMail Session

Edit the <jetty>/etc/jetty.xml file and add the new JavaMail resource as described in the template below.

<New id="mail" class="org.eclipse.jetty.plus.jndi.Resource">
     <Arg><Ref refid="wac"/></Arg>
     <Arg>mail/Session</Arg>
     <Arg>
       <New class="org.eclipse.jetty.jndi.factories.MailSessionReference">
         <Set name="user"><mail_user_name></Set>
         <Set name="password"><mail_user_password></Set>
         <Set name="properties">
           <New class="java.util.Properties">
             <Put name="mail.transport.protocol">smtp</Put>
             <Put name="mail.smtp.host"><mail_server_host></Put>
             <Put name="mail.smtp.port"><mail_server_port></Put>
             <Put name="mail.from"><mail_from_user_name></Put>
             <Put name="mail.debug">true</Put>
           </New>
          </Set>
       </New>
     </Arg>
</New>
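
Before relying on this mail session, it can be useful to confirm that the SMTP server is reachable from the machine running Jetty. The following Python sketch is one possible check and is not part of Semarchy xDM. The function name smtp_reachable is hypothetical. It only opens a connection and sends a NOOP command; it does not authenticate or send mail.

```python
import smtplib


def smtp_reachable(host, port, timeout=5.0):
    """Return True if an SMTP server answers a NOOP command on host:port."""
    try:
        # The constructor connects and expects the server's 220 greeting.
        with smtplib.SMTP(host, port, timeout=timeout) as server:
            code, _message = server.noop()
            return 200 <= code < 400
    except (OSError, smtplib.SMTPException):
        # Connection refused, timeout, DNS failure or an SMTP-level error.
        return False


# Example call, with the same placeholders as the template above:
# smtp_reachable("<mail_server_host>", <mail_server_port>)
```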

Deploying the Application

To deploy the application:

Copy the temp/mdm-server/semarchy.war file to the <jetty>/webapps/ folder.
Open a command line in the <jetty> folder and then run java -jar start.jar to start the Jetty Server.

The Semarchy xDM application is deployed and the server started.
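
The two steps above can also be scripted. The following Python sketch is one way to do so, under the assumption that java is available on the PATH. The function names copy_war and start_jetty are hypothetical, and the deployment folder is passed explicitly so that the sketch makes no assumption about your Jetty layout.

```python
import shutil
import subprocess
from pathlib import Path


def copy_war(war_file, deploy_dir):
    """Copy the WAR file into Jetty's deployment folder and return the new path."""
    deploy_dir = Path(deploy_dir)
    if not deploy_dir.is_dir():
        raise FileNotFoundError(f"deployment folder not found: {deploy_dir}")
    return Path(shutil.copy2(war_file, deploy_dir))


def start_jetty(jetty_home):
    """Run 'java -jar start.jar' from the Jetty folder and return the process."""
    return subprocess.Popen(["java", "-jar", "start.jar"], cwd=jetty_home)


# Example calls, with <jetty> replaced by your Jetty installation folder:
# copy_war("temp/mdm-server/semarchy.war", "<jetty>/<deployment_folder>")
# start_jetty("<jetty>")
```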

Testing the Application

To test the application:

Open a web browser.
In the URL, enter: http://<jetty_host>:<jetty_port>/semarchy/.
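
The same check can be made without a browser. The following Python sketch requests the application URL and reports the HTTP status. The function names semarchy_url and http_status are hypothetical, and localhost and 8080 are example values only. Because the application is secured, a response such as 401 or a redirect to a login page still indicates that the server is answering.

```python
import urllib.error
import urllib.request


def semarchy_url(host, port):
    """Build the application URL for the given server host and port."""
    return f"http://{host}:{port}/semarchy/"


def http_status(url, timeout=5.0):
    """Return the HTTP status code, or None if no HTTP response was received."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        # The server answered, but with an error status such as 401 or 404.
        return err.code
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout or name resolution failure.
        return None


# Example call, with example host and port values:
# print(http_status(semarchy_url("localhost", 8080)))
```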

Proceed directly to the Installing the Repository task.

Deploying and Configuring with Oracle WebLogic

This section explains how to configure and deploy the Semarchy xDM Server with Oracle WebLogic Application Server.

Refer to the WebLogic Documentation for more details about the deployment and configuration processes in WebLogic.

Prior to configuring Semarchy xDM, configure and start the WebLogic domain into which you plan to deploy and configure the Semarchy xDM application.

Installing Additional Libraries

Installing the Database JDBC Driver

Oracle WebLogic comes with a JDBC Driver for the Oracle database. For PostgreSQL, SQL Server and the databases accessed from xDM Dashboards or xDM Discovery, you must add the driver file to the classpath variable of the WebLogic Server.

Configuring the Security

To create the groups:

Connect to the WebLogic Server Administration Console.
Select Security Realms from the left pane and then click the realm you are configuring.
Select the Users and Groups tab, then Groups.
Click New.
In the Create a New Group page, provide the following information:
- Name: semarchyConnectGroup
- Provider: DefaultAuthenticator
Click OK. The group is added to the Groups table.
Click New.
In the Create a New Group page, provide the following information:
- Name: semarchyAdminGroup
- Provider: DefaultAuthenticator
Click OK. The group is added to the Groups table.

To create the semadmin user:

Connect to the WebLogic Server Administration Console.
Select Security Realms from the left pane and then click the realm you are configuring.
Select the Users and Groups tab, then Users.
Click New.
In the Create a New User page, provide the following information:
- Name: semadmin
- Provider: DefaultAuthenticator
- Password: <semadmin_password>
Click OK. The user is added to the Users table.
Click the semadmin user from the Users table, then select the Groups tab.
Select the semarchyAdminGroup and semarchyConnectGroup groups in the Available group list, then click the Add button to add them to the Chosen list.
Click Save to save the group membership.

To configure the role mappings:

Connect to the WebLogic Server Administration Console.
Select Security Realms from the left pane and then click the realm you are configuring.
Select Roles and Policies > Realm Roles.
Expand Global Roles and then click on Roles.
Select New to create a new role with the following properties:
- Name: semarchyConnect
- Provider Name: XACMLRoleMapper
Click OK.
Select the semarchyConnect role in the Roles table.
Click Add Conditions.
Select Group for the Predicate List and then click Next.
In the Group Argument Name, enter semarchyConnectGroup and then click Add.
Click Finish.
Click Save.
Select Security Realms from the left pane and then click the realm you are configuring.
Select Roles and Policies > Realm Roles.
Expand Global Roles and then click on Roles.
Select New to create a new role with the following properties:
- Name: semarchyAdmin
- Provider Name: XACMLRoleMapper
Click OK.
Select the semarchyAdmin role in the Roles table.
Click Add Conditions.
Select Group for the Predicate List and then click Next.
In the Group Argument Name, enter semarchyAdminGroup and then click Add.
Click Finish.
Click Save.
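
The outcome of this configuration is that membership in a group grants the corresponding role. The following Python sketch models that outcome for illustration. It does not reflect how WebLogic evaluates role conditions internally, and the names GROUP_TO_ROLE and roles_for are hypothetical.

```python
# Group-to-role mapping defined by the steps above.
GROUP_TO_ROLE = {
    "semarchyConnectGroup": "semarchyConnect",
    "semarchyAdminGroup": "semarchyAdmin",
}


def roles_for(groups):
    """Return the sorted roles granted by a user's group memberships."""
    return sorted({GROUP_TO_ROLE[g] for g in groups if g in GROUP_TO_ROLE})


# The semadmin user belongs to both groups, so it receives both roles.
print(roles_for(["semarchyAdminGroup", "semarchyConnectGroup"]))
```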

Setting up the Datasources

To configure the repository datasource:

Connect to the WebLogic Server Administration Console.
Select Services > Data Sources from the left pane.
Select New > Generic Data Source in the Data Sources table.
Enter the following JDBC datasource properties:
- Name: SEMARCHY_REPOSITORY
- JNDI Name: jdbc/SEMARCHY_REPOSITORY
- Database Type: Select your database type in the list.
Click Next.
In the Database Driver list, select the driver corresponding to your database.
Click Next.
De-select the Supports Global Transaction option.
Click Next.
Enter the following connection properties:
- For Oracle:
  - Database Name: <oracle_SID_name>
  - Host Name: <oracle_instance_hostname>
  - Port: <oracle_listener_port>
  - Database User Name: <repository_user_name>
  - Password: <repository_user_password>
- For PostgreSQL:
  - Database Name: <postgresql_database_name>
  - Host Name: <postgresql_hostname>
  - Port: <postgresql_port>
  - Database User Name: <repository_user_name>
  - Password: <repository_user_password>
- For SQL Server:
  - Database Name: <repository_database_name>
  - Host Name: <sqlserver_hostname>
  - Port: <sqlserver_port>
  - Database User Name: <repository_user_name>
  - Password: <repository_user_password>
Click Next.
Click Test Configuration to validate the connection information.
In the Properties field, add the following line: defaultAutoCommit=false
Click Next.
Select a deployment target (by default Admin Server) and then click Finish.

Do not change the JNDI Name of the SEMARCHY_REPOSITORY datasource. The application refers to a datasource with this name for the repository.

To configure the data location datasource:

Connect to the WebLogic Server Administration Console.
Select Services > Data Sources from the left pane.
Select New > Generic Data Source in the Data Sources table.
Enter the following JDBC datasource properties:
- Name: <data_location_datasource_name>
- JNDI Name: jdbc/<data_location_datasource_name>
- Database Type: Select your database type in the list.
Click Next.
In the Database Driver list, select the driver corresponding to your database.
Click Next.
De-select the Supports Global Transaction option.
Click Next.
Enter the following connection properties:
- For Oracle:
  - Database Name: <oracle_SID_name>
  - Host Name: <oracle_instance_hostname>
  - Port: <oracle_listener_port>
  - Database User Name: <data_location_user_name>
  - Password: <data_location_user_password>
- For PostgreSQL:
  - Database Name: <postgresql_database_name>
  - Host Name: <postgresql_hostname>
  - Port: <postgresql_port>
  - Database User Name: <data_location_user_name>
  - Password: <data_location_user_password>
- For SQL Server:
  - Database Name: <data_location_database_name>
  - Host Name: <sqlserver_hostname>
  - Port: <sqlserver_port>
  - Database User Name: <data_location_user_name>
  - Password: <data_location_user_password>
Click Next.
Click Test Configuration to validate the connection information.
In the Properties field, add the following line: defaultAutoCommit=false
Click Next.
Select a deployment target (by default Admin Server) and then click Finish.

Repeat this operation for each data location’s datasource.

The repository and data location datasources are now configured, pointing to the storage previously created.
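
When several data locations are planned, it can help to list the settings that are common to every datasource before entering them in the console. The following Python sketch produces such a checklist from the values given in the procedures above. The function name datasource_settings and the CUSTOMER_HUB data location name are hypothetical.

```python
def datasource_settings(name):
    """Return the console settings shared by all Semarchy xDM datasources."""
    return {
        "Name": name,
        "JNDI Name": f"jdbc/{name}",
        "Supports Global Transaction": False,
        "Properties": "defaultAutoCommit=false",
    }


# The repository datasource name is fixed; CUSTOMER_HUB is a hypothetical data location.
for name in ["SEMARCHY_REPOSITORY", "CUSTOMER_HUB"]:
    for key, value in datasource_settings(name).items():
        print(f"{key}: {value}")
    print()
```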

To configure datasources for xDM Dashboards or xDM Discovery, use the same configuration steps that you used to configure the data location datasources.

Deploying the Application

To deploy the application:

Connect to the WebLogic Server Administration Console.
Select Deployments from the left pane.
Click the Install button.
Click the Upload your file(s) link.
For the Deployment Archive, select the temp/mdm-server/semarchy.war file on your local disk.
Click Next and then Next.
In the Choose target style page select Install this deployment as an application and then click Next.
In the Optional Settings page, in the Security option group, select Custom Roles: Use roles that are defined in the Administration Console; use policies that are defined in the deployment descriptor.
Click Next.
In the Review your choices and click Finish page, select No, I will review the configuration later.
Click Finish.

The Semarchy xDM application is deployed in the server.

Configuring JavaMail Session

This configuration is required for mail notifications using JEE Mail Session.

To configure JavaMail Session:

Connect to the WebLogic Server Administration Console.
Select Services > Mail Sessions from the left pane.
Click the New button.
In the Mail Session Properties page, enter the following properties:
- Name: MailSession
- JNDI Name: mail/Session
In the JavaMail Properties, enter the following lines:
- mail.transport.protocol=smtp
- mail.smtp.host=<mail_server_host>
- mail.smtp.port=<mail_server_port>
- mail.smtp.auth=true
- mail.smtp.user=<mail_user_name>
- password=<mail_user_password>
Click Save to save the mail session configuration.

Testing the Application

To test the application:

Open a web browser.
In the URL, enter: http://<weblogic_host>:<weblogic_port>/semarchy/.

Proceed directly to the Installing the Repository task.

Installing the Repository

The repository type cannot be modified after installation. Make sure to choose a repository type suited to your installation, and carefully review the description of the Repository Types.

Semarchy xDM holds all its information in a repository stored in a database/schema. The first task when connecting to Semarchy xDM is to create this repository structure in the storage previously created.

Repository installation is done the first time an administrator connects to Semarchy xDM.

Open your web browser and connect to the following URL: http://<application_server_host>:<application_server_port>/semarchy/.
In the login prompt, enter the following:
- User: semadmin
- Password: <semadmin_password>
Semarchy xDM opens with the license agreement. Review the End-User License Agreement.
Check the I have read and accept Semarchy’s End-User License Agreement box and then click Next.
In the Repository Creation wizard, select the type of repository.
Optionally enter a customized name for your repository.
Click Finish to create the repository.
Click OK when the Repository Successfully Created message appears.

The repository is created and Semarchy xDM is now up and running.

Next Steps

Semarchy xDM is now up and running.

This section describes the next recommended tasks for the Administrators and Project Managers.

Administrators

Administrators should proceed with the following tasks:

Activate the instance with a license key.
Set up the security:
- Create users, groups and roles in the application server realm.
- Declare the roles in Semarchy xDM and grant platform-level privileges to these roles.
Configure Notification Servers for sending emails and job notifications.

Refer to the Semarchy xDM Administration Guide for more information about these tasks.

Project Managers

Project Managers should proceed with the following tasks:

Start profiling datasources. Refer to the Semarchy xDM Discovery User’s Guide for more information.
Create a new model and Data Management Application. Refer to the Semarchy xDM Developer’s Guide for more information.
Create data locations to deploy these models, using the datasources configured when installing Semarchy xDM.
Create a new Dashboard Application. Refer to the xDM Dashboards Designer’s Guide for more information.