Semarchy xDM architecture

This page details the various components of the Semarchy xDM architecture and their interactions.

Semarchy xDM server

The Semarchy xDM server is a Java EE application that is deployed and run in a supported application server.

This server provides the following access methods:

  • The Application Builder, Dashboard Builder, Discovery, and Configuration user interfaces: these web applications are used by designers and administrators to create, manage, and administer the models and applications designed in Semarchy xDM, as well as to profile and measure data.

  • Data management applications and dashboard applications: these web applications are used by business users to browse and manage data and visualize metrics dashboards.

  • A REST API: this API allows users to programmatically perform data integration, management, and administrative operations.

At design-time, the Semarchy xDM server is used to design and deploy models and applications. At run-time, it schedules and runs the certification process in the hub.

Active and passive servers

The Semarchy xDM server comes in two flavors corresponding to two web application archive (WAR) files:

  • The active server (semarchy.war) contains all the features of Semarchy xDM, including running the certification job. You should only have one active server in a typical configuration.

  • The passive server (semarchy-passive.war) cannot run certification jobs. It is used in high-availability configurations to extend the active server with additional nodes that serve users and API requests.

Only one instance (either active or passive) of Semarchy xDM can be deployed in each application server instance. Performing two deployments on the same application server instance (e.g., http://host:port/semarchy_production and http://host:port/semarchy_test) is not supported.
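
The deployment rules above can be checked with a small script. The following is a minimal sketch, not a Semarchy tool: the Deployment class, the check_topology function, and the host names are hypothetical names introduced for illustration. Only the two WAR file names come from this page. The sketch assumes a configuration with exactly one active server.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Deployment:
    """One Semarchy xDM WAR on an application server instance (hypothetical model)."""
    app_server: str  # illustrative identifier, e.g. "host1:8080"
    war: str         # "semarchy.war" (active) or "semarchy-passive.war" (passive)


def check_topology(deployments):
    """Return a list of violations of the deployment rules described above."""
    problems = []
    # Rule 1: only one instance (active or passive) per application server instance.
    per_server = Counter(d.app_server for d in deployments)
    for server, count in sorted(per_server.items()):
        if count > 1:
            problems.append(f"{server}: {count} deployments on one application server instance")
    # Rule 2 (assumption of this sketch): exactly one active server in the configuration.
    active = [d for d in deployments if d.war == "semarchy.war"]
    if len(active) != 1:
        problems.append(f"expected exactly 1 active server, found {len(active)}")
    return problems


if __name__ == "__main__":
    topology = [
        Deployment("host1:8080", "semarchy.war"),
        Deployment("host2:8080", "semarchy-passive.war"),
    ]
    print(check_topology(topology))
```

An empty list means the described topology respects both rules; each string in a non-empty list names one violation.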

Repository

The Semarchy xDM server stores its information in a repository, hosted in a database or schema. A Semarchy xDM server is always attached to a single repository.

Repository contents

The repository stores the following information:

  • Platform-level configuration elements: roles, server configuration, plugins, identity management, datasources, etc.

  • For data management applications and models:

    • Model and application metadata and their versions

    • Configuration and security information: privileges, notification policies, preferences, etc.

    • Data location information: deployed model, jobs, etc.

    • Run-time information: logs

  • For xDM Dashboard:

    • Dashboard applications and their artifacts (queries, charts, etc.)

  • For xDM Discovery:

    • Datasource definitions and profiles

Semarchy xDM uses role-based security to control access to the features of its applications. Users and their assigned roles may be stored in the repository or in an external identity provider.

Repository type

The repository type is selected at creation-time and cannot be modified afterward.

The two repository types are:

  • Design: allows all design-time and run-time operations.

  • Deployment: permits only the import of closed model editions, with no editing possibility.

The deployment repositories are suitable for production sites. Model transfer from design to deployment repositories is handled via incremental export/import of closed model editions.

Semarchy xDM Dashboard and Semarchy xDM Discovery work identically with design and deployment repositories. Both types of repositories support profiling and the design of dashboard applications.

Repository datasources

The Semarchy xDM server connects to the repository using two datasources:

  • The repository datasource, whose credentials allow reading from and writing to the repository schema.

  • The repository read-only datasource, whose credentials allow reading a subset of the tables stored in the repository schema.
    This datasource is mainly used by xDM Discovery to browse the profiles.

Both these datasources are defined in the server startup configuration.

The repository data should never be modified directly via SQL queries. Semarchy xDM information must be accessed through the Semarchy user interfaces.

Data locations

Data managed in Semarchy applications is stored in a data location.

The data location:

  • Contains and serves the hub data, structured after the model entities and attributes.

  • Runs certification jobs generated according to the rules (enricher, validation, matching, etc.) defined in the model.

A model is deployed into a data location to host the hub data according to the model structure and rules. At run-time, the data location contains the golden, master, and source data, with their complete lineage and history.

Data locations are hosted in databases or schemas and accessed via platform datasources defined in Semarchy xDM.

Data location contents

A data location contains the hub data, stored in the database or schema accessed using the data location’s datasource. This schema contains database tables and other objects generated from the model edition.

The data location also refers to jobs stored in the repository:

  • Installation jobs: for creating or modifying the data structures in the data location in a non-destructive way.

  • Integration jobs: for certifying data in these data structures, according to the model job definitions.

  • Purge jobs: for purging the logs and data history according to the retention policies.

Data locations, repositories, and models

A data location is always attached to a single repository, regardless of how many data locations are declared in that repository. A data location cannot be attached to two repositories at the same time.

You may deploy several model editions successively in a data location, but only one model edition is deployed and active in the data location at any given time.
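
The relationship between a data location, its repository, and its model editions can be pictured with a small in-memory model. This is a minimal sketch under stated assumptions: the DataLocation class, its attributes, and the edition labels are hypothetical and are not part of any Semarchy API.

```python
class DataLocation:
    """Hypothetical in-memory model of a data location (not a Semarchy API)."""

    def __init__(self, name, repository):
        self.name = name
        self.repository = repository  # a data location is attached to exactly one repository
        self.history = []             # model editions deployed so far, oldest first

    def deploy(self, model_edition):
        """Deploy a model edition; it becomes the only active edition."""
        self.history.append(model_edition)

    @property
    def active_edition(self):
        """The single edition currently active, or None before the first deployment."""
        return self.history[-1] if self.history else None


if __name__ == "__main__":
    location = DataLocation("CustomerHub", repository="DesignRepository")
    location.deploy("CustomerModel edition 0.0")  # illustrative edition labels
    location.deploy("CustomerModel edition 0.1")
    print(location.active_edition)
```

Keeping the deployment history as a list while exposing a single active edition mirrors the rule above: editions are deployed one after another, and the latest one replaces its predecessor.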

Data locations are only used to deploy Semarchy models designed in the Application Builder. Dashboards and charts created in Semarchy xDM Dashboard do not require data locations.

Data location types

There are two types of data locations. The type is selected when the data location is created and cannot be changed afterward.

The data location types are:

  • Development data locations: support deploying open or closed model editions, and are suitable for testing models in development and quality assurance environments.

  • Production data locations: only support deploying closed model editions, and are suitable for deploying MDM hubs in production environments.
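
The deployment rule for the two data location types can be expressed as a single check. The sketch below is illustrative only: DataLocationType and can_deploy are hypothetical names, and the rule is taken directly from the descriptions above.

```python
from enum import Enum


class DataLocationType(Enum):
    """The two data location types described above (hypothetical enum)."""
    DEVELOPMENT = "development"
    PRODUCTION = "production"


def can_deploy(location_type, edition_is_closed):
    """Return True if a model edition may be deployed in a location of this type."""
    if location_type is DataLocationType.PRODUCTION:
        # Production data locations only accept closed model editions.
        return edition_is_closed
    # Development data locations accept open or closed model editions.
    return True


if __name__ == "__main__":
    for location_type in DataLocationType:
        for closed in (False, True):
            state = "closed" if closed else "open"
            print(location_type.value, state, can_deploy(location_type, closed))
```

The only rejected combination is an open model edition in a production data location.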

Datasources

Semarchy xDM connects to multiple databases and schemas: