Publish Data to the Hub

Publishing submits record creations, changes or deletions to the data hub. This data goes through the certification process and is turned into Golden Data.

Publishing Process

Publishing source data is performed as a transaction called an External Load, a Semarchy xDM transaction identified by a sequential Load ID.

An External Load represents a source data load transaction.

When an External Load is submitted, a Batch - identified by a Batch ID - is created and runs an Integration Job, which applies the certification process to the data submitted in this load.

A Batch represents a transaction that certifies the loaded data and writes the resulting golden data into the hub.

Using the REST or SQL APIs, the middleware can initialize, submit or cancel external loads as explained below.

Both Loads and Batches can be reviewed under the data locations in the Management perspective in the Semarchy Application Builder.
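
The sketch below is a purely illustrative Python model of these two transactions and their identifiers; the class and field names are assumptions for illustration and are not part of the Semarchy xDM API.

    from dataclasses import dataclass

    @dataclass
    class ExternalLoad:
        """A source data load transaction, identified by a sequential Load ID."""
        load_id: int           # returned by the platform at initialization
        publisher_code: str    # code of the publisher application
        description: str = ""  # free-text description of the load

    @dataclass
    class Batch:
        """A transaction certifying loaded data and writing the resulting golden data."""
        batch_id: int          # returned by the platform at submission
        load_id: int           # the external load that was submitted
        integration_job: str   # job definition name provided at submit time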

External Load Lifecycle

The external load lifecycle is described below (a middleware-side code sketch follows this list):

  1. Initialize the External Load

    • The middleware initializes an external load with the SQL Interface or the REST API.

    • It receives from the platform a Load ID identifying the external load.

    • At that stage, an external load transaction is open with the platform.

  2. Load Data

    • The middleware inserts data into the landing tables in the data location schema. This is done using the SQL Interface or the REST API.

    • When loading data, the middleware provides both the Load ID and a Publisher Code corresponding to the publisher application.

  3. Submit the External Load

    • The middleware uses the SQL Interface or the REST API to submit the external load.

    • It provides the Load ID as well as the name of the Integration Job to trigger with this submission.

    • The platform creates a Batch to run the integration job that certifies the data published in this external load.

    • It receives from the platform a Batch ID identifying the batch created to process this external load.

    • At that stage, the external load transaction is closed.
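
The sketch below walks through these three steps from the middleware side, in Python, against the REST API. The base URL, endpoint paths, payloads and field names (loadId, batchId, jobName and so on) are assumptions made for illustration; refer to the REST API reference of your Semarchy xDM instance for the exact contract.

    import requests

    BASE_URL = "https://xdm.example.com/semarchy/api/rest"  # assumed base URL
    AUTH = ("integration_user", "change_me")                # assumed basic authentication

    # 1. Initialize the external load and obtain its Load ID (hypothetical endpoint).
    resp = requests.post(
        f"{BASE_URL}/loads/CustomerDataLocation",
        json={"programName": "MIDDLEWARE_ETL", "description": "Nightly customer feed"},
        auth=AUTH,
    )
    resp.raise_for_status()
    load_id = resp.json()["loadId"]

    # 2. Load data, providing the Load ID and the Publisher Code of the source
    #    application (hypothetical endpoint and payload).
    resp = requests.post(
        f"{BASE_URL}/loads/CustomerDataLocation/{load_id}/Customer",
        json={
            "publisherCode": "CRM",
            "records": [{"CustomerName": "Acme Corp", "Country": "US"}],
        },
        auth=AUTH,
    )
    resp.raise_for_status()

    # 3. Submit the external load, naming the integration job to trigger.
    #    The platform creates a batch and returns its Batch ID.
    resp = requests.post(
        f"{BASE_URL}/loads/CustomerDataLocation/{load_id}/submit",
        json={"jobName": "INTEGRATE_CUSTOMERS"},
        auth=AUTH,
    )
    resp.raise_for_status()
    batch_id = resp.json()["batchId"]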

The middleware can alternatively Cancel the External Load with the SQL Interface or the REST API to abort it instead of submitting it.
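
Continuing the sketch above, canceling rather than submitting could look as follows (the endpoint path is, again, an assumption):

    # Abort the external load instead of submitting it (hypothetical endpoint).
    resp = requests.post(
        f"{BASE_URL}/loads/CustomerDataLocation/{load_id}/cancel",
        auth=AUTH,
    )
    resp.raise_for_status()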

Continuous Loads facilitate the publishing process. A continuous load is an open external load into which data is published. The hub polls and processes this data automatically.
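
With a continuous load, the middleware skips the initialize and submit calls and only writes data, addressing the continuous load by name. A minimal sketch, continuing the hypothetical client above (the endpoint and the continuous load name are assumptions):

    # Publish records directly into a continuous load (hypothetical endpoint).
    # No initialize or submit call is needed: the hub polls the continuous load
    # and processes the published data automatically.
    resp = requests.post(
        f"{BASE_URL}/continuous-loads/CONTINUOUS_CUSTOMER_LOAD/Customer",
        json={
            "publisherCode": "CRM",
            "records": [{"CustomerName": "Globex", "Country": "FR"}],
        },
        auth=AUTH,
    )
    resp.raise_for_status()
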
Data locations may be moved to a Maintenance status by their administrator. When a data location is in that state, it is not possible to initialize external loads.

Batch Lifecycle

When an external load is submitted, the following operations take place:

  1. The platform creates a batch and returns the Batch ID to the submitter.

  2. The integration batch poller picks up the batch on its schedule:

    1. It creates a Job instance using the Job Definition whose name was provided in the submit action.

    2. It moves the job into the Queue specified in the job definition.

  3. The Execution engine processes this job in the queue.

  4. When the job completes, the batch is considered finished (the middleware can track this completion, as sketched below).
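
From the middleware side, the Batch ID returned at submission can be used to track the batch until the job completes. The sketch below continues the hypothetical client above; the status endpoint and status values are assumptions:

    import time

    # Poll the batch status until the integration job finishes (hypothetical
    # endpoint and status values; the batch poller and the job run asynchronously).
    while True:
        resp = requests.get(
            f"{BASE_URL}/batches/CustomerDataLocation/{batch_id}",
            auth=AUTH,
        )
        resp.raise_for_status()
        status = resp.json()["status"]
        if status in ("DONE", "ERROR", "CANCELED"):
            break
        time.sleep(30)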

Even when multiple loads take place simultaneously, the sequence in which the external loads are submitted defines the order in which the integration jobs process the data and certify golden data from this source data.
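
For example, continuing the hypothetical client above, if two loads are loaded concurrently but submitted one after the other, the batch created for the first submission certifies its data before the batch created for the second:

    # load_id_a and load_id_b identify two external loads initialized earlier
    # (illustrative IDs). Because load_id_a is submitted first, its data is
    # certified before the data of load_id_b, regardless of when it was loaded.
    load_id_a, load_id_b = 1001, 1002
    for lid in (load_id_a, load_id_b):
        requests.post(
            f"{BASE_URL}/loads/CustomerDataLocation/{lid}/submit",
            json={"jobName": "INTEGRATE_CUSTOMERS"},
            auth=AUTH,
        ).raise_for_status()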