Publish data to the hub

Publishing submits the creation, modification, or deletion of records to the data hub. This data goes through the certification process and is turned into golden data.

Publishing process

Source data is published in a transaction called an external load. Each external load is a Semarchy xDM transaction identified by a sequential load ID.

An external load represents a transaction for loading source data.

When an external load is submitted, a batch, identified by a batch ID, is created. This batch starts an integration job that runs the certification process on the data submitted in the load.

A batch represents a transaction that certifies loaded data and writes the resulting golden data into the hub.

Using the REST or SQL APIs, the middleware can initialize, submit, or cancel external loads, as explained below.

Both loads and batches can be examined under a data location’s node in the Management perspective of the Application Builder.

External load lifecycle

The lifecycle of an external load is as follows:

  1. Initializing the external load

    • The middleware initializes an external load with the SQL interface or the REST API.

    • The platform returns a load ID that identifies the external load. This opens the transaction.

  2. Loading data

    • The middleware inserts data into the landing tables in the data location schema via the SQL interface or the REST API.

    • During data loading, the middleware provides both the load ID and a publisher code corresponding to the publisher application.

  3. Submitting the external load

    • The middleware uses the SQL interface or the REST API to submit the external load.

    • It provides the load ID and the name of the integration job to trigger.

    • The platform creates a batch to execute the integration job that certifies the data published in this external load.

    • The platform returns a batch ID that identifies the batch it processes for this external load.

    • At this stage, the external load transaction is closed.

Alternatively, instead of submitting the external load, the middleware can cancel it with the SQL interface or the REST API. Canceling aborts the external load.
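
The lifecycle above can be pictured as a small state machine. The following Python sketch is an in-memory illustration only: it does not call the Semarchy xDM SQL or REST API, and every name in it (HubSimulator, initialize_load, load_data, submit_load, cancel_load, the status values, the CRM publisher code, and the INTEGRATE_DATA job name) is hypothetical. It assumes that a load accepts data only while its transaction is open, and that submitting or canceling closes it.

```python
from dataclasses import dataclass, field
from enum import Enum
from itertools import count


class LoadStatus(Enum):
    RUNNING = "running"      # transaction open, data can be loaded
    SUBMITTED = "submitted"  # transaction closed, batch created
    CANCELED = "canceled"    # transaction aborted, no batch created


@dataclass
class ExternalLoad:
    load_id: int
    status: LoadStatus = LoadStatus.RUNNING
    rows: list = field(default_factory=list)


class HubSimulator:
    """In-memory stand-in for the platform (hypothetical, not the xDM API)."""

    def __init__(self):
        self._load_ids = count(1)   # sequential load IDs
        self._batch_ids = count(1)  # sequential batch IDs
        self.loads = {}             # load_id -> ExternalLoad
        self.batches = {}           # batch_id -> (load_id, job_name)

    def _open_load(self, load_id):
        # Only a load whose transaction is still open can be used.
        load = self.loads[load_id]
        if load.status is not LoadStatus.RUNNING:
            raise RuntimeError(f"load {load_id} is {load.status.value}")
        return load

    def initialize_load(self):
        # Step 1: open the transaction and hand back a load ID.
        load = ExternalLoad(next(self._load_ids))
        self.loads[load.load_id] = load
        return load.load_id

    def load_data(self, load_id, publisher_code, record):
        # Step 2: each row carries the load ID and the publisher code.
        load = self._open_load(load_id)
        load.rows.append({"load_id": load_id, "publisher": publisher_code, **record})

    def submit_load(self, load_id, job_name):
        # Step 3: close the transaction, create a batch, return its ID.
        load = self._open_load(load_id)
        load.status = LoadStatus.SUBMITTED
        batch_id = next(self._batch_ids)
        self.batches[batch_id] = (load_id, job_name)
        return batch_id

    def cancel_load(self, load_id):
        # Alternative to step 3: abort the load without creating a batch.
        load = self._open_load(load_id)
        load.status = LoadStatus.CANCELED


hub = HubSimulator()
load_id = hub.initialize_load()
hub.load_data(load_id, "CRM", {"customer_name": "Acme"})
batch_id = hub.submit_load(load_id, "INTEGRATE_DATA")
```

In this sketch, calling load_data, submit_load, or cancel_load on a load that has already been submitted or canceled raises an error, which mirrors the closed transaction described in step 3.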

Continuous loads simplify the publishing process. A continuous load is an external load that stays open and into which data is published. The hub polls and processes this data automatically.

Administrators can place a data location in maintenance status. In that state, external loads cannot be initialized.

Batch lifecycle

Upon submission of an external load, the following operations occur:

  1. The platform creates a batch and returns the batch ID to the submitter.

  2. The integration batch poller retrieves the batch based on its schedule:

    1. It creates a job instance using the job definition name provided when the load was submitted.

    2. It moves the job into the queue specified in the job definition.

  3. The execution engine processes the job in the queue.

  4. When the job is completed, the batch is considered finished.

Even when loads run simultaneously, the order in which they are submitted determines the order in which the integration jobs process the data and certify golden data from the source data.
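
The batch lifecycle and its ordering guarantee can be sketched as a pair of first-in, first-out queues. The Python model below is an illustration under stated assumptions, not the platform's implementation: BatchPipeline, its methods, the PENDING/QUEUED/DONE status names, the job name, and the queue name are all hypothetical, and the poller is triggered by hand rather than on a schedule.

```python
from collections import deque
from dataclasses import dataclass
from itertools import count


@dataclass
class Batch:
    batch_id: int
    load_id: int
    job_name: str
    status: str = "PENDING"  # hypothetical status names


class BatchPipeline:
    """Toy model of the batch poller and execution engine (hypothetical)."""

    def __init__(self, job_queues):
        # Maps a job definition name to the queue named in that definition.
        self._job_queues = dict(job_queues)
        self._batch_ids = count(1)
        self.batches = {}          # batch_id -> Batch
        self._pending = deque()    # submitted batches, in submission order
        self._queues = {name: deque() for name in set(self._job_queues.values())}

    def submit(self, load_id, job_name):
        # Step 1: create the batch and return its ID to the submitter.
        batch = Batch(next(self._batch_ids), load_id, job_name)
        self.batches[batch.batch_id] = batch
        self._pending.append(batch)
        return batch.batch_id

    def poll(self):
        # Step 2: pick up pending batches and move each job into its queue.
        while self._pending:
            batch = self._pending.popleft()
            batch.status = "QUEUED"
            self._queues[self._job_queues[batch.job_name]].append(batch)

    def run_queue(self, queue_name):
        # Steps 3 and 4: process queued jobs first in, first out.
        processed = []
        queue = self._queues[queue_name]
        while queue:
            batch = queue.popleft()
            batch.status = "DONE"
            processed.append(batch.batch_id)
        return processed


pipeline = BatchPipeline({"INTEGRATE_DATA": "Default"})
submitted = [pipeline.submit(load_id, "INTEGRATE_DATA") for load_id in (11, 12, 13)]
pipeline.poll()
processed = pipeline.run_queue("Default")
```

Because both the pending list and the job queue are first in, first out, the batches in this sketch finish in the order in which their loads were submitted.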