Publishing Data into the Hub

Publishing data into the hub is done using either:

Methods for Publishing Data

External Loads using PL/SQL and SQL

In this approach, external loads are handled via PL/SQL and SQL interface. It works as follows:

  1. The external load is initialized using a PL/SQL call to the INTEGRATION_LOAD.GET_NEW_LOADID function on the repository schema. This function call returns a Load ID.
  2. SD tables are loaded using SQL inserts issued on the data location schema.
  3. The external load is submitted or cancelled using PL/SQL function calls on the repository schema:

Tip: This approach is recommended when the middleware is an ETL or data integration product.

External Loads using Web Services and SQL

In this approach, external loads are handled via the Integration Load Web Service, and the data is loaded using SQL. It works as follows:

  1. The external load is initialized using a web service call to the getNewLoadID operation on the Integration Load Web Service. This call returns a Load ID.
  2. SD tables are loaded using SQL inserts issued on the data location schema.
  3. The external load is submitted or cancelled using web service calls to operations on the Integration Load Web Service:

Tip: This approach may be preferred when the middleware is an ESB product, or when the load management is part of a business process. This approach is also preferred to avoid granting database access to the repository schema.

Accessing the Integration Load Web Service

The Integration Load Web Service is accessed at the following URL: http://<semarchy_server>/ws/1.1/platform/IntegrationLoadService?wsdl

Note that accessing the web service requires WS-Security authentication with a user having the Read/Write privileges on Data Location. See the "Managing the Security" chapter in the "Semarchy Convergence for MDM Administration Guide" for more information on privilege grants.

Initializing the External Load

Initializing an external load uses the following parameters:

Note: To mandate the use of an authenticated user, you must use the web service interface.

Initializing a Load (PL/SQL): INTEGRATION_LOAD.GET_NEW_LOADID

This function call to INTEGRATION_LOAD.GET_NEW_LOADID initializes an external load and returns a Load ID. It is performed on the repository database schema.

vLoad_id := <semarchy_repository_schema>.INTEGRATION_LOAD.GET_NEW_LOADID(
	'<data_location_name>'	/* Data Location Name (As in the UI) */
	<branch_id>		/* Branch ID (0, 1, ...) */
	'<program_name>'	/* Informational. Identifies the Middleware*/
	'<load_description>'	/* Informational. Describes the load. */ 
	'<user_name>'		/* User initializing the load. */
	  );

The following example performs a function call initializing an external load on the CustomerMDMHub data location, on the latest data edition of the data branch 0. The repository schema is SEMREPO . The returned Load ID is stored in the vLoad_id variable.

vLoad_id := SEMREPO.INTEGRATION_LOAD.GET_NEW_LOADID(
	'CustomerMDMHub',
	0,
	'Custom ETL',
	'Initial Load for the Hub',
	'John Doe' );

Initializing a Load (Web Service): getNewLoadID

This web service sample call on the getNewLoadID operation initializes an external load and returns a Load ID.

<soapenv:Envelope .../>
  <soapenv:Header/>
   <soapenv:Body>
      <ws:getNewLoadID>
         <dataLocationName><!-- data_location_name --></dataLocationName>
         <dataBranchID><!-- branch_id --></dataBranchID>
         <programName><!-- program_name --></programName>
         <loadDescription><!-- load_description --></loadDescription>
      </ws:getNewLoadID>
   </soapenv:Body>
</soapenv:Envelope>

Note: No user name required in the payload since the Web Service will use the credentials of the connected user (WS_Security)

The following example is a request initializing an external load on the CustomerMDMHub data location, on the data edition [0.0] (branch 0, version 0).

<soapenv:Envelope .../>
   <soapenv:Header/>
   <soapenv:Body>
      <int:getNewLoadID>
         <dataLocationName>CustomerMDMHub</dataLocationName>
         <dataBranchID>0</dataBranchID>
         <programName>Custom ETL</programName>
         <loadDescription>Initial Load for the Hub</loadDescription>
      </int:getNewLoadID>
   </soapenv:Body>
</soapenv:Envelope>

The response to this request contains the Load ID: 141.

<soap:Envelope .../>
   <soap:Body>
      <ns1:getNewLoadIDResponse ...>
         <return>141</return>
      </ns1:getNewLoadIDResponse>
   </soap:Body>
</soap:Envelope>

Loading Data

Loading data consists in inserting new source records in the SD tables for the set of entities taken into account in the integration job.

When loading data in the SD tables:

SD Tables to Load

Make sure to all the load tables that will be taken into account in the integration job. For example, if the integration job processes the Customer and Contact entities, then you should load the SD_CUSTOMER and SD_CONTACT tables.
If you insert data into the SD table of an entity that is not taken into account by the integration job, this data will be ignored. For example, if the integration job processes the Customer and Contact entities, data loaded the SD_EMPLOYEE table will not be taken into account by the integration job to certify golden records for the Employee entity.

Referential Integrity and Load Order

There is no required order to load the SD_ tables, as no foreign keys are implemented on the SD tables.
Reference validation is performed by the integration job, as a consequence references between entities must be loaded as indicated in the SD Columns to Load section.

Using Transactions

It is recommended to use a database transaction when writing to the SD tables just before submitting the load.
Canceling an external load – using the INTEGRATION_LOAD.CANCEL_LOAD() function or the cancelLoad() web service call – cancels the load but does not delete records from the SD tables. Writing for example in an auto-commit transaction in the SD tables then canceling the load will leave useless information in the SD tables.
Using a transaction gives the capability to rollback all changes on the SD_ tables when canceling a load.

SD Columns to Load

This section provides some guidance for loading the columns of the SD tables.

System Columns (Mandatory)

The following system columns must be loaded as indicated:

Note: Within a single load, you can load the records from various publishers, using the B_PUBID column to identify each publisher

Primary Keys (Mandatory)

The primary key that you load in the SD table allows identifying the source record from the source system – identified by Publisher Code in B_PUDIB.

The primary key column to load depends on the Matching Behavior of the entity.

ID Matching Entity

If the entity uses ID Matching, then this ID must be loaded in the column representing the attribute defined as the primary key attribute for the entity.

Fuzzy Matching Entity

If the entity uses Fuzzy Matching, then you must load in the B_SOURCEID column the value of the primary key from the source system. If this primary key is a composite key, then you must concatenate the values of the composite primary key and load them in the B_SOURCEID column.

Note: The B_SOURCEID column is a VARCHAR(128) column. Make sure to perform the appropriate conversions for loading this column.

References (Mandatory)

When loading data for entities that are related by a reference relationship, you must load the referencing entity with the value of the referenced primary key. The columns to load differ depending on the matching behavior the referenced entity.

Reference to an ID Matching Entity

If the referenced entity is an ID Matching entity, then you need to load the column representing the referencing attribute. This column is F_<Physical Name of the Reference To Role Name>.

For example, if Customer references Employee and this entity uses ID Matching, then you must load in SD_CUSTOMER the F_EMPLOYEE column with the primary key of the source employee record referenced by each customer record.

Reference to a Fuzzy Matching Entity

If the referenced entity is a Fuzzy Matching entity, then you need to load two columns:

Warning: Note that these columns should be considered together. You should not load the FP_ column with a publisher code and leave FS_ to a null value, and vice versa.

For example, if Contact references Customer and this entity use Fuzzy Matching, you must load in SD_CONTACT the following columns:

Attribute Columns

You should load the attribute columns relevant for the entity that you are loading.
Make sure to load:

Note: Attributes may be loaded with null or incorrect values if the values are set or modified by the enrichers. Enrichers are executed before any validation.

Other SD Columns (Optional)

The following columns do not need to be loaded or can be optionally loaded:

Submitting the External Load

Initializing an external load uses the following parameters:

Note: To mandate the use of an authenticated user, you must use the web service interface.

Submitting an External Load (PL/SQL): INTEGRATION_LOAD.SUBMIT_LOAD

The INTEGRATION_LOAD.SUBMIT_LOAD function call submits an external load identified by its Load ID and returns a Batch ID. It is performed on the repository database schema.

vBatch_id := <semarchy_repository_schema>.INTEGRATION_LOAD.SUBMIT_LOAD(
	<load_id>		/* Load ID returned by INTEGRATION_LOAD.GET_NEW_LOADID */
	'<integration_job>'	/* Name of the Integration Job to trigger. */
	'<user_name>'		/* User who has initialized the load. */
	  );

The following example performs a function call to submit an external load identified by the Load ID 22. It submits it with the job name INTEGRATE_DATA. The repository schema is SEMREPO . The returned Batch ID is stored in the vBatch_id variable.

vBatch_id := SEMREPO.INTEGRATION_LOAD.SUBMIT_LOAD(
	22,
	'INTEGRATE_DATA',
	'John Doe' );

Submitting an External Load (Web Service): submitLoad

This web service sample call on the submitLoad operation submits an external load identified by its Load ID and returns a Batch ID in the response.

<soapenv:Envelope ...>
   <soapenv:Header/>
   <soapenv:Body>
      <ws:submitLoad>
         <loadID><!-- load_id --></loadID>
         <jobName><!-- integration_job --></jobName>
      </ws:submitLoad>
   </soapenv:Body>
</soapenv:Envelope>

The following example is a request submitting an external load identified by the Load ID 125, with the INTEGRATE_DATA job.

<soapenv:Envelope .../>
   <soapenv:Body>
      <int:submitLoad>
         <loadID>125</loadID>
         <jobName>INTEGRATE_DATA</jobName>
      </int:submitLoad>
   </soapenv:Body>
</soapenv:Envelope>

The response to this request contains the Batch ID: 456.

<soap:Envelope .../>
   <soap:Body>
      <ns1:submitLoadResponse ...>
         <return>456</return>
      </ns1:submitLoadResponse>
   </soap:Body>
</soap:Envelope>

Note: No user name required here since the Web Service will use the credentials of the connected user (WS_Security).

Note: You can only submit the authenticated user’s loads.

Canceling the Load

Canceling a load is performed using a PL/SQL function call or a web service call, with the following parameters:

Note: To mandate the use of an authenticated user, you must use the web service interface.

Canceling a Load (PL/SQL): INTEGRATION_LOAD.CANCEL_LOAD

The following example performs a function call to INTEGRATION_LOAD.CANCEL_LOAD to cancel an external load identified by the Load ID 22.

SEMREPO.INTEGRATION_LOAD.CANCEL_LOAD(
	22,
	'John Doe' );

Canceling a Load (Web Service): cancelLoad

This web service sample call on the cancelLoad operation cancels a load identified by its Load ID.

<soapenv:Envelope ...>
   <soapenv:Header/>
   <soapenv:Body>
      <ws:cancelLoad>
         <loadID><!-- load_id --></loadID>
      </ws:cancelLoad>
   </soapenv:Body>
</soapenv:Envelope>

Note: No user name required here since the Web Service will use the credentials of the connected user (WS_Security).

Note: You can only cancel the authenticated user’s loads.

The following example is a request canceling an external load identified by the Load ID 125.

<soapenv:Envelope .../>
 <soapenv:Body>
      <int:cancelLoad>
         <loadID>125</loadID>
      </int:cancelLoad>
   </soapenv:Body></soapenv:Envelope>