Integration Jobs Parameters Reference

The following table lists the parameters available to customize and optimize the execution of integration jobs.

NOTE: For parameter changes to take effect, you must redeploy the model and select the Generate Jobs Definition option in the Deploy Model Edition wizard.

Parameter Name Values Description

PARAM_AGGREGATE_JOB_ENRICHERS

0 or 1

If this parameter is set to 1, consecutive SemQL enrichers are merged into a single SQL statement when executed. This applies to all entities. For more information, see enricher aggregation. The default value for this parameter is 0.

PARAM_AGGREGATE_JOB_PLUGIN_ENRICHERS

0 or 1

If this parameter is set to 1, consecutive API enrichers are piped so that the output of an enricher is directly passed in memory to the next one. This applies to all entities. For more information, see enricher aggregation. The default value for this parameter is 0.

PARAM_AGGREGATE_ENTITY_ENRICHERS_<entity>

0 or 1

If this parameter is set to 1, consecutive SemQL enrichers are merged into a single SQL statement when executed. This applies only to the entity whose name is provided as <entity>. For more information, see enricher aggregation. The default value for this parameter is 0.

PARAM_AGGREGATE_ENTITY_PLUGIN_ENRICHERS_<entity>

0 or 1

If this parameter is set to 1, consecutive API enrichers are piped, so that the output of an enricher is directly passed in memory to the next one. This applies only to the entity whose name is provided as <entity>. For more information, see enricher aggregation. The default value for this parameter is 0.
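The difference between running consecutive API enrichers individually and piping them can be pictured with the following Python sketch. It only illustrates the idea described above and is not the platform's implementation: the `upper_name` and `add_greeting` enrichers, the `writes` list standing in for database write passes, and both runner functions are hypothetical.

```python
from typing import Callable, Dict, List

Record = Dict[str, str]
Enricher = Callable[[Record], Record]


def upper_name(rec: Record) -> Record:
    # Hypothetical enricher: upper-cases the name attribute.
    return {**rec, "name": rec["name"].upper()}


def add_greeting(rec: Record) -> Record:
    # Hypothetical enricher: derives a new attribute from the name.
    return {**rec, "greeting": "Hello " + rec["name"]}


def run_individually(records: List[Record], enrichers: List[Enricher],
                     writes: List[int]) -> List[Record]:
    # Without aggregation: each enricher processes the whole dataset and
    # its output is written out before the next enricher starts.
    current = list(records)
    for enrich in enrichers:
        current = [enrich(rec) for rec in current]
        writes.append(len(current))  # one write pass per enricher
    return current


def run_piped(records: List[Record], enrichers: List[Enricher],
              writes: List[int]) -> List[Record]:
    # With aggregation: each record flows through all enrichers in memory,
    # and only the final result is written out.
    result = []
    for rec in records:
        for enrich in enrichers:
            rec = enrich(rec)
        result.append(rec)
    writes.append(len(result))  # single write pass at the end of the chain
    return result
```

In this sketch, both runners produce the same enriched records, but the piped variant records a single write pass instead of one per enricher.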

PARAM_ANALYZE_STATS

0, 1, or JSON object

If this parameter is set to 1, statistics collection is triggered in the data location tables to optimize processing. This option is useful to accelerate the processing of large datasets. The default value for this parameter is 1.

For PostgreSQL and SQL Server, this parameter supports a JSON object to define advanced options, described in analyzing database statistics.

PARAM_CHILD_POSTPROCESSING_JOB

true, false, or <job_name>

Specify whether a post-processing job should be started after the current job for the records that reference a parent record whose golden ID has changed. This new job starts in a new batch.

Possible values are:

  • true: start a post-processing job instance using the current job definition.

  • false (default value): do not start a post-processing job.

  • <job_name>: start <job_name> to perform the post-processing.

A post-processing job is not required if the current job is designed to process the parent entities first, then the child entities. However, this ordering is not always possible. A typical case is matching self-references.
In the case of a self-reference, you can use the same job (using true) for post-processing. New instances of that job are started as long as there are records in the hierarchy that require it.

PARAM_CHILD_POSTPROCESSING_JOB_<entity>

true, false, or <job_name>

Specify whether a post-processing job should be started after the current job for records of <entity> that reference a parent record whose golden-record ID has changed.

This parameter works like PARAM_CHILD_POSTPROCESSING_JOB but only applies to the records of <entity>.

If records from multiple entities need post-processing, and these entities have different post-processing jobs specified (using PARAM_CHILD_POSTPROCESSING_JOB or PARAM_CHILD_POSTPROCESSING_JOB_<entity>), then those jobs start separately, in different batches, when the current job completes.
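The way the true, false, and <job_name> values map entities to post-processing jobs can be sketched in Python as follows. This is a minimal illustration, not the platform's code. It assumes that the entity-specific parameter takes precedence over the job-level one, which the text above does not state explicitly. The function names, the entity names, and the job names used in the usage note are hypothetical.

```python
from collections import defaultdict
from typing import Dict, List, Optional

JOB_LEVEL = "PARAM_CHILD_POSTPROCESSING_JOB"


def resolve_postprocessing_job(value: str, current_job: str) -> Optional[str]:
    # "true" reuses the current job definition, "false" starts nothing,
    # and any other value is taken as the name of the job to start.
    if value == "true":
        return current_job
    if value == "false":
        return None
    return value


def plan_postprocessing(params: Dict[str, str], entities: List[str],
                        current_job: str) -> Dict[str, List[str]]:
    # Groups the entities that need post-processing by the job that will
    # handle them. Each key represents a job started separately.
    job_level_value = params.get(JOB_LEVEL, "false")  # false is the default
    plan: Dict[str, List[str]] = defaultdict(list)
    for entity in entities:
        # Assumption: the entity-specific value overrides the job-level value.
        value = params.get(f"{JOB_LEVEL}_{entity}", job_level_value)
        job = resolve_postprocessing_job(value, current_job)
        if job is not None:
            plan[job].append(entity)
    return dict(plan)
```

For example, with PARAM_CHILD_POSTPROCESSING_JOB set to true, PARAM_CHILD_POSTPROCESSING_JOB_Contact set to POSTPROCESS_CONTACTS, and a current job named INTEGRATE_DATA, the sketch assigns Contact to POSTPROCESS_CONTACTS and every other entity to a new instance of INTEGRATE_DATA.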

PARAM_ENABLE_DELETE_PHASE

0 or 1

If this parameter is set to 1 (default value), the phase that processes possible deletions is executed. To entirely skip this phase, set this parameter to 0. In that case, deletions are not processed.

PARAM_NORMALIZE_DATE

0 or 1

This parameter applies to data locations running on Oracle. If it is set to 0, the job phase that truncates timestamp columns representing date attributes to the date is skipped. You may want to skip this phase if the truncation is not needed. The default value for this parameter is 1.

PARAM_NORMALIZE_STRING

0 or 1

This parameter applies to data locations running on PostgreSQL and SQL Server. If it is set to 0, the job phase that transforms empty strings to null values is skipped. You may want to skip this phase if the normalization is not needed. The default value for this parameter is 1.
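The effect of the two normalization phases on individual values can be illustrated with the short Python sketch below. It only mirrors the transformations described above (truncating a timestamp to its date and turning an empty string into a null value). The function names are hypothetical, and the statements that the job actually runs in the database are not shown here.

```python
from datetime import datetime
from typing import Optional


def normalize_date(value: datetime) -> datetime:
    # Truncates a timestamp to the date by clearing the time portion.
    return value.replace(hour=0, minute=0, second=0, microsecond=0)


def normalize_string(value: Optional[str]) -> Optional[str]:
    # Transforms an empty string into a null value; other values are kept.
    return None if value == "" else value
```

Setting the corresponding parameter to 0 amounts to skipping the matching function in this sketch, so the values are loaded as they were submitted.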

PARAM_PLUGIN_ENRICHERS_UPDATE_BATCH_SIZE

Number

When both the API enrichers (Java plug-ins and REST clients) and the database are fast, the network can become the bottleneck. A batch update is a group of updates sent to the database together, rather than one by one. This parameter sets the JDBC batch update size that API enrichers use when writing records to the database. A high value reduces the number of network round trips needed to write the data but increases the memory load on the server. This parameter applies to API enrichers running individually. For chained API enrichers, it applies to the last enricher in the chain, which writes the enriched data to the database.
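The trade-off behind the batch size can be sketched in Python. The platform uses JDBC batch updates. The example below uses Python's `sqlite3` module and `executemany` as a stand-in, so it shows the grouping logic only and involves no real network traffic. The `customer` table and the `write_in_batches` function are hypothetical.

```python
import sqlite3
from typing import List, Tuple


def write_in_batches(conn: sqlite3.Connection,
                     rows: List[Tuple[str, int]],
                     batch_size: int) -> int:
    # Sends the updates in groups of batch_size and returns the number of
    # groups, which stands for the number of round trips to the database.
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    batches = 0
    for start in range(0, len(rows), batch_size):
        chunk = rows[start:start + batch_size]  # held in memory until sent
        conn.executemany("UPDATE customer SET name = ? WHERE id = ?", chunk)
        batches += 1
    conn.commit()
    return batches
```

With 10 rows to update, a batch size of 4 gives 3 batches and a batch size of 1 gives 10. A larger batch size means fewer batches, but each batch keeps more rows in memory before it is sent.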

PARAM_RECYCLE_ERRORS

0 or 1

If this parameter is set to 1, error recycling is activated, and source-record rejects from prior job executions are recycled in the current job. The default value for this parameter is 0.

The error recycling mechanism applies exclusively to pre-consolidation errors (in the SE and AE tables). It does not extend to post-consolidation errors (in the GE table).

PARAM_SINGLE_TASK_ENTITY_CONSOLIDATION_<entity_name>

0 or 1

When this parameter is set to 1, the job performs consolidation for the <entity_name> entity in a single step, which strongly improves performance for high data volumes. When this parameter is set to 0, consolidation for that entity is performed in two steps (clustering and matching).

The default behavior is defined by PARAM_SINGLE_TASK_JOB_CONSOLIDATION.

PARAM_SINGLE_TASK_JOB_CONSOLIDATION

0 or 1

When this parameter is set to 1, the job performs consolidation in a single step, which strongly improves performance for high data volumes. When this parameter is set to 0, consolidation is performed in two steps (clustering and matching). This parameter applies to all entities for which no PARAM_SINGLE_TASK_ENTITY_CONSOLIDATION_<entity_name> is set.

The default value for this parameter can be configured with the xdm.integjob.useSingleStepConsolidation system property. Setting this property to true makes 1 the default value for PARAM_SINGLE_TASK_JOB_CONSOLIDATION. The default value for this property is false.
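The order in which the entity-level parameter, the job-level parameter, and the system property are consulted can be summarized with the following Python sketch. It is an illustration of the precedence described above, not the platform's code. The function name is hypothetical, and representing parameters and properties as dictionaries of strings is an assumption.

```python
from typing import Dict

JOB_PARAM = "PARAM_SINGLE_TASK_JOB_CONSOLIDATION"
ENTITY_PARAM_PREFIX = "PARAM_SINGLE_TASK_ENTITY_CONSOLIDATION_"
SYSTEM_PROPERTY = "xdm.integjob.useSingleStepConsolidation"


def use_single_step_consolidation(entity: str,
                                  job_params: Dict[str, str],
                                  system_properties: Dict[str, str]) -> bool:
    # 1. An entity-level parameter, when set, decides for that entity.
    entity_key = ENTITY_PARAM_PREFIX + entity
    if entity_key in job_params:
        return job_params[entity_key] == "1"
    # 2. Otherwise the job-level parameter applies, when set.
    if JOB_PARAM in job_params:
        return job_params[JOB_PARAM] == "1"
    # 3. Otherwise the system property supplies the default (false if unset).
    return system_properties.get(SYSTEM_PROPERTY, "false") == "true"
```

For instance, if the job-level parameter is set to 1 and the entity-level parameter for a hypothetical Customer entity is set to 0, the sketch returns False for Customer and True for every other entity.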