Optimize Enricher Execution with Aggregation

Enrichers can be aggregated to enable faster processing.

When not aggregated, enrichers run one after the other, each one reading the data, modifying it, and then writing it back to the database. Aggregation reduces database read/write operations by executing multiple enrichers in a single operation.

This page explains how enricher execution can be optimized using aggregation.

Enable Enricher Aggregation

SemQL Enrichers Aggregation

Multiple consecutive SemQL enrichers can be aggregated using the PARAM_AGGREGATE_JOB_ENRICHERS and PARAM_AGGREGATE_ENTITY_ENRICHERS_<entity_name> job parameters. Aggregation converts these enrichers into a single SQL statement that the database processes in one operation, which avoids consecutive database read/write operations.
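To illustrate why a single statement saves database round trips, here is a minimal Python sketch using the standard sqlite3 module. The customer table and the two enrichment expressions are hypothetical, and the sketch does not show the SQL that the platform actually generates; it only compares two separate UPDATE statements with one combined statement that yields the same rows.

```python
import sqlite3


def make_table() -> sqlite3.Connection:
    """Create an in-memory table with two sample rows (hypothetical data)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT, email TEXT)"
    )
    conn.executemany(
        "INSERT INTO customer (id, name, email) VALUES (?, ?, ?)",
        [(1, "  alice ", "Alice@Example.COM"), (2, "bob", " BOB@example.com ")],
    )
    return conn


def run_separately(conn: sqlite3.Connection) -> int:
    """Run each enrichment as its own UPDATE; return the statement count."""
    statements = [
        "UPDATE customer SET name = UPPER(TRIM(name))",
        "UPDATE customer SET email = LOWER(TRIM(email))",
    ]
    for statement in statements:
        conn.execute(statement)
    return len(statements)


def run_aggregated(conn: sqlite3.Connection) -> int:
    """Run both enrichments in one UPDATE; return the statement count."""
    conn.execute(
        "UPDATE customer "
        "SET name = UPPER(TRIM(name)), email = LOWER(TRIM(email))"
    )
    return 1


def rows(conn: sqlite3.Connection) -> list:
    """Return all rows in a stable order for comparison."""
    return conn.execute(
        "SELECT id, name, email FROM customer ORDER BY id"
    ).fetchall()
```

The two expressions here update different columns, so combining them is straightforward. Enrichers that depend on each other's outputs would need their expressions composed, which this sketch does not attempt.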

API Enrichers Aggregation

Multiple consecutive API enrichers (Java plug-ins and REST clients) can be aggregated using the PARAM_AGGREGATE_JOB_PLUGIN_ENRICHERS and PARAM_AGGREGATE_ENTITY_PLUGIN_ENRICHERS_<entity_name> job parameters. This process creates a memory-efficient chain that processes data in a single pass, avoiding successive database read/write operations.
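The single-pass idea can be sketched in Python as follows. This is a toy model, not the platform's implementation: FakeStore, the two enricher functions, and the record fields are all hypothetical, and the store simply counts bulk reads and writes so the two strategies can be compared.

```python
from typing import Callable, Dict, List

Record = Dict[str, str]
Enricher = Callable[[Record], Record]


class FakeStore:
    """Hypothetical in-memory store that counts bulk reads and writes."""

    def __init__(self, records: List[Record]) -> None:
        self.records = [dict(r) for r in records]
        self.reads = 0
        self.writes = 0

    def read_all(self) -> List[Record]:
        self.reads += 1
        return [dict(r) for r in self.records]

    def write_all(self, records: List[Record]) -> None:
        self.writes += 1
        self.records = [dict(r) for r in records]


def normalize_country(record: Record) -> Record:
    """Hypothetical enricher: upper-case the country code."""
    return {**record, "country": record["country"].upper()}


def add_label(record: Record) -> Record:
    """Hypothetical enricher: build a display label from name and country."""
    return {**record, "label": f"{record['name']} ({record['country']})"}


def run_one_by_one(store: FakeStore, enrichers: List[Enricher]) -> None:
    """Non-aggregated: every enricher does its own read and write."""
    for enrich in enrichers:
        store.write_all([enrich(r) for r in store.read_all()])


def run_chained(store: FakeStore, enrichers: List[Enricher]) -> None:
    """Aggregated: one read, each record flows through the chain, one write."""
    enriched = []
    for record in store.read_all():
        for enrich in enrichers:
            record = enrich(record)
        enriched.append(record)
    store.write_all(enriched)
```

With two enrichers, run_one_by_one performs two reads and two writes while run_chained performs one of each, and both leave the same records in the store.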

Enricher Aggregation Rules and Limitations

Enricher aggregation adheres to the following rules:

  • Only successive enrichers of the same type (SemQL or API) can be aggregated. For example, in a sequence of enrichers such as SEMQL_1, SEMQL_2, PLUGIN_1, PLUGIN_2, and SEMQL_3, you can aggregate SEMQL_1 with SEMQL_2, and PLUGIN_1 with PLUGIN_2. SEMQL_3 cannot join the first SemQL group because the two API enrichers separate it from SEMQL_1 and SEMQL_2.

  • API enricher aggregation stops when:

    • An API enricher has a filter that uses an attribute updated by a previous enricher in the chain.

    • An API enricher has an input that contains a complex SemQL expression with attributes updated by a previous enricher in the chain.

    • An API enricher has any of the following options set to a different value than a previous enricher in the chain: Thread pool size, Max retry, Behavior on error, Batch update size, or Processing batch size.
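The rules above can be modeled with a short Python sketch that splits an ordered list of enrichers into aggregation groups. The Enricher class and its fields are hypothetical, not part of any product API, and the sketch assumes that an enricher that stops a chain starts a new group of its own.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Enricher:
    """Hypothetical description of one enricher, for illustration only."""
    name: str
    kind: str  # "semql" or "api"
    outputs: frozenset = frozenset()  # attributes this enricher updates
    filter_attributes: frozenset = frozenset()  # attributes used in its filter
    complex_input_attributes: frozenset = frozenset()  # used in complex SemQL inputs
    options: tuple = ()  # thread pool size, max retry, and so on


def breaks_chain(chain: List[Enricher], candidate: Enricher) -> bool:
    """Return True if candidate cannot join the current chain."""
    if candidate.kind != chain[-1].kind:
        return True  # only successive enrichers of the same type aggregate
    if candidate.kind != "api":
        return False  # the stop conditions below are listed for API enrichers
    updated = frozenset().union(*(e.outputs for e in chain))
    if candidate.filter_attributes & updated:
        return True  # filter uses an attribute updated earlier in the chain
    if candidate.complex_input_attributes & updated:
        return True  # complex input expression uses an updated attribute
    # Every member of a chain shares the same options, so comparing with
    # the last member is equivalent to comparing with any previous one.
    return candidate.options != chain[-1].options


def group_enrichers(enrichers: List[Enricher]) -> List[List[Enricher]]:
    """Split an ordered enricher list into groups that could be aggregated."""
    groups: List[List[Enricher]] = []
    for enricher in enrichers:
        if groups and not breaks_chain(groups[-1], enricher):
            groups[-1].append(enricher)
        else:
            groups.append([enricher])
    return groups
```

For a sequence of two SemQL enrichers, two API enrichers, and one more SemQL enricher, group_enrichers returns three groups: the first two SemQL enrichers, the two API enrichers, and the last SemQL enricher alone. If the second API enricher filtered on an attribute that the first one updates, it would be placed in its own group.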