Analyze database statistics in the integration job

This page explains how to customize the integration job to gather database statistics automatically for performance optimization.

The PARAM_ANALYZE_STATS job parameter provides a comprehensive way to perform statistics gathering as part of the job.

Analyze stats in Oracle

For Oracle, if PARAM_ANALYZE_STATS is set to 1, statistics collection is triggered in the data location tables to optimize processing.

Analyze stats in PostgreSQL

For PostgreSQL, you can set in PARAM_ANALYZE_STATS a JSON object to configure how statistics are collected. A sample JSON object is provided below.

Example 1. PostgreSQL: sample analyze stats configuration
{
	"useVacuum": true,
		"vacuumConfig": {
			"analyze": true,
			"full": true,
			"freeze": true
		}
}

In this object, set useVacuum to true to use PostgreSQL’s VACUUM command instead of ANALYZE for statistics collection. The default value is false.

If VACUUM is used, configure it with the vacuumConfig object:

  • analyze: Set to true to use the ANALYZE option. The default value is true.

  • full: Set to true to use the FULL option. The default value is false.

  • freeze: Set to true to use the FREEZE option. The default value is false.

Analyze stats in SQL Server

For SQL Server, you can set in PARAM_ANALYZE_STATS a JSON object to configure how statistics are collected using the Update Statistics statement. A sample JSON object is provided below.

Example 2. SQL Server: sample analyze stats configuration without sampling
{
	"fullScanConfig": {
		"persistSamplePercent" : "ON"
	},
	"applyOnOption": "ALL",
	"noRecompute": true,
	"incremental": "ON",
	"maxDop" : 8
}

In this object, FullScanConfig compute the statistics by scanning all the rows in the table or indexed view. These statistics are computed with the following options:

  • PersistSamplePercent (ON | OFF): When set to ON, the statistics will retain the set sampling percentage for subsequent updates that do not explicitly specify a sampling percentage. When OFF, statistics sampling percentage will reset to default sampling in subsequent updates that do not explicitly specify a sampling percentage. The default is OFF.

  • ApplyOnOption (ALL | COLUMNS | INDEX): Update all existing statistics, statistics created on one or more columns, or statistics created for indexes. If none of the options are specified, All statistics are updated.

  • NoRecompute: When true, disables the automatic statistics update option for the specified statistics. If this option is specified, the query optimizer completes this statistics update and disables future updates.

  • Incremental (ON | OFF): When set to ON, the statistics are recreated as per partition statistics. When OFF, the statistics tree is dropped and SQL Server re-computes the statistics. The default is OFF.

  • maxDop (max_degree_of_parallelism): When specified, overrides the max degree of parallelism configuration option for the duration of the statistic operation. The maximum is 64 processors. max_degree_of_parallelism can be:

    • 1: Suppresses parallel plan generation.

    • >1: Restricts the maximum number of processors used in a parallel statistic operation to the specified number or fewer based on the current system workload.

    • 0 (default): Uses the actual number of processors or fewer based on the current system workload.

You may also use an alternate payload (sample below), replacing the FullScanConfig with a SampleConfig element to analyze a sample of the records:

  • SampleConfig: Specifies the approximate percentage or number of rows in the table or indexed view for the query optimizer to use when it updates statistics. For PERCENT, the number can be from 0 through 100 and for ROWS, the number can be from 0 to the total number of rows.

Example 3. SQL Server: sample analyze stats configuration with sampling
{
	"sampleConfig": {
		"number" : 10,
		"unit" : "PERCENT",
		"persistSamplePercent" : "ON"
	},
	"applyOnOption": "ALL",
	"noRecompute": true,
	"incremental": "ON",
	"maxDop" : 8
}

You can also reuse the most recent sample rate, using the resampleConfig option, as shown below.

Example 4. SQL Server: sample analyze stats configuration with sampling
{
	"resampleConfig" : {},
	"applyOnOption": "ALL",
	"noRecompute": true,
	"incremental": "ON",
	"maxDop" : 8
}