Managing Execution
The Execution Engine processes the jobs submitted by the integration batch poller. It orchestrates the certification process for golden data.
Understanding Jobs, Queues and Logs
Jobs are processing units that run in the execution engine. There are two main types of jobs:
- Integration Jobs that process incoming batches to perform golden data certification.
- Deployment Jobs that deploy new model editions in data locations.
Jobs are processed in Queues. Queues work in First-In-First-Out (FIFO) mode: while a job runs in a queue, the next jobs remain queued and wait for their turn to run. To run two jobs in parallel, it is necessary to distribute them into different queues.
Queues are grouped into Clusters. There is one cluster per Data Location, named after the data location.
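The following minimal Python sketch is an illustration only (not the engine's implementation). It shows the FIFO behavior of a queue and how distributing jobs over two queues of the same cluster allows them to run in parallel; the cluster, queue and job names are assumptions made for the example.

    from collections import deque

    class Queue:
        """A FIFO job queue: jobs run one at a time, in submission order."""
        def __init__(self, name):
            self.name = name
            self.jobs = deque()

        def submit(self, job):
            self.jobs.append(job)          # new jobs wait behind the jobs already queued

        def run_next(self):
            if self.jobs:
                job = self.jobs.popleft()  # the oldest queued job runs first
                print(f"[{self.name}] running {job}")

    # A cluster simply groups the queues of one data location (names assumed).
    cluster = {"CustomerHub": [Queue("INTEGRATION_1"), Queue("INTEGRATION_2")]}

    # Two jobs submitted to the same queue run one after the other;
    # distributing them over two queues lets them run in parallel.
    queue_1, queue_2 = cluster["CustomerHub"]
    queue_1.submit("INTEGRATE LOAD_1")
    queue_2.submit("INTEGRATE LOAD_2")
    for queue in cluster["CustomerHub"]:
        queue.run_next()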
System Queues and Clusters
Specific queues and a specific cluster exist for administrative jobs:
- For each data location cluster, a specific System Queue called SEM_SYS_QUEUE is automatically created. This queue is used to run administrative operations for the data location. For example, this queue processes the deployment jobs updating the data structures in the data location.
- A specific System Cluster called SEM_SYS_CLUSTER, which contains a single SEM_SYS_QUEUE queue, is used to run platform-level maintenance operations.
Job Priority
As a general rule, integration jobs are processed in their queues and have the same priority.
There are two exceptions to this rule:
- Jobs updating the data location, principally model edition Deployment Jobs.
- Platform Maintenance Jobs that update the entire platform.
Model Edition Deployment Job
When a new model edition is deployed and requires data structure changes, DDL commands are issued as part of a job called DB Install <model edition name>. This job is launched in the SEM_SYS_QUEUE queue of the data location cluster.
This job modifies the tables used by the DML statements of the integration jobs. As a consequence, it must run while no integration job is running. This job takes precedence over all other queued jobs in the cluster, which means that:
- Jobs currently running in the cluster are completed normally.
- All the queues in the cluster, except the SEM_SYS_QUEUE, are moved to a BLOCKED status. Queued jobs remain in the queue and are no longer executed.
- The model edition deployment job is executed in the SEM_SYS_QUEUE.
- When this job is finished, the other queues return to the READY status and resume the processing of their queued jobs.
This execution model guarantees minimal downtime of the integration activity while avoiding conflicts between integration jobs and model edition deployments, as illustrated in the sketch below.
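As an illustration only (not how the product is implemented), the following Python sketch outlines the precedence logic described above. The statuses and the SEM_SYS_QUEUE name come from this section; the cluster contents and job names are assumptions.

    READY, BLOCKED = "READY", "BLOCKED"

    def deploy_model_edition(cluster_queues, deployment_job):
        """Give a deployment job precedence over the other queues of its cluster."""
        # 1. Move every queue of the cluster, except SEM_SYS_QUEUE, to BLOCKED:
        #    their queued jobs stay queued but are no longer started.
        #    (Jobs already running are allowed to complete normally.)
        for name, queue in cluster_queues.items():
            if name != "SEM_SYS_QUEUE":
                queue["status"] = BLOCKED

        # 2. Execute the deployment job in the system queue.
        print(f"SEM_SYS_QUEUE: running {deployment_job}")

        # 3. Return the other queues to READY so they resume their queued jobs.
        for queue in cluster_queues.values():
            queue["status"] = READY

    cluster = {
        "SEM_SYS_QUEUE": {"status": READY, "jobs": []},
        "INTEGRATION": {"status": READY, "jobs": ["INTEGRATE LOAD_42"]},
    }
    deploy_model_edition(cluster, "DB Install <model edition name>")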
Platform Maintenance Job
If a job is queued in the SEM_SYS_CLUSTER/SEM_SYS_QUEUE queue, it takes precedence over all other queued jobs in the execution engine.
This means that:
- Jobs currently running in all the clusters are completed.
- All the clusters and queues except the SEM_SYS_CLUSTER/SEM_SYS_QUEUE are moved to a BLOCKED status. Queued jobs are no longer executed in these queues/clusters.
- The job in the SEM_SYS_CLUSTER/SEM_SYS_QUEUE is executed.
- When this job is finished, the other queues are moved to the READY status and resume the processing of their queued jobs.
This execution model guarantees minimal disruption of the platform activity while avoiding conflicts between the platform activity and maintenance operations.
Queue Behavior on Error
When a job running in a queue encounters a run-time error, it behaves differently depending on the queue configuration:
- If the queue is configured to Suspend on Error, the job is suspended at the error point and blocks the rest of the queued jobs. This job can be resumed when the cause of the error is fixed, or cancelled by user choice.
- If the queue is not configured to Suspend on Error, the job fails automatically and the next jobs in the queue are executed. The failed job cannot be restarted. Both behaviors are illustrated in the sketch at the end of this section.
Warning: Integration jobs are processed in FIFO mode. A job that fails automatically or is cancelled by user choice cannot be restarted. To resubmit the source data for certification, the external load must be resubmitted entirely as a new load.
Warning: The integration job performs a commit after each task. As a consequence, when a job fails or is suspended, already processed entities have their golden data certified and committed in the hub.
Suspending a job on error is the preferred configuration under the following assumptions:
- All the data in a batch needs to be integrated as a single atomic operation. For example, due to referential integrity, it is not possible to integrate contacts without customers and vice versa. Suspending the job guarantees that it can be continued, after fixing the cause of the error, with the data location preserved in the same state.
- Batches and integration jobs are submitted in a precise sequence that represents the changes in the source, and need to be processed in the order they were submitted. For example, a data value change in a skipped batch may impact the consolidation of future batches. Suspending the job guarantees that jobs are processed in their exact submission sequence, and that no batch is skipped without an explicit user choice.
There may be some cases in which this behavior can be changed:
- If the batches/jobs do not have strong integrity or sequencing requirements, they can be skipped on error by default. These jobs can run in a queue where Suspend on Error is disabled.
- If integration velocity is critical for making golden data available as quickly as possible, the queue running the integration jobs can be configured with Suspend on Error disabled.
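The following minimal Python sketch is illustrative only (not the engine's implementation) and contrasts the two behaviors: with Suspend on Error, the failed job stays at the head of the queue and blocks it; without it, the job fails and the queue moves on. The job names and the simulated error are assumptions.

    from collections import deque

    def process_queue(jobs, suspend_on_error):
        """Process a FIFO queue of (name, action) jobs with one of the two error behaviors."""
        queue = deque(jobs)
        while queue:
            name, action = queue[0]
            try:
                action()
                queue.popleft()                  # job completed, move on to the next one
            except Exception as error:
                if suspend_on_error:
                    print(f"{name} suspended on {error!r}; the queue is SUSPENDED")
                    return queue                 # the job stays queued, later jobs wait
                print(f"{name} failed on {error!r}; moving on to the next job")
                queue.popleft()                  # the failed job cannot be restarted
        return queue

    jobs = [("INTEGRATE LOAD_1", lambda: None),
            ("INTEGRATE LOAD_2", lambda: 1 / 0),  # simulated run-time error
            ("INTEGRATE LOAD_3", lambda: None)]

    process_queue(jobs, suspend_on_error=True)    # stops at LOAD_2; LOAD_3 keeps waiting
    process_queue(jobs, suspend_on_error=False)   # skips LOAD_2 and runs LOAD_3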
Queue Status
A queue is in one of the following statuses:
- READY: The queue is available for processing jobs.
- SUSPENDED: The queue is blocked because a job has encountered an error and remains suspended. Queued jobs are not processed until the queue becomes READY again, either when the job is cancelled or when it finishes successfully. For more information, see the Troubleshooting Errors section.
- BLOCKED: When a job is running in the SEM_SYS_QUEUE queue of the cluster, the other queues are moved to this status. Jobs cannot be executed in a blocked queue and remain queued until the queue becomes READY again.
A cluster can be in one of the following statuses:
- READY: The cluster is not blocked by the SEM_SYS_CLUSTER cluster, and queues under this cluster can process jobs.
- BLOCKED: The cluster is blocked when a job is running in the SEM_SYS_CLUSTER cluster. When a cluster is blocked, all its attached queues are also blocked.
Managing the Execution Engine and the Queues
Accessing the Execution Engine
To access the execution engine:
- In the Administration view, double-click the Execution Engine node.
The Execution Engine editor opens.
The Execution Engine Editor
This editor displays the list of queues, grouped by clusters. If a queue is currently pending on a suspended job, it appears in red.
From the Execution Engine editor, you can perform the following operations:
Changing the Queue Behavior on Error
Note: See the Troubleshooting Errors and Queue Behavior on Error sections for more information about queue behavior on error and error management.
To change the queue behavior on error:
- In the Administration view, double-click the Execution Engine node. The Execution Engine editor opens.
- Select or de-select the Suspend on Error option for a queue to set its behavior on error, or on a cluster to set the behavior of all queues in this cluster.
- Press CTRL+S to save the configuration. This configuration is immediately active.
Opening an Execution Console for a Queue
The execution console provides the details of the activity of a given queue. This information is useful to monitor the activity of jobs running in the queue, and to troubleshoot errors.
Note: The content of the execution console is not persisted. Executions prior to opening the console are not displayed in this console. In addition, if the console is closed, its content is lost.
To open the execution console:
- In the Administration view, double-click the Execution Engine node. The Execution Engine editor appears.
- Select the queue, right-click and select Open Execution Console.
The Console view for this queue opens. Note that it is possible to open multiple execution consoles to monitor the activity of multiple queues.
In the Console view toolbar, you have access to the following operations:
- The Close Console button closes the current console. The consoles for the other queues remain open.
- The Clear Console button clears the content of the current console.
- The Display Selected Log button allows you to select one of the execution consoles currently open.
Restarting a Suspended Job in a Queue
To restart a suspended job in a queue:
- In the Administration view, double-click the Execution Engine node. The Execution Engine editor appears. The suspended queue appears in red.
- Select the suspended queue.
- Right-click and then select Restart Job.
The job restarts from the failed step. If the execution console for this queue is open, the details of the execution are shown in the Console.
Canceling a Suspended Job in a Queue
To cancel a suspended job in a queue:
- In the Administration view, double-click the Execution Engine node. The Execution Engine editor appears. The suspended queue appears in red.
- Select the suspended queue.
- Right-click and then select Ignore Failure.
The job is cancelled; the queue becomes READY and starts processing queued jobs.
In the job logs, this job appears in the Error status.
Managing Job Logs
The job logs list the jobs currently being executed by the execution engine, as well as the jobs executed in the past. Reviewing the job logs allows you to monitor the activity of these jobs and troubleshoot execution errors.
Accessing the Job Logs
To access the logs:
- Open the Administration Console perspective.
- In the Administration view, double-click the Job Logs node.
The Job Logs editor opens.
The Job Logs Editor
From this editor, you can review the job execution logs and drill down into these logs.
The following actions are available from the Job Logs editor toolbar:
- Use the Refresh button to refresh the view.
- Use the Auto Fit Column Width button to adjust the size of the columns.
- Use the Apply and Manage User Defined Filters button to filter the log. See the Filtering the Logs section for more information.
- Use the Purge Selection button to delete the entries selected in the job logs table. See the Purging the Logs section for more information.
- Use the Purge using a Filter button to purge logs using an existing or a new filter. See the Purging the Logs section for more information.
Drilling Down into the Logs
The Job Logs editor displays the log list. This view includes:
- The Name, Start Date, End Date and Duration of the job, as well as the name of its creator (Created By).
- The Message returned by the job execution. This message is empty if the job is successful.
- The row statistics for the job:
  - Select Count, Insert Count, Update Count, Deleted Count: the number of rows selected, inserted, updated, deleted or merged as part of this job.
  - Row Count: the sum of all these metrics.
To drill down into the logs:
- Double-click a log entry in the Job Logs editor. The Job Log editor opens. It displays all the information available in the job logs list, plus:
  - The Job Definition: this link opens the job definition for this log.
  - The Job Log Parameters: the startup parameters for this job, for example the Batch ID and the Load ID.
  - The Tasks: in this list, each entity is displayed with the statistics for this integration job instance.
- Double-click one entity in the Tasks list to drill down into the Task Group Log corresponding to this entity. The Task Group Log for the entity shows the details of the entity task and the list of task groups performed for the entity. These task groups represent the successive operations performed for the given entity, for example Enrich and Standardize, Validate Source Data, etc.
- Double-click one of the task groups in the Task Log list to drill down into this task group. Each task group may contain one or more tasks or child task groups. For example, the Enrich and Standardize task group contains the log of the enrichers executed for the given entity.
- Double-click a task to drill down into the Task Log. The task log shows the task statistics and provides a link to the Task Definition.
By drilling down from the task groups to the tasks, it is possible to monitor the activity of a job and review in the definition the executed code or plug-in.
Filtering the Logs
To create a job log filter:
- In the Job Logs editor, click the Apply and Manage User Defined Filters button and then select Search. The Define Filter dialog opens.
- Provide the filtering criteria:
  - Job Name: the name of the job. Use the _ and % wildcards to represent respectively one character and any number of characters. For example, INTEGRATE% matches all job names starting with INTEGRATE.
  - Created By: the name of the job creator. The _ and % wildcards can also be used here.
  - Status: select the list of job statuses included in the filter.
  - Timeframe: select a time range for the job start date.
- Click the Save as Preferred Filter option and enter a filter name to save this filter.
Saved filters appear when you click the Apply and Manage User Defined Filters button. You can enable or disable a filter by marking it as active or inactive from this menu. You can also use Apply All and Apply None to enable or disable all saved filters.
Note: Filters are saved in the user preferences and can be shared using preferences import/export.
To manage job log filters:
- Click the Apply and Manage User Defined Filters button, then select Manage Filters. The Manage User Filters editor opens.
- From this editor, you can add, delete or edit a filter, and enable or disable filters for the current view.
- Click Finish to apply your changes.
Purging the Logs
You can purge selected job logs or all job logs returned by a filter.
To purge selected job logs:
- In the Job Logs editor, select the job logs that you want to purge. Press the CTRL key to select multiple lines or the SHIFT key to select a range of lines.
- Click the Purge Selection button.
- Click OK in the confirmation window.
The selected job logs are deleted.
To purge filtered job logs:
- In the Job Logs editor, click the Purge using a Filter button.
- To use an existing filter:
  - Select the Use Existing Filter option.
  - Select a filter from the list and then press Finish.
- To create a new filter:
  - Select the Define New Filter option and then click Next.
  - Provide the filter parameters, as explained in the Filtering the Logs section, and then click Finish.
- To purge all logs (no filter):
  - Select the Purge All Logs (No Filter) option and then click Finish.
The job logs are purged.
Note: It is possible to trigger job log purges through web services. The Administration Service exposes such operations.
Troubleshooting Errors
When a job fails, depending on the configuration of the queue in which this job runs, it ends up in either the Suspended or the Error status.
The status of the job defines the possible actions on this job:
- A job in Error cannot be continued or restarted. It can be reviewed for analysis, and possible fixes will only affect subsequent jobs.
- A Suspended job blocks the entire queue, and can be restarted after fixing the problem, or cancelled.
Semarchy Convergence for MDM provides several capabilities to help you troubleshoot issues. You can drill down into the erroneous task to identify the issue, or restart the job with the execution console activated.
To troubleshoot an error:
- Open the Job Logs.
- Double-click the log entry marked as Suspended or in Error.
- Drill down into the Task Log, as explained in the Drilling Down into the Logs section.
- In the Task Log, review the Message.
- Click the Task Definition link to open the task definition and review the SQL statements involved, or the plug-in called in this task.
Scheduling Data Purges
Data Purge helps you maintain a reasonable storage volume for the MDM hub and the repository by pruning historical data and job logs.
Introduction to Data Purge
The MDM hub stores the lineage of the certified golden data, as well as the changes that led to this golden data.
Preserving the lineage and history is a master data governance requirement and is key for regulatory compliance. However, keeping this information may also create a large volume of data in the hub storage.
To make sure lineage and history are preserved according to the data governance and compliance requirements, model designers define the Data Retention Policy in the model. To keep a reasonable volume of information, administrators have to schedule regular Purges for this data.
Purges are managed by the Purge Scheduler. This service manages purge schedules, and triggers the appropriate purge job on the execution engine to prune the lineage and history according to the Data Retention Policy.
The purges delete the following elements of the lineage and history:
- Source data pushed into the hub via external loads
- Data authored or modified in data entry workflows
- Errors detected on the source data by the integration job
- Errors detected on the candidate golden records by the integration job
- Duplicate choices made in duplicate management workflows. The duplicate management decision still applies, but the time of the decision and the decision maker information are deleted.
Optionally, the job logs can also be deleted as part of the purge process.
Accessing the Purge Scheduler
To access the purge scheduler:
- Open the Administration Console perspective.
- In the Administration view, double-click the Purge Scheduler node. The Purge Scheduler editor opens.
This editor displays the scheduled purges. From the Purge Scheduler editor, you can stop, start or restart the Purge Scheduler service.
Creating a Purge Schedule
To create a purge schedule:
- In the Purge Scheduler editor toolbar, click the New Purge Schedule button. The Data Branch Purge Scheduling wizard opens.
- Select the data branches that you want to purge and then click the Add >> button.
- Click the Next button.
- Enter a Cron Expression to define the purge schedule, or set a purge frequency (Monthly, Weekly, Daily). An example expression is shown after these steps.
- Select the Purge Repository Logs option to prune the logs related to the purged history and lineage.
- Click Finish to close the wizard.
- Press CTRL+S to save the editor.
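For example, a weekly or monthly purge could use a cron expression similar to the ones below. The exact syntax depends on the cron flavor supported by the Purge Scheduler; the examples assume a Quartz-style expression with a leading seconds field, so check the wizard help for the format actually expected.

    0 0 2 ? * SUN    Purge every Sunday at 2:00 AM
    0 30 1 1 * ?     Purge on the first day of every month at 1:30 AM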
Note: Regardless of the frequency of the purges scheduled by the administrator, the data history retained is as defined by the model designer in the data retention policies.
Note: It is possible to trigger data purges through web services. The Administration Service exposes such operations.