Welcome to Semarchy Convergence for Data Integration.
This guide contains information about using Semarchy Convergence for Data Integration to design and develop a data integration project.

Preface

Audience

This document is intended for users interested in using Semarchy Convergence for Data Integration for their Enterprise Master Data Management Initiatives.

If you want to learn about MDM or discover Semarchy Convergence for MDM, you can watch our tutorials.

Document Conventions

This document uses the following formatting conventions:

  • boldface: Boldface type indicates graphical user interface elements associated with an action, or a product-specific term or concept.

  • italic: Italic type indicates special emphasis or a placeholder variable that you need to provide.

  • monospace: Monospace type indicates code examples, text, or commands that you enter.

Other Semarchy Resources

In addition to the product manuals, Semarchy provides other resources available on its web site: http://www.semarchy.com.

Obtaining Help

There are several ways to reach Semarchy Technical Support. You can call or email our global Technical Support Center (support@semarchy.com). For more information, see http://www.semarchy.com.

Feedback

We welcome your comments and suggestions on the quality and usefulness of this documentation.
If you find any errors or have suggestions for improvement, please email support@semarchy.com and indicate the title of the documentation along with the chapter, section, and page number, if available. Please let us know if you want a reply.

Overview

Using this guide, you will learn how to:

  • Use the Semarchy Convergence for Data Integration Designer to design and develop the data integration flows for your MDM project.

  • Create and reverse-engineer the source data models to connect your data publishers and consumers.

  • Design the mappings and processes for integrating data from and to these publishers.

  • Deploy the integration flows to production.

Introduction to Semarchy Convergence for Data Integration

Semarchy Convergence for Data Integration is a next-generation, high-performance data integration platform that enables your IT team to deliver the right data, at the right place, at the right time. It is a key component of the Semarchy product family.

Used in conjunction with Semarchy Convergence for MDM, it manages the integration flows between the operational/analytic applications and the master data hub.

Semarchy Convergence for Data Integration is designed with the following key features for better design-time productivity and higher run-time performance.

High-Performance E-LT Architecture

Traditional data integration products use an Extract-Transform-Load (ETL) architecture to run data transformations. In this architecture, a dedicated proprietary engine is in charge of the data processing.

image

The E-LT (Extract-Load and Transform) architecture used in Semarchy Convergence for Data Integration removes the need for this middle-tier engine. It leverages the data processing engines in place to run processes generated and optimized for these engines.

image

By removing the cost of the hardware, software, maintenance and tuning skills required for the ETL server, the E-LT architecture guarantees the best TCO. This architecture scales naturally with the source and target datastores. It guarantees the best performance and scalability for data integration at any time.

Powerful User Experience

Semarchy Convergence for Data Integration uses an intuitive, familiar, and robust Integrated Development Environment (IDE), the Convergence for DI Designer, which contributes to the unified user experience of the Semarchy platform. This IDE is designed with features that ensure better productivity for development and higher efficiency for maintenance.

  • Metadata Driven Approach: The entire integration logic design is driven by metadata, supporting powerful cross-referencing and impact analysis capabilities that ease the development and maintenance of the data integration flows.

  • Declarative Design: Data mappings are designed graphically in a declarative way, focusing on the purpose of these mappings. The technical processes required for achieving the data movement and transformation are automatically generated using built-in process templates.

  • Integration Process Workflows: Complex process workflows are designed in diagrams, using a rich toolbox. Workflows support parallelism, conditional branching and event-based execution.

  • Components Reusability and Templating: The mappings as well as the workflows can be reused within the same project or in other projects. Processes can be converted into user-defined templates for reusability.

  • Seamless Team Collaboration: Standard source control systems such as Concurrent Versioning System (CVS) or Apache Subversion (SVN) are available from within the Convergence for DI Designer to safely share projects within the team or across teams in robust infrastructures.

Rationalized Production

Setting up and maintaining a production environment with Semarchy Convergence for Data Integration is made fast and simple:

  • Lightweight Runtime Deployment: Convergence for Data Integration uses a single lightweight Java component – the Runtime Engine – to handle runtime execution and logging.

  • Rationalized Deployment Model: The deployment model for data integration processes is designed for production users. Package files generated by the development team are easily configured and deployed by the production teams, using graphical or command-line interfaces.

  • Comprehensive Monitoring: The entire execution flow is tracked in logs which are accessed using a Web-based administration dashboard. Live executions can be monitored and past executions can be replayed in diagrams that reflect the processes created at design-time.

Enterprise Data Integration

Semarchy Convergence for Data Integration provides universal data access and enterprise-class capabilities for integration.

  • Extensible Connectivity Framework: In Convergence for Data Integration, connectivity is a fully extensible framework. Technology Descriptors can be added or customized to support any type of technology without restricting the capabilities to a subset. Built-in Process Templates can also be customized to generate processes optimized for specific use cases.

  • Built-in Technology Adapters: Out-of-the-box adapters provide read/write access to a variety of systems and data formats including files, databases, XML, web services, applications, etc. They include both the technology descriptors and process templates for these technologies.

  • Real-Time and Batch Integration Patterns: Both the real-time (using Changed Data Capture) and batch integration patterns are supported. These patterns cover the most common use cases for Enterprise Data Integration and Master Data Management.

  • Data Integration Services: Convergence for Data Integration provides out of the box access to web services for integration purposes. In addition, data integration flows designed in Convergence for Data Integration can be automatically exposed as web services and used as part of a Service Oriented Architecture (SOA).

  • Unified Product Suite: Convergence for Data Integration is fully integrated with the Semarchy Product Suite. Built-in process templates and patterns are provided to manage publishing and consuming master data in the Semarchy Convergence for MDM golden data hub.

Through all these features, Semarchy Convergence for Data Integration enables you to deliver master data where and when the business needs it, in a simple, fast and safe way.

Introduction to the Convergence for DI Designer

Install and Start Convergence for Data Integration

Convergence for Data Integration includes two key components:

  • The Designer is the Graphical User Interface (GUI) in which data developers create their integration jobs.

  • The Runtime Engine is the component that runs these integration jobs.

In the following section:

  • The semarchy-di-designer.zip file refers to the Convergence for Data Integration Full Setup file that you download to install Semarchy Convergence for Data Integration. The name of this file varies, as it includes the platform information, product version, and build number.

  • <semarchy_di> refers to the installation folder of Convergence for Data Integration.

To install Convergence for Data Integration:

  1. Download the Convergence for Data Integration distribution (semarchy-di-designer.zip) corresponding to your platform and to your default Java Virtual Machine (32 vs. 64 bits).

  2. Uncompress the semarchy-di-designer.zip file on your machine. This creates a semarchy_di sub-folder, referred to as <semarchy_di> (the Convergence for Data Integration installation directory).

  3. Start the Designer:

    • On Windows platforms:

      1. Open Windows Explorer, and go to the <semarchy_di> folder.

      2. Run semarchy.exe. The Designer starts.

    • On UNIX/Linux platforms:

      1. Open a shell window, and go to the <semarchy_di> folder.

      2. Run ./semarchy. The Designer starts.

      3. In a shell window, go to the <semarchy_di>/runtime folder and run chmod 755 *.sh to make the runtime scripts executable.

  4. When the Designer starts, it prompts you for the license key.

  5. In the Please validate your product dialog, enter the key string provided to you by Semarchy in the Key field.

  6. Click the Apply button.

  7. After registering the license key, you must create the folder into which the Designer will store its data. This folder on your local machine is the Workspace. By default, the Convergence for DI Designer proposes to create a workspace folder in its installation directory.
    To create it in a different location:

    1. In the Workspace Launcher window, click the Browse button.

    2. In the Select Workspace Directory dialog, select the folder into which the workspace will be created.

    3. Click OK to create the workspace and open it. The Convergence for Data Integration Designer window opens on the Introduction page. This page provides access to the Overview, Tutorials, and Web Resources pages.

  8. Click the Workbench link to open the newly created workbench.

Directories Contents

The <semarchy_di> directory contains the following sub-folders:

  • /samples contains the files for running this getting started tutorial and other samples.

  • /workspace contains the workspace that you have created. Note that you can have several workspaces for a single installation. You can locate these workspaces anywhere in your file system.

  • /templates contains the templates provided out of the box with Convergence for Data Integration.

  • /runtime contains the Convergence for Data Integration runtime engine binary and startup scripts.

  • /plugins and /configuration contain the binaries and configuration files for the Designer.

Convergence for DI Designer Overview

The Semarchy Convergence for Data Integration Designer appears as follows.

image

In the Convergence for DI Designer, the following sections are available:

  • The Project Explorer view provides a hierarchical view of the resources. From here, you can open files for editing or select resources for operations such as exporting.

  • The Editors section contains the various objects being edited: mappings, metadata, processes, etc.

  • Various other Views are organized around the editors and allow navigating objects as well as viewing and editing object properties.

  • The Configuration zone allows selecting the active Configuration (development, production, etc.).

  • You can use Perspectives to customize the layout of the various views in the Convergence for DI Designer. A default Semarchy Convergence for Data Integration perspective is provided, and you can also define your own perspectives.

Design-Time Views

The following section describes the main views used at design-time.

The Project Explorer

The Project Explorer contains a hierarchical view of the resources in the workspace. These resources are organized into projects and folders.

The following types of resources appear in the projects:

  • Metadata (.md): A metadata resource describes source or target systems and the datastores or variables that participate in a mapping or a process.

  • Mapping (.map): A mapping is used to load data between source and target datastores.

  • Process (.proc): A process is a sequence of tasks and sub-tasks that are executed at run-time. Certain processes are Template Processes, which are used to generate a process from a mapping.

  • Templates Rules (.tpc): A Templates Rules file describes the conditions under which template processes can be used. For example, an Oracle integration template can be used when integrating data into a target datastore in an Oracle database, but is not suitable when targeting XML files.

  • Configuration Definition (.cfc): A configuration parameterizes the metadata. For example, configurations can contain the connection information to the data servers, and you can use configurations to switch between production and development environments.

  • Runtime Definition (.egc): The definition of a runtime engine. A runtime engine executes the integration processes created with the Semarchy Convergence for Data Integration Designer.

image

From the Project Explorer toolbar (highlighted in the previous screenshot), you can create the three main types of design-time resources: Metadata, Mappings, and Processes.

Duplicate resources

Multiple resources with the same ID, called duplicates, can exist in the same Workspace. Only one of these duplicates can be active at a time. The active resource is indicated with an asterisk at the end of its name:

image

To enable a duplicate:

  1. Right-click the duplicate and select Enable Duplicate Model.
    The duplicate resource becomes active.

In addition, the Manage Duplicate Resource tool in the Impact view lets you manage duplicates across the whole workspace.

The Properties View

The Properties view displays the list of properties of the object currently selected.

The Expression Editor

The Expression Editor displays code related to the object being edited. For example, the mapping expressions, the code of the actions, etc.

This editor provides two options:

  • Lock: Locks the expression editor. When this option is not selected, the expression editor changes every time you select an element in the Convergence for DI Designer, to display the code of that element. To build expressions using drag-and-drop, select the element that you want to edit (for example, a target mapping), select the Lock option, and then drag and drop columns from the various source datastores of the mapping into the expression editor.

  • Auto-Completion: This option enables auto-completion in the code. While editing, press CTRL+SPACE to display a list of suggestions for your code.

The Outline View

The Outline view provides a high-level hierarchical or graphical view of the object currently edited.
In addition, it provides a search tool image to search into the current object.

The Impact View

The Impact view allows you to analyze the usage of an object in the project and perform impact analysis and cross-referencing.

To use the Impact view:

  1. Select the object that you want to analyze. The Impact view displays the list of usages for this object.

  2. Double-click one of the objects in the list to open it for editing.

The Impact view menu provides several options:

image

  • Save Cache and Rebuild Cache are used to explicitly refresh or save the cross-references cache for your workspace. In the normal course of operations, the cache refreshes automatically.

  • Manage Duplicate Resource opens a new window to manage the duplicated resources of the workspace.

  • Refresh Console refreshes the cache console (recalculates mapping states, file states, cross-references, etc.).

The Run-Time Views

Three views are used to monitor the run-time components and executions of the sessions.

The Runtime View

image

The Runtime view allows monitoring of a Runtime Engine.

From this view, you can perform the following operations:

  • Click the Environment button to start and stop a local (pre-configured) runtime engine and demonstration databases.

  • Click the Runtime Editor button to add or modify runtime engine definitions. The procedure to create a runtime engine definition is described below.

  • Check that a runtime engine is active by selecting it in the list and clicking Ping.

  • Connect to a runtime engine: Select it in the list and then click the Connect option. When a runtime engine is connected, you can view its sessions and issue commands to it via its command-line console.

  • When connected to a runtime engine, you can activate the Refresh Diagram option. When this option is active, you can monitor a process' activity on its diagram as it runs in the runtime engine.

  • When connected to a runtime engine, click the Runtime Command button to open its command-line console. From the Console you can issue commands to the runtime engine. Type help in the console for a list of valid commands.

To create a new runtime definition:

  1. In the Runtime view, click the Runtime Editor button. The runtime definition (conf.egc) opens.

  2. Select the root node, right-click and select New > Engine.

  3. In the Properties view, enter the following information:

    • Description: Description of this runtime engine.

    • Name: User-friendly name for the runtime engine.

    • Server: Host name or IP address of the machine where the runtime components run.

    • Port: Port on this machine on which the runtime engine listens.

  4. Press CTRL + S to save the new runtime definition.

The Sessions View

The Sessions view displays the list of sessions of the connected runtime engine.

This list includes the following session properties:

  • Start Time: Startup day and time of the session.

  • Name: Name of this session.

  • Status: Status of the session:

    • Prepared: Prepared but not executed sessions.

    • Running: Running sessions.

    • Executed: Executed sessions.

    • Error: Sessions in error.

    • Killed: Sessions killed by the user

    • Dead: Dead sessions, that is, sessions that never finished and are considered dead by Semarchy Convergence for DI Analytics.

  • Elapsed Time: Duration of the session.

  • ID: Unique identifier of the session.

  • Log Name: Name of the log storing this session.

  • Log Type: Type of the log storing this session.

  • Engine: Name of the runtime engine processing the session.

  • Guest Host: Name of the host from which the session was initiated.

  • Launch Mode: Method used to start the session: Semarchy Convergence for Data Integration Designer, Web Service, Scheduler, etc.

    • Designer: The session has been executed from the Designer.

    • Schedule: The session has been executed automatically by a schedule.

    • Web Interactive: The session has been executed from Analytics.

    • Engine Command: The session has been executed from the Runtime Command utility (e.g., using the execute delivery command).

    • Command Line: The session has been executed from the command line (e.g., with startdelivery.bat).

    • Web Service: The session has been executed from the Runtime’s REST or SOAP web services.

    • Action: The session has been executed from the 'Execute Delivery' Process Action.

    • Restart: The session has been restarted.

  • Execution Mode: Execution mode of the session:

    • Memory: The session has been executed in memory in the Runtime.

    • Autonomous: The session has been executed outside of the Runtime.

In this view, you can filter the sessions using parameters such as the session Name or Status, limit the number of sessions displayed, or show only the sessions started by the current user. If the log is shared by several runtime engines, you can also show only the sessions of the current runtime engine.

From this view, you can also purge the log by clicking the Delete All and Delete Until buttons.

The Session Detail View

This view displays the details of the session selected in the Sessions view. The Errors and Warning tabs display the list of issues, and the Variables tab displays the list of session and metadata variables.

Standard session variables include:

  • CORE_BEGIN_DATE: Startup day and time of the session.

  • CORE_DURATION: Duration of the session in milliseconds.

  • CORE_END_DATE: Day and time of the session completion.

  • CORE_ENGINE_HOST: Host of the runtime engine processing the session.

  • CORE_ENGINE_PORT: Port of the runtime engine processing the session.

  • CORE_ROOT: Name of the process containing the session.

  • CORE_SESSION_CONFIGURATION: Configuration used for this execution.

  • CORE_SESSION_ID: Unique identifier of the session.

  • CORE_SESSION_NAME: Name of this session.

  • CORE_SESSION_TIMESTAMP: Session startup timestamp.

  • CORE_TEMPORARY_FOLDER: Temporary folder for this session.

The Statistics View

The Statistics view displays the list of statistics aggregated for the sessions.

The following default statistics are available:

  • SQL_NB_ROWS: Number of lines processed.

  • SQL_STAT_INSERT: Number of lines inserted.

  • SQL_STAT_MERGE: Number of merged records.

  • SQL_STAT_UPDATE: Number of lines updated.

  • SQL_STAT_DELETE: Number of lines deleted.

  • SQL_STAT_ERROR: Number of errors detected.

  • OUT_FILE_SIZE: Output file size.

  • OUT_NB_FILES: Number of output files.

  • XML_NB_ELEMENTS: Number of XML elements processed.

  • XML_NB_ATTRIBUTES: Number of XML attributes processed.

This view can be configured via the preferences (Window > Preferences), in the Semarchy Convergence for Data Integration > Monitor section. There, you can select which variables are aggregated and with which aggregate function.
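The kind of aggregation this view performs can be pictured with a short sketch. The session names and statistic values below are hypothetical; in the product, the values come from the session log and the aggregate function is chosen in the preferences:

```python
# Hypothetical per-step statistics collected from the session log.
steps = [
    {"session": "Load Customers", "SQL_STAT_INSERT": 120, "SQL_STAT_UPDATE": 30},
    {"session": "Load Customers", "SQL_STAT_INSERT": 80,  "SQL_STAT_UPDATE": 5},
    {"session": "Load Products",  "SQL_STAT_INSERT": 10,  "SQL_STAT_UPDATE": 0},
]

def aggregate(steps, stat, func=sum):
    """Aggregate one statistic per session, using the given
    aggregate function (illustrative sketch, not product code)."""
    by_session = {}
    for step in steps:
        by_session.setdefault(step["session"], []).append(step.get(stat, 0))
    return {name: func(values) for name, values in by_session.items()}

print(aggregate(steps, "SQL_STAT_INSERT"))
# {'Load Customers': 200, 'Load Products': 10}
```

Switching the aggregate function (for example, max instead of sum) changes the figure reported per session, which is what the Monitor preferences control.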

The Step Detail View

image

This view displays the details of the steps executed in a session. It displays the code at various stages in three tabs:

  • Source displays the source code before generation. It comes from the process templates.

  • Generated displays the generated code. It may contain dynamic values replaced before execution.

  • Executed displays the executed code with the variables replaced.

The Detail Of selection box allows you to select one of the iterations of a step that was executed several times.
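The difference between the Generated and Executed tabs boils down to variable substitution. A toy sketch, using the CORE_SESSION_ID session variable listed earlier and the %{variable}% reference syntax (the code line and session ID value are hypothetical):

```python
def substitute(code, variables):
    """Replace %{var}% references with their run-time values, as the
    Executed tab shows them (illustrative sketch, not product code)."""
    for name, value in variables.items():
        code = code.replace("%%{%s}%%" % name, str(value))
    return code

# Generated code may still contain variable references...
generated = "DELETE FROM L_CUSTOMER_%{CORE_SESSION_ID}%"
# ...which are resolved before execution (hypothetical session ID).
executed = substitute(generated, {"CORE_SESSION_ID": 42})
print(executed)  # DELETE FROM L_CUSTOMER_42
```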

The Variable View

The Variable view displays the variables for the selected step. The list of variables depends on the step. The following standard variables appear for all steps:

  • CORE_BEGIN_DATE: Startup day and time of the step.

  • CORE_DURATION: Duration of the step in milliseconds.

  • CORE_END_DATE: Day and time of the step completion.

  • CORE_BEGIN_ACTION: True if this action is a start action.

  • CORE_NB_ENABLED_EXECUTIONS: Maximum number of executions allowed for this step. This variable is used in loops.

  • CORE_NB_EXECUTION: Number of iterations of this step.

  • CORE_RET_CODE: Return code for this step.

Working with Projects

Resources are organized into projects and folders.

In a Semarchy Convergence for Data Integration workspace, there are two default projects:

  • global contains all the objects global to this workspace, which include:

    • The Runtime Engine Definition

    • The Configuration Definitions

    • The Template Processes, which are imported into this project.

  • .tech contains the definition of the various technologies supported by the platform. This project is hidden by default, and you do not need to modify this project.

Creating Projects

To create a new project:

  1. Right-click in the Project Explorer and then select New > Project in the context menu. The New Project wizard opens.

  2. In the Wizards filter, enter Project, and then select the General > Project item in the tree.

  3. Click Next.

  4. Enter a Project Name and then click Finish.

Creating Folders

To create a new folder:

  1. Right-click on a project or folder in the Project Explorer and then select New > Folder in the context menu. The New Folder wizard opens.

  2. Select the parent folder or project in the wizard.

  3. Enter a Folder Name and then click Finish.

You can organize folders within a project and resources within folders using drag and drop operations or using the Move action in a resource’s context menu.

A typical organization for a project is:

  • ProjectName (Project)

    • metadata (folder): this folder contains all the metadata resources.

    • development (folder): this folder contains all the development resources.

      • process (folder): this folder contains all the processes.

      • mapping (folder): this folder contains all the mappings.

Importing Templates

Semarchy Convergence for Data Integration uses Templates to generate the code of the processes for the mappings. By default, these templates are imported into the global project.

To import templates:

  1. In the Project Explorer, right-click the global project and select Import in the context menu. The Import wizard opens.

  2. In the tree view, select General > Archive File and then click Next.

  3. Use the Browse button to select the archive file containing the templates. This file is typically named Templates.YYYY-MM-DD.zip where YYYY-MM-DD is a date corresponding to the template package release. Click OK.

  4. When the file is selected, its contents appear in the wizard. Select all the templates, or only those relevant for your workspace.

  5. Click Finish to run the import. The imported templates appear in the global project, organized into folders.

Version Control

The Semarchy Convergence for Data Integration workspace and projects use file-based storage exclusively. They can be version controlled using any version control system compatible with Eclipse RCP, for example Subversion. Refer to the version control system documentation for more information.

Working with Metadata

Semarchy Convergence for Data Integration uses metadata to design, generate, and run the data integration processes. This metadata includes, for example, the structure of the tables, text files, or XML files involved in the data integration flows.

A metadata file handled by Semarchy Convergence for Data Integration generally represents a data model, for example a database schema or a folder storing tables or files. A metadata file is created by connecting to the database server, file system, etc., to retrieve the structure of the tables, files, and so on. This mechanism is called reverse-engineering.

The following sections explain how to create the three main types of metadata files.

Defining a Database Model

Creating and Reversing a Database Model

This process uses a wizard that performs three steps:

  1. Create a new data server

  2. Create a new schema

  3. Reverse-engineer the datastores

To create a new data server:

  1. Click the New Metadata button in the Project Explorer toolbar. The New Model wizard opens.

  2. In the Choose the type of Metadata tree, select RDBMS > <DBMS Technology> where <DBMS Technology> is the name of the DBMS technology that you want to connect.

  3. Click Next.

  4. Select the parent folder or project for your new resource.

  5. Enter a Metadata Model Name and then click Finish. The metadata file is created and the Server wizard opens.

  6. In the Server Connection page, enter the following information:

    • Name: Name of the data server.

    • Driver: Select a JDBC Driver suitable for your data server.

    • URL: Enter the JDBC URL to connect this data server.

    • Deselect the User name is not required for this database option if authentication is required for this data server.

    • User: The database user name.

    • Password: This user’s password.

  7. (Optional) Modify the following options as needed:

    • Auto Logon: This option allows the Semarchy Convergence for Data Integration Designer to automatically create a connection to this data server when needed.

    • Logon during Startup: This option allows the Semarchy Convergence for Data Integration Designer to create a connection to this data server at startup.

    • AutoCommit: Semarchy Convergence for Data Integration Designer connections to this data server are autocommit connections.

    • Commit On Close: Semarchy Convergence for Data Integration Designer connections to this data server send a commit when they are closed.

  8. Click the Connect button to validate this connection and then click Next. The Schema Properties page opens.

To create a new schema:

  1. In the Schema Properties page, enter the following information:

    • Name: Use the checkbox to enable this field, and enter a user-friendly name for this schema.

    • Schema Name: Click the Refresh Values button to retrieve the list of schemas from the database, and then select one of these schemas.

    • Reject Mask: Set the table name mask for the tables containing the load rejects (error tables). See the Table Name Masks section below for more information.

    • Load Mask: Set the table name mask for the temporary load tables. See the Table Name Masks section below for more information.

    • Integration Mask: Set the table name mask for the temporary integration tables. See the Table Name Masks section below for more information.

    • Work Schema: Select a schema for storing the load and integration temporary tables for this data server. This schema is also referred to as the Staging Area. See the Work and Reject Schemas Selection section for more information. Click the … button to create a new schema definition and set it as the work schema.

    • Reject Schema: Select a schema for storing the error (reject) tables for this data server. See the Work and Reject Schemas Selection section for more information. Click the … button to create a new schema and set it as the reject schema.

  2. Click Next. The Reverse Datastore page opens.

To reverse-engineer the datastores into a schema:

  1. In the Reverse Datastore page, optionally set an object filter. Use the _ and % wildcards to represent one character or any number of characters, respectively.

  2. Optionally filter the type of objects that you want to reverse-engineer: All, synonyms, tables and views.

  3. Click the Refresh button to refresh the list of datastores.

  4. Select the datastores that you want to reverse engineer in the list.

  5. Click Finish. The reverse-engineering process retrieves the structure of these datastores.

  6. Press CTRL+S to save the editor.
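The _ and % wildcards of the object filter follow SQL LIKE semantics. As an illustration only (this is not the product's implementation), such a filter can be translated into a regular expression:

```python
import re

def like_to_regex(pattern):
    """Translate a SQL LIKE-style filter (_ = exactly one character,
    % = any number of characters) into an anchored regular expression."""
    parts = []
    for ch in pattern:
        if ch == "_":
            parts.append(".")
        elif ch == "%":
            parts.append(".*")
        else:
            parts.append(re.escape(ch))
    return re.compile("^" + "".join(parts) + "$")

# A filter such as CUST% selects all datastores whose name starts with CUST.
tables = ["CUSTOMER", "CUST_ADDRESS", "PRODUCT", "ORDERS"]
matcher = like_to_regex("CUST%")
print([t for t in tables if matcher.match(t)])  # ['CUSTOMER', 'CUST_ADDRESS']
```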

Adding a New Schema

To add a new schema to an existing data server:

  1. In the metadata file editor, select the root node.

  2. Right-click and select Action > Launch DataSchema Wizard.

  3. Follow the steps described in the "To create a new schema" section of Creating and Reversing a Database Model.

Reverse-engineering an Existing Schema

To retrieve metadata changes from an existing schema, or to retrieve new table definitions, you must perform a new reverse-engineering.

To reverse-engineer an existing schema:

  1. In the metadata file editor, select the node corresponding to the schema.

  2. Right-click and select Action > Launch DataSchema Wizard.

  3. Click Next in the first page of the wizard.

  4. On the second page, follow the steps described in the "To reverse-engineer the datastores into a schema" section of Creating and Reversing a Database Model.

Table Name Masks

Table name masks define name patterns for the temporary objects created at run-time.

Table Name masks can be any string parameterized using the following variables:

  • [number]: Automatically generated increment for the load tables, starting with 1.

  • [targetName]: Name of the target table of a mapping.

  • ${variable}$ or %{variable}%: A session variable that is set at run-time.

Note that the resulting string must be a valid table name.

Example: L_[targetName]_[number] would create load tables named L_CUSTOMER_1, L_CUSTOMER_2, etc. for a mapping loading the CUSTOMER table.
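The expansion of such a mask can be sketched as follows (a minimal illustration of the substitution rules above, not the actual run-time implementation):

```python
def expand_mask(mask, target_name, number, variables=None):
    """Expand a table name mask such as L_[targetName]_[number].

    `variables` maps session variable names to run-time values; both the
    ${variable}$ and %{variable}% forms are substituted.
    """
    variables = variables or {}
    result = mask.replace("[targetName]", target_name)
    result = result.replace("[number]", str(number))
    for name, value in variables.items():
        result = result.replace("${%s}$" % name, value)
        result = result.replace("%%{%s}%%" % name, value)
    return result

# For a mapping loading the CUSTOMER table:
print(expand_mask("L_[targetName]_[number]", "CUSTOMER", 1))  # L_CUSTOMER_1
```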

Work and Reject Schemas Selection

When defining a schema (optionally with a Name for this schema), you can refer to two other schemas, the Work Schema and the Reject Schema. These two schemas store respectively the temporary load/integration tables (Staging Area) and the error (reject) tables for the data tables stored in the schema being defined. In the mappings, the work schema is also called the Staging Area. The value for these two schemas may be:

  • Empty: In that case, the work schema and reject schemas are automatically set to the Schema Name. This means that the temporary and error tables are created in the same schema as the data tables.

  • Set to the Name or Schema Name of another schema. In that case, the temporary or error tables are stored in that other schema’s Schema Name.

It is recommended to configure two dedicated schemas for each database server, one for temporary tables (for example, DI_TEMP) and one for error tables (for example, DI_REJECTS), and to set them as the Work Schema and the Reject Schema for all the data schemas. This avoids mixing application data (data schemas) and Convergence for DI tables in the same schemas.

Creating and using a Metadata Query

A SQL query can be reverse-engineered and used as a datastore in a database Metadata.

To create a Query:

  1. Right-click the database node in the Metadata and select New > Query Folder. This creates a folder in which the queries will be stored.

  2. Give a name to the query folder that appears in the Metadata.

  3. Right-click the folder and select New > Query.

To reverse a Query:

  1. Give a name to the query.

  2. Enter a SQL SELECT query in the Expression.

  3. Save the Metadata.

  4. Right-click the query and select Actions > Reverse.

The reversed query can be used in Mappings as a source like any other datastore. However, it is not recommended to use it as a Target, as it only represents a query and is not a table.
It is possible to parameterize the query using the XPath syntax: {./md:objectPath(ref:schema('schema_name'), 'table_name')}. Note that the schema and table must exist in the metadata.

Defining a File Model

Creating a File Model

To create a new File metadata file:

  1. Click on the image New Metadata button in the Project Explorer toolbar. The New Model wizard opens.

  2. In the Choose the type of Metadata tree, select File > File Server.

  3. Click Next.

  4. Select the parent folder or project for your new resource.

  5. Enter a Metadata Model Name and then click Finish. The editor is created and the File Wizard opens automatically.

  6. In the Directory page, provide a user-friendly Name for the directory and select its Path.

  7. Click Next.

  8. In the File Properties page:

    1. Use the Browse button to select the file within the directory and set the Physical Name for the file.

    2. Set a logical Name for the file datastore.

    3. Select the file Type: Delimited or Positional (fixed width fields).

  9. Follow the process corresponding to the file type for reverse-engineering.

Reverse-Engineering a Delimited File

To reverse-engineer a delimited file:

  1. In the File Properties page, use the Refresh button to view the content of the file in the preview. Expand the wizard size to see the file contents.

  2. Set the following parameters to match the file structure:

    • Charset Name: Code page of the text file.

    • Line Separator: Character(s) used to separate the lines in the file.

    • Field Separator: Character(s) used to separate the fields in a line.

    • String Delimiter: Character(s) delimiting a string value in a field.

    • Decimal Separator: Character used as the decimal separator for numbers.

    • Lines to Skip: Number of lines to skip from the beginning of the file. This count must include the header.

    • Header Line Position: Position of the header line in the file.

  3. Click Next.

  4. Click Reverse. If the parameters set in the previous page are correct, the list of columns detected in this file is automatically populated.

    • Reverse-engineering parses through a number of lines in the file (defined by the Row Limit) to infer the data types and size of the columns. You can tune the reverse behavior by changing the Reverse Options and Size Management properties, and click Reverse again.

    • You can manually edit the detected column datatype, size and name in the table.

  5. Click Finish to complete the reverse-engineering.

  6. Press CTRL+S to save the file.
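When in doubt about the Field Separator, Lines to Skip or Header Line Position values, a quick script can emulate what the wizard will see (a sketch; the sample content and separators below are assumptions, not tied to any actual file):

```python
# Hypothetical sample content for a delimited file.
sample = "ID;NAME;AMOUNT\n1;Smith;10.5\n2;Jones;7.25\n"

def preview_delimited(text, field_sep=";", lines_to_skip=1, header_line=1):
    """Mimic the wizard fields: locate the header line, skip the leading
    lines (the skip count includes the header), then split data rows."""
    lines = text.splitlines()
    header = lines[header_line - 1].split(field_sep)
    rows = [line.split(field_sep) for line in lines[lines_to_skip:]]
    return header, rows

header, rows = preview_delimited(sample)
print(header)  # ['ID', 'NAME', 'AMOUNT']
print(rows)    # [['1', 'Smith', '10.5'], ['2', 'Jones', '7.25']]
```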

Reverse-Engineering a Fixed-Width File

To reverse-engineer a fixed-width file:

  1. In the File Properties page, use the Refresh button to view the content of the file in the preview. Expand the wizard size to see the file contents.

  2. Set the following parameters to match the file structure:

    • Charset Name: Code page of the text file.

    • Line Separator: Character(s) used to separate the lines in the file.

    • Decimal Separator: Character used as the decimal separator for numbers.

    • Lines to Skip: Number of lines to skip from the beginning of the file. This count must include the header.

    • Header Line Position: Position of the header line in the file.

  3. Click Next.

  4. Click Refresh to populate the preview.

  5. From this screen, you can use the table to add, move and edit column definitions for the file. As you add columns, the preview shows the position of the columns in the file.

  6. Click Finish to finish the reverse-engineering.

  7. Press CTRL+S to save the file.
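The column definitions entered in the table boil down to fixed character positions. As a sanity check, fixed-width parsing can be sketched like this (the column names and positions below are hypothetical):

```python
def parse_fixed_width(line, columns):
    """Parse one line of a fixed-width file.

    `columns` is a list of (name, start, length) tuples, with 0-based
    start positions, mirroring the column definitions in the wizard.
    """
    return {name: line[start:start + length].strip()
            for name, start, length in columns}

# Hypothetical layout: a 5-character ID followed by a 15-character name.
cols = [("CUS_ID", 0, 5), ("CUS_NAME", 5, 15)]
print(parse_fixed_width("00001Smith          ", cols))
```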

Defining an XML Model

To create a new XML metadata file:

  1. Click the image New Metadata button in the Project Explorer toolbar. The New Model wizard opens.

  2. In the Choose the type of Metadata tree, select XML > XML Schema.

  3. Click Next.

  4. Select the parent folder or project for your new resource.

  5. Enter a Metadata Model Name and then click Finish. The editor is created and the XML Wizard opens.

  6. In the Name field, enter a name for this schema.

  7. In the XML Path field, enter the full path to the XML file. This file does not need to physically exist at this location if you have the XSD, and can be generated as part of a data integration process.

  8. In the XSD Path field, enter the full path to the XSD describing the XML file. If this XSD does not exist, click Generate to generate an XSD from the content of the XML file provided in the XML Path.

  9. Click Refresh and then select the root element for this schema. If the XSD has several root nodes, it is possible to repeat this operation to reverse-engineer all the hierarchies of elements stored in the XML file. Each of these hierarchies can point to a different XML file specified in the properties of the element node.

  10. Click Reverse. The reverse-engineering process retrieves the XML structure from the XSD.

  11. Click Finish to close the wizard and return to the editor.

  12. Press CTRL+S to save the editor.

Defining a Generic Model

A Generic model is useful when you want to have custom Metadata available in order to parameterize your developments.

To define a Generic Model:

  1. Click the image New Metadata button in the Project Explorer toolbar. The New Model wizard opens.

  2. In the Choose the type of Metadata tree, select Generic > Element.

  3. Click Next.

  4. Select the parent folder or project for your new resource.

  5. Enter a File Name for your new metadata file and then click Finish. The metadata file is created and the editor for this file opens.

  6. Select the Element node and enter the Name for this element in the Properties view.

A Generic Model is a hierarchy of Elements and Attributes. The Attribute values of an element can be retrieved using the usual Semarchy Convergence for Data Integration XPath syntax.

To create a new Element:

  1. Right-Click on the parent element.

  2. Select New > Element.

  3. In the Properties view, enter the name of the new Element.

To add an attribute to an Element:

  1. Right-Click on the parent element.

  2. Select New > Attribute.

  3. In the Properties view, enter the Name and the Value of the Attribute. This name is used to retrieve the attribute’s value.
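Conceptually, a Generic Model is just a tree of named elements carrying name/value attributes. The structure can be sketched as follows (the element names and attribute values are invented for illustration):

```python
class Element:
    def __init__(self, name, attributes=None, children=None):
        self.name = name
        self.attributes = attributes or {}  # attribute name -> value
        self.children = children or []      # child elements

# A hypothetical model used to parameterize developments:
root = Element("Environments", children=[
    Element("Dev", {"ftp_host": "dev.example.com"}),
    Element("Prod", {"ftp_host": "prod.example.com"}),
])

def attribute(parent, element_name, attr_name):
    """Retrieve an attribute value from a direct child element by name."""
    for child in parent.children:
        if child.name == element_name:
            return child.attributes.get(attr_name)

print(attribute(root, "Prod", "ftp_host"))  # prod.example.com
```

In the product itself, such values are retrieved with XPath expressions rather than code, but the underlying element/attribute structure is the same.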

Working with Configurations

Configurations allow you to parameterize metadata for a given context. For example, a single data model declared in Semarchy Convergence for Data Integration may have two configurations, Production and Development. Used in the Development configuration, it points to a development server; used in the Production configuration, it points to a production server. Both servers contain the same data structures (as defined in the model), but not the same data, and have different connection information.
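Conceptually, a configuration overlays context-specific values (such as connection information) on top of the shared model definition. A minimal sketch, with invented server names and URLs:

```python
# Shared part of the model: identical in every configuration (hypothetical values).
base = {"driver": "oracle.jdbc.OracleDriver", "schema": "HOTEL"}

# Context-specific part, one entry per configuration.
configurations = {
    "Development": {"url": "jdbc:oracle:thin:@dev-host:1521:DEV", "user": "dev_user"},
    "Production": {"url": "jdbc:oracle:thin:@prod-host:1521:PROD", "user": "prod_user"},
}

def resolve(config_name):
    """Merge the shared definition with the selected configuration."""
    return {**base, **configurations[config_name]}

print(resolve("Development")["url"])  # jdbc:oracle:thin:@dev-host:1521:DEV
```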

Creating a Configuration

To create a configuration:

  1. In the Semarchy Convergence for Data Integration Designer toolbar, click the Edit button. The Configuration Definition editor (conf.cfc) opens.

  2. Right-click the root node (Cfc), then select New > Configuration.

  3. In the Properties view, enter the new configuration’s properties:

    • Code: Code of the configuration. This code appears in the Configurations drop-down list in the Semarchy Convergence for Data Integration Designer toolbar.

    • Description: Description of the configuration.

    • Execution Protection: Set to true if you want to be prompted for a password when executing a process in this configuration.

    • Selection Protection: Set to true if you want to be prompted for a password when switching the Semarchy Convergence for Data Integration Designer to this configuration.

    • Password: Protection password for this configuration.

  4. Press CTRL+S to save the configuration.

Using Configurations

In a metadata file, it is possible to define configuration-specific values for certain properties. The following section describes the most common usage of the configurations in metadata files.

Using Configuration for Databases

In databases, you can customize the connection information to the data server as well as the data schema definitions using configurations.

To create a data server configuration:

  1. In the database metadata file editor, select the root node corresponding to your data server.

  2. Right-click and select New > DataServer Configuration.

  3. In the Properties view:

    • Select the configuration in the Configuration Name field.

    • Set the different values for the connection information (Driver, URL, User and Password) as required.

  4. Press CTRL+S to save the database metadata file.

To create a data schema configuration:

  1. In the database metadata file editor, select the node corresponding to your data schema.

  2. Right-click and select New > DataSchema Configuration.

  3. In the Properties view:

    • Select the configuration in the Configuration Name field.

    • Set different values for the schema information (Schema Name, Reject Schema, etc.) as required.

  4. Press CTRL+S to save the database metadata file.

You can define configurations at all levels in the database metadata file, for example to define configuration-specific structural features for the datastores, columns, etc.

Using Configuration for Files

In files, you can customize the directory location as well as the file names depending on the configuration using a directory or a file configuration.

For example:

  • In a development configuration, a file is located in the C:\temp\ directory and named testcustomers.txt.

  • In a production configuration, a file is located in the /prod/files/ directory and named customers.txt.

To create a directory configuration:

  1. In the File metadata file editor, select the node corresponding to your directory.

  2. Right-click and select New > Directory Configuration.

  3. In the Properties view:

    • Select the configuration in the Configuration Name field.

    • Set a value for the Path specific to the configuration.

  4. Press CTRL+S to save the File metadata file.

To create a file configuration:

  1. In the File metadata file editor, select the node corresponding to your file.

  2. Right-click and select New > File Configuration.

  3. In the Properties view:

    • Select the configuration in the Configuration Name field.

    • Set a value for the Physical Name of the file specific to the configuration.

  4. Press CTRL+S to save the File metadata file.

You can define configurations at all levels in the File metadata file, for example to define configuration-specific structural features for flat files.

Using Configuration for XML

In XML files, you can customize the path of the XML and XSD files depending on the configuration using a schema configuration.

To create a schema configuration:

  1. In the XML metadata file editor, select the root node.

  2. Right-click and select New > Schema Configuration.

  3. In the Properties view:

    • Select the configuration in the Configuration Name field.

    • Set values for the XML Path and XSD Path specific to the configuration.

  4. Press CTRL+S to save the XML metadata file.

You can define configurations at all levels in the XML metadata file, for example to define configuration-specific structural features in the XML file.

Working with Mappings

Mappings image relate source and target metadata, and allow moving and transforming data from several source datastores (files, tables, XML) to target datastores.

Creating a New Mapping

To create a new mapping:

  1. Click on the image New Mapping button in the Project Explorer toolbar. The New Map Diagram wizard opens.

  2. Select the parent folder or project for your new resource.

  3. Enter a File Name and then click Finish. The mapping file is created and the editor for this file opens.

Adding the Targets and Sources

To see in the Project Explorer view the objects contained in the metadata files, configure this view accordingly. Select the Customize View in this view’s menu (as shown in the screen copy below) and make sure that the Semarchy Convergence for Data Integration Diagram Objects are not selected. image

To add the source and target datastores:

  1. In the Project Explorer, expand the metadata file containing the datastore (table, file, XML file) that you want to integrate.

  2. Drag and drop the datastores into which data will be loaded (the targets) from the Project Explorer into the mapping editor.

  3. Drag and drop the datastores from which data will be extracted (the sources) from the Project Explorer into the mapping editor.

For each datastore that has been added:

  1. Select this datastore.

  2. In the properties view, set the following properties:

    • Alias: Alias used in the expressions when referring to this datastore. It defaults to the datastore’s name.

    • Use CDC: Check this box to enable consumption of changed data from a source datastore, captured via the CDC feature.

    • Enable Rejects: Check this box to enable rejects management on a target datastore. When this option is selected, rows in the data flow not meeting the target table’s constraints are isolated into the rejects instead of causing a possible runtime failure.

    • Tag: Add a tag to the table. Tags are used in certain process templates.

    • Description: Free form text.

    • In the Advanced properties section, the Order defines the order of a source datastore in the FROM clause generated when loading a target.

    • In the Advanced properties section, the Integration Sequence specifies the order in which tables without any mutual dependencies must be loaded.

  3. Press CTRL+S to save the mapping.

Linking Datastores

To create a link between two datastores:

  1. Select a column from a source datastore in the mapping diagram.

  2. While keeping the mouse button pressed, drag this column onto another source column in the mapping diagram.

  3. Release the mouse button. You will be prompted to select the type of link to create:

    • Join: a new join links the two datastores.

    • Map: a source-target relationship is created between the two columns and their datastores.

You are prompted to select the type of link only if Convergence for DI Designer detects that both kinds of links can be created. Otherwise, the appropriate type of link is created automatically.
When creating a source-target link between the datastores, blocks representing the load and integration process templates are automatically added to the upper part of the target datastore.
When linking a source datastore to a target, the columns from the source are automatically mapped to the target datastore columns using column name matching.
If a join already exists between two datastores, dragging and dropping a column between these datastores adds it to the existing join expression. If you want to create a new join between those datastores, hold the CTRL key pressed while dropping the column.

Defining a Join between Sources

When a join is created:

  1. In the Expression Editor view, edit the code of the join. You can lock the expression editor and drag columns from the diagram into the expression editor. See the Filter and Join Expressions section for more information about mapping expressions.

  2. Select the join in the mapping diagram.

  3. In the Properties view, set the following Standard Properties:

    • Enable: Enables or disables the join.

    • Set the join type by selecting either Join Type to Inner, Full or Cross or by selecting the Left Part or Right Part to perform a left outer or right outer join. See Join (SQL) for a definition of the join types.

    • Set the Execution Location. The join may be executed within the source system (when joining two datastores in the same data server) or in the staging area.

    • Description: Free form text.

  4. In the Properties view, optionally set the following Advanced Properties:

    • Join Type: Explicit uses the ISO syntax (join in the FROM clause), Implicit places the join in the WHERE clause, and Default uses the default type defined for the technology. This latter option is preferred for non-inner joins.

    • Order: Defines the order of the join when using the Explicit Join type.

  5. Press CTRL+S to save the mapping.

Be cautious when using Cross and Full joins as they may lead to a multiplication of rows and performance issues.
When a join is created between two datastores, a blue area appears, representing a set of datastores that are joined together. This area is called a dataset.

Conditional Joins

A conditional join allows you to activate a dataset and its corresponding join only if the driving dataset is used.

To define a conditional Join:

  1. Select the join.

  2. In the Properties view, open the Advanced Properties.

  3. Set the Activate property:

    • Always: The join is not conditional and is always executed.

    • With datastore's dataset: The conditional join is activated only if the dataset containing the datastore is used.

Conditional joins are particularly useful to share (mutualize) loads inside a mapping.

Mapping the Target Columns

The target columns must be mapped with expressions using source columns. These expressions define which source columns contribute to loading data into the target columns.

When a map expression has been defined on a target column:

  1. Select the target column.

  2. In the Properties view, set the following Standard Properties:

    • Enable: Enables or disables the mapping.

    • Set the Execution Location. The mapping may be executed within the source system, in the staging area or in the target itself (while inserting data into the target).

    • Use as Key: Check this option if this column must be used as part of the unique key for this mapping. Several columns can participate in the unique key of the mapping. These may be the columns from one of the target table’s unique keys, or different columns. This unique key is used in the context of this mapping to identify records uniquely for reject management and target record updates.

    • Enable Insert: Enable inserting data with this mapping.

    • Enable Update: Enable updating data with this mapping.

    • Aggregate: Indicates that this column contains an aggregate expression. Other (non-aggregated) columns are added in the GROUP BY clause of the queries.

    • Tag: Add a tag to the mapping. Tags are used in certain process templates.

    • Description: Free form text.

  3. Press CTRL+S to save the mapping.

Understanding Column Icons

Source and target columns have an icon that contains the first letter of their datatype. This icon appears in grey when the source column is not used in the mapping or when the target column is not mapped.

In addition, target columns are tagged with icons to identify their key properties. The various icons and their meanings are listed below.

  • image: The yellow key indicates that the column is part of the key. The white star in the upper left corner indicates that the column is not nullable. If reject management is activated, rows with null values for this column are rejected. The letter represents the column data type (I: Integer, V: Varchar, etc.).

  • image: The star in the upper right corner means that the column is not nullable and Convergence for Data Integration checks for null values for this column.

  • image: The cross in the upper right corner means that the column is not nullable but Convergence for Data Integration does not check for null values for this column.

  • image: No sign in the upper right corner means that the column is nullable and Convergence for Data Integration does not check for null values for this column.

  • image: The plus sign in the upper right corner means that the column is nullable but Convergence for Data Integration checks for null values for this column.

  • image: This expression runs in the source.

  • image: This expression runs in the staging area.

  • image: This expression runs in the target.

  • image: These four letters have the following meaning when they appear:
    - I: Enable Insert is selected for this mapping.
    - U: Enable Update is selected for this mapping.
    - A: Aggregate is selected for this mapping.
    - T: One or more tags are set for this mapping.

Defining Computed Fields

Computed fields are virtual fields, which are calculated on the fly during execution. They are only available during execution and are not stored.

To create a computed field on a mapping:

  1. Right-click a column and select Create Computed Field.

  2. You are prompted for a name for the container of the computed fields when the first one is created on the datastore.

  3. Finally, edit the expression of the computed field to suit your needs.

Computed fields can be created only on objects having transformation capabilities (RDBMS supporting the SQL syntax).
It is also possible to create a computed field from another computed field, in order to chain transformations or operations.

Filtering the Sources

The data from the various source datastores may be filtered.

To create a filter:

  1. Select a column from a source datastore in the mapping diagram.

  2. While keeping the mouse button pressed, drag this column into the mapping diagram.

  3. Release the mouse button. A filter is created and appears in the diagram.

  4. In the Expression Editor view, edit the code of the filter. You can lock the expression editor and drag columns from the diagram into the expression editor. See the Filter and Join Expressions section for more information about filter expressions.

  5. Select the filter in the mapping diagram.

  6. In the Properties view, set the following Standard Properties:

    • Enable: Enables or disables the filter.

    • Aggregate: Check this option if this filter is an aggregate and must produce a HAVING clause.

    • Set the Execution Location. The filter may be executed within the source system or in the staging area.

    • Description: Free form text.

  7. Press CTRL+S to save the mapping.

It may be preferable to position the filters on the source to reduce the volume of data transferred from the source data server.
To create a filter that must be executed only when a specific conditional join is activated, drag and drop the source column onto the conditional join itself and update the Expression.

Target Filters

A filter can be configured to be activated only for one target.

To activate a filter only for one target:

  1. Right-click the filter and select Activate > For […].

Staging the Sources

To create a new stage:

  1. In the Project Explorer, select the schema where the stage will evaluate the expressions.

  2. Drag and drop the schema onto the diagram.

  3. Select the stage.

  4. In the Properties view, set the following properties:

    • Alias: Alias used in the expressions when referring to this stage.

    • Tag: Add a tag to the stage. Tags are used in certain process templates.

    • Description: Free form text.

    • In the Advanced properties section, the Integration Sequence specifies the order in which tables and stages without any mutual dependencies must be loaded.

To add fields to the stage:

  1. Select the stage.

  2. Click on the image button to add a new field to the Stage.

  3. In the Properties view, set the following properties:

    • Alias: Alias used in the expressions when referring to this field.

    • Enable: Enables or disables the mapping.

    • Aggregate: Indicates that this column contains an aggregate expression. Other (non-aggregated) columns are added in the GROUP BY clause of the queries.

    • Tag: Add a tag to the mapping. Tags are used in certain process templates.

    • Description: Free form text.

  4. Press CTRL+S to save the mapping.

When you drop a schema on a link between sources and a target, you create a stage that can be initialized by reusing the existing mapping expressions.
You can also add fields to the stage by dragging a source field and dropping it on the stage.

Adding Sets to a Stage

To create a new set:

  1. Select the stage.

  2. Click the image button to add a new set to the Stage.

  3. In the Properties view, set the following properties:

    • Alias: Alias used in the expressions when referring to this set.

    • Description: Free form text.

  4. In the Expression Editor view, set the operators to use between the sets:

    • Each set can be referred to as [nameOfTheSet]

    • For example: ([A] union [B]) minus [C]

  5. Select the set and map the fields in this set.

  6. Press CTRL+S to save the mapping.

Mapping, Filter and Join Expressions

For a mapping, filter or join, you specify an expression in the expression editor. These expressions are also referred to as the code of the mapping, filter or join.

The mapping, join or filter code can include any expression suitable for the engine that will process this mapping, filter or join. This engine is either the engine containing the source or target datastore, or the engine containing the staging area schema. You select the engine when choosing the Execution Location of the mapping, filter or join. This engine is typically a database engine. In this context, expressions are SQL expressions valid for the database engine. Literal values, column names, database functions and operators can be used in such expressions.

Examples of valid filter expressions:

  • CUSTOMER_COUNTRY LIKE 'USA'

  • COUNTRY_CODE IN ('USA', 'CAN', 'MEX') AND UCASE(REGION_TYPE) = 'SALES'

Examples of valid join expressions:

  • CUSTOMER.COUNTRY_ID = COUNTRY.COUNTRY_ID

  • UCASE(CUSTOMER.COUNTRYID) = GEOGRAPHY.COUNTRY_ID AND UCASE(CUSTOMER.REGIONID) = GEOGRAPHY.REGION_ID

Examples of valid mapping expressions:

  • For the CUSTOMER_NAME field: SRC.FIRST_NAME || ' ' || SRC.LAST_NAME

  • For the SALES_NUMBER aggregate field: SUM(ORDERLINE.AMOUNT)

  • For an OWNER field: 'ADMIN' to set a constant value to this field.

When editing a join or a filter, think of it as a conditional expression from the WHERE clause. When editing a mapping, think of it as one expression in the column list of a SELECT statement.

Restrictions

  • It is not possible to use a Stage as a source for another Stage if they are on different connections.

  • A datastore can be either joined or mapped with another datastore, but not both at the same time.

Working with Processes

Processes image define organized sequences of Actions executed at run-time on the IT systems. Processes are organized using Sub-processes, which are themselves composed of actions, sub-processes or References to other processes.

Creating a Process

Creating a new Process

To create a new process:

  1. Click on the image New Process button in the Project Explorer toolbar. The New Process Diagram wizard opens.

  2. Select the parent folder or project for your new resource.

  3. Enter a File Name and then click Finish. The process file is created and the editor for this file opens.

  4. Press CTRL+S to save the process.

Adding a Mapping

To add a mapping to a process:

  1. Drag and drop the mapping from the Project Explorer into the process editor. The mapping is added as a reference in the current process.

Adding an Action

To add an action:

  1. In the Palette, select the action that you want to add to the process. You can expand the accordions in the palette to access the appropriate action.

  2. Click in the process diagram. A block representing your action appears in the diagram.

  3. Right-Click this action and then select Show Properties View.

  4. In the Properties views, set the following values in the Standard section:

    • Name: Name of the action. Use a naming convention for the action names as they are used in the variable path.

    • Dynamic Name: Dynamic Name for this action. This name may change at run-time and is available through the CORE_DYNAMIC_NAME variable.

    • Enable: Select this option to have this action enabled. Disabled actions are not executed or generated for execution.

    • Error Accepted: Select this option to have this action complete with a Success status, even if it has failed.

    • Is Begin Action: Select this option to declare this action explicitly as a startup action.

    • Description: Detailed description of the action.

  5. In the Parameters section, set the action’s parameters. Each action has its own set of parameters.

    • The mandatory parameters appear as editable fields. Enter values for these parameters.

    • The optional parameters (possibly having a default value) appear as non-editable fields. Click the field name to unlock the field value and then enter/select a value for these fields.

  6. Press CTRL+S to save the process.

Adding a Sub-Process

To add a sub-process:

  1. In the Palette, select the Process tool in the Component accordion.

  2. Click in the process diagram. A block representing your sub-process appears in the diagram.

  3. The focus is on this sub-process. Type in a name for this process and then press ENTER.

  4. A new editor opens to edit this sub-process. You can now proceed and edit both the main process and sub-process.

You can nest sub-processes within sub-processes.

Referencing Another Process

For modular development, you can use another process within your process as a reference.

To reference another process:

  1. Drag and drop the process that you want to reference from the Project Explorer into the process editor. This process is added as a reference in the current process.

  2. Select the referenced process step in the process diagram, right-click and then select Show Properties View.

  3. In the Parameters section, set the referenced process’s parameters. Each process has its own set of parameters.

  4. Press CTRL+S to save the process.

At the top of the process, the Breadcrumb Trail is displayed to navigate in the sub-processes.

By default, only the main process editor can be edited (the main process and the sub-processes created inside it).
All process links (mappings, references to other processes, etc.) are read-only, to prevent accidental modification of other processes.

To enable editing of sub-processes:

  • Right-click a sub-process in the Breadcrumb Trail and select Enable Edition. This enables editing directly inside this process.

  • Right-click a sub-process in the Breadcrumb Trail and select Open Editor. This opens a new editor for the selected sub-process.

image

Execution Flow

Steps Features

At a step level, a number of features impact the execution flow. These include Conditional Generation, Step Repetition and Startup Steps.

Conditional Generation

It is possible to make the generation of given steps in a process conditional. For example, you can generate only certain steps in the process depending on the startup parameters of the process.

To make a step’s generation conditional:

  1. Select the step in the process diagram, right-click and then select Show Properties View.

  2. In the Generation section, optionally set the following options:

    • Generation Condition: Condition to generate the action code. If it is left empty, the step is always generated. See Using Scripts in Conditions for more information.

  3. Press CTRL+S to save the process.
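For example, assuming a hypothetical process parameter named FULL_LOAD, a step that should be generated only for full loads could use the following Generation Condition:

${~/FULL_LOAD}$ == 1

The condition is evaluated as JavaScript (Rhino) by default, as described in Using Scripts in Conditions.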

Step Repetition

Repetition allows you to generate a step several times, so that the generated steps run in parallel or in sequence. The repetition is driven by the result of an XPath query.

To enable step repetition:

  1. Select the step in the process diagram, right-click and then select Show Properties View.

    • In the Generation section of an action or process:

      1. Provide an XPath Query. The action will be generated for each value returned by this query.

      2. Provide a Variable Name. This variable will contain each of the values returned by the query and pass them to the action. You can use this variable name in the action code.

      3. Specify whether this repetition must be Sequential or Parallel.

  2. Press CTRL+S to save the process.
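As an illustration, a repetition creating one step per table might be configured as follows. The XPath query and variable name below are hypothetical; the exact query depends on the structure of your metadata:

  • XPath Query: //table/@name

  • Variable Name: TABLE_NAME

The action code can then reference each table name through the ${TABLE_NAME}$ variable.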

This repetition applies to the generation. It can be used, for example, to create one step for each table in a model.

Startup Steps

By default, all the steps with no incoming links are considered as startup steps. As a consequence, a process can start with several simultaneous steps.

In certain cases, (for example in the case of a process that is a loop) there is no step that can be identified as a startup step. In these cases, you must explicitly indicate the startup steps by checking the Is Begin Action option for these steps.

A link sequences two steps. There are four types of links to define the execution flow:

  • Successful Link: This link is followed if the step completes successfully.

  • Unsuccessful Link: This link is followed if the step fails.

  • Anyway Link: This link is followed whether the step completes successfully or fails.

  • Bind Link: This type of link is used to bind the results from a select statement to an action. See Direct Bind Links for more info.

To add a link:

  1. In the Palette, select the Successful Link, Unsuccessful Link or Anyway Link tool in the Link accordion.

  2. Select a first step in the process diagram. While keeping the mouse button pressed, move the cursor onto another step and then release it. A link appears between the two steps.

  3. Right-Click this link and then select Show Properties View.

  4. In the link properties, set the following options:

    • Generation > Generation Type: Define whether this link should be generated if the first step is Successful, Unsuccessful, or always (Anyway).

    • Execution > Execution Type: Define whether this link should be triggered if the first step is Successful, Unsuccessful, or always (Anyway).

    • Execution > Triggering Behavior: Define whether this link is Mandatory, Inhibitory or Not Mandatory for the second step to run.

    • Optionally set Generation and Execution Conditions for this link. See Using Scripts in Conditions for more information.

At execution time:

  1. When the first step completes, the link is generated or not, depending on the step's completion status, the Generation Type and the Generation Condition.

  2. Then the link is triggered depending on the first step’s completion status, the Execution Type and the Execution Condition.

  3. Finally, the second step runs depending on all the incoming links' Triggering Behavior value. For example:

    • A step with two incoming Not Mandatory links (and no other) can start if one or the other link is triggered.

    • A step with two incoming Mandatory links (and no other) can start only if both links are triggered.

    • A step with one incoming Inhibitory link (and many others) will not run if this link is triggered.

A default link is always generated (Generation Type=Anyway), triggered only if the first step is successful (Execution Type=Successful) and Mandatory for the second step to execute.
Links with conditions appear as dotted links.

Advanced Process Development

Process Parameters

A parameter is used to parameterize a process' behavior. The value of this parameter is passed when starting the process or when using it as a referenced process in another one.

To create a process parameter:

  1. In the Palette, select the Parameter tool in the Component accordion.

  2. Click in the process diagram. A block representing your parameter appears in the diagram.

  3. Right-Click this block and then select Show Properties View.

  4. In the Properties views, set the following values in the Core section:

    • Name: Name of the parameter. Use a naming convention for these names since they are used in the variable path.

    • Type: Type of the parameter.

    • Value: Default Value for this parameter.

  5. Press CTRL+S to save the parameter.

To use a process parameter, refer to it as a variable in your code, using the ${<parameter_name>}$ syntax.
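For example, assuming a process parameter named LOAD_DATE and an illustrative SALES table, a SQL action within the process could use the parameter as follows:

DELETE FROM SALES WHERE LOAD_DATE = '${LOAD_DATE}$'

The parameter's value (passed at startup, or from a parent process) replaces the variable reference at run-time.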

Direct Bind Links

The direct bind link is a specific type of link that allows you to run a target action once for each record returned by the source action. For each iteration, the target action has access to the bind variables sent by the source, which can be used through the bind syntax: :{column_name}:.

By default, the execution of a bind link stops as soon as one of the iterations fails. The corresponding error is then thrown and the target action is put in error.
If needed, you can change this behavior to let all the iterations execute, even when an error occurs in one of them. To do so, uncheck the Stop Bind On Error parameter in the advanced tab of the target action.

This parameter can only be changed on Actions targeted by a bind. Moreover, when multiple errors occur during the bind iterations, the first error encountered is thrown on the action at the end of the execution.
The 'Stop Bind on Error' functionality requires a runtime version 17.2.16 or higher.

Example with a SQL Operation action:

  • The first action is a SELECT SQL Operation.

  • The second action can be any action, including an INSERT/UPDATE/DELETE SQL Operation. This second action is repeated for each record returned by the select operation.

  • The columns of the records returned by the select operation can be used in the second action using the bind syntax: :{column_name}:.

For example, the first action performs the following SELECT query:

SELECT CUST_NAME, CUST_COUNTRY, CUST_CODE FROM CUSTOMER

The second action can generate a file with the following parameter:

  • TXT_WRITE_FILENAME: /cust/docs/customer_:{CUST_CODE}:_content.txt

One file is written per customer returned by the select statement, named after the customer code (CUST_CODE).

Scripting

Scripting is a feature that helps you customize the behavior of your processes.

You can use scripting in various locations in Semarchy Convergence for Data Integration:

  • In all the textual fields (conditions, texts, parameters, etc.) with the %e(<language>){<script>}e(<language>)% syntax. In this context, the script is interpreted and the result of the script execution (return code) replaces the script in the field.

  • In the Java Native Scripting Actions steps.

Scripting Language

It is possible to specify the scripting language using the %e(<language>){…​}e(<language>)% syntax. In addition to JavaScript (Rhino), Groovy and Python (Jython) are supported.

For example:

  • %e(rhino){…​}e(rhino)%

  • %e(groovy){…​}e(groovy)%

  • %e(jython){…​}e(jython)%

Using Session Variables

Semarchy Convergence for Data Integration stores information about each action and process. This information is available as session variables. These variables can be used in the code/text of an action, in its parameters, and in its metadata. Variables can also be used in scripts, for example in conditions.

The syntax to use a variable is ${variable_path}$ where variable_path is the path to the variable.

A variable belongs to an action or a process. This action or process may be contained in another process, and so on. This organization is similar to a file system organization, with the main process at the root. You can access a variable with an absolute or relative path, depending on the location from which you request this variable.

For example if a process LOAD_DW contains a sub-process LOAD_DIMENSION which includes a sub-process LOAD_TIME which includes an action called WRITE_FILE, then the return code of this action is the following variable:

${LOAD_DW/LOAD_DIMENSION/LOAD_TIME/WRITE_FILE/CORE_RET_CODE}$

If you use this variable from a READ_FILE action within the LOAD_TIME sub-process, then the relative path is:

${../WRITE_FILE/CORE_RET_CODE}$

To use the return code of the current action, you can use:

${./CORE_RET_CODE}$

If the parent process' name is unknown, you can use the ~/ syntax:

${~/WRITE_FILE/CORE_RET_CODE}$

In the previous case, ~/ is replaced with LOAD_DW/LOAD_DIMENSION/LOAD_TIME/.

Using Scripts in Conditions

Conditions may define whether:

  • A link is generated or triggered (Link’s Generation and Execution Conditions)

  • A step is generated (Step’s Generation Condition)

A condition:

  • is a script that returns a Boolean value.

  • may use session variables using the ${<variable>}$ syntax.

The default language used for the conditions is JavaScript (Rhino implementation) and the interpreter adds the %e(rhino){…​}e(rhino)% around the code in conditions.

Example of a condition: ${./AUTOMATION/CORE_NB_EXECUTIONS}$==1. The CORE_NB_EXECUTIONS variable (number of executions) for the AUTOMATION step should be equal to 1 for this condition to be true.

When a script is more complex than a simple expression, the context variable __ctx__.retValue can be used to return the Boolean value from the script to the condition, as shown below.

This script checks whether a session variable value starts with an 'R'.
%e(rhino){
/* Retrieves a session variable value */
myVarValue = '${~/MYVARIABLE}$'; (1)
if (myVarValue.substring(0,1).equals('R')) (2)
  { __ctx__.retValue = "true"; }
else
  { __ctx__.retValue = "false"; }
}e(rhino)%
1 Retrieval of the value for session variable MYVARIABLE.
2 Using the variable value.
Using the Scripting Context

When a script is interpreted, an object is passed to this script to provide access to the Runtime Engine features. This Context object is accessed using __ctx__.

This object provides a list of methods for manipulating variables and values as well as a return value used for code substitution and condition evaluation. The following sections describe the various elements available with this object.

retValue Variable

The scripting context provides the retValue (__ctx__.retValue) variable.

This String variable is used to:

  • return a Boolean value to the condition interpreter.

  • return a string that replaces the script during code generation.

publishVariable Method
public void publishVariable(String name, String value) {}
public void publishVariable(String name, String value, String type) {}

This method is used to publish a session variable, and takes the following parameters:

  • name: variable path.

  • value: value of the variable.

  • type: type of the variable. The default value is String. The possible values are: Float, Integer, Long, Boolean and String.

Example: Publish a string variable called INCREMENTAL_MODE with value ACTIVE on the parent element of the current action.
__ctx__.publishVariable("../INCREMENTAL_MODE","ACTIVE");
sumVariable, averageVariable, countVariable, minVariable and maxVariable Methods
public String sumVariable(String name) {}
public String sumVariable(String name, String startingPath) {}
public String averageVariable(String name) {}
public String averageVariable(String name, String startingPath) {}
public String countVariable(String name) {}
public String countVariable(String name, String startingPath) {}
public String minVariable(String name) {}
public String minVariable(String name, String startingPath) {}
public String maxVariable(String name) {}
public String maxVariable(String name, String startingPath) {}

These methods return the aggregated value of a variable. The aggregate is either a Sum, Average, Count, Min or Max, and is performed within a given path.

These methods take the following parameters:

  • name: Name of the variable.

  • startingPath: Path into which the variable values must be aggregated. If this parameter is omitted, the values are aggregated for the entire Session.

Example: Aggregate the numbers of rows processed for the LOAD_DIMENSION process and all its sub-processes/actions.
__ctx__.sumVariable("SQL_NB_ROWS","../LOAD_DIMENSION");
getCurrentBindIteration Method
public long getCurrentBindIteration() {}

This method returns the current bind iteration number. It takes no input parameter. See Direct Bind Links for more information about bind iterations.
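As a minimal sketch, a scripting action targeted by a bind link could publish the current iteration number as a session variable (the variable name BIND_ITERATION is illustrative):

%e(rhino){
/* Illustrative: publish the current bind iteration number */
__ctx__.publishVariable("./BIND_ITERATION", "" + __ctx__.getCurrentBindIteration(), "Long");
}e(rhino)%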

getVariableValue method
public String getVariableValue(String path) {}

This method returns the value of a session variable, and takes the following parameter:

  • path: Variable path.

Example: Retrieve the value of the CORE_SESSION_ID variable.
__ctx__.getVariableValue("/CORE_SESSION_ID");
getVariableCumulativeValue Method
public Object getVariableCumulativeValue(String name) {}

When an action iterates due to a bind or a loop, the variables store their value for the last iteration. In addition, numerical variables maintain a cumulated value, which can be retrieved with this method.

This method takes the following parameter:

  • name: Path of the numerical variable.
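For example, after an action that iterates through a bind link, the cumulated number of processed rows could be retrieved as follows (the LOAD_CUSTOMER path is illustrative):

%e(rhino){
/* Illustrative: total number of rows across all bind iterations */
__ctx__.retValue = __ctx__.getVariableCumulativeValue("../LOAD_CUSTOMER/SQL_NB_ROWS");
}e(rhino)%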

getVariableTreeByName Method
public Map<String, IVariable> getVariableTreeByName(String name) {}
public Map<String, IVariable> getVariableTreeByName(String name, String startingPath) {}
public Map<String, IVariable> getVariableTreeByName(String name, boolean withErrors) {}
public Map<String, IVariable> getVariableTreeByName(String name, String startingPath, boolean withErrors)

This method returns a TreeMap object containing the variables matching certain criteria. It takes the following parameters:

  • name: Name of the published variable.

  • startingPath: Path from which the variable must be searched. The default value is ~/.

  • withErrors: Boolean value. If set to true, only the variables from steps in error are retrieved. It is set to false by default.

The returned Java Map object has the name of the action as the key and its value is a variable object with the following methods.

public Object getCumulativeValue();   // variable cumulated value (numbers only)
public String getShortName();         // variable name
public String getName();              // variable name with path
public String getActionName();        // action name with path
public String getType();              // variable type
public String getValue();             // variable value
Usage example in Groovy: Retrieve the stack trace for all the steps in error.
%e(groovy){
def a = ""
def tree = __ctx__.getVariableTreeByName("CORE_STACK_TRACE","~/",true)
if (tree.size() != 0) {
        def es=tree.entrySet()
        es.each{
                  a = a+ "-- ACTION --> " + it.key + "\n"
                  a = a+ it.value.getValue() +"\n\n"
        }
        __ctx__.retValue = a
}
}e(groovy)%
Same example in JavaScript (Rhino)
%e(rhino){
importPackage(java.util);
a = "";
tree = __ctx__.getVariableTreeByName("CORE_STACK_TRACE","~/",true);
if (tree.size() != 0) {
        for (i= tree.keySet().iterator() ; i.hasNext() ; ){
            action = i.next();
            maVar = tree.get(action);
                  a = a+ "-- ACTION --> " + action + "\n";
                  a = a+ maVar.getValue() +"\n\n";
        }
        __ctx__.retValue = a
}
}e(rhino)%
getLstVariablesByName Method
public List<IVariable> getLstVariablesByName(String name) {}
public List<IVariable> getLstVariablesByName(String name, boolean withErrors) {}
public List<IVariable> getLstVariablesByName(String name, String startingPath) {}
public List<IVariable> getLstVariablesByName(String name, String startingPath, boolean withErrors) {}

This method works like getVariableTreeByName, but returns a list of variables instead of a Java Map.

Usage Example in Groovy
%e(groovy){
def a = ""
def lst = __ctx__.getLstVariablesByName("V1","~/")
if (lst.size() != 0) {
        for (var in lst) {
                  a =a + var.getValue() + "\n"
        }
        __ctx__.retValue = a
}
}e(groovy)%
Same example in JavaScript (Rhino)
%e(rhino){
importPackage(java.util);
a = "";
lst = __ctx__.getLstVariablesByName("V1","~/");
if (lst.size() != 0) {
        for (i=0;i<lst.size();i++){
                  a = a+ "-- Value --> " + lst.get(i).getValue() +"\n";
        }
        __ctx__.retValue = a;
}
}e(rhino)%
createBindedPreparedStatement Method
public PreparedStatement createBindedPreparedStatement() {}

This method returns an object that lets you produce a custom set of Bind columns in scripting, which can then be used through an outgoing Bind link.
This object lets you define the columns as well as publish rows.

Definition of a column

The following methods can be used to define the properties of a column.

public void setColumn(int columnId, String columnName);
public void setColumn(int columnId, String columnName, String dataType)
public void setColumn(int columnId, String columnName, String dataType, int precision);
public void setColumn(int columnId, String columnName, String dataType, int precision, int scale);

Update of the properties

The following methods can be used to update the properties of a column.

public void setColumnName(int columnId, String columnName);
public void setColumnPrecision(int columnId, int precision);
public void setColumnType(int columnId, String dataType);

Definition of the value

The following methods can be used to set or update the value of a column in the current row.

public void setBigDecimal(int columnId, BigDecimal value);
public void setBoolean(int columnId, boolean value);
public void setBytes(int columnId, byte[] value);
public void setDate(int columnId, Date value);
public void setDouble(int columnId, double value);
public void setInt(int columnId, int value);
public void setLong(int columnId, long value);
public void setString(int columnId, String value);
public void setTime(int columnId, Time value);
public void setTimestamp(int columnId, Timestamp value);

Publish a new row

public int executeUpdate()
Example in JavaScript (Rhino)
%e(rhino){
// Create the statement
   ps=__ctx__.createBindedPreparedStatement();
// Definition of the columns
    ps.setColumn(1,"TEST1"); // Set column 1
    ps.setColumn(2,"TEST2","VARCHAR",255); // Set column 2
// First Bind Iteration
    ps.setString(1,"VALUE1.1");
    ps.setString(2,"VALUE2.1");
    ps.executeUpdate();
// Second Bind Iteration
    ps.setString(1,"VALUE3.1");
    ps.setString(2,"VALUE3.2");
    ps.executeUpdate();
}e(rhino)%
Use this method in Scripting Actions to create your own Bind columns. This can be useful, for example, to iterate over a list of values in scripting and use the results as Bind values in the target action.
executeCommand and executeRemoteCommand Methods
public String executeCommand(String command) {}
public String executeRemoteCommand(String host, int port, String command) {}

The executeCommand method enables you to execute a Semarchy Convergence for Data Integration command on the current runtime. The executeRemoteCommand method enables you to execute a Semarchy Convergence for Data Integration command on a remote runtime.

The Semarchy Convergence for Data Integration commands are those available when you use the startCommand (.bat or .sh) shell program.

These two methods return the standard output produced by the command’s execution.

The parameters are:

  • command: command to be executed by the runtime.

  • host: hostname or IP address of the remote runtime.

  • port: RMI port of the remote runtime.

Example:
%e(rhino){
__ctx__.executeCommand("versions");
}e(rhino)%

Metadata Linking

It is possible to link metadata to actions and processes. A link allows you to use metadata information in actions without hardcoding it. Instead, references to the metadata information are used in the actions. When linking metadata to an action, these metadata references are converted to metadata information at generation time, and the action code remains generic. If a link is made to another metadata file, or if the metadata is modified, then the action uses the new metadata information.

Metadata linking is frequently used to access, within processes, the connection information and other structural information (table names, lists of columns, etc.) stored in metadata files.

Linking Metadata to a Process/Action

To link metadata to an action:

  1. Open the process that you want to modify.

  2. In the Project Explorer view, select the metadata that you want to link. For example a data store or a data server.

  3. Drag and drop the metadata onto the action in the diagram. It is added as a Metadata Link zone image on the action element.

  4. Select this metadata link, right-click and then select Show Properties View.

  5. In the properties view, set the following parameters:

    • Name: Name of the Metadata Link. This name is used to access the content in the metadata.

    • Description: Description for the metadata link.

    • Visibility: Define whether the link is visible for this current action or for the parent of the action too.

  6. Press CTRL+S to save the process.

Using the Metadata Connection

By linking actions (or processes containing actions) to a data server or schema defined in a metadata file, you automatically use this data server or schema for the action. If the action requires a SQL connection, the SQL connection of the linked metadata is automatically used.

Using Metadata Specific Information

Through the metadata link, you can access specific information within the metadata. This information is accessed using an XPath query. To use a value in a metadata link, in the code or any text field of the action, specify the XPath to the value. Note that:

  • Use the Name set when defining the metadata link as the variable name in the XPath expression.

  • The XPath expression must be surrounded with %x{…​}x% in order to be interpreted accordingly and replaced by its value at run-time.

For example, if a metadata link to a data server is named Local_XE, you can use the following XPath expression to retrieve the JDBC URL of this data server in the action:

%x{ $Local_XE/tech:jdbcUrl() }x%

Working with Restart Points

When a process fails (with an error or a killed status), it can be restarted.

By default, it restarts from the steps that had an error status.

In a process, you can also open and close Restart Points to define other ways to restart.

To open a restart point:

  1. Right-click the step in the process.

  2. Choose Restart Point > Open.

  3. This adds the following icon image

To close a restart point:

  1. Right-click the step in the process.

  2. Choose Restart Point > Close.

  3. This adds the following icon image

If you have defined Restart Points and the process fails without reaching the step on which a Restart Point is closed, the process restarts from the Restart Point above the failed step.

You can have several Restart Points. If the process fails, it restarts, for each failed step, from the last Restart Point above that step.

Working with Variables

Creating Variables in the Metadata

Creating the Metadata file

Variables can be created as metadata with the "Variable Set" metadata type.

image

To create a new variable set:

  1. Click on the image New Metadata button in the Project Explorer toolbar. The New Model wizard opens.

  2. Select the Variable Set type.

  3. Choose a name and a folder to store the metadata file. The metadata file is created.

Creating the Variables

To create a new variable:

  1. Right-click the Set node and choose New Child > Variable.

  2. In the Properties tab, give a name to the variable.

Variable Properties

  • Name (mandatory): Name of the variable.

  • Type: Type of the variable: String, Integer, Boolean or Float. The default value is String.

  • Refresh Query: Used if a Refresh Connection is defined. This query is executed to retrieve a value for the variable. If the query returns multiple rows or multiple columns (or both), the first column of the first row is used as the value.

  • Default Value: The default value of the variable.

  • Saving Connection: Connection used to save the values of the variable. A connection should be defined first. See below for more information.

  • Refresh Connection: Connection used by the Refresh Query. A connection should be defined first. See below for more information.

  • Default Operation: Operation used when invoking the Variable Manager.

Associating Connections to Variables

Connections can be defined and shared in the variable set.

This allows you to:

  1. Retrieve values for a variable using a SQL statement, through a refresh query defined on the variable.

  2. Save and retrieve the values of the variables in a table.

Defining a Connection

To define a connection:

  1. Right-click the Set node and choose New > Connection.

  2. Give a name to the connection.

  3. Drag and drop under the connection the metadata that you want to use for this connection.

You can add several connections in the Variable Set.
You can use different types of metadata nodes in the connection: a server, a schema, a table or even a column. This can be useful to generate the SQL statement for the refresh query.

Saving and Refreshing Connections

Once the connections are defined in the variable set, they can be used to refresh or save the values of the variables.

The Refresh and Save connections can be defined for each variable in the Properties tab, each in its own combo box.

The Refresh and Save connections can also be defined on the Set node. In this case, all the variables for which connections are not defined use these connections.

Using the Metadata to Generate the SQL Statements

The node that has been defined on the connection can be used to generate the SQL statements. The node, in fact, is a link to the Metadata.

In the Refresh Query, you can use the XPath or metadata functions provided by Semarchy Convergence for Data Integration directly inside the { } delimiters.

If the metadata used in the connection is a schema, you can use the following syntaxes:

  • To get the schema name:

    { ./tech:physicalName() }
  • To get a qualified name for an object (i.e., a table):

    { ./md:objectPath(.,'MYTABLE') }

If the metadata used in the connection is a table, you can use the following syntax:

    { ./tech:physicalPath() }
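Putting these together, a Refresh Query on a connection pointing to a schema could look like the following (the AUDIT_TABLE table and LOAD_DATE column are hypothetical):

SELECT MAX(LOAD_DATE) FROM { ./md:objectPath(.,'AUDIT_TABLE') }

At refresh time, the expression between the { } delimiters is replaced with the qualified table name, and the first column of the first returned row becomes the variable value.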

Using Variables in the Mappings

To use a variable in a mapping:

  1. Drag and drop the variable node from the metadata file into the mapping. This will add the variable as a new source in the mapping.

  2. In the mapping, drag and drop the variable into the target column, the join or the filter expressions in which the variable should be used.

In the Expression Editor the variable will automatically have the %{VARIABLE_NAME}% syntax where VARIABLE_NAME is the name of the variable.
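For example, a filter expression could compare a source column to the variable value (the entity and variable names are illustrative):

CUSTOMER.CUST_COUNTRY = '%{COUNTRY_CODE}%'

The variable reference is replaced with the variable's value when the mapping is generated.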

Using Variables in Processes

To use a variable in a process:

  1. Drag and drop the variable node from the metadata file into the process diagram.
    This will instantiate the Variable Manager object with a predefined metadata link to the variable. Predefined properties of the variable (default value, connections, default operation, etc.) are automatically set on the Variable Manager when the process is generated.

  2. Modify the properties in the Properties tab view.

If you want to retrieve a value from a table to parameterize a mapping, you can instantiate a variable in a process as explained above. If the variable has a refresh query then it will be used to retrieve the value. You can then use the variable in the mapping as explained in the previous sections.

Syntax to Use Variables in Expressions

In order to use a variable in an expression (action text or parameter), you must instantiate the variable in the process, and use the %{VARIABLE_NAME}% syntax, where VARIABLE_NAME is the name of the variable.
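For instance, reusing the TXT_WRITE_FILENAME parameter shown earlier in this guide, an instantiated EXPORT_DATE variable (an illustrative name) could parameterize the output file name:

  • TXT_WRITE_FILENAME: /out/customers_%{EXPORT_DATE}%.txt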

Using Variables in Other Variables

In order to use a variable in another variable, there are two cases:

  • The two variables are defined in the same metadata file:
    In this case, use the %{VARIABLE_NAME}% syntax. This syntax can be used in the query or in the default value.

  • The two variables are defined in different metadata files:
    In this case, before using the variable in another variable, you must link the two metadata files:

    1. Open the target metadata file (the metadata file containing the variable that will use the other variable’s value)

    2. Drag and drop the source variable node from the source metadata file to target metadata (on the Variable Set node)

    3. Use the %{VARIABLE_NAME}% syntax.
      This syntax can be used in the query or in the default value.

Using Variables in Other Metadata

In order to use a variable in other metadata properties (for example, a variable in a table condition), you must link the two metadata files:

  1. Open the target metadata file (the metadata file that will use the other variable’s value)

  2. Drag and drop the source variable node from the source metadata file to the target metadata file

  3. Use the %{VARIABLE_NAME}% syntax in the properties

Going to Production

Running Processes

Running a process allows you to review its behavior at design time or to execute it at run time.

Before running a process, make sure that the runtime engine is currently running. Connect to it and activate the Refresh Diagram option to see the execution progress in the process diagram.

To run a process:

  1. Open the process editor.

  2. Right-click the process editor background and then select Execute.

  3. The process starts to execute.

You can monitor the process execution:

  • In the Sessions view, you can see the session state (running).

  • In the Process editor and the Outline view, the actions and sub-process running are colored in green, then in blue when their execution is complete. Actions raising errors and warnings appear in red and orange.

  • In the Session Details view and the Statistics view, you can see the session-level information.

  • Action-level information is visible in the Step Details, Statistics and Variables views.

Working with Deliveries

A delivery is a standalone compiled element that can be executed from a runtime engine. It is generated from a process.

Generating Deliveries

To generate a delivery:

  1. Open the process editor.

  2. Right-click the process editor background and then select Build > Delivery.

  3. The delivery file (with a .deliv extension) is generated in the /runtime/build/deliveries/ sub-directory of the Convergence for DI Designer installation folder.

It is also possible to generate a delivery from a mapping by selecting the Generate Delivery option in the context menu of the mapping background.

Deploying Deliveries

By default, deliveries are generated in the deliveries folder of the runtime engine embedded in the Convergence for DI Designer. To deploy a delivery to another runtime engine, copy the delivery file (.deliv extension) to the /build/deliveries/ sub-directory of the remote runtime engine.
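For example, on a Windows host the copy might look like the following (the delivery name and target path are hypothetical; adjust them to your installation):

```
copy runtime\build\deliveries\load_customers.deliv ^
     \\prod-server\di-runtime\build\deliveries\
```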

Working with Packages

A package is a pre-compiled element that can be used to generate deliveries. Unlike a delivery, a package supports re-configuration for run-time.

When a separate production team owns the production environment, the development team ships packages to this team.

The production team is able to:

  1. Extract the development configuration from the package.

  2. Create, from this configuration, a new configuration for the production environment.

  3. Build, from the package and the new configuration, a delivery that will be executed in the production environment.

Generating a Package

To generate a package:

  1. Open the process editor.

  2. Right-click the process editor background and then select Build > Package.

  3. The package file (with a .pck extension) is generated in the /runtime/build/packages/ sub-directory of the Convergence for DI Designer installation folder.

Generating a Delivery From a Package

From a package, it is possible to generate a delivery using an existing configuration or a configuration file.

Creating and modifying a configuration can be done from the Convergence for DI Designer. For more information, see Working with Configurations. Alternatively, you can extract a configuration file template from the package and modify it to create your own configuration file. See Extracting a Configuration File for more information.

To generate a delivery from a package:

  1. Open an operating system command line.

  2. Go to the /runtime/ directory of the runtime engine into which the package is deployed (the package file is located in its /build/packages/ sub-directory).

  3. Use the following command to generate the delivery:

    buildDelivery.bat <package_name> [-conf <configuration_name>] [-confFile <configuration_file>]

where:

  • <package_name> is the name of the package.

  • <configuration_name> is the (optional) configuration name with which the delivery is generated.

  • <configuration_file> is the (optional) name of the configuration file with which the delivery is generated.

A configuration name or a configuration file must be specified. If both are specified, the values defined in the configuration file override those of the named configuration.
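For example, the following calls generate a delivery from a package named load_customers, first using a named configuration and then using a configuration file (the package and configuration names are hypothetical):

```
buildDelivery.bat load_customers -conf Production
buildDelivery.bat load_customers -confFile production.properties
```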

Extracting a Configuration File

This operation extracts a configuration from a package in the form of a file. This configuration file can be modified and used to generate a delivery from the package.

To extract a configuration file:

  1. Open an operating system command line.

  2. Go to the /runtime/ directory of the runtime engine into which the package is deployed (the package file is located in its /build/packages/ sub-directory).

  3. Use the following command to extract the configuration:

    buildDelivery.bat <package_name> [-conf <configuration_name>] [-confFile <configuration_file>] -extract

where:

  • <package_name> is the name of the package.

  • <configuration_name> is the (optional) configuration name from which the configuration file is generated.

  • <configuration_file> is the (optional) name of the configuration file to generate. If this parameter is not specified, the configuration file is named after the package.

The configuration file (.properties extension) is created in the /build/packages/ sub-directory of the runtime installation directory.

The configuration file appears as in the example below:

#################################################################
### Name: Local XE/MDM Hub (Dev)
### Type: com.stambia.rdbms.schema
#_QRNAcD34EeGJfa9nNKKg6w/TABLE_SCHEM=MDM_HUB
#_QRNAcD34EeGJfa9nNKKg6w/TABLE_CAT=
#################################################################
### Name: Local XE
### Type: com.stambia.rdbms.server
#_ldsxMD3pEeGJfa9nNKKg6w/physicalName=
#_ldsxMD3pEeGJfa9nNKKg6w/driver=oracle.jdbc.driver.OracleDriver
#_ldsxMD3pEeGJfa9nNKKg6w/url=jdbc:oracle:thin:@localhost:1521:XE
#_ldsxMD3pEeGJfa9nNKKg6w/user=MDM_HUB
#_ldsxMD3pEeGJfa9nNKKg6w/password=1355279685E38F0C392FEC2B8550200B3951C0D79B227B95C1DC348DD0BCE8F1

Uncomment the elements in this file that you want to modify.

You can override the value of any metadata object property (even properties not listed in the configuration template) by adding a line to the configuration file in the following form:

<object_id>/<property>=<value>

where:

  • <object_id> is the metadata object ID. It can be viewed from the Core section in the Properties view for the metadata element.

  • <property> is the property of the object that you want to modify.

  • <value> is a valid value for this property.
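As an illustration of this override mechanism (a minimal sketch, not the actual Convergence for DI implementation), the following Python fragment parses `<object_id>/<property>=<value>` lines, ignores commented lines, and merges the overrides over a map of default properties. The object ID and values are modeled on the configuration template above:

```python
def parse_overrides(text):
    """Return {(object_id, property): value} for each uncommented line."""
    overrides = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # blank lines and commented lines are ignored
        key, _, value = line.partition("=")
        object_id, _, prop = key.partition("/")
        overrides[(object_id, prop)] = value
    return overrides

def apply_overrides(defaults, overrides):
    """Merge overrides into a copy of the default property map."""
    merged = dict(defaults)
    merged.update(overrides)
    return merged

# Default properties, modeled on the configuration template above.
defaults = {
    ("_ldsxMD3pEeGJfa9nNKKg6w", "url"): "jdbc:oracle:thin:@localhost:1521:XE",
    ("_ldsxMD3pEeGJfa9nNKKg6w", "user"): "MDM_HUB",
}

# A hypothetical production override for the JDBC URL.
conf = "_ldsxMD3pEeGJfa9nNKKg6w/url=jdbc:oracle:thin:@prod-db:1521:PROD\n"

merged = apply_overrides(defaults, parse_overrides(conf))
print(merged[("_ldsxMD3pEeGJfa9nNKKg6w", "url")])
```

Properties that are not overridden (here, the user) keep their default values.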

After editing the configuration file, you can use it to generate a delivery, as described in Generating a Delivery From a Package.