Working with Metadata
Semarchy Convergence for DI uses metadata to design, generate and run data integration processes. This metadata includes, for example, the structure of the tables, text files or XML files used in the data integration flows.
A metadata file handled by Semarchy Convergence for DI generally represents a data model, for example a database schema or a folder, which stores tables or files.
A metadata file is created by connecting to the database server, file system, etc. to retrieve the structure of the tables, files, etc. This mechanism is called reverse-engineering.
The following sections explain how to create the three main types of metadata files.
Defining a Database Model
Creating and Reversing a Database Model
This process uses a wizard that performs three steps:
- Create a new data server
- Create a new schema
- Reverse-engineer the datastores
To create a new data server:
- Click on the
New Metadata button in the
Project Explorer toolbar. The
New Model wizard opens.
- In the
Choose the type of Metadata tree, select
RDBMS > <DBMS Technology> where
<DBMS Technology> is the name of the DBMS technology that you want to connect to.
- Click
Next.
- Select the parent folder or project for your new resource.
- Enter a
Metadata Model Name and then click
Finish. The metadata file is created and the
Server wizard opens.
- In the
Server Connection page, enter the following information:
-
Name: Name of the data server.
-
Driver: Select a JDBC Driver suitable for your data server.
-
URL: Enter the JDBC URL to connect this data server.
- Clear the
User name is not required for this database option if this data server requires authentication.
-
User: The database user name.
-
Password: This user’s password.
- (Optional) Modify the following options as needed:
-
Auto Logon: This option allows the Semarchy Convergence for DI Designer to automatically create a connection to this data server when needed.
-
Logon during Startup: This option allows the Semarchy Convergence for DI Designer to create a connection to this data server at startup.
-
AutoCommit: Semarchy Convergence for DI Designer connections to this data server are autocommit connections.
-
Commit On Close: Semarchy Convergence for DI Designer connections to this data server send a commit when they are closed.
- Click the
Connect button to validate this connection and then click
Next. The Schema Properties page opens.
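The difference between the AutoCommit and Commit On Close options can be illustrated outside the Designer. The following is a minimal sketch that uses Python's built-in sqlite3 module as a stand-in for a JDBC connection. It only illustrates general transaction behavior and is not the Designer's implementation; the table t and the helper names are hypothetical.

```python
import os
import sqlite3
import tempfile

def insert_row(path, autocommit=False, commit_on_close=False):
    """Insert one row, then close the connection."""
    # isolation_level=None puts sqlite3 in autocommit mode;
    # "" keeps the default behavior (an implicit transaction is opened).
    con = sqlite3.connect(path, isolation_level=None if autocommit else "")
    try:
        con.execute("INSERT INTO t VALUES (1)")
        if commit_on_close:
            con.commit()  # explicit commit sent just before closing
    finally:
        con.close()  # closing without a commit discards pending changes

def count_rows(path):
    """Count the rows visible to a fresh connection."""
    con = sqlite3.connect(path)
    try:
        return con.execute("SELECT COUNT(*) FROM t").fetchone()[0]
    finally:
        con.close()

def demo():
    """Return the row count after each of the three connection styles."""
    with tempfile.TemporaryDirectory() as folder:
        path = os.path.join(folder, "demo.db")
        con = sqlite3.connect(path)
        con.execute("CREATE TABLE t (id INTEGER)")
        con.commit()
        con.close()
        counts = []
        insert_row(path)                        # neither option: row is lost
        counts.append(count_rows(path))
        insert_row(path, autocommit=True)       # committed per statement
        counts.append(count_rows(path))
        insert_row(path, commit_on_close=True)  # committed at close time
        counts.append(count_rows(path))
        return counts
```

In this sketch, a row inserted with neither option is discarded when the connection closes, whereas either option makes the row visible to later connections.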
To create a new schema:
- In the
Schema Properties page, enter the following information:
-
Name: Use the checkbox to enable this field, and enter a user-friendly name for this schema.
-
Schema Name: Click the
Refresh Values button to retrieve the list of schemas from the database, and then select one of these schemas.
-
Reject Mask: Set the table name mask for the table containing the load rejects (error tables). See the
Table Name Masks section below for more information.
-
Load Mask: Set the table name mask for the temporary load tables. See the
Table Name Masks section below for more information.
-
Integration Mask: Set the table name mask for the temporary integration tables. See the
Table Name Masks section below for more information.
-
Work Schema: Select a schema for storing the load and integration temporary tables for this data server. This schema is also referred to as the
Staging Area. See the
Work and Reject Schema Selection section for more information. Click the
... button to create a new schema definition and set it as the work schema.
-
Reject Schema: Select a schema for storing the errors (rejects) tables for this data server. See the
Work and Reject Schema Selection section for more information. Click the
... button to create a new schema and set it as the reject schema.
- Click
Next. The
Reverse Datastore page opens.
To reverse-engineer the datastores into a schema:
- In the
Reverse Datastore page, optionally set an object filter. In this filter, the
_
wildcard represents exactly one character and the
%
wildcard represents any number of characters.
- Optionally filter the type of objects that you want to reverse-engineer: All, synonyms, tables and views.
- Click the
Refresh button to refresh the list of datastores.
- Select the datastores that you want to reverse engineer in the list.
- Click
Finish. The reverse-engineering process retrieves the structure of these datastores.
- Press
CTRL+S to save the editor.
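The behavior of the object filter wildcards can be sketched as follows. This is a minimal illustration that assumes the filter follows the usual SQL LIKE semantics (_ for one character, % for any number of characters) and is case-sensitive; the function names and table names are hypothetical and the wizard's actual matching may differ.

```python
import re

def filter_to_regex(pattern):
    """Translate a filter using _ and % into a compiled regular expression."""
    parts = []
    for ch in pattern:
        if ch == "_":
            parts.append(".")      # exactly one character
        elif ch == "%":
            parts.append(".*")     # any number of characters, including none
        else:
            parts.append(re.escape(ch))
    return re.compile("".join(parts))

def filter_datastores(names, pattern):
    """Keep only the names that match the whole filter."""
    rx = filter_to_regex(pattern)
    return [name for name in names if rx.fullmatch(name)]
```

For example, under these assumptions the filter CUST% keeps CUSTOMER and CUST_ORDER but not ORDERS, and T_ keeps T1 but not T12.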
Adding a New Schema
To add a new schema to an existing data server:
- In the metadata file editor, select the root node.
- Right-click and select
Action > Launch DataSchema Wizard.
- Follow the steps described in the
"To create a new schema" section of
Creating and Reversing a Database Model.
Reverse-engineering an Existing Schema
To retrieve metadata changes from an existing schema, or to retrieve new table definitions, you must perform a new reverse-engineering.
To reverse-engineer an existing schema:
- In the metadata file editor, select the node corresponding to the schema.
- Right-click and select
Action > Launch DataSchema Wizard.
- Click
Next in the first page of the wizard.
- On the second page, follow the steps described in the
"To reverse-engineer the datastores into a schema" section of
Creating and Reversing a Database Model.
Table Name Masks
Table name masks define name patterns for the temporary objects created at run-time.
Table name masks can be any string parameterized using the following variables:
-
[number]
: Automatically generated increment for the load tables, starting with 1.
-
[targetName]
: Name of the target table of a mapping.
-
${variable}$
or
%{variable}%
: A session variable that is set at run-time.
Note that the resulting string must be a valid table name.
Example:
L_[targetName]_[number]
would create Load tables named
L_CUSTOMER_1
,
L_CUSTOMER_2
, etc for a mapping loading the CUSTOMER table.
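The way a mask expands into a table name can be sketched as follows. This is a minimal illustration of the substitution rules described above, not the product's generator; the function name expand_mask and the session variable env are hypothetical.

```python
import re

def expand_mask(mask, target_name, number, session_vars=None):
    """Expand a table name mask into a concrete table name."""
    session_vars = session_vars or {}
    name = mask.replace("[targetName]", target_name)
    name = name.replace("[number]", str(number))

    def lookup(match):
        # Group 1 is set for ${variable}$, group 2 for %{variable}%.
        variable = match.group(1) or match.group(2)
        return str(session_vars[variable])

    # Substitute both session variable notations.
    return re.sub(r"\$\{(\w+)\}\$|%\{(\w+)\}%", lookup, name)
```

With the mask from the example, expand_mask("L_[targetName]_[number]", "CUSTOMER", 1) returns L_CUSTOMER_1. The sketch does not check that the result is a valid table name for the target DBMS.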
Work and Reject Schema Selection
When defining a schema (optionally with a
Name for this schema), you can optionally refer to two other schemas, the
Work Schema and the
Reject Schema.
These two schemas store, respectively, the temporary load and integration tables and the error (reject) tables for the data tables stored in the schema being defined. In the mappings, the
work schema is also called the
Staging Area. The value for these two schemas may be:
- Empty: In that case, the work and reject schemas are automatically set to the
Schema Name. This means that the temporary and error tables are created in the same schema as the data tables.
- Set to the
Name or
Schema Name of another schema. In that case, the temporary or error tables are stored in this other schema’s
Schema Name.
Tip: It is recommended to configure, by default, two separate schemas for each database server, one for temporary tables (for example, SEM_TEMP) and one for error tables (for example, SEM_ERRORS), and to set them as the Work Schema and the Reject Schema for all the data schemas. This avoids mixing application data (data schemas) and Convergence for DI tables in the same schemas.
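The selection rules for the work and reject schemas can be sketched as a small lookup. This is a minimal illustration of the rules described in this section, assuming a simplified in-memory representation of the schemas; the class and function names are hypothetical and SALES is an example data schema.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class Schema:
    schema_name: str                      # physical schema name in the database
    name: Optional[str] = None            # optional user-friendly name
    work_schema: Optional[str] = None     # reference to another schema, or empty
    reject_schema: Optional[str] = None   # reference to another schema, or empty

def resolve(schema, reference, all_schemas):
    """Return the physical schema name that a work or reject reference designates."""
    if not reference:
        # Empty: temporary and error tables live with the data tables.
        return schema.schema_name
    for candidate in all_schemas:
        # The reference may be the Name or the Schema Name of another schema.
        if reference in (candidate.name, candidate.schema_name):
            return candidate.schema_name
    raise ValueError("Unknown schema reference: %s" % reference)

temp = Schema(schema_name="SEM_TEMP", name="Temporary")
errors = Schema(schema_name="SEM_ERRORS")
sales = Schema(schema_name="SALES", work_schema="Temporary", reject_schema="SEM_ERRORS")
standalone = Schema(schema_name="HR")
schemas = [temp, errors, sales, standalone]
```

Here resolve(sales, sales.work_schema, schemas) yields SEM_TEMP, while the HR schema, which leaves both references empty, resolves to itself.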
Defining a File Model
Creating a File Model
To create a new File metadata file:
- Click on the
New Metadata button in the
Project Explorer toolbar. The
New Model wizard opens.
- In the
Choose the type of Metadata tree, select
File > File Server.
- Click
Next.
- Select the parent folder or project for your new resource.
- Enter a
Metadata Model Name and then click
Finish. The editor is created and the
File Wizard opens automatically.
- In the
Directory page, provide a user-friendly
Name for the directory and select its
Path.
- Click
Next.
- In the
File Properties page:
- Use the
Browse button to select the file within the directory and set the
Physical Name for the file.
- Set a logical
Name for the file datastore.
- Select the file
Type:
Delimited or
Positional (fixed width fields).
- Follow the process corresponding to the file type for reverse-engineering.
Reverse-Engineering a Delimited File
To reverse-engineer a delimited file:
- In the
File Properties page, use the
Refresh button to view the content of the file in the preview. Expand the wizard size to see the file contents.
- Set the following parameters to match the file structure:
-
Charset Name: Code page of the text file.
-
Line Separator: Character(s) used to separate the lines in the file.
-
Field Separator: Character(s) used to separate the fields in a line.
-
String Delimiter: Character(s) delimiting a string value in a field.
-
Decimal Separator: Character used as the decimal separator for numbers.
-
Lines to Skip: Number of lines to skip from the beginning of the file. This count must include the header.
-
Header Line Position: Position of the header line in the file.
- Click
Next.
- Click
Reverse. If the parameters set in the previous page are correct, the list of columns detected in this file is automatically populated.
- Reverse-engineering parses through a number of lines in the file (defined by the
Row Limit) to infer the data types and size of the columns. You can tune the reverse behavior by changing the
Reverse Options and
Size Management properties, and click
Reverse again.
- You can manually edit the detected column datatype, size and name in the table.
- Click
Finish to finish the reverse-engineering.
- Press
CTRL+S to save the file.
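How these parameters drive the column detection can be sketched with Python's csv module. This is a minimal illustration, not the wizard's algorithm: it assumes that the header line position is 1-based, that values do not contain line breaks, and that only three hypothetical types (INTEGER, NUMERIC, VARCHAR) are inferred. All function names are hypothetical.

```python
import csv

def read_delimited(text, field_sep=";", string_delim='"',
                   lines_to_skip=1, header_line=1):
    """Return (header, rows) for a delimited file given as a string."""
    lines = text.splitlines()
    # The header is read from its own position (assumed 1-based).
    header = next(csv.reader([lines[header_line - 1]],
                             delimiter=field_sep, quotechar=string_delim))
    # Data starts after the skipped lines, which include the header.
    rows = list(csv.reader(lines[lines_to_skip:],
                           delimiter=field_sep, quotechar=string_delim))
    return header, rows

def infer_type(values, decimal_sep=","):
    """Infer a column type from sample values."""
    def all_parse(convert):
        try:
            for value in values:
                convert(value)
            return True
        except ValueError:
            return False

    if all_parse(int):
        return "INTEGER"
    if all_parse(lambda v: float(v.replace(decimal_sep, "."))):
        return "NUMERIC"
    # Otherwise size the text column on the longest sampled value.
    return "VARCHAR(%d)" % max((len(v) for v in values), default=0)

def infer_columns(header, rows, row_limit=100, decimal_sep=","):
    """Infer (name, type) pairs from at most row_limit rows."""
    sample = rows[:row_limit]
    return [(name, infer_type([row[i] for row in sample], decimal_sep))
            for i, name in enumerate(header)]

SAMPLE = 'ID;NAME;AMOUNT\n1;"Smith; John";12,5\n2;"Doe";7\n'
```

Because only the first row_limit rows are sampled, a longer or differently typed value further down the file is not taken into account, which is why the detected sizes and types may need manual adjustment.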
Reverse-Engineering a Fixed-Width File
To reverse-engineer a fixed-width file:
- In the
File Properties page, use the
Refresh button to view the content of the file in the preview. Expand the wizard size to see the file contents.
- Set the following parameters to match the file structure:
-
Charset Name: Code page of the text file.
-
Line Separator: Character(s) used to separate the lines in the file.
-
Decimal Separator: Character used as the decimal separator for numbers.
-
Lines to Skip: Number of lines to skip from the beginning of the file. This count must include the header.
-
Header Line Position: Position of the header line in the file.
- Click
Next.
- Click
Refresh to populate the preview.
- From this screen, you can use the table to add, move and edit column definitions for the file. As you add columns, the preview shows the position of the columns in the file.
- Click
Finish to finish the reverse-engineering.
- Press
CTRL+S to save the file.
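The column definitions of a fixed-width file amount to a start position and a size for each field. The following minimal sketch shows how such definitions slice a line; it assumes 1-based start positions and trims padding spaces, and the column layout and function name are hypothetical.

```python
def parse_fixed_width(line, columns):
    """Split one line using (name, start, size) column definitions.

    start is assumed to be 1-based; values are stripped of padding spaces.
    """
    record = {}
    for name, start, size in columns:
        begin = start - 1
        record[name] = line[begin:begin + size].strip()
    return record

# Hypothetical layout: 4-character ID, 10-character NAME, 8-character AMOUNT.
COLUMNS = [("ID", 1, 4), ("NAME", 5, 10), ("AMOUNT", 15, 8)]
LINE = "0001" + "John Doe".ljust(10) + "00012.50"
```

Calling parse_fixed_width(LINE, COLUMNS) returns the three values keyed by column name.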
Defining an XML Model
To create a new XML metadata file:
- Click the
New Metadata button in the
Project Explorer toolbar. The
New Model wizard opens.
- In the
Choose the type of Metadata tree, select
XML > XML Schema.
- Click
Next.
- Select the parent folder or project for your new resource.
- Enter a
Metadata Model Name and then click
Finish. The editor is created and the
XML Wizard opens.
- In the
Name field, enter a name for this schema.
- In the
XML Path field, enter the full path to the XML file. This file does not need to physically exist at this location if you have the XSD, and can be generated as part of a data integration process.
- In the
XSD Path field, enter the full path to the XSD describing the XML file. If this XSD does not exist, click
Generate to generate an XSD from the content of the XML file provided in the
XML Path.
- Click
Refresh and then select the root element for this schema. If the XSD has several root nodes, it is possible to repeat this operation to reverse-engineer all the hierarchies of elements stored in the XML file. Each of these hierarchies can point to a different XML file specified in the properties of the element node.
- Click
Reverse. The reverse-engineering process retrieves the XML structure from the XSD.
- Click
Finish to close the wizard and return to the editor.
- Press
CTRL+S to save the editor.
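The candidate root elements offered by the wizard correspond to the top-level element declarations of the XSD. The following minimal sketch lists them with Python's standard xml.etree.ElementTree module; it is an illustration of the XSD structure rather than the wizard's own logic, and the sample schema is hypothetical.

```python
import xml.etree.ElementTree as ET

XS = "{http://www.w3.org/2001/XMLSchema}"

def root_elements(xsd_text):
    """Return the names of the top-level xs:element declarations."""
    schema = ET.fromstring(xsd_text)
    # Only direct children of xs:schema can serve as document roots.
    return [element.get("name") for element in schema.findall(XS + "element")]

SAMPLE_XSD = """<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="customers">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="customer" type="xs:string" maxOccurs="unbounded"/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
  <xs:element name="orders" type="xs:string"/>
</xs:schema>"""
```

In this sample, customers and orders are both possible roots, while the nested customer element is not.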
Defining a Generic Model
A Generic Model is useful when you want custom metadata available to parameterize your developments.
To define a Generic Model:
- Click the
New Metadata button in the
Project Explorer toolbar. The
New Model wizard opens.
- In the
Choose the type of Metadata tree, select
Generic >
Element.
- Click
Next.
- Select the parent folder or project for your new resource.
- Enter a
File Name for your new metadata file and then click
Finish. The metadata file is created and the editor for this file opens.
- Select the
Element node and enter the
Name for this element in the
Properties view.
A Generic Model is a hierarchy of
Elements and
Attributes. The
Attribute values of an element can be retrieved using the usual Semarchy Convergence for DI XPath syntax.
To create a new
Element:
- Right-click on the parent element.
- Select
New >
Element
- In the
Properties view, enter the name of the new
Element.
To add an
Attribute to an
Element:
- Right-click on the parent element.
- Select
New >
Attribute
- In the
Properties view enter the
Name and the
Value of the
Attribute. This name is used to retrieve the value of the attribute.
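The idea of a hierarchy of elements and attributes queried by name can be sketched with Python's standard xml.etree.ElementTree module. This is only an analogy: the tag names, the Environment and Ftp elements, the host attribute and the path expression are all hypothetical, and neither the storage format of the metadata file nor the product's XPath syntax is shown here.

```python
import xml.etree.ElementTree as ET

# Build a small hierarchy: Environment > Ftp, with one attribute on Ftp.
MODEL = ET.Element("element", name="Environment")
ftp = ET.SubElement(MODEL, "element", name="Ftp")
ET.SubElement(ftp, "attribute", name="host", value="ftp.example.com")

def attribute_value(root, element_name, attribute_name):
    """Return the value of a named attribute on a named descendant element."""
    path = ".//element[@name='%s']/attribute[@name='%s']" % (
        element_name, attribute_name)
    node = root.find(path)
    return None if node is None else node.get("value")
```

The lookup relies on the attribute's name, which is why a meaningful name matters when the attribute is created.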
Working with Configurations
Configurations allow you to parameterize metadata for a given context. For example, a single data model declared in Semarchy Convergence for DI may have two configurations,
Production and
Development. When used in the
Development configuration, the model points to a development server, and when used in the
Production configuration, it points to a production server. Both servers contain the same data structures (as defined in the model), but not the same data, and have different connection information.
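Conceptually, a configuration overrides some properties of the model and leaves the others unchanged. The following minimal sketch shows this idea with plain dictionaries; the driver, URLs and user names are hypothetical values, and the merge logic is an illustration rather than the product's resolution mechanism.

```python
# Default connection properties stored in the model (hypothetical values).
DEFAULTS = {
    "driver": "org.postgresql.Driver",
    "url": "jdbc:postgresql://dev-db:5432/sales",
    "user": "dev_user",
}

# Configuration-specific overrides; unlisted properties keep the default.
OVERRIDES = {
    "Development": {},
    "Production": {
        "url": "jdbc:postgresql://prod-db:5432/sales",
        "user": "prod_user",
    },
}

def effective_properties(configuration):
    """Return the connection properties that apply in a configuration."""
    if configuration not in OVERRIDES:
        raise KeyError("Unknown configuration: %s" % configuration)
    return {**DEFAULTS, **OVERRIDES[configuration]}
```

Both configurations share the same driver and data structures in this sketch; only the URL and the user differ in Production.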
Creating a Configuration
To create a configuration:
- In the Semarchy Convergence for DI Designer toolbar, click the
Edit button.
- The
Configuration Definition editor (
conf.cfc
) opens.
- Right-click the root node (
Cfc), then select
New > Configuration.
- In the
Properties view, enter the new configuration’s properties:
-
Code: Code of the configuration. This code appears in the
Configurations drop-down list in the Semarchy Convergence for DI Designer toolbar.
-
Description: Description of the configuration.
-
Execution Protection: Set to true if you want to be prompted for a password when executing a process in this configuration.
-
Selection Protection: Set to true if you want to be prompted for a password when switching the Semarchy Convergence for DI Designer to this configuration.
-
Password: Protection password for this configuration.
- Press
CTRL+S to save the configuration.
Using Configurations
In a metadata file, it is possible to define configuration-specific values for certain properties. The following sections describe the most common uses of configurations in metadata files.
Using Configuration for Databases
In databases, you can customize the connection information to the data server as well as the data schema definitions using configurations.
To create a data server configuration:
- In the database metadata file editor, select the root node corresponding to your data server.
- Right-click and select
New > DataServer Configuration.
- In the
Properties view:
- Select the configuration in the
Configuration Name field.
- Set the different values for the connection information (
Driver,
URL,
User and
Password) as required.
- Press
CTRL+S to save the database metadata file.
To create a data schema configuration:
- In the database metadata file editor, select the node corresponding to your data schema.
- Right-click and select
New > DataServer Configuration.
- In the
Properties view:
- Select the configuration in the
Configuration Name field.
- Set different values for the schema information (
Schema Name,
Reject Schema, etc.) as required.
- Press
CTRL+S to save the database metadata file.
Note: You can define configurations at all levels in the database metadata file, for example to define configuration-specific structural features for the datastores, columns, etc.
Using Configuration for Files
In files, you can customize the directory location as well as the file names depending on the configuration using a directory or a file configuration.
For example:
- in a
development configuration, a file is located in the
C:\temp\
directory and named
testcustomers.txt
- in a
production configuration, a file is located in the
/prod/files/
directory and named
customers.txt
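The example above combines a directory configuration (the path) with a file configuration (the physical name). The following minimal sketch shows how the two values combine into a file location for each configuration; the dictionaries and the function name are hypothetical, and pathlib's pure path classes are used so that the Windows and POSIX paths can both be handled on any platform.

```python
from pathlib import PurePosixPath, PureWindowsPath

# Directory configuration: one path per configuration.
DIRECTORY_PATHS = {
    "development": PureWindowsPath("C:\\temp\\"),
    "production": PurePosixPath("/prod/files/"),
}

# File configuration: one physical name per configuration.
PHYSICAL_NAMES = {
    "development": "testcustomers.txt",
    "production": "customers.txt",
}

def file_location(configuration):
    """Return the full location of the file in a given configuration."""
    return str(DIRECTORY_PATHS[configuration] / PHYSICAL_NAMES[configuration])
```

The same logical file datastore therefore resolves to a different physical file depending on the active configuration.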
To create a directory configuration:
- In the File metadata file editor, select the node corresponding to your directory.
- Right-click and select
New > Directory Configuration.
- In the
Properties view:
- Select the configuration in the
Configuration Name field.
- Set a value for the
Path specific to the configuration.
- Press
CTRL+S to save the File metadata file.
To create a file configuration:
- In the File metadata file editor, select the node corresponding to your file.
- Right-click and select
New > File Configuration.
- In the
Properties view:
- Select the configuration in the
Configuration Name field.
- Set a value for the
Physical Name of the file specific to the configuration.
- Press
CTRL+S to save the File metadata file.
Note: You can define configurations at all levels in the File metadata file, for example to define configuration-specific structural features for flat files.
Using Configuration for XML
In XML files, you can customize the path of the XML and XSD files depending on the configuration using a schema configuration.
To create a schema configuration:
- In the XML metadata file editor, select the root node.
- Right-click and select
New > Schema Configuration.
- In the
Properties view:
- Select the configuration in the
Configuration Name field.
- Set a value for the
XML Path and
XSD Path specific to the configuration.
- Press
CTRL+S to save the XML metadata file.
Note: You can define configurations at all levels in the XML metadata file, for example to define configuration-specific structural features in the XML file.