Reverse-Engineering the Metadata

Semarchy Convergence for Data Integration moves and transforms data between datastores. These datastores are database tables, flat files, XML structures, web services, etc. The connection information to these datastores must be declared in Semarchy Convergence for Data Integration, and their structure (the metadata) can be automatically reverse-engineered. The reverse-engineering process creates a definition of the datastores' structures in Convergence for DI.
Data flows are created at design-time using this metadata. Run-time execution will use the connection information to perform the extract-load and transform operations between the source and target systems.

Reverse-Engineering the Databases

The MDM Hub is stored in an Oracle schema named SEMARCHY_DEMO_MDM.
In the same database, the Source HR and Marketing data is stored in a schema named SEMARCHY_DEMO_SOURCE.

Defining the Oracle Data Server and MDM Hub Schema

The Oracle Data Server is already configured in the GettingStarted > Metadata > OracleServer metadata file.
The three schemas used in this tutorial are also already defined:

In this last schema definition, one of the tables required for this tutorial, the SRC_EMPLOYEE table, which contains HR data, was not reverse-engineered. We will check the configuration of the Oracle Instance and reverse-engineer the missing table.

  1. In the Project Explorer, expand the GettingStarted > Metadata node.
  2. Double-click OracleServer. The Metadata editor opens.
  3. In this editor, select the OracleServer node, right-click and select Actions > Edit Connection.
  4. In the Duplicate Connection Profile dialog, check that the connection information to the Oracle instance is correct. If not modify this connection information:
  5. Click the Connect button to validate this connection.
  6. Click OK to close this dialog.
  7. In the editor, select the OracleServer > SEMARCHY_DEMO_SOURCE node.
  8. Expand the node SEMARCHY_DEMO_SOURCE. SRC_CONTACT and SRC_CUSTOMER have been defined for you. You will define SRC_EMPLOYEE.
  9. Right-click and select Actions > Launch DataSchema Wizard. The Schema Wizard opens and shows the properties of this schema.
  10. Click Next. The Reverse Datastore page opens.
  11. Click the Refresh button. A filtered list of datastore appears.
  12. Select the SRC_EMPLOYEE table.
  13. Click Finish. The reverse-engineering process starts and retrieves the structure of the SRC_EMPLOYEE table.
  14. When the reverse-engineering process is finished, press CTRL+S to save the editor.
  15. In the OracleServer.md editor, expand the OracleServer > SEMARCHY_DEMO_SOURCE node. The list of reverse-engineered tables appears as shown below.
  16. Close this editor.

Reverse-Engineering the Cost Centers Files

The file containing the Cost Centers data provided by the Finance team is a delimited file stored in the <semarchy_di>/samples/files/ folder. We will declare and reverse-engineer this file.

  1. Click the Create New Metadata button in the Project Explorer toolbar. The New Metadata Model wizard opens.
  2. In the Choose the type of Metadata tree, select file > File Server.
  3. Click Next.
  4. Make sure that the GettingStarted/Metadata folder is selected as the parent folder.
  5. In MetaData Model Name enter DemoFile and then click Finish. The DemoFile.md editor opens and the File Wizard open automatically.
  6. In the Directory page, enter Demo Files Folder in the Name field.
  7. Click the Browse button and select the <semarchy_di>/samples/files/.
  8. Click Next. The Set File Properties page opens.
  9. In this page:
  10. Click Next.
  11. In the Reverse page:
  12. Click Finish. The DemoFile.md editor is populated with the definition of the cost_centers file’s fields.
  13. Press CTRL+S to save this editor.

From the DI Designer, you can view the content of the reverse-engineered datastores.

  1. In the DemoFile.md editor, right-click the cost_centers node and select Actions > Consult Data.
  2. A SQL editor named cost_centers.sql appears with a predefined SQL query.
  3. Click the Execute Current SQL button.
  4. In the lower panel of the editor, the content of the file is displayed.
  5. Close the cost_centers.sql editor.
  6. Close the DemoFile.md editor.

Reverse-Engineering the XML Files

In this project, we will use two XML files.

Defining the CRM XML File

The file containing the CRM exported data is stored in the <semarchy_di>/samples/xml/ folder.

  1. Click the Create New Metadata button in the Project Explorer toolbar. The New Metadata Model wizard opens.
  2. In the Choose the type of Metadata tree, select XML > XML Schema.
  3. Click Next.
  4. Make sure that the GettingStarted/Metadata folder is selected as the parent folder.
  5. In MetaData Model Name, enter DemoXMLCRM and then click Finish. The DemoXMLCRM editor is created and the XML Wizard opens.
  6. In the Default XML Path field, click the Browse button and browse for the <semarchy_di>/samples/xml/crm_customer.xml file.
  7. In the XSD Path field, click the Browse button and browse for the <semarchy_di>/samples/xml/crm_customer.xsd file.
  8. Click the Refresh button to refresh the Root Element selection box content.
  9. In the Root Element field, select Customers.
  10. Click Reverse. The reverse-engineering process retrieves the XML structure from the XSD.
  11. Click Finish to close the wizard and return to the editor.
  12. Press CTRL+S to save the editor.
  13. Expand the nodes and review the XML structure that was reverse-engineered.
  14. Close this editor.

Defining the XML Target

The file into which we will load golden records for consumption by third party applications does not exist yet, but its XSD is stored in the <semarchy_di>/samples/xml/ folder.

  1. Click the Create New Metadata button in the Project Explorer toolbar. The New Metadata Model wizard opens.
  2. In the Choose the type of Metadata tree, select XML > XML Schema.
  3. Click Next.
  4. Make sure that the GettingStarted/Metadata folder is selected as the parent folder.
  5. In MetaData Model Name, enter DemoXMLGolden and then click Finish. The DemoXMLGolden editor is created and the XML Wizard opens.
  6. In the XSD Path field, click the Browse button and browse for the <semarchy_di>/samples/xml/golden_customer.xsd file.
  7. In the XML Path field, enter <semarchy_di>/samples/xml/golden_customer.xml. Note that the file does not exist and will be created by Semarchy Convergence for DI. It cannot be selected using the Browse button.
  8. Click the Refresh button.
  9. In the Root Element field, select Customers root element for this schema.
  10. Click Reverse. The reverse-engineering process retrieves the XML structure from the XSD.
  11. Click Finish to close the wizard and return to the editor.
  12. Press CTRL+S to save the editor.
  13. Expand the nodes and review the XML structure that was reverse-engineered.
  14. Close this editor.