The sample customer data in this tutorial is modeled after the customers of a bicycle retailer that sells high-end bikes and cycling accessories.
Before following this tutorial, you must have set up Semarchy xDM and completed the Customer B2C tutorial. If you have not yet, follow the Quick Install and Demo Application tutorials from the main Tutorials menu.
The estimated duration of this unit is about 15 minutes for an on-premises database, and 30 minutes if your database is hosted on AWS or Azure.
If you have completed the Quick Install on-premises tutorial, then you already have successfully installed a client and used it to run the create schema statements:
In that case, you can use this client to connect to the xDM repository and data location schemas - simply follow the instructions in the step associated with this client.
If you have not installed any SQL client yet, you can install one of those or use DBeaver, a free universal client. Download DBeaver Community Edition, which is sufficient to follow the Integration tutorials track.
Now that you have selected a SQL client, you are ready to connect to the database.
If you have installed xDM using the the AWS or Azure Quickstart VM:
First enable remote connections to the PostgreSQL database installed on the VM. Go to step 3. Enable remote connections to PostgreSQL on AWS or Azure
Then go to step 4. Connect to PostgreSQL with pgAdmin or 7. Connect using DBeaver.
If you have installed xDM on premises, connect to the database using the default or universal client:
For PostgreSQL: go to step 4. Connect to PostgreSQL using pgAdmin or 7. Connect using DBeaver
For Oracle: go to step 5. Connect to Oracle using SQL Developer or 7. Connect using DBeaver
For SQL Server: go to step 6. Connect to SQL Server database using SQL Server Management Studio or 7. Connect using DBeaver.
Amazon Web Services (AWS) and Microsoft Azure apply default security practices to lock down access to your virtual machine and the PostgreSQL database that is installed on it.
However, to access the PostgreSQL database and perform tasks such as querying and loading data via the SQL API, you need to:
Configure inbound rules on the virtual machine hosting the database to allow SSH connection
Configure your SQL client to connect using an SSH tunnel through the dedicated port.
This step describes how to configure inbound rules on AWS and Azure. The activation of the SSH tunnel is described in the steps dedicated to pgAdmin and DBeaver configuration.
Configure the inbound rules on AWS
In the case of AWS, connecting remotely requires to add 2 inbound rules to your AWS security group to allow SSH and PostgreSQL access:
In the navigation panel, under Network & Security, choose Security Groups.
Retrieve the security group for your instance in the list and open it (left-click on Security Group ID).
On the main page of your Security Group, click Edit inbound rules.
Click Add rule to add a first rule.
Enter the following parameters:
Type: SSH
Port: 22 (pre-selected)
Source: My IP
Skip ahead to step 3. Connect to PostgreSQL with pgAdmin or 7. Connect to any database using DBeaver, depending on the SQL client you are using.
Configure the inbound rules on Azure
In the case of Azure, connection through the SSH port (22) is enabled by default. In this section you will check that SSH connections have not been restricted, and enable them if needed only.
Click the Virtual Machines button on the button bar.
Click on the name of your virtual machine:
Click Networking in the left menu to access network settings:
The list of existing inbound port rules is displayed. Check if connection using the port 22 (SSH) is enabled for any source, or at least your IP.
If no rule exists for Port 22, click the Add inbound port rule button to add a new rule for the SSH port and enter the following parameters:
Source: IP addresses (recommended) or Any (all sources)
Source IP addresses: your IP address or range of IP you want to grant access
Destination: Any
Destination port ranges: 22
Protocol: TCP
Name: SSH
Click the Add button to finalize the creation of your security rule.
In the next step, you will connect to the database using pgAdmin. If you want to connect using DBeaver instead, go to step 7. Connect to any database using DBeaver.
In this step, you will add 2 connections in pgAdmin to access the repository and data location schemas for PostgreSQL.
Configure the connection for the Semarchy Repository
Adding a connection for semarchy_repository lets you access xDM metadata.
Right-click on Servers and select Create>Server
Name the connection after the schema name semarchy_repository
Click Connection to enter Connection parameters:
Host: Hostname or IP address of the database server. For an install on-premises, the hostname or IP address of the machine hosting database server (use 127.0.0.1 if you have installed it on your local machine) For the AWS/Azure Quickstart VM, set this value to 127.0.0.1. You will configure an SSH tunnel into the VM.
Port: Port for the database instance (5432 by default)
Username: semarchy_repository
Password: as per your database setup (by default: same as username)
If your database is hosted on an Azure or AWS server, you need to connect using an SSH tunnel: navigate to the SSH Tunnel tab and enter the following parameters:
Use SSH tunneling: Yes
Tunnel host: IP address or DNS name of remote server
Tunnel port: 22
Username: ubuntu
Authentication: Identity file
Identify file: click the [...] button to select the key pair file you have downloaded during the creation of your remote AWS or Azure instance.
Password: password for semarchy_repository (by default: same as username)
Click Save to finalize the creation of your new connection.
Configure the connection for the Data Location
Repeat the same operations to add a connection for semarchy_customer_b2c_mdm. This lets you access the data location where business data is stored.
Name the connection after the schema name semarchy_customer_b2c_mdm :
Enter the parameters for connection on the Connection tab:
host: same value as for the semarchy_repository connection.
port: same value as for the semarchy_repository connection.
username: semarchy_customer_b2c_mdm
password: as per your database setup (by default: same as username)
If your database is hosted on an Azure or AWS server: navigate to the SSH Tunnel tab and enter the following parameters:
Use SSH tunneling: Yes
Tunnel host: IP address or DNS name of remote server
Tunnel port: 22
Username: ubuntu
Authentication: Identity file
Identify file: click the [...] button to select the key pair file you have downloaded during the creation of your remote AWS or Azure instance.
Password: password for semarchy_customer_b2c_mdm (by default: same as username)
Click Save to finalize the creation of this second connection.
Now that you have created connections to PostgreSQL schemas, jump to step 8. Browse the database objects.
In this step you will add 2 connections in SQL Developer to access the repository and data location schemas for Oracle, in addition to the SYSTEM connection.
Add a connection for SYSTEM. This gives you DBA access to the Database. Enter the following parameters:
Connection name: SYSTEM
Username: system
Password: as per your database setup (by default: same as username)
Hostname: Hostname or IP address of the database server (use 127.0.0.1 if you have installed it on your local machine)
Port: port for the database instance (1521 by default)
SID: SID for your database instance (entered during the initial setup)
Add a connection for SEMARCHY_REPOSITORY. Enter the following parameters:
Connection name: SEMARCHY_REPOSITORY
Username: SEMARCHY_REPOSITORY
Password: as per your database setup (by default: same as username)
Hostname: Same as for the SYSTEM connection
Port: Same as for the SYSTEM connection
SID: Same as for the SYSTEM connection
This lets you access xDM metadata.
Add a connection for SEMARCHY_CUSTOMER_B2C_MDM.
Connection name: SEMARCHY_CUSTOMER_B2C_MDM
Username: SEMARCHY_CUSTOMER_B2C_MDM
Password: as per your database setup (by default: same as username)
Hostname: Same as for the SYSTEM connection
Port: Same as for the SYSTEM connection
SID: Same as for the SYSTEM connection
This lets you access the data location where business data is stored.
Now that you have created connections to Oracle schemas, jump to step 8. Browse the database objects.
In this step, you will add 2 connections in SQL Server Management Studio to access the repository and data location schemas.
In Object Explorer, click Connect and then Database Engine.
Enter the following parameters:
Server name: Hostname or IP address of the database server (use 127.0.0.1 if you have installed it on your local machine)
Authentication: SQL Server
Login: SEMARCHY_REPOSITORY
Password: as per your database setup (by default: same as login)
Click Connect to test the connection.
Repeat operations 1-2-3 to add the second connection using the following parameters:
Server name: same as the SEMARCHY_REPOSITORY connection
Authentication: SQL Server
Login: SEMARCHY_CUSTOMER_B2C_MDM
Password: as per your database setup (by default: same as login)
Now that you have created connections to SQL Server schemas, jump to step 8. Browse the database objects.
DBeaver allows you to connect to any database (PostgreSQL, Oracle, SQL Server, other). The following screenshots are showing PostgreSQL but can be easily adapted to other databases.
In this step you will add 2 connections:
One for semarchy_repository: this lets you access xDM metadata.
One for semarchy_customer_b2c_mdm: this lets you access data locations where business is stored.
Configure the connection for the Semarchy Repository
In the Database Navigator tree view, right-click on the blank area and select Create>Connection
Select your database type (in the screenshot below: PostgreSQL) and click Next.
Enter connection parameters:
Host: Hostname or IP address of the database server. For an install on-premises, the hostname or IP address of the machine hosting database server (use 127.0.0.1 if you have installed it on your local machine) For the AWS/Azure Quickstart VM, set this value to 127.0.0.1. You will configure an SSH tunnel into the VM.
Port: Port for the database instance (5432 by default for PostgreSQL/1521 for Oracle/1433 for SQL Server)
Username: semarchy_repository
Password: As per your database setup (by default: same as username)
If your database is hosted on an Azure or AWS server, you need to connect using an SSH tunnel: navigate to the SSH tab and enter the following parameters:
Use SSH Tunnel: Yes
Host/IP: IP address or DNS name of remote server
Port: 22
Username: ubuntu
Authentication method: Public file
Identify file: click the browse button to select the key pair file you have downloaded during the creation of your remote AWS or Azure instance.
Passphrase: password for semarchy_repository (by default: same as username)
Click Test Connection and check the connection is working. If not, check the parameters again (host, port, username and password).
Back to the connection parameters page, click "Connection details (name, type, ...)" and name the connection after the schema name semarchy_repository, then click on Finish.
Click the OK button to save the connection.
Configure the connection for the Data Location
Repeat the same operations to add a connection for semarchy_customer_b2c_mdm:
Access the Connection tab and enter the parameters for connection:
Host: same as for the semarchy_repository connection.
Port: same as for the semarchy_repository connection.
Username: semarchy_customer_b2c_mdm
Password: As per your database setup (by default: same as username)
If your database is hosted on an Azure or AWS server, you need to connect using an SSH tunnel: navigate to the SSH tab and enter the following parameters:
Use SSH Tunnel: Yes
Host/IP: IP address or DNS name of remote server
Port: 22
Username: ubuntu
Authentication method: Public file
Identify file: click the browse button to select the key pair file you have downloaded during the creation of your remote AWS or Azure instance.
Password: password for semarchy_customer_b2c (by default: same as username)
Click Test Connection and check the connection is working. If not, check the parameters again (host, port, username and password).
Name the connection after the schema name semarchy_customer_b2c_mdm.
Click OK to finalize the creation.
Now that you have created connections to the database schemas, you are ready to have a look at the database objects.
Now that you are connected to the database, take time to look through the schemas.
Navigate into the semarchy_repository schema
Open the semarchy_repository connection and go to databases>postgres>schemas in the left navigation panel.
Browse the semarchy_repository schema and unfold the tables element. Observe that this schema is composed of technical tables which contain xDM metadata:
Under the customer_b2c_mdm connection, the customer_b2c_mdm schema contains all tables of the data location used by the Customer B2C demo application. These are the tables that contain business data for each entity (Nickname, Person, Person_Product, Product...) :
Observe that table names include:
SD_xxx: Source Data
SA_xxx: Source Authoring
MD_xxx: Master Data
GD_xxx: Golden Data
See that column structure differs across the sd/sa/md/gd/... tables.
In this first unit of the Integration tutorial track, you learned how to configure a SQL client for Semarchy xDM. You are now ready to query and load data using the SQL API.
What we've covered
Select a SQL client.
Open remote access (if required)
Setup your SQL Client: you configured a SQL client to access the database, including adding schema connections.
Have a first look at database objects.
What's next?
In the next part of the Integration tutorial track, you will learn how to query data in xDM via the SQL API. You will learn basic and advanced queries as well as integration concepts, such as the difference between fuzzy and basic entities.