Getting started with Elasticsearch

This page contains information to help you get started with Elasticsearch in Semarchy xDI.

Connect to your data

Metadata creation

To create an Elasticsearch Metadata, launch the Metadata creation wizard. Select the Elasticsearch Metadata in the list, and follow the wizard. Make sure you select the module that corresponds to your Elasticsearch version.

You can then configure the server properties.

Server properties

[Figure: Elasticsearch server metadata overview]

The following server properties are available:

Property: Description

Name: Your own label for the Elasticsearch server metadata.

Module: The internal module in use for the metadata.

Elasticsearch Version: Elasticsearch server version and API to use. This property is mandatory.

Protocol: Protocol to use when communicating with Elasticsearch. Starting with Elasticsearch version 8, HTTP/REST is the only supported protocol.

Transport Addresses: Comma-separated list of addresses used for data operations with the Java API, in the format <hostname>:<port>.

Cluster Name: Elasticsearch cluster name.

Path Home: Elasticsearch installation path on the server. You can use "." to tell the Elasticsearch driver to use the current installation. This is used for clusters secured with Search Guard.

Http URL: HTTP URL used for reverse operations on the cluster.

Http User: HTTP user used when performing reverse operations on the cluster.

Http Password: HTTP password used when performing reverse operations on the cluster.

Keystore: The keystore to use when the connection is secured with SSL/TLS through a self-signed or Java-unknown certificate.

Truststore: The truststore to use when the connection is secured with SSL/TLS through a self-signed or Java-unknown certificate.
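The Transport Addresses format can be checked before it is entered in the metadata. The following is a minimal Python sketch, not part of Semarchy xDI: the function name and host names are hypothetical, and 9300 is only the port Elasticsearch conventionally uses for transport.

```python
def parse_transport_addresses(value: str) -> list[tuple[str, int]]:
    """Split a comma-separated '<hostname>:<port>' list into (host, port) pairs."""
    addresses = []
    for item in value.split(","):
        item = item.strip()
        if not item:
            continue  # tolerate trailing commas and stray whitespace
        host, sep, port = item.rpartition(":")
        if not sep or not host or not port.isdigit():
            raise ValueError(f"expected <hostname>:<port>, got {item!r}")
        addresses.append((host, int(port)))
    return addresses


# Hypothetical host names, for illustration only.
print(parse_transport_addresses("es-node-1:9300, es-node-2:9300"))
```

An entry without a port, such as "es-node-1", raises a ValueError instead of being silently accepted.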

Define the SSL/TLS certificate

If the Elasticsearch server is configured with an SSL/TLS certificate delivered by an official authority, Java will recognize it automatically. No specific configuration is required.

If the Elasticsearch server is configured with an SSL/TLS certificate unknown by Java, such as a self-signed certificate, the Elasticsearch metadata must be configured with this certificate for the connection to succeed. To configure the certificate used for the connection:

  1. Create and configure a Certificates and Keys metadata.

    1. Add and configure a Keystore or Truststore node inside this metadata. This must refer to the corresponding file containing the certificate.

  2. Configure the Elasticsearch metadata to use this node:

    1. In the Elasticsearch metadata, click the server node.

    2. In the Keystore property, select or drag and drop the Keystore or Truststore node.

    3. In the Truststore property, select or drag and drop the Keystore or Truststore node.

Define and reverse indexes

To define an Index and reverse it:

  1. Right-click the server node and choose New > Index.

  2. Fill in the Index properties.

Example:

[Figure: Elasticsearch index metadata overview]

The following properties are available:

Property: Description

Name: Logical name for the Index. You can give it any name; it is only a label for the Metadata.

Index Name: Elasticsearch physical name of the index. This name is used during data and reverse operations. You can also enter an Elasticsearch Index Alias.

Reverse Size: Number of documents retrieved to analyze the structures. Because documents do not necessarily share the same structure, analyzing several of them makes it possible to retrieve all the possible elements.

Reverse from: The offset from which the reverse fetches the documents. For example, if set to 20, the analysis starts at the 20th document.

Aliased Indexes: The list of indexes the Alias points to, when the specified 'Index Name' is an Alias. It is filled automatically at reverse.
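To illustrate why Reverse Size and Reverse from matter, here is a minimal Python sketch that merges the fields seen across a window of documents. It is not how Semarchy xDI implements the reverse; the function name and sample documents are hypothetical, and the offset is assumed to behave as a zero-based skip count, which may differ from the product's behavior.

```python
def collect_structure(documents, reverse_from=0, reverse_size=10):
    """Merge the field names and value types seen in a window of documents."""
    structure = {}
    # Only the documents inside the window are analyzed.
    window = documents[reverse_from:reverse_from + reverse_size]
    for doc in window:
        for field, value in doc.items():
            structure.setdefault(field, set()).add(type(value).__name__)
    return structure


# Hypothetical documents that do not all share the same fields.
docs = [
    {"title": "a", "price": 1.5},
    {"title": "b", "tags": ["x"]},
    {"title": "c", "stock": 3},
]
print(collect_structure(docs, reverse_from=0, reverse_size=2))
```

With a window of two documents the "stock" field is never seen; widening the window to three documents picks it up.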

The Component supports using either the real name of an Index or an Alias Name pointing to this index.

  • When reversing an Index using an Alias as Index Name, make sure that the Alias exists on the Elasticsearch server.

  • After the reverse, the 'Aliased Indexes' property is filled automatically with all the Indexes the alias points to.

If the Index Name property matches an existing index or alias on the server, you can right-click it and select Actions > Reverse All to reverse all the document types of the index.

Because the reverse works by parsing document structures, at least one document must exist on the server before its type can be reversed.

The Reverse All action can also update Metadata that you have already reversed. In this case, if several document metadata share the same Document Type Name, only the first one is updated.

Starting with Elasticsearch 7.x, document types are deprecated and only one is allowed per index.

In Semarchy xDI, starting from Elasticsearch 7.x, the document type is therefore not taken into account for reverse operations and is replaced with "_doc".
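The version rule above can be summarized in a few lines of Python. This is only an illustration of the rule as stated in this page; the function name is hypothetical and does not correspond to an xDI API.

```python
def effective_document_type(major_version: int, configured_type: str) -> str:
    """Return the type name used for reverse operations."""
    # From 7.x onward the configured name is ignored in favor of "_doc".
    return "_doc" if major_version >= 7 else configured_type


print(effective_document_type(6, "article"))
print(effective_document_type(8, "article"))
```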

Define and reverse document types

Overview

To manually define a Document Type:

  1. Right-click an Index node and choose New > Type.

  2. Fill in the Document Type properties.

Example:

[Figure: Elasticsearch document type metadata overview]

The following properties are available:

Property: Description

Name: Logical name for the Type. You can give it any name; it is only a label for the Metadata.

Document Type Name: Elasticsearch physical name of the document type. This name is used for data and reverse operations. It is mandatory.

If the Document Type Name property matches an existing Document in the Index, you can right-click it and select Actions > Reverse to reverse it.

Because the reverse works by parsing document structures, at least one document must exist on the server before its structure can be reversed.

Starting with Elasticsearch 7.x, document types are deprecated and only one is allowed per index.

In Semarchy xDI, starting from Elasticsearch 7.x, the document type is therefore not taken into account for reverse operations and is replaced with "_doc".

Document type structure

The structure of a Document is a JSON structure.

You can define it manually if you are designing a Document Type that does not exist yet.

To manually define the structure of a Document Type, right-click the type and choose New > [Object | Value | Array].

  • Elasticsearch has specific values, such as version, creation date, and id, at the root of the document type node.

  • Moreover, all the values of your document must be in a child JSON object named "document", which is mandatory.
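As a rough illustration of the two rules above, the following Python sketch builds a structure with root-level values and a mandatory "document" child object. Apart from "id", "version" and "document", which are named in this page, every field name is hypothetical, and the exact root-level names used by Semarchy xDI may differ.

```python
import json

# Root-level names other than "document" are assumptions made for this example.
document_type = {
    "id": "1",
    "version": 1,
    "document": {  # mandatory child object holding the business values
        "title": "Example",             # Value
        "tags": ["a", "b"],             # Array
        "author": {"name": "Jane"},     # Object
    },
}

print(json.dumps(document_type, indent=2))
```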

Example of an Elasticsearch Document structure

[Figure: Elasticsearch document type structure]

Joined documents

Elasticsearch offers the ability to join documents.

This lets you define parent/child relations between documents.

You can work with such documents in Semarchy xDI:

  1. On your Document Type, add on the root node a new value named "routing".

  2. Define it as a "string" value.

[Figure: Elasticsearch document type with a "routing" value]

You can then map this field inside your Elasticsearch Mappings.

Define and reverse search queries

Common query

A Search Query is a placeholder that will contain an Elasticsearch query.

It is used to retrieve data from a predefined query.

To define a Search Query:

  1. Right-click the server node and choose New > Search.

  2. Specify the search properties.

  3. Finally, right-click the search node and choose Actions > Reverse. This executes the query and parses the response to get the returned structure.

Example:

[Figure: Elasticsearch search query metadata overview]

The following properties are available:

Property: Description

Name: Label for the search query.

Indexes: Comma-separated list of Indexes on which the query is executed.

Doc types: Comma-separated list of document types on which the query is executed.

Query: Elasticsearch query to execute, in JSON format. Refer to the Elasticsearch documentation for further information on how to design it.
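For reference, the following Python sketch builds a small query in the Elasticsearch Query DSL and prints it as the JSON text you could paste into the Query property. The field names "title" and "price" are hypothetical; the "bool", "match" and "range" clauses are standard Query DSL constructs.

```python
import json

# Hypothetical fields: match documents whose title mentions "laptop"
# and whose price is at most 1000.
query = {
    "query": {
        "bool": {
            "must": [{"match": {"title": "laptop"}}],
            "filter": [{"range": {"price": {"lte": 1000}}}],
        }
    }
}

print(json.dumps(query, indent=2))
```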

Parametrized query

The values of Search queries can be parametrized, which lets you change the query dynamically at execution.

To parametrize a query:

  1. Create a 'Value' on the search query: right-click it and choose New > Value.

  2. Fill in its properties.

  3. Use it in the query to replace a manually set value.

Example:

[Figure: Elasticsearch search query parameter]

The following properties are available:

Property: Description

Name: Label for the parametrized value, which is then used in the query.

Reverse Value: Default value to use when reversing the query.

Type: JSON Type (string, boolean, number).

Size: Size used for the target staging column when using this value as a source in a Mapping.

Scale: Number of decimals for the target staging column when using this value as a source in a Mapping.

Example:

[Figure: example of a parametrized Elasticsearch search query]

The parametrized values only work to replace values, not keys.
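The values-only restriction can be pictured with a small Python sketch. The "$name" placeholder syntax below is invented for this example and is not the syntax Semarchy xDI uses; the sketch only shows that substitution applies to JSON values while keys are left untouched.

```python
import json


def bind_parameters(node, params):
    """Return a copy of a parsed query with placeholder values replaced."""
    if isinstance(node, dict):
        # Keys are copied as-is: only values are candidates for replacement.
        return {key: bind_parameters(value, params) for key, value in node.items()}
    if isinstance(node, list):
        return [bind_parameters(item, params) for item in node]
    if isinstance(node, str) and node.startswith("$") and node[1:] in params:
        return params[node[1:]]
    return node


# Hypothetical query template and parameter name.
template = {"query": {"range": {"price": {"lte": "$max_price"}}}}
bound = bind_parameters(template, {"max_price": 500})
print(json.dumps(bound))
```

A placeholder used as a key, for example {"$field": 1}, is returned unchanged by this sketch.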

Create your first mappings

Document operations

Overview

Document types can be used as targets in Mappings to create, read, update or delete documents.

Below is a quick overview of an Elasticsearch Mapping.

[Figure: Elasticsearch Mapping overview]

To perform an operation on a document type in a Mapping, drag and drop it from your Metadata.

[Figure: Elasticsearch Mapping, step 1]

Then, map the root node from a source database, and select the operation to perform on the Template.

[Figure: Elasticsearch Mapping, step 2]

Other fields are mapped from the source to the target as usual in any Mapping.

It is mandatory to map the root node, which is the repetition key. It determines the number of times Elasticsearch is invoked.

Elasticsearch will be called once for each source row.
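The one-call-per-row behavior can be sketched in Python as follows. This is a simplified model, not the code generated by Semarchy xDI: the function name, the request layout and the "products" index are hypothetical.

```python
def build_requests(rows, index_name, operation="insert"):
    """Build one request description per source row."""
    return [
        {
            "operation": operation,
            "index": index_name,
            "id": row.get("id"),
            # Everything except the id becomes the document content.
            "document": {k: v for k, v in row.items() if k != "id"},
        }
        for row in rows
    ]


# Hypothetical source rows: three rows lead to three Elasticsearch calls.
rows = [{"id": 1, "title": "a"}, {"id": 2, "title": "b"}, {"id": 3, "title": "c"}]
requests = build_requests(rows, "products")
print(len(requests))
```

Because the number of calls grows with the number of source rows, filtering the source before the Mapping directly reduces the load on the Elasticsearch server.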

Use the result of the operation

You can retrieve and use the result of the Elasticsearch operation.

To do this, map the fields into a target datastore.

[Figure: Elasticsearch Mapping using the result of a document operation]

Work with joined documents

To work with joined Documents, map or specify a value on the routing field.

The parent and child Documents must have the same routing value to create a relation between them.

[Figure: Elasticsearch Mapping with joined documents]
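To make the routing rule concrete, here is a minimal Python sketch of a parent and a child document that share a routing value. The "id", "routing" and "document" names follow the structure described in this page; the "relation" join field and the "question" and "answer" relation names are hypothetical and depend on how the join field is declared in your index.

```python
# Hypothetical parent document.
parent = {
    "id": "1",
    "routing": "1",
    "document": {"text": "What is xDI?", "relation": {"name": "question"}},
}

# Hypothetical child document, pointing to the parent through the join field.
child = {
    "id": "2",
    "routing": "1",  # same routing value as the parent
    "document": {
        "text": "A data integration tool.",
        "relation": {"name": "answer", "parent": "1"},
    },
}


def share_routing(parent_doc, child_doc):
    """Return True when both documents carry the same routing value."""
    return parent_doc["routing"] == child_doc["routing"]


print(share_routing(parent, child))
```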

Additional notes on documents

Keep the following points in mind when performing operations on Elasticsearch:

The root Document Type node must be mapped on the Mapping.

The 'id' node, which is on the first level, must be mapped.

  • It is used by Elasticsearch as a key for update and delete operations.

  • There is one exception: the "insert" operation, for which it is not mandatory. If it is not set, Elasticsearch automatically creates a UUID.

The index or the type can be overridden on the Mapping with the dedicated fields which are on the first level. It offers the possibility to set dynamically which index or type to use during the execution of the Mapping.
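The rule about the 'id' node can be expressed as a small check. This Python sketch is illustrative only: the function name is hypothetical and the operation names are assumed to be "insert", "update" and "delete".

```python
def check_id(operation: str, doc_id=None) -> None:
    """Raise an error when an operation that needs an id is given none."""
    # Only "insert" may omit the id; Elasticsearch then generates one itself.
    if doc_id is None and operation != "insert":
        raise ValueError(f"the 'id' node must be mapped for {operation!r} operations")


check_id("insert")            # accepted: the id is optional here
check_id("update", doc_id=7)  # accepted: an id is provided
```

Calling check_id("delete") without an id raises a ValueError in this sketch.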

Query operations

In a Mapping, a query operation is used the same way as a document operation.

Drag and drop the query from the Metadata into the Mapping, and map the root node from a source database:

[Figure: Elasticsearch Mapping with a query operation]

As with document operations, you must map the root node.
If you created parametrized values for your query, you can map them from the sources or set manual values on them.

To retrieve the result of the query, map the fields into a target datastore:

[Figure: Elasticsearch Mapping using the result of a query]

Sample project

The Elasticsearch component is distributed with sample projects that contain various examples and files. Use these projects to better understand how the component works, and to get a head start on implementing it in your projects.

Refer to Install components in Semarchy xDI Designer to learn about importing sample projects.