Spark Component

Overview

Semarchy xDI lets you work with Spark to produce fully customized data flows.

Install the Spark Component

If the Spark Component is not yet available in {di-designer-name-full}, install it from {di-designer-name-short} using the component installation process.

Supported Features

Spark 2

Feature Description

LOAD

Data can be loaded into Spark from: HBase, HDFS, Hive, RDBMS, Vertica, Parquet, Elasticsearch

Data can also be loaded from Spark into: Hive, RDBMS, Vertica, Parquet, Elasticsearch

INTEGRATE

Data can be integrated from Spark: HDFS, Hive, RDBMS

STAGE

Spark Metadata can be used as a stage (between loading and integration) to boost Hadoop Mappings.

A Spark stage can be one of: SQL, Java