Hadoop Component Release Notes

This page lists the main features added to the Hadoop Component.

Feature Highlights

Version 2023.1.0

Ability to choose Hive Table Type

The Hive metadata now supports defining the Hive Table Type, which can be EXTERNAL or MANAGED.

The default type is EXTERNAL, which requires a specific configuration in the metadata. Update your Hive metadata to define the external storage location, or change the table type to MANAGED, depending on your configuration.
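To illustrate why the EXTERNAL type needs a storage location while MANAGED does not, here is a minimal sketch of how the two types differ in standard Hive DDL. It is not the Component's actual code: the function name hive_create_table_ddl, its parameters, and the sample table and path are hypothetical, and treating a missing location as an error is an assumption that mirrors the requirement described above.

```python
def hive_create_table_ddl(table, columns, table_type="EXTERNAL", location=None):
    """Build a Hive CREATE TABLE statement (illustrative helper, hypothetical name)."""
    table_type = table_type.upper()
    if table_type not in ("EXTERNAL", "MANAGED"):
        raise ValueError(f"Unsupported table type: {table_type}")
    # Assumption: an EXTERNAL table must come with an explicit storage location,
    # as the release note asks for in the metadata.
    if table_type == "EXTERNAL" and not location:
        raise ValueError("An EXTERNAL table needs a storage location")

    column_list = ", ".join(f"`{name}` {hive_type}" for name, hive_type in columns)
    # Hive marks external tables with the EXTERNAL keyword; managed tables omit it.
    keyword = "EXTERNAL " if table_type == "EXTERNAL" else ""
    ddl = f"CREATE {keyword}TABLE `{table}` ({column_list})"
    if table_type == "EXTERNAL":
        # The data stays at this location and is not deleted when the table is dropped.
        ddl += f" LOCATION '{location}'"
    return ddl


if __name__ == "__main__":
    cols = [("id", "INT"), ("label", "STRING")]
    # Sample table name and HDFS path are placeholders.
    print(hive_create_table_ddl("sales", cols, "EXTERNAL", "hdfs:///data/sales"))
    print(hive_create_table_ddl("sales", cols, "MANAGED"))
```

With the default EXTERNAL type, omitting the location raises an error in this sketch, which corresponds to the two options given above: define the external storage location, or switch the table type to MANAGED.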

Change Log

Version 2023.1.10

Bug Fixes

  • DI-8922: Template Load XML to Hive: fixed an error when string data contains special characters.

  • DI-9323: Template Load XML to Hive: fixed an issue that caused mappings to crash or output incorrect results.

Version 2023.1.9

Bug Fixes

  • DI-8555: TOOL HDFS File Connect: variable value is not initialized correctly in BeanShell scripts.

Version 2023.1.5

Bug Fixes

  • DI-8175: When using HBase in an Sql to Parameters Process Action, the connection unexpectedly fails.

  • DI-7745: Third-party library upgrade.

  • DI-7797: Third-party library upgrade.

  • DI-7944: Third-party library upgrade.

Version 2023.1.4

New Features

  • DI-7915: Add tooltips to Hive External Table properties.

Version 2023.1.3

Bug Fix

  • DI-6521: Jackson third-party library upgrade.

Version 2023.1.0

Breaking Changes

  • DI-5390: The metadata now supports defining the Hive Table Type, which can be EXTERNAL or MANAGED.
    The default type is EXTERNAL, which requires a specific configuration in the metadata. Update your Hive metadata to define the external storage location, or change the table type to MANAGED, depending on your configuration.
    See External and Managed Tables.

New Features

  • DI-5312: The metadata now supports hexadecimal properties.

  • DI-5389: The metadata reverse-engineering has been improved.

  • DI-5653: Log4j version 1 has been removed from the dependencies.

  • DI-5817: Multiple third-party library upgrades.

  • DI-6225: The Post Processing Operation option has been added to the Load Xml To Hive template.

Bug Fixes

  • DI-4000: The specific Hive SerDe is missing from the Hadoop module.

  • DI-6077: Hadoop tools are missing from the process palette.

  • DI-6234: Some built-in templates cannot be saved after being modified and a NullPointerException is thrown.

Version 5.3.7 (Component Pack)

Bug Fixes

  • DI-6077: Hadoop tools are missing from the process palette.

Version 3.0.0 (Component Pack)

New Features

  • DI-4053: Query Editor menu renamed to "Launch Query Editor"

  • DI-4508: Update Components and Designer to take into account dedicated license permissions

  • DI-4727: Rebranding: Templates and sample projects

  • DI-4731: Rebranding: Template messages

  • DI-4813: Rebranding: Drivers classes and URLs

  • DI-4962: Improved component dependencies and requirements management

Version 2.2.1 (Hadoop Component)

Bug Fixes

  • DI-4559: Hive - table and column names were unexpectedly truncated to 30 characters

Version 2.2.0 (Hadoop Component)

New Features

  • DI-3713: Internal change on how some libraries are built to ease maintenance

Version 2.1.1 (Hadoop Component)

New Features

  • DI-3959: Component updated to support the replacement of SQL Explorer with the Stambia Query Editor

Version 2.1.0 (Hadoop Component)

New Features

  • DI-3510: EMF Compare utility - the Component has been updated to support the EMF Compare comparison utility

Version 2.0.5 (Hadoop Component)

New Features

  • DI-3614: New TOOL "HDFS Get File List", which retrieves a list of HDFS files and stores the result in a table

Version 2.0.4 (Hadoop Component)

Bug Fixes

  • DI-2736: Template - LOAD Rdbms to Hive - generated temporary file names may unexpectedly contain object delimiters

  • DI-2737: Template - LOAD Rdbms to Impala - generated temporary file names may unexpectedly contain object delimiters

Version 2.0.3 (Hadoop Component)

New Features

  • DI-1775: New template Load Salesforce to Hive

  • DI-1777: TOOL Hdfs File Connect - an "HDFS Error" session variable is now published when a connection fails, providing more detailed information about the error

  • DI-1910: Templates updated - New parameter 'Cdc Subscriber' added to the Templates that did not yet handle it

  • DI-1909: Templates updated - New parameters 'Unlock Cdc Table' and 'Lock Cdc Table' to configure the locking behaviour of CDC tables

Bug Fixes

  • DI-1729: HDFS File Get Process Tool - The module specification was missing when using WebHDFS mode, which caused errors because the required dependencies could not be found

  • DI-1908: Templates updated - The 'Cdc Subscriber' parameter was ignored in some Templates on Lock / Unlock CDC steps

  • DI-1907: Templates updated - The 'Cdc Subscriber' parameter was ignored in some Templates when querying the source data