Hadoop Component Release Notes
Ability to choose Hive Table Type
The Hive metadata now supports defining the Hive Table Type, which can be EXTERNAL or MANAGED.
The default type is EXTERNAL, which requires a specific configuration in the metadata. Update your Hive metadata to define the external storage location, or change the table type to MANAGED, depending on your configuration.
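As an illustration of how the two table types differ at the HiveQL level, the following Python sketch builds a CREATE TABLE statement for either type. It is not the code generated by the component's Templates. The function name `hive_create_table_ddl` and its parameters are hypothetical, and the rule that an EXTERNAL table must be given a location mirrors the metadata requirement described above rather than a constraint of Hive itself.

```python
def hive_create_table_ddl(name, columns, table_type="EXTERNAL", location=None):
    """Build a HiveQL CREATE TABLE statement (hypothetical helper, illustration only).

    columns is a list of (column_name, hive_type) pairs.
    """
    table_type = table_type.upper()
    if table_type not in ("EXTERNAL", "MANAGED"):
        raise ValueError(f"unsupported table type: {table_type}")
    # Assumption: like the metadata, this sketch insists on an explicit
    # storage location for EXTERNAL tables.
    if table_type == "EXTERNAL" and not location:
        raise ValueError("an EXTERNAL table needs a storage location")

    cols = ", ".join(f"{col} {hive_type}" for col, hive_type in columns)
    keyword = "CREATE EXTERNAL TABLE" if table_type == "EXTERNAL" else "CREATE TABLE"
    ddl = f"{keyword} {name} ({cols})"
    if table_type == "EXTERNAL":
        ddl += f" LOCATION '{location}'"
    return ddl


if __name__ == "__main__":
    print(hive_create_table_ddl("sales", [("id", "INT")], "EXTERNAL", "/data/sales"))
    print(hive_create_table_ddl("sales", [("id", "INT")], "MANAGED"))
```

Dropping an EXTERNAL table in Hive leaves the files at the given location in place, whereas dropping a MANAGED table also removes its data, which is why the choice depends on your configuration.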
This version contains some minor improvements and fixed issues, which can be found in the complete changelog.
This component version requires Semarchy xDI Designer version 20.4.1 or higher.
New Template Load Salesforce to Hive
A new dedicated Template to Load data from Salesforce to Hive has been added.
It provides better support and optimization than the generic Templates, and handles Hive specificities when loading data from Salesforce.
HDFS File Connect Tool
When a connection issue happens on the "HDFS File Connect Tool", a session variable named "HDFS Error" is now published to provide a better error message.
Change Data Capture (CDC)
Multiple improvements have been made to harmonize the usage of Change Data Capture (CDC) across the various Components.
Parameters have been harmonized so that all Templates now have the same CDC Parameters and support the same features.
Several CDC issues have also been fixed. Refer to the changelog for the exact list of changes.
A new tool named "Tool HDFS File Get Properties" has been added to retrieve information about files.
In this first version, it retrieves the size of a remote file.
The size is stored in the process variable "HDFS_BYTES".
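For readers who want to check a file size outside the tool, the same information is exposed by the WebHDFS REST API through the GETFILESTATUS operation, whose JSON response carries the size in bytes in the `length` field of `FileStatus`. The Python sketch below only builds the request URL and parses a response body. It is not the tool's implementation, and the helper names, the host `namenode`, and the port 9870 are assumptions for illustration.

```python
import json
from urllib.parse import quote


def webhdfs_status_url(host, port, path):
    """Build the WebHDFS URL that returns the status of a remote file."""
    # quote() keeps "/" and escapes characters such as spaces in the path.
    return f"http://{host}:{port}/webhdfs/v1{quote(path)}?op=GETFILESTATUS"


def file_size_from_status(body):
    """Extract the file size in bytes from a GETFILESTATUS JSON response."""
    return int(json.loads(body)["FileStatus"]["length"])


if __name__ == "__main__":
    # Hypothetical host, port, and path; adapt them to your cluster.
    print(webhdfs_status_url("namenode", 9870, "/data/sales.csv"))

    # Trimmed sample of a GETFILESTATUS response body.
    sample = '{"FileStatus": {"length": 24930, "type": "FILE"}}'
    print(file_size_from_status(sample))
```

The returned value corresponds to what the tool publishes in "HDFS_BYTES", assuming both refer to the same file.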
DI-5390: The metadata now supports defining the Hive Table Type, which can be EXTERNAL or MANAGED.
The default type is EXTERNAL, which requires a specific configuration in the metadata. Update your Hive metadata to define the external storage location, or change the table type to MANAGED, depending on your configuration.
See External and Managed Tables.
DI-5312: The metadata now supports hexadecimal properties.
DI-5389: The metadata reverse-engineering has been improved.
DI-5653: Log4j version 1 has been removed from the dependencies.
DI-5817: Multiple third-party libraries have been upgraded.
DI-6225: The Post Processing Operation option has been added to the Load Xml To Hive template.
Version 5.3.7 (Component Pack)
Version 3.0.0 (Component Pack)
DI-4053: Query Editor menu renamed to "Launch Query Editor"
DI-4508: Update Components and Designer to take into account dedicated license permissions
DI-4727: Rebranding: Templates and sample projects
DI-4731: Rebranding: Template messages
DI-4813: Rebranding: Drivers classes and URLs
DI-4962: Improved component dependencies and requirements management
Version 2.2.1 (Hadoop Component)
Version 2.2.0 (Hadoop Component)
Version 2.1.1 (Hadoop Component)
Version 2.1.0 (Hadoop Component)
Version 2.0.5 (Hadoop Component)
Version 2.0.4 (Hadoop Component)
Version 2.0.3 (Hadoop Component)
DI-1775: New template Load Salesforce to Hive
DI-1777: TOOL Hdfs File Connect - an "HDFS Error" session variable is now published when a connection fails, providing more detailed information about the error
DI-1910: Templates updated - New parameter 'Cdc Subscriber' on Templates on which it was not handled yet
DI-1909: Templates updated - New Parameters 'Unlock Cdc Table' and 'Lock Cdc Table' to configure the behaviour of CDC tables locking
DI-1729: HDFS File Get Process Tool - Module specification was missing when using WebHDFS mode, which was causing errors as the required dependencies could not be found
DI-1908: Templates updated - The 'Cdc Subscriber' parameter was ignored in some Templates on Lock / Unlock CDC steps
DI-1907: Templates updated - The 'Cdc Subscriber' parameter was ignored in some Templates when querying the source data