Replicator Rdbms to Snowflake
Description
This tool replicates data to Snowflake, optionally using Snowpipe.
You can also use this tool to replicate a list of changes from Change Data Capture tables to a target database, for later analysis.
Parameters
Parameter | Mandatory | Description | ||
---|---|---|---|---|
Name |
True |
Name given to the tool. |
||
Drop Tables |
False |
Enabling this parameter drops the target tables before processing. Tables to be dropped are calculated from the list of replicated source tables. A drop is performed on the target for each of those tables. |
||
Create Pk |
False |
When this parameter is enabled, the replicator replicates the primary key defined on the source tables in the target tables. |
||
Drop Pk |
False |
When this parameter is enabled, the replicator drops the primary keys of target tables before processing. Primary keys to be dropped are calculated from the list of source tables replicated. A drop is performed on the target for all the corresponding primary keys defined on the source. |
||
Create Fk |
False |
When this parameter is enabled, the replicator tries to replicate foreign keys defined on source tables in target tables.
|
||
Drop Fk |
False |
When this parameter is enabled, the replicator drops the foreign keys of target tables before processing. Foreign keys to be dropped are calculated from the list of source tables replicated. A drop is performed on the target for all corresponding foreign keys defined on the source. |
||
Enable Constraints |
False |
When this parameter is enabled, the replicator tries to enable the primary foreign key constraints on target tables. This is performed at the end of the replication, i.e., after replicating the structure and data. This can be used together with the Disable Constraints parameter if you want to disable primary and foreign key constraints before processing and then re-enable them at the end. This can be useful to avoid unicity checks while loading tables. |
||
Disable Constraints |
False |
When this parameter is enabled, the replicator tries to disable primary and foreign key constraints on target tables. This is performed at the very beginning of the replication, i.e., before replicating the structure and data. This can be used together with the Enable Constraints parameter if you want to disable primary and foreign keys constraints before processing and then re-enable them at the end. This can be useful to avoid unicity checks while loading tables. |
||
Add Comments |
False |
Enables adding comments. |
||
Delete Tables |
False |
When this parameter is enabled, the replicator removes the target table data before processing. |
||
Truncate Tables |
False |
When this parameter is enabled, the replicator truncates the target tables before processing. |
||
Normalize Column Names |
False |
When this parameter is enabled, the column names are normalized. |
||
Use CDC |
False |
When this parameter is enabled, the corresponding CDC table is used as source. This requires that the CDC be set up on the source table before. |
||
Drop Snowpipes |
False |
Allows dropping existing snowpipes. |
||
Synchronize Deletions From CDC |
False |
Only applies if the Use CDC and Load Data parameters are enabled. Enable this parameter if the records deleted on the source table need to be deleted on the target table. |
||
Create Stages |
False |
Enabling this parameter creates stages before the replication. |
||
Drop Stages |
False |
Enabling this parameter drops the stages at the end of replication. |
||
Create Tables |
False |
When this parameter is set to true, the replicator tries to create the tables on the target. The table structure is retrieved from the source metadata. Note that this does not imply replicating data of the tables. When this parameter is set to for historization, the replicator creates two additionnal columns |
||
Load Data |
True |
This parameter determines whether the source data should be replicated on the target. Possible options are:
|
||
Load Method |
True |
Specifies the method to use for loading data into Snowflake. Possible options are:
|
||
Query |
True |
Path to the metadata nodes on which to perform the replication. The value to specify depends on what is the source Metadata, and the syntax to use is XPath.
It must start with The following examples illustrate some of the most commonly used types of nodes: * If the metadata dropped is a database node:
* If the metadata dropped is a schema / folder / hierarchical file node:
* If the metadata dropped is a table or file node: NOTE: |
||
Create Snowpipes |
True |
Allows initializing Snowpipes so that they can be later used by the Snowpipe load method. Possible options are:
|
||
Snowpipe Acknowledge Mode |
True |
Defines how the replicator checks the status of the files loaded through Snowpipe when using the * enabled: The replicator checks that the files have been properly ingested before proceeding. + This implies waiting and performing status checks with the Snowpipe APIs. + If at least one file has not been correctly ingested the replicator throws an error listing the files that have failed. The files that were successfully ingested are automatically deleted from the Snowflake stage. * disabled: The replicator proceeds without checking the files' ingestion status. The files that were sent to the Snowflake stages are not deleted automatically. |
||
Force Table Case |
True |
Determines how the table name case is handled by the replicator. Possible options are:
|
||
Force Column Case |
True |
Determines how the column name case is handled by the replicator. Possible options are:
|
||
Force Constraint Case |
True |
Determines how constraint name case is handled by the replicator. Possible options are: * false: Constraint name case is preserved as on the source metadata. * lower: Constraint name is forced to lowercase on the target. * upper: Constraint name is forced to uppercase on the target. |
||
Create Table Syntax |
False |
Syntax used to create table statements on the target. These are the keywords added at the beginning of the statement. They can be useful to add hints to create table statements for instance. |
||
Select Column Pattern |
False |
Offers the possibility to change the expression used for columns in the select query performed to load source data into the target. |
||
Insert Values Pattern |
False |
Only applies if the Load Method parameter is set to generic. Applies a transformation on the insert into the target table. |
||
Column Transformation Rules |
False |
Rules used to change the name of source columns on the fly based on matching expressions. If the expression matches while computing the column name, the replacement is performed. Multiple column transformation rules can be specified by separating them with a ';' character. The rules should be specified in the XPath language using the following syntax:
Changing column name based on its name
tech:name()='CUS_LAST_NAME'/'CUSTOMER_LAST_NAME'
⇒all columns named 'CUS_LAST_NAME' on the source will be called 'CUSTOMER_LAST_NAME' on the target
Changing column name based on its datatype
@TYPE_NAME = 'VARCHAR'/concat(tech:name(),'_V')
⇒ when a column of VARCHAR datatype is encountered, the '_V' suffix is added at the end of the column name
Changing multiple column names based on its datatypes
@TYPE_NAME = 'VARCHAR'/concat(tech:name(),'_V');@TYPE_NAME = 'INTEGER'/concat(tech:name(),'_I')
⇒ same as the previous example, except that we have defined two transformation rules here, separated with ";"
Adding a prefix on all columns
1=1/concat('REPL_',tech:name())
⇒ '1=1' expression will always be true so matching for all columns
|
||
Ignored Tables |
False |
Comma-separated list of tables to ignore from the replication. The names must be fully qualified table names. For example, for Microsoft SQL Server: [CATALOG].[SCHEMA].[TABLE01],[CATALOG].[SCHEMA].[TABLE02] |
||
Field Separator |
False |
ASCII field separator used to separate fields in exported temporary files. |
||
String Delimiter |
False |
ASCII character used to delimit strings in the exported file. |
||
Target Object Prefix |
False |
Defines a prefix to add to target object names. The prefix is added to table names and constraint names (PK, AK, FK, CK, Indexes). |
||
Temporary Object Prefix |
False |
Defines a prefix to be added to temporary object names. This can be used when loading data in the synchronize mode. |
||
Split Parallel Degree |
False |
Number of subsets into which the data in the source tables should be split and loaded in parallel based on the values in a column selected by expression in the Split By Pattern parameter. |
||
Split By Pattern |
False |
Pattern used to select a column that is used to divide data in the source table into subsets. [NOTE]
====
The selected column must be of numeric type. By default, it is the column that you have selected by positioning If you do not have a valid numeric column, you can try to use an expression that returns a unique numeric value for each row. ==== |
||
Transaction Type |
False |
Type of transaction. Possible values are:
|
||
Batch Size |
False |
JDBC Batch size (number of lines) used when loading data into target tables. Larger values provide better performance but consume more memory in the runtime as the data is stored in the memory before being loaded. |
||
Fetch Size |
False |
JDBC Fetch size (number of lines) used when querying source database. Larger values provide better performance but consume more memory in the runtime as the data is stored in the memory before being sent to the target. |
||
Cdc Subscriber |
False |
Subscriber that should be used by the replication process, when the Use Cdc parameter is enabled. |
||
Cdc Max Timestamp |
False |
Limits the CDC data being consumed up to a certain time. |
||
Snowpipe Report Timeout |
150000 |
When using Snowpipe, how much time to wait for an operation to finish before stopping with an error. The value is defined in milliseconds, and the default is two and a half minutes. |