Amazon S3 - operation

Tools

Description

This tool is used to perform operations between Amazon S3 and another data source or target.

Usage

To use the tool in a process, you need to do the following:

  • Add the TOOL AWS - S3 tool from the Process Palette.

  • Make your buckets and data available to the tool:

    • Drag and drop an S3 bucket node onto the tool from existing Amazon S3 metadata.

    • Drag and drop a file node, table node, folder node with children items, or other data source from other metadata.

  • In the tool parameters, set the S3 operation and any other parameters.

The first S3 bucket you drag and drop onto the tool will be displayed as AMAZON_S3 by default. You can reference that display name in the parameters using an XPath expression.

For examples, refer to the Amazon S3 sample projects.

Parameters

Parameters set in the tool will override parameters with the same name in existing metadata.


Name

Label of the tool in the Designer process.

Disable Certificate Checking (Deprecated)

Disables validation of server certificates when using the HTTPS protocol.

Path-Style Access

Use a path-style URL rather than a virtual-hosted-style URL.
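
As background, the two URL styles differ only in where the bucket name appears. The sketch below uses the AWS SDK for Python (boto3) to show the same setting; the tool manages this internally, and the region is a placeholder:

    import boto3
    from botocore.config import Config

    # Virtual-hosted-style: https://my-bucket.s3.eu-west-1.amazonaws.com/key
    # Path-style:           https://s3.eu-west-1.amazonaws.com/my-bucket/key
    s3 = boto3.client(
        "s3",
        region_name="eu-west-1",                         # placeholder region
        config=Config(s3={"addressing_style": "path"}),  # the other value is "virtual"
    )

Path-style addressing is typically needed for S3-compatible storage or for bucket names that are not valid DNS names.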

Use Large Files API

Enables multipart uploads for more resilient large file uploads.
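
As background, a multipart upload splits a large file into parts that are uploaded and retried independently. A rough boto3 equivalent, with a hypothetical 64 MB threshold:

    import boto3
    from boto3.s3.transfer import TransferConfig

    s3 = boto3.client("s3")
    # Files larger than the threshold are uploaded in chunks; a failed chunk can be
    # retried without resending the whole file.
    cfg = TransferConfig(multipart_threshold=64 * 1024 * 1024,
                         multipart_chunksize=16 * 1024 * 1024)
    s3.upload_file("/tmp/big.bin", "my-bucket", "backups/big.bin", Config=cfg)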

Access Key Id

The first part of your AWS access key pair.

Secret Key

The second part of your AWS access key pair.

Bucket Name

Name of the S3 bucket to operate on.

Region

Location of the S3 data center cluster.
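
The tool builds the connection from the Access Key Id, Secret Key, Region, and Bucket Name parameters. For reference only, a minimal boto3 sketch of the same settings (all values are placeholders):

    import boto3

    s3 = boto3.client(
        "s3",
        aws_access_key_id="AKIAEXAMPLE",         # Access Key Id
        aws_secret_access_key="secret-example",  # Secret Key
        region_name="eu-west-1",                 # Region
    )
    s3.head_bucket(Bucket="my-bucket")           # Bucket Name, checked here for reachability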

Operation

Action to perform on the data. Actions have different parameter requirements; a rough AWS SDK equivalent for each action is sketched after this list:

  • createBucket uses the Bucket Name parameter.

  • deleteFile uses the S3 File Path parameter.

  • getFile uses the Local File Path and S3 File Path parameters.

  • getMultipleFiles uses the S3 Directory Path, Local Directory Path, S3 Object Excludes, and S3 Object Includes parameters.

  • putFile uses the Local File Path and S3 File Path parameters. Optionally uses the Use Large Files API parameter.

  • putMultipleFiles uses the Local Directory Path, Local Object Excludes, and Local Object Includes parameters. Optionally uses the Use Large Files API parameter.
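
As mentioned above, the tool runs these operations itself. Purely as an illustration, the single-object actions correspond roughly to the following calls in the AWS SDK for Python (boto3); bucket, key, and file names are placeholders:

    import boto3

    s3 = boto3.client("s3")

    # createBucket -> Bucket Name
    s3.create_bucket(Bucket="my-bucket")  # outside us-east-1, a LocationConstraint is also required

    # deleteFile -> S3 File Path
    s3.delete_object(Bucket="my-bucket", Key="incoming/data.csv")

    # getFile -> S3 File Path + Local File Path
    s3.download_file("my-bucket", "incoming/data.csv", "/tmp/data.csv")

    # putFile -> Local File Path + S3 File Path
    s3.upload_file("/tmp/data.csv", "my-bucket", "incoming/data.csv")

getMultipleFiles and putMultipleFiles additionally apply the directory path and Includes/Excludes masks described below.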

Local File Path

Path of the local file to use in getFile and putFile operations.

Local Directory Path

Name of the local source directory to use in putMultipleFiles operations. You can omit this parameter if you want the source directory to be the root of the source.
For performance reasons, it is recommended to use this parameter to specify static subdirectories, rather than putting those in the Excludes or Includes parameters, as illustrated by the example and the sketch below.

Better performance:
	'Local Directory Path' -> tmp
	'Local Object Includes' -> *.txt
Worse performance:
	'Local Directory Path' -> <empty>
	'Local Object Includes' -> tmp/*.txt
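
This is not how the tool is implemented; the principle is simply that a static subdirectory narrows the scan before any mask is evaluated, whereas a mask alone forces the whole source tree to be listed and filtered. A Python sketch of the difference (the source root is a placeholder, and pathlib's matching rules only approximate the masks described below):

    from pathlib import Path

    root = Path("/data/source")  # placeholder source root

    # Better: only the static subdirectory is scanned, then the *.txt mask applies.
    better = list((root / "tmp").glob("*.txt"))

    # Worse: every file under the root is listed, then filtered against the mask.
    worse = [p for p in root.rglob("*")
             if p.is_file() and p.match("tmp/*.txt")]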

Local Object Excludes

Optional list of local objects to exclude in putMultipleFiles operations, formatted as a semicolon-separated list of object masks. When empty, no files are excluded.
When the source is a directory, or if you set the Local Directory Path parameter, the object masks are evaluated inside that directory.

You can use wildcards to affect multiple files at once, as follows (a small matching sketch follows the examples below):

  • The ? wildcard will match exactly one character in a segment of the path to the object.

  • The * wildcard will match zero or more characters in a segment of the path to the object.

  • The ** wildcard will match zero or more segments of the path to the object.

Examples:

  • Ignore XML and JSON objects in the current directory: *.xml;*.json

  • Ignore XML objects in any test subdirectory: **/test/*.xml
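
These masks are not standard glob patterns in every language. As a sketch only, the rules above could be translated into a regular expression as follows; the helper name is hypothetical and is not part of the tool. A semicolon-separated list would simply be split on ';' and each mask tested in turn.

    import re

    def mask_to_regex(mask: str) -> re.Pattern:
        """Translate a ?/*/** mask into a regex: ? and * stay inside one
        path segment, while ** may span several segments."""
        parts = []
        i = 0
        while i < len(mask):
            if mask.startswith("**", i):
                parts.append(".*")      # ** crosses '/' boundaries
                i += 2
            elif mask[i] == "*":
                parts.append("[^/]*")   # * stays within one segment
                i += 1
            elif mask[i] == "?":
                parts.append("[^/]")    # ? matches exactly one character, never '/'
                i += 1
            else:
                parts.append(re.escape(mask[i]))
                i += 1
        return re.compile("^" + "".join(parts) + "$")

    print(bool(mask_to_regex("*.xml").match("report.xml")))                   # True
    print(bool(mask_to_regex("*.xml").match("test/report.xml")))              # False
    print(bool(mask_to_regex("**/test/*.xml").match("a/b/test/report.xml")))  # True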

Local Object Includes

Optional list of local objects to use in putMultipleFiles operations, formatted as a semicolon-separated list of object masks. When empty, all files are matched.
When the source is a directory, or if you set the Local Directory Path parameter, the object masks are evaluated inside that directory.

You can use wildcards to affect multiple files at once, as follows:

  • The ? wildcard will match exactly one character in a segment of the path to the object.

  • The * wildcard will match zero or more characters in a segment of the path to the object.

  • The ** wildcard will match zero or more segments of the path to the object.

Examples:

  • Include XML and JSON objects in the current directory: *.xml;*.json

  • Include XML objects in any test subdirectory: **/test/*.xml

S3 File Path

Path to an object in an S3 bucket. Used with deleteFile, getFile, and putFile operations.

  • The ? wildcard will match exactly one character in a segment of the path to the object.

  • The * wildcard will match zero or more characters in a segment of the path to the object.

  • The ** wildcard will match zero or more segments of the path to the object.

S3 Directory Path

Name of the S3 directory to use in getMultipleFiles operations. You can omit this parameter if you want the source directory to be the root of the bucket.
For performance reasons, it is recommended to use this parameter to specify static subdirectories, rather than putting those in the Excludes or Includes parameters, as illustrated by the example and the sketch below.

Better performance:
	'S3 Directory Path' -> tmp
	'S3 Object Includes' -> *.txt
Worse performance:
	'S3 Directory Path' -> <empty>
	'S3 Object Includes' -> tmp/*.txt
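
Listing with a key prefix is handled server-side by S3, whereas Includes/Excludes masks can only be applied after listing. A rough boto3 illustration, using the first page of results only and a placeholder bucket name (fnmatch only approximates the masks described above, since its * also crosses '/' separators):

    import fnmatch

    import boto3

    s3 = boto3.client("s3")

    # Better: the tmp/ prefix is applied by S3 itself, so only matching keys are returned.
    page = s3.list_objects_v2(Bucket="my-bucket", Prefix="tmp/")
    keys = [o["Key"] for o in page.get("Contents", [])
            if fnmatch.fnmatch(o["Key"].rsplit("/", 1)[-1], "*.txt")]

    # Worse: without a prefix, the whole bucket is listed and filtered client-side.
    page = s3.list_objects_v2(Bucket="my-bucket")
    keys = [o["Key"] for o in page.get("Contents", [])
            if fnmatch.fnmatch(o["Key"], "tmp/*.txt")]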

S3 Object Excludes

Optional list of remote objects to exclude in getMultipleFiles operations, formatted as a semicolon-separated list of object masks. When empty, no files are excluded.
When the source is a directory, or if you set the S3 Directory Path parameter, the object masks are evaluated inside that directory.

You can use wildcards to affect multiple objects at once, as follows:

  • The ? wildcard will match exactly one character in a segment of the path to the object.

  • The * wildcard will match zero or more characters in a segment of the path to the object.

  • The ** wildcard will match zero or more segments of the path to the object.

Examples:

  • Ignore XML and JSON objects in the current directory: *.xml;*.json

  • Ignore XML objects in any test subdirectory: **/test/*.xml

S3 Object Includes

Optional list of remote objects to retrieve in getMultipleFiles operations, formatted as a semicolon-separated list of object masks. When empty, all files are matched.
When the source is a directory, or if you set the S3 Directory Path parameter, the object masks are evaluated inside that directory.

You can use wildcards to affect multiple objects at once, as follows:

  • The ? wildcard will match exactly one character in a segment of the path to the object.

  • The * wildcard will match zero or more characters in a segment of the path to the object.

  • The ** wildcard will match zero or more segments of the path to the object.

Examples:

  • Retrieve XML and JSON objects in the current directory: *.xml;*.json

  • Retrieve XML objects in any test subdirectory: **/test/*.xml

Proxy Host

Hostname of the proxy server to use when connecting to Amazon S3.

Proxy Port

Port of the proxy server to use when connecting to Amazon S3.

Proxy User Name

User name used to authenticate against the proxy server.

Proxy Uncrypted Password

Unencrypted (clear text) password used to authenticate against the proxy server.
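
For reference, these settings correspond to a standard HTTP(S) proxy configuration. In boto3 terms, with placeholder host, port, and credentials:

    import boto3
    from botocore.config import Config

    # Proxy Host / Proxy Port / Proxy User Name / Proxy Uncrypted Password
    proxy_url = "http://proxy_user:proxy_password@proxy.example.com:3128"
    s3 = boto3.client("s3", config=Config(proxies={"https": proxy_url}))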

S3 Base Url

Explicitly specify your S3 top-level URL to keep it consistent between environments.
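
In boto3, this corresponds to pinning the endpoint URL instead of letting the SDK derive it from the region; the URL below is a placeholder:

    import boto3

    # An explicit base URL keeps the endpoint identical across environments,
    # or points the client at an S3-compatible service.
    s3 = boto3.client("s3", endpoint_url="https://s3.eu-west-1.amazonaws.com")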

Authentication with Assume Role

Defines if the authentication process should use the AssumeRole API.

This option is disabled by default; where applicable, the values false and true are generally equivalent to "disabled" and "enabled". When it is enabled, the External ID, Amazon Resource Name, Session Name, and Session Duration parameters below are passed to the AssumeRole call.

External ID

Value of the ExternalId parameter passed to the AssumeRole API call.

Amazon Resource Name

ARN of the role to assume, passed as the RoleArn parameter of the AssumeRole API call.

Session Name

Value of the RoleSessionName parameter passed to the AssumeRole API call.

Session Duration

Value of the DurationSeconds parameter passed to the AssumeRole API call (duration of the temporary session, in seconds).
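
Together, these parameters are passed to the STS AssumeRole API, which returns temporary credentials used for the subsequent S3 calls. A boto3 sketch with placeholder values:

    import boto3

    sts = boto3.client("sts")
    resp = sts.assume_role(
        RoleArn="arn:aws:iam::123456789012:role/example-role",  # Amazon Resource Name
        RoleSessionName="xdi-s3-session",                       # Session Name
        ExternalId="example-external-id",                       # External ID
        DurationSeconds=3600,                                   # Session Duration
    )
    creds = resp["Credentials"]
    s3 = boto3.client(
        "s3",
        aws_access_key_id=creds["AccessKeyId"],
        aws_secret_access_key=creds["SecretAccessKey"],
        aws_session_token=creds["SessionToken"],
    )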

The Disable Certificate Checking feature should only be used for quick tests against endpoints that do not yet have valid certificates. It should never be used in production.

The feature has been removed from the Amazon SDK as of V2. It is deprecated in xDI Designer, and will be removed in a future release.