Amazon Redshift

Amazon Redshift is an Internet hosting service and data warehouse product. Amazon Redshift is part of Amazon Web Services, the cloud-computing platform offered by Amazon.

Objects Extracted

• Tables
• Views

Permissions to Configure the Resource

If you create a new user, ensure that you configure read permission on the Amazon Redshift data source for the user account.
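
As an illustration of the read permission described above, the following minimal sketch generates the SQL statements that an administrator might run for a new user. It assumes that read permission means USAGE on each schema plus SELECT on the tables and views in it. This guide does not list the exact privileges, so confirm them against your environment. The function name, the user name, and the schema name are hypothetical.

```python
from typing import List


def quote_ident(name: str) -> str:
    """Wrap an identifier in double quotes, doubling any embedded quotes."""
    return '"' + name.replace('"', '""') + '"'


def read_grant_statements(user: str, schemas: List[str]) -> List[str]:
    """Build GRANT statements that give `user` read access to each schema.

    Assumption: USAGE on the schema plus SELECT on its tables and views
    is sufficient for the scanner. The guide does not confirm this.
    """
    statements = []
    for schema in schemas:
        # USAGE allows the user to reference objects in the schema.
        statements.append(
            f"GRANT USAGE ON SCHEMA {quote_ident(schema)} TO {quote_ident(user)};"
        )
        # SELECT covers the tables and views that currently exist in the schema.
        statements.append(
            f"GRANT SELECT ON ALL TABLES IN SCHEMA {quote_ident(schema)} "
            f"TO {quote_ident(user)};"
        )
    return statements


if __name__ == "__main__":
    # Hypothetical user and schema names, for illustration only.
    for statement in read_grant_statements("edc_scanner", ["public"]):
        print(statement)
```

The sketch only prints the statements. Run them with a client of your choice as a user that is allowed to grant these privileges.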

Prerequisites

Obtain the JDBC driver file.
Update the JDBC driver file: To replace the obsolete JDBC driver jar with the latest driver jar, perform the following steps:

Basic Information

The General tab includes the following basic information about the resource:

Information	Description
Name	The name of the resource.
Description	The description of the resource.
Resource type	The type of the resource.
Execute On	You can choose to execute on the default catalog server or offline.

Resource Connection Properties

The General tab includes the following properties:

Property	Description
User	The user name used to access the database.
Password	The password associated with the user name.
Host	Host name or IP address of the Amazon Redshift service.
Port	Amazon Redshift server port number. Default is 5439.
Database	The name of the database instance.
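
To check the Host, Port, and Database values outside the catalog, it can help to see how they map onto a standard Amazon Redshift JDBC URL. The following minimal sketch assumes the common jdbc:redshift://host:port/database form. The scanner may assemble its connection differently, and the function name and the sample host are hypothetical.

```python
def redshift_jdbc_url(host: str, database: str, port: int = 5439) -> str:
    """Combine the connection properties into a Redshift JDBC URL.

    The default port mirrors the default listed in the table (5439).
    """
    if not host or not database:
        raise ValueError("host and database are required")
    if not 1 <= port <= 65535:
        raise ValueError(f"port out of range: {port}")
    return f"jdbc:redshift://{host}:{port}/{database}"


if __name__ == "__main__":
    # Hypothetical cluster endpoint and database name, for illustration only.
    print(redshift_jdbc_url(
        "examplecluster.abc123xyz789.us-west-2.redshift.amazonaws.com",
        "dev",
    ))
```

The User and Password values are not part of the URL in this sketch. Supply them separately to the JDBC client that you use for the check.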

The following image shows sample connection properties on the General tab:

The Metadata Load Settings tab includes the following properties:

Property	Description
Enable Source Metadata	Extracts metadata from the data source.
Import System Objects	Select this option to specify that the system objects must be imported.
Schema	Click Select... to specify the Amazon Redshift schemas that you want to import. You can use one of the following options from the Select Schema dialog box to import the schemas: - Select from List: Use this option to select the required schemas from a list of available schemas. - Select using regex: Provide an SQL regular expression to select schemas that match the expression.
S3 Bucket Name	Provide a valid Amazon S3 bucket name for the Amazon Redshift data source. You must provide this value if you want to enable profiling for Amazon Redshift. If you do not want to enable profiling, retain the default value. The bucket must be accessible with the access key or private key specified in the Data Integration Service (DIS) connection.
Case Sensitive	Specifies whether the resource is configured for case sensitivity. Select one of the following values: - True. Select this check box to specify that the resource is configured as case sensitive. - False. Clear this check box to specify that the resource is configured as case insensitive. The default value is False.
Memory	The memory required to run the scanner job. Select one of the following values based on the size of the imported data set: - Low - Medium - High Note: For more information about the memory values, see the Tuning Enterprise Data Catalog Performance article on the How-To Library Articles tab in the Informatica Doc Portal.
Custom Options	JVM parameters that you can set to configure the scanner container. Use the following arguments to configure the parameters: - -Dscannerloglevel=<DEBUG/INFO/ERROR>. Sets the scanner log level to DEBUG, INFO, or ERROR. The default value is INFO. - -Dscanner.container.core=<No. of core>. Increases the number of cores for the scanner container. The value must be a number. - -Dscanner.yarn.app.environment=<key=value>. Key-value pairs that you need to set in the YARN environment. Use a comma to separate the key-value pairs. - -Dscanner.pmem.enabled.container.memory.jvm.memory.ratio=<1.0/2.0>. Increases the scanner container memory when pmem is enabled. The default value is 1. - -DenableDirectRead=true. Enables direct reading of profiling information from an Amazon Redshift data source by skipping the staging phase during the resource scan. Note: Enable the option only if direct read is enabled in the Data Integration Service.
Track Data Source Changes	View metadata source change notifications in Enterprise Data Catalog.
Agent Options	Specify the Enterprise Data Catalog Agent options to run the scanner job.
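
The Custom Options arguments listed in the table can be combined in one value. The following minimal sketch assembles such a value and rejects log levels that the table does not list. It assumes that multiple arguments are separated by spaces, which this guide does not state. The function and parameter names are hypothetical.

```python
from typing import Dict, Optional

# Log levels listed for -Dscannerloglevel in the table above.
ALLOWED_LOG_LEVELS = {"DEBUG", "INFO", "ERROR"}


def build_custom_options(
    log_level: str = "INFO",
    cores: Optional[int] = None,
    yarn_env: Optional[Dict[str, str]] = None,
    direct_read: bool = False,
) -> str:
    """Assemble a Custom Options string from the documented JVM arguments.

    Assumption: arguments are joined with single spaces.
    """
    if log_level not in ALLOWED_LOG_LEVELS:
        raise ValueError(f"unsupported log level: {log_level}")
    args = [f"-Dscannerloglevel={log_level}"]
    if cores is not None:
        if cores < 1:
            raise ValueError("cores must be a positive number")
        args.append(f"-Dscanner.container.core={cores}")
    if yarn_env:
        # Key-value pairs are separated by commas, as the table describes.
        pairs = ",".join(f"{key}={value}" for key, value in yarn_env.items())
        args.append(f"-Dscanner.yarn.app.environment={pairs}")
    if direct_read:
        # Only meaningful if direct read is enabled in the Data Integration Service.
        args.append("-DenableDirectRead=true")
    return " ".join(args)


if __name__ == "__main__":
    print(build_custom_options("DEBUG", cores=2, direct_read=True))
```

Validating the log level before building the string catches a typo early instead of at scan time.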

You can enable data discovery for an Amazon Redshift resource. For more information, see the Enable Data Discovery topic.

Note: Effective in version 10.5.2.1, you can run a profile and perform data domain discovery on external tables for an Amazon Redshift resource.

You can enable composite data domain discovery for an Amazon Redshift resource. For more information, see the Composite Data Domain Discovery topic.