Before you create a catalog source, ensure that you have the information required to connect to the source system.
Perform the following tasks:
• Assign the required permissions.
• Configure a connection to the Amazon S3 source system.
• Optionally, if you want to identify pairs of similar columns and relationships between tables within a catalog source, import a relationship inference model.
Verify permissions
To extract metadata and to configure other capabilities that a catalog source might include, you need account access and permissions on the source system. The permissions required might vary depending on the capability.
Permissions to extract metadata
Ensure that you have the required permissions to enable metadata extraction.
Grant the following permissions:
• Read permission for the user account that you use to access the catalog source.
• Access permission. Required if the user account is different from the user account that you used to create the Amazon S3 catalog source.
Permissions to run data profiles
Ensure that you have the required permissions to run profiles.
Grant the following permissions:
• ListBucket. Required to view objects from Amazon S3 buckets.
• ListBucketMultipartUploads. Required to list multipart object uploads to Amazon S3 buckets that are in progress.
• GetObject. Required to read objects from Amazon S3 buckets.
• PutObject. Required to process staging data for Avro and Parquet files.
• DeleteObject. Required to delete staging data of Avro and Parquet files.
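The profiling permissions listed above correspond to Amazon S3 IAM actions. As a minimal sketch, the following Python builds a policy document that grants them. The bucket name `example-profiling-bucket` and the function name `build_profiling_policy` are assumptions made for illustration. Splitting the statements into bucket-level and object-level resources follows the general IAM convention for S3 and is not a requirement stated on this page.

```python
import json

# Bucket-level actions are granted on the bucket ARN itself.
BUCKET_ACTIONS = ["s3:ListBucket", "s3:ListBucketMultipartUploads"]
# Object-level actions are granted on the objects inside the bucket.
OBJECT_ACTIONS = ["s3:GetObject", "s3:PutObject", "s3:DeleteObject"]


def build_profiling_policy(bucket_name: str) -> dict:
    """Return an IAM policy document that grants the profiling permissions."""
    bucket_arn = f"arn:aws:s3:::{bucket_name}"
    return {
        "Version": "2012-10-17",
        "Statement": [
            {"Effect": "Allow", "Action": BUCKET_ACTIONS, "Resource": bucket_arn},
            {"Effect": "Allow", "Action": OBJECT_ACTIONS, "Resource": f"{bucket_arn}/*"},
        ],
    }


if __name__ == "__main__":
    # "example-profiling-bucket" is a hypothetical bucket name.
    print(json.dumps(build_profiling_policy("example-profiling-bucket"), indent=2))
```

You could paste the printed JSON into a policy attached to the user or role that the connection uses, after replacing the bucket name with your own.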
Permissions to run data classification
You don't need any additional permissions to run data classification.
Permissions to run relationship discovery
You don't need any additional permissions to run relationship discovery.
Permissions to run glossary association
You don't need any additional permissions to run glossary association.
Create a connection
Before you configure the Amazon S3 catalog source, create a connection object in Informatica Intelligent Cloud Services Administrator.
1. In Administrator, select Connections.
2. Click New Connection.
3. Enter the following connection details:
Property: Description
Connection Name: Name of the connection. Each connection name must be unique within the organization. Connection names can contain alphanumeric characters, spaces, and the following special characters: _ . + - Maximum length is 255 characters.
Description: Optional description of the connection. Maximum length is 4000 characters.
Type: Type of connection. Ensure that the type is Amazon S3 V2.
4. Enter properties specific to the Amazon S3 connection:
Property: Description
Runtime Environment: The name of the runtime environment where you want to run the tasks.
Access Key: The access key ID used to access the Amazon account resources. Required if you do not use AWS Identity and Access Management (IAM) authentication. Note: Ensure that you have valid AWS credentials before you create a connection.
Secret Key: The secret access key used to access the Amazon account resources. This value is associated with the access key and uniquely identifies the account. You must specify this value if you specify the access key ID. Required if you do not use AWS Identity and Access Management (IAM) authentication.
IAM Role ARN: The Amazon Resource Name (ARN) of the AWS Identity and Access Management (IAM) role assumed by the user to use the dynamically generated temporary security credentials. Enter the ARN value if you want to use the temporary security credentials to access AWS resources. Note: Even if you remove the IAM role that grants the agent access to the Amazon S3 bucket, the test connection is successful. For more information about how to get the ARN of the IAM role, see the AWS documentation.
Use EC2 Role to Assume Role: Enables the EC2 role to assume another IAM role specified in the IAM Role ARN option. By default, this property is not selected. Note: The EC2 role must have a policy attached with permissions to assume an IAM role from the same or different account.
Folder Path: Bucket name or complete folder path to the Amazon S3 objects. For tasks other than application ingestion and database ingestion tasks, don't use a slash at the end of the folder path. For example, <bucket name>/<my folder name>.
IAM Role ARN (Advanced Setting): The Amazon Resource Name of the IAM role assumed by the IAM user or the EC2 role to use temporary security credentials and access the AWS resources.
External ID (Advanced Setting): The AssumeRole request parameter to generate temporary session credentials that enable an IAM user or AWS service to assume a role. Enable the EC2 role to assume another IAM role specified in the 'IAM Role ARN' option.
S3 Account Type (Advanced Setting): The type of the Amazon S3 account. Select one of the following options:
- Amazon S3 Storage. Enables you to use the Amazon S3 services.
- S3 Compatible Storage. Enables you to use the endpoint for a third-party storage provider such as Scality RING or MinIO.
Default is Amazon S3 Storage.
REST Endpoint (Advanced Setting): The S3 storage endpoint required for S3 compatible storage. Enter the S3 storage endpoint in HTTP or HTTPS format. For example, http://s3.isv.scality.com.
5. Click Test Connection.
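Some of the constraints in the connection properties can be checked before you enter the values in Administrator. The following Python is a sketch of such a pre-check, not part of the product. The function name `check_connection_properties` and its parameters are hypothetical, and the name pattern assumes that "alphanumeric" means ASCII letters and digits.

```python
import re
from urllib.parse import urlparse

# Assumption: ASCII alphanumerics, spaces, and the special characters _ . + -
NAME_PATTERN = re.compile(r"[A-Za-z0-9 _.+\-]+")


def check_connection_properties(name, folder_path, description="", rest_endpoint=None):
    """Return a list of problems found; an empty list means none were detected."""
    problems = []
    if len(name) > 255:
        problems.append("Connection name exceeds 255 characters.")
    if not NAME_PATTERN.fullmatch(name):
        problems.append("Connection name is empty or contains unsupported characters.")
    if len(description) > 4000:
        problems.append("Description exceeds 4000 characters.")
    # Applies to tasks other than application ingestion and database ingestion.
    if folder_path.endswith("/"):
        problems.append("Folder path must not end with a slash.")
    if rest_endpoint is not None:
        parsed = urlparse(rest_endpoint)
        if parsed.scheme not in ("http", "https") or not parsed.netloc:
            problems.append("REST endpoint must be an HTTP or HTTPS URL.")
    return problems


if __name__ == "__main__":
    # The connection name and folder path below are made-up example values.
    for problem in check_connection_properties("S3 catalog/conn", "my-bucket/my-folder/"):
        print(problem)
```

The check covers only the format rules stated in the property descriptions. It does not verify credentials or bucket access, which is what Test Connection does.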
Import a relationship inference model
Import a relationship inference model if you want to configure the relationship discovery capability. You can either import a predefined relationship inference model, or import a model file from your local machine.
1. In Metadata Command Center, click Explore on the navigation panel.
2. Expand the menu and select Relationship Inference Model. The following image shows the Explore page with the Relationship Inference Model menu:
3. Select one of the following options:
- Import Predefined Content. Imports a predefined relationship inference model called Column Similarity Model v1.0.
- Import. Imports the predefined relationship inference model from your local machine. Select this option if you previously saved the predefined content to your local machine and the model file is stored there.
To import a file, click Choose File in the Import Relationship Inference Model window and navigate to the model file on your local machine. You can also drag and drop the file.
The imported models appear in the list of relationship inference models on the Relationship Discovery tab.