Google Cloud Storage

Google Cloud Storage is an online file storage web service that you can use to store and access data on the Google Cloud Platform.

Objects Extracted

The Google Cloud Storage resource extracts metadata from the following assets in a Google Cloud Storage data source:

Connect to a Google Cloud Storage Data Source Enabled for SSL

To connect to a Google Cloud Storage data source enabled for SSL, perform the following steps:
  1. Download the Google Cloud Storage SSL certificates using a web browser.
     Note: Make sure that you import the Google Cloud Storage Trust Services certificate in the Certificates directory.
  2. Copy the certificates to the <INFA_HOME>/services/shared/security/ directory.
  3. Go to the <INFA_HOME>/source/java/jre/bin directory, and then run the following keytool command to import each copied certificate as a trusted certificate into the Informatica domain keystore:
     keytool -import -file <INFA_HOME>/services/shared/security/<certificate>.cer -alias <alias name> -keystore <INFA_HOME>/services/shared/security/infa_truststore.jks -storepass <Informatica domain keystore password>
Note: If the proxy server used to connect to the data source is SSL enabled, you must download the proxy server certificates on the Informatica domain machine.
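
Because the keytool import must be repeated for each copied certificate, it can help to generate the commands rather than type them. The following Python sketch shows one way to do that. It only builds the argument lists and does not run keytool. The function name, the use of the certificate file name as the alias, and the example install path are assumptions made for illustration and are not part of the product.

```python
from pathlib import PurePosixPath


def build_keytool_imports(infa_home, cert_files, storepass):
    """Return one keytool argument list per certificate file name.

    Assumption: the alias is taken from the certificate file name without
    its extension. Any alias that is unique in the keystore works.
    """
    security_dir = PurePosixPath(infa_home) / "services" / "shared" / "security"
    keystore = security_dir / "infa_truststore.jks"
    commands = []
    for name in cert_files:
        cert_path = security_dir / name
        commands.append([
            "keytool", "-import",
            "-file", str(cert_path),
            "-alias", cert_path.stem,
            "-keystore", str(keystore),
            "-storepass", storepass,
        ])
    return commands


if __name__ == "__main__":
    # Hypothetical install location, certificate name, and password placeholder.
    for cmd in build_keytool_imports(
        "/opt/informatica", ["gcs_root.cer"], "<keystore password>"
    ):
        print(" ".join(cmd))
```

To run the imports instead of printing them, each list could be passed to subprocess.run(cmd, check=True) from the <INFA_HOME>/source/java/jre/bin directory.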

Basic Information

The General tab includes the following basic information about the resource:
| Information | Description |
|---|---|
| Name | The name of the resource. |
| Description | The description of the resource. |
| Resource type | The type of the resource. |
| Execute On | You can choose to execute on the default catalog server or offline. |

Resource Connection Properties

The General tab includes the following properties:
| Property | Description |
|---|---|
| Project ID | ID of the Google Cloud Platform project. |
| Private Key | Private key of the Google Cloud Platform service account. |
| Client Email | Client email ID of the Google Cloud Platform service account. |
| Bucket Name | Name of the bucket that contains the stored objects from which metadata must be extracted. |
| Source Directory | Path of the source directory for metadata extraction. End the source directory path with a slash (/). |
| Connect through a proxy server | Option to connect to the data source through a proxy server. Default is Disabled. |
| Proxy Host | Host name or IP address of the proxy server. |
| Proxy Port | Port number of the proxy server. |
| Proxy User Name | Required for authenticated proxy. Authenticated user name to connect to the proxy server. |
| Proxy Password | Required for authenticated proxy. Password for the authenticated user name. |
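
The Project ID, Private Key, and Client Email values are typically available in the JSON key file of the service account, in the project_id, private_key, and client_email fields. The Python sketch below collects the connection properties from a parsed key file before you enter them in the resource. It adds the trailing slash to the source directory and includes proxy credentials only for an authenticated proxy. The function name, the proxy dictionary keys, and the returned dictionary are hypothetical. This is not an Enterprise Data Catalog API.

```python
def connection_properties(key, bucket_name, source_directory, proxy=None):
    """Assemble the resource connection properties from a service account key.

    `key` is the parsed JSON key file of the service account. The keys of the
    returned dictionary mirror the property names in the table above and are
    for illustration only.
    """
    missing = [f for f in ("project_id", "private_key", "client_email")
               if not key.get(f)]
    if missing:
        raise ValueError("key file is missing: " + ", ".join(missing))

    # The source directory path must end with a slash.
    if source_directory and not source_directory.endswith("/"):
        source_directory += "/"

    props = {
        "Project ID": key["project_id"],
        "Private Key": key["private_key"],
        "Client Email": key["client_email"],
        "Bucket Name": bucket_name,
        "Source Directory": source_directory,
        "Connect through a proxy server": proxy is not None,
    }
    if proxy is not None:
        # Assumption: host and port are needed whenever the proxy is enabled.
        props["Proxy Host"] = proxy["host"]
        props["Proxy Port"] = int(proxy["port"])
        # User name and password apply only to an authenticated proxy.
        if proxy.get("user"):
            props["Proxy User Name"] = proxy["user"]
            props["Proxy Password"] = proxy["password"]
    return props


if __name__ == "__main__":
    # Placeholder values. A real key would come from json.load() on the key file.
    demo_key = {"project_id": "my-project", "private_key": "<private key>",
                "client_email": "scanner@my-project.iam.gserviceaccount.com"}
    for name, value in connection_properties(demo_key, "my-bucket", "landing/raw").items():
        print(f"{name}: {value}")
```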
The following table describes the properties that you can configure in the Source Metadata section of the Metadata Load Settings tab:
Property
Description
Enable Source Metadata
Enables metadata extraction from the data source.
File Types
Enables you to extract metadata from all the files or specific files.
Specific File Types
Select the file type for metadata extraction.
Enter File Delimiter
Specify the file delimiter. Enterprise Data Catalog supports the following delimiters: Comma (,), Colon (:), Semicolon (;), Tab (\t), and Pipe (|). To specify any other file delimiter, enclose it in single quotes.
Other File Types
Extracts file metadata, such as file size, file path, and timestamp, from other file types.
First Level Directory
Option to restrict the import to a first-level directory within the source directory.
An empty string indicates that only the files from the source directory are imported.
A directory name indicates that all files from the source directory, along with the files from the specified directory, are imported.
Include Subdirectory
Option to include all the directories within the selected first-level directory while extracting metadata. If the first-level directory is empty, all the directories within the source directory are included in the metadata extraction.
Case Sensitive
Specifies whether the resource is configured for case sensitivity.
Select the check box to configure the resource as case sensitive. Clear the check box to configure the resource as case insensitive.
By default, the resource is configured as case sensitive.
Memory
The memory value required to run a scanner job.
Specify one of the following memory values:
  • Low
  • Medium
  • High
Note: For more information about the memory values, see the Tuning Enterprise Data Catalog Performance article on the How-To Library Articles tab in the Informatica Doc Portal.
Custom Options
JVM parameters that you can set to configure the scanner container. Use the following arguments to configure the parameters:
  • -Dscannerloglevel=<DEBUG/INFO/ERROR>. Changes the log level of the scanner to a value such as DEBUG, INFO, or ERROR. Default value is INFO.
  • -Dscanner.container.core=<No. of core>. Increases the number of cores for the scanner container. The value must be a number.
  • -Dscanner.yarn.app.environment=<key=value>. Key-value pair that you need to set in the Yarn environment. Use a comma to separate multiple key-value pairs.
  • -Dscanner.pmem.enabled.container.memory.jvm.memory.ratio=<1.0/2.0>. Increases the scanner container memory when pmem is enabled. The default value is 1.
Track Data Source Changes
View metadata source change notifications in Enterprise Data Catalog.
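
The Custom Options property described above takes a set of -D arguments. As a rough sketch, the following Python function assembles such a value from the four documented arguments. This guide does not state how multiple arguments are separated in the Custom Options field, so the single space used here is an assumption. The function name and its parameters are hypothetical.

```python
def build_custom_options(log_level=None, cores=None, yarn_env=None, pmem_ratio=None):
    """Build a Custom Options string from the documented -D arguments.

    Every parameter is optional. Arguments that are left as None are omitted,
    so the scanner keeps its default for them.
    """
    options = []
    if log_level is not None:
        # Values such as DEBUG, INFO, or ERROR.
        options.append(f"-Dscannerloglevel={log_level}")
    if cores is not None:
        # The value must be a number.
        options.append(f"-Dscanner.container.core={int(cores)}")
    if yarn_env:
        # Multiple key=value pairs are separated by commas.
        pairs = ",".join(f"{k}={v}" for k, v in yarn_env.items())
        options.append(f"-Dscanner.yarn.app.environment={pairs}")
    if pmem_ratio is not None:
        options.append(
            "-Dscanner.pmem.enabled.container.memory.jvm.memory.ratio="
            f"{float(pmem_ratio)}"
        )
    # Assumption: arguments are separated by a single space.
    return " ".join(options)


if __name__ == "__main__":
    print(build_custom_options(log_level="DEBUG", cores=2))
```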