
Apache Cassandra

Apache Cassandra is an open source, distributed, wide-column NoSQL database management system designed to handle large amounts of data across many commodity servers. It provides high availability with no single point of failure.

Objects Extracted

The Apache Cassandra resource extracts metadata from the following assets in an Apache Cassandra data source:

Permissions to Configure the Resource

Configure the read permission on the Apache Cassandra data source for the user account that you use to access the data source.

Basic Information

The General tab includes the following basic information about the resource:
Name
  The name of the resource.
Description
  The description of the resource.
Resource type
  The type of the resource.
Execute On
  You can choose to execute on the default catalog server or offline.

Resource Connection Properties

The General tab includes the following properties:
Host
  Host name or IP address of the Apache Cassandra server.
KeyStore Password
  Password to access the Apache Cassandra KeyStore.
KeyStore Path
  Location of the Apache Cassandra KeyStore.
Local Datacenter
  Name of the datacenter that contains the required node.
Password
  Password of the user account to access the Apache Cassandra server.
Port
  Port number of the Apache Cassandra server. Default port number is 9042.
SSL Enabled
  Option to enable SSL.
Truststore Password
  Password to access the Apache Cassandra Truststore.
Truststore Path
  Location of the Apache Cassandra Truststore.
Username
  User name to access the Apache Cassandra server.
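
The connection properties above can be collected and sanity-checked before you enter them in the General tab. The following Python sketch is illustrative only: the class name, field names, and the `problems` method are hypothetical and are not part of Enterprise Data Catalog. It also assumes that a Truststore Path is needed whenever SSL is enabled, which this guide does not state explicitly.

```python
from dataclasses import dataclass
from typing import List, Optional

DEFAULT_PORT = 9042  # default port number given in the property list above


@dataclass
class CassandraResourceConnection:
    """Hypothetical container that mirrors the connection properties above."""

    host: str
    username: str
    password: str
    local_datacenter: str
    port: int = DEFAULT_PORT
    ssl_enabled: bool = False
    truststore_path: Optional[str] = None
    truststore_password: Optional[str] = None
    keystore_path: Optional[str] = None
    keystore_password: Optional[str] = None

    def problems(self) -> List[str]:
        """Return human-readable problems; an empty list means none were found."""
        found = []
        if not self.host.strip():
            found.append("Host is empty")
        if not 1 <= self.port <= 65535:
            found.append("Port must be between 1 and 65535")
        # Assumption (not from the guide): SSL needs a truststore location.
        if self.ssl_enabled and not self.truststore_path:
            found.append("SSL Enabled is set but Truststore Path is empty")
        return found
```

For example, `CassandraResourceConnection(host="cass01.example.com", username="edc_reader", password="...", local_datacenter="dc1")` uses port 9042 and reports no problems. The host, user, and datacenter names here are placeholders.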

You can configure the following properties in the Source Metadata section of the Metadata Load Settings tab:
Enable Source Metadata
  Enables metadata extraction.
Case Sensitive
  Specifies whether the resource is configured for case sensitivity. Select one of the following values:
  • True. Select the check box to specify that the resource is configured as case sensitive.
  • False. Clear the check box to specify that the resource is configured as case insensitive.
  The default value is True.
Keyspace
  Option to import a particular database schema by specifying a list of keyspaces using the Browse API.
Memory
  The memory value required to run a scanner job. Specify one of the following memory values:
  • Low
  • Medium
  • High
  Note: For more information about the memory values, see the Tuning Enterprise Data Catalog Performance article on the How-To Library Articles tab in the Informatica Doc Portal.
Custom Options
  JVM parameters that you can set to configure the scanner container. Use the following arguments to configure the parameters:
  • -Dscannerloglevel=<DEBUG/INFO/ERROR>. Changes the log level of the scanner to values such as DEBUG, INFO, or ERROR. Default value is INFO.
  • -Dscanner.container.core=<No. of core>. Increases the number of cores for the scanner container. The value must be a number.
  • -Dscanner.yarn.app.environment=<key=value>. Key-value pairs that you need to set in the Yarn environment. Use a comma to separate multiple key-value pairs.
  • -Dscanner.pmem.enabled.container.memory.jvm.memory.ratio=<1.0/2.0>. Increases the scanner container memory when pmem is enabled. Default value is 1.
Track Data Source Changes
  View metadata source change notifications in Enterprise Data Catalog.
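
The Custom Options arguments described above can be assembled programmatically to avoid typing mistakes. The following Python sketch is a minimal illustration: the function name and its parameters are hypothetical, and it assumes that multiple arguments in the Custom Options field are separated by a single space, which this guide does not confirm.

```python
from typing import Dict, Optional

# Only the log levels named in this guide are checked here;
# the scanner may accept others.
KNOWN_LOG_LEVELS = ("DEBUG", "INFO", "ERROR")


def build_custom_options(
    log_level: Optional[str] = None,
    container_cores: Optional[int] = None,
    yarn_env: Optional[Dict[str, str]] = None,
    pmem_ratio: Optional[float] = None,
) -> str:
    """Compose a Custom Options string from the arguments listed above."""
    options = []
    if log_level is not None:
        level = log_level.upper()
        if level not in KNOWN_LOG_LEVELS:
            raise ValueError(f"Unknown log level: {log_level}")
        options.append(f"-Dscannerloglevel={level}")
    if container_cores is not None:
        if container_cores < 1:
            raise ValueError("container_cores must be a positive number")
        options.append(f"-Dscanner.container.core={container_cores}")
    if yarn_env:
        # Multiple key-value pairs are separated by commas, as described above.
        pairs = ",".join(f"{key}={value}" for key, value in yarn_env.items())
        options.append(f"-Dscanner.yarn.app.environment={pairs}")
    if pmem_ratio is not None:
        options.append(
            f"-Dscanner.pmem.enabled.container.memory.jvm.memory.ratio={pmem_ratio}"
        )
    # Assumption: arguments are separated by a single space.
    return " ".join(options)
```

Calling `build_custom_options(log_level="debug", container_cores=2)` returns `-Dscannerloglevel=DEBUG -Dscanner.container.core=2`.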

You can enable domain discovery for a Cassandra resource. For more information, see the Enable Data Discovery and Composite Data Domain Discovery topics.

Prerequisite to Perform Domain Discovery on Cassandra Resource

Before you run profiles to perform domain discovery on a Cassandra resource, make sure that you configure the connection and Data Integration Service settings.
Connection setting:
  1. In Informatica Administrator, click the Connection tab in the Manage section.
  2. In the Domain Navigator section, select the Cassandra resource connection that you use to run profiles.
  3. In the Advanced Property section, set the SQL identifier character to "" (quotes).
  4. Click OK.
  5. In the Additional Connection Properties field, add the following value: EnableCaseSensitive=0
  6. Click OK.
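
The SQL identifier character in the steps above presumably relates to how CQL treats identifier case: Cassandra folds unquoted identifiers to lowercase and preserves the case of identifiers enclosed in double quotes. The following Python sketch illustrates that rule only. The function names are hypothetical, and the sketch does not reproduce how the Data Integration Service builds its queries.

```python
def quote_identifier(name: str) -> str:
    """Wrap a name in double quotes, doubling any embedded double quote."""
    return '"' + name.replace('"', '""') + '"'


def effective_name(identifier: str) -> str:
    """Return the name that CQL resolves an identifier to.

    Double-quoted identifiers keep their case; unquoted identifiers
    are folded to lowercase.
    """
    is_quoted = (
        len(identifier) >= 2
        and identifier.startswith('"')
        and identifier.endswith('"')
    )
    if is_quoted:
        # Strip the outer quotes and undo the doubling of embedded quotes.
        return identifier[1:-1].replace('""', '"')
    return identifier.lower()
```

With a hypothetical table named `CustomerOrders`, `effective_name("CustomerOrders")` resolves to `customerorders`, while `effective_name(quote_identifier("CustomerOrders"))` keeps `CustomerOrders`.
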
Data Integration Service setting:
  1. In Informatica Administrator, click the Services and Nodes tab in the Manage section.
  2. Go to Data Integration Service.
  3. In the Advanced Properties field, set Maximum Heap Size to 4096M.
  4. Click OK.
  5. In the Custom Properties field, add the following value: ExecutionContextOptions.JVMMaxMemory=1024M
  6. Click OK.