Apache Cassandra

Apache Cassandra is an open source, distributed, wide-column NoSQL database management system designed to handle large amounts of data across many commodity servers. It provides high availability with no single point of failure.

Objects Extracted

The Apache Cassandra resource extracts metadata from the following assets in an Apache Cassandra data source:

•Keyspaces
•Tables
•Views
•Functions
•Fields

Permissions to Configure the Resource

Configure the read permission on the Apache Cassandra data source for the user account that you use to access the data source.
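
As an illustration of what granting read access can look like on the Cassandra side, the following sketch generates CQL GRANT statements. It assumes that "read permission" corresponds to the CQL SELECT permission and that the scanner account is a Cassandra role. The role name edc_scanner and the keyspace names are hypothetical, and your cluster may require additional grants that this guide does not list.

```python
import re

# Unquoted CQL identifiers: a letter followed by letters, digits, or underscores.
_IDENTIFIER = re.compile(r"[A-Za-z][A-Za-z0-9_]*")


def grant_select_statements(role, keyspaces):
    """Return one GRANT SELECT statement per keyspace for the given role."""
    for name in [role, *keyspaces]:
        if not _IDENTIFIER.fullmatch(name):
            raise ValueError(f"not a simple CQL identifier: {name!r}")
    return [f"GRANT SELECT ON KEYSPACE {ks} TO {role};" for ks in keyspaces]


# Hypothetical role and keyspace names, for illustration only.
for statement in grant_select_statements("edc_scanner", ["sales", "inventory"]):
    print(statement)
```

A database administrator would run the printed statements in cqlsh with an account that is allowed to grant permissions.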

Basic Information

The General tab includes the following basic information about the resource:

Information	Description
Name	The name of the resource.
Description	The description of the resource.
Resource type	The type of the resource.
Execute On	You can choose to execute on the default catalog server or offline.

Resource Connection Properties

The General tab includes the following properties:

Property	Description
Host	Host name or IP address of the Apache Cassandra server.
KeyStore Password	Password to access the Apache Cassandra KeyStore.
KeyStore Path	Location of the Apache Cassandra KeyStore.
Local Datacenter	Name of the datacenter that contains the required node.
Password	Password of the user account to access the Apache Cassandra server.
Port	Port number of the Apache Cassandra server. Default port number is 9042.
SSL Enabled	Option to enable SSL.
Truststore Password	Password to access the Apache Cassandra Truststore.
Truststore Path	Location of the Apache Cassandra Truststore.
Username	User name to access the Apache Cassandra server.
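
Before you save the resource, it can help to confirm that the Host and Port values are reachable from the machine that runs the scan. The following is a minimal sketch that uses only the Python standard library. It checks that a TCP connection opens and does not verify credentials, SSL settings, or the Local Datacenter value. The function name can_reach and the host name in the comment are hypothetical.

```python
import socket

DEFAULT_PORT = 9042  # default Port value from the table above


def can_reach(host, port=DEFAULT_PORT, timeout=3.0):
    """Return True if a TCP connection to host:port opens within timeout seconds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution failures.
        return False


# Example call with a placeholder host name:
#     can_reach("cassandra.example.com")
```

A result of False usually points to a wrong host name, a blocked port, or a server that is not running, all of which are worth ruling out before you troubleshoot the resource configuration itself.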

The following table describes the properties that you can configure in the Source Metadata section of the Metadata Load Settings tab:

Property	Description
Enable Source Metadata	Enables metadata extraction.
Case Sensitive	Specifies whether the resource is configured for case sensitivity. Select one of the following values: - True. Select the check box to specify that the resource is configured as case sensitive. - False. Clear the check box to specify that the resource is configured as case insensitive. The default value is True.
Keyspace	Option to import a particular database schema by specifying a list of keyspaces using the Browse API.
Memory	The memory value required to run a scanner job. Specify one of the following memory values: - Low - Medium - High Note: For more information about the memory values, see the Tuning Enterprise Data Catalog Performance article on the How-To Library Articles tab in the Informatica Doc Portal.
Custom Options	JVM parameters that you can set to configure the scanner container. Use the following arguments to configure the parameters: - -Dscannerloglevel=<DEBUG/INFO/ERROR>. Changes the log level of the scanner to a value such as DEBUG, INFO, or ERROR. Default value is INFO. - -Dscanner.container.core=<No. of core>. Increases the number of cores for the scanner container. The value must be a number. - -Dscanner.yarn.app.environment=<key=value>. Key-value pair that you need to set in the Yarn environment. Use a comma to separate multiple key-value pairs. - -Dscanner.pmem.enabled.container.memory.jvm.memory.ratio=<1.0/2.0>. Increases the scanner container memory when pmem is enabled. Default value is 1.
Track Data Source Changes	View metadata source change notifications in Enterprise Data Catalog.
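
The Custom Options arguments are easy to mistype, so the following sketch assembles a value from the arguments described in the table. It assumes that multiple arguments are separated by a single space, which this guide does not state, and it accepts only the three log levels named in the table. The function name build_custom_options and the sample Yarn environment keys are hypothetical.

```python
ALLOWED_LOG_LEVELS = ("DEBUG", "INFO", "ERROR")


def build_custom_options(log_level="INFO", cores=None, yarn_env=None):
    """Assemble a Custom Options string from the documented -D arguments."""
    if log_level not in ALLOWED_LOG_LEVELS:
        raise ValueError(f"unsupported log level: {log_level!r}")
    options = [f"-Dscannerloglevel={log_level}"]
    if cores is not None:
        if not isinstance(cores, int) or cores < 1:
            raise ValueError("cores must be a positive integer")
        options.append(f"-Dscanner.container.core={cores}")
    if yarn_env:
        # The table says to separate multiple key-value pairs with a comma.
        pairs = ",".join(f"{key}={value}" for key, value in yarn_env.items())
        options.append(f"-Dscanner.yarn.app.environment={pairs}")
    # Assumption: arguments are separated by a single space.
    return " ".join(options)


print(build_custom_options("DEBUG", cores=2, yarn_env={"TZ": "UTC", "LANG": "C"}))
```

With the sample inputs, the script prints -Dscannerloglevel=DEBUG -Dscanner.container.core=2 -Dscanner.yarn.app.environment=TZ=UTC,LANG=C, which you can paste into the Custom Options field if the separator assumption holds in your environment.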

You can enable domain discovery for a Cassandra resource. For more information, see the Enable Data Discovery and Composite Data Domain Discovery topics.

Prerequisites to Perform Domain Discovery on a Cassandra Resource

Before you run profiles to perform domain discovery on a Cassandra resource, configure the connection and Data Integration Service settings.

Connection setting:

1. In Informatica Administrator, click the Connection tab in the Manage section.
2. In the Domain Navigator section, select the Cassandra resource connection that you use to run profiles.
3. In the Advanced Property section, set SQL identifier character to "" (quotes).
4. Click OK.
5. In the Additional Connection Properties field, add the following value: EnableCaseSensitive=0.
6. Click OK.
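
The quote setting in step 3 matters because CQL folds unquoted identifiers to lowercase and preserves the exact case of identifiers enclosed in double quotes. The following sketch illustrates that quoting rule only. It is not a description of how the profiling engine builds its queries, and the function name quote_identifier is hypothetical.

```python
def quote_identifier(name, quote_char='"'):
    """Wrap name in the quote character, doubling any embedded quote characters."""
    escaped = name.replace(quote_char, quote_char * 2)
    return f"{quote_char}{escaped}{quote_char}"


print(quote_identifier("OrderItems"))  # "OrderItems"
```

A table created as "OrderItems" can be addressed only with the quoted form, because the unquoted name OrderItems resolves to orderitems.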

Data Integration Service setting:

1. In Informatica Administrator, click the Services and Nodes tab in the Manage section.
2. Go to Data Integration Service.
3. In the Advanced Properties field, set Maximum Heap Size to 4096M.
4. Click OK.
5. In the Custom Properties field, add the following value: ExecutionContextOptions.JVMMaxMemory=1024M
6. Click OK.