MongoDB

MongoDB is a cross-platform, document-based database that provides high performance, high availability, and scalability.

Objects Extracted

The MongoDB resource extracts metadata from the following assets in a MongoDB data source:
You can view external lineage of objects, nested objects, and all MongoDB data types except the array data type when you use Informatica Intelligent Cloud Services, PowerCenter, and Data Engineering Integration mappings.
Note: The MongoDB resource does not display internal lineage or the relationships between views and collections.

Permissions to Configure the Resource

Configure the read permission on the MongoDB data source for the user account that you use to access the data source.

Connect to a MongoDB Data Source Enabled for SSL

To connect to a MongoDB data source enabled for SSL, perform the following steps:
  1. Download the MongoDB SSL certificates using a web browser.
     Note: Make sure that you import the MongoDB Trust Services certificate in the Certificates directory.
  2. Copy the certificates to the <INFA_HOME>/services/shared/security/ directory.
  3. Go to the <INFA_HOME>/source/java/jre/bin directory, and then run the following keytool command to import each copied certificate as a trusted certificate into the Informatica domain keystore:
     keytool -import -file <INFA_HOME>/services/shared/security/<certificate>.cer -alias <alias name> -keystore <INFA_HOME>/services/shared/security/infa_truststore.jks -storepass <Informatica domain keystore password>
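
Because the keytool command is run once for each copied certificate, the invocations can be generated from a list of file names. The following Python sketch only assembles the argument lists from the options shown in the command above; it does not run keytool. The function name, the example path and password, and the choice to derive the alias from the file name are assumptions made for illustration.

```python
from pathlib import PurePosixPath


def build_keytool_commands(infa_home, certificate_files, storepass):
    """Return one keytool argument list per certificate file.

    Only the options shown in the guide's command are used. Deriving the
    alias from the file name is an assumption of this sketch.
    """
    security_dir = PurePosixPath(infa_home) / "services" / "shared" / "security"
    keystore = security_dir / "infa_truststore.jks"
    commands = []
    for name in certificate_files:
        cert_path = security_dir / name
        commands.append([
            "keytool", "-import",
            "-file", str(cert_path),
            "-alias", cert_path.stem,  # hypothetical alias: file name without extension
            "-keystore", str(keystore),
            "-storepass", storepass,
        ])
    return commands


if __name__ == "__main__":
    # Placeholder values, not real paths or passwords.
    for cmd in build_keytool_commands("/opt/informatica", ["mongo_ca.cer"], "changeit"):
        print(" ".join(cmd))
```

Each returned list could then be run from the <INFA_HOME>/source/java/jre/bin directory, for example with subprocess.run, once the aliases have been reviewed.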

Resource Connection Properties

The General tab includes the following properties:
Property
    Description
Use Connection URL
    Indicates whether to use a connection URL to connect to the MongoDB server. Default value is Yes.
Connection URL
    URL to connect to the MongoDB server.
    URL syntax for a standalone MongoDB server is mongodb://<hostname>:<port>/<database>?retryWrites=true&w=majority
    URL syntax for MongoDB as a service in the cluster is mongodb+srv://<username>:<password>@<hostname>:<port>/test?retryWrites=true&w=majority
Host
    Host name or IP address of the MongoDB server.
Port
    Port number of the MongoDB server.
SSL Enabled
    Indicates whether the MongoDB server is enabled for SSL.
Authentication
    Type of authentication to connect to the MongoDB server. Select one of the following options:
      • None.
      • Username / Password. Enables role-based authentication.
Username
    Username of the account to connect to the MongoDB server.
Password
    Password of the account to connect to the MongoDB server.
Authentication Database
    Authentication database where the user data is defined. By default, user data is stored in the admin database.
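
As an illustration of the standalone URL syntax described for the Connection URL property, the following Python sketch fills in the <hostname>, <port>, and <database> placeholders. The function name and the validation rules are assumptions of this sketch, and the host and database names in the example are placeholders.

```python
def build_standalone_url(hostname, port, database):
    """Build a URL that follows the standalone syntax shown in the guide:
    mongodb://<hostname>:<port>/<database>?retryWrites=true&w=majority
    """
    port = int(port)
    # Basic sanity checks; these are assumptions of the sketch, not scanner rules.
    if not 1 <= port <= 65535:
        raise ValueError(f"port out of range: {port}")
    if not hostname or not database:
        raise ValueError("hostname and database must be non-empty")
    return f"mongodb://{hostname}:{port}/{database}?retryWrites=true&w=majority"


if __name__ == "__main__":
    # 27017 is the default MongoDB port; the other values are placeholders.
    print(build_standalone_url("mongo.example.com", 27017, "sales"))
```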
The following table describes the properties that you can configure in the Source Metadata section of the Metadata Load Settings tab:
Property
    Description
Enable Source Metadata
    Enables metadata extraction.
Database
    Databases to import for the resource run. Use a semicolon to separate the database names. An empty string imports all databases. Specifying a database name imports all the collections that belong to that database.
Number of rows to sample
    Number of rows that the resource samples. By default, the sample size is 20 rows. The maximum sample size in a resource run is 1000 rows.
    To identify the schema, the MongoDB resource limits the sampling size to 30 documents.
Sampling Option
    Specify one of the following sampling options:
      • First N Rows
      • Random N Rows
      • Last N Rows
Source Metadata Filter
    Specify a combination of regular expressions and wildcards to include or exclude specific assets in the resource run. Use a semicolon to separate the wildcard patterns.
Import system collections
    Indicates whether to import the MongoDB system collections. Select True if the user has the database owner privileges.
Memory
    The memory value required to run a scanner job.
    Specify one of the following memory values:
      • Low
      • Medium
      • High
    Note: For more information about the memory values, see the Tuning Enterprise Data Catalog Performance article on the How-To Library Articles tab in the Informatica Doc Portal.
JVM Options
    JVM parameters that you can set to configure the scanner container. Use the following arguments to configure the parameters:
      • -Dscannerloglevel=<DEBUG/INFO/ERROR>. Changes the scanner log level to a value such as DEBUG, INFO, or ERROR. Default value is INFO.
      • -Dscanner.container.core=<No. of core>. Increases the number of cores for the scanner container. The value must be a number.
      • -Dscanner.yarn.app.environment=<key=value>. Key-value pairs that you need to set in the Yarn environment. Use a comma to separate multiple key-value pairs.
      • -Dscanner.pmem.enabled.container.memory.jvm.memory.ratio=<1.0/2.0>. Increases the scanner container memory when pmem is enabled. Default value is 1.
Number of threads for execution
    Number of threads to use for metadata extraction.
Track Data Source Changes
    View metadata source change notifications in Enterprise Data Catalog.
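
The separator rules for the Database and JVM Options properties can be illustrated with a short Python sketch. It is not part of the product. The function names are hypothetical. Trimming whitespace around database names, rejecting log levels other than the three listed, and joining the JVM arguments with spaces are all assumptions of this sketch.

```python
def parse_database_setting(value):
    """Split the Database property value on semicolons.

    Returns None for an empty string, which the guide says imports all
    databases. Trimming whitespace is an assumption of this sketch.
    """
    names = [part.strip() for part in value.split(";") if part.strip()]
    return names or None


def build_jvm_options(log_level="INFO", cores=None, yarn_env=None):
    """Compose a JVM Options string from arguments listed in the guide."""
    # Only the three levels named in the guide are accepted here (assumption).
    if log_level not in ("DEBUG", "INFO", "ERROR"):
        raise ValueError(f"unsupported log level: {log_level}")
    options = [f"-Dscannerloglevel={log_level}"]
    if cores is not None:
        # The guide states that the core value must be a number.
        options.append(f"-Dscanner.container.core={int(cores)}")
    if yarn_env:
        # Multiple key-value pairs are separated with commas.
        pairs = ",".join(f"{key}={val}" for key, val in yarn_env.items())
        options.append(f"-Dscanner.yarn.app.environment={pairs}")
    # Joining the arguments with spaces is an assumption of this sketch.
    return " ".join(options)


if __name__ == "__main__":
    print(parse_database_setting("sales;hr"))
    print(build_jvm_options("DEBUG", cores=2, yarn_env={"TZ": "UTC"}))
```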

Unsupported Features