MongoDB
MongoDB is a cross-platform, document-based database that provides high performance, high availability, and scalability.
Objects Extracted
The MongoDB resource extracts metadata from the following assets in a MongoDB data source:
- Database
- Collections
- Fields
- Views
You can view external lineage of objects, nested objects, and all MongoDB data types except the array data type when you use the Informatica Intelligent Cloud Services, PowerCenter, and Data Engineering Integration mappings.
Note: The MongoDB resource does not display internal lineage or the relationships between views and collections.
Permissions to Configure the Resource
Configure the read permission on the MongoDB data source for the user account that you use to access the data source.
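As an illustration of what a read-only account might look like, the following minimal sketch builds a MongoDB `createUser` command document that grants the built-in `read` role on each database to be scanned. The function name, the account name `edc_scanner`, and the database names are hypothetical; how the account is actually provisioned depends on your MongoDB deployment.

```python
def build_read_only_user_command(username, password, databases):
    """Build a createUser command document that grants only the read role.

    Assumption: the built-in MongoDB role named "read" is sufficient for
    the permission described above.
    """
    return {
        "createUser": username,
        "pwd": password,
        # One read role entry per database that the resource will scan.
        "roles": [{"role": "read", "db": name} for name in databases],
    }


# Hypothetical account and database names, for illustration only.
command = build_read_only_user_command(
    "edc_scanner", "change-me", ["sales", "inventory"]
)
```

An administrator could then submit the document to the authentication database, for example with PyMongo's `Database.command` method, assuming that driver is available in the environment.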
Connect to a MongoDB Data Source Enabled for SSL
To connect to a MongoDB data source enabled for SSL, perform the following steps:
1. Download the MongoDB SSL certificates using a web browser.
Note: Make sure that you import the MongoDB Trust Services certificate in the Certificates directory.
2. Copy the certificates to the <INFA_HOME>/services/shared/security/ directory.
3. Go to the <INFA_HOME>/source/java/jre/bin directory, and then run the following keytool command to import each copied certificate as a trusted certificate into the Informatica domain keystore:
keytool -import -file <INFA_HOME>/services/shared/security/<certificate>.cer -alias <alias name> -keystore <INFA_HOME>/services/shared/security/infa_truststore.jks -storepass <Informatica domain keystore password>
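Because the command must be repeated for each copied certificate, a small script can generate the invocations. The sketch below only builds the argument lists and does not execute anything. The function name is hypothetical, and deriving the alias from the certificate file name is an assumption, not a requirement of the product.

```python
from pathlib import PurePosixPath


def build_keytool_import_commands(infa_home, certificate_files, storepass):
    """Return one keytool argument list per certificate file.

    Nothing is executed here; the caller decides how to run the commands.
    """
    security_dir = PurePosixPath(infa_home) / "services" / "shared" / "security"
    keytool = PurePosixPath(infa_home) / "source" / "java" / "jre" / "bin" / "keytool"
    commands = []
    for cert_file in certificate_files:
        # Assumption: use the file name without its extension as the alias.
        alias = PurePosixPath(cert_file).stem
        commands.append([
            str(keytool), "-import",
            "-file", str(security_dir / cert_file),
            "-alias", alias,
            "-keystore", str(security_dir / "infa_truststore.jks"),
            "-storepass", storepass,
        ])
    return commands


# Hypothetical installation path, certificate name, and password.
commands = build_keytool_import_commands(
    "/opt/informatica", ["mongo-ca.cer"], "changeit"
)
```

Each list can be passed to `subprocess.run(cmd, check=True)` on the machine that hosts the Informatica domain.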
Resource Connection Properties
The General tab includes the following properties:
Property | Description |
---|---|
Use Connection URL | Indicates whether to use a connection URL to connect to the MongoDB server. Default value is Yes. |
Connection URL | URL to connect to the MongoDB server. URL syntax for a standalone MongoDB server is mongodb://<hostname>:<port>/<database>?retryWrites=true&w=majority. URL syntax for MongoDB as a service in the cluster is mongodb+srv://<username>:<password>@<hostname>:<port>/test?retryWrites=true&w=majority. |
Host | Host name or IP address of the MongoDB server. |
Port | Port number of the MongoDB server. |
SSL Enabled | Indicates whether the MongoDB server is enabled for SSL. |
Authentication | Type of authentication to connect to the MongoDB server. Select one of the following options: None, or Username / Password. The Username / Password option enables role-based authentication. |
Username | Username of the account to connect to the MongoDB server. |
Password | Password of the account to connect to the MongoDB server. |
Authentication Database | Authentication database where the user data is defined. By default, user data is stored in the admin database. |
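
The standalone URL syntax listed for the Connection URL property can be assembled programmatically. The following is a minimal sketch: the function name and sample values are hypothetical, and the optional credentials prefix is an assumption added for illustration, since the standalone syntax above does not include one. When credentials are embedded in a MongoDB URL, reserved characters in them must be percent-encoded, which `urllib.parse.quote_plus` handles.

```python
from urllib.parse import quote_plus


def build_standalone_url(host, port, database, username=None, password=None):
    """Build a mongodb:// URL in the standalone syntax shown in the table."""
    credentials = ""
    if username is not None and password is not None:
        # Percent-encode reserved characters such as '@' and '/'.
        credentials = f"{quote_plus(username)}:{quote_plus(password)}@"
    return (
        f"mongodb://{credentials}{host}:{port}/{database}"
        "?retryWrites=true&w=majority"
    )


# Hypothetical host, database, and account values.
plain_url = build_standalone_url("db.example.com", 27017, "sales")
auth_url = build_standalone_url(
    "db.example.com", 27017, "sales", "scanner", "p@ss/word"
)
```

Encoding the password keeps a literal `@` or `/` from being read as a URL delimiter.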
The following table describes the properties that you can configure in the Source Metadata section of the Metadata Load Settings tab:
Property | Description |
---|---|
Enable Source Metadata | Enables metadata extraction. |
Database | Databases to import during the resource run. Use a semicolon to separate the database names. An empty string imports all databases. Specifying a database name imports all the collections that belong to that database. |
Number of rows to sample | Number of rows that the resource samples in a run. The default sample size is 20 rows. The maximum sample size in a resource run is 1000 rows. To identify the schema, the MongoDB resource limits the sample to 30 documents. |
Sampling Option | Specify one of the following sampling options: First N Rows, Random N Rows, or Last N Rows. |
Source Metadata Filter | Specify a combination of regular expressions and wildcards to include or exclude specific assets in the resource run. Use a semicolon to separate the wildcard patterns. |
Import system collections | Indicates whether to import the MongoDB system collections. Select True if the user has database owner privileges. |
Memory | The memory value required to run a scanner job. Specify one of the available memory values. Note: For more information about the memory values, see the Tuning Enterprise Data Catalog Performance article on the How-To Library Articles tab in the Informatica Doc Portal. |
JVM Options | JVM parameters that you can set to configure the scanner container. Use the following arguments to configure the parameters: -Dscannerloglevel=<DEBUG/INFO/ERROR> changes the log level of the scanner to DEBUG, INFO, or ERROR (default value is INFO); -Dscanner.container.core=<No. of core> increases the number of cores for the scanner container (the value must be a number); -Dscanner.yarn.app.environment=<key=value> sets key-value pairs in the Yarn environment (use a comma to separate multiple key-value pairs); -Dscanner.pmem.enabled.container.memory.jvm.memory.ratio=<1.0/2.0> increases the scanner container memory when pmem is enabled (default value is 1). |
Number of threads for execution | Number of threads required for the metadata extraction. |
Track Data Source Changes | View metadata source change notifications in Enterprise Data Catalog. |
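
The semantics of the Database property in the table above (semicolon-separated names, with an empty string meaning all databases) can be sketched as follows. This is an illustration of the documented behavior, not the product's implementation. The function name is hypothetical, and trimming whitespace around names is an assumption.

```python
def parse_database_property(value):
    """Interpret a Database property value.

    Returns a list of database names, or None when the value is empty,
    which stands for "import all databases".
    """
    # Assumption: surrounding whitespace and empty segments are ignored.
    names = [name.strip() for name in value.split(";")]
    names = [name for name in names if name]
    return names or None


selected = parse_database_property("sales;inventory")
everything = parse_database_property("")
```

Here `selected` holds the two named databases, while `everything` is `None` to signal that no restriction applies.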
Unsupported Features
- The MongoDB resource does not support indexed arrays or virtual tables.
- For heterogeneous arrays, the MongoDB resource fetches the key to infer the schema. The values of heterogeneous arrays are not used to infer the schema.