Custom Lineage

You can create a custom lineage resource to view data lineage information for the assets in your organization. A custom lineage resource reads lineage data for your enterprise from CSV files that you provide. Use this option if you do not have an ETL tool that Enterprise Data Catalog supports.

Basic Information

The General tab includes the following basic information about the resource:
Name: The name of the resource.
Description: The description of the resource.
Resource type: The type of the resource.
Execute On: You can choose to execute on the default catalog server or offline.

Resource Connection Properties

The following tables list the properties that you must configure to add a custom lineage resource:
The General tab includes the following properties:
File: The CSV file or the .zip file that includes the CSV files with the lineage data. Click Choose to select the required CSV file or .zip file that you want to upload. Ensure that the CSV files in the .zip file are not stored in a directory within the .zip file.
If you want to select multiple CSV files, you must include the required CSV files in a .zip file and then select the .zip file for upload.
Note: Make sure that the CSV file includes the following parameters in the header:
  • Association
  • From Connection
  • To Connection
  • From Object
  • To Object
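The header requirement in the note above can be checked before upload. The following is a minimal sketch, not part of the product: the function name missing_header_columns is hypothetical, and the check assumes that column names must match exactly (case-sensitive, in any order), which this guide does not state. The data row in the sample holds placeholders only, because the value formats are not described in this section.

```python
import csv
import io

# Header columns named in the note above.
REQUIRED_COLUMNS = [
    "Association",
    "From Connection",
    "To Connection",
    "From Object",
    "To Object",
]


def missing_header_columns(csv_text):
    """Return the required column names that are absent from the first row."""
    reader = csv.reader(io.StringIO(csv_text))
    header = next(reader, [])  # an empty file yields an empty header
    present = {name.strip() for name in header}
    # Assumption: exact, case-sensitive name match; column order is ignored.
    return [name for name in REQUIRED_COLUMNS if name not in present]


sample = (
    "Association,From Connection,To Connection,From Object,To Object\n"
    # Placeholder row: real value formats are not specified in this section.
    "<association>,<from connection>,<to connection>,<from object>,<to object>\n"
)

print(missing_header_columns(sample))
```

An empty list means that all five columns are present. Any names in the returned list are missing from the header.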
The following image shows sample connection properties on the General tab:
[Image: connection properties for a Custom Lineage resource]
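Because the CSV files in the .zip file must not sit inside a directory, it can help to build the archive so that every file lands at the top level. The following is a minimal sketch that uses the Python standard library. The function names zip_csv_files_flat and entries_in_directories are hypothetical helpers, not part of Enterprise Data Catalog.

```python
import zipfile
from pathlib import Path


def zip_csv_files_flat(csv_paths, zip_path):
    """Write each CSV file at the top level of the archive."""
    seen = set()
    with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED) as archive:
        for path in map(Path, csv_paths):
            # Two files with the same name would collide once directories are dropped.
            if path.name in seen:
                raise ValueError(f"duplicate file name in archive: {path.name}")
            seen.add(path.name)
            # arcname=path.name drops the directory part of the source path.
            archive.write(path, arcname=path.name)
    return zip_path


def entries_in_directories(zip_path):
    """Return the archive entries that are stored inside a directory."""
    with zipfile.ZipFile(zip_path) as archive:
        return [name for name in archive.namelist() if "/" in name]
```

Calling entries_in_directories on an existing .zip file returns an empty list when every entry is at the top level.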
The Metadata Load Settings tab includes the following properties:
Enable Source Metadata: Select to extract metadata from the data source.
Auto Assign Connections: Select to assign connections automatically.
Memory: Specifies the memory required to run the scanner job. Select one of the following values based on the size of the imported data set:
  • Low
  • Medium
  • High
For more information about the memory values, see the Tuning Enterprise Data Catalog Performance article on the How-To Library Articles tab in the Informatica Doc Portal.
Custom Options: JVM parameters that you can set to configure the scanner container. Use the following arguments to configure the parameters:
  • -Dscannerloglevel=<DEBUG/INFO/ERROR>. Changes the scanner log level to a value such as DEBUG, ERROR, or INFO. Default value is INFO.
  • -Dscanner.container.core=<No. of core>. Increases the number of cores for the scanner container. The value must be a number.
  • -Dscanner.yarn.app.environment=<key=value>. Key-value pairs that you need to set in the YARN environment. Use a comma to separate the key-value pairs.
  • -Dscanner.pmem.enabled.container.memory.jvm.memory.ratio=<1.0/2.0>. Increases the scanner container memory when pmem is enabled. Default value is 1.
  • -DcustomScannerMaxFileSize=<file size>. Overrides the restriction on the lineage.csv file size.
    Note: If you specify a very large file size, you might encounter a Java out-of-memory error.
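As an illustration of the argument syntax above, the following minimal sketch assembles a Custom Options value from a log level and a core count. The function build_custom_options is a hypothetical helper. Joining several arguments with a space is an assumption based on how JVM arguments are usually written; this guide does not state the separator. The sketch also accepts only the three log levels named above.

```python
# Log levels named in the -Dscannerloglevel description above.
ALLOWED_LOG_LEVELS = {"DEBUG", "INFO", "ERROR"}


def build_custom_options(log_level=None, cores=None):
    """Build a Custom Options string from the values that are set."""
    options = []
    if log_level is not None:
        if log_level not in ALLOWED_LOG_LEVELS:
            raise ValueError(f"unsupported log level: {log_level}")
        options.append(f"-Dscannerloglevel={log_level}")
    if cores is not None:
        # The guide says the core value should be a number; this sketch
        # accepts positive whole numbers only.
        if isinstance(cores, bool) or not isinstance(cores, int) or cores < 1:
            raise ValueError(f"core count must be a positive integer: {cores!r}")
        options.append(f"-Dscanner.container.core={cores}")
    # Assumption: multiple arguments are separated by a single space.
    return " ".join(options)


print(build_custom_options(log_level="DEBUG", cores=2))
# prints: -Dscannerloglevel=DEBUG -Dscanner.container.core=2
```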