Microsoft SharePoint
SharePoint is a web-based collaborative platform that integrates with Microsoft Office. SharePoint is primarily a document management and storage system that is configurable as required.
Objects Extracted
Enterprise Data Catalog extracts only files from the Microsoft SharePoint data source.
Permissions to Configure the Resource
Configure the read permission on the Microsoft SharePoint data source for the user account that you use to access the data source.
Supported File Types
The Microsoft SharePoint resource supports metadata extraction from the following file types:
- Avro files
- Delimited files
- File System
- Folders
- JSON files
- Parquet files
- Unstructured files
- XML files
Assign read permission to the files for metadata extraction.
Resource Connection Properties
The following table describes the connection properties:
Property | Description |
---|
SharePoint URL | URL to access SharePoint. |
User Name | User name to access SharePoint. |
Password | Password to access SharePoint. |
SharePoint Content Type | Specifies the type of SharePoint content to scan. Select one of the following types:
- All. Scans content from both the list and library SharePoint content types.
- SharePoint List. Scans content from the list SharePoint content type.
- SharePoint Library. Scans content from the library SharePoint content type.
|
Enable Subsite Scan | Select this option to scan subsites from the SharePoint site. |
Include Nested Subsites | Select this option to scan the nested subsites within the top-level subsite. This option applies only when you select the Enable Subsite Scan option. |
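The connection properties above can be modeled as a small data structure, which makes the dependency between the two subsite options explicit. The following is a minimal sketch only: the class name `SharePointConnection`, the field names, and the placeholder URL and credentials are hypothetical and are not part of Enterprise Data Catalog, which takes these values through its resource configuration UI.

```python
from dataclasses import dataclass
from enum import Enum


class ContentType(Enum):
    """Options of the SharePoint Content Type property."""
    ALL = "All"
    LIST = "SharePoint List"
    LIBRARY = "SharePoint Library"


@dataclass
class SharePointConnection:
    """Hypothetical container for the connection properties in the table."""
    sharepoint_url: str
    user_name: str
    password: str
    content_type: ContentType
    enable_subsite_scan: bool
    include_nested_subsites: bool

    def __post_init__(self) -> None:
        # Include Nested Subsites applies only when Enable Subsite Scan is selected.
        if self.include_nested_subsites and not self.enable_subsite_scan:
            raise ValueError(
                "Include Nested Subsites requires Enable Subsite Scan"
            )


# Placeholder values for illustration only.
conn = SharePointConnection(
    sharepoint_url="https://sharepoint.example.com/sites/docs",
    user_name="catalog_reader",
    password="change-me",
    content_type=ContentType.LIBRARY,
    enable_subsite_scan=True,
    include_nested_subsites=True,
)
```

Validating the combination up front mirrors the rule in the table: selecting Include Nested Subsites without Enable Subsite Scan has no effect, so the sketch rejects it rather than silently ignoring it.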
The following table describes the Additional and Advanced properties for source metadata settings on the Metadata Load Settings tab:
Property | Description |
---|
Enable Source Metadata | Extracts and ingests metadata from the data source. |
File Types | Select the file types from which you want to extract metadata:
- All. Use this option to extract metadata from all file types.
- Select. Use this option to extract metadata from specific file types. Perform the following steps to specify the file types:
  1. Click Select. The Select Specific File Types dialog box appears.
  2. Select the required file types from the following options:
     - Extended unstructured formats. Use this option to extract metadata from file types such as audio files, video files, image files, and ebooks.
     - Structured file types. Use this option to extract metadata from file types such as Avro, Parquet, JSON, XML, text, and delimited files.
     - Unstructured file types. Use this option to extract metadata from file types such as Apple files, Microsoft Excel, Microsoft PowerPoint, Microsoft Word, web pages, compressed files, emails, and PDF.
  3. Click Select.
Note: You can select the Specific File Types option in the dialog box to select files under all the categories.
|
Treat Files Without Extension As | Select one of the following options to identify files without an extension: |
Enter File Delimiter | Specify the file delimiter if the file from which you extract metadata uses a delimiter other than one of the following delimiters:
- Comma (,)
- Horizontal tab (\t)
- Semicolon (;)
- Colon (:)
- Pipe symbol (|)
Enclose each delimiter in single quotes. For example, '$'. Use a comma to separate multiple delimiters. For example, '$','%','&'. |
Other File Types | Extracts basic file metadata such as size of the file, path to the file, and time stamp information from other file types. |
First Level Directory | Specify a directory or a list of directories under the source directory. If you leave this option blank, Enterprise Data Catalog imports all the files from the specified source directory. To specify a directory or a list of directories, perform the following steps:
1. Click Select.... The Select First Level Directory dialog box appears.
2. Select the required directories using one of the following options:
   - Select from list. Select the required directories from a list of directories.
   - Select using regex. Provide an SQL regular expression to select the directories that match the expression.
Note: If you select multiple directories, you must separate the directories with a semicolon (;). |
Include Subdirectory | Select this option to import all the files in the subdirectories under the source directory. Note: This option is mandatory to scan files from SharePoint. |
Case Sensitive | Specifies that the resource is configured for case sensitivity. Select one of the following values:
- True. Select this check box to specify that the resource is configured as case sensitive.
- False. Clear this check box to specify that the resource is configured as case insensitive.
The default value is True. |
Memory | The memory required to run the scanner job. Select one of the following values based on the data set size imported: Note: For more information about the memory values, see the Tuning Enterprise Data Catalog Performance article on the How-To Library Articles tab in the Informatica Doc Portal. |
JVM Options | JVM parameters that you can set to configure the scanner container. Use the following arguments to configure the parameters:
- -Dscannerloglevel=<DEBUG/INFO/ERROR>. Changes the log level of the scanner to DEBUG, INFO, or ERROR. Default is INFO.
- -Dscanner.container.core=<No. of core>. Increases the number of cores for the scanner container. The value must be a number.
- -Dscanner.yarn.app.environment=<key=value>. Key-value pairs that you need to set in the YARN environment. Use a comma to separate the key-value pairs.
- -Dscanner.pmem.enabled.container.memory.jvm.memory.ratio=<1.0/2.0>. Increases the scanner container memory when pmem is enabled. Default is 1.
|
Track Data Source Changes | View metadata source change notifications in Enterprise Data Catalog. |
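Several properties in the table expect values in a specific text format: single-quoted, comma-separated delimiters for Enter File Delimiter, semicolon-separated names for First Level Directory, and JVM arguments for JVM Options. The following sketch shows one way to assemble such strings before you enter them. The function names are hypothetical helpers, not part of Enterprise Data Catalog. The sketch also assumes that the JVM arguments take the standard `-D` system-property prefix and that multiple arguments are separated by spaces, neither of which the table confirms.

```python
# Delimiters that the table lists as recognized without extra configuration.
BUILT_IN_DELIMITERS = {",", "\t", ";", ":", "|"}


def format_delimiters(delimiters):
    """Build an Enter File Delimiter value.

    Built-in delimiters are skipped because only other delimiters need
    to be specified. Each remaining delimiter is enclosed in single
    quotes, and multiple delimiters are separated by commas.
    """
    custom = [d for d in delimiters if d not in BUILT_IN_DELIMITERS]
    return ",".join(f"'{d}'" for d in custom)


def format_directories(directories):
    """Join first-level directory names with semicolons."""
    return ";".join(directories)


def format_jvm_options(log_level="INFO", cores=None):
    """Build a JVM Options value from a log level and an optional core count.

    Assumption: arguments use the -D prefix and are space-separated.
    """
    if log_level not in {"DEBUG", "INFO", "ERROR"}:
        raise ValueError("log_level must be DEBUG, INFO, or ERROR")
    options = [f"-Dscannerloglevel={log_level}"]
    if cores is not None:
        # The core count must be a number.
        options.append(f"-Dscanner.container.core={int(cores)}")
    return " ".join(options)


print(format_delimiters(["$", "%", "&"]))         # '$','%','&'
print(format_directories(["finance", "hr"]))      # finance;hr
print(format_jvm_options("DEBUG", cores=2))
```

The directory names `finance` and `hr` are placeholders. The first call reproduces the multiple-delimiter example given in the Enter File Delimiter row.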
You can enable data discovery for a SharePoint resource. For more information about enabling data discovery, see the Enable Data Discovery topic.
You can enable composite data domain discovery for a SharePoint resource. The Microsoft SharePoint resource operates at one site per resource. For more information about enabling composite data domain discovery, see the Composite Data Domain Discovery topic.