When Intelligent Structure Discovery discovers the structure of a file, it might create a field called Unassigned Data if the file has records that don't match the intelligent structure model.
Intelligent Structure Discovery can create the Unassigned Data field in scenarios such as the following:
•A delimited file, such as a CSV or log file, contains more elements than expected and Intelligent Structure Discovery can't parse the data drift.
•A record in a JSON file isn't in the intelligent structure model or exceeds the maximum record size.
Maximum record size
When Intelligent Structure Discovery discovers the structure of a JSON sample file, it uses the maximum record size to identify repeating records. If a record is larger than the maximum record size, Intelligent Structure Discovery assigns the record to the Unassigned Data field.
The default maximum record size is 640,000 bytes. You can increase the maximum record size to avoid the use of the Unassigned Data field.
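Before you change the setting, it can help to estimate which records in a JSON sample are likely to exceed the limit. The following Python sketch is illustrative only. It assumes that the sample is a list of records and that record size can be approximated by the UTF-8 byte length of the compact JSON serialization. How Intelligent Structure Discovery measures record size internally is not described here, so treat the result as an estimate. The function name oversized_records is hypothetical.

```python
import json

# Default limit stated in this topic, in bytes.
DEFAULT_MAX_RECORD_SIZE = 640_000


def oversized_records(records, max_record_size=DEFAULT_MAX_RECORD_SIZE):
    """Return (index, size_in_bytes) for each record larger than the limit.

    Assumption: size is approximated as the UTF-8 byte length of the
    compact JSON serialization of the record. The product might measure
    record size differently.
    """
    flagged = []
    for index, record in enumerate(records):
        serialized = json.dumps(record, separators=(",", ":"), ensure_ascii=False)
        size = len(serialized.encode("utf-8"))
        if size > max_record_size:
            flagged.append((index, size))
    return flagged


if __name__ == "__main__":
    # Hypothetical sample: one small record and one record over the default limit.
    sample = [
        {"id": 1, "payload": "x" * 100},
        {"id": 2, "payload": "y" * 700_000},
    ]
    for index, size in oversized_records(sample):
        print(f"Record {index} is about {size} bytes and exceeds the limit")
```

If the sketch flags records, the largest reported size gives a rough lower bound for the maximum record size to configure.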
To edit the maximum record size, use Administrator to configure a JVM option in the Data Integration Server properties. Use the following syntax to define the maximum record size:
-DISD_MAX_RECORD_SIZE=<size in bytes>
For example, to define a maximum record size of 2 MB (2,000,000 bytes), enter the following value for the JVMOption1 property:
-DISD_MAX_RECORD_SIZE=2000000
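If you generate the property value in a script, a small helper can reduce typing errors. The following Python sketch only formats the option string according to the syntax shown above. The function name max_record_size_option is hypothetical, and the sketch does not set the property in Administrator.

```python
def max_record_size_option(size_in_bytes):
    """Format the JVM option for the given maximum record size in bytes."""
    if not isinstance(size_in_bytes, int) or size_in_bytes <= 0:
        raise ValueError("size_in_bytes must be a positive integer")
    return f"-DISD_MAX_RECORD_SIZE={size_in_bytes}"


if __name__ == "__main__":
    # 2 MB expressed as 2,000,000 bytes, matching the example above.
    print(max_record_size_option(2_000_000))
```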
Note: Increasing the maximum record size increases the memory consumption of the discovery process. Therefore, you might need to perform one or both of the following actions:
•Increase the maximum JVM heap size in the Data Integration Server properties. To increase the JVM heap size, set one of the JVM options to -Xmx<heap size>m, where the heap size is in megabytes. For example, -Xmx2048m sets a maximum heap size of 2048 MB.
•Increase the memory of the Secure Agent machine. If the Secure Agent runs on an Informatica Cloud Hosted Agent, contact Informatica Global Customer Support.
For more information about configuring Data Integration Server properties and the Secure Agent, see the Administrator help.