Microsoft Fabric OneLake Connector > Mappings for Microsoft Fabric OneLake > File formatting options

File formatting options

Select the format of the Microsoft Fabric OneLake file and configure the formatting options.

The following table describes the formatting options for Avro, Parquet, JSON, ORC, Document, and delimited flat files:

Property	Description
Schema Source	The schema of the source or target file. Select one of the following options to specify a schema: - Read from data file. Imports the schema from a file in Microsoft Fabric OneLake. - Import from schema file. Imports the schema from a schema definition file in the agent machine.
Schema File	The schema definition file in the agent machine from where you want to upload the schema. You cannot upload a schema file when you create a target at runtime.

The following table describes the formatting options for flat files:

Property	Description
Flat File Type	The type of flat file. Select one of the following options: - Delimited. Reads a flat file that contains column delimiters. - Fixed Width. Reads a flat file with fields that have a fixed length. You must select the file format in the Fixed Width File Format option. If you do not have a fixed-width file format, click New > Components > Fixed Width File Format to create one.
Delimiter	Character used to separate columns of data in a delimited flat file. You can set values as comma, tab, colon, semicolon, or others. You cannot set a tab as a delimiter directly in the Delimiter field. To set a tab as a delimiter, you must type the tab character in any text editor. Then, copy and paste the tab character in the Delimiter field.
EscapeChar	Character immediately preceding a column delimiter character embedded in an unquoted string, or immediately preceding the quote character in a quoted string data in a delimited flat file. When you write data to Microsoft Fabric OneLake and specify a qualifier, by default, the qualifier is considered as the escape character. Else, the character specified as the escape character is considered.
Qualifier	Quote character that defines the boundaries of data in a delimited flat file. You can set qualifier as single quote or double quote.
Qualifier Mode	Specify the qualifier behavior when you write data to a delimited flat file. You can select one of the following options: - Minimal. Default mode. Applies qualifier to data enclosed within a delimiter value or a special character. - All. Applies qualifier to all data. - Non_Numeric. Not applicable. - All_Non_Null. Not applicable.
Disable escape character when a qualifier is set	Applicable to a Microsoft Fabric OneLake target. Select to disable the escape character when a qualifier is set. When you disable the escape character, the special characters not escaped and are considered as part of the data written to the target.
Code Page	Select the code page that the Secure Agent must use to read or write data to a delimited flat file. Select UTF-8 for mappings. Select one of the following options for mappings in advanced mode: - UTF-8 - MS Windows Latin 1 - Shift-JIS - ISO 8859-15 Latin 9 (Western European) - ISO 8859-3 Southeast European - ISO 8859-5 Cyrillic - ISO 8859-9 Latin 5 (Turkish) - IBM EBCDIC International Latin-1
Header Line Number	Specify the line number that you want to use as the header when you read data from a delimited flat file. Specify the value as 0 or 1. To read data from a file with no header, specify the value as 0.
First Data Row	Specify the line number from where you want the Secure Agent to read data in a delimited flat file. You must enter a value that is greater or equal to one. To read data from the header, the value of the Header Line Number and the First Data Row fields should be the same. Default is 1.
Target Header	Select whether you want to write data to a target that contains a header or without a header in the delimited flat file. You can select With Header or Without Header options. This property is not applicable when you read data from aMicrosoft Fabric OneLake source.
Distribution Column	Not applicable.
Max Rows To Preview	Not applicable.
Row Delimiter	Character used to separate rows of data. You can set values as \r, \n, and \r\n. This property is not applicable when you read data from a Microsoft Fabric OneLake source.

Configuring Delta file format in a mapping in advanced mode

You can use the Delta file format in mappings and mappings in advanced mode.

Before you use the Delta file format in mappings in advanced mode, perform the following steps:

1In the mapping task, navigate to the Advanced Session Properties section on the Runtime Options tab.

2Under Advanced Session Properties, click

and select spark.custom.property in the Session Property Name field.

3In the Session Property Value field, set the following values:

- spark.sql.extensions=io.delta.sql.DeltaSparkSessionExtension
- spark.sql.catalog.spark_catalog=org.apache.spark.sql.delta.catalog.DeltaCatalog

Separate the values with &: in the Session Property Value field.

This image displays the spark custom property along with the session property value.

Rules and guidelines for Delta file format

Consider the following rules and guidelines when you use the Delta file format to read from and write to a Delta Lake:

General guidelines
Mappings
Mappings in advanced mode

Rules and guidelines for Iceberg file format

Consider the following rules and guidelines when you use the Iceberg file format to read from or write to Apache Iceberg files:

•You cannot use the Iceberg file format when you run mappings in advanced mode.
•You can read data only when you select File as the Source Type in the advanced source properties.
•You can select the following as source objects:

- JSON snapshot file in the metadata folder for time travel capabilities.
- Any Parquet data file in the data folder to read the latest snapshot of the table.
- Snapshot Avro file in the metadata folder to read the latest snapshot of the table.

For example, snap-1486404200109107576-1-5078c39c-8640-49ca-9985-da3359ddfca1.avro

•You can infer schema from the selected object only if the file is in JSON or text format.
•When you create a target at runtime, the specified target object name is not considered. Instead, the Iceberg table name is derived from the directory that contains both the data and metadata of the table in the target Lakehouse path . For example, in the path <Lakehouse_Name>\Files\Dir1\Dir2\test.parquet, the Iceberg table name is taken from the directory Dir2. This Dir2 directory contains both data and metadata folders.
•When you write to an existing target, you can only select Parquet files in the data folder.

Rules and guidelines for Document file format

Certain rules and guidelines apply to Document file format.

Consider the following rules and guidelines when you use the Document file format to read PDF files:

•You cannot read compressed files that use Gzip compression.
•Merged cells in header rows of the PDF are not written accurately to the target.
•If a column header is empty, Data Integration automatically assigns default names to the columns, such as column1, column2, column3, and so on.
•If a table does not contain any data but includes only the header row, the default column name is prefixed to the header row in the target file.
•If columns in a table move to the next page in the PDF, the table is not written as a single table.
•If the PDF files contain only text, the text might not be well-structured in the target. There might also be a mismatch in the order of text.
•If the content in the PDF is formatted with two columns, the layout is not preserved when the content is written to the target file. Instead of keeping the distinct column structure, the text from both columns gets merged, and the content appears in the target in a single, continuous line.
•Multiple adjacent tables are read as a single table.