Target
The target defines the Hive or HDFS location where you want to ingest the source tables from the relational database.
You specify different target properties depending on the scenario that you select in the specification definition. If you configure the specification to ingest data from a relational database to Hive, you configure a Hive connection and Hive table properties to define the target. If you configure the specification to ingest data from a relational database to HDFS, you configure an HDFS connection and an ingestion directory to define the target.
The mass ingestion solution ingests all data to the target. It does not provide an option to append only recently updated data. Each time that you run the mass ingestion specification, the existing data in the Hive or HDFS target is deleted and replaced with the data that the ingestion job is configured to ingest.
Configuring a Hive Target
Configure a Hive target to ingest data to a Hive table. When you configure the mass ingestion specification to ingest data to a Hive target, you configure a Hive connection and Hive properties to define the target.
You can ingest data to an internal or external Hive table. Internal Hive tables are managed by Hive. External Hive tables are not managed by Hive, and you can specify an external location for them, such as Amazon S3, Azure Blob, HDFS, WASB, or ADLS.
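For reference, the DDL for an external Hive table with an explicit location follows this general pattern. This is a minimal sketch of standard Hive syntax, and the table name, columns, and path are illustrative rather than values generated by the tool:
CREATE EXTERNAL TABLE PRODUCT (PRODUCT_ID INT, PRODUCT_NAME STRING) STORED AS PARQUET LOCATION '/temp/PRODUCT'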
The following table describes the properties that you can configure to define the Hive target:
| Property | Description |
|---|---|
| Target Connection | Required. The Hive connection used to find the Hive storage target. If changes are made to the available Hive connections, refresh the browser or log out and log back in to the Mass Ingestion tool. |
| Target Schema | Required. The schema that defines the target tables. |
| Target Table Prefix | The prefix added to the names of the target tables. Enter a string of alphanumeric and underscore characters. The prefix is not case sensitive. |
| Target Table Suffix | The suffix added to the names of the target tables. Enter a string of alphanumeric and underscore characters. The suffix is not case sensitive. |
| Hive Options | Select this option to configure the Hive target location. |
| DDL Query | Select this option to configure a custom DDL query that defines how data from the source tables is loaded to the target tables. |
| Storage Format | Required. The storage format of the target tables. You can select Text, Avro, Parquet, or ORC. Default is Text. |
| External Table | Select this option if the table is external. |
| External Location | The external location of the Hive target. By default, tables are written to the default Hive warehouse directory. A sub-directory is created under the specified external location for each source table that is ingested. For example, if you enter /temp, a source table named PRODUCT is ingested to the external location /temp/PRODUCT/. |
Configure partition and cluster properties for specific target Hive tables when you configure the transformation override.
DDL Query
When you configure a mass ingestion specification to ingest data to a Hive target, you can configure a custom DDL query to define how data from the source tables is loaded to the target tables.
You can define the DDL query to customize the target table or specify additional parameters. The target table contains the columns that you define in the DDL query.
To define a DDL query, use SQL statements and placeholders. Use the placeholders to fetch the table name, column list, and column names. The Data Integration Service substitutes the placeholders with actual values at run time according to the tables that you ingest. You must enclose the placeholders within curly brackets. For example, {INFA_TABLE_NAME}.
You can use the following placeholders:
- INFA_TABLE_NAME. Fetches the target table name at run time.
- INFA_COLUMN_LIST. Fetches a list of columns in the target table at run time.
For example, you might ingest a source table named CUSTOMER. To define how the table is loaded to the target, you can enter the following DDL query:
CREATE TABLE {INFA_TABLE_NAME} ({INFA_COLUMN_LIST}) CLUSTERED BY (LAST_NAME) INTO 10 BUCKETS STORED AS TEXTFILE
At run time, the Data Integration Service substitutes {INFA_TABLE_NAME} with CUSTOMER, and it substitutes {INFA_COLUMN_LIST} with the list of columns that appear in the table CUSTOMER. The Data Integration Service might expand the DDL query to the following query:
CREATE TABLE CUSTOMER (FIRST_NAME STRING, LAST_NAME STRING, EMAIL STRING, GENDER STRING, CREDIT_CARD DECIMAL (38,0), CREDIT_CARD_TYPE STRING, STATE STRING, USSTATE STRING, CITY STRING) CLUSTERED BY (LAST_NAME) INTO 10 BUCKETS STORED AS TEXTFILE
Note: You cannot use a placeholder to specify partition columns or clustered by columns. Enter the column names explicitly in the DDL query.
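For example, to partition the target table, you might enter a DDL query similar to the following sketch. The partition column INGEST_DATE is a hypothetical name. In Hive, a partition column cannot also appear in the table's column list, so the name must not duplicate a column that {INFA_COLUMN_LIST} fetches:
CREATE TABLE {INFA_TABLE_NAME} ({INFA_COLUMN_LIST}) PARTITIONED BY (INGEST_DATE STRING) STORED AS TEXTFILE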
Configuring an HDFS Target
Configure an HDFS target to ingest data to a flat file on HDFS. When you configure the mass ingestion specification to ingest data to an HDFS target, you configure an HDFS connection and an ingestion directory to define the target.
The following table describes the properties that you can configure to define the HDFS target:
| Property | Description |
|---|---|
| Target Connection | Required. The HDFS connection used to find the HDFS storage target. If changes are made to the available HDFS connections, refresh the browser or log out and log back in to the Mass Ingestion tool. |
| Target Table Prefix | The prefix added to the names of the target files. Enter a string of alphanumeric and underscore characters. The prefix is not case sensitive. |
| Target Table Suffix | The suffix added to the names of the target files. Enter a string of alphanumeric and underscore characters. The suffix is not case sensitive. |
| Ingestion Directory | Required. The target directory on HDFS. A sub-directory is created under the ingestion directory for each source table that is ingested. If the specified directory already exists, the directory is replaced. For example, if you enter /temp, a source table named PRODUCT is ingested to the directory /temp/PRODUCT/. |
| Compression | Required. The compression format for the target files. You can select None, Gzip, Bzip2, LZO, Snappy, or Custom. Default is None. |
| Compression Codec | Required if you select Custom compression. Enter the fully qualified name of a class that implements the Hadoop CompressionCodec interface. |
| Delimiters | The delimiter used to separate data in the target files. You can select comma, semicolon, space, tab, or Other. |
| Other Delimiter | Required if you select Other. Enter a custom delimiter. |
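For example, if you select the comma delimiter, a row of a source table might be written to the target file as a comma-separated record such as 101,Widget,9.99, where the values are illustrative.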
Compression Codec
When you configure a mass ingestion specification to ingest data to an HDFS target directory, you can configure a compression codec to write the ingested data to a compressed file.
You can select one of the following compression options:
- Gzip
- Bzip2
- LZO
- Snappy
- Custom
If you specify a custom compression codec, you must specify the fully qualified name of a class that implements the Hadoop CompressionCodec interface.
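For example, the fully qualified class name of the built-in Hadoop Gzip codec is org.apache.hadoop.io.compress.GzipCodec. A custom codec follows the same pattern, such as a hypothetical com.example.compress.MyCodec class that implements org.apache.hadoop.io.compress.CompressionCodec.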