informatica.infacore.dataframe.dataobject.DataObject.read

DataObject.read(options: dict = None)

Creates an INFACore DataFrame for the DataObject instance. Use the action method collect() to extract the data from the data source.

Parameters:

options (dict, optional) –

A dictionary of properties for advanced functionality. Default is None.

options["data_format"], optional : dict. Must contain the "format" key. Allowed values for "format" are 'Avro', 'Parquet', 'Orc', 'Json', and 'Flat'.
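As an illustration of the "data_format" shape described above, the following is a minimal sketch of a helper that builds the options dictionary and rejects values outside the documented list. The names ALLOWED_FORMATS and data_format_options are hypothetical and not part of INFACore; the exact-match (case-sensitive) check is also an assumption, since this page does not say how the library compares format names.

```python
# Hypothetical helper, not part of INFACore: builds the "data_format" options
# dict using only the format names listed in the parameter description.
ALLOWED_FORMATS = ("Avro", "Parquet", "Orc", "Json", "Flat")


def data_format_options(fmt: str) -> dict:
    """Return an options dict of the form {"data_format": {"format": fmt}}."""
    # Assumption: names are matched exactly as documented (case-sensitive).
    if fmt not in ALLOWED_FORMATS:
        raise ValueError(f"format must be one of {ALLOWED_FORMATS}, got {fmt!r}")
    return {"data_format": {"format": fmt}}


options = data_format_options("Parquet")
# The result would then be passed to read(), for example: s3_do.read(options)
```

Validating before the call keeps a misspelled format name from reaching the data source; whether INFACore performs its own check is not stated here.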

options["filter"], optional : dict. Allowed values for the filter "type" are 'simple' and 'advanced'.

For a simple filter, specify "condition" as the key and a 2-D array as the value, where each inner array has the form ["column","operator","value"] (for example, [["age","GREATER","30"]]).

For an advanced filter, specify "expression" as the key and the expression (str) as the value.
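The two filter shapes described above can be sketched as plain dictionaries. The helpers simple_filter and advanced_filter below are hypothetical and not part of INFACore. The top-level key is written as "filter" in the parameter description and as "filters" in Example 2, so the sketch takes the key as an argument rather than settling on one. The expression string is a placeholder: the expression syntax is not specified on this page.

```python
# Hypothetical helpers, not part of INFACore: they only build the dictionaries
# described in the parameter list for the two filter types.
def simple_filter(conditions, key="filter"):
    """Build a simple filter from [column, operator, value] triples."""
    rows = [list(c) for c in conditions]
    for row in rows:
        if len(row) != 3:
            raise ValueError(f"expected [column, operator, value], got {row!r}")
    return {key: {"type": "simple", "condition": rows}}


def advanced_filter(expression, key="filter"):
    """Build an advanced filter from a single expression string."""
    if not isinstance(expression, str):
        raise TypeError("expression must be a str")
    return {key: {"type": "advanced", "expression": expression}}


# Same conditions as Example 2, using the key spelling from that example.
simple_opts = simple_filter(
    [["age", "GREATER", "30"], ["state", "EQUALS", "ARIZONA"]], key="filters"
)

# Placeholder expression; the real expression syntax is an assumption here.
advanced_opts = advanced_filter("age > 30 AND state = 'ARIZONA'")
# Either dict would then be passed to read(), for example: orcl_do.read(simple_opts)
```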

Returns:

The INFACore DataFrame object.

Return type:

DataFrame

Examples

# Example 1 - Read data from the Oracle Customers table

>>> import informatica.infacore as ic
>>> orcl_do = ic.get_datasource("Oracle").get_connection("Oracle Prod").get_data_object("Customers")
>>> i_df = orcl_do.read()

# Example 2 - Read data from the Oracle Employee table where the employee age is greater than 30 and state is Arizona

>>> import informatica.infacore as ic
>>> orcl_do = ic.get_datasource("Oracle").get_connection("Oracle Prod").get_data_object("Employee")
>>> options = {
...     "filters": {
...         "type": "simple",
...         "condition": [
...             ["age", "GREATER", "30"],
...             ["state", "EQUALS", "ARIZONA"]
...         ]
...     }
... }
>>> i_df = orcl_do.read(options)

# Example 3 - Read Avro-format customer data from Amazon S3

>>> import informatica.infacore as ic
>>> s3_do = ic.get_datasource("Amazon S3").get_connection("S3 Sandbox").get_data_object("com.amk/customer.avro")
>>> options = {
...     "data_format": {
...         "format": "Avro"
...     }
... }
>>> i_df = s3_do.read(options)