informatica.infacore.dataframe.dataobject.DataObject.read
- DataObject.read(options: dict = None)
Creates the INFACore DataFrame for the specified DataObject instance. Use the action method collect() on the DataFrame to extract data from the data source.
- Parameters:
options (dict, optional) –
A dictionary of properties for advanced functionality. Default is None.
options["data_format"], optional : dict, must have the "format" key. Allowed values for "format" are 'Avro', 'Parquet', 'Orc', 'Json', and 'Flat'.
options["filter"], optional : dict. Allowed values for the filter "type" are 'simple' and 'advanced'.
For a simple filter, specify "condition" as the key and a 2-D array as the value, where each inner array has the form ["column", "operator", "value"].
For an advanced filter, specify "expression" as the key and the expression (str) as the value.
- Returns:
The INFACore DataFrame object.
- Return type:
Examples
# Example 1 - Read data from the Oracle Customers table
>>> import informatica.infacore as ic
>>> orcl_do = ic.get_datasource("Oracle").get_connection("Oracle Prod").get_data_object("Customers")
>>> i_df = orcl_do.read()
# Example 2 - Read data from the Oracle Employee table where the employee age is greater than 30 and state is Arizona
>>> import informatica.infacore as ic
>>> orcl_do = ic.get_datasource("Oracle").get_connection("Oracle Prod").get_data_object("Employee")
>>> options = {
...     "filters": {
...         "type": "simple",
...         "condition": [
...             ["age", "GREATER", "30"],
...             ["state", "EQUALS", "ARIZONA"]
...         ]
...     }
... }
>>> i_df = orcl_do.read(options)
# Example 3 - Read customer data which is in Avro format from Amazon S3
>>> import informatica.infacore as ic
>>> s3_do = ic.get_datasource("Amazon S3").get_connection("S3 Sandbox").get_data_object("com.amk/customer.avro")
>>> options = {"data_format": {"format": "Avro"}}
>>> i_df = s3_do.read(options)
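# Sketch - Build and check option values before calling read()
The option values described under Parameters are plain dictionaries, so they can be assembled and checked before read() is called. The following is a minimal sketch, not part of INFACore: the helper names data_format, simple_filter, and advanced_filter are hypothetical, and the advanced expression string shown in the usage lines is only a placeholder, because the expression syntax is not described here.

```python
# Hypothetical helpers (not part of INFACore) that build the dictionaries
# described under Parameters and reject obviously malformed input.

# Allowed values for "format", as listed in the parameter description.
ALLOWED_FORMATS = {"Avro", "Parquet", "Orc", "Json", "Flat"}


def data_format(fmt):
    """Return a dict suitable as the "data_format" option value."""
    if fmt not in ALLOWED_FORMATS:
        raise ValueError(f"unsupported format: {fmt!r}")
    return {"format": fmt}


def simple_filter(conditions):
    """Return a simple filter dict from [column, operator, value] triples."""
    rows = [list(row) for row in conditions]
    for row in rows:
        if len(row) != 3:
            raise ValueError(f"expected [column, operator, value], got {row!r}")
    return {"type": "simple", "condition": rows}


def advanced_filter(expression):
    """Return an advanced filter dict wrapping an expression string."""
    if not isinstance(expression, str) or not expression.strip():
        raise ValueError("expression must be a non-empty string")
    return {"type": "advanced", "expression": expression}


# Usage: build the values, then place them under the matching options keys.
fmt_value = data_format("Avro")
simple_value = simple_filter([["age", "GREATER", "30"], ["state", "EQUALS", "ARIZONA"]])
# Placeholder expression; the real expression syntax is an assumption.
advanced_value = advanced_filter("age > 30")
```

Each returned dictionary can be placed under the corresponding key of options and passed to read(), as in the examples above.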