Amazon Athena is an interactive query service that you can use to analyze big data stored in Amazon S3 using standard SQL.
Objects extracted
Metadata Command Center extracts the following objects from an Amazon Athena source system:
•Database
•Schema
•View
•ViewColumn
•External Table
•External Column
•Calculation
•Resource
•Nested Field
You can extract objects with the following complex data types and their nested fields from an Amazon Athena source system:
•Array
•Map
•Struct
Data Governance and Catalog displays objects with complex data types as external columns. The nested fields within arrays appear as elements, while those within maps appear as keys and values.
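To illustrate how nested fields relate to their parent column, the following minimal Python sketch walks a complex type and lists its nested fields. It labels array contents as elements and map contents as keys and values, as described above. The in-memory type representation, the function name nested_fields, and the dotted path format are assumptions made for this illustration. They do not reflect how Metadata Command Center stores or names these objects.

```python
from typing import Iterator, Tuple, Union

# Hypothetical representation of an Amazon Athena column type (an assumption):
#   primitive -> "int", "string", ...
#   array     -> {"array": element_type}
#   map       -> {"map": (key_type, value_type)}
#   struct    -> {"struct": {field_name: field_type, ...}}
TypeSpec = Union[str, dict]


def nested_fields(name: str, type_spec: TypeSpec) -> Iterator[Tuple[str, str]]:
    """Yield (path, kind) pairs for every nested field under a column."""
    if isinstance(type_spec, str):
        return  # primitive type: no nested fields
    if "array" in type_spec:
        path = f"{name}.element"
        yield path, "element"
        yield from nested_fields(path, type_spec["array"])
    elif "map" in type_spec:
        key_type, value_type = type_spec["map"]
        for label, sub_type in (("key", key_type), ("value", value_type)):
            path = f"{name}.{label}"
            yield path, label
            yield from nested_fields(path, sub_type)
    elif "struct" in type_spec:
        for field_name, sub_type in type_spec["struct"].items():
            path = f"{name}.{field_name}"
            yield path, "field"
            yield from nested_fields(path, sub_type)


if __name__ == "__main__":
    # Example column: struct<id:int, tags:array<string>, attrs:map<string,double>>
    payload_type = {
        "struct": {
            "id": "int",
            "tags": {"array": "string"},
            "attrs": {"map": ("string", "double")},
        }
    }
    for path, kind in nested_fields("payload", payload_type):
        print(f"{path} ({kind})")
```

For the example column, the sketch lists payload.id, payload.tags, and payload.attrs as fields, payload.tags.element as an element, and payload.attrs.key and payload.attrs.value as a key and a value.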
Data profiling for Amazon Athena
Configure data profiling to run profiles on the metadata extracted from an Amazon Athena source system. You can run data profiles on the following Amazon Athena objects:
•External tables created in the following file formats:
- Avro
- CSV
- Delta
- JSON
- Parquet
•External columns
You can view the profiling statistics in Data Governance and Catalog. The data profiling task runs profiles on the following data types for Amazon Athena objects:
•Bigint
•Boolean
•Char
•Date
•Decimal
•Double
•Float
•Int
•Smallint
•String
•Timestamp
•Tinyint
•Varchar
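Before you configure profiling, you might want to check which columns in a table have a data type from the list above. The following Python sketch is one way to do that for a list of column type strings. The helper names base_type and is_profilable are hypothetical, and the check compares type names only. It is not part of Metadata Command Center.

```python
import re

# Data types listed above as supported by the data profiling task.
PROFILABLE_TYPES = {
    "bigint", "boolean", "char", "date", "decimal", "double", "float",
    "int", "smallint", "string", "timestamp", "tinyint", "varchar",
}


def base_type(column_type: str) -> str:
    """Return the lowercase type name without parameters such as (10,2) or <...>."""
    match = re.match(r"\s*([A-Za-z_]+)", column_type)
    return match.group(1).lower() if match else ""


def is_profilable(column_type: str) -> bool:
    """Check the base type name against the documented list (name match only)."""
    return base_type(column_type) in PROFILABLE_TYPES


if __name__ == "__main__":
    # Hypothetical column definitions used only for illustration.
    columns = {
        "order_id": "bigint",
        "amount": "decimal(10,2)",
        "customer": "VARCHAR(255)",
        "items": "array<string>",
    }
    for name, column_type in columns.items():
        status = "profilable" if is_profilable(column_type) else "not in the list"
        print(f"{name}: {column_type} -> {status}")
```

In this example, order_id, amount, and customer match a listed type, while items does not because array is not in the list.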
Sampling type
Determine the sample rows on which you want to run the data profiling task. You can choose one of the following sampling types for an Amazon Athena catalog source:
- All Rows
- Limit N Rows
- Custom Query. Enter a sampling method that specifies the percentage of rows on which you want to run the data profiling task. For example, TABLESAMPLE BERNOULLI(10) or TABLESAMPLE SYSTEM(10).
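The following Python sketch shows one way to build and validate the sampling method text for the Custom Query sampling type. The function names tablesample_clause and preview_query are hypothetical. The preview query only illustrates how a TABLESAMPLE clause is placed in an Amazon Athena SELECT statement. The query that the data profiling task actually generates is not documented here, so treat the preview as an assumption.

```python
def tablesample_clause(method: str, percent: float) -> str:
    """Build a sampling clause such as TABLESAMPLE BERNOULLI(10)."""
    method = method.upper()
    if method not in ("BERNOULLI", "SYSTEM"):
        raise ValueError("method must be BERNOULLI or SYSTEM")
    if not 0 < percent <= 100:
        raise ValueError("percent must be greater than 0 and at most 100")
    # The :g format drops a trailing ".0", so 10.0 is written as 10.
    return f"TABLESAMPLE {method}({percent:g})"


def preview_query(table: str, clause: str) -> str:
    """Assumed shape of a sampled query, for checking the clause by hand."""
    return f"SELECT * FROM {table} {clause}"


if __name__ == "__main__":
    clause = tablesample_clause("bernoulli", 10)
    print(clause)
    # "sales.orders" is a placeholder table name.
    print(preview_query("sales.orders", clause))
```

You would enter only the clause text in the Custom Query field. The preview is useful if you want to try the same sampling percentage directly in Amazon Athena first.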
Note: You can run data quality only on views and external tables that are created in Amazon Athena.
Data lineage
The following lineage data is available for Amazon Athena assets:
•From table to view
For more information about data lineage, see the Asset Discovery help.