Amazon Athena is an interactive query service that you can use to analyze data in Amazon S3 with standard SQL.
Objects extracted
The Metadata Command Center service extracts the following objects from an Amazon Athena source system:
•Database
•Schema
•View
•ViewColumn
•External Table
•External Column
•Calculation
•Resource
Data profiling for Amazon Athena
Configure data profiling to run profiles on the metadata extracted from an Amazon Athena source system. You can run data profiles on the following Amazon Athena objects:
•External tables created in the following file formats:
- Avro
- CSV
- Delta
- JSON
- Parquet
•External columns
You can view the profiling statistics in Data Governance and Catalog. The data profiling task runs profiles on the following data types for Amazon Athena objects:
•Bigint
•Boolean
•Char
•Date
•Decimal
•Double
•Float
•Int
•Smallint
•String
•Timestamp
•Tinyint
•Varchar
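To check in advance which columns of a table the profiling task is likely to cover, you can compare each column's declared type against the list above. The following Python sketch is illustrative only: the function name is_profiled_type and the way it strips length or precision arguments such as (10,2) are assumptions, not part of the product.

```python
# Data types listed above as supported by the data profiling task.
PROFILED_TYPES = frozenset({
    "bigint", "boolean", "char", "date", "decimal", "double", "float",
    "int", "smallint", "string", "timestamp", "tinyint", "varchar",
})


def is_profiled_type(column_type: str) -> bool:
    """Return True if the declared column type is in the supported list.

    Hypothetical helper: drops any parenthesized length or precision,
    for example "decimal(10,2)" becomes "decimal", then compares the
    base type name case-insensitively.
    """
    base = column_type.split("(", 1)[0].strip().lower()
    return base in PROFILED_TYPES


if __name__ == "__main__":
    for declared in ("decimal(10,2)", "VARCHAR(255)", "array<int>"):
        print(declared, is_profiled_type(declared))
```

Complex types such as array, map, and struct do not appear in the list, so the helper reports them as not profiled.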
Sampling type
Determine the sample rows on which you want to run the data profiling task. You can choose one of the following sampling types for an Amazon Athena catalog source:
•All Rows
•Limit N Rows
•Custom Query. Enter a sampling clause that specifies the percentage of rows on which you want to run the data profiling task. For example, TABLESAMPLE BERNOULLI(10) or TABLESAMPLE SYSTEM(10).
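In Athena SQL, a TABLESAMPLE clause follows the table name in the FROM clause. BERNOULLI decides row by row whether to include each row, while SYSTEM samples larger segments of data, which is typically faster but less uniform. The following Python sketch shows how such a clause reads in a complete query. The function build_sample_query and the table name sales.orders are hypothetical, and the query that the profiling task actually generates is not documented here.

```python
def build_sample_query(table: str, method: str = "BERNOULLI",
                       percent: float = 10) -> str:
    """Build an Athena SELECT statement that samples a percentage of rows.

    Hypothetical helper for illustration. The table name is inserted as
    given, so pass only trusted identifiers.
    """
    method = method.upper()
    if method not in {"BERNOULLI", "SYSTEM"}:
        raise ValueError("method must be BERNOULLI or SYSTEM")
    if not 0 < percent <= 100:
        raise ValueError("percent must be greater than 0 and at most 100")
    # The :g format drops a trailing ".0", so 10 and 10.0 both render as 10.
    return f"SELECT * FROM {table} TABLESAMPLE {method}({percent:g})"


if __name__ == "__main__":
    print(build_sample_query("sales.orders"))
    print(build_sample_query("sales.orders", method="system", percent=2.5))
```

Only the TABLESAMPLE portion corresponds to the value described for the Custom Query sampling type.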
Note: You can run data quality only on views and external tables that are created in Amazon Athena.
Data lineage
The following lineage data is available for Amazon Athena assets:
•From table to view
For more information about data lineage, see Data Lineage in the Working With Assets help.