Amazon QuickSight Dashboard for S3 CSV Data Using Amazon Athena / Glue Crawler
In this article I would like to write about creating a Amazon QuickSight Dashboard for the data in a csv file located in Amazon S3.
AWS S3
Find and enter S3 from AWS Management Console. Create a S3 bucket “02transactions” with the default options and upload the Transactions.csv file to it.
AWS GLUE CRAWLER & DATA CATALOG
Here we will set up a crawler to run on demand, scan our .csv file from S3 and automatically populate the Glue Data Catalog. Go to the AWS Glue from AWS Management Console. Click the “Crawlers” under the data catalog on the left and Add crawler.
Crawler name — 02transactions_crawler
Crawler source type — Data Sources
Repeat crawls of S3 data stores — Crawl all folders
Choose a data store — S3
Crawl data in — Specified path in my account
Include path — s3://02transactions/
Add another data store — No
Frequency — Run on demand
Choose an IAM role -> Create an IAM role -> AWSGlueServiceRole-02transactions. Policies (AWSGlueServiceRole,S3 access will be attached automatically)
In the Configure the crawler’s output section, click “Add Database”
Database Name — “02transactions” and click “Create” and then “Finish”
Now in the Crawlers Overview, check the created crawler “02transactions_crawler” and click “Run Crawler”.
Once the crawler ran successfully (status will be visible in the “status” column), go to “Databases” under “Data Catalog”.
If you don't see the created “02transactions” database click the refresh on the right side. Now enter the created “02transactions” database and click the “Tables in 02transactions”. You can now see the table “02transactions” created. You can check the schema by just clicking on the table.
AMAZON ATHENA
Let’s start with Athena now. Once our data is in the Glue data catalog, it is readily available for querying in Amazon Athena. Enter Amazon Athena from the management console. Make sure that “Query editor” tab is selected on the top
select Data source — AwsDataCatalog
Database — 02transactions
Now you can see the table “02transactions” under the Tables.
Just run the following queries to make sure you see the data and schema.
SELECT * FROM “02transactions”.”02transactions” limit 10;
SHOW CREATE TABLE “02transactions”.”02transactions”;
AMAZON QUICKSIGHT
Now let’s create a QuickSight dashboard. Open QuickSight from AWS Management console. Click on the “Datasets” on the left pane and then click “New Dataset” on the right top corner.
Make you that there is also a Role has been created for QuickSight with following policies in it.
AWSQuicksightAthenaAccess, AmazonS3FullAccess, AWSQuickSightS3ConsumersPolicy.
Create a Dataset (FROM NEW DATA SOURCES) — Athena
Data source name — 02transactions
Athena workgroup — primary
Test validate once and click “Create data source”.
In the “Choose your table” window, select the following
Catalog: contain sets of databases — AWSDataCatalog
Database: contain sets of tables. — 02transactions
Tables: contain the data you can visualize. — 02transactions (check the radio button)
Now click the “Edit/Preview data”. You can already view the data in Quicksight. Now click “Save & Visualize” in top right corner.
Now you can play around and build some wonderful dashboards!