Amazon QuickSight Dashboard for S3 CSV Data using Redshift Spectrum / Glue Crawler
In this article I would like to write about creating a Amazon QuickSight Dashboard for the data in a csv file located in Amazon S3 sing Amazon Redshift Spectrum / Glue Crawler.
AWS S3 & GLUE CRAWLER/DATA CATALOG
In my last article I wrote about the .csv file I used in the S3 bucket and creating glue catalog and a crawler. The same steps can be followed here except the database name must not start with the number. Hence I used the database name — “redshift_spectrum”.
CREATING A NEW REDSHIFT CLUSTER
Create a new redshift cluster by selecting the free trial option for our task. Username and password have to be set up during this task
Now go to IAM and create a new role “00_redshift_spectrum” for our redshift spectrum to access the Glue data catalog and S3 bucket. Attach the following policies to the role (AmazonS3ReadOnlyAccess, AWSGlueConsoleFullAccess). Now attach the role to the redshift cluster. This can be done by selecting the cluster in the redshift mail dashboard then select Actions — Manage IAM Roles.
Now let’s create a spectrum schema to access the data from S3 using glue data catalog.
create external schema spectrum_schema
from data catalog
database ‘redshift_spectrum’
iam_role ‘arn:aws:iam::xxxxxxxxxxxx:role/00_redshift_spectrum’
create external database if not exists;select *
from spectrum_schema.quicksightreporting
“quicksightreporting” is the name of the table in “redshift_spectrum” database in data catalog that was created by glue crawler. Once the external schema has been created , the select statement can be ran and now we can see the data from .csv file from our s3 bucket.
AMAZON QUICKSIGHT
Now let’s create a quicksight dashboard. Open Quicksight from AWS Management console. Click on the “Datasets” on the left pane and then click “New Dataset” on the right top corner.
Create a Dataset (FROM NEW DATA SOURCES) — Redshift (Auto discovered). Now fill out the window and select your cluster from the down. Test validate once and click “Create data source”. Once this new dataset has been created, you can play around and build some wonderful dashboards!