Learn how to optimize Amazon Redshift queries using a compound sort key. Discover which query patterns will see the biggest performance gains. Essential knowledge for the AWS Certified Data Analytics – Specialty (DAS-C01) exam.
Question
A company stores employee data in Amazon Redshift. A table named Employee uses columns named Region ID, Department ID, and Role ID as a compound sort key.
Which queries will MOST increase the speed of a query by using the compound sort key of the table? (Choose two.)
A. SELECT * FROM Employee WHERE Region ID = 'North America';
B. SELECT * FROM Employee WHERE Region ID = 'North America' AND Department ID = 20;
C. SELECT * FROM Employee WHERE Department ID = 20 AND Region ID = 'North America';
D. SELECT * FROM Employee WHERE Role ID = 50;
E. SELECT * FROM Employee WHERE Region ID = 'North America' AND Role ID = 50;
Answer
B. SELECT * FROM Employee WHERE Region ID = 'North America' AND Department ID = 20;
E. SELECT * FROM Employee WHERE Region ID = 'North America' AND Role ID = 50;
Explanation
A compound sort key in Amazon Redshift allows for efficient filtering and sorting of data when queries use a prefix of the sort key columns in the correct order.
In this scenario, the Employee table has a compound sort key consisting of (Region ID, Department ID, Role ID) in that order. To best leverage this sort key for fast query performance, queries should include filter predicates on a prefix of those columns, in the same left-to-right order.
Looking at the query options:
A only filters on Region ID. While this will use the sort key, queries B and E are more specific and will narrow down the data further.
B filters on both Region ID and Department ID, in the correct order matching the sort key. This will be very efficient.
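C applies the same two filters as B, with the predicates written in the opposite order. The answer key does not select it, but note that the order in which predicates appear in a WHERE clause does not change how Redshift uses the sort key, so in practice C would be expected to perform like B.
D filters only on Role ID, the last column in the sort key. Because rows are ordered by Region ID and Department ID first, matching Role ID values are scattered throughout the table, so the sort key offers little or no help and Redshift has to scan most or all of the table.
E filters on the leading sort key column, Region ID, together with Role ID. The two columns are not a contiguous prefix because Department ID is skipped, but the Region ID predicate still lets Redshift skip every block that falls outside the matching region, and the Role ID predicate narrows the result further.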
Therefore, queries B and E will benefit the most from the compound sort key and see the biggest performance gains compared to the other options. The key is to include filter predicates on a prefix of the sort key columns in the same order they are defined in the table.
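The block-skipping behaviour described above can be illustrated with a small toy model. The sketch below is not Redshift itself: it assumes a made-up Employee table (4 regions, 10 departments, 10 roles, one row per combination), hypothetical column names `region_id`, `department_id`, and `role_id`, and an artificially tiny block size of 5 rows. It sorts the rows by the compound key, records a min/max zone map per block, and counts how many blocks each query would still have to read.

```python
from itertools import product

# Hypothetical data: every name and value here is an assumption for illustration.
REGIONS = ["Asia", "Europe", "North America", "South America"]
COLUMNS = ("region_id", "department_id", "role_id")
BLOCK_ROWS = 5  # toy block size; real Redshift blocks hold far more values

# One row per (region, department, role) combination: 4 * 10 * 10 = 400 rows.
ROWS = list(product(REGIONS, range(10, 110, 10), range(10, 110, 10)))


def build_zone_maps(rows, block_rows=BLOCK_ROWS):
    """Sort rows by the compound key and record (min, max) per column per block."""
    ordered = sorted(rows)  # tuple ordering mimics a compound sort key
    zone_maps = []
    for start in range(0, len(ordered), block_rows):
        block = ordered[start:start + block_rows]
        zone_maps.append({
            name: (min(r[i] for r in block), max(r[i] for r in block))
            for i, name in enumerate(COLUMNS)
        })
    return zone_maps


def blocks_to_read(zone_maps, predicates):
    """Count blocks whose min/max ranges could contain every filtered value."""
    return sum(
        all(zm[col][0] <= value <= zm[col][1] for col, value in predicates.items())
        for zm in zone_maps
    )


# Equality filters for the five answer options.
QUERIES = {
    "A": {"region_id": "North America"},
    "B": {"region_id": "North America", "department_id": 20},
    "C": {"department_id": 20, "region_id": "North America"},
    "D": {"role_id": 50},
    "E": {"region_id": "North America", "role_id": 50},
}

ZONE_MAPS = build_zone_maps(ROWS)
counts = {label: blocks_to_read(ZONE_MAPS, preds) for label, preds in QUERIES.items()}

for label, n in counts.items():
    print(f"{label}: reads {n} of {len(ZONE_MAPS)} blocks")
```

With these made-up values the sketch reports 2 of 80 blocks for B and C, 10 for E, 20 for A, and 40 for D. B and C read the same blocks because the textual order of predicates plays no part in the min/max check. The exact counts depend entirely on the assumed data distribution and block size, so treat them as an illustration of the mechanism rather than a prediction of real query times.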