Wednesday, May 14, 2025

Find the best Amazon Redshift configuration for your workload using Redshift Test Drive


Amazon Redshift is a widely used, fully managed, petabyte-scale cloud data warehouse. Tens of thousands of customers use Amazon Redshift to process exabytes of data every day to power their analytics workloads. With the launch of Amazon Redshift Serverless and the various deployment options Amazon Redshift provides (such as instance types and cluster sizes), customers are looking for tools that help them determine the most optimal data warehouse configuration to support their Redshift workload.

In this post, we answer that question using Redshift Test Drive, an open-source tool that lets you evaluate which data warehouse configuration options are best suited for your workload. We created Redshift Test Drive from SimpleReplay and redshift-config-compare (see Compare different node types for your workload using Amazon Redshift for more details) to provide a single entry point for finding the best Amazon Redshift configuration for your workload. Redshift Test Drive also provides additional features such as a self-hosted analysis UI and the ability to replicate external objects that a Redshift workload may interact with.

Amazon Redshift RA3 with managed storage is the newest instance type for provisioned clusters. It allows you to scale and pay for compute and storage independently, as well as use advanced features such as cross-cluster data sharing and cross-Availability Zone cluster relocation. Many customers using previous-generation instance types want to upgrade their clusters to RA3 instance types. In this post, we show you how to use Redshift Test Drive to evaluate the performance of an RA3 cluster configuration for your Redshift workloads.

Solution overview

At its core, Redshift Test Drive replicates a workload by extracting queries from the source Redshift data warehouse logs (shown as Workload Extractor in the following figure) and replays the extracted workload against the target Redshift data warehouses (Workload Replayer).

If these workloads interact with external objects via Amazon Redshift Spectrum (such as the AWS Glue Data Catalog) or COPY commands, Redshift Test Drive offers an external object replicator utility to clone these objects to facilitate replay.

Workload replicator architecture

Redshift Test Drive uses this process of workload replication for two main functionalities: comparing configurations and comparing replays.

Compare Amazon Redshift configurations

Redshift Test Drive's ConfigCompare utility (based on the redshift-config-compare tool) helps you find the best Redshift data warehouse configuration by using your workload to run performance and functional tests on different configurations in parallel. This utility's automation starts by creating a new AWS CloudFormation stack based on this CloudFormation template. The CloudFormation stack creates an AWS Step Functions state machine, which internally uses AWS Lambda functions to trigger AWS Batch jobs to run the workload comparison across different Redshift instance types. These jobs extract the workload from the source Redshift data warehouse log location for the desired workload time window (as provided in the config parameters) and then replay the extracted workload against the list of target Redshift data warehouse configurations provided in the configuration file. When the replay is complete, the Step Functions state machine uploads the performance stats for the target configurations to an Amazon Simple Storage Service (Amazon S3) bucket and creates external schemas that can then be queried from any Redshift target to identify a target configuration that meets your performance requirements.

The following diagram illustrates the architecture of this utility.

Architecture of ConfigCompare utility

Compare replay performance

Redshift Test Drive also provides the ability to compare replay runs visually using a self-hosted UI tool. This tool reads the stats generated by the workload replicator (stored in Amazon S3) and helps compare the replay runs across key performance indicators such as longest running queries, error distribution, queries with the most deviation in latency across runs, and more.

The following diagram illustrates the architecture for the UI.

Replay Performance analysis UI architecture

Walkthrough overview

In this post, we provide a step-by-step walkthrough of using Redshift Test Drive to automatically replay your workload against different Amazon Redshift configurations with the ConfigCompare utility. We then use the self-hosted analysis UI utility to analyze the output of ConfigCompare and identify the optimal target warehouse configuration to migrate or upgrade to. The following diagram illustrates the workflow.

Walkthrough Steps

Prerequisites

The following prerequisites should be addressed before we run the ConfigCompare utility:

  • Enable audit logging and user-activity logging on your source cluster.
  • Take a snapshot of the source Redshift data warehouse.
  • Export your source parameter group and WLM configurations to Amazon S3. The parameter group can be exported using the AWS Command Line Interface (AWS CLI), for example from CloudShell, by running the following code:
    aws redshift describe-cluster-parameters --parameter-group-name <YOUR-SOURCE-CLUSTER-PARAMETER-GROUP-NAME> --output json > param_group_src.json
    
    aws s3 cp param_group_src.json s3://<YOUR-BUCKET-NAME>/param_group_src.json

  • The WLM configurations can be copied as JSON from the console, from where you can paste them into a file and upload it to Amazon S3. If you want to test any alternative WLM configurations (such as comparing manual vs. auto WLM or enabling concurrency scaling), you can create a separate file with that target configuration and upload it to Amazon S3 as well.
  • Identify the target configurations you want to test. If you're upgrading from DC2 to RA3 node types, refer to Upgrading to RA3 node types for recommendations.
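As a sketch of the alternative-WLM prerequisite above, the following Python snippet writes a hypothetical manual WLM configuration to a file that could then be uploaded to Amazon S3. The queue layout, user group name, concurrency, and memory splits are illustrative assumptions only, not recommendations:

```python
import json

# Hypothetical manual WLM configuration to test against the source's auto WLM.
# Queue layout, user groups, concurrency, and memory splits are illustrative only.
manual_wlm = [
    {
        "user_group": ["etl_user"],   # assumed ETL user group
        "query_concurrency": 5,
        "memory_percent_to_use": 60,
        "concurrency_scaling": "auto",
    },
    {
        "user_group": [],             # default queue for everything else
        "query_concurrency": 5,
        "memory_percent_to_use": 40,
    },
]

# Write the file; upload it with:
#   aws s3 cp wlm_manual.json s3://<YOUR-BUCKET-NAME>/wlm_manual.json
with open("wlm_manual.json", "w") as f:
    json.dump(manual_wlm, f, indent=2)
```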

For this walkthrough, let's assume you have an existing Redshift data warehouse configuration with a two-node dc2.8xlarge provisioned cluster. You want to validate whether upgrading your existing configuration to a decoupled architecture using the RA3 provisioned node type or Redshift Serverless would meet your workload cost/performance requirements.

The following table summarizes the Redshift data warehouse configurations that are evaluated as part of this test.

Warehouse Type        Number of Nodes/Base RPUs  Option
dc2.8xlarge           2                          default auto WLM
ra3.4xlarge           4                          default auto WLM
Redshift Serverless   64                         autoscaling
Redshift Serverless   128                        autoscaling
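The table above maps directly onto the CONFIGURATIONS section of the utility's input JSON. As a sketch under stated assumptions, a small hypothetical helper like `to_configurations` below can build those entries from the table rows (the field names follow the input JSON format; the helper itself is not part of Redshift Test Drive):

```python
def to_configurations(rows):
    """Build CONFIGURATIONS entries of the input JSON from (type, size, wlm_s3_path) rows.

    wlm_s3_path is ignored for Serverless targets, where size is the base RPU.
    """
    configs = []
    for warehouse_type, size, wlm_s3_path in rows:
        if warehouse_type == "Serverless":
            configs.append({"TYPE": "Serverless", "BASE_RPU": str(size)})
        else:
            configs.append({
                "TYPE": "Provisioned",
                "NODE_TYPE": warehouse_type,
                "NUMBER_OF_NODES": str(size),
                "WLM_CONFIG_S3_PATH": wlm_s3_path,
            })
    return configs

# Rows from the comparison table above.
rows = [
    ("dc2.8xlarge", 2, "s3://nodeconfig-artifacts/wlm.json"),
    ("ra3.4xlarge", 4, "s3://nodeconfig-artifacts/wlm.json"),
    ("Serverless", 64, None),
    ("Serverless", 128, None),
]
```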

Run the ConfigCompare utility

Before you run the utility, customize the details of the workload to replay, including the time period and the target warehouse configurations to test, in a JSON file. Upload this file to Amazon S3 and copy the S3 URI path to use as an input parameter for the CloudFormation template that deploys the resources for the remaining orchestration.

You can read more about the individual components and inputs of the JSON file in the Readme.

For our use case, we use the following JSON file as an input to the utility:

{
   "SNAPSHOT_ID": "redshift-cluster-manual-snapshot",
   "SNAPSHOT_ACCOUNT_ID": "123456789012",

   "PARAMETER_GROUP_CONFIG_S3_PATH": "s3://nodeconfig-artifacts/pg_config.json",

   "DDL_AND_COPY_SCRIPT_S3_PATH": "N/A",
   "SQL_SCRIPT_S3_PATH": "N/A",
   "NUMBER_OF_PARALLEL_SESSIONS_LIST": "N/A",
   "SIMPLE_REPLAY_LOG_LOCATION": "s3://redshift-logging-xxxxxxxx/RSLogs/",
   "SIMPLE_REPLAY_EXTRACT_START_TIME": "2023-01-28T15:45:00+00:00",
   "SIMPLE_REPLAY_EXTRACT_END_TIME": "2023-01-28T16:15:00+00:00",

   "SIMPLE_REPLAY_EXTRACT_OVERWRITE_S3_PATH": "N/A",
   "SIMPLE_REPLAY_OVERWRITE_S3_PATH": "N/A",

   "SIMPLE_REPLAY_UNLOAD_STATEMENTS": "true",

   "AUTO_PAUSE": true,
   "DATABASE_NAME": "database_name",

   "CONFIGURATIONS": [
      {
         "TYPE": "Provisioned",
         "NODE_TYPE": "dc2.8xlarge",
         "NUMBER_OF_NODES": "6",
         "WLM_CONFIG_S3_PATH": "s3://nodeconfig-artifacts/wlm.json"
      },
      {
         "TYPE": "Provisioned",
         "NODE_TYPE": "ra3.4xlarge",
         "NUMBER_OF_NODES": "12",
         "WLM_CONFIG_S3_PATH": "s3://nodeconfig-artifacts/wlm.json"
      },
      {
         "TYPE": "Serverless",
         "BASE_RPU": "128"
      },
      {
         "TYPE": "Serverless",
         "BASE_RPU": "64"
      }
   ]
}

The utility deploys all of the data warehouse configurations included in the CONFIGURATIONS section of the JSON file. A replica of the source configuration is also included, to be used as a baseline of the existing workload performance.

After this file is fully configured and uploaded to Amazon S3, navigate to the AWS CloudFormation console, create a new stack based on this CloudFormation template, and specify the relevant parameters. For more details on the individual parameters, refer to the GitHub repo. The following screenshot shows the parameters used for this walkthrough.
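If you prefer scripting over the console, stack creation can also be sketched with boto3. The parameter keys below are placeholders, not the template's actual parameter names (check the GitHub repo for those), and the client is passed in rather than constructed here:

```python
def launch_config_compare_stack(cfn, stack_name, template_url, parameters):
    """Create the ConfigCompare CloudFormation stack.

    cfn is a boto3 CloudFormation client (boto3.client("cloudformation")).
    parameters maps template parameter names to values; the exact names must
    match the template, so treat any keys you pass as assumptions to verify.
    """
    response = cfn.create_stack(
        StackName=stack_name,
        TemplateURL=template_url,
        Parameters=[
            {"ParameterKey": key, "ParameterValue": value}
            for key, value in parameters.items()
        ],
        Capabilities=["CAPABILITY_NAMED_IAM"],  # the template creates IAM roles
    )
    return response["StackId"]
```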

Configuration parameters for the CloudFormation template

After this is updated, proceed with the remaining steps on the AWS CloudFormation console to launch the new stack.

When the stack is fully created, select the stack and open the Resources tab. Here, you can search for the term StepFunctions and choose the link next to the RedshiftConfigTestingStepFunction physical ID to open the Step Functions state machine that runs the utility.

Searching for ConfigTestingStepFunction

On the Step Functions page that opens, choose Start execution. Leave the default values and choose Start execution to trigger the run. Monitor the progress of the state machine's run on the graph view of the page. The full run will take approximately the same amount of time as the time window that was specified in the JSON configuration file.
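Because the run can take as long as the extracted workload window, you may want to poll for completion instead of watching the console. A minimal sketch, with the boto3 Step Functions client passed in:

```python
import time

def wait_for_execution(sfn, execution_arn, poll_seconds=60):
    """Poll a Step Functions execution until it leaves RUNNING; return its final status.

    sfn is a boto3 Step Functions client (boto3.client("stepfunctions")).
    """
    while True:
        status = sfn.describe_execution(executionArn=execution_arn)["status"]
        if status != "RUNNING":
            return status  # SUCCEEDED, FAILED, TIMED_OUT, or ABORTED
        time.sleep(poll_seconds)
```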

StepFunction Execution example

When the status of the run changes from Running to Succeeded, the run is complete.

Analyze the results

When the Step Functions state machine run is complete, the performance metrics are uploaded to the S3 bucket created by the CloudFormation template at the beginning. To analyze the performance of the workload across the different configurations, you can use the self-hosted UI tool that comes with Redshift Test Drive. Set up this tool for your workload by following the instructions provided in this Readme.

After you point the UI to the S3 location that has the stats from the ConfigCompare run, the Replays section will populate with the analysis for the replays found in the input S3 location. Select the target configurations you want to compare and choose Analysis to navigate to the comparisons page.

AnalysisUI example list of replays

You can use the Filter Results section to choose which query types, users, and timeframe to compare, and the Analysis section will expand to provide an analysis of all the selected replays. Here you can see a comparison of the SELECT queries run by the ad hoc user of the replay.

AnalysisUI filter results

The following screenshot shows an example of the analysis of a replay. These results show the distribution of queries completed over the full run for a given user and query type, allowing us to identify periods of high and low activity. We can also see runtimes of these queries, aggregated as percentiles, average, and standard deviation. For example, the P50 value indicates that 50% of queries ran within 26.564 seconds. The parameters used to filter for specific users, query types, and runtimes can be dynamically updated so that the results and comparisons can be investigated comprehensively against the specific performance requirements of each individual use case.
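To make the aggregation concrete, here is a small sketch of how runtimes might be summarized into these statistics. It uses a nearest-rank percentile, which may differ slightly from the interpolation method the UI actually applies:

```python
import statistics

def runtime_summary(runtimes_seconds):
    """Summarize query runtimes as percentiles, average, and standard deviation,
    mirroring the kind of stats shown in the analysis UI."""
    ordered = sorted(runtimes_seconds)

    def pct(p):
        # Nearest-rank percentile: the runtime below which p% of queries completed.
        return ordered[max(0, int(round(p / 100 * len(ordered))) - 1)]

    return {
        "p50": pct(50),
        "p95": pct(95),
        "avg": statistics.mean(ordered),
        "stdev": statistics.stdev(ordered) if len(ordered) > 1 else 0.0,
    }
```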

AnalysisUI compare throughput example

Troubleshooting

As shown in the solution architecture, the main moving parts in the ConfigCompare automation are AWS CloudFormation, Step Functions (internally using Lambda), and AWS Batch.

If any resource in the CloudFormation stack fails to deploy, we recommend troubleshooting the issue based on the error shown on the AWS CloudFormation console.

To troubleshoot errors with the Step Functions state machine, locate the Amazon CloudWatch logs for a step by navigating to the state machine's latest run on the Step Functions console and choosing CloudWatch Logs for the failed Step Functions step. After resolving the error, you can restart the state machine by choosing New execution.

Troubleshooting Step Function

For AWS Batch errors, locate the AWS Batch logs by navigating to the AWS CloudFormation console and choosing the Resources tab in the CloudFormation stack. On this tab, search for LogGroup to find the AWS Batch run logs.

Troubleshooting CloudWatch logs

For more information about common errors and their solutions, refer to the Test Drive Readme.

Clean up

When you have completed the evaluation, we recommend manually deleting the deployed Redshift warehouses to avoid any on-demand charges that could accrue. After this, you can delete the CloudFormation stack to clean up the other resources.
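The cleanup can also be scripted. This sketch deletes provisioned replay clusters (skipping final snapshots) and then the stack; Serverless targets would need the separate redshift-serverless client, which is omitted here:

```python
def clean_up(cfn, redshift, stack_name, cluster_ids):
    """Delete provisioned replay target clusters, then the CloudFormation stack.

    cfn and redshift are boto3 clients ("cloudformation" and "redshift").
    Serverless workgroups are not handled here; they would need the
    "redshift-serverless" client instead.
    """
    for cluster_id in cluster_ids:
        redshift.delete_cluster(
            ClusterIdentifier=cluster_id,
            SkipFinalClusterSnapshot=True,  # don't leave a final snapshot behind
        )
    cfn.delete_stack(StackName=stack_name)
```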

Limitations

Some of the limitations of the WorkloadReplicator (the core utility supporting the ConfigCompare tool) are outlined in the Readme.

Conclusion

In this post, we demonstrated the process of finding the right Redshift data warehouse configuration using Redshift Test Drive. The utility offers an easy-to-use tool to replicate the workload of your choice against customizable data warehouse configurations. It also provides a self-hosted analysis UI to help you dive deeper into the stats generated during the replication process.

Get started with Test Drive today by following the instructions provided in the Readme. For an in-depth overview of the config compare automation, refer to Compare different node types for your workload using Amazon Redshift. If you're migrating from DC2 or DS2 node types to RA3, refer to our recommendations on node count and type as a benchmark.


About the Authors

Sathiish Kumar is a Software Development Manager at Amazon Redshift and has worked on building end-to-end applications using different database and technology solutions over the last 10 years. He is passionate about helping his customers find the fastest and most optimized solution to their problems by leveraging open-source technologies.

Julia Beck is an Analytics Specialist Solutions Architect at AWS. She helps customers validate analytics solutions by architecting proof of concept workloads designed to meet their specific needs.

Ranjan Burman is an Analytics Specialist Solutions Architect at AWS. He specializes in Amazon Redshift and helps customers build scalable analytical solutions. He has more than 16 years of experience in different database and data warehousing technologies. He is passionate about automating and solving customer problems with cloud solutions.
