Amazon Redshift Parallel Scaling Guide and Test Results


We use Amazon Redshift at Skyeng, including parallel scaling, so an article by Stephen Gromoll caught our interest. After the translation, we add a bit of our own experience from Skyeng engineer Daniyar Belkhodzhayev.

Amazon Redshift's architecture allows scaling by adding new nodes to the cluster, but provisioning for peak query load can lead to over-provisioning. Parallel scaling (Concurrency Scaling), in contrast to adding permanent nodes, increases computing power only when it is needed.

Amazon Redshift's parallel scaling gives Redshift clusters extra power to handle bursts of queries. It works by transparently routing queries to new "parallel" clusters in the background. Queries are routed based on the cluster configuration and WLM rules.

Parallel scaling pricing follows a free-credit model. Beyond the free credits, you pay for the time a parallel scaling cluster spends processing queries.

The author tested parallel scaling on one of his internal clusters. In this post he shares the test results and gives tips on getting started.

Cluster requirements

To use parallel scaling, your Amazon Redshift cluster must meet the following requirements:

– platform: EC2-VPC;
– node type: dc2.8xlarge, ds2.8xlarge, dc2.large or ds2.xlarge;
– number of nodes: 2 to 32 (single node clusters are not supported).
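As a minimal sketch, the requirements above can be expressed as a small check. This is a hypothetical helper for illustration, not part of any AWS SDK:

```python
# Hypothetical eligibility check mirroring the documented requirements;
# not an AWS API, just an illustration.
SUPPORTED_NODE_TYPES = {"dc2.8xlarge", "ds2.8xlarge", "dc2.large", "ds2.xlarge"}

def supports_concurrency_scaling(platform: str, node_type: str, node_count: int) -> bool:
    return (
        platform == "EC2-VPC"
        and node_type in SUPPORTED_NODE_TYPES
        and 2 <= node_count <= 32  # single-node clusters are not supported
    )
```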

Valid query types

Parallel scaling is not suitable for all query types. In its first version it handles only read queries that satisfy three conditions:

– the query is a read-only SELECT (support for more query types is planned);
– the query does not reference a table with an INTERLEAVED sort key;
– the query does not use Amazon Redshift Spectrum to access external tables.
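For illustration, the three conditions can be sketched as a naive check. The flags for interleaved tables and Spectrum usage are assumed inputs here, since detecting them requires catalog metadata that Redshift tracks internally:

```python
def is_scaling_eligible(sql: str, touches_interleaved_table: bool, uses_spectrum: bool) -> bool:
    # Hypothetical check mirroring the three conditions above; Redshift
    # makes this routing decision internally, this is only a sketch.
    is_read_only_select = sql.lstrip().lower().startswith("select")
    return is_read_only_select and not touches_interleaved_table and not uses_spectrum
```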

To be routed to a parallel scaling cluster, a query must first end up in a queue. In addition, queries eligible for the SQA (Short Query Acceleration) queue are not executed on parallel scaling clusters.

Queues and SQA require a properly configured Redshift Workload Management (WLM). We recommend optimizing your WLM first: it will reduce the need for parallel scaling. That matters, because parallel scaling is free only for a certain number of hours. AWS claims parallel scaling will be free for 97% of customers, which brings us to pricing.

The cost of parallel scaling

AWS offers a credit model for parallel scaling. Every active Amazon Redshift cluster accrues credits hourly, up to one hour of free parallel scaling credit per day.

You pay only when your use of parallel scaling clusters exceeds the credits you have accrued.

The cost is billed at the per-second on-demand rate for a parallel cluster used beyond the free credits. You are charged only while your queries are running, with a minimum charge of one minute each time a parallel scaling cluster is activated. The per-second on-demand rate is derived from Amazon Redshift's general pricing, i.e. it depends on the node type and the number of nodes in your cluster.
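A rough sketch of this billing model in Python; the rates and credit amounts are illustrative, and actual billing is of course done by AWS:

```python
def parallel_scaling_charge(activation_seconds, free_credit_seconds, per_second_rate):
    """Estimate the charge for parallel scaling usage.

    activation_seconds: seconds of query execution for each activation
    of a parallel scaling cluster; free_credit_seconds: accrued free
    credits; per_second_rate: the cluster's on-demand price per second.
    """
    # Each activation is billed for at least one minute.
    billed = sum(max(seconds, 60) for seconds in activation_seconds)
    billable = max(0, billed - free_credit_seconds)
    return billable * per_second_rate

# Two activations (30 s and 600 s) against one hour of free credit:
# 60 + 600 = 660 billed seconds, fully covered by the 3600 s of credit.
```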

Enabling parallel scaling

Parallel scaling is enabled per WLM queue. Go to the AWS Redshift console and select "Workload Management" in the left navigation menu. Then select your cluster's WLM parameter group in the drop-down menu.

You will see a new column called "Concurrency Scaling Mode" next to each queue. The default setting is "off". Click "Edit" to change the setting for each queue.


Parallel scaling works by routing eligible queries to new, dedicated clusters. The new clusters have the same size (node type and count) as the main cluster.

By default, one (1) cluster is used for parallel scaling, and up to ten (10) can be configured in total. The total number of clusters available for parallel scaling is set with the max_concurrency_scaling_clusters parameter; increasing its value provisions additional standby clusters.
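Outside the console, the same settings live in the cluster parameter group's wlm_json_configuration parameter, where each queue carries a "concurrency_scaling" field. A sketch of such a configuration follows; the queue routing rule and slot counts are example values:

```python
import json

# Example WLM configuration; the routing rule and slot counts are
# illustrative. "concurrency_scaling": "auto" enables parallel scaling
# for a queue, "off" disables it.
wlm_config = [
    {
        "query_group": ["reports"],     # hypothetical routing rule
        "query_concurrency": 5,         # number of slots
        "concurrency_scaling": "auto",  # parallel scaling on
    },
    {
        # default queue, parallel scaling disabled
        "query_concurrency": 5,
        "concurrency_scaling": "off",
    },
]

# Serialized, this is the value you would set for wlm_json_configuration.
wlm_json = json.dumps(wlm_config)
```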


AWS Redshift Console has a few additional graphs. The Max Configured Concurrency Scaling Clusters chart displays the max_concurrency_scaling_clusters value over time.

The number of active scaling clusters is displayed in the user interface in the "Concurrency Scaling Activity" section:

The "Queries" tab has a column indicating whether a query was executed on the main cluster or on a parallel scaling cluster:

Whether a particular query ran on the main cluster or on a parallel scaling cluster is recorded in stl_query.concurrency_scaling_status.

A value of 1 indicates the query ran on a parallel scaling cluster; other values indicate it ran on the main cluster.
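A quick way to tally this from SQL; here the statement is just built as a Python string, to be run with any client connected to the cluster:

```python
# Counts queries by where they ran, using the status column described
# above. stl_query and concurrency_scaling_status are Redshift system
# names; the output aliases are arbitrary.
SCALING_USAGE_SQL = """
SELECT CASE WHEN concurrency_scaling_status = 1
            THEN 'parallel scaling cluster'
            ELSE 'main cluster'
       END AS ran_on,
       COUNT(*) AS query_count
FROM stl_query
GROUP BY 1;
"""
```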


Information about parallel scaling is also stored in some other tables and views, for example SVCS_CONCURRENCY_SCALING_USAGE. There are also a number of catalog tables holding parallel scaling information.


The author enabled parallel scaling for one queue on an internal cluster at approximately 18:30 GMT on March 29, 2019. The max_concurrency_scaling_clusters parameter was changed to 3 at approximately 20:30 GMT the same day.

To simulate a query queue, the number of slots for this queue was reduced from 15 to 5.

Below is a dashboard chart showing the number of queries running and queuing after the slot count was reduced.

We can see that query wait time in the queue increased, with a maximum of more than 5 minutes.

Here is the relevant information from the AWS console about what happened during this time:

Redshift launched three (3) parallel scaling clusters, as configured. These clusters appear not to have been fully utilized, even though many queries in the cluster were queued.

The usage graph correlates with the scaling activity graph:

After a few hours the author checked the queue; it appears six queries had been executed with parallel scaling. Two queries were also spot-checked through the user interface. How these values behave when several parallel clusters are active at once was not checked.


Parallel scaling can reduce time spent in the query queue during peak loads.

The basic test showed that query queuing partially improved. However, parallel scaling by itself did not solve all concurrency problems.

This is due to the restrictions on the query types parallel scaling can handle. For example, the author has many tables with interleaved sort keys, and most of the workload consists of writes.

Although parallel scaling is not a universal fix for WLM configuration issues, the feature is simple and straightforward to use.

So the author recommends enabling it for your WLM queues. Start with a single parallel cluster and watch peak loads through the console to see whether the new clusters are fully utilized.

As AWS adds support for more query and table types, parallel scaling should gradually become more and more useful.

A comment from Daniyar Belkhodzhayev, an engineer at Skyeng

We at Skyeng also immediately noticed the possibility of parallel scaling.
The functionality is very attractive, especially since AWS estimates that most users won't even have to pay extra for it.

It so happened that in mid-April we had an unusual flurry of queries to our Redshift cluster. During this period we often resorted to Concurrency Scaling; at times an additional cluster ran 24 hours a day without stopping.

This did not completely solve the queuing problem, but it at least made the situation tolerable.

Our observations largely coincide with the impressions of the article's author.

We also noticed that, despite queries waiting in the queue, not all of them were immediately redirected to a parallel cluster. Apparently the parallel cluster takes time to start. As a result, during short-lived peak loads we still get small queues, and the corresponding alarms still have time to fire.

Having gotten rid of the abnormal April load, we moved, as AWS expected, into episodic use within the free credit allowance.
You can track parallel scaling costs in AWS Cost Explorer: select Service – Redshift, Usage Type – CS, for example USW2-CS:dc2.large.

