AWR: to what extent does the database work?
With this short post, I would like to dispel one misunderstanding related to the AWR analysis of databases running on Oracle Exadata. For almost 10 years, I have been constantly confronted with the question: What is the contribution of Exadata Software to productivity? Or with the use of newly formed words: to what extent does the work of this or that database go?
Often this correct question, in my opinion, is given the wrong answer with reference to the AWR statistics. It presents the system waits method, which interprets the response time as the sum of the processor time (DB CPU) and the waiting time of various classes.
With the advent of Exadata, AWR statistics have specific system expectations related to the work of Exadata Software. As a rule, the names of such expectations begin with the word “cell” (the cell is called the Exadata Storage server), of which the most common expectations are “cell smart table scan”, “cell multiblock physical read” and “cell single block physical read”.
In most cases, the share of such Exadata waits in the total response time is small, and therefore they do not even fall into the Top10 Foreground Events by Total Wait Time section (in this case, they should be looked for in the Foreground Wait Events section). It was with great difficulty that we found an example of daily AWR from our customers, in which Exadata expectations got into the Top10 section and in total amounted to about 5%:
Event | Waits | Total Wait Time (sec) | Avg Wait | % DB time | Wait class |
DB CPU | 115.2K | 70.4 | |||
SQL * Net more data from dblink | 670,196 | 5471.5 | 8.16ms | 3.3 | Network |
cell single block physical read | 5,661,452 | 3827.6 | 676.07us | 2.3 | User I / O |
Sync ASM rebalance | 4,350,012 | 3481.3 | 800.30us | 2.1 | Other |
cell multiblock physical read | 759,885 | 2252 | 2.96ms | 1.4 | User I / O |
direct path read | 374,368 | 1811.3 | 4.84ms | 1.1 | User I / O |
SQL * Net message from dblink | 7.983 | 1725 | 216.08ms | 1.1 | Network |
cell smart table scan | 1,007,520 | 1260.7 | 1.25ms | 0.8 | User I / O |
direct path read temp | 520,211 | 808.4 | 1.55ms | 0.5 | User I / O |
enq: TM – contention | 652 | 795.8 | 1220.55ms | 0.5 | Application |
The following conclusions are often drawn from such AWR statistics:
1. The contribution of Exadata magic to the database performance is not high – it does not exceed 5%, and the database “exadata” is bad.
2. If such a base is transferred from Exadata to the classic “server + array” architecture, then the performance will not change much. Because even if this array turns out to be three times slower than the Exadata storage system (which is hardly possible for modern All Flash arrays), then multiplying 5% by three we get an increase in the share of I / O expectations up to 15% – such a database will probably survive!
Both of these conclusions are inaccurate, moreover, they distort the understanding of the idea embodied in Exadata Software. Exadata doesn’t just provide fast I / O, it works in a fundamentally different way from the classic server + array architecture. If the work of the database is really “exadatted”, then the SQL logic is transferred to the storage system. Storage servers, thanks to a number of special mechanisms (primarily Exadata Storage Indexes, but not only), find the necessary data themselves and send the DB servers. They do this quite efficiently, so the share of typical Exadata expectations in the total response time is small.
How will this share change outside of Exadata? How will this affect the overall performance of the database? Testing will answer these questions best. For example, waiting for a “cell smart table scan” outside of Exadata can turn into such a heavy Table Full Scan that I / O will take all the response time and performance will degrade dramatically. That is why it is wrong, when analyzing AWR, to consider the total percentage of Exadata’s expectations as the contribution of its magic to performance, and even more so to use this percentage to predict performance outside of Exadata. To understand how much the work of the database is “exaggerated”, you need to study the AWR statistics of the “Instance Activity Stats” section (there are many statistics with meaningful names) and compare them with each other.
And in order to understand how the database will feel outside of Exadata, it is best to make a clone of the database from the backup on the target architecture and analyze the performance of this clone under load. As a rule, Exadata owners have such an opportunity.
Author: Alexey Struchenko, Head of the DB “Jet Infosystems”