On the issue of the stochastic nature of DBMS and problems with load testing in a cloud environment

Background to the study

This study examines the hypothesis that a DBMS is by nature a stochastic system rather than a deterministic one.

To test this hypothesis, and as part of preparing a methodology for statistical analysis of DBMS performance in a cloud environment, a series of experiments was started to determine how external and random infrastructure factors affect DBMS performance.

Testing tool and scenario

Testing uses a standard tool: the pgbench utility.

Test scenario and parameters

  • pgbench_init_param= --no-vacuum --quiet --foreign-keys --scale=100 -i test_pgbench

  • pgbench_param= --progress=60 --protocol=extended --report-per-command --jobs=1 --client=100 --time=14400 test_pgbench
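The two parameter strings map onto two pgbench invocations: one that initializes the data set and one that runs the load. The following is a minimal Python sketch of that mapping, assuming pgbench is on PATH and that connection settings come from the usual PG* environment variables. The helper names `build_command` and `run` are illustrative and not part of the original setup.

```python
import shlex
import subprocess

DB_NAME = "test_pgbench"

# Initialization: scale factor 100, foreign keys, no vacuum after loading.
PGBENCH_INIT_PARAM = "--no-vacuum --quiet --foreign-keys --scale=100 -i"

# Load run: 100 clients, 1 worker thread, 4 hours, one progress line per 60 s.
PGBENCH_PARAM = (
    "--progress=60 --protocol=extended --report-per-command "
    "--jobs=1 --client=100 --time=14400"
)


def build_command(params: str, db_name: str = DB_NAME) -> list[str]:
    """Turn a parameter string into an argv list suitable for subprocess.run."""
    return ["pgbench", *shlex.split(params), db_name]


def run(params: str, db_name: str = DB_NAME) -> subprocess.CompletedProcess:
    """Run pgbench with the given parameters and raise on a non-zero exit code."""
    return subprocess.run(build_command(params, db_name), check=True)
```

Calling `run(PGBENCH_INIT_PARAM)` followed by `run(PGBENCH_PARAM)` would reproduce the scenario; with `--time=14400` the load phase lasts four hours.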

The initial series of experiments consists of 4 measurements of the statistical indicators of DBMS state and performance, each covering 1 hour.
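Splitting one long run into hourly windows can be sketched as follows. The sketch assumes one performance sample per minute (matching `--progress=60`), and the set of indicators is illustrative; the original figures may report different statistics.

```python
import statistics

SAMPLES_PER_HOUR = 60  # assumption: one sample per minute


def hourly_summary(samples, samples_per_hour=SAMPLES_PER_HOUR):
    """Split a sample series into consecutive full hours and summarize each one.

    A trailing partial hour is dropped.
    """
    summaries = []
    last_start = len(samples) - samples_per_hour
    for start in range(0, last_start + 1, samples_per_hour):
        window = samples[start:start + samples_per_hour]
        summaries.append({
            "hour": start // samples_per_hour + 1,
            "min": min(window),
            "max": max(window),
            "mean": statistics.fmean(window),
            "median": statistics.median(window),
            "variance": statistics.pvariance(window),
        })
    return summaries
```

A four-hour run sampled once per minute yields four summaries, which can then be compared hour by hour.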

To reduce the impact of outliers in the performance indicators, median smoothing with a 10-minute period is applied.
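A minimal sketch of such smoothing is shown below. It assumes per-minute samples, so a 10-minute period corresponds to a window of 10 samples, and it uses a trailing window; whether the original analysis uses a trailing or a centered window is not stated.

```python
import statistics


def median_smooth(samples, window=10):
    """Trailing rolling median.

    Each output value is the median of the current sample and up to
    `window - 1` preceding samples; the first values use shorter windows.
    """
    if window < 1:
        raise ValueError("window must be >= 1")
    smoothed = []
    for i in range(len(samples)):
        start = max(0, i - window + 1)
        smoothed.append(statistics.median(samples[start:i + 1]))
    return smoothed
```

Unlike a moving average, the median discards a short spike entirely instead of spreading it across neighboring points.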

DBMS performance is calculated using the methodology described in "Correlation Analysis for Resolving DBMS Performance Incidents".

Observation results 1st hour

DBMS performance statistics

Fig. 1. Performance statistics: 1st hour

Probability distribution

Fig. 2. Probability distribution: 1st hour

Fig. 3. Probability distribution: 1st hour - graph

Correlation between wait events and DBMS performance

For simplicity, only events with a correlation coefficient > 0.5 and a percentage of observations > 50% are shown.
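The filter can be sketched as below. Two points are assumptions rather than details from the original: the threshold is applied to the absolute value of the Pearson coefficient, and "percentage of observations" is read as the share of samples in which the wait event was seen at least once. The function names are illustrative.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient; returns 0.0 if either series is constant."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / math.sqrt(var_x * var_y)


def significant_wait_events(performance, wait_events, min_abs_r=0.5, min_share=0.5):
    """Return {event: (r, share)} for events that pass both thresholds.

    `wait_events` maps an event name to its per-sample counts, aligned
    with the `performance` series.
    """
    selected = {}
    for name, counts in wait_events.items():
        # Share of samples in which the event occurred at least once.
        share = sum(1 for c in counts if c > 0) / len(counts)
        r = pearson(counts, performance)
        if abs(r) > min_abs_r and share > min_share:
            selected[name] = (r, share)
    return selected
```

Events that appear in only a few samples are excluded even if their coefficient is high, since a correlation computed from a handful of non-zero points is unreliable.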

Fig. 4. Correlation coefficient between wait events and DBMS performance

Observation results 2nd hour

DBMS performance statistics

Fig. 5. Performance statistics: 2nd hour

Probability distribution

Fig. 6. Probability distribution: 2nd hour

Fig. 7. Probability distribution: 2nd hour - graph

Correlation between wait events and DBMS performance

For simplicity, only events with a correlation coefficient > 0.5 and a percentage of observations > 50% are shown.

Fig. 8. Correlation coefficient between wait events and DBMS performance

Comparison with the result of the previous hour

  1. DBMS performance decreased.

  2. The statistical indicators changed slightly.

  3. The wait events with the largest absolute correlation coefficient remained virtually unchanged.

Observation results 3rd hour

DBMS performance statistics

Fig. 9. Performance statistics: 3rd hour

Probability distribution

Fig. 10. Probability distribution: 3rd hour

Fig. 11. Probability distribution: 3rd hour - graph

Correlation between wait events and DBMS performance

For simplicity, only events with a correlation coefficient > 0.5 and a percentage of observations > 50% are shown.

Fig. 12. Correlation coefficient between wait events and DBMS performance

Comparison with the result of the previous hour

  1. DBMS performance decreased.

  2. The dispersion of the DBMS performance indicators increased.

  3. The wait event with the largest absolute correlation coefficient is IO/DataFileImmediateSync, which was absent from the previous observations.

DataFileImmediateSync

Waiting for immediate synchronization of a relation data file to durable storage.

This event, which has a significant impact on DBMS performance, was apparently caused by a change in the state of the infrastructure.

Observation results 4th hour

DBMS performance statistics

Fig. 13. Performance statistics: 4th hour

Probability distribution

Fig. 14. Probability distribution: 4th hour

Fig. 15. Probability distribution: 4th hour - graph

Correlation between wait events and DBMS performance

For simplicity, only events with a correlation coefficient > 0.5 and a percentage of observations > 50% are shown.

Fig. 16. Correlation coefficient between wait events and DBMS performance

Comparison with the result of the previous hour

  1. DBMS performance increased.

  2. The dispersion of the DBMS performance indicators decreased.

  3. IO/DataFileImmediateSync has no significant correlation with performance.

  4. The IPC/BufferIO wait event is correlated with DBMS performance.

BufferIO

Waiting for buffer I/O to complete.

Preliminary results

  1. The observations showed a significant spread of performance indicators under the same load on the DBMS.

  2. The dispersion of DBMS performance varies over a fairly wide range.

  3. The wait events correlated with DBMS performance are generally not constant.

  4. A single load test cannot reliably show the impact of changes in DBMS configuration parameters, because the infrastructure affects DBMS performance unpredictably.

  5. Load testing and analysis of how changes in DBMS configuration parameters affect performance in a cloud infrastructure require a series of tests and statistical analysis of the results.

  6. The results of load testing are probabilistic in nature.
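One way to compare two series of test runs statistically is a permutation test on the difference of medians. This is a sketch of one possible technique, not the methodology used in the study; it assumes each run has already been reduced to a single performance value, and the names `baseline` and `candidate` are illustrative labels for two configurations.

```python
import random
import statistics


def permutation_test(baseline, candidate, n_permutations=10_000, seed=0):
    """Two-sided permutation test on the difference of medians.

    Returns (observed_difference, p_value), where the difference is
    median(candidate) - median(baseline).
    """
    rng = random.Random(seed)  # fixed seed keeps the result reproducible
    observed = statistics.median(candidate) - statistics.median(baseline)
    pooled = list(baseline) + list(candidate)
    n_base = len(baseline)
    extreme = 0
    for _ in range(n_permutations):
        # Randomly reassign the pooled values to the two groups.
        rng.shuffle(pooled)
        diff = statistics.median(pooled[n_base:]) - statistics.median(pooled[:n_base])
        if abs(diff) >= abs(observed):
            extreme += 1
    # Add-one correction keeps the estimate away from an impossible p = 0.
    p_value = (extreme + 1) / (n_permutations + 1)
    return observed, p_value
```

A small p-value suggests that the observed difference is unlikely to come from run-to-run variation alone. With only one run per configuration no such statement is possible, which is why a series of tests is needed.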
