SSD state monitoring in Qsan arrays
Solid-state drives are no longer a novelty in data storage. SSDs are firmly established across IT equipment, from personal computers and laptops to servers and storage systems. Several generations of SSDs have come and gone, each improving on performance, reliability, and maximum capacity. Yet the question of monitoring an SSD's write endurance remains relevant.
Because of their physical structure, solid-state drives have a finite write endurance that is known in advance. On top of that, considerably more data is actually written to the SSD than the host sends to it (write amplification), especially when the drive is part of a RAID group, which brings the drive to that limit even sooner. This makes some users wary of adopting SSDs.
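The gap between what the host sends and what is actually written can be expressed as a simple ratio. The sketch below is only an illustration: the function name and both counters are hypothetical, and the example figures are invented rather than measured on any real drive or array.

```python
def write_amplification(host_tb: float, media_tb: float) -> float:
    """Ratio of data actually written to the drive(s) to data sent by the host.

    host_tb  -- terabytes the host asked to write (hypothetical counter)
    media_tb -- terabytes that really reached the flash (hypothetical counter)
    """
    if host_tb <= 0:
        raise ValueError("host_tb must be positive")
    return media_tb / host_tb


if __name__ == "__main__":
    # Invented numbers: the host wrote 10 TB, the media absorbed 25 TB.
    factor = write_amplification(host_tb=10.0, media_tb=25.0)
    print(f"write amplification factor: {factor:.2f}")
```

A factor above 1 means the write endurance is consumed faster than the host-side traffic alone would suggest.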
In fact, things are not so bad. The rated DWPD (drive writes per day) figure applies to the entire warranty period of the drive (usually 3-5 years). The resulting TBW (terabytes written) endurance is therefore quite large, so there is no need to fear wearing out an SSD in just a few months. Moreover, thanks to the high TBW values, in some cases the drives can temporarily be used more intensively than the manufacturer's rating assumes. None of this, however, removes the need to monitor the remaining write endurance of each individual SSD so that it can be replaced proactively when certain thresholds are reached.
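The relationship between DWPD and TBW is plain arithmetic, and a short sketch makes the scale visible. It assumes the commonly used conversion TBW = DWPD x capacity x 365 x warranty years (leap days ignored). The drive in the example (1.92 TB, 1 DWPD, 5-year warranty) is hypothetical and not taken from any datasheet.

```python
def tbw_from_dwpd(dwpd: float, capacity_tb: float, warranty_years: float) -> float:
    """Total terabytes that may be written over the warranty period."""
    return dwpd * capacity_tb * 365 * warranty_years


def days_to_exhaust(tbw: float, daily_writes_tb: float) -> float:
    """Days until the TBW budget is used up at a constant daily write volume."""
    if daily_writes_tb <= 0:
        raise ValueError("daily_writes_tb must be positive")
    return tbw / daily_writes_tb


if __name__ == "__main__":
    # Hypothetical drive: 1.92 TB, rated at 1 DWPD for 5 years.
    tbw = tbw_from_dwpd(dwpd=1.0, capacity_tb=1.92, warranty_years=5)
    print(f"rated endurance: {tbw:.0f} TB written")

    # Running the same drive at three times its rated daily load.
    days = days_to_exhaust(tbw, daily_writes_tb=3 * 1.92)
    print(f"at 3x rated load the budget lasts about {days:.0f} days")
```

Even at three times the rated load, the budget in this example lasts well over a year, which is why a temporary period of heavier use is usually tolerable.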
Each storage vendor implements this functionality in its own way, but most often it comes down to a simple drive status of healthy or faulty. Qsan, by contrast, provides a full visualization of current SSD activity parameters in its All Flash systems, in a separate module called QSLife. This module is an integral part of the new XEVO operating system, which all Qsan storage systems will run in the future.
For each SSD in the system, the current remaining life is displayed in an easy-to-read form. It is no secret that all modern SSDs keep their own count of the blocks written to them. Based on these values, the system calculates the drive's wear level relative to its rated endurance. The result is displayed as a percentage relative to a brand new SSD. Note also that wear is calculated not only for the time the drive has worked in the Qsan All Flash array, but for its entire lifetime, including any use in other systems.
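A remaining-life percentage of this kind can be approximated as follows. This is a minimal sketch under the assumption that wear is linear in the amount of data written; it is not QSLife's actual formula, and the function and parameter names are hypothetical.

```python
def remaining_life_percent(written_tb: float, rated_tbw: float) -> float:
    """Remaining life as a percentage of a brand new drive (100 = new).

    written_tb -- lifetime data written, as reported by the drive itself
    rated_tbw  -- rated endurance of the drive in terabytes written
    """
    if rated_tbw <= 0:
        raise ValueError("rated_tbw must be positive")
    used_fraction = written_tb / rated_tbw
    # Clamp at zero: a drive can outlive its rating, but not below 0 %.
    return max(0.0, 100.0 * (1.0 - used_fraction))


if __name__ == "__main__":
    # Hypothetical drive rated for 3504 TBW with 876 TB already written.
    print(f"{remaining_life_percent(876.0, 3504.0):.1f} % life remaining")
```

Because the input is the drive's own lifetime counter, the figure covers use in earlier systems as well, in line with the behaviour described above.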
Beyond this summary, further details are available, in particular the total amount of data written to the drive over its entire service life. For the time the drive has worked in the Qsan All Flash array, graphs of its read and write activity are also available. Statistics are collected in real time and can be viewed for any period going back up to one year.
Of course, the purpose of this functionality is not only to draw attractive graphs for the administrator's enjoyment, but also to analyze the state of the drives proactively and prevent future problems caused by wear. To that end, multiple thresholds can be set on an SSD's remaining life, each with a corresponding action to take as the drive's write endurance is used up.
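The idea of pairing thresholds with actions can be sketched in a few lines. The threshold values and action names below are invented for illustration and do not reflect the actual XEVO settings or its configuration format.

```python
from typing import List, Optional, Tuple

# Hypothetical policy: (remaining life in percent, action to take).
THRESHOLDS: List[Tuple[float, str]] = [
    (50.0, "log event"),
    (20.0, "email administrator"),
    (5.0, "schedule replacement"),
]


def action_for(remaining_pct: float,
               thresholds: List[Tuple[float, str]] = THRESHOLDS) -> Optional[str]:
    """Return the action of the most severe threshold that has been crossed."""
    crossed = [(limit, action) for limit, action in thresholds
               if remaining_pct <= limit]
    if not crossed:
        return None  # drive is healthier than every threshold
    # The lowest crossed limit is the most severe one.
    return min(crossed, key=lambda item: item[0])[1]


if __name__ == "__main__":
    for pct in (80.0, 45.0, 12.0, 3.0):
        print(f"{pct:5.1f} % remaining -> {action_for(pct)}")
```

Returning only the most severe action keeps the example short; a real policy might instead fire each action once as its threshold is first crossed.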
Qsan's other storage models (general-purpose rather than dedicated All Flash) do not offer a similar visual report on drives. This is understandable: the flagship has to differ from the mainstream somehow. Even so, the regular product line does perform similar monitoring. It does not collect usage and performance statistics, but the core function of tracking write endurance is present.
With continuous improvements in SSD manufacturing technology, concerns about reliability have somewhat subsided. Nevertheless, monitoring write endurance remains relevant. Properly configured monitoring lets the administrator predict SSD aging in advance based on actual current loads, and lets the company's management calculate TCO (total cost of ownership).
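Predicting aging from actual load can be as simple as a linear extrapolation of the lifetime write counter. The sketch below assumes a constant write rate between the first and last sample; the function name, the sample format, and the figures are all hypothetical.

```python
from typing import List, Optional, Tuple


def days_until_limit(samples: List[Tuple[float, float]],
                     rated_tbw: float) -> Optional[float]:
    """Estimate the days left until the rated TBW is reached.

    samples   -- (day_number, total_tb_written) pairs ordered by day
    rated_tbw -- rated endurance of the drive in terabytes written
    Returns None when no writes were observed, so no estimate is possible.
    """
    (first_day, first_tb), (last_day, last_tb) = samples[0], samples[-1]
    if last_day <= first_day:
        raise ValueError("samples must span a positive time interval")
    rate_tb_per_day = (last_tb - first_tb) / (last_day - first_day)
    if rate_tb_per_day <= 0:
        return None
    return max(0.0, (rated_tbw - last_tb) / rate_tb_per_day)


if __name__ == "__main__":
    # Invented history: 100 TB written on day 0, 160 TB on day 30.
    history = [(0.0, 100.0), (30.0, 160.0)]
    left = days_until_limit(history, rated_tbw=3504.0)
    print(f"estimated days until rated endurance is reached: {left:.0f}")
```

An estimate like this gives a replacement date per drive, which is the kind of input a TCO calculation needs.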