Monitoring NetApp Volumes over SSH
Hello everyone, my name is Igor Sidorenko. Monitoring is one of the main areas of my work, as well as my hobby. I will talk about Zabbix and how to use it to monitor the information we need about NetApp volumes when SSH is the only access we have. If you are interested in monitoring and Zabbix, read on.
Initially, we monitored volumes by mounting them on a dedicated server that had a special template attached. The template picked up NFS mounts on the node and put them under monitoring, much like the file systems in the basic Linux template. Each mount had to be added to fstab and mounted by hand, so a lot of volumes were lost or forgotten.
Then a great idea came to my mind: we need to automate all this. There were several options:
- There are ready-made templates that work over SNMP, but we have no SNMP access.
- Getting the list of volumes and mounting them automatically on a node: you have to create a directory, add an fstab entry, and mount it, which is far too much hassle.
- There is a magnificent API, but in our version of ONTAP it is stripped down and does not provide the information we need.
- Somehow use SSH access to get the volumes and set them up for monitoring.
The choice fell on SSH agent…
Low Level Discovery (LLD)
First, we need to set up low-level discovery (LLD) that returns the names of our volumes, so that we can then pull out specific information for each volume. The raw data looks something like this (114 volumes at the time of writing):
set -unit B; volume show -state online
How could we do without a hack: let's write a one-line bash script that prints the volume names in JSON format (since this is an external check, the scripts live on the Zabbix server in the directory /usr/lib/zabbix/externalscripts):
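#!/usr/bin/bash
# Connection settings: fill these in for your environment.
SVM_NAME=""
SVM_ADDRESS=""
USERNAME=""
PASSWORD=""
# List online volumes over SSH, keep the rows of our SVM, take the volume name (column 2),
# emit one JSON object per volume, and let jq -s collect the objects into an array.
for i in $(sshpass -p "$PASSWORD" ssh -o StrictHostKeyChecking=no "$USERNAME@$SVM_ADDRESS" 'set -unit B; volume show -state online' | grep "$SVM_NAME" | awk '{print $2}'); do echo '{"volume_name":"'"$i"'"}'; done | jq -s '.'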
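The parsing step of the discovery script can be tried without touching the filer. The sketch below is only an illustration under stated assumptions: `volumes_to_json` is a hypothetical helper, the sample rows are made up in the usual `volume show` column layout (Vserver first, Volume second), and awk builds the JSON array instead of jq. Unlike the `grep` in the script, it matches the SVM name against the first column exactly.

```shell
#!/bin/sh
# Hypothetical helper: turn `volume show` rows of one SVM into a JSON array.
# $1 is the SVM name; the CLI output is read from stdin.
volumes_to_json() {
  awk -v svm="$1" '
    $1 == svm { names[n++] = $2 }   # column 1 = Vserver, column 2 = Volume
    END {
      printf "["
      for (i = 0; i < n; i++)
        printf "%s{\"volume_name\":\"%s\"}", (i ? "," : ""), names[i]
      print "]"
    }'
}

# Made-up sample rows; real output comes from the SSH command in the script.
SAMPLE='Vserver Volume Aggregate State Type Size Available Used%
DOMCLIC_SVM ackey_media NGHF_FAS2720_04 online RW 10737418240B 10736205824B 0%
DOMCLIC_SVM example_vol NGHF_FAS2720_04 online RW 10737418240B 10736205824B 0%'

printf '%s\n' "$SAMPLE" | volumes_to_json DOMCLIC_SVM
```

The helper does no JSON escaping, which is assumed to be acceptable here because the volume names in question consist of letters, digits, and underscores.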
Now we need to create a template and, based on the data received, create data items:
Data items
To create data items automatically, we need an item prototype:
We will use master items, each with several dependent items. For each volume, one master item is created, which runs a set of commands over SSH:
set -unit B; df -i -volume {#VOLUME_NAME}; volume show-space {#VOLUME_NAME}; statistics volume show -volume {#VOLUME_NAME}
The output looks like this:
Last login time: 9/15/2020 12:42:45
Filesystem iused ifree %iused Mounted on
/vol/ackey_media/ 96 311191 0% /ackey_media
Volume Name: ackey_media
Volume MSID: 2159592810
Volume DSID: 1317
Vserver UUID: 46a00e5d-c22d-11e8-b6ed-00a098d48e6d
Aggregate Name: NGHF_FAS2720_04
Aggregate UUID: 7ec21b4d-b4db-4f84-85e2-130750f9f8c3
Hostname: FAS2720_04
User Data: 20480B
User Data Percent: 0%
Deduplication: -
Deduplication Percent: -
Temporary Deduplication: -
Temporary Deduplication Percent: -
Filesystem Metadata: 1150976B
Filesystem Metadata Percent: 0%
SnapMirror Metadata: -
SnapMirror Metadata Percent: -
Tape Backup Metadata: -
Tape Backup Metadata Percent: -
Quota Metadata: -
Quota Metadata Percent: -
Inodes: 12288B
Inodes Percent: 0%
Inodes Upgrade: -
Inodes Upgrade Percent: -
Snapshot Reserve: -
Snapshot Reserve Percent: -
Snapshot Reserve Unusable: -
Snapshot Reserve Unusable Percent: -
Snapshot Spill: -
Snapshot Spill Percent: -
Performance Metadata: 28672B
Performance Metadata Percent: 0%
Total Used: 1212416B
Total Used Percent: 0%
Total Physical Used Size: 1212416B
Physical Used Percentage: 0%
Logical Used Size: 1212416B
Logical Used Percent: 0%
Logical Available: 10736205824B
DOMCLIC_SVM : 9/15/2020 12:42:51
                        *Total Read Write Other  Read Write Latency
Volume      Vserver        Ops  Ops   Ops   Ops (Bps) (Bps)    (us)
----------- ----------- ------ ---- ----- ----- ----- ----- -------
ackey_media DOMCLIC_SVM      0    0     0     0     0     0       0
From this output, we pick out the metrics we need.
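To get a feel for what each dependent item has to do, the same selection can be tried in the shell on a saved copy of the master item's output. This is a rough sketch, not the Zabbix preprocessing itself: `extract_metric` is a hypothetical helper, and the sample holds a few lines copied from the output above.

```shell
#!/bin/sh
# Hypothetical helper: print the value of one "Label: value" line.
# $1 is the exact label; the master item's text is read from stdin.
extract_metric() {
  awk -F': ' -v label="$1" '
    { sub(/^[ \t]+/, "", $1) }                           # tolerate indented labels
    $1 == label { sub(/B$/, "", $2); print $2; exit }    # drop the trailing unit B'
}

# A few lines taken from the master item's output shown above.
MASTER_OUTPUT='User Data: 20480B
Total Used: 1212416B
Total Used Percent: 0%
Logical Available: 10736205824B'

printf '%s\n' "$MASTER_OUTPUT" | extract_metric 'Total Used'   # prints 1212416
```

The label is matched exactly, so "Total Used" and "Total Used Percent" are not confused with each other.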
The magic of regular expressions
Originally I wanted to use JavaScript for preprocessing, but I never got the hang of it and could not make it work. So I settled on regular expressions, which I now use almost everywhere.
Number of inodes used
We will select information only about inodes for each volume in two stages:
First, all the information:
/vol/\w+/.*
Then, specifically by metrics:
(\d+)\s+(\d+)\s+(\d+)
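The two stages can be approximated in the shell to check them against a sample row before putting them into Zabbix. This is a sketch under assumptions: `inode_field` is a hypothetical helper, and because `grep -E` lacks the PCRE shorthand classes for word characters, digits, and whitespace, POSIX character classes stand in for them. The test row is the `df -i` line from the output above.

```shell
#!/bin/sh
# Hypothetical helper: $1 is the group number (1 = iused, 2 = ifree, 3 = %iused).
inode_field() {
  grep -oE '/vol/[[:alnum:]_]+/.*' |                          # stage 1: the df -i row
    grep -oE '[0-9]+[[:space:]]+[0-9]+[[:space:]]+[0-9]+' |   # stage 2: the three numbers
    head -n 1 |
    awk -v n="$1" '{ print $n }'                              # pick the Nth group
}

# The df -i row from the master item's output shown above.
DF_ROW='/vol/ackey_media/ 96 311191 0% /ackey_media'

printf '%s\n' "$DF_ROW" | inode_field 1   # prints 96
```

Calling `inode_field 2` or `inode_field 3` on the same row yields the free inode count and the used percentage.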
Output is the output formatting template. An \N escape sequence (where N = 1..9) is replaced by the Nth matching group. Control sequence