Monitoring Legacy Virtual Machines and Their Applications – The “Kubernetes Method”

Kubernetes has become the de facto standard in containerized application development because it provides a huge set of out-of-the-box features that help developers build scalable and fault-tolerant systems.

Everything looks great if you are developing something from scratch, but everyone knows that for most companies this is not the case! Over time, many legacy systems have become giant monolithic monsters that run not on containers, but on virtual machines (VMs). Refactoring such systems is very difficult for various reasons:

  • Technical reasons (for example, the application depends on outdated operating systems or kernels)

  • Business reasons (for example, time to market or the cost of the conversion)

  • Difficulties with vendors (for example, the vendor does not provide a containerized version of its solution)

  • Or, hard as it may be to believe, some new applications are actually designed to run on VMs instead of containers

This should not prevent you from using Kubernetes (K8s): KubeVirt is the right tool to bring these VMs into the world of K8s. KubeVirt is a VM management add-on for K8s whose goal is to provide a common foundation for virtualization solutions on top of K8s. With KubeVirt, you can manage a VM as a K8s resource, much like a Pod: you can declare, start, stop, delete, scale and… control it, in the same way as everything else in K8s.

This post will not cover KubeVirt itself, but rather how to monitor VMs in the same way you monitor containers in K8s. KubeVirt has a great user guide if you want to learn more about it.

So what is the standard for monitoring applications inside Kubernetes?

If you’ve been running Kubernetes long enough, you’ll notice that Kubernetes itself can collect some metrics about the resources consumed by pods, such as CPU and memory usage. This is often not enough, and applications are usually designed to provide additional metrics to help ensure that they perform as expected.
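
If you just want to see those built-in resource metrics, a quick sketch (assuming the metrics API, e.g. metrics-server, is enabled in your cluster):

$ kubectl top pods
$ kubectl top nodes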

If we move a VM to K8s, we face the same problem: CPU and memory alone are not enough to determine whether it is working correctly. We usually need to check many other metrics, such as disk and swap usage, and since VMs aren’t ephemeral like containers, we also need to keep track of whether the VM is running at all.

Prometheus is a well-known monitoring and alerting solution with many convenient K8s integrations, and it has become the most popular option when choosing how to monitor containerized applications. It is mainly used together with Prometheus-Operator, which provides additional components that make it much easier to set up, and that’s what we’ll be using in this tutorial.

The Prometheus community has also developed node-exporter, which is mainly used to monitor K8s nodes; it will also be used to monitor our KubeVirt VMs.

So let’s get started!

Environment

This tutorial will use the following set of tools:

  • Helm v3 – to deploy Prometheus-Operator.

  • minikube – will provide us with a K8s cluster; you can choose any other K8s provider (a minimal start command is sketched after this list).

  • kubectl – to deploy various K8s resources.

  • virtctl – for interacting with the KubeVirt VM; it can be downloaded from the KubeVirt repository.
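
If you don’t have a cluster yet, a minimal minikube setup could look like the sketch below. The resource sizes are only an assumption that leaves room for the operators and one small VM; whether nested virtualization is available depends on your driver (KubeVirt can fall back to software emulation otherwise).

$ minikube start --cpus 4 --memory 8g
$ minikube status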

Deploy Prometheus Operator

Once you have a K8s cluster, with minikube or any other provider, the first step is to deploy the Prometheus Operator. The reason is that the KubeVirt CR, once installed on the cluster, detects whether the ServiceMonitor CRD is already present. If it is, KubeVirt creates ServiceMonitors configured to monitor all KubeVirt components (virt-controller, virt-api and virt-handler) out of the box.

Although monitoring KubeVirt itself is not covered in this guide, it is good practice to deploy the Prometheus Operator before installing KubeVirt.
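
If you want to verify that the ServiceMonitor CRD is in place before installing KubeVirt, a quick check looks like this:

$ kubectl get crd servicemonitors.monitoring.coreos.com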

In order to deploy the Prometheus Operator, you need to first create its namespace, e.g., monitoring:

$ kubectl create ns monitoring

Then deploy the operator in the new namespace:

$ helm fetch stable/prometheus-operator
$ tar xzf prometheus-operator*.tgz
$ cd prometheus-operator/ && helm install -n monitoring -f values.yaml kubevirt-prometheus stable/prometheus-operator
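
Note that the stable chart repository has since been deprecated and archived; if helm fetch cannot find the chart, adding the archived repository first may help (a sketch, assuming you want to keep using this legacy chart rather than its successor, kube-prometheus-stack):

$ helm repo add stable https://charts.helm.sh/stable
$ helm repo update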

After everything is deployed, you can remove everything that was downloaded using helm:

$ cd ..
$ rm -rf prometheus-operator*

Keep in mind the release name we used here: kubevirt-prometheus. It will be needed later as the release label when we declare our ServiceMonitor.

Deploy KubeVirt Operators and KubeVirt CustomResources

So, the next step is to deploy KubeVirt itself. Let’s start with its operator.

Grab the latest version, then use kubectl create to deploy the manifest directly from GitHub:

$ export KUBEVIRT_VERSION=$(curl -s https://api.github.com/repos/kubevirt/kubevirt/releases | grep tag_name | grep -v -- - | sort -V | tail -1 | awk -F':' '{print $2}' | sed 's/,//' | xargs)
$ kubectl create -f https://github.com/kubevirt/kubevirt/releases/download/${KUBEVIRT_VERSION}/kubevirt-operator.yaml

Before deploying KubeVirt CR, ensure that all kubevirt-operator replicas are ready; this can be done with:

$ kubectl rollout status -n kubevirt deployment virt-operator

After that, we can deploy KubeVirt and similarly wait for all of its components to be ready:

$ kubectl create -f https://github.com/kubevirt/kubevirt/releases/download/${KUBEVIRT_VERSION}/kubevirt-cr.yaml
$ kubectl rollout status -n kubevirt deployment virt-api
$ kubectl rollout status -n kubevirt deployment virt-controller
$ kubectl rollout status -n kubevirt daemonset virt-handler
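
Optionally, you can also wait for the KubeVirt CR itself to report that it is fully available (a sketch using its readiness condition):

$ kubectl -n kubevirt wait kv kubevirt --for condition=Available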

If we want to monitor VMs that can be restarted, our node-exporter installation has to survive reboots, which means the VM needs persistent storage. CDI (Containerized Data Importer) is the component responsible for importing disk images into persistent storage, so we’ll deploy its operator and custom resource as well. As before, let’s wait until the necessary components are ready before moving on:

$ export CDI_VERSION=$(curl -s https://github.com/kubevirt/containerized-data-importer/releases/latest | grep -o "v[0-9].[0-9]*.[0-9]*")
$ kubectl create -f https://github.com/kubevirt/containerized-data-importer/releases/download/$CDI_VERSION/cdi-operator.yaml
$ kubectl rollout status -n cdi deployment cdi-operator
$ kubectl create -f https://github.com/kubevirt/containerized-data-importer/releases/download/$CDI_VERSION/cdi-cr.yaml
$ kubectl rollout status -n cdi deployment cdi-apiserver
$ kubectl rollout status -n cdi deployment cdi-uploadproxy
$ kubectl rollout status -n cdi deployment cdi-deployment

Deploying a VM with persistent storage

Now we have everything we need. Let’s set up the VM.

Let’s start with PersistentVolumes, which are needed by the CDI DataVolume resources. Since I’m using minikube without a dynamic storage provisioner, I’ll create two PVs that reference the PVCs which will claim them. Pay attention to the claimRef in each of the PVs.

apiVersion: v1
kind: PersistentVolume
metadata:
  name: example-volume
spec:
  storageClassName: ""
  claimRef: 
    namespace: default
    name: cirros-dv
  accessModes:
    - ReadWriteOnce
  capacity:
    storage: 2Gi
  hostPath:
    path: /data/example-volume/
---
apiVersion: v1
kind: PersistentVolume
metadata:
  name: example-volume-scratch
spec:
  storageClassName: ""
  claimRef: 
    namespace: default
    name: cirros-dv-scratch
  accessModes:
    - ReadWriteOnce
  capacity:
    storage: 2Gi
  hostPath:
    path: /data/example-volume-scratch/

You can save the above content as a YAML manifest and create the PVs with kubectl apply -f your-pv-manifest.yaml.
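
One assumption worth calling out: the hostPath directories referenced above must exist on the minikube node. Depending on your container runtime they may be created automatically on first use; if not, you can create them manually, for example:

$ minikube ssh -- sudo mkdir -p /data/example-volume /data/example-volume-scratch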

With persistent storage in place, we can create our VM with the following manifest:

apiVersion: kubevirt.io/v1alpha3
kind: VirtualMachine
metadata:
  name: monitorable-vm
spec:
  running: true
  template:
    metadata: 
      name: monitorable-vm
      labels: 
        prometheus.kubevirt.io: "node-exporter"
    spec:
      domain:
        resources:
          requests:
            memory: 1024Mi
        devices:
          disks:
          - disk:
              bus: virtio
            name: my-data-volume
      volumes:
      - dataVolume:
          name: cirros-dv
        name: my-data-volume
  dataVolumeTemplates: 
  - metadata:
      name: "cirros-dv"
    spec:
      source:
          http: 
             url: "https://download.cirros-cloud.net/0.4.0/cirros-0.4.0-x86_64-disk.img"
      pvc:
        storageClassName: ""
        accessModes:
          - ReadWriteOnce
        resources:
          requests:
            storage: "2Gi"

This is a pretty big manifest with a lot of information, so let’s break it down. We use the API added by KubeVirt to create a new VirtualMachine resource named monitorable-vm:

apiVersion: kubevirt.io/v1alpha3
kind: VirtualMachine
metadata:
  name: monitorable-vm

In the VM specification, we tell KubeVirt to autostart the VM after it is created:

spec:
  running: true

The template parameter is used to create the VirtualMachineInstance (or VMI for short), which is a running VM:

template:
    metadata: 
      name: monitorable-vm
      labels: 
        prometheus.kubevirt.io: "node-exporter"
    spec:
      domain:
        resources:
          requests:
            memory: 1024Mi
        devices:
          disks:
          - disk:
              bus: virtio
            name: my-data-volume
        machine:
          type: ""
      volumes:
      - dataVolume:
          name: cirros-dv
        name: my-data-volume

The way a VirtualMachineInstance is declared is very similar to how Pods are declared: we specify resources, disks and volumes. But pay attention to the parameters, as some of them differ slightly from the Pod specification.

It’s important to note here that we’ve labeled our VMI with prometheus.kubevirt.io: "node-exporter". This label will be used later by our Service to identify that we want to monitor this particular VM.

The DataVolume is created from the dataVolumeTemplates entry that we define below:

dataVolumeTemplates: 
  - metadata:
      name: "cirros-dv"
    spec:
      source:
          http: 
             url: "https://download.cirros-cloud.net/0.4.0/cirros-0.4.0-x86_64-disk.img"
      pvc:
        storageClassName: ""
        accessModes:
          - ReadWriteOnce
        resources:
          requests:
            storage: "2Gi"

A DataVolume is an abstraction over a PersistentVolumeClaim that imports the operating system image into this persistent storage.

Pay attention to the DataVolume name, which must match the one we referenced in the VMI template. Also note that we are pulling the image directly from the internet, but there are other methods as well. One last note: this DataVolume will create two PVCs with identical characteristics; both will be named after the DataVolume, but one of them will get the -scratch suffix. Remember that these names must match the claimRef we added to the previously created PVs.

Now create your YAML manifest and apply it with:

kubectl create -f your-vm-manifest.yaml

The PVCs will be created, then CDI will spin up a pod named importer-cirros-dv to import our image into the PVC. Once the import finishes, our VirtualMachine resource is created. Since we told KubeVirt to start the VM right after creation, you will also see a VirtualMachineInstance resource. Finally, for every VirtualMachineInstance, KubeVirt creates a component called virt-launcher. It is nothing but a pod that runs the virtualization process, so… we’re still running containers: the VM runs inside the virt-launcher container.

If you want to check that everything is in place, you should have something like this:

$ kubectl get vm,vmi,pods
NAME                                        AGE   VOLUME
virtualmachine.kubevirt.io/monitorable-vm   1m
NAME                                                AGE   PHASE     IP            NODENAME
virtualmachineinstance.kubevirt.io/monitorable-vm   1m   Running   172.17.0.20   kubevirt
NAME                                     READY   STATUS    RESTARTS   AGE
pod/virt-launcher-monitorable-vm-vfk5f   1/1     Running   0          1m
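
You can also confirm that the two PVCs created by the DataVolume are bound to the PVs we prepared earlier (their names follow the DataVolume name, as described above):

$ kubectl get pvc cirros-dv cirros-dv-scratch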

Installing node-exporter inside a virtual machine

After the VirtualMachineInstance is launched, we can connect to its console with the command virtctl console monitorable-vm. If you are asked for a user and password, provide your credentials accordingly. If you are using the disk image from this guide, the user and password are cirros and gocubsgo, respectively.

The following commands, run inside the VM’s console, will install node-exporter and configure the VM to always start the exporter on boot.

$ curl -LO -k https://github.com/prometheus/node_exporter/releases/download/v1.0.1/node_exporter-1.0.1.linux-amd64.tar.gz
$ gunzip -c node_exporter-1.0.1.linux-amd64.tar.gz | tar xopf -
$ ./node_exporter-1.0.1.linux-amd64/node_exporter &
$ sudo /bin/sh -c 'cat > /etc/rc.local <<EOF
#!/bin/sh
echo "Starting up node_exporter at :9100!"
/home/cirros/node_exporter-1.0.1.linux-amd64/node_exporter 2>&1 > /dev/null &
EOF'
$ sudo chmod +x /etc/rc.local

PS: If you are using a different source image, please set node-exporter to start at boot time accordingly.
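
Before leaving the console, you can quickly confirm that the exporter is serving metrics (curl is available in the CirrOS image, since we already used it to download the tarball):

$ curl -s http://localhost:9100/metrics | head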

Setting up Prometheus to scrape the VM’s node-exporter

Setting up Prometheus to scrape node-exporter (or any other application) is very easy. All we need is to create a new Service and a ServiceMonitor:

apiVersion: v1
kind: Service
metadata:
  name: monitorable-vm-node-exporter
  labels:
    prometheus.kubevirt.io: "node-exporter"
spec:
  ports:
  - name: metrics 
    port: 9100 
    targetPort: 9100
    protocol: TCP
  selector:
    prometheus.kubevirt.io: "node-exporter"
---
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: kubevirt-node-exporters-servicemonitor
  namespace: monitoring
  labels:
    prometheus.kubevirt.io: "node-exporter"
    release: kubevirt-prometheus
spec:
  namespaceSelector:
    any: true
  selector:
    matchLabels:
      prometheus.kubevirt.io: "node-exporter"
  endpoints:
  - port: metrics
    interval: 15s
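
Save both resources to a manifest (the filename below is just a placeholder) and apply it:

$ kubectl apply -f node-exporter-monitoring.yaml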

Let’s go through everything one by one to make sure we’ve set everything up correctly. Let’s start with Service:

spec:
  ports:
  - name: metrics 
    port: 9100 
    targetPort: 9100
    protocol: TCP
  selector:
    prometheus.kubevirt.io: "node-exporter"

In the spec, we create a new port called metrics, which will be forwarded to every Pod labeled prometheus.kubevirt.io: "node-exporter" on port 9100, which is the default port number for node-exporter.

apiVersion: v1
kind: Service
metadata:
  name: monitorable-vm-node-exporter
  labels:
    prometheus.kubevirt.io: "node-exporter"

We also label the Service itself with prometheus.kubevirt.io: "node-exporter", so that it can be selected by the ServiceMonitor object.

Now let’s look at our ServiceMonitor spec:

spec:
  namespaceSelector:
    any: true
  selector:
    matchLabels:
      prometheus.kubevirt.io: "node-exporter"
  endpoints:
  - port: metrics
    interval: 15s

Because the ServiceMonitor will be deployed in the monitoring namespace while our Service lives in the default namespace, we need to set namespaceSelector.any=true.

We also tell the ServiceMonitor that Prometheus should scrape endpoints of Services labeled prometheus.kubevirt.io: "node-exporter" whose port is named metrics. Luckily, that’s exactly how we configured our Service.
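
A quick way to confirm that the Service actually picked up the virt-launcher pod is to look at its endpoints; if the list is empty, the label selector does not match the VMI’s pod:

$ kubectl get endpoints monitorable-vm-node-exporter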

One last thing to pay attention to: the Prometheus custom resource can be configured to select only certain ServiceMonitors. You can see which ServiceMonitors our Prometheus is selecting with the following command (the exact resource name depends on your Helm release name, so adjust it if needed):

# Look for Service Monitor Selector
kubectl describe -n monitoring prometheuses.monitoring.coreos.com monitoring-prometheus-oper-prometheus

Make sure our ServiceMonitor has all the labels required by Prometheus’s Service Monitor Selector. Typically, the selector matches the release name we set when deploying Prometheus with Helm!

This part can be really tricky. If you have any problems, please take a look at Prometheus-operator troubleshooting.

Testing

You can do a quick test by port-forwarding the Prometheus web interface and running some PromQL queries (again, the exact pod name depends on your release name, so adjust it if needed):

kubectl port-forward -n monitoring prometheus-monitoring-prometheus-oper-prometheus-0 9090:9090

To check that everything is working, go to localhost:9090/graph and run the PromQL query up{pod=~"virt-launcher.*"}. Prometheus should return data collected from the node-exporter running in monitorable-vm.
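
Since node-exporter now exposes the guest-level metrics we were missing, you can also try queries like the ones below (standard node-exporter metric names; exact availability depends on the exporter version and the guest OS):

node_filesystem_avail_bytes{pod=~"virt-launcher.*"}
node_memory_SwapFree_bytes{pod=~"virt-launcher.*"}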

To see how the metrics behave, you can play with virtctl by stopping and starting the VM. Note that when you stop the VM with virtctl stop monitorable-vm, the VirtualMachineInstance is destroyed, and its pod is destroyed along with it. As a result, our Service can no longer find the pod’s endpoint, and the target is removed from Prometheus.

Because of this behavior, alerts like the one below won’t fire, since our target has literally disappeared rather than merely gone down:

- alert: KubeVirtVMDown
  expr: up{pod=~"virt-launcher.*"} == 0
  for: 1m
  labels:
    severity: warning
  annotations:
    summary: KubeVirt VM {{ $labels.pod }} is down.

BUT if the VM is constantly crashing, without being stopped, its pod is not killed and the target is still scraped. node-exporter will never start, or will keep crashing together with the VM, so an alert like this one could work:

- alert: KubeVirtVMCrashing
  expr: up{pod=~"virt-launcher.*"} == 0
  for: 5m
  labels:
    severity: critical
  annotations:
    summary: KubeVirt VM {{ $labels.pod }} is constantly crashing before node-exporter starts at boot.
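
To actually deliver such rules to an operator-managed Prometheus, they need to be wrapped in a PrometheusRule resource whose labels match the Prometheus ruleSelector. The sketch below assumes the default selector matches on the release label, as with the ServiceMonitor; if the rule does not show up, check the Prometheus object’s Rule Selector and adjust the labels:

$ cat <<'EOF' | kubectl apply -n monitoring -f -
apiVersion: monitoring.coreos.com/v1
kind: PrometheusRule
metadata:
  name: kubevirt-vm-rules
  labels:
    app: prometheus-operator        # assumed to match the default ruleSelector
    release: kubevirt-prometheus    # our Helm release name
spec:
  groups:
  - name: kubevirt-vms
    rules:
    - alert: KubeVirtVMDown
      expr: up{pod=~"virt-launcher.*"} == 0
      for: 1m
      labels:
        severity: warning
      annotations:
        summary: KubeVirt VM {{ $labels.pod }} is down.
EOF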

Hacking

For your convenience, a nearly fully automated script can be found at https://github.com/ArthurSens/kubevirt-VM-monitoring. The deploy-everything.sh script will install Prometheus, KubeVirt and CDI in your K8s cluster and also configure the PersistentVolume, VirtualMachine, Service and ServiceMonitor. It will then connect to the VirtualMachine console, where you will need to install node-exporter using the shell script found in install-node-exporter.sh.

– –

That’s all! Now you know how to migrate your legacy VMs to a Kubernetes cluster, set up some core resources, and manage them like containerized applications!

Note that node-exporter is just an example: you can follow the same logic for a Windows VM with windows_exporter, or for your own applications running inside your VMs. All they need to do is expose metrics over HTTP in the Prometheus format; then configure the Service and ServiceMonitor with the correct port number.

