Storage Basics in Kubernetes

Persistent Volumes (PV) are segments of disk space that can be attached to pods and that retain data even after containers are restarted or deleted. These volumes are consumed through the Persistent Volume Claims mechanism, which lets users and applications request storage of a specific size and class while abstracting away the physical implementation of the storage.

Temporary (ephemeral) storage, by contrast, is tied to the lifecycle of the container and is used for data that is relevant only while the container is running.

The storage classification in Kubernetes is not limited to this division. There are various StorageClasses, which allow you to define storage classes with different characteristics.

Kubernetes also implements the Container Storage Interface (CSI).

We'll look at the basics of all of this in this article.

Persistent Volumes (PV) and Persistent Volume Claims (PVC)

PVs are storage units in the Kubernetes ecosystem that make storage available to pods and persist independently of the lifecycle of individual pods. PVs provide an abstraction over physical storage, allowing administrators to offer storage as pooled network resources regardless of the implementation details of the underlying internal or external storage.

PVCs are storage requests created by the user. PVCs allow users to request specific amounts of storage and access modes, thereby abstracting storage operations from the implementation and providing more granular control over storage resource management.

The life cycle of PV and PVC includes several stages:

  1. Creation of PV: An administrator or an automated provisioner creates a PV in Kubernetes by defining storage parameters such as size, access mode, and physical location.

  2. PVC request: Users create a PVC, specifying the required storage volume and parameters.

  3. Binding: Kubernetes automatically "binds" a PVC to a corresponding PV if the requirements of the PVC match the characteristics of the PV.

  4. Usage: Pods use bound PVs via PVCs to store data.

  5. Release and reuse or disposal: When pods using a PVC are deleted, the PV can either be reassigned to another PVC or cleaned up and deleted, depending on its reclaim policy.
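The binding step can be illustrated with a simplified matcher. This is a sketch, not Kubernetes' actual algorithm: real binding also considers node affinity, volume modes, and best-fit sizing, and the field names below are a deliberately flattened illustration.

```python
def parse_gi(size: str) -> int:
    # "10Gi" -> 10; a simplification (real Kubernetes quantities are richer)
    return int(size.rstrip("Gi"))

def find_matching_pv(pvc, pvs):
    """Return the first available PV that satisfies the claim, or None."""
    for pv in pvs:
        if (pv["status"] == "Available"
                and pv["storageClassName"] == pvc["storageClassName"]
                and pvc["accessMode"] in pv["accessModes"]
                and parse_gi(pv["capacity"]) >= parse_gi(pvc["request"])):
            return pv
    return None

pvs = [
    {"name": "pv-small", "status": "Available", "storageClassName": "slow",
     "accessModes": ["ReadWriteOnce"], "capacity": "10Gi"},
]
pvc = {"storageClassName": "slow", "accessMode": "ReadWriteOnce", "request": "5Gi"}
match = find_matching_pv(pvc, pvs)
```

Here the 10Gi volume satisfies the 5Gi claim because the class and access mode match and its capacity is large enough.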

Dynamic provisioning allows storage to be created automatically on demand in response to a PVC, without the administrator having to create the PV first. This is achieved with StorageClasses, which describe the storage "classes" available for dynamic provisioning.

Main components of StorageClasses:

  • Provisioner: Used to provision volumes. Kubernetes ships built-in provisioners (for example, kubernetes.io/aws-ebs, kubernetes.io/gce-pd) and also supports external provisioners via the CSI interface.

  • Parameters: Provisioner-specific settings such as performance tier, quality-of-service level, backup or replication policies, file system type, and so on. These parameters are passed to the provisioner to configure the volume according to the requirements.

  • Reclaim Policy: The policy for managing the lifecycle of a volume after it is released by the pod. It can be Retain (keep the volume for future use) or Delete (delete the volume automatically).

  • VolumeBindingMode: Determines when a PV is bound to a PVC. Immediate results in immediate binding, while WaitForFirstConsumer defers binding until the pod that will use the PVC is created.
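The fields above can be combined in a single manifest; here is a sketch (the class name, provisioner, and parameter values are illustrative):

```yaml
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: standard-retained
provisioner: kubernetes.io/gce-pd   # example built-in provisioner
parameters:
  type: pd-standard
reclaimPolicy: Retain
volumeBindingMode: WaitForFirstConsumer
```

With WaitForFirstConsumer, the volume is created only once a pod actually schedules against the claim, which lets the provisioner place the volume in the right zone for that pod.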

PVs support various access modes, including:

  • ReadWriteOnce (RWO): The volume can be mounted read-write by only one node.

  • ReadOnlyMany (ROX): The volume can be mounted read-only by multiple nodes.

  • ReadWriteMany (RWX): The volume can be mounted read-write by multiple nodes.

The configuration of a PV or PVC is defined in a YAML manifest that specifies its main characteristics, for example:

apiVersion: v1
kind: PersistentVolume
metadata:
  name: example-pv
spec:
  capacity:
    storage: 10Gi
  accessModes:
    - ReadWriteOnce
  persistentVolumeReclaimPolicy: Retain
  storageClassName: slow
  hostPath:
    path: "/mnt/data"

This PV is 10 GiB in size, supports the ReadWriteOnce access mode, belongs to the storage class slow, and will be retained when released thanks to the Retain policy.

To create a PVC, a similar syntax is used:

apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: example-pvc
spec:
  accessModes:
    - ReadWriteOnce
  storageClassName: slow
  resources:
    requests:
      storage: 5Gi

The PVC requests 5 GiB of storage, which can be satisfied by any PV of storage class slow that has at least 5 GiB of capacity and supports the ReadWriteOnce access mode.
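To cover the usage stage of the lifecycle, a pod consumes the claim by referencing it in its volumes section. A minimal sketch (the pod name, image, and mount path are placeholders):

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: example-pod
spec:
  containers:
    - name: app
      image: nginx            # placeholder image
      volumeMounts:
        - mountPath: /usr/share/nginx/html
          name: data
  volumes:
    - name: data
      persistentVolumeClaim:
        claimName: example-pvc
```

The pod refers only to the PVC by name; which PV actually backs it is resolved by the binding described above.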

StorageClass defines how the dynamic provider should create new PVs to satisfy PVC requests. StorageClass configuration example:

apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: fast
provisioner: kubernetes.io/aws-ebs
parameters:
  type: gp2

This StorageClass fast uses AWS EBS as the underlying infrastructure and gp2 as the volume type. When you create a PVC pointing to this class, Kubernetes will automatically provision a new PV in AWS EBS with the specified parameters.
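A PVC that triggers this dynamic provisioning simply references the class by name; a sketch (the claim name and size are illustrative):

```yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: fast-pvc
spec:
  accessModes:
    - ReadWriteOnce
  storageClassName: fast
  resources:
    requests:
      storage: 20Gi
```

No matching PV needs to exist in advance: the provisioner named in the fast class creates one on demand and binds it to this claim.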

Container Storage Interface (CSI)

CSI uses a set of gRPC services to communicate with Kubernetes, and each CSI driver must implement, at a minimum, the Identity and Node services. The Identity service allows Kubernetes components and CSI sidecar containers to identify a driver and the functionality it supports. The Node service is required to make the volume available at the specified path and to report which optional features the driver supports.

To interact with Kubernetes, CSI drivers must register with the kubelet's plugin registration mechanism on every supported node. This enables communication between the kubelet and the CSI driver over a Unix domain socket for mounting and unmounting volumes. Kubernetes core components do not interact with CSI drivers directly. Instead, drivers that need to perform operations driven by the Kubernetes API (such as volume creation, volume attachment, or volume snapshots) must watch the Kubernetes API and initiate the appropriate CSI operations.

Kubernetes uses sidecar containers for CSI drivers to enable integration with various Kubernetes components and perform CSI-specific tasks. These include:

  • External Provisioner: watches PersistentVolumeClaim objects in Kubernetes and issues CreateVolume and DeleteVolume operations.

  • External Attacher: watches VolumeAttachment objects and issues ControllerPublishVolume and ControllerUnpublishVolume operations.

  • Node-Driver Registrar: registers the CSI driver with the kubelet and adds the driver's node ID to the Kubernetes Node object.

As an example, let's build a CSI driver skeleton based on gRPC services in Python, focusing on a minimal implementation of the two main services: Identity and Node.

The first step is to define the .proto files that describe the gRPC interfaces conforming to the CSI specification. These files define the data structures and services that the CSI driver will implement.

Next, we generate the gRPC code from those .proto files. The plain protoc binary does not include the gRPC Python plugin, so it is easiest to invoke it through the grpc_tools package. This creates the Python classes and methods corresponding to the definitions in the proto file:

python -m grpc_tools.protoc -I ./protos --python_out=. --grpc_python_out=. ./protos/csi.proto

The next step is to implement the Identity service, which allows clients to identify the driver and its capabilities:

from concurrent import futures
import grpc

import csi_pb2
import csi_pb2_grpc

class IdentityServicer(csi_pb2_grpc.IdentityServicer):
    def GetPluginInfo(self, request, context):
        return csi_pb2.GetPluginInfoResponse(
            name="my-csi-driver",
            vendor_version='1.0.0'
        )

    def GetPluginCapabilities(self, request, context):
        return csi_pb2.GetPluginCapabilitiesResponse(
            capabilities=[
                # driver capabilities go here
            ]
        )

def serve():
    server = grpc.server(futures.ThreadPoolExecutor(max_workers=10))
    csi_pb2_grpc.add_IdentityServicer_to_server(IdentityServicer(), server)
    server.add_insecure_port('[::]:50051')
    server.start()
    server.wait_for_termination()

if __name__ == '__main__':
    serve()

We implement the Node service in the same way; it is responsible for volume operations at the node level:

class NodeServicer(csi_pb2_grpc.NodeServicer):
    def NodePublishVolume(self, request, context):
        # mount the volume
        pass

    def NodeUnpublishVolume(self, request, context):
        # unmount the volume
        pass

# add NodeServicer to the gRPC server the same way as IdentityServicer
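Inside NodePublishVolume, the core of the work is typically a bind mount from a staged source path to the pod's target path. A minimal sketch of that logic (the helper names and paths are illustrative, not part of the CSI spec; a real driver must also handle idempotency, read-only volumes, and error reporting via gRPC status codes):

```python
import os
import subprocess

def build_bind_mount_args(source: str, target: str) -> list:
    # argv for bind-mounting the staged volume path into the pod's target path
    return ["mount", "--bind", source, target]

def publish_volume(source: str, target: str) -> None:
    # Ensure the target directory exists, then perform the bind mount.
    os.makedirs(target, exist_ok=True)
    subprocess.run(build_bind_mount_args(source, target), check=True)
```

Keeping the argv construction in a separate pure function makes the mount logic easy to unit-test without root privileges.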

After implementing the services, we containerize the application using Docker and deploy it to Kubernetes using YAML manifests, including the necessary sidecar containers for full integration with Kubernetes.
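A node-level deployment for such a driver typically pairs it with the Node-Driver Registrar sidecar in a DaemonSet. A trimmed sketch (the driver image, driver name, and socket paths are illustrative; the registrar image is the standard sig-storage sidecar):

```yaml
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: my-csi-driver-node
spec:
  selector:
    matchLabels:
      app: my-csi-driver-node
  template:
    metadata:
      labels:
        app: my-csi-driver-node
    spec:
      containers:
        - name: my-csi-driver
          image: my-registry/my-csi-driver:1.0.0   # illustrative image
          volumeMounts:
            - name: plugin-dir
              mountPath: /csi
        - name: node-driver-registrar
          image: registry.k8s.io/sig-storage/csi-node-driver-registrar:v2.9.0
          args:
            - --csi-address=/csi/csi.sock
            - --kubelet-registration-path=/var/lib/kubelet/plugins/my-csi-driver/csi.sock
          volumeMounts:
            - name: plugin-dir
              mountPath: /csi
            - name: registration-dir
              mountPath: /registration
      volumes:
        - name: plugin-dir
          hostPath:
            path: /var/lib/kubelet/plugins/my-csi-driver
            type: DirectoryOrCreate
        - name: registration-dir
          hostPath:
            path: /var/lib/kubelet/plugins_registry
            type: Directory
```

The two containers share the plugin directory so the registrar can find the driver's Unix socket and announce it to the kubelet through the registration directory.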


In conclusion, I invite you to a free lesson where you will learn how to create and configure different types of services in Kubernetes: ClusterIP for internal communication, ExternalName for external access, NodePort for exposing a port on every node, and LoadBalancer for load balancing.
