We work extensively with development teams that have just moved to OpenShift, and we try to give them guidance and best practices for building and deploying applications successfully on the platform. From that work we have selected 14 practices that we consider key. They fall into two categories: application reliability and application security. The categories overlap, since higher reliability tends to improve security and vice versa. The list of best practices follows.
This section compiles 9 best practices that help you improve application availability and uptime and deliver a better user experience.
1. Do not store application configuration inside a container
If a container image contains configuration for a specific environment (Dev, QA, Prod), it cannot be moved between environments without changes. This undermines the reliability of the release process, because the image that was tested in earlier stages is no longer the one that goes into production. So keep environment-specific configuration outside the container, for example in ConfigMaps and Secrets.
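As an illustration, here is a minimal sketch of an application reading its settings from environment variables, which is one common way ConfigMap and Secret values reach a container. The variable names (`DATABASE_URL`, `LOG_LEVEL`, `API_TOKEN`) and the `AppConfig` class are hypothetical, chosen only for this example.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class AppConfig:
    database_url: str
    log_level: str
    api_token: str


def load_config(env=None):
    """Build the configuration from environment variables.

    The variable names are hypothetical; in a cluster they would be
    populated from a ConfigMap (plain settings) and a Secret (credentials).
    """
    env = os.environ if env is None else env
    return AppConfig(
        database_url=env["DATABASE_URL"],        # required: KeyError if missing
        log_level=env.get("LOG_LEVEL", "INFO"),  # optional, with a default
        api_token=env["API_TOKEN"],              # required: KeyError if missing
    )
```

Because nothing environment-specific is baked into the image, the same image can be promoted unchanged from Dev to QA to Prod.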
2. Set resource requirements and limits in pod definitions
Without properly tuned resource requirements, applications can place excessive demands on memory and CPU. Conversely, when an application declares its CPU and memory requirements explicitly, the cluster can schedule it efficiently and provide the requested resources.
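To make this concrete, the sketch below builds the container entry of a pod definition as a Python dictionary and renders it as JSON, a format that `oc` accepts alongside YAML. The `resources.requests` and `resources.limits` fields are the standard ones; the container name, image, and the specific quantities are placeholder assumptions, not recommendations.

```python
import json


def container_with_resources(name, image):
    """Return a container entry for a pod spec with explicit requests and limits."""
    return {
        "name": name,
        "image": image,
        "resources": {
            # What the scheduler reserves for the container on a node.
            "requests": {"cpu": "250m", "memory": "256Mi"},
            # Upper bound enforced while the container is running.
            "limits": {"cpu": "500m", "memory": "512Mi"},
        },
    }


def render(container):
    """Serialize the container entry as indented JSON."""
    return json.dumps(container, indent=2)
```

Suitable values depend on the application, so measure real usage before settling on them.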
3. Define liveness and readiness probes in pod definitions
With these probes, the cluster can provide basic resiliency: it restarts the application if the liveness probe fails, and stops routing traffic to it when the readiness probe fails. For more details, see the Application health monitoring section of the OpenShift Platform documentation.
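On the application side, probes need something to call. The following is a minimal sketch using only the Python standard library; the endpoint paths `/healthz` and `/ready` and port 8080 are assumptions, and the pod definition would have to point its HTTP probes at whatever paths you actually expose.

```python
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer

# Set once startup work (cache warm-up, connections, ...) has finished.
ready = threading.Event()


def probe_response(path, is_ready):
    """Map a probe path to an (HTTP status, body) pair."""
    if path == "/healthz":               # liveness: the process is responsive
        return 200, b"alive"
    if path == "/ready":                 # readiness: safe to receive traffic
        return (200, b"ready") if is_ready else (503, b"not ready")
    return 404, b"not found"


class ProbeHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        status, body = probe_response(self.path, ready.is_set())
        self.send_response(status)
        self.send_header("Content-Type", "text/plain")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep frequent probe requests out of the application log


def serve(port=8080):
    """Serve the probe endpoints until the process is stopped."""
    HTTPServer(("0.0.0.0", port), ProbeHandler).serve_forever()
```

Calling `ready.set()` after initialization completes flips the readiness endpoint from 503 to 200, while the liveness endpoint answers as long as the process can serve requests.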
4. Use PodDisruptionBudget to protect apps
Sometimes pods have to be evicted from a cluster node, for example during host maintenance or when the autoscaler scales the cluster down and shuts off unneeded nodes. To keep the application available during such events, configure PodDisruptionBudget objects.
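For reference, a PodDisruptionBudget is a small object. The sketch below builds one as a Python dictionary that can be dumped to JSON and applied with `oc`; the object name, the `app` label key, and the `minAvailable` value are illustrative assumptions.

```python
import json


def pod_disruption_budget(name, app_label, min_available):
    """Return a PodDisruptionBudget manifest for pods labelled app=<app_label>."""
    return {
        "apiVersion": "policy/v1",
        "kind": "PodDisruptionBudget",
        "metadata": {"name": name},
        "spec": {
            # Voluntary evictions are refused if they would leave fewer
            # than this many matching pods available.
            "minAvailable": min_available,
            "selector": {"matchLabels": {"app": app_label}},
        },
    }


def render(manifest):
    """Serialize the manifest as indented JSON."""
    return json.dumps(manifest, indent=2)
```

A budget only helps if the application runs more replicas than `minAvailable`; otherwise node drains are blocked.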
5. Graceful shutdown of pods
When shutting down, a pod must finish all in-flight requests and close all open connections cleanly, so that pod restarts, for example during an application update, go unnoticed by end users.
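In practice this means reacting to SIGTERM, the signal a container receives when its pod is being terminated, before it is killed at the end of the grace period. Here is a minimal sketch of a worker loop that stops accepting new work once the signal arrives; `handle_one_request` is a hypothetical callable standing in for your real request handling.

```python
import signal
import threading

shutdown_requested = threading.Event()


def request_shutdown(signum, frame):
    """Signal handler: only record the request, the loop does the rest."""
    shutdown_requested.set()


def install_handlers():
    """Register the handler; call this from the main thread at startup."""
    signal.signal(signal.SIGTERM, request_shutdown)
    signal.signal(signal.SIGINT, request_shutdown)


def run_worker(handle_one_request, poll_seconds=0.1):
    """Process requests until shutdown is requested; return how many were handled.

    handle_one_request() should return True if it processed a request and
    False if there was nothing to do.
    """
    handled = 0
    while not shutdown_requested.is_set():
        if handle_one_request():
            handled += 1
        else:
            # Idle: wait briefly, but wake up at once on shutdown.
            shutdown_requested.wait(poll_seconds)
    # The request in progress has completed; close connections here.
    return handled
```

The request being processed when the signal arrives is always finished, because the flag is checked only between requests.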
6. One container – one process
Try to run each process in its own container. This improves process isolation and avoids signal routing problems and zombie processes that would otherwise have to be reaped periodically. For more details, see the Avoid multiple processes section of the OpenShift Platform documentation.
7. Use application monitoring systems and alerts
Application monitoring with Prometheus and Grafana helps keep applications running smoothly in production according to business requirements.
8. Have applications write their logs to stdout / stderr
OpenShift can then collect the logs and send them to a centralized processing system (ELK, Splunk). Logs are an invaluable resource when analyzing how an application behaves in production. In addition, alerts generated from log contents help confirm that the application is working as intended.
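A minimal sketch of what this looks like with Python's standard `logging` module follows. The log format is an arbitrary choice made for the example, not something the platform requires.

```python
import logging
import sys


def configure_logging(level="INFO"):
    """Send all log records to stdout instead of to files inside the container."""
    handler = logging.StreamHandler(sys.stdout)
    handler.setFormatter(
        logging.Formatter("%(asctime)s %(levelname)s %(name)s %(message)s")
    )
    root = logging.getLogger()
    root.handlers.clear()   # drop any previously configured handlers
    root.addHandler(handler)
    root.setLevel(level)
    return root
```

After calling `configure_logging()` once at startup, `logging.getLogger(__name__).info("started")` in any module ends up on stdout, where the platform can pick it up.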
9. Consider using circuit breakers, timeouts, retries, and rate limiting
These mechanisms make applications more resilient to failures by protecting against overload (rate limiting, circuit breakers) and helping to cope with network problems (timeouts, retries). Consider OpenShift Service Mesh, which lets you implement these features without touching your application code.
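If you do implement one of these mechanisms in application code rather than in the mesh, the idea behind retries can be sketched as follows. This is a simplified illustration with exponential backoff, not how a service mesh implements it; the parameter names and the choice of `ConnectionError` as the retryable failure are assumptions.

```python
import time


def call_with_retries(operation, attempts=3, base_delay=0.1, sleep=time.sleep):
    """Call operation(), retrying on ConnectionError with exponential backoff.

    The delay doubles after each failed attempt: base_delay, 2 * base_delay, ...
    The last error is re-raised once all attempts are used up.
    """
    last_error = None
    for attempt in range(attempts):
        try:
            return operation()
        except ConnectionError as exc:   # retry only failures assumed transient
            last_error = exc
            if attempt < attempts - 1:
                sleep(base_delay * 2 ** attempt)
    raise last_error
```

Retries should be limited to operations that are safe to repeat, and combined with timeouts so that a single slow call cannot block the caller indefinitely.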
Below are 5 application security hardening practices that we consider absolutely essential and that you should definitely consider adopting.
10. Use only trusted container images
Use vendor images wherever possible, since they are tested, tuned for security, and supported. As for community images, use only those from communities you trust. And remember that public registries such as Docker Hub contain images of unknown origin. Do not use them under any circumstances!
11. Use the latest versions of the base container images
Only the latest versions contain all currently available security fixes. Configure your continuous integration pipeline so that every application image build pulls the latest base images, and so that application images are rebuilt whenever new base images are released.
12. Separate build images from runtime images
The build image contains dependencies that are needed to build the application but not to run it. Therefore, create separate runtime images with a minimum of dependencies to reduce both the attack surface and the image size.
13. Use the restricted security context constraint (SCC) wherever possible
Modify your images so that they can run under the restricted SCC (see the Support for arbitrary user IDs section in the documentation). One risk of application compromise is that an attacker gains control of the application itself, and the restricted SCC helps protect the cluster node in that case.
14. Use TLS to secure communications between application components
Application components can send sensitive data to each other. If the network in which OpenShift is running cannot be considered secure, then traffic between application components can be protected using TLS encryption, and the already mentioned OpenShift Service Mesh allows you to do this without touching the application code.
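Where TLS is handled in the application rather than by the mesh, the client side can be sketched with Python's standard `ssl` module as shown below. The `ca_file` parameter is an assumption: it stands for wherever your cluster's CA bundle is mounted, and the minimum protocol version is an illustrative choice.

```python
import ssl


def client_tls_context(ca_file=None):
    """Create a client-side TLS context that verifies the server certificate.

    ca_file is a hypothetical path to a CA bundle; if omitted, the system's
    default trust store is used.
    """
    ctx = ssl.create_default_context(cafile=ca_file)
    ctx.minimum_version = ssl.TLSVersion.TLSv1_2   # refuse older protocol versions
    return ctx
```

The returned context can be passed to standard-library clients, for example `urllib.request.urlopen(url, context=ctx)`.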
You now have a list of 14 best practices for building more reliable and secure OpenShift applications. It is a good starting point for drawing up your own development guidelines for your team. For more information and guidance, see the Creation of images section of the OpenShift documentation.