Covering all modern observability needs with the Grafana stack
No system runs without failures; sooner or later, problems arise in the operation of any software. This is where observability matters. It builds on monitoring: monitoring tells us when a problem occurred, while observability lets us understand why it occurred.
In this article, we will talk about how observability can be implemented using Grafana stack services.
Grafana Services
First of all, we need to get acquainted with the services that make up the LGTM stack: Loki, Mimir and Tempo. Each is a server-side aggregation system for one type of telemetry: Loki for logs, Mimir for metrics and Tempo for traces. Grafana itself provides the single entry point for querying and visualizing data from all the sources connected to it.
An important element of observability is alerting: the ability to notify someone when certain events occur or when metrics cross defined thresholds.
Metrics are already useful on their own, but combined with alerts they become the basis for detecting problems. A practical order of work is to set up metric collection first, extend the same approach to logs and traces, and then refine the setup by adding alerts.
Let's look at each of these services in more detail.
Loki Log Aggregator
Loki is a horizontally scalable, highly available, multi-tenant log aggregation system. It is designed to be resource efficient and easy to use. It does not index the contents of the logs; instead, it indexes a set of labels for each log stream. The Loki project was launched at Grafana Labs in 2018 and is currently distributed under the AGPLv3 license.
Loki lets you tail logs in real time to view events as they enter the system, refresh data at specific intervals, view logs for a specific date, and so on. Built-in integration with Prometheus, Grafana and Kubernetes makes it easy to switch between metrics, logs and traces within a single user interface.
An important advantage of Loki is that it indexes only the metadata (labels), not the full text of each log entry.
This approach saves a significant amount of time and resources. In particular, minimal indexing means that Loki needs much less space to store the same set of logs than solutions that build a full-text index.
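To make the label-only indexing idea concrete, here is a minimal sketch in Python. It is a toy model and not Loki's actual storage engine: the class `TinyLogStore` and its methods are hypothetical names invented for this illustration. Streams are found through an index keyed by label sets, and line content is scanned only at query time, similar in spirit to a LogQL stream selector followed by a `|=` filter.

```python
from collections import defaultdict


class TinyLogStore:
    """Toy model of label-only indexing (hypothetical, not Loki's real engine)."""

    def __init__(self):
        # The "index": one entry per unique label set, never per log line.
        self.streams = defaultdict(list)

    def push(self, labels, line):
        # A stream is identified by its full, order-independent label set.
        key = frozenset(labels.items())
        self.streams[key].append(line)

    def query(self, selector, contains=""):
        # Step 1: use the index to pick streams whose labels match the selector.
        # Step 2: scan only those streams' lines for the substring.
        wanted = set(selector.items())
        result = []
        for key, lines in self.streams.items():
            if wanted <= key:
                result.extend(line for line in lines if contains in line)
        return result

    def index_size(self):
        # Grows with the number of streams, not with the volume of log text.
        return len(self.streams)


store = TinyLogStore()
store.push({"job": "nginx", "region": "eu"}, "GET /index.html 200")
store.push({"job": "nginx", "region": "eu"}, "POST /login 302")
store.push({"job": "mysql", "level": "error"}, "Deadlock found")

nginx_gets = store.query({"job": "nginx"}, contains="GET")
print(nginx_gets)          # lines from nginx streams containing "GET"
print(store.index_size())  # number of distinct label sets
```

The trade-off shown here is the same one the text describes: the index stays small because it tracks label sets only, at the cost of scanning line content when a query filters on text.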
To work with logs, Loki provides its own query language, LogQL. You can run LogQL queries directly in Grafana to visualize the results, or use LogCLI if you prefer the command line.
Here are some examples of such queries.
The following query returns the average per-second rate of NGINX log lines containing "GET" over the last 10 seconds, grouped by region:
avg(rate(({job="nginx"} |= "GET")[10s])) by (region)
And this query counts the MySQL log entries from the last five minutes and groups the totals by level:
sum(count_over_time({job="mysql"}[5m])) by (level)
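Queries like the two above can also be sent to Loki's HTTP API through the `/loki/api/v1/query_range` endpoint. The sketch below only builds the request URL and does not send it. The base address `http://localhost:3100` is an assumption (Loki's default HTTP port); adjust it to wherever your Loki instance or gateway is reachable. The helper name `build_query_range_url` is hypothetical.

```python
from urllib.parse import urlencode

# Assumption: Loki (or its gateway) listens on the default port 3100.
LOKI_BASE = "http://localhost:3100"

QUERY = 'sum(count_over_time({job="mysql"}[5m])) by (level)'


def build_query_range_url(logql, start_ns, end_ns, step_s=60, base=LOKI_BASE):
    """Build a URL for Loki's range-query endpoint (hypothetical helper)."""
    params = {
        "query": logql,
        "start": start_ns,  # nanosecond Unix epoch
        "end": end_ns,      # nanosecond Unix epoch
        "step": step_s,     # resolution step in seconds
    }
    return f"{base.rstrip('/')}/loki/api/v1/query_range?{urlencode(params)}"


# Fixed timestamps keep the example deterministic: a five-minute window.
end_ns = 1_700_000_000 * 10**9
start_ns = end_ns - 5 * 60 * 10**9

url = build_query_range_url(QUERY, start_ns, end_ns)
print(url)
```

The resulting URL can be fetched with any HTTP client; `urlencode` takes care of escaping the braces, quotes and brackets that LogQL uses.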
You can also configure alert rules in Loki so that if thresholds are exceeded, alerts can be sent to Prometheus Alertmanager for subsequent processing.
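The threshold logic behind such a rule can be sketched in a few lines of Python. This is a simplified illustration and not how Loki's ruler is implemented: the function name `is_firing` and the sample values are made up, and the "hold for N samples" condition only approximates the idea of requiring a threshold to be exceeded for some time before alerting.

```python
def is_firing(samples, threshold, for_samples):
    """Return True if the last `for_samples` values all exceed `threshold`.

    Hypothetical helper: `samples` is a list of evaluated query results,
    oldest first, for example errors per minute.
    """
    if for_samples <= 0 or len(samples) < for_samples:
        return False
    return all(value > threshold for value in samples[-for_samples:])


# Sustained breach: the last three samples are all above 10.
sustained = is_firing([2, 3, 12, 15, 11], threshold=10, for_samples=3)

# Short spike: one of the last three samples is below the threshold.
spike = is_firing([2, 12, 3, 15, 11], threshold=10, for_samples=3)

print(sustained, spike)
```

Requiring several consecutive breaches is a common way to avoid alerting on short spikes.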
The Grafana website provides several options for installing Loki for different environments.
For learning purposes, the simplest option is to run it in containers:
$ mkdir grafana-loki
$ cd grafana-loki
$ wget https://raw.githubusercontent.com/grafana/loki/main/examples/getting-started/loki-config.yaml -O loki-config.yaml
$ wget https://raw.githubusercontent.com/grafana/loki/main/examples/getting-started/promtail-local-config.yaml -O promtail-local-config.yaml
$ wget https://raw.githubusercontent.com/grafana/loki/main/examples/getting-started/docker-compose.yaml -O docker-compose.yaml
$ docker-compose up -d
If the stack started successfully, the following readiness URLs will respond:
http://localhost:3101/ready
http://localhost:3102/ready
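Instead of opening these URLs by hand, the check can be scripted. The following is a minimal sketch using only the Python standard library; the function names `http_status` and `check_ready` are hypothetical, and it assumes that a ready component answers with HTTP 200.

```python
import urllib.error
import urllib.request

READY_URLS = [
    "http://localhost:3101/ready",
    "http://localhost:3102/ready",
]


def http_status(url, timeout=2.0):
    """Return the HTTP status code for `url`, or None if it is unreachable."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status
    except urllib.error.HTTPError as err:
        return err.code  # the server answered, but with an error status
    except (urllib.error.URLError, OSError):
        return None  # connection refused, timeout, DNS failure, ...


def check_ready(urls, fetch=http_status):
    """Map each URL to True if it answered with HTTP 200 (assumed 'ready')."""
    return {url: fetch(url) == 200 for url in urls}
```

Calling `check_ready(READY_URLS)` after `docker-compose up -d` returns a dictionary with one boolean per URL. The `fetch` parameter exists only so the logic can be exercised without a running stack.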
Mimir Storage
Grafana Mimir is an open source project that provides horizontally scalable, long-term storage for Prometheus and OpenTelemetry metrics. With it, you can run queries, derive new series using recording rules, and configure alerting rules for multiple tenants at once.
There are also several ways to install Mimir. The easiest way is to take a ready-made container:
docker pull grafana/mimir:latest
To start it, we first need to create a new network:
docker network create grafanet
Then we launch the container on this network:
docker run \
--rm \
--name mimir \
--network grafanet \
--publish 9009:9009 \
--volume "$(pwd)"/demo.yaml:/etc/mimir/demo.yaml grafana/mimir:latest \
--config.file=/etc/mimir/demo.yaml
The demo.yaml file mounted into the container may look like this:
# Do not use this configuration in production.
# It is for demonstration purposes only.
multitenancy_enabled: false
blocks_storage:
  backend: filesystem
  bucket_store:
    sync_dir: /tmp/mimir/tsdb-sync
  filesystem:
    dir: /tmp/mimir/data/tsdb
  tsdb:
    dir: /tmp/mimir/tsdb
compactor:
  data_dir: /tmp/mimir/compactor
  sharding_ring:
    kvstore:
      store: memberlist
distributor:
  ring:
    instance_addr: 127.0.0.1
    kvstore:
      store: memberlist
ingester:
  ring:
    instance_addr: 127.0.0.1
    kvstore:
      store: memberlist
    replication_factor: 1
ruler_storage:
  backend: filesystem
  filesystem:
    dir: /tmp/mimir/rules
server:
  http_listen_port: 9009
  log_level: error
store_gateway:
  sharding_ring:
    replication_factor: 1
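With the container running on port 9009, Mimir accepts Prometheus remote write requests and serves a Prometheus-compatible query API. The sketch below only assembles the relevant URLs and sends nothing. It assumes Mimir's default paths (`/api/v1/push` for remote write and the `/prometheus` prefix for queries) and the port published in the `docker run` command above; the helper names are hypothetical.

```python
from urllib.parse import urlencode

# Port taken from the --publish 9009:9009 option used to start the container.
MIMIR_BASE = "http://localhost:9009"


def remote_write_url(base=MIMIR_BASE):
    """URL a Prometheus remote_write section would point at (default path assumed)."""
    return f"{base}/api/v1/push"


def instant_query_url(promql, base=MIMIR_BASE):
    """URL for an instant PromQL query (default /prometheus prefix assumed)."""
    return f"{base}/prometheus/api/v1/query?{urlencode({'query': promql})}"


print(remote_write_url())
print(instant_query_url("up"))
```

Because `multitenancy_enabled` is set to `false` in the demo configuration, no tenant header is needed for these requests in this setup.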
Grafana Tempo Tracing
Grafana Tempo is an open source distributed tracing backend. Tempo lets you search for traces, generate metrics from spans, and link tracing data with logs and metrics. Distributed tracing visualizes the lifecycle of a request as it moves through a set of applications.
Tempo requires only object storage to operate, is deeply integrated with Grafana, Mimir, Prometheus and Loki, and supports open source tracing protocols including Jaeger, Zipkin and OpenTelemetry.
To install it, we again create a dedicated Docker network:
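To illustrate what "the lifecycle of a request" means in tracing terms, here is a small Python sketch of a trace as a tree of spans. It is a simplified model and not Tempo's data format: the `Span` fields, the service names and the timings are all invented for the example.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Span:
    span_id: str
    parent_id: Optional[str]  # None marks the root span of the trace
    service: str
    name: str
    start_ms: int
    end_ms: int


def render_trace(spans):
    """Return indented lines showing the parent/child structure of one trace."""
    children = {}
    for span in spans:
        children.setdefault(span.parent_id, []).append(span)

    lines = []

    def walk(parent_id, depth):
        # Visit the children of `parent_id` in order of start time.
        for span in sorted(children.get(parent_id, []), key=lambda s: s.start_ms):
            duration = span.end_ms - span.start_ms
            lines.append(f"{'  ' * depth}{span.service}: {span.name} ({duration} ms)")
            walk(span.span_id, depth + 1)

    walk(None, 0)
    return lines


# A made-up request passing through four services.
trace = [
    Span("a", None, "gateway", "GET /checkout", 0, 120),
    Span("b", "a", "orders", "create_order", 10, 90),
    Span("c", "b", "db", "INSERT orders", 20, 60),
    Span("d", "a", "payments", "charge", 95, 115),
]

rendered = render_trace(trace)
print("\n".join(rendered))
```

Each span records which service did the work, how long it took and which span called it. A tracing backend stores these records and reassembles them into a tree like this one when you look up a trace.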
$ docker network create docker-tempo
Next, download a YAML file with example settings:
$ curl -o tempo-local.yaml https://raw.githubusercontent.com/grafana/tempo/master/example/docker-compose/etc/tempo-local.yaml
And start the container:
docker run -d --rm -p 6831:6831/udp --name tempo -v $(pwd)/tempo-local.yaml:/etc/tempo-local.yaml --network docker-tempo grafana/tempo:latest -config.file=/etc/tempo-local.yaml
Now you need to start the Tempo query container. To do this, first download the tempo-query configuration file:
$ curl -o tempo-query.yaml https://raw.githubusercontent.com/grafana/tempo/master/example/docker-compose/etc/tempo-query.yaml
Using the downloaded tempo-query configuration file, launch the Docker container:
$ docker run -d --rm -p 16686:16686 -v $(pwd)/tempo-query.yaml:/etc/tempo-query.yaml --network docker-tempo grafana/tempo-query:latest --grpc-storage-plugin.configuration-file=/etc/tempo-query.yaml
If the installation is successful, the Tempo query interface will be available at http://localhost:16686
The next step is to connect data sources to the stack components and configure how they interact, but that is a topic for separate articles.
Conclusion
In this article, we looked at the components that make up the Grafana stack. Used together, Loki, Mimir and Tempo cover logs, metrics and traces, which gives full observability of the target systems.