Integrating a secure circuit into Yandex Cloud: sharing our experience

Hi all! Nixys here!

Another interesting case came our way (mine, mostly).

Our client (let's agree to call them the Customer) needed a ready-made, turnkey development environment: easy to administer, scalable, demanding minimal responsibility on their side, and maximally fault tolerant. At the same time, it had to be protected from unauthorized access so that the company's work would not fall into the wrong hands. At the time, the client had several hardware servers whose maintenance consumed a lot of human resources, and if such an environment is not maintained and developed, significant problems pile up over time. So the plan was to build a production circuit in a domestic cloud.

The disadvantages we see:

  • Once the setup is complete, all access to the outside world (including, of course, the Internet) will be cut off, which immediately kills the ease of administration, since every update will have to be approved by the information security service.

  • Closed circuits require large investments in development and further support, since all components must be implemented inside the company.

  • The company may run into problems when integrating new technologies or products from other vendors, which can complicate scaling and expanding the business.

The task sounded trivial: there is a maximally closed and secure infrastructure behind UserGate, and the Customer wants its programmers to build cool services in Yandex Cloud.

The inputs (as far as I know) were:

After many brainstorms and phone calls, we came to this decision:

  1. We create an IPsec tunnel between Yandex Cloud and UserGate using the IKEv2 protocol.

  2. Where we can, we authorize via LDAP and Keycloak (by the way, here is an article from my colleague who has already fought this battle).

  3. Using Terraform modules from the nxs-marketplace-terraform repo, we deploy the infrastructure with a snap of our fingers, shake hands, and part ways on a good note.

The infrastructure itself will consist of:

Configuring a StrongSwan site-to-site IKEv2 IPsec tunnel

First, let's figure out what an IPsec tunnel is. It is a virtual communication channel created between two devices for secure data transfer: data is encrypted before sending and decrypted after receiving, so even if it is intercepted in transit, an attacker cannot read or alter it. Compared with, say, OpenVPN, which runs in user space, IPsec works at the system kernel level, which gives it better speed and performance.

We chose the IKEv2 protocol (IKEv1, standardized back in 1998, is simply obsolete), and PSK (Pre-Shared Key, where the same encryption key is used by all participants in the connection) as the authentication method. In fact, we had a choice: PSK or certificates. We settled on PSK for several reasons. Firstly, it is convenient: if the key is compromised, generating and distributing a new one is faster and easier than wrestling with certificates, and time spent restoring a communication channel can be very expensive. Secondly, the circuit will be closed, and only certain people will know about it.
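
For example, rotating a compromised key boils down to one command plus delivering the result to the other side (a minimal sketch; the 48-byte length is our choice, not a requirement):

# generate a random 48-byte key and encode it in base64
openssl rand -base64 48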

Let's get down to business.

First, it was necessary to build a tunnel between Yandex Cloud and the Customer's closed circuit. It's simple. What could go wrong?

As usual – a bug in the latest (as of May 2024) UserGate firmware. When creating an IPsec tunnel, UserGate opens multiple connections that are then dropped by timeout. The vendor confirmed the bug and sent the firmware version back for rework. But our process, alas, had already been launched, so we had to redo things. Be careful.

Then we built a link between StrongSwan (a virtual machine in Yandex Cloud) and StrongSwan (a virtual machine behind UserGate in the Customer's infrastructure).

Now I'll show you how we did it.

  1. Install StrongSwan:
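
On Ubuntu/Debian (an assumption; adjust for your distribution) this is a one-liner:

sudo apt-get update && sudo apt-get install -y strongswan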

  2. Configure /etc/ipsec.conf:

config setup
        charondebug="all" 
        uniqueids=yes
        strictcrlpolicy=no

conn connection
        authby=secret
        left=10.12.0.21
        leftid=84.0.0.1
        leftsubnet=10.12.0.0/24,10.13.0.0/24
        right=202.0.0.3
        rightid=%any
        rightsubnet=10.1.1.0/24,10.1.4.0/24
        ike=aes256-sha256-modp2048,aes256-sha256-modp1024,aes256-sha1-modp2048,aes256-sha1-modp1024!
        esp=aes256-sha256,aes256-sha1!
        keyingtries=3
        ikelifetime=1h
        lifetime=8h
        dpdaction=hold
        auto=start
        type=tunnel
        keyexchange=ikev2

In short, left is the local server, and right is the remote server.

Everything related to encryption (ike, esp) must be identical on both hosts. Details can be found in the official documentation.

  3. Configure /etc/ipsec.secrets:

#source      destination
84.0.0.1     202.0.0.3  : PSK "some base64 key here"

The only important thing here is the line with the addresses and the key; the source/destination comments are there just to avoid confusion.

  4. We set up the network as described in this article; the key kernel settings are sketched below.
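
The linked article is not reproduced here, but for an IPsec gateway this step typically means enabling packet forwarding in the kernel; a sketch of the usual sysctl settings (our assumption of what matters here):

# /etc/sysctl.conf: enable forwarding, disable ICMP redirects
net.ipv4.ip_forward = 1
net.ipv4.conf.all.accept_redirects = 0
net.ipv4.conf.all.send_redirects = 0

Apply with sudo sysctl -p.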

  5. Add rules to /etc/ufw/before.rules:

# allow ssh
-A ufw-before-input -p tcp -m tcp --dport 22 -j ACCEPT
-A ufw-before-output -p tcp -m tcp --sport 22 -j ACCEPT

Enable ESP traffic redirection:

-A ufw-before-forward --match policy --pol ipsec --dir in --proto esp -s 10.12.0.0/24 -j ACCEPT
-A ufw-before-forward --match policy --pol ipsec --dir out --proto esp -d 10.12.0.0/24 -j ACCEPT

# allow forwarding
-A ufw-before-forward -s 10.1.1.0/24 -d 10.12.0.0/24 -i eth0 -m policy --dir in --pol ipsec --reqid 1 --proto esp -j ACCEPT
-A ufw-before-forward -s 10.12.0.0/24 -d 10.1.1.0/24 -o eth0 -m policy --dir out --pol ipsec --reqid 1 --proto esp -j ACCEPT

-A ufw-before-forward -s 10.1.4.0/24 -d 10.12.0.0/24 -i eth0 -m policy --dir in --pol ipsec --reqid 1 --proto esp -j ACCEPT
-A ufw-before-forward -s 10.12.0.0/24 -d 10.1.4.0/24 -o eth0 -m policy --dir out --pol ipsec --reqid 1 --proto esp -j ACCEPT

Such rules need to be created for each subnet; just be sure to specify the correct interface and addresses.

Additional information with examples is available on the official website.
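
After editing before.rules, ufw needs to re-read its rule files:

sudo ufw reload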

  6. Create static routes to the destination networks (a sketch is below).
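
A sketch for the Yandex Cloud side, assuming the StrongSwan VM address 10.12.0.21 from the config above (in practice, the cloud VPC route tables can do the same for the whole subnet at once):

# on other hosts in 10.12.0.0/24, send traffic for the remote subnets via the StrongSwan VM
sudo ip route add 10.1.1.0/24 via 10.12.0.21
sudo ip route add 10.1.4.0/24 via 10.12.0.21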

Afterwards, we perform similar operations on the second server, only swapping the network and host addresses.
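Once both sides are configured, we restart strongSwan and make sure the tunnel is established:

sudo ipsec restart
sudo ipsec statusall
# an ESTABLISHED IKE SA and INSTALLED child SAs listing our subnets mean the tunnel is up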

Setting up the infrastructure

After we successfully tested the transfer of files of several gigabytes between remote hosts via the tunnel and measured the transfer speed, we proceeded to the second stage – setting up the infrastructure. This was the most difficult stage of the work.

The order I recommend is not mandatory. However, if I had set everything up in this order from the start, the project would have been completed much faster. Most of the decisions were made primarily to make life easier for the administrators who will be maintaining all of this.

  1. VPN. “Why? There's already an IPsec tunnel,” the most attentive readers will say. But here everything is simple: it will be faster for us and for some external contractors (I was, of course, told the solution is temporary… but you and I know there is nothing more permanent than a temporary one ;-). The choice fell on Pritunl VPN. The advantages are obvious: simple initial setup, it works right after installation, entities are easy to add, and creating new users is extremely easy.

  2. A GitLab server in a Docker Compose file. Why? Ease of support! Namely:

  • fewer problems with updates;

  • everything is in one place, which lowers the entry barrier for the next engineers who will work on the project;

  • everything can be placed behind an external Nginx, but this is rather optional (after all, our circuit is closed).

Once GitLab is deployed and configured, we describe each subsequent step as IaC. We set up CI/CD for everything we can, so that the client can create magic or make history (pick your favorite) with one click of a button. We make it all as readable and convenient as possible. And, of course, each project gets a README with a detailed description of the service.

A little bit about GitLab setup

Nothing special was invented: docker-compose.yaml was taken from the official documentation. This is what it looks like:

version: '3'
services:
 gitlab:
   image: gitlab/gitlab-ce:<some image version here>
   logging:
     options:
       max-size: "1024m"
   restart: always
   hostname: 'gitlab.example.com'
   container_name: 'gitlab'
   environment:
     GITLAB_OMNIBUS_CONFIG: |
       external_url 'http://gitlab.example.com'
       nginx['redirect_http_to_https'] = false
       nginx['custom_gitlab_server_config'] = 'proxy_request_buffering off;'
       letsencrypt['enable'] = false
       prometheus['enable'] = false
       alertmanager['enable'] = false
       grafana['enable'] = false
       registry['enable'] = false
       gitlab_rails['gitlab_shell_ssh_port'] = 2222
   ports:
   - '8080:80'
   - '8443:443'
   - '2222:22'
   volumes:
   - './volumes/etc/gitlab:/etc/gitlab'
   - './volumes/log/gitlab:/var/log/gitlab'
   - './volumes/data/gitlab:/var/opt/gitlab'
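
A minimal sketch of the first start and of retrieving the initial root password (the file path is standard for the official GitLab image):

docker compose up -d
# the initial root password is kept in the container for 24 hours after the first start
sudo docker exec gitlab grep 'Password:' /etc/gitlab/initial_root_password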

The next item is configuring Nginx and preparing the ground for certificates, based on articles from my beloved DigitalOcean. I will highlight the main points I used:

  1. Create a file /etc/nginx/snippets/self-signed.conf, in which we specify the location of the certificates (which we issue using Vault):

ssl_certificate /etc/ssl/certs/nginx-selfsigned.crt;
ssl_certificate_key /etc/ssl/private/nginx-selfsigned.key;

  2. Then we describe the SSL configuration in /etc/nginx/snippets/ssl-params.conf:

ssl_protocols TLSv1.3;
ssl_prefer_server_ciphers on;
ssl_dhparam /etc/nginx/dhparam.pem; 
ssl_ciphers EECDH+AESGCM:EDH+AESGCM;
ssl_ecdh_curve secp384r1;
ssl_session_timeout  10m;
ssl_session_cache shared:SSL:10m;
ssl_session_tickets off;
ssl_stapling on;
ssl_stapling_verify on;
resolver 77.88.8.8 77.88.8.1 valid=300s;
resolver_timeout 5s;
preload";
add_header X-Frame-Options DENY;
add_header X-Content-Type-Options nosniff;
add_header X-XSS-Protection "1; mode=block";

And we create a DHparam file using the command:
sudo openssl dhparam -out /etc/nginx/dhparam.pem 4096

What is it for? The Diffie–Hellman protocol allows two or more parties to obtain a shared secret key over an insecure communication channel that an attacker can listen to. The resulting key is used to encrypt further exchanges with symmetric encryption algorithms. By itself it does not protect against a man-in-the-middle attack and is used here primarily as an additional security measure. One note: key generation can take an unpredictable amount of time, from a few seconds to tens of minutes, so if the process seems frozen, pour yourself a hot drink, sit back and watch a cat video or read something useful.

Next, we save the NGINX configuration with the specified includes:

server {
        listen 80;
        server_name gitlab.example.com;
        location / {
                return 301 https://$host$request_uri;
        }
}
server {
        listen 443 ssl http2;
        include snippets/self-signed.conf;
        include snippets/ssl-params.conf;
        server_name gitlab.example.com;
        client_max_body_size 3000m;
        access_log /var/log/nginx/git.access.log;
        error_log /var/log/nginx/git.error.log;
        location / {
                proxy_pass         http://127.0.0.1:8080;
                proxy_redirect     off;
                add_header    X-Frame-Options     "SAMEORIGIN";  
                proxy_set_header    Host                $http_host;
                proxy_set_header    X-Real-IP           $remote_addr;
                proxy_set_header    X-Forwarded-For     $proxy_add_x_forwarded_for;
                proxy_set_header    X-Forwarded-Proto   $scheme;
        }
}

Then we check the configuration and reload NGINX:
sudo nginx -t
sudo nginx -s reload

Most likely you will see something like:

nginx: the configuration file /etc/nginx/nginx.conf syntax is ok
nginx: configuration file /etc/nginx/nginx.conf test is successful

This means that everything is OK. Now it's time to start nginx and check that the previously launched GitLab prompts us to log in.

  3. A cluster of everyone's favorite Kubernetes (hereinafter referred to as k8s). The solution is Managed Service for Kubernetes from Yandex Cloud.
    Firstly: the budget allows it;
    secondly: we relieve ourselves and our colleagues of most of the future responsibility for the cluster's health;
    thirdly: our team has made a mind-blowing Terraform module, with which the cluster is deployed quickly. All the necessary options are described, and the module is supported and actively developed.

  4. Vault cluster, with Consul as the backend. Why Vault? All our services need certificates! Accordingly, the next step is to deploy and configure cert-manager in conjunction with Vault; a sketch of the Vault-side preparation is below.
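
A minimal sketch of the Vault-side setup (the pki/sign/example path and the clusterissuer role match the ClusterIssuer manifest below; the mount names, service account, domain and TTLs are assumptions):

# enable the PKI engine and a role that will sign certificates
vault secrets enable pki
vault write pki/roles/example allowed_domains=example.com allow_subdomains=true max_ttl=72h

# enable Kubernetes auth so cert-manager can log in with its service account token
vault auth enable kubernetes
vault write auth/kubernetes/config kubernetes_host="https://<k8s API address>:443"
# the 'pki' policy must allow create/update on pki/sign/example
vault write auth/kubernetes/role/clusterissuer \
    bound_service_account_names=issuer \
    bound_service_account_namespaces=default \
    policies=pki \
    ttl=1h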

  5. Cert-manager. We launch it in the Kubernetes cluster. To receive certificates from Vault quickly and easily, we configure a ClusterIssuer. For our purposes, the manifest looks like this:

apiVersion: cert-manager.io/v1
kind: ClusterIssuer
metadata:
 name: vault-issuer
spec:
 vault:
   auth:
     kubernetes:
       mountPath: /v1/auth/kubernetes
       role: clusterissuer
       secretRef:
         key: token
         name: issuer-token-lmzpj
   path: pki/sign/example
   server: http://vault.example.com

This is a little different from what is written in the instructions, so a couple of minor adjustments are needed:

  • specify the role name that you need (as you can see, I don't have the richest imagination, hence “clusterissuer”);

  • path in my case is “pki/sign/example” (the location of your certificates in Vault).
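
Whether the issuer is alive can be checked right away:

kubectl get clusterissuer vault-issuer
# READY must be True; if it is not, the reason is in the Status section
kubectl describe clusterissuer vault-issuer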

Well, for our Ingresses to become protected, we add an annotation with the ClusterIssuer name to their manifests; an example is below (note the tls section: cert-manager stores the issued certificate in the secret it references):

apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
 name: example-name
 namespace: example
 labels:
   app: example
 annotations:
   cert-manager.io/cluster-issuer: vault-issuer
   meta.helm.sh/release-name: example-name
   meta.helm.sh/release-namespace: example
spec:
 ingressClassName: nginx
 tls:
   - hosts:
       - example.example.com
     secretName: example-tls
 rules:
   - host: example.example.com
     http:
       paths:
         - path: /
           pathType: ImplementationSpecific
           backend:
             service:
               name: example
               port:
                 number: 8080
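
After applying the manifest, cert-manager creates a Certificate object and puts the issued certificate into the example-tls secret from the tls section above; readiness is easy to verify:

kubectl get certificate -n example
kubectl describe secret example-tls -n example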

Having finished with certificates, we secure all the remaining services by adding the same annotation to all their Ingresses:

 annotations:
   cert-manager.io/cluster-issuer: vault-issuer

  6. Keycloak. As I wrote above, we planned to do everything based on the Keycloak + LDAP bundle, but our plans changed: as it turned out, ADFS had already been configured instead of LDAP. No problem, we thought, and started working with what was already there.

We needed a single point of authorization in AD via Keycloak ⬌ ADFS.
As a result, we started setting up SSO based on SAML 2.0.

  • SSO. We make people's lives easier: log in once and you're free all day. While the session is alive, you can log in to any service added to Keycloak. How it works: a user is created in Active Directory with a username and password, and with these credentials the user is authorized across the systems for the duration of the session.

  • SAML. Roughly speaking, it is an XML-based standard for exchanging security information (authentication and authorization data) between different systems. It allows credentials to be passed between services without the user having to re-enter them.

Don't touch – it will kill you

I invite you to the comments for a holy war between supporters of SAML and OIDC. As they say, truth is born in dispute 🙂

After setting up Keycloak, we set up authorization for all services that were deployed earlier:

  • GitLab
  • Vault
  • Grafana
  • Kibana

We will not describe everything related to Keycloak, or the article will turn into a gallery. On the Keycloak side, the identity provider is configured according to the official instructions. And here it is.

Afterwards, Clients are configured, one for each service, except Kibana: for it we use oauth2-proxy. Information on its setup and configuration can be found in this article.

We configure the rest via SAML according to the official documentation.

Voila! The client got a completely closed development circuit, I got cool experience, and you got an article. In the end, everyone is happy.

Thanks for reading! I hope you found it helpful.

By the way, very soon we will hold a cool webinar where our team will talk about the intricacies and nuances of cloud migration. To learn more, click here.
