Evgeniy DockerAuthPlugin'ovich Onegin

Interesting start, isn't it? My name is Roman, and I am a junior information security engineer at Ozon. In this article, I will talk about the problem of lack of access authorization to Docker daemon / Docker Engine API / Docker commands when working with containers in the Docker ecosystem and how it can be solved using 11 almost poetic lines of bash.

Speaking about poetry, the first thing that comes to my mind is literature classes, where my favorite novel was “Eugene Onegin.” At school, the literature teacher told us: “It’s you who don’t understand this now… Later, after a year, five or even 20 years, you will again touch books that, it would seem, you already know inside and out. And then you will understand everything that we are discussing here, or even discover something for yourself that you have no idea about.”

Having decided to re-read Alexander Sergeevich Pushkin’s novel “Eugene Onegin,” I was pleasantly surprised that I was able to discover something new about long-forgotten characters, perhaps even consider their relationship from a different perspective. So, for example, I never thought that Tatyana Larina could have been 14 years old at the time of her first meeting with Evgeniy, that Lensky, with his “platonic” love, drove Olga “to the brink”, or that Evgeniy may have later fallen in love with Tatyana because of her position. As a growing information security engineer, I suddenly wondered: “What would the novel Eugene Onegin look like if it had been written by an information security officer?” Ahem-ahem, “Without preamble, let me introduce you to the hero of my novel right now.”…

Problem with access authorization, or How a Bug settled in the estate

Docker does not provide any authorization for access to the Docker daemon / Docker Engine API / Docker commands when working with containers. According to official documentation:

Docker's out-of-the-box authorization model is all or nothing. Any user with permission to access the Docker daemon can run any Docker client command.

Now let's simulate the situation:

You need multiple development teams on the same virtual machine to be able to communicate with each other in real time and develop using Docker.

Since developers will have access to Docker, they will have access to /var/run/docker.sock. This lets them send any request to the Docker daemon, including requests that help them escape from a container and gain root on the host.

Following the principle of least privilege, you cannot simply trust a large number of people and hope that no one breaks anything. Sooner or later someone may want to do mischief, stop our container, steal secrets from it, or simply break it by accident.

And here two security problems arise:

  1. Developers can run any commands through Docker Daemon, creating a bad container that allows them to gain privileged access on the host.

  2. Developers can interact with other teams' containers, even those they don't own.

For the first problem, it is enough to turn to the Internet and find an open-source solution that blocks the commands an engineer has prohibited (for example, the OPA plugin). The second one is more difficult.

Since it was not possible to find ready-made implementations to solve this problem in the public domain, I wrote a plugin that I will present in this article. It will be enough to install and run it. After that:

  • interaction with containers that do not belong to us will be limited;

  • The rules prohibiting the creation of bad containers will begin to work, namely:

  1. Privileged should be false. Privileged containers have their AppArmor and seccomp profiles disabled. They also hold all capabilities, so escaping from one is not difficult. Flag --privileged

  2. NetworkMode should not be host. Otherwise the container is not isolated from the Docker host's network and, given the NET_ADMIN and NET_RAW capabilities, can intercept all network traffic of the host. Flag --network

  3. IpcMode must be either empty, none, or private. Otherwise, it is possible to access the inter-process communication (IPC) mechanism. Flag --ipc

  4. Binds must be null. Otherwise a host path, up to the host's root filesystem, can be mounted into the container, which gives the container full access to files and directories on your host. Flag -v

  5. CapAdd must be null to prevent adding additional capabilities to container. Flag --cap-add

  6. PidMode should not be host. If the container is launched in the host's PID namespace and has the SYS_PTRACE capability, it can get a list of all processes on the host. Flag --pid

  7. SecurityOpt should always be left empty. This parameter is responsible for changing the apparmor and seccomp profiles, which serve as a security mechanism that maintains isolation in the container. Flag --security-opt

  8. Devices should remain empty. For example, --device=/dev/sda1 allows you to access the host file system.

  9. CgroupParent must be empty. Attaching the container to a cgroup that is already used by other processes or containers can be dangerous. Flag --cgroup-parent
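To make the nine rules concrete, here is a minimal Go sketch that decodes the body of a container-create request into a struct and reports the first rule it breaks. This is only an illustration under stated assumptions: the type and function names (hostConfig, createBody, firstViolation) are hypothetical, and the struct-based decoding is not how the plugin described in this article validates the body (it uses regular expressions, as shown later).

```go
package main

import (
	"encoding/json"
	"fmt"
)

// hostConfig mirrors only the HostConfig fields named in the rules above.
// Hypothetical type for illustration; not taken from the plugin's source.
type hostConfig struct {
	Privileged   bool
	NetworkMode  string
	IpcMode      string
	Binds        []string
	CapAdd       []string
	PidMode      string
	SecurityOpt  []string
	Devices      []json.RawMessage
	CgroupParent string
}

type createBody struct {
	HostConfig hostConfig
}

// firstViolation returns the name of the first rule the body breaks,
// or "" when the body passes all nine checks.
func firstViolation(body []byte) (string, error) {
	var b createBody
	if err := json.Unmarshal(body, &b); err != nil {
		return "", err
	}
	h := b.HostConfig
	switch {
	case h.Privileged:
		return "Privileged", nil
	case h.NetworkMode == "host":
		return "NetworkMode", nil
	case h.IpcMode != "" && h.IpcMode != "none" && h.IpcMode != "private":
		return "IpcMode", nil
	case len(h.Binds) > 0:
		return "Binds", nil
	case len(h.CapAdd) > 0:
		return "CapAdd", nil
	case h.PidMode == "host":
		return "PidMode", nil
	case len(h.SecurityOpt) > 0:
		return "SecurityOpt", nil
	case len(h.Devices) > 0:
		return "Devices", nil
	case h.CgroupParent != "":
		return "CgroupParent", nil
	}
	return "", nil
}

func main() {
	body := []byte(`{"Image":"alpine","HostConfig":{"Privileged":false,"PidMode":"host"}}`)
	rule, err := firstViolation(body)
	fmt.Println(rule, err) // prints "PidMode <nil>"
}
```

A struct keeps each rule readable, but it only sees the value the JSON decoder settles on for each key, which is one reason a real plugin may prefer to inspect the raw body.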

The Bugs family evicts Eugene from his estate

Developer identifier as container dressing

I suggest we first deal with restricting access to other people's containers. To tell users apart, the plugin needs two things: a unique user identifier delivered with each request, and the ID of the container the user wants to access. What can act as the unique identifier? A closer look at the documentation turns up a passage which says:

The property HttpHeaders specifies a set of headers to include in all messages sent from the Docker client to the daemon. Docker doesn't try to interpret or understand these headers; it simply puts them into the messages. Docker does not allow these headers to change any headers it sets for itself.

That is, we can define a static, randomly generated unique identifier for each user and store it in ~/.docker/config.json, in the user's home directory. Every request the docker client sends to the daemon will then carry this identifier in a header. We will retrieve the identifier and match it against the container.

Hmm, how will the developer end up with the identifier in the file? And what if a team needs to have access to each other's containers?

I think that a bash script that can be launched by root will come to the rescue here. The script checks whether the user has ~/.docker/config.json; if not, it creates the config. If the config already exists but lacks the header with the identifier, the script adds it. Everything should therefore go seamlessly for developers. If developers need access to each other’s containers, they have to make sure the whole team uses the same identifier.

How will the matching process take place? How do we know that the container has been created and it's time to compare?

To answer this question, let us turn to the documentation. It shows that requests that work with containers mostly have the following form:

/v1.44/containers/{id}/stop

/v1.44/containers/{id}/kill

/v1.44/containers/{id}/{action}

Note that the container creation API looks like this:

/v1.44/containers/create

It might seem difficult to link a user to a container at creation time, because the create request carries no container ID, but this is not the case. Let's consider two options:

  1. creating a container via the docker CLI: docker run -p 80:80 custom_image

  2. creation via direct request to docker daemon

In the first case, running docker run -p 80:80 custom_image results in several requests:

  1. /v1.44/containers/create + Body

  2. /v1.44/containers/{id}/wait?condition=next-exit

  3. /v1.44/containers/{id}/start

That is, right after creation we will still see a request that carries the new container's ID. As for the second option, a developer who has created a container will sooner or later need to access it, unless, of course, we expect containers to be created for no reason and left hanging like “dead souls”.
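The request shapes listed above can be told apart with a few lines of Go. The sketch below assumes the version prefix has already been removed from the path; parseContainerPath is a hypothetical helper written for this illustration and is not part of the plugin.

```go
package main

import (
	"fmt"
	"strings"
)

// parseContainerPath splits a version-less Engine API path such as
// "/containers/{id}/{action}" into the container reference and the action.
// ok is false for paths that do not address a single container.
func parseContainerPath(path string) (ref, action string, ok bool) {
	// Drop the query string, e.g. "?condition=next-exit".
	if i := strings.IndexByte(path, '?'); i >= 0 {
		path = path[:i]
	}
	parts := strings.Split(strings.Trim(path, "/"), "/")
	if len(parts) < 2 || parts[0] != "containers" {
		return "", "", false
	}
	if len(parts) == 2 {
		switch parts[1] {
		case "create", "json", "prune":
			// Collection-level endpoints carry no container reference.
			return "", parts[1], false
		}
		// For example DELETE /containers/{id}, which has no action segment.
		return parts[1], "", true
	}
	return parts[1], parts[2], true
}

func main() {
	for _, p := range []string{
		"/containers/create",
		"/containers/3f2a9c/wait?condition=next-exit",
		"/containers/3f2a9c/start",
	} {
		ref, action, ok := parseContainerPath(p)
		fmt.Printf("%q -> ref=%q action=%q ok=%v\n", p, ref, action, ok)
	}
}
```

The reference returned here may be a name, a full ID, or a short ID prefix, so it still has to be resolved before it can be compared with anything stored.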

What is meant by container id?

A container can be addressed by name or by an ID prefix of any length that uniquely identifies it. The important point is that we must also handle container names, because when we link a user to a container we only have the container's full ID.

And now the sound of the Breguet rings: the idea of a plugin asks to be implemented

Let's now consider the implementation of the idea. The code is here: https://github.com/I-am-Roman/docker-auth-plugin. Once the bash script has added AuthHeader to ~/.docker/config.json, you can connect the plugin; how to do this is described here: https://github.com/I-am-Roman/docker-auth-plugin?tab=readme-ov-file#enable-the-authorization-plugin-on-docker-engine. Let's move on to how it all works:

docker_dir="$HOME/.docker"
config_file="$docker_dir/config.json"

if [ ! -d "$docker_dir" ]; then
    mkdir -p "$docker_dir"
fi

if [ ! -f "$config_file" ]; then
    # No config yet: create one that contains only our header
    echo '{
  "HttpHeaders": {
    "AuthHeader": "'$(openssl rand -hex 16)'"
  }
}' > "$config_file"
    echo "Settings successfully updated in $config_file"
else
    if grep -q '"AuthHeader":' "$config_file"; then
        echo "AuthHeader already exists in $config_file"
    else
        authHeader=$(openssl rand -hex 16)
        # Set only HttpHeaders.AuthHeader so that other configured headers survive
        jq --arg h "$authHeader" '.HttpHeaders.AuthHeader = $h' "$config_file" > "$config_file.tmp" && mv "$config_file.tmp" "$config_file"
        echo "Settings successfully updated in $config_file"
    fi
fi

At startup, we take ADMIN_TOKEN from the environment variables (it holds the SHA-256 hash of the real token) and create our docker plugin.

  AdminToken = os.Getenv("ADMIN_TOKEN")
  plugin.DefineAdminToken(AdminToken)

  authPlugin, err := plugin.NewPlugin()
  if err != nil {
      log.Fatal(err)
  }
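The article does not show how the token hash is computed or compared, so the following is only a sketch of one way to do it. The names calculateHash, isAdmin and adminTokenHash are stand-ins chosen for this example, and hex encoding plus a constant-time comparison are assumptions rather than facts about the plugin.

```go
package main

import (
	"crypto/sha256"
	"crypto/subtle"
	"encoding/hex"
	"fmt"
)

// adminTokenHash is assumed to hold the hex-encoded SHA-256 of the admin
// token, as it would arrive through the ADMIN_TOKEN environment variable.
var adminTokenHash string

// calculateHash returns the hex-encoded SHA-256 of a token.
func calculateHash(token string) string {
	sum := sha256.Sum256([]byte(token))
	return hex.EncodeToString(sum[:])
}

// isAdmin reports whether keyHash equals the configured admin hash.
// An unset admin hash never matches, so a missing ADMIN_TOKEN cannot
// turn an empty header into an admin.
func isAdmin(keyHash string) bool {
	if adminTokenHash == "" {
		return false
	}
	return subtle.ConstantTimeCompare([]byte(keyHash), []byte(adminTokenHash)) == 1
}

func main() {
	adminTokenHash = calculateHash("example-admin-token")
	fmt.Println(isAdmin(calculateHash("example-admin-token"))) // true
	fmt.Println(isAdmin(calculateHash("some-developer-token"))) // false
}
```

Storing only the hash in the environment means that reading the plugin's configuration does not reveal the admin token itself.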

Docker plugin will work in pre-hook mode. That is, before executing a command received, for example, through the docker cli, the Docker daemon will send a request to us, asking whether the request can be executed or not. We will answer: “Yes, you can” or “No, you can’t.” Requests will be sent to the API – AuthZReq.
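To illustrate the pre-hook exchange, here is a standard-library-only Go sketch of an HTTP handler for the /AuthZPlugin.AuthZReq endpoint. The plugin in this article uses ready-made authorization types instead; the struct definitions, the decide function and its "allow only ping" policy below are placeholders invented for the example.

```go
package main

import (
	"encoding/json"
	"fmt"
	"net/http"
	"net/http/httptest"
	"strings"
)

// authZRequest holds the subset of the daemon's request that this sketch uses.
type authZRequest struct {
	RequestMethod  string            `json:"RequestMethod"`
	RequestURI     string            `json:"RequestUri"`
	RequestHeaders map[string]string `json:"RequestHeaders"`
}

// authZResponse is the plugin's answer: "yes, you can" or "no, you can't".
type authZResponse struct {
	Allow bool   `json:"Allow"`
	Msg   string `json:"Msg,omitempty"`
}

// decide is a placeholder policy: it allows only the ping endpoint.
func decide(req authZRequest) authZResponse {
	if strings.HasSuffix(req.RequestURI, "/_ping") {
		return authZResponse{Allow: true}
	}
	return authZResponse{Allow: false, Msg: "denied by example policy: " + req.RequestURI}
}

// authZReqHandler decodes the daemon's question and encodes the decision.
func authZReqHandler(w http.ResponseWriter, r *http.Request) {
	var req authZRequest
	if err := json.NewDecoder(r.Body).Decode(&req); err != nil {
		http.Error(w, err.Error(), http.StatusBadRequest)
		return
	}
	w.Header().Set("Content-Type", "application/json")
	json.NewEncoder(w).Encode(decide(req))
}

// ask simulates one daemon call against the handler and returns the decision.
func ask(uri string) authZResponse {
	body := fmt.Sprintf(`{"RequestMethod":"GET","RequestUri":%q}`, uri)
	rec := httptest.NewRecorder()
	req := httptest.NewRequest(http.MethodPost, "/AuthZPlugin.AuthZReq", strings.NewReader(body))
	authZReqHandler(rec, req)
	var resp authZResponse
	json.NewDecoder(rec.Body).Decode(&resp)
	return resp
}

func main() {
	fmt.Println(ask("/v1.44/_ping").Allow)               // true
	fmt.Println(ask("/v1.44/containers/abc/stop").Allow) // false
}
```

In a real deployment the handler would listen on a socket the daemon knows about; httptest is used here only so the sketch runs without a daemon.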

Flow authentication scheme

I've received a request, what's next? Before the checks begin, we need:

  1. the request itself;

  2. the request body (as a string);

  3. two maps: one stores id:hash_key, the other id:name;

  4. the request path with the API version removed, if present. As the examples above show, the docker CLI prefixes its requests with the Engine API version. Rules written against the version-less path could be bypassed through this prefix, so we trim it.

  obj := reqURL.String()
  reqBody, _ := url.QueryUnescape(string(req.RequestBody))

  // Cropping the version /v1.42/containers/...
  re := regexp.MustCompile(`/v\d+\.\d+/`)
  obj = re.ReplaceAllString(obj, "/")
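One detail worth considering: the pattern above is not anchored, so it would also rewrite a version-like segment in the middle of a path, for example a container that happens to be named v1.2. The variant below is a sketch of an anchored alternative; stripVersion is a hypothetical helper and not code from the plugin.

```go
package main

import (
	"fmt"
	"regexp"
)

// versionPrefix matches an API version only at the very start of the path.
var versionPrefix = regexp.MustCompile(`^/v\d+\.\d+/`)

// stripVersion removes a leading "/v<major>.<minor>" segment, if present.
func stripVersion(path string) string {
	return versionPrefix.ReplaceAllString(path, "/")
}

func main() {
	fmt.Println(stripVersion("/v1.44/containers/abc/stop")) // /containers/abc/stop
	fmt.Println(stripVersion("/containers/v1.2/stop"))      // /containers/v1.2/stop
}
```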

After this you can start checking:

  1. Is this action allowed (we check for a match). Such actions can be performed without a token;

  for _, j := range AllowToDo {
      if obj == j {
          return authorization.Response{Allow: true}
      }
  }
  2. Is this action prohibited (here we check that the request does not start with a prefix we have prohibited);

  for _, j := range ForbiddenToDo {
      keyHash := CalculateHash(req.RequestHeaders[headerWithToken])
      if yes := IsItAdmin(keyHash); yes {
          return authorization.Response{Allow: true}
      }
      if strings.HasPrefix(obj, j) {
          return authorization.Response{Allow: false, Msg: "Access denied by AuthPlugin: " + obj}
      }
  }
  3. Next, we check that the container is being created or updated without dangerous parameters, but more on that later.

  4. Next, we check through strings.HasPrefix(obj, actionWithContainerAPI) that the request involves some action with a container:

    a. get AuthHeader;

    b. calculate hash;

    c. We assume that the container has just been created and we do not yet know its name, so we determine it:

    i. The docker daemon can be accessed not only through the docker CLI but also directly. We take advantage of this by running a query similar to docker ps -a;

    ii. From the result we build a set of the containers that currently exist (doesThisIDExist in the code below);

    iii. Containers in our maps that the daemon no longer reports are marked for removal;

    iv. Deleting from a map while iterating over it felt unsafe to me (although Go permits it), so we first collect the keys to delete in a separate map and remove them afterwards;

func CheckDatabaseAndMakeMapa() error {
	ctx := context.Background()
	cli, err := client.NewClientWithOpts(client.FromEnv, client.WithAPIVersionNegotiation())
	if err != nil {
		return err
	}
	defer cli.Close()

	// similar to the "docker ps -a"
	containers, err := cli.ContainerList(ctx, types.ContainerListOptions{All: true})
	if err != nil {
		return err
	}

	// Build a set of the IDs the daemon reports, so that we can say
	// with confidence which containers still exist.
	doesThisIDExist := make(map[string]bool)
	for _, container := range containers {
		ID := container.ID[:12]
		name := container.Names[0]

		// The daemon returns names as "/<nameOfContainer>", so strip the leading "/".
		name = strings.TrimLeft(name, "/")

		doesThisIDExist[ID] = true
		if _, exists := IDAndNameMapping[ID]; !exists {
			IDAndNameMapping[ID] = name
		}
	}

	// Create temporary map for key storage we need to delete from IDAndNameMapping
	keysToDelete := make(map[string]bool)
	for key := range IDAndNameMapping {
		if !doesThisIDExist[key] {
			keysToDelete[key] = true
		}
	}

	// Drop stale containers from both maps; delete is a no-op for a missing key.
	for oldId := range keysToDelete {
		delete(IDAndNameMapping, oldId)
		delete(IDAndHashKeyMapping, oldId)
	}

	return nil
}

    d. We take the supposed ID from the request path (we don't yet know whether it is a name, an ID, or garbage) and work out what it is:

    i. if it is a name, we replace it with the ID. The user decides how to refer to the container, so we need to be prepared for this;

    ii. since we store 12-character IDs in the map, to match a shorter ID (say, 8 characters) we compare it with the same-length prefix of each stored ID;

    iii. if nothing matches, the request does not contain a container ID;

func DefineContainerID(obj string) string {
	partsOfApi := strings.Split(obj, "/")
	containerID := partsOfApi[2]
	isitNameOfContainer := false

	for id := range IDAndNameMapping {
		if containerID == IDAndNameMapping[id] {
			isitNameOfContainer = true
			// Redefining containerID
			containerID = id
			break
		}
	}

	// The value is neither a known name, nor a full 64-character ID, nor a 12-character ID
	if len(containerID) != 64 && len(containerID) != 12 && !isitNameOfContainer {
		IsItShortId := false
		if len(containerID) > 12 {
			containerID = containerID[:12]
		}
		for ID := range IDAndHashKeyMapping {
			if ID[:len(containerID)] == containerID {
				containerID = ID
				IsItShortId = true
				break
			}
		}
		// Nothing matched: the value is garbage
		if !IsItShortId {
			return trash
		}
	}

	return containerID[:12]
}
  5. After we have obtained the ID, we check whether this container already belongs to someone.

    a. if owned, then check that the keyHash of the user who wants to perform the action is equal to the keyHash to which the container belongs.

    b. We also check if this is an admin.

func AllowMakeTheAction(keyHashFromMapa string, keyHash string) bool {
	// The owner and the admin may act on the container; nobody else may.
	return keyHashFromMapa == keyHash || IsItAdmin(keyHash)
}

keyHashFromMapa, found := IDAndHashKeyMapping[containerID]
if found {
    if allow := AllowMakeTheAction(keyHashFromMapa, keyHash); allow {
        return authorization.Response{Allow: true}
    } else {
        return authorization.Response{Allow: false, Msg: "Access denied by AuthPlugin. That's not your container"}
    }
} else {
    log.Println("That's container was created right now:", containerID)
    IDAndHashKeyMapping[containerID] = keyHash
    return authorization.Response{Allow: true}
}
  6. There is also a separate check for exec. Its request looks different, so it has to be handled separately. The only difference is that we don't need to match the container's name and ID again, since that was done earlier.

// If it is exec, we don't need to execute CheckDatabaseAndMakeMapa
if strings.HasPrefix(obj, execAtContainerAPI) {
  key, found := req.RequestHeaders[headerWithToken]
  if !found {
      instruction := fmt.Sprintf("Access denied by AuthPlugin. Authheader is Empty. Follow instruction - %s", manual)
      return authorization.Response{Allow: false, Msg: instruction}
  }

  keyHash := CalculateHash(key)
  containerID := DefineContainerID(obj)
  if containerID == trash {
      return authorization.Response{Allow: true}
  }

  keyHashFromMapa, found := IDAndHashKeyMapping[containerID]
  if found {
      if allow := AllowMakeTheAction(keyHashFromMapa, keyHash); allow {
          return authorization.Response{Allow: true}
      } else {
          return authorization.Response{Allow: false, Msg: "Access denied by AuthPlugin. You can't exec other people's containers"}
      }
  }
}
  7. If we have reached the end of the function, we allow the action, so as not to block requests that need no control.

A more pressing issue is Docker container escape. Companies therefore try to solve it first, by installing open-source or paid solutions. Anyone who also wanted to solve the lack of authorization for container access would then have to run several plugins. Docker allows you to install several plugins, but it's such a hassle. I wanted to get to the point where you can simply “install and run”. The described solution therefore also restricts actions according to configured policies. I, in turn, added policies aimed at preventing the creation of bad containers; they were described at the very beginning of the article. Let's look at how it all works:

  1. Once we've passed the condition that the container is being created or updated, we need to check the body of the request. This is where there will be information about what parameters the request to create a container came with.

  2. We check whether the administrator is trying to create a container. Policies will not apply to Admin.

  if obj == creationContainerAPI || updateRegex.MatchString(obj) {

      if req.RequestHeaders[headerWithToken] != "" {
          keyHash := CalculateHash(req.RequestHeaders[headerWithToken])
          if yes := IsItAdmin(keyHash); yes {
              return authorization.Response{Allow: true}
          }
      }

      // Allow to create without AuthHeader, because we don't have the container ID at this step
      yes, failedPolicy := containerpolicy.ComplyTheContainerPolicy(reqBody)
      if !yes {
          msg := fmt.Sprintf("Container Body does not comply with the container policy: %s", failedPolicy)
          return authorization.Response{Allow: false, Msg: "Access denied by AuthPlugin." + msg}
      }
  }
  3. We check that the container complies with the policies. Let's look at the validation process in more detail:

    a. Three kinds of rule have been introduced. ExpectToSee checks that the value from the Body equals the value specified in the policy. DoesntExpectToSee checks that no value from the slice in the Body matches a value listed in the policy (everything is allowed except …). AllowToUse checks that every value in the slice is on the policy's list (nothing is allowed except …);

    b. all policy values are read from a CSV file;

    c. the Body is always converted to lower case;

    d. to write a policy, a developer needs to know which keys are used in the Body, because the keys and their values are then extracted with regular expressions.

  for _, row := range records {
      nameOfKey := strings.ToLower(row[0])
      valueFromCSV := strings.ToLower(row[1])
      typeOfData := row[2]
      kindOfPolicy := row[3]

      var searcher string

      switch typeOfData {
      case "slice":
          // will ignore null and []
          searcher = fmt.Sprintf(`"%s":\s*\[([^\]]*)\]`, nameOfKey)
      case "string":
          // will ignore null
          searcher = fmt.Sprintf(`"%s":"([^"]+)"`, nameOfKey)
      case "bool":
          searcher = fmt.Sprintf(`"%s":([^",]+)`, nameOfKey)
      }

      re := regexp.MustCompile(searcher)
      // Find every occurrence: someone may add the same key again with a forbidden value as a bypass
      matches := re.FindAllStringSubmatch(body, -1)
  4. We examine ALL matches found, because an attacker can intercept the request, for example with Burp Suite, and append a duplicate key with a prohibited value at the end.

  5. For AllowToUse and DoesntExpectToSee, we split the policy value into its items and check each value from the Body against them.

  for _, match := range matches {
      if match != nil {
          if kindOfPolicy == ExpectToSee {
              if match[1] != valueFromCSV {
                  return false, nameOfKey
              }
          } else if kindOfPolicy == DoesntExpectToSee {
              csv := strings.Trim(valueFromCSV, "[]")
              sliceFromCSV := strings.Split(csv, ",")
              // if will get: ["value1","value2","value3"]
              // regexpr give us at match[1] - "value1","value2","value3"
              // we should check every single value
              valueOfMatches := strings.Split(match[1], ",")

              for _, valueOfMatch := range valueOfMatches {
                  valueOfMatch := strings.Trim(valueOfMatch, "\"")
                  for _, dontExpect := range sliceFromCSV {
                      if dontExpect == valueOfMatch {
                          return false, nameOfKey
                      }
                  }
              }
          } else if kindOfPolicy == AllowToUse {
              csv := strings.Trim(valueFromCSV, "[]")
              sliceFromCSV := strings.Split(csv, ",")
              valueOfMatches := strings.Split(match[1], ",")

              for _, valueOfMatch := range valueOfMatches {
                  valueOfMatch = strings.Trim(valueOfMatch, "\"")
                  isItValueOK := false
                  for _, allowToUse := range sliceFromCSV {
                      if allowToUse == valueOfMatch {
                          isItValueOK = true
                          break
                      }
                  }
                  if !isItValueOK {
                      return false, nameOfKey
                  }
              }
          } else {
              log.Println("I don't know this policy!")
              return true, ""
          }
      }
  }

Regarding tests:

  1. Before the integration itself, you can run the unit tests (the *_test.go files) to check the plugin's functionality.

  2. A system designed to test an already integrated docker plugin: https://github.com/I-am-Roman/test-system-dockerAuthPlugin

Duel between Bug Bug and Evgeniy

The result is like an unexpected threshold

In this way, Evgeniy was able to regain his home, and we received a security tool. Now we can sum up the results.

Evgeniy regained his estate

We have implemented DockerAuthPlugin, including:

  1. Bypass for administrator;

  2. permission to execute without token:

    1. /ping

    2. /images/json; docker images

    3. /containers/json?all=1; docker ps -a

    4. /containers/json; docker ps

  3. ban on execution:

    1. /plugin; docker plugin ls, docker plugin create, docker plugin enable and others

    2. /volumes; docker volume ls, docker volume create and others

    3. /commit

  4. prohibition on creating/updating a container with the following settings:

    1. --privileged (prohibit creation with the “Privileged” parameter not equal to false)

    2. --cap-add (prohibit creation with the “CapAdd” parameter not equal to null)

    3. --security-opt (prohibit use of the --security-opt flag)

    4. --pid (prohibit creation with the “PidMode” parameter not equal to '' (empty string))

    5. --ipc (prohibit creation with the “IpcMode” parameter not equal to '' (empty string))

    6. -v (prohibit creation with the “Binds” parameter not equal to null)

    7. --cgroup-parent (prohibit creation with the “CgroupParent” parameter not equal to '' (empty string))

    8. --device (prohibit creation with the “PathOnHost” and “PathInContainer” parameters not equal to '' (empty string))

  5. authentication when using commands:

    1. docker stop

    2. docker inspect

    3. docker rm

    4. docker start

    5. docker pause

    6. docker unpause

    7. docker logs

    8. docker exec

    9. docker port

    10. docker cp

    11. docker update

    12. and others that require interaction with containers

  • A bash script that seamlessly adds the required header to ~/.docker/config.json for users.

  • A test system with test cases, which helps to catch errors when functionality is added or removed.

What's next?

1. Instead of a random token, you can use a JWT from an Identity Provider. You go to some resource, for example https://docker-jwt.com, which performs SSO authorization. After authorization, a JWT pair (access and refresh tokens) appears on the page and must be copied to ~/.docker/config.json. On the plugin side, authorization is then carried out by the groups in the token: the plugin can check whether a member of group X may access the container. To keep the access and refresh tokens up to date, you may have to implement a console utility that renews them.

2. Implement database support so that restarting the docker plugin does not erase all existing data. In the future, this will allow us to develop the idea of a role model.

3. Teach the plugin to perform tasks in parallel/asynchronously.
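One prerequisite for handling requests in parallel is that the ownership maps become safe for concurrent use. The following Go sketch shows a mutex-guarded registry with the same "first access claims the container" behavior described in this article. The type ownerRegistry and its method are hypothetical, and the admin bypass is left out for brevity.

```go
package main

import (
	"fmt"
	"sync"
)

// ownerRegistry maps a container ID to the key hash of its owner.
type ownerRegistry struct {
	mu     sync.Mutex
	owners map[string]string
}

func newOwnerRegistry() *ownerRegistry {
	return &ownerRegistry{owners: make(map[string]string)}
}

// claimOrCheck records keyHash as the owner when the container has none yet,
// and reports whether keyHash may act on the container.
func (r *ownerRegistry) claimOrCheck(containerID, keyHash string) bool {
	r.mu.Lock()
	defer r.mu.Unlock()
	owner, found := r.owners[containerID]
	if !found {
		r.owners[containerID] = keyHash
		return true
	}
	return owner == keyHash
}

func main() {
	reg := newOwnerRegistry()
	results := make([]bool, 50)
	var wg sync.WaitGroup
	// Fifty different users race for the same freshly created container.
	for i := range results {
		wg.Add(1)
		go func(i int) {
			defer wg.Done()
			results[i] = reg.claimOrCheck("0123456789ab", fmt.Sprintf("hash-%d", i))
		}(i)
	}
	wg.Wait()
	allowed := 0
	for _, ok := range results {
		if ok {
			allowed++
		}
	}
	fmt.Println(allowed) // 1: exactly one caller becomes the owner
}
```

Doing the lookup and the claim under one lock is what prevents two requests from both seeing the container as unowned.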

Write in the comments if you want to re-read “Eugene Onegin”?

End!


