containerd: add option to set parent cgroup #5033

WanzenBug commented on Jun 13, 2024

Using the runc.v2 runtime, it is possible to configure containerd to start runc with the "systemd_cgroup" flag. This will cause runc to use systemd to manage the container cgroups. For this configuration to work, runc needs the cgroup name to be of a special form: <systemd.slice>:<parent>:<name>. This is already implemented in the containerd runtime package, provided that a parent cgroup of the form <systemd.slice>:<parent>: is set.

This commit adds the option to configure such a parent cgroup for the containerd worker. By default, it will still use an empty string as cgroup parent, keeping existing behaviour.

Using a configuration like:

  [worker.containerd]
  defaultCgroupParent = "system.slice:buildkit:"
  [worker.containerd.runtime]
    name = "io.containerd.runc.v2"
    [worker.containerd.runtime.options]
      SystemdCgroup = true

a user is able to have their container cgroups managed by systemd. This makes it possible to apply the same resource constraints to every container using a systemd drop-in configuration. With the example above, the following file limits each container spawned by buildkit to one CPU and 1G of RAM:

  $ cat /etc/systemd/system/buildkit-.scope.d/limits.conf
  [Scope]
  CPUQuota=100%
  MemoryMax=1G
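
To make the naming concrete, here is a minimal Go sketch of how a parent of the form `<systemd.slice>:<parent>:` could be combined with a container ID, and how the resulting `<systemd.slice>:<parent>:<name>` path maps to the transient scope unit that the `buildkit-.scope.d` drop-in matches. The helpers `cgroupsPath` and `scopeUnit` are hypothetical names used for illustration, and the non-systemd branch is an assumption; this is not the code from the PR.

```go
package main

import (
	"fmt"
	"path"
	"strings"
)

// cgroupsPath is a hypothetical helper (not BuildKit's actual function).
// It returns the cgroup path that would be handed to runc for container id.
func cgroupsPath(parent, id string) string {
	if parent == "" {
		// Default: no parent configured, so the runtime picks the path.
		return ""
	}
	// systemd form: "<slice>:<prefix>:" with the container ID appended.
	if strings.Contains(parent, ".slice") && strings.HasSuffix(parent, ":") {
		return parent + id
	}
	// Assumed fallback for cgroupfs: a plain filesystem-style path.
	return path.Join("/", parent, id)
}

// scopeUnit derives the transient systemd unit name used for a
// "<slice>:<prefix>:<name>" path, which is "<prefix>-<name>.scope".
func scopeUnit(p string) (string, bool) {
	parts := strings.Split(p, ":")
	if len(parts) != 3 {
		return "", false
	}
	return parts[1] + "-" + parts[2] + ".scope", true
}

func main() {
	p := cgroupsPath("system.slice:buildkit:", "abc123")
	unit, _ := scopeUnit(p)
	fmt.Println(p)    // system.slice:buildkit:abc123
	fmt.Println(unit) // buildkit-abc123.scope
}
```

Because every unit name starts with `buildkit-`, a single drop-in directory named `buildkit-.scope.d` applies to all of them.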

@@ -328,7 +328,7 @@ func containerdWorkerInitializer(c *cli.Context, common workerInitializerOpt) ([
 			Options: opts,
 		}
 	}
-	opt, err := containerd.NewWorkerOpt(common.config.Root, cfg.Address, snapshotter, cfg.Namespace, cfg.Rootless, cfg.Labels, dns, nc, common.config.Workers.Containerd.ApparmorProfile, common.config.Workers.Containerd.SELinux, parallelismSem, common.traceSocket, runtime, ctd.WithTimeout(60*time.Second))
+	opt, err := containerd.NewWorkerOpt(common.config.Root, cfg.Address, snapshotter, cfg.Namespace, cfg.CgroupParent, cfg.Rootless, cfg.Labels, dns, nc, common.config.Workers.Containerd.ApparmorProfile, common.config.Workers.Containerd.SELinux, parallelismSem, common.traceSocket, runtime, ctd.WithTimeout(60*time.Second))
Member

This list of arguments is getting really long, and the change probably requires updates in Moby as well. Should we implement a struct for passing options, similar to https://github.com/moby/moby/blob/450f18d3cae6be625d671b6bacd3eefcaaa13bb9/builder/builder-next/builder.go#L79 ?

Member

Ah, I wanted to comment on the other file 🙈 same applies 😅

Member

Don't refactor this in the same PR. Future follow-up welcome.
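
For reference, the options-struct idea raised in this thread might look roughly like the sketch below. Everything here is hypothetical: `WorkerOptParams`, its fields, and `newWorkerOpt` are illustrative names and are not part of BuildKit's or Moby's API. The point is only that a new setting such as the cgroup parent becomes one extra field instead of one more positional argument.

```go
package main

import (
	"errors"
	"fmt"
)

// WorkerOptParams is a hypothetical options struct; names and fields are
// illustrative and do not mirror the real NewWorkerOpt signature exactly.
type WorkerOptParams struct {
	Root         string
	Address      string
	Snapshotter  string
	Namespace    string
	CgroupParent string // empty string keeps the existing behaviour
	Rootless     bool
	Labels       map[string]string
}

// newWorkerOpt is a stand-in constructor that validates the required fields
// and returns a short summary string instead of a real worker option.
func newWorkerOpt(p WorkerOptParams) (string, error) {
	if p.Root == "" || p.Address == "" {
		return "", errors.New("root and address are required")
	}
	return fmt.Sprintf("%s ns=%s cgroupParent=%q", p.Address, p.Namespace, p.CgroupParent), nil
}

func main() {
	summary, err := newWorkerOpt(WorkerOptParams{
		Root:         "/var/lib/buildkit",
		Address:      "/run/containerd/containerd.sock",
		Namespace:    "buildkit",
		CgroupParent: "system.slice:buildkit:",
	})
	if err != nil {
		panic(err)
	}
	fmt.Println(summary)
}
```

Callers that do not care about the new field simply leave it unset, so existing call sites would not need to change when another option is added later.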

Signed-off-by: Moritz "WanzenBug" Wanzenböck <[email protected]>
WanzenBug force-pushed the option-for-cgroup-parent-in-containerd-worker branch from ef8da09 to 5cb8618 on June 14, 2024 06:47
tonistiigi merged commit 935713c into moby:master on Jun 18, 2024
76 checks passed