Docker build with img failing on infra.ci with SYNC_USERMAP_ACK: got 255: Invalid argument #2190

Closed
dduportal opened this issue Jan 11, 2022 · 14 comments · Fixed by jenkins-infra/pipeline-library#319

@dduportal
Contributor

Service

infra.ci.jenkins.io

Summary

All the Docker builds on infra.ci that use img are failing with the error SYNC_USERMAP_ACK: got 255: Invalid argument, as caught by @smerle33.

The complete error is:

nsenter: failed to execv: No such file or directory
nsenter: failed to sync with parent: SYNC_USERMAP_ACK: got 255: Invalid argument
nsenter: failed to use newuidmap: Invalid argument

Reproduction steps

@dduportal dduportal added the triage Incoming issues that need review label Jan 11, 2022
@dduportal dduportal self-assigned this Jan 11, 2022
@dduportal dduportal added docker site:infra.ci.jenkins.io and removed triage Incoming issues that need review labels Jan 11, 2022
@dduportal
Contributor Author

We already had this error in the early phases of using img:

@dduportal
Contributor Author

dduportal commented Jan 11, 2022

@dduportal
Contributor Author

Found it: we changed the base image from Alpine to Debian in jenkins-infra/docker-builder#34, which breaks img.

@dduportal
Contributor Author

Panel of upcoming solutions:

  • Short term: revert the pipeline library to 2.0.2. This only affects Docker builds (not the plugin-site).
  • Medium term: either find a way to fix this issue under a Debian base image, or split the images and go back to Alpine for img.
  • Long term: either fully switch to Docker CE (see "Allow infra.ci.jenkins.io to build on VMs with Docker Engine", #13) or use Docker-in-Docker pods with sysbox.

@timja
Member

timja commented Jan 11, 2022

this helpful at all? genuinetools/img#348 (comment)

@dduportal
Contributor Author

this helpful at all? genuinetools/img#348 (comment)

That is extremely helpful! I missed this one, thanks a lot.

@dduportal
Contributor Author

At first sight, it seems feasible:

kubectl -n jenkins-agents exec -ti docker-builds-docker-jenkins-weekly-pr-360-10-d83kz-773jv-rpsc1 -- cat /proc/sys/kernel/unprivileged_userns_clone 
1

(required for rootless mode, see https://github.com/moby/buildkit/blob/master/docs/rootless.md)
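The check above can be wrapped in a small helper so it can be run in any agent container. This is only a sketch: the function name `userns_clone_status` is made up for illustration, and it assumes that a missing sysctl file means "unknown" rather than "disabled" (as far as I know this file comes from a distribution kernel patch, so it is not present everywhere).

```shell
#!/bin/sh
# Hypothetical helper (not part of img or buildkit): report whether
# unprivileged user namespaces look enabled on this kernel.
userns_clone_status() {
    # $1: path to the sysctl file; a parameter so a fixture can be used in tests.
    file="${1:-/proc/sys/kernel/unprivileged_userns_clone}"

    # Assumption: if the file is absent or unreadable we cannot conclude anything.
    if [ ! -r "$file" ]; then
        echo "unknown"
        return 0
    fi

    value="$(cat "$file")"
    if [ "$value" = "1" ]; then
        echo "enabled"
    else
        echo "disabled"
    fi
}
```

Calling `userns_clone_status` without an argument inside the pod should print `enabled` when the file contains 1, as in the kubectl output above.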

@timja
Member

timja commented Jan 11, 2022

or this? genuinetools/img#340 (comment)

@halkeye
Member

halkeye commented Jan 11, 2022

Or https://github.com/genuinetools/img/blob/master/Dockerfile.dev#L69

@dduportal
Contributor Author

TBH, moving to buildkit sounds like the most appealing option, since img is slowly dying (it was released before the buildkit stuff).

@dduportal
Contributor Author

But buildkit is also using alpine: https://github.com/moby/buildkit/blob/master/Dockerfile#L277 :'(

Sounds like @halkeye's solution of building the correct binaries is the closest one.

@dduportal
Contributor Author

Short term PR: jenkins-infra/pipeline-library#282

dduportal added a commit to jenkins-infra/pipeline-library that referenced this issue Jan 11, 2022
* fix(buildDockerAndPublishImage) downgrade docker-builder to 2.0.2

Temp. fix for jenkins-infra/helpdesk#2190

* Update buildDockerAndPublishImage.txt
@dduportal
Contributor Author

With version 2.0.4, the problem is that the newuidmap and newgidmap binaries come from the img Alpine image, so they are linked against musl and cannot work at all under a Debian distribution:

$ ldd /usr/bin/newuidmap
        linux-vdso.so.1 (0x00007ffdc2b4c000)
        libc.musl-x86_64.so.1 => not found

Let's try to keep the original debian binaries.
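To catch this kind of mismatch earlier, the ldd output can be filtered for unresolved libraries. The sketch below is illustrative only: `missing_libs` and `check_idmap_binaries` are hypothetical names, and it assumes the glibc `ldd` output format shown above (`<name> => not found`).

```shell
#!/bin/sh
# Hypothetical helper: read `ldd` output on stdin and print the name of
# every shared library reported as "not found".
missing_libs() {
    awk '/=> not found/ { print $1 }'
}

# Hypothetical wrapper: check both ID-mapping binaries found in PATH.
# Returns non-zero if a binary is absent or has unresolved libraries.
check_idmap_binaries() {
    status=0
    for bin in newuidmap newgidmap; do
        path="$(command -v "$bin")" || {
            echo "$bin: not found in PATH"
            status=1
            continue
        }
        libs="$(ldd "$path" 2>/dev/null | missing_libs)"
        if [ -n "$libs" ]; then
            echo "$path: unresolved libraries: $libs"
            status=1
        fi
    done
    return "$status"
}
```

For example, `ldd /usr/bin/newuidmap | missing_libs` would print `libc.musl-x86_64.so.1` for the output above, and `check_idmap_binaries` could run as a smoke test when building the image.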

@dduportal
Contributor Author

Since jenkins-infra/pipeline-library#311, you can set the attribute useContainer to true in the argument map of buildDockerAndPublishImage() to build using a valid Docker engine.

Gotta update parallelDockerUpdatecli() to support this attribute and use Docker by default, so that we can proceed with this.
