linkerd check --proxy shows completed pods #11280

Closed
jack1902 opened this issue Aug 22, 2023 · 1 comment · Fixed by #11295 or #11367

jack1902 commented Aug 22, 2023

What is the issue?

Running linkerd check --proxy reports pods that are in the completed state as having outdated proxies.

How can it be reproduced?

  1. Set up a cluster running an older version of linkerd, say 2.13.5.
  2. Run a job in the cluster that is injected with a proxy and shuts the proxy down appropriately (perhaps using linkerd-await).
  3. Upgrade linkerd to 2.13.6.
  4. Run linkerd check --proxy.

Logs, error output, etc

linkerd-data-plane
------------------
√ data plane namespace exists
√ data plane proxies are ready
√ data plane is up-to-date
‼ data plane and cli versions match
    example-job-kdkukky33v-jr4hn running stable-2.13.5 but cli running stable-2.13.6
    see https://linkerd.io/2.13/checks/#l5d-data-plane-cli-version for hints
√ data plane pod labels are configured correctly
√ data plane service labels are configured correctly
√ data plane service annotations are configured correctly
√ opaque ports are properly annotated

output of linkerd check -o short

linkerd check -o short
Status check results are √

Environment

  • k8s: 1.24
  • linkerd: 2.13.6
  • OS: MacOS

Possible solution

I believe the logic within linkerd should filter out any pods in a completed state.
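
A minimal sketch of that filtering idea, not the change that was eventually merged. It assumes a pod table with four whitespace-separated columns (namespace, name, phase, proxy version) on stdin, and the function name `outdated_running_pods` is hypothetical. Pods that kubectl displays as Completed have the phase `Succeeded`, so the sketch drops `Succeeded` and `Failed` pods before comparing versions.

```shell
# Hypothetical helper: reads "NAMESPACE NAME PHASE VERSION" rows on stdin and
# prints namespace/name for pods whose proxy version differs from $1.
# Pods in a terminal phase (Succeeded or Failed) are skipped, since their
# proxies are no longer running.
outdated_running_pods() {
  local expected="$1"
  awk -v expected="$expected" '
    $3 == "Succeeded" || $3 == "Failed" { next }   # finished pods: ignore
    $4 == "<none>" || $4 == "" { next }            # no proxy-version annotation
    $4 != expected { print $1 "/" $2 }
  '
}

# Possible usage (the column expressions are an assumption to check against
# your kubectl version; the expected value must match the annotation exactly):
#   kubectl get pods -A --no-headers \
#     -o custom-columns='NS:.metadata.namespace,NAME:.metadata.name,PHASE:.status.phase,VERSION:.metadata.annotations.linkerd\.io/proxy-version' |
#     outdated_running_pods stable-2.13.6
```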

Additional context

Slack: https://linkerd.slack.com/archives/C89RTCWJF/p1692689960803199

When upgrading linkerd, I generally like to check that all proxies are up to date. A simple way of doing this is linkerd version --proxy, but this shows:

Client version: stable-2.13.6
Server version: stable-2.13.6
Proxy versions:
	stable-2.13.5 (1 pods)
	stable-2.13.6 (N pods)
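
To spot the mismatch without reading the output by eye, the text can be filtered. This is only a sketch: it assumes the layout shown above (a "Client version:" line, then "Proxy versions:" followed by indented "<version> (<N> pods)" lines), and the function name `mismatched_proxy_versions` is hypothetical. It reports whatever linkerd prints, so on an affected release the count would still include completed pods.

```shell
# Hypothetical helper: reads `linkerd version --proxy` output on stdin and
# prints each proxy version line whose version differs from the client's.
mismatched_proxy_versions() {
  awk '
    /^Client version:/ { client = $3; next }       # remember the CLI version
    /^Proxy versions:/ { in_proxies = 1; next }    # proxy list starts here
    in_proxies && NF >= 1 && $1 != client { print $1, $2, $3 }
  '
}

# Possible usage:
#   linkerd version --proxy | mismatched_proxy_versions
```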

A useful way for me to check for outdated pods is to run kubectl through yq (this doesn't filter out completed pods either but feels similar to the logic implemented by linkerd):

export LINKERD_VERSION="2.13.6"
readarray outdatedPods < <(kubectl get pods -A -o yaml | yq -o=j -I=0 '.items[] | select(.metadata.annotations | has("linkerd.io/proxy-version")) | select(.metadata.annotations["linkerd.io/proxy-version"] != env(LINKERD_VERSION)) | {"name": .metadata.name, "namespace": .metadata.namespace}')

for pod in "${outdatedPods[@]}"; do
  name=$(echo "${pod}" | yq '.name')
  namespace=$(echo "${pod}" | yq '.namespace')

  echo "pod '${name}' in namespace '${namespace}' is running an outdated version of linkerd"
done
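
A variant that does leave out completed pods could let the API server do the filtering with a field selector on `status.phase`. The sketch below is an assumption-laden alternative to the script above, not linkerd's own logic: the function name `outdated_active_pods` is hypothetical, it uses kubectl custom columns and awk instead of yq, and the expected value passed in must match the `linkerd.io/proxy-version` annotation exactly as it is stored on your pods.

```shell
# Sketch: list meshed pods that are still active and whose proxy-version
# annotation differs from the expected value passed as $1.
# The field selector asks the API server to leave out pods in the
# Succeeded (shown by kubectl as "Completed") and Failed phases.
outdated_active_pods() {
  local expected="$1"
  kubectl get pods -A --no-headers \
    --field-selector='status.phase!=Succeeded,status.phase!=Failed' \
    -o custom-columns='NS:.metadata.namespace,NAME:.metadata.name,VERSION:.metadata.annotations.linkerd\.io/proxy-version' |
    awk -v expected="$expected" '
      $3 == "<none>" { next }   # pod has no proxy-version annotation
      $3 != expected {
        printf "pod %s in namespace %s is running an outdated version of linkerd\n", $2, $1
      }
    '
}

# Possible usage:
#   outdated_active_pods stable-2.13.6
```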

Would you like to work on fixing this bug?

None

@pratikkumar-mohite

@adleong I would like to work on this. Can you please assign it to me?

mikutas added a commit to mikutas/linkerd2 that referenced this issue Aug 25, 2023
adleong pushed a commit that referenced this issue Sep 11, 2023
hawkw added a commit that referenced this issue Sep 13, 2023
This edge release updates the proxy's dependency on the `webpki` library
to patch security vulnerability [RUSTSEC-2023-0052]
(GHSA-8qv2-5vq6-g2g7), a potential CPU usage denial-of-service attack
when accepting a TLS handshake from an untrusted peer with a
maliciously-crafted certificate.

* Addressed security vulnerability [RUSTSEC-2023-0052] in the proxy
  ([#11361])
* Fixed `linkerd check --proxy` incorrectly checking the proxy version
  of pods in the `completed` state (thanks @mikutas!) ([#11295]; fixes
  [#11280])
* Removed unnecessary `linkerd.io/helm-release-version` annotation from
  the `linkerd-control-plane` Helm chart (thanks @mikutas!) ([#11329];
  fixes [#10778])

[RUSTSEC-2023-0052]:
    https://rustsec.org/advisories/RUSTSEC-2023-0052.html
[#11295]: #11295
[#11280]: #11280
[#11361]: #11361
[#11329]: #11329
[#10778]: #10778
hawkw mentioned this issue Sep 13, 2023
adamshawvipps pushed a commit to adamshawvipps/linkerd2 that referenced this issue Sep 18, 2023
mateiidavid added a commit that referenced this issue Sep 21, 2023
This stable release introduces a fix for service discovery on endpoints that
use hostPorts. Previously, the destination service would return the pod IP
associated with the endpoint which could break connectivity on pod restarts.
Discovery responses have been changed to instead return the host IP. This
release also fixes an issue in the multicluster extension where an empty
`remoteDiscoverySelector` field in the `Link` resource would cause all services
to be exported. Finally, this release addresses two security vulnerabilities,
[CVE-2023-2603] and [RUSTSEC-2023-0052] respectively, and includes numerous
other fixes and enhancements.

* CLI
  * Fixed `linkerd check --proxy` incorrectly checking the proxy version of
    pods in the `completed` state (thanks @mikutas!) ([#11295]; fixes [#11280])
  * Fixed erroneous `skipped` messages when injecting namespaces with `linkerd
    inject` (thanks @mikutas!) ([#10231])

* CNI
  * Addressed security vulnerability [CVE-2023-2603] in proxy-init and CNI
    plugin ([#11296])

* Control Plane
  * Changed how hostPort lookups are handled in the destination service.
    Previously, when doing service discovery for an endpoint bound on a
    hostPort, the destination service would return the corresponding pod IP. On
    pod restart, this could lead to loss of connectivity on the client's side.
    The destination service now always returns host IPs for service discovery
    on an endpoint that uses hostPorts ([#11328])
  * Updated HTTPRoute webhook rule to validate all apiVersions of the resource
    (thanks @mikutas!) ([#11149])

* Helm
  * Removed unnecessary `linkerd.io/helm-release-version` annotation from the
    `linkerd-control-plane` Helm chart (thanks @mikutas!) ([#11329]; fixes
    [#10778])
  * Introduced resource requests/limits for the policy controller resource in
    the control plane helm chart ([#11301])

* Multicluster
  * Fixed an issue where an empty `remoteDiscoverySelector` field in a
    multicluster link would cause all services to be mirrored ([#11309])
  * Removed time out from `linkerd multicluster gateways` command; when no
    metrics exist the command will return instantly ([#11265])
  * Improved help messaging for `linkerd multicluster link` ([#11265])

* Proxy
  * Addressed security vulnerability [RUSTSEC-2023-0052] in the proxy
    ([#11361])

[CVE-2023-2603]: GHSA-wp54-pwvg-rqq5
[RUSTSEC-2023-0052]: https://rustsec.org/advisories/RUSTSEC-2023-0052.html
[#11295]: #11295
[#11280]: #11280
[#11361]: #11361
[#11329]: #11329
[#10778]: #10778
[#11309]: #11309
[#11296]: #11296
[#11328]: #11328
[#11301]: #11301
[#11265]: #11265
[#11149]: #11149
[#10231]: #10231

Signed-off-by: Matei David <[email protected]>
mateiidavid added a commit that referenced this issue Sep 25, 2023
* stable-2.14.1

Signed-off-by: Eliza Weisman <[email protected]>
Co-authored-by: Eliza Weisman <[email protected]>
github-actions bot locked as resolved and limited conversation to collaborators Oct 12, 2023