Panic on rolling update in Kubernetes #224

Closed
hypnoglow opened this issue Jul 22, 2019 · 0 comments · Fixed by #225

Comments

@hypnoglow
Contributor

Describe the bug

During a rolling update in Kubernetes, oathkeeper panics:

oathkeeper-58747c8474-nqvvp oathkeeper time="2019-07-22T14:41:44Z" level=debug msg="Detected access rule repository file change." event=fsnotify file="file:///etc/config/access-rules.json" op=CHMOD
oathkeeper-58747c8474-nqvvp oathkeeper time="2019-07-22T14:41:44Z" level=debug msg="Detected that a access rule repository file has been removed, reloading config."
oathkeeper-58747c8474-nqvvp oathkeeper time="2019-07-22T14:41:44Z" level=debug msg="Access rule watcher received an update." event=config_change source=fsnotify_remove
oathkeeper-58747c8474-nqvvp oathkeeper panic: send on closed channel
oathkeeper-58747c8474-nqvvp oathkeeper 
oathkeeper-58747c8474-nqvvp oathkeeper goroutine 118 [running]:
oathkeeper-58747c8474-nqvvp oathkeeper github.com/ory/oathkeeper/rule.(*FetcherDefault).watch.func4(0xc00007c540, 0xc00011dd00)
oathkeeper-58747c8474-nqvvp oathkeeper  /go/src/github.com/ory/oathkeeper/rule/fetcher_default.go:220 +0xb6
oathkeeper-58747c8474-nqvvp oathkeeper created by github.com/ory/oathkeeper/rule.(*FetcherDefault).watch
oathkeeper-58747c8474-nqvvp oathkeeper  /go/src/github.com/ory/oathkeeper/rule/fetcher_default.go:219 +0xedc

This is caused by a race condition:

  1. The new ConfigMap is deployed (the config is updated in the volume mounted into the old container).
  2. The old container is stopped (the file watcher is closed).
  3. Some goroutines are still blocked sending to the channel, e.g. here, and they panic once the channel is closed.
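
The failure mode in the steps above can be reproduced in isolation. The following is a minimal sketch, not oathkeeper's code: `blockedSendPanics` and the `events` channel are hypothetical names, and the short sleep is only there to give the sender a chance to block first. It shows that a goroutine sending on an unbuffered channel panics when another goroutine closes that channel.

```go
package main

import (
	"fmt"
	"time"
)

// blockedSendPanics starts a goroutine that sends on a channel nobody
// reads from, closes the channel, and returns whatever the sender's
// deferred recover() captured. (Hypothetical helper for illustration.)
func blockedSendPanics() interface{} {
	events := make(chan string)      // unbuffered, no receiver
	result := make(chan interface{}) // carries the recovered panic value

	go func() {
		// Report the recovered value (nil if there was no panic).
		defer func() { result <- recover() }()
		events <- "fsnotify_remove" // panics once events is closed
	}()

	// Give the sender time to block; the send panics either way,
	// because sending on an already closed channel also panics.
	time.Sleep(50 * time.Millisecond)
	close(events)

	return <-result
}

func main() {
	fmt.Println("recovered:", blockedSendPanics())
}
```

Without the `recover`, the panic terminates the whole process, which matches the crash in the log above.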

Expected behavior

Goroutines should not write to the closed channel.

How to fix

Track the goroutines with a `sync.WaitGroup` so that no orphaned goroutine tries to send on the closed channel. Close the channel only after all goroutines have exited.
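
A rough sketch of that approach, under assumptions: the `watcher` type, its `emit` and `stop` methods, and the extra `done` channel are all hypothetical and not taken from oathkeeper or from the fix in #225. The `done` channel is an addition that lets blocked senders give up during shutdown. Without some way to unblock them, waiting on the WaitGroup could hang.

```go
package main

import (
	"fmt"
	"sync"
)

// watcher is a hypothetical stand-in for a component that fans events
// out from short-lived goroutines onto a shared channel.
type watcher struct {
	events chan string
	done   chan struct{}  // closed to tell senders to give up
	wg     sync.WaitGroup // tracks in-flight sender goroutines
}

func newWatcher() *watcher {
	return &watcher{
		events: make(chan string),
		done:   make(chan struct{}),
	}
}

// emit starts a tracked goroutine that either delivers the event or
// abandons it once shutdown has been signalled.
func (w *watcher) emit(source string) {
	w.wg.Add(1)
	go func() {
		defer w.wg.Done()
		select {
		case w.events <- source:
		case <-w.done:
		}
	}()
}

// stop signals shutdown, waits for every sender to exit, and only then
// closes the events channel, so no send can hit a closed channel.
// emit must not be called concurrently with or after stop.
func (w *watcher) stop() {
	close(w.done)
	w.wg.Wait()
	close(w.events)
}

func main() {
	w := newWatcher()
	for i := 0; i < 3; i++ {
		w.emit(fmt.Sprintf("fsnotify_remove_%d", i))
	}
	w.stop() // no receiver ever ran; senders exit via done

	_, open := <-w.events
	fmt.Println("events channel open:", open)
}
```

The key ordering is in `stop`: signal, wait, then close. Closing `events` before `wg.Wait()` returns would reintroduce the race.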

hypnoglow added a commit to hypnoglow/oathkeeper that referenced this issue Jul 22, 2019
aeneasr pushed a commit that referenced this issue Jul 23, 2019