-
Notifications
You must be signed in to change notification settings - Fork 10
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Initialize hwloc topology with disallowed nodes to accurately know available resources on system #1251
base: main
Are you sure you want to change the base?
Conversation
Coverage summary from CodacySee diff coverage on Codacy
Coverage variation details
Coverage variation is the difference between the coverage for the head and common ancestor commits of the pull request branch: Diff coverage details
Diff coverage is the percentage of lines that are covered by tests out of the coverable lines that the pull request added or modified: See your quality gate settings Change summary preferencesCodacy stopped sending the deprecated coverage status on June 5th, 2024. Learn more |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM thanks! Do you think it's worth adding a comment in the code about logical PU indices not matching lstopo's ones?
Maybe, I'll have a look to see where it could fit. Note: The indices will match if no cgroups restrictions are in places. It's only when restrictions are in place that they may not match. Process masks (from hwloc-bind, taskset, slurm etc.) are anyway always using physical indices, so no risk of ambiguity there. |
This allows accurately knowing how many PUs etc. are available on the system, even if all of them can't be used. When using e.g. cgroups, PUs can be restricted so that hwloc does not report them by default in the topology. Including all PUs with HWLOC_TOPOLOGY_FLAG_INCLUDE_DISALLOWED means that masks can be correctly sized for all physical and logical indices.
…pected pika binding
…icate function name and explicitly call out pool name
890ba0d
to
9369fac
Compare
When cgroups are used (e.g. by srun) to constrain allowed PUs for a process, hwloc will by default only report the PUs and other resources that are allowed to be used with the cgroups restrictions. This can lead to pika thinking that the system has only n PUs when in reality it has m > n PUs. In itself this isn't a problem, but currently on pika main we derive the number of available PUs from the resources that hwloc reports, and thus also the CPU mask sizes may be smaller than the whole system. If using cgroups and
--pika:process-mask
/PIKA_PROCESS_MASK
with a mask that doesn't contain the first n PUs pika's initialization will currently access CPU masks out of bounds or report an error.This PR proposes to initialize the hwloc topology with the flag
HWLOC_TOPOLOGY_FLAG_INCLUDE_DISALLOWED
, which will include resources that may not be usable because of cgroups. This allows us to accurately get the maximum size of CPU masks.Even though we request all resources internally, pika will not be able to bind threads to PUs that are outside of the cgroups restrictions. A warning will be printed if a mask is set outside of the restrictions.
I've relaxed an error to a warning when pika's expected CPU mask for a thread does not match what is retrieved with hwloc. This happens if the thread affinity can't be set.
I've also cleaned up some of the logging in
scheduled_thread_pool_impl.hpp
.One downside of this approach is that, when using cgroups, hwloc logical PU indices will not correspond to indices reported by e.g.
lstopo
. Physical PU indices are exactly the same as before. I think this is a small inconsistency that's worth the price.A possible alternative approach is to create a separate topology object with
HWLOC_TOPOLOGY_FLAG_INCLUDE_DISALLOWED
only for counting the number of PUs etc. to correctly size CPU masks, but then use the a topology object without disallowed resources afterwards.