Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fewer ThreadLocals in version/seqno resolver #85896

Open
wants to merge 2 commits into
base: main
Choose a base branch
from

Conversation

DaveCTurner
Copy link
Contributor

Today we create a new ThreadLocal for every IndexReader encountered
during non-append-only indexing (effectively every refresh on every
shard). Each thread keeps a table of all its live thread-locals, where
"live" effectively means "has not been garbage-collected". Some of these
IndexReader instances have lifespans long enough for the corresponding
thread-local to escape the young generation which means that it may not
be garbage-collected for a very long time, preventing their slots in the
table from being re-used. This causes the thread-local table for write
threads to grow rather large, which egregiously slows down other
operations on unrelated thread-locals on these threads.

This commit changes the map-of-thread-locals to a thread-local-of-maps,
reducing the churn of thread-locals and releasing the resolver in each
thread's thread-local map as soon as the corresponding IndexReader is
closed.

Closes #56766

Today we create a new `ThreadLocal` for every `IndexReader` encountered
during non-append-only indexing (effectively every refresh on every
shard). Each thread keeps a table of all its live thread-locals, where
"live" effectively means "has not been garbage-collected". Some of these
`IndexReader` instances have lifespans long enough for the corresponding
thread-local to escape the young generation which means that it may not
be garbage-collected for a very long time, preventing their slots in the
table from being re-used. This causes the thread-local table for `write`
threads to grow rather large, which egregiously slows down other
operations on unrelated thread-locals on these threads.

This commit changes the map-of-thread-locals to a thread-local-of-maps,
reducing the churn of thread-locals and releasing the resolver in each
thread's thread-local map as soon as the corresponding `IndexReader` is
closed.

Closes elastic#56766
@DaveCTurner DaveCTurner added >bug :Distributed/CRUD A catch all label for issues around indexing, updating and getting a doc by id. Not search. v8.3.0 labels Apr 14, 2022
@elasticmachine elasticmachine added the Team:Distributed Meta label for distributed team label Apr 14, 2022
@elasticmachine
Copy link
Collaborator

Pinging @elastic/es-distributed (Team:Distributed)

@elasticsearchmachine
Copy link
Collaborator

Hi @DaveCTurner, I've created a changelog YAML for you.

assert lookupState[leaf.ord] == null;
lookupState[leaf.ord] = new PerThreadIDVersionAndSeqNoLookup(leaf.reader(), uidField);
}
cacheHelper.addClosedListener(this::removeLookup);
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

NB this may have performance impact since it means on close (i.e. refresh) we remove the entry from the map of every thread that touched it. Previously we'd have done most of that cleanup later on during GC.

}
ctl.set(lookupState);
}
private final Map<IndexReader.CacheKey, PerThreadIDVersionAndSeqNoLookup[]> lookupsByReader = ConcurrentCollections
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure about using a CHM here, nor about its exact config. We only put and read on the owning thread, the thread-safety is just needed for the cleanup operations which should be much less common. Maybe a synchronized map would be fine/better?

@DaveCTurner
Copy link
Contributor Author

@elasticmachine please run elasticsearch-ci/bwc (don't think it actually failed, looks like something went wrong after completion somehow)

@elasticsearchmachine elasticsearchmachine changed the base branch from master to main July 22, 2022 23:07
@mark-vieira mark-vieira added v8.5.0 and removed v8.4.0 labels Jul 27, 2022
@csoulios csoulios added v8.6.0 and removed v8.5.0 labels Sep 21, 2022
@kingherc kingherc added v8.7.0 and removed v8.6.0 labels Nov 16, 2022
@rjernst rjernst added v8.8.0 and removed v8.7.0 labels Feb 8, 2023
@gmarouli gmarouli added v8.9.0 and removed v8.8.0 labels Apr 26, 2023
@quux00 quux00 added v8.11.0 and removed v8.10.0 labels Aug 16, 2023
@mattc58 mattc58 added v8.12.0 and removed v8.11.0 labels Oct 4, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
>bug :Distributed/CRUD A catch all label for issues around indexing, updating and getting a doc by id. Not search. Team:Distributed Meta label for distributed team v9.0.0
Projects
None yet
Development

Successfully merging this pull request may close these issues.

VersionsAndSeqNoResolver may experience delayed ThreadLocal cleanup with high pressure writing to ES