Fewer ThreadLocals in version/seqno resolver #85896

DaveCTurner · 2022-04-14T11:31:15Z

Today we create a new ThreadLocal for every IndexReader encountered
during non-append-only indexing (effectively every refresh on every
shard). Each thread keeps a table of all its live thread-locals, where
"live" effectively means "has not been garbage-collected". Some of these
IndexReader instances have lifespans long enough for the corresponding
thread-locals to escape the young generation, which means that they may
not be garbage-collected for a very long time, preventing their slots in
the table from being re-used. This causes the thread-local table for write
threads to grow rather large, which egregiously slows down other
operations on unrelated thread-locals on these threads.

This commit changes the map-of-thread-locals to a thread-local-of-maps,
reducing the churn of thread-locals and releasing the resolver in each
thread's thread-local map as soon as the corresponding IndexReader is
closed.
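
To illustrate the shape of the change, here is a minimal sketch of a thread-local-of-maps. It is not the code from this PR: `LookupSketch`, `ThreadLocalOfMapsSketch` and the plain `Object` reader keys are hypothetical stand-ins, and the removal of entries when a reader closes is left out.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical stand-in for the per-reader lookup state; not an Elasticsearch class. */
final class LookupSketch {
    final String readerName;

    LookupSketch(String readerName) {
        this.readerName = readerName;
    }
}

final class ThreadLocalOfMapsSketch {
    // A single ThreadLocal, rather than one per reader. Each thread lazily gets
    // its own map from reader key to lookup state, so the number of slots used
    // in each thread's thread-local table does not grow with the number of readers.
    private static final ThreadLocal<Map<Object, LookupSketch>> LOOKUPS_BY_READER =
        ThreadLocal.withInitial(ConcurrentHashMap::new);

    /** Returns this thread's lookup for the given reader key, creating it on first use. */
    static LookupSketch getLookup(Object readerKey, String readerName) {
        return LOOKUPS_BY_READER.get().computeIfAbsent(readerKey, k -> new LookupSketch(readerName));
    }

    /** Number of readers for which the calling thread currently holds a lookup. */
    static int lookupsOnThisThread() {
        return LOOKUPS_BY_READER.get().size();
    }

    public static void main(String[] args) {
        Object readerA = new Object();
        Object readerB = new Object();
        getLookup(readerA, "a");
        getLookup(readerA, "a"); // same thread, same reader: the existing lookup is reused
        getLookup(readerB, "b");
        System.out.println(lookupsOnThisThread()); // prints 2
    }
}
```

In this sketch the map is a `ConcurrentHashMap` on the assumption that some other thread may need to remove entries from it later; a map that is only ever touched by its owning thread would not need that.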

Closes #56766

elasticmachine · 2022-04-14T11:31:19Z

Pinging @elastic/es-distributed (Team:Distributed)

elasticsearchmachine · 2022-04-14T11:31:40Z

Hi @DaveCTurner, I've created a changelog YAML for you.

DaveCTurner · 2022-04-14T11:33:15Z

server/src/main/java/org/elasticsearch/common/lucene/uid/VersionsAndSeqNoResolver.java

+                    assert lookupState[leaf.ord] == null;
+                    lookupState[leaf.ord] = new PerThreadIDVersionAndSeqNoLookup(leaf.reader(), uidField);
+                }
+                cacheHelper.addClosedListener(this::removeLookup);


NB this may have a performance impact, since on close (i.e. refresh) we now remove the entry from the map of every thread that touched the reader. Previously most of that cleanup would have happened later, during GC.
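
A rough sketch of where that cost comes from, assuming each thread's holder registers its own closed listener the first time it sees a reader. `ReaderSketch` and `PerThreadLookups` are hypothetical stand-ins, not the classes touched by this PR, and the `String` value stands in for the real lookup state.

```java
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.function.Consumer;

/** Hypothetical stand-in for a reader that notifies listeners when it is closed. */
final class ReaderSketch {
    private final List<Consumer<ReaderSketch>> closedListeners = new CopyOnWriteArrayList<>();

    void addClosedListener(Consumer<ReaderSketch> listener) {
        closedListeners.add(listener);
    }

    /** Runs every registered listener on the closing thread and returns how many ran. */
    int close() {
        for (Consumer<ReaderSketch> listener : closedListeners) {
            listener.accept(this);
        }
        return closedListeners.size();
    }
}

/** Hypothetical per-thread holder; in real use there would be one instance per thread. */
final class PerThreadLookups {
    private final Map<ReaderSketch, String> lookupsByReader = new ConcurrentHashMap<>();

    String getLookup(ReaderSketch reader) {
        return lookupsByReader.computeIfAbsent(reader, r -> {
            // First use of this reader by this holder: arrange for eager cleanup on close.
            r.addClosedListener(this::removeLookup);
            return "lookup-state";
        });
    }

    void removeLookup(ReaderSketch reader) {
        lookupsByReader.remove(reader);
    }

    int size() {
        return lookupsByReader.size();
    }
}
```

With this shape, closing a reader that N threads have touched runs N listeners and N map removals on the closing thread, which is the extra work being flagged here.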

DaveCTurner · 2022-04-14T11:35:31Z

server/src/main/java/org/elasticsearch/common/lucene/uid/VersionsAndSeqNoResolver.java

-            }
-            ctl.set(lookupState);
-        }
+        private final Map<IndexReader.CacheKey, PerThreadIDVersionAndSeqNoLookup[]> lookupsByReader = ConcurrentCollections


Not sure about using a CHM here, nor about its exact config. We only put and read on the owning thread; the thread-safety is only needed for the cleanup operations, which should be much less common. Maybe a synchronized map would be fine/better?
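
For comparison, the two options look roughly like this. It is only a sketch: `OwnerThreadMapSketch`, the sizing arguments and the string keys are illustrative assumptions, not the configuration used in the PR.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

final class OwnerThreadMapSketch {
    /** Option 1: a ConcurrentHashMap; the concurrencyLevel argument (1) is only a sizing hint. */
    static <K, V> Map<K, V> concurrentCandidate() {
        return new ConcurrentHashMap<>(16, 0.75f, 1);
    }

    /** Option 2: a plain HashMap with every operation guarded by a single lock. */
    static <K, V> Map<K, V> synchronizedCandidate() {
        return Collections.synchronizedMap(new HashMap<>());
    }

    /** The calling thread puts entries; a different thread performs the rare cleanup. */
    static int putThenCleanUpElsewhere(Map<String, String> map) {
        map.put("reader-1", "state");
        map.put("reader-2", "state");
        Thread cleaner = new Thread(() -> map.remove("reader-1"));
        cleaner.start();
        try {
            cleaner.join(); // join() makes the removal visible to the calling thread
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return map.size();
    }

    public static void main(String[] args) {
        System.out.println(putThenCleanUpElsewhere(concurrentCandidate()));   // prints 1
        System.out.println(putThenCleanUpElsewhere(synchronizedCandidate())); // prints 1
    }
}
```

Both behave the same for this access pattern. The difference would be in the cost of the common owner-thread operations versus the rare cross-thread removals, which would need measuring rather than guessing.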

DaveCTurner · 2022-04-14T14:19:48Z

@elasticmachine please run elasticsearch-ci/bwc (don't think it actually failed, looks like something went wrong after completion somehow)

DaveCTurner added the >bug, :Distributed/CRUD and v8.3.0 labels Apr 14, 2022

elasticmachine added the Team:Distributed label Apr 14, 2022

Update docs/changelog/85896.yaml

fd9b8f6

DaveCTurner mentioned this pull request Apr 14, 2022

VersionsAndSeqNoResolver may experience delayed ThreadLocal cleanup with high pressure writing to ES #56766

Closed

craigtaverner added v8.4.0 and removed v8.3.0 labels May 25, 2022

elasticsearchmachine changed the base branch from master to main July 22, 2022 23:07

mark-vieira added v8.5.0 and removed v8.4.0 labels Jul 27, 2022

csoulios added v8.6.0 and removed v8.5.0 labels Sep 21, 2022

kingherc added v8.7.0 and removed v8.6.0 labels Nov 16, 2022

rjernst added v8.8.0 and removed v8.7.0 labels Feb 8, 2023

gmarouli added v8.9.0 and removed v8.8.0 labels Apr 26, 2023

pugnascotia added v8.10.0 and removed v8.9.0 labels Jun 22, 2023

quux00 added v8.11.0 and removed v8.10.0 labels Aug 16, 2023

mattc58 added v8.12.0 and removed v8.11.0 labels Oct 4, 2023

brianseeders added v8.13.0 and removed v8.12.0 labels Dec 6, 2023

elasticsearchmachine added v8.14.0 and removed v8.13.0 labels Feb 14, 2024

elasticsearchmachine added v8.15.0 and removed v8.14.0 labels Apr 17, 2024

elasticsearchmachine added v8.16.0 and removed v8.15.0 labels Jul 4, 2024

mark-vieira added v9.0.0 and removed v8.16.0 labels Sep 11, 2024
