Add node setting to disable fsync for all operations on a node #96770

Open
wants to merge 16 commits into
base: main

Conversation

original-brownbear
Member

@original-brownbear original-brownbear commented Jun 12, 2023

This covers almost all fsync uses via a simple boolean setting, in as small a changeset as possible.
Most of the changed lines are just the result of longer method signatures being reformatted across multiple lines, plus some setting adjustments.

closes #96302

@original-brownbear original-brownbear added WIP :Distributed/Distributed A catch all label for anything in the Distributed Area. If you aren't sure, use this one. labels Jun 12, 2023
@@ -608,6 +610,11 @@ private static Settings getRandomNodeSettings(long seed) {
             builder.put(INITIAL_STATE_TIMEOUT_SETTING.getKey(), "0s");
         }
 
+        if (usually()) {
+            // Disable fsync in most test runs to speed up tests
+            builder.put(IndexModule.NODE_STORE_USE_FSYNC.getKey(), false);
Member Author

This is a neat speedup for internal cluster tests on my Mac. I don't see why we wouldn't usually turn off fsync to give people faster builds and less SSD wear even though this is the uncommon path in prod (maybe :)).
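
To illustrate the randomisation discussed here, the following is a minimal, self-contained sketch of seed-driven test settings. Only the key `node.store.use_fsync` comes from this PR; the class `RandomNodeSettingsSketch`, its `usually(Random)` helper, and the 90% probability are assumptions made for illustration and are not the test framework's actual implementation.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Random;

final class RandomNodeSettingsSketch {
    // Setting key taken from the PR; everything else here is a hypothetical stand-in.
    static final String USE_FSYNC_KEY = "node.store.use_fsync";

    // Hypothetical stand-in for the test framework's usually(): true in most calls.
    // The 90% probability is an assumption, not the framework's actual value.
    static boolean usually(Random random) {
        return random.nextInt(100) < 90;
    }

    // Derives the settings from a seed so that a failing run can be reproduced.
    static Map<String, String> randomNodeSettings(long seed) {
        Random random = new Random(seed);
        Map<String, String> settings = new LinkedHashMap<>();
        if (usually(random)) {
            // Disable fsync in most test runs; the remaining runs keep the default (enabled).
            settings.put(USE_FSYNC_KEY, "false");
        }
        return settings;
    }
}
```

Because the choice depends only on the seed, rerunning with the same seed yields the same fsync configuration, while a minority of runs still exercise the default path with fsync enabled.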

@original-brownbear original-brownbear marked this pull request as ready for review June 13, 2023 14:34
@elasticsearchmachine elasticsearchmachine added the Team:Distributed Meta label for distributed team label Jun 13, 2023
@elasticsearchmachine
Collaborator

Hi @original-brownbear, I've created a changelog YAML for you.

@elasticsearchmachine
Collaborator

Pinging @elastic/es-distributed (Team:Distributed)

/**
 * Setting to enable or disable all fsync operations on a node.
 */
public static final Setting<Boolean> NODE_STORE_USE_FSYNC = Setting.boolSetting("node.store.use_fsync", true, Property.NodeScope);
Member Author

It's a little unfortunate how long this PR became when doing this via a setting. We could probably do the same thing in 30 lines using a system property, but I think the test randomisation and general alignment with how we do these things make it worth going through the noise of adding a setting here.
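
For comparison, a system-property switch along the lines mentioned above might look roughly like this. It is a sketch only: the property name `sketch.store.use_fsync` and the class `FsyncSwitchSketch` are hypothetical and not part of this PR. `FileChannel#force` is the standard JDK call that flushes a file's content to the storage device.

```java
import java.io.IOException;
import java.nio.channels.FileChannel;

final class FsyncSwitchSketch {
    // Hypothetical property name, chosen for illustration only.
    static final String PROPERTY = "sketch.store.use_fsync";

    // fsync stays enabled unless the property is explicitly set to "false".
    static boolean useFsync() {
        return !"false".equalsIgnoreCase(System.getProperty(PROPERTY));
    }

    // Flushes content and metadata to storage only when the switch allows it.
    static void maybeSync(FileChannel channel) throws IOException {
        if (useFsync()) {
            channel.force(true);
        }
    }
}
```

A JVM-wide property like this is shorter, but it cannot be randomised per node in tests the way a node-scoped setting can, which is the trade-off described above.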

@original-brownbear
Member Author

Jenkins run elasticsearch-ci/part-2

@@ -209,7 +210,7 @@ private static void writeEmptyCheckpoint(Path filename, int translogLength, long
     private static int writeEmptyTranslog(Path filename, String translogUUID) throws IOException {
         try (FileChannel fc = FileChannel.open(filename, StandardOpenOption.WRITE, StandardOpenOption.CREATE_NEW)) {
             TranslogHeader header = new TranslogHeader(translogUUID, SequenceNumbers.UNASSIGNED_PRIMARY_TERM);
-            header.write(fc);
+            header.write(fc, true); // TODO: make fsync conditional on IndexModule#NODE_STORE_USE_FSYNC?
Contributor

NIT: some TODOs are left, but I think it is okay to address them in a follow-up.
I do not think this action is used often.

Member Author

Yea, it's a rare outlier. I didn't feel like spending time on it yet, maybe later :)
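
If the TODO is picked up later, one possible shape is to pass the flag down to the write and call `FileChannel#force` only when it is set. This is a sketch under assumptions: `ConditionalFsyncWriteSketch` and `writeNewFile` are hypothetical names, and a plain string stands in for the translog header.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

final class ConditionalFsyncWriteSketch {
    // Writes content to a new file and returns the file size in bytes.
    // useFsync would come from the node setting; here it is just a parameter.
    static long writeNewFile(Path file, String content, boolean useFsync) throws IOException {
        try (FileChannel channel = FileChannel.open(file, StandardOpenOption.WRITE, StandardOpenOption.CREATE_NEW)) {
            ByteBuffer buffer = ByteBuffer.wrap(content.getBytes(StandardCharsets.UTF_8));
            // A single write call may not drain the buffer, so loop until it is empty.
            while (buffer.hasRemaining()) {
                channel.write(buffer);
            }
            if (useFsync) {
                // Force content and metadata to the storage device before returning.
                channel.force(true);
            }
            return channel.size();
        }
    }
}
```

With `useFsync` set to `false` the bytes are still written and readable once the channel is closed; only the durability guarantee across a crash or power loss is given up.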

-    public final long writeAndCleanup(final T state, final Path... locations) throws WriteStateException {
-        return write(state, true, locations);
+    public final long writeAndCleanup(final T state, boolean useFsync, final Path... locations) throws WriteStateException {
+        return write(state, true, useFsync, locations);
Contributor

Should we call the new parameter just `fsync`?

Labels
:Distributed/Distributed A catch all label for anything in the Distributed Area. If you aren't sure, use this one. >enhancement Team:Distributed Meta label for distributed team v9.0.0
Development

Successfully merging this pull request may close these issues.

Add node setting to disable all fsync
8 participants