Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support for obtaining deposit snapshots during trustedNodeSync #4303

Merged
merged 1 commit into from
Dec 7, 2022

Conversation

zah
Copy link
Contributor

@zah zah commented Nov 8, 2022

Other changes:

fatal "Failed to initialize web3 provider",
depositContract = depositContractAddress,
web3Url, err = providerRes.error
quit 1
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

do we close the db when quit runs? also, this prevents starting eth1 and eth2 in "random" order which is not nice - ie I should be able to start the two services in any order and they should be "prone" to resume as soon as the "other" is up

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree in general with the suggestions, but this requires more invasive changes to the Eth1Monitor that will be delivered in another PR.

depositContract = depositContractAddress,
deploymentBlock = $depositContractDeployedAt,
err = e.msg
quit 1
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ditto... and how does this work when you start nimbus with an unsynced eth1?

Copy link
Contributor Author

@zah zah Dec 6, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This merely maintains the Status quo as Nimbus would already exit at start-up when paired with an EL that hasn't synced the deposit contract deployment block.

I've improved the behavior in the multi EL branch where the BN is able to wait for the EL to sync before the rest of the Eth1Monitor activities start.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

how do you mean, shut down? we can run the BN without an EL entirely (in optmistic mode). Also, geth doesn't even start syncing without a BN. Since when do we require a synced EL?

Copy link
Member

@arnetheduck arnetheduck Dec 7, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

in fact, we're quite buggy in this aspect: we run the deposit monitoring even when we know the EL is not synced - see #4057 - this causes many support requests in general because of all the error / warning logging that ensues

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The deposit contract deployment block is part of the PoW history that Geth will sync on its own.
Otherwise, yes, the general behavior can be improved, but as I explained these are more invasive changes that are currently part of another PR.

@@ -229,6 +267,16 @@ proc doTrustedNodeSync*(
databaseDir, head = shortLog(dbHead.get())
(headSlot, dbHead.get())

# Fetch deposit snapshot. This API endpoint is still optional.
let depositSnapshot = await fetchDepositSnapshot(client)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is better to put behind a flag (default false) while v0 is a thing

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The failure is relatively benign. Only a INFO level message is printed.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nonetheless, it's a draft feature - also, disabling it completely is useful in case something breaks - this was part of the rationale for moving TNS to a finalized state alone: this is the minimal set of data needed to get started, and we should keep the possibility to start like this regardless of what the server supports - blocks and deposit snapshots are optional extras.

Copy link
Contributor Author

@zah zah Dec 6, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've added the suggested CLI flag, but I'm contemplating removing it since the API is now officially in the spec.

@arnetheduck
Copy link
Member

Removed support for storing a deposit snapshot in the network metadata.

since no snapshot sources support the new snapshots, isn't this a regression? it seems reasonable to keep the old support until the new version has reached some reasonable penetration (basically when it's supported by https://eth-clients.github.io/checkpoint-sync-endpoints/)

@zah
Copy link
Contributor Author

zah commented Nov 9, 2022

since no snapshot sources support the new snapshots, isn't this a regression? it seems reasonable to keep the old support until the new version has reached some reasonable penetration (basically when it's supported by https://eth-clients.github.io/checkpoint-sync-endpoints/)

The old support was baking a snapshot in the binary. I haven't been updating these snapshots in a while and every running client will have a more recent record in its database. New clients that didn't obtain a snapshot from trusted node sync will be initialized to the deposit contract deployment block which seems reasonable (i.e. this is the behavior of all other clients)

@arnetheduck
Copy link
Member

New clients that didn't obtain a snapshot from trusted node sync will be initialized to the deposit contract deployment block which seems reasonable (i.e. this is the behavior of all other clients)

this is the regression indeed - there's no support in servers for the eip, and there won't be fore the foreseeable future - since the code for reading the baked-in snapshot is still there, it seems reasonable to keep using it until the eip has reached penetration (which can take anything from days to months) - I'd prefer seeing this feature as a pure addon until the standard has settled (both EIP and beacon API are drafts at this point - it's quite possible there will be changes in the format meaning that we might have to go through a potentially painful DB migration, so a way out is needed)

# Fetch deposit snapshot. This API endpoint is still optional.
let depositSnapshot = await fetchDepositSnapshot(client)
if depositSnapshot.isOk:
info "Writing deposit contracts snapshot",
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

shouldn't the snapshot be validated against the state before writing to disk?

@zah
Copy link
Contributor Author

zah commented Nov 9, 2022

I still don't think this is a regression. The non-genesis snapshots were tested only in the testnets. The metadata for mainnet was never changed after the genesis event.

@arnetheduck
Copy link
Member

I still don't think this is a regression.

the difference would be that we don't need to download deposits for the genesis validators, no?

@zah zah force-pushed the deposit-snapshots-in-trusted-sync branch from d4b3288 to d0a208d Compare November 18, 2022 17:54
@github-actions
Copy link

github-actions bot commented Nov 18, 2022

Unit Test Results

         9 files  ±  0    1 023 suites  +3   24m 8s ⏱️ -52s
  2 972 tests +  4    2 779 ✔️ +  4  193 💤 ±0  0 ±0 
13 901 runs  +12  13 682 ✔️ +12  219 💤 ±0  0 ±0 

Results for commit bd3554a. ± Comparison against base commit 7cf432b.

♻️ This comment has been updated with latest results.

@zah zah force-pushed the deposit-snapshots-in-trusted-sync branch 2 times, most recently from 37d6c7c to 051df25 Compare November 20, 2022 13:57
@zah zah force-pushed the deposit-snapshots-in-trusted-sync branch 7 times, most recently from ce12332 to 6c9b2be Compare December 5, 2022 15:03
else:
warn "The downloaded deposit snapshot does not agree with the downloaded state"
else:
warn "Deposit tree snapshot was not imported", reason = depositSnapshot.error
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the download is optional and nothing will happen if it doesn't work except a small delay, so this is a notice at most

Copy link
Contributor Author

@zah zah Dec 6, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well, with the new CLI option the user explicitly requested the download of a deposit snapshot and it didn't work. I think a warning is reasonable. The user should check their configuration.

return await client.getBlockV2(BlockIdent.init(slot), cfg)
return awaitWithTimeout(client.getBlockV2(BlockIdent.init(slot), cfg),
smallRequestsTimeout):
raise newException(CatchableError, "Request timed out")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if there's a timeout, we should reattempt - that's what the retries are for

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There will be retry on this error. An exception is raised here, so it can be handled in the except block below, which will produce the following message:

warn "Retrying download of block", slot, err = "Request timed out"

I didn't want to duplicate the logic in multiple places and the cost of raising an exception is not that high in this infrequent code path.

Other changes:

* More optimal search for TTD block.

* Add timeouts to all REST requests during trusted node sync.
  Fixes #4037

* Removed support for storing a deposit snapshot in the network
  metadata.
@zah zah force-pushed the deposit-snapshots-in-trusted-sync branch from 6c9b2be to bd3554a Compare December 6, 2022 23:18
@zah zah merged commit d30cb8b into unstable Dec 7, 2022
@zah zah deleted the deposit-snapshots-in-trusted-sync branch December 7, 2022 10:24
etan-status added a commit that referenced this pull request Mar 7, 2024
EIP-4881 was never correctly implemented, the `DepositTreeSnapshot`
structure has nothing to do with its actual definition. Reflect that
by renaming the type to a Nimbus-specific `DepositContractSnapshot`,
so that an actual EIP-4881 implementation can use the correct names.

- https://eips.ethereum.org/EIPS/eip-4881#specification

Notably, `DepositTreeSnapshot` contains a compressed sequence in
`finalized`, only containing the minimally required intermediate roots.

That also explains the incorrect REST response reported in #5508.

The non-canonical representation was introduced in #4303 and is also
persisted in the database. We'll have to maintain it for a while.
etan-status added a commit that referenced this pull request Mar 7, 2024
EIP-4881 was never correctly implemented, the `DepositTreeSnapshot`
structure has nothing to do with its actual definition. Reflect that
by renaming the type to a Nimbus-specific `DepositContractSnapshot`,
so that an actual EIP-4881 implementation can use the correct names.

- https://eips.ethereum.org/EIPS/eip-4881#specification

Notably, `DepositTreeSnapshot` contains a compressed sequence in
`finalized`, only containing the minimally required intermediate roots.

That also explains the incorrect REST response reported in #5508.

The non-canonical representation was introduced in #4303 and is also
persisted in the database. We'll have to maintain it for a while.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants