Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

limit by-root requests to non-finalized blocks #3293

Merged
merged 2 commits into from
Jan 21, 2022
Merged

limit by-root requests to non-finalized blocks #3293

merged 2 commits into from
Jan 21, 2022

Conversation

arnetheduck
Copy link
Member

Presently, we keep a mapping from block root to BlockRef in memory -
this has simplified reasoning about the dag, but is not sustainable with
the chain growing.

We can distinguish between two cases where by-root access is useful:

  • unfinalized blocks - this is where the beacon chain is operating
    generally, by validating incoming data as interesting for future fork
    choice decisions - bounded by the length of the unfinalized period
  • finalized blocks - historical access in the REST API etc - no bounds,
    really

In this PR, we limit the by-root block index to the first use case:
finalized chain data can more efficiently be addressed by slot number.

Future work includes:

  • limiting the BlockRef horizon in general - each instance is 40
    bytes+overhead which adds up - this needs further refactoring to deal
    with the tail vs state problem
  • persisting the finalized slot-to-hash index - this one also keeps
    growing unbounded (albeit slowly)

Anyway, this PR easily shaves ~128mb of memory usage at the time of
writing.

  • No longer honor BeaconBlocksByRoot requests outside of the
    non-finalized period - previously, Nimbus would generously return any
    block through this libp2p request - per the spec, finalized blocks
    should be fetched via BeaconBlocksByRange instead.
  • return Opt[BlockRef] instead of nil when blocks can't be found -
    this becomes a lot more common now and thus deserves more attention
  • dag.blocks -> dag.forkBlocks - this index only carries unfinalized
    blocks from now - finalizedBlocks covers the other BlockRef
    instances
  • in backfill, verify that the last backfilled block leads back to
    genesis, or panic
  • add backfill timings to log
  • fix missing check that BlockRef block can be fetched with
    getForkedBlock reliably
  • shortcut doppelganger check when feature is not enabled
  • in REST/JSON-RPC, fetch blocks without involving BlockRef

Screenshot from 2022-01-17 14-04-38

@github-actions
Copy link

github-actions bot commented Jan 17, 2022

Unit Test Results

     12 files  ±0     794 suites  ±0   35m 15s ⏱️ - 6m 24s
1 602 tests ±0  1 556 ✔️ ±0  46 💤 ±0  0 ±0 
9 457 runs  ±0  9 361 ✔️ ±0  96 💤 ±0  0 ±0 

Results for commit 1622233. ± Comparison against base commit 570379d.

♻️ This comment has been updated with latest results.

Presently, we keep a mapping from block root to `BlockRef` in memory -
this has simplified reasoning about the dag, but is not sustainable with
the chain growing.

We can distinguish between two cases where by-root access is useful:

* unfinalized blocks - this is where the beacon chain is operating
generally, by validating incoming data as interesting for future fork
choice decisions - bounded by the length of the unfinalized period
* finalized blocks - historical access in the REST API etc - no bounds,
really

In this PR, we limit the by-root block index to the first use case:
finalized chain data can more efficiently be addressed by slot number.

Future work includes:

* limiting the `BlockRef` horizon in general - each instance is 40
bytes+overhead which adds up - this needs further refactoring to deal
with the tail vs state problem
* persisting the finalized slot-to-hash index - this one also keeps
growing unbounded (albeit slowly)

Anyway, this PR easily shaves ~128mb of memory usage at the time of
writing.

* No longer honor `BeaconBlocksByRoot` requests outside of the
non-finalized period - previously, Nimbus would generously return any
block through this libp2p request - per the spec, finalized blocks
should be fetched via `BeaconBlocksByRange` instead.
* return `Opt[BlockRef]` instead of `nil` when blocks can't be found -
this becomes a lot more common now and thus deserves more attention
* `dag.blocks` -> `dag.forkBlocks` - this index only carries unfinalized
blocks from now - `finalizedBlocks` covers the other `BlockRef`
instances
* in backfill, verify that the last backfilled block leads back to
genesis, or panic
* add backfill timings to log
* fix missing check that `BlockRef` block can be fetched with
`getForkedBlock` reliably
* shortcut doppelganger check when feature is not enabled
* in REST/JSON-RPC, fetch blocks without involving `BlockRef`
# is on is no longer a viable fork candidate - we can't tell which is which
# at this stage, but we can check if we've seen the parent block previously
# and thus prevent requests for it to be downloaded again.
if dag.db.containsBlock(blck.parent_root):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should probably have a metric for the number of database queries we perform, so we can see when changes in the DAG logic have an effect on them.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yeah, I'm not too happy about this check in general - for gossip, it's fine because it happens after signature verification, but I need to think a bit more about what happens during sync - sync is in fact the main issue: if we return MissingParent here, it will go look for it which would be bad.

@zah zah merged commit 61342c2 into unstable Jan 21, 2022
@zah zah deleted the no-more-blocks branch January 21, 2022 11:33
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants