
Extremely slow blockchain sync due to HDD high load #761

Closed

vikulin opened this issue Mar 14, 2017 · 28 comments

Comments

vikulin commented Mar 14, 2017

ethereumj v1.5.0 was built recently from the develop branch.

[screenshot]

peer.discovery = {

    # if peer discovery is on
    # the discovery protocol will be used to find other nodes and connect to them
    # if peer discovery is off
    # this peer will only be connecting to the peers from [peer.active] list
    # inbound connections from other peers are accepted as well
    enabled = true

    # List of the seed peers to start
    # the search for online peers
    # values: [ip:port, enode://nodeid@ip:port, ip:port ...]
    ip.list = [
        # EthereumJ discovery nodes
        "frontier-2.ether.camp:30303",
        "frontier-3.ether.camp:30303",
        "frontier-4.ether.camp:30303",

        # C++ Ethereum discovery nodes
        "5.1.83.226:30303",

        # Geth discovery nodes
        "54.94.239.50:30303",
        "52.16.188.185:30303",
        "52.74.57.123:30303",

        # Parity discovery nodes
        "163.172.157.114:30303",
        "136.243.154.244:30303",
        "88.212.206.70:30303",
        "37.128.191.230:30303",
        "46.20.235.22:30303",
        "216.158.85.185:30303",
        "212.47.247.103:30303",
        "138.201.144.135:30303",
        "45.55.33.62:30303",
        "188.166.255.12:30303",
        "159.203.210.80:30303",
        "51.15.42.252:30303",
        "163.172.171.38:30303"
    ]

    # external IP/hostname which is reported as our host during discovery
    # if not set, the service http://checkip.amazonaws.com is used
    # the last resort is to get the peer.bind.ip address
    external.ip = null

    # Local network adapter IP to which
    # the discovery UDP socket is bound
    # e.g: 192.168.1.104
    #
    # if the value is empty it will be retrieved
    # by punching to some known address e.g: www.google.com
    bind.ip = ""

    # indicates whether the discovery will include own home node
    # within the list of neighbor nodes
    public.home.node = true

    # indicates if the discovered nodes and their reputations
    # are stored in DB and persisted between VM restarts
    persist = true

    # the period in seconds with which the discovery
    # tries to reconnect to successful nodes
    # 0 means the nodes are not reconnected
    touchPeriod = 600

    # the maximum number of nodes to reconnect to
    # -1 for unlimited
    touchMaxNodes = 100

    # number of workers that
    # test the peers for being
    # online [1..10]
    workers = 8
}

peer {
    # Port on which ethereumj will listen
    # for incoming connections
    listen.port = 30303

    # Network id
    networkId = 1

    # Private key of the peer
    # The key is generated by default on the first run and stored in the database folder
    # If you have your own peer ID, specify its private key here
    # derived nodeId = dead745c1dbcde518b48e52aca1e8d5ba666005a2c8804e39826c6080fb11c1e8abe41d1e41896e871f204f790a90fa9781744cccecf492212192a7c56e7673b
    # privateKey = f67c4032a7ff79bbfa7a780331b235c4eb681d51a0704cb1562064fb6c4bced4

    # Boot node list
    active = [
        # Sample entries:
        # { url = "enode://<hex nodeID>@hostname.com:30303" }
        # {
        #    ip = hostname.com
        #    port = 30303
        #    nodeId = e437a4836b77ad9d9ffe73ee782ef2614e6d8370fcf62191a6e488276e23717147073a7ce0b444d485fff5a0c34c4577251a7a990cf80d8542e21b95aa8c5e6c
        # }
    ]

    # list of trusted peers from which incoming connections are always accepted
    trusted = [
        # Sample entries:
        # {nodeId = "e437a4836b77ad9d9ffe73ee782ef2614e6d8370fcf62191a6e488276e23717147073a7ce0b444d485fff5a0c34c4577251a7a990cf80d8542e21b95aa8c5e6c"},
        # {ip = "11.22.33.44"},
        # {ip = "11.22.33.*"},
        # {
        #   nodeId = "e437a4836b77ad9d9ffe73ee782ef2614e6d8370fcf62191a6e488276e23717147073a7ce0b444d485fff5a0c34c4577251a7a990cf80d8542e21b95aa8c5e6c"
        #   ip = "11.22.33.44"
        # }
    ]

    # max number of active peers our node will maintain
    # extra peers trying to connect to us will be dropped with a TOO_MANY_PEERS message
    # the incoming connection from the peer matching 'peer.trusted' entry is always accepted
    maxActivePeers = 30

    # The protocols supported by peer
    # can be: [eth, shh, bzz]
    capabilities = [eth]

    # connection timeout for trying to
    # connect to a peer [seconds]
    connection.timeout = 2

    # how much time [seconds]
    # we will wait for a message
    # to arrive before closing the channel
    channel.read.timeout = 90

    p2p {
        # the default version outbound connections are made with
        # inbound connections are made with the version declared by the remote peer (if supported)
        # version = 4

        # max frame size in bytes when framing is enabled
        framing.maxSize = 32768

        # forces peer to send Handshake message in format defined by EIP-8,
        # see https://github.com/ethereum/EIPs/blob/master/EIPS/eip-8.md
        eip8 = true
    }
}

# the folder resources/genesis
# contains several versions of
# genesis configuration according
# to the network the peer will run on
genesis = frontier.json

# path to genesis file
# has priority over `genesis` option
# genesisFile = /some/path/to/frontier.json

# Blockchain settings (constants and algorithms) which are
# not described in the genesis file (like MINIMUM_DIFFICULTY or Mining algorithm)
# The possible named presets are:
# - main : the main network (Frontier-Homestead-...)
# - ropsten: Ropsten test network
# - testnet: Ethercamp test network
# - olympic: pre-Frontier Olympic network
# For custom network settings please refer to 'blockchain.config.class'
blockchain.config.name = "main"

# This is a more advanced replacement for 'blockchain.config.name'
# Here the exact org.ethereum.config.BlockchainForkConfig implementation
# class name can be specified.
# Only one of two options (this and above) can be defined.
#blockchain.config.class = "org.ethereum.config.fork.MainForkConfig"

database {
    # place to save physical storage files
    # can be either absolute or relative path
    dir = database

    # every time the application starts
    # the existing database will be
    # destroyed and all the data will be
    # downloaded from peers again [true/false]
    reset = false

    # If reset=true, every time the application
    # starts the database will reset itself to
    # this block number and sync again from there.
    # Set to 0 for a 'full' reset.
    resetBlock = 0

    # handling incompatible database version:
    #  * EXIT   - (default) show error in std out and exit by throwing Error
    #  * RESET  - clear database directory and continue working
    #  * IGNORE - continue working regardless of possible issues
    # @since 1.4.0
    incompatibleDatabaseBehavior = EXIT

    # controls state database pruning
    # pruned state consumes much less disk space (e.g. 50G full and 1G pruned)
    # but the state can be restored only within last [maxDepth] blocks, all older
    # states are lost
    prune {
        enabled = true

        # controls how many of the latest block states are kept unpruned
        # it is not recommended to set this value below 192
        # as it can prevent rebranching from long fork chains
        maxDepth = 192
    }
}

# Cache settings
cache {

    flush {
        # size in Mbytes of the write buffer for all datasources (state, blocks, transactions)
        # data is flushed to DB when write caches size exceeds this limit
        # value < 0 disables this option
        writeCacheSize = 64

        # force data flush each N blocks
        # [10000 flush each 10000 blocks]
        # value 0 disables this option
        blocks = 0

        # flush each block after the full (long) sync is complete
        shortSyncFlush = true
    }

    # total size in Mbytes of the state DB read cache
    stateCacheSize = 256

    # the size of block queue cache to be imported in MBytes
    blockQueueSize = 32

    # the size of header queue cache during import in MBytes
    headerQueueSize = 8

    # maximum size (in Mb) the state bloom filter can grow up to
    # when reaching this threshold the bloom filter
    # is turned off forever
    # 128M can handle approximately up to 50M db entries
    maxStateBloomSize = 128
}

# eth sync process
sync {

    # block chain synchronization
    # can be: [true/false]
    enabled = true

    # Fast sync options
    fast {

        # enables/disables fastsync
        # when enabling make sure the global sync option is enabled ('sync.enabled')
        enabled = true

        # when specified the fastsync retrieves the state for this block
        # this is the fast and secure option to do fastsync
        # if not specified the block is selected like [peerBestBlockNumber - 1000]
        //        pivotBlockHash = 6149ddfd7f52b2aa34a65b15ae117c269b5ff2dc58aa839dd015790553269411
    }

    # minimal peers count
    # used in sync process
    # sync may use more peers
    # than this value
    # but will try to get
    # at least this many from discovery
    peer.count = 30

    # Uncomment this param
    # to use a strict Eth version.
    # Useful for testing
    # version = 62

    # exit if we receive a block that causes state conflict
    # this option is mainly for debugging purposes
    exitOnBlockConflict = false
}

# miner options
mine {

    # start mining blocks
    # when 'sync.enabled' is true the mining starts when the sync is complete
    # else the mining will start immediately, taking the best block from database
    # (or genesis if no blocks exist yet)
    start = false

    # number of CPU threads the miner will mine on
    # 0 disables CPU mining
    cpuMineThreads = 4

    # there are two options for CPU mining: 'light' and 'full'
    # 'light' requires only 16M of RAM but is much slower
    # 'full' requires 1G of RAM and possibly ~7min for the DataSet generation
    #   but is much faster during mining
    fullDataSet = true

    # mining beneficiary
    coinbase = "0000000000000000000000000000000000000000"

    # extra data included in the mined block
    # one of two properties should be specified
    extraData = "EthereumJ powered"
    #extraDataHex = "0102abcd"

    # transactions with the gas price lower than this will not be
    # included in mined blocks
    # decimal number in weis
    minGasPrice = 15000000000  # 15 Gwei

    # minimal timeout between mined blocks
    minBlockTimeoutMsec = 0

    # start mining with a specific nonce (might be useful for testing)
    # null for random start nonce
    startNonce = null
}

# Solidity options
solc {
    # Full path to solc executable
    # If path is not provided, bundled Solidity Compiler is used
    path = null
}

# ========================================================
#    Debug and advanced options
# ========================================================

# the time [seconds] we wait for the network
# to approve the transaction; the
# transaction is considered approved when
# it is included in a transactions msg
# retrieved from the peer
transaction.approve.timeout = 15

# the number of blocks that should pass
# before a pending transaction is removed
transaction.outdated.threshold = 10


dump {
  # for testing purposes
  # all the state will be dumped
  # in JSON form to [dump.dir]
  # if [dump.full] = true
  # possible values [true/false]
  full = false
  dir = dmp

  # This defines the vmtrace dump
  # to the console and the style
  # -1 for no block trace
  # styles: [pretty/standard+] (default: standard+)
  block = -1
  style = pretty

  # clean the dump dir each start
  clean.on.restart = true
}

# structured trace
# is the trace being
# collected in the
# form of objects and
# exposed to the user
# in json or any other
# convenient form.
vm.structured {
  trace = false
  dir = vmtrace
  compressed = true
  initStorageLimit = 10000
}

# make changes to tracing options
# starting from certain block
# -1 don't make any tracing changes
trace.startblock = -1

# invoke vm program on
# message received,
# if the vm is not invoked
# the balance transfer
# occurs anyway  [true/false]
play.vm = true

# hello phrase will be included in
# the hello message of the peer
hello.phrase = Dev

# this property used
# mostly for debug purposes
# so if you don't know exactly how
# to apply it, leave it as [-1]
#
# ADVANCED: if we want to load a root hash
# for the db not from the saved blockchain (last block)
# but from a manually specified hash, this property will help.
# values [-1] - load from db
#        [hex hash 32 bytes] root hash
root.hash.start = null

# Key value data source values: [leveldb/mapdb/inmem]
keyvalue.datasource = leveldb

record.blocks=false
blockchain.only=false

# Load the blocks
# from an RLP-lines
# file instead of
# from the net
blocks.loader=""

#crypto settings, such as sign, hash, etc
crypto {
	# JCA crypto provider name
	providerName="SC"
	# Used to create a JCA MessageDigest
	hash.alg256="ETH-KECCAK-256"
	hash.alg512="ETH-KECCAK-512"
}
vikulin changed the title from "Extrimly slow blockchain sync" to "Extremely slow blockchain sync" on Mar 14, 2017
vikulin (Author) commented Mar 14, 2017

geth has the same issue:

I0314 15:25:15.450172 core/blockchain.go:1070] imported 1 blocks, 11 txs ( 1.474 Mg) in 1m37.881s ( 0.015 Mg/s). #2333527 [b8969b19…]
I0314 15:25:40.327793 core/blockchain.go:1070] imported 2 blocks, 16 txs ( 1.568 Mg) in 24.877s ( 0.063 Mg/s). #2333529 [b8969b19… / b337334e…]
I0314 15:27:17.021910 core/blockchain.go:1070] imported 1 blocks, 13 txs ( 1.492 Mg) in 1m36.694s ( 0.015 Mg/s). #2333530 [e0f95a84…]
I0314 15:29:36.139951 core/blockchain.go:1070] imported 6 blocks, 79 txs ( 3.033 Mg) in 2m19.118s ( 0.022 Mg/s). #2333536 [e0f95a84… / 1df83234…]
I0314 15:30:50.566945 core/blockchain.go:1070] imported 1 blocks, 7 txs ( 1.424 Mg) in 1m14.426s ( 0.019 Mg/s). #2333537 [3dcabe2d…]
I0314 15:32:19.962481 core/blockchain.go:1070] imported 3 blocks, 67 txs ( 2.750 Mg) in 1m29.395s ( 0.031 Mg/s). #2333540 [3dcabe2d… / f7b79651…]
I0314 15:33:08.934767 core/blockchain.go:1070] imported 2 blocks, 75 txs ( 2.899 Mg) in 48.972s ( 0.059 Mg/s). #2333542 [f7b79651… / 2bf839a6…]
I0314 15:34:14.448058 core/blockchain.go:1070] imported 6 blocks, 77 txs ( 2.950 Mg) in 1m5.513s ( 0.045 Mg/s). #2333548 [2bf839a6… / f881cfb9…]
I0314 15:35:57.471610 core/blockchain.go:1070] imported 6 blocks, 146 txs ( 4.408 Mg) in 1m43.023s ( 0.043 Mg/s). #2333554 [f881cfb9… / 911cbbea…]
I0314 15:37:04.394776 core/blockchain.go:1070] imported 2 blocks, 78 txs ( 2.928 Mg) in 1m6.923s ( 0.044 Mg/s). #2333556 [911cbbea… / df2060b7…]
I0314 15:38:44.636901 core/blockchain.go:1070] imported 2 blocks, 67 txs ( 2.705 Mg) in 1m40.242s ( 0.027 Mg/s). #2333558 [df2060b7… / 40ca4e3e…]
I0314 15:39:28.818434 core/blockchain.go:1070] imported 13 blocks, 173 txs ( 5.421 Mg) in 44.181s ( 0.123 Mg/s). #2333571 [40ca4e3e… / a325b3ba…]
I0314 15:39:58.023458 core/blockchain.go:1070] imported 3 blocks, 48 txs ( 2.705 Mg) in 29.205s ( 0.093 Mg/s). #2333574 [a325b3ba… / fdc7028e…]
I0314 15:40:32.367048 core/blockchain.go:1070] imported 1 blocks, 9 txs ( 1.454 Mg) in 34.343s ( 0.042 Mg/s). #2333575 [81354ea9…]
I0314 15:41:48.776117 core/blockchain.go:1070] imported 2 blocks, 22 txs ( 1.786 Mg) in 1m16.409s ( 0.023 Mg/s). #2333577 [81354ea9… / e98418e8…]
I0314 15:43:09.068631 core/blockchain.go:1070] imported 10 blocks, 74 txs ( 3.395 Mg) in 1m20.292s ( 0.042 Mg/s). #2333587 [e98418e8… / 85afa036…]
I0314 15:45:24.178319 core/blockchain.go:1070] imported 4 blocks, 9 txs ( 1.421 Mg) in 2m15.109s ( 0.011 Mg/s). #2333591 [85afa036… / 44d4b7c3…]
I0314 15:46:47.874116 core/blockchain.go:1070] imported 1 blocks, 11 txs ( 1.414 Mg) in 1m23.695s ( 0.017 Mg/s). #2333592 [77d8f6db…]
I0314 15:47:54.084000 core/blockchain.go:1070] imported 8 blocks, 63 txs ( 4.095 Mg) in 1m6.209s ( 0.062 Mg/s). #2333600 [77d8f6db… / ee0ad27b…]
I0314 15:48:53.464001 core/blockchain.go:1070] imported 1 blocks, 6 txs ( 1.489 Mg) in 59.380s ( 0.025 Mg/s). #2333601 [685098d0…]
I0314 15:49:51.885306 core/blockchain.go:1070] imported 2 blocks, 11 txs ( 1.595 Mg) in 58.421s ( 0.027 Mg/s). #2333603 [685098d0… / 28066e51…]
I0314 15:50:19.003543 core/blockchain.go:1070] imported 2 blocks, 9 txs ( 1.427 Mg) in 27.118s ( 0.053 Mg/s). #2333605 [28066e51… / 3c23eb36…]
I0314 15:50:34.162007 core/blockchain.go:1070] imported 5 blocks, 17 txs ( 1.669 Mg) in 15.158s ( 0.110 Mg/s). #2333610 [3c23eb36… / 681d0b2d…]
I0314 15:51:42.092383 core/blockchain.go:1070] imported 1 blocks, 11 txs ( 1.494 Mg) in 1m7.930s ( 0.022 Mg/s). #2333611 [f8108232…]
I0314 15:52:14.785329 core/blockchain.go:1070] imported 2 blocks, 12 txs ( 1.580 Mg) in 32.692s ( 0.048 Mg/s). #2333613 [f8108232… / cd1defb4…]
I0314 15:53:19.520653 core/blockchain.go:1070] imported 3 blocks, 6 txs ( 1.474 Mg) in 1m4.735s ( 0.023 Mg/s). #2333616 [cd1defb4… / 4a499218…]
I0314 15:53:55.428710 core/blockchain.go:1070] imported 3 blocks, 9 txs ( 1.380 Mg) in 35.908s ( 0.038 Mg/s). #2333619 [4a499218… / 327afd3d…]
I0314 15:55:05.434279 core/blockchain.go:1070] imported 20 blocks, 291 txs ( 8.095 Mg) in 1m10.005s ( 0.116 Mg/s). #2333639 [327afd3d… / 3b8a2130…]
I0314 15:56:27.968242 core/blockchain.go:1070] imported 1 blocks, 12 txs ( 1.474 Mg) in 1m22.533s ( 0.018 Mg/s). #2333640 [79b0026c…]
I0314 15:57:03.839342 core/blockchain.go:1070] imported 1 blocks, 11 txs ( 1.454 Mg) in 35.871s ( 0.041 Mg/s). #2333641 [c4dadab9…]
I0314 15:57:31.252783 core/blockchain.go:1070] imported 3 blocks, 27 txs ( 1.758 Mg) in 27.413s ( 0.064 Mg/s). #2333644 [c4dadab9… / 49196712…]
I0314 15:58:28.117783 core/blockchain.go:1070] imported 5 blocks, 140 txs ( 4.296 Mg) in 56.864s ( 0.076 Mg/s). #2333649 [49196712… / d24e3372…]
I0314 16:00:13.414686 core/blockchain.go:1070] imported 3 blocks, 61 txs ( 2.899 Mg) in 1m45.296s ( 0.028 Mg/s). #2333652 [d24e3372… / 35f519b6…]
I0314 16:01:24.591430 core/blockchain.go:1070] imported 12 blocks, 192 txs ( 5.607 Mg) in 1m11.176s ( 0.079 Mg/s). #2333664 [35f519b6… / 95402048…]
I0314 16:02:26.798181 core/blockchain.go:1070] imported 1 blocks, 7 txs ( 1.474 Mg) in 1m2.206s ( 0.024 Mg/s). #2333665 [bcfe49ee…]
I0314 16:04:28.884240 core/blockchain.go:1070] imported 1 blocks, 7 txs ( 1.419 Mg) in 2m2.086s ( 0.012 Mg/s). #2333666 [ece6b73b…]
I0314 16:05:01.032477 core/blockchain.go:1070] imported 1 blocks, 10 txs ( 1.457 Mg) in 32.148s ( 0.045 Mg/s). #2333667 [3fbacf02…]
I0314 16:05:58.576369 core/blockchain.go:1070] imported 6 blocks, 141 txs ( 4.339 Mg) in 57.543s ( 0.075 Mg/s). #2333673 [3fbacf02… / 6c6437a7…]
I0314 16:07:20.273267 core/blockchain.go:1070] imported 3 blocks, 10 txs ( 1.467 Mg) in 1m21.696s ( 0.018 Mg/s). #2333676 [6c6437a7… / cdeaf5d6…]
I0314 16:08:00.833045 core/blockchain.go:1070] imported 3 blocks, 60 txs ( 2.861 Mg) in 40.559s ( 0.071 Mg/s). #2333679 [cdeaf5d6… / b277a545…]
I0314 16:08:43.221544 core/blockchain.go:1070] imported 9 blocks, 93 txs ( 3.294 Mg) in 42.388s ( 0.078 Mg/s). #2333688 [b277a545… / 72a77155…]
I0314 16:10:27.976519 core/blockchain.go:1070] imported 7 blocks, 132 txs ( 4.326 Mg) in 1m44.754s ( 0.041 Mg/s). #2333695 [72a77155… / 885158da…]

vikulin (Author) commented Mar 14, 2017

Related issue in geth client:
ethereum/aleth#3133

cupuyc (Collaborator) commented Mar 14, 2017

@vikulin If you have time, it would be great if you could try with the latest develop. There was one fix related to persistence.

If that doesn't help, we will need to see similar reports. For me, a regular sync takes a normal amount of time.

vikulin (Author) commented Mar 14, 2017

@cupuyc, can you please point me to the fix number?

cupuyc (Collaborator) commented Mar 14, 2017

Happened in #762

vikulin (Author) commented Mar 14, 2017

The same result. It looks like there is no performance improvement since the last update.
00:00:06.714 INFO [general] Running frontier.json, core version: 1.5.0-SNAPSHOT
00:00:06.718 INFO [general] git.hash: [7a047c5]

[screenshot: selection_184]

vikulin (Author) commented Mar 14, 2017

I noticed that my HDD is constantly at 100% utilization from the Java process running ethereumj. It might be a local blockchain DB issue.

[screenshot: selection_185]

cupuyc (Collaborator) commented Mar 15, 2017

@vikulin

  1. Such a busy disk may point to a hardware issue. Did you try geth on the same PC as ethereumJ?

  2. It would be great if you could try on another PC.

vikulin (Author) commented Mar 15, 2017

@cupuyc, this is definitely an ethereumJ issue: when I exit the ethereumJ sync with CTRL+C, my HDD usage drops to 0%.

vikulin (Author) commented Mar 15, 2017

@cupuyc, also, my hardware is capable of this workload: 6 GB RAM, SATA disk, Phenom II X4 950.

cupuyc (Collaborator) commented Mar 15, 2017

Are you syncing from a blank database each time or do you have some database backup?

vikulin (Author) commented Mar 15, 2017

I did not make any backups; I fetched all blockchain data from scratch.

vikulin (Author) commented Mar 17, 2017

Do you have any updates?

I added an SSD specifically for the blockchain DB, and the disk usage is still about 80-90% at 40-50 MB/sec the whole time. This is definitely a high load on the hardware for very little network traffic, and it should be fixed as a priority defect.

[screenshot: selection_199]

vikulin changed the title from "Extremely slow blockchain sync" to "Extremely slow blockchain sync due to HDD high load" on Mar 17, 2017
zilm13 (Collaborator) commented Mar 17, 2017

@vikulin Hi!
You are currently syncing through the attack blocks. They contain transactions involving thousands of accounts with heavy operations. We have improved regular sync time in the very latest develop, so if you are on a 1.4.x version you will see a speed improvement there, but it uses a new DB version, so you will need to start syncing over from the beginning. Further improvement of regular synchronisation is very difficult and is not on the current schedule.
It's strongly recommended to use fast sync instead, unless you need something special like a DB without pruning. With a good internet connection you will have an unsecured sync in 30 minutes and a fully secured fast sync in 2 hours.

vikulin (Author) commented Mar 17, 2017

@zilm13 , I'm running the latest 1.5.0 develop.

Also, thank you for the clarification. As far as I know, the operations in those blocks should not need to be executed, because past blockchain data is static. Why do the attackers' blocks have such a heavy impact on sync?

zilm13 (Collaborator) commented Mar 17, 2017

@vikulin during regular sync we execute all transactions for verification; it is the only way to calculate the state changes from this kind of data. It also gives developers an opportunity to add their own listeners and run their own logic on the results of execution.

vikulin (Author) commented Mar 17, 2017

@zilm13 , can ethereumj determine which transactions were generated by attackers and exclude them from the execution process in the future?

cupuyc (Collaborator) commented Mar 17, 2017

@vikulin If you don't want to replay transactions, you could just use fast sync
(-Dsync.fast.enabled=true) and get the state loaded and ready to work very quickly.
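
For reference, the same switch can be set in ethereumj.conf instead of as a JVM property. A minimal sketch reusing the sync.fast section from the config pasted in the issue body (option names as shown there; behaviour may vary between releases):

sync {
    # global sync must stay enabled for fast sync to run
    enabled = true

    fast {
        # download the recent state instead of replaying historical transactions
        enabled = true
    }
}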

zilm13 (Collaborator) commented Mar 17, 2017

@vikulin That would require fuzzy logic which could hurt normal transactions. These transactions are valid; before the hard fork the gas rules simply made it cheap to DoS clients with them. You can learn more about the changes made in Ethereum to make such attacks expensive in EIP 150. Also, after the hard fork there are special transactions removing empty-account data, one by one, and we have to keep the network state correct at every block rather than just make it faster. Anyway, we have fast sync today and it covers 99% of use cases; for everyone else, these attack blocks are a permanent part of the chain's history.

vikulin (Author) commented Apr 14, 2017

@cupuyc, @zilm13: I checked sync again. This is what I see with fast sync = false:

04:18:21.668 INFO [sample] Blockchain sync in progress. Last imported block: #2430054 (9cbd54 <~ 694bba) Txs:17, Unc: 1 (Total: txs: 83, gas: 2119k)
04:18:31.669 INFO [sample] Blockchain sync in progress. Last imported block: #2430066 (804d6a <~ b1818f) Txs:0, Unc: 0 (Total: txs: 29, gas: 1485k)
04:18:41.670 INFO [sample] Blockchain sync in progress. Last imported block: #2430092 (a3c464 <~ 501688) Txs:0, Unc: 0 (Total: txs: 79, gas: 2819k)
04:18:51.671 INFO [sample] Blockchain sync in progress. Last imported block: #2430092 (a3c464 <~ 501688) Txs:0, Unc: 0 (Total: txs: 0, gas: 0k)
04:19:01.671 INFO [sample] Blockchain sync in progress. Last imported block: #2430096 (f9dcb0 <~ b48c1e) Txs:0, Unc: 0 (Total: txs: 40, gas: 2180k)
04:19:11.672 INFO [sample] Blockchain sync in progress. Last imported block: #2430096 (f9dcb0 <~ b48c1e) Txs:0, Unc: 0 (Total: txs: 0, gas: 0k)

With fast sync, it works at the same speed if I do not start over from scratch:
06:47:49.865 INFO [sync] FastSync: fast sync was completed, best block: (#2432680 (e3fd8e <~ 10df6a) Txs:15, Unc: 0). Continue with regular sync...
06:47:49.866 INFO [sync] Initializing SyncManager regular sync.
06:47:50.196 INFO [sync] Initializing BlockDownloader.
06:47:50.210 INFO [discover] Reading Node statistics from DB: 2638 nodes.
06:47:50.350 INFO [discover] Pinging discovery nodes...
06:47:51.253 INFO [sample] Sample component created. Listening for ethereum events...
06:47:51.257 INFO [sample] Sample worker thread started.
06:47:51.257 INFO [sample] Waiting for nodes discovery...
06:47:51.571 INFO [sample] [v] Discovery works, new nodes started being discovered.
06:47:51.574 INFO [sample] Waiting for available Eth capable nodes...
06:47:52.578 INFO [sample] [v] Available Eth nodes found.
06:47:52.578 INFO [sample] Searching for peers to sync with...
06:47:56.700 INFO [sample] [v] At least one sync peer found.
06:47:56.700 INFO [sample] Current BEST block: #2432680 (e3fd8e <~ 10df6a) Txs:15, Unc: 0
06:47:56.700 INFO [sample] Waiting for blocks start importing (may take a while)...
06:48:06.973 INFO [sample] [v] Blocks import started.
06:48:06.973 INFO [sample] Waiting for the whole blockchain sync (will take up to several hours for the whole chain)...
06:48:16.994 INFO [sample] Blockchain sync in progress. Last imported block: #2432686 (826fb5 <~ 4da83d) Txs:0, Unc: 1 (Total: txs: 7, gas: 150k)
06:48:27.044 INFO [sample] Blockchain sync in progress. Last imported block: #2432686 (826fb5 <~ 4da83d) Txs:0, Unc: 1 (Total: txs: 0, gas: 0k)
06:48:37.057 INFO [sample] Blockchain sync in progress. Last imported block: #2432694 (56808e <~ 664349) Txs:0, Unc: 0 (Total: txs: 70, gas: 2393k)
06:48:47.058 INFO [sample] Blockchain sync in progress. Last imported block: #2432695 (57ac89 <~ 56808e) Txs:24, Unc: 0 (Total: txs: 24, gas: 504k)
06:48:57.058 INFO [sample] Blockchain sync in progress. Last imported block: #2432695 (57ac89 <~ 56808e) Txs:24, Unc: 0 (Total: txs: 0, gas: 0k)
06:49:07.059 INFO [sample] Blockchain sync in progress. Last imported block: #2432695 (57ac89 <~ 56808e) Txs:24, Unc: 0 (Total: txs: 0, gas: 0k)

ghost commented Jun 26, 2017

leveldb seems really slow. I changed to use:
keyvalue.datasource = inmem
in my ethereum.conf and it sped up a lot.

Edit: maybe the default leveldb caching params need some adjustment? I assume inmem means I would have to rebuild my chain on every restart.

ghost commented Jun 26, 2017

Still finding optimal values, but so far:

cache.flush.writeCacheSize = 6144
cache.flush.blocks = 100000
cache.flush.shortSyncFlush = true 

seems to work pretty well when using a JVM heap of -Xmx7500m.
Increasing the other cache size params has so far resulted in heap OOM issues in my testing.

I am not using fast sync, and am running tests on an m4.large with an EFS volume for the blockchain on Amazon Web Services. Is there a guide or documentation somewhere on what proportions I should use for stateCacheSize, blockQueueSize, headerQueueSize, and maxStateBloomSize relative to writeCacheSize? Even when I keep the sum of these values under writeCacheSize, I still get heap out-of-memory errors, even though my writeCacheSize of 6144 is under my heap size of 7500.
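
For what it's worth, here is one way these settings could be laid out together, based on the cache section of the default config quoted in the issue body. The values are illustrative assumptions, not tested recommendations; the read caches and the write buffer appear to be separate allocations, so their combined size plus JVM overhead needs to stay well under -Xmx:

cache {
    flush {
        # write buffer for all datasources (state, blocks, transactions), in MB
        writeCacheSize = 1024
        # force a flush every N blocks (0 disables)
        blocks = 10000
        shortSyncFlush = true
    }
    # state DB read cache, in MB (separate from the write buffer above)
    stateCacheSize = 512
    # block and header import queues, in MB
    blockQueueSize = 32
    headerQueueSize = 8
    # state bloom filter cap, in MB
    maxStateBloomSize = 128
}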

CSPRNG commented Jul 2, 2017

Hi guys,

New contributor, my first post. I hope this is the correct place to ask.

The issue is that creating and synchronising the blockchain for the first time is progressively slowing down, from 16329 blocks per minute during the first hour to 10406 blocks per hour after 4 hours.
(The first half of the chain synchronized in 3.5 hours; after that it became slower and slower, to the point that it will now take longer than 4 days to synchronize the remaining 1.5 million blocks.)

The same pattern has been reported previously by Geth users (see https://www.reddit.com/r/ethereum/comments/5e9q0i/geth_is_very_slow_to_sync_after_block_2420000/), and in fact the process became very slow around the same blocks.

Hoping to reset something without killing the process, I turned off the router but left the program running and went to sleep. A few hours later I turned the router back on. During those hours nothing new was added to the chain, but as soon as the router was back online it started downloading and synchronising new blocks. It was a bit faster than before, but not by much: 13289 blocks per hour. There are roughly 1505379 blocks remaining, so at this speed it will take more than 4 days to finish, assuming it does not slow down further.

I am saving all the output of this process (one line of text every 10 seconds); if this can help anyone retrace what is going on and improve it, just ask me. As mentioned, this is my first attempt with Ethereum and I do not really know much about anything yet, so any help, hints, or advice is appreciated.

I plan to install and work with the C++ and Java versions of Ethereum and followed the instructions here https://github.com/ethereum/ethereumj:

Running from command line:

git clone https://github.com/ethereum/ethereumj
cd ethereumj
gradlew run -PmainClass=org.ethereum.samples.BasicSample

The machine is a MacBook Pro with 16 GB RAM, 4 cores, and a new 850 GB SSD drive.

How can I stop everything without losing what is already done?
I would like to restart everything and force it to rediscover peers, etc.

Thank you
Mark

jeremie-H commented Sep 16, 2017

Hello!
I've got the same problem here (32 GB RAM / Core i5 4.2 GHz, blockchain on an SSD), running ethereumj-core-1.5.0-RELEASE:

20:29:07.983 INFO [db]  DbFlushManager: flushing db due to write cache size (268537772) reached threshold (268435456)
20:29:07.983 INFO [db]  Waiting for previous flush to complete...
20:32:16.833 WARN [net]  FrameCodec failed: 
io.netty.handler.timeout.ReadTimeoutException: null
20:39:42.077 INFO [db]  Flush completed in 673690 ms

I don't understand why a flush on an SSD takes more than 15 minutes. There is only 256 MB to flush; it should complete in under a second on any SSD (by building an in-memory representation of the file if needed, then flushing it).
It's obvious that there is a real problem here, and a high-priority one I think.
The configuration I've tried:
cache.writeCacheSize = 256
cache.stateCacheSize = 256
cache.blockQueueSize = 32
headerQueueSize = 8

sync.fast.enabled = false
record.internal.transactions.data = true
keyvalue.datasource = leveldb

adamsmd commented Nov 10, 2017

I have a similar problem and did some digging. I now know what part of the code is slow but unfortunately not why or how to speed it up. I'm posting this in hopes that it will help someone else figure the rest out.

Note that while I also saw slowness starting around block 2,400,000, I didn't run these tests until around block 2,850,000, so it is possible that the problems are different from those reported by @vikulin.

TL;DR

Flushing the AsyncWriteCache for state is slow, but the bottleneck is disk reads, not disk writes. This is because the AsyncWriteCache for state uses CountingBytesSource, which does a ReadCache.get() call inside its CountingBytesSource.put(). This get() call is consuming all the time.
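
To make the read amplification concrete, here is a simplified Java sketch of the reference-counting pattern described above: every put() has to read the current entry to bump its counter before writing it back, so flushing the write cache turns into roughly one read per key against the backing store whenever the read caches miss. These are hypothetical classes, not the real ethereumj Source/CountingBytesSource API.

import java.util.HashMap;
import java.util.Map;

// Simplified sketch: a reference-counting byte source must read before it writes.
class CountingSourceSketch {
    private final Map<String, byte[]> backing = new HashMap<>(); // stands in for the DB behind the caches

    void put(String key, byte[] value) {
        byte[] existing = backing.get(key);          // the expensive random read during flush
        int count = (existing == null) ? 0 : decodeCount(existing);
        backing.put(key, encode(count + 1, value));  // write back value plus incremented counter
    }

    private static int decodeCount(byte[] stored) {
        // in this sketch the first 4 bytes hold the reference count
        return ((stored[0] & 0xFF) << 24) | ((stored[1] & 0xFF) << 16)
                | ((stored[2] & 0xFF) << 8) | (stored[3] & 0xFF);
    }

    private static byte[] encode(int count, byte[] value) {
        byte[] out = new byte[4 + value.length];
        out[0] = (byte) (count >>> 24);
        out[1] = (byte) (count >>> 16);
        out[2] = (byte) (count >>> 8);
        out[3] = (byte) count;
        System.arraycopy(value, 0, out, 4, value.length);
        return out;
    }
}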

Details

Looking at the output, it seems that it is the flush process that is slow: INFO [db] Flush completed in 43851 ms.

It is so slow that flushes fail to complete before the next flush is triggered. This results in the next flush having to wait: INFO [db] Waiting for previous flush to complete...

Looking at the logs/ethereum.log file (which contains DEBUG outputs) all of the AsyncWriteCache flushes are fast with the exception of the one for state:

DEBUG [db]  AsyncWriteCache (state): flush completed in 43846 ms
DEBUG [db]  AsyncWriteCache (journal): flush submitted
DEBUG [db]  AsyncWriteCache (journal): flush started
DEBUG [db]  AsyncWriteCache (journal): flush completed in 0 ms
DEBUG [db]  AsyncWriteCache (block): flush submitted
DEBUG [db]  AsyncWriteCache (block): flush started
DEBUG [db]  AsyncWriteCache (block): flush completed in 0 ms
DEBUG [db]  AsyncWriteCache (index): flush submitted
DEBUG [db]  AsyncWriteCache (index): flush started
DEBUG [db]  AsyncWriteCache (index): flush completed in 0 ms
DEBUG [db]  AsyncWriteCache (transactions): flush submitted
DEBUG [db]  AsyncWriteCache (transactions): flush started
DEBUG [db]  AsyncWriteCache (transactions): flush completed in 0 ms

Profiling with VisualVM, it appears that most of the time is spent in the org.ethereum.datasource.ReadCache.get() call inside org.ethereum.datasource.CountingBytesSource.put().

AsyncWriteCacheThread-0	100.0	43,097 ms (100%)	42,888 ms
java.lang.Thread.run()	100.0	43,097 ms (100%)	42,888 ms
java.util.concurrent.ThreadPoolExecutor$Worker.run()	100.0	43,097 ms (100%)	42,888 ms
java.util.concurrent.ThreadPoolExecutor.runWorker()	100.0	43,097 ms (100%)	42,888 ms
java.util.concurrent.FutureTask.run()	100.0	43,097 ms (100%)	42,888 ms
org.ethereum.datasource.AsyncWriteCache$1.call()	100.0	43,097 ms (100%)	42,888 ms
org.ethereum.datasource.AsyncWriteCache$1.call()	100.0	43,097 ms (100%)	42,888 ms
org.ethereum.datasource.WriteCache.flush()	100.0	43,097 ms (100%)	42,888 ms
org.ethereum.datasource.CountingBytesSource.put()	100.0	43,097 ms (100%)	42,888 ms
org.ethereum.datasource.CountingBytesSource.put()	100.0	43,097 ms (100%)	42,888 ms
org.ethereum.datasource.ReadCache.get()	99.51493	42,888 ms (99.5%)	42,888 ms
org.ethereum.datasource.BloomedSource.get()	99.51493	42,888 ms (99.5%)	42,888 ms
org.ethereum.datasource.BloomedSource.get()	99.51493	42,888 ms (99.5%)	42,888 ms
org.ethereum.datasource.XorDataSource.get()	99.51493	42,888 ms (99.5%)	42,888 ms
org.ethereum.datasource.XorDataSource.get()	99.51493	42,888 ms (99.5%)	42,888 ms
org.ethereum.datasource.WriteCache.get()	99.51493	42,888 ms (99.5%)	42,888 ms
org.ethereum.datasource.BatchSourceWriter.get()	99.51493	42,888 ms (99.5%)	42,888 ms
org.ethereum.datasource.leveldb.LevelDbDataSource.get()	99.51493	42,888 ms (99.5%)	42,888 ms
org.ethereum.datasource.leveldb.LevelDbDataSource.get()	99.51493	42,888 ms (99.5%)	42,888 ms
org.fusesource.leveldbjni.internal.JniDB.get()	99.51493	42,888 ms (99.5%)	42,888 ms
org.fusesource.leveldbjni.internal.JniDB.get()	99.51493	42,888 ms (99.5%)	42,888 ms
org.fusesource.leveldbjni.internal.NativeDB.get()	99.51493	42,888 ms (99.5%)	42,888 ms
org.fusesource.leveldbjni.internal.NativeDB.get()	99.51493	42,888 ms (99.5%)	42,888 ms
org.fusesource.leveldbjni.internal.NativeDB.get()	99.51493	42,888 ms (99.5%)	42,888 ms
org.fusesource.leveldbjni.internal.NativeDB$DBJNI.Get[native]()	99.51493	42,888 ms (99.5%)	42,888 ms

Running sudo iotop shows about 100 times as many bytes read as written, which is consistent with the slowness being in get().

From reading the source code, it looks like the AsyncWriteCache for state is the only AsyncWriteCache that uses CountingBytesSource, which is consistent with the non-state flushes being faster:

public StateSource(Source<byte[], byte[]> src, boolean pruningEnabled, int maxBloomSize) {

Based on these results, I set the following configurations, but it did not seem to help:

cache.stateCacheSize = 5000
cache.maxStateBloomSize = 5000

Summary

  1. The bottleneck is the get() needed by CountingBytesSource.
  2. I don't understand why the get() reads so much more data than the put() writes out.
  3. Performance might increase if those (redundant?) reads could be reduced.
  4. Alternatively, performance might increase if we could improve the performance of get() in ReadCache, BloomedSource, XorDataSource, WriteCache, BatchSourceWriter, or LevelDbDataSource.

zilm13 (Collaborator) commented Jan 8, 2018

@vbarzov @adamsmd We are currently switching to a new DB (RocksDB); the next release will ship with it instead of LevelDB. This could help, but blockchain synchronisation is very IOPS-constrained anyway, so you are better off running it on an SSD. I can only recommend turning pruning off (database.prune.enabled = false) and trying to increase the caches in the settings.
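
A minimal sketch of those two suggestions as ethereumj.conf overrides, using the keys from the default config in the issue body (the cache value is an illustrative assumption, and disabling pruning costs considerably more disk space, as the comments in that config note):

database {
    prune {
        # keep all historical state instead of pruning (much larger on disk)
        enabled = false
    }
}

cache {
    # state DB read cache in MB; raise only as far as the JVM heap allows
    stateCacheSize = 1024
}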

azewiusz commented:

And not HBase?

mkalinin (Contributor) commented:

Resolved in 1.7.0
