Ethereum Foundation Blog Announces Geth v1.9.0

After many months of silence, we’re proud to announce the v1.9.0 release of Go Ethereum! Although this release has been in the making for a lot longer than we anticipated, we’re confident there will be some juicy feature for everyone to enjoy!

Warning: We’ve tried our best to squash all the bugs, but as with all major releases, we advise everyone to take extra care when upgrading. The v1.9.0 release contains database schema changes, meaning it’s not possible to downgrade once updated. We also recommend a fresh fast sync as it can drastically reduce the database size.

Some of the features mentioned here have been silently shipped over the course of the 1.8.x release family, but we’ve deemed them important enough to explicitly highlight.

Performance

It’s interesting to realize that the “Performance” section was somewhere at the end of previous announcements, but over the years it became one of the most sought-after improvements.

Over the past 6 months, we’ve tried to dissect the different components that are on the critical path of block processing, in an attempt to identify and optimize some bottlenecks. Among the many improvements, the highest impact ones were:
– The discovery and optimization of a quadratic CPU and disk IO complexity, originating from the Go implementation of LevelDB. This caused Geth to be starved and stalled, getting progressively worse as the database grew. Huge shoutout to Gary Rong for his relentless efforts, especially as his work is beneficial to the entire Go community.
– The analysis and optimization of the account and storage trie access patterns across blocks. This resulted in stabilizing Geth’s memory usage even during the import of the Shanghai DoS blocks and speeding up overall block processing by concurrent heuristic state prefetching. This work was mostly done by Péter Szilágyi.
– The analysis and optimization of various EVM opcodes, aiming to find outliers both in Geth’s EVM implementation and in Ethereum’s protocol design in general. This led to fixes in Geth as well as information being fed into the Eth 1.x scaling discussions. Shoutout goes to Martin Holst Swende for pioneering this effort.
– The analysis and optimization of our database schemas, trying both to remove any redundant data and to redesign indexes for lower disk use (sometimes at the cost of a slight CPU hit). Props for these efforts (spanning 6-9 months) go to Alexey Akhunov, Gary Rong, Péter Szilágyi, and Matthew Halpern.
– The discovery of a LevelDB compaction overhead during the state sync phase of fast sync. By temporarily allocating pruning caches to fast sync blooms, we’ve been able to short-circuit most data accesses in-memory. This work was mostly done by Péter Szilágyi.
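
The last point relies on a Bloom filter: a compact in-memory structure that can answer "definitely not present" without touching the disk. The following is a minimal sketch of that idea, not Geth's actual implementation; the `bloom` type, its sizing, the FNV-based double hashing, and the map standing in for LevelDB are all assumptions made for illustration.

```go
package main

import (
	"fmt"
	"hash/fnv"
)

// bloom is a toy Bloom filter: k probe positions over a fixed-size bit set.
type bloom struct {
	bits []uint64
	m    uint64 // number of bits
	k    uint64 // number of probes per key
}

func newBloom(m, k uint64) *bloom {
	return &bloom{bits: make([]uint64, (m+63)/64), m: m, k: k}
}

// hashes derives two base hashes from one FNV-1a pass (double hashing).
func hashes(key []byte) (uint64, uint64) {
	h := fnv.New64a()
	h.Write(key)
	h1 := h.Sum64()
	h2 := (h1>>33 | h1<<31) | 1 // rotated copy, forced odd so probes differ
	return h1, h2
}

// Add sets the k bits belonging to key.
func (b *bloom) Add(key []byte) {
	h1, h2 := hashes(key)
	for i := uint64(0); i < b.k; i++ {
		pos := (h1 + i*h2) % b.m
		b.bits[pos/64] |= 1 << (pos % 64)
	}
}

// MayContain returns false only if key was definitely never added.
func (b *bloom) MayContain(key []byte) bool {
	h1, h2 := hashes(key)
	for i := uint64(0); i < b.k; i++ {
		pos := (h1 + i*h2) % b.m
		if b.bits[pos/64]&(1<<(pos%64)) == 0 {
			return false
		}
	}
	return true
}

func main() {
	f := newBloom(1<<16, 4)
	disk := map[string]bool{"node-a": true, "node-b": true} // stand-in for LevelDB
	for k := range disk {
		f.Add([]byte(k))
	}
	diskReads := 0
	has := func(key string) bool {
		if !f.MayContain([]byte(key)) {
			return false // definitely absent: no disk access needed
		}
		diskReads++ // possibly present: fall through to the database
		return disk[key]
	}
	fmt.Println(has("node-a"), has("node-c"), diskReads)
}
```

A Bloom filter can report false positives but never false negatives, so a negative answer is always safe to trust; only positive answers need to be confirmed against the database.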

[TL;DR] Fast sync

We’ve run a fast sync benchmark on two i3.2xlarge AWS EC2 instances (8 core, 61 GiB RAM, 1.9 TiB NVMe SSD) with --cache=4096 --maxpeers=50 (defaults on v1.9.0) on the 25th of April.

Version | Sync time | Disk size | Disk reads | Disk writes
Geth v1.8.27 | 11h 20m | 176GiB | 1.58TiB | 1.94TiB
Geth v1.9.0 | 4h 8m | 131GiB | 0.91TiB | 1.06TiB
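
To put the fast sync figures in relative terms, the small sketch below derives the speedup and the savings from the numbers in the table. The helper names `speedup` and `saving` are made up for this example; only the input values come from the benchmark.

```go
package main

import (
	"fmt"
	"time"
)

// speedup returns how many times faster `after` is compared to `before`.
func speedup(before, after time.Duration) float64 {
	return float64(before) / float64(after)
}

// saving returns the relative reduction from before to after, in percent.
func saving(before, after float64) float64 {
	return (before - after) / before * 100
}

func main() {
	old, _ := time.ParseDuration("11h20m") // v1.8.27 sync time
	cur, _ := time.ParseDuration("4h8m")   // v1.9.0 sync time

	fmt.Printf("sync speedup:  %.2fx\n", speedup(old, cur))      // 2.74x
	fmt.Printf("disk size:    -%.1f%%\n", saving(176, 131))      // -25.6%
	fmt.Printf("disk reads:   -%.1f%%\n", saving(1.58, 0.91))
	fmt.Printf("disk writes:  -%.1f%%\n", saving(1.94, 1.06))
}
```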

[TL;DR] Full sync

We’ve run a full sync benchmark on two i3.2xlarge AWS EC2 instances (8 core, 61 GiB RAM, 1.9 TiB NVMe SSD) with --cache=4096 --maxpeers=50 --syncmode=full.

Version | Sync time | Disk size | Disk reads | Disk writes
Geth v1.8.27 | 6d 15h 30m | 341GiB | 28.9TiB | 21.8TiB
Geth v1.9.0 | 6d 8h 7m* | 303GiB | 40.2TiB* | 32.6TiB*

* Whilst the performance is similar, we’ve achieved it while reducing memory use by about one third and completely removing spurious memory peaks (Shanghai DoS). The higher disk IO is the result of using less memory for caching, which forces data to be pushed to disk more aggressively.

[TL;DR] Archive sync

We’ve run an archive sync benchmark on two m5.2xlarge AWS EC2 instances (8 core, 32 GiB RAM, 3TiB EBS SSD) with --cache=4096 --syncmode=full --gcmode=archive.

Version | Sync time | Disk size | Disk reads | Disk writes
Geth v1.8.27 | 62d 4h | 2.57TiB | 69.29TiB | 49.03TiB
Geth v1.9.0 | 13d 19h* | 2.32TiB | 104.73TiB | 91.4TiB

* EBS volumes are significantly slower than physical SSDs attached to the VM. Better performance can be achieved on VMs with real SSDs or actual physical hardware.

Freezer

Wouldn’t it be amazing if we didn’t have to waste so much precious space on our expensive and sensitive SSDs to run an Ethereum node, and could rather move at least some of the data onto a cheap and durable HDD?

With the v1.9.0 release, Geth separated its database into two parts (done by Péter Szilágyi, Martin Holst Swende, and Gary Rong):
– Recent blocks, all state and acceleration structures are kept in a fast key-value store (LevelDB), as before. This is meant to be run on top of an SSD, as disk IO performance is crucial.
– Blocks and receipts that are older than a cutoff threshold (3 epochs) are moved out of LevelDB into a custom freezer database that is backed by a handful of append-only flat files. Since the node rarely needs to read this data, and only ever appends to it, an HDD should be more than suitable to hold it.

A fresh fast sync at block 7.77M placed 79GB of data into the freezer and 60GB of data into LevelDB.
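
To illustrate why append-only flat files suit a slow disk, here is a minimal sketch of such a store: one data file that is only ever appended to, plus an index of end offsets so any item can be read back with a single positioned read. This is not Geth's freezer code; the `flatTable` type, the file name `bodies.dat`, and the in-memory index are assumptions for the example (a real store would persist the index and handle recovery).

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
)

// flatTable is a toy append-only store. Item i occupies the byte range
// [ends[i-1], ends[i]) of the data file (item 0 starts at offset 0).
type flatTable struct {
	f    *os.File
	ends []int64
}

// openTable creates the data file. The sketch assumes path is a fresh file.
func openTable(path string) (*flatTable, error) {
	f, err := os.OpenFile(path, os.O_RDWR|os.O_CREATE|os.O_APPEND, 0o644)
	if err != nil {
		return nil, err
	}
	return &flatTable{f: f}, nil
}

// Append writes blob at the end of the file and records where it ends.
func (t *flatTable) Append(blob []byte) error {
	if _, err := t.f.Write(blob); err != nil {
		return err
	}
	var start int64
	if n := len(t.ends); n > 0 {
		start = t.ends[n-1]
	}
	t.ends = append(t.ends, start+int64(len(blob)))
	return nil
}

// Get reads item i back using its recorded offsets.
func (t *flatTable) Get(i int) ([]byte, error) {
	if i < 0 || i >= len(t.ends) {
		return nil, fmt.Errorf("item %d out of range", i)
	}
	var start int64
	if i > 0 {
		start = t.ends[i-1]
	}
	buf := make([]byte, t.ends[i]-start)
	_, err := t.f.ReadAt(buf, start)
	return buf, err
}

// demo appends three placeholder blobs to a temporary table and reads one back.
func demo() (string, error) {
	dir, err := os.MkdirTemp("", "ancient")
	if err != nil {
		return "", err
	}
	defer os.RemoveAll(dir)

	t, err := openTable(filepath.Join(dir, "bodies.dat"))
	if err != nil {
		return "", err
	}
	defer t.f.Close()

	for _, b := range []string{"block-0", "block-1", "block-2"} {
		if err := t.Append([]byte(b)); err != nil {
			return "", err
		}
	}
	got, err := t.Get(1)
	return string(got), err
}

func main() {
	fmt.Println(demo())
}
```

Writes are strictly sequential and reads are one seek each, which is the access pattern an HDD handles best.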

Freezer basics

By default, Geth will place your freezer inside your chaindata folder, into the ancient subfolder. The reason for using a sub-folder was to avoid breaking any automated tooling that might be moving the database around or across instances. You can explicitly place the freezer in a different location via the --datadir.ancient CLI flag.

When you update to v1.9.0 from an older version, Geth will automatically begin migrating blocks and receipts from the LevelDB database into the freezer. If you haven’t specified --datadir.ancient at that time, but would like to move it later, you will need to copy the existing ancient folder manually and then start Geth with --datadir.ancient set to the correct path.
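
For tooling that launches Geth, the split layout could be wired up roughly as follows. This is only a sketch: the helper `gethArgs` and both paths are hypothetical, and the program merely prints the command line rather than starting a node. The `--datadir` and `--datadir.ancient` flags are the ones described above.

```go
package main

import (
	"fmt"
	"strings"
)

// gethArgs assembles the flags for a node whose hot data lives in datadir
// and whose freezer lives in ancient. An empty ancient keeps the default
// location (the ancient subfolder inside chaindata).
func gethArgs(datadir, ancient string) []string {
	args := []string{"--datadir", datadir}
	if ancient != "" {
		args = append(args, "--datadir.ancient", ancient)
	}
	return args
}

func main() {
	// Hypothetical mount points: SSD for LevelDB, HDD for the freezer.
	args := gethArgs("/mnt/ssd/ethereum", "/mnt/hdd/ancient")
	fmt.Println("geth " + strings.Join(args, " "))
}
```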

Freezer tricks

Since the freezer (cold data) is stored separately from the state (hot data), an interesting question is what happens if one of the two databases goes missing?

If the freezer is deleted (or a wrong path specified), you essentially pull the rug from underneath Geth. The node would become unusable, so it explicitly forbids doing this on startup.

If, however, the state database is the one deleted, Geth will reconstruct all its indices based on the frozen data, and then do a fast sync on top to back-fill the missing state. Essentially, the freezer can be used as a guerrilla state pruner to periodically get rid of accumulated junk. By removing the state database, but not the freezer, the node will do a fast sync to fetch the latest state, but will reuse all the block and receipt data that was downloaded previously. You can trigger this via geth removedb (plus the --datadir and --datadir.ancient flags if you used custom ones), asking it to only remove the state database, but not the ancient database.

Be advised that reindexing all the transactions from the ancient database can take over an hour, and fast sync will only commence afterwards. This will probably be changed into a background process in the near future.

GraphQL

Who doesn’t just love JSON-RPC? Me! As its name suggests, JSON-RPC is a Remote Procedure Call protocol. Its design goal is to permit calling functions that do some arbitrary computation on the remote side, after which they return the result of said computation.

Of course – the protocol being generic – you can run data queries on top, but there’s no standardized query semantic, so people tend to roll their own. Without support for flexible queries, however, we end up wasting both computational and data transfer resources:

– RPC calls that return a lot of data (e.g. eth_getBlock) waste bandwidth if the user is only interested in a handful of fields (e.g. only the header, or even less, only the miner’s address).
– RPC calls that return only a bit of data (e.g. eth_getTransactionReceipt) waste CPU capacity if the user is forced to repeat the call multiple times (e.g. retrieving all receipts one-by-one results in loading all of them from disk for each call).
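
The first point can be made concrete with a short sketch. It builds a standard `eth_getBlockByNumber` request (the text's `eth_getBlock` is shorthand) and then extracts just the miner address from a response. The response shown is a made-up, heavily shortened placeholder, and the helper names `blockRequest` and `minerOf` are assumptions for this example; no network call is made.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// rpcRequest is the JSON-RPC 2.0 request envelope.
type rpcRequest struct {
	JSONRPC string        `json:"jsonrpc"`
	ID      int           `json:"id"`
	Method  string        `json:"method"`
	Params  []interface{} `json:"params"`
}

// blockRequest builds an eth_getBlockByNumber call. The second parameter
// selects full transaction objects (true) or only their hashes (false).
func blockRequest(number string, fullTxs bool) ([]byte, error) {
	return json.Marshal(rpcRequest{
		JSONRPC: "2.0",
		ID:      1,
		Method:  "eth_getBlockByNumber",
		Params:  []interface{}{number, fullTxs},
	})
}

// minerOf decodes a response but keeps only the miner field. Every other
// field in the payload was still sent over the wire and parsed.
func minerOf(resp []byte) (string, error) {
	var out struct {
		Result struct {
			Miner string `json:"miner"`
		} `json:"result"`
	}
	err := json.Unmarshal(resp, &out)
	return out.Result.Miner, err
}

func main() {
	req, _ := blockRequest("latest", false)
	fmt.Println(string(req))

	// Placeholder response for illustration only; a real block carries
	// many more fields (hashes, bloom, transactions, uncles, ...).
	resp := []byte(`{"jsonrpc":"2.0","id":1,"result":{"number":"0x1","miner":"0x0000000000000000000000000000000000000000","transactions":[]}}`)
	miner, err := minerOf(resp)
	fmt.Println(miner, err, "bytes received:", len(resp))
}
```

The caller has no way to ask the server for the miner alone, so the full block is always transferred even when a single field is needed.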

In the case of Ethereum’s JSON-RPC API, the above issues get exacerbated by the mini-reorg nature of the blockchain, as doing multiple queries (e.g. eth_getBalance) needs to actually ensure that they execute against the same state and even against the same node (e.g. load-balanced backends might have slight sync delays, so can serve different content).

Yes, we could invent a new, super optimal query mechanism that would permit us to retrieve only the data we need, whilst minimizing computational and data transfer overhead… or we could also not reinvent the wheel (again
