Ethereum mainstream client: Geth overall architecture

This article is the first in the Geth source code series. The series builds a framework for studying Geth's implementation so that developers can dig deeper into the parts that interest them. There are six articles in total. This first article covers the design architecture of the execution layer client Geth and the startup process of a Geth Node. Geth's code changes quickly, so the code you read later may differ from what is shown here, but the overall design stays largely the same, and newer code can be read with the same approach.

01\Ethereum Client

Before the Merge upgrade, an Ethereum node ran a single client, which both executed transactions and handled blockchain consensus, ensuring that new blocks were produced in a defined order. After the Merge, the client was split into an execution layer and a consensus layer: the execution layer executes transactions and maintains state and data, while the consensus layer implements consensus. The two layers communicate via APIs. Each layer has its own specification, and clients can be implemented in different languages as long as they adhere to the corresponding specification. Geth is one implementation of the execution layer client. The current mainstream implementations of execution layer and consensus layer clients are as follows:

Execution Layer

  • Geth: Maintained by a team directly funded by the Ethereum Foundation and developed using the Go language, it is recognized as the most stable and time-tested client.
  • Nethermind: Developed and maintained by the Nethermind team, built using C#, initially funded by the Ethereum Foundation and the Gitcoin community.
  • Besu: Originally developed by the PegaSys team at ConsenSys, now a Hyperledger community project, developed in Java.
  • Erigon: Developed and maintained by the Erigon team, funded by the Ethereum Foundation and BNB Chain. Forked from Geth in 2017, its goal is to improve synchronization speed and disk efficiency.
  • Reth: Developed primarily by Paradigm, the programming language used is Rust, emphasizing modularity and high performance. It is now nearing maturity and can be used in production environments.

Consensus Layer

  • Prysm: Maintained by Prysmatic Labs, it is one of the earliest consensus layer clients for Ethereum. It is developed in Go, focuses on usability and security, and received early funding from the Ethereum Foundation.
  • Lighthouse: Maintained by the Sigma Prime team, developed in Rust, focuses on high performance and enterprise-level security, suitable for high-load scenarios.
  • Teku: Originally developed by ConsenSys's PegaSys team, later became part of the Hyperledger Besu community, developed using the Java language.
  • Nimbus: Developed and maintained by the Status Network team, built using the Nim language, optimized for resource-constrained devices (such as mobile phones and IoT devices), with the goal of achieving lightweight operation in embedded systems.

02\Introduction to the Execution Layer

The Ethereum execution layer can be seen as a transaction-driven state machine, with its most basic function being to update state data by executing transactions through the EVM. In addition to transaction execution, it also has functions such as saving and validating blocks and state data, running a p2p network, and maintaining a transaction pool.

Transactions are generated by users (or programs) in a format defined by the Ethereum execution layer specification. Users need to sign the transaction, and if the transaction is valid (Nonce is consecutive, signature is correct, gas fee is sufficient, and business logic is correct), then the transaction will ultimately be executed by the EVM, thereby updating the state of the Ethereum network. Here, the state refers to a collection of data structures, data, and databases, including external account addresses, contract addresses, address balances, as well as code and data.

The execution layer is responsible for executing transactions and maintaining the state after transaction execution, while the consensus layer is responsible for selecting which transactions to execute. The EVM is the state transition function within this state machine, and the function's inputs can come from multiple sources, which may include the latest block information provided by the consensus layer or blocks downloaded from the p2p network.
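The idea of a transaction-driven state machine can be illustrated with a minimal sketch. Everything in it is a simplification invented for this example: the `Account`, `Tx`, and `State` types are hypothetical, signatures and gas metering are left out, and the fee is simply deducted. It only shows the shape of the transition: validate the transaction against the current state, then produce the next state.

```go
package main

import (
	"errors"
	"fmt"
)

// Account is a simplified account. Real accounts also carry code and storage.
type Account struct {
	Nonce   uint64
	Balance uint64
}

// Tx is a hypothetical, heavily simplified value transfer.
type Tx struct {
	From, To string
	Nonce    uint64
	Value    uint64
	Fee      uint64
}

// State maps an address to its account.
type State map[string]Account

var (
	ErrBadNonce     = errors.New("nonce is not the next expected value")
	ErrInsufficient = errors.New("balance cannot cover value plus fee")
)

// ApplyTx validates tx against s and, if it is valid, mutates s in place.
// An invalid transaction leaves the state untouched.
// Overflow of Value+Fee is ignored in this sketch.
func ApplyTx(s State, tx Tx) error {
	sender := s[tx.From]
	if tx.Nonce != sender.Nonce {
		return ErrBadNonce
	}
	if sender.Balance < tx.Value+tx.Fee {
		return ErrInsufficient
	}
	sender.Nonce++
	sender.Balance -= tx.Value + tx.Fee
	s[tx.From] = sender

	recipient := s[tx.To]
	recipient.Balance += tx.Value
	s[tx.To] = recipient
	return nil
}

func main() {
	s := State{"alice": {Nonce: 0, Balance: 100}}
	err := ApplyTx(s, Tx{From: "alice", To: "bob", Nonce: 0, Value: 30, Fee: 1})
	fmt.Println(err, s["alice"], s["bob"])
}
```

Replaying the same transaction a second time fails the nonce check, which is how the state machine rejects duplicates.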

The consensus layer and the execution layer communicate through the Engine API, which is the only communication method between the execution layer and the consensus layer. If the consensus layer obtains the block production rights, it will use the Engine API to allow the execution layer to produce new blocks. If it does not obtain the block production rights, it will synchronize the latest blocks for the execution layer to verify and execute, thereby maintaining consensus with the entire Ethereum network.
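As a rough illustration of what communicating through the Engine API looks like on the wire, the sketch below builds a JSON-RPC request body for `engine_forkchoiceUpdatedV3`, one of the methods in the Engine API specification. The Go type names, the helper functions, and the all-zero hashes are placeholders invented for this example. The request is only built, not sent, and a real consensus client also authenticates each call with a JWT.

```go
package main

import (
	"encoding/json"
	"fmt"
	"strings"
)

// rpcRequest is a generic JSON-RPC 2.0 envelope (hypothetical helper type).
type rpcRequest struct {
	JSONRPC string `json:"jsonrpc"`
	ID      int    `json:"id"`
	Method  string `json:"method"`
	Params  []any  `json:"params"`
}

// forkchoiceState carries the three block hashes that the consensus layer
// reports to the execution layer.
type forkchoiceState struct {
	HeadBlockHash      string `json:"headBlockHash"`
	SafeBlockHash      string `json:"safeBlockHash"`
	FinalizedBlockHash string `json:"finalizedBlockHash"`
}

// BuildForkchoiceUpdated returns the request body. The second parameter
// (payload attributes) is null, which asks the execution layer to update
// its head without starting to build a new block.
func BuildForkchoiceUpdated(id int, fc forkchoiceState) ([]byte, error) {
	return json.Marshal(rpcRequest{
		JSONRPC: "2.0",
		ID:      id,
		Method:  "engine_forkchoiceUpdatedV3",
		Params:  []any{fc, nil},
	})
}

// MethodOf extracts the method name from a request body.
func MethodOf(body []byte) (string, error) {
	var req rpcRequest
	if err := json.Unmarshal(body, &req); err != nil {
		return "", err
	}
	return req.Method, nil
}

func main() {
	zero := "0x" + strings.Repeat("00", 32) // placeholder hash, not a real block
	body, err := BuildForkchoiceUpdated(1, forkchoiceState{
		HeadBlockHash:      zero,
		SafeBlockHash:      zero,
		FinalizedBlockHash: zero,
	})
	if err != nil {
		panic(err)
	}
	fmt.Println(string(body))
}
```

When the node has the right to produce a block, the same method is called with non-null payload attributes, which is what tells the execution layer to start building.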

The execution layer can logically be divided into 6 parts:

  • EVM: Responsible for executing transactions, and transaction execution is the only way to modify the state.
  • Storage: Responsible for the storage of state and block data.
  • Transaction pool: Temporarily stores user-submitted transactions and propagates them between nodes through the p2p network.
  • p2p network: Used for discovering nodes, synchronizing transactions, downloading blocks, etc.
  • RPC Service: Provides access to the node, for example for users sending transactions to the node and for interactions between the consensus layer and the execution layer.
  • BlockChain: Responsible for managing Ethereum's blockchain data.

The diagram below shows the key processes of the execution layer and the functions of each part:


For the execution layer (here we only discuss Full Node for now), there are three key processes:

  • If it is a new Node joining Ethereum, it needs to synchronize block and state data from other Nodes through the p2p network. If it's Full Sync, it will start downloading blocks one by one from the genesis block, verifying the blocks and rebuilding the state database through the EVM. If it's Snap Sync, it skips the entire block verification process and directly downloads the state data of the latest checkpoint and subsequent block data.
  • If it is a Node that has been synchronized to the latest state, it will continuously obtain the currently latest produced blocks from the consensus layer through the Engine API, verify the blocks, then execute all the transactions in the block via the EVM to update the state database, and write the blocks to the local chain.
  • If the node that has synchronized to the latest state has obtained the block production rights from the consensus layer, it will drive the execution layer to produce the latest block through the Engine API. The execution layer retrieves transactions from the transaction pool and executes them, then assembles them into a block and transmits it to the consensus layer through the Engine API, where the consensus layer broadcasts the block to the consensus layer p2p network.
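The block production step in the last item can be sketched as a simple selection loop over the transaction pool. This is a toy model under stated assumptions: `PoolTx` and `BuildBlock` are hypothetical names, ordering is by tip only, and the per-account nonce ordering that a real client must respect is ignored.

```go
package main

import (
	"fmt"
	"sort"
)

// PoolTx is a hypothetical pending transaction as seen by the block builder.
type PoolTx struct {
	Hash     string
	GasLimit uint64
	Tip      uint64 // priority fee per gas offered to the block producer
}

// BuildBlock picks transactions in descending tip order until no further
// transaction fits under the block gas limit. The pool itself is not modified.
func BuildBlock(pool []PoolTx, blockGasLimit uint64) []PoolTx {
	candidates := append([]PoolTx(nil), pool...)
	sort.SliceStable(candidates, func(i, j int) bool {
		return candidates[i].Tip > candidates[j].Tip
	})

	var block []PoolTx
	var used uint64
	for _, tx := range candidates {
		if tx.GasLimit > blockGasLimit-used {
			continue // does not fit, but a smaller transaction still might
		}
		block = append(block, tx)
		used += tx.GasLimit
	}
	return block
}

func main() {
	pool := []PoolTx{
		{Hash: "a", GasLimit: 21000, Tip: 2},
		{Hash: "b", GasLimit: 50000, Tip: 5},
		{Hash: "c", GasLimit: 21000, Tip: 1},
	}
	for _, tx := range BuildBlock(pool, 80000) {
		fmt.Println(tx.Hash, tx.GasLimit, tx.Tip)
	}
}
```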

03\Source Code Structure

The go-ethereum codebase is large, but much of it is auxiliary code and unit tests. When studying the Geth source code, focus on the core implementation of the protocol. The functions of the various modules are listed below; pay special attention to core, eth, ethdb, node, p2p, rlp, and trie & triedb.

  • accounts: Manage Ethereum accounts, including the generation of public and private key pairs, signature verification, address derivation, etc.
  • beacon: Handles the interaction logic with the Ethereum Beacon Chain, supporting the functionality after the Merge of the Proof of Stake (PoS) consensus.
  • build: Build scripts and compilation configurations (such as Dockerfile, cross-platform compilation support)
  • cmd: Command line tool entry, containing multiple subcommands
  • common: General utility classes, such as byte processing, address format conversion, mathematical functions
  • consensus: Defines the consensus engines, including the former proof of work (Ethash) and proof of authority (Clique) engines as well as the Beacon engine.
  • console: Provides an interactive JavaScript console that allows users to interact directly with the Ethereum Node via the command line (such as calling Web3 APIs, managing accounts, querying blockchain data)
  • core: The core logic of the blockchain, handling the lifecycle management of blocks/transactions, state machines, Gas calculations, etc.
  • crypto: Implementation of cryptographic algorithms, including elliptic curve (secp256k1), hash (Keccak-256), signature verification
  • docs: Documentation (such as design specifications, API documentation)
  • eth: The complete implementation of the Ethereum network protocol, including node services, block synchronization (such as full sync and snap sync), transaction broadcasting, etc.
  • ethclient: Implements an Ethereum client library that encapsulates the JSON-RPC interface for Go developers to interact with Ethereum nodes (such as querying blocks, sending transactions, deploying contracts).
  • ethdb: Database abstraction layer that supports LevelDB, Pebble, in-memory databases, etc., for storing blockchain data (blocks, states, transactions)
  • ethstats: Collects and reports the Node's operating status to the statistics service for monitoring network health.
  • event: Implement an event subscription and publishing mechanism to support asynchronous communication between internal modules of the Node (e.g., new block arrival, transaction pool updates)
  • graphql: Provides a GraphQL interface that supports complex queries (replaces some JSON-RPC functionality)
  • internal: Internal tools or code that restricts external access.
  • log: Logging system that supports hierarchical log output and context log recording.
  • metrics: Performance metrics collection (Prometheus support)
  • miner: Block-building logic that selects transactions and assembles new blocks (this was the mining module under PoW)
  • node: Node service management, integrating the startup and configuration of modules such as p2p, RPC, and databases.
  • p2p: Peer-to-peer network protocol implementation, supports Node discovery, data transmission, and encrypted communication.
  • params: Define Ethereum network parameters (mainnet, testnet, genesis block configuration)
  • rlp: Implement the Ethereum-specific data serialization protocol RLP (Recursive Length Prefix) for encoding/decoding data structures such as blocks and transactions.
  • rpc: Implement JSON-RPC and IPC interfaces for external programs to interact with the Node.
  • signer: Transaction signature management (hardware wallet integration)
  • tests: integration tests and state tests, verifying protocol compatibility
  • trie & triedb: Implementation of the Merkle Patricia Trie for efficient storage and management of account states and contract storage.
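To make the rlp entry in the list above more concrete, here is a minimal sketch of RLP encoding for byte strings and lists, following the rules of the RLP specification. It is not Geth's rlp package: the function names are invented for this example, and decoding, integers, and struct reflection are left out.

```go
package main

import "fmt"

// encodeLength builds the RLP prefix for a payload of the given size.
// offset is 0x80 for byte strings and 0xc0 for lists.
func encodeLength(size int, offset byte) []byte {
	if size <= 55 {
		return []byte{offset + byte(size)}
	}
	// Long payload: the prefix encodes how many bytes the length occupies,
	// followed by the length itself in big-endian form.
	var lenBytes []byte
	for n := size; n > 0; n >>= 8 {
		lenBytes = append([]byte{byte(n)}, lenBytes...)
	}
	return append([]byte{offset + 55 + byte(len(lenBytes))}, lenBytes...)
}

// EncodeBytes encodes a byte string. A single byte below 0x80 is its own encoding.
func EncodeBytes(b []byte) []byte {
	if len(b) == 1 && b[0] < 0x80 {
		return []byte{b[0]}
	}
	return append(encodeLength(len(b), 0x80), b...)
}

// EncodeList wraps already-encoded items into an RLP list.
func EncodeList(items ...[]byte) []byte {
	var payload []byte
	for _, item := range items {
		payload = append(payload, item...)
	}
	return append(encodeLength(len(payload), 0xc0), payload...)
}

func main() {
	dog := EncodeBytes([]byte("dog"))
	cat := EncodeBytes([]byte("cat"))
	fmt.Printf("%x\n", dog)                  // 83646f67
	fmt.Printf("%x\n", EncodeList(cat, dog)) // c88363617483646f67
}
```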

04\Execution Layer Module Division

There are two forms of external access to a Geth Node: one is through RPC, and the other is through the Console. RPC is suitable for external users, while the Console is suitable for the node's administrators. However, whether through RPC or the Console, both use the capabilities that are already encapsulated internally, which are built in a layered manner.

The outermost layer is the set of APIs through which the node is accessed from outside: the Engine API for communication between the execution layer and the consensus layer, the Eth API for external users or programs to send transactions and query block information, the Net API for querying the state of the p2p network, and so on. For example, a transaction that a user sends through the API is eventually submitted to the transaction pool and managed there.

The layer below the APIs implements the core functions, including the transaction pool, transaction packaging, block production, and block and state synchronization. These functions rely on lower-level capabilities. Synchronizing the transaction pool, blocks, and state relies on the p2p network. Both locally produced blocks and blocks synchronized from other nodes must be validated before they can be written to the local database, which relies on the EVM and data storage.


Core Data Structure of Execution Layer

Ethereum

The Ethereum structure in eth/backend.go is an abstraction of the entire Ethereum protocol, essentially including the main components within Ethereum, but the EVM is an exception as it is instantiated during each transaction processing and does not need to be initialized with the entire Node. The Ethereum mentioned below refers to this structure:

type Ethereum struct {
    // Ethereum config, including the chain config
    config *ethconfig.Config
    // Transaction pool; transactions submitted by users go here
    txPool *txpool.TxPool
    // Tracks and manages local transactions
    localTxTracker *locals.TxTracker
    // Blockchain structure
    blockchain *core.BlockChain
    // Core component of the node's network layer, responsible for all communication
    // with other nodes, including block synchronization, transaction broadcasting
    // and receiving, and peer connection management
    handler *handler
    // Responsible for node discovery and node source management
    discmix *enode.FairMix
    // Responsible for the persistent storage of blockchain data
    chainDb ethdb.Database
    // Handles publishing of and subscribing to various internal events
    eventMux *event.TypeMux
    // Consensus engine
    engine consensus.Engine
    // Manages user accounts and keys
    accountManager *accounts.Manager
    // Manages log filters and block filtering
    filterMaps *filtermaps.FilterMaps
    // Channel for safely shutting down filterMaps, ensuring that resources are
    // cleaned up correctly when the node shuts down
    closeFilterMaps chan chan struct{}
    // Provides backend support for the RPC API
    APIBackend *EthAPIBackend
    // Under PoS, works with the consensus engine to validate blocks
    miner *miner.Miner
    // The lowest gas price accepted by the node
    gasPrice *big.Int
    // Network ID
    networkID uint64
    // Provides network-related RPC services, allowing network status to be queried through RPC
    netRPCService *ethapi.NetAPI
    // Manages P2P network connections, handles node discovery and connection
    // establishment, and provides the underlying network transport
    p2pServer *p2p.Server
    // Protects concurrent access to mutable fields
    lock sync.RWMutex
    // Tracks whether the node shut down gracefully and helps recovery after an abnormal shutdown
    shutdownTracker *shutdowncheck.ShutdownTracker
}

Node

In node/node.go, Node is another core data structure. It acts as a container that manages and coordinates the operation of the various services. In the structure below, pay attention to the lifecycles field: Lifecycle is used to manage the lifecycle of internal functionality. For example, the Ethereum abstraction above relies on the Node to start it and is registered in lifecycles. This separates specific functionality from the node abstraction, which makes the whole architecture easier to extend. This Node must be distinguished from the Node in devp2p.

type Node struct {
    eventmux *event.TypeMux
    config *Config
    // Account manager, responsible for managing wallets and accounts
    accman *accounts.Manager
    log log.Logger
    keyDir string
    keyDirTemp bool
    dirLock *flock.Flock
    stop chan struct{}
    // p2p network instance
    server *p2p.Server
    startStopLock sync.Mutex
    // Tracks the node lifecycle state (initializing, running, shut down)
    state int
    lock sync.Mutex
    // All registered backends, services, and auxiliary services
    lifecycles []Lifecycle
    // Current API list
    rpcAPIs []rpc.API
    // Different access methods provided for RPC
    http *httpServer
    ws *httpServer
    httpAuth *httpServer
    wsAuth *httpServer
    ipc *ipcServer
    inprocHandler *rpc.Server
    databases map[*closeTrackingDB]struct{}
}
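The container-plus-lifecycle pattern described above can be sketched in a few lines. The `Lifecycle` interface here has the same two-method shape the article refers to, but `Container`, `service`, and the rollback behavior are assumptions made for this illustration and are not taken from the Geth source.

```go
package main

import (
	"errors"
	"fmt"
)

// Lifecycle is anything the container can start and stop.
type Lifecycle interface {
	Start() error
	Stop() error
}

// Container is a hypothetical stand-in for the Node container.
type Container struct {
	lifecycles []Lifecycle
}

// RegisterLifecycle adds a service. Registration order is start order.
func (c *Container) RegisterLifecycle(l Lifecycle) {
	c.lifecycles = append(c.lifecycles, l)
}

// Start starts all services in order. If one fails, the services that were
// already started are stopped again in reverse order.
func (c *Container) Start() error {
	for i, l := range c.lifecycles {
		if err := l.Start(); err != nil {
			for j := i - 1; j >= 0; j-- {
				_ = c.lifecycles[j].Stop()
			}
			return err
		}
	}
	return nil
}

// Stop stops all services in reverse order and returns the first error seen.
func (c *Container) Stop() error {
	var first error
	for i := len(c.lifecycles) - 1; i >= 0; i-- {
		if err := c.lifecycles[i].Stop(); err != nil && first == nil {
			first = err
		}
	}
	return first
}

// service is a toy Lifecycle that records what happened to it.
type service struct {
	name string
	fail bool
	log  *[]string
}

func (s *service) Start() error {
	if s.fail {
		return errors.New(s.name + ": start failed")
	}
	*s.log = append(*s.log, "start "+s.name)
	return nil
}

func (s *service) Stop() error {
	*s.log = append(*s.log, "stop "+s.name)
	return nil
}

func main() {
	var events []string
	c := &Container{}
	c.RegisterLifecycle(&service{name: "eth", log: &events})
	c.RegisterLifecycle(&service{name: "engine", log: &events})
	fmt.Println(c.Start(), c.Stop(), events)
}
```

The container never needs to know what a service does, only that it can be started and stopped, which is what keeps the node abstraction separate from specific functionality.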

If we look at the execution layer of Ethereum from an abstract dimension, Ethereum as a world computer needs to include three parts: network, computation, and storage. The components corresponding to these three parts in the Ethereum execution layer are:

  • Network: devp2p
  • Calculation: EVM
  • Storage: ethdb

devp2p

Ethereum is essentially a distributed system where each node connects to other nodes via a p2p network. The implementation of the p2p network protocol in Ethereum is devp2p.

devp2p has two core functions: one is node discovery, which allows nodes to establish connections with other nodes when joining the network; the other is data transmission service, which enables data exchange after establishing connections with other nodes.

The Node structure in p2p/enode/node.go represents a node in the p2p network. Its enr.Record field stores detailed information about the node as key-value pairs, including identity information (the signature algorithm and public key used for the node's identity), network information (IP address, port number), supported protocol information (such as support for the eth/68 and snap protocols), and other custom information. The record is encoded in RLP format, and the specification is defined in EIP-778:

type Node struct {
    // Node record, containing various properties of the node
    r enr.Record
    // Unique identifier of the node, 32 bytes long
    id ID
    // hostname tracks the DNS name of the node
    hostname string
    // IP address of the node
    ip netip.Addr
    // UDP port
    udp uint16
    // TCP port
    tcp uint16
}

// enr.Record
type Record struct {
    // Sequence number
    seq uint64
    // Signature
    signature []byte
    // RLP encoded record
    raw []byte
    // Sorted list of all key-value pairs
    pairs []pair
}

The Table structure in p2p/discover/table.go is the core data structure for implementing the node discovery protocol in devp2p. It implements a distributed hash table similar to Kademlia to maintain and manage node information in the network.

type Table struct {
    mutex sync.Mutex
    // Index of known nodes by distance
    buckets [nBuckets]*bucket
    // Bootstrap nodes
    nursery []*enode.Node
    rand reseedingRandom
    ips netutil.DistinctNetSet
    revalidation tableRevalidation
    // Database of known nodes
    db *enode.DB
    net transport
    cfg Config
    log log.Logger
    // Periodically processes various events in the network
    refreshReq chan chan struct{}
    revalResponseCh chan revalidationResponse
    addNodeCh chan addNodeOp
    addNodeHandled chan bool
    trackRequestCh chan trackRequestOp
    initDone chan struct{}
    closeReq chan struct{}
    closed chan struct{}
    // Hooks called when nodes are added or removed
    nodeAddedHook func(*bucket, *tableNode)
    nodeRemovedHook func(*bucket, *tableNode)
}
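In a Kademlia-style table, "distance" between two nodes is derived from the XOR of their IDs, and nodes are grouped into buckets by the logarithm of that distance. The sketch below computes this logarithmic distance for 32-byte IDs. The `ID` type and `LogDist` function are written for this example, and how Geth maps a distance to a bucket index is not shown.

```go
package main

import (
	"fmt"
	"math/bits"
)

// ID is a 32-byte node identifier.
type ID [32]byte

// LogDist returns the bit length of a XOR b: 0 when the IDs are equal and
// 256 when they already differ in the most significant bit.
func LogDist(a, b ID) int {
	leadingZeros := 0
	for i := range a {
		x := a[i] ^ b[i]
		if x == 0 {
			leadingZeros += 8
			continue
		}
		leadingZeros += bits.LeadingZeros8(x)
		break
	}
	return len(a)*8 - leadingZeros
}

func main() {
	var self, near, far ID
	near[31] = 0x01 // differs from self only in the lowest bit
	far[0] = 0x80   // differs from self in the highest bit
	fmt.Println(LogDist(self, self), LogDist(self, near), LogDist(self, far))
}
```

Because the distance is logarithmic, half of all possible IDs fall at distance 256 from any given node, which is why the table keeps a bucket per distance instead of a flat list.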

ethdb

ethdb abstracts Ethereum data storage, providing a unified storage interface. The underlying database can be leveldb, pebble, or other databases. There can be many extensions as long as the interface layer remains consistent.

Some data (such as block data) can be read and written directly to the underlying database through the ethdb interface, while other data storage interfaces are built on top of ethdb. For example, a large portion of the data in the database is state data, which is organized into an MPT structure. The corresponding implementation in Geth is called a trie. During the operation of the node, the trie data will produce many intermediate states. These data cannot be directly accessed for reading and writing through ethdb; instead, triedb is needed to manage these data and intermediate states, which are ultimately persisted through ethdb.

The interface defining the read and write capabilities of the underlying database is specified in ethdb/database.go, but it does not include a concrete implementation; the actual implementation will be provided by different databases themselves, such as leveldb or pebble. In the Database, two layers of data read and write interfaces are defined, where the KeyValueStore interface is used to store active, frequently changing data, such as the latest blocks, states, etc. The AncientStore is used to handle historical block data, which rarely changes once written.

type Database interface {
    KeyValueStore
    AncientStore
}

type KeyValueStore interface {
    KeyValueReader
    KeyValueWriter
    KeyValueStater
    KeyValueRangeDeleter
    Batcher
    Iteratee
    Compacter
    io.Closer
}

type AncientStore interface {
    AncientReader
    AncientWriter
    AncientStater
    io.Closer
}
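The point of defining storage as interfaces is that callers never depend on a concrete database. As a sketch of that idea, the code below declares a reduced reader/writer interface and backs it with an in-memory map. Only the four basic methods are modeled; `MemDB`, `NewMemDB`, and `ErrNotFound` are names chosen for this example, and batching, iteration, compaction, and the ancient store are omitted.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

// KeyValueReader and KeyValueWriter are a reduced version of the read and
// write capabilities a key-value backend has to provide.
type KeyValueReader interface {
	Has(key []byte) (bool, error)
	Get(key []byte) ([]byte, error)
}

type KeyValueWriter interface {
	Put(key, value []byte) error
	Delete(key []byte) error
}

// KeyValueStore is what higher layers program against.
type KeyValueStore interface {
	KeyValueReader
	KeyValueWriter
}

// ErrNotFound is returned by Get when the key does not exist.
var ErrNotFound = errors.New("not found")

// MemDB is a toy in-memory backend guarded by a read-write lock.
type MemDB struct {
	mu   sync.RWMutex
	data map[string][]byte
}

func NewMemDB() *MemDB { return &MemDB{data: make(map[string][]byte)} }

func (db *MemDB) Has(key []byte) (bool, error) {
	db.mu.RLock()
	defer db.mu.RUnlock()
	_, ok := db.data[string(key)]
	return ok, nil
}

func (db *MemDB) Get(key []byte) ([]byte, error) {
	db.mu.RLock()
	defer db.mu.RUnlock()
	v, ok := db.data[string(key)]
	if !ok {
		return nil, ErrNotFound
	}
	return append([]byte(nil), v...), nil // copy so callers cannot mutate the store
}

func (db *MemDB) Put(key, value []byte) error {
	db.mu.Lock()
	defer db.mu.Unlock()
	db.data[string(key)] = append([]byte(nil), value...)
	return nil
}

func (db *MemDB) Delete(key []byte) error {
	db.mu.Lock()
	defer db.mu.Unlock()
	delete(db.data, string(key))
	return nil
}

func main() {
	var db KeyValueStore = NewMemDB()
	_ = db.Put([]byte("head"), []byte("block-1"))
	v, err := db.Get([]byte("head"))
	fmt.Println(string(v), err)
}
```

Swapping `NewMemDB()` for a disk-backed implementation would not change any code that only holds a `KeyValueStore`.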

EVM

EVM is the state transition function of the Ethereum state machine, and all updates to state data can only be performed through the EVM. The p2p network can receive transaction and block information, which will become part of the state database after being processed by the EVM. The EVM abstracts away the differences in underlying hardware, allowing programs to execute on different platforms' EVMs and obtain consistent results. This is a very mature design approach, similar to the JVM in the Java language.

The implementation of the EVM has three main components. The EVM structure in core/vm/evm.go defines the overall structure and dependencies of the EVM, including the execution context and the state database. The EVMInterpreter structure in core/vm/interpreter.go implements the interpreter, which executes EVM bytecode. The Contract structure in core/vm/contract.go encapsulates the parameters of a contract call, including the caller, contract code, and input. The currently supported opcodes are defined in core/vm/opcodes.go:

type EVM struct {
    // Block context, containing block-related information
    Context BlockContext
    // Transaction context, containing transaction-related information
    TxContext
    // State database, used to access and modify account state
    StateDB StateDB
    // Current call depth
    depth int
    // Chain configuration parameters
    chainConfig *params.ChainConfig
    chainRules params.Rules
    // EVM configuration
    Config Config
    // Bytecode interpreter
    interpreter *EVMInterpreter
    // Flag for aborting execution
    abort atomic.Bool
    callGasTemp uint64
    // Precompiled contracts
    precompiles map[common.Address]PrecompiledContract
    jumpDests map[common.Hash]bitvec
}

type EVMInterpreter struct {
    // Points to the EVM instance this interpreter belongs to
    evm *EVM
    // Opcode jump table
    table *JumpTable
    // Keccak256 hasher instance, shared between opcodes
    hasher crypto.KeccakState
    // Keccak256 hash result buffer
    hasherBuf common.Hash
    // Whether execution is read-only; state modification is not allowed in read-only mode
    readOnly bool
    // Return data of the last CALL, kept for subsequent reuse
    returnData []byte
}

type Contract struct {
    // Caller's address
    caller common.Address
    // Contract address
    address common.Address
    jumpdests map[common.Hash]bitvec
    analysis bitvec
    // Contract bytecode
    Code []byte
    // Code hash
    CodeHash common.Hash
    // Call input
    Input []byte
    // Whether this is a contract deployment
    IsDeployment bool
    // Whether this is a system call
    IsSystemCall bool
    // Available gas
    Gas uint64
    // The amount of ETH attached to the call
    value *uint256.Int
}
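The relationship between an interpreter, a jump table, and bytecode can be shown with a very small stack machine. This is a teaching sketch, not the Geth interpreter: it supports four opcodes (using their real byte values), uses 64-bit words instead of 256-bit words, and ignores gas, memory, storage, and jumps. The `machine` type and `Run` function are invented for this example.

```go
package main

import (
	"errors"
	"fmt"
)

// Opcode byte values as assigned in the EVM instruction set.
const (
	STOP  byte = 0x00
	ADD   byte = 0x01
	MUL   byte = 0x02
	PUSH1 byte = 0x60
)

// machine holds the state of one execution: code, program counter, stack.
type machine struct {
	code  []byte
	pc    int
	stack []uint64 // simplification: the real EVM uses 256-bit words
}

var errStackUnderflow = errors.New("stack underflow")

// pop2 removes and returns the two topmost stack items.
func (m *machine) pop2() (uint64, uint64, error) {
	n := len(m.stack)
	if n < 2 {
		return 0, 0, errStackUnderflow
	}
	a, b := m.stack[n-1], m.stack[n-2]
	m.stack = m.stack[:n-2]
	return a, b, nil
}

// operation implements a single opcode.
type operation func(m *machine) error

// jumpTable maps each supported opcode to its implementation.
var jumpTable = map[byte]operation{
	ADD: func(m *machine) error {
		a, b, err := m.pop2()
		if err != nil {
			return err
		}
		m.stack = append(m.stack, a+b)
		return nil
	},
	MUL: func(m *machine) error {
		a, b, err := m.pop2()
		if err != nil {
			return err
		}
		m.stack = append(m.stack, a*b)
		return nil
	},
	PUSH1: func(m *machine) error {
		var v uint64 // a missing immediate byte is treated as zero
		if m.pc+1 < len(m.code) {
			v = uint64(m.code[m.pc+1])
		}
		m.pc++ // skip the immediate byte
		m.stack = append(m.stack, v)
		return nil
	},
}

// Run executes code until STOP or the end of the code and returns the stack.
func Run(code []byte) ([]uint64, error) {
	m := &machine{code: code}
	for m.pc < len(m.code) {
		op := m.code[m.pc]
		if op == STOP {
			break
		}
		exec, ok := jumpTable[op]
		if !ok {
			return nil, fmt.Errorf("invalid opcode 0x%02x", op)
		}
		if err := exec(m); err != nil {
			return nil, err
		}
		m.pc++
	}
	return m.stack, nil
}

func main() {
	// PUSH1 2, PUSH1 3, ADD, PUSH1 4, MUL, STOP computes (2+3)*4.
	code := []byte{PUSH1, 2, PUSH1, 3, ADD, PUSH1, 4, MUL, STOP}
	fmt.Println(Run(code)) // [20] <nil>
}
```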

Other module implementations

The functions of the execution layer are implemented in layers, and the other modules and functions are built on top of these three core components. A few core modules are introduced here.

Under eth/protocols, there are implementations of the current Ethereum p2p network subprotocols. There are the eth/68 and snap subprotocols, which are built on devp2p.

eth/68 is the core protocol of Ethereum: eth is the protocol name and 68 is its version number. Functions such as the transaction pool (TxPool), block synchronization (Downloader), and transaction synchronization (Fetcher) are implemented on top of this protocol. The snap protocol is used to quickly synchronize block and state data when a new node joins the network, which greatly reduces the time a new node needs to start up.

ethdb provides the read and write capabilities of the underlying database. Due to the complex data structures in the Ethereum protocol, it is not possible to manage this data directly through ethdb. Therefore, rawdb and statedb have been implemented on top of ethdb to manage block and state data, respectively.

The EVM runs through all the main processes: whether a block is being built or validated, its transactions are executed using the EVM.

05\Geth Node startup process

The startup of Geth will be divided into two phases. The first phase will initialize the components and resources needed to start the Node, and the second phase will formally start the Node and then provide services externally.


Node Initialization

When starting a geth Node, the following code will be involved:


The initialization of each module is as follows:

  • cmd/geth/main.go: geth Node startup entry
  • cmd/geth/config.go (makeFullNode): Load configuration, initialize Node
  • node/node.go: Initialize the core container for the Ethereum Node
  1. node/rpcstack.go: Initialize RPC module
  2. accounts/manager.go: Initialize accountManager
  • eth/backend.go: Initialize Ethereum instance
  1. node/node.go OpenDatabaseWithFreezer: Initialize chaindb
  2. eth/ethconfig/config.go: Initialize the consensus engine instance (the consensus engine here does not actually participate in consensus, it only verifies the results of the consensus layer and processes validator withdrawal requests)
  3. core/blockchain.go: Initialize blockchain
  4. core/filtermaps/filtermaps.go: Initialize filtermaps
  5. core/txpool/blobpool/blobpool.go: Initialize blob transaction pool
  6. core/txpool/legacypool/legacypool.go: Initialize the regular transaction pool
  7. core/txpool/locals/tx_tracker.go: Local transaction tracking (local transaction tracking must be enabled in the configuration; local transactions are processed with higher priority)
  8. eth/handler.go: Initialize the Handler instance of the protocol
  9. miner/miner.go: The module for instantiating transaction packaging (formerly the mining module)
  10. eth/api_backend.go: Instantiate RPC service
  11. eth/gasprice/gasprice.go: Instantiate gas price query service
  12. internal/ethapi/api.go: Instantiate P2P Network RPC API
  13. node/node.go(RegisterAPIs): Register RPC API
  14. node/node.go(RegisterProtocols): Register p2p's Protocols
  15. node/node.go(RegisterLifecycle): Register the lifecycle of various components
  • cmd/utils/flags.go(RegisterFilterAPI): Register Filter RPC API
  • cmd/utils/flags.go(RegisterGraphQLService): Register GraphQL RPC API (if configured)
  • cmd/utils/flags.go(RegisterEthStatsService): Register EthStats RPC API (if configured)
  • eth/catalyst/api.go: Register Engine API

The initialization of the node will be completed in makeFullNode in cmd/geth/config.go, focusing on initializing the following three modules.


In the first step, the Node structure in node/node.go is initialized. It is the container for the entire node, and all functionality runs inside it. The second step initializes the Ethereum structure, which includes the implementation of the various core functions of Ethereum. Ethereum also needs to be registered with the Node. The third step registers the Engine API with the Node.

The initialization of the Node involves creating a Node instance, and then initializing the p2p server, account management, and other protocol ports exposed to the outside such as http.


The initialization of Ethereum is much more complicated, as most of the core functions are initialized here. First, ethdb is initialized and the chain configuration is loaded from storage. Then a consensus engine is created; this engine does not perform consensus itself but only verifies the results returned by the consensus layer. If a withdrawal request occurs in the consensus layer, the actual withdrawal operation is also completed here. Finally, the BlockChain structure and the transaction pool are initialized.

After all these are completed, the handler will be initialized. The handler is the entry point for all p2p network requests, including transaction synchronization, block downloading, etc. It is a key component for the decentralized operation of Ethereum. After these parts are finished, some sub-protocols implemented on top of devp2p, such as eth/68, snap, etc., will be registered into the Node container. Finally, Ethereum will be registered as a lifecycle in the Node container, completing the initialization of Ethereum.


Finally, the initialization of the Engine API is relatively simple, just registering the Engine API to the Node. At this point, the node initialization is completely finished.

Node Startup

After completing the initialization of the Node, it is necessary to start the Node. The process of starting the Node is relatively simple; it only requires starting all the registered RPC services and Lifecycle, and then the entire Node can provide services to the outside.


06\Summary

Before deeply understanding the implementation of the Ethereum execution layer, it is necessary to have an overall understanding of Ethereum. Ethereum can be viewed as a transaction-driven state machine, where the execution layer is responsible for executing transactions and changing states, and the consensus layer is responsible for driving the execution layer's operation, including producing blocks from the execution layer, determining the order of transactions, voting on blocks, and ensuring finality for blocks. Since this state machine is decentralized, it requires communication with other nodes through a p2p network to jointly maintain the consistency of state data.

The execution layer is not responsible for determining the order of transactions, but rather for executing transactions and recording the state changes after execution. There are two forms of recording here: one is to record all state changes in the form of blocks, and the other is to record the current state in a database. The execution layer is also the entry point for transactions, and it keeps transactions that have not yet been included in a block in the transaction pool. If other nodes need block, state, or transaction data, the execution layer sends this information out via the p2p network.

For the execution layer, there are three core modules: computation, storage, and networking. Computation corresponds to the implementation of EVM, storage corresponds to the implementation of ethdb, and networking corresponds to the implementation of devp2p. With this overall understanding, one can then delve deeper into each sub-module without getting lost in the specific details.


Content | Ray

Editing & Typesetting | Huanhuan

Design | Daisy
