Blockchain Indexer: Optimize Data Retrieval to Enhance dApp Development Efficiency

The Importance of Blockchain Data and the Evolution of Retrieval Methods

Data is at the core of blockchain technology and provides the foundation for developing decentralized applications (dApps). While most discussions currently focus on data availability (DA), data accessibility is equally important yet often overlooked.

In the era of modular blockchains, DA solutions have become indispensable. They ensure that all participants can access transaction data, enabling real-time validation and maintaining network integrity. However, the DA layer acts more like a billboard than a database: data is not stored indefinitely and is deleted over time.

In contrast, data accessibility concerns the ability to retrieve historical data, which is crucial for developing dApps and conducting blockchain analysis. Although it is discussed less, data accessibility is as important as data availability. The two play different yet complementary roles in the blockchain ecosystem, and a comprehensive data management approach must address both to support robust and efficient blockchain applications.

Development of Web3 Data Access: Introduction to Indexers and Related Projects

Traditional Blockchain Data Retrieval Methods

Since its birth, blockchain has reshaped digital infrastructure and driven the creation of dApps in fields such as gaming, finance, and social networks. However, building these dApps requires access to large amounts of blockchain data, which is both difficult and expensive to obtain.

For dApp developers, one option is to host and run their own archive RPC nodes. These nodes store all historical Blockchain data from the beginning, allowing full access to the data. However, maintaining archive nodes is costly, and their query capabilities are limited, making it impossible to query data in the format developers need. While running cheaper nodes is an option, these nodes have limited data retrieval capabilities, which may affect the operation of the dApp.

Another approach is to use commercial RPC node providers. These providers are responsible for the costs and management of the nodes and provide data through RPC endpoints. Public RPC endpoints are free but have rate limits, which can negatively impact the user experience of dApps. Private RPC endpoints offer better performance by reducing congestion, but even simple data retrieval requires a significant amount of communication. This makes them request-heavy and inefficient for complex data queries. Additionally, private RPC endpoints are often difficult to scale and lack compatibility across different networks.
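To make the "request-heavy" point concrete, the sketch below counts how many `eth_getLogs` JSON-RPC calls it would take to scan one contract's event history when a provider caps the block range per request. The 2,000-block cap, the function name `build_get_logs_requests`, and the placeholder address are assumptions for illustration; real limits vary by provider. Nothing is sent over the network.

```python
import json

def build_get_logs_requests(address, from_block, to_block, max_range=2000):
    """Split a block range into JSON-RPC eth_getLogs payloads.

    max_range is an assumed per-request block limit; providers differ.
    """
    requests = []
    start = from_block
    request_id = 1
    while start <= to_block:
        end = min(start + max_range - 1, to_block)
        requests.append({
            "jsonrpc": "2.0",
            "id": request_id,
            "method": "eth_getLogs",
            "params": [{
                "address": address,
                "fromBlock": hex(start),
                "toBlock": hex(end),
            }],
        })
        start = end + 1
        request_id += 1
    return requests

# Placeholder contract address, used only to shape the payload.
POOL = "0x" + "00" * 20
payloads = build_get_logs_requests(POOL, 0, 999_999)
print(len(payloads))            # round trips needed for one million blocks
print(json.dumps(payloads[0]))  # shape of a single request
```

Under these assumptions, one million blocks take 500 separate requests, and the client still has to decode and join the results itself.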

Blockchain Indexer: A Better Alternative

A blockchain indexer organizes on-chain data and loads it into a database for easier querying, which is why indexers are often called the "Google of blockchain." By indexing the data and exposing it through query interfaces such as GraphQL APIs, they keep blockchain data available at all times. Indexers give developers a unified query interface, so the required information can be retrieved quickly and accurately with a standardized query language, which greatly simplifies the process.
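As a rough illustration of what a single indexer query can look like, the sketch below builds the JSON body of one GraphQL request. The `transfers` entity, its fields, and the filter arguments are a hypothetical schema, not the schema of any particular indexer, and the request is only constructed, not sent.

```python
import json

# Hypothetical schema: the `transfers` entity and its fields are assumptions.
TRANSFERS_QUERY = """
query RecentTransfers($token: String!, $first: Int!) {
  transfers(
    where: { token: $token }
    orderBy: blockNumber
    orderDirection: desc
    first: $first
  ) {
    from
    to
    amount
    blockNumber
  }
}
"""

def build_graphql_payload(token, first=10):
    """Return the body of a single GraphQL POST request."""
    return {
        "query": TRANSFERS_QUERY,
        "variables": {"token": token, "first": first},
    }

# Placeholder token address, used only to fill the variable.
body = json.dumps(build_graphql_payload("0x" + "00" * 20, first=5))
print(body)
```

The filtering, ordering, and field selection all travel in one request, which is the contrast with assembling the same result from many raw RPC calls.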

Different types of indexers optimize data retrieval in various ways:

  1. Full Node Indexer: Runs a complete Blockchain node and directly extracts data, ensuring data is complete and accurate, but requires significant storage and processing power.

  2. Lightweight Indexer: Relies on full nodes to retrieve specific data as needed, reducing storage requirements but potentially increasing query time.

  3. Dedicated Indexer: Optimizes retrieval for a specific type of data or a specific blockchain, such as NFT data or DeFi transactions.

  4. Aggregated Indexer: Extracts data from multiple blockchains and sources, including off-chain information, providing a unified query interface, particularly useful for multi-chain dApps.

An archive node requires 3TB of storage space for Ethereum alone, and this requirement will keep growing as the blockchain grows. An indexer protocol instead deploys multiple indexers that can index large amounts of data efficiently and serve queries quickly, which RPC endpoints alone cannot do.

Indexers also allow for complex queries, easy data filtering, and data extraction for post-analysis. Some indexers can also aggregate data from multiple sources, avoiding the need to deploy multiple APIs in multi-chain dApps. By being distributed across multiple nodes, indexers provide enhanced security and performance, while RPC providers may experience interruptions and downtime due to their centralized nature.
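The core idea of loading chain data into a queryable store can be sketched with Python's built-in `sqlite3` module. The event records below are made up and already decoded; a real indexer would decode them from block logs and would typically use a production database. Table and function names are illustrative.

```python
import sqlite3

# Made-up, already-decoded transfer events.
EVENTS = [
    {"block": 100, "sender": "alice", "recipient": "bob", "amount": 50},
    {"block": 101, "sender": "bob", "recipient": "carol", "amount": 20},
    {"block": 105, "sender": "alice", "recipient": "carol", "amount": 70},
]

def index_events(events):
    """Load events into an in-memory SQLite table with an index on sender."""
    db = sqlite3.connect(":memory:")
    db.execute(
        "CREATE TABLE transfers "
        "(block INTEGER, sender TEXT, recipient TEXT, amount INTEGER)"
    )
    db.execute("CREATE INDEX idx_sender ON transfers (sender)")
    db.executemany(
        "INSERT INTO transfers VALUES (:block, :sender, :recipient, :amount)",
        events,
    )
    db.commit()
    return db

def total_sent(db, sender, min_block=0):
    """Sum the amounts sent by one address from min_block onward."""
    row = db.execute(
        "SELECT COALESCE(SUM(amount), 0) FROM transfers "
        "WHERE sender = ? AND block >= ?",
        (sender, min_block),
    ).fetchone()
    return row[0]

db = index_events(EVENTS)
print(total_sent(db, "alice"))
print(total_sent(db, "alice", min_block=101))
```

Once the events sit in a table, filtering and aggregation become one query instead of a scan over raw blocks.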

Overall, compared to RPC node providers, indexers improve the efficiency and reliability of data retrieval while reducing the cost of deploying a single node. This makes the Blockchain indexer protocol the preferred choice for dApp developers.

![Development of Web3 Data Access: Introduction to Indexers and Related Projects](https://img-cdn.gateio.im/webp-social/moments-16396b955382c2c74010c264affdca46.webp)

Indexer Use Cases

Building a dApp requires retrieving and reading Blockchain data to operate its services. This includes any type of dApp, such as DeFi, NFT platforms, games, and even social networks, as these platforms need to read data first in order to execute other transactions.

DeFi

DeFi protocols require different information to quote specific prices, rates, fees, etc. An automated market maker (AMM) needs price and liquidity information about its liquidity pools to calculate swap rates, while lending protocols rely on utilization rates to determine borrowing rates and on debt ratios to trigger liquidations. This information must be fed into the dApp before it can calculate the rates at which users' transactions execute.
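To show why this data must be read before anything can be quoted, here is a minimal sketch of the two calculations. It assumes a constant-product pool (x * y = k) with a proportional fee and a simple linear borrow-rate model; the fee, base rate, and slope values are illustrative assumptions, and real protocols use their own formulas and parameters.

```python
def swap_output(amount_in, reserve_in, reserve_out, fee=0.003):
    """Output amount for a constant-product (x * y = k) pool.

    The reserves are the liquidity data the dApp must read first.
    The fee value is an assumption.
    """
    amount_in_after_fee = amount_in * (1 - fee)
    return reserve_out * amount_in_after_fee / (reserve_in + amount_in_after_fee)

def utilization(total_borrowed, available_cash):
    """Share of supplied funds that is currently borrowed."""
    total = total_borrowed + available_cash
    return total_borrowed / total if total else 0.0

def borrow_rate(util, base_rate=0.02, slope=0.10):
    """Assumed linear model: the rate rises with utilization."""
    return base_rate + slope * util

# Illustrative numbers only.
print(swap_output(100, 1_000, 1_000))
print(borrow_rate(utilization(80, 20)))
```

Every input here (reserves, borrowed amount, available cash) is on-chain state, so stale or slow reads translate directly into wrong quotes.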

Game

GameFi needs to index and access data quickly so that users can play smoothly. Only with fast data retrieval and execution can Web3 games compete with Web2 games on performance and attract more users. These games require data such as land ownership, in-game token balances, and in-game operations. By using indexers, they can better ensure a stable data flow and stable uptime, which supports a smooth gaming experience.

NFT

NFT markets and lending platforms need to index data to access various information, such as NFT metadata, ownership and transfer data, royalty information, etc. Indexing such data avoids scanning each NFT individually to find its ownership or attribute data.
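A minimal sketch of the ownership case: replaying transfer events in chain order yields a lookup table of current owners, so "which tokens does this address hold?" no longer requires checking each NFT one by one. The event fields and sample records are made up for illustration.

```python
def build_ownership_index(transfers):
    """Replay transfers in chain order; the last recipient is the owner."""
    owners = {}
    ordered = sorted(transfers, key=lambda t: (t["block"], t["log_index"]))
    for t in ordered:
        owners[t["token_id"]] = t["to"]
    return owners

def tokens_owned_by(owners, address):
    """Return the token ids currently held by an address."""
    return sorted(tid for tid, owner in owners.items() if owner == address)

# Made-up transfer events, deliberately out of order.
TRANSFERS = [
    {"block": 12, "log_index": 0, "token_id": 1, "from": "alice", "to": "bob"},
    {"block": 10, "log_index": 0, "token_id": 1, "from": "mint", "to": "alice"},
    {"block": 10, "log_index": 1, "token_id": 2, "from": "mint", "to": "alice"},
    {"block": 15, "log_index": 3, "token_id": 3, "from": "mint", "to": "bob"},
]

owners = build_ownership_index(TRANSFERS)
print(owners)
print(tokens_owned_by(owners, "bob"))
```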

Whether it is a DeFi automated market maker (AMM) that requires price and liquidity information, or a SocialFi application that needs to show new user posts, the ability to quickly retrieve data is crucial for the normal operation of dApps. With the help of indexers, they can efficiently and accurately retrieve data, thereby providing a smooth user experience.

Analysis

The indexer provides a way to extract specific data from raw blockchain data (including smart contract events in each block). This creates opportunities for more targeted data analysis and more comprehensive insights.

For example, perpetual trading protocols can identify which tokens have high trading volumes and which tokens generate fees, thereby deciding whether to list these tokens as perpetual contracts on their platform. A DEX developer can create dashboards for their own products to gain insights into which liquidity pools offer the highest returns or the strongest liquidity. They can also create public dashboards that allow developers to freely and flexibly query any type of data they want to display on the charts.
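The kind of aggregation behind such a dashboard can be sketched in a few lines. The swap records, field names, and token labels below are made up; in practice they would come from indexed swap events.

```python
from collections import defaultdict

def volume_and_fees_by_token(swaps):
    """Aggregate traded volume and fees per token from swap records."""
    totals = defaultdict(lambda: {"volume": 0.0, "fees": 0.0})
    for s in swaps:
        totals[s["token"]]["volume"] += s["amount_usd"]
        totals[s["token"]]["fees"] += s["fee_usd"]
    return dict(totals)

def top_tokens(totals, n=3):
    """Rank tokens by traded volume, highest first."""
    return sorted(totals, key=lambda tok: totals[tok]["volume"], reverse=True)[:n]

# Made-up swap records with placeholder token labels.
SWAPS = [
    {"token": "TOKEN_A", "amount_usd": 1000.0, "fee_usd": 3.0},
    {"token": "TOKEN_B", "amount_usd": 250.0, "fee_usd": 0.75},
    {"token": "TOKEN_A", "amount_usd": 500.0, "fee_usd": 1.5},
    {"token": "TOKEN_C", "amount_usd": 2000.0, "fee_usd": 6.0},
]

totals = volume_and_fees_by_token(SWAPS)
print(top_tokens(totals, n=2))
```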

As there are multiple blockchain indexers available, identifying the differences between indexing protocols is crucial to ensure that developers choose the indexer that best fits their needs.

![Development of Web3 Data Access: Introduction to Indexers and Related Projects](https://img-cdn.gateio.im/webp-social/moments-53dbb4fd659cf6a7184990c886901658.webp)

Blockchain Indexer Overview

The Graph

The Graph is the first indexing protocol launched on Ethereum, which allows for easy querying of previously hard-to-access transaction data. It uses subgraph definitions and filtering to collect subsets of data from the Blockchain, such as all transactions related to a certain DEX USDC/ETH pool.

Using proof of indexing, indexers stake the native token GRT to provide indexing and query services, and delegators can choose to delegate their tokens to indexers. Curators signal on high-quality subgraphs to help indexers decide which subgraphs to index in order to earn the best query fees. As it moves toward greater decentralization, The Graph will eventually shut down its hosted service and require subgraphs to upgrade to its network, with an upgrade indexer provided to support the transition.

Its infrastructure brings the average cost per million queries to $40, which is much lower than the cost of self-hosted nodes. Using file data sources, it also supports parallel indexing of both on-chain and off-chain data for efficient data retrieval.

The rewards for The Graph's indexers have been steadily increasing over the past few quarters. This is partly due to higher query volume and partly due to the rise in the token price, which coincided with plans to integrate AI-assisted queries in the future.

Subsquid

Subsquid is a peer-to-peer, horizontally scalable decentralized data lake that efficiently aggregates large amounts of on-chain and off-chain data, protected by zero-knowledge proofs. As a decentralized worker network, each node is responsible for storing data from a specific block subset, speeding up the data retrieval process by quickly identifying nodes that hold the required data.
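The idea of locating the nodes that hold a given block range can be sketched as a sorted-range lookup. This is a simplified illustration of the general technique, not Subsquid's actual implementation; the `ChunkRouter` class, the worker ids, and the chunk sizes are assumptions.

```python
import bisect

class ChunkRouter:
    """Map block ranges to the nodes that store them."""

    def __init__(self, assignments):
        # assignments: non-overlapping (first_block, last_block, node_id) tuples
        self._chunks = sorted(assignments)
        self._starts = [chunk[0] for chunk in self._chunks]

    def nodes_for_range(self, from_block, to_block):
        """Return ids of nodes whose chunks overlap [from_block, to_block]."""
        # Begin at the last chunk that starts at or before from_block.
        i = max(bisect.bisect_right(self._starts, from_block) - 1, 0)
        nodes = []
        while i < len(self._chunks) and self._chunks[i][0] <= to_block:
            first, last, node = self._chunks[i]
            if last >= from_block:
                nodes.append(node)
            i += 1
        return nodes

# Hypothetical assignment of block subsets to workers.
router = ChunkRouter([
    (0, 999, "worker-1"),
    (1000, 1999, "worker-2"),
    (2000, 2999, "worker-3"),
])
print(router.nodes_for_range(1500, 2100))
```

Because the chunk starts are sorted, the lookup uses a binary search followed by a short scan, so a query only contacts the nodes that hold relevant blocks.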

Subsquid also supports real-time indexing, allowing data to be indexed before a block is finalized. It also supports storing data in formats and targets chosen by developers, such as BigQuery, Parquet, or CSV, which makes analysis easier. Additionally, subgraphs can be deployed on the Subsquid network without migrating to the Squid SDK, enabling no-code deployment.

During its testnet stage, Subsquid achieved impressive statistics, with over 80,000 testnet users, more than 60,000 Squid indexers deployed, and over 20,000 verified developers on the network. More recently, on June 3rd, Subsquid launched the mainnet of its data lake.

In addition to indexing, the Subsquid Network data lake can also replace RPC in use cases such as analytics, ZK/TEE co-processors, AI agents, and Oracles.

SubQuery

SubQuery is a decentralized middleware infrastructure network that provides RPC and indexed data services. It originally supported the Polkadot and Substrate networks and has since expanded to over 200 chains. It works similarly to The Graph, using proof of indexing: indexers index data and serve query requests, and delegators delegate their stake to indexers. However, it adds a consumer role: consumers submit purchase orders, which signal to indexers that income for the work is secured.

It will introduce SubQuery data nodes that support sharding to prevent continuous synchronization of new data between each node, thereby optimizing query efficiency while moving towards greater decentralization. Users can choose to pay approximately 1 SQT token as a computation fee for every 1000 requests, or set custom fees for indexers through the protocol.
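As a quick worked example of the fee figure above, the sketch below estimates the cost of a given request volume at roughly 1 SQT per 1,000 requests. The function name is illustrative and the rate parameter is adjustable, since the text notes that custom fees can also be set.

```python
from decimal import Decimal

def estimate_fee_sqt(request_count, sqt_per_1000=Decimal("1")):
    """Estimate the computation fee in SQT for a number of requests.

    The default rate is the approximate figure quoted in the text.
    """
    return Decimal(request_count) / Decimal(1000) * sqt_per_1000

# Illustrative volume: 250,000 requests.
print(estimate_fee_sqt(250_000))
print(estimate_fee_sqt(250_000, sqt_per_1000=Decimal("0.5")))
```

At the default rate, 250,000 requests come to about 250 SQT.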

Although SubQuery only launched its token earlier this year, the issuance rewards for nodes and delegators have increased in USD value month over month, which indicates steady growth in the query services offered on its platform. Since the TGE, the total amount of staked SQT has increased from 6 million to 125 million, highlighting the growth in network participation.

Covalent

Covalent is a decentralized indexer network in which block specimen producer (BSP) nodes create copies of blockchain data through bulk exports and publish proofs on the Covalent L1 blockchain. Block results producer (BRP) nodes then refine this data according to established rules to filter out the required data.

Through a unified API, developers can easily extract relevant Blockchain data in a consistent request and response format without the need to write custom complex queries to access the data. The CQT token, which is settled on Moonbeam, can be used as a means of payment to extract these pre-configured datasets from network operators.

The rewards from Covalent seem to show an overall upward trend from the first quarter of 2023 to the first quarter of 2024, partly due to the rise in the price of Covalent token CQT.

![Development of Web3 Data Access: Introduction to Indexers and Related Projects](https://img-cdn.gateio.im/webp-social/moments-52ee29205aa307720198994a5f3de61f.webp)

Considerations for Choosing an Indexer

Customizability of Data

Some indexers (such as Covalent) are general-purpose indexers that provide standard pre-configured datasets via API. While they may be fast, they lack the flexibility that developers requiring custom datasets need. An indexer framework, by contrast, allows more customized data processing to meet application-specific needs.

Security

Indexed data must be secure; otherwise, dApps built on these indexers are also vulnerable to attacks, for example if transactions and wallet balances can be manipulated.
