
@jsvisa
Created April 16, 2023 13:52
issue_eth_subscribeFilterLogs_from_block

Understanding of the current problem

The problem stems from the current design of the SubscribeXXX interfaces, which only process data from the current block height onwards; this processing is serialized in a goroutine by the filter_system (handled in EventSystem.eventLoop). As a result, SubscribeFilterLogs cannot separately process the blocks between fromBlock and the current block, while what we want is to filter both historical and future data with semantics consistent with the FilterLogs interface.
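
To make the gap concrete, here is a minimal client-side sketch using go-ethereum's public ethclient API (the endpoint URL and block number are placeholders, and exactly how a node treats a historical fromBlock on a subscription may vary by version): the same FilterQuery returns the historical range through FilterLogs, while SubscribeFilterLogs only delivers logs from newly imported blocks.

```go
package main

import (
	"context"
	"log"
	"math/big"

	"github.com/ethereum/go-ethereum"
	"github.com/ethereum/go-ethereum/core/types"
	"github.com/ethereum/go-ethereum/ethclient"
)

func main() {
	// Placeholder endpoint; a WebSocket URL is needed for subscriptions.
	client, err := ethclient.Dial("ws://127.0.0.1:8546")
	if err != nil {
		log.Fatal(err)
	}

	// The same query with a historical fromBlock (placeholder value).
	q := ethereum.FilterQuery{FromBlock: big.NewInt(17_000_000)}

	// eth_getLogs: honors FromBlock and returns the historical range.
	history, err := client.FilterLogs(context.Background(), q)
	if err != nil {
		log.Fatal(err)
	}
	log.Printf("FilterLogs returned %d historical logs", len(history))

	// eth_subscribe("logs"): only delivers logs from new blocks; the
	// historical part of the range is not replayed to the subscriber.
	logsCh := make(chan types.Log)
	sub, err := client.SubscribeFilterLogs(context.Background(), q, logsCh)
	if err != nil {
		log.Fatal(err)
	}
	defer sub.Unsubscribe()

	for l := range logsCh {
		log.Printf("live log from block %d", l.BlockNumber)
	}
}
```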

How to solve this problem

To minimize the cost of implementing this requirement:

If the fromBlock in SubscribeFilterLogs is a historical block height, we split the subscription into two phases (a rough sketch follows the list):

  1. phase1, handle historical data: the server continuously extracts the data between fromBlock and the current block in batches via the FilterLogs interface and returns it directly to the client; the goroutine exits once all historical data has been extracted;
  2. phase2, handle future data: this is the existing processing logic and does not need to change.
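
To illustrate the intended split, here is a rough client-side stand-in for the proposed server behavior; subscribeWithHistory, the batch size, and the assumption that FromBlock is set are all illustrative, not part of go-ethereum. Phase1 walks the historical range in batches via FilterLogs, and phase2 then attaches the unchanged live subscription (blocks imported between the head snapshot and the subscription start would still need to be covered in a real implementation).

```go
package logsub

import (
	"context"
	"math/big"

	"github.com/ethereum/go-ethereum"
	"github.com/ethereum/go-ethereum/core/types"
	"github.com/ethereum/go-ethereum/ethclient"
)

// subscribeWithHistory mimics the proposed two-phase behavior: backfill the
// historical range first, then fall back to the normal log subscription.
// q.FromBlock must be set and batch must be > 0.
func subscribeWithHistory(ctx context.Context, client *ethclient.Client,
	q ethereum.FilterQuery, batch uint64, out chan<- types.Log) (ethereum.Subscription, error) {

	head, err := client.BlockNumber(ctx)
	if err != nil {
		return nil, err
	}

	// Phase 1: extract [fromBlock, head] in batches via FilterLogs and
	// stream the results to the client as each batch completes.
	for from := q.FromBlock.Uint64(); from <= head; from += batch {
		to := from + batch - 1
		if to > head {
			to = head
		}
		bq := q
		bq.FromBlock = new(big.Int).SetUint64(from)
		bq.ToBlock = new(big.Int).SetUint64(to)
		logs, err := client.FilterLogs(ctx, bq)
		if err != nil {
			return nil, err
		}
		for _, l := range logs {
			out <- l
		}
	}

	// Phase 2: the existing subscription path, unchanged, for future blocks.
	liveQ := q
	liveQ.FromBlock, liveQ.ToBlock = nil, nil
	return client.SubscribeFilterLogs(ctx, liveQ, out)
}
```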

Implementation method and possible issues

Synchronous

Synchronous scheme: phase2 is enabled only after phase1 has completed. This gives the best result with minimal impact on users. One possible solution is:

  1. Add a semaphore between phase1 and phase2: only after phase1 has finished processing may phase2 send data to the client; until then, phase2's data stays in memory (see the sketch below).
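
A minimal way to express that gate in Go (the type and method names are made up for illustration): phase2 hands every live log to a sink that buffers it in memory until phase1 signals completion, after which the buffer is flushed and later logs pass straight through. A mutex-guarded flag stands in for the semaphore.

```go
package logsub

import (
	"sync"

	"github.com/ethereum/go-ethereum/core/types"
)

// gatedSink buffers phase2 logs until the phase1 backfill has finished,
// then flushes them and forwards subsequent logs immediately.
type gatedSink struct {
	mu         sync.Mutex
	phase1Done bool
	buffer     []types.Log // staged data whose memory cost is discussed below
	out        chan<- types.Log
}

func newGatedSink(out chan<- types.Log) *gatedSink {
	return &gatedSink{out: out}
}

// Phase1Finished is called once all historical logs have been sent;
// it releases everything phase2 buffered in the meantime.
func (g *gatedSink) Phase1Finished() {
	g.mu.Lock()
	defer g.mu.Unlock()
	g.phase1Done = true
	for _, l := range g.buffer {
		g.out <- l
	}
	g.buffer = nil
}

// Deliver is called by phase2 for every matched live log.
func (g *gatedSink) Deliver(l types.Log) {
	g.mu.Lock()
	defer g.mu.Unlock()
	if !g.phase1Done {
		g.buffer = append(g.buffer, l)
		return
	}
	g.out <- l
}
```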

However, there are a few implementation issues:

  1. If the gap between fromBlock and the current block height is large (e.g. starting from block 1), processing the historical data may take a long time (hours or even days). We can mitigate this by limiting the height difference between fromBlock and toBlock with a parameter (e.g. at most 90,000 blocks), as sketched below.
  2. Since phase2's data has to be staged while phase1 runs, the extra memory may cause the process to OOM.
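
The range cap could be a simple check at subscription time; a sketch assuming a limit of 90,000 blocks and a toBlock that defaults to the current head when unset (the constant and function names are made up):

```go
package logsub

import "fmt"

// maxHistoryRange caps how far back a log subscription may start
// (90,000 blocks, matching the example limit above).
const maxHistoryRange = 90_000

// validateRange rejects subscriptions whose historical span is too large.
func validateRange(fromBlock, toBlock uint64) error {
	if fromBlock > toBlock {
		return fmt.Errorf("fromBlock %d is ahead of toBlock %d", fromBlock, toBlock)
	}
	if toBlock-fromBlock > maxHistoryRange {
		return fmt.Errorf("range of %d blocks exceeds the limit of %d",
			toBlock-fromBlock, maxHistoryRange)
	}
	return nil
}
```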

Asynchronous

Asynchronous scheme: this option requires the smallest change to the current code, but the logs received by the client are unordered, and the client has to handle the reordering itself, which may be confusing for users.
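
For completeness, this is the kind of reordering the client would be left to do under the asynchronous scheme, e.g. once it has collected logs from both the historical and live streams; a minimal sketch sorting by block number and in-block log index:

```go
package logsub

import (
	"sort"

	"github.com/ethereum/go-ethereum/core/types"
)

// sortLogs restores canonical order for logs that arrived interleaved from
// the historical and live streams: by block number, then by in-block index.
func sortLogs(logs []types.Log) {
	sort.Slice(logs, func(i, j int) bool {
		if logs[i].BlockNumber != logs[j].BlockNumber {
			return logs[i].BlockNumber < logs[j].BlockNumber
		}
		return logs[i].Index < logs[j].Index
	})
}
```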

These are my rough thoughts. Please let me know if anything is inaccurate; I would like to hear your opinions.
