@tonesnotes
Last active May 11, 2023 19:45

Review of Pulse and Comparison With Chaintracks

This document was prepared 2023-04-21 by Tone Engel for Project Babbage and the Bitcoin Association

Quick Review of Pulse: https://github.com/libsv/pulse

These comments come from a very shallow look at this repository. Important details may easily have been missed, which could invalidate some of the conclusions and observations.

Pulse Initial Observations:

  1. Go installed, project built and ran per the README on the first try with no edits or customization required. Good! Note that the README still says to clone from the original repo.
  2. Default database appears to be seeded from a 90MB file "blockheaders.xz".
    1. Is testNet handled? The default network is mainNet.
  3. Expanding the database seed file and populating the full database took 200 secs and resulted in a 342MB database file "blockheaders.db"
    1. Running the "go run ./cmd/ ." command again resulted in the same startup time. There must be a way to restart without a full database rebuild?
    2. Only very recent headers appear to have come from the network, perhaps a thousand or so?
  4. The bitcoind P2P network is used for unknown / new header data:
    1. The default white list / black list results in very busy network traffic, almost all of it having zero value.
    2. The service crashed both times I ran it, after a few minutes. Console output is attached below. Clearly you're seeing better reliability, so perhaps it's a difference in configuration options.
    3. The error both times was the same: "invalid memory address or nil pointer dereference"
  5. The localhost:8080/swagger/index.html API browser is a great way to test and play.
    1. The only option for /chain/header/byHeight is "application/json", great for single headers, inefficient for larger counts.
    2. There appears to be no API for looking up headers by merkle root? This lookup is currently required by the TSC merkle proof standard.

Quick Comparison With Project Babbage Chaintracks

Here is how Pulse appears to stack up against Chaintracks based on this initial look.

Note that Chaintracks is designed to be integrated into applications to provide fast, in memory access to header data with minimal space and network traffic.

It appears that Pulse is intended to be accessed by HTTP API in client / server mode only. While this is supported by Chaintracks, we expect it to be a rare use case: the fundamental Bitcoin security proposition is control of, and trust in, the header data itself, which for Chaintracks means that local, non-server use is better.

Note also that Chaintracks stores headers in binary form to prioritize space efficiency. The Chaintracks APIs support both binary and hex string formats. Pulse appears to put a lower priority on space efficiency, using a string / JSON based schema and API only.
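To make the space trade-off concrete, here is a minimal Go sketch that encodes one serialized header three ways and compares the sizes. The `jsonHeader` struct is a hypothetical shape invented for this illustration; it is not Pulse's actual schema, and the helper names are assumptions as well.

```go
package main

import (
	"encoding/binary"
	"encoding/hex"
	"encoding/json"
	"errors"
	"fmt"
)

// jsonHeader is a HYPOTHETICAL JSON shape used only for this size comparison.
// It is not taken from Pulse or Chaintracks.
type jsonHeader struct {
	Height     int    `json:"height"`
	Version    uint32 `json:"version"`
	PrevHash   string `json:"prevHash"`
	MerkleRoot string `json:"merkleRoot"`
	Time       uint32 `json:"time"`
	Bits       uint32 `json:"bits"`
	Nonce      uint32 `json:"nonce"`
}

// reversedHex returns b as hex in reversed byte order, the usual display
// form for Bitcoin hashes.
func reversedHex(b []byte) string {
	out := make([]byte, len(b))
	for i := range b {
		out[i] = b[len(b)-1-i]
	}
	return hex.EncodeToString(out)
}

// encodedSizes reports how many bytes one 80-byte header needs as raw
// binary, as a hex string, and as the hypothetical JSON object above.
func encodedSizes(raw []byte, height int) (rawLen, hexLen, jsonLen int, err error) {
	if len(raw) != 80 {
		return 0, 0, 0, errors.New("header must be exactly 80 bytes")
	}
	h := jsonHeader{
		Height:     height,
		Version:    binary.LittleEndian.Uint32(raw[0:4]),
		PrevHash:   reversedHex(raw[4:36]),
		MerkleRoot: reversedHex(raw[36:68]),
		Time:       binary.LittleEndian.Uint32(raw[68:72]),
		Bits:       binary.LittleEndian.Uint32(raw[72:76]),
		Nonce:      binary.LittleEndian.Uint32(raw[76:80]),
	}
	js, err := json.Marshal(h)
	if err != nil {
		return 0, 0, 0, err
	}
	return len(raw), len(hex.EncodeToString(raw)), len(js), nil
}

func main() {
	// An all-zero header is enough to compare encodings; real data would
	// only change the digits, not the hash string lengths.
	r, x, j, err := encodedSizes(make([]byte, 80), 0)
	if err != nil {
		panic(err)
	}
	fmt.Printf("binary: %d bytes, hex: %d bytes, JSON: %d bytes\n", r, x, j)
}
```

Binary is always 80 bytes and hex always 160. The JSON figure depends entirely on the schema chosen, but any schema carrying two 64-character hash strings is necessarily larger than the hex form.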

Note that Chaintracks partitions headers into Bulk vs Live. Bulk headers are treated as a strict array of serialized 80 byte binary values indexed by height only. Live headers are stored in a database and support reorgs. We feel this design scales into the future while the monolithic all-headers-in-one-database approach becomes more burdened every year. There is simply no reason to store Chainwork or fields to support reorgs for ancient headers. Stepping back, the purpose of each header is to report the merkle root at a given height. All the other bytes are there just to support computing and validating hashes to confirm the validity of the merkle root values.
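The "strict array indexed by height" idea can be sketched in a few lines of Go. This is an illustration under stated assumptions, not Chaintracks code: the function names are hypothetical, and the bulk data is assumed to be one contiguous byte buffer starting at height zero. The field offsets follow the standard 80-byte header layout (version 4 bytes, previous hash 32, merkle root 32, time 4, bits 4, nonce 4).

```go
package main

import (
	"encoding/hex"
	"errors"
	"fmt"
	"strings"
)

// headerSize is the length of a serialized Bitcoin block header.
const headerSize = 80

// headerAt returns the header stored at the given height in a flat bulk
// buffer, where header h occupies bytes [h*80, (h+1)*80).
func headerAt(bulk []byte, height int) ([]byte, error) {
	start := height * headerSize
	if height < 0 || start+headerSize > len(bulk) {
		return nil, errors.New("height out of range")
	}
	return bulk[start : start+headerSize], nil
}

// merkleRootHex extracts the merkle root (header bytes 36..67) and returns
// it in the conventional display form: byte-reversed hex.
func merkleRootHex(header []byte) string {
	root := make([]byte, 32)
	for i := 0; i < 32; i++ {
		root[i] = header[36+31-i]
	}
	return hex.EncodeToString(root)
}

// genesisHeader returns the serialized mainNet genesis header as sample data.
func genesisHeader() []byte {
	h, err := hex.DecodeString(
		"01000000" + // version
			strings.Repeat("00", 32) + // previous block hash
			"3ba3edfd7a7b12b27ac72c3e67768f617fc81bc3888a51323a9fb8aa4b1e5e4a" + // merkle root
			"29ab5f49" + // time
			"ffff001d" + // bits
			"1dac2b7c") // nonce
	if err != nil {
		panic(err)
	}
	return h
}

func main() {
	bulk := genesisHeader() // a one-header "bulk file" for demonstration
	h, err := headerAt(bulk, 0)
	if err != nil {
		panic(err)
	}
	fmt.Println("merkle root at height 0:", merkleRootHex(h))
}
```

No index, chainwork, or reorg bookkeeping is needed for this lookup: the height alone determines the byte offset.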

| Client | Sync Time | Restart Time | Local Bulk Size | Local Live Size | Events | Notes |
|---|---|---|---|---|---|---|
| Pulse | 200 secs | 200 secs | n/a | 342 MB | no? | Restart time is probably wrong. All headers kept in SQL database. |
| Chaintracks | 90 secs | 1 sec | 125 MB | 1 MB | yes | Full local headers copy. New header & reorg events. |
| ChaintracksClient | 8 secs | 8 secs | n/a | 1 MB | yes | Remote bulk, local live headers (RAM or disk). 2000 live headers. |
| ChaintracksClient | 2.5 secs | 2.5 secs | n/a | 1 MB | yes | Remote bulk, local live headers (RAM or disk). 200 live headers. |
| ChaintracksServiceClient | 0 secs | 0 secs | n/a | n/a | no | HTTP access to trusted Chaintracks instance. |

Unless otherwise noted, the metrics are for mainNet with 2000 live headers maintained to handle reorgs. For the ChaintracksClient, both the Sync Time and Restart Time are linearly dependent on the live header count, so the metrics for 200 live headers are also listed.
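Taking the stated linear dependence at face value, the two ChaintracksClient measurements (2.5 secs at 200 live headers, 8 secs at 2000) are enough to estimate a fixed cost and a per-header cost. The Go sketch below does that arithmetic. It is a two-point fit only, so the result is a rough estimate rather than a measured property of Chaintracks, and the function names are made up for this example.

```go
package main

import "fmt"

// fitLine returns the intercept and slope of the straight line through two
// measured points (n1, t1) and (n2, t2), where n is the live header count
// and t is the sync time in seconds.
func fitLine(n1, t1, n2, t2 float64) (intercept, slope float64) {
	slope = (t2 - t1) / (n2 - n1)
	intercept = t1 - slope*n1
	return intercept, slope
}

// estimate evaluates the fitted line at a given live header count.
func estimate(intercept, slope, n float64) float64 {
	return intercept + slope*n
}

func main() {
	// Measurements for ChaintracksClient taken from the comparison figures.
	a, b := fitLine(200, 2.5, 2000, 8)
	fmt.Printf("fixed cost: about %.2f s, per live header: about %.2f ms\n", a, b*1000)
	fmt.Printf("estimated sync time for 1000 live headers: about %.2f s\n", estimate(a, b, 1000))
}
```

Under this fit, roughly 1.9 seconds are independent of the live header count and each live header adds about 3 milliseconds.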

Note that in all configurations, live headers can be stored on disk or in memory, in either a multi-user or a single-user SQL database. Again, the live header database can easily be run in memory due to its small size (typically less than 1 MB).

The "Sync Time" metric refers to the first startup synchronization time from header zero.

The "Restart Time" metric is the time to restart the client and synchronize after an initial sync.

Note that the 125 MB currently required for a full local copy of bulk headers includes 60 MB of index files which should not be required. These files handle finding bulk headers by merkle root and header hash values. There is no reason for merkle proofs to require these indices. The standard should require that header height be used whenever it is known.

Pulse Crash Console Output

2023-04-21 11:14:38.973 [INF] HEADERS: Reached the final checkpoint -- switching to normal mode
panic: runtime error: invalid memory address or nil pointer dereference
[signal 0xc0000005 code=0x0 addr=0x0 pc=0x121dcb9]

goroutine 89 [running]:
github.com/libsv/bitcoin-hc/transports/p2p/p2psync.(*SyncManager).handleHeadersMsg(0xc000313ae0, 0xc0004d59a0)
        D:/github/libsv/pulse/transports/p2p/p2psync/manager.go:600 +0xad9
github.com/libsv/bitcoin-hc/transports/p2p/p2psync.(*SyncManager).blockHandler(0xc000313ae0)
        D:/github/libsv/pulse/transports/p2p/p2psync/manager.go:756 +0x172
created by github.com/libsv/bitcoin-hc/transports/p2p/p2psync.(*SyncManager).Start
        D:/github/libsv/pulse/transports/p2p/p2psync/manager.go:851 +0xca
exit status 2
2023-04-21 11:28:40.198 [INF] HEADERS: [Headers] received headers count: 0
2023-04-21 11:28:40.650 [INF] HEADERS: [Headers] handleInvMsg, peer.ID: 3
2023-04-21 11:28:40.651 [INF] HEADERS: [Manager] handleInvMsg lastHeaderNode.height : 788596
2023-04-21 11:28:41.911 [INF] HEADERS: [Headers] handleInvMsg, peer.ID: 3
2023-04-21 11:28:41.913 [INF] HEADERS: [Manager] handleInvMsg lastHeaderNode.height : 788596
2023-04-21 11:28:42.098 [INF] HEADERS: [Server] query
2023-04-21 11:28:42.098 [INF] HEADERS: [Server] query
2023-04-21 11:28:42.390 [INF] HEADERS: [Headers] handleInvMsg, peer.ID: 3
2023-04-21 11:28:42.391 [INF] HEADERS: [Manager] handleInvMsg lastHeaderNode.height : 788596
2023-04-21 11:28:42.399 [INF] HEADERS: [Headers] received headers count: 1
2023-04-21 11:28:42.400 [INF] HEADERS: [Manager] Synced height: 788596
2023-04-21 11:28:42.401 [INF] HEADERS: Reached the final checkpoint -- switching to normal mode
panic: runtime error: invalid memory address or nil pointer dereference
[signal 0xc0000005 code=0x0 addr=0x0 pc=0x9cdcb9]

goroutine 58 [running]:
github.com/libsv/bitcoin-hc/transports/p2p/p2psync.(*SyncManager).handleHeadersMsg(0xc000594be0, 0xc0004b21c0)
        D:/github/libsv/pulse/transports/p2p/p2psync/manager.go:600 +0xad9
github.com/libsv/bitcoin-hc/transports/p2p/p2psync.(*SyncManager).blockHandler(0xc000594be0)
        D:/github/libsv/pulse/transports/p2p/p2psync/manager.go:756 +0x172
created by github.com/libsv/bitcoin-hc/transports/p2p/p2psync.(*SyncManager).Start
        D:/github/libsv/pulse/transports/p2p/p2psync/manager.go:851 +0xca
exit status 2
PS D:\github\libsv\pulse> 
@tonesnotes
Author

I failed to point out that Chaintracks uses the CDN model to obtain seed data to accelerate local bulk header copy creation. We don't build the seed data into the repository as Pulse currently does. This allows the data to be updated continuously without requiring application updates and allows for competing services to provide the data. We intend to monetize bulk header data access in the near term to make sure sustainable patterns of use emerge.

@sirdeggen

Tone thank you for this incredibly helpful feedback. It took me a while to find simply because I give my email inbox very little attention. I have a few questions which I'll add below.

@sirdeggen

I also saw this crash, perhaps around the same time. I have a live instance running at https://pulse.ö.network. If it is of any interest to you, please let me know and I'll grant you access via token.

@sirdeggen

The motivation to develop Pulse was actually to onboard new developers who we could work with to develop the upcoming functionality with respect to Alert Key messages, while also providing a headers-only client for Go developers, since Go has been gaining use over the last few years in the BSV space.

Pulse is very much still in development and your feedback here will certainly be taken on board.

Having said all that - clearly you guys have a kick ass solution at Babbage in Chaintracks and that's great. I've asked for BA resources to put towards the development of these tools further and in line with the new node architecture. I am in the process of awaiting internal review via new management - all of which is still in flux as can be expected.

I hope to work with you and the rest of your team in the near future.
