This document was prepared 2023-04-21 by Tone Engel for Project Babbage and the Bitcoin Association
Quick Review of Pulse: https://github.com/libsv/pulse
These comments are from a very shallow look at this repository. Important details may easily have been missed, which could invalidate some of the conclusions and observations.
Pulse Initial Observations:
- Go installed; the project built and ran per the README on the first try with no edits or customization required. Good! Note that the README still says to clone from the original repo?
- Default database appears to be seeded from a 90MB file "blockheaders.xz".
- Is testNet handled? The default network is mainNet.
- Expanding the database seed file and populating the full database took 200 secs and resulted in a 342MB database file "blockheaders.db"
- Running the "go run ./cmd/ ." command again resulted in the same startup time. There must be a way to restart without a full database rebuild?
- Only very recent headers appear to have come from the network, perhaps a thousand or so?
- The bitcoind P2P network is used for unknown / new header data:
- The default white list / black list results in very busy network traffic, almost all of it having zero value.
- The service crashed both times I ran it, after a few minutes. Console output is attached below. Clearly you're seeing better reliability, so perhaps it's a difference in configuration options.
- The error both times was the same: "invalid memory address or nil pointer dereference"
- The localhost:8080/swagger/index.html API browser is a great way to test and play.
- The only option for /chain/header/byHeight is "application/json", great for single headers, inefficient for larger counts.
- There appears to be no headers-by-merkle-root API? This is currently required by the TSC merkle proof standard?
Here is how Pulse appears to stack up against Chaintracks based on this initial look.
Note that Chaintracks is designed to be integrated into applications to provide fast, in memory access to header data with minimal space and network traffic.
It appears that Pulse is intended to be accessed by HTTP API in client / server mode only. While this is supported by Chaintracks, we expect it to be a rare use case: the fundamental Bitcoin security proposition is control of, and trust in, the header data itself, which for Chaintracks means that local, non-server access is better.
Note also that Chaintracks stores headers in binary form to prioritize space efficiency. The Chaintracks APIs support both binary and hex string formats. Pulse appears to put a lower priority on space efficiency, using a string / JSON based schema and API only.
Note that Chaintracks partitions headers into Bulk vs Live. Bulk headers are treated as a strict array of serialized 80 byte binary values indexed by height only. Live headers are stored in a database and support reorgs. We feel this design scales into the future while the monolithic all-headers-in-one-database approach becomes more burdened every year. There is simply no reason to store Chainwork or fields to support reorgs for ancient headers. Stepping back, the purpose of each header is to report the merkle root at a given height. All the other bytes are there just to support computing and validating hashes to confirm the validity of the merkle root values.
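The "strict array indexed by height" idea can be sketched in a few lines of Go. This is an illustration under stated assumptions, not the Chaintracks implementation: the function names `headerAt`, `merkleRootAt`, and `bulkSize` are hypothetical, and the bulk data is assumed to be one contiguous byte slice starting at height zero.

```go
package main

import (
	"errors"
	"fmt"
)

// headerSize is the fixed length of a serialized block header in bytes.
const headerSize = 80

// headerAt returns the 80-byte header at the given height from a flat
// bulk array. No index is needed: the offset is simply height * 80.
func headerAt(bulk []byte, height int) ([]byte, error) {
	start := height * headerSize
	if height < 0 || start+headerSize > len(bulk) {
		return nil, errors.New("height outside bulk range")
	}
	return bulk[start : start+headerSize], nil
}

// merkleRootAt extracts the merkle root, which occupies bytes 36..67
// of the serialized header (after the version and previous hash).
func merkleRootAt(bulk []byte, height int) ([32]byte, error) {
	var root [32]byte
	h, err := headerAt(bulk, height)
	if err != nil {
		return root, err
	}
	copy(root[:], h[36:68])
	return root, nil
}

// bulkSize returns the bytes needed to hold heights 0 through tip.
func bulkSize(tip int) int {
	return (tip + 1) * headerSize
}

func main() {
	// Three zeroed placeholder headers with one marker byte in the
	// merkle root field of the header at height 2.
	bulk := make([]byte, 3*headerSize)
	bulk[2*headerSize+36] = 0xab
	root, err := merkleRootAt(bulk, 2)
	fmt.Println(root[0] == 0xab, err) // prints "true <nil>"
	fmt.Println(bulkSize(788596))     // prints "63087760"
}
```

Using the synced height 788596 that appears in the console output in this document, raw headers alone come to about 63 MB if every header were held in bulk form, with no per-header fields for chainwork or reorg handling.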
Client | Sync Time | Restart Time | Local Bulk Size | Local Live Size | Events | Notes |
---|---|---|---|---|---|---|
Pulse | 200 secs | 200 secs | n/a | 342 MB | no? | Restart time is probably wrong. All headers kept in SQL database. |
Chaintracks | 90 secs | 1 sec | 125 MB | 1 MB | yes | Full local headers copy. New header & reorg events. |
ChaintracksClient | 8 secs | 8 secs | n/a | 1 MB | yes | Remote bulk, local live headers (RAM or disk). 2000 live headers. |
ChaintracksClient | 2.5 secs | 2.5 secs | n/a | 1 MB | yes | Remote bulk, local live headers (RAM or disk). 200 live headers. |
ChaintracksServiceClient | 0 secs | 0 secs | n/a | n/a | no | HTTP access to trusted Chaintracks instance. |
Unless noted otherwise, the metrics are for mainNet with 2000 live headers maintained to handle reorgs. In the case of the ChaintracksClient, both the Sync Time and Restart Time are linearly dependent on the live header count, so the metrics for 200 live headers are also listed.
Note that in all configurations, live headers can be stored on disk or in memory, and in either a multi-user or a single-user SQL database. Again, the live header database can easily be run in memory due to its small size (typically less than 1 MB).
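As a rough reading of the two ChaintracksClient rows, assuming the linear relationship holds exactly between them (only two data points are given, so the figures are indicative at best), with $n$ the live header count and $t(n)$ the sync time in seconds:

$$
t(n) \approx t_0 + k\,n, \qquad k = \frac{8 - 2.5}{2000 - 200} \approx 0.0031\ \text{s per live header}, \qquad t_0 = 2.5 - 200\,k \approx 1.9\ \text{s}
$$

Under that assumption, roughly 3 ms is spent per live header on top of a fixed cost of about 2 seconds.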
The "Sync Time" metric refers to the first startup synchronization time from header zero.
The "Restart Time" metric is the time to restart the client and synchronize after an initial sync.
Note that the 125 MB currently required for a full local copy of bulk headers includes 60 MB of index files, which should not be necessary. These files support finding bulk headers by merkle root and header hash values. There is no reason for merkle proofs to require these indices. The standard should require that header height be used when it is known.
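A minimal Go sketch of why height is sufficient: when a proof carries the block height, the verifier recomputes the merkle root from the transaction hash and compares it with the root stored in the header at that height, with no merkle root or header hash index involved. The names `computeRoot` and `verifyAtHeight` are hypothetical, all hashes are assumed to be in internal (not display) byte order, and details of the TSC proof format such as duplicate-node markers are deliberately left out.

```go
package main

import (
	"bytes"
	"crypto/sha256"
	"fmt"
)

// doubleSHA256 is the hash used for Bitcoin merkle tree nodes.
func doubleSHA256(b []byte) [32]byte {
	first := sha256.Sum256(b)
	return sha256.Sum256(first[:])
}

// computeRoot folds a leaf hash up the tree using its sibling hashes.
// index is the leaf position; its low bit at each level says whether
// the current node is the left (0) or right (1) child.
func computeRoot(leaf [32]byte, index uint32, siblings [][32]byte) [32]byte {
	node := leaf
	for _, sib := range siblings {
		var pair [64]byte
		if index&1 == 0 {
			copy(pair[:32], node[:])
			copy(pair[32:], sib[:])
		} else {
			copy(pair[:32], sib[:])
			copy(pair[32:], node[:])
		}
		node = doubleSHA256(pair[:])
		index >>= 1
	}
	return node
}

// verifyAtHeight compares the computed root with bytes 36..67 of the
// 80-byte header stored at the given height in a flat bulk array.
func verifyAtHeight(bulk []byte, height int, leaf [32]byte, index uint32, siblings [][32]byte) bool {
	start := height * 80
	if height < 0 || start+80 > len(bulk) {
		return false
	}
	root := computeRoot(leaf, index, siblings)
	return bytes.Equal(root[:], bulk[start+36:start+68])
}

func main() {
	// Build a toy two-leaf tree and place its root in the header at height 1.
	a := doubleSHA256([]byte("tx a"))
	b := doubleSHA256([]byte("tx b"))
	root := doubleSHA256(append(a[:], b[:]...))
	bulk := make([]byte, 2*80)
	copy(bulk[80+36:], root[:])

	fmt.Println(verifyAtHeight(bulk, 1, b, 1, [][32]byte{a})) // prints "true"
	fmt.Println(verifyAtHeight(bulk, 0, b, 1, [][32]byte{a})) // prints "false"
}
```

The header lookup here is a single offset calculation, which is the reason the index files would become unnecessary if proofs always carried the height.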
Console output from the first crash:
2023-04-21 11:14:38.973 [INF] HEADERS: Reached the final checkpoint -- switching to normal mode
panic: runtime error: invalid memory address or nil pointer dereference
[signal 0xc0000005 code=0x0 addr=0x0 pc=0x121dcb9]
goroutine 89 [running]:
github.com/libsv/bitcoin-hc/transports/p2p/p2psync.(*SyncManager).handleHeadersMsg(0xc000313ae0, 0xc0004d59a0)
D:/github/libsv/pulse/transports/p2p/p2psync/manager.go:600 +0xad9
github.com/libsv/bitcoin-hc/transports/p2p/p2psync.(*SyncManager).blockHandler(0xc000313ae0)
D:/github/libsv/pulse/transports/p2p/p2psync/manager.go:756 +0x172
created by github.com/libsv/bitcoin-hc/transports/p2p/p2psync.(*SyncManager).Start
D:/github/libsv/pulse/transports/p2p/p2psync/manager.go:851 +0xca
exit status 2
Console output from the second crash:
2023-04-21 11:28:40.198 [INF] HEADERS: [Headers] received headers count: 0
2023-04-21 11:28:40.650 [INF] HEADERS: [Headers] handleInvMsg, peer.ID: 3
2023-04-21 11:28:40.651 [INF] HEADERS: [Manager] handleInvMsg lastHeaderNode.height : 788596
2023-04-21 11:28:41.911 [INF] HEADERS: [Headers] handleInvMsg, peer.ID: 3
2023-04-21 11:28:41.913 [INF] HEADERS: [Manager] handleInvMsg lastHeaderNode.height : 788596
2023-04-21 11:28:42.098 [INF] HEADERS: [Server] query
2023-04-21 11:28:42.098 [INF] HEADERS: [Server] query
2023-04-21 11:28:42.390 [INF] HEADERS: [Headers] handleInvMsg, peer.ID: 3
2023-04-21 11:28:42.391 [INF] HEADERS: [Manager] handleInvMsg lastHeaderNode.height : 788596
2023-04-21 11:28:42.399 [INF] HEADERS: [Headers] received headers count: 1
2023-04-21 11:28:42.400 [INF] HEADERS: [Manager] Synced height: 788596
2023-04-21 11:28:42.401 [INF] HEADERS: Reached the final checkpoint -- switching to normal mode
panic: runtime error: invalid memory address or nil pointer dereference
[signal 0xc0000005 code=0x0 addr=0x0 pc=0x9cdcb9]
goroutine 58 [running]:
github.com/libsv/bitcoin-hc/transports/p2p/p2psync.(*SyncManager).handleHeadersMsg(0xc000594be0, 0xc0004b21c0)
D:/github/libsv/pulse/transports/p2p/p2psync/manager.go:600 +0xad9
github.com/libsv/bitcoin-hc/transports/p2p/p2psync.(*SyncManager).blockHandler(0xc000594be0)
D:/github/libsv/pulse/transports/p2p/p2psync/manager.go:756 +0x172
created by github.com/libsv/bitcoin-hc/transports/p2p/p2psync.(*SyncManager).Start
D:/github/libsv/pulse/transports/p2p/p2psync/manager.go:851 +0xca
exit status 2
PS D:\github\libsv\pulse>
I failed to point out that Chaintracks uses the CDN model to obtain seed data to accelerate local bulk header copy creation. We don't build the seed data into the repository as Pulse currently does. This allows the data to be updated continuously without requiring application updates and allows for competing services to provide the data. We intend to monetize bulk header data access in the near term to make sure sustainable patterns of use emerge.