This is not a serious proposal at this point, more of a structured brainstorming.
- Title: Supporting Incremental BEP-30 Adoption With Mapped Torrents
- BEP: TBD
- Version: TBD
- Author: Jeremy Banks <_@jeremy.ca>
- Status: Just a Thought
- Type: Standards Track
- Content-Type: text/markdown
- Last-Modified: February 4, 2017
- Created: February 2, 2017
BEP-30's Merkle tree torrents are an elegant and efficient optimization to the torrent protocol. However, they are currently not widely supported. They have a bit of a catch-22: a small fraction of clients support them, so no widely-used torrents are created using them, so there's no benefit to a typical user of a client supporting them, so there's no incentive for client developers to do the work necessary to support them, so a small fraction of clients support them...
We propose a backwards-compatible extension of BEP-30 which defines an isomorphic mapping between traditional torrents and our mapped merkle torrents. Downloaders will be able to fall back from a mapped merkle torrent to the corresponding traditional torrent if there are no BEP-30-supporting peers, or if they need to get the full piece list to verify data from non-torrent sources (such as a Web Seed).
Seeders will be able to serve the data to BEP-30 and BEP-3 peers simultaneously. This could reduce the barriers to incremental BEP-30 adoption.
Given a traditional torrent info dictionary, let its equivalent merkle torrent's info dictionary be obtained by:
- removing the `pieces`.
- adding a `root hash` calculated using the leaf hashes from `pieces` as described in BEP 30.
- adding a `traditional infohash` with the 20-byte infohash of the original torrent, using the existing `piece size`.
(Note that no actual content data is required to obtain the equivalent merkle torrent; it is derived entirely from the existing info metadata.)
Peers downloading the merkle torrent should verify that the `traditional infohash` field matches once they have finished downloading and have all of the necessary piece hashes, displaying an error if it does not.
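
The mapping and the completion check described above can be sketched in Python. This is a rough illustration, not a reference implementation. It assumes the BEP 30 convention of padding the leaf row to a power of two with 20 zero bytes and hashing each pair of children with SHA-1. The function names (`bencode`, `merkle_root`, `to_mapped_merkle_info`, `verify_mapped_info`) are hypothetical, and the bencoder is a minimal one written only for this example.

```python
import hashlib


def bencode(value):
    """Minimal bencoder for ints, bytes, str, lists and dicts (illustrative only)."""
    if isinstance(value, bool):
        raise TypeError("bencode has no boolean type")
    if isinstance(value, int):
        return b"i%de" % value
    if isinstance(value, str):
        value = value.encode("utf-8")
    if isinstance(value, bytes):
        return b"%d:%s" % (len(value), value)
    if isinstance(value, list):
        return b"l" + b"".join(bencode(v) for v in value) + b"e"
    if isinstance(value, dict):
        # Dictionary keys are byte strings, emitted in sorted order.
        items = sorted(
            ((k.encode("utf-8") if isinstance(k, str) else k, v) for k, v in value.items()),
            key=lambda kv: kv[0],
        )
        return b"d" + b"".join(bencode(k) + bencode(v) for k, v in items) + b"e"
    raise TypeError("cannot bencode %r" % type(value))


def merkle_root(leaves):
    """Root of a binary SHA-1 tree over 20-byte leaf hashes.

    Assumption: missing leaves are filled with 20 zero bytes, as in BEP 30.
    """
    width = 1
    while width < len(leaves):
        width *= 2
    level = list(leaves) + [b"\x00" * 20] * (width - len(leaves))
    while len(level) > 1:
        level = [
            hashlib.sha1(level[i] + level[i + 1]).digest()
            for i in range(0, len(level), 2)
        ]
    return level[0]


def to_mapped_merkle_info(info):
    """Derive the mapped merkle info dictionary from a traditional one."""
    pieces = info["pieces"]
    leaves = [pieces[i:i + 20] for i in range(0, len(pieces), 20)]
    mapped = {k: v for k, v in info.items() if k != "pieces"}
    mapped["root hash"] = merkle_root(leaves)
    mapped["traditional infohash"] = hashlib.sha1(bencode(info)).digest()
    return mapped


def verify_mapped_info(mapped, piece_hashes):
    """Check a full list of piece hashes against both fields of a mapped info dict."""
    if merkle_root(piece_hashes) != mapped["root hash"]:
        return False
    traditional = {
        k: v for k, v in mapped.items()
        if k not in ("root hash", "traditional infohash")
    }
    traditional["pieces"] = b"".join(piece_hashes)
    return hashlib.sha1(bencode(traditional)).digest() == mapped["traditional infohash"]


# Usage with made-up metadata: three one-byte pieces.
hashes = [hashlib.sha1(x).digest() for x in (b"a", b"b", b"c")]
info = {"name": "example.bin", "length": 3, "piece length": 1, "pieces": b"".join(hashes)}
mapped = to_mapped_merkle_info(info)
print(verify_mapped_info(mapped, hashes))
```

Because the derivation only reads the info dictionary, a seeder of the traditional torrent could compute `mapped` without touching the content data.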
When a merkle torrent is being generated, the client MUST also calculate the infohash of the equivalent traditional torrent (with all of the same info, but with `pieces` instead of a `root hash`), and include this in the merkle torrent's info dictionary under the `traditional infohash` key.
Any peers who are seeding any traditional torrents or mapped merkle torrents (as described above) should participate in the swarms for both the traditional and mapped merkle versions of the torrent. That means they should accept connection requests using either infohash (responding in the appropriate mode for each), and if they're participating in the DHT they should announce their participation in both swarms.
Clients may want to use a shared connection cap across both swarms, but are recommended to reserve a fraction for each swarm (if peers are available) to ensure connectivity.
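
One way such a shared cap with per-swarm reservations might look is sketched below. The function name, the 25% default reservation, and the choice to hand spare slots to the traditional swarm first are all assumptions made for illustration; this proposal does not specify any of them.

```python
def allocate_connections(cap, traditional_available, merkle_available,
                         reserve_fraction=0.25):
    """Split a shared connection cap into (traditional_slots, merkle_slots).

    Each swarm first gets up to `reserve_fraction` of the cap, limited by how
    many peers it actually has. Spare slots then go to the traditional swarm
    first (an arbitrary choice for this sketch), then to the merkle swarm.
    """
    if not 0 <= reserve_fraction <= 0.5:
        raise ValueError("reserve_fraction must be between 0 and 0.5")
    reserve = int(cap * reserve_fraction)
    traditional = min(traditional_available, reserve)
    merkle = min(merkle_available, reserve)
    spare = cap - traditional - merkle
    extra = min(traditional_available - traditional, spare)
    traditional += extra
    spare -= extra
    merkle += min(merkle_available - merkle, spare)
    return traditional, merkle


# Plenty of traditional peers, a single merkle peer, ten slots in total.
print(allocate_connections(10, 50, 1))
```

The reservation only matters when one swarm could otherwise fill the whole cap; if a swarm has no peers, its reserved slots are released to the other.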
Clients downloading traditional torrents SHOULD determine the equivalent merkle torrent and attempt to participate in that swarm as well.
Clients downloading merkle torrents MAY use the traditional infohash, DHT, and PEX to attempt to participate in the traditional swarm, perhaps if there are no merkle peers.
If you have the option of communicating with a peer using the traditional or merkle swarm, and you both have all piece hashes, the traditional torrent should be preferred because the extra hash overhead of the merkle swarm is unnecessary.
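
The preference rule above might be expressed as a small selection function. The `Swarm` enum and `choose_swarm` are hypothetical names. Falling back to the merkle swarm when either side lacks the full hash list is an assumption of this sketch, made because that side still needs hashes delivered alongside the pieces.

```python
from enum import Enum


class Swarm(Enum):
    TRADITIONAL = "traditional"
    MERKLE = "merkle"


def choose_swarm(shared_swarms, we_have_all_hashes, peer_has_all_hashes):
    """Pick which swarm to use with a peer, or None if no swarm is shared."""
    shared = set(shared_swarms)
    if not shared:
        return None
    if len(shared) == 1:
        return next(iter(shared))
    # Both swarms are possible: avoid merkle hash overhead when nobody needs it.
    if we_have_all_hashes and peer_has_all_hashes:
        return Swarm.TRADITIONAL
    # Assumption: otherwise stay on the merkle swarm so hashes travel with pieces.
    return Swarm.MERKLE


both = {Swarm.TRADITIONAL, Swarm.MERKLE}
print(choose_swarm(both, True, True))
```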
For backwards compatibility, magnet links should continue to specify the traditional torrent infohash first as the `xt=`. The infohash of the merkle torrent should be specified as `xt.bepTBD=`. Obtained metadata must be verified against both of these hashes (which is possible for the merkle torrents because they now include the original torrent infohash, and possible for traditional torrents because the merkle torrent metadata can be derived from their metadata).
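
As a rough sketch of the magnet link layout, the following builds and parses a link carrying both infohashes. Only the parameter names `xt` and `xt.bepTBD` come from this proposal. Using the `urn:btih:` prefix with hex encoding for the `xt.bepTBD` value is an assumption, and `build_magnet` and `parse_magnet` are hypothetical helper names.

```python
from urllib.parse import parse_qs, urlsplit

PREFIX = "urn:btih:"  # assumed for xt.bepTBD as well; not specified by this draft


def build_magnet(traditional_infohash, merkle_infohash):
    """Build a magnet link with the traditional infohash first."""
    return (
        "magnet:?xt=" + PREFIX + traditional_infohash.hex()
        + "&xt.bepTBD=" + PREFIX + merkle_infohash.hex()
    )


def parse_magnet(uri):
    """Return (traditional_infohash, merkle_infohash); either may be None."""
    query = parse_qs(urlsplit(uri).query)

    def decode(key):
        values = query.get(key)
        if not values:
            return None
        if not values[0].startswith(PREFIX):
            raise ValueError("unexpected value for %s" % key)
        return bytes.fromhex(values[0][len(PREFIX):])

    return decode("xt"), decode("xt.bepTBD")


# Made-up 20-byte infohashes for illustration.
link = build_magnet(bytes(range(20)), bytes(range(20, 40)))
print(link)
print(parse_magnet(link))
```

A client that does not understand `xt.bepTBD` would presumably ignore it and use only `xt`, which is what keeps the link backwards compatible.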
Peers should declare support for `FR_iso` in their extension handshake if, as suggested here, they may participate in the swarms for both versions of the torrent. This is only possible if they know all of the piece hashes, so it only applies if they've completed downloading or if they're using the conventional torrent (where they'll be downloading the metadata first).
When performing BEP-11 peer exchange (PEX), clients that are known to be active on both swarms can be exchanged with peers on both swarms, even if they've only been seen in one of them.
Clients should not declare this extension if they intend to never participate in the other swarm.
NOTE: maybe we should require support for this being enabled later, and only declare it once we actually have the metadata or IDs to participate in both swarms
The different torrents need to use the same piece size. You need to weigh the potential advantages of smaller sizes for your merkle torrent against the potentially significant bloat it would add to the traditional torrent metadata.
This relies on the DHT for finding peers for the alternate swarm -- trackers are not reused for the other swarm because we don't know if they're compatible. (NOTE: We could consider specifying a way of declaring trackers that can be used with both, in the torrent file, but I'm not sure that it would be worth it.)
None! I may implement one later this year.
This document has been placed in the public domain.