Let's say I've downloaded a big file using a torrent. Then I add a very small file, like a subtitle, and create a new torrent file.
Now the two torrent files are completely different as far as the machine is concerned. The tracker and the torrent client treat them as different torrents. Of course, we don't need to duplicate the original data file to seed both. But seeders and leechers are split between the two torrent files. They don't know that they have the exact same file, so the torrent client and tracker cannot connect people who hold the exact same data. We have a split share pool for the exact same file. That's inefficient: more seeders means more speed.
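To make the split concrete, here is a minimal sketch of why the two torrents end up in different swarms: a torrent is identified by the SHA-1 of its bencoded `info` dictionary, so adding one small file changes the identifier even though the piece hashes of the original data stay the same. The helper names (`bencode`, `make_info`, `info_hash`), the file contents, and the piece size are made up for illustration.

```python
import hashlib

def bencode(value):
    """Minimal bencode encoder for ints, strings/bytes, lists and dicts."""
    if isinstance(value, int):
        return b"i%de" % value
    if isinstance(value, str):
        value = value.encode("utf-8")
    if isinstance(value, bytes):
        return b"%d:%s" % (len(value), value)
    if isinstance(value, list):
        return b"l" + b"".join(bencode(v) for v in value) + b"e"
    if isinstance(value, dict):
        # Dictionary keys are byte strings and must be emitted in sorted order.
        items = sorted(
            ((k.encode("utf-8") if isinstance(k, str) else k, v) for k, v in value.items()),
            key=lambda kv: kv[0],
        )
        return b"d" + b"".join(bencode(k) + bencode(v) for k, v in items) + b"e"
    raise TypeError(f"cannot bencode {type(value).__name__}")

def make_info(name, files, piece_length=16384):
    """Build a multi-file info dictionary from (path, content) pairs (illustrative only)."""
    data = b"".join(content for _, content in files)
    pieces = b"".join(
        hashlib.sha1(data[i:i + piece_length]).digest()
        for i in range(0, len(data), piece_length)
    )
    return {
        "name": name,
        "piece length": piece_length,
        "pieces": pieces,
        "files": [{"path": [path], "length": len(content)} for path, content in files],
    }

def info_hash(info):
    """The identifier trackers and DHT use: SHA-1 of the bencoded info dictionary."""
    return hashlib.sha1(bencode(info)).hexdigest()

big_file = b"\x00" * 32768           # stands in for the big original file
subtitle = b"1\n00:00:01 --> hi\n"   # stands in for the small added file

info1 = make_info("movie", [("file1", big_file)])
info2 = make_info("movie", [("file1", big_file), ("file2", subtitle)])

same_data = info1["pieces"] == info2["pieces"][:len(info1["pieces"])]
print("info-hash 1:", info_hash(info1))
print("info-hash 2:", info_hash(info2))
print("original pieces identical:", same_data)
```

In this example the big file happens to end exactly on a piece boundary, so its piece hashes are byte-for-byte identical in both torrents, yet the two info-hashes differ and the peers never meet.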
Let's say the original torrent file is `1.torrent`.
[ file1 ]
Now I add a file and make a new torrent file, `2.torrent`, that looks like this:
[
[ file1 ] => This is 1.torrent.
+ file2
] => This is 2.torrent
Another person gets `2.torrent` and maybe creates a new torrent file based on it. So we get `3.torrent`.
[
[
[ file1 ] => This is 1.torrent.
+ file2
] => This is 2.torrent
+ file3
] => This is 3.torrent
So if you have `3.torrent`, you are in the same share pool as the `1.torrent` and `2.torrent` people.
What if there is a `4.torrent`, a `5.torrent`, or more in the near future?
We could query a torrent search engine, or use DHT or PEX:
"Please give me the list of torrents based on `3.torrent`."
If there is an interesting new torrent, we can upgrade `3.torrent` -> `X.torrent`. We don't need to touch the local files. Only the added files will be downloaded.
If you know source code management tools like `git`, this idea is basically 'a git repository in one torrent file'.
git init
make 1.torrent
git commit
make 2.torrent
git commit
...
- A torrent file can contain another torrent file.
- We can keep the seeder/leecher pool as big as possible. Don't split us if we have the exact same contents.
- If there are other torrents based on a particular torrent, we can discover them.
Those are the key points.
How can this idea be made real? Is it possible?
I like it! Here are my two bits.
TL;DR
NB: Torrent files are already called "metainfo files", so this becomes very 'meta'.
Technical stuff
Despite my dislike of legacy, this could easily be added to the BitTorrent protocol. As a reminder, the `info` dictionary in a Torrent file has the fields `name`, `piece length`, `pieces`, and the `files` list, which points to dictionaries that each have a `path` and a `length`.
Referencing the meta
Any `path` key could easily reference a Torrent URL instead of the filename, which can be retrieved when the referencing chain ends. URLs offer various ways of specifying the reference, including magnet links. This addresses @the8472's concern about "multi-way merge", and @hansent's comment.
A special value of `length` could work as a flag, e.g. zero, if that does not break the current implementation in clients. The list type of the `path` key could also be abused to add more info about the referenced file, like its hash.
Preserving directory structure
The only thing that may be tricky to specify is directory chains. Let's consider two files:
Then `1.torrent` downloads `fileone` in new directory `one/subone/`, but what about `filezero`? To keep consistency, I believe the `name` of the referenced file should be disregarded, or maybe forced to match the updated file's `name`.
Pieces hashes
Using a null `length` for references can help to get around the `pieces` field: do not include the hashes of the reference, since they will be retrieved from the referenced file. This way, the length of `pieces` remains twenty bytes per piece, that is, twenty times the sum of the `length` fields divided by the `piece length`, rounded up.
Further ideas
I mentioned I like this idea: this is because it got me thinking. Maybe the refinement of `git` branching would be overkill, but I definitely appreciate what "versioned Torrents" could mean.
I believe the referencing idea could be taken further with some resolving mechanism. This would help avoid loops and double downloads.
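A rough sketch of what such a resolving mechanism might look like, under the assumptions proposed in this answer (none of this exists in the protocol today): an entry in `files` with `length` zero is treated as a reference, and the first element of its `path` names the referenced torrent. The `catalog` dictionary is a hypothetical stand-in for actually fetching the referenced metainfo by URL or magnet link.

```python
def resolve(ref, catalog, seen=None):
    """Return the concrete (path, length) files reachable from `ref`.

    References are followed depth-first. Each torrent is expanded at most
    once, which prevents both reference loops and double downloads.
    """
    if seen is None:
        seen = set()
    if ref in seen:
        return []  # already expanded earlier, or part of a loop
    seen.add(ref)

    files = []
    for entry in catalog[ref]["files"]:
        if entry["length"] == 0:
            # Hypothetical convention: zero length marks a reference entry.
            files.extend(resolve(entry["path"][0], catalog, seen))
        else:
            files.append(("/".join(entry["path"]), entry["length"]))
    return files

# Hypothetical metainfo store keyed by reference; real clients would fetch these.
catalog = {
    "1.torrent": {"files": [{"path": ["file1"], "length": 1000}]},
    "2.torrent": {"files": [{"path": ["1.torrent"], "length": 0},
                            {"path": ["file2"], "length": 20}]},
    "3.torrent": {"files": [{"path": ["2.torrent"], "length": 0},
                            {"path": ["1.torrent"], "length": 0},  # duplicate reference
                            {"path": ["file3"], "length": 30}]},
    # Two torrents that reference each other, to exercise loop handling.
    "a.torrent": {"files": [{"path": ["b.torrent"], "length": 0},
                            {"path": ["filea"], "length": 1}]},
    "b.torrent": {"files": [{"path": ["a.torrent"], "length": 0},
                            {"path": ["fileb"], "length": 2}]},
}

print(resolve("3.torrent", catalog))
print(resolve("a.torrent", catalog))
```

Sharing one `seen` set across the whole walk is the simplest design choice here: it treats a second reference to an already-expanded torrent as a no-op, so `file1` is listed once even though `3.torrent` reaches it by two routes.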
The "upgrade" mechanism must be included in clients: when starting `1.torrent`, the client needs to know that `0.torrent` is already finished, using download history, crawling the download directory, et cetera.
Based on my naïve sketch above, the "search" function will likely become a heavy map-reduce job, because a Torrent does not know which Torrent files reference it. Maybe there is a way to make back-referencing easier with some clever hashing.
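One way to picture the back-referencing problem is as a reverse index that a search engine (or some DHT-like service) would have to maintain. This is only a sketch of the idea, not a proposal for how it would be distributed: `forward_refs` is a hypothetical table mapping each torrent to the torrents it references, and the function names are invented for illustration.

```python
from collections import defaultdict, deque

def build_backrefs(forward_refs):
    """Invert 'torrent -> torrents it references' into 'torrent -> torrents referencing it'."""
    backrefs = defaultdict(set)
    for torrent_id, refs in forward_refs.items():
        for ref in refs:
            backrefs[ref].add(torrent_id)
    return backrefs

def based_on(torrent_id, backrefs):
    """All torrents that build on `torrent_id`, directly or through a chain."""
    found = set()
    queue = deque([torrent_id])
    while queue:
        current = queue.popleft()
        for child in backrefs.get(current, ()):
            if child not in found:
                found.add(child)
                queue.append(child)
    return found

# Hypothetical forward references, following the numbering used in the question.
forward_refs = {
    "1.torrent": [],
    "2.torrent": ["1.torrent"],
    "3.torrent": ["2.torrent"],
    "4.torrent": ["3.torrent"],
}

backrefs = build_backrefs(forward_refs)
print(sorted(based_on("3.torrent", backrefs)))
print(sorted(based_on("1.torrent", backrefs)))
```

Building the index still requires seeing every torrent once, which is the "heavy" part; after that, answering "give me the torrents based on `3.torrent`" is a cheap graph walk.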
A trailing path could be appended to the referenced URL in order to designate only a specific file from the torrent, which would address @pips-'s concern about removing files: just reference all files but the ones you want removed. However, this would create problems if files are not aligned on the piece size.
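The alignment problem can be illustrated with a little arithmetic. Pieces are cut from the concatenation of all files, so a file that does not start on a piece boundary shares a piece with its neighbour, and that piece's hash depends on both files. The sketch below uses made-up file sizes and a 16 KiB piece size, and the function names are hypothetical.

```python
import math

def piece_spans(files, piece_length):
    """For each (path, length) file, return (path, first_piece, last_piece, aligned)."""
    spans = []
    offset = 0
    for path, length in files:
        end = offset + length
        first = offset // piece_length
        last = (end - 1) // piece_length if length else first
        aligned = offset % piece_length == 0  # does the file start on a piece boundary?
        spans.append((path, first, last, aligned))
        offset = end
    return spans

def pieces_field_length(files, piece_length):
    """Size in bytes of the `pieces` string: one 20-byte SHA-1 per piece."""
    total = sum(length for _, length in files)
    return 20 * math.ceil(total / piece_length)

files = [("file1", 40000), ("file2", 100)]  # illustrative sizes
piece_length = 16384

for span in piece_spans(files, piece_length):
    print(span)
print("pieces field length:", pieces_field_length(files, piece_length))
```

Here `file1` occupies pieces 0 to 2 and `file2` starts inside piece 2, so the last piece hash of a torrent containing only `file1` would not match piece 2 of the combined torrent, and a client could not verify that piece from the referenced metainfo alone.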