Link previews should not be generated by each instance
Currently, each Mastodon instance generates its own link preview, using the `LinkCrawlerWorker`.
The preview is generated right after creating the status on the original instance.
When a status is received through ActivityPub, the worker is launched after a random wait, up to 60s.
Link previews are cached locally for each instance, keyed by URL. So another status with the same URL will re-use the cached preview, if available.
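For reference, a rough sketch of that flow in a Rails/Sidekiq context (simplified; helpers like `first_link_in` and `fetch_preview_from` are placeholders, not the actual Mastodon internals):

```ruby
require 'sidekiq'

# Rough sketch of the current behaviour: one crawl per instance, scheduled
# with a random delay for remote statuses, and cached locally keyed by URL.
class LinkCrawlerWorker
  include Sidekiq::Worker

  def perform(status_id)
    status = Status.find(status_id)
    url    = first_link_in(status) # placeholder helper extracting the first URL
    return if url.nil?

    # Per-instance cache keyed by URL: another status with the same URL
    # re-uses the cached preview instead of hitting the site again.
    Rails.cache.fetch("preview_card/#{url}") do
      fetch_preview_from(url) # placeholder helper doing the actual HTTP fetch
    end
  end
end

# When a status arrives over ActivityPub, the crawl is delayed by up to 60s:
LinkCrawlerWorker.perform_in(rand(60).seconds, status.id)
```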
As the Fediverse grows, there are more and more instances. A single status can be federated to more than 1000 instances, generating as many requests to the URL contained in the status (thundering herd problem).
For example, this status (from a 30k follower account) generated more than 3000 hits on this URL.
This is a real issue reported by multiple people. I am starting to see missing previews in my feed, as some servers no longer respond in time to the requests that generate them, and some website owners are even blocking the Mastodon user agent because of this issue.
Rather than generating the preview when the status is first received, we could have a way to generate it when the status is read by a client for the first time.
This could maybe help spread the origin load over a longer period and avoid instances without active users generating a preview, reducing the total of needed queries.
However this may also not work for big accounts, as there is a good chance that their statuses will be read by at least one person on each instance in the minute following the post.
Also we need to define what "seen" means here:
- either it is when the instance first needs to send this preview to a client (because it requests it / it appears in a status list that needs to be sent), in which case generating the preview asynchronously will not work, as we need to send it right away
- or we need the clients to implement a way to fetch a status when it is actually displayed to the user, which (if I am not wrong) does not exist at the moment (a rough sketch of this follows below)
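As a rough illustration of that second interpretation (a hypothetical endpoint, not an existing Mastodon API), the first read would enqueue the crawl and later reads would hit the cache:

```ruby
# Hypothetical endpoint: the client asks for the preview card only when the
# status actually becomes visible. The first such request enqueues the crawl;
# later requests hit the cache. Instances with no active readers never crawl.
class PreviewCardsController < ApplicationController
  def show
    status = Status.find(params[:status_id])
    card   = Rails.cache.read("preview_card/#{status.link_url}") # link_url: placeholder

    if card
      render json: card
    else
      LinkCrawlerWorker.perform_async(status.id)
      head :accepted # 202: preview is being generated, the client may retry later
    end
  end
end
```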
Previews could be fetched only (or mostly) by the instance where the initial status has been created (the origin instance).
Then the preview is sent to the federated instances, included in the status payload.
The receiving instances would not need to generate the preview, only populate their local cache with it.
A big problem here is that the preview will most often not be generated yet when the status is sent to the federated instances, as federation happens right after the status is created, while preview generation is an asynchronous job.
It also means that a malicious instance could generate a wrong preview for this URL, and this preview would be populated in other instances' caches. So if this URL is used again in another (non-malicious) status, the malicious preview from the cache will be used (cache poisoning).
A mitigation for this could be to split the cache into 2:
- a local trusted cache, containing every preview for locally-generated (thus trusted) URLs
- an untrusted cache, containing previews received from other instances, not tied to the URL but to a specific status. This means the preview for this status will come from the origin instance, so if this instance is malicious it will only affect this specific status (see the sketch after this list)
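A minimal sketch of what that two-tier lookup could look like (hypothetical cache keys and helpers):

```ruby
# Hypothetical lookup order for the split cache: locally crawled previews are
# trusted and keyed by URL, previews received over federation are only ever
# used for the specific status they arrived with.
def preview_for(status)
  url = status.link_url # placeholder

  Rails.cache.read("trusted_preview/#{url}") ||                # crawled locally, URL-keyed
    Rails.cache.read("untrusted_preview/status/#{status.id}")  # received via federation, status-keyed
end

def store_received_preview(status, preview)
  # Never keyed by URL, so a malicious origin can only poison its own status.
  Rails.cache.write("untrusted_preview/status/#{status.id}", preview)
end

def store_local_preview(url, preview)
  Rails.cache.write("trusted_preview/#{url}", preview)
end
```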
This is similar to the previous solution, but this time the `LinkCrawlerWorker` will first request the preview from the status' origin instance. If it has already been generated, it will use it; otherwise it will generate it locally using the existing mechanism.
We could also have a special status value to indicate that the preview is being generated (queued) and the request should be retried in a few seconds.
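Roughly, the worker-side flow could look like this (the endpoint path and the `retry_later` / `crawl_locally` helpers are made up for illustration):

```ruby
require 'net/http'
require 'json'
require 'cgi'

# Hypothetical worker-side flow: ask the origin instance for the preview it
# should have generated, and fall back to the existing local crawl otherwise.
def fetch_preview_from_origin(status)
  uri = URI("#{status.origin_base_url}/api/v1/preview_cards?url=#{CGI.escape(status.link_url)}")
  res = Net::HTTP.get_response(uri)

  case res.code.to_i
  when 200 then JSON.parse(res.body)   # the origin already has the preview
  when 202 then retry_later(status)    # "queued": try again in a few seconds
  else          crawl_locally(status)  # fall back to the current local mechanism
  end
end
```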
This solution risks generating a lot of traffic for the origin instance, as many federated instances will request the preview in a short timespan (when the status propagates). The instance would in fact receive the same number of requests that the origin website currently receives, which could overwhelm Mastodon if it is not sized properly. But this can easily be mitigated by caching this endpoint (using the Rails cache or a CDN).
It has the same trust implications as the previous solution, with the same reasoning on whether this really matters.
Give the instance operator the ability to provide a list of "trusted" instances for link preview fetching (`TrustedPreviewSources`).
When a status is received, `FetchLinkCardService` can then call a specific endpoint on a `TrustedPreviewSource` to fetch the preview for this URL.
There are multiple ways of doing this:
- fetch the preview from `n` (2?) sources, use it if they both return the same preview (what does "same" mean?); a sketch of this option follows this list
- fetch from a random source, retry until it gets a preview, and fall back on a local fetch
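A sketch of the first option (hypothetical `fetch_preview` and `crawl_locally` helpers; "same" is interpreted naively here as identical normalized JSON):

```ruby
require 'json'

# Hypothetical: ask `n` trusted sources for the preview of a URL and only use
# it if they all agree. "Same" means identical JSON after sorting the keys,
# which a real implementation would need to define more carefully.
def consensus_preview(url, sources, n: 2)
  previews   = sources.sample(n).map { |source| fetch_preview(source, url) } # fetch_preview: placeholder HTTP call
  normalized = previews.compact.map { |preview| JSON.generate(preview.sort.to_h) }

  return nil unless normalized.size == n && normalized.uniq.size == 1

  previews.first
end

# Fall back to a local crawl if the sources disagree (or are unreachable).
preview = consensus_preview(url, trusted_preview_sources) || crawl_locally(url)
```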
Similar to the previous solution, we could imagine randomly fetching the preview locally as well and checking it against the one from the `TrustedPreviewSource` to see if it has been altered.
There should probably be a list of such default instances in the sample configuration, to encourage instance owners to configure some.
This will probably generate quite a lot of traffic to those trusted instances, but this endpoint can easily be cached (either using the Rails cache, or even better a CDN), as the response will always be the same for a given URL.
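For example, a cached endpoint on a trusted source could be as simple as this sketch (hypothetical controller; `crawl_preview_for` is a placeholder for the existing crawling logic):

```ruby
# Hypothetical preview endpoint on a trusted source: the response for a given
# URL effectively never changes, so it can be cached both server-side and by a
# CDN in front of the instance.
class PreviewsController < ApplicationController
  def show
    preview = Rails.cache.fetch("preview_card/#{params[:url]}", expires_in: 1.day) do
      crawl_preview_for(params[:url]) # placeholder for the existing crawl
    end

    expires_in 1.day, public: true # Cache-Control header, lets a CDN absorb repeated requests
    render json: preview
  end
end
```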
This is similar to the previous solution, but we have a separate service to generate (and cache) the previews. A few community-maintained (and trusted) instances of this service are configured by the instance owner.
It could more easily allow other services needing URL previews to use this service/API/protocol, including other Fediverse software.
Having a separate service for this will probably also help scaling and caching it (very easy to put a CDN in front of it).
This is probably the best long-term solution, but it would require the whole web ecosystem to implement it… I am including it so it is mentioned, but I don't consider it solves the issue, as it will take years to get done.
We design a web protocol for websites to generate their previews (oEmbed extension?) and sign them. Each instance would then only need to fetch the origin public key (static file) and validate that the received or fetched preview (using one of the above solutions) is correctly signed by the origin website.
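As a very rough sketch of the verification side (the key location and signature scheme here are entirely hypothetical):

```ruby
require 'openssl'
require 'net/http'
require 'base64'

# Hypothetical verification flow: the website publishes a public key at a
# well-known location (made-up path) and signs the preview payload it serves.
# Any instance can then check that a preview it received was not altered.
def preview_signed_by_origin?(preview_json, signature_b64, origin_host)
  key_pem = Net::HTTP.get(URI("https://#{origin_host}/.well-known/preview-key.pem"))
  key     = OpenSSL::PKey::RSA.new(key_pem)

  key.verify(OpenSSL::Digest.new('SHA256'), Base64.decode64(signature_b64), preview_json)
end
```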
A few other issues I have in mind:
- The Fediverse is not only Mastodon. Other software will need to be updated to stop originating those preview queries, and we probably want a solution they can implement as well, building on this work
Re: solution 6, I was wondering whether you might be able to reuse a fair chunk of the Subresource Integrity spec by having the server which publishes an image declare the hashes for previews and the source image, which would allow clients and potentially servers to retrieve the same payload from a shared cache or other service.
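As a rough sketch of what I mean, reusing the SRI `sha256-<base64>` notation (the check itself is hypothetical, not part of any existing spec for previews):

```ruby
require 'digest'
require 'base64'

# Hypothetical SRI-style check: the origin declares "sha256-<base64 digest>"
# for the preview (or image) bytes, so anyone fetching them from a shared
# cache can verify they were not tampered with before use.
def matches_integrity?(bytes, integrity)
  algo, expected = integrity.split('-', 2)
  raise ArgumentError, "unsupported algorithm #{algo}" unless algo == 'sha256'

  Base64.strict_encode64(Digest::SHA256.digest(bytes)) == expected
end

matches_integrity?('', 'sha256-47DEQpj8HBSa+/TImW+5JCeuQeRkm5NMpJWZG3hSuFU=') # => true (digest of an empty payload)
```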
One other thing which comes to mind is the multihash format which the IPFS community created. I'm not sure about IPFS in general for this problem – it seems to have performance issues which might be a challenge for the Fediverse – but conceptually it seems to solve a similar problem, except that a Mastodon implementation could likely afford to be more lenient about bypassing the system, since the failure mode is fetching it from the origin server rather than someone's data being irrecoverably lost.