@renchap
Last active June 7, 2024 22:44
@acdha

acdha commented May 1, 2024

Re: solution 6, I was wondering whether you could reuse a fair chunk of the Subresource Integrity spec: the server which publishes an image would declare the hashes for the previews and the source image, which would allow clients and potentially servers to retrieve the same payload from a shared cache or other service.

    "media_attachments": [
        {
            "id": "",
            "type": "image",
            "url": "",
            "preview_url": "",
            "remote_url": "",
            "preview_remote_url": null,
            "text_url": null,
            "meta": {
                "original": {
                    "width": 1012,
                    "height": 1200,
                    "size": "1012x1200",
                    "aspect": 0.8433333333333334,
                    "integrity": "sha256-…"
                },
                "small": {
                    "width": 441,
                    "height": 523,
                    "size": "441x523",
                    "aspect": 0.8432122370936902,
                    "integrity": "sha256-…"
                }
            }
        }
    ]
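
As a rough sketch of how a client might check a payload retrieved from a shared cache against a declared `integrity` value: the Python below follows the Subresource Integrity string format (`<algorithm>-<base64 digest>`). The function names `sri_integrity` and `matches_integrity` are hypothetical and are not part of Mastodon or any existing API.

```python
import base64
import hashlib

# Hash algorithms permitted in Subresource Integrity metadata.
SRI_ALGORITHMS = {"sha256", "sha384", "sha512"}


def sri_integrity(payload: bytes, algorithm: str = "sha256") -> str:
    """Build an SRI-style string: '<algorithm>-<base64 of the raw digest>'."""
    digest = hashlib.new(algorithm, payload).digest()
    return f"{algorithm}-{base64.b64encode(digest).decode('ascii')}"


def matches_integrity(payload: bytes, integrity: str) -> bool:
    """Return True if the payload hashes to the declared integrity value."""
    algorithm, _, _digest = integrity.partition("-")
    if algorithm not in SRI_ALGORITHMS:
        return False  # unknown or unsupported algorithm: treat as a mismatch
    return sri_integrity(payload, algorithm) == integrity
```

A receiving server could compute `sri_integrity(image_bytes)` once at upload time and publish it in `meta`; any party holding a copy can then call `matches_integrity` before serving it.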

One other thing which comes to mind is the multihash format which the IPFS community created. I'm not sure about IPFS in general for this problem (it seems to have performance issues which might be a challenge for the Fediverse), but conceptually this seems to solve a similar problem, except that a Mastodon implementation could likely afford to be more lenient about bypassing the system, since the failure mode is fetching it from the origin server rather than someone's data being irrecoverably lost.
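
To make the lenient failure mode concrete, here is a minimal sketch, assuming sha2-256 multihashes (function code `0x12`, followed by the digest length and the digest). `cache_lookup` and `fetch_origin` are hypothetical callables standing in for a shared cache and the origin server; nothing here reflects an existing Mastodon or IPFS API.

```python
import hashlib
from typing import Callable, Optional

SHA2_256_CODE = 0x12  # multihash function code for sha2-256


def sha256_multihash(payload: bytes) -> bytes:
    """Encode as <function code><digest length><digest>."""
    digest = hashlib.sha256(payload).digest()
    # Both values are below 0x80, so each varint is a single byte.
    return bytes([SHA2_256_CODE, len(digest)]) + digest


def fetch_with_fallback(
    expected: bytes,
    cache_lookup: Callable[[bytes], Optional[bytes]],
    fetch_origin: Callable[[], bytes],
) -> bytes:
    """Use the cached copy only if it matches; otherwise go to the origin."""
    cached = cache_lookup(expected)
    if cached is not None and sha256_multihash(cached) == expected:
        return cached
    # Cache miss or mismatch is not fatal: the origin still has the file.
    return fetch_origin()
```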

@JasonPunyon

JasonPunyon commented May 2, 2024

A big problem here is that the preview will most often not be generated when the status is sent to the federated instances, as the federation happens right after the status creation, but the preview generation is an asynchronous job.

Isn't that what exists today? A status federated to a downstream instance could have no link preview for up to 60 seconds + fetch time.

But in the world of Solution 2 the random wait disappears. If the originating instance can originate a link preview (crawl the link and federate the update) in under 30 seconds, we're ahead of the game. I haven't operated any instances yet; is 30 seconds realistic?

@stefanbohacek

stefanbohacek commented May 2, 2024

How about a combination of solutions 2 and 4: previews are accepted from the origin instance if it is a trusted instance; otherwise, fall back on another solution from the list, or fetch the preview from the linked website?

I think having a configurable list of trusted instances, with mastodon.social and mastodon.online being the default ones, is an idea that can be extended beyond fetching link previews. Things like sharing blocklists, sharing public timelines, etc.

I put together a mockup, shared here.
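
The 2 + 4 combination could be sketched roughly as below. This is only an illustration of the decision, not how Mastodon works today: `choose_preview`, `DEFAULT_TRUSTED`, and the `fetch_locally` callback are hypothetical names, and the preview is modelled as a plain dict.

```python
from typing import Callable, Optional

# Assumed default list, taken from the suggestion above; admins could override it.
DEFAULT_TRUSTED = frozenset({"mastodon.social", "mastodon.online"})


def choose_preview(
    origin_domain: str,
    federated_preview: Optional[dict],
    fetch_locally: Callable[[], Optional[dict]],
    trusted: frozenset = DEFAULT_TRUSTED,
) -> Optional[dict]:
    """Accept the origin's preview only when the origin is trusted."""
    if federated_preview is not None and origin_domain.lower() in trusted:
        return federated_preview
    # Untrusted origin or no preview attached: fall back to another
    # solution, here represented by fetching from the linked website.
    return fetch_locally()
```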

@jenniferplusplus

2 + 4 seems the most viable and effective course of action. Origin servers should include the Open Graph and/or oEmbed data they have. All servers should also have a more nuanced concept of trust than just suspend or allow, and that assessment of trust should inform how to treat the included link preview data (among many other things).

6 would be ideal if it were already done, but coordinating across just the fediverse is difficult enough. Coordinating across the general-purpose World Wide Web to define and implement such a protocol would take years, if it ever succeeded at all.

@edgarogh

edgarogh commented Jun 7, 2024

To add to the 6th suggestion, there is a way for servers to sign their HTTP responses (SXG/webpackage). The originating Mastodon instance could make signed exchange requests to the linked webpage and a subset of its hyperlinked images, and share the entire signed exchange + certificate chain as a special kind of attachment. Receivers can then trust the whole HTTP exchange as long as it is recent enough, and generate a preview based on it, without requesting anything themselves.

Pros:

  • different software can generate the preview however they want, and may not use OpenGraph at all

Cons:

  • an HTTP exchange takes more space than a few specific OG fields. It should probably be treated like an attachment, except that it doesn't have to be stored indefinitely
  • very few websites support SXG/webpackage
  • SXG/webpackage signatures require a specific SSL certificate with a specific extension that few CAs provide (Let's Encrypt doesn't)
  • SXG/webpackage isn't an IETF web standard; it was created by Google as a way to proxy cached websites to Chrome users (making the omnibar show a fake URL) while collecting browsing information from them. We may not want to promote its adoption.
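
For the "recent enough" part, a receiver-side check might look like the sketch below. It covers only the freshness policy and does not verify the signature or certificate chain. `ExchangeValidity` is a hypothetical container for the signature's `date` and `expires` values, and the one-day `max_age` default is an arbitrary assumption that each receiver would choose for itself.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass(frozen=True)
class ExchangeValidity:
    """Hypothetical holder for a signed exchange's validity window."""
    date: datetime     # when the signature was created
    expires: datetime  # when the signature stops being valid


def is_recent_enough(
    validity: ExchangeValidity,
    now: datetime,
    max_age: timedelta = timedelta(days=1),  # receiver policy, an assumption
) -> bool:
    """Accept only exchanges inside their window and younger than max_age."""
    if validity.expires <= validity.date:
        return False  # malformed window
    if now < validity.date or now >= validity.expires:
        return False  # not yet valid, or already expired
    return now - validity.date <= max_age
```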
