@loreanvictor
Last active October 29, 2024 07:50
Interaction as Content

Can We Get More Decentralised Than The Fediverse?

I guess that the fediverse will be as decentralised as email: a bit, but not that much. Most people will be dependent on a few major hubs, some groups might have their own hubs (e.g. company email servers), personal instances will be pretty rare. This is in contrast to personal blogging, where every Bob can easily host their own (and they often do). I mean that's already implied by the name: fediverse is a federated universe, not a distributed one.

Why does this matter? Well I like not being dependent on one entity, but I would like it much more if I was dependent on no entities at all. In other words, I'd like to publish my own personal blog and get all the goodies of a social network, without being dependent on other micro-blogging / social content platforms.

So in this writing, I'm going to:

  • ❓ Contemplate why the fediverse ends up federated rather than distributed (spoilers: it's push vs pull)
  • 🧠 Ideate on how we could get a distributed social system (spoilers: by extending RSS)
  • 🛠️ Reflect on how that would look in practice (spoilers: kinda weird, but I think doable?)

Push vs Pull

Ok first, what do I mean by saying "the fediverse is federated not distributed" or "it's not decentralised enough"? Well I see three levels of decentralisation (relevant here):

  • 🏦 Fully central, i.e. one center (e.g. twitter servers)
  • 🇪🇺 Federated, i.e. multiple centers (e.g. the fediverse, email servers)
  • 🏴‍☠️ Distributed, i.e. no centers (e.g. personal blogging)

Why does the fediverse lean towards the second? Because it is a push-based model: you need to push your content to whoever is interested, instead of just making it available for interested people to pull on their own. It is the same as email, where you (or your email server) need to deliver each email to all recipients (by talking to each of their email servers). Those email servers also need to recognise and trust you, which makes the whole network even more federated.

💡 Example

Assume Bob wants to post something, and Alice, Carol and Malorey would like to read it. In the fediverse (or any push-based system), the following happens:

Bob posts, then:
Bob --[notifies]--> Alice.
Bob --[notifies]--> Carol.
Bob --[notifies]--> Malorey.

In a pull-based system, like personal blogging with RSS feeds, this happens instead:

Bob posts, then:
Alice   --[queries]--> Bob.
Carol   --[queries]--> Bob.
Malorey --[queries]--> Bob.

👆 In the pull-based system, more total work is required (when should Alice query Bob? Bob also needs to respond to each query, though that's super easy since the responses are static), but the work is better distributed, lowering the maximum amount of work any one participant has to do (in this case, Bob). That means fewer resources are needed to participate, which means more decentralised participation.

Also trust plays a role here: in a push-based system, Bob needs to be allowed to notify Alice, Carol and Malorey, which further restricts free-form participation. In a pull-based system though, Bob doesn't even know about Alice, Carol and Malorey, meaning anyone can participate more freely.
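
To make the difference in who does the work a bit more concrete, here is a toy Python sketch. It only counts the requests each party has to initiate (and keep a recipient list for); answering a static query is treated as nearly free, as argued above. The function names are made up for illustration and are not part of any protocol.

```python
from collections import Counter

def push_requests(author, followers):
    """Push model: the author initiates one delivery per follower."""
    sent = Counter()
    for _follower in followers:
        sent[author] += 1  # the author must know and contact every follower
    return sent

def pull_requests(author, followers):
    """Pull model: every follower initiates a single query to the author."""
    sent = Counter()
    for follower in followers:
        sent[follower] += 1  # the author only answers, and needs no follower list
    return sent

followers = ["Alice", "Carol", "Malorey"]
push = push_requests("Bob", followers)
pull = pull_requests("Bob", followers)

print(max(push.values()))  # 3: Bob initiates everything
print(max(pull.values()))  # 1: nobody initiates more than one request
```

With three followers the gap is small, but in this simplified model the most-loaded participant's outgoing work grows with the audience under push and stays constant under pull.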


Pros & Cons

Ok before getting to a solution for a pull-based (and subsequently, more decentralised) social networking solution, I'd like to take a moment to consider all the pros and cons of the two approaches. We can do that without considering particulars of solutions and protocols, since the essential differences are all about the push vs pull content distribution model.

🏴‍☠️ Pull: More Decentralised

As mentioned above, making content available for interested parties to pull needs way fewer resources than pushing your content onto them (either they do the work, or you do it for them). It also requires less trust and gatekeeping, so anyone can easily participate with their own nodes, servers, CDNs, whatever.

⚙️ Pull: Granular Access Control

In a push-based protocol, the protocol itself needs to have some concept of who can push what to whom, meaning anything built on top of it needs to conform to that design (e.g. ActivityPub defines concepts of blocking, accepting follow requests, etc.).

A pull-based system doesn't need to think about access control at all. Anyone can do whatever weird form of access control they want on the content they've made available. You can publish some of your activity to some public feed while publishing some others to some more private feed with friends or co-workers access.

⚡ Push: Realtime

It's kind of obvious: if content isn't pushed, it is not circulated as fast (e.g. in realtime). This might be ok for some stuff, and not for others (direct messaging kind of loses its meaning in a pull-based system, for example).

↕️ Push: Native Model of Two-way Interactions

A push-based system is all about two-way interactions: X pushes something onto Y. A pull-based system breaks that down to individual interactions: X posts something, Y pulls something.

Because push models two-way interactions, it works much better for content circulation that can be modelled as a two-way interaction. For example, if Alice comments on Bob's post, in a push-based system that is the same mechanism as Bob posting something and notifying Alice, just in the other direction: Alice posts the comment and notifies Bob. In a pull-based system though, Bob needs to query everyone he knows who might have said something, to check whether what they've said is a comment on his post or not. Which is orders of magnitude more difficult.

🔍 Content Discovery

Beyond content delivery that can be modelled as two-way interactions (e.g. comments, quotes, etc), both designs are lacking in the content discovery area in a broader sense, and in both cases you'd need to have third-party aggregators / crawlers / search services for that, similar to what search engines do for the distributed world of web pages.

While kind of independent, such discovery is an essential part of any such social network (a social network without explore, recommendation, tags, communities, etc. is just a messaging service). Any solution for this discovery issue will naturally fill in the discovery gaps of pull vs push based systems.

In other words, if we were to practically build a pull-based system, we'd need some aggregators / search providers, which would also tell Bob who has reacted to his post, though in a push-based system Bob wouldn't be dependent on these fellas to get the answer to that question.


Interaction as Content

Assuming all those trade-offs are worth the benefits of a pull-based system, what would it look like? Well the best place to start is RSS, since it is the de facto standard for syndicating and circulating content in a pull-based design:

  • It's been iterated upon and polished for that specific purpose,
  • It has tons of tools and clients already (RSS readers, etc),
  • A ton of content already in circulation supports RSS (Youtube, Reddit, Medium, most podcasts, most personal blogs and news outlets, etc).

What is missing here? Well social media are generally successful mostly by lowering the barriers of content creation, an important part of which is making it super easy to create content through interacting with some other existing content.

We can bring that into RSS by treating any interaction as content. If you post something, that's an entry in your feed (as before). If you comment on something, that's also an entry in your feed. If you like something, that's another entry in your feed. If you follow someone (which would mean subscribing to some RSS feed), that's also another entry in your feed. To mark the interactive nature of a feed entry, we can simply extend RSS a bit:

<item>
    <title>Comment on "Exploring New Technologies"</title>
    <link>http://www.my.blog/posts/456</link>
    <description>This is bullshit man, you've missed a ton of nuance in this analysis.</description>
    <pubDate>Wed, 21 Feb 2024 14:34:56 GMT</pubDate>
    <guid isPermaLink="true">http://www.my.blog/posts/456</guid>
    <social:context type="comment" url="http://www.other.blog/posts/123">
        <item>
            <title>Exploring New Technologies</title>
            <link>http://www.other.blog/posts/123</link>
            <guid isPermaLink="true">http://www.other.blog/posts/123</guid>
            <pubDate>Wed, 21 Feb 2024 12:34:56 GMT</pubDate>
        </item>
    </social:context>
</item>

For easier discussion, I'll refer to this schematic extension as RISS (think of it as Really Intuitive Social Syndication, or any other acronym of your liking).
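
As a rough illustration of how a reader could consume such entries, the Python sketch below scans one feed for items whose social context points at a given post. It rests on assumptions the text does not make: the `social` prefix is bound to a made-up namespace URI (`SOCIAL_NS`), the context element is shown without its nested item for brevity, and `find_reactions` is a hypothetical helper name.

```python
import xml.etree.ElementTree as ET

# Hypothetical namespace URI; the text never defines one for the "social" prefix.
SOCIAL_NS = "https://example.org/riss"

FEED = f"""<rss version="2.0" xmlns:social="{SOCIAL_NS}">
  <channel>
    <title>My Blog</title>
    <item>
      <title>Comment on "Exploring New Technologies"</title>
      <link>http://www.my.blog/posts/456</link>
      <social:context type="comment" url="http://www.other.blog/posts/123" />
    </item>
    <item>
      <title>An unrelated post</title>
      <link>http://www.my.blog/posts/457</link>
    </item>
  </channel>
</rss>"""

def find_reactions(feed_xml, target_url):
    """Return (type, link) for each top-level item reacting to target_url."""
    root = ET.fromstring(feed_xml)
    reactions = []
    for item in root.findall("channel/item"):
        context = item.find(f"{{{SOCIAL_NS}}}context")
        if context is not None and context.get("url") == target_url:
            reactions.append((context.get("type"), item.findtext("link")))
    return reactions

print(find_reactions(FEED, "http://www.other.blog/posts/123"))
# [('comment', 'http://www.my.blog/posts/456')]
```

Plain posts without a social context are simply skipped, so an ordinary RSS feed would pass through such a reader unchanged.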


Back to Reality

Ok that's cool and all, but would it really make sense to build products and platforms around such a protocol, if it existed? Would such products and platforms provide tangible user benefits? I think so, though I'm not sure to what extent.

✨ Anything, Anywhere, All at Once

The most immediate benefit will be that users can get access to a lot of social content all in one place. At a basic level, this is like a nice RSS reader where you get all your news, with added engagement of being able to interact with the content.

At a deeper level though, this means you can find almost everything in one place. Most content streams on the internet support RSS (YouTube, Medium, Reddit, podcasts, etc.). Producing RSS feeds is also relatively cheap, so content not supporting it can also be cheaply bridged. Top that with a nice search / aggregator, and you've effectively made the borders between various communities disappear for your users (I no longer follow someone on YouTube only to miss their content on Twitch. I can follow them everywhere from one place).

👁️‍🗨️ Separation of Speech and Reach

This benefit hinges on adoption so is not immediate, and might not be that great either. But, with such a model, publishing is completely separated from distribution, meaning no one can bar anyone from publishing and their direct subscribers receiving their content (except the ISPs?). However, anyone can refuse to help distribute anything they don't like, as this does not in any way hinder the publishing of said content, and there is no exclusivity on distribution either.

In contrast, in a centralised system, publishing and distribution are entangled, and distribution is done exclusively by the central platform operator, meaning their choosing "not to promote" is borderline the same as "not allowing to be published". Even in a federated system, a server might decide they don't want to allow me to push content to my followers on that server anymore, effectively cutting off access.

Now I know people are going to complain regardless, but I do feel this separation is important for regulating such online spaces. Furthermore, I think such neat separation plays a great role in the financials of content generation as well, the same way that the distribution that led to anyone having their own website accessible through search engines also led to new, more open monetisation models (that are of course not without their flaws).


Long Story Short,

It actually might be possible to get more decentralised than the fediverse, via a simple extension on RSS. It might not be worth it since there will be sacrifices, but there will also be gains, so it might. And the end result might be a faster growing decentralised network, as it can already incorporate much more popular content and creators, with a much lower barrier to entry and a cleaner separation of concerns and responsibilities.

I'm personally pretty busy right now, but when I get time, I think I will start exploring the potential of RISS a bit more.

@ConsciousCode

ConsciousCode commented Mar 2, 2024

@loreanvictor

@ConsciousCode I am sorry I couldn't fully understand the proposed solution. do you mean linking feeds of friends and people I follow in my own feed? in that case I agree, and thats already what I'm proposing with RISS: any social activity becomes some content, with some additional social context linking to the original feed.

I was talking about a static updatable list (e.g. a big JSON file listing URIs of feeds you want to promote) but you're right that it could just as easily be determined from the feed's history, with fewer moving parts. As for my other suggestion, I'm basically thinking of how this could be applied to a forum-esque setting, where OP creates a thread and people respond. Your proposal makes that easy enough, people add some kind of response item to their own feed and relevant parties pull from it, but if I wanted to see everyone who responded to my post I would need to scan the entire network to see who to pull, or else use an aggregator which does that linking for me (which you mention, but undermines decentralization). Instead, my suggestion is an additional push layer for events - no authentication required, just some kind of event which says "The RISS feed X just replied to you" which can then be objectively verified by pulling from it. Theoretically anyone could push such notifications to you (hence the need for verification), but in the typical case it would be part of the application layer which curates your feed. This could be used for other events as well, such as mentions, edits, etc. For instance:

  1. Bob publishes to feed B
  2. Alice pulls from feed B
  3. Alice responds to B on feed A
  4. Alice's RISS application pushes a notice to B, "<A> commented on your feed"
  5. Bob verifies the notice by pulling from A and seeing if Alice really did comment
  6. Based on some internal moderation policy, Bob optionally publishes a link on B saying Alice commented via A

Though this is potentially a lot of network traffic. If Carol now wants to view the full thread, they need to pull from Bob and additionally pull from every feed it links to saying someone commented. Otherwise, if feeds cached the content of events published to them, Bob could change or fabricate comments banking on users not checking what the original commenter's feed actually said (if the link is even real in the first place). I've seen threads which can contain thousands of posts and nearly as many authors. Even if we assume the client is smart enough to only query each feed once (for multiple comments from the same feed), you would need an aggregator to batch them all; otherwise that's thousands of individual requests for a single thread. Maybe it could be alleviated by lazy loading, though. Also, since a push requires a pull for verification, it could potentially risk DDoS - you would generally want to ensure an unverified push can't initiate a disproportionately large pull request. But this is all musings about scale; it probably isn't relevant for a first draft, so take it with a grain of salt.
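
The notify-then-verify steps above could be sketched roughly as follows in Python. Everything here is an assumption made for illustration: the namespace URI, the shape of the notice (a dict with `feed` and `target` keys), the `fetch` callable standing in for an HTTP GET, and the size cap that limits how much an unverified push can make the receiver pull.

```python
import xml.etree.ElementTree as ET

SOCIAL_NS = "https://example.org/riss"  # hypothetical namespace URI

def claims_reaction(feed_xml, target_url):
    """True if any top-level item in the feed carries a context for target_url."""
    root = ET.fromstring(feed_xml)
    for item in root.findall("channel/item"):
        ctx = item.find(f"{{{SOCIAL_NS}}}context")
        if ctx is not None and ctx.get("url") == target_url:
            return True
    return False

def verify_notice(notice, fetch, max_bytes=64_000):
    """Trust a pushed notice only if the claimed feed really references the post."""
    body = fetch(notice["feed"])  # stand-in for pulling the feed over HTTP
    if body is None or len(body.encode("utf-8")) > max_bytes:
        return False  # missing feed, or larger than we are willing to process
    try:
        return claims_reaction(body, notice["target"])
    except ET.ParseError:
        return False

ALICE_FEED = f"""<rss version="2.0" xmlns:social="{SOCIAL_NS}"><channel>
<item><link>http://alice.example/posts/1</link>
<social:context type="comment" url="http://bob.example/posts/9" /></item>
</channel></rss>"""

network = {"http://alice.example/feed.xml": ALICE_FEED}  # fake network for the demo
honest = {"feed": "http://alice.example/feed.xml", "target": "http://bob.example/posts/9"}
forged = {"feed": "http://alice.example/feed.xml", "target": "http://bob.example/posts/10"}

print(verify_notice(honest, network.get))  # True
print(verify_notice(forged, network.get))  # False
```

A real implementation would have to enforce the cap while downloading rather than after the fact, and would likely rate-limit notices per sender as well.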

@loreanvictor
Author

@MinchinWeb that's how I came upon this train of thought actually. I'm migrating my blog, and was thinking of adding an automated rss reader, then started contemplating why I should also post stuff I write there on social platforms too, if I want to really "share" them.

@loreanvictor
Author

loreanvictor commented Mar 2, 2024

@ConsciousCode I see what you mean. Your proposed solution will bring all the downsides of a push-based system due to its complexity though, but I think it can be simplified enough, to act as an optional addition (for example to the imaginary base RISS protocol), to enable such discovery.

What if Bob, in his feed, also posts a hub link, which can be called by people interacting with his content, to notify him (well, the hub) of occurrence of the interaction? Bob can then pull his designated hub to know who has interacted with this content. Running such a hub will be more complex than static hosting, but still way easier than a fediverse server (since no trust is required), and can result in a much more decentralised network (basic discovery can be much further distributed).

<item>
    <title>Exploring New Technologies</title>
    <link>http://www.other.blog/posts/123</link>
    <description>My thoughts on new technologies.</description>
    <pubDate>Wed, 21 Feb 2024 12:34:56 GMT</pubDate>
    <guid isPermaLink="true">http://www.other.blog/posts/123</guid>
    <link rel="social:notify" href="http://www.other.blog/hub/notify?ref=posts/123" />
</item>

The notification payload can also simply be the corresponding rss feed item, i.e.:

<item>
    <title>Comment on "Exploring New Technologies"</title>
    <link>http://www.my.blog/posts/456</link>
    <description>This is bullshit man, you've missed a ton of nuance in this analysis.</description>
    <pubDate>Wed, 21 Feb 2024 14:34:56 GMT</pubDate>
    <guid isPermaLink="true">http://www.my.blog/posts/456</guid>
    <social:context type="comment" url="http://www.other.blog/posts/123">
        <item>
            <title>Exploring New Technologies</title>
            <link>http://www.other.blog/posts/123</link>
            <guid isPermaLink="true">http://www.other.blog/posts/123</guid>
            <pubDate>Wed, 21 Feb 2024 12:34:56 GMT</pubDate>
        </item>
    </social:context>
</item>

Additionally, this whole system can become much more realtime through WebSub. For subscribing to "reactions to a post", a hub can be specified, though it might be wise to also separate that from the standard "hub" url on a page. It might additionally be wise to incorporate the feed of social interactions to a post:

<item>
    <title>Exploring New Technologies</title>
    <link>http://www.other.blog/posts/123</link>
    <description>My thoughts on new technologies.</description>
    <pubDate>Wed, 21 Feb 2024 12:34:56 GMT</pubDate>
    <guid isPermaLink="true">http://www.other.blog/posts/123</guid>
    <link rel="social:notify" href="http://www.other.blog/hub/notify?ref=posts/123" />
    <link rel="social:hub" href="http://www.other.blog/hub/subscribe?ref=posts/123" />
    <link rel="social:feed" href="http://www.other.blog/hub/feed?ref=posts/123" />
</item>
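
To give a feel for how small such a hub could be, here is a toy in-memory sketch in Python. The class and method names are hypothetical, there is no HTTP layer, and notifications are stored as opaque item strings; a real hub would presumably want to check each notification against the sender's own feed before trusting it.

```python
from collections import defaultdict

class SocialHub:
    """Toy in-memory hub: anyone may notify, the post's owner pulls later."""

    def __init__(self):
        self._pending = defaultdict(list)  # ref -> list of <item> XML strings

    def notify(self, ref, item_xml):
        """Roughly what a POST to .../hub/notify?ref=<ref> might do."""
        if item_xml not in self._pending[ref]:  # ignore exact duplicates
            self._pending[ref].append(item_xml)

    def feed(self, ref):
        """Roughly what a GET to .../hub/feed?ref=<ref> might return."""
        items = "\n".join(self._pending.get(ref, []))
        return f'<rss version="2.0"><channel>\n{items}\n</channel></rss>'

hub = SocialHub()
comment = "<item><link>http://www.my.blog/posts/456</link></item>"
hub.notify("posts/123", comment)
hub.notify("posts/123", comment)  # a repeated notification is stored only once

print(hub.feed("posts/123").count("<item>"))  # 1
```

Because the hub only collects pointers that can be checked by pulling, it needs no accounts or trust relationships, which is what keeps it simpler than a fediverse server.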

@DeepMac

DeepMac commented Mar 3, 2024

  • I disagree that Pull isn't two-way. The difference between Pushing and Pulling is the triggering. With a push the trigger is determined by creation of the content. With pull it's about scheduled regular checks.
    Indeed but in the pull model, the parties are separated via the content, i.e. I make content available without having to consider who is going to pull it, and a consumer can pull it without having to consider who has made it available.

But they'd still need to choose who/where to pull from.

I understand what you're trying to get at, but you seem to be hung up on the idea of the content. The content is irrelevant. These processes would be the same if all the content was zero-byte files, or text files full of zeroes.

In a good system, you could have both. NNTP was a distributed protocol used to distribute text posts in Usenet groups. You could connect your client to an NNTP server to retrieve existing messages as well as add new ones. Anyone else connecting to that same server could see your content directly. But the NNTP server could also be connected to by other NNTP servers that would "sync" content. Or, it could push the content to them. Both options were required because of the nature of Internet connectivity at the time, i.e. slow, unreliable and expensive.

The protocol should be agnostic to the content and use, other than having infrastructural capabilities that matter like encryption, routing, redundancy, etc. Then it's up to both the client and server software to have controls.

  • Both Push and Pull can have granular access controls. If I send an e-mail to someone (push) then I'm deciding who to send it to, what to send and when. But the recipient can have their own controls on whether to accept it based on content, sender, etc.
    That is correct, perhaps I need better wording for this part. What I meant to say is, when pushing something, I need to pick to whom to push it, meaning the distribution protocol needs to be aware of access control at this stage, and any further access control needs to be built on top of this, potentially (but not necessarily) limiting additional access control (or making it more complex).

Consent is the word you're looking for, :)

In the 1980s and early 1990s no one had to put access controls on their services because no one was abusing them. Then there was abuse, it became more and more significant, and controls became necessary.

@alexgarel

An interesting project about social networking in pull mode is Secure Scuttlebutt: https://scuttlebutt.nz/
It's not as simple as RSS though.

@axodys

axodys commented Mar 4, 2024

Interesting discussion here. FWIW RSS 2.0 does have push support via the optional <cloud> element. https://github.com/rsscloud/rsscloud-server is an example rssCloud server. There's also a PHP-based implementation inspired by the Node.js one: https://github.com/colin-walker/php-rssCloud.

@loreanvictor
Author

loreanvictor commented Mar 5, 2024

@axodys thanks for the mention. wasn't rss cloud replaced by WebSub though?

@axodys

axodys commented Mar 5, 2024 via email

@omarsaad98

No matter how I look at this, there is no avoiding creating a push scheme as an extension of RSS. However, I don't think that's so unreasonable. There's already a push scheme to the publishing software you already use so that you can push to your own feed. All we really need is a standardisation for push without authentication. But setting up replies such that you are required to have your own feed to reply is kind of strange. There are only drawbacks and no advantages except the authentication step and the vague preference for a pull scheme that only enthusiasts understand. Without this requirement however, a feed would very quickly become its own social media and suddenly we're back to the federated universe.

I think that RSS is not capable of solving the requirements of this problem, but maybe a program could? I think that something similar to torrents could solve this problem. A bunch of indexers that keep track of everyone involved in a certain "item". You can connect to the people directly with 2-way communication and collectively store replies. The items along with relevant indexers can be distributed through feeds or links. And the OP can communicate about any changes to the people involved. There are a lot more options in extending something like libtorrent.

@loreanvictor
Author

loreanvictor commented Mar 7, 2024

it’s not about "not having push-based mechanisms at all", it’s about not being primarily push-based: with email or activitypub, content is not published unless pushed (the main “action” of the protocol is pushing). this creates a resource and trust barrier for basic participation, similar to what happened in email (no one wants to be pushed spam).

rss already has extensions (such as websub) that don’t require authentication and handle push. even for social interactions, similar solutions (like the discussed “social hubs”) can be used. these mechanisms are still pull-triggered so they don't require gatekeeping.

p.s. I don’t have any obsession with “pull-based” mechanisms per se, but I do think a distributed network is way more valuable than a federated one, for fairly non-technical reasons. the whole point of the internet, and after that the web, is free flow of information, so any design fragmenting it, which typically emerges from a desire to control, is counter to that goal.

@klartext

When I first heard about the Fediverse, I also had local actors in mind that can connect/peer up to others. It was somewhat disappointing to find out that, again, I need to log in to someone else's server to send an answer or provide my own postings.
OK, I could set up my own mastodon-instance, but I would expect a good working system to also allow being offline for some time.

Maybe things like CRDTs can help to make things better:
https://en.wikipedia.org/wiki/Conflict-free_replicated_data_type
https://www.inkandswitch.com/local-first/

@loreanvictor
Author

I also had local actors in mind that can connect/peer up to others, when first hearing about the Fediverse. It was somewhat disappointing, when finding out, that again I need to login into someone else's server to send an answer or provide my own postings.

indeed.

Maybe things like CRDT's can help to make things better: https://en.wikipedia.org/wiki/Conflict-free_replicated_data_type https://www.inkandswitch.com/local-first/

perhaps I misunderstood this point @klartext , but as far as I can tell, consistency isn't a particular issue in distributed social networking? for example only one party "writes" any given content so content itself is always consistent, and timelines don't need to be ordinally consistent to make use of consistency models and tools (such as CRDTs) necessary?

@klartext

klartext commented Oct 28, 2024

Well, my idea behind this was: if you see a complete thread as the document that is edited (not individual posts/replies), and the posts/replies as the "atomic" entities (instead of characters), then the CRDT mechanisms should work out the distribution of messages (and their order). There must be some order, so that each reply is a descendant of the message it replies to.

This at least is my intuitive idea behind mentioning CRDTs here. I hope it does not fall apart under scrutiny.

@klartext

Skimming through the CRDT article again, I found the Gossip protocol mentioned there. So it looks like there already is something to rely upon.

@loreanvictor
Author

so that replies are descendents of those messages it is a reply to.

imagine we have a conversation thread of three messages, A, B, and C, where the thread looks like this:

A -> B -> C

if everyone knows that B is a reply to A, and C is a reply to B, then everyone can reconstruct this thread in the same form, by virtue of how replies work. so essentially, a single thread is conflict-free already.

now of course you might have the A -> D -> E thread as well, and looking only at the "reply relationship", peers can't agree which of B or D came first. which means the collection of a message and its extended "reply tree" is NOT a CRDT on its own. however practically it doesn't need to be either, since B and D are from separate threads, peers don't really need to agree on their order.
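
a small python sketch of that argument, assuming each message only carries the id of the message it replies to (the function name and the tuple format are made up for illustration): two peers that receive the same replies in different orders still rebuild the same tree. siblings are sorted by id purely for display, since the reply relationship says nothing about whether B or D came first.

```python
from collections import defaultdict

def build_thread(replies):
    """replies: iterable of (message_id, parent_id or None), in any order.
    Assumes the reply relationship contains no cycles."""
    children = defaultdict(set)
    for message, parent in replies:
        children[parent].add(message)

    def subtree(node):
        # sibling order is presentational only (sorted by id, not by time)
        return {child: subtree(child) for child in sorted(children.get(node, ()))}

    return subtree(None)

peer_1 = [("A", None), ("B", "A"), ("C", "B"), ("D", "A"), ("E", "D")]
peer_2 = list(reversed(peer_1))  # same messages, different arrival order

print(build_thread(peer_1) == build_thread(peer_2))  # True
print(build_thread(peer_1))
# {'A': {'B': {'C': {}}, 'D': {'E': {}}}}
```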
