nagydani/ipfs_ethereum.txt

## ipfs_ethereum.txt
In this document, I outline the tasks required for storing and
presenting the Ethereum block chain and Web-based Ethereum Đapps in
ipfs. Currently, ipfs is very good at locating and delivering content
using a global, consistent address space and it has a very well designed
and implemented http gateway. However, Ethereum's use cases require
additional capabilities that ipfs currently does not provide.


Redundancy and persistency

In both important use cases, we need to make sure content is available
under the condition that nodes can come and go. Ipfs, by itself, does
not provide any mechanism to ensure this, though there is a weak
incentive for replication built into their "bitswap" protocol, which
seems not to be implemented completely at this point, with important parts
of the design still not finalized.

Long-term persistency of meaningful pieces of information can be
incentivized by content availability insurance that is largely
independent of the underlying distributed storage solution. The most important
development in this regard is the Swarm Contract at
https://github.com/ethersphere/go-ethereum/blob/bzz-config/bzz/bzzcontract/swarm.sol

However, it is also worth noting that the entire infrastructure for
redundant and secure storage developed for Swarm can be used in the framework
of ipfs thanks to its pluggable hash function. If swarm hash is added
as an application-specific hash function to ipfs and swarm nodes advertize
their content in ipfs DHT, Swarm can serve as a replication infrastructure
to ipfs.


Fair allocation of bandwidth resources

Bitswap defines an API for bandwidth accounting that can be easily extended
to include micropayment transfers to balance otherwise unbalanced bandwidth
use between peers.

The vast majority of these micropayment transactions must happen off the
block chain, otherwise the use of the block chain itself becomes a significant
transaction cost. Such a micropayment mechanism has been developed for Swarm
and can be used as a plug-in for Bitswap as well as for a multitude of other
purposes not even related to storage. The relevant contract code and go API
are availabe at
https://github.com/ethersphere/go-ethereum/tree/bzz-config/common/chequebook


Names and URIs

One design principle of Swarm was to allow for arbitrary names and URIs to
resolve to both static and dynamic content served up by Swarm infrastructure.
Unfortunately, this has not been a design goal for ipfs and in its current form
it does not fulfill it. In particular, static directories with a large number
of entries are handled very inefficiently by ipfs and there is no obvious
way around this limitation.

In practice, it makes it very difficult to migrate content like
Wikipedia to our distributed storage, even though it would have one of
the obvious candidates for a high-profile applications of such an
infrastructure. Similarly problematic would be to implement commonly
used http API's for mapping content, such as OpenStreetMap tiles, on top
of ipfs, which would be another obvious candidate.

I believe that for the success of Web3, it is instrumental to retain as
much compatibility with popular and useful Web 2.0 standards and
services as possible. The URI resolution scheme used by ipfs constitutes
a very severe limitation hampering such efforts.


Decentralization

The design of ipfs provides a common abstraction for both centralized
and decentralized storage solutions so that content can be retrieved
from both using the same software; the consumer of the content does not
even need to be aware of the underlying storage architecture and ipfs
does not specify one. The content can come from a workstation with a
temporary adress, an individual small server, a large datacenter or a
sophisticated content delivery network. As long as the content conforms
to ipfs format and is advertized in ipfs DHT, the consumer will be able
to download it all the same.

Moreover, ipfs solves one of the main problems of the (http(s)-based)
web driving its rapid centralization, which is that the costs of content
distribution borne by the publisher increase with the content's
popularity. Since ipfs content is delivered bittorrent-style, all consumers
automatically contribute their upstream bandwidth towards distribution, at
least for the time of downloading, thus contributing their fair share.

However, as history with Bitcoin shows, enabling decentralization does
not prevent centralization. Economies of scale might result in a
centralization of storage infrastructure; the real question then becomes
to what extent can large players abuse their position.


Censorship resistance

In some ways, ipfs is explicitly censorship-enabling; nodes can decide
what content to store and not to store and they can credibly comply with
take-down notices. At the same time, ipfs also helps keeping content
available for all users as long as there are nodes that are willing to
serve it, although it must be noted that it also helps finding all
such nodes. This might be a workable compromise.

For this, however, to remain the case, it is important that the DHT
remains decentralized. Unfortunately, at present there are no incentives
built into ipfs for running DHT nodes. DHT nodes cannot be excluded for
not responding to queries, because ipfs DHT attaches very little value
to connections. Consumers are not punished for freeloading (only
querying other DHT nodes, but never responding to queries), while a
cartel providing most of the storage service might decide not to keep
outsider addresses in their Kademlia table and yet provide a pleasant
user experience to freeloading consumers. Over time, this might develop
into a problem.
	In this document, I outline the tasks required for storing and
	presenting the Ethereum block chain and Web-based Ethereum Đapps in
	ipfs. Currently, ipfs is very good at locating and delivering content
	using a global, consistent address space and it has a very well designed
	and implemented http gateway. However, Ethereum's use cases require
	additional capabilities that ipfs currently does not provide.


	Redundancy and persistency

	In both important use cases, we need to make sure content is available
	under the condition that nodes can come and go. Ipfs, by itself, does
	not provide any mechanism to ensure this, though there is a weak
	incentive for replication built into their "bitswap" protocol, which
	seems not to be implemented completely at this point, with important parts
	of the design still not finalized.

	Long-term persistency of meaningful pieces of information can be
	incentivized by content availability insurance that is largely
	independent of the underlying distributed storage solution. The most important
	development in this regard is the Swarm Contract at
	https://github.com/ethersphere/go-ethereum/blob/bzz-config/bzz/bzzcontract/swarm.sol

	However, it is also worth noting that the entire infrastructure for
	redundant and secure storage developed for Swarm can be used in the framework
	of ipfs thanks to its pluggable hash function. If swarm hash is added
	as an application-specific hash function to ipfs and swarm nodes advertize
	their content in ipfs DHT, Swarm can serve as a replication infrastructure
	to ipfs.


	Fair allocation of bandwidth resources

	Bitswap defines an API for bandwidth accounting that can be easily extended
	to include micropayment transfers to balance otherwise unbalanced bandwidth
	use between peers.

	The vast majority of these micropayment transactions must happen off the
	block chain, otherwise the use of the block chain itself becomes a significant
	transaction cost. Such a micropayment mechanism has been developed for Swarm
	and can be used as a plug-in for Bitswap as well as for a multitude of other
	purposes not even related to storage. The relevant contract code and go API
	are availabe at
	https://github.com/ethersphere/go-ethereum/tree/bzz-config/common/chequebook


	Names and URIs

	One design principle of Swarm was to allow for arbitrary names and URIs to
	resolve to both static and dynamic content served up by Swarm infrastructure.
	Unfortunately, this has not been a design goal for ipfs and in its current form
	it does not fulfill it. In particular, static directories with a large number
	of entries are handled very inefficiently by ipfs and there is no obvious
	way around this limitation.

	In practice, it makes it very difficult to migrate content like
	Wikipedia to our distributed storage, even though it would have one of
	the obvious candidates for a high-profile applications of such an
	infrastructure. Similarly problematic would be to implement commonly
	used http API's for mapping content, such as OpenStreetMap tiles, on top
	of ipfs, which would be another obvious candidate.

	I believe that for the success of Web3, it is instrumental to retain as
	much compatibility with popular and useful Web 2.0 standards and
	services as possible. The URI resolution scheme used by ipfs constitutes
	a very severe limitation hampering such efforts.


	Decentralization

	The design of ipfs provides a common abstraction for both centralized
	and decentralized storage solutions so that content can be retrieved
	from both using the same software; the consumer of the content does not
	even need to be aware of the underlying storage architecture and ipfs
	does not specify one. The content can come from a workstation with a
	temporary adress, an individual small server, a large datacenter or a
	sophisticated content delivery network. As long as the content conforms
	to ipfs format and is advertized in ipfs DHT, the consumer will be able
	to download it all the same.

	Moreover, ipfs solves one of the main problems of the (http(s)-based)
	web driving its rapid centralization, which is that the costs of content
	distribution borne by the publisher increase with the content's
	popularity. Since ipfs content is delivered bittorrent-style, all consumers
	automatically contribute their upstream bandwidth towards distribution, at
	least for the time of downloading, thus contributing their fair share.

	However, as history with Bitcoin shows, enabling decentralization does
	not prevent centralization. Economies of scale might result in a
	centralization of storage infrastructure; the real question then becomes
	to what extent can large players abuse their position.


	Censorship resistance

	In some ways, ipfs is explicitly censorship-enabling; nodes can decide
	what content to store and not to store and they can credibly comply with
	take-down notices. At the same time, ipfs also helps keeping content
	available for all users as long as there are nodes that are willing to
	serve it, although it must be noted that it also helps finding all
	such nodes. This might be a workable compromise.

	For this, however, to remain the case, it is important that the DHT
	remains decentralized. Unfortunately, at present there are no incentives
	built into ipfs for running DHT nodes. DHT nodes cannot be excluded for
	not responding to queries, because ipfs DHT attaches very little value
	to connections. Consumers are not punished for freeloading (only
	querying other DHT nodes, but never responding to queries), while a
	cartel providing most of the storage service might decide not to keep
	outsider addresses in their Kademlia table and yet provide a pleasant
	user experience to freeloading consumers. Over time, this might develop
	into a problem.