@gavinandresen
Last active February 26, 2020 03:14
Bitcoin Transactions "T2" : Metadata

Bitcoin Transaction Metadata Design

This is a proposal for associating arbitrary metadata with Bitcoin transactions.

Popular Bitcoin services have been using various hacks to associate metadata with transactions, like using the amount of transaction outputs to communicate some information (e.g. SatoshiDice looks for a 0.00543210 BTC output to specify a payment address). Encoding metadata this way is inefficient and inelegant, and it bloats the blockchain with extra data.

A general, extensible, backwards-compatible mechanism for associating information with transactions is needed.

This document describes a solution, then discusses it along with other possible solutions.

Transaction Preliminaries

Alice wants to pay Bob, and wants to include some extra information beyond the basic information that is required to validate transactions and put them into the blockchain. Note: I will use "Alice" and "Bob" interchangeably with "software running on behalf of Alice/Bob."

Examples of the types of information she might want to include:

  • An encrypted message that only Bob can read
  • A signature from a third-party service saying that Alice's payment is guaranteed by them against double-spending

Payment starts by Alice securely obtaining at least two pieces of information from Bob: a public key "P" and a pay-to-script "s". It is critical that Alice knows that "P" and "s" come directly from Bob and not a man-in-the-middle attacker; the Bitcoin Payment Messages proposal (https://gist.github.com/4120476) describes a way to do that.

Bob may use the same public key "P" for both messaging and transaction signing ("s" could be simply 'P OP_CHECKSIG').

Transaction Construction

This mechanism for associating immutable transaction metadata without increasing the size of the blockchain was `proposed by Stefan Thomas <https://bitcointalk.org/index.php?topic=108423.msg1178438#msg1178438>`__. This is a variation on that idea.

To construct a "T2" (version 2) transaction that includes metadata, Alice first creates the serialized metadata as a list of key --> value pairs.

Information for Bob is encrypted using his public key "P" and ECIES+AES encryption and is added to the metadata as "P" --> ECIES+AES(message_for_Bob). The encrypted message is, itself, a list of key --> value pairs, and a separate document (TODO) proposes conventions for the keys and values understood by Bitcoin client software.

If the transaction includes outputs for multiple recipients, then encrypted data for each of them may be added to the metadata.

Alice may also add arbitrary key --> value pairs that are public information and can be read by anybody. Perhaps "BOND" --> "bond_issuer_identifier" if this is a smart-property bond transaction, for example.
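
This proposal does not fix a wire format for the metadata. As a minimal sketch, assuming a hypothetical length-prefixed encoding (a 4-byte little-endian length before each key and each value; the function names are illustrative and not part of the proposal), the list of pairs could be serialized like this:

```python
def serialize_metadata(pairs):
    """Serialize an ordered list of (key, value) byte-string pairs.

    Hypothetical format: each key and each value is written as a
    4-byte little-endian length followed by the raw bytes.
    """
    out = bytearray()
    for key, value in pairs:
        for field in (key, value):
            out += len(field).to_bytes(4, "little") + field
    return bytes(out)


def deserialize_metadata(raw):
    """Inverse of serialize_metadata; raises ValueError on malformed input."""
    fields = []
    pos = 0
    while pos < len(raw):
        if pos + 4 > len(raw):
            raise ValueError("truncated length prefix")
        size = int.from_bytes(raw[pos:pos + 4], "little")
        pos += 4
        if pos + size > len(raw):
            raise ValueError("truncated field")
        fields.append(raw[pos:pos + size])
        pos += size
    if len(fields) % 2:
        raise ValueError("key without a value")
    # Pair up the alternating key/value fields, preserving order.
    return list(zip(fields[0::2], fields[1::2]))
```

Whatever format is eventually chosen needs to be byte-exact and order-preserving, because the serialized bytes are what gets hashed.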

Alice then creates a new transaction that is 'version 2' (transactions without metadata are 'version 1'). She selects enough unspent inputs to pay transaction fees and Bob.

One of the transaction's outputs is used to bind the metadata to the transaction, as follows:

  1. Alice adds two entries to the metadata: "K" --> base public key, "N" --> change output index. For example, if the payment to Bob is the first transaction output and the change is the second output, she would add "K" --> ...her change key..., "N" --> "1" (output indexes start at zero).
  2. Alice computes "H", the SHA256 hash of the serialized metadata.
  3. She computes a derived public key K' = K*H (where * is ECC multiplication) and sets output N to pay to that key.
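
The derivation in step 3 can be sketched in plain Python. This is an illustration under stated assumptions, not a reference implementation: it uses textbook affine-coordinate arithmetic on secp256k1 (not constant-time, so unsuitable for real keys), interprets the SHA256 digest as a big-endian integer (the proposal does not say which byte order to use), and needs Python 3.8+ for `pow(x, -1, p)`. All function names are illustrative.

```python
import hashlib

# secp256k1 domain parameters (the curve Bitcoin uses for its keys).
FIELD_P = 2**256 - 2**32 - 977
ORDER_N = 0xFFFFFFFF_FFFFFFFF_FFFFFFFF_FFFFFFFE_BAAEDCE6_AF48A03B_BFD25E8C_D0364141
G = (
    0x79BE667E_F9DCBBAC_55A06295_CE870B07_029BFCDB_2DCE28D9_59F2815B_16F81798,
    0x483ADA77_26A3C465_5DA4FBFC_0E1108A8_FD17B448_A6855419_9C47D08F_FB10D4B8,
)


def point_add(a, b):
    """Add two curve points; None stands for the point at infinity."""
    if a is None:
        return b
    if b is None:
        return a
    x1, y1 = a
    x2, y2 = b
    if x1 == x2 and (y1 + y2) % FIELD_P == 0:
        return None  # a + (-a)
    if a == b:
        slope = 3 * x1 * x1 * pow(2 * y1, -1, FIELD_P)  # tangent line
    else:
        slope = (y2 - y1) * pow((x2 - x1) % FIELD_P, -1, FIELD_P)  # chord
    slope %= FIELD_P
    x3 = (slope * slope - x1 - x2) % FIELD_P
    y3 = (slope * (x1 - x3) - y1) % FIELD_P
    return (x3, y3)


def point_mul(scalar, point):
    """Double-and-add scalar multiplication (not constant-time)."""
    result = None
    addend = point
    while scalar:
        if scalar & 1:
            result = point_add(result, addend)
        addend = point_add(addend, addend)
        scalar >>= 1
    return result


def metadata_hash(serialized_metadata):
    """H: SHA256 of the serialized metadata, read as a big-endian integer (assumed)."""
    return int.from_bytes(hashlib.sha256(serialized_metadata).digest(), "big")


def derive_output_key(base_key, serialized_metadata):
    """K' = K*H, the public key that output N pays to."""
    return point_mul(metadata_hash(serialized_metadata), base_key)
```

If Alice holds the private key k for K (so K = k*G), then K' = (k*H mod n)*G, where n is the group order. She can therefore spend the output later with the private key k*H mod n, provided she keeps the metadata or its hash.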

Transaction Relay/Broadcast

This mechanism for broadcasting a transaction and its associated metadata is inspired by a `comment from ByteCoin <https://bitcointalk.org/index.php?topic=108423.msg1179538#msg1179538>`__, who observed that data relayed across the Bitcoin peer-to-peer network does not necessarily have to be the same data that is stored permanently in the blockchain.

Once Alice has constructed the transaction and metadata, she announces and broadcasts it in the usual way ('inv' and 'tx' messages) to peers that understand version=2 transactions (TODO: define which PROTOCOL_VERSION supports these new 'tx' messages). version=2 transactions are serialized on the network as the transaction data followed immediately by the metadata.

Peers decide whether or not to relay the transaction (and add it and its metadata to the memory pool) using the size of the transaction plus its metadata and the existing rules to prioritize relaying of transactions. Additional rules may be added to prevent attackers from flooding the network or filling up the memory pool with transactions that include lots of metadata (TODO: work out exact policy).
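
To make "the size of the transaction plus its metadata" concrete, here is a rough sketch of a priority calculation in which the metadata counts toward the size. The formula (sum of input value times input age, divided by size) and the free-relay threshold of COIN * 144 / 250 are my recollection of the existing relay rules and should be treated as assumptions; the exact policy is still a TODO, and every name below is hypothetical.

```python
from dataclasses import dataclass

COIN = 100_000_000  # satoshis per BTC
# Assumed threshold above which a transaction is relayed without a fee.
FREE_PRIORITY_THRESHOLD = COIN * 144 / 250


@dataclass
class TxInput:
    value: int          # satoshis being spent
    confirmations: int  # age of the spent output, in blocks


def effective_size(tx_size, metadata_size):
    """Size used for relay decisions: transaction bytes plus metadata bytes."""
    return tx_size + metadata_size


def priority(inputs, tx_size, metadata_size):
    """Coin-age priority, diluted by the metadata the transaction carries."""
    coin_age = sum(i.value * i.confirmations for i in inputs)
    return coin_age / effective_size(tx_size, metadata_size)


def relays_for_free(inputs, tx_size, metadata_size):
    """True if the transaction clears the assumed free-relay threshold."""
    return priority(inputs, tx_size, metadata_size) > FREE_PRIORITY_THRESHOLD
```

Under these assumptions a 200-byte transaction spending a 1 BTC input with 144 confirmations clears the threshold on its own, but the same transaction carrying 1000 bytes of metadata does not.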

Before relaying or adding to the memory pool, peers compute the metadata's hash H and verify that the public key for output N is K*H, where "K" and "N" are the entries Alice added to the metadata. If the key does not match, or the metadata does not contain the "K" and "N" entries, then the transaction shall be considered invalid and the peer that transmitted it shall be subject to anti-denial-of-service measures (such as immediate disconnection).
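
The check a peer performs might look roughly like the sketch below. It assumes the metadata has already been parsed into a mapping holding the "K" and "N" entries from the construction step, and it takes the elliptic-curve multiplication as a parameter (`ec_multiply`, a hypothetical helper) so that the sketch does not depend on any particular ECC library. Reading the digest as a big-endian integer is also an assumption.

```python
import hashlib


def check_metadata_binding(output_pubkeys, metadata, serialized_metadata, ec_multiply):
    """Return True if the metadata is bound to the transaction.

    output_pubkeys      -- public key paid by each output, in output order
    metadata            -- parsed metadata, mapping keys to values (hypothetical form)
    serialized_metadata -- the exact metadata bytes received from the peer
    ec_multiply         -- hypothetical helper: ec_multiply(point, scalar) -> point
    """
    # The metadata must name the base key and the output that carries the hash.
    if "K" not in metadata or "N" not in metadata:
        return False
    try:
        index = int(metadata["N"])
    except (TypeError, ValueError):
        return False
    if not 0 <= index < len(output_pubkeys):
        return False

    # Recompute H from the bytes as received, then compare K*H with output N.
    h = int.from_bytes(hashlib.sha256(serialized_metadata).digest(), "big")
    return ec_multiply(metadata["K"], h) == output_pubkeys[index]
```

A caller that gets False back would reject the transaction and apply whatever anti-denial-of-service measure it uses for the sending peer.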

Miners add just the transaction data to the blocks they create; they have no obligation to store or validate the metadata.

Receiving Payment/Messages

Bob detects payments as usual: he sees a transaction that assigns Bitcoins to "s."

If Bob is monitoring the p2p network when the payment is broadcast, he will also see that Alice has sent him a message about the transaction because his public key is part of the transaction metadata.

Transaction Metadata Storage

If Bob's bitcoin client is not monitoring the p2p network when the payment is broadcast, then ... what ?

The bitcoin client will get the transaction when it gets the block containing the transaction, and will know that it is a version=2 transaction (the version is part of the transaction data stored in the blockchain) and will also know that it is a payment to Bob.

It will not, however, automatically have the transaction metadata. To get it, some mechanism for looking up metadata given a transaction id will need to be developed. TODO: define that mechanism. Storage is cheap but it isn't free.

Note that Bob can independently validate that the metadata has not been modified, by computing K'=K*SHA256(metadata) and checking to see that K' is stored in the blockchain.

Discussion

This design comes from many discussions over the past couple of years debating whether or not it should be easy for users to add arbitrary data to the blockchain.

Design alternative: OP_DROP

The most straightforward way of associating metadata with a transaction is to define a new "standard" transaction output form; perhaps: <data> OP_DROP <pubkey> OP_CHECKSIG

data would be limited to 520 bytes (that is the maximum number of bytes allowed to be pushed onto the evaluation stack), so any solution using OP_DROP would use a hash of the data and still need some mechanism for looking up the full metadata, given the hash.
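
For comparison, a sketch of how such an output script could be assembled, with the SHA256 hash of the metadata standing in for <data>. The opcode values and push encodings are those of Bitcoin script; the function names and the choice to embed a SHA256 digest are assumptions made for illustration.

```python
import hashlib

OP_PUSHDATA1 = 0x4C
OP_PUSHDATA2 = 0x4D
OP_DROP = 0x75
OP_CHECKSIG = 0xAC
MAX_SCRIPT_ELEMENT_SIZE = 520  # largest item that may be pushed onto the stack


def push(data):
    """Encode a data push using the shortest applicable push opcode."""
    size = len(data)
    if size > MAX_SCRIPT_ELEMENT_SIZE:
        raise ValueError("data exceeds the 520-byte push limit")
    if size < OP_PUSHDATA1:
        return bytes([size]) + data                # direct push, 0..75 bytes
    if size <= 0xFF:
        return bytes([OP_PUSHDATA1, size]) + data  # one-byte length
    return bytes([OP_PUSHDATA2]) + size.to_bytes(2, "little") + data


def op_drop_output_script(metadata, pubkey):
    """Build <sha256(metadata)> OP_DROP <pubkey> OP_CHECKSIG."""
    digest = hashlib.sha256(metadata).digest()
    return push(digest) + bytes([OP_DROP]) + push(pubkey) + bytes([OP_CHECKSIG])
```

With a 33-byte compressed public key the resulting script is 69 bytes, all of which would be stored in the blockchain.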

Design alternative: merged mined metadata

TODO

Design alternative: direct, side-channel communication

TODO

Issues

Spam

If Bitcoin clients watch every transaction's metadata for messages and display them to the user, then an attacker could easily spam users with unwanted messages.

However, because messages are attached to transactions, it should be easy for Bitcoin clients to impose reasonable rules to prevent unwanted messages (perhaps "only display messages encrypted with my keys if they are associated with transactions that pay me more than X BTC").
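
The example rule could be expressed as a small client-side filter. This is only a sketch: the `IncomingMessage` type, its fields, and the function name are hypothetical, and a real client would derive them from the transaction and its metadata.

```python
from dataclasses import dataclass

COIN = 100_000_000  # satoshis per BTC


@dataclass
class IncomingMessage:
    encrypted_to_my_key: bool  # the metadata holds an entry under one of my public keys
    amount_paid_to_me: int     # satoshis the transaction pays to my outputs


def should_display(message, min_payment):
    """Show a message only if it is addressed to me and pays more than min_payment."""
    return message.encrypted_to_my_key and message.amount_paid_to_me > min_payment
```

With X set to 0.01 BTC (`min_payment = COIN // 100`), a message attached to a 0.001 BTC payment would be hidden, while one attached to a 1 BTC payment would be shown.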

Privacy

Associating the metadata hash with the "change" output makes it easy to identify payment and change, and may make it easier to link all payments from Alice.

The best fix for this is probably a transaction mixing service, so Alice and Annette and Alex can cooperate to create one transaction paying Bob and Barbara and Bill (and with three separate change outputs), so that an observer cannot tell who paid whom. This metadata design extends to such mixed-transaction use cases.

References

Forum thread on transaction metadata / OP_DROP : https://bitcointalk.org/index.php?topic=108423

ECIES Encryption : http://en.wikipedia.org/wiki/Integrated_Encryption_Scheme
