
@kazcw
Last active Aug 29, 2015
Squashing redundant tx data in blocks on the wire

Sparse blocks

  • exchange priority policies in peer introductions
  • assign unique sequential IDs to transactions in the order they were inved (per peer); receiving a getdata for a tx updates last-known-peer-received to cover all invs up to the one referenced
  • include ID-last-received, last-known-peer-received in sparse block
  • reference txes in the sparse block by their index in the receiver's prioritization, computed over the peer's sent invs up to ID-last-received and the sender's prior invs up to last-known-peer-received
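
As a sketch, the per-peer bookkeeping for the ID scheme above might look like this (class and field names are hypothetical, not from any existing implementation):

```python
class PeerInvState:
    """Tracks sequential tx IDs assigned per peer, in inv order."""

    def __init__(self):
        self.next_id = 0
        self.id_by_txid = {}           # txid -> sequential ID assigned at inv time
        self.txid_by_id = {}           # reverse map, for resolving a gettx by ID
        self.last_known_received = -1  # highest ID the peer is known to have seen

    def on_inv_sent(self, txid):
        """Assign the next sequential ID when we inv this tx to the peer."""
        seq = self.next_id
        self.next_id += 1
        self.id_by_txid[txid] = seq
        self.txid_by_id[seq] = txid
        return seq

    def on_getdata(self, txid):
        """A getdata for a tx implies the peer received every inv up to it,
        since the p2p stream is ordered."""
        seq = self.id_by_txid[txid]
        self.last_known_received = max(self.last_known_received, seq)
```

Because the transport is ordered, a getdata referencing inv N is evidence the peer has seen invs 0..N, which is what lets both sides agree on the set being prioritized.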

Messages

  • sparseblock
  • invack: specifies latest inv received, sent when many invs have not resulted in any getdata
  • gettx: like getdata, but referencing a tx by its 4-byte sequential ID instead of its 32-byte hash, saving 28 bytes per tx
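
The 28-byte saving is just the difference between a full 32-byte tx hash and a 4-byte sequential ID; a minimal illustration (encodings hypothetical, not a wire spec):

```python
import struct

def encode_getdata_ref(txid32: bytes) -> bytes:
    """A getdata references a tx by its full 32-byte hash."""
    assert len(txid32) == 32
    return txid32

def encode_gettx_ref(seq_id: int) -> bytes:
    """A gettx could reference the same tx by a 4-byte little-endian ID."""
    return struct.pack("<I", seq_id)

saving = len(encode_getdata_ref(b"\x00" * 32)) - len(encode_gettx_ref(7))
# saving per tx: 32 - 4 = 28 bytes
```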

Ordering Policy

It seems important that ordering policies can be specified in as much detail as possible.

Parameters

  • total inputs
  • total outputs
  • bytes
  • coin days destroyed
  • net UTXO size change
  • sigops
  • is data carrier
  • is output raw multisig
  • age in mempool
  • number of mutually known double spends of this tx
  • generation depth (see "ultra fast block validation" below)
  • what else?

This parameter set should be extensible to allow for unforeseen future factors.

Combining parameters

Ordering policies should allow arbitrary algebraic combinations of their parameters, as well as thresholds. Boolean combinations of sub-policies would also be desirable. This could be implemented with a tx-script-like stack-based language, in which each supported tx property is pushed onto the stack by a particular opcode, and +, -, *, /, min, max, and boolean operators combine them to yield the sort key.
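
A minimal sketch of such a stack evaluator, assuming hypothetical opcode names and a dict of precomputed tx properties:

```python
# Binary operators combine the top two stack values.
OPS = {
    "+": lambda a, b: a + b,
    "-": lambda a, b: a - b,
    "*": lambda a, b: a * b,
    "/": lambda a, b: a / b,
    "min": min,
    "max": max,
    "and": lambda a, b: bool(a and b),
    "or": lambda a, b: bool(a or b),
}

def eval_policy(script, tx):
    """script: list of tokens; tx: dict of properties, e.g. 'bytes', 'sigops'.
    Returns the final stack value, used as the sort key."""
    stack = []
    for tok in script:
        if isinstance(tok, (int, float)):
            stack.append(tok)           # literal constant
        elif tok in OPS:
            b, a = stack.pop(), stack.pop()
            stack.append(OPS[tok](a, b))
        else:
            stack.append(tx[tok])       # property opcode, e.g. "sigops"
    return stack[-1]

# e.g. a key that prices a sigop at 50 bytes: bytes + 50 * sigops
key = eval_policy(["bytes", 50, "sigops", "*", "+"],
                  {"bytes": 250, "sigops": 2})
# key == 350
```

For a relay policy (discussed below), the same evaluator would simply have its final stack value interpreted in a boolean context.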

Difficult parameters

  • Coin-days-destroyed: changes over time, so peers need agreement on when (if ever) it is recalculated. Probably it can simply never be recalculated, but peers still need agreement on "time seen" to compute CDD.
  • Age in mempool: seems intractable in terms of wall-clock time, but could be done easily in terms of "how many txes old is this sequential ID"
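
Measured that way, age is just a difference of sequential IDs, which both peers can compute identically; a trivial sketch (names hypothetical):

```python
def mempool_age_in_txes(tx_seq_id: int, latest_seq_id: int) -> int:
    """Age measured in txes rather than seconds: how many sequential IDs
    have been assigned since this tx was inved. Unlike wall-clock age,
    both peers derive the same value from the shared inv stream."""
    return latest_seq_id - tx_seq_id

# A tx inved as ID 120, when the latest assigned ID is 150, is "30 txes old".
```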

Dealing with heterogeneous mempools

One potential pitfall: this allows for an environment of completely heterogeneous mempool policies. I think that's a good thing, but we need to avoid a situation where only least-common-denominator transactions make it farther than a hop or two, and we don't want nodes to have a strong preference for connecting to like-minded peers since clustering reduces overall connectivity. It may be worthwhile to add a parallel mechanism for relay policies, to differentiate between what a node would keep in its mempool vs. what it wouldn't even relay and doesn't want to see at all. Relay policies could be specified just like prioritization policies, but with the final stack value evaluated in a boolean context.

Policies in coinbase

An interesting additional use of policy-scripts would be a standardized way for miners to include a policy script in a coinbase, giving miners a mechanism to advertise things like their relative price of sigops vs bytes. Nodes may then choose to take this information into account in order to optimize their mempool policies for likelihood of consistency with future blocks. Since policy scripts provide only relative information on prices of different transaction properties rather than an absolute fee, this should not allow miners to "vote fees up", although care would need to be taken that they could not drive up prices by claiming common transaction types are at the high end of the fee scale.

Ultra-fast block validation

Increasing the network efficiency of transmitting previously received transactions in a block increases the relative benefit of reducing the processing time of such transactions, which currently takes my node 40-300 ms (timing ConnectBlock) even for blocks in which all the signatures have already been verified.

One way we could do this is to organize mempool transactions into "generations", where a generation-0 tx has no dependencies on unconfirmed transactions and a generation-N tx has no dependencies on transactions of generation later than N-1. Validation of a newly received block could then be done by simply iterating through the generations, for each adding its consumed inputs to a set to ensure they're unique and checking that any dependencies are included in outputs provided by previous generations. Since transactions are checked for double-spending against mempool transactions on acceptance into the mempool, the vast majority of transactions (under normal circumstances), for which no competing double-spends have been received, could be omitted from the inputs-consumed set-building process entirely.
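
A simplified sketch of the generation walk (it omits the optimization of skipping set-building for txs with no known competing double-spends; types are hypothetical):

```python
from collections import namedtuple

# inputs: list of (prev_txid, vout) outpoints. A generation-0 tx spends only
# confirmed outputs; a generation-N tx depends only on generations < N.
Tx = namedtuple("Tx", ["txid", "inputs", "generation"])

def quick_validate(block_txs, confirmed_utxos):
    """Iterate generations in order, checking that inputs are consumed at
    most once and that in-block dependencies were provided by an earlier
    generation."""
    consumed = set()
    in_block = set()  # txids already processed (i.e. earlier generations)
    for tx in sorted(block_txs, key=lambda t: t.generation):
        for outpoint in tx.inputs:
            if outpoint in consumed:
                return False          # double spend within the block
            consumed.add(outpoint)
            prev_txid, _vout = outpoint
            if outpoint not in confirmed_utxos and prev_txid not in in_block:
                return False          # dependency not yet provided
        in_block.add(tx.txid)
    return True
```

Within a single generation the order doesn't matter, since by construction no tx depends on another tx of the same generation.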

Since this degree of change to such a critical process as block validation has risks, we could still perform the current validation before actually accepting the block as valid (at least until the code has been in use long enough that we're confident it can't differ from the current results), but use the very fast validation to determine whether to relay. The worst-case scenario, where a block has been constructed that exploits a bug to pass the quick validation without passing the current validation, would enable an attacker only to increase nodes' DoS scores by the severity of the failure during full validation. A node detecting a difference between the results of its validators could automatically set a persistent flag disabling quick-validation until it is patched.
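
The two-validator arrangement might be wired up roughly like this (function names are placeholders; the validators and relay hook are assumed callables):

```python
def process_block(block, quick_validate, full_validate, relay, state):
    """Relay early on the quick path, but only accept on full validation;
    any disagreement persistently disables the quick path."""
    quick_ok = quick_validate(block) if state.get("quick_enabled", True) else None
    if quick_ok:
        relay(block)                      # fast path: relay before full check
    full_ok = full_validate(block)        # acceptance still gated on this
    if quick_ok is not None and quick_ok != full_ok:
        state["quick_enabled"] = False    # persistent flag until patched
    return full_ok
```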

Roadmap

| progress  | type      | description                         |
| --------- | --------- | ----------------------------------- |
| implement | primitive | sequential IDs                      |
| push      | feature   | gettx                               |
| implement | primitive | priority-scripts                    |
| push      | feature   | relay policies                      |
| implement | primitive | invack                              |
| prototype |           | squashblock based on sequential-IDs |
| push      | feature   | prioritization-based squashblock    |
| implement | primitive | coinbase policies                   |
| push      | feature   | dynamic mempool policies            |