
@kazcw
Last active Aug 29, 2015
Squashing redundant tx data in blocks on the wire

Sparse blocks

  • exchange priority policies in peer introductions
  • assign unique sequential IDs to transactions in the order they were inved (per peer); receiving a getdata for a tx updates last-known-peer-received to cover all invs up to the one referenced
  • include ID-last-received, last-known-peer-received in sparse block
  • reference txes in the sparse block by their index in the receiver's prioritization, computed over the peer's sent invs up to ID-last-received and the sender's prior invs up to last-known-peer-received
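
As a sketch, the per-peer bookkeeping for the ID scheme above might look like this (class and field names are hypothetical, not from any existing implementation):

```python
class PeerInvState:
    """Tracks sequential tx IDs assigned per peer, in inv order."""

    def __init__(self):
        self.next_id = 0
        self.id_by_txid = {}           # txid -> sequential ID assigned at inv time
        self.txid_by_id = {}           # reverse map, for resolving a gettx by ID
        self.last_known_received = -1  # highest ID the peer is known to have seen

    def on_inv_sent(self, txid):
        """Assign the next sequential ID when we inv this tx to the peer."""
        seq = self.next_id
        self.next_id += 1
        self.id_by_txid[txid] = seq
        self.txid_by_id[seq] = txid
        return seq

    def on_getdata(self, txid):
        """A getdata for a tx implies the peer received every inv up to it,
        since the p2p stream is ordered."""
        seq = self.id_by_txid[txid]
        self.last_known_received = max(self.last_known_received, seq)
```

Because the transport is ordered, a getdata referencing inv N is evidence the peer has seen invs 0..N, which is what lets both sides agree on the set being prioritized.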

Messages

  • sparseblock
  • invack: specifies latest inv received, sent when many invs have not resulted in any getdata
  • gettx: like getdata, but referencing a tx by its 4-byte sequential ID instead of its 32-byte hash, saving 28 bytes per tx
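
The 28-byte saving is just the difference between a full 32-byte tx hash and a 4-byte sequential ID; a minimal illustration (encodings hypothetical, not a wire spec):

```python
import struct

def encode_getdata_ref(txid32: bytes) -> bytes:
    """A getdata references a tx by its full 32-byte hash."""
    assert len(txid32) == 32
    return txid32

def encode_gettx_ref(seq_id: int) -> bytes:
    """A gettx could reference the same tx by a 4-byte little-endian ID."""
    return struct.pack("<I", seq_id)

saving = len(encode_getdata_ref(b"\x00" * 32)) - len(encode_gettx_ref(7))
# saving per tx: 32 - 4 = 28 bytes
```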

Ordering Policy

It seems important that ordering policies can be specified in as much detail as possible.

Parameters

  • total inputs
  • total outputs
  • bytes
  • coin days destroyed
  • net UTXO size change
  • sigops
  • is data carrier
  • is output raw multisig
  • age in mempool
  • number of mutually known double spends of this tx
  • generation depth (see "ultra fast block validation" below)
  • what else?

This parameter set should be extensible to allow for unforeseen future factors.

Combining parameters

Ordering policies should allow arbitrary algebraic combinations of their parameters, as well as thresholds. Boolean combinations of sub-policies would also be desirable. This could be implemented with a tx-script-like stack-based language, in which each supported tx property is pushed onto the stack by a particular opcode, and +, -, *, /, min, max, and boolean operators combine them to yield the sort key.
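
A minimal sketch of such a stack evaluator, assuming hypothetical opcode names and a dict of precomputed tx properties:

```python
# Binary operators combine the top two stack values.
OPS = {
    "+": lambda a, b: a + b,
    "-": lambda a, b: a - b,
    "*": lambda a, b: a * b,
    "/": lambda a, b: a / b,
    "min": min,
    "max": max,
    "and": lambda a, b: bool(a and b),
    "or": lambda a, b: bool(a or b),
}

def eval_policy(script, tx):
    """script: list of tokens; tx: dict of properties, e.g. 'bytes', 'sigops'.
    Returns the final stack value, used as the sort key."""
    stack = []
    for tok in script:
        if isinstance(tok, (int, float)):
            stack.append(tok)           # literal constant
        elif tok in OPS:
            b, a = stack.pop(), stack.pop()
            stack.append(OPS[tok](a, b))
        else:
            stack.append(tx[tok])       # property opcode, e.g. "sigops"
    return stack[-1]

# e.g. a key that prices a sigop at 50 bytes: bytes + 50 * sigops
key = eval_policy(["bytes", 50, "sigops", "*", "+"],
                  {"bytes": 250, "sigops": 2})
# key == 350
```

For a relay policy (discussed below), the same evaluator would simply have its final stack value interpreted in a boolean context.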

Difficult parameters

  • Coin-days-destroyed: changes over time, so peers need agreement on when (if ever) it is recalculated. Probably it can simply never be recalculated, but peers still need agreement on "time seen" to compute CDD.
  • Age in mempool: seems intractable in terms of wall-clock time, but could be done easily in terms of "how many txes old is this sequential ID"
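
Measured that way, age is just a difference of sequential IDs, which both peers can compute identically; a trivial sketch (names hypothetical):

```python
def mempool_age_in_txes(tx_seq_id: int, latest_seq_id: int) -> int:
    """Age measured in txes rather than seconds: how many sequential IDs
    have been assigned since this tx was inved. Unlike wall-clock age,
    both peers derive the same value from the shared inv stream."""
    return latest_seq_id - tx_seq_id

# A tx inved as ID 120, when the latest assigned ID is 150, is "30 txes old".
```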

Dealing with heterogeneous mempools

One potential pitfall: this allows for an environment of completely heterogeneous mempool policies. I think that's a good thing, but we need to avoid a situation where only least-common-denominator transactions make it farther than a hop or two, and we don't want nodes to have a strong preference for connecting to like-minded peers since clustering reduces overall connectivity. It may be worthwhile to add a parallel mechanism for relay policies, to differentiate between what a node would keep in its mempool vs. what it wouldn't even relay and doesn't want to see at all. Relay policies could be specified just like prioritization policies, but with the final stack value evaluated in a boolean context.

Policies in coinbase

An interesting additional use of policy-scripts would be a standardized way for miners to include a policy script in a coinbase, giving miners a mechanism to advertise things like their relative price of sigops vs bytes. Nodes may then choose to take this information into account in order to optimize their mempool policies for likelihood of consistency with future blocks. Since policy scripts provide only relative information on prices of different transaction properties rather than an absolute fee, this should not allow miners to "vote fees up", although care would need to be taken that they could not drive up prices by claiming common transaction types are at the high end of the fee scale.

Ultra-fast block validation

Increasing the network efficiency of transmitting previously received transactions in a block increases the relative benefit of reducing the processing time of such transactions, which currently takes my node 40-300 ms (timing ConnectBlock) even for blocks in which all the signatures have already been verified.

One way we could do this is to organize mempool transactions into "generations", where a generation-0 tx has no dependencies on unconfirmed transactions and a generation-N tx has no dependencies on transactions of generation later than N-1. Validation of a newly received block could then be done by simply iterating through the generations, for each adding its consumed inputs to a set to ensure they're unique and checking that any dependencies are included in outputs provided by previous generations. Since transactions are checked for double-spending against mempool transactions on acceptance into the mempool, the vast majority of transactions (under normal circumstances), for which no competing double-spends have been received, could be omitted from the inputs-consumed set-building process entirely.
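
A simplified sketch of the generation walk (it omits the optimization of skipping set-building for txs with no known competing double-spends; types are hypothetical):

```python
from collections import namedtuple

# inputs: list of (prev_txid, vout) outpoints. A generation-0 tx spends only
# confirmed outputs; a generation-N tx depends only on generations < N.
Tx = namedtuple("Tx", ["txid", "inputs", "generation"])

def quick_validate(block_txs, confirmed_utxos):
    """Iterate generations in order, checking that inputs are consumed at
    most once and that in-block dependencies were provided by an earlier
    generation."""
    consumed = set()
    in_block = set()  # txids already processed (i.e. earlier generations)
    for tx in sorted(block_txs, key=lambda t: t.generation):
        for outpoint in tx.inputs:
            if outpoint in consumed:
                return False          # double spend within the block
            consumed.add(outpoint)
            prev_txid, _vout = outpoint
            if outpoint not in confirmed_utxos and prev_txid not in in_block:
                return False          # dependency not yet provided
        in_block.add(tx.txid)
    return True
```

Within a single generation the order doesn't matter, since by construction no tx depends on another tx of the same generation.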

Since this degree of change to such a critical process as block validation has risks, we could still perform the current validation before actually accepting the block as valid (at least until the code has been in use long enough that we're confident it can't differ from the current results), but use the very fast validation to determine whether to relay. The worst-case scenario, where a block has been constructed that exploits a bug to pass the quick validation without passing the current validation, would enable an attacker only to increase nodes' DoS scores by the severity of the failure during full validation. A node detecting a difference between the results of its validators could automatically set a persistent flag disabling quick-validation until it is patched.
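
The two-validator arrangement might be wired up roughly like this (function names are placeholders; the validators and relay hook are assumed callables):

```python
def process_block(block, quick_validate, full_validate, relay, state):
    """Relay early on the quick path, but only accept on full validation;
    any disagreement persistently disables the quick path."""
    quick_ok = quick_validate(block) if state.get("quick_enabled", True) else None
    if quick_ok:
        relay(block)                      # fast path: relay before full check
    full_ok = full_validate(block)        # acceptance still gated on this
    if quick_ok is not None and quick_ok != full_ok:
        state["quick_enabled"] = False    # persistent flag until patched
    return full_ok
```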

Roadmap

| progress  | type      | description                         |
| --------- | --------- | ----------------------------------- |
| implement | primitive | sequential IDs                      |
| push      | feature   | gettx                               |
| implement | primitive | priority-scripts                    |
| push      | feature   | relay policies                      |
| implement | primitive | invack                              |
| prototype |           | squashblock based on sequential-IDs |
| push      | feature   | prioritization-based squashblock    |
| implement | primitive | coinbase policies                   |
| push      | feature   | dynamic mempool policies            |