Skip to content

Instantly share code, notes, and snippets.

@xeroc
Created April 8, 2016 14:14
Show Gist options
  • Save xeroc/e2c674054da389838b9a5b57a9c854f4 to your computer and use it in GitHub Desktop.
Save xeroc/e2c674054da389838b9a5b57a9c854f4 to your computer and use it in GitHub Desktop.
import hashlib
import json
from binascii import hexlify, unhexlify
import time
from pprint import pprint
def hash(x):
return hashlib.sha256(unhexlify(x)).hexdigest()
def hashJson(x):
return hash(hexlify(bytes(json.dumps(x),'utf-8')))
def appendblock(data):
last_block = blockchain[:-1]
block = {
"previous_id": hashJson(last_block),
"timestamp": int(time.time()),
"data_hash": hashJson(data),
"commits": data,
}
blockchain.append(block)
def verifyBlockchain():
for i in range(1, len(blockchain)-1):
prev_block = blockchain[i-1]
block = blockchain[i]
if block["previous_id"] != hashJson(prev_block):
return False
if block["data_hash"] != hashJson(commits):
return False
return True
def indexData():
storage = {}
for i in blockchain:
data = i["commits"]
for d in data:
storage[d["key"]] = d["value"]
return storage
def storeData(key, value):
global numCommits
global currentCommits
currentCommits.append({"key": key, "value": value})
numCommits += 1
if not numCommits % 10:
appendblock(currentCommits)
currentCommits = []
genesisblock = {
"previous_id": hashJson(["genesisblock"]),
"timestamp": 0,
"data_hash": hashJson([]),
"commits": [],
}
blockchain = [genesisblock]
numCommits = 0
currentCommits = []
for i in range(0,20):
storeData("foo-%d" % i, "bar")
pprint(blockchain)
pprint(verifyBlockchain())
pprint(indexData())
import hashlib
import json
from binascii import hexlify, unhexlify
import time
from pprint import pprint
import random
def hash(x):
return hashlib.sha256(unhexlify(x)).hexdigest()
def hashJson(x):
return hash(hexlify(bytes(json.dumps(x), 'utf-8')))
def pow(block, numZeros):
blockHash = hashJson(block)
while True:
nonce = str(random.getrandbits(64))
block["nonce"] = nonce
h = hashJson(block)
if h[0:numZeros] == "0" * numZeros:
return block, h
block = {"previous_id": hashJson(["genesisblock"]),
"timestamp": 0,
"data_hash": hashJson([]),
"commits": [],
}
pprint(pow(block, 5))

Introduction: Blockchain Basics

Traditional databases

  • Oracle, MySQL
  • carry data: rational (excel), document (mongo), object (??)
  • Managable data: mutable, editable, deletable, appendable
  • distributed via master -> slave (proxy) concept
  • "centralized" database "management"

Immutable database

(- blockchain, read-only filesystems(?))

  • carry data: any
  • Managable data: append ONLY, no edits, not deletes => immutable
  • distributed
    • master -> slave
    • (global, or local) P2P network or
    • distributed vs. non-distributed
  • database management:
    • centralized vs. decentralized
    • private vs. public

Achieving "Append-Only" via blockchain

Similarity to book keeping(!) with new pages every X seconds/minutes (writing on the backside of the previous page (?))

  1. New booking entries arrive at the book keeper
  2. book keeper adds them to his current page of entries
  3. every now and then, the book keeper changes to the "next page" and by this "commits" to the previous entries
  4. every new page carriers a date time

Blockchain tech:

  1. the data to be inserted is formatted properly
  2. data to be submitted/committed is organized as 'operations'
  3. several operations can be grouped together into a single 'transaction'
  4. several transactions are grouped into a set of "database commits" (i.e. a "block")
  5. new blocks get "applied" (read: committed) "occasionally"
  6. the new block carries the hash of the content of the previous block as reference
  7. Participants verify operations/transactions/commits and linking of each blocks

Excursion Hash functions

Definition: A hash function is any function that can be used to map data of arbitrary size to data of fixed size. The values returned by a hash function are called hash values, hash codes, hash sums, or simply hashes.

HASH(arbitrary-data) -> randomly looking shorter hash

Properties:

  • Determinism (always same hash for same data)
  • Uniformity (Uniform mapping)
  • Data normalization (data format normalization)
  • Continuity (Not useful here)
  • Non-invertible (cannot reverse the process (processing theorem)

Examples:

  • digit sum
  • two-letter country codes
  • MD5
  • SHA1
  • SHA256
  • ...

Training: Blockchain Basics (1.py)

  • Develop our own blockchain in python
  • blocks are json objects:
{
   "previous_id": <>,
   "timestamp": <>,
   "data_hash": <>,
   "commits": {...},
}
  • data is a json object, the hash is from the stringified/serialized json
[
 {
    "key" : <>,
    "value" : <>,
 }
]
  • indexing the blockchains key/value pairs in a local json object
  • rescan/re-apply/reindex blocks

Note:

Why Bitcoin uses a merkle tree?

If they were stored sequentially, you'd need all transactions to verify that any single one belongs to a given block. With a merkle tree, you only need to provide the transaction and the merkle branch to prove that this transaction in fact appears in this block.

Introduction: Consensus Mechanisms

Core questions to answer:

  • When should a block be appended to the bockchain
  • In a P2P network, who should be allowed to do so?

Starting stuation:

  • need to organize people/players/network participants
  • over an untrusted, unrelaible network
  • with the expectation of malicous interferences during communication
  • Byzantine Generals Problem (unsolved until satoshi)

Goals

  • find consensus about anything (organization of a time, agreeing on a set of data, voting, ..)

Proof of Work

  • Goal: integer representation of hash of a block is smaller than target "difficulty".
  • Difficulty adjusted to regualte towards target block confirmation time of x minutes (BTC: 10 minutes)
  • Show BTC blocks and leading zeros
  • Hashes are one way function (irreversible operation)
  • Mining equipment / resource hungry
  • Different Hash algorithms (scrypt, sha256, sha512, groekel, combinations of multiple algos)
  • Different retargeting strategies
  • Optional: reward for finding a block
  • Drawbacks:
    • Energy hungry
    • decentralization due to pooled mining
    • Random (poisson distributed) block confirmation times
    • user consensus != miner consensus
    • participating in consensus require expensive hardware
  • Advantages
    • Easy to verify (due to high energy consumed)

Proof of Stake

  • Goal: integer representation of hash of a block is smaller than target "difficulty".
  • Every satoshi allows to derive one hash for a given time instant (time and token address included in hash calculation)
  • The more stake one has, the more hashes can be computed => higher probability to find a valid block
  • Optional: reward for finding a block
  • Drawbacks:
    • Nothing at stake
    • Stake grinding
    • Initial stake distribution(?)
    • Participation in "minting/staking" requires unlocked keys/wallet
    • Random (poisson distributed) block confirmation times
    • the rich get richer
  • Advantages:
    • Low use of resources (energy efficient) due to significantly less hashes

Combination of POW and POS

  • Randomized block selection: Nxt and BlackCoin use randomization to predict the following generator, by using a formula that looks for the lowest hash value in combination with the size of the stake.[4][5][6] Since the stakes are public, each node can predict - with reasonable accuracy - which account will next win the right to forge a block.
  • Coin age based selection Peercoin's proof-of-stake system combines randomization with the concept of "coin age," a number derived from the product of the number of coins times the number of days the coins have been held. Coins that have been unspent for at least 30 days begin competing for the next block. Older and larger sets of coins have a greater probability of signing the next block. However, once a stake of coins has been used to sign a block, they must start over with zero "coin age" and thus wait at least 30 more days before signing another block. Also, the probability of finding the next block reaches a maximum after 90 days in order to prevent very old or very large collections of stakes from dominating the blockchain.
  • Velocity based selection Reddcoin's proof-of-stake velocity (PoSV)[11] claims to encourage velocity i.e. movement of money between people, rather than hoarding.
  • Voting based selection Instead of only using the stake size, the block generators can be selected by votes. BitShares uses a system where stake is used to elect a total of 101 delegates, who are then ordered at random.[12] This has many of the advantages of shareholder voting (for example, the flexible accountability enhances the incentives of the generators to act responsibly), and yet it reintroduces the dangerous Sybil attack - as in one case where one user posed as the top five delegates.[13]

Delegated Proof of Stake

  • Blockchain voting of block producers
  • votes are weightes by stake (similar to shareholders)
  • Block producers are sorted by shareholder approval
  • Set of N producers are allowed to sign new blocks with their keys
  • Subsequent block producers are known a-priori
  • Disadvanages:
    • Requires shareholders to participate in votes (apathy)
    • initial distribution of shares(?)
  • Advantages:
    • Allows interpretation of a business with shareholders and accountants (block producers)
    • High efficient
    • allows significantly lower confirmations times
    • constant block confirmation times(!)
    • very high degree of decentralization
    • bad actors can be unapproved by shareholders
    • shareholder consensus = network consensus (sooner or later)
    • participating in consensus does not require expensive hardware, but shares in "company"

Training (2.py)

  • Implementation of basic Proof Of Work

  • Discussion about 'target' adjustment

  • Discussion about Poison statistics 99.999% after 6 confirms etc..

  • POS requires more work since Blocks get authenticated by the miner (pub/priv key)

Introduction: Blockchain "protocol" specifications:

  • Who is allowed to read the database?
  • Who is allowed to transact (private vs. public)?
  • Who is allowed to commit new data to the blockchain?
  • What is a valid operation/transaction (What can be added to the database?)?
  • What is a valid block?

Who is allowed to read the database?

  • verify blockchain requires full knowledge of all data on the blockchain
  • public verifiers/block producers -> public blockchain
  • may still contain encrypted (but validated) data

Who is allowed to transact (private vs. public)?

  • Even if the blockchain is publicly auditable, does not imply everyone can add data
  • Adding data may require a valid and authorized signature
  • By this, people can be allowed to do A, but prevented from doing B
  • Permission-less ledger: "No one can stop you from playing by the rules!"
  • Those "rules" are defined below and thus already restrict a permission-less ledger.

!Authorization!

Who is allowed to commit new data to the blockchain?

  • Miners, Stakers, Minters, Witnesses => block producers
  • Requirements from Consensus Mechanism need to be fulfilled
  • Maybe predefined producers, maybe voted, maybe undefined

What is a valid operation/transaction (What can be added to the database?)?

  • What kind of data can be added?
    • transfer operations?
    • deployment of smart contracts?
    • trade orders?
    • lending orders?
    • any kind of relational data
    • even object-relational data
    • or raw data
  • What requirements does this data need to fulfill to be valid?
    • double-spending of funds?!
    • overwriting of existing keys?
    • user is allowed to perform operation X?
    • user is allowed to perform operation X on account Y?

!Validation!

The "what is allowed" needs to be defined in the rules (i.e. the blockchain protocol) of the specific blockchain.

What is a valid block?

  • valid previous hash(!)
  • valid hash of the content (e.g. transactions, operations, merkle tree, ...)
  • data and time within a tolerance (NTP)
  • valid "consensus" requirement (e.g. pow, pos, signature, ...)

Examples

Bitcoin

  • What is a valid transaction (What can be added to the database?)
    • simple "transfers" of an amount of bitcoin/satoshi from an address to a "Script"
    • most used "target scripts" are also just addresses
    • require proper signature from the spending address (coin owner)
  • What is a valid block
    • set of valid transactions (merkle tree root)
    • date
    • organizational data (nonce, protocol version, etc)
    • valid proof of work
  • Who is allowed to transact (private vs. public)
    • anybody that has bitcoin and owns the private key(s)
  • Who is allowed to add data to the database?
    • POW miners

Ethereum

BitShares

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment