Charging for deploy

Charging overall scheme

  • charge upfront phloLimit * phloRate from deployer vault to pos/validator vault
    • check that there's enough rev
    • check that transfer succeeded
  • if system deploy succeeded & transfer was successful, execute user deploy
    • if not enough rev, we still need to store the deploy on-chain for reference, as proof of the deploy attempt and as a basis for charging
    • this means we put the user deploy in the block in all cases, but only execute it if pre-charge succeeds
    • replay is not validated for failed deploys (see Caveats)
    • a deploy for which pre-charge failed has an empty event log and successful = false
      • FUTURE / JIRA: add error status to user deploys to discern OOPE errors from other error causes
      • FUTURE / JIRA: add error info to capture limited data on the error (e.g. stacktrace header and/or hash)
  • refund the remaining phlo only if the user deploy was successful (flow sketched after this list)
    • don't even do the refund deploy otherwise
    • failures include out-of-phlo errors and any Rholang error
    • we can't afford refunds for erroneous executions, because error behavior is undefined in Rholang,
      and thus so is its cost (very likely non-deterministic and non-replayable)
    • this needs to be documented
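
A minimal sketch of the flow above, in Scala. ChargingRuntime, DeployResult and processDeploy are hypothetical placeholder names to illustrate the control flow, not RNode's actual API.

final case class DeployResult(succeeded: Boolean, remainingPhlo: Long, eventLog: List[String])

trait ChargingRuntime {
  // transfer phloLimit * phloRate upfront from the deployer vault to the PoS/validator vault
  def preCharge(deployerPk: String, phloLimit: Long, phloRate: Long): Boolean
  def executeUserDeploy(code: String, phloLimit: Long): DeployResult
  def refund(deployerPk: String, remainingPhlo: Long, phloRate: Long): Unit
}

object ChargingFlow {
  def processDeploy(
      rt: ChargingRuntime,
      deployerPk: String,
      code: String,
      phloLimit: Long,
      phloRate: Long
  ): DeployResult =
    if (!rt.preCharge(deployerPk, phloLimit, phloRate))
      // pre-charge failed: the deploy still goes into the block,
      // but with an empty event log and successful = false
      DeployResult(succeeded = false, remainingPhlo = 0L, eventLog = Nil)
    else {
      val result = rt.executeUserDeploy(code, phloLimit)
      // refund only on success; erroneous executions keep the whole charge
      if (result.succeeded) rt.refund(deployerPk, result.remainingPhlo, phloRate)
      result
    }
}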

TBD Propose & replay

The following two sections will discuss where this all needs to be plugged in and what changes are needed in the area bounded by:

  • BlockCreator.createBlock (propose) & MultiparentCasperImpl.attemptAdd (replay) from the top
  • RuntimeManager.processDeploy & RuntimeManager.replayDeploy from the bottom

Most likely, some of the indirection in between could be inlined. Details of error handling in the current state need to be investigated.

TBD Current state

  • where deploys are evaluated first
  • where deploys are replayed
  • what determines which deploys / actions need to be performed

Target state

  • for each UserDeploy, the relevant pre-charge and refund SystemDeploys are added around it
  • slashing SystemDeploy-s are added

To propose:

  • [we issue a series of needed system deploys]
  • we go through pending user deploys, for each:
    • execute the pre-charge
    • execute the user deploy
    • execute the refund
  • (as needed) we issue the slashing system deploys
  • (as needed) we issue the close block system deploy

This results in:

BlockMessage
  userDeploys: List[Signed[ProcessedUserDeploy]]
  systemDeploys: List[Signed[ProcessedSystemDeploy]]

ProcessedSystemDeploy
  timestamp: Long
  templateId: String  //technically we could skip it, but it's harmless enough to retain as a debug guide
  eventLog: EventLog
  //THE FOLLOWING MUST NOT END UP IN THIS MESSAGE:
  // deployerPk //the validator's PK! <- it follows from who's proposing the block. Duplication = functional dependency = bugs = death
  // inputs: Map[UriString, GroundTerm]
  

Building block data types:

//NOT A NETWORK MESSAGE - hardcoded values stored in dictionary known to all validators
SystemDeployTemplate 
  id: String
  code: String
  outputs: Map[UriString, RholangType]

Signed[A: Serialize]
  sig
  sigAlgo
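
The same building blocks, written out as Scala case classes. This is only a sketch: UriString, EventLog and ProcessedUserDeploy are simplified stand-ins, not RNode's actual definitions.

object BlockTypes {
  type UriString = String

  final case class EventLog(events: List[String])

  // simplified stand-in for the processed user deploy
  final case class ProcessedUserDeploy(code: String, successful: Boolean, eventLog: EventLog)

  final case class ProcessedSystemDeploy(
      timestamp: Long,
      templateId: String, // retained only as a debugging aid
      eventLog: EventLog
      // deliberately NO deployerPk and NO inputs: both follow from the
      // proposing validator and the casper code path; duplication breeds bugs
  )

  // Signed wrapper; see the smart-constructor sketch further below
  final case class Signed[A](data: A, sig: Array[Byte], sigAlgorithm: String)

  final case class BlockMessage(
      userDeploys: List[Signed[ProcessedUserDeploy]],
      systemDeploys: List[Signed[ProcessedSystemDeploy]]
  )

  // NOT a network message: hardcoded templates known to all validators
  final case class SystemDeployTemplate(
      id: String,
      code: String,
      outputs: Map[UriString, String] // URI -> expected Rholang type, simplified
  )
}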

To replay

We operate just like in propose; the only information we take from the BlockMessage is what we couldn't possibly know otherwise. We need to know what needs to happen in order to validate that it happened.

Every time we execute a system deploy we pop the first system deploy from BlockMessage.systemDeploys to know which eventLog to use.

The order of system deploys is implicit and defined by the casper (propose/replay) code.

  • we issue a series of needed system deploys
  • we go through pending user deploys, for each:
    • execute the pre-charge
    • execute the user deploy
    • execute the refund
  • (as needed) we issue the slashing system deploys
  • (as needed) we issue the close block system deploy

Pseudocode

# `System` is a service used for execution of system deploys,
# collecting their traces or
# - in its replay variant - verifying all traces have been used
#
# `User` is a service used for execution of user deploys
# the replay implementation would raise a fatal error (`Sync.raiseError`)
# if the user deploy status was not as expected

def propose(blockNumber, userDeploys, invalidBlocks):

  userDeploys.foreach \userDeploy ->
    System.execute(PreCharge, /* inputs: */ userDeploy.deployerId)
    val (success, remainingPhlo) = User.execute(userDeploy)
    if success:
      System.execute(Refund, /* inputs: */ rev(remainingPhlo), ...)
  invalidBlocks.foreach \ib ->
    System.execute(Slash, ib)
  if epochChange(blockNumber):
    System.execute(CloseBlock)
  bonds = System.execute(GetBonds)

  postHash = getTsHash()
  BlockMetadata.put(postHash, bonds)
  systemDeploys = System.getDeploys #includes get bonds event log \o/
  
  return (bonds, BlockMessage(..., postHash, systemDeploys, ...))


def replay(blockNumber, userDeploys, systemDeploys, invalidBlocks):
  
  System.rig(systemDeploys)
  
  userDeploys.foreach \userDeploy ->
    System.execute(PreCharge, /* inputs: */ userDeploy.deployerId)
    val (success, remainingPhlo) = User.execute(userDeploy)
    if success:
      System.execute(Refund, /* inputs: */ rev(remainingPhlo), ...)
  invalidBlocks.foreach \ib ->
    System.execute(Slash, ib)
  if epochChange(blockNumber):
    System.execute(CloseBlock)
  bonds = System.execute(GetBonds)
  
  postHash = getTsHash()
  BlockMetadata.check(postHash, bonds)
  System.checkLogClear
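
A sketch of what the `System` service from the pseudocode could look like in Scala, with a propose variant that collects event logs and a replay variant that is rigged with the proposed logs and verified afterwards. All names and signatures here are assumptions, not RNode's actual interfaces.

sealed trait SystemDeployKind
case object PreCharge  extends SystemDeployKind
case object Refund     extends SystemDeployKind
case object Slash      extends SystemDeployKind
case object CloseBlock extends SystemDeployKind
case object GetBonds   extends SystemDeployKind

final case class SystemDeployLog(kind: SystemDeployKind, eventLog: List[String])

// `run` stands in for executing the template's Rholang and capturing its event log
trait SystemDeployRunner {
  def execute(kind: SystemDeployKind, inputs: Map[String, Any]): List[String]
}

// Propose: execute and collect the event log of every system deploy issued
final class ProposeRunner(run: (SystemDeployKind, Map[String, Any]) => List[String])
    extends SystemDeployRunner {
  private var collected = List.empty[SystemDeployLog]
  def execute(kind: SystemDeployKind, inputs: Map[String, Any]): List[String] = {
    val log = run(kind, inputs)
    collected = collected :+ SystemDeployLog(kind, log)
    log
  }
  def getDeploys: List[SystemDeployLog] = collected
}

// Replay: rigged with the proposed logs; pops them in order and fails loudly on divergence
final class ReplayRunner(
    run: (SystemDeployKind, Map[String, Any]) => List[String],
    rigged: List[SystemDeployLog]
) extends SystemDeployRunner {
  private var remaining = rigged
  def execute(kind: SystemDeployKind, inputs: Map[String, Any]): List[String] =
    remaining match {
      case expected :: rest =>
        remaining = rest
        val log = run(kind, inputs)
        // divergence here is a slashable offence for the proposer
        if (log != expected.eventLog)
          throw new IllegalStateException(s"system deploy $kind diverged from the proposed event log")
        log
      case Nil =>
        throw new IllegalStateException("more system deploys replayed than were proposed")
    }
  // checkLogClear from the pseudocode: every proposed system deploy log must have been used
  def checkLogClear(): Unit =
    require(remaining.isEmpty, "unused system deploy event logs left after replay")
}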

Building blocks:

System deploys

  • no explicit code - just a templateId ("slashValidator", "prechargeDeploy", "refundDeploy")
  • each validator must use the right code themselves, independently
  • must always succeed
    • error in propose aborts the propose
      • it means we have a bug in platform code
    • error in replay is slashable
      • it means the proposing validator is cheating about what the system deploy did
  • signed by the validator proposing the block
  • has a set of inputs and outputs
  • the only parts transferred via network are:
    • signature
    • templateId - the only thing that determines the rholang code to be executed (no string replacements)
    • inputs - a Map[UriString, GroundTerm] that parametrizes the deploy by providing values via sys: namespace
    • event log
  • unforgeable name uniqueness needs to be guaranteed
    • the easiest way would be to seed the RNG with the deploy signature
      • we should do that for both User and System deploys
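
A sketch of the template dictionary: only the templateId travels over the network and selects locally hardcoded code. The ids below are the ones mentioned in this note; the Rholang bodies are elided.

object SystemDeployTemplates {
  // hardcoded dictionary known to all validators; never taken from a message
  private val templates: Map[String, String] = Map(
    "prechargeDeploy" -> "/* precharge Rholang source, identical on every validator */",
    "refundDeploy"    -> "/* refund Rholang source */",
    "slashValidator"  -> "/* slashing Rholang source */"
  )

  // the templateId received over the network only ever selects local code
  def codeFor(templateId: String): Either[String, String] =
    templates.get(templateId).toRight(s"unknown system deploy template: $templateId")
}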

SystemDeploy inputs

  • sys: namespace in NormalizerEnv for SysDeploy's input params
  • string interpolation won't do, because we need to inject a userId (an unforgeable name)
  • only SystemDeploy code can refer to URIs starting with sys:
  • rejecting sys: references is a necessary validation for all incoming UserDeploy-s (see the sketch below)
    • otherwise users can intercept the values and do rogue things
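
A sketch of that validation. It assumes a naive regex scan of the deploy source; in practice this would be a walk over the normalized term, but the intent is the same: reject any user deploy that mentions a sys: URI.

object SysUriValidation {
  private val quotedUri = """`([^`]+)`""".r

  // reject user deploys that reference the reserved sys: namespace
  def validateUserDeploy(code: String): Either[String, Unit] = {
    val sysUris =
      quotedUri.findAllMatchIn(code).map(_.group(1)).filter(_.startsWith("sys:")).toList
    if (sysUris.isEmpty) Right(())
    else Left(s"user deploy refers to reserved sys: URIs: ${sysUris.mkString(", ")}")
  }
}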

SystemDeploy outputs

  • way of obtaining return values from SystemDeploys
    • each system deploy has an outputs: Map[UriString, RholangType] defined (can be empty)
      • it probably should never be empty, or we can't be sure the whole deploy executed
  • after executing the system deploy, a consume per output is done, in lexicographical order of uris
  • all consumes must yield exactly one value of the specified RholangType
  • only after all the consumes is the checkpoint created and the event log captured
    (the event log for system deploys must contain the consumes of deploy outputs)
  • QUESTION what random seed do we use for the consumes?
    • ANSWER it should not matter, but for consistency and debuggability we'll use rand.splitShort(Short.MIN_VALUE + outputIndex), with outputs sorted lexicographically by URI (see the sketch below)
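
A sketch of the per-output seeding answer above. SplittableRng is a stand-in for RNode's Blake2b512Random, and the consume itself is elided.

// stand-in for Blake2b512Random's splitting interface
trait SplittableRng {
  def splitShort(index: Short): SplittableRng
}

final case class DeployOutput(uri: String, rholangType: String)

object OutputConsumes {
  // outputs sorted lexicographically by URI; consume i uses rand.splitShort(Short.MinValue + i)
  def seeds(rand: SplittableRng, outputs: Map[String, String]): List[(DeployOutput, SplittableRng)] =
    outputs.toList.sortBy(_._1).zipWithIndex.map { case ((uri, tpe), i) =>
      (DeployOutput(uri, tpe), rand.splitShort((Short.MinValue + i).toShort))
    }
}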

Error handling: deploy vs replay

|         | UserDeploy | SystemDeploy |
|---------|------------|--------------|
| Propose | The deploy is published to the network with the error cause and the event log | The propose is cancelled and a bug is reported in the logs |
| Replay  | Replay succeeds iff there was an error during propose too. We do, however, disregard the type of error, the remaining phlo, and event log discrepancies. See the Caveats section below. | An error during system deploy replay is grounds for slashing the proposing validator |

User vs System deploys

  • both have an event log
  • both need to be replayable
  • both are signed, so they can't be counterfeit
  • both have a timestamp
  • both need to have unique unforgeable-name namespaces, based on the 'random seed' (currently the signature's PK + timestamp; in the future, as soon as possible, just the signature itself)
| UserDeploy | SystemDeploy |
|------------|--------------|
| initiated by user | initiated by Scala code in casper |
| signed by user | signed by the validator doing the propose |
| code is arbitrary, exchanged as a string | code is predefined, selected by SystemDeploy.templateId; it must not be transferred via the network / taken from the message |
| can fail and still be published to the network | must not fail; publishing a failed system deploy is slashable |
| execution is charged for (cost-accounted) | execution is not charged for (it's the validator's operating cost) |
| deployerId resolves to the deployer's public key | deployerId resolves to the validator's public key, or is undefined |
| has no input parameters | can be statically parameterized using names from the sys: URI namespace |
| has no outputs | has predefined outputs, sent to and consumed from predefined sys: names; the output type is also predefined; not publishing an output (of the correct type) is an execution failure |

Steps

  • validate user deploy cost in replay

  • NormalizerEnv generalized from (Option[DeployId], Option[DeployerId]) to Map[UriString, GroundTerm]

    • referring to a URI not defined in NormalizerEnv is a normalization ("compilation") error
      • just as it is now with DeployId and DeployerId
  • Signed[A: Serialize] with a signing smart constructor and (hopefully no) unsafe constructor (see the sketch after this list)

  • Bulletproof unforgeable-name namespace uniqueness by prohibiting terms with more than Short.MAX_VALUE splits/sub-terms. See DebruijnInterpreter.split()

  • remove DeployData.payment log

  • remove all the Span.mark calls for now

  • Introduce RuntimeEnv(reducer, rspace, errorVector, cost, all the Ref-s for casper-specific state)

  • introduce ProcessedSystemDeploy-s

  • make ProcessedSystemDeploy considered in merge / conflict finding

~~DeployData -> data Deploy = UserDeploy | SystemDeploy~~ there's no need for SystemDeploy to share a common supertype with UserDeploy; see the data type definitions above

  • pass the casper-specific state via SystemDeploy input-s, remove them from RuntimeManager

    • Ref[F, DeployParameters]
    • Ref[F, BlockData]
    • Runtime.InvalidBlocks[F]
  • obtain the bonds using a SystemDeploy output

  • remove the getBonds and getData methods from RuntimeManager

  • Express slashing as a system deploy (in createBlock and in replay) (?)

    • explore how slashing works in replay currently
    • pass inputs via the NormalizerEnv
    • pass a 'return result' to make sure slashing executed successfully, and to completion
      • as we know, when errors happen, we simply end up in a situation where the continuations never come
      • validating that a continuation sequenced after all the other code got its match means the code executed successfully
  • Add precharge system deploy

    • inject (via sys:) original deployer's deployerId to obtain their RevVault's authKey
    • transfer phloLimit * phloPrice rev from deployer's vault to pos/validator vault
    • capture the transfer result on a sys: name (see code below):
    • check that the result was (true, Nil), report the actual value in the exception message if not.
    • the above steps affect both propose (BlockCreator.createBlock) and replay (Casper.attemptAdd); both of them need to issue the SystemDeploy independently, or there's no validation
      • we can't simply iterate through the list of (System)Deploy-s in replay - we need to re-do the needed system deploys based on whatever actions casper takes in propose (a mirror of those actions must be coded in replay)
 new
   originalDeployerId(`sys:casper:ogDeployerId`),
   prechargeAmount(`sys:casper:prechargeAmount`),
   transferResult(`sys:casper:transferResult`)
 in {
   PoS!("chargeDeploy", *originalDeployerId, *prechargeAmount, *transferResult)
 }
  • Add refund system deploy
    • calculate the remaining phlo and convert them back to Rev according to the UserDeploy's phloRate
      • disregard rounding errors. Always round down. We're operating in nano-Rev-s anyway.
      • phloRate should be documented to be in phlo / nanoRev units
    • put the refundAmount and other input values into NormalizerEnv and execute the system deploy with it
    • other steps are analogous to the ones in 'Add precharge system deploy' above
 new
   originalDeployerId(`sys:casper:ogDeployerId`),
   refundAmount(`sys:casper:refundAmount`),
   refundResult(`sys:casper:refundResult`)
 in {
   PoS!("refundDeploy", *originalDeployerId, *refundAmount, *refundResult)
 }
 // passing current validator's deployerId might be needed to integrate per-validator PoS vaults?
  • Add outstanding tests that didn't make it into previous commits

  • reintroduce Span.mark calls as needed
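
A sketch of the Signed[A: Serialize] smart constructor mentioned in the steps above. Serialize and SignatureAlgorithm are simplified stand-ins for RNode's serialization and crypto modules, not their actual signatures.

trait Serialize[A] {
  def encode(a: A): Array[Byte]
}

trait SignatureAlgorithm {
  def name: String
  def sign(data: Array[Byte], privateKey: Array[Byte]): Array[Byte]
  def verify(data: Array[Byte], sig: Array[Byte], publicKey: Array[Byte]): Boolean
}

// the raw constructor should not be used directly; build values via Signed.sign
final case class Signed[A] private (data: A, sig: Array[Byte], sigAlgorithm: String)

object Signed {
  // signing smart constructor: the only intended way to obtain a Signed[A]
  def sign[A: Serialize](data: A, algo: SignatureAlgorithm, privateKey: Array[Byte]): Signed[A] =
    Signed(data, algo.sign(implicitly[Serialize[A]].encode(data), privateKey), algo.name)

  def verify[A: Serialize](signed: Signed[A], algo: SignatureAlgorithm, publicKey: Array[Byte]): Boolean =
    algo.verify(implicitly[Serialize[A]].encode(signed.data), signed.sig, publicKey)
}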

Tests

  • executing a basic deploy depletes rev balance
  • calling an undefined method charges the whole phloLimit and does not refund
  • two system deploys accessing the same data (in a conflicting manner) must conflict. e.g. doing a refund to the same wallet
  • just to be sure: executing @0!(1) | for (_ <- @0) { Nil } | 42.callUndefined() results in a failed (user/system) deploy
  • TODO surely there's more
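
A purely illustrative test of the charging control flow, reusing the hypothetical ChargingRuntime / ChargingFlow stubs from the sketch near the top of this note. It only checks that a failing deploy gets no refund; it does not exercise actual Rholang execution or rev balances.

object ChargingFlowSpec {
  // stub runtime: pre-charge succeeds, the user deploy always fails
  private final class FailingRuntime extends ChargingRuntime {
    var refunded = false
    def preCharge(deployerPk: String, phloLimit: Long, phloRate: Long): Boolean = true
    def executeUserDeploy(code: String, phloLimit: Long): DeployResult =
      DeployResult(succeeded = false, remainingPhlo = 0L, eventLog = List("error: method not defined"))
    def refund(deployerPk: String, remainingPhlo: Long, phloRate: Long): Unit = refunded = true
  }

  def main(args: Array[String]): Unit = {
    val rt     = new FailingRuntime
    val result = ChargingFlow.processDeploy(rt, "deployerPk", "42.callUndefined()", 1000L, 1L)
    assert(!result.succeeded, "a failing deploy must be reported as unsuccessful")
    assert(!rt.refunded, "a failing deploy must not be refunded")
  }
}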

Caveats / exclusions / future work

  • CONSIDER: if the deployer has less than phloLimit * phloRate in their vault, the pre-charge fails. We should, though, claim all the rev they have available, because they agreed to that and they incurred a cost on the network by having the pre-charge executed (?)
    • this could, though, also mean that if I say limit = LOTS and I have rev = LOTS - 1, I lose everything I have, which is disproportionate to the cost incurred by the network...
  • because of error semantics being undefined for rholang, and the impl being non-deterministic and non-replayable, the following apply:
    • a system deploy failure in propose terminates the propose
    • a system deploy failure during replay is a slashable offence
    • a user deploy failure will be replayed, but any discrepancies in the replay result will only be logged as WARN - so there's no validation of user deploy failures
      • this means a validator can steal up to the deployer's phloLimit, and each such case must be assessed and reconciled manually by the Coop after being reported
    • FUTURE / JIRA: replay user deploy failures and confirm that they fail in some way:
      • the same error type but a different cost should be OK
      • a Rholang error in propose and an OOPE in replay should be OK
      • sadly, anything else should be OK as well, because we can't guarantee error replayability so far
      • we could if we had stricter replay, e.g. a sequential one

Known issues

  • a huge Rholang term won't execute, but it will clog up RAM anyway and won't be charged for (enough)
    • to prevent that, we'd need to limit the deploy message size based on the phloLimit, and do so in the network layer