Charging for deploy

Charging overall scheme

  • charge upfront phloLimit * phloRate from deployer vault to pos/validator vault
    • check that there's enough rev
    • check that transfer succeeded
  • if system deploy succeeded & transfer was successful, execute user deploy
    • if not enough rev, we still need to store the deploy on-chain for reference, as proof of the deploy attempt and as a basis for charging
    • this means we put the user deploy in the block in all cases, but only execute it if pre-charge succeeds
    • replay is not validated for failed deploys (see Caveats)
    • a deploy for which pre-charge failed has an empty event log and successful = false
      • FUTURE / JIRA: add error status to user deploys to discern OOPE errors from other error causes
      • FUTURE / JIRA: add error info to capture limited data on the error (e.g. stacktrace header and/or hash)
  • refund the remaining phlo only if the user deploy was successful (flow sketched after this list)
    • don't even do the refund deploy otherwise
    • failures include out-of-phlo errors and any Rholang error
    • we can't afford refunds for erroneous executions, because error behavior is undefined in Rholang,
      and thus so is its cost (very likely non-deterministic and non-replayable)
    • this needs to be documented
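
A minimal sketch of the flow above, in Scala. ChargingRuntime, DeployResult and processDeploy are hypothetical placeholder names to illustrate the control flow, not RNode's actual API.

final case class DeployResult(succeeded: Boolean, remainingPhlo: Long, eventLog: List[String])

trait ChargingRuntime {
  // transfer phloLimit * phloRate upfront from the deployer vault to the PoS/validator vault
  def preCharge(deployerPk: String, phloLimit: Long, phloRate: Long): Boolean
  def executeUserDeploy(code: String, phloLimit: Long): DeployResult
  def refund(deployerPk: String, remainingPhlo: Long, phloRate: Long): Unit
}

object ChargingFlow {
  def processDeploy(
      rt: ChargingRuntime,
      deployerPk: String,
      code: String,
      phloLimit: Long,
      phloRate: Long
  ): DeployResult =
    if (!rt.preCharge(deployerPk, phloLimit, phloRate))
      // pre-charge failed: the deploy still goes into the block,
      // but with an empty event log and successful = false
      DeployResult(succeeded = false, remainingPhlo = 0L, eventLog = Nil)
    else {
      val result = rt.executeUserDeploy(code, phloLimit)
      // refund only on success; erroneous executions keep the whole charge
      if (result.succeeded) rt.refund(deployerPk, result.remainingPhlo, phloRate)
      result
    }
}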

TBD Propose & replay

The following two sections will discuss where this all needs to be plugged in and what changes are needed in the area bounded by:

  • BlockCreator.createBlock (propose) & MultiparentCasperImpl.attemptAdd (replay) from the top
  • RuntimeManager.processDeploy & RuntimeManager.replayDeploy from the bottom

Most likely, some of the indirection in between could be inlined. Details of error handling in the current state need to be investigated.

TBD Current state

  • where deploys are evaluated first
  • where deploys are replayed
  • what determines which deploys / actions need to be performed

Target state

  • for each UserDeploy, the relevant pre-charge and refund SystemDeploys are added around it
  • slashing SystemDeploy-s are added

To propose:

  • [we issue a series of needed system deploys]
  • we go through pending user deploys, for each:
    • execute the pre-charge
    • execute the user deploy
    • execute the refund
  • (as needed) we issue the slashing system deploys
  • (as needed) we issue the close block system deploy

This results in:

BlockMessage
  userDeploys: List[Signed[ProcessedUserDeploy]]
  systemDeploys: List[Signed[ProcessedSystemDeploy]]

ProcessedSystemDeploy
  timestamp: Long
  templateId: String  //technically we could skip it, but it's harmless enough to retain as a debug guide
  eventLog: EventLog
  //THE FOLLOWING MUST NOT END UP IN THIS MESSAGE:
  // deployerPk //the validator's PK! <- it follows from who's proposing the block. Duplication = functional dependency = bugs = death
  // inputs: Map[UriString, GroundTerm]
  

Building block data types:

//NOT A NETWORK MESSAGE - hardcoded values stored in dictionary known to all validators
SystemDeployTemplate 
  id: String
  code: String
  outputs: Map[UriString, RholangType]

Signed[A: Serialize]
  sig
  sigAlgo
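
The same building blocks, written out as Scala case classes. This is only a sketch: UriString, EventLog and ProcessedUserDeploy are simplified stand-ins, not RNode's actual definitions.

object BlockTypes {
  type UriString = String

  final case class EventLog(events: List[String])

  // simplified stand-in for the processed user deploy
  final case class ProcessedUserDeploy(code: String, successful: Boolean, eventLog: EventLog)

  final case class ProcessedSystemDeploy(
      timestamp: Long,
      templateId: String, // retained only as a debugging aid
      eventLog: EventLog
      // deliberately NO deployerPk and NO inputs: both follow from the
      // proposing validator and the casper code path; duplication breeds bugs
  )

  // Signed wrapper; see the smart-constructor sketch further below
  final case class Signed[A](data: A, sig: Array[Byte], sigAlgorithm: String)

  final case class BlockMessage(
      userDeploys: List[Signed[ProcessedUserDeploy]],
      systemDeploys: List[Signed[ProcessedSystemDeploy]]
  )

  // NOT a network message: hardcoded templates known to all validators
  final case class SystemDeployTemplate(
      id: String,
      code: String,
      outputs: Map[UriString, String] // URI -> expected Rholang type, simplified
  )
}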

To replay

We operate just like in propose; the only information we take from the BlockMessage is what we couldn't possibly know otherwise. We need to know what needs to happen in order to validate that it happened.

Every time we execute a system deploy we pop the first system deploy from BlockMessage.systemDeploys to know which eventLog to use.

The order of system deploys is implicit and defined by the casper (propose/replay) code.

  • we issue a series of needed system deploys
  • we go through pending user deploys, for each:
    • execute the pre-charge
    • execute the user deploy
    • execute the refund
  • (as needed) we issue the slashing system deploys
  • (as needed) we issue the close block system deploy

Pseudocode

# `System` is a service used for execution of system deploys,
# collecting their traces or
# - in its replay variant - verifying all traces have been used
#
# `User` is a service used for execution of user deploys
# the replay implementation would raise a fatal error (`Sync.raiseError`)
# if the user deploy status was not as expected

def propose(blockNumber, userDeploys, invalidBlocks):

  userDeploys.foreach \userDeploy ->
    System.execute(PreCharge, /* inputs: */ userDeploy.deployerId)
    val (success, remainingPhlo) = User.execute(userDeploy)
    if success:
      System.execute(Refund, /* inputs: */ rev(remainingPhlo), ...)
  invalidBlocks.foreach \ib ->
    System.execute(Slash, ib)
  if epochChange(blockNumber):
    System.execute(CloseBlock)
  bonds = System.execute(GetBonds)

  postHash = getTsHash()
  BlockMetadata.put(postHash, bonds)
  systemDeploys = System.getDeploys #includes get bonds event log \o/
  
  return (bonds, BlockMessage(..., postHash, systemDeploys, ...))


def replay(blockNumber, userDeploys, systemDeploys, invalidBlocks):
  
  System.rig(systemDeploys)
  
  userDeploys.foreach \userDeploy ->
    System.execute(PreCharge, /* inputs: */ userDeploy.deployerId)
    val (success, remainingPhlo) = User.execute(userDeploy)
    if success:
      System.execute(Refund, /* inputs: */ rev(remainingPhlo), ...)
  invalidBlocks.foreach \ib ->
    System.execute(Slash, ib)
  if epochChange(blockNumber):
    System.execute(CloseBlock)
  bonds = System.execute(GetBonds)
  
  postHash = getTsHash()
  BlockMetadata.check(postHash, bonds)
  System.checkLogClear
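
A sketch of what the `System` service from the pseudocode could look like in Scala, with a propose variant that collects event logs and a replay variant that is rigged with the proposed logs and verified afterwards. All names and signatures here are assumptions, not RNode's actual interfaces.

sealed trait SystemDeployKind
case object PreCharge  extends SystemDeployKind
case object Refund     extends SystemDeployKind
case object Slash      extends SystemDeployKind
case object CloseBlock extends SystemDeployKind
case object GetBonds   extends SystemDeployKind

final case class SystemDeployLog(kind: SystemDeployKind, eventLog: List[String])

// `run` stands in for executing the template's Rholang and capturing its event log
trait SystemDeployRunner {
  def execute(kind: SystemDeployKind, inputs: Map[String, Any]): List[String]
}

// Propose: execute and collect the event log of every system deploy issued
final class ProposeRunner(run: (SystemDeployKind, Map[String, Any]) => List[String])
    extends SystemDeployRunner {
  private var collected = List.empty[SystemDeployLog]
  def execute(kind: SystemDeployKind, inputs: Map[String, Any]): List[String] = {
    val log = run(kind, inputs)
    collected = collected :+ SystemDeployLog(kind, log)
    log
  }
  def getDeploys: List[SystemDeployLog] = collected
}

// Replay: rigged with the proposed logs; pops them in order and fails loudly on divergence
final class ReplayRunner(
    run: (SystemDeployKind, Map[String, Any]) => List[String],
    rigged: List[SystemDeployLog]
) extends SystemDeployRunner {
  private var remaining = rigged
  def execute(kind: SystemDeployKind, inputs: Map[String, Any]): List[String] =
    remaining match {
      case expected :: rest =>
        remaining = rest
        val log = run(kind, inputs)
        // divergence here is a slashable offence for the proposer
        if (log != expected.eventLog)
          throw new IllegalStateException(s"system deploy $kind diverged from the proposed event log")
        log
      case Nil =>
        throw new IllegalStateException("more system deploys replayed than were proposed")
    }
  // checkLogClear from the pseudocode: every proposed system deploy log must have been used
  def checkLogClear(): Unit =
    require(remaining.isEmpty, "unused system deploy event logs left after replay")
}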

Building blocks:

System deploys

  • no explicit code - just a templateId ("slashValidator", "prechargeDeploy", "refundDeploy")
  • each validator must use the right code themselves, independently
  • must always succeed
    • error in propose aborts the propose
      • it means we have a bug in platform code
    • error in replay is slashable
      • it means the proposing validator is cheating about what the system deploy did
  • signed by the validator proposing the block
  • has a set of inputs and outputs
  • the only parts transferred via network are:
    • signature
    • templateId - the only thing that determines the rholang code to be executed (no string replacements)
    • inputs - a Map[UriString, GroundTerm] that parametrizes the deploy by providing values via sys: namespace
    • event log
  • unforgeable name uniqueness needs to be guaranteed
    • the easiest way would be to seed the RNG with the deploy signature
      • we should do that for both User and System deploys
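
A sketch of the template dictionary: only the templateId travels over the network and selects locally hardcoded code. The ids below are the ones mentioned in this note; the Rholang bodies are elided.

object SystemDeployTemplates {
  // hardcoded dictionary known to all validators; never taken from a message
  private val templates: Map[String, String] = Map(
    "prechargeDeploy" -> "/* precharge Rholang source, identical on every validator */",
    "refundDeploy"    -> "/* refund Rholang source */",
    "slashValidator"  -> "/* slashing Rholang source */"
  )

  // the templateId received over the network only ever selects local code
  def codeFor(templateId: String): Either[String, String] =
    templates.get(templateId).toRight(s"unknown system deploy template: $templateId")
}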

SystemDeploy inputs

  • sys: namespace in NormalizerEnv for SysDeploy's input params
  • string interpolation won't do, because we need to inject a userId (an unforgeable name)
  • only SystemDeploy code can refer to URIs starting with sys:
  • rejecting sys: references is a necessary validation for all incoming UserDeploy-s (see the sketch below)
    • otherwise users can intercept the values and do rogue things
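
A sketch of that validation. It assumes a naive regex scan of the deploy source; in practice this would be a walk over the normalized term, but the intent is the same: reject any user deploy that mentions a sys: URI.

object SysUriValidation {
  private val quotedUri = """`([^`]+)`""".r

  // reject user deploys that reference the reserved sys: namespace
  def validateUserDeploy(code: String): Either[String, Unit] = {
    val sysUris =
      quotedUri.findAllMatchIn(code).map(_.group(1)).filter(_.startsWith("sys:")).toList
    if (sysUris.isEmpty) Right(())
    else Left(s"user deploy refers to reserved sys: URIs: ${sysUris.mkString(", ")}")
  }
}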

SystemDeploy outputs

  • way of obtaining return values from SystemDeploys
    • each system deploy has an outputs: Map[UriString, RholangType] defined (can be empty)
      • it probably should never be empty, or we can't be sure the whole deploy executed
  • after executing the system deploy, a consume per output is done, in lexicographical order of uris
  • all consumes must yield exactly one value of the specified RholangType
  • only after all the consumes is the checkpoint created and the event log captured
    (the event log for system deploys must contain the consumes of deploy outputs)
  • QUESTION what random seed do we use for the consumes?
    • ANSWER it should not matter, but for consistency and debuggability we'll use rand.splitShort(Short.MIN_VALUE + outputIndex), with outputs sorted lexicographically by URI (see the sketch below)
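
A sketch of the per-output seeding answer above. SplittableRng is a stand-in for RNode's Blake2b512Random, and the consume itself is elided.

// stand-in for Blake2b512Random's splitting interface
trait SplittableRng {
  def splitShort(index: Short): SplittableRng
}

final case class DeployOutput(uri: String, rholangType: String)

object OutputConsumes {
  // outputs sorted lexicographically by URI; consume i uses rand.splitShort(Short.MinValue + i)
  def seeds(rand: SplittableRng, outputs: Map[String, String]): List[(DeployOutput, SplittableRng)] =
    outputs.toList.sortBy(_._1).zipWithIndex.map { case ((uri, tpe), i) =>
      (DeployOutput(uri, tpe), rand.splitShort((Short.MinValue + i).toShort))
    }
}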

Error handling: deploy vs replay

|         | UserDeploy | SystemDeploy |
|---------|------------|--------------|
| Propose | The deploy is published to the network with the error cause and the event log | The propose is cancelled and a bug is reported in the logs |
| Replay  | Replay succeeds iff there was an error during propose too. We do, however, disregard the type of error, the remaining phlo, and event log discrepancies. See the Caveats section below. | An error during system deploy replay is grounds for slashing the proposing validator |

User vs System deploys

  • both have an event log
  • both need to be replayable
  • both are signed, so they can't be counterfeit
  • both have a timestamp
  • both need to have unique unforgeable-name namespaces, based on the 'random seed' (currently the signature's PK + timestamp; in the future, as soon as possible, just the signature itself)
| UserDeploy | SystemDeploy |
|------------|--------------|
| initiated by user | initiated by Scala code in casper |
| signed by user | signed by the validator doing the propose |
| code is arbitrary, exchanged as a string | code is predefined, selected by SystemDeploy.templateId; it must not be transferred via the network / taken from the message |
| can fail and still be published to the network | must not fail; publishing a failed system deploy is slashable |
| execution is charged for (cost-accounted) | execution is not charged for (it's the validator's operating cost) |
| deployerId resolves to the deployer's public key | deployerId resolves to the validator's public key, or is undefined |
| has no input parameters | can be statically parameterized using names from the sys: URI namespace |
| has no outputs | has predefined outputs, sent to and consumed from predefined sys: names; the output type is also predefined; not publishing an output (of the correct type) is an execution failure |

Steps

  • validate user deploy cost in replay

  • NormalizerEnv generalized from (Option[DeployId], Option[DeployerId]) to Map[UriString, GroundTerm]

    • referring to a URI not defined in NormalizerEnv is a normalization ("compilation") error
      • just as it is now with DeployId and DeployerId
  • Signed[A: Serialize] with a signing smart constructor and (hopefully no) unsafe constructor (see the sketch after this list)

  • Bulletproof unforgeable-name namespace uniqueness by prohibiting terms with more than Short.MAX_VALUE splits/sub-terms. See DebruijnInterpreter.split()

  • remove DeployData.payment log

  • remove all the Span.mark calls for now

  • Introduce RuntimeEnv(reducer, rspace, errorVector, cost, all the Ref-s for casper-specific state)

  • introduce ProcessedSystemDeploy-s

  • make ProcessedSystemDeploy considered in merge / conflict finding

~~DeployData -> data Deploy = UserDeploy | SystemDeploy~~ there's no need for SystemDeploy to share a common supertype with UserDeploy; see the data type definitions above

  • pass the casper-specific state via SystemDeploy input-s, remove them from RuntimeManager

    • Ref[F, DeployParameters]
    • Ref[F, BlockData]
    • Runtime.InvalidBlocks[F]
  • obtain the bonds using a SystemDeploy output

  • remove the getBonds and getData methods from RuntimeManager

  • Express slashing as a system deploy (in createBlock and in replay) (?)

    • explore how slashing works in replay currently
    • pass inputs via the NormalizerEnv
    • pass a 'return result' to make sure slashing executed successfully, and to completion
      • as we know, when errors happen, we simply end up in a situation where the continuations never come
      • validating that a continuation sequenced after all the other code got its match means the code executed successfully
  • Add precharge system deploy

    • inject (via sys:) original deployer's deployerId to obtain their RevVault's authKey
    • transfer phloLimit * phloPrice rev from deployer's vault to pos/validator vault
    • capture the transfer result on a sys: name (see code below):
    • check that the result was (true, Nil), report the actual value in the exception message if not.
    • the above steps affect both propose (BlockCreator.createBlock) and replay (Casper.attemptAdd); both of them need to issue the SystemDeploy independently, or there's no validation
      • we can't simply iterate through the list of (System)Deploy-s in replay - we need to re-do the needed system deploys based on whatever actions casper takes in propose (a mirror of those actions must be coded in replay)
 new
   originalDeployerId(`sys:casper:ogDeployerId`),
   prechargeAmount(`sys:casper:prechargeAmount`),
   transferResult(`sys:casper:transferResult`)
 in {
   PoS!("chargeDeploy", *originalDeployerId, *prechargeAmount, *transferResult)
 }
  • Add refund system deploy
    • calculate the remaining phlo and convert them back to Rev according to the UserDeploy's phloRate
      • disregard rounding errors. Always round down. We're operating in nano-Rev-s anyway.
      • phloRate should be documented to be in phlo / nanoRev units
    • put the refundAmount and other input values into NormalizerEnv and execute the system deploy with it
    • other steps are analogous to the ones in 'Add precharge system deploy' above
 new
   originalDeployerId(`sys:casper:ogDeployerId`),
   refundAmount(`sys:casper:refundAmount`),
   refundResult(`sys:casper:refundResult`)
 in {
   PoS!("refundDeploy", *originalDeployerId, *refundAmount, *refundResult)
 }
 // passing current validator's deployerId might be needed to integrate per-validator PoS vaults?
  • Add outstanding tests that didn't make it into previous commits

  • reintroduce Span.mark calls as needed
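
A sketch of the Signed[A: Serialize] smart constructor mentioned in the steps above. Serialize and SignatureAlgorithm are simplified stand-ins for RNode's serialization and crypto modules, not their actual signatures.

trait Serialize[A] {
  def encode(a: A): Array[Byte]
}

trait SignatureAlgorithm {
  def name: String
  def sign(data: Array[Byte], privateKey: Array[Byte]): Array[Byte]
  def verify(data: Array[Byte], sig: Array[Byte], publicKey: Array[Byte]): Boolean
}

// the raw constructor should not be used directly; build values via Signed.sign
final case class Signed[A] private (data: A, sig: Array[Byte], sigAlgorithm: String)

object Signed {
  // signing smart constructor: the only intended way to obtain a Signed[A]
  def sign[A: Serialize](data: A, algo: SignatureAlgorithm, privateKey: Array[Byte]): Signed[A] =
    Signed(data, algo.sign(implicitly[Serialize[A]].encode(data), privateKey), algo.name)

  def verify[A: Serialize](signed: Signed[A], algo: SignatureAlgorithm, publicKey: Array[Byte]): Boolean =
    algo.verify(implicitly[Serialize[A]].encode(signed.data), signed.sig, publicKey)
}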

Tests

  • executing a basic deploy depletes rev balance
  • calling an undefined method charges the whole phloLimit and does not refund
  • two system deploys accessing the same data (in a conflicting manner) must conflict. e.g. doing a refund to the same wallet
  • just to be sure: executing @0!(1) | for (_ <- @0) { Nil } | 42.callUndefined() results in a failed (user/system) deploy
  • TODO surely there's more
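
A purely illustrative test of the charging control flow, reusing the hypothetical ChargingRuntime / ChargingFlow stubs from the sketch near the top of this note. It only checks that a failing deploy gets no refund; it does not exercise actual Rholang execution or rev balances.

object ChargingFlowSpec {
  // stub runtime: pre-charge succeeds, the user deploy always fails
  private final class FailingRuntime extends ChargingRuntime {
    var refunded = false
    def preCharge(deployerPk: String, phloLimit: Long, phloRate: Long): Boolean = true
    def executeUserDeploy(code: String, phloLimit: Long): DeployResult =
      DeployResult(succeeded = false, remainingPhlo = 0L, eventLog = List("error: method not defined"))
    def refund(deployerPk: String, remainingPhlo: Long, phloRate: Long): Unit = refunded = true
  }

  def main(args: Array[String]): Unit = {
    val rt     = new FailingRuntime
    val result = ChargingFlow.processDeploy(rt, "deployerPk", "42.callUndefined()", 1000L, 1L)
    assert(!result.succeeded, "a failing deploy must be reported as unsuccessful")
    assert(!rt.refunded, "a failing deploy must not be refunded")
  }
}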

Caveats / exclusions / future work

  • CONSIDER: if the deployer has less than phloLimit * phloRate in their vault, the pre-charge fails. We should, though, claim all the rev they have available, because they agreed to that and they incurred a cost on the network by having the pre-charge executed (?)
    • this could, though, also mean that if I say limit = LOTS and I have rev = LOTS - 1, I lose everything I have, which is disproportionate to the cost incurred by the network...
  • because of error semantics being undefined for rholang, and the impl being non-deterministic and non-replayable, the following apply:
    • a system deploy failure in propose terminates the propose
    • a system deploy failure during replay is a slashable offence
    • a user deploy failure will be replayed, but any discrepancies in the replay result will only be logged as WARN - so there's no validation of user deploy failures
      • this means a validator can steal up to the deployer's phloLimit, and each such case must be assessed and reconciled manually by the Coop after being reported
    • FUTURE / JIRA: replay user deploy failures and confirm that they fail in some way:
      • the same error type but a different cost should be OK
      • a Rholang error in propose and an OOPE in replay should be OK
      • sadly, anything else should be OK as well, because we can't guarantee error replayability so far
      • we could if we had stricter replay, e.g. a sequential one

Known issues

  • a huge Rholang term won't execute, but it will clog up RAM anyway and won't be charged for (enough)
    • to prevent that, we'd need to limit the deploy message size based on the phloLimit, and do so in the network layer