- Bernhard Scholz (Sydney)
- "Souffle (Static analysis toolkit for Datalog)
- addresses the long-standing gap between Datalog's analytical potential and its performance / use in practice
- background on Datalog: a simplified Prolog variant without backtracking, using memoisation for efficiency; excels at fixpoint calculations and (other) bottom-up methods
- static analysis with datalog:
- needs extraction of input program into relations (specific to language),
- and a (generic) Datalog engine working on the relations, using rules that describe the semantics (see the sketch below).
- (example of unprotected vulnerable code blocks)
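- a minimal Python sketch (mine, not Souffle) of the bottom-up idea: "edge" stands for an extracted input relation, and reachability rules are evaluated to a fixpoint with derived facts memoised:

    # Naive bottom-up evaluation of
    #   path(x, y) :- edge(x, y).
    #   path(x, z) :- path(x, y), edge(y, z).
    edge = {("a", "b"), ("b", "c"), ("c", "d")}

    def fixpoint(edge):
        path = set(edge)                  # rule 1: every edge is a path
        while True:
            new = {(x, z) for (x, y) in path
                          for (y2, z) in edge if y == y2} - path
            if not new:                   # nothing new derivable: fixpoint
                return path
            path |= new                   # memoise derived facts

    print(sorted(fixpoint(edge)))         # includes ("a", "d") transitively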
- drawbacks of Datalog:
- guaranteed termination => limited set of constants, no functors
- no typing
- incremental state (no deletion) => no pruning possible
- example of hand-crafted code vs. datalog
- enter Souffle: using Datalog as a DSL, performs on par with C++ code
- key assumption: rules do not change (unlike in databases)
- performance achieved by large RAM and multicore parallelism
- inspired by predecessors bddbddb and Z3
- strongly typed (base types, records, algebraic datatypes)
- Extensions:
- structuring large datalog programs (components)
- functors (pre-declared one-column relations) - these lose the termination guarantee
- records: pre-declared structs with typed named fields
- backed by a lookup (memo) table; stored as table indices instead of using tuples directly (see the sketch below)
- can also model lists by using recursion
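- a minimal sketch (hypothetical, not Souffle internals) of the memo-table idea: records are interned once and tuples carry only an index; lists arise from recursive records with a nil sentinel:

    memo, index = [], {}

    def intern(record):                   # record: a tuple of field values
        if record not in index:
            index[record] = len(memo)     # first occurrence: new table entry
            memo.append(record)
        return index[record]

    NIL = -1                              # sentinel index for the empty list
    l1 = intern((3, NIL))                 # list [3]
    l2 = intern((2, l1))                  # list [2, 3]
    assert intern((2, l1)) == l2          # interning is idempotent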
- subsumption (implements deletion)
- killer app: intervals (subsumed by inclusion; see the sketch below)
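- a small sketch of interval subsumption (my illustration, not Souffle syntax): a derived tuple (lo, hi) can be deleted once another derived interval includes it:

    def subsume(intervals):
        return {(lo, hi) for (lo, hi) in intervals
                if not any(lo2 <= lo and hi <= hi2 and (lo2, hi2) != (lo, hi)
                           for (lo2, hi2) in intervals)}

    print(subsume({(0, 10), (2, 5), (0, 3), (4, 10)}))   # {(0, 10)}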
- compilation:
- Futamura projection (specialisation of programs): specialising the engine to a fixed rule set yields, in effect, a compiled analyser (toy illustration below)
- 2 modes: an interpreter (newer) and a synthesiser (older) that specialises
- sophisticated indexing to search relations typical for the executed semantics (rule-dependent)
- highly-optimised data structures
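- toy illustration of the specialisation idea behind the Futamura projection (generic partial evaluation, not Souffle's synthesiser): fixing the static input of a general function yields residual code with the interpretive overhead removed:

    def power(base, n):                   # general code: loops over n
        result = 1
        for _ in range(n):
            result *= base
        return result

    def specialise_power(n):              # specialiser: n is now static
        # residual program: the loop is unrolled for the fixed n
        return eval("lambda base: " + ("*".join(["base"] * n) or "1"))

    cube = specialise_power(3)            # behaves like lambda b: b*b*b
    assert cube(2) == power(2, 3) == 8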
- Provenance: where tuples come from (proof trees)
- Profiling: to identify bottlenecks
- Java security (for which Souffle was conceived and first built)
- AWS SDN: enforcing access control policies
- some more (incl. "blockchain")
- Java "points-to" analysis
- appeal of writing datalog-style reasoning
APLAS is a 2nd tier forum and typically attracts PhD work, but with quite advanced topics at times.
Highly technical talk, even the motivation.
- modeling program state changes as effects, using state operators
- algebraic laws for, e.g., omitting dead code
- (compiler might use this to optimise away statements)
- shared state (concurrency): need to add a "choice" and axioms
- (this is known background)
- to fix the semantics for concurrent execution, introduce a "yield" operator that models concurrency effects
- keeping all equations of the theory, but introducing more fine-grained state observations that can distinguish states reachable by concurrency effects
(virtual talk)
- key idea: in procedure summaries, describe memory effects more compactly using memory regions (to better compose them).
- program analysis produces "access paths" for pointers
- aliasing is eliminated abstractly
- loops lead to an unbounded # of access paths, but these reach a finite memory set
- SMT encoding of a memory-access object, abstracting pointer reads to array accesses; writes lead to versioning of the arrays (see the sketch below)
- closer examination and narrowing of properties on finite maps (regions), e.g., when describing function calls
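- sketch of the encoding idea with Z3's Python API (my assumed encoding, not necessarily the paper's): memory as an array Int -> Int, reads as Select, each write creating a new array version via Store:

    from z3 import Array, Ints, IntSort, Select, Store, Solver

    mem0 = Array("mem0", IntSort(), IntSort())
    p, q, v = Ints("p q v")

    mem1 = Store(mem0, p, v)              # write *p = v: new memory version
    read = Select(mem1, q)                # read *q against the new version

    s = Solver()
    s.add(p == q, read != v)              # if p aliases q, the read must be v
    print(s.check())                      # -> unsat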
"Where We Are and Where We Go"
- "Verification witness": an extracted proof / counterexample
- problem space: validating witnesses to address bugs in verifiers or over-approximations inherent in the verifier tool design
- analysis of 76 provers extracting witnesses and 10 validators, assessing the ability to detect wrong witnesses
- proposed (and accepted) new scoring scheme for the SV-COMP validator competition (starting from 2023): verifiers receive high negative penalties for wrong results (only for invalid witnesses, because other cases remain uncertain)
- Questions:
- What is the ground truth when verifiers have inconsistent results?
- SV-Benchmarks repository (20k problems, known and peer-verified)
- about path-invariant witnesses and how to recover errors on a single path
- witnesses may describe several paths where only one is a bug. The ideal witness only contains a single path, though.
- Did all the validators exhibit similar problems, or different ones?
- "Avoid technology bias", use different techniques for different witnesses (produced with different abstraction techniques)
- E.g., do not validate a CPA-checker witness using CPA validation
- goal: fix errors in grammars, automating a find-and-fix approach
- exact localisation of errors from failing acceptance tests
- simplistic editing approach (insert/delete/mutate) for the grammar
- considering lexical information around the syntax error to validate edits
- (other) complicated rules relating to passing and failing tests
- used to eliminate invalid patches for the grammar (see the sketch below)
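- a hypothetical sketch of the edit-and-validate loop as I understood it (grammars simplified to one rule per nonterminal, all tests assumed to be should-parse examples; "parses" is an assumed parser-check helper, not the authors' code):

    def candidate_edits(grammar, symbols):
        """One insert/delete/mutate of a symbol in some rule's RHS."""
        for lhs, rhs in grammar.items():
            for i in range(len(rhs) + 1):
                for s in symbols:                          # insert s at i
                    yield {**grammar, lhs: rhs[:i] + [s] + rhs[i:]}
            for i in range(len(rhs)):
                yield {**grammar, lhs: rhs[:i] + rhs[i + 1:]}   # delete i
                for s in symbols:                          # mutate i to s
                    yield {**grammar, lhs: rhs[:i] + [s] + rhs[i + 1:]}

    def repair(grammar, failing, passing, symbols, parses):
        for patched in candidate_edits(grammar, symbols):
            if all(parses(patched, t) for t in failing + passing):
                return patched            # first patch passing all tests
        return None                       # no single-edit patch found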
- possible applications:
- when there is a "reasonable initial grammar and driving examples"
- (use case: correcting student work, giving hints)
- (use case: genetic grammar learning)
- results: (student grammars) "mostly human-quality patches" (see screenshot)
- future work: generate tests (termed "active repair")
- Killer app: tutoring compilers courses!
- experience report about the student grammar use case above
- focus on auto-generating test suites for compilers
- another key application: fuzzing
- mostly straightforward algorithm implementations in Prolog
- algorithms often well-known
- e.g., "use each rule at least once", "use each rule on the RHS of another at least once", ..
- key insights:
- for systematic construction, more complexity (longer derivation chains) does not improve coverage any more
- better results when using randomly generated tests
- more ambitious: semantic errors, generated by attaching "annotations" to tokens (in this case identifiers, as _decl and _ref)
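- a small sketch of the "use each rule at least once" criterion (assumed grammar representation, not the speakers' Prolog code): sample derivations until every production alternative has been exercised:

    import random

    GRAMMAR = {                           # alternatives per nonterminal
        "E": [["E", "+", "T"], ["T"]],
        "T": [["id"], ["(", "E", ")"]],
    }

    def derive(sym, used, depth=0):
        if sym not in GRAMMAR:            # terminal symbol
            return [sym]
        alts = GRAMMAR[sym]
        i = (random.randrange(len(alts)) if depth < 8
             else min(range(len(alts)), key=lambda j: len(alts[j])))
        used.add((sym, i))                # record the rule just exercised
        return [t for s in alts[i] for t in derive(s, used, depth + 1)]

    def cover_all_rules(start="E"):
        goal = {(n, i) for n in GRAMMAR for i in range(len(GRAMMAR[n]))}
        used, tests = set(), []
        while used != goal:               # sample until every rule is used
            tests.append(" ".join(derive(start, used)))
        return tests

    print(cover_all_rules())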
- Reinforcement learning: middle ground between unsupervised and supervised
- generating a policy, in iterations to gradually improve
- environment assumed unknown initially, later providing feedback
- the reward-function technique is problematic for complex tasks
- often cannot compose reward functions (e.g., "visit p, then q"; see the monitor sketch below)
- application of temporal logic to formulate reward functions
- potentially much larger gains to be had from applying formal reasoning such as temporal logic:
- testability, verifiability, access to interpretations
- drawbacks:
- not robust (small changes may break the outcome)
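- tiny illustration of the composition problem (my sketch, in the style of a reward machine): a plain state-based reward cannot express the ordering in "visit p, then q", but a small monitor automaton adds the missing memory:

    class PThenQMonitor:
        """Monitor for 'visit p, then q' (hypothetical sketch)."""
        def __init__(self):
            self.state = 0                # 0: start, 1: p seen, 2: done
        def step(self, labels):           # labels: predicates true now
            if self.state == 0 and "p" in labels:
                self.state = 1
            elif self.state == 1 and "q" in labels:
                self.state = 2
                return 1.0                # reward only on completing p-then-q
            return 0.0

    m = PThenQMonitor()
    print(m.step({"q"}), m.step({"p"}), m.step({"q"}))   # 0.0 0.0 1.0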
- Markov decision process (MDP)
- potentially infinite state and/or action set
- probabilistic transition
- RL: given an MDP and a discount factor, can we generate a policy that optimises the discounted sum of rewards?
- classic optimisation problem
- 1953: there is a memory-less deterministic optimal policy
- 1957: the Bellman equation solves this (see below)
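- for reference (standard textbook notation, not from the slides), the Bellman optimality equation:

    V^*(s) = \max_{a \in A} \Big[ R(s,a) + \gamma \sum_{s' \in S} P(s' \mid s,a)\, V^*(s') \Big]

  with discount factor \gamma; a memory-less optimal policy is \pi^*(s) = \operatorname{argmax}_a of the bracketed term.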
- Reward-function approach: given a simulator of the MDP that can reset to the initial state and sample states
- by sampling executions, can estimate transition probabilities (see the sketch below)
- ultimately converging when iterated indefinitely (1992)
- (2002) converges to near-optimal policy with finitely-many samples
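- a minimal tabular Q-learning sketch of the sampling idea (standard textbook form with an assumed simulator interface, not the talk's exact algorithm):

    import random
    from collections import defaultdict

    def q_learning(sim, actions, episodes=1000, gamma=0.9, alpha=0.1, eps=0.1):
        Q = defaultdict(float)
        for _ in range(episodes):
            s, done = sim.reset(), False          # simulator can reset
            while not done:
                a = (random.choice(actions) if random.random() < eps
                     else max(actions, key=lambda a: Q[(s, a)]))
                s2, r, done = sim.step(a)         # sample one transition
                best = max(Q[(s2, a2)] for a2 in actions)
                Q[(s, a)] += alpha * (r + gamma * best - Q[(s, a)])
                s = s2
        return Q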
- from temporal specifications: given an MDP and a temporal spec.
- can also sample, to find a spec-fulfilling policy
- from example (screenshot): no guarantee to learn optimal policy using finitely-many samples
- actual (proven) result much stronger: cannot learn near-optimal policies from finitely-many samples.
- Algorithmic performance:
- reward-based sampling performs extremely poorly, by not exploring the space in a targeted manner
- (problem of horizon for the goal - "Reward based learning is myopic")
- (This relates to how the rewards are set up, not shown)
- uses the structure of the temporal spec to abstract the structure of the problem space
- DAG of partial goals, decomposing the task; learning a policy for every edge separately, then combining the policies (see the sketch below)
- optimal solution is immediate from policies (probabilities)
- induced state distribution from learnt policies: the order of learning policies matters, so that the DAG can be swept exactly once
- key insight: systematic composition of policies/probabilities
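- my reading of the compositional step (hypothetical helpers; per-edge policies and their success probabilities assumed already learnt): chain the edge policies along the most reliable source-to-goal path, computed in one topological sweep:

    import math

    def best_path(dag, source, target, edge_prob):
        """Max-product path, as a shortest path in -log space; 'dag' maps
        each node to its successors and is given in topological order."""
        cost = {n: math.inf for n in dag}
        pred, cost[source] = {}, 0.0
        for n in dag:                             # single topological sweep
            for m in dag[n]:
                c = cost[n] - math.log(edge_prob[(n, m)])
                if c < cost[m]:
                    cost[m], pred[m] = c, n
        path, n = [target], target
        while n != source:
            n = pred[n]
            path.append(n)
        return path[::-1]

    dag = {"s": ["p1", "p2"], "p1": ["g"], "p2": ["g"], "g": []}
    probs = {("s", "p1"): 0.9, ("p1", "g"): 0.8,
             ("s", "p2"): 0.5, ("p2", "g"): 0.9}
    print(best_path(dag, "s", "g", probs))        # ['s', 'p1', 'g']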
- Other aspects, besides performance: verifiability and explanation of (optimal) solutions, correctness assurance
- "inherently interpretable"
- able to generalise algorithms, capitalising on compositionality
- "what about theoretical guarantees?"
- guarantees apply only to finite-state MDPs
- practical MDPs are usually infinite-state
- Challenge: explore/define what guarantees can be developed for AI systems out of formal methods
- distributions depend on the path; how does this affect the results?
- current algo uses heuristics
- greedily composing best-so-far path sometimes sufficient
- in other applications, need to average over all paths
- composition is sequential, can it be parallel composition?
- follow-up question: can you compute parts in parallel?
- in order to learn, need the full picture of all partial policies
- are your state spaces finite, or are you using continuous state?
- all presented problems are continuous-state
- follow-up: is it single-state (single agent?)
- the shown examples are single-agent (in full dimensionality), but other applications have a multi-agent setup
- follow-up: if you don't have access to a simulator, can you use offline learning?
- RL uses an online method to target exploration; not having the simulator is a difficult problem
- also, if all data is known a priori, RL is not the right choice
- what is the semantics of the nodes in the DAG?
- whatever predicates are in the spec will appear as nodes
- record calculi, from the late 80s/90s
- subtyping for records breaks (to some extent) record concatenation
- example: intersection of two record types
- 90s work: only defined when record fields are disjoint
- problematic limitation: lack of recursion
- this talk: a calculus adding recursion to λᵢ, termed λᵢ^μ
- notions of: record merge (disjoint, based on top-like types); function types (⊥ → ⊤ as their common supertype, which is tricky because of contravariance; see below); completeness arguments; and construction of top types
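- my gloss of the ⊥ → ⊤ point, in standard subtyping notation (not necessarily the paper's exact formulation): by the usual arrow rule

    \frac{C <: A \qquad B <: D}{A \to B \;<:\; C \to D}

  every function type satisfies A \to B \;<:\; \bot \to \top (since \bot <: A and B <: \top), making ⊥ → ⊤ the top of the function types; the contravariant argument position is what makes merging function types tricky.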
- student (Yahui Song, Singapore) nearing completion
- DeepSEA = CertiK technology
- this poster: incremental work on reentrancy
- general approach:
- deepSEA language compiles to Coq,
- Coq proofs use additional library of primitives
- Coq code generation of Ethereum bytecode
The student (Daniel Britten, NZ) is nearing completion; he was interested in K (tried it recently) and its technical background.
Daniel later reached out to Yan and myself about potential jobs in ~6 months' time.