Skip to content

Instantly share code, notes, and snippets.

@jtpaasch
Last active January 3, 2024 14:44
Show Gist options
  • Star 0 You must be signed in to star a gist
  • Fork 0 You must be signed in to fork a gist
  • Save jtpaasch/1080012d2f79a0056daea64b7660f000 to your computer and use it in GitHub Desktop.
Save jtpaasch/1080012d2f79a0056daea64b7660f000 to your computer and use it in GitHub Desktop.
The reaching definitions analysis (done in Souffle datalog), from Principles of Program Analysis, ch. 2.
/*
=======================================================
01 - Arithmetic expressions
=======================================================
Encoding arithmetic expressions and simple assignments with datalog.
The language is a subset of the one in PPA, pp. 3-4.
*/
/*
-------------------------------------------------------
Types
-------------------------------------------------------
*/
.type id <: symbol // Each AST node gets an ID, e.g. "N3".
.type num <: number // A numeral, e.g. "13" or "2".
.type var <: symbol // A variable, e.g. "x" or "y".
.type aOp <: symbol // An operator, e.g. "+" or "*".
.type label <: symbol // Program labels, e.g. "L1".
/*
-------------------------------------------------------
Program encoding
-------------------------------------------------------
*/
.decl numAexpr(id:id,num:num)
.decl varAexpr(id:id,var:var)
.decl binOpAexpr(id:id,aOp:aOp,operand1:id,operand2:id)
.decl assignmentStmt(id:id,label:label,var:var,aexpr:id)
.decl seqStmt(id:id,lhs:id,rhs:id)
/*
-------------------------------------------------------
Computed relations
-------------------------------------------------------
*/
// Variables that occur in expressions.
.decl freeVar(var:var)
freeVar(x) :- varAexpr(_,x).
/*
-------------------------------------------------------
Outputs
-------------------------------------------------------
*/
.output freeVar
/*
-------------------------------------------------------
The program
-------------------------------------------------------
We'll encode this program:
```
L1 y := x
L2 z := 1
L3 z := z + y
```
Here is the syntax tree, with node IDs and labels added in:
```
+-- N2 Assgn L1
|
N1 Seq --+ +-- N4 Assgn L2
| |
+-- N3 Seq --+
|
+-- N5 Assgn L3
```
*/
// L1 ... ; L2 ...
seqStmt("N1","N2","N3").
// L1 y := x
assignmentStmt("N2","L1","y","N100").
varAexpr("N100","x").
// L2 ... ; L3 ...
seqStmt("N3","N4","N5").
// L2 z := 1
assignmentStmt("N4","L2","z","N110").
numAexpr("N110",1).
// L3 z := z + y
assignmentStmt("N5","L3","z","N120").
binOpAexpr("N120","+","N121","N122").
varAexpr("N121","z").
varAexpr("N122","y").
/*
-------------------------------------------------------
Commentary
-------------------------------------------------------
I run this program with souffle:
```
souffle 01-aexpr.dl -D-
```
I get the expected output:
```
--------------
freeVar
var
===============
x
z
y
===============
```
*/
/*
=======================================================
02 - More program statements
=======================================================
Encoding more program statements with datalog, in order
to handle more complex control flow.
The language is taken from PPA, pp. 3-4.
*/
/*
-------------------------------------------------------
Types
-------------------------------------------------------
*/
.type id <: symbol // Each AST node gets an ID, e.g. "N3".
.type num <: number // A numeral, e.g. "13" or "2".
.type var <: symbol // A variable, e.g. "x" or "y".
.type aOp <: symbol // An arithmetic operator, e.g. "+" or "*".
.type bOp <: symbol // A boolean operator, e.g. "<" or "=".
.type label <: symbol // Program labels, e.g. "L1".
/*
-------------------------------------------------------
Program encoding
-------------------------------------------------------
*/
.decl numAexpr(id:id,num:num)
.decl varAexpr(id:id,var:var)
.decl binOpAexpr(id:id,aOp:aOp,operand1:id,operand2:id)
.decl binOpBexpr(id:id,bOp:bOp,operand1:id,operand2:id)
.decl assignmentStmt(id:id,label:label,var:var,aexpr:id)
.decl seqStmt(id:id,lhs:id,rhs:id)
.decl ifStmt(id:id,label:label,cond:id,branch1:id,branch2:id)
/*
-------------------------------------------------------
Computed relations
-------------------------------------------------------
*/
// Free variables that occur in expressions.
.decl freeVar(var:var)
freeVar(x) :- varAexpr(_,x).
/*
-------------------------------------------------------
Outputs
-------------------------------------------------------
*/
.output freeVar
/*
-------------------------------------------------------
The program
-------------------------------------------------------
We'll encode this program:
```
L1 y := x
L2 if (x > 0) then
L3 z := 3
else
L4 z := 0
L5 y := z * x
fi
```
Here is the abstract syntax tree, with nodes and labels added in:
```
+-- N2 Assgn L1
|
N1 Seq --+ +-- Then --+-- N4 Assgn L3
| |
+-- N3 If L3 --+ +-- N6 Assgn L4
| |
+-- Else --+-- N5 Seq --+
|
+-- N7 Assgn L5
```
*/
// L1 ... ; L2 if ...
seqStmt("N1","N2","N3").
// L1 y := x
assignmentStmt("N2","L1","y","N100").
varAexpr("N100","x").
// L2 if (x > 0) ...
ifStmt("N3","L3","N110","N4","N5").
binOpBexpr("N110",">","N111","N112").
varAexpr("N111","x").
numAexpr("N112",0).
// L3 z := 3
assignmentStmt("N4","L3","z","N120").
numAexpr("N120",3).
// L4 ... ; L5 ...
seqStmt("N5","N6","N7").
// L4 z := 0
assignmentStmt("N6","L4","z","N130").
numAexpr("N130",0).
// L5 y := z * x
assignmentStmt("N7","L5","y","N140").
binOpAexpr("N140","*","N141","N142").
varAexpr("N141","z").
varAexpr("N142","x").
/*
-------------------------------------------------------
Commentary
-------------------------------------------------------
I run this program with souffle:
```
souffle 02-stmts.dl -D-
```
I get the expected output (note that `y` does _not_ appear
in any expressions in the above program, and so it is not
listed as a free variable):
```
---------------
freeVar
var
===============
x
z
===============
```
*/
/*
=======================================================
03 - Control flow - Example 1
=======================================================
Encoding control flow for a datalog version of PPA's
reaching definition analysis, and computing the `init`,
`final`, and `flow` functions defined in PPA, pp. 36-37.
*/
/*
-------------------------------------------------------
Types
-------------------------------------------------------
*/
.type id <: symbol // AST node IDs (e.g., "N1")
.type num <: number // Numerals (e.g., "13")
.type var <: symbol // Variables (e.g., "x")
.type aOp <: symbol // Arithmetic operators (e.g., "*" or "+")
.type bool <: symbol // "true" or "false"
.type bOp <: symbol // Boolean operators (e.g., "=" or "<")
.type label <: symbol // E.g., "L1", "L2", etc.
/*
-------------------------------------------------------
Program encoding
-------------------------------------------------------
*/
.decl numAexpr(id:id,num:num)
.decl varAexpr(id:id,var:var)
.decl boolBexpr(id:id,bool:bool)
.decl binOpAexpr(id:id,op:aOp,operand1:id,operand2:id)
.decl binOpBexpr(id:id,op:bOp,operand1:id,operand2:id)
.decl skipStmt(id:id,label:label)
.decl assignmentStmt(id:id,label:label,var:var,aexpr:id)
.decl seqStmt(id:id,lhs:id,rhs:id)
.decl ifStmt(id:id,label:label,cond:id,branch1:id,branch2:id)
.decl whileStmt(id:id,label:label,cond:id,body:id)
.decl stmtLabel(id:id,label:label)
/*
-------------------------------------------------------
Computed relations
-------------------------------------------------------
*/
.decl init(id:id,label:label)
init(id,label) :- assignmentStmt(id,label,_,_).
init(id,label) :- skipStmt(id,label).
init(id,label) :- seqStmt(id,lhs,_),init(lhs,label).
init(id,label) :- ifStmt(id,label,_,_,_).
init(id,label) :- whileStmt(id,label,_,_).
.decl final(id:id,label:label)
final(id,label) :- assignmentStmt(id,label,_,_).
final(id,label) :- skipStmt(id,label).
final(id,label) :- seqStmt(id,_,rhs),final(rhs,label).
final(id,label) :- ifStmt(id,_,_,branch1,_),final(branch1,label).
final(id,label) :- ifStmt(id,_,_,_,branch2),final(branch2,label).
final(id,label) :- whileStmt(id,label,_,_).
.decl flow(src:label,dst:label)
flow(src,dst) :-
seqStmt(_,srcId,_),
stmtLabel(srcId,src),
flow(src,dst).
flow(src,dst) :-
seqStmt(_,_,srcId),
stmtLabel(srcId,src),
flow(src,dst).
flow(src,dst) :-
seqStmt(_,lhs,rhs),
final(lhs,src),
init(rhs,dst).
flow(src,dst) :-
ifStmt(_,_,_,srcId,_),
stmtLabel(srcId,src),
flow(src,dst).
flow(src,dst) :-
ifStmt(_,_,_,_,srcId),
stmtLabel(srcId,src),
flow(src,dst).
flow(src,dst) :-
ifStmt(_,src,_,branch1,_),
init(branch1,dst).
flow(src,dst) :-
ifStmt(_,src,_,_,branch2),
init(branch2,dst).
flow(src,dst) :-
whileStmt(_,_,_,srcId),
stmtLabel(srcId,src),
flow(src,dst).
flow(src,dst) :-
whileStmt(_,src,_,body),
init(body,dst).
flow(src,dst) :-
whileStmt(_,dst,_,body),
final(body,src).
/*
-------------------------------------------------------
Outputs
-------------------------------------------------------
*/
.output init
.output final
.output flow
/*
-------------------------------------------------------
The program
-------------------------------------------------------
We'll encode this program (Example 2.1, from p. 37 of PPA).
```
L1 x := 1
L2 while (x > 0) do
L3 z := z * y
L4 x := x - 1
done
```
Here's the syntax tree, with node IDs and labels added to it:
```
+-- N2 Assgn L1
|
N1 Seq --+ +-- N5 Assgn L3
| |
+-- N3 While L2 --+-- N4 Seq --+
|
+-- N6 Assgn L4
```
*/
// Which statement IDs have which labels
stmtLabel("N2","L1").
stmtLabel("N3","L2").
stmtLabel("N5","L3").
stmtLabel("N6","L4").
// L1 ... ; L2 while ...
seqStmt("N1","N2","N3").
// L1 x := 1
assignmentStmt("N2","L1","x","N100").
numAexpr("N100",1).
// L2 while...
whileStmt("N3","L2","N110","N4").
binOpBexpr("N110",">","N111","N112").
varAexpr("N111","x").
numAexpr("N112",0).
// L3 ... ; L4 ...
seqStmt("N4","N5","N6").
// L3 z := z * y
assignmentStmt("N5","L3","z","N130").
binOpAexpr("N130","*","N131","N132").
varAexpr("N131","z").
varAexpr("N132","y").
// L4 x := x - 1
assignmentStmt("N6","L4","x","N140").
binOpAexpr("N140","-","N141","N142").
varAexpr("N141","x").
numAexpr("N142",1).
/*
-------------------------------------------------------
Commentary
-------------------------------------------------------
I run this program with souffle:
```
souffle 03-control-flow-ex-1.dl -D-
```
The output matches the results I see in PPA on p. 37.
Specifically, I get this output:
```
---------------
final
id label
===============
N2 L1
N5 L3
N6 L4
N1 L2
N3 L2
N4 L4
===============
---------------
init
id label
===============
N2 L1
N5 L3
N6 L4
N1 L1
N3 L2
N4 L3
===============
---------------
flow
src dst
===============
L1 L2
L3 L4
L4 L2
L2 L3
===============
```
*/
/*
=======================================================
03 - Control flow - Example 2
=======================================================
Encoding control flow for a datalog version of PPA's
reaching definition analysis, computing the functions
`init`, `final`, and `flow` described in PPA, pp. 36-37.
This is just like 03 - Control flow - Example 1, but
the example program is different.
*/
/*
-------------------------------------------------------
Types
-------------------------------------------------------
*/
.type id <: symbol // Node IDs (e.g., "N1")
.type num <: number // Numerals (e.g., "13")
.type var <: symbol // Variables (e.g., "x")
.type aOp <: symbol // Arithmetic operators (e.g., "*" or "+")
.type bool <: symbol // "True" or "False"
.type bOp <: symbol // Boolean operators (e.g., "=" or "<")
.type unknownLabel <: symbol // I.e., "?"
.type label <: symbol // E.g., "L1", "L2", etc.
.type label? = label | unknownLabel
/*
-------------------------------------------------------
Program encoding
-------------------------------------------------------
*/
.decl numAexpr(id:id,num:num)
.decl varAexpr(id:id,var:var)
.decl boolBexpr(id:id,bool:bool)
.decl binOpAexpr(id:id,op:aOp,operand1:id,operand2:id)
.decl binOpBexpr(id:id,op:bOp,operand1:id,operand2:id)
.decl skipStmt(id:id,label:label)
.decl assignmentStmt(id:id,label:label,var:var,aexpr:id)
.decl seqStmt(id:id,lhs:id,rhs:id)
.decl ifStmt(id:id,label:label,cond:id,branch1:id,branch2:id)
.decl whileStmt(id:id,label:label,cond:id,body:id)
.decl stmtLabel(id:id,label:label)
/*
-------------------------------------------------------
Computed relations
-------------------------------------------------------
*/
.decl init(id:id,label:label)
init(id,label) :- assignmentStmt(id,label,_,_).
init(id,label) :- skipStmt(id,label).
init(id,label) :- seqStmt(id,lhs,_),init(lhs,label).
init(id,label) :- ifStmt(id,label,_,_,_).
init(id,label) :- whileStmt(id,label,_,_).
.decl final(id:id,label:label)
final(id,label) :- assignmentStmt(id,label,_,_).
final(id,label) :- skipStmt(id,label).
final(id,label) :- seqStmt(id,_,rhs),final(rhs,label).
final(id,label) :- ifStmt(id,_,_,branch1,_),final(branch1,label).
final(id,label) :- ifStmt(id,_,_,_,branch2),final(branch2,label).
final(id,label) :- whileStmt(id,label,_,_).
.decl flow(src:label,dst:label)
flow(src,dst) :-
seqStmt(_,srcId,_),
stmtLabel(srcId,src),
flow(src,dst).
flow(src,dst) :-
seqStmt(_,_,srcId),
stmtLabel(srcId,src),
flow(src,dst).
flow(src,dst) :-
seqStmt(_,lhs,rhs),
final(lhs,src),
init(rhs,dst).
flow(src,dst) :-
ifStmt(_,_,_,srcId,_),
stmtLabel(srcId,src),
flow(src,dst).
flow(src,dst) :-
ifStmt(_,_,_,_,srcId),
stmtLabel(srcId,src),
flow(src,dst).
flow(src,dst) :-
ifStmt(_,src,_,branch1,_),
init(branch1,dst).
flow(src,dst) :-
ifStmt(_,src,_,_,branch2),
init(branch2,dst).
flow(src,dst) :-
whileStmt(_,_,_,srcId),
stmtLabel(srcId,src),
flow(src,dst).
flow(src,dst) :-
whileStmt(_,src,_,body),
init(body,dst).
flow(src,dst) :-
whileStmt(_,dst,_,body),
final(body,src).
/*
-------------------------------------------------------
Outputs
-------------------------------------------------------
*/
.output init
.output final
.output flow
/*
-------------------------------------------------------
The program
-------------------------------------------------------
We'll encode this program (taken from p. 45 of PPA).
```
L1 x := 5
L2 y := 1
L3 while (x > 1) do
L4 y := x * y
L5 x := x - 1
done
```
Here's the syntax tree, with node IDs and labels added to it:
```
+-- N2 Assgn L1
|
N1 Seq --+ +-- N4 Assgn L2
| |
+-- N3 Seq --+ +-- N7 Assgn L4
| |
+-- N5 While L3 -- N6 Seq --+
|
+-- N8 Assgn L5
```
*/
stmtLabel("N2","L1").
stmtLabel("N4","L2").
stmtLabel("N5","L3").
stmtLabel("N7","L4").
stmtLabel("N8","L5").
seqStmt("N1","N2","N3").
seqStmt("N3","N4","N5").
seqStmt("N6","N7","N8").
// L1 x := 5
assignmentStmt("N2","L1","x","N100").
numAexpr("N100",5).
// L2 y := 1
assignmentStmt("N4","L2","y","N110").
numAexpr("N110",1).
// L3 while...
whileStmt("N5","L3","N120","N6").
binOpBexpr("N120",">","N121","N122").
varAexpr("N121","x").
numAexpr("N122",1).
// L4 y := x * y
assignmentStmt("N7","L4","y","N130").
binOpAexpr("N130","*","N131","N132").
varAexpr("N131","x").
varAexpr("N132","y").
// L5 x := x - 1
assignmentStmt("N8","L5","x","N140").
binOpAexpr("N140","-","N141","N142").
varAexpr("N141","x").
numAexpr("N142",1).
/*
-------------------------------------------------------
Commentary
-------------------------------------------------------
I run this program with souffle:
```
souffle 03-control-flow-ex-2.dl -D-
```
I get the expected output:
```
---------------
final
id label
===============
N2 L1
N4 L2
N7 L4
N8 L5
N1 L3
N3 L3
N5 L3
N6 L5
===============
---------------
init
id label
===============
N2 L1
N4 L2
N7 L4
N8 L5
N1 L1
N3 L2
N5 L3
N6 L4
===============
---------------
flow
src dst
===============
L1 L2
L2 L3
L4 L5
L5 L3
L3 L4
===============
```
*/
/*
=======================================================
04 - Reaching definition analysis - Full
=======================================================
This is a full encoding of PPA's reaching definition analysis
in datalog. This tracks each node ID in the abstract syntax
tree, and it computes `flow` information as well as the
`gen`, `kill`, and `RD` functions.
*/
/*
-------------------------------------------------------
Types
-------------------------------------------------------
*/
.type id <: symbol // Node IDs (e.g., "N1")
.type num <: number // Numerals (e.g., "13")
.type var <: symbol // Variables (e.g., "x")
.type aOp <: symbol // Arithmetic operators (e.g., "*" or "+")
.type bool <: symbol // "True" or "False"
.type bOp <: symbol // Boolean operators (e.g., "=" or "<")
.type unknownLabel <: symbol // I.e., "?"
.type label <: symbol // E.g., "L1", "L2", etc.
.type label? = label | unknownLabel
/*
-------------------------------------------------------
Program encoding
-------------------------------------------------------
*/
.decl numAexpr(id:id,num:num)
.decl varAexpr(id:id,var:var)
.decl boolBexpr(id:id,bool:bool)
.decl binOpAexpr(id:id,op:aOp,operand1:id,operand2:id)
.decl binOpBexpr(id:id,op:bOp,operand1:id,operand2:id)
.decl skipStmt(id:id,label:label)
.decl assignmentStmt(id:id,label:label,var:var,aexpr:id)
.decl seqStmt(id:id,lhs:id,rhs:id)
.decl ifStmt(id:id,label:label,cond:id,branch1:id,branch2:id)
.decl whileStmt(id:id,label:label,cond:id,body:id)
.decl progStart(label:label)
.decl stmtLabel(id:id,label:label)
/*
-------------------------------------------------------
Computed relations
-------------------------------------------------------
*/
.decl freeVar(var:var)
freeVar(x) :- varAexpr(_,x).
.decl init(id:id,label:label)
init(id,label) :- assignmentStmt(id,label,_,_).
init(id,label) :- skipStmt(id,label).
init(id,label) :- seqStmt(id,lhs,_),init(lhs,label).
init(id,label) :- ifStmt(id,label,_,_,_).
init(id,label) :- whileStmt(id,label,_,_).
.decl final(id:id,label:label)
final(id,label) :- assignmentStmt(id,label,_,_).
final(id,label) :- skipStmt(id,label).
final(id,label) :- seqStmt(id,_,rhs),final(rhs,label).
final(id,label) :- ifStmt(id,_,_,branch1,_),final(branch1,label).
final(id,label) :- ifStmt(id,_,_,_,branch2),final(branch2,label).
final(id,label) :- whileStmt(id,label,_,_).
.decl flow(src:label,dst:label)
flow(src,dst) :-
seqStmt(_,srcId,_),
stmtLabel(srcId,src),
flow(src,dst).
flow(src,dst) :-
seqStmt(_,_,srcId),
stmtLabel(srcId,src),
flow(src,dst).
flow(src,dst) :-
seqStmt(_,lhs,rhs),
final(lhs,src),
init(rhs,dst).
flow(src,dst) :-
ifStmt(_,_,_,srcId,_),
stmtLabel(srcId,src),
flow(src,dst).
flow(src,dst) :-
ifStmt(_,_,_,_,srcId),
stmtLabel(srcId,src),
flow(src,dst).
flow(src,dst) :-
ifStmt(_,src,_,branch1,_),
init(branch1,dst).
flow(src,dst) :-
ifStmt(_,src,_,_,branch2),
init(branch2,dst).
flow(src,dst) :-
whileStmt(_,_,_,srcId),
stmtLabel(srcId,src),
flow(src,dst).
flow(src,dst) :-
whileStmt(_,src,_,body),
init(body,dst).
flow(src,dst) :-
whileStmt(_,dst,_,body),
final(body,src).
.decl kill(stmt:label,var:var,label:label?)
kill(stmt,var,"?") :- assignmentStmt(_,stmt,var,_).
kill(stmt,var,label) :-
assignmentStmt(_,stmt,var,_),
assignmentStmt(_,label,var,_).
.decl gen(stmt:label,var:var,label:label?)
gen(stmt,var,stmt) :- assignmentStmt(_,stmt,var,_).
.decl RDin(stmt:label,var:var,label:label?)
RDin(stmt,var,"?") :-
progStart(stmt),
freeVar(var).
RDin(stmt,var,label) :-
!progStart(stmt),
flow(label2,stmt),
RDout(label2,var,label).
.decl RDout(stmt:label,var:var,label:label?)
RDout(stmt,var,label) :- gen(stmt,var,label).
RDout(stmt,var,label) :- RDin(stmt,var,label),!kill(stmt,var,label).
/*
-------------------------------------------------------
Outputs
-------------------------------------------------------
*/
.output kill
.output gen
.output RDin
.output RDout
/*
-------------------------------------------------------
The program
-------------------------------------------------------
We'll encode this program (taken from p. 45 of PPA).
```
L1 x := 5
L2 y := 1
L3 while (x > 1) do
L4 y := x * y
L5 x := x - 1
done
```
Here's the syntax tree, with node IDs and labels added to it:
```
+-- N2 Assgn L1
|
N1 Seq --+ +-- N4 Assgn L2
| |
+-- N3 Seq --+ +-- N7 Assgn L4
| |
+-- N5 While L3 -- N6 Seq --+
|
+-- N8 Assgn L5
```
*/
progStart("L1").
stmtLabel("N2","L1").
stmtLabel("N4","L2").
stmtLabel("N5","L3").
stmtLabel("N7","L4").
stmtLabel("N8","L5").
seqStmt("N1","N2","N3").
seqStmt("N3","N4","N5").
seqStmt("N6","N7","N8").
// L1 x := 5
assignmentStmt("N2","L1","x","N100").
numAexpr("N100",5).
// L2 y := 1
assignmentStmt("N4","L2","y","N110").
numAexpr("N110",1).
// L3 while...
whileStmt("N5","L3","N120","N6").
binOpBexpr("N120",">","N121","N122").
varAexpr("N121","x").
numAexpr("N122",1).
// L4 y := x * y
assignmentStmt("N7","L4","y","N130").
binOpAexpr("N130","*","N131","N132").
varAexpr("N131","x").
varAexpr("N132","y").
// L5 x := x - 1
assignmentStmt("N8","L5","x","N140").
binOpAexpr("N140","-","N141","N142").
varAexpr("N141","x").
numAexpr("N142",1).
/*
-------------------------------------------------------
Commentary
-------------------------------------------------------
I run this program with souffle:
```
souffle 04-reaching-def-full.dl -D-
```
I get the expected output:
```
---------------
gen
stmt var label
===============
L1 x L1
L2 y L2
L4 y L4
L5 x L5
===============
---------------
kill
stmt var label
===============
L1 x L1
L1 x L5
L1 x ?
L2 y L2
L2 y L4
L2 y ?
L4 y L2
L4 y L4
L4 y ?
L5 x L1
L5 x L5
L5 x ?
===============
---------------
RDin
stmt var label
===============
L1 x ?
L1 y ?
L2 x L1
L2 y ?
L4 x L1
L4 x L5
L4 y L2
L4 y L4
L5 x L1
L5 x L5
L5 y L4
L3 x L1
L3 x L5
L3 y L2
L3 y L4
===============
---------------
RDout
stmt var label
===============
L1 x L1
L1 y ?
L2 x L1
L2 y L2
L4 x L1
L4 x L5
L4 y L4
L5 x L5
L5 y L4
L3 x L1
L3 x L5
L3 y L2
L3 y L4
===============
```
*/
/*
=======================================================
04 - Reaching definition analysis - Simple
=======================================================
Encoding PPA's reaching definition analysis in datalog.
This is a very simple encoding of what we have in PPA.
It does not compute `flow`, for example, but instead
we type that in by hand. It also does not track each
node ID in the abstract syntax tree. The emphasis here
is just on the basic `gen`,`kill`, and `RD` functions.
*/
/*
-------------------------------------------------------
Types
-------------------------------------------------------
*/
.type id <: symbol // Expression IDs (e.g., "E1")
.type num <: number // Numerals (e.g., "13")
.type var <: symbol // Variables (e.g., "x")
.type aOp <: symbol // Arithmetic operators (e.g., "*" or "+")
.type bool <: symbol // "True" or "False"
.type bOp <: symbol // Boolean operators (e.g., "=" or "<")
.type unknownLabel <: symbol // I.e., "?"
.type label <: symbol // E.g., "l1", "l2", etc.
.type label? = label | unknownLabel
/*
-------------------------------------------------------
Program encoding
-------------------------------------------------------
*/
.decl numAexpr(id:id,num:num)
.decl varAexpr(id:id,var:var)
.decl binOpAexpr(id:id,op:aOp,operand1:id,operand2:id)
.decl boolBexpr(id:id,bool:bool)
.decl binOpBexpr(id:id,op:bOp,operand1:id,operand2:id)
.decl skipStmt(label:label)
.decl assignmentStmt(label:label,var:var,aexpr:id)
.decl seqStmt(label1:label,label2:label)
.decl ifStmt(label:label,cond:id,branch1:label,branch2:label)
.decl whileStmt(label:label,cond:id,body:label)
/*
-------------------------------------------------------
Computed relations
-------------------------------------------------------
*/
.decl freeVar(var:var)
freeVar(x) :- varAexpr(_,x).
.decl init(stmt:label,entry:label)
.decl final(stmt:label,exit:label)
.decl flow(from:label,to:label)
.decl kill(stmt:label,var:var,label:label?)
kill(stmt,var,"?") :- assignmentStmt(stmt,var,_).
kill(stmt,var,label) :-
assignmentStmt(stmt,var,_),
assignmentStmt(label,var,_).
.decl gen(stmt:label,var:var,label:label?)
gen(stmt,var,stmt) :- assignmentStmt(stmt,var,_).
.decl RDin(stmt:label,var:var,label:label?)
RDin(stmt,var,"?") :- init(stmt,_), freeVar(var).
RDin(stmt,var,label) :-
!init(stmt,_),
flow(label2,stmt),
RDout(label2,var,label).
.decl RDout(stmt:label,var:var,label:label?)
RDout(stmt,var,label) :- gen(stmt,var,label).
RDout(stmt,var,label) :- RDin(stmt,var,label), !kill(stmt,var,label).
/*
-------------------------------------------------------
Outputs
-------------------------------------------------------
*/
.output freeVar
.output kill
.output gen
.output RDin
.output RDout
/*
-------------------------------------------------------
The program
-------------------------------------------------------
We'll encode this program (taken from p. 45 of PPA).
```
L1 x := 5
L2 y := 1
L3 while (x > 1) do
L4 y := x * y
L5 x := x - 1
done
```
*/
// Expressions that occur in the program
boolBexpr("E1","true").
boolBexpr("E2","false").
numAexpr("E3",5).
numAexpr("E4",1).
varAexpr("E5","x").
varAexpr("E6","y").
binOpBexpr("E7",">","E5","E4").
binOpAexpr("E8","*","E5","E6").
binOpAexpr("E9","-","E5","E4").
// Statements that occur in the program
assignmentStmt("L1","x","E3").
assignmentStmt("L2","y","E4").
assignmentStmt("L4","y","E8").
assignmentStmt("L5","x","E9").
whileStmt("L3","E7","L4").
seqStmt("L1","L2").
seqStmt("L2","L3").
seqStmt("L4","L5").
// Flow information
init("L1","L1").
final("L1","L3").
flow("L1","L2").
flow("L2","L3").
flow("L3","L4").
flow("L4","L5").
flow("L5","L3").
/*
-------------------------------------------------------
Commentary
-------------------------------------------------------
The `kill`, `gen`, `RDin`, and `RDout` predicates are copied pretty much
verbatim from PPA.
I run this datalog program with souffle as follows:
```
souffle 04-reaching-def-simple.dl -D-
```
The output I get matches the results listed in PPA. In particular,
I see the following.
The `kill` output is this (which matches the PPA results on p. 45):
```
---------------
kill
stmt var label
===============
L1 x L1
L1 x L5
L1 x ?
L2 y L2
L2 y L4
L2 y ?
L4 y L2
L4 y L4
L4 y ?
L5 x L1
L5 x L5
L5 x ?
===============
```
The `gen` output is this (which matches the PPA results on p. 45):
```
---------------
gen
stmt var label
===============
L1 x L1
L2 y L2
L4 y L4
L5 x L5
===============
```
`RDin` is this (which matches the PPA results on p. 46):
```
---------------
RDin
stmt var label
===============
L1 x ?
L1 y ?
L2 x L1
L2 y ?
L4 x L1
L4 x L5
L4 y L2
L4 y L4
L5 x L1
L5 x L5
L5 y L4
L3 x L1
L3 x L5
L3 y L2
L3 y L4
===============
```
And `RDout` is this (which matches the PPA results on p. 46):
```
---------------
RDout
stmt var label
===============
L1 x L1
L1 y ?
L2 x L1
L2 y L2
L4 x L1
L4 x L5
L4 y L4
L5 x L5
L5 y L4
L3 x L1
L3 x L5
L3 y L2
L3 y L4
===============
```
*/
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment