Yoyak:
static analysis framework
Heejong Lee
ScalaDays 2015
Speaker Introduction
• Has been working in the static analysis industry since 2008
• Studied programming language theory in graduate school
• Has developed several static analyzers, most of them
commercial
• Began using Scala six years ago and still uses it actively in
everyday development
Agenda
• Static analysis
• Theory of abstract interpretation
• Yoyak framework: implementation highlights
• Yoyak framework: Scala experience
• Yoyak framework: Roadmap
Static Analysis
What is Static Analysis?
• Analyzing source code without actually running it
• Some prefer to call it white-box testing
• Used for finding bugs, optimizing a compiled binary,
calculating a software metric, proving safety properties, etc.
Examples of Static Analysis
• Finding bugs : symbolic execution
• Optimizing a compiled binary: data flow analysis
• Calculating a software metric: syntactic analysis
• Proving safety properties: model checking, abstract
interpretation, type system
Two important terms in Static Analysis
• Soundness
• The analysis result should contain all possibilities which can
happen at runtime
• If the analysis uses an over-approximation, it is sound
• Completeness
• The analysis result should not contain any possibility which
cannot happen at runtime
• If the analysis uses an under-approximation, it is complete
Two important terms in Static Analysis
[Figure: nested regions, with the under-approximation of semantics inside the program semantics, which is in turn inside the over-approximation of semantics]
Abstract Interpretation
The beauty of abstraction
http://cargocollective.com/carlyfox/Design
What is the result of this expression?
$19224 \times 7483919 \times (11952 - 20392)$
$= -1214270048744640$
How long does it take without a calculator?
What if we are not interested in the exact number, but only
want to know whether it is positive or negative?
$19224 \times 7483919 \times (11952 - 20392)$
$\xrightarrow{\alpha} \hat{+} \times \hat{+} \times \hat{-}$
$= \hat{-}$
$= n \quad (n \in \mathbb{Z} \wedge n < 0)$
Exact calculation: takes 30 seconds
Sign calculation: takes 3 seconds
• inaccurate but not incorrect
• accurate enough for a specific purpose
• much faster than a real calculation
This is abstract interpretation
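The sign calculation above can be sketched in a few lines of Scala. This is only an illustration: `SignSketch`, `alpha`, and `mul` are hypothetical names, not part of Yoyak, and only multiplication is modeled.

```scala
object SignSketch {
  // Abstract values: the sign of an integer.
  sealed trait Sign
  case object Neg extends Sign
  case object Zero extends Sign
  case object Pos extends Sign

  // Abstraction function: map a concrete integer to its sign.
  def alpha(n: Long): Sign =
    if (n < 0) Neg else if (n == 0) Zero else Pos

  // Abstract multiplication: the rule of signs.
  def mul(a: Sign, b: Sign): Sign = (a, b) match {
    case (Zero, _) | (_, Zero) => Zero
    case (x, y) if x == y      => Pos
    case _                     => Neg
  }

  // Concrete result of the expression from the slide.
  val exact: Long = 19224L * 7483919L * (11952L - 20392L)

  // Abstract result: abstract each factor, then multiply signs.
  val abstracted: Sign =
    mul(mul(alpha(19224L), alpha(7483919L)), alpha(11952L - 20392L))
}
```

The abstract result agrees with the sign of the exact result, which is the soundness property in miniature.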
Is this program safe from buffer overruns?
void foo(int x) {
    String[] strs = new String[10];
    int index = 0;
    while(x > 0) {
        index = index + 1;
        x = x - 1;
    }
    strs[index] = "hello!";
}
No, ArrayIndexOutOfBoundsException may occur at the last line
void foo(int x) {
    String[] strs = new String[10];
    int index = 0;              // index = [0,0]
    while(x > 0) {
        index = index + 1;      // index = [1,∞]
        x = x - 1;
    }
    strs[index] = "hello!";     // index = [0,∞]
}
• Roughly but soundly execute the program
Abstract interpretation for dummies
?
Abstract interpretation for brains
First, we need to precisely define what “domain” and
“semantics” mean in a mathematical way
Let me introduce you to the Javar language
1
What does this program mean?
Javar-1
$C \to n \quad (n \in \mathbb{Z})$
Javar-1 semantic domain
$n \in \mathit{Value} = \mathbb{Z}$
$\llbracket C \rrbracket \in \mathit{Value}$
Javar-1 semantics
$\llbracket n \rrbracket = n$
1+1
Javar-2
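$C \to n\ \mathit{op}\ n \quad (n \in \mathbb{Z},\ \mathit{op} \in \{+, -, *, /\})$
Javar-{1,2} semantic domain
$n \in \mathit{Value} = \mathbb{Z}$
$\llbracket C \rrbracket \in \mathit{Value}$
Javar-2 semantics
$\llbracket n \rrbracket = n$
$\llbracket n_1 + n_2 \rrbracket = \llbracket n_1 \rrbracket + \llbracket n_2 \rrbracket$
$\llbracket n_1 - n_2 \rrbracket = \llbracket n_1 \rrbracket - \llbracket n_2 \rrbracket$
$\llbracket n_1 * n_2 \rrbracket = \llbracket n_1 \rrbracket \times \llbracket n_2 \rrbracket$
$\llbracket n_1 / n_2 \rrbracket = \llbracket n_1 \rrbracket \div \llbracket n_2 \rrbracket$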
x := x + 1
Javar-3
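$C \to x := E$
$E \to n \quad (n \in \mathbb{Z})$
$\ \mid\ x$
$\ \mid\ E\ \mathit{op}\ E \quad (\mathit{op} \in \{+, -, *, /\})$
Javar-3 semantic domain
$M \in \mathit{Memory} = \mathit{Var} \to \mathit{Value}$
$n \in \mathit{Value} = \mathbb{Z}$
$x \in \mathit{Var} = \mathit{Variables}$
$\llbracket C \rrbracket \in \mathit{Memory} \to \mathit{Memory}$
$\llbracket E \rrbracket \in \mathit{Memory} \to \mathbb{Z}$
Javar-3 semantics
$\llbracket x := E \rrbracket M = M\{x \mapsto \llbracket E \rrbracket M\}$
$\llbracket n \rrbracket M = n$
$\llbracket x \rrbracket M = M(x)$
$\llbracket E_1 \{+, -, *, /\} E_2 \rrbracket M = \llbracket E_1 \rrbracket M\ \{+, -, \times, \div\}\ \llbracket E_2 \rrbracket M$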
x := 100 + 2;
if(x)
    x := x * 10
else
    x := x / 2;
while(x)
    x := x - 1
Javar-4
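$C \to x := E$
$\ \mid\ \mathtt{if}\ (E)\ C\ \mathtt{else}\ C$
$\ \mid\ \mathtt{while}\ (E)\ C$
$\ \mid\ C; C$
$E \to n \quad (n \in \mathbb{Z})$
$\ \mid\ x$
$\ \mid\ E\ \mathit{op}\ E \quad (\mathit{op} \in \{+, -, *, /\})$
Javar-{3,4} semantic domain
$M \in \mathit{Memory} = \mathit{Var} \to \mathit{Value}$
$n \in \mathit{Value} = \mathbb{Z}$
$x \in \mathit{Var} = \mathit{Variables}$
$\llbracket C \rrbracket \in \mathit{Memory} \to \mathit{Memory}$
$\llbracket E \rrbracket \in \mathit{Memory} \to \mathbb{Z}$
Javar-4 semantics
$\llbracket x := E \rrbracket M = M\{x \mapsto \llbracket E \rrbracket M\}$
$\llbracket \mathtt{if}(E)\ C_1\ \mathtt{else}\ C_2 \rrbracket M = \text{if } \llbracket E \rrbracket M \neq 0 \text{ then } \llbracket C_1 \rrbracket M \text{ else } \llbracket C_2 \rrbracket M$
$\llbracket \mathtt{while}(E)\ C \rrbracket M = \text{if } \llbracket E \rrbracket M \neq 0 \text{ then } \llbracket \mathtt{while}(E)\ C \rrbracket (\llbracket C \rrbracket M) \text{ else } M$
$\llbracket n \rrbracket M = n$
$\llbracket x \rrbracket M = M(x)$
$\llbracket E_1 \{+, -, *, /\} E_2 \rrbracket M = \llbracket E_1 \rrbracket M\ \{+, -, \times, \div\}\ \llbracket E_2 \rrbracket M$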
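This is not a definition
$\llbracket \mathtt{while}(E)\ C \rrbracket M = \text{if } \llbracket E \rrbracket M \neq 0 \text{ then } \llbracket \mathtt{while}(E)\ C \rrbracket (\llbracket C \rrbracket M) \text{ else } M$
GNU = GNU’s Not Unix
The existence and uniqueness of the least fixed point
is guaranteed by domain theory
$\llbracket \mathtt{while}(E)\ C \rrbracket M = \text{if } \llbracket E \rrbracket M \neq 0 \text{ then } \llbracket \mathtt{while}(E)\ C \rrbracket (\llbracket C \rrbracket M) \text{ else } M$
$\llbracket \mathtt{while}(E)\ C \rrbracket = \lambda M.\ \text{if } \llbracket E \rrbracket M \neq 0 \text{ then } \llbracket \mathtt{while}(E)\ C \rrbracket (\llbracket C \rrbracket M) \text{ else } M$
$F = \lambda M.\ \text{if } \llbracket E \rrbracket M \neq 0 \text{ then } F(\llbracket C \rrbracket M) \text{ else } M$
$F = H(F)$
$\llbracket \mathtt{while}(E)\ C \rrbracket = \mathit{fix}(\lambda F.\ \lambda M.\ \text{if } \llbracket E \rrbracket M \neq 0 \text{ then } F(\llbracket C \rrbracket M) \text{ else } M)$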
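The Javar-4 semantics can be transcribed almost line by line into a small interpreter. The sketch below is illustrative only: the AST class names are hypothetical, `Int` stands in for $\mathbb{Z}$, and the recursive call in the `While` case plays the role of the fixed point, so a non-terminating Javar program makes the interpreter diverge as well.

```scala
object JavarSketch {
  sealed trait Expr
  case class Num(n: Int) extends Expr
  case class Var(x: String) extends Expr
  case class BinOp(op: Char, l: Expr, r: Expr) extends Expr

  sealed trait Cmd
  case class Assign(x: String, e: Expr) extends Cmd
  case class If(cond: Expr, thenC: Cmd, elseC: Cmd) extends Cmd
  case class While(cond: Expr, body: Cmd) extends Cmd
  case class Sequence(c1: Cmd, c2: Cmd) extends Cmd

  // Memory = Var -> Value
  type Memory = Map[String, Int]

  // [[E]] : Memory -> Z
  def eval(e: Expr, m: Memory): Int = e match {
    case Num(n) => n
    case Var(x) => m(x)
    case BinOp(op, l, r) =>
      val (a, b) = (eval(l, m), eval(r, m))
      op match {
        case '+' => a + b
        case '-' => a - b
        case '*' => a * b
        case '/' => a / b
        case _   => sys.error("unknown operator: " + op)
      }
  }

  // [[C]] : Memory -> Memory
  def exec(c: Cmd, m: Memory): Memory = c match {
    case Assign(x, e)     => m + (x -> eval(e, m))
    case If(e, c1, c2)    => if (eval(e, m) != 0) exec(c1, m) else exec(c2, m)
    case While(e, body)   => if (eval(e, m) != 0) exec(c, exec(body, m)) else m
    case Sequence(c1, c2) => exec(c2, exec(c1, m))
  }

  // The Javar-4 example program shown earlier.
  val program: Cmd =
    Sequence(
      Assign("x", BinOp('+', Num(100), Num(2))),
      Sequence(
        If(Var("x"),
          Assign("x", BinOp('*', Var("x"), Num(10))),
          Assign("x", BinOp('/', Var("x"), Num(2)))),
        While(Var("x"), Assign("x", BinOp('-', Var("x"), Num(1))))))
}
```

Running `JavarSketch.exec(JavarSketch.program, Map.empty)` sets x to 102, multiplies it by 10, and then counts it down to 0.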
Abstract interpretation revisited
• Safely estimate program semantics in a finite time
• Abstraction is not omission, guarantees soundness
• Most static analysis techniques can be formulated as
abstract interpretation
Key Elements of Abstract Interpretation
• Domain : concrete domain, abstract domain
• Semantics : concrete semantics, abstract semantics
• Galois connection : pair of abstraction and concretization
functions
• CPO : complete partial order
• Continuous function : preserves least upper bounds
Galois Connection
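$\forall x \in D,\ \hat{x} \in \hat{D} : \alpha(x) \sqsubseteq \hat{x} \iff x \sqsubseteq \gamma(\hat{x})$
[Figure: $\alpha$ maps $x \in D$ into $\hat{D}$, and $\gamma$ maps $\hat{x} \in \hat{D}$ back into $D$]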
CPO
• A partial order ⊑ exists on D
• A least element x exists, where x ⊑ y (for all y ∈ D)
• Every totally ordered subset (chain) of D has a least
upper bound x where x ∈ D
Lattices
Partially ordered set in which every
two elements have a unique LUB(⊔)
and a unique GLB(⊓)
Continuous Function
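$\forall$ ordered subset $S \subseteq D$: $F\left(\bigsqcup_{x \in S} x\right) = \bigsqcup_{x \in S} F(x)$
[Figure: a chain x, y, z in D and its image F(x), F(y), F(z) in D]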
Abstract Interpretation in a Nutshell
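Program semantics, concrete versus abstract:
Domain: concrete $D$ should be a CPO; abstract $\hat{D}$ should be a CPO
Galois connection: $\alpha : D \to \hat{D}$, $\gamma : \hat{D} \to D$
Semantic function: concrete $F : D \to D$ should be continuous; abstract $\hat{F} : \hat{D} \to \hat{D}$ should be monotonic
Program execution: concrete $\mathrm{lfp}\, F = \bigsqcup_{i \in \mathbb{N}} F^i(\bot)$; abstract $\bigsqcup_{i \in \mathbb{N}} \hat{F}^i(\hat{\bot}) \sqsubseteq \hat{X}$
Performing analysis using abstract interpretation = calculating $\hat{X}$ in a finite time
And the following formula is always satisfied (soundness guarantee)
$\mathrm{lfp}\, F \sqsubseteq \hat{X}$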
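The least fixed point $\bigsqcup_{i} F^i(\bot)$ can be computed by plain iteration when the domain has no infinite ascending chains. The following sketch assumes a monotonic `f` over such a domain; the names `lfp`, `edges`, and `step` are made up for the example, which computes the set of graph nodes reachable from node 1 in the powerset lattice.

```scala
object KleeneSketch {
  // Iterate f from the given starting point until the value stops changing.
  // Terminates only if f is monotonic and the domain has no infinite ascending chain.
  @annotation.tailrec
  def lfp[A](f: A => A, current: A): A = {
    val next = f(current)
    if (next == current) current else lfp(f, next)
  }

  // Hypothetical example graph: 1 -> 2 -> 3 -> 1, and 4 -> 5.
  val edges: Map[Int, Set[Int]] =
    Map(1 -> Set(2), 2 -> Set(3), 3 -> Set(1), 4 -> Set(5))

  // F(S) = {1} ∪ successors(S), a monotonic function on sets of nodes.
  val step: Set[Int] => Set[Int] =
    s => Set(1) ++ s.flatMap(n => edges.getOrElse(n, Set.empty[Int]))

  // Start from bottom (the empty set).
  val reachable: Set[Int] = lfp(step, Set.empty[Int])
}
```

The iterates are ∅, {1}, {1,2}, {1,2,3}, after which the value is stable; nodes 4 and 5 are never added.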
Abstract Interpretation in a Nutshell
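$\mathrm{lfp}\, F \sqsubseteq \hat{X}$
$\alpha \circ F \sqsubseteq \hat{F} \circ \alpha$
[Figure: $D$ and $\hat{D}$ side by side, showing $\mathrm{lfp}\, F$, $\mathrm{lfp}\, \hat{F}$, and $\hat{X}$; the part of $\hat{X}$ beyond $\mathrm{lfp}\, F$ is labeled false positives]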
Is this program safe from buffer overruns?
void foo(int x) {
    String[] strs = new String[10];
    int index = 0;
    if(x > 0) {
        index = 1;
    } else {
        index = 10;
    }
    strs[index] = "hello!";
}
void foo(int x) {
    String[] strs = new String[10];
    int index = 0;              // index = [0,0]
    if(x > 0) {
        index = 1;              // index = [1,1]
    } else {
        index = 10;             // index = [10,10]
    }
    strs[index] = "hello!";     // index = [1,10]
}
Interval analysis based on abstract interpretation
• Concrete domain: the domain in the real world
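$\mathit{Memory} = \mathit{Var} \to \mathit{Value}$
$\mathit{Value} = 2^{\mathbb{Z}}$
$\mathcal{C} \in C \to \mathit{Memory} \to \mathit{Memory}$
$\mathcal{V} \in E \to \mathit{Memory} \to \mathit{Value}$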
Interval analysis based on abstract interpretation
• Concrete semantics: the semantics in the real world
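$\mathcal{C} \llbracket x := E \rrbracket m = m\{x \mapsto \mathcal{V} \llbracket E \rrbracket m\}$
$\mathcal{C} \llbracket \mathtt{if}(E)\ C_1\ C_2 \rrbracket m = \mathcal{V} \llbracket E \rrbracket m\ ?\ \mathcal{C} \llbracket C_1 \rrbracket m : \mathcal{C} \llbracket C_2 \rrbracket m$
$\mathcal{C} \llbracket \mathtt{while}(E)\ C \rrbracket m = \mathcal{V} \llbracket E \rrbracket m\ ?\ \mathcal{C} \llbracket \mathtt{while}(E)\ C \rrbracket (\mathcal{C} \llbracket C \rrbracket m) : m$
$\mathcal{C} \llbracket C_1; C_2 \rrbracket m = \mathcal{C} \llbracket C_2 \rrbracket (\mathcal{C} \llbracket C_1 \rrbracket m)$
$\mathcal{V} \llbracket x \rrbracket m = m\ x$
$\mathcal{V} \llbracket n \rrbracket m = \{n\}$
$\mathcal{V} \llbracket E_1 + E_2 \rrbracket m = (\mathcal{V} \llbracket E_1 \rrbracket m) + (\mathcal{V} \llbracket E_2 \rrbracket m)$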
Interval analysis based on abstract interpretation
• Concrete execution of a program
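$\bot \sqsubset F(\bot) \sqsubset F(F(\bot)) \sqsubset F(F(F(\bot))) \sqsubset \dots \sqsubset F^i(\bot) = F^{i+1}(\bot)$
$F^i(\bot) \in \mathit{Memory}$ is the execution result of a program
$F = \lambda m.\ \mathcal{C} \llbracket C \rrbracket m$
$\mathrm{lfp}\, F = \bigsqcup_{i \in \mathbb{N}} F^i(\{\})$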
Interval analysis based on abstract interpretation
• Abstract domain: the domain we will use in an analysis
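$\widehat{\mathit{Memory}} = \mathit{Var} \to \widehat{\mathit{Value}}$
$\widehat{\mathit{Value}} = \hat{\mathbb{Z}} \cup \{\bot\}$
$\hat{\mathbb{Z}} = \{[a, b] \mid a \in \mathbb{Z} \cup \{-\infty\},\ b \in \mathbb{Z} \cup \{\infty\},\ a \le b\}$
$\hat{\mathcal{C}} \in C \to \widehat{\mathit{Memory}} \to \widehat{\mathit{Memory}}$
$\hat{\mathcal{V}} \in E \to \widehat{\mathit{Memory}} \to \widehat{\mathit{Value}}$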
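[Figure: Hasse diagram of the interval lattice. ⊥ is at the bottom; directly above it are the singleton intervals ..., [-3,-3], [-2,-2], [-1,-1], [0,0], [1,1], [2,2], ...; each higher level holds wider intervals such as [-1,0] and [0,1], then [-2,0], [-1,1], and [0,2]; the half-infinite intervals such as [-∞,0], [-∞,1], [-∞,2] and [0,∞], [-1,∞], [-2,∞] sit near the top; [-∞,∞] is the top element]
Lattice of Interval Domain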
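As a rough sketch, the ordering and the join of this lattice fit in a few lines of Scala. The names below (`Itv`, `Bottom`, `leq`, `join`) are illustrative and are not Yoyak's types; `Long.MinValue` and `Long.MaxValue` stand in for −∞ and +∞, which is a simplification made for this example.

```scala
object IntervalLatticeSketch {
  // Sentinels for the infinite bounds (an assumption of this sketch).
  val NegInf: Long = Long.MinValue
  val PosInf: Long = Long.MaxValue

  sealed trait Interval
  case object Bottom extends Interval
  case class Itv(lo: Long, hi: Long) extends Interval {
    require(lo <= hi, "lower bound must not exceed upper bound")
  }
  val Top: Interval = Itv(NegInf, PosInf)

  // Partial order: a ⊑ b when b covers every value of a.
  def leq(a: Interval, b: Interval): Boolean = (a, b) match {
    case (Bottom, _)                => true
    case (_, Bottom)                => false
    case (Itv(l1, h1), Itv(l2, h2)) => l2 <= l1 && h1 <= h2
  }

  // Least upper bound: the smallest interval covering both arguments.
  def join(a: Interval, b: Interval): Interval = (a, b) match {
    case (Bottom, x)                => x
    case (x, Bottom)                => x
    case (Itv(l1, h1), Itv(l2, h2)) => Itv(math.min(l1, l2), math.max(h1, h2))
  }
}
```

For example, joining [1,1] with [10,10] gives [1,10], and every interval lies between `Bottom` and `Top` in the order.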
Interval analysis based on abstract interpretation
• Abstract semantics: the semantics we will use in an analysis
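$\hat{\mathcal{C}} \llbracket x := E \rrbracket \hat{m} = \hat{m}\{x \mapsto \hat{\mathcal{V}} \llbracket E \rrbracket \hat{m}\}$
$\hat{\mathcal{C}} \llbracket \mathtt{if}(E)\ C_1\ C_2 \rrbracket \hat{m} = \hat{\mathcal{C}} \llbracket C_1 \rrbracket \hat{m} \sqcup \hat{\mathcal{C}} \llbracket C_2 \rrbracket \hat{m}$
$\hat{\mathcal{C}} \llbracket \mathtt{while}(E)\ C \rrbracket \hat{m} = \hat{m} \sqcup \hat{\mathcal{C}} \llbracket \mathtt{while}(E)\ C \rrbracket (\hat{\mathcal{C}} \llbracket C \rrbracket \hat{m})$
$\hat{\mathcal{C}} \llbracket C_1; C_2 \rrbracket \hat{m} = \hat{\mathcal{C}} \llbracket C_2 \rrbracket (\hat{\mathcal{C}} \llbracket C_1 \rrbracket \hat{m})$
$\hat{\mathcal{V}} \llbracket x \rrbracket \hat{m} = \hat{m}\ x$
$\hat{\mathcal{V}} \llbracket n \rrbracket \hat{m} = \alpha\{n\}$
$\hat{\mathcal{V}} \llbracket E_1 + E_2 \rrbracket \hat{m} = (\hat{\mathcal{V}} \llbracket E_1 \rrbracket \hat{m}) \mathbin{\hat{+}} (\hat{\mathcal{V}} \llbracket E_2 \rrbracket \hat{m})$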
Interval analysis based on abstract interpretation
• Abstract execution of a program
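$\hat{F} = \lambda \hat{m}.\ \hat{\mathcal{C}} \llbracket C \rrbracket \hat{m}$
$\hat{\bot} \sqsubset \hat{F}(\hat{\bot}) \sqsubset \hat{F}(\hat{F}(\hat{\bot})) \sqsubset \hat{F}(\hat{F}(\hat{F}(\hat{\bot}))) \sqsubset \dots \sqsubset \hat{F}^i(\hat{\bot}) \sqsubseteq \hat{X}$
$\bigsqcup_{i \in \mathbb{N}} \hat{F}^i(\{\}) \sqsubseteq \hat{X}$
$\hat{X}$ is the analysis result of a program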
Interval analysis based on abstract interpretation
• Widening
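What if this chain has infinite length?
$\hat{\bot} \sqsubset \hat{F}(\hat{\bot}) \sqsubset \hat{F}(\hat{F}(\hat{\bot})) \sqsubset \hat{F}(\hat{F}(\hat{F}(\hat{\bot}))) \sqsubset \dots \sqsubset \hat{F}^i(\hat{\bot}) \sqsubseteq \hat{X}$
We need a widening operator $\nabla$
$\hat{\bot} \sqsubset \hat{F}(\hat{\bot}) \sqsubset \hat{F}(\hat{F}(\hat{\bot})) \sqsubset \hat{F}(\hat{F}(\hat{F}(\hat{\bot}))) \sqsubset \dots \sqsubset \hat{F}^{i-1}(\hat{\bot}) \mathbin{\nabla} \hat{F}^i(\hat{\bot}) \sqsubseteq \hat{X}$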
Interval analysis based on abstract interpretation
• Widening
$\hat{\bot} \sqsubset [0, 0] \sqsubset [0, 1] \sqsubset [0, 2] \dots \sqsubset [0, i-1] \mathbin{\nabla} [0, i] \sqsubseteq [0, \infty]$
void foo(int x) {
    String[] strs = new String[10];
    int index = 0;              // index = [0,0]
    while(x > 0) {
        index = index + 1;      // index = [1,∞]
        x = x - 1;
    }
    strs[index] = "hello!";     // index = [0,∞]
}
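A minimal sketch of how widening forces the chain for `index` to stabilize is shown below. It uses the classic interval widening, in which any bound that is still growing jumps straight to infinity; whether Yoyak's built-in widening works exactly this way is not stated in the slides. All names are hypothetical, and `Long.MaxValue` and `Long.MinValue` stand in for the infinite bounds.

```scala
object WideningSketch {
  val NegInf: Long = Long.MinValue
  val PosInf: Long = Long.MaxValue

  final case class Itv(lo: Long, hi: Long)

  def join(a: Itv, b: Itv): Itv =
    Itv(math.min(a.lo, b.lo), math.max(a.hi, b.hi))

  // Classic interval widening: a bound that moved since the last
  // iteration is replaced by the corresponding infinity.
  def widen(old: Itv, next: Itv): Itv =
    Itv(if (next.lo < old.lo) NegInf else old.lo,
        if (next.hi > old.hi) PosInf else old.hi)

  // Abstract "index + 1"; infinite bounds stay infinite.
  def inc(a: Itv): Itv =
    Itv(if (a.lo == NegInf) NegInf else a.lo + 1,
        if (a.hi == PosInf) PosInf else a.hi + 1)

  // Value of index at the loop head after one more trip around the loop:
  // the entry value [0,0] joined with the incremented previous value.
  def step(atHead: Itv): Itv = join(Itv(0, 0), inc(atHead))

  // Iterate with widening until the loop-head value is stable.
  def analyze(): Itv = {
    var cur = Itv(0, 0)
    var stable = false
    while (!stable) {
      val next = widen(cur, step(cur))
      stable = next == cur
      cur = next
    }
    cur
  }
}
```

Without `widen`, the loop would climb through [0,1], [0,2], [0,3], and so on without ever stopping; with it, the second iteration already yields [0,∞], which is stable.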
Is this program safe from buffer overruns?
void foo(int x) {
    String[] strs = new String[10];
    int index = 0;
    if(x > 0) {
        index = 1;
    } else {
        index = 10;
    }
    strs[index] = "hello!";
}
Interval analysis based on abstract interpretation
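[Figure: syntax tree of the program below with its commands numbered: C0 is the root with children C1 and C2, C2 has children C3 and C4, and C3 has children C5 and C6]
index = 0; if(x > 0) index = 1 else index = 10; result = index
$\hat{\mathcal{C}} \llbracket C_0 \rrbracket \hat{m} = \hat{\mathcal{C}} \llbracket C_2 \rrbracket (\hat{\mathcal{C}} \llbracket C_1 \rrbracket \hat{m})$
$\hat{\mathcal{C}} \llbracket C_1 \rrbracket \hat{m} = \hat{m}\{\mathit{index} \mapsto \alpha\{0\}\}$
$\hat{\mathcal{C}} \llbracket C_2 \rrbracket \hat{m} = \hat{\mathcal{C}} \llbracket C_4 \rrbracket (\hat{\mathcal{C}} \llbracket C_3 \rrbracket \hat{m})$
$\hat{\mathcal{C}} \llbracket C_3 \rrbracket \hat{m} = \hat{\mathcal{C}} \llbracket C_5 \rrbracket \hat{m} \sqcup \hat{\mathcal{C}} \llbracket C_6 \rrbracket \hat{m}$
$\hat{\mathcal{C}} \llbracket C_4 \rrbracket \hat{m} = \hat{m}\{\mathit{result} \mapsto \hat{m}\ \mathit{index}\}$
$\hat{\mathcal{C}} \llbracket C_5 \rrbracket \hat{m} = \hat{m}\{\mathit{index} \mapsto \alpha\{1\}\}$
$\hat{\mathcal{C}} \llbracket C_6 \rrbracket \hat{m} = \hat{m}\{\mathit{index} \mapsto \alpha\{10\}\}$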
Interval analysis based on abstract interpretation
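$\hat{\mathcal{C}} \llbracket C_0 \rrbracket \{\} = \hat{\mathcal{C}} \llbracket C_2 \rrbracket (\hat{\mathcal{C}} \llbracket C_1 \rrbracket \{\})$
$\hat{\mathcal{C}} \llbracket C_1 \rrbracket \{\} = \{\mathit{index} \mapsto [0, 0]\}$
$\hat{\mathcal{C}} \llbracket C_2 \rrbracket \{\mathit{index} \mapsto [0, 0]\} = \hat{\mathcal{C}} \llbracket C_4 \rrbracket (\hat{\mathcal{C}} \llbracket C_3 \rrbracket \{\mathit{index} \mapsto [0, 0]\})$
$\hat{\mathcal{C}} \llbracket C_3 \rrbracket \{\mathit{index} \mapsto [0, 0]\} = \hat{\mathcal{C}} \llbracket C_5 \rrbracket \{\mathit{index} \mapsto [0, 0]\} \sqcup \hat{\mathcal{C}} \llbracket C_6 \rrbracket \{\mathit{index} \mapsto [0, 0]\}$
$\hat{\mathcal{C}} \llbracket C_4 \rrbracket \{\mathit{index} \mapsto [1, 10]\} = \{\mathit{index} \mapsto [1, 10],\ \mathit{result} \mapsto [1, 10]\}$
$\hat{\mathcal{C}} \llbracket C_5 \rrbracket \{\mathit{index} \mapsto [0, 0]\} = \{\mathit{index} \mapsto [1, 1]\}$
$\hat{\mathcal{C}} \llbracket C_6 \rrbracket \{\mathit{index} \mapsto [0, 0]\} = \{\mathit{index} \mapsto [10, 10]\}$
$\hat{\mathcal{C}} \llbracket C_0 \rrbracket \{\} = \{\mathit{index} \mapsto [1, 10],\ \mathit{result} \mapsto [1, 10]\}$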
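The same derivation can be replayed mechanically. The sketch below implements only the abstract rules needed for this program (assignment of a constant or a variable, `if`, and sequencing) over an interval memory; the class names are invented for the example and do not correspond to Yoyak's IL.

```scala
object AbstractIfSketch {
  final case class Itv(lo: Int, hi: Int)
  type AbsMem = Map[String, Itv]

  sealed trait Expr
  case class Num(n: Int) extends Expr
  case class Var(x: String) extends Expr

  sealed trait Cmd
  case class Assign(x: String, e: Expr) extends Cmd
  case class If(cond: Expr, thenC: Cmd, elseC: Cmd) extends Cmd
  case class Sequence(c1: Cmd, c2: Cmd) extends Cmd

  def joinItv(a: Itv, b: Itv): Itv =
    Itv(math.min(a.lo, b.lo), math.max(a.hi, b.hi))

  // Pointwise join of two abstract memories. A variable missing on one
  // side is treated as bottom there, so the other side's interval is kept.
  def joinMem(a: AbsMem, b: AbsMem): AbsMem =
    (a.keySet ++ b.keySet).map { k =>
      k -> ((a.get(k), b.get(k)) match {
        case (Some(x), Some(y)) => joinItv(x, y)
        case (Some(x), None)    => x
        case (None, Some(y))    => y
        case (None, None)       => sys.error("unreachable: key comes from a or b")
      })
    }.toMap

  def eval(e: Expr, m: AbsMem): Itv = e match {
    case Num(n) => Itv(n, n) // α{n}
    case Var(x) => m(x)
  }

  def exec(c: Cmd, m: AbsMem): AbsMem = c match {
    case Assign(x, e)     => m + (x -> eval(e, m))
    // The condition is ignored: both branches are analyzed and joined.
    case If(_, c1, c2)    => joinMem(exec(c1, m), exec(c2, m))
    case Sequence(c1, c2) => exec(c2, exec(c1, m))
  }

  // index = 0; if(x > 0) index = 1 else index = 10; result = index
  // (Var("x") is a placeholder for the condition x > 0)
  val program: Cmd =
    Sequence(
      Assign("index", Num(0)),
      Sequence(
        If(Var("x"), Assign("index", Num(1)), Assign("index", Num(10))),
        Assign("result", Var("index"))))
}
```

Executing the program on the empty memory yields index ↦ [1,10] and result ↦ [1,10], matching the last equation above.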
Is this program safe from buffer overruns?
void foo(int x) {
    String[] strs = new String[10];
    int index = 0;
    if(x > 0) {
        index = 1;
    } else {
        index = 10;
    }
    strs[index] = "hello!";
}
index may hold any integer between 1 and 10
Since the size of the buffer strs is 10 (valid indices are 0 to 9),
ArrayIndexOutOfBoundsException may occur here
Yoyak
Do not reinvent the wheel
https://trimaps.com/assets/website/dontreinventthemap-6ba62b8ba05d4957d2ed772584d7e4cd.png
Motivation
• Do not reinvent the wheel : many components that static analyzers often use
are reusable
• CFG data types : construction, optimization, visualization
• Graph algorithms : unrolling loops, finding loop heads, finding topological
order
• Intermediate language data types : construction, optimization, pretty
printing
• Common abstract domains : integer interval, abstract object, abstract
memory
• Common abstract semantics : assignment, invoking methods, evaluating
binary expressions
Motivation
• Perfect to be a framework : the theory of abstract
interpretation guarantees soundness and termination of the
analysis if a user supplies valid abstract domain and
semantics
[Diagram: an abstract domain D and abstract semantics F are fed into a generic fixed point computation engine, which produces a fixed point x = F(x) (x∈D)]
Overview
[Diagram: Yoyak overview, organized around Abstract Domain, Abstract Semantics, and Fixed Point Computation. Components shown: MapDom, MemDom, Interval, ArithmeticOps, LatticeOps, StdSemantics, ForwardAnalysis, AbstractTransferable, Widening, Galois, IL, FlowSensitiveFixedPointComputation, Worklist, WideningAtLoopHeads, InterproceduralIteration, DoWidening, CommonIL, Attachable, Typable]
Fixed-point Computation in Yoyak
Built-in work-list algorithm
[Figure: control-flow graph from ENTRY to EXIT with the nodes x := 10; Assume (y == 0); println("0"); Assume (y != 0); Assume (y == 1); println("1"); Assume (y != 1); println("2"); Assume (z); throw new Ex(); Assume (!z); println("done"); return]
def computeFixedPoint(startNodes: List[BasicBlock])(
    implicit widening: Option[Widening[D]] = None) : MapDom[BasicBlock,D] = {
  worklist.add(startNodes:_*)
  var map = MapDom.empty[BasicBlock,D]
  while(worklist.size() > 0) {
    val bb = worklist.pop().get
    val prevInputs = memoryFetcher(map,bb)
    val prev = getInput(map,prevInputs)
    val (mapOut,next) = work(map,prev,bb)
    val orig = map.get(bb)
    val isStableOpt = ops.<=(next,orig)
    if(isStableOpt.isEmpty) {
      println("error: abs. transfer func. is not distributive")
    }
    if(!isStableOpt.get) {
      val widened = if(widening.nonEmpty) {
        doWidening(widening.get)(orig,next,bb)
      } else next
      map = mapOut.update(bb->widened)
      val nextWork = getNextBlocks(bb)
      worklist.add(nextWork:_*)
    }
  }
  map
}
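To make the shape of the algorithm easier to see in isolation, here is a self-contained work-list solver in the same spirit. It is a sketch under simplifying assumptions and not Yoyak's implementation: nodes are plain integers, the `Lattice` trait and `solve` signature are invented for the example, there is no widening, and termination relies on a monotonic transfer function over a lattice of finite height.

```scala
object WorklistSketch {
  import scala.collection.mutable

  // Hypothetical mini-API: bottom, join, and a partial-order test.
  trait Lattice[D] {
    def bottom: D
    def join(a: D, b: D): D
    def leq(a: D, b: D): Boolean
  }

  // Computes the output state of every node of a graph.
  def solve[D](L: Lattice[D], nodes: Seq[Int], succs: Map[Int, Seq[Int]], entry: Int)(
      transfer: (Int, D) => D): Map[Int, D] = {
    val preds: Map[Int, Seq[Int]] =
      nodes.map(n => n -> nodes.filter(p => succs.getOrElse(p, Nil).contains(n))).toMap
    var out: Map[Int, D] = nodes.map(n => n -> L.bottom).toMap
    val worklist = mutable.Queue(entry)
    while (worklist.nonEmpty) {
      val n = worklist.dequeue()
      // Input of a node = join of the outputs of its predecessors.
      val in = preds(n).map(p => out(p)).foldLeft(L.bottom)((a, b) => L.join(a, b))
      val next = transfer(n, in)
      // Not stable yet: store the grown state and revisit the successors.
      if (!L.leq(next, out(n))) {
        out = out.updated(n, L.join(out(n), next))
        worklist ++= succs.getOrElse(n, Nil)
      }
    }
    out
  }

  // Example domain: sets of variable names, ordered by inclusion.
  val setLattice: Lattice[Set[String]] = new Lattice[Set[String]] {
    def bottom: Set[String] = Set.empty
    def join(a: Set[String], b: Set[String]): Set[String] = a ++ b
    def leq(a: Set[String], b: Set[String]): Boolean = a.subsetOf(b)
  }

  // Diamond-shaped graph 0 -> {1, 2} -> 3; node n may define one variable.
  def example(): Map[Int, Set[String]] = {
    val defs = Map(0 -> "x", 1 -> "y", 2 -> "z")
    val succs = Map(0 -> Seq(1, 2), 1 -> Seq(3), 2 -> Seq(3))
    solve(setLattice, Seq(0, 1, 2, 3), succs, 0) { (n, in) =>
      defs.get(n).fold(in)(v => in + v)
    }
  }
}
```

In the example, the state at node 3 ends up as {x, y, z}, the join of what reaches it through both branches.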
Fixed-point Computation in Yoyak
Built-in work-list algorithm
trait FlowSensitiveFixedPointComputation[D<:Galois] extends
    FlowSensitiveIteration[D] with CfgNavigator[D] with DoWidening[D] {
  def computeFixedPoint(startNodes: List[BasicBlock])(implicit widening:
      Option[Widening[D]] = None) : MapDom[BasicBlock,D] = {

class FlowSensitiveForwardAnalysis[D<:Galois](val cfg: CFG)(
    implicit val ops: LatticeOps[D],
    val absTransfer: AbstractTransferable[D],
    val widening: Option[Widening[D]] = None) extends
    FlowSensitiveFixedPointComputation[D] with WideningAtLoopHeads[D] {
Abstract Semantics in Yoyak
Built-in work-list algorithm
trait AbstractTransferable[D<:Galois] {
  protected def transferIdentity(stmt: Identity, input: D#Abst)(
      implicit context: Context) : D#Abst = input
  protected def transferAssign(stmt: Assign, input: D#Abst)(
      implicit context: Context) : D#Abst = input
  protected def transferInvoke(stmt: Invoke, input: D#Abst)(
      implicit context: Context) : D#Abst = input
  protected def transferIf(stmt: If, input: D#Abst)(
      implicit context: Context) : D#Abst = input
  protected def transferAssume(stmt: Assume, input: D#Abst)(
      implicit context: Context) : D#Abst = input
  // so on
Abstract Semantics in Yoyak
Built-in standard semantic
trait StdSemantics[A<:Galois,D,Mem<:MemDomLike[A,D,Mem]] extends
    AbstractTransferable[GaloisIdentity[Mem]] {
  val arithOps : ArithmeticOps[A]
  override protected def transferAssign(stmt: Assign, input: Mem)(
      implicit context: Context) : Mem = {
    val (rv,output) = eval(stmt.rv,input)
    output.update(stmt.lv,rv)
  }
Abstract Domain in Yoyak
Composable abstract domains
class MapDom[K,V <: Galois : LatticeOps] {

trait LatticeOps[D <: Galois] extends ParOrdOps[D] {
  def \/(lhs: D#Abst, rhs: D#Abst) : D#Abst
  def bottom : D#Abst

trait ParOrdOps[D <: Galois] {
  def <=(lhs: D#Abst, rhs: D#Abst) : Option[Boolean]

trait Galois {
  type Conc
  type Abst
Abstract Domain in Yoyak
Built-in Interval Domain
scala> import com.simplytyped.yoyak.framework.domain.arith._
import com.simplytyped.yoyak.framework.domain.arith._
scala> import com.simplytyped.yoyak.framework.domain.arith.Interval._
import com.simplytyped.yoyak.framework.domain.arith.Interval._
scala> val intv1 = Interv.of(10)
intv1: com.simplytyped.yoyak.framework.domain.arith.Interval = Interv(IInt(10),IInt(10))
scala> val intv2 = Interv.in(IInt(-10),IInt(10))
intv2: com.simplytyped.yoyak.framework.domain.arith.Interval = Interv(IInt(-10),IInt(10))
scala> val intv3 = Interv.in(IInfMinus,IInf)
intv3: com.simplytyped.yoyak.framework.domain.arith.Interval = IntervTop
scala> val intv4 = Interv.in(IInt(-10),IInf)
intv4: com.simplytyped.yoyak.framework.domain.arith.Interval = Interv(IInt(-10),IInf)
Abstract Domain in Yoyak
Built-in Interval Domain
scala> import IntervalInt.arithOps
import IntervalInt.arithOps
scala> arithOps.+(intv1,intv2) // [10,10] + [-10,10]
res1: com.simplytyped.yoyak.framework.domain.arith.IntervalInt#Abst = Interv(IInt(0),IInt(20))
scala> arithOps.-(intv1,intv2) // [10,10] - [-10,10]
res2: com.simplytyped.yoyak.framework.domain.arith.IntervalInt#Abst = Interv(IInt(0),IInt(20))
scala> arithOps.+(intv2,intv3) // [-10,10] + [-∞,∞]
res3: com.simplytyped.yoyak.framework.domain.arith.IntervalInt#Abst = IntervTop
scala> arithOps.*(intv2,intv4) // [-10,10] * [-10,∞]
res4: com.simplytyped.yoyak.framework.domain.arith.IntervalInt#Abst = IntervTop
scala> arithOps.*(intv1,intv4) // [10,10] * [-10,∞]
res5: com.simplytyped.yoyak.framework.domain.arith.IntervalInt#Abst = Interv(IInt(-100),IInf)
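The REPL results above follow from the usual interval arithmetic rules, which can be sketched as follows. This is not Yoyak's implementation: for brevity the bounds are `Double`s so that IEEE infinities can stand in for `IInf` and `IInfMinus`, and the names `Itv`, `add`, `sub`, and `mul` are invented for the example.

```scala
object IntervalArithSketch {
  final case class Itv(lo: Double, hi: Double)
  val Inf: Double = Double.PositiveInfinity

  // [a,b] + [c,d] = [a+c, b+d]
  def add(a: Itv, b: Itv): Itv = Itv(a.lo + b.lo, a.hi + b.hi)

  // [a,b] - [c,d] = [a-d, b-c]
  def sub(a: Itv, b: Itv): Itv = Itv(a.lo - b.hi, a.hi - b.lo)

  // 0 * ∞ is NaN in IEEE arithmetic; for interval bounds it should be 0.
  private def mulBound(x: Double, y: Double): Double =
    if (x == 0.0 || y == 0.0) 0.0 else x * y

  // [a,b] * [c,d] = [min, max] of the four products of the bounds.
  def mul(a: Itv, b: Itv): Itv = {
    val ps = Seq(mulBound(a.lo, b.lo), mulBound(a.lo, b.hi),
                 mulBound(a.hi, b.lo), mulBound(a.hi, b.hi))
    Itv(ps.reduce((x, y) => math.min(x, y)), ps.reduce((x, y) => math.max(x, y)))
  }
}
```

With these rules, [10,10] * [-10,∞] gives [-100,∞], while [-10,10] * [-10,∞] gives [-∞,∞] because the negative lower bound of the first interval is multiplied by ∞.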
Abstract Domain in Yoyak
Built-in Standard Object Model
trait StdObjectModel[A<:Galois,D<:Galois,This<:StdObjectModel[A,D,This]] extends
    MemDomLike[A,D,This] with ArrayJoinModel[A,D,This] {
  implicit val arithOps : ArithmeticOps[A]
  implicit val boxedOps : LatticeWithTopOps[D]
  def update(kv: (Loc,AbsValue[A,D])) : This
  def remove(loc: Local) : This
  def alloc(from: Stmt) : (AbsRef,This)
  def get(k: Loc) : AbsValue[A,D]
  def isStaticAddr(addr: AbsAddr) : Boolean
  def isDynamicAddr(addr: AbsAddr) : Boolean

class MemDom[A <: Galois : ArithmeticOps, D <: Galois : LatticeWithTopOps] extends
    StdObjectModel[A,D,MemDom[A,D]] {
Abstract Domain in Yoyak
Built-in Memory Domain
scala> import com.simplytyped.yoyak.framework.domain.mem.MemDom
scala> import com.simplytyped.yoyak.framework.domain.mem.MemElems._
scala> import com.simplytyped.yoyak.framework.domain.Galois._
scala> import com.simplytyped.yoyak.framework.domain.arith.Interv
scala> import com.simplytyped.yoyak.framework.domain.arith.IntervalInt
scala> import com.simplytyped.yoyak.il.CommonIL.Value._
scala> val memory = new MemDom[IntervalInt,SetAbstraction[String]]
memory: com.simplytyped.yoyak.framework.domain.mem.MemDom[com.simplytyped.yoyak.framework.domain.arith.IntervalInt,com.simplytyped.yoyak.framework.domain.Galois.SetAbstraction[String]] = com.simplytyped.yoyak.framework.domain.mem.MemDom@8443a1
Abstract Domain in Yoyak
Built-in Memory Domain
scala> val memory2 = memory.update(Local("x") -> AbsArith[IntervalInt](Interv.of(1)))
scala> val memory3 = memory.update(Local("x") -> AbsArith[IntervalInt](Interv.of(10)))
scala> val memory4 = MemDom.ops[IntervalInt,SetAbstraction[String]].\/(memory2,memory3)
scala> memory4.get(Local("x"))
res1: com.simplytyped.yoyak.framework.domain.mem.MemElems.AbsValue[com.simplytyped.yoyak.framework.domain.arith.IntervalInt,com.simplytyped.yoyak.framework.domain.Galois.SetAbstraction[String]] = AbsArith(Interv(IInt(1),IInt(10)))
IL in Yoyak
CommonIL
abstract class Stmt extends Attachable {
  override def equals(that: Any): Boolean = this eq that.asInstanceOf[AnyRef]
  override def hashCode() : Int = System.identityHashCode(this)
  private[Stmt] def copyAttr(stmt: Stmt) : this.type = {sourcePos = stmt.pos; this}
}
IL in Yoyak
CommonIL
case class Block(stmts: StatementContainer) extends Stmt
case class Switch(v: Value.Loc, keys: List[Value.t], targets: List[Target]) extends Stmt
case class Placeholder(x: AnyRef) extends Stmt
sealed trait CoreStmt extends Stmt
case class If(cond: Value.CondBinExp, target: Target) extends CoreStmt
case class Goto(target: Target) extends CoreStmt
sealed trait CfgStmt extends CoreStmt
case class Identity(lv: Value.Local, rv: Value.Param) extends CfgStmt
case class Assign(lv: Value.Loc, rv: Value.t) extends CfgStmt
case class Invoke(ret: Option[Value.Local], callee: Type.InvokeType) extends CfgStmt
case class Assume(cond: Value.CondBinExp) extends CfgStmt
case class Return(v: Option[Value.Loc]) extends CfgStmt
case class Nop() extends CfgStmt
case class EnterMonitor(v: Value.Loc) extends CfgStmt
case class ExitMonitor(v: Value.Loc) extends CfgStmt
case class Throw(v: Value.Loc) extends CfgStmt
IL in Yoyak
Stmt
x := 10;
switch (y) {
  case 0:
    println("0");
    break;
  case 1:
    println("1");
  default:
    println("2");
}
if(z) {
  throw new Exception();
} else {
  println("done");
}
return 0;
CoreStmt
x := 10;
if(y == 0) {
  println("0");
  goto D;
}
if(y == 1) {
  println("1");
}
D:
println("2");
if(z) {
  throw new Exception();
} else {
  println("done");
}
return 0;
CfgStmt
[Figure: control-flow graph from ENTRY to EXIT with the nodes x := 10; Assume (y == 0); println("0"); Assume (y != 0); Assume (y == 1); println("1"); Assume (y != 1); println("2"); Assume (z); throw new Ex(); Assume (!z); println("done"); return]
Simple Interval Analysis in Yoyak
class IntervalAnalysis(cfg: CFG) {
  def run() = {
    import IntervalAnalysis.{memDomOps,absTransfer,widening}
    val analysis = new FlowSensitiveForwardAnalysis[GMemory](cfg)
    val output = analysis.compute
    output
  }
}
object IntervalAnalysis {
  type Memory = MemDom[IntervalInt,SetAbstraction[Any]]
  type GMemory = GaloisIdentity[Memory]
  implicit val absTransfer : AbstractTransferable[GMemory] =
    new StdSemantics[IntervalInt,SetAbstraction[Any],Memory] {
      val arithOps: ArithmeticOps[IntervalInt] = IntervalInt.arithOps
    }
  implicit val memDomOps : LatticeOps[GMemory] = MemDom.ops[IntervalInt,SetAbstraction[Any]]
  implicit val widening : Option[Widening[GMemory]] = {
    implicit val NoWideningForSetAbstraction = Widening.NoWidening[SetAbstraction[Any]]
    Some(MemDom.widening[IntervalInt,SetAbstraction[Any]])
  }
}
Simple Interval Analysis in Yoyak
[Diagram: how the pieces of IntervalAnalysis fit together. Domain side: MemDom, StdObjectModel, MapDom, AbsValue (AbsRef, AbsArith, AbsBox, AbsBottom, AbsTop), IntervalInt, SetAb[Any], AbsObject, AbsAddr. Analysis side: IntervalAnalysis, FlowSensitiveForwardAnalysis, FlowSensitiveFixedPointComputation, Worklist, LatticeOps, FlowSensitiveIteration, AbstractTransferable, CfgNavigator, WideningAtLoopHeads, Widening. Inputs and outputs: CFG, MemDom.op, IntervalInt.widening, IntervalAnalysisTransferFunction, StdSemantics, ArithmeticOps, IntervalInt.arithOps, and the fixed-point result as a MapDom from BasicBlock to MemDom]
Yoyak : Scala Experience
• Scala is a very good language to implement a static analyzer
• Function is a first class citizen
• Type class support
• Algebraic data type support
• Native support for mutable and immutable values
• Excellent support for parallelization
Yoyak : Scala Experience
• Function is a first class citizen
Natural way to express mathematical logic
// optimize Cfg
(insertAssume _ andThen removeIfandGoto) apply rawCfg
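A toy version of such a pipeline is sketched below. The two passes here are hypothetical stand-ins that operate on a list of statement strings rather than on Yoyak's CFG type; the point is only that passes are ordinary functions and `andThen` composes them.

```scala
object ComposeSketch {
  // Hypothetical stand-in for a CFG: just a list of statement strings.
  type Cfg = List[String]

  // Put an "assume" statement in front of every "if".
  def insertAssume(cfg: Cfg): Cfg = cfg.flatMap {
    case s if s.startsWith("if ") => List("assume " + s.stripPrefix("if "), s)
    case s                        => List(s)
  }

  // Drop the "if" and "goto" statements, which the assumes make redundant here.
  def removeIfAndGoto(cfg: Cfg): Cfg =
    cfg.filterNot(s => s.startsWith("if ") || s.startsWith("goto "))

  // Passes are plain functions, so an optimizer is just their composition.
  val firstPass: Cfg => Cfg = cfg => insertAssume(cfg)
  val optimize: Cfg => Cfg = firstPass.andThen(cfg => removeIfAndGoto(cfg))
}
```

Adding another pass later only means appending one more `andThen`.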
Yoyak : Scala Experience
• Type class support
Can avoid F-bounded polymorphism, which is the fast lane to overworking
• F-bounded polymorphism
• Commonly arises when inheritance meets immutability
• Seriously hurts code readability
Yoyak : Scala Experience
• F-bounded polymorphism
trait Queue[T, This <: Queue[T, This]] {
  def push(elem: T) : This
}
trait GoodQueue[T, This <: GoodQueue[T, This]] extends Queue[T, This] {
  def pop : (T, This)
}
trait BetterQueue[T, R, This <: BetterQueue[T, R, This]] extends GoodQueue[T, This] {
  def giveMeSomethingNew : R
}
trait QueueUnited[T, R, Q <: Queue[T, Q], G <: GoodQueue[T, G], B <: BetterQueue[T, R, B],
    This <: QueueUnited[T, R, Q, G, B, This]] extends BetterQueue[T, R, This] {
  def giveUp : Unit
}
• Always need the type of concrete subclass
• Reiterate all type variables again in subclass reference
• Type class liberates methods from inheritance
Yoyak : Scala Experience
• Type class
trait QueueLike[T,This] {
  def push(elem: T) : This
}
trait GoodQueueLike[T,This] {
  implicit val queueLike : QueueLike[T,This]
  def push(elem: T) : This = queueLike.push(elem)
  def pop(q: This) : (T,This)
}
trait BetterQueueLike[T,R,This] {
  implicit val goodQueueLike : GoodQueueLike[T,This]
  def push(elem: T) : This = goodQueueLike.push(elem)
  def pop(q: This) : (T,This) = goodQueueLike.pop(q)
  def giveMeSomethingNew : R
}
class QueueUnited[T,R,This](implicit val q : QueueLike[T,This],
    g : GoodQueueLike[T,This], b : BetterQueueLike[T,R,This]) {
  def push(elem: T) : This = b.push(elem)
  def pop(q: This) : (T,This) = b.pop(q)
  def giveMeSomethingNew : R = b.giveMeSomethingNew
  def giveUp : Unit = {}
}
Yoyak : Scala Experience
• Type class in Yoyak
trait StdObjectModel[A<:Galois,D<:Galois,This<:StdObjectModel[A,D,This]] extends
    MemDomLike[A,D,This] with ArrayJoinModel[A,D,This] {
  implicit val arithOps : ArithmeticOps[A]
  implicit val boxedOps : LatticeWithTopOps[D]
Use both approaches, each where it is appropriate
Yoyak : Scala Experience
• Algebraic data type support
Natural way to express an abstract syntax tree of a program
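[Figure: abstract syntax tree whose root is the sequence node ";", with children if(x) (branches a = 1 and a = 2) and println(a)]
Seq(
  If("x", Assign("a", 1),
          Assign("a", 2)),
  Invoke("println", List("a"))
)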
Yoyak : Scala Experience
• Algebraic data type support
Easy to navigate the abstract syntax tree
def eval(v: Value.t, input: Mem)(implicit context: Context) : (AbsValue[A,D],Mem) = {
  v match {
    case x : Value.Constant => evalConstant(x,input)
    case x : Value.Loc => evalLoc(x,input)
    case x : Value.BinExp => evalBinExp(x,input)
    case Value.This => (AbsRef(Set("$this")),input)
    case Value.CaughtExceptionRef => (AbsRef(Set("$caughtex")),input)
    case Value.CastExp(v, ofTy) => evalLoc(v,input)
    case Value.InstanceOfExp(v, ofTy) => (AbsTop,input)
    case Value.LengthExp(v) => (AbsTop,input)
    case Value.NewExp(ofTy) => input.alloc(context.stmt)
    case Value.NewArrayExp(ofTy, size) => input.alloc(context.stmt)
Yoyak : Scala Experience
• Native support for mutable and immutable values
[Figure: a Memory whose variables x, y, and z refer to one Object with fields f = 1 and g = “A”]
In some cases, mutability is more important than immutability
Yoyak : Scala Experience
• Native support for mutable and immutable values
[Figure: a Memory whose variables x, y, and z refer to one Object (f = 1, g = “A”), next to a NewObject (f = 2, g = “A”) that should replace it]
memory.filter{_._2 == obj}.foldLeft(memory) {
  case (m,(k,_)) => m + (k -> newObject)
}
O(n)
Yoyak : Scala Experience
• Native support for mutable and immutable values
[Figure: a Memory whose variables x, y, and z now refer to the NewObject (f = 2, g = “A”)]
obj.update(newObject) O(1)
Yoyak : Scala Experience
• Native support for mutable and immutable values
[Figure: a Memory whose variables x, y, and z refer to one Object (f = 1, g = “A”), next to a NewObject (f = 2, g = “A”)]
If we frequently update immutable objects in a big memory,
it may result in severe inefficiency
Yoyak : Scala Experience
• Excellent support for parallelization
• Static analysis does not yet take full advantage of today’s
advances in scalable computing (multicore
machines, big data technologies, cloud computing)
• Scala has an excellent platform for experimenting with
parallelization: Akka
• Many fun things to try with Yoyak powered by Akka
Yoyak : Scala Experience
• Excellent support for parallelization
Worklist parallelization can be naturally implemented with Akka’s actor model
Yoyak : Roadmap
• Add more built-in abstract domains
• Optimize analysis performance
• Visualize analysis details
• Build Scala compiler plug-in
Yoyak : Roadmap
• Add more built-in abstract domains
Interval domain cannot represent
the relation between two variables
x = [2,8], y = [1,7] produce
49 combinations of (x,y) pairs
[Figure: plot of the (x,y) pairs covered by the interval domain; X axis and Y axis both run from 0 to 10]
Yoyak : Roadmap
• Add more built-in abstract domains
Octagon domain can represent the
relation between two variables
[Figure: plot of the (x,y) region covered by the octagon domain; X axis and Y axis both run from 0 to 10]
http://www.di.ens.fr/~mine/publi/article-mine-HOSC06.pdf
Yoyak : Roadmap
• Add more built-in abstract domains
2-interval domain is more precise
than interval domain
[Figure: plot of the (x,y) region covered by the 2-interval domain; X axis and Y axis both run from 0 to 10]
Yoyak : Roadmap
• Optimize analysis performance
• {Worklist, Method, Class}-level parallelization
• Reduce abstract memory size by removing unused
variables (faster join operation for abstract memory)
• Optional faster but unsound analysis
Yoyak : Roadmap
• Visualize analysis details
It is hard to know what a static analyzer is doing at a
specific moment because…
• Static analyzer’s behavior is very different for each
input program
• Often need to inspect and compare a map with
thousands of entries
• Ordinary Java debuggers cannot show the big
picture
Yoyak : Roadmap
• Visualize analysis details
Example from SAT solvers
Visualization of the search tree
generated by a basic DPLL
algorithm
DPVis
Yoyak : Roadmap
• Build Scala compiler plug-in
• Programming language researchers foresee that the semantic
program analyzer will be merged with compiler systems in the
near future as the type system did
Syntactic Analysis
Grammar Checking
Type System Semantic Analysis
Yoyak : Roadmap
• Build Scala compiler plug-in
• The Scala compiler is well modularized and cleanly coded (as
compared to other compiler systems), so it is an excellent
platform for experimenting with new ideas
• Pure Scala code is safe from null, however linked Java
libraries are not
• It would be great if the Scala compiler could detect possible null
dereferences at compile time and issue a warning
Thank you!
Further Questions,
ScalaDays 2015
twitter @heejongl
gmail heejong@gmail.com

MuVM: Higher Order Mutation Analysis Virtual Machine for CMuVM: Higher Order Mutation Analysis Virtual Machine for C
MuVM: Higher Order Mutation Analysis Virtual Machine for CSusumu Tokumoto
 
COMPUTER GRAPHICS LAB MANUAL
COMPUTER GRAPHICS LAB MANUALCOMPUTER GRAPHICS LAB MANUAL
COMPUTER GRAPHICS LAB MANUALVivek Kumar Sinha
 
An introduction to Google test framework
An introduction to Google test frameworkAn introduction to Google test framework
An introduction to Google test frameworkAbner Chih Yi Huang
 
ASFWS 2012 - Obfuscator, ou comment durcir un code source ou un binaire contr...
ASFWS 2012 - Obfuscator, ou comment durcir un code source ou un binaire contr...ASFWS 2012 - Obfuscator, ou comment durcir un code source ou un binaire contr...
ASFWS 2012 - Obfuscator, ou comment durcir un code source ou un binaire contr...Cyber Security Alliance
 
Embedded SW Interview Questions
Embedded SW Interview Questions Embedded SW Interview Questions
Embedded SW Interview Questions PiTechnologies
 
Georgy Nosenko - An introduction to the use SMT solvers for software security
Georgy Nosenko - An introduction to the use SMT solvers for software securityGeorgy Nosenko - An introduction to the use SMT solvers for software security
Georgy Nosenko - An introduction to the use SMT solvers for software securityDefconRussia
 
Thinking Functionally In Ruby
Thinking Functionally In RubyThinking Functionally In Ruby
Thinking Functionally In RubyRoss Lawley
 

Similar to Yoyak ScalaDays 2015 (20)

Compilation of COSMO for GPU using LLVM
Compilation of COSMO for GPU using LLVMCompilation of COSMO for GPU using LLVM
Compilation of COSMO for GPU using LLVM
 
C programs
C programsC programs
C programs
 
Data Structure: Algorithm and analysis
Data Structure: Algorithm and analysisData Structure: Algorithm and analysis
Data Structure: Algorithm and analysis
 
第13回数学カフェ「素数!!」二次会 LT資料「乱数!!」
第13回数学カフェ「素数!!」二次会 LT資料「乱数!!」第13回数学カフェ「素数!!」二次会 LT資料「乱数!!」
第13回数学カフェ「素数!!」二次会 LT資料「乱数!!」
 
[DL輪読会]Conditional Neural Processes
[DL輪読会]Conditional Neural Processes[DL輪読会]Conditional Neural Processes
[DL輪読会]Conditional Neural Processes
 
Conditional neural processes
Conditional neural processesConditional neural processes
Conditional neural processes
 
Vcs16
Vcs16Vcs16
Vcs16
 
Introduction to Polyhedral Compilation
Introduction to Polyhedral CompilationIntroduction to Polyhedral Compilation
Introduction to Polyhedral Compilation
 
Understanding Reed-Solomon code
Understanding Reed-Solomon codeUnderstanding Reed-Solomon code
Understanding Reed-Solomon code
 
Introduction to Neural Networks and Deep Learning from Scratch
Introduction to Neural Networks and Deep Learning from ScratchIntroduction to Neural Networks and Deep Learning from Scratch
Introduction to Neural Networks and Deep Learning from Scratch
 
Advance data structure & algorithm
Advance data structure & algorithmAdvance data structure & algorithm
Advance data structure & algorithm
 
Current Score – 0 Due Wednesday, November 19 2014 0400 .docx
Current Score  –  0 Due  Wednesday, November 19 2014 0400 .docxCurrent Score  –  0 Due  Wednesday, November 19 2014 0400 .docx
Current Score – 0 Due Wednesday, November 19 2014 0400 .docx
 
Computer graphics lab manual
Computer graphics lab manualComputer graphics lab manual
Computer graphics lab manual
 
MuVM: Higher Order Mutation Analysis Virtual Machine for C
MuVM: Higher Order Mutation Analysis Virtual Machine for CMuVM: Higher Order Mutation Analysis Virtual Machine for C
MuVM: Higher Order Mutation Analysis Virtual Machine for C
 
COMPUTER GRAPHICS LAB MANUAL
COMPUTER GRAPHICS LAB MANUALCOMPUTER GRAPHICS LAB MANUAL
COMPUTER GRAPHICS LAB MANUAL
 
An introduction to Google test framework
An introduction to Google test frameworkAn introduction to Google test framework
An introduction to Google test framework
 
ASFWS 2012 - Obfuscator, ou comment durcir un code source ou un binaire contr...
ASFWS 2012 - Obfuscator, ou comment durcir un code source ou un binaire contr...ASFWS 2012 - Obfuscator, ou comment durcir un code source ou un binaire contr...
ASFWS 2012 - Obfuscator, ou comment durcir un code source ou un binaire contr...
 
Embedded SW Interview Questions
Embedded SW Interview Questions Embedded SW Interview Questions
Embedded SW Interview Questions
 
Georgy Nosenko - An introduction to the use SMT solvers for software security
Georgy Nosenko - An introduction to the use SMT solvers for software securityGeorgy Nosenko - An introduction to the use SMT solvers for software security
Georgy Nosenko - An introduction to the use SMT solvers for software security
 
Thinking Functionally In Ruby
Thinking Functionally In RubyThinking Functionally In Ruby
Thinking Functionally In Ruby
 

Recently uploaded

Unlocking AI: Navigating Open Source vs. Commercial Frontiers
Unlocking AI:Navigating Open Source vs. Commercial FrontiersUnlocking AI:Navigating Open Source vs. Commercial Frontiers
Unlocking AI: Navigating Open Source vs. Commercial FrontiersRaphaël Semeteys
 
openEuler Community Overview - a presentation showing the current scale
openEuler Community Overview - a presentation showing the current scaleopenEuler Community Overview - a presentation showing the current scale
openEuler Community Overview - a presentation showing the current scaleShane Coughlan
 
Take Advantage of Mx Tracking Flight Scheduling Solutions to Streamline Your ...
Take Advantage of Mx Tracking Flight Scheduling Solutions to Streamline Your ...Take Advantage of Mx Tracking Flight Scheduling Solutions to Streamline Your ...
Take Advantage of Mx Tracking Flight Scheduling Solutions to Streamline Your ...MyFAA
 
Steps to Successfully Hire Ionic Developers
Steps to Successfully Hire Ionic DevelopersSteps to Successfully Hire Ionic Developers
Steps to Successfully Hire Ionic Developersmichealwillson701
 
Splashtop Enterprise Brochure - Remote Computer Access and Remote Support Sof...
Splashtop Enterprise Brochure - Remote Computer Access and Remote Support Sof...Splashtop Enterprise Brochure - Remote Computer Access and Remote Support Sof...
Splashtop Enterprise Brochure - Remote Computer Access and Remote Support Sof...Splashtop Inc
 
User Experience Designer | Kaylee Miller Resume
User Experience Designer | Kaylee Miller ResumeUser Experience Designer | Kaylee Miller Resume
User Experience Designer | Kaylee Miller ResumeKaylee Miller
 
03.2024_North America VMUG Optimizing RevOps using the power of ChatGPT in Ma...
03.2024_North America VMUG Optimizing RevOps using the power of ChatGPT in Ma...03.2024_North America VMUG Optimizing RevOps using the power of ChatGPT in Ma...
03.2024_North America VMUG Optimizing RevOps using the power of ChatGPT in Ma...jackiepotts6
 
Technical improvements. Reasons. Methods. Estimations. CJ
Technical improvements.  Reasons. Methods. Estimations. CJTechnical improvements.  Reasons. Methods. Estimations. CJ
Technical improvements. Reasons. Methods. Estimations. CJpolinaucc
 
If your code could speak, what would it tell you? Let GitHub Copilot Chat hel...
If your code could speak, what would it tell you? Let GitHub Copilot Chat hel...If your code could speak, what would it tell you? Let GitHub Copilot Chat hel...
If your code could speak, what would it tell you? Let GitHub Copilot Chat hel...Maxim Salnikov
 
Large Scale Architecture -- The Unreasonable Effectiveness of Simplicity
Large Scale Architecture -- The Unreasonable Effectiveness of SimplicityLarge Scale Architecture -- The Unreasonable Effectiveness of Simplicity
Large Scale Architecture -- The Unreasonable Effectiveness of SimplicityRandy Shoup
 
8 key point on optimizing web hosting services in your business.pdf
8 key point on optimizing web hosting services in your business.pdf8 key point on optimizing web hosting services in your business.pdf
8 key point on optimizing web hosting services in your business.pdfOffsiteNOC
 
Boost Efficiency: Sabre API Integration Made Easy
Boost Efficiency: Sabre API Integration Made EasyBoost Efficiency: Sabre API Integration Made Easy
Boost Efficiency: Sabre API Integration Made Easymichealwillson701
 
Revolutionize Your Field Service Management with FSM Grid
Revolutionize Your Field Service Management with FSM GridRevolutionize Your Field Service Management with FSM Grid
Revolutionize Your Field Service Management with FSM GridMathew Thomas
 
BATbern52 Swisscom's Journey into Data Mesh
BATbern52 Swisscom's Journey into Data MeshBATbern52 Swisscom's Journey into Data Mesh
BATbern52 Swisscom's Journey into Data MeshBATbern
 
MUT4SLX: Extensions for Mutation Testing of Stateflow Models
MUT4SLX: Extensions for Mutation Testing of Stateflow ModelsMUT4SLX: Extensions for Mutation Testing of Stateflow Models
MUT4SLX: Extensions for Mutation Testing of Stateflow ModelsUniversity of Antwerp
 
Einstein Copilot Conversational AI for your CRM.pdf
Einstein Copilot Conversational AI for your CRM.pdfEinstein Copilot Conversational AI for your CRM.pdf
Einstein Copilot Conversational AI for your CRM.pdfCloudMetic
 
Enterprise Content Managements Solutions
Enterprise Content Managements SolutionsEnterprise Content Managements Solutions
Enterprise Content Managements SolutionsIQBG inc
 
Mobile App Development company Houston
Mobile  App  Development  company HoustonMobile  App  Development  company Houston
Mobile App Development company Houstonjennysmithusa549
 
renewable energy renewable energy renewable energy renewable energy
renewable energy renewable energy renewable energy  renewable energyrenewable energy renewable energy renewable energy  renewable energy
renewable energy renewable energy renewable energy renewable energyjeyasrig
 

Recently uploaded (20)

Unlocking AI: Navigating Open Source vs. Commercial Frontiers
Unlocking AI:Navigating Open Source vs. Commercial FrontiersUnlocking AI:Navigating Open Source vs. Commercial Frontiers
Unlocking AI: Navigating Open Source vs. Commercial Frontiers
 
openEuler Community Overview - a presentation showing the current scale
openEuler Community Overview - a presentation showing the current scaleopenEuler Community Overview - a presentation showing the current scale
openEuler Community Overview - a presentation showing the current scale
 
Take Advantage of Mx Tracking Flight Scheduling Solutions to Streamline Your ...
Take Advantage of Mx Tracking Flight Scheduling Solutions to Streamline Your ...Take Advantage of Mx Tracking Flight Scheduling Solutions to Streamline Your ...
Take Advantage of Mx Tracking Flight Scheduling Solutions to Streamline Your ...
 
Steps to Successfully Hire Ionic Developers
Steps to Successfully Hire Ionic DevelopersSteps to Successfully Hire Ionic Developers
Steps to Successfully Hire Ionic Developers
 
Splashtop Enterprise Brochure - Remote Computer Access and Remote Support Sof...
Splashtop Enterprise Brochure - Remote Computer Access and Remote Support Sof...Splashtop Enterprise Brochure - Remote Computer Access and Remote Support Sof...
Splashtop Enterprise Brochure - Remote Computer Access and Remote Support Sof...
 
User Experience Designer | Kaylee Miller Resume
User Experience Designer | Kaylee Miller ResumeUser Experience Designer | Kaylee Miller Resume
User Experience Designer | Kaylee Miller Resume
 
03.2024_North America VMUG Optimizing RevOps using the power of ChatGPT in Ma...
03.2024_North America VMUG Optimizing RevOps using the power of ChatGPT in Ma...03.2024_North America VMUG Optimizing RevOps using the power of ChatGPT in Ma...
03.2024_North America VMUG Optimizing RevOps using the power of ChatGPT in Ma...
 
Technical improvements. Reasons. Methods. Estimations. CJ
Technical improvements.  Reasons. Methods. Estimations. CJTechnical improvements.  Reasons. Methods. Estimations. CJ
Technical improvements. Reasons. Methods. Estimations. CJ
 
20140812 - OBD2 Solution
20140812 - OBD2 Solution20140812 - OBD2 Solution
20140812 - OBD2 Solution
 
If your code could speak, what would it tell you? Let GitHub Copilot Chat hel...
If your code could speak, what would it tell you? Let GitHub Copilot Chat hel...If your code could speak, what would it tell you? Let GitHub Copilot Chat hel...
If your code could speak, what would it tell you? Let GitHub Copilot Chat hel...
 
Large Scale Architecture -- The Unreasonable Effectiveness of Simplicity
Large Scale Architecture -- The Unreasonable Effectiveness of SimplicityLarge Scale Architecture -- The Unreasonable Effectiveness of Simplicity
Large Scale Architecture -- The Unreasonable Effectiveness of Simplicity
 
8 key point on optimizing web hosting services in your business.pdf
8 key point on optimizing web hosting services in your business.pdf8 key point on optimizing web hosting services in your business.pdf
8 key point on optimizing web hosting services in your business.pdf
 
Boost Efficiency: Sabre API Integration Made Easy
Boost Efficiency: Sabre API Integration Made EasyBoost Efficiency: Sabre API Integration Made Easy
Boost Efficiency: Sabre API Integration Made Easy
 
Revolutionize Your Field Service Management with FSM Grid
Revolutionize Your Field Service Management with FSM GridRevolutionize Your Field Service Management with FSM Grid
Revolutionize Your Field Service Management with FSM Grid
 
BATbern52 Swisscom's Journey into Data Mesh
BATbern52 Swisscom's Journey into Data MeshBATbern52 Swisscom's Journey into Data Mesh
BATbern52 Swisscom's Journey into Data Mesh
 
MUT4SLX: Extensions for Mutation Testing of Stateflow Models
MUT4SLX: Extensions for Mutation Testing of Stateflow ModelsMUT4SLX: Extensions for Mutation Testing of Stateflow Models
MUT4SLX: Extensions for Mutation Testing of Stateflow Models
 
Einstein Copilot Conversational AI for your CRM.pdf
Einstein Copilot Conversational AI for your CRM.pdfEinstein Copilot Conversational AI for your CRM.pdf
Einstein Copilot Conversational AI for your CRM.pdf
 
Enterprise Content Managements Solutions
Enterprise Content Managements SolutionsEnterprise Content Managements Solutions
Enterprise Content Managements Solutions
 
Mobile App Development company Houston
Mobile  App  Development  company HoustonMobile  App  Development  company Houston
Mobile App Development company Houston
 
renewable energy renewable energy renewable energy renewable energy
renewable energy renewable energy renewable energy  renewable energyrenewable energy renewable energy renewable energy  renewable energy
renewable energy renewable energy renewable energy renewable energy
 

Yoyak ScalaDays 2015

  • 2. Speaker Introduction • Has been working in the static analysis industry since 2008 • Studied programming language theory in graduate school • Has developed several static analyzers, most of them commercial • Began using Scala six years ago and still uses it actively in everyday development
  • 3. Agenda • Static analysis • Theory of abstract interpretation • Yoyak framework: implementation highlights • Yoyak framework: Scala experience • Yoyak framework: Roadmap
  • 5. What is Static Analysis? • Analyzes source code without actually running it • Some people prefer to call it white-box testing • Used for finding bugs, optimizing a compiled binary, calculating a software metric, proving safety properties, etc.
  • 6. Examples of Static Analysis • Finding bugs: symbolic execution • Optimizing a compiled binary: data flow analysis • Calculating a software metric: syntactic analysis • Proving safety properties: model checking, abstract interpretation, type systems
  • 7. Two important terms in Static Analysis • Soundness • The analysis result should contain every possibility that can happen at runtime • If the analysis uses an over-approximation, it is sound • Completeness • The analysis result should not contain any possibility that cannot happen at runtime • If the analysis uses an under-approximation, it is complete
  • 8. Two important terms in Static Analysis [Figure: three regions labeled Over-approximation of Semantics, Program Semantics, and Under-approximation of Semantics]
  • 9. Abstract Interpretation The beauty of abstraction http://cargocollective.com/carlyfox/Design
  • 10. What is the result of this expression? $19224 \times 7483919 \times (11952 - 20392)$
  • 11. What is the result of this expression? $19224 \times 7483919 \times (11952 - 20392) = -1214270048744640$ How long does it take without a calculator?
  • 12. What is the result of this expression? $19224 \times 7483919 \times (11952 - 20392) = -1214270048744640$ What if we are not interested in the exact number and just want to know whether it is positive or negative?
  • 13. What is the result of this expression? $19224 \times 7483919 \times (11952 - 20392)$ Abstracting each factor to its sign with $\alpha$: $\hat{+} \times \hat{+} \times \hat{-} = \hat{-}$, which stands for $n\ (n \in \mathbb{Z} \land n < 0)$
  • 14. What is the result of this expression? $19224 \times 7483919 \times (11952 - 20392)$ • $= -1214270048744640$ takes 30 seconds • $= n\ (n \in \mathbb{Z} \land n < 0)$ takes 3 seconds • inaccurate but not incorrect • accurate enough for a specific purpose • much faster than a real calculation • This is abstract interpretation
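The sign computation on the slides above can be sketched in a few lines of Scala. This is only an illustration: the names `SignDemo`, `Sign`, `alpha`, and `times` are hypothetical and are not part of Yoyak. The subtraction is evaluated concretely before abstraction, so only the rule of signs for multiplication is needed.

```scala
object SignDemo {
  // Abstract values: the sign of an integer, plus Top for "sign unknown".
  sealed trait Sign
  case object Neg extends Sign
  case object Zero extends Sign
  case object Pos extends Sign
  case object Top extends Sign

  // Abstraction of a single concrete integer to its sign.
  def alpha(n: BigInt): Sign =
    if (n < 0) Neg else if (n > 0) Pos else Zero

  // Abstract multiplication: the usual rule of signs.
  def times(a: Sign, b: Sign): Sign = (a, b) match {
    case (Zero, _) | (_, Zero) => Zero // zero times anything is zero
    case (Top, _) | (_, Top)   => Top  // unknown sign stays unknown
    case (x, y) if x == y      => Pos  // same signs give a positive product
    case _                     => Neg  // opposite signs give a negative product
  }

  // The three factors of the expression on the slide.
  val factors: List[BigInt] =
    List(BigInt(19224), BigInt(7483919), BigInt(11952 - 20392))

  // Abstract evaluation only looks at signs.
  val abstractResult: Sign = factors.map(alpha).reduce(times)

  // Concrete evaluation computes the exact product.
  val concreteResult: BigInt = factors.product
}
```

The abstract result agrees with the sign of the concrete product, which is the soundness property the slides rely on.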
  • 15. Is this program safe from buffer overruns? void foo(int x) { String[] strs = new String[10]; int index = 0; while(x > 0) { index = index + 1; x = x - 1; } strs[index] = "hello!"; }
  • 16. No, ArrayIndexOutOfBoundsException may occur at the last line void foo(int x) { String[] strs = new String[10]; int index = 0; while(x > 0) { index = index + 1; x = x - 1; } strs[index] = "hello!"; } index = [0,0] index = [1,∞] index = [0,∞]
  • 17. Abstract interpretation for dummies • Roughly but soundly execute the program
  • 19. First, we need to define precisely, in a mathematical way, what “domain” and “semantics” mean
  • 20. Let me introduce the Javar language
  • 21. 1
  • 23. Javar-1 $C \to n \quad (n \in \mathbb{Z})$
  • 24. Javar-1 semantic domain $n \in \mathit{Value} = \mathbb{Z}$, $[\![C]\!] \in \mathit{Value}$
  • 26. 1+1
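  • 27. Javar-2 $C \to n \ \mathit{op}\ n \quad (n \in \mathbb{Z},\ \mathit{op} \in \{+, -, *, /\})$
  • 28. Javar-{1,2} semantic domain $n \in \mathit{Value} = \mathbb{Z}$, $[\![C]\!] \in \mathit{Value}$
  • 29. Javar-2 semantics $[\![n]\!] = n$, $[\![n_1 + n_2]\!] = [\![n_1]\!] + [\![n_2]\!]$, $[\![n_1 - n_2]\!] = [\![n_1]\!] - [\![n_2]\!]$, $[\![n_1 * n_2]\!] = [\![n_1]\!] \times [\![n_2]\!]$, $[\![n_1 / n_2]\!] = [\![n_1]\!] \div [\![n_2]\!]$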
  • 30. x := x + 1
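  • 31. Javar-3 $C \to x := E$, $E \to n\ (n \in \mathbb{Z}) \mid x \mid E\ \mathit{op}\ E \quad (\mathit{op} \in \{+, -, *, /\})$
  • 32. Javar-3 semantic domain $M \in \mathit{Memory} = \mathit{Var} \to \mathit{Value}$, $n \in \mathit{Value} = \mathbb{Z}$, $x \in \mathit{Var} = \mathit{Variables}$, $[\![C]\!] \in \mathit{Memory} \to \mathit{Memory}$, $[\![E]\!] \in \mathit{Memory} \to \mathbb{Z}$
  • 33. Javar-3 semantics $[\![x := E]\!]M = M\{x \to [\![E]\!]M\}$, $[\![n]\!]M = n$, $[\![x]\!]M = M(x)$, $[\![E_1 \{+, -, *, /\} E_2]\!]M = [\![E_1]\!]M\ \{+, -, \times, \div\}\ [\![E_2]\!]M$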
  • 34. x := 100 + 2; if(x) x := x * 10 else x := x / 2; while(x) x := x - 1
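  • 35. Javar-4 $C \to x := E \mid \texttt{if}\ (E)\ C\ \texttt{else}\ C \mid \texttt{while}\ (E)\ C \mid C;\ C$, $E \to n\ (n \in \mathbb{Z}) \mid x \mid E\ \mathit{op}\ E \quad (\mathit{op} \in \{+, -, *, /\})$
  • 36. Javar-{3,4} semantic domain $M \in \mathit{Memory} = \mathit{Var} \to \mathit{Value}$, $n \in \mathit{Value} = \mathbb{Z}$, $x \in \mathit{Var} = \mathit{Variables}$, $[\![C]\!] \in \mathit{Memory} \to \mathit{Memory}$, $[\![E]\!] \in \mathit{Memory} \to \mathbb{Z}$
  • 37. Javar-4 semantics $[\![x := E]\!]M = M\{x \to [\![E]\!]M\}$, $[\![\texttt{if}(E)\ C_1\ \texttt{else}\ C_2]\!]M = \text{if } [\![E]\!]M \neq 0 \text{ then } [\![C_1]\!]M \text{ else } [\![C_2]\!]M$, $[\![\texttt{while}(E)\ C]\!]M = \text{if } [\![E]\!]M \neq 0 \text{ then } [\![\texttt{while}(E)\ C]\!]([\![C]\!]M) \text{ else } M$, $[\![n]\!]M = n$, $[\![x]\!]M = M(x)$, $[\![E_1 \{+, -, *, /\} E_2]\!]M = [\![E_1]\!]M\ \{+, -, \times, \div\}\ [\![E_2]\!]M$
  • 38. This is not a definition $[\![\texttt{while}(E)\ C]\!]M = \text{if } [\![E]\!]M \neq 0 \text{ then } [\![\texttt{while}(E)\ C]\!]([\![C]\!]M) \text{ else } M$ • GNU = GNU’s Not Unix
  • 39. The existence and uniqueness of the fixed-point are guaranteed by domain theory $[\![\texttt{while}(E)\ C]\!]M = \text{if } [\![E]\!]M \neq 0 \text{ then } [\![\texttt{while}(E)\ C]\!]([\![C]\!]M) \text{ else } M$, $[\![\texttt{while}(E)\ C]\!] = \lambda M.\ \text{if } [\![E]\!]M \neq 0 \text{ then } [\![\texttt{while}(E)\ C]\!]([\![C]\!]M) \text{ else } M$, $F = \lambda M.\ \text{if } [\![E]\!]M \neq 0 \text{ then } F([\![C]\!]M) \text{ else } M$, $F = H(F)$, $[\![\texttt{while}(E)\ C]\!] = \mathrm{fix}(\lambda F.\ \lambda M.\ \text{if } [\![E]\!]M \neq 0 \text{ then } F([\![C]\!]M) \text{ else } M)$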
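The Javar-4 semantics above can be read almost directly as a small interpreter. The following is a minimal sketch under stated assumptions: the AST names (`Num`, `Var`, `BinOp`, `Assign`, `If`, `While`, `Sequence`) and the rule for `C; C` are illustrative choices, `Int` stands in for the mathematical integers, and ordinary recursion in `exec` plays the role of the fixed point of the `while` equation.

```scala
object Javar {
  // Expressions: E -> n | x | E op E
  sealed trait Expr
  case class Num(n: Int) extends Expr
  case class Var(x: String) extends Expr
  case class BinOp(op: Char, l: Expr, r: Expr) extends Expr

  // Commands: C -> x := E | if (E) C else C | while (E) C | C; C
  sealed trait Cmd
  case class Assign(x: String, e: Expr) extends Cmd
  case class If(cond: Expr, thenC: Cmd, elseC: Cmd) extends Cmd
  case class While(cond: Expr, body: Cmd) extends Cmd
  case class Sequence(first: Cmd, second: Cmd) extends Cmd

  // Memory = Var -> Value
  type Memory = Map[String, Int]

  // [[E]] : Memory -> Z
  def eval(e: Expr, m: Memory): Int = e match {
    case Num(n)           => n
    case Var(x)           => m(x)
    case BinOp('+', l, r) => eval(l, m) + eval(r, m)
    case BinOp('-', l, r) => eval(l, m) - eval(r, m)
    case BinOp('*', l, r) => eval(l, m) * eval(r, m)
    case BinOp('/', l, r) => eval(l, m) / eval(r, m)
    case BinOp(op, _, _)  => sys.error(s"unknown operator: $op")
  }

  // [[C]] : Memory -> Memory
  def exec(cmd: Cmd, m: Memory): Memory = cmd match {
    case Assign(x, e)    => m.updated(x, eval(e, m))
    case If(c, t, f)     => if (eval(c, m) != 0) exec(t, m) else exec(f, m)
    // The recursive call mirrors the self-referential while equation.
    case w @ While(c, b) => if (eval(c, m) != 0) exec(w, exec(b, m)) else m
    case Sequence(a, b)  => exec(b, exec(a, m))
  }

  // x := 100 + 2; if(x) x := x * 10 else x := x / 2; while(x) x := x - 1
  val program: Cmd = Sequence(
    Assign("x", BinOp('+', Num(100), Num(2))),
    Sequence(
      If(Var("x"),
        Assign("x", BinOp('*', Var("x"), Num(10))),
        Assign("x", BinOp('/', Var("x"), Num(2)))),
      While(Var("x"), Assign("x", BinOp('-', Var("x"), Num(1))))))

  val finalMemory: Memory = exec(program, Map.empty)
}
```

Running the example program from slide 34 leaves `x` at zero, since the loop counts down until its condition evaluates to 0.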
  • 40. Abstract interpretation revisited • Safely estimate program semantics in finite time • Abstraction is not omission; it guarantees soundness • Most static analysis techniques can be formulated as abstract interpretation
  • 41. Key Elements of Abstract Interpretation • Domain: concrete domain, abstract domain • Semantics: concrete semantics, abstract semantics • Galois connection: pair of abstraction and concretization functions • CPO: complete partial order • Continuous function: preserves least upper bounds
  • 42. Galois Connection $\forall x \in D, \hat{x} \in \hat{D} : \alpha(x) \sqsubseteq \hat{x} \iff x \sqsubseteq \gamma(\hat{x})$ [Figure: $\alpha$ maps $x \in D$ to $\hat{x} \in \hat{D}$, and $\gamma$ maps $\hat{x}$ back into $D$]
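The Galois connection condition can be checked exhaustively on a small example. The sketch below assumes a hypothetical sign domain with a bottom and a top element and restricts the concrete domain to subsets of the integers from -2 to 2, so that every pair can be enumerated; none of these names come from Yoyak.

```scala
object GaloisCheck {
  // A small abstract domain of signs with bottom and top.
  sealed trait Sign
  case object Bot extends Sign
  case object Neg extends Sign
  case object Zero extends Sign
  case object Pos extends Sign
  case object Top extends Sign

  // Finite concrete universe, chosen only so the check can enumerate it.
  val universe: Set[Int] = (-2 to 2).toSet
  val signs: List[Sign] = List(Bot, Neg, Zero, Pos, Top)

  // Abstract order: Bot is below everything, Top is above everything.
  def leq(a: Sign, b: Sign): Boolean = a == Bot || b == Top || a == b

  // Abstraction: the most precise sign describing a set of integers.
  def alpha(s: Set[Int]): Sign =
    if (s.isEmpty) Bot
    else if (s.forall(_ < 0)) Neg
    else if (s.forall(_ == 0)) Zero
    else if (s.forall(_ > 0)) Pos
    else Top

  // Concretization: all integers of the universe described by a sign.
  def gamma(a: Sign): Set[Int] = a match {
    case Bot  => Set.empty
    case Neg  => universe.filter(_ < 0)
    case Zero => universe.filter(_ == 0)
    case Pos  => universe.filter(_ > 0)
    case Top  => universe
  }

  // alpha(x) below a  <=>  x subset of gamma(a), for every x and a.
  val holds: Boolean = universe.subsets().forall { s =>
    signs.forall(a => leq(alpha(s), a) == s.subsetOf(gamma(a)))
  }
}
```

Here the concrete order is set inclusion, so the right-hand side of the condition becomes `s.subsetOf(gamma(a))`.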
  • 43. CPO • There exists a partial order ⊑ • There exists a least element x where x ⊑ y (for all y ∈ D) • For every totally ordered subset of D, there exists a least upper bound x where x ∈ D
  • 44. Lattices Partially ordered set in which every two elements have a unique LUB(⊔) and a unique GLB(⊓)
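  • 45. Continuous Function $\forall \text{ ordered subset } S \subseteq D,\ F(\bigsqcup_{x \in S} x) = \bigsqcup_{x \in S} F(x)$ [Figure: $F$ maps elements $x$, $y$, $z$ of $D$ to $F(x)$, $F(y)$, $F(z)$ in $D$]
  • 46. Abstract Interpretation in a Nutshell • Domain: the concrete domain $D$ should be a CPO; the abstract domain $\hat{D}$ should be a CPO; the two are related by a Galois connection $\alpha : D \to \hat{D}$, $\gamma : \hat{D} \to D$ • Semantic function: the concrete $F : D \to D$ should be continuous; the abstract $\hat{F} : \hat{D} \to \hat{D}$ should be monotonic • Program execution: concrete $\mathrm{lfp}\,F = \bigsqcup_{i \in \mathbb{N}} F^i(\bot)$; abstract $\bigsqcup_{i \in \mathbb{N}} \hat{F}^i(\hat{\bot}) \sqsubseteq \hat{X}$ • Performing analysis using abstract interpretation = calculating $\hat{X}$ in a finite time • And the following formula is always satisfied (soundness guarantee): $\mathrm{lfp}\,F \sqsubseteq \gamma(\hat{X})$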
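  • 47. Abstract Interpretation in a Nutshell $\mathrm{lfp}\,F \sqsubseteq \gamma(\hat{X})$, $\alpha \circ F \sqsubseteq \hat{F} \circ \alpha$ [Figure: $\mathrm{lfp}\,F$ in $D$, and $\mathrm{lfp}\,\hat{F}$ and $\hat{X}$ in $\hat{D}$, connected by the abstraction and concretization functions; the over-approximated region is labeled false positives]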
  • 48. Is this program safe from buffer overruns? void foo(int x) { String[] strs = new String[10]; int index = 0; if(x > 0) { index = 1; } else { index = 10; } strs[index] = "hello!"; }
  • 49. void foo(int x) { String[] strs = new String[10]; int index = 0; if(x > 0) { index = 1; } else { index = 10; } strs[index] = "hello!"; } index = [0,0] index = [1,1] index = [10,10] index = [1,10]
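  • 50. Interval analysis based on abstract interpretation • Concrete domain: the domain in the real world $\mathit{Memory} = \mathit{Var} \to \mathit{Value}$, $\mathit{Value} = 2^{\mathbb{Z}}$, $\mathcal{C} \in C \to \mathit{Memory} \to \mathit{Memory}$, $\mathcal{V} \in E \to \mathit{Memory} \to \mathit{Value}$
  • 51. Interval analysis based on abstract interpretation • Concrete semantics: the semantics in the real world $\mathcal{C}[\![x := E]\!]\,m = m\{x \mapsto \mathcal{V}[\![E]\!]\,m\}$, $\mathcal{C}[\![\texttt{if}(E)\ C_1\ C_2]\!]\,m = \mathcal{V}[\![E]\!]\,m\ ?\ \mathcal{C}[\![C_1]\!]\,m : \mathcal{C}[\![C_2]\!]\,m$, $\mathcal{C}[\![\texttt{while}(E)\ C]\!]\,m = \mathcal{V}[\![E]\!]\,m\ ?\ \mathcal{C}[\![\texttt{while}(E)\ C]\!](\mathcal{C}[\![C]\!]\,m) : m$, $\mathcal{C}[\![C_1; C_2]\!]\,m = \mathcal{C}[\![C_2]\!](\mathcal{C}[\![C_1]\!]\,m)$, $\mathcal{V}[\![x]\!]\,m = m\ x$, $\mathcal{V}[\![n]\!]\,m = \{n\}$, $\mathcal{V}[\![E_1 + E_2]\!]\,m = (\mathcal{V}[\![E_1]\!]\,m) + (\mathcal{V}[\![E_2]\!]\,m)$
  • 52. Interval analysis based on abstract interpretation • Concrete execution of a program $\bot \sqsubset F(\bot) \sqsubset F(F(\bot)) \sqsubset F(F(F(\bot))) \dots \sqsubset F^i(\bot) = F^{i+1}(\bot)$ • $F^i(\bot) \in \mathit{Memory}$ is the execution result of a program • $F = \lambda m.\ \mathcal{C}[\![C]\!]\,m$ • $\mathrm{lfp}\,F = \bigsqcup_{i \in \mathbb{N}} F^i(\{\})$
  • 53. Interval analysis based on abstract interpretation • Abstract domain: the domain we will use in an analysis $\widehat{\mathit{Memory}} = \mathit{Var} \to \widehat{\mathit{Value}}$, $\widehat{\mathit{Value}} = \hat{\mathbb{Z}} \cup \{\bot\}$, $\hat{\mathbb{Z}} = \{[a, b] \mid a \in \mathbb{Z} \cup \{-\infty\},\ b \in \mathbb{Z} \cup \{\infty\},\ a \le b\}$, $\hat{\mathcal{C}} \in C \to \widehat{\mathit{Memory}} \to \widehat{\mathit{Memory}}$, $\hat{\mathcal{V}} \in E \to \widehat{\mathit{Memory}} \to \widehat{\mathit{Value}}$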
  • 54. Lattice of Interval Domain [Figure: Hasse diagram with ⊥ at the bottom; singleton intervals such as [-1,-1], [0,0], [1,1] above it; progressively wider intervals such as [-1,0], [0,1], [-1,1], [0,2] above those; half-bounded intervals such as [0,∞] and [-∞,0] near the top; and [-∞,∞] at the top]
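  • 55. Interval analysis based on abstract interpretation • Abstract semantics: the semantics we will use in an analysis $\hat{\mathcal{C}}[\![x := E]\!]\,\hat{m} = \hat{m}\{x \mapsto \hat{\mathcal{V}}[\![E]\!]\,\hat{m}\}$, $\hat{\mathcal{C}}[\![\texttt{if}(E)\ C_1\ C_2]\!]\,\hat{m} = \hat{\mathcal{C}}[\![C_1]\!]\,\hat{m} \sqcup \hat{\mathcal{C}}[\![C_2]\!]\,\hat{m}$, $\hat{\mathcal{C}}[\![\texttt{while}(E)\ C]\!]\,\hat{m} = \hat{m} \sqcup \hat{\mathcal{C}}[\![\texttt{while}(E)\ C]\!](\hat{\mathcal{C}}[\![C]\!]\,\hat{m})$, $\hat{\mathcal{C}}[\![C_1; C_2]\!]\,\hat{m} = \hat{\mathcal{C}}[\![C_2]\!](\hat{\mathcal{C}}[\![C_1]\!]\,\hat{m})$, $\hat{\mathcal{V}}[\![x]\!]\,\hat{m} = \hat{m}\ x$, $\hat{\mathcal{V}}[\![n]\!]\,\hat{m} = \alpha(\{n\})$, $\hat{\mathcal{V}}[\![E_1 + E_2]\!]\,\hat{m} = (\hat{\mathcal{V}}[\![E_1]\!]\,\hat{m})\ \hat{+}\ (\hat{\mathcal{V}}[\![E_2]\!]\,\hat{m})$
  • 56. Interval analysis based on abstract interpretation • Abstract execution of a program $\hat{F} = \lambda \hat{m}.\ \hat{\mathcal{C}}[\![C]\!]\,\hat{m}$, $\bigsqcup_{i \in \mathbb{N}} \hat{F}^i(\{\}) \sqsubseteq \hat{X}$, $\hat{\bot} \sqsubset \hat{F}(\hat{\bot}) \sqsubset \hat{F}(\hat{F}(\hat{\bot})) \sqsubset \hat{F}(\hat{F}(\hat{F}(\hat{\bot}))) \dots \sqsubset \hat{F}^i(\hat{\bot}) \sqsubseteq \hat{X}$ • $\hat{X}$ is the analysis result of a program
  • 57. Interval analysis based on abstract interpretation • Widening • What if this chain has infinite length? $\hat{\bot} \sqsubset \hat{F}(\hat{\bot}) \sqsubset \hat{F}(\hat{F}(\hat{\bot})) \sqsubset \hat{F}(\hat{F}(\hat{F}(\hat{\bot}))) \dots \sqsubset \hat{F}^i(\hat{\bot}) \sqsubseteq \hat{X}$ • We need a widening operator $\nabla$: $\hat{\bot} \sqsubset \hat{F}(\hat{\bot}) \sqsubset \hat{F}(\hat{F}(\hat{\bot})) \sqsubset \hat{F}(\hat{F}(\hat{F}(\hat{\bot}))) \dots \sqsubset \hat{F}^{i-1}(\hat{\bot})\ \nabla\ \hat{F}^i(\hat{\bot}) \sqsubseteq \hat{X}$
  • 58. Interval analysis based on abstract interpretation • Widening $\hat{\bot} \sqsubset [0, 0] \sqsubset [0, 1] \sqsubset [0, 2] \dots \sqsubset [0, i-1]\ \nabla\ [0, i] \sqsubseteq [0, \infty]$ void foo(int x) { String[] strs = new String[10]; int index = 0; while(x > 0) { index = index + 1; x = x - 1; } strs[index] = "hello!"; } index = [0,0] index = [1,∞] index = [0,∞]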
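To make the effect of widening concrete, here is a minimal sketch of the loop-head computation for `index` in the program above. It assumes the standard interval widening (an unstable bound jumps to infinity) and applies it at every iteration, whereas the slide applies it only after a few plain steps. The names `WideningDemo`, `Itv`, `widen`, and `step` are hypothetical, and bounds are stored as `Double` only to have infinities available.

```scala
object WideningDemo {
  // An interval [lo, hi]; Double is used so that infinite bounds exist.
  final case class Itv(lo: Double, hi: Double)

  // Least upper bound of two intervals.
  def join(a: Itv, b: Itv): Itv =
    Itv(math.min(a.lo, b.lo), math.max(a.hi, b.hi))

  // Standard interval widening: a bound that keeps moving jumps to infinity.
  def widen(old: Itv, next: Itv): Itv = Itv(
    if (next.lo < old.lo) Double.NegativeInfinity else old.lo,
    if (next.hi > old.hi) Double.PositiveInfinity else old.hi)

  // Abstract effect of the loop body on index: index = index + 1.
  def body(i: Itv): Itv = Itv(i.lo + 1, i.hi + 1)

  // Value of index on loop entry: int index = 0.
  val init: Itv = Itv(0, 0)

  // One step at the loop head: entry value joined with the back edge.
  def step(x: Itv): Itv = join(init, body(x))

  // Iterate with widening until the loop-head value is stable.
  def analyze(): (Itv, Int) = {
    var current = init
    var iterations = 0
    var stable = false
    while (!stable) {
      val next = widen(current, step(current))
      stable = next == current
      current = next
      iterations += 1
    }
    (current, iterations)
  }
}
```

Without `widen`, the sequence [0,0], [0,1], [0,2], ... would never stabilize; with it, the loop-head value reaches [0,∞] after the first step and is confirmed stable on the second.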
  • 59. Is this program safe from buffer overruns? void foo(int x) { String[] strs = new String[10]; int index = 0; if(x > 0) { index = 1; } else { index = 10; } strs[index] = "hello!"; }
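  • 60. Interval analysis based on abstract interpretation [Figure: syntax tree of the program with nodes labeled 0 to 6] index = 0; if(x > 0) index = 1 else index = 10; result = index • $\hat{\mathcal{C}}[\![C_0]\!]\,\hat{m} = \hat{\mathcal{C}}[\![C_2]\!](\hat{\mathcal{C}}[\![C_1]\!]\,\hat{m})$, $\hat{\mathcal{C}}[\![C_1]\!]\,\hat{m} = \hat{m}\{\mathit{index} \mapsto \alpha(\{0\})\}$, $\hat{\mathcal{C}}[\![C_2]\!]\,\hat{m} = \hat{\mathcal{C}}[\![C_4]\!](\hat{\mathcal{C}}[\![C_3]\!]\,\hat{m})$, $\hat{\mathcal{C}}[\![C_3]\!]\,\hat{m} = \hat{\mathcal{C}}[\![C_5]\!]\,\hat{m} \sqcup \hat{\mathcal{C}}[\![C_6]\!]\,\hat{m}$, $\hat{\mathcal{C}}[\![C_4]\!]\,\hat{m} = \hat{m}\{\mathit{result} \mapsto \hat{m}\ \mathit{index}\}$, $\hat{\mathcal{C}}[\![C_5]\!]\,\hat{m} = \hat{m}\{\mathit{index} \mapsto \alpha(\{1\})\}$, $\hat{\mathcal{C}}[\![C_6]\!]\,\hat{m} = \hat{m}\{\mathit{index} \mapsto \alpha(\{10\})\}$
  • 61. Interval analysis based on abstract interpretation $\hat{\mathcal{C}}[\![C_0]\!]\,\{\} = \hat{\mathcal{C}}[\![C_2]\!](\hat{\mathcal{C}}[\![C_1]\!]\,\{\})$, $\hat{\mathcal{C}}[\![C_1]\!]\,\{\} = \{\mathit{index} \mapsto [0, 0]\}$, $\hat{\mathcal{C}}[\![C_2]\!]\,\{\mathit{index} \mapsto [0, 0]\} = \hat{\mathcal{C}}[\![C_4]\!](\hat{\mathcal{C}}[\![C_3]\!]\,\{\mathit{index} \mapsto [0, 0]\})$, $\hat{\mathcal{C}}[\![C_3]\!]\,\{\mathit{index} \mapsto [0, 0]\} = \hat{\mathcal{C}}[\![C_5]\!]\,\{\mathit{index} \mapsto [0, 0]\} \sqcup \hat{\mathcal{C}}[\![C_6]\!]\,\{\mathit{index} \mapsto [0, 0]\}$, $\hat{\mathcal{C}}[\![C_4]\!]\,\{\mathit{index} \mapsto [1, 10]\} = \{\mathit{index} \mapsto [1, 10],\ \mathit{result} \mapsto [1, 10]\}$, $\hat{\mathcal{C}}[\![C_5]\!]\,\{\mathit{index} \mapsto [0, 0]\} = \{\mathit{index} \mapsto [1, 1]\}$, $\hat{\mathcal{C}}[\![C_6]\!]\,\{\mathit{index} \mapsto [0, 0]\} = \{\mathit{index} \mapsto [10, 10]\}$, $\hat{\mathcal{C}}[\![C_0]\!]\,\{\} = \{\mathit{index} \mapsto [1, 10],\ \mathit{result} \mapsto [1, 10]\}$
  • 62. Is this program safe from buffer overruns? void foo(int x) { String[] strs = new String[10]; int index = 0; if(x > 0) { index = 1; } else { index = 10; } strs[index] = "hello!"; } • index may hold any integer between 1 and 10 • Since the size of the buffer strs is 10 (valid indices are 0 to 9), ArrayIndexOutOfBoundsException may occur at strs[index]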
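The hand calculation on slides 60 to 62 can be replayed step by step. The sketch below is an assumption-laden illustration, not Yoyak code: abstract memories are plain maps from variable names to intervals, a missing variable is treated as bottom, and `mayOverrun` is a hypothetical checker for the array store.

```scala
object IfExample {
  // An interval [lo, hi] over Int; enough for this branch-only example.
  final case class Itv(lo: Int, hi: Int)

  // Least upper bound of two intervals.
  def join(a: Itv, b: Itv): Itv =
    Itv(math.min(a.lo, b.lo), math.max(a.hi, b.hi))

  // Abstract memory: variable name -> interval (absent means bottom).
  type AbsMem = Map[String, Itv]

  // Pointwise join of two abstract memories.
  def joinMem(a: AbsMem, b: AbsMem): AbsMem =
    b.foldLeft(a) { case (acc, (k, v)) =>
      acc.updated(k, acc.get(k).map(join(_, v)).getOrElse(v))
    }

  val m0: AbsMem = Map.empty
  val m1: AbsMem = m0.updated("index", Itv(0, 0))           // C1: index = 0
  val thenBranch: AbsMem = m1.updated("index", Itv(1, 1))   // C5: index = 1
  val elseBranch: AbsMem = m1.updated("index", Itv(10, 10)) // C6: index = 10
  val m3: AbsMem = joinMem(thenBranch, elseBranch)          // C3: join of both branches
  val m4: AbsMem = m3.updated("result", m3("index"))        // C4: result = index

  // An access may overrun when some index in the interval is outside [0, size - 1].
  def mayOverrun(index: Itv, size: Int): Boolean =
    index.lo < 0 || index.hi >= size

  val alarm: Boolean = mayOverrun(m4("index"), 10)
}
```

The join of the two branches yields [1,10] for `index`, and since 10 is not a valid index into an array of size 10, the checker raises an alarm, matching the conclusion on slide 62.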
  • 63. Yoyak Do not reinvent the wheel https://trimaps.com/assets/website/dontreinventthemap-6ba62b8ba05d4957d2ed772584d7e4cd.png
  • 64. Motivation • Do not reinvent the wheel: many components that static analyzers often use are reusable • CFG data types: construction, optimization, visualization • Graph algorithms: unrolling loops, finding loop heads, finding topological order • Intermediate language data types: construction, optimization, pretty printing • Common abstract domains: integer interval, abstract object, abstract memory • Common abstract semantics: assignment, invoking methods, evaluating binary expressions
  • 65. Motivation • A perfect fit for a framework: the theory of abstract interpretation guarantees soundness and termination of the analysis if a user supplies a valid abstract domain and semantics [Figure: an abstract domain D and abstract semantics F feed a generic fixed point computation engine, which produces a fixed point x = F(x) (x ∈ D)]
  • 66. Overview [Figure: component diagram of Yoyak. Group labels: Abstract Domain, Fixed Point Computation, Abstract Semantics, IL. Components shown: MapDom, MemDom, Interval, ArithmeticOps, LatticeOps, StdSemantics, ForwardAnalysis, AbstractTransferable, Widening, Galois, FlowSensitiveFixedPointComputation, Worklist, WideningAtLoopHeads, Interprocedural Iteration, DoWidening, CommonIL, Attachable, Typable]
  • 67. Fixed-point Computation in Yoyak • Built-in work-list algorithm [Figure: example control-flow graph from ENTRY to EXIT whose nodes include x := 10, Assume (y == 0), Assume (y != 0), Assume (y == 1), Assume (y != 1), Assume (z), Assume (!z), println("0"), println("2"), println("done"), throw new Ex(); and return;] def computeFixedPoint(startNodes: List[BasicBlock])(implicit widening: Option[Widening[D]] = None) : MapDom[BasicBlock,D] = { worklist.add(startNodes:_*) var map = MapDom.empty[BasicBlock,D] while(worklist.size() > 0) { val bb = worklist.pop().get val prevInputs = memoryFetcher(map,bb) val prev = getInput(map,prevInputs) val (mapOut,next) = work(map,prev,bb) val orig = map.get(bb) val isStableOpt = ops.<=(next,orig) if(isStableOpt.isEmpty) { println("error: abs. transfer func. is not distributive") } if(!isStableOpt.get) { val widened = if(widening.nonEmpty) { doWidening(widening.get)(orig,next,bb) } else next map = mapOut.update(bb->widened) val nextWork = getNextBlocks(bb) worklist.add(nextWork:_*) } } map
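For readers who want to experiment with the idea outside the framework, the following is a simplified, self-contained work-list loop. It is a sketch in the spirit of the slide, not Yoyak's implementation: the `Lattice` trait, `fixedPoint`, and the toy "blocks possibly executed before this one" analysis are all assumptions made for illustration, and widening is omitted because the example lattice is finite.

```scala
object WorklistSketch {
  // Hypothetical lattice interface; Yoyak's own LatticeOps differs.
  trait Lattice[D] {
    def bottom: D
    def leq(a: D, b: D): Boolean
    def join(a: D, b: D): D
  }

  // Propagate abstract values along CFG edges until no input changes.
  def fixedPoint[N, D](entry: N, init: D, succs: N => List[N], transfer: (N, D) => D)(
      implicit lat: Lattice[D]): Map[N, D] = {
    var input: Map[N, D] = Map(entry -> init)
    var worklist: List[N] = List(entry)
    while (worklist.nonEmpty) {
      val node = worklist.head
      worklist = worklist.tail
      val out = transfer(node, input.getOrElse(node, lat.bottom))
      for (next <- succs(node)) {
        val old = input.getOrElse(next, lat.bottom)
        val merged = lat.join(old, out)
        // Re-queue a successor only when its input actually grew.
        if (!lat.leq(merged, old)) {
          input = input.updated(next, merged)
          worklist = worklist :+ next
        }
      }
    }
    input
  }

  // Finite power-set lattice over block identifiers.
  implicit val setLattice: Lattice[Set[Int]] = new Lattice[Set[Int]] {
    val bottom: Set[Int] = Set.empty
    def leq(a: Set[Int], b: Set[Int]): Boolean = a.subsetOf(b)
    def join(a: Set[Int], b: Set[Int]): Set[Int] = a union b
  }

  // A diamond 0 -> {1, 2} -> 3 with a back edge 3 -> 1.
  val cfg: Map[Int, List[Int]] =
    Map(0 -> List(1, 2), 1 -> List(3), 2 -> List(3), 3 -> List(1))

  // Toy analysis: which blocks may have executed before each block.
  val result: Map[Int, Set[Int]] =
    fixedPoint[Int, Set[Int]](0, Set.empty[Int], cfg, (n, seen) => seen + n)
}
```

Because of the back edge, blocks 1 and 3 are revisited until their inputs stop growing, which is the same stability test the `ops.<=(next,orig)` check performs on the slide.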
  • 68. Fixed-point Computation in Yoyak Built-in work-list algorithm trait FlowSensitiveFixedPointComputation[D<:Galois] extends FlowSensitiveIteration[D] with CfgNavigator[D] with DoWidening[D] { def computeFixedPoint(startNodes: List[BasicBlock])(implicit widening: Option[Widening[D]] = None) : MapDom[BasicBlock,D] = { class FlowSensitiveForwardAnalysis[D<:Galois](val cfg: CFG)( implicit val ops: LatticeOps[D], val absTransfer: AbstractTransferable[D], val widening: Option[Widening[D]] = None) extends FlowSensitiveFixedPointComputation[D] with WideningAtLoopHeads[D] {
  • 69. Abstract Semantics in Yoyak Built-in work-list algorithm trait AbstractTransferable[D<:Galois] { protected def transferIdentity(stmt: Identity, input: D#Abst)( implicit context: Context) : D#Abst = input protected def transferAssign(stmt: Assign, input: D#Abst)( implicit context: Context) : D#Abst = input protected def transferInvoke(stmt: Invoke, input: D#Abst)( implicit context: Context) : D#Abst = input protected def transferIf(stmt: If, input: D#Abst)( implicit context: Context) : D#Abst = input protected def transferAssume(stmt: Assume, input: D#Abst)( implicit context: Context) : D#Abst = input // so on
  • 70. Abstract Semantics in Yoyak Built-in standard semantic trait StdSemantics[A<:Galois,D,Mem<:MemDomLike[A,D,Mem]] extends AbstractTransferable[GaloisIdentity[Mem]] { val arithOps : ArithmeticOps[A] override protected def transferAssign(stmt: Assign, input: Mem)( implicit context: Context) : Mem = { val (rv,output) = eval(stmt.rv,input) output.update(stmt.lv,rv) }
  • 71. Abstract Domain in Yoyak Composable abstract domains class MapDom[K,V <: Galois : LatticeOps] { trait LatticeOps[D <: Galois] extends ParOrdOps[D] { def /(lhs: D#Abst, rhs: D#Abst) : D#Abst def bottom : D#Abst trait ParOrdOps[D <: Galois] { def <=(lhs: D#Abst, rhs: D#Abst) : Option[Boolean] trait Galois { type Conc type Abst
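The idea of composable domains can be illustrated with ordinary Scala type classes: given lattice operations for a value domain, lattice operations for maps into that domain follow pointwise. This is a hedged sketch with hypothetical names (`Lattice`, `setLattice`, `mapLattice`); it does not use the `Galois`, `D#Abst`, or `Option[Boolean]` encoding shown on the slide.

```scala
object ComposableDomains {
  // Hypothetical, simplified counterpart of a lattice-operations type class.
  trait Lattice[D] {
    def bottom: D
    def leq(a: D, b: D): Boolean
    def join(a: D, b: D): D
  }

  // Power-set domain: sets ordered by inclusion.
  implicit def setLattice[A]: Lattice[Set[A]] = new Lattice[Set[A]] {
    def bottom: Set[A] = Set.empty
    def leq(a: Set[A], b: Set[A]): Boolean = a.subsetOf(b)
    def join(a: Set[A], b: Set[A]): Set[A] = a union b
  }

  // Map domain derived from any value domain; missing keys mean bottom.
  implicit def mapLattice[K, V](implicit ev: Lattice[V]): Lattice[Map[K, V]] =
    new Lattice[Map[K, V]] {
      def bottom: Map[K, V] = Map.empty
      def leq(a: Map[K, V], b: Map[K, V]): Boolean =
        a.forall { case (k, v) => ev.leq(v, b.getOrElse(k, ev.bottom)) }
      def join(a: Map[K, V], b: Map[K, V]): Map[K, V] =
        b.foldLeft(a) { case (acc, (k, v)) =>
          acc.updated(k, ev.join(acc.getOrElse(k, ev.bottom), v))
        }
    }

  // The compiler assembles the map-of-sets lattice from the two instances.
  val ops: Lattice[Map[String, Set[String]]] =
    implicitly[Lattice[Map[String, Set[String]]]]

  val m1: Map[String, Set[String]] = Map("x" -> Set("a"))
  val m2: Map[String, Set[String]] = Map("x" -> Set("b"), "y" -> Set("c"))
  val joined: Map[String, Set[String]] = ops.join(m1, m2)
}
```

The context bound `V <: Galois : LatticeOps` on the slide expresses the same requirement: a map domain can only be built when lattice operations for its values are available.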
  • 72. Abstract Domain in Yoyak Built-in Interval Domain scala> import com.simplytyped.yoyak.framework.domain.arith._ import com.simplytyped.yoyak.framework.domain.arith._ scala> import com.simplytyped.yoyak.framework.domain.arith.Interval._ import com.simplytyped.yoyak.framework.domain.arith.Interval._ scala> val intv1 = Interv.of(10) intv1: com.simplytyped.yoyak.framework.domain.arith.Interval = Interv(IInt(10),IInt(10)) scala> val intv2 = Interv.in(IInt(-10),IInt(10)) intv2: com.simplytyped.yoyak.framework.domain.arith.Interval = Interv(IInt(-10),IInt(10)) scala> val intv3 = Interv.in(IInfMinus,IInf) intv3: com.simplytyped.yoyak.framework.domain.arith.Interval = IntervTop scala> val intv4 = Interv.in(IInt(-10),IInf) intv4: com.simplytyped.yoyak.framework.domain.arith.Interval = Interv(IInt(-10),IInf)
  • 73. Abstract Domain in Yoyak Built-in Interval Domain scala> import IntervalInt.arithOps import IntervalInt.arithOps scala> arithOps.+(intv1,intv2) // [10,10] + [-10,10] res1: com.simplytyped.yoyak.framework.domain.arith.IntervalInt#Abst = Interv(IInt(0),IInt(20)) scala> arithOps.-(intv1,intv2) // [10,10] - [-10,10] res2: com.simplytyped.yoyak.framework.domain.arith.IntervalInt#Abst = Interv(IInt(0),IInt(20)) scala> arithOps.+(intv2,intv3) // [-10,10] + [-∞,∞] res3: com.simplytyped.yoyak.framework.domain.arith.IntervalInt#Abst = IntervTop scala> arithOps.*(intv2,intv4) // [-10,10] * [-10,∞] res4: com.simplytyped.yoyak.framework.domain.arith.IntervalInt#Abst = IntervTop scala> arithOps.*(intv1,intv4) // [10,10] * [-10,∞] res5: com.simplytyped.yoyak.framework.domain.arith.IntervalInt#Abst = Interv(IInt(-100),IInf)
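The REPL results above follow from the usual interval arithmetic rules, which can be reproduced independently. The sketch below is not Yoyak's `Interval` type: `Itv`, `add`, `sub`, and `mul` are hypothetical names, bounds are `Double` values so that infinities are available, and a zero bound multiplied by an infinite bound is defined as zero to avoid NaN.

```scala
object IntervalArith {
  // An interval [lo, hi]; infinite bounds are Double infinities.
  final case class Itv(lo: Double, hi: Double)

  val NegInf: Double = Double.NegativeInfinity
  val PosInf: Double = Double.PositiveInfinity

  // [a, b] + [c, d] = [a + c, b + d]
  def add(x: Itv, y: Itv): Itv = Itv(x.lo + y.lo, x.hi + y.hi)

  // [a, b] - [c, d] = [a - d, b - c]
  def sub(x: Itv, y: Itv): Itv = Itv(x.lo - y.hi, x.hi - y.lo)

  // Product of two bounds, treating 0 * infinity as 0.
  private def mulBound(a: Double, b: Double): Double =
    if (a == 0 || b == 0) 0.0 else a * b

  // [a, b] * [c, d] = [min of the four bound products, max of them]
  def mul(x: Itv, y: Itv): Itv = {
    val products = List(
      mulBound(x.lo, y.lo), mulBound(x.lo, y.hi),
      mulBound(x.hi, y.lo), mulBound(x.hi, y.hi))
    Itv(products.min, products.max)
  }

  // The same operands as in the REPL session.
  val intv1: Itv = Itv(10, 10)
  val intv2: Itv = Itv(-10, 10)
  val intv3: Itv = Itv(NegInf, PosInf)
  val intv4: Itv = Itv(-10, PosInf)
}
```

With these rules, [10,10] + [-10,10] and [10,10] - [-10,10] both give [0,20], and [10,10] * [-10,∞] gives [-100,∞], in line with the session output.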
  • 74. Abstract Domain in Yoyak Built-in Standard Object Model trait StdObjectModel[A<:Galois,D<:Galois,This<:StdObjectModel[A,D,This]] extends MemDomLike[A,D,This] with ArrayJoinModel[A,D,This] { implicit val arithOps : ArithmeticOps[A] implicit val boxedOps : LatticeWithTopOps[D] def update(kv: (Loc,AbsValue[A,D])) : This def remove(loc: Local) : This def alloc(from: Stmt) : (AbsRef,This) def get(k: Loc) : AbsValue[A,D] def isStaticAddr(addr: AbsAddr) : Boolean def isDynamicAddr(addr: AbsAddr) : Boolean class MemDom[A <: Galois : ArithmeticOps, D <: Galois : LatticeWithTopOps] extends StdObjectModel[A,D,MemDom[A,D]] {
  • 75. Abstract Domain in Yoyak Built-in Memory Domain scala> import com.simplytyped.yoyak.framework.domain.mem.MemDom scala> import com.simplytyped.yoyak.framework.domain.mem.MemElems._ scala> import com.simplytyped.yoyak.framework.domain.Galois._ scala> import com.simplytyped.yoyak.framework.domain.arith.Interv scala> import com.simplytyped.yoyak.framework.domain.arith.IntervalInt scala> import com.simplytyped.yoyak.il.CommonIL.Value._ scala> val memory = new MemDom[IntervalInt,SetAbstraction[String]] memory: com.simplytyped.yoyak.framework.domain.mem.MemDom[com.simplytyped.yoyak.framework.domain.arith.IntervalInt,com.simplytyped.yoyak.framework.domain.Galois.SetAbstraction[String]] = com.simplytyped.yoyak.framework.domain.mem.MemDom@8443a1
  • 76. Abstract Domain in Yoyak Built-in Memory Domain scala> val memory2 = memory.update(Local("x") -> AbsArith[IntervalInt](Interv.of(1))) scala> val memory3 = memory.update(Local("x") -> AbsArith[IntervalInt](Interv.of(10))) scala> val memory4 = MemDom.ops[IntervalInt,SetAbstraction[String]].\/(memory2,memory3) scala> memory4.get(Local("x")) res1: com.simplytyped.yoyak.framework.domain.mem.MemElems.AbsValue[com.simplytyped.yoyak.framework.domain.arith.IntervalInt,com.simplytyped.yoyak.framework.domain.Galois.SetAbstraction[String]] = AbsArith(Interv(IInt(1),IInt(10)))
  • 77. IL in Yoyak CommonIL abstract class Stmt extends Attachable { override def equals(that: Any): Boolean = this eq that.asInstanceOf[AnyRef] override def hashCode() : Int = System.identityHashCode(this) private[Stmt] def copyAttr(stmt: Stmt) : this.type = {sourcePos = stmt.pos; this} }
  • 78. IL in Yoyak CommonIL case class Block(stmts: StatementContainer) extends Stmt case class Switch(v: Value.Loc, keys: List[Value.t], targets: List[Target]) extends Stmt case class Placeholder(x: AnyRef) extends Stmt sealed trait CoreStmt extends Stmt case class If(cond: Value.CondBinExp, target: Target) extends CoreStmt case class Goto(target: Target) extends CoreStmt sealed trait CfgStmt extends CoreStmt case class Identity(lv: Value.Local, rv: Value.Param) extends CfgStmt case class Assign(lv: Value.Loc, rv: Value.t) extends CfgStmt case class Invoke(ret: Option[Value.Local], callee: Type.InvokeType) extends CfgStmt case class Assume(cond: Value.CondBinExp) extends CfgStmt case class Return(v: Option[Value.Loc]) extends CfgStmt case class Nop() extends CfgStmt case class EnterMonitor(v: Value.Loc) extends CfgStmt case class ExitMonitor(v: Value.Loc) extends CfgStmt case class Throw(v: Value.Loc) extends CfgStmt
  • 79. IL in Yoyak Stmt: x := 10; switch (y) { case 0: println("0"); break; case 1: println("1"); default: println("2"); } if(z) { throw new Exception(); } else { println("done"); } return 0; CoreStmt: x := 10; if(y == 0) { println("0"); goto D; } if(y == 1) { println("1"); } println("2"); D: if(z) { throw new Exception(); } else { println("done"); } return 0; CfgStmt (control-flow graph nodes): ENTRY, x := 10, Assume (y == 0), println("0"), Assume (y != 0), Assume (y == 1), println("1"), Assume (y != 1), println("2"), Assume (z), throw new Ex(), Assume (!z), println("done"), return, EXIT
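The move from branching statements to `Assume` nodes can be sketched on a toy statement type. The code below is an illustration only: `Prim`, `IfElse`, and `paths` are hypothetical names, loops are not handled, and Yoyak's actual lowering builds a CFG rather than enumerating paths.

```scala
object AssumeSketch {
  // Toy structured statements (hypothetical; not Yoyak's CommonIL).
  sealed trait Stmt
  final case class Prim(text: String) extends Stmt
  final case class IfElse(cond: String, thn: List[Stmt], els: List[Stmt]) extends Stmt

  // Enumerates the straight-line paths of a loop-free statement list.
  // Each branch is replaced by an Assume of the condition or of its negation.
  def paths(stmts: List[Stmt]): List[List[String]] = stmts match {
    case Nil => List(Nil)
    case Prim(t) :: rest => paths(rest).map(t :: _)
    case IfElse(c, thn, els) :: rest =>
      val taken = paths(thn ++ rest).map(s"Assume ($c)" :: _)
      val notTaken = paths(els ++ rest).map(s"Assume (!($c))" :: _)
      taken ++ notTaken
  }
}
```

Once every branch is an `Assume`, a transfer function only has to refine the abstract state by the assumed condition and never needs to model control flow itself.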
  • 80. Simple Interval Analysis in Yoyak class IntervalAnalysis(cfg: CFG) { def run() = { import IntervalAnalysis.{memDomOps,absTransfer,widening} val analysis = new FlowSensitiveForwardAnalysis[GMemory](cfg) val output = analysis.compute output } } object IntervalAnalysis { type Memory = MemDom[IntervalInt,SetAbstraction[Any]] type GMemory = GaloisIdentity[Memory] implicit val absTransfer : AbstractTransferable[GMemory] = new StdSemantics[IntervalInt,SetAbstraction[Any],Memory] { val arithOps: ArithmeticOps[IntervalInt] = IntervalInt.arithOps } implicit val memDomOps : LatticeOps[GMemory] = MemDom.ops[IntervalInt,SetAbstraction[Any]] implicit val widening : Option[Widening[GMemory]] = { implicit val NoWideningForSetAbstraction = Widening.NoWidening[SetAbstraction[Any]] Some(MemDom.widening[IntervalInt,SetAbstraction[Any]]) } }
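To make the roles of the lattice operations, the transfer functions, and widening concrete, here is a self-contained sketch of a worklist fixed-point computation for the loop `i = 0; while (i < 10) i = i + 1`. Everything in it (`Itv`, `join`, `widen`, `edges`, `run`) is a hypothetical stand-in written for this example and assumes a single integer variable; it does not use or mirror Yoyak's `FlowSensitiveForwardAnalysis`.

```scala
object WideningSketch {
  // Interval with Double bounds; None is bottom (unreachable).
  final case class Itv(lo: Double, hi: Double)
  type Abs = Option[Itv]

  def join(a: Abs, b: Abs): Abs = (a, b) match {
    case (None, x) => x
    case (x, None) => x
    case (Some(x), Some(y)) => Some(Itv(math.min(x.lo, y.lo), math.max(x.hi, y.hi)))
  }

  // Standard interval widening: a bound that grew jumps to infinity.
  def widen(old: Abs, next: Abs): Abs = (old, next) match {
    case (Some(o), Some(n)) =>
      Some(Itv(if (n.lo < o.lo) Double.NegativeInfinity else o.lo,
               if (n.hi > o.hi) Double.PositiveInfinity else o.hi))
    case _ => join(old, next)
  }

  // Transfer functions for the single variable i (integer-valued).
  val init: Abs => Abs = _ => Some(Itv(0, 0))
  val assumeLt10: Abs => Abs =
    _.flatMap(v => if (v.lo > 9) None else Some(Itv(v.lo, math.min(v.hi, 9))))
  val assumeGe10: Abs => Abs =
    _.flatMap(v => if (v.hi < 10) None else Some(Itv(math.max(v.lo, 10), v.hi)))
  val incr: Abs => Abs = _.map(v => Itv(v.lo + 1, v.hi + 1))

  // Edges (source, transfer, target): node 0 is entry, 1 the loop head, 2 the exit.
  val edges: List[(Int, Abs => Abs, Int)] = List(
    (0, init, 1),
    (1, assumeLt10 andThen incr, 1),
    (1, assumeGe10, 2)
  )

  def run(): Map[Int, Abs] = {
    val top: Abs = Some(Itv(Double.NegativeInfinity, Double.PositiveInfinity))
    var state: Map[Int, Abs] = Map(0 -> top)
    var work = List(0)
    while (work.nonEmpty) {
      val n = work.head
      work = work.tail
      for ((src, f, dst) <- edges if src == n) {
        val old: Abs = state.getOrElse(dst, None)
        val in: Abs = join(old, f(state.getOrElse(n, None)))
        // Widening is applied only at the loop head.
        val next: Abs = if (dst == 1) widen(old, in) else in
        if (next != old) {
          state = state.updated(dst, next)
          work = work :+ dst
        }
      }
    }
    state
  }
}
```

In this sketch the loop head stabilizes at `[0, +∞]` after two visits, and the exit node gets `[10, +∞]`. The exact invariant at the loop head would be `[0, 10]`; the difference is the precision given up by widening when no narrowing step follows.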
  • 81. Simple Interval Analysis in Yoyak [Architecture diagram; labels: IntervalAnalysis, FlowSensitiveForwardAnalysis, FlowSensitiveFixedPointComputation, FlowSensitiveIteration, Worklist, CfgNavigator, WideningAtLoopHeads, CFG, BasicBlock, Fixed-point result, MemDom, StdObjectModel, MapDom, AbsValue, AbsRef, AbsArith, AbsBox, AbsBottom, AbsTop, AbsObject, AbsAddr, IntervalInt, SetAb[Any], LatticeOps, MemDom.op, Widening, IntervalInt.widening, AbstractTransferable, IntervalAnalysisTransferFunction, StdSemantics, ArithmeticOps, IntervalInt.arithOps]
  • 82. Yoyak : Scala Experience • Scala is a very good language for implementing a static analyzer • Functions are first-class citizens • Type class support • Algebraic data type support • Native support for mutable and immutable values • Excellent support for parallelization
  • 83. Yoyak : Scala Experience • Functions are first-class citizens: a natural way to express mathematical logic // optimize Cfg (insertAssume _ andThen removeIfandGoto) apply rawCfg
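A small, self-contained illustration of the same composition style. The two passes below are hypothetical stand-ins that operate on a list of strings; they are not Yoyak's `insertAssume` and `removeIfandGoto`, which work on real CFGs.

```scala
object ComposeSketch {
  // Hypothetical stand-in: a "CFG" is just a list of statement strings.
  type Cfg = List[String]

  // Pass 1: put an assume in front of every if statement.
  val insertAssume: Cfg => Cfg = cfg => cfg.flatMap {
    case s if s.startsWith("if ") => List("assume " + s.drop(3), s)
    case s => List(s)
  }

  // Pass 2: drop the if and goto statements that the assumes made redundant.
  val removeIfAndGoto: Cfg => Cfg =
    _.filterNot(s => s.startsWith("if ") || s.startsWith("goto "))

  // Passes are ordinary functions, so a pipeline is just function composition.
  val optimize: Cfg => Cfg = insertAssume andThen removeIfAndGoto
}
```

In the slide, `insertAssume _` uses the trailing underscore to turn a method into a function value before `andThen` is applied; the sketch avoids that step by declaring the passes as function values directly.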
  • 84. Yoyak : Scala Experience • Type class support Type classes can avoid F-bounded polymorphism, which is the fast lane to overworking • F-bounded polymorphism • Commonly arises when inheritance meets immutability • Seriously hurts code readability
  • 85. Yoyak : Scala Experience • F-bounded polymorphism trait Queue[T, This <: Queue[T, This]] { def push(elem: T) : This } trait GoodQueue[T, This <: GoodQueue[T, This]] extends Queue[T, This] { def pop : (T, This) } trait BetterQueue[T, R, This <: BetterQueue[T, R, This]] extends GoodQueue[T, This] { def giveMeSomethingNew : R } trait QueueUnited[T, R, Q <: Queue[T, Q], G <: GoodQueue[T, G], B <: BetterQueue[T, R, B], This <: QueueUnited[T, R, Q, G, B, This]] extends BetterQueue[T, R, This] { def giveUp : Unit } • The type of the concrete subclass is always needed • All type variables must be repeated in every subclass reference • Type classes liberate methods from inheritance
  • 86. Yoyak : Scala Experience • Type class trait QueueLike[T,This] { def push(elem: T) : This } trait GoodQueueLike[T,This] { implicit val queueLike : QueueLike[T,This] def push(elem: T) : This = queueLike.push(elem) def pop(q: This) : (T,This) } trait BetterQueueLike[T,R,This] { implicit val goodQueueLike : GoodQueueLike[T,This] def push(elem: T) : This = goodQueueLike.push(elem) def pop(q: This) : (T,This) = goodQueueLike.pop(q) def giveMeSomethingNew : R } class QueueUnited[T,R,This](implicit val q : QueueLike[T,This], g : GoodQueueLike[T,This], b : BetterQueueLike[T,R,This]) { def push(elem: T) : This = b.push(elem) def pop(q: This) : (T,This) = b.pop(q) def giveMeSomethingNew : R = b.giveMeSomethingNew def giveUp : Unit = {} }
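To show what a concrete instance of such a type class could look like, here is a sketch with one instance and two generic functions. The names and signatures (`QueueLike` with an explicit queue argument on `push`, `vectorQueue`, `pushAll`, `drain`) are assumptions made for this example and differ from the slide's traits.

```scala
object QueueTypeClassSketch {
  // Hypothetical type class: operations on a queue type Q holding elements of type T.
  trait QueueLike[T, Q] {
    def push(q: Q, elem: T): Q
    def pop(q: Q): Option[(T, Q)]
  }

  // Instance: an immutable Vector used as a FIFO queue.
  implicit def vectorQueue[T]: QueueLike[T, Vector[T]] = new QueueLike[T, Vector[T]] {
    def push(q: Vector[T], elem: T): Vector[T] = q :+ elem
    def pop(q: Vector[T]): Option[(T, Vector[T])] = q.headOption.map(h => (h, q.tail))
  }

  // Generic code asks only for an instance; no This-type bound is needed.
  def pushAll[T, Q](q: Q, elems: List[T])(implicit ops: QueueLike[T, Q]): Q =
    elems.foldLeft(q)((acc, e) => ops.push(acc, e))

  // Pops until the queue is empty, returning elements in FIFO order.
  def drain[T, Q](q: Q)(implicit ops: QueueLike[T, Q]): List[T] =
    ops.pop(q) match {
      case Some((h, rest)) => h :: drain[T, Q](rest)
      case None => Nil
    }
}
```

The queue type appears as a plain type parameter, so adding an operation means adding a method to the type class or a new type class, rather than threading another self-type through an inheritance chain.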
  • 87. Yoyak : Scala Experience • Type class in Yoyak trait StdObjectModel[A<:Galois,D<:Galois,This<:StdObjectModel[A,D,This]] extends MemDomLike[A,D,This] with ArrayJoinModel[A,D,This] { implicit val arithOps : ArithmeticOps[A] implicit val boxedOps : LatticeWithTopOps[D] Yoyak uses both approaches, each where it is appropriate
  • 88. Yoyak : Scala Experience • Algebraic data type support Natural way to express the abstract syntax tree of a program: if(x) a = 1 else a = 2; println(a) Seq( If("x",Assign("a",1), Assign("a",2)), Invoke("println",List("a")) )
  • 89. Yoyak : Scala Experience • Algebraic data type support Easy to navigate the abstract syntax tree def eval(v: Value.t, input: Mem)(implicit context: Context) : (AbsValue[A,D],Mem) = { v match { case x : Value.Constant => evalConstant(x,input) case x : Value.Loc => evalLoc(x,input) case x : Value.BinExp => evalBinExp(x,input) case Value.This => (AbsRef(Set("$this")),input) case Value.CaughtExceptionRef => (AbsRef(Set("$caughtex")),input) case Value.CastExp(v, ofTy) => evalLoc(v,input) case Value.InstanceOfExp(v, ofTy) => (AbsTop,input) case Value.LengthExp(v) => (AbsTop,input) case Value.NewExp(ofTy) => input.alloc(context.stmt) case Value.NewArrayExp(ofTy, size) => input.alloc(context.stmt)
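The same navigation style on a much smaller tree, as a sketch. `Const`, `Local`, `BinExp`, and this `eval` are simplified, hypothetical counterparts of the `Value` cases above; they evaluate concretely to `Option[Int]` instead of producing abstract values.

```scala
object EvalSketch {
  // Toy value language (hypothetical; far smaller than Yoyak's Value.t).
  sealed trait Value
  final case class Const(n: Int) extends Value
  final case class Local(name: String) extends Value
  final case class BinExp(op: String, lhs: Value, rhs: Value) extends Value

  // One case per constructor; None signals an unbound variable or unknown operator.
  def eval(v: Value, env: Map[String, Int]): Option[Int] = v match {
    case Const(n) => Some(n)
    case Local(x) => env.get(x)
    case BinExp(op, l, r) =>
      for {
        a <- eval(l, env)
        b <- eval(r, env)
        res <- op match {
          case "+" => Some(a + b)
          case "*" => Some(a * b)
          case _ => None
        }
      } yield res
  }
}
```

Because `Value` is sealed, the compiler can warn when a `match` forgets a constructor, which is useful when the IL grows a new expression form.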
  • 90. Yoyak : Scala Experience • Native support for mutable and immutable values [Diagram: Memory variables x, y, z point to an Object with fields f = 1, g = "A"] In some cases, mutability is more important than immutability
  • 91. Yoyak : Scala Experience • Native support for mutable and immutable values [Diagram: Memory variables x, y, z are rebound from Object (f = 1, g = "A") to NewObject (f = 2, g = "A")] memory.filter{_._2 == obj}.foldLeft(memory) { case (m,(k,_)) => m + (k -> newObject) } O(n)
  • 92. Yoyak : Scala Experience • Native support for mutable and immutable values [Diagram: Memory variables x, y, z point to NewObject (f = 2, g = "A")] obj.update(newObject) O(1)
  • 93. Yoyak : Scala Experience • Native support for mutable and immutable values [Diagram: Memory variables x, y, z; Object (f = 1, g = "A"); NewObject (f = 2, g = "A")] Frequently updating immutable objects in a large memory can be severely inefficient
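The cost difference between the two styles can be sketched as follows. `Obj`, `rebind`, and `Cell` are hypothetical names chosen for this example; the immutable version mirrors the `filter`/`foldLeft` expression from the slide, and the mutable version assumes that all aliases share one cell.

```scala
object AliasUpdateSketch {
  final case class Obj(f: Int, g: String)

  // Immutable style: every variable bound to the old object is rebound,
  // which scans the whole memory map, O(n).
  def rebind(memory: Map[String, Obj], old: Obj, fresh: Obj): Map[String, Obj] =
    memory.filter(_._2 == old).foldLeft(memory) { case (m, (k, _)) => m + (k -> fresh) }

  // Mutable style: aliases share one cell, so a single write is seen
  // through every variable, O(1).
  final class Cell(var contents: Obj)
}
```

With the mutable cell, `x`, `y`, and `z` can all map to the same `Cell`, and assigning `cell.contents` once updates what each of them observes. The price is that older memory snapshots holding that cell change too, so the choice depends on whether previous states must stay intact.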
  • 94. Yoyak : Scala Experience • Excellent support for parallelization • Static analysis does not yet take full advantage of today's advances in scalable computing (multicore machines, big data technologies, cloud computing) • Scala has an excellent platform for experimenting with parallelization: Akka • Many fun things to try with Yoyak powered by Akka
  • 95. Yoyak : Scala Experience • Excellent support for parallelization Worklist parallelization can be implemented naturally with Akka's actor model
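The idea of processing independent work items in parallel and then joining the results can be sketched without Akka. The example below deliberately swaps in `scala.concurrent.Future` from the standard library as a stand-in for actors, and `analyze` is a hypothetical placeholder for a per-item analysis; it is not how Yoyak distributes its worklist.

```scala
import scala.concurrent.{Await, Future}
import scala.concurrent.ExecutionContext.Implicits.global
import scala.concurrent.duration._

object ParallelSketch {
  // Hypothetical per-item analysis: collects a set of "facts" for one block.
  def analyze(block: List[Int]): Set[Int] = block.map(_ % 3).toSet

  // Analyzes all blocks concurrently, then joins the partial results.
  def analyzeAll(blocks: List[List[Int]]): Set[Int] = {
    val futures = blocks.map(b => Future(analyze(b)))
    val results = Await.result(Future.sequence(futures), 10.seconds)
    // Join is set union here: commutative and associative, so the
    // completion order of the futures does not affect the result.
    results.foldLeft(Set.empty[Int])(_ union _)
  }
}
```

The property that makes this safe is the one a lattice join already provides: partial results can be combined in any order.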
  • 96. Yoyak : Roadmap • Add more built-in abstract domains • Optimize analysis performance • Visualize analysis details • Build Scala compiler plug-in
  • 97. Yoyak : Roadmap • Add more built-in abstract domains The interval domain cannot represent the relation between two variables: x = [2,8], y = [1,7] produces 49 combinations of (x,y) pairs [Plot: X axis vs. Y axis]
  • 98. Yoyak : Roadmap • Add more built-in abstract domains The octagon domain can represent the relation between two variables [Plot: X axis vs. Y axis] http://www.di.ens.fr/~mine/publi/article-mine-HOSC06.pdf
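A small worked example of the precision gap, under the assumption that the program keeps `y` equal to `x - 1` (this relation is invented for illustration; the slides do not state one). The octagon domain tracks constraints of the form ±x ± y ≤ c, which is enough to keep that relation.

```scala
object RelationSketch {
  // Assumed concrete states: x ranges over 2..8 and y is always x - 1.
  val concrete: Set[(Int, Int)] = (2 to 8).map(x => (x, x - 1)).toSet

  // Interval abstraction: only per-variable bounds survive, x in [2,8], y in [1,7].
  val xs: Set[Int] = concrete.map(_._1)
  val ys: Set[Int] = concrete.map(_._2)
  val intervalPairs: Set[(Int, Int)] =
    (for (x <- xs.min to xs.max; y <- ys.min to ys.max) yield (x, y)).toSet

  // Adding the octagon-style constraints x - y <= 1 and y - x <= -1
  // on top of the same bounds recovers the relation.
  val octagonPairs: Set[(Int, Int)] =
    intervalPairs.filter { case (x, y) => x - y <= 1 && y - x <= -1 }
}
```

The interval abstraction admits all 49 pairs of the box, while the two relational constraints cut it back to the 7 pairs that can actually occur.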
  • 99. Yoyak : Roadmap • Add more built-in abstract domains The 2-interval domain is more precise than the interval domain [Plot: X axis vs. Y axis]
  • 100. Yoyak : Roadmap • Optimize analysis performance • {Worklist, Method, Class}-level parallelization • Reduce abstract memory size by removing unused variables (faster join operation for abstract memory) • Optional faster but unsound analysis
  • 101. Yoyak : Roadmap • Visualize analysis details It is hard to know what a static analyzer is doing at a specific moment because… • A static analyzer's behavior differs greatly from one input program to another • One often needs to inspect and compare maps with thousands of entries • Ordinary Java debuggers cannot show the big picture
  • 102. Yoyak : Roadmap • Visualize analysis details Example from SAT solvers: visualization of the search tree generated by a basic DPLL algorithm (DPVis)
  • 103. Yoyak : Roadmap • Build Scala compiler plug-in • Programming language researchers foresee that semantic program analyzers will be merged into compiler systems in the near future, as type systems were [Diagram labels: Syntactic Analysis, Grammar Checking, Type System, Semantic Analysis]
  • 104. Yoyak : Roadmap • Build Scala compiler plug-in • The Scala compiler is well modularized and cleanly coded (compared to other compiler systems), so it is an excellent platform for experimenting with new ideas • Pure Scala code is safe from null; linked Java libraries, however, are not • It would be great if the Scala compiler could detect possible null dereferences at compile time and issue a warning
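The null problem at the Java boundary is easy to reproduce. The sketch below uses `System.getProperty`, a JDK method documented to return `null` for an unknown key, and wraps it in `Option` by hand; `propLength` is a hypothetical helper, and this manual wrapping is today's workaround, not the proposed compiler plug-in.

```scala
object NullSketch {
  // java.lang.System.getProperty returns null when the key is not set.
  // Calling .length on that result directly would throw a NullPointerException;
  // Option(...) turns a null into None before the value is used.
  def propLength(key: String): Option[Int] =
    Option(System.getProperty(key)).map(_.length)
}
```

A static analysis in the compiler could flag the unwrapped call instead of relying on each call site to remember the `Option(...)` wrapper.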
  • 105. Thank you! Further Questions, ScalaDays 2015 twitter @heejongl gmail heejong@gmail.com