Differential Testing, Java Performance Evaluation
and Execution Time Comparison of two Binary
Decision Diagram Libraries
Xia Xiao
Institute for Software Research, Carnegie Mellon University, Pittsburgh, PA
New York University, New York City, NY
xx681@nyu.edu
Abstract
As one of the data structures used to represent Boolean functions
in computer science, the Binary Decision Diagram (BDD), or
branching program, is a compressed and abstract representation
of sets and relations. Compared with other data structures such as
negation normal form (NNF) and propositional directed acyclic
graphs (PDAG), the BDD has several advantageous features. In most
cases, the term BDD refers to the Reduced Ordered Binary Decision
Diagram (ROBDD). The variables on the nodes of a BDD appear in a
fixed order (for example alphabetical or numerical), which saves
the trouble of traversing the whole graph to look for a particular
node within the structure. The structure is also reduced, which
means that redundant and duplicated nodes within the same
structure are eliminated. Most important of all, the ROBDD
representation of a function is canonical for a given variable
order. This property makes it simple to check the functional
equivalence of different Boolean functions, and it is useful in
operations such as technology mapping.
Because of this canonical form, a Boolean function f can be
transformed into a reduced and ordered BDD. This simplifies the
work of building truth tables and comparing truth values across
different combinations of operations on multiple variables. It is
therefore worthwhile to study the Java performance of BDD
libraries.
Several implementations of BDD libraries already exist. We are
particularly interested in comparing the performance of the
JavaBDD library (available at
http://javabdd.sourceforge.net/index.html) with our own
implementation of a Binary Decision Diagram library. By applying
differential testing to both BDD libraries, we check the
functional equivalence of their operations to make sure that our
BDD library builds the correct structure and that no bugs occur
during execution. In the differential testing file, we measure the
execution time of constructing different numbers of BDD objects
with both libraries and collect data from multiple runs of the
JUnit test file. We then use Minitab Express to generate 6
histograms and 6 boxplots of the execution times of the two BDD
libraries in order to compare their running speed. For statistical
support, we use a t-test and confidence intervals to check that
the difference in their performance is statistically significant.
This paper shows that the JavaBDD library has a shorter execution
time when constructing BDD objects, and that its advantage becomes
more obvious as the number of BDDs increases. This conclusion is
supported by differential testing and statistically rigorous
methodology. In addition, we advocate studying the structure of
the existing JavaBDD library and applying its advantages to the
development of our own BDD library. We would also like to
encourage other researchers to put effort into improving and
refining Java BDD implementations in future work, since this would
benefit future research on data structures and Boolean functions.
Keywords
Binary Decision Diagram, Java, Differential Testing, Java
Performance Evaluation, Statistics, Methodology
1.   Introduction
The Boolean function has been an important concept in both
mathematics and logic for many years. It describes how to
determine a Boolean output value from Boolean inputs by logical
calculation. Many applications have been derived from the theory.
In general, a Boolean function has the following form:
$f : B^k \to B$
In the formula above, B = {0, 1} is called the Boolean domain and
k is a non-negative integer called the arity of the function. In the
case where k = 0, the "function" is essentially a constant element
of B. Every k-ary Boolean function can be expressed as a
propositional formula in k variables $x_1, \ldots, x_k$, and two
propositional formulas are logically equivalent if and only if
they express the same Boolean function. There are $2^{2^k}$ k-ary
functions for every k.
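The count follows from the truth table: a k-ary function has 2^k rows, and each row can independently hold 0 or 1. The short Java sketch below only illustrates this arithmetic; the class name BooleanFunctionCount and the upper limit on k are assumptions made for the example.

```java
import java.math.BigInteger;

/** Illustrative helper (hypothetical name): counts the k-ary Boolean functions. */
final class BooleanFunctionCount {

    /** A truth table with 2^k rows, each holding 0 or 1, gives 2^(2^k) functions. */
    static BigInteger count(int k) {
        if (k < 0 || k > 20) { // the limit only keeps the result small enough to handle
            throw new IllegalArgumentException("k out of range: " + k);
        }
        int rows = 1 << k;                      // 2^k truth-table rows
        return BigInteger.ONE.shiftLeft(rows);  // 2^rows
    }

    public static void main(String[] args) {
        for (int k = 0; k <= 4; k++) {
            System.out.println("k = " + k + ": " + count(k) + " functions");
        }
    }
}
```

For k = 0 this gives the two constants of B, and for k = 2 it gives the 16 familiar binary connectives.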
Boolean functions play a pivotal role in computer engineering.
Several representations of Boolean functions exist, such as
multivariate polynomials over GF(2), negation normal forms, and
propositional directed acyclic graphs (PDAG). Among these
representations, the Binary Decision Diagram is a more efficient
and simplified one.
The Binary Decision Diagram (BDD), a data structure introduced by
Lee in 1959, has gradually become popular in recent years. Its
fixed variable order and its elimination of duplicated nodes give
it a special position among the data structures that represent
Boolean functions.
The data structure of a binary decision tree and the corresponding
truth table are illustrated in Figure 1. As shown, the value of the
function for a given variable assignment can be determined by
following a path down the graph to a terminal node. In Figure 1,
the dotted lines represent edges to a low child and the solid
lines represent edges to a high child. Therefore, in order to find
f(x1=0, x2=1, x3=1), we begin at x1 and traverse down the dotted
line to x2 (since x1 is assigned 0), then down two solid lines
(since x2 and x3 are each assigned 1). This leads to the terminal
1, which is the value of f(x1=0, x2=1, x3=1).
Figure 1. An example illustrating the Binary decision tree and
the corresponding truth table of the Boolean function f (x1, x2,
x3) = ¬(x1x2x3)+ x1x2 + x2x3.
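The path-following evaluation described above can be sketched in Java. This is a minimal illustration and not code from either library: the class DecisionNode and the example function g(x1, x2, x3) = x1x2 + x2x3 are assumptions chosen for brevity, and g is not claimed to be the function of Figure 1.

```java
/** Minimal decision-graph node. Class and member names are illustrative assumptions. */
final class DecisionNode {
    static final DecisionNode ZERO = new DecisionNode(false);
    static final DecisionNode ONE = new DecisionNode(true);

    final int var;           // 1-based variable index (0 for terminals)
    final DecisionNode low;  // followed when the variable is 0 (dotted edge)
    final DecisionNode high; // followed when the variable is 1 (solid edge)
    final Boolean terminal;  // non-null only for the terminal nodes 0 and 1

    private DecisionNode(boolean value) {
        this.var = 0;
        this.low = null;
        this.high = null;
        this.terminal = value;
    }

    DecisionNode(int var, DecisionNode low, DecisionNode high) {
        this.var = var;
        this.low = low;
        this.high = high;
        this.terminal = null;
    }

    /** Follows one path to a terminal; assignment[i - 1] holds the value of x_i. */
    boolean evaluate(boolean[] assignment) {
        DecisionNode node = this;
        while (node.terminal == null) {
            node = assignment[node.var - 1] ? node.high : node.low;
        }
        return node.terminal;
    }

    /** Hypothetical example: g(x1, x2, x3) = x1x2 + x2x3 with order x1, x2, x3. */
    static DecisionNode example() {
        DecisionNode x3 = new DecisionNode(3, ZERO, ONE);
        DecisionNode x2ThenX3 = new DecisionNode(2, ZERO, x3); // reached when x1 = 0
        DecisionNode x2Only = new DecisionNode(2, ZERO, ONE);  // reached when x1 = 1
        return new DecisionNode(1, x2ThenX3, x2Only);
    }

    public static void main(String[] args) {
        boolean value = example().evaluate(new boolean[] {false, true, true});
        System.out.println("g(x1=0, x2=1, x3=1) = " + (value ? 1 : 0));
    }
}
```

For the assignment (x1=0, x2=1, x3=1) the loop takes one low edge and then two high edges, which mirrors the traversal described in the text.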
The binary decision tree in the above figure can be transformed
into a binary decision diagram by maximally reducing it according
to two reduction rules:
•   Merge any isomorphic subgraphs.
•   Eliminate any node whose two children are
isomorphic.
The resulting BDD is shown in Figure 2:
Figure 2. The BDD for the Boolean function f(x1, x2, x3) =
¬(x1x2x3)+ x1x2 + x2x3.
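One common way to apply both reduction rules while a diagram is being built is a node constructor backed by a unique table. The sketch below is an assumption about how such a constructor might look; the names TinyBdd and mk are hypothetical, and neither JavaBDD nor our library is claimed to be implemented this way.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Sketch of a node store that keeps diagrams reduced. All names are illustrative. */
final class TinyBdd {
    static final int ZERO = 0; // id of terminal 0
    static final int ONE = 1;  // id of terminal 1

    private final List<int[]> nodes = new ArrayList<>();        // id -> {var, low, high}
    private final Map<String, Integer> unique = new HashMap<>(); // "var,low,high" -> id

    TinyBdd() {
        nodes.add(new int[] {Integer.MAX_VALUE, ZERO, ZERO}); // terminal 0
        nodes.add(new int[] {Integer.MAX_VALUE, ONE, ONE});   // terminal 1
    }

    /** Returns the id of the node (var, low, high), applying both reduction rules. */
    int mk(int var, int low, int high) {
        if (low == high) {
            return low; // eliminate a node whose two children are the same
        }
        String key = var + "," + low + "," + high;
        Integer existing = unique.get(key);
        if (existing != null) {
            return existing; // merge with an identical, already stored subgraph
        }
        nodes.add(new int[] {var, low, high});
        int id = nodes.size() - 1;
        unique.put(key, id);
        return id;
    }

    int size() {
        return nodes.size();
    }

    public static void main(String[] args) {
        TinyBdd bdd = new TinyBdd();
        int x3 = bdd.mk(3, ZERO, ONE);
        int again = bdd.mk(3, ZERO, ONE);  // same id as x3, no new node is stored
        int redundant = bdd.mk(2, x3, x3); // collapses to x3
        System.out.println((x3 == again) + " " + (redundant == x3) + " " + bdd.size());
    }
}
```

Because every node goes through mk, two diagrams built for the same function under the same variable order end up with the same root id, which is what makes equivalence checks cheap.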
From the graphical illustration of the Boolean function f as a
Binary Decision Diagram, we can clearly see its structure: it is a
rooted, directed acyclic graph that consists of several decision
nodes (x1, x2, x3) and two terminal nodes, 0 and 1.
In Figure 2, each decision node points to a low child (on the
left) and a high child (on the right). A dotted line represents
an assignment of 0 and a solid line represents an assignment of
1.
This paper is organized as follows. In Section 2, we raise some
questions and propose several ideas for approaching the research.
Section 3 specifies the four major steps of the research process.
In Section 4, we study the concept of differential testing and
apply it to the two Binary Decision Diagram libraries; we also
write a Java program that automatically generates a JUnit test
file to conduct the differential testing between the two BDD
libraries. In Section 5, we compare the execution times of the two
BDD libraries. Section 6 summarizes the paper and draws
conclusions from the 6 histograms and 6 boxplots. Overall, the
JavaBDD library runs faster than our BDD library.
2.   Questions and Ideas of
Approaches
In the initial period of our research, we started by raising some
questions relevant to Binary Decision Diagrams, differential
testing, and Java performance evaluation:
•   How can differential testing be implemented on two
different BDD libraries?
•   What is the difference between differential testing and
other prevalent testing methods?
•   Which methods and theories would be helpful in a Java
performance evaluation of BDDs?
•   Which BDD library has the shorter running time in the
evaluation? Is there a significant difference between
their running times?
•   Which non-deterministic factors may affect the Java
performance evaluation of BDDs?
After raising the above questions, we collected some ideas for
approaching the research:
•   Since we are comparing two BDD libraries, it is
important to consider the differences in their
implementation and functionality.
•   Statistically rigorous methodology should be included
in our Java performance evaluation in order to avoid
misleading or even incorrect results.
•   Confidence intervals can provide theoretical support
for the Java performance evaluation.
•   Various system effects may influence BDD
performance, although they do not have a large impact
on the Java performance evaluation.
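To make the statistical ideas above concrete, the sketch below computes Welch's t statistic and an approximate 95% confidence interval for the difference of two mean execution times. It is an illustration only: our analysis was produced with Minitab Express, the class name TimingStats is hypothetical, the interval uses the large-sample normal quantile 1.96 instead of a t quantile, and the sample values in main are invented.

```java
/** Illustrative statistics helpers (hypothetical names) for two timing samples. */
final class TimingStats {

    static double mean(double[] xs) {
        double sum = 0.0;
        for (double x : xs) {
            sum += x;
        }
        return sum / xs.length;
    }

    /** Unbiased sample variance (divides by n - 1). */
    static double variance(double[] xs) {
        if (xs.length < 2) {
            throw new IllegalArgumentException("need at least two measurements");
        }
        double m = mean(xs);
        double sumSq = 0.0;
        for (double x : xs) {
            sumSq += (x - m) * (x - m);
        }
        return sumSq / (xs.length - 1);
    }

    /** Standard error of mean(a) - mean(b) without assuming equal variances. */
    static double standardError(double[] a, double[] b) {
        return Math.sqrt(variance(a) / a.length + variance(b) / b.length);
    }

    /** Welch's t statistic for two independent samples. */
    static double welchT(double[] a, double[] b) {
        return (mean(a) - mean(b)) / standardError(a, b);
    }

    /** Approximate 95% interval for mean(a) - mean(b), using the normal quantile 1.96. */
    static double[] diffInterval95(double[] a, double[] b) {
        double diff = mean(a) - mean(b);
        double half = 1.96 * standardError(a, b);
        return new double[] {diff - half, diff + half};
    }

    public static void main(String[] args) {
        // Invented numbers, used only to exercise the formulas.
        double[] libraryA = {1.0, 2.0, 3.0, 4.0, 5.0};
        double[] libraryB = {2.0, 3.0, 4.0, 5.0, 6.0};
        double[] ci = diffInterval95(libraryA, libraryB);
        System.out.println("t = " + welchT(libraryA, libraryB));
        System.out.println("95% CI for the difference: [" + ci[0] + ", " + ci[1] + "]");
    }
}
```

If the interval excludes zero, the difference between the two means can be regarded as statistically significant at roughly the 5% level; with small samples a t quantile should replace 1.96.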
In the next section, we specify and follow the four major steps
during the research, making progress gradually towards our
conclusion.
3.   Steps of the research
In the first step, we become familiar with the concepts, theory,
and background knowledge, and try to understand the implementation
and functionality of the two BDD libraries. In the second step, we
implement differential testing and write a program that
automatically generates random test cases for both BDD libraries.
In the third step, we run the generated JUnit test file multiple
times and measure the execution time of constructing BDD objects
with both libraries. In the final step, we compare the running
times from step 3, use Minitab Express to produce t-tests and
histograms of the performance of the two BDD libraries, and draw
conclusions from the evaluation.
4.   Differential Testing on Binary
Decision Diagram
Differential testing, a form of random testing, is an important
testing technique for large software systems. Because it works
indirectly, by comparing the behavior of two or more
implementations, differential testing can produce results quickly
and efficiently and can save considerable time and space.
To apply differential testing to the two Binary Decision Diagram
libraries, we follow the procedure below; the corresponding code
is shown in Figure 3:
a)   Construct a series of BDD objects from both libraries
and give them the same features (the same variable,
low node, and high node).
b)   Run the written program to automatically generate
random combinations of the existing BDDs with random
operations on them. Assign the newly constructed BDDs
to group a and group b.
c)   State the hypothesis that both libraries build BDDs
with the same data structure.
d)   Use a JUnit test and the helper method isSameBDD to
compare the features of the two groups of BDDs.
e)   The helper method isSameBDD calls the toString
method in each library and checks the results for equality.
f)   Confirm the hypothesis that the constructed BDDs
behave in the same way.
Figure 3. Part of the code from the JUnit test file, illustrating
how differential testing is applied in the BDD Java
performance evaluation.
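The procedure in steps a) to f) can be sketched in self-contained Java. The sketch is not the code of Figure 3 and uses neither JavaBDD nor our library: two toy implementations stand in for them (a packed truth table and a small reduced ordered BDD), and all names are hypothetical. It also compares truth tables where our isSameBDD helper compares toString output.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Random;
import java.util.function.IntBinaryOperator;

/** Differential-testing sketch: two toy implementations stand in for the two libraries. */
final class DifferentialSketch {
    static final int VARS = 3;         // variables with indices 0..2
    static final int ROWS = 1 << VARS; // truth-table rows
    static final int FULL = (1 << ROWS) - 1;

    // Implementation A: a function is its truth table packed into an int.
    static int tableVar(int i) {
        int mask = 0;
        for (int row = 0; row < ROWS; row++) {
            if (((row >> i) & 1) == 1) mask |= 1 << row;
        }
        return mask;
    }

    // Implementation B: reduced ordered BDD, id -> {var, low, high}; ids 0 and 1 are terminals.
    static final List<int[]> nodes = new ArrayList<>();
    static final Map<String, Integer> unique = new HashMap<>();
    static {
        nodes.add(new int[] {VARS, 0, 0});
        nodes.add(new int[] {VARS, 1, 1});
    }

    static int mk(int var, int low, int high) {
        if (low == high) return low;
        String key = var + "," + low + "," + high;
        Integer id = unique.get(key);
        if (id == null) {
            nodes.add(new int[] {var, low, high});
            id = nodes.size() - 1;
            unique.put(key, id);
        }
        return id;
    }

    /** Combines two BDDs with a binary operator defined on the terminals 0 and 1. */
    static int apply(IntBinaryOperator op, int a, int b) {
        if (a <= 1 && b <= 1) return op.applyAsInt(a, b);
        int va = nodes.get(a)[0], vb = nodes.get(b)[0];
        int v = Math.min(va, vb); // split on the variable that comes first in the order
        int aLow = va == v ? nodes.get(a)[1] : a, aHigh = va == v ? nodes.get(a)[2] : a;
        int bLow = vb == v ? nodes.get(b)[1] : b, bHigh = vb == v ? nodes.get(b)[2] : b;
        return mk(v, apply(op, aLow, bLow), apply(op, aHigh, bHigh));
    }

    /** Converts a BDD to the packed truth table used by implementation A. */
    static int bddToTable(int root) {
        int mask = 0;
        for (int row = 0; row < ROWS; row++) {
            int node = root;
            while (node > 1) {
                int[] n = nodes.get(node);
                node = ((row >> n[0]) & 1) == 1 ? n[2] : n[1];
            }
            mask |= node << row;
        }
        return mask;
    }

    /** Applies the same random operations to both groups and counts disagreements. */
    static int run(long seed, int steps) {
        Random rnd = new Random(seed);
        List<Integer> groupA = new ArrayList<>(), groupB = new ArrayList<>();
        for (int i = 0; i < VARS; i++) {
            groupA.add(tableVar(i));
            groupB.add(mk(i, 0, 1));
        }
        int mismatches = 0;
        for (int s = 0; s < steps; s++) {
            int i = rnd.nextInt(groupA.size()), j = rnd.nextInt(groupA.size());
            int a, b;
            switch (rnd.nextInt(3)) {
                case 0: // AND
                    a = groupA.get(i) & groupA.get(j);
                    b = apply((x, y) -> x & y, groupB.get(i), groupB.get(j));
                    break;
                case 1: // OR
                    a = groupA.get(i) | groupA.get(j);
                    b = apply((x, y) -> x | y, groupB.get(i), groupB.get(j));
                    break;
                default: // NOT, expressed in B as XOR with the terminal 1
                    a = ~groupA.get(i) & FULL;
                    b = apply((x, y) -> x ^ y, groupB.get(i), 1);
                    break;
            }
            groupA.add(a);
            groupB.add(b);
            if (a != bddToTable(b)) mismatches++;
        }
        return mismatches;
    }

    public static void main(String[] args) {
        System.out.println("mismatches: " + run(42L, 200));
    }
}
```

In a JUnit test, the mismatch counter would typically be replaced by an equality assertion on each pair, so that the first disagreement between the two implementations fails the test and points to the operation that caused it.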
5.   Execution Time Comparison of two
BDD libraries
To compare the Java performance of the two Binary Decision
Diagram libraries, we use Minitab Express to generate 6 boxplots
and 6 histograms. Figure 4 shows the 6 boxplots and Figure 5
shows the 6 histograms.
Figure 4, panels (a) to (f). Boxplots of the execution time of the
two BDD libraries when constructing different numbers of BDDs.
In Figure 4, the boxplots illustrate the distribution of the
running times of the two BDD libraries. When the number of BDDs
to be constructed is one, the two boxplots are almost on the same
vertical level (they overlap), which shows that the running times
are very close to each other. However, as the number of BDDs
constructed increases, the boxplots no longer overlap and show a
gradually increasing difference between the running times. The
gap grows as the size grows.
Figure 5, panels (a) to (f). Histograms of the execution time of
the two BDD libraries when constructing different numbers of BDDs.
Figure 5 shows the 6 histograms illustrating the distribution of
the execution times of both BDD libraries. At size 1, the running
times of both libraries are very close, and the mode of the
execution time lies near 2 milliseconds. As the number of BDDs
constructed grows, the difference grows as well, and it can be
seen more clearly in panels (c), (d), (e) and (f).
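A measurement loop of the kind that yields such timing samples might look like the sketch below. It is an assumption and not the harness used for the figures: the class ConstructionTimer is hypothetical, the workload is a placeholder for constructing the given number of BDD objects, and the warm-up runs are a common JVM benchmarking precaution that the procedure above does not mention. Only the sizes 1, 10, 20, 30, 40 and 50 are taken from our experiment.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.IntConsumer;

/** Illustrative timing loop (hypothetical names); not the harness used for the figures. */
final class ConstructionTimer {

    /** Runs the workload for one size and returns the time of each run in milliseconds. */
    static double[] measureMillis(IntConsumer workload, int size, int warmups, int runs) {
        for (int i = 0; i < warmups; i++) {
            workload.accept(size); // give the JIT compiler a chance to compile the code
        }
        double[] millis = new double[runs];
        for (int i = 0; i < runs; i++) {
            long start = System.nanoTime();
            workload.accept(size);
            millis[i] = (System.nanoTime() - start) / 1_000_000.0;
        }
        return millis;
    }

    static double median(double[] xs) {
        double[] sorted = xs.clone();
        Arrays.sort(sorted);
        int mid = sorted.length / 2;
        return sorted.length % 2 == 1 ? sorted[mid] : (sorted[mid - 1] + sorted[mid]) / 2.0;
    }

    public static void main(String[] args) {
        // Placeholder standing in for "construct `size` BDD objects with one library".
        IntConsumer placeholder = size -> {
            List<int[]> built = new ArrayList<>();
            for (int i = 0; i < size; i++) {
                built.add(new int[] {i, 0, 1});
            }
        };
        for (int size : new int[] {1, 10, 20, 30, 40, 50}) {
            double[] samples = measureMillis(placeholder, size, 5, 30);
            System.out.println("size " + size + ": median " + median(samples) + " ms");
        }
    }
}
```

Collecting one array of samples per library and per size gives exactly the kind of data that boxplots and histograms summarize.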
6.   Summary
As Section 5 shows, there is a difference in running speed between
the two BDD libraries in the Java performance evaluation. As the
number of BDDs constructed increases (sizes 1, 10, 20, 30, 40,
50), the histograms show that the difference in running time also
increases.
The 6 boxplots illustrate the distribution of the running times
of the two libraries. When the size is 1, the boxplots overlap.
However, as the number of BDDs constructed increases, the boxplots
no longer overlap, and the gap between them grows as the size
grows.
Overall, the JavaBDD library runs faster than our BDD library.
The t-test statistics, the 6 histograms, and the 6 boxplots
generated with Minitab Express support our hypothesis.
Several sources of non-determinism could affect the Java
performance evaluation of the Binary Decision Diagram libraries,
including the garbage collector and background noise in the Java
runtime. However, they would not make a big difference to the
evaluation.
We believe this paper is a step towards further study of the
current Java BDD implementations and of how to improve them and
make them more efficient, with shorter execution times. Moreover,
we would like to inspect our BDD library more deeply to make sure
that no bugs or unexpected exceptions occur. Its stability and
reliability deserve further study in future work.
Acknowledgment
We would like to thank Professor Christian Kästner, who offered
enthusiastic instruction and help with this research. His detailed
suggestions greatly helped us throughout this summer research
program, including conducting the research, making the academic
poster, and writing this research paper. The weekly reading group
also helped introduce me to the world of software research. By
reading and discussing a specifically chosen paper each week, I
learned some important skills associated with research papers: how
to critically reflect on a scientific work; how to practice
reading and argumentation strategies; how to be exposed to a broad
range of research topics; and how to lead a reading group
discussion and guide teammates through it. The weekly Monday
meetings for all REU students were also thought-provoking.
Moreover, I would like to thank all the staff and faculty in the
REU program of the Institute for Software Research at Carnegie
Mellon University. By generously offering me this precious
opportunity to participate in the summer research program, you
gave me a great experience during my undergraduate education. The
REU summer program provided me with a platform to work closely
with professors and PhD students at Carnegie Mellon University, as
well as undergraduate colleagues from other universities and
colleges. It will always be a valuable experience of my
undergraduate years.
References
[1]   Andersen, Henrik Reif. "An introduction to binary decision
diagrams." Lecture notes, available online, IT University
of Copenhagen (1997).
[2]   “Binary Decision Diagram.” Wikipedia. Wikimedia
Foundation, n.d. Web. 24 July 2016.
[3]   "Boolean Function." Wikipedia. Accessed July 27, 2016.
https://en.wikipedia.org/wiki/Boolean_function.
[4]   Clarke, Edmund M., Masahiro Fujita, and Xudong Zhao.
"Multi-Terminal Binary Decision Diagrams and Hybrid
Decision Diagrams." Representations of Discrete
Functions, 1996, 93-108. doi:10.1007/978-1-4613-1385-
4_4.
[5]   Drechsler, Rolf, and Bernd Becker. Binary decision
diagrams: theory and implementation. Springer Science &
Business Media, 2013.
[6]   Georges, Andy, Dries Buytaert, and Lieven Eeckhout.
“Statistically Rigorous Java Performance Evaluation.”
ACM SIGPLAN Notices 42, no. 10 (October 21, 2007): 57.
doi:10.1145/1297105.1297033.
[7]   Vliet, Hans Van. Software Engineering: Principles and
Practice. Chichester: John Wiley, 2000.
[8]   Wegener, Ingo. "Branching Programs and Binary Decision
Diagrams." 2000. doi:10.1137/1.9780898719789.
[9]   McKeeman, William M. “Differential Testing for Software.”
Digital Technical Journal 10, no. 1 (1998).

More Related Content

What's hot

An Interval Type-2 Fuzzy Approach for Process Plan Selection
	An Interval Type-2 Fuzzy Approach for Process Plan Selection	An Interval Type-2 Fuzzy Approach for Process Plan Selection
An Interval Type-2 Fuzzy Approach for Process Plan Selectioninventionjournals
 
Customer Clustering For Retail Marketing
Customer Clustering For Retail MarketingCustomer Clustering For Retail Marketing
Customer Clustering For Retail MarketingJonathan Sedar
 
Paper-Allstate-Claim-Severity
Paper-Allstate-Claim-SeverityPaper-Allstate-Claim-Severity
Paper-Allstate-Claim-SeverityGon-soo Moon
 
Supplementary material for my following paper: Infinite Latent Process Decomp...
Supplementary material for my following paper: Infinite Latent Process Decomp...Supplementary material for my following paper: Infinite Latent Process Decomp...
Supplementary material for my following paper: Infinite Latent Process Decomp...Tomonari Masada
 
Multimodal Biometrics Recognition by Dimensionality Diminution Method
Multimodal Biometrics Recognition by Dimensionality Diminution MethodMultimodal Biometrics Recognition by Dimensionality Diminution Method
Multimodal Biometrics Recognition by Dimensionality Diminution MethodIJERA Editor
 
SOURCE CODE RETRIEVAL USING SEQUENCE BASED SIMILARITY
SOURCE CODE RETRIEVAL USING SEQUENCE BASED SIMILARITYSOURCE CODE RETRIEVAL USING SEQUENCE BASED SIMILARITY
SOURCE CODE RETRIEVAL USING SEQUENCE BASED SIMILARITYIJDKP
 
Finding Relationships between the Our-NIR Cluster Results
Finding Relationships between the Our-NIR Cluster ResultsFinding Relationships between the Our-NIR Cluster Results
Finding Relationships between the Our-NIR Cluster ResultsCSCJournals
 

What's hot (8)

Master thesis
Master thesisMaster thesis
Master thesis
 
An Interval Type-2 Fuzzy Approach for Process Plan Selection
	An Interval Type-2 Fuzzy Approach for Process Plan Selection	An Interval Type-2 Fuzzy Approach for Process Plan Selection
An Interval Type-2 Fuzzy Approach for Process Plan Selection
 
Customer Clustering For Retail Marketing
Customer Clustering For Retail MarketingCustomer Clustering For Retail Marketing
Customer Clustering For Retail Marketing
 
Paper-Allstate-Claim-Severity
Paper-Allstate-Claim-SeverityPaper-Allstate-Claim-Severity
Paper-Allstate-Claim-Severity
 
Supplementary material for my following paper: Infinite Latent Process Decomp...
Supplementary material for my following paper: Infinite Latent Process Decomp...Supplementary material for my following paper: Infinite Latent Process Decomp...
Supplementary material for my following paper: Infinite Latent Process Decomp...
 
Multimodal Biometrics Recognition by Dimensionality Diminution Method
Multimodal Biometrics Recognition by Dimensionality Diminution MethodMultimodal Biometrics Recognition by Dimensionality Diminution Method
Multimodal Biometrics Recognition by Dimensionality Diminution Method
 
SOURCE CODE RETRIEVAL USING SEQUENCE BASED SIMILARITY
SOURCE CODE RETRIEVAL USING SEQUENCE BASED SIMILARITYSOURCE CODE RETRIEVAL USING SEQUENCE BASED SIMILARITY
SOURCE CODE RETRIEVAL USING SEQUENCE BASED SIMILARITY
 
Finding Relationships between the Our-NIR Cluster Results
Finding Relationships between the Our-NIR Cluster ResultsFinding Relationships between the Our-NIR Cluster Results
Finding Relationships between the Our-NIR Cluster Results
 

Similar to Xia Xiao-research paper

AN EFFICIENT CODEBOOK INITIALIZATION APPROACH FOR LBG ALGORITHM
AN EFFICIENT CODEBOOK INITIALIZATION APPROACH FOR LBG ALGORITHMAN EFFICIENT CODEBOOK INITIALIZATION APPROACH FOR LBG ALGORITHM
AN EFFICIENT CODEBOOK INITIALIZATION APPROACH FOR LBG ALGORITHMIJCSEA Journal
 
BARRACUDA, AN OPEN SOURCE FRAMEWORK FOR PARALLELIZING DIVIDE AND CONQUER ALGO...
BARRACUDA, AN OPEN SOURCE FRAMEWORK FOR PARALLELIZING DIVIDE AND CONQUER ALGO...BARRACUDA, AN OPEN SOURCE FRAMEWORK FOR PARALLELIZING DIVIDE AND CONQUER ALGO...
BARRACUDA, AN OPEN SOURCE FRAMEWORK FOR PARALLELIZING DIVIDE AND CONQUER ALGO...IJCI JOURNAL
 
Deploying the producer consumer problem using homogeneous modalities
Deploying the producer consumer problem using homogeneous modalitiesDeploying the producer consumer problem using homogeneous modalities
Deploying the producer consumer problem using homogeneous modalitiesFredrick Ishengoma
 
Technical_Report_on_ML_Library
Technical_Report_on_ML_LibraryTechnical_Report_on_ML_Library
Technical_Report_on_ML_LibrarySaurabh Chauhan
 
Advanced Data Structures 2006
Advanced Data Structures 2006Advanced Data Structures 2006
Advanced Data Structures 2006Sanjay Goel
 
1 Project 2 Introduction - the SeaPort Project seri.docx
1  Project 2 Introduction - the SeaPort Project seri.docx1  Project 2 Introduction - the SeaPort Project seri.docx
1 Project 2 Introduction - the SeaPort Project seri.docxhoney725342
 
Free ebooks download ! Edhole
Free ebooks download ! EdholeFree ebooks download ! Edhole
Free ebooks download ! EdholeEdhole.com
 
Free ebooks download ! Edhole
Free ebooks download ! EdholeFree ebooks download ! Edhole
Free ebooks download ! EdholeEdhole.com
 
ME/R model: A New approach of Data Warehouse Schema Design
ME/R model: A New approach of Data Warehouse Schema DesignME/R model: A New approach of Data Warehouse Schema Design
ME/R model: A New approach of Data Warehouse Schema Designidescitation
 
HadoopDB a major step towards a dead end
HadoopDB a major step towards a dead endHadoopDB a major step towards a dead end
HadoopDB a major step towards a dead endthkoch
 
Software Design
Software DesignSoftware Design
Software DesignHa Ninh
 
Codemotion 2015 Infinispan Tech lab
Codemotion 2015 Infinispan Tech labCodemotion 2015 Infinispan Tech lab
Codemotion 2015 Infinispan Tech labUgo Landini
 
Knowledge Discovery Query Language (KDQL)
Knowledge Discovery Query Language (KDQL)Knowledge Discovery Query Language (KDQL)
Knowledge Discovery Query Language (KDQL)Zakaria Zubi
 
MIXED 0−1 GOAL PROGRAMMING APPROACH TO INTERVAL-VALUED BILEVEL PROGRAMMING PR...
MIXED 0−1 GOAL PROGRAMMING APPROACH TO INTERVAL-VALUED BILEVEL PROGRAMMING PR...MIXED 0−1 GOAL PROGRAMMING APPROACH TO INTERVAL-VALUED BILEVEL PROGRAMMING PR...
MIXED 0−1 GOAL PROGRAMMING APPROACH TO INTERVAL-VALUED BILEVEL PROGRAMMING PR...cscpconf
 
DIRECTIONS READ THE FOLLOWING STUDENT POST AND RESPOND EVALUATE I.docx
DIRECTIONS READ THE FOLLOWING STUDENT POST AND RESPOND EVALUATE I.docxDIRECTIONS READ THE FOLLOWING STUDENT POST AND RESPOND EVALUATE I.docx
DIRECTIONS READ THE FOLLOWING STUDENT POST AND RESPOND EVALUATE I.docxlynettearnold46882
 
ON AN OPTIMIZATION TECHNIQUE USING BINARY DECISION DIAGRAM
ON AN OPTIMIZATION TECHNIQUE USING BINARY DECISION DIAGRAMON AN OPTIMIZATION TECHNIQUE USING BINARY DECISION DIAGRAM
ON AN OPTIMIZATION TECHNIQUE USING BINARY DECISION DIAGRAMIJCSEA Journal
 
GEN: A Database Interface Generator for HPC Programs
GEN: A Database Interface Generator for HPC ProgramsGEN: A Database Interface Generator for HPC Programs
GEN: A Database Interface Generator for HPC ProgramsTanu Malik
 
object oriented programming part inheritance.pptx
object oriented programming part inheritance.pptxobject oriented programming part inheritance.pptx
object oriented programming part inheritance.pptxurvashipundir04
 

Similar to Xia Xiao-research paper (20)

AN EFFICIENT CODEBOOK INITIALIZATION APPROACH FOR LBG ALGORITHM
AN EFFICIENT CODEBOOK INITIALIZATION APPROACH FOR LBG ALGORITHMAN EFFICIENT CODEBOOK INITIALIZATION APPROACH FOR LBG ALGORITHM
AN EFFICIENT CODEBOOK INITIALIZATION APPROACH FOR LBG ALGORITHM
 
BARRACUDA, AN OPEN SOURCE FRAMEWORK FOR PARALLELIZING DIVIDE AND CONQUER ALGO...
BARRACUDA, AN OPEN SOURCE FRAMEWORK FOR PARALLELIZING DIVIDE AND CONQUER ALGO...BARRACUDA, AN OPEN SOURCE FRAMEWORK FOR PARALLELIZING DIVIDE AND CONQUER ALGO...
BARRACUDA, AN OPEN SOURCE FRAMEWORK FOR PARALLELIZING DIVIDE AND CONQUER ALGO...
 
Deploying the producer consumer problem using homogeneous modalities
Deploying the producer consumer problem using homogeneous modalitiesDeploying the producer consumer problem using homogeneous modalities
Deploying the producer consumer problem using homogeneous modalities
 
Technical_Report_on_ML_Library
Technical_Report_on_ML_LibraryTechnical_Report_on_ML_Library
Technical_Report_on_ML_Library
 
Advanced Data Structures 2006
Advanced Data Structures 2006Advanced Data Structures 2006
Advanced Data Structures 2006
 
1 Project 2 Introduction - the SeaPort Project seri.docx
1  Project 2 Introduction - the SeaPort Project seri.docx1  Project 2 Introduction - the SeaPort Project seri.docx
1 Project 2 Introduction - the SeaPort Project seri.docx
 
Free ebooks download ! Edhole
Free ebooks download ! EdholeFree ebooks download ! Edhole
Free ebooks download ! Edhole
 
Free ebooks download ! Edhole
Free ebooks download ! EdholeFree ebooks download ! Edhole
Free ebooks download ! Edhole
 
ME/R model: A New approach of Data Warehouse Schema Design
ME/R model: A New approach of Data Warehouse Schema DesignME/R model: A New approach of Data Warehouse Schema Design
ME/R model: A New approach of Data Warehouse Schema Design
 
HadoopDB a major step towards a dead end
HadoopDB a major step towards a dead endHadoopDB a major step towards a dead end
HadoopDB a major step towards a dead end
 
Software Design
Software DesignSoftware Design
Software Design
 
Codemotion 2015 Infinispan Tech lab
Codemotion 2015 Infinispan Tech labCodemotion 2015 Infinispan Tech lab
Codemotion 2015 Infinispan Tech lab
 
LOD2: State of Play WP2 - Storing and Querying Very Large Knowledge Bases
LOD2: State of Play WP2 - Storing and Querying Very Large Knowledge BasesLOD2: State of Play WP2 - Storing and Querying Very Large Knowledge Bases
LOD2: State of Play WP2 - Storing and Querying Very Large Knowledge Bases
 
Knowledge Discovery Query Language (KDQL)
Knowledge Discovery Query Language (KDQL)Knowledge Discovery Query Language (KDQL)
Knowledge Discovery Query Language (KDQL)
 
Mdb dn 2016_06_query_primer
Mdb dn 2016_06_query_primerMdb dn 2016_06_query_primer
Mdb dn 2016_06_query_primer
 
MIXED 0−1 GOAL PROGRAMMING APPROACH TO INTERVAL-VALUED BILEVEL PROGRAMMING PR...
MIXED 0−1 GOAL PROGRAMMING APPROACH TO INTERVAL-VALUED BILEVEL PROGRAMMING PR...MIXED 0−1 GOAL PROGRAMMING APPROACH TO INTERVAL-VALUED BILEVEL PROGRAMMING PR...
MIXED 0−1 GOAL PROGRAMMING APPROACH TO INTERVAL-VALUED BILEVEL PROGRAMMING PR...
 
DIRECTIONS READ THE FOLLOWING STUDENT POST AND RESPOND EVALUATE I.docx
DIRECTIONS READ THE FOLLOWING STUDENT POST AND RESPOND EVALUATE I.docxDIRECTIONS READ THE FOLLOWING STUDENT POST AND RESPOND EVALUATE I.docx
DIRECTIONS READ THE FOLLOWING STUDENT POST AND RESPOND EVALUATE I.docx
 
ON AN OPTIMIZATION TECHNIQUE USING BINARY DECISION DIAGRAM
ON AN OPTIMIZATION TECHNIQUE USING BINARY DECISION DIAGRAMON AN OPTIMIZATION TECHNIQUE USING BINARY DECISION DIAGRAM
ON AN OPTIMIZATION TECHNIQUE USING BINARY DECISION DIAGRAM
 
GEN: A Database Interface Generator for HPC Programs
GEN: A Database Interface Generator for HPC ProgramsGEN: A Database Interface Generator for HPC Programs
GEN: A Database Interface Generator for HPC Programs
 
object oriented programming part inheritance.pptx
object oriented programming part inheritance.pptxobject oriented programming part inheritance.pptx
object oriented programming part inheritance.pptx
 

Xia Xiao-research paper

  • 1. Differential Testing, Java Performance Evaluation and Execution Time Comparison of two Binary Decision Diagram Libraries Xia Xiao Institute for Software Research, Carnegie Mellon University, Pittsburgh, PA New York University, New York City, NY xx681@nyu.edu Abstract As one of the data structures to represent Boolean functions in Computer Science, Binary Decision Diagram(BDD) or branching program is a compressed and abstract representation of sets and relations. Comparing with other data structures like Negation normal form(NNF) and Propositional directed acyclic graph(PDAG), Binary Decision Diagram has its unique features that are advantageous: In most cases, the term BDD commonly refers to Reduced Binary Decision Diagram(ROBDD). The nodes with variables in the BDD are connected with an order (either alphabetical or numerical), and therefore it saves the trouble traversing through the graph to look for a particular node within the structure; meanwhile, the structure of BDD is usually reduced, which means it eliminates the redundancies of duplicated nodes existing within the same structure. Most important of all, ROBDD has the feature that its representation of a particular function and variable order is canonical – this advantage it’s useful and make it simple for checking the functional equivalency of multiple different Boolean functions and operations like technology mapping. Based on the uniqueness of Binary Decision Diagram, Boolean function f can be transformed into a reduced and ordered version of BDD. It simplified the work of building truth table and comparing the truth values with different combination of operations with multiple variables. It is worthwhile for us to conduct a research on the Java Performance Evaluation on BDD libraries. There already exists some implementation of BDD libraries. 
We are particularly interested in comparing the performance of the JavaBDD library (available at http://javabdd.sourceforge.net/index.html) and our own implementation of Binary Decision Diagram library. By implementing Differential Testing on both BDD libraries, we check the functionally equivalency of operations to make sure our BDD library follows the correct structure and no bugs existing in the execution. In the differential testing file, we measure the execution time of constructing different sizes of BDD objects on both libraries and collected data from multiple runs of the JUnit test file. Then we use Minitab Express to generate 6 histograms and 6 boxplots of the execution time of two BDD libraries in order to compare their running speed. With theoretical support of statistic, we use t-test and confidence interval to make sure the difference in their Java Performance is statistically significant. This paper shows that the JavaBDD library performs a faster execution time when constructing BDD objects (and its advantage gets more obvious when the number of BDD increase). And this conclusion is illustrated by using differential testing and statistically rigorous methodologies. In addition, we advocate taking advantage of the structure of the existing JavaBDD library and apply those advantages into our own development of the BDD library implementations. We also would like to encourage other researchers to put effort into improving and consummating Java BDD implementation in future work, since it would benefit future research on Data Structure and Boolean Function studies. Keywords Binary Decision Diagram, Java, Differential Testing, Java Performance Evaluation, Statistics, Methodology 1.   Introduction Boolean function has been an important concept exiting in both mathematics and logic for many years. It describes how to determine a Boolean value output based on some logical calculation from some Boolean inputs. 
Many applications derive from this theory. Generally, a Boolean function has the following form:

$f: B^k \to B$

In the formula above, $B = \{0, 1\}$ is called the Boolean domain and $k$ is a non-negative integer called the arity of the function. In the case where $k = 0$, the "function" is essentially a constant element of $B$. Every $k$-ary Boolean function can be expressed as a propositional formula in $k$ variables $x_1, \dots, x_k$, and two propositional formulas are logically equivalent if and only if they express the same Boolean function. There are $2^{2^k}$ $k$-ary functions for every $k$.

Boolean functions play a pivotal role in computer engineering. Several propositional representations of Boolean functions exist, such as multivariate polynomials over GF(2), negation normal forms (NNF), and propositional directed acyclic graphs (PDAG). Among those representations, the Binary Decision Diagram is a more efficient and compact one.

The Binary Decision Diagram (BDD), a data structure introduced by Lee in 1959, has gradually grown popular in recent years. Its fixed variable order and its elimination of duplicated nodes give it a special position among the data structures that represent Boolean functions.

A binary decision tree and the corresponding truth table are illustrated in Figure 1. As shown, the value of the function for a given variable assignment can be determined by following a path down the graph to a terminal node. In Figure 1, dotted lines represent edges to a low child, while solid lines represent edges to a high child. Therefore, to find $f(x_1 = 0, x_2 = 1, x_3 = 1)$, we begin at $x_1$, traverse down the dotted line to $x_2$ (since $x_1$ is assigned 0), and then follow two solid lines (since $x_2$ and $x_3$ are each assigned 1). This leads to the terminal 1, which is the value of $f(x_1 = 0, x_2 = 1, x_3 = 1)$.

Figure 1. An example illustrating the binary decision tree and the corresponding truth table of the Boolean function $f(x_1, x_2, x_3) = \neg x_1 \neg x_2 \neg x_3 + x_1 x_2 + x_2 x_3$.

The binary decision tree in the above figure can be transformed into a binary decision diagram by maximally reducing it according to the two reduction rules:
•   Merge any isomorphic subgraphs.
•   Eliminate any node whose two children are isomorphic.
The resulting BDD is shown in Figure 2.

Figure 2. The BDD for the Boolean function $f(x_1, x_2, x_3) = \neg x_1 \neg x_2 \neg x_3 + x_1 x_2 + x_2 x_3$.

The illustration of $f$ as a Binary Decision Diagram shows its structure clearly: it is a rooted, directed graph that consists of several decision nodes ($x_1$, $x_2$, $x_3$) and two terminal nodes, 0 and 1. In Figure 2, each decision node points to a low child (on the left) and a high child (on the right). A dotted line represents an assignment of 0 and a solid line represents an assignment of 1.

This paper is organized as follows. In Section 2, we raise some questions and propose several approaches to the research. Section 3 specifies the four major steps of the research process. In Section 4, we study the concept of differential testing and apply it to the two Binary Decision Diagram libraries; we also write a Java program that automatically generates the JUnit test file used to conduct the differential testing between the two BDD libraries. In Section 5, we compare the execution time of the two BDD libraries. Section 6 summarizes the paper and draws conclusions from the 6 histograms and 6 boxplots. Overall, the JavaBDD library runs faster than our BDD library.
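The two reduction rules and the path-following evaluation described in the Introduction can be sketched in a few lines of Java. This is only a minimal illustration under stated assumptions: the class TinyBdd and its methods mk, build, and eval are hypothetical names, not the JavaBDD API and not the library evaluated in this paper. The example function is assumed to be (¬x1 ∧ ¬x2 ∧ ¬x3) ∨ (x1 ∧ x2) ∨ (x2 ∧ x3), and the tree is built by brute-force expansion, which is exponential in the number of variables.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Predicate;

/** Minimal reduced ordered BDD sketch; all names here are hypothetical. */
final class TinyBdd {

    /** A decision node (var >= 0) or one of the two terminals (var == -1). */
    static final class Node {
        final int id;    // unique id, used in the unique-table key
        final int var;   // index of the tested variable, -1 for a terminal
        final Node low;  // child followed when the variable is 0
        final Node high; // child followed when the variable is 1

        Node(int id, int var, Node low, Node high) {
            this.id = id;
            this.var = var;
            this.low = low;
            this.high = high;
        }
    }

    final Node zero = new Node(0, -1, null, null);
    final Node one = new Node(1, -1, null, null);

    // Unique table: at most one node per (var, low, high) triple.
    private final Map<List<Integer>, Node> unique = new HashMap<>();
    private int nextId = 2;

    /** Returns the canonical node for (var, low, high), applying both reduction rules. */
    Node mk(int var, Node low, Node high) {
        if (low == high) {
            return low; // eliminate a node whose two children coincide
        }
        List<Integer> key = Arrays.asList(var, low.id, high.id);
        // merge isomorphic subgraphs: reuse an existing node if one exists
        return unique.computeIfAbsent(key, k -> new Node(nextId++, var, low, high));
    }

    /** Builds the BDD of f by expanding the full decision tree bottom-up. */
    Node build(Predicate<boolean[]> f, int numVars) {
        return build(f, new boolean[numVars], 0);
    }

    private Node build(Predicate<boolean[]> f, boolean[] assignment, int level) {
        if (level == assignment.length) {
            return f.test(assignment) ? one : zero;
        }
        assignment[level] = false;
        Node low = build(f, assignment, level + 1);
        assignment[level] = true;
        Node high = build(f, assignment, level + 1);
        return mk(level, low, high);
    }

    /** Follows a single path from the root down to a terminal node. */
    boolean eval(Node root, boolean[] assignment) {
        Node n = root;
        while (n.var >= 0) {
            n = assignment[n.var] ? n.high : n.low;
        }
        return n == one;
    }

    /** Number of decision nodes stored in the unique table. */
    int decisionNodeCount() {
        return unique.size();
    }

    public static void main(String[] args) {
        TinyBdd bdd = new TinyBdd();
        // Assumed example function; x[0], x[1], x[2] stand for x1, x2, x3.
        Predicate<boolean[]> f =
            x -> (!x[0] && !x[1] && !x[2]) || (x[0] && x[1]) || (x[1] && x[2]);
        Node root = bdd.build(f, 3);
        System.out.println("f(0,1,1) = " + bdd.eval(root, new boolean[] {false, true, true}));
        System.out.println("decision nodes = " + bdd.decisionNodeCount());
    }
}
```

For the assumed function, the sketch keeps five decision nodes where the full decision tree has seven, and because every node is looked up in the unique table before it is created, two equivalent functions built with the same TinyBdd instance end up as the same root object.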
2.   Questions and Ideas of Approaches

In the initial period of our research, we start by raising the following questions related to Binary Decision Diagrams, differential testing, and Java performance evaluation:
•   How can differential testing be implemented on two different BDD libraries?
•   What is the difference between differential testing and other prevalent testing methods?
•   Which methods or theories would be helpful during the Java performance evaluation of BDDs?
•   Which BDD library has the shorter running time during the evaluation? Is there a significant difference between their running times?
•   Which non-deterministic factors may affect the BDD Java performance evaluation?

After raising these questions, we collect the following ideas for approaching the research:
•   Since we are comparing two BDD libraries, it is important to consider the differences in their implementation and functionality.
•   Statistically rigorous methodology should be included in our Java performance evaluation in order to avoid misleading or even incorrect results.
•   Confidence intervals can provide theoretical support for the Java performance evaluation.
•   Various system effects may influence BDD performance, but they do not have a large impact on the Java performance evaluation.

In the next section, we specify the four major steps of the research, which lead gradually towards our conclusion.

3.   Steps of the research

In the first step, we become familiar with the concepts, theory, and background knowledge, and try to understand the implementation and functionality of the two BDD libraries. In the second step, we implement differential testing and write a program that automatically generates random test cases for both BDD libraries. In the third step, we measure the execution time of constructing BDD objects of different sizes with both libraries over multiple runs of the JUnit test file.
In the final step, we compare the running times from step 3, use Minitab Express to produce t-tests and histograms of the performance of the two BDD libraries, and draw conclusions from the evaluation.

4.   Differential Testing on Binary Decision Diagram

Differential testing, a form of random testing, is an important testing technique for large software systems. Because it works indirectly, by comparing the behavior of comparable implementations on the same inputs, differential testing can deliver results quickly and efficiently and save considerable time and space.

To apply differential testing to the two Binary Decision Diagram libraries, we follow the procedure below; the corresponding code is shown in Figure 3:
a)   Construct a series of BDD objects from both libraries and give them the same features (the same variable, low node, and high node).
b)   Run the generator program to automatically produce random combinations of the existing BDDs with random operations on them. Assign the newly constructed BDDs to group a and group b.
c)   Hypothesize that BDDs with the same data structure are built in both groups.
d)   Use the JUnit test and the helper method isSameBDD to compare the features of the two groups of BDDs.
e)   The helper method isSameBDD calls the toString method of each library to check their equality.
f)   Check the hypothesis that the constructed BDDs behave in the same way.

Figure 3. Part of the code from the JUnit test file, illustrating how differential testing is applied in the BDD Java performance evaluation.
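Steps a) to f) can be sketched as a small self-contained Java program. This is not the code of Figure 3: the two classes MaskFn and ArrayFn are hypothetical stand-ins for the two BDD libraries (each simply stores a truth table over three variables in a different way), and the isSame helper only mirrors the idea of comparing toString output, as isSameBDD is described to do.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Random;

/** Differential-testing sketch; both "libraries" below are hypothetical stand-ins. */
final class DiffTestSketch {
    static final int VARS = 3;
    static final int ROWS = 1 << VARS; // number of truth-table rows

    /** Stand-in library A: the truth table is stored as a bit mask. */
    static final class MaskFn {
        final int bits;
        MaskFn(int bits) { this.bits = bits & ((1 << ROWS) - 1); }
        static MaskFn var(int i) {
            int b = 0;
            for (int k = 0; k < ROWS; k++) {
                if (((k >> i) & 1) == 1) b |= 1 << k;
            }
            return new MaskFn(b);
        }
        MaskFn and(MaskFn o) { return new MaskFn(bits & o.bits); }
        MaskFn or(MaskFn o) { return new MaskFn(bits | o.bits); }
        MaskFn not() { return new MaskFn(~bits); }
        @Override public String toString() {
            StringBuilder sb = new StringBuilder();
            for (int k = 0; k < ROWS; k++) sb.append(((bits >> k) & 1) == 1 ? '1' : '0');
            return sb.toString();
        }
    }

    /** Stand-in library B: the truth table is stored as a boolean array. */
    static final class ArrayFn {
        final boolean[] rows;
        ArrayFn(boolean[] rows) { this.rows = rows; }
        static ArrayFn var(int i) {
            boolean[] r = new boolean[ROWS];
            for (int k = 0; k < ROWS; k++) r[k] = ((k >> i) & 1) == 1;
            return new ArrayFn(r);
        }
        ArrayFn and(ArrayFn o) {
            boolean[] r = new boolean[ROWS];
            for (int k = 0; k < ROWS; k++) r[k] = rows[k] && o.rows[k];
            return new ArrayFn(r);
        }
        ArrayFn or(ArrayFn o) {
            boolean[] r = new boolean[ROWS];
            for (int k = 0; k < ROWS; k++) r[k] = rows[k] || o.rows[k];
            return new ArrayFn(r);
        }
        ArrayFn not() {
            boolean[] r = new boolean[ROWS];
            for (int k = 0; k < ROWS; k++) r[k] = !rows[k];
            return new ArrayFn(r);
        }
        @Override public String toString() {
            StringBuilder sb = new StringBuilder();
            for (boolean r : rows) sb.append(r ? '1' : '0');
            return sb.toString();
        }
    }

    /** Compares the printed form of both objects, in the spirit of isSameBDD. */
    static boolean isSame(MaskFn a, ArrayFn b) {
        return a.toString().equals(b.toString());
    }

    /** Applies random operations to both groups in lockstep; returns the mismatch count. */
    static int run(long seed, int steps) {
        Random rnd = new Random(seed);
        List<MaskFn> groupA = new ArrayList<>();
        List<ArrayFn> groupB = new ArrayList<>();
        for (int i = 0; i < VARS; i++) {
            groupA.add(MaskFn.var(i));
            groupB.add(ArrayFn.var(i));
        }
        int mismatches = 0;
        for (int s = 0; s < steps; s++) {
            int i = rnd.nextInt(groupA.size());
            int j = rnd.nextInt(groupA.size());
            MaskFn a;
            ArrayFn b;
            switch (rnd.nextInt(3)) {
                case 0: a = groupA.get(i).and(groupA.get(j)); b = groupB.get(i).and(groupB.get(j)); break;
                case 1: a = groupA.get(i).or(groupA.get(j)); b = groupB.get(i).or(groupB.get(j)); break;
                default: a = groupA.get(i).not(); b = groupB.get(i).not(); break;
            }
            groupA.add(a);
            groupB.add(b);
            if (!isSame(a, b)) mismatches++;
        }
        return mismatches;
    }

    public static void main(String[] args) {
        System.out.println("mismatches: " + run(42L, 1000));
    }
}
```

A fixed seed keeps a failing sequence of operations reproducible. In a JUnit setting, the mismatch counter would typically be replaced by an assertion inside the loop so that the first disagreement between the two libraries is reported directly.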
5.   Execution Time Comparison of two BDD libraries

To compare the Java performance of the two Binary Decision Diagram libraries, we use Minitab Express to generate 6 boxplots and 6 histograms. Figure 4 shows the 6 boxplots and Figure 5 shows the 6 histograms.

Figure 4. Boxplots (a) to (f) illustrating the execution time of the two BDD libraries when constructing different numbers of BDDs.

In Figure 4, the boxplots illustrate the distribution of the running time of the two BDD libraries. When the number of BDDs to be constructed is one, the two boxplots sit at almost the same vertical level and overlap, which shows that the running times are very close to each other. However, as the number of BDDs constructed increases, the boxplots no longer overlap, and the gap between the running times grows as the size grows.
Figure 5. Histograms (a) to (f) illustrating the execution time of the two BDD libraries when constructing different numbers of BDDs.

Figure 5 shows the 6 histograms illustrating the distribution of the execution time of both BDD libraries. At size 1, the running times of both libraries are very close, and the mode of the execution time lies near 2 milliseconds. As the number of BDDs constructed grows, the difference grows as well; it can be seen most clearly in graphs (c), (d), (e) and (f).
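The comparison in this section rests on repeated timing runs followed by a t-test and a confidence interval, which were produced with Minitab Express. As a rough illustration of the underlying computation, the Java sketch below collects wall-clock samples and computes Welch's two-sample t statistic. Several points are assumptions rather than facts from this paper: the class and method names are hypothetical, the choice of Welch's unequal-variance test is ours, the interval uses the normal quantile 1.96 and is therefore only a large-sample approximation, and the busyWork method is a placeholder workload, not BDD construction. JVM warm-up and garbage collection are not controlled for.

```java
/** Timing and Welch t-test sketch; all names are hypothetical. */
final class TimingStats {

    /** Runs the task the given number of times; returns each duration in milliseconds. */
    static double[] timeMillis(Runnable task, int runs) {
        double[] samples = new double[runs];
        for (int i = 0; i < runs; i++) {
            long start = System.nanoTime();
            task.run();
            samples[i] = (System.nanoTime() - start) / 1e6;
        }
        return samples;
    }

    static double mean(double[] xs) {
        double sum = 0.0;
        for (double x : xs) sum += x;
        return sum / xs.length;
    }

    /** Unbiased sample variance (divides by n - 1); needs at least two samples. */
    static double variance(double[] xs) {
        double m = mean(xs);
        double ss = 0.0;
        for (double x : xs) ss += (x - m) * (x - m);
        return ss / (xs.length - 1);
    }

    /** Standard error of the difference of the two sample means. */
    static double standardError(double[] a, double[] b) {
        return Math.sqrt(variance(a) / a.length + variance(b) / b.length);
    }

    /** Welch's t statistic for the difference mean(a) - mean(b). */
    static double welchT(double[] a, double[] b) {
        return (mean(a) - mean(b)) / standardError(a, b);
    }

    /** Welch-Satterthwaite approximation of the degrees of freedom. */
    static double welchDf(double[] a, double[] b) {
        double va = variance(a) / a.length;
        double vb = variance(b) / b.length;
        return (va + vb) * (va + vb)
            / (va * va / (a.length - 1) + vb * vb / (b.length - 1));
    }

    /** Approximate 95% interval for mean(a) - mean(b), using the normal quantile 1.96. */
    static double[] approxCi95(double[] a, double[] b) {
        double diff = mean(a) - mean(b);
        double half = 1.96 * standardError(a, b);
        return new double[] {diff - half, diff + half};
    }

    /** Placeholder workload standing in for "construct n BDDs". */
    static long busyWork(int n) {
        long acc = 0;
        for (int i = 0; i < n; i++) acc += i % 7;
        return acc;
    }

    public static void main(String[] args) {
        double[] a = timeMillis(() -> busyWork(100_000), 30);
        double[] b = timeMillis(() -> busyWork(200_000), 30);
        double[] ci = approxCi95(a, b);
        System.out.println("t = " + welchT(a, b) + ", df = " + welchDf(a, b));
        System.out.println("approx. 95% CI for the mean difference: [" + ci[0] + ", " + ci[1] + "]");
    }
}
```

If the interval for the mean difference excludes zero, the two sets of timings differ at roughly the 5% level. For small samples, the quantile should be taken from the t distribution with the computed degrees of freedom instead of the fixed value 1.96.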
6.   Summary

As Section 5 shows, there is a difference in running speed between the two BDD libraries during the Java performance evaluation. As the number of BDDs constructed increases (sizes 1, 10, 20, 30, 40, 50), the histograms show that the difference in running time also increases. The 6 boxplots illustrate the distribution of the running time of the two libraries. When size = 1, the boxplots overlap. However, as the number of BDDs constructed increases, the boxplots no longer overlap, and the gap between them grows as the size grows. Overall, the JavaBDD library runs faster than our BDD library. The t-test statistics, the 6 histograms, and the 6 boxplots generated by Minitab Express support our hypothesis.

Some sources of non-determinism could affect the Java performance evaluation of the Binary Decision Diagram libraries, including the garbage collector and background noise in the Java runtime. However, they would not make a big difference to the evaluation.

We believe this paper makes a step towards further study of the current Java BDD library implementation and of how to improve it and make it more efficient, with a faster execution time. Moreover, we would like to inspect our BDD library more deeply to make sure that no bugs or other kinds of exceptions occur. Its stability and reliability are worth further study in future work.

Acknowledgment

We would like to thank Professor Christian Kästner, who offered enthusiastic instruction and help with this research. His detailed suggestions greatly helped us throughout this summer research program, including conducting the research, making the academic poster, and writing this research paper.

The weekly reading group also helped introduce me to the world of software research.
By reading and discussing a specifically chosen paper each week, I learned some important skills associated with research papers: how to reflect critically on a scientific work; how to practice reading and argumentation strategies; how to engage with a broad range of research topics; and how to lead a reading group discussion and guide teammates through it. The weekly Monday meetings for all REU students were also thought-provoking. PhD students would

Moreover, I would like to thank all the staff and faculty in the REU program of the Institute for Software Research at Carnegie Mellon University. Because you generously offered me this precious opportunity to participate in the summer research program, I was able to have such a great experience during my undergraduate education. The REU summer program provided me with a platform to work closely with professors and PhD students at Carnegie Mellon University, as well as with undergraduate colleagues from other universities and colleges. It will always be a valuable experience from my undergraduate years.

References

[1]   Andersen, Henrik Reif. "An Introduction to Binary Decision Diagrams." Lecture notes, available online, IT University of Copenhagen (1997).
[2]   "Binary Decision Diagram." Wikipedia. Wikimedia Foundation, n.d. Web. 24 July 2016.
[3]   "Boolean Function." Wikipedia. Accessed July 27, 2016. https://en.wikipedia.org/wiki/Boolean_function.
[4]   Clarke, Edmund M., Masahiro Fujita, and Xudong Zhao. "Multi-Terminal Binary Decision Diagrams and Hybrid Decision Diagrams." Representations of Discrete Functions, 1996, 93-108. doi:10.1007/978-1-4613-1385-4_4.
[5]   Drechsler, Rolf, and Bernd Becker. Binary Decision Diagrams: Theory and Implementation. Springer Science & Business Media, 2013.
[6]   Georges, Andy, Dries Buytaert, and Lieven Eeckhout. "Statistically Rigorous Java Performance Evaluation." ACM SIGPLAN Notices 42, no. 10 (October 21, 2007): 57. doi:10.1145/1297105.1297033.
[7]   Vliet, Hans Van. Software Engineering: Principles and Practice. Chichester: John Wiley, 2000.
[8]   Wegener, Ingo. "Branching Programs and Binary Decision Diagrams." 2000. doi:10.1137/1.9780898719789.
[9]   McKeeman, William M. "Differential Testing for Software." Digital Technical Journal 10, no. 1 (1998).