SlideShare a Scribd company logo
1 of 32
Download to read offline
LDV: Light-weight
DatabaseVirtualization
Quan Pham2,Tanu Malik1, Boris Glavic3 and Ian Foster1,2
Computation Institute1and Department of Computer Science2,3
University of Chicago1,2,Argonne National Laboratory1
Illinois Institute of Technology3
Thursday, April 23, 15
Share and Reproduce
Alice wants to share her models and
simulation output with Bob, and Bob wants to
re-execute Alice’s application to validate her
inputs and outputs.
Alice Bob
Thursday, April 23, 15
Significance
|reportingchecklistforlifesciencesarticles
1. Howwasthesamplesizechosentoensureadequatepower
todetectapre-specifiedeffectsize?
Foranimalstudies,includeastatementaboutsamplesize
estimateevenifnostatisticalmethodswereused.
2. Describeinclusion/exclusioncriteriaifsamplesoranimalswere
excludedfromtheanalysis.Werethecriteriapre-established?
3. Ifamethodofrandomizationwasusedtodeterminehow
samples/animalswereallocatedtoexperimentalgroupsand
processed,describeit.
Foranimalstudies,includeastatementaboutrandomization
evenifnorandomizationwasused.
4. Iftheinvestigatorwasblindedtothegroupallocationduring
theexperimentand/orwhenassessingtheoutcome,state
theextentofblinding.
Foranimalstudies,includeastatementaboutblindingeven
ReportingChecklistForLifeSciencesArticles
Thischecklistisusedtoensuregoodreportingstandardsandtoimprovethereproducibilityofpublishedresults.Formoreinformation,
pleasereadReportingLifeSciencesResearch.
Figurelegends
Eachfigurelegendshouldcontain,foreachpanelwheretheyarerelevant:
theexactsamplesize(n)foreachexperimentalgroup/condition,givenasanumber,notarange;
a description of the sample collection allowing the reader to understand whether the samples represent technical or biological
replicates(includinghowmanyanimals,litters,cultures,etc.);
astatementofhowmanytimestheexperimentshownwasreplicatedinthelaboratory;
definitionsofstatisticalmethodsandmeasures:
○ verycommontests,suchast-test,simpleχ2 tests,WilcoxonandMann-Whitneytests,canbeunambiguouslyidentifiedbynameonly,
butmorecomplextechniquesshouldbedescribedinthemethodssection;
○ aretestsone-sidedortwo-sided?
○ arethereadjustmentsformultiplecomparisons?
○ statisticaltestresults,e.g.,Pvalues;
○ definitionof‘centervalues’asmedianoraverage;
○ definitionoferrorbarsass.d.ors.e.m.
Anydescriptionstoolongforthefigurelegendshouldbeincludedinthemethodssection.
Pleaseensurethattheanswerstothefollowingquestionsarereportedinthemanuscriptitself.Weencourageyoutoincludeaspecific
subsectioninthemethodssectionforstatistics,reagentsandanimalmodels.Below,providethepagenumber(s)orfigurelegend(s)
wheretheinformationcanbelocated.
Statisticsandgeneralmethods
Reportedonpage(s)orfigurelegend(s):
CorrespondingAuthorName: ________________________________________
ManuscriptNumber: ______________________________
Metrics aims to improve the reproducibility of scientific research.
NY Times, Dec, 2014
Thursday, April 23, 15
Alice’s Options
1.A tar and gzip
2. Submit to a repository
3. Build website with code, parameters, and data
4. Create a virtual machine
Thursday, April 23, 15
Bob’s Frustration
1-3. I do not find the lib.so required for
building the model.
4. How do I?
Lack of easy and efficient methods for sharing
and reproducibility
Amount of pain
Bob suffers
Amount of
pain Alice suffers
Thursday, April 23, 15
ApplicationVirtualization
Alice’s Machine Bob’s Machine
Thursday, April 23, 15
ApplicationVirtualization
Alice’s Machine Bob’s Machine
Thursday, April 23, 15
ApplicationVirtualization for
DB Applications
Application
Operating System
File System File System
Slice
Pkg
Copy
AV
Alice's
Computer
chdir(“/usr”)
open
(“lib/libc.so.6”)
Thursday, April 23, 15
ApplicationVirtualization for
DB Applications
Application
Operating System
File System File System
Slice
Pkg
Copy
AV
Alice's
Computer
chdir(“/usr”)
open
(“lib/libc.so.6”)DB Server
Thursday, April 23, 15
ApplicationVirtualization for
DB Applications
• Applications that interact with a relational database
• Examples:
• Text-mining applications that download data,
preprocess and insert into a personal DB
• Analysis scripts using parts of a hosted database
Application
Operating System
File System File System
Slice
Pkg
Copy
AV
Alice's
Computer
chdir(“/usr”)
open
(“lib/libc.so.6”)DB Server
Thursday, April 23, 15
ApplicationVirtualization for
DB Applications
• Applications that interact with a relational database
• Examples:
• Text-mining applications that download data,
preprocess and insert into a personal DB
• Analysis scripts using parts of a hosted database
Application
Operating System
File System File System
Slice
Pkg
Copy
AV
Alice's
Computer
chdir(“/usr”)
open
(“lib/libc.so.6”)DB Server
Thursday, April 23, 15
Why doesn’t it work?
• Application virtualization methods are
oblivious to semantics of data in a database
system
• The database state at the time of sharing
the application may not be the same as the
start of the application
ared among multiple users and
Thus, to re-execute an applica-
as of the start of the application,
to understand a shared applica-
application provenance are well
these two types of provenance
ned methods - companion web-
cation virtualization - addresses
o automatic mechanism for cap-
on and DB provenance, these
s for determining which data is
they do not solve the issue of
vious state, and do not address
ring the binaries of commercial
irtualization is currently limited
not communicate to server pro-
r or a database server. In fact,
nicates with a database server,
ord the communication between
ver. This is not sufficient for
sed by the application (and, thus,
ackage) and to be able to reset
re application execution started.
a solution for the later problem,
share with Bob (Figure 1). Alice would preferably like to share
this application in the form of package P with Bob, who may
want to re-execute the application in its entirety or may want to
validate, just the analysis task, or provide his own data inputs
to examine the analysis result.
If Alice wants Bob to re-execute and build upon her
database application, then Bob must have access to an en-
vironment that consists of application binaries and data, any
extension modules that the code depends upon (e.g., dynam-
ically linked libraries), a database server and a database on
which the application can be re-executed. Ideally, it would
be useful if Alice’s environment can be virtualized and thus
automatically set up for Bob.
P3 P4Other experiments
f1
P1 Insert
t1
t2
t3
Query P2
t4
f2
Alice’s
experiment
Database
Fig. 1: Alice’s experiment with processes P1 and P2 uses tupleThursday, April 23, 15
LDV: Light-weight
DatabaseVirtualization
• Goal: Easily and efficiently share and repeat
DB applications.
Thursday, April 23, 15
Key Ideas
• DB application = Application (OS) part + DB part
• Use data provenance to capture interactions from/to the
application side to the database side
• Limited formal mechanisms so far to combine the two kinds
of provenance models
• Create a virtualized package that can be re-
executed
• Either include the server and data, or replay interactions
(for licensed databases)
• No virtualization mechanism for database replay
Thursday, April 23, 15
Related Work
• Application virtualization
• Linux Containers, CDE[Usenix’11]
• Packaging with annotations
• Docker
• Packaging with provenance
• PTU1[TaPP’13], ReproZip[TaPP’13], Research Objects
• Unified provenance models
• based on program instrumentation [TaPP’12]
1 Q. Pham,T. Malik, and I. Foster. Using provenance for repeatability. In Theory and Practice of Provenance (TaPP), 2013.
Thursday, April 23, 15
How does LDV work?
Application
Operating System
File System
DB Server
Execution
Trace
DB Server
DB Slice
File System
Slice
Pkg
CopyLDV
Alice's
Computer
Alice’s Machine
ldv-audit db-app
• Monitoring system calls
• Monitoring SQL
• Server-included packages
• Server-excluded packages
• Execution traces
• Relevant DB and filesystem
slices
Thursday, April 23, 15
• Redirecting file access
• Redirecting DB access
• Server-included packages
• Server-excluded packages
File System
Bob's
ComputerUser Application
Operating System
DB Server
Execution
Trace
DB Server
DB Slice
File System
Slice
Pkg
LDV Redirect
Bob’s Machine
ldv-exec db-app
How does LDV work?
Thursday, April 23, 15
Example
Alice:~$ ldv-audit app.sh
Application package created as app-pkg
Alice:~$ ls
app-pkg app.sh src data
Alice:~$echo "Hi Bob, Please find the pkg --Alice"  |
mutt -s "Sharing DB Application -a "./app-pkg" 
-- bob-vldb2015@gmail.com
Bob:~$ ls .
app-pkg
Bob:~$ cd app-pkg
Bob:~$ ls
app.sh src data
Bob:~$ldv-exec app.sh
Running app-pkg....
Ubuntu 14.04
(Kernel 3.13)
+
Postgres 9.1
CentOS 6.2
(Kernel 2.6.32)
+
MySQL
Thursday, April 23, 15
LDV Issues
• Monitoring system calls
• Monitoring SQL
• Execution traces
• Relevant DB slices
• Redirecting file access
• Server-included packages
• Server-excluded packages
• Redirecting DB access
Thursday, April 23, 15
LDV Issues
• Monitoring system calls
• Monitoring SQL
• Execution traces
• Relevant DB slices
• Redirecting file access
• Server-included packages
• Server-excluded packages
• Redirecting DB access
Thursday, April 23, 15
An Execution Trace
A B
P1
Insert1
Insert2
t1
t2
t3
Query
t4
t5
P2
C
[1, 6] [7, 8]
[5, 5]
[8, 8]
[5, 5]
[5, 5]
[8, 8]
[9, 9]
[9, 9]
[9, 9]
[9, 9]
[9, 9]
[9, 9]
[9, 9]
[7, 12]
Fig. 2: An execution trace with processes and database operations
t1
t2
t3
Q1
[4, 4]
[4, 4]
[4, 4]
[4, 4]
Fig. 3: PLin trace and data de
A
B
P1
[1, 5]
[5, 7]
[2, 3]
[8, 8]
Fig. 4: PBB trace and data de
(a) the first process reads file f0
and the last process writes
file f0
, and (b) each process Pi was executed by process Pi 1.
Example 6. Consider the trace shown in Figure 4. Process
P1 reads files A and B and writes files C and D. Thus, both
graph. In contrast, we assume the temporal c
given (recorded when creating an execution tr
these annotations to restrict what edges have to
Similarly, Dey et al. [8] determine all possible ord
a file
a process
a tuple
a query
temporal
annotations
Uses provenance entities and activities to model the
execution of a DB application
Thursday, April 23, 15
Data Dependencies from Provenance
Systems
t1
t2
t3
Query
t4
t5
P2
C
[9, 9]
[9, 9]
[9, 9]
[9, 9]
[9, 9]
[9, 9]
[9, 9]
[7, 12]
t1
t2
t3
Q1 t4
[4, 4]
[4, 4]
[4, 4]
[4, 4]
Fig. 3: PLin trace and data dependenci
A
B
P1
C
D
[1, 5]
[5, 7]
[2, 3]
[8, 8]
Fine-Grained DB Provenance
t1
t2
t3
Q1 t4
[4, 4]
[4, 4]
[4, 4]
[4, 4]
Fig. 3: PLin trace and data dependencies.
A
B
P1
C
D
[1, 5]
[5, 7]
[2, 3]
[8, 8]
Fig. 4: PBB trace and data dependencies.
st, we assume the temporal constraints as
when creating an execution trace) and use
s to restrict what edges have to be inferred.
al. [8] determine all possible orders of events
le for an OPM provenance graph.
File Operations
A DB execution trace has more edges than those
determined by individual provenance systems
A combined execution trace models the execution of a DB
application including its processes, file operations, and DB
accesses based on a OS and a DB provenance model.
Definition 6 (Combined Execution Trace). Let PDB and POS
be DB and OS provenance models. Every execution trace for
PDB+OS is a combined execution trace for PDB and POS.
Example 3. A combined execution trace for the PLin and
PBB models is shown in Figure 2. This trace models the
execution of two processes P1 and P2. Process P1 reads two
files A and B, and executes two insert statements (at time 5
and 8 respectively). These insert statements create three tuple
versions t1, t2, and t3. Process P2 executes a query which
returns tuples t4 and t5. These tuples depend on tuples t1 and
t3. Finally, process P2 writes file C.
VI. DATA DEPENDENCIES
The above definitions describe interactions of activities and
entities in an execution trace of a provenance model, but do not
model data dependencies, i.e., dependencies between entities.
In our model, a dependency is an edge between two entities e
and e0
where a change to the input node (e0
) may result in a
change to the output node (e). Given a provenance model, de-
pendency information may or may not be explicitly available;
it depends upon the granularity at which information about
entities and activities is tracked and stored. For instance, the
blackbox provenance model PBB operates at the granularity
of processes and files and may not compute exact dependency
information. Consider a process P that reads from files A and
sales
id price
{t1} 1 5
{t2} 2 11
{t3} 3 14
result
ttl
{t2, t3} 25
Fig. 5: Annotated relation sales and query result
compute provenance polynomials (and thus also Lineage) on
demand for an input query. In the following we will us
Lin(Q, t) to denote the Lineage of a tuple t in the result of
query Q.
Example 4. Consider the sales table shown in Fig
ure 5. The Lineage of each tuple in the sales ta
ble is a singleton set containing the tuple’s identi
fier. The result of a query SELECT sum(value) AS ttl
FROM sales WHERE price > 10 is a single row with ttl =
11+14 = 25. The Lineage contains all tuples (t2 and t3) tha
were used to compute this results.
We define data dependencies in the PLin model based on
Lineage. We connect each tuple t in the result of a query Q to
all input tuples of the query that are in t’s Lineage. Similarly
we connect a modified tuple t in the result of an update to th
corresponding tuple t0
in the input of the update.
Definition 7 (PLin Data Dependencies). Let G be a PLin
trace. Let Lin(s, t) denote the Lineage of tuple t in the resul
of DB operation s, and let t and t0
denote entities (tuples)
The dependencies D(G) ⇢ D ⇥ D of G are defined as:
Using PTU1
Using Perm2
1 Q. Pham,T. Malik, and I. Foster. Using provenance for repeatability. In Theory and Practice of Provenance (TaPP), 2013.
2 B. Glavic et al. Perm: Processing Provenance and Data on the same Data Model through Query Rewriting. In ICDE, 2009.
Thursday, April 23, 15
Can we use temporal annotations and
known direct data dependencies to infer
a sound and complete set of
dependencies that helps us determine
the smallest size repeatability package?
Key Question
Thursday, April 23, 15
Axioms for
Dependency Inference
• no direct data dependencies implies there
is no data flow
• state of node at point in time depends on
past interactions only
• flow of data should not violate temporal
causality
Thursday, April 23, 15
Inferring Dependencies
(a) No Dependency between C and A
A P1 B P2 C[2, 3] [6, 7] [1, 5] [6, 6]
(b) C depends on A at time 4
A P1 B P2 C[1, 1] [4, 7] [2, 5] [1, 6]
(c) No Dependency between C and A
(a) No Dependency between C and A
A P1 B P2 C[2, 3] [6, 7] [1, 5] [6, 6]
(b) C depends on A at time 4
A P1 B P2 C[1, 1] [4, 7] [2, 5] [1, 6]
(c) No Dependency between C and A1 4 5 6
2 6 5
no such
sequence exists
• to determine whether information has
flown from A to C
• find increasing sequence of times for edges
so each time lies in edge’s interval
sequence shown
on the left
Thursday, April 23, 15
Experiments
• 3 Metrics
• Performance
• Usability
• Generality
1e-05
0.0001
Test
Prepare
Inserts First
Select
Other
Selects
Updates
1e-05
0.0001
Test
Prepare
Inserts First
Select
Other
Selects
U
Fig. 7: Execution time of each step in an execution of TPC-H
(a) Q1
1e-05
0.0001
0.001
0.01
0.1
1
10
100
1000
Test
Prepare
Inserts First
Select
Other
Selects
Updates
Executiontime(seconds)
PostgreSQL
0.03
0.00357
0.562
0.3753
0.00084
Open-Source DB Server scenario
178.2
0.003042
0.492
0.3921
0.001
Proprietary DB Server scenario
0.016
4E-05
0.016
0.0003
0.00018
(b) Q2
1e-05
0.0001
0.001
0.01
0.1
1
10
100
1000
Test
Prepare
Inserts First
Select
Other
Selects
U
Executiontime(seconds)
Postgre
0.03
0.00348
0.088
0.02872
Open-Source DB Server sce
34.09
0.003072
0.18
0.08532
Proprietary DB Server sce
0.036
0.000376
0.218
0.1555
Fig. 8: Re-Execution time of each step in an execution of TPC-H
Package Software
binaries
Server
binaries
Data
directory
Database
provenance
PTU 3 3 3(full) 7
Open-Source DBS 3 3 3(empty) 3
Proprietary DBS 3 7 7 3
TABLE III: Content of PTU and LDV packages: PTU pack-
ages contain data directory of the full database, whereas Open-
Source DBS LDV packages contain a data directory of an
empty database (created by the initdb command) 100
200
300
400
500
Totalpackagesize(MB)
ApplicationVirtualization + Database
LDV + Open-source DB
LDV + Proprietary DB
Thursday, April 23, 15
TPC-H Queries
• Most TPC-H queries touch large fractions
of tables
• Modified by varying parameters and
selectivity
TABLE II: The 18 TPC-H benchmark queries used in our experiments
Queries SQL PARAM Sel.
Q1-1 to
Q1-5
SELECT l quantity, l partkey , l extendedprice , l shipdate , l receiptdate FROM lineitem
WHERE l suppkey BETWEEN 1 AND PARAM
10, 20, 50, 100,
250
1%, 2%, 5%,
10%, 25%
Q2-1 to
Q2-4
SELECT o comment, l comment FROM lineitem l, orders o, customer c WHERE l.l orderkey
= o.o orderkey AND o.o custkey = c.c custkey AND c.c name LIKE ’%PARAM%’;
0000000, 000000,
00000, 0000
66%, 6.6%,
0.66%, 0.06%
Q3-1 to
Q3-4
SELECT count(⇤) FROM lineitem l, orders o, customer c WHERE l.l orderkey = o.o orderkey
AND o.o custkey = c.c custkey AND c.c name LIKE ’%PARAM%’;
0000000, 000000,
00000, 0000
66%, 6.6%,
0.66%, 0.06%
Q4-1 to
Q4-5
SELECT o orderkey, AVG(l quantity) AS avgQ FROM lineitem l, orders o WHERE l.l orderkey
= o.o orderkey AND l suppkey BETWEEN 1 AND PARAM GROUP BY o orderkey;
10, 20, 50, 100,
250
1%, 2%, 5%,
10%, 25%
(a) Audit
0.001
0.01
0.1
1
10
100
1000
10000
Executiontime(seconds)
PostgreSQL + PTU
Server-included package
Server-excluded package
(b) Replay
0.001
0.01
0.1
1
10
100
1000
10000
Executiontime(seconds)
PostgreSQL + PTU
0.01001
0.08
0.053
01
Server-included package
4.19
0.00063
0.05
0.025
.0003
Server-excluded package
0.01
0.01
0.009
01
Thursday, April 23, 15
Size Comparison
10
100
1000
10000
1-1 1-2 1-3 1-4 1-5 2-1 2-2 2-3 2-4 3-1 3-2 3-3 3-4 4-1 4-2 4-3 4-4 4-5
Packagesize(MB)
Query
PTU package
Server-included package
Server-excluded package
Fig. 9: LDV packages are significantly smaller than PTU
packages when queries have low selectivity.
and the LDV packages. The VMI is 8.2 GB: 80 times larger
than the average LDV package (100MB). To evaluate runtime
performance, we instantiate this VMI using the same number
of cores and memory as in our machine to execute our queries.
[4] C. T. Bro
ivory.idy
[5] J. Chene
Foundati
[6] F. Chirig
provenan
[7] F. S. Ch
to suppo
[8] S. C. De
provenan
[9] J. Freire
reproduc
14(4), 20
[10] B. Glavi
Data Mo
[11] B. Glavi
provenan
practice
[12] C. A. G
for work
on Workfl
[13] P. J. Guo
create p
Conferen
[14] B. How
research.
• LDV packages are significantly smaller than PTU
packages when queries have low selectivity
• TheVMI is 8.2 GB: 80 times larger than the
average LDV package (100MB).
Thursday, April 23, 15
Audit and Replay1e-05
0.0001
0.001
0.01
0.1
Inserts First
Select
Other
Selects
Updates
Executio
1e-05
0.0001
0.001
0.01
0.1
Initialization Inserts First
Select
Other
Selects
Updates
Executio
1E-05
0.0
0.0001
0.00063
0
0.0003
0.0
2E-05
0.0
0.0
0.0001
Fig. 7: Execution time of each step in an application with query Q1-1
(a) Audit
0.001
0.01
0.1
1
10
100
1000
10000
1-1 1-2 1-3 1-4 1-5 2-1 2-2 2-3 2-4 3-1 3-2 3-3 3-4 4-1 4-2 4-3 4-4 4-5
Executiontime(seconds)
Query
PostgreSQL + PTU
Server-included package
Server-excluded package
(b) Replay
0.001
0.01
0.1
1
10
100
1-1 1-2 1-3 1-4 1-5 2-1 2-2 2-3 2-4 3-1 3-2 3-3 3-4 4-1 4-2 4-3 4-4 4-5
Executiontime(seconds)
Query
PostgreSQL + PTU
Server-included package
Server-excluded package
VM
Fig. 8: Execution time for each query, during audit (left) and replay (right)
LE III: Package Contents: PTU packages contain all data
of the full DB, whereas server-included LDV packages
in the data files of an empty DB.
kage type Software
binaries
DB
server
Data
files
DB
provenance
U 3 3 3(full) 7
V server-included 3 3 3(empty) 3
V server-excluded 3 7 7 3
tuples needed to re-execute the application—which, fo
queries, is at most ⇠25% of all tuples. Server-excluded
packages are often yet smaller, because they contain on
query results—which, for many of our experiment queri
smaller than the tuples required for re-execution. Ho
recall that server-excluded packages have less flexibilit
do server-included packages.
LDV amortizes audit cost significantly at replay time
Thursday, April 23, 15
Summary
• LDV permits sharing and repeating DB
applications
• LDV combines OS and DB provenance to
determine file and DB slices
• LDV creates light-weight virtualized
packages based on combined provenance
• Results show LDV is efficient, usable, and
general
• LDV at http://github.com/lordpretzel/ldv.git
Thursday, April 23, 15
Q&A
• ?
Thursday, April 23, 15
Inferring Dependencies
s:
ncies
nter-
ns of
entity
n the
nance
ncies,
h do
there
C).
e file
(a) No Dependency between C and A
A P1 B P2 C[2, 3] [6, 7] [1, 5] [6, 6]
(b) C depends on A at time 4
A P1 B P2 C[1, 1] [4, 7] [2, 5] [1, 6]
(c) No Dependency between C and A
A P1 B P2 C[1, 1] [4, 7] [2, 5] [1, 6]
Fig. 6: Example traces with different temporal annotations
s:
ncies
nter-
ns of
ntity
n the
ance
cies,
h do
there
C).
e file
(a) No Dependency between C and A
A P1 B P2 C[2, 3] [6, 7] [1, 5] [6, 6]
(b) C depends on A at time 4
A P1 B P2 C[1, 1] [4, 7] [2, 5] [1, 6]
(c) No Dependency between C and A
A P1 B P2 C[1, 1] [4, 7] [2, 5] [1, 6]
Fig. 6: Example traces with different temporal annotations
between e0
and e, because if there is no path between e0
and
s:
ncies
inter-
ns of
entity
n the
nance
ncies,
th do
there
! C).
e file
(a) No Dependency between C and A
A P1 B P2 C[2, 3] [6, 7] [1, 5] [6, 6]
(b) C depends on A at time 4
A P1 B P2 C[1, 1] [4, 7] [2, 5] [1, 6]
(c) No Dependency between C and A
A P1 B P2 C[1, 1] [4, 7] [2, 5] [1, 6]
Fig. 6: Example traces with different temporal annotations
between e0
and e, because if there is no path between e0
and
No Depedency between C and A
C Depends on A at time 4
No Dependency between C and A
Thursday, April 23, 15

More Related Content

What's hot

Coding the Continuum
Coding the ContinuumCoding the Continuum
Coding the ContinuumIan Foster
 
The Galaxy bioinformatics workflow environment
The Galaxy bioinformatics workflow environmentThe Galaxy bioinformatics workflow environment
The Galaxy bioinformatics workflow environmentRutger Vos
 
The Materials Project Ecosystem - A Complete Software and Data Platform for M...
The Materials Project Ecosystem - A Complete Software and Data Platform for M...The Materials Project Ecosystem - A Complete Software and Data Platform for M...
The Materials Project Ecosystem - A Complete Software and Data Platform for M...University of California, San Diego
 
The Materials Project - Combining Science and Informatics to Accelerate Mater...
The Materials Project - Combining Science and Informatics to Accelerate Mater...The Materials Project - Combining Science and Informatics to Accelerate Mater...
The Materials Project - Combining Science and Informatics to Accelerate Mater...University of California, San Diego
 
Atomate: a high-level interface to generate, execute, and analyze computation...
Atomate: a high-level interface to generate, execute, and analyze computation...Atomate: a high-level interface to generate, execute, and analyze computation...
Atomate: a high-level interface to generate, execute, and analyze computation...Anubhav Jain
 
Active Data: Managing Data-Life Cycle on Heterogeneous Systems and Infrastruc...
Active Data: Managing Data-Life Cycle on Heterogeneous Systems and Infrastruc...Active Data: Managing Data-Life Cycle on Heterogeneous Systems and Infrastruc...
Active Data: Managing Data-Life Cycle on Heterogeneous Systems and Infrastruc...Gilles Fedak
 
Learning Systems for Science
Learning Systems for ScienceLearning Systems for Science
Learning Systems for ScienceIan Foster
 
FireWorks workflow software
FireWorks workflow softwareFireWorks workflow software
FireWorks workflow softwareAnubhav Jain
 
FireWorks overview
FireWorks overviewFireWorks overview
FireWorks overviewAnubhav Jain
 
Reproducible Workflow with Cytoscape and Jupyter Notebook
Reproducible Workflow with Cytoscape and Jupyter NotebookReproducible Workflow with Cytoscape and Jupyter Notebook
Reproducible Workflow with Cytoscape and Jupyter NotebookKeiichiro Ono
 
Automating materials science workflows with pymatgen, FireWorks, and atomate
Automating materials science workflows with pymatgen, FireWorks, and atomateAutomating materials science workflows with pymatgen, FireWorks, and atomate
Automating materials science workflows with pymatgen, FireWorks, and atomateAnubhav Jain
 
Building Reproducible Network Data Analysis / Visualization Workflows
Building Reproducible Network Data Analysis / Visualization WorkflowsBuilding Reproducible Network Data Analysis / Visualization Workflows
Building Reproducible Network Data Analysis / Visualization WorkflowsKeiichiro Ono
 
WBDB 2015 Performance Evaluation of Spark SQL using BigBench
WBDB 2015 Performance Evaluation of Spark SQL using BigBenchWBDB 2015 Performance Evaluation of Spark SQL using BigBench
WBDB 2015 Performance Evaluation of Spark SQL using BigBencht_ivanov
 
SDCSB Advanced Tutorial: Reproducible Data Visualization Workflow with Cytosc...
SDCSB Advanced Tutorial: Reproducible Data Visualization Workflow with Cytosc...SDCSB Advanced Tutorial: Reproducible Data Visualization Workflow with Cytosc...
SDCSB Advanced Tutorial: Reproducible Data Visualization Workflow with Cytosc...Keiichiro Ono
 
Data Tribology: Overcoming Data Friction with Cloud Automation
Data Tribology: Overcoming Data Friction with Cloud AutomationData Tribology: Overcoming Data Friction with Cloud Automation
Data Tribology: Overcoming Data Friction with Cloud AutomationIan Foster
 
Cytoscape Tutorial Session 1 at UT-KBRIN Bioinformatics Summit 2014 (4/11/2014)
Cytoscape Tutorial Session 1 at UT-KBRIN Bioinformatics Summit 2014 (4/11/2014)Cytoscape Tutorial Session 1 at UT-KBRIN Bioinformatics Summit 2014 (4/11/2014)
Cytoscape Tutorial Session 1 at UT-KBRIN Bioinformatics Summit 2014 (4/11/2014)Keiichiro Ono
 
Materials Data Facility: Streamlined and automated data sharing, discovery, ...
Materials Data Facility: Streamlined and automated data sharing,  discovery, ...Materials Data Facility: Streamlined and automated data sharing,  discovery, ...
Materials Data Facility: Streamlined and automated data sharing, discovery, ...Ian Foster
 
Taylor bosc2010
Taylor bosc2010Taylor bosc2010
Taylor bosc2010BOSC 2010
 

What's hot (20)

MAVRL Workshop 2014 - Python Materials Genomics (pymatgen)
MAVRL Workshop 2014 - Python Materials Genomics (pymatgen)MAVRL Workshop 2014 - Python Materials Genomics (pymatgen)
MAVRL Workshop 2014 - Python Materials Genomics (pymatgen)
 
Coding the Continuum
Coding the ContinuumCoding the Continuum
Coding the Continuum
 
The Galaxy bioinformatics workflow environment
The Galaxy bioinformatics workflow environmentThe Galaxy bioinformatics workflow environment
The Galaxy bioinformatics workflow environment
 
The Materials Project Ecosystem - A Complete Software and Data Platform for M...
The Materials Project Ecosystem - A Complete Software and Data Platform for M...The Materials Project Ecosystem - A Complete Software and Data Platform for M...
The Materials Project Ecosystem - A Complete Software and Data Platform for M...
 
The Materials Project - Combining Science and Informatics to Accelerate Mater...
The Materials Project - Combining Science and Informatics to Accelerate Mater...The Materials Project - Combining Science and Informatics to Accelerate Mater...
The Materials Project - Combining Science and Informatics to Accelerate Mater...
 
Atomate: a high-level interface to generate, execute, and analyze computation...
Atomate: a high-level interface to generate, execute, and analyze computation...Atomate: a high-level interface to generate, execute, and analyze computation...
Atomate: a high-level interface to generate, execute, and analyze computation...
 
Active Data: Managing Data-Life Cycle on Heterogeneous Systems and Infrastruc...
Active Data: Managing Data-Life Cycle on Heterogeneous Systems and Infrastruc...Active Data: Managing Data-Life Cycle on Heterogeneous Systems and Infrastruc...
Active Data: Managing Data-Life Cycle on Heterogeneous Systems and Infrastruc...
 
Learning Systems for Science
Learning Systems for ScienceLearning Systems for Science
Learning Systems for Science
 
FireWorks workflow software
FireWorks workflow softwareFireWorks workflow software
FireWorks workflow software
 
The Materials API
The Materials APIThe Materials API
The Materials API
 
FireWorks overview
FireWorks overviewFireWorks overview
FireWorks overview
 
Reproducible Workflow with Cytoscape and Jupyter Notebook
Reproducible Workflow with Cytoscape and Jupyter NotebookReproducible Workflow with Cytoscape and Jupyter Notebook
Reproducible Workflow with Cytoscape and Jupyter Notebook
 
Automating materials science workflows with pymatgen, FireWorks, and atomate
Automating materials science workflows with pymatgen, FireWorks, and atomateAutomating materials science workflows with pymatgen, FireWorks, and atomate
Automating materials science workflows with pymatgen, FireWorks, and atomate
 
Building Reproducible Network Data Analysis / Visualization Workflows
Building Reproducible Network Data Analysis / Visualization WorkflowsBuilding Reproducible Network Data Analysis / Visualization Workflows
Building Reproducible Network Data Analysis / Visualization Workflows
 
WBDB 2015 Performance Evaluation of Spark SQL using BigBench
WBDB 2015 Performance Evaluation of Spark SQL using BigBenchWBDB 2015 Performance Evaluation of Spark SQL using BigBench
WBDB 2015 Performance Evaluation of Spark SQL using BigBench
 
SDCSB Advanced Tutorial: Reproducible Data Visualization Workflow with Cytosc...
SDCSB Advanced Tutorial: Reproducible Data Visualization Workflow with Cytosc...SDCSB Advanced Tutorial: Reproducible Data Visualization Workflow with Cytosc...
SDCSB Advanced Tutorial: Reproducible Data Visualization Workflow with Cytosc...
 
Data Tribology: Overcoming Data Friction with Cloud Automation
Data Tribology: Overcoming Data Friction with Cloud AutomationData Tribology: Overcoming Data Friction with Cloud Automation
Data Tribology: Overcoming Data Friction with Cloud Automation
 
Cytoscape Tutorial Session 1 at UT-KBRIN Bioinformatics Summit 2014 (4/11/2014)
Cytoscape Tutorial Session 1 at UT-KBRIN Bioinformatics Summit 2014 (4/11/2014)Cytoscape Tutorial Session 1 at UT-KBRIN Bioinformatics Summit 2014 (4/11/2014)
Cytoscape Tutorial Session 1 at UT-KBRIN Bioinformatics Summit 2014 (4/11/2014)
 
Materials Data Facility: Streamlined and automated data sharing, discovery, ...
Materials Data Facility: Streamlined and automated data sharing,  discovery, ...Materials Data Facility: Streamlined and automated data sharing,  discovery, ...
Materials Data Facility: Streamlined and automated data sharing, discovery, ...
 
Taylor bosc2010
Taylor bosc2010Taylor bosc2010
Taylor bosc2010
 

Similar to LDV: Light-weight Database Virtualization

ICDE 2015 - LDV: Light-weight Database Virtualization
ICDE 2015 - LDV: Light-weight Database VirtualizationICDE 2015 - LDV: Light-weight Database Virtualization
ICDE 2015 - LDV: Light-weight Database VirtualizationBoris Glavic
 
"Data Provenance: Principles and Why it matters for BioMedical Applications"
"Data Provenance: Principles and Why it matters for BioMedical Applications""Data Provenance: Principles and Why it matters for BioMedical Applications"
"Data Provenance: Principles and Why it matters for BioMedical Applications"Pinar Alper
 
13-Essential-Data-Validation-Checks.pdf
13-Essential-Data-Validation-Checks.pdf13-Essential-Data-Validation-Checks.pdf
13-Essential-Data-Validation-Checks.pdfarifulislam946965
 
Testistanbul 2016 - Keynote: "Performance Testing of Big Data" by Roland Leusden
Testistanbul 2016 - Keynote: "Performance Testing of Big Data" by Roland LeusdenTestistanbul 2016 - Keynote: "Performance Testing of Big Data" by Roland Leusden
Testistanbul 2016 - Keynote: "Performance Testing of Big Data" by Roland LeusdenTurkish Testing Board
 
Belak_ICME_June02015
Belak_ICME_June02015Belak_ICME_June02015
Belak_ICME_June02015Jim Belak
 
Sem tech 2011 v8
Sem tech 2011 v8Sem tech 2011 v8
Sem tech 2011 v8dallemang
 
Migrating Data Warehouse Solutions from Oracle to non-Oracle Databases
Migrating Data Warehouse Solutions from Oracle to non-Oracle DatabasesMigrating Data Warehouse Solutions from Oracle to non-Oracle Databases
Migrating Data Warehouse Solutions from Oracle to non-Oracle DatabasesJade Global
 
Scaling Application on High Performance Computing Clusters and Analysis of th...
Scaling Application on High Performance Computing Clusters and Analysis of th...Scaling Application on High Performance Computing Clusters and Analysis of th...
Scaling Application on High Performance Computing Clusters and Analysis of th...Rusif Eyvazli
 
Architecture by Accident
Architecture by AccidentArchitecture by Accident
Architecture by AccidentGleicon Moraes
 
Sharing massive data analysis: from provenance to linked experiment reports
Sharing massive data analysis: from provenance to linked experiment reportsSharing massive data analysis: from provenance to linked experiment reports
Sharing massive data analysis: from provenance to linked experiment reportsGaignard Alban
 
Data Base Testing Interview Questions
Data Base Testing Interview QuestionsData Base Testing Interview Questions
Data Base Testing Interview QuestionsRita Singh
 
Hadoop/Spark Non-Technical Basics
Hadoop/Spark Non-Technical BasicsHadoop/Spark Non-Technical Basics
Hadoop/Spark Non-Technical BasicsZitao Liu
 
Evaluating Machine Learning Algorithms for Materials Science using the Matben...
Evaluating Machine Learning Algorithms for Materials Science using the Matben...Evaluating Machine Learning Algorithms for Materials Science using the Matben...
Evaluating Machine Learning Algorithms for Materials Science using the Matben...Anubhav Jain
 
A Gen3 Perspective of Disparate Data
A Gen3 Perspective of Disparate DataA Gen3 Perspective of Disparate Data
A Gen3 Perspective of Disparate DataRobert Grossman
 
Arabidopsis Information Portal: A Community-Extensible Platform for Open Data
Arabidopsis Information Portal: A Community-Extensible Platform for Open DataArabidopsis Information Portal: A Community-Extensible Platform for Open Data
Arabidopsis Information Portal: A Community-Extensible Platform for Open DataMatthew Vaughn
 

Similar to LDV: Light-weight Database Virtualization (20)

ICDE 2015 - LDV: Light-weight Database Virtualization
ICDE 2015 - LDV: Light-weight Database VirtualizationICDE 2015 - LDV: Light-weight Database Virtualization
ICDE 2015 - LDV: Light-weight Database Virtualization
 
"Data Provenance: Principles and Why it matters for BioMedical Applications"
"Data Provenance: Principles and Why it matters for BioMedical Applications""Data Provenance: Principles and Why it matters for BioMedical Applications"
"Data Provenance: Principles and Why it matters for BioMedical Applications"
 
13-Essential-Data-Validation-Checks.pdf
13-Essential-Data-Validation-Checks.pdf13-Essential-Data-Validation-Checks.pdf
13-Essential-Data-Validation-Checks.pdf
 
Testistanbul 2016 - Keynote: "Performance Testing of Big Data" by Roland Leusden
Testistanbul 2016 - Keynote: "Performance Testing of Big Data" by Roland LeusdenTestistanbul 2016 - Keynote: "Performance Testing of Big Data" by Roland Leusden
Testistanbul 2016 - Keynote: "Performance Testing of Big Data" by Roland Leusden
 
Belak_ICME_June02015
Belak_ICME_June02015Belak_ICME_June02015
Belak_ICME_June02015
 
Sem tech 2011 v8
Sem tech 2011 v8Sem tech 2011 v8
Sem tech 2011 v8
 
Migrating Data Warehouse Solutions from Oracle to non-Oracle Databases
Migrating Data Warehouse Solutions from Oracle to non-Oracle DatabasesMigrating Data Warehouse Solutions from Oracle to non-Oracle Databases
Migrating Data Warehouse Solutions from Oracle to non-Oracle Databases
 
Poster (1)
Poster (1)Poster (1)
Poster (1)
 
Scaling Application on High Performance Computing Clusters and Analysis of th...
Scaling Application on High Performance Computing Clusters and Analysis of th...Scaling Application on High Performance Computing Clusters and Analysis of th...
Scaling Application on High Performance Computing Clusters and Analysis of th...
 
Architecture by Accident
Architecture by AccidentArchitecture by Accident
Architecture by Accident
 
DataHub
DataHubDataHub
DataHub
 
Sharing massive data analysis: from provenance to linked experiment reports
Sharing massive data analysis: from provenance to linked experiment reportsSharing massive data analysis: from provenance to linked experiment reports
Sharing massive data analysis: from provenance to linked experiment reports
 
Linked Data and Semantic Web Application Development by Peter Haase
Linked Data and Semantic Web Application Development by Peter HaaseLinked Data and Semantic Web Application Development by Peter Haase
Linked Data and Semantic Web Application Development by Peter Haase
 
Data Base Testing Interview Questions
Data Base Testing Interview QuestionsData Base Testing Interview Questions
Data Base Testing Interview Questions
 
Hadoop/Spark Non-Technical Basics
Hadoop/Spark Non-Technical BasicsHadoop/Spark Non-Technical Basics
Hadoop/Spark Non-Technical Basics
 
Evaluating Machine Learning Algorithms for Materials Science using the Matben...
Evaluating Machine Learning Algorithms for Materials Science using the Matben...Evaluating Machine Learning Algorithms for Materials Science using the Matben...
Evaluating Machine Learning Algorithms for Materials Science using the Matben...
 
A Gen3 Perspective of Disparate Data
A Gen3 Perspective of Disparate DataA Gen3 Perspective of Disparate Data
A Gen3 Perspective of Disparate Data
 
Arabidopsis Information Portal: A Community-Extensible Platform for Open Data
Arabidopsis Information Portal: A Community-Extensible Platform for Open DataArabidopsis Information Portal: A Community-Extensible Platform for Open Data
Arabidopsis Information Portal: A Community-Extensible Platform for Open Data
 
NISO/DCMI September 25 Webinar: Implementing Linked Data in Developing Countr...
NISO/DCMI September 25 Webinar: Implementing Linked Data in Developing Countr...NISO/DCMI September 25 Webinar: Implementing Linked Data in Developing Countr...
NISO/DCMI September 25 Webinar: Implementing Linked Data in Developing Countr...
 
NoSQL
NoSQLNoSQL
NoSQL
 

Recently uploaded

Moving Beyond Passwords: FIDO Paris Seminar.pdf
Moving Beyond Passwords: FIDO Paris Seminar.pdfMoving Beyond Passwords: FIDO Paris Seminar.pdf
Moving Beyond Passwords: FIDO Paris Seminar.pdfLoriGlavin3
 
Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?Mattias Andersson
 
Streamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project SetupStreamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project SetupFlorian Wilhelm
 
A Deep Dive on Passkeys: FIDO Paris Seminar.pptx
A Deep Dive on Passkeys: FIDO Paris Seminar.pptxA Deep Dive on Passkeys: FIDO Paris Seminar.pptx
A Deep Dive on Passkeys: FIDO Paris Seminar.pptxLoriGlavin3
 
SAP Build Work Zone - Overview L2-L3.pptx
SAP Build Work Zone - Overview L2-L3.pptxSAP Build Work Zone - Overview L2-L3.pptx
SAP Build Work Zone - Overview L2-L3.pptxNavinnSomaal
 
Scanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL CertsScanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL CertsRizwan Syed
 
Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024BookNet Canada
 
DSPy a system for AI to Write Prompts and Do Fine Tuning
DSPy a system for AI to Write Prompts and Do Fine TuningDSPy a system for AI to Write Prompts and Do Fine Tuning
DSPy a system for AI to Write Prompts and Do Fine TuningLars Bell
 
TeamStation AI System Report LATAM IT Salaries 2024
TeamStation AI System Report LATAM IT Salaries 2024TeamStation AI System Report LATAM IT Salaries 2024
TeamStation AI System Report LATAM IT Salaries 2024Lonnie McRorey
 
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptx
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptxPasskey Providers and Enabling Portability: FIDO Paris Seminar.pptx
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptxLoriGlavin3
 
What is DBT - The Ultimate Data Build Tool.pdf
What is DBT - The Ultimate Data Build Tool.pdfWhat is DBT - The Ultimate Data Build Tool.pdf
What is DBT - The Ultimate Data Build Tool.pdfMounikaPolabathina
 
"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii Soldatenko"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii SoldatenkoFwdays
 
Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 365Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 3652toLead Limited
 
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024BookNet Canada
 
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptxMerck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptxLoriGlavin3
 
Connect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck PresentationConnect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck PresentationSlibray Presentation
 
DevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache MavenDevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache MavenHervé Boutemy
 
From Family Reminiscence to Scholarly Archive .
From Family Reminiscence to Scholarly Archive .From Family Reminiscence to Scholarly Archive .
From Family Reminiscence to Scholarly Archive .Alan Dix
 
DevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platformsDevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platformsSergiu Bodiu
 

Recently uploaded (20)

Moving Beyond Passwords: FIDO Paris Seminar.pdf
Moving Beyond Passwords: FIDO Paris Seminar.pdfMoving Beyond Passwords: FIDO Paris Seminar.pdf
Moving Beyond Passwords: FIDO Paris Seminar.pdf
 
Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?
 
DMCC Future of Trade Web3 - Special Edition
DMCC Future of Trade Web3 - Special EditionDMCC Future of Trade Web3 - Special Edition
DMCC Future of Trade Web3 - Special Edition
 
Streamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project SetupStreamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project Setup
 
A Deep Dive on Passkeys: FIDO Paris Seminar.pptx
A Deep Dive on Passkeys: FIDO Paris Seminar.pptxA Deep Dive on Passkeys: FIDO Paris Seminar.pptx
A Deep Dive on Passkeys: FIDO Paris Seminar.pptx
 
SAP Build Work Zone - Overview L2-L3.pptx
SAP Build Work Zone - Overview L2-L3.pptxSAP Build Work Zone - Overview L2-L3.pptx
SAP Build Work Zone - Overview L2-L3.pptx
 
Scanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL CertsScanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL Certs
 
Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
 
DSPy a system for AI to Write Prompts and Do Fine Tuning
DSPy a system for AI to Write Prompts and Do Fine TuningDSPy a system for AI to Write Prompts and Do Fine Tuning
DSPy a system for AI to Write Prompts and Do Fine Tuning
 
TeamStation AI System Report LATAM IT Salaries 2024
TeamStation AI System Report LATAM IT Salaries 2024TeamStation AI System Report LATAM IT Salaries 2024
TeamStation AI System Report LATAM IT Salaries 2024
 
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptx
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptxPasskey Providers and Enabling Portability: FIDO Paris Seminar.pptx
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptx
 
What is DBT - The Ultimate Data Build Tool.pdf
What is DBT - The Ultimate Data Build Tool.pdfWhat is DBT - The Ultimate Data Build Tool.pdf
What is DBT - The Ultimate Data Build Tool.pdf
 
"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii Soldatenko"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii Soldatenko
 
Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 365Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 365
 
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
 
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptxMerck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptx
 
Connect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck PresentationConnect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck Presentation
 
DevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache MavenDevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache Maven
 
From Family Reminiscence to Scholarly Archive .
From Family Reminiscence to Scholarly Archive .From Family Reminiscence to Scholarly Archive .
From Family Reminiscence to Scholarly Archive .
 
DevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platformsDevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platforms
 

LDV: Light-weight Database Virtualization

  • 1. LDV: Light-weight DatabaseVirtualization Quan Pham2,Tanu Malik1, Boris Glavic3 and Ian Foster1,2 Computation Institute1and Department of Computer Science2,3 University of Chicago1,2,Argonne National Laboratory1 Illinois Institute of Technology3 Thursday, April 23, 15
  • 2. Share and Reproduce Alice wants to share her models and simulation output with Bob, and Bob wants to re-execute Alice’s application to validate her inputs and outputs. Alice Bob Thursday, April 23, 15
  • 3. Significance |reportingchecklistforlifesciencesarticles 1. Howwasthesamplesizechosentoensureadequatepower todetectapre-specifiedeffectsize? Foranimalstudies,includeastatementaboutsamplesize estimateevenifnostatisticalmethodswereused. 2. Describeinclusion/exclusioncriteriaifsamplesoranimalswere excludedfromtheanalysis.Werethecriteriapre-established? 3. Ifamethodofrandomizationwasusedtodeterminehow samples/animalswereallocatedtoexperimentalgroupsand processed,describeit. Foranimalstudies,includeastatementaboutrandomization evenifnorandomizationwasused. 4. Iftheinvestigatorwasblindedtothegroupallocationduring theexperimentand/orwhenassessingtheoutcome,state theextentofblinding. Foranimalstudies,includeastatementaboutblindingeven ReportingChecklistForLifeSciencesArticles Thischecklistisusedtoensuregoodreportingstandardsandtoimprovethereproducibilityofpublishedresults.Formoreinformation, pleasereadReportingLifeSciencesResearch. Figurelegends Eachfigurelegendshouldcontain,foreachpanelwheretheyarerelevant: theexactsamplesize(n)foreachexperimentalgroup/condition,givenasanumber,notarange; a description of the sample collection allowing the reader to understand whether the samples represent technical or biological replicates(includinghowmanyanimals,litters,cultures,etc.); astatementofhowmanytimestheexperimentshownwasreplicatedinthelaboratory; definitionsofstatisticalmethodsandmeasures: ○ verycommontests,suchast-test,simpleχ2 tests,WilcoxonandMann-Whitneytests,canbeunambiguouslyidentifiedbynameonly, butmorecomplextechniquesshouldbedescribedinthemethodssection; ○ aretestsone-sidedortwo-sided? ○ arethereadjustmentsformultiplecomparisons? ○ statisticaltestresults,e.g.,Pvalues; ○ definitionof‘centervalues’asmedianoraverage; ○ definitionoferrorbarsass.d.ors.e.m. Anydescriptionstoolongforthefigurelegendshouldbeincludedinthemethodssection. Pleaseensurethattheanswerstothefollowingquestionsarereportedinthemanuscriptitself.Weencourageyoutoincludeaspecific subsectioninthemethodssectionforstatistics,reagentsandanimalmodels.Below,providethepagenumber(s)orfigurelegend(s) wheretheinformationcanbelocated. Statisticsandgeneralmethods Reportedonpage(s)orfigurelegend(s): CorrespondingAuthorName: ________________________________________ ManuscriptNumber: ______________________________ Metrics aims to improve the reproducibility of scientific research. NY Times, Dec, 2014 Thursday, April 23, 15
  • 4. Alice’s Options 1.A tar and gzip 2. Submit to a repository 3. Build website with code, parameters, and data 4. Create a virtual machine Thursday, April 23, 15
  • 5. Bob’s Frustration 1-3. I do not find the lib.so required for building the model. 4. How do I? Lack of easy and efficient methods for sharing and reproducibility Amount of pain Bob suffers Amount of pain Alice suffers Thursday, April 23, 15
  • 8. ApplicationVirtualization for DB Applications Application Operating System File System File System Slice Pkg Copy AV Alice's Computer chdir(“/usr”) open (“lib/libc.so.6”) Thursday, April 23, 15
  • 9. ApplicationVirtualization for DB Applications Application Operating System File System File System Slice Pkg Copy AV Alice's Computer chdir(“/usr”) open (“lib/libc.so.6”)DB Server Thursday, April 23, 15
  • 10. ApplicationVirtualization for DB Applications • Applications that interact with a relational database • Examples: • Text-mining applications that download data, preprocess and insert into a personal DB • Analysis scripts using parts of a hosted database Application Operating System File System File System Slice Pkg Copy AV Alice's Computer chdir(“/usr”) open (“lib/libc.so.6”)DB Server Thursday, April 23, 15
  • 11. ApplicationVirtualization for DB Applications • Applications that interact with a relational database • Examples: • Text-mining applications that download data, preprocess and insert into a personal DB • Analysis scripts using parts of a hosted database Application Operating System File System File System Slice Pkg Copy AV Alice's Computer chdir(“/usr”) open (“lib/libc.so.6”)DB Server Thursday, April 23, 15
  • 12. Why doesn’t it work? • Application virtualization methods are oblivious to semantics of data in a database system • The database state at the time of sharing the application may not be the same as the start of the application ared among multiple users and Thus, to re-execute an applica- as of the start of the application, to understand a shared applica- application provenance are well these two types of provenance ned methods - companion web- cation virtualization - addresses o automatic mechanism for cap- on and DB provenance, these s for determining which data is they do not solve the issue of vious state, and do not address ring the binaries of commercial irtualization is currently limited not communicate to server pro- r or a database server. In fact, nicates with a database server, ord the communication between ver. This is not sufficient for sed by the application (and, thus, ackage) and to be able to reset re application execution started. a solution for the later problem, share with Bob (Figure 1). Alice would preferably like to share this application in the form of package P with Bob, who may want to re-execute the application in its entirety or may want to validate, just the analysis task, or provide his own data inputs to examine the analysis result. If Alice wants Bob to re-execute and build upon her database application, then Bob must have access to an en- vironment that consists of application binaries and data, any extension modules that the code depends upon (e.g., dynam- ically linked libraries), a database server and a database on which the application can be re-executed. Ideally, it would be useful if Alice’s environment can be virtualized and thus automatically set up for Bob. P3 P4Other experiments f1 P1 Insert t1 t2 t3 Query P2 t4 f2 Alice’s experiment Database Fig. 1: Alice’s experiment with processes P1 and P2 uses tupleThursday, April 23, 15
  • 13. LDV: Light-weight DatabaseVirtualization • Goal: Easily and efficiently share and repeat DB applications. Thursday, April 23, 15
  • 14. Key Ideas • DB application = Application (OS) part + DB part • Use data provenance to capture interactions from/to the application side to the database side • Limited formal mechanisms so far to combine the two kinds of provenance models • Create a virtualized package that can be re- executed • Either include the server and data, or replay interactions (for licensed databases) • No virtualization mechanism for database replay Thursday, April 23, 15
  • 15. Related Work • Application virtualization • Linux Containers, CDE[Usenix’11] • Packaging with annotations • Docker • Packaging with provenance • PTU1[TaPP’13], ReproZip[TaPP’13], Research Objects • Unified provenance models • based on program instrumentation [TaPP’12] 1 Q. Pham,T. Malik, and I. Foster. Using provenance for repeatability. In Theory and Practice of Provenance (TaPP), 2013. Thursday, April 23, 15
  • 16. How does LDV work? Application Operating System File System DB Server Execution Trace DB Server DB Slice File System Slice Pkg CopyLDV Alice's Computer Alice’s Machine ldv-audit db-app • Monitoring system calls • Monitoring SQL • Server-included packages • Server-excluded packages • Execution traces • Relevant DB and filesystem slices Thursday, April 23, 15
  • 17. • Redirecting file access • Redirecting DB access • Server-included packages • Server-excluded packages File System Bob's ComputerUser Application Operating System DB Server Execution Trace DB Server DB Slice File System Slice Pkg LDV Redirect Bob’s Machine ldv-exec db-app How does LDV work? Thursday, April 23, 15
  • 18. Example Alice:~$ ldv-audit app.sh Application package created as app-pkg Alice:~$ ls app-pkg app.sh src data Alice:~$echo "Hi Bob, Please find the pkg --Alice" | mutt -s "Sharing DB Application -a "./app-pkg" -- bob-vldb2015@gmail.com Bob:~$ ls . app-pkg Bob:~$ cd app-pkg Bob:~$ ls app.sh src data Bob:~$ldv-exec app.sh Running app-pkg.... Ubuntu 14.04 (Kernel 3.13) + Postgres 9.1 CentOS 6.2 (Kernel 2.6.32) + MySQL Thursday, April 23, 15
  • 19. LDV Issues • Monitoring system calls • Monitoring SQL • Execution traces • Relevant DB slices • Redirecting file access • Server-included packages • Server-excluded packages • Redirecting DB access Thursday, April 23, 15
  • 20. LDV Issues • Monitoring system calls • Monitoring SQL • Execution traces • Relevant DB slices • Redirecting file access • Server-included packages • Server-excluded packages • Redirecting DB access Thursday, April 23, 15
  • 21. An Execution Trace A B P1 Insert1 Insert2 t1 t2 t3 Query t4 t5 P2 C [1, 6] [7, 8] [5, 5] [8, 8] [5, 5] [5, 5] [8, 8] [9, 9] [9, 9] [9, 9] [9, 9] [9, 9] [9, 9] [9, 9] [7, 12] Fig. 2: An execution trace with processes and database operations t1 t2 t3 Q1 [4, 4] [4, 4] [4, 4] [4, 4] Fig. 3: PLin trace and data de A B P1 [1, 5] [5, 7] [2, 3] [8, 8] Fig. 4: PBB trace and data de (a) the first process reads file f0 and the last process writes file f0 , and (b) each process Pi was executed by process Pi 1. Example 6. Consider the trace shown in Figure 4. Process P1 reads files A and B and writes files C and D. Thus, both graph. In contrast, we assume the temporal c given (recorded when creating an execution tr these annotations to restrict what edges have to Similarly, Dey et al. [8] determine all possible ord a file a process a tuple a query temporal annotations Uses provenance entities and activities to model the execution of a DB application Thursday, April 23, 15
  • 22. Data Dependencies from Provenance Systems t1 t2 t3 Query t4 t5 P2 C [9, 9] [9, 9] [9, 9] [9, 9] [9, 9] [9, 9] [9, 9] [7, 12] t1 t2 t3 Q1 t4 [4, 4] [4, 4] [4, 4] [4, 4] Fig. 3: PLin trace and data dependenci A B P1 C D [1, 5] [5, 7] [2, 3] [8, 8] Fine-Grained DB Provenance t1 t2 t3 Q1 t4 [4, 4] [4, 4] [4, 4] [4, 4] Fig. 3: PLin trace and data dependencies. A B P1 C D [1, 5] [5, 7] [2, 3] [8, 8] Fig. 4: PBB trace and data dependencies. st, we assume the temporal constraints as when creating an execution trace) and use s to restrict what edges have to be inferred. al. [8] determine all possible orders of events le for an OPM provenance graph. File Operations A DB execution trace has more edges than those determined by individual provenance systems A combined execution trace models the execution of a DB application including its processes, file operations, and DB accesses based on a OS and a DB provenance model. Definition 6 (Combined Execution Trace). Let PDB and POS be DB and OS provenance models. Every execution trace for PDB+OS is a combined execution trace for PDB and POS. Example 3. A combined execution trace for the PLin and PBB models is shown in Figure 2. This trace models the execution of two processes P1 and P2. Process P1 reads two files A and B, and executes two insert statements (at time 5 and 8 respectively). These insert statements create three tuple versions t1, t2, and t3. Process P2 executes a query which returns tuples t4 and t5. These tuples depend on tuples t1 and t3. Finally, process P2 writes file C. VI. DATA DEPENDENCIES The above definitions describe interactions of activities and entities in an execution trace of a provenance model, but do not model data dependencies, i.e., dependencies between entities. In our model, a dependency is an edge between two entities e and e0 where a change to the input node (e0 ) may result in a change to the output node (e). Given a provenance model, de- pendency information may or may not be explicitly available; it depends upon the granularity at which information about entities and activities is tracked and stored. For instance, the blackbox provenance model PBB operates at the granularity of processes and files and may not compute exact dependency information. Consider a process P that reads from files A and sales id price {t1} 1 5 {t2} 2 11 {t3} 3 14 result ttl {t2, t3} 25 Fig. 5: Annotated relation sales and query result compute provenance polynomials (and thus also Lineage) on demand for an input query. In the following we will us Lin(Q, t) to denote the Lineage of a tuple t in the result of query Q. Example 4. Consider the sales table shown in Fig ure 5. The Lineage of each tuple in the sales ta ble is a singleton set containing the tuple’s identi fier. The result of a query SELECT sum(value) AS ttl FROM sales WHERE price > 10 is a single row with ttl = 11+14 = 25. The Lineage contains all tuples (t2 and t3) tha were used to compute this results. We define data dependencies in the PLin model based on Lineage. We connect each tuple t in the result of a query Q to all input tuples of the query that are in t’s Lineage. Similarly we connect a modified tuple t in the result of an update to th corresponding tuple t0 in the input of the update. Definition 7 (PLin Data Dependencies). Let G be a PLin trace. Let Lin(s, t) denote the Lineage of tuple t in the resul of DB operation s, and let t and t0 denote entities (tuples) The dependencies D(G) ⇢ D ⇥ D of G are defined as: Using PTU1 Using Perm2 1 Q. Pham,T. Malik, and I. Foster. Using provenance for repeatability. In Theory and Practice of Provenance (TaPP), 2013. 2 B. Glavic et al. Perm: Processing Provenance and Data on the same Data Model through Query Rewriting. In ICDE, 2009. Thursday, April 23, 15
  • 23. Can we use temporal annotations and known direct data dependencies to infer a sound and complete set of dependencies that helps us determine the smallest size repeatability package? Key Question Thursday, April 23, 15
  • 24. Axioms for Dependency Inference • no direct data dependencies implies there is no data flow • state of node at point in time depends on past interactions only • flow of data should not violate temporal causality Thursday, April 23, 15
  • 25. Inferring Dependencies (a) No Dependency between C and A A P1 B P2 C[2, 3] [6, 7] [1, 5] [6, 6] (b) C depends on A at time 4 A P1 B P2 C[1, 1] [4, 7] [2, 5] [1, 6] (c) No Dependency between C and A (a) No Dependency between C and A A P1 B P2 C[2, 3] [6, 7] [1, 5] [6, 6] (b) C depends on A at time 4 A P1 B P2 C[1, 1] [4, 7] [2, 5] [1, 6] (c) No Dependency between C and A1 4 5 6 2 6 5 no such sequence exists • to determine whether information has flown from A to C • find increasing sequence of times for edges so each time lies in edge’s interval sequence shown on the left Thursday, April 23, 15
  • 26. Experiments • 3 Metrics • Performance • Usability • Generality 1e-05 0.0001 Test Prepare Inserts First Select Other Selects Updates 1e-05 0.0001 Test Prepare Inserts First Select Other Selects U Fig. 7: Execution time of each step in an execution of TPC-H (a) Q1 1e-05 0.0001 0.001 0.01 0.1 1 10 100 1000 Test Prepare Inserts First Select Other Selects Updates Executiontime(seconds) PostgreSQL 0.03 0.00357 0.562 0.3753 0.00084 Open-Source DB Server scenario 178.2 0.003042 0.492 0.3921 0.001 Proprietary DB Server scenario 0.016 4E-05 0.016 0.0003 0.00018 (b) Q2 1e-05 0.0001 0.001 0.01 0.1 1 10 100 1000 Test Prepare Inserts First Select Other Selects U Executiontime(seconds) Postgre 0.03 0.00348 0.088 0.02872 Open-Source DB Server sce 34.09 0.003072 0.18 0.08532 Proprietary DB Server sce 0.036 0.000376 0.218 0.1555 Fig. 8: Re-Execution time of each step in an execution of TPC-H Package Software binaries Server binaries Data directory Database provenance PTU 3 3 3(full) 7 Open-Source DBS 3 3 3(empty) 3 Proprietary DBS 3 7 7 3 TABLE III: Content of PTU and LDV packages: PTU pack- ages contain data directory of the full database, whereas Open- Source DBS LDV packages contain a data directory of an empty database (created by the initdb command) 100 200 300 400 500 Totalpackagesize(MB) ApplicationVirtualization + Database LDV + Open-source DB LDV + Proprietary DB Thursday, April 23, 15
  • 27. TPC-H Queries • Most TPC-H queries touch large fractions of tables • Modified by varying parameters and selectivity TABLE II: The 18 TPC-H benchmark queries used in our experiments Queries SQL PARAM Sel. Q1-1 to Q1-5 SELECT l quantity, l partkey , l extendedprice , l shipdate , l receiptdate FROM lineitem WHERE l suppkey BETWEEN 1 AND PARAM 10, 20, 50, 100, 250 1%, 2%, 5%, 10%, 25% Q2-1 to Q2-4 SELECT o comment, l comment FROM lineitem l, orders o, customer c WHERE l.l orderkey = o.o orderkey AND o.o custkey = c.c custkey AND c.c name LIKE ’%PARAM%’; 0000000, 000000, 00000, 0000 66%, 6.6%, 0.66%, 0.06% Q3-1 to Q3-4 SELECT count(⇤) FROM lineitem l, orders o, customer c WHERE l.l orderkey = o.o orderkey AND o.o custkey = c.c custkey AND c.c name LIKE ’%PARAM%’; 0000000, 000000, 00000, 0000 66%, 6.6%, 0.66%, 0.06% Q4-1 to Q4-5 SELECT o orderkey, AVG(l quantity) AS avgQ FROM lineitem l, orders o WHERE l.l orderkey = o.o orderkey AND l suppkey BETWEEN 1 AND PARAM GROUP BY o orderkey; 10, 20, 50, 100, 250 1%, 2%, 5%, 10%, 25% (a) Audit 0.001 0.01 0.1 1 10 100 1000 10000 Executiontime(seconds) PostgreSQL + PTU Server-included package Server-excluded package (b) Replay 0.001 0.01 0.1 1 10 100 1000 10000 Executiontime(seconds) PostgreSQL + PTU 0.01001 0.08 0.053 01 Server-included package 4.19 0.00063 0.05 0.025 .0003 Server-excluded package 0.01 0.01 0.009 01 Thursday, April 23, 15
  • 28. Size Comparison 10 100 1000 10000 1-1 1-2 1-3 1-4 1-5 2-1 2-2 2-3 2-4 3-1 3-2 3-3 3-4 4-1 4-2 4-3 4-4 4-5 Packagesize(MB) Query PTU package Server-included package Server-excluded package Fig. 9: LDV packages are significantly smaller than PTU packages when queries have low selectivity. and the LDV packages. The VMI is 8.2 GB: 80 times larger than the average LDV package (100MB). To evaluate runtime performance, we instantiate this VMI using the same number of cores and memory as in our machine to execute our queries. [4] C. T. Bro ivory.idy [5] J. Chene Foundati [6] F. Chirig provenan [7] F. S. Ch to suppo [8] S. C. De provenan [9] J. Freire reproduc 14(4), 20 [10] B. Glavi Data Mo [11] B. Glavi provenan practice [12] C. A. G for work on Workfl [13] P. J. Guo create p Conferen [14] B. How research. • LDV packages are significantly smaller than PTU packages when queries have low selectivity • TheVMI is 8.2 GB: 80 times larger than the average LDV package (100MB). Thursday, April 23, 15
  • 29. Audit and Replay1e-05 0.0001 0.001 0.01 0.1 Inserts First Select Other Selects Updates Executio 1e-05 0.0001 0.001 0.01 0.1 Initialization Inserts First Select Other Selects Updates Executio 1E-05 0.0 0.0001 0.00063 0 0.0003 0.0 2E-05 0.0 0.0 0.0001 Fig. 7: Execution time of each step in an application with query Q1-1 (a) Audit 0.001 0.01 0.1 1 10 100 1000 10000 1-1 1-2 1-3 1-4 1-5 2-1 2-2 2-3 2-4 3-1 3-2 3-3 3-4 4-1 4-2 4-3 4-4 4-5 Executiontime(seconds) Query PostgreSQL + PTU Server-included package Server-excluded package (b) Replay 0.001 0.01 0.1 1 10 100 1-1 1-2 1-3 1-4 1-5 2-1 2-2 2-3 2-4 3-1 3-2 3-3 3-4 4-1 4-2 4-3 4-4 4-5 Executiontime(seconds) Query PostgreSQL + PTU Server-included package Server-excluded package VM Fig. 8: Execution time for each query, during audit (left) and replay (right) LE III: Package Contents: PTU packages contain all data of the full DB, whereas server-included LDV packages in the data files of an empty DB. kage type Software binaries DB server Data files DB provenance U 3 3 3(full) 7 V server-included 3 3 3(empty) 3 V server-excluded 3 7 7 3 tuples needed to re-execute the application—which, fo queries, is at most ⇠25% of all tuples. Server-excluded packages are often yet smaller, because they contain on query results—which, for many of our experiment queri smaller than the tuples required for re-execution. Ho recall that server-excluded packages have less flexibilit do server-included packages. LDV amortizes audit cost significantly at replay time Thursday, April 23, 15
  • 30. Summary • LDV permits sharing and repeating DB applications • LDV combines OS and DB provenance to determine file and DB slices • LDV creates light-weight virtualized packages based on combined provenance • Results show LDV is efficient, usable, and general • LDV at http://github.com/lordpretzel/ldv.git Thursday, April 23, 15
  • 32. Inferring Dependencies s: ncies nter- ns of entity n the nance ncies, h do there C). e file (a) No Dependency between C and A A P1 B P2 C[2, 3] [6, 7] [1, 5] [6, 6] (b) C depends on A at time 4 A P1 B P2 C[1, 1] [4, 7] [2, 5] [1, 6] (c) No Dependency between C and A A P1 B P2 C[1, 1] [4, 7] [2, 5] [1, 6] Fig. 6: Example traces with different temporal annotations s: ncies nter- ns of ntity n the ance cies, h do there C). e file (a) No Dependency between C and A A P1 B P2 C[2, 3] [6, 7] [1, 5] [6, 6] (b) C depends on A at time 4 A P1 B P2 C[1, 1] [4, 7] [2, 5] [1, 6] (c) No Dependency between C and A A P1 B P2 C[1, 1] [4, 7] [2, 5] [1, 6] Fig. 6: Example traces with different temporal annotations between e0 and e, because if there is no path between e0 and s: ncies inter- ns of entity n the nance ncies, th do there ! C). e file (a) No Dependency between C and A A P1 B P2 C[2, 3] [6, 7] [1, 5] [6, 6] (b) C depends on A at time 4 A P1 B P2 C[1, 1] [4, 7] [2, 5] [1, 6] (c) No Dependency between C and A A P1 B P2 C[1, 1] [4, 7] [2, 5] [1, 6] Fig. 6: Example traces with different temporal annotations between e0 and e, because if there is no path between e0 and No Depedency between C and A C Depends on A at time 4 No Dependency between C and A Thursday, April 23, 15