Adaptable SLA-Aware Consistency Tuning for
Quorum-replicated Datastores
Subhajit Sidhanta, Wojciech Golab, Supratik Mukhopadhyay, and Saikat Basu
Abstract—Users of distributed datastores that employ quorum-based replication are burdened with the choice of a suitable client-centric
consistency setting for each storage operation. This choice is difficult to reason about, as it requires deliberating about the tradeoff between latency and staleness, i.e., how stale (old) the result is. The latency and staleness for a given operation depend
on the client-centric consistency setting applied, as well as dynamic parameters such as the current workload and network condition.
We present OptCon, a machine learning-based predictive framework, that can automate the choice of client-centric consistency setting
under user-specified latency and staleness thresholds given in the service level agreement (SLA). Under a given SLA, OptCon predicts
a client-centric consistency setting that is matching, i.e., it is weak enough to satisfy the latency threshold, while being strong enough
to satisfy the staleness threshold. While manually tuned consistency settings remain fixed unless explicitly reconfigured, OptCon tunes
consistency settings on a per-operation basis with respect to the changing workload and network state. Using decision tree learning, OptCon yields a cross-validation error of 0.14 in predicting matching consistency settings under the latency and staleness thresholds given in the SLA.
We demonstrate experimentally that OptCon is at least as effective as any manually chosen consistency setting in adapting to the SLA
thresholds for different use cases. We also demonstrate that OptCon adapts to variations in workload, whereas a given manually chosen
fixed consistency setting satisfies the SLA only for a characteristic workload.
Index Terms—distributed databases, storage management, distributed systems, high availability, distributed file systems, middleware, performance evaluation, machine learning, reliability, availability, and serviceability
✦
1 INTRODUCTION
Many quorum-based distributed data stores [1, 2, 3] allow developers to explicitly declare the desired client-
centric consistency setting (i.e., consistency observed from
the viewpoint of the client application) for an operation.
Such systems accept the client-centric consistency settings
for an operation in the form of a runtime argument, typically
referred to as the consistency level. The performance of the
system, with respect to a given operation, is affected by the
choice of the consistency level applied [4].
From the viewpoint of the user, the most important per-
formance parameters affected by the consistency level ap-
plied are the latency and client-observed consistency anoma-
lies, i.e., anomalies in the result of an operation observed
from the client application, such as stale reads. Consistency
anomalies are measured in terms of the client-centric stal-
eness [5], i.e., how stale (old) is the version of the data
item (observed from the client application) with respect
to the most recent version. According to the consistency
level applied, the system waits for coordination among a
specific number of replicas containing copies (i.e., versions)
of the data item accessed by the given operation [1]. If the
system waits for coordination among a smaller number of
• S. Sidhanta is currently with INESC-ID / IST, University of Lisbon,
Portugal. This work was done while S. Sidhanta was with Louisiana State
University.
E-mail: ssidhanta@gsd.inesc-id.pt
• S. Mukhopadhyay is with Louisiana State University, USA. Email:
supratik@csc.lsu.edu
• S. Basu was with Louisiana State University, USA. Email:
sbasu8@lsu.edu
• W. Golab is with University of Waterloo, Canada. Email: wgolab@uwaterloo.ca
Manuscript received May 5, 2005; revised October 26, 2016.
replicas, the chance of getting a stale result (i.e., an older
version) increases. Also, the latency for the given operation
depends on the waiting time for the above coordination;
hence, in turn, depends on the consistency level applied.
For example, a weak consistency level for a read operation
in Cassandra [1], like READ ONE, requires only one of the
replicas to coordinate successfully, resulting in low latency
and high chances of a stale read. Hence, while choosing the
consistency level, developers must consider how this choice
affects the latency and staleness for a given operation.
The chosen consistency level must be matching with
respect to the latency and staleness thresholds specified in
the given service level agreement (SLA), i.e., it must be weak
enough to satisfy the latency threshold, while being strong
enough to satisfy the staleness threshold. Consider a typical
use case at Netflix where a user browses recommended
movie titles. Such use cases require real-time response [6].
Hence the SLA typically comprises a low latency threshold and a higher staleness threshold. For the given SLA, workload, and
network state, the matching choice is a weak read-write
consistency level (like ONE/ANY in Cassandra). If the
developer applies a stronger consistency level, the resulting
high latency might violate the SLA.
With the current state-of-the-art, the developers have to
manually determine a matching consistency level for a given
operation at development time. Reasoning about the above
choice is difficult because of: 1) the large number of possible
SLAs, 2) unpredictable factors like changing network state
and varying workload that impact the latency and staleness,
and 3) the absence of a well-formed mathematical relation
connecting the above parameters [7]. This makes automated
consistency tuning under latency and staleness thresholds
in the SLA a highly desirable feature for quorum-based
datastores.
We present OptCon [8], a framework that automatically
determines a matching consistency level for a given opera-
tion under a given SLA. Due to the absence of a closed-form
mathematical model capturing the impact of the consistency
level, current workload, and network state, on the observed
latency and staleness, OptCon applies machine learning
[9] to train a model for predicting a matching consistency
level under the given SLA, workload, and network state.
For the Netflix use case, taking into account the read-
heavy workload, the current network state, and the SLA
thresholds, OptCon predicts a weak consistency level. The
contributions of this paper are:
• We introduce OptCon, a novel machine learning-
based framework, that can automatically predict a
matching consistency level that satisfies the latency
and staleness thresholds specified in a given SLA,
i.e., the predicted consistency level is weak enough
to satisfy the latency threshold, and strong enough
to satisfy the staleness threshold. Using decision tree
learning, OptCon yields a cross validation error of
0.14 in predicting a matching consistency level under
the given SLA.
• Experimental results demonstrate that OptCon is
at least as effective as any manually chosen con-
sistency level in adapting to the different latency
and staleness thresholds in SLAs. Furthermore, we
also demonstrate experimentally that OptCon sur-
passes any manually chosen fixed consistency level
in adapting to a varying workload, i.e., OptCon
satisfies the SLA thresholds for variations in the
workload, whereas a manually chosen consistency
level satisfies the SLA for only a subset of workload
types.
2 MOTIVATION
2.1 Choice of the SLA Parameters
Following prior research by Terry et al. [4], we include
latency and staleness in the SLA for operations on quorum-
based stores. The choice of consistency level directly affects
the latency for a given operation on a quorum-based data-
store [1]. While Terry et al. [4] use categorical attributes for
specifying the desired consistency (such as read-my-write,
eventual, etc.) in the SLA, we use a more fine-grained SLA that accepts threshold values for the client-centric staleness
[10]. Golab et al. [10] demonstrate that both the proportion and severity of stale results increase from stronger to weaker consistency levels. The use of a client-centric
staleness metric in the SLA enables the developer to specify,
on a per-operation basis, the exact degree of staleness that a given client application can tolerate.
Following Terry et al. [4], we do not include throughput
in the SLA. But it is still desirable to have throughput [11] as
a secondary optimization criterion, once the SLA thresholds
are satisfied. Depending on the SLA, a set of consistency
levels can be matching with respect to the latency and
staleness thresholds given in the SLA. Consider a real-
time application (such as an online shopping cart) that
demands moderately low latency and tolerates relatively
higher staleness in the SLA. In such cases, any weaker con-
sistency level (like ONE or TWO in Cassandra, depending on the replication factor) can yield latency and staleness
values within the given SLA thresholds. In such cases of
multiple matching consistency levels, OptCon chooses the
one that maximizes throughput. Also, we do not consider
parameters like replication factor, number of server nodes,
and certain other server-centric parameters in our model,
since our focus is optimizing client-centric performance [10].
TABLE 1: Glossary of OptCon Model Parameters
L  — Average latency, i.e., the average time delay for a storage operation over a one-minute time period
T  — Average throughput, i.e., operations/second, over a one-minute time period
S  — Client-centric staleness, given by the 95th percentile Γ score [10]
Tc — Thread count, i.e., the number of user threads spawned by the storage operations invoked from the client application
P  — Packet count, i.e., the number of network packets transmitted to execute a given operation
RW — Read-write proportion in the client workload
C  — Consistency level applied for an operation
2.2 Motivation for Automated Consistency Tuning
The large number of possible use cases [11], each comprising
different SLA thresholds for latency and staleness, makes
manual determination of a matching consistency level a
complex process. The latency and staleness are also affected
[7] by the following independent variables (parameters):
1) the packet count, i.e., the number of packets transmitted during a given operation, which represents the network state [1], 2) the read proportion (i.e., the proportion of reads in the workload), and 3) the thread count. Parameters
2 and 3 represent the workload characteristics. We list the
parameters in Table 1. Manually reasoning about all of
these parameters combined together is difficult, even for a
skilled and experienced developer. Further, unpredictable
variations in workload, network congestion, and node fail-
ures may render a particular consistency level, manually
pre-configured at development time, infeasible at runtime
[12, 4].
3 CHALLENGES
Currently, there is no closed-form mathematical model [7]
that represents the effect of the consistency level on the
observed staleness and latency, with respect to the inde-
pendent variables. Bailis et al. [7] base their work upon
the simplifying assumption that writes do not execute con-
currently with other operations. Hence, it is not clear from
[7] how to compute the latency and staleness from a real
workload. Pileus and Tuba [4, 13] are the only systems that
provide fine-grained consistency tuning using SLAs. But,
instead of predicting, these systems perform actual trials on
the system, and select the consistency level corresponding
to the SLA row that produces maximum resultant utility,
based on the trial outcomes. The trial-based technique can
produce unreliable results due to the unpredictable param-
eters like network conditions and workload that affect the
observed latency and staleness. Thus, predictions based on
the outcomes of the trial phase may be unsuitable in the
actual running time of the operation. Sivaramakrishnan et
al. [14] use a static analysis approach to determine the
weakest consistency level that satisfies the correctness con-
ditions declared in the form of a contract, i.e., a specification
of client-centric consistency properties that the application
must satisfy. They do not consider the tradeoff between
staleness and latency, and cannot dynamically adapt to
varying workload and network state.
4 DESIGN OF OPTCON
4.1 Design Overview
In the absence of a closed-form mathematical model relating
the consistency level to the observed latency and staleness,
OptCon leverages machine learning-based prediction tech-
niques [9], following the footsteps of prior research [15].
OptCon is the first work that trains on historic data, and
learns a model M relating the consistency level and the
input parameters (i.e., the workload and network state) to
the observed latency and staleness. The model M can pre-
dict a matching consistency level with respect to the given
SLA thresholds for latency and staleness, under the current
workload and network state. A matching consistency level
is weak enough to satisfy the latency threshold, and simul-
taneously strong enough to satisfy the staleness threshold.
Our definition of matching consistency level is a modified
version of the correct consistency level in QUELA [14]. For
multiple matching consistency levels, OptCon predicts the
consistency level that maximizes the throughput.
In contrast to program verification-based static (compile
time) approaches [14], OptCon provides on-the-fly predic-
tions based on dynamic (runtime) values of the input pa-
rameters, i.e., the current workload and the network state
(Section 2.2). Thus, OptCon tunes the datastore accord-
ing to any dynamic variations in the workload and any
given SLA. Furthermore, unlike the state-of-the-art trial-
based techniques that base decisions on the values of the
parameters obtained during the trial phase [4, 13], OptCon
provides reliable predictions (see Section 6.9 and 6.10), tak-
ing into consideration the actual runtime values of the input
parameters.
TABLE 2: Example subSLAs
Average Latency    Staleness
≤ 100 ms           ≤ 5 ms
≤ 50 ms            ≤ 10 ms
≤ 25 ms            ≤ 15 ms
Like Terry et al. [4], users provide latency and staleness
thresholds to OptCon in the form of an SLA, illustrated in
Table 2. Each SLA consists of rows of subSLAs, where each
subSLA row, in turn, comprises two columns containing the
thresholds for latency and staleness, respectively. Instead of
categorical attributes [4], OptCon uses threshold values for
staleness [10] in the subSLAs. Unlike Terry et al., subSLAs in
OptCon do not have a utility column. Also, subSLAs are not
necessarily ordered. If any one (or all) of the thresholds for a
particular subSLA are violated, OptCon marks the operation
as failed with respect to the given subSLA. The handling of
such failure cases will depend on the application logic, and
is not part of the scope of this paper.
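The subSLA check itself is a simple threshold comparison. The sketch below illustrates it; the SubSLA class, its fields, and its method names are hypothetical simplifications introduced here for illustration, not OptCon's actual API.

```java
// Minimal sketch of a subSLA with latency and staleness thresholds.
// Class and member names are illustrative, not OptCon's actual API.
public class SubSLA {
    private final double latencyThresholdMs;   // e.g., 100 ms (first row of Table 2)
    private final double stalenessThresholdMs; // e.g., 5 ms

    public SubSLA(double latencyThresholdMs, double stalenessThresholdMs) {
        this.latencyThresholdMs = latencyThresholdMs;
        this.stalenessThresholdMs = stalenessThresholdMs;
    }

    // An operation fails the subSLA if either threshold is violated.
    public boolean isSatisfied(double observedLatencyMs, double observedStalenessMs) {
        return observedLatencyMs <= latencyThresholdMs
            && observedStalenessMs <= stalenessThresholdMs;
    }
}
```

For example, new SubSLA(100, 5).isSatisfied(80, 3) returns true for the first subSLA row of Table 2, whereas an observed latency of 120 ms would mark the operation as failed under that subSLA.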
Instead of modifying the source code of distributed data-
stores, OptCon is designed as a wrapper over existing dis-
tributed storage systems. We have implemented and tested
OptCon on Cassandra, which follows the Dynamo [16]
model. Among the Dynamo-style systems, while Cassandra
organises the data in the form of column families, Voldemort
[17] and Riak [2] are strictly key-value stores. Also, Cassan-
dra and Riak are designed to enable faster writes, whereas
Voldemort enables faster reads. But the principles governing
the internal synchronization mechanisms are similar for all
these systems. Hence the consistency level choices affect the
observed latency and staleness in a similar fashion for all
these systems. Thus, OptCon can potentially be integrated
with any Dynamo-style system.
4.2 Architecture
Fig. 1: The architecture of OptCon: rectangles denote mod-
ules and the folded rectangle denotes a dataset.
OptCon (refer to Figure 1) consists of the following
modules: 1) The Logger module records the independent
variables (Section 2.2), the observed latency, and staleness,
obtained using experiments performed on the datastore
with benchmark workloads. It collates these parameters into
a training data corpus, which acts as the training dataset.
2) The Learner module learns the model M (refer Section
6.3) from the training dataset, and predicts a matching
consistency level that maximizes the throughput. 3) The
Client module calls the other modules, and executes the
given operation on the datastore.
During the training phase, the Client module runs a
simulation workload on the given quorum-based datastore.
The Client calls the Logger module (Figure 1) to collect
the independent variables (parameters) and the observed
parameters for the operation from the JMX interface [1],
and appends these parameters into the training data. The
Client then calls the Learner module, which trains the
model M from the training data applying machine learning
techniques. During the prediction phase, the Client calls
the Learner, with runtime arguments comprising the sub-
SLA thresholds and the current values of the independent
variables. The subSLA (and also the SLA) can be varied
on a per-operation basis. The Learner predicts a matching
consistency level from the learnt model M. The Client
performs the given operation on the datastore with the
predicted consistency level.
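To make the prediction-phase flow concrete, the sketch below shows how a client might obtain a predicted consistency level and apply it on a per-operation basis, using the DataStax Java driver (2.x/3.x series) as an example client library. The Learner interface, its predict signature, and the way the independent variables are passed in are illustrative assumptions, not OptCon's actual API.

```java
import com.datastax.driver.core.ConsistencyLevel;
import com.datastax.driver.core.Session;
import com.datastax.driver.core.SimpleStatement;

// Hypothetical interface exposed by the Learner module.
interface Learner {
    ConsistencyLevel predict(double latencyThresholdMs, double stalenessThresholdMs,
                             double readProportion, int threadCount, long packetCount);
}

public class OptConClient {
    private final Session session; // Cassandra session from the DataStax driver
    private final Learner learner;

    public OptConClient(Session session, Learner learner) {
        this.session = session;
        this.learner = learner;
    }

    // Executes a CQL statement with the consistency level predicted under the given subSLA.
    public void execute(String cql, double latencyThresholdMs, double stalenessThresholdMs,
                        double readProportion, int threadCount, long packetCount) {
        ConsistencyLevel level = learner.predict(latencyThresholdMs, stalenessThresholdMs,
                                                 readProportion, threadCount, packetCount);
        SimpleStatement stmt = new SimpleStatement(cql);
        stmt.setConsistencyLevel(level); // apply the predicted level on a per-operation basis
        session.execute(stmt);
    }
}
```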
4.3 Design of the Logger Module
4.3.1 Model Parameters Collected by the Logger
For the reasons explained in Section 2.1, the latency L, i.e.,
time delay for the operation, and client-centric staleness
S, are considered as the model parameters that need to
satisfy a given SLA. The throughput T is considered as a
secondary optimization criterion. As discussed in Section
2.1, server centric parameters, like replication factor and
keyspace size, are not considered. Following the reasoning
in Section 2.2, the independent variables (parameters) for
learning the model M are: 1) the read proportion (RW )
in the operation, 2) the thread count (Tc), i.e., the number
of user threads spawned by the client, 3) the packet count
(P), i.e., the number of network packets transmitted during
the operation, and 4) the consistency level C, provided as
a categorical attribute containing attribute values specific to
the given datastore.
Most operations on quorum based stores are not instan-
taneous, but are executed over certain time intervals [10].
Following [7], the Logger uses the average latency over a
time interval of one minute for measuring L. The Logger
computes S in terms of the Γ metric of Golab et al. [10]. As
demonstrated in the above work, Γ is preferred over other
client-centric staleness measures for its proven sensitivity
to workload parameters and consistency level. Section 6.4
establishes the significance of the above parameters with
respect to the OptCon model. Section 6.5 demonstrates the
accuracy of the model comprising the above parameters.
4.3.2 Computation of Γ as the Metric for Client-centric Staleness
The metric Γ is based upon Lamport’s atomicity property
[18], which states that operations appear to take effect in
some total order that reflects their “happens before” relation
in the following sense: if operation A finishes before opera-
tion B starts then the effect of A must be visible before the
effect of B. The gamma metric defines observed consistency
in terms of the responses of storage operations observed
from the clients, and does not refer directly to server-
side implementation details of the datastore, like replication
factor, etc. As a result, the metric is implementation inde-
pendent, i.e., it can be applied to any key-value datastore. Γ
is a bound on time-based staleness, i.e., every read returns
a value at most Γ time units stale. A Γ score quantifies
the bound on the timestamp difference between operations
on the same key that return different values. We say that
a trace of operations recorded by the logger is Γ-atomic
if it is atomic in Lamport’s sense, or becomes atomic after
each operation in the trace is transformed by decreasing its
starting time by Γ/2 time units, and increasing its finish
time by Γ/2 time units. In this context the start and finish
times are shifted in a mathematical sense for the purpose of
analyzing a trace, and do not imply injection of artificial
delays; the behavior of the storage system is unaffected.
The per-key Γ score quantifies the degree of client-centric
staleness incurred by operations applied to a particular key.
We considered average per-key Γ, but settled on the percentile
per-key Γ since it takes into account the skewed nature of
the workloads.
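To make the intuition concrete, the following is a simplified, illustrative per-key staleness score in the spirit of Γ: for each read that returns a value older than the newest write that finished before the read started, it records the time elapsed since that newer write finished, and reports the 95th percentile of these gaps. This is only an approximation written for this sketch; it is not the exact Γ computation of Golab et al. [10], and the Op type and its fields are hypothetical.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Simplified, illustrative staleness score for a single key's trace, in the spirit of Γ.
// Not the exact algorithm of Golab et al. [10]; types and field names are illustrative.
public class StalenessSketch {
    public static class Op {
        final long startMs, finishMs;
        final String value;
        final boolean isWrite;
        public Op(long startMs, long finishMs, String value, boolean isWrite) {
            this.startMs = startMs; this.finishMs = finishMs;
            this.value = value; this.isWrite = isWrite;
        }
    }

    // Returns the 95th percentile of observed staleness gaps (in ms) for one key's trace.
    public static double percentileStaleness(List<Op> trace) {
        List<Long> gaps = new ArrayList<>();
        for (Op read : trace) {
            if (read.isWrite) continue;
            // Find the newest write that finished before this read started.
            Op newest = null;
            for (Op w : trace) {
                if (w.isWrite && w.finishMs < read.startMs
                        && (newest == null || w.finishMs > newest.finishMs)) {
                    newest = w;
                }
            }
            // If the read returned an older value, record how long ago the newer write finished.
            if (newest != null && !newest.value.equals(read.value)) {
                gaps.add(read.startMs - newest.finishMs);
            }
        }
        if (gaps.isEmpty()) return 0.0;
        Collections.sort(gaps);
        int idx = (int) Math.ceil(0.95 * gaps.size()) - 1;
        return gaps.get(idx);
    }
}
```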
4.4 Design of the Learner Module
In OptCon, building the model M from training examples
(Figure 1) is a one-time process, i.e., the same model can be
reused for predicting consistency levels for all operations.
Even with node failures, the model may not need to be
regenerated since the training data remain unchanged; only the failed predictions resulting from node failures need to
be rerun. However, the model will need to be retrained in
certain cases where the behavior of the system with respect
to the latency and staleness changes, such as when a storage
system receives a software or hardware upgrade.
4.4.1 Overview of Possible Learning Techniques
We provide a brief overview of the possible Learning tech-
niques and their applicability to OptCon. We describe the
implementation details, including the various configuration
parameters, of each learning algorithm in Section 5. Perfor-
mance analysis and insights gathered from each technique
are given in Section 6.5.2. We consider Logistic regression
and Bayesian learning as they can help visualize the signifi-
cance of the model parameters and the dependency relations
among these parameters, and thus can provide intuition to
the developer for developing complex and more accurate
models [9]. We also consider Decision Tree, Random For-
est, and Artificial Neural Networks (ANN), since they can
produce more accurate predictions, being directly computed
from the data. We consider these approaches in the order
of their performance given in Table 3, in terms of various
model selection metrics. We leave the choice of a suitable
learning technique to the developer, rendering flexibility to
the framework. Based on our evaluation of the learning
algorithms, developers can choose the learning technique
that best suits the respective application domain and use
case.
Assuming linear dependency, we start with the linear
regression technique for training the model in OptCon, in
an attempt to come up with a simple mathematical model
that expresses the effect of RW , T c, and P , on latency
and staleness for an operation, with different values of
client-side consistency level C. Linear regression can also provide preliminary insights into the relationships among the model parameters and act as a basis for designing more complicated modelling algorithms: the dependency relationships it uncovers can serve as a baseline for learning algorithms, such as Bayesian learning, which requires assuming a prior distribution over the independent parameters.
Next, with the logistic regression approach, we fit two logit (i.e., logistic) functions for the two dependent variables L and S as follows: logit(π(L)) = β0 + β1 C + β2 RW + β3 P + β4 Tc and logit(π(S)) = β5 + β6 C + β7 RW + β8 P + β9 Tc, where C is the applied consistency level,
L is the observed latency, S is the observed staleness,
RW is the proportion of reads, P is the number of packets transmitted, and Tc is the number of threads during an
operation. βi are the coefficients estimated by iterative least
square estimation, and π is the likelihood function. Using
ordinary least square estimation, we iteratively minimize a
loss function given by the difference of the observed and
estimated values, with respect to the training dataset. For
eliminating overfitting, we perform L1 regularization with the Lasso method.
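One plausible formalization of the penalized fitting step described above (written here for the latency model; the staleness model is analogous) is the standard Lasso-penalized least-squares objective below. This form is assumed for illustration; the exact objective minimized by the Matlab implementation described in Section 5 may differ in its details.

```latex
% L_i: observed latency of the i-th training example;
% \pi_\beta: fitted likelihood function; \Lambda: L1 (Lasso) penalty weight.
\hat{\beta} \;=\; \arg\min_{\beta}\;
  \sum_{i=1}^{n} \bigl( L_i - \pi_{\beta}(C_i, RW_i, P_i, Tc_i) \bigr)^{2}
  \;+\; \Lambda \, \lVert \beta \rVert_{1}
```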
Next we consider the decision tree learning algorithm
because of its simplicity, speed, and accuracy. The given
problem can be viewed as a classification problem, which
classifies the training dataset into classes based on whether
the observed latency and staleness (corresponding to each
row) fall within the given SLA thresholds. In fact, the
problem statement, i.e., choosing a matching consistency
level, can be viewed as a classification problem. The mod-
elling technique using Decision Tree comprises the fol-
lowing phases: 1) Labelling: On-the-fly computations for maximizing throughput T (i.e., the secondary optimization criterion) in the prediction phase would introduce consider-
able overhead. Since the latency L is considered before T is
maximized, the additional overhead may result in violation
of the latency threshold in the SLA. Hence, OptCon per-
forms the computations for T in the labelling phase itself.
Using exhaustive search, we label each row in the training
dataset with the highest T corresponding to given values of RW, Tc, and P, such that the SLA parameters L and
S are within the given SLA thresholds. 2) Training: Next
we apply decision tree learning for training a model M
from the labelled dataset. Use of error pruning techniques
mitigates issues of overfitting [9].
Next, we consider the random forest algorithm that ap-
plies ensemble learning to bootstrap multiple weak decision
tree learners to generate a strong classifier. It iteratively
chooses random samples with replacement from the train-
ing dataset, and trains a decision tree on each of the samples.
We take the average of the predictions from each decision
tree to obtain the final prediction with respect to a test
dataset.
For the Bayesian approach, we make the following
simplifying assumptions to make our problem suitable for
Bayesian learning: A1) a prior logistic distribution exists for
each of RW , T c, P , and C (see Section 4), A2) the posterior
distributions for L and S are conditionally dependent on the joint distribution of ⟨RW, Tc, P, C⟩, and A3) the
target features L and S are conditionally independent. With
these approximations, we plug-in the resultant dependency
graph from the Logistic Regression approach as input to the
Bayesian learner. Despite these assumptions and approxi-
mations, the Bayesian technique can provide intuition to the
developers, that can be used to build a more accurate model.
Next, we consider Artificial Neural Networks (ANN) for
learning a model M. We consider a two-layer perceptron
network for ANN; we do not consider deep learning because 1) the number of features is manageable for a simple ANN, 2) the size of the training dataset is also manageable for an ANN, 3) a simple ANN gives satisfactory accuracy, and 4) the training delay would be considerably high with deep learning.
(since training occurs once), the overhead for the prediction
phase is still considerably high (Section 6.3) due to model
complexity.
5 IMPLEMENTATION
OptCon is developed as a Java-based wrapper over Cas-
sandra v2.1.0. The source code can be found in the github
repositories located at https://github.com/ssidhanta/. The
Logger module is implemented as an extension over the
YCSB (Yahoo Cloud Serving Benchmark) 0.1.4 [19] frame-
work. It runs as a daemon on top of the datastore, listening
for the parameters using the built-in JMX Interface. Logistic
Regression, SVM, and Neural Network implementations of
the Learning module use Matlab 2013b’s statistical toolbox.
Decision Tree and Random Forest are implemented using
Java APIs from Weka, an open source machine learning
suite. The Bayesian approach uses Infer.net, an open source
machine learning toolbox under the .NET framework. We
summarize the important configuration parameters in the
Learner implementations. Logistic regression uses a min-
imization function with additional regularization weights,
multiplied by a tunable L1 norm term Λ. We set α = 0.05
and Λ = 25. Decision tree uses a confidence threshold of 0.25
for pruning, and uses random data shuffling with seed = 1.
Random forest bootstraps 100 decision trees, and introduces
low correlation among the individual trees by selecting a
random subset of the features at each split boundary of the
component trees. The Bayesian method uses a precision of
0.1. The ANN is implemented as a two-layer perceptron,
with default weights = 0 and the default bias = 0.
5.1 Implementation Details: Decision Tree
Labelling Phase: We generate the training dataset as fol-
lows: 1) we set YCSB parameters, namely read proportion
RW and thread count T c, to simulate a characteristic
workload w, and 2) the above workload w is executed with
different applied consistency levels C. Each row k in the
training dataset corresponds to a particular characteristic
workload w with a particular consistency level Ck. Each
row k records observed values of the output parameters,
namely average latency L, staleness S, and throughput
T , over a sliding time window of 1 minute. For a given
combination of values of the input parameters, comprising
the tuple ⟨RW, Tc, P⟩, and the given subSLA thresholds,
the output parameters, L, S and T , vary according to the
applied consistency level C for the operation corresponding
to the row k in the training dataset. The training dataset is
labelled as follows: 1) the training dataset is exhaustively
searched, 2) in each iteration of the above search step, we
group the rows in the training dataset by the values of the
input parameters comprising the tuple ⟨RW, Tc, P⟩, such that each group of rows contains a unique combina-
tion of values for the above tuple, 3) for each group of rows
g, we find the row l in g that satisfies both of the following
conditions. Condition 1: The value of throughput Tl in the
row l is maximum among the values of throughput for all
rows in g. Condition 2: The values of staleness Sl and latency
Ll satisfy the thresholds set in the subSLA, 4) rows in each
group of rows g in the training dataset are labelled with the
consistency level Cl corresponding to the row l that satisfies
the above conditions.
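The grouping and labelling procedure above can be sketched as follows. The Row type and its fields are hypothetical simplifications of our training-data records, introduced only for illustration.

```java
import java.util.*;

// Hedged sketch of the labelling phase: group training rows by <RW, Tc, P>, and label each
// group with the consistency level of the row that maximizes throughput while keeping
// latency and staleness within the subSLA thresholds.
public class Labeller {
    public static class Row {
        double rw; int tc; long p;   // input parameters <RW, Tc, P>
        String consistencyLevel;     // C applied when this row was recorded
        double latency, staleness, throughput;
        String label;                // assigned below
    }

    public static void label(List<Row> rows, double latencyMax, double stalenessMax) {
        Map<String, List<Row>> groups = new HashMap<>();
        for (Row r : rows) {
            groups.computeIfAbsent(r.rw + "|" + r.tc + "|" + r.p, k -> new ArrayList<>()).add(r);
        }
        for (List<Row> group : groups.values()) {
            Row best = null;
            for (Row r : group) {
                boolean withinSla = r.latency <= latencyMax && r.staleness <= stalenessMax;
                if (withinSla && (best == null || r.throughput > best.throughput)) {
                    best = r;   // row with maximum throughput that satisfies the subSLA
                }
            }
            if (best != null) {
                for (Row r : group) r.label = best.consistencyLevel;
            }
        }
    }
}
```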
Training Phase: The J48 decision tree [9] learner is loaded
with the labelled training data. The J48 learner is an open
source implementation of the C4.5 algorithm, which can
learn a model from a training dataset comprising both
continuous and discrete attributes (i.e., model parameters).
For handling continuous attributes, C4.5 assigns a threshold
value t to each continuous attribute a. C4.5 splits the rows
in the training dataset into categories, based on whether the
values of the attribute a for a row r exceed the threshold
t or not. C4.5 uses the normalized information gain Ig,
computed from the difference in entropy, as the splitting
Fig. 2: Bayesian belief network and decision tree structure: (a) Bayesian belief network; (b) structure of the decision tree, with latency given in milliseconds.
criterion. At each node in the tree, the parameter that
produces the highest Ig is chosen to make the split. The
algorithm continues recurring along the rest of the training
dataset till the complete decision tree is generated. The
leaf nodes represent the predicted consistency level for a
given workload, characterized by a given set of values
of the input parameters ⟨RW, Tc, P⟩. Each group of
rows in the training dataset forms a path in the decision
tree (from root to a leaf node), for which the predicted
consistency level is given by the label of the leaf node. Under
a subSLA comprising a 2 ms threshold for latency and a
10 ms threshold for staleness, the generated J48 decision
tree is presented in Figure 2(b). The C4.5 algorithm chooses
staleness as the first splitting function, assigning a threshold
of 0 ms for splitting the training dataset into two categories
at the root. The right subtree r corresponds to the examples
with observed staleness exceeding 0 ms, and the left subtree
l corresponds to the rest of the examples. Next, staleness
is chosen as the criterion that further splits the examples in
the right subtree r into subtrees l1 and r1, based on whether
the observed staleness exceeds 0.5 ms or not. The examples
in the right subtree r1 are further split into subtrees l2
and r2 based on a latency threshold of 3.967 ms. Then the
right subtree r2 derived from the above split is again split
into subtrees depending on whether the observed latency
exceeds a threshold of 5.431 ms.
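For concreteness, training the J48 model on the labelled dataset can be carried out with Weka roughly as in the sketch below. The ARFF file name is illustrative, and OptCon-specific glue code (loading the labelled rows produced by the Logger) is omitted.

```java
import weka.classifiers.trees.J48;
import weka.core.Instances;
import weka.core.converters.ConverterUtils.DataSource;

public class TrainJ48 {
    public static void main(String[] args) throws Exception {
        // Load the labelled training data (file name is illustrative).
        Instances data = DataSource.read("optcon-training.arff");
        // The label (matching consistency level) is assumed to be the last attribute.
        data.setClassIndex(data.numAttributes() - 1);

        J48 tree = new J48();
        tree.setConfidenceFactor(0.25f); // pruning confidence threshold used in Section 5
        tree.buildClassifier(data);

        System.out.println(tree);        // prints the learned decision tree
    }
}
```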
Prediction Phase: During the prediction phase under a
given subSLA with thresholds TL and TS for latency and
staleness, respectively, the Learner module traverses along a
path in the tree (Figure 2(b)), starting from the root to a leaf
node l, depending on the values of the input parameters
⟨RW, Tc, P⟩. The consistency level Cl corresponding to
the label of the leaf node l is the predicted matching con-
sistency level under the given subSLA, for the given values
of the input parameters. The predictor constructs the above
path as follows. Starting from the root node, it traverses the
depth of the tree, marking each node in the path as v. At
each step in the path, it splits the tree into branches, based
on whether the value of the attribute a corresponding to
the label of the current node v falls within a range [0, t]
or a range [t, max]; t is the assigned split threshold, and
max is the maximum value of a in the training dataset. At
each node v in the path, the predictor chooses the branch whose range contains the value of the attribute a corresponding to the label of the given node. The predictor keeps traversing the tree in the
above manner till it reaches a leaf node l, and it chooses the
label Cl of the leaf node l as the matching consistency for
the given operation.
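The traversal described above is the standard walk over a learned decision tree; in practice Weka's J48 performs it internally when classifying an instance. The generic sketch below illustrates the idea; the Node type is hypothetical and not Weka's internal representation.

```java
// Generic sketch of predicting a consistency level by walking a decision tree.
// The Node type is illustrative; Weka's J48 performs this traversal internally.
public class TreePredictor {
    public static class Node {
        String attribute;        // e.g., "RW", "Tc", or "P" at internal nodes
        double threshold;        // split threshold t for the attribute
        Node left, right;        // left: value in [0, t]; right: value in (t, max]
        String consistencyLevel; // non-null only at leaf nodes
    }

    public static String predict(Node root, java.util.Map<String, Double> inputs) {
        Node v = root;
        while (v.consistencyLevel == null) {     // descend until a leaf is reached
            double value = inputs.get(v.attribute);
            v = (value <= v.threshold) ? v.left : v.right;
        }
        return v.consistencyLevel;               // label of the leaf = predicted level
    }
}
```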
5.2 Implementation Details: Bayesian Learning
With the Bayesian [9] approach, we assume the following.
A1: A prior distribution exists for each of the input param-
eters, namely RW , T c, P , and C. The input parameter C
is a discrete variable, which takes values from a finite set of
categorical attributes, comprising the supported list of con-
sistency levels. Hence, we assume a Dirichlet distribution
[20] for the prior probabilities for C. On the other hand, the
parameters RW , T c, and P are continuous variables. We
assume that these parameters follow a Gaussian distribu-
tion. A2: The posterior distributions for output parameters,
namely L and S, are conditionally dependent on the joint
distribution of the input parameters ⟨RW, Tc, P, C⟩.
A3: The output parameters L and S are conditionally in-
dependent, allowing L and S to be expressed as the effect
of the input parameters RW , T c, P , and C. We represent
the dependency relation of the model parameters from the
equations obtained from the Logistic Regression approach
using a dependency graph. Figure 2(a) illustrates such a
graph. Our assumption that the output parameters, L and
S, are conditionally independent (though actually L and S
are dependent on each other) enables us to express them
as the effect of the input parameters. This allows us to
model the causality among the dependent and independent
variables as a simple Bayesian Network.
6 EVALUATION
6.1 Experimental Setup
Following [10], we have run our experiments on a testbed of
20 Amazon EC2 m1.small instances, located in the EC2 region
us-west-2, running Ubuntu 13.10, loaded with Cassandra
2.1.0 with a replication factor of 5, running on Oracle JDK 1.8. Our models are learnt from a training dataset comprising
23040 rows, and a testing dataset of 2560 rows, generated by
running varying YCSB 0.1.4 [19] workloads. We have used
the NTP (Network Time Protocol) implementation
from ntp.org to bring down the maximum clock skew to 5
ms.
6.2 Latency and Staleness Ranges Used in the sub-
SLAs
The results in [21] suggest that the latency for common web applications
falls in the range of 75-140 ms. The Pileus system [4], which
includes latency in subSLAs as well, uses values between
200 to 1000 ms as the latency threshold. Further reducing
the lower bound to favor availability, we choose any value
between 101 and 150 ms as a candidate for the threshold
of latency in OptCon. Following the observations from the
Riak deployment at Yammer [22], Bailis et al. [7] use a
range of 1.85-13.6 ms for the staleness bounds, given as
the t-visibility values. Hence, following [7], we choose a
value within the above range as the staleness threshold.
Rounding the bounding values for the ranges given above,
our subSLAs use values in the ranges 101-150 ms and 1-13
ms as thresholds for L and S, respectively.
6.3 OptCon Framework Overhead
Training the model being a one-time process, the latency
overhead due to the training phase is negligible during an
operation. The overhead due to training phase, which in-
cludes the time spent by the Logger in collecting the training
dataset from YCSB, varies from 6.5 ms for Linear Regression
to 27 ms for ANN. However, there is still considerable
overhead due to the prediction phase, especially with a large
training dataset. The prediction phase spans from the call
to the Learner module till the predicted consistency level
is returned by the Learner module. The average, variance
and standard deviation of the overhead are 1 ms, 0.25, and
0.37, respectively, with decision tree learning. The average
overhead with different learning techniques is given in Table
3.
TABLE 3: Model Selection Results, as per descending order of AICC and BIC, and ascending order of CV error and Overhead
Approach              Cross Validation Error   AICC    BIC     Overhead (ms)
Decision Tree         0.14                     10.73   51.44   1
Bayesian Learning     0.57                     12.85   53.57   1.2
Logistic Regression   1.98                     16.32   57.07   0.7
Random Forest         0.14                     9.51    50.24   1.3
Neural Network        0.059                    12.85   63.54   1.5
6.4 Dimensionality Reduction: Significance of the
Model Parameters
TABLE 4: Significance of the Model Parameters: significance of a parameter is greater for higher values of Rank Features and Sequential Attribute Selection, and lesser for higher values of Misclassification Error
Parameter   Rank Features Technique   Sequential Attribute Selection Technique   Misclassification Error for Random Forest
C           4.81                      1                                          0.22
RW          0.9                       1                                          1.11
P           2.78                      1                                          0.24
We apply dimensionality reduction [9] to determine the
features (parameters) that are relevant to the problem, and
have low correlation among themselves. We use the follow-
ing techniques to detect any redundant feature that can be
omitted without losing information. In the rank features
approach, we determine the most significant features on
the basis of different criterion functions. The above criterion
functions are evaluated as a filtering (or a preprocessing)
step before the training phase. On the other hand, sequential
feature selection is a wrapper technique, that sequentially
selects the features until there is no substantial improvement
(given by the deviance of the fitted models) in the predictive
power for an included feature. For the random forest model,
we evaluate the significance of the features in the ensemble
of component trees using the misclassification error. Table 4
presents the results of the above techniques on the model
parameters C, RW , and P . Sequential feature selection
outputs 1 for all the model parameters, indicating that
all the parameters are significant. Rank features ranks the
relative significance of the features in the order C, P , and
RW . Misclassification error ranks the features in the order
RW , P , and C. The order of the relative significance is
different with each dimensionality reduction technique; the
developer must choose the suitable ranking approach based
on the learning technique. For example, the misclassification
error criterion is to be used for the random forest technique.
6.5 Comparison of the Predictive Power of the Learn-
ing Techniques Using Model Selection Metrics
6.5.1 The Model Selection Metrics Used
Cross validation error (CV error) [9] is the most common
and widely used model selection metric. It represents the
accuracy of the model with respect to an unseen test dataset,
which is different from the original dataset used to train
the model. Using 10-fold cross validation, we partition the
dataset into 10 subsamples, and run 10 rounds of validation.
In each round a different subsample is used as the test
dataset, while we train the model with the rest of the dataset.
We compute the average of the mean squared error values
for all validation rounds to obtain the mean CV error (Table
3). We also use the Akaike Information Criterion (AIC)
[23] which quantifies the quality of the generated model in
terms of the information loss during the training phase. For
obtaining AIC, we first compute the likelihood of each ob-
served outcome in the test dataset with respect to the model.
We compute the maximum log likelihood L, i.e., the maxi-
mum of natural logarithms over the likelihood of each of the
observed outcomes. AICC [23] (Table 3) further improves
upon AIC, penalizing overfitting with a correction term for
finite sample size n. Thus AICC = 2k − 2 ln L + 2k(k + 1)/(n − k − 1), where k is the number of model parameters. Apart from
AICC, we also compute the Bayesian Information Criterion
(BIC) [23] that uses the Bayesian prior probability estimates
to penalize overfitting. Thus, BIC = k ln N − 2 ln L, where the additional parameter N for the sample size enables BIC to penalize model complexity more heavily than AICC as the sample grows. At smaller sample sizes, BIC assigns a lower penalty on the free parameters than AICC, whereas the penalty is higher for larger sample sizes (because it uses k ln N instead of 2k). Normally, the BIC and AICC scores (Table
3) are used together to compare the models: AICC detects the problem of overfitting and BIC indicates underfitting.
Another criterion for the choice of the algorithm is the
average prediction overhead (refer Section 6.3), which is
given in the last column of Table 3.
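As a quick reference, the two criteria can be computed directly from the maximum log likelihood ln L, the number of model parameters k, and the sample size n, as in the sketch below. This is a direct transcription of the formulas above, not part of OptCon's code.

```java
// Direct transcription of the AICC and BIC formulas used above.
public final class SelectionCriteria {
    // lnL: maximum log likelihood; k: number of model parameters; n: sample size.
    public static double aicc(double lnL, int k, int n) {
        double aic = 2.0 * k - 2.0 * lnL;
        return aic + (2.0 * k * (k + 1)) / (double) (n - k - 1);
    }

    public static double bic(double lnL, int k, int n) {
        return k * Math.log(n) - 2.0 * lnL;
    }

    private SelectionCriteria() {}
}
```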
6.5.2 Insights: Evaluation of Learning Techniques With Re-
spect to the Metrics
The above table can guide the developers in making an
informed choice regarding the correct learning technique
to use. The following analysis with respect to the Table
3 can act as lessons learnt for future practitioners and
researchers looking to apply learning techniques to solve
similar problems. Our problem can be directly cast as clas-
sification of the given training data into classes labelled
(Section 4.4) by the consistency levels. Hence, Decision Tree
yields high accuracy and speed (Table 3). We observed
near random predictions with Linear regression (hence we
omitted the results), because: 1) our problem is more of a
classification problem, as already explained, and 2) linear
regression requires the assumption of a linear model. Apart
from linear regression, logistic regression fares worst among
the approaches as indicated by the values of the metrics.
This is because it also treats the problem as a regression
problem, whereas ours is a nonlinear classification problem.
But it is still useful, since it yields a simple relation which
expresses the effect of the parameters RW , T c, and P ,
on L and S for an operation, under different consistency
levels. Also, it can be used to determine the strength of
the relationship among the parameters. Thus it can act as
a basis for complex modelling algorithms. Random forest
further eliminates errors due to overfitting and noisy data
by: 1) bootstrapping multiple learners, and 2) using random-
ized feature selection. Hence, it produces the best accuracy.
However, the prediction overhead due to model complexity might increase the latency overhead beyond the thresholds of a real-time subSLA. Similarly, though ANN yields high
accuracy (Table 3), the additional overhead due to the model
complexity (it produces the most complex model) may
overshoot the threshold for real-time use cases. The Bayesian method requires several approximating assumptions, which
result in high error scores.
6.5.3 Selection of Algorithm Based on Tradeoff Between
Accuracy And Speed
We define a measure Perf for selecting the learning tech-
nique that provides an optimal tradeoff between the accu-
racy and speed (Table 3). We assign the overhead of the
slowest learning technique (as per Table 3) as the baseline
overhead Obase . Since ANN has the maximum overhead
(1.5 ms) as per Table 3, Obase = 1.5. The speedup Speedup
obtained with a chosen learning algorithm relative to the
slowest algorithm is estimated from the observed relative
decrease in overhead for the chosen algorithm with re-
spect to the baseline Obase (i.e., for the slowest algorithm,
Speedup = 0). Thus, Speedup = (Obase − O)/Obase, where O is the
overhead of the chosen technique. We denote the prediction
error for a given learning technique as E. We assign the
error for the least accurate algorithm, as per Table 3, as the
baseline Ebase . Logistic regression has the maximum CV er-
ror (1.98 as per Table 3). Hence, Ebase = 1.98. The value of E
and Ebase depends on the choice of the error measure. The
developer can either use the CV error, or both AICC and BIC
taken together. In the latter case, E is given as the maximum
between the AICC and BIC for the chosen algorithm, and
Ebase is given as the maximum between the AICC and BIC
for the least accurate algorithm. We compute the relative
increase in accuracy (ARel ) obtained with a chosen learning
algorithm with respect to the least accurate algorithm. ARel
is estimated from the proportion of observed decrease in
error for the chosen algorithm with respect to the baseline
Ebase. Thus, ARel = (Ebase − E)/Ebase. Finally, the measure Perf is
computed as Perf = wSpeedup × Speedup + wA × ARel ,
where wSpeedup and wA are the weights (real numbers in
the closed interval [0, 1], such that wSpeedup + wA = 1)
assigned by the developers to quantify the relative im-
portance of speed and accuracy, respectively, for selecting
a learning technique. The developer can assign a larger
weight to accuracy (i.e., wA > wSpeedup ), if accuracy is
more important than speedup (and vice versa) according to
the use case. The developer chooses the learning algorithm
that produces the maximum value of Perf , with a given
choice of error measure. We provide the values of Perf for
various learning algorithms in Table 5, with CV Error as
the error measure, and equal weights assigned to accuracy
and speedup (i.e., wA = wSpeedup = 0.5). The value of
Perf is maximum for the decision tree algorithm, implying
that it produces an optimal tradeoff of accuracy and speed
according to our choice of error measure, and the assigned
values for the weights wA and wSpeedup . In most cases,
the difference in overhead between algorithms is negligibly
small, hence the Speedup is not a criterion of significance
for most cases. Thus, generally developers would assign
weights wA = 1 and wSpeedup = 0, choosing the algorithm
with the highest accuracy.
TABLE 5: Evaluation of the Accuracy-Speed Tradeoff
Decision Tree   Bayesian Learning   Logistic Regression   Random Forest   ANN
0.62            0.46                0.27                  0.52            0.49
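The Perf values can be reproduced from Table 3. The sketch below computes Perf from the overhead and CV error of a chosen algorithm and the two baselines; with wA = wSpeedup = 0.5 it yields roughly the decision tree entry of Table 5 (differences are due to rounding).

```java
// Computes Perf = wSpeedup * Speedup + wA * ARel from the values in Table 3.
public class PerfCalculator {
    public static double perf(double overhead, double error,
                              double oBase, double eBase,
                              double wSpeedup, double wA) {
        double speedup = (oBase - overhead) / oBase; // relative decrease in overhead
        double aRel = (eBase - error) / eBase;       // relative decrease in error
        return wSpeedup * speedup + wA * aRel;
    }

    public static void main(String[] args) {
        // Decision tree: overhead 1 ms, CV error 0.14; baselines: ANN 1.5 ms, logistic regression 1.98.
        System.out.println(perf(1.0, 0.14, 1.5, 1.98, 0.5, 0.5)); // ~0.63 (Table 5 reports 0.62)
    }
}
```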
6.6 Analysis of Linear Regression Approach
6.6.1 Background of ANOVA Analysis Technique
We use ANOVA [20] or Analysis of Variance on the fitted
linear regression to perform a hypothesis test for determin-
ing whether there is a significant linear relationship between the independent (input) parameters and the dependent parameters. The degree of freedom DF (refer to Table
6) is given as the number of independent observations in a
sample (training dataset) minus the number of the param-
eters that are to be estimated from the sample. The regression
mean square (MSR) for each parameter is computed as the
squared sum of errors (i.e., deviation of observed values
from the estimated values) divided by the degree of freedom
DF. The mean square error (MSE) is computed as the
squared sum of errors divided by n − 1, where n is the number
of parameters. The F statistic for each parameter is com-
puted as MSR/MSE. The p values in the ANOVA table
denote the probabilistic significance of the corresponding
parameter - a small p value will imply a small probability of
obtaining a large value of the F statistic. The null hypothesis
for linear regression states that there is no significant linear
relationship among the model parameters. If we obtain a
large value of F with a small p-value, we reject the null
hypothesis in favor of the alternative hypothesis, i.e., there
exists a significant relationship among model parameters.
TABLE 6: ANOVA results for Linear Regression
ANOVA for Latency function
Variable SumSq DF MeanSq F pValue
C 4.11E+18 1 4.11E+18 7.0851 7.82E-03
RW 1.57E+20 1 1.57E+20 269.75 1.23E-57
P 7.26E+18 1 7.26E+18 12.503 0.00041377
Tc 1.05E+23 1 3.2E+25 325.45 1.5E-65
Error 1.45E+21 2492 5.80E+17
ANOVA for Staleness function
Variable SumSq DF MeanSq F pValue
C 7.04E+08 1 7.04E+08 1.1906 0.27523
RW 7.71E+09 1 7.71E+09 480.32 1.70E-105
P 9.71E+08 1 9.71E+08 3.6821 0.05501
Tc 3.23E+12 1 2.85E+13 538.12 1.57E-123
Error 1.92E+12 2492 7.69E+08
6.6.2 The Resulting ANOVA Table
In the case of the regression equation for latency L, the p
values (refer to Table 6) are all approximately 0 (rounded
to 3 digits of precision) for the parameters C, RW , P ,
and T c, respectively, and the values of F statistic are 7.08,
269.75, 12.5, and 325.45, respectively. Such small p values
indicate that the probability of observing such large F-ratios by chance under the null hypothesis is very small. Since F is as high as 269.75 for RW,
the significance of the independent parameter RW in the
regression equation for L is very high, i.e., the probability
of L linearly varying with RW is very high. On the other
hand, the F statistics for the parameters C and P are much lower compared to RW; hence these parameters may have a relatively smaller chance of having a linear relationship
with the dependent parameter L. In the case of the fitted
regression equation for staleness S, the p values (refer to
Table 6) are approximately 0.28, 0, 0.06, and 0, for the
parameters C, RW , P , and T c, respectively, and the values
of F statistic are 1.19, 480.32, 3.68, and 538.12, respectively.
Thus, the p values for the regression equation of S are generally greater than those for L. The p values for RW and Tc remain approximately 0, with large F statistics, indicating a highly significant linear relationship with S. In contrast, the p values for C and P (approximately 0.28 and 0.06, respectively) are not small, and the corresponding F statistics are low (1.19 and 3.68, respectively). Thus, the parameters C and P have a very low chance of having a significant linear relationship with the dependent parameter in the regression equation for staleness S.
TABLE 7: Confusion matrix generated using Decision Tree
a      b      c,d,e,f   g      h        <-- classified as
6194   238    0         73     93     | a = ANY
115    1045   0         2793   495    | b = ONE
0      0      0         0      0      | c = TWO
0      0      0         0      0      | d = THREE
0      0      0         0      0      | e = LOCAL QUORUM
0      0      0         0      0      | f = EACH QUORUM
117    1073   0         2782   553    | g = QUORUM
114    1029   0         2771          | h = ALL
Fig. 3: ROC curve generated using Decision Tree: AUC=0.98
Fig. 4: ROC Curve for the Logistic Regression approach
6.7 Analysis of Confusion Matrix and ROC Curves
6.7.1 Background of Confusion Matrix and ROC Curves
True positives (TP) [20] represent examples in the training dataset that correspond to a correct positive classification. False
positives (FP) or type I errors involve incorrect labelling of
negative examples as positives. True negatives (TN), on the
other hand, are negative examples that have been classified
correctly. False negatives (FN) or type II errors involve
incorrect classification of positive examples as negatives.
A confusion matrix is a tabular visualization of the values
of TP, FP, TN, and FN [20]. True Positive Rate (TPR) and
False Positive Rate (FPR) are obtained as TPR = TP/(TP + FN) and FPR = FP/(FP + TN), respectively. Receiver Operating Characteris-
tics (ROC) curves display performance of machine learning
algorithms in terms of TPR and FPR. We plot the ROC
from the confusion matrix. The ROC curve plots the FPR and TPR along the x and y axes, respectively. The area
under the curve (AUC) quantifies the probability that a
given classifier will assign higher rank to a positive example
over a negative example. The ideal region for the ROC curve is the upper left half, which corresponds to an area of 1,
representing a perfect test; an area of 0.5 typically means a
uniformly random test.
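As a concrete illustration of these definitions, the sketch below (Python with scikit-learn) computes TPR, FPR, and the AUC; the labels and scores are hypothetical placeholders rather than our experimental data.

import numpy as np
from sklearn.metrics import confusion_matrix, roc_curve, auc

y_true = np.array([1, 0, 1, 1, 0, 0, 1, 0])            # hypothetical labels
y_score = np.array([.9, .2, .7, .4, .3, .6, .8, .1])   # classifier scores
y_pred = (y_score >= 0.5).astype(int)

tn, fp, fn, tp = confusion_matrix(y_true, y_pred).ravel()
tpr = tp / (tp + fn)      # True Positive Rate
fpr = fp / (fp + tn)      # False Positive Rate

fprs, tprs, _ = roc_curve(y_true, y_score)   # FPR on x axis, TPR on y axis
print(tpr, fpr, auc(fprs, tprs))             # AUC summarizes the whole curve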
Fig. 5: ROC curve generated using Bayes Net
6.7.2 Resulting Confusion Matrix and ROC Curves
Table 7 shows the confusion matrix obtained with the Decision Tree technique. From the confusion matrix, we obtain the ROC curve for the decision tree (refer to Figure 3). The area under the curve (AUC), computed using the Mann-Whitney technique [20], is 0.99, and the Gini coefficient is G1 = 2 × AUC − 1 = 2 × 0.99 − 1 = 0.98. However, the raw curve falls below the diagonal, indicating inverted predictions; hence we apply the mirroring method, which reverses the predictions with respect to the diagonal line. All predictions are reversed in this manner before the predicted values are passed to the client for performing the read/write operations.
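A minimal sketch of this mirroring step for the binary case (Python with scikit-learn; the labels and scores below are hypothetical, not the classifier output used in our experiments):

import numpy as np
from sklearn.metrics import roc_auc_score

y_true = np.array([1, 0, 1, 1, 0, 0])           # hypothetical labels
y_score = np.array([.2, .8, .3, .1, .7, .9])    # scores from an inverted classifier

if roc_auc_score(y_true, y_score) < 0.5:   # ROC curve falls below the diagonal
    y_score = 1.0 - y_score                # mirror (reverse) the predictions
print(roc_auc_score(y_true, y_score))      # mirrored AUC is now above 0.5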
Figure 5 presents the ROC curve for the Bayes network
approach. The diagonal corresponds to random guesses;
the prediction instances all reside above it. The area under
the ROC curve for the Bayes network is 0.9, and the Gini
coefficient G1 = 2 × 0.9 − 1 = 0.8. Figure 4 presents
the ROC curve for the Logistic Regression approach.
Fig. 6: Cross Validated Deviance of Lasso Fit
6.8 Analysis of Logistic Regression
A saturated model is a model with the maximum number of regression coefficients that can be estimated. The deviance of a model M1 is computed as twice the difference between the log-likelihood of the saturated model MS and that of M1. For a nonnegative value of Λ, lasso (we use lasso regularization with logistic regression, as discussed in Section 4.4.1) minimizes the Λ-penalized deviance, which is equivalent to maximizing the Λ-penalized log-likelihood. This deviance represents the value of the loss function, computed as the negative log-likelihood of the prediction model, averaged over the validation folds in the cross-validation procedure. We use Figure 6 to analyze the deviance obtained with Logistic Regression. Figure 6 identifies the minimum-deviance point with a green circle and a dashed line, locating the value of Λ that produces the minimal cross-validation deviance; the blue circle marks the largest value of Λ whose deviance lies within one standard error of that minimum.
Fig. 7: Residuals for the Logistic Regression approach
Figure 7 presents the residuals obtained from the lasso fit, computed as the difference between the observed and predicted values. Residuals are used to detect outliers with respect to a fitted regression curve, and in the process help test the assumptions of the logistic regression model. In Figure 7, we plot the histogram of raw residuals, i.e., the difference between the observed values and the fitted values estimated from the logistic regression equation. The histogram shows that the residuals are slightly left-skewed, indicating that the model can be further improved.
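A minimal sketch of such a cross-validated, lasso-regularized logistic fit (Python with scikit-learn, whose inverse regularization strength C corresponds to 1/Λ; the synthetic feature matrix and labels are placeholders for our training data, not the actual OptCon pipeline):

import numpy as np
from sklearn.linear_model import LogisticRegressionCV

# X: one row per operation (e.g., read proportion, throughput, network state);
# y: the matching consistency level recorded for that operation (assumed).
X = np.random.rand(500, 4)
y = np.random.choice(["ONE", "QUORUM", "ALL"], size=500)

clf = LogisticRegressionCV(Cs=np.logspace(-4, 2, 20), cv=10, penalty="l1",
                           solver="saga", scoring="neg_log_loss", max_iter=5000)
clf.fit(X, y)
print(clf.C_)   # selected inverse regularization strength(s), i.e., 1 / Lambda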
6.9 Adaptability of OptCon to Varying Workload
Varying Throughput: We run our experiments on a geo-replicated testbed of 7 Amazon EC2 m1.large instances (replica nodes), located in 3 regions (3 nodes in us-west-2a, 2 nodes in us-east, and 2 nodes in eu-west), running Ubuntu 14.04 LTS, loaded with Cassandra 2.1.2 with a replication factor of 3. We demonstrate the adaptability of OptCon to variations in throughput in the client workload. We tune the target throughput parameter (operations per second) in the YCSB client, and observe the variations in the severity of staleness (i.e., the 95th percentile Γ score) and latency against changing throughput, under a subSLA comprising a 300 ms threshold for latency (following the SLA for Amazon's shopping cart application [16]) and a 10 ms threshold for staleness. As per Table 5, we choose the Decision Tree implementation of the Learner module. Following [10], we run YCSB workloads for spans of 60 seconds with the following configuration: 50% ratio of reads and writes, 8 client threads, the “latest” request distribution, and a keyspace size of 100,000.
Fig. 8: Adaptability of OptCon to Varying Target Throughput, under a subSLA (Latency: 300 ms, Staleness: 10 ms). Each panel plots latency (milliseconds) and Gamma (milliseconds) against throughput (ops/s): (a) operations with READ ALL/WRITE ALL; (b) operations with READ/WRITE QUORUM; (c) operations with READ ONE/WRITE ONE; (d) operations with OptCon.
Fig. 9: Adaptability of OptCon to Varying Read Proportion under the subSLA SLA-1 (Latency: 250 ms, Staleness: 5 ms). Each panel plots latency (milliseconds) and staleness (milliseconds) against read proportion (0.1 to 0.9): (a) operations with OptCon satisfy the subSLA in 100% of cases; (b) operations with READ ALL/WRITE ALL satisfy the subSLA in 70% of cases; (c) operations with READ ALL/WRITE QUORUM satisfy the subSLA in 75% of cases; (d) operations with READ QUORUM/WRITE QUORUM satisfy the subSLA in 75% of cases; (e) operations with READ QUORUM/WRITE ALL satisfy the subSLA in 35% of cases; (f) operations with READ ONE/WRITE ANY satisfy the subSLA in 45% of cases.
Fig. 10: Adaptability of OptCon to Different subSLAs. Each panel plots staleness (ms) against latency (ms): (a) operations with OptCon under subSLA-1 (Latency: 250 ms, Staleness: 5 ms); (b) operations with OptCon under subSLA-2 (Latency: 100 ms, Staleness: 10 ms); (c) operations with OptCon under subSLA-3 (Latency: 20 ms, Staleness: 20 ms); (d) READ ALL/WRITE ALL; (e) READ ALL/WRITE QUORUM; (f) READ QUORUM/WRITE QUORUM; (g) READ QUORUM/WRITE ALL; (h) READ ALL/WRITE ANY; (i) READ ALL/WRITE ONE; (j) READ ONE/WRITE ALL; (k) READ QUORUM/WRITE ONE; (l) READ QUORUM/WRITE ANY; (m) READ ONE/WRITE QUORUM; (n) READ ONE/WRITE ANY; (o) chart showing M-statistic values:
subSLA-1: WRITE ANY/READ ONE 68; READ/WRITE QUORUM 41; Predicted Consistency 85
subSLA-2: READ QUORUM/WRITE ALL 70; Predicted Consistency 92
subSLA-3: READ ALL/WRITE QUORUM 0; Predicted Consistency 100
Figures 8(d), 8(a), 8(b), and 8(c) plot the 95th percentile values of
observed Γ scores and observed latency in milliseconds, against varying target throughput, with manually chosen consistency levels and with OptCon. Since the frequency of operations on each key increases with the target throughput applied from the YCSB client, the chance of observing stale results increases as well. Hence, the plots show that, with increasing target throughput, the 95th percentile Γ score increases. Stronger consistency levels (i.e., ALL and QUORUM, in Figures 8(a) and 8(b)) satisfy the staleness threshold for all values of applied target throughput. Due to the high internode communication delay between replica nodes distributed across the 3 regions under geo-replicated settings, the consistency level ALL (Figure 8(a)) violates the 300 ms latency threshold. On the other hand, the weak consistency level ONE (Figure 8(c)) satisfies both the latency and staleness thresholds for lower values of target throughput, but violates the 10 ms staleness threshold once the target throughput exceeds 400 ops/s. Hence, for throughput ≤ 400 ops/s, ONE or QUORUM is a matching choice, while only QUORUM is matching for throughput > 400 ops/s. OptCon (Figure 8(d)) applies the weak consistency level ONE for target throughput ≤ 400 ops/s, to achieve the latency threshold and maximize the observed throughput, and applies the strong consistency level QUORUM for target throughput > 400 ops/s. QUORUM (Figure 8(b)) satisfies the subSLA for all values of target throughput. However, by choosing ONE instead of QUORUM for throughput ≤ 400 ops/s, OptCon (Figure 8(d)) produces 61 ms lower latency, on average, than QUORUM (Figure 8(b)).
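To make the per-operation tuning concrete, the sketch below (Python with the DataStax cassandra-driver) shows how a predicted level such as ONE or QUORUM could be attached to an individual read; the trained learner clf, its feature vector, and the YCSB-style table layout are assumptions for illustration, not OptCon's actual interface.

from cassandra.cluster import Cluster
from cassandra import ConsistencyLevel
from cassandra.query import SimpleStatement

LEVELS = {"ONE": ConsistencyLevel.ONE,
          "QUORUM": ConsistencyLevel.QUORUM,
          "ALL": ConsistencyLevel.ALL}

session = Cluster(["127.0.0.1"]).connect("ycsb")   # hypothetical contact point

# clf is the trained Learner; the feature vector (read proportion, target
# throughput, network delay, subSLA latency/staleness thresholds) is
# illustrative only.
predicted = clf.predict([[0.5, 400, 12.0, 300, 10]])[0]    # e.g., "QUORUM"

stmt = SimpleStatement("SELECT field0 FROM usertable WHERE y_id = %s",
                       consistency_level=LEVELS[predicted])
row = session.execute(stmt, ("user4739",)).one()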
Varying Read Proportion: Next, we vary the read proportion (i.e., proportion of reads) parameter (RW) while running YCSB workloads, with OptCon and with manually chosen consistency levels (Figure 9). In Figures 9(a) through 9(f), the read proportion is plotted along the x axis, the observed latency along the primary y axis, and the staleness along the secondary y axis. We demonstrate that, for a specific read proportion, only a subset of all possible read-write consistency levels is optimal, i.e., satisfies the latency and staleness thresholds in a given subSLA. Varying the read proportion (RW) in the workload from 0.1 to 0.9, we observe that a manually chosen read-write consistency level is optimal for only a fraction of the total number of cases, under the subSLA SLA-1 (Table 2). OptCon demonstrates its adaptability to varying read proportion, applying the optimal consistency level for each read proportion, and thus satisfies the subSLA in 100% of cases. OptCon can adapt to changing read proportion due to the presence of the parameter RW as a feature in the prediction model M. As shown by the results in [10], the frequency of stale results (i.e., higher Γ scores) increases with increasing read proportion in the workload. Hence, with higher read proportions in the right half of the x axis, stronger read consistency levels are required to return less stale results. Also, since the proportion of writes is smaller, the penalty for the write latency overhead is smaller; hence stronger write consistency levels, which result in high write latency, are acceptable. Thus, for higher read proportions, combinations of strong read-write consistency levels, i.e., ALL/QUORUM, succeed in lowering staleness values to acceptable bounds (right half of Figures 9(b), 9(c), 9(d), and 9(e)). Strong consistency levels achieve a maximum success rate of 75% in satisfying the subSLA, with READ ALL/WRITE QUORUM (Figure 9(c)) and READ QUORUM/WRITE QUORUM (Figure 9(d)). Taking into account the clock skew, the staleness is effectively 0 in these cases. For the same reasons, for higher read proportions, weak read consistency levels fail to achieve stringent staleness thresholds, as demonstrated by the points in the right half of Figure 9(f). With lower read proportions in the left half of the x axes, the frequency of stale read results is smaller [10]. In this case, weaker manually chosen read-write consistency levels, i.e., ONE/ANY (left half of Figure 9(f)), are sufficient for returning consistent results. For lower read proportions, stronger consistency levels, i.e., ALL or QUORUM, are unnecessary and only result in high latency overhead. With manually chosen fixed consistency levels, we observe a maximum success rate of only 55% or less in satisfying the subSLA over the left half of Figures 9(b) through 9(e). Thus, manually chosen consistency levels show a maximum success rate of 75% in satisfying the subSLA, in contrast with the 100% success rate demonstrated by OptCon (Figure 9(a)).
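As a sketch of how such a Learner can be trained and cross-validated (Python with scikit-learn; the feature set and synthetic data are placeholders for the logged observations, so this toy run does not reproduce the 0.14 cross-validation error reported earlier):

import numpy as np
from sklearn.model_selection import cross_val_score
from sklearn.tree import DecisionTreeClassifier

# Features per operation (assumed): read proportion, target throughput,
# observed network delay, and the subSLA latency/staleness thresholds.
X = np.random.rand(1000, 5)
# Label: the consistency level that proved matching for that operation.
y = np.random.choice(["ANY", "ONE", "QUORUM", "ALL"], size=1000)

learner = DecisionTreeClassifier(max_depth=8)
errors = 1.0 - cross_val_score(learner, X, y, cv=10)   # per-fold error rate
print(errors.mean())                                   # cross-validation error
learner.fit(X, y)                                      # final model for prediction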
6.10 Adaptability of OptCon to Different subSLAs
We demonstrate that the consistency levels predicted by OptCon are always matching, i.e., OptCon always chooses from the optimal set of consistency levels for the given subSLA. We have integrated OptCon (using Decision Tree learning, as per Table 5) with the RUBBoS [24] benchmark, which can simulate concurrent and interleaved user access. Figures 10(a) through 10(n) plot the observed staleness along the y axis against the latency along the x axis, under the subSLAs SLA-1, SLA-2, and SLA-3 (Table 2). Figures 10(a) through 10(c) show experiments performed with OptCon, while Figures 10(d) through 10(n) correspond to experiments performed with manually chosen consistency levels. We also demonstrate a few extreme cases of occasional heavy network traffic in real-world applications using artificially introduced network delays (refer to the few boundary cases with arbitrarily high latency in Figures 10(a), 10(b), and 10(c) that violate the respective subSLAs). These boundary cases were simulated using the traffic shaping feature of Traffic Control [25], a Linux-based tool for configuring network traffic that controls the transmission rate by lowering the network bandwidth. SLA-1 (Table 2) represents systems demanding stronger consistency. Manually chosen weak read-write consistency levels, i.e., ANY/ONE, fail to satisfy the stringent staleness bound of SLA-1 (Figures 10(k) through 10(m)). On the other hand, strong consistency levels (i.e., ALL) satisfy SLA-1 (Figures 10(d) through 10(g)). OptCon satisfies SLA-1 by applying the respective optimal consistency levels (i.e., ALL in this case) for SLA-1 (Figure 10(a)). For the lower latency bound in SLA-3 (Table 2), the manually chosen strong consistency levels (ALL/QUORUM) unnecessarily produce latencies beyond the acceptable threshold (Figures 10(d) through 10(g)), whereas weaker consistency levels (i.e., ONE/ANY) prove to be optimal (Figures 10(k) through 10(n)). Again, OptCon succeeds by choosing the respective optimal consistency
levels (i.e., ONE/ANY in this case) for SLA-3 (Figure 10(c)). Similarly, with SLA-2, a subset of the manually chosen fixed consistency levels (Figures 10(h), 10(i), and 10(j)) produce optimal results, whereas OptCon successfully achieves SLA-2 for all operations (Figure 10(b)). We evaluate OptCon by a measure M, which quantifies the adaptability of the system to the subSLA. Following [4], M (chart (o) in Figure 10) computes the percentage of cases that did not violate the subSLA. For all the given subSLAs, the M values for operations performed with OptCon (Figures 10(a), 10(b), and 10(c)) exceed the M values for operations using fixed consistency levels.
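The M statistic itself reduces to a simple count over the logged operations, as in the following sketch (Python; the latency and staleness arrays and the thresholds are illustrative placeholders, not measurements from our testbed):

import numpy as np

def m_statistic(latencies_ms, staleness_ms, lat_threshold, stale_threshold):
    """Percentage of operations whose latency and staleness both satisfy the subSLA."""
    lat = np.asarray(latencies_ms)
    stale = np.asarray(staleness_ms)
    ok = (lat <= lat_threshold) & (stale <= stale_threshold)
    return 100.0 * ok.mean()

# Example under subSLA-1 (Latency: 250 ms, Staleness: 5 ms); values are illustrative.
print(m_statistic([120, 180, 320, 90], [2, 4, 1, 6], 250, 5))   # -> 50.0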
7 RELATED WORK
Wada et al. [26] and Bermbach et al. [27] analyze and
quantify consistency from a system-centric perspective that
focuses on the convergence time of the replication proto-
col. Bailis et al. [7] and Rahman et al. [28] instead con-
sider a client-centric perspective of consistency, in which
a consistency anomaly occurs only if differences in state
among replicas lead to a client application actually ob-
serving a stale read. In practice eventual consistency is
preferred over strong consistency in scenarios where the
system must maintain availability during network parti-
tions [29, 5], or when the application is latency-sensitive
and able to tolerate occasional inconsistencies [30]. The state
machine replication paradigm achieves the strongest possi-
ble form of consistency by physically serializing operations
[31]. Lamport’s Paxos protocol is a quorum-based fault-
tolerant protocol for state machine replication. Mencius
improves upon Paxos by ensuring better scalability and
higher throughput under high client load using a rotating
coordinator scheme [32]. Using variations of Paxos, a num-
ber of systems [33, 34, 35, 36] claim to provide scalability and
fault-tolerance. Relatively few systems [37, 4, 13, 14] provide
mechanisms for fine-grained control over consistency. Li et al. [12] apply static analysis for automated consistency tuning in relational databases.
8 CONCLUSIONS
OptCon automates client-centric consistency-latency tuning
in quorum-replicated stores, based on staleness and la-
tency thresholds in the SLAs. We implemented the OptCon
Learning module using various Learning algorithms and
do a comparative analysis of the accuracy and overhead
for each case. ANN produces highest accuracy, and Logistic
Regression the lowest overhead. Assigning equal weight to
accuracy and overhead, the decision tree performed best.
OptCon can adapt to variations in the workload, and is at
least as effective as any manually chosen consistency setting
in satisfying a given SLA. OptCon provides insights for
applying machine learning to similar problems.
9 ACKNOWLEDGEMENT
The project is partially supported by Army Research Office
(ARO) under Grant W911NF1010495. Any opinions, find-
ings, and conclusions or recommendations expressed in this
material are those of the authors and do not necessarily re-
flect the views of the ARO or the United States Government.
Wojciech Golab is supported in part by the Natural Sciences
and Engineering Research Council (NSERC) of Canada.
Subhajit Sidhanta is supported in part by the Amazon AWS
(Amazon Web Services) Research Grant.
REFERENCES
[1] A. Lakshman and P. Malik, “Cassandra: A decentralized
structured storage system,” SIGOPS Oper. Syst. Rev., vol. 44, no. 2,
pp. 35–40, Apr. 2010. [Online]. Available: http://doi.acm.org/10.
1145/1773912.1773922
[2] C. Meiklejohn, “Riak PG: Distributed process groups on dynamo-
style distributed storage,” in Erlang ’13.
[3] B. Calder, J. Wang, A. Ogus, N. Nilakantan, A. Skjolsvold,
S. McKelvie, Y. Xu, S. Srivastav, J. Wu, H. Simitci, J. Haridas,
C. Uddaraju, H. Khatri, A. Edwards, V. Bedekar, S. Mainali,
R. Abbasi, A. Agarwal, M. F. u. Haq, M. I. u. Haq,
D. Bhardwaj, S. Dayanand, A. Adusumilli, M. McNett,
S. Sankaran, K. Manivannan, and L. Rigas, “Windows azure
storage: A highly available cloud storage service with strong
consistency,” in Proceedings of the Twenty-Third ACM Symposium on
Operating Systems Principles, ser. SOSP ’11. New York, NY, USA:
ACM, 2011, pp. 143–157. [Online]. Available: http://doi.acm.org/
10.1145/2043556.2043571
[4] D. B. Terry, V. Prabhakaran, R. Kotla, M. Balakrishnan, M. K.
Aguilera, and H. Abu-Libdeh, “Consistency-based service level
agreements for cloud storage,” in SOSP ’13.
[5] D. B. Terry, M. M. Theimer, K. Petersen, A. J. Demers, M. J.
Spreitzer, and C. H. Hauser, “Managing update conflicts in Bayou,
a weakly connected replicated storage system,” in SOSP ’95.
[6] A. Jain, “Using cassandra for real-time analytics: Part 1,” http://
blog.markedup.com/2013/03/cassandra-real-time-analytics/,
2011.
[7] P. Bailis, S. Venkataraman, M. J. Franklin, J. M. Hellerstein,
and I. Stoica, “Probabilistically bounded staleness for practical
partial quorums,” Proc. VLDB Endow., vol. 5, no. 8, pp. 776–787,
Apr. 2012. [Online]. Available: http://dl.acm.org/citation.cfm?
id=2212351.2212359
[8] S. Sidhanta, W. Golab, S. Mukhopadhyay, and S. Basu, “OptCon:
An Adaptable SLA-Aware Consistency Tuning Framework for
Quorum-based Stores,” in CCGrid ’16.
[9] P. Flach, Machine Learning: The Art and Science of Algorithms That
Make Sense of Data. New York, NY, USA: Cambridge University
Press, 2012.
[10] W. M. Golab, M. R. Rahman, A. AuYoung, K. Keeton, J. J. Wylie,
and I. Gupta, “Client-centric benchmarking of eventual consis-
tency for cloud storage systems,” in ICDCS, 2014, p. 28.
[11] Planet Cassandra, “Apache Cassandra use cases,” http://planetcassandra.org/apache-cassandra-use-cases/, 2015.
[12] C. Li, J. Leitão, A. Clement, N. Preguiça, R. Rodrigues, and
V. Vafeiadis, “Automating the choice of consistency levels in
replicated systems,” in USENIX ATC 14.
[13] M. S. Ardekani and D. B. Terry, “A self-configurable geo-replicated
cloud storage system,” in OSDI 14.
[14] K. C. Sivaramakrishnan, G. Kaki, and S. Jagannathan, “Declarative
programming over eventually consistent data stores,” in PLDI ’15.
[15] L. Ravindranath, J. Padhye, R. Mahajan, and H. Balakrishnan,
“Timecard: Controlling user-perceived delays in server-based
mobile applications,” in Proceedings of the Twenty-Fourth ACM
Symposium on Operating Systems Principles, ser. SOSP ’13. New
York, NY, USA: ACM, 2013, pp. 85–100. [Online]. Available:
http://doi.acm.org/10.1145/2517349.2522717
[16] G. DeCandia, D. Hastorun, M. Jampani, G. Kakulapati,
A. Lakshman, A. Pilchin, S. Sivasubramanian, P. Vosshall, and
W. Vogels, “Dynamo: Amazon’s highly available key-value store,”
SIGOPS Oper. Syst. Rev., vol. 41, no. 6, pp. 205–220, Oct. 2007.
[Online]. Available: http://doi.acm.org/10.1145/1323293.1294281
[17] R. Sumbaly, J. Kreps, L. Gao, A. Feinberg, C. Soman, and S. Shah,
“Serving large-scale batch computed data with project Volde-
mort,” in Proc. of the 10th USENIX Conference on File and Storage
Technologies (FAST), 2012.
[18] L. Lamport, “On interprocess communication,” Distributed
Computing, vol. 1, no. 2, pp. 77–85, 1986. [Online]. Available:
http://dx.doi.org/10.1007/BF01786227
[19] B. F. Cooper, A. Silberstein, E. Tam, R. Ramakrishnan, and R. Sears,
“Benchmarking cloud serving systems with YCSB,” in SoCC ’10.
2332-7790 (c) 2016 IEEE. Personal use is permitted, but republication/redistribution requires IEEE permission. See http://www.ieee.org/publications_standards/publications/rights/index.html for more information.
This article has been accepted for publication in a future issue of this journal, but has not been fully edited. Content may change prior to final publication. Citation information: DOI 10.1109/TBDATA.2017.2656121, IEEE
Transactions on Big Data
IEEE TRANSACTIONS ON BIG DATA, VOL. 3, NO. 1, DECEMBER 2016 15
[20] S. Weerahandi, Exact Statistical Methods for Data Analysis, ser.
Springer Series in Statistics. New York, NY, USA: Springer Science
and Business Media, 2001.
[21] T. Everts. Web performance today.
[22] C. Hale and R. Kennedy, “Using riak at yammer,” http://dl.
dropbox.com/u/2744222/2011-03-22 Riak-At-Yammer.pdf, 2011.
[23] K. P. Burnham and D. R. Anderson, Model selection and multimodel
inference: a practical information-theoretic approach. Springer Science
& Business Media, 2002.
[24] ObjectWeb Consortium, “RUBBoS: Bulletin Board Benchmark,” 2002.
[25] B. Hubert, T. Graf, G. Maxwell, R. Van Mook, M. Van Oosterhout,
P. B. Schroeder, J. Spaans, and P. Larroy, Linux Advanced Routing
& Traffic Control HOWTO, Online publication, Linux Advanced
Routing & Traffic Control, http://lartc.org/, Apr. 2004. [Online].
Available: http://lartc.org/lartc.pdf
[26] H. Wada, A. Fekete, L. Zhao, K. Lee, and A. Liu, “Data consistency
properties and the trade-offs in commercial cloud storage: the
consumers’ perspective,” in Proc. Conference on Innovative Data
Systems Research (CIDR), 2011, pp. 134–143.
[27] D. Bermbach and S. Tai, “Eventual consistency: How soon is
eventual? An evaluation of Amazon S3’s consistency behavior,”
in Proc. Workshop on Middleware for Service Oriented Computing
(MW4SOC), 2011.
[28] M. R. Rahman, W. Golab, A. AuYoung, K. Keeton, and J. J. Wylie,
“Toward a principled framework for benchmarking consistency,”
in Proc. of the Eighth USENIX Conference on Hot Topics in System
Dependability, ser. HotDep’12. Berkeley, CA, USA: USENIX
Association, 2012, pp. 8–8. [Online]. Available: http://dl.acm.
org/citation.cfm?id=2387858.2387866
[29] K. Birman and R. Friedman, Trading Consistency for Availability in
Distributed Systems. Cornell University. Department of Computer
Science, 1996. [Online]. Available: https://books.google.com/
books?id=XAwDrgEACAAJ
[30] E. A. Brewer, “Towards robust distributed systems (Invited Talk),”
in Proc. of the 19th ACM SIGACT-SIGOPS Symposium on Principles
of Distributed Computing, 2000.
[31] L. Lamport, “Time, clocks and the ordering of events in a dis-
tributed system,” Communications of the ACM, vol. 21, no. 7, p. 558,
1978.
[32] Y. Mao, F. P. Junqueira, and K. Marzullo, “Mencius: Building
efficient replicated state machines for WANs,” in OSDI’08.
[33] J. Rao, E. J. Shekita, and S. Tata, “Using Paxos to build a scalable,
consistent, and highly available datastore,” PVLDB, vol. 4, no. 4,
p. 243, 2011.
[34] W. J. Bolosky, D. Bradshaw, R. B. Haagens, N. P. Kusters,
and P. Li, “Paxos replicated state machines as the basis of
a high-performance data store,” in Proc. of the 8th USENIX
Conference on Networked Systems Design and Implementation,
ser. NSDI’11. Berkeley, CA, USA: USENIX Association, 2011,
pp. 11–11. [Online]. Available: http://dl.acm.org/citation.cfm?
id=1972457.1972472
[35] T. Kraska, G. Pang, M. J. Franklin, S. Madden, and A. Fekete,
“MDCC: multi-data center consistency,” in Proc. of the 8th ACM
European Conference on Computer Systems, ser. EuroSys ’13. New
York, NY, USA: ACM, 2013, pp. 113–126. [Online]. Available:
http://doi.acm.org/10.1145/2465351.2465363
[36] J. C. Corbett, J. Dean, M. Epstein, A. Fikes, C. Frost, J. J. Furman,
S. Ghemawat, A. Gubarev, C. Heiser, P. Hochschild, W. Hsieh,
S. Kanthak, E. Kogan, H. Li, A. Lloyd, S. Melnik, D. Mwaura,
D. Nagle, S. Quinlan, R. Rao, L. Rolig, Y. Saito, M. Szymaniak,
C. Taylor, R. Wang, and D. Woodford, “Spanner: Google’s
globally-distributed database,” in Proc. of the 10th USENIX
Conference on Operating Systems Design and Implementation,
ser. OSDI’12. Berkeley, CA, USA: USENIX Association, 2012,
pp. 251–264. [Online]. Available: http://dl.acm.org/citation.cfm?
id=2387880.2387905
[37] H. Yu and A. Vahdat, “Building replicated internet
services using TACT: A toolkit for tunable availability
and consistency tradeoffs.” in WECWIS, 2000, pp. 75–84.
[Online]. Available: http://dblp.uni-trier.de/db/conf/wecwis/
wecwis2000.html#YuV00
Subhajit Sidhanta Subhajit Sidhanta received
his PhD from the Department of Computer Sci-
ence and Engineering LSU, affiliated with the
School of Electrical Engineering and Computer
Science at Louisiana State University, USA, in
2016. He is currently a Postdoctoral Researcher
working with the Distributed Systems Research
Group at INESC-ID research lab, affiliated with
Instituto Superior Técnico at the University of Lisbon, Portugal. His major research interests encompass consistency in distributed systems and performance modelling of distributed storage and parallel computing systems.
Supratik Mukhopadhyay Supratik Mukhopad-
hyay is a Professor in the Computer Science
Department at Louisiana State University, USA.
He received his PhD from the Max Planck Institut
fuer Informatik and the University of Saarland
in 2001, MS from the Indian Statistical Institute,
India and his BS from Jadavpur University, India.
His areas of specialization include formal verification of software, inference engines, big data analytics, distributed systems, parallel computing frameworks, and program synthesis.
Saikat Basu Saikat Basu received his PhD from
the Computer Science Department at Louisiana
State University, Baton Rouge, USA. He re-
ceived his BS in Computer Science from Na-
tional Institute of Technology, Durgapur, India
in 2011. His primary research interests include
applications of Computer Vision and Machine
Learning to satellite imagery data and using
High Performance Computing architectures to
address big data analytics problems for very
high-resolution satellite datasets. He is particu-
larly interested in the applications of various Deep Learning frameworks
to satellite image understanding and representations.
Wojciech Golab Wojciech Golab received his
PhD in Computer Science from the University of
Toronto in 2010. After a post-doctoral fellowship
at the University of Calgary, he spent two years
as a Research Scientist at Hewlett-Packard Labs
in Palo Alto, and then joined the University of
Waterloo in 2012 as an Assistant Professor in
Electrical and Computer Engineering. His re-
search agenda focuses on algorithmic problems
in distributed computing with applications to stor-
age and processing of big data. Dr. Golab's the-
oretical work on shared memory algorithms won the best paper award
at DISC 2008, and was distinguished by the ACM Computing Reviews
as one of 91 “notable computing items published in 2012”.

More Related Content

Similar to Adaptable SLA-Aware Consistency_Ayantu Debebe Ejeta.pdf

Mca & diplamo java titles
Mca & diplamo java titlesMca & diplamo java titles
Mca & diplamo java titlestema_solution
 
Mca & diplamo java titles
Mca & diplamo java titlesMca & diplamo java titles
Mca & diplamo java titlestema_solution
 
Mca & diplamo java titles
Mca & diplamo java titlesMca & diplamo java titles
Mca & diplamo java titlestema_solution
 
Mca & diplamo java titles
Mca & diplamo java titlesMca & diplamo java titles
Mca & diplamo java titlestema_solution
 
IRJET- A Survey on Remote Data Possession Verification Protocol in Cloud Storage
IRJET- A Survey on Remote Data Possession Verification Protocol in Cloud StorageIRJET- A Survey on Remote Data Possession Verification Protocol in Cloud Storage
IRJET- A Survey on Remote Data Possession Verification Protocol in Cloud StorageIRJET Journal
 
Mobility Information Series - Webservice Architecture Comparison by RapidValue
Mobility Information Series - Webservice Architecture Comparison by RapidValueMobility Information Series - Webservice Architecture Comparison by RapidValue
Mobility Information Series - Webservice Architecture Comparison by RapidValueRapidValue
 
Distributed Algorithms
Distributed AlgorithmsDistributed Algorithms
Distributed Algorithms913245857
 
Modified Active Monitoring Load Balancing with Cloud Computing
Modified Active Monitoring Load Balancing with Cloud ComputingModified Active Monitoring Load Balancing with Cloud Computing
Modified Active Monitoring Load Balancing with Cloud Computingijsrd.com
 
Performance and Cost Evaluation of an Adaptive Encryption Architecture for Cl...
Performance and Cost Evaluation of an Adaptive Encryption Architecture for Cl...Performance and Cost Evaluation of an Adaptive Encryption Architecture for Cl...
Performance and Cost Evaluation of an Adaptive Encryption Architecture for Cl...Editor IJLRES
 
Analysis of quality of service in cloud storage systems
Analysis of quality of service in cloud storage systemsAnalysis of quality of service in cloud storage systems
Analysis of quality of service in cloud storage systemsijfcstjournal
 
Analysis of quality of service in cloud storage systems
Analysis of quality of service in cloud storage systemsAnalysis of quality of service in cloud storage systems
Analysis of quality of service in cloud storage systemsijfcstjournal
 
A DDS-Based Scalable and Reconfigurable Framework for Cyber-Physical Systems
A DDS-Based Scalable and Reconfigurable Framework for Cyber-Physical SystemsA DDS-Based Scalable and Reconfigurable Framework for Cyber-Physical Systems
A DDS-Based Scalable and Reconfigurable Framework for Cyber-Physical Systemsijseajournal
 
Configuration Optimization for Big Data Software
Configuration Optimization for Big Data SoftwareConfiguration Optimization for Big Data Software
Configuration Optimization for Big Data SoftwarePooyan Jamshidi
 
Real time eventual consistency
Real time eventual consistencyReal time eventual consistency
Real time eventual consistencyijfcstjournal
 
2014 IEEE JAVA CLOUD COMPUTING PROJECT A stochastic model to investigate data...
2014 IEEE JAVA CLOUD COMPUTING PROJECT A stochastic model to investigate data...2014 IEEE JAVA CLOUD COMPUTING PROJECT A stochastic model to investigate data...
2014 IEEE JAVA CLOUD COMPUTING PROJECT A stochastic model to investigate data...IEEEFINALSEMSTUDENTPROJECTS
 
SERVICE LEVEL AGREEMENT BASED FAULT TOLERANT WORKLOAD SCHEDULING IN CLOUD COM...
SERVICE LEVEL AGREEMENT BASED FAULT TOLERANT WORKLOAD SCHEDULING IN CLOUD COM...SERVICE LEVEL AGREEMENT BASED FAULT TOLERANT WORKLOAD SCHEDULING IN CLOUD COM...
SERVICE LEVEL AGREEMENT BASED FAULT TOLERANT WORKLOAD SCHEDULING IN CLOUD COM...ijgca
 
SERVICE LEVEL AGREEMENT BASED FAULT TOLERANT WORKLOAD SCHEDULING IN CLOUD COM...
SERVICE LEVEL AGREEMENT BASED FAULT TOLERANT WORKLOAD SCHEDULING IN CLOUD COM...SERVICE LEVEL AGREEMENT BASED FAULT TOLERANT WORKLOAD SCHEDULING IN CLOUD COM...
SERVICE LEVEL AGREEMENT BASED FAULT TOLERANT WORKLOAD SCHEDULING IN CLOUD COM...ijgca
 
SERVICE LEVEL AGREEMENT BASED FAULT TOLERANT WORKLOAD SCHEDULING IN CLOUD COM...
SERVICE LEVEL AGREEMENT BASED FAULT TOLERANT WORKLOAD SCHEDULING IN CLOUD COM...SERVICE LEVEL AGREEMENT BASED FAULT TOLERANT WORKLOAD SCHEDULING IN CLOUD COM...
SERVICE LEVEL AGREEMENT BASED FAULT TOLERANT WORKLOAD SCHEDULING IN CLOUD COM...ijgca
 
IEEE 2014 JAVA CLOUD COMPUTING PROJECTS A stochastic model to investigate dat...
IEEE 2014 JAVA CLOUD COMPUTING PROJECTS A stochastic model to investigate dat...IEEE 2014 JAVA CLOUD COMPUTING PROJECTS A stochastic model to investigate dat...
IEEE 2014 JAVA CLOUD COMPUTING PROJECTS A stochastic model to investigate dat...IEEEGLOBALSOFTSTUDENTPROJECTS
 

Similar to Adaptable SLA-Aware Consistency_Ayantu Debebe Ejeta.pdf (20)

Mca & diplamo java titles
Mca & diplamo java titlesMca & diplamo java titles
Mca & diplamo java titles
 
Mca & diplamo java titles
Mca & diplamo java titlesMca & diplamo java titles
Mca & diplamo java titles
 
Mca & diplamo java titles
Mca & diplamo java titlesMca & diplamo java titles
Mca & diplamo java titles
 
Mca & diplamo java titles
Mca & diplamo java titlesMca & diplamo java titles
Mca & diplamo java titles
 
IRJET- A Survey on Remote Data Possession Verification Protocol in Cloud Storage
IRJET- A Survey on Remote Data Possession Verification Protocol in Cloud StorageIRJET- A Survey on Remote Data Possession Verification Protocol in Cloud Storage
IRJET- A Survey on Remote Data Possession Verification Protocol in Cloud Storage
 
Fault tolerance on cloud computing
Fault tolerance on cloud computingFault tolerance on cloud computing
Fault tolerance on cloud computing
 
Mobility Information Series - Webservice Architecture Comparison by RapidValue
Mobility Information Series - Webservice Architecture Comparison by RapidValueMobility Information Series - Webservice Architecture Comparison by RapidValue
Mobility Information Series - Webservice Architecture Comparison by RapidValue
 
Distributed Algorithms
Distributed AlgorithmsDistributed Algorithms
Distributed Algorithms
 
Modified Active Monitoring Load Balancing with Cloud Computing
Modified Active Monitoring Load Balancing with Cloud ComputingModified Active Monitoring Load Balancing with Cloud Computing
Modified Active Monitoring Load Balancing with Cloud Computing
 
Performance and Cost Evaluation of an Adaptive Encryption Architecture for Cl...
Performance and Cost Evaluation of an Adaptive Encryption Architecture for Cl...Performance and Cost Evaluation of an Adaptive Encryption Architecture for Cl...
Performance and Cost Evaluation of an Adaptive Encryption Architecture for Cl...
 
Analysis of quality of service in cloud storage systems
Analysis of quality of service in cloud storage systemsAnalysis of quality of service in cloud storage systems
Analysis of quality of service in cloud storage systems
 
Analysis of quality of service in cloud storage systems
Analysis of quality of service in cloud storage systemsAnalysis of quality of service in cloud storage systems
Analysis of quality of service in cloud storage systems
 
A DDS-Based Scalable and Reconfigurable Framework for Cyber-Physical Systems
A DDS-Based Scalable and Reconfigurable Framework for Cyber-Physical SystemsA DDS-Based Scalable and Reconfigurable Framework for Cyber-Physical Systems
A DDS-Based Scalable and Reconfigurable Framework for Cyber-Physical Systems
 
Configuration Optimization for Big Data Software
Configuration Optimization for Big Data SoftwareConfiguration Optimization for Big Data Software
Configuration Optimization for Big Data Software
 
Real time eventual consistency
Real time eventual consistencyReal time eventual consistency
Real time eventual consistency
 
2014 IEEE JAVA CLOUD COMPUTING PROJECT A stochastic model to investigate data...
2014 IEEE JAVA CLOUD COMPUTING PROJECT A stochastic model to investigate data...2014 IEEE JAVA CLOUD COMPUTING PROJECT A stochastic model to investigate data...
2014 IEEE JAVA CLOUD COMPUTING PROJECT A stochastic model to investigate data...
 
SERVICE LEVEL AGREEMENT BASED FAULT TOLERANT WORKLOAD SCHEDULING IN CLOUD COM...
SERVICE LEVEL AGREEMENT BASED FAULT TOLERANT WORKLOAD SCHEDULING IN CLOUD COM...SERVICE LEVEL AGREEMENT BASED FAULT TOLERANT WORKLOAD SCHEDULING IN CLOUD COM...
SERVICE LEVEL AGREEMENT BASED FAULT TOLERANT WORKLOAD SCHEDULING IN CLOUD COM...
 
SERVICE LEVEL AGREEMENT BASED FAULT TOLERANT WORKLOAD SCHEDULING IN CLOUD COM...
SERVICE LEVEL AGREEMENT BASED FAULT TOLERANT WORKLOAD SCHEDULING IN CLOUD COM...SERVICE LEVEL AGREEMENT BASED FAULT TOLERANT WORKLOAD SCHEDULING IN CLOUD COM...
SERVICE LEVEL AGREEMENT BASED FAULT TOLERANT WORKLOAD SCHEDULING IN CLOUD COM...
 
SERVICE LEVEL AGREEMENT BASED FAULT TOLERANT WORKLOAD SCHEDULING IN CLOUD COM...
SERVICE LEVEL AGREEMENT BASED FAULT TOLERANT WORKLOAD SCHEDULING IN CLOUD COM...SERVICE LEVEL AGREEMENT BASED FAULT TOLERANT WORKLOAD SCHEDULING IN CLOUD COM...
SERVICE LEVEL AGREEMENT BASED FAULT TOLERANT WORKLOAD SCHEDULING IN CLOUD COM...
 
IEEE 2014 JAVA CLOUD COMPUTING PROJECTS A stochastic model to investigate dat...
IEEE 2014 JAVA CLOUD COMPUTING PROJECTS A stochastic model to investigate dat...IEEE 2014 JAVA CLOUD COMPUTING PROJECTS A stochastic model to investigate dat...
IEEE 2014 JAVA CLOUD COMPUTING PROJECTS A stochastic model to investigate dat...
 

More from tadeseguchi

300-710-SNCF.pdf
300-710-SNCF.pdf300-710-SNCF.pdf
300-710-SNCF.pdftadeseguchi
 
Unmaintained-Design-Icons_v2.0.pptx
Unmaintained-Design-Icons_v2.0.pptxUnmaintained-Design-Icons_v2.0.pptx
Unmaintained-Design-Icons_v2.0.pptxtadeseguchi
 
200+ [updated] computer networking mc qs and answers 2020
200+ [updated] computer networking mc qs and answers 2020200+ [updated] computer networking mc qs and answers 2020
200+ [updated] computer networking mc qs and answers 2020tadeseguchi
 
Operating system mc qs
Operating system mc qsOperating system mc qs
Operating system mc qstadeseguchi
 
300+ top database management system questions and answers 2020
300+ top database management system questions and answers 2020300+ top database management system questions and answers 2020
300+ top database management system questions and answers 2020tadeseguchi
 
200+ [updated] computer networking mc qs and answers 2020
200+ [updated] computer networking mc qs and answers 2020200+ [updated] computer networking mc qs and answers 2020
200+ [updated] computer networking mc qs and answers 2020tadeseguchi
 
300+ top data structures and algorithms mc qs pdf 2020
300+ top data structures and algorithms mc qs pdf 2020300+ top data structures and algorithms mc qs pdf 2020
300+ top data structures and algorithms mc qs pdf 2020tadeseguchi
 

More from tadeseguchi (8)

300-710-SNCF.pdf
300-710-SNCF.pdf300-710-SNCF.pdf
300-710-SNCF.pdf
 
Unmaintained-Design-Icons_v2.0.pptx
Unmaintained-Design-Icons_v2.0.pptxUnmaintained-Design-Icons_v2.0.pptx
Unmaintained-Design-Icons_v2.0.pptx
 
200+ [updated] computer networking mc qs and answers 2020
200+ [updated] computer networking mc qs and answers 2020200+ [updated] computer networking mc qs and answers 2020
200+ [updated] computer networking mc qs and answers 2020
 
Operating system mc qs
Operating system mc qsOperating system mc qs
Operating system mc qs
 
300+ top database management system questions and answers 2020
300+ top database management system questions and answers 2020300+ top database management system questions and answers 2020
300+ top database management system questions and answers 2020
 
200+ [updated] computer networking mc qs and answers 2020
200+ [updated] computer networking mc qs and answers 2020200+ [updated] computer networking mc qs and answers 2020
200+ [updated] computer networking mc qs and answers 2020
 
300+ top data structures and algorithms mc qs pdf 2020
300+ top data structures and algorithms mc qs pdf 2020300+ top data structures and algorithms mc qs pdf 2020
300+ top data structures and algorithms mc qs pdf 2020
 
Getdataback
GetdatabackGetdataback
Getdataback
 

Recently uploaded

Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...Neo4j
 
GenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationGenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationMichael W. Hawkins
 
Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...
Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...
Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...HostedbyConfluent
 
How to convert PDF to text with Nanonets
How to convert PDF to text with NanonetsHow to convert PDF to text with Nanonets
How to convert PDF to text with Nanonetsnaman860154
 
The Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptxThe Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptxMalak Abu Hammad
 
Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024BookNet Canada
 
Data Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonData Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonAnna Loughnan Colquhoun
 
Maximizing Board Effectiveness 2024 Webinar.pptx
Maximizing Board Effectiveness 2024 Webinar.pptxMaximizing Board Effectiveness 2024 Webinar.pptx
Maximizing Board Effectiveness 2024 Webinar.pptxOnBoard
 
#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024BookNet Canada
 
Presentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreterPresentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreternaman860154
 
A Call to Action for Generative AI in 2024
A Call to Action for Generative AI in 2024A Call to Action for Generative AI in 2024
A Call to Action for Generative AI in 2024Results
 
Google AI Hackathon: LLM based Evaluator for RAG
Google AI Hackathon: LLM based Evaluator for RAGGoogle AI Hackathon: LLM based Evaluator for RAG
Google AI Hackathon: LLM based Evaluator for RAGSujit Pal
 
08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking MenDelhi Call girls
 
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Drew Madelung
 
Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...
Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...
Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...gurkirankumar98700
 
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking MenDelhi Call girls
 
Boost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityBoost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityPrincipled Technologies
 
Unblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen FramesUnblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen FramesSinan KOZAK
 
Histor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slideHistor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slidevu2urc
 
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure serviceWhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure servicePooja Nehwal
 

Recently uploaded (20)

Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
 
GenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationGenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day Presentation
 
Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...
Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...
Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...
 
How to convert PDF to text with Nanonets
How to convert PDF to text with NanonetsHow to convert PDF to text with Nanonets
How to convert PDF to text with Nanonets
 
The Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptxThe Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptx
 
Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
 
Data Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonData Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt Robison
 
Maximizing Board Effectiveness 2024 Webinar.pptx
Maximizing Board Effectiveness 2024 Webinar.pptxMaximizing Board Effectiveness 2024 Webinar.pptx
Maximizing Board Effectiveness 2024 Webinar.pptx
 
#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
 
Presentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreterPresentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreter
 
A Call to Action for Generative AI in 2024
A Call to Action for Generative AI in 2024A Call to Action for Generative AI in 2024
A Call to Action for Generative AI in 2024
 
Google AI Hackathon: LLM based Evaluator for RAG
Google AI Hackathon: LLM based Evaluator for RAGGoogle AI Hackathon: LLM based Evaluator for RAG
Google AI Hackathon: LLM based Evaluator for RAG
 
08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men
 
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
 
Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...
Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...
Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...
 
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
 
Boost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityBoost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivity
 
Unblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen FramesUnblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen Frames
 
Histor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slideHistor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slide
 
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure serviceWhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
 

Adaptable SLA-Aware Consistency_Ayantu Debebe Ejeta.pdf

  • 1. 2332-7790 (c) 2016 IEEE. Personal use is permitted, but republication/redistribution requires IEEE permission. See http://www.ieee.org/publications_standards/publications/rights/index.html for more information. This article has been accepted for publication in a future issue of this journal, but has not been fully edited. Content may change prior to final publication. Citation information: DOI 10.1109/TBDATA.2017.2656121, IEEE Transactions on Big Data IEEE TRANSACTIONS ON BIG DATA, VOL. 3, NO. 1, DECEMBER 2016 1 Adaptable SLA-Aware Consistency Tuning for Quorum-replicated Datastores Subhajit Sidhanta, Wojciech Golab, Supratik Mukhopadhyay, and Saikat Basu Abstract—Users of distributed datastores that employ quorum-based replication are burdened with the choice of a suitable client-centric consistency setting for each storage operation. The above matching choice is difficult to reason about as it requires deliberating about the tradeoff between the latency and staleness, i.e., how stale (old) the result is. The latency and staleness for a given operation depend on the client-centric consistency setting applied, as well as dynamic parameters such as the current workload and network condition. We present OptCon, a machine learning-based predictive framework, that can automate the choice of client-centric consistency setting under user-specified latency and staleness thresholds given in the service level agreement (SLA). Under a given SLA, OptCon predicts a client-centric consistency setting that is matching, i.e., it is weak enough to satisfy the latency threshold, while being strong enough to satisfy the staleness threshold. While manually tuned consistency settings remain fixed unless explicitly reconfigured, OptCon tunes consistency settings on a per-operation basis with respect to changing workload and network state. Using decision tree learning, OptCon yields 0.14 cross validation error in predicting matching consistency settings under latency and staleness thresholds given in the SLA. We demonstrate experimentally that OptCon is at least as effective as any manually chosen consistency settings in adapting to the SLA thresholds for different use cases. We also demonstrate that OptCon adapts to variations in workload, whereas a given manually chosen fixed consistency setting satisfies the SLA only for a characteristic workload. Index Terms—distributed databases, storage management, distributed systems,, storage management high availability, distributed file systems, middleware, performance evaluation, machine learning, reliability, availability, and serviceability ✦ 1 INTRODUCTION Many quorum-based distributed data stores [1, 2, 3], allow the developers to explicitly declare the desired client- centric consistency setting (i.e., consistency observed from the viewpoint of the client application) for an operation. Such systems accept the client-centric consistency settings for an operation in the form of a runtime argument, typically referred to as the consistency level. The performance of the system, with respect to a given operation, is affected by the choice of the consistency level applied [4]. From the viewpoint of the user, the most important per- formance parameters affected by the consistency level ap- plied are the latency and client-observed consistency anoma- lies, i.e., anomalies in the result of an operation observed from the client application, such as stale reads. 
eness [5], i.e., how stale (old) the version of the data item observed from the client application is with respect to the most recent version. According to the consistency level applied, the system waits for coordination among a specific number of replicas containing copies (i.e., versions) of the data item accessed by the given operation [1]. If the system waits for coordination among a smaller number of replicas, the chance of getting a stale result (i.e., an older version) increases. Also, the latency for the given operation depends on the waiting time for the above coordination, and hence, in turn, on the consistency level applied.

• S. Sidhanta is currently with INESC-ID / IST, University of Lisbon, Portugal. This work was done while S. Sidhanta was with Louisiana State University. E-mail: ssidhanta@gsd.inesc-id.pt
• S. Mukhopadhyay is with Louisiana State University, USA. Email: supratik@csc.lsu.edu
• S. Basu was with Louisiana State University, USA. Email: sbasu8@lsu.edu
• W. Golab is with University of Waterloo, Canada. Email: wgolab@uwaterloo.ca
Manuscript received May 5, 2005; revised October 26, 2016.

For example, a weak consistency level for a read operation in Cassandra [1], like READ ONE, requires only one of the replicas to coordinate successfully, resulting in low latency and a high chance of a stale read. Hence, while choosing the consistency level, developers must consider how this choice affects the latency and staleness for a given operation. The chosen consistency level must be matching with respect to the latency and staleness thresholds specified in the given service level agreement (SLA), i.e., it must be weak enough to satisfy the latency threshold, while being strong enough to satisfy the staleness threshold. Consider a typical use case at Netflix where a user browses recommended movie titles. Such use cases require real-time response [6]. Hence the SLA typically comprises a low latency threshold and a higher staleness threshold. For the given SLA, workload, and network state, the matching choice is a weak read-write consistency level (like ONE/ANY in Cassandra). If the developer applies a stronger consistency level, the resulting high latency might violate the SLA. With the current state of the art, developers have to manually determine a matching consistency level for a given operation at development time. Reasoning about the above choice is difficult because of: 1) the large number of possible SLAs, 2) unpredictable factors like changing network state and varying workload that impact the latency and staleness, and 3) the absence of a well-formed mathematical relation connecting the above parameters [7]. This makes automated consistency tuning under latency and staleness thresholds in the SLA a highly desirable feature for quorum-based
  • 2. 2332-7790 (c) 2016 IEEE. Personal use is permitted, but republication/redistribution requires IEEE permission. See http://www.ieee.org/publications_standards/publications/rights/index.html for more information. This article has been accepted for publication in a future issue of this journal, but has not been fully edited. Content may change prior to final publication. Citation information: DOI 10.1109/TBDATA.2017.2656121, IEEE Transactions on Big Data IEEE TRANSACTIONS ON BIG DATA, VOL. 3, NO. 1, DECEMBER 2016 2 datastores. We present OptCon [8], a framework that automatically determines a matching consistency level for a given opera- tion under a given SLA. Due to the absence of a closed-form mathematical model capturing the impact of the consistency level, current workload, and network state, on the observed latency and staleness, OptCon applies machine learning [9] to train a model for predicting a matching consistency level under the given SLA, workload, and network state. For the Netflix use case, taking into account the read- heavy workload, the current network state, and the SLA thresholds, OptCon predicts a weak consistency level. The contributions of this paper are: • We introduce OptCon, a novel machine learning- based framework, that can automatically predict a matching consistency level that satisfies the latency and staleness thresholds specified in a given SLA, i.e., the predicted consistency level is weak enough to satisfy the latency threshold, and strong enough to satisfy the staleness threshold. Using decision tree learning, OptCon yields a cross validation error of 0.14 in predicting a matching consistency level under the given SLA. • Experimental results demonstrate that OptCon is at least as effective as any manually chosen con- sistency level in adapting to the different latency and staleness thresholds in SLAs. Furthermore, we also demonstrate experimentally that OptCon sur- passes any manually chosen fixed consistency level in adapting to a varying workload, i.e., OptCon satisfies the SLA thresholds for variations in the workload, whereas a manually chosen consistency level satisfies the SLA for only a subset of workload types. 2 MOTIVATION 2.1 Choice of the SLA Parameters Following prior research by Terry et al. [4], we include latency and staleness in the SLA for operations on quorum- based stores. The choice of consistency level directly affects the latency for a given operation on a quorum-based data- store [1]. While Terry et al. [4] use categorical attributes for specifying the desired consistency (such as read-my-write, eventual, etc.) in the SLA, we use a more fine-grained SLA, that accepts threshold values for the client-centric staleness [10]. Golab et al. [10] demonstrate that both the propor- tion and severity of stale results increases from stronger to weaker consistency levels. The use of a client-centric staleness metric in the SLA enables the developer to specify, on a per-operation basis, the exact degree of staleness of results, that a given client application can tolerate. Following Terry et al. [4], we do not include throughput in the SLA. But it is still desirable to have throughput [11] as a secondary optimization criterion, once the SLA thresholds are satisfied. Depending on the SLA, a set of consistency levels can be matching with respect to the latency and staleness thresholds given in the SLA. 
Consider a real-time application (such as an online shopping cart) that demands moderately low latency and tolerates relatively higher staleness in the SLA. In such cases, any weaker consistency level (like ONE or TWO in Cassandra, depending on the replication factor) can yield latency and staleness values within the given SLA thresholds. In such cases of multiple matching consistency levels, OptCon chooses the one that maximizes throughput. Also, we do not consider parameters like replication factor, number of server nodes, and certain other server-centric parameters in our model, since our focus is optimizing client-centric performance [10].

TABLE 1: Glossary of OptCon Model Parameters
L    Average latency, i.e., average time delay for a storage operation over a one-minute time period
T    Average throughput, i.e., operations/second, over a one-minute time period
S    Client-centric staleness, given by the 95th percentile Γ score [10]
Tc   Thread count, i.e., number of user threads spawned by the storage operations invoked from the client application
P    Packet count, i.e., number of network packets transmitted to execute a given operation
RW   Read-write proportion in the client workload
C    Consistency level applied for an operation

2.2 Motivation for Automated Consistency Tuning
The large number of possible use cases [11], each comprising different SLA thresholds for latency and staleness, makes manual determination of a matching consistency level a complex process. The latency and staleness are also affected [7] by the following independent variables (parameters): 1) the packet count, i.e., the number of packets transmitted during a given operation, which represents the network state [1], 2) the read proportion, i.e., the proportion of reads in the workload, and 3) the thread count. Parameters 2 and 3 represent the workload characteristics. We list the parameters in Table 1. Manually reasoning about all of these parameters combined together is difficult, even for a skilled and experienced developer. Further, unpredictable variations in workload, network congestion, and node failures may render a particular consistency level, manually pre-configured at development time, infeasible at runtime [12, 4].

3 CHALLENGES
Currently, there is no closed-form mathematical model [7] that represents the effect of the consistency level on the observed staleness and latency, with respect to the independent variables. Bailis et al. [7] base their work upon the simplifying assumption that writes do not execute concurrently with other operations. Hence, it is not clear from [7] how to compute the latency and staleness from a real workload. Pileus and Tuba [4, 13] are the only systems that provide fine-grained consistency tuning using SLAs. But, instead of predicting, these systems perform actual trials on the system, and select the consistency level corresponding to the SLA row that produces the maximum resultant utility, based on the trial outcomes. The trial-based technique can produce unreliable results due to the unpredictable parameters, like network conditions and workload, that affect the observed latency and staleness.
Thus, predictions based on the outcomes of the trial phase may be unsuitable at the actual running time of the operation. Sivaramakrishnan et al. [14] use a static analysis approach to determine the weakest consistency level that satisfies the correctness conditions declared in the form of a contract, i.e., a specification of client-centric consistency properties that the application must satisfy. They do not consider the tradeoff between staleness and latency, and cannot dynamically adapt to varying workload and network state.

4 DESIGN OF OPTCON

4.1 Design Overview
In the absence of a closed-form mathematical model relating the consistency level to the observed latency and staleness, OptCon leverages machine learning-based prediction techniques [9], following the footsteps of prior research [15]. OptCon is the first work that trains on historic data and learns a model M relating the consistency level and the input parameters (i.e., the workload and network state) to the observed latency and staleness. The model M can predict a matching consistency level with respect to the given SLA thresholds for latency and staleness, under the current workload and network state. A matching consistency level is weak enough to satisfy the latency threshold, and simultaneously strong enough to satisfy the staleness threshold. Our definition of a matching consistency level is a modified version of the correct consistency level in QUELA [14]. For multiple matching consistency levels, OptCon predicts the consistency level that maximizes the throughput. In contrast to program verification-based static (compile time) approaches [14], OptCon provides on-the-fly predictions based on dynamic (runtime) values of the input parameters, i.e., the current workload and the network state (Section 2.2). Thus, OptCon tunes the datastore according to any dynamic variations in the workload and any given SLA. Furthermore, unlike the state-of-the-art trial-based techniques that base decisions on the values of the parameters obtained during the trial phase [4, 13], OptCon provides reliable predictions (see Sections 6.9 and 6.10), taking into consideration the actual runtime values of the input parameters.

TABLE 2: Example subSLAs
Average Latency   Staleness
≤ 100 ms          ≤ 5 ms
≤ 50 ms           ≤ 10 ms
≤ 25 ms           ≤ 15 ms

Like Terry et al. [4], users provide latency and staleness thresholds to OptCon in the form of an SLA, illustrated in Table 2. Each SLA consists of rows of subSLAs, where each subSLA row, in turn, comprises two columns containing the thresholds for latency and staleness, respectively. Instead of categorical attributes [4], OptCon uses threshold values for staleness [10] in the subSLAs. Unlike Terry et al., subSLAs in OptCon do not have a utility column. Also, subSLAs are not necessarily ordered. If any one (or all) of the thresholds for a particular subSLA are violated, OptCon marks the operation as failed with respect to the given subSLA.
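To make the per-operation subSLA check concrete, the sketch below issues a Cassandra read at an explicitly chosen consistency level and tests the observed latency against one subSLA row of Table 2. This is only an illustration, not OptCon code: the class and field names (SubSla, latencyMs, stalenessMs), the keyspace, and the query are hypothetical, the staleness value is a placeholder for a Logger-supplied Γ score, and the calls assume the DataStax Java driver 3.x API.

import com.datastax.driver.core.Cluster;
import com.datastax.driver.core.ConsistencyLevel;
import com.datastax.driver.core.Session;
import com.datastax.driver.core.SimpleStatement;

public class SubSlaSketch {
    // Hypothetical holder for one subSLA row of Table 2.
    static class SubSla {
        final long latencyMs, stalenessMs;
        SubSla(long latencyMs, long stalenessMs) { this.latencyMs = latencyMs; this.stalenessMs = stalenessMs; }
    }

    public static void main(String[] args) {
        SubSla sla = new SubSla(100, 5);                       // first row of Table 2
        try (Cluster cluster = Cluster.builder().addContactPoint("127.0.0.1").build();
             Session session = cluster.connect("demo_ks")) {
            SimpleStatement read = new SimpleStatement("SELECT value FROM kv WHERE key = 'k1'");
            // The consistency level is a per-operation runtime argument;
            // OptCon would substitute its predicted level here instead of ONE.
            read.setConsistencyLevel(ConsistencyLevel.ONE);
            long start = System.nanoTime();
            session.execute(read);
            long latencyMs = (System.nanoTime() - start) / 1_000_000;
            long stalenessMs = 0;                              // placeholder: the Gamma score would come from the Logger
            boolean failed = latencyMs > sla.latencyMs || stalenessMs > sla.stalenessMs;
            System.out.println("latency=" + latencyMs + " ms, subSLA failed=" + failed);
        }
    }
}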
The handling of such failure cases will depend on the application logic, and is not part of the scope of this paper. Instead of modifying the source code of distributed data- stores, OptCon is designed as a wrapper over existing dis- tributed storage systems. We have implemented and tested OptCon on Cassandra, which follows the Dynamo [16] model. Among the Dynamo-style systems, while Cassandra organises the data in the form of column families, Voldemort [17] and Riak [2] are strictly key-value stores. Also, Cassan- dra and Riak are designed to enable faster writes, whereas Voldemort enables faster reads. But the principles governing the internal synchronization mechanisms are similar for all these systems. Hence the consistency level choices affect the observed latency and staleness in a similar fashion for all these systems. Thus, OptCon can potentially be integrated with any Dynamo-style system. 4.2 Architecture Fig. 1: The architecture of OptCon: rectangles denote mod- ules and the folded rectangle denotes a dataset. OptCon (refer to Figure 1) consists of the following modules: 1) The Logger module records the independent variables (Section 2.2), the observed latency, and staleness, obtained using experiments performed on the datastore with benchmark workloads. It collates these parameters into a training data corpus, which acts as the training dataset. 2) The Learner module learns the model M (refer Section 6.3) from the training dataset, and predicts a matching consistency level that maximizes the throughput. 3) The Client module calls the other modules, and executes the given operation on the datastore. During the training phase, the Client module runs a simulation workload on the given quorum-based datastore. The Client calls the Logger module (Figure 1) to collect the independent variables (parameters) and the observed parameters for the operation from the JMX interface [1], and appends these parameters into the training data. The Client then calls the Learner module, which trains the model M from the training data applying machine learning techniques. During the prediction phase, the Client calls the Learner, with runtime arguments comprising the sub- SLA thresholds and the current values of the independent variables. The subSLA (and also the SLA) can be varied on a per-operation basis. The Learner predicts a matching consistency level from the learnt model M. The Client performs the given operation on the datastore with the predicted consistency level. 4.3 Design of the Logger Module 4.3.1 Model Parameters Collected by the Logger For the reasons explained in Section 2.1, the latency L, i.e., time delay for the operation, and client-centric staleness
  • 4. 2332-7790 (c) 2016 IEEE. Personal use is permitted, but republication/redistribution requires IEEE permission. See http://www.ieee.org/publications_standards/publications/rights/index.html for more information. This article has been accepted for publication in a future issue of this journal, but has not been fully edited. Content may change prior to final publication. Citation information: DOI 10.1109/TBDATA.2017.2656121, IEEE Transactions on Big Data IEEE TRANSACTIONS ON BIG DATA, VOL. 3, NO. 1, DECEMBER 2016 4 S, are considered as the model parameters that need to satisfy a given SLA. The throughput T is considered as a secondary optimization criterion. As discussed in Section 2.1, server centric parameters, like replication factor and keyspace size, are not considered. Following the reasoning in Section 2.2, the independent variables (parameters) for learning the model M are: 1) the read proportion (RW ) in the operation, 2) the thread count (Tc), i.e., the number of user threads spawned by the client, 3) the packet count (P), i.e., the number of network packets transmitted during the operation, and 4) the consistency level C, provided as a categorical attribute containing attribute values specific to the given datastore. Most operations on quorum based stores are not instan- taneous, but are executed over certain time intervals [10]. Following [7], the Logger uses the average latency over a time interval of one minute for measuring L. The Logger computes S in terms of the Γ metric of Golab et al. [10]. As demonstrated in the above work, Γ is preferred over other client-centric staleness measures for its proven sensitivity to workload parameters and consistency level. Section 6.4 establishes the significance of the above parameters with respect to the OptCon model. Section 6.5 demonstrates the accuracy of the model comprising the above parameters. 4.3.2 Computation of Γ as the Metric for Client-centric Stal- eness The metric Γ is based upon Lamport’s atomicity property [18], which states that operations appear to take effect in some total order that reflects their “happens before” relation in the following sense: if operation A finishes before opera- tion B starts then the effect of A must be visible before the effect of B. The gamma metric defines observed consistency in terms of the responses of storage operations observed from the clients, and does not refer directly to server- side implementation details of the datastore, like replication factor, etc. As a result, the metric is implementation inde- pendent, i.e., it can be applied to any key-value datastore. Γ is a bound on time-based staleness, i.e., every read returns a value at most Γ time units stale. A Γ score quantifies the bound on the timestamp difference between operations on the same key that return different values. We say that a trace of operations recorded by the logger is Γ-atomic if it is atomic in Lamport’s sense, or becomes atomic after each operation in the trace is transformed by decreasing its starting time by Γ/2 time units, and increasing its finish time by Γ/2 time units. In this context the start and finish times are shifted in a mathematical sense for the purpose of analyzing a trace, and do not imply injection of artificial delays; the behavior of the storage system is unaffected. The per-key Γ score quantifies the degree of client-centric staleness incurred by operations applied to a particular key. 
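As an illustration with hypothetical timestamps (not taken from our traces): suppose a write of version v2 to key k occupies the interval [90 ms, 100 ms], and a later read of k occupies [112 ms, 115 ms] but still returns the older version v1. Since the write finishes before the read starts, Lamport atomicity requires the read to return v2, so the trace is not atomic. After decreasing every start time by Γ/2 and increasing every finish time by Γ/2, the two operations overlap once 112 − Γ/2 ≤ 100 + Γ/2, i.e., once Γ ≥ 12 ms; the per-key Γ score for this trace is therefore 12 ms, bounding the staleness observed by the read at 12 ms.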
We considered average per-key Γ, but settled for percentile per-key Γ since it takes into account the skewed nature of the workloads. 4.4 Design of the Learner Module In OptCon, building the model M from training examples (Figure 1) is a one-time process, i.e., the same model can be reused for predicting consistency levels for all operations. Even with node failures, the model may not need to be regenerated since the training data remain unchanged, only the failed predictions, resulting due to node failure, need to be rerun. However, the model will need to be retrained in certain cases where the behavior of the system with respect to the latency and staleness changes, such as when a storage system receives a software or hardware upgrade. 4.4.1 Overview of Possible Learning Techniques We provide a brief overview of the possible Learning tech- niques and their applicability to OptCon. We describe the implementation details, including the various configuration parameters, of each learning algorithm in Section 5. Perfor- mance analysis and insights gathered from each technique are given in Section 6.5.2. We consider Logistic regression and Bayesian learning as they can help visualize the signifi- cance of the model parameters and the dependency relations among these parameters, and thus can provide intuition to the developer for developing complex and more accurate models [9]. We also consider Decision Tree, Random For- est, and Artificial Neural Networks (ANN), since they can produce more accurate predictions, being directly computed from the data. We consider these approaches in the order of their performance given in Table 3, in terms of various model selection metrics. We leave the choice of a suitable learning technique to the developer, rendering flexibility to the framework. Based on our evaluation of the learning algorithms, developers can choose the learning technique that best suits the respective application domain and use case. Assuming linear dependency, we start with the linear regression technique for training the model in OptCon, in an attempt to come up with a simple mathematical model that expresses the effect of RW , T c, and P , on latency and staleness for an operation, with different values of client side consistency level C. Also, linear regression can provide preliminary insights on the relationship among the model parameters, and act as a basis for designing further complicated modelling algorithms, i.e., it can help us un- derstand the dependency relationships between parameters that can be used as a baseline for learning algorithms, for example, Bayesian learning which require assuming some prior distribution of independent parameters. Next, with logistic regression approach, we fit two logit (i.e., logistic) functions for the two dependent variables L and S as follows: logit (π (L)) = β0 + β1C + β2RW + β3P + β4T c and logit (π (S)) = β4 + β5C + β6RW + β7P + β8T c, where C is the applied consistency level, L is the observed latency, S is the observed staleness, RW is the proportion of reads, P is number of packets transmitted, and T c is the number of threads during an operation. βi are the coefficients estimated by iterative least square estimation, and π is the likelihood function. Using ordinary least square estimation, we iteratively minimize a loss function given by the difference of the observed and estimated values, with respect to the training dataset. For eliminating overfitting, we perform L1 regularization with the Lasso method. 
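For reference, the lasso step can be written in its standard penalized-likelihood form (the exact objective is not spelled out above, so this is the textbook formulation rather than a quote of the implementation; it matches the Λ-penalized deviance description used later in Section 6.8): the coefficients are chosen as

    β̂ = argmin_β [ −(1/n) Σ_{i=1..n} log π_i(β) + Λ Σ_j |β_j| ],

where n is the number of training examples and π_i(β) is the likelihood of the i-th observation. Larger values of the regularization weight Λ drive more coefficients exactly to zero, which is what mitigates overfitting here.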
Next we consider the decision tree learning algorithm because of its simplicity, speed, and accuracy. The given problem can be viewed as a classification problem, which
  • 5. 2332-7790 (c) 2016 IEEE. Personal use is permitted, but republication/redistribution requires IEEE permission. See http://www.ieee.org/publications_standards/publications/rights/index.html for more information. This article has been accepted for publication in a future issue of this journal, but has not been fully edited. Content may change prior to final publication. Citation information: DOI 10.1109/TBDATA.2017.2656121, IEEE Transactions on Big Data IEEE TRANSACTIONS ON BIG DATA, VOL. 3, NO. 1, DECEMBER 2016 5 classifies the training dataset into classes based on whether the observed latency and staleness (corresponding to each row) fall within the given SLA thresholds. In fact, the problem statement, i.e., choosing a matching consistency level, can be viewed as a classification problem. The mod- elling technique using Decision Tree comprises the fol- lowing phases: 1) Labelling: On the fly computations for maximizing throughput T (i.e., the secondary optimization criteria) in the prediction phase would introduce consider- able overhead. Since the latency L is considered before T is maximized, the additional overhead may result in violation of the latency threshold in the SLA. Hence, OptCon per- forms the computations for T in the labelling phase itself. Using exhaustive search, we label each row in the training dataset with the highest T corresponding to a given values of RW , Tc and P , such that the SLA parameters L and S are within the given SLA thresholds. 2) Training: Next we apply decision tree learning for training a model M from the labelled dataset. Use of error pruning techniques mitigates issues of overfitting [9]. Next, we consider the random forest algorithm that ap- plies ensemble learning to bootstrap multiple weak decision tree learners to generate a strong classifier. It iteratively chooses random samples with replacement from the train- ing dataset, and trains a decision tree on each of the samples. We take the average of the predictions from each decision tree to obtain the final prediction with respect to a test dataset. For the Bayesian approach, we make the following simplifying assumptions to make our problem suitable for Bayesian learning: A1) a prior logistic distribution exists for each of RW , T c, P , and C (see Section 4), A2) the posterior distributions for L, and S are conditionally dependent on the joint distribution of hRW, T c, P, Ci, and A3) the target features L and S are conditionally independent. With these approximations, we plug-in the resultant dependency graph from the Logistic Regression approach as input to the Bayesian learner. Despite these assumptions and approxi- mations, the Bayesian technique can provide intuition to the developers, that can be used to build a more accurate model. Next, we consider Artificial Neural Networks (ANN) for learning a model M. We consider a two-layer perceptron network for ANN; we do not consider Deep Learning be- cause 1) the number of features are manageable for a simple ANN, 2) the size of training dataset is also manageable for ANN, and 3) simple ANN gives satisfactory accuracy, and 4) training delay would be considerably high with deep learning. Though the training phase for ANN is not a factor (since training occurs once), the overhead for the prediction phase is still considerably high (Section 6.3) due to model complexity. 5 IMPLEMENTATION OptCon is developed as a Java-based wrapper over Cas- sandra v2.1.0. 
The source code can be found in the github repositories located at https://github.com/ssidhanta/. The Logger module is implemented as an extension over the YCSB (Yahoo Cloud Serving Benchmark) 0.1.4 [19] frame- work. It runs as a daemon on top of the datastore, listening for the parameters using the built-in JMX Interface. Logistic Regression, SVM, and Neural Network implementations of the Learning module use Matlab 2013b’s statistical toolbox. Decision Tree and Random Forest are implemented using Java APIs from Weka, an open source machine learning suite. The Bayesian approach uses Infer.net, an open source machine learning toolbox under the .NET framework. We summarize the important configuration parameters in the Learner implementations. Logistic regression uses a min- imization function with additional regularization weights, multiplied by a tunable L1 norm term Λ. We set α = 0.05 and Λ = 25. Decision tree uses a confidence threshold of 0.25 for pruning, and uses random data shuffling with seed = 1. Random forest bootstraps 100 decision trees, and introduces low correlation among the individual trees by selecting a random subset of the features at each split boundary of the component trees. The Bayesian method uses a precision of 0.1. The ANN is implemented as a two-layer perceptron, with default weights = 0 and the default bias = 0. 5.1 Implementation Details: Decision Tree Labelling Phase: We generate the training dataset as fol- lows: 1) we set YCSB parameters, namely read proportion RW and thread count T c, to simulate a characteristic workload w, and 2) the above workload w is executed with different applied consistency levels C. Each row k in the training dataset corresponds to a particular characteristic workload w with a particular consistency level Ck. Each row k records observed values of the output parameters, namely average latency L, staleness S, and throughput T , over a sliding time window of 1 minute. For a given combination of values of the input parameters, comprising the tuple hRW, T c, P i, and the given subSLA thresholds, the output parameters, L, S and T , vary according to the applied consistency level C for the operation corresponding to the row k in the training dataset. The training dataset is labelled as follows: 1) the training dataset is exhaustively searched, 2) in each iteration of the above search step, we group the rows in the training dataset by the values of the input parameters comprising the tuple hRW, T c, P i, such that each group of rows contains an unique combina- tion of values for the above tuple, 3) for each group of rows g, we find the row l in g that satisfies both of the following conditions. Condition 1: The value of throughput Tl in the row l is maximum among the values of throughput for all rows in g. Condition 2: The values of staleness Sl and latency Ll satisfy the thresholds set in the subSLA, 4) rows in each group of rows g in the training dataset are labelled with the consistency level Cl corresponding to the row l that satisfies the above conditions. Training Phase The J48 decision tree [9] learner is loaded with the labelled training data. The J48 learner is an open source implementation of the C4.5 algorithm, which can learn a model from a training dataset comprising both continuous and discrete attributes (i.e., model parameters). For handling continuous attributes, C4.5 assigns a threshold value t to each continuous attribute a. 
C4.5 splits the rows in the training dataset into categories, based on whether the values of the attribute a for a row r exceed the threshold t or not. C4.5 uses the normalized information gain Ig, computed from the difference in entropy, as the splitting
  • 6. 2332-7790 (c) 2016 IEEE. Personal use is permitted, but republication/redistribution requires IEEE permission. See http://www.ieee.org/publications_standards/publications/rights/index.html for more information. This article has been accepted for publication in a future issue of this journal, but has not been fully edited. Content may change prior to final publication. Citation information: DOI 10.1109/TBDATA.2017.2656121, IEEE Transactions on Big Data IEEE TRANSACTIONS ON BIG DATA, VOL. 3, NO. 1, DECEMBER 2016 6 (a) Bayesian belief network (b) Structure of decision tree with latency given in millisec- onds Fig. 2: Bayesian belief network and decision tree structure criterion. At each node in the tree, the parameter that produces the highest Ig is chosen to make the split. The algorithm continues recurring along the rest of the training dataset till the complete decision tree is generated. The leaf nodes represent the predicted consistency level for a given workload, characterized by a given set of values of the input parameters hRW, T c, P i. Each group of rows in the training dataset forms a path in the decision tree (from root to a leaf node), for which the predicted consistency level is given by the label of the leaf node. Under a subSLA comprising a 2 ms threshold for latency and a 10 ms threshold for staleness, the generated J48 decision tree is presented in Figure 2(b). The C4.5 algorithm chooses staleness as the first splitting function, assigning a threshold of 0 ms for splitting the training dataset into two categories at the root. The right subtree r corresponds to the examples with observed latency exceeding 0 ms, and the left subtree l corresponds to the rest of the examples. Next, staleness is chosen as the criterion that further splits the examples in the right subtree r into subtrees l1 and r1, based on whether the observed staleness exceeds 0.5 ms or not. The examples in the right subtree r1 are further split into subtrees l2 and r2 based on a latency threshold of 3.967 ms. Then the right subtree r2 derived from the above split is again split into subtrees depending on whether the observed latency exceeds a threshold of 5.431 ms. Prediction Phase During the prediction phase under a given subSLA with thresholds TL and TS for latency and staleness, respectively, the Learner module traverses along a path in the tree (Figure 2(b)), starting from the root to a leaf node l, depending on the values of the input parameters hRW, T c, P i. The consistency level Cl corresponding to the label of the leaf node l is the predicted matching con- sistency level under the given subSLA, for the given values of the input parameters. The predictor constructs the above path as follows. Starting from the root node, it traverses the depth of the tree, marking each node in the path as v. At each step in the path, it splits the tree into branches, based on whether the value of the attribute a corresponding to the label of the current node v falls within a range [0, t] or a range [t, max]; t is the assigned split threshold, and max is the maximum value of a in the training dataset. At each node v in the path, the predictor chooses the branch for which the value of the attribute a, corresponding to the label of the given node, falls within which of the above ranges. The predictor keeps on traversing the tree in the above manner till it reaches a leaf node l, and it chooses the label Cl of the leaf node l as the matching consistency for the given operation. 
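As a rough illustration of the training and prediction phases above, the following sketch builds a J48 tree from a labelled training file and queries it for one operation. It is a hedged example, not the OptCon source: the file names (training.arff, current.arff), the use of ARFF files, and the assumption that the last attribute holds the consistency-level label are ours; only the pruning confidence of 0.25 follows Section 5.

import weka.classifiers.trees.J48;
import weka.core.Instances;
import weka.core.converters.ConverterUtils.DataSource;

public class J48LearnerSketch {
    public static void main(String[] args) throws Exception {
        // Labelled training data: one row per <RW, Tc, P> observation, class = labelled consistency level.
        Instances train = new DataSource("training.arff").getDataSet();
        train.setClassIndex(train.numAttributes() - 1);

        J48 tree = new J48();
        tree.setConfidenceFactor(0.25f);   // pruning confidence threshold from Section 5
        tree.buildClassifier(train);

        // Prediction phase: classify the current <RW, Tc, P> observation.
        Instances test = new DataSource("current.arff").getDataSet();
        test.setClassIndex(test.numAttributes() - 1);
        double labelIndex = tree.classifyInstance(test.instance(0));
        String predictedLevel = train.classAttribute().value((int) labelIndex);
        System.out.println("Predicted consistency level: " + predictedLevel);
    }
}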
5.2 Implementation Details: Bayesian Learning With the Bayesian [9] approach, we assume the following. A1: A prior distribution exists for each of the input param- eters, namely RW , T c, P , and C. The input parameter C is a discrete variable, which takes values from a finite set of categorical attributes, comprising the supported list of con- sistency levels. Hence, we assume a Dirichlet distribution [20] for the prior probabilities for C. On the other hand, the parameters RW , T c, and P are continuous variables. We assume that these parameters follow a Gaussian distribu- tion. A2: The posterior distributions for output parameters, namely L and S, are conditionally dependent on the joint distribution of the input parameters hRW, T c, P, Ci. A3: The output parameters L and S are conditionally in- dependent, allowing L and S to be expressed as the effect of the input parameters RW , T c, P , and C. We represent the dependency relation of the model parameters from the equations obtained from the Logistic Regression approach using a dependency graph. Figure 2(a) illustrates such a graph. Our assumption that the output parameters, L and S, are conditionally independent (though actually L and S are dependent on each other) enables us to express them as the effect of the input parameters. This allows us to model the causality among the dependent and independent variables as a simple Bayesian Network. 6 EVALUATION 6.1 Experimental Setup Following [10], we have run our experiments on a testbed of 20 Amazon Ec2 m1-small instances, located in the Ec2 region us-west-2, running Ubuntu 13.10, loaded with Cassandra 2.1.0 with a replication factor of 5, running on Oracle JDK 1.8 Our models are learnt from a training dataset comprising 23040 rows, and a testing dataset of 2560 rows, generated by running varying YCSB 0.1.4 [19] workloads. We have used the NTP (Network Time Protocol) protocol implementation
from ntp.org to bring down the maximum clock skew to 5 ms.

6.2 Latency and Staleness Ranges Used in the subSLAs
The measurements in [21] suggest that the latency for common web applications falls in the range of 75-140 ms. The Pileus system [4], which includes latency in subSLAs as well, uses values between 200 and 1000 ms as the latency threshold. Further reducing the lower bound to favor availability, we choose any value between 101 and 150 ms as a candidate for the latency threshold in OptCon. Following the observations from the Riak deployment at Yammer [22], Bailis et al. [7] use a range of 1.85-13.6 ms for the staleness bounds, given as the t-visibility values. Hence, following [7], we choose a value within the above range as the staleness threshold. Rounding the bounding values for the ranges given above, our subSLAs use values in the ranges 101-150 ms and 1-13 ms as thresholds for L and S, respectively.

6.3 OptCon Framework Overhead
Training the model being a one-time process, the latency overhead due to the training phase is negligible during an operation. The overhead due to the training phase, which includes the time spent by the Logger in collecting the training dataset from YCSB, varies from 6.5 ms for linear regression to 27 ms for ANN. However, there is still considerable overhead due to the prediction phase, especially with a large training dataset. The prediction phase spans from the call to the Learner module till the predicted consistency level is returned by the Learner module. The average, variance, and standard deviation of the overhead are 1 ms, 0.25, and 0.37, respectively, with decision tree learning. The average overhead with different learning techniques is given in Table 3.

TABLE 3: Model Selection Results: as per descending order of AICC and BIC, and ascending order of CV error and overhead
Approach              Cross Validation Error   AICC    BIC     Overhead (ms)
Decision Tree         0.14                     10.73   51.44   1
Bayesian Learning     0.57                     12.85   53.57   1.2
Logistic Regression   1.98                     16.32   57.07   0.7
Random Forest         0.14                     9.51    50.24   1.3
Neural Network        0.059                    12.85   63.54   1.5

6.4 Dimensionality Reduction: Significance of the Model Parameters

TABLE 4: Significance of the Model Parameters: significance of a parameter is greater for higher values of Rank Features and Sequential Attribute Selection, and lesser for higher values of Misclassification Error
Parameter   Rank Features Technique   Sequential Attribute Selection Technique   Misclassification Error for Random Forest
C           4.81                      1                                          0.22
RW          0.9                       1                                          1.11
P           2.78                      1                                          0.24

We apply dimensionality reduction [9] to determine the features (parameters) that are relevant to the problem and have low correlation among themselves. We use the following techniques to detect any redundant feature that can be omitted without losing information. In the rank features approach, we determine the most significant features on the basis of different criterion functions. The above criterion functions are evaluated as a filtering (or preprocessing) step before the training phase. On the other hand, sequential feature selection is a wrapper technique that sequentially selects the features until there is no substantial improvement (given by the deviance of the fitted models) in the predictive power for an included feature. For the random forest model, we evaluate the significance of the features in the ensemble of component trees using the misclassification error. Table 4 presents the results of the above techniques on the model parameters C, RW, and P. Sequential feature selection outputs 1 for all the model parameters, indicating that all the parameters are significant. Rank features ranks the relative significance of the features in the order C, P, and RW. Misclassification error ranks the features in the order RW, P, and C. The order of the relative significance is different with each dimensionality reduction technique; the developer must choose the suitable ranking approach based on the learning technique. For example, the misclassification error criterion is to be used for the random forest technique.
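The rankings in Table 4 were obtained with Matlab's statistical toolbox (Section 5). Purely as an analogous Java illustration, and not the procedure used for Table 4, an information-gain ranking over the same attributes could be produced with Weka's attribute-selection classes (file name training.arff is hypothetical, as before):

import weka.attributeSelection.AttributeSelection;
import weka.attributeSelection.InfoGainAttributeEval;
import weka.attributeSelection.Ranker;
import weka.core.Instances;
import weka.core.converters.ConverterUtils.DataSource;

public class FeatureRankingSketch {
    public static void main(String[] args) throws Exception {
        Instances data = new DataSource("training.arff").getDataSet();
        data.setClassIndex(data.numAttributes() - 1);

        AttributeSelection selector = new AttributeSelection();
        selector.setEvaluator(new InfoGainAttributeEval());  // information gain as the ranking criterion
        selector.setSearch(new Ranker());                     // rank all attributes rather than search subsets
        selector.SelectAttributes(data);

        // Attribute indices ordered by decreasing information gain.
        for (int idx : selector.selectedAttributes()) {
            System.out.println(data.attribute(idx).name());
        }
    }
}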
6.5 Comparison of the Predictive Power of the Learning Techniques Using Model Selection Metrics

6.5.1 The Model Selection Metrics Used
Cross validation error (CV error) [9] is the most common and widely used model selection metric. It represents the accuracy of the model with respect to an unseen test dataset, which is different from the original dataset used to train the model. Using 10-fold cross validation, we partition the dataset into 10 subsamples and run 10 rounds of validation. In each round a different subsample is used as the test dataset, while we train the model with the rest of the dataset. We compute the average of the mean squared error values for all validation rounds to obtain the mean CV error (Table 3). We also use the Akaike Information Criterion (AIC) [23], which quantifies the quality of the generated model in terms of the information loss during the training phase. For obtaining AIC, we first compute the likelihood of each observed outcome in the test dataset with respect to the model. We compute the maximum log likelihood L, i.e., the maximum of natural logarithms over the likelihood of each of the observed outcomes. AICC [23] (Table 3) further improves upon AIC, penalizing overfitting with a correction term for finite sample size n. Thus AICC = 2k − 2 ln L + 2k(k+1)/(n − k − 1), where k is the number of model parameters. Apart from AICC, we also compute the Bayesian Information Criterion (BIC) [23], which uses the Bayesian prior probability estimates to penalize overfitting. Thus, BIC = 2k ln N − 2 ln L, where the additional parameter N for the sample size enables BIC to assign a greater penalty on the sample size than AICC. At smaller sample sizes, BIC assigns a lower penalty on the free parameters than AICC, whereas the penalty is higher for larger sample sizes (because it uses k ln N instead of k). Normally, the BIC and AICC scores (Table
  • 8. 2332-7790 (c) 2016 IEEE. Personal use is permitted, but republication/redistribution requires IEEE permission. See http://www.ieee.org/publications_standards/publications/rights/index.html for more information. This article has been accepted for publication in a future issue of this journal, but has not been fully edited. Content may change prior to final publication. Citation information: DOI 10.1109/TBDATA.2017.2656121, IEEE Transactions on Big Data IEEE TRANSACTIONS ON BIG DATA, VOL. 3, NO. 1, DECEMBER 2016 8 3) are used together to compare the models, AICC detects the problem of overfitting and BIC indicates underfitting. Another criterion for the choice of the algorithm is the average prediction overhead (refer Section 6.3), which is given in the last column of Table 3. 6.5.2 Insights: Evaluation of Learning Techniques With Re- spect to the Metrics The above table can guide the developers in making an informed choice regarding the correct learning technique to use. The following analysis with respect to the Table 3 can act as lessons learnt for future practitioners and researchers looking to apply learning techniques to solve similar problems. Our problem can be directly cast as clas- sification of the given training data into classes labelled (Section 4.4) by the consistency levels. Hence, Decision Tree yields high accuracy and speed (Table 3). We observed near random predictions with Linear regression (hence we omitted the results), because: 1) our problem is more of a classification problem, as already explained, and 2) linear regression requires the assumption of a linear model. Apart from linear regression, logistic regression fares worst among the approaches as indicated by the values of the metrics. This is because it also treats the problem as a regression problem, whereas ours is a nonlinear classification problem. But it is still useful, since it yields a simple relation which expresses the effect of the parameters RW , T c, and P , on L and S for an operation, under different consistency levels. Also, it can be used to determine the strength of the relationship among the parameters. Thus it can act as a basis for complex modelling algorithms. Random forest further eliminates errors due to overfitting and noisy data by: 1) bootstrapping multiple learners, and 2) using random- ized feature selection. Hence, it produces the best accuracy. However, the prediction overhead due to model complexity, might increase the latency overhead beyond the thresholds of real-time subSLA. Similarly, though ANN yields high accuracy (Table 3), the additional overhead due to the model complexity (it produces the most complex model) may overshoot the threshold for real-time use cases. Bayesian method requires several approximating assumptions which result in high error scores. 6.5.3 Selection of Algorithm Based on Tradeoff Between Accuracy And Speed We define a measure Perf for selecting the learning tech- nique that provides an optimal tradeoff between the accu- racy and speed (Table 3). We assign the overhead of the slowest learning technique (as per Table 3) as the baseline overhead Obase . Since ANN has the maximum overhead (1.5 ms) as per Table 3, Obase = 1.5. The speedup Speedup obtained with a chosen learning algorithm relative to the slowest algorithm is estimated from the observed relative decrease in overhead for the chosen algorithm with re- spect to the baseline Obase (i.e., for the slowest algorithm, Speedup= 0). 
Thus, Speedup = (Obase − O)/Obase, where O is the overhead of the chosen technique. We denote the prediction error for a given learning technique as E. We assign the error for the least accurate algorithm, as per Table 3, as the baseline Ebase. Logistic regression has the maximum CV error (1.98 as per Table 3). Hence, Ebase = 1.98. The values of E and Ebase depend on the choice of the error measure. The developer can either use the CV error, or both AICC and BIC taken together. In the latter case, E is given as the maximum between the AICC and BIC for the chosen algorithm, and Ebase is given as the maximum between the AICC and BIC for the least accurate algorithm. We compute the relative increase in accuracy (ARel) obtained with a chosen learning algorithm with respect to the least accurate algorithm. ARel is estimated from the proportion of observed decrease in error for the chosen algorithm with respect to the baseline Ebase. Thus, ARel = (Ebase − E)/Ebase. Finally, the measure Perf is computed as Perf = wSpeedup × Speedup + wA × ARel, where wSpeedup and wA are the weights (real numbers in the closed interval [0, 1], such that wSpeedup + wA = 1) assigned by the developers to quantify the relative importance of speed and accuracy, respectively, for selecting a learning technique. The developer can assign a larger weight to accuracy (i.e., wA > wSpeedup) if accuracy is more important than speedup (and vice versa) according to the use case. The developer chooses the learning algorithm that produces the maximum value of Perf, with a given choice of error measure. We provide the values of Perf for various learning algorithms in Table 5, with CV error as the error measure, and equal weights assigned to accuracy and speedup (i.e., wA = wSpeedup = 0.5). The value of Perf is maximum for the decision tree algorithm, implying that it produces an optimal tradeoff of accuracy and speed according to our choice of error measure and the assigned values for the weights wA and wSpeedup. In most cases, the difference in overhead between the algorithms is negligibly small, so Speedup is rarely a significant criterion. Thus, developers would generally assign the weights wA = 1 and wSpeedup = 0, choosing the algorithm with the highest accuracy.

TABLE 5: Evaluation of the Accuracy-Speed Tradeoff (Perf per learning technique)
Decision Tree   Bayesian Learning   Logistic Regression   Random Forest   ANN
0.62            0.46                0.27                  0.52            0.49
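As a worked illustration of the Perf measure, using only the CV-error and overhead columns of Table 3 and equal weights: for the decision tree, Speedup = (1.5 − 1)/1.5 ≈ 0.33 and ARel = (1.98 − 0.14)/1.98 ≈ 0.93, so Perf ≈ 0.5 × 0.33 + 0.5 × 0.93 ≈ 0.63, which is in line with the decision-tree entry of Table 5 (0.62) up to rounding of the intermediate values.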
6.6 Analysis of Linear Regression Approach

6.6.1 Background of ANOVA Analysis Technique
We use ANOVA [20], or Analysis of Variance, on the fitted linear regression to perform a hypothesis test for determining whether there is a significant linear relationship among the model parameters. The degree of freedom DF (refer to Table 6) is given as the number of independent observations in a sample (training dataset) minus the number of the parameters that are to be estimated from the sample. The regression mean square (MSR) for each parameter is computed as the squared sum of errors (i.e., the deviation of the observed values from the estimated values) divided by the degree of freedom DF. The mean square error (MSE) is computed as the squared sum of errors divided by n − 1, where n is the number of parameters. The F statistic for each parameter is computed as MSR/MSE. The p values in the ANOVA table denote the probabilistic significance of the corresponding parameter: a small p value implies a small probability of obtaining such a large value of the F statistic under the null hypothesis. The null hypothesis for linear regression states that there is no significant linear relationship among the model parameters. If we obtain a large value of F with a small p-value, we reject the null hypothesis in favor of the alternative hypothesis, i.e., there exists a significant relationship among the model parameters.

TABLE 6: ANOVA results for Linear Regression

ANOVA for Latency function
Variable   SumSq      DF     MeanSq     F        pValue
C          4.11E+18   1      4.11E+18   7.0851   7.82E-03
RW         1.57E+20   1      1.57E+20   269.75   1.23E-57
P          7.26E+18   1      7.26E+18   12.503   0.00041377
Tc         1.05E+23   1      3.2E+25    325.45   1.5E-65
Error      1.45E+21   2492   5.80E+17

ANOVA for Staleness function
Variable   SumSq      DF     MeanSq     F        pValue
C          7.04E+08   1      7.04E+08   1.1906   0.27523
RW         7.71E+09   1      7.71E+09   480.32   1.70E-105
P          9.71E+08   1      9.71E+08   3.6821   0.05501
Tc         3.23E+12   1      2.85E+13   538.12   1.57E-123
Error      1.92E+12   2492   7.69E+08

6.6.2 The Resulting ANOVA Table
In the case of the regression equation for latency L, the p values (refer to Table 6) are all approximately 0 (rounded to 3 digits of precision) for the parameters C, RW, P, and Tc, respectively, and the values of the F statistic are 7.08, 269.75, 12.5, and 325.45, respectively. Such small p values indicate that the probability of observing such large F-ratios by chance is very small. Since F is as high as 269.75 for RW, the significance of the independent parameter RW in the regression equation for L is very high, i.e., the probability of L linearly varying with RW is very high. On the other hand, the F statistics for the parameters C and P are much lower compared to RW; hence these parameters may have a relatively smaller chance of having a linear relationship with the dependent parameter L. In the case of the fitted regression equation for staleness S, the p values (refer to Table 6) are approximately 0.28, 0, 0.06, and 0 for the parameters C, RW, P, and Tc, respectively, and the values of the F statistic are 1.19, 480.32, 3.68, and 538.12, respectively. Thus, while the p values for the regression equation of S are greater than those for L, they are still very small (approximately 0, when rounded). Hence, the probability of observing high values of the F statistics is very small, especially for C and P (F is 1.19 and 3.68, respectively, for C and P). Thus, the parameters C and P have a very low chance of being independent parameters in the regression equation for staleness S.

TABLE 7: Confusion matrix generated using Decision Tree
a      b      c,d,e,f   g      h      <-- classified as
6194   238    0         73     93     a = ANY
115    1045   0         2793   495    b = ONE
0      0      0         0      0      c = TWO
0      0      0         0      0      d = THREE
0      0      0         0      0      e = LOCAL QUORUM
0      0      0         0      0      f = EACH QUORUM
117    1073   0         2782   553    g = QUORUM
114    1029   0         2771          h = ALL

Fig. 3: ROC curve generated using Decision Tree: AUC=0.98
Fig. 4: ROC Curve for the Logistic Regression approach

6.7 Analysis of Confusion Matrix and ROC Curves

6.7.1 Background of Confusion Matrix and ROC Curves
True positives (TP) [20] represent examples in the training dataset that correspond to a correct true classification.
False positives (FP), or type I errors, involve incorrect labelling of negative examples as positives. True negatives (TN), on the other hand, are negative examples that have been classified correctly. False negatives (FN), or type II errors, involve incorrect classification of positive examples as negatives. A confusion matrix is a tabular visualization of the values of TP, FP, TN, and FN [20]. The True Positive Rate (TPR) and False Positive Rate (FPR) are obtained as TPR = TP/N and FPR = FP/N, respectively. Receiver Operating Characteristic (ROC) curves display the performance of machine learning algorithms in terms of TPR and FPR. We plot the ROC from the confusion matrix. The ROC curve plots the TPR and FPR along the x and the y axis, respectively. The area under the curve (AUC) quantifies the probability that a given classifier will assign a higher rank to a positive example than to a negative example. The ideal region for an ROC curve is the upper left half, which corresponds to an area of 1, representing a perfect test; an area of 0.5 typically means a uniformly random test.
Fig. 5: ROC curve generated using Bayes Net

6.7.2 Resulting Confusion Matrix and ROC Curves
Table 7 shows the confusion matrix obtained with the Decision Tree technique. From the confusion matrix, we obtain the ROC curve for the decision tree (refer to Figure 3). The area under the curve (AUC), computed using the Mann-Whitney technique [20], is 0.99. The Gini coefficient G1 is computed as G1 = 2 × AUC − 1 = 2 × 0.99 − 1 = 0.97. But the curve falls below the diagonal, indicating negative predictions; hence we follow the mirrored method, which mirrors, i.e., reverses, its predictions with reference to the diagonal line. All predictions obtained are reversed using the above technique before passing the predicted values to the client for performing the read/write operations. Figure 5 presents the ROC curve for the Bayes network approach. The diagonal corresponds to random guesses; the prediction instances all reside above it. The area under the ROC curve for the Bayes network is 0.9, and the Gini coefficient G1 = 2 × 0.9 − 1 = 0.8. Figure 4 presents the ROC curve for the Logistic Regression approach.

Fig. 6: Cross Validated Deviance of Lasso Fit
Fig. 7: Residuals for the Logistic Regression approach

6.8 Analysis of Logistic Regression
A saturated model is a model with the maximum number of regression coefficients that can be estimated. The deviance for a model M1 is computed as twice the difference between the log-likelihood of the model M1 and the saturated model MS. For a nonnegative value of Λ, lasso (we use lasso regularization with logistic regression, as discussed in Section 4.4.1) minimizes the Λ-penalized deviance, which is equivalent to maximizing the Λ-penalized log likelihood. The above deviance represents the value of the loss function, computed as the negative log-likelihood for the prediction model, averaged over the validation folds in the cross-validation procedure. We use Figure 6 to analyze the deviance obtained with Logistic Regression. The green circle and dashed line locate the Λ term that produces the minimal cross-validation error. Figure 6 plots the value of Λ with minimum cross-validated MSE, and the highest value of Λ that is within one standard error of the minimum MSE. Figure 6 identifies the minimum-deviance point with a green circle, and the dashed line presents the variation of the regularization parameter Λ. The blue circled point has minimum Λ with standard deviation at most 1. Figure 7 presents the residuals obtained with the Lasso plot, computed as the difference between the observed and predicted values. Residuals are used to detect outliers to a fitted regression curve, and in the process help test the assumptions of the logistic regression model. In Figure 7, we plot the histogram of raw residuals, i.e., the difference between the observed and the fitted values of L estimated from the logistic regression equation. The histogram shows that the residuals are slightly left skewed, indicating that the model can be further improved.
6.9 Adaptability of OptCon to Varying Workload
Varying Throughput: We run our experiments on a geo-replicated testbed of 7 Amazon EC2 m1.large instances (replica nodes) located in 3 regions (3 nodes in us-west-2a, 2 nodes in us-east, and 2 nodes in eu-west), running Ubuntu 14.04 LTS, loaded with Cassandra 2.1.2 with a replication factor of 3. We demonstrate the adaptability of OptCon to variations in throughput in the client workload. We tune the target throughput parameter (operations per second) in the YCSB client, and observe the variations in the severity of staleness (i.e., the 95th percentile Γ score) and latency against changing throughput, under a subSLA comprising a 300 ms threshold for latency (following the SLA for Amazon's shopping cart application [16]) and a 10 ms threshold for staleness. As per Table 5, we choose the Decision Tree implementation of the Learner module. Following [10], we run YCSB workloads for spans of 60 seconds with the following configuration: 50% ratio of reads and writes, 8 client threads,
Fig. 8: Adaptability of OptCon to Varying Target Throughput, under a subSLA (Latency: 300 ms, Staleness: 10 ms). Each panel plots latency (ms) and Gamma (ms) against target throughput (ops/s): (a) operations with READ ALL/WRITE ALL; (b) operations with READ/WRITE QUORUM; (c) operations with READ ONE/WRITE ONE; (d) operations with OptCon.
Fig. 9: Adaptability of OptCon to Varying Read Proportion under the subSLA SLA-1 (Latency: 250 ms, Staleness: 5 ms). Each panel plots latency (ms) and staleness (ms) against read proportion: (a) operations with OptCon satisfy the subSLA in 100% of cases; (b) READ ALL/WRITE ALL, 70%; (c) READ ALL/WRITE QUORUM, 75%; (d) READ QUORUM/WRITE QUORUM, 75%; (e) READ QUORUM/WRITE ALL, 35%; (f) READ ONE/WRITE ANY, 45%.
Fig. 10: Adaptability of OptCon to Different subSLAs. Each panel plots staleness (ms) against latency (ms): (a)-(c) operations with OptCon under subSLA-1 (Latency: 250 ms, Staleness: 5 ms), subSLA-2 (Latency: 100 ms, Staleness: 10 ms), and subSLA-3 (Latency: 20 ms, Staleness: 20 ms); (d) READ ALL/WRITE ALL; (e) READ ALL/WRITE QUORUM; (f) READ QUORUM/WRITE QUORUM; (g) READ QUORUM/WRITE ALL; (h) READ ALL/WRITE ANY; (i) READ ALL/WRITE ONE; (j) READ ONE/WRITE ALL; (k) READ QUORUM/WRITE ONE; (l) READ QUORUM/WRITE ANY; (m) READ ONE/WRITE QUORUM; (n) READ ONE/WRITE ANY; (o) chart showing M-statistic values:
subSLA-1: WRITE ANY/READ ONE, M = 68; READ/WRITE QUORUM, M = 41; Predicted Consistency, M = 85.
subSLA-2: READ QUORUM/WRITE ALL, M = 70; Predicted Consistency, M = 92.
subSLA-3: READ ALL/WRITE QUORUM, M = 0; Predicted Consistency, M = 100.
“latest” distribution, and keyspace size of 100,000. Figures 8(d), 8(a), 8(b), and 8(c) plot the 95th percentile values of observed Γ scores and observed latency in milliseconds against varying target throughput, with manually chosen consistency levels and with OptCon. Since the frequency of operations on each key increases with the target throughput applied from the YCSB client, the chance of observing stale results increases as well. Hence, the plots show that the 95th percentile Γ score increases with increasing target throughput. Stronger consistency levels (i.e., ALL and QUORUM, in Figures 8(a) and 8(b)) satisfy the staleness threshold for all values of applied target throughput. Due to the high internode communication delay between replica nodes distributed across the 3 regions under geo-replicated settings, the consistency level ALL (Figure 8(a)) violates the 300 ms latency threshold. On the other hand, the weak consistency level ONE (Figure 8(c)) satisfies both the latency and staleness thresholds for lower values of target throughput, but violates the 10 ms staleness threshold when the target throughput exceeds 400 ops/s. Hence, for throughput ≤ 400 ops/s, ONE or QUORUM is the matching choice, while QUORUM is matching for throughput > 400 ops/s. OptCon (Figure 8(d)) applies the weak consistency level ONE for target throughput ≤ 400 ops/s, to achieve the latency threshold and to maximize the observed throughput, and applies the strong consistency level QUORUM for target throughput > 400 ops/s. QUORUM (Figure 8(b)) satisfies the subSLA for all values of target throughput. However, by choosing ONE instead of QUORUM for throughput ≤ 400 ops/s, OptCon (Figure 8(d)) produces, on average, 61 ms lower latency than QUORUM (Figure 8(b)).
Varying Read Proportion: Next, we vary the read proportion parameter (RW) while running YCSB workloads, with OptCon and with manually chosen consistency levels (Figure 9(a)). In Figures 9(a) through 9(f), the read proportion is plotted along the x axis, the observed latency along the primary y axis, and the staleness along the secondary y axis. We demonstrate that, for a specific read proportion, only a subset of all possible read-write consistency levels is optimal, i.e., satisfies the latency and staleness thresholds in a given subSLA. Varying the read proportion (RW) in the workload from 0.1 to 1, we observe that a manually chosen read-write consistency level is optimal for only a fraction of the total number of cases under the subSLA SLA-1 (Table 2). OptCon demonstrates its adaptability to varying read proportion, applying the optimal consistency level for each read proportion, and thus satisfying the subSLA in 100% of cases. OptCon can adapt to changing read proportion because the parameter RW is a feature in the prediction model M.
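To make the role of workload features such as RW concrete, the sketch below shows one way a decision-tree learner could map per-operation features (read proportion, target throughput, and a network latency estimate) to a consistency level, and how a client could apply the predicted level through the DataStax Python driver for Cassandra. The feature set, toy training data, labels, keyspace, and table names are illustrative assumptions of ours; this is not OptCon's actual Learner, feature extraction, or schema.

```python
# Sketch: a decision-tree "learner" mapping workload features to a consistency
# level, and a client applying the predicted level per operation.
# Training data, feature names, and the keyspace/table are hypothetical.
from cassandra import ConsistencyLevel
from cassandra.cluster import Cluster
from cassandra.query import SimpleStatement
from sklearn.tree import DecisionTreeClassifier

# Features: [read proportion RW, target throughput (ops/s), observed RTT (ms)]
X_train = [
    [0.9, 200, 20], [0.9, 600, 20], [0.5, 300, 20],
    [0.1, 500, 20], [0.5, 700, 80], [0.1, 100, 80],
]
y_train = ["QUORUM", "QUORUM", "ONE", "ONE", "QUORUM", "ONE"]  # toy labels

learner = DecisionTreeClassifier(max_depth=3).fit(X_train, y_train)

LEVELS = {"ONE": ConsistencyLevel.ONE,
          "QUORUM": ConsistencyLevel.QUORUM,
          "ALL": ConsistencyLevel.ALL}

def read_with_predicted_level(session, key, features):
    """Predict a consistency level for the current workload state, then issue the read."""
    predicted = learner.predict([features])[0]
    stmt = SimpleStatement("SELECT value FROM demo.kv WHERE key = %s",
                           consistency_level=LEVELS[predicted])
    return session.execute(stmt, (key,)).one()

if __name__ == "__main__":
    session = Cluster(["127.0.0.1"]).connect()   # assumes a reachable Cassandra node
    print(read_with_predicted_level(session, "user-42", [0.5, 450, 20]))
```

Because the prediction happens per operation, a change in the observed read proportion or throughput simply changes the feature vector passed to the model, which is how a learned predictor can track a shifting workload without being reconfigured.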
As shown by the results in [10], the frequency of stale results (i.e., higher Γ scores) increases with increasing read proportion in the workload. Hence, for the higher read proportions in the right half of the x axis, stronger read consistency levels are required to return less stale results. Also, since the proportion of writes is smaller, the penalty for the write latency overhead is lower; hence stronger write consistency levels, which result in high write latency, are acceptable. Thus, for higher read proportions, combinations of strong read-write consistency levels, i.e., ALL/QUORUM, succeed in lowering staleness to acceptable bounds (right half of Figures 9(b), 9(c), 9(d), and 9(e)). Strong consistency levels achieve a maximum success rate of 75% in satisfying the subSLA, with READ ALL/WRITE QUORUM (Figure 9(c)) and READ QUORUM/WRITE ALL (Figure 9(d)). Taking into account the clock skew, the staleness is effectively 0 in these cases. For the same reasons, for higher read proportions, weak read consistency levels fail to achieve stringent staleness thresholds, as demonstrated by the points in the right half of Figure 9(f). With lower read proportions, in the left half of the x axes, the frequency of stale read results is smaller [10]. In this case, weaker manually chosen read-write consistency levels, i.e., ONE/ANY (left half of Figure 9(f)), are sufficient for returning consistent results. For lower read proportions, stronger consistency levels, i.e., ALL or QUORUM, are unnecessary and only result in high latency overhead; with these levels, we observe a maximum success rate of only 55% in satisfying the subSLA (left half of Figures 9(b) through 9(e)), the best being READ ALL/WRITE ALL (Figure 9(b)). Thus, manually chosen consistency levels show a maximum success rate of 75% in satisfying the subSLA, in contrast with the 100% success rate demonstrated by OptCon (Figure 9(a)).

6.10 Adaptability of OptCon to Different subSLAs
We demonstrate that the consistency levels predicted by OptCon are always matching, i.e., OptCon always chooses from the optimal set of consistency levels for the given subSLA. We have integrated OptCon (using Decision Tree learning, as per Table 5) with the RUBBoS [24] benchmark, which can simulate concurrent and interleaved user access. Figures 10(a) through 10(n) plot the observed staleness along the y axis and the latency along the x axis, under the subSLAs SLA-1, SLA-2, and SLA-3 (Table 2). Figures 10(a) through 10(c) demonstrate experiments performed with OptCon. Figures 10(d) through 10(n) correspond to experiments done with manually chosen consistency levels. We demonstrate a few extreme cases of occasional heavy network traffic in real-world applications with artificially introduced network delays (refer to the few boundary cases with arbitrarily high latency in Figures 10(a), 10(b), and 10(c) that violate the respective subSLAs). These boundary cases were simulated using the traffic shaping feature of Traffic Control [25], a Linux-based tool for configuring network traffic that controls the transmission rate by lowering the network bandwidth. SLA-1 (Table 2) represents systems demanding stronger consistency. Manually chosen weak read-write consistency levels, i.e., ONE/ANY, fail to satisfy the stringent staleness bound of SLA-1 (Figures 10(k) through 10(m)). On the other hand, strong consistency levels (i.e., ALL) satisfy SLA-1 (Figures 10(d) through 10(g)).
OptCon satisfies SLA-1 by applying the optimal consistency levels for that subSLA (i.e., ALL in this case) (Figure 10(a)). For the lower latency bound in SLA-3 (Table 2), the manually chosen strong consistency levels (ALL/QUORUM) unnecessarily produce latencies beyond the acceptable threshold (Figures 10(d) through 10(g)). However, weaker consistency levels (i.e., ONE/ANY) prove to be optimal (Figures 10(k) through 10(n)). Again, OptCon succeeds by choosing the respective optimal consistency
levels (i.e., ONE/ANY in this case) for SLA-3 (Figure 10(c)). Similarly, with SLA-2, a subset of the manually chosen fixed consistency levels (Figures 10(h), 10(i), and 10(j)) produces optimal results, whereas OptCon successfully achieves SLA-2 for all operations (Figure 10(b)). We evaluate OptCon by a measure M, which quantifies how well the system adapts to the subSLA. Following [4], M (Table 10(o)) computes the percentage of cases that did not violate the subSLA. For all the given subSLAs, the M values for operations performed with OptCon (Figures 10(a), 10(b), and 10(c)) exceed the M values for operations using fixed consistency levels.

7 RELATED WORK
Wada et al. [26] and Bermbach et al. [27] analyze and quantify consistency from a system-centric perspective that focuses on the convergence time of the replication protocol. Bailis et al. [7] and Rahman et al. [28] instead consider a client-centric perspective of consistency, in which a consistency anomaly occurs only if differences in state among replicas lead to a client application actually observing a stale read. In practice, eventual consistency is preferred over strong consistency in scenarios where the system must maintain availability during network partitions [29, 5], or when the application is latency-sensitive and able to tolerate occasional inconsistencies [30]. The state machine replication paradigm achieves the strongest possible form of consistency by physically serializing operations [31]. Lamport's Paxos protocol is a quorum-based fault-tolerant protocol for state machine replication. Mencius improves upon Paxos by ensuring better scalability and higher throughput under high client load using a rotating coordinator scheme [32]. Using variations of Paxos, a number of systems [33, 34, 35, 36] claim to provide scalability and fault tolerance. Relatively few systems [37, 4, 13, 14] provide mechanisms for fine-grained control over consistency. Li et al. [12] apply static analysis for automated consistency tuning in relational databases.

8 CONCLUSIONS
OptCon automates client-centric consistency-latency tuning in quorum-replicated stores, based on staleness and latency thresholds in the SLAs. We implemented the OptCon Learning module using various learning algorithms and performed a comparative analysis of the accuracy and overhead in each case. ANN produces the highest accuracy, and Logistic Regression the lowest overhead. Assigning equal weight to accuracy and overhead, the decision tree performed best. OptCon can adapt to variations in the workload, and is at least as effective as any manually chosen consistency setting in satisfying a given SLA. OptCon provides insights for applying machine learning to similar problems.

9 ACKNOWLEDGEMENT
The project is partially supported by the Army Research Office (ARO) under Grant W911NF1010495.
Any opinions, findings, and conclusions or recommendations expressed in this material are those of the authors and do not necessarily reflect the views of the ARO or the United States Government. Wojciech Golab is supported in part by the Natural Sciences and Engineering Research Council (NSERC) of Canada. Subhajit Sidhanta is supported in part by the Amazon AWS (Amazon Web Services) Research Grant.

REFERENCES
[1] A. Lakshman and P. Malik, “Cassandra: A decentralized structured storage system,” SIGOPS Oper. Syst. Rev., vol. 44, no. 2, pp. 35–40, Apr. 2010. [Online]. Available: http://doi.acm.org/10.1145/1773912.1773922
[2] C. Meiklejohn, “Riak PG: Distributed process groups on dynamo-style distributed storage,” in Erlang ’13.
[3] B. Calder, J. Wang, A. Ogus, N. Nilakantan, A. Skjolsvold, S. McKelvie, Y. Xu, S. Srivastav, J. Wu, H. Simitci, J. Haridas, C. Uddaraju, H. Khatri, A. Edwards, V. Bedekar, S. Mainali, R. Abbasi, A. Agarwal, M. F. u. Haq, M. I. u. Haq, D. Bhardwaj, S. Dayanand, A. Adusumilli, M. McNett, S. Sankaran, K. Manivannan, and L. Rigas, “Windows Azure Storage: A highly available cloud storage service with strong consistency,” in Proceedings of the Twenty-Third ACM Symposium on Operating Systems Principles, ser. SOSP ’11. New York, NY, USA: ACM, 2011, pp. 143–157. [Online]. Available: http://doi.acm.org/10.1145/2043556.2043571
[4] D. B. Terry, V. Prabhakaran, R. Kotla, M. Balakrishnan, M. K. Aguilera, and H. Abu-Libdeh, “Consistency-based service level agreements for cloud storage,” in SOSP ’13.
[5] D. B. Terry, M. M. Theimer, K. Petersen, A. J. Demers, M. J. Spreitzer, and C. H. Hauser, “Managing update conflicts in Bayou, a weakly connected replicated storage system,” in SOSP ’95.
[6] A. Jain, “Using Cassandra for real-time analytics: Part 1,” http://blog.markedup.com/2013/03/cassandra-real-time-analytics/, 2011.
[7] P. Bailis, S. Venkataraman, M. J. Franklin, J. M. Hellerstein, and I. Stoica, “Probabilistically bounded staleness for practical partial quorums,” Proc. VLDB Endow., vol. 5, no. 8, pp. 776–787, Apr. 2012. [Online]. Available: http://dl.acm.org/citation.cfm?id=2212351.2212359
[8] S. Sidhanta, W. Golab, S. Mukhopadhyay, and S. Basu, “OptCon: An adaptable SLA-aware consistency tuning framework for quorum-based stores,” in CCGrid ’16.
[9] P. Flach, Machine Learning: The Art and Science of Algorithms That Make Sense of Data. New York, NY, USA: Cambridge University Press, 2012.
[10] W. M. Golab, M. R. Rahman, A. AuYoung, K. Keeton, J. J. Wylie, and I. Gupta, “Client-centric benchmarking of eventual consistency for cloud storage systems,” in ICDCS, 2014, p. 28.
[11] P. Cassandra, “Apache Cassandra use cases,” http://planetcassandra.org/apache-cassandra-use-cases/, 2015.
[12] C. Li, J. Leitão, A. Clement, N. Preguiça, R. Rodrigues, and V. Vafeiadis, “Automating the choice of consistency levels in replicated systems,” in USENIX ATC 14.
[13] M. S. Ardekani and D. B. Terry, “A self-configurable geo-replicated cloud storage system,” in OSDI 14.
[14] K. C. Sivaramakrishnan, G. Kaki, and S. Jagannathan, “Declarative programming over eventually consistent data stores,” in PLDI ’15.
[15] L. Ravindranath, J. Padhye, R. Mahajan, and H. Balakrishnan, “Timecard: Controlling user-perceived delays in server-based mobile applications,” in Proceedings of the Twenty-Fourth ACM Symposium on Operating Systems Principles, ser. SOSP ’13. New York, NY, USA: ACM, 2013, pp. 85–100. [Online]. Available: http://doi.acm.org/10.1145/2517349.2522717
[16] G. DeCandia, D. Hastorun, M. Jampani, G. Kakulapati, A. Lakshman, A. Pilchin, S. Sivasubramanian, P. Vosshall, and W. Vogels, “Dynamo: Amazon’s highly available key-value store,” SIGOPS Oper. Syst. Rev., vol. 41, no. 6, pp. 205–220, Oct. 2007. [Online]. Available: http://doi.acm.org/10.1145/1323293.1294281
[17] R. Sumbaly, J. Kreps, L. Gao, A. Feinberg, C. Soman, and S. Shah, “Serving large-scale batch computed data with project Voldemort,” in Proc. of the 10th USENIX Conference on File and Storage Technologies (FAST), 2012.
[18] L. Lamport, “On interprocess communication,” Distributed Computing, vol. 1, no. 2, pp. 77–85, 1986. [Online]. Available: http://dx.doi.org/10.1007/BF01786227
[19] B. F. Cooper, A. Silberstein, E. Tam, R. Ramakrishnan, and R. Sears, “Benchmarking cloud serving systems with YCSB,” in SoCC ’10.
[20] S. Weerahandi, Exact Statistical Methods for Data Analysis, ser. Springer Series in Statistics. New York, NY, USA: Springer Science and Business Media, 2001.
[21] T. Everts. Web performance today.
[22] C. Hale and R. Kennedy, “Using Riak at Yammer,” http://dl.dropbox.com/u/2744222/2011-03-22 Riak-At-Yammer.pdf, 2011.
[23] K. P. Burnham and D. R. Anderson, Model selection and multimodel inference: a practical information-theoretic approach. Springer Science & Business Media, 2002.
[24] OW2 Consortium, “RUBBoS: Bulletin Board Benchmark,” 2002.
[25] B. Hubert, T. Graf, G. Maxwell, R. Van Mook, M. Van Oosterhout, P. B. Schroeder, J. Spaans, and P. Larroy, Linux Advanced Routing & Traffic Control HOWTO, online publication, Linux Advanced Routing & Traffic Control, http://lartc.org/, Apr. 2004. [Online]. Available: http://lartc.org/lartc.pdf
[26] H. Wada, A. Fekete, L. Zhao, K. Lee, and A. Liu, “Data consistency properties and the trade-offs in commercial cloud storage: the consumers’ perspective,” in Proc. Conference on Innovative Data Systems Research (CIDR), 2011, pp. 134–143.
[27] D. Bermbach and S. Tai, “Eventual consistency: How soon is eventual? An evaluation of Amazon S3’s consistency behavior,” in Proc. Workshop on Middleware for Service Oriented Computing (MW4SOC), 2011.
[28] M. R. Rahman, W. Golab, A. AuYoung, K. Keeton, and J. J. Wylie, “Toward a principled framework for benchmarking consistency,” in Proc. of the Eighth USENIX Conference on Hot Topics in System Dependability, ser. HotDep’12. Berkeley, CA, USA: USENIX Association, 2012, pp. 8–8. [Online]. Available: http://dl.acm.org/citation.cfm?id=2387858.2387866
[29] K. Birman and R. Friedman, Trading Consistency for Availability in Distributed Systems. Cornell University, Department of Computer Science, 1996. [Online]. Available: https://books.google.com/books?id=XAwDrgEACAAJ
[30] E. A. Brewer, “Towards robust distributed systems (Invited Talk),” in Proc. of the 19th ACM SIGACT-SIGOPS Symposium on Principles of Distributed Computing, 2000.
[31] L. Lamport, “Time, clocks and the ordering of events in a distributed system,” Communications of the ACM, vol. 21, no. 7, p. 558, 1978.
[32] Y. Mao, F. P. Junqueira, and K. Marzullo, “Mencius: Building efficient replicated state machines for WANs,” in OSDI ’08.
[33] J. Rao, E. J. Shekita, and S. Tata, “Using Paxos to build a scalable, consistent, and highly available datastore,” PVLDB, vol. 4, no. 4, p. 243, 2011.
[34] W. J. Bolosky, D. Bradshaw, R. B. Haagens, N. P. Kusters, and P. Li, “Paxos replicated state machines as the basis of a high-performance data store,” in Proc. of the 8th USENIX Conference on Networked Systems Design and Implementation, ser. NSDI’11. Berkeley, CA, USA: USENIX Association, 2011, pp. 11–11. [Online]. Available: http://dl.acm.org/citation.cfm?id=1972457.1972472
[35] T. Kraska, G. Pang, M. J. Franklin, S. Madden, and A. Fekete, “MDCC: multi-data center consistency,” in Proc. of the 8th ACM European Conference on Computer Systems, ser. EuroSys ’13. New York, NY, USA: ACM, 2013, pp. 113–126. [Online]. Available: http://doi.acm.org/10.1145/2465351.2465363
[36] J. C. Corbett, J. Dean, M. Epstein, A. Fikes, C. Frost, J. J. Furman, S. Ghemawat, A. Gubarev, C. Heiser, P. Hochschild, W. Hsieh, S. Kanthak, E. Kogan, H. Li, A. Lloyd, S. Melnik, D. Mwaura, D. Nagle, S. Quinlan, R. Rao, L. Rolig, Y. Saito, M. Szymaniak, C. Taylor, R. Wang, and D. Woodford, “Spanner: Google’s globally-distributed database,” in Proc. of the 10th USENIX Conference on Operating Systems Design and Implementation, ser. OSDI’12. Berkeley, CA, USA: USENIX Association, 2012, pp. 251–264. [Online]. Available: http://dl.acm.org/citation.cfm?id=2387880.2387905
[37] H. Yu and A. Vahdat, “Building replicated internet services using TACT: A toolkit for tunable availability and consistency tradeoffs,” in WECWIS, 2000, pp. 75–84. [Online]. Available: http://dblp.uni-trier.de/db/conf/wecwis/wecwis2000.html#YuV00

Subhajit Sidhanta received his PhD from the Department of Computer Science and Engineering, affiliated with the School of Electrical Engineering and Computer Science at Louisiana State University, USA, in 2016. He is currently a Postdoctoral Researcher working with the Distributed Systems Research Group at the INESC-ID research lab, affiliated with Instituto Superior Técnico at the University of Lisbon, Portugal. His major research interests encompass consistency in distributed systems and performance modelling of distributed storage and parallel computing systems.

Supratik Mukhopadhyay is a Professor in the Computer Science Department at Louisiana State University, USA. He received his PhD from the Max Planck Institut fuer Informatik and the University of Saarland in 2001, his MS from the Indian Statistical Institute, India, and his BS from Jadavpur University, India. His areas of specialization include formal verification of software, inference engines, big data analytics, distributed systems, parallel computing frameworks, and program synthesis.

Saikat Basu received his PhD from the Computer Science Department at Louisiana State University, Baton Rouge, USA. He received his BS in Computer Science from the National Institute of Technology, Durgapur, India, in 2011. His primary research interests include applications of Computer Vision and Machine Learning to satellite imagery data and using High Performance Computing architectures to address big data analytics problems for very high-resolution satellite datasets. He is particularly interested in the applications of various Deep Learning frameworks to satellite image understanding and representations.

Wojciech Golab received his PhD in Computer Science from the University of Toronto in 2010. After a post-doctoral fellowship at the University of Calgary, he spent two years as a Research Scientist at Hewlett-Packard Labs in Palo Alto, and then joined the University of Waterloo in 2012 as an Assistant Professor in Electrical and Computer Engineering. His research agenda focuses on algorithmic problems in distributed computing with applications to storage and processing of big data. Dr. Golab’s theoretical work on shared memory algorithms won the best paper award at DISC 2008, and was distinguished by ACM Computing Reviews as one of 91 “notable computing items published in 2012”.