Modeling Reliability of Cloud Infrastructure Software
Washington Garcia, Florida Atlantic University
Supervisor: Dr. Theophilus Benson
Edmund T. Pratt School of Engineering, Duke University
Challenges
In the past, researchers have manually annotated bugs
and conducted statistical analysis to learn what types of bugs
infest cloud infrastructure. However, manual annotation is
time-consuming. Unlike most classification problems, bug
classification requires domain knowledge and familiarity with
the code base to be useful, so crowdsourced annotation through
services such as Amazon Mechanical Turk is out of the question.
An added obstacle to bug classification is the nature of
bug descriptions themselves, which are often unstructured
natural language written by developers. Common natural
language processing techniques can fall short because, unlike
news articles, bug descriptions contain many typos,
domain-specific synonyms, abbreviations, and inconsistencies.
References
1. Gunawi et al. 2014. What Bugs Live in the Cloud? A Study of 3000+
Issues in Cloud Systems. In Proceedings of the ACM Symposium on
Cloud Computing (SoCC '14). ACM, New York, NY, USA, Article 7,
14 pages. DOI: https://doi.org/10.1145/2670979.2670986
2. A. Medem, M. I. Akodjenou, and R. Teixeira. 2009. TroubleMiner:
Mining Network Trouble Tickets. In IFIP/IEEE International
Symposium on Integrated Network Management Workshops (IM '09).
New York, NY, 113-119. DOI: https://doi.org/10.1109/INMW.2009.5195946
3. Agnes Sandor, Nikolaos Lagos, Ngoc-Phuoc-An Vo, and Caroline
Brun. 2016. Identifying User Issues and Request Types in Forum
Question Posts Based on Discourse Analysis. In Proceedings of
the 25th International Conference Companion on World Wide
Web (WWW '16 Companion). Republic and Canton of Geneva,
Switzerland, 685-691. DOI: https://doi.org/10.1145/2872518.2890568
GitHub for this project:
https://github.com/w-garcia/BugClustering
Future Work
Since the system is mostly modular, different components can
be swapped out with better implementations to improve
performance. Some proposed additions include:
• The system currently takes around 15-30 minutes to
classify a dataset. This could be sped up by injecting more
than one ticket at a time into step 2. However, the chance
of unlabeled tickets being clustered together rises as more
are injected into the dataset, which makes the resulting
clusters less useful.
• Using discourse analysis [3] to replace the banned word
list, synonym list, and phrase filter. Instead, useful words
would be found by identifying linguistic features of each
description.
• If proven accurate, our system can be used to analyze
bugs in systems such as OpenStack, Spark, Quagga,
ONOS, and OpenDaylight. Such analysis would provide
insight into the current state of cloud infrastructure and
how cloud development has shifted in the past few years.
Initial Results
Extended analysis of accuracy is planned; preliminary results
are available for four systems. Category accuracy is how often
the system correctly predicted a ticket's Cloud Bug Study
category (such as aspect, software, or hardware). Class accuracy
reflects its success at predicting specific classes for tickets
(such as a-consistency, sw-logic, or hw-disk). The system
performed best when predicting categories using Cassandra's
large model of 1200+ tickets.
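As a concrete illustration of the two metrics (not the project's actual evaluation code), the sketch below computes category and class accuracy from paired prediction and label lists; the ticket labels are invented examples.

    # Hypothetical sketch of the two metrics; the predictions and
    # labels below are invented for illustration, not real results.
    def category_of(label):
        # Cloud Bug Study classes carry a category prefix,
        # e.g. "sw-logic" falls under the software ("sw") category.
        return label.split("-")[0]

    def accuracies(predicted, actual):
        n = len(actual)
        class_hits = sum(p == a for p, a in zip(predicted, actual))
        category_hits = sum(category_of(p) == category_of(a)
                            for p, a in zip(predicted, actual))
        return category_hits / n, class_hits / n

    predicted = ["sw-logic", "hw-disk", "a-consistency"]
    actual = ["sw-logic", "hw-memory", "a-consistency"]
    print(accuracies(predicted, actual))  # (1.0, 0.666...): category vs. class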
Motivation
An increasing number of popular services rely on
cloud infrastructure for its convenience, low cost, and
scalability. However, as more services turn to the cloud to
store and deliver data to consumers, the faults
of cloud infrastructure become more apparent. When cloud
infrastructure fails, the consequences are disastrous, with
failures making national headlines. Popular services such as
Amazon, Dropbox, Netflix, and many social media sites all
rely on cloud computing at their core.
Although new cloud infrastructures have sprouted in
recent years, little is known about what types of bugs
they contain and how these bugs affect the quality of service
delivered to other components. We propose a system that
automatically classifies bug tickets using the natural language
descriptions provided by developers. This system allows
taxonomies of bugs to be built for new cloud infrastructures,
which can be used to shift development focus and help stop
failures before they happen.
Method
The primary objective was to create a system that can
classify unlabeled cloud infrastructure bugs with minimal
human intervention. The problem boils down to building a
classifier that outputs bug classifications (hardware, software,
etc.) given an unlabeled bug description as input. Thanks to
past research in this field, there is a large repository of
classified bugs from previous cloud infrastructures [1].
Process
1. Input to our system consists of classified bug tickets taken from the Cloud Bug Study [1]. The
bundled bug descriptions are very short, so each ticket is pre-processed using the Python JIRA API
to find the full bug description that corresponds to the issue ID. Next, each ticket is passed through
a stemmer, which uses the NLTK library to strip descriptions down to only nouns and verbs, then
reduce each word to its stem. A banned word list, system synonym list, and phrase filter remove
specific words that only add noise to the clustering. Finally, low-frequency words are filtered out.
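A minimal sketch of this preprocessing stage using NLTK; the sample description and banned-word list are placeholders, and the real pipeline also applies the synonym list and phrase filter:

    # Sketch of step 1's noun/verb filtering and stemming with NLTK.
    # Requires nltk.download("punkt") and
    # nltk.download("averaged_perceptron_tagger").
    import nltk
    from nltk.stem import PorterStemmer

    BANNED = {"use", "run"}  # stand-in for the project's banned word list
    stemmer = PorterStemmer()

    def preprocess(description):
        tokens = nltk.word_tokenize(description.lower())
        tagged = nltk.pos_tag(tokens)
        # Keep only nouns (NN*) and verbs (VB*), then stem each survivor.
        kept = [word for word, tag in tagged if tag.startswith(("NN", "VB"))]
        stems = [stemmer.stem(word) for word in kept]
        return [s for s in stems if s not in BANNED]

    print(preprocess("Replica nodes crash when log compaction runs"))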
2. Each bug description is encoded as a
vector of keyword weights. At this point,
an unlabeled ticket is injected into the
dataset, and a weight is computed for each
keyword. We use Document Frequency
(DF): the weight of any word k is the
number of tickets in the dataset in which
k appears.
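A sketch of the DF weighting, assuming each ticket has already been reduced to a list of stems as in step 1 (the sample tickets are invented):

    # Document Frequency weighting: the weight of word k is the number
    # of tickets whose description contains k.
    from collections import Counter

    def df_weights(tickets):
        df = Counter()
        for words in tickets:
            df.update(set(words))  # count each ticket at most once per word
        return df

    def to_vector(words, vocabulary, df):
        # One ticket becomes a vector of DF weights; absent words weigh 0.
        present = set(words)
        return [df[k] if k in present else 0 for k in vocabulary]

    tickets = [["replica", "crash"], ["crash", "log"], ["log", "compact"]]
    df = df_weights(tickets)
    vocabulary = sorted(df)  # fixed keyword order shared by every vector
    matrix = [to_vector(t, vocabulary, df) for t in tickets]
    print(matrix)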
3. An n × m matrix is created from the generated vectors,
where n is the number of tickets and m is the number of unique
words in the dataset. This matrix is passed as input to a hierarchical
agglomerative clustering algorithm provided by the Python library
SciPy. The output of the clustering algorithm is a binary tree. Each
parent is labeled with the intersection of its children's keywords.
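A minimal example of this clustering step with SciPy; the random matrix stands in for the real n × m keyword-weight matrix, and the "average" linkage method is an assumption:

    # Hierarchical agglomerative clustering of the ticket vectors.
    import numpy as np
    from scipy.cluster.hierarchy import linkage, to_tree

    rng = np.random.default_rng(0)
    X = rng.random((10, 25))  # placeholder: 10 tickets, 25 unique words

    Z = linkage(X, method="average")  # linkage criterion is an assumption
    root = to_tree(Z)  # binary tree of ClusterNode objects

    # Leaves are individual tickets; each internal node merges two clusters.
    print(root.get_count(), "tickets under the root")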
4. The binary tree is collapsed to an n-ary
tree using a modified version of the
algorithm presented by Medem et al. [2].
The previously unlabeled ticket is
marked, and its parent's label is used as
the classification.
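The project follows Medem et al. [2] here; the sketch below only illustrates the general idea of collapsing, absorbing a child into its parent when their keyword labels overlap strongly. The threshold and node layout are assumptions, not the paper's exact algorithm:

    # Simplified illustration of collapsing a binary tree into an n-ary
    # tree; NOT Medem et al.'s exact algorithm.
    THRESHOLD = 0.8  # assumed keyword-overlap ratio for merging

    class Node:
        def __init__(self, keywords, children=()):
            self.keywords = set(keywords)
            self.children = list(children)

    def collapse(node):
        merged = []
        for child in node.children:
            collapse(child)
            overlap = (len(child.keywords & node.keywords) / len(child.keywords)
                       if child.keywords else 0.0)
            if child.children and overlap >= THRESHOLD:
                merged.extend(child.children)  # lift grandchildren up a level
            else:
                merged.append(child)
        node.children = merged
        return node

    root = collapse(Node({"crash"},
                         [Node({"crash"}, [Node({"crash", "disk"}),
                                           Node({"crash", "log"})])]))
    print(len(root.children))  # 2: the redundant intermediate node is gone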
5. Steps 2-4 are repeated for each
unlabeled ticket in the desired
dataset, building a taxonomy of
classified bugs.
6. Steps 3 and 4 are repeated one more time with the
new taxonomy as input. An n-ary tree is generated,
giving a visual overview of the previously
unlabeled bugs and their predicted classifications.
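Tying the steps together, a driver loop might look like the runnable sketch below. classify_one() is a deliberately trivial stand-in for steps 2-4 (DF encoding, clustering, and tree collapse): it just borrows the label of the most word-similar known ticket, so only the loop structure reflects the real system.

    # Hypothetical end-to-end loop over unlabeled tickets (step 5).
    def classify_one(taxonomy, ticket):
        # Placeholder for steps 2-4: nearest neighbor by word overlap.
        def overlap(a, b):
            return len(set(a["words"]) & set(b["words"]))
        nearest = max(taxonomy, key=lambda known: overlap(known, ticket))
        return nearest["label"]

    taxonomy = [{"words": ["disk", "fail"], "label": "hw-disk"},
                {"words": ["logic", "loop"], "label": "sw-logic"}]
    unlabeled = [{"words": ["disk", "crash"], "label": None}]

    for ticket in unlabeled:      # one ticket at a time, as in step 5
        ticket["label"] = classify_one(taxonomy, ticket)
        taxonomy.append(ticket)   # the taxonomy of classified bugs grows
    print(unlabeled[0]["label"])  # -> hw-disk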
