Aurora is an open-source CLI application that collects OpenStack boot logs across 100 boot cycles and performs textual structure extraction and semantic log classification on them.
2. What is it?
● Analysis of OpenStack boot logs using data analysis techniques to identify, understand and debug problems during boot
● Extracting the textual structure of OpenStack logs and performing semantic log classification
3. How we approached it...
● Log Collection
● Log Preprocessing and substitutions
● Extracting structure through textual clustering
● Online Semantic Log Classification
4. Design of log collection automation...
Log Collection
5. Automation Architecture
[Architecture diagram: the host OS uses the VMRUN/VIX APIs to start the server and the OpenStack services and to spawn guest VMs; logs are collected from OpenStack running inside the guest OS.]
7. Data set...
● Logs of 100 boot-up cycles
● Total messages: 337,958 lines
● Alert categories: 7
● Size: 63.1 MB
8. Log structure
Nov 13 07:57:23 pesos 2013-11-13 07:57:23.330 17739 DEBUG nova.openstack.common.rpc.amqp [-] UNIQUE_ID is 7458faafe75b4be3b51fe21c39820cd7. _add_unique_id /opt/stack/nova/nova/openstack/common/rpc/amqp.py:341
Nov 13 07:57:23 pesos 2013-11-13 07:57:23.227 17739 AUDIT nova.compute.resource_tracker [-] Free disk (GB): 195
Severity levels: DEBUG -> INFO -> AUDIT -> WARNING -> ERROR -> CRITICAL -> TRACE
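The fields visible in the lines above (syslog timestamp, host, OpenStack timestamp, PID, severity, logger name, message) can be pulled apart with a regular expression. A minimal sketch, assuming this fixed layout; the pattern and field names are our own, not from the Aurora code:

```python
import re

# Illustrative regex for the OpenStack log lines shown above:
# syslog timestamp, host, service timestamp, PID, severity, logger, message.
LOG_PATTERN = re.compile(
    r'^(?P<syslog_ts>\w{3} \d{2} \d{2}:\d{2}:\d{2}) '
    r'(?P<host>\S+) '
    r'(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d+) '
    r'(?P<pid>\d+) '
    r'(?P<severity>DEBUG|INFO|AUDIT|WARNING|ERROR|CRITICAL|TRACE) '
    r'(?P<logger>\S+) '
    r'(?P<message>.*)$'
)

line = ('Nov 13 07:57:23 pesos 2013-11-13 07:57:23.227 17739 '
        'AUDIT nova.compute.resource_tracker [-] Free disk (GB): 195')
m = LOG_PATTERN.match(line)
if m:
    print(m.group('severity'), m.group('logger'))  # AUDIT nova.compute.resource_tracker
```

Splitting the severity and logger out of each message is what makes the later per-message preprocessing and clustering possible.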
9. Key Characteristics...
● Logs contain redundant and duplicate information
● Logs have an unknown message structure
● Log messages contain a small set of unique words but a large number of numbers and other symbols
● The distribution of words is different from that found in natural languages
● Log messages tend to be short but are of variable length
10. Substitutions...
● Non-word tokens are substituted:
○ numbers - ‘<num>’
○ paths - ‘<path>’
○ URLs - ‘<url>’
○ IPs - ‘<ip>’
○ annotations - ‘’ (removed)
○ keys - ‘<y>’
○ Unicode keys - ‘<x>’
● Total number of unique tokens
○ before - 1.7 lakh (170,000)
○ after - 1216
● Total number of unique log messages
○ before - 3.5 lakh (350,000)
○ after - 1047
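The substitutions above can be applied as an ordered list of regex rewrites, with more specific patterns run before more general ones (URL before path, IP before plain number). A sketch under that assumption; the exact patterns here are our own approximations, not Aurora's:

```python
import re

# Illustrative substitution rules, applied in order so that more
# specific patterns win over more general ones.
SUBSTITUTIONS = [
    (re.compile(r'https?://\S+'), '<url>'),              # URLs
    (re.compile(r'(/[\w.\-]+)+(:\d+)?'), '<path>'),      # filesystem paths
    (re.compile(r'\b\d{1,3}(\.\d{1,3}){3}\b'), '<ip>'),  # IPv4 addresses
    (re.compile(r'\b[0-9a-f]{32}\b'), '<y>'),            # 32-char hex keys
    (re.compile(r'\b\d+\b'), '<num>'),                   # plain numbers
]

def substitute(message: str) -> str:
    for pattern, token in SUBSTITUTIONS:
        message = pattern.sub(token, message)
    return message

msg = ('UNIQUE_ID is 7458faafe75b4be3b51fe21c39820cd7. _add_unique_id '
       '/opt/stack/nova/nova/openstack/common/rpc/amqp.py:341')
print(substitute(msg))  # UNIQUE_ID is <y>. _add_unique_id <path>
```

Collapsing variable tokens this way is what shrinks 170,000 unique tokens to 1216 and 350,000 unique messages to 1047.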
11. Extracting structure through textual clustering
● Modified DBSCAN algorithm
○ adapted to work on a per-message basis
● Key components of the algorithm
○ Measuring similarity
○ Determining the similarity threshold for clustering
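The per-message clustering can be sketched as a single greedy pass: each message either joins the first cluster whose representative is similar enough, or seeds a new cluster. This is a simplified stand-in for the modified DBSCAN described above, and the Jaccard token-overlap similarity and threshold value are placeholder choices of our own:

```python
# Simplified density-style clustering over individual log messages.
# Each message is compared against existing cluster representatives;
# the 0.7 threshold and Jaccard similarity are illustrative only.

def jaccard(a: str, b: str) -> float:
    ta, tb = set(a.split()), set(b.split())
    return len(ta & tb) / len(ta | tb) if ta | tb else 1.0

def cluster_messages(messages, threshold=0.7):
    clusters = []
    for msg in messages:
        for cluster in clusters:
            # Compare against the cluster's representative (first member).
            if jaccard(msg, cluster[0]) >= threshold:
                cluster.append(msg)
                break
        else:
            clusters.append([msg])
    return clusters

logs = [
    'Free disk (GB): <num>',
    'Free disk (GB): <num>',
    'UNIQUE_ID is <y>',
]
print(len(cluster_messages(logs)))  # 2
```

Because the messages were normalised by the substitution step first, structurally identical lines collapse into the same cluster even when their original numbers and IDs differed.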
12. Measuring similarity
● Modified Levenshtein distance algorithm
○ uses entire tokens as the operation set
○ normalised (Normalised LD) to avoid the bias of giving more importance to longer strings
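The two modifications above amount to running the standard edit-distance dynamic program over tokens instead of characters, then dividing by the length of the longer message. A sketch of that idea; function names are our own:

```python
# Levenshtein distance with whole tokens as the unit of insertion,
# deletion and substitution, normalised so message length carries no bias.

def token_levenshtein(a: list, b: list) -> int:
    # Classic dynamic-programming edit distance, one row at a time.
    prev = list(range(len(b) + 1))
    for i, ta in enumerate(a, 1):
        curr = [i]
        for j, tb in enumerate(b, 1):
            cost = 0 if ta == tb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution
        prev = curr
    return prev[-1]

def normalised_similarity(x: str, y: str) -> float:
    a, b = x.split(), y.split()
    if not a and not b:
        return 1.0
    return 1.0 - token_levenshtein(a, b) / max(len(a), len(b))

print(normalised_similarity('Free disk (GB): <num>',
                            'Free memory (MB): <num>'))  # 0.5
```

Dividing by the longer token count keeps a one-token difference in a long message from looking more similar than the same difference in a short one.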