Project Aurora
OpenStack log analysis...
- Raghavendra R M
Vinod Kumar L
What is it?
● Analysis of OpenStack boot logs using data analysis techniques to identify, understand, and debug boot problems
● Extraction of the textual structure of OpenStack logs, followed by semantic log classification
How we Approached it...
● Log Collection
● Log Preprocessing and substitutions
● Extracting structure through textual clustering
● Online Semantic Log Classification
Log Collection
Design of log collection automation...
Automation Architecture
[Figure: automation architecture. Labels: HOST OS, GUEST OS, OpenStack, VM, VM. VMRUN VIX APIs start the server and the OpenStack services and spawn VMs; logs are collected from OpenStack.]
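A minimal sketch of how the start and collect steps could be scripted around the vmrun command-line tool. It only covers starting the guest and copying one log file back to the host. The "-T ws" host type, the .vmx path, the credentials, and the log paths are assumptions for illustration, not details taken from the slides.

```python
import subprocess

def start_vm_cmd(vmx_path):
    # Build the vmrun command that powers on the guest without a GUI window.
    return ["vmrun", "-T", "ws", "start", vmx_path, "nogui"]

def copy_log_cmd(vmx_path, user, password, guest_path, host_path):
    # Build the vmrun command that copies one file from the guest to the host.
    return ["vmrun", "-T", "ws", "-gu", user, "-gp", password,
            "copyFileFromGuestToHost", vmx_path, guest_path, host_path]

def run(cmd):
    # Execute a prepared command and raise if vmrun reports a failure.
    subprocess.run(cmd, check=True)

# Hypothetical paths and credentials, shown only to illustrate the call shape.
start = start_vm_cmd("/vms/openstack/openstack.vmx")
collect = copy_log_cmd("/vms/openstack/openstack.vmx", "stack", "secret",
                       "/var/log/syslog", "./logs/cycle-001.log")
```

Keeping command construction separate from execution makes each boot cycle easy to log and repeat; call run(start) and run(collect) on a host where vmrun is installed.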
Log Preprocessing
Getting the logs fit for analysis...
Data set...
● Logs of 100 boot-up cycles
● Total messages: 337,958 lines
● Alert categories: 7
● Size: 63.1 MB
Log structure
Nov 13 07:57:23 pesos 2013-11-13 07:57:23.330 17739 DEBUG nova.openstack.common.rpc.amqp [-] UNIQUE_ID is 7458faafe75b4be3b51fe21c39820cd7. _add_unique_id /opt/stack/nova/nova/openstack/common/rpc/amqp.py:341
Nov 13 07:57:23 pesos 2013-11-13 07:57:23.227 17739 AUDIT nova.compute.resource_tracker [-] Free disk (GB): 195
Alert categories: DEBUG -> INFO -> AUDIT -> WARNING -> ERROR -> CRITICAL -> TRACE
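The two sample lines share a fixed prefix (syslog timestamp, host, service timestamp, process id, alert category, module, request context) followed by a free-form message. A rough sketch of splitting a line into those fields is shown here; the field names and the regular expression are assumptions inferred from the two samples, not the authors' parser.

```python
import re

# Pattern inferred from the sample lines; field names are illustrative.
LOG_RE = re.compile(
    r"^(?P<syslog_ts>\w{3}\s+\d+\s[\d:]{8})\s"
    r"(?P<host>\S+)\s"
    r"(?P<ts>\d{4}-\d{2}-\d{2}\s[\d:.]+)\s"
    r"(?P<pid>\d+)\s"
    r"(?P<level>[A-Z]+)\s"
    r"(?P<module>\S+)\s"
    r"\[(?P<context>[^\]]*)\]\s"
    r"(?P<message>.*)$"
)

def parse_line(line):
    # Return a dict of named fields, or None when the line does not match.
    match = LOG_RE.match(line)
    return match.groupdict() if match else None

sample = ("Nov 13 07:57:23 pesos 2013-11-13 07:57:23.227 17739 AUDIT "
          "nova.compute.resource_tracker [-] Free disk (GB): 195")
record = parse_line(sample)
```

Only the message field carries the variable text that the later clustering steps work on; the prefix fields can be kept as metadata.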
Key Characteristics...
● Logs contain redundant and duplicate information
● Logs have unknown message structure
● Log messages contain a small set of unique words but a large number of numbers and other symbols
● The distribution of words differs from that found in natural languages
● Log messages tend to be short but vary in length
Substitutions...
● Non-word tokens are substituted
○ numbers - ‘<num>’
○ path/url - ‘<path>’
○ url - ‘<url>’
○ ip - ‘<ip>’
○ Annotations - ‘’
○ keys - ‘<y>’
○ Unicode keys - ‘<x>’
● Total number of unique tokens
○ before - 1.7 lakh (about 170,000)
○ after - 1216
● Total number of unique logs
○ before - 3.5 lakh (about 350,000)
○ after - 1047
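The substitution step can be sketched as an ordered list of regular-expression rewrites. The patterns below are assumptions made for illustration (for example, that a "key" is a 32-character hexadecimal string like the UNIQUE_ID in the sample log line); the slides do not give the actual rules, and the annotation and Unicode-key rules are left out because their form is not stated.

```python
import re

# Ordered rules: more specific patterns run before the generic number rule.
RULES = [
    (re.compile(r"\b[a-z][a-z0-9+.-]*://\S+"), "<url>"),
    (re.compile(r"\b\d{1,3}(?:\.\d{1,3}){3}\b"), "<ip>"),
    (re.compile(r"(?:/[\w.-]+)+"), "<path>"),
    (re.compile(r"\b[0-9a-f]{32}\b"), "<y>"),
    (re.compile(r"\b\d+(?:\.\d+)?\b"), "<num>"),
]

def substitute(message):
    # Apply every rule in order and return the normalised message.
    for pattern, placeholder in RULES:
        message = pattern.sub(placeholder, message)
    return message

raw = ("UNIQUE_ID is 7458faafe75b4be3b51fe21c39820cd7. _add_unique_id "
       "/opt/stack/nova/nova/openstack/common/rpc/amqp.py:341")
normalised = substitute(raw)
```

Messages that differ only in their variable parts collapse to the same normalised string, which is what drives the large drop in unique tokens and unique logs.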
Extracting structure through textual clustering
● Modified DBSCAN algorithm
○ adapted to work on a per-message basis
● Key components for algorithm
○ Measuring similarity
○ Determining the similarity threshold for clustering
Measuring similarity
● Modified Levenshtein distance (LD) algorithm
○ uses entire tokens, rather than characters, as the unit of each edit operation
○ normalises the distance (Normalised LD, NLD) to avoid bias toward longer strings
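A minimal sketch of a token-level, normalised Levenshtein distance. Tokens are taken to be whitespace-separated words, and the distance is divided by the token count of the longer message; both choices are assumptions, since the slides do not say how tokens are split or which normalisation was used.

```python
def token_levenshtein(a, b):
    # Classic dynamic-programming edit distance, but over token lists:
    # inserting, deleting, or replacing one whole token costs 1.
    prev = list(range(len(b) + 1))
    for i, tok_a in enumerate(a, start=1):
        curr = [i]
        for j, tok_b in enumerate(b, start=1):
            cost = 0 if tok_a == tok_b else 1
            curr.append(min(prev[j] + 1,          # delete tok_a
                            curr[j - 1] + 1,      # insert tok_b
                            prev[j - 1] + cost))  # replace or keep
        prev = curr
    return prev[-1]

def normalised_ld(msg_a, msg_b):
    # Scale the distance into [0, 1] by the longer message's token count.
    a, b = msg_a.split(), msg_b.split()
    longest = max(len(a), len(b))
    if longest == 0:
        return 0.0
    return token_levenshtein(a, b) / longest

d = normalised_ld("Free disk (GB): <num>", "Free ram (MB): <num>")
```

Here two of the four tokens differ, so d is 0.5; identical messages score 0.0 and messages with no token in common score 1.0.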
Determining similarity threshold
● NLD calculated for 10^11 pairs!
Online semantic log classification
● Extension to online DBSCAN algorithm
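To illustrate the online setting, the sketch below assigns each incoming message to the nearest existing cluster when its normalised token distance is within a threshold, and opens a new cluster otherwise. This is a simplified nearest-representative scheme, not DBSCAN itself and not the authors' extension; the class name, the threshold value 0.3, and the use of the first message as the cluster representative are all assumptions.

```python
def nld(a, b):
    # Normalised token-level Levenshtein distance between two token lists.
    prev = list(range(len(b) + 1))
    for i, tok_a in enumerate(a, start=1):
        curr = [i]
        for j, tok_b in enumerate(b, start=1):
            cost = 0 if tok_a == tok_b else 1
            curr.append(min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost))
        prev = curr
    longest = max(len(a), len(b))
    return prev[-1] / longest if longest else 0.0

class OnlineClusterer:
    def __init__(self, eps):
        self.eps = eps              # similarity threshold (hypothetical value)
        self.representatives = []   # one token list per cluster

    def classify(self, message):
        # Return the id of the closest cluster within eps, creating one if needed.
        tokens = message.split()
        best_id, best_d = None, None
        for cid, rep in enumerate(self.representatives):
            d = nld(tokens, rep)
            if d <= self.eps and (best_d is None or d < best_d):
                best_id, best_d = cid, d
        if best_id is None:
            self.representatives.append(tokens)
            best_id = len(self.representatives) - 1
        return best_id

clusterer = OnlineClusterer(eps=0.3)
ids = [clusterer.classify(m) for m in (
    "Free disk (GB): <num>",
    "UNIQUE_ID is <y>. _add_unique_id <path>:<num>",
    "Free ram (GB): <num>",
)]
```

Each message is handled as it arrives, so the cost per message grows with the number of clusters rather than with the number of messages seen so far.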
Performance Evaluation
● Classification accuracy
○ Total number of clusters - 255
○ Average accuracy - 100 %
Thank You !
Raghavendra R M
Vinod Kumar L
