A Novel Mind Map Based Approach for Log Data Extraction


Software log file analysis helps immensely in software testing and troubleshooting. The first step in automated log file analysis is extracting log data, which requires decoding the log file syntax and interpreting the data semantics. The expected output of this phase is an organization of the extracted data that is suitable for further processing. Log data extractors can be developed in popular programming languages, each targeting one or a few log file formats. Rather than repeating this process for every log file format, it is desirable to have a generic scheme for interpreting the elements of a log file and filling a data structure suitable for further processing. The new log data extraction scheme introduced in this paper is an attempt to provide the advanced features demanded by modern log file analysis procedures. It is a generic scheme capable of handling both text and binary log files with complex structures and difficult syntax. Its output is a tree filled with the information of interest for the particular case.
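To make the "tree filled with the information of interest" concrete, here is a minimal Python sketch of such a tree and its conversion to XML. It is an illustration only, not the author's implementation: the `MindMapNode` class, the `extract` helper, the `key = value` line format, and the `node`/`TEXT` XML names are all assumptions introduced for this example.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass, field
from typing import Iterable, List


@dataclass
class MindMapNode:
    """Hypothetical tree node: a text label plus ordered children."""
    text: str
    children: List["MindMapNode"] = field(default_factory=list)

    def add(self, text: str) -> "MindMapNode":
        """Append a child node and return it so calls can be chained."""
        child = MindMapNode(text)
        self.children.append(child)
        return child


def to_xml(node: MindMapNode) -> ET.Element:
    """Convert the tree to nested XML elements (element names are assumed)."""
    elem = ET.Element("node", {"TEXT": node.text})
    for child in node.children:
        elem.append(to_xml(child))
    return elem


def extract(lines: Iterable[str]) -> MindMapNode:
    """Fill a tree from 'key = value' lines, skipping lines that do not fit."""
    root = MindMapNode("log")
    for number, line in enumerate(lines, start=1):
        key, sep, value = line.partition("=")
        if not sep:
            continue  # line does not match the assumed format; ignore it
        entry = root.add(f"entry {number}")
        entry.add(key.strip()).add(value.strip())
    return root


tree = extract(["val = 2.3", "garbage", "mode=fast"])
xml_text = ET.tostring(to_xml(tree), encoding="unicode")
```

Because the result is an ordinary tree, later processing steps can rely on generic tree traversal rather than on format-specific code.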

My speech at ICSCA 2011 - http://dileepaj.blogspot.com/2011/07/speech-in-icsca-2011.html


Transcript of "A Novel Mind Map Based Approach for Log Data Extraction"

  1. A Novel Mind Map Based Approach for Log Data Extraction. Dileepa Jayathilake, Department of Electrical Engineering, University of Moratuwa, Sri Lanka. ICIIS 2011.
  2. AGENDA: Background; Problem Identification; Solution Overview; Solution Design; Implementation; Conclusion.
  3. BACKGROUND: LOG FILE ANALYSIS. Functional Conformance; Quality Verification; Troubleshooting; System Administrators; Domain Experts; Developers; Testers; Application Logs; Monitoring Tool Logs.
  4. BACKGROUND: PITFALLS IN MANUAL APPROACH. Labor Intensive; Require Expertise; Error-prone; Advantage of Recurrence not used.
  5. PROBLEM IDENTIFICATION: CHALLENGES. Challenges: Different log formats & structure; Lack of a common platform; Making rules human & machine readable. Result: Proprietary Implementation; Automation abandoned; Reports not customizable; Costly; Rules not human readable; Less resilient to format changes; Difficult to add new rules.
  6. PROBLEM IDENTIFICATION: EXISTING SUPPORT. XML: Universal format; Ubiquitous use; Many tools available; Costly meta data; Less human readable; Associated languages are complex; Not every log is XML. Log File Grammars: Formal definitions; Regular expression based; Assume line logs; Fail with complex log file structures; Unable to handle difficult syntax; Distant from XML.
  17. SOLUTION OVERVIEW: A GENERIC LOG ANALYSIS FRAMEWORK. EXPECTATIONS: Handle arbitrary formats and structures of log files; Resilient to log file format and structure changes; A knowledge representation which is both human and machine readable; In line with XML; Friendly for non-developers; Ability to generate custom reports.
  18. SOLUTION OVERVIEW: Log Files; Interpretation; Processing; Presentation; Knowledge Representation Schema. Unified mechanism for extracting information of interest from both text and binary log files with arbitrary structure and format; Easy mechanism to build and maintain a rule base for inferences; Flexible means for generating custom reports from inferences.
  19. SOLUTION DESIGN: MIND MAP AS KNOWLEDGE UNIT. MIND MAPS: Easy to add content; Easy to visualize; Resembles human knowledge organization better; Easy to combine; Easily convertible to XML; Easy access to computers. Tree: Can utilize existing tree algorithms; Can utilize existing tools.
  20. SOLUTION DESIGN: GENERIC INTERPRETATION. Log Files; Interpretation; Unified mechanism for extracting information of interest from both text and binary log files with arbitrary structure and format.
  21. SOLUTION IMPLEMENTATION: LOG FILE GRAMMAR. Assumes knowledge of file structure and syntax; Able to handle a spectrum of log file types; Based on hierarchical log entries; Log entries identified by attribute combination; Translates a log file into a mind map; Resilient to malformed log files.
  22. SOLUTION IMPLEMENTATION: PARSER.
  23. SOLUTION IMPLEMENTATION: EXAMPLE. Input: val = 2.3. Grammar: LE ≡ ([A,S,E,S,B], NO); A ≡ ([A1,A2,A3], NO); A1 ≡ (‘v’); A2 ≡ (‘a’); A3 ≡ (‘l’); S ≡ ({SPACE, TAB}, -1, 0, NO); SPACE ≡ (‘ ’); TAB ≡ (‘\t’); E ≡ (‘=’); B ≡ ({ZERO, ONE, …, NINE, DECIMAL_POINT}, -1, 1); ZERO ≡ (‘0’); ONE ≡ (‘1’); … ; NINE ≡ (‘9’); DECIMAL_POINT ≡ (‘.’)
  24. SOLUTION IMPLEMENTATION: MICROSOFT SHAREPOINT LOG FILE. Difficult syntax.
  25. SOLUTION IMPLEMENTATION: MICROSOFT APPLICATION VERIFIER LOG. XML.
  26. SOLUTION IMPLEMENTATION: TRADING SYSTEM LOG. Corrupted Log.
  27. CONCLUSION: The new scheme is capable of expressing both text and binary log files with different structures and formats, ranging from flat messages to complex hierarchies.
  29. QUESTIONS
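
The grammar on the EXAMPLE slide (input `val = 2.3`) can be approximated with a small recursive matcher. The Python sketch below is one possible reading of the slide, not the author's parser. It assumes that a bracketed list `[...]` is a sequence, that a braced set `{...}` followed by `-1, 0` or `-1, 1` is a repetition with an unbounded maximum and a minimum of 0 or 1, and that a quoted character is a literal. The trailing `NO` flag is not explained on the slide, so the sketch leaves it out. The class names `Char`, `Repeat`, and `Seq` and the functions `match` and `parse` are hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Union


@dataclass
class Char:
    name: str
    ch: str  # literal character to match


@dataclass
class Repeat:
    name: str
    choices: List[Char]  # any one of these may match at each step
    max_count: int       # -1 is read here as "unbounded" (assumption)
    min_count: int


@dataclass
class Seq:
    name: str
    parts: List["Rule"]  # rules matched one after another


Rule = Union[Char, Repeat, Seq]


def match(rule: Rule, text: str, pos: int, found: Dict[str, str]) -> Optional[int]:
    """Match rule at text[pos:]; return the end position, or None on failure."""
    start = pos
    if isinstance(rule, Char):
        if not text.startswith(rule.ch, pos):
            return None
        pos += len(rule.ch)
    elif isinstance(rule, Repeat):
        count = 0
        while rule.max_count == -1 or count < rule.max_count:
            hit = next((c for c in rule.choices if text.startswith(c.ch, pos)), None)
            if hit is None:
                break
            pos += len(hit.ch)
            count += 1
        if count < rule.min_count:
            return None
    else:  # Seq: every part must match in order
        for part in rule.parts:
            end = match(part, text, pos, found)
            if end is None:
                return None
            pos = end
    found[rule.name] = text[start:pos]  # record the text matched by this rule
    return pos


DIGIT_NAMES = ["ZERO", "ONE", "TWO", "THREE", "FOUR",
               "FIVE", "SIX", "SEVEN", "EIGHT", "NINE"]
A = Seq("A", [Char("A1", "v"), Char("A2", "a"), Char("A3", "l")])
S = Repeat("S", [Char("SPACE", " "), Char("TAB", "\t")], -1, 0)
E = Char("E", "=")
B = Repeat("B", [Char(n, str(i)) for i, n in enumerate(DIGIT_NAMES)]
           + [Char("DECIMAL_POINT", ".")], -1, 1)
LE = Seq("LE", [A, S, E, S, B])


def parse(line: str) -> Optional[Dict[str, str]]:
    """Return the matched text per rule name, or None if the line is rejected."""
    found: Dict[str, str] = {}
    end = match(LE, line, 0, found)
    return found if end == len(line) else None
```

With these definitions, `parse("val = 2.3")` returns a dictionary whose `"B"` entry holds the matched value text, while a line such as `"val = "` is rejected because `B` requires at least one character.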