Large Scale Log Analysis with HBase and Solr at Amadeus (Martin Alig, ETH Zurich)

This talk was held at the second meeting of the Swiss Big Data User Group on July 16 at ETH Zürich.
http://www.bigdata-usergroup.ch/item/296477

Published in: Technology, Education

  1. Large Scale Log Analysis with HBase and Solr at Amadeus
       Martin Alig, aligma@student.ethz.ch
  2. Overview
       • Problem
       • Solution - Overview
       • HBase
       • Solr
       • Solution - Details
       • Results
       Monday, July 16, 2012
  3. Problem
       • Amadeus is the world's leading technology provider to the travel industry, providing marketing, distribution and IT services worldwide.
       • The Amadeus computer reservation system (CRS) processed 850 million billable travel transactions in 2010.
       • The current logging framework produces 100,000 to 1,000,000 messages per second.
  4. Problem - Log Messages
       • Messages have an average size of 1 KB.
       • A message can be anything: XML, EDIFACT, hex dump, ...
       • A few fixed attributes are given per message: timestamp, source, various IDs.
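The two figures above (message rate and average message size) imply a raw data volume that is easy to work out. The following is a rough sketch only: it assumes 1 KB = 1024 bytes and ignores compression, replication and storage overhead, none of which the slides quantify.

```python
# Back-of-the-envelope log volume, using only figures quoted on the slides.
AVG_MESSAGE_BYTES = 1024                  # "1 KB average size" (assumption: 1 KB = 1024 bytes)
RATES_PER_SECOND = (100_000, 1_000_000)   # lower and upper bound quoted on the slides
SECONDS_PER_DAY = 24 * 60 * 60


def volume(rate_per_second: int) -> dict:
    """Return raw (uncompressed) bytes per second and per day for a message rate."""
    per_second = rate_per_second * AVG_MESSAGE_BYTES
    return {
        "bytes_per_second": per_second,
        "bytes_per_day": per_second * SECONDS_PER_DAY,
    }


if __name__ == "__main__":
    for rate in RATES_PER_SECOND:
        v = volume(rate)
        print(
            f"{rate:>9,} msg/s: "
            f"{v['bytes_per_second'] / 2**20:.1f} MiB/s, "
            f"{v['bytes_per_day'] / 2**40:.1f} TiB/day"
        )
```

Even at the lower bound this is on the order of 100 MB per second of raw text, which is the scale the storage layer has to absorb.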
  5. Problem - Current Solution
       • Write log messages to plain text files.
       • Split, compress and copy to SAN.
       • Queries? Search? Statistics?
  6. Solution Overview
       • Use Apache HBase for storage and instant random access.
       • Apache Hadoop MapReduce for complex queries.
       • Apache Solr as a full-text search engine for queries on the log messages.
  7. Apache HBase
       • Open source, non-relational, distributed database.
       • Modeled after Google's Bigtable.
       • Runs on top of the Hadoop Distributed File System (HDFS).
  8. HBase - Terms
       • Region
           • Contiguous ranges of rows stored together
           • Dynamically split / merged and distributed
       • RegionServer (slave)
           • Serves regions, e.g. data for reads and writes
       • HMaster (master)
           • Responsible for coordination
           • Assigns regions to RegionServers, detects failures
           • Admin functions
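Because a region is a contiguous range of row keys, finding the region (and hence the RegionServer) for a key amounts to a search over sorted region start keys. The sketch below illustrates that idea in plain Python; the start keys, server names and function names are all made up for the example, and this is not how the HBase client is implemented.

```python
from bisect import bisect_right


def locate_region(start_keys, row_key):
    """Index of the region whose key range [start, next start) contains row_key.

    start_keys must be sorted and begin with "" (the start of the key space).
    """
    return bisect_right(start_keys, row_key) - 1


def split_region(start_keys, servers, split_key, new_server):
    """Split the region containing split_key; the upper half goes to new_server."""
    idx = bisect_right(start_keys, split_key)
    if start_keys[idx - 1] == split_key:
        raise ValueError("split key is already a region boundary")
    start_keys.insert(idx, split_key)
    servers.insert(idx, new_server)


if __name__ == "__main__":
    # Hypothetical layout: four regions spread over three RegionServers.
    start_keys = ["", "g", "n", "t"]
    servers = ["rs1", "rs2", "rs3", "rs1"]

    print(servers[locate_region(start_keys, "apple")])   # region starting at ""
    split_region(start_keys, servers, "k", "rs3")        # region "g".."n" is split at "k"
    print(servers[locate_region(start_keys, "monday")])  # now served from the new region
```

A split only adds one boundary, so keys outside the split region keep their assignment.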
  9. HBase - Architecture
       [Diagram: Client; ZooKeeper (x3); HMaster (x2); RegionServer (x3); HDFS]
  10. HBase - Data Access
       • Java API
       • REST
       • Apache Avro, Apache Thrift
       • Hadoop MapReduce
  11. HBase - Secondary Indexes
       • No native support for secondary indexes
       • Different choices:
           • Client managed: Write value in data table and index in index table
           • Coprocessors that automatically create the secondary index
           • Periodic update: Use MapReduce job to add index
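The client-managed option can be sketched with two kinds of tables: the data table keyed by row key, and one index table per attribute that maps attribute values back to row keys. The example below uses plain dictionaries as stand-ins for HBase tables; the attribute names and the `value|row_key` index key layout are assumptions for illustration, not details from the talk.

```python
from collections import defaultdict

# In-memory stand-ins for HBase tables (assumption: dicts instead of a real cluster).
data_table = {}                    # row key -> message attributes
index_tables = defaultdict(dict)   # attribute name -> {index row key -> data row key}

INDEXED_ATTRIBUTES = ("corr_id", "msg_id")   # hypothetical attribute names


def put_with_indexes(row_key, message):
    """Write the message, then one index entry per indexed attribute.

    Returns the number of writes issued, to make the extra cost visible.
    """
    data_table[row_key] = message
    writes = 1
    for attr in INDEXED_ATTRIBUTES:
        if attr in message:
            # Appending the row key keeps entries with equal values distinct.
            index_tables[attr][f"{message[attr]}|{row_key}"] = row_key
            writes += 1
    return writes


def lookup(attr, value):
    """Find messages by attribute value (a prefix scan in a real index table)."""
    prefix = f"{value}|"
    hits = sorted(k for k in index_tables[attr] if k.startswith(prefix))
    return [data_table[index_tables[attr][k]] for k in hits]


if __name__ == "__main__":
    put_with_indexes("row-1", {"corr_id": "000100E1A1EU42", "msg_id": "A", "body": "..."})
    put_with_indexes("row-2", {"corr_id": "000100E1A1EU42", "msg_id": "B", "body": "..."})
    print(len(lookup("corr_id", "000100E1A1EU42")))
```

The trade-off is that every message now costs several writes, and the client is responsible for keeping data and index tables consistent.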
  12. HBase - Coprocessors
       • Run arbitrary code on any node:
           • Observer: RegionObserver, MasterObserver, WALObserver provide hooks for code execution (prePut, postPut, preGet, postGet, ...)
           • Endpoint: Installed on nodes, executed on client request
  13. Apache Solr
       • Apache Lucene + many features like
           • Distributed index
           • Distributed search
           • ...
       • Apache Lucene is a high-performance, full-featured text search engine library.
  14. Solution - Details
       • Client -> HBase: Insert log messages, create secondary indexes for predefined attributes.
       • HBase -> Solr: Use coprocessor functionality to index log messages in Solr after insert.
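The flow on this slide (insert into HBase, then index in Solr from a post-insert hook) can be mimicked in a few lines. This is a toy model only: `ToyTable` and `ToySearchIndex` are hypothetical stand-ins written for this example, not the HBase coprocessor API or a Solr client, and the tokenizer is deliberately simplistic.

```python
import re
from collections import defaultdict


class ToySearchIndex:
    """Stand-in for Solr: a tiny in-memory inverted index (token -> row keys)."""

    def __init__(self):
        self.postings = defaultdict(set)

    def index(self, row_key, text):
        # Split on anything that is not a letter or digit, case-insensitively.
        for token in re.findall(r"[A-Za-z0-9]+", text.lower()):
            self.postings[token].add(row_key)

    def search(self, term):
        return sorted(self.postings.get(term.lower(), set()))


class ToyTable:
    """Stand-in for an HBase table with observer-style hooks run after each put."""

    def __init__(self):
        self.rows = {}
        self.post_put_hooks = []

    def put(self, row_key, message):
        self.rows[row_key] = message
        for hook in self.post_put_hooks:   # plays the role of a postPut observer
            hook(row_key, message)


if __name__ == "__main__":
    solr = ToySearchIndex()
    table = ToyTable()
    table.post_put_hooks.append(solr.index)

    table.put("row-1", "Query [SAP=1ASICDBETK, TRXNB=1, CorrID=000100E1A1EU42]")
    print(solr.search("000100E1A1EU42"))
```

The client only talks to the table; indexing happens as a side effect of the write, which is the property the coprocessor approach is meant to provide.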
  15. Solution - Cluster Configuration
       [Diagram: Client; ZooKeeper, NameNode, SecondaryNameNode, HMaster; DataNode (x3), RegionServer (x3), Solr (x3), ...]
  16. Solution - HBase & MapReduce
       • Very good integration of MapReduce into HBase
       • Easy to use HBase as data source, data sink or both
       • Provides helper classes
  17. Solution - Problems
       • Can Solr keep up with HBase?
       • Is Solr full-text search practical for log messages? (XML, other formats, ...)
  18. Results
       • Not many, yet.
       • Generic experiments with random data
       • Experiments with real log data just started
  19. Results - Write Random Data - HBase Only
       • Insert random data, 1 KB records.
       • Cluster configuration:
           • 5 nodes:
               • RAM: 24 GiB
               • CPU: Intel Xeon L5520 2.26 GHz
               • HD: 2x 15k RPM SAS 73 GB (RAID 1)
           • Node 1: Master (NameNode, HMaster, ZooKeeper)
           • Nodes 2-5: Slaves (DataNode, RegionServer)
       • Client on a separate node.
       • Experiment executed with and without secondary indexes (5 additional indexes).
  20. Results - Write Random Data - HBase Only
       Configuration            Avg. inserts/sec
       No secondary indexes     ~30,000
       Secondary indexes        ~6,000 (not counting index inserts)
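The drop from roughly 30,000 to roughly 6,000 inserts per second is easier to read once the index writes are counted. The sketch below assumes that each of the 5 additional indexes costs exactly one extra put per message; the slides do not spell this out, so treat the result as an estimate rather than a measurement.

```python
def total_puts_per_second(message_rate: int, num_indexes: int) -> int:
    """Total puts per second if every message causes one data put
    plus one put per secondary index (assumption, see lead-in)."""
    return message_rate * (1 + num_indexes)


if __name__ == "__main__":
    without_indexes = total_puts_per_second(30_000, 0)
    with_indexes = total_puts_per_second(6_000, 5)
    print(f"puts/sec without indexes: {without_indexes:,}")
    print(f"puts/sec with 5 indexes:  {with_indexes:,}")
```

Under that assumption the cluster performs a similar number of puts in both runs, so the lower message rate would mostly reflect the extra writes per message.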
  21. Results - Write Read Data - HBase & Solr
       • No real numbers
       • First tests: A single Solr instance indexes ~1,000 log messages per second.
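To put the single-instance figure next to the message rates quoted earlier in the talk, one can compute how many Solr instances would be needed if throughput scaled linearly with the number of instances. Linear scaling is an assumption made purely for this estimate; the talk reports no multi-instance measurements.

```python
import math


def instances_needed(target_rate: int, per_instance_rate: int = 1_000) -> int:
    """Solr instances required for target_rate messages/sec,
    assuming perfectly linear scaling (not verified by the talk)."""
    return math.ceil(target_rate / per_instance_rate)


if __name__ == "__main__":
    for label, rate in [
        ("measured HBase rate with indexes", 6_000),
        ("logging framework, lower bound", 100_000),
        ("logging framework, upper bound", 1_000_000),
    ]:
        print(f"{label}: {instances_needed(rate)} instances")
```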
  22. Questions
  24. HBase - Architecture
       Source: HBase - The Definitive Guide
  25. HBase - Key Design
       Source: HBase - The Definitive Guide
  26. HBase - Hardware
       • Master
           • RAM: 24 GB
           • CPU: Dual quad-core
           • Disks: 4 x 1 TB SATA, RAID 0+1
       • Slave
           • RAM: 24 GB or more
           • CPU: Dual quad-core
           • Disks: 6 + 1 TB SATA, JBOD
  27. HBase - Monitoring
       • Ganglia is a scalable distributed monitoring system for high-performance computing systems such as clusters and grids.
       • HBase provides metrics for Ganglia.
  28. Log Message Example (1)
       2012/05/15 04:33:04.783757 sitst201 srvT2M-838059
       Trace name: all0302
       Message sent [con=19104962 (FE_EXT_TCIL-ISO9735_ETK- 310_OPK2_ETK-REQ), cxn=1498840662 (172.17.39.174:13101), addr=0x1db58830, len=354, CorrID=000100E1A1EU42, MsgID=SQ8ZK36LG3TJ12JE6XMU2O8]
       UNB^]IATB^_1^]1AETH^_^_LY^]CDBETICKET^_^_LY^]1205 15^_0433^]00JNQPH79K0001^]^]^]O^UNH^]1^]TKCREQ^ _08^_5^_1A^]000100E1A1EU42^DCX^]134^]<DCC VERS="1.0"><MW><UKEY VAL="EXRU$3013#GJ12V4K#1IZ" TRXNB="1"/><$
  29. Log Message Example (2)
       2012/05/15 04:33:04.783671 sitst201 srvT2M-838059
       Trace name: all0302
       Query [SAP=1ASICDBETK, DCXID=EXRU$3013#GJ12V4K#1IZ, TRXNB=1, CorrID=000100E1A1EU42, MsgID=SQ8ZK36LG3TJ12JE6XMU2O8]
  30. Log Message Example (3)
       2012/05/15 04:32:42.289282 sitmt301 muxT2-332108
       Trace name: all0302
       Message received [con=17697 (inSrvT2_TCIL_1), cxn=1626671045 (194.156.170.210:8000), addr=0x13e9b830, len=1710, CorrID=09B5840E, MsgID=OX7E09RYABBLS61HR2DXTL]
       +----- ADDR -----+--------------- HEX ---------------+----- ASCII ----+---- EBCDIC ----+
       0000000013e9b830 554e421d 49415442 1f311d31 4153494c UNB.IATB.1.1ASIL .+.............<
       0000000013e9b840 53533243 53544e1d 3141304c 53534352 SS2CSTN.1A0LSSCR ......+....<....
       0000000013e9b850 591d3132 30353135 1f303433 321d3030 Y.120515.0432.00 ................
       0000000013e9b860 39 ...
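The three example messages share a common header, which is where the "few fixed attributes" (timestamp, source, IDs) can be extracted for secondary indexes. The sketch below parses that header with a regular expression. The field names `host` and `process` are guesses at what the second and third tokens mean, and the pattern is derived only from the three examples shown, so it may not cover other message layouts.

```python
import re
from datetime import datetime

# Pattern inferred from the three example messages only (assumption).
HEADER = re.compile(
    r"^(?P<timestamp>\d{4}/\d{2}/\d{2} \d{2}:\d{2}:\d{2}\.\d{6})\s+"
    r"(?P<host>\S+)\s+(?P<process>\S+)\s+"
    r"Trace name:\s+(?P<trace>\S+)\s+"
    r"(?P<kind>[^\[]+?)\s*\[(?P<attrs>[^\]]*)\]"
)
ID_FIELD = re.compile(r"(CorrID|MsgID)=([^,\s\]]+)")


def parse_header(line):
    """Extract the fixed attributes from a log line, or return None if the
    line does not look like the examples."""
    match = HEADER.match(line)
    if match is None:
        return None
    fields = match.groupdict()
    fields["timestamp"] = datetime.strptime(fields["timestamp"], "%Y/%m/%d %H:%M:%S.%f")
    # Pull the correlation and message IDs out of the bracketed attribute list.
    fields.update(dict(ID_FIELD.findall(fields.pop("attrs"))))
    return fields


if __name__ == "__main__":
    example = (
        "2012/05/15 04:33:04.783671 sitst201 srvT2M-838059 Trace name: all0302 "
        "Query [SAP=1ASICDBETK, DCXID=EXRU$3013#GJ12V4K#1IZ, TRXNB=1, "
        "CorrID=000100E1A1EU42, MsgID=SQ8ZK36LG3TJ12JE6XMU2O8]"
    )
    for name, value in parse_header(example).items():
        print(f"{name}: {value}")
```

Everything after the closing bracket (EDIFACT, XML or a hex dump) is left untouched here; that free-form part is what the full-text index is meant to cover.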
