16.07.12 Analyzing Logs/Configs of 200'000 Systems with Hadoop (Christoph Schnidrig, NetApp)

This talk was held at the second meeting of the Swiss Big Data User Group on July 16 at ETH Zürich. The topic of this meeting was: "NoSQL Storage: War Stories and Best Practices".

http://www.bigdata-usergroup.ch/item/296477

16.07.12 Analyzing Logs/Configs of 200'000 Systems with Hadoop (Christoph Schnidrig, NetApp) Presentation Transcript

  • 1. Analyzing Logs/Configs of 200'000 Systems with Hadoop (christoph.schnidrig@netapp.com)
  • 2. What is AutoSupport?
      • AutoSupport is NetApp's phone-home mechanism
      • Collection of
          – Log files
          – XML files
          – Command output capture
          – Counter Manager output
  • 3. Business Challenges
      • Gateways
          – 600K ASUPs every week
          – 40% coming over the weekend
          – 2 TB growth per week
      • ETL
          – Data needs to be parsed and loaded in 15 minutes
      • Data Warehouse
          – Only 5% of data goes into the data warehouse
          – Oracle DBMS struggling to scale; maintenance and backups challenging
          – No easy way to access this unstructured content
      • Reporting
          – Numerous mining requests are currently not satisfied
          – Huge untapped potential of valuable information for lead generation, supportability, and BI
      • Finally, the incoming load doubles every 16 months!
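The weekly volume and weekend share on slide 3 imply an uneven arrival rate. A minimal sketch of that arithmetic follows; it assumes the weekend spans 2 of 7 days, which the slide does not state, and the helper name `asups_per_minute` is made up for illustration.

```python
# Figures taken from slide 3.
WEEKLY_ASUPS = 600_000   # "600K ASUPs every week"
WEEKEND_PERCENT = 40     # "40% coming over the weekend"

# Assumption (not on the slide): the weekend is 2 days, weekdays are 5.
WEEKEND_DAYS = 2
WEEKDAY_DAYS = 7 - WEEKEND_DAYS
MINUTES_PER_DAY = 24 * 60


def asups_per_minute(count: int, days: int) -> float:
    """Average arrival rate if `count` ASUPs arrive evenly over `days` days."""
    return count / (days * MINUTES_PER_DAY)


# Split the weekly volume into weekend and weekday portions.
weekend_asups = WEEKLY_ASUPS * WEEKEND_PERCENT // 100
weekday_asups = WEEKLY_ASUPS - weekend_asups

average_rate = asups_per_minute(WEEKLY_ASUPS, 7)
weekend_rate = asups_per_minute(weekend_asups, WEEKEND_DAYS)
weekday_rate = asups_per_minute(weekday_asups, WEEKDAY_DAYS)

if __name__ == "__main__":
    print(f"average: {average_rate:.1f} ASUPs/min")
    print(f"weekend: {weekend_rate:.1f} ASUPs/min")
    print(f"weekday: {weekday_rate:.1f} ASUPs/min")
```

Under the 2-day weekend assumption, the weekend rate comes out at roughly 83 ASUPs per minute against 50 on weekdays. These are averages only; the slide gives no information about peaks within a day.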
  • 4. Hadoop Architecture
  • 5. Solution Architecture
  • 6. Client Apps – how the customer sees it
  • 7. Physical Architecture
      [Diagram; recoverable labels: FAS2040 (A and B), 1 GB Ethernet, Secondary Name Node, Job Tracker, Name Node, 10 GB/s Ethernet, E2600 Storage Array]
      • 12 data nodes: 12 cores, 48 GB RAM each
      • 3 E-series storage arrays (~600 TB)
  • 8. Some performance numbers (metric: Hadoop result)
      • Raw ASUP ingest throughput: 1000 ASUPs/min or 1.5 GB/min
      • ASUP configuration data parse & load: 1000 ASUPs/min
      • Event messages (EMS) process & load: < 1 hour for 2 billion records (approx. > 200 GB/hour)
      • EMS ad-hoc analysis: 4-6M records/sec (approx. 200 MB/sec) on compressed (LZO) data
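The figures on slide 8 use mixed units (per minute, per hour, per second). The sketch below converts them to per-second rates for comparison. It assumes decimal units (1 GB = 10^9 bytes), which the slide does not specify, and the function name `bytes_per_record` is hypothetical.

```python
# Assumption: decimal units, since the slide does not say GB vs GiB.
MB = 1000 ** 2
GB = 1000 ** 3

# Raw ingest: "1000 ASUPs/min or 1.5 GB/min".
ingest_bytes_per_min = 1.5 * GB
ingest_mb_per_s = ingest_bytes_per_min / 60 / MB
avg_asup_mb = ingest_bytes_per_min / 1000 / MB  # implied average ASUP size

# EMS process & load: "< 1 hour for 2 billion records", "> 200 GB/hour".
# One hour is an upper bound on the run time, so these are lower bounds.
ems_records_per_s = 2e9 / 3600
ems_mb_per_s = 200 * GB / 3600 / MB


def bytes_per_record(mb_per_s: float, records_per_s: float) -> float:
    """Implied compressed bytes per record for an ad-hoc scan."""
    return mb_per_s * MB / records_per_s


# EMS ad-hoc analysis: "4-6M records/sec ~= 200 MB/sec" on LZO data.
adhoc_bytes_low = bytes_per_record(200, 6e6)
adhoc_bytes_high = bytes_per_record(200, 4e6)

if __name__ == "__main__":
    print(f"ingest: {ingest_mb_per_s:.1f} MB/s, {avg_asup_mb:.1f} MB per ASUP")
    print(f"EMS load: {ems_records_per_s:,.0f} records/s, {ems_mb_per_s:.1f} MB/s")
    print(f"ad-hoc: {adhoc_bytes_low:.0f}-{adhoc_bytes_high:.0f} compressed bytes/record")
```

Read this way, raw ingest is about 25 MB/s with an average ASUP of about 1.5 MB, and the EMS load runs at more than roughly 555,000 records per second.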
  • 9. New possibilities with Hadoop
      • Correlate disk latency (hot) with disk type
          – 24 billion records
          – 4 weeks to run query
          – Hadoop implementation: 10.5 hours
      • Bug detection through pattern matching
          – 240 billion records
          – Too large to run
          – Hadoop implementation: 18 hours
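The run times on slide 9 can be turned into a speedup factor and an effective scan rate. This is a rough sketch using only the slide's numbers; the function names are illustrative, and the 4-week figure is taken as the pre-Hadoop run time for the same query.

```python
HOURS_PER_WEEK = 7 * 24


def speedup(baseline_hours: float, new_hours: float) -> float:
    """How many times faster the new run is than the baseline."""
    return baseline_hours / new_hours


def records_per_second(records: float, hours: float) -> float:
    """Average records processed per second over the whole job."""
    return records / (hours * 3600)


# Disk latency vs. disk type: 24 billion records, 4 weeks before, 10.5 h on Hadoop.
latency_speedup = speedup(4 * HOURS_PER_WEEK, 10.5)
latency_rate = records_per_second(24e9, 10.5)

# Bug detection: 240 billion records in 18 h on Hadoop.
# No baseline exists ("too large to run"), so only a rate is computed.
bug_rate = records_per_second(240e9, 18)

if __name__ == "__main__":
    print(f"latency query speedup: {latency_speedup:.0f}x")
    print(f"latency query rate: {latency_rate:,.0f} records/s")
    print(f"bug detection rate: {bug_rate:,.0f} records/s")
```

Four weeks is 672 hours, so the first query is about 64 times faster on Hadoop. The two jobs differ in what they compute per record, so their rates are not directly comparable.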
  • 10. Incoming AutoSupport Volumes and TB Consumption
      [Chart: Flat-File Storage Requirement; series Total Usage (TB) and Projected Total Usage (TB); y-axis 0 to 3500; x-axis Jan-05 to Jan-16; annotation "Doubles"]
      • At the projected current rate of growth, total storage requirements continue doubling every 16 months
      • Cost model: > $15M per year ecosystem costs
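Doubling every 16 months corresponds to plain exponential growth. A minimal sketch of the projection follows; the 16-month doubling period is from the slide, while the 100 TB starting value in the usage lines is a made-up example rather than a figure read off the chart.

```python
DOUBLING_MONTHS = 16.0  # from the slide: "doubling every 16 months"


def projected_usage_tb(current_tb: float, months_ahead: float,
                       doubling_months: float = DOUBLING_MONTHS) -> float:
    """Projected storage after `months_ahead` months of steady doubling."""
    return current_tb * 2 ** (months_ahead / doubling_months)


def annual_growth_factor(doubling_months: float = DOUBLING_MONTHS) -> float:
    """Year-over-year multiplier implied by the doubling period."""
    return 2 ** (12 / doubling_months)


if __name__ == "__main__":
    start_tb = 100.0  # hypothetical starting point, for illustration only
    for months in (12, 16, 32, 48):
        tb = projected_usage_tb(start_tb, months)
        print(f"after {months} months: {tb:.0f} TB")
    print(f"annual growth factor: {annual_growth_factor():.2f}")
```

A 16-month doubling period works out to a year-over-year factor of about 1.68, so storage grows by roughly 68% per year.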
  • 11. References
      • NetApp Accelerates AutoSupport Analytics with NetApp Open Solution for Hadoop: http://media.netapp.com/documents/asup-hadoop.pdf
      • NetApp Open Solution for Hadoop Solutions Guide: http://media.netapp.com/documents/tr-3969.pdf
      • ESG: Lab Validation Report: http://media.netapp.com/documents/ar-esg-netapp-open-solution.pdf