
HBaseCon 2012 | You’ve got HBase! How AOL Mail Handles Big Data



The AOL Mail Team will discuss our implementation of HBase for two large-scale applications: an anti-abuse mechanism and a user-visible API. We will give an overview of how and why HBase and Hadoop were incorporated into the massive, diverse technology stack of the nearly 20-year-old AOL Mail system, and recount how we took our HBase/Hadoop applications through our traditional lifecycle of design, development, QA, and production. We will explain how our practical approach to HBase has evolved over time, discuss lessons learned and some of the techniques and tools developed through our iterative dev/QA and operational processes, and describe the pain points we have hit with erratic usage and edge cases, along with how we address problems when we run across them.



  1. You’ve got HBase: How AOL Mail handles Big Data
     Presented at HBaseCon, May 22, 2012
  2. The AOL Mail System
     - Over 15 years old
     - Constantly evolving
     - 10,000+ hosts
     - 70+ Million mailboxes
     - 50+ Billion emails
     - A technology stack that runs the gamut
  3. What that means…
     - Lots of data
     - Lots of moving parts
     - Tight SLAs
     - Mature system + Young software = Tough marriage
       - We don’t buy “commodity” hardware
       - Engrained Dev/QA/Prod product lifecycle
       - Somewhat “version locked” to tried-and-true platforms
       - Expect service outages to be quickly mitigated by our NOC without waiting for an on-call
  4. So where does HBase fit?
     - It’s a component, not the foundation
     - Currently used in two places
     - Being evaluated for more
       - It will remain a tool in our diverse Big Data arsenal
  5. An Activity Profiler
  6. An “Activity Profiler”
     - Watches for particular behaviors
     - Designed and built in 6/2010
     - Originally “vanilla” Hadoop 0.20.2 + HBase 0.90.2; currently CDH3
     - 1.4+ Million Events/min
     - 60x 24TB (raw) DataNodes w/ local RegionServers
     - 15x application hosts
     - Is an internal-only tool
       - Used by automated anti-abuse systems
       - Leveraged by data analysts for ad hoc queries/MapRed
  7. An “Activity Profiler”
  8. Why the “Event Catcher” layer?
     - Has to “speak the language” of our existing systems
       - Easy to plug an HBase translator into existing data feeds
       - Hard to modify the infrastructure to speak HBase
     - Flume was too young at the time
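The translator idea above can be sketched as a pure function that maps one legacy feed record to HBase-style puts. This is a minimal illustration, not AOL's implementation: the pipe-delimited event format, the field names, and the `e:` column family are all assumptions made for the example.

```python
# Hypothetical "Event Catcher" translator sketch. The legacy feed format
# ("<timestamp>|<account>|<event_type>|<detail>") and the column names are
# invented for illustration; a real deployment would write the returned
# tuples to HBase via its client API.

def translate_event(line):
    """Turn one legacy feed line into HBase-style (row key, column, value) puts.

    The row key groups all events for an account and orders them by time,
    matching the account-centric key space described later in the deck.
    """
    timestamp, account, event_type, detail = line.rstrip("\n").split("|")
    row_key = f"{account}:{timestamp}"
    return [
        (row_key, "e:type", event_type),
        (row_key, "e:detail", detail),
    ]
```

Keeping the translator a pure function makes it trivial to plug into any existing feed reader, which is the point of the layer: the feeds stay unchanged, and only this adapter knows about HBase.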
  9. Why batch load via MapRed?
     - Real time is not currently a requirement
     - Allows filtering at different points
     - Allows us to “trigger” events
       - Designed before coprocessors
     - Early data integrity issues necessitated “replaying”
       - Missing append support early on
       - Holes in the Meta table
       - Long splits and GC pauses caused client timeouts
     - Can sample data into a “sandbox” for job development
     - Makes Pig, Hive, and other MapRed easy and stable
       - We keep the raw data around as well
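The "sample data into a sandbox" step above can be sketched as a deterministic, hash-based sampler: hashing the row key (rather than using a random draw) means repeated sandbox runs during job development see the same stable slice of the raw data. The function below is an assumed illustration of that idea, not the team's actual tooling.

```python
import hashlib

def sample_for_sandbox(records, rate_pct=1):
    """Deterministically keep ~rate_pct% of records by hashing each row key
    into one of 100 buckets. Hash-based selection is stable across runs,
    so a MapRed job under development always sees the same sandbox slice.

    `records` is an iterable of (row_key, payload) pairs.
    """
    out = []
    for row_key, payload in records:
        bucket = int(hashlib.md5(row_key.encode()).hexdigest(), 16) % 100
        if bucket < rate_pct:
            out.append((row_key, payload))
    return out
```

In a real pipeline this filter would run map-side, so the sandbox copy is produced in the same pass that batch-loads the full data set.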
  10. HBase and MapRed can live in harmony
      - Bigger than “average” hardware
        - 36+ GB RAM
        - 8+ cores
      - Proper system tuning is essential
        - Good information on tuning Hadoop is prolific, but…
        - XFS > EXT
        - JBOD > RAID
        - As far as HBase is concerned… just go buy Lars’ book
      - Careful job development; optimization is key!
  11. Contact History API
  12. Contact History API
      - Services a member-facing API
      - Designed and built in 10/2010
      - Modeled after the previous application
        - Built by a different Engineering team
        - Used to solve a very different problem
      - 250K+ Inserts/min
      - 3+ Million Inserts/min during MapRed
      - 20x 24TB (raw) DataNodes w/ local RegionServers
      - 14x application hosts
      - Leverages Memcached to reduce query load on HBase
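The Memcached layer mentioned above is a classic read-through cache in front of HBase: serve hot keys from cache, and only fall through to a RegionServer on a miss. The sketch below shows that lookup pattern only; plain dicts stand in for the Memcached client and the HBase table, and the class name and counters are invented for the example.

```python
class ReadThroughCache:
    """Minimal read-through cache sketch. `cache` stands in for a Memcached
    client and `backing_store` for an HBase table; any objects supporting
    dict-style get/set work. Hit/miss counters show the load shed from HBase."""

    def __init__(self, cache, backing_store):
        self.cache = cache
        self.store = backing_store
        self.hits = 0
        self.misses = 0

    def get(self, row_key):
        value = self.cache.get(row_key)
        if value is not None:
            self.hits += 1          # served from Memcached; HBase untouched
            return value
        self.misses += 1
        value = self.store.get(row_key)   # HBase query only on a cache miss
        if value is not None:
            self.cache[row_key] = value   # populate cache for later readers
        return value
```

At 250K+ inserts/min, the cache-population step also implies an invalidation strategy on writes (delete or overwrite the cached row), which the sketch omits.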
  13. Contact History API
  14. Where we go from here
  15. Amusing mistakes to learn from
      - Exploding regions
        - Batch inserts via MapRed result in fast, symmetrical key space growth
        - Attempting to split every region at the same time is a bad idea
        - Turning off region splitting and using a custom “rolling region splitter” is a good idea
        - Take time and load into consideration when selecting regions to split
      - Backups, backups, backups! You can never have too many
      - Large, non-splittable regions tell you things
        - Our key space maps to accounts
        - Excessively large keys equal excessively “active” accounts
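The selection step of a "rolling region splitter" like the one described above can be sketched as: with automatic splitting disabled, pick only the largest over-threshold regions, a few per round, instead of letting every region split at once. This is an assumed illustration of the idea; the function name, the size threshold, and the per-round cap are all invented, and a real tool would also weigh current cluster load and time of day, as the slide notes.

```python
def pick_regions_to_split(regions, max_splits_per_round=2, size_threshold_gb=10):
    """Select split candidates for one 'rolling' round.

    `regions` is a list of (region_name, size_gb) tuples. Only regions over
    the size threshold qualify, largest first, and at most
    `max_splits_per_round` are returned, so splits are spread over many
    rounds rather than hitting the whole key space simultaneously.
    """
    oversized = [r for r in regions if r[1] >= size_threshold_gb]
    oversized.sort(key=lambda r: r[1], reverse=True)   # biggest regions first
    return [name for name, _ in oversized[:max_splits_per_round]]
```

Capping splits per round is the crux: MapRed batch inserts grow the key space symmetrically, so without a cap every region crosses the split threshold at roughly the same moment.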
  16. Next-generation model
  17. Thanks!