HBaseCon 2012 | Developing Real Time Analytics Applications Using HBase in the Cloud - Rick Tucker, Sproxil

As small companies adapt to handle Big Data, the cloud and HBase enable developers to leverage that data to provide revenue-generating real-time applications. When developing a real-time application for an existing system, one must balance incrementing counters in real time with MapReduce jobs over the same data set. When maintaining an analytics platform, ensuring data accuracy is essential. At Sproxil, SMS logs are ingested into HBase at a growing rate, and we report metrics such as SMS throughput, unique user growth over time, and return SMS user activity in real time. Sproxil provides a versatile analytics application that lets customers handpick statistics on demand and gain market insights, so they can react quickly to trends. This talk will identify the most profitable metrics and demonstrate how to calculate them using MapReduce while continually updating data as it arrives.

Published in: Technology, Business

  • Processed a large volume of text messages; this has even led to the arrest of counterfeiters
  • High-speed transactional operations are critical: handle large volumes of text messages quickly; large volume of data; millions of records; schema supports sparse data
  • Explain why regex is costly
  • Transcript

    • 1. Developing Real Time Analytics Applications Using HBase in the Cloud. Rick Tucker, tech@sproxil.com. May 22, 2012. © 2012 Sproxil, Inc.
    • 2. About Sproxil
      • Brand protection, specializing in anti-counterfeiting solutions
      • Solution requires a scalable and high-throughput text message processing engine
      • Supports a real-time analytics web interface
      • Diagram labels: 1 SCRATCH, 2 TEXT, 3 VERIFY
    • 3. Why HBase? (Diagram labels: user sends text message; text message is processed; calculate analytics; user receives reply; Amazon EC2 Cloud.)
    • 4. Real-Time Analytics Engine
      • MapReduce too slow to maintain data in true real time
      • As data arrives, analytical data is updated through counters
      • Diagram: Text Message Arrives, Message Analyzed, Increment Counters
        – Genuine Product Authentication: +1, Increment Counter for Genuine Authentications
        – Repeat Customer: +1, Increment Counter for Repeat Customers
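The counter flow on this slide can be sketched in plain Python. This is an in-memory stand-in, not HBase client code: the `CounterTable` class, the `analyze` helper, the `stats:<day>` row name, and the metric names are all hypothetical, chosen only to illustrate bumping per-metric counters as each message arrives instead of recomputing totals with a batch job.

```python
from collections import defaultdict


class CounterTable:
    """In-memory stand-in for a table of counters (hypothetical, not the HBase API)."""

    def __init__(self):
        self._cells = defaultdict(int)  # (row, column) -> running total

    def increment(self, row, column, amount=1):
        """Add `amount` to one counter cell and return the new total."""
        self._cells[(row, column)] += amount
        return self._cells[(row, column)]

    def get(self, row, column):
        return self._cells[(row, column)]


def analyze(message, seen_users):
    """Decide which counters a message should bump (illustrative rules only)."""
    metrics = ["sms_total"]
    if message["genuine"]:
        metrics.append("genuine_authentications")
    if message["user"] in seen_users:
        metrics.append("repeat_customers")
    seen_users.add(message["user"])
    return metrics


def ingest(table, message, seen_users, day):
    """Update every affected counter as soon as the message arrives."""
    for metric in analyze(message, seen_users):
        table.increment(f"stats:{day}", metric)


table = CounterTable()
seen = set()
messages = [
    {"user": "userID 1", "genuine": True},
    {"user": "userID 2", "genuine": False},
    {"user": "userID 1", "genuine": True},
]
for message in messages:
    ingest(table, message, seen, "2012-05-22")

print(table.get("stats:2012-05-22", "genuine_authentications"))
```

Reading a metric is then a single cell lookup, which is what makes this approach suitable for a live dashboard; a periodic batch job over the raw logs could still be used to cross-check the totals.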
    • 5. Schema Design: Example 1
      • Example: View log of text messages in chronological order
      • Rowkey: row prefix + timestamp
      • Row:
        – transaction 2012-05-22 12:00:00
        – transaction 2012-05-22 12:01:14
        – transaction 2012-05-22 12:02:03
      • Note: HBase sorts rowkeys lexicographically, so scans over these keys return data in chronological order (oldest first); reverse chronological order would need a key that sorts newest first, such as an inverted timestamp
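To make the ordering concrete, here is a small Python sketch that uses `sorted` on byte strings as a stand-in for lexicographic rowkey ordering. The `make_key` and `make_reversed_key` helpers, the key layout, and the `MAX_SECONDS` bound are assumptions for illustration. The inverted-timestamp variant is a common technique for newest-first scans, not something the slide itself describes.

```python
from datetime import datetime, timezone

MAX_SECONDS = 10**10 - 1  # assumed upper bound used to invert epoch seconds


def to_epoch(ts):
    """Parse 'YYYY-MM-DD HH:MM:SS' as UTC and return epoch seconds."""
    parsed = datetime.strptime(ts, "%Y-%m-%d %H:%M:%S")
    return int(parsed.replace(tzinfo=timezone.utc).timestamp())


def make_key(prefix, ts):
    """Rowkey = prefix + timestamp; sorts oldest first."""
    return f"{prefix} {ts}".encode()


def make_reversed_key(prefix, ts):
    """Rowkey = prefix + zero-padded (MAX - epoch); sorts newest first."""
    return f"{prefix} {MAX_SECONDS - to_epoch(ts):010d}".encode()


stamps = ["2012-05-22 12:02:03", "2012-05-22 12:00:00", "2012-05-22 12:01:14"]

# Sorting byte strings mimics the order a scan would return rows in.
chronological = sorted(make_key("transaction", ts) for ts in stamps)

# Pair each reversed key with its timestamp so the result stays readable.
newest_first = [
    ts for _, ts in sorted((make_reversed_key("transaction", ts), ts) for ts in stamps)
]

for key in chronological:
    print(key.decode())
print(newest_first)
```

The zero padding matters: without a fixed width, numeric values of different lengths would not sort in numeric order under a byte-wise comparison.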
    • 6. Row:
        – transaction userID 1 2012-05-22 12:00:00
        – transaction userID 1 2012-05-22 12:01:14
        – transaction userID 2 2012-05-22 12:00:54
        – transaction userID 2 2012-05-22 12:01:22
        – transaction userID 2 2012-05-22 12:02:01
      • Note: HBase sorts rows lexicographically, so scans return each user's rows together, in chronological order within that user
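A composite rowkey like the one in these rows lets a single range scan return one user's messages. The sketch below emulates that in plain Python with a sorted list and `bisect`. `SortedTable`, `scan_prefix`, and the `|` separator are hypothetical stand-ins, not HBase client calls; the separator is an assumption added so that a prefix for `userID 1` cannot also match `userID 10`.

```python
import bisect


class SortedTable:
    """Keeps keys in byte-wise sorted order, like a lexicographically sorted store."""

    def __init__(self):
        self._keys = []
        self._values = {}

    def put(self, key, value):
        if key not in self._values:
            bisect.insort(self._keys, key)
        self._values[key] = value

    def scan(self, start, stop):
        """Return rows with start <= key < stop, in key order."""
        lo = bisect.bisect_left(self._keys, start)
        hi = bisect.bisect_left(self._keys, stop)
        return [(k, self._values[k]) for k in self._keys[lo:hi]]

    def scan_prefix(self, prefix):
        """Scan all keys beginning with `prefix`.

        The stop key is the prefix with its last byte incremented, which
        assumes that byte is below 0xFF (true for the '|' separator used here).
        """
        stop = prefix[:-1] + bytes([prefix[-1] + 1])
        return self.scan(prefix, stop)


def make_key(user_id, ts):
    return f"transaction|{user_id}|{ts}".encode()


table = SortedTable()
# Insert out of order; the table keeps the keys sorted.
table.put(make_key("userID 2", "2012-05-22 12:02:01"), "sms c")
table.put(make_key("userID 1", "2012-05-22 12:01:14"), "sms e")
table.put(make_key("userID 2", "2012-05-22 12:00:54"), "sms a")
table.put(make_key("userID 1", "2012-05-22 12:00:00"), "sms d")
table.put(make_key("userID 2", "2012-05-22 12:01:22"), "sms b")

user2_rows = table.scan_prefix(b"transaction|userID 2|")
for key, value in user2_rows:
    print(key.decode(), value)
```

Putting the user ID ahead of the timestamp trades a global time-ordered log for cheap per-user history; which layout is right depends on the query the application runs most often.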
    • 7. Critical Findings
      • Schema design is crucial for successful HBase implementation
        – Pack as much info as possible into row keys
      • Use caution with Filters
        – E.g. Regex filters can be costly
        – Alternatives:
          • Directly query for data you need
          • Use efficient filters when filtering large data sets
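As a rough intuition for why a regex filter can be costly: a filter is evaluated against every row the scan visits, while a start/stop range on a well-designed rowkey only visits the rows it needs. The Python sketch below counts rows examined under each approach. It is an in-memory illustration with made-up keys and hypothetical helper names (`regex_scan`, `range_scan`), not a benchmark of HBase itself.

```python
import bisect
import re

# 100 users x 10 messages each, stored in sorted key order (synthetic data).
keys = sorted(
    f"transaction|userID {user}|2012-05-22 12:{minute:02d}:00".encode()
    for user in range(1, 101)
    for minute in range(10)
)


def regex_scan(sorted_keys, pattern):
    """Full scan: test the regex against every key and count rows examined."""
    matcher = re.compile(pattern)
    examined = 0
    hits = []
    for key in sorted_keys:
        examined += 1
        if matcher.match(key):
            hits.append(key)
    return hits, examined


def range_scan(sorted_keys, start, stop):
    """Direct query: jump to the start key and read until the stop key."""
    lo = bisect.bisect_left(sorted_keys, start)
    hi = bisect.bisect_left(sorted_keys, stop)
    return sorted_keys[lo:hi], hi - lo


regex_hits, regex_examined = regex_scan(keys, rb"transaction\|userID 42\|")

# Stop key: same prefix with the trailing '|' bumped to the next byte, '}'.
range_hits, range_examined = range_scan(
    keys, b"transaction|userID 42|", b"transaction|userID 42}"
)

print(len(regex_hits), regex_examined)
print(len(range_hits), range_examined)
```

Both approaches return the same rows here; the difference is how many rows had to be touched to find them, and that gap widens as the table grows.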
    • 8. Thank You! Your global brand protection specialists, spanning 3 continents and speaking 9 languages. Making Counterfeiting Unprofitable™. tech@sproxil.com, +1 617 682 9577. America | Asia | Africa. Sproxil.com
