
Big Data Use Cases

Everyone is awash in the new buzzword, Big Data, and it seems as if you can’t escape it wherever you go. But there are real companies with real use cases creating real value for their businesses by using big data. This talk will discuss some of the more compelling current or recent projects, their architecture & systems used, and successful outcomes.

Published in: Technology

  1. Big Data Use Cases* • DevNexus Conference, 2/18/2013 • *Fully buzzword-compliant title
  2. whoami • Brad Anderson • Solutions Architect at MapR (Atlanta) • ATLHUG co-chair • NoSQL East Conference 2009 • “boorad” most places (twitter, github) • banderson@maprtech.com
  3. Mobile • Virtualization • Social Media • B2B • Application Service Provider • Cloud • Client/Server • Web 2.0 • Service Bureau • Software-as-a-Service
  4. BIG DATA
  5. (no transcript text)
  6. Business Value
  7. Business Value
  8. Big Data is not new! but the tools are.
  9. Ship the Function to the Data • [Diagram: Traditional Architecture (functions separate from data held in RDBMS and SAN/NAS) vs. Distributed Computing (function and data together on each node)]
  10. Variation: Multiple MapReduces • Example: Fraud Detection in User Transactions • [Diagram labels: MapReduce, Transaction data, LDA training, LDA scoring, G2 score, 95 %-ile, LDA anomaly, HBase / MapR M7 Edition, Candidate events for analyst review] • http://en.wikipedia.org/wiki/Latent_Dirichlet_allocation
  11. MapR Distribution for Apache Hadoop • Complete Hadoop distribution • Comprehensive management suite • Industry-standard interfaces • Enterprise-grade dependability • Higher performance
  12. Big Data Ecosystem
  13. Use Case • Company • Data Source(s) • Technique(s) • Business Value
  14. Proactive Monitoring
  15. Data Sources • Server Telemetry • Monitoring Logs • Network Flow
  16. Techniques • Pattern Recognition • Proactive Monitoring • Early Alert Delivery
  17. Business Value
  18. Telecommunications Giant • ETL Offload
  19. Telecommunications • Data Sources • Customer Records • Contract Data • Purchase Orders • Call Center
  20. Telecommunications • Techniques • Analytics • ETL
  21. Telecommunications • Techniques • ETL (Hadoop) + Analytics (Teradata)
  22. Telecommunications • Business Value
  23. Credit Card Issuer • Data Sources • Customer Purchase History • Merchant Designations • Merchant Special Offers
  24. Credit Card Issuer • Techniques • [Diagram labels: Hadoop, Purchase History, Merchant Information, Merchant Offers, Recommendation Engine Results (Mahout), Export (4 hrs), Import (4 hrs), Presentation Data Store (DB2), App]
  25. Credit Card Issuer • Techniques • [Diagram labels: Hadoop, Purchase History, Merchant Information, Merchant Offers, Recommendation Engine Results (Mahout), Index Update (2 min), Recommendation Search Index (Solr), App]
  26. Credit Card Issuer • Business Value
  27. Waste & Recycling Leader • Idle Alerts
  28. Data Sources • Truck Geolocation Data (20,000 trucks, 5 sec interval) • Landfill Geographic Boundaries
  29. Techniques • [Diagram labels: Truck Geolocation Data, Realtime Stream Computation (Storm), Immediate Alerts, Hadoop Storage, Batch Computation (MapReduce), Tax Reduction Reporting, Shortest Path Graph Algorithm, Route Optimization]
  30. Business Value
  31. Fraud Detection • Data Lake
  32. Data Sources • Anti-Money Laundering • Consumer Transactions
  33. Techniques • Anti-Money Laundering System • Consumer Transactions System
  34. Techniques • [Diagram labels: AML, Consumer Transactions, Data Lake (Hadoop), Suspicious Events, Analyst] • Latent Dirichlet Allocation, Bayesian Learning Neural Network, Peer Group Analysis
  35. Business Value
  36. Machine Learning • Search Relevance • DNA Matching
  37. Data Sources • Birth, Death, Census, Military, Immigration records • Search Behavior Activity • DNA SNP (snips)
  38. Techniques • Record Linking • Search Relevance • Clickstream Behavior • Security Forensics • DNA Matching
  39. Business Value
  40. Traffic Analytics
  41. Data Sources • Inrix Road Segment Data (Avg Speed / minute / segment, Reference Speeds) • Road Segment Geolocation Data
  42. Techniques • Bottleneck Detection Algorithm • Time Offset Correlations • Alternate Routes • Predictive Congestion Analysis • Growth & Term Assumptions
  43. (no transcript text)
  44. (no transcript text)
  45. Business Value
  46. Similar Characteristics • Lots of Data • Structured, Semi-Structured, Unstructured • Varied Systems Interoperating (Hadoop, Storm, Solr, MPP, Visualizations) • Increase Revenue • Decrease Costs
  47. Thank You
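
Slide 10 lists a "95 %-ile" step between the G2 score and the candidate events sent for analyst review. A minimal Python sketch of that kind of percentile cutoff follows. It assumes each event already carries a numeric anomaly score. The function names, event IDs, scores, and the nearest-rank percentile rule are all illustrative assumptions, not details taken from the talk.

```python
import math


def percentile_threshold(scores, pct=95.0):
    """Nearest-rank percentile: the smallest score such that at least
    pct percent of all scores are less than or equal to it."""
    if not scores:
        raise ValueError("scores must be non-empty")
    ordered = sorted(scores)
    # Nearest-rank rule (an assumption; the slides do not say which rule was used).
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]


def candidate_events(scored_events, pct=95.0):
    """Keep only the events whose score is strictly above the percentile cutoff."""
    threshold = percentile_threshold([score for _, score in scored_events], pct)
    return [(event_id, score) for event_id, score in scored_events if score > threshold]


# Hypothetical (event_id, anomaly_score) pairs standing in for scored transactions.
events = [(f"txn-{i}", float(i)) for i in range(1, 101)]
flagged = candidate_events(events)
```

With these 100 evenly spaced scores, the cutoff keeps the five highest-scoring events, the kind of short list an analyst could review by hand.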
