AWS Enterprise Day | Big Data Analytics


The presentation will discuss the business value of big data analytics and Intel's strategy in big data.

Published in Technology
  • 1. Big Data Analytics: Bernard Cheah, Regional Solution Architect, Intel
  • 2. Big Data – Volume, Velocity, Variety (& Value)
      •  7.9 ZB by 2015
      •  3x more bits in the digital universe than stars in the physical universe
      •  450 billion business transactions per day by 2020 (IDC)
      •  Therapies tailored to a person's genome. Decoding the human genome:
          •  From 10 years to hours
          •  On track to hit <$1000 per person
      •  Explosive growth, 30 TB/month billing data. Radical overhaul of customer service:
          •  Self service, real time access
          •  30x performance increase
      •  $600 B potential value to US healthcare
      •  90% of data in the world was created in the last 2 years
      •  100 years' worth of video uploaded to YouTube every 10 days
      •  >5 billion people calling, texting, tweeting & browsing on cell phones
      •  “In God we trust, all others bring data” (NASA, Johnson Space Center)
      •  How will businesses manage a 50x data growth by 2020 in an affordable way?
  • 4. Approaching Big Data…
  • 5. Hadoop? The best thing since…
  • 6. Hadoop Framework
      •  Open Source / Proprietary (diagram legend)
      •  HDFS | Lustre | GlusterFS: Hadoop Compatible File Systems
      •  YARN (+MapReduce): Distributed Processing Framework
      •  HBase
      •  Zookeeper: Coordination
      •  Flume: Log Collector
      •  Sqoop: Data Transfer
      •  Hive: Query
      •  Oozie: Workflow
      •  Mahout: Machine Learning
      •  Pig: Scripting
      •  R: Stats
      •  HCatalog: Metadata
      •  Deployment, Upgrade, Configuration, Unified Logging, Tuning, Alerts, Resource Monitor, Job Profiler, Security Controls, Heat Map
      •  Rhino (Security)
      •  High Availability and Disaster Recovery
      •  HBase Explorer, Recommendation Engine, Behavior Model, Vertical Accelerators, Analytics Workbench
      •  Connectors: Netezza, Oracle, SAP, SQLServer, Teradata, DB2
      •  Kafka: Event Bus
      •  Lucene, Solr: Search
      •  Tribeca: Graph Mining
      •  Gryphon: Low-latency SQL-92
      •  Spark/Shark: In-memory
      •  SLURM: Scheduler
  • 7. Big Data Use Cases Across Industries: Education, Financial Services
  • 8. Telco - China Mobile Group Guangdong: Hadoop & Xeon optimized Big Data storage & analytics
      •  Challenge: Deliver real-time access to Call Data Records (CDR) for billing self service
      •  Solution: Chose Hadoop + Xeon over RDBMS to remove data access bottlenecks, increase storage, and scale system
      •  Benefits: Lower TCO, 30x performance increase, stable operation, analytics on subscriber usage for targeted promotions
      •  Data Characteristics:
          •  30 TB billing data/month
          •  Real-time retrieval of 30 days of CDRs
          •  300k records/second, 800k insert speed/sec
          •  15 analytics queries
  • 9. Government - Smart Traffic: Intelligent Transport System, Hadoop for Predictive Analytics
      •  Crime prevention, info sharing & predictive traffic analytics
      •  Machine Generated Data:
          •  Embedded HBase client in camera for real-time inserts of structured/unstructured data
          •  30,000+ camera data collection points
          •  2 billion HBase records
          •  Petabytes of traffic data
          •  Terabytes of images
          •  1 week of data mining
      •  Results:
          •  Automated queries for traffic violation
          •  Crime prevention: ID fake licenses in <1 minute
          •  Traffic routing
      •  Diagram labels: App Servers, Regional Data Collection, Distributed Processing Across District Nodes, Derived Analytics Services, Crime Prevention, Citizen Traffic Services
  • 10. Options For Hadoop Deployment
      •  On-Premise (or private cloud):
          •  Limited scalability
          •  Internal IT resources to manage cluster
          •  CapEx – HW, DC space, power & cooling
      •  On AWS (public cloud):
          •  Scalability
          •  Flexibility
          •  Easy to deploy to multiple locations
          •  Additional resources on demand
          •  OpEx
      •  Hybrid Cloud model:
          •  Provides bursting capacity
          •  Flexibility
          •  Scalability
          •  IT still needs to manage on-premise cluster
      •  Security Is Addressed In All Models
  • 11. “Where do I start…?”
      1.  What is your business problem?
      2.  Do you have a (lots of) data problem?
      3.  Will big data analytics work for my business problem?
      Speak To AWS Today!
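
As a rough sanity check on the slide 8 figure of 30 TB of billing data per month, the sketch below converts that monthly volume into an average sustained ingest rate. It is a back-of-envelope calculation, not something taken from the deck: the decimal terabyte (10^12 bytes), the 30-day month, and the helper name `average_ingest_rate` are all assumptions made for illustration.

```python
SECONDS_PER_DAY = 24 * 60 * 60

# Assumption: "TB" on the slide means a decimal terabyte (10**12 bytes).
TB = 10**12


def average_ingest_rate(bytes_per_month: float, days_per_month: int = 30) -> float:
    """Return the average bytes per second implied by a monthly data volume.

    Hypothetical helper; assumes data arrives evenly over a 30-day month.
    """
    return bytes_per_month / (days_per_month * SECONDS_PER_DAY)


# Slide 8: 30 TB of billing data per month.
monthly_volume = 30 * TB
rate = average_ingest_rate(monthly_volume)

print(f"Average ingest rate: {rate / 10**6:.1f} MB/s")
```

Under these assumptions the average works out to roughly 11.6 MB/s. Real traffic is bursty, so peak rates (such as the insert speeds quoted on the slide) would be considerably higher than this average.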