AWS Enterprise Day | Big Data Analytics
The presentation will discuss the business value of big data analytics and Intel's strategy in big data.

    Presentation Transcript

    • Big Data Analytics. Bernard Cheah, Regional Solution Architect, Intel
    • Big Data – Volume, Velocity, Variety (& Value)
      • 7.9 ZB by 2015
      • 3x more bits in the digital universe than stars in the physical universe
      • 450 billion business transactions per day by 2020 (IDC)
      • Therapies tailored to a person's genome. Decoding the human genome: from 10 years to hours; on track to hit <$1000 per person
      • Explosive growth, 30 TB/month billing data. Radical overhaul of customer service: self service, real time access; 30x performance increase
      • $600 B potential value to US healthcare
      • 90% of data in the world was created in the last 2 years
      • 100 years' worth of video uploaded to YouTube every 10 days
      • >5 billion people calling, texting, tweeting & browsing on cell phones
      • “In God we trust, all others bring data” (NASA, Johnson Space Center)
      How Will Businesses Manage a 50x Data Growth by 2020 in an Affordable Way?
    • Sources of Big Data: Machine Generated, Human Generated, Business Generated
      Edge, Scale Up, Distributed: Requires Different Approaches
    • Approaching Big Data…
    • Hadoop? The best thing since…
    • Hadoop Framework
      Legend: Open Source, Proprietary
      • HDFS | Lustre | GlusterFS: Hadoop Compatible File Systems
      • YARN (+MapReduce): Distributed Processing Framework
      • HBase
      • Zookeeper: Coordination
      • Flume: Log Collector
      • Sqoop: Data Transfer
      • Hive: Query
      • Oozie: Workflow
      • Mahout: Machine Learning
      • Pig: Scripting
      • R: Stats
      • HCatalog: Metadata
      • Deployment, Upgrade, Configuration, Unified Logging, Tuning, Alerts, Resource Monitor, Job Profiler, Security Controls, Heat Map
      • Rhino (Security)
      • High Availability and Disaster Recovery
      • HBase Explorer
      • Recommendation Engine
      • Behavior Model
      • Vertical Accelerators
      • Analytics Workbench
      • Connectors: Netezza, Oracle, SAP, SQLServer, Teradata, DB2
      • Kafka: Event Bus
      • Lucene, Solr: Search
      • Tribeca: Graph Mining
      • Gryphon: Low-latency SQL-92
      • Spark/Shark: In-memory
      • SLURM: Scheduler
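The MapReduce model named on the framework slide can be illustrated without a cluster. The following is a minimal single-process sketch in plain Python, not Hadoop code: it only mimics the map, shuffle, and reduce phases that the framework would distribute across nodes. The record layout, the sample data, and all function names are hypothetical and chosen purely for illustration.

```python
from collections import defaultdict

# Hypothetical call records: (subscriber_id, call_duration_seconds).
# These values are invented for illustration; they are not from the slides.
RECORDS = [
    ("sub-001", 120),
    ("sub-002", 45),
    ("sub-001", 30),
    ("sub-003", 300),
    ("sub-002", 15),
]


def map_phase(records):
    """Map: emit one (key, value) pair per input record."""
    for subscriber, seconds in records:
        yield subscriber, seconds


def shuffle_phase(pairs):
    """Shuffle: group all values that share the same key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return dict(groups)


def reduce_phase(groups):
    """Reduce: collapse each key's values into a single total."""
    return {key: sum(values) for key, values in groups.items()}


def run_job(records):
    """Chain the three phases, as a real framework would do across nodes."""
    return reduce_phase(shuffle_phase(map_phase(records)))


if __name__ == "__main__":
    print(run_job(RECORDS))
```

In a real deployment the map and reduce functions would run in parallel on many machines and the shuffle would move data over the network; the sketch keeps only the data flow.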
    • Big Data Use Cases Across Industries: Education, Financial Services
    • Telco - China Mobile Group Guangdong: Hadoop & Xeon optimized Big Data storage & analytics
      • Challenge: Deliver real time access to Call Data Records (CDR) for billing self service
      • Solution: Chose Hadoop + Xeon over RDBMS to remove data access bottlenecks, increase storage, and scale system
      • Benefits: Lower TCO, 30x performance increase, stable operation, analytics on subscriber usage for targeted promotions
      • Data Characteristics:
        • 30 TB billing data/month
        • Real-time retrieval of 30 days of CDRs
        • 300k records/second, 800k insert speed/sec
        • 15 analytics queries
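As a rough sense of scale, the 30 TB/month figure can be turned into an average sustained data rate. This is a back-of-envelope sketch only: it assumes a 30-day month and decimal units (1 TB = 10^12 bytes), neither of which is stated on the slide, and the function name is hypothetical. Real traffic would be bursty, so peak rates would be higher than this average.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def average_ingest_rate_mb_per_s(tb_per_month, days_in_month=30):
    """Average data rate in MB/s, assuming decimal units (1 TB = 10**12 bytes)."""
    bytes_per_month = tb_per_month * 10**12
    seconds_per_month = days_in_month * SECONDS_PER_DAY
    return bytes_per_month / seconds_per_month / 10**6


if __name__ == "__main__":
    rate = average_ingest_rate_mb_per_s(30)
    print(f"{rate:.2f} MB/s")
```

Under those assumptions the average works out to about 11.57 MB/s.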
    • Government - Smart Traffic: Intelligent Transport System
      Hadoop for Predictive Analytics: crime prevention, info sharing & predictive traffic analytics
      Machine Generated Data:
      • Embedded HBase client in camera for real-time inserts of structured/unstructured data
      • 30,000+ camera data collection points
      • 2 billion HBase records
      • Petabytes of traffic data
      • Terabytes of images
      • 1 week of data mining
      Results:
      • Automated queries for traffic violations
      • Crime prevention: ID fake licenses in <1 minute
      • Traffic routing
      Diagram labels: App Servers; Regional Data Collection; Distributed Processing Across District Nodes; Derived Analytics Services; Crime Prevention; Citizen Traffic Services
    • Options For Hadoop Deployment
      On-Premise (or private cloud):
      • Limited scalability
      • Internal IT resources to manage cluster
      • CapEx – HW, DC space, power & cooling
      On AWS (public cloud):
      • Scalability
      • Flexibility
      • Easy to deploy to multiple locations
      • Additional resources on demand
      • OpEx
      Hybrid Cloud model:
      • Provides bursting capacity
      • Flexibility
      • Scalability
      • IT still needs to manage on-premise cluster
      Security Is Addressed In All Models
    • “Where do I start…?”
      1. What is your business problem?
      2. Do you have a (lots of) data problem?
      3. Will big data analytics work for my business problem?
      Speak To AWS Today!