Adatao: Interactive, Visual, Predictive Analytics for Big Data @ Silicon Valley Machine Learning Meetup

On Jan 10, 2014, we presented this to a full-house audience, members of the Silicon Valley Machine Learning group, at the Hacker Dojo in Mountain View, CA. The session was enjoyable with a very interactive, well-informed audience.

Published in: Technology, Education

  1. Data Intelligence for All: Visual, Interactive, Predictive Analytics for Big Data. Christopher Nguyen, PhD, Co-Founder & CEO. Presented on January 10, 2014
  2. Big-Data Compute Engines, Google Apps Engineering Director, Google Founders’ Award, HKUST Prof, 2 successful enterprise exits, Stanford PhD; Deep engineering & business experience from Google, Yahoo et al.; PhDs in DM & ML from UIUC, Georgia Tech, Stanford, Berkeley, ...; Hadoop distributed/streaming analytics, Yahoo Hadoop Eng, UIUC PhD; Machine learning & machine vision, US Army Research Lab, Johns Hopkins PhD. http://adatao.com
  3. 2005: BIG DATA PROBLEMS. Huge Volume, High Velocity, Great Variety
  4. 2010: HADOOP helped solve big data problems. HIVE, MAPREDUCE
  5. 2013: BIG DATA + BIG COMPUTE = OPPORTUNITIES. BIG DATA = PROBLEMS. Automatic customer segmentation, Machine Learning, Predictive Analytics, Natural Language
  6. Hadoop has a Big Problem. It’s too slooow…
  7. Users need interactive visualization & advanced analytics
  8. But no tools available on the Hadoop stack offer this
  9. Until now. DATA INTELLIGENCE FOR ALL
  10. ADATAO SOLUTION IN HADOOP LANDSCAPE. [Diagram labels: Batch; Interactive, Fast; Hybrid Export-then-Analyze; RDBMS SQL Queries; Hive; Batch ETL; MapReduce; Basic Analytics (SQL Queries); Advanced Analytics; In-Memory, Fast Interactive Visualization, Data Mining and Machine Learning; Impala, Stinger, Presto, Platfora, Hadapt; DATA INTELLIGENCE FOR ALL; HDFS]
  11. ONE Integrated Solution for Business & Data Science & Engineering. [Diagram labels: Business Users; Data Scientists; Data Engineers; BIG INSIGHTS; Visually Beautiful Interactive Data Exploration; Narrative Web App; BIG COMPUTE; Powerful In-Memory Data Mining, Machine Learning; Big Analytics Platform (Hadoop HDFS, Cassandra, SQL DBMS, Streaming Data); BIG DATA]
  12. DATA INTELLIGENCE FOR ALL is the first & still only solution for this problem
  13. Live Demo Deployment Diagram: CLIENT, MASTER, WORKER, WORKER, WORKER, WORKER
  14. Magic 1: It’s Big & Interactive
  15. Magic 2: It’s Web-Based
  16. Magic 3: It’s Multi-Lingual
  17. Magic 4: It’s Dashboards!
  18. Magic 5: It’s Machine Learning
  19. for Business Users: Predictive Decision Making. A Beautiful New Way to Create & Share Visual Narratives of Your Analysis. Perform Ad Hoc Queries in Plain English. Publish Streaming, Interactive Dashboards. Collaborate With Others In Real Time. Query Terabytes in Seconds.
  20. for Data Scientists & Engineers: Big Data Mining & Machine Learning. Powerful In-Memory Data Mining & Machine Learning: Model Terabytes in Seconds. Interactive, Cluster-Scale Data Munging & Modeling with Native R, RStudio, Python, SQL, and Java Front-ends. Real-Time Scoring Directly From Trained Models. Share reproducible, live data analysis documents. Hadoop, Cassandra, RDBMS, Streaming Data.
  21. Use Cases
  22. Internet Service Provider: Interactive, Ad Hoc Business Query; Insight Discovery on Aggregated Operational Data; Finance, Marketing, Engineering, Sales
  23. Customer Service Provider: Product Recommendation; Cross-channel User Experience Optimization
  24. Heavy Equipment Manufacturer: Sensor Network Analytics for Predictive Maintenance
  25. Mobile Ad Platform: Ad Targeting; CTR Prediction
  26. Scaling Performance
  27. pAnalytics performance on building machine learning models with cluster Adatao16 (m3.2xlarge) on a 50GB data set of 5 features and 800 million rows. (Gradient descent algorithms are over 5 iterations.)

      Algorithm         | Run time (sec) for 50GB (800M rows) | Per-Core Throughput (MB/sec-core) | Per-Machine Throughput (MB/s)
      adatao.lm (ridge) | 3.04                                | 130                               | 1,040
      adatao.lm         | 4.05                                | 102                               | 816
      adatao.lm.gd      | 12.2                                | 32                                | 256
      adatao.glm.gd     | 24.5                                | 16                                | 128
      adatao.glm        | 36.1                                | 11                                | 88
      adatao.kmeans     | 335                                 | 1.2                               | 9.6
  28. pAnalytics performance on building machine learning models with cluster Adatao40 (m3.2xlarge) on a 1.1 TB data set of 40 features and 1.6 billion rows. (Gradient descent algorithms are over 5 iterations.)

      Algorithm         | Run time (sec) for 1.1 TB dataset (1.6B rows) | Per-Core Throughput (MB/sec-core) | Per-Machine Throughput (MB/s)
      adatao.lm (ridge) | 70.9                                          | 130                               | 1,040
      adatao.lm         | 74.9                                          | 123                               | 984
      adatao.lm.gd      | 127                                           | 72.8                              | 582
      adatao.glm.gd     | 145                                           | 63.6                              | 509
  29. http://adatao.com
  30. Data Intelligence for All. Business Users: Fast & Easy Business Analytics; Natural Language; Beautiful Web UI. Data Scientists & Engineers: Big & Fast Data Science; R, Python, REST API; Data Mining & ML
  31. Thanks! Data Scientist? http://adatao.com/forms/ds.html
  32. http://adatao.com
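
The throughput columns in the 50GB scaling table (slide 27) can be roughly reproduced from the run times alone. The sketch below is a back-of-the-envelope check, not Adatao's own method. It rests on three assumptions that the slides do not state: that "Adatao16" denotes a 16-machine cluster, that each m3.2xlarge machine contributes 8 cores, and that 50GB means 50,000 MB. The names `estimate_throughput` and `relative_error` are hypothetical helpers introduced for illustration.

```python
"""Sanity-check the throughput columns of the 50GB scaling table (slide 27)."""

# Assumptions, not stated on the slides:
#   - "Adatao16" means a 16-machine cluster.
#   - Each m3.2xlarge machine contributes 8 cores (the table's per-machine
#     figures are 8x its per-core figures, which is consistent with this).
#   - 50GB is treated as 50,000 MB.
MACHINES = 16
CORES_PER_MACHINE = 8
DATASET_MB = 50_000

# algorithm -> (run time in seconds, reported per-core MB/s, reported per-machine MB/s)
REPORTED = {
    "adatao.lm (ridge)": (3.04, 130, 1040),
    "adatao.lm": (4.05, 102, 816),
    "adatao.lm.gd": (12.2, 32, 256),
    "adatao.glm.gd": (24.5, 16, 128),
    "adatao.glm": (36.1, 11, 88),
    "adatao.kmeans": (335, 1.2, 9.6),
}


def estimate_throughput(run_time_s, dataset_mb=DATASET_MB,
                        machines=MACHINES, cores_per_machine=CORES_PER_MACHINE):
    """Return (per_core, per_machine) throughput in MB/s.

    Throughput here is dataset size divided by total run time, split evenly
    across machines and then across the cores of each machine.
    """
    per_machine = dataset_mb / (run_time_s * machines)
    per_core = per_machine / cores_per_machine
    return per_core, per_machine


def relative_error(estimate, reported):
    """Absolute difference between estimate and reported, as a fraction of reported."""
    return abs(estimate - reported) / reported


if __name__ == "__main__":
    for name, (secs, core_mb_s, machine_mb_s) in REPORTED.items():
        per_core, per_machine = estimate_throughput(secs)
        print(f"{name:18s} per-core {per_core:7.2f} (slide: {core_mb_s}), "
              f"per-machine {per_machine:8.2f} (slide: {machine_mb_s})")
```

Under these assumptions the estimates land within about 6 percent of the figures on the slide, a gap that is plausibly rounding or a different definition of a gigabyte. The reported figures also appear to divide by total run time, not per-iteration time, for the gradient descent rows.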
