A Big Data Business
James Yu (虞沐)
@Huami US (华米美国)

Who am I?
• 14 years in software industry
– 10 years backend engineer
– 4 years Big Data and Cloud Computing
• 6 years in China (HP/SAP), and 8 years in Silicon Valley (eBay/Samsung/Baidu/LinkedIn)
• Now @Huami US, as architect and manager of cloud and big data team

Agenda
• What is a Big Data business
• How-to
• China VS. US
• A story of healthcare/wearable
• How we do it

What is a Big Data business
• Free services (or even giving away money)
• Lots of users
• Knowing what each individual wants
• Making money by connecting people and businesses
(Google, Facebook, Twitter, Yelp, LinkedIn, Uber, Airbnb, BAT, Didi Chuxing (嘀嘀出行), Meituan (美团), Ele.me (饿了么))

How-to
• Business model
(great services that people like)
• Talents and technologies
(right people, right technology)
• Good luck
(more than 50% of big data projects fail)

China VS. US
China | Items | US
• market
• competition
• talents
• tech - infra
• tech - ecosystem
• $ investment

Wearable and Healthcare
Name: Tom

Walking steps/day: 2000
Running duration: 30 minutes
Sleeping score: 75
Heartbeat min/median/max: 70/80/90

Recommendations:
Follow-up for heartbeat
More walking needed

Wearable and Authentication

Wearable connects everything

Huami
• MiBand-1 @2014/8 @79 RMB
• 10M users, 2nd worldwide, next to Fitbit
• Amazfit @2015/9 @299 RMB

• More coming in the next few months

Huami – how we do it
• Business model: low cost band
• Technology: wearable hardware, user app, and robust cloud with big data system
• Talent: 3 locations

Cloud Computing

Our Technology - Cloud
Options:
• AWS
• Google Cloud
• Microsoft Azure
• Aliyun

Compare: Pros & Cons
AWS | Google | MS Azure | Aliyun
• Global coverage: Yes, Yes
• China coverage: Yes
• Features: Yes, Yes
• Open source friendly: Yes, Yes
• Quality and stability: Yes (since 2006), Yes, Yes
• New features: Yes
• Price: Yes, Yes, Yes
• Documentation: Great, OK, OK, Poor

Our choice is AWS
• AWS has the best technology and quality.
• AWS has the best global coverage.
• Aliyun has the best coverage in China.

Global datacenters
• US
• China
• EU
• Singapore

AWS offerings - regions
• China: China
• Asia: ap-southeast-1 (ap-southeast-2, ap-northeast-1)
• US: us-east-1 (us-west-2)
• EU: eu-central-1 (eu-west-1)

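One way to read the region list above is as a per-market preference order: the first region is the primary and the ones in parentheses are alternates. That reading is an assumption, as are the `REGIONS_BY_MARKET` table and the `pick_region` helper in this minimal sketch. The China entry is kept as the slide's label because no region id is given for it.

```python
# Region lists copied from the slide: primary first, alternates after.
# Treating the parenthesized regions as fallbacks is an assumption.
REGIONS_BY_MARKET = {
    "China": ["China"],  # slide gives only a label, not a region id
    "Asia": ["ap-southeast-1", "ap-southeast-2", "ap-northeast-1"],
    "US": ["us-east-1", "us-west-2"],
    "EU": ["eu-central-1", "eu-west-1"],
}


def pick_region(market, unavailable=frozenset()):
    """Return the first listed region for a market that is not unavailable."""
    try:
        candidates = REGIONS_BY_MARKET[market]
    except KeyError:
        raise ValueError(f"unknown market: {market!r}") from None
    for region in candidates:
        if region not in unavailable:
            return region
    raise RuntimeError(f"no region available for market {market!r}")


print(pick_region("US"))
print(pick_region("EU", unavailable={"eu-central-1"}))
```

Keeping the table as plain data means adding a market or reordering alternates does not touch the selection logic.
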
AWS offerings - services

Big Data

Big Data steps

Big Data Lambda Architecture

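The slide gives only the title, so the following is a generic, minimal sketch of the Lambda Architecture idea rather than Huami's actual design: an append-only master dataset, a batch layer that recomputes a view from scratch, a speed layer that keeps an incremental view of recent events, and a serving layer that merges the two at query time. The `LambdaStepCounter` class and the step-count example are hypothetical, and the sketch is single-threaded and in-memory.

```python
from collections import defaultdict


class LambdaStepCounter:
    """Toy Lambda Architecture: batch view + real-time view, merged on query."""

    def __init__(self):
        self.master = []                       # append-only event log
        self.batch_view = {}                   # rebuilt by the batch layer
        self.realtime_view = defaultdict(int)  # events since the last batch run

    def ingest(self, user, steps):
        """New events go to both the master dataset and the speed layer."""
        self.master.append((user, steps))
        self.realtime_view[user] += steps

    def run_batch(self):
        """Batch layer: recompute totals over the whole master dataset."""
        view = defaultdict(int)
        for user, steps in self.master:
            view[user] += steps
        self.batch_view = dict(view)
        # Everything ingested so far is now covered by the batch view,
        # so drop the real-time increments to avoid double counting.
        self.realtime_view.clear()

    def query(self, user):
        """Serving layer: merge the batch and real-time views."""
        return self.batch_view.get(user, 0) + self.realtime_view.get(user, 0)


counter = LambdaStepCounter()
counter.ingest("tom", 1200)
counter.ingest("tom", 800)
print(counter.query("tom"))  # answered from the real-time view only
counter.run_batch()
counter.ingest("tom", 500)
print(counter.query("tom"))  # batch view (2000) plus real-time view (500)
```

In a production system the batch recomputation and the real-time updates would run as separate jobs over durable storage, which is where the handoff between the two views needs more care than the single `clear()` call here.
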
  
Big data - options
Component: Options
• Real time data processing
– Spark Streaming (recommended)
– Storm
• Real time NoSQL database
– DynamoDB (recommended)
– Cassandra
– HBase
– MongoDB
• Offline data storage
– S3 (recommended)
– HDFS
• ETL (SQL)
– Hadoop Hive/Pig
– Spark SQL (recommended)
• ETL (programming)
– Hadoop MapReduce
– Spark RDD/DataFrame programming (recommended)
• Batch analytics
– Spark (SQL)
– Redshift (recommended)
– Hadoop Hive/Pig
• Machine learning
– Hadoop Mahout
– Spark MLlib / SparkR (recommended)
– AWS machine learning
– R / SciPy / Matlab / DeepLearning

Data products
• Helping users better track their activity, including fitness and health
• Developing an ecosystem with partners from different areas (smart appliances, security, payment, and many others)

Q & A

