8-minute intro to Data Science
Key takeaways of Data Science
• An overview of the shift to Data Science Platforms
• The 2 critical components of a Data Science platform
• Industries that are most likely to get disrupted and shift to Data Science
• Characteristics of firms that get left behind the Data Science wave
• Factors that push an industry towards Data Science
• A brief overview of aspects of platform architecture beyond Hadoop
• What's in it for you? How can an individual tap into this massive new trend?
What's common to the following?
• Japanese dating app
• Sensor-equipped cows in the Netherlands
• Google's autonomous car
• MOOC
• Heart implants
The world around us is changing …
• How our health gets cared for
• How we learn
• How we fall in love
• How we do farming
• How we drive
They all have Data Science embedded in them
(an intimate part of the fabric of our lives)
How did the following players disrupt the marketplace?
• Amazon defeated Borders (Books)
• Netflix defeated Blockbuster (Video)
• iTunes defeated Tower Records (Music)
• Google defeated Yahoo (Search) with the PageRank algorithm
What's the secret sauce?
The ability to “see” patterns FASTER than the competition is key to SURVIVAL!
Industries disrupted by Data Science
• Telecom (Infrastructure optimisation, Network security)
• Banking (Customer sentiment, Multi-channel analysis)
• Digital channel (Consumer engagement, Recommendation engines)
• Automotive (Autonomous cars, GM's OnStar)
• Health care (Wearables)
• Oil and Gas (Operations optimisation)
• Retail (Digitisation)
What factors are driving companies towards Data Science?
• Competitive advantage in the marketplace (get ahead fast using unique insights)
• Existential threat (others are moving ahead fast and I need to catch up)
• Revenue enhancement (Cross-sell models, recommenders)
• Cost optimisation (Operational efficiency)
Overview of a Data Science Platform
• Store massive torrents of data
  • Billions of events
  • Petabytes of data
  • Emphasis on how to handle data
    • Hadoop
    • Cloudera
    • Infobright
    • Splunk
• Sense patterns from data for competitive advantage
  • Emphasis on seeing patterns
    • Algorithms
    • Clustering
    • Advanced visualisation
    • Text mining
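The "sense patterns" half of the platform can be made concrete with clustering. What follows is a minimal sketch of plain k-means, not the method of any tool named above. The function name `kmeans` and the six customer points are made up for illustration.

```python
import math
import random


def kmeans(points, k, iterations=20, seed=0):
    """Plain k-means: returns (centroids, labels) for a list of (x, y) points."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # start from k distinct data points
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        labels = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # keep the old centroid if a cluster is empty
                centroids[c] = tuple(sum(v) / len(members) for v in zip(*members))
    return centroids, labels


# Hypothetical data: (monthly spend, store visits) for six customers.
customers = [(10, 1), (12, 2), (11, 1), (95, 9), (100, 10), (98, 11)]
centroids, labels = kmeans(customers, k=2)
```

On this toy data the low spenders and the high spenders end up in separate clusters. Real platforms run the same idea over billions of events, which is why the storage half matters.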
Why is Data Science hot?
Data Science jobs are HOT!
Data Science jobs are hot in India too!
BIG DATA HAS ENTERED THE BOARDROOM GLOBALLY
“By 2018, the United States alone could face a shortage of 140,000 to 190,000 people with deep analytical skills as well as 1.5 million managers and analysts with the know-how to use the analysis of big data to make effective decisions”
McKinsey Global Institute: Big data: The next frontier for innovation, competition, and productivity
DATA SCIENCE = PASSPORT TO THE GLOBAL MARKET!
To summarize
3 key takeaways …
6 key points regarding our UNIQUE LEARNING MODEL
Principle-1: Humanize Machine Learning
Principle-2: 60% Doing + 40% Listening
Principle-3: Biz backward, instead of Technology forward!
Principle-4: Playbooks + Checklists + Worksheets
Principle-5: Outcome trumps Output, ROI is key!
(Figure: segmentation; ROI from customers; moving to high-value segments)
Principle-6: Repeat the top 10 R commands 5 times
What will you have learned at the end of 4 weeks?
15 Core Foundational Building Blocks for the next-generation job market
• PREDICTIVE SCORING MODELING
• DEMYSTIFYING MACHINE LEARNING
• CORRELATION DETECTION
• ADVANCED VISUALISATION
• VOLATILITY ANALYTICS
• CLUSTERING
• FEATURE EXTRACTION
• OUTLIER EXPLORATION
• BOX PLOTS
• SCATTER PLOTS
• UNIVARIATE ANALYSIS
• EXPLORATORY DATA ANALYSIS
• REGRESSION MODELING
• BUSINESS USE CASES OF ML
• REFERENCE ARCHITECTURE
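A few of these building blocks (univariate analysis, outlier exploration via the box-plot rule, correlation detection) fit in a few lines. This is a rough sketch using only the Python standard library. The helper names `iqr_outliers` and `pearson` and the subscriber numbers are hypothetical, chosen only to illustrate the ideas.

```python
import statistics


def iqr_outliers(values):
    """Return values outside the box-plot whiskers (1.5 * IQR beyond the quartiles)."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    spread = q3 - q1
    low, high = q1 - 1.5 * spread, q3 + 1.5 * spread
    return [v for v in values if v < low or v > high]


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


# Hypothetical data: monthly call minutes and bill amounts for eight subscribers.
minutes = [120, 135, 150, 160, 170, 185, 200, 900]
bills = [24, 27, 30, 32, 34, 37, 40, 180]

# Univariate analysis: one variable at a time.
summary = {"mean": statistics.fmean(minutes), "median": statistics.median(minutes)}
# Outlier exploration: the same rule a box plot uses to draw its whiskers.
outliers = iqr_outliers(minutes)
# Correlation detection: how closely two variables move together.
r = pearson(minutes, bills)
```

The mean sits well above the median here because a single heavy user pulls it up, and the box-plot rule flags that same subscriber.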
4-Week Data Science Boot Camp
Week-by-week plan
Week-1
• Demystifying Data Science
• Introduction to Machine Learning techniques
• Step-by-step methodology for converting noise to signal
• 12 tools of a Data Scientist
• Data Science Lab Session-1: Getting feet wet in Data Science tools
Week-2
• Descriptive vs Prescriptive statistics
• How to do EDA (Exploratory Data Analysis): Univariate / Bivariate / Correlations
• Advanced Visualisation techniques
• Data Science Lab Session-2: Hands-on Univariate + Bivariate + Correlation Analytics
Week-3
• Introduction to segmentation and clustering techniques
• Segmentation in the Retail industry
• Segmentation in the Telecom industry
• Segmentation in the Healthcare industry
• How to present for maximising Segmentation Business Impact
• Data Science Lab Session-3: Hands-on SEGMENTATION on live data
Week-4
• Demystifying Predictive Analytical Models (PAM)
• Predictive Analytical Models in the Retail industry
• Predictive Analytical Models in the Telecom industry
• Predictive Analytical Models in the Healthcare industry
• Mapping the Impact of Predictive Models on Business Outcomes
• Summary of Key Data Science concepts
• Data Science Lab Session-4: Hands-on PREDICTIVE ANALYTICS on live data
• END-2-END MACHINE LEARNING PROJECT on live data (Telecom or Retail or Banking)
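To give a flavour of the predictive analytics labs, here is a tiny sketch of a predictive model: a one-variable least-squares fit used to score a new case. It is an illustration only, not the boot camp's lab material. The function names `fit_line` and `predict` and the retail numbers are hypothetical.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / sum(
        (x - mean_x) ** 2 for x in xs
    )
    intercept = mean_y - slope * mean_x
    return intercept, slope


def predict(model, x):
    """Score a new observation with a fitted (intercept, slope) pair."""
    intercept, slope = model
    return intercept + slope * x


# Hypothetical training data: store visits per month vs. monthly spend.
visits = [1, 2, 3, 4, 5]
spend = [15, 25, 35, 45, 55]

model = fit_line(visits, spend)
forecast = predict(model, 6)  # expected spend for a customer with 6 visits
```

The business question ("how much will this customer spend?") comes first, and the model is only the means of answering it.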
Good luck hunting for patterns using Data Science!