Presented By:
Shivumanjesh P
Facts on Data Generation
 Every day, 2.5 quintillion bytes of data are created.
 With so much information at our fingertips, we’re
adding to the data stockpile every time we turn to
our search engines for answers.
 Internet - More than 3.7
billion humans use the internet (that’s
a growth rate of 7.5 percent over
2016).
 On average, Google now processes
more than 40,000 searches EVERY
second (3.5 billion searches per day)!
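The per-day figure follows from the per-second rate. A quick check, assuming the rate stays constant over a 24-hour day (the variable names are illustrative):

```python
# Sanity check: convert the quoted per-second search rate to a daily total.
SEARCHES_PER_SECOND = 40_000        # figure quoted on the slide
SECONDS_PER_DAY = 24 * 60 * 60      # 86,400 seconds in a day

searches_per_day = SEARCHES_PER_SECOND * SECONDS_PER_DAY
print(f"{searches_per_day:,} searches per day")  # 3,456,000,000, roughly 3.5 billion
```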
Social Media
 Snapchat users share 527,760 photos
 More than 120 professionals join
LinkedIn
 Users watch 4,146,600 YouTube
videos
 456,000 tweets are sent on Twitter
 Instagram users post 46,740 photos
 1.5 billion people are active on
Facebook daily
Communication
 We send 16 million text messages
 There are 990,000 Tinder swipes
 15,000 GIFs are sent via Facebook Messenger
 103,447,520 spam emails are sent every minute
 There are 154,200 calls on Skype
Services
 The Weather Channel
receives 18,055,556 forecast requests
 Venmo processes $51,892 in peer-to-peer transactions
 Spotify adds 13 new songs on average every day
 Uber riders take 45,788 trips!
 There are 600 new page edits to
Wikipedia
Voice Search
 There are 33 million voice-first devices
in circulation
 8 million people use voice control
each month
 Voice search queries in Google for
2016 were up 35 times over 2008
What Is Data Science?
 A field that manages, manipulates, extracts, and interprets knowledge from tremendous amounts of data
Evolution of Data Science
What Makes Data Science Different
What Data Science Comprises
Data Science vs. Big Data vs. Data Analytics
Data Science vs. Big Data vs. Data Analytics (Cont.)
Data Science for the Modern Data
Architecture
Some Key Terms in Data Science
 Advanced analytics
 Big data
 Data analysis
 Data analytics
 Data scientist
 Descriptive analytics
 Predictive analytics
 Prescriptive analytics
Data Science Process
Common Data Science Techniques One Must Be Aware Of
 Anomaly Detection
 Clustering Analysis
 Association Analysis
 Regression Analysis
 Classification Analysis
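As an illustration of the first technique in the list, here is a minimal anomaly-detection sketch using z-scores. The function name `find_anomalies`, the threshold of 2.0, and the order totals are all made up for the example; real projects often use more robust methods.

```python
from statistics import mean, pstdev

def find_anomalies(values, threshold=2.0):
    """Return the values whose z-score magnitude exceeds the threshold."""
    mu = mean(values)
    sigma = pstdev(values)           # population standard deviation
    if sigma == 0:
        return []                    # all values identical: nothing stands out
    return [v for v in values if abs(v - mu) / sigma > threshold]

# Hypothetical order totals for one customer; the last one is unusually large.
order_totals = [20, 22, 19, 21, 23, 20, 22, 95]
anomalies = find_anomalies(order_totals)
print(anomalies)
```

A single extreme value inflates both the mean and the standard deviation, so a z-score check like this works best when anomalies are rare.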
Steps Involved in Problem Solving Using a Data Science Approach
 Define the problem
 Decide on an approach
 Collect data
 Analyze data
 Interpret results
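The five steps above can be sketched as a tiny end-to-end script. This is only an illustration under stated assumptions: the problem (how units sold change with advertising spend), the data, and the function names `collect_data`, `analyze`, and `interpret` are all hypothetical, and ordinary least-squares regression is just one possible approach.

```python
# Step 1: define the problem (hypothetical): how do units sold change with ad spend?
# Step 2: decide on an approach: ordinary least-squares linear regression.

def collect_data():
    """Step 3: collect data. Made-up observations, hard-coded for illustration."""
    ad_spend = [1, 2, 3, 4, 5]
    units_sold = [2.1, 3.9, 6.2, 7.8, 10.1]
    return ad_spend, units_sold

def analyze(xs, ys):
    """Step 4: analyze data by fitting y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

def interpret(slope, intercept):
    """Step 5: interpret results in plain language."""
    return (f"Each extra unit of ad spend is associated with about "
            f"{slope:.2f} more units sold (baseline {intercept:.2f}).")

xs, ys = collect_data()
slope, intercept = analyze(xs, ys)
print(interpret(slope, intercept))
```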
Data Science Solutions for Some Common Categories of Questions
Question | Data Science Approach
Which server in my server farm needs maintenance the most? | Identifying themes in large data sets
Is this combination of purchases different from what this customer has ordered in the past? | Identifying anomalies in large data sets
How likely is this user to click on my video? | Predicting the likelihood of something happening
What is the topic of this online article? | Showing how things are connected to one another
Is this an image of a cat or a mouse? | Categorizing individual data points
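The cat-or-mouse question above is a classification task: each data point is assigned to one category. A minimal nearest-centroid sketch follows, assuming each animal is described by two made-up features (weight in kg, body length in cm). The training values and the names `centroid` and `classify` are hypothetical; a real image classifier would work on pixel data with a far richer model.

```python
from math import dist

# Hypothetical labelled examples: (weight in kg, body length in cm).
training = {
    "cat":   [(4.0, 46.0), (5.0, 50.0), (3.5, 44.0)],
    "mouse": [(0.02, 8.0), (0.03, 9.0), (0.025, 7.5)],
}

def centroid(points):
    """Average each feature across the points of one category."""
    n = len(points)
    return tuple(sum(p[i] for p in points) / n for i in range(len(points[0])))

# One centroid per category, computed from the labelled examples.
centroids = {label: centroid(points) for label, points in training.items()}

def classify(sample):
    """Assign the sample to the category whose centroid is closest."""
    return min(centroids, key=lambda label: dist(sample, centroids[label]))

print(classify((0.03, 8.5)))   # a small, light animal
print(classify((4.2, 47.0)))   # a larger, heavier animal
```

The features are left unscaled here to keep the sketch short; with real data, features on different scales are usually normalized first.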
Applications of Data Science
In the Retail Industry
In the Healthcare Domain
Medical image analysis
Genetics and Genomics
Creation of drugs
Virtual assistance for patients
and customer support
Future of Data Science
And the Refinement Continues...
Data Science: A Comprehensive Overview
