Transcript

  • 1. Data Science Shankar Radhakrishnan Cognizant
  • 2. History… • Questions first, data later • Data model first, data processing later • Size first, project second, react over time • Focus on accuracy, assume little • Importance given to completeness and comprehensiveness • Expose raw data to decision makers • Provide insights, but ones that are not actionable • Bound by constraints (Procurement, Process, Build Insights, Interaction)
  • 3. What’s Changed? • Media for participation are vast • Modes of reach have expanded • Data types are varied and voluminous • Noise is huge, yet accepted • Urgency precedes accuracy • Guidance is better than completeness • Cost to store and process has fallen (and is still falling) • More ways and means to process data at scale
  • 4. Speaking of Data • Volume - Data at rest • Variety - Data in many forms • Velocity - Data in motion • Veracity - Data in doubt
  • 5. Data Science • “Data Science is the art of turning data into actions.” This is accomplished through the creation of data products that provide actionable information without exposing underlying data or analytics • “Scientific study of the creation, validation and transformation of data to create meaning” http://www.datascienceassn.org/code-of-conduct.html
  • 6. While we are on definitions… • Data Mining: “Non-trivial process of identifying valid, novel, potentially useful and understandable structures or patterns or models or relationships in data to enable data driven decision making” • Statistics: “Science of learning from data or of making sense out of data”
  • 7. Science of Data Science • Analyze and understand the data that’s available • Find and acquire what more is needed • Discover what’s not known from data • Predict and build “actionable insights” from data • Build data products that have “immediate” business impact • Make it easy for business to “use” • Help decision making to drive “business value”
  • 8. Data Science Toolkit • Languages: Python, R, Java, Textwrangler, SQL, C, C++ • Libraries: Mahout, NLTK, OpenNLP, GPText, SciPy, Pandas, scikit-learn • Database: Hadoop, Hive, HAWQ, PL/Python, PL/R, PL/Java, Proprietary • Visualization: D3.js, Gephi, Graphviz, R, Tableau, Proprietary
  • 9. Approach, Techniques • Step: Classification, Filtering, Structure, Clustering, Disambiguation, De-duplication, Normalization, Correlation, Prediction • Process: Discover, Reason, Model, Deploy, Visualize, Recommend, Predict, Explore • Technology: Machine Learning, Decision Trees, Bayesian Networks, Logistic Regression, Monte Carlo Methods, Component Analysis, Fuzzy Modeling, Neural Networks, Genetic Algorithms
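
Three of the steps named on slide 9 (normalization, de-duplication, correlation) can be illustrated with a minimal Python sketch. This is not taken from the presentation: the record layout, the sample names and values, and the helper functions `normalize`, `deduplicate`, and `pearson` are all hypothetical, chosen only to show how the steps might chain together using the standard library.

```python
from math import sqrt

def normalize(name):
    """Normalization: lower-case and collapse whitespace so equivalent names compare equal."""
    return " ".join(name.lower().split())

def deduplicate(records):
    """De-duplication: keep the first record seen for each normalized name, preserving order."""
    seen = set()
    unique = []
    for name, value in records:
        key = normalize(name)
        if key not in seen:
            seen.add(key)
            unique.append((key, value))
    return unique

def pearson(xs, ys):
    """Correlation: Pearson coefficient of two equal-length numeric sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

# Hypothetical sample data: (customer name, monthly spend), with one duplicate entry.
records = [
    ("Alice  Smith", 10.0),
    ("alice smith", 10.0),
    ("Bob Jones", 20.0),
    ("Carol Lee", 30.0),
]
clean = deduplicate(records)

# Hypothetical second measurement (usage hours), one value per de-duplicated customer.
usage = [1.0, 2.0, 3.0]
spend = [value for _, value in clean]
r = pearson(usage, spend)
```

In practice the libraries listed on slide 8 (Pandas, SciPy, scikit-learn) would likely replace these hand-written helpers; the sketch only makes the individual steps concrete.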
  • 10. Data Science In Action • Improving User Experience • Multi-device event stream analysis • Intrusion detection, avoidance • Collocation analysis from cell-phone towers • Text Mining, Bandwidth Throttling • Network Performance & Optimization • Mobile User Location Analytics • Customer Churn Prevention • Social Media and Sentiment Analysis • Location Based Initiatives
  • 11. Thanks !