A Data science primer

Usage Rights

© All Rights Reserved

    A Data Science Primer: Presentation Transcript

    • A Data Science Primer
      Pavan Keerthi, Independent Consultant specializing in Business Intelligence
      Website: www.3dmxconsulting.co.uk
      Twitter: @pavan_keerthi
      Email: pavan.keerthi@3dmxconsulting.co.uk
      LinkedIn: www.linkedin.com/in/pavankeerthi
    • Who is a Data Scientist?
      Someone who can work with:
      • Maths / Statistics
      • Machine Learning
      • Programming / Hacking
      • Data Analysis
      • Data Visualization
      • Business Intelligence
      Other aspects that are critical for the job:
      • Strong communication skills
      • Strong organizational skills
      • Asking questions
      (Slide footer: 03/01/2012, A Data Science Primer)
    • What data do you deal with?
      • Structured, semi-structured, and unstructured data
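To make the three categories on this slide concrete, here is a minimal Python sketch using only the standard library. The sample records and the names `structured_csv`, `semi_structured_json`, and `unstructured_text` are invented for illustration and do not come from the presentation.

```python
import csv
import io
import json

# All sample values below are hypothetical, made up for illustration.

# Structured: fixed columns, every row has the same fields.
structured_csv = "customer_id,amount\n1,9.99\n2,24.50\n"
rows = list(csv.DictReader(io.StringIO(structured_csv)))

# Semi-structured: self-describing, but fields may vary between records.
semi_structured_json = '[{"id": 1, "tags": ["new"]}, {"id": 2, "referrer": "email"}]'
records = json.loads(semi_structured_json)

# Unstructured: free text with no schema; any structure has to be extracted.
unstructured_text = "Customer called to say the delivery arrived late."
words = unstructured_text.lower().rstrip(".").split()

print(rows[0]["amount"])          # field access by column name
print(sorted(records[1].keys()))  # keys differ from those of records[0]
print(len(words))                 # a crude word count is already "extraction"
```

The point of the sketch is the amount of work needed before analysis: the CSV rows are usable immediately, the JSON records need per-record checks for missing fields, and the free text needs parsing before it yields anything countable.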
    • Life Cycle of Data Science Project
      • Data Collection
      • Data Munging
      • Data Storage
      • Data Processing
      • Data Presentation
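The five life cycle stages can be sketched as a chain of small functions. This is an illustrative toy under stated assumptions, not the presenter's method: the function names, the sample log lines, and the use of JSON text as a stand-in for real storage are all hypothetical.

```python
import json
from collections import Counter

def collect():
    # Collection: hypothetical raw log lines; real sources would be files, APIs, etc.
    return ["  Alice,LOGIN ", "bob,login", "", "alice,logout"]

def munge(raw_lines):
    # Munging: drop empty lines, trim whitespace, normalise case, split fields.
    cleaned = []
    for line in raw_lines:
        line = line.strip()
        if not line:
            continue
        user, event = line.lower().split(",")
        cleaned.append({"user": user, "event": event})
    return cleaned

def store(records):
    # Storage: serialise to JSON text as a stand-in for a database or file system.
    return json.dumps(records)

def process(stored):
    # Processing: count events per user.
    records = json.loads(stored)
    return Counter(r["user"] for r in records)

def present(counts):
    # Presentation: render a plain-text report, most active users first.
    return "\n".join(f"{user}: {n}" for user, n in counts.most_common())

report = present(process(store(munge(collect()))))
print(report)
```

Keeping each stage as a separate function mirrors the slide: any one stage can be swapped (for example, a real database in `store`) without touching the others.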
    • Data Value Chain
      Stages: Raw, Cleaned, Reporting, Analytics, Forecast
      Attributes shown alongside the stages: Insights, Volume, Storage, Metrics, Value Add, Leverage, No Value, Compliance, Monitoring, Revenue
    • Data lifetime is important
      • 1 hour to days: Historical Analysis
        • Offline OLAP
        • Massive Analytical Processing
      • 1 sec to 1 hr: Near Real Time Analysis
        • Real time OLAP
        • Messaging Systems
      • Less than 1 sec: Blocking
        • OLTP System
      Many times, the same copy of data flows through all cycles.
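The time bands on this slide can be read as a simple lookup from data age to analysis tier. The sketch below is one possible reading: the function name `analysis_tier` is hypothetical, and the handling of the exact boundaries (1 second, 1 hour) and of data older than "days" is an assumption, since the slide does not specify them.

```python
def analysis_tier(age_seconds: float) -> str:
    """Map the age of a piece of data to the tier named on the slide."""
    if age_seconds < 0:
        raise ValueError("age cannot be negative")
    if age_seconds < 1:
        return "blocking"        # less than 1 sec: OLTP systems
    if age_seconds <= 3600:
        return "near real time"  # 1 sec to 1 hr: real time OLAP, messaging systems
    return "historical"          # over 1 hr (assumed open-ended): offline OLAP

for age in (0.2, 60, 86_400):
    print(age, analysis_tier(age))
```

Because the same copy of data flows through all cycles, a record would pass through each of these tiers in turn as it ages.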
    • Buzz: What is Big Data?
      The definition is still evolving, but data with the following characteristics can be called Big Data:
      • Volume: the size generally makes it impractical to store the data on a single machine or to move it around easily.
      • Velocity: the speed at which the data arrives makes it impractical to process with traditional data warehouses.
      • Variety: the data arrives from various systems and in various formats.
    • Buzz: How Big is Big Data?
      Scale shown, from largest to smallest: Zettabyte, Exabyte, Petabyte, Terabyte, Gigabyte, Megabyte
      The consensus is that Big Data sets are usually terabytes to petabytes in size.
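For a sense of the gaps between these units, the following sketch converts them to bytes using decimal (SI) prefixes, where each unit is 1000 times the previous one. Binary prefixes (multiples of 1024) are also in common use and give slightly larger figures. The 2 TB disk in the example is a hypothetical figure chosen for illustration.

```python
UNITS = ["megabyte", "gigabyte", "terabyte", "petabyte", "exabyte", "zettabyte"]

# Decimal (SI) sizes: a megabyte is 1000**2 bytes, each next unit is 1000x larger.
BYTES_PER_UNIT = {name: 1000 ** (i + 2) for i, name in enumerate(UNITS)}

def to_bytes(value, unit):
    """Convert a quantity in the given unit to bytes."""
    return value * BYTES_PER_UNIT[unit]

# How many hypothetical 2 TB disks would hold 1 PB of raw data (no redundancy)?
disks_needed = to_bytes(1, "petabyte") // to_bytes(2, "terabyte")
print(disks_needed)
```

Even before replication or backups, a petabyte needs hundreds of such disks, which is why data at this scale does not fit on a single machine.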
    • Where Do you Apply Data Science?
      • Financial Sector
      • Bio Technology
      • Retail / Ecommerce
      • Telecom / Network Ops
      • Law Enforcement
      • Social Media
      • Entertainment / Gambling
    • What Technologies can you use today?
      • Machine Learning Libraries: Mahout, Weka, Oracle Data Mining, SQL Server Data Mining, SPSS, SAS, or other open source implementations
      • Databases: relational, NoSQL, OLAP cubes, in-memory, columnar, graph databases, etc.
      • Analytical Software: R, Matlab, Octave, or numerical libraries for general-purpose languages (e.g. NumPy)
      • Storage/Computing: Hadoop and related technologies, cloud computing (e.g. AWS, Azure)