A Data Science Primer
Published in: Technology
Transcript

  • 1. A Data Science Primer. Pavan Keerthi, Independent Consultant specializing in Business Intelligence. Website: www.3dmxconsulting.co.uk; Twitter: @pavan_keerthi; Email: pavan.keerthi@3dmxconsulting.co.uk; LinkedIn: www.linkedin.com/in/pavankeerthi
  • 2. Who is a Data Scientist? Someone who can work with: Maths / Statistics; Machine Learning; Programming / Hacking; Data Analysis; Data Visualization; Business Intelligence. Other aspects: strong communication skills; strong organizational skills; asking questions is critical for the job. (Slide footer: 03/01/2012, A Data Science Primer)
  • 3. What data do you deal with? Structured, semi-structured, and unstructured data.
  • 4. Life Cycle of a Data Science Project: Data Collection → Data Munging → Data Storage → Data Processing → Data Presentation.
  • 5. Data Value Chain. Stages: Raw, Cleaned, Reporting, Analytics, Forecast. Labels shown along the chain: Insights, Volume, Storage, Metrics, Value Add, Leverage, No Value, Compliance, Monitoring, Revenue.
  • 6. Data lifetime is important. 1 hour to days: Historical Analysis (offline OLAP, massive analytical processing). 1 sec to 1 hr: Near Real-Time Analysis (real-time OLAP, messaging systems). Less than 1 sec: Blocking (OLTP system). Many times, the same copy of data flows through all cycles.
  • 7. Buzz: What is Big Data? The definition is still evolving, but data with the following characteristics can be called Big Data. Volume: the size generally makes it impractical to store the data on a single machine or to move it around easily. Velocity: the speed at which the data is received makes it impractical to process using traditional data warehouses. Variety: the data is received from various systems and in various formats.
  • 8. Buzz: How big is Big Data? Scale shown: Megabyte, Gigabyte, Terabyte, Petabyte, Exabyte, Zettabyte. The consensus is that Big Data sets are usually terabytes to petabytes in size.
  • 9. Where do you apply Data Science? Financial Sector; Bio Technology; Retail / Ecommerce; Telecom / Network Ops; Law Enforcement; Social Media; Entertainment / Gambling.
  • 10. What technologies can you use today? Machine learning libraries: Mahout, Weka, Oracle Data Mining, SQL Server Data Mining, SPSS, SAS, or other open source implementations. Databases: relational, NoSQL, OLAP cubes, in-memory, columnar, graph, etc. Analytical software: R, Matlab, Octave, or language libraries (e.g. NumPy). Storage / computing: Hadoop and related technology, cloud computing (e.g. AWS, Azure).
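
The life cycle on slide 4 (collection, munging, storage, processing, presentation) can be illustrated with a small end-to-end sketch. This is only a minimal example under stated assumptions, not something taken from the slides: the sample CSV data, the `sales` table, and all function names are hypothetical, and SQLite from the Python standard library stands in for whatever storage layer a real project would use.

```python
import csv
import io
import sqlite3

# Hypothetical raw input; the slides do not specify any data set.
RAW_CSV = """region,amount
north, 120.5
south,80
north,not_a_number
east,
south, 19.5
"""


def collect(raw_text):
    """Data collection: read raw rows from a CSV source."""
    return list(csv.DictReader(io.StringIO(raw_text)))


def munge(rows):
    """Data munging: trim whitespace, drop rows with a non-numeric amount."""
    cleaned = []
    for row in rows:
        try:
            amount = float(row["amount"].strip())
        except (ValueError, AttributeError):
            continue  # skip rows with missing or malformed amounts
        cleaned.append((row["region"].strip(), amount))
    return cleaned


def store(records):
    """Data storage: load cleaned records into an in-memory SQLite table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE sales (region TEXT, amount REAL)")
    conn.executemany("INSERT INTO sales VALUES (?, ?)", records)
    conn.commit()
    return conn


def process(conn):
    """Data processing: total amount per region."""
    query = (
        "SELECT region, SUM(amount) FROM sales "
        "GROUP BY region ORDER BY region"
    )
    return conn.execute(query).fetchall()


def present(totals):
    """Data presentation: format the totals as a plain-text report."""
    return "\n".join(f"{region}: {total:.2f}" for region, total in totals)


def run_pipeline(raw_text):
    """Run all five stages in order."""
    return present(process(store(munge(collect(raw_text)))))


if __name__ == "__main__":
    print(run_pipeline(RAW_CSV))
```

Each stage is a separate function so that any one of them could be swapped out, for example replacing `store` with a NoSQL or columnar database of the kind listed on slide 10, without touching the others.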
