
Data science


The Colorful World of
Data Science
Sreejith C
Data Scientist
Calpine Labs
UVJ Technologies
Kochi


  1. The Colorful World of Data Science. Sreejith C, Data Scientist, Calpine Labs, UVJ Technologies, Kochi
  2. Overview - Presentation: Introduction to Data Science - Demonstration: Loan Prediction Problem - Exploratory data analysis in Python - Data munging in Python - Building a predictive model in Python: Logistic Regression, Decision Tree, Random Forest
  3. What is Data Science?
  4. The Science of - Discovering what we don’t know from data - Obtaining predictive, actionable insight from data - Creating data products that have business impact now - Communicating relevant business stories from data - Building confidence in decisions that drive business value
  5. “Data science is clearly a blend of the hackers’ arts, statistics and machine learning... and the expertise in mathematics and the domain of the data for the analysis to be interpretable... It requires creative decisions and open-mindedness in a scientific context.” Hilary Mason and Chris Wiggins. Hilary Mason is an American data scientist and the founder of technology startup Fast Forward Labs as well as Data Scientist in Residence at Accel Partners. She was the Chief Scientist at bitly. Christopher H. Wiggins is an associate professor of applied mathematics at Columbia University, the first Chief Data Scientist at The New York Times, and co-founder and co-organizer of hackNY (hackNY.org).
  6. THE DATA SCIENCE VENN DIAGRAM
  7. Who is a Data Scientist?
  8. “We realized that as our organizations grew, we both had to figure out what to call the people on our teams. Business analyst and Data analyst seemed too limiting. The focus of our teams was to work on data applications that would have an immediate and massive impact on the business. The term that seemed to fit best was data scientist: those who use both data and science to create something new.” DJ Patil, Chief Data Scientist of the United States Office of Science and Technology Policy. Patil is credited with coining the term "data scientist".
  9. What Does a Data Scientist Do?
  10. “... on any given day, a team member could author a multistage processing pipeline in Python, design a hypothesis test, perform a regression analysis over data samples with R, design and implement an algorithm for some data-intensive product or service in Hadoop, communicate the results of our analyses to other members of the organization.” Jeff Hammerbacher, data scientist, chief scientist and cofounder at Cloudera. Along with DJ Patil, Hammerbacher is credited with coining the term "data scientist".
  11. Machine Learning - Regression - Classification - Clustering
  12. Big Data Analytics
  13. How to become a data scientist?
  14. Data scientists need to know how to code: Python, R, Julia, Java, Scala, SQL / NoSQL, Spark / Hadoop
  15. Data scientists need to be comfortable with mathematics and statistics.
  16. Data scientists need to know machine learning and software engineering.
  17. Putting the pieces together: SIMPLE (Students' Innovations in Morphology Phonology and Language Engineering) groups, CLEAR (Computational Linguistics in Engineering And Research) magazine - Blog / write about your experience - Build sample projects - Share ideas
  18. Puzzle: A huntsman can hit a target with a probability of 0.8. He sees a flock of birds (150 birds) atop a banyan tree. He takes aim and fires 5 consecutive shots. Question: How many birds remain on the tree?
  19. Don't lose the big picture! Answer: 0!
  20. Demonstration: Loan Prediction Problem. The challenge is to predict the approval status of a loan (Approved / Rejected). Link: https://github.com/sreejithc321/ML_Regression/tree/master/loan_prediction
  21. References - http://www.slideshare.net/ryanorban/how-to-become-a-data-scientist - http://www.slideshare.net/datasciencelondon/big-data-sorry-data-science-what-does-a-data-scientist-do - https://speakerdeck.com/bargava/introduction-to-machine-learning - https://www.analyticsvidhya.com/blog/2016/01/complete-tutorial-learn-data-science-python-scratch-2/
  22. Connect with me at: http://in.linkedin.com/in/sreejithc321 - Follow me at: https://twitter.com/sreejithc321
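
The demonstration on slide 20 predicts whether a loan is approved or rejected, and the overview names logistic regression as the first model to try. The sketch below is one rough way to illustrate that idea; it is not the code from the linked repository. It assumes a synthetic data set with two made-up features (`income`, `credit_history`) and an invented approval rule, and it fits the model with plain batch gradient descent using only the Python standard library. All function names here are hypothetical.

```python
import math
import random


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def make_synthetic_loans(n, seed):
    """Generate hypothetical applicants as ([income, credit_history], approved)."""
    rng = random.Random(seed)
    data = []
    for _ in range(n):
        income = rng.random()                # applicant income scaled to [0, 1)
        credit_history = rng.choice([0, 1])  # 1 = good repayment record
        # Assumed approval rule, used only to label the toy data.
        approved = 1 if 4 * income + 2 * credit_history - 3 > 0 else 0
        data.append(([income, float(credit_history)], approved))
    return data


def predict_proba(weights, bias, features):
    """Estimated probability that the loan is approved."""
    return sigmoid(bias + sum(w * x for w, x in zip(weights, features)))


def train_logistic_regression(data, learning_rate=0.5, epochs=2000):
    """Fit weights and bias with batch gradient descent on the log loss."""
    n_features = len(data[0][0])
    weights = [0.0] * n_features
    bias = 0.0
    for _ in range(epochs):
        grad_w = [0.0] * n_features
        grad_b = 0.0
        for features, label in data:
            error = predict_proba(weights, bias, features) - label
            for j, x in enumerate(features):
                grad_w[j] += error * x
            grad_b += error
        weights = [w - learning_rate * g / len(data)
                   for w, g in zip(weights, grad_w)]
        bias -= learning_rate * grad_b / len(data)
    return weights, bias


# Train on one synthetic sample and evaluate on a separate one.
train = make_synthetic_loans(200, seed=1)
test = make_synthetic_loans(100, seed=2)
weights, bias = train_logistic_regression(train)

correct = sum(
    (predict_proba(weights, bias, features) >= 0.5) == bool(label)
    for features, label in test
)
accuracy = correct / len(test)
print(f"held-out accuracy: {accuracy:.2f}")
```

In practice a library implementation would normally replace the hand-written training loop, and the decision tree and random forest from the overview would be compared on the same held-out split.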
