The Colorful World of
Introduction to Data Science
- Demonstration: Loan Prediction Problem
- Exploratory data analysis in Python
- Data Munging in Python
- Building a Predictive Model in Python
The Science of
- Discovering what we don’t know from data
- Obtaining predictive, actionable insight from data
- Creating Data Products that have business impact
- Communicating relevant business stories from data
- Building confidence in decisions that drive business
“Data science is clearly a blend of the hackers’ arts,
statistics and machine learning...
and the expertise in mathematics and the domain of
the data for the analysis to be interpretable...
It requires creative decisions and open-mindedness in
a scientific context”
Hilary Mason and Chris Wiggins
Hilary Mason is an American data scientist and the founder of technology startup Fast Forward Labs as well as Data Scientist in Residence at Accel Partners. She was the Chief Scientist at bitly.
Christopher H. Wiggins is an associate professor of applied mathematics at Columbia University, the first Chief Data Scientist at The New York Times, and co-founder and co-organizer of hackNY (hackNY.org).
“We realized that as our organizations grew, we both had to figure
out what to call the people on our teams.
Business analyst and Data analyst seemed too limiting.
The focus of our teams was to work on data applications that would
have an immediate and massive impact on the business.
The term that seemed to fit best was data scientist:
those who use both data and science to create something new”
DJ Patil, Chief Data Scientist of the United States Office of Science and Technology Policy, is credited with coining the term "data science".
“... on any given day, a team member could author a multistage
processing pipeline in Python,
design a hypothesis test, perform a regression analysis over data
samples with R,
design and implement an algorithm for some data-intensive product
or service in Hadoop,
communicate the results of our analyses to other members of the
Jeff Hammerbacher is a data scientist as well as chief scientist and cofounder at Cloudera. Along with Patil, Hammerbacher is credited with coining the term "data science".
Data scientists need to know how to code
SQL / NoSQL
Spark / Hadoop
Data scientists need to be comfortable with
mathematics & statistics.
Data scientists need to know machine learning &
Putting the pieces together .....
SIMPLE (Students' Innovations in Morphology Phonology and
Language Engineering) groups
CLEAR (Computational Linguistics in Engineering And
- Blog / Write about your experience
- Build sample projects
- Share ideas
A huntsman can hit a target with a probability of 0.8.
He sees a flock of birds (150 birds) atop a banyan tree.
He takes aim and fires 5 consecutive shots.
Question: How many birds remain on the tree?
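
One way to start on the question is a purely statistical reading, sketched below in Python. It is only a sketch under assumptions the puzzle does not state: each shot is treated as an independent trial with hit probability 0.8, each hit removes exactly one bird, and no bird leaves the tree for any other reason. The names P_HIT, SHOTS, BIRDS and hit_distribution are illustrative, and whether these assumptions hold for real birds is part of what the question asks you to think about.

```python
from math import comb

# Values taken from the puzzle statement.
P_HIT = 0.8   # probability that a single shot hits
SHOTS = 5     # number of shots fired
BIRDS = 150   # birds on the tree before shooting


def hit_distribution(n, p):
    """Binomial probabilities of 0..n hits in n independent shots."""
    return [comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(n + 1)]


dist = hit_distribution(SHOTS, P_HIT)

# Expected number of hits under the independence assumption.
expected_hits = sum(k * pk for k, pk in enumerate(dist))

# Naive estimate: assumes one bird falls per hit and the rest stay put.
naive_remaining = BIRDS - expected_hits

# Probability that at least one of the shots hits.
p_at_least_one = 1 - dist[0]

print("expected hits:", round(expected_hits, 4))
print("naive birds remaining:", round(naive_remaining, 4))
print("P(at least one hit):", round(p_at_least_one, 5))
```

The calculation is only as good as its assumptions, which is where knowledge of the domain comes in alongside the mathematics.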