"What is Data Science?" High School Version

{
What is Data Science?
Renée Teate, March 2016
Harrisonburg High School

Let’s start with: “What is Data?”
http://upload.wikimedia.org/wikipedia/commons/f/f0/DARPA
_Big_Data.jpg
https://encrypted-
tbn2.gstatic.com/images?q=tbn:ANd9GcS9dKu3_Tzi-sWW-
yAqee5y0EhuvoIZNSya_rAKnuBBd0JYxPX7pw
http://fc01.deviantart.net/fs71/i/2012/326/3/4/cute_dog_by_tho
masmeadows345-d5lsah9.jpg
http://www.freefoto.com/images/1351/06/1351_06_2---Books--
Shakespeare-and-Company-Bookstore--The-Latin-Quarter--
Paris_web.jpg

Created & Collected
https://c2.staticflickr.com/4/3273/3017878633_65beb1c7d6.jpg
http://upload.wikimedia.org/wikipedia/commons/e/e4/Gr
een_Bank_100m_diameter_Radio_Telescope.jpg
https://c1.staticflickr.com/1/2/1349370_07
03fce74c.jpg
http://upload.wikimedia.org/wikipedia/commons/9/96/Bill_Nye
,_Barack_Obama_and_Neil_deGrasse_Tyson_selfie_2014.jpg

Analyzed and Visualized
http://upload.wikimedia.org/wikipedia/commons/1/1c/CMS_Higgs-event.jpg
http://upload.wiki
media.org/wikipedi
a/commons/9/90/Ke
ncf0618FacebookNe
twork.jpg
http://upload.wikimedia.org/wikipedia/commo
ns/b/bf/USDA_Hardiness_zone_map.jpg
https://c1.staticflickr.com/3/2300/2596366618_2d6cb01735.jpg

“Big Data”
https://web-assets.domo.com/blog/wp-content/uploads/2014/04/DataNeverSleeps_2.0_v2.jpg

Stored in Databases on Servers in Data Centers
http://pixabay.com/static/uploads/photo/2014/03/13/01/12/datacen
ter-286386_640.jpg
https://c2.staticflickr.com/2/1296/533233247_b6baa30fdb_z.jpg?zz=1
“The Cloud”

Database
[dey-tuh-beys]
noun
A comprehensive collection of related data
organized for convenient access, generally in
a computer.
-dictionary.com
I used a database to look up this definition!

Types of Databases
http://www.oaddo.org
Relational DMBS
Graph Database

https://www.google.com/maps/@38.8905569,-77.1721577,13z/data=!5m1!1e1
http://upload.wikimedia.org/wikipedia/commons/6/69/Netflix_logo.svg
https://c2.staticflickr.com/4/3324/3507973704_563846fe14_z.jpg?zz=1
How is data
collected about you
used to help you?

Data Scientist
Computer
Scientist
• Gathering data
• Writing Code
• Designing Interfaces
• Design / Manage /
Query Databases
• Data Mining
Mathematician
• Statistics
• Predictive Analytics
• Data Visualizations
• Evaluating Results
Business Person
• Domain Expertise
• Knowing what
questions to ask
• Interpreting results
for business
decisions
• Presenting outcomes
No one person needs to have all of these skills.
More organizations are now building data science teams.

Becoming a Data Scientist Podcast

How have I learned data
science?

 Statistician
 Data Mining Specialist
 Biostatistician
 Social Science Researcher
 Big Data Analyst
 Spatial/GIS Analyst
 Natural Language
Processing Researcher
 Computational Physicist
Some other names for “Data Scientist”
 Pythonista
 Financial Analyst
 Recommendation System
Engineer
 Information Architect
 Artificial Intelligence
Researcher
 Neuroscientist
 Data Visualization Designer

Data Science jobs pay an
average of $118,000 per year
It is estimated that by 2018, US could have a
shortage of 140,000+ people with advanced
analytical skills & need 1.5M managers/analysts
that can make decisions based on data analysis

Examples
 Galaxy Classification from Images
http://benanne.github.io/2014/04/05/galaxy-zoo.html
 Choosing Audience for Content Promotion
on Facebook
http://citizennet.com/blog/2012/11/10/random-forests-
ensembles-and-performance-metrics/
 Predicting Seizures
https://www.kaggle.com/c/seizure-detection
 March Madness Picks
https://www.kaggle.com/c/march-machine-
learning-mania-2015

Facial Recognition (auto-tagging)
http://www.mirrordaily.com/facebook-and-google-develop-amazing-facial-recognition-algorithms/22402/

What other things can facial
recognition be used for?
What are the ethical
questions about this?

It’s actually really hard for computers to do things humans
consider simple! We train them using “machine learning”
https://www.linkedin.com/pulse/machine-learning-image-detectioncats-vs-dogs-amrith-kumar

"Once you start working in robotics, you
realize that things that kids learn to do
up to age 10 ... are actually the hardest
things to get a robot to do.“
-Pieter Abbeel, AI Researcher and Professor at UC Berkeley

https://www.youtube.com/watch?v=gy5g33S0Gzo
AI and Robotics

 Programming
 Any language is good
to start with! Just
start coding!
 Most common:
Python or R
 Database design, SQL
 Math
 Statistics
 Linear Algebra
Different places you can start if you’re interested in data science
 Research and Analysis
 Science involving data
collection and
interpretation
 Working with “messy” real
life data
 Business Analytics
 Data Mining
 Others
 Business / Communication
 Graphic Design

 Doing Data Science by Cathy O’Neil* & Rachel Schutt
 Data Smart by John Foreman* (uses Excel)
 Blogs & News Feeds (FlowingData.com is a good one to start with)
 Podcasts
 Twitter – look for curated lists of people to follow
https://twitter.com/BecomingDataSci/lists/women-in-data-science/members
 Online courses like those on DataCamp, Codecademy, Coursera
 TED talks on Data http://www.ted.com/search?q=data
 Practice w/public data sets on sites like data.gov
 Volunteer opportunities via DataKind
 Ask me…. I have plenty more!
Learning Resources

Find me online:
@becomingdatasci
“Data Science Renee”on Twitter
BecomingADataScientist.com
DataSciGuide.com
Becoming a Data Scientist Podcast &
Learning Club

Questions?
Or want a copy of these slides and links?
Renée Teate
renee@becomingadatascientist.com
@becomingdatasci on Twitter

"What is Data Science?" High School Version

Recommended

Recommended

More Related Content

What's hot

What's hot (20)

Similar to "What is Data Science?" High School Version

Similar to "What is Data Science?" High School Version (20)

Recently uploaded

Recently uploaded (20)

"What is Data Science?" High School Version

Editor's Notes