Hack Kid Con - Learn to be a Data Scientist for $1

Attempt to inspire some kids to pay attention in Math and Science classes so they can get a good job and help fill the skills gap in the years to come.

Published in: Technology, Education

Transcript

  • 1. LEARN TO BE A DATA SCIENTIST FOR $1 Hack Kid Conference - April 2014 by Adrian Cockcroft, Battery Ventures
  • 2. A BIG new problem for a new generation
  • 3. Now A BIG new problem for a new generation
  • 4. Now A BIG new problem for a new generation Your future job as a Data Scientist
  • 5. WHAT DOES A DATA SCIENTIST DO?
  • 6. The hive mind map shows popular Twitter hashtags for the last 7 days and how they are connected http://hivemindmap.com/?#
  • 7. HIVE MIND MAP A mind-map of what’s happening on Twitter. Thanks to Mark Harwood for these slides and the Hive Mind Map http://www.infoq.com/presentations/elasticsearch-revealing-uncommonly-common
  • 8. Connections The thickness of a line between hashtags is based on the strength of connection. Tip: Strength of connection is the number of tweets with both tags vs the number with either tag - see “Jaccard similarity coefficient”
  • 9. Top tweets The most popular tweets for a tag are sorted based on the number of “retweets”
  • 10. When? The rise and fall of each hashtag’s popularity can be shown over time
  • 11. Calendar summary Tags that “peak” together are grouped into events on a calendar. Tip: Peaks are detected using standard deviations. Only tags with a single peak are chosen as events. Tip: Tags that rise and fall in popularity at the same time are detected using Pearson’s Correlation
  • 12. What makes this possible? • Free software (Lucene, Java, Eclipse, Gephi, Tomcat, d3, Google Analytics…) • Free data (millions of users’ tweets from Twitter’s 1% sample feed) • “Cloud” computing (rented server) • Smarter web browsers (visualizations using HTML5’s SVG/Canvas) • All the friendly folks on the internet (e.g. http://stackoverflow.com/questions/14799842) • Some imagination…
  • 13. Opportunities in Data Science • We are all generating volumes of data never seen before • You can recycle the behaviors of billions of people into more intelligent systems • Customer purchases can be used for product recommendations • User searches can be used for spelling corrections • Reader clicks can influence the trending news • Spotify activity is used to make music recommendations • The tools have never been cheaper • It has never been easier to find help in developing systems
  • 14. …one more thing.. I’m writing these slides for you while on my annual snowboarding trip to Canada. Data science pays well ;-) Wish you were here…
  • 15. HOW CAN A KID LEARN BIG DATA FOR $1?
  • 16. BIG DATA IN THE CLOUD WITH AMAZON EMR https://www.youtube.com/watch?v=S6Ja55n-o0M
  • 17. LESS THAN $1 After running two of the EMR examples, creating 6 computers in the cloud to do the analysis for up to an hour each
  • 18. GOOGLE BIGQUERY https://demobigquery.appspot.com/
  • 19. BAY AREA WEATHER https://demobigquery.appspot.com/
  • 20. WHY THE FLINTSTONES? https://demobigquery.appspot.com/
  • 21. MEASURING KIDS How good are you at Math and Science? Is it getting better or worse?
  • 22. SCHOOL DATA https://www.data.gov/ http://eddataexpress.ed.gov/state-report.cfm/state/CA/
  • 23. ACHIEVEMENT SCORES Download results into Excel to analyze and draw graphs
  • 24. DOWNLOADED DATA Needed some clean-up. Made sure grade was consistent (4, 8, HS) for all results, and created a short Subject column
  • 25. SCORES 2004-2012 Elementary - 4th Grade, Middle School - 8th Grade, High School
  • 26. SCORES 2004-2012 Elementary - 4th Grade, Middle School - 8th Grade, High School About half of high school students in California are proficient at Math and Science
  • 27. CALIFORNIA SCHOOLS Science and Math Scores at Elementary, Middle and High School Level
  • 28. CALIFORNIA SCHOOLS Science and Math Scores at Elementary, Middle and High School Level Scores have been getting better. Good!
  • 29. CALIFORNIA SCHOOLS Science and Math Scores at Elementary, Middle and High School Level Scores have been getting better. Good! Maybe the Math tests were harder for everyone that year?
  • 30. CALIFORNIA SCHOOLS Science and Math Scores at Elementary, Middle and High School Level Scores have been getting better. Good! 4th Grade “cohort” in 2004 was 8th Grade in 2008. Maybe the Math tests were harder for everyone that year?
  • 31. DATA SCIENCE WITH EXCEL Pivot tables let you rearrange data and trend lines measure the slope
  • 32. LEARN TO BE A DATA SCIENTIST FOR $1 • Everything is being measured • The latest data science tools are available to anyone for pennies • There is lots of freely available data • Pay attention in math and science class, play around with EMR and BigQuery and get an interesting and well-paid job as a data scientist!
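
The slides name four pieces of math without showing them: the Jaccard similarity coefficient (slide 8), peak detection with standard deviations and Pearson's correlation (slide 11), and the slope of a trend line (slide 31). The sketch below is one way to try them in plain Python. It is not the Hive Mind Map's actual code or the Excel workbook from the talk. The function names, the 2-standard-deviation threshold, and all of the numbers are made up for illustration.

```python
from statistics import mean, pstdev


def jaccard(tweets_a: set, tweets_b: set) -> float:
    """Slide 8: tweets with both tags divided by tweets with either tag."""
    union = tweets_a | tweets_b
    return len(tweets_a & tweets_b) / len(union) if union else 0.0


def pearson(xs: list, ys: list) -> float:
    """Slide 11: close to +1 when two series rise and fall together."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


def peaks(counts: list, threshold: float = 2.0) -> list:
    """Slide 11: positions where a count is more than `threshold`
    standard deviations above the mean (the threshold is an assumption)."""
    mu, sigma = mean(counts), pstdev(counts)
    if sigma == 0:
        return []
    return [i for i, c in enumerate(counts) if (c - mu) / sigma > threshold]


def slope(xs: list, ys: list) -> float:
    """Slide 31: least-squares slope of a straight trend line."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    return cov / sum((x - mx) ** 2 for x in xs)


# Made-up example data, NOT real tweets or real California scores.
tag_a_tweets = {1, 2, 3, 4}           # ids of tweets using hashtag A
tag_b_tweets = {3, 4, 5}              # ids of tweets using hashtag B
daily_counts = [5, 6, 5, 7, 6, 60, 6, 5]
years = [2004, 2006, 2008, 2010, 2012]
percent_proficient = [40, 42, 44, 46, 48]

print("Jaccard:", jaccard(tag_a_tweets, tag_b_tweets))
print("Pearson:", pearson([1, 2, 3, 4], [2, 4, 6, 8]))
print("Peak days:", peaks(daily_counts))
print("Points gained per year:", slope(years, percent_proficient))
```

With this made-up data, two of the five distinct tweets carry both tags, day 5 stands out as the only peak, and the trend line rises one point per year. Swapping in real numbers from a spreadsheet download is the natural next step.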
