LEARN TO BE A DATA SCIENTIST FOR $1
Hack Kid Conference - April 2014	

by Adrian Cockcroft	

Battery Ventures
A BIG new problem for a new generation
Now
Your future job as a Data Scientist
WHAT DOES A DATA SCIENTIST DO?
The Hive Mind Map shows popular Twitter hashtags
for the last 7 days and how they are connected
http://hivemindmap.com/?#
HIVE MIND MAP
A mind-map of what’s happening on Twitter
Thanks to Mark Harwood for these slides and the Hive Mind Map
http://www.infoq.com/presentations/elasticsearch-revealing-uncommonly-common
Connections
The thickness of a line between hashtags is
based on the strength of connection
Tip:
Strength of connection is the number of tweets with both tags divided by the
number of tweets with either tag (see “Jaccard similarity coefficient”)
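The Jaccard similarity coefficient named in the tip above is the size of the overlap between two sets divided by the size of their union. The sketch below is only an illustration, not the Hive Mind Map's actual code: the function name `jaccard` and the tweet IDs are invented, and each hashtag is assumed to be represented by the set of IDs of the tweets that contain it.

```python
def jaccard(tweets_a, tweets_b):
    """Jaccard similarity: size of the overlap divided by size of the union."""
    a, b = set(tweets_a), set(tweets_b)
    union = a | b
    if not union:
        return 0.0  # convention chosen here for two empty sets
    return len(a & b) / len(union)


# Hypothetical tweet IDs for two hashtags (invented for illustration).
snow = {1, 2, 3, 4}
canada = {3, 4, 5, 6}

# 2 tweets carry both tags, 6 tweets carry at least one of them.
strength = jaccard(snow, canada)
print(strength)
```

A value near 1 would mean the two tags almost always appear together (a thick line on the map), and a value near 0 would mean they rarely do.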
Top tweets
The most popular tweets for a tag are sorted
based on the number of “retweets”
When?
The rise and fall of each hashtag’s popularity
can be shown over time
Calendar summary
Tags that “peak” together are grouped into
events on a calendar
Tip:
Peaks are detected using standard deviations. Only tags with a single peak are
chosen as events
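One way to detect peaks with standard deviations can be sketched as follows. The slides do not give the exact rule, so this is an assumption: a day counts as a peak when its tweet count is more than `threshold` standard deviations above the tag's average, and the value 2.0, the function names, and the counts are all made up for illustration.

```python
from statistics import mean, pstdev


def find_peaks(counts, threshold=2.0):
    """Return indexes of days whose count is more than `threshold`
    standard deviations above the mean of the series."""
    mu = mean(counts)
    sigma = pstdev(counts)
    if sigma == 0:
        return []  # a flat series has no peaks
    return [i for i, c in enumerate(counts) if (c - mu) / sigma > threshold]


def is_single_peak(counts, threshold=2.0):
    """True when exactly one unbroken run of days is above the threshold."""
    peaks = find_peaks(counts, threshold)
    if not peaks:
        return False
    # A single peak may span several consecutive days.
    return all(b - a == 1 for a, b in zip(peaks, peaks[1:]))


# Hypothetical daily tweet counts for one hashtag (invented numbers).
daily_counts = [10, 12, 11, 9, 95, 13, 10]
print(find_peaks(daily_counts))
print(is_single_peak(daily_counts))
```

Under these assumptions, a tag that spikes on one day would qualify as an event, while a tag that spikes on two separate days would not.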
Tip:
Tags that rise and fall in popularity at the same time are detected using
Pearson’s Correlation
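Pearson's correlation gives a number between -1 and 1 that says how closely two series move together. The sketch below is a plain-Python illustration, not the code behind the calendar: the function name `pearson` and the daily counts are invented, and returning 0.0 for a flat series is a convention chosen here.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length series."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two series of the same length (at least 2 points)")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # convention chosen here for a series that never changes
    return cov / sqrt(var_x * var_y)


# Hypothetical daily counts for three hashtags (invented numbers).
tag_a = [1, 2, 8, 3, 1]
tag_b = [2, 4, 16, 6, 2]  # same shape as tag_a, twice as big
tag_c = [8, 3, 1, 2, 8]   # quiet when tag_a is busy

print(pearson(tag_a, tag_b))  # close to 1: they rise and fall together
print(pearson(tag_a, tag_c))  # negative: they move in opposite directions
```

Tags whose correlation is close to 1 would be candidates for grouping into the same event.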
What makes this possible?
• Free software (Lucene, Java, Eclipse, Gephi, Tomcat, d3, Google Analytics…)
• Free data (millions of users’ tweets from Twitter’s 1% sample feed)
• “Cloud” computing (rented server)
• Smarter web browsers (visualizations using HTML5’s SVG/Canvas)
• All the friendly folks on the internet (e.g. http://stackoverflow.com/questions/14799842)
• Some imagination…
Opportunities in Data Science
• We are all generating volumes of data never seen before
• You can recycle the behaviors of billions of people into
more intelligent systems:
  • Customer purchases can be used for product recommendations
  • User searches can be used for spelling corrections
  • Reader clicks can influence the trending news
  • Spotify activity is used to make music recommendations
• The tools have never been cheaper
• It has never been easier to find help in developing systems
…one more thing…
I’m writing these slides for you
while on my annual snowboarding
trip to Canada.
Data science pays well ;-)
Wish you were here…
HOW CAN A KID
LEARN BIG DATA
FOR $1?
BIG DATA IN THE CLOUD WITH AMAZON EMR
https://www.youtube.com/watch?v=S6Ja55n-o0M
LESS THAN $1
After running two of the EMR examples, creating 6 computers in the cloud
to do the analysis for up to an hour each
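The numbers on this slide allow a quick back-of-the-envelope check. The sketch below uses only the figures stated above (under $1 in total, 6 computers, up to an hour each) and assumes, for simplicity, that every computer is charged for one full hour; no real price list is used.

```python
total_cost_dollars = 1.00  # upper bound from the slide ("less than $1")
computers = 6              # from the slide
hours_each = 1             # assumption: each computer is charged for a full hour

computer_hours = computers * hours_each
max_average_rate = total_cost_dollars / computer_hours
print(f"under ${max_average_rate:.2f} per computer-hour on average")
```

In other words, under that assumption each rented computer cost less than about 17 cents for its hour of work.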
GOOGLE BIGQUERY
https://demobigquery.appspot.com/
BAY AREA WEATHER
https://demobigquery.appspot.com/
WHY THE FLINTSTONES?
https://demobigquery.appspot.com/
MEASURING KIDS
How good are you at Math and Science? Is it getting better or worse?
SCHOOL DATA
https://www.data.gov/	

http://eddataexpress.ed.gov/state-report.cfm/state/CA/
ACHIEVEMENT SCORES
Download results into Excel to analyze and draw graphs
DOWNLOADED DATA
Needed some clean-up. Made sure grade was consistent (4, 8, HS) for all
results, and created a short Subject column
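The clean-up described above was done by hand in Excel, but the same idea can be sketched in Python. The raw labels below are hypothetical (the slides do not show what the downloaded file looked like), and the helper names `normalize_grade` and `short_subject` are invented for this example.

```python
def normalize_grade(raw):
    """Map a raw grade label to one of '4', '8', or 'HS'."""
    text = str(raw).strip().lower()
    if "high" in text or text == "hs":
        return "HS"
    digits = "".join(ch for ch in text if ch.isdigit())
    if digits in ("4", "8"):
        return digits
    raise ValueError(f"unrecognized grade label: {raw!r}")


def short_subject(raw):
    """Shorten a subject description to 'Math' or 'Science'."""
    text = str(raw).lower()
    if "math" in text:
        return "Math"
    if "science" in text:
        return "Science"
    raise ValueError(f"unrecognized subject: {raw!r}")


# Hypothetical raw rows (grade label, subject label); not the real download.
rows = [
    ("Grade 4", "Mathematics Proficiency"),
    ("8th grade", "Science Proficiency"),
    ("High School", "Mathematics Proficiency"),
]
cleaned = [(normalize_grade(g), short_subject(s)) for g, s in rows]
print(cleaned)
```

Raising an error on an unknown label, instead of guessing, makes it harder for a messy row to slip into the graphs unnoticed.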
SCORES 2004-2012
Elementary - 4th Grade, Middle School - 8th Grade, High School
About half of high school students in California are proficient at Math and Science
CALIFORNIA SCHOOLS
Science and Math Scores at Elementary, Middle and High School Level
Scores have been getting better. Good!
Maybe the Math tests were harder for everyone that year?
4th Grade “cohort” in 2004 was 8th Grade in 2008
DATA SCIENCE WITH EXCEL
Pivot tables let you rearrange data and trend lines measure the slope
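A linear trend line is a least-squares straight-line fit, and its slope says how fast the values change per step. The sketch below computes that slope in plain Python as an illustration of what the spreadsheet does; the function name `trend_slope` is invented and the percentages are made up, not the real California results.

```python
def trend_slope(xs, ys):
    """Least-squares slope of the straight line fitted through (x, y) points."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two series of the same length (at least 2 points)")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("all x values are the same")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx


# Made-up percentages for illustration only, NOT the real results.
years = [2004, 2006, 2008, 2010, 2012]
percent_proficient = [40, 43, 46, 49, 52]

slope = trend_slope(years, percent_proficient)
print(slope)  # percentage points gained per year
```

A positive slope means scores are improving over time, and a negative slope means they are getting worse.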
LEARN TO BE A DATA SCIENTIST FOR $1
• Everything is being measured
• The latest data science tools are available to anyone for pennies
• There is lots of freely available data
• Pay attention in math and science class, play around with EMR and BigQuery,
and get an interesting and well-paid job as a data scientist!

Transcript: Selling digital books in 2024: Insights from industry leaders - T...Transcript: Selling digital books in 2024: Insights from industry leaders - T...
Transcript: Selling digital books in 2024: Insights from industry leaders - T...
 
FIDO Alliance Osaka Seminar: Passkeys and the Road Ahead.pdf
FIDO Alliance Osaka Seminar: Passkeys and the Road Ahead.pdfFIDO Alliance Osaka Seminar: Passkeys and the Road Ahead.pdf
FIDO Alliance Osaka Seminar: Passkeys and the Road Ahead.pdf
 
Essentials of Automations: Optimizing FME Workflows with Parameters
Essentials of Automations: Optimizing FME Workflows with ParametersEssentials of Automations: Optimizing FME Workflows with Parameters
Essentials of Automations: Optimizing FME Workflows with Parameters
 
Leading Change strategies and insights for effective change management pdf 1.pdf
Leading Change strategies and insights for effective change management pdf 1.pdfLeading Change strategies and insights for effective change management pdf 1.pdf
Leading Change strategies and insights for effective change management pdf 1.pdf
 
Elevating Tactical DDD Patterns Through Object Calisthenics
Elevating Tactical DDD Patterns Through Object CalisthenicsElevating Tactical DDD Patterns Through Object Calisthenics
Elevating Tactical DDD Patterns Through Object Calisthenics
 
From Daily Decisions to Bottom Line: Connecting Product Work to Revenue by VP...
From Daily Decisions to Bottom Line: Connecting Product Work to Revenue by VP...From Daily Decisions to Bottom Line: Connecting Product Work to Revenue by VP...
From Daily Decisions to Bottom Line: Connecting Product Work to Revenue by VP...
 
Epistemic Interaction - tuning interfaces to provide information for AI support
Epistemic Interaction - tuning interfaces to provide information for AI supportEpistemic Interaction - tuning interfaces to provide information for AI support
Epistemic Interaction - tuning interfaces to provide information for AI support
 

Hack Kid Con - Learn to be a Data Scientist for $1

  • 1. LEARN TO BE A DATA SCIENTIST FOR $1. Hack Kid Conference - April 2014, by Adrian Cockcroft, Battery Ventures
  • 2.
  • 3.
  • 4.
  • 5. A BIG new problem for a new generation
  • 6. Now A BIG new problem for a new generation
  • 7. Now A BIG new problem for a new generation Your future job as a Data Scientist
  • 8.
  • 9. WHAT DOES A DATA SCIENTIST DO?
  • 10.
  • 11. The hive mind map shows popular Twitter hashtags for the last 7 days and how they are connected: http://hivemindmap.com/?#
  • 12. HIVE MIND MAP: A mind-map of what’s happening on Twitter. Thanks to Mark Harwood for these slides and the Hive Mind Map: http://www.infoq.com/presentations/elasticsearch-revealing-uncommonly-common
  • 13. Connections: The thickness of a line between hashtags is based on the strength of connection. Tip: Strength of connection is the number of tweets with both tags compared with the number of tweets with either tag - see “Jaccard similarity coefficient”.
  • 14. Top tweets: The most popular tweets for a tag are sorted based on the number of “retweets”.
  • 15. When? The rise and fall of each hashtag’s popularity can be shown over time.
  • 16. Calendar summary: Tags that “peak” together are grouped into events on a calendar. Tip: Peaks are detected using standard deviations. Only tags with a single peak are chosen as events. Tip: Tags that rise and fall in popularity at the same time are detected using Pearson’s correlation.
  • 17. What makes this possible?
      • Free software (Lucene, Java, Eclipse, Gephi, Tomcat, d3, Google Analytics…)
      • Free data (millions of users’ tweets from Twitter’s 1% sample feed)
      • “Cloud” computing (rented server)
      • Smarter web browsers (visualizations using HTML5’s SVG/Canvas)
      • All the friendly folks on the internet (e.g. http://stackoverflow.com/questions/14799842)
      • Some imagination…
  • 18. Opportunities in Data Science
      • We are all generating volumes of data never seen before
      • You can recycle the behaviors of billions of people into more intelligent systems:
          • Customer purchases can be used for product recommendations
          • User searches can be used for spelling corrections
          • Reader clicks can influence the trending news
          • Spotify activity is used to make music recommendations
      • The tools have never been cheaper
      • It has never been easier to find help in developing systems
  • 19. …one more thing… I’m writing these slides for you while on my annual snowboarding trip to Canada. Data science pays well ;-) Wish you were here…
  • 20. HOW CAN A KID LEARN BIG DATA FOR $1?
  • 21. BIG DATA IN THE CLOUD WITH AMAZON EMR: https://www.youtube.com/watch?v=S6Ja55n-o0M
  • 22. LESS THAN $1: The total cost after running two of the EMR examples, creating 6 computers in the cloud to do the analysis for up to an hour each
  • 26. MEASURING KIDS: How good are you at Math and Science, and is it getting better or worse?
  • 28. ACHIEVEMENT SCORES: Download results into Excel to analyze and draw graphs
  • 29. DOWNLOADED DATA: Needed some clean-up. Made sure grade was consistent (4, 8, HS) for all results, and created a short Subject column
  • 30. SCORES 2004-2012: Elementary - 4th Grade, Middle School - 8th Grade, High School
  • 31. SCORES 2004-2012: Elementary - 4th Grade, Middle School - 8th Grade, High School. About half of high school students in California are proficient at Math and Science
  • 32. CALIFORNIA SCHOOLS: Science and Math Scores at Elementary, Middle and High School Level
  • 33. CALIFORNIA SCHOOLS: Science and Math Scores at Elementary, Middle and High School Level. Scores have been getting better. Good!
  • 34. CALIFORNIA SCHOOLS: Science and Math Scores at Elementary, Middle and High School Level. Scores have been getting better. Good! Maybe the Math tests were harder for everyone that year?
  • 35. CALIFORNIA SCHOOLS: Science and Math Scores at Elementary, Middle and High School Level. Scores have been getting better. Good! 4th Grade “cohort” in 2004 was 8th Grade in 2008. Maybe the Math tests were harder for everyone that year?
  • 36. DATA SCIENCE WITH EXCEL: Pivot tables let you rearrange data and trend lines measure the slope
  • 37. LEARN TO BE A DATA SCIENTIST FOR $1
      • Everything is being measured
      • The latest data science tools are available to anyone for pennies
      • There is lots of freely available data
      • Pay attention in math and science class, play around with EMR and BigQuery, and get an interesting and well-paid job as a data scientist!
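
The “Jaccard similarity coefficient” tip on slide 13 can be tried out in a few lines of Python. This is only a sketch: the function name `jaccard` and the tweet ids are made up for illustration, and the Hive Mind Map itself may compute connection strength differently.

```python
def jaccard(tweets_with_a: set, tweets_with_b: set) -> float:
    """Jaccard similarity: tweets with both tags / tweets with either tag."""
    either = tweets_with_a | tweets_with_b
    if not either:
        # No tweets at all: treat the two tags as unconnected.
        return 0.0
    both = tweets_with_a & tweets_with_b
    return len(both) / len(either)


# Made-up tweet ids, for illustration only.
python_tweets = {1, 2, 3, 4}   # tweets tagged #python
bigdata_tweets = {3, 4, 5}     # tweets tagged #bigdata

strength = jaccard(python_tweets, bigdata_tweets)
print(f"connection strength: {strength:.2f}")
```

A strength near 1 means the two tags almost always appear together, and a strength near 0 means they rarely do, which is what the line thickness on the map is showing.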
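
The two tips on slide 16 (peaks found with standard deviations, and tags that rise and fall together found with Pearson’s correlation) can be sketched the same way. Everything specific here is an assumption: the daily counts are invented, and the slides do not say how many standard deviations count as a peak, so the cutoff `k=2.0` is just a guess.

```python
from statistics import mean, pstdev


def pearson(xs, ys):
    """Pearson's correlation between two equal-length lists of numbers."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sum((x - mx) ** 2 for x in xs) ** 0.5
    sy = sum((y - my) ** 2 for y in ys) ** 0.5
    if sx == 0 or sy == 0:
        return 0.0  # a flat series has no meaningful correlation
    return cov / (sx * sy)


def peak_days(counts, k=2.0):
    """Indexes of days whose count is more than k standard deviations above the mean."""
    m, s = mean(counts), pstdev(counts)
    if s == 0:
        return []
    return [day for day, c in enumerate(counts) if c > m + k * s]


def has_single_peak(counts, k=2.0):
    """True if the peak days form exactly one unbroken run."""
    days = peak_days(counts, k)
    return bool(days) and days[-1] - days[0] == len(days) - 1


# Invented tweets-per-day counts for three hashtags over one week.
tag_a = [5, 6, 5, 40, 6, 5, 4]      # spikes on day 3
tag_b = [3, 4, 3, 30, 5, 3, 2]      # spikes on the same day
tag_c = [20, 18, 15, 12, 9, 7, 5]   # slowly fading, no spike

print(peak_days(tag_a), has_single_peak(tag_a), has_single_peak(tag_c))
print(round(pearson(tag_a, tag_b), 3), round(pearson(tag_a, tag_c), 3))
```

Tags like `tag_a` and `tag_b`, which each have a single peak and are strongly correlated, are the kind that would be grouped into one calendar event.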
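
Slide 36 says that trend lines measure the slope. Outside Excel, a straight-line (least-squares) fit gives the same kind of number. A minimal sketch, assuming made-up scores rather than the real California results; `trend_slope` is a hypothetical helper name.

```python
def trend_slope(xs, ys):
    """Slope of the least-squares straight line through the points (xs, ys)."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    num = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    den = sum((x - mx) ** 2 for x in xs)
    if den == 0:
        raise ValueError("need at least two different x values")
    return num / den


# Made-up "percent proficient" scores, for illustration only.
years = [2004, 2005, 2006, 2007, 2008]
percent_proficient = [40.0, 42.0, 44.0, 46.0, 48.0]

slope = trend_slope(years, percent_proficient)
print(f"scores change by {slope:.1f} points per year")
```

A positive slope means scores are getting better over time, and a negative slope means they are getting worse.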