Data Stories
Jeremy Hinegardner
The slides that accompany my 2011 Scottish Ruby Conference talk 'Data Stories'.
Data Stories
1.
Data Stories, Jeremy Hinegardner, Scottish Ruby Conference 2011
2.
Data Analysis Anyone?
3.
Other People’s Data?
4.
Your Own Data?
5.
Weblog data, Database transactions, postgres performance metrics
6.
Public Data?
7.
UN Statistics website, data.gov.uk, data.gov, Scottish Home Survey, The Guardian, IMDB
8.
All 3?
9.
Collective Intellect
private customer data: internal chat, email lists
our internal data: processing metrics, queue sizes, database queries, performance metrics
public data: blogs, boards, tweets
10.
I have the __DATA__! What now?
11.
Scrub it. Get out your Cleaning Supplies... Ruby
12.
Majority of Your Time Will be Spent Cleaning The Data.
13.
Interesting Data Cleaning Problems?
14.
I want to hear about them.
15.
Cleaning IMDB
16.
A whole bunch of .gz files.
17.
Each with its own slightly different format.
18.
Extra junk around the data.
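A minimal sketch of reading one gzipped file while skipping the junk around the data. The start and end markers and the sample rows are hypothetical, invented for illustration; real files would need their own markers, and likely a different pair per file.

```ruby
require "zlib"
require "stringio"

# Hypothetical markers: each real file would have its own header/footer text.
START_MARKER = "=== DATA START ==="
END_MARKER   = "=== DATA END ==="

# Yield only the lines that sit between the two markers of a gzipped stream.
def data_lines(io)
  return enum_for(:data_lines, io) unless block_given?

  in_data = false
  Zlib::GzipReader.wrap(io) do |gz|
    gz.each_line do |line|
      line = line.chomp
      case line
      when START_MARKER then in_data = true
      when END_MARKER   then in_data = false
      else yield line if in_data
      end
    end
  end
end

# Build a tiny gzipped sample in memory so the sketch runs on its own.
sample = "Licence text\n=== DATA START ===\nMovie A\t90\nMovie B\t120\n=== DATA END ===\nFooter\n"
rows = data_lines(StringIO.new(Zlib.gzip(sample))).to_a
puts rows
```

For a file on disk, the same method would accept `File.open(path, "rb")` in place of the `StringIO`.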
19.
ISO-8859 -> UTF8
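The conversion can be sketched with Ruby's built-in transcoding. This assumes the input is ISO-8859-1 (Latin-1); the slide only says "ISO-8859", so the exact variant and the `latin1_to_utf8` helper name are assumptions.

```ruby
# Assumption: the source bytes are ISO-8859-1; swap the constant for another variant.
def latin1_to_utf8(bytes)
  # Label the raw bytes with their real encoding, then transcode to UTF-8.
  bytes.dup.force_encoding(Encoding::ISO_8859_1).encode(Encoding::UTF_8)
end

raw   = "Am\xE9lie".b   # "Amelie" with an e-acute, as Latin-1 bytes
clean = latin1_to_utf8(raw)
puts clean
```

Labelling the bytes first matters: calling `encode` on a string that Ruby already believes is UTF-8 would not reinterpret the Latin-1 bytes.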
20.
Dates...
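The slide does not say which date problems came up. As one illustration, here is a sketch that tries several formats in turn, from most to least specific. The three formats and the `parse_loose_date` name are assumptions, not taken from the talk.

```ruby
require "date"

# Hypothetical formats, ordered from most to least specific.
DATE_FORMATS = ["%d %B %Y", "%B %Y", "%Y"].freeze

# Return a Date for the first format that matches, or nil if none do.
def parse_loose_date(text)
  DATE_FORMATS.each do |format|
    begin
      return Date.strptime(text.strip, format)
    rescue ArgumentError
      next # this format did not match; try the next one
    end
  end
  nil
end

puts parse_loose_date("12 March 2004")
puts parse_loose_date("March 2004")
```

Missing parts default to the first of the month or year, so a partial date and a genuine 1 January are indistinguishable afterwards; recording the matched format alongside the value is one way to keep that information.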
21.
Black and White, Black and white, Black & White
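The three spellings above can be collapsed into one value with a small normalizer. This is a sketch only; the `normalize_colour` name and the choice of lowercase "black and white" as the canonical form are assumptions.

```ruby
# Collapse case, ampersands, and repeated spaces into one canonical spelling.
def normalize_colour(value)
  value.strip
       .downcase
       .gsub(/\s*&\s*/, " and ")
       .squeeze(" ")
end

variants   = ["Black and White", "Black and white", "Black & White"]
normalized = variants.map { |v| normalize_colour(v) }.uniq
puts normalized
```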
22.
Country Inconsistencies
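One common way to handle inconsistent country names is an alias table with a pass-through for unknown values. The aliases below are invented examples, not the inconsistencies the talk actually found.

```ruby
# Hypothetical alias table: keys are lowercased spellings, values are canonical names.
COUNTRY_ALIASES = {
  "usa"            => "USA",
  "u.s.a."         => "USA",
  "united states"  => "USA",
  "uk"             => "UK",
  "united kingdom" => "UK"
}.freeze

# Map a raw country string to its canonical name; unknown names pass through trimmed.
def canonical_country(name)
  trimmed = name.strip
  COUNTRY_ALIASES.fetch(trimmed.downcase) { trimmed }
end

puts canonical_country(" U.S.A. ")
puts canonical_country("France")
```

Passing unknown names through rather than raising keeps the load running; collecting those names separately would show which aliases are still missing.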
23.
Why are we doing this?
24.
To Learn Something New!
25.
Supercrunchers
26.
Outliers
27.
Freakonomics
28.
Superfreakonomics
29.
Science of Fear
30.
OK Cupid
31.
The Guardian
32.
Investigation Time.
33.
Internet Movie Database: Title, Running Times by Country, Year made, Country of Origin, Actresses/Actors, Language, Release Dates by Country, Production Companies, Colour, Genre
34.
Ruby + SQLite + R + iTerm
35.
Thanks! Jeremy Hinegardner jeremy@hinegardner.org @copiousfreetime