Data wrangling with open
source tools
Tony Hirst
Dept of Communication & Systems
The Open University, UK
Premises
“I take data
from wherever I
can get it”
1
“Appropriate
everything”
2
Conversations
with data
3
Visual
Conversations
with data
3
(Accession Plot)
@mediaczar
If a picture’s worth a
thousand words,
maybe it should take
as long to read?
Most learning
analytics won’t be
performed by
learning analytics
researchers
How can we help
people fashion
their own tools to
support data
conversations?
Recipes
site:open.ac.uk
Have a
conversation
with the data…
Ask the right
questions…
xkcd.com/1138
Sometimes a question
makes most sense in
the context of
questions previously
asked and answers
previously received
DATA
USERS
Educators
Learners
Planners
Marketers
Policymakers
Researchers
Press
NGOs
“DEVELOPERS”
Have
dashboard,
so what?
A tools and
issues
based view
DATA
TOOLS
USERS
PROBLEMS
Example – Google Fusion Tables
Fusion Table
https://www.google.com/fusiontables/DataSource?docid=1VKG7iCbFlsEYJzTuQppf4xoIqq1ABxWTdW6O_7o#rows:id=1
http://is.gd/qhuaoA
Walkthrough
http://blog.ouseful.info/2012/11/16/a-quick-look-at-gcsealevel-certificate-awards-market-share-by-examination-board/
http://is.gd/f9YAbG
DATA
TOOLS
USERS
PROBLEMS
Access/obtain data
Make sense of data
Ask specific questions of data
Communicate in a data-centric way
Load data
Clean data
Merge/enrich data
DATA
Issues
TOOLS
DATA
Other
TOOLS
Issues
TOOLS
“Tool based
programming”
A barrier to access
(for the tool user) is
data format
JSON XML CSV XLS
TSV
.db
HTML
PDF DOC TXT
GLUE LOGIC (Glue code)
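
As a minimal sketch of what such glue code might look like, the following Python converts CSV text into JSON using only the standard library. The function name `csv_to_json` and the sample row are illustrative assumptions, not something taken from the slides.

```python
import csv
import io
import json


def csv_to_json(csv_text):
    """Convert CSV text (with a header row) into a JSON array of objects."""
    reader = csv.DictReader(io.StringIO(csv_text))
    # Each CSV row becomes one JSON object keyed by the header names.
    return json.dumps([dict(row) for row in reader], indent=2)


# Hypothetical sample data, for illustration only.
sample = "name,country\nThe Open University,UK\n"
print(csv_to_json(sample))
```

Every value comes out as a string; converting numbers or dates would be a further glue step.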
=importHTML(URL, "table", N)
HTML
QUERYABLE
DATA
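
To make the idea concrete, here is a rough Python sketch of what a formula like importHTML does: pull the N-th table out of a page and hand it back as rows of cells. It is not how the spreadsheet implements it; the class and function names are made up for illustration, the sample page is invented, and nested tables are not handled.

```python
from html.parser import HTMLParser


class TableExtractor(HTMLParser):
    """Collect every <table> in a page as a list of rows of cell text."""

    def __init__(self):
        super().__init__()
        self.tables = []
        self._row = None   # cells of the row being read, if any
        self._cell = None  # text fragments of the cell being read, if any

    def handle_starttag(self, tag, attrs):
        if tag == "table":
            self.tables.append([])
        elif tag == "tr" and self.tables:
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.tables[-1].append(self._row)
            self._row = None


def import_html_table(html, n):
    """Return the n-th table on the page (1-based, like the formula)."""
    parser = TableExtractor()
    parser.feed(html)
    return parser.tables[n - 1]


# Invented sample page, for illustration only.
page = """
<table><tr><th>Institution</th><th>Endowment</th></tr>
<tr><td>Example University</td><td>100</td></tr></table>
"""
print(import_html_table(page, 1))
```

Once the table is a list of rows, it can be filtered, sorted, and joined like any other data.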
Try it…
Example Page
http://en.wikipedia.org/wiki/List_of_colleges_and_universities_in_the_United_States_by_endowment
http://is.gd/7Vbg6n
Google Spreadsheets as a database
Explorer
https://views.scraperwiki.com/run/google_spreadsheet_query/
http://is.gd/jiMJoh
Walkthrough
http://schoolofdata.org/2013/05/24/asking-questions-of-data-garment-factories-data-expedition/
http://is.gd/qJHihu
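
Treating a spreadsheet as a database means asking SQL-like questions of its rows and columns. The sketch below is not the Google query interface; it uses Python's built-in `sqlite3` module as a stand-in, and the table name, column names, and values are all invented for illustration.

```python
import sqlite3

# Hypothetical rows standing in for the cells of a spreadsheet.
rows = [
    ("Site A", "Town X", 1200),
    ("Site B", "Town X", 800),
    ("Site C", "Town Y", 500),
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sheet (name TEXT, town TEXT, workers INTEGER)")
conn.executemany("INSERT INTO sheet VALUES (?, ?, ?)", rows)


def workers_by_town(connection):
    """Total workers per town, largest total first."""
    query = (
        "SELECT town, SUM(workers) AS total "
        "FROM sheet GROUP BY town ORDER BY total DESC"
    )
    return connection.execute(query).fetchall()


print(workers_by_town(conn))
```

Each new question is just another query against the same table, which is what makes the conversation with the data possible.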
=importData(URL)
HTML
INTERACTIVE
DASHBOARD
Google Charts
Google Chart
Visualization API
https://code.google.com/apis/ajax/playground/
http://is.gd/TTHIUh
Google
Visualisation
API
googleVis
(R)
https://developers.facebook.com/docs/reference/api/examples/
http://is.gd/7cRnvS
A barrier to access
(for the tool user) is
data shape
A barrier to access
(for the tool user) is
data cleanliness
Questions of
identity
The Open University
Open University
OU
Open Uni
Open University, UK
NORMALISATION/RECONCILIATION
Reconciliation to
a canonical name
and/or to a
unique identifier
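
A minimal sketch of reconciliation, assuming a hand-built lookup: the variant spellings are the ones listed on the earlier slide, while the identifier `org:0001` and the function name `reconcile` are made-up placeholders.

```python
# Variant spellings from the slide; the identifier is a hypothetical placeholder.
CANONICAL_NAME = "The Open University"
HYPOTHETICAL_ID = "org:0001"

VARIANTS = {
    "the open university",
    "open university",
    "ou",
    "open uni",
    "open university, uk",
}


def reconcile(name):
    """Return (canonical name, identifier) for a known variant, else None."""
    # Normalise case and collapse runs of whitespace before looking up.
    key = " ".join(name.lower().split())
    if key in VARIANTS:
        return CANONICAL_NAME, HYPOTHETICAL_ID
    return None


print(reconcile("Open Uni"))
print(reconcile("Some Other Place"))
```

A lookup table like this only covers variants someone has already seen; tools such as OpenRefine offer clustering to suggest candidates for the table.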
A stumbling block
(for the data user)
is data enrichment
A stumbling block
(for the data user)
is joining datasets
A stumbling block
(for the data user)
is joining partially
matched data
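
One possible way to join partially matched data is approximate string matching. The sketch below uses `difflib.get_close_matches` from the Python standard library; the function name `fuzzy_join`, the cutoff value, and the sample figures are assumptions for illustration.

```python
import difflib


def fuzzy_join(left, right, cutoff=0.6):
    """Pair each key in `left` with its closest key in `right`, if any.

    Both arguments are dicts keyed by name. Returns a list of tuples:
    (left_key, matched_right_key_or_None, left_value, right_value_or_None).
    """
    joined = []
    right_keys = list(right)
    for key, value in left.items():
        match = difflib.get_close_matches(key, right_keys, n=1, cutoff=cutoff)
        if match:
            joined.append((key, match[0], value, right[match[0]]))
        else:
            joined.append((key, None, value, None))
    return joined


# Hypothetical datasets whose keys only partially agree.
counts = {"Open University": 100, "Example College": 20}
locations = {"The Open University": "Milton Keynes"}

for row in fuzzy_join(counts, locations):
    print(row)
```

Keeping the unmatched rows (with `None`) makes it easy to review what the fuzzy match missed rather than silently dropping it.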
Rolling your own
interactive data
exploration tools
R Shiny
Apps
ui.R server.R
RCharts
Many chart tools
do the work for
you if the data is
in the right shape
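
Getting data into the right shape often means going from a wide layout (one column per measure) to a long one (one row per measure). A minimal sketch in plain Python follows; the function name `wide_to_long`, the field names, and the sample values are invented for illustration.

```python
def wide_to_long(rows, id_field, var_name="variable", value_name="value"):
    """Turn one-column-per-measure rows into one-row-per-measure records."""
    long_rows = []
    for row in rows:
        for field, value in row.items():
            if field == id_field:
                continue  # the identifier is repeated on every output row
            long_rows.append(
                {id_field: row[id_field], var_name: field, value_name: value}
            )
    return long_rows


# Hypothetical wide table: one column per year.
wide = [{"board": "Board A", "2011": 10, "2012": 12}]

for record in wide_to_long(wide, "board"):
    print(record)
```

Many charting libraries expect the long form, with one column naming the series and another holding the value.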
DATA
TOOLS
USERS PROBLEMS
Just ask…
ask.SchoolOfData.org
blog.ouseful.info
@psychemedia

Editor's Notes

  1. I am not a journalist, but it seems to me that a large part of your work, and indeed a large part of the work of a scientist or an analyst, is in asking the right questions of a source, and knowing how to frame those questions. The data journalist knows how to ask questions of data.
  2. Also – high incidence of crime around police stations (no location, so police station used as default location); Russell Square as a murder hotspot.
  3. Another nice example of this, and one used by many advocates of data visualisation, is the famous example of Anscombe’s quartet: four sets of two-dimensional data with some interesting properties.
  4. For example, many of the “classic” summary statistics for the corresponding columns in these data sets are to all intents and purposes the same.
  5. But when we look at the datasets as a set of scatterplots, we see how the data tells very different stories.
  6. People learn the skills they need, as they need them.
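
The point made about Anscombe’s quartet in notes 3 to 5 can be checked with a short Python sketch. The values are the published quartet; the helper names `pearson` and `summarise` are assumptions made for this illustration.

```python
from statistics import mean, variance

X123 = [10, 8, 13, 9, 11, 14, 6, 4, 12, 7, 5]
ANSCOMBE = {
    "I": (X123, [8.04, 6.95, 7.58, 8.81, 8.33, 9.96, 7.24, 4.26, 10.84, 4.82, 5.68]),
    "II": (X123, [9.14, 8.14, 8.74, 8.77, 9.26, 8.10, 6.13, 3.10, 9.13, 7.26, 4.74]),
    "III": (X123, [7.46, 6.77, 12.74, 7.11, 7.81, 8.84, 6.08, 5.39, 8.15, 6.42, 5.73]),
    "IV": (
        [8, 8, 8, 8, 8, 8, 8, 19, 8, 8, 8],
        [6.58, 5.76, 7.71, 8.84, 8.47, 7.04, 5.25, 12.50, 5.56, 7.91, 6.89],
    ),
}


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sum((x - mx) ** 2 for x in xs) ** 0.5
    sy = sum((y - my) ** 2 for y in ys) ** 0.5
    return cov / (sx * sy)


def summarise(xs, ys):
    """Classic summary statistics for one x/y dataset."""
    return {
        "mean_x": mean(xs),
        "mean_y": mean(ys),
        "var_x": variance(xs),
        "var_y": variance(ys),
        "r": pearson(xs, ys),
    }


for name, (xs, ys) in ANSCOMBE.items():
    stats = summarise(xs, ys)
    print(name, {k: round(v, 2) for k, v in stats.items()})
```

The printed summaries are nearly indistinguishable across the four sets; only plotting the points shows how different they are.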