David Herzog
Missouri School of Journalism and NICAR
   Locating the data

   Obtaining the data

   Evaluating the data

   Working with the data

   Visualizing the data
 “Database state of mind”


 Data has to exist. Where?
  Online
  Offline
 Government websites
  Data.gov
  U.S. Census Bureau
  FDIC
  Missouri Data Portal
  Missouri Accountability Portal
 U.S. agency FOIA pages
  Drug Enforcement Administration


 NGO sites
  Right-to-Know Network
  OpenMissouri.org
  NICAR database library
  ALA state agency databases wiki
 Commercial services
  Socrata
  Infochimps
  Geocommons
  Foreclosure Radar
  Oil Price Information Service
  Search Systems
  Junar
 Academic data catalogs
  ICPSR


 Forms
  Forms.gov
  Web forms
   ▪ Columbia parade permits
 Records retention schedules


 Reports
  State auditor
  U.S. Government Accountability Office
  U.S. Inspectors General
 Google advanced search
  Look for data files
  Look for key words
  Look only on government sites
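
The three search tips above can be combined into a single query string. The sketch below is only an illustration: the helper name build_query and the example key words are hypothetical, and it assumes the search engine supports the filetype: and site: operators, as Google does.

```python
def build_query(keywords, filetype=None, site=None):
    """Assemble a search string from key words plus optional operators."""
    parts = [f'"{keywords}"']              # quote the key words as a phrase
    if filetype:
        parts.append(f"filetype:{filetype}")  # look for data files
    if site:
        parts.append(f"site:{site}")          # look only on certain sites
    return " ".join(parts)


# Hypothetical example: spreadsheets about parade permits on government sites.
query = build_query("parade permits", filetype="xls", site=".gov")
print(query)
```

Swapping the filetype value (for example csv or xlsx) is a quick way to try several data formats for the same key words.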
 Data entry
   In the field
   At the office


 Printouts/reports


 Inspection forms
 Download it


 Write or request a scraper with ScraperWiki


 Convert a PDF with
   CometDocs
   Zamzar


 Just ask for it
 U.S. Freedom of Information Act
  Passed in 1966
  Amended in 1996 to include electronic records


 State open-records statutes
  Missouri Sunshine Law
 Get the roadmap!
  Record layout
  File layout
  Data dictionary
  Code sheet


 Metadata
  Data about the data
 Look at it immediately when you get it
  Is it what you asked for/expected?
  How many rows/records of data?
  Is the file format OK?
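
A first look at a new file can be as simple as counting rows and checking which columns arrived. The following is a minimal sketch, assuming the data is a comma-delimited text file with a header row; the sample data, column names and the first_look helper are all made up for illustration.

```python
import csv
import io

# Hypothetical sample standing in for a file you just received.
SAMPLE = """permit_id,applicant,date
1,Smith,2012-03-01
2,,2012-03-04
3,Jones,
"""


def first_look(handle):
    """Return the column names, the row count and blank cells per column."""
    reader = csv.DictReader(handle)
    rows = list(reader)
    blanks = {
        name: sum(1 for row in rows if not (row[name] or "").strip())
        for name in reader.fieldnames
    }
    return {
        "columns": reader.fieldnames,
        "row_count": len(rows),
        "blank_cells": blanks,
    }


summary = first_look(io.StringIO(SAMPLE))
print(summary)
```

With a real file, the same function could be called on open("yourfile.csv", newline=""). Comparing row_count with the number of records the agency said it sent is one way to check that you got what you asked for.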
 Does it look too good to be true?
 Beware of missing information
 Who collected the information?
 How? What are their methods?
 Why?
 What is their agenda?
 Who supports them financially or otherwise?
 Notepad++ for PCs
 TextMate for Mac
 Always keep original file


 Never overwrite data columns


 Tools
   Spreadsheets
   Database managers
   Google Refine
   Programming languages
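
One way to follow the "never overwrite data columns" rule in a programming language is to write cleaned values into a new column and leave the original untouched. This is a sketch under assumptions: the sample data, the column names and the add_clean_column helper are hypothetical, and the cleaning step (trim spaces, title case) is only an example.

```python
import csv
import io

# Hypothetical sample with inconsistent city names.
ORIGINAL = "name,city\nAcme Inc., columbia \nBolt LLC,COLUMBIA\n"


def add_clean_column(handle, source, target):
    """Copy each row and add a cleaned version of `source` as `target`."""
    cleaned_rows = []
    for row in csv.DictReader(handle):
        new_row = dict(row)                          # keep every original value
        new_row[target] = row[source].strip().title()  # cleaned copy goes in a new column
        cleaned_rows.append(new_row)
    return cleaned_rows


rows = add_clean_column(io.StringIO(ORIGINAL), "city", "city_clean")
for row in rows:
    print(repr(row["city"]), "->", row["city_clean"])
```

Because the original column survives, any cleaning mistake can be spotted and undone by comparing the two columns side by side.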
 Raw numbers, without context, rarely are interesting.

 Ask: Compared to what?
 Raw (amount) change
   New - Original


 Percent change
   Change/Original


 Per capita rates
   Per person
   Per x people
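
The three calculations above can be written out as small functions. This is a sketch with made-up numbers; the function names are hypothetical, and percent change is multiplied by 100 so that it reads as a percentage.

```python
def raw_change(original, new):
    """Raw (amount) change: new minus original."""
    return new - original


def percent_change(original, new):
    """Percent change: the change divided by the original, times 100."""
    return (new - original) / original * 100


def rate_per(count, population, per=1):
    """Per capita rate: per person by default, or per `per` people."""
    return count * per / population


# Made-up example: a budget line that grew from 200 to 250.
print(raw_change(200, 250))
print(percent_change(200, 250))

# Made-up example: 30 incidents in a city of 120,000, per 100,000 people.
print(rate_per(30, 120_000, per=100_000))
```

Dividing by the original value, not the new one, is the step most often reversed by mistake.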
 Percent of total
   Individual/Total


 Ratio
   Apples/oranges


 Averages
   Mean
   Median
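
A short sketch of the remaining measures, using Python's standard statistics module. The numbers are invented and the helper names are hypothetical; the salary list is chosen so that one large value pulls the mean well above the median.

```python
from statistics import mean, median


def percent_of_total(individual, total):
    """Percent of total: the individual value divided by the total, times 100."""
    return individual / total * 100


def ratio(a, b):
    """Ratio of one quantity to another (apples to oranges)."""
    return a / b


# Made-up example: one department's 25 employees out of 200 citywide.
print(percent_of_total(25, 200))

# Made-up example: 150 apples for every 50 oranges.
print(ratio(150, 50))

# Made-up salaries with one outlier.
salaries = [30_000, 32_000, 35_000, 38_000, 250_000]
print(mean(salaries), median(salaries))
```

When the mean and median are far apart, as they are here, the median is usually the fairer description of a typical value.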
 Be curious!
 Cut out small slices
 Spreadsheets for simple math and comparisons
 Spreadsheets for pivot tables
 Database managers for more robust analysis
 Always ask: Is this correct?
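
What a pivot table does can also be sketched in a few lines of code: group records by a category and total a value for each group. The records and the pivot_sum helper below are hypothetical. The final check applies the "Is this correct?" question by confirming that the group totals add up to the grand total.

```python
from collections import defaultdict

# Made-up records: (department, amount).
records = [("Police", 120), ("Fire", 80), ("Police", 30), ("Parks", 45)]


def pivot_sum(rows):
    """Total the amounts for each category, like a one-field pivot table."""
    totals = defaultdict(int)
    for category, amount in rows:
        totals[category] += amount
    return dict(totals)


totals = pivot_sum(records)
print(totals)

# Sanity check: the slices should add back up to the whole.
grand_total = sum(amount for _, amount in records)
assert sum(totals.values()) == grand_total
```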
 Online software platforms


 Desktop software
 Contact David Herzog at


  herzogd@missouri.edu
  Twitter: @davidherzog

A crash course in data for information graphics
