This document provides an overview of computer-assisted reporting (CAR), which uses data from databases, spreadsheets, and other sources to uncover stories. CAR can confirm obvious facts, reveal unexpected findings, and report information without needing to attribute to sources. Some potential data sources discussed include inspection reports, complaints, licenses and registrations. The document outlines challenges in obtaining data through freedom of information requests and provides tips for negotiating access. It also discusses software options, potential stories that could be uncovered, and resources for learning more about CAR techniques.
From a presentation I gave to the inaugural meeting of the Hacks & Hackers Ottawa chapter. It's a general survey of data journalism (née computer-assisted reporting).
7. Understand data tables Fields (or columns) contain types of data. Records (or rows) contain the information you want.
8. Why use data? • No more “according to…” State as fact, don’t attribute. • Uncovers stories even the subjects don’t know. • Confirms the obvious; reveals the unexpected.
34. Other CAR ideas • Overpass inspection records. • Mayoralty campaign donors. • Ambulance response times. • Health/safety reports in city-run apartment buildings. • Complaints against taxi drivers.
35. More CAR ideas • Pet licenses by postal code. • Single men/women by census tract. • Day most marriage licenses. • Most common street name.
36. Data sources • Inspection reports • Complaints • Incident reports • Discipline records • Registrations and licenses Check reporting requirements.
38. The CAR Inverse Pyramid of Aggravation • Obtaining data (hard) • Formatting data • Analyzing data • Reporting • Writing (easy)
39. Where to get data • Ask for it. • Download from the Web. • Scrape from the Web. • Build it yourself from documents. • FOI or ATIP.
42. FOI/ATIP strategies • Ask for everything except names (and sometimes addresses) • Negotiate. • Appeal.
43. Negotiating for data • Request a sample of the data. • Arrange a meeting with the ATIP/FOI person or the data expert. • Eliminate fields that need to be severed. • Modify your request.
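One way to picture "eliminate fields that need to be severed" is as dropping whole columns from an extract before it is released. The sketch below is only an illustration: the column names (`name`, `address`, `postal_code`, `licence_type`), the sample row, and the `drop_fields` helper are all hypothetical, and a real agency would do this in its own database software.

```python
import csv
import io

def drop_fields(csv_text, fields_to_drop):
    """Return the CSV text with the named columns removed from every record."""
    reader = csv.DictReader(io.StringIO(csv_text))
    kept = [f for f in (reader.fieldnames or []) if f not in fields_to_drop]
    out = io.StringIO()
    # extrasaction="ignore" silently skips the columns we are severing.
    writer = csv.DictWriter(out, fieldnames=kept, extrasaction="ignore",
                            lineterminator="\n")
    writer.writeheader()
    for row in reader:
        writer.writerow(row)
    return out.getvalue()

# Made-up sample: one pet-licence record with personal fields.
sample = "name,address,postal_code,licence_type\nJane Doe,1 Main St,K1A 0A1,dog\n"
redacted = drop_fields(sample, {"name", "address"})
print(redacted)
```

Asking for a sample first makes it easier to point at exactly which fields can go.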
44. Finding the story • Think vertically. Look at columns. • Cross-tab columns. • Chart over time. • Look for patterns. • Dig down from data.
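The "think vertically" and "cross-tab columns" steps can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions: the complaint records, the field names `year` and `category`, and the helper functions `count_by` and `cross_tab` are invented for the example. A spreadsheet pivot table or a database GROUP BY query does the same job.

```python
from collections import Counter

# Hypothetical complaint records: each dict is one record (row),
# each key is one field (column). The values are made up.
records = [
    {"year": 2008, "category": "noise"},
    {"year": 2008, "category": "parking"},
    {"year": 2009, "category": "noise"},
    {"year": 2009, "category": "noise"},
]

def count_by(rows, field):
    """Think vertically: tally the values found in a single column."""
    return Counter(r[field] for r in rows)

def cross_tab(rows, row_field, col_field):
    """Cross-tab: count how often each pair of values occurs together."""
    return Counter((r[row_field], r[col_field]) for r in rows)

by_year = count_by(records, "year")               # counts per year, for charting over time
table = cross_tab(records, "year", "category")    # counts per (year, category) pair

for (year, category), n in sorted(table.items()):
    print(year, category, n)
```

The per-year counts are the raw material for a chart over time; the cross-tab is where patterns tend to show up.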
46. Spreadsheets • “Smart paper,” great for adding up totals, calculating percentages and other data summaries. • Limited to 65,000 records.
48. Database managers • Quickly organize large numbers of records. • No 65,000-record limit. • Can summarize data, too.
50. Mapping or GIS • Puts data on a map. • Analyzes data based on location and distance.
53. Software for Mac • Spreadsheet: Excel ($), OpenOffice (open source) • Database manager: FileMaker ($), MySQL (open source) • Mapping: QGIS (open source), Google Maps mash-ups.
54. Software for PC • Spreadsheet: Excel ($), OpenOffice (open source) • Database manager: Access ($), MySQL (open source) • Mapping: ArcView ($, Citizen owns a copy), MapInfo ($), Google Maps mash-ups.
55. Problems with CAR • Long lead times. • Time consuming. • Requires software and hardware. • Extra work not seen by readers (or editors)
56. Scale • A little goes a long way • Think story, not series • Keep and reuse your data
57. Why learn CAR? • Few doing it in Canada • Easy online components • Useful in almost any beat • Works in print, broadcast, web
59. Ask for help • (613) 235-6685 • [email_address] • http://www.sushiboy.org/car.pdf • http://www.sushiboy.org/car.ppt -30-
Editor's Notes
CAR is a misnomer. Predates use of the internet in newsrooms. Everyone who does a Google search is computer assisted. We are using electronic data to do reporting.
- Just about anything large organizations (and governments) do gets put into a database somewhere. Email is a form of database. Maps are graphic representations of data. Census stuff.
Each line is a separate record that represents a single gun somewhere in Canada. Record level data. Not summary data. Summary data would be “21 per cent of handguns in Canada are registered in Quebec.”
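A rough sketch of how record-level data turns into summary data, assuming a tiny made-up registry. The `province` field, the four sample rows, and the `share_by` helper are hypothetical; the percentages it prints describe only the made-up rows, not the real registry.

```python
from collections import Counter

def share_by(rows, field):
    """Collapse record-level rows into a summary: percent of rows per value."""
    counts = Counter(r[field] for r in rows)
    total = sum(counts.values())
    return {value: 100.0 * n / total for value, n in counts.items()}

# Record-level data: one row per (made-up) registered handgun.
registry = [
    {"type": "handgun", "province": "QC"},
    {"type": "handgun", "province": "ON"},
    {"type": "handgun", "province": "ON"},
    {"type": "handgun", "province": "BC"},
]

# Summary data: one number per province.
summary = share_by(registry, "province")
print(summary)
```

With the records in hand you can produce any summary you like; with only the summary you cannot get back to the records.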
Shows the file number of a complaint. Which are most interesting? Most serious?
Govt. contracts by department and date. Party column was added after the fact.
Need to know what things are called when you ask for them.
“Ottawa Police solve fewer murders of women than police forces in eight other major Canadian cities, a Citizen analysis shows.” Parking meter on Lisgar most ticketed in the city. Lots of tickets in Byward Market; but Lynda Lane close.
Low income people twice as likely to live near lotto dealers. Neighbourhood with most outlets per capita. How to do this story?
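One possible answer to "how to do this story?" for the per-capita part: count outlets per neighbourhood, divide by population, and rank. The sketch below assumes you already have both numbers for each neighbourhood. The neighbourhood names, counts, and the `outlets_per_10k` helper are invented for illustration.

```python
def outlets_per_10k(outlets, population):
    """Outlets per 10,000 residents for each neighbourhood with known population."""
    return {
        hood: 10_000 * outlets.get(hood, 0) / pop
        for hood, pop in population.items()
        if pop > 0  # skip neighbourhoods with no residents to avoid dividing by zero
    }

# Made-up figures for two hypothetical neighbourhoods.
outlets = {"Neighbourhood A": 12, "Neighbourhood B": 5}
population = {"Neighbourhood A": 8_000, "Neighbourhood B": 20_000}

rates = outlets_per_10k(outlets, population)
top = max(rates, key=rates.get)  # neighbourhood with the most outlets per capita
print(top, rates[top])
```

The income comparison would need a second column (for example, median income by neighbourhood from the census) joined on the same neighbourhood key.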
Black boys twice as likely to be suspended as white boys. More interesting lede?
Data works great on web. Lets readers drill down themselves so you don’t have to.
Nevada rural ambulances worst in US. Grocery clerks and miners.
Might have the same data in Canada. Find out what it’s called.
- Ledes with compelling story; data hit comes later.
Reporter’s process on this? Complaint from a person, idea, data, analysis, back to a person with the same problem. Sources?
Looks like summary data, but produced by the paper. Again, ledes with one person’s story. A bit hackneyed but it works.
Everyone fills up with gas. Water-cooler value. The numbers behind the mundane things we do every day. Lots of papers have done this story. Nice simple lede. Sources?
Texas Assessment of Knowledge and Skills. We publish results of standardized testing every year. Imagine if we could show which schools cheat the most? Multiple choice data easy to computer capture.
Call this guy, asking him about Ontario standardized tests. Privacy concerns easy to get around.
Easy to knock-off here. Are we funding groups with Christian affiliations? More since Harper elected?
Conclusion a bit dramatic.
Huge amount of donations from two addresses. Takes abstract idea like campaign finance and reduces it down to bricks-and-mortar.
Quantifying something most people have had happen to them. Obvious: thieves hit parking lots. Less obvious: South Iowa Street is the hot zone.
Google Mash-up. Can do with Platial or code it yourself. Country club vs public course.
Cellphone towers as a visual blight? What about health hazard? European standards?
Methodology shaky? Important thing: getting the data from FCC.
Guy who issued more tickets than anyone else. Crunch the numbers, then go find him and talk to him.
List of parking officers' names. Nearly got it wrong. Panic two days before publication because Raine ranked second. Two guys named Charbonneau. Figured out with badge numbers.
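The near-miss above is what happens when you count by a field that is not unique. A small sketch, using invented names, badge numbers, and ticket rows, shows how the ranking changes when you count by badge number instead of surname.

```python
from collections import Counter

# Made-up tickets. Two different officers share the surname "Smith".
tickets = [
    {"officer": "Smith", "badge": 101},
    {"officer": "Smith", "badge": 101},
    {"officer": "Smith", "badge": 202},
    {"officer": "Smith", "badge": 202},
    {"officer": "Jones", "badge": 303},
    {"officer": "Jones", "badge": 303},
    {"officer": "Jones", "badge": 303},
]

# Counting by surname merges the two Smiths into one "top" ticketer.
by_name = Counter(t["officer"] for t in tickets)

# Counting by badge number keeps each officer separate.
by_badge = Counter(t["badge"] for t in tickets)

print(by_name.most_common(1))   # surname with the most tickets
print(by_badge.most_common(1))  # badge with the most tickets
```

Before ranking people, look for a unique identifier in the data and group on that.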
Whenever possible, put your data on the map. Confirms obvious: parking in the Market, Lynda Lane.
Same idea. Mapping break-ins. “11 News put a special computer program to work.”
“The Fat Belt”: obesity charted by geography. Nobody in grey (less than 10 per cent). What is your story based on this data? (Poverty) What is the SECOND story? (Michigan).
What are paczkis and coneys? Experts blame: long winters, no bike trails, suburban population boom in Detroit increases commuting time, elimination of phys ed in school.
Look for data stories behind the A1 or C1 news hit.
Doesn’t have to be death and despair.
Government logs everything. Professional bodies… lawyers, nurses, dentists, morticians. Story about veterinarian who gave too much laughing gas to a Norwegian Blue parrot.
Hard stuff at the beginning. Get many irons in the fire.
Stupidity: Wouldn’t give me the Lobbyist Registry because it was available in very limited form online. ATIP coordinators are just learning this, too.
Analysis based on location, e.g. how many homes within a km of a cellphone tower?
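GIS software answers this kind of question directly, but the idea can be sketched without it. The example below assumes homes and towers are given as (latitude, longitude) pairs and uses the haversine great-circle formula with a spherical-Earth radius of 6,371 km. The coordinates are illustrative points, not real homes or towers, and the function names are hypothetical.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean radius, spherical approximation

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))

def homes_near_any_tower(homes, towers, radius_km=1.0):
    """Count homes that lie within radius_km of at least one tower."""
    return sum(
        1 for h in homes
        if any(haversine_km(h[0], h[1], t[0], t[1]) <= radius_km for t in towers)
    )

# Illustrative coordinates only.
towers = [(45.4215, -75.6972)]
homes = [
    (45.4215, -75.6972),  # same spot as the tower
    (45.4250, -75.6972),  # a few hundred metres north
    (45.5000, -75.6972),  # several km north
]

print(homes_near_any_tower(homes, towers))
```

Checking every home against every tower is fine for a few thousand records; for larger datasets a GIS package or a spatial index is the more practical route.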