Similar, But Not The Same: Designing Projects Around Three Open Datasets

The traits of an 'open' dataset -- factors like accuracy, geographic scope and copyright entanglements -- shape the development process in profound ways. I'll share what I've learned building projects around heritage trees, public art and poetry posts in Portland, and extrapolate a blueprint for evaluating and planning open data projects.

Transcript

  • 1. Similar, But Not The Same: Designing Projects Around Three Open Datasets. Matt Blair, Open Source Bridge, Portland, Oregon, June 23, 2011
  • 2. • Summary of the three projects • Planning Open Data Projects • Slides and PDF online
  • 3. The Three Projects
  • 4. Poetry Boxes
  • 5. What are these things?
  • 6. “A poetry post (or poetry pole or poetry box) is a wooden pole, usually, mounted on private property, so that it faces pedestrians. On top of the pole is a box, with a glass or clear face and a lid. Inside the box is a sheet of paper containing a poem (or, sometimes, prose or a photograph). Sometimes the pole is absent, the box mounted to a tree. That’s it.” – Laura O. Foster
  • 7. A Poetry Box
  • 8. Poetry Posts
  • 9. A Poetry Pole
  • 10. That Poetry “Tree”
  • 11. Project History • Started: June 2010 • An idea without much data • Only 11 locations until October • Various iterations...
  • 12. Project Goals • Map the poetry posts • Promote the idea of sharing poetry in a neighborhood context • Encourage people to get out and walk, interact with each other
  • 13. Current Status
  • 14. 59 locations Including 33 photos
  • 15. http://poetrybox.info
  • 16. Data Available for Re-use: GeoJSON in CouchDB
  • 17. I’m not sure what’s next And that’s part of the story.
  • 18. PDX Trees
  • 19. Project History • Started: August 2010 • Built around Heritage Tree data released through Civic Apps • App released: October 2010
  • 20. Project Goals • Where are these Heritage Trees? • Learn more about each • Get people out to see the trees
  • 21. Public Art PDX
  • 22. Project History • Started: November 2010 • Built with data from the Regional Arts & Culture Council (aka RACC) • Geo-coded by City’s Bureau of Technology Services for portlandmaps.com
  • 23. http://racc.org/public-art/search
  • 24. portlandmaps.com
  • 25. Project Goals • Make public art more accessible in situ • Remind everyone of how lucky we are • Encourage exploration (& more walking!)
  • 26. App released: February 2011
  • 27. 451 works of art At 215 locations
  • 28. Data Available for Re-use: GeoJSON in CouchDB
  • 29. Planning Open Data Projects
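The sample documents shown later in this deck keep a GeoJSON `geometry` object at the top level, next to the other fields. The following is a minimal Python sketch, assuming that layout, of wrapping such documents into a standard GeoJSON FeatureCollection so that mapping tools can consume them. The function name `to_feature_collection` and the second sample record are illustrative assumptions, not taken from the projects' code.

```python
import json


def to_feature_collection(docs):
    """Wrap flat documents (top-level "geometry") as a GeoJSON FeatureCollection."""
    features = []
    for doc in docs:
        geometry = doc.get("geometry")
        if not geometry:
            continue  # skip records that have not been geo-coded yet
        # Everything except the geometry becomes the feature's properties.
        properties = {k: v for k, v in doc.items() if k != "geometry"}
        features.append(
            {"type": "Feature", "geometry": geometry, "properties": properties}
        )
    return {"type": "FeatureCollection", "features": features}


# Sample records shaped like the documents in this talk (second one is hypothetical).
sample_docs = [
    {
        "title": "Mercurial Sky",
        "geometry": {"type": "Point", "coordinates": [-122.681124, 45.518759]},
    },
    {"title": "Not yet located"},
]

collection = to_feature_collection(sample_docs)
print(json.dumps(collection, indent=2))
```

Records without a geometry are skipped rather than rejected, which fits data that is still being collected.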
  • 30. They’re all just maps, right?
  • 31. They’re all just maps, right? What’s the diff?
  • 32. What makes open data easy -- or difficult?
  • 33. Anticipate problems and skills needed
  • 34. Make great projects that excite and engage
  • 35. Seem basic? Make no assumptions about Open Data.
  • 36. 1. Is there data?
  • 37. Data + Idea = Project
  • 38. True?
  • 39. Lots of great Ideas with no Open Data to use.
  • 40. Lots of Open Data that’s just dull as dirt.
  • 41. Data, Ideas: The Open Data Universe
  • 42. Data, Ideas: Work here. Option #1
  • 43. Option #2: Assemble your own data
  • 44. If it exists in digital form: • Screen-scrape • Repurpose feeds or reporting systems • Google Refine • Convince governments/stakeholders to release it
  • 45. If it’s not in digital form?
  • 46. And arrives like this?
  • 47. Tough to Automate
  • 48. Poetry Posts: Dozens and dozens of emails in and outside of a Google Group
  • 49. Don’t underestimate data collection.
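Screen-scraping, the first option listed for data that already exists in digital form, can be sketched with Python's standard-library HTML parser. This is a minimal illustration, not the approach used in these projects: the markup in `SAMPLE_HTML` and the class name `LinkTitleScraper` are hypothetical, and a real page would need to be fetched and would have its own structure.

```python
from html.parser import HTMLParser


class LinkTitleScraper(HTMLParser):
    """Collect (href, link text) pairs from anchor tags."""

    def __init__(self):
        super().__init__()
        self.results = []
        self._href = None
        self._text = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self._href = dict(attrs).get("href")
            self._text = []

    def handle_data(self, data):
        # Only collect text while inside an anchor that has an href.
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            self.results.append((self._href, "".join(self._text).strip()))
            self._href = None


# Hypothetical markup; a real listing page will differ.
SAMPLE_HTML = """
<ul>
  <li><a href="/art/1">First Work</a></li>
  <li><a href="/art/2"> Second Work </a></li>
  <li>No link here</li>
</ul>
"""

scraper = LinkTitleScraper()
scraper.feed(SAMPLE_HTML)
print(scraper.results)
```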
  • 50. 2. Data Sources
  • 51. • Who gathered it? • To what end? • Are they maintaining it? • Do they want to share?
  • 52. PDX Trees: One list, managed by one department.
  • 53. Public Art Data in Portland
  • 54. Data Sources (Nov 2010) • PDX API: 279 works of art • Civic Apps CSV file: 366 works of art • BTS: 300ish works of art (no direct access) • RACC.org: 1800+ works of art
  • 55. PDXAPI, RACC.org, Civic Apps, BTS? Where to start?
  • 56. RACC.org, Civic Apps, PDXAPI. And BTS = Civic Apps?
  • 57. But they didn’t match...
  • 58. RACC.org Permanent Collection
  • 59. Text = Available Data
  • 60. Text = Available Data As of November 2010: ~ 370 works of art
  • 61. RACC.org
  • 62. RACC.org (City of Portland) (Multnomah County)
  • 63. Done? Not quite...
  • 64. (Caveat: Not To Scale) Accuracy Not Guaranteed
  • 65. Public Art: RACC.org
  • 66. Public Art: RACC.org, Murals
  • 67. Public Art: RACC.org, Murals, TriMet
  • 68. Public Art: RACC.org, Murals, TriMet, Convention Center
  • 69. Public Art: RACC.org, Murals, TriMet, Metro, Convention Center
  • 70. Public Art: RACC.org, Murals, Port of Portland, TriMet, Metro, Convention Center
  • 71. Public Art: Fountains, RACC.org, Murals, Port of Portland, TriMet, Metro, Convention Center
  • 72. Public Art: Fountains, RACC.org, Murals, Parks & Rec, Port of Portland, TriMet, Metro, Convention Center
  • 73. Public Art: Fountains, RACC.org, Murals, Parks & Rec, Port of Portland, TriMet, Beaverton?, Metro, Convention Center
  • 74. Public Art: Fountains, RACC.org, Murals, Parks & Rec, Port of Portland, TriMet, Beaverton?, Hillsboro?, Metro, Convention Center
  • 75. Public Art: Fountains, RACC.org, Murals, Parks & Rec, Port of Portland, TriMet, Beaverton?, Hillsboro?, Metro, Clark County?, Convention Center
  • 76. Public Art: Fountains, RACC.org, Murals, Parks & Rec, Port of Portland, TriMet, Performance, Beaverton?, Hillsboro?, Metro, Clark County?, Convention Center
  • 77. Public Art: Fountains, RACC.org, Murals, Parks & Rec, Port of Portland, TriMet, Performance, Temporary Installations, Beaverton?, Hillsboro?, Metro, Clark County?, Convention Center
  • 78. Public Art: Fountains, RACC.org, Murals, Graffiti?, Parks & Rec, Port of Portland, TriMet, Performance, Temporary Installations, Beaverton?, Hillsboro?, Metro, Clark County?, Convention Center
  • 79. “Of course not!”
  • 80. “But what if it’s Banksy?”
  • 81. Public Art: Fountains, RACC.org, Murals, Graffiti?, Parks & Rec, Port of Portland, TriMet, Performance, Temporary Installations, Beaverton?, Hillsboro?, Metro, Clark County?, Convention Center, Community?
  • 82. Community Collection
  • 83. Where’s Paul?
  • 84. photo by Cacophony (via Wikipedia)
  • 85. Intersection Repair, photo by City Repair (via Flickr)
  • 86. Julian Voss-Andrae’s Alpha Helix (at the Linus Pauling House), photo via julianvossandrae.com
  • 87. Public Art: Fountains, RACC.org, Murals, Graffiti?, Parks & Rec, Port of Portland, TriMet, Performance, Temporary Installations, Beaverton?, Hillsboro?, Metro, Clark County?, Convention Center, Community?
  • 88. Public Art: Fountains, RACC.org, Murals, Graffiti?, Parks & Rec, Available Data, Port of Portland, TriMet, Performance, Temporary Installations, Beaverton?, Hillsboro?, Metro, Clark County?, Convention Center, Community?
  • 89. A Community-wide Database: Build an inter-agency superset of public art
  • 90. Yes, it’s complicated.
  • 91. Yes, it’s complicated. Yet Portland has some of the best public art data in the country.
  • 92. Data Source Tips • Start small • Find allies and set an example • Don’t wait for the perfect dataset • Plan for chaos • But be ambitious in the long-term!
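When several sources describe the same collection but "didn't match", a first step is to see which records overlap. The sketch below compares two lists by a normalized title. It is an illustration under stated assumptions: the title lists are hypothetical stand-ins for two of the sources, the helper names are invented here, and a title alone is a weak key (a real reconciliation would probably also compare artist or record ID).

```python
def normalize(title):
    """Lower-case and collapse punctuation and whitespace so near-identical titles match."""
    cleaned = "".join(ch if ch.isalnum() else " " for ch in title.lower())
    return " ".join(cleaned.split())


def compare_sources(source_a, source_b):
    """Return titles found only in A, only in B, and in both (by normalized title)."""
    keys_a = {normalize(t): t for t in source_a}
    keys_b = {normalize(t): t for t in source_b}
    only_a = sorted(keys_a[k] for k in keys_a.keys() - keys_b.keys())
    only_b = sorted(keys_b[k] for k in keys_b.keys() - keys_a.keys())
    both = sorted(keys_a[k] for k in keys_a.keys() & keys_b.keys())
    return only_a, only_b, both


# Hypothetical title lists standing in for two of the sources.
civic_apps = ["Mercurial Sky", "Untitled  (Fountain)", "Old Mural"]
racc = ["mercurial sky", "Untitled (Fountain)", "New Sculpture"]

only_a, only_b, both = compare_sources(civic_apps, racc)
print("only in A:", only_a)
print("only in B:", only_b)
print("in both:", both)
```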
  • 93. 3. Data Structure
  • 94. Each source has its own: • metadata • schema • volatility • level of accuracy and currency • messes to clean up
  • 95. A dynamic mix of... • noisy data • shifting standards • unexpected restrictions • adapting to community requests • addressing data provider concerns • ambiguities of project ownership
  • 96. Less about architecture than improvisation...
  • 97. Metadata Flexibility Let early adopters set the standards
  • 98. NoSQL
  • 99. Document Databases
  • 100. CouchDB
  • 101. Eventual Schema
  • 102. “There is always schema somewhere.”
  • 103. “There is always schema somewhere.” If it’s not in your software, you’re forcing your audience to build it in their heads.
  • 104. Delay Schema Decisions Push them up to presentation/client layer
  • 105. But what about MVC?
  • 106. But what about MVC? My model is in my view?!
  • 107. More nimble than relational database. Especially during development
  • 108. YKmMV
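One way to read "Delay Schema Decisions" is that the stored documents stay loose and the client decides, at display time, which fields it needs and what to do when they are missing. The sketch below illustrates that idea in Python. The field names (`title`, `artists`, `common_nam`, `scientific`, `geometry`) come from the sample records in this talk; the function `present` and its fallback rules are assumptions made for illustration.

```python
def present(doc):
    """Build a display record from a loosely structured document.

    The schema lives here, in the presentation layer, not in the database.
    """
    title = doc.get("title") or doc.get("common_nam") or "Untitled"
    # Different sources name the secondary field differently.
    subtitle = doc.get("artists") or doc.get("scientific") or ""
    coords = (doc.get("geometry") or {}).get("coordinates")
    return {
        "title": title,
        "subtitle": subtitle,
        "mappable": bool(coords) and len(coords) == 2,
        "coordinates": coords,
    }


docs = [
    {
        "title": "Mercurial Sky",
        "artists": "Dan Corson",
        "geometry": {"type": "Point", "coordinates": [-122.681124, 45.518759]},
    },
    {"common_nam": "Deodar cedar", "scientific": "Cedrus deodara"},
    {},  # an empty document still renders, just without detail
]
cards = [present(d) for d in docs]
for card in cards:
    print(card)
```

The trade-off is the one the quote points at: every client that reads the data has to carry rules like these.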
  • 109. 4. Scope and Density
  • 110. Not Just Geographic Topic or Time
  • 111. Social Clusters
  • 112. Geo-Density and UI
  • 113. PDXShrub?
  • 114. PDXLichen?
  • 115. Cluster it?
  • 116. Art Mob?
  • 117. Places and Art Distinct Databases
  • 118. The Map Shows Places Not Art
  • 119. If a place has one work of art: Place = Art
  • 120. If a place has multiple works of art...
  • 121. Presentation Determines Data Model
  • 122. Presentation Determines Data Model But you just said!?!
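The "map shows places, not art" rule can be sketched as a grouping step: collect works by place, then label each marker with the work's title when the place has one work and with a count when it has several. This is a hypothetical Python illustration. The `location` and `title` field names follow the sample document in this talk, while `build_markers`, the label format and the two "Example Plaza" records are invented for the example.

```python
from collections import defaultdict


def build_markers(works):
    """Group works of art by place and describe one map marker per place."""
    by_place = defaultdict(list)
    for work in works:
        by_place[work["location"]].append(work["title"])

    markers = []
    for place, titles in sorted(by_place.items()):
        if len(titles) == 1:
            label = titles[0]  # Place = Art
        else:
            label = f"{place} ({len(titles)} works)"
        markers.append({"place": place, "label": label, "titles": sorted(titles)})
    return markers


# The "Example Plaza" records are hypothetical.
works = [
    {"title": "Mercurial Sky", "location": "Director Park Canopy"},
    {"title": "Work A", "location": "Example Plaza"},
    {"title": "Work B", "location": "Example Plaza"},
]
markers = build_markers(works)
for marker in markers:
    print(marker["label"])
```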
  • 123. 5. Stories
  • 124. “While the map makes class and race differences all the more evident, its great to learn about the few murals where I live and I look forward to using this app on a walking tour downtown soon.” – Marshal Kirkpatrick, Read Write Web
  • 125. Incomplete Data
  • 126. Recent Public Building Projects
  • 127. Economic Activity
  • 128. Voice and Representation
  • 129. Share Open Data: Interpretations Emerge
  • 130. A Tool for Advocacy?
  • 131. 6. Accuracy
  • 132. What’s Missing or Incorrect?
  • 133. What’s Missing or Incorrect? Errors and omissions are an opportunity for participation.
  • 134. > 1 mile
  • 135. Artifact of Geo-Coding Address != Location
  • 136. Capacity: Who has the time to manually verify all these locations?
  • 137. Crowd-Correction
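Because an address is not a location, a geo-coded point can land far from the actual object. One way to prioritize manual checks is to measure the distance between the geo-coded point and a point reported from the field, then flag large gaps. The sketch below uses the standard haversine formula. The threshold, the function names and the idea of a separate "verified" coordinate are assumptions for illustration. Coordinates are in GeoJSON order (longitude, latitude).

```python
import math

EARTH_RADIUS_MILES = 3958.8  # mean Earth radius, approximate


def haversine_miles(lon1, lat1, lon2, lat2):
    """Great-circle distance in miles between two (lon, lat) points in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    return 2 * EARTH_RADIUS_MILES * math.asin(math.sqrt(a))


def needs_review(geocoded, verified, threshold_miles=0.1):
    """Flag a record when the geo-coded point is far from a field-reported point."""
    return haversine_miles(*geocoded, *verified) > threshold_miles


geocoded = [-122.681124, 45.518759]
reported = [-122.681124, 45.538759]  # hypothetical correction, 0.02 degrees north
print(round(haversine_miles(*geocoded, *reported), 2), "miles apart")
print("needs review:", needs_review(geocoded, reported))
```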
  • 138. 7. Audience Appeal
  • 139. Public Art { "docs": [{ "addrCity": "", "addrState": " ", "addrStreet": "", "addrZip": "", "artists": "Dan Corson", "date": "2009", "dateModified": "2011-04-18 00:00:00", "description": "Mercurial Sky is an ever-changing array of light played on LED tubes integrated into the Director Park Canopy. The digital video only emits from the lighted bars, and provides a sense of movement through an abstract tapestry of light and color. If you stand farther away, or look in nearby reflections, the images are compressed and give a clearer view of the video. \n\n\"I filmed images and patterns of natural phenomena like waves, clouds, fire, earthworms, and jellyfish to bring the movement and randomness of nature into this mostly hardscaped park.\"", "detailPageURL": "http://racc.org/public-art/search/?recid=2909.101", "dimensions": "duration: 1:23:10", "discipline": "video", "fundingSource": "Percent for Art - City of Portland", "thumbnailURL": "http://data.racc.org/pa_inventory/1844/1844thumb.jpg", "location": "Director Park Canopy", "mappableDiscipline": "other", "medium": "Digital video on DVD", "recordID": "2909", "title": "Mercurial Sky", "dataSource": "RACC", "collection": "None", "photoCredit": "RACC", "artCopyright": "TBD", "locationVerified": "YES", "geometry": { "coordinates": [ -122.681124, 45.518759 ], "type": "Point" }}
  • 140. Public Art { "docs": [{ "addrCity": "", "addrState": " ", "addrStreet": "", "addrZip": "", "artists": "Dan Corson", "date": "2009", "dateModified": "2011-04-18 00:00:00", "description": "Mercurial Sky is an ever-changing array of light played on LED tubes integrated into the Director Park Canopy. The digital video only emits from the lighted bars, and provides a sense of movement through an abstract tapestry of light and color. If you stand farther away, or look in nearby reflections, the images are compressed and give a clearer view of the video. \n\n\"I filmed images and patterns of natural phenomena like waves, clouds, fire, earthworms, and jellyfish to bring the movement and randomness of nature into this mostly hardscaped park.\"", "detailPageURL": "http://racc.org/public-art/search/?recid=2909.101", "dimensions": "duration: 1:23:10", "discipline": "video", "fundingSource": "Percent for Art - City of Portland", "thumbnailURL": "http://data.racc.org/pa_inventory/1844/1844thumb.jpg", "location": "Director Park Canopy", "mappableDiscipline": "other", "medium": "Digital video on DVD", "recordID": "2909", "title": "Mercurial Sky", "dataSource": "RACC", "collection": "None", "photoCredit": "RACC", "artCopyright": "TBD", "locationVerified": "YES", "geometry": { "coordinates": [ -122.681124, 45.518759 ], "type": "Point" }}
  • 141. Public Art { "docs": [{ "addrCity": "", "addrState": " ", "addrStreet": "", "addrZip": "", "artists": "Dan Corson", "date": "2009", "dateModified": "2011-04-18 00:00:00", "description": "Mercurial Sky is an ever-changing array of light played on LED tubes integrated into the Director Park Canopy. The digital video only emits from the lighted bars, and provides a sense of movement through an abstract tapestry of light and color. If you stand farther away, or look in nearby reflections, the images are compressed and give a clearer view of the video. \n\n\"I filmed images and patterns of natural phenomena like waves, clouds, fire, earthworms, and jellyfish to bring the movement and randomness of nature into this mostly hardscaped park.\"", "detailPageURL": "http://racc.org/public-art/search/?recid=2909.101", "dimensions": "duration: 1:23:10", "discipline": "video", "fundingSource": "Percent for Art - City of Portland", "thumbnailURL": "http://data.racc.org/pa_inventory/1844/1844thumb.jpg", "location": "Director Park Canopy", "mappableDiscipline": "other", "medium": "Digital video on DVD", "recordID": "2909", "title": "Mercurial Sky", "dataSource": "RACC", "collection": "None", "photoCredit": "RACC", "artCopyright": "TBD", "locationVerified": "YES", "geometry": { "coordinates": [ -122.681124, 45.518759 ], "type": "Point" }}
  • 142. Heritage Trees { "address": "2403 WI/ SW JEFFERSON ST", "circumfere": "12.300000000000001", "common_nam": "Deodar cedar", "diameter": "47", "geometry": { "coordinates": [ -122.70463884770101, 45.521710633334202 ], "type": "Point" }, "gid": "103", "height": "73", "notes": "between SW Marconi Ave and SW Tichner Dr", "objectid": "103", "owner": "Right Of Way", "scientific": "Cedrus deodara", "spread": "73", "stateid": "1N1E32 100", "status": "Heritage", "treeid": "113", "year": "1996"}
  • 143. PDX Trees { "address": "2403 WI/ SW JEFFERSON ST", "circumfere": "12.300000000000001", "common_nam": "Deodar cedar", "diameter": "47", "geometry": { "coordinates": [ -122.70463884770101, 45.521710633334202 ], "type": "Point" }, "gid": "103", "height": "73", "notes": "between Marconi Ave and SW Tichner Dr", "objectid": "103", "owner": "Right Of Way", "scientific": "Cedrus deodara", "spread": "73", "stateid": "1N1E32 100", "status": "Heritage", "treeid": "113", "year": "1996"} + ?
  • 144. PDX Trees { "address": "2403 WI/ SW JEFFERSON ST", "circumfere": "12.300000000000001", "common_nam": "Deodar cedar", "diameter": "47", "geometry": { "coordinates": [ -122.70463884770101, 45.521710633334202 ], "type": "Point" }, "gid": "103", "height": "73", "notes": "between Marconi...", "objectid": "103", "owner": "Right Of Way", "scientific": "Cedrus deodara", "spread": "73", "stateid": "1N1E32 100", "status": "Heritage", "treeid": "113", "year": "1996"} +
  • 145. PDX Trees { "address": "2403 WI/ SW JEFFERSON ST", "circumfere": "12.300000000000001", "common_nam": "Deodar cedar", "diameter": "47", "geometry": { "coordinates": [ -122.70463884770101, 45.521710633334202 ], "type": "Point" }, "gid": "103", "height": "73", "notes": "between Marconi Ave and SW Tichner Dr", "objectid": "103", "owner": "Right Of Way", "scientific": "Cedrus deodara", "spread": "73", "stateid": "1N1E32 100", "status": "Heritage", "treeid": "113", "year": "1996"} + ??
  • 146. Poetry Posts: 1991 SW Mill St Terrace
  • 147. Poetry Posts: 1991 SW Mill St Terrace
  • 148. PDX Trees { "address": "2403 WI/ SW JEFFERSON ST", "circumfere": "12.300000000000001", "common_nam": "Deodar cedar", "diameter": "47", "geometry": { "coordinates": [ -122.70463884770101, 45.521710633334202 ], "type": "Point" }, "gid": "103", "height": "73", "notes": "between Marconi...", "objectid": "103", "owner": "Right Of Way", "scientific": "Cedrus deodara", "spread": "73", "stateid": "1N1E32 100", "status": "Heritage", "treeid": "113", "year": "1996"} +
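The Heritage Tree record above stores every measurement as a string, including export noise such as "12.300000000000001". A minimal Python sketch of the kind of scrubbing this invites follows. The field names are taken from the record shown; the function `clean_tree`, the choice of which fields count as numeric and the rounding to two decimals are assumptions, and the sketch makes no claim about the units involved.

```python
NUMERIC_FIELDS = ("circumfere", "diameter", "height", "spread", "year")


def clean_tree(record):
    """Return a copy with numeric strings converted; leave unparseable values alone."""
    cleaned = dict(record)
    for field in NUMERIC_FIELDS:
        value = record.get(field)
        if value is None:
            continue
        try:
            number = float(value)
        except (TypeError, ValueError):
            continue  # keep the original value for manual review
        # Round away floating-point export noise such as 12.300000000000001.
        cleaned[field] = int(number) if number.is_integer() else round(number, 2)
    return cleaned


tree = {
    "common_nam": "Deodar cedar",
    "circumfere": "12.300000000000001",
    "diameter": "47",
    "height": "73",
    "spread": "73",
    "year": "1996",
}
cleaned = clean_tree(tree)
print(cleaned)
```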
  • 149. Sunny October Day?
  • 150. Sunny October Day? No one will believe it.
  • 151. More Realistic
  • 152. In Rarer Weather, too...
  • 153. But there are 283 Trees!
  • 154. But there are 283 Trees! I need help.
  • 155. Create a crowd-sourced, longitudinal, season-sortable collection of tree photos...
  • 156. Will anyone send photos?
  • 157. by Brad B
  • 158. by kateinoregon
  • 159. “Keep Portland Green!” by Dan Flynn
  • 160. (and I still add some...)
  • 161. 320+ Photos Sent In
  • 162. Submitted under Creative Commons
  • 163. Photos (and data) available for re-use
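A "season-sortable" photo collection needs each photo to carry a date from which a season can be derived. The sketch below shows one way to do that in Python, assuming meteorological seasons for the Northern Hemisphere. The photo records, contributor names and the `taken` field are hypothetical; only the tree ID "113" comes from the sample record in this talk.

```python
from datetime import date

# Meteorological seasons for the Northern Hemisphere, keyed by month.
SEASON_BY_MONTH = {
    12: "winter", 1: "winter", 2: "winter",
    3: "spring", 4: "spring", 5: "spring",
    6: "summer", 7: "summer", 8: "summer",
    9: "autumn", 10: "autumn", 11: "autumn",
}
SEASON_ORDER = ["spring", "summer", "autumn", "winter"]


def season_of(day):
    """Return the season name for a datetime.date."""
    return SEASON_BY_MONTH[day.month]


def sort_by_season(photos):
    """Order photos by season, then chronologically within each season."""
    return sorted(
        photos,
        key=lambda p: (SEASON_ORDER.index(season_of(p["taken"])), p["taken"]),
    )


# Hypothetical submissions for a single tree.
photos = [
    {"tree_id": "113", "by": "contributor_a", "taken": date(2010, 10, 12)},
    {"tree_id": "113", "by": "contributor_b", "taken": date(2011, 1, 3)},
    {"tree_id": "113", "by": "contributor_c", "taken": date(2011, 4, 20)},
    {"tree_id": "113", "by": "contributor_d", "taken": date(2009, 10, 30)},
]
ordered = sort_by_season(photos)
for photo in ordered:
    print(season_of(photo["taken"]), photo["taken"], photo["by"])
```

Sorting within a season by date keeps the longitudinal view: the same tree in October of different years sits side by side.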
  • 164. 8. Intellectual Property
  • 165. I thought this was “open” data?
  • 166. Restrictions in Terms of Use
  • 167. Linked Media
  • 168. Linked Media Who took that photo? Who owns it? Who can license it?
  • 169. 9. Data Volatility
  • 170. How fast does the data change?
  • 171. Poetry Posts Whenever I update it.
  • 172. Poetry Posts: Website reads from CouchDB
  • 173. Poetry Posts (future) Multiple clients read/write from CouchDB
  • 174. PDX Trees
  • 175. PDX Trees: Trees are slow movers.
  • 176. PDX Trees • Data: once or twice a year • Photos: a couple times a week
  • 177. PDX Trees • Basic tree data on phone • Available images for each tree pulled from API in real-time • Photos not available offline
  • 178. PDX Trees (future) • Multiple clients adopting the same pattern? • Device caching of images? • Remote delete of flagged images?
  • 179. Public Art PDX • Basic data changes faster than app update • Micro-updates (a comma moved) • New works and collections • Photos online only (IP)
  • 180. Public Art PDX • CouchDB is the canonical data store • App fetches packaged data releases • App searches data locally • Photos load from CouchDB
  • 181. DIY Phone to Couch Sync
  • 182. Public Art PDX (future) Mobile Couchbase!
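A do-it-yourself sync between a phone and a canonical store comes down to working out which records to add, update or delete on the client. The sketch below is a hypothetical Python illustration of that comparison, not CouchDB's replication protocol and not the app's actual code. It keys records by `recordID` and compares the `dateModified` strings that appear in the sample document. The function `plan_sync` and all records other than "2909" are invented for the example.

```python
def plan_sync(local_docs, remote_docs):
    """Compare two {recordID: doc} mappings and list what the client should change."""
    to_add = sorted(remote_docs.keys() - local_docs.keys())
    to_delete = sorted(local_docs.keys() - remote_docs.keys())
    to_update = sorted(
        rid
        for rid in remote_docs.keys() & local_docs.keys()
        # "YYYY-MM-DD HH:MM:SS" strings sort chronologically as plain text.
        if remote_docs[rid]["dateModified"] > local_docs[rid]["dateModified"]
    )
    return {"add": to_add, "update": to_update, "delete": to_delete}


# Hypothetical snapshots of the data on the phone and in the canonical store.
on_phone = {
    "2909": {"title": "Mercurial Sky", "dateModified": "2011-01-10 00:00:00"},
    "1000": {"title": "Removed Work", "dateModified": "2010-12-01 00:00:00"},
}
canonical = {
    "2909": {"title": "Mercurial Sky", "dateModified": "2011-04-18 00:00:00"},
    "3001": {"title": "New Work", "dateModified": "2011-05-02 00:00:00"},
}
plan = plan_sync(on_phone, canonical)
print(plan)
```

Even a micro-update such as a moved comma shows up here, as long as the provider bumps `dateModified` when the record changes.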
  • 183. 10. Open Data Roles
  • 184. Who do you need?
  • 185. Standard Roles • Project Management • Software Engineering • Graphic Design • Testing • Information Architecture • Maintenance • Metadata Design • Coordination • Data Entry/Import • Marketing • UI Design • Communication
  • 186. Roles That Deserve Special Attention
  • 187. Data Collection
  • 188. Public Art: Fountains, RACC.org, Murals, Graffiti?, Parks & Rec, Available Data, Port of Portland, TriMet, Performance, Temporary Installations, Beaverton?, Hillsboro?, Metro, Clark County?, Convention Center, Community?
  • 189. Research & Verification Not necessarily a technical person
  • 190. Data Scrubbing Probably Need Tech Skills
  • 191. Curation: What’s in, what’s out and why
  • 192. Authority (not required, but it helps)
  • 193. Advocacy: Why are we doing this? Why should you release this data?
  • 194. Public Art PDX 1.0: Essential Collaborators • Regional Arts & Culture Council (RACC) • The Office of Mayor Sam Adams • City of Portland’s Bureau of Technology Services • City Attorney (and pro bono attorneys) • Bud Clark! (see the video)
  • 195. Public Art PDX 1.0 Collaborators Helped With: • Graphic Design • Communication • Metadata Design • Data Collection • Data Entry/Import • Data Scrubbing • Testing • Authority • Marketing • Advocacy
  • 196. Evolving Roles for the Community Collection • Metadata Design • Curation • Data Entry/Import • Photography • Research and Verification • Communication
  • 197. Public Art PDX 1.x: Community Collection. Different Phases, Different Roles, New Participants
  • 198. PDF Online
  • 199. This is a work-in-progress. Ideas welcome.
  • 200. 11.
  • 201. WhereCamp September, 2010
  • 202. “The App is not the Thing”
  • 203. “The App is not the Thing” (redux)
  • 204. Websites = Presentation
  • 205. Apps = Presentation
  • 206. Presentation Layers are Ephemeral
  • 207. Presentation Layers are Ephemeral They come and go, like fashion
  • 208. This year’s attire
  • 209. What costumes will your data wear in five years’ time?
  • 210. Who knows.
  • 211. Data *is* the Thing.
  • 212. Data Lasts.
  • 213. Project Websites • http://poetrybox.info • http://pdxtrees.org • http://publicartpdx.com
  • 214. Thank You • http://mattblair.net • Email: elsewisemedia@gmail.com • Blog: http://elsewisestrategic.com • github.com/mattblair