Data Strategy

Information to Collect, and What You Can Do With It. A seminar for APME/ONA Newstrain.

Published in: Business, Technology

  1. 1. Your data strategy: What information to collect and what you can do with it. Anthony Moor, Deputy Managing Editor/Interactive, The Dallas Morning News [email_address]
  2. 2. Web 3.0: The data-driven Web
       • Web 1.0 (1993-2003)
           ◦ Audience: Geeks
           ◦ Unit of Content: Page (article)
           ◦ State: Static (HTML)
           ◦ Architecture: Client/Server
           ◦ Engagement: Read (Britannica online)
           ◦ Ad Distribution: A few large sites (DoubleClick)
           ◦ Search Boost: Domain name speculation (Netscape)
       • Web 2.0 (2003-2010)
           ◦ Audience: Everyone
           ◦ Unit of Content: Data (mashup, widget)
           ◦ State: Dynamic (XML, Ajax, RSS)
           ◦ Architecture: Web services (syndication)
           ◦ Engagement: Write/Contribute (Wikipedia, Flickr)
           ◦ Ad Distribution: The entire Web (Google AdSense)
           ◦ Search Boost: SEO (Google)
       • Web 3.0: The Semantic Web
           ◦ Data powers the Web
           ◦ Automated mashing up of personalized content
           ◦ Intelligent-agent driven assembly and interactivity
       Source: Chicago Tribune
  3. 3. Key features of a data-driven Web
       • Data powers Web applications
           ◦ Certain classes of data are becoming critical building blocks for Web 3.0 applications
           ◦ Structured data records published to the Web in reusable and remotely queryable formats (widgets)
       • Leverages the Long Tail
           ◦ Low-cost economics and broad reach enabled by the Internet
       • Becomes the geospatial Web (Geoweb)
           ◦ Merging of geographical (location-based) information with the abstract information that currently dominates the Internet
       • Enables content remixing and repurposing (Think: mashups)
           ◦ Increase benefits from collective adoption, not private restriction
       • Users add value
           ◦ Users add their own data to that which we provide
       Source: Chicago Tribune
  4. 4. Data centers provide utility
  5. 5. Databases are hard to build
       • Databases have three parts, built by different tech experts
           ◦ The data warehouse, where your cleaned/converted data sits
           ◦ The Web interface, carefully designed to be intuitive to your users
           ◦ The production tool, so producers can amend the data
       • So consider them for projects that have a long shelf life
       • Because databases persist – the data gets old
           ◦ So someone needs to manage and update the database
  6. 6. Acquire and build databases
  7. 7. Interactive school guides
  8. 8. Create image maps about your data
  9. 9. Property appraisal, home sale data
  10. 10. Police reports and crime statistics
  11. 11. Voter and election guides Powered by TheVoterGuide.org
  12. 12. Public employee salary databases
  13. 13. Mashups
       • You can do your own map mashup at Atlas
       • Follow mashup development at Programmable Web
  14. 14. Carefully consider what to database or map
       • Consider the time investment
       • Ask: What job do you want to get done for your user?
       • Crudely sketch out your exact idea on a piece of paper, then …
       • … walk through exactly the clicks a user will take to get your stuff
       • Maps
           ◦ Poor: DISD bond issue map
           ◦ Good: Home sales map
       • GuideLive.com
           ◦ Listings
  15. 15. Data structure powers mashups
  16. 16. Of course our stories don’t mash up very well. They aren’t data! Yes they are. And we ignore that fact at our peril.
  17. 17. Our key competitive differentiator is the data we gather every single day
       • Our articles
       • Images
       • Ads
       • Classifieds
       • Listings
       • Video
       • User Content
       • Archives
       • Databases
       • Blogs
  18. 18. But it’s locked, hidden and unorganized
       • Names/Places: Coppell; Grapevine Mills Mall; Coppell High; OSU; Travis Masters; Emily Coker; Sarah Sanders
       • Dates/Facts: Audi slid under 18-wheeler; Friday morning
       • Concepts: Suicide; National Merit Scholarship; Motocross; Untimely deaths
  19. 19. So how do we get it unlocked and organized? We need some data about this data
  20. 20. So how do we get it unlocked and organized? We need some data about this data: metadata
  21. 21. Metadata tells us what an article is about
  22. 22. 1) So we first extract key entities and concepts
       • Names/Places: Coppell; Grapevine Mills Mall; Coppell High; OSU; Travis Masters; Emily Coker; Sarah Sanders
       • Dates/Facts: Audi slid under 18-wheeler; Friday morning
       • Concepts: Suicide; National Merit Scholarship; Motocross; Untimely deaths
  23. 23. 2) Then filter them for relevance
       • Names/Places: Coppell; Grapevine Mills Mall; Coppell High; OSU; Travis Masters; Emily Coker; Sarah Sanders
       • Dates/Facts: Audi slid under 18-wheeler; Friday morning
       • Concepts: Suicide; National Merit Scholarship; Motocross; Untimely deaths
  24. 24. 3) And finally relate them to standard categories
       Extracted terms:
       • Names/Places: Coppell; Grapevine Mills Mall; Coppell High; OSU; Travis Masters; Emily Coker; Sarah Sanders
       • Dates/Facts: Audi slid under 18-wheeler; Friday morning
       • Concepts: Suicide; National Merit Scholarship; Motocross; Untimely deaths
       Related to standard categories:
       • Names/Places: Towns > Coppell; Location > Grapevine Mills; High Schools > Coppell; OSU; People > Travis Masters; People > Emily Coker; Sarah Sanders
       • Dates/Facts: Accidents > Auto/Truck; April 25, 2008
       • Concepts: Suicide; National Merit Scholarship; Motocross; Teens > Deaths
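The three steps on slides 22 to 24 (extract, filter, relate to standard categories) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `RELEVANT` set, the `CATEGORY_MAP` lookup table and the function names are hypothetical, and the extracted terms are typed in by hand where a real newsroom would rely on people or an extraction tool.

```python
# Hypothetical sketch of the extract -> filter -> categorize steps.
# RELEVANT and CATEGORY_MAP are made-up stand-ins for a relevance filter
# and a managed taxonomy; they do not come from any real product.

# Step 1: terms extracted from the story (entered by hand here).
EXTRACTED = [
    "Coppell", "Grapevine Mills Mall", "Coppell High", "OSU",
    "Travis Masters", "Emily Coker", "Sarah Sanders",
]

# Step 2: terms judged relevant enough to keep (an assumption for this demo).
RELEVANT = {
    "Coppell", "Grapevine Mills Mall", "Coppell High",
    "Travis Masters", "Emily Coker",
}

# Step 3: lookup from a raw term to its standard category path.
CATEGORY_MAP = {
    "Coppell": "Towns > Coppell",
    "Grapevine Mills Mall": "Location > Grapevine Mills",
    "Coppell High": "High Schools > Coppell",
    "Travis Masters": "People > Travis Masters",
    "Emily Coker": "People > Emily Coker",
}


def filter_relevant(terms, relevant):
    """Keep only the terms that pass the relevance filter, preserving order."""
    return [term for term in terms if term in relevant]


def categorize(terms, category_map):
    """Split terms into category paths and a list of terms with no category."""
    tagged, unmapped = [], []
    for term in terms:
        if term in category_map:
            tagged.append(category_map[term])
        else:
            unmapped.append(term)
    return tagged, unmapped


kept = filter_relevant(EXTRACTED, RELEVANT)
tags, leftovers = categorize(kept, CATEGORY_MAP)
print(tags)
print(leftovers)
```

Returning the unmapped terms separately is a design choice for the sketch: it gives whoever manages the taxonomy a list of terms that still need a home.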
  25. 25. The standard categories are…
       • Names/Places: Towns > Coppell; Location > Grapevine Mills; High Schools > Coppell; OSU; People > Travis Masters; People > Emily Coker; Sarah Sanders
       • Dates/Facts: Accidents > Auto/Truck; April 25, 2008
       • Concepts: Suicide; National Merit Scholarship; Motocross; Teens > Deaths
  26. 26. tax·on·o·my
       • Pronunciation: tak-sä-nə-mē
       • Function: noun
       • Etymology: French taxonomie, from tax- + -nomie -nomy
       • Date: circa 1828
       • 1: the organizational structure of categories and attributes that define how you classify, describe and manage your data
  27. 27. Taxonomy is the card catalog of our content
       • A set of index terms that we manage and apply to each piece of content
       • Terms are hierarchical: Large categories split into specific sub-categories
       • Terms are cross-referenced, so if you look for “bucket,” you also get “pail.”
       A taxonomy organizes, classifies and relates our content
       Structuring our content just like data
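As a rough sketch of the two properties on this slide, hierarchy and cross-referencing, a taxonomy can be held in two small lookup tables. Everything named here (`PARENT`, `CROSS_REFS`, `INDEX`, the story IDs) is a hypothetical example, not the structure of any particular taxonomy product.

```python
# Hypothetical taxonomy sketch: parent links give the hierarchy,
# a cross-reference table relates equivalent terms.

# Each term points at its broader category; top-level terms point at None.
PARENT = {
    "Szechwan": "Chinese",
    "Chinese": "Asian",
    "Asian": "Cuisine",
    "Cuisine": None,
}

# Cross-references between index terms (symmetric by construction here).
CROSS_REFS = {"bucket": {"pail"}, "pail": {"bucket"}}

# Made-up content index: index term -> IDs of content tagged with it.
INDEX = {"bucket": ["story-101"], "pail": ["story-202"]}


def path(term):
    """Walk up the parent links and return the full category path."""
    parts = []
    current = term
    while current is not None:
        parts.append(current)
        current = PARENT.get(current)
    return " > ".join(reversed(parts))


def search(term):
    """Return content for the term itself, then for its cross-referenced terms."""
    terms = [term] + sorted(CROSS_REFS.get(term, ()))
    results = []
    for t in terms:
        results.extend(INDEX.get(t, []))
    return results


print(path("Szechwan"))
print(search("bucket"))
```

Looking up "bucket" returns content tagged "pail" as well, which is the cross-referencing behavior the slide describes.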
  28. 28. So why do I need it?
       • Faceted navigation: I can click in level by level to find something (Cuisine > Asian > Chinese > Szechwan)
  29. 29. So why do I need it?
       • Faceted navigation: I can click in level by level to find something (Cuisine > Asian > Chinese > Szechwan)
       • Much better site search by enabling search boxes that can restrict a search to terms of a particular type or context
  30. 30. So why do I need it?
       • Faceted navigation: I can click in level by level to find something (Cuisine > Asian > Chinese > Szechwan)
       • Much better search by enabling search boxes that can restrict a search to terms of a particular type or context
       • Related information that I may not have known about (articles, photo galleries, other listings)
  31. 31. So why do I need it?
       • Faceted navigation: I can click in level by level to find something (Cuisine > Asian > Chinese > Szechwan)
       • Much better search by enabling search boxes that can restrict a search to terms of a particular type or context
       • Related information that I may not have known about (articles, photo galleries, other listings)
       • Multiple attributes for listings (Parking, Ambience)
  32. 32. So why do I need it?
       • Faceted navigation: I can click in level by level to find something (Cuisine > Asian > Chinese > Szechwan)
       • Much better search by enabling search boxes that can restrict a search to terms of a particular type or context
       • Related information that I may not have known about (articles, photo galleries, other listings)
       • Multiple attributes for listings (BYOB, Outdoor Dining)
       • Higher search ranking (SEO) on search engines for topic subjects, listings, classified categories
       Source: Chicago Tribune
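Faceted navigation and multiple listing attributes can be illustrated with a short filter over tagged listings. This is a minimal sketch, assuming each listing carries a category path and a set of attributes; the restaurant names, the `LISTINGS` data and the `drill_down` function are invented for the example.

```python
# Hypothetical faceted-navigation sketch over made-up restaurant listings.

LISTINGS = [
    {
        "name": "Restaurant A",
        "cuisine": ("Cuisine", "Asian", "Chinese", "Szechwan"),
        "attributes": {"BYOB"},
    },
    {
        "name": "Restaurant B",
        "cuisine": ("Cuisine", "Asian", "Thai"),
        "attributes": {"Outdoor Dining"},
    },
    {
        "name": "Restaurant C",
        "cuisine": ("Cuisine", "Asian", "Chinese", "Szechwan"),
        "attributes": {"BYOB", "Outdoor Dining"},
    },
]


def drill_down(listings, path, required=frozenset()):
    """Return names of listings under a category path that have every required attribute."""
    path = tuple(path)
    wanted = set(required)
    return [
        item["name"]
        for item in listings
        # A listing matches when its category path starts with the chosen path
        # and its attribute set contains all requested attributes.
        if item["cuisine"][: len(path)] == path and wanted <= item["attributes"]
    ]


# Clicking in level by level narrows the result set.
print(drill_down(LISTINGS, ("Cuisine", "Asian")))
print(drill_down(LISTINGS, ("Cuisine", "Asian", "Chinese")))
print(drill_down(LISTINGS, ("Cuisine", "Asian", "Chinese"), {"Outdoor Dining"}))
```

Each extra level of the path or extra attribute narrows the results, which only works if every listing was tagged with the same standard categories in the first place.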
  33. 33. ‘Hot Topic’-driven content pages provide new opportunities for keyword-targeted advertising, boost SEO rankings, increase site traffic and drive more engagement
  34. 35. Embedded links increase page views and boost SEO, which generates more site traffic
  35. 37. Geographic terms can power data mapping
  36. 38. … or create customized alerts where you define the geography where you want notifications to come from
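A geography-based alert comes down to checking whether a geotagged story falls inside the area a reader has defined. The sketch below assumes the simplest possible area, a latitude/longitude bounding box; the class names, headlines and coordinates are all invented for illustration, and a production system would more likely use real polygons and a geospatial library.

```python
# Hypothetical sketch: match geotagged stories against a reader-defined area.
from dataclasses import dataclass


@dataclass(frozen=True)
class BoundingBox:
    """A rectangular alert area in decimal degrees (does not handle the antimeridian)."""
    south: float
    west: float
    north: float
    east: float

    def contains(self, lat, lon):
        return self.south <= lat <= self.north and self.west <= lon <= self.east


@dataclass(frozen=True)
class Story:
    headline: str
    lat: float
    lon: float


def alerts_for(area, stories):
    """Return headlines of stories whose geotag falls inside the area."""
    return [s.headline for s in stories if area.contains(s.lat, s.lon)]


# Made-up coordinates, chosen only so one story is inside and one is outside.
my_area = BoundingBox(south=32.9, west=-97.1, north=33.0, east=-96.9)
stories = [
    Story("Story inside the area", 32.95, -97.0),
    Story("Story outside the area", 32.78, -96.8),
]
print(alerts_for(my_area, stories))
```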
  37. 39. Whatever you call it… it’s about describing and classifying our content
       • Taxonomy – our standard, hierarchical categories
       • Metadata – the keywords describing a piece of content
       • Structured data – Information that’s been organized as above
  38. 40. How do I get structured data?
       • You can do it by hand
           ◦ Librarians and Web producers are doing it every day
           ◦ But they can only do so much
           ◦ Should reporters be adding metadata to every story?
           ◦ Should line editors and/or copy editors add metadata?
       • You can use technology
           ◦ IPTC
           ◦ AP Digital Exchange
           ◦ Inform
           ◦ Teragram
           ◦ NStein
           ◦ MetaCarta (geotagging)
           ◦ Serra Media (geotagging)
           ◦ Generate Inc. (business data)
  39. 41. Elements of a data strategy
  40. 42. Organizing for structured data
       • Do you need a taxonomy and data strategy?
       • Audit your newsroom datastream
       • Do you need a data coordinator?
       • Gannett’s Data Desks
           ◦ Brought everyone who ‘does data’ together: Agate clerks, librarians, CAR staff
           ◦ Responsibilities: Acquiring data, creating databases, programming them for interactivity, training for building spreadsheets
       • Do you need self-service ways the public can give you info?
           ◦ Yes: Web forms
           ◦ No: Faxes, phones, notes on paper, lists on someone’s computer
  41. 43. Reporting for structured data
       • Reporting with data in mind means you gather the same fact in the same way every time
           ◦ Shoot every photo exactly the same way
           ◦ Ask the same question of every interviewee
           ◦ Find out all the same info from every venue
       • Save the data
       • Input the data alongside the story
       • Look for databases you can bring back with your story
       • Example: Bluegrass Instruments
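Gathering "the same info from every venue" amounts to agreeing on a fixed set of fields and checking each record against it before it is saved. A minimal sketch follows; the field names in `REQUIRED_FIELDS` and the sample venue are assumptions made for the example, not a list from the talk.

```python
# Hypothetical sketch: check that every venue record carries the same fields.

# The agreed-upon fields every reporter collects (assumed for this example).
REQUIRED_FIELDS = ("name", "address", "phone", "hours")


def missing_fields(record):
    """Return the required fields that are absent or blank in a record."""
    return [
        field
        for field in REQUIRED_FIELDS
        if not str(record.get(field, "")).strip()
    ]


# A made-up record where the reporter did not collect phone or hours.
venue = {"name": "Example Venue", "address": "1 Main St", "phone": " "}
print(missing_fields(venue))
```

A check like this could run when the data is input alongside the story, so gaps are caught while the reporter can still fill them.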
  42. 44. Writing/editing for structured data
       • Editors add and apply keywords and standard categories
       • Bloggers (already?) tag and categorize blog posts
           ◦ Should you have standard categories or go with the ‘folksonomy?’
  43. 45. Discussion http://www.slideshare.net/ajmoor
