Slideshare.net (beta)

 


Data Strategy

From ajmoor, 3 months ago

Information to Collect, and What You Can Do With It. A seminar fo more

222 views  |  0 comments  |  0 favorites  |  8 downloads
Slideshow transcript

Slide 1: Your data strategy What information to collect and what you can do with it Anthony Moor Deputy Managing Editor/Interactive The Dallas Morning News amoor@dallasnews.com

Slide 2: Web 3.0: The data-driven Web
Web 1.0 (1993-2003) | Web 2.0 (2003-2010)
• Audience: Geeks | Everyone
• Unit of Content: Page (article) | Data (mashup, widget)
• State: Static (HTML) | Dynamic (XML, Ajax, RSS)
• Architecture: Client/Server | Web services (syndication)
• Engagement: Read (Britannica online) | Write/Contribute (Wikipedia, Flickr)
• Ad Distribution: A few large sites (DoubleClick) | The entire Web (Google AdSense)
• Search Boost: Domain name speculation (Netscape) | SEO (Google)
Web 3.0 (The Semantic Web)
• Data powers the Web
• Automated mashing up of personalized content
• Intelligent-agent driven assembly and interactivity
Source: Chicago Tribune

Slide 3: Key features of a data-driven Web • Data powers Web applications – Certain classes of data are becoming critical building blocks for Web 3.0 applications – Structured data records published to the Web in reusable and remotely queryable formats (widgets) • Leverages the Long Tail – Low-cost economics and broad reach enabled by the Internet • Becomes the geospatial Web (Geoweb) – Merging of geographical (location-based) information with the abstract information that currently dominates the Internet • Enables content remixing and repurposing (Think: mashups) – Increase benefits from collective adoption not private restriction • Users add value – Users add their own data to that which we provide Source: Chicago Tribune

Slide 4: Data centers provide utility

Slide 5: Databases are hard to build • Databases have three parts, built by different tech experts – The data warehouse, where your cleaned/converted data sits – The Web interface, carefully designed to be intuitive to your users – The production tool, so producers can amend the data • So consider them for projects that have a long shelf life • Because databases persist – the data gets old – So someone needs to manage and update the database

Slide 6: Acquire and build databases

Slide 7: Interactive school guides

Slide 8: Create image maps about your data

Slide 9: Property appraisal, home sale data

Slide 10: Police reports and crime statistics

Slide 11: Voter and election guides Powered by TheVoterGuide.org

Slide 12: Public employee salary databases

Slide 13: Mashups • You can do your own map mashup at Atlas • Follow mashup development at Programmable Web

Slide 14: Carefully consider what to database or map • Consider the time investment • Ask: What job do you want to get done for your user? • Crudely sketch out exactly your idea on a piece of paper then • …walk through exactly the clicks a user will take to get your stuff • Maps – Poor: DISD bond issue map – Good: Home sales map • GuideLive.com – Listings

Slide 15: Data structure powers mashups

Slide 16: Of course our stories don’t mash up very well. They aren’t data! Yes they are. And we ignore that fact at our peril.

Slide 17: Our key competitive differentiator is the data we gather every single day • Our articles • Images • Ads • Classifieds • Listings • Video • User Content • Archives • Databases • Blogs

Slide 18: But it’s locked, hidden and unorganized • Names/Places – Coppell – Grapevine Mills Mall – Coppell High – OSU – Travis Masters – Emily Coker – Sarah Sanders • Dates/Facts – Audi slid under 18-wheeler – Friday morning • Concepts – Suicide – National Merit Scholarship – Motocross – Untimely deaths

Slide 19: So how do we get it unlocked and organized? We need some data about this data

Slide 20: So how do we get it unlocked and organized? We need some metadata: data about this data

Slide 21: Metadata tells us what an article is about

Slide 22: 1) So we first extract key entities and concepts • Names/Places – Coppell – Grapevine Mills Mall – Coppell High – OSU – Travis Masters – Emily Coker – Sarah Sanders • Dates/Facts – Audi slid under 18-wheeler – Friday morning • Concepts – Suicide – National Merit Scholarship – Motocross – Untimely deaths

Slide 23: 2) Then filter them for relevance • Names/Places – Coppell – Grapevine Mills Mall – Coppell High – OSU – Travis Masters – Emily Coker – Sarah Sanders • Dates/Facts – Audi slid under 18-wheeler – Friday morning • Concepts – Suicide – National Merit Scholarship – Motocross – Untimely deaths

Slide 24: 3) And finally relate them to standard categories • Names/Places – Coppell → Towns > Coppell – Grapevine Mills Mall → Location > Grapevine Mills – Coppell High → High Schools > Coppell – OSU → OSU – Travis Masters → People > Travis Masters – Emily Coker → People > Emily Coker – Sarah Sanders → Sarah Sanders • Dates/Facts – Audi slid under 18-wheeler → Accidents > Auto/Truck – Friday morning → April 25, 2008 • Concepts – Suicide → Suicide – National Merit Scholarship → National Merit Scholarship – Motocross → Motocross – Untimely deaths → Teens > Deaths
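
The three steps (extract, filter, relate to standard categories) can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the process any vendor or newsroom actually uses: the function names, the substring matching, the "mentioned at least twice" relevance rule and the sample text are all hypothetical. Only the category mappings are taken from the slide.

```python
# Lookup table: raw term -> standard category path (mappings from the slide).
# Storing them in a dict is an assumption made for this sketch.
STANDARD_CATEGORIES = {
    "Coppell": "Towns > Coppell",
    "Grapevine Mills Mall": "Location > Grapevine Mills",
    "Coppell High": "High Schools > Coppell",
    "Motocross": "Motocross",
}


def extract_terms(text, known_terms):
    """Step 1: find known terms in the text (naive, case-insensitive substring match)."""
    lowered = text.lower()
    return [term for term in known_terms if term.lower() in lowered]


def filter_relevant(terms, text, min_mentions=2):
    """Step 2: keep terms mentioned at least min_mentions times (hypothetical rule)."""
    lowered = text.lower()
    return [term for term in terms if lowered.count(term.lower()) >= min_mentions]


def categorize(terms, categories):
    """Step 3: map each term to its standard category; unmapped terms pass through."""
    return [categories.get(term, term) for term in terms]


def tag_article(text):
    """Run all three steps and return the metadata for one article."""
    terms = extract_terms(text, STANDARD_CATEGORIES)
    relevant = filter_relevant(terms, text)
    return categorize(relevant, STANDARD_CATEGORIES)


# Made-up sample text, written only to exercise the sketch.
SAMPLE = (
    "A motocross event near Grapevine Mills Mall drew crowds from Coppell. "
    "Coppell officials said the motocross course will reopen."
)
print(tag_article(SAMPLE))
```

Here the mall is extracted in step 1 but dropped in step 2 because it is mentioned only once. A real system would need something smarter than counting mentions.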

Slide 25: The standard categories are… • Names/Places – Towns > Coppell – Location > Grapevine Mills – High Schools > Coppell – OSU – People > Travis Masters – People > Emily Coker – Sarah Sanders • Dates/Facts – Accidents > Auto/Truck – April 25, 2008 • Concepts – Suicide – National Merit Scholarship – Motocross – Teens > Deaths

Slide 26: tax·on·o·my Pronunciation: tak-sä-nə-mē Function: noun Etymology: French taxonomie, from tax- + -nomie -nomy Date: circa 1828 1: the organizational structure of categories and attributes that define how you classify, describe and manage your data

Slide 27: Taxonomy is the card catalog of our content A taxonomy organizes, classifies and relates our content • A set of index terms that we manage and apply to each piece of content • Terms are hierarchical: Large categories split into specific sub-categories • Terms are cross-referenced, so if you look for “bucket,” you also get “pail.” Structuring our content just like data
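
A minimal Python sketch of those two properties, hierarchy and cross-referencing, might look like the following. The category names "Household" and "Containers", the two dictionaries and the function names are assumptions made up for illustration; only the bucket/pail pairing comes from the slide.

```python
# Hierarchy: each term points to its parent category (hypothetical categories).
PARENT = {
    "Pail": "Containers",
    "Containers": "Household",
}

# Cross-references: alternate words that resolve to a preferred index term.
SYNONYMS = {
    "bucket": "Pail",
}


def canonical(term):
    """Resolve a user's word to the preferred index term, or None if unknown."""
    key = term.strip().lower()
    if key in SYNONYMS:
        return SYNONYMS[key]
    for name in set(PARENT) | set(PARENT.values()):
        if name.lower() == key:
            return name
    return None


def path(term):
    """Return the full category path for a term, from broadest to most specific."""
    node = canonical(term)
    if node is None:
        return []
    chain = [node]
    while chain[-1] in PARENT:
        chain.append(PARENT[chain[-1]])
    return list(reversed(chain))


print(" > ".join(path("bucket")))
```

Looking up "bucket" and looking up "pail" both land on the same index term, so content tagged once is found either way.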

Slide 28: So why do I need it? • Faceted navigation I can click in level by level to find something (Cuisine>Asian>Chinese>Szechwan)

Slide 29: So why do I need it? • Faceted navigation I can click in level by level to find something (Cuisine>Asian>Chinese>Szechwan) • Much better site search by enabling search boxes that can restrict a search to terms of a particular type or context

Slide 30: So why do I need it? • Faceted navigation I can click in level by level to find something (Cuisine>Asian>Chinese>Szechwan) • Much better search by enabling search boxes that can restrict a search to terms of a particular type or context • Related information that I may not have known about (articles, photo galleries, other listings)

Slide 31: So why do I need it? • Faceted navigation I can click in level by level to find something (Cuisine>Asian>Chinese>Szechwan) • Much better search by enabling search boxes that can restrict a search to terms of a particular type or context • Related information that I may not have known about (articles, photo galleries, other listings) • Multiple attributes for listings (Parking, Ambience)

Slide 32: So why do I need it? • Faceted navigation I can click in level by level to find something (Cuisine>Asian>Chinese>Szechwan) • Much better search by enabling search boxes that can restrict a search to terms of a particular type or context • Related information that I may not have known about (articles, photo galleries, other listings) • Multiple attributes for listings (BYOB, Outdoor Dining) • Higher search ranking (SEO) on search engines for topic subjects, listings, classified categories Source: Chicago Tribune
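
Faceted navigation and listing attributes can be sketched roughly as below, in Python. The restaurant records, field names and function names are hypothetical. The sketch only shows how a category path plus a set of attributes is enough to support "click in level by level" browsing.

```python
# Made-up listings: each has a category path and a set of attributes.
LISTINGS = [
    {"name": "Restaurant A",
     "cuisine": ("Cuisine", "Asian", "Chinese", "Szechwan"),
     "attributes": {"BYOB"}},
    {"name": "Restaurant B",
     "cuisine": ("Cuisine", "Asian", "Thai"),
     "attributes": {"Outdoor Dining"}},
    {"name": "Restaurant C",
     "cuisine": ("Cuisine", "Italian"),
     "attributes": {"BYOB", "Outdoor Dining"}},
]


def drill_down(listings, path, required_attributes=frozenset()):
    """Return names of listings under the given path that have every required attribute."""
    path = tuple(path)
    return [
        item["name"]
        for item in listings
        if item["cuisine"][:len(path)] == path
        and set(required_attributes) <= item["attributes"]
    ]


def next_facets(listings, path):
    """Return the sub-categories a user could click next from the given path."""
    path = tuple(path)
    depth = len(path)
    return sorted({
        item["cuisine"][depth]
        for item in listings
        if item["cuisine"][:depth] == path and len(item["cuisine"]) > depth
    })


print(next_facets(LISTINGS, ("Cuisine",)))
print(drill_down(LISTINGS, ("Cuisine", "Asian")))
print(drill_down(LISTINGS, ("Cuisine",), {"BYOB"}))
```

The same structure serves both uses: the path drives the level-by-level clicks, and the attribute set lets a search box restrict results to, say, BYOB places.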

Slide 33: ‘Hot Topic’-driven content pages provide new opportunities for keyword-targeted advertising, boost SEO rankings, increase site traffic and drive more engagement

Slide 35: Embedded links increase page views and boost SEO, which generates more site traffic

Slide 37: Geographic terms can power data mapping

Slide 38: …or create customized alerts where you define the geography where you want notifications to come from
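
One simple way to picture a geography-based alert is a user-drawn rectangle checked against each story's coordinates. The Python sketch below assumes stories already carry a latitude and longitude from geotagging. The class, the function, the field names and all coordinates are made up for illustration, and a rectangle is only one possible way to define an alert area.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BoundingBox:
    """A rectangular alert area defined by a user (coordinates in decimal degrees)."""
    south: float
    west: float
    north: float
    east: float

    def contains(self, lat, lon):
        """True if the point falls inside the rectangle, edges included."""
        return self.south <= lat <= self.north and self.west <= lon <= self.east


def alerts_for(box, stories):
    """Return headlines of geotagged stories that fall inside the user's area."""
    return [s["headline"] for s in stories if box.contains(s["lat"], s["lon"])]


# Made-up coordinates and headlines, for illustration only.
MY_AREA = BoundingBox(south=10.0, west=20.0, north=11.0, east=21.0)
STORIES = [
    {"headline": "Road closure downtown", "lat": 10.5, "lon": 20.5},
    {"headline": "School board meeting", "lat": 12.0, "lon": 20.5},
]
print(alerts_for(MY_AREA, STORIES))
```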

Slide 39: Whatever you call it… it’s about describing and classifying our content • Taxonomy – our standard, hierarchical categories • Metadata – the keywords describing a piece of content • Structured data – Information that’s been organized as above

Slide 40: How do I get structured data? • You can do it by hand – Librarians and Web producers are doing it every day – But they can only do so much – Should reporters be adding metadata to every story? – Should line editors and/or copy editors add metadata? • You can use technology – IPTC – AP Digital Exchange – Inform – Teragram – NStein – MetaCarta (geotagging) – Serra Media (geotagging) – Generate Inc. (business data)

Slide 41: Elements of a data strategy

Slide 42: Organizing for structured data • Do you need a taxonomy and data strategy? • Audit your newsroom datastream • Do you need a data coordinator? • Gannett’s Data Desks – Brought everyone who ‘does data’ together: Agate clerks, librarians, CAR staff – Responsibilities: Acquiring data, creating databases, programming them for interactivity, training for building spreadsheets • Do you need self-service ways the public can give you info? – Yes: Web forms – No: Faxes, phones, notes on paper, lists on someone’s computer

Slide 43: Reporting for structured data • Reporting with data in mind means you gather the same fact in the same way every time – Shoot every photo exactly the same way – Ask the same question of every interviewee – Find out all the same info from every venue • Save the data • Input the data alongside the story • Look for databases you can bring back with your story Example: Bluegrass Instruments
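
Gathering "the same info from every venue" amounts to filling in the same fields every time. A rough Python sketch of that check follows. The field list, the function name and the sample record are hypothetical, chosen only to show how a gap in the reporting could be flagged before the record is saved.

```python
# Hypothetical list of facts a reporter collects from every venue.
REQUIRED_FIELDS = ("name", "address", "phone", "hours")


def missing_fields(record):
    """Return the required fields that are absent or blank in a venue record."""
    missing = []
    for field in REQUIRED_FIELDS:
        value = record.get(field)
        if value is None or not str(value).strip():
            missing.append(field)
    return missing


# Made-up record with two gaps: a blank phone number and no hours.
venue = {"name": "Example Hall", "address": "1 Example St.", "phone": " "}
print(missing_fields(venue))
```

A Web form that runs a check like this is one way to make sure the data arrives in the same shape whether a reporter or a member of the public submits it.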

Slide 44: Writing/editing for structured data • Editors add and apply keywords and standard categories • Bloggers (already?) tag and categorize blog posts – Should you have standard categories or go with the ‘folksonomy?’

Slide 45: Discussion http://www.slideshare.net/ajmoor