Overview of Bowker's Metadata Processes


Published in: Education, Business

  1. 1. Overview of Bowker’s Metadata Processes
     Patricia Payton, Senior Director, Publisher Relations & Content Development for Bowker
     908.219.0241 Patricia.Payton@bowker.com
  2. 2. Agenda
     • Bowker’s role in the marketplace
       – Customer workflows
       – Selected client lists
       – Bowker products
     • Bowker metadata management
       – Aggregated and enhanced content
       – Value added processes
       – Processing of publisher data and audits
       – Testing new data feeds
       – Publisher outreach priorities
     • Next cooperative steps
  3. 3. Bowker’s Role in the Marketplace
  4. 4. Representative Clients by Market Segment
     • Publishers: Random House, HarperCollins, Hachette, Elsevier, Macmillan, Cengage, Wiley
     • Retailers: Barnes & Noble, Follett College Stores, Indigo, Abebooks.com, Hastings, Sony, Apple, eBay
     • Libraries: New York Public, Brooklyn Public, Chicago Public, Johns Hopkins University, Harvard, Yale, Princeton, Queens Borough Public, Columbia University, MIT
     • Schools: State of Oklahoma, NYC DOE
     • Web services: Anthology, Blackboard.com, New York Times, EdMap
  5. 5. Bowker eBook Customers Today
     • 45 customers currently purchase eBook data feeds
       – Borders, B&N, SWETS, NY Times
     • Libraries need a central repository to better identify all eContent
     • All products incorporate eBook metadata
       – Publisher data represents 82% of collection
       – Aggregator and conversion house data also stored
  6. 6. Search & Discovery Products: Bowker Books In Print®
     • > 1,200 retail & library clients (> 10K locations) make buying decisions using this online bibliographic reference tool
     • Content is aggregated and standardized
     • > 20M records; > 13M “active” book records
  7. 7. Search & Discovery Products: Bowker® Syndetic Solutions
     • Library catalog (OPAC) enrichment service
     • 2.3B queries/month; > 11M content elements; updated weekly
     • Cover images, Tables of Contents, Summaries, Reviews, First chapters, Author notes, Awards, & Knowledge Profiles
     • Includes books, videos & music in English, Spanish, German, Swedish & Italian
     • Analytics show users search the “long tail”: 29K hits, most requested title 18
  8. 8. Traditional vs. Content Searching
     • Traditional: searches select metadata fields only
     • Content: searches all available content
  9. 9. Search & Discovery Products: Bowker Data Licensing
     • Embed data in customer acquisition & workflow processes
     • 60+ clients including major retailers, small startups, eBook platforms, and search engines
     • User controls processing rules
     • Works via pull or push methods
  10. 10. Metadata Management: Customer and Product Needs
      [Diagram with four elements: Audits and Gap Filling; Aggregated Metadata; Value Added Processes; Enhanced Content]
  11. 11. Aggregated & Enhanced Content
      Content from the Supply Chain
      • Data Feeds – National Libraries, Publishers, & Distributors
      • Price & Availability Notifications – Wholesalers
      Licensed Content
      • Full Text Reviews – PW, LJ and SLJ, NYT (adding UK sources in 2011)
      • Review citations – 10 trusted sources
        – Included on > 145K ISBNs
        – NY Times Book Review, Los Angeles Times, San Fran. Chronicle, and USA Today
      Bowker Created Content
      • Author Biographies – > 80,000 authors
      • Bestseller lists – 23 sources
        – Including New York Times, Los Angeles Times, USA Today, and The Wall Street Journal
        – Included on > 225K ISBNs, including Audio, Video, Print and E-Book
        – > 100 Years At a Glance synopses
        – Detail listings for PW and NY Times on position and length on list
      • Media mentions – 25 sources
        – Business Week, Entertainment Weekly, Time, Good Morning America, Oprah, NPR
      • Awards – > 400 sources
      • Knowledge Profiles – 225K unique across all subjects
        – Genre and sub-genre, Author, Title, Characters of book and traits, themes, keyword related
        – U.S. titles only
  12. 12. Knowledge Profile Creation
  13. 13. Value Added Processes
      Subject Classification
      • Bowker stores and forwards publisher-assigned BISAC subject codes
      • Many of Bowker’s customers use our more specific subject terms
      • Bowker’s scheme has > 80K Bowker subject terms compared to BISAC’s 3,700 codes
      • All Bowker codes are mapped to BISAC and BIC codes for easy updating
      Title Linking
      • ISBNs of the same intellectual work are linked
      • Title, subtitle, and first contributor matches are given a unique title record number
      • Unique title record number links all editions to value-added data such as:
        – Bowker subject classifications
        – Reviews & review citations
        – Awards
        – Media mentions
        – Bestseller notations
        – Chapter excerpts
        – Dewey, Library of Congress and British Library classification schemes
        – Lexile measures from Metametrics (for children’s books)
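The title-linking rule on this slide (matching title, subtitle, and first contributor share one title record number) can be sketched as below. This is a minimal illustration, not Bowker's actual matching logic: the record layout, the `normalize` rule (lowercasing and collapsing whitespace), the sequential numbering, and the ISBN placeholders are all assumptions made for the example.

```python
from collections import defaultdict


def normalize(text):
    """Lowercase and collapse whitespace (an assumed normalization rule)."""
    return " ".join((text or "").lower().split())


def link_titles(records):
    """Group ISBNs under one title record number per matching
    (title, subtitle, first contributor) key."""
    key_to_number = {}          # match key -> assigned title record number
    linked = defaultdict(list)  # title record number -> list of ISBNs
    for rec in records:
        contributors = rec.get("contributors") or [""]
        key = (
            normalize(rec.get("title")),
            normalize(rec.get("subtitle")),
            normalize(contributors[0]),
        )
        # Reuse the number for a known key, otherwise assign the next one.
        number = key_to_number.setdefault(key, len(key_to_number) + 1)
        linked[number].append(rec["isbn"])
    return dict(linked)


# Hypothetical records: a print edition and an eBook of the same work.
editions = [
    {"isbn": "ISBN-PRINT", "title": "Moby Dick", "contributors": ["Herman Melville"]},
    {"isbn": "ISBN-EBOOK", "title": "moby  dick", "contributors": ["herman melville"]},
]
print(link_titles(editions))
```

Enrichment such as reviews or awards could then be stored once against the title record number and reached from any linked ISBN.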
  14. 14. Linking Enriched Content Across Formats
  15. 15. Linking eBook Metadata
      • Feature vendor specific information
      • Display of agency and institutional pricing
  16. 16. Processing of Publisher Data
      File Process
      • Process goal is within 48 hours of receipt
      • Automated process pulls from FTP and submits each file
      • Data locks down 90 days past publication date
        – Only updates to status, returns, and price related fields are allowed
      Individual file audit reports run
      • Exclude Report
        – ISBN is invalid (e.g., 9 digits, or check-digit will not validate)
        – Publisher is not properly linked to current Distributor
        – New Imprint for publisher is in file but not in Bowker’s Publisher Authority database
        – ISBN status is “No Longer Stocked by Us” or “Refer to another Supplier” (meaning the supplier of the file no longer carries that ISBN)
      • Title Change Report
      • Contributor Change Report
      Processes vary for print, eBooks, and cover images
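Some of the Exclude Report conditions can be illustrated with a short sketch. The check-digit arithmetic follows the standard ISBN-10 and ISBN-13 rules; everything else (the record fields, the `exclude_reasons` helper, the `known_imprints` set standing in for the Publisher Authority database) is a hypothetical layout for illustration, and the publisher-to-distributor link check is left out because the slide does not describe how it works.

```python
DIGITS = set("0123456789")

# Status values quoted on the slide.
EXCLUDED_STATUSES = {"No Longer Stocked by Us", "Refer to another Supplier"}


def is_valid_isbn(isbn):
    """Return True if the string is a well-formed ISBN-10 or ISBN-13."""
    s = isbn.replace("-", "").replace(" ", "").upper()
    if len(s) == 13 and set(s) <= DIGITS:
        # ISBN-13: weights alternate 1, 3; the total must be divisible by 10.
        total = sum(int(ch) * (1 if i % 2 == 0 else 3) for i, ch in enumerate(s))
        return total % 10 == 0
    if len(s) == 10 and set(s[:9]) <= DIGITS and (s[9] in DIGITS or s[9] == "X"):
        # ISBN-10: weights run 10 down to 1; a final "X" counts as 10.
        values = [int(ch) for ch in s[:9]] + [10 if s[9] == "X" else int(s[9])]
        total = sum(v * (10 - i) for i, v in enumerate(values))
        return total % 11 == 0
    return False  # wrong length (e.g. 9 digits) or illegal characters


def exclude_reasons(record, known_imprints):
    """Collect the reasons a record would land on an exclude report."""
    reasons = []
    if not is_valid_isbn(record.get("isbn", "")):
        reasons.append("invalid ISBN")
    if record.get("imprint") not in known_imprints:
        reasons.append("imprint not in publisher authority list")
    if record.get("status") in EXCLUDED_STATUSES:
        reasons.append("supplier no longer carries ISBN")
    return reasons


record = {"isbn": "030640615", "imprint": "Example Imprint", "status": "Active"}
print(exclude_reasons(record, known_imprints={"Example Imprint"}))
```

Returning a list of reasons rather than a single boolean keeps one record from hiding a second problem behind the first.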
  17. 17. Database Audit Processes
      Daily
      • Query/review prices over $400
      Weekly
      • High profile titles
      Monthly
      • Un-fielded data
      • Upper case titles
      • Undefined articles
      • Bestselling and classic authors are cleaned
      • Bad contributor cleaning
      • Research ISBNs with “untitled” titles
      • Remove pipe characters, carriage returns and line feeds from titles and contributors
      On demand
      • Review for timeliness of data
      • Bad publisher/imprint symbols
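A few of these audits are simple enough to sketch. The example below covers the price query, the upper-case and “untitled” title checks, and the character cleanup. The record fields and flag names are hypothetical, and replacing the unwanted characters with a space before collapsing whitespace is an assumption (the slide only says they are removed).

```python
import re


def clean_text(value):
    """Replace pipe characters, carriage returns and line feeds with a space,
    then collapse runs of whitespace (assumed cleanup rule)."""
    without_debris = re.sub(r"[|\r\n]", " ", value)
    return re.sub(r"\s+", " ", without_debris).strip()


def audit_flags(record, price_threshold=400.0):
    """Return the audit flags that apply to one record."""
    flags = []
    title = record.get("title", "")
    price = record.get("price")
    if price is not None and price > price_threshold:
        flags.append("price_over_threshold")   # daily price review
    if title and title == title.upper() and any(c.isalpha() for c in title):
        flags.append("upper_case_title")       # monthly upper case check
    if "untitled" in title.lower():
        flags.append("untitled_title")         # monthly "untitled" research
    return flags


record = {"title": "WAR|AND\r\nPEACE", "price": 425.00}
record["title"] = clean_text(record["title"])
print(record["title"], audit_flags(record))
```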
  18. 18. Testing Process for New Feeds
      Publisher Relations
      • Validation of ONIX files
      • Check required data fields present
      • Brief quality scan
      • Determine quantity of records supplied
      • Write script for conversion of Excel files to ONIX
      Data Integration
      • Map file imprint/publishers to our database
      • Load data to test system
      • Work excludes
      • Supply audit of records to QA
      Quality Assurance
      • In-depth quality review of all titles
      • Compare file to data already in BIP
      • Review completeness of data
      • For Excel files, verify scripting was correct
      Production
      • FTP account set up
      • Statement of Use supplied to publisher
      • Cover images requested
      Timing notes
      • File can move on in process or be returned to publisher
      • 1 week on average
      • 2 weeks on average to complete the testing process
      • 6 weeks average wait time due to files in queue
  19. 19. Publisher Outreach Priorities
      Gap filling
      • Forthcoming titles (i.e. price, annotation, and cover image at 60 days prior to publication)
      • Validating that older titles (pre 2000) that are still active in our system are still available
      • Identifying issues around items lacking prices in our system
        – Including items that were cancelled, are not for sale separately, or are no longer distributed
      Establishing eBook metadata feeds
      • With publishers, eBook aggregators and distributors
      Free full content indexing service
      • Whereby Bowker extracts keywords and phrases with relevancy and frequency scores to embed behind the scenes in products
      Understanding the use of ISBNs for digital products
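The forthcoming-titles check (price, annotation, and cover image in place 60 days before publication) could be approximated as follows. The field names and record shape are hypothetical, and the sketch treats any empty or zero value as missing, which is an assumption rather than a documented rule.

```python
from datetime import date, timedelta

# Fields the slide expects to be present 60 days before publication.
REQUIRED_FIELDS = ("price", "annotation", "cover_image")


def forthcoming_gaps(records, today, window_days=60):
    """Map ISBN -> missing fields for titles publishing within the window."""
    cutoff = today + timedelta(days=window_days)
    gaps = {}
    for rec in records:
        if today <= rec["pub_date"] <= cutoff:
            # Empty strings, None and 0 all count as missing (assumption).
            missing = [f for f in REQUIRED_FIELDS if not rec.get(f)]
            if missing:
                gaps[rec["isbn"]] = missing
    return gaps


# Hypothetical record publishing 45 days out with no cover image.
records = [
    {"isbn": "ISBN-A", "pub_date": date(2011, 2, 15), "price": 24.95,
     "annotation": "A short description.", "cover_image": None},
]
print(forthcoming_gaps(records, today=date(2011, 1, 1)))
```

The resulting mapping could feed an outreach list grouped by publisher.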
  20. 20. Next Cooperative Steps
      • Data Submission Guides
      • Additional documents available
        – Data integrity document (more detail on audit reports and processes)
        – Publisher profile data (details on current state of your data)
      • Exchange contact details for particular types of issues
      • Discuss file format and data fields best for your title set
      • Set date for test file submission
  21. 21. About Bowker
      Bowker is the world’s leading provider of bibliographic information management solutions designed to help publishers, booksellers, and libraries better serve their customers. The company is focused on developing various tools and products that make books easier for people to discover, evaluate, order, and experience, as well as providing services to publishers that help them better understand and meet the interests of readers worldwide. Bowker is an affiliated business of ProQuest and is headquartered in New Providence, New Jersey, with additional operations in England and Australia. For more information, please visit www.bowker.com. Follow us on Twitter @DiscoverBowker