Innovation and the STM publisher of the future (SSP IN Conference 2011)

Transcript

  • 1. Innovation and the STM publisher of the future. Bradley P. Allen, Elsevier Labs. Innovation Session, SSP IN Conference 2011, Arlington, VA, USA, 2011-09-19
  • 2. Peak physical media • “Music Sales”, New York Times, 1 August 2009. http://www.nytimes.com/imagepages/2009/08/01/opinion/01blow.ready.html • “Initial Circs per student”, William Denton, 31 January 2011. http://www.miskatonic.org/2011/01/31/initial-circs-student • “Rise of e-book Readers to Result in Decline of Book Publishing Business”, Steven Mather, iSuppli, 28 April 2011. http://www.isuppli.com/Home-and-Consumer-Electronics/News/Pages/Rise-of-e-book-Readers-to-Result-in-Decline-of-Book-Publishing-Business.aspx
  • 3. A simple model of the evolution of publishing
      – Print era (1600s to 1980): packaged as books and articles; physically distributed; access and discovery through libraries
      – Digital Library era (1980 to 2010s): packaged as books and articles; digitally distributed; access and discovery through search engines
      – Platform-as-a-Service era (2010s): packaged as apps and APIs; digitally distributed; access and discovery through social networks
  • 4. Facets of STM publishing in the PaaS era
      – Process Type: Acquisition; Extract, Load and Transform; Enhancement; Indexing; Composition; Discovery and Access; Delivery
      – Entity: Author, Supplier, Web site, Typesetter, Automated process, Subject matter expert, Search engine, Content repository, Entity registry, Product catalog, Editor, Reviewer, User, Designer, Developer, E-book, Mobile app, Mobile-enhanced Web site, API
      – Activity: Submitting, Crawling, Syndicating, Formatting, Mapping, Cleansing, Indexing, Querying, Updating, Storing, Annotating, Subject tagging, Classification, Entity recognition, Entity extraction, Fact extraction, Clustering, Aggregating, Ordering, Summarizing, Filtering, Analysis, Data science, Rendering, Design, Publishing, Accessing, Retrieving, Deleting
      – Content Type: Article, Book, Media object, Entity record, Asset metadata, Relational metadata, Provenance metadata, Usage metadata, Taxonomy, Ontology, User-generated content
  • 5. STM publishing as business intelligence Surajit Chaudhuri, Umeshwar Dayal, and Vivek Narasayya. 2011. An overview of business intelligence technology. Commun. ACM 54, 8 (August 2011), 88-98. http://doi.acm.org/10.1145/1978542.1978562
  • 6. Some scenarios to compare the two digital eras
      – Scenario: A new medical term relevant to an emerging healthcare issue (e.g. a new type of avian flu virus) needs to be incorporated into a search index immediately
          Digital Library era: Organizational governance issues about how taxonomies are to be updated, coupled with manually intensive workflows and ad hoc approaches to content tagging, inhibit rapid response
          Platform-as-a-Service era: A single, automated and standardized taxonomy management and content enhancement workflow allows rapid and timely update of search applications
      – Scenario: Application developers want to mash up epidemiological data with medical journal articles to create a topic-specific Web resource
          Digital Library era: Data silos without easy means of programmatic access by developers, coupled with governance and business model questions, inhibit data reuse
          Platform-as-a-Service era: Content API and single-point-of-access repository allow data and content to be accessed, discovered and reused across multiple applications
      – Scenario: Digital library developers want to stage content into a single repository for unified search index generation
          Digital Library era: Duplication of core content leads to synchronization and quality control issues
          Platform-as-a-Service era: Consolidation of duplicate repositories into a single point of truth across all content, accessible and discoverable through a Content API, eliminates the need for duplication and synchronization
      – Scenario: Third-party solutions providers want to integrate content (e.g. tagged medical journal articles, medical taxonomies) into point-of-care solutions
          Digital Library era: No standards, no APIs for point-of-care content integration across all content and data formats
          Platform-as-a-Service era: Standards and APIs that scale across multiple partners, for all content types, for all delivery
      – Scenario: Publishers want to deliver their content to tablets and e-readers in delivery formats that take advantage of the displays and interaction modalities on those devices
          Digital Library era: No clear standard or approach for targeting emerging eReader and tablet devices; multiple and divergent approaches leading to siloed solutions and duplication of effort
          Platform-as-a-Service era: Web and industry standards for eReader and tablet devices supported as part of standard automated processing into delivery-channel-specific formats, regularly updated and exposed through a Content API
      – Scenario: Journal publisher wants to integrate content enhancements across multiple subject matter areas to add value to products leveraging Article of the Future technology
          Digital Library era: No single point of access to content enhancements, no standards for content enhancement suppliers and partners to deliver enhancements for integration
          Platform-as-a-Service era: Easy access to multiple opportunities for content enhancements embedded in standard next-generation article formats and provided using standard content enhancement formats
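The first scenario on this slide (a new medical term that must reach the search index quickly) can be illustrated with a minimal sketch. It assumes a single automated workflow in which one taxonomy change is followed by a rerun of the same tagging and indexing step. The taxonomy, the article texts, and the function names `tag` and `build_index` are all hypothetical, and the substring matching is deliberately naive.

```python
from collections import defaultdict

# Hypothetical taxonomy: concept label -> set of surface terms (illustrative only).
taxonomy = {
    "Influenza": {"influenza", "flu"},
}

# Hypothetical content items keyed by identifier.
articles = {
    "a1": "Seasonal flu vaccination outcomes in adults",
    "a2": "Early reports of H5N1 cases among poultry workers",
}


def tag(text, taxonomy):
    """Return the sorted concepts whose terms occur in the text (naive substring match)."""
    lowered = text.lower()
    return sorted(c for c, terms in taxonomy.items() if any(t in lowered for t in terms))


def build_index(articles, taxonomy):
    """Build an inverted index from concept to the sorted ids of articles tagged with it."""
    index = defaultdict(list)
    for article_id in sorted(articles):
        for concept in tag(articles[article_id], taxonomy):
            index[concept].append(article_id)
    return dict(index)


index_before = build_index(articles, taxonomy)

# A new term arrives: one taxonomy change, then the same automated workflow reruns.
taxonomy["Avian influenza"] = {"h5n1", "avian flu"}
index_after = build_index(articles, taxonomy)
```

In this sketch the second article only becomes findable under a concept after the taxonomy update, with no manual re-tagging step in between.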
  • 7. Goals for the publisher of the future • Craft content acquisition, production and management systems that support with equal capability and flexibility a broad range of content types and delivery channels • Make it easy for authors, editors and reviewers to work with bundles of content and data in the aggregate • Make it easy to discover and access, across all content assets, information in fragments smaller than the unit of publication • Then make it easy to aggregate and compose these fragments into new products and services • Leverage the tremendous power of Web architectural standards and formats to increase the ease of content integration and interoperability
  • 8. New requirements for content management
      • Broad range of content types – Must treat as first-class objects video, audio, images, datasets, metadata and knowledge organization systems in addition to articles and books
      • Standards-based – Web-standard formats to support ease of integration and interoperability
      • Fine-grained – Must be decomposable into and addressable in fragments smaller than the unit of publication; e.g., down to the level of specific words, phrases, images, table cells in articles or book chapters, key frames and segments in videos
      • Discoverable – Must be easily located across all levels of granularity
      • Accessible – Must be easily accessed through content creation, retrieval, update and deletion (CRUD) services
      • Flexible – New content types and associated schemas must be easily added through configuration
      • Reusable – It must be efficient for product developers to aggregate and compose content fragments into new products
      • Modifiable – Support the enhancement and correction of content at any time following creation
      • Broad range of delivery formats – Content standards and services must support fulfillment, delivery and presentation across desktop, notebook, tablet and mobile computing devices
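Several of these requirements (CRUD access, fine-grained addressing, reuse by composition) can be shown together in a small in-memory sketch. The `ContentStore` class, its method names, and the `item#fragment` addressing scheme are assumptions made for illustration; the slides do not describe a specific API.

```python
class ContentStore:
    """Minimal in-memory sketch of a CRUD content service (hypothetical, not a real API)."""

    def __init__(self):
        self._items = {}

    def create(self, item_id, content_type, fragments):
        # fragments maps fragment id -> fragment text, so each part is addressable.
        if item_id in self._items:
            raise KeyError(f"{item_id} already exists")
        self._items[item_id] = {"type": content_type, "fragments": dict(fragments)}

    def retrieve(self, address):
        """Resolve 'item' to the whole item, or 'item#fragment' to one fragment."""
        item_id, _, fragment_id = address.partition("#")
        item = self._items[item_id]
        return item["fragments"][fragment_id] if fragment_id else item

    def update(self, address, text):
        """Correct or enhance a single fragment at any time after creation."""
        item_id, _, fragment_id = address.partition("#")
        if not fragment_id:
            raise ValueError("update expects an 'item#fragment' address")
        self._items[item_id]["fragments"][fragment_id] = text

    def delete(self, item_id):
        del self._items[item_id]

    def compose(self, addresses):
        """Aggregate fragments from any items into a new ordered list."""
        return [self.retrieve(a) for a in addresses]


store = ContentStore()
store.create("article-1", "article", {"p1": "Methods paragraph.", "t1r2c3": "42.0"})
store.create("video-1", "video", {"seg2": "Segment 00:30-01:10"})
store.update("article-1#t1r2c3", "42.5")
product = store.compose(["article-1#t1r2c3", "video-1#seg2"])
```

Here a table cell from an article and a segment from a video are addressed individually and combined into a new product, which is the kind of sub-publication reuse the slide calls for.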
  • 9. Leveraging Web standards for sharing 1. Use URIs to name things 2. Use HTTP URIs so they can be looked up 3. Return useful data when things are looked up 4. Include links to other things in the returned data “Linked data is just a term for how to publish data on the web while working with the web. And the web is the best architecture we know for publishing information in a hugely diverse and distributed environment, in a gradual and sustainable way.” Tennison J, 2010. Why Linked Data for data.gov.uk? http://www.jenitennison.com/blog/node/140 Shotton D, Portwin K, Klyne G, Miles A, 2009. Adventures in Semantic Publishing: Exemplar Semantic Enhancements of a Research Article. PLoS Comput Biol 5(4): e1000361. doi:10.1371/journal.pcbi.1000361
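The four linked data rules on this slide can be sketched without any network access. In the example below a dictionary stands in for a Web server, `look_up` stands in for an HTTP GET, and the `http://example.org/id/` namespace, the resource records, and the relation names are all hypothetical.

```python
import json

BASE = "http://example.org/id/"  # hypothetical namespace, not a real service

# Rules 1 and 2: things are named with HTTP URIs.
RESOURCES = {
    BASE + "article/1": {
        "title": "An example research article",
        "links": {"author": BASE + "person/1", "about": BASE + "concept/influenza"},
    },
    BASE + "person/1": {"name": "A. Author", "links": {"wrote": BASE + "article/1"}},
    BASE + "concept/influenza": {"label": "Influenza", "links": {}},
}


def look_up(uri):
    """Rule 3: return useful data for a URI (stands in for an HTTP GET)."""
    return json.dumps({"id": uri, **RESOURCES[uri]}, sort_keys=True)


def follow(uri, relation):
    """Rule 4: follow a link in the returned data to another thing."""
    doc = json.loads(look_up(uri))
    return json.loads(look_up(doc["links"][relation]))


author = follow(BASE + "article/1", "author")
```

Because every record carries links expressed as URIs, a client can move from an article to its author and back again using only `look_up`, without knowing the store's internal layout.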
  • 10. From books and articles to evolving research objects (Diagram labels: Linked data; Article, Entity record and Media object, each with Relational metadata; stages Acquire; Transform, Enhance, Compose; Deliver)
  • 11. Leveraging consumer Web innovations • Emergent technologies driven by consumer Web applications emphasize design choices that focus on delivering cheap, robust and scalable Web applications – Schemaless document stores provide read/write at Web scale with support for analytics • For more dynamic, fine-grained content and linked data • For easier usage and citation analysis, bibliometrics and scientometrics – Web application development frameworks that leverage HTML5/CSS/JS to deliver across desktops, notebooks, tablets and smartphones – Deploying in the cloud and moving scale-out from development to operations to reduce time-to-market, cost of failure for emerging, niche publishing opportunities • As we shift to the Platform-as-a-Service era, these features become an important part of the STM publishing technology stack
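The point about schemaless document stores supporting citation analysis can be illustrated with plain Python dictionaries standing in for stored documents. This is only a sketch: the records, field names such as `cites`, and the function `citation_counts` are invented for illustration, and no particular document store product is implied.

```python
from collections import Counter

# Schemaless documents: each record carries only the fields it needs (illustrative data).
documents = [
    {"id": "art-1", "type": "article", "title": "Paper A", "cites": ["art-2", "data-1"]},
    {"id": "art-2", "type": "article", "title": "Paper B", "cites": ["data-1"]},
    {"id": "data-1", "type": "dataset", "rows": 1200},
    {"id": "vid-1", "type": "video", "duration_s": 95, "cites": ["art-2"]},
]


def citation_counts(docs):
    """Count how often each id is cited, tolerating documents without a 'cites' field."""
    counts = Counter()
    for doc in docs:
        counts.update(doc.get("cites", []))
    return counts


counts = citation_counts(documents)
by_type = Counter(doc["type"] for doc in documents)
```

Articles, datasets and videos sit side by side with different fields, yet the same simple aggregation covers all of them, which is the property that makes such stores convenient for bibliometric analysis across mixed content types.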
  • 12. Examples from Elsevier: Linked Data Repository
  • 13. Examples from Elsevier: SciVal
  • 14. Examples from Elsevier: SciVerse
  • 15. The publisher of the future as lean startup • This stuff is not just for big publishers • These are the tools that new consumer Internet businesses are using to create new products and services today… quickly and on the cheap • Smaller publishers and societies can use lean startup techniques to drive app and API design and development starting from existing web presences and third-party APIs
  • 16. Example: Impact metrics in Klout
  • 17. Example: Content acquisition using Github
  • 18. Example: SciVerse/Mendeley integration
  • 19. Challenges for the publisher of the future • When content can be mashed up at a fine level of granularity using multiple third-party APIs, what are the rights associated with the resulting product? What are the appropriate business models? • What standards should there be for research objects? • Who gets credit for research objects? How is impact determined and reputation managed? • What is an acceptable trade-off between content flexibility and high-touch presentation design?
  • 20. In summary • STM publishing is only beginning the transition from print to online • Articles and books are no longer sufficient containers for scholarly communication • Tools to effect this change come from the consumer Internet and the business intelligence worlds • Publishers of the future will leverage the best practices emerging around these tools to create innovative new products to serve their communities