This presentation was provided by Bill Kasdorf of Apex Content Solutions during the NISO Virtual Conference, Convergence: The Web and Publishing Onto the Web, held on May 17, 2017.
Kasdorf, The Web Imperative: How Web Technologies Are Transforming the Publishing Ecosystem
1. Bill Kasdorf
VP and Principal Consultant, Apex Content Solutions
Member of W3C PBG, DPUB IG, EPUB 3 CG, PWG
The Web Imperative
How Web Technologies are Transforming
the Publishing Ecosystem
3. The publishing ecosystem is using the Web and Web technologies in more ways than ever,
to make ever more rich, complex, dynamic, and effective publications.
4. The publishing ecosystem is using the Web and Web technologies in more ways than ever.
The Web is being made ever more useful for publishing, even for complex publications.
6. Not so much.
The Web is more about communication and commerce and connecting.
It really hasn’t focused on what professional publications require.
Until now.
8. Books in Browsers 2014:
“Bridging the Web and Digital Publishing”
Unofficial Draft 30 June 2015:
“EPUB+WEB” (AKA “EPUB-WEB”)
GitHub, 24 September 2015:
“Portable Web Documents for the OWP”
W3C Working Draft, 15 October 2015:
“Portable Web Publications for the OWP”
W3C Editors Draft, 28 November 2016:
“Web Publications for the OWP”
(a work in progress)
9. (P)WP
Recent Realization:
First we need to define a Web Publication!
Meaning an arbitrarily extensive and complex
collection of resources on the web
(web pages, CSS, fonts, images, media, scripts, etc.)
that has an identity, that can be referenced, etc.
Whether/how it’s packaged is a separate issue.
“PUBLICATION” ≠ “DOCUMENT”
but
“BUNCH OF STUFF ON THE WEB”
might = “PUBLICATION”
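To make the definition concrete, here is a minimal Python sketch of a Web Publication as a collection of web resources gathered under one identity. The class and field names (`Resource`, `WebPublication`, `identifier`, `resources`) and the example.org URLs are hypothetical. This is not the W3C manifest format, which was still being drafted at the time of the talk.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Resource:
    """One web resource belonging to the publication (hypothetical model)."""
    url: str
    media_type: str


@dataclass
class WebPublication:
    """A 'bunch of stuff on the web' treated as a single publication."""
    identifier: str  # the publication's own identity, here a URL
    title: str
    resources: List[Resource] = field(default_factory=list)

    def contains(self, url: str) -> bool:
        """Return True if the URL is one of the publication's resources."""
        return any(r.url == url for r in self.resources)


pub = WebPublication(
    identifier="https://example.org/pubs/web-imperative/",
    title="The Web Imperative",
    resources=[
        Resource("https://example.org/pubs/web-imperative/ch1.html", "text/html"),
        Resource("https://example.org/pubs/web-imperative/style.css", "text/css"),
        Resource("https://example.org/pubs/web-imperative/cover.jpg", "image/jpeg"),
    ],
)
```

Nothing in the model says whether the resources are packaged, which mirrors the point that packaging is a separate issue.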
11. (P)WP
The “P” is coming to mean “packaged” more than “portable.”
12. [Infographic: “STM Tech Trends: Outlook 2020. The Technology Floodgates Are Open.” Headline themes: User-centered Publishing delivers Precision Information; The Machine is the New Reader; Science as a Social Machine; Data Privacy requires a Web of Trust.]
It’s not just about text.
And almost all of this depends on Web technologies.
13. The Web Publication Vision:
ONE PUBLICATION FOR BOTH
ONLINE AND OFFLINE USE.
The same content in two different “states”:
Offline, packaged or cached;
Online, with all essential resources linked.
A canonical URL that leads to both.
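The two “states” can be pictured with a small Python sketch: a resource is served from an offline cache or package when one is present, and otherwise resolved against the canonical URL. The function name `locate`, the cache layout, and the example.org URL are assumptions made for illustration only; the slides do not describe an actual resolution mechanism.

```python
from urllib.parse import urljoin


def locate(canonical_url, relative_path, offline_cache):
    """Return (state, location) for one resource of a publication.

    offline_cache is a hypothetical dict mapping relative paths to
    local file paths for resources that were packaged or cached.
    """
    if relative_path in offline_cache:
        return ("offline", offline_cache[relative_path])
    # Not cached: fall back to the online copy under the canonical URL.
    return ("online", urljoin(canonical_url, relative_path))


canonical = "https://example.org/pubs/web-imperative/"
cache = {"ch1.html": "/var/cache/web-imperative/ch1.html"}

print(locate(canonical, "ch1.html", cache))
print(locate(canonical, "ch2.html", cache))
```

Both calls start from the same canonical URL; only the state of the content differs.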
14. Wouldn’t it be great if there were no difference between
an online publication and an EPUB?
19. EPUB 3 has become essential
to the publishing ecosystem
E-Readers
It’s the “master format” for virtually all systems.
Accessibility
It’s the format for interchange of accessible content.
Education
Platforms are built on the EPUB for Education profile.
Not Just Books
It’s used for all kinds of publications.
Global
Widely adopted in US, EU, Far East, Israel.
21. We want to avoid two competing specs.
These need to be the same thing.
Could be one master spec,
or a layered spec with “profiles”:
e.g., “PWP” as a profile of a WP (a type of WP),
and “EPUB 4” in turn as a profile of PWP
(like EPUB for Education is for EPUB),
a type of PWP requiring more predictability,
accessibility, archivability.
EPUB 4 vs. (P)WP
This is why the IDPF was recently combined into the W3C.
22. “W3C is thrilled to gain
the expertise of the publishing industry
with its rich tradition of excellence in developing
many forms of content for books, magazines,
journals, educational materials,
and scholarly publications.”
—Jeff Jaffe, W3C CEO
23. Publishing Business Group
Provides a formal voice for publishing in the W3C.
Dues comparable to IDPF dues.
For all types of publishers; no end date.*
Publishing Working Group
Chartered for 3 years to develop Web Publications.
Requires W3C membership; includes TPI members.
EPUB 3 Community Group
Free to all; maintains EPUB 3 family of specs.
*IDPF members have “Transitional Publishing Industry” (TPI) status for 2 years.
Publishing @ W3C
25. The Baseline for Accessible Publications
Clear guidelines to enable certification of accessibility
and discovery of accessible features in an EPUB.
Based on Web Accessibility Recommendations.
Adds publication-specific requirements.
Requires accessibility-specific metadata.
Techniques document provides “how to do it” advice.
Applicable and referenceable by
any version of EPUB and other specs too.
EPUB Accessibility 1.0
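As an illustration of the accessibility-specific metadata requirement, the Python sketch below checks a publication's metadata for the schema.org discovery properties that EPUB Accessibility 1.0 is understood to call for (`accessMode`, `accessibilityFeature`, `accessibilityHazard`, `accessibilitySummary`). The exact list should be confirmed against the spec. The helper `missing_a11y_metadata` and the sample values are hypothetical, and a plain dict stands in for the EPUB package metadata.

```python
# Discovery properties assumed to be required; verify against the spec.
REQUIRED_PROPERTIES = (
    "schema:accessMode",
    "schema:accessibilityFeature",
    "schema:accessibilityHazard",
    "schema:accessibilitySummary",
)


def missing_a11y_metadata(metadata):
    """Return the required properties that are absent or empty."""
    return [p for p in REQUIRED_PROPERTIES if not metadata.get(p)]


# Hypothetical metadata, as it might be extracted from a package document.
sample = {
    "schema:accessMode": ["textual", "visual"],
    "schema:accessibilityFeature": ["alternativeText", "structuralNavigation"],
    "schema:accessibilityHazard": ["none"],
}

print(missing_a11y_metadata(sample))  # ['schema:accessibilitySummary']
```

A check like this covers only discovery metadata; conformance to the Web accessibility requirements themselves needs separate evaluation.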
27. Three Web Annotation Recommendations published Feb. 23, 2017:
Web Annotation Data Model
Web Annotation Vocabulary
Web Annotation Protocol
Provide interoperable “data structures” for annotations:
Can exchange annotations between systems.
Can store annotations on an annotation server.
Put annotations on text, images, videos, etc.
Includes Notes on “Selectors and States” and
“Embedding Annotations in HTML.”
Interoperable Annotations
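For a sense of what these interoperable data structures look like, the sketch below builds one annotation as JSON-LD following the Web Annotation Data Model: a textual body attached to a quoted passage in a target page. The property names and the `anno.jsonld` context come from the Recommendation. The helper `make_annotation`, the example.org URLs, and the sample text are hypothetical.

```python
import json


def make_annotation(anno_id, target_url, quoted_text, comment):
    """Build a Web Annotation (JSON-LD) that comments on a quoted passage."""
    return {
        "@context": "http://www.w3.org/ns/anno.jsonld",
        "id": anno_id,
        "type": "Annotation",
        "body": {
            "type": "TextualBody",
            "value": comment,
            "format": "text/plain",
        },
        "target": {
            "source": target_url,
            # A TextQuoteSelector identifies the passage by its exact text.
            "selector": {
                "type": "TextQuoteSelector",
                "exact": quoted_text,
            },
        },
    }


annotation = make_annotation(
    "https://example.org/annotations/1",
    "https://example.org/pubs/web-imperative/ch1.html",
    "Whether/how it’s packaged is a separate issue.",
    "Packaging is decoupled from the definition of a publication.",
)

print(json.dumps(annotation, indent=2, ensure_ascii=False))
```

Because the result is plain JSON, the same object can be exchanged between systems or sent to an annotation server that implements the Web Annotation Protocol.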
28. Annotating All Knowledge
Coalition of over 70 scholarly publishers, platforms,
libraries, and technology organizations.
Open source, standards-based; supports key formats
(HTML, PDF, EPUB, images, video, and data).
Ambitious 3-year timeline:
Pilots at JSTOR, arXiv, eLife, etc.
Force11: Community Platform, Working Group.
Interoperable Annotations
29. IIIF: The International Image
Interoperability Framework
A Community . . .
Over 600 national libraries, research institutions,
museums, tech firms, aggregators, and projects.
. . . that creates APIs . . .
Image (the pixels); Presentation (human readable info);
Authentication (almost finished); Search (to come).
. . . that it uses to create interoperable services.
Focusing on providing a good UX.
Interoperable Images
30. Image API
URL and identifier for image; can express regions,
size, mirror, rotation, rights info, multiple versions.
Presentation API
Structure, properties (labels, rights, technical info, links),
can associate transcription, translation, commentary,
etc. with regions of an image.
Working on Audio, Video; 3D in future
All based on Web technologies, including Annotations.
Interoperable Images
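The Image API expresses region, size, mirroring, and rotation directly in the URL, following the pattern `{base}/{identifier}/{region}/{size}/{rotation}/{quality}.{format}`. The Python sketch below assembles such a URL using Image API 2.x conventions. The helper `iiif_image_url`, the server address, and the identifier are hypothetical, and parameter values are not validated.

```python
from urllib.parse import quote


def iiif_image_url(base, identifier, region="full", size="full",
                   rotation=0, mirror=False, quality="default", fmt="jpg"):
    """Assemble a IIIF Image API request URL (2.x-style parameters)."""
    # A leading "!" on the rotation asks for mirroring before rotation.
    rotation_part = ("!" if mirror else "") + str(rotation)
    # The identifier must be percent-encoded, including any slashes.
    encoded_id = quote(identifier, safe="")
    return "/".join([
        base.rstrip("/"), encoded_id, region, size,
        rotation_part, quality + "." + fmt,
    ])


url = iiif_image_url(
    "https://iiif.example.org/images",
    "page-001",
    region="100,200,800,600",  # x,y,width,height in pixels
    size="400,",               # scale to 400 pixels wide
    rotation=90,
    mirror=True,
)
print(url)
# https://iiif.example.org/images/page-001/100,200,800,600/400,/!90/default.jpg
```

The default `size="full"` matches Image API 2.x; version 3.0 of the API uses `max` instead.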
31. RedLink’s Remarq™: “Decentralized Social”
An “editorial engagement platform”
that combines experts and the community.
Keeps conversation on the publisher’s site.
Profiles, polls, discussions, recommendations, sharing.
Creates trusted public conversation by
vetting participants for appropriate subject expertise.
No need for infrastructure investment, training.
Enhances value of publisher’s Version of Record.
Managed Interoperability
34. Coming soon:
Scholarly journals.
Atypon hosts 40% of the world’s peer-reviewed journals.
In an upcoming release they’re implementing
a Readium*-based EPUB reader in the browser
and will generate EPUBs automatically
from content submitted in their XML spec
and fixed-layout (FXL) EPUBs from PDFs.
*Modular, open source software based on web technologies for EPUB.
36. Authoring rich educational content
is complicated.
Quiz and test content.
Multimedia.
Interactive features.
Group or individual activities.
Educational metrics.
Personalization.
Interoperating with LMSs.
All of this uses web technologies,
often based on EPUB and EPUB for Education.
39. Editoria
Being developed by Univ. of California Press, California
Digital Library, & Collaborative Knowledge Foundation.
Fully integrated, web-based authoring,
editing, & typesetting workflow.
Automated conversion from Word to XML/HTML.
Full-function EPUB- and web-based visual editor.
“Bookbuilder” interface manages book components.
PDF rendering via CSS-based typesetting.
All free, open-source, and web-based.
Free, open book production infrastructure
41. Manifold
Being developed by Univ. of Minnesota Press,
CUNY Digital Scholarship Lab, & Cast Iron Coding.
Online reading, annotation, discussion, &
augmentation platform for monographs.
Integrates with print workflows; EPUB-based.
Enables dynamic discussion & augmentation of books:
audio, video, spreadsheets, presentations, links, etc.
All free, open-source, and web-based.
Making Monographs “Living Works”
42. WEB PUBLICATIONS
Web Publications for the Open Web Platform: https://w3c.github.io/dpub-pwp/
PUBLISHING@W3C
https://www.w3.org/publishing/
EPUB 3.1
http://www.idpf.org/epub/31/spec/epub-spec.html
ACCESSIBILITY
EPUB Accessibility 1.0: http://www.idpf.org/epub/a11y/
EPUB Accessibility Techniques: http://www.idpf.org/epub/a11y/techniques/techniques.html
WEB ANNOTATIONS
Data Model: https://www.w3.org/TR/2017/REC-annotation-model-20170223/
Vocabulary: https://www.w3.org/TR/2017/REC-annotation-vocab-20170223/
Protocol: https://www.w3.org/TR/2017/REC-annotation-protocol-20170223/
Resources