2. What’s Happening?
• Publishing is changing
• Outsourcing
• Digital Publishing
• XML
• A single retailer has gained huge power
3. Digital markets are maturing
• More books will become digital only
• Printing may be only a small part of the cost, but it is still
a part
• Traditional publishing will be competing against a different
cost base
4. Change happens
• The business is changing
• Publishing is paper centric
• It needs to be content centric
• The paper mindset needs to be replaced by a content
mindset.
5. The world has changed
• When did you
• last use an encyclopaedia?
• pick up a dictionary?
• These are digital activities now
6. Publishers are using XML
• Text + tags
• It’s a mark-up language
• Made up of elements, text and attributes
• Elements can contain other elements
• Ideally we represent the meaning of the content not the
format of it
7. A little bit more…
• XML files are basically text files with rules
• The rules are
• Content is marked up with elements
• Elements contain other elements or text
• An element is made up of an opening tag and a closing
tag
• Elements can carry attributes (set in the opening tag)
<para id="para001">This is a paragraph.</para>
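To make the rules above concrete, here is a minimal sketch that reads the example paragraph with Python's standard `xml.etree.ElementTree` module. The choice of Python is an assumption for illustration only; any XML parser would show the same three parts (element name, attributes, text).

```python
import xml.etree.ElementTree as ET

# The example element from the slide: one opening tag with an attribute,
# some text, and a matching closing tag.
para = ET.fromstring('<para id="para001">This is a paragraph.</para>')

print(para.tag)     # the element name
print(para.attrib)  # the attributes, as a dictionary
print(para.text)    # the text inside the element
```

If the closing tag were missing, the parser would reject the file, which is what "text files with rules" means in practice.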
8. XML…
<?xml version="1.0" encoding="UTF-8"?>
<article xmlns="http://docbook.org/ns/docbook"
         xmlns:xlink="http://www.w3.org/1999/xlink" version="5.0">
  <info>
    <title>Publishing Revolutions</title>
    <abstract><para>Let's talk about all the changes
    in publishing going on right now.</para></abstract>
  </info>
  <section>
    <title>What's Happening?</title>
    <itemizedlist>
      <!-- sub-bullets nest inside their parent list item -->
      <listitem><para>Publishing is changing</para>
        <itemizedlist>
          <listitem><para>Outsourcing</para></listitem>
          <listitem><para>Digital Publishing</para></listitem>
          <listitem><para>XML</para></listitem>
        </itemizedlist>
      </listitem>
    </itemizedlist>
  </section>
</article>
9. Indexing the XML
<para xml:id="the0000116">As a business strategy, the Internet giants’ formula is simple: The
more personally relevant their information offerings are, the more ads they can sell, and
the more likely you are to buy the products they’re offering. And the formula works.
<indexterm><primary>Amazon</primary></indexterm> Amazon sells billions of dollars in
merchandise by predicting what each customer is interested in and putting it in the front of
the virtual store. Up to 60 percent of
<indexterm><primary>Netflix</primary></indexterm>Netflix’s rentals come from the
personalized guesses it can make about each customer’s movie preferences—and at this point,
Netflix can predict how much you’ll like a given movie within about half a star.
Personalization is a core strategy for the top five sites on the Internet— Yahoo, Google,
Facebook, YouTube, and Microsoft Live—as well as countless others.</para>
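Because the index terms are embedded in the content, software can collect them without knowing anything about pages. The sketch below is one possible way to do that in Python, not a description of any publisher's tooling. `SAMPLE` is an abbreviated version of the paragraph above, and the `<section>` wrapper and the `collect_index_terms` helper are assumptions added for illustration.

```python
import xml.etree.ElementTree as ET

# The xml: prefix is predefined by the XML specification and maps to this URI.
XML_NS = "http://www.w3.org/XML/1998/namespace"

# Abbreviated sample; the <section> wrapper is hypothetical.
SAMPLE = """<section>
  <para xml:id="the0000116">And the formula works.
    <indexterm><primary>Amazon</primary></indexterm> Amazon sells merchandise
    by predicting what each customer is interested in. Up to 60 percent of
    <indexterm><primary>Netflix</primary></indexterm>Netflix's rentals come
    from personalized guesses.</para>
</section>"""


def collect_index_terms(xml_text):
    """Map each primary index term to the ids of the paragraphs it marks."""
    root = ET.fromstring(xml_text)
    entries = {}
    for para in root.iter("para"):
        para_id = para.get(f"{{{XML_NS}}}id") or "(no id)"
        for term in para.iter("indexterm"):
            primary = term.findtext("primary")
            if primary:
                entries.setdefault(primary.strip(), []).append(para_id)
    return entries


index = collect_index_terms(SAMPLE)
for term in sorted(index):
    print(term, "->", ", ".join(index[term]))
```

The locator here is the paragraph's `xml:id`, not a page number, so the same entry stays valid in every output format.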
10. Standards
• The mark-up on the last slide is a language called DocBook
• A standard for (originally) technical publishing
• We have
• <indexterm>
• <primary>, <secondary>, <tertiary>
• <see>, <seealso>
• The indexing mark-up was not created by indexers
• why not?
• Please get involved – the XML community needs you
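As a rough illustration of how the DocBook indexing elements listed above fit together, the following Python sketch turns a few `<indexterm>` elements into readable headings. The sample terms, the `<terms>` wrapper element and the `describe` helper are all hypothetical; only the element names `<indexterm>`, `<primary>`, `<secondary>`, `<tertiary>`, `<see>` and `<seealso>` come from DocBook.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample data; <terms> is just a wrapper so the snippet parses.
TERMS = """<terms>
  <indexterm><primary>publishing</primary><secondary>digital</secondary><tertiary>EPUB</tertiary></indexterm>
  <indexterm><primary>e-books</primary><see>publishing</see></indexterm>
  <indexterm><primary>XML</primary><seealso>DocBook</seealso></indexterm>
</terms>"""


def describe(term):
    """Build a display heading from one indexterm element."""
    levels = [term.findtext(name) for name in ("primary", "secondary", "tertiary")]
    heading = ", ".join(level for level in levels if level)
    see = term.findtext("see")
    see_also = term.findtext("seealso")
    if see:
        heading += f". See {see}"
    if see_also:
        heading += f". See also {see_also}"
    return heading


headings = [describe(term) for term in ET.fromstring(TERMS).iter("indexterm")]
for heading in headings:
    print(heading)
```

The punctuation and wording of the cross-references are decided by the rendering code, not by the mark-up, which is exactly the kind of decision indexers may want a say in.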
11. Why XML?
• Agile content
• XML is an enabler for multi-format publishing
• EPUB
• Kindle
• Web
• multiple print formats
• custom book publishing
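The idea of one XML source feeding several output formats can be sketched in a few lines of Python. This is a toy example under stated assumptions: the `SOURCE` document and the `to_html` and `to_plain_text` functions are invented for illustration, and production pipelines typically rely on dedicated transformation tools such as XSLT rather than hand-written code like this.

```python
import xml.etree.ElementTree as ET
from html import escape

# Hypothetical single-source document.
SOURCE = """<article>
  <title>Publishing Revolutions</title>
  <para>Publishing is changing.</para>
  <para>XML enables multi-format publishing.</para>
</article>"""


def to_html(article):
    """Render the article as a minimal HTML fragment."""
    parts = [f"<h1>{escape(article.findtext('title'))}</h1>"]
    parts += [f"<p>{escape(p.text)}</p>" for p in article.iter("para")]
    return "\n".join(parts)


def to_plain_text(article):
    """Render the same article as plain text with an underlined title."""
    title = article.findtext("title")
    lines = [title, "=" * len(title), ""]
    lines += [p.text for p in article.iter("para")]
    return "\n".join(lines)


article = ET.fromstring(SOURCE)
html_output = to_html(article)
text_output = to_plain_text(article)
print(html_output)
print(text_output)
```

Both outputs come from the same source, so a correction made once in the XML appears in every format.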
12. Traditional workflows
• Traditional publishing workflows do not handle digital
content well
• They are well-known
• They were developed over a long period
• They are an efficient way to produce books
13. Traditional workflows
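[Workflow diagram; stages listed in approximate order]
• Manuscript
• Prepare for copyediting
• Copyediting
• Author queries and corrections
• Prepare for typesetting
• Typesetting
• Proofs
• Proofreading and Indexing
• Print ready proofs
• Printing
• Other formats? (shown as a side branch)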
14. Content driven publishing
• XML is a valuable tool
• It’s not an answer in itself
• An XML first workflow doesn’t necessarily get you
anywhere
• It’s the neutral format for your content
15. XML first workflows
[Workflow diagram; stages listed in approximate order]
• Manuscript
• Copyediting and Tagging
• Conversion to XML
• Place into XML Repository
• Author queries and corrections
• Indexing
• Transformation
• Outputs: HTML, XML, EPUB, Kindle, Print Ready PDF, eBook PDF, …
• Proofs
• Proofreading/Copyediting
16. How does this affect indexers?
• Publishers are outsourcing more and more
• Indexers have almost always been freelance anyway
• Publishers are now outsourcing the entire process
• Mostly to India and China
• Non-native speakers
• That might affect indexers
• Increasing automation
• I don’t think automated indexing is a good idea but that
doesn’t mean everyone does
17. EBooks
• Indexes in eBooks are a bit of a problem
• Random House –
“We don’t need indexes any more
because we can search the text”
• Many EBooks have an unlinked index with page numbers
at the back
• Some have links with page numbers
• Some publishers take out the index completely
18. EBooks
• Right now, Amazon dominate the market with the Kindle
• Everyone else supports the EPUB format
• Apple have about 10% of the European market and
Amazon have 85%.
• The other retailers are basically irrelevant
• The organisations controlling this market are not the
publishers
22. Indexes and eBooks
• There is no point in talking to publishers about the way
indexes work in eBooks
• Talk to Apple, Amazon, Sony, Barnes & Noble
• Think of the amazing things you could do with an index
on an eBook
• intelligent search
• new ways to present the index
• view the index and display snippets of content
24. This may affect you
• As publishers move towards that XML first workflow, the
old ways of indexing may not be appropriate
• As publishers move to entirely digital workflows the
index becomes a digital artefact
• The concept of the page is going to become much less
important
• A true XML index has no presentation
• Whether it is italic, bold, etc. is no longer important to the
publisher, who needs to present it in five different ways
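One way to picture an index with no presentation is as plain data: terms mapped to locations in the content, with each output format deciding how to display them. The Python sketch below is an illustration only. The `index` data, the second anchor id `the0000240` and both rendering functions are hypothetical.

```python
from html import escape

# Hypothetical index data: term -> ids of the marked paragraphs (no pages).
index = {
    "Amazon": ["the0000116"],
    "Netflix": ["the0000116", "the0000240"],
}


def render_html(index):
    """Present the index as a linked HTML list, e.g. for the web or an eBook."""
    items = []
    for term in sorted(index):
        links = ", ".join(
            f'<a href="#{escape(anchor)}">{n}</a>'
            for n, anchor in enumerate(index[term], start=1)
        )
        items.append(f"<li>{escape(term)}: {links}</li>")
    return "<ul>\n" + "\n".join(items) + "\n</ul>"


def render_text(index):
    """Present the same index as plain text lines."""
    return "\n".join(f"{term}  [{', '.join(index[term])}]" for term in sorted(index))


print(render_html(index))
print(render_text(index))
```

Nothing in `index` says bold, italic or page 42. Those choices belong to each renderer, which is why a publisher can present the same index in several different ways.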
25. The index is more important
• The digital publishing world includes the Web
• On the web, the primary way we can find things is
through searches
• A good index is a fantastic search corpus
• search the index not the book
• Semantic searches have been promised next year for
twenty years
• you are good at this and computers aren’t
• Convergence is happening in search – taxonomies and
indexes are both used
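"Search the index, not the book" can be sketched as follows. This is a toy Python example under stated assumptions: the `INDEX` and `PARAGRAPHS` data, the id `the0000117` and the `search_index` function are invented for illustration and do not describe any real reading system.

```python
# Hypothetical content store: paragraph id -> paragraph text.
PARAGRAPHS = {
    "the0000116": "Amazon sells billions of dollars in merchandise by "
                  "predicting what each customer is interested in.",
    "the0000117": "Netflix can predict how much you'll like a given movie "
                  "within about half a star.",
}

# Hypothetical index created by a human indexer: term -> paragraph ids.
INDEX = {
    "Amazon": ["the0000116"],
    "Netflix": ["the0000117"],
    "personalization": ["the0000116", "the0000117"],
}


def search_index(query, index, paragraphs, width=50):
    """Match the query against index terms and return content snippets."""
    needle = query.lower()
    hits = []
    for term in sorted(index, key=str.lower):
        if needle in term.lower():
            for anchor in index[term]:
                hits.append((term, anchor, paragraphs[anchor][:width]))
    return hits


for term, anchor, snippet in search_index("personal", INDEX, PARAGRAPHS):
    print(f"{term} [{anchor}]: {snippet}...")
```

The second paragraph never uses the word "personalization", so a plain text search would miss it. The index entry finds it because an indexer recognised what the passage is about.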
26. What can the Society do?
• Talk to the industry
• Not necessarily publishers
• Join the International Digital Publishing Forum (IDPF)
• Talk to your software suppliers about indexing XML
directly
• Join the Organization for the Advancement of Structured
Information Standards (OASIS)
• Have a corporate policy on XML – create some XML
indexing standards yourselves!
• Create relationships with bibliographic communities
because they are already thinking hard about these things
27. What can individuals do?
• Learn some XML
• It’s not hard
• Go on a course?
• Improve your digital skills
• The industry is changing and being aware of the new
environment can only help you
• Push back on clients – make sure they understand what
an index is
Editor's Notes
The entry cost is lower too - the number of publishers is increasing, as is the number of self-publishers.
Prices are probably going to be pushed down. This suggests that the competition might be based on cost - those who have worked out how to cut their costs vs those who haven’t. Digital-only books are happening. It is pretty well impossible to get any figures for the difference in cost of publishing those, but I think they’ll be lower.
After at least ten years of false starts we are here. Digital publishing has happened. The take-off is faster than we had imagined.
It’s the same content but the presentation of that content has changed dramatically. My laptop has the New Oxford American Dictionary built in. Why would I pick up a heavy book? Print encyclopaedias are dead (not really true, but I had to check to see if Britannica was still available in print). OK, present company excepted.
What do I mean by that? Well, it’s a way of identifying things about the content *in* the content. I’m not going to try and teach you XML now!
James Lamb has already told me there’s a misconception in the model.
It’s a neutral format that will allow you to produce many other formats from a single base. It may not do *all* the work for you but it will help a great deal. Simple content (monographs, adult fiction) can be rendered directly from XML to print. With some more input from a user (an InDesign operator, for example), XML can be flowed into your page layout tool for further manipulation, with the grunt work done for you.
This has been refined over about 600 years! It worked really well for a long, long time. It was only the arrival of digital *outputs* that created a drive to change this. Computers basically gave us electronic ways of doing the same thing we were already doing.
XML isn’t an answer to anything much in itself. It’s important to be thinking in terms of general content. If you are working with XML, it can become easier to manipulate, store, license and chop up your content. But it’s the content that’s important. Content is not necessarily king - in the mobile market, easy content can win over brilliant content. However, having flexible content is going to help you become flexible. XML (done right) *is* flexible content.
Saw a t-shirt a few years ago: “My job went to India and all I got was this lousy t-shirt”. Continuum use NewGen, who use TExtract. One publisher I work with has outsourced everything but insists that the outsourcer hire UK indexers and copyeditors.
It’s a pity that search is not useful.
Apple and Amazon have the market. I think the Kindle format is not going to be long-lived - remember the Palm Pilot in the late 90s. The Kindle file format comes from that and is precisely that good for presentation on a screen. The odds are that Amazon will switch to something like the EPUB format.
The people who make the reading devices can do great things with indexes if motivated to. They’ll do it if they see a clear improvement (or competitive advantage) over the others.
Much of the digital world is being defined by the specialists - perhaps indexers could define the way that indexes are presented. All it takes is a good idea and some organisation.
This has started now. In some sectors it’s complete.
Join OASIS - and any other standards body you can lay your hands on :) OASIS is responsible for the DocBook standard. One of the nice things about XML standards is that they’re promiscuous - we can mix them up and create new ones.
Right now publishers (many of them) really don’t have any better answers than anyone else. It’s all changing. If you can tell your clients about a new approach, they may well want to listen. The PTC course has a good reputation. You’d be surprised by how many don’t understand.