Geoffrey Bilder at CrossRef suggested the title “CrossRef and the Pursuit of Truthiness” to capture a critical issue facing scholarly publishers. On the Internet, traditional systems of trust and authority are being questioned and dismissed in some quarters. There is a battle being waged between opposing camps about whether it is best to rely on experts as authoritative sources of information or whether the “wisdom of crowds” and Web 2.0 magic will replace experts.
You may have seen the online debate between Andrew Keen, author of “The Cult of the Amateur”, and Clay Shirky, author of “Here Comes Everybody”, where this issue is hotly contested. Andrew Keen spoke at this meeting last year about his ideas – his view is that “Web 2.0 is pushing us back to the dark ages”, while Clay Shirky believes we are entering a golden age when people will organize without organizations – without the traditional structures of authority. We also hear a lot about the supposedly non-hierarchical, authority-less Wikipedia. I think all this is of direct relevance to scholarly publishers. At the moment there seems to be a dichotomy in people’s minds between the traditional “publishing” way of doing things and the “internet” way of doing things. However, I think that both extreme views are wrong – both Keen and Shirky go too far (interestingly, they both have books and blogs and are on the lecture circuit – i.e. both are using the things that they say are the root of all evil – traditional books in Shirky’s case and blogs in Keen’s case). I think that somewhere in the middle of these two views is the most effective way to operate – the question for our industry is who will get there first? Will it be scholarly publishers, or will it be new upstarts or even, heaven forbid, Google? Of course, all the questioning of traditional sources of trust includes the role of scholarly publishers and the value of scholarly content. Scholarly publishers have not done a good job of promoting the value that they add to content and have not done a good job of highlighting why scholarly content is different from a lot of the content out there on the web. Publishers are being too defensive. The fact is that scholarly content now lives on the web and so publishers must adapt, but they must do so while protecting and promoting their role in creating trustworthy content.
So I’m going to talk a little bit about trust on the Internet and then give an overview of two CrossRef initiatives - CrossMark and Contributor ID – that are intended to provide mechanisms for publishers to highlight the value of their content and their role in creating and maintaining that value.
The word “truthiness” was coined by American comedian Stephen Colbert on the satirical show “The Colbert Report” and it is the true enemy of scholarly publishers. It was coined in October 2005 and defined as “truth that comes from the gut, not books”, and was then picked up by the American Dialect Society and defined as “the quality of preferring concepts or facts one wishes to be true, rather than concepts or facts known to be true”. During this segment on the show there was one funny comment that I thought captured some of the web view that dismisses traditional, authoritative scholarly content: “Well, anybody who knows me knows I'm no fan of dictionaries or reference books - they're elitist. Constantly telling us what is or isn't true or what did or didn't happen. Who's Britannica to tell me the Panama Canal was finished in 1914? If I want to say it happened in 1941, that's my right.” Clearly, if this view wins out on the web we are all doomed. But all is not lost because, interestingly, I think there are signs that a range of different people are very concerned about the issue of trust and authority on the web, and some of the web-based services that are trying to develop trust and authority are attempting to mimic what scholarly publishers already do, and not doing a very good job of it. Scholarly publishers can take advantage of this trend.
To explain this trend I want to quickly review the Internet Trust Anti-pattern as outlined by Geoffrey Bilder – many of you may have heard him speak about this but he defined it very well in a blog posting and journal article from a couple of years ago. I think that in the Internet Trust Anti-pattern lies hope for scholarly publishers.
We are starting to see iconic web figures talk about trust, authority and quality content on the Internet and the need to make changes to how the web works in this regard. Eric Schmidt, CEO of Google, said in a talk to magazine publishers that the Internet is a “cesspool” of false information and that the traditional publishing brands were essential signals of trust in the online world.
Sir Tim Berners-Lee, in launching his World Wide Web Foundation, said: “How can we determine whether we can trust the material emanating from a site? The Web was originally conceived as a tool for researchers who trusted one another implicitly; strong models of security were not built in. We have been living with the consequences ever since. As a result, substantial research should be devoted to engineering layers of trust and provenance into Web interactions. ...”
In an interview with the BBC, Berners-Lee also said that there needed to be new systems that would give websites a label for trustworthiness once they had been proved reliable sources: “…So I'd be interested in different organisations labeling websites in different ways.”
It’s interesting to note that Google is not just talking about trustworthy content but doing something about it. Google Knol got a lot of attention when it launched. Its aim is to allow people to share knowledge, and it claims that a “knol” is an “authoritative article” and that the service will “highlight authors”. But if you look at what Knol actually does to verify authors and content it is very weak and overall in terms of quality – a failure. Names are verified by telephone number or credit card – that’s it. Knol provides “multiple cues that help you evaluate the quality and veracity of information.” Knol entries allow comments and ratings to be applied by other users but the entry itself can only be edited by the author, and authors do not have to provide any contact information. Slate magazine summed up Knol as “a wasteland of…text copied from elsewhere, outdated entries abandoned by their creators, self-promotion, spam, and a great many old college papers”. Since users can have ads on their entries and share in the revenue from the ads, generating revenue seems to be a main goal of Knol authors.
This morning Larry Sanger spoke very effectively about Citizendium – quality is assured through rigid editorial control – reliability and quality, we write under our real names, participants write for academic credit – this sounds a lot like what scholarly publishers do! Citizendium’s policies are enforced by constables/editors who must be at least 25 years old and have a bachelor’s degree. I think Citizendium is a good example that there is nothing in new technologies that is inherently opposed to the traditional values of scholarly publishing – it all depends on how they are implemented and what controls are in place. Users on the internet currently lack the tools to determine whether something should be trusted or what the provenance of content is – we can see that Citizendium is trying to build a brand that will provide that trust and is building the brand around highlighting the process of how the content is created. Scholarly publishers have been very bad at talking about and promoting the value that they add to content as part of the publishing process. Now – will Citizendium be able to scale up to compete with Wikipedia? Are there enough incentives for people putting in unpaid time on Citizendium? It lacks the hook that scholarly publishers have: the need to contribute and author content for career advancement and tenure.
Now I want to talk about this issue of certification and verification – this is something that is a general trend in society. People didn’t use to care where their sneakers came from or who made them under what conditions – the Nike or New Balance brand is all that they needed.
But now Western consumers want to know where their shoes were made and companies have had to do a lot of work verifying and certifying their whole manufacturing supply chain.
This happens in many different areas and many different certification organizations have grown up over the years in everything from organic products to sustainable and dolphin-friendly fishing to more traditional marks like the British Standards Kitemark. What all these logos have in common is that they certify the processes by which something was created or sourced – consumers may not want all the gory details but they want to know that somebody has checked things out so they can trust the product. Scholarly publishers need to take note of this trend.
This is starting to happen in the scholarly world too. There is a wealth of new services and sites, and recently a group of bloggers got together and created a logo to highlight serious blog posts about peer-reviewed literature.
Now to turn to scholarly publishing. I think there are two big problems at the moment for scholarly publishers. The first is that the publishing process is invisible and many researchers, let alone the general public, do not really understand what goes into producing scholarly content.
A recent article in Learned Publishing made a plea to scholarly publishers to promote peer review more effectively and challenge its detractors or doubters.
An overview of the editorial process, from an article describing Open Journal Systems, I thought really captured what goes into producing a scholarly journal and demonstrates the added value that publishers provide, but to a large extent this whole process is invisible to readers.
Problem number 2 is that many people, even some publishers, think that the publisher’s job is done when a “Final Version” is published. However, publishers have a critical role post-publication for trustworthy scholarly content. Who else but the publisher can oversee content post-publication, maintain the version of record, and issue corrections, retractions and errata?
The recently issued NISO Journal Article Version guidelines start to highlight this in that there is a Version of Record but also Enhanced and Corrected Versions of Record – events that happen post-publication.
This then brings me to CrossMark: web-based tools to enable users to identify trustworthy content, and native web signals to inform users of the added value provided by scholarly publishers. Publishers have to be more open and demonstrate to users in a simple way why scholarly content is authoritative and trustworthy – that it’s not “truthy”. Publishers have to demonstrate that their content is worthy of trust – not just demand or expect that they will be seen as trustworthy – publishers have to prove it.
Geoffrey Whitson Bilder; Bilder, Geoffrey Whitson; Geoffrey W. Bilder; Bilder, Geoffrey W.; Bilder, G. W.; G. W. Bilder; Geoffrey Bilder; Bilder, Geoffrey; G. Bilder; Bilder, G.
Much worse with lots of Asian authors – fewer surnames.
Classic problems of identity – CR is in a position to do something cross-eyed.
Internal holding name – talked to you about some of the problems with name authority but publishers have some specific problems with use of manuscript tracking systems – so authentication is a real issue. So we must address both authorization and disambiguation.
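The disambiguation problem can be made concrete with a small sketch. It is only an illustration: the `name_key` function and the second author, "Gina Bilder", are hypothetical and not part of any CrossRef system. It shows that the ten spellings listed above collapse to a single "surname plus initial" key, and that the same key also matches a different person, which is why a name string alone cannot identify an author.

```python
def name_key(name: str) -> str:
    """Collapse a personal name to 'surname initial', lowercased."""
    name = name.strip()
    if "," in name:
        # "Surname, Given names" form
        surname, given = (part.strip() for part in name.split(",", 1))
    else:
        # "Given names Surname" form: treat the last token as the surname
        *given_parts, surname = name.split()
        given = " ".join(given_parts)
    return f"{surname} {given[:1]}".lower()


# The ten spellings of one author's name from the talk
variants = [
    "Geoffrey Whitson Bilder", "Bilder, Geoffrey Whitson",
    "Geoffrey W. Bilder", "Bilder, Geoffrey W.",
    "Bilder, G. W.", "G. W. Bilder",
    "Geoffrey Bilder", "Bilder, Geoffrey",
    "G. Bilder", "Bilder, G.",
]

keys = {name_key(v) for v in variants}

# A hypothetical second author who lands on the same key
collision = name_key("Gina Bilder")

if __name__ == "__main__":
    print(keys)
    print(collision in keys)
```

Matching on a normalized name merges the variants, but it also merges different people, so a persistent identifier assigned to the person is needed in addition to any name matching.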
Study of cracked pots. Author submits manuscript – first time for the journal. So the journal says – who are you? Go get an ID from CrossRef – author registers and gets an identity to provide to the journal. Can claim things like what they’ve published in the past – institution, homepage.
Interesting thing – article is published and metadata goes to CrossRef – author IDs can be sent too. Once CrossRef has them then we can figure out what that author has published (not just what they claim they’ve published). Publishers are verifying author claims. Multiple levels of claims – some more authoritative than others. Series of different trust measures.
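The idea of multiple levels of claims can be sketched as a small data structure. This is a minimal sketch under stated assumptions, not a description of how Contributor ID is implemented: the class, the field names, the level labels, the identifier format and the DOIs are all made up for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Contributor:
    """Hypothetical contributor record holding two kinds of claims."""
    contributor_id: str  # made-up identifier format
    # DOIs the author says are theirs
    self_claimed: set = field(default_factory=set)
    # DOIs a publisher deposited together with this contributor ID
    publisher_asserted: set = field(default_factory=set)

    def trust_level(self, doi: str) -> str:
        """Return the strongest claim that exists for this DOI."""
        if doi in self.publisher_asserted:
            return "publisher-verified"
        if doi in self.self_claimed:
            return "self-claimed"
        return "unknown"


# Author registers and claims two earlier papers (DOIs are invented)
author = Contributor("contributor-0001")
author.self_claimed.update({"10.5555/example.1", "10.5555/example.2"})

# Later a publisher deposits article metadata that carries the author's ID
author.publisher_asserted.add("10.5555/example.1")

if __name__ == "__main__":
    for doi in ("10.5555/example.1", "10.5555/example.2", "10.5555/example.3"):
        print(doi, author.trust_level(doi))
```

Keeping the two sets separate means a publisher assertion never overwrites what the author claimed; it only raises the level reported for that item.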
CrossRef And The Pursuit Of Truthiness, STM Meeting, Frankfurt, Germany, October 2008, Ed Pentz
CrossRef and the Pursuit of Truthiness STM 2008 Frankfurt Conference 14 October 2008
A Word to Worry About <ul><li>"truth that comes from the gut, not books" (Stephen Colbert, Comedy Central's "The Colbert Report," October 2005) </li></ul><ul><li>"the quality of preferring concepts or facts one wishes to be true, rather than concepts or facts known to be true" (American Dialect Society, January 2006) </li></ul>
Internet Trust Anti-Pattern <ul><li>System is started by self-selecting core group of high-trust technologists (or specialists of some sort). </li></ul><ul><li>System is touted as authority-less, non-hierarchical, etc. But this is not true (see 1). </li></ul><ul><li>The general population starts using the system. </li></ul><ul><li>The system nearly breaks under the strain of untrustworthy users. </li></ul><ul><li>Regulatory controls are instituted to restore order. Sometimes they are automated, sometimes not. </li></ul><ul><li>If the regulatory controls work, the system is again touted as authority-less, non-hierarchical, etc. But this is not true (see 5). </li></ul> “In Google We Trust?”, Geoffrey Bilder, Journal of Electronic Publishing, vol. 9, no. 1, Winter 2006
"How can we determine whether we can trust the material emanating from a site? The Web was originally conceived as a tool for researchers who trusted one another implicitly; strong models of security were not built in. We have been living with the consequences ever since. As a result, substantial research should be devoted to engineering layers of trust and provenance into Web interactions. ..."
Sir Tim told BBC News that there needed to be new systems that would give websites a label for trustworthiness once they had been proved reliable sources…So I'd be interested in different organisations labeling websites in different ways.
Authoritative article. The key idea behind the knol project is to highlight authors. Name verification is by telephone number (US only) or credit card. Knol is a wasteland of…text copied from elsewhere, outdated entries abandoned by their creators, self-promotion, spam, and a great many old college papers
Reliability and quality. We write under our real names. Participants write for academic credit.
Problem #1: The publishing process is invisible. Solution #1: Make it more visible.
Open Journal Systems: An example of open source software for journal management and publishing, J Willinsky. Library Hi Tech. 2005, Vol 23, Issue 4, p 504 doi:10.1108/07378830510636300
Problem #2: Thinking that publisher’s job is done on publication of the “final” version. Solution #2: Recognize there is no final version and publisher has role post-publication.
http://www.niso.org/publications/rp/ Version of Record Enhanced VoR Corrected VoR
CrossMark <ul><li>A visible kitemark for humans – licensed from CrossRef </li></ul><ul><li>A mechanism for publishers to make a statement of ongoing stewardship </li></ul><ul><li>A reliable mechanism for users to identify the version of the document the publisher is taking responsibility for </li></ul>
CrossMark <ul><li>Metadata for machines (and human geeks) </li></ul>
Erratum; Version of Record; DOI:10/1037/1114; Crackpot Press; Peer Reviewed: Yes; CrossChecked: Yes; Review Type: Double Blind; Protocols: Carberry protocol on hum; Funding: 30% Templeton
CrossMark <ul><li>Highlight pre-publication added value – publishing as a managed process </li></ul><ul><li>Highlight ongoing post-publication management of content (errata, corrections, retractions) – counter false notion that once something is published the publisher’s job is done </li></ul><ul><li>What about pre-prints? No claim of ongoing responsibility </li></ul><ul><li>What about alternate Versions of Record? </li></ul>
CrossMark <ul><li>Basic metadata to be simple but extensible by the publisher </li></ul><ul><ul><li>Beyond the minimum publishers decide what metadata gets included </li></ul></ul><ul><ul><li>Record and advertise processes employed to ensure trustworthiness – peer review, link to journal information pages </li></ul></ul><ul><li>Linked in with DOI to ensure user can locate and access the latest version of the metadata or the content if updated </li></ul>
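The "simple but extensible" idea can be sketched as a record with a small fixed core plus publisher-chosen extras. This is an assumption-laden illustration, not a CrossMark specification: the `REQUIRED` field list, the `crossmark_record` function and the key names are hypothetical, and the values are taken from the mock-up slide above.

```python
from typing import Optional

# Assumed minimum set of fields; the actual minimum is not defined in the talk
REQUIRED = ("doi", "status", "publisher")


def crossmark_record(core: dict, extensions: Optional[dict] = None) -> dict:
    """Build a record from required core fields plus optional publisher extras."""
    missing = [key for key in REQUIRED if key not in core]
    if missing:
        raise ValueError("missing required fields: " + ", ".join(missing))
    record = {key: core[key] for key in REQUIRED}
    # Beyond the minimum, the publisher decides what gets included
    record["extensions"] = dict(extensions or {})
    return record


record = crossmark_record(
    {"doi": "10/1037/1114", "status": "Version of Record", "publisher": "Crackpot Press"},
    {"peer_reviewed": True, "crosschecked": True, "review_type": "Double Blind"},
)

if __name__ == "__main__":
    print(record)
```

Keeping the extras in their own `extensions` mapping lets a reader of the record rely on the core fields always being present, while each publisher adds as much or as little process detail as it wants.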
CrossMark Pilot <ul><li>Detailed use cases </li></ul><ul><li>Examples of use in different formats (HTML, PDF) </li></ul><ul><li>Develop techniques and processes to ensure integrity of the system </li></ul><ul><li>Draft business case and policies (who can use it and on what content) </li></ul>
"together we can create a reality that we all agree on — the reality we just agreed on…any user can change any entry, and if enough users agree with them, it becomes true."
Mission Statement <ul><li>To enable easy identification and use of trustworthy electronic content by promoting the cooperative development and application of a sustainable infrastructure </li></ul>