Not so many years ago, when online publication appeared in the scientific press, it was intended as nothing more than a side service. A scientific review was still a periodical collection of pages compiled, printed and distributed by an editor in the form of a booklet, which was bought, indexed and made available by academic libraries. The physical support was so inseparable from its content that we still refer to a scientific article as a ‘paper’.
However, from the day reviews started to be compiled on personal computers, editors realized that they could easily export an electronic copy of the articles they were collecting and post it on their websites. Often, editors provided just the title and abstract of the articles online as a teaser for readers. Sometimes, they uploaded the full text as an extra service offered to libraries already subscribed to the hardcopy version.
Soon, however, librarians started to realize how much simpler and less expensive it was to manage digital files instead of paper booklets. Consequently, they informed scientific editors that they would rather buy just the electronic version of their reviews. To meet this demand, some editors set up online archives for their contents. Others, the majority, decided to rely on portals that gathered and standardized contents coming from different scientific editors. Within a few years, most of the scientific press became available through a handful of specialized portals.
By this time, scholars had discovered how convenient it was to search for their bibliographic references online. Instead of wandering through catalogues and shelves, they could sit at their desks, type a few keywords and consult hundreds of papers straight on their computer screen. Nowhere was such an abundance of scientific literature available other than in online archives. In a couple of decades, the entire chain of scientific publication (from paper submission to bibliography compiling) has gone online.
So far, the digitalization of the scientific press has had three main consequences. First, many journals have introduced an “online first” publication for forthcoming articles, thereby reducing the lag between acceptance and actual publication.
Second, a new generation of bibliographic tools has been crafted drawing on the new affordances of electronic articles. Equipped with databases such as Google Scholar, the ISI Web of Knowledge and Scopus, and tools such as Mendeley and Zotero, scholars are now able to access and handle literature at a scale that was unthinkable just a few years ago.
Third, the business model of scientific editors has been undermined as scholars realized that, thanks to digital techniques, they could easily replace editors in collecting and disseminating papers. The result of this discovery is the Open Access Movement, which works to make articles (or their pre-print versions) freely available to the scientific community (Willinsky, 2006). The way articles are written, distributed and exploited is radically changing and yet, strangely enough, articles themselves are not: they are still papers. Unaffected by digital revolutions, articles continue to be made of (plain) text and (few) images.
The fact that papers are the only unchanged link in the chain of the scientific press is even more remarkable as research practices themselves have been deeply affected by digital technologies. Disciplines such as physics, chemistry and biology have been thoroughly renovated by electronic computing and the same renewal is taking place, a few years later, in the human (digitalhumanities.org) and social sciences (Lazer et al., 2009). Not since the 16th century have scientific disciplines converged so unanimously on one single research tool. Today, historians, economists, psychologists and ethnographers, as well as biologists, mathematicians, physicists and chemists, all use similar personal computers to collect, process and store their data. They all feed their data into similar databases, spreadsheets and word processors. They all collaborate on the same digital networks. Their papers can be found in the same online libraries, downloaded to the same hard disks and read on the same screens. Before computers, the only medium to enjoy the same irresistible success in the scientific community was the movable-type press. We know how deeply Gutenberg’s invention affected the birth of modern sciences (Eisenstein, 1979) and there is evidence that digital technologies may play the same role in the next few years.
In social (as well as in natural) sciences, research practices are mutating under the influence of electronic tools. Not only are existing methodologies facilitated and enhanced by digital technologies, but new digital methods are emerging that were unthinkable just a few years ago (Roger, 2009). Social scientists are now confronted by datascapes comparable in scale to those of the natural sciences. They profit from semi-automatic data treatment, flirt with simulation and information visualization, share data over digital networks and experiment with online collaboration. Digital technologies have changed the way we investigate social phenomena and might change the way we conceive them (Latour et al. 2012).
This amazing effervescence spreading through laboratories all over the world is hardly visible when one looks at the actual products of social sciences. No matter how innovative the source or the analysis of the data may be, the results of social research are still delivered in the two-hundred-year-old form of a scientific paper. Even if both the chain of scientific production and that of scientific dissemination have been completely renovated, the very link between the two remains unchanged. Scientific papers still have the exact same form they used to have when research was made and published on paper. This inertia prevents the scientific press from divulging the full richness of computer-enhanced research and from taking advantage of the potential of online publication. Of course, scholars can publish rich multimedia accounts of the results of their research on their websites; of course, portals on social sciences can be organized. Yet, when it comes to career evaluation (in particular where scientometric indicators are concerned), scientific papers are the only publications that really matter. Papers are the bottleneck of scientific research.
The last sentence of the previous paragraph needs to be qualified: there are excellent reasons why papers remain the bottleneck of scientific research. For one thing, the ‘organized skepticism’ of modern science (Merton, 1973) would be impossible without a standardized way to identify and cite the ideas submitted to the evaluation of the scientific community. To support or criticize the work of their colleagues, scientists need to be able to refer to it unequivocally. This, in fact, is the main function of the system of papers and journals. Across decades and disciplines, papers provide a consistent way to express and discuss scientific results. Wherever they are, whatever subjects they study, whatever methods and theories they use, scholars know that their contribution will eventually translate into a few pages of text and images that can be easily identified, transported, archived and compared to others. Achieving such standardization took more than three centuries (since the publication of the Journal des Sçavans and the Philosophical Transactions of the Royal Society) and it would be silly to overlook its advantages.
Still, an increasing number of scientific practices are emerging that cannot comfortably be squeezed through the bottleneck of paper publication. For example, the more data and computational power become available for investigation, the more scientists realize the value of databases and analytic algorithms as scientific results. Two important sets of initiatives have been taken in this direction. On the one hand, several online archives have been created to encourage the publication of scientific datasets on the web (e.g. the Harvard Dataverse Project, thedata.org, and the UK Data Archive, data-archive.ac.uk).
On the other hand, many of the most prestigious journals are starting to require their contributors to release their data and computer code for purposes of replication and scientific quality (Ince et al., 2012), leading to a dramatic improvement in methodological rigor and a greater awareness of the advantages of data and code sharing. These praiseworthy initiatives, however, are meant to improve the transparency and reproducibility of science, not to enhance the publication of scientific results. Data and source code are not the only non-textual items that scholars may want to publish.
In the last few years, many scientists, particularly in human and social sciences, have started experimenting with new techniques of data visualization and information communication. More and more, the results of social investigations are therefore delivered as videos, graphic visualizations, interactive animations and navigation interfaces. None of these formats fits the traditional paper-journal system, thereby forcing scholars to write and publish textual descriptions of their multimedia results. These texts offer a pale substitute for the original results and, in many cases, their only function is to provide the address of the web page where the original results are actually published. This solution is unsatisfactory because it entrusts the publication of important scientific results to the self-publishing efforts of their authors instead of including it within the scope of scientific publishing. There is, however, no good reason why this situation should not evolve. The development of digital media and scientific publishing has removed most of the economic and technical obstacles that prevented extending and improving on the formats of scientific publishing while at the same time conserving the advantages characteristic of the traditional paper-journal system. In the next paragraphs, we will discuss how such extension and improvement could be obtained and what its advantages would be.
While multimedia and interactivity may enhance the reproducibility of scientific experiments (in particular, the experiments that are ‘natively digital’), it is crucial that the other features of traditional publishing are not lost in the process. As we said, there are several good reasons why scientific papers have so far remained the only accepted format for scientific publications. Papers have four features that have made them irreplaceable in modern science: they are citable (it is possible to identify them unequivocally); accessible (they can be accessed by anyone at a reasonable cost); durable (their maintenance is relatively easy) and stable (once published, they cannot be intentionally or unintentionally altered). Without these features, scientific publications would not be able to contribute to the dialogue of the scientific community. If digital publications are to enrich the reproducibility of scientific publications, they have to be at least as citable, accessible, durable and stable as traditional papers, but they also have to combine these features with the multimedia and interactive potential of online media. Luckily, in the last few years, digital technologies have reached the maturity necessary to make this possible.
Multimedia is the capacity of digital technologies to draw together formats originally developed for separate media, such as texts, images, videos and sounds. Multimedia is interesting for scientific publishing because it lets authors profit from new formats of publication (see, for example, how the Journal of Visualized Experiments – jove.com – is experimenting with the use of videos to present scientific protocols), without renouncing the advantages of the old ones. In a multimedia environment, the most diverse elements can find their place within the traditional textual structure of scientific papers (in the same way notes, images and tables are currently embedded in text). Interactivity is the capacity of digital technologies to offer a non-linear exploration of a message. Though traditional formats have always allowed their readers some interaction (through textual devices such as notes, references, citations and tables), the possibilities opened up by online media are unprecedented. Today, instead of just describing their research protocols, scholars can embed them (or part of them) in the publication. In particular, most digital analyses can be opened to the reader by publishing datasets together with the tools that allowed their investigation (see the example of ManyEyes – www-958.ibm.com).
From a technical point of view, there are no major obstacles preventing scientific publishing from exploiting the potential of digital technologies. Web technologies, in particular, have proved to be a perfectly suitable support for scientific communication. Created within the world’s largest international research organization (CERN, in Geneva), the Web was originally conceived as a tool to enhance scientific dialogue (Berners-Lee, 1999). Developed in an academic environment and inspired by the practices of academic publishing, web protocols have a natural affinity with the publication principles discussed above, thereby reducing the cost of migrating online.
Citability. According to many observers, the most important brick in the development of the Web was the introduction of Uniform Resource Locators (URLs). URLs provide a system of unique addresses allowing any file available online to be reached by any computer connected to the Internet. Embedded in web pages as hyperlinks, URLs offer a convenient and unambiguous system of citation among any type of document. What is also interesting about URLs is their scalability: they can be used to identify a domain (as in domain.com); a website within a domain (as in website.domain.com); a folder within a website (as in website.domain.com/folder); a document within a folder (as in website.domain.com/folder/document.html); an element within a document (as in website.domain.com/folder/document.html#element). The platform for digital scientific publishing we are proposing should draw significantly on the URL system to assure the citability of its publications. Each article published on the platform will therefore be assigned a permanent URL (also called a ‘permalink’) that will identify it unequivocally and enduringly. Elements within articles will be identified by anchors or sub-URLs (also permanent), allowing scholars to cite them directly (the same way it is possible to cite a specific page or paragraph within a traditional scientific paper). For citability to be assured, it is also indispensable that all the elements composing the submitted articles are stored at the same address. External hyperlinks, therefore, will only be accepted if they are used as citations and if they can be removed without compromising the integrity of the article.
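The nesting of URL levels described above can be illustrated with a minimal Python sketch based on the standard `urllib.parse` module. The function name `url_levels` and the rule used to extract the domain (the last two labels of the host name) are assumptions made for illustration only; they are not part of the proposed platform, and the two-label rule would not hold for hosts such as `example.co.uk`.

```python
from urllib.parse import urlsplit


def url_levels(url: str) -> dict:
    """Split a URL into the nested levels of citation discussed in the text."""
    parts = urlsplit(url)
    host = parts.hostname or ""
    # Illustrative assumption: the domain is the last two labels of the host.
    domain = ".".join(host.split(".")[-2:])
    # Separate the folder path from the name of the final document.
    folder, _, document = parts.path.rpartition("/")
    return {
        "domain": domain,            # e.g. domain.com
        "website": host,             # e.g. website.domain.com
        "folder": folder or "/",     # e.g. /folder
        "document": document,        # e.g. document.html
        "element": parts.fragment,   # e.g. element (the anchor after '#')
    }


levels = url_levels("http://website.domain.com/folder/document.html#element")
for name, value in levels.items():
    print(f"{name}: {value}")
```

Citing a single figure or paragraph then amounts to appending its anchor to the article's permalink, in the same way a page number is appended to a traditional reference.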
Accessibility. Making web documents accessible to a growing number of people in the world required and still requires considerable efforts. Most of these efforts have been invested in building the physical infrastructure necessary to extend access to the Internet. As this infrastructure grows more and more pervasive, however, accessibility problems (the so-called digital divide) seem to be moving toward the ‘last mile’: the actual software that allows documents to be read online. This software, usually called a browser or reader, exists in several proprietary and open solutions, each one with multiple versions released at different times for different devices (computers, mobile phones, tablets…). Even for the relatively simple task of publishing traditional books (with nothing more than text and images) in electronic form, there exist dozens of competing (and often non-interoperable) hardware and software solutions. To be universally accessible, scientific publications should therefore be delivered in formats that can be properly read by as many browsers and readers as possible. This is why the digital scientific publishing platform we are proposing should rely on the Web standards as defined by the World Wide Web Consortium (W3C) and conform to the guidelines issued by the Web Accessibility Initiative (WAI). Complying with web standards is the best guarantee that the content of scientific publications will be accessible to the largest possible audience (Zeldman, 2010). Also, conformity to W3C standards can be easily checked by running the articles submitted to the platform through one of the many code validation services (e.g. validator.w3.org).
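As an illustration of the kind of automated check that could be run on a submitted article, the following Python sketch flags images that lack a text alternative, one of the requirements of the WAI guidelines. It is only a sketch under stated assumptions: the class `MissingAltFinder`, the function `images_missing_alt` and the sample markup are hypothetical, the check covers a single rule, and it is in no way a substitute for a full validation service such as validator.w3.org.

```python
from html.parser import HTMLParser


class MissingAltFinder(HTMLParser):
    """Collect the sources of <img> elements that have no alt attribute."""

    def __init__(self):
        super().__init__()
        self.missing = []

    def handle_starttag(self, tag, attrs):
        if tag == "img":
            attributes = dict(attrs)
            # An empty alt="" is allowed (decorative image); a missing one is not.
            if "alt" not in attributes:
                self.missing.append(attributes.get("src", "(no src)"))


def images_missing_alt(html: str) -> list:
    """Return the src of every image in `html` that lacks a text alternative."""
    finder = MissingAltFinder()
    finder.feed(html)
    finder.close()
    return finder.missing


# Hypothetical fragment of a submitted article.
sample = (
    '<p>Figure 1 <img src="map.png" alt="Map of citations"> '
    '<img src="chart.png"></p>'
)
print(images_missing_alt(sample))
```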
Durability. Besides allowing the largest accessibility, recent web standards have another important advantage: they assure the forward and backward compatibility of the documents they encode. This compatibility is a crucial achievement of the latest version of the Hyper Text Markup Language. HTML 5 is forward-compatible, meaning that future versions of standard-complying browsers will be able to read the current version of the language, despite all possible future changes of the standard, and backward-compatible, meaning that HTML documents degrade gracefully in older browsers conceived for previous versions of HTML. Relying on HTML 5 technologies, the scientific publishing platform we are proposing will be durable because the legibility of its content will be assured through time by the forward and backward compatibility of web standards. Though it is well known that digital technologies are affected by very fast obsolescence, employing compatible standards guarantees that scientific publications will remain readable for a relatively long period of time without requiring conversion or other software maintenance. At the very least, HTML being a markup language (a language meant to add annotations to a natural-language text), it always remains possible to print HTML pages and read the text they contain. In a sense, HTML assures some backward compatibility even with traditional print.
Stability. The efforts deployed to assure the durability of web standards may be defeated by the ease with which it is possible to add, remove or modify any online file. The possibility of changing online documents at an infinitesimal cost is surely one of the greatest advantages of web technologies, but it does create a problem for scientific publishing. The citability of scientific publications only makes sense if the content of such publications remains stable. If authors were allowed to correct their statements after publishing them, it would become very difficult to know which version of an article is being referenced. In digital technologies, this problem has been solved by introducing a series of versioning systems. By making it possible to distinguish different versions of the same document and to switch back and forth among them, versioning systems have played a crucial role in supporting distributed and collaborative projects in open-source and open-editing. Wikipedia is probably the best-known example of this versioned collaboration. The scientific publishing platform we are proposing will not, however, offer a versioning system. The reason is simple: keeping track of different versions makes sense only when the aim is to encourage the continuous modification of documents. Versioning promotes agile and bootstrapping approaches (Raymond, 2001) where the quality of publications is assessed and improved after their release. Such a model is very far from the model of scientific publication, which rests on editorial and peer review to limit the number of publications and make sure that only the best results are publicly released. Though versioning and distributed collaboration certainly have their place in science, their advantages are probably clearer in the phases that precede publication. As far as stability is concerned, the type of publication we are imagining will resemble paper print more than e-publishing.
Although no technical reasons prevent articles from evolving (and from being traced in such evolution), the platform we are proposing will not allow such changes. Once a digital article is reviewed and published, it will not be allowed to change, except as an entirely new submission.
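The text does not specify how the platform would enforce this immutability. One possible approach, offered here purely as an assumption and not as the platform's design, is to record a cryptographic fingerprint of each article at publication time and to compare it with the fingerprint of the file served later. The Python sketch below uses SHA-256 from the standard `hashlib` module; the function names and the sample article are hypothetical.

```python
import hashlib


def fingerprint(article_bytes: bytes) -> str:
    """Return the SHA-256 digest of an article as a hexadecimal string."""
    return hashlib.sha256(article_bytes).hexdigest()


def is_unaltered(article_bytes: bytes, recorded: str) -> bool:
    """Check that an article still matches the fingerprint recorded at publication."""
    return fingerprint(article_bytes) == recorded


# Hypothetical article, fingerprinted once at the time of publication.
published = b"<article><h1>Title</h1><p>Result.</p></article>"
recorded = fingerprint(published)

print(is_unaltered(published, recorded))                      # same bytes
print(is_unaltered(published.replace(b"Result", b"Other"), recorded))  # edited copy
```

Any modification of the file, even of a single character, produces a different fingerprint, so a corrected article would be recognizable as a different object and would have to be handled as a new submission.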
Multimedia and interactivity. Despite what is often believed, Web technologies were not originally created to support multimedia and interactive communication. As their names indicate, both the main protocol (hypertext transfer protocol – HTTP) and the main language of the Web (the hypertext markup language – HTML) were developed with the distribution of texts as their main preoccupation. Initially, the only addition to the traditional textual formats was the possibility of connecting several texts through hyperlinks. Soon, however, the notion of link was extended to allow embedding external resources within web pages, first images and then videos, sounds, applets and various multimedia objects. Analogously, the possibility of clicking on an element of the page and triggering an action (initially the movement toward another page) was soon extended to support different types of interaction with web documents. More recently, the advent of HTML 5 has marked the official acknowledgment of the role of multimedia and interactivity in online communication. Offering better support for handling data, videos, sounds, dynamic images (SVG and canvas) and style sheets, and making all these elements (and others) easily programmable through JavaScript, HTML 5 is greatly enhancing the multimedia and interactive potential of web standards (thereby bypassing non-standard languages such as Flash ActionScript). Furthermore, HTML 5 and related technologies are executed within the browser. This means that (compared to languages such as PHP or Python) the burden of computation is moved from the server publishing the documents to the computer accessing them. This helps limit the broadcast and maintenance costs of scientific publishing. At the same time, since web technologies are interpreted and rendered by the browser, users do not need to install any additional software or applet, thereby reducing security risks and compatibility problems.
For all these reasons, HTML 5 and related technologies (in particular JavaScript) seem particularly suitable for scientific publishing. Other technologies exist, of course, that are widely used in research and could be transmitted over the Internet (see for instance the work done by the project HUBzero.org to develop an environment capable of transferring online any experiment of computational science – McLennan & Kennell, 2010). We feel, however, that web standards offer the best cost-benefit compromise. With relatively little effort for publishers and readers, web technologies can open scientific literature to a vast range of new formats and combinations of formats.
In the end, it is clear that the obstacles to the renewal of scientific publishing are not technological, but organizational. The platform we described in the previous pages is not difficult to implement from a technical point of view, but will be useless if it does not succeed in being recognized as a legitimate scientific publication. For this reason, the editorial policies of the platform we are imagining will be as rigorous as those of the most influential reviews in social sciences. It will be led by an international steering committee gathering scholars who have distinguished themselves by their contributions to the digital renovation of social sciences. It will organize a meticulous process of peer review by mobilizing reviewers from diverse disciplines and countries. In selecting its articles, the platform will be as severe in contents as it will be flexible in forms. Agreements will be made with all major (and minor) citation databases in order for articles to be well referenced and highly visible on the scientific web. All these precautions and others will be taken in order to make sure that, in terms of academic and scientometric status, JODSS articles will be as valuable as papers published in other peer-reviewed journals. To meet the difficult challenge of establishing as legitimate scientific publications articles that do not comply with the format of traditional publishing, the platform we are imagining will be introduced in early 2014 as a part of Big Data & Society, a new journal dedicated to digital social sciences and published by Sage.