Digital Texts: scholarly communication in a digital networked age

  • Tony Hirst. Twitter: @psychemedia. Blog: http://blog.ouseful.info. “On the limited usability and functionality of digital publication concepts. The conceptualisation of formal digital publications remains too much stuck in the old hierarchical paradigm of the author in charge, and does not match modern democratic user expectations, formed by collaborative knowledge production practices. What is needed is a far more open attitude to knowledge communication, and greater user-centeredness.”
  • Dominant theme is /scholarly/ communication, but we should also take care to bear in mind these acts as communicative acts in general. As I hope to show, news communication is also struggling to identify the most effective way of engaging audiences in a networked world, though for slightly different reasons.
  • Let’s just air some assumptions about the way the system works at the moment. To begin with, we have the unit of publication, our publication atoms, typically: articles in peer reviewed journals, though this may be slightly different in different subject areas: computer science, for example, credits peer reviewed papers presented at certain well-regarded international conferences, the arts may prefer slightly longer form monographs, and so on. But even so, the peer-reviewed journal article still carries gravitas in most, if not all, academic subject areas.
  • This diagram charts a typical academic publication process.
  • As I have the honour of presenting the first presentation of the conference, I’d like to open up a few questions that may be worth bearing in mind over the next couple of days. Who are the audiences for scholarly communications?
  • What, actually, are the things we publish?
  • What distinguishes academic/scholarly communication from any old communication?
  • Why do we bother publishing *scholarly* works?
  • At this point, I’d just like to mention a presentation – and consequently a series of blog posts – delivered by Martin Belam at News:Rewired in London at the start of 2009. The themes covered related to the way in which news media, particularly in their publication of topic related content, deliver it in a reverse chronological way. However, when presented with an anniversary of an event, or a set piece news event, they can draw on a wealth of material to produce well-rounded analytical reviews and reflective comment. (An example Martin gives is the anniversary of the moon landings.)
  • One of the activities of scholars in centuries past was to summarise all that had gone before (either with, or without, additional comment).
  • One reason for this was to support discovery – a review of the sum total of human thought to date acted as some sort of narrative metadata that set in context that previous work, and allowed it to be (re)discovered, as well as raising awareness of links between different works.
  • Of course, now we have Google, and full text search – in certain respects, the contents of a document doubles up as its metadata. (Martin Belam has another nice example to demonstrate how this is not always effective – a camper van story illustrating the topic of “Gordon Brown”.) Google also works in the context of documents drawn from a myriad different sources, a networked environment where otherwise unrelated documents may be thrown next to each other in the search results page for a particular search query.
  • So that’s the context… What I’d now like to do is explore a couple of ideas relating to the state of scholarly publishing in a networked, social environment. Those ideas are: the (social) structure of communication networks; and the atoms of publication.
  • First, the structure of communication networks.
  • The first point I’d like to make is that networks are typically structured. Some areas of a network may be more highly internetworked than other parts. Some parts of the network may only be very loosely connected, and to all intents and purposes /dis/connected. In certain respects, we might think of the effectively disconnected parts of the network as silos…
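To make the idea of effectively disconnected parts concrete, here is a minimal Python sketch that splits a network into its connected components. The names and the toy edge list are hypothetical, chosen only for illustration; real networks are rarely this cleanly separated, and loosely connected regions would need a softer measure than strict disconnection.

```python
from collections import defaultdict

def connected_components(edges):
    """Group nodes into components; each component is a set of nodes."""
    neighbours = defaultdict(set)
    for a, b in edges:
        neighbours[a].add(b)
        neighbours[b].add(a)
    seen = set()
    components = []
    for start in list(neighbours):
        if start in seen:
            continue
        # Depth-first walk collecting everything reachable from `start`.
        stack, component = [start], set()
        while stack:
            node = stack.pop()
            if node in component:
                continue
            component.add(node)
            stack.extend(neighbours[node] - component)
        seen |= component
        components.append(component)
    return components

# Hypothetical toy network: two groups with no link between them.
edges = [("ann", "bob"), ("bob", "cat"), ("dan", "eve")]
silos = connected_components(edges)
```

Each set in `silos` is a group whose members can reach one another but nobody outside the group, which is the strict version of a silo.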
  • In the social sense, the siloed parts of the network may alternatively be thought of as /echo chambers/.
  • Here’s an example showing how different journals interconnect based on significant numbers of citations appearing in one journal referencing another journal. If we were to map the sum total of academic journals, we would notice different aggregations of journals.
  • What may be really dangerous – or an indicator of some sort of bias - is where there are disconnected clusters of journals within the same subject area…
  • The internal link structure of my blog.
  • As well as journal citation networks we have co-author networks. Another person-centric network type would be based on citations to named individuals, rather than either co-authors, or citations from one journal to another. Mixing types, we might also depict bipartite graphs in which there are two sorts of node – author and journal – and then plot which authors cite which journals. (I started exploring this in the context of the OU’s repository to see whether or not the fact of citing articles in journals could be used as an ROI indicator for the subscriptions held by the OU Library). As these network constructing techniques get easier to use, and citation data /as data/ becomes available, I would fully expect to see visualisation techniques around such things as paper citation networks being used to illustrate literature reviews.
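A bipartite author/journal graph of the kind described above can be sketched in a few lines of Python. The records below are invented for illustration and do not come from the OU repository; the "reach" count is just one possible indicator, assumed here for the sake of the example.

```python
from collections import Counter

# Hypothetical records: (citing author, cited journal).
citations = [
    ("author_a", "Journal X"),
    ("author_a", "Journal Y"),
    ("author_b", "Journal X"),
    ("author_c", "Journal X"),
]

def bipartite_edges(records):
    """Weighted author -> journal edges; weight = number of citations."""
    return Counter(records)

def journal_reach(records):
    """For each journal, count how many distinct authors cite it."""
    authors_by_journal = {}
    for author, journal in records:
        authors_by_journal.setdefault(journal, set()).add(author)
    return {j: len(a) for j, a in authors_by_journal.items()}

edges = bipartite_edges(citations)
reach = journal_reach(citations)
```

Because every edge joins an author node to a journal node and never two nodes of the same sort, the edge list is bipartite by construction.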
  • One of the networks I’ve been exploring recently is Twitter, as much to get a feel for how to do network analysis on a fast changing network as anything. In particular, I’ve been exploring hashtag communities – which actually resemble hashtag echo-chambers in many situations. This graph was generated using a tool called Gephi; the nodes are individuals, and the different colours represent different clusters that individuals have been assigned to using Gephi’s modularity statistic.
  • Here’s another graph – this time showing the connections between the people I follow on Twitter. Again, different clusters are visible. A question now arises – do journals provide the best means for an author to communicate a message /if that’s what the communication is for/?
  • That is, if we see scholarly communication as part of an active and ongoing discussion, is the journal framework the best place for that? Was it /ever/ for that? Or are journals more tied up with capturing snapshots of the scholarly process and incorporating them into the historical record in much the same way that the news media capture reviews of news events? (And through reference to the ideas of Martin Belam, I’ve already suggested that that model is showing signs of strain.)
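For readers unfamiliar with modularity, the following Python sketch scores how well a given assignment of nodes to clusters fits an undirected, unweighted graph, using the standard Newman modularity formula. It only evaluates a partition that is handed to it; it does not search for a good partition, which is the part a tool such as Gephi automates. The graph and cluster labels are hypothetical.

```python
from collections import defaultdict

def modularity(edges, community):
    """Newman modularity Q for an undirected, unweighted graph.

    edges: list of (a, b) pairs; community: dict mapping node -> cluster label.
    """
    m = len(edges)
    internal = defaultdict(int)   # edges with both ends in the same cluster
    degree = defaultdict(int)     # summed node degree per cluster
    for a, b in edges:
        degree[community[a]] += 1
        degree[community[b]] += 1
        if community[a] == community[b]:
            internal[community[a]] += 1
    return sum(internal[c] / m - (degree[c] / (2 * m)) ** 2 for c in degree)

# Hypothetical graph: two triangles joined by a single bridging edge.
edges = [(1, 2), (2, 3), (1, 3), (4, 5), (5, 6), (4, 6), (3, 4)]
two_clusters = {1: "A", 2: "A", 3: "A", 4: "B", 5: "B", 6: "B"}
one_cluster = {n: "A" for n in range(1, 7)}

q_split = modularity(edges, two_clusters)
q_lumped = modularity(edges, one_cluster)
```

A higher score means more edges fall inside clusters than would be expected by chance, so splitting the two triangles scores better than lumping everything together.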
  • Of course, the traditional publishing model has become the cornerstone of academic reward structures – more often than not, promotion opportunities are predicated on having a sound publication record. But what are the corresponding metrics for engagement in an active and dynamic real-time conversational active community process?
  • As an aside, it’s worth noting that when the active, participatory process takes place in public, in the open, there are opportunities for breaking outside the silo through weak ties and serendipitous discovery.
  • This open participation may include encouraging volunteers to engage with the research process – it’s probably no surprise that one of the scientific areas that continues to benefit from the activities of expert amateurs is astronomy, since the “lab” is the sky and is available to all.
  • So, this notion of openness and participation leads me, via an anecdote, to the second theme of my talk: the atoms of publication.
  • Let’s just recap for a moment before we go on, though: this is the current central dogma – that the journal article is the typical unit of publication.
  • And what do we notice about it? For a start, it’s anchored in TIME and GRANULARITY. It is like the news article in the news media.
  • But does it have to be so? The first idea I’d like to introduce here is that maybe scholars can learn from the “daily news” facet of the news media and reduce the temporal granularity with which communications are published. A great example of this is the open notebook. We all know that scientific publishing is typically revisionist, it’s the Sunday paper investigative feature article putting the chaos of the daily stories from the previous week into some sort of coherent narrative. The actual process of the science is often a world away from that recorded in the published scientific article. The actual process of science is what’s recorded in the notebook. So why not open that up, why not treat it as a partial, ongoing publication that can influence, and be influenced by, others working in the area?
  • Here’s an example of how an ad hoc collaboration can arise in quite an informal way through open boundaries and networked communication, as reported by Cameron Neylon. “Jean-Claude Bradley is the master when it comes to organising collaborations around diverse sets of online tools. The UsefulChem and Open Notebook Science Challenge projects both revolved around the use of wikis, blogs, GoogleDocs, video, ChemSpider and whatever tools are appropriate for the job at hand. This is something that has grown up over time but is at least partially formally organised. At some level the tools that get used are the ones Jean-Claude decides will be used and it is in part his uncompromising attitude to how the project works (if you want to be involved you interact on the project’s terms) that makes this work effectively. At the other end of the spectrum is the small scale, perhaps random collaboration that springs up online, generates some data and continues (or not) towards something a little more organised. By definition such “projectlets” will be distributed across multiple services, perhaps uncoordinated, and certainly opportunistic. Just such a project has popped up over the past week or so and I wanted to document it here. I have for some time been very interested in the potential of visualising my online lab notebook as a graph. The way I organise the notebook is such that it, at least in a sense, automatically generates linked data and for me this is an important part of its potential power as an approach. I often use a very old graph visualisation in talks I give out the notebook as a way of trying to indicate the potential which I wrote about previously, but we’ve not really taken it any further than that. A week or so ago, Tony Hirst (@psychemedia) left a comment on a blog post which sparked a conversation about feeds and their use for generating useful information. I pointed Tony at the feeds from my lab notebook but didn’t take it any further than that. 
Following this he posted a series of graph visualisations of the connections between people tweeting at a set of conferences and then the penny dropped for me…sparking this conversation on twitter. @psychemedia You asked about data to visualise. I should have thought about our lab notebook internal links! What formats are useful? [link] @CameronNeylon if the links are easily scrapeable, it’s easy enough to plot the graph eg http://blog.ouseful.info/2010/08/30/the-structure-of-ouseful-info/ [link] @psychemedia Wouldn’t be too hard to scrape (http://biolab.isis.rl.ac.uk/camerons_labblog) but could possibly get as rdf or xml if it helps? [link] @CameronNeylon structured format would be helpful… [link] At this point the only part of the whole process that isn’t publicly available takes place as I send an email to find out how to get an XML download of my blog and then report back via Twitter. @psychemedia Ok. XML dump at http://biolab.isis.rl.ac.uk/camerons_labblog/index.xml but I will try to hack some Python together to pull the right links out [link] Tony suggests I pull out the date and I respond that I will try to get the relevant information into some sort of JSON format, and I’ll try to do that over the weekend. Friday afternoons being what they are and Python being what is I actually manage to do this much quicker than I expect and so I tweet that I’ve made the formatted data, raw data, and script publicly available via DropBox. Of course this is only possible because Tony tweeted the link above to his own blog describing how to pull out and format data for Gephi and it was easy for me to adapt his code to my own needs, an open source win if there ever was one. Despite the fact that Tony took the time out to put the kettle on and have dinner and I went to a rehearsal by the time I went to bed on Friday night Tony had improved the script and made it available (with revisions) via a Gist, identified some problems with the data, and posted an initial visualisation. 
On Saturday morning I transfer Tony’s alterations into my own code, set up a local Git repository, push to a new Github repository, run the script over the XML dump as is (results pushed to Github). I then “fix” the raw data by manually removing the result of a SQL insertion attack – note that because I commit and push to the remote repository I get data versioning for free – this “fixing” is transparent and recorded. Then I re-run the script, pushing again to Github. I’ve just now updated the script and committed once more following further suggestions from Tony. So over a couple of days we used Twitter for communication, DropBox, GitHub, Gists, and Flickr for sharing data and code, and the whole process was carried out publicly. I wouldn’t have even thought to ask Tony about this if he hadn’t been publicly posting his visualisations (indeed I remember but can’t find an ironic tweet from Tony a few weeks back about it would be clearly much better to publish in a journal in 18 months time when no-one could even remember what the conference he was analysing was about…). So another win for open approaches. Again, something small, something relatively simple, but something that came together because people were easily connected in a public space and were routinely sharing research outputs, something that by default spread into the way we conducted the project. It never occurred to me at the time, I was just reaching for the easiest tool at each stage, but at every stage every aspect of this was carried out in the open. It was just the easiest and most effective way to do it.”
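The general shape of the task described in the quote (pull internal links out of an XML dump and format them as an edge list for a graph tool) might look something like the Python sketch below. This is not the script Cameron and Tony wrote. The XML layout, the element and attribute names, the example URLs and the `Source`/`Target` column headers are all assumptions made for illustration, since the talk does not describe the real export format.

```python
import csv
import io
import re
import xml.etree.ElementTree as ET

# Hypothetical dump layout; the real export format is not described in the talk.
SAMPLE = """<posts>
  <post url="http://example.org/notebook/1">
    <content>See &lt;a href="http://example.org/notebook/2"&gt;prep&lt;/a&gt;.</content>
  </post>
  <post url="http://example.org/notebook/2">
    <content>Follows &lt;a href="http://example.org/notebook/1"&gt;plan&lt;/a&gt;
    and &lt;a href="http://elsewhere.example/x"&gt;an external page&lt;/a&gt;.</content>
  </post>
</posts>"""

HREF = re.compile(r'href="([^"]+)"')

def internal_links(xml_text, prefix):
    """Return (source, target) pairs for links that stay inside the notebook."""
    root = ET.fromstring(xml_text)
    pairs = []
    for post in root.iter("post"):
        source = post.get("url")
        body = post.findtext("content", default="")
        for target in HREF.findall(body):
            if target.startswith(prefix):
                pairs.append((source, target))
    return pairs

def to_edge_csv(pairs):
    """Serialise pairs as a two-column edge list with a header row."""
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(["Source", "Target"])
    writer.writerows(pairs)
    return out.getvalue()

pairs = internal_links(SAMPLE, "http://example.org/notebook/")
edge_csv = to_edge_csv(pairs)
```

Keeping the extraction step (`internal_links`) separate from the formatting step (`to_edge_csv`) means the same pairs could be written out in another format, such as JSON, without touching the parsing code.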
  • So I asked Cameron a question about how his open notebook could be viewed as a data source, or something similar, and he mentioned the feeds, and it went no further. Except that we were now on the periphery of each others’ awareness in a particular regard. I posted some visualisations and mentioned a tool I was using, it gave Cameron an idea, we had a chat…
  • ..and this fell out of it…
  • Here’s how it worked – open networked tools and the path of least resistance.
  • By changing practice slightly (and this is something one of my OU colleagues, Martin Weller, has blogged about on his edtechie blog), the collateral fallout of the scholarly process may actually be useful outputs.
  • To illustrate what I mean, I’m going to use an extreme example – this document, this image, is actually created from data. It’s generated from a list of pairs of items that specify a from node and a to node. The source document is a list of pairs of words. The graph is generated from the source data in this case using a tool called Graphviz. That image is a view over some data. That document is actually a view over structured data.
  • The Graphviz-generated image leads us nicely to the next lens with which we can inspect the notion of publication atoms: living documents. Now I want to distinguish three sorts of living document if I may (partly for narrative reasons!). First: document views that are dynamically constructed.
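As a rough illustration of an image being a view over a list of pairs, this Python sketch turns from/to pairs into Graphviz DOT source text. The word pairs are placeholders, not the data behind the slide, and the function name is hypothetical. Node names containing double quotes would need escaping, which this sketch does not attempt.

```python
def pairs_to_dot(pairs, name="G"):
    """Render (from_node, to_node) pairs as Graphviz DOT source text."""
    lines = [f"digraph {name} {{"]
    for src, dst in pairs:
        lines.append(f'  "{src}" -> "{dst}";')
    lines.append("}")
    return "\n".join(lines)

# Hypothetical word pairs standing in for the source data.
word_pairs = [("scholarly", "communication"), ("communication", "network")]
dot_source = pairs_to_dot(word_pairs)
```

Saving `dot_source` to a file such as `graph.dot` and running `dot -Tpng graph.dot -o graph.png` with Graphviz installed would then render the picture; change the pairs and the picture changes with them.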
  • The first view is of documents as living in the sense of being created from data. In the ultimate case, this can mean that the document itself may become interactive, and that the data underlying the document is combined with, manipulated or influenced by the actions of the user-reader in generating the document-as-read. Here’s an example – it’s from Mathematica, and it hints at what might be possible with the computable document format that Conrad Wolfram in particular has been trying to promote. (Just by the by, there’s a fascinating post by Stephen Wolfram on his blog from 3 or 4 weeks ago when Mathematica version 8 was released about their work on “natural language programming”; well worth a read.)
  • Here’s another example. But see that link saying “Show steps”?
  • Voila…the working…
  • Just by the by, there’s an experimental service at the moment called dexy.it, which is looking at a framework within which “live” components can be embedded in something like a self-hosted Wordpress blog post, and scripts can be used to generate renderings of scripted, data specified components. I also have a list of live programming demos that can be used as part of presentations within a browser.
  • In the second case, we have interactive components that the user/reader can engage with and start to construct their own narrative. The New York Times has some great examples of this sort of interactive document produced by their visualisation lab.
  • Just recalling Martin Belam and the news media again for a moment…
  • …isn’t this the sort of thing that might be useful in a scholarly setting?
  • Of course, there are likely to be issues, not least with knowing what’s currently true. But here we may be able to call on the daily practice of software developers, who live by version control systems. Version control systems allow software developers to create snapshots of their code at any particular point in time. If a programme stops working as a result of changes, the developer can roll back to a version that works. Wikis have a history mechanism too of course (Jon Udell’s Heavy Metal Umlaut screencast is a great illustration of the power of the Wikipedia/MediaWiki history mechanism), Google Docs includes versioning so you can compare previous versions of docs, and Herbert Van de Sompel’s Memento project (remember Herbert? I opened the presentation with a quote from a paper he co-authored about the atoms of publication) is seeking to provide a mechanism for looking back at the state of web documents at particular points in time.
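The snapshot and roll-back idea can be shown with a toy Python class. It is a deliberately naive sketch with hypothetical names: real version control systems also record who changed what and when, support branching, and store changes far more efficiently than keeping full copies.

```python
import difflib

class DocumentHistory:
    """Keep every saved version of a text so any of them can be recalled."""

    def __init__(self, text=""):
        self.versions = [text]

    def commit(self, text):
        """Record a new snapshot and return its version number."""
        self.versions.append(text)
        return len(self.versions) - 1

    def checkout(self, version):
        """Return the text as it stood at `version`."""
        return self.versions[version]

    def diff(self, old, new):
        """Unified diff between two recorded versions, as a list of lines."""
        return list(difflib.unified_diff(
            self.versions[old].splitlines(),
            self.versions[new].splitlines(),
            lineterm="",
        ))

history = DocumentHistory("The result is 42.")
v1 = history.commit("The result is 41.")   # a change that turns out to be wrong
restored = history.checkout(0)              # roll back to the earlier text
changes = history.diff(0, v1)
```

The `diff` method is what makes a correction transparent: a reader can see exactly which lines changed between any two versions rather than only the latest state.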
  • Just by the by, issue tracking is another component of the software developers’ toolbox that might play a role in an open social networked scholarly publishing framework…
  • …not least because it can be an open social activity too… Issue tracking may also be used to identify points of contention, or points that require clarification.
  • It also seems to me that process steps such as issue tracking – or socially open informal publication – can slot into the peer review slot in the current publishing process, although it may also afford far more disruptive reconceptualisations of the whole process.
  • So, we’re back to the beginning again – albeit in a very contrived way – having considered two aspects of the communications environment that I believe the world of scholarly communication needs to start innovating in. Exploiting the structure of networks, not only in terms of knowledge networks, but also social networks; and exploring notions of what appropriate atoms of publishing are when live online networked and computationally powered documents are a very real possibility.
  • Contact: Tony Hirst, Department of Communication and Systems, The Open University, UK. Social media: generally known as /psychemedia/. Blog: blog.ouseful.info

    1. Texts and Literacy in the Digital Age: Assessing the future of scholarly communication
        Tony Hirst
        Department of Communication and Systems
        The Open University
    2. Scholarly communication
    3. “In the established scholarly communication system, the concept of a journal publication dominates our definition of a unit of communication”
        D-Lib Magazine, September 2004
        Volume 10 Number 9, ISSN 1082-9873
        Rethinking Scholarly Communication: Building the System that Scholars Deserve
        H Van de Sompel, S Payette, J Erickson, C Lagoze, S Warner
        http://webdoc.sub.gwdg.de/edoc/aw/d-lib/dlib/september04/vandesompel/09vandesompel.html
    4. Peer Review
        Author
        “Editing”
        Page layout
        Publication
        Deposit
    5. Publish to whom?
        Peers
        Other disciplines
        Research students (training)
        Undergraduate students (teaching)
        Industry (knowledge/technology transfer)
        News media (expert commentary)
        Public (public engagement)
    6. Publish what?
        Text
        Diagrams
        Data
        Analyses on, or views over, datasets (queries)
        -> protocols
    7. Peer Review
        Quality control
        Replicability
        Validity
        Clarity
        (Discovery/extension – recommend additional references)
        Conformity with formal publishing norms
    8. Publish for what purpose?
        Reviewing the field
        Making the next step in an argument
        Documenting a process
        Collecting and/or analysing evidence/data
        Correcting an error elsewhere
        [ANCHORING KNOWLEDGE]
        (Making a claim, building reputation)
        Influence policy
    9.
    10.
    11. Summaries as an aid to discovery
    12. Discoverability
        “Einstein Google Logo At Google…” by dannysullivan
    13. 2
        The structure of communication networks
        The atoms of publication
    14. 1
        The structure of communication networks
    15. Under a network view, the silos become self-evident
    16. Echo chambers
    17. Visualization of the Citation Impact Environments of Scientific Journals, Loet Leydesdorff
    18. “Scientific journals tend to cite one another in dense clusters which represent specialties.” [ Leydesdorff ]
    19.
    20. Co-author networks
    21. “Hashtag communities”
    22. Personal Twitter Networks
    23. Scholarly communication as HISTORICAL RECORD or ACTIVE COMMUNITY PROCESS
    24. “Reach”
        “Reputation”
        “Influence”
    25. Participatory Activity (Communication through participation)
    26.
    27. 2
        The atoms of publication
    28. “In the established scholarly communication system, the concept of a journal publication dominates our definition of a unit of communication”
        D-Lib Magazine, September 2004
        Volume 10 Number 9, ISSN 1082-9873
        Rethinking Scholarly Communication: Building the System that Scholars Deserve
        H Van de Sompel, S Payette, J Erickson, C Lagoze, S Warner
        http://webdoc.sub.gwdg.de/edoc/aw/d-lib/dlib/september04/vandesompel/09vandesompel.html
    29. Anchored in TIME and GRANULARITY
    30. Open Notebooks
    31. http://cameronneylon.net/blog/a-little-bit-of-federated-open-notebook-science/ [ http://bit.ly/ifp3kq ]
    32. “A week or so ago, Tony Hirst (@psychemedia) left a comment on a blog post which sparked a conversation about feeds and their use for generating useful information. I pointed Tony at the feeds from my lab notebook but didn’t take it any further than that. Following this he posted a series of graph visualisations of the connections between people tweeting at a set of conferences and then the penny dropped for me…sparking this conversation on twitter…”
    33.
    34. “Over a couple of days we used Twitter for communication, DropBox, GitHub, Gists, and Flickr for sharing data and code, and the whole process was carried out publicly”
    35. “So another win for open approaches. Again, something small, something relatively simple, but something that came together because people were easily connected in a public space and were routinely sharing research outputs, something that by default spread into the way we conducted the project. It never occurred to me at the time, I was just reaching for the easiest tool at each stage, but at every stage every aspect of this was carried out in the open. It was just the easiest and most effective way to do it.”
    36. Document as a DATAbase
    37. Co-author networks
    38. Living Documents
    39.
    40.
    41.
    42. ( Dexy.it )
    43.
    44.
    45. Wouldn't it be superb instead if, when faced with a long-tail query like "obama foreign policy cuba venezuela", we could answer not with a list, but with a relevant composite article that pulled together the key points from all those articles into one overview document - dynamically.
    46. Version Control
    47. Issue Tracking
    48. (Issue tracking can be a social activity)
    49. Peer Review
        Author
        “Editing”
        Page layout
        Publication
        Deposit
    50. 2
        The structure of communication networks
        The atoms of publication
    51. Tony Hirst
        @psychemedia
        blog.ouseful.info
