Open Access and the Structure of Academic Communication / By Hans Pfeiffenberger, Alfred
Wegener Institute for Polar and Marine Research, Bremerhaven
开放获取与学术交流的结构 / 汉斯·斐芬博格 阿尔弗雷德·魏格纳极地与海洋研究所,不来梅港
The criterion for any assessment of the system and means of academic communication must be their
maximum contribution to the benefit and progress of science and scholarship. In particular, individual
means are no more than a necessary service for science and scholarship, a service whose effectiveness
and efficiency need occasional examination.
学术交流系统和方法的评鉴标准,必须是对科学与学术的进步作出最大的贡献。各种方法只是
为科学与学术提供必要服务,服务效果和效率需要不时加以检验。
State of the art
现状
Beginning in 1665 with the Philosophical Transactions of the Royal Society, the peer-reviewed article
in an academic journal has become the gold standard of academic communication. It partly replaced
communication by letter between colleagues and rivals. For the pure communication of results, the
journal was and is more efficient, and professional editing probably also makes it more effective. The
‘date stamp’ of a trustworthy publisher established priority, which previously was difficult or
impossible to determine in disputed cases. Finally, refereeing by peers provided a quality assurance that
allowed the reader, to a degree, to accept the facts and conclusions contained in the paper as correct.
自 1665 年《英国皇家学会自然科学会报》发刊以来,学术期刊上刊登的同行评审论文,成为学
术交流的标准程序,部分取代了同事和对手间的信函交流。单纯从成果交流来看,期刊更有效
率,而专业的编辑也使之更有效果。值得信赖的出版社的“日期印戳”确定着优先权,而之前对
争议案例很难或不可能解决。最后,同行的评审提供质量保证,在某种程度上使读者相信,论
文中的事实与结论是正确的。
To a certain extent, the journal article still has these advantages, even if these are no longer undisputed.
Above all, the efficiency of communication is fundamentally questioned when the Blue Ribbon/Atkins
Report(41) notes that: ‘The primary access to the latest findings in a growing number of fields is
through the Web, then through classic preprints and conferences, and lastly through refereed archival
papers.’
在一定程度上,期刊论文仍然具有这些优势,即使不再是无可争议的。尤其是,其交流的效率
从根本上受到质疑,《蓝带/阿特金斯报告》(注 41)指出:“在越来越多的领域中,最新发现
的主要获取途径是万维网,接着是传统的预印本及会议,最后才是评审过的存档论文。”
注 41: Atkins et al., Report of the National Science Foundation Blue-Ribbon Advisory Panel on
Cyberinfrastructure, 2003, http://www.nsf.gov/od/oci/reports/toc.jsp.
Furthermore, the value of quality assurance is in decline as growing quantities of underlying primary
data, as well as other materials, no longer form part of the publication as printed, and are assessed
neither during the review process nor otherwise. Recent major scientific scandals are largely connected
with invented or falsified data, or with erroneous evaluation and summarisation.
而且,由于基础原始数据及其它资料的数量日益增长,不再以印本出版物出现,也未经评审或
其它评估程序,导致质量保证的价值下滑。最近发生的重大科学丑闻,基本上是创造或伪造数
据,或者错误的评价和总结。
Modern demands and possibilities
现代需求与可行性
From a relaxed point of view, one could say that the age of Internet-based communication is only just
beginning — and this certainly goes for academe, too. However, access to academic journals is
currently almost exclusively via the Internet. The reason for this rapid development must be the greater
efficiency of net-based access, which in turn has many reasons.
从轻松的角度来看,可以说,因特网为基础的交流时代还在崛起中──这自然渗入学术领域。然
而,目前获取学术期刊几乎只经由因特网。如此快速发展的理由,一定是因为基于网络的获取
更有效率,这反过来,也有很多原因。
The increase in efficiency is also absolutely necessary, since the proportion of the population engaged
in research or using research results is increasing. If the efficiency of reception were not rising, the
proportion of what any one individual could take in would constantly be on the decline, as would the
use of any individual research result.
提高效率也是绝对必要的,因为从事研究或运用研究成果的人口比例在上升。如果接受成果的
效率未提升,每个人能够接受的成果的比例就会不断下滑,每项研究成果的利用也同样如此。
A further reason why there is a need for a clear increase in efficiency lies in the shape of certain areas
of research. There are those which are particularly costly, and thus require as complete an exploitation
of the results as possible. Others are of highest relevance to society and at the same time of great
interdisciplinary complexity. They require the correlation and utilisation of many results from many
different disciplines. We might cite as examples such ‘modern’ research topics as Risk Habitat
Megacity(42) and of course Global Change. Relevant disciplines range from the further development of
climate models, via examinations of traffic flow from an engineering point of view, to sociological
insights into the change in the lives of Arctic peoples.
在特定研究领域里,对提高效率的需求,特别殷切。它们的研究非常昂贵,因之需要尽可能完
整地开发研究成果。还有一些领域,与社会高度相关,而且具有复杂的跨学科性,需要关联并
利用不同学科的众多研究成果。我们可引“现代”的研究课题为例,如“居住在大城市的风
险”(注 42),以及“全球变迁”,其相关学科范围,从气候模式的进一步发展,从工程学观点检
验交通流量,到由社会学观察极地居民的生活改变。
注 42: Strategies for Sustainable Development in Megacities and Urban Agglomerations,
http://www.risk-habitat-megacity.ufz.de/.
The 50 000 participants in (and doubtless also the recipients of the insights obtained from) the
International Polar Year 2007–2008 (www.ipy.org) — a programme that represents only part of
research into global change — come from 60 nations. Its persistent results — including a ‘data
snapshot’ of the Polar Regions, which is perhaps more important than journal articles — will form a
basis on which global change will be tracked in the coming decades. For this reason, the programme
has adopted a policy which includes an obligation to make resulting data available rapidly and freely.
Both the implementation of this one coordinated research programme of 170 formally independent
project clusters and the utilisation of results demand an extremely rapid, effective and efficient
communication system. No such system yet exists in institutionalised form, but one is planned so that
the data sets needed to implement the project can be exchanged in real time.
来自 60 个国家的 50,000 人参加(无疑也受益)的 2007-2008 国际极地年(www.ipy.org),只
是全球变迁研究的项目之一,它的持续性结果──包括极地的“数据快照”,或许比期刊论文更重
要──将成为未来数十年追踪全球变迁的基础。出于这个原因,该项目采用一项政策,包括将结
果数据尽快且自由公告的义务。实施这样一个由 170 个正式的独立项目群组成的研究项目,并
利用其成果,需要一个快速、有效果且有效率的交流机制。目前还没有制度化的形式,但正计
划这样的机制,以满足实时交换数据集的需求。
Approaches to a solution
解决方案
In the global knowledge society, the extensive knowledge present in people’s heads, the information
that has been written down, and the data obtained at huge expense can only be really comprehensively
and effectively accessed and utilized if they can be linked in every possible way, including ways as yet
unknown. To the degree necessary, this is clearly beyond human capability. Therefore, machine
processes — from the (full text) search engine to techniques of text and data mining — need to be
employed. Today, however, only a financial giant like Google would be in any position to purchase
access to the entire material of major publishers if it wanted to. Accordingly, in today’s STM (science,
technology, medicine) fields, for example, only one (commercial) entity is in a position to make
available a relevant part of all electronically available scientific texts of a certain quality standard and
to network these via their citation structures: http://scientific.thomson.com/products/wos.
在全球知识社会中,存在于人们头脑中的大量知识、被记载下来的信息、以及耗费巨大取得的
数据,必须以任何可能的、包括仍未知的方式连结起来,才能被真正全面有效地获取与利用。
这显然超出人的能力,因此需要用机器处理,从(全文)搜索引擎到文本和数据挖掘技术。然
而,今天只有像谷歌这样的金融巨子,才有能力购置主要出版社全部出版物的获取权──如果它
需要的话。同样地,在今天的科技医学领域里,只有一个(商业)实体能够把所有相当质量水
准的、电子形式的科学文本,以引文结构方式、通过网络连结相关部分:
http://scientific.thomson.com/products/wos
[http://www.thomsonreuters.com/products_services/scientific/Web_of_Science]
By contrast, what can already be done for openly accessible material at evidently little effort is made
clear by limited, simple services such as Citeseer (http://citeseer.ist.psu.edu) or Scientific Commons
(http://www.scientificcommons.org). Alongside obvious search functionality, both services also
contain text-mining approaches in order to identify networks of persons (authors and co-authors),
schools or communities. Such navigation aids for the sea of information would be highly useful only in
complex contexts such as global change.
相形之下,对于可开放获取的资料,显然努力不够,Citeseer(http://citeseer.ist.psu.edu)或“科
学共享”(http://www.scientificcommons.org)之类服务就显得相当单薄。除了明显的搜寻功能,
这两项服务也包含文字挖掘方法,用以辨识人际网络(作者和合作者)、学校或社区。这种信
息之海的领航援助,只有在全球变迁这种复杂议题下才会非常有用。
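The kind of author-network identification mentioned above can be illustrated with a minimal sketch. It assumes openly accessible metadata records with author lists. The records, names and function names below are hypothetical and are not taken from Citeseer or Scientific Commons, whose actual methods are not described here. Authors are linked when they share a paper, and each connected group of authors is treated as a rough "community".

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical metadata records; a real service would harvest these
# from openly accessible repositories.
papers = [
    {"title": "Sea ice trends", "authors": ["A. Meyer", "B. Chen"]},
    {"title": "Arctic traffic flows", "authors": ["B. Chen", "C. Okafor"]},
    {"title": "Urban risk models", "authors": ["D. Silva", "E. Novak"]},
]


def coauthor_graph(records):
    """Build an undirected graph linking authors who share a paper."""
    graph = defaultdict(set)
    for record in records:
        authors = record["authors"]
        for name in authors:
            graph[name]  # make sure single-author papers still create a node
        for a, b in combinations(authors, 2):
            graph[a].add(b)
            graph[b].add(a)
    return dict(graph)


def communities(graph):
    """Return the connected components of the graph (depth-first search)."""
    seen = set()
    groups = []
    for start in graph:
        if start in seen:
            continue
        stack, group = [start], set()
        while stack:
            node = stack.pop()
            if node in seen:
                continue
            seen.add(node)
            group.add(node)
            stack.extend(graph[node] - seen)
        groups.append(group)
    return groups


groups = communities(coauthor_graph(papers))
for group in groups:
    print(sorted(group))
```

Connected components are the simplest possible grouping. A production service would more plausibly weight links by the number of shared papers and resolve ambiguous author names first, both of which this sketch leaves out. The point is only that such analysis needs nothing more than open access to the records.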
An example of the evaluation of texts on biochemical compounds(43) shows that only their machine
analysis allows researchers to recognise far-reaching correlations and to draw conclusions when the
underlying foundations of these conclusions are spread over hundreds of publications.
评估生物化学化合物的文本(注 43)时发现,只有借助机器分析,研究人员才能认识到影响深
远的关联,并得出结论,其结论的基础遍布数以百计的出版物中。
注 43: Hofmann-Apitius, M., 'Paradigm Changes Affecting the Practice of Scientific Communication in
the Life Sciences' = [影响生命科学中科学交流实践的范式变迁], in: Scientific Publishing in the
European Research Area, Brussels, 15 February 2007. = [欧洲研究领域的科学出版]
http://ec.europa.eu/research/science-society/document_library/pdf_06/hofmann-022007_en.pdf.
That even the refereeing of individual articles will not be able to manage without such techniques was
pointed out by the journal Nature in a Special Report: ‘As information technology becomes more
sophisticated, I think you are going to see more journals adding new tools to their screening
processes.’(44) The report also explains why access to underlying data, presumed to be stored on CDs
in cardboard boxes, is not a possible procedure in the refereeing context.
《自然》的一份特别报告指出,即使对论文逐一评审,也不能在没有此等技术的情况下完
成:“随着信息技术变得更加成熟,更多的期刊会使用新的工具筛选论文。”(注 44)该报告还
解释为何评审程序中,不可能包括获取存储在未公开光盘中的基础数据。
注 44: Marris, E., ‘Should journals police scientific fraud?’ = [期刊应当监管科学欺诈吗?], in:
Nature. 439 (2006), 520–521, doi:10.1038/439520a.
The need for access to full text and to underlying data becomes particularly clear if one considers that
even the most valuable datasets are not adequately retrievable and usable if the texts which describe
them or are otherwise associated with them are not available for automatic analysis services. Pure data-
set catalogues, not connected with publications and the other contexts in which the authors of the data
work, cannot in the long term do justice to the data(45) any more than just the abstracts of articles in
journals can.
如果描述数据集的文本或其他相关信息没有纳入自动分析服务,即使最有价值的数据集也不能
被充分检索和使用,因此获取全文及基础数据的需求变得尤为明显。长远来看,单纯的数据
集目录,如果没有与作者的作品及其他内容连结,便不能得到充分的利用(注 45),甚至比不
上期刊论文的摘要。
注 45: Pfeiffenberger, H., & Macario, A., 'Text, Data and People – How to Represent Earth System
Science' = [文本、数据和人:如何表达地球系统科学], CERN workshop on Innovations in
Scholarly Communication(OAI4), Geneva, 20 October 2005. = [欧洲粒子物理研究所学术交流创新
会议,日内瓦,2005 年 10 月 20 日] http://epic.awi.de/Publications/Pfe2005c.pdf.
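The argument that a pure data-set catalogue retrieves less than one linked to full texts can be made concrete with a small sketch. Everything in it is an assumption made for illustration: the identifiers, the sample text, and the `described_by` field are hypothetical and do not describe any existing catalogue. A simple inverted index is built twice, once from data-set titles alone and once also including the text of the publications that describe each data set.

```python
import re
from collections import defaultdict

# Hypothetical catalogue entries and publication texts.
datasets = {
    "ds-001": {"title": "CTD profiles 2007", "described_by": ["pub-A"]},
    "ds-002": {"title": "Station log", "described_by": []},
}
publications = {
    "pub-A": "Warming of Atlantic water in the Arctic Ocean "
             "observed during the polar year",
}


def tokens(text):
    """Lower-case word tokens of a text."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def build_index(datasets, publications, use_publications=True):
    """Map each word to the data sets whose title (and, optionally,
    describing publications) contain it."""
    index = defaultdict(set)
    for ds_id, entry in datasets.items():
        words = tokens(entry["title"])
        if use_publications:
            for pub_id in entry["described_by"]:
                words |= tokens(publications[pub_id])
        for word in words:
            index[word].add(ds_id)
    return index


def search(index, query):
    """Return the data sets matching every word of the query."""
    terms = tokens(query)
    if not terms:
        return set()
    return set.intersection(*(index.get(term, set()) for term in terms))


catalogue_only = build_index(datasets, publications, use_publications=False)
linked = build_index(datasets, publications, use_publications=True)

print(search(catalogue_only, "arctic warming"))
print(search(linked, "arctic warming"))
```

With the title-only index the query finds nothing, because the subject of the measurements appears only in the associated article. Once the article text is indexed alongside the catalogue entry, the same query returns the data set.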
Conclusions
结论
These considerations lead to the expectation that the future of academic communication will be marked
by a wide variety of Internet-based services, which will efficiently make available and effectively
present freely accessible articles, data and other materials in a variety of ways.
基于上述考虑,未来的学术交流,将以基于因特网的服务为特征,以多种方式有效率、有效果
地达到自由获取论文、数据及其它资料。
p. 62-65
Open Access: Opportunities and challenges. A handbook [开放获取 : 机会及挑战] / European
Commission/German Commission for UNESCO. -- Luxembourg: Office for Official Publications of
the European Communities, 2008. -- 144 pp., 14.8 x 21.0 cm. -- ISBN 978-92-79-06665-8. -- EUR
23459, http://tinyurl.com/3q8wo5