From Open Data to Open Science, by Geoffrey Boulton (LEARN Project)
1st LEARN Workshop. Embedding Research Data as part of the research cycle. 29 Jan 2016. Presentation by Geoffrey Boulton, University of Edinburgh & CODATA
The Challenges of Making Data Travel, by Sabina Leonelli (LEARN Project)
1st LEARN Workshop. Embedding Research Data as part of the research cycle. 29 Jan 2016. Presentation by Sabina Leonelli, Exeter Centre for the Study of Life Sciences (Egenis) & Department of Sociology, Philosophy and Anthropology, University of Exeter
Keynote talk to LEARN (LERU/H2020 project) on research data management. Emphasizes that the problems are cultural, not technical. Promotes modern approaches such as Git and continuous integration, and announces DAT. Asserts that the Right to Read is the Right to Mine. Calls for widespread development of content mining (TDM)
Data management: The new frontier for libraries (LEARN Project)
Presentation at 3rd LEARN workshop on Research Data Management, “Make research data management policies work”, by Kathleen Shearer, COAR, CARL/ABCR, RDC/DCR, ARL, SSHRC/CSRH.
Open Data in a Big Data World: easy to say, but hard to do? (LEARN Project)
Presentation at 3rd LEARN workshop on Research Data Management, “Make research data management policies work”
Helsinki, 28 June 2016, by Sarah Callaghan, STFC Rutherford Appleton Laboratory
Research data management: a tale of two paradigms: Martin Donnelly
Presentation I was supposed to give at "Scotland’s Collections and the Digital Humanities" workshop in Edinburgh on May 2nd 2014. Illness prevented it, but my heroic DCC colleague Jonathan Rans stepped up and delivered the presentation on my behalf.
How can we ensure research data is re-usable? The role of Publishers in Resea... (LEARN Project)
How can we ensure research data is re-usable? The role of Publishers in Research Data Management, by Catriona MacCallum. 2nd LEARN Workshop, Vienna, 6th April 2016
Meeting the Research Data Management Challenge - Rachel Bruce, Kevin Ashley, ... (Jisc)
Universities and researchers need to be able to manage research data effectively to fulfil research funders' requirements and ultimately to contribute to research excellence. UK universities are comparatively well advanced in what is a global challenge, but nonetheless further advances are needed in university policy, technical and support services. This session will share best practice in research data management and information about key tools that can help to develop university solutions; it will also inform participants about the latest Jisc initiatives to help build university research data services and shared services.
Liberating facts from the scientific literature - Jisc Digifest 2016 (Jisc)
Text and data mining (TDM) techniques can be applied to a wide range of materials, from published research papers, books and theses, to cultural heritage materials, digitised collections, administrative and management reports and documentation, etc. Use cases include academic research, resource discovery and business intelligence.
This workshop will show the value and benefits of TDM techniques and demonstrate how ContentMine aims to liberate 100,000,000 facts from the scientific literature; ContentMine will provide a hands-on demo on a topical and accessible scientific/medical subject.
B2: Open Up: Open Data in the Public Sector (Marieke Guy)
Parallel session [B2: Open Up: Open Data in the Public Sector] run at the Institutional Web Management Workshop 2013 (IWMW 2013) event, University of Bath on 26 - 28th June 2013.
The fourth paradigm: data intensive scientific discovery - Jisc Digifest 2016 (Jisc)
There is broad recognition within the scientific community that the emerging data deluge will fundamentally alter disciplines in areas throughout academic research. A wide variety of researchers - from scientists and engineers to social scientists and humanities researchers - will require tools, technologies, and platforms that seamlessly integrate into standard scientific methodologies and processes.
'The fourth paradigm' refers to the data management techniques and the computational systems needed to manipulate, visualize, and manage large amounts of research data. This talk will illustrate the challenges researchers will face, the opportunities these changes will afford, and the resulting implications for data-intensive researchers.
In addition, the talk will review the global movement towards open access, research repositories and open science and the importance of curation of digital data. The talk concludes with some comments on the research requirements for campus e-infrastructure and the end-to-end performance of the network.
Big Data for the Social Sciences - David De Roure - Jisc Digital Festival 2014 (Jisc)
The analysis of government data, data held by business, web data and social science survey data will support new research directions and findings. Big Data is one of David Willetts' 8 great technologies, and in order to secure the UK's competitive advantage new investments have been made by the Economic and Social Research Council (ESRC) in Big Data, for example the Business Datasafe and Understanding Populations investments. In this session the benefits of the use of Big Data in social science, and the ESRC's Big Data strategy, will be explained by Professor David De Roure of the Oxford e-Research Centre, advisor to the ESRC.
Think Big about Data: Archaeology and the Big Data Challenge (ariadnenetwork)
Presentation by Gabriele Gattiglia, University of Pisa – MAPPA Lab
EAA 2014 session: Open Access and Open Data in Archaeology
Istanbul, Turkey
13 September 2014
The Needs of stakeholders in the RDM process - the role of LEARN (LEARN Project)
Presentation at 3rd LEARN workshop on Research Data Management, “Make research data management policies work”
Helsinki, 28 June 2016, by Martin Moyle/Paul Ayris, UCL Library Services
In scientific communication we observe a complex interaction of several stakeholder groups, each of which has distinct interests, strategies and approaches to Open Access and Open Data. The German government initiated a “Commission for the Future of the Information Infrastructure” (KII), in which most of the stakeholders are working together to design a future scenario for the supply of scientific information. The KII's evaluation and recommendations for Open Access and for research data will carry particular weight and will significantly influence Open Access and Open Data developments in Germany.
I will outline the current situation in Germany – the players and their interactions in terms of Open Access and Open Data – and present two initiatives and their work in detail. One of them, the KII process, will show the official side of the story; the other will show the grassroots side.
Library catalogues are the key tool for finding library resources from libraries around the entire world. Everyone can browse almost all the world's literature through WorldCat on the Internet.
This is a citizen science overview particularly aimed at graduate students enrolled in a new course at Arizona State University, aptly titled "Citizen Science." The author of this presentation, and course instructor, Darlene Cavalier, will talk students through its nuances and intersections with science, technology, and society.
The ContentMine system (open source) can search Europe PMC and download hundreds of articles in seconds. These can be indexed with AMI dictionaries, allowing rapid evaluation and refinement of the search
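A minimal sketch of this kind of pipeline, assuming the public Europe PMC REST search endpoint; the real ContentMine toolchain (`getpapers`, `ami`) is far richer, and the dictionary lookup below is only a toy stand-in for AMI dictionaries:

```python
import urllib.parse

# Europe PMC's public REST search endpoint (see europepmc.org for full docs).
EUROPE_PMC = "https://www.ebi.ac.uk/europepmc/webservices/rest/search"

def build_search_url(query, page_size=100, fmt="json"):
    """Build a Europe PMC search URL restricted to open-access articles."""
    params = {
        "query": f"{query} AND OPEN_ACCESS:y",  # only mineable OA content
        "pageSize": page_size,
        "format": fmt,
    }
    return EUROPE_PMC + "?" + urllib.parse.urlencode(params)

def index_with_dictionary(text, dictionary):
    """Tag a text against a flat term list (a toy 'AMI dictionary' lookup)."""
    return [term for term in dictionary if term.lower() in text.lower()]

# Usage: build the query, then fetch it with any HTTP client and index
# each downloaded article's text against a topic dictionary.
url = build_search_url("zika virus")
hits = index_with_dictionary("Aedes aegypti transmits Zika virus.",
                             ["Zika virus", "Aedes aegypti", "dengue"])
```

The point of the two-step design is the one the description makes: retrieval is fast and coarse, while dictionary indexing gives a quick relevance signal used to refine the next query.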
Keynote talk at the Web Science Summer School, Singapore, 8 December 2014. Today we see the rise of Social Machines, like Twitter, Wikipedia and Galaxy Zoo—where communities identify and solve their own problems, harnessing commitment, local knowledge and embedded skills, without having to rely on experts or governments.
The Social Machines paradigm provides a lens onto the interacting sociotechnical systems of our hybrid digital-physical world, citizen-centric and at scale—emphasising empowerment and sociality in a world of pervasive technology adoption and automation.
This talk will present the Social Machines paradigm as an approach to social media analytics and a rethinking of our scholarly practices and knowledge infrastructure.
Science, Technology and Society: the information age. An overview of how technology has evolved, how science has driven both technology and society, how science and technology make everyday life easier, and their impact on today's world.
Published on Jan 29, 2016 by PMR
Keynote talk to LEARN (LERU/H2020 project) on research data management. Emphasizes that the problems are cultural, not technical. Promotes modern approaches such as Git and continuous integration, and announces DAT. Asserts that the Right to Read is the Right to Mine. Calls for widespread development of content mining (TDM)
The Culture of Research Data, by Peter Murray-Rust (LEARN Project)
1st LEARN Workshop. Embedding Research Data as part of the research cycle. 29 Jan 2016. Presentation by Peter Murray-Rust, ContentMine.org and University of Cambridge
Beyond Preservation: Situating Archaeological Data in Professional Practice (Eric Kansa)
I presented this lecture at the German Archaeological Institute (DAI) in Berlin on Nov. 6, 2014 (see: http://www.dainst.org/termin/-/event-display/ogNX4Gtxkd87/342513)
The lecture focuses on how archaeological data fits in professional practice. It looks at scholarly communications, government policies toward the sciences and humanities, and professional reward structures.
The lecture then shows examples of how Open Context publishes archaeological data, including editorial processes to promote data quality and relate contributed data to the 'Web of Data' using Linked Open Data methods. Research applications of Open Context and linked archaeological data include the Digital Index of North American Archaeology (DINAA) project (see: http://ux.opencontext.org/blog/archaeology-site-data/) and a data integration study exploring the development and dispersal of animal husbandry economies in Epipaleolithic - Chalcolithic Anatolia (see: http://dx.doi.org/10.1371/journal.pone.0099845)
The lecture concludes with how archaeologists need to invest more intellectually in the method and theory of modeling and creating data. It also looks at how concepts and expectations of publishing static artifacts need to be revised (using techniques like version control) to enable continued and more transparent revision of data to fix problems, implement new standards, and meet new research goals.
Strategic scenarios in digital content and digital business (Marco Brambilla)
This lesson was given in May 2009 at MIP, Politecnico di Milano. The audience included members of the Acer academy program.
Rights on reused content are maintained by respective owners.
See further information on my activity at:
http://home.dei.polimi.it/mbrambil/
and:
http://twitter.com/marcobrambi
Module 1 - Case: Information Networking as Technology Tools, Uses, ... .docx (bunnyfinney)
Module 1 - Case
Information Networking as Technology: Tools, Uses, and Socio-Technical Interactions
Assignment Overview
Information overload! The phrase alone is enough to strike terror into the hardiest of managers; it presages the breakdown of society as we know it and the failure of management to cope with change. The media constantly dissect the forthcoming collapse brought on by TMI ("too much information"), even as they themselves pile up larger and larger dossiers on the subject, and we are frequently informed that it is our own damn fault that we are drowning in data, since we simply can't discriminate between the important stuff and everything else. Hence, the info-tsunami warning signs posted all along what we once so naively called the "information superhighway.”
Of course, this is arrant nonsense—human beings have been suffering from information overload in varying forms since about the time we hit the ground and found ourselves simultaneously running after the antelope and away from the lion. There's no question that the human mind has a limited capacity to process information, but after several million years we've gotten pretty good at figuring out how to handle a lot. The two basic tricks turn out to be distinguishing between short-term and long-term information storage, and "chunking"—putting things in a limited number of baskets. This isn't primarily a course in the psychology of memory—it's about information tools and systems—but in fact the things that make our information tools and systems work are the same things that have kept us near the antelopes and away from the lions (mostly) for the last million years or so. So we're beginning this course by thinking about information tools, what makes them like and unlike other kinds of tools, how the concept of a socio-technical system (in which social and behavioral functions shape results as much as does the technology itself) helps make sense of what we're facing, and why the technology just might win after all.
Let's start with a little historical review. Ann Blair has recently done a very intriguing summary of just why information overload isn't something that we, or still less our kids, dreamed up—people have been drowning in data for ages regardless of the tools at their disposal:
Blair, A. (2010) Information Overload, Then and Now. The Chronicle of Higher Education Review. November 28. Retrieved November 15, 2010 from
http://chronicle.com/article/Information-Overload-Then-and/125479/?sid=cr&utm_source=cr&utm_medium=en
We thought we had it all nailed down when the information theorists came up with their typology distinguishing between "data" (raw stuff), "information" (cooked stuff), and "knowledge" (cooked stuff that we've eaten). This rather elegant approach did have the virtue of emphasizing that information processing is a human task, even though we might delegate part of it to machinery, and that the tests of that task are the results for humans. It helps retur.
Lev Manovich.
How and why study big cultural data.
Presentation at Data Mining and Visualization for the Humanities symposium, NYU, March 19, 2012.
softwarestudies.com
Presented by Ms Diane Quarless, Director, ECLAC subregional headquarters for the Caribbean, at the LEARN Caribbean Research Data Workshop. http://learn-rdm.eu/en/workshops/eclac-mini-workshops/3rd-mini-workshop
Presented by Ms Bernadette Lewis, Secretary General, Caribbean Telecommunications Union at the LEARN Caribbean Research Data Workshop. http://learn-rdm.eu/en/workshops/eclac-mini-workshops/3rd-mini-workshop
Gestión de datos para la investigación: el caso peruano, by Edward Mezones, Su... (LEARN Project)
Gestión de datos para la investigación: el caso peruano (Research data management: the Peruvian case) by Edward Mezones, Superintendencia Nacional de Salud (Perú) - presented at the 4th LEARN RDM Workshop in Santiago, Chile: http://learn-rdm.eu/
TALLER LEARN SOBRE DATOS DE INVESTIGACIÓN: IMPLEMENTACIÓN DE POLÍTICAS Y ESTRA... (LEARN Project)
TALLER LEARN SOBRE DATOS DE INVESTIGACIÓN: IMPLEMENTACIÓN DE POLÍTICAS Y ESTRATEGIAS EN AMÉRICA LATINA Y EL CARIBE (LEARN Workshop on Research Data: Implementation of Policies and Strategies in Latin America and the Caribbean) by Miguel Ángel Márdero Arellano, IBICT (Brazil) - presented at the 4th LEARN RDM Workshop in Santiago, Chile: http://learn-rdm.eu/
Techniques to optimize the PageRank algorithm usually fall into two categories: reducing the work per iteration, and reducing the number of iterations. These goals are often at odds with one another. Skipping computation on vertices which have already converged can save iteration time. Skipping in-identical vertices (those with the same in-links) reduces duplicate computation and can also cut iteration time. Road networks often have chains which can be short-circuited before the PageRank computation to improve performance, since the final ranks of chain nodes are easy to calculate; this can reduce both the iteration time and the number of iterations. If a graph has no dangling nodes, the PageRank of each strongly connected component can be computed in topological order, which can reduce the iteration time and the number of iterations, and also enables multi-iteration concurrency in the PageRank computation. The combination of all of the above methods is the STICD algorithm [sticd]. For dynamic graphs, unchanged components whose ranks are unaffected can be skipped altogether.
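One of these optimisations, skipping vertices whose ranks have already converged, can be sketched as follows. This is a toy Jacobi-style implementation on an adjacency-list dict, not the STICD algorithm itself, which combines this with the other techniques above:

```python
def pagerank(graph, damping=0.85, tol=1e-10, max_iter=100):
    """PageRank with per-vertex convergence skipping.

    graph: dict mapping each node to its list of out-neighbours.
    A vertex whose rank changed by less than tol is frozen and
    skipped in later iterations, saving per-iteration work.
    """
    nodes = list(graph)
    n = len(nodes)
    rank = {v: 1.0 / n for v in nodes}
    converged = {v: False for v in nodes}
    # Precompute reverse adjacency and out-degrees once.
    in_links = {v: [] for v in nodes}
    out_deg = {v: len(graph[v]) for v in nodes}
    for u in nodes:
        for v in graph[u]:
            in_links[v].append(u)
    for _ in range(max_iter):
        done = True
        new_rank = dict(rank)
        for v in nodes:
            if converged[v]:
                continue  # skip vertices whose rank has stabilised
            r = (1 - damping) / n + damping * sum(
                rank[u] / out_deg[u] for u in in_links[v])
            if abs(r - rank[v]) < tol:
                converged[v] = True
            else:
                done = False
            new_rank[v] = r
        rank = new_rank
        if done:
            break
    return rank
```

Note the trade-off the paragraph describes: freezing a vertex saves work but tolerates a small residual error (bounded by roughly tol/(1-damping)), since its in-neighbours may still drift slightly after it stops updating.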
Chatty Kathy - UNC Bootcamp Final Project Presentation - Final Version - 5.23... (John Andrews)
SlideShare Description for "Chatty Kathy - UNC Bootcamp Final Project Presentation"
Title: Chatty Kathy: Enhancing Physical Activity Among Older Adults
Description:
Discover how Chatty Kathy, an innovative project developed at the UNC Bootcamp, aims to tackle the challenge of low physical activity among older adults. Our AI-driven solution uses peer interaction to boost and sustain exercise levels, significantly improving health outcomes. This presentation covers our problem statement, the rationale behind Chatty Kathy, synthetic data and persona creation, model performance metrics, a visual demonstration of the project, and potential future developments. Join us for an insightful Q&A session to explore the potential of this groundbreaking project.
Project Team: Jay Requarth, Jana Avery, John Andrews, Dr. Dick Davis II, Nee Buntoum, Nam Yeongjin & Mat Nicholas
Opendatabay - Open Data Marketplace.pptx (Opendatabay)
Opendatabay.com unlocks the power of data for everyone. Open Data Marketplace fosters a collaborative hub for data enthusiasts to explore, share, and contribute to a vast collection of datasets.
First ever open hub for data enthusiasts to collaborate and innovate. A platform to explore, share, and contribute to a vast collection of datasets. Through robust quality control and innovative technologies like blockchain verification, opendatabay ensures the authenticity and reliability of datasets, empowering users to make data-driven decisions with confidence. Leverage cutting-edge AI technologies to enhance the data exploration, analysis, and discovery experience.
From intelligent search and recommendations to automated data productisation and quotation, Opendatabay's AI-driven features streamline the data workflow. Finding the data you need shouldn't be complex: Opendatabay simplifies the data acquisition process with an intuitive interface and robust search tools. Effortlessly explore, discover, and access the data you need, allowing you to focus on extracting valuable insights. Opendatabay also breaks new ground with dedicated, AI-generated synthetic datasets.
Leverage these privacy-preserving datasets for training and testing AI models without compromising sensitive information. Opendatabay prioritizes transparency by providing detailed metadata, provenance information, and usage guidelines for each dataset, ensuring users have a comprehensive understanding of the data they're working with. By leveraging a powerful combination of distributed ledger technology and rigorous third-party audits Opendatabay ensures the authenticity and reliability of every dataset. Security is at the core of Opendatabay. Marketplace implements stringent security measures, including encryption, access controls, and regular vulnerability assessments, to safeguard your data and protect your privacy.
Tapal brand analysis PPT slide for competitive data
Data, Science, Society - Claudio Gutierrez, University of Chile
1. Data, Science, Society
LEARN Final Conference, CEPAL, London, May 5th, 2017
Claudio Gutiérrez • DCC, Universidad de Chile / CIWS •
cgutierr@dcc.uchile.cl
2. The foundations of experience (since we absolutely must get
down to this) have been non-existent or very weak; nor has a
collection or store of particulars yet been sought or made, able
or in any way adequate, either in number, kind or certainty, to
inform the intellect. [...] Natural history contains nothing that
has been researched in the proper ways, nothing verified,
nothing counted, nothing weighed, nothing measured.
FRANCIS BACON, APHORISMS, XCVIII
3. A tentative agenda
I. Torrents of Data
II. The notion of Data
III. Research and Scientific Data
IV. Data and Society
V. Concluding Remarks
5. There are already too many books. Even when we drastically
reduce the number of subjects to which man must direct his
attention, the quantity of books that he must absorb is so
enormous that it exceeds the limits of his time and his capacity
of assimilation. [...] Here then is the drama: the book is
indispensable at this stage in history, but the book is in danger
because it has become a danger for man.
JOSÉ ORTEGA Y GASSET, THE MISSION OF THE LIBRARIAN, 1935.
6. TWO DIMENSIONS OF THE PROBLEM:
QUANTITY (Ortega’s problem): too many objects, beyond our time limits and human capacity of assimilation.
QUALITY (New problem): the object itself is beyond our
intelligibility. Huge sizes and no explicit semantics.
The essence: beyond human scale
8. human scale
Byte (B) ∼ 10^0: a character
Kilo (KB) ∼ 10^3: written text
Mega (MB) ∼ 10^6: image, music
Giga (GB) ∼ 10^9: movies
beyond human scale
Tera (TB) ∼ 10^12: US Congress Library
Peta (PB) ∼ 10^15: large data center
Exa (EB) ∼ 10^18: all words ever spoken
Zetta (ZB) ∼ 10^21: amount of global data
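The ladder above can be expressed as a small lookup table, useful for classifying a byte count against the slide's examples. A minimal sketch; the thresholds are decimal powers of ten and the labels are taken from the slide:

```python
# Scale ladder from the slide: (name, threshold in bytes, example).
SCALES = [
    ("Byte", 1, "a character"),
    ("Kilo", 10**3, "written text"),
    ("Mega", 10**6, "image, music"),
    ("Giga", 10**9, "movies"),
    ("Tera", 10**12, "US Congress Library"),
    ("Peta", 10**15, "large data center"),
    ("Exa", 10**18, "all words ever spoken"),
    ("Zetta", 10**21, "amount of global data"),
]

def classify(n_bytes):
    """Return (name, example) of the largest scale not exceeding n_bytes."""
    name, _, example = max((s for s in SCALES if s[1] <= n_bytes),
                           key=lambda s: s[1])
    return name, example

# e.g. classify(5 * 10**12) lands in the Tera range, past human scale.
```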
9. + Data science portals
+ Data portals of organizations
+ Online libraries
+ APIs and services for data
+ Online datasets and journals
+ Visualization and processing tools
+ Legal and regulatory frameworks
+ Open Data initiatives
+ · · ·
————————————–
. . . how to organize them?
10. PARAPHRASING A CLASSICAL THESIS ABOUT SOCIAL CHANGE:
At a certain stage of development, the material forces of society
began producing more symbolic material than the existing social
relations can digest. From forms of development of the culture,
these relations turn into their fetters. Then begins an era of
information upheaval.
11. SUMMARY AND WORKING HYPOTHESIS:
The symbolic world is growing so fast and vast that it escapes
our “natural” human capacities to handle it. We feel that an
obscure and daunting, fundamentally unintelligible, (parallel)
world is growing in front of our eyes.
The formerly vast and volatile symbolic world is being
materialized in digital data (the virtual world), thus making
obsolete the conceptual models used to deal with it.
Moral: Need to understand what is “data”!
13. NECESSARY CLARIFICATION
Data ≠ information. Data ≠ knowledge.
The traditional view:
knowledge = information + meta-information
information = data + metadata
data = ?
14. ——– I ——–
At the most basic and abstract level, data is a distinction, a
“fracture in the fabric of Being”. Data is the most basic layer in
the symbolic world. It has no meaning by itself, but it is the
source of meaning.
15. ——– II ——–
By data we will mean materialized (digitally recorded) data.
Despite its ontological status between the material and the
intangible, data is material. But it makes sense only in the
virtual world.
16. ——– III ——–
The distinctions that define data assume an implicit context.
This network of meanings is not stated explicitly, that is, not
specified in the data itself. This allows manifold interpretations
of the same data from different points of view, to further explore
new dimensions, etc.
17. ——– IV ——–
Data is the starting point for our discussion. Data is something
given, the basic elements of our field. From this point of view,
our concern at this stage is not the possible meanings of data,
but data as “material” elements.
18. DATA SCIENCE AS THE CHEMISTRY OF THE VIRTUAL WORLD
Virtual world : Data = Material world : Atoms
27. DIAGNOSIS FROM OECD (1996)
Knowledge, as embodied in human beings (as “human capital”)
and in technology, has always been central to economic
development. But only over the last few years has its relative
importance been recognised, just as that importance is
growing. The OECD economies are more strongly dependent
on the production, distribution and use of knowledge than ever
before.
28. A BASIC CHAIN OF DEDUCTIONS
Economy is strongly dependent on (scientific) knowledge.
Science today is heavily based on data.
—————————————————-
“Data has become the new oil.”
31. BLURRING BOUNDARIES II
EXTENSIONAL, static, data
(datasets, collection/networks of datasets)
versus
INTENSIONAL, dynamic, data
(Streaming, URI, API, etc.)
33. Some knowledge commons reside at the local level, others at the global level or somewhere in between.

Figure 1.1. Types of goods (source: adapted from V. Ostrom and E. Ostrom 1977):

                          SUBTRACTABILITY
EXCLUSION        Low                         High
Easy             Toll or club goods          Private goods
                 (journal subscriptions,     (personal computers,
                 day-care centers)           doughnuts)
Difficult        Public goods                Common-pool resources
                 (useful knowledge,          (libraries,
                 sunsets)                    irrigation systems)
34. DATA AS PUBLIC GOOD
A public good has two critical properties, non-rivalrous
consumption–the consumption of one individual does not
detract from that of another–and non-excludability–it is difficult
if not impossible to exclude an individual from enjoying the
good. [...] Knowledge is a global public good requiring public
support at the global level.
Joseph Stiglitz, 1998.
35. OECD VIEW OF OPEN ACCESS
Openness means access on equal terms for the international
research community at the lowest possible cost, preferably at
no more than the marginal cost of dissemination. Open access
to research data from public funding should be easy, timely,
user-friendly and preferably Internet-based.
OECD, 2007.
36. NSF’S PRINCIPLES
Agencies must adopt a presumption in favor of openness to the
extent permitted by law and subject to privacy, confidentiality,
security, or other valid restrictions.
Open data are publicly available data structured in a way to be
fully accessible and usable. This is important because data that
is open, available, and accessible will help spur innovation and
inform how agencies should evolve their programs to better
meet the public’s needs.
Open Data at NSF
37. OPEN DATA MOVEMENT
Open data is data that can be freely used, re-used and
redistributed by anyone –subject only, at most, to the
requirement to attribute and sharealike.
Open Data Handbook
40. LIMITATIONS OF OPEN ACCESS
• DUAL NATURE OF DATA: data is at once material and intangible.
• SCALE: Open access works well at human scale (this is the origin of the open movements and anti-closure movements). It needs second thoughts at big scale.
• CYCLE AND ECOSYSTEM: Data needs support in all parts of its cycle, and access is needed for all parts of the ecosystem of science.
42. ACCESS IS NOT ENOUGH: NEED TO “REFINE”
Nature Scientific Data Journal:
“Scientific Data is a peer-reviewed, open-access journal for
descriptions of scientifically valuable datasets, and research
that advances the sharing and reuse of scientific data.”
43. DATA ITSELF AS ECOSYSTEM
Main challenge is how we would like to manage and govern
this new good, including its whole cycle, that is, how it is
generated, accessed, stored, curated, processed and
delivered.
44. DATA AS COMMONS
The essential questions for any commons analysis are
inevitably about equity, efficiency and sustainability. Equity
refers to issues of just or equal appropriation from, and
contribution to, the maintenance of a resource. Efficiency deals
with optimal production, management and use of the resource.
Sustainability looks at the outcomes over the long term.
Ch. Hess, E. Ostrom, 2006.