
Enabling Data-Intensive Science Through Data Infrastructures

These slides are from a talk given at LIBER's 42nd annual conference by Carlos Morais Pires of the European Commission.

In light of the current data deluge, and the European Commission's plans to harness it by implementing e-infrastructures for data-driven science under Horizon 2020, Pires called on libraries to engage with the data infrastructure and to bring their unique and now much-needed competencies to bear in giving meaning to data-driven science and spreading the word about it.

Published in: Education, Technology

1. enabling data-intensive science through data infrastructures. LIBER 42nd Annual Conference, München, 27 June 2013. Carlos Morais Pires, European Commission, e-Infrastructures, DG CNECT.C1. Author’s views do not commit the European Commission.
2. summary • engineers and librarians… all about communicating information • data as infrastructure: Europe is "Riding the Wave" • interoperable data infrastructure • balancing community-driven and service-driven initiatives • H2020 WP under construction (pending “trilogue” decisions) • times of change & “influence the future”
3. engineers… The number pi (symbol: π) /paɪ/ is a mathematical constant that is the ratio of a circle's circumference to its diameter, and is approximately equal to 3.14159 26535 89793 23846 26433 83279 50288 41971 69399 37510 58… (from http://en.wikipedia.org/) 22 divided by 7 = 3.1428571428571428571428571428571
4. it’s all about bits… http://en.wikipedia.org/wiki/Entropy_%28information_theory%29#Definition In information theory, entropy is a measure of the uncertainty in a random variable. In this context, the term usually refers to the Shannon entropy, which quantifies the expected value of the information contained in a message. Entropy is typically measured in bits. Shannon entropy is the average unpredictability in a random variable, which is equivalent to its information content. The concept was introduced by Claude E. Shannon in his 1948 paper "A Mathematical Theory of Communication"…
5. it’s all about communicating information…
6. Policy context: A Reinforced European Research Area Partnership for Excellence and Growth, COM(2012) 392, July 2012; Towards better access to scientific information: boosting the benefits of public investments in research, COM(2012) 401 final, July 2012; Commission Recommendation on access to and preservation of scientific information, C(2012) 4890 final, July 2012
7. data as infrastructure: Europe is Riding the Wave. The High Level Expert Group on Scientific Data presented Riding the Wave in October 2010. Vision: "data e-infrastructure that supports seamless access, use, re-use, and trust of data. In a sense, the physical and technical infrastructure becomes invisible and the data themselves become the infrastructure, a valuable asset on which science, technology, the economy and society can advance".
8. useful definitions. Data: digital recorded factual material commonly accepted in the scientific community as necessary to validate research findings (does not include lab notebooks, preliminary analysis, drafts of scientific papers, plans for future research, peer review reports, communication with peers, physical objects, lab specimens) [cf. White House Memo on "Increasing Access to the Results of Federally Funded Scientific Research"]. Data infrastructures: services, applications, tools, knowledge and policies for research data to be discoverable, understandable, accessible, preserved and curated… and available 24/7
9. implementing interoperable data infrastructure: (a) data generators: research projects, big research infrastructure, installations or medium size laboratories, simulation centres, surveys or individual researchers; (b) discipline-specific data service providers, providing data and workflows as a service; (c) providers of generic common data services (computing centres, libraries); (d) researchers as users, using the data for science and engineering. community-driven data infrastructure, including ESFRI, ESFRI clusters and others
10. data infrastructure: bridging islands (diagram labels: network infrastructure, GÉANT; distributed computing/software infrastructure; scientific data infrastructure; bridges)
11. consultation towards horizon2020
12. consultation towards horizon2020
13. about the public consultation. What: Relevance, Strengths, Weaknesses; propose additional areas of action. How many: 80+ replies from 100+ organisations. Who: Research organisations and associations, universities,… LERU, LIBER, CNRS, COAR, EIROforum, OpenAIRE, CERN, APA, Volker Mehrmann TU Berlin, European Bioinformatics Institute, Max Planck Society, Observatoire Astronomique de Strasbourg, Museum f. Naturkunde Berlin, Pensoft Publishers, University of Edinburgh, University of Göttingen, University of Florence, etc.
14. overall opinion
15. responses… involvement of all stakeholders across the fiches relevance of long tail, universities have important role preservation and access are related look at areas that are less developed in IT skills development workflows for interaction researcher/data centres match research and education
16. H2020 workprogramme being prepared; please note that things may change as a result of the “trilogue”. H2020 Research Infrastructure: ensure that Europe has world-class research infrastructures, including e-infrastructures, accessible to all researchers in Europe and beyond. It is a key area of the H2020 Excellence in Science priority. e-infrastructures will make every European researcher digital. 5 challenges: (1) High Performance Computing, (2) Connectivity, (3) Data, (4) e-Infrastructure Integration and (5) Policy and International.
17. H2020 workprogramme… (current version) • Community data services • Managing, preserving and computing with big research data • E-Infrastructure for Open Access • Towards global data e-Infrastructures (support RDA) • e-Infrastructures for virtual research environments (VRE) • Integration of Core and Basic Operations Services for e-Infrastructures • Skills and professions for e-infrastructures • Centres of Excellence for computing applications • PRACE • Network of Competence Centres for SMEs • GÉANT. These lines are related to the content of the Framework for Action
18. Research Data Alliance: Common Infrastructure, Policy and Practice Drives Data Sharing and Exchange throughout the Data Life Cycle. http://rd-alliance.org From Prof. Fran Berman and Prof. John Wood, Members of the RDA Council
19. the conference to shape the future… Re-inventing the Library for the Future; Libraries and why we need them; The revolution in Open Science; Preparedness for digital preservation; New horizons for OA policies in Europe; Ten recommendations on Data Management; What challenges for libraries to adapt to the new Era; The future of Science Publishing Ego-System; Converging parallel universes; Developing Data Informatics Capability in Libraries; Etc.
20. “The Times They Are A-Changin…” Come gather 'round people / Wherever you roam / And admit that the waters / Around you have grown / And accept it that soon / You'll be drenched to the bone / If your time to you / Is worth savin' / Then you better start swimmin' / Or you'll sink like a stone / For the times they are a-changin'. Bob Dylan
21. back to engineering: looking at “change” and “amplitude modulation”
22. Thank You! Carlos Morais Pires, carlos.morais-pires[at]ec.europa.eu, @CarlosMPires
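
The "it's all about bits" slide describes Shannon entropy as the average unpredictability of a random variable, measured in bits. A minimal sketch of that idea in Python follows. It is not part of the talk: the function name `shannon_entropy` is hypothetical, and it assumes the symbol probabilities can be estimated from how often each character occurs in the message itself.

```python
import math
from collections import Counter


def shannon_entropy(message: str) -> float:
    """Average information per symbol, in bits.

    Assumption for illustration: each symbol's probability is estimated
    from its relative frequency in the message.
    """
    if not message:
        return 0.0
    counts = Counter(message)
    total = len(message)
    # H = sum over symbols of p * log2(1 / p), with p = count / total
    return sum((c / total) * math.log2(total / c) for c in counts.values())


if __name__ == "__main__":
    print(shannon_entropy("aaaa"))  # 0.0: one symbol, no uncertainty
    print(shannon_entropy("abab"))  # 1.0: two equally likely symbols
    print(shannon_entropy("abcd"))  # 2.0: four equally likely symbols
```

A message made of a single repeated symbol carries no uncertainty, while one drawn evenly from four symbols needs two bits per symbol.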