The document discusses the development of data services to support eScience/eResearch. It provides an overview of eScience, including that it involves large-scale collaborative science enabled by the internet using digital data. Characteristics of eScience include being data-driven, distributed, collaborative, and trans-disciplinary. Libraries are important to eScience because it involves large data sets, collections, and repositories. The document also discusses how science paradigms have shifted to become more computational and data-focused.
This document provides an overview of a professional development day for librarians on scientific data management. The day includes presentations on e-science, cyberinfrastructure, and data; a case study on data management for gravitational wave research; and a group activity to develop data management initiatives. The presentations will cover characteristics of e-science such as large collaborative digital datasets, and implications for libraries, including initiatives to provide data support services and address challenges in data preservation, access, and the research data lifecycle.
This document discusses open educational resources (OERs) and digital systems and services to promote open access to education and learning. It defines OERs according to UNESCO as educational resources that are freely available online for use and adaptation for non-commercial purposes. The document also discusses that OERs have three core features - open access, permission for reuse and modification, and open licensing. Learning objects are also discussed as reusable digital resources that support learning. Examples of learning objects covering topics like AIDS, DNA, and anatomy of the ear are provided.
The document describes the Inspiring Science Education tools, which were developed to support teachers in authoring and delivering technology-enhanced science lessons that follow an inquiry cycle and assess students' problem solving competences. The tools include an authoring tool to design lessons incorporating assessment tasks aligned with the PISA problem solving framework, and a delivery tool to implement the lessons and collect student assessment data. The overall goal is to help teachers improve their lesson plans and enhance students' problem solving skills.
A content analysis of the emerging research on academic cyberloafing (Zizo Aku)
Despite the diverse opportunities digital technologies offer that enhance learning and improve instructional practice, the main challenge faced by many institutions is the distracting effects of hyper-connectivity caused by mobile devices during learning activities. Some students find it difficult to balance online leisure activity with school work because of the guilty pleasures associated with using certain types of media. The failure of college students to reduce distractions from academic cyberloafing could negatively impact their achievement of academic success. This scholarly paper is designed to explore how contemporary research has investigated this emerging phenomenon to better understand important strategies for control.
Kunal Punera is seeking a full-time position in research labs working on web/data mining, information retrieval, and machine learning. He has a Ph.D. in computer engineering from UT Austin with a focus on these areas. His research interests include web data analysis, data mining, machine learning, and information retrieval. He has published numerous papers in top conferences and journals and has worked with Yahoo! and IBM on related research projects.
The document provides the schedule for a machine learning conference. It includes the times for registration, invited talks on topics like machine learning in space and applying machine learning to real-world problems, contributed talks on research topics, coffee breaks, lunch, and a poster session. The day concludes with a panel discussion and concluding remarks.
This document is a curriculum vitae for Dr. B. Kalpana, a professor of computer science. It provides details about her education, teaching experience, areas of research interest including data mining and mobile computing, publications, projects supervised, and professional affiliations. She has over 20 years of teaching experience and has guided several PhD and MPhil students. She has published papers in international conferences and journals and has received best paper awards.
June 2020: Most Downloaded Article in Soft Computing ijsc
Soft computing is likely to play an important role in science and engineering in the future. The successful applications of soft computing and the rapid growth suggest that the impact of soft computing will be felt increasingly in coming years. Soft Computing encourages the integration of soft computing techniques and tools into both everyday and advanced applications. This Open access peer-reviewed journal serves as a platform that fosters new applications for all scientists and engineers engaged in research and development in this fast growing field.
NKU offers a variety of STEM programs with hands-on learning opportunities through small class sizes, research experiences, and state-of-the-art facilities. Students can interact closely with faculty and have opportunities for internships with local companies. Graduates are well prepared for high-paying careers or graduate programs. NKU provides scholarships and unique housing and study options to support STEM students.
A Tableau-based Federated Reasoning Algorithm for Modular Ontologies (Jie Bao)
This document describes a tableau-based algorithm for distributed reasoning over modular ontologies. It introduces description logics and modular ontologies modeled as packages in package-based description logics (P-DL). P-DL allows ontologies to be organized into modules or packages that can import terms from other packages. The algorithm uses a federation of local reasoners, each handling a package, to collaboratively construct a distributed tableau by sharing facts between local tableau constructions. This avoids materializing a single global tableau and allows reasoning to be performed even when global knowledge is not available.
International Perspectives: Visualization in Science and Education (Liz Dorland)
Overview of the international and interdisciplinary Gordon Research Conference on Visualization in Science and Education and info on key cognitive science and learning sciences researchers. History of the conference, NSF workshop, and research on learning with visualizations.
The document is a curriculum vitae for Dr. Abimbola Helen Afolayan, a Nigerian lecturer currently working in the Department of Information Systems at the Federal University of Technology in Akure, Nigeria. It details her educational background, research interests in areas like decision support systems and artificial intelligence, publications, conference presentations, memberships in professional organizations, and teaching experience.
The document is a curriculum vitae for Colin Fyfe. It summarizes that he is currently a Personal Professor at the University of the West of Scotland, with educational qualifications including a BSc in Mathematics, MSc in Information Technology, and a PhD in neural networks. It also outlines his extensive employment history in education and research, as well as his significant research contributions and roles in academic administration and conference organization.
Data Science: An Emerging Field for Future Jobs (Jian Qin)
The data deluge has become a reality in today's scientific research. What does it mean for the future science workforce? How can you prepare yourself to embrace the data challenges and opportunities? This presentation will provide you with an overview of data science and what it means to you as future researchers and career scientists.
This document discusses the need to redefine information literacy frameworks to incorporate data literacy for the 21st century. It provides context on the growth of data-driven research and debates around roles in data management. It examines conceptions of data literacy from social science and science perspectives and examples of libraries developing data services. Finally, it analyzes pedagogical approaches to teaching data literacy and calls for discussion on integrating data literacy into information literacy frameworks and education.
Supporting research life cycle librarians (Sherry Lake)
The document discusses the role of academic libraries in supporting the research data lifecycle. It notes trends like increasing data regulation and a lack of data management training for researchers. Libraries are well-positioned to help address these challenges due to their expertise in areas like intellectual property, relationship building, and providing access to information. The document outlines how roles like the data research scientist and research data management librarian can help libraries engage with researchers throughout the entire data lifecycle from collection to long-term preservation.
The document discusses the University of Virginia School of Data Science (SDS) and opportunities for collaboration with NASA. It provides an overview of SDS, including its mission to be a leader in responsible data science through interdisciplinary collaboration. It describes SDS's data science framework, research areas, capabilities, and recent growth. Examples of current research projects involving NASA data on environmental monitoring and forest ecosystems are presented. The document promotes further partnership between SDS and NASA on challenges in science, medicine, and other domains.
Bioinformatics databases: Current Trends and Future Perspectives (University of Malaya)
Data is the most powerful resource in any field or subject of study. In biology, data comes from scientists and their actions, and any institution that makes sense of the data collected will be at the forefront of its research field. At the beginning of any data collection endeavour, it is critical to find proper management techniques to store data and to maximise its utilisation. This presentation reflects upon current trends and techniques in data modeling and architecture, highlighting the uses of databases and focusing on bioinformatics examples and case studies. Finally, the future of bioinformatics databases is highlighted to give an overview of the modeling techniques needed to accommodate the escalation of biological data in coming years.
The document discusses the history and future of open science. It describes how open science has evolved from early empirical studies to today's data-driven computational research. Currently, many projects and repositories are making scientific data and findings openly accessible online. However, challenges remain regarding policies, infrastructure, and cultural changes. Moving forward, librarians can help by supporting data management, metadata standards, and identifying appropriate repositories for preserving and sharing research. The future of open science relies on continued collaboration across disciplines to facilitate data-intensive discovery.
Confronting Reality with Big Data & Learning Analytics
We are experiencing an explosion in the quantity of data available online from archives and live streams. Learning Analytics is concerned with how educational research, and learning platform design, can make more effective use of such data (Long & Siemens, 2011). Improving outcomes through the analysis of data is of interest to researchers, administrators, systems architects, social media developers, educators and learners. Analytics are being held up by some as a way to confront, and tackle, the tough new realities of less money, less attention, and higher accountability for quality of learning.
Researchers and vendors are building reporting capabilities into tools that provide unprecedented levels of data on learners. This symposium will show what is possible, and what's coming soon. What objections could possibly be raised to such progress?
However, information infrastructure embodies and shapes worldviews: classification schemes are not only systematic ways to capture and preserve, but also to forget, by virtue of what remains invisible (Bowker & Star, 1999). Learning analytics and recommendation engines are designed with a particular conception of ‘success’, driving the patterns deemed to be evidence of progress, the interventions that are deemed appropriate, the data captured and the rules that fire in software.
This symposium will air some of the critical arguments around the limits of decontextualised data and automated analytics, which often appear reductionist in nature, failing to illuminate higher order learning. There are complex ethical issues around data fusion, and it is not clear to what extent learners are empowered, in contrast to being merely the objects of tracking technology. Educators may also find themselves at the receiving end of a new battery of institutional ‘performance indicators’ that do not reflect what they consider to be authentic learning and teaching.
This Symposium will provide the opportunity to hear a series of brief presentations introducing contrasting perspectives, before the debate is opened to all. Speakers from a cross-section of The Open University will describe how we are connecting datasets, analysing student data and prototyping next generation analytics. Complementing this, JISC will present a national capability perspective, with an update on the JISC CETIS ‘landscape analysis’ of the field, which will clarify potential benefits, issues to consider, and help institutions to assess their current capability and possible next steps.
Participants will catch up with developments in this fast moving field, through exposure to the possibilities of analytics, as well as issues to be alert to.
This document discusses context-aware adaptive and personalized mobile learning systems. It begins with an introduction that outlines the motivation for such systems in providing tailored learning experiences on mobile devices. It then provides definitions for key terms like mobile learning, adaptivity, and personalization. The main issues in designing these systems are the learner's contextual information that can be used for adaptations, and the types of adaptations that are possible. The document outlines ASK's research progress in this area, including their context model and prototype tools. It concludes by noting further research issues.
This document provides biographical and professional information about Yan Zhou. It includes sections on research interests, skills, education, experience, professional service, honors, publications, submitted papers, references, research statement, and teaching statement. The key points are:
- Yan Zhou received a D.Sc. in Computer Science from Washington University in St. Louis in 2001, and has since been a visiting assistant professor at Pacific Lutheran University teaching courses in computer science.
- Her research focuses on machine learning theory and applications, especially semi-supervised learning techniques to leverage both labeled and unlabeled data when labeled data is limited.
- She is interested in continuing research in bioinformatics and information retrieval to aid problems like gene identification and protein
Organizational Implications of Data Science Environments in Education, Resear... (Victoria Steeves)
Data science (DS) poses key organizational challenges for academic institutions. DS is a multidisciplinary field that includes a range of research methodologies and fields of inquiry. DS as a domain is interested in many of the same issues as libraries: data access and curation, reproducibility, the value of ontologies, and open scholarship. At the same time, identifying opportunities to collaborate and deploy unified services can be challenging. The Data Science Environment (DSE) program, co-funded by the Gordon and Betty Moore and Alfred P. Sloan foundations, provides resources to help universities develop collaborations between researchers, develop tools in DS, and create new career paths for data scientists. Working groups within the DSE focus on reproducibility, career paths, education/training, research methods, space issues, and software/tools. This program has introduced new opportunities for libraries to explore how to engage with this community and consider how to bring the expertise in the DS community to bear on library missions and goals. In this panel, program members from each of the three partner universities, the University of Washington, New York University and the University of California, Berkeley, consider the research questions of the DSE and the organizational impact of these groups in the University as a whole and for the libraries specifically. The panel will employ a case-study presentation model framed through three lenses: the role of data sciences in information science, the potential career paths for data scientists in libraries, and the potential amplification of information services (e.g. data curation, institutional repositories, scholarly publishing).
CNI Program: Talk Description: https://www.cni.org/topics/digital-curation/organizational-implications-of-data-science-environments-in-education-research-and-research-management-in-libraries
Video of Talk--Vimeo: https://vimeo.com/149713097
Video of Talk--YouTube: https://www.youtube.com/watch?v=L0G9JsPMEXY
Thoughts on Knowledge Graphs & Deeper Provenance (Paul Groth)
Thinking about the need for deeper provenance for knowledge graphs but also using knowledge graphs to enrich provenance. Presented at https://seminariomirianandres.unirioja.es/sw19/
The Perils and Promise of Environmental Data Science (Dawn Wright)
Keynote address delivered in April 2019 to the Yale School of Forestry & Environmental Studies, during their annual research conference. "The mission of the Annual F&ES Research Conference is to provide a forum for research degree students and postdocs to share their original work with the F&ES community, as well as with the broader Yale and New Haven communities. After the success of last year's partnership with Yale Pathways to Science, we will again open conference attendance to local high school students and host events emphasizing research communication. Our aim is for the conference to facilitate interdisciplinary communication and collaboration both within the School and beyond the walls of Kroon."
- She is interested in continuing research in bioinformatics and information retrieval to aid problems like gene identification and protein
Organizational Implications of Data Science Environments in Education, Resear...Victoria Steeves
Data science (DS) poses key organizational challenges for academic institutions. DS is a multidisciplinary field that includes a range of research methodologies and fields of inquiry. DS as a domain is interested in many of the same issues as libraries: data access and curation, reproducibility, the value of ontologies, and open scholarship. At the same time, identifying opportunities to collaborate and deploy unified services can be challenging. The Data Science Environment (DSE) program, co-funded by the Gordon and Betty Moore and Alfred P. Sloan foundations, provides resources to help universities develop collaborations between researchers, develop tools in DS, and create new career paths for data scientists. Working groups within the DSE focus on reproducibility, career paths, education/training, research methods, space issues, and software/tools. This program has introduced new opportunities for libraries to explore how to engage with this community and consider how to bring the expertise in the DS community to bear on library missions and goals. In this panel, program members from each of the three partner universities, the University of Washington, New York University and the University of California, Berkeley, consider the research questions of the DSE and the organizational impact of these groups in the University as a whole and for the libraries specifically. The panel will employ a case-study presentation model framed through three lenses: the role of data sciences in information science, the
potential career paths for data scientists in libraries, and the potential
amplification of information services (e.g. data curation, institutional repositories, scholarly publishing).
CNI Program: Talk Description: https://www.cni.org/topics/digital-curation/organizational-implications-of-data-science-environments-in-education-research-and-research-management-in-libraries
Video of Talk--Vimeo: https://vimeo.com/149713097
Video of Talk--YouTube: https://www.youtube.com/watch?v=L0G9JsPMEXY
Thoughts on Knowledge Graphs & Deeper ProvenancePaul Groth
Thinking about the need for deeper provenance for knowledge graphs but also using knowledge graphs to enrich provenance. Presented at https://seminariomirianandres.unirioja.es/sw19/
The Perils and Promise of Environmental Data ScienceDawn Wright
Keynote address delivered in April 2019 to the Yale School of Forestry & Environmental Studies, during their annual research conference. "The mission of the Annual F&ES Research Conference is to provide a forum for research degree students and postdocs to share their original work with the F&ES community, as well as with the broader Yale and New Haven communities. After the success of last year's partnership with Yale Pathways to Science, we will again open conference attendance to local high school students and host events emphasizing research communication. Our aim is for the conference to facilitate interdisciplinary communication and collaboration both within the School and beyond the walls of Kroon."
Interdisciplinarity and Epistemic Fluency: What makes complex knowledge work ...Lina Markauskaite
Webinar 2 “Interdisciplinarity in Technology-Enhanced Learning”
The topic chosen for the second edition of the Webinar series is “Interdisciplinarity in TEL”. The TEL field is interdisciplinary by definition. This makes TEL an especially interesting research field. Yet, it also brings complexity at different levels. A challenge for TEL researchers is to properly understand what is interdisciplinarity in our field, its challenges and implications. In the first part of the dialog, Lina Markauskaite will elaborate on the concept of epistemic fluency as “the capacity to understand, switch between and combine different kinds of knowledge and different ways of knowing about the world” (Markauskaite & Goodyear, 2016)
About the Webinar
Big data is being collected at a rate that is surpassing traditional analytical methods due to the constantly expanding ways in which data can be created and mined. Faculty in all disciplines are increasingly creating and/or incorporating big data into their research and institutions are creating repositories and other tools to manage it all. There are many challenge to effectively manage and curate this data—challenges that are both similar and different to managing document archives. Libraries can and are assuming a key role in making this information more useful, visible, and accessible, such as creating taxonomies, designing metadata schemes, and systematizing retrieval methods.
Our panelists will talk about their experience with big data curation, best practices for research data management, and the tools used by libraries as they take on this evolving role.
The document discusses opportunities for collaboration between the University of Virginia School of Data Science (SDS) and NASA. It provides an overview of SDS, including its mission to be a leader in responsible data science through interdisciplinary collaboration and societal benefit. Examples are given of current SDS research projects involving NASA data on climate change and forest ecosystems. The document proposes areas for potential SDS-NASA collaboration such as courses involving NASA content, funded research projects, student fellowships and faculty positions. It aims to leverage the strengths of both organizations in responsible data science.
Neville Prendergast "E-Science - What is it?"The TMC Library
Neville Prendergast gave this presentation during the "Understanding E-Science: A Symposium for Medical Librarians" on February 13, 2012 in Houston, TX.
Curriculum Development at the Tetherless World Constellation - Peter Fox - RD...ASIS&T
The document summarizes curriculum development at the Tetherless World Constellation focusing on data science and related fields. It discusses themes like data science, semantic science, knowledge provenance and ontology engineering. It notes the Constellation involves over 35 faculty, post-docs, grad and undergrad students across multiple departments. It also lists some application themes like government data, environmental informatics and health/life sciences. Finally, it advocates teaching data science methodology and principles over technology in an interdisciplinary way and emphasizing collaboration.
Developing Data Services to Support eScience/eResearch
1. School of Information Studies, Syracuse University
Developing Data Services to Support eScience/eResearch
2012 Priscilla M. Mayden Lecture: eScience and the Evolution of Library Services
Jian Qin
School of Information Studies, Syracuse University
http://eslib.ischool.syr.edu/
February 22, 2012
2. The morning ahead
An environmental scan
• E-Science, cyberinfrastructure, and data
• What do all these have to do with me?
Case study: The gravitational wave research data management
Group work: Role play in developing data management initiatives
Priscilla M. Mayden Lecture 2012, Utah 2
3. An environmental scan
• E-Science, cyberinfrastructure, and data
• What do all these have to do with me?
Overview of E-Science and Data
Characteristics of e-science
Data sets, data collections, and data repositories
Why does it matter to libraries?
4. E-Science
“In the future, e-Science will refer to the large scale science that will increasingly be carried out through distributed global collaborations enabled by the Internet.”
National e-Science Center. (2008). Defining e-Science. http://www.nesc.ac.uk/nesc/define.html
5. Characteristics of e-science
• Digital data driven
• Distributed
• Collaborative
• Trans-disciplinary
• Fuses pillars of science
– Experiment
– Theory
– Model/simulation
– Observation/correlation
Greer, Chris. (2008). E-Science: Trends, Transformations & Responses. In: Reinventing Science Librarianship: Models for the Future, October 2008. http://www.arl.org/bm~doc/ff08greer.pps
6. Shift in Science Paradigms
• Thousand years ago: Science was empirical, describing natural phenomena
• A few hundred years ago: Theoretical branch using models, generalizations
• A few decades ago: A computational approach simulating complex phenomena
• Today: Data exploration (eScience), unify theory, experiment, and simulation
– Data captured by instruments or generated by simulator
– Processed by software
– Information/Knowledge stored in computer
– Scientist analyzes database/files using data management and statistics
Gray, J. & Szalay, A. (2007). eScience – A transformed scientific method. http://research.microsoft.com/en-us/um/people/gray/talks/NRC-CSTB_eScience.ppt
7. X-Informatics
• The evolution of X-Informatics and Computational-X for each discipline X
• How to codify and represent our knowledge
[Diagram labels: Experiments & Instruments; Other Archives; Literature; Simulations; facts; questions; answers]
The Generic Problems
• Data ingest
• Managing a petabyte
• Common schema
• How to organize it
• How to reorganize it
• How to share with others
• Query and Visualization tools
• Building and executing models
• Integrating data and Literature
• Documenting experiments
• Curation and long-term preservation
Gray, J. & Szalay, A. (2007). eScience – A transformed scientific method. http://research.microsoft.com/en-us/um/people/gray/talks/NRC-CSTB_eScience.ppt
8. Useful resources
http://research.microsoft.com/en-us/collaboration/fourthparadigm/
Part 2: Health and Wellbeing
• The healthcare singularity and the age of semantic medicine
• Healthcare delivery in developing countries: challenges and potential solutions
• Discovering the wiring diagram of the brain
• Toward a computational microscope for neurobiology
• A unified modeling approach to data-intensive healthcare
• Visualization in process algebra models of biological systems
9. What are data?
What are some of the major data formats?
Why data formats?
FUNDAMENTALS OF DATA
10. What are data? (1)
An artist’s conception (above) depicts fundamental NEON observatory instrumentation and systems as well as potential spatial organization of the environmental measurements made by these instruments and systems.
http://www.nsf.gov/pubs/2007/nsf0728/nsf0728_4.pdf
11. What are data? (2)
12. Medical and health data
Standardization
Compliance
Security
http://www.weforum.org/issues/charter-health-data
13. The multi-dimensions of data
Research orientation
Data types
Data formats
Levels of processing
14. Scientific data formats
Common data format
Image formats
Matrix formats
Microarray file formats
Communication protocols
15. Scientific & medical data formats
• Medical and Physiological Data Formats
– BDF: BioSemi data format (.bdf)
– EDF: European data format (.edf)
• Molecular Biology Data Formats
– PDB: Protein Data Bank format (.pdb)
– MMCIF: MMCIF 3D molecular model format (.cif)
• Medical Imaging
– DICOM: DICOM annotated medical images (.dcm, .dic)
• Chemical Formats
– XYZ: XYZ molecule geometry file (.xyz)
– MOL: MDL MOL format (.mol)
– MOL2: Tripos MOL2 format (.mol2)
– SDF: MDL SDF format (.sdf)
– SMILES: SMILES chemical format (.smi)
• Bioinformatics Formats
– GenBank: NCBI GenBank sequence format (.gb, .gbk)
– FASTA: bioinformatics sequence format (.fasta, .fa, .fsa, .mpfa)
– NEXUS: NEXUS phylogenetic data format (.nex, .ndk)
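The file extensions on this slide can be turned into a simple lookup table. The following is a minimal sketch only: the function name `identify_format` is hypothetical, the table covers just a few of the formats named on the slide, and guessing a format from an extension is a heuristic that a real service would confirm by inspecting the file contents.

```python
from pathlib import Path

# Extension table copied from the slide (a subset); the second item in each
# pair is the slide heading under which the format appears.
FORMATS = {
    ".bdf": ("BDF", "Medical and physiological data"),
    ".edf": ("EDF", "Medical and physiological data"),
    ".pdb": ("PDB", "Molecular biology"),
    ".cif": ("MMCIF", "Molecular biology"),
    ".dcm": ("DICOM", "Medical imaging"),
    ".xyz": ("XYZ", "Chemical"),
    ".mol": ("MOL", "Chemical"),
    ".sdf": ("SDF", "Chemical"),
    ".gb": ("GenBank", "Bioinformatics"),
    ".fasta": ("FASTA", "Bioinformatics"),
    ".nex": ("NEXUS", "Bioinformatics"),
}


def identify_format(filename):
    """Return (format name, category) for a known extension, else None."""
    suffix = Path(filename).suffix.lower()  # compare case-insensitively
    return FORMATS.get(suffix)
```

A librarian triaging a deposit could run such a lookup over a file listing to see which disciplinary formats are present before deciding which tools or standards apply.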
16. Why data formats?
• Archiving
– Preservation for posterity
• Storage
– Availability for “arbitrary” access
• Analysis
– Availability for processing
• Transmission
– Delivery across hardware, software, and administrative system boundaries
17. Summary
• Scientific data formats are closely tied to scientific computing
– Data structure, model, and attributes
– Self-descriptive with header/metadata
– API for manipulating the data
– Interoperability: conversion between different formats
• No one-format-fits-all standard
• Each standard has one or more tools for creating, editing, and annotating datasets
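The ideas of a self-descriptive header and of conversion between formats can be illustrated with a toy example. This is a sketch under stated assumptions: the `# key: value` header convention, the sample values, and the function names are all invented for illustration and do not correspond to any of the standards named in this lecture.

```python
import csv
import io
import json

# Made-up sample: header lines start with '#', the rest is plain CSV.
SAMPLE = """# title: Example streamflow readings
# units: liters per second
station,flow
W1,12.5
W2,7.0
"""


def read_self_describing(text):
    """Split '# key: value' header lines from the CSV body."""
    metadata = {}
    body_lines = []
    for line in text.splitlines():
        if line.startswith("#"):
            key, _, value = line[1:].partition(":")
            metadata[key.strip()] = value.strip()
        elif line.strip():
            body_lines.append(line)
    rows = list(csv.DictReader(io.StringIO("\n".join(body_lines))))
    return metadata, rows


def to_json(text):
    """Convert the toy format to JSON, keeping metadata with the records."""
    metadata, rows = read_self_describing(text)
    return json.dumps({"metadata": metadata, "records": rows}, indent=2)
```

The point of the sketch is that the header travels with the data, so a conversion can carry the description into the target format instead of losing it.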
18. What is a dataset?
What are some of the metadata standards for describing datasets?
What is data management?
DATASETS, METADATA, AND DATA MANAGEMENT
19. Dataset classification
Volume
Large-volume
Small-volume
20. Ecological data example: Instantaneous streamflow by watershed
http://www.hubbardbrook.org/data/dataset.php?id=1
21. Diabetes data and trends
Country level estimates: http://apps.nccd.cdc.gov/DDT_STRS2/NationalDiabetesPrevalenceEstimates.aspx?mode=PHY
Diabetes Data & Trends home page: http://apps.nccd.cdc.gov/ddtstrs/default.aspx
22. Clinical trials data management:
http://www.clinicaltrials.gov/ct2/show/NCT00006286?term=TADS+NIMH&rank=1
23. Common in the examples
• Attributes of a dataset tell users/managers:
– What the dataset is about
– How the data were collected
– To which project the data are related
– Who was responsible for data collection
– Whom you may contact to obtain the data
– What publications the data have generated
– ??
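The attributes listed on this slide map naturally onto a small record structure. The sketch below is illustrative only: the class name `DatasetRecord` and its field names are hypothetical and do not come from any metadata standard; a real service would use the element names of the standard it adopts.

```python
from dataclasses import dataclass, field, fields


@dataclass
class DatasetRecord:
    subject: str = ""                # what the dataset is about
    collection_method: str = ""      # how the data were collected
    project: str = ""                # project the data are related to
    collectors: list = field(default_factory=list)    # who collected the data
    contact: str = ""                # whom to contact to obtain the data
    publications: list = field(default_factory=list)  # publications generated

    def missing(self):
        """Return the names of attributes that are still empty."""
        return [f.name for f in fields(self) if not getattr(self, f.name)]
```

Listing the empty attributes gives a quick checklist of what still has to be asked of the researcher before the dataset is described well enough to share.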
24. Metadata standards in medical & health sciences
[Diagram labels: Structure; Semantics; Medical images; Bioinformatics; Healthcare]
Standards named in the diagram: NCBI Taxonomy, NCBO Bioportal, UMLS, MeSH (Medical Subject Headings), GenBank, HL7, DICOM, SNOMED CT (Systematized Nomenclature of Medicine--Clinical Terms)
25. [No text on this slide]
26. Research data collections
Size | Metadata standards | Management
Larger, discipline-based | Multiple, comprehensive | Organized, institutionalized
Smaller, team-based | None or random | Heroic individual inside the team
27. Research collections
• Limited processing or long-term management
• Not conformed to any data standards
• Varying sizes and formats of data files
• Low level of processing, lack of plan for data products
• Low awareness of metadata standards and data management issues
28. Resource collections
• Authored by a community of investigators, within a domain of science or engineering
• Developed with community level standards
• Life time is between mid- and long-term
29. Reference collection
• Example: Global Biodiversity Information Facility
– Created by large segments of science community
– Conform to robust, well-established and comprehensive standards, e.g.
• ABCD (Access to Biological Collection Data)
• Darwin Core
• DiGIR (Distributed Generic Information Retrieval)
• Dublin Core Metadata standard
• GGF (Global Grid Forum)
• Invasive Alien Species Profile
• LSID (Life Sciences Identifier)
• OGC (Open Geospatial Consortium)
30. Datasets, data collections, and data repositories
• Data collections are built for larger segments of science and engineering
• Datasets
– typically centered around an event or a study
– contain a single file or multiple files in various formats
– coupled with documentation about the background of data collection and processing
Data repository: System for storing, managing, preserving, and providing access to datasets
• A repository may contain one or more data collections
• A data collection may contain one or more datasets
• A dataset may contain one or more data files
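The containment relationships on this slide (repository, collection, dataset, file) can be sketched as nested data structures. The class and attribute names below are hypothetical and chosen only to mirror the slide's wording; they do not describe any particular repository software.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class DataFile:
    name: str


@dataclass
class Dataset:
    title: str
    files: List[DataFile] = field(default_factory=list)


@dataclass
class DataCollection:
    name: str
    datasets: List[Dataset] = field(default_factory=list)


@dataclass
class DataRepository:
    name: str
    collections: List[DataCollection] = field(default_factory=list)

    def count_files(self) -> int:
        """Count data files across every collection and dataset."""
        return sum(
            len(dataset.files)
            for collection in self.collections
            for dataset in collection.datasets
        )
```

Modeling each level as a list of the level below reflects the "one or more" wording of the slide while leaving room for empty containers during ingest.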
31. Data management for science research
• Definition from Wikipedia: http://en.wikipedia.org/wiki/Data_management
• Key concepts in data management:
– Data ownership
– Data collection
– Data storage
– Data protection
– Data retention
– Data analysis
– Data sharing
– Data reporting
How do they relate to responsible conduct of research? http://ori.hhs.gov/images/ddblock/data.pdf
32. An attempt to define DM
• In the context of libraries:
– Data management is a process in which librarians plan, design, and implement data services to support eScience/eResearch.
– Data services that libraries may provide:
• Institutional or community data repositories
• Data management plan for pre- and post-award of grants
• Metadata creation, linking, and discovery
• Data archiving, preservation, and curation
• Consultation for research group’s data management projects
• Data management and data literacy training for graduate students and faculty
33. Initiatives in research libraries
Data support and services in institutions: 45%
Libraries involved in supporting eScience: 73%
• Pressure points:
– Lack of resources
– Difficulty acquiring the appropriate staff and expertise to provide eScience and data management or curation services
– Lack of a unifying direction on campus
Source: Soehner, C., Steeves, C. & Ward, J. (2010). E-Science and data support services: A study of ARL member institutions. http://www.arl.org/bm~doc/escience_report2010.pdf
34. Data preservation challenges
• Data formats
– Vary in data types, e.g. vector and raster data types
– Format conversions, e.g. from an old version to a newer one
• Data relations
– e.g. there are data models, annotations, classification schemes, and symbolization files for a digital map
• Semantic issues
– Naming datasets and attributes
35. Data access challenges
• Reliability
• Authenticity
• Leverage technology to make data access easier and more effective
– Cross-database search
– Integration applications
– “Science-ready” datasets
36. Supporting digital research data
• Lifecycle of research data
  – Create: data creation/capture/gathering from laboratory experiments, field work, surveys, devices, media, simulation output…
  – Edit: organize, annotate, clean, filter…
  – Use/reuse: analyze, mine, model, derive additional data, visualize, input to instruments/computers
  – Publish: disseminate, create portals/databases, associate with literature
  – Preserve/destroy: store/preserve, store/replicate/preserve, store/ignore, destroy…
37. Supporting data management
The data deluge:
• Numerical, image, video
• Models, simulations, bit streams
• XML, CSV, DB, HTML
Researchers need:
• Specialized search engines to discover the data they need
• Powerful data mining tools to use and analyze the data
38. Research data management
[Diagram labels: Community; Institution; eScience librarian; Financial and policy support; Science domain; Data content idiosyncrasies; User requirements]
Evolving and interconnecting: institutional repository, community repository, national repository, international repository
39. Implications for the scholarly communication process
• Publishing: data publishing; new scholarly publishing models (open access, institutional and community repositories, self-publishing, library publishing, ...)
• Curation: maintaining, preserving, and adding value to digital research data throughout its lifecycle
• Archiving: the long-term storage, retrieval, and use of scientific data and methods
40. Summary
• E-Science development has raised expectations for research libraries
  – Working knowledge and skills in e-Science
  – Focus on process (data and team science) rather than product (reference services)
  – Proactive, collaborative, integrative, and interdisciplinary
41. Case Study: Learning Data Management Needs from Scientists
42. Gravitational Wave (GW) Research
43. What is the problem?
• Tracking data output and workflows is difficult due to lack of provenance data
• Search of datasets is limited due to lack of specific options
• Within the LIGO community, data sharing and reuse are difficult without provenance metadata
Data provenance case study 43
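To make "provenance metadata" concrete, here is a minimal sketch of what a provenance record for a derived dataset could hold and how it supports tracking outputs back to their inputs. Every field name, identifier, and the `lineage` helper are hypothetical illustrations; they are not taken from the case study or from any LIGO system.

```python
from dataclasses import dataclass, field, asdict
from datetime import datetime, timezone
from typing import Dict, List
import json


@dataclass
class ProvenanceRecord:
    """Hypothetical provenance record for one derived dataset."""
    dataset_id: str            # identifier of the output dataset
    derived_from: List[str]    # identifiers of the input datasets
    process: str               # workflow step or program that produced it
    process_version: str       # version of that program
    agent: str                 # person or service that ran the step
    parameters: Dict[str, str] = field(default_factory=dict)
    created: str = field(
        default_factory=lambda: datetime.now(timezone.utc).isoformat()
    )


def lineage(dataset_id: str, records: List[ProvenanceRecord]) -> List[str]:
    """Return all upstream dataset ids, nearest ancestors first."""
    by_id = {r.dataset_id: r for r in records}
    seen: List[str] = []
    queue = [dataset_id]
    while queue:
        current = queue.pop(0)
        record = by_id.get(current)
        if record is None:
            continue  # raw input with no recorded provenance
        for parent in record.derived_from:
            if parent not in seen:
                seen.append(parent)
                queue.append(parent)
    return seen


# Invented example: raw data -> filtered data -> analysis result.
records = [
    ProvenanceRecord("filtered-001", ["raw-001"], "filter", "0.1", "analyst-a"),
    ProvenanceRecord("result-001", ["filtered-001"], "analysis", "0.1", "analyst-b"),
]
print(lineage("result-001", records))
print(json.dumps(asdict(records[0]), indent=2))
```

With records like these, a search interface could offer the "specific options" the slide says are missing, for example filtering datasets by the process or the input that produced them.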
44. Understand the research workflow
• Interview the scientist
  – Listening (good listening skills)
  – Asking questions (don’t be afraid of asking questions)
  – Use your librarian brain to ingest the conversation:
    • How does the research flow from one point to the next?
    • What do the research input and output consist of at each stage of research in terms of data?
45. Mapping out the knowledge v0.1
46. Mapping out the knowledge v0.2
47. Mapping out the knowledge v1.0
48. Lessons learned
• Science is learnable even if you don’t have a subject background
  – Learn enough to understand the research process and workflow
• Scientists are eager to get help
• Librarians need to be technically minded
  – Data, metadata, database
  – Structures, models, workflows
• Librarians need to be good listeners while staying good conversation leaders
  – Know when and how to lead the conversation to get what you need for data management planning and implementation
  – Do your homework on the subject so that you can be an intelligent listener
50. Case study #1: To build or not to build a data repository?
A university library has developed an institutional repository for preserving and
providing access to the scholarly output of the researchers in this institution.
Now a new challenge arises from e-science research: funding agencies demand data
management plans, and authors and users expect publications to be linked to the
underlying data. You already know that some faculty use their disciplinary data
repository for submitting their datasets (e.g., GenBank for microbiology research
data). The problem you face now is whether an institutional data repository should
be built for those who do “small science” and have neither the funding nor the
expertise to manage their data.
Questions to be addressed:
• What are the strategies you will use to approach the problem?
• What are the possible solutions for the problem?
• What are some of the tradeoffs for the solutions you will adopt?
51. Case study #2: Developing a data taxonomy
The concept of research data management is unfamiliar to many faculty as
well as to your library staff. What is data? What is a data set? These seemingly
simple terms can be very confusing and are interpreted differently in
different contexts and disciplines. As part of the data management strategies,
you decide to develop an authoritative data taxonomy for the campus research
community. This data taxonomy will benefit the creation and use of institutional
data policies, data repository or repositories, and data management plans
required by funding agencies.
Questions to be addressed:
• What should the data taxonomy include?
• What form should it take, a database-driven website or a static HTML page?
• Who should be the constituencies in this process?
• Who will be the maintainer once the taxonomy is released?
52. Case study #3: Developing a data policy
Data policies play an important role in governing how data will be managed,
shared, and accessed. They are also an instrument for fending off potential legal
problems. There are several types of data policy: data access and use, data
publishing, and data management. Your university’s Office of Sponsored
Research has some existing policy on data, but it is neither systematic nor
complete. Many of the terms were defined years ago and do not cover newer
areas such as the embargo period of data. As the university has decided to
build a data repository for managing and preserving datasets, a data policy has
become one of the top priorities for both the institution and the data repository.
Questions to be addressed:
• What should the data policy include?
• Who should be the constituencies in this process?
• Who will be the interpretation authority for the data policy?
53. Case study #4: Cataloging datasets
Describing datasets is the process of creating metadata for datasets. In
scientific disciplines, several metadata standards have been developed, e.g.,
the Content Standard for Digital Geospatial Metadata (CSDGM), Darwin Core,
and Ecological Metadata Language (EML). Each of these metadata standards
contains hundreds of elements and requires both metadata and subject
knowledge training in order to use it. Besides, creating one record using
any of these standards requires a tremendous time investment. But your
library has neither such specialized personnel nor the funds to hire new
staff for the job. The existing staff has some general metadata skills such as
Dublin Core. In deciding the metadata schema for your data repository, you
need to address these questions:
• Should I adopt a scientific metadata standard or develop one tailored to our
need?
• How can I learn what metadata elements are critical to dataset submitters and
searchers?
• What are some of the benefits and disadvantages for adopting a standard or
developing a local schema?
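As a point of comparison for the questions above, the following sketch shows how small a general-purpose record can be when staff only have Dublin Core skills. It builds a simple Dublin Core record with Python's standard library. The element names and namespace come from the Dublin Core Metadata Element Set; the wrapper element `record`, the helper `dublin_core_record`, and all field values are invented placeholders, and the choice of nine elements is an assumption rather than a recommendation from the lecture.

```python
import xml.etree.ElementTree as ET

# Namespace of the Dublin Core Metadata Element Set, version 1.1.
DC_NS = "http://purl.org/dc/elements/1.1/"
ET.register_namespace("dc", DC_NS)


def dublin_core_record(fields):
    """Build a simple Dublin Core XML record from (element, value) pairs."""
    root = ET.Element("record")  # hypothetical wrapper element
    for element, value in fields:
        child = ET.SubElement(root, f"{{{DC_NS}}}{element}")
        child.text = value
    return root


# All values below are invented placeholders for illustration.
record = dublin_core_record([
    ("title", "Example stream temperature readings"),
    ("creator", "Example Research Group"),
    ("subject", "hydrology"),
    ("description", "Hourly temperature readings from a hypothetical field site."),
    ("date", "2012-01-01"),
    ("type", "Dataset"),
    ("format", "text/csv"),
    ("identifier", "example-dataset-001"),
    ("rights", "Placeholder rights statement"),
])
xml_text = ET.tostring(record, encoding="unicode")
print(xml_text)
```

A record like this is quick to create but carries none of the discipline-specific detail (for example, spatial reference or sampling method) that CSDGM, Darwin Core, or EML capture, which is the tradeoff the last question asks about.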
54. Case study #5: Evaluating data repository tools
Research data management, as a driving force for e-science, is inherently a
tool-intensive field. Tools related to data management can be divided into two broad
categories: those for creating metadata records and those for data repository
management. An academic institution decided to build its own data repository
as part of the supporting service for researchers to meet the data management
plan requirement of funding agencies. This data repository development task
was handed down to the library. You, the library director, have to decide whether
to develop an in-house system or use an off-the-shelf software system. As
usual, you put together a taskforce to find a solution to this challenge. The
questions to be addressed by the taskforce include:
• What are the options available to us?
• What evaluation criteria are the most important to our goal?
• What are the limitations for us to adopt one option or the other?
• How will this option interoperate with the existing institutional repository
system? Or, can the existing repository system be used for data repository
purposes?