Haley A. Kujawa is a senior at Virginia Tech studying Biological Systems Engineering with an expected graduation date of May 2017. She has a 3.64 GPA and is involved in several extracurricular activities including Under the Green Umbrella, Gamma Phi Beta, Engineers Without Borders, and Students Helping Honduras. Her relevant experience includes conducting independent research on waste disposal in Switzerland as part of the Presidential Global Scholars program, designing a sedimentation tank for a theoretical community as part of an introductory engineering design project, and performing lab experiments and writing formal reports for an introductory engineering lab course.
Kelsey Kachnik graduated from the University of Utah with a Bachelor of Science in Biomedical Engineering in 2015. She has experience in biomedical engineering research, medical device design projects, and leadership roles in campus organizations. Her skills include lab techniques, technical software programs, and event planning. She is proficient in MATLAB, Solidworks, cell culture, statistics, and Microsoft Office.
Recommendations for infrastructure and incentives for open science, presented at the Research Data Alliance 6th Plenary. Presenter: William Gunn, Director of Scholarly Communications, Mendeley.
Matthew Lichtenberger is a student at the University of Arizona pursuing a B.S. in Astronomy and Physics with minors in Spanish and Mathematics. He has conducted research with NASA and private companies, analyzing exoplanet observations and developing mission proposals. He also leads the technology efforts for the University of Arizona Astronomy Club. Lichtenberger maintains high academic achievement and has received multiple scholarships recognizing his accomplishments in physics.
On Friday, September 16th, I received the North Carolina American Chemical Society Distinguished Speaker Award and had the chance to review the past 20 years of my career. This was my short intro bio:
"Antony Williams is a Ph.D. NMR spectroscopist and cheminformatician who has worked in academia, government, a Fortune 500 company, and two start-ups. He is co-founder of the free online chemical database ChemSpider, originally started as a hobby project and ultimately acquired by the Royal Society of Chemistry (in the UK) and now used by over 50,000 users per day. He is now a computational chemist at the Environmental Protection Agency in the National Center for Computational Toxicology and is focused on developing web applications to support data dissemination and progress efforts in allowing for faster and cheaper approaches to identify potential toxicological effects of chemicals. He has published >180 papers, >25 book chapters and a number of books. He is known as the ChemConnector on social networks. "
This document discusses reproducible research and provides guidance on how to conduct research in a reproducible manner. It covers:
1. The importance of reproducible research due to large datasets, computational analyses, and the potential for human error. Ensuring reproducibility requires new expertise and infrastructure.
2. Key aspects of reproducible research include data management plans, version control, use of file formats and software/tools that allow reproducibility, and publishing data and code to allow others to replicate results.
3. Reproducible research benefits the scientific community by increasing transparency and allows researchers to re-analyze their own data in the future. Journals and funders are increasingly requiring reproducibility.
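The practices listed above can be illustrated with a minimal, hypothetical sketch. The data, seed, and provenance record below are illustrative assumptions, not taken from the document; the point is that fixing the random seed and recording a checksum of the inputs lets anyone replicate a computational result exactly.

```python
import hashlib
import json
import random

def run_analysis(data, seed=42):
    """A toy bootstrap analysis made deterministic: fixing the seed
    makes the resampling repeatable on any machine."""
    rng = random.Random(seed)
    resamples = [sum(rng.choices(data, k=len(data))) / len(data)
                 for _ in range(100)]
    return sum(resamples) / len(resamples)

def provenance_record(data, seed, result):
    """Record inputs alongside the output so the run can be replicated:
    a checksum of the data, the seed used, and the result itself."""
    digest = hashlib.sha256(json.dumps(data).encode()).hexdigest()
    return {"data_sha256": digest, "seed": seed, "result": result}

data = [1.0, 2.0, 3.0, 4.0]
result = run_analysis(data)
record = provenance_record(data, 42, result)
# Re-running with the same seed reproduces the result exactly.
assert run_analysis(data) == result
```

Publishing the script, the data file, and the provenance record together (e.g. in a version-controlled repository) is what allows others to replicate the result.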
Protégé4US: Harvesting Ontology Authoring Data with Protégé (Markel Vigo)
The inherent complexity of ontologies poses a number of cognitive and perceptual challenges for ontology authors. We investigate how users deal with the complexity of the authoring process by analysing how one of the most widespread ontology development tools (i.e. Protégé) is used. To do so, we build Protégé4US (Protégé for User Studies) by extending Protégé to generate log files that contain ontology authoring events. These log files contain data not only about the interaction with the environment, but also about OWL entities and axioms. We illustrate the usefulness of Protégé4US with a case study with 15 participants. The data generated from the study allows us to learn more about how Protégé is used (e.g. most frequently used tabs), how well users perform (e.g. task completion times), and to identify emergent authoring strategies, including moving down the class hierarchy or saving the current workspace before running the reasoner. We argue that Protégé4US is a valuable instrument to identify ontology authoring patterns.
Jean-Claude Bradley presents at the Special Libraries Association meeting on June 14, 2011 on the "International Year of Chemistry: Perils and Promises of Modern Communication in the Sciences- The Role of Trust". The talk mainly covers the problems with a trusted source based model for melting point data and demonstrates that an Open Data model including Open Notebook Science when necessary can be very helpful in curating datasets. Web services for experimental and predicted melting points are then reviewed.
Capturing Context in Scientific Experiments: Towards Computer-Driven Science (dgarijo)
Scientists publish computational experiments in ways that do not facilitate reproducibility or reuse. Significant domain expertise, time, and effort are required to understand scientific experiments and their research outputs. To improve this situation, mechanisms are needed to capture the exact details and the context of computational experiments. Only then would intelligent systems be able to help researchers understand, discover, link, and reuse the products of existing research.
In this presentation I will introduce my work and vision towards enabling scientists to share, link, curate, and reuse their computational experiments and results. In the first part of the talk, I will present my work on capturing and sharing the context of scientific experiments using scientific workflows and machine-readable representations. Thanks to this approach, experiment results are described unambiguously, carry a clear trace of their creation process, and include pointers to the sources used for their generation. In the second part of the talk, I will describe examples of how the context of scientific experiments may be exploited to browse, explore, and inspect research results. I will end the talk by presenting new ideas for improving and benefiting from the capture of experimental context and for involving scientists in curating and creating abstractions on available research metadata.
Leicester Research Archive (LRA): the work of a repository administrator (Gaz Johnson)
Second part (of three) of a lecture delivered to postgraduate library students at the University of Loughborough. Focuses on the role of the repository administrator and the practical steps taken to populate the site. This section was written and presented by Valérie Spezi.
Cheminformatics Workflows Using Mobile Apps for Drug Discovery (Sean Ekins)
This document discusses using mobile apps for drug discovery workflows. It describes how mobile devices are revolutionizing computing through new user interfaces and availability anywhere. Several examples are given of simple app workflows for tasks like looking up structures, running searches, and sharing data. The document advocates for mobile and cloud-based approaches to replace desktop-based cheminformatics workflows. This could make specialized tasks more accessible and collaboration easier. The potential for mobile apps to transform existing software vendors is also noted. The majority of the document focuses on examples of developing mobile apps to enable drug discovery for tuberculosis.
Protégé4US: Harvesting Ontology Authoring Data with Protégé (robertstevens65)
This document describes Protégé4US, a tool that logs interaction data from users authoring ontologies in Protégé. A user study was conducted with Protégé4US to collect data from ontology authors performing tasks of varying complexity. The logged data was analyzed to identify patterns in how authors interacted with the tool, such as correlations between specific events like expanding the class hierarchy and longer task completion times, indicating problematic situations. Visualizations of the logged data showed common transitions between states and the authoring rhythm. The analysis of these visualizations across users helped sketch decision trees to understand the relationships between events. Future work includes more statistical analysis of patterns to identify authoring strategies.
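The kind of analysis described above can be sketched in a few lines. The log entries, event names, and users below are hypothetical (the real Protégé4US log format is not given in the document); the sketch shows how per-user completion times can be paired with counts of a specific event, such as expanding the class hierarchy, for later correlation analysis.

```python
from collections import Counter

# Hypothetical log entries in the spirit of Protégé4US output:
# (user, event, timestamp_in_seconds). Names and events are illustrative.
log = [
    ("u1", "ExpandHierarchy", 0), ("u1", "AddAxiom", 30), ("u1", "SaveWorkspace", 90),
    ("u2", "ExpandHierarchy", 0), ("u2", "ExpandHierarchy", 20),
    ("u2", "ExpandHierarchy", 160), ("u2", "AddAxiom", 300),
]

def task_stats(entries):
    """Per user: task completion time (last minus first timestamp) and
    how often the class hierarchy was expanded, the kind of pairing the
    study correlated with problematic authoring situations."""
    stats = {}
    for user, event, ts in entries:
        first, last, counts = stats.setdefault(user, [ts, ts, Counter()])
        stats[user][0] = min(first, ts)
        stats[user][1] = max(last, ts)
        counts[event] += 1
    return {u: {"completion": last - first,
                "expands": counts["ExpandHierarchy"]}
            for u, (first, last, counts) in stats.items()}
```

Feeding many users' logs through `task_stats` yields the (event count, completion time) pairs on which correlations and decision trees can be built.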
Lowell Tarek Abbott Vidal successfully completed The Data Scientist's Toolbox course from Johns Hopkins University on Coursera with distinction in June 2015. The course provided an overview of the data, questions, and tools used by data analysts and scientists, and introduced concepts for turning data into knowledge as well as practical tools like version control, markdown, git, GitHub, R, and RStudio. The course was instructed by Jeffrey Leek, Roger D. Peng, and Brian Caffo of Johns Hopkins Bloomberg School of Public Health.
Nicola Beddall-Hill is a PhD candidate studying informatics at City University London. She is attached to the TLRP TEL project Ensemble, and her research uses a retroductive approach to analyze 65 GB of ethnographic data capturing a student project that used technology to enhance learning. This data includes audio, video, photographs, focus groups, field notes, and GPS track logs. Her process involves organizing the raw data, sorting it into a database, creating a timeline of critical events, depositing anonymized events into Fedora, coding the anonymous events using theoretical concepts in ATLAS.ti, creating a model, and testing the model.
An introduction to Nowomics and how it helps biologists track new data and papers relevant to their research, with some background on how the site got started.
Jean-Claude Bradley presents the introductory lecture for Chemical Information Retrieval at Drexel University for Fall 2011, on September 23, 2011. Examples are given to demonstrate how difficult it can be to find and assess chemical information such as melting points. An overview of the class wiki is then given.
This resume is for Andrés Isaac Monterroso Cohen, who is seeking an entry-level biomedical engineering job with a focus on design and management. He has a Bachelor of Science in Biomedical Engineering from Worcester Polytechnic Institute, with a 3.70 GPA. His relevant experience includes research assistantships at Massachusetts General Hospital and UMass Memorial Medical Center. His current projects include designing a self-retractor for surgery; he previously designed a tremor assist device. Skills include proficiency with MATLAB, LabVIEW, Python, and various laboratory and surgical techniques. He has held leadership roles as fraternity president and in honor societies.
Aileen Cheng is seeking a position in biotech using computer science skills in software development, modeling, data analysis, and databases. She has a Bachelor of Science in Bioengineering and Computer Science from Caltech with relevant coursework and projects developing databases, algorithms, simulations, and more. Her experience includes software engineering at Cisco and internships developing assays and LIMS systems in biotech startups and research laboratories at Caltech and abroad.
This document provides a 7-step process for conducting research and writing a paper. It includes choosing a topic, finding basic information, refining the topic, forming a research question, developing keywords, locating and retrieving materials, and evaluating resources. It also lists several reference books and databases for conducting research and provides guidance on evaluating resources and citing sources using APA or MLA style.
Model Management in Systems Biology: Challenges – Approaches – Solutions (Martin Scharm)
I gave this talk as a webinar in the FAIRDOM webinar series 2016. The recordings of the webinar are available from http://fair-dom.org/knowledgehub/webinars-2/martin-scharm/
Sulaimon Isiaka has a B.S. in Engineering Physics and seeks a position applying his expertise in areas such as mathematical modeling, data analysis, and research assistance. He has over 3 years of experience as a student research assistant conducting quantitative and qualitative analysis. This includes working on the ALICE experiment at CERN to study quark-gluon plasma through heavy ion collisions. He is proficient in tools like MATLAB, Linux, and programming languages.
ALAMW14 Altmetrics Panel: Redefining Research Impact (William Gunn)
This document discusses new ways of measuring research impact beyond traditional citations. It describes how Mendeley collects data on researcher behavior directly from their platform to provide faster and more comprehensive metrics on researcher engagement. This includes data on document views, saves, annotations and more. It also discusses how this broader dataset could enable new services for stakeholders to better understand research impact and discovery.
Jean-Claude Bradley presents on March 30, 2011 at the American Chemical Society on Rapid Dissemination of Chemical Information for people and machines using Open Notebook Science.
Dawid Walas successfully completed The Data Scientist's Toolbox course from Johns Hopkins University on Coursera with distinction in March 2015. The course provided an overview of the conceptual ideas and practical tools used by data analysts and scientists, including version control, markdown, git, GitHub, R, and RStudio. The course was instructed by Jeffrey Leek, Roger Peng, and Brian Caffo of Johns Hopkins Bloomberg School of Public Health.
InCoB2016 FactPub: the open-access web platform for academic paper contents (Shun Shiku)
Scientific papers and their contents are in high demand. For example, the National Institutes of Health's PubMed Central provides free access to biomedical literature and has one million unique visitors daily; 40% of its visitors are private citizens rather than universities or companies. However, not all academic paper contents are accessible: paywalls prevent free distribution, locking up scientific knowledge.
FactPub (http://factpub.org) aims to give the public a way to access information beyond paper paywalls while respecting copyright law. FactPub is based on the 'idea-expression dichotomy': facts cannot be copyrighted, so their distribution is not bound by copyright licenses. Factify processes paragraphs into meaningful sentence-level facts using the Stanford CoreNLP library. Factify runs on the client side as a browser extension (Firefox/Chrome) to parse academic PDF papers. After Factify extracts facts from a paper, it uploads the 'facts' to factpub.org, generating a content page in wiki-text format. Large-scale adoption of this fact-publishing framework would improve access to health and other scientific research.
DISCLAIMER: The presenter is not a lawyer. Please consult qualified legal counsel if an opinion is required on legal matters.
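The extract-then-publish pipeline described in the abstract can be sketched as follows. This is a minimal illustration only: a naive regex sentence splitter stands in for the Stanford CoreNLP pipeline the real Factify extension uses, and the wiki-text layout is an assumption, not FactPub's actual format.

```python
import re

def extract_facts(paragraph, min_words=5):
    """Split a paragraph into candidate sentence-level 'facts'.
    A regex splitter on sentence-ending punctuation stands in for the
    Stanford CoreNLP sentence splitter; very short fragments are
    discarded as unlikely to be self-contained facts."""
    sentences = re.split(r"(?<=[.!?])\s+", paragraph.strip())
    return [s for s in sentences if len(s.split()) >= min_words]

def to_wikitext(title, facts):
    """Render extracted facts as a wiki-text page, roughly in the
    spirit of the pages FactPub generates (layout is an assumption)."""
    lines = [f"== {title} =="] + [f"* {fact}" for fact in facts]
    return "\n".join(lines)

facts = extract_facts(
    "Aspirin reduces fever in adult humans. Yes. "
    "It was first synthesized in 1897 by Felix Hoffmann.")
page = to_wikitext("Aspirin (example)", facts)
```

In the real system, this extraction step runs inside the browser extension on the client, and only the resulting facts, not the copyrighted expression of the paper, are uploaded.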
SERONTO is an ontology framework consisting of a core ontology and multiple domain ontologies that integrate with the core ontology. The document discusses the structure of SERONTO, including the core ontology, domain ontologies, and integration layers between them. It also addresses developing domain ontologies consistently within the SERONTO framework through an iterative process involving domain experts and the wider community.
The document discusses the Centre for Ecology & Hydrology's (CEH) trial of using electronic lab notebooks. It describes CEH's large-scale, long-term monitoring networks that generate significant data. CEH scientists traditionally used paper lab notebooks that have issues like not being accessible, searchable, or easily archived. The trial involved volunteers maintaining parallel paper and wiki-based electronic notebooks to evaluate features. The wiki format provided many benefits over paper, though additional functionality is still needed to fully replace paper notebooks. CEH aims to select a permanent electronic notebook solution to improve data management.
1) ALTER-NET is a network for long-term ecological research that needed to integrate data from many sources using SERONTO, a semantic framework and socio-ecological research ontology.
2) SERONTO includes a core ontology and domain ontologies for ecosystems, biodiversity, and socio-economics. It provides a common model for structuring ecological observations for data management.
3) Developing the ontologies required roles like working groups, experts, and a coordinator to create examples, document issues, and refine the process over time through workshops and a wiki decision forum. Clear coordination and documentation were essential.
Poster: Semantic data integration proof of concept (Nicolas Bertrand)
This document summarizes a proof of concept study that tested the ability to semantically integrate ecological data from different databases using the Socio-Ecological Research and Observation oNTOlogy (SERONTO). The study showed that SERONTO could successfully import database schemas and reference lists, map relations between database tables and SERONTO concepts, and allow complex queries across multiple connected databases from within SERONTO. However, maintaining mappings between reference lists and coupling value sets, units and calculations requires further work. Overall, the study demonstrated the feasibility of using SERONTO and semantic approaches to provide integrated access to distributed ecological data.
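The mapping idea the study tested can be illustrated with a toy sketch: two 'databases' with different local schemas are lifted into shared concept names and then queried uniformly. The column and concept names below are invented for illustration and are not actual SERONTO terms.

```python
# Two 'databases' with different local schemas.
db_a = [{"sp_name": "Quercus robur", "site": "Oak Wood", "yr": 2007}]
db_b = [{"species": "Quercus robur", "plot_id": "P12", "year": 2008}]

# Mappings from local column names to shared ontology concepts
# (concept names here are illustrative, not real SERONTO terms).
mapping_a = {"sp_name": "hasSpecies", "site": "hasLocation", "yr": "hasYear"}
mapping_b = {"species": "hasSpecies", "plot_id": "hasLocation", "year": "hasYear"}

def lift(rows, mapping):
    """Rewrite each row's keys into shared ontology concepts."""
    return [{mapping[k]: v for k, v in row.items()} for row in rows]

integrated = lift(db_a, mapping_a) + lift(db_b, mapping_b)

# A query phrased purely in ontology terms, with no knowledge of
# either database's underlying structure.
oaks = [r for r in integrated if r["hasSpecies"] == "Quercus robur"]
```

The point mirrors the study's finding: once the per-database mappings exist, queries need only the shared vocabulary, but each mapping (and its reference lists) must be maintained by hand.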
Semantic Data Integration of Biodiversity Data with the SERONTO Ontology (Nicolas Bertrand)
The document discusses using the SERONTO ontology to semantically integrate three biodiversity datasets. SERONTO was developed by the ALTER-Net Network of Excellence to allow interoperability across heterogeneous data resources from multiple institutions and scientific domains. The project aims to integrate the National Biodiversity Network Gateway, Countryside Survey, and Environmental Change Network datasets using SERONTO. This will create a distributed system that can be extended to integrate other biodiversity data, demonstrating SERONTO's ability to integrate biodiversity data at different scales.
The document discusses testing the use of the SERONTO ontology for semantic data integration of distributed ecological databases from ALTER-Net and LTER Europe. Five databases were independently mapped to SERONTO concepts and queries could be run across the integrated data without knowledge of the underlying database structures. However, the effort required for mapping was significant and maintaining reference lists will be crucial. More use cases are needed to fully evaluate SERONTO's potential for LTER data integration.
The Wegman Company is a full-service commercial furniture provider established in 1967 based in Cincinnati, Ohio with locations in Atlanta, Orlando, and Miami. They offer installation, warehousing, asset management, refurbishment, and design support services for all major manufacturers of commercial furniture. Their services also include project planning, uniformed installation teams, service contracts, art installation, competitive storage rates, shipping, inventory management, usage reports, design software, furniture specifications, and on-site assessments. They have a commercial upholstery maintenance program and work with many large corporate and nonprofit clients.
Jean-Claude Bradley presents on "Peer Review and Science2.0: blogs, wikis and social networking sites" as a guest lecturer for the "Peer Review Culture in Scholarly Publication and Grantmaking" course at Drexel University. The main thrust of the presentation is that peer review alone cannot cope with the increasing flood of scientific information being generated and shared. Arguments are made to show that providing sufficient proof for scientific findings does scale, and that doing so weakens the tragedy of the trusted-source cascade.
Jean-Claude Bradley presents at the Science Commons Symposium on Feb 20, 2010 at the Microsoft Campus in Redmond. The talk covers doing Open Notebook Science using free and hosted tools, including new archiving protocols developed with Andrew Lang.
Open PHACTS April 2017 Science webinar: Workflow tools (open_phacts)
This webinar discusses workflow tools to support life science research. It includes presentations on the Common Workflow Language (CWL) by Michael Crusoe and uses of Knime and Pipeline Pilot workflows with Open PHACTS examples. There will also be a panel discussion on the future of workflows for life science research with speakers from Eli Lilly, Janssen, and others. Example CWL workflows are shown to demonstrate portable life science workflows.
Results may vary: Collaborations Workshop, Oxford 2014 (Carole Goble)
Thoughts on computational science reproducibility with a focus on software. Given at the Software Sustainability Institute's 2014 Collaborations Workshop
Being Reproducible: SSBSS Summer School 2017 (Carole Goble)
Lecture 2:
Being Reproducible: Models, Research Objects and R* Brouhaha
Reproducibility is an R* minefield, depending on whether you are testing for robustness (rerun), defence (repeat), certification (replicate), comparison (reproduce) or transfer between researchers (reuse). Different forms of "R" make different demands on the completeness, depth and portability of research. Sharing is another minefield, raising concerns about credit and protection from sharp practices.
In practice the exchange, reuse and reproduction of scientific experiments is dependent on bundling and exchanging the experimental methods, computational codes, data, algorithms, workflows and so on along with the narrative. These "Research Objects" are not fixed, just as research is not “finished”: the codes fork, data is updated, algorithms are revised, workflows break, service updates are released. ResearchObject.org is an effort to systematically support more portable and reproducible research exchange.
In this talk I will explore these issues in more depth using the FAIRDOM Platform and its support for reproducible modelling. The talk will cover initiatives and technical issues, and raise social and cultural challenges.
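One concrete way to bundle such a Research Object is a machine-readable manifest alongside the files. The sketch below emits a minimal RO-Crate-style JSON-LD manifest (RO-Crate is one serialization of the research object idea); the file names are invented examples, and a real crate would carry much richer metadata.

```python
import json

# A minimal RO-Crate-style manifest bundling code, data and workflow
# alongside the narrative (file names here are invented examples).
crate = {
    "@context": "https://w3id.org/ro/crate/1.1/context",
    "@graph": [
        {"@id": "ro-crate-metadata.json",
         "@type": "CreativeWork",
         "about": {"@id": "./"}},
        {"@id": "./",
         "@type": "Dataset",
         "hasPart": [{"@id": "analysis.py"},
                     {"@id": "results.csv"},
                     {"@id": "workflow.cwl"}]},
        {"@id": "analysis.py", "@type": "SoftwareSourceCode"},
        {"@id": "results.csv", "@type": "File"},
        {"@id": "workflow.cwl", "@type": ["File", "SoftwareSourceCode"]},
    ],
}

manifest = json.dumps(crate, indent=2)
```

Because the manifest is plain JSON-LD, it can travel with the bundle as files fork and versions change, which is exactly the portability problem the Research Object work targets.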
The document discusses the evolution of science and research from the 1940s to present day. It notes Vannevar Bush's 1945 concerns about the growing mountain of research that scientists did not have time to fully understand or remember. It then discusses the current "data explosion" and challenges of accessing, sharing, and building on increasingly large amounts of data and research. The document advocates for reusable, reproducible, and transparent science through connected resources and environments that facilitate collaboration and knowledge sharing.
Open Research: Manchester leading and learning (Carole Goble)
Open and FAIR science has international momentum. Large-scale communities are striving to build and manage the digital infrastructure needed for scientists to be as open as possible and as closed as necessary, as expected by the NIH, OECD, UNESCO and the EC. ELIXIR is one such research infrastructure in Europe for the Life Sciences. This talk will highlight two of ELIXIR's Open Science resources, built by Open Science communities to enable life science researchers to be open, and led by Manchester. How can we learn from these and bring their practices to Manchester?
Launch: Manchester Office for Open Research, 4th April 2022
https://www.openresearch.manchester.ac.uk/
This document discusses provenance and collaboration in science. It presents use cases in astronomy, biology, and other disciplines to illustrate challenges around data packaging, preservation, retrieval and reuse of scientific workflows. These include dealing with large datasets, versioning data from external sources, and understanding and reusing other researchers' workflows. The role of research objects and linked data for supporting provenance, identity, context and the lifecycle of scientific work is also examined.
The document proposes an Earth Science Collaboratory (ESC) that would provide access to Earth science models, data, tools, and services to facilitate collaboration and reproducibility in data-intensive Earth science research. It describes the current fragmented state of accessing and sharing models, data, tools, and knowledge. The ESC would integrate these components and provide services like cloud computing, discovery, and provenance tracking. It presents a use case of how the ESC could help collaboration in the development of precipitation retrieval algorithms for the Global Precipitation Measurement mission.
This document introduces FAIRDOM, a consortium that provides a platform and services to help researchers organize, manage, share, and preserve research outputs according to FAIR principles. FAIRDOM has been in operation for 10 years and has over 50 installations supporting over 118 projects. It provides tools and services to help researchers collaborate better and integrate their data, models, publications and other research objects. FAIRDOM also works with other organizations and infrastructure providers to support broader research initiatives.
The document discusses the GeoChronos project, which uses the Elgg platform to facilitate collaboration between earth observation scientists. Key features of the GeoChronos portal include interactive application services that allow scientists to access tools remotely, and spectral libraries that help scientists share and annotate spectral data. Elgg provides profiles, groups, forums and other features, and custom plugins were developed for applications, libraries, and other scientific needs. The portal has been successfully used for collaboration and teaching.
Aspects of Reproducibility in Earth Science (Raul Palma)
The document discusses aspects of reproducibility in earth science research within the European Virtual Environment for Research - Earth Science Themes (EVEREST) project. The key objectives of EVEREST are to establish an e-infrastructure to facilitate collaborative earth science research through shared data, models, and workflows. Research Objects (ROs) will be used to capture and share workflows, processes, and results to help ensure reproducibility and preservation of earth science research. An example RO is described for mapping volcano deformation using satellite imagery and other data sources. Issues around reproducibility related to data access, software dependencies, and manual intervention in workflows are also discussed.
Open access for researchers, policy makers and research managers - Short ver... (Iryna Kuchma)
Presented at Open Access: Maximising Research Impact, April 23 2009, New Bulgarian University Library, Sofia. Open access for researchers: an enlarged audience, citation impact, tenure and promotion. Open access for policy makers and research managers: new tools to manage a university's image and impact. How to maximise the visibility of research publications, improve the impact and influence of the work, disseminate research results, showcase the quality of research in universities and research institutions, better measure and manage research in the institution, collect and curate digital outputs, generate new knowledge from existing findings, enable and encourage collaboration, and bring savings to the higher education sector and a better return on investment. What are the key functions for research libraries?
Being FAIR: FAIR data and model management, SSBSS 2017 Summer School (Carole Goble)
Lecture 1:
Being FAIR: FAIR data and model management
In recent years we have seen a change in expectations for the management of all the outcomes of research – that is, the "assets" of data, models, codes, SOPs and workflows. The "FAIR" (Findable, Accessible, Interoperable, Reusable) Guiding Principles for scientific data management and stewardship [1] have proved to be an effective rallying cry. Funding agencies expect data (and increasingly software) management, retention and access plans. Journals are raising their expectations of the availability of data and codes for pre- and post-publication. The multi-component, multi-disciplinary nature of Systems and Synthetic Biology demands the interlinking and exchange of assets and the systematic recording of metadata for their interpretation.
Our FAIRDOM project (http://www.fair-dom.org) supports Systems Biology research projects with their research data, methods and model management, with an emphasis on standards smuggled in by stealth and sensitivity to asset sharing and credit anxiety. The FAIRDOM Platform has been installed by over 30 labs or projects. Our public, centrally hosted Asset Commons, the FAIRDOMHub.org, supports the outcomes of 50+ projects.
Now established as a grassroots association, FAIRDOM has over 8 years of experience of practical asset sharing and data infrastructure at the researcher coal-face ranging across European programmes (SysMO and ERASysAPP ERANets), national initiatives (Germany's de.NBI and Systems Medicine of the Liver; Norway's Digital Life) and European Research Infrastructures (ISBE) as well as in PI's labs and Centres such as the SynBioChem Centre at Manchester.
In this talk I will show how FAIRDOM has been designed to support Systems Biology projects, with examples of its configuration and use. I will also explore the technical and social challenges we face.
I will also refer to European efforts to support public archives for the life sciences. ELIXIR (http://www.elixir-europe.org/) is the European Research Infrastructure of 21 national nodes and a hub, funded by national agreements to coordinate and sustain key data repositories and archives for the Life Science community, improve access to them and related tools, support training, and create a platform for dataset interoperability. As Head of the ELIXIR-UK Node and co-lead of the ELIXIR Interoperability Platform, I will show how this work relates to your projects.
[1] Wilkinson et al., "The FAIR Guiding Principles for scientific data management and stewardship", Scientific Data 3, doi:10.1038/sdata.2016.18
The document discusses how universities can maximize research output through open access repositories and metrics. It argues that by mandating that researchers deposit their work in institutional repositories, universities can provide open access to 100% of research articles. This maximizes the visibility, usage, and impact of the research and provides competitive advantages for universities that adopt open access mandates early on. Open access is achieved through "green open access self-archiving," where authors deposit their final, peer-reviewed manuscripts in institutional repositories.
Keynote on software sustainability given at the 2nd Annual Netherlands eScience Symposium, November 2014.
Based on the article:
Carole Goble, "Better Software, Better Research", IEEE Internet Computing, vol. 18, no. 5, pp. 4-8, Sept.-Oct. 2014, IEEE Computer Society.
http://www.computer.org/csdl/mags/ic/2014/05/mic2014050004.pdf
http://doi.ieeecomputersociety.org/10.1109/MIC.2014.88
http://www.software.ac.uk/resources/publications/better-software-better-research
Wf4Ever: Scientific Workflows and Research Objects as tools for scientific in... (Joint ALMA Observatory)
The document outlines current challenges in radio astronomy and potential solutions using scientific workflows and research objects. It introduces the speaker and their background and interests in bringing computational tools and the virtual observatory to radio astronomy. Specific challenges discussed include an overabundance of data that is difficult to find, document, share and reproduce. The talk proposes that workflows and research objects could help address these issues by defining computations and dependencies, enabling distributed and interactive computing, and providing tools for workflow storage, discovery and provenance.
AI 101: An Introduction to the Basics and Impact of Artificial Intelligence (IndexBug)
Imagine a world where machines not only perform tasks but also learn, adapt, and make decisions. This is the promise of Artificial Intelligence (AI), a technology that's not just enhancing our lives but revolutionizing entire industries.
GraphRAG for Life Science to increase LLM accuracy (Tomaz Bratanic)
GraphRAG for the life science domain: retrieving information from biomedical knowledge graphs with LLMs to increase the accuracy and performance of generated answers.
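The retrieve-then-generate pattern can be sketched with a toy in-memory graph standing in for a real biomedical knowledge graph (which in practice would be queried, e.g., with Cypher against Neo4j). The triples, entity names and prompt template below are invented for illustration.

```python
# Toy in-memory 'knowledge graph' of (subject, relation, object) triples.
# Entities and relations are invented examples, not a real dataset.
triples = [
    ("aspirin", "INHIBITS", "COX-1"),
    ("aspirin", "INHIBITS", "COX-2"),
    ("ibuprofen", "INHIBITS", "COX-2"),
    ("COX-2", "INVOLVED_IN", "inflammation"),
]

def retrieve(entity: str) -> list[str]:
    """Fetch graph facts mentioning the entity, as plain sentences."""
    return [f"{s} {r.lower().replace('_', ' ')} {o}"
            for s, r, o in triples if entity in (s, o)]

def grounded_prompt(question: str, entity: str) -> str:
    """Assemble an LLM prompt grounded in the retrieved graph facts."""
    context = "\n".join(f"- {fact}" for fact in retrieve(entity))
    return f"Answer using only these facts:\n{context}\n\nQ: {question}"

prompt = grounded_prompt("What does aspirin inhibit?", "aspirin")
```

Grounding the model in explicitly retrieved graph facts, rather than free-text chunks, is what the GraphRAG approach credits for the accuracy gain.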
GraphSummit Singapore | The Future of Agility: Supercharging Digital Transfor... (Neo4j)
Leonard Jayamohan, Partner & Generative AI Lead, Deloitte
This keynote will reveal how Deloitte leverages Neo4j’s graph power for groundbreaking digital twin solutions, achieving a staggering 100x performance boost. Discover the essential role knowledge graphs play in successful generative AI implementations. Plus, get an exclusive look at an innovative Neo4j + Generative AI solution Deloitte is developing in-house.
Removing Uninteresting Bytes in Software Fuzzing (Aftab Hussain)
Imagine a world where software fuzzing, the process of mutating bytes in test seeds to uncover hidden and erroneous program behaviors, becomes faster and more effective. A lot depends on the initial seeds, which can significantly dictate the trajectory of a fuzzing campaign, particularly in terms of how long it takes to uncover interesting behaviour in your code. We introduce DIAR, a technique designed to speedup fuzzing campaigns by pinpointing and eliminating those uninteresting bytes in the seeds. Picture this: instead of wasting valuable resources on meaningless mutations in large, bloated seeds, DIAR removes the unnecessary bytes, streamlining the entire process.
In this work, we equipped AFL, a popular fuzzer, with DIAR and examined two critical Linux libraries: libxml's xmllint, a tool for parsing XML documents, and Binutils' readelf, an essential debugging and security analysis command-line tool used to display detailed information about ELF (Executable and Linkable Format) files. Our preliminary results show that AFL+DIAR not only discovers new paths more quickly but also achieves higher coverage overall. This work thus showcases how starting with lean and optimized seeds can lead to faster, more comprehensive fuzzing campaigns -- and DIAR helps you find such seeds.
- These are slides of the talk given at the IEEE International Conference on Software Testing, Verification and Validation Workshops (ICSTW), 2022.
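The underlying idea (drop seed bytes whose removal leaves observed coverage unchanged) can be sketched as follows. This is a generic illustration, not DIAR's actual analysis; the coverage function below is a toy stand-in for real instrumentation of a program under test.

```python
def coverage(data: bytes) -> frozenset:
    """Toy stand-in for instrumented coverage of a parser under test."""
    edges = set()
    if data.startswith(b"<"):
        edges.add("open_tag")
        if b">" in data:
            edges.add("close_tag")
    if b"&" in data:
        edges.add("entity")
    return frozenset(edges)

def trim_seed(seed: bytes) -> bytes:
    """Greedily drop bytes whose removal leaves coverage unchanged."""
    baseline = coverage(seed)
    i = 0
    while i < len(seed):
        candidate = seed[:i] + seed[i + 1:]
        if coverage(candidate) == baseline:
            seed = candidate   # byte was uninteresting: remove it
        else:
            i += 1             # byte matters: keep it, move on
    return seed

lean = trim_seed(b"<html>hello &amp; world</html>")
```

On this toy seed the trimmer keeps only the three bytes the coverage function actually reacts to, shrinking the seed while preserving its coverage; a real fuzzing harness would make the same trade at far larger scale.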
Programming Foundation Models with DSPy - Meetup Slides (Zilliz)
Prompting language models is hard, while programming language models is easy. In this talk, I will discuss the state-of-the-art framework DSPy for programming foundation models with its powerful optimizers and runtime constraint system.
Best 20 SEO Techniques To Improve Website Visibility In SERP (Pixlogix Infotech)
Boost your website's visibility with proven SEO techniques! Our latest blog dives into essential strategies to enhance your online presence, increase traffic, and rank higher on search engines. From keyword optimization to quality content creation, learn how to make your site stand out in the crowded digital landscape. Discover actionable tips and expert insights to elevate your SEO game.
Full-RAG: A modern architecture for hyper-personalization (Zilliz)
Mike Del Balso, CEO & Co-Founder at Tecton, presents "Full RAG," a novel approach to AI recommendation systems, aiming to push beyond the limitations of traditional models through a deep integration of contextual insights and real-time data, leveraging the Retrieval-Augmented Generation architecture. This talk will outline Full RAG's potential to significantly enhance personalization, address engineering challenges such as data management and model training, and introduce data enrichment with reranking as a key solution. Attendees will gain crucial insights into the importance of hyperpersonalization in AI, the capabilities of Full RAG for advanced personalization, and strategies for managing complex data integrations for deploying cutting-edge AI solutions.
Driving Business Innovation: Latest Generative AI Advancements & Success Story (Safe Software)
Are you ready to revolutionize how you handle data? Join us for a webinar where we’ll bring you up to speed with the latest advancements in Generative AI technology and discover how leveraging FME with tools from giants like Google Gemini, Amazon, and Microsoft OpenAI can supercharge your workflow efficiency.
During the hour, we’ll take you through:
Guest Speaker Segment with Hannah Barrington: Dive into the world of dynamic real estate marketing with Hannah, the Marketing Manager at Workspace Group. Hear firsthand how their team generates engaging descriptions for thousands of office units by integrating diverse data sources—from PDF floorplans to web pages—using FME transformers, like OpenAIVisionConnector and AnthropicVisionConnector. This use case will show you how GenAI can streamline content creation for marketing across the board.
Ollama Use Case: Learn how Scenario Specialist Dmitri Bagh has utilized Ollama within FME to input data, create custom models, and enhance security protocols. This segment will include demos to illustrate the full capabilities of FME in AI-driven processes.
Custom AI Models: Discover how to leverage FME to build personalized AI models using your data. Whether it’s populating a model with local data for added security or integrating public AI tools, find out how FME facilitates a versatile and secure approach to AI.
We’ll wrap up with a live Q&A session where you can engage with our experts on your specific use cases, and learn more about optimizing your data workflows with AI.
This webinar is ideal for professionals seeking to harness the power of AI within their data management systems while ensuring high levels of customization and security. Whether you're a novice or an expert, gain actionable insights and strategies to elevate your data processes. Join us to see how FME and AI can revolutionize how you work with data!
HCL Notes and Domino License Cost Reduction in the World of DLAU (panagenda)
Webinar Recording: https://www.panagenda.com/webinars/hcl-notes-and-domino-license-cost-reduction-in-the-world-of-dlau/
The introduction of DLAU and the CCB & CCX licensing model caused quite a stir in the HCL community. As a Notes and Domino customer, you may have faced challenges with unexpected user counts and license costs. You probably have questions on how this new licensing approach works and how to benefit from it. Most importantly, you likely have budget constraints and want to save money where possible. Don’t worry, we can help with all of this!
We’ll show you how to fix common misconfigurations that cause higher-than-expected user counts, and how to identify accounts which you can deactivate to save money. There are also frequent patterns that can cause unnecessary cost, like using a person document instead of a mail-in for shared mailboxes. We’ll provide examples and solutions for those as well. And naturally we’ll explain the new licensing model.
Join HCL Ambassador Marc Thomas in this webinar with a special guest appearance from Franz Walder. It will give you the tools and know-how to stay on top of what is going on with Domino licensing. You will be able to lower your costs through an optimized configuration and keep them low going forward.
These topics will be covered
- Reducing license cost by finding and fixing misconfigurations and superfluous accounts
- How do CCB and CCX licenses really work?
- Understanding the DLAU tool and how to best utilize it
- Tips for common problem areas, like team mailboxes, functional/test users, etc.
- Practical examples and best practices to implement right away
How to Get CNIC Information System with Paksim Ga.pptx (danishmna97)
Pakdata Cf is a groundbreaking system designed to streamline and facilitate access to CNIC information. This innovative platform leverages advanced technology to provide users with efficient and secure access to their CNIC details.
Sudheer Mechineni, Head of Application Frameworks, Standard Chartered Bank
Discover how Standard Chartered Bank harnessed the power of Neo4j to transform complex data access challenges into a dynamic, scalable graph database solution. This keynote will cover their journey from initial adoption to deploying a fully automated, enterprise-grade causal cluster, highlighting key strategies for modelling organisational changes and ensuring robust disaster recovery. Learn how these innovations have not only enhanced Standard Chartered Bank’s data infrastructure but also positioned them as pioneers in the banking sector’s adoption of graph technology.
Observability Concepts EVERY Developer Should Know -- DeveloperWeek Europe.pdf (Paige Cruz)
Monitoring and observability aren’t traditionally found in software curriculums and many of us cobble this knowledge together from whatever vendor or ecosystem we were first introduced to and whatever is a part of your current company’s observability stack.
While the dev and ops silo continues to crumble, many organizations still relegate monitoring and observability to the purview of ops, infra and SRE teams. This is a mistake: achieving a highly observable system requires collaboration up and down the stack.
I, a former op, would like to extend an invitation to all application developers to join the observability party, and will share foundational concepts to build on.
Let's Integrate MuleSoft RPA, COMPOSER, APM with AWS IDP along with Slack (shyamraj55)
Discover the seamless integration of RPA (Robotic Process Automation), COMPOSER, and APM with AWS IDP enhanced with Slack notifications. Explore how these technologies converge to streamline workflows, optimize performance, and ensure secure access, all while leveraging the power of AWS IDP and real-time communication via Slack notifications.
Building Production Ready Search Pipelines with Spark and Milvus (Zilliz)
Spark is a widely used ETL tool for processing, indexing and ingesting data into a serving stack for search. Milvus is a production-ready open-source vector database. In this talk we will show how to use Spark to process unstructured data, extract vector representations, and push the vectors to the Milvus vector database for search serving.
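The serving side of such a pipeline can be sketched with a toy example: a bag-of-words 'embedding' and a brute-force top-k search stand in for a real embedding model and a Milvus collection. Everything below is invented for illustration.

```python
import math
from collections import Counter

def embed(text: str) -> Counter:
    """Toy bag-of-words 'embedding', standing in for a real model."""
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[k] * b.get(k, 0) for k in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

# 'Index' the corpus -- in the talk's architecture this step is a
# Spark job writing vectors into a Milvus collection.
corpus = ["spark processes unstructured data",
          "milvus serves vector search",
          "cats sleep all day"]
index = [(doc, embed(doc)) for doc in corpus]

def search(query: str, k: int = 2) -> list[str]:
    """Brute-force top-k search, standing in for a Milvus query."""
    q = embed(query)
    ranked = sorted(index, key=lambda d: cosine(q, d[1]), reverse=True)
    return [doc for doc, _ in ranked[:k]]

hits = search("vector search serving")
```

A production deployment replaces the toy embedding with a learned model and the brute-force scan with Milvus's approximate nearest-neighbour indexes, but the indexing/search split is the same.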
Volunteers:
1. Darren Sleep (QA Manager, ECP Section, Lancaster) - dsleep
2. Bob Possee (Head of Site, Virologist, Oxford) - rdpo
3. Rob Griffiths (Microbiologist, Ecologist, Oxford) - rig
4. Andrew Worgan (GIS / Organic Chemistry, Dorset) - adpw - start date: 06 July 2006
5. Jonathan Evans (Hydrologist, Wallingford) - jge - start date: 24 July 2006 - continuing to use lab book despite end of pilot
6. David Wilson (ECP, Lancaster) - drwi - start date: 26 September 2006
7. David Williams (Field Support / Laboratory Technician, Bangor)
8. Mark Bescoby (Radiochemist, Lancaster) - start date: 26 September 2006
9. Neville Llewellyn (Organic analytical chemist, Wallingford)
10. Thien Ho (Virologist, Oxford)
11. Gillian Ainsworth (Environmental Chemist, Lancaster) - start date: 12 July 2006
12. Susan Brown (Microbiologist (amoeba), Dorset) - start date: 17 July 2006
13. Jan Dick (Landscape restoration ecologist, Edinburgh) - pulled out on 15 September 2006
14. Richard Ellis (Climate Modeller, Wallingford) - start date: 21 September 2006 - continuing to use lab book despite end of pilot
15. Fai Fung (Hydrologist, Wallingford) - start date: 23 September 2006
In the lab: digital pens; digital lab book upload.