Alex Drijver (ChemAxon) provides an overview of this potential Pistoia Alliance working group during the "Dragons' Den" session at the Pistoia Alliance Conference in Boston, MA, on April 24, 2012.
The document summarizes findings from a survey on research data management practices. Some key findings include:
- 17% of researchers had lost data due to issues like hardware failure and human error.
- 68% of researchers currently share or plan to share their data. Main motivations for sharing include funder requirements and increasing citation/impact.
- Only 16% of researchers currently use university research data management support services, indicating a need to improve outreach and support.
- 41% of researchers hold some type of sensitive data like patient or personal information, underscoring the need for secure data storage and sharing policies.
Grampian safe haven, research data network – Jisc RDM
"Safe havens" should be developed as an environment for population-based research where the risk of identifying individuals is minimized. Researchers in safe havens are bound by strict confidentiality codes preventing disclosure of personally identifying information and providing sanctions for breaches of confidentiality.
The document discusses Frictionless Data, an initiative by the Open Knowledge Foundation to make research data easier to share, consume, and analyze. It aims to introduce standards and tools to "containerize" datasets using simple specifications like Tabular Data Package, making data easier to find, integrate into tools and platforms, analyze, and keep at a consistent level of quality. It discusses problems such as the lack of standards, presents tools for validating datasets, and gives examples of early implementations that integrate validation checks and continuous validation.
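The "containerization" idea above can be sketched concretely. The following is a minimal, standard-library-only illustration: a `datapackage.json` descriptor declares a CSV resource and its schema, and a small validator checks every row against the declared field types. The real tooling is the Frictionless Data libraries; the descriptor layout here only follows their specification in outline, and the validator is a toy stand-in, not the actual implementation.

```python
# Sketch of the Tabular Data Package idea: a datapackage.json descriptor
# "containerizes" a CSV, and a validator checks rows against the declared
# schema. Illustrative only; the real checks live in the Frictionless tools.
import csv
import io
import json

descriptor = json.loads("""
{
  "name": "example-package",
  "resources": [{
    "name": "records",
    "path": "data.csv",
    "schema": {
      "fields": [
        {"name": "id", "type": "integer"},
        {"name": "name", "type": "string"}
      ]
    }
  }]
}
""")

data_csv = "id,name\n1,alpha\n2,beta\n"

def validate_resource(resource, text):
    """Check every row of a CSV against the resource's declared schema."""
    fields = resource["schema"]["fields"]
    errors = []
    # Data rows start on physical line 2, after the header.
    for lineno, row in enumerate(csv.DictReader(io.StringIO(text)), start=2):
        for field in fields:
            value = row.get(field["name"])
            if value is None:
                errors.append(f"line {lineno}: missing {field['name']}")
            elif field["type"] == "integer" and not value.lstrip("-").isdigit():
                errors.append(f"line {lineno}: {field['name']}={value!r} not an integer")
    return errors

errors = validate_resource(descriptor["resources"][0], data_csv)
print(errors)  # → [] (the package validates cleanly)
```

Continuous validation, as described in the summary, amounts to re-running a check like this on every change to the data.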
Standardising research data policies, research data network – Jisc RDM
The document discusses standardizing research data policies across journals. It describes an expert group working to develop templates and guidance for data policies. It also discusses a collaboration to implement the Joint Declaration of Data Citation Principles. The group is working with Springer Nature to help standardize their data policies across journals into four main types. The goal is to improve data sharing, citation and reuse.
Rots RDAP11 Data Archives in Federal Agencies – ASIS&T
Arnold Rots, VAO; Data Archives in Federal Agencies; RDAP11 Summit
The 2nd Research Data Access and Preservation (RDAP) Summit
An ASIS&T Summit
March 31-April 1, 2011 Denver, CO
In cooperation with the Coalition for Networked Information
http://asist.org/Conferences/RDAP11/index.html
This presentation reviewed the challenges in identifying, acquiring and utilizing research data in relation to an evolving data market. Strategic solutions were examined in which the FAIR principles play a key role in the future of data management.
The webinar discussed Jisc's proposal for a Research Data Shared Service (RDSS) to address issues with research data management across UK higher education institutions. The RDSS would provide cost-effective solutions for depositing, describing, storing, publishing, and preserving research data through standardized technology and shared expertise. An alpha version was being piloted with 16 institutions and would include repository, preservation, and advisory services. The goal was to increase access to and reuse of research data while reducing costs and risks for institutions.
Big Data in Pediatric Critical Care by Mohit Mehra – Data Con LA
Abstract: There is an urgent need in pediatric ICUs to collect, store, and transform healthcare data to make accurate and timely predictions about patient outcomes and treatment recommendations. We are currently heavily invested in using open-source big data stacks to achieve this goal and help our young ones. In this talk I will highlight how we manage structured and unstructured high-frequency data generated from a disparate set of devices and systems, and ultimately how we have created data pipelines to process the data and make it available to data scientists and app developers.
Prototype SDX Bioinformatics Exchange: Demonstrating an Essential Use-Case fo... – US-Ignite
Robert Grossman – University of Chicago
Joe Mambretti – Northwestern University
Piers Nash – University of Chicago
Jim Chen – Northwestern University
Allison Heath – University of Chicago
Comeaux RDAP11 Data Archives in Federal Agencies – ASIS&T
Joey Comeaux, CICL RDA; Data Archives in Federal Agencies
The 2nd Research Data Access and Preservation (RDAP) Summit
An ASIS&T Summit
March 31-April 1, 2011 Denver, CO
In cooperation with the Coalition for Networked Information
http://asist.org/Conferences/RDAP11/index.html
This document summarizes Helen Henderson's presentation on institutional identifiers. It discusses existing standards like ONIX, COUNTER, and ISSN, as well as new standards being developed like KBART, Project TRANSFER, and CORE. It outlines several scenarios where institutional identifiers could be used, such as in the electronic resources supply chain, eLearning, research funding, and author registries. It describes the stakeholders involved in each scenario and key issues to address. Finally, it provides the timeline and work plan for the NISO working group developing a new institutional identifier standard.
Implementing figshare, research data network – Jisc RDM
Implementing figshare and engaging researchers,
Research data network, September 2016, Georgina Parsons, Cranfield University and Megan Hardeman, figshare.
Stop press: should embargo conditions apply to metadata? – Jisc RDM
Sarah Middle of Cambridge University discusses whether embargo conditions should apply to metadata. Session held at the Research Data Network event in May 2016, Cardiff University.
Why does research data matter to libraries? – Jisc RDM
- Research data matters to libraries because it is increasingly being produced and collected by researchers, and there are growing requirements to manage and preserve it.
- A survey found that while most researchers currently manage their own data, there is a trend toward using institutional repositories and libraries more for long-term preservation.
- Libraries are well-suited to help with research data management because of their experience organizing and describing information over long periods of time, but there are also challenges due to differences across disciplines in how data is defined and treated.
- As funders and journals require better data sharing practices, libraries have an opportunity to take a more active role in helping researchers and institutions capture, describe, and manage research data over time.
The document discusses a global initiative to facilitate open access to scholarly resources and research data across boundaries by building a federation of registries. It provides use cases of how such a system could help postgraduate students, research project leaders, administrators, and ICT specialists discover and monitor globally accessible data relevant to their work. The proposed strategy is to create a "Register of Registries" that would enable consistent discovery services for finding data in collections through a standardized, interoperable model. An initial scoping meeting was held in 2007 and annual meetings since to develop the strategy.
Real-World Data Challenges: Moving Towards Richer Data Ecosystems – Anita de Waard
The document discusses trends in scientific data repositories and ecosystems. It notes that repositories are becoming more like virtual laboratories where scientists can conduct research. It also discusses how artificial intelligence and machine learning are being used to complement human discovery and analysis of large and complex datasets. The document raises several challenges around issues such as data ownership, rewards for data sharing and software development, and the roles of various stakeholders in research data management.
In early 2014, we asked science and social science researchers...
• What expectations do the terms publication and peer review raise in reference to data?
• What features would be useful to evaluate the trustworthiness, evaluate the impact, and enhance the prestige of a data publication?
The dog who caught the car: There’s more PEPFAR data than ever before, now what? – MEASURE Evaluation
This document summarizes a presentation about using PEPFAR (President's Emergency Plan for AIDS Relief) data more effectively. It describes over a dozen years of USAID investments in PEPFAR data infrastructure, totaling nearly $1.9 billion. This includes indicators to support reporting systems, the DATIM data management system, and improved site capacity. With more data available than ever, the presentation discusses how it could be used to more effectively target HIV/AIDS services, identify priority sites, and better understand the epidemic context.
Notes taken to support breakout discussion of possible business models necessary to support the information ecosystem in life science R&D during the Pistoia Alliance Information Ecosystem Workshop in October 2011.
Richard Bolton (GSK and Pistoia's ELN query services workstream coordinator) discusses the Alliance's chemistry strategy, which includes ELN query standards, hosted ELN, and chemistry externalization facilitation.
David Klatte (Pfizer) presented on this potential new working group during the "Dragons' Den" portion of the Pistoia Alliance Conference in Boston, MA, on April 24, 2012.
Towards a brokering framework for knowledge-based services: Learning from the... – Pistoia Alliance
Ian Harrow, co-leader of the Pistoia Alliance SESL pilot, describes the vision for the SESL pilot, the outcomes, and the project's future. The presentation at the 2011 BioITWorld Conference and Expo included a link to the SESL public demonstrator.
The Pistoia Alliance Information Ecosystem Workshop – Pistoia Alliance
Michael Braxenthaler, president of the Pistoia Alliance, introduces the concept of the information ecosystem in life science research and discusses the role the Pistoia Alliance can play within this ecosystem. The workshop occurred in October 2011.
The Pistoia Alliance Conference in April 2011 included a series of 10-minute "lightning talks" from vendors about what they think pharma will look like in 2020. This presentation was delivered by Rajiv Sabharwal of Infosys.
Presentation delivered at the annual general meeting of Pistoia members. Describes the results of board member elections, the state of the Alliance's project portfolio, progress over the past year, and insights from new member Constellation Technologies about why they joined the Alliance.
Resource Description Framework Approach to Data Publication and Federation – Pistoia Alliance
Bob Stanley, CEO of IO Informatics, explains the utility of RDF as a standard way of defining and redefining data for managing life science information.
The Pistoia Alliance Biology Domain Strategy April 2011 – Pistoia Alliance
Michael Braxenthaler (Roche and external liaison officer for Pistoia) describes the Pistoia Alliance biology domain strategy at the first Pistoia Alliance Conference in April 2011.
The Pistoia Alliance: Strategy, Progress, Momentum – Pistoia Alliance
Pistoia Alliance Board Member Ramesh Durvasula of BMS provides an overview of the Pistoia Alliance and project status at the BioITWorld Expo in Boston on April 13, 2011.
The Pistoia Alliance Conference in April 2011 included a series of 10-minute "lightning talks" from vendors about what they think pharma will look like in 2020. This presentation was delivered by Richard Resnick of GenomeQuest (and yes, this 41-slide talk was over in just 8 minutes!)
The Pistoia Alliance Conference in April 2011 included a series of 10-minute "lightning talks" from vendors about what they think pharma will look like in 2020. This presentation was delivered by Kevin Lustig of Assay Depot.
Presentation by Simon Thornber, lead of the Pistoia Alliance sequence services working group, about the RFP issued for the second phase of the project.
Collaborative Drug Discovery -- Life Science Collaboration & Virtualization: ... – Pistoia Alliance
The Pistoia Alliance Conference in April 2011 included a series of 10-minute "lightning talks" from vendors about what they think pharma will look like in 2020. This presentation was delivered by Sean Ekins of Collaborative Drug Discovery.
Nick Lynch, president of the Pistoia Alliance, delivered this presentation summarizing the mission of the Alliance, its current deliverables and progress, and its strategy for the next several years.
This document provides an overview of C-13 NMR spectroscopy. It discusses the background of C-13 NMR, how it compares to H-1 NMR, chemical shifts, coupling and decoupling techniques, the NOE effect, Fourier transform methods, and examples of restricted rotation. The key points covered are:
- C-13 NMR provides information about chemically nonequivalent nuclei and their chemical environments. It differs from H-1 NMR in abundance, chemical shift range, and coupling.
- Tetramethylsilane (TMS) is used as the reference standard for both H-1 and C-13 NMR.
- Coupling between nuclei allows their environments to be determined, while decoupling provides separate spectra for each nucleus.
This document provides an overview of proton NMR spectroscopy. It begins with definitions of light and the electromagnetic spectrum. It then discusses spectroscopy in general and introduces NMR, focusing on proton NMR. The key concepts of proton NMR covered include its principle, instrumentation, chemical shifts, spin-spin splitting, deuterium exchange, and the n+1 rule. Applications discussed include distinguishing isomers, determining molecular weight, and studying tautomeric mixtures. Clinical, agricultural, and biological applications are also mentioned.
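The n+1 rule mentioned in that overview can be illustrated numerically: a signal from a nucleus with n equivalent neighboring protons splits into n + 1 peaks, with relative intensities given by the binomial coefficients (Pascal's triangle). A minimal sketch, with the function name my own:

```python
# n+1 rule from proton NMR: n equivalent neighbors split a signal into
# n + 1 peaks whose relative intensities follow Pascal's triangle.
from math import comb

def multiplet_intensities(n_neighbors):
    """Relative peak intensities for a nucleus with n equivalent neighbors."""
    return [comb(n_neighbors, k) for k in range(n_neighbors + 1)]

print(multiplet_intensities(1))  # → [1, 1]        (doublet)
print(multiplet_intensities(2))  # → [1, 2, 1]     (triplet)
print(multiplet_intensities(3))  # → [1, 3, 3, 1]  (quartet)
```

So, for example, the CH3 protons of ethanol appear as a triplet (split by the two CH2 protons) with a 1:2:1 intensity pattern.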
This document discusses challenges and potential solutions for improving data sharing in neuroscience. It notes that while there is a large amount of neuroscience data, it is unevenly distributed across repositories and databases. The document proposes creating a distributed "data sharing ecosystem" where data and related metadata are systematically tracked, linked and made available. Key elements would include unique IDs for all data objects, logging all activities, and developing accountability scores and influence measures to promote better data citizenship. However, concerns are raised about monitoring researchers and potential biases, which would need to be addressed for such a system to work.
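The bookkeeping that the proposed "data sharing ecosystem" relies on (unique IDs for all data objects, logging of all activities, and accountability scores derived from the log) can be sketched in a few lines. Everything here is a hypothetical illustration of that idea; none of the names correspond to a real system.

```python
# Hypothetical sketch of the proposed data-sharing ecosystem bookkeeping:
# unique IDs for data objects, an append-only activity log, and a toy
# accountability score computed from the log.
import uuid
from datetime import datetime, timezone

registry = {}       # object_id -> metadata
activity_log = []   # append-only record of actions

def log_activity(object_id, action):
    activity_log.append({
        "object_id": object_id,
        "action": action,
        "at": datetime.now(timezone.utc).isoformat(),
    })

def register_object(metadata):
    """Assign a unique ID to a new data object and log its registration."""
    object_id = str(uuid.uuid4())
    registry[object_id] = metadata
    log_activity(object_id, "registered")
    return object_id

def accountability_score(owner):
    """Toy score: count of logged actions on objects owned by `owner`."""
    owned = {oid for oid, m in registry.items() if m.get("owner") == owner}
    return sum(1 for entry in activity_log if entry["object_id"] in owned)

oid = register_object({"owner": "lab-a", "kind": "ephys-recording"})
log_activity(oid, "downloaded")
print(accountability_score("lab-a"))  # → 2
```

A real system would of course need persistent storage, authentication, and the bias safeguards the document raises; this only shows the shape of the data model.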
Immersive Recommendation Workshop, NYC Media Lab'17 – Longqi Yang
The rapid evolution of deep learning technologies and the explosion of diverse user interaction traces have brought significant challenges and opportunities to recommendation and personalized systems. In this workshop, we discussed recent trends and techniques in user modeling and presented our work on immersive recommendation systems. These systems learn users’ preferences from diverse digital trace modalities (text, image and unstructured data streams) in a wide range of recommendation domains (creative art, food, news, and events). The workshop included a light tutorial on OpenRec, an open source framework that enables quick prototyping of complex recommender systems via modularization.
This workshop is based on research and development done at Cornell Tech as part of the Connected Experiences Lab, supported by Oath and NSF.
1) Jordan Engbers is a chief scientist and CTO who has experience in bioinformatics, neuroscience, clinical data science, and founding two data science companies.
2) Data science is a multidisciplinary field that uses techniques from many areas like statistics, computer science, and domain knowledge to understand data and help improve decision making.
3) The impact of data science comes from developing data products - tools that deliver insights from data to drive better decisions. This requires both scientific rigor and software engineering practices.
The document discusses ChemSpider, a free online chemical database, and its efforts to engage the chemistry community to help build and curate its database. It describes ChemSpider's roles in hosting and exposing chemical data as well as curating submitted data. It acknowledges that while crowdsourcing engagement has been low, more collaboration across databases could help improve overall data quality. Continued growth will depend on better engaging the community to contribute to and help shape the resource.
Being FAIR: FAIR data and model management, SSBSS 2017 Summer School – Carole Goble
Lecture 1:
Being FAIR: FAIR data and model management
In recent years we have seen a change in expectations for the management of all the outcomes of research – that is, the "assets" of data, models, codes, SOPs, and workflows. The "FAIR" (Findable, Accessible, Interoperable, Reusable) Guiding Principles for scientific data management and stewardship [1] have proved to be an effective rallying-cry. Funding agencies expect management, retention, and access plans for data (and increasingly for software). Journals are raising their expectations of the availability of data and codes pre- and post-publication. The multi-component, multi-disciplinary nature of Systems and Synthetic Biology demands the interlinking and exchange of assets and the systematic recording of metadata for their interpretation.
Our FAIRDOM project (http://www.fair-dom.org) supports Systems Biology research projects with their research data, methods and model management, with an emphasis on standards smuggled in by stealth and sensitivity to asset sharing and credit anxiety. The FAIRDOM Platform has been installed by over 30 labs or projects. Our public, centrally hosted Asset Commons, the FAIRDOMHub.org, supports the outcomes of 50+ projects.
Now established as a grassroots association, FAIRDOM has over 8 years of experience of practical asset sharing and data infrastructure at the researcher coal-face ranging across European programmes (SysMO and ERASysAPP ERANets), national initiatives (Germany's de.NBI and Systems Medicine of the Liver; Norway's Digital Life) and European Research Infrastructures (ISBE) as well as in PI's labs and Centres such as the SynBioChem Centre at Manchester.
In this talk I will explore how FAIRDOM has been designed to support Systems Biology projects and show examples of its configuration and use. I will also explore the technical and social challenges we face.
I will also refer to European efforts to support public archives for the life sciences. ELIXIR (http://www.elixir-europe.org/) is the European Research Infrastructure of 21 national nodes and a hub, funded by national agreements to coordinate and sustain key data repositories and archives for the life science community, improve access to them and related tools, support training, and create a platform for dataset interoperability. As Head of the ELIXIR-UK Node and co-lead of the ELIXIR Interoperability Platform, I will show how this work relates to your projects.
[1] Wilkinson et al., "The FAIR Guiding Principles for scientific data management and stewardship," Scientific Data 3 (2016), doi:10.1038/sdata.2016.18
2016 07 12_purdue_bigdatainomics_seandavis – Sean Davis
Newer, faster, cheaper molecular assays are driving biomedical research. I discuss the history of biomedical data, including concepts of data sharing, hypothesis-driven vs. hypothesis-generating research, and the potential to expand our thinking on biomedical research to be much more integrated through smart, creative, and open use of technologies and more flexible, longitudinal studies.
The Forensic Technology Center of Excellence: Continuous Improvement of Lab Efficiency, Technology Implementation, and Leadership Excellence @ American Society of Crime Lab Directors 2017 Symposium (May 3, 2017), Dallas, TX
Presentation by Prof. Dr. Henning Müller.
Overview:
- Medical image retrieval projects
- Image analysis and 3D texture modeling
- Data science evaluation infrastructures (ImageCLEF, VISCERAL, EaaS – Evaluation as a Service)
- What comes next?
- Data challenges are growing in terms of volume, variety, velocity and quality. There is no single solution and real-world solutions will be hybrid.
- Metadata management is a huge challenge, even basic metadata is beyond most small organizations. Federated systems are needed to transform medicine.
- The document discusses challenges with data management across various domains including life sciences, healthcare, genomics, machine learning, artificial intelligence, and personal data. It emphasizes the importance of data visibility, quality, and integration across siloed systems.
Jean-Claude Bradley had an incredible passion for providing open science tools and data to the community. He had boundless energy, no shortage of ideas and ran so many projects in parallel that it was often difficult to keep up. But at RSC we tried. We provided access to our data, our application programming interfaces and lots of our out-of-hours time to help turn his vision into reality. As a result we helped in the delivery of the SpectralGame to help people learn about NMR and we supported the integration of our services into GoogleDocs underpinning the management and curation of physicochemical property data. We tweaked a number of our services based on JC’s input and as a result we have ended up with a suite of capabilities that serve many of our existing efforts to integrate to electronic lab notebooks and support the ongoing shift towards Open Chemistry. JC was very much ahead of his time….and we were glad to have supported his work. This presentation will give a snapshot of some of the work we did to support his vision.
Introduction to Jackson Labs, JMCRS, and the Clinical and Scientific Services at Jackson Labs. Differences between long- and short-read sequencing. FAIR Data Action Plan. Metadata needs. Data Commons and the need to capture sample-specific gene models as they are discovered.
The Genome-in-a-Bottle Consortium was established to develop reference materials for clinical applications of human genome sequencing. The National Institute of Standards and Technology (NIST) has been working with various organizations to obtain and characterize reference genomes. The current plan is to use the NA12878 genome as a pilot sample and 8 trios from the Personal Genome Project as a more complete set. Working groups were formed to address reference material selection, characterization measurements, bioinformatics/data integration, and performance metrics. The consortium discussed obtaining consent for reference genomes, the scope of work, and how decisions will be made regarding new reference materials and policies.
But how do I GET the data? Transparency Camp 2014, Jeffrey Quigley
The document discusses data collection and management. It describes how Shooju is a web-based data platform that consolidates data sources, makes data searchable from one place, and seamlessly integrates with tools. It notes that most organizations spend more time cleaning and managing data than analyzing it. Common methods to collect data include APIs, scraping, and manual collection, each with advantages and disadvantages. Shooju provides cost savings, added data quality, and enables enhanced decision making by streamlining data workflows and automating processes.
The Jeopardy match between the two best human players of all time and the IBM Deep Q/A software, “Watson,” captured the spotlight and stimulated the imagination of the entire world. The subsequent announcement of IBM’s involvement in the creation of “Dr. Watson” has created a high level of interest in the healthcare community about the potential of this breakthrough technology as well as the potential pitfalls of the use of “artificial intelligence” in medicine. Dr. Siegel is currently working together with IBM engineers to explore how Dr. Watson can work together with physicians and medical specialists. His presentation, which was delivered on March 28th, provided a high level overview of the uniqueness of Deep Q/A Software and how it differs from other previous artificial intelligence applications.
Publication of raw and curated NMR spectroscopic data for organic molecules, Christoph Steinbeck
The document discusses nuclear magnetic resonance (NMR) spectroscopy and the need for sharing raw NMR data. It describes NMReDATA as a machine-readable representation for linking NMR spectral data to chemical structures. Benefits of NMReDATA include improved data quality, easier data sharing and storage, and validation of results. The document also calls for building a stable, open archive with community standards for submitting raw NMR data and metadata. Existing frameworks could support such an archive by handling submissions and allowing search and visualization of NMR data.
The document discusses big data, defining it using the 4 Vs of volume, velocity, variety, and veracity. It describes how the volume of data has grown exponentially in recent years. Tools like Hadoop and Splunk are used to analyze large and diverse datasets in real-time. Examples are given of how big data impacts various industries like healthcare, retail, and more. Industries are now able to gain insights from large amounts of structured and unstructured data to improve areas such as customer service, risk analysis, and personalized medicine.
This presentation was given at the ToxForum 2023 Winter Meeting in the session regarding Life Cycle Impact Assessment: LCIAs are increasingly being utilized within Life Cycle Assessments (LCA) to attempt to quantify impacts to human health, among other impacts, and ultimately compare products or processes. This session will provide the audience with an overview of LCIA, specifically outlining how toxicology data are utilized in LCIA human health impact calculations. Further, our speakers will delve into the nuances related to the interpretation of LCIA human health outputs and how they compare to information obtained via a risk assessment (RA) approach. Toxicology data are key to these newly emerging efforts to characterize total human health impact. Therefore, panelists will be asked to consider new trends in toxicology, data sources, and/or approaches that may help to better inform a calculation of human health impact along with appropriate domains of applicability, fostering a discussion relevant to regulators, academics and those in application industries.
This talk presents areas of investigation underway at the Rensselaer Institute for Data Exploration and Applications. First presented at Flipkart, Bangalore India, 3/2015.
I presented this keynote talk at the WorldComp conference in Las Vegas, on July 13, 2009. In it, I summarize what grid is about (focusing in particular on the "integration" function, rather than the "outsourcing" function--what people call "cloud" today), using biomedical examples in particular.
Similar to NMR Automatic Structure Verification
FAIRification experience clarifying the semantics of data matrices, Pistoia Alliance
This webinar presents the Statistics Ontology (STATO), a semantic framework to support the creation of standardized analysis reports and help with the review of results in the form of data matrices. STATO includes a hierarchy of classes and a vocabulary for annotating statistical methods used in life, natural and biomedical science investigations, text mining and statistical analyses.
This webinar discusses driving adoption of microphysiological systems (MPS) in drug R&D. The webinar agenda includes presentations on multi-organ chips for safety and efficacy assessment from TissUse, current applications and future perspectives of organ-on-chips in pharmaceutical industry from AstraZeneca, and driving adoption of MPS from ToxRox Consulting. A panel discussion will be moderated by Mary Ellen Cosenza. The presentations will cover benefits of MPS for reducing drug failures and animal testing, applications across drug discovery and development, challenges for adoption, and perspectives from industry.
Federated Learning (FL) is a learning paradigm that enables collaborative learning without centralizing datasets. In this webinar, NVIDIA presents the concept of FL and discusses how it can help overcome some of the barriers seen in the development of AI-based solutions for pharma, genomics and healthcare. Following the presentation, the panel debates other elements that could drive the adoption of digital approaches more widely and help answer currently intractable science and business questions.
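The core aggregation step behind federated learning can be sketched in a few lines. The toy one-parameter model, learning rate, and function names below are illustrative assumptions, not NVIDIA's implementation: each site updates a shared model on its private data, and only the weights are pooled.

```python
# Minimal sketch of federated averaging (FedAvg): each site trains
# locally and only model weights -- never raw data -- are shared.

def local_update(weights, data, lr=0.1):
    """One gradient-descent step on a 1-D least-squares toy model."""
    grad = sum(2 * (weights * x - y) * x for x, y in data) / len(data)
    return weights - lr * grad

def federated_average(site_weights, site_sizes):
    """Weighted average of per-site weights, proportional to data size."""
    total = sum(site_sizes)
    return sum(w * n for w, n in zip(site_weights, site_sizes)) / total

# Two hospitals hold private data drawn from the same y = 2x relation.
site_a = [(1.0, 2.0), (2.0, 4.0)]
site_b = [(3.0, 6.0)]

global_w = 0.0
for _ in range(200):  # communication rounds
    w_a = local_update(global_w, site_a)
    w_b = local_update(global_w, site_b)
    global_w = federated_average([w_a, w_b], [len(site_a), len(site_b)])

print(round(global_w, 2))  # converges toward 2.0
```

The averaged model recovers the relation underlying both private datasets even though neither site ever discloses its raw data, which is the core appeal for regulated healthcare settings.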
AI, like design thinking, is becoming a buzzword. Everyone is talking about AI or wants to have AI and sees all the ideas and benefits – that's fine, but how do you get started? And what's different now? Three innovations have finally put AI on the fast track: big data, with the internet and sensors everywhere; massive computing power, especially through the cloud; and breakthrough algorithms, so computers can be trained with deep learning to accomplish more sophisticated tasks on their own. If you use new technology, you need to explore and know what's possible. Design thinking helps to outline the steps and define the ways in which you're going to create the solution, starting with mapping the customer journey and defining who will use the service enhanced with intelligent technology, or who will benefit and gain value from it. We discuss how these two worlds are coming together and how you can get started transforming your venture with artificial intelligence using design thinking.
Speaker: Claudio Mirti, Principal Solution Specialist – Data & AI, Microsoft
Themes and objectives:
To position FAIR as a key enabler to automate and accelerate R&D process workflows
FAIR Implementation within the context of a use case
Grounded in precise outcomes (e.g. faster and bigger science / more reuse of data to enhance value / increased ability to share data for collaboration and partnership)
To make data actionable through FAIR interoperability
Speakers:
Mathew Woodwark, Head of Data Infrastructure and Tools, Data Science & AI, AstraZeneca
Erik Schultes, International Science Coordinator, GO-FAIR
Georges Heiter, Founder & CEO, Databiology
Knowledge graphs, Ilaria Maresi, The Hyve, 23 April 2020, Pistoia Alliance
Data for drug discovery and healthcare is often trapped in silos which hampers effective interpretation and reuse. To remedy this, such data needs to be linked both internally and to external sources to make a FAIR data landscape which can power semantic models and knowledge graphs.
2020.04.07 Automated molecular design and the Bradshaw platform webinar, Pistoia Alliance
This presentation described how data-driven chemoinformatics methods may automate much of what has historically been done by a medicinal chemist. It explored what is reasonable to expect “AI” approaches might achieve, and what is best left with a human expert. The implications of automation for the human-machine interface were explored and illustrated with examples from Bradshaw, GSK’s experimental automated design environment.
Dr. Dennis Wang discusses possible ways to enable ML methods to be more powerful for discovery and to reduce ambiguity within translational medicine, allowing data-informed decision-making to deliver the next generation of diagnostics and therapeutics to patients quicker, at lowered costs, and at scale.
The talk by Dr. Dennis Wang was followed by a panel discussion with Mr. Albert Wang, M. Eng., Head, IT Business Partner, Translational Research & Technologies, Bristol-Myers Squibb.
With the explosion of interest in both enhanced knowledge management and open science, the past few years have seen considerable discussion about making scientific data “FAIR” — findable, accessible, interoperable, and reusable. The problem is that most scientific datasets are not FAIR. When left to their own devices, scientists do an absolutely terrible job creating the metadata that describe the experimental datasets that make their way in online repositories. The lack of standardization makes it extremely difficult for other investigators to locate relevant datasets, to re-analyse them, and to integrate those datasets with other data. The Center for Expanded Data Annotation and Retrieval (CEDAR) has the goal of enhancing the authoring of experimental metadata to make online datasets more useful to the scientific community. The CEDAR work bench for metadata management will be presented in this webinar. CEDAR illustrates the importance of semantic technology to driving open science. It also demonstrates a means for simplifying access to scientific data sets and enhancing the reuse of the data to drive new discoveries.
Open interoperability standards, tools and services at EMBL-EBI, Pistoia Alliance
In this webinar Dr Henriette Harmse presents how EMBL-EBI uses its ontology services to scale up the annotation of data and deliver added value to its users through ontologies and semantics.
FAIR webinar, Ted Slater: Progress towards commercial FAIR data products and ..., Pistoia Alliance
Elsevier is a global information analytics business that helps institutions and professionals advance healthcare and open science to improve performance for the benefit of humanity.
In this webinar, we discuss how Elsevier is increasingly leveraging the FAIR Guiding Principles to improve its products and services to better serve the scientific community.
Application of recently developed FAIR metrics to the ELIXIR Core Data Resources, Pistoia Alliance
The FAIR (Findable, Accessible, Interoperable and Reusable) principles aim to maximize the discovery and reuse of digital resources. Using recently developed software and metrics to assess FAIRness, and supported through an ELIXIR Implementation Study, Michel worked with a subset of ELIXIR Core Data Resources to apply these technologies. In this webinar, he discusses their approach, findings, and lessons learned towards understanding and promoting the FAIR principles.
Implementing Blockchain applications in healthcare, Pistoia Alliance
Blockchain technology can revolutionise the way information is exchanged between parties by bringing an unprecedented level of security and trust to such transactions. The technology is finding its way into multiple use cases, but we are yet to see full adoption and real-world business implementation in the healthcare industry.
In this webinar we will explore the main challenges and considerations for the implementation of Blockchain technology in Healthcare use cases. This is the third webinar in our Blockchain Education series.
Building trust and accountability - the role User Experience design can play ..., Pistoia Alliance
In this webinar our panel of UX specialists give a brief introduction to User Experience before presenting the design opportunities UX can bring to AI. We all know that AI has great potential, but it has some significant hurdles to overcome, not least the human aspects of trust and the ethical considerations when designing in the life sciences.
This document summarizes a webinar on using machine learning and data mining techniques to predict drug repurposing opportunities for chronic pancreatitis. Specifically:
1. Ensemble learning techniques like kernel-based models were used to analyze drug and disease target interaction data from multiple sources to identify potential drug candidates for repurposing.
2. The top 5 repurposing candidates identified through this process were being evaluated further by the partner organization Mission-Cure with the goal of beginning patient trials by January 2020.
3. Additional techniques discussed included using compressed sensing to analyze drug-disease networks and predict side effects to help evaluate candidate drugs identified for repurposing opportunities.
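The score-level ensembling idea in point 1 can be sketched as follows. All drug names, scores, and the number of models below are invented for illustration; the webinar's actual kernel-based models and interaction data are not reproduced here.

```python
# Illustrative score-level ensemble for drug repurposing: several models
# each score candidate drugs, scores are min-max normalised so they are
# comparable, and drugs are ranked by their mean normalised score.

def normalise(scores):
    """Min-max normalise a {drug: score} mapping to the [0, 1] range."""
    lo, hi = min(scores.values()), max(scores.values())
    return {d: (s - lo) / (hi - lo) for d, s in scores.items()}

def ensemble_rank(model_scores):
    """Average each drug's normalised score across models; rank high to low."""
    norm = [normalise(m) for m in model_scores]
    drugs = norm[0].keys()
    mean = {d: sum(m[d] for m in norm) / len(norm) for d in drugs}
    return sorted(mean, key=mean.get, reverse=True)

# Three hypothetical models scoring four hypothetical candidate drugs.
model_scores = [
    {"drugA": 0.9, "drugB": 0.2, "drugC": 0.5, "drugD": 0.1},
    {"drugA": 0.7, "drugB": 0.6, "drugC": 0.8, "drugD": 0.2},
    {"drugA": 0.8, "drugB": 0.3, "drugC": 0.9, "drugD": 0.4},
]
print(ensemble_rank(model_scores)[:2])  # top-2 repurposing candidates
```

Averaging over several independently trained models is what makes the ensemble more robust than any single predictor, which is why the top-ranked candidates are worth handing to a partner organization for further evaluation.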
PA webinar on benefits & costs of FAIR implementation in life sciences, Pistoia Alliance
The slides from the Pistoia Alliance Debates Webinar, in which a panel of experts from technology providers and the biopharma industry were invited to share their views on the benefits and costs of FAIR implementation for the life science industry.
Creating novel drugs is an extraordinarily hard and complex problem.
One of the many challenges in drug design is the sheer size of the search space for novel chemical compounds. Scientists need to find molecules that are active toward a biological target or pathway and at the same time have acceptable ADMET properties.
There is now considerable research going on using various AI and ML approaches to tackle these challenges.
Our distinguished speakers, Drs. Alex Tropsha and Ola Engkvist, will discuss their recent work in Drug Design involving Deep Reinforcement Learning and Neural Networks, and will answer questions from the audience on the current state of the research in the field.
Speakers:
Prof Alex Tropsha, Professor at University of North Carolina at Chapel Hill, USA
Dr. Ola Engkvist, Associate Director at AstraZeneca R&D, Gothenburg, Sweden
Alexander Tropsha presented on using AI and machine learning for drug design and discovery. He discussed using QSAR models to predict properties and activity of molecules based on their structural descriptors. He also introduced ReLeaSE, a new method using deep reinforcement learning to generate novel drug-like molecules and guide chemical library design through a thought cycle of molecule generation, model building, and iterative improvement. If successful, this approach could disrupt traditional computational drug discovery pipelines.
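The generate-score-update cycle described above can be illustrated with a toy reinforcement loop. The fragments, rewards, and update rule below are invented stand-ins, not the published ReLeaSE method: a generator proposes candidates, a scoring model rewards desirable ones, and the generator's sampling distribution shifts toward higher-reward choices.

```python
# Toy generate -> score -> update loop behind RL-driven molecule design.
# "Molecules" are abstract fragment tokens -- purely illustrative.
import random

random.seed(0)
fragments = ["frag1", "frag2", "frag3"]
prefs = {f: 1.0 for f in fragments}                   # generator preferences
reward = {"frag1": 0.1, "frag2": 0.9, "frag3": 0.3}   # stand-in scoring model

def sample():
    # Propose a candidate in proportion to the current preferences.
    return random.choices(fragments, weights=[prefs[f] for f in fragments])[0]

for _ in range(500):                      # generate -> score -> update
    mol = sample()
    prefs[mol] += 0.1 * reward[mol]       # reinforce rewarded choices

best = max(prefs, key=prefs.get)
print(best)
```

Over many iterations, the positive feedback between sampling probability and reward steers generation toward high-scoring candidates; the real method replaces the token table with a deep generative model and the lookup reward with predicted molecular properties.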
Blockchain, IoT and the GxP lab: technology helping compliance?
This webinar discusses how distributed ledger technology like blockchain and IOTA could help enhance compliance in GxP laboratories. It explores how DLT could be used to track devices, materials, and data in a more transparent, trusted and auditable way. Specifically, it presents a vision of an internet-connected "laboratory of the future" where all devices share data using DLT. This could improve integrity, security and access to data while reducing costs. While DLT cannot directly increase compliance, it may help build trust in GxP systems and processes by making components more transparent to regulators.
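The tamper-evidence that DLT could bring to lab records can be illustrated with a minimal hash chain. This is a simplified sketch of the idea, not a real distributed ledger (no consensus, replication, or signatures), and the device names and readings are invented.

```python
# Each lab record stores the hash of the previous record, so any later
# modification breaks the chain and is detectable on audit.
import hashlib
import json

def add_record(chain, payload):
    """Append a record whose hash covers its payload and its predecessor."""
    prev_hash = chain[-1]["hash"] if chain else "0" * 64
    body = {"payload": payload, "prev": prev_hash}
    body["hash"] = hashlib.sha256(
        json.dumps({"payload": payload, "prev": prev_hash}, sort_keys=True).encode()
    ).hexdigest()
    chain.append(body)

def verify(chain):
    """Recompute every hash and check the links; False if anything was altered."""
    prev = "0" * 64
    for rec in chain:
        expected = hashlib.sha256(
            json.dumps({"payload": rec["payload"], "prev": rec["prev"]},
                       sort_keys=True).encode()
        ).hexdigest()
        if rec["prev"] != prev or rec["hash"] != expected:
            return False
        prev = rec["hash"]
    return True

chain = []
add_record(chain, {"device": "balance-01", "reading": 12.31})
add_record(chain, {"device": "ph-meter-02", "reading": 7.02})
print(verify(chain))                      # True
chain[0]["payload"]["reading"] = 99.9     # tamper with a stored reading
print(verify(chain))                      # False
```

This tamper-evidence, rather than any direct enforcement of rules, is the sense in which DLT can build trust in GxP systems: regulators can verify that device data has not been altered after the fact.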
Main news related to the CCS TSI 2023 (2023/1695), Jakub Marek
An English 🇬🇧 translation of the presentation accompanying the speech I gave on the main changes brought by CCS TSI 2023 at the biggest Czech conference on communications and signalling systems on railways, held at the Clarion Hotel Olomouc from 7 to 9 November 2023 (konferenceszt.cz). It was attended by around 500 participants and 200 online followers.
The original Czech 🇨🇿 version of the presentation can be found here: https://www.slideshare.net/slideshow/hlavni-novinky-souvisejici-s-ccs-tsi-2023-2023-1695/269688092 .
The videorecording (in Czech) from the presentation is available here: https://youtu.be/WzjJWm4IyPk?si=SImb06tuXGb30BEH .
Have you ever been confused by the myriad of choices offered by AWS for hosting a website or an API?
Lambda, Elastic Beanstalk, Lightsail, Amplify, S3 (and more!) can each host websites + APIs. But which one should we choose?
Which one is cheapest? Which one is fastest? Which one will scale to meet our needs?
Join me in this session as we dive into each AWS hosting service to determine which one is best for your scenario and explain why!
Introduction of Cybersecurity with OSS at Code Europe 2024, Hiroshi SHIBATA
I develop the Ruby programming language, RubyGems, and Bundler, which are package managers for Ruby. Today, I will introduce how to enhance the security of your application using open-source software (OSS) examples from Ruby and RubyGems.
The first topic is CVE (Common Vulnerabilities and Exposures). I have published CVEs many times. But what exactly is a CVE? I'll provide a basic understanding of CVEs and explain how to detect and handle vulnerabilities in OSS.
Next, let's discuss package managers. Package managers play a critical role in the OSS ecosystem. I'll explain how to manage library dependencies in your application.
I'll share insights into how the Ruby and RubyGems core team works to keep our ecosystem safe. By the end of this talk, you'll have a better understanding of how to safeguard your code.
Programming Foundation Models with DSPy - Meetup Slides, Zilliz
Prompting language models is hard, while programming language models is easy. In this talk, I will discuss the state-of-the-art framework DSPy for programming foundation models with its powerful optimizers and runtime constraint system.
[OReilly Superstream] Occupy the Space: A grassroots guide to engineering (an..., Jason Yip
The typical problem in product engineering is not bad strategy, so much as “no strategy”. This leads to confusion, lack of motivation, and incoherent action. The next time you look for a strategy and find an empty space, instead of waiting for it to be filled, I will show you how to fill it in yourself. If you’re wrong, it forces a correction. If you’re right, it helps create focus. I’ll share how I’ve approached this in the past, both what works and lessons for what didn’t work so well.
Skybuffer SAM4U tool for SAP license adoption, Tatiana Kojar
Manage and optimize your license adoption and consumption with SAM4U, a complimentary SAP software asset management tool for customers.
SAM4U, an SAP complimentary software asset management tool for customers, delivers a detailed and well-structured overview of license inventory and usage with a user-friendly interface. We offer a hosted, cost-effective, and performance-optimized SAM4U setup in the Skybuffer Cloud environment. You retain ownership of the system and data, while we manage the ABAP 7.58 infrastructure, ensuring fixed Total Cost of Ownership (TCO) and exceptional services through the SAP Fiori interface.
Your One-Stop Shop for Python Success: Top 10 US Python Development Providers, akankshawande
Simplify your search for a reliable Python development partner! This list presents the top 10 trusted US providers offering comprehensive Python development services, ensuring your project's success from conception to completion.
Essentials of Automations: Exploring Attributes & Automation ParametersSafe Software
Building automations in FME Flow can save time, money, and help businesses scale by eliminating data silos and providing data to stakeholders in real-time. One essential component to orchestrating complex automations is the use of attributes & automation parameters (both formerly known as “keys”). In fact, it’s unlikely you’ll ever build an Automation without using these components, but what exactly are they?
Attributes & automation parameters enable the automation author to pass data values from one automation component to the next. During this webinar, our FME Flow Specialists will cover leveraging the three types of these output attributes & parameters in FME Flow: Event, Custom, and Automation. As a bonus, they’ll also be making use of the Split-Merge Block functionality.
You’ll leave this webinar with a better understanding of how to maximize the potential of automations by making use of attributes & automation parameters, with the ultimate goal of setting your enterprise integration workflows up on autopilot.
Discover top-tier mobile app development services, offering innovative solutions for iOS and Android. Enhance your business with custom, user-friendly mobile applications.
Taking AI to the Next Level in Manufacturing.pdf, ssuserfac0301
Read Taking AI to the Next Level in Manufacturing to gain insights on AI adoption in the manufacturing industry, such as:
1. How quickly AI is being implemented in manufacturing.
2. Which barriers stand in the way of AI adoption.
3. How data quality and governance form the backbone of AI.
4. Organizational processes and structures that may inhibit effective AI adoption.
5. Ideas and approaches to help build your organization's AI strategy.
Connector Corner: Seamlessly power UiPath Apps, GenAI with prebuilt connectors, DianaGray10
Join us to learn how UiPath Apps can directly and easily interact with prebuilt connectors via Integration Service, including Salesforce, ServiceNow, Open GenAI, and more.
The best part is you can achieve this without building a custom workflow! Say goodbye to the hassle of using separate automations to call APIs. By seamlessly integrating within App Studio, you can now easily streamline your workflow, while gaining direct access to our Connector Catalog of popular applications.
We’ll discuss and demo the benefits of UiPath Apps and connectors including:
- Creating a compelling user experience for any software, without the limitations of APIs
- Accelerating the app creation process, saving time and effort
- Enjoying high-performance CRUD (create, read, update, delete) operations, for seamless data management
Speakers:
Russell Alfeche, Technology Leader, RPA at qBotic and UiPath MVP
Charlie Greenberg, host
Fueling AI with Great Data with Airbyte WebinarZilliz
This talk will focus on how to collect data from a variety of sources, leveraging this data for RAG and other GenAI use cases, and finally charting your course to productionalization.
Digital Banking in the Cloud: How Citizens Bank Unlocked Their Mainframe, Precisely
Inconsistent user experience and siloed data, high costs, and changing customer expectations – Citizens Bank was experiencing these challenges while it was attempting to deliver a superior digital banking experience for its clients. Its core banking applications run on the mainframe and Citizens was using legacy utilities to get the critical mainframe data to feed customer-facing channels, like call centers, web, and mobile. Ultimately, this led to higher operating costs (MIPS), delayed response times, and longer time to market.
Ever-changing customer expectations demand more modern digital experiences, and the bank needed to find a solution that could provide real-time data to its customer channels with low latency and operating costs. Join this session to learn how Citizens is leveraging Precisely to replicate mainframe data to its customer channels and deliver on their “modern digital bank” experiences.
1. The Pistoia ASV Project ...
An Emerging Vehicle for Collaboration: NMR Spectra data exchange
The Pistoia Alliance
Alex Drijver on behalf of John Hollerton, Giles Ratcliffe and John Wise
Pistoia Conference, http://pistoiaalliance.org
24th April 2012
2. What’s up doc?
„Too much data ... ?”
„No, too little time ...”
„For what?”
„To analyze the NMR data on compounds I buy for screening...”
3. If only I had ...
ASV
The small print .. Automatic Structure Verification ...
4. What’s up doc?
„Too much data ... ?”
„No, too little ...”
„For what?”
„To make a reliable algorithm that will validate structures based on NMR spectra”
5.
6. What if...
A data set of qualified structures and NMR spectra existed
Contributed by pharma (or other) = chemical diversity
Available to software developers
With a control set for validation
7. The Money bit
Data contributed free of charge
Data available free of charge
Data curator and program administrator
Cost of labor and of system, say 1 FTE/year @ $120,000
Open contribution (money and structures)
Systematic/pragmatic
8. Rules
• Contribution of data based on size (issue of how to measure)
• Only participating providers of data can get the validation data set
• No limitation on who can use the test set
• ASVs sold on commercial terms
• Financial contribution needed for maintaining the infrastructure
9. WIFM (What's in it for me?)
• Pharma
– Time - which is money
• Software providers
– Industry-relevant software – which means money
• CROs
– Could provide value-add from validating structures prior to sale – which means money