Slides from my talk at the ACS CINF Symposium on Chemical Nomenclature & Representation on 26 August 2019 in San Diego.
Abstract:
The first edition of the Beilstein Handbook of Organic Chemistry was published nearly 140 years ago. Electronic laboratory notebooks have been in use in chemistry for almost 20 years. Yet the life science industry still lacks a well-defined way of capturing and exchanging information about chemical reactions, relying instead on imprecise or vendor-specific data formats. Without a common language and structure for describing experiments, data integration is unnecessarily expensive, and a significant part of published data is not readily available for processing or analysis.
The Unified Data Model (UDM) project team aims to improve this situation. UDM is a collective effort of vendors and life science organizations to create an open, extendable and freely available reference model and data format for the exchange of experimental information about compound synthesis and testing. Run under the umbrella of the Pistoia Alliance, the project team has published two releases of the UDM data format, and the model is expected to continue to improve as demand dictates, in concert with the Pistoia Alliance's work on FAIR data implementation with the industry community.
In a series of announcements that left more than 1,200 gamers gathered in Cologne alternately breathless, giddy with laughter, and shouting their enthusiasm, Jensen Huang introduced the GeForce RTX series of gaming processors, representing the biggest leap in performance in NVIDIA’s history.
With the components already introduced to the market, we are making the platform truly end-to-end by launching:
- The market’s first complete 5G radio system
- The first version of an E2E Core network capable of 5G use cases based on network slices
- A 5G core network which can now be connected to 5G NR radio
This already enables some 5G use cases today, allowing telecom operators to capture growth opportunities in 5G and Internet of Things services for consumers and enterprises.
This webinar focuses on the particular use case of graph databases in network and IT management. It is designed for people who work in network management at telecom companies, and for professionals in industries that handle and rely on complex networks.
We’ll start with an overview of Neo4j and graph thinking within networks, explaining how networks are naturally modelled as graphs. We’ll explain how graph databases help mitigate some of the major challenges network and security managers face on a daily basis, including intrusions and other cyber crimes, performance optimization, outage simulations, fraud prevention and more.
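As a toy illustration of why networks model naturally as graphs (plain Python rather than Neo4j, and with an invented four-router topology), the sketch below represents links as an adjacency map and simulates an outage by checking reachability with the failed node excluded:

```python
from collections import deque

# Toy network topology as an adjacency map: router -> set of linked routers.
# The router names are made up for illustration.
topology = {
    "core1": {"core2", "edge1"},
    "core2": {"core1", "edge2"},
    "edge1": {"core1", "edge2"},
    "edge2": {"core2", "edge1"},
}

def reachable(graph, start, failed=frozenset()):
    """Breadth-first search over the topology, skipping failed nodes."""
    if start in failed:
        return set()
    seen, queue = {start}, deque([start])
    while queue:
        node = queue.popleft()
        for neighbour in graph[node]:
            if neighbour not in failed and neighbour not in seen:
                seen.add(neighbour)
                queue.append(neighbour)
    return seen

# Outage simulation: which nodes can edge2 still reach if core1 fails?
print(reachable(topology, "edge2", failed={"core1"}))
```

In a real graph database the same question becomes a path query over the stored topology rather than hand-written traversal code, which is exactly the kind of workload the webinar discusses.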
With uCPE/SD-WAN taking center stage in enabling software-defined Cloud services to enterprise branch offices globally, this session will provide a uCPE review from a solution, deployment and reference design standpoint.
Speaker: Sab Gosal, Segment Manager
Network Platforms Group (NPG), September 2018
This is a document made available online by ASHRAE for consultation on TC 9.9 guidance for data center operation worldwide; the guide covers the environmental classes and their minimum and maximum operating limits.
An introduction to Neo4j and Graph Databases. Learn about the primary use cases for Graph Databases and explore the properties of Neo4j that make those use cases possible.
ScottMadden has developed an approach for analyzing data center requirements and driving improvements in existing data center retrofits. Our approach takes into account the technological requirements, the physical attributes of a data center, and the requirements for a rigorous measurement and verification program needed to ensure improvements actually capture the energy efficiency gains and the resultant greenhouse gas reductions.
Our approach addresses the latest trends in data center management, such as virtualization and cloud computing, and provides a framework for developing metrics needed to drive changes in data center performance.
AI model efficiency is crucial for making AI ubiquitous, leading to smarter devices and enhanced lives. Besides the performance benefit, quantized neural networks also increase power efficiency for two reasons: reduced memory access costs and increased compute efficiency.
The quantization work done by the Qualcomm AI Research team is crucial in implementing machine learning algorithms on low-power edge devices. In network quantization, we focus on both pushing the state-of-the-art (SOTA) in compression and making quantized inference as easy to access as possible. For example, our SOTA work on oscillations in quantization-aware training pushes the boundaries of what is possible with INT4 quantization. Furthermore, for ease of deployment, integer formats such as INT16 and INT8 give comparable accuracy to floating-point formats such as FP16 and FP8, but with significantly better performance per watt. Researchers and developers can make use of this quantization research to successfully optimize and deploy their models across devices with open-source tools like the AI Model Efficiency Toolkit (AIMET).
Presenters: Tijmen Blankevoort and Chirag Patel
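The memory savings that quantization buys can be seen in a minimal NumPy sketch of symmetric per-tensor INT8 quantization. This illustrates the general technique only, not AIMET code, and the example weights are made up:

```python
import numpy as np

def quantize_int8(x):
    """Symmetric per-tensor INT8 quantization: x is approximated by scale * q."""
    max_abs = np.abs(x).max()
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    q = np.clip(np.round(x / scale), -127, 127).astype(np.int8)
    return q, scale

def dequantize(q, scale):
    """Recover an approximate float tensor from the integer codes."""
    return q.astype(np.float32) * scale

weights = np.array([0.5, -1.27, 0.03, 1.0], dtype=np.float32)
q, scale = quantize_int8(weights)

# INT8 storage is 4x smaller than FP32 for the same number of values.
assert q.itemsize * q.size == weights.nbytes // 4

# Round-trip error stays small relative to the weight range.
error = np.abs(dequantize(q, scale) - weights).max()
```

Smaller integer formats also cut memory-access cost and let hardware pack more multiply-accumulates per cycle, which is where the power-efficiency claim above comes from.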
In this deck, Greg Wahl from Advantech presents: Transforming Private 5G Networks.
Advantech Networks & Communications Group is driving innovation in next-generation network solutions with its High Performance Servers. We provide business-critical hardware to the world's leading telecom and networking equipment manufacturers with both standard and customized products. Our High Performance Servers are highly configurable platforms designed to balance the best in x86 server-class processing performance with maximum I/O and offload density. The systems are cost effective, highly available and optimized to meet next-generation networking and media processing needs.
“Advantech’s Networks and Communication Group has been both an innovator and trusted enabling partner in the telecommunications and network security markets for over a decade, designing and manufacturing products for OEMs that accelerate their network platform evolution and time to market,” said Advantech Vice President of Networks & Communications Group, Ween Niu. “In the new IP Infrastructure era, we will be expanding our expertise in Software Defined Networking (SDN) and Network Function Virtualization (NFV), two of the essential conduits to 5G infrastructure agility, making networks easier to install, secure, automate and manage in a cloud-based infrastructure.”
In addition to innovation in air interface technologies and architecture extensions, 5G will also need a new generation of network computing platforms to run the emerging software defined infrastructure, one that provides greater topology flexibility, essential to deliver on the promises of high availability, high coverage, low latency and high bandwidth connections. This will open up new parallel industry opportunities through dedicated 5G network slices reserved for specific industries dedicated to video traffic, augmented reality, IoT, connected cars etc. 5G unlocks many new doors and one of the keys to its enablement lies in the elasticity and flexibility of the underlying infrastructure.
Advantech’s corporate vision is to enable an intelligent planet. The company is a global leader in the fields of IoT intelligent systems and embedded platforms. To embrace the trends of IoT, big data, and artificial intelligence, Advantech promotes IoT hardware and software solutions with the Edge Intelligence WISE-PaaS core to assist business partners and clients in connecting their industrial chains. Advantech is also working with business partners to co-create business ecosystems that accelerate the goal of industrial intelligence.
Watch the video: https://wp.me/p3RLHQ-lPQ
* Company website: https://www.advantech.com/
* Solution page: https://www2.advantech.com/nc/newsletter/NCG/SKY/benefits.html
Sign up for our insideHPC Newsletter: http://insidehpc.com/newsletter
Nice presentation by Nokia talking about 5G network and radio enhancements such as 5G Quality of Service, Network Slicing, Latency Reduction and architecture issues. Thanks Benoist for this and your work in 3GPP RAN2.
Beginners: 5G Terminology (Updated - Feb 2019), by 3G4G
An updated short presentation and video looking at 5G terminology that is being used in 3GPP standards and specifications.
Terms such as NG-RAN, NR, ng-eNB, en-gNB, RIT, SRIT, Option 3, etc. will be discussed.
Simplifying AI Infrastructure: Lessons in Scaling on DGX Systems, by Renee Yao
Simplifying AI Infrastructure: Lessons in Scaling on DGX Systems, the world's most powerful AI systems. This is a presentation I did at GTC Israel in 2018.
LTE is a common standard covering both FDD and TDD flavors, enabling the industry to build common FDD/TDD infrastructure, common devices, and a large common ecosystem. LTE and its evolution LTE Advanced play a critical role in addressing the 1000x increase in mobile data.
Qualcomm has been leading LTE proliferation from the very beginning— from the industry-first Gobi LTE/3G multimode, common FDD/TDD modems to the current third-generation solutions that powered the world’s first LTE Advanced carrier-aggregation launch in June 2013.
For more information please visit www.qualcomm.com/lte
Download the presentation here: http://www.qualcomm.com/media/documents/lte-qualcomm-leading-global-success
Artificial Intelligence (AI), specifically deep learning, is revolutionizing industries, products, and core capabilities by delivering dramatically enhanced experiences. However, the deep neural networks of today use too much memory, compute, and energy. To make AI truly ubiquitous, it needs to run on the end device within tight power and thermal budgets. Advancements in multiple areas are necessary to improve AI model efficiency, including quantization, compression, compilation, and neural architecture search (NAS). In this presentation, we’ll discuss:
- Qualcomm AI Research’s latest model efficiency research
- Our new NAS research to optimize neural networks more easily for on-device efficiency
- How the AI community can take advantage of this research through our open-source projects, such as the AI Model Efficiency Toolkit (AIMET) and AIMET Model Zoo
In recent years, the growth of scientific data and the increasing need for data sharing and collaboration in the field of environmental chemistry have led to the creation of various software and databases that facilitate research and development into the safety and toxicity of chemicals. The US EPA Center for Computational Toxicology and Exposure has been developing software and databases that serve the chemistry community for many years. This presentation will focus on several web-based software applications which have been developed at the US EPA and made available to the community. While the primary software application from the Center is the CompTox Chemicals Dashboard, almost a dozen proof-of-concept applications have been built serving various capabilities. The publicly accessible Cheminformatics Modules (https://www.epa.gov/chemicalresearch/cheminformatics) provide access to six individual modules allowing hazard comparison for sets of chemicals; structure, substructure, and similarity searching; structure alerts; and batch QSAR prediction of both physicochemical and toxicity endpoints. A number of other applications in development include a chemical transformations database (ChET) and a database of analytical methods and open mass spectral data (AMOS). Each of these depends on the underlying DSSTox chemicals database, a rich source of chemistry data for over 1.2 million chemical substances. I will provide an overview of all tools in development and the integrated nature of the applications based on the underlying chemistry data. This abstract does not necessarily represent the views or policies of the U.S. Environmental Protection Agency.
ICIC 2014 Increasing the efficiency of pharmaceutical research through data i..., by Dr. Haxel Consult
The pressures of pharmaceutical research and development demand increasing efficiency from scientists. High-quality decisions must be made faster and encompass all available information. At the same time there is a growing desire to better utilize the multi-billion dollar research investment recorded in laboratory notebooks and bioassay databases. Key values for data integration in a data exploration environment include gathering data from disparate E-notebooks and bioassay databases into a single searchable “virtual” system and increased discoverability by accessing data through a system designed for exploration. Key benefits are better chemistry decisions through easier access to broader data and reduced time for preparing patent filings. The ability to interlink in-house and reported assay data with in-house and published chemistry provides a data-rich environment for developing insights and predictive models. We will discuss our experience with integrating information from journals, patents, bio-assay databases, and E-lab notebooks to address these needs.
The U.S. Environmental Protection Agency (EPA) Computational Toxicology Program integrates advances in biology, chemistry, exposure and computer science to help prioritize chemicals for further research based on potential human health risks. This work involves computational and data-driven approaches that integrate chemistry, exposure and biological data. As an outcome of these efforts the National Center for Computational Toxicology (NCCT) has measured, assembled and delivered an enormous quantity and diversity of data for the environmental sciences including high-throughput in vitro screening data, legacy in vivo animal data, consumer use and production information, exposure models and chemical structure databases with associated properties. A series of software applications and databases have been produced over the past decade to deliver these data, but recent developments have focused on the development of a new software architecture that assembles the resources into a single platform. Our web application, the CompTox Chemistry Dashboard, provides access to data associated with ~750,000 chemical substances. These data include experimental and predicted physicochemical property data, bioassay screening data associated with the ToxCast program, product and functional use information and a myriad of related data of value to environmental scientists.
The dashboard provides chemical-based searching based on chemical names, synonyms and CAS Registry Numbers. Flexible search capabilities allow for chemical identification based on non-targeted analysis studies using mass spectrometry. Chemical identification using both mass and formula-based searching utilizes rank-ordering of results via functional use statistics, thereby providing a solution to help prioritize chemicals for further review when detected in environmental media.
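The mass-search-plus-ranking idea described above can be sketched in a few lines of Python. The candidate list and functional-use counts below are invented for illustration and are not EPA data:

```python
# Hypothetical candidate records: (name, monoisotopic_mass, functional_use_count).
# "Candidate X" is an invented near-isobaric compound to show the ranking step.
CANDIDATES = [
    ("Bisphenol A", 228.1150, 412),
    ("Triclosan",   287.9512, 198),
    ("Candidate X", 228.1147, 3),
    ("Caffeine",    194.0804, 350),
]

def search_by_mass(measured_mass, tolerance_da=0.005):
    """Return candidates within the mass window, rank-ordered by how often
    the chemical appears in functional-use records (most common first)."""
    hits = [c for c in CANDIDATES if abs(c[1] - measured_mass) <= tolerance_da]
    return sorted(hits, key=lambda c: c[2], reverse=True)

# A measured mass of 228.115 Da matches two candidates; the more commonly
# used chemical is listed first for reviewer attention.
for name, mass, uses in search_by_mass(228.115):
    print(f"{name}: mass={mass}, functional-use records={uses}")
```

The real dashboard works against curated databases and richer use statistics, but the core prioritization step is this kind of filter-then-rank over candidates that are indistinguishable by mass alone.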
This presentation will provide an overview of the dashboard, its capabilities for delivering data to the environmental chemistry community and how the architecture provides a foundation for the development of additional applications to support chemical risk assessment. This abstract does not reflect U.S. EPA policy.
The CompTox Chemistry Dashboard was developed by the Environmental Protection Agency’s National Center for Computational Toxicology. The dashboard has been architected in a manner that allows for the deployment of multiple “applications”, both as publicly available databases and for deployment under the constraints of confidential business information (CBI). The public dashboard provides access to multiple types of data for ~750,000 chemicals. This includes, when available for a chemical substance, physicochemical parameters, toxicity and bioassay data, consumer use and analytical data. Fate, exposure, and hazard calculations can benefit from access to the data aggregation and curation efforts that underpin the public dashboard. Also, regulators can benefit from the integration of their own data within their closed infrastructure environments. This presentation will provide a review of the chemistry dashboard architecture and its present application providing access to data for the research and regulatory communities. We will also review present developments in the area of delivering an application programming interface, web services, and software components for integration into third-party applications providing access to the data exposed via the dashboard. This abstract does not reflect U.S. EPA policy.
OpenAIRE provide dashboard #OpenAIREweek2020, by Pedro Príncipe
OpenAIRE provide session at the OpenAIRE week 2020 - A user journey in OpenAIRE provide - services and the interoperability guidelines, by Pedro Principe
During the last two decades, Clinical Decision Support (CDS) standards and technologies have progressed significantly, making CDS systems more robust and scalable. However, the current context of medicine places high demands on aspects such as interoperability to enable the use of EHR data in CDS systems, the need to address communication challenges so that the patient can become an active participant in decision making, and collaborative learning and sharing of CDS systems across institutional borders, to name a few.
In this thesis I tackle some of these challenges. In particular, I evolve previous conceptual computerized decision support frameworks and I postulate a CDS systems environment where different models interact to enable:
• Secondary use of data for CDS systems: The dissertation presents a model to leverage different developments in data access and standardization of medical information. The result is an openEHR-based Data Warehouse architecture that enables access, standardization and abstraction of clinical data for CDS systems. The architecture allows: a) accessing heterogeneous data sources; b) standardizing data into openEHR to ensure interoperability of data; and c) exploiting an openEHR repository as a Data Warehouse that allows querying data in a technology-independent format (the Archetype Query Language).
• CDS systems semantic specification: The semantic model proposed exploits the paradigm of Linked Services to unambiguously describe CDS systems in a machine-understandable fashion. This grants ontological descriptions of functional, non-functional and data semantics. These descriptions help overcome some of the barriers to sharing CDS functionality. In particular, the proposed semantic model allows using expressive queries to discover CDS services in health networks, and analyzing CDS system interfaces to understand how to interoperate with them.
• Effective patient-CDS systems interaction: The dissertation proposes a method to evaluate the communication process between patients and consumer-oriented CDS systems. The method aims to detect whether important human-computer interaction barriers that could lead to negative outcomes are present in CDS system user interfaces.
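To make the Archetype Query Language mention in the first bullet concrete, here is a simplified AQL query embedded in a Python sketch. The query text follows common openEHR examples (a blood-pressure observation archetype), and the mock executor merely stands in for a real openEHR repository:

```python
# A simplified Archetype Query Language (AQL) query, illustrating how a CDS
# system can request clinical data in a technology-independent way. The paths
# here are abbreviated for readability rather than copied from a real template.
AQL_QUERY = """
SELECT obs/data/events/data/items/value/magnitude AS systolic
FROM EHR e
CONTAINS OBSERVATION obs [openEHR-EHR-OBSERVATION.blood_pressure.v1]
WHERE obs/data/events/data/items/value/magnitude > 140
"""

def mock_execute(query, magnitudes):
    """Stand-in for a real repository: apply the threshold encoded in the
    query's WHERE clause to a list of pre-extracted systolic readings."""
    threshold = float(query.rsplit(">", 1)[1])
    return [m for m in magnitudes if m > threshold]

# Readings above 140 mmHg are the ones a hypertension CDS rule would flag.
print(mock_execute(AQL_QUERY, [120.0, 150.0, 165.0]))
```

The point of the Data Warehouse architecture above is that the CDS system only ever sees archetype paths like these, never the schema of the underlying source systems.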
Slides to be presented at a webinar arranged by MetaSolutions as part of a Vinnova project http://metasolutions.se/2014/03/webbinarium-med-kerstin-forsberg-om-lankade-data-i-lakemedelsforskningen/
Information and data on chemicals is used by scientists to evaluate potential health and ecological risks due to environmental exposures. EPA’s CompTox Chemicals Dashboard (https://comptox.epa.gov) helps evaluate the safety of chemicals by providing public access to a variety of information on over 760,000 chemicals. Within the Dashboard, users can access chemical structures, chemistry information, toxicity data, hazard data, exposure information, and additional links to relevant websites and applications. These data are compiled from sources including EPA’s computational toxicology research databases, from public domain databases and with collaborators across the world. Chemical lists have been added that provide access to various classes of chemicals and project-based datasets are under constant development. Specific functionality has been delivered within the Dashboard to support mass spectrometry including “MS-ready forms” of chemical substances that would be detectable by mass spectrometry. Workflows have been developed to assist in candidate identification and have now been proven with multiple published studies. An integration path between the dashboard and MetFrag has also been established to provide users the significant benefits resulting from the marriage between the two applications. The datasets underpinning the dashboard are freely available (https://comptox.epa.gov/dashboard/downloads) for integration into third party databases. This presentation will provide an overview of the available data types and functionality of the dashboard prior to examining how it is developing to support mass spectrometry based analyses within the agency and for the community in general. This will include a review of our research efforts to enhance the dashboard using in silico MS/MS fragmentation prediction for spectral matching. This abstract does not necessarily represent the views or policies of the U.S. Environmental Protection Agency.
Richard Bolton (GSK and Pistoia's ELN query services workstream coordinator) discusses the Alliance's chemistry strategy, which includes ELN query standards, hosted ELN, and chemistry externalization faciliation
October 24, 2014: Joseph DeCarolis, Assistant Professor at North Carolina State University, will present The Importance of Open Data and Models for Energy Systems Analysis.
Energy system models represent a critical planning tool that can be used to deliver policy-relevant insights at scales ranging from local to global. When such models are used to inform public policy, the associated data and source code should be open in order to enable third party replication of results, expose hidden assumptions, and identify key model sensitivities. In this talk, I describe my own effort to push open data and models within the international energy modeling community.
Unambiguous representation of Lab Medicine requests & results - UK’s approach...Jay Kola
Unambiguous representation of Lab Medicine requests & results - UK’s approach & considerations for SNOMED CT community. A presentation to the SNOMED International General Assembly to showcase how UK's approach could help countries share lab test/result information even if they did not use SNOMED CT.
ICIC 2013 Conference Proceedings Sebastian RadestockDr. Haxel Consult
Making hidden data discoverable: How to build effective drug discovery engines?
Sebastian Radestock (Elsevier, Germany)
In a complex IT environment comprising dozens if not hundreds of databases and likely as many user interfaces it becomes difficult if not impossible to find all the relevant information needed to make informed decisions. Historical data get lost, not normalized data cannot be compared and maintenance becomes a nightmare. We will discuss a new approach to address this issue by showing various examples and use cases on how in-house data and public data can be integrated in various ways to address the unique and individual needs of companies to keep the competitive edge.
FAIR Data and Model Management for Systems Biology(and SOPs too!)Carole Goble
MultiScale Biology Network Springboard meeting, Nottingham, UK, 1 June 2015
FAIR Data and model management for Systems Biology
Over the past 5 years we have seen a change in expectations for the management of all the outcomes of research – that is the “assets” of data, models, codes, SOPs and so forth. Don’t stop reading. Yes, data management isn’t likely to win anyone a Nobel prize. But publications should be supported and accompanied by data, methods, procedures, etc. to assure reproducibility of results. Funding agencies expect data (and increasingly software) management retention and access plans as part of the proposal process for projects to be funded. Journals are raising their expectations of the availability of data and codes for pre- and post- publication. And the multi-component, multi-disciplinary nature of Systems Biology demands the interlinking and exchange of assets and the systematic recording of metadata for their interpretation.
Data and model management for the Systems Biology community is a multi-faceted one including: the development and adoption appropriate community standards (and the navigation of the standards maze); the sustaining of international public archives capable of servicing quantitative biology; and the development of the necessary tools and know-how for researchers within their own institutes so that they can steward their assets in a sustainable, coherent and credited manner while minimizing burden and maximising personal benefit.
The FAIRDOM (Findable, Accessible, Interoperable, Reusable Data, Operations and Models) Initiative has grown out of several efforts in European programmes (SysMO and EraSysAPP ERANets and the ISBE ESRFI) and national initiatives (de.NBI, German Virtual Liver Network, SystemsX, UK SynBio centres). It aims to support Systems Biology researchers with data and model management, with an emphasis on standards smuggled in by stealth.
This talk will use the FAIRDOM Initiative to discuss the FAIR management of data, SOPs, and models for Sys Bio, highlighting the challenges multi-scale biology presents.
http://www.fair-dom.org
http://www.fairdomhub.org
http://www.seek4science.org
Similar to UDM (Unified Data Model) - Enabling Exchange of Comprehensive Reaction Information (CINF 58, ACS National Meeting 2019-08-26) (20)
0x01 - Newton's Third Law: Static vs. Dynamic AbusersOWASP Beja
f you offer a service on the web, odds are that someone will abuse it. Be it an API, a SaaS, a PaaS, or even a static website, someone somewhere will try to figure out a way to use it to their own needs. In this talk we'll compare measures that are effective against static attackers and how to battle a dynamic attacker who adapts to your counter-measures.
About the Speaker
===============
Diogo Sousa, Engineering Manager @ Canonical
An opinionated individual with an interest in cryptography and its intersection with secure software development.
This presentation by Morris Kleiner (University of Minnesota), was made during the discussion “Competition and Regulation in Professions and Occupations” held at the Working Party No. 2 on Competition and Regulation on 10 June 2024. More papers and presentations on the topic can be found out at oe.cd/crps.
This presentation was uploaded with the author’s consent.
Have you ever wondered how search works while visiting an e-commerce site, internal website, or searching through other types of online resources? Look no further than this informative session on the ways that taxonomies help end-users navigate the internet! Hear from taxonomists and other information professionals who have first-hand experience creating and working with taxonomies that aid in navigation, search, and discovery across a range of disciplines.
Sharpen existing tools or get a new toolbox? Contemporary cluster initiatives...Orkestra
UIIN Conference, Madrid, 27-29 May 2024
James Wilson, Orkestra and Deusto Business School
Emily Wise, Lund University
Madeline Smith, The Glasgow School of Art
Acorn Recovery: Restore IT infra within minutesIP ServerOne
Introducing Acorn Recovery as a Service, a simple, fast, and secure managed disaster recovery (DRaaS) by IP ServerOne. A DR solution that helps restore your IT infra within minutes.
UDM (Unified Data Model) - Enabling Exchange of Comprehensive Reaction Information (CINF 58, ACS National Meeting 2019-08-26)
1. UDM (Unified Data Model) – Enabling Exchange of Comprehensive Reaction Information
CINF 58 – 26 August 2019
Frederik van den Broek, Gerd Blanke, Jarek Tomczak, Markus Fischer
3. Drivers and Opportunities of Change
There is a renewed interest in reaction-centric cheminformatics (most of it was done before the 1990s), but this requires standardised data sets:
• Improved reaction searching and navigation
• Reaction similarity and classification
• Improved automatic determination of reaction mapping
• Mechanism elucidation
• Synthetic feasibility
• Retrosynthesis (design and planning)
• Reaction outcome prediction (products, yield, specificity, safety)
4. Drivers and Opportunities of Change
• Making data FAIR
Source: https://www.dtls.nl/fair-data/fair-principles-explained/
5. Ideal World
(Diagram: ELNs 1–4 and robots across Pharma, CRO, Academia, Vendor and Publisher feed, via UDM conversion, into integrated data, public data sources, commercial databases and other systems, supporting analysis, reporting and publication supplements.)
6. Current Situation
Various systems storing reaction information have been available for more than three decades; however:
• No common data model that can comprehensively describe chemical reactions
  • Most of the commercially available databases use models similar to the former ChemInform database distributed by MDL
• No common file format that allows representation of chemical reactions, their conditions and outcomes
  • Common formats:
    • RXN/RD files, originally created by MDL in the mid-1980s
    • PerkinElmer ELN XML
• No reaction drawing standards
  • An IUPAC project, “Graphical Representation Standards for Chemical Reaction Diagrams”, was restarted by Keith Taylor in 2017
7. Challenges
• Collaboration and data exchange with partners and CROs using different ELNs is difficult
• Integration, comparison and analysis of reaction data from various sources is very laborious
• The lack of a common data model makes it difficult to develop and share business rules for consistent representation of reactions and IP capture
• There are very limited open-source/open-data activities around reaction databases and searching
8. UDM Objective
The goal of the UDM project is to create and publish an open, extendable and freely available data format for the exchange of experimental information about compound synthesis and testing.
10. ELN Query Service Definition (Pistoia Alliance)
Scope:
• Define and publish high-level foundation design principles for an ELN data mart and its query services
• Design and implement a prototype version of a synthetic chemistry ELN data mart using information from the Discovery Chemistry workflow
• System-independent query interface and tools, accessed via a published API
• Expand the data model to incorporate data from ELN sources across the life science space – biology, pharmaceutical sciences and analytical sciences
11. UDM Origins
• Roche UDM project (2012–2013) to integrate in-house chemistry data into Elsevier’s Reaxys database
• Further developed by Roche and Elsevier, with contributions from other pharma companies, as a data transfer format for chemical reactions from a variety of ELNs into Elsevier’s Reaxys database
• The originators provided the UDM XML file format to the Pistoia Alliance and are committed to working together to make it more generic and to extend it to other experiment types (October 2017)
• Founders: Roche and Elsevier
12. Roche Key Drivers
From ACS presentation by Michael Kapler, Roche Pharma Research and Early Development
http://abstracts.acs.org/chem/245nm/program/view.php?obj_id=188977
13. Roche integration: Data source overview
From ACS presentation by Michael Kapler, Roche Pharma Research and Early Development
(Diagram: Roche in-house and licensed data sources are exported and unified into Reaxys – “Roche in Reaxys” and “Licensed in Reaxys”.)
14. Others have followed
From Bio-IT World 2019 presentation by Ludovic Otterbein, Director Research Informatics & Operations, Lundbeck
15. UDM – Simplified Data Model
• UDM
  • UDM_VERSION
  • LEGAL
  • CITATIONS
  • MOLECULES
  • REACTIONS
    • VARIATION
      • CONDITIONS
        • CONDITION_GROUP
      • PREPARATION
      • REACTANT
      • PRODUCT
      • CATALYST
      • SOLVENT
      • REAGENT
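As an illustration only, the simplified hierarchy can be sketched with Python's standard library. The element names come from the slide; the nesting is a plausible reading of the simplified diagram, not the normative UDM schema:

```python
import xml.etree.ElementTree as ET

# Illustrative skeleton only -- element names are taken from the simplified
# model above; real files must follow the published UDM XML schema.
udm = ET.Element("UDM")
for tag in ("UDM_VERSION", "LEGAL", "CITATIONS", "MOLECULES"):
    ET.SubElement(udm, tag)

# A reaction variation groups the components, preparation and conditions.
variation = ET.SubElement(ET.SubElement(udm, "REACTIONS"), "VARIATION")
for tag in ("REACTANT", "PRODUCT", "CATALYST", "SOLVENT", "REAGENT",
            "PREPARATION"):
    ET.SubElement(variation, tag)
conditions = ET.SubElement(variation, "CONDITIONS")
ET.SubElement(conditions, "CONDITION_GROUP")

print(ET.tostring(udm, encoding="unicode"))
```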
18. UDM Example – MOLECULE
<MOLECULES>
...
<MOLECULE ID="3247633">
<MOLSTRUCTURE><![CDATA[
Mrv0541 05221820572D
HDR
0 0 0 0 0 999 V3000
M V30 BEGIN CTAB
M V30 COUNTS 12 12 0 0 0 REGNO=3247633
M V30 BEGIN ATOM
M V30 1 C 5.39 1.3336 0 0
...
M V30 END CTAB
M END
]]></MOLSTRUCTURE>
<NAME>4-(prop-2-ynyloxy)benzaldehyde</NAME>
</MOLECULE>
...
</MOLECULES>
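A minimal sketch of consuming such a file with Python's standard library, assuming only the element names shown above (`list_molecules` is a hypothetical helper, not part of any official UDM toolkit):

```python
import xml.etree.ElementTree as ET

# Fragment mirroring the example above (molfile body elided).
UDM_FRAGMENT = """<MOLECULES>
  <MOLECULE ID="3247633">
    <MOLSTRUCTURE><![CDATA[...V3000 connection table...]]></MOLSTRUCTURE>
    <NAME>4-(prop-2-ynyloxy)benzaldehyde</NAME>
  </MOLECULE>
</MOLECULES>"""

def list_molecules(xml_text):
    """Return (ID, NAME) pairs for every MOLECULE element."""
    root = ET.fromstring(xml_text)
    return [(m.get("ID"), m.findtext("NAME")) for m in root.iter("MOLECULE")]

print(list_molecules(UDM_FRAGMENT))
# [('3247633', '4-(prop-2-ynyloxy)benzaldehyde')]
```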
20. UDM Project Roadmap
October 2017 – Elsevier donates UDM to Pistoia Alliance
June 2018 – UDM Release 4.0
• Fully compatible with the Elsevier version
• Cleaned-up and documented XML schema
• Added support for units of measure
• Included sample data sets (Reaxys, SPRESI)
• Included conversion tool from SPRESI RD file to UDM
November 2018 – UDM Release 5.0 (Brooklyn)
• Support for various representations of molecular structures and reaction diagrams
• Improved semantics of the model
• New representation of molecular properties
• New properties of reaction components (reactants, products, catalysts, solvents, reagents)
• Improved representation of reaction conditions
• Support for vendor extensions of the model
• Special tags for capturing legal information
• Support for various formats of the PREPARATION section
• Glossary of UDM terms
• Change log
November 2019 – UDM Release 6.0
• Further improvements to the reaction model
• Extended support for analytical data
• Possible BFO-compatible ontology representation + SHACL model
2020 – Planned
• Health and safety data
• Compound testing: screening / DMPK
• Support for galenic formulation development
• Biochemical reactions
• Support for large molecules
• …
21. UDM workshop 9 May 2019
• Workshop at Elsevier in Amsterdam, attended by representatives from pharma, ELN vendors, chemistry content providers and industry experts.
• Outcomes:
  • Confirmed UDM priorities and roadmap for 2019
  • Identified various UDM use cases
  • Identified the need for more sample data sets to improve UDM and its coverage of various synthesis types, especially those not frequently found in the literature
  • Identified various data types that need to be supported by UDM
  • Discussed factors and risks influencing the adoption of UDM – to be largely mitigated by developing an open-source UDM Toolkit (funding applied for; additional donations are welcome)
22. Elsevier UDM Roadmap (2013–2020)
• 2013 – Initial UDM version: developed by Elsevier and Roche to integrate customer reactions
• Jan 2014 – ReaxysPlus: UDM established as the ReaxysPlus import format for customers, with UDM extensions to accommodate customer requirements
• Oct 2017 – UDM goes open source: Pistoia Alliance takes on governance of UDM as an open-source project
• Mar 2018 – Support for the Pistoia Alliance UDM project
• Jun 2018 – UDM 4.0 release: first Pistoia Alliance version
• Nov 2018 – UDM 5.0 release: improved version based on project members’ requirements
• Nov 2018 – Reaxys Reaction Flat File: Reaxys exports single-step reactions as UDM (RDF is part of the offering)
• Apr 2019 – Entellect press release: Elsevier announces a data platform to harmonize proprietary and external data
• Apr 2019 – Scilligence–Reaxys interoperability press release: the Scilligence ELN integrates Reaxys reaction query capabilities
• Reaxys SCI collaboration: Reaxys collaborates with the Società Chimica Italiana to support data-driven chemistry research
• ELN–Reaxys data exchange using UDM: evaluating UDM as a bi-directional exchange format to improve Scilligence ELN–Reaxys interoperability
• Reaxys UDM export/import: implement UDM export for Reaxys and ReaxysPlus reactions; support import for batch searching
• ReaxysPlus support for Pistoia UDM versions: enable reaction ingestion using the Pistoia UDM versions
• Entellect Reaction Workbench: advanced reaction analytics platform; upcoming support for UDM as an import format
• Entellect RMC Workbench: utilize UDM to integrate bioactivity data
• Lower data exchange barriers in academic research: evaluate the use of UDM and ELNs in academic chemistry research in collaboration with the Società Chimica Italiana; KIT open-source ELN adopting UDM; Beilstein investigating the impact of adopting UDM
23. Entellect
(Diagram: scientific data from internal and external sources → ingest & enrich → connect → serve; data types include compound & reaction, assay, -omics, translational, trial and post-market data; capabilities include search & workflow, visualization, and exploratory and predictive analytics, accelerating data-science-driven R&D across chemistry, disease, safety, efficacy, trial, drug and commercial intelligence.)
Entellect is a smart and flexible life sciences platform that powers R&D discovery by using Elsevier’s trusted approach to data integration and harmonization. Entellect delivers connected and AI-ready data by linking and enriching disparate content against established life science taxonomies. Combined with the option of Elsevier data, the result is a scalable knowledge environment enabling exploratory and predictive analytics applications.
24. Benefits of UDM
• Provides improved quality of experimental data
• Supports integration, comparison and analysis of research data from various sources
• Enables collaboration and data exchange with partners and CROs using different ELNs
• Helps in defining and sharing business rules and protocols for consistent representation of experiments and IP capture
• Supports validation of ELN data