Pipeline Pilot Chemistry 9.0 inherits many new chemical representations from the Accelrys Direct data model. These include support for Self Contained Sequence Representation (SCSR) biologics, enhanced Markush structure representations, Markush homology groups, and Non Specific Structures (NONS). Support for Sgroups is also significantly enhanced, in particular for polymers, mixtures, and formulations. Further, Pipeline Pilot depiction has been upgraded to support these enhancements, and the stereochemical and ring perception capabilities have been improved to align with Direct.
The major benefit of these changes is that Direct and Pipeline Pilot now use the same data model. Searches carried out in Direct or in Pipeline Pilot will return identical results and both products will deliver identical structural perceptions. This session will give guidance on how these changes will impact your calculators and models and how you can plan for a smooth upgrade.
(ATS6-DEV06) Using Packages for Protocol, Component, and Application Delivery – BIOVIA
Delivering protocols, components, and applications to users and other developers on an AEP server can be very challenging. Accelrys delivers the majority of its AEP services in the form of packages. This talk will discuss the methods that anyone can use to deliver bundled applications in the form of packages and the benefits of doing so. The discussion will cover creating packages, modifying existing packages, deploying packages to servers, and tools that can be used to ensure package quality.
1. The document discusses Discngine's Tibco Spotfire Pipeline Pilot connector, which allows graphs stored in Pipeline Pilot to be accessed and visualized in Spotfire.
2. It describes the architecture of the connector and how it executes Pipeline Pilot protocols to generate HTML pages for visualization in Spotfire.
3. Challenges in integrating the large Spotfire API and synchronizing client and server datasets are also discussed.
(ATS6-PLAT07) Managing AEP in an enterprise environment – BIOVIA
Deployments can range from personal laptop usage to large enterprise environments. The installer allows both interactive and unattended installations. Key folders include Users for individual data, Jobs for temporary execution data, Shared Public for shared resources, and XMLDB for the database. Logs record job executions, authentication events, and errors. Tools like DbUtil allow backup/restore of data, pkgutil creates packages for application delivery, and regress enables test automation. Planning folder locations and maintenance is important for managing resources in an enterprise environment.
(ATS6-PLAT09) Deploying Applications on load balanced AEP servers for high av... – BIOVIA
This document discusses deploying Accelrys Enterprise Platform (AEP) servers in a load balanced configuration for high availability. It recommends using a staging server to test configurations before deploying to production nodes. All nodes should be configured identically and share storage. A load balancer should be configured to distribute traffic evenly across nodes. Applications need to be packaged and deployed identically to each node to ensure consistency across the load balanced farm. Load balancing improves availability, scalability and performance but requires additional infrastructure and configuration.
(ATS3-PLAT07) Pipeline Pilot Protocol Tips, Tricks, and Challenges – BIOVIA
This document provides tips and tricks for using Pipeline Pilot, including how to use protocol search, favorites bar, tool tips, component profiling, design mode, protocol recovery, recursion vs looping, merge/join operations, debugging tips, and RTC subprotocols. It emphasizes best practices like avoiding loops and using recursion instead. Design mode and checkpoints are highlighted as useful debugging aids. Resources like training, support, and the user community are recommended for additional help.
ScienceCloud: Collaborative Workflows in Biologics R&D – BIOVIA
The life sciences industry has undergone dramatic changes and effective global collaboration has become a key success factor in this new age. BIOVIA is providing a hosted and comprehensive solution stack for externalized, collaborative research for pharma/biotech and CROs to address these new challenges. Recently we added the support for biologics data management and IP capture. In this talk we will present collaborative and comprehensive capabilities in antibody characterization and development: capabilities to analyze, annotate and predict developability as part of a framework that facilitates secure data sharing and collaboration.
(ATS4-PLAT04) Chemistry Data Model Enhancements in Pipeline Pilot 9.0: what a... – BIOVIA
Pipeline Pilot Chemistry is inheriting many new chemical representations from the Accelrys Direct data model. These include support for Self Contained Sequence Representation (SCSR) biologics, enhanced Markush structure representations, Markush homology groups, and Non Specific Structures (NONS). PPChem's current support for Sgroups, in particular for polymers, mixtures, and formulations, is significantly enhanced.
Depiction is also being upgraded to support these new features and enhancements, and the general aesthetics of depiction are improved.
The stereochemical and ring perception capabilities of the Direct data model are superior to those currently offered in PPChem and are being incorporated. Changes to stereochemistry and ring perception may impact your calculators and models. We will report our findings on the magnitude of the changes.
A major benefit of these changes is that Direct and PPChem will use the same data model. This means that searches carried out in Direct or in PPChem will return identical results and both products will deliver identical structural perceptions.
The influence of data curation on QSAR Modeling – Presented at American Chemi... – Kamel Mansouri
This presentation examined the impact of data quality on the construction of QSAR models being developed within the EPA's National Center for Computational Toxicology. We have developed a public-facing platform to provide access to predictive models. As part of the work we have attempted to disentangle the influence of the quality versus quantity of data available to develop and validate QSAR models. This abstract does not reflect U.S. EPA policy.
The purpose of this webinar is to highlight GSK's approach to:
- create a simple, mechanistically descriptive model
- verify its utility with clarity of objectives, and
- communicate understanding via creative but aligned metrics
... for a challenging chemical reaction.
This is a presentation by Prof. Anne Elster at the International Workshop on Open Source Supercomputing held in conjunction with the 2017 ISC High Performance Computing Conference.
This is a little presentation I gave to Roald Hoffmann's group at Cornell. What are the industrial applications of computational chemistry? How do people work differently in academia vs. industry? What are the sorts of things students should think about if they plan to work in the corporate world?
UDM (Unified Data Model) - Enabling Exchange of Comprehensive Reaction Inform... – Frederik van den Broek
Slides from my talk at the ACS CINF Symposium on Chemical Nomenclature & Representation on 26 August 2019 in San Diego.
Abstract:
The first edition of the Beilstein Handbook of Organic Chemistry was published nearly 140 years ago. Electronic laboratory notebooks have been in use in chemistry for almost 20 years. And the life science industry still doesn't have a well-defined way of capturing and exchanging information about chemical reactions and relies on imprecise or vendor-specific data formats. Without a common language and structure to describe experiments, data integration is unnecessarily expensive and a significant part of published data has not been readily available for processing or analysis.
The Unified Data Model (UDM) project team aims to improve the situation. UDM is a collective effort of vendors and life science organizations to create an open, extendable, and freely available reference model and data format for the exchange of experimental information about compound synthesis and testing. Run under the umbrella of the Pistoia Alliance, the project team has published two releases of the UDM data format, and the model is expected to continue to improve as demand dictates, in step with the industry community's adoption of the Pistoia Alliance's FAIR data implementation.
The importance of data curation on QSAR Modeling: PHYSPROP open data as a cas... – Kamel Mansouri
This presentation highlighted how data curation impacts the reliability of QSAR models. We examined key datasets related to environmental endpoints, validating across chemical structure representations (e.g., mol file and SMILES) and identifiers (chemical names and registry numbers), and approaches to standardize data into QSAR-ready formats prior to modeling. This allowed us to quantify and segregate data into quality categories, improving our ability to evaluate the models that can be developed from these data slices and to quantify the extent to which efforts to develop high-quality datasets pay off in predictive performance. The most accurate models that we build will be accessible via our public-facing platform and will be used for screening and prioritizing chemicals for further testing.
Researchers at EPA’s National Center for Computational Toxicology integrate advances in biology, chemistry, and computer science to examine the toxicity of chemicals and help prioritize chemicals for further research based on potential human health risks. The goal of this research program is to quickly evaluate thousands of chemicals, but at a much reduced cost and shorter time frame relative to traditional approaches. The data generated by the Center includes characterization of thousands of chemicals across hundreds of high-throughput screening assays, consumer use and production information, pharmacokinetic properties, literature data, physical-chemical properties as well as the predictive computational modeling of toxicity and exposure. We have developed a number of databases and applications to deliver the data to the public, academic community, industry stakeholders, and regulators. This presentation will provide an overview of our work to develop an architecture that integrates diverse large-scale data from the chemical and biological domains, our approaches to disseminate these data, and the delivery of models supporting predictive computational toxicology. In particular, this presentation will review our new publicly-accessible CompTox Dashboard as the first application built on our newly developed architecture. This abstract does not reflect U.S. EPA policy.
The document describes the Collaborative Estrogen Receptor Activity Prediction Project (CERAPP) which aims to use structure-based models to predict estrogen receptor (ER) activity for chemicals that may be present in the environment or drinking water. Over 10,000 chemicals were identified for screening through the Endocrine Disruptor Screening Program. CERAPP will run structure-based models on these chemicals to prioritize them for further testing, since the models can be run on large numbers of chemicals more efficiently than traditional testing methods. Multiple participants developed models and the results will be combined through a consensus modeling approach to generate the most accurate predictions.
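The consensus step described above can be sketched in a few lines. This is a simplified illustration of majority-vote consensus, not CERAPP's actual combination scheme; the chemical names and per-model calls are hypothetical.

```python
# Toy consensus modeling: each participant model makes a call per
# chemical, the consensus is the majority vote, and the fraction of
# agreeing models serves as a crude confidence measure.
from collections import Counter

def consensus(predictions):
    """predictions: per-model calls for one chemical, e.g.
    ['active', 'inactive', 'active']. Returns (call, agreement)."""
    counts = Counter(predictions)
    call, votes = counts.most_common(1)[0]
    return call, votes / len(predictions)

# Hypothetical per-model calls for two chemicals.
models = {"chemA": ["active", "active", "inactive"],
          "chemB": ["inactive", "inactive", "inactive"]}
for chem, preds in models.items():
    print(chem, consensus(preds))
# chemA is called active by a 2/3 majority; chemB is unanimously inactive.
```

Real consensus approaches typically also weight models by validated performance and restrict predictions to each model's applicability domain.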
The Royal Society of Chemistry provides access to a number of databases hosting chemicals data, reactions, spectroscopy data and prediction services. These databases and services can be accessed via web services using queries in standard data formats such as InChI and molfiles. Data can then be downloaded in standard structure and spectral formats allowing for reuse and repurposing. The ChemSpider database integrates with a number of projects external to RSC, including Open PHACTS, which integrates chemical and biological data. This project utilizes semantic web data standards including RDF. This presentation will provide an overview of how structure and spectral data standards have been critical in allowing us to integrate many open source tools, have eased integration with a myriad of services, and underpin many of our future developments.
The Royal Society of Chemistry hosts large scale data collections and provides access to the data to the chemistry community. The largest RSC data set of wide scale interest to the community offers access to tens of millions of compounds. The host platform, ChemSpider, is limited in that it is a structure-centric hub only. A new architecture, the RSC data repository, has been developed that extends support to reactions, spectral data, crystallography data and related property data. It is also the architecture underlying a series of exemplar projects for managing data for a number of diverse laboratories. The adoption of data standards for the integration and distribution of data has been essential. Specific standards include molecular structure formats such as molfiles and InChIs, and spectral data formats such as JCAMP. This presentation will report on our development of the data repository, the importance of utilizing standards for data integration, the flexible nature of the architecture to deliver solutions for various laboratories, and our efforts to develop new large data collections. This includes text-mining efforts to extract large spectrum-structure collections from large corpora.
The document provides information on charged aerosol detection (CAD) technology. It discusses the evolution of CAD products, how CAD works, comparisons to other detection methods like ELSD, and example applications. Key points covered include how CAD provides a uniform response for analytes independent of chemical structure, its wide dynamic range of up to four orders of magnitude, and how it can detect both non-volatile and semi-volatile compounds on HPLC and UHPLC systems without the need for reference standards.
Webinar: What's New in Pipeline Pilot 8.5 Collection Update 1? – BIOVIA
The document discusses new features in Pipeline Pilot 8.5 Collection Update 1. It introduces protocol comparison capabilities and updates to the documents and text collection, including new visualization and search components. The Accelrys Query Service allows unified searching across data sources. Imaging components now include curvature analysis and color deconvolution. The NGS collection includes performance updates and new viewers. Additional resources and services are available to assist with the upgrade.
An examination of data quality on QSAR Modeling in regards to the environment... – Kamel Mansouri
The development of QSAR models is critically dependent on the quality of available data. As part of our efforts to develop public platforms to provide access to predictive models, we have attempted to discriminate the influence of the quality versus quantity of data available to develop and validate QSAR models. We have focused our efforts on the widely used EPISuite software that was initially developed over two decades ago and, specifically, on the PHYSPROP dataset used to train the EPISuite prediction models. This presentation will review our approaches to examining key datasets, the delivery of curated data and the development of machine-learning models for thirteen separate property endpoints of interest to environmental science. We will also review how these data will be made freely accessible to the community via a new “chemistry dashboard”. This abstract does not reflect U.S. EPA policy.
USUGM 2014 - Brett Hiemenz (GlaxoSmithKline): From Desktop to Browser - ChemA... – ChemAxon
Since its initial adoption of ChemAxon tools in 2009, GlaxoSmithKline has followed a roadmap moving from desktop tools to web-based enterprise solutions. This presentation will touch on previous milestones, provide an update on the current activities, and describe the alignment of GSK with the ChemAxon vision for the Plexus Discovery suite.
A de facto standard or a free-for-all? A benchmark for reading SMILES – NextMove Software
The document discusses a benchmark for evaluating how accurately different cheminformatics toolkits can read SMILES strings representing stereochemistry, implicit hydrogens, and aromatic systems. The author tested 15 toolkits on test cases examining cis-trans stereochemistry, implicit valence as defined by the SMILES specification, and their ability to consistently interpret the aromatic nature and hydrogen counts of molecules represented by SMILES strings. While stereochemistry is generally handled well, adherence to the SMILES valence model and consistent aromatic interpretation vary more between tools. The benchmark aims to identify such differences and facilitate improvements to interoperability.
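The implicit-valence rule that the benchmark probes can be sketched in a few lines. This is a simplified reading of the Daylight SMILES valence model for neutral organic-subset atoms only (no charges, no aromatic bookkeeping), with the default valence lists taken from the specification:

```python
# SMILES implicit-valence rule: an organic-subset atom receives enough
# implicit hydrogens to reach the smallest default valence that is >=
# the sum of its explicit bond orders. Hypervalent atoms get none.
DEFAULT_VALENCES = {
    "B": (3,), "C": (4,), "N": (3, 5), "O": (2,),
    "P": (3, 5), "S": (2, 4, 6),
    "F": (1,), "Cl": (1,), "Br": (1,), "I": (1,),
}

def implicit_hydrogens(element, bond_order_sum):
    """Implicit H count for a neutral organic-subset atom."""
    for valence in DEFAULT_VALENCES[element]:
        if bond_order_sum <= valence:
            return valence - bond_order_sum
    return 0  # already beyond the largest default valence

print(implicit_hydrogens("C", 0))  # 4: bare 'C' is methane
print(implicit_hydrogens("N", 4))  # 1: bond sum 4 rounds up to valence 5
print(implicit_hydrogens("S", 5))  # 1: bond sum 5 rounds up to valence 6
```

Divergence between toolkits tends to appear exactly where this rule interacts with charges and aromatic atoms, which is what the benchmark's test cases exercise.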
This document describes a biotech startup's transition from various disconnected systems such as Excel, PowerPoint, and paper notebooks to the Dotmatics informatics platform. The startup implemented Dotmatics for compound registration and searching, for data analysis that cut the time spent from entire afternoons to 30 minutes, and for electronic chemistry notebooks that enable searching and collaboration. Dotmatics gave the startup a single-vendor solution that integrated all of its informatics needs, was easy to deploy across platforms, and enabled collaboration with researchers in China.
Reaxys provides a unified information portal that integrates data from multiple chemistry sources through a single interface. It links chemistry data, structures, citations, and full-text articles. Reaxys also integrates in-house data from sources like electronic lab notebooks through its API and can be used for activities like compound screening, literature searching, and patent analysis to support drug discovery.
Webinar - Pharmacopeial Modernization: How Will Your Chromatography Workflow ... – Waters Corporation
In this webinar, Dr. Leonel Santos and Dr. Horacio Pappa from the United States Pharmacopeia (USP) will provide an overview of its pharmacopeial harmonization and modernization efforts. The pair will also review changes described in the pending USP General Chapter <621> on liquid chromatography (LC), which will provide increased flexibility for gradient methods.
Amanda Dlugasch, from Waters Corporation, will follow with an illuminating case study, which leverages USP <621> allowable adjustments to illustrate the benefits of modernizing methods, including migrating HPLC methods to UHPLC or UPLC, without the need to revalidate.
Topics covered in this webinar will include:
- Pharmacopeial monograph modernization prioritization scheme
- Review of USP General Chapter <621> current allowable adjustments to validated chromatographic methods and forthcoming updates
- Case study on the migration of isocratic and gradient pharmacopeial methods to modern chromatography column technology, highlighting improved method performance and throughput
Replay the webinar, hosted by SelectScience:
https://www.selectscience.net/webinars/pharmacopeial-modernization-how-will-your-chromatography-workflow-benefit/?webinarID=1228
The ACCESS-Optimization Project is a 3-year effort between NCI, BoM and Fujitsu to optimize and scale up climate and earth system models run on NCI infrastructure. The project aims to address performance and scalability issues, assist with future HPC procurements, and contribute to model development with a focus on performance. Current work involves profiling applications, constructing and testing higher resolution configurations, and reporting on workflow and scalability issues for future weather and climate applications. Methodologies used include tools for performance analysis and scaling tests. Areas of work include high resolution models of the ocean, atmosphere and coupled climate system, as well as data assimilation procedures. Deliverables to date include porting the ocean model to the new
This document discusses an empirical study of RDF stream processing systems. The study aimed to understand why different systems can produce different outputs for the same inputs. Through experiments, the study found that differences could be explained by parameters like the starting time (t0) of windows in continuous queries. A more detailed model called SECRET was then developed to describe stream processing and help predict system outputs. This led to the CSR-bench benchmark for evaluating and comparing RDF stream reasoning systems.
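The t0 effect identified in the study can be shown with a toy tumbling-window partitioner. This is a generic sketch, not any engine's API or the SECRET model itself; the event stream is hypothetical.

```python
# Why window start time t0 changes a stream query's output: the same
# events partition into different tumbling windows depending on where
# the window boundaries are anchored.
def tumbling_windows(events, width, t0):
    """Group (timestamp, value) events into tumbling windows of the
    given width, with boundaries anchored at t0."""
    windows = {}
    for ts, value in events:
        index = (ts - t0) // width  # which window this event falls in
        windows.setdefault(index, []).append(value)
    return [windows[i] for i in sorted(windows)]

events = [(1, "a"), (2, "b"), (3, "c"), (4, "d")]
print(tumbling_windows(events, width=2, t0=0))  # [['a'], ['b', 'c'], ['d']]
print(tumbling_windows(events, width=2, t0=1))  # [['a', 'b'], ['c', 'd']]
```

Two engines that are both "correct" but anchor t0 differently will emit different window contents for identical inputs, which is exactly the kind of divergence the study set out to explain.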
How to get the maximum performance from your AEP server: this talk will discuss ways to improve the execution time of short-running jobs and how to properly configure the server depending on the expected number of users as well as the average size and duration of individual jobs. Included will be examples of making use of job pooling, database connection sharing, and parallel subprotocol tuning. Determining when to make use of cluster, grid, or load balanced configurations, along with memory and CPU sizing guidelines, will also be discussed.
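The connection-sharing idea can be illustrated generically. This is a minimal sketch of a connection pool, not the AEP configuration or API; the connection factory is a hypothetical stand-in for a real database connect call.

```python
# Why connection sharing speeds up short-running jobs: opening a
# database connection is expensive, so jobs borrow an already-open
# connection from a shared pool instead of opening their own.
import queue

class ConnectionPool:
    def __init__(self, factory, size):
        self._pool = queue.Queue()
        for _ in range(size):
            self._pool.put(factory())  # open connections up front

    def acquire(self):
        return self._pool.get()        # blocks if all are in use

    def release(self, conn):
        self._pool.put(conn)           # hand back for the next job

# Hypothetical factory; a real one would return a live DB connection.
pool = ConnectionPool(factory=lambda: object(), size=1)
c1 = pool.acquire()
pool.release(c1)
c2 = pool.acquire()
print(c1 is c2)  # True: the same open connection is reused
```

Job pooling follows the same principle one level up: keeping warm worker processes avoids per-job startup cost, which dominates for short-running jobs.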
(ATS6-PLAT05) Security enhancements in AEP 9 – BIOVIA
In the latest version of the Accelrys Enterprise Platform we have streamlined how permissions are managed and added the capability for packages to define groups and permission sets. In addition, enhancements have been made to File Based Authentication, we have added support for enterprise authentication solutions like Kerberos and SAML and improved the usability of the Administration Portal. This session describes the new features and how to manage them through the Administration Portal.
Similar to (ATS6-PLAT01) Chemistry Harmonization: Bringing together the Direct 9 and Pipeline Pilot Chemistry Data Models
The purpose of this webinar is to highlight GSK's approach to:
- create a simple, mechanistically descriptive model
- verify its utility with clarity of objectives, and
- communicate understanding via creative but aligned metrics
... for a challenging chemical reaction.
This is a presentation by Prof. Anne Elster at the International Workshop on Open Source Supercomputing held in conjunction with the 2017 ISC High Performance Computing Conference.
This is a little presentation I gave to Roald Hoffmann's group at Cornell. What are the industrial applications of computational chemistry? How to people work differently in academia vs. industry? What are the sorts of things students should think about if they plan to work in the corporate world?
UDM (Unified Data Model) - Enabling Exchange of Comprehensive Reaction Inform...Frederik van den Broek
Slides from my talk at the ACS CINF Symposium on Chemical Nomenclature & Representation on 26 August 2019 in San Diego.
Abstract:
The first edition of the Beilstein Handbook of Organic Chemistry was published nearly 140 years ago. Electronic laboratory notebooks have been in use in chemistry for almost 20 years. And the life science industry still doesn't have a well-defined way of capturing and exchanging information about chemical reactions and relies on imprecise or vendor-specific data formats. Without a common language and structure to describe experiments, data integration is unnecessarily expensive and a significant part of published data has not been readily available for processing or analysis.
The Unified Data Model (UDM) project team aims to improve the situation. UDM is a collective effort of vendors and life science organizations to create an open, extendable and freely available reference model and data format for exchange of experimental information about compound synthesis and testing. Run under the umbrella of the Pistoia Alliance, the project team has published two releases of the UDM data format and it is expected that the model will continue to be improved as demand stipulates working with the Pistoia FAIR data implementation by industry community.
The importance of data curation on QSAR Modeling: PHYSPROP open data as a cas...Kamel Mansouri
This presentation highlighted how data curation impacts the reliability of QSAR models. We examined key datasets related to environmental endpoints to validate across chemical structure representations (e.g., mol file and SMILES) and identifiers (chemical names and registry numbers), and approaches to standardize data into QSAR-ready formats prior to modeling procedures. This allowed us to quantify and segregate data into quality categories. This improved our ability to evaluate the resulting models that can be developed from these data slices, and to quantify to what extent efforts developing high-quality datasets have the expected pay-off in terms of predicting performance. The most accurate models that we build will be accessible via our public-facing platform and will be used for screening and prioritizing chemicals for further testing.
Researchers at EPA’s National Center for Computational Toxicology integrate advances in biology, chemistry, and computer science to examine the toxicity of chemicals and help prioritize chemicals for further research based on potential human health risks. The goal of this research program is to quickly evaluate thousands of chemicals, but at a much reduced cost and shorter time frame relative to traditional approaches. The data generated by the Center includes characterization of thousands of chemicals across hundreds of high-throughput screening assays, consumer use and production information, pharmacokinetic properties, literature data, physical-chemical properties as well as the predictive computational modeling of toxicity and exposure. We have developed a number of databases and applications to deliver the data to the public, academic community, industry stakeholders, and regulators. This presentation will provide an overview of our work to develop an architecture that integrates diverse large-scale data from the chemical and biological domains, our approaches to disseminate these data, and the delivery of models supporting predictive computational toxicology. In particular, this presentation will review our new publicly-accessible CompTox Dashboard as the first application built on our newly developed architecture. This abstract does not reflect U.S. EPA policy.
The document describes the Collaborative Estrogen Receptor Activity Prediction Project (CERAPP) which aims to use structure-based models to predict estrogen receptor (ER) activity for chemicals that may be present in the environment or drinking water. Over 10,000 chemicals were identified for screening through the Endocrine Disruptor Screening Program. CERAPP will run structure-based models on these chemicals to prioritize them for further testing, since the models can be run on large numbers of chemicals more efficiently than traditional testing methods. Multiple participants developed models and the results will be combined through a consensus modeling approach to generate the most accurate predictions.
The Royal Society of Chemistry provides access to a number of databases hosting chemicals data, reactions, spectroscopy data and prediction services. These databases and services can be accessed via web services, with queries expressed in standard data formats such as InChI and molfiles. Data can then be downloaded in standard structure and spectral formats, allowing for reuse and repurposing. The ChemSpider database integrates with a number of projects external to the RSC, including Open PHACTS, which integrates chemical and biological data and utilizes semantic web data standards including RDF. This presentation will provide an overview of how structure and spectral data standards have been critical in allowing us to integrate many open source tools, ease integration with a myriad of services, and underpin many of our future developments.
The Royal Society of Chemistry hosts large scale data collections and provides access to the data to the chemistry community. The largest RSC data set of wide interest to the community offers access to tens of millions of compounds. The host platform, ChemSpider, is limited in that it is a structure-centric hub only. A new architecture, the RSC data repository, has been developed that extends support to reactions, spectral data, crystallography data and related property data. It is also the architecture underlying a series of exemplar projects for managing data for a number of diverse laboratories. The adoption of data standards for the integration and distribution of data has been essential. Specific standards include molecular structure formats such as molfiles and InChIs, and spectral data formats such as JCAMP. This presentation will report on our development of the data repository, the importance of utilizing standards for data integration, the flexible nature of the architecture to deliver solutions for various laboratories and our efforts to develop new large data collections. This includes text-mining efforts to extract large spectrum-structure collections from large corpora.
The document provides information on charged aerosol detection (CAD) technology. It discusses the evolution of CAD products, how CAD works, comparisons to other detection methods like ELSD, and example applications. Key points covered include how CAD provides a uniform response for analytes independent of chemical structure, its wide dynamic range of up to four orders of magnitude, and how it can detect both non-volatile and semi-volatile compounds on HPLC and UHPLC systems without the need for reference standards.
Webinar: What's New in Pipeline Pilot 8.5 Collection Update 1? - BIOVIA
The document discusses new features in Pipeline Pilot 8.5 Collection Update 1. It introduces protocol comparison capabilities and updates to the documents and text collection, including new visualization and search components. The Accelrys Query Service allows unified searching across data sources. Imaging components now include curvature analysis and color deconvolution. The NGS collection includes performance updates and new viewers. Additional resources and services are available to assist with the upgrade.
An examination of data quality on QSAR Modeling in regards to the environment... - Kamel Mansouri
The development of QSAR models is critically dependent on the quality of available data. As part of our efforts to develop public platforms to provide access to predictive models, we have attempted to discriminate the influence of the quality versus quantity of data available to develop and validate QSAR models. We have focused our efforts on the widely used EPISuite software that was initially developed over two decades ago and, specifically, on the PHYSPROP dataset used to train the EPISuite prediction models. This presentation will review our approaches to examining key datasets, the delivery of curated data and the development of machine-learning models for thirteen separate property endpoints of interest to environmental science. We will also review how these data will be made freely accessible to the community via a new “chemistry dashboard”. This abstract does not reflect U.S. EPA policy.
USUGM 2014 - Brett Hiemenz (GlaxoSmithKline): From Desktop to Browser - ChemA... - ChemAxon
Since its initial adoption of ChemAxon tools in 2009, GlaxoSmithKline has followed a roadmap moving from desktop tools to web-based enterprise solutions. This presentation will touch on previous milestones, provide an update on the current activities, and describe the alignment of GSK with the ChemAxon vision for the Plexus Discovery suite.
A de facto standard or a free-for-all? A benchmark for reading SMILES - NextMove Software
The document discusses a benchmark for evaluating how accurately different cheminformatics toolkits can read SMILES strings representing stereochemistry, implicit hydrogens, and aromatic systems. The author tested 15 toolkits on test cases examining cis-trans stereochemistry, implicit valence as defined by the SMILES specification, and their ability to consistently interpret the aromatic nature and hydrogen counts of molecules represented by SMILES strings. While stereochemistry is generally handled well, adherence to the SMILES valence model and consistent aromatic interpretation vary more between tools. The benchmark aims to identify such differences and facilitate improvements to interoperability.
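The implicit-valence behavior the benchmark probes can be illustrated with a minimal sketch of the rule from the SMILES specification: an organic-subset atom written without brackets receives enough implicit hydrogens to reach the smallest "normal" valence at or above its explicit bond-order sum. This is a toy model, not any toolkit's actual implementation, and the valence table below covers only the organic subset.

```python
# Default valences for the SMILES organic subset (per the Daylight spec).
DEFAULT_VALENCES = {
    "B": (3,), "C": (4,), "N": (3, 5), "O": (2,),
    "P": (3, 5), "S": (2, 4, 6), "F": (1,),
    "Cl": (1,), "Br": (1,), "I": (1,),
}

def implicit_hydrogens(element: str, bond_order_sum: int) -> int:
    """Implicit H count for an unbracketed SMILES atom: fill up to the
    smallest default valence that is >= the explicit bond-order sum."""
    for valence in DEFAULT_VALENCES[element]:
        if bond_order_sum <= valence:
            return valence - bond_order_sum
    return 0  # beyond the largest listed valence: no implicit hydrogens

# 'C' alone is methane: 4 implicit hydrogens.
print(implicit_hydrogens("C", 0))  # -> 4
# Nitrogen with four explicit bonds jumps to the next valence (5): 1 H.
print(implicit_hydrogens("N", 4))  # -> 1
# Sulfur with five explicit bonds fills to valence 6: 1 H.
print(implicit_hydrogens("S", 5))  # -> 1
```

Toolkits that diverge from this rule (for example, by capping at the lowest valence only) will report different hydrogen counts for the same SMILES string, which is exactly the kind of discrepancy the benchmark surfaces.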
This document describes a biotech startup's transition from using various disconnected systems like Excel, PowerPoint and paper notebooks to using the Dotmatics informatics platform. The startup implemented Dotmatics for compound registration and searching, for data analysis (reducing the time spent from entire afternoons to 30 minutes), and for electronic chemistry notebooks that enable searching and collaboration. Dotmatics provided the startup with a single-vendor solution that integrated all of their informatics needs, was easy to deploy across platforms, and enabled collaboration with researchers in China.
Reaxys provides a unified information portal that integrates data from multiple chemistry sources through a single interface. It links chemistry data, structures, citations, and full-text articles. Reaxys also integrates in-house data from sources like electronic lab notebooks through its API and can be used for activities like compound screening, literature searching, and patent analysis to support drug discovery.
Webinar - Pharmacopeial Modernization: How Will Your Chromatography Workflow ... - Waters Corporation
In this webinar, Dr. Leonel Santos and Dr. Horacio Pappa from the United States Pharmacopeia (USP) will provide an overview of its pharmacopeial harmonization and modernization efforts. The pair will also review changes described in the pending USP General Chapter <621> on liquid chromatography (LC), which will provide increased flexibility for gradient methods.
Amanda Dlugasch, from Waters Corporation, will follow with an illuminating case study, which leverages USP <621> allowable adjustments to illustrate the benefits of modernizing methods, including migrating HPLC methods to UHPLC or UPLC, without the need to revalidate.
Topics covered in this webinar will include:
- Pharmacopeial monograph modernization prioritization scheme
- Review of USP General Chapter <621> current allowable adjustments to validated chromatographic methods and forthcoming updates
- Case study on the migration of isocratic and gradient pharmacopeial methods to modern chromatography column technology, highlighting improved method performance and throughput
Replay the webinar, hosted by SelectScience:
https://www.selectscience.net/webinars/pharmacopeial-modernization-how-will-your-chromatography-workflow-benefit/?webinarID=1228
The ACCESS-Optimization Project is a 3-year effort between NCI, BoM and Fujitsu to optimize and scale up climate and earth system models run on NCI infrastructure. The project aims to address performance and scalability issues, assist with future HPC procurements, and contribute to model development with a focus on performance. Current work involves profiling applications, constructing and testing higher resolution configurations, and reporting on workflow and scalability issues for future weather and climate applications. Methodologies used include tools for performance analysis and scaling tests. Areas of work include high resolution models of the ocean, atmosphere and coupled climate system, as well as data assimilation procedures. Deliverables to date include porting the ocean model to the new
This document discusses an empirical study of RDF stream processing systems. The study aimed to understand why different systems can produce different outputs for the same inputs. Through experiments, the study found that differences could be explained by parameters like the starting time (t0) of windows in continuous queries. A more detailed model called SECRET was then developed to describe stream processing and help predict system outputs. This led to the CSR-bench benchmark for evaluating and comparing RDF stream reasoning systems.
Similar to (ATS6-PLAT01) Chemistry Harmonization: Bringing together the Direct 9 and Pipeline Pilot Chemistry Data Models
This session covers how to get the maximum performance from your AEP server. It will discuss ways to improve the execution time of short-running jobs and how to properly configure the server depending on the expected number of users as well as the average size and duration of individual jobs. Examples will include job pooling, database connection sharing, and parallel subprotocol tuning. Determining when to make use of cluster, grid, or load-balanced configurations, along with memory and CPU sizing guidelines, will also be discussed.
(ATS6-PLAT05) Security enhancements in AEP 9 - BIOVIA
In the latest version of the Accelrys Enterprise Platform we have streamlined how permissions are managed and added the capability for packages to define groups and permission sets. In addition, enhancements have been made to File Based Authentication, we have added support for enterprise authentication solutions like Kerberos and SAML and improved the usability of the Administration Portal. This session describes the new features and how to manage them through the Administration Portal.
The Query Service is the new platform solution for querying a variety of data sources. The goal of Query Service is that administrators can configure a metadata description of the data source that can then be used by end users without detailed knowledge of the underlying data source. This session explains how to configure Query Service data sources and use them with the RESTful API or component collection.
(ATS6-PLAT02) Accelrys Catalog and Protocol Validation - BIOVIA
Accelrys Catalog is a powerful new technology for creating an index of the protocols and components within your organization. You will learn about strategies for indexing and how search capabilities can be deployed to professional client and Web Port end users. You will also learn how to use this technology to find out about system usage to aid with system upgrades, server consolidations, and general system maintenance. The protocol validation capability in the admin portal allows administrators to create standard reports on server usage characteristics. You will learn how to report on violations of IT policies (e.g. around security), bad protocol authoring practices, or missing or incomplete protocol documentation. Developers will also learn how to extend and customize the rules used to create these reports.
(ATS6-GS04) Performance Analysis of Accelrys Enterprise Platform 9.0 on IBM’s... - BIOVIA
IBM recently completed a benchmarking study of several key modules of the Accelrys Enterprise Platform (AEP) 9.0, using IBM’s iDataPlex and General Parallel File System (GPFS). The results show that the performance of IO intensive workloads, such as Next Generation Sequencing (NGS), can be improved significantly by using GPFS. NGS workloads can also benefit from better load balancing implemented on AEP 9.0. Best practices for scalable IT solutions will also be discussed.
This document outlines an integration between the Contur and HEOS software. The integration is focused on allowing scientists to record experimental data in Contur and register compounds to HEOS as part of their workflow. It describes the Contur REST API and protocol execution framework that can extract and create Contur content. It also describes the HEOS SOAP API that can extract and create content in HEOS, including registering compounds. Components and protocols are provided that use these APIs to facilitate transferring data directly from Contur experiments to HEOS compound registration, without needing to re-enter information, in order to save time and reduce errors.
This document contains an agenda for a two-day Accelrys software development event with over 50 registered attendees from partner companies like BT, Discngine, and IBM. On day one, there are sessions on new features and improvements to various Accelrys products, like Direct and PPChem. There are also sessions on deploying products like Discoverant and using collections. Day two focuses on roadmaps for products like LIMS and ELN. Additional sessions discuss maximizing performance, deployment strategies, and integration. The event aims to provide information to help attendees improve their ability to use Accelrys products.
(ATS6-DEV09) Deep Dive into REST and SOAP Integration for Protocol Authors - BIOVIA
Pipeline Pilot has always had a strong focus on integration with external resources. In AEP 9.0 we continue this tradition with a major overhaul of our SOAP Connector component as well as improved support for RESTful services. In this talk we will look at how to build protocols that access SOAP services, especially secured services, and review the approach to accessing RESTful services.
(ATS6-DEV08) Integrating Contur ELN with other systems using a RESTful API - BIOVIA
In order to enable easy integration between Contur ELN and other informatics systems a RESTful API has been developed. Data may be extracted from ELN experiments using GET calls, but external applications can also insert results directly into the ELN record. In particular the API can be used with Accelrys Enterprise Platform to create complex flows for resolving scientific problems. Such protocols may be launched from within the ELN client.
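The GET-to-extract, POST-to-insert pattern described above can be sketched as follows. The resource paths, host, and JSON field names here are illustrative assumptions, not the documented Contur ELN API; consult the actual API reference for real endpoint names.

```python
import json

# Hypothetical base URL for a Contur ELN server's RESTful API.
BASE = "https://eln.example.com/api"

def experiment_url(experiment_id: str) -> str:
    """URL for a GET call that extracts one experiment's data."""
    return f"{BASE}/experiments/{experiment_id}"

def result_payload(section: str, values: dict) -> str:
    """JSON body an external application could POST to insert results
    directly into the ELN record (field names are assumptions)."""
    return json.dumps({"section": section, "results": values})

print(experiment_url("EXP-0042"))
payload = result_payload("Analysis", {"yield_percent": 87.5})
print(json.loads(payload)["results"]["yield_percent"])
```

In practice these URLs and bodies would be passed to an HTTP client (or to AEP's HTTP connector components) with appropriate authentication; the sketch only shows how the request shapes compose.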
(ATS6-DEV07) Building widgets for ELN home page - BIOVIA
From a developer’s perspective, the Accelrys ELN Home Page is a container of widgets. It manages the layout of widgets, and handles the persistence of their settings. Several widgets are provided with the application: one for creating new experiments, another for tracking work in progress, and an inbox widget for messages sent through the notebook. This out-of-the-box set can be supplemented by building custom widgets.
This session will show several custom widgets examples to demonstrate the basic concepts of widget development and the API they implement. We will also discuss best practices, and how to make your widget a good citizen of the Home Page.
(ATS6-DEV05) Building Interactive Web Applications with the Reporting Collection - BIOVIA
The document discusses building interactive web applications using the Reporting Collection. It describes components like forms, data connectors, interactive elements and AJAX capabilities that allow adding interactivity. The reporting components generate reports in formats like HTML, PDF from data and layouts. Interactive components allow generating full web applications without additional coding. Forms capture user input. The data connector synchronizes selections across visualizations. Protocol links and functions enable drill-down and AJAX functionality. JavaScript attributes and components add custom scripting.
(ATS6-DEV04) Building Web MashUp applications that include Accelrys Applicati... - BIOVIA
One of the biggest challenges in most corporate environments is providing a way for users to access all the data they need, usually stored in multiple disparate locations, from one interface that they are comfortable with. As web applications have become more popular, RESTful APIs have emerged as the preferred web service format in recent years. Many Accelrys applications now provide RESTful APIs that allow developers to build mashup applications. This session will explore some of these APIs and how to use them to build a simple application.
(ATS6-DEV03) Building an Enterprise Web Solution with AEP - BIOVIA
In this session, we'll take a deep dive into building an Enterprise Solution with AEP. We'll be using Pipeline Pilot to develop the protocols that will provide our server-side implementations and ExtJS to build the user interface. We'll look at the techniques involved in using protocols to implement actions and explore the capabilities of ExtJS to produce powerful enterprise applications.
This document discusses different strategies for building web applications using the Accelrys Enterprise Platform (AEP). It outlines three main strategies: Form & Result, Dashboard, and Enterprise Application.
Form & Result is best for simple applications that focus on running protocols and displaying results. Dashboard adds interactivity with JavaScript and the Data Connector. Enterprise Application employs a third-party JavaScript library to build a fully customized user interface, separate from AEP.
The document provides examples and discusses the technologies involved in each approach. It recommends choosing based on requirements complexity, development time, and skill sets, noting that Form & Result is fastest but least customizable, while Enterprise Application is most complex but powerful.
(ATS6-DEV01) What’s new for Protocol and Component Developers in AEP 9.0 - BIOVIA
This document summarizes new features in Accelrys AEP 9.0 including improvements to protocol database searching, protocol comparison, protocol linking via URLs, parameter initialization, promotion and metadata, autosave functionality, Pilotscript hashmaps, XML parsing, and Unicode support. It encourages providing feedback to help further develop the platform.
(ATS6-APP09) ELN configuration management with ADM - BIOVIA
Starting with AELN 6.7, Accelrys ELN administrators have complete control over the timing and distribution of software updates to clients using Accelrys Deployment Manager (ADM). This session provides a quick overview, then dives deep into the technical aspects of ADM. Attendees will leave with a better understanding of how to use ADM to lower the costs associated with managing client updates.
(ATS6-APP07) Configuration of Accelrys ELN to Clone to the Latest Template Ve... - BIOVIA
The document discusses Accelrys ELN's new capability to clone experiments to the latest template version in release 6.8. It explains that the cloned document will now pull data from corresponding sections in the latest template rather than copying the source experiment exactly. It provides instructions for configuring this clone-to-latest behavior and maintaining templates to include the latest sections. The demonstration shows how sections in the new cloned experiment are matched and inherited from either the source or template.
(ATS6-APP06) Accelrys LIMS and Accelrys ELN integration - BIOVIA
Integrating a diverse suite of applications can be a challenge. We'll discuss the RESTful API that was developed to integrate Accelrys ELN with Accelrys LIMS and IM. Then we'll see how the API will evolve to be a general purpose API for querying and updating instance data.
(ATS6-APP05) Deploying Contur ELN to large organizations - BIOVIA
Introducing new IT systems that affect many users can be challenging, particularly for large organizations. This session will describe how Contur ELN has been deployed to 1000+ users in different fields of R&D. Case studies will be used to illustrate strategies and practical considerations.
(ATS6-APP04) Flexible Data Capture for Improved Laboratory Ergonomics - BIOVIA
The document discusses developing mobile applications that allow scientists to capture experimental observations and data in the laboratory using their mobile devices, with an initial proof-of-concept application developed for Android and Windows 8 devices. It outlines potential use cases and design priorities for flexible data capture and for integration with Accelrys ELN and other systems to improve laboratory ergonomics and workflow.
Monitoring and Managing Anomaly Detection on OpenShift - Tosin Akinosho
Monitoring and Managing Anomaly Detection on OpenShift
Overview
Dive into the world of anomaly detection on edge devices with our comprehensive hands-on tutorial. This SlideShare presentation will guide you through the entire process, from data collection and model training to edge deployment and real-time monitoring. Perfect for those looking to implement robust anomaly detection systems on resource-constrained IoT/edge devices.
Key Topics Covered
1. Introduction to Anomaly Detection
- Understand the fundamentals of anomaly detection and its importance in identifying unusual behavior or failures in systems.
2. Understanding Edge (IoT)
- Learn about edge computing and IoT, and how they enable real-time data processing and decision-making at the source.
3. What is ArgoCD?
- Discover ArgoCD, a declarative, GitOps continuous delivery tool for Kubernetes, and its role in deploying applications on edge devices.
4. Deployment Using ArgoCD for Edge Devices
- Step-by-step guide on deploying anomaly detection models on edge devices using ArgoCD.
5. Introduction to Apache Kafka and S3
- Explore Apache Kafka for real-time data streaming and Amazon S3 for scalable storage solutions.
6. Viewing Kafka Messages in the Data Lake
- Learn how to view and analyze Kafka messages stored in a data lake for better insights.
7. What is Prometheus?
- Get to know Prometheus, an open-source monitoring and alerting toolkit, and its application in monitoring edge devices.
8. Monitoring Application Metrics with Prometheus
- Detailed instructions on setting up Prometheus to monitor the performance and health of your anomaly detection system.
9. What is Camel K?
- Introduction to Camel K, a lightweight integration framework built on Apache Camel, designed for Kubernetes.
10. Configuring Camel K Integrations for Data Pipelines
- Learn how to configure Camel K for seamless data pipeline integrations in your anomaly detection workflow.
11. What is a Jupyter Notebook?
- Overview of Jupyter Notebooks, an open-source web application for creating and sharing documents with live code, equations, visualizations, and narrative text.
12. Jupyter Notebooks with Code Examples
- Hands-on examples and code snippets in Jupyter Notebooks to help you implement and test anomaly detection models.
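The core loop the tutorial builds (score readings for anomalies, then expose a metric Prometheus can scrape) can be sketched in a few lines. This is a self-contained toy, not code from the tutorial: the z-score detector stands in for whatever model is trained, and the metric name is an invented assumption.

```python
import math

def zscore_anomalies(readings, threshold=3.0):
    """Return indices of readings more than `threshold` standard
    deviations from the mean (population std)."""
    n = len(readings)
    mean = sum(readings) / n
    std = math.sqrt(sum((x - mean) ** 2 for x in readings) / n)
    if std == 0:
        return []
    return [i for i, x in enumerate(readings) if abs(x - mean) / std > threshold]

def prometheus_exposition(anomaly_count):
    """Render the count as a gauge in Prometheus text exposition format,
    the format a scrape endpoint would serve."""
    return (
        "# HELP edge_anomalies_total Anomalous readings in the last window\n"
        "# TYPE edge_anomalies_total gauge\n"
        f"edge_anomalies_total {anomaly_count}\n"
    )

readings = [10.1, 9.9, 10.0, 10.2, 9.8, 42.0]  # one obvious outlier
anomalies = zscore_anomalies(readings, threshold=2.0)
print(anomalies)  # index of the outlier reading
print(prometheus_exposition(len(anomalies)), end="")
```

In the tutorial's full pipeline, the readings would arrive via Kafka, the detector would run on the edge device deployed through ArgoCD, and Prometheus would scrape the exposition endpoint rather than reading a printed string.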
How to Interpret Trends in the Kalyan Rajdhani Mix Chart - Chart Kalyan
A Mix Chart displays historical data of numbers in a graphical or tabular form. The Kalyan Rajdhani Mix Chart specifically shows the results of a sequence of numbers over different periods.
"$10 thousand per minute of downtime: architecture, queues, streaming and fin...Fwdays
Direct losses from one minute of downtime = $5-$10 thousand. Reputation is priceless.
As part of the talk, we will consider the architectural strategies necessary for the development of highly loaded fintech solutions. We will focus on using queues and streaming to efficiently process and manage large amounts of data in real time and to minimize latency.
We will focus special attention on the architectural patterns used in the design of the fintech system, microservices and event-driven architecture, which ensure scalability, fault tolerance, and consistency of the entire system.
What is an RPA CoE? Session 1 – CoE Vision - DianaGray10
In the first session, we will review the organization's vision and how this has an impact on the COE Structure.
Topics covered:
• The role of a steering committee
• How do the organization’s priorities determine CoE Structure?
Speaker:
Chris Bolin, Senior Intelligent Automation Architect Anika Systems
In the realm of cybersecurity, offensive security practices act as a critical shield. By simulating real-world attacks in a controlled environment, these techniques expose vulnerabilities before malicious actors can exploit them. This proactive approach allows manufacturers to identify and fix weaknesses, significantly enhancing system security.
This presentation delves into the development of a system designed to mimic Galileo's Open Service signal using software-defined radio (SDR) technology. We'll begin with a foundational overview of both Global Navigation Satellite Systems (GNSS) and the intricacies of digital signal processing.
The presentation culminates in a live demonstration. We'll showcase the manipulation of Galileo's Open Service pilot signal, simulating an attack on various software and hardware systems. This practical demonstration serves to highlight the potential consequences of unaddressed vulnerabilities, emphasizing the importance of offensive security practices in safeguarding critical infrastructure.
Freshworks Rethinks NoSQL for Rapid Scaling & Cost-Efficiency - ScyllaDB
Freshworks creates AI-boosted business software that helps employees work more efficiently and effectively. Managing data across multiple RDBMS and NoSQL databases was already a challenge at their current scale. To prepare for 10X growth, they knew it was time to rethink their database strategy. Learn how they architected a solution that would simplify scaling while keeping costs under control.
"Frontline Battles with DDoS: Best practices and Lessons Learned", Igor IvaniukFwdays
In this talk we will discuss DDoS protection tools and best practices, discuss network architectures, and look at what AWS has to offer. We will also examine one of the largest DDoS attacks on Ukrainian infrastructure, which happened in February 2022. We'll see what techniques helped to keep web resources available for Ukrainians and how AWS improved DDoS protection for all customers based on the Ukraine experience.
Introduction of Cybersecurity with OSS at Code Europe 2024 - Hiroshi SHIBATA
I develop the Ruby programming language, RubyGems, and Bundler, which are package managers for Ruby. Today, I will introduce how to enhance the security of your application using open-source software (OSS) examples from Ruby and RubyGems.
The first topic is CVE (Common Vulnerabilities and Exposures). I have published CVEs many times. But what exactly is a CVE? I'll provide a basic understanding of CVEs and explain how to detect and handle vulnerabilities in OSS.
Next, let's discuss package managers. Package managers play a critical role in the OSS ecosystem. I'll explain how to manage library dependencies in your application.
I'll share insights into how the Ruby and RubyGems core team works to keep our ecosystem safe. By the end of this talk, you'll have a better understanding of how to safeguard your code.
inQuba Webinar Mastering Customer Journey Management with Dr Graham Hill - LizaNolte
HERE IS YOUR WEBINAR CONTENT! 'Mastering Customer Journey Management with Dr. Graham Hill'. We hope you find the webinar recording both insightful and enjoyable.
In this webinar, we explored essential aspects of Customer Journey Management and personalization. Here’s a summary of the key insights and topics discussed:
Key Takeaways:
Understanding the Customer Journey: Dr. Hill emphasized the importance of mapping and understanding the complete customer journey to identify touchpoints and opportunities for improvement.
Personalization Strategies: We discussed how to leverage data and insights to create personalized experiences that resonate with customers.
Technology Integration: Insights were shared on how inQuba’s advanced technology can streamline customer interactions and drive operational efficiency.
Conversational agents, or chatbots, are increasingly used to access all sorts of services using natural language. While open-domain chatbots - like ChatGPT - can converse on any topic, task-oriented chatbots - the focus of this paper - are designed for specific tasks, like booking a flight, obtaining customer support, or setting an appointment. Like any other software, task-oriented chatbots need to be properly tested, usually by defining and executing test scenarios (i.e., sequences of user-chatbot interactions). However, there is currently a lack of methods to quantify the completeness and strength of such test scenarios, which can lead to low-quality tests, and hence to buggy chatbots.
To fill this gap, we propose adapting mutation testing (MuT) for task-oriented chatbots. To this end, we introduce a set of mutation operators that emulate faults in chatbot designs, an architecture that enables MuT on chatbots built using heterogeneous technologies, and a practical realisation as an Eclipse plugin. Moreover, we evaluate the applicability, effectiveness and efficiency of our approach on open-source chatbots, with promising results.
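The mutate-then-replay loop described above can be shown with a toy example: apply a fault-injecting operator to a chatbot design, replay a test scenario against both the original and the mutant, and check whether the scenario "kills" the mutant. The design format and the swap-responses operator below are invented for illustration; the paper's actual operators target real chatbot platforms.

```python
import copy

# A toy chatbot design: intents map trigger phrases to a response.
CHATBOT = {
    "book_flight": {"phrases": ["book a flight"], "response": "Where to?"},
    "support":     {"phrases": ["I need help"],   "response": "How can I help?"},
}

def swap_intent_responses(design, intent_a, intent_b):
    """Mutation operator (hypothetical): swap the responses of two
    intents, emulating a misrouted-response fault."""
    mutant = copy.deepcopy(design)
    mutant[intent_a]["response"], mutant[intent_b]["response"] = (
        mutant[intent_b]["response"], mutant[intent_a]["response"],
    )
    return mutant

def reply(design, utterance):
    """Minimal chatbot engine: exact-match the utterance to an intent."""
    for intent in design.values():
        if utterance in intent["phrases"]:
            return intent["response"]
    return "Sorry, I did not understand."

def scenario_passes(design):
    """Test scenario: asking to book a flight should ask for a destination."""
    return reply(design, "book a flight") == "Where to?"

mutant = swap_intent_responses(CHATBOT, "book_flight", "support")
print(scenario_passes(CHATBOT))  # True: the original design passes
print(scenario_passes(mutant))   # False: the scenario kills this mutant
```

A scenario set that fails to kill a mutant signals a coverage gap, which is exactly the completeness measure mutation testing provides.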
Essentials of Automations: Exploring Attributes & Automation Parameters - Safe Software
Building automations in FME Flow can save time, money, and help businesses scale by eliminating data silos and providing data to stakeholders in real-time. One essential component to orchestrating complex automations is the use of attributes & automation parameters (both formerly known as “keys”). In fact, it’s unlikely you’ll ever build an Automation without using these components, but what exactly are they?
Attributes & automation parameters enable the automation author to pass data values from one automation component to the next. During this webinar, our FME Flow Specialists will cover leveraging the three types of these output attributes & parameters in FME Flow: Event, Custom, and Automation. As a bonus, they’ll also be making use of the Split-Merge Block functionality.
You’ll leave this webinar with a better understanding of how to maximize the potential of automations by making use of attributes & automation parameters, with the ultimate goal of setting your enterprise integration workflows up on autopilot.
High performance Serverless Java on AWS - GoTo Amsterdam 2024 - Vadym Kazulkin
Java has for many years been one of the most popular programming languages, but it has had a hard time in the Serverless community. Java is known for its high cold start times and high memory footprint compared to other programming languages like Node.js and Python. In this talk I'll look at the general best practices and techniques we can use to decrease memory consumption and cold start times for Java Serverless development on AWS, including GraalVM (Native Image) and AWS's own offering SnapStart, based on Firecracker microVM snapshot and restore and CRaC (Coordinated Restore at Checkpoint) runtime hooks. I'll also provide a lot of benchmarking on Lambda functions, trying out various deployment package sizes, Lambda memory settings, Java compilation options and HTTP (a)synchronous clients, and measure their impact on cold and warm start times.
5th LF Energy Power Grid Model Meet-up SlidesDanBrown980551
5th Power Grid Model Meet-up
It is with great pleasure that we extend to you an invitation to the 5th Power Grid Model Meet-up, scheduled for 6th June 2024. This event will adopt a hybrid format, allowing participants to join us either through an online Mircosoft Teams session or in person at TU/e located at Den Dolech 2, Eindhoven, Netherlands. The meet-up will be hosted by Eindhoven University of Technology (TU/e), a research university specializing in engineering science & technology.
Power Grid Model
The global energy transition is placing new and unprecedented demands on Distribution System Operators (DSOs). Alongside upgrades to grid capacity, processes such as digitization, capacity optimization, and congestion management are becoming vital for delivering reliable services.
Power Grid Model is an open source project from Linux Foundation Energy and provides a calculation engine that is increasingly essential for DSOs. It offers a standards-based foundation enabling real-time power systems analysis, simulations of electrical power grids, and sophisticated what-if analysis. In addition, it enables in-depth studies and analysis of the electrical power grid’s behavior and performance. This comprehensive model incorporates essential factors such as power generation capacity, electrical losses, voltage levels, power flows, and system stability.
Power Grid Model is currently being applied in a wide variety of use cases, including grid planning, expansion, reliability, and congestion studies. It can also help in analyzing the impact of renewable energy integration, assessing the effects of disturbances or faults, and developing strategies for grid control and optimization.
What to expect
For the upcoming meetup we are organizing, we have an exciting lineup of activities planned:
-Insightful presentations covering two practical applications of the Power Grid Model.
-An update on the latest advancements in Power Grid -Model technology during the first and second quarters of 2024.
-An interactive brainstorming session to discuss and propose new feature requests.
-An opportunity to connect with fellow Power Grid Model enthusiasts and users.
Northern Engraving | Nameplate Manufacturing Process - 2024Northern Engraving
Manufacturing custom quality metal nameplates and badges involves several standard operations. Processes include sheet prep, lithography, screening, coating, punch press and inspection. All decoration is completed in the flat sheet with adhesive and tooling operations following. The possibilities for creating unique durable nameplates are endless. How will you create your brand identity? We can help!
(ATS6-PLAT01) Chemistry Harmonization: Bringing together the Direct 9 and Pipeline Pilot Chemistry Data Models
1. (ATS6-PLAT01) Chemistry Harmonization
Bringing together the Direct 9 and Pipeline Pilot Chemistry
Data Models
Ton van Daelen, Ph.D.
Product Director, Platform
Product Management
ton.vandaelen@accelrys.com
Keith Taylor, Ph.D.
Product Manager, Chemistry
Product Management
keith.taylor@accelrys.com
2. The information on the roadmap and future software development efforts is
intended to outline general product direction and should not be relied on in making
a purchasing decision.
3. Content
• We are harmonizing the chemical representations in
Pipeline Pilot 9.0 and Direct 9.0
• Pipeline Pilot, Direct and Draw to adopt best-of-breed
features
• What will you learn?
– What your scientists need to be aware of
– How to manage this change as an administrator
4. Direct 9.0 – Changes
• Note: Direct 9 will return different search results in some
cases, consistent with Pipeline Pilot
– Aromaticity perception now based on Hückel rule (4n+2)
– Tautomer perception based on Sayle et al. paper
• Consistency between Pipeline Pilot, Accelrys Direct, and
Accelrys Draw
– Same chemistry, same results everywhere
*Canonicalization and Enumeration of Tautomers, Sayle and Delany, EuroMUG99, 28-29 October 1999, Cambridge, UK
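The 4n+2 electron-count test named above is simple to state in code. A minimal sketch of the Hückel criterion only, not the Direct perception engine (which must also identify planar conjugated rings and count lone-pair contributions):

```python
def is_huckel_aromatic(pi_electrons: int) -> bool:
    """Return True if a pi-electron count satisfies the Hückel 4n+2 rule
    (n = 0, 1, 2, ...). Sketch of the counting rule only; real aromaticity
    perception also requires a planar, fully conjugated ring."""
    return pi_electrons >= 2 and (pi_electrons - 2) % 4 == 0

# Benzene contributes 6 pi electrons (n = 1): aromatic.
# Cyclobutadiene contributes 4: anti-aromatic.
print(is_huckel_aromatic(6))   # True
print(is_huckel_aromatic(4))   # False
```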
5. Pipeline Pilot 9.0 – New Capabilities
Consistency between Pipeline Pilot Chemistry, Direct, and Draw
• Enhanced representation – ‘What you see is what you have’
• Depiction engine from Direct and Draw
• Mappers supporting new representations
• Calculators upgraded to interpret new representations
• Enhanced perceptions of stereochemistry, aromaticity, and rings
Note
• Changes to perception mean that models and calculators must be
relearned and re-baselined
– Significant effect from the new ring perception
– Stereochemistry and aromaticity have smaller, but still important, effects
6. Pipeline Pilot 9.0 – Improved Chemical Representations
• Single/double/triple bonds supported
in NONS
• Coordination/Dative bond
• Haptic bonds
• Markush Homology Groups
• Hydrogen bonds
7. • Rendering between Accelrys Draw and Pipeline Pilot
9.0 now consistent
• Pipeline Pilot now supports:
– PNG
– JPEG
– GIF
– SVG
– EMF – Linux and Windows!
• SVG and EMF generation fast
– ~ 10,000 structures per second
Pipeline Pilot 9.0 – Depiction
[Figure: side-by-side depictions rendered in Draw and in Pipeline Pilot]
8. • Abbreviated groups are frequently used to simplify structures
• Attachment points are now correct
– The Pipeline Pilot 8.x depictions are incorrect on the left of the phenyl
group
– The labels depicted imply different chemical entities
• Visual corruption
• Nitrile (CN) and isonitrile (NC) are chemically different
• NCS and SCN are also different entities
• Rich text markup renders correctly
• Whitespace around labels is consistent
– Affects perceived bond length
Pipeline Pilot 9.0 – Depiction
[Figure: side-by-side depictions rendered in Draw and in Pipeline Pilot]
9. • Markush/Rgroup depiction is complete in Pipeline Pilot rendering
• Now renders
– Rgroup definitions (e.g. R1 …)
– Rgroup logic (R1 = 1; R2 >= 0)
– Directionality indicated for fragments with multiple attachment points
(e.g. on R2)
Pipeline Pilot 9.0 – Depiction
[Figure: side-by-side depictions rendered in Draw and in Pipeline Pilot]
10. • Nonspecific (NONS) representations are equivalent to those in
Direct 8.0 and Draw 4.1
– The Pipeline Pilot version does not lose information
• Examples from mass spectrometry and industrial chemicals
Pipeline Pilot 9.0 – Depiction
[Figures: NONS examples rendered in Draw and in Pipeline Pilot]
11. • Increased focus on biological therapeutics
• Representation exposed in Pipeline Pilot 8.5
• Completed in 9.0
– Much more functional and sophisticated
Biologics
[Figure: biologic structure rendered in Pipeline Pilot and in Draw]
15. What does this mean to my scientists? (1)
• Higher quality reports
– Supports perception of quality research
• Enhanced depiction of biologics and Markush generics
– These look different; minor adjustments to depiction protocols may be needed
• New chemical representations
– No change to existing protocols
– New opportunities opened up
• Expect marginal differences in hit sets between Direct 8 and 9 due to different
aromaticity and tautomer perceptions
16. • Enhanced mapping – New in 9.0 e.g. Imipramine Metabolites
Mapping: Non Specific Structures - New
17. • Screen MDDR data set
– 129,237 structures screened in ~30s
– No pre-processing
Mapping: Homology group screening
Hits = 470
Hits = 108
Hits = 45
Hits = 16
Hits = 10
18. • Changes to stereochemical and aromaticity perception drive changes in the behavior
of:
– Learned models
– Calculators
– Structure Matchers
• Need to relearn and re-baseline calculators and models
• Change is discontinuous (!)
• There will be no legacy mode
– A legacy mode would cause incompatibilities and drive confusion
Data Model Changes from PP 8.x to PP 9
19. Compatibility: Pipeline Pilot and Accelrys Direct
• PP 9.0 and Direct 9.0 (2013)
– 100% compatible
• PP 9.0 and Direct 8.0
– Only difference is aromaticity perception edge-cases
– Direct 8.0 uses its current, template-based aromaticity perception, which
differs from the Hückel (4n+2) rule-based perception in Pipeline Pilot 9.0
– Minor differences will be observed
20. Observed Differences in Calculated Values

Dataset   | Number of Structures | Canonical SMILES | AlogP | Number of Rings | Number of Aromatic Rings | Number of Stereo Atoms | ECFP4
ACD       | 239,996 | 251 | 105 | 2,455 | 65 | 0 | 214
Asinex    | 137,799 | 26  | 24  | 1,070 | 22 | 0 | 43
Maybridge | 51,058  | 2   | 0   | 438   | 0  | 0 | 1
MDDR 2010 | 201,748 | 62  | 24  | 3,271 | 29 | 4 | 46
WDI       | 53,517  | 37  | 14  | 612   | 10 | 0 | 42

The table shows the number of structures in each dataset that had different calculated values in 9.0 compared with 8.5
Differences are generally very small
Ring perception leads to more prominent differences, especially in drug-like datasets
21. • Descriptors such as EC Fingerprints, Canonical SMILES, Ring Counts, and
AlogP could differ from Pipeline Pilot 8.5
• Results from learned models that use such descriptors could be a
little different
– Retraining the models is recommended
• Canonical SMILES and feature keys could be different
– Recalculating database indices is recommended
• Similarity and substructure searching could also produce different
results
Effect of Perception Changes
22. Effect of Perception Changes
Comparison of DrugLike models learned in Pipeline Pilot 8.5 and retrained in Pipeline Pilot 9.0, applied to molecules in the Asinex data set
The results are very similar for most molecules, with larger deviations for a few
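Re-baselining a retrained model boils down to a per-molecule comparison of old and new scores. A minimal sketch with hypothetical numbers (the DrugLike model and the Asinex data are not reproduced here; `compare_predictions` and the score lists are illustrative):

```python
def compare_predictions(old, new, tolerance=0.05):
    """Pair up predictions from two model versions and flag molecules
    whose scores moved by more than `tolerance`.
    Returns (max deviation, mean deviation, indices of outliers)."""
    deltas = [abs(a - b) for a, b in zip(old, new)]
    outliers = [i for i, d in enumerate(deltas) if d > tolerance]
    return max(deltas), sum(deltas) / len(deltas), outliers

# Hypothetical scores for five molecules from each model version.
old_scores = [0.91, 0.45, 0.78, 0.33, 0.60]
new_scores = [0.90, 0.47, 0.79, 0.12, 0.61]
worst, mean, flagged = compare_predictions(old_scores, new_scores)
print(flagged)  # [3] -- only one molecule deviates noticeably
```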
23. What do I need to do as an admin?
• When to upgrade?
– Use Direct and AEP/PP independently:
• Upgrade to get new capabilities
– Use Direct and PP in a mixed environment:
• As soon as possible in order to benefit from harmonized chemistry
• If you are using ChemReg
– Wait until AEP 9.1 is released and do one AEP upgrade
– AEP 9.1 contains chemistry updates for Direct 9 capabilities
• What instructions do I give my users?
– Rebuild learned models and calculators under PP 9.0
• What testing do I need to do?
– Run your standard test set and confirm that any differences from baseline
are expected, given the changes in chemical perception
24. Implications for Other Products
• Direct 9 retains historic APIs and search types
– Maintenance and interfacing are unchanged
• All supported versions of Draw are compatible with Direct 9
• ChemReg 3.2 will be supported on Direct 8 and 9
• AELN will support Direct 9 in a future release
• Should I be running Direct 8 and 9 simultaneously for a
while?
– This is possible but not recommended: different search results will
confuse users
– Recommendation: verify your enterprise systems with Direct 9
and then move Direct 9 to production
25. Summary
• Chemistry harmonization project:
– PP 9.0 inherits many new chemical representations
– Existing representations enhanced
– Aromaticity, stereochemistry and ring perceptions enhanced
– Significant improvement to depiction aesthetics
• Accelrys Enterprise Platform, Pipeline Pilot 9 and Direct 9
deliver the same results
26. Where do I go for more information?
• Resources
– Admin guides
• AEP/PP 9
• Direct 9
– Chemical representation changes documents
• AEP/PP 9
• Direct 9
• Community / download
– Log into Accelrys community forums
• E.g.: https://community.accelrys.com/community/accelrys_direct__draw__and_jdraw
• Accelrys is there to help
– Customer support – upgrade strategies
– Professional services – upgrade service
28. • Single chemistry foundation with single data model implemented in
a single code stream
– Adopted by Tools and Platform
• Direct, Pipeline Pilot, and Accelrys Enterprise Platform
– Application Stack inherits all of the chemistry capabilities
• Simplifies development and application environment
• Enhances our ability to deliver new functionality more quickly across the
products
Harmonization delivers
29. Other New Features in PP 9.0
• Component for reaction-based tautomer enumeration
• Based on a set of twenty-one SMIRKS described in "Tautomerism in Large
Databases", Sitzmann, M.; Ihlenfeldt, W.D. & Nicklaus, M. C., J. Comput. Aided
Mol. Des., 2010, 24, 521-551
• Components to do Data Fusion and to Rank Similarities
• Based on “Combination of Similarity Rankings Using Data Fusion”, Peter Willett, J.
Chem. Inf. Model., 2013, 53, 1−10
• Bad Isotope Filter now flags radioactive isotopes
• Components to check structures for querying or registration
• Customizable external elements table (PTable)
• Alternative method to calculate atom-atom mappings in
reactions
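The idea behind Willett-style data fusion can be sketched with the classic rank-sum rule: each similarity measure ranks the molecules, and ranks are summed. This is an illustration of the general technique with hypothetical molecule IDs and measure names, not the actual Pipeline Pilot components:

```python
def fuse_by_rank_sum(rankings):
    """Combine several similarity rankings of the same molecules by
    summing each molecule's rank position; lower fused score is better.
    Sketch of the rank-sum fusion rule from the data-fusion literature."""
    scores = {}
    for ranking in rankings:
        for position, mol in enumerate(ranking, start=1):
            scores[mol] = scores.get(mol, 0) + position
    # Stable sort: ties keep their first-seen order.
    return sorted(scores, key=scores.get)

# Hypothetical rankings from two similarity measures over four molecules.
ecfp_rank = ["m1", "m3", "m2", "m4"]
alogp_rank = ["m3", "m1", "m4", "m2"]
print(fuse_by_rank_sum([ecfp_rank, alogp_rank]))  # ['m1', 'm3', 'm2', 'm4']
```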
30. • Ported CHRP mapper (FSMapper) to Pipeline Pilot source base
• New mapping components decide automatically (user doesn’t know or care)
which mapper to use (PP SGMapper or new FSMapper), depending on the
molecular features present in queries and targets
• FSMapper is used for
• Reactions
• Rgroups with two attachments
• Polymers and link nodes
• Variable-attachment bonds (Markush bonds)
Harmonization of Mapping Functionality
31. • New mapping components
• Work with queries from Tag and from File
• The old mapping components are in a deprecated folder
• They use only the PP SGMapper (and don't handle all the new features)
• They can be used to reproduce previous mapping behavior if needed
Harmonization of Mapping Functionality
32. • Charged non-metals are now treated as their “isoelectronic” equivalent:
– B- ~ C ~ N+ ~ O+2 ~ F+3
– Si- ~ P ~ S+ ~ Cl+2
• The bad valence filter is improved and now catches more bad anions.
• Metal anions no longer have implicit hydrogens
– Aluminum anions are an exception (for support of aluminum hydride anion)
• Nitrogen (V) is still allowed as a drawing alternative for nitro- and diazo- groups, amine
oxides, and related substructures. However, the application is now less likely to assign
implicit hydrogens to uncharged quaternary nitrogens.
• Atoms with illegal valence are now better distinguished from atoms with maximum
valence in ECFP fingerprint bits. For example, the Oxygen in N=O and N#O is now typed
differently. This can affect the Canonical SMILES atom order for structures containing
atoms with illegal valence.
• The changes in valence result in changes to ECFP fingerprint bits and Canonical SMILES.
Valence and Implicit Hydrogens
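The isoelectronic treatment above amounts to perceiving a charged non-metal as the neutral element with the same electron count (atomic number minus charge). A sketch of that mapping, illustrative only and not the Direct implementation:

```python
# Atomic numbers for the elements in the slide's equivalence classes.
Z = {"B": 5, "C": 6, "N": 7, "O": 8, "F": 9,
     "Si": 14, "P": 15, "S": 16, "Cl": 17}
SYMBOL = {v: k for k, v in Z.items()}

def isoelectronic_equivalent(element: str, charge: int) -> str:
    """Map a charged non-metal to the neutral element with the same
    electron count (Z minus charge) -- a sketch of the equivalence the
    slide lists, e.g. B- ~ C ~ N+ ~ O+2 ~ F+3."""
    return SYMBOL[Z[element] - charge]

print(isoelectronic_equivalent("N", +1))   # C
print(isoelectronic_equivalent("B", -1))   # C
print(isoelectronic_equivalent("Cl", +2))  # P
```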
33. Ring perception is improved. Previously the SSSR ring perception algorithm was used; an SSSR is not
unique (the rings chosen depend on atom order and bond order) and often misses rings in complex
non-planar assemblies. The unique "K-rings" perception algorithm, the union of all possible SSSR sets,
is now used. These changes result in changes to Canonical SMILES and improved aromaticity perception.
Examples
• Now perceived as 3 rings:
• Now perceived as 4 rings:
• Now perceived as 6 rings:
Rings
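The ambiguity behind slide 33 comes from the fact that the *size* of an SSSR is fixed (the cyclomatic number, bonds − atoms + components) while the *choice* of rings is not. A sketch using cubane, a standard example where any SSSR contains 5 rings although the cage has 6 faces, which is why a union-of-SSSR view retains more rings:

```python
def sssr_ring_count(n_atoms, bonds):
    """Number of rings in any SSSR: the cyclomatic number
    bonds - atoms + connected_components. The count is unique even
    though the ring selection itself is not."""
    # Union-find to count connected components.
    parent = list(range(n_atoms))
    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x
    for a, b in bonds:
        parent[find(a)] = find(b)
    components = len({find(i) for i in range(n_atoms)})
    return len(bonds) - n_atoms + components

# Cubane's carbon skeleton: 8 atoms, 12 bonds, 1 component -> SSSR size 5,
# although the cube has 6 faces; a union-of-SSSR perception keeps all 6.
cubane_bonds = [(0, 1), (1, 2), (2, 3), (3, 0),
                (4, 5), (5, 6), (6, 7), (7, 4),
                (0, 4), (1, 5), (2, 6), (3, 7)]
print(sssr_ring_count(8, cubane_bonds))  # 5
```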
34. • The isoelectronic equivalence enhancement described in Valence and Implicit Hydrogens improves the
perception of ring systems containing charged non-metals. Improved detection of bad valence for
anions also contributes to improved perception of aromaticity.
• The set of atoms that can contribute a lone pair to an aromatic ring is extended (from N,O,P,S) to
include As, Se, and Te.
• These changes result in changes to ECFP fingerprint bits and Canonical SMILES.
Examples
• Now perceived as aromatic:
• No longer perceived as aromatic:
Aromaticity Perception
35. • The isoelectronic equivalence enhancement described in Valence and Implicit Hydrogens improves the
perception of stereogenic centers that include charged non-metals.
• The symmetric equivalence of O-/OH/=O groups attached to P and S atoms has been extended to
include As, Se, and Te centers.
• Stereo validation logic of reader code is synchronized with perception code. This allows for more
consistent application of rules prohibiting S(IV) centers, P(V) centers, symmetric equivalence of O-
/OH/=O, etc.
• “Double-symmetric” ring atom perception is improved. Several symmetric spiro cases are now
correctly not marked as pseudo-stereo.
Examples
• Now perceived as stereo:
• More consistently perceived as not stereo:
Stereochemical Perception
38. OpenEye Molecule To Name Component
2,3,4,5-tetrahydro-1λ6,4-benzothiazepine 1,1-dioxide
2,3,4,5-tetrahydro-1λ<sup>6</sup>,4-benzothiazepine 1,1-dioxide
Options to use HTML tags and special characters
39. OpenEye Molecule From Name Component
2-[4-[(3,5-dichloro-4-pyridyl) oxy]phenyl] acetonitrile
leucine
tylenol
40. New Science
• Scaffold Tree
• Based on "The Scaffold Tree, Visualization of the Scaffold Universe by
Hierarchical Scaffold Classification", Schuffenhauer, A., Ertl, P., Roggo, S.,
Wetzel, S., Koch, M. A., Waldmann, H., J. Chem. Inf. Model. 2007, 47, 47-58
• Quantitative Estimate of Drug-Likeness (QED)
• Based on “Quantifying the Chemical Beauty of Drugs”, G. Richard Bickerton,
Gaia V. Paolini, Jérémy Besnard, Sorel Muresan, Andrew L. Hopkins, Nature
Chemistry 4, 90–98 (2012)
• Synthetic Accessibility (SAscore)
• Based on “Estimation of Synthetic Accessibility Score of Drug-like
Molecules Based on Molecular Complexity and Fragment Contributions”,
Peter Ertl and Ansgar Schuffenhauer, Journal of Cheminformatics, 2009, 1:8