EUGM 2013 - Andrea de Souza (Broad Institute): Setting the stage for the “SD” file for bioassay definitions and data: Building the BioAssay Research Database

Building on the success of the Molecular Libraries Program (MLP), the Broad Institute MLP team is co-leading, with the National Center for Advancing Translational Sciences (NCATS), an NIH-sponsored project across 7 institutions to augment the data in PubChem by creating the BioAssay Research Database (BARD). The BARD platform standardizes the representation of bioassays in a next-generation repository and provides a user-friendly interface that supports sophisticated queries and data mining. Data originating from publicly funded chemical biology research efforts will be presented with appropriate context, including structured assay and result annotations. These annotations use relevant ontologies including, for example, the BioAssay Ontology, Gene Ontology, and the Unit Ontology. We simplified the representation of ontologies into a hierarchical data dictionary so that data producers can more easily create and upload projects, assays, and results, and we created two separate user interfaces for data consumers. The BARD WebQuery interface offers Google-like search with auto-suggest functionality for complex queries, such as retrieving all assays and results for biological pathways such as “DNA repair” or “oxidative stress”, and presents this information in a rich user interface that includes spreadsheet support for structure-activity relationship analyses. Compounds, projects, and assays can be exported into an Amazon-like query cart for refining queries, and additional computations can be executed on datasets via community-developed plug-ins, including promiscuity analyses via the BioActivity Data Associative Promiscuity Pattern Learning Engine (BADAPPLE) and a CYP450 metabolism site prediction plug-in (http://www.farma.ku.dk/smartcyp/) using 2D structure fingerprints. Integration between the WebQuery and Desktop clients enables power users to initiate analyses in WebQuery and gain more insight via the Desktop client.

Lastly, as industry and academia work together to innovate in small-molecule therapeutics, we have created an initial specification for the Assay Definition Standard (ADS). The standard's exchange format, the Assay Definition Format (ADF), has been used as the medium of data file transfer for data upload. The chemical biology community now has an opportunity to use this standard to routinely transfer assay and result data within and between information systems and organizations.

This presentation will highlight the BARD platform with a focus on representing the cumulative body of work that exploits the ChemAxon toolkit.


Presentation Transcript

    • BioAssay Research Database
      Andrea de Souza, Director, Informatics, Data Analysis & Finance, Center for the Science of Therapeutics
      May 29, 2013
    • Direct Contributors
        • NIH Molecular Libraries: Glenn McFadden, Ajay Pillai
        • NIH Chemical Genomics Center: Chris Austin (PI), John Braisted, Marc Ferrer, Rajarshi Guha, Ajit Jadhav, Dac-Trung Nguyen, Tyler Peryea, Noel Southall, Henrike Veith
        • Broad Institute: Benjamin Alexander, Jacob Asiedu, Kay Aubrey, Joshua Bittker, Steve Brudz, Simon Chatwin, Paul Clemons, Vlado Dancik, Siva Dandapani, Andrea de Souza, Dan Durkin, David Lahr, Jeri Levine, Judy McGloughlin, Phil Montgomery, Jose Perez, Stuart Schreiber (PI), Gil Walzer, Xiaorong Xiang
        • University of New Mexico: Cristian Bologa, Steve Mathias, Tudor Oprea, Larry Sklar (PI), Oleg Ursu, Anna Waller, Jeremy Yang
        • University of Miami: Saminda Abeyruwan, Hande Küküc, Vance Lemmon, Ahsan Mir, Magdalena Przydzial, Kunie Sakurai, Stephan Schürer, Uma Vempati, Ubbo Visser
        • Vanderbilt University: Eric Dawson, Bill Graham, Craig Lindsley (PI), Shaun Stauffer
        • Sanford-Burnham Medical Research Institute: “T.C.” Chung, Jena Diwan, Michael Hedrick, Gavin Magnuson, Siobhan Malany, Ian Pass, Anthony Pinkerton, Derek Stonich, John Reed (PI)
        • Scripps Research Institute: Yasel Cruz, Mark Southern, Hugh Rosen (PI)
    • BARD: BioAssay Research Database
      Mission: Enable biomedical researchers and cheminformatic scientists to effectively use MLP data to generate new hypotheses
        • Unique collaboration amongst 7 NIH & academic centers
        • Develop and adopt an Assay Definition Standard (ADS)
        • Provide tools for assay registration, querying & visualization
            o Deploy predictive models
            o Foster new methods to interpret chemical biology data
            o Enable private data sharing
        • Developed as an open-source, industrial-strength platform to support public translational research
    • BARD: BioAssay Research Database
      Mission: Enable biomedical researchers and cheminformatic scientists to effectively use MLP data to generate new hypotheses
        • Provide tools for assay registration and data querying & visualization
            o Deploy predictive models
            o Foster new methods to interpret chemical biology data
            o Enable private data sharing
        • Developed as an open-source, industrial-strength platform to
      Overlay labels: Team Science; Research Data Management; Technology; Predictive Models
      The BARD platform will support public translational research
    • Research Data Management: The Value of Context
    • The Value of Context: Research Data Management
    • PubChem BioAssay
    • PubChem BioAssay and BARD: structure the data
    • PubChem BioAssay and BARD
        • PubChem: Missing or fuzzy assay definitions, experiments and project concepts
          BARD: Introduce assay definitions, experiments and projects
        • PubChem: ‘Column header’ centric with concentration details embedded
          BARD: Result types and concentrations as experimental variables
        • PubChem: Extensive use of unstructured text
          BARD: Transition to structured use of common language
      Diagram labels: PubChem; MLP-BioAssay; structure the data
    • Structuring the Data
      Diagram labels: Entrez; Uniprot; Gene Ontology; Disease Ontology; BioAssay Ontology; Unit Ontology; Chemical Ontology; BARD Dictionary & Term Hierarchy; BARD Assay Definition Hierarchy
        • Annotate all assays to a minimum standard
        • Integrate and extend ontologies
        • Enable assay registration
        • Represent assays, results, experiments using ADS
        • Exchange information in ADS via ADF
    • BARD Technology Components
        • Define & Register Assays
            o Data Dictionary – std terms
            o Catalog of Assay Protocols
        • High Quality Data & Result Deposition
            o Calculations & Results
            o Project-experiment association
        • Query & Interpret Information
            o Intuitive Guided Queries
            o Cross Assay & SAR centric views
            o Advanced applications
        • Enable Hypothesis Generation
      Scale: Novice, Expert
    • Web Client
        • Google-like searching of: 4,000+ assays, 35M+ compounds, 300+ projects
        • Filter on annotations, such as detection method type
        • Save items of interest for further analysis
        • Amazon-like Query Cart
    • Web Client - Project Specific Views
    • Web Client – Probe Development Workflow
    • Sunburst Visualization
        • Molecular activity against target classes
        • Target classifications from PantherDB
      Reference: Huaiyu Mi, Anushya Muruganujan and Paul D. Thomas. PANTHER in 2013: modeling the evolution of gene function, and other gene attributes, in the context of phylogenetic trees. Nucl. Acids Res. (2012) doi: 10.1093/nar/gks1118
    • As open source as possible
        • Components: Web Query & Desktop Clients; Data Warehouse & REST API; Catalog of Assay Protocols
        • Technology labels: Jersey; D3.js; JGoodies
        • Commercial License
        • MySQL support for CAP coming soon
    • ChemAxon Usage in BARD
        • UNM Promiscuity Plugin: JChem for scaffold decomposition
        • REST API & Warehouse: JChem for rendering structures and molecule fingerprint generation
      Example URLs:
        http://bard.nih.gov/api/latest/compounds/6915727/image?s=200
        http://bard.nih.gov/api/latest/compounds/?filter=n1cccc2ccccc12%5Bstructure%5D&type=sim&cutoff=0.9&expand=true
        http://bard.nih.gov/api/latest/plugins/badapple/prom/cid/6915727?expand=true
    • ChemAxon Usage in BARD
        • Web Query Client: JChem for rendering structures
        • Desktop Client: JChem for rendering structures, molecule import & export; Marvin for drawing query structures
    • Predictive Models
        • BioActivity Data Associative Promiscuity Pattern Learning Engine
        • Associations via scaffolds for chemical space navigation
      Example URI* and description:
        • <base>/badapple/prom/cid/752424
          For compound with specified ID, return scaffold IDs and scores.
        • <base>/badapple/prom/cid/752424?expand=true
          Additional statistics, scaffold smiles, and inDrug flag.
        • <base>/badapple/prom/scafid/233
          For scaffold with specified ID, return statistics and smiles.
    • Predictive Models
        • Predicts CYP450 isoforms metabolism sites with 2D structures
        • Patrik Rydberg et al.
        • Released under LGPL
        • BARD plugin
            o Summary HTML view
            o Data view
    • Navigating the Maze
    • Long-Term Path Forward
        • Datasets: MLP, NCI-60, TBD, TBD
        • Tools: CAP, WebQuery, Desktop, APIs
        • Methods: BADApple, CYP450, TBD, TBD
        • Data Analysis: Workflow 1, Workflow 2, Workflow 3
      Diagram labels: as a Platform; Sustained Community Engagement; ADS
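
The example URLs on the "ChemAxon Usage in BARD" and BADAPPLE slides follow a few regular patterns. The following is a minimal Python sketch of how a client might assemble them, not the BARD team's own client code. The function names are hypothetical, the base URL is copied from the slides, and reading `s` as an image size is an assumption. The sketch only builds URL strings and makes no network calls, since the service may not be reachable.

```python
from urllib.parse import urlencode

# Base URL copied from the slide examples; treat it as an assumption.
BASE = "http://bard.nih.gov/api/latest"


def compound_image_url(cid: int, size: int = 200) -> str:
    """URL of a rendered structure image for a compound ID.

    The slide shows `s=200`; interpreting `s` as a size is an assumption.
    """
    return f"{BASE}/compounds/{cid}/image?{urlencode({'s': size})}"


def similarity_search_url(smiles: str, cutoff: float = 0.9) -> str:
    """URL of a structure similarity search, mirroring the slide's query string."""
    query = urlencode({
        "filter": f"{smiles}[structure]",  # urlencode percent-escapes the brackets
        "type": "sim",
        "cutoff": cutoff,
        "expand": "true",
    })
    return f"{BASE}/compounds/?{query}"


def badapple_url(kind: str, identifier: int, expand: bool = False) -> str:
    """URL for BADAPPLE data by compound ('cid') or scaffold ('scafid')."""
    if kind not in ("cid", "scafid"):
        raise ValueError("kind must be 'cid' or 'scafid'")
    url = f"{BASE}/plugins/badapple/prom/{kind}/{identifier}"
    # The slides show ?expand=true only on the 'cid' form.
    return f"{url}?expand=true" if expand else url


if __name__ == "__main__":
    print(compound_image_url(6915727))
    print(similarity_search_url("n1cccc2ccccc12"))
    print(badapple_url("cid", 6915727, expand=True))
```

Building the query string with `urlencode` rather than by hand keeps SMILES characters such as `[`, `]`, `=`, and `#` safely escaped, which matters because SMILES strings routinely contain characters that are reserved in URLs.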