Indo us 2012


OSDD Presentation at Washington DC

  • 1. Open Source Drug Discovery: CSIR-led Team India Consortium with Global Partnership. Affordable Healthcare for All. Cheminformatics and Open Source Drug Discovery: a case study in academic collaboration between the U.S. and India. Abhik Seal (PhD Student, Indiana University; Researcher, OSDD CSIR). Anshu Bhardwaj, Scientist, OSDD Unit, Council of Scientific & Industrial Research, Delhi, India. 23rd March 2012, Washington DC
  • 2. OSDD Focus: Tropical Neglected Diseases. First Disease Target: Tuberculosis. Tuberculosis (TB) is one of the leading causes of fatality, ranking second only to HIV as the killer infectious disease of adults worldwide. New TB cases 2010. At least one person in the world is newly infected with TB bacilli every second. Over 1000 deaths a day or 3 deaths every 2 mins. Source:
  • 3. Countries that had reported at least one XDR-TB case by end March 2011: Argentina, Armenia, Australia, Austria, Azerbaijan, Bangladesh, Belgium, Botswana, Brazil, Burkina Faso, Bhutan, Cambodia, Canada, Chile, China, Colombia, Czech Republic, Ecuador, Egypt, Estonia, France, Georgia, Germany, Greece, India, Indonesia, Iran (Islamic Rep. of), Ireland, Israel, Italy, Japan, Kazakhstan, Kenya, Kyrgyzstan, Latvia, Lesotho, Lithuania, Mexico, Mozambique, Myanmar, Namibia, Nepal, Netherlands, Norway, Pakistan, Peru, Philippines, Poland, Portugal, Qatar, Republic of Korea, Republic of Moldova, Romania, Russian Federation, Slovenia, South Africa, Spain, Swaziland, Sweden, Tajikistan, Thailand, Togo, Tunisia, Ukraine, United Arab Emirates, United Kingdom, United States of America, Uzbekistan, Viet Nam
  • 4. TB Drug Discovery
  • 5. World TB Day is 24th March 2012. It commemorates the discovery of the TB bacillus (Mycobacterium tuberculosis) through sputum microscopy, which is still the diagnostic used to detect TB! No progress whatsoever, and we are discussing network communications
  • 6. Challenges with Drug Discovery of Neglected Diseases
      • Lack of market incentives
      • TB is a complex disease: latency, relapse, resistance
      • Clinical trials take a long time, and the study of relapse needs long follow-up (up to 18 months)
      • Patient access is not direct; it is through government agencies
  • 7. Conventional vs Open Innovation Approach to Drug Discovery (diagram of the conventional approach: Corporate HQ; R&D units for Diabetics, Cancer, Neurological Disorder, …; Sales, Production, Packaging, Formulation, Pre-Clinical Trial, Clinical Trial, …)
  • 8. Conventional vs Open Innovation Approach to Drug Discovery: Research groups; Industry collaboration; Individual participation; Open Data Sharing
  • 9. OSDD Process Flow. Clinical trials: Public Funding of Clinical Trials. Government of India commitment: $46 million
  • 10. Status: OSDD Projects. Pipeline stages: Drug Target Identification; Virtual Screening; Chemical Synthesis/library; Screening/Hit identification; Hit to Lead; Clinical Trials Candidate. Project counts shown on the chart: 45, 19, 9, 6, 2, 1. Other projects aim to develop tools, databases and repositories for the OSDD community. Timeline: September 2008 to March 2012
  • 11. OSDD Platform System Architecture. “Collaborative tools to accelerate neglected diseases research” in the book “Collaborative Computational Technologies for Biomedical Research”. Wiley and Sons. 2011
  • 12. Post-genomics data on Mtb is ‘Linked’ from disparate resources. More than a Million Data Points are now “Linked”: Pathway/Networks; Gene/operon predictions; Gene Expression Data; Drug targets; Regulatory Elements; Orthologs; Variation and repeats. * This is a representative set of post-genomics data available on TB. Collaborators: Deeksha Bhartiya, Nitin Kumar, Dr. Vinod Scaria
  • 13. Comparison
      S. No | Source | Tracks
      1 | UCSC Genome Browser on Mycobacterium tuberculosis H37Rv 06/20/1998 Assembly | 6
      2 | WebTb | Operon Map
      3 | Argo Genome Browser | not web based
      4 | PGBrowser: Pathogen Genome Browser | 3
      5 | BioHealthBase | 16
      6 | Ensembl | ~15
      7 | Tbrowse | 100
  • 14. OpenLabNoteBook on SysBorgTB (Deeksha Bhartiya, Nitin Kumar)
  • 15. Biology is complex!! From a mathematical point of view, creating an accurate model of a single mammalian cell may require generating and then solving somewhere between 100,000 and one million equations. The human brain can only process seven pieces of data at a time!!! Need automation & new technology to address the complexity
  • 16. The “Connect to Decode” Programme: OSDD C2D Collaborative Community Curation. Literature; 800+ Student Researchers; Curated Annotations; Annotation Tools; Raw Annotations; Genomic Databases. Pathway/Interactome | Gene Ontology | Protein Structure/Fold | Glycomics | Immunome
  • 17. Working on the cloud: online discussion. Right (mark in green), Wrong (mark in red). Community Curation!! Many eyeballs make the ‘bug’ shallow!!!
  • 18. Mtb Metabolome Map on Payao. Sub-map of the metabolic network on Payao. SBI developed customized plug-ins for OSDD for generating the metabolic map
  • 19. C2D April 2010 – Onsite Activity
  • 20. iOSDD890: From Social Network to Biological Network
  • 21. OSDD Community Effort to Understand Mtb Biology
  • 22. “Within weeks, 830 volunteered to re-annotate the entire M. tuberculosis genome. The work started in December 2009 and was completed by April 2010, packing nearly 300 man-years into 4 months!” Source: Munos B. Can Open-Source Drug R&D Repower Pharmaceutical Innovation? Clin Pharmacol Ther 2010;87:534–536. “Social engineering for virtual big science in systems biology”. Source: Hiroaki Kitano, Nature Chemical Biology 7, 323–326 (2011)
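As a rough consistency check on the effort figure quoted on this slide, the arithmetic can be sketched as below. It assumes every volunteer is counted as contributing for the whole four-month period, which the slides do not state; the function name is illustrative only.

```python
# Rough check of the "nearly 300 man-years in 4 months" figure.
# Assumption (not from the slides): each of the 830 volunteers is counted
# as contributing for the entire four-month period.

def person_years(volunteers: int, months: float) -> float:
    """Total effort in person-years for `volunteers` each working `months`."""
    return volunteers * months / 12.0


effort = person_years(volunteers=830, months=4)
print(f"{effort:.1f} person-years")
```

Under that assumption the result lands a little below 300, which is in line with the "nearly 300" wording of the quoted source.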
  • 23. Connect to Decode Phase II - Themes
  • 24. A large student community from colleges and universities is cloning, expressing and purifying selected Mtb genes. Aim: to clone and express select genes of Mycobacterium tuberculosis. Open Access Repository of Mtb clones. More than 120 sequence-confirmed clones are ready for distribution
  • 25. OSDDChem: Open Chemistry Initiative. A large number of molecules are being submitted for screening
  • 26. Computational Resources developed with Community participation. http://sysborg2.osdd.net. Bhardwaj et al. Tuberculosis (Edinb). 2009 Sep;89(5):386-7. Bhardwaj et al. 2011, John Wiley & Sons, Inc. Resources: Chembio Toolkit (workflow engine with federated resources); TrapTB (Mtb drug targets database); AmPhyDB (Antimycobacterial Phytomolecule Database); Mtb essential genes database; a comprehensive database of Mtb transporters; Mtb-Human Interaction Database
  • 27. Enabling Complex Computational Analysis for Experimental Biologists/Chemists. Q. Find novel genes and mutations & map known drug resistance mutations on the genome of an MDR-TB strain
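The question on this slide can be illustrated with a minimal sketch. It assumes the strain sequence is already aligned to the reference at equal length, and it uses toy sequences and a toy lookup table; none of the sequences, positions or labels below come from the slides or from real Mtb data.

```python
from typing import Dict, List, Tuple

Substitution = Tuple[int, str, str]  # (1-based position, reference base, sample base)


def find_substitutions(reference: str, sample: str) -> List[Substitution]:
    """List single-base differences between two pre-aligned sequences."""
    if len(reference) != len(sample):
        raise ValueError("sequences must be pre-aligned to the same length")
    return [
        (i + 1, ref_base, alt_base)
        for i, (ref_base, alt_base) in enumerate(zip(reference, sample))
        if ref_base != alt_base
    ]


def split_known_and_novel(
    substitutions: List[Substitution],
    known: Dict[Substitution, str],
) -> Tuple[List[Tuple[int, str, str, str]], List[Substitution]]:
    """Separate substitutions found in the lookup table from unlisted ones."""
    known_hits, novel = [], []
    for sub in substitutions:
        if sub in known:
            known_hits.append((*sub, known[sub]))
        else:
            novel.append(sub)
    return known_hits, novel


# Toy data, purely illustrative (hypothetical sequences and marker label).
REFERENCE = "ATGCGTACGTTA"
SAMPLE = "ATGAGTACGTCA"
KNOWN_MARKERS = {(4, "C", "A"): "toy-resistance-marker"}

subs = find_substitutions(REFERENCE, SAMPLE)
known_hits, novel = split_known_and_novel(subs, KNOWN_MARKERS)
print("known:", known_hits)
print("novel:", novel)
```

A real analysis would start from sequencing reads and a variant caller rather than pre-aligned strings; the sketch only shows the "known versus novel" bookkeeping step.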
  • 28. Galaxy provides: simplified GUI design; ease of integrating modules; fewer components for creating workflows; sharable workflows for better collaboration
  • 29. Custom APIs for importing input files from OSDD’s open lab note books. “Get data” customized for extracting files from the open lab note book
  • 30. Custom APIs for exporting results to OSDD’s open lab note book
      • Workflows and the results of the workflows are stored as separate lab note books
      • The lab note book has details of the experiments performed
      • Results of one experiment may be invoked for analysis in another experiment
      • All versions of the workflow and the results are stored
      • Flexibility to execute nested workflows
  • 31. Our Approach: Data & Tool integration. In addition to access to heterogeneous sources of data like BioMartCentral/UCSC Table Browser (, Open lab note book of is interfaced with Galaxy. Standalone databases and tools. Tools as web services:
      • Web services can be added as tools in Galaxy
      • Extends the potential of Galaxy workflows
    The process: Identify the module; Search for the WSDL; Code for client; Write XML for Galaxy; Configure & Integrate to Galaxy
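The "Write XML for Galaxy" step in the process above can be sketched as follows. This is a minimal illustration, not the OSDD team's actual wrapper: the element names follow the general shape of a Galaxy tool definition file, while the tool id, the client script name `kegg_client.py`, and the parameter name `entry_id` are hypothetical.

```python
import xml.etree.ElementTree as ET


def build_tool_xml(tool_id: str, name: str, client_script: str,
                   param_name: str, param_label: str) -> str:
    """Build a small Galaxy-style tool wrapper around a web service client script."""
    tool = ET.Element("tool", id=tool_id, name=name, version="0.1")
    ET.SubElement(tool, "description").text = "web service client (illustrative)"

    # Command line that Galaxy would run; $name placeholders are filled in by Galaxy.
    ET.SubElement(tool, "command").text = (
        f"python {client_script} --query '${param_name}' --out '$output'"
    )

    inputs = ET.SubElement(tool, "inputs")
    ET.SubElement(inputs, "param", name=param_name, type="text", label=param_label)

    outputs = ET.SubElement(tool, "outputs")
    ET.SubElement(outputs, "data", name="output", format="txt")

    ET.SubElement(tool, "help").text = "Calls a remote web service through a client script."
    return ET.tostring(tool, encoding="unicode")


# Hypothetical example: wrapping a client for a sequence/entry lookup service.
xml_text = build_tool_xml(
    tool_id="ws_entry_lookup",
    name="Entry lookup (web service)",
    client_script="kegg_client.py",  # hypothetical client written in the "Code for client" step
    param_name="entry_id",
    param_label="Entry identifier",
)
print(xml_text)
```

The generated file would then be registered in the Galaxy tool configuration, which corresponds to the final "Configure & Integrate to Galaxy" step.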
  • 32. ChemBio toolkit: >300 Modules integrated by OSDD Community
      S. No | Resources | Clients
      1 | KEGG: Kyoto Encyclopedia of Genes and Genomes | 60
      2 | GetEntry: DDBJ sequence search by accession ID | 43
      3 | GPSR: tools | 33
      4 | PDB: Protein Data Bank | 30
      5 | BioModel: mathematical models of biological DB | 25
      6 | Gtps: Gene Trek in Prokaryote Space | 8
      7 | WSDbfetch: retrieve entries from biological dbs using entry identifiers or accession no. | 7
      8 | Gibv: Genome Information Broker for Viruses | 7
      9 | DDBJ: DNA Data Bank of Japan | 7
      10 | Mafft: a multiple sequence alignment program | 4
      11 | Fasta: DDBJ database | 4
      12 | Ensembl: maintains automatic annotation | 4
      13 | VecScreen: vector contamination | 4
      14 | OMIM: Online Mendelian Inheritance in Man | 4
      15 | Gtop: Gene-product Informatics | 3
      16 | GO: Gene Ontology | 3
      17 | SPS: Splicing Profile based Score | 2
      18 | GIBIS: Genome Information Broker for Insertion Sequence | 1
      19 | RefSeq: database of sequence | 1
      20 | GIB: Genome Information Broker | 1
      21 | GIBEnv: DDBJ database | 1
      22 | TxSearch: Database indexing & searching | 1
  • 33. OSDD Community suggests tools for integration in Galaxy
  • 34. Data amplification: Cheminformatics. PubChem Bioassay data (approx. 100,000 molecules/dataset). Successful Models; Screen PubChem (30 million); Potential Hits. 6000 descriptors/molecule
      o Downsizing and random validation require multiple calculations for validation of results
      o Cross validation up to 50+ times for each experiment
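To give a sense of why repeated cross validation multiplies the compute load described on this slide, here is a minimal sketch of repeated k-fold splitting using only the standard library. The fold count of 5 and the sample size of 100 are assumptions for illustration; the slides only state that validation is repeated 50+ times.

```python
import random
from typing import Iterator, List, Tuple


def repeated_kfold(n_samples: int, k: int, repeats: int,
                   seed: int = 0) -> Iterator[Tuple[List[int], List[int]]]:
    """Yield (train, test) index lists for `repeats` independent shuffled k-fold runs."""
    rng = random.Random(seed)
    for _ in range(repeats):
        indices = list(range(n_samples))
        rng.shuffle(indices)
        # Deal the shuffled indices into k folds of near-equal size.
        folds = [indices[i::k] for i in range(k)]
        for held_out in range(k):
            test = folds[held_out]
            train = [idx for f, fold in enumerate(folds) if f != held_out for idx in fold]
            yield train, test


# 50 repeats of 5-fold cross validation means 250 separate model fits.
splits = list(repeated_kfold(n_samples=100, k=5, repeats=50))
print(len(splits))
```

With roughly 100,000 molecules per dataset and 6000 descriptors per molecule, each of those fits is itself expensive, which is the motivation for moving the workload to a grid.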
  • 35. C-DAC’s Garuda Grid: Indian Grid Computing Initiative. C-DAC is an R&D organization under the Ministry of Communication & Information Technology, India. C-DAC’s Garuda Grid is targeted at providing a facility for the scientific community, which would enable them to seamlessly access the distributed resources. Compute power of GARUDA: ~70 TFs (6000 CPUs). Currently there are 55 Garuda Partners. Has NKN (National Knowledge Network) connectivity at 10 Gbps
  • 36. Features: Customized Galaxy on GARUDA
      • Integrated with Grid Authentication mechanism: Indian Grid Certificate Authority (IGCA)
      • Integrated with Gridway Metascheduler: job scheduling and management
      • Integrated OSDD tools: Weka (for data mining) and Autodock (virtual screening)
      • Provided support to upload multiple input files as a tar file
      • Data libraries of the OSDD community are uploaded and are shared by all users
      • Integrated with PostgreSQL
  • 37. Garuda-Galaxy Job Submission Flow
      1. User selects tool and input parameters (Galaxy GUI, over the Internet)
      2. Based on the tool, the Galaxy Job Manager sends the job to the correct runner
      3. The Gridway job runner uses the user’s Garuda proxy file for job submission (Garuda-OSDD Server)
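The three-step flow above amounts to routing each job to a runner based on the selected tool. The sketch below illustrates that idea only; the class and function names are hypothetical and are not Galaxy's or Gridway's real interfaces, and the tool names are taken from the Weka and Autodock integration mentioned on the previous slide.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict


@dataclass
class Job:
    tool: str                     # tool selected by the user in the GUI (step 1)
    params: Dict[str, str] = field(default_factory=dict)
    proxy_file: str = ""          # path to the user's grid proxy credential


def local_runner(job: Job) -> str:
    """Stand-in for running a job on the local server."""
    return f"local:{job.tool}"


def grid_runner(job: Job) -> str:
    """Stand-in for grid submission; requires the user's proxy file (step 3)."""
    if not job.proxy_file:
        raise ValueError("grid submission needs the user's proxy file")
    return f"grid:{job.tool}:{job.proxy_file}"


class JobManager:
    """Sends each job to the runner registered for its tool (step 2)."""

    def __init__(self, default: Callable[[Job], str]) -> None:
        self._routes: Dict[str, Callable[[Job], str]] = {}
        self._default = default

    def register(self, tool: str, runner: Callable[[Job], str]) -> None:
        self._routes[tool] = runner

    def submit(self, job: Job) -> str:
        return self._routes.get(job.tool, self._default)(job)


manager = JobManager(default=local_runner)
manager.register("autodock", grid_runner)   # compute-heavy tools go to the grid
manager.register("weka", grid_runner)

print(manager.submit(Job(tool="autodock", proxy_file="/tmp/user_proxy")))
print(manager.submit(Job(tool="text_filter")))
```

Keeping the routing table separate from the runners means a new grid-enabled tool only needs one extra `register` call in this sketch.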
  • 38. Weka in Galaxy
  • 39. Garuda Usage by OSDD: Job Accounting
  • 40. High Performance Grid Computing for OSDD members
  • 41. Customized Galaxy with applications as Web Services and on the Grid for Open Source Drug Discovery (OSDD). A CSIR-led Team India consortium with global partnership for affordable healthcare. Anshu Bhardwaj, Council of Scientific & Industrial Research (CSIR), India. Chintalapati Janaki, Center for Development of Advanced Computing (C-DAC). 25-26 May 2011
  • 42. “In the long history of humankind those who have learned to collaborate and improvise most effectively have prevailed.” --Charles Darwin
  • 43. Cheminformatics: a strong case for community collaborative science. There is now an incredibly rich resource of public information relating compounds, targets, genes, pathways, and diseases. Just for starters there is in the public domain information on:
      • ~30 million compounds and ~500,000 bioassays (PubChem, ChemSpider)
      • ~60 million compound bioactivities (PubChem Bioassay)
      • ~5,000 drugs (DrugBank)
      • ~9 million protein sequences (SwissProt) and ~60,000 3D structures (PDB)
      • ~14 million human nucleotide sequences (EMBL)
      • ~20 million life science publications (PubMed)
      • Multitude of other sets (drugs, toxicogenomics, chemogenomics, metagenomics …)
  • 44. Community Speaks: What excites them about Cheminformatics
      • I have thus chosen ‘Cheminformatics’ to study the vast pool of chemical compounds much more in details and analyze so as to narrow down to potential drug candidate. With the unique combination of IT and Chemistry, I am confident that one can actually derive much more meaningful information of a chemical entity on this earth. --- Rajdeep (BioIT)
      • I am organic chemist. I prepared several organic molecules. We go for biological activity, maximum times it gives negative result. But with help of informatics in chemistry we can predict molecular properties. We can replace many ligands or substituents or functional group easily. And we can design our desirable molecule. --- Chirupulo
      • I am doing my M.Pharm in pharmaceutical chemistry, and I like cheminformatics that I need accurate results but soon....and I am really interested in molecular I am here. --- Haffy manaf
      • Cheminformatics deals with information about chems. It combines tools and techniques of IT for information about chemical entities at the finger tip on click of a mouse. Databases are available for properties of descriptors. Softwares help to calculate molecular properties. Cheminformatics thus come handy tool for learning chemistry. --- Dr Keshav Mohan
  • 45. Challenges in implementation of Cheminformatics projects
      • Access to Journals for Chemical Structures
      • Lack of proper communication systems other than Skype
      • Lack of software tools for accelerated drug discovery
      • Need of high speed internet
      • Need more experts to teach/train community members
      • Proper time schedule of IU cheminformatics classes
  • 46. Cheminformatics Awareness: Indiana University Initiatives (Prof David J Wild)
  • 47. Tools Developed for Large Scale Bio-Chemical Data Mining
      • Association Search: visualize literature supported associations between any two entities (compound, drug, gene, pathway, disease, side effect). PLoS One, in press.
      • Semantic Link Association Prediction (SLAP): find most highly associated entities (compound, drug, gene, pathway, disease, side effect) to any other entity, based on probabilistic weightings of graph edges based on public experimental datasets. Paper in preparation.
      • BioLDA: find most highly associated entities to any other entity based on a complex topic model analysis of the literature (PubMed). PLoS One, 2011, 6 (3), e17243
      • See also: WENDI (J. Cheminf., 2010, 2, 6); Chemogenomic Explorer (BMC Bio. 2011, 12, 256); ChemLDA; ChemBioGrid (J. Chem. Inf. Model., 2007; 47(4) pp 1303-1307)
  • 48. OSDD virtual resources
  • 49. Cheminformatics Community of about 400. PubChem, ChEMBL, DrugBank; HT Virtual screening; Curated molecule datasets; Cheminformatics Models; Data Mining and Analysis; Experimental Assays. Other Active Communities:
      • OSDD Women Scientists Forum
      • OSDD Junior Scientists Forum
  • 50. Ideal Case: US-India Cheminformatics Collaboration. Research; Wet lab research; IU CCRG; Industry partnerships; Education; OSDD; Many interested students; Open cheminfo. group
  • 51. But in order to sustain…? Two panels: “Funding for research in U.S.” and “Funding for research in”. Amounts shown: $1.3m NIH; $360,000 Eli Lilly; $120,000 Pfizer; $0 osdd; $46m Govt
  • 52. What should be our approach to reach out and integrate?
      • Most biologists and chemists do not use computational workflows for their analysis
      • Awareness about the advantages of using such workflow engines
      • The Community needs to be trained for using the workflows
      • The Community needs to be trained for integrating applications
      • Web services vs standalone applications: each has its own set of advantages and limitations
      • Developers of algorithms should be encouraged to report results in globally accepted standard formats with standard ontologies
  • 53. OSDD Open Access Resources: Assembly line for drug discovery
      I. Biological Repository
         i. Open access clinical strains repository
         ii. Open access clone repository
         iii. Open access protein repository
      II. Chemical Repository
         i. Open access small molecule repository
      III. Open Screening Facility
         i. Submit your compounds for anti-tuberculosis screening
  • 54. Public Private Partnerships as Open Collaborative Endeavors to solve Scientific Challenges
      • Five synthetic ‘thiophene containing trisubstituted methanes’, which showed an MIC of <1.56 µg/ml and no cytotoxicity in mammalian cells, are being synthesised in PPP Mode (structures s12, s14 and s15 shown)
      • Inhibition of FAAL and FACL enzymes by acyl-sulfamoyl analogues
      • Preclinical development of thiophene containing trisubstituted methanes
  • 55. Collaboration with TB Alliance on Human Clinical Trials: PA-824 in combination with other drugs. Affordable Healthcare for All
  • 56. Systems Biology; Target based approach; Ligand based approach; Hit to Lead; Human Clinical Trials
  • 57. An Innovative Approach to Drug Discovery: A New Paradigm. Innovation Funnel (axes: Value, Risk): Biology/Genomics; Target Identification; Target Validation; Hit(s); Validated/Quality Lead; Optimised Candidate Drug; Registered Drug
      • High Risk, Innovation Driven Sphere. Strategy -> Open Innovation with best minds from academia/industry
      • Process Oriented. Strategy -> Industry CRO’s Participation
      • Clinical Trials. Strategy -> OSDD to support clinical trials in collaboration with pharma
      • Drugs to be available without IP encumbrances
  • 58. Major International Collaborations: Metabolic Map Network Generation; Structural Interactome to predict Off-Site Interactions of Drug Candidates; Cheminformatics and e-learning
  • 59. Geek Nation: How Indian Science Is Taking Over The World. Author: Angela Saini
  • 60. Science 24 February 2012: Vol. 335 no. 6071 p. 909, NEWS FOCUS
  • 61. OSDD Portfolio March 2012
  • 62. OSDD Community & The Team Leaders (not all are shown)
  • 63. Some of the OSDD PIs: Mtb Systems Biology; Target Validation; Cloning of potential drug targets; PPI Validation; OSDDChem; Cheminformatics Community + E-learning; Mtb Genome Analysis; Galaxy Integration with Grid. Email: Skype: anshu.bhardwaj
  • 64. OSDD : A Global Community - More than 5500 members from over 130 countries Statistics as of March 2012
  • 65. Open Source Drug Discovery (OSDD) Model: “Team India Consortium with International Participation”. Open Synthesis and Exchange of Knowledge. Mycobacterium tuberculosis: Candidate Targets; Lead Molecules; in silico SCREENING; in vivo VALIDATION; PRECLINICAL & CLINICAL TRIAL; Drug. Wiki Portal; Contract Research Organisations; Academia & Hospitals; Exchange of Ideas/Results; Community Participation. Lead Organization: Council of Scientific and Industrial Research (CSIR), India. Current Partners
  • 66. Together we can … and we should! Email: Skype: anshu.bhardwaj