Big Data, Big Challenge.
Puneet Kacker, Kanpur
08-OCT-2015
What is Big Data?
• Big Data is data that is too large, complex and dynamic for
conventional data tools to capture, store, manage and
analyze.
• The right use of Big Data lets analysts spot trends and
yields niche insights that help create value and innovation
much faster than conventional methods.
• However, there is more to the big data deluge than mere
volume; in particular, increasing data heterogeneity and
complexity make it difficult to extract knowledge from such
data.
• If the use of big data for drug discovery is indeed to open
new frontiers, and not be mere hype, new visions and concepts
are required to reduce data complexity and increase data
consistency across different sources.
What is the Challenge?
The three “V’s” of incoming data (Volume, Variety and
Velocity) are what create the
challenge.
http://hlwiki.slais.ubc.ca/images/1/1a/Big_data_2013.jpg
1 PB = 1000 TB
big challenges in data storage,
processing and analysis.
Coordinated efforts from both
experimental biologists and
bioinformaticists are required
to overcome these challenges.
Big Biological Data
Open Source
Chemical Compounds
Drug Targets
10,774 Targets
Drug Discovery Through Virtual Screening
One Target, One Compound
Disease
Enzyme, Drug Target
Potential Drug
Candidate
1 Target, 1 Compound, 1 Disease = 1 Molecular Docking Run
One Compound to Many Targets
10,000 Protein
Targets
Disease-1
Disease-2
Disease-N
Potential Drug
Candidate
10,000 Targets, 1 Compound, 10,000 Diseases = Total 10,000 Molecular
Docking Runs
One Compound to Many Targets and Their Conformations
10,000 Protein
Targets
Disease-1
Disease-2
Disease-N
Potential Drug
Candidate
10,000 × 2 Target Conformations, 1 Compound, 10,000 Diseases = Total 20,000 Molecular Docking Runs
Conf-1, Conf-2
Many Compounds to Many Targets and Their Conformations
10,000 Protein
Targets
Disease-1
Disease-2
Disease-N
60,826,590 Potential Compounds
10,000 × 2 Target Conformations, 60,826,590 Compounds, 10,000 Diseases = Total 1,216,531,800,000 Molecular Docking Runs
Conf-1, Conf-2
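The run counts on these slides are simple products: one docking run per pair of target conformation and compound. A minimal Python sketch, assuming exactly that counting rule (the function name `docking_runs` is illustrative and not from the slides):

```python
# One docking run is assumed for every (target conformation, compound) pair.
def docking_runs(targets: int, conformations_per_target: int, compounds: int) -> int:
    """Return the number of docking runs for a full screen."""
    return targets * conformations_per_target * compounds


# The four scenarios shown on the slides.
scenarios = {
    "one target, one compound": docking_runs(1, 1, 1),
    "one compound, many targets": docking_runs(10_000, 1, 1),
    "one compound, many targets, two conformations": docking_runs(10_000, 2, 1),
    "many compounds, many targets, two conformations": docking_runs(10_000, 2, 60_826_590),
}

for name, runs in scenarios.items():
    print(f"{name}: {runs:,} runs")
```

The count grows multiplicatively, so adding one more conformation per target would add another 608,265,900,000 runs to the last scenario.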
Calculation
Suppose one docking run takes 1 minute on a single processor:
• 1,216,531,800,000 / 60 = 20,275,530,000 hours
• 1,216,531,800,000 / (60 × 24) = 844,813,750 days
• 1,216,531,800,000 / (60 × 24 × 30) ≈ 28,160,458 months
• 1,216,531,800,000 / (60 × 24 × 30 × 12) ≈ 2,346,704 years
• 1,216,531,800,000 / (60 × 24 × 30 × 12 × 60) ≈ 39,111 lifetimes of 60 years each
About 85 crore (845 million) processors would be needed to complete all the docking runs in less than a day; with 10 crore processors the job would still take more than 8 days.
An Excel sheet can accommodate 1,048,576 rows by 16,384 columns.
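The arithmetic above can be checked with a short Python sketch. It keeps the slide's assumptions (1 minute per docking run, 30-day months, 12-month years, 60-year lifetimes); the function names `wall_clock` and `processors_needed` are illustrative, and perfect parallel scaling is assumed, which real clusters will not achieve.

```python
TOTAL_RUNS = 1_216_531_800_000  # total docking runs from the slides
MINUTES_PER_RUN = 1             # slide's assumption


def wall_clock(total_runs: int, minutes_per_run: int = 1, processors: int = 1) -> dict:
    """Convert the total workload into elapsed time, assuming ideal parallelism."""
    minutes = total_runs * minutes_per_run / processors
    hours = minutes / 60
    days = hours / 24
    months = days / 30        # 30-day months, as on the slide
    years = months / 12
    lifetimes = years / 60    # 60-year lifetimes, as on the slide
    return {"hours": hours, "days": days, "months": months,
            "years": years, "lifetimes": lifetimes}


def processors_needed(total_runs: int, minutes_per_run: int, deadline_days: int) -> int:
    """Smallest processor count that finishes within the deadline (ceiling division)."""
    deadline_minutes = deadline_days * 24 * 60
    work_minutes = total_runs * minutes_per_run
    return -(-work_minutes // deadline_minutes)


single = wall_clock(TOTAL_RUNS, MINUTES_PER_RUN)
print(f"single processor: {single['days']:,.0f} days, {single['years']:,.0f} years")

one_day = processors_needed(TOTAL_RUNS, MINUTES_PER_RUN, deadline_days=1)
print(f"processors for a one-day finish: {one_day:,}")

ten_crore = wall_clock(TOTAL_RUNS, MINUTES_PER_RUN, processors=100_000_000)
print(f"with 10 crore processors: {ten_crore['days']:.2f} days")
```

Because the single-processor workload is 844,813,750 processor-days, finishing within one day needs at least that many processors under these assumptions.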
What if the same calculations are carried out by two different methods?
Big Data requires Big resources and smart data handling methods
Supporting Tools/Languages
R is a free software environment for
statistical computing and graphics.
https://www.r-project.org/
Hadoop is an open-source framework that
allows big data to be stored and processed in a
distributed environment across clusters of
computers using simple programming models.
https://hadoop.apache.org/
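Hadoop's best-known programming model is MapReduce. The sketch below imitates its three phases (map, shuffle, reduce) in plain Python on a single machine; it does not use any Hadoop API. The docking results, the score threshold and all function names are hypothetical and chosen only for illustration.

```python
from collections import defaultdict

# Hypothetical docking results: (target_id, compound_id, score); lower is better.
results = [
    ("T1", "C1", -9.1),
    ("T1", "C2", -5.0),
    ("T2", "C1", -8.4),
    ("T2", "C3", -8.9),
    ("T3", "C2", -4.2),
]
THRESHOLD = -8.0  # illustrative cut-off for calling a result a "hit"


def map_phase(record):
    """Emit (target, 1) for every docking result that passes the threshold."""
    target, _compound, score = record
    if score <= THRESHOLD:
        yield (target, 1)


def shuffle(pairs):
    """Group emitted values by key, as the framework would between map and reduce."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Sum the counts for one target."""
    return key, sum(values)


def map_reduce(records):
    pairs = (pair for record in records for pair in map_phase(record))
    return dict(reduce_phase(key, values) for key, values in shuffle(pairs).items())


hits_per_target = map_reduce(results)
print(hits_per_target)
```

In a real cluster the map and reduce functions would run on many machines over partitions of the data; only the two small functions would need to be written by the user, which is what makes the model simple.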
Let’s Learn Programming Interactively
http://tryr.codeschool.com/levels/1/challenges/1
Further Reading
And After That
Thank You!
www.puneetsclassroom.in
