Abstract
Pharos (https://pharos.nih.gov/) is an integrated web-based informatics platform for the analysis of data aggregated by the Illuminating the Druggable Genome (IDG) Knowledge Management Center, an NIH Common Fund initiative. The current version of Pharos (as of October 2019) spans 20,244 proteins in the human proteome, 19,880 disease and phenotype associations, and 226,829 ChEMBL compounds. This resource not only collates and analyzes data from over 60 high-quality resources to generate these data types, but also uses text indexing to find less apparent connections between targets, and has recently begun to collaborate with institutions that generate data and resources. Proteins are ranked according to a knowledge-based classification system, which can help researchers identify less-studied “dark” targets that could potentially be further illuminated. This is an important process for both drug discovery and target validation, as more knowledge can accelerate target identification, and previously understudied proteins can serve as novel targets in drug discovery. In this webinar, Dr. Tudor Oprea will introduce how to use Pharos to find targets of interest for drug discovery.
The top 3 key questions that Pharos can answer:
1. What are the novel drug targets that may play a role in a specific disease?
2. What are the diseases that are related directly or indirectly to a drug target?
3. Which researchers are related directly or indirectly to a drug target?
Presenter: Tudor Oprea, MD, PhD, Professor of Medicine, Chief of Translational Informatics Division & Internal Medicine, University of New Mexico
dkNET Webinar Information: https://dknet.org/about/webinar
Drug Repositioning Conference Washington DC 20190923 | Tudor Oprea
Discussing the knowledge-based classification of human proteins and its applications in target repurposing discovery, with potential applications for Rare Diseases
Covering our on-going Machine Learning efforts using Protein Knowledge Graphs and MetaPath / XGBoost to predict novel protein-disease associations. Specific Examples for Type 2 Diabetes.
Computational Drug Repositioning Workflow.
Addressing the limitations and potential of machine learning in target and drug repurposing.
Drug Repositioning Candidates: Alprazolam / Glycopyrronium / Oteracil.
Illuminating the Druggable Genome with Knowledge Engineering and Machine Lear... | Jeremy Yang
Talk given at 14th Annual New Mexico BioInformatics, Science and Technology (NMBIST) Symposium, entitled Integrative Omics, on March 14-15, 2019. Most slides c/o IDG KMC PI Tudor Oprea, MD, PhD.
Mel Reichman on Pool Shark’s Cues for More Efficient Drug Discovery | Jean-Claude Bradley
Mel Reichman, senior investigator and director of the LIMR Chemical Genomics Center at the Lankenau Institute for Medical Research, presents at the chemistry department at Drexel University on November 12, 2009.
Modern drug discovery by high-throughput screening (HTS) begins with testing hundreds of thousands of compounds in biological assays. The confirmed hit rate for typical HTS is less than 0.5%; therefore, 99.5% of the costs of HTS are for generating null data. Orthogonal convolution of compound libraries (OCL) is 500% more efficient than present HTS practice. The OCL method combines 10 compounds per well. An advantage of this method is that each compound is represented twice in two separately arrayed pools. The potential for the approach to better enable academic centers of excellence to validate medicinally relevant biological targets is discussed.
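The pooling idea described above can be illustrated with a small sketch. It is not the OCL protocol itself: the summary does not say how the two pools are arrayed, so the code assumes a simple 10 x 10 grid in which every row and every column forms one pool of 10 compounds. The compound names, the function names, and the single simulated active are all hypothetical.

```python
def build_orthogonal_pools(compounds, side=10):
    """Lay compounds out on a side x side grid (an assumed layout).

    Each row is one pool and each column is one pool, so every compound
    appears in exactly two pools that share no other compound.
    """
    if len(compounds) != side * side:
        raise ValueError("expected exactly side * side compounds")
    grid = [compounds[r * side:(r + 1) * side] for r in range(side)]
    row_pools = [set(row) for row in grid]
    col_pools = [{grid[r][c] for r in range(side)} for c in range(side)]
    return row_pools, col_pools


def decode_hits(row_pools, col_pools, active_rows, active_cols):
    """Return compounds that sit in both an active row pool and an active column pool."""
    candidates = set()
    for r in active_rows:
        for c in active_cols:
            candidates |= row_pools[r] & col_pools[c]
    return candidates


# Hypothetical library of 100 compounds with one simulated active.
compounds = [f"cmpd_{i:03d}" for i in range(100)]
row_pools, col_pools = build_orthogonal_pools(compounds)
true_actives = {"cmpd_037"}

# Simulate the assay: a pool reads as active if it contains an active compound.
active_rows = [i for i, pool in enumerate(row_pools) if pool & true_actives]
active_cols = [i for i, pool in enumerate(col_pools) if pool & true_actives]

candidates = decode_hits(row_pools, col_pools, active_rows, active_cols)
wells_used = len(row_pools) + len(col_pools)
print(sorted(candidates), wells_used)
```

In this toy layout, 100 compounds occupy 20 wells rather than 100, and a single active is identified by the intersection of its two pools. With several actives the intersections can become ambiguous (two actives in different rows and columns yield four candidates), so the candidates would still need individual retesting. Whether this layout corresponds to the efficiency figure quoted in the summary is not stated there.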
Bioinformatics in the Clinical Pipeline: Contribution in Genomic Medicine | iosrjce
In this review we focus on the new methodological challenges of modern biology as applied in medical science. Human health and the cure of disease are primary concerns today, and bioinformatics (in silico) tools have changed the concept of treating patients and clarified the need for genomic medicine. Finding treatments with new modes of action is a major concern in medical science. From a global perspective, developing new ideas to improve health care and treat existing diseases is a challenging task. Awareness is therefore needed to accelerate the storage of clinical datasets that scientists can use to design genomic drugs. This new outline will drive the medical community to explore public data and to develop a cognitive approach that uses technology in a cost-effective way.
Talk delivered at Warwick Biomedical Engineering Seminar series 27 November 2014. Develops a theme emerging from a review in 2010:
J Watkins, A Marsh, P C Taylor, D R J Singer
Therapeutic Delivery, 2010, 1, 651-665
"Continued adherence to a single-drug single-target paradigm will limit the ability of chemists to contribute to advances in personalized medicine, whether they be in discovery or delivery"
Identification of PFOA linked metabolic diseases by crossing databases | Yoann Pageaud
The increasing amount of biological data makes their interpretation more accurate and richer than ever before. Various ways of representing and interpreting the links between those data have consequently been applied or developed, and these new elements can be taken into account in diagnostics and, soon, in personalized medicine. The aim of this student project was to cross data coming from various databases in order to link perfluorooctanoic acid (PFOA) to one or more human phenotypes and metabolic diseases. Our approach makes possible an easy and confident interpretation of the data kept and also allows us to rank the linked diseases according to their risk of correlation to a specific set of proteins.
The Monarch Initiative: From Model Organism to Precision Medicine | mhaendel
NIH BD2K all-hands meeting poster November 12, 2015.
Attempts at correlating phenotypic aspects of disease with causal genetic influences are often confounded by the challenges of interpreting diverse data distributed across numerous resources. New approaches to data modeling, integration, tooling, and community practices are needed to make efficient use of these data. The Monarch Initiative is an international consortium working on the development of shared data, tools, and standards to enable direct translation of integrated genotype, phenotype, and environmental data from human and model organisms to enhance our understanding of human disease. We utilize sophisticated semantic mapping techniques across a diverse set of standardized ontologies to deeply integrate data across species, sources, and modalities. Using phenotype similarity matching algorithms across these data enables disorder prediction, variant prioritization, and patient matching against known diseases and model organisms. These similarity algorithms form the core of several innovative tools. The Exomiser enables exome variant prioritization by combining pathogenicity, frequency, inheritance, protein interaction, and cross-species phenotype data. Our Phenotype Sufficiency tool provides clinicians the ability to compare patient phenotypic profiles using the Human Phenotype Ontology to determine uniqueness and specificity in support of variant prioritization. The PhenoGrid visualization widget illustrates phenotype similarity between patients, known diseases, and model organisms. Monarch develops models in collaboration with the community in support of the burgeoning genotype-phenotype disease research community. We have successfully used Exomiser to solve a number of undiagnosed patient cases in collaboration with the NIH Undiagnosed Disease Program.
Ongoing development in coordination with the Global Alliance for Genomics and Health (GA4GH) and other groups will catalyze the realization of our goal of a vital translational community focused on the collaborative application of integrated genotype, phenotype, and environmental data to human disease.
Discuss AI, machine learning, and the hype cycle
Discuss the knowledge-based classification of proteins
Discuss applications of AI/ML to drug discovery
dkNET Webinar: The 4DN Data Portal - Data, Resources and Tools to Help Elucid... | dkNET
Presenter: Andrew Schroeder, PhD. Project Manager & Senior Data Curator, 4D Nucleome Data Coordination and Integration Center (4DN-DCIC), Park Lab, Department of Biomedical Informatics, Harvard Medical School
Abstract
The Common Fund 4D Nucleome program, currently in its 9th year, is a consortium of researchers that aims to understand the principles behind the three-dimensional organization of the nucleus and how this organization can change over time to affect a variety of cellular processes. The 4DN Data Portal (data.4dnucleome.org) is an expanding resource hosting data generated by the 4DN Network and other reference nucleomics data sets. The portal provides tools for search, exploration, visualization, and download. An overview of the data portal, highlighting available data, how it can be found, visualized and used for analyses will be presented.
The top 3 key questions that the 4DN data portal can answer:
1. Are there significant sites of long-range chromatin contacts near my gene or region of interest?
2. What omics datasets are available for my tissue of interest?
3. Are there imaging datasets available that are relevant to my tissue of interest?
Upcoming webinars schedule: https://dknet.org/about/webinar
dkNET Office Hours: NIH Data Management and Sharing Mandate 05/03/2024 | dkNET
Presenter: Jeffrey Grethe, PhD, Principal Investigator of NIDDK Information Network (dkNET), Center for Research in Biological Systems, University of California San Diego
For all proposals submitted on or after January 25, 2023, NIH requires the sharing of data from all NIH-funded studies. Do you have appropriate data management practices and sharing plans in place to meet these requirements? Have questions or need some help? Join the dkNET office hours to learn about NIH’s policy (NOT-OD-21-013) and resources that could help.
*Previous Office Hours Slides and Recording: https://dknet.org/rin/research-data-management
Upcoming Webinars Schedule: https://dknet.org/about/webinar
More Related Content
Similar to dkNET Webinar: Illuminating The Druggable Genome With Pharos 10/23/2020
dkNET Webinar "Texera: A Scalable Cloud Computing Platform for Sharing Data a... | dkNET
Presenter: Chen Li, PhD. Professor, Department of Computer Science, University of California Irvine
Abstract
Many data analytics projects have collaborators with complementary backgrounds, including biologists, bioinformaticians, computer scientists, and AI/ML experts. Many of them have limited experience in coding, setting up a computing infrastructure, and using ML models. Existing tools and services, such as email attachments, GitHub, and Google Drive, are inefficient for sharing data and analyses. In this talk, we present an open source system called Texera that provides a cloud computing platform for collaborators to share data and analyses as workflows. After seven years of development, the system has a rich set of powerful features, such as shared editing, shared execution, version control, commenting, debugging, user-defined functions in multiple languages (e.g., Python, R, Java), and support of state-of-the-art AI/ML techniques. Its backend parallel engine enables scalable computation on large data sets using computing clusters. We will show a demo of the system, and present our vision supported by a recent NIH award, dkNET (NIDDK Information Network, https://dknet.org), to serve the diabetes, endocrinology, and metabolic diseases research communities through the FAIR sharing of data and knowledge.
Resource link: https://github.com/Texera/texera
Upcoming webinars schedule: https://dknet.org/about/webinar
dkNET Webinar: Unlocking the Power of FAIR Data Sharing with ImmPort 04/12/2024 | dkNET
Presenter: Sanchita Bhattacharya, ImmPort Science Program Lead, Bakar Computational Health Sciences Institute UCSF
Abstract
The Immunology Database and Analysis Portal (ImmPort, https://www.immport.org/home) is a domain-specific data repository for immunology-related data which is funded by the National Institutes of Health, National Institute of Allergy and Infectious Diseases, and Division of Allergy, Immunology, and Transplantation. ImmPort has been making scientific data Findable, Accessible, Interoperable, and Reusable (FAIR) for over 20 years. ImmPort data sets encompass over 7 million experimental results across 160 diseases and conditions, including data related to diabetes, kidney and liver transplantation, celiac disease, and many more conditions. In this webinar, participants will learn about data management and sharing through ImmPort, as well as finding and leveraging data sets of interest for research.
The top 3 key questions that ImmPort can answer:
1. How can researchers share data through ImmPort to comply with the NIH Data Management and Sharing policy?
2. How does ImmPort support FAIR data and why is this powerful for research?
3. What scientific data does ImmPort house that would be of interest to NIDDK researchers?
Upcoming webinars schedule: https://dknet.org/about/webinar
Presenter: Angela Oliveira Pisco, PhD
Abstract
Although the genome is often called the blueprint of an organism, it is perhaps more accurate to describe it as a parts list composed of the various genes that may or may not be used in the different cell types of a multicellular organism. While nearly every cell in the body has essentially the same genome, each cell type makes different use of that genome and expresses a subset of all possible genes. This has motivated efforts to characterize the molecular composition of various cell types within humans and multiple model organisms, both by transcriptional and proteomic approaches. We created a human reference atlas comprising nearly 500,000 cells from 24 different tissues and organs, many from the same donor. This atlas enabled molecular characterization of more than 400 cell types, their distribution across tissues, and tissue-specific variation in gene expression. One caveat to current approaches to make cell atlases is that individual organs are often collected at different locations, collected from different donors, and processed using different protocols. Controlled comparisons of cell types between different tissues and organs are especially difficult when donors differ in genetic background, age, environmental exposure, and epigenetic effects. To address this, we developed an approach to analyzing large numbers of organs from the same individual. We collected multiple tissues from individual human donors and performed coordinated single-cell transcriptome analyses on live cells. The donors come from a range of ethnicities, are balanced by gender, have a mean age of 51 years, and have a variety of medical backgrounds. Tissue experts used a defined cell ontology terminology to annotate cell types consistently across the different tissues, leading to a total of 475 distinct cell types with reference transcriptome profiles. The Tabula Sapiens also provided an opportunity to densely and directly sample the human microbiome throughout the gastrointestinal tract. 
The Tabula Sapiens has revealed discoveries relating to shared behavior and subtle, organ-specific differences across cell types. We found T cell clones shared between organs and characterized organ-dependent hypermutation rates among B cells. Endothelial cells and macrophages are shared across tissues, often showing subtle but clear differences in gene expression. We found an unexpectedly large and diverse amount of cell type–specific RNA splice variant usage and discovered and validated many previously undefined splices. The intestinal microbiome was revealed to have nonuniform species distributions down to the 3-inch (7.62-cm) length scale. These are but a few examples of how the Tabula Sapiens represents a broadly useful reference...Full abstract: https://dknet.org/about/blog/2726
Resource link: https://tabula-sapiens-portal.ds.czbiohub.org
Upcoming webinars schedule: https://dknet.org/about/webinar
dkNET Webinar "The Multi-Omic Response to Exercise Training Across Rat Tissue... | dkNET
Presenter: Malene Lindholm, PhD, Instructor, Department of Medicine, Stanford University
Abstract
The Molecular Transducers of Physical Activity Consortium (MoTrPAC) aims to map the molecular responses to exercise and training to elucidate how exercise improves health and prevents disease. The first MoTrPAC data provides an extensive temporal map of the dynamic multi-omic response to endurance training across multiple rat tissues. All results can be viewed, interrogated, and downloaded in a user-friendly, publicly accessible data portal (https://motrpac-data.org). The MoTrPAC data compendium includes transcriptomics, proteomics, metabolomics, phosphoproteomics, acetylproteomics, ubiquitylproteomics, DNA methylation, chromatin accessibility, and multiplexed immunoassay data. This compilation consists of 211 datasets across 19 tissues, 25 molecular assays, and 4 training time points in adult male and female rats. Over 35,000 analytes were found to be differentially regulated in response to endurance training, with many displaying sexual dimorphism. We observed a male-specific recruitment of immune cells to adipose tissues and an anticorrelated transcriptional response in the adrenal gland related to the stress response. Temporal multi-omic and multi-tissue integration demonstrated similar temporal responses in the heart and skeletal muscle, reflecting a concerted adaptation of mitochondrial biogenesis and metabolism. Integrative multi-omic network analysis revealed connections between the heat shock-mediated stress response and mitochondrial biogenesis. Training increased phospholipids and decreased triacylglycerols in the liver, and there were extensive changes to mitochondrial protein acetylation. Many changes were relevant for human health conditions, such as non-alcoholic fatty liver disease, inflammatory bowel disease, cardiovascular wellness, and tissue damage and repair. Altogether, this MoTrPAC resource provides an unprecedented view of the effects of exercise across an organism, revealing mechanistic details of how exercise impacts mammalian health.
The MoTrPAC data hub is the primary online resource to disseminate this large-scale multi-omics data.
The top 3 questions that the MoTrPAC resource can answer:
1. What is the multi-omic response to endurance exercise across different tissues?
2. What are the top signaling pathways affected in response to exercise and do they differ between males and females?
3. How can the MoTrPAC data hub be utilized to interrogate all the MoTrPAC findings?
Upcoming webinars schedule: https://dknet.org/about/webinar
dkNET Webinar: The Collaborative Microbial Metabolite Center – Democratizing ... | dkNET
Presenter: Pieter Dorrestein, PhD, Professor, Skaggs School of Pharmacy and Pharmaceutical Sciences, Department of Pharmacology and Pediatrics, University of California San Diego
Abstract
In the analysis of organs, volatilome, or biofluids, the microbiome influences 15-70% of detectable mass spectrometry molecules. Typically, only 10% of human untargeted metabolomics data can be assigned a molecular structure, with merely 1-2% traceable to microbial origins. Human microbiomes contribute metabolites through the microbial metabolism of host-derived substances, digestion of food and beverage molecules, and de novo assembly using proteins encoded by genetic elements. Despite the significance of microbiome-derived metabolites to human health, there is no centralized knowledge base for community access. To address this, the "Collaborative Microbial Metabolite Center" (CMMC) leverages expertise in mass spectrometry, microbiome innovation, and the GNPS ecosystem to build a knowledge base. It aims to create a user-accessible microbiome resource, enrich bioactivity knowledge, and facilitate data deposition. The CMMC includes a knowledge base, the MicrobeMASST tool, and health phenotype enrichment workflows; their construction and use will be discussed in this presentation. The use of this ecosystem will be exemplified by the discovery of 20,000 bile acids, many of which were shown to be of microbial origin and linked to diet and IBD.
The top 3 key questions that this resource can answer:
1. How can we leverage the thousands of public metabolomics studies to discover microbial metabolites, their organ distributions, and their phenotypic (including health) associations?
2. If one has an unknown molecule, how can one assess what microbes make a molecule without known structure?
3. How can one contribute to the expansion of the knowledgebase on microbial metabolites?
Upcoming webinars schedule: https://dknet.org/about/webinar
dkNET Webinar: An Encyclopedia of the Adipose Tissue Secretome to Identify Me... | dkNET
Presenter: Paul Cohen, MD, PhD, Albert Resnick, M.D. Associate Professor, Rockefeller University
Abstract
White and brown adipocytes not only play a central role in energy storage and combustion but are also dynamic secretory cells that secrete signaling molecules linking levels of energy stores to vital physiological systems. Disruption of the signaling properties of adipocytes, as occurs in obesity, contributes to insulin resistance, type 2 diabetes, and other metabolic disorders. Fat cells have been estimated to secrete over 1,000 polypeptides and microproteins and an even larger number of small molecule metabolites. The great majority of the adipocyte secretome has not been defined or characterized. A major obstacle has been the lack of suitable technologies to quantitatively identify circulating proteins and metabolites, determine their cellular origin, and elucidate their function. Building on key innovations in chemical biology and mass spectrometry, our team is generating an encyclopedia of the white and brown adipocyte secretome in mouse models and humans. Our work has the potential to identify new secreted mediators with roles in obesity, type 2 diabetes, and metabolic diseases, provide a crucial resource for researchers and clinicians, and lead to new biomarkers and therapies.
The top 3 key questions that this resource can answer:
1. What techniques can be used to characterize the secretome of a cell type in vitro and in vivo?
2. What is the full complement of proteins and metabolites secreted by different kinds of adipocytes?
3. How should one prioritize uncharacterized secreted mediators for functional study?
Resource link: https://secrepedia.org/
Upcoming webinars schedule: https://dknet.org/about/webinar
dkNET Webinar: A Single Cell Atlas of Human and Mouse White Adipose Tissue 11... | dkNET
Presenter: Margo Emont, PhD. Instructor, Beth Israel Deaconess Medical Center/Harvard Medical School
Abstract
White adipose tissue, once regarded as morphologically and functionally bland, is now recognized to be dynamic, plastic and heterogenous, and is involved in a wide array of biological processes including energy homeostasis, glucose and lipid handling, blood pressure control and host defense. High-fat feeding and other metabolic stressors cause marked changes in adipose morphology, physiology and cellular composition, and alterations in adiposity are associated with insulin resistance, dyslipidemia and type 2 diabetes. Here we provide detailed cellular atlases of human and mouse subcutaneous and visceral white fat at single-cell resolution across a range of body weight. We identify subpopulations of adipocytes, adipose stem and progenitor cells, vascular and immune cells and demonstrate commonalities and differences across species and dietary conditions. We link specific cell types to increased risk of metabolic disease and provide an initial blueprint for a comprehensive set of interactions between individual cell types in the adipose niche in leanness and obesity. These data comprise an extensive resource for the exploration of genes, traits and cell types in the function of white adipose tissue across species, depots and nutritional conditions.
The top 3 key questions that this resource can answer:
1. How specific is my gene of interest to a particular cell type in adipose tissue?
2. Is the gene/pathway that I am studying in mouse adipose tissue also present in human adipose tissue (and is it regulated similarly in low vs high body weight)?
3. What are the changes in gene expression in a specific cell type at low vs high body weight?
Resource link:
https://singlecell.broadinstitute.org/single_cell/study/SCP1376/a-single-cell-atlas-of-human-and-mouse-white-adipose-tissue#study-summary
Upcoming webinars schedule: https://dknet.org/about/webinar
dkNET Webinar "The National Sleep Research Resource (NSRR) - Opportunities fo...dkNET
Presenter: Susan Redline, MD, MPH, Peter C. Farrell Professor of Sleep Medicine, Professor of Epidemiology, Harvard T.H. Chan School of Public Health
Abstract
Experimental, clinical and epidemiological studies have identified multiple inter-relationships of sleep with glucose regulation and metabolic disease. In one meta-analysis, after overweight and family history of diabetes, the next 7 top risk factors for incident diabetes were measures of sleep health. These included poor sleep quality, insomnia, short or extremely long sleep duration, and sleep apnea; each sleep problem was associated with incident diabetes with relative risks ranging from 1.38 to 1.74. A mechanism linking sleep apnea with diabetes is through the effects of intermittent hypoxemia on insulin sensitivity. However, studies using neurophysiological markers of sleep in healthy adults showed that selective reduction of slow wave sleep reduced glucose tolerance by 23%, thus additionally suggesting the importance of neurophysiological mechanisms during sleep in glucose regulation. In support of this, longitudinal epidemiological studies demonstrated that higher proportions of slow wave sleep (N3) were protective against the development of type 2 diabetes. Recent animal and human studies also point to the effects of sleep micro-architecture, specifically the coupling of slow waves and spindles, on short-term and long-term glucose regulation, possibly through effects on signaling between the hippocampus and hypothalamus, and changes in autonomic nervous system output. Experimental data also demonstrate a prominent role of the circadian system in regulating glucose and lipid levels. In support of those studies, epidemiological studies have identified significant associations of actigraphy-based measures of sleep irregularity (a marker of circadian disruption) with incident metabolic dysfunction and hypertension.
These rich data implicating sleep disturbances as drivers of metabolic disease, coupled with data indicating a high prevalence of sleep and circadian disorders in the population, point to novel opportunities to target sleep and circadian pathways for preventing or treating metabolic dysfunction, and to key knowledge gaps.
The National Sleep Research Resource (NSRR; sleepdata.org) provides a large and growing repository of well-annotated polysomnograms (PSGs), actigraphy studies, and questionnaires, some associated with clinical and biochemical data relevant to understanding the links between sleep and circadian disorders and metabolic disease. Notably, the NSRR includes over 50,000 PSGs, which concurrently include multiple physiological signals with high temporal resolution, allowing generation of thousands of variables summarizing dynamic physiological changes and “cross-talk” between physiological systems...(Please see https://dknet.org/about/blog/2674 for full abstract)
Upcoming webinars schedule: https://dknet.org/about/webinar
dkNET Office Hours - "Are You Ready for 2023: New NIH Data Management and Sha...dkNET
For all proposals submitted on/after January 25, 2023, NIH will require the sharing of data from all NIH-funded studies. Do you have appropriate data management practices and sharing plans in place to meet these requirements? Have questions or need some help? Join the dkNET office hours to learn about NIH’s policy (NOT-OD-21-013) and resources that could help.
*Previous Office Hours Slides and Recording: https://dknet.org/rin/research-data-management
Upcoming Webinars Schedule: https://dknet.org/about/webinar
dkNET Webinar: Discover the Latest from dkNET - Biomed Resource Watch 06/02/2023dkNET
Presenter: Jeffrey Grethe, PhD, dkNET Principal Investigator, University of California San Diego
Abstract
The dkNET (NIDDK Information Network) team is announcing an exciting new service - Biomed Resource Watch (BRW, https://scicrunch.org/ResourceWatch), a knowledge base for aggregating and disseminating known problems and performance information about research resources such as antibodies, cell lines, and tools. We aggregate trustworthy information from authorized sources such as Cellosaurus, Antibody Registry, Human Protein Atlas, ENCODE, and many more. In addition, BRW includes antibody specificity text mining information extracted from the literature via natural language processing. BRW provides researchers and curators an easy-to-use interface to report their claims about a specific resource. Researchers can check information about a resource before planning their experiments via BRW-enhanced Resource Reports. This new service aims to help improve efficiency in selecting appropriate resources, enhancing scientific rigor and reproducibility, and promoting a FAIR (Findable, Accessible, Interoperable, Reusable) research resource ecosystem in the biomedical research community.
Join us for a webinar to introduce the following resources & topics:
1. An overview of dkNET
2. How Resource Reports benefit you
3. Biomed Resource Watch
3.1 Navigating Biomed Resource Watch
3.2 How to Submit a Claim
Upcoming webinars schedule: https://dknet.org/about/webinar
dkNET Webinar: Leveraging Computational Strategies to Identify Type 1 Diabete...dkNET
dkNET New Investigator Pilot Program in Bioinformatics Awardee Webinar Series
Presenter: Wenting Wu, PhD. Research Assistant Professor, Center for Diabetes and Metabolic Diseases, Department of Medical and Molecular Genetics, Associate Director of Data and Analytics Core for Center for Diabetes and Metabolic Diseases, Indiana University School of Medicine
Abstract
Type 1 diabetes (T1D) is an immune-mediated disease that results in insulin insufficiency and affects 0.3% of the population, including both children and adults. To support clinical trial efforts, there is an urgent need to develop reliable biomarkers capable of predicting T1D risk and guiding therapeutic interventions. Recently, whole blood bulk RNA sequencing has been used to guide T1D clinical trial design and assess response to disease-modifying interventions. While the use of bulk RNA sequencing is cost-effective, these datasets provide limited information about cell-specific gene expression changes. Here, we aimed to apply computational strategies to deconvolute cell type composition using cell-specific gene expression references. Single-cell RNA sequencing (scRNA-seq) was conducted to profile peripheral blood mononuclear cells obtained from youth with recent-onset T1D and age- and sex-matched controls, and identified 31 distinct cell clusters. Using this pre-defined reference dataset, we ran CIBERSORTx and other deconvolution methods simultaneously to deconvolute cell proportions using public clinical trial data. We focused our initial analysis on data from the TN-20 Rituximab trial, which tested the anti-CD20 monoclonal antibody rituximab vs placebo in recent-onset T1D. This talk will introduce recent advances in scRNA-seq techniques and computational deconvolution methods and demonstrate how we apply different deconvolution approaches for secondary analysis of existing clinical trial data, with the aim of linking cell-specific immune signatures to drug responder status.
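The deconvolution idea in this abstract can be illustrated with a toy example. The sketch below is not CIBERSORTx and not the presenter's pipeline; it swaps in plain non-negative least squares solved by projected gradient descent, and the signature matrix, bulk profile and cell-type fractions are made-up numbers chosen only for illustration.

```python
# Toy bulk-expression deconvolution (illustrative assumption, not CIBERSORTx).
# Rows of SIGNATURE are genes, columns are cell types (hypothetical values).
SIGNATURE = [
    [10.0, 1.0, 0.0],
    [0.0, 8.0, 1.0],
    [1.0, 0.0, 9.0],
    [2.0, 2.0, 2.0],
]
TRUE_FRACTIONS = [0.5, 0.3, 0.2]  # hypothetical ground truth for the toy mixture

# Simulated bulk profile: each gene is the fraction-weighted sum of the references.
BULK = [sum(s * f for s, f in zip(row, TRUE_FRACTIONS)) for row in SIGNATURE]


def deconvolve(signature, bulk, steps=2000):
    """Estimate non-negative cell-type fractions p minimizing ||S p - b||^2."""
    n_genes, n_types = len(signature), len(signature[0])
    # Step size 1 / ||S||_F^2 is a safe bound for gradient descent on this problem.
    lr = 1.0 / sum(v * v for row in signature for v in row)
    p = [1.0 / n_types] * n_types
    for _ in range(steps):
        # Residual r = S p - b, one entry per gene.
        resid = [
            sum(signature[g][k] * p[k] for k in range(n_types)) - bulk[g]
            for g in range(n_genes)
        ]
        # Gradient of 0.5 * ||r||^2 with respect to p is S^T r.
        grad = [
            sum(signature[g][k] * resid[g] for g in range(n_genes))
            for k in range(n_types)
        ]
        # Gradient step followed by projection onto p >= 0.
        p = [max(0.0, p[k] - lr * grad[k]) for k in range(n_types)]
    total = sum(p)
    return [v / total for v in p] if total > 0 else p


if __name__ == "__main__":
    print(deconvolve(SIGNATURE, BULK))
```

On noise-free toy data the estimate recovers the mixing fractions; real bulk data would need normalization, marker-gene selection and a method validated for that purpose.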
Upcoming webinars schedule: https://dknet.org/about/webinar
dkNET Webinar: Estimating Relative Beta-Cell Function During Continuous Gluco...dkNET
dkNET New Investigator Pilot Program in Bioinformatics Awardee Webinar Series
Presenter: Joon Ha, PhD. Associate Professor, Department of Mathematics, Howard University, Washington DC.
Abstract
The most common form of diabetes, type 2 diabetes (T2D), results from a failure of insulin-secreting pancreatic beta-cells to increase insulin to the level required to maintain normal blood glucose. Thus, identifying beta-cell function and insulin sensitivity in those who are at high risk is crucial to preventing and delaying the disease. The hyperglycemic clamp and the euglycemic hyperinsulinemic clamp are considered to be gold standard measures for these quantities. However, these two methods demand highly skilled labor and thus are cost-prohibitive. Glucose challenge tests have been used to estimate beta-cell function and insulin sensitivity. The product of beta-cell function and insulin sensitivity, termed the disposition index (DI), is of great value because it measures beta-cell function relative to insulin requirements. However, glucose challenge tests are expensive and time-consuming and therefore impractical to implement in large-scale clinical studies. To address this challenge, we developed a model disposition index (mDI) that does not require insulin measurements during an oral glucose tolerance test (OGTT) (Ha et al., Diabetes 2021 (70) suppl. 1). mDI outperforms the conventional oral disposition index (oDI) at predicting progression to diabetes.
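As a small worked example of the disposition index defined above (the product of beta-cell function and insulin sensitivity), the sketch below uses arbitrary, unitless, hypothetical values. It does not reproduce the mDI model, whose equations are not given in this abstract.

```python
def disposition_index(beta_cell_function, insulin_sensitivity):
    """Disposition index: beta-cell function multiplied by insulin sensitivity."""
    if beta_cell_function < 0 or insulin_sensitivity < 0:
        raise ValueError("both inputs must be non-negative")
    return beta_cell_function * insulin_sensitivity


# Two hypothetical people with the same insulin secretion (values are made up).
# The insulin-resistant person has the lower DI, because the same secretion
# covers a smaller share of what their body requires.
DI_SENSITIVE = disposition_index(2.0, 1.0)
DI_RESISTANT = disposition_index(2.0, 0.5)

if __name__ == "__main__":
    print(DI_SENSITIVE, DI_RESISTANT)
```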
To further increase access and refine the assessments of beta-cell function, we are adapting our model to calculate a model disposition index using continuous glucose monitoring (CGM). CGM has been in the spotlight of diabetes management and has revolutionized the field of medicine, as CGM devices are approved for glucose monitoring and clinical decision-making in patients with diabetes. CGM devices are relatively inexpensive compared to oral glucose challenge tests, accessible, and simple to use, especially in remote or free-living environments. The CGM device measures interstitial glucose every 5 minutes and provides glucose profiles for 7-14 days. Thus, there are numerous data points compared to glucose challenge tests, but the abundant data points have not previously been used for estimating metabolic parameters. We compared mDI to two widely used CGM-derived metabolic parameters for assessing metabolic status and risk, mean glucose and glycemic excursion. Both mean glucose and glycemic excursion correlated strongly with mDI. The new approach promises to be cost-effective and easy to perform and therefore implementable in large-scale clinical studies. As for specific clinical applications, estimated model parameters during OGTTs identified ethnic differences in common pathways to T2D between Pima Indians and Koreans.
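To make the data volume and the two CGM-derived parameters concrete, here is a minimal sketch. The abstract does not define "glycemic excursion"; the code assumes the simplest reading (maximum minus minimum glucose), and the readings are invented values in mg/dL.

```python
from statistics import mean

# One reading every 5 minutes, as stated in the abstract.
READINGS_PER_DAY = 24 * 60 // 5


def readings_for(days):
    """Number of CGM readings collected over a wear period of `days` days."""
    return days * READINGS_PER_DAY


def cgm_summary(glucose_mg_dl):
    """Mean glucose and excursion (assumed here to be max minus min)."""
    if not glucose_mg_dl:
        raise ValueError("need at least one glucose reading")
    return {
        "mean_glucose": mean(glucose_mg_dl),
        "excursion": max(glucose_mg_dl) - min(glucose_mg_dl),
    }


if __name__ == "__main__":
    print(readings_for(7), readings_for(14))
    print(cgm_summary([100, 140, 120]))  # hypothetical readings
```

A 7 to 14 day wear period therefore yields a few thousand readings, compared with the handful of time points in a typical glucose challenge test.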
Upcoming webinars schedule: https://dknet.org/about/webinar
dkNET Office Hours - "Are You Ready for 2023: New NIH Data Management and Sha...dkNET
For all proposals submitted on/after January 25, 2023, NIH requires data sharing from all NIH-funded studies. Do you have appropriate data management practices and sharing plans in place to meet these requirements? Have questions or need some help? Join the dkNET office hours to learn about NIH’s policy (NOT-OD-21-013) and available resources that could help.
In our upcoming session on March 3, 2023, we are pleased to invite Dr. Jeffrey Grethe, dkNET co-PI and expert on Data Management and Sharing, Dr. Rebecca Rodriguez, Repository Program Director at NIDDK, Ms. Reaya Reuss, Chief of Staff to the Deputy Director at NIDDK, and the support team members from the NIDDK Central Repository. They will be available to answer any questions you may have.
*Previous Office Hours Slides and Recording: https://dknet.org/about/blog/2535
Upcoming Webinars Schedule: https://dknet.org/about/webinar
dkNET Webinar: Postpartum Glucose Screening Among Homeless Women with Gestati...dkNET
dkNET New Investigator Pilot Program in Bioinformatics Awardee Webinar Series
Presenter: Rie Sakai-Bizmark, PhD. Assistant Professor, The Lundquist Institute at Harbor-UCLA Medical Center, David Geffen School of Medicine at UCLA
Abstract
Women with gestational diabetes mellitus (GDM) are at high risk of developing glucose intolerance after delivery. In the long term, women with GDM have a nearly 10-fold higher risk of developing type 2 diabetes mellitus (T2D) than women without GDM. The American Diabetes Association (ADA) and the American College of Obstetricians and Gynecologists (ACOG) recommend that women with GDM undergo a 75-g oral glucose tolerance test (OGTT) between four and 12 weeks postpartum, and periodically thereafter. However, the postpartum glucose screening (PGS) rate has historically been low despite various interventions to improve it. We hypothesized that the PGS rate is lower among postpartum homeless women than among their housed counterparts, and that interventions to improve the PGS rate among postpartum homeless women with GDM should be tailored to their unique circumstances. The Japanese Society of Diabetes and Pregnancy (JSDP) modified the method to perform PGS with random plasma glucose (RPG) and glycated hemoglobin (HbA1c), which are simple and less invasive, to reduce the risk of COVID-19 infection by shortening the time spent in the hospital. RPG and HbA1c tests do not require fasting. Therefore, homeless women who utilized care for other reasons could have the test as PGS. Given the barriers faced by homeless individuals, we hypothesize that RPG and HbA1c testing at healthcare utilizations during the postpartum period could be one of the strategies to identify high-risk individuals early because 1] healthcare utilizations are an opportunity for healthcare providers and social workers to educate homeless patients on GDM and their insurance eligibility and coverage for the screening, and 2] the physical barriers to health care access, which are often cited as a reason for the low PGS rate, are removed.
This proposed study will use administrative data from five states (AZ, CO, NC, NJ, and OR), which collectively include 9.3% of the US female homeless population. Each state will provide detailed, linked, multi-level, anonymized data for postpartum homeless women from four sources: 1] Medicaid claims; 2] Homeless Management Information System (HMIS); 3] birth records; and 4] the American Hospital Association (AHA) database to obtain hospital characteristics. With data from 2013 to 2020, an estimated sample size of 24,000 homeless women who delivered babies and 3,290 postpartum homeless women with GDM will be included.
First, we will estimate rates of GDM and PGS among homeless women. Second, we will estimate the cost-effectiveness of performing RPG and HbA1c tests...[Full abstract: https://dknet.org/about/blog/2581]
Upcoming webinars schedule: https://dknet.org/about/webinar
dkNET Webinar: Choosing Sample Sizes for Multilevel and Longitudinal Studies ...dkNET
dkNET Webinar: Choosing Sample Sizes for Multilevel and Longitudinal Studies Analyzed with Linear Mixed Models
Presenter:
Kylie K. Harrall, MS, Research Instructor, Lifecourse Epidemiology of Adiposity and Diabetes (LEAD) Center, University of Colorado Anschutz Medical Campus
Abstract
Planning a reproducible study requires selecting a sample size which will ensure appropriate statistical power. Free point-and-click software (Kreidler et al., Journal of Statistical Software, 2013, 10.18637/jss.v054.i10) makes it easy to select a sample size for clustered and longitudinal designs with linear mixed models. The software, a suite of training modules, and reference materials are freely available online (www.SampleSizeShop.org). The software interface and training materials are aimed at biomedical scientists, including those funded by NIDDK. We give examples of study designs for which the software will compute power and sample size, including a study with clustering, a study with longitudinal repeated measures, and a study with multiple outcomes, where heterogeneity of response among subgroups is of interest.
The top 3 key questions that the Sample Size Shop can answer:
1. What free, online, point-and-click, wizard-style, NIH-funded, validated, published power and sample size software provides calculations for studies with clusters, longitudinal studies, and longitudinal studies with clusters?
2. Can GLIMMPSE (www.SampleSizeShop.org) compute power and sample size for randomized controlled clinical trials and observational studies funded by NIDDK?
3. Why use validated power and sample size software instead of writing simulations?
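To show why clustering changes the required sample size, the sketch below applies the textbook design-effect approximation, DEFF = 1 + (m - 1) * ICC, for equal cluster sizes. This is a back-of-the-envelope illustration only and is not the method GLIMMPSE uses for linear mixed models; the cluster size, intraclass correlation and baseline sample size are hypothetical.

```python
import math


def design_effect(cluster_size, icc):
    """Variance inflation from clustering: 1 + (m - 1) * ICC."""
    if cluster_size < 1 or not 0.0 <= icc <= 1.0:
        raise ValueError("cluster_size must be >= 1 and icc within [0, 1]")
    return 1 + (cluster_size - 1) * icc


def clustered_sample_size(n_independent, cluster_size, icc):
    """Inflate a sample size computed for independent observations."""
    return math.ceil(n_independent * design_effect(cluster_size, icc))


if __name__ == "__main__":
    # Hypothetical study: 128 participants would suffice if independent,
    # but they are recruited in clusters of 20 with ICC = 0.05.
    print(design_effect(20, 0.05))
    print(clustered_sample_size(128, 20, 0.05))
```

Even a small intraclass correlation nearly doubles the required sample here, which is why designs with clusters or repeated measures call for dedicated, validated software.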
Upcoming webinars schedule: https://dknet.org/about/webinar
dkNET Webinar: FAIR Data Curation of Antibody/B-cell and T-cell Receptor Se...dkNET
Abstract
AIRR-seq data (antibody/B-cell and T-cell receptor sequences from Adaptive Immune Receptor Repertoires) can describe the adaptive immune response in exquisite detail, and comparison and analysis of these data across studies and institutions can greatly contribute to the development of diagnostics and therapeutics, including the discovery of monoclonal antibodies for treatment of autoimmune diseases.
The AIRR community has developed protocols and standards for curating, analyzing and sharing AIRR-seq data (www.airr-community.org), and supports the AIRR Data Commons, a set of geographically distributed repositories that follows the AIRR Community’s metadata standards and the FAIR principles. The ADC currently comprises > 5 Bn receptor sequences from over 86 studies and ~9000 repertoires. The data model of the ADC has recently been expanded to include gene expression and cell phenotype data from single immune receptor cells, as well as MHC/HLA genotyping.
The iReceptor Gateway (ireceptor.org) queries this AIRR Data Commons for specific “metadata”, e.g. “find all repertoires from T1D studies” or for specific CDR3 sequences (e.g., find all repertoires from healthy individuals expressing this CDR3 sequence). Data from these federated repositories can then be analyzed through the Gateway by several sophisticated analysis tools, or downloaded for further analysis offline. The iReceptor Team at Simon Fraser University has recently initiated a collaboration to greatly expand the amount of bulk and single-cell immune profiling data from T1D studies in the AIRR Data Commons. For more information on obtaining or sharing AIRR-seq data contact support@ireceptor.org.
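A metadata query such as “find all repertoires from T1D studies” can be pictured as a small JSON filter. The sketch below only builds and prints such a filter; it makes no network call. The filter layout and the field name `subject.diagnosis.disease_diagnosis.label` reflect my understanding of the AIRR Data Commons API and AIRR metadata schema and should be treated as assumptions to check against the AIRR Community documentation. The helper name is hypothetical, and this is not a description of how the iReceptor Gateway works internally.

```python
import json


def repertoire_filter(field, value, op="contains"):
    """Build a single-condition filter object (layout assumed, see lead-in)."""
    return {"filters": {"op": op, "content": {"field": field, "value": value}}}


# Assumed field name for the disease label attached to a subject's diagnosis.
QUERY = repertoire_filter(
    "subject.diagnosis.disease_diagnosis.label", "type 1 diabetes"
)

if __name__ == "__main__":
    # The JSON text is what a client would send as the body of a repertoire query.
    print(json.dumps(QUERY, indent=2))
```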
The top 3 key questions that the Adaptive Immune Receptor Repertoire (AIRR) can answer:
1. A researcher observes that many individuals with Type 1 Diabetes express a specific B-cell or T-cell receptor compared to controls (i.e., a “public” clonotype). To what degree is this receptor observed to be public across other T1D studies or other autoimmune disease populations?
2. Can Machine Learning be used to identify individuals who will respond well to a new cancer immunotherapy based on differences in their antibody/B-cell or T-cell receptor repertoires as curated in the AIRR Data Commons?
3. Is there an association between particular HLA, immunoglobulin (IG), or T-cell receptor (TR) germline gene polymorphisms and propensity toward specific infectious or autoimmune diseases?
Presenters:
Dr. Felix Breden, Scientific Director, iReceptor
Dr. Brian Corrie, Technical Director, iReceptor
Dr. Kira Neller, Bioinformatics Director, iReceptor
Upcoming webinars schedule: https://dknet.org/about/webinar
dkNET Office Hours - "Are You Ready for 2023? New NIH Data Management and Sha...dkNET
For all proposals submitted on/after January 25, 2023, NIH will require the sharing of data from all NIH-funded studies. Do you have appropriate data management practices and sharing plans in place to meet these requirements? Have questions or need some help? Join the dkNET office hours to learn about NIH’s policy (NOT-OD-21-013) and resources (https://dknet.org/rin/research-data-management) that could help.
Upcoming Webinars Schedule: https://dknet.org/about/webinar
dkNET Webinar "The Mission and Progress of the(sugar)science: Helping Scienti...dkNET
Abstract
The(sugar)science was launched two years ago with the aim of helping scientists who study type 1 diabetes (T1D) and related interdisciplinary fields connect globally. We also wanted to create a digital space where trainees in the field can be supported, celebrated and connected to future positions. As part of our mission, our all-volunteer team created the State of the Science series (2021, 2022), connecting global thought leaders around T1D research topics for discussion with a larger scientific audience. The second State of the Science series was led by women scientists, following the ADA publication that highlighted the paucity of women scientists in leadership positions in the field.
To encourage the scientific community at large to dive into pre-existing data and pull out novel hypotheses that pertain to T1D, we created and, together with dkNET, hosted D-Challenge 2021 and 2022. These competitions awarded $40K and $50K respectively to those who mined data and developed the most creative and testable hypotheses as judged by scientific experts in the field. These teams were also able to have an audience with the JDRF T1D Fund as part of a "pitch polish", which facilitated their interaction with venture capital.
To date, we have hosted over 200 interviews with T1D-focused scientists in academia and industry and have an audience of 35K. Our reach on social media continues to grow and our metrics indicate a robust following. We share opportunities for positions in the field and engage and support trainees. Together, our young scientific team published a paper, Similarities between bacterial GAD and human GAD65: Implications in gut mediated autoimmune type 1 diabetes, PLOS, February 2022.
We are currently building a T1D TCR Repository. We connected the AIRR Data Commons community with top TCR scientists in the field to begin this community-based venture. It has the potential to be highly instructive in defining the prodrome, which will further inform the field's understanding of the etiology of T1D.
Current team members joining the discussion today are Neha Mejety, a Johns Hopkins University undergraduate, and Tiffany Richardson, a doctoral degree candidate at VUMC Diabetes.
The top 3 key questions that the(sugar)science can answer:
1. How can I find scientists to collaborate with in Type 1 diabetes research?
2. Where can I learn about Type 1 diabetes trending topics?
3. Where can I find forums to discuss novel ideas with scientists or key opinion leaders and find opportunities for Type 1 diabetes research?
Presenters:
Monica Westley, PhD, Founder, the(sugar)science
Tiffany Richardson
Neha Majety
Upcoming webinars schedule: https://dknet.org/about/webinar
Seminar of U.V. Spectroscopy by SAMIR PANDASAMIR PANDA
Spectroscopy is a branch of science dealing with the study of the interaction of electromagnetic radiation with matter.
Ultraviolet-visible spectroscopy refers to absorption spectroscopy or reflectance spectroscopy in the UV-VIS spectral region.
Ultraviolet-visible spectroscopy is an analytical method that can measure the amount of light received by the analyte.
Richard's adventures in two entangled wonderlandsRichard Gill
Since the loophole-free Bell experiments of 2015 and the Nobel prizes in physics of 2022, critics of Bell's work have retreated to the fortress of super-determinism. Now, super-determinism is a derogatory word - it just means "determinism". Palmer, Hance and Hossenfelder argue that quantum mechanics and determinism are not incompatible, using a sophisticated mathematical construction based on a subtle thinning of allowed states and measurements in quantum mechanics, such that what is left appears to make Bell's argument fail, without altering the empirical predictions of quantum mechanics. I think however that it is a smoke screen, and the slogan "lost in math" comes to my mind. I will discuss some other recent disproofs of Bell's theorem using the language of causality based on causal graphs. Causal thinking is also central to law and justice. I will mention surprising connections to my work on serial killer nurse cases, in particular the Dutch case of Lucia de Berk and the current UK case of Lucy Letby.
Nutraceutical market, scope and growth: Herbal drug technologyLokesh Patil
As consumer awareness of health and wellness rises, the nutraceutical market—which includes goods like functional meals, drinks, and dietary supplements that provide health advantages beyond basic nutrition—is growing significantly. As healthcare expenses rise, the population ages, and people want natural and preventative health solutions more and more, this industry is increasing quickly. Further driving market expansion are product formulation innovations and the use of cutting-edge technology for customized nutrition. With its worldwide reach, the nutraceutical industry is expected to keep growing and provide significant chances for research and investment in a number of categories, including vitamins, minerals, probiotics, and herbal supplements.
Deep Behavioral Phenotyping in Systems Neuroscience for Functional Atlasing a...Ana Luísa Pinho
Functional Magnetic Resonance Imaging (fMRI) provides means to characterize brain activations in response to behavior. However, cognitive neuroscience has been limited to group-level effects referring to the performance of specific tasks. To obtain the functional profile of elementary cognitive mechanisms, the combination of brain responses to many tasks is required. Yet, to date, both structural atlases and parcellation-based activations do not fully account for cognitive function and still present several limitations. Further, they do not adapt overall to individual characteristics. In this talk, I will give an account of deep-behavioral phenotyping strategies, namely data-driven methods in large task-fMRI datasets, to optimize functional brain-data collection and improve inference of effects-of-interest related to mental processes. Key to this approach is the employment of fast multi-functional paradigms rich in features that can be well parametrized and, consequently, facilitate the creation of psycho-physiological constructs to be modelled with imaging data. Particular emphasis will be given to music stimuli when studying high-order cognitive mechanisms, due to their ecological nature and quality to enable complex behavior compounded by discrete entities. I will also discuss how deep-behavioral phenotyping and individualized models applied to neuroimaging data can better account for the subject-specific organization of domain-general cognitive systems in the human brain. Finally, the accumulation of functional brain signatures brings the possibility to clarify relationships among tasks and create a univocal link between brain systems and mental functions through: (1) the development of ontologies proposing an organization of cognitive processes; and (2) brain-network taxonomies describing functional specialization.
To this end, tools to improve commensurability in cognitive science are necessary, such as public repositories, ontology-based platforms and automated meta-analysis tools. I will thus discuss some brain-atlasing resources currently under development, and their applicability in cognitive as well as clinical neuroscience.
ISI 2024: Application Form (Extended), Exam Date (Out), EligibilitySciAstra
The Indian Statistical Institute (ISI) has extended its application deadline for 2024 admissions to April 2. Known for its excellence in statistics and related fields, ISI offers a range of programs from Bachelor's to Junior Research Fellowships. The admission test is scheduled for May 12, 2024. Eligibility varies by program, generally requiring a background in Mathematics and English for undergraduate courses and specific degrees for postgraduate and research positions. Application fees are ₹1500 for male general category applicants and ₹1000 for females. Applications are open to Indian and OCI candidates.
Phenomics assisted breeding in crop improvementIshaGoswami9
The global population is increasing and is expected to reach about 9 billion by 2050. Combined with climate change, this makes it difficult to meet the food requirements of such a large population. Facing the challenges presented by resource shortages, climate change, and increasing global population, crop yield and quality need to be improved in a sustainable way over the coming decades. Genetic improvement by breeding is the best way to increase crop productivity. With the rapid progression of functional genomics, an increasing number of crop genomes have been sequenced and dozens of genes influencing key agronomic traits have been identified. However, current genome sequence information has not been adequately exploited for understanding the complex characteristics of multiple genes, owing to a lack of crop phenotypic data. Efficient, automatic, and accurate technologies and platforms that can capture phenotypic data that can be linked to genomics information for crop improvement at all growth stages have become as important as genotyping. Thus, high-throughput phenotyping has become the major bottleneck restricting crop breeding. Plant phenomics has been defined as the high-throughput, accurate acquisition and analysis of multi-dimensional phenotypes during crop growing stages at the organism level, including the cell, tissue, organ, individual plant, plot, and field levels. With the rapid development of novel sensors, imaging technology, and analysis methods, numerous infrastructure platforms have been developed for phenotyping.
Thematic appreciation test is a psychological assessment tool used to measure an individual's appreciation and understanding of specific themes or topics. This test helps to evaluate an individual's ability to connect different ideas and concepts within a given theme, as well as their overall comprehension and interpretation skills. The results of the test can provide valuable insights into an individual's cognitive abilities, creativity, and critical thinking skills.
Comparing Evolved Extractive Text Summary Scores of Bidirectional Encoder Rep...University of Maribor
Slides from:
11th International Conference on Electrical, Electronics and Computer Engineering (IcETRAN), Niš, 3-6 June 2024
Track: Artificial Intelligence
https://www.etran.rs/2024/en/home-english/
Travis Hills' Endeavors in Minnesota: Fostering Environmental and Economic Pr...Travis Hills MN
Travis Hills of Minnesota developed a method to convert waste into high-value dry fertilizer, significantly enriching soil quality. By providing farmers with a valuable resource derived from waste, Travis Hills helps enhance farm profitability while promoting environmental stewardship. Travis Hills' sustainable practices lead to cost savings and increased revenue for farmers by improving resource efficiency and reducing waste.
dkNET Webinar: Illuminating The Druggable Genome With Pharos 10/23/2020
1. Tudor I. Oprea
University of New Mexico, Albuquerque NM
10/23/2020
dkNET: Connecting Researchers to Resources
Via Zoom
Funding: NIH U24 CA224370 & NIH U24 TR002278
http://druggablegenome.net/
http://datascience.unm.edu/
2. 75% of protein research still focused on 10% of genes known before human genome was mapped
AM Edwards et al., Nature, 2011
This prompted NIH to start the Illuminating the Druggable Genome Initiative
3. Informatics, Data Science and Machine Learning (“AI”) can be used as follows:
Diseases: EMR processing, nosology, ontology, & EMR-based ML
Targets: drug target selection & validation, phenotype associations, ML
Drugs: Identifying novel therapeutic modalities using in silico methods
IDG is developing methods applicable to each of these 3 areas
8/24/20 revision
Diseases image credit: Julie McMurry, Melissa Haendel (OHSU).
All other images credit: Nature Reviews Drug Discovery cover page
4. 2/4/20 revision
R. Santos et al., Nature Rev. Drug Discov. 2017, 16:19-34 link
We curated 667 human genome-derived
proteins and 226 pathogen-derived
biomolecules through which 1,578 US FDA-
approved drugs act.
This set included 1004 orally formulated
drugs as well as 530 injectable drugs
(approved through June 2016).
Data captured in DrugCentral (link)
5. 2/4/20 revision
RFA-RM-16-026
(DRGC)
GPCRs
U24 DK116195:
Bryan Roth, M.D., Ph.D. (UNC)
Brian Shoichet, Ph.D. (UCSF)
Ion
Channels
U24 DK116214:
Lily Jan, Ph.D. (UCSF)
Michael T. McManus, Ph.D. (UCSF)
Kinases
U24 DK116204:
Gary L. Johnson, Ph.D. (UNC)
RFA-RM-16-025
(RDOC)
Outreach
U24 TR002278:
Stephan C. Schürer, Ph.D. (UMiami)
Tudor Oprea, M.D., Ph.D. (UNM)
Larry A. Sklar, Ph.D. (UNM)
RFA-RM-16-024
(KMC) Data
U24 CA224260:
Avi Ma’ayan, Ph.D. (ISMMS)
U24 CA224370:
Tudor Oprea, M.D., Ph.D. (UNM)
RFA-RM-18-011
(CEIT)
Tools
U01 CA239106: N Kannan, PhD & KJ Kochut (UGA)
U01 CA239108: PN Robinson, MD PhD (JAX), CJ Mungall
(LBL), T Oprea (UNM)
U01 CA239069: G Wu, PhD (OHSU), PG D’Eustachio PhD
(NYU), Lincoln D Stein, PhD (OICR)
T. Oprea et al., Nature Rev.Drug Discov. 2018, 17:317-332 link
6. Most protein classification schemes are
based on structural and functional criteria.
For therapeutic development, it is useful to
understand how much and what types of
data are available for a given protein,
thereby highlighting well-studied and
understudied targets.
Tclin: Proteins annotated as drug targets
Tchem: Proteins for which potent small
molecules are known
Tbio: Proteins for which biology is better
understood
Tdark: These proteins lack antibodies,
publications or Gene RIFs
T. Oprea et al., Nature Rev.Drug Discov. 2018, 17:317-332 link 2/10/20 revision
2020 Update: Tdark 31.2%; Tbio 57.7%; Tchem 8%; Tclin 3.1%
9. Tclin proteins are associated
with drug Mechanism of Action
(MoA) – NRDD 2017
Tchem proteins have
bioactivities in ChEMBL and
DrugCentral, + human curation
for some targets
Kinases: <= 30nM
GPCRs: <= 100nM
Nuclear Receptors: <= 100nM
Ion Channels: <= 10μM
Non-IDG Family Targets: <= 1μM
10/19/16 revision
Bioactivities of approved drugs (by Target class)
ChEMBL: database of bioactive chemicals
https://www.ebi.ac.uk/chembl/
DrugCentral: online drug compendium
http://drugcentral.org/
R. Santos et al., Nature Rev.Drug Discov. 2017, 16:19-34 link
10. Tbio proteins lack small molecule annotation cf. Tchem criteria,
and satisfy one of these criteria:
protein is above the cutoff criteria for Tdark
protein is annotated with a GO Molecular Function or Biological Process
leaf term(s) with an Experimental Evidence code
protein has confirmed OMIM phenotype(s)
Tdark (“ignorome”) proteins have little information available, and satisfy
these criteria:
PubMed text-mining score from Jensen Lab < 5
<= 3 Gene RIFs
<= 50 Antibodies available according to antibodypedia.com
8/20/15 revision
T. Oprea et al., Nature Rev. Drug Discov. 2018, 17:317-332 link
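The Tclin / Tchem / Tbio / Tdark rules on the two slides above can be read as a simple decision cascade. The following is a minimal sketch of that reading, not the TCRD implementation: the `ProteinEvidence` record and its field names are hypothetical, and it assumes that all three Tdark thresholds must hold at once, which is how the slide is worded (the published definition may combine them differently).

```python
from dataclasses import dataclass
from typing import Optional

# Potency cutoffs in nM, copied from the Tchem slide.
TCHEM_CUTOFF_NM = {
    "Kinase": 30.0,
    "GPCR": 100.0,
    "Nuclear Receptor": 100.0,
    "Ion Channel": 10_000.0,  # 10 uM
}
DEFAULT_CUTOFF_NM = 1_000.0   # 1 uM for non-IDG family targets


@dataclass
class ProteinEvidence:
    """Hypothetical record; field names are illustrative, not TCRD columns."""
    family: str
    has_moa_drug: bool = False                # drug mechanism-of-action annotation
    best_activity_nm: Optional[float] = None  # most potent small-molecule activity
    pubmed_score: float = 0.0                 # Jensen Lab text-mining score
    gene_rifs: int = 0
    antibodies: int = 0
    has_experimental_go_leaf: bool = False
    has_omim_phenotype: bool = False


def meets_dark_cutoffs(p: ProteinEvidence) -> bool:
    # Assumption: all three thresholds from the slide must hold together.
    return p.pubmed_score < 5 and p.gene_rifs <= 3 and p.antibodies <= 50


def target_development_level(p: ProteinEvidence) -> str:
    if p.has_moa_drug:
        return "Tclin"
    cutoff = TCHEM_CUTOFF_NM.get(p.family, DEFAULT_CUTOFF_NM)
    if p.best_activity_nm is not None and p.best_activity_nm <= cutoff:
        return "Tchem"
    if (not meets_dark_cutoffs(p)
            or p.has_experimental_go_leaf
            or p.has_omim_phenotype):
        return "Tbio"
    return "Tdark"


# A kinase with a 25 nM ligand passes the 30 nM kinase cutoff.
print(target_development_level(ProteinEvidence("Kinase", best_activity_nm=25.0)))
```

The order of the checks matters: a protein is tested for the most developed level first, so a drug target with sparse literature is still reported as Tclin.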
11. Tdark parameters differ from the other TDLs across the 4 external
metrics cf. Kruskal-Wallis post-hoc pairwise Dunn tests
2/23/18 revision
T. Oprea et al., Nature Rev. Drug Discov. 2018, 17:317-332 link
12. https://rpubs.com/
cbologa/TDL7
Tdark:
9199 proteins in 2013
7658 proteins in 2016
6368 proteins in 2020
Tclin:
601 proteins in 2013
592 proteins in 2016
659 proteins in 2020
10/12/20 revision
T. Sheils, S.L. Mathias et al., Nucleic Acids Research 2021 doi:10.1093/nar/gkaa993
13. T. Sheils, S.L. Mathias et al., Nucleic Acids Research 2021 doi:10.1093/nar/gkaa993 10/12/20 revision
14. 2/4/20 revision
Haendel M, et al. Nature Rev. Drug Discov. 2020 19:77-78 link
We revised the number of RDs from ~7,000 to
10,393 using Disease Ontology, OrphaNet,
GARD, NCIT, OMIM and the Monarch
Initiative MONDO system
We also pointed out the lack of a uniform
definition for rare diseases, and called for
coordinated efforts to precisely define them
We surveyed therapeutic modalities
available to translate advances in the
scientific understanding of rare diseases into
therapies, and discussed overarching issues
in drug development for rare diseases.
15. Tambuyzer E, et al. Nature Rev.Drug Discov. 2020 19:93-111 link 2/4/20 revision
16. 6077 human proteins are associated
with at least one Rare Disease.
Sources: Disease Ontology (RD-slim),
eRAM and OrphaNet
~50% agreement (gene level)
Contrast: Tclin at 3% & Tchem at 7%
overall vs. RD subset: 6.94% Tclin and
14.1% for Tchem.
20% of the RD proteome is Tclin &
Tchem. This means hope for cures.
Potentially significant opportunities for
target & drug repurposing.
2/4/20 revision
Tambuyzer E, et al. Nature Rev. Drug Discov. 2020 19:93-111 link
17. 3/12/18 revision
~35% of the proteins remain
poorly described (Tdark)
~11% of the Proteome (Tclin & Tchem) are currently targeted by
small molecule probes
With help from rare disease patient advocacy groups, rare disease
research is likely to witness a significant increase in translation
18. IN GOD WE TRUST.
All others bring Data.
Quote attributed to W. Edwards Deming, controversial:
Other attributions: George A. Box and Robert W. Hayden.
Bernard Fisher, MD has said this to a journalist
19. https://pharos.nih.gov/targets/KCNJ11
The IDG KMC tracks 11 information
channels for protein-disease
associations, accessible via the
Pharos portal.
Our challenge is to harmonize
disease concepts, and to enable
computational use: e.g., KCNJ11 (Kir6.2) with
ABCC8 (sulfonylurea receptor 1) form the
MoA drug target for
glibenclamide (type 2 diabetes).
10/23/20 revision
The challenge for ML & AI: How to prioritize targets? i.e., which protein-
disease associations are clinically actionable?
(involved is not the same as committed)
22. 9/09/20 revision
G. KC, G Bocci et al., Nature Machine Intell 2020, submitted link
We used data from the NCATS COVID19
portal to develop a suite of ML models for
six assays related to SARS-CoV-2 activities:
• viral entry (Spike/ACE2 via AlphaLISA;
counterscreens TruHit & ACE2 inhibition)
• viral replication (3CL or Mpro)
• live virus infectivity (CPE & cytotoxicity)
REDIAL-2020 prediction workflow
Input: SMILES
Drug Name
PubChem CID
ML: Fingerprints
Pharmacophores
Phys-chem
based on:
RDKit
scikit-learn
External set predictions
a) CPE, 24 actives;
b) CPE, 14 actives;
c) 3CL, 6 actives.
http://drugcentral.org/Redial
23. 9/09/20 revision
G. KC, G Bocci et al., Nature Machine Intell 2020, submitted link
http://drugcentral.org/Redial
24. IDG KMC2 seeks knowledge gaps
across the five branches of the
“knowledge tree”:
Genotype; Phenotype; Interactions
& Pathways; Structure & Function;
and Expression, respectively.
We can use biological systems
network modeling to infer novel
relationships based on available
evidence, and infer new “function”
and “role in disease” data based
on other layers of evidence
Primary focus on Tdark & Tbio
O. Ursu,T Oprea et al., IDG2 KMC 2/01/18 revision
25. O. Ursu et al., manuscript in preparation
Data source | Data type | Data points
CCLE | Gene expression | 19,006,134
GTEx | Gene expression | 2,612,227
Protein Atlas | Gene & Protein expression | 949,199
Reactome | Biological pathways | 303,681
KEGG | Biological pathways | 27,683
StringDB | Protein-Protein interactions | 5,080,023
Gene ontology | Biological pathways & Gene function | 434,317
InterPro | Protein structure and function | 467,163
ClinVar | Human Gene - Disease/Phenotype associations | 881,357
GWAS | Gene - Disease/Phenotype associations | 54,360
OMIM | Human Gene - Disease/Phenotype associations | 25,557
UniProt Disease | Human Gene - Disease/Phenotype associations | 5,365
JensenLab DISEASE | Gene - Disease associations from text mining | 44,829
NCBI Homology | Homology mapping of human/mouse/rat genes | 70,922
IMPC | Mouse Gene - Phenotype associations | 2,153,999
RGD | Rat Gene - Phenotype associations | 117,606
LINCS | Drug induced gene signatures | 230,111,315
We developed automated
methods for data collection
(TCRD), visualization (Pharos)
and data aggregation.
These aggregated datasets
were used to build machine
learning models for 20+
diseases and 73 mouse
phenotypes.
Each knowledge graph
contains ~22,000 metapaths
and 284 million path instances.
10/07/18 revision
26. a meta-path is a path consisting of
a sequence of relations defined
between different object types
(i.e., structural paths at the meta
level)
Our metapaths encode type-
specific network topology
between the source node (e.g.,
Protein) and the destination node
(e.g., Disease).
This approach enables the
transformation of assertions/evidence
chains of heterogeneous
biological data types into an
ML-ready format.
G. Fu et al., BMC Bioinformatics 2016, 17:160 is an early example for drug-target interactions
10/01/18 revision
Similar assertions or evidence form metapaths (white).
Instances of metapath (paths) are used to determine the strength of the
evidence linking a gene to disease/phenotype/function.
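To make the metapath idea concrete, here is a minimal sketch that counts path instances along a metapath in a toy typed graph. Everything in it is made up for illustration: the node names, the edges, and the simplification that a metapath is given as a sequence of node types only (the slide defines it through relations between object types). It is not the IDG KMC pipeline.

```python
from collections import defaultdict

# Toy typed graph; all nodes and edges are hypothetical.
NODE_TYPE = {
    "P1": "Protein", "P2": "Protein", "P3": "Protein",
    "PW1": "Pathway", "D1": "Disease",
}
EDGES = [
    ("P1", "P2"), ("P1", "P3"),    # protein-protein interactions
    ("P2", "D1"), ("P3", "D1"),    # known protein-disease associations
    ("P1", "PW1"), ("P2", "PW1"),  # pathway membership
]

ADJ = defaultdict(set)
for a, b in EDGES:                 # edges are treated as undirected
    ADJ[a].add(b)
    ADJ[b].add(a)


def count_path_instances(source, target, metapath):
    """Count simple paths from source to target whose node types follow metapath."""
    if NODE_TYPE[source] != metapath[0]:
        return 0

    def walk(node, depth, visited):
        if depth == len(metapath) - 1:
            return 1 if node == target else 0
        total = 0
        for nxt in ADJ[node]:
            if nxt not in visited and NODE_TYPE[nxt] == metapath[depth + 1]:
                total += walk(nxt, depth + 1, visited | {nxt})
        return total

    return walk(source, 0, {source})


ppi_count = count_path_instances("P1", "D1", ["Protein", "Protein", "Disease"])
pathway_count = count_path_instances(
    "P1", "D1", ["Protein", "Pathway", "Protein", "Disease"]
)
print(ppi_count, pathway_count)  # prints: 2 1
```

Each metapath yields one number per protein-disease pair, so a set of metapaths turns a pair into a feature vector that a classifier can consume.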
27. one protein-disease
association at a time
O. Ursu, T Oprea et al., IDG2 KMC 2/01/18 revision
Genes associated with a disease/phenotype are positive examples, whereas genes lacking the same
association are negative examples. The Metapath approach transforms assertions/evidence chains into
classification problems that can be solved using suitably designed machine learning algorithms.
28. All datasets are merged, via R
scripts, into a PostgreSQL database.
Python under development.
Graph embedding transforms
evidence paths into vectors,
converting data into matrices.
Input genes are positive
labels. OMIM genes (not in the input) are
negative labels (we prefer true
negatives where possible).
XGBoost runs 100 models. The
“median model” (AUC, F1) is
then selected for analysis and
prediction to avoid overfitting.
10/15/19 revision
J.J. Yang, P. Kumar, D. Byrd et al., IDG2 KMC
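The "median model" step can be sketched in a few lines. This is only an illustration of the selection idea under stated assumptions: the 100 AUC values below are randomly generated stand-ins for real XGBoost runs, the helper name `pick_median_model` is hypothetical, and only one score is used although the slide mentions both AUC and F1.

```python
import random
import statistics


def pick_median_model(scores):
    """Return the run id whose score is the (low) median across all runs."""
    median_score = statistics.median_low(scores.values())
    for run_id, score in sorted(scores.items()):
        if score == median_score:
            return run_id


# Hypothetical AUCs for 100 runs keyed by random seed (no real training here).
rng = random.Random(0)
aucs = {seed: round(rng.uniform(0.70, 0.85), 4) for seed in range(100)}

chosen = pick_median_model(aucs)
print(chosen, aucs[chosen])
```

Picking the run in the middle of the score distribution, rather than the best one, avoids reporting a model that looks good only because of a lucky split. `median_low` is used because it always returns a value that actually occurs in the data, so a concrete run can be retrieved.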
29. A soccer match at RoboCup, Nagoya 2017
Image searching for “Bad AI”
30. Build data matrix from “Alzheimer’s disease” in
TCRD subset
protein knowledge graph along metapaths:
Protein – Protein Interactions
Pathways
GO terms
Gene expression
...
Training set: 53 genes associated with
Alzheimer’s disease (positives); 3,952 genes
associated with other pathologies from OMIM
were assumed to be negative
Test set: 23 genes associated with Alzheimer's
(positives) and 200 genes not associated with
Alzheimer's (negatives) from Text Mining
“Complete forest” binary classifier using
XGBoost & 5-fold cross-validation.
Weighted model is better than balanced model
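2/14/18 revision
ML work by Oleg Ursu
Balanced model (rows: actual; columns: predicted)
Actual | Pred. Pos | Pred. Neg
Pos | 16 | 7
Neg | 94 | 106
Weighted model (rows: actual; columns: predicted)
Actual | Pred. Pos | Pred. Neg
Pos | 20 | 3
Neg | 41 | 159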
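The claim that the weighted model is better can be checked directly from the two confusion matrices above. The short sketch below recomputes standard metrics from the slide's counts; the function name `metrics` is a hypothetical helper, and the counts are the only inputs taken from the source.

```python
def metrics(tp, fn, fp, tn):
    """Standard binary-classification metrics from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)   # recall on actual positives
    specificity = tn / (tn + fp)   # recall on actual negatives
    precision = tp / (tp + fp)
    f1 = 2 * precision * sensitivity / (precision + sensitivity)
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "precision": precision,
        "f1": f1,
    }


# Counts from the slide: 23 actual positives, 200 actual negatives.
balanced = metrics(tp=16, fn=7, fp=94, tn=106)
weighted = metrics(tp=20, fn=3, fp=41, tn=159)

for name, m in (("balanced", balanced), ("weighted", weighted)):
    print(name, {k: round(v, 3) for k, v in m.items()})
```

On these counts the weighted model has higher sensitivity (about 0.87 vs. 0.70) and higher specificity (about 0.80 vs. 0.53), which matches the slide's conclusion. Precision stays low in both cases because negatives outnumber positives roughly nine to one in the test set.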
31. The top most important features are interactions with
proteins mediating inflammatory processes (JAK2/Tclin,
IL10 & IL2 / Tchem), response to oxidative stress
(GSTP1/Tchem), nervous system development (BDNF/Tbio)
and glycolysis (GAPDH/Tchem).
LINCS drug-induced gene expression perturbations are
the largest category of features for these predictions.
Brain cortex expression is a necessary requirement.
One Reactome pathway (AU-rich mRNA elements binding
proteins) is also important.
The weighted approach showed better performance in the
test set for Alzheimer's Disease, Schizophrenia, and Dilated
Cardiomyopathy.
4/23/18 revision
ML work by Oleg Ursu
32. We tested the top 20 genes identified
by PKG/m-p/ML with a high-
throughput validation system by
measuring AD-relevant
hyperphosphorylated (at
S199/S202/T205) tau protein (AT8-Tau
and AT180-Tau) using a Cellomics®
high-content microscope; as well as
gene expression and
immunochemistry analysis via human
AD induced pluripotent stem cells
and human AD brain tissue
8/24/20 revision
AD validation work by Jessica Binder & Kiran Bhaskar, funded by U24CA224370-S2 supplement
33. 2/14/19 revision
AD validation work by Jessica Binder & Kiran Bhaskar, funded by U24CA224370-S2 supplement
SHSY5Y’s in vitro
siRNA knock-downs
measuring ∆pTau
(AT8) levels –
unbiased cellomics
qPCR gene
expression
Human induced
pluripotent stem
cells derived into
neurons –AD vs Ctrl
Figure (qPCR gene expression): fold change (2^-∆∆Ct) relative to Ctrl for AKNA, BCO2, CCNY, CRTAM, FAM92B, FOXP4, FRRS1, GRIN2C, IL17REL, LILRA3, LMO4, NDRG2, PIBF1, RAB40A, SCGB3A1, SLC44A2, SPOP, STARD3, TMEFF2 and TXNDC12; legend entries AX0018 and sAD2.1; asterisks mark significance levels.
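The fold-change axis of the qPCR figure uses the 2^-∆∆Ct convention. As a reminder of how that quantity is computed, here is a minimal sketch of the standard calculation. The Ct values and the use of a single reference gene are hypothetical; the slides do not state which reference gene or Ct values were used.

```python
def fold_change(ct_target_sample, ct_ref_sample, ct_target_ctrl, ct_ref_ctrl):
    """Relative expression by the 2^-ddCt method.

    dCt  = Ct(target gene) - Ct(reference gene), computed per condition
    ddCt = dCt(sample) - dCt(control)
    """
    d_ct_sample = ct_target_sample - ct_ref_sample
    d_ct_ctrl = ct_target_ctrl - ct_ref_ctrl
    dd_ct = d_ct_sample - d_ct_ctrl
    return 2 ** (-dd_ct)


# Hypothetical Ct values: the target crosses threshold 2 cycles earlier in
# the AD sample than in the control, after normalizing to the reference gene.
print(fold_change(24.0, 18.0, 26.0, 18.0))  # prints: 4.0
```

A value of 1 means no change relative to control; each cycle of difference corresponds to a two-fold change in expression.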
35. 5/22/19 revision
AD validation work by Jessica Binder & Kiran Bhaskar, funded by U24CA224370-S2 supplement
Top 20 Genes
predicted by the
XGBoost/Metapath
model, clustered by
functional roles
36. 8/24/20 revision
AD validation work by Jessica Binder & Kiran Bhaskar, funded by U24CA224370-S2 supplement
We proposed to validate ML models for the top 20 genes:
AKNA, BCO2, CCNY, CRTAM, FAM92B, FOXP4, FRRS1, GRIN2C, IL17REL,
LILRA3, LMO4, NDRG2, PIBF1, RAB40A, SCGB3A1, SLC44A2, SPOP,
STARD3, TMEFF2, TXNDC12
The most obvious effects, based on the combined Cellomics & qPCR
of iPSNs & autopsy brains, suggest that AKNA, LILRA3, PIBF1 and
TXNDC12 significantly increased pTau (as tracked by two different
antibodies for T180, S202 and S205)
PIBF1, LILRA3 and CRTAM show the most significant effect on tau
phosphorylation; two (CRTAM and LILRA3) novel genes are
implicated in innate immune pathways
37. ML work by Tudor Oprea
Genes 51
Source https://omim.org/entry/125853
AUC 0.72±0.02
1/16/19 revision
First model: 51 OMIM genes
associated with T2D vs. 3,954
OMIM genes associated with
other pathologies. AUC = 0.72 ±
0.08.
VIP-ranked variables include
HFE & HMOX1, which relate to
hemochromatosis (80% leads to
T2D), and IL1B & IL10 (suggests
an immune component).
38. From: Mark McCarthy <mark.mccarthy@drl.ox.ac.uk>
Sent: Friday, December 7, 2018 11:10 AM
The general summary is that we don’t see any enrichment for T2D associations
in either exome or GWAS data from the predicted gene sets (however we slice
them up).
But having that we don’t really see anything in the TRAINING set either: No
association in the exomes, and a weak (just nominal) association in the GWAS
data.
To be honest, I think, now we’ve taken a look at it, we’d all question the
training set: I had missed that this came from OMIM, which is simply not a
reliable source of information in this regard.
1/3/19 revision
39. ML work by Tudor Oprea
Genes 54
Source Causal T2DM transcripts
AUC 0.79±0.01
1/16/19 revision
• Second model: 54 causal transcripts
provided by Anubha Mahajan & Mark
McCarthy vs. 3,954 OMIM genes.
AUC = 0.79 ± 0.01.
Genes confirmed by GWAS (9 in
top 24): C2CD4B, C2CD4A,
JAZF1, ADAMTS9, CRY2,
LINGO2, THADA, TMEM18 &
SEC16B. 4 genes have GO terms
for insulin secretion: CPLX1,
ADRA2A, SYT7 & SYTL4
Top 4 VIP-ranked variables include
2 PPI nodes: SLC30A8 (rs13266634)
and GIPR (rs8108269), which have
GWAS-T2D associations.
41. Mackmyra tasked Microsoft and Fourkind to create novel
whisky recipes using AI
From an input of 75 recipes, “AI” could generate 70 million
combinations.
Nr 36 on the AI ranked combinations was approved by
humans
https://www.geekwire.com/2019/microsoft-got-creation-worlds-first-whisky-formulated-ai/ 9/22/19 revision
42. How long does it take to move from “natural” language processing
to AI-driven large-dataset mining? Klingon, anyone? tlhIngan, vay'?
9/25/19 revision
Tomáš Mikolov (Google) developed an efficient algorithm to compute the
distributed representation of words, Word2Vec. It’s currently used for automatic
translation, spam filtering and speech recognition. Word2Vec encodes words
using a distribution of weights across 100s of elements that compose the vectors.
Each element contributes to many words.
T. Mikolov et al., ICLR 2013
10/10/19 revision
43. Alexahealth™: Given today’s health status and my calorie budget,
what food should I shop/prepare today?
Expanding on current models, IDG KMC could use AI/ML to integrate
context-specific computational reasoning tools (“AMI”) with real-time -omics,
biomarker and biomedical literature data.
These could be plugged into hospital / EMR data to improve patient services.
10/10/19 revision
44. 8/24/20 revision
Predictivity between different models for the same disease (even
using the same ML methods) may differ due to input variations
High quality data is really hard to obtain
Weakest components:
‘Ground Truth’ (true negatives) and Domain Expertise