Large Scale computing with medical metabolic phenotyping data – Christoph Steinbeck
This document discusses large-scale computing with medical metabolic phenotyping data. It notes that genes are not the full story and that the metabolome is the most accessible and dynamically changing molecular phenotype. Investigating metabolomes at large scale can help disentangle the exposome conundrum. Phenome centers are emerging around the world to study over 100,000 patient samples per year, generating several petabytes of data annually. The MetaboLights database serves as an experimental repository for this type of data.
Validating microbiome claims – including the latest DNA techniques – Eagle Genomics
Abel Ureta-Vidal, Founder and CEO of Eagle Genomics, discusses how advanced DNA techniques help identify and characterise the microbiome, and how these methods can substantiate cosmetic claims. The talk was given at the in-cosmetics Formulation Summit, 25 October 2017.
I spoke on "Big Data in Biology". The talk concentrates on how biology has shaped big data and how big data has become a key player in biology. It also covers how DNA storage can address long-term archival needs.
This document provides an overview of IRB Barcelona, a biomedical research institute. It discusses the institute's mission to conduct multidisciplinary research at the forefront of biomedicine. Examples are given of recent research findings, including using immunotherapy to treat colon cancer, how circadian rhythms impact aging, and developing a nano-carrier drug delivery system. Facts and figures about IRB Barcelona are presented, noting its designation as a Severo Ochoa Centre of Excellence and research awards/grants received. Infrastructure needs like high-bandwidth internet connections are also mentioned to support processing large datasets from core facilities.
Digital transformation of translational medicine – Eagle Genomics
Anthony Finbow, Executive Chairman, and William Spooner, Chief Science Officer, discuss Eagle Genomics' software product, marketed at pharmaceutical and biotech companies, which enables radical improvements in the productivity of scientific research.
Expert Panel on Data Challenges in Translational Research – Eagle Genomics
A panel of experts, including Alexandre Passioukov (VP Translational Medicine at Pierre Fabre), Xose Fernandez (Chief Data Officer at Institut Curie), and Abel Ureta-Vidal (CEO at Eagle Genomics), share their first-hand experience of enabling translational research in pharmaceutical and biomedical organisations, and discuss the challenges of establishing streamlined, seamless data handling and governance to accelerate innovation.
Considerations and challenges in building an end-to-end microbiome workflow – Eagle Genomics
Many of the data management and analysis challenges in microbiome research are shared with genomics and other life-science big-data disciplines. However, some aspects are specific to the microbiome field: some are intrinsic to microbiome data, some relate to the maturity of the field, and others relate to extracting business value from the data.
Machine learning has many applications and opportunities in biology, though it also faces challenges. It can be used for tasks like disease detection from medical images: deep learning models such as convolutional neural networks have achieved performance exceeding human experts in detecting pneumonia from chest X-rays. Frameworks like DeepChem apply deep learning to problems in drug discovery, while platforms like Open Targets integrate data on drug targets and their relationships to diseases. Overall, machine learning shows promise for advancing biological research, though realising that promise requires building expertise through learning resources and applying models to real-world problems.
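The convolution operation at the heart of such image models is simple to state; below is a minimal pure-Python sketch of a 2D convolution in "valid" mode, illustrative only and not tied to any published pneumonia model (the toy "X-ray patch" and kernel values are invented for the example):

```python
# Illustrative sketch: the 2D convolution underlying CNN-based image
# classifiers (pure Python, "valid" mode, no padding or stride).

def conv2d(image, kernel):
    """Slide `kernel` over `image` and sum elementwise products."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    out = []
    for i in range(out_h):
        row = []
        for j in range(out_w):
            acc = 0.0
            for di in range(kh):
                for dj in range(kw):
                    acc += image[i + di][j + dj] * kernel[di][dj]
            row.append(acc)
        out.append(row)
    return out

# A vertical-edge kernel applied to a tiny hypothetical "X-ray" patch
# containing a bright band on the right.
patch = [
    [0, 0, 9, 9],
    [0, 0, 9, 9],
    [0, 0, 9, 9],
]
edge_kernel = [
    [-1, 1],
    [-1, 1],
]
feature_map = conv2d(patch, edge_kernel)
# The feature map responds strongly exactly where the intensity jumps.
```

In a real CNN, many such kernels are learned from data rather than hand-written, and the convolutions are stacked with nonlinearities and pooling.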
Talk by Christoph Steinbeck, European Bioinformatics Institute (EMBL-EBI) on challenges for data sharing of clinical data in metabolomics research. This workshop was co-organised with the European BBMRI Biobanking infrastructure as part of the BioMedBridge symposium at the Wellcome Trust Conference Centre in Hinxton, UK.
Rashi Srivastava presented on the KEGG database in biotechnology. KEGG contains genomic, chemical, and systems information for understanding biological functions from the molecular level up, covering pathways, genes, compounds, diseases, drugs, and organisms. KEGG can be searched in its flat-file format using DBGET or, for more complex queries, through its relational database. It also offers the KEGG MEDICUS search tool and supports direct SQL queries against the relational database.
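Beyond DBGET and SQL, KEGG entries can also be retrieved programmatically over its public REST interface at rest.kegg.jp; a minimal sketch of building such queries in Python (URL construction only, no network call is made here, and the example identifiers are just illustrations):

```python
# Sketch of querying KEGG through its REST interface (https://rest.kegg.jp).
# Only URLs are built; actually fetching them is left to the caller.

KEGG_BASE = "https://rest.kegg.jp"

def kegg_url(operation, *args):
    """Build a KEGG REST URL, e.g. kegg_url('get', 'hsa:10458')."""
    return "/".join([KEGG_BASE, operation, *args])

# List all human (hsa) pathways:
list_url = kegg_url("list", "pathway", "hsa")
# Retrieve one gene entry as a flat file (the same format DBGET serves):
get_url = kegg_url("get", "hsa:10458")
# Find compounds whose name matches "glucose":
find_url = kegg_url("find", "compound", "glucose")

# To actually fetch, one could use the standard library:
# from urllib.request import urlopen
# text = urlopen(get_url).read().decode()
```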
E. Kolker: An introduction to MOPED: Multi-Omics Profiling Expression Database – bigdatabm
This document introduces MOPED, a publicly accessible database containing multi-omics expression data from various organisms, tissues, and conditions. MOPED integrates protein, gene, and pathway expression levels and allows users to explore expression patterns and connections. It currently contains over 4 million gene expression records, 600,000 protein records, and data on human, mouse, worm, and yeast. The document outlines MOPED's key features for visualizing expression data, filtering experiments, and linking proteins and genes to pathways and external databases. Contact information is provided for questions.
Bioinformatics addresses problems related to the storage, retrieval and analysis of biological data including sequences, structures, and functions. It develops algorithms and statistics to analyze large datasets and tools to efficiently access and manage biologically significant information. The Human Genome Project coordinated by the US Department of Energy and National Institutes of Health applied bioinformatics to sequence the human genome ahead of schedule. Bioinformatics draws on skills in molecular biology, mathematics, computer science, and statistics to solve problems in areas like genomics, proteomics, and structural biology.
This document discusses genome database systems. It begins with an introduction to bioinformatics and genomes, then covers the background of genome databases, including some examples. The major characteristics of genome database systems are described as highly complex data, rapidly changing schemas, and complex queries. The key areas of data management in genome databases are non-standard data, complex queries, data interpretation, integration across databases, and uniform management solutions. Major research areas and applications that impact society are also summarized.
The document discusses the Kyoto Encyclopedia of Genes and Genomes (KEGG), which is an online database that contains information on genomes, enzymatic pathways, and biological chemicals. KEGG aims to computerize current biological knowledge and provide consistent annotations. It maintains six main databases: KEGG Pathway, KEGG Genes, KEGG Genome, KEGG Ligand, KEGG BRITE, and KEGG Cancer. These databases contain information on metabolic pathways, gene catalogs, genome sequences, chemical reactions, functional hierarchies, and cancer pathways respectively. KEGG can be used to detect enzyme clusters, compare gene clusters across genomes, and model and simulate biological systems.
This document discusses genomic resource conservation of horticultural crops. It defines genomic resources and describes the types that exist, including genomic, mitochondrial and chloroplast DNA, RNA, DNA markers, probes, primers, vectors, cloned genes, libraries, and sequence information. Genomic resources can be used for transgenic and cisgenic plant development, molecular breeding, germplasm screening, detailed understanding of plant biology, crop improvement through markers and genes, assessing diversity, and comparative genomics. The status of sequencing projects and EST resources available in public databases for various horticultural crops is presented. DNA banks that conserve genomic resources are described, and major plant DNA banks around the world are listed, along with key genomic resource databases. Future areas of focus are also outlined.
This document describes the Galaxy Framework as a unifying bioinformatics solution for multi-omic data analysis. It discusses how Galaxy provides workflows for proteogenomics and metaproteomics analysis using mass spectrometry data from various omics fields including genomics, transcriptomics, and metagenomics. Examples are given of ongoing projects applying these Galaxy workflows to analyze salivary proteomics, hibernation in ground squirrels, lung cancer proteomics, and oral biofilm metaproteomics related to childhood caries.
Bioinformatics analyzes massive amounts of biological data like DNA sequences to uncover hidden biological information. It has many applications like molecular medicine, drug development, and microbial genome analysis. Common bioinformatics tools like BLAST are used to compare query sequences against databases to find similar sequences. BLAST works through a heuristic algorithm that finds short matches between sequences to locate potential homologs in an efficient manner. Other algorithms like Smith-Waterman and FASTA also perform sequence alignment but with different tradeoffs in accuracy and speed.
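The seed-and-extend idea behind BLAST can be sketched in a few lines; this toy version (exact-match seeding and extension only, with no scoring matrix, gaps, or E-values) is an illustration of the heuristic, not the real algorithm:

```python
# Toy illustration of BLAST's seed-and-extend heuristic: index k-mers
# ("words") of the database sequence, find exact seed matches with the
# query, then extend each seed left and right while bases still agree.

def kmer_index(seq, k):
    """Map each k-mer to the positions where it occurs in seq."""
    index = {}
    for i in range(len(seq) - k + 1):
        index.setdefault(seq[i:i + k], []).append(i)
    return index

def seed_and_extend(query, db, k=3):
    """Return (query_pos, db_pos, length) of maximal exact extensions."""
    index = kmer_index(db, k)
    hits = set()
    for qi in range(len(query) - k + 1):
        for di in index.get(query[qi:qi + k], []):
            # Extend the exact match leftwards...
            left_q, left_d = qi, di
            while left_q > 0 and left_d > 0 and query[left_q - 1] == db[left_d - 1]:
                left_q -= 1
                left_d -= 1
            length = k + (qi - left_q)
            # ...and rightwards, as far as the sequences stay identical.
            while (left_q + length < len(query) and left_d + length < len(db)
                   and query[left_q + length] == db[left_d + length]):
                length += 1
            hits.add((left_q, left_d, length))
    return hits

# The query occurs in full at position 2 of the database sequence, and a
# shorter spurious seed ("TAC") also matches at position 1.
hits = seed_and_extend("ACGTAC", "TTACGTACTT", k=3)
```

Real BLAST refines this with neighbourhood words, gapped extension, and statistical significance filtering, which is where the accuracy/speed trade-offs against Smith-Waterman and FASTA come from.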
Bioinformatics is an interdisciplinary field that merges biology, computer science, and information technology. It is applied in areas like genomics, proteomics, and systems biology. While some basic analysis can be done through user-friendly tools, truly customized work requires programming skills and an understanding of underlying algorithms. Bioinformatics is not just a service field but rather involves scientific experimentation throughout the entire analysis process from experimental design to evaluation. It is a dedicated field of research in its own right, not a quick or interchangeable task.
This document summarizes various proteomics resources available at the EBI and ExPASy. It describes databases such as UniProt, IntAct, Reactome, and PRIDE that provide protein sequence and functional information, molecular interaction data, pathways information, and proteomics identifications respectively. It also outlines tools available at ExPASy including Hits, Prosite, SWISS-MODEL, SwissDock, and neXtProt which allow investigation of protein domains, motifs, homology modeling, protein-ligand docking, and human protein information. Additionally, it mentions STRING and VenomZone for protein-protein interaction and venom data, and various other tools for predictions, analyses, and working with proteomics data.
This document summarizes big data in the life sciences sector and its strategic importance for stakeholders such as pharmaceutical and medical device companies. It discusses how capturing, storing, managing data flows and analyzing large amounts of information affects all aspects of organizations, particularly the discovery and research & development stages. Implementing a strategic shift towards big data approaches requires support from senior management and organization-wide implementation. Areas that can benefit include genomics, clinical research, epidemiology, public health, and understanding product effectiveness and health outcomes. Managing data generated across the entire value chain, from discovery to real-world use, has become vastly more challenging due to increasing data volumes.
Computational Biology and Bioinformatics – Sharif Shuvo
Computational Biology and Bioinformatics is a rapidly developing multi-disciplinary field. The systematic acquisition of data made possible by genomics and proteomics technologies has created a tremendous gap between available data and their biological interpretation.
Basic design and organisation of diagnostic laboratory – Rohit Hari
This document discusses the organization and design of diagnostic laboratories. It begins by introducing the importance of laboratory design and organization. Key documents like organizational charts and procedure manuals describe how personnel, equipment, and facilities are integrated. Four major changes in laboratory design are then discussed: 1) open-plan designs for flexibility, 2) increased automation of testing, 3) incorporation of biosafety level 3 containment, and 4) growth of molecular testing. Ten steps for an efficient laboratory layout are also outlined. The document concludes that well-designed laboratories provide flexibility for future growth and the ability to incorporate new technologies.
Genome-Scale Metabolic Models and Systems Medicine of Metabolic Syndrome – Natal van Riel
Workshop on 'The interplay of fat and carbohydrate metabolism with application in Metabolic Syndrome and Type 2 Diabetes', December 12 and 13, 2013, Eindhoven University of Technology.
The document provides information about a short course on next generation sequencing and analysis of sequence variants. It includes an agenda with sessions on introduction to NGS applications in medical research, data analysis pipelines, interpretation of variants, and tools for predicting pathogenicity. It also provides background on the organizing institutions, the CNAG sequencing center and its projects, and an overview of bioinformatics analysis pipelines and resources.
World-wide data exchange in metabolomics, Wageningen, October 2016 – Christoph Steinbeck
Talk given at the Netherlands Institute of Ecology in Wageningen, where I describe the development of the MetaboLights database and the value of data sharing in metabolomics and molecular biology in general.
HealthRI Biobanking Day, Amsterdam, 9 November 2023 – Alain van Gool
Examples of lessons learned in Omics-based biomarker studies from myself and colleagues in X-omics and EATRIS, for an audience of biobankers, researchers and diagnostic/clinical chemistry experts.
Will Biomedical Research Fundamentally Change in the Era of Big Data? – Philip Bourne
This document discusses how biomedical research may fundamentally change in the era of big data. It notes that biomedical research has always been data-driven, but the scope, variety, complexity and volume of data is now much greater. It also discusses the need for more open data sharing and new tools and methods for large-scale analysis. The document suggests biomedical research may move towards a more collaborative "platform" model, as seen with companies like Airbnb, with the goal of improving data access, reuse and reproducibility of research. However, overcoming challenges like incentives, trust and work practices will be important for any new platform to succeed.
Developing an Efficient Infrastructure, Standards and Data-Flow for Metabolomics – Christoph Steinbeck
The document discusses the development of efficient infrastructure, standards, and data flow for metabolomics. It describes the European Bioinformatics Institute (EBI) and its role in archiving, classifying, analyzing, and sharing metabolomics data through databases like MetaboLights. MetaboLights has experienced rapid data growth and is now the recommended repository for several journals. Efforts are underway to establish global standards and facilitate data exchange through initiatives like COSMOS and MetabolomeXchange. The document outlines plans to build out reference metabolomes and enable large-scale computing with medical metabolomics data.
Outlining the process and lessons learned in organising the technological infrastructure at the Radboud university medical center to shape the Radboudumc Technology Centers, supporting our mission of enabling personalized healthcare.
This document discusses large-scale computing with medical metabolic phenotyping data. It describes how metabolomics measures small-molecule metabolites in biological organisms using techniques like NMR and mass spectrometry. Metabolomics data is growing rapidly and being stored in repositories like MetaboLights at EBI. Phenome centers around the world are generating huge amounts of metabolomics data that will require exabytes of storage. The PhenoMeNal project aims to enable large-scale computing with medical metabolomics data through collaborations between EBI, other data centers, and phenomics user communities.
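The quoted storage figures can be sanity-checked with a back-of-envelope calculation; the per-sample data footprint and number of centres below are illustrative assumptions, not numbers from the talk:

```python
# Back-of-envelope storage estimate for a phenome centre.
# ASSUMPTION: ~30 GB of raw NMR/MS data per sample (illustrative only).

samples_per_year = 100_000        # quoted throughput of a phenome centre
gb_per_sample = 30                # assumed average raw-data footprint
gb_per_pb = 1_000_000             # decimal units: 1 PB = 10**6 GB

petabytes_per_year = samples_per_year * gb_per_sample / gb_per_pb
# 3.0 PB/year for one centre, consistent with "several petabytes annually".

# ASSUMPTION: a few dozen such centres worldwide, running for a decade.
centres, years = 30, 10
total_pb = petabytes_per_year * centres * years
# 900 PB, i.e. approaching the exabyte scale mentioned above.
```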
Big Data in Biomedicine: Where is the NIH Headed - Philip Bourne
The National Institutes of Health (NIH) is taking actions to address the implications of big data for biomedical research and healthcare. These include developing a "Commons" approach to make data findable, accessible, interoperable and reusable. The NIH is also establishing initiatives like the Precision Medicine Initiative to generate large datasets and the Center for Predictive Computational Phenotyping to develop predictive models from electronic health records. Overall, the NIH aims to train a workforce equipped for data science and facilitate open collaboration to realize the potential of big data for improving health outcomes.
This document discusses the role of bioinformatics in medicine today. It begins by explaining how genomics differs from genetics in studying many genes and genomic features together rather than single genes. It then describes some of the key genomic databases that are used in bioinformatics, including primary sequence databases like GenBank, metadatabases like Entrez, genome databases like Ensembl and UCSC, and pathway and protein databases. The document provides an example of how bioinformatics is used to analyze autism data, including processing sequencing data, identifying copy number variations, mapping genes, building networks, and identifying significant clusters to understand autism better.
Developing Tools & Methodologies for the Next Generation of Genomics & Bio In... - Intel IT Center
This document discusses computational challenges in analyzing next generation DNA sequencing data and implications for diagnostics and therapeutics. It notes that sequencing one genome now takes just a few days but analysis is the bottleneck. The author's organization, Genomeon, has developed a high performance computing system called SHADOWFAX to analyze over 7,700 genomes. Genomeon is focusing on analyzing repetitive DNA sequences called microsatellites that were previously understudied. They have identified patterns of informative microsatellites that differentiate cancer types like breast cancer from healthy genomes with high sensitivity and specificity. These findings could enable new cancer risk assessment, companion diagnostics, clinical trial stratification, drug targets, and non-cancer applications.
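The sensitivity and specificity mentioned here are standard confusion-matrix quantities. A minimal sketch of how they are computed (the counts below are purely illustrative, not Genomeon's actual results):

```python
def sensitivity_specificity(tp, fn, tn, fp):
    """Sensitivity = TP / (TP + FN); specificity = TN / (TN + FP)."""
    sensitivity = tp / (tp + fn)  # fraction of true cancers detected
    specificity = tn / (tn + fp)  # fraction of healthy genomes correctly cleared
    return sensitivity, specificity

# Illustrative counts for a hypothetical cancer-vs-healthy classifier:
sens, spec = sensitivity_specificity(tp=90, fn=10, tn=95, fp=5)
print(sens, spec)  # 0.9 0.95
```

A test that differentiates cancer types with both values near 1.0 misses few true cases while raising few false alarms, which is what matters for risk assessment and trial stratification.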
ITHANET - Information and Database Portal for the Thalassaemias and other Hae... - Human Variome Project
This document describes ITHANET, an online portal that aims to be a scientific and diagnostic tool for research and treatment of thalassemias and other hemoglobinopathies. The portal includes several main sections - a community section to connect researchers and organizations, an encyclopedia called IthaPedia with educational resources, and several databases and tools including IthaGenes for hemoglobin variations, IthaMaps with epidemiological data, and an HPLC tool to facilitate diagnosis. Future plans are described to expand existing databases and develop a new genotype-phenotype database and updated version of IthaMaps.
Genomic Big Data Management, Integration and Mining - Emanuel Weitschek (Data Driven Innovation)
This document summarizes genomic big data management, integration and mining. It discusses the exponential growth of biological data due to advances in sequencing technologies. Next generation sequencing techniques generate large amounts of short DNA reads. Several public databases contain heterogeneous biological data sources. Effective data management and integration methods are needed to analyze these large and complex datasets. Supervised machine learning can be used to extract knowledge and classify samples. Tools like CAMUR apply rule-based classification to problems like analyzing gene expression from cancer datasets. Future work involves advanced integration systems and new big data approaches for biological data.
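Rule-based classifiers of the kind CAMUR produces label a sample by testing conjunctions of expression thresholds. A hypothetical sketch (gene names, thresholds, and the rule format are illustrative, not CAMUR's actual output or API):

```python
# Each sample is a dict {gene: expression value}; each class is defined
# by a conjunction of (gene, threshold, direction) rules.
RULES = {
    "tumor": [("BRCA1", 2.5, "gt"), ("TP53", 1.0, "lt")],
}

def matches(sample, rule):
    gene, threshold, direction = rule
    value = sample.get(gene, 0.0)
    return value > threshold if direction == "gt" else value < threshold

def classify(sample, rules=RULES, default="normal"):
    # A sample gets a class label if it satisfies ALL of that class's
    # conjunctive rules; otherwise it falls back to the default label.
    for label, conjuncts in rules.items():
        if all(matches(sample, r) for r in conjuncts):
            return label
    return default

print(classify({"BRCA1": 3.1, "TP53": 0.4}))  # tumor
print(classify({"BRCA1": 1.2, "TP53": 2.0}))  # normal
```

The appeal of rule-based models over black-box classifiers is exactly this readability: each prediction can be traced back to a handful of named genes and thresholds.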
The EMBL-European Bioinformatics Institute (EBI) is a large bioinformatics research and services institute located in Hinxton, UK. It is part of the European Molecular Biology Laboratory and houses massive biological databases and bioinformatics software tools that are freely available to researchers. Key goals of EBI include building and maintaining biological databases, making data widely accessible, and conducting bioinformatics research to advance biology. EBI coordinates data collection and dissemination internationally and houses over 500 staff from diverse backgrounds.
This document provides an overview of genomics, bioinformatics, and related topics. It discusses:
- The genomics and bioinformatics group members Amit Garg, Lokesh Joshi, and Pankaj Phogat.
- Definitions of genomics, genome, and bioinformatics.
- An overview of the human genome project including its history, goals of identifying and sequencing all human genes, and completion in 2003.
- Other completed genome projects such as for bacteria and yeast.
- The role of bioinformatics in collecting, organizing, analyzing, and sharing biological data through computational modeling, databases, and other tools.
Centre for Genomic Regulation Talk February 2024.pptx - Nick Brown
Presentation I recently gave at the Centre for Genomic Regulation in Barcelona, providing a glimpse into my 20-year career at AstraZeneca with some advice for younger data scientists entering the field.
2022-11-23 DTL Future of data-driven life sciences, Utrecht, Alain van Gool.pdf - Alain van Gool
A pitch on directions to improve experimental reproducibility, illustrated by examples of past experiences. I made the plea to move from 'Proudly invented here' to 'Proudly copied from', to re-use each other's experiences in successes and failures.
Similar to PhenoMeNal: Large scale computing with medical metabolic phenotyping data (20)
Publication of raw and curated NMR spectroscopic data for organic molecules - Christoph Steinbeck
The document discusses nuclear magnetic resonance (NMR) spectroscopy and the need for sharing raw NMR data. It describes NMReDATA as a machine-readable representation for linking NMR spectral data to chemical structures. Benefits of NMReDATA include improved data quality, easier data sharing and storage, and validation of results. The document also calls for building a stable, open archive with community standards for submitting raw NMR data and metadata. Existing frameworks could support such an archive by handling submissions and allowing search/visualization of NMR data.
Lecture on Computer-Assisted Structure Elucidation delivered as part of the summer school on metabolomics data analysis in the cloud on Sardinia, 2017. Author and Speaker: Prof. Dr. Christoph Steinbeck, Friedrich-Schiller-University, Jena, Germany.
Talk given at the symposium about government-funded databases and open chemistry at the national meeting of the American Chemical Society in Washington, 21 Aug 2017
Building an efficient infrastructure, standards and data flow for metabolomics - Christoph Steinbeck
Christoph Steinbeck from the European Bioinformatics Institute discusses developing efficient infrastructure, standards, and data flow for metabolomics. Metabolomics measures small molecule metabolites in organisms and generates large amounts of data. The MetaboLights database was created to share metabolomics data openly. It is growing rapidly with a doubling time of 3 months for metabolomics data. Efforts are underway to build standardized reference data through projects like COSMOS and MetabolomeXchange. While genomes of thousands of species have been sequenced, far fewer complete metabolomes exist. The talk advocates for focused efforts to map metabolites and build quantitative models of well-studied model organisms' metabolomes.
The document argues that the time is right to focus intensively on model organism metabolomes by identifying and mapping all metabolites onto pathways and developing quantitative metabolic models for model organisms. It proposes establishing metabolomes for a series of established model organisms in microbial, animal and plant research. The MetaboLights Database at the EBI is collaboratively assembling species metabolomes through data sharing, with a 3-month doubling time for metabolomics data submissions. Developing deep annotation of model organism metabolomes would help build comprehensive species metabolomes.
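A 3-month doubling time implies exponential growth in data volume. A small sketch of the arithmetic (the starting volume is an assumed illustration, not an actual MetaboLights figure):

```python
def projected_size(initial, months, doubling_time=3.0):
    """Project data volume under exponential growth with a fixed
    doubling time (both durations in months)."""
    return initial * 2 ** (months / doubling_time)

# With an assumed 1 TB of submissions today and a 3-month doubling time,
# one year of growth means 2**(12/3) = 16x the current volume.
print(projected_size(1.0, 12))  # 16.0
```

This compounding is why the talks emphasise infrastructure: at a 3-month doubling time, a repository grows 16-fold per year and roughly 250-fold over two years.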
A talk at the Molecular Informatics Open Source Meeting (MIOSS) at the European Bioinformatics Institute (EMBL-EBI) in Hinxton, Cambridge, United Kingdom
The thematic appreciation test is a psychological assessment tool used to measure an individual's appreciation and understanding of specific themes or topics. This test helps to evaluate an individual's ability to connect different ideas and concepts within a given theme, as well as their overall comprehension and interpretation skills. The results of the test can provide valuable insights into an individual's cognitive abilities, creativity, and critical thinking skills.
Remote Sensing and Computational, Evolutionary, Supercomputing, and Intellige... - University of Maribor
Slides from talk:
Aleš Zamuda: Remote Sensing and Computational, Evolutionary, Supercomputing, and Intelligent Systems.
11th International Conference on Electrical, Electronics and Computer Engineering (IcETRAN), Niš, 3-6 June 2024
Inter-Society Networking Panel GRSS/MTT-S/CIS Panel Session: Promoting Connection and Cooperation
https://www.etran.rs/2024/en/home-english/
Or: Beyond linear.
Abstract: Equivariant neural networks are neural networks that incorporate symmetries. The nonlinear activation functions in these networks result in interesting nonlinear equivariant maps between simple representations, and motivate the key player of this talk: piecewise linear representation theory.
Disclaimer: No one is perfect, so please mind that there might be mistakes and typos.
dtubbenhauer@gmail.com
Corrected slides: dtubbenhauer.com/talks.html
Describing and Interpreting an Immersive Learning Case with the Immersion Cub... - Leonel Morgado
Current descriptions of immersive learning cases are often difficult or impossible to compare. This is due to a myriad of different options on what details to include, which aspects are relevant, and on the descriptive approaches employed. Also, these aspects often combine very specific details with more general guidelines or indicate intents and rationales without clarifying their implementation. In this paper we provide a method to describe immersive learning cases that is structured to enable comparisons, yet flexible enough to allow researchers and practitioners to decide which aspects to include. This method leverages a taxonomy that classifies educational aspects at three levels (uses, practices, and strategies) and then utilizes two frameworks, the Immersive Learning Brain and the Immersion Cube, to enable a structured description and interpretation of immersive learning cases. The method is then demonstrated on a published immersive learning case on training for wind turbine maintenance using virtual reality. Applying the method results in a structured artifact, the Immersive Learning Case Sheet, that tags the case with its proximal uses, practices, and strategies, and refines the free text case description to ensure that matching details are included. This contribution is thus a case description method in support of future comparative research of immersive learning cases. We then discuss how the resulting description and interpretation can be leveraged to change immersion learning cases, by enriching them (considering low-effort changes or additions) or innovating (exploring more challenging avenues of transformation). The method holds significant promise to support better-grounded research in immersive learning.
The use of Nauplii and metanauplii artemia in aquaculture (brine shrimp).pptx - MAGOTI ERNEST
Although Artemia has been known to man for centuries, its use as a food for the culture of larval organisms apparently began only in the 1930s, when several investigators found that it made an excellent food for newly hatched fish larvae (Litvinenko et al., 2023). As aquaculture developed in the 1960s and '70s, the use of Artemia also became more widespread, due both to its convenience and to its nutritional value for larval organisms (Arenas-Pardo et al., 2024). The fact that Artemia dormant cysts can be stored for long periods in cans, and then used as an off-the-shelf food requiring only 24 h of incubation, makes them the most convenient, least labor-intensive live food available for aquaculture (Sorgeloos & Roubach, 2021). The nutritional value of Artemia, especially for marine organisms, is not constant, but varies both geographically and temporally. During the last decade, however, both the causes of Artemia nutritional variability and methods to improve poor-quality Artemia have been identified (Loufi et al., 2024).
Brine shrimp (Artemia spp.) are used in marine aquaculture worldwide. Annually, more than 2,000 metric tons of dry cysts are used for cultivation of fish, crustacean, and shellfish larvae. Brine shrimp are important to aquaculture because newly hatched brine shrimp nauplii (larvae) provide a food source for many fish fry (Mozanzadeh et al., 2021). Culture and harvesting of brine shrimp eggs represents another aspect of the aquaculture industry. Nauplii and metanauplii of Artemia, commonly known as brine shrimp, play a crucial role in aquaculture due to their nutritional value and suitability as live feed for many aquatic species, particularly in larval stages (Sorgeloos & Roubach, 2021).
Authoring a personal GPT for your research and practice: How we created the Q... - Leonel Morgado
Thematic analysis in qualitative research is a time-consuming and systematic task, typically done using teams. Team members must ground their activities on common understandings of the major concepts underlying the thematic analysis, and define criteria for its development. However, conceptual misunderstandings, equivocations, and lack of adherence to criteria are challenges to the quality and speed of this process. Given the distributed and uncertain nature of this process, we wondered if the tasks in thematic analysis could be supported by readily available artificial intelligence chatbots. Our early efforts point to potential benefits: not just saving time in the coding process but better adherence to criteria and grounding, by increasing triangulation between humans and artificial intelligence. This tutorial will provide a description and demonstration of the process we followed, as two academic researchers, to develop a custom ChatGPT to assist with qualitative coding in the thematic data analysis process of immersive learning accounts in a survey of the academic literature: QUAL-E Immersive Learning Thematic Analysis Helper. In the hands-on time, participants will try out QUAL-E and develop their ideas for their own qualitative coding ChatGPT. Participants that have the paid ChatGPT Plus subscription can create a draft of their assistants. The organizers will provide course materials and slide deck that participants will be able to utilize to continue development of their custom GPT. The paid subscription to ChatGPT Plus is not required to participate in this workshop, just for trying out personal GPTs during it.
Phenomics-assisted breeding in crop improvement - IshaGoswami9
The global population is increasing and is expected to reach about 9 billion by 2050. Together with climate change and resource shortages, this makes it difficult to meet the food requirements of such a large population, so crop yield and quality need to be improved in a sustainable way over the coming decades. Genetic improvement by breeding is the best way to increase crop productivity. With the rapid progress of functional genomics, an increasing number of crop genomes have been sequenced and dozens of genes influencing key agronomic traits have been identified. However, current genome sequence information has not been adequately exploited for understanding complex, multi-gene traits, owing to a lack of crop phenotypic data. Efficient, automatic, and accurate technologies and platforms that can capture phenotypic data linkable to genomic information at all growth stages have become as important as genotyping; phenotyping has thus become the major bottleneck restricting crop breeding. Plant phenomics has been defined as the high-throughput, accurate acquisition and analysis of multi-dimensional phenotypes during crop growing stages at multiple levels, including the cell, tissue, organ, individual plant, plot, and field levels. With the rapid development of novel sensors, imaging technology, and analysis methods, numerous infrastructure platforms have been developed for phenotyping.
PhenoMeNal: Large scale computing with medical metabolic phenotyping data
1. Christoph Steinbeck and the PhenoMeNal consortium
European Bioinformatics Institute, Hinxton
Friedrich-Schiller-University, Jena
Large scale computing with medical metabolic phenotyping data
5. Untargeted Metabolomics of Cerebrospinal Fluid
Alzheimer's Disease (AD), Mild Cognitive Impairment (MCI), Cognitively Normal (CN)
[Score plots: AD (red) vs. MCI (blue); AD (red) vs. CN (blue)]
6. Untargeted Metabolomics of Cerebrospinal Fluid
Alzheimer's Disease (AD), Mild Cognitive Impairment (MCI), Cognitively Normal (CN)
[Score plots: AD (red) vs. MCI (blue); AD (red) vs. CN (blue)]
Basis for a diagnostic tool
19. PhenoMeNal Objectives
2. Operate and consolidate the PhenoMeNal e-infrastructure based on existing internal and external HPC (high-performance computing) resources
20. PhenoMeNal Objectives
3. To improve and scale up tools used within the infrastructure to cope with very large datasets
4. To establish technology for a water-tight audit trail
21. PhenoMeNal Objectives
5. To establish privacy-protection methods
6. To foster the worldwide adoption of PhenoMeNal
7. To develop a model to ensure the sustainability of the PhenoMeNal network