I spoke on "Big Data in Biology". The talk focuses on how biology has shaped big data and how big data has become a key player in biology. It also covers how DNA storage can address long-term archival storage.
This document discusses the importance of computational tools for biological research. It provides an overview of how computer applications are used in areas like the Human Genome Project, transcriptomics, proteomics, and systems biology. The document also notes challenges for biological research in Thailand, including a lack of background knowledge in computer science and limited access to free and easy-to-use computational tools, especially in the Thai language. It argues that biology students in Thailand should be taught bioinformatics and computational biology skills to better facilitate biological research.
How Can We Make Genomic Epidemiology a Widespread Reality? - William Hsiao
The document discusses genomic epidemiology and the requirements to bring genomic sequencing into routine public health practice. It outlines two parts: (1) what genomic epidemiology is and why it is important; and (2) the requirements for genomic sequencing to be used routinely in public health. Whole genome sequencing is seen as a way to generate high quality pathogen genomes quickly and allow for more detailed tracking of disease spread compared to traditional methods. However, bringing genomic sequencing into public health practice requires overcoming barriers such as the need for user-friendly analysis platforms, training public health personnel in genomics, and improving information sharing between organizations.
OECD Webinar | From Data to Knowledge and Beyond Adverse Outcome Pathways as ... - OECD Environment
On 23 and 30 November 2020, the OECD hosted a webinar on training needs, resources, and opportunities for adverse outcome pathways (AOPs). This interactive webinar discussed opportunities for expanding the AOP community of trainers to meet current needs, considering all available resources.
The objectives of this webinar were to:
- outline past and current training activities
- receive your input on experiences in conducting and/or receiving training; and
- gather your views, needs and ideas for the provision of training in the future.
Watch the webinar video recording at: https://youtu.be/7ObxATifDds.
Metagenomic Data Provenance and Management using the ISA infrastructure --- o... - Alejandra Gonzalez-Beltran
Metagenomic Data Provenance and Management using the ISA infrastructure - overview, implementation patterns & software tools
Slides presented at EBI Metagenomics Bioinformatics course: http://www.ebi.ac.uk/training/course/metagenomics2014
This document provides an introduction to the field of bioinformatics. It defines bioinformatics as the merging of biology, computer science, and information technology into a single discipline. The document outlines key topics in bioinformatics including what is bioinformatics, why it is needed due to the growth of sequencing data, common data types and analysis problems, careers in bioinformatics, and different sequencing technologies such as Illumina and SOLiD sequencing.
Slides for the afternoon session on "Introduction to Bioinformatics", delivered at the James Hutton Institute, 29th, 30th May and 5th June 2014, by Leighton Pritchard and Peter Cock.
Slides cover introductory guidance and links to resources, theory and use of BLAST tools, and a workshop featuring some common tools and tasks.
Professor Carole Goble, University of Manchester, talks at the RIN "Research data: policies & behaviour" event as part of a series on Research Information in Transition.
The document discusses challenges with storing and analyzing bioinformatics data and proposes solutions using NoSQL database technologies. It notes that bioinformatics data is large in volume, unstructured, and distributed globally. Relational databases struggle with this type of data. NoSQL databases offer alternatives as they are not constrained by schemas, are designed for performance and scalability, and can handle distributed unstructured data. The document presents MongoDB and Cassandra as examples of NoSQL databases suitable for bioinformatics and describes a case study using MongoDB for genomic data. It concludes that NoSQL technologies are promising for the future of bioinformatics given the unique data challenges in the field.
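As a rough, hedged illustration of the kind of MongoDB usage such a case study implies (the collection name, fields and values below are hypothetical, not taken from the document):

```python
# Illustrative sketch only: collection name, fields and values are hypothetical,
# not taken from the case study described above.
from pymongo import MongoClient

client = MongoClient("mongodb://localhost:27017")
variants = client["genomics_demo"]["variants"]

# Documents need not share a schema: annotations can differ per record.
variants.insert_many([
    {"chrom": "chr7", "pos": 55191822, "ref": "T", "alt": "G",
     "gene": "EGFR", "consequence": "missense_variant"},
    {"chrom": "chr17", "pos": 7674220, "ref": "C", "alt": "T",
     "gene": "TP53"},  # fewer annotations, still a valid document
])

# Simple index plus query: the kind of lookup that needs no fixed schema.
variants.create_index([("gene", 1)])
for doc in variants.find({"gene": "EGFR"}):
    print(doc["chrom"], doc["pos"], doc["ref"], ">", doc["alt"])
```

The point of the sketch is that records with different annotation sets can sit in the same collection without schema migrations, which is the flexibility the document attributes to NoSQL stores.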
The document proposes an Oz Mammals Bioinformatics and Data Resource to store, share, and analyze genomic and other data from Australian mammal studies. It would:
1) Capture existing Oz mammal data and resources, provide long-term storage, and integrate new genomic data from the OMG Project.
2) Enable data sharing within the OMG project and provide access to Oz mammal data worldwide.
3) Give access to data processing, analysis, and visualization tools, and integrate with external resources like the Atlas of Living Australia.
Slides contain information about why bioinformatics appeared, who bioinformaticians are, what they do, and what cool applications and challenges exist in bioinformatics.
Slides were prepared for the Bioinformatics seminar 2016, Institute of Computer Science, University of Tartu.
Marco Brandizi and Keywan Hassani-Pak, Rothamsted Research, Invited Presentation at SWAT4HCLS 2022.
FAIR data principles have become a driving force in life sciences and other scientific domains, helping researchers share their data and unlock its full potential for integrating information and making novel discoveries. Knowledge graphs are an increasingly popular paradigm for modelling data according to such principles, and technologies such as graph databases are emerging as complementary to approaches like linked data. All of this includes the agronomy, farming and food domains. How advanced is the adoption of sound data management policies in these domains? How does it compare to other life sciences? In this presentation, we will talk about our practical experience, focusing on KnetMiner, a gene and molecular biology discovery platform, which is based on building and publishing knowledge graphs according to the FAIR principles, as well as using a mix of linked data standards for life sciences and recent graph database and API technologies. We will welcome questions and discussion from the audience about similar experiences.
This document describes an educational research kit that uses CRISPR-Cas10 gene editing technology. The kit allows students to explore gene transfer between phages, which is currently not well understood. It leverages the accessibility of CRISPR to enable high school, college, and graduate students to learn about gene editing and DNA analysis of bacteria. The technology was successfully piloted and demonstrates that CRISPR research can be made accessible through an educational kit. The kit was invented by Dr. Asma Hatoum, an assistant professor of biological sciences.
Bioinformatics is defined as the field that studies biology using computers and information technology. It involves the collection, storage, and analysis of molecular biological data using techniques from computer science and statistics. Some key events in bioinformatics include Watson and Crick proposing the DNA double helix structure in 1953, and the development of sequence alignment and structure prediction algorithms in the 1970s. Bioinformatics aims to better understand living cells at the molecular level by analyzing raw molecular sequence and structure data. It provides globally accessible databases and analysis tools to enable sharing and study of biological data.
This research aimed to develop a tool to group bioassays from PubChem based on experimental parameters extracted from narratives using natural language processing (NLP). The researchers used Latent Semantic Indexing (LSI) to identify topics in over 2,000 bioassay narratives from PubMed abstracts. LSI was able to group assays without supervision but was sensitive to the number of tokens and concepts used, focusing on either species or chemical compounds. While encouraging, additional studies are needed to better control LSI's effectiveness for chemical modeling applications.
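As a hedged sketch of the LSI approach described above (toy narratives and illustrative parameters; scikit-learn's TruncatedSVD stands in for whatever LSI implementation the authors used):

```python
# Minimal LSI sketch with toy bioassay narratives; the real study used >2,000
# PubChem narratives and tuned the number of tokens and concepts.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.decomposition import TruncatedSVD
from sklearn.cluster import KMeans

narratives = [
    "Luciferase reporter assay measuring inhibition in human HEK293 cells",
    "Binding assay of small-molecule compounds against a kinase target",
    "Cytotoxicity assay in mouse fibroblast cell line after compound treatment",
    "High-throughput screen of a chemical library for enzyme inhibitors",
]

# Term-document matrix; the choice of tokens strongly affects the topics found.
X = TfidfVectorizer(stop_words="english").fit_transform(narratives)

# LSI = truncated SVD of the term-document matrix; n_components = number of concepts.
lsi = TruncatedSVD(n_components=2, random_state=0)
topics = lsi.fit_transform(X)

# Unsupervised grouping of assays in the reduced concept space.
labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(topics)
for text, label in zip(narratives, labels):
    print(label, text[:50])
```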
KnetMiner provides an easy-to-use web interface to visualisation and data mining tools for the discovery and evaluation of candidate genes from large-scale integrations of public and private data sets. It addresses the needs of scientists who generally lack the time and technical expertise to review all relevant information available in the literature, from key model species and from a potentially wide range of related biological databases. We have previously developed genome-scale knowledge networks (GSKNs) for multiple crop and animal species (Hassani-Pak et al. 2016). The KnetMiner web server searches and evaluates millions of relations and concepts within the GSKNs in real time to determine if direct or indirect links between genes and trait-based keywords can be established. KnetMiner accepts as user inputs search terms in combination with a gene list and/or genomic regions. It produces a table of ranked candidate genes and allows users to explore the output in interactive genome and network map visualisation tools that have been optimised for web use on desktop and mobile devices. The KnetMiner web server and the GSKNs provide a step forward towards systematic and evidence-based gene discovery.
Bioinformatics combines computer science, statistics, mathematics, and biology to study and process biological data on a large scale. The document discusses several applications of bioinformatics including information search and retrieval, sequence comparison for genetics, phylogenetic analysis, genome annotation, proteomics, pharmacogenomics, and drug discovery. Tools are provided for various applications such as linkage analysis, phylogenetic analysis, genome annotation, and protein identification.
This document introduces bioinformatics and discusses some of its key concepts and applications. It defines bioinformatics as an interdisciplinary field that combines computer science, statistics and engineering to study and process biological data. It describes some basic cell components like DNA, RNA and proteins, and how genetics and the genetic code work. It also provides a brief history of bioinformatics, highlighting projects like the Human Genome Project. Finally, it outlines several applications of bioinformatics like phylogenetic analysis, drug design, microarray analysis and protein-protein interaction networks.
This document discusses using AgriSchemas and schema.org to model and share interoperable agricultural data from sources like KnetMiner and DFW for use cases involving molecular biology, gene expression, literature, and experiments. AgriSchemas provides a way to formally represent heterogeneous agricultural data to support exploratory research and data integration/sharing according to FAIR principles. Examples show how gene, publication, experiment and other data from KnetMiner are modeled and made accessible via AgriSchemas and linked data formats. Ongoing work focuses on additional areas like host-pathogen interactions, weather data, and dataset metadata.
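To make the linked-data angle concrete, here is a minimal, hypothetical sketch of a gene record serialised as schema.org-style JSON-LD from Python; the identifiers and property choices are illustrative assumptions, not the actual AgriSchemas model:

```python
# Illustrative only: the IRI, identifier and property choices are assumptions,
# not the AgriSchemas/KnetMiner model described above.
import json

gene_record = {
    "@context": "https://schema.org",
    "@type": "Gene",                      # schema.org Gene type, as profiled by Bioschemas
    "@id": "https://example.org/gene/EXAMPLE_GENE_1",  # hypothetical IRI
    "identifier": "EXAMPLE_GENE_1",
    "name": "example wheat gene",
    "subjectOf": {                        # link from the gene to a publication
        "@type": "ScholarlyArticle",
        "identifier": "doi:10.1000/example",
    },
}

print(json.dumps(gene_record, indent=2))
```

The design point is simply that once such records share common types and properties, data from KnetMiner, DFW or any other source can be crawled and merged by generic tooling, which is what the FAIR interoperability principle asks for.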
The document discusses knowledge management of experimental data through the ISA ecosystem. It describes the ISA-tab format and software suite that allows annotation and curation of experimental metadata. As a use case, it analyzes a dataset on metabolite profiling from a study of fatty acid amide hydrolase knockout mice. The ISA tools can represent investigations and assays, convert data to standardized formats, and facilitate sharing and analysis of experimental data.
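For readers unfamiliar with ISA-Tab, the following is a simplified sketch of the tab-separated study-sample table at the heart of the format; the column headers shown are standard ISA-Tab conventions, but the values are invented and the real FAAH knockout dataset is far richer:

```python
# Writes a tiny, simplified study-sample table in the tab-separated style used by
# ISA-Tab; a real ISA archive also carries investigation and assay files and many
# more columns. Values below are invented for illustration.
import csv

rows = [
    ["Source Name", "Characteristics[organism]", "Characteristics[genotype]",
     "Protocol REF", "Sample Name"],
    ["mouse_01", "Mus musculus", "FAAH knockout", "sample collection", "liver_01"],
    ["mouse_02", "Mus musculus", "wild type", "sample collection", "liver_02"],
]

with open("s_example_study.txt", "w", newline="") as handle:
    csv.writer(handle, delimiter="\t").writerows(rows)
```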
Bioinformatics is an interdisciplinary field that merges biology, computer science, and information technology. It is applied in areas like genomics, proteomics, and systems biology. While some basic analysis can be done through user-friendly tools, truly customized work requires programming skills and an understanding of underlying algorithms. Bioinformatics is not just a service field but rather involves scientific experimentation throughout the entire analysis process from experimental design to evaluation. It is a dedicated field of research in its own right, not a quick or interchangeable task.
This document provides an overview of bioinformatics and highlights several key points:
- Bioinformatics has emerged as a field to help analyze the vast amounts of biological data being generated through high-throughput technologies. It integrates biology, computer science, and information technology.
- The size of the human genome and rate of data generation has grown exponentially, necessitating computational approaches. International efforts like the Human Genome Project helped sequence the entire human genome.
- Bioinformatics tools and databases are used to study genomics, transcriptomics, proteomics and more to better understand living systems at the molecular level and enable applications in medicine, agriculture, forensics and more. This work also raises ethical, legal and social considerations.
B.Sc Biochem I BOBI U-1 Introduction to Bioinformatics - Rai University
This document provides an introduction to the field of bioinformatics. It defines bioinformatics as using computer science and software tools to store, retrieve, organize and analyze biological data. The history of bioinformatics began in the 1970s with early work to create protein sequence databases. Today, bioinformatics has many applications including drug design, DNA analysis, and agricultural biotechnology. It also covers several key areas including genomics, proteomics, and systems biology. Necessary skills for bioinformatics include knowledge of molecular biology, mathematics, programming, and computer proficiency.
Genomic epidemiology uses whole genome sequencing data from pathogens combined with epidemiological investigations to track the spread of infectious diseases. The document discusses making genomic epidemiology a widespread reality in public health. It outlines key requirements including building a user-friendly analysis platform, developing portable analysis pipelines, providing training to public health personnel, and improving information sharing between organizations.
This document provides an introduction to the field of bioinformatics. It defines bioinformatics as a branch of science that uses computer technology to analyze and integrate biological information that can be applied to gene-based drug discoveries. It discusses the emergence of bioinformatics due to the desire to understand how genetic structure affects traits. It also outlines some common applications of bioinformatics like drug design, gene therapy, and microbial genomic analysis. Finally, it provides examples of some bioinformatics tools, databases, and centers in India.
This document is a resume for Gautam Machiraju. It summarizes his education and research experience. He has a B.A. in Applied Mathematics from UC Berkeley with a concentration in Mathematical Biology and a minor in Bioengineering. He has worked on several research projects involving mathematical modeling and data analysis related to biology and healthcare. These include modeling cancer biomarker shedding kinetics, mining literature for biomarker data, and using deep learning on patient time-series data. He has strong skills in programming, mathematics, and bioinformatics.
This document is a resume for Gautam Machiraju. It summarizes his education and research experience. He has a B.A. in Applied Mathematics from UC Berkeley with a concentration in Mathematical Biology and a minor in Bioengineering. He has worked on several research projects involving mathematical modeling and data analysis related to cancer biomarkers, genomics, and proteomics. His skills include programming, mathematics, data science, and laboratory techniques. He is currently a bioinformatics research assistant at Stanford University School of Medicine.
The EMBL-European Bioinformatics Institute (EBI) is a large bioinformatics research and services institute located in Hinxton, UK. It is part of the European Molecular Biology Laboratory and houses massive biological databases and bioinformatics software tools that are freely available to researchers. Key goals of EBI include building and maintaining biological databases, making data widely accessible, and conducting bioinformatics research to advance biology. EBI coordinates data collection and dissemination internationally and houses over 500 staff from diverse backgrounds.
The document discusses a knowledge management platform developed at Genentech to manage pre-clinical animal studies.
The platform, called DIVOS, manages over 12,000 in vivo studies dating back to 1998 across multiple therapeutic areas, conducted both in-house and by CROs. The technical approach involved developing a structured yet flexible platform to capture study details while enabling data reuse. People were key to the change, with various teams guiding strategy, tactics and operations. New capabilities such as improved study logistics, enhanced collaboration and new insights from data analysis provide evidence of the platform's success.
AB3ACBS 2016: EMBL Australia Bioinformatics Resource - Philippa Griffin
The EMBL Australian Bioinformatics Resource (EMBL-ABR) is a distributed national research infrastructure that provides bioinformatics support to life science researchers in Australia. It has a hub-and-nodes structure with the hub hosted at the Victorian Life Science Computation Initiative at the University of Melbourne and 10 nodes located across Australian institutions. EMBL-ABR aims to increase Australia's capacity for bioinformatics research and data science, provide training in bioinformatics, and enable participation in international collaborations.
Free webinar - Introduction to bioinformatics - biologist-1 - Elia Brodsky
The Omics Logic Introduction to Bioinformatics program is a one-month online training program that provides an introduction to the field of bioinformatics for beginners. The program consists of six sessions taught by an international team of experts, covering topics like genomics, transcriptomics, statistical analysis, machine learning, and a final bioinformatics project. Participants will learn data analysis skills in Python and R and how to extract insights from multi-omics datasets with applications in biomedicine. The goal is to prepare students for data-driven research in life sciences through interactive lessons, coding exercises, and independent projects.
CINECA webinar slides: Modular and reproducible workflows for federated molec... - CINECAProject
Genetic analysis of molecular traits such as gene expression, splicing and chromatin accessibility requires a number of complex analysis steps that can easily take weeks or months for an analyst to implement from scratch. In the CINECA project, we have developed a number of modular Nextflow workflows that standardise and automate these steps. In this webinar, we will give an overview of the CINECA workflows for genotype imputation, gene expression and splicing quantification, data normalisation and association testing, and demonstrate how these workflows can be used in a federated setting without transferring identifiable personal data between partners.
The CINECA webinar series aims to discuss ways to address common challenges and share best practices in the field of cohort data analysis, as well as to distribute CINECA project results. All CINECA webinars include an audience Q&A session during which attendees can ask questions and make suggestions. Please note that all webinars are recorded and available for later viewing.
This webinar took place on 10th November 2020 and is part of the CINECA webinar series.
For previous and upcoming CINECA webinars see:
https://www.cineca-project.eu/webinars
1) Big data standards are needed to make data understandable, reusable, and shareable across different databases and domains.
2) Effective standards require reporting sufficient experimental details and context in both human-readable and machine-readable formats (a small example follows this list).
3) Developing standards is a collaborative process involving different stakeholder groups to define requirements, vocabularies, and data models through both formal standards bodies and grassroots organizations.
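A minimal sketch of what point 2 means in practice, with invented field names and values: the same experimental context written once for humans and once for machines:

```python
# The same experimental context twice: once for humans, once for machines.
# Field names and values here are invented for illustration only.
import json

human_readable = (
    "RNA-seq of wheat leaf tissue, drought-stressed for 7 days, "
    "three biological replicates, sequenced on an Illumina instrument."
)

machine_readable = {
    "assay_type": "RNA-seq",
    "organism": "Triticum aestivum",
    "tissue": "leaf",
    "treatment": {"factor": "drought", "duration_days": 7},
    "biological_replicates": 3,
    "platform": "Illumina",
}

print(human_readable)
print(json.dumps(machine_readable, indent=2))
```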
This document summarizes research on non-model ascidian species Molgula occulta and Molgula oculata. An international collaboration generated transcriptome data, sequenced the genomes of three Molgula species, and examined gene expression patterns related to tail development. Analysis revealed heterochronic shifts in developmental timing between tailed and tailless species. The data resources enabled further study of evolutionary shifts in gene regulatory networks underlying conserved developmental processes. The document emphasizes the importance of methods development for large-scale data analysis to enable new biological insights.
Scott Edmunds: GigaScience - a journal or a database? Lessons learned from th... - GigaScience, BGI Hong Kong
Scott Edmunds' talk at the HUPO congress in Geneva, 6 September 2011, on GigaScience - a journal or a database? Lessons learned from the Genomics Tsunami.
This document provides an overview of next generation sequencing (NGS) analysis. It discusses various NGS platforms such as Illumina, Roche 454, PacBio, and Ion Torrent. It also covers common file formats for sequencing data like FASTQ, quality control measures to assess data quality, and applications of NGS such as RNA-seq and ChIP-seq. The document aims to introduce researchers to basic concepts in NGS analysis and highlights available resources for storing and analyzing large sequencing datasets.
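As a concrete illustration of the FASTQ format and one simple quality measure mentioned above (the record is made up; real QC would normally use a dedicated tool such as FastQC):

```python
# Parse one FASTQ record and compute its mean Phred quality score.
# The record below is made up; real data would be read from a .fastq file.
fastq_text = """@read_1
ACGTACGTGGCA
+
IIIIHHHGGFFE
"""

lines = fastq_text.strip().splitlines()
header, seq, _, qual = lines[0], lines[1], lines[2], lines[3]

# Phred+33 encoding: quality = ASCII code of the character minus 33.
phred = [ord(ch) - 33 for ch in qual]
print(header, "length:", len(seq), "mean Q:", sum(phred) / len(phred))
```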
EiTESAL eHealth Conference 14 & 15 May 2017 - EITESANGO
This document discusses bioinformatics and some of its key concepts and tools. It begins with definitions of bioinformatics as the intersection of biology, computer science, and information technology. It then discusses some of the data formats, tools, and skills used in bioinformatics, including working with nucleotide sequence data, translating sequences into amino acids, and analyzing large datasets. It also summarizes how ontologies are used to represent concepts and how various data types are organized and stored in databases for analysis.
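For example, translating a nucleotide sequence into amino acids, one of the routine tasks mentioned above, is a short exercise with Biopython (assuming Biopython is installed; the sequence is a made-up example):

```python
# Translate a made-up coding sequence to protein using the standard genetic code.
from Bio.Seq import Seq

cds = Seq("ATGGCCATTGTAATGGGCCGCTGAAAGGGTGCCCGATAG")
protein = cds.translate(to_stop=True)  # stop at the first stop codon
print(protein)  # MAIVMGR
```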
Taverna is a free and open-source workflow management system that allows researchers to design and execute scientific workflows. It was developed by the University of Manchester to support in silico experiments in biology. Taverna provides a graphical user interface for designing workflows using a variety of distributed data sources and web services without having to learn complex programming. It has been widely adopted by researchers in fields such as biology, healthcare, astronomy, and cheminformatics to automate analysis pipelines and share workflows.
Will Biomedical Research Fundamentally Change in the Era of Big Data? - Philip Bourne
This document discusses how biomedical research may fundamentally change in the era of big data. It notes that biomedical research has always been data-driven, but the scope, variety, complexity and volume of data is now much greater. It also discusses the need for more open data sharing and new tools and methods for large-scale analysis. The document suggests biomedical research may move towards a more collaborative "platform" model, as seen with companies like Airbnb, with the goal of improving data access, reuse and reproducibility of research. However, overcoming challenges like incentives, trust and work practices will be important for any new platform to succeed.
Big Data Infrastructure for Translational Research discusses challenges in building big data infrastructure for translational research. It defines big data as data so large and complex that it is difficult to process with typical tools. Big data comes from various sources such as mobile devices, sensors and clinical monitors. Scaling data acquisition from the patient bedside to the institution is discussed. Tools used include databases, scripting languages, statistical packages and visualisation software. Challenges include data capture, curation, storage, sharing and analysis. A multidisciplinary team approach is advocated to tackle big data challenges in translational medicine.
Similar to "Supporting researchers in the molecular life sciences" (Jeff Christiansen):
Presentation by Dr Steve McEachern, ADA, to the 'Unlocking value from publicly funded Clinical Research Data' workshop, cohosted by ARDC and CSIRO at ANU on 6 March 2019.
Presentation by Hugo Leroux and Liming Zhu, CSIRO, to the 'Unlocking value from publicly funded Clinical Research Data' workshop, cohosted by ARDC and CSIRO at ANU on 6 March 2019.
The document summarizes plans by the Australian Government to establish new legislation and institutions to streamline access to and use of public sector data. Key points include:
- A new Commonwealth Data Sharing and Release Act will be introduced in 2019 to provide consistent rules for sharing data and establish a National Data Commissioner to oversee the system.
- The National Data Commissioner will ensure transparency, accountability, security, and appropriate risk management in data sharing.
- New rules will focus on enabling data to be shared for purposes like research and policy-making, while protecting privacy and building public trust in data use.
- The government will continue consulting stakeholders on the legislation to address concerns and help the public understand the reforms.
Presentation by Prof Chris Rowe, ADNet, to the 'Unlocking value from publicly funded Clinical Research Data' workshop, cohosted by ARDC and CSIRO at ANU on 6 March 2019.
Investigator-initiated clinical trials: a community perspective - ARDC
Presentation by Miranda Cumpston, ACTA, to the 'Unlocking value from publicly funded Clinical Research Data' workshop, cohosted by ARDC and CSIRO at ANU on 6 March 2019.
Presentation by Dr Merran Smith, PHRN, to the 'Unlocking value from publicly funded Clinical Research Data' workshop, cohosted by ARDC and CSIRO at ANU on 6 March 2019.
International perspective for sharing publicly funded medical research data - ARDC
Presentation by Olivier Salvado, CSIRO, to the 'Unlocking value from publicly funded Clinical Research Data' workshop, cohosted by ARDC and CSIRO at ANU on 6 March 2019.
Presentation by Prof Lisa Askie, ANZCTR, to the 'Unlocking value from publicly funded Clinical Research Data' workshop, cohosted by ARDC and CSIRO at ANU on 6 March 2019.
Presentation by Dr Davina Ghersi, NHMRC, to the 'Unlocking value from publicly funded Clinical Research Data' workshop, cohosted by ARDC and CSIRO at ANU on 6 March 2019.
Presentation by Dr Adrian Burton, ARDC, to the 'Unlocking value from publicly funded Clinical Research Data' workshop, cohosted by ARDC and CSIRO at ANU on 6 March 2019.
FAIR for the future: embracing all things data - ARDC
FAIR for the future: embracing all things data - Natasha Simons, Keith Russell and Liz Stokes, presented at Taylor & Francis Scholarly Summits in Sydney 11 Feb 2019 and Melbourne 14 Feb 2019.
How to make your data count webinar, 26 Nov 2018 - ARDC
This document outlines the Make Data Count (MDC) initiative to standardize and promote the tracking of research data usage metrics. MDC has developed a Code of Practice for data usage logs, built an open hub to aggregate standardized usage data, and implemented tracking and display of usage metrics at their own repositories. They encourage other repositories to follow five simple steps to Make Their Data Count: 1) Read the Code of Practice, 2) Process usage logs, 3) Send logs to the hub, 4) Pull usage metrics from the hub, and 5) Display metrics. Future work includes outreach, iteration on implementations, and expanding metrics beyond DOIs.
Supporting researchers in the molecular life sciences - Jeff Christiansen
1. Supporting Researchers in the Molecular Life Sciences
Jeff Christiansen
UQ RCC Health and Life Sciences Program Manager
QCIF Health and Life Sciences Program Manager
EMBL-ABR Key Areas Coordinator
2.
3. The central dogma of biology
DNA > mRNA > protein (folding; large molecules) > metabolites (small molecules, via enzymatic catalysis)
Cell type 1 vs cell type 2: same genes but different mRNAs, proteins and metabolites (and at different levels)
Traditionally, researchers would focus on a small number of genes/proteins etc. due to technical constraints
4. Global biomolecular profiling: the data explosion
DNA (genomics): 20,005 'protein coding' genes
RNA (transcriptomics): ~200,000(?) transcripts; abundance?
Protein (proteomics): 16,518 identified; abundance?
Metabolites (metabolomics): >24,597 compounds; abundance?
Sources: https://www.ebi.ac.uk/metabolights/reference and https://hupo.org/HPP-Q&A
5. The data explosion: challenges
• Data storage
• non-complex organisms (bacteria): 12 GB raw data / sample (genomic, transcriptomic, proteomic, metabolomic)
• globally, est. 100 PB used by the 20 largest institutions for genomic storage alone [1]
• Tools
• to convert data from raw > processed
• for comparative analyses on processed data (e.g. genome v. genome, transcriptome v. proteome)
• documenting methods (i.e. tool use – versions used, workflows applied; see the sketch after this slide)
• Compute
• resource-intensive (e.g. a single human : mouse genome alignment consumes ~100 CPU hrs)
• Data management
• context surrounding the specimen (e.g. healthy vs diseased) and experiment
• context surrounding the data itself (provenance, state {raw, processed}, formats, etc.)
• managing sharing within research team
• data publishing at project end to international repositories
• Skills development
• enabling biologists to utilise bioinformatics approaches (expert [cmd line] > novice [GUI])
• enabling biologists to use storage, tools, compute and data management effectively
[1] Stephens et al (2015) Astronomical or Genomical? PLOS Biology https://doi.org/10.1371/journal.pbio.1002195
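One small, concrete way to address the 'documenting methods' point above is to record the exact versions of the command-line tools used in each run. A hedged sketch (the tool names are examples; a workflow manager would normally capture this automatically):

```python
# Capture the versions of the command-line tools used in an analysis run,
# so the methods can be documented and reproduced later.
# Tool names are examples; any of them may be absent on a given system.
import json
import shutil
import subprocess

tools = ["samtools", "bcftools", "fastqc"]
versions = {}
for tool in tools:
    if shutil.which(tool) is None:
        versions[tool] = "not installed"
        continue
    result = subprocess.run([tool, "--version"], capture_output=True, text=True)
    # These tools print their version on the first line of stdout (or stderr).
    output = (result.stdout or result.stderr).strip()
    versions[tool] = output.splitlines()[0] if output else "unknown"

with open("tool_versions.json", "w") as handle:
    json.dump(versions, handle, indent=2)
```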
6. Unmet Needs for Analysing Biological Big Data: A Survey of 704 NSF Principal Investigators
Barone L, Williams J, Micklos D; BioRxiv (2017)
90% indicated they are currently or will soon be analysing large digital datasets
[Bar chart: percent responding negatively (318 ≤ n ≤ 510) for each of the following needs]
• Training on integration of multiple data types
• Training on data management and metadata
• Training on scaling analysis to cloud/HPC
• Multi-step analysis workflows or pipelines
• Cloud computing
• Search for data & discover relevant datasets
• Support for bioinformatics and analysis
• Publish data to the community
• Updated analysis software
• Share data with colleagues
• Training on basic computing and scripting
• Sufficient data storage
• High-performance computing
7. Australian needs
[Survey charts: 'Biggest bioinformatics difficulty' and 'The most useful'; 2013, N=210]
https://www.embl-abr.org.au/news/braembl-community-survey-report-2013/
12. Organise training material and events around research-relevant tasks, not the tools themselves
Training in how to perform tasks is required
13. Organise training material and events around research-relevant tasks, not the tools themselves
Training in how to perform tasks is required
Genome Annotation using Apollo
15. Involve a wide variety of users in usability testing
Building more intuitive tools is imperative
16. Involve a wide variety of users in usability testing
Building more intuitive tools is imperative
14 users (novice to expert bioinformaticians, student to CI)
5 tests (representing broad task types)
47 usability issues found – 38 addressed
17. Build/provide functionality that supports users with differing informatics skill levels
Building more intuitive tools is imperative
18. Build/provide functionality that supports users with differing informatics skill levels
Building more intuitive tools is imperative
20. Australia is geographically challenging:
leverage technology, international and local expertise to help
deliver training to a wider audience
Genome Annotation using Apollo
Dr Monica Muñoz-Torres
Project Lead, Apollo Project, Berkeley
21. Australia is geographically challenging:
leverage technology, international and local expertise to help
deliver training to a wider audience
Genome Annotation using Apollo
9 EMBL-ABR Nodes, 92 registrants
QLD: QCIF, JCU (TSV+CNS)
NSW: UNSW, SCU
VIC: Monash, UniMelb
SA: UniAdel
TAS: UTas
22. Australia is geographically challenging:
leverage technology, international and local expertise to help
deliver training to a wider audience
Genome Annotation using Apollo