A basic overview of considerations for designing genomics experiments using Next Generation Sequencing (NGS). Includes a discussion of power, accuracy, what samples to collect, and what sequencing parameters to use.
This document provides an overview of plasmids. It defines plasmids as small, circular, extrachromosomal DNA molecules that can replicate independently in bacteria. Plasmids contain genes that provide benefits to bacteria like antibiotic resistance. They are transferred between bacteria through processes like transformation, transduction, and conjugation. Plasmids are classified based on their functions and are important tools in biotechnology as they allow cloning, protein production, and other applications.
There is the fifth video by Miss Aymen Arif Sindh Biotechnologist Association has taken initiative for all young scientists, researchers, and students to have the platform to show their talent and interest in different activities.
Topic: Plasmids and its types
Presentation by: Aymen Arif
Research Officer at Halal Food and testing Laboratory,
Industrial Analytical Center, H.E.J (ICCBS).
Youtube: https://www.youtube.com/watch?v=-spdnc-2z6Q
Plasmids can transfer genetic material between bacterial cells through a process called conjugation. Joshua Lederberg and Esther Lederberg first observed this process in 1946 when they found that mixing strains of E. coli resulted in new strains with combined genetic traits. Conjugation involves direct contact between a donor and recipient cell and allows for horizontal gene transfer of plasmids and genes they carry. Key genes called tra genes encode machinery for the process, including formation of a pilus for cell attachment and a channel for DNA transfer. Self-transmissible plasmids can transfer themselves while mobilizable plasmids require the proteins from a conjugative plasmid to be transferred. Conjugation plays a major role in spreading antibiotic resistance genes
DNA sequencing refers to determining the order of nucleotide bases in a DNA molecule. Frederick Sanger developed the chain termination/dideoxy method in 1977, which uses DNA polymerase and dideoxynucleotides to generate fragmented DNA strands of different lengths. Pyrosequencing was later developed, detecting light pulses from nucleotide incorporation without electrophoresis. Both methods have advanced biological research and applications in medicine, biotechnology and forensics.
The document summarizes DNA sequencing methods. It discusses the DNA double helix structure and how the four nitrogenous bases form complementary pairs between strands. It then describes the two main historical DNA sequencing methods: the Maxam-Gilbert method which uses chemical degradation, and the Sanger method which is based on chain termination using dideoxynucleotides. The Sanger method is now the most common approach and involves sequencing in four separate reactions with one of the four ddNTPs added to each.
This document discusses DNA sequencing methods and their history and applications. It covers first generation sequencing methods like Sanger sequencing and Maxam-Gilbert sequencing. It also covers next generation sequencing (NGS) methods like 454 pyrosequencing, Illumina sequencing, and Ion Torrent semiconductor sequencing. NGS allows high-throughput, massively parallel sequencing of DNA fragments. Template preparation for NGS involves fragmenting DNA, attaching fragments to beads, and emulsion PCR. The document provides details on the chemistry and detection methods used for different sequencing platforms.
The document discusses Ion Torrent semiconductor sequencing. It begins by providing background on first and next generation sequencing. It then describes Ion Torrent sequencing, noting that it detects pH changes from nucleotide incorporation rather than using modified nucleotides or optics. The principle, procedure involving fragmentation, ligation, amplification and pH detection on a CMOS chip, applications in genetics and medicine, advantages of speed and lower cost, and challenges including high cost per nucleotide and analysis complexity are summarized.
This document provides an overview of plasmids. It defines plasmids as small, circular, extrachromosomal DNA molecules that can replicate independently in bacteria. Plasmids contain genes that provide benefits to bacteria like antibiotic resistance. They are transferred between bacteria through processes like transformation, transduction, and conjugation. Plasmids are classified based on their functions and are important tools in biotechnology as they allow cloning, protein production, and other applications.
There is the fifth video by Miss Aymen Arif Sindh Biotechnologist Association has taken initiative for all young scientists, researchers, and students to have the platform to show their talent and interest in different activities.
Topic: Plasmids and its types
Presentation by: Aymen Arif
Research Officer at Halal Food and testing Laboratory,
Industrial Analytical Center, H.E.J (ICCBS).
Youtube: https://www.youtube.com/watch?v=-spdnc-2z6Q
Plasmids can transfer genetic material between bacterial cells through a process called conjugation. Joshua Lederberg and Esther Lederberg first observed this process in 1946 when they found that mixing strains of E. coli resulted in new strains with combined genetic traits. Conjugation involves direct contact between a donor and recipient cell and allows for horizontal gene transfer of plasmids and genes they carry. Key genes called tra genes encode machinery for the process, including formation of a pilus for cell attachment and a channel for DNA transfer. Self-transmissible plasmids can transfer themselves while mobilizable plasmids require the proteins from a conjugative plasmid to be transferred. Conjugation plays a major role in spreading antibiotic resistance genes
DNA sequencing refers to determining the order of nucleotide bases in a DNA molecule. Frederick Sanger developed the chain termination/dideoxy method in 1977, which uses DNA polymerase and dideoxynucleotides to generate fragmented DNA strands of different lengths. Pyrosequencing was later developed, detecting light pulses from nucleotide incorporation without electrophoresis. Both methods have advanced biological research and applications in medicine, biotechnology and forensics.
The document summarizes DNA sequencing methods. It discusses the DNA double helix structure and how the four nitrogenous bases form complementary pairs between strands. It then describes the two main historical DNA sequencing methods: the Maxam-Gilbert method which uses chemical degradation, and the Sanger method which is based on chain termination using dideoxynucleotides. The Sanger method is now the most common approach and involves sequencing in four separate reactions with one of the four ddNTPs added to each.
This document discusses DNA sequencing methods and their history and applications. It covers first generation sequencing methods like Sanger sequencing and Maxam-Gilbert sequencing. It also covers next generation sequencing (NGS) methods like 454 pyrosequencing, Illumina sequencing, and Ion Torrent semiconductor sequencing. NGS allows high-throughput, massively parallel sequencing of DNA fragments. Template preparation for NGS involves fragmenting DNA, attaching fragments to beads, and emulsion PCR. The document provides details on the chemistry and detection methods used for different sequencing platforms.
The document discusses Ion Torrent semiconductor sequencing. It begins by providing background on first and next generation sequencing. It then describes Ion Torrent sequencing, noting that it detects pH changes from nucleotide incorporation rather than using modified nucleotides or optics. The principle, procedure involving fragmentation, ligation, amplification and pH detection on a CMOS chip, applications in genetics and medicine, advantages of speed and lower cost, and challenges including high cost per nucleotide and analysis complexity are summarized.
The document discusses various methods for screening and selecting recombinant cells. Direct selection methods include antibiotic resistance screening and blue-white color screening. Indirect selection methods include screening by nucleic acid hybridization, colony hybridization, immunological assays, and detecting protein/enzyme activity. These screening methods allow identification of recombinant cells that contain the gene of interest from a mixture of transformed cells.
Radial immunodiffusion (RID) or Mancini method, Mancini immunodiffusion or single radial immunodiffusion (SRID) assay is an immunodiffusion technique used in immunology to determine the quantity or concentration of an antigen in a sample
Basics of Undergraduate/university fellows
RNA TRANSPOSABLE ELEMENTS (COPIA) IN Drosophila
within host genomes.
As TEs comprise more than 40% of the human genome and are linked to
numerous diseases, understanding their mechanisms of mobilization and
regulation is important.
Drosophila melanogaster is an ideal model organism for the study of eukaryotic
TEs as its genome contains a diverse array of active TEs.
Also referred to as “jumping genes,” TEs move, or transpose, to different locations
throughout the genomes in which they reside.
This document provides information about monoclonal antibodies including their production, types, and applications. It discusses how monoclonal antibodies are produced using the hybridoma technique which fuses antibody producing B cells with myeloma cells. This results in immortal cell lines that produce identical antibodies targeting a single epitope. The document contrasts monoclonal and polyclonal antibodies and describes the evolution of monoclonal antibodies from murine to humanized and human forms to reduce immunogenicity. It also covers monoclonal antibody nomenclature, pharmacokinetics, adverse effects, and therapeutic uses.
Nucleic acids have potential as therapeutic agents by inhibiting gene expression through various mechanisms. Plasmids can introduce genetic material into cells to produce therapeutic proteins. Oligonucleotides can be used for antisense or antigene applications to selectively block expression of disease-causing proteins. Aptamers and DNAzymes can directly interact with and interfere with proteins implicated in disease. RNA-based therapeutics like RNA aptamers, decoys, antisense RNA, ribozymes, and siRNAs can also inhibit gene expression. MicroRNAs naturally downregulate gene expression and have potential therapeutic applications. Gene and stem cell therapies aim to treat genetic disorders and repair damaged tissue.
Phage display technology allows the display of proteins or peptides on the outside of bacteriophages while encoding the corresponding gene on the inside. This allows for large libraries to be screened in vitro to select for interactions between the displayed molecules and a target. The most common phages used are filamentous phages like M13, which can be genetically manipulated to display proteins of interest. The technique involves inserting a gene into the phage coat protein gene, infecting bacteria to produce phage particles displaying the protein, and panning against a target to isolate interacting proteins.
Illumina (sequencing by synthesis) methodFekaduKorsa
The document outlines the Illumina sequencing by synthesis method in 12 steps: 1) DNA sample preparation and attachment to a flow cell, 2) bridge amplification to clone the DNA fragments, 3) determination of the first base by addition of fluorescently labeled nucleotides and imaging, 4) repeating the process to determine the second and subsequent bases, 5) generating sequenced reads. The sequenced reads are then 6) aligned and analyzed by comparing to a reference sequence to identify differences.
Fundamental techniques of gene manipulationManigandan s
Restriction enzymes are bacterial enzymes that cut DNA at specific nucleotide sequences. They were discovered in the 1960s and have become important tools in molecular biology. Restriction enzymes recognize short palindromic sequences and make cuts within or near these sequences. There are three main types of restriction enzymes that differ in their subunit structure and cleavage patterns. Restriction enzymes are used in techniques like DNA cloning, Southern blotting, and genome mapping.
Yeast artificial chromosomes (YACs) are engineered yeast chromosomes containing megabase-sized fragments of foreign DNA. YACs are created by ligating bacterial plasmid DNA containing yeast centromeres, telomeres, and replication origins to fragments of genomic DNA. This creates an artificial yeast chromosome capable of replicating inside yeast cells and accommodating very large DNA inserts of up to 2 megabases. YACs can be used to generate whole genome libraries for mapping genes and identifying chromosomal sequences important for constructing mammalian artificial chromosomes. However, YACs are less stable than bacterial artificial chromosomes and prone to rearrangements due to their large insert sizes.
The document discusses plasmids, which are small DNA molecules found in bacteria that can replicate independently of the bacterial chromosome. Plasmids contain genes and can be transferred between bacteria. They have various functions like carrying antibiotic resistance genes or virulence genes. Plasmids are commonly used as cloning vectors in genetic engineering to make copies of genes and produce proteins of interest.
Phage display is a technique for displaying antibody fragments or proteins on the surface of bacteriophages. It allows the rapid selection of antibodies that bind to a target antigen from large libraries of antibody variants. The key steps involve constructing phage libraries that display antibody fragments, panning the libraries against the target antigen to select for binding phages, amplifying these phages, and then identifying antibodies that bind to the antigen through sequencing and testing. Phage display has applications in diagnostics, research, and therapeutics, producing antibodies that can detect antigens, be used as research tools, or developed as treatments for diseases like cancer.
Homologous genes are genes that have descended from a common ancestral gene. There are two main types of homologous genes:
1. Orthologous genes are homologous genes in different species that arose due to speciation. For example, the human and mouse eyeless genes are orthologs that descended from the eyeless gene in their last common ancestor.
2. Paralogous genes are homologous genes within the same species that arose due to a gene duplication event. For example, the fruit fly eyeless and twin of eyeless genes are paralogs that descended from a duplication of the eyeless gene in a fruit fly ancestor.
Homologous genes can differ in their sequences due
Benzer conducted experiments to study intragenic recombination within the rII gene of bacteriophage T4. He isolated rare recombinants that arose from exchanges between DNA of co-infecting phages. This allowed him to map mutations within the rII gene at high resolution. He determined that rII mutations occurred in two complementation groups, rIIA and rIIB, representing two different genes. Through deletion mapping, Benzer was able to localize many rII mutations to a short region within gene A or B. His work established the concept of the cistron as the smallest genetic unit.
introduction,history scope and applications of
relation to other fields , bioinformatics,biological databases,computers internet,sequence development, and
introduction to sequence development and alignment
Bidirectional and rolling circular dna replicationGayathri91098
The document discusses bidirectional and rolling circular DNA replication. It begins by explaining that DNA replication is semi-conservative, with each new DNA molecule composed of one original strand and its complement. It then describes how in bidirectional replication, replication forks move in opposite directions from a single origin to copy both strands simultaneously. Rolling circular replication involves the continuous, unidirectional synthesis of multiple copies of a circular DNA or RNA genome through the sequential steps of initiation by introducing a nick, elongation by polymerase moving in a circle, and termination by cleaving the replicated strand.
Transfection involves introducing foreign DNA into host cells to produce a new phenotype. There are two main methods of transfection - vector-mediated and non-vector mediated. Vector-mediated transfection uses bacteriophage, retroviral, cosmid, baculovirus, and plasmid vectors to introduce DNA. Non-vector mediated methods include direct techniques like microinjection, electroporation, and particle bombardment, and indirect techniques like calcium phosphate precipitation and DEAE-dextran. Retroviral vectors are modified retroviruses that can introduce foreign DNA into host chromosomal DNA. Microinjection involves injecting DNA directly into cells using a micropipette under a microscope. Electroporation uses electric pulses to create temporary
Conjugation is the transfer of genetic material between bacterial cells that are in direct contact. Joshua Lederberg and Edward Tatum discovered conjugation in 1946 through an experiment where they mixed two auxotrophic E. coli strains that could complement each other's mutations. Bernard Davis provided evidence in 1950 that direct cell contact was required by showing that genetic transfer did not occur when the two strains were separated by a filter. During conjugation, a pilus forms to attach the donor and recipient cells and allows single-stranded DNA to pass between them to make both cells viable donors that can now transfer the genetic material.
Overview of methods for variant calling from next-generation sequence dataThomas Keane
This document provides an overview of methods for variant calling from next-generation sequencing data. It discusses data formats and workflows, including SNP calling, short indels, and structural variation. The document describes alignment, BAM improvement through realignment and base quality recalibration, library merging, and duplicate removal. It also reviews software tools for these processes and introduces the variant call format (VCF) standard.
The document discusses various methods for screening and selecting recombinant cells. Direct selection methods include antibiotic resistance screening and blue-white color screening. Indirect selection methods include screening by nucleic acid hybridization, colony hybridization, immunological assays, and detecting protein/enzyme activity. These screening methods allow identification of recombinant cells that contain the gene of interest from a mixture of transformed cells.
Radial immunodiffusion (RID) or Mancini method, Mancini immunodiffusion or single radial immunodiffusion (SRID) assay is an immunodiffusion technique used in immunology to determine the quantity or concentration of an antigen in a sample
Basics of Undergraduate/university fellows
RNA TRANSPOSABLE ELEMENTS (COPIA) IN Drosophila
within host genomes.
As TEs comprise more than 40% of the human genome and are linked to
numerous diseases, understanding their mechanisms of mobilization and
regulation is important.
Drosophila melanogaster is an ideal model organism for the study of eukaryotic
TEs as its genome contains a diverse array of active TEs.
Also referred to as “jumping genes,” TEs move, or transpose, to different locations
throughout the genomes in which they reside.
This document provides information about monoclonal antibodies including their production, types, and applications. It discusses how monoclonal antibodies are produced using the hybridoma technique which fuses antibody producing B cells with myeloma cells. This results in immortal cell lines that produce identical antibodies targeting a single epitope. The document contrasts monoclonal and polyclonal antibodies and describes the evolution of monoclonal antibodies from murine to humanized and human forms to reduce immunogenicity. It also covers monoclonal antibody nomenclature, pharmacokinetics, adverse effects, and therapeutic uses.
Nucleic acids have potential as therapeutic agents by inhibiting gene expression through various mechanisms. Plasmids can introduce genetic material into cells to produce therapeutic proteins. Oligonucleotides can be used for antisense or antigene applications to selectively block expression of disease-causing proteins. Aptamers and DNAzymes can directly interact with and interfere with proteins implicated in disease. RNA-based therapeutics like RNA aptamers, decoys, antisense RNA, ribozymes, and siRNAs can also inhibit gene expression. MicroRNAs naturally downregulate gene expression and have potential therapeutic applications. Gene and stem cell therapies aim to treat genetic disorders and repair damaged tissue.
Phage display technology allows the display of proteins or peptides on the outside of bacteriophages while encoding the corresponding gene on the inside. This allows for large libraries to be screened in vitro to select for interactions between the displayed molecules and a target. The most common phages used are filamentous phages like M13, which can be genetically manipulated to display proteins of interest. The technique involves inserting a gene into the phage coat protein gene, infecting bacteria to produce phage particles displaying the protein, and panning against a target to isolate interacting proteins.
Illumina (sequencing by synthesis) methodFekaduKorsa
The document outlines the Illumina sequencing by synthesis method in 12 steps: 1) DNA sample preparation and attachment to a flow cell, 2) bridge amplification to clone the DNA fragments, 3) determination of the first base by addition of fluorescently labeled nucleotides and imaging, 4) repeating the process to determine the second and subsequent bases, 5) generating sequenced reads. The sequenced reads are then 6) aligned and analyzed by comparing to a reference sequence to identify differences.
Fundamental techniques of gene manipulationManigandan s
Restriction enzymes are bacterial enzymes that cut DNA at specific nucleotide sequences. They were discovered in the 1960s and have become important tools in molecular biology. Restriction enzymes recognize short palindromic sequences and make cuts within or near these sequences. There are three main types of restriction enzymes that differ in their subunit structure and cleavage patterns. Restriction enzymes are used in techniques like DNA cloning, Southern blotting, and genome mapping.
Yeast artificial chromosomes (YACs) are engineered yeast chromosomes containing megabase-sized fragments of foreign DNA. YACs are created by ligating bacterial plasmid DNA containing yeast centromeres, telomeres, and replication origins to fragments of genomic DNA. This creates an artificial yeast chromosome capable of replicating inside yeast cells and accommodating very large DNA inserts of up to 2 megabases. YACs can be used to generate whole genome libraries for mapping genes and identifying chromosomal sequences important for constructing mammalian artificial chromosomes. However, YACs are less stable than bacterial artificial chromosomes and prone to rearrangements due to their large insert sizes.
The document discusses plasmids, which are small DNA molecules found in bacteria that can replicate independently of the bacterial chromosome. Plasmids contain genes and can be transferred between bacteria. They have various functions like carrying antibiotic resistance genes or virulence genes. Plasmids are commonly used as cloning vectors in genetic engineering to make copies of genes and produce proteins of interest.
Phage display is a technique for displaying antibody fragments or proteins on the surface of bacteriophages. It allows the rapid selection of antibodies that bind to a target antigen from large libraries of antibody variants. The key steps involve constructing phage libraries that display antibody fragments, panning the libraries against the target antigen to select for binding phages, amplifying these phages, and then identifying antibodies that bind to the antigen through sequencing and testing. Phage display has applications in diagnostics, research, and therapeutics, producing antibodies that can detect antigens, be used as research tools, or developed as treatments for diseases like cancer.
Homologous genes are genes that have descended from a common ancestral gene. There are two main types of homologous genes:
1. Orthologous genes are homologous genes in different species that arose due to speciation. For example, the human and mouse eyeless genes are orthologs that descended from the eyeless gene in their last common ancestor.
2. Paralogous genes are homologous genes within the same species that arose due to a gene duplication event. For example, the fruit fly eyeless and twin of eyeless genes are paralogs that descended from a duplication of the eyeless gene in a fruit fly ancestor.
Homologous genes can differ in their sequences due
Benzer conducted experiments to study intragenic recombination within the rII gene of bacteriophage T4. He isolated rare recombinants that arose from exchanges between DNA of co-infecting phages. This allowed him to map mutations within the rII gene at high resolution. He determined that rII mutations occurred in two complementation groups, rIIA and rIIB, representing two different genes. Through deletion mapping, Benzer was able to localize many rII mutations to a short region within gene A or B. His work established the concept of the cistron as the smallest genetic unit.
introduction,history scope and applications of
relation to other fields , bioinformatics,biological databases,computers internet,sequence development, and
introduction to sequence development and alignment
Bidirectional and rolling circular dna replicationGayathri91098
The document discusses bidirectional and rolling circular DNA replication. It begins by explaining that DNA replication is semi-conservative, with each new DNA molecule composed of one original strand and its complement. It then describes how in bidirectional replication, replication forks move in opposite directions from a single origin to copy both strands simultaneously. Rolling circular replication involves the continuous, unidirectional synthesis of multiple copies of a circular DNA or RNA genome through the sequential steps of initiation by introducing a nick, elongation by polymerase moving in a circle, and termination by cleaving the replicated strand.
Transfection involves introducing foreign DNA into host cells to produce a new phenotype. There are two main methods of transfection - vector-mediated and non-vector mediated. Vector-mediated transfection uses bacteriophage, retroviral, cosmid, baculovirus, and plasmid vectors to introduce DNA. Non-vector mediated methods include direct techniques like microinjection, electroporation, and particle bombardment, and indirect techniques like calcium phosphate precipitation and DEAE-dextran. Retroviral vectors are modified retroviruses that can introduce foreign DNA into host chromosomal DNA. Microinjection involves injecting DNA directly into cells using a micropipette under a microscope. Electroporation uses electric pulses to create temporary
Conjugation is the transfer of genetic material between bacterial cells that are in direct contact. Joshua Lederberg and Edward Tatum discovered conjugation in 1946 through an experiment where they mixed two auxotrophic E. coli strains that could complement each other's mutations. Bernard Davis provided evidence in 1950 that direct cell contact was required by showing that genetic transfer did not occur when the two strains were separated by a filter. During conjugation, a pilus forms to attach the donor and recipient cells and allows single-stranded DNA to pass between them to make both cells viable donors that can now transfer the genetic material.
Overview of methods for variant calling from next-generation sequence dataThomas Keane
This document provides an overview of methods for variant calling from next-generation sequencing data. It discusses data formats and workflows, including SNP calling, short indels, and structural variation. The document describes alignment, BAM improvement through realignment and base quality recalibration, library merging, and duplicate removal. It also reviews software tools for these processes and introduces the variant call format (VCF) standard.
This slidedeck details two comprehensive informatics solutions — the Biomedical Genomics Workbench and Ingenuity Knowledge Base Variant Analysis platforms. We show the intuitive user interface of CLC Cancer Research Workbench and demonstrate how the rich biological content from Ingenuity Knowledge Base helps you rapidly identify critical variants in your samples.
This document provides an introduction to next generation sequencing (NGS) technologies. It begins with an outline of topics to be covered, including the evolution of NGS technologies, their descriptions and comparisons, bioinformatics challenges of NGS data analysis, and some aspects of NGS data analysis workflows and tools. The document then delves into explanations of specific NGS platforms, their performance characteristics, and the sequencing processes. It discusses the large computational infrastructure and data management needs of NGS, as well as quality control, preprocessing of NGS data, and popular analysis tools and workflows.
Next Generation Sequencing & Transcriptome AnalysisBastian Greshake
This document discusses next generation sequencing methods like 454 sequencing, Illumina sequencing, and SOLiD sequencing. It then describes how the large amounts of sequencing data generated can be used for transcriptome analysis to study gene expression, identify new genes, and analyze non-model organisms. Finally, it outlines some of the common bioinformatics tools used for assembling sequencing reads, detecting SNPs, finding homologous genes, identifying open reading frames, and annotating genes.
NGS has enabled high-throughput genome sequencing and analysis, changing genomic research. Technologies like Roche 454, Solexa/Illumina, and SOLiD allow massively parallel sequencing of genomes. NGS has applications in de novo genome sequencing, resequencing, RNA-seq, ChIP-seq, methylation analysis, and more. It provides advantages over microarrays like detecting novel transcripts, splicing variants, and sequence variations. NGS data requires processing including quality control, mapping, and variant identification to realize its full potential to revolutionize genomic research and medicine.
1. Next-generation sequencing methods such as Roche 454, Illumina GAII, and ABI SOLiD allow for high throughput DNA sequencing through massive parallel sequencing.
2. These methods involve clonal amplification of DNA fragments on solid surfaces or in emulsion PCR followed by sequencing using pyrosequencing, sequencing by synthesis with reversible terminators, or sequencing by ligation approaches.
3. The resulting sequencing data requires high throughput management and analysis pipelines to process the large volumes of sequence data produced.
Deciphering DNA sequences is essential for virtually all branches of biological research. With the
advent of capillary electrophoresis (CE)-based Sanger sequencing, scientists gained the ability to
elucidate genetic information from any given biological system. This technology has become widely
adopted in laboratories around the world, yet has always been hampered by inherent limitations in
throughput, scalability, speed, and resolution that often preclude scientists from obtaining the essential
information they need for their course of study. To overcome these barriers, an entirely new technology
was required—Next-Generation Sequencing (NGS), a fundamentally different approach to sequencing
that triggered numerous ground-breaking discoveries and ignited a revolution in genomic science.
NGS for Infectious Disease Diagnostics: An Opportunity for Growth Alira Health
Infectious diseases are a major public health concern causing over 3.5 million deaths worldwide. Diagnosing patients as quickly and effectively as possible is crucial for managing disease outbreaks. Next-generation sequencing (NGS) provides unique capabilities to understand the genetic profile of infectious disease patients that no other technology can match.
Whole-genome metagenomics allows clinicians to take a deeper dive into pathogens by generating big-data about their characteristics. This data can be rapidly analyzed using complex bioinformatics software algorithms to achieve clinical-grade diagnostic accuracy. In a healthcare system shifting towards personalized medicine, NGS can provide clinicians the tools that they need to prescribe individualized treatments to save patients who were previously untreatable. The result is improved quality of care, better treatment regimes, and cost-saving healthcare.
Step by Step, from Liquid Biopsy to a Genomic Biomarker: Liquid Biopsy Series...QIAGEN
Liquid biopsies enable us to monitor the evolution of genetic aberrations in primary tumors as they shed the tumor cells into the circulation. The limitation is the ability to detect these low frequency genetic aberrations in a consistent manner to understand short- and long-term implications and how this information will be used in the clinic. This slidedeck will cover the challenges and solutions associated with multiple steps as one starts with liquid biopsy and move towards finding a new biomarker.
Digital RNAseq Technology Introduction: Digital RNAseq Webinar Part 1QIAGEN
QIAseq RNA is a revolutionary turnkey solution for digital gene expression analysis by NGS. From 10 genes to 1000, from one sample to 100, QIAseq RNA delivers precise results on both ION and Illumina sequencing platforms. The data from QIAseq RNA is directly comparable to expression analysis derived from whole transcriptome sequencing or by qRTPCR, only better, cheaper, faster, and more flexible. This webinar will describe the principles of digital expression analysis by NGS, and review the features and benefits of the QIAseq system, options available, and the integrated data analysis package.
This document discusses next generation sequencing technologies. It provides details on several massively parallel sequencing platforms and describes their advantages over traditional Sanger sequencing such as higher throughput, lower costs, and ability to process millions of reads in parallel. It then outlines several applications of next generation sequencing like mutation discovery, transcriptome analysis, metagenomics, epigenetics research and discovery of non-coding RNAs.
“Next-Generation Sequencing (NGS) Global Market – Forecast To 2022”Vinay Shiva Prasad
Next Generation Sequencing (NGS) has revolutionized the way the human genomes are being sequenced now a days. The time and cost to sequence has tremendously reduced . In the last five years, the sequencing cost has rapidly reduced and this is supposed to have a huge impact on the NGS market in the coming years.
The QIAseq NGS Portfolio for Cancer Research: Sample-to-Insight for AllQIAGEN
The document summarizes the QIAseq NGS portfolio from QIAGEN for cancer research and other applications. It describes several QIAseq kits that provide streamlined workflows for targeted DNA sequencing, whole genome sequencing, single-cell sequencing, cell-free DNA sequencing, and single-cell RNA sequencing. The kits are shown to provide high library conversion rates even from low input DNA amounts down to 10 pg, with uniform coverage and low GC bias.
Next generation sequencing (NGS) involves chemical assays other than traditional Sanger sequencing to sequence DNA. The basic steps of NGS are: 1) library preparation where DNA is fragmented and adapters are ligated, 2) clonal amplification where fragments are amplified, and 3) sequencing reactions using techniques like pyrosequencing or reversible dye terminators to determine the sequence. The sequenced fragments are then analyzed bioinformatically to map reads and assemble sequences.
This document provides an overview of next generation sequencing (NGS) for cancer genomics. It discusses how NGS allows sequencing of whole genomes or targeted regions to identify genetic changes between normal and tumor DNA. Analysis involves aligning DNA reads to reference genomes, detecting mutations, and categorizing them as silent, non-silent, oncogenic drivers, or signatures of mutational processes. RNA sequencing can also profile gene expression and detect fusion genes. The data requires specialized bioinformatics analysis to interpret genetic changes driving cancer.
Course: Bioinformatics for Biomedical Research (2014).
Session: 2.1.1- Next Generation Sequencing. Technologies and Applications. Part I: NGS Introduction and Technology Overview.
Statistics and Bioinformatisc Unit (UEB) & High Technology Unit (UAT) from Vall d'Hebron Research Institute (www.vhir.org), Barcelona.
Next Gen Sequencing (NGS) Technology OverviewDominic Suciu
Next generation sequencing (NGS) provides several new technologies for DNA sequencing that have significantly increased throughput and reduced costs compared to previous methods. NGS technologies include Roche/454, Illumina, ABI SOLiD, Ion Torrent, and PacBio. These technologies have various applications including whole genome sequencing, detection of genetic mutations associated with diseases, RNA sequencing to study gene expression, and ChIP sequencing to identify DNA-binding sites. NGS is revolutionizing genomic research by allowing comprehensive study of genomes, transcriptomes, and gene regulation.
How to cluster and sequence an ngs library (james hadfield160416)James Hadfield
A presentation for people intersted in understanding how Illumina adapter ligation, clustering ands SBS sequencing work. Follow core-genomics http://core-genomics.blogspot.co.uk/
A workshop is intended for those who are interested in and are in the planning stages of conducting an RNA-Seq experiment. Topics to be discussed will include:
* Experimental Design of RNA-Seq experiment
* Sample preparation, best practices
* High throughput sequencing basics and choices
* Cost estimation
* Differential Gene Expression Analysis
* Data cleanup and quality assurance
* Mapping your data
* Assigning reads to genes and counting
* Analysis of differentially expressed genes
* Downstream analysis/visualizations and tables
3 - SamplingTechVarious sampling techniques are employedniques(new1430).pptgharkaacer
In the context of medicine, "medical sampling" generally refers to the collection of biological materials or data for analysis, diagnosis, or research purposes. Various sampling techniques are employed to ensure the integrity and representativeness of the samples. Here are some common medical sampling techniques:
Blood Sampling: One of the most common medical sampling techniques, it involves drawing blood from a vein, usually using a needle. This method is used for countless diagnostic tests, from basic blood counts to more complex disease markers.
Urine Sampling: Urine can be collected randomly or at specific times (e.g., first morning sample, 24-hour urine collection) to assess kidney function, detect metabolic products, and diagnose diseases.
Tissue Biopsy: This involves extracting a small piece of tissue from the body for examination under a microscope. Biopsies can be performed on various body parts, including the skin, liver, and kidneys, to diagnose cancer and other diseases.
Hypothesis Testing and Experimental design.pptxAbile2
The document discusses hypothesis testing and experimental design. It explains that hypothesis testing involves designing a method to collect data to evaluate and accept or reject a hypothesis. Experimental design aims to minimize sources of variability except the treatment being tested. Key aspects of experimental design discussed include sample size, replication, controls, randomization, and interspersion of treatments. Different types of experimental designs are also outlined, including completely randomized designs, randomized complete block designs, and factorial designs.
Experimental method of Educational Research.Neha Deo
experimental method is the most challenging method of the Educational research. In the experimental method different functional & factorial designs can be used. One has to think over the internal & external validity of the experiment also.In this presentation all these things are discussed in details.
This document provides an overview of genome-wide association studies (GWAS). It discusses the basic concept of GWAS, running and analyzing a GWAS, and interpreting the results. Key points include: GWAS genotype individuals for hundreds of thousands to millions of SNPs to look for associations with traits; extensive quality control is required; imputation can increase SNP coverage; statistical analysis includes computing p-values and correcting for multiple testing; significant findings still require replication in independent samples.
The document discusses research design and experimentation. It defines research design as the blueprint that guides the research process. An effective research design maximizes systematic variance between variables, minimizes error variance, and controls for confounding variables. This is known as the MAXMINCON principle. Experimental designs have higher internal validity while non-experimental correlational designs have higher external validity but lower internal validity since they cannot control for confounding variables.
sience 2.0 : an illustration of good research practices in a real studywolf vanpaemel
a presentation explaining the what, how and why of some of the features of science 2.0 (replication, registration, high power, bayesian statistics, estimation, co-pilot multi-software approach, distinction between confirmatory and exploratory analyses, and open science) using steegen et al. (2014) as a running example.
Advances and Applications Enabled by Single Cell TechnologyQIAGEN
Over the past 5 years, single-cell genomics have become a powerful technology for studying small samples and rare cells, and for dissecting complex populations such as heterogeneous tumors. Single-cell technology is enabling many new insights into diverse research areas from oncology, immunology and microbiology to neuroscience, stem cell and developmental biology. This webinar introduces single-cell technology and summarizes the newest scientific applications in various research areas, all in the context of current literature.
This document provides a review and summary of major scientific events, trends, and publications in translational bioinformatics in 2008 by Russ B. Altman from Stanford University. Some of the key topics covered include the sequencing and analysis of an individual's diploid genome, next-generation sequencing technologies, genome-wide association studies, pharmacogenomics, analysis of high-throughput molecular data, neuroscience datasets, and using molecular information to improve disease detection and treatment. The review highlights over 25 seminal papers from 2008 and provides insights on emerging trends in the field.
The document discusses various sampling methods and concepts in research methodology. It defines key terms like population, sample, sampling frame, probability sampling, and non-probability sampling. It then explains different probability sampling techniques like simple random sampling, systematic sampling, stratified sampling, cluster sampling, and multistage sampling. It also discusses non-probability sampling methods and compares the advantages and disadvantages of different approaches. The document emphasizes the importance of representative sampling.
This document discusses key concepts in study design, including:
1) It defines target populations, study populations, and samples, noting that samples are used to make inferences about larger populations.
2) It discusses sources of sampling error and types of sampling methods, including simple random sampling, systematic random sampling, and stratified random sampling.
3) It outlines different types of study designs including descriptive (PO) versus analytic (PICO/PECO) studies, and experimental versus observational studies. Within observational studies, it distinguishes between cohort, case-control, and cross-sectional designs.
This document provides an overview of evidence-based psychiatry. It defines evidence-based medicine and outlines the key steps, including formulating an answerable clinical question using PICO, finding the best evidence, appraising the evidence, and applying it to individual patients. The document discusses different types of studies and how to evaluate their results, focusing on sensitivity, specificity, odds ratios, and other statistical concepts. It emphasizes integrating individual clinical expertise with the best available external evidence.
This document discusses key principles of statistical sampling:
1) The principle of statistical regularity states that a randomly selected moderate sample will possess the characteristics of the larger population.
2) The principle of inertia of large numbers states that as sample size increases, results become more reliable and accurate due to balancing of variations.
3) It describes different sampling methods like simple random sampling, systematic random sampling, stratified random sampling, and cluster sampling.
This document provides an overview of important considerations for designing a successful clinical research study. It discusses how to begin by defining research questions and assessing feasibility. It then covers common study design types including experimental, observational, descriptive, and analytical designs. Examples are given of randomized clinical trials, cohort studies, cross-sectional studies, and case-control studies that could be used to study the relationship between hormone therapy and coronary heart disease. Statistical issues like sample size calculations and analytic approaches are also highlighted.
This document provides an overview of key concepts in biostatistics for clinical research. It discusses study design considerations including descriptive versus analytical studies, and observational versus experimental designs. It also covers topics like clinical trial methodology, ethics, and sample size calculation. Sample size depends on the statistical parameter, design, hypothesis being tested, and is neither too small to lack power nor too large to waste resources. Resource limitations may require adjusting the target sample size or power. Planned statistical analyses should be tailored to the research objectives.
The document discusses various topics related to experimental research methods, including defining research problems, sampling techniques, research designs, variables, hypothesis testing, and statistics. Specifically, it defines key terms like independent and dependent variables, different sampling methods, research designs like experimental and quasi-experimental, and statistical analyses commonly used in experimental research like t-tests, ANOVA, regression, and chi-square tests.
Dr. Marie Culhane - Increase the value of your diagnostics and your value as ...John Blue
Increase the value of your diagnostics and your value as a diagnostician - Dr. Marie Culhane, Associate Clinical Professor, Veterinary Population Medicine, College of Veterinary Medicine, University of Minnesota, from the 2013 Allen D. Leman Swine Conference, September 14-17, 2013, St. Paul, Minnesota, USA.
More presentations at http://www.swinecast.com/2013-leman-swine-conference-material
sampling in research, a written report which consists of the following: definitions and terminologies, the sampling types and methods, the sampling process, the sampling storage, and sampling errors.
Design of experiments - Dr. Manu Melwin Joy - School of Management Studies, C...manumelwin
Planning an experiment to obtain appropriate data and drawing inference out of the data with respect to any problem under investigation is known as design and analysis of experiments.
This might range anywhere from the formulations of the objectives of the experiment in clear terms to the final stage of the drafting reports incorporating the important findings of the enquiry
Single-cell RNA sequencing (scRNA-seq) allows researchers to analyze gene expression at the individual cell level, exposing heterogeneity that is hidden in bulk tissue analysis. There are various platforms for scRNA-seq that differ in throughput and customizability. Experimental design considerations include the number of cells to sequence, desired sequencing depth, and controlling for batch effects. The analysis workflow generally involves processing and filtering data, normalization, clustering, differential expression analysis, and trajectory inference to reconstruct cellular responses.
Similar to Making your science powerful : an introduction to NGS experimental design (20)
Global Situational Awareness of A.I. and where its headedvikram sood
You can see the future first in San Francisco.
Over the past year, the talk of the town has shifted from $10 billion compute clusters to $100 billion clusters to trillion-dollar clusters. Every six months another zero is added to the boardroom plans. Behind the scenes, there’s a fierce scramble to secure every power contract still available for the rest of the decade, every voltage transformer that can possibly be procured. American big business is gearing up to pour trillions of dollars into a long-unseen mobilization of American industrial might. By the end of the decade, American electricity production will have grown tens of percent; from the shale fields of Pennsylvania to the solar farms of Nevada, hundreds of millions of GPUs will hum.
The AGI race has begun. We are building machines that can think and reason. By 2025/26, these machines will outpace college graduates. By the end of the decade, they will be smarter than you or I; we will have superintelligence, in the true sense of the word. Along the way, national security forces not seen in half a century will be un-leashed, and before long, The Project will be on. If we’re lucky, we’ll be in an all-out race with the CCP; if we’re unlucky, an all-out war.
Everyone is now talking about AI, but few have the faintest glimmer of what is about to hit them. Nvidia analysts still think 2024 might be close to the peak. Mainstream pundits are stuck on the wilful blindness of “it’s just predicting the next word”. They see only hype and business-as-usual; at most they entertain another internet-scale technological change.
Before long, the world will wake up. But right now, there are perhaps a few hundred people, most of them in San Francisco and the AI labs, that have situational awareness. Through whatever peculiar forces of fate, I have found myself amongst them. A few years ago, these people were derided as crazy—but they trusted the trendlines, which allowed them to correctly predict the AI advances of the past few years. Whether these people are also right about the next few years remains to be seen. But these are very smart people—the smartest people I have ever met—and they are the ones building this technology. Perhaps they will be an odd footnote in history, or perhaps they will go down in history like Szilard and Oppenheimer and Teller. If they are seeing the future even close to correctly, we are in for a wild ride.
Let me tell you what we see.
4th Modern Marketing Reckoner by MMA Global India & Group M: 60+ experts on W...Social Samosa
The Modern Marketing Reckoner (MMR) is a comprehensive resource packed with POVs from 60+ industry leaders on how AI is transforming the 4 key pillars of marketing – product, place, price and promotions.
The Building Blocks of QuestDB, a Time Series Databasejavier ramirez
Talk Delivered at Valencia Codes Meetup 2024-06.
Traditionally, databases have treated timestamps just as another data type. However, when performing real-time analytics, timestamps should be first class citizens and we need rich time semantics to get the most out of our data. We also need to deal with ever growing datasets while keeping performant, which is as fun as it sounds.
It is no wonder time-series databases are now more popular than ever before. Join me in this session to learn about the internal architecture and building blocks of QuestDB, an open source time-series database designed for speed. We will also review a history of some of the changes we have gone over the past two years to deal with late and unordered data, non-blocking writes, read-replicas, or faster batch ingestion.
The Ipsos - AI - Monitor 2024 Report.pdfSocial Samosa
According to Ipsos AI Monitor's 2024 report, 65% Indians said that products and services using AI have profoundly changed their daily life in the past 3-5 years.
06-04-2024 - NYC Tech Week - Discussion on Vector Databases, Unstructured Data and AI
Round table discussion of vector databases, unstructured data, ai, big data, real-time, robots and Milvus.
A lively discussion with NJ Gen AI Meetup Lead, Prasad and Procure.FYI's Co-Found
Codeless Generative AI Pipelines
(GenAI with Milvus)
https://ml.dssconf.pl/user.html#!/lecture/DSSML24-041a/rate
Discover the potential of real-time streaming in the context of GenAI as we delve into the intricacies of Apache NiFi and its capabilities. Learn how this tool can significantly simplify the data engineering workflow for GenAI applications, allowing you to focus on the creative aspects rather than the technical complexities. I will guide you through practical examples and use cases, showing the impact of automation on prompt building. From data ingestion to transformation and delivery, witness how Apache NiFi streamlines the entire pipeline, ensuring a smooth and hassle-free experience.
Timothy Spann
https://www.youtube.com/@FLaNK-Stack
https://medium.com/@tspann
https://www.datainmotion.dev/
milvus, unstructured data, vector database, zilliz, cloud, vectors, python, deep learning, generative ai, genai, nifi, kafka, flink, streaming, iot, edge
State of Artificial intelligence Report 2023kuntobimo2016
Artificial intelligence (AI) is a multidisciplinary field of science and engineering whose goal is to create intelligent machines.
We believe that AI will be a force multiplier on technological progress in our increasingly digital, data-driven world. This is because everything around us today, ranging from culture to consumer products, is a product of intelligence.
The State of AI Report is now in its sixth year. Consider this report as a compilation of the most interesting things we’ve seen with a goal of triggering an informed conversation about the state of AI and its implication for the future.
We consider the following key dimensions in our report:
Research: Technology breakthroughs and their capabilities.
Industry: Areas of commercial application for AI and its business impact.
Politics: Regulation of AI, its economic implications and the evolving geopolitics of AI.
Safety: Identifying and mitigating catastrophic risks that highly-capable future AI systems could pose to us.
Predictions: What we believe will happen in the next 12 months and a 2022 performance review to keep us honest.
Intelligence supported media monitoring in veterinary medicine
Making your science powerful : an introduction to NGS experimental design
1. Making your science powerful: an introduction to
NGS experimental design
7th May CSCR seminar
Dr Jelena Aleksic
2. The purpose of this talk
Bioinformaticians often get asked about experimental design, as
we have experience working with the data.
Good experimental design is absolutely crucial for NGS
applications, because of the size of the datasets.
The purpose of this talk is to introduce the basic concepts of
good NGS experimental design.
3. NGS Experimental Design
• Importance
• Specificity and power
• Accuracy, precision and bias
• Considerations for sample collection
• Considerations for sequencing
• Conclusions
4. Without a valid design, valid scientific conclusions cannot be drawn. Also
particularly important for NGS because of the size of the datasets.
5. Ronald A. Fisher
(1890-1962)
“To consult the statistician after an experiment is
finished is often merely to ask them to conduct a
post mortem examination. They can perhaps say what
the experiment died of.” (1938)
Good to think about early!
6. Value of Planning
Rushing into experiments without thoughtful planning invites
failure.
“Seventy percent of whether your experiment will
work is determined before you touch the first test
tube”
Tung-Tien Sun (2004).
Excessive trust in authorities and its influence on experimental design.
Nature Reviews Molecular Cell Biology
7. Still a serious issue in the field
- Almost 70% of all the human RNA-seq samples in GEO do not
have biological replicates (Feng et al. 2012)
- More unreplicated RNA-seq data were published than
replicated RNA-seq data in 2011 (Feng et al. 2012)
- ENCODE guidelines recommend two ChIP-seq biological
replicates (Landt et al. 2012), but this is controversial and
arguably too few
8. • Importance
• Specificity and power
• Accuracy, precision and bias
• Considerations for sample collection
• Considerations for sequencing
• Conclusions
NGS Experimental Design
9. Important concepts
• Specificity: the proportion of results that are genuine ones,
as opposed to false positives
• Power (sensitivity): the proportion of genuine results that are
successfully identified by the experiment
e.g. At FDR 1%, 1 in 100 results will be a false positive.
e.g. At 20% power, the experiment will successfully identify
200 out of 1000 genuine differentially expressed genes at a
specified fold-change level.
10. Adequately Powered
The power (sensitivity) of an experiment is the
probability that it can detect an effect, if it is present.
•Power is often overlooked.
•A probability - any value between 0% and 100%.
•Achieved by:
–Using appropriate numbers of animals / samples
(sample size)
–Controlling sources of variation
If you increase the variability when you increase the size then it
won't necessarily have more power.
11. Adequately Powered
•Too many samples:
–wastes resources e.g. animals
(unethical), money, time and
effort.
•Too few samples:
–may lack power and miss a
scientifically important effect. Power(%)
Sample Size per group (n)
12. How much power is enough?
– Depends on the experiment purpose.
– To just get the top few targets = not much.
– To get a comprehensive list of targets = a lot.
– Rare transcripts / genomic variants = a lot.
– Also depends on variability of samples.
– In vitro studies = less variability
– In vivo = more variability, mixture of tissues
– Cancer = hugely variable, need a lot of power
13. How do I tell?
• It’s possible to calculate the required power of an experiment.
For RNA-seq, there’s even an online tool (and there are
various general calculators):
• Alternatively, use rules of thumb based on other people’s
systematic reviews of a particular experiment type.
14. How many replicates?
4 mice…
No, it should be
12 mice…
Look into my crystal ball……
The sensitivity, ability or power to detect changes
depends on the sample size.
15. Depends on the resources, the goals of the study, and the
reliability of the technology:
–How much money do you have?
–Can you handle all these samples without problem?
–What size of differences (effect size) to detect?
–It’s an accurate representation of the population?
–Large enough to achieve meaningful results?
How many replicates?
16. What is the minimum number
of replicates?
Experiments with only one replicate break
statistics and make puppies cry.
• Unless:
– It’s a pilot experiment, e.g. to estimate data quality /
antibody specificity / quantity of particular transcripts etc.,
and you plan to do follow-ups (further reps or other
biological validation).
– It’s performed on a homogeneous and well-characterised
system where a lot of other data already exists. e.g. RNA-
seq of a much used cell line.
–It’s a simple confirmatory experiment
17. What is the minimum number
of replicates?
• Adequate minimum number of replicates for an
exploratory study = 3.
• The reasoning
– 3 data points per gene lets you fit a statistical distribution
more accurately, and gives a better estimate of variability
– This in turn improves both the specificity and the
sensitivity of the experiment.
Some labs do 4 replicates as standard, in case one fails.
18. • Importance
• Specificity and power
• Accuracy, precision and bias
• Considerations for sample collection
• Considerations for sequencing
• Conclusions
NGS Experimental Design
19. Precise & Unbiased
PRECISION:
• Reproducibility of repeated
measurements.
• Precise estimation of the quantity of
interest.
Precision
Bias
Target
UNBIASED:
• Bias can affect accuracy.
• Should control for systematic differences between the measure
and some “true" value (target).
• Doesn’t confound that estimate with a technical effect.
• Systematic variation (bias) leads to results being inaccurate.
• Random variation (chance) leads to results being imprecise.
20. “A biased scientific result is no
different from a useless one”
Daniel Sarewitz
(Beware the creeping cracks of bias. Nature, 2012)
21. • Bias is avoided by:
• Correct selection of experimental units.
• Randomisation of the experimental units.
• Randomisation of the order in which
measurements are made.
• “Blinding” and the use of coded samples where
appropriate.
• Failure to randomise and blind can lead to
false positive and negative results
Controlling bias
22. Confounding factors:
example
Plate1 Plate2 Plate3
Control Treatment 1 Treatment 2
RNA Extraction
The difference between Control, Treatment 1
and Treatment 2 is confounded by Plate
23. Confounding factors:
example 2
Day1, Plate 1 Day2, Plate 2 Day3, Plate 3
Control Treatment 1 Treatment 2
RNA Extraction
The difference between Control, Treatment 1
and Treatment 2 is confounded by day and plate.
24. Sources of potential bias
Issues can arise if any of the experimental steps are
applied in a systematically biased way between sample
and control. e.g. During:
- Sample collection procedure
- Molecular biology during the main experiment (e.g.
ChIP)
- Library prep
- Sequencing (different lanes on same flowcell is ok, but
different flow cells can produce different results)
25. http://www.the-scientist.com/blog/display/57558/
• A GWAS study of 800 centenarians against controls found 150 SNPs which can predict
if a person is a centenarian with 77 % accuracy.
• Problem: they used different SNP chips for centenarian vs control.
• Retracted 2011 following an independent lab reviewed the data and QC applied.
26. • Importance
• Specificity and power
• Accuracy, precision and bias
• Considerations for sample collection
• Considerations for sequencing
• Conclusions
NGS Experimental Design
27. Samples to collect
• Adequate minimum number of replicates for an
exploratory study = 3.
• Controls are also essential.
• Biological replicates should be performed -
technical reps will not capture variation present in the
tissues / organisms.
–> This means you need at least 3 biological replicate
samples, and 3 biological replicate controls.
28. Biological or technical
replicates?
Choose Samples
Extract RNA
Quality Control
Convert to cDNA
Quality Control
Library prep
Washing
Sequencing
Analysis
Biological Replication
Technical Replication
Technical Replication
Technical Replication
Technical Replication
RNAseqProcessingWorkflow
29. Biological replicate tricky
cases
• Some biological replicate cases are obvious: e.g.
tissue samples from different mice, blood samples
from different people, cell cultures from different
people.
• Others are less clear cut:
• e.g. Experiments using a single cell line
34. Performing cell line replicates
• The aim is to assess the replicability of a given
experiment, by performing it independently
• This means ideally they should be done on different
days, with fresh medium and reagents etc. However,
this might be impractical.
• It is at least important to perform the treatments
independently. e.g. To get 4 replicates, do 4 separate
transfections (or RNAi etc) for the experimental
sample, and 4 for the control.
35. Experiment Controls
• A very important part in your experiment,
because it’s very difficult to eliminate all of the
possible confounding variables & bias.
• Designing the experiment with controls in mind is
often more crucial than determining the
independent variable.
• Increases the statistical validity of your data.
• Two types: positive and negative.
36. Experimental Controls
• Placebo control
Mimic a procedure or treatment without the
actual use of the procedure or test
substance.
• e.g. same surgical procedure but
without X implanted
• Vehicle control
Used in studies where a substance is used to deliver an experimental
compound.
e.g. apply EtOH to cell lines on it’s own as a control since it’s
used as a vehicle for delivering the Tamoxifen drug.
• Baseline biological state
e.g. Wild type to observe the effects of a knockout.
37. ChIP-seq Controls
• What controls to use
– Either input DNA or IgG antibody are commonly used. Both
have pros and cons.
• Replicates
- The control should have as many replicates as the sample,
as it is also variable.
• Particularly important
- ChIP reactions are extremely noisy and variable.
38. Should I pool samples?
When working with very small amounts of tissue sample, pooling is a
good way of reducing the noise while keeping the number of n
reasonably small.
e.g. where n = no. of sequencing reactions.
• Better to pool the biological material (tissue, cells), not the purified
RNA or labeled cDNA. In this way, problems are far easier to spot.
• However, there’s no way to estimate variation between individuals
in a pool, which is sometimes important and often interesting.
• Beware of outliers - Don’t include any sample that looks suspicious.
• In some studies (pooled vs unpooled designs) the majority of DEGs turn out
extreme in only one individual.
39. • Importance
• Specificity and power
• Accuracy, precision and bias
• Considerations for sample collection
• Considerations for sequencing
• Conclusions
NGS Experimental Design
40. Basic NGS parameters
• Library size / sequencing depth: the number of reads
obtained for a given sample (e.g. 20 million)
• Read length: bp length of each read in the library (e.g. 150
bp)
• Single end vs paired end: single end sequencing only
sequences one end of each DNA fragment, while paired end
sequences both ends.
41. How deeply should you
sequence
• Depends on the application: e.g. some parts of the genome
will be more tricky to sequence, so full coverage for genome
assembly may require very deep sequencing.
• RNA-seq differential expression: very roughly (for human
and mouse) - at least 10 million reads per sample is required,
and 30 million reads per sample should suit most applications.
Can sequence deeper if low expressed genes are particularly
important.
• If in doubt: you can run a pilot experiment and see how much
coverage and saturation you get, and potentially sequence
further after that.
42. Sequencing depth vs
replicates
Replicates are good! Even if you don’t sequence any more
deeply.
A study looking at RNAseq (Liu et al. 2013) found that adding
more replicates at 10 million reads library size was a more cost-
efficient way of improving the power of the experiment, than
adding more sequencing coverage. This was true for 3-7
replicates.
So, if you don’t have money for a lot of sequencing, you can still
multiplex a number of biological replicates!
43. What read length?
• The read length should match your sample fragment length:
you want it to be around the size of and just a bit shorter than your
fragments.
• Recommended for Illumina:
sRNAseq, ribosomal profiling = 50 bp read length
mRNAseq (fragment size ~80-200bp) = 150 bp read length
longer fragments = 250 bp
• For longer reads / niche applications: other sequencers also
exist. PacBio has an impressive read length (often >10 kb)
44. Single end vs paired end
• Single end: a bit cheaper, less sequencing required. Suitable for
most general purposes, such as differential expression analysis.
• Paired end: contains more length and positional information about
the sequenced fragment - can tell where it starts and stops.
• Applications for which you need paired end sequencing:
- splice junctions
- rearrangements: indels and inversions
45. • Importance
• Specificity and power
• Accuracy, precision and bias
• Considerations for sample collection
• Considerations for sequencing
• Conclusions
NGS Experimental Design
46. Summary
• Experimental design is key to a successful genomics experiment.
• The required power of an experiment should be considered when
choosing the number of replicates.
• An adequate number of biological replicates (usually >=3) is
crucial for a reliable statistical analysis
• Appropriate controls are essential
• Systematic bias can completely mess up an experiment and should
be controlled
47. Acknowledgements
Thank you to Ela and everyone in the Frye lab!
Iwona Driskell
Sandra Blanco
Shobbir Hussain
Joana Flores
Martyna Popis
Roberto Bandiera
Abdul Sajini
Mahalia Page
Nikoletta Gkatza
Carolin Witte
And also the following colleagues for
excellent experimental design
discussions:
Sabine Dietmann
Patrick Lombard
Meryem Ralser
Bettina Fischer (CSBC)
Sarah Carl (CSBC)
Duncan Odom (CRUK CRI)
Audrey Fu (University of Chicago)
And thank you all for listening!
I would particularly like to thank Roslin Russell (CRUK) for letting
me use her course material slides as a basis for preparing this talk