It is a presentation showing the process of doing a prokaryotic genome annotation using RAST server. It is a basic work in bioinformatics field. Genome annotation is total genome analysis of an organism. We can easily do it using bioinformatics tool like RAST server.
Automated sequencing of genomes require automated gene assignment
Includes detection of open reading frames (ORFs)
Identification of the introns and exons
Gene prediction a very difficult problem in pattern recognition
Coding regions generally do not have conserved sequences
Much progress made with prokaryotic gene prediction
Eukaryotic genes more difficult to predict correctly
Course: Bioinformatics for Biological Researchers (2014).
Session: 3.1- Introduction to Metagenomics. Applications, Approaches and Tools.
Statistics and Bioinformatics Unit (UEB) from Vall d'Hebron Research Institute (www.vhir.org), Barcelona.
Metagenomics is the study of genetic material recovered directly from environmental samples. Metagenomics is a molecular tool used to analyse DNA acquired from environmental samples, in order to study the community of microorganisms present, without the necessity of obtaining pure cultures.
In this presentation, I talk about the various tools for the submission of DNA or RNA sequences into various sequence databases. The sequence submission tools talked about in this presentation are BankIt, Sequin and Webin.
Visualizing the pan genome - Australian Society for Microbiology - tue 8 jul ...Torsten Seemann
Invited talk at the Australian Society for Microbiology Annual Conference 2014 on "FriPan" our tool for visualizing bacterial pan genomes across 10-100s of isolates.
Creating a Cyberinfrastructure for Advanced Marine Microbial Ecology Research...Larry Smarr
06.06.27
Invited Talk
ONR Review
Scripps Institution of Oceanography, UCSD
Title: Creating a Cyberinfrastructure for Advanced Marine Microbial Ecology Research and Analysis (CAMERA)—Taking Metagenomics to Light Speed
La Jolla, CA
Hervé Blottiere-El impacto de las ciencias ómicas en la medicina, la nutrició...Fundación Ramón Areces
On 29 March 2016 we held an International Symposium on the 'Impact of the omics sciences on medicine, nutrition and biotechnology'. Organized by the Fundación Ramón Areces in collaboration with the Real Academia Nacional de Medicina and BioEuroLatina, it addressed how better knowledge of the human genome is enabling notable advances toward precision medicine.
Clinical Metagenomics for Rapid Detection of Enteric Pathogens and Characteri...QIAGEN
High-throughput sequencing, combined with high-resolution metagenomic analysis, provides a powerful diagnostic tool for clinical management of enteric disease. Forty-five patient samples of known and unknown disease etiology and 20 samples from healthy individuals were subjected to next-generation sequencing. Subsequent metagenomic analysis identified all microorganisms (bacteria, viruses, fungi and parasites) in the samples, including the expected pathogens in the samples of known etiology. Multiple pathogens were detected in the individual samples, providing evidence for polymicrobial infection. Patients were clearly differentiated from healthy individuals based on microorganism abundance and diversity. The speed, accuracy and actionable features of CosmosID bioinformatics and curated GenBook® databases, implemented in the QIAGEN Microbial Genomics Pro Suite, and the functional analysis, leveraging the QIAGEN functional metagenomics workflow, provide a powerful tool contributing to the revolution in clinical diagnostics, prophylactics and therapeutics that is now in progress globally.
Identification of antibiotic resistance genes in Klebsiella pneumoniae isolat...QIAGEN
Antibiotic-resistant strains of pathogenic bacteria are a growing worldwide health problem. To effectively combat the spread of difficult-to-treat bacterial infections, rapid surveillance methods for detection of antibiotic resistance genes are required to monitor both bacterial isolates and metagenomic samples. Additionally, identification of potential new sources for different antibiotic resistance genes is critical. Both of these goals require tools that can be used for profiling of antibiotic resistance genes from various types of samples. Real-time PCR has proven to be effective for the detection of antibiotic resistance genes. Using PCR array technology, simultaneous detection of 87 prevalent and important antibiotic resistance genes is possible and should prove to be an effective method for antibiotic resistance monitoring. This allows for a more comprehensive profiling of antibiotic resistance genes than is possible using individual PCR assays.
VHIR Seminar led by Joel Doré. Research Director. Institut National de la Recherche Agronomique (INRA). Jouy-en-Josas, France.
Abstract: The human intestinal tract harbours a complex microbial ecosystem which plays a key role in nutrition and health. Interactions between food constituents, microbes and the host organism derive from a long co-evolution that resulted in a mutualistic association.
Current investigations into the human faecal metagenome are delivering an extensive gene repertoire representative of functional potentials of the human intestinal microbiota. The most redundant genomic traits of the human intestinal microbiota are identified and thereby its functional balance. These observations point towards the existence of enterotypes, i.e. microbiota sharing specific traits yet independent of geographic origin, age, sex, etc. They also show a unique segregation of the human population into individuals with low versus high gene counts. In the end, this not only gives an unprecedented view of the intestinal microbiota, but also significantly expands our ability to look for specificities of the microbiota associated with human diseases and to ultimately validate microbial signatures of prognostic and diagnostic value in immune-mediated diseases.
Metagenomics of the human intestinal tract was applied to specifically compare obese versus lean individuals as well as to explore the dynamic changes associated with a severe calorie-restricted diet. Microbiota structure differs with body-mass index and a limited set of marker species may be used as a diagnostic model with a >85% predictive value. Among obese subjects, the overall phenotypic characteristics are worse in individuals with low gene count microbiota, including a worse evolution of morphometric parameters over a period of 10 years, a low-grade inflammatory context also associated with insulin resistance, and the worst response to dietary constraints in terms of weight loss or improvement of biological and inflammatory characteristics. Low gene count microbiota is also associated with less favourable conditions in inflammatory bowel disease, such as higher relapse rate in ulcerative colitis patients.
Finally, microbiota transplantation has seen a regain of interest with applications expanding from Clostridium difficile infections to immune mediated and metabolic diseases.
The human intestinal microbiota should hence be regarded as a true organ, amenable to rationally designed modulation for human health.
QIAseq Technologies for Metagenomics and Microbiome NGS Library PrepQIAGEN
In this slide deck, learn about the innovative technologies that form the basis of QIAGEN’s portfolio of QIAseq library prep solutions for metagenomics and microbiome sequencing. Whether your research starts from single microbial cells, 16s rRNA PCR amplicons, or gDNA for whole genome analysis, QIAseq technologies offer tips and tricks for capturing the genomic diversity of your samples in the most unbiased, streamlined way possible.
MicrobeDB provides centralized local storage and access to completed archaeal and bacterial genomes.
MicrobeDB is an open source project available on GitHub:
https://github.com/mlangill/MicrobeDB
Lecture on the annotation of transposable elementsfmaumus
Lecture on the annotation of transposable elements at the CNRS school "BioinfoTE" in 2020 (Fréjus, France). https://bioinfote.sciencesconf.org/
ORGANIZING COMMITTEE
Emmanuelle Lerat (LBBE – CNRS Université Lyon 1),
Anna-Sophie Fiston-Lavier (ISEM – Université de Montpellier)
Florian Maumus (URGI – INRAe Versailles)
François Sabot (DIADE – IRD Montpellier)
GRC Workshop held at Churchill College on Sep 21, 2014. Talk by Bronwen Aken discussing the Ensembl approach to annotating the complete human reference assembly.
Presentation at 2019 ASHG GRC/GIAB workshop describing recent updates to the MANE project, which aims to provide matched annotation from RefSeq and GENCODE.
A workshop is intended for those who are interested in and are in the planning stages of conducting an RNA-Seq experiment. Topics to be discussed will include:
* Experimental Design of RNA-Seq experiment
* Sample preparation, best practices
* High throughput sequencing basics and choices
* Cost estimation
* Differential Gene Expression Analysis
* Data cleanup and quality assurance
* Mapping your data
* Assigning reads to genes and counting
* Analysis of differentially expressed genes
* Downstream analysis/visualizations and tables
Apollo is a web-based application that supports and enables collaborative genome curation in real time, allowing teams of curators to improve on existing automated gene models through an intuitive interface. Apollo allows researchers to break down large amounts of data into manageable portions to mobilize groups of researchers with shared interests.
The i5K, an initiative to sequence the genomes of 5,000 insect and related arthropod species, is a broad and inclusive effort that seeks to involve scientists from around the world in their genome curation process, and Apollo is serving as the platform to empower this community.
This presentation is an introduction to Apollo for the members of the i5K Pilot Project working on species of the order Hemiptera.
Tools for Metagenomics with 16S/ITS and Whole Genome Shotgun SequencesSurya Saha
Presented at Cornell Symbiosis symposium. Workflow for processing amplicon based 16S/ITS sequences as well as whole genome shotgun sequences are described. Slides include short description and links for each tool.
DISCLAIMER: This is a small subset of tools out there. No disrespect to methods not mentioned.
Characterizing Protein Families of Unknown FunctionMorgan Langille
by Morgan G. I. Langille & Jonathan A. Eisen. This scientific poster was presented at the 18th Annual International Meeting on Microbial Genomics at Lake Arrowhead, California, USA. Sept. 12-16, 2010.
This is my first lab presentation during my post-doc in Jonathan Eisen's lab. I discuss new features and changes with HMMER 3. I also discuss how I used the new version to identify PFAMs in all 80 samples of the GOS metagenomic datasets, with the hope of testing whether "community profiling" may work.
A graduate student's experience in bioinformaticsMorgan Langille
I present my experiences in bioinformatics with a focus of graduate school in the MSFHR/CIHR Strategic Training Program for Bioinformatics. This presentation was given to BiNS (Bioinformatics Network of Students) at SFU.
2. Learning Objectives
• Contrast 16S and metagenomic sequencing
• Taxonomy from metagenomes
• Function from metagenomes
• Applicability of assembling and gene calling with metagenomic data
• Metagenomic inference and limitations
• Tutorial on processing metagenomic data to determine functional
and taxonomic profiles
3. 16S vs Metagenomics
• 16S is targeted sequencing of a single gene which acts as a
marker for identification
• Pros
– Well established
– Sequencing costs are relatively cheap (~50,000 reads/sample)
– Only amplifies what you want (no host contamination)
• Cons
– Primer choice can bias results towards certain organisms
– Usually not enough resolution to identify to the strain level
– Different primers are needed for archaea & eukaryotes (18S)
– Doesn’t identify viruses
4. 16S vs Metagenomics
• Metagenomics: sequencing all the DNA in a sample
• Pros
– No primer bias
– Can identify all microbes (euks, viruses, etc.)
– Provides functional information (“What are they doing?”)
• Cons
– More expensive (millions of sequences needed)
– Host/site contamination can be significant
– May not be able to sequence “rare” microbes
– Complex bioinformatics
6. Metagenomics: Who is there?
• Goal: Identify the relative abundance of different
microbes in a given sample using metagenomics
• Problems:
– Reads are all mixed together
– Reads can be short (~100bp)
– Lateral gene transfer
• Two broad approaches
1. Binning Based
2. Marker Based
7. Binning Based
• Attempts to group or “bin” reads into the
genome from which they originated
• Composition-based
– Uses sequence composition such as GC%, k-mers (e.g.
Naïve Bayes Classifier)
– Generally not very precise
• Sequence-based
– Compare reads to large reference database using
BLAST (or some other similarity search method)
– Reads are assigned based on “Best-hit” or “Lowest
Common Ancestor” approach
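The composition-based features named above (GC% and k-mer counts) can be sketched in a few lines. This is only an illustrative sketch: the function names `gc_fraction` and `kmer_counts` and the choice of k=4 are assumptions, and a real binner would feed such features into a classifier (e.g. Naïve Bayes) rather than print them.

```python
from collections import Counter


def gc_fraction(seq):
    """Fraction of bases in the read that are G or C."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return (seq.count("G") + seq.count("C")) / len(seq)


def kmer_counts(seq, k=4):
    """Count every overlapping k-mer in the read (k=4 is an arbitrary choice)."""
    seq = seq.upper()
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))


if __name__ == "__main__":
    read = "GGATGCGCGACACGTCGCATATCCGGT"  # example read, for illustration only
    print(f"GC fraction: {gc_fraction(read):.2f}")
    print(kmer_counts(read, k=4).most_common(3))
```

Short reads give noisy composition profiles, which is one reason this approach is described above as generally not very precise.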
8. LCA: Lowest Common Ancestor
• Use all BLAST hits above a threshold and assign taxonomy at the
lowest level in the tree which covers these taxa.
• Notable Examples:
– MEGAN: http://ab.inf.uni-tuebingen.de/software/megan5/
• One of the first metagenomic tools
• Does functional profiling too!
– MG-RAST: https://metagenomics.anl.gov/
• Web-based pipeline (might need to wait awhile for results)
– Kraken: https://ccb.jhu.edu/software/kraken/
• Fastest binning approach to date and very accurate.
• Large computing requirements (e.g. >128GB RAM)
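The LCA rule can be illustrated with a small sketch. It assumes each BLAST hit above the threshold has already been mapped to a root-to-tip lineage (a list of taxon names); the function name and the example lineages are hypothetical, and tools such as MEGAN apply additional filters not shown here.

```python
def lowest_common_ancestor(lineages):
    """Return the longest lineage prefix shared by all hits.

    Each lineage is a list of taxon names ordered from root to tip.
    """
    if not lineages:
        return []
    shared = []
    for ranks in zip(*lineages):
        if len(set(ranks)) != 1:
            break  # hits disagree at this rank, so stop one level above
        shared.append(ranks[0])
    return shared


if __name__ == "__main__":
    # Hypothetical lineages for three hits of a single read.
    family = ["Bacteria", "Proteobacteria", "Gammaproteobacteria",
              "Enterobacterales", "Enterobacteriaceae"]
    hits = [family + ["Escherichia"],
            family + ["Shigella"],
            family + ["Salmonella"]]
    print(lowest_common_ancestor(hits)[-1])
```

In this example the read would be assigned at the family level, because the three hits disagree at the genus level.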
9. Marker Based
• Single Gene
• Identify and extract reads hitting a single marker gene (e.g.
16S, cpn60, or other “universal” genes)
• Use existing bioinformatics pipeline (e.g. QIIME, etc.)
• Multiple Gene
• Several universal genes
– PhyloSift (Darling et al, 2014)
» Uses 37 universal single-copy genes
• Clade specific markers
– MetaPhlAn2 (Truong et al., 2015)
10. Marker or Binning?
• Binning approaches
– Similarity search is computationally intensive
– Varying genome sizes and LGT can bias results
• Marker approaches
– Doesn’t allow functions to be linked directly to
organisms
– Genome reconstruction/assembly is not possible
– Dependent on choice of markers
11. MetaPhlAn2
• Uses “clade-specific” gene markers
• A clade represents a set of genomes that can be
as broad as a phylum or as specific as a species
• Uses ~1 million markers derived from 17,000
genomes
– ~13,500 bacterial and archaeal, ~3,500 viral, and ~110
eukaryotic
• Can identify down to the species level (and
possibly even strain level)
• Can handle millions of reads on a standard
computer within a few minutes
14. Using MetaPhlAn
• MetaPhlAn uses Bowtie2 for sequence similarity searching
(nucleotide sequences vs. nucleotide database)
• Paired-end data can be used directly
• Each sample is processed individually and then multiple
samples can be combined together at the last step
• Output is relative abundances at different taxonomic levels
15. Absolute vs. Relative Abundance
• Absolute abundance: Numbers represent real
abundance of thing being measured (e.g. the
actual quantity of a particular gene or organism)
• Relative abundance: Numbers represent
proportion of thing being measured within
sample
• In almost all cases microbiome studies are
measuring relative abundance
– This is due to DNA amplification during sequencing
library preparation not being quantitative
16. Relative Abundance Use Case
• Sample A:
– Has 10^8 bacterial cells (but we don’t know this from sequencing)
– 25% of the microbiome from this sample is classified as Shigella
• Sample B:
– Has 10^6 bacterial cells (but we don’t know this from sequencing)
– 50% of the microbiome from this sample is classified as Shigella
• “Sample B contains twice as much Shigella as Sample A”
– WRONG! (If we quantified it, we would find that Sample A has more Shigella)
• “Sample B contains a greater proportion of Shigella compared to
Sample A”
– Correct!
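The arithmetic behind this use case can be checked directly. The sketch below assumes the cell counts on the slide are powers of ten (10^8 for Sample A and 10^6 for Sample B); the function name `absolute_count` is made up for the illustration.

```python
def absolute_count(total_cells, fraction):
    """Absolute number of cells of a taxon, given total cells and its relative abundance."""
    return total_cells * fraction


# Values from the use case; total cell counts are NOT known from sequencing alone.
shigella_a = absolute_count(10**8, 0.25)  # Sample A: 25% Shigella
shigella_b = absolute_count(10**6, 0.50)  # Sample B: 50% Shigella

if __name__ == "__main__":
    print(f"Sample A Shigella cells: {shigella_a:,.0f}")
    print(f"Sample B Shigella cells: {shigella_b:,.0f}")
    print(f"A has {shigella_a / shigella_b:.0f}x more Shigella than B")
```

Sample B has the larger proportion, yet Sample A holds far more Shigella cells, which is why only statements about proportions are supported by relative abundance data.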
18. What do we mean by function?
• General categories
– Photosynthesis
– Nitrogen metabolism
– Glycolysis
• Specific gene families
– nifH
– EC: 1.1.1.1 (alcohol dehydrogenase)
– K00929 (butyrate kinase)
19. Various Functional Databases
• COG
– Well known, but the original classification has not been updated since 2003
• SEED
– Used by the RAST and MG-RAST systems
• PFAM
– Focused more on protein domains
• EggNOG
– Very comprehensive (~190k groups)
• UniRef
– Has clustering at different levels (e.g. UniRef100, UniRef90, UniRef50)
– Most comprehensive and is constantly updated
• KEGG
– Very popular, each entry is well annotated, and often linked into “Modules” or “Pathways”
– Full access now requires a license fee
• MetaCyc
– Becoming more widely used.
– More microbe focused than KEGG
20. KEGG
• We will focus on using the
KEGG database during this
workshop
• KEGG Orthologs (KOs)
– Most specific. Thought to be
homologs and doing the same
exact “function”
– ~12,000 KOs in the database
– These can be linked into KEGG
Modules and KEGG Pathways
– Identifiers: K01803, K00231, etc.
21. KEGG (cont.)
• KEGG Modules
– Manually defined functional units
– Small groups of KOs that function together
– ~750 KEGG Modules
– Identifiers: M00002, M00011, etc.
22. KEGG (cont.)
• KEGG Pathways
– Groups KOs into large pathways (~230)
– Each pathway has a graphical map
– Individual KOs or Modules can be
highlighted within these maps
– Pathways can be collapsed into very
general functional terms (e.g. Amino Acid
Metabolism, Carbohydrate Metabolism,
etc.)
23. Metagenomic Annotation Systems
• Web-based
– Provide functional and taxonomic analysis, plus hosts your data.
– EBI Metagenomics Server
– MG-RAST
– IMG/M
• GUI based
– MEGAN
• Taxonomy and functional annotation
– CloVR
• Virtual Machine based, contains SOP, hasn’t been updated recently
• Command-line based
– MetAMOS
• Built-in assembly, highly customizable, some features can be buggy
– HUMAnN
• Functional annotation
– DIY
• Set up your own in-house custom computational pipeline
25. HUMAnN Step 1
• Reads are searched against a protein database (e.g. KEGG)
– Can use BLASTX, but much faster methods are now available (e.g. BLAT,
USEARCH, RapSearch2, DIAMOND; Buchfink et al., 2015)
27. HUMAnN Step 2
• Normalize and weight search results
• The relative abundance of each KO is
calculated:
– Number of reads mapping to a gene sequence in
that KO
– Weighted by the inverse p-value of each mapping
– Normalized by the average length of the KO
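A rough sketch of this weighting scheme follows. It is not HUMAnN's code: reading "inverse p-value" as a weight of (1 - p) per hit is an assumption, as are the function name, the input layout, and the final rescaling so that abundances sum to 1. The KO lengths in the example are invented.

```python
from collections import defaultdict


def ko_relative_abundance(hits, ko_mean_length):
    """Estimate relative KO abundance from read hits.

    hits: iterable of (ko_id, p_value), one entry per read mapping.
    ko_mean_length: dict of ko_id -> average length of sequences in that KO.
    """
    weighted = defaultdict(float)
    for ko, p_value in hits:
        weighted[ko] += 1.0 - p_value  # ASSUMPTION: "inverse p-value" taken as 1 - p
    # Longer gene families recruit more reads, so divide by average length.
    normalized = {ko: w / ko_mean_length[ko] for ko, w in weighted.items()}
    total = sum(normalized.values())
    if total == 0:
        return {}
    return {ko: value / total for ko, value in normalized.items()}


if __name__ == "__main__":
    hits = [("K01803", 1e-10), ("K01803", 1e-8), ("K00231", 1e-12)]
    lengths = {"K01803": 750.0, "K00231": 1400.0}  # hypothetical average lengths
    for ko, abundance in ko_relative_abundance(hits, lengths).items():
        print(ko, round(abundance, 3))
```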
29. HUMAnN Step 3
• Reduce number of pathways
• A KO can map to one or more KEGG Pathways
– Just because a KO is found in a pathway doesn’t mean
that complete pathway exists in the community
– If a pathway has 20 KOs and only 2 KOs are observed
in the community (but at high abundances) what
should be the abundance of the pathway?
– MinPath (Ye, 2009) attempts to estimate the
abundance of these pathways and remove spurious
noise
31. HUMAnN Step 4
• Reduce false positive pathways further and
normalize by KO copy number
• Using the organism information from the KEGG
hits
– Pathways that are not found to be in any of the
observed organisms AND are made up mostly of KOs
mapping to a different pathway are removed
– KO abundance can be divided by the estimated copy
number of that KO as observed from the KEGG
organism database
33. HUMAnN Step 5
• Smoothing pathways by gap filling
– Low sequencing depth or poor sequence searches
could lead to some KOs within pathways being
absent or in low abundance
– KOs more than 1.5 interquartile ranges below the
pathway median are raised to the pathway
median
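The gap-filling rule can be written out as a short sketch. It assumes the KO abundances for one pathway are available as a dictionary; the function name, the KO labels, and the use of Python's default quartile method are illustrative choices, not details taken from HUMAnN itself.

```python
from statistics import median, quantiles


def smooth_pathway(ko_abundance):
    """Raise outlying low KO abundances in one pathway to the pathway median.

    A KO is treated as a gap if it lies more than 1.5 interquartile ranges
    below the median of all KO abundances in the pathway.
    """
    values = list(ko_abundance.values())
    if len(values) < 2:
        return dict(ko_abundance)  # quartiles are undefined for a single value
    pathway_median = median(values)
    q1, _, q3 = quantiles(values, n=4)
    cutoff = pathway_median - 1.5 * (q3 - q1)
    return {ko: (pathway_median if value < cutoff else value)
            for ko, value in ko_abundance.items()}


if __name__ == "__main__":
    # Hypothetical pathway: five KOs at similar abundance, one missing.
    pathway = {"KO_a": 10, "KO_b": 11, "KO_c": 12,
               "KO_d": 13, "KO_e": 14, "KO_gap": 0}
    print(smooth_pathway(pathway))
```

Only the outlying low value is changed; KOs already near the pathway median are left as observed.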
35. What about assembly?
• Assembly is often
used in genomics to
join raw reads into
longer contigs and
scaffolds
[Figure: “Genome assembly stitches together a genome” (credit: Michael Schatz, Cold Spring Harbor)
1. Fragment DNA and sequence
2. Find overlaps between reads
…AGCCTAGACCTACAGGATGCGCGACACGT
GGATGCGCGACACGTCGCATATCCGGT…
3. Assemble overlaps into contigs
4. Assemble contigs into scaffolds]
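The "find overlaps between reads" step can be illustrated with the two read fragments shown in the assembly figure. This is a naive suffix-prefix search written only for illustration (the function names and `min_len` default are assumptions); real assemblers use indexed data structures and tolerate sequencing errors.

```python
def suffix_prefix_overlap(left, right, min_len=5):
    """Length of the longest suffix of `left` that equals a prefix of `right`."""
    max_len = min(len(left), len(right))
    for size in range(max_len, min_len - 1, -1):
        if left.endswith(right[:size]):
            return size
    return 0


def merge_reads(left, right, min_len=5):
    """Join two reads into one contig if they overlap by at least `min_len` bases."""
    size = suffix_prefix_overlap(left, right, min_len)
    if size == 0:
        return None
    return left + right[size:]


if __name__ == "__main__":
    left = "AGCCTAGACCTACAGGATGCGCGACACGT"
    right = "GGATGCGCGACACGTCGCATATCCGGT"
    print(suffix_prefix_overlap(left, right))
    print(merge_reads(left, right))
```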
36. Assembly for Metagenomics?
• Pros
– Less computation time for similarity search (sequences are collapsed)
– Can allow annotation when reads are too short (<100bp)
– Can sometimes (partially) reconstruct genomes
• Cons
– Assembly is computationally intensive (high memory machines
needed)
– Collapsed reads must be added back to get relative abundances (not
all assemblers do this natively)
– Low read depth and high diversity can cause assemblers to fail
– Reads are not all from the same genome so chimeras are possible
– Some organisms/genes will assemble easier (e.g. more abundant)
which could lead to annotation bias
37. What about gene calling?
• In genomics, normally you would predict the start and stop
positions of genes using a gene prediction program before
annotating the genes
• In metagenomics:
– Pros:
• May result in fewer false positives from annotating “non-real” genes
• Lowers the number of similarity searches
– Cons
• Computationally intensive
• No good learning dataset
• Raw reads will not cover an entire gene
• Often requires assembled data
– Possible tools: FragGeneScan, MetaGeneAnnotator
– Alternative: Do 6 frame-translation (e.g. BLASTX)
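Six-frame translation, the alternative to gene calling mentioned above, can be sketched as follows. The sketch uses the standard genetic code and marks codons containing ambiguous bases as "X"; the function names and frame labels ("+1" to "-3") are illustrative, and BLASTX performs this step internally rather than exposing it like this.

```python
from itertools import product

# Standard genetic code, with codons enumerated in TCAG order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(codon): aa
               for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.translate(COMPLEMENT)[::-1]


def translate(seq):
    """Translate complete codons; unknown codons become 'X', stops become '*'."""
    return "".join(CODON_TABLE.get(seq[i:i + 3], "X")
                   for i in range(0, len(seq) - 2, 3))


def six_frame_translation(seq):
    """Translate a read in three forward and three reverse frames."""
    seq = seq.upper()
    frames = {}
    for strand, strand_seq in (("+", seq), ("-", reverse_complement(seq))):
        for offset in range(3):
            frames[f"{strand}{offset + 1}"] = translate(strand_seq[offset:])
    return frames


if __name__ == "__main__":
    for frame, protein in six_frame_translation("ATGGCCTAA").items():
        print(frame, protein)
```

Each read yields six candidate peptides, which is why this route avoids gene prediction at the cost of more similarity searches.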
38. Community Function Potential
• It is important to remember that this is metagenomics,
not metatranscriptomics or metaproteomics
• These annotations suggest the functional
potential of the community
• The presence of these genes/functions does not
mean that they are biologically active (e.g. may
not be transcribed)
43. Predicting the abundance of a single function
[Figure labels: Known gene abundance; Ancestral gene abundance; Predicted gene abundance]
44. Predicting the abundance of a single function
[Figure labels: Known gene abundance; Ancestral gene abundance; Predicted gene abundance]
• Repeat for each function (~8000X)
• Repeat for all unknown tips (>100,000)
52. Visualization and Statistics
• Various tools are available to determine
statistically significant taxonomic differences
across groups of samples
– Excel
– SigmaPlot
– Past
– R (many libraries)
– Python (matplotlib)
– STAMP
56. STAMP
• Input
1. “Profile file”: Table of features (samples by OTUs,
samples by functions, etc.)
• Features can form a hierarchy (e.g. Phylum, Class, Order,
etc.) to allow data to be collapsed within the program
2. “Group file”: Contains different metadata for
grouping samples
• Can be two groups (e.g. Healthy vs Sick) or multiple groups
(e.g. Water depth at 2 m, 4 m, and 6 m)
• Output
– PCA, heatmap, box, and bar plots
– Tables of significantly different features