GFP Workshop
Undergraduate Bioinformatics Club (UBIC) at UCSD
Alexander Niema Moshiri
Green Fluorescent Protein:
Origins
 Green Fluorescent Protein (GFP) is a naturally occurring
protein in a species of jellyfish, Aequorea victoria
 When excited by blue or ultraviolet light, GFP
fluoresces a green color
A fluorescent Aequorea victoria
Green Fluorescent Protein:
A Brief History of wtGFP
 GFP has been studied since the 1960s
 However, its utility for molecular biologists was
not realized until the 1990s
 In 1992, Douglas Prasher cloned and
sequenced the wild-type GFP (wtGFP) gene
 “Wild-type” = Natural
 Prasher proposed using GFP as a biochemical
tracer that allows us to look at the inner
workings of cells
Douglas Prasher
Green Fluorescent Protein:
Recombination of wtGFP
 The lab of Martin Chalfie expressed wtGFP in E. coli and
C. elegans
 To their surprise, wtGFP was able to glow in both
species without needing any jellyfish cofactors
C. elegans expressing wtGFP
Green Fluorescent Protein:
Bioengineered
 In 1995, by changing a single amino
acid, Roger Tsien engineered the first
improved mutant of GFP with
increased fluorescence and
photostability
 Tsien shared the 2008 Nobel Prize in Chemistry (with
Osamu Shimomura and Martin Chalfie) for his GFP work
 He is currently a professor at UCSD
 Further improvements to GFP were
made over the next few years
Roger Tsien
Green Fluorescent Protein:
Current State of Mutants
 Today, many more derivatives have
been created from GFP and DsRed (a
red fluorescent protein)
 Researchers have access to a range
of colors, including green, yellow,
orange, red, violet, blue, and cyan
An illustration of a San Diego beach scene
drawn using 8 colors of FPs
Rainbow of FPs from the Tsien lab
Green Fluorescent Protein:
Experimental Uses
 We mentioned before that FPs can be used to track
cellular processes
 Researchers can attach an FP to a molecule of interest
(e.g. by fusing it to a protein) and then follow it visually
Mice expressing GFP next to normal mice
GFP-expressing neurons
Protein Data Bank:
A Brief Overview
 The Protein Data Bank (PDB) is a
repository of 3D structural data
of large biological molecules
(e.g. proteins and nucleic acids)
 This structural data can be
downloaded and used to render
a 3D image of the molecule of
interest
3D rendering of GFP from PDB data
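As a rough illustration of what "structural data" means here, the sketch below reads atom coordinates from text in the legacy fixed-width PDB file format and computes their centroid. The helper names (`pdb_download_url`, `format_atom_line`, `parse_atoms`, `centroid`) are hypothetical, the download URL pattern is an assumption about the current RCSB file server, and the coordinates in the example are made up rather than taken from entry 4KW4.

```python
from typing import NamedTuple


class Atom(NamedTuple):
    serial: int
    name: str
    res_name: str
    chain_id: str
    res_seq: int
    x: float
    y: float
    z: float


def pdb_download_url(pdb_id: str) -> str:
    # Assumed URL pattern for legacy PDB-format files; verify against RCSB docs.
    return f"https://files.rcsb.org/download/{pdb_id.upper()}.pdb"


def format_atom_line(serial, name, res_name, chain_id, res_seq, x, y, z) -> str:
    # Builds a fixed-width ATOM record so the example data has the right columns.
    return (f"ATOM  {serial:5d}  {name:<3} {res_name:>3} {chain_id}{res_seq:4d}    "
            f"{x:8.3f}{y:8.3f}{z:8.3f}")


def parse_atom_line(line: str) -> Atom:
    # Column positions follow the legacy PDB format (0-based slices).
    return Atom(
        serial=int(line[6:11]),
        name=line[12:16].strip(),
        res_name=line[17:20].strip(),
        chain_id=line[21],
        res_seq=int(line[22:26]),
        x=float(line[30:38]),
        y=float(line[38:46]),
        z=float(line[46:54]),
    )


def parse_atoms(pdb_text: str) -> list:
    # Keep only ATOM records; other record types (HEADER, END, ...) are skipped.
    return [parse_atom_line(l) for l in pdb_text.splitlines() if l.startswith("ATOM  ")]


def centroid(atoms) -> tuple:
    # Average position of all atoms, in Angstroms.
    n = len(atoms)
    return (sum(a.x for a in atoms) / n,
            sum(a.y for a in atoms) / n,
            sum(a.z for a in atoms) / n)


if __name__ == "__main__":
    # Made-up coordinates, for illustration only.
    text = "\n".join([
        format_atom_line(1, "N", "MET", "A", 1, 1.0, 2.0, 3.0),
        format_atom_line(2, "CA", "MET", "A", 1, 3.0, 4.0, 5.0),
        "END",
    ])
    print(pdb_download_url("4kw4"))
    print(centroid(parse_atoms(text)))
```

A 3D viewer does essentially this at a larger scale: it reads every atom's position and draws the molecule from those coordinates.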
Protein Data Bank:
Step 1: Querying the PDB
 Open Mozilla Firefox and navigate to www.rcsb.org
 The search box on the top of the page allows you to
“Search by PDB ID, author, macromolecule, sequence,
or ligands”
 Search for the term Green Fluorescent Protein and hit
“Go”
 Scroll down and click on entry 4KW4: “Crystal Structure
of Green Fluorescent Protein”
Protein Data Bank:
Step 2: Questions About Results
 Who are the authors of the primary citation for 4KW4?
 What organism is this protein from?
 How long (in amino acids) is this protein?
 What method was used to produce this entry’s data?
 What is the resolution in Angstroms (Å)?
Protein Data Bank:
Step 3: Rendering 3D Structure
 Return to the PDB homepage: www.rcsb.org
 In the left-column panel, click “Visualize”
 In the box that says “Enter a PDB ID”, enter 4KW4 and
click “View Jmol”
 You should see a 3D rendering of GFP
 You can click and drag the 3D render to rotate it
Protein Data Bank:
Step 4: Display Customization
 Under “Select Display Mode,” click “Custom View”
 Cycle through the different Style options and choose
your favorite
 My personal favorite is the default, Cartoon
 Cycle through the different Color options
 You can also change the color(s) by Right-Clicking on the
3D render, going to Color, then Structures, then Cartoon
(assuming you’re still in Cartoon style), and choosing a
color
 You can also go to Color → Structures → Cartoon → By
Scheme and choose one of those options
Protein Data Bank:
Step 5: Exporting 3D Image
 Finish customizing the 3D image to your liking
 Feel free to play with the other options in the menu that
pops up when you Right-Click on the 3D image
 If you want to revert to the original settings, just refresh
the page and it will reload with the default settings
 When you are ready to export the final image, just click
the blue “Export 3D Image” button, specify a
destination, and click “Save”
 Enjoy your cool 3D image of GFP!
Multiple Sequence Alignment:
The FASTA Format
 The FASTA format is a text-based format for
representing DNA, RNA, or Protein sequences
 A sequence in the FASTA format begins with a single-line
description (beginning with the ‘>’ character), followed
by line(s) of sequence data
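The format described above is simple enough to read with a few lines of code. The following is a minimal sketch, assuming each description line is unique; `parse_fasta` is a hypothetical helper name, and the two records are short made-up examples, not the contents of the workshop's file.

```python
def parse_fasta(text: str) -> dict:
    """Return a mapping from each description line (without '>') to its sequence."""
    records = {}
    header = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue  # ignore blank lines
        if line.startswith(">"):
            header = line[1:].strip()
            if header in records:
                raise ValueError(f"duplicate description line: {header}")
            records[header] = []
        elif header is None:
            raise ValueError("sequence data found before the first '>' line")
        else:
            records[header].append(line)  # sequence may span several lines
    return {h: "".join(parts) for h, parts in records.items()}


# Made-up records for illustration only.
EXAMPLE = """>example_1 first made-up protein
MSKGE
ELFTG
>example_2 second made-up protein
MVSKGEE
"""

if __name__ == "__main__":
    for description, sequence in parse_fasta(EXAMPLE).items():
        print(description, len(sequence), sequence)
```

Note that the sequence for a record is the concatenation of all lines up to the next '>' line, which is why the first record above spans two lines.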
Multiple Sequence Alignment:
Sequence Alignment
 A sequence alignment is a way of arranging biological
sequences (DNA, RNA, or Protein) to identify regions of
similarity between the sequences
 Gaps can be inserted between characters in the
sequences so that identical or similar characters can be
aligned in the same column
An example multiple sequence alignment
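To make the idea of inserting gaps concrete, here is a minimal sketch of a pairwise global alignment using the classic Needleman-Wunsch dynamic programming method. This is not how ClustalW2 aligns many sequences at once, and the scoring values (match +1, mismatch -1, gap -1) are simple assumptions chosen for illustration; real tools use substitution matrices and more elaborate gap penalties.

```python
def needleman_wunsch(a: str, b: str, match: int = 1, mismatch: int = -1, gap: int = -1):
    """Globally align two sequences; returns (aligned_a, aligned_b, score)."""
    n, m = len(a), len(b)
    # score[i][j] = best score for aligning a[:i] with b[:j]
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            diag = score[i - 1][j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            score[i][j] = max(diag, score[i - 1][j] + gap, score[i][j - 1] + gap)

    # Trace back from the bottom-right corner to recover one optimal alignment.
    out_a, out_b = [], []
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + (
            match if a[i - 1] == b[j - 1] else mismatch
        ):
            out_a.append(a[i - 1])
            out_b.append(b[j - 1])
            i, j = i - 1, j - 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1])
            out_b.append("-")  # gap in b
            i -= 1
        else:
            out_a.append("-")  # gap in a
            out_b.append(b[j - 1])
            j -= 1
    return "".join(reversed(out_a)), "".join(reversed(out_b)), score[n][m]


if __name__ == "__main__":
    # Made-up sequences: the second is the first with one residue removed.
    top, bottom, total = needleman_wunsch("MSKGEELFTG", "MSKGELFTG")
    print(top)
    print(bottom)
    print("score:", total)
```

In the printed result, the shorter sequence receives a single "-" so that the remaining characters line up column by column with the longer one.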
Multiple Sequence Alignment:
GFP and its Derivatives
 In the following activity, we will
align the sequences of GFP and some
of its derivative fluorescent proteins
 These proteins’ sequences are
provided in the file named
protein_sequences.fasta
 Using the results from the multiple
sequence alignment, we will be able
to construct a phylogenetic tree
 This tree will provide us information
about the pairwise “closeness”
between the protein sequences
Multiple Sequence Alignment:
ClustalW2
 ClustalW2 is a popular multiple sequence alignment tool
 Download protein_sequences.fasta from:
 http://ubic.ucsd.edu/gfp/
 Go to the ClustalW2 website:
 http://www.ebi.ac.uk/Tools/msa/clustalw2/
 Under “STEP 1 – Enter your input sequences”, upload
protein_sequences.fasta by clicking “Choose File”
 Under “STEP 4 – Submit your job”, click “Submit”
Multiple Sequence Alignment:
ClustalW2 (Continued)
 After you click Submit, ClustalW2 will redirect you to
the results of the multiple sequence alignment
 The IDs of the sequences are to the left of the alignment,
and each row of the alignment corresponds to a single
sequence (e.g. the first row of every chunk is
“GFP(4KW4)”)
 If the alignment doesn’t make sense to you, be sure to
ask one of the UBIC officers any questions you have!
Evolutionary Relationships:
Phylogenetic Tree
 A phylogenetic tree is a branching diagram (or “tree”)
that shows relationships of “closeness” between
different biological species or other entities
 Elements that are closer together on the tree have
“closer” (more similar) sequences
 In the ClustalW2 results page, click “Send to
ClustalW2_Phylogeny”
 On the resulting page, under “STEP 3 – Submit your
job”, click “Submit”
 Draw out the phylogenetic tree (questions will be asked
about it on the Extra Credit assignment)
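The notion of "closeness" can be sketched numerically. The example below computes a simple pairwise distance (the fraction of differing positions, skipping columns with gaps) between already-aligned sequences and reports the closest pair. This is only an illustrative assumption about how closeness might be measured; tree-building methods such as neighbor joining or UPGMA start from pairwise distances like these, but the exact calculation ClustalW2_Phylogeny performs may differ. The function names and the three aligned sequences are made up.

```python
from itertools import combinations


def p_distance(a: str, b: str) -> float:
    """Fraction of differing residues over aligned columns that contain no gap."""
    if len(a) != len(b):
        raise ValueError("aligned sequences must have the same length")
    compared = differing = 0
    for x, y in zip(a, b):
        if x == "-" or y == "-":
            continue  # skip columns where either sequence has a gap
        compared += 1
        if x != y:
            differing += 1
    if compared == 0:
        raise ValueError("no gap-free columns to compare")
    return differing / compared


def distance_table(aligned: dict) -> dict:
    """Distance for every unordered pair of sequence names."""
    return {
        (n1, n2): p_distance(aligned[n1], aligned[n2])
        for n1, n2 in combinations(aligned, 2)
    }


def closest_pair(aligned: dict) -> tuple:
    """Names of the two sequences with the smallest distance."""
    table = distance_table(aligned)
    return min(table, key=table.get)


# Made-up aligned sequences for illustration only.
ALIGNED = {
    "seq_A": "MSKGEELFTG",
    "seq_B": "MSKGE-LFTG",
    "seq_C": "MVSKGEEDNM",
}

if __name__ == "__main__":
    for pair, dist in distance_table(ALIGNED).items():
        print(pair, round(dist, 3))
    print("closest:", closest_pair(ALIGNED))
```

Sequences with a small distance would be expected to sit near each other on the tree, while a sequence that differs from all the others would branch off on its own.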
Phylogenetic Trees:
Biological Importance
 The information provided by phylogenetic trees is
extremely valuable and is even applicable to medicine
 In 1994, Richard Schmidt, an American physician, injected
his ex-lover and former colleague, Janice Trahan, with a
sample of blood from one of his HIV-positive patients,
infecting her with HIV
 HIV gene sequences were obtained from the victim, from the
putative patient source, and from thirty-two other unrelated
HIV-positive individuals
 Scientists concluded that, of all the samples they tested,
the viral sequences from the victim and the patient were the
most closely related, despite HIV's tendency to mutate very
rapidly
Phylogenetic Tree from the
HIV Court Case
GFP Workshop:
Summary
 Congratulations on finishing the GFP Workshop!
Throughout the workshop, you learned the following:
 GFP’s history and uses
 How to use the PDB (and render 3D protein structures)
 Multiple Sequence Alignment using ClustalW2
 Phylogenetic Tree Construction from a Multiple Sequence
Alignment using ClustalW2_Phylogeny
 We hope you enjoyed the workshop, and we hope you
have found an interest in the field of Bioinformatics!
