Comparative Genomics of Sterol Homeostasis

Dissertation by

Philip Allen Dy

In Partial Fulfilment of the Requirements
for the Degree of
Bachelor of Science in Biosciences
2nd May 2014

Submitted To:
Dr. Steve Meaney
School of Biological Sciences
College of Sciences and Health
Dublin Institute of Technology

Acknowledgements

Foremost, I would like to express my special appreciation and thanks to my supervisor, Dr. Steve Meaney, who has been an exceptional mentor. His guidance has made this a thoughtful and rewarding journey.

Besides my supervisor, I would like to thank the other members of the dissertation committee, Dr. Orla Howe, Dr. Celine Herra, Dr. John Kearney and Dr. Alison Malkin, for providing guidance and support on the different aspects of the dissertation.

Thanks also to Alan Lennon and Gavin Meehan for the stimulating discussions and for their input towards the completion of the dissertation.

Last but not least, my gratitude goes to my family and friends for their continuing support in the final stages of the dissertation.

Abbreviations

Abbreviation                  Description
Annu Rev Genomics Hum Genet   Annual Review of Genomics and Human Genetics
ATP                           Adenosine Triphosphate
Biochem Biophys Res Commun    Biochemical and Biophysical Research Communications
BLAST                         Basic Local Alignment Search Tool
BLAT                          BLAST-Like Alignment Tool
BMC Cancer                    BioMed Central Cancer
BMC Evol Biol                 BioMed Central Evolutionary Biology
bp                            Base pair
cDNA                          complementary DNA
Clin Biochem                  Clinical Biochemistry
CNS                           Central Nervous System
CRM                           cis-Regulatory Modules
Curr Opin Cell Biol           Current Opinion in Cell Biology
CYP46A1                       Cholesterol 24-hydroxylase
CYP7B1                        25-hydroxycholesterol 7-alpha-hydroxylase
DHEA                          Dehydroepiandrosterone
DNA                           Deoxyribonucleic Acid
DOE                           Department of Energy
ECR                           Evolutionary Conserved Regions
ENCODE                        Encyclopedia Of DNA Elements
Exp Biol Med                  Experimental Biology and Medicine
Genome Res                    Genome Research
HGNC                          HUGO Gene Nomenclature Committee
HMG-CoA                       3-hydroxy-3-methylglutaryl-coenzyme A
HMGCR                         3-hydroxy-3-methyl-glutaryl-CoA reductase
HMM                           Hidden Markov Model
J Am Coll Nutr                Journal of the American College of Nutrition
J Biol                        Journal of Biology
J Biol Chem                   Journal of Biological Chemistry
J Clin Invest                 Journal of Clinical Investigation
J Mol Biol                    Journal of Molecular Biology
J Neurochem                   Journal of Neurochemistry
J Neurosci                    Journal of Neuroscience
LDF                           Linear Discriminant Function
Methods Mol Biol              Methods in Molecular Biology
Mol Syst Biol                 Molecular Systems Biology
mRNA                          messenger RNA
MULAN                         Multiple sequence Local Alignment
NADPH                         Nicotinamide Adenine Dinucleotide Phosphate
Nat Genet                     Nature Genetics
Nat Rev Mol Cell Biol         Nature Reviews Molecular Cell Biology
NCBI                          National Center for Biotechnology Information
NFY                           Nuclear Factor Y
NHGRI                         National Human Genome Research Institute
NIH                           National Institutes of Health
Nucleic Acids Res             Nucleic Acids Research
PLoS Genet                    PLoS Genetics
PNS                           Peripheral Nervous System
Prog Neurobiol                Progress in Neurobiology
RNA                           Ribonucleic Acid
SP1                           Specificity Protein 1
SREBP                         Sterol Regulatory Element-Binding Proteins
TFBS                          Transcription Factor Binding Sites
tRNA                          transfer RNA
TSS                           Transcription Start Site
UV                            Ultraviolet

List of Figures

Figure 1: Cholesterol Biosynthesis Pathway
Figure 2: Intermembrane cholesterol regulation via sterol regulatory element-binding proteins (SREBPs)
Figure 3: Cytogenetic location of CYP7B1 gene.
Figure 4: ENSEMBL homepage.
Figure 5: Search results for the CYP7B1 gene.
Figure 6: Results for the CYP7B1 Transcript.
Figure 7: Summary of the CYP7B1-001 Transcript.
Figure 8: Choices of different configurations for data extraction.
Figure 9: Fasta sequence of the human CYP7B1 gene.
Figure 10: The species to be sequenced are organised into folders.
Figure 11: Mulan homepage which can be accessed at http://mulan.dcode.org.
Figure 12: Homepage for the sequences to be applied.
Figure 13: Summary page of results of sequences aligned.
Figure 14: Dynamic visualisation profile in standard stacked-pairwise structure.
Figure 15: Dynamic visualisation profile in color density by interspecies conservation configuration.
Figure 16: Summary of conservation of the sequences inputted.
Figure 17: Phylogenetic tree result of the sequences submitted.
Figure 18: multiTF homepage within Mulan.
Figure 19: Results summary of multiTF profile.
Figure 20: zPicture homepage for the sequences to be applied.
Figure 21: Results page of the sequences differentiated.
Figure 22: Dynamic visualisation profile of ECR of species of interest.
Figure 23: Spidey homepage.
Figure 24: Summary results of the sequence inputted.
Figure 25: FPROM homepage
Figure 26: Summary page of results of the position of the predicted promoter regions.
Figure 27: Mulan conservation analysis for the human CYP7B1 gene from 159.5kb to 212.6kb compared with guinea pig, elephant, cat, rabbit, mouse, cow, dog, armadillo, horse and dolphin. The conservation profile depicts the differential prediction of the non-coding ECRs in different species (the legend on the left describes colouring of different type of elements).
Figure 28: Mulan conservation analysis for the human CYP7B1 gene from 0kb to 53.2kb compared with guinea pig, elephant, cat, rabbit, mouse, cow, dog, armadillo, horse and dolphin.
Figure 29: Mulan conservation analysis for the human CYP7B1 gene from 53.2kb to 106.3kb compared with guinea pig, elephant, cat, rabbit, mouse, cow, dog, armadillo, horse and dolphin.
Figure 30: Mulan conservation analysis for the human CYP7B1 gene from 106.3kb to 159.5kb compared with guinea pig, elephant, cat, rabbit, mouse, cow, dog, armadillo, horse and dolphin.
Figure 31: Mulan conservation analysis for the human CYP7B1 gene from 159.9kb to 212.6kb compared with guinea pig, elephant, cat, rabbit, mouse, cow, dog, armadillo, horse and dolphin.
Figure 32: Phylogenetic tree represents the evolutionary history of the CYP7B1 gene in different vertebrate lineages (with the numbers corresponding to the number of nucleotide mismatches per kb).
Figure 33: Phylogenetic shadowing profile of the human and 10 other species comparison (100bp/70%) identity threshold.
Figure 34: multiTF identification of conserved TFBS of the CYP7B1 gene. The identified TFBS are depicted as coloured tick marks above the conservation profile.
Figure 35: The potential transcription binding sites found indicated within the CYP7B1 gene.

Table of Contents

Abstract
1.0 Introduction
    1.1 Genetics
    1.2 Genomics
    1.3 Comparative Genomics
    1.4 Cholesterol Biosynthesis Pathway
    1.5 Study of Cholesterol Biosynthesis
    1.6 CYP7B1 Gene
    1.7 Objectives
2.0 Materials and Methods
    2.1 Methodology Flow Chart
    2.2 Materials
        2.2.1 Software
    2.3 Methods
        2.3.1 Retrieval of Sequences
        2.3.2 Generating Alignments
3.0 Results
    3.1 Generating and Visualising DNA Alignments: Mulan
    3.2 Phylogenetic Shadowing Implemented in Mulan
    3.3 Detection of Evolutionary Conserved TFBS: multiTF
    3.4 Exon and Promoter Region Using Spidey and FPROM
4.0 Discussion
    4.1 Genetics and Sequencing
    4.2 How Cholesterol Plays a Role in CYP7B1
    4.3 Problems and Challenges in Bioinformatics
    4.4 Problems with Online Tools
    4.5 Summary of Results
    4.6 Future of Comparative Genomics
Bibliography
Appendix I

Abstract
Comparative genomics provides the means to delimit functional regions in anonymous DNA sequences. The successful application of this method is currently shifting to deciphering the non-coding encryption of gene regulation across genomes. The CYP7B1 gene transcripts of eleven mammals (human, armadillo, cat, cow, dog, dolphin, elephant, guinea pig, horse, mouse, and rabbit) have been sequenced. To facilitate practical application of comparative sequence analysis to genetics and genomics, several analytical and visualisation tools are used for the analysis of arbitrary sequences and whole genomes. These tools include Mulan, an alignment tool that also produces a phylogenetic tree; multiTF, an evolutionary transcription factor analysis tool; Spidey, an exon prediction tool; and FPROM, a human promoter prediction tool. The overall goal of this research is to determine whether the CYP7B1 gene shares a common relationship, in terms of function and structure, across the species under investigation.

1.0 Introduction
1.1 Genetics
Genetics is the study of heredity and hereditary variation. A monk named Gregor Mendel documented a particulate mechanism for inheritance in the mid-19th century (Weiling, 1991). He observed that organisms inherit traits by way of discrete 'units of inheritance', an early and somewhat obscure description of what is now called a gene. Genes are made from a long molecule called deoxyribonucleic acid (DNA). DNA is a polymer of monomer subunits (nucleotides), each made up of a sugar, a base and a phosphate group, and its sequence codes for thousands of different kinds of proteins. The sugar in DNA is 2'-deoxyribose, a five-carbon (pentose) sugar. There are four types of nucleobases: adenine (A), thymine (T), cytosine (C) and guanine (G). Adenine and guanine have two nitrogen-containing rings and are purines. Thymine and cytosine have a single nitrogen-containing ring and are pyrimidines. The bases are attached to the 1'-carbon of the sugar, deoxyribose. The observation that the same bases are present in the genomes of different species led to the concept that the sequence of bases is the form in which genetic information is carried. The power of DNA sequencing as a research tool has triggered the dramatic advancement of DNA sequencing technology since the discovery of the DNA double helical structure model by Watson and Crick in 1953 (Watson & Crick, 1953), allowing even more genomes to be sequenced and making comparative genomics an accessible focal point for the study of any form of life.

1.2 Genomics
Genomics focuses on the study of genome structure and function. Bioinformatics is a branch of science that uses computational approaches to solve genomics problems (Lesk, 2008). Bioinformatics develops databases to store, retrieve, organise and analyse relationships between biological data sets. The central dogma of molecular biology is that the sequence specifies the function, i.e.,
DNA → Ribonucleic Acid (RNA) → Protein

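The flow of information described above can be illustrated with a minimal Python sketch. It is a toy example only: the codon table is deliberately partial, the names `CODON_TABLE`, `transcribe` and `translate` are hypothetical, and the input is assumed to be the coding strand of the DNA.

```python
# Toy illustration of DNA -> RNA -> protein.
# Assumption: only a handful of codons are listed; a real table has 64 entries.
CODON_TABLE = {
    "AUG": "M", "UUU": "F", "UUC": "F", "GGA": "G", "GGC": "G",
    "GCU": "A", "GCC": "A", "AAA": "K", "AAG": "K",
    "UAA": "*", "UAG": "*", "UGA": "*",  # stop codons
}


def transcribe(coding_strand: str) -> str:
    """Return the mRNA matching the DNA coding strand (T replaced by U)."""
    return coding_strand.upper().replace("T", "U")


def translate(mrna: str) -> str:
    """Translate codon by codon from the first AUG until a stop codon."""
    start = mrna.find("AUG")
    if start == -1:
        return ""
    protein = []
    for i in range(start, len(mrna) - 2, 3):
        amino_acid = CODON_TABLE.get(mrna[i:i + 3], "X")  # X = codon not in toy table
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


example_dna = "ATGTTTGGAGCCAAATAA"
example_rna = transcribe(example_dna)      # AUGUUUGGAGCCAAAUAA
example_protein = translate(example_rna)   # MFGAK
```
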
Frederick Sanger and colleagues (Sanger, 1981), and Alan Maxam and Walter Gilbert, proposed methods for rapid sequencing of DNA molecules in the mid-1970s. Geneticists use two approaches to genome sequencing (Reece et al., 2011):
• Map-based sequencing: starts with vectors that can accommodate large fragments of an organism's genome, which are then ordered into genetic maps using recombinational analysis. The National Institutes of Health (NIH) and the Department of Energy (DOE) chose the map-based sequencing method for the Human Genome Project (HGP).
• Shotgun sequencing: made possible by advances in sequencing technology and by the development of assembly software; the DNA is broken up into shorter fragments, which are sequenced and then assembled into a continuous sequence.

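The assembly step of shotgun sequencing can be sketched as follows. This is a simplified greedy approach assuming error-free reads from one strand; it is not the algorithm of any particular assembler, and the function names and the minimum overlap of 3 bases are illustrative assumptions.

```python
def overlap(left: str, right: str, min_len: int) -> int:
    """Length of the longest suffix of `left` that equals a prefix of `right`."""
    for size in range(min(len(left), len(right)), min_len - 1, -1):
        if left.endswith(right[:size]):
            return size
    return 0


def greedy_assemble(fragments: list[str], min_len: int = 3) -> list[str]:
    """Repeatedly merge the pair of fragments with the longest overlap."""
    contigs = list(dict.fromkeys(fragments))  # drop exact duplicates, keep order
    while len(contigs) > 1:
        best_size, best_i, best_j = 0, -1, -1
        for i, left in enumerate(contigs):
            for j, right in enumerate(contigs):
                if i != j:
                    size = overlap(left, right, min_len)
                    if size > best_size:
                        best_size, best_i, best_j = size, i, j
        if best_size == 0:
            break  # no remaining overlaps; leave the contigs separate
        merged = contigs[best_i] + contigs[best_j][best_size:]
        contigs = [c for k, c in enumerate(contigs) if k not in (best_i, best_j)]
        contigs.append(merged)
    return contigs


reads = ["ATGGCGT", "GCGTACCA", "ACCATTGA"]
assembled = greedy_assemble(reads)  # one contig: ATGGCGTACCATTGA
```

Real assemblers must also cope with sequencing errors, repeats and reads from both strands, which this sketch ignores.
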
The International Human Genome Sequencing Consortium published the first draft of the HGP in the journal Nature in February 2001, with the sequence of the entire genome's three billion base pairs almost 90% complete. The full sequence was completed and published in April 2003 (http://www.genome.gov/11006929).

After the completion of the HGP, the National Human Genome Research Institute (NHGRI) launched the Encyclopedia of DNA Elements (ENCODE) Consortium in September 2003. It aims to identify all functional elements in the human genome, particularly within the remaining component regarded as 'junk' (DNA thought to have no biological function and not to be transcribed).

Projects such as the '1000 Genomes Project' were made possible by the reduction in costs brought about by next-generation sequencing technologies ("next-gen" sequencing platforms). The 1000 Genomes Project is the first project to sequence the genomes of a large number of anonymous participants from different ethnic groups, providing a comprehensive resource on human genetic variation. The project has announced the sequencing of 1,092 genomes (McVean, 2012).

1.3 Comparative Genomics
Comparative genomics is a field of biological research in which the genome sequences of different species are compared computationally (Touchman, 2010) to gain insights into how genomes and genes evolved and to assess the functional significance of genome components (Strachan & Read, 2011). A simple comparison of general genome features, such as genome size, number of genes and chromosome number, presents an entry point into comparative genomic analysis. Comparative genomics relies on the comparison of sequences at the genome scale (Strachan & Read, 2011). It begins with powerful computer programs that identify homologous regions within the genomes under comparison.

The analysis of individual genome sequences gives much insight into genome structure but less into genome function. One big challenge for the next phase of genomics is to distinguish functional DNA and assign a role to it (Collins et al., 2003). Only a small fraction of the genome (5%) consists of protein-coding sequence (Korf, 2007; Lesk, 2008). The average human gene is approximately 30 kilobases (kb) long, and the total number of genes is far lower than the 100,000 initially predicted in the early 1990s (Korf, 2007). It is only a modest increase over flies and worms, which have 13,600 genes (50% of human equivalence) and 19,000 genes (40% of human equivalence), respectively (Howard Hughes Medical Institute, 2001).

Functional sequences are subject to evolutionary selection, which can leave a signature, in the form of regions of similarity, in the aligned sequences. Comparing sequences can find these signatures of selection, and so putative functional sequences can be deduced (Miller et al., 2004). Pairwise sequence alignment, the identification of residue-residue correspondences between two genomic sequences, is the basic tool of bioinformatics. By contrast, multiple sequence alignment is an alignment of three or more biological sequences, which gives a more reliable assessment of similarity (Zvelebil & Baum, 2008). Multiple sequence alignments can then be used to detect evolutionary conserved regions (ECRs) in the sequence. These conserved regions may act as potential transcription factor binding sites (cis-regulatory elements) in regulating gene expression (Lu, 2011); cis-regulatory modules (CRMs) include promoters, enhancers, silencers, and insulators or boundary elements (Miller et al., 2004).

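One simple way to flag candidate conserved regions in a pairwise alignment is a sliding window with an identity threshold. The sketch below is an illustration only and is not the algorithm used by Mulan or any other tool named here. The function name is hypothetical, the defaults (100 bp window, 70% identity) mirror the threshold quoted for the phylogenetic shadowing profile, and the input is assumed to be two already-aligned sequences of equal length with `-` marking gaps. Coordinates are alignment columns, not positions in the ungapped reference.

```python
def conserved_regions(ref: str, other: str, window: int = 100,
                      min_identity: float = 0.70) -> list[tuple[int, int]]:
    """Return merged [start, end) column intervals whose windows pass the threshold."""
    if len(ref) != len(other):
        raise ValueError("aligned sequences must have equal length")
    # 1 where the two aligned residues are identical and not gaps, else 0
    matches = [int(a == b and a != "-") for a, b in zip(ref.upper(), other.upper())]
    regions: list[tuple[int, int]] = []
    if len(matches) < window:
        return regions
    current = sum(matches[:window])
    for start in range(len(matches) - window + 1):
        if start > 0:
            # slide the window one column to the right
            current += matches[start + window - 1] - matches[start - 1]
        if current / window >= min_identity:
            end = start + window
            if regions and start <= regions[-1][1]:
                regions[-1] = (regions[-1][0], end)  # extend the overlapping region
            else:
                regions.append((start, end))
    return regions


# Small demonstration with a 10-column window instead of 100
aligned_ref = "ACGTACGTAC" + "TTTTTTTTTT"
aligned_other = "ACGTACGAAC" + "GGGGGGGGGG"
demo_regions = conserved_regions(aligned_ref, aligned_other, window=10, min_identity=0.7)
```
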
Pairwise sequence alignments can be carried out using online comparative genomics tools such as the Basic Local Alignment Search Tool (BLAST) and the BLAST-Like Alignment Tool (BLAT).

BLAST, implemented at the National Center for Biotechnology Information (NCBI) (http://blast.ncbi.nlm.nih.gov/Blast.cgi), is regularly used via a web interface to compare a query sequence rapidly against a database or library of sequences (Altschul et al., 1990). BLAST uses a heuristic method that looks for short matches between two sequences and attempts to extend them into regions of similarity. It also provides statistical information about each alignment, namely the expect value (E-value), the number of equally good matches expected by chance (Ye et al., 2006).
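
The idea of looking for short matches and extending them can be sketched as below. This is a heavily simplified illustration and not NCBI's implementation: real BLAST scores the extension with a substitution matrix and a drop-off threshold and then computes E-values, none of which is attempted here. The function names and the word size of 4 are assumptions.

```python
def find_seeds(query: str, subject: str, word_size: int = 4) -> list[tuple[int, int]]:
    """Return (query_pos, subject_pos) pairs where a word of `word_size` matches exactly."""
    words: dict[str, list[int]] = {}
    for q in range(len(query) - word_size + 1):
        words.setdefault(query[q:q + word_size], []).append(q)
    seeds = []
    for s in range(len(subject) - word_size + 1):
        for q in words.get(subject[s:s + word_size], []):
            seeds.append((q, s))
    return seeds


def extend_seed(query: str, subject: str, q: int, s: int,
                word_size: int = 4) -> tuple[int, int, int]:
    """Extend a seed in both directions, without gaps, while residues keep matching."""
    left = 0
    while q - left > 0 and s - left > 0 and query[q - left - 1] == subject[s - left - 1]:
        left += 1
    right = word_size
    while (q + right < len(query) and s + right < len(subject)
           and query[q + right] == subject[s + right]):
        right += 1
    return q - left, s - left, left + right  # query start, subject start, match length


query_seq = "GATTACAGG"
subject_seq = "CCGATTACATT"
seeds = find_seeds(query_seq, subject_seq)
# All four seeds extend to the same ungapped match, GATTACA
hits = {extend_seed(query_seq, subject_seq, q, s) for q, s in seeds}
```
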

BLAT, available at http://genome.ucsc.edu/cgi-bin/hgBlat, is an alignment tool similar to BLAST that is used to compare biological sequences such as DNA, RNA and proteins, but it is structured differently. BLAT uses a different indexing approach in which it keeps an index of an entire genome in memory; the target database for BLAT is therefore the index derived from the assembly of the entire genome rather than a set of sequences. Kent (2002) states that BLAT is more accurate.

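The indexing approach can be sketched as follows: the genome is indexed once and each query is then scanned against that in-memory index. This is an illustration rather than the UCSC implementation; the function names, the toy genome and the k-mer size of 4 are assumptions. Indexing non-overlapping genome k-mers while scanning overlapping query k-mers follows the general scheme described for BLAT.

```python
from collections import defaultdict


def build_index(genome: str, k: int = 4) -> dict[str, list[int]]:
    """Index the non-overlapping k-mers of the genome by their start position."""
    index = defaultdict(list)
    for pos in range(0, len(genome) - k + 1, k):
        index[genome[pos:pos + k]].append(pos)
    return dict(index)


def lookup(index: dict[str, list[int]], query: str, k: int = 4) -> list[tuple[int, int]]:
    """Scan every overlapping k-mer of the query against the prebuilt index."""
    hits = []
    for q in range(len(query) - k + 1):
        for g in index.get(query[q:q + k], []):
            hits.append((q, g))  # (position in query, position in genome)
    return hits


toy_genome = "ACGTTGCAGGATCCAT"
genome_index = build_index(toy_genome)        # built once, reused for every query
query_hits = lookup(genome_index, "TTGCAGG")  # k-mer TGCA is found at genome position 4
```
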
Multiple sequence alignments can be performed with online comparative genomics software, of which Clustal and VISTA are among the most widely used tools.

Clustal Omega, available at https://www.ebi.ac.uk/Tools/msa/clustalo/, is the latest addition to the Clustal family. The new release increases scalability over the previous versions and improves the accuracy of the progressive alignment procedure (Thompson et al., 1994).

VISTA is developed and hosted at the Genomics Division of Lawrence Berkeley National Laboratory and is accessible at http://genome.lbl.gov/vista/index.shtml. It is a collection of tools and databases that allows for extensive comparative genomics analyses. mVISTA is the server used to align and compare sequences from multiple species.

Phylogenetic footprinting is an approach for finding functional elements from sequence data (Ganley & Kobayashi, 2007). It relies on detecting high degrees of conservation across different species (Zhang & Gerstein, 2003). Phylogenetic footprinting reduces the amount of sequence under consideration by focusing attention on conserved regions that are more likely to serve a biological function (Thompson et al., 2004; Wasserman et al., 2000).

1.4 Cholesterol Biosynthesis Pathway
Cholesterol is the major sterol in animal tissues. It is a 27-carbon sterol derived from a single precursor, acetate (Nelson & Cox, 2008). The steroid nucleus consists of four planar rings, and a hydrocarbon chain extends from C17. Cholesterol is essential in regulating cell membrane permeability and fluidity and for the production of bile acids and steroid hormones such as oestrogen, testosterone and cortisone.

Cholesterol plays a unique role among the many lipids in mammalian cells. The endoplasmic reticulum is the main organelle responsible for the regulation of cholesterol synthesis. Intracellular cholesterol concentration, adenosine triphosphate (ATP) levels and hormones (glucagon and insulin) regulate cholesterol production. In vivo cholesterol concentration is dictated by diet and biosynthesis. High concentrations are generally associated with an increased risk of cardiovascular disease. According to Maxfield & van Meer (2010), the levels of cholesterol vary between organelles by 5 to 10 fold, but the mechanisms behind these differences are only partially understood.

Figure 1: Cholesterol Biosynthesis Pathway (Rosanoff & Seelig, 2004).
Nicotinamide adenine dinucleotide phosphate (NADPH) produced in the pentose phosphate pathway is required for cholesterol synthesis. Cholesterol is made from acetyl-CoA in four distinct stages: (1) the condensation of three acetyl-CoA molecules, each contributing a 2-carbon unit, to form the 6-carbon compound mevalonate; (2) the conversion of mevalonate to activated isoprene; (3) the polymerisation of six 5-carbon isoprene units to form the 30-carbon linear squalene; and (4) the cyclisation of squalene to form the steroid nucleus, which subsequently forms cholesterol (Nelson & Cox, 2008).

Cholesterol is widespread in biological membranes, especially in animals, and its presence can modify the role of membrane-bound proteins. The presence of cholesterol in a membrane reduces fluidity by stabilising extended chain conformations of the hydrocarbon tails of fatty acids through van der Waals interactions (Campbell & Farell, 2012). Cholesterol-rich membrane microdomains (lipid rafts) are also rich in glycosphingolipids, glycosylphosphatidylinositol-anchored proteins and signalling molecules; they function as signalling platforms and have been shown to be crucial for the assembly and activity of various signalling networks (Le Roy & Wrana, 2005).

The cholesterol biosynthesis pathway also synthesises non-sterol isoprenoids such as dolichol, heme-A, isopentenyl transfer RNA (tRNA) and ubiquinone. As reported by Buhaescu & Izzedine (2007), these molecules appear to be potentially interesting therapeutic targets for on-going research in oncology, autoimmune disorders, atherosclerosis and Alzheimer's disease. Statins (e.g. atorvastatin, lovastatin, simvastatin) are a class of drug that lowers cholesterol levels by acting as reversible, competitive inhibitors of the enzyme 3-hydroxy-3-methylglutaryl-coenzyme A (HMG-CoA) reductase; they prevent cardiovascular disease in those who are at high risk (Lewington et al., 2007). They are also being tested for neuroprotective properties. The fundamental mechanism of statins is to inhibit cellular cholesterol synthesis. However, the cholesterol biosynthesis pathway also has several by-products, the non-sterol isoprenoids, which are also important in cellular functioning (van der Most et al., 2009). Isoprenoid-mediated inhibition of cholesterol synthesis is also being applied to cancer chemotherapy and chemoprevention (Mo & Elson, 2004).

Cholesterol is also the precursor of important steroid hormones (Campbell & Farell, 2012). Like cholesterol, these hormones have a 4-ring sterol nucleus. The steroid hormones (glucocorticoids, mineralocorticoids and sex hormones) are produced from cholesterol by modification of the side chain and the addition of oxygen atoms to the steroid ring system (Nelson & Cox, 2008), but they lack the alkyl chain attached to the D-ring of cholesterol.
Vitamin D is also produced from a sterol. Cholecalciferol is synthesised in the skin by the action of ultraviolet (UV) light on 7-dehydrocholesterol, the immediate precursor of cholesterol. The cholecalciferol produced is hydroxylated in the liver to 25-hydroxyvitamin D, which is the main circulating form of vitamin D. 25-hydroxyvitamin D is then further hydroxylated in the kidneys, which results in the final product, 1,25-dihydroxycholecalciferol (the active form of vitamin D).

Cholesterol in the mammalian brain is a risk factor for certain neurodegenerative diseases. In the vertebrate nervous system, the majority of cholesterol resides in myelin (Saher et al., 2009). In the peripheral nervous system (PNS), Schwann cells autonomously synthesise essentially all the cholesterol that they require for myelination (Fu et al., 1998). Oligodendroglial inactivation of squalene synthase, a crucial enzyme of the cholesterol biosynthesis pathway, showed that cholesterol is a rate-limiting factor for central myelination (Saher et al., 2005). Myelin forms an insulating sheath around axons that consists of tightly compacted membranes. Saher et al. (2011) noted that cholesterol appears to be the only integral myelin component that is critical and rate limiting for the development of central nervous system (CNS) and PNS myelin. The function of myelin, which is highly enriched in glycosphingolipids, depends on its unique composition for rapid and efficient saltatory nerve conduction.

1.5 Study of Cholesterol Biosynthesis

Figure 2: Intermembrane cholesterol regulation via sterol regulatory element-binding proteins (SREBPs) (Sun et al., 2005).
SREBPs, which are the membrane-embedded transcriptional activators of cholesterol synthesis, are transported by SCAP to the Golgi complex in COPII-coated vesicles for processing. Cholesterol triggers SCAP to bind to Insig, which blocks the binding of COPII proteins to SCAP, halting SREBP transport and thus terminating cholesterol synthesis.

The ubiquity of cholesterol and its precursors in the cell membranes of eukaryotic species makes cholesterol biosynthesis an ideal pathway to analyse and model. Ohyama et al. (2006) have already evaluated the biological importance of the transcriptional regulation of the cholesterol 24-hydroxylase (CYP46A1) gene, using orthologous sequence comparison to localise conserved non-coding regions, i.e. potential regulatory regions. CYP46A1 is a key regulator of brain cholesterol elimination (Shafaati et al., 2009).

The synthesis of cholesterol and its derivatives provides an example of a novel eukaryotic membrane component, which in higher animals is used as a precursor for the synthesis of higher molecules (Freilich et al., 2008). The production of cholesterol is regulated by intracellular cholesterol concentration. The rate-limiting step in the pathway to cholesterol is the conversion of HMG-CoA to mevalonate, the reaction catalysed by HMG-CoA reductase (Nelson & Cox, 2008; Wilcox et al., 2007).

Regulation in response to cholesterol levels is mediated by a system of transcriptional regulation of the gene encoding 3-hydroxy-3-methylglutaryl-CoA reductase (HMGCR). Sterol regulatory element-binding proteins (SREBPs) are regulatory proteins that control the HMG-CoA reductase gene, along with other genes, to mediate the uptake of cholesterol. The SREBP family member SREBP1 is a major transcriptional activator of cholesterol and fatty acid metabolism that has been implicated in insulin resistance, diabetes and other diet-related diseases (Reed et al., 2008). Recent studies have shown that SREBPs interact with the transcription factors specificity protein 1 (SP1) and nuclear factor Y (NFY) in regulating specific classes of target genes (Reed et al., 2008; Horton et al., 2002).

1.6 CYP7B1 Gene
25-hydroxycholesterol 7-alpha-hydroxylase is an enzyme that is encoded by the
CYP7B1 gene in humans (Setchel et al., 1998). This gene encodes a member of the
cytochrome P450 superfamily of enzymes (family 7, subfamily B, polypeptide 1).
Cytochrome P450 enzymes catalyse many reactions involved in drug metabolism
and in the synthesis of cholesterol, steroids and other lipids.

Stapleton and colleagues first discovered CYP7B in a differential screen of
transcripts expressed in a rat hippocampal complementary DNA (cDNA) library
versus the remainder of the brain. CYP7B catalyses the 7-alpha-hydroxylation of
oxysterols and 3-beta-hydroxysteroids, including dehydroepiandrosterone (DHEA)
(Rose et al., 1997), a major adrenal steroid (Stapleton et al., 1995). The
CYP7B1 gene has been demonstrated by reporter tagging to be expressed
particularly strongly in the brain, liver, spleen, kidney and heart (Rose et
al., 2001).

The encoded endoplasmic reticulum membrane protein catalyses the first reaction
in the cholesterol catabolic pathway, which converts cholesterol to bile acids.
It plays only a minor role in total bile acid synthesis, but may also be
involved in the development of atherosclerosis, neurosteroid metabolism and sex
hormone synthesis (NCBI RefSeq, 2008).
!
!
Figure 3: Cytogenetic location of the CYP7B1 gene.
The CYP7B1 gene is located on the long (q) arm of chromosome 8 at position 21.3, from base pair
64,595,971 to base pair 64,798,790 (National Library of Medicine, 2014).
1.7 Objectives
1. Retrieve the gene sequence for each of the ten species (armadillo, cat, cow,
dog, dolphin, elephant, guinea pig, horse, mouse and rabbit), with human
as the reference sequence, and save each in FASTA format.
2. Perform multiple sequence alignment using, e.g., Mulan, Clustal Omega
and/or mVISTA.
3. Identify the ECRs and analyse these regions for conserved transcription
factor binding sites (TFBS).
4. Review the data and identify TFBS in regions present in some or all species
(e.g. mammals).
5. Summarise the results.
6. Analyse the presumptive sites and try to assign meaning to the possible
regulatory networks.
2.0 Materials and Methods
2.1 Methodology Flow Chart

A flow chart of the methods involved in retrieving the sequence of the CYP7B1 gene from the NCBI
database.
1. Retrieve sequence from ENSEMBL
2. Save sequence in FASTA format
3. Organise saved files into folders
4. Perform multiple sequence alignment using Mulan
5. View dynamic overlay
6. Open multiTF within Mulan
7. Tabulate multiTF data
8. Perform exon prediction using NCBI's Spidey
9. Perform promoter prediction using FPROM
10. Analyse data results
2.2 Materials
2.2.1 Software
Since bioinformatics is a computer-based discipline, online programs were used
to retrieve sequences and compare them across species. These include:

ENSEMBL Genome Browser (http://www.ensembl.org/index.html) is a scientific
project launched jointly by the European Bioinformatics Institute and the
Wellcome Trust Sanger Institute in 1999. It is mostly used to locate individual
genes and describe the relationships between them.

BLAST (http://blast.ncbi.nlm.nih.gov/Blast.cgi) is used to compare primary
biological sequence information, such as the amino acid sequences of different
proteins or the nucleotides of DNA sequences. It is by far the most widely used
technique for detecting similarity between sequences of interest (Altschul et
al., 1990).

Clustal Omega (https://www.ebi.ac.uk/Tools/msa/clustalo/) is a multiple
sequence alignment tool that uses seeded guide trees and hidden Markov model
(HMM) profile techniques to align virtually any number of protein sequences,
from hundreds to many thousands, quickly and accurately (Sievers et al., 2011).

MULAN, Multiple Sequence Local Alignment (http://mulan.dcode.org), employs
two alignment strategies, which allow comparative analysis of multiple
sequences that are present in either draft or finished configuration.

mVista (http://genome.lbl.gov/vista/mvista/submit.shtml) is a set of programs
for comparing DNA sequences up to megabases long from two or more species
and visualising these alignments with annotation information (Frazer et al.,
2004).

multiTF (http://multitf.dcode.org) identifies transcription factor binding sites
(TFBS) conserved across the multiple species included in the alignment. It is
dynamically interconnected with Mulan.

Spidey (http://www.ncbi.nlm.nih.gov/spidey/) is a messenger RNA (mRNA)-to-
genomic alignment program. Sarah Wheelan created the program, which relies
heavily on an alignment manager to easily manage and quickly access alignments
and sets of alignments.

FPROM Human Promoter Prediction
(http://linux1.softberry.com/berry.phtml?topic=fprom&group=programs&subgroup=promoter)
is an online program that identifies promoter regions and regulatory sites
(Solovyev et al., 2010).

Promoter 2.0 Prediction Server (http://www.cbs.dtu.dk/services/Promoter/) is
an online bioinformatics tool used to predict transcription start sites in DNA
sequences. It was developed as an evolution of simulated transcription
factors that interact with sequences in promoter regions.
!
2.3 Methods
2.3.1 Retrieval of Sequences
The gene symbol, CYP7B1, was entered into the human genome database at
http://www.ensembl.org/index.html and the sequences were retrieved. Each
human gene name and international symbol was retrieved from the HUGO Gene
Nomenclature Committee (HGNC) website (http://www.genenames.org). A unique
symbol for each gene is necessary to facilitate electronic data retrieval from
publications. Each symbol maintains parallel construction across different
members of a gene family and can also be used in other species.

ENSEMBL Accession No.    Linnaean Classification    Common Name
ENST00000310193          Homo sapiens               Human
ENSDNOT00000004738       Dasypus novemcinctus       Armadillo
ENSFCAT00000010214       Felis catus                Cat
ENSBTAT00000001710       Bos taurus                 Cow
ENSCAFT00000011608       Canis lupus familiaris     Dog
ENSTTRT00000001843       Tursiops truncatus         Dolphin
ENSLAFT00000012755       Loxodonta africana         Elephant
ENSCPOT00000004704       Cavia porcellus            Guinea Pig
ENSECAT00000007698       Equus caballus             Horse
ENSMUST00000035625       Mus musculus               Mouse
ENSOCUT00000000440       Oryctolagus cuniculus      Rabbit
Table 1: List of species to be analysed and gene sequences to be retrieved.
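
The sequences in this study were exported by hand through the ENSEMBL web pages, as shown in the figures that follow. As an illustrative alternative only, the sketch below builds one request URL per accession in Table 1. It assumes the public Ensembl REST service at rest.ensembl.org and its sequence/id endpoint; the file naming scheme is hypothetical, and accessions from older releases may since have been retired.

```python
from urllib.parse import urlencode

# Transcript accessions copied from Table 1 (common name -> ENSEMBL accession).
TRANSCRIPTS = {
    "Human": "ENST00000310193",
    "Armadillo": "ENSDNOT00000004738",
    "Cat": "ENSFCAT00000010214",
    "Cow": "ENSBTAT00000001710",
    "Dog": "ENSCAFT00000011608",
    "Dolphin": "ENSTTRT00000001843",
    "Elephant": "ENSLAFT00000012755",
    "Guinea Pig": "ENSCPOT00000004704",
    "Horse": "ENSECAT00000007698",
    "Mouse": "ENSMUST00000035625",
    "Rabbit": "ENSOCUT00000000440",
}

# Assumption: the public Ensembl REST service (not the web export used in this study).
REST_SERVER = "https://rest.ensembl.org"


def sequence_url(accession, seq_type="cdna"):
    """Build a URL requesting one transcript sequence as FASTA text."""
    query = urlencode({"type": seq_type, "content-type": "text/x-fasta"})
    return f"{REST_SERVER}/sequence/id/{accession}?{query}"


def output_filename(common_name):
    """Hypothetical naming scheme: one FASTA text file per species."""
    return common_name.lower().replace(" ", "_") + "_CYP7B1.txt"


if __name__ == "__main__":
    for name, accession in TRANSCRIPTS.items():
        print(output_filename(name), sequence_url(accession))
```

Each URL can be opened in a browser or passed to any HTTP client; no request is made by the sketch itself.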
!
A detailed procedure for retrieving the sequences, illustrated with screenshots, follows:

Figure 4: ENSEMBL homepage.
The gene of interest, CYP7B1, was entered into the search box with ‘Human’ selected as the
species before selecting ‘Go’.

Figure 5: Search results for the CYP7B1 gene.
On the left-hand side, the category is restricted to transcripts only.

Figure 6: Results for the CYP7B1 transcript.
The ENSEMBL human transcript of interest (CYP7B1-001) is then selected.

Figure 7: Summary of the CYP7B1-001 transcript.
The data are then exported using the option found on the left-hand side.

Figure 8: Choices of different configurations for data extraction.
FASTA sequence is chosen as the output format; the resulting file is then used to input sequences
into the different online tools. The upstream (5’) and downstream (3’) flanking sequence is
restricted to 5,000 bases each, and the cDNA box is ticked for use in exon and promoter predictions.

Figure 9: FASTA sequence of the human CYP7B1 gene.
This sequence is then saved as a plain .txt file in FASTA format and placed in a folder for
easy access. FASTA format is a text format in which each record begins with a single-line description.
A ‘>’ must appear in the first column; the rest of the title line is arbitrary but should be
informative (Lesk, 2008).
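
The FASTA layout described above is simple enough to read and write in a few lines. The following is a minimal sketch, not part of the original workflow; the function names and the 60-character line width are illustrative choices.

```python
def parse_fasta(text):
    """Return a list of (title, sequence) tuples from FASTA-formatted text."""
    records = []
    title, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # ignore blank lines
        if line.startswith(">"):
            # A '>' in the first column starts a new record.
            if title is not None:
                records.append((title, "".join(chunks)))
            title, chunks = line[1:].strip(), []
        elif title is None:
            raise ValueError("FASTA text must start with a '>' title line")
        else:
            chunks.append(line.upper())
    if title is not None:
        records.append((title, "".join(chunks)))
    return records


def format_fasta(title, sequence, width=60):
    """Format one record with a title line and fixed-width sequence lines."""
    lines = [">" + title]
    lines += [sequence[i:i + width] for i in range(0, len(sequence), width)]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    example = format_fasta("human CYP7B1 example", "ACGT" * 20)
    print(example)
    print(parse_fasta(example))
```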
!
!
Figure 10: The species to be analysed are organised into folders.
Each sequence retrieved was saved in FASTA format, with each file annotated with the
biological classification of its species. Each sequence was then saved in its corresponding
folder.
2.3.2 Generating Alignments

Figure 11: Mulan homepage, which can be accessed at http://mulan.dcode.org.
Once all the sequences have been saved in FASTA format, they are submitted to Mulan for multiple
sequence alignment. On the homepage, the number of species under investigation (e.g. 11) is
chosen.

Figure 12: Sequence submission page.
Each sequence can be pasted in FASTA format, uploaded as a FASTA file or entered as an
accession number, with the available annotation entered on the right-hand side, before hitting the
‘submit’ button.

Figure 13: Summary page of results for the aligned sequences.
A completed alignment request results in a ‘summary page’, which provides links to the
interactive dynamic visualisation tool, pairwise dynamic plots, dot-plots, annotation files,
sequence files and a portal to the transcription factor binding site analysis tool, multiTF.

Figure 14: Dynamic visualisation profile in standard stacked-pairwise structure.
A dynamic visualisation gives a graphical representation of the relationship between the species
submitted, with evolutionary conserved regions indicated in red. The legend on the
left-hand side highlights intronic regions in pink, coding regions in blue, untranslated
regions in yellow and repeat regions in green.

Figure 15: Dynamic visualisation profile in the ‘color density by interspecies conservation’ configuration.
Under the visualisation options, ‘color density by interspecies conservation’ can be selected. This
illustrates the relationship between a conserved element and the number of species that share a
particular region (Loots & Ovcharenko, 2007).

Figure 16: Summary of conservation of the sequences submitted.
‘Summary of conservation’ can also be selected as the visualisation option. It collects shared
similarities from all the pairwise comparisons into a single conservation profile.
!

Figure 17: Phylogenetic tree of the sequences submitted.
There is an option to view the phylogenetic tree of the sequences in question. It describes the
evolutionary relationships between the human, dolphin, armadillo, mouse, horse, guinea pig,
rabbit, cat, elephant, cow and dog sequences of the CYP7B1 gene. The length of each tree branch
estimates the number of substitutions per 1 kb of sequence.

Figure 18: multiTF homepage within Mulan.
The Mulan alignment can be submitted to multiTF, a tool used for the identification of
conserved TFBS. After submitting the alignment to multiTF, the transcription factors to be
investigated can be selected.

Figure 19: Results summary of the multiTF profile.
A results page is then displayed. This page can be used to navigate a summary of the sites
conserved across multiple species.

Figure 20: zPicture sequence submission page.
zPicture can be used as an alternative for identifying ECRs. It is highly flexible, which allows users
to differentially predict ECRs. All eleven sequences are entered, each into its corresponding box.

Figure 21: Results page for the sequences compared.
Options to view results include dynamic visualisation, pairwise visualisation and dot-plots.

Figure 22: Dynamic visualisation profile of ECRs for the species of interest.
A graphical representation of the different species added. Ultraconserved regions are displayed
in red. Annotations are highlighted in pink for introns, blue for known coding
exons, yellow for untranslated regions and green for repeats.
!

Figure 23: Spidey homepage.
Spidey is used to predict the exon regions in a DNA sequence. The cDNA and genomic DNA of each
species are entered in their corresponding boxes before hitting ‘align’. All eleven
species were submitted and the exons of each species were predicted.

Figure 24: Summary results for the sequence submitted.
The summary page shows the predicted exon regions for each species. It gives the genomic
coordinates of the exons as well as the mRNA coordinates and the length of each exon.

Figure 25: FPROM homepage.
FPROM is an online tool used to predict the sites of promoters in a gene sequence. The
genomic DNA sequence is entered before hitting ‘proceed’.

Figure 26: Summary page of results showing the positions of the predicted promoter regions.
The summary page shows the positions of the predicted promoters in the sequence as
well as the position of each TATA box. The promoter regions and TATA boxes are expected within
the first 5,000 bases of the sequence; predictions beyond 5,000 bases were therefore generally excluded.

After retrieving all the required sequences for each species, the results of the
alignment studies were tabulated to make the output easier to analyse and
summarise. The first result created was a physical map of the gene in each
species, drawn in Microsoft PowerPoint and including the predicted exons
and promoter regions. All other results were copied directly from the output
page of each tool.
3.0 Results
3.1 Generating and visualising DNA alignments: Mulan
The genome contains biologically functional elements that have mutated at a
slower rate than the neutrally evolving genomic background. Comparative
sequence analysis of different species that identifies ECRs therefore facilitates
the prediction of functional regions (Loots & Ovcharenko, 2005). A widely
employed technique is to represent sequence conservation profiles graphically
in reference to a base DNA sequence, which runs linearly along the horizontal
x-axis, while the vertical axis displays the percent identity with the
secondary sequence (50%, 75% and 100%) (Schwartz et al., 2000). Regions
of conservation appear as peaks, and evolutionary thresholds can be defined to
highlight ECRs of user-defined minimal percent identity and length. In practice,
a 100 bp/70% identity threshold provides high sensitivity for analysing
conservation profiles between human and other species.
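
To make the 100 bp/70% threshold concrete, the sketch below slides a fixed-length window along a pairwise alignment and merges every window that meets the identity cut-off. It is a simplified illustration written for this text, not the algorithm used inside Mulan; the function names and the treatment of gap characters ('-') are assumptions.

```python
def percent_identity(ref, other):
    """Percent of aligned positions where both sequences carry the same base."""
    matches = sum(1 for a, b in zip(ref, other) if a == b and a != "-")
    return 100.0 * matches / len(ref)


def find_ecrs(ref, other, min_length=100, min_identity=70.0):
    """Return merged half-open (start, end) intervals that pass the threshold.

    ref and other are two rows of a pairwise alignment (equal length).
    """
    if len(ref) != len(other):
        raise ValueError("aligned sequences must have the same length")
    ecrs = []
    for start in range(len(ref) - min_length + 1):
        end = start + min_length
        if percent_identity(ref[start:end], other[start:end]) >= min_identity:
            if ecrs and start <= ecrs[-1][1]:
                # Overlaps or touches the previous region: extend it.
                ecrs[-1] = (ecrs[-1][0], end)
            else:
                ecrs.append((start, end))
    return ecrs


if __name__ == "__main__":
    reference = "A" * 300
    secondary = "A" * 100 + "C" * 100 + "A" * 100
    print(find_ecrs(reference, secondary))
```

Because whole windows are merged, the reported regions can extend slightly into less conserved flanks; a production tool would trim them.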
!
Multiple-sequence comparative analysis is a challenging task in terms of both
generating highly reliable alignments and displaying the alignment results
graphically. To address the complexity stemming from user input files that
potentially consist of a large number of sequences of varying lengths and
different phylogenetic relationships, a set of different visualisation options is
applicable to any finished multiple-sequence local alignment. For example, the
reference sequence can be changed dynamically, and the new stacking order of
conservation profiles for the rest of the species is determined automatically
from the evolutionary relationship of each sequence to the reference sequence,
with the most closely related species at the bottom.

In Figure 27, the graph represents a dynamic visualisation profile showing the
similar conservation profiles for each of the species. There are “ghosting”
regions, which appear blank; these regions may still have a function, but one
that is not shared with the conserved regions found among all the species. The
ECRs represent a relationship between all of the species in query that is likely
to reflect a shared function. These ECRs determine the locations of the
different TFBS predicted using multiTF.

The results shown in Figures 28-31 exhibit a dynamic visualisation
profile in ‘color density by interspecies conservation’ mode, which illustrates the
relationship between the colour density of a conserved element and the number
of species that share a particular region. It shows the full gene sequence of the
species investigated. Similarities exist throughout the sequence, but the regions
involved potentially have different functions. The colour intensity of a conserved
region depends on the number of different species that contain the region (the
darker the region, the more species in which it is conserved). From the graph,
this conservation is shared among all of the mammals. The analysis is performed
for every pixel-wide region of the conservation plot: the number of ECRs from
different species that overlap a particular pixel counts towards the number of
species sharing this region. In a recent study, it was observed that regions
conserved in multiple species often correlate with functional elements (Frazer
et al., 2004). Therefore, the colour density of the plot can potentially highlight
different DNA segments in the base sequence with unique evolutionary character.
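
The counting step behind the colour density display can be sketched as follows. This is an illustrative reconstruction from the description above, working per base rather than per pixel; the function name and the example intervals are hypothetical.

```python
def species_sharing(length, ecrs_by_species):
    """Count, for every reference position, how many species have an ECR there.

    ecrs_by_species maps a species name to a list of half-open (start, end)
    intervals on the reference sequence.
    """
    counts = [0] * length
    for intervals in ecrs_by_species.values():
        covered = set()
        for start, end in intervals:
            covered.update(range(max(start, 0), min(end, length)))
        # Each species contributes at most once per position,
        # even if its own ECRs overlap each other.
        for pos in covered:
            counts[pos] += 1
    return counts


if __name__ == "__main__":
    example = {  # hypothetical intervals, for illustration only
        "mouse": [(0, 5)],
        "dog": [(3, 8)],
        "cow": [(4, 6), (5, 7)],
    }
    print(species_sharing(10, example))
```

Higher counts would be drawn in a darker shade in the conservation plot.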
!
Two additional data representation modules are implemented in the Mulan tool:
phylogenetic shadowing and ‘summary of conservation’. While ‘summary of
conservation’ collects all the shared nucleotide similarities from all the pairwise
comparisons into a single conservation profile, the phylogenetic shadowing
option collects all the cumulative nucleotide mismatches (Ovcharenko et
al., 2004a). The phylogenetic shadowing display accurately depicts
the coding exon as the most highly conserved element (Figure 33). In addition,
the identified ECR sharply defines the exon boundaries without any a priori
knowledge of its location.

A phylogenetic tree, more commonly known as a phylogeny, is a graphical
representation that depicts the evolutionary relationships among a set of species
(Baum, 2008).

As seen in Figure 32, the branch lengths give the amount of genetic change. The
lines are branches and represent evolutionary lineages changing over time (the
longer the branch, the larger the amount of change). The length of every tree
branch estimates the number of substitutions per 1 kilobase (kb) of sequence.
Each lineage has a part of its history that is unique to it alone and parts that
are shared with other lineages. For example, as shown in Figure 32, the dog and
the horse each have their own unique history but also share history with
other lineages, e.g. cow and dolphin. Similarly, each lineage has ancestors that
are unique to that lineage and ancestors that are shared with other lineages.

Phylogenetic trees provide an efficient structure for organising knowledge
of biodiversity and allow one to develop an accurate, non-progressive conception
of the totality of evolutionary history (Baum, 2008).

Figure 27: Mulan conservation analysis for the human CYP7B1 gene from 159.5 kb to 212.6 kb
compared with guinea pig, elephant, cat, rabbit, mouse, cow, dog, armadillo, horse and dolphin. The
conservation profile depicts the differential prediction of the non-coding ECRs in different species
(the legend on the left describes the colouring of the different types of element).

Figure 28: Mulan conservation analysis for the human CYP7B1 gene from 0 kb to 53.2 kb compared with guinea pig, elephant, cat, rabbit, mouse, cow, dog, armadillo,
horse and dolphin.

Figure 29: Mulan conservation analysis for the human CYP7B1 gene from 53.2 kb to 106.3 kb compared with guinea pig, elephant, cat, rabbit, mouse, cow, dog, armadillo,
horse and dolphin.

Figure 30: Mulan conservation analysis for the human CYP7B1 gene from 106.3 kb to 159.5 kb compared with guinea pig, elephant, cat, rabbit, mouse, cow, dog, armadillo,
horse and dolphin.

Figure 31: Mulan conservation analysis for the human CYP7B1 gene from 159.9 kb to 212.6 kb compared with guinea pig, elephant, cat, rabbit, mouse, cow, dog,
armadillo, horse and dolphin.

Figure 32: Phylogenetic tree representing the evolutionary history of the CYP7B1 gene in different vertebrate lineages (the numbers correspond to the number of
nucleotide mismatches per kb).
3.2 Phylogenetic shadowing implemented in Mulan
Phylogenetic shadowing has emerged as a strategy for deciphering putative
regulatory elements in comparisons of closely related species. It compares many
closely related sequences simultaneously, combining mutations from all the
sequences into a single conservation profile (Loots & Ovcharenko, 2005).
Phylogenetic shadowing as implemented in Mulan provides easy sequence and
annotation input through several routes, combined with a fast and dynamic
visualisation interface. It is best used for the analysis of large sequence intervals
with previously established conservation detection parameters.
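
The idea of pooling mutations from every sequence into one profile can be illustrated with a short sketch. It treats an alignment column as conserved only if no species differs from the reference there, then smooths the result over a window. This is a simplified stand-in written for illustration, not Mulan's implementation; the function name and window size are assumptions.

```python
def shadowing_profile(alignment, window=5):
    """Return a smoothed percent-conservation value for each alignment column.

    alignment is a list of equal-length aligned strings; the first is the
    reference. A column counts as conserved only when every other sequence
    matches the reference, so a mutation in any one species "shadows" it.
    """
    ref, others = alignment[0], alignment[1:]
    length = len(ref)
    if any(len(seq) != length for seq in others):
        raise ValueError("all aligned sequences must have the same length")
    conserved = [
        ref[i] != "-" and all(seq[i] == ref[i] for seq in others)
        for i in range(length)
    ]
    half = window // 2
    profile = []
    for i in range(length):
        lo, hi = max(0, i - half), min(length, i + half + 1)
        profile.append(100.0 * sum(conserved[lo:hi]) / (hi - lo))
    return profile


if __name__ == "__main__":
    toy = ["ACGTACGT", "ACGTACGA", "ACGAACGT"]  # hypothetical toy alignment
    print(shadowing_profile(toy, window=3))
```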
!
The results shown in Figure 33 support the locations of the TFBS found using
multiTF, which are shown in Figure 35. Most of the TFBS found using
multiTF lie within 184 kb to 199 kb. It is important to keep in mind, though, that
not all TFBS can be found using phylogenetic shadowing owing to the nature of
the technique, and that false positives can occur (Blanchette & Tompa, 2002).

Figure 33: Phylogenetic shadowing profile of the comparison between human and 10 other species
(100 bp/70% identity threshold).
3.3 Detection of evolutionary conserved TFBS: multiTF
The complexity of transcriptional regulation in vertebrates is achieved through
the combinatorial and synchronised binding of different transcription factors to
gene regulatory elements known as cis-regulatory modules (CRMs). These CRMs
contain a specific footprint consisting of several TFBS (Loots & Ovcharenko,
2005). CRMs are usually several hundred base pairs in length and stand out from
the neighbouring genomic sequence as well-conserved regions. Their function can
be inferred computationally only from functions that have been associated with
known TFBS patterns present in the CRM.

Predicting functional TFBS is very challenging because binding sites are very
short (usually 6 to 12 bp). TFBS motifs therefore occur at high frequency across
a genome, resulting in an overabundance of false-positive predictions.
Nevertheless, the ability to predict functional TFBS accurately is a powerful
approach for sequence-based discovery of gene regulatory sequences.
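
A back-of-the-envelope calculation shows why such short sites generate so many chance hits. The sketch below assumes, purely for illustration, that the four bases are equally likely and independent and that a site is a single exact string; real binding sites are degenerate, which makes chance matches even more frequent.

```python
def expected_chance_matches(sequence_length, motif_length, both_strands=True):
    """Expected number of exact matches to one fixed motif in random sequence."""
    positions = max(sequence_length - motif_length + 1, 0)
    expected = positions / 4 ** motif_length
    return 2 * expected if both_strands else expected


if __name__ == "__main__":
    human_length = 212_627  # length of the human sequence analysed here (bp)
    for k in (6, 8, 12):
        print(k, expected_chance_matches(human_length, k))
```

For the 212,627 bp human sequence, a fixed 6 bp motif is expected about 52 times on one strand by chance alone, whereas a 12 bp motif is expected far less than once.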
!
Mulan is integrated with multiTF, which operates on multiple-sequence
alignments, benefits from extensive sampling of the phylogeny and searches for
TFBS that are represented in all the species.

In this report, multiTF is used to analyse the CYP7B1 gene for any TFBS
known to enhance the expression of this gene across the 10 species in query, with
the human gene as the reference sequence. Ten TFBS were identified as
well-defined clusters towards the end region (184 kb to 199 kb) of the CYP7B1 gene
(Figure 34). These are illustrated as coloured tick marks above the
conservation profile. The potential locations of each of these predicted TFBS are
highlighted in Figure 35, with * indicating a high similarity between species.

These data suggest that by analysing TFBS patterns in multiple-sequence
alignments, one can filter out many sites that have diverged throughout
evolution and select for sites that are most likely to be functional.
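
The filtering principle can be sketched as a set intersection over alignment columns: a candidate site is kept only if it appears at the same column in every species. The example uses an exact string match and a toy alignment purely for illustration; it is not multiTF's method, which relies on more sophisticated binding-site models.

```python
def motif_columns(aligned_seq, motif):
    """Alignment columns at which an ungapped copy of the motif starts."""
    k = len(motif)
    return {
        i for i in range(len(aligned_seq) - k + 1)
        if aligned_seq[i:i + k] == motif
    }


def conserved_sites(alignment, motif):
    """Columns where the motif is present in every sequence of the alignment."""
    shared = None
    for seq in alignment.values():
        hits = motif_columns(seq, motif)
        shared = hits if shared is None else shared & hits
    return sorted(shared or [])


if __name__ == "__main__":
    toy_alignment = {  # hypothetical aligned fragments
        "human": "GGTATAAAGGTATAAA",
        "mouse": "GGTATAAAGGCATAAA",
        "dog":   "CGTATAAAGG-ATAAA",
    }
    print(conserved_sites(toy_alignment, "TATAAA"))
```

In the toy alignment the human sequence contains the motif twice, but only the first copy survives the cross-species filter.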

Figure 34: multiTF identification of conserved TFBS of the CYP7B1 gene. The identified TFBS are
depicted as coloured tick marks above the conservation profile.

Figure 35: The potential transcription factor binding sites identified within the CYP7B1 gene.
3.4 Exon and promoter region prediction using Spidey and Fprom
Expressed sequences are the key to the inner workings of an organism. To
understand fully the function of an expressed sequence, however, it needs to be
put in its genomic context. Alignment of expressed sequences to their parent
genomic sequences can be used to find or confirm a gene’s position, and to locate
potential regulatory elements and alternative splicing. With the number of
human genes estimated at only about 30,000, alternative splicing may be an
important factor in generating transcriptional diversity, so mRNA-to-genomic
alignments will be crucial to our understanding of the genome.
!
Species                   No. of exons predicted    No. of exons correct
Homo sapiens              5                         6
Dasypus novemcinctus      5                         6
Felis catus               6                         6
Bos taurus                5                         7
Canis lupus familiaris    5                         6
Tursiops truncatus        5                         6
Loxodonta africana        7                         7
Cavia porcellus           6                         8
Equus caballus            6                         7
Mus musculus              5                         6
Oryctolagus cuniculus     6                         6
Table 2: Number of exons predicted by Spidey vs. the correct number of exons taken from Ensembl.
!
The difference between the number of exons predicted and the correct number of
exons could arise because there are no good splice sites at either the donor or
acceptor splice junctions, so that Spidey could not place those junctions
unambiguously.

For the 11 species, at least 83% of each mRNA aligned to its genomic sequence,
with an overall identity of 100% for most species. If the 3’ end of the mRNA
does not align completely, it is first examined for the presence of a poly(A) tail.
Out of the 11 species in query, the armadillo shows a possibility of a
pseudogene (a dysfunctional relative of a gene that has lost its protein-coding
ability or is otherwise no longer expressed in the cell), since a non-aligning
poly(A) tail of 9 was found.

The exact locations of the exons of each species are given in Appendix I, which
also shows the overall percent identity, the percent coverage of the
mRNA and the presence of an aligning or non-aligning poly(A) tail.
!
Species                   Length (bp)    Predicted TSS    Predicted TATA box    TATA sequence
Homo sapiens              212,627        3,925            3,894                 TATATATG
Dasypus novemcinctus      194,050        4,711            4,681                 TTTAAAAG
Felis catus               39,628         5,505            5,477                 TATAAGTA
Bos taurus                181,409        1,405            1,375                 AATAAAAG
Canis lupus familiaris    178,270        5,368            5,343                 TATTAAAG
Tursiops truncatus        187,667        1,530            1,500                 AATATATC
Loxodonta africana        40,375         3,371            3,341                 TATAAAAA
Cavia porcellus           38,224         1,570            1,530                 TATATAAT
Equus caballus            209,282        4,469            4,439                 AATAAAAG
Mus musculus              181,389        3,968            3,939                 TATAAAAA
Oryctolagus cuniculus     46,718         -                -                     -
Table 3: Fprom predictions of TSS and TATA box.
!
Fprom (find promoter) can be used to identify transcription start sites (TSS)
upstream of the annotated coding parts of genes found by gene prediction
software. According to Softberry (the developers of Fprom), at a level of
approximately 50-55% recognition of true promoter regions, the program gives
about one false positive prediction per 4,000 bp.
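
The quoted rate of one false positive per 4,000 bp can be turned into a rough expectation for each sequence. The sketch below uses the sequence lengths from Table 3 and the promoter counts reported for each species in this section; treating the false positive rate as uniform along the sequence is a simplifying assumption made here for illustration.

```python
FALSE_POSITIVE_SPACING_BP = 4000  # about one false positive per 4,000 bp

# (sequence length in bp, number of promoters predicted by Fprom)
SEQUENCES = {
    "Homo sapiens": (212_627, 49),
    "Dasypus novemcinctus": (194_050, 69),
    "Felis catus": (39_628, 14),
    "Bos taurus": (181_409, 58),
    "Canis lupus familiaris": (178_270, 51),
    "Tursiops truncatus": (187_667, 92),
    "Loxodonta africana": (40_375, 10),
    "Cavia porcellus": (38_224, 9),
    "Equus caballus": (209_282, 77),
    "Mus musculus": (181_389, 59),
    "Oryctolagus cuniculus": (46_718, 13),
}


def expected_false_positives(length_bp, spacing_bp=FALSE_POSITIVE_SPACING_BP):
    """Expected number of false positive promoters for a sequence length."""
    return length_bp / spacing_bp


if __name__ == "__main__":
    for species, (length_bp, predicted) in SEQUENCES.items():
        expected = expected_false_positives(length_bp)
        print(f"{species}: predicted {predicted}, expected false positives ~{expected:.1f}")
```

For the human sequence this gives about 53 expected false positives against 49 predicted promoters, which illustrates why the predictions need to be narrowed down before interpretation.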
!

Examples of Fprom predictions are presented in Table 3. The predicted
promoter and TATA box regions were narrowed down to those within ~5 kb, but it
is important to note that many promoters were predicted in each species; these
are highlighted in the models shown below. No promoter or TATA box
was found for the rabbit within the 5 kb limit.

For each position in a given sequence, the Fprom program evaluates the
occurrence of a TSS using two linear discriminant functions (separate for TATA+
and TATA- promoters) with characteristics computed at that position. If it
finds a TATA-box (using a TATA-box weight matrix) in the region, it
computes the value of the Linear Discriminant Function (LDF) for TATA+ promoters;
otherwise it computes the value of the LDF for TATA-less promoters.
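
The branching step described above can be sketched with a toy weight matrix. The matrix, background frequency and score threshold below are hypothetical values chosen for illustration; they are not Fprom's parameters, and the sketch only reports which of the two discriminant functions would be applied rather than evaluating either.

```python
import math

CONSENSUS = "TATAAA"   # hypothetical 6-column TATA-box model
BACKGROUND = 0.25      # assumed uniform background base frequency

# Hypothetical weight matrix: 0.85 for the consensus base, 0.05 for the others.
WEIGHT_MATRIX = [
    {base: (0.85 if base == consensus_base else 0.05) for base in "ACGT"}
    for consensus_base in CONSENSUS
]


def log_odds(window):
    """Log2-odds score of a window against the weight matrix."""
    return sum(
        math.log2(column.get(base, 0.05) / BACKGROUND)
        for column, base in zip(WEIGHT_MATRIX, window)
    )


def best_tata(sequence):
    """Return (position, score) of the best-scoring window, or None."""
    k = len(WEIGHT_MATRIX)
    best = None
    for i in range(len(sequence) - k + 1):
        score = log_odds(sequence[i:i + k])
        if best is None or score > best[1]:
            best = (i, score)
    return best


def choose_ldf(sequence, threshold=4.0):
    """Say which discriminant function would apply (threshold is hypothetical)."""
    hit = best_tata(sequence)
    return "TATA+" if hit is not None and hit[1] >= threshold else "TATA-less"


if __name__ == "__main__":
    print(best_tata("GGCGCTATAAAGGC"), choose_ldf("GGCGCTATAAAGGC"))
    print(choose_ldf("GCGCGCGCGCGC"))
```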
!
The computational identification of promoters in genomic DNA is an extremely
difficult problem (Solovyev, 2002). The task is two-fold: finding the exact
position of a TSS within the long upstream region of a typical eukaryotic gene,
and avoiding false positive predictions within exon and intron sequences
(Solovyev et al., 2006).

Regions of DNA that signal the initiation of transcription are termed promoters
and lie ‘upstream’ of the start of the actual gene. Initiation starts with
molecules such as RNA polymerase II finding promoter regions upstream (towards
the 5’ end of the coding strand) of a gene. These regions often contain a
specific pattern of bases known as the TATA box. In eukaryotes, the start point
of transcription is typically about 25 bases downstream of the TATA box.
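
The spacing between the TATA box and the start site can be checked directly against the Fprom predictions in Table 3. The short script below computes the offset for each species with a prediction; the positions are copied from the table, and the rabbit is omitted because no prediction was made within the 5 kb limit.

```python
# (predicted TSS position, predicted TATA box position) from Table 3
PREDICTIONS = {
    "Homo sapiens": (3925, 3894),
    "Dasypus novemcinctus": (4711, 4681),
    "Felis catus": (5505, 5477),
    "Bos taurus": (1405, 1375),
    "Canis lupus familiaris": (5368, 5343),
    "Tursiops truncatus": (1530, 1500),
    "Loxodonta africana": (3371, 3341),
    "Cavia porcellus": (1570, 1530),
    "Equus caballus": (4469, 4439),
    "Mus musculus": (3968, 3939),
}


def tata_to_tss_offsets(predictions):
    """Distance in bases from each predicted TATA box to its predicted TSS."""
    return {species: tss - tata for species, (tss, tata) in predictions.items()}


if __name__ == "__main__":
    offsets = tata_to_tss_offsets(PREDICTIONS)
    for species, offset in offsets.items():
        print(f"{species}: {offset} bases")
    print("mean offset:", sum(offsets.values()) / len(offsets))
```

The offsets range from 25 bases (dog) to 40 bases (guinea pig), with most species at about 30 bases, broadly in line with the expected spacing.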

The model results of exon and promoter prediction are shown below:

Homo sapiens (Human)

For the human sequence, 49 promoters were predicted. The transcript contains 5 exons that
are 100% identical to the genomic sequence, covering 86% of the mRNA length.

Dasypus novemcinctus (Armadillo)

For the armadillo sequence, 69 promoters were predicted. The transcript contains 5 exons
that are 100% identical to the genomic sequence, covering 94% of the mRNA
length, with a non-aligning poly(A) tail of 9.

Felis catus (Cat)

For the cat sequence, 14 promoters were predicted. The transcript contains 6 exons that are
100% identical to the genomic sequence, covering 100% of the mRNA length.

Bos taurus (Cow)

For the cow sequence, 58 promoters were predicted. The transcript contains 5 exons that
are 100% identical to the genomic sequence, covering 83% of the mRNA length.

Canis lupus familiaris (Dog)

For the dog sequence, 51 promoters were predicted. The transcript contains 5 exons that
are 100% identical to the genomic sequence, covering 92% of the mRNA length.

Tursiops truncatus (Dolphin)

For the dolphin sequence, 92 promoters were predicted. The transcript contains 5 exons
that are 100% identical to the genomic sequence, covering 91% of the mRNA
length.

Loxodonta africana (Elephant)

For the elephant sequence, 10 promoters were predicted. The transcript contains 7 exons
that are 100% identical to the genomic sequence, covering 100% of the mRNA
length.

Cavia porcellus (Guinea Pig)

For the guinea pig sequence, 9 promoters were predicted. The transcript contains 6 exons
that are 98.7% identical to the genomic sequence, covering 100% of the mRNA
length.

Equus caballus (Horse)

For the horse sequence, 77 promoters were predicted. The transcript contains 6 exons that
are 100% identical to the genomic sequence, covering 94% of the mRNA length.

Mus musculus (Mouse)

For the mouse sequence, 59 promoters were predicted. The transcript contains 5 exons that
are 100% identical to the genomic sequence, covering 88% of the mRNA length.

Oryctolagus cuniculus (Rabbit)

For the rabbit sequence, 13 promoters were predicted. The transcript contains 6 exons that
are 100% identical to the genomic sequence, covering 100% of the mRNA length.

All of the promoters predicted by Fprom for each species contained sequence
elements within the -10 to -35 bp region, as is commonly expected.
4.0$Discussion$
4.1$Genetics$and$sequencing$
Genes!have!been!known!to!exist!on!chromosomes,!which!ultimately!are!
composed!protein!and!DNA.!Other!theories!have!since!then!emerged!as!to!which!
is!responsible!for!inheritance!since!Mendel’s!work!in!the!midV19th!century.!This!
included!Griffith’s!experiment!in!1928!that!suggests!that!bacteria!are!capable!of!
transferring!genetic!information!through!transformation!(Reece!et!al.,!2011).!
Sixteen!years!later,!Oswald!Avery,!Colin!McLeod!and!Maclyn!McCarty!identified!
DNA!as!the!carrier!for!genetic!information!in!bacteria!(Klug!et!al.,!2007).!Watson!
and!Crick!in!1953!determined!the!structure!of!DNA!through!Franklin!and!
Wilkin’s!work!on!XVray!crystallography.!This!showed!that!genetic!information!
exists!in!the!sequence!of!nucleotides!on!each!strand!of!DNA.!In!the!following!
years,!scientists!tried!to!understand!how!DNA!controls!the!process!of!protein!
! 50!
production.!It!was!then!discovered!that!the!cell!uses!DNA!as!a!template!to!create!
matching!mRNA.!The!nucleotide!sequence!of!mRNA!is!used!to!create!an!amino!
acid!sequence!in!protein!and!this!translation!between!nucleotide!sequenced!and!
amino!acid!sequences!is!known!as!the!genetic!code!(Rice,!2009).!This!newfound!
molecular!understanding!of!inheritance!has!led!to!the!development!of!DNA!
sequencing.!

The genome of an organism contains thousands of genes, but not all of these
genes need to be active at any given moment. A gene is expressed when it is
transcribed into mRNA, and cells have many mechanisms for controlling gene
expression so that proteins are produced only when the cell needs them.
Regulatory proteins such as transcription factors bind to DNA to either
promote or inhibit the transcription of a gene (Brivanlou & Darnell, 2002).

As the entire genomes of many different species have been sequenced, current
research on gene finding has moved towards a comparative genomics approach.
This approach rests on the principle that natural selection causes genes and
other functional elements to accumulate mutations at a slower rate than the
rest of the genome, since mutations in functional elements are more likely to
harm the organism than mutations elsewhere. Genes can thus be detected by
comparing the genomes of related species and looking for this evolutionary
pressure for conservation.
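
The conservation principle above can be sketched in a few lines of Python. The example slides a fixed-size window along two sequences that are assumed to be already aligned (equal length, with "-" marking gaps) and reports the regions whose identity meets a threshold. The function name, the window size and the identity cut-off are illustrative assumptions, not the settings of any tool used in this dissertation.

```python
def conserved_windows(aligned_a: str, aligned_b: str,
                      window: int = 10, min_identity: float = 0.8):
    """Return merged (start, end) regions where windowed identity >= min_identity.

    Both inputs must be pre-aligned sequences of equal length; '-' marks a gap
    and never counts as a match. Coordinates are 0-based, end-exclusive.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("sequences must be aligned to the same length")

    hits = []
    for start in range(len(aligned_a) - window + 1):
        pairs = zip(aligned_a[start:start + window], aligned_b[start:start + window])
        matches = sum(1 for x, y in pairs if x == y and x != "-")
        if matches / window >= min_identity:
            hits.append((start, start + window))

    # Merge overlapping or adjacent windows into contiguous regions.
    merged = []
    for start, end in hits:
        if merged and start <= merged[-1][1]:
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged


# Hypothetical toy alignment: the first 10 columns match, the last 10 do not.
seq_a = "ACGTACGTAC" + "TTTTTTTTTT"
seq_b = "ACGTACGTAC" + "GGGGGGGGGG"
print(conserved_windows(seq_a, seq_b, window=5, min_identity=1.0))  # [(0, 10)]
```

Lowering `min_identity` or widening `window` trades sensitivity for specificity, which is why such thresholds are normally tuned to the evolutionary distance between the species compared.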

4.2 How cholesterol plays a role in CYP7B1
The metabolism of cholesterol to 7alpha-hydroxylated bile acids is a principal
pathway of cholesterol degradation. Cholesterol 7alpha-hydroxylase (CYP7A1) is
the initial and rate-determining enzyme in the “classic pathway” of bile acid
synthesis. An “alternative” pathway of bile acid synthesis begins with
27-hydroxylation of cholesterol by 27-hydroxylase (CYP27), followed by
7alpha-hydroxylation by CYP7B1 (Ren et al., 2003). The alternative pathway
plays a minor role in total bile acid synthesis, but the regulation of CYP7B1,
possibly a rate-determining enzyme in this pathway, has not been thoroughly
studied (Pandak et al., 2002).
Role / Tissue                              Substrates
Bile salt synthesis
  Liver                                    25-Hydroxycholesterol, 27-hydroxycholesterol
Steroid hormone metabolism
  Brain                                    Pregnenolone, dehydroepiandrosterone
Metabolism of estrogen receptor ligands
  Prostate                                 5α-Androstane-3β,17β-diol
  Prostate                                 Dehydroepiandrosterone (?)
  Vascular                                 27-Hydroxycholesterol
Immunoglobulin production
  Immune cells                             25-Hydroxycholesterol
Table 4: Physiological roles of CYP7B1, adapted from Stiles et al. (2009).

4.3 Problems and Challenges in Bioinformatics
Bioinformatics was developed to handle and analyse the vast amounts of
information generated by sequencing projects. It was recognised early on that,
once the human genome was sequenced, handling the sequence data would pose a
major logistical problem.
Some of the earliest problems in genomics concerned how to measure the
similarity of DNA and protein sequences, either within a genome or across the
genomes of different species. DNA and proteins can be similar in terms of their
function, their structure or their linear sequence of nucleotides or amino
acids. The key presumption for DNA is that two similar DNA sequences probably
share the same function, even if they occur in different parts of the genome or
in two or more different genomes (Keedwall & Narayanan, 2005).
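
One common way of putting a number on sequence similarity is an alignment score. The sketch below computes a global alignment score by dynamic programming (the Needleman-Wunsch approach), which is a swapped-in textbook technique for illustration and not the heuristic used by BLAST or the other tools in this study. The function name and the scoring values (match +1, mismatch -1, gap -1) are assumptions.

```python
def global_alignment_score(a: str, b: str,
                           match: int = 1, mismatch: int = -1, gap: int = -1) -> int:
    """Return the best global alignment score of a and b (Needleman-Wunsch).

    Only two rows of the dynamic-programming table are kept in memory, so the
    alignment itself is not recovered, only its score.
    """
    # Row 0: aligning the empty prefix of a with each prefix of b costs only gaps.
    previous = [j * gap for j in range(len(b) + 1)]
    for i in range(1, len(a) + 1):
        current = [i * gap]  # prefix of a aligned against the empty prefix of b
        for j in range(1, len(b) + 1):
            substitution = match if a[i - 1] == b[j - 1] else mismatch
            current.append(max(
                previous[j - 1] + substitution,  # align a[i-1] with b[j-1]
                previous[j] + gap,               # gap in b
                current[j - 1] + gap,            # gap in a
            ))
        previous = current
    return previous[-1]


print(global_alignment_score("ACGT", "ACGT"))  # 4 (four matches)
print(global_alignment_score("ACGT", "ACT"))   # 2 (three matches, one gap)
```

A higher score means the two sequences can be lined up with fewer mismatches and gaps; whether that similarity also implies shared function remains the presumption discussed above.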

Prediction has been notoriously difficult in biology, and the role of theory in
biology is very different from its role in physics, where theory usually takes
a leading role in research (Buehler & Rashidi, 2005). Many of the difficulties
of theoretical biology are rooted in the complexity of biological systems. Many
cellular components and mechanisms remain to be discovered. Genomics and
proteomics provide good examples of the difficulties in predicting the
behaviour of complex systems. The reasons lie in incomplete data sets and in
incomplete or incorrect data annotation.

4.4 Problems with online tools
The tools used in bioinformatics are applied mathematics and computer science.
Information storage and retrieval, statistical analysis, data fitting, and
computer simulation are central tasks, and today’s molecular biology would be
impossible without them. Computers are essential for processing large amounts
of data in a time-efficient manner, which would be impractical to do manually.
Computers, however, need instructions, and the analytical process that goes
into the system is the work of the human operator and must be included in the
overall time it takes to solve a problem with a numerical processor (Buehler &
Rashidi, 2005). This human intervention takes time and is error-prone.

Running a comparative analysis or searching for predicted transcription start
sites is not an easy task. Complications arise especially with publicly funded
online software: many of these tools have run their course, are no longer
funded by their developers, or severely limit the length of the sequence that
can be submitted.

4.5 Summary of results
To assess the potential function of a novel sequence in an organism, multiple
sequence alignment provides biological information through comparison with
evolutionarily related genes and proteins.

Aligning the sequences of the different species with Mulan allowed the ECRs
shared among those species to be determined. Summarising these confirmed the
regions in which the TFBS are located. Transcription factors are essential for
the regulation of gene expression. Predicting the positions of the different
TFBS shows where these transcription factors bind, in either the enhancer or
the promoter regions of the genes they regulate.

The phylogeny reflects the degree of sequence homology: the greater the
homology between sequences, the closer the evolutionary relationship among the
species.

Overall, the increasing interest in ‘junk’ DNA, that is, DNA believed not to
code for any protein, has over the years led researchers to question whether
such sections of DNA are the remains of previously useful DNA that now has no
function, or whether non-coding DNA provides a structural aid that helps
stabilise chromosomes and the nucleus (Keedwall & Narayanan, 2005).
4.6 Future of comparative genomics
In the next several years, genomes from a wide variety of species covering many
taxa will be sequenced, which will bring many advances in comparative genomics.
Resources for comparative genomics are expected to become much more
user-friendly and to become part of the toolkit of virtually every
experimental biologist. However, building the bioinformatics structure to
realise this exciting potential will require new developments (Miller et al., 2004).

The amount of biological sequence information is increasing very rapidly and
appears to follow an exponential growth law. Computational methods are playing
an increasing role in the biological sciences. Genome sequencing projects have
been remarkably successful, and comparative analysis of whole genomes is now
possible. This provides challenges and opportunities for new types of study in
bioinformatics. At the same time, several types of experimental methods that
may be classed as ‘high-throughput’ are currently being developed (Higgs &
Attwood, 2005). These include microarrays, proteomics, and structural genomics.
The philosophy behind these methods is to study large numbers of genes or
proteins simultaneously, rather than to specialise in individual cases.
Bioinformatics therefore has a role in developing statistical methods for the
analysis of large data sets, and in developing methods of information
management for the new types of data being generated.
Bibliography
Altschul, S. et al., 1990. Basic local alignment search tool. J Mol Biol, 215(3), pp.403-10.
Baum, D., 2008. Reading a Phylogenetic Tree: The Meaning of Monophyletic Groups. Nature Education, 1(1), p.190.
Blanchette, M. & Tompa, M., 2002. Discovery of Regulatory Elements by a Computational Method for Phylogenetic Footprinting. Genome Res, 12(5), pp.739-48.
Brivanlou, A.H. & Darnell, J.E., 2002. Signal Transduction and the Control of Gene Expression. Science, 295(5556), pp.813-18.
Buehler, L.K. & Rashidi, H.H., 2005. Computers in Biology and Medicine. In L.K. Buehler, ed. Bioinformatics Basics: Applications in Biological Science and Medicine. 2nd ed. Boca Raton: CRC Press.
Buhaescu, I. & Izzedine, H., 2007. Mevalonate pathway: a review of clinical and therapeutical implications. Clin Biochem, 40(9-10), pp.575-84.
Campbell, M. & Farell, S., 2012. Lipid Metabolism: Cholesterol Biosynthesis. In A. White, ed. Biochemistry. 7th ed. Belmont: Brooks/Cole. pp.613-20.
Collins, F. & Galas, D., 1993. A New Five-Year Plan for the United States: Human Genome Program. Science, 262, pp.43-46.
Collins, F., Green, E., Guttmacher, A. & Guyer, M., 2003. A Vision for the Future of Genomics Research: A Blueprint for the Genomic Era. Nature, 422(6934), pp.835-47.
Collins, F.S. et al., 1998. New goals for the U.S. Human Genome Project: 1998-2003. Science, 282(5389), pp.682-89.
Dietrich, W.F. et al., 1996. A comprehensive genetic map of the mouse genome. Nature, 380(6570), pp.149-52.
Fletcher, H. & Hickey, I., 2013. DNA Structure. In E. Owen, ed. Genetics. 4th ed. New York: Garland Science. p.2.
Frazer, K. et al., 2004. VISTA: computational tools for comparative genomics. Nucleic Acids Res, 32(Web Server Issue), pp.W273-9.
Freilich, S., Goldovsky, L., Ouzounis, C.A. & Thornton, J.M., 2008. Metabolic innovations towards the human lineage. BMC Evol Biol, 8, p.247.
Fu, Q. et al., 1998. Control of cholesterol biosynthesis in Schwann cells. J Neurochem, 71(2), pp.549-55.
Ganley, A.R. & Kobayashi, T., 2007. Phylogenetic footprinting to find functional DNA elements. Methods Mol Biol, 395, pp.367-80.
Higgs, P.G. & Attwood, T.K., 2005. Introduction: the revolution in biological information. In P.G. Higgs, ed. Bioinformatics and Molecular Evolution. Oxford: Blackwell Science Ltd.
Horton, J., Goldstein, J. & Brown, M., 2002. SREBPs: activators of the complete program of cholesterol and fatty acid synthesis in the liver. J Clin Invest, 109(9), pp.1125-31.
Howard Hughes Medical Institute, 2001. Species: Comparing their Genome. [Online] Available at: http://www.actionbioscience.org/genomics/hhmi.html [Accessed 6 April 2014].
Keedwall, E. & Narayanan, A., 2005. Introduction to Problems and Challenges in Bioinformatics. In E. Keedwall, ed. Intelligent Bioinformatics. West Sussex: John Wiley & Sons Ltd. pp.31-49.
Kent, W., 2002. BLAT—The BLAST-Like Alignment Tool. Genome Res, 12, pp.656-64.
Klug, W.S., Cummings, M.R. & Spencer, C.A., 2007. Introduction to Genetics. In G. Carlson, ed. Essentials of Genetics. 6th ed. New Jersey: Pearson Prentice Hall. p.4.
Korf, B.R., 2007. The Human Genome. In M. Sugden, ed. Human Genetics and Genomics. 3rd ed. Oxford: Blackwell Publishing Ltd. pp.77-80.
Le Roy, C. & Wrana, J.L., 2005. Clathrin- and non-clathrin-mediated endocytic regulation of cell signalling. Nat Rev Mol Cell Biol, 6, pp.112-26.
Lesk, A.M., 2008. Genome organization and evolution. In A.M. Lesk, ed. Introduction to Bioinformatics. 3rd ed. New York: Oxford University Press Inc. p.104.
Lewington, S. et al., 2007. Blood cholesterol and vascular mortality by age, sex, and blood pressure: a meta-analysis of individual data from 61 prospective studies with 55 000 vascular deaths. The Lancet, 370(9602), pp.1829-39.
Lieberman, M. & Ricer, R., 2013. BRS Biochemistry, Molecular Biology and Genetics. 6th ed. Philadelphia: Lippincott Williams & Wilkins.
Loots, G.G. & Ovcharenko, I., 2005. Dcode.org anthology of comparative genomic tools. Nucleic Acids Res, 33(Web server issue), pp.W56-64.
Loots, G.G. & Ovcharenko, I., 2007. Mulan Multiple-Sequence Alignment to Predict Functional Elements in Genomic Sequences. Methods Mol Biol, 395, pp.237-54.
Lu, H., 2011. Application of comparative genomics for the detection of genomic features and transcriptional regulatory elements. Graduate Theses and Dissertations, Paper 12151.
Maxfield, F. & van Meer, G., 2010. Cholesterol, the central lipid of mammalian cells. Curr Opin Cell Biol, 22(4), pp.422-29.
McVean, G.A., 2012. An integrated map of genetic variation from 1,092 human genomes. Nature, 491(7422), pp.56-65.
Miller, W., Makova, K.D., Nekrutenko, A. & Hardison, R.C., 2004. Comparative Genomics. Annu Rev Genomics Hum Genet, 5, pp.15-56.
Mo, H. & Elson, C., 2004. Studies of the isoprenoid-mediated inhibition of mevalonate synthesis applied to cancer chemotherapy and chemoprevention. Exp Biol Med, 229(7), pp.567-85.
National Library of Medicine, 2014. CYP7B1 - cytochrome P450, family 7, subfamily B, polypeptide 1. [Online] Available at: http://ghr.nlm.nih.gov/gene/CYP7B1 [Accessed 13 April 2014].
NCBI RefSeq, 2008. CYP7B1 cytochrome P450, family 7, subfamily B, polypeptide 1 [Homo sapiens (human)]. [Online] Available at: http://www.ncbi.nlm.nih.gov/gene?Db=gene&Cmd=ShowDetailView&TermToSearch=9420 [Accessed 13 April 2014].
Nelson, D.L. & Cox, M.M., 2008. In K. Ahr, ed. Lehninger's Principles of Biochemistry. 5th ed. New York: W.H. Freeman and Company. pp.831-45.
Ohyama, Y. et al., 2006. Studies on the transcriptional regulation of cholesterol 24-hydroxylase (CYP46A1): Marked insensitivity towards different regulatory axes. J Biol Chem, 281(7), pp.3810-20.
Ovcharenko, I. et al., 2005. Mulan: Multiple-sequence local alignment and visualization for studying function and evolution. Genome Res, 15(1), pp.184-94.
Ovcharenko, I. et al., 2004. zPicture: dynamic alignment and visualization tool for analyzing conservation profiles. Genome Res, 14(3), pp.472-77.
Pandak, W. et al., 2002. Regulation of oxysterol 7alpha-hydroxylase (CYP7B1) in primary cultures of rat hepatocytes. Hepatology, 35(6), pp.1400-8.
Parker, S. et al., 2009. Local DNA Topography Correlates with Functional Noncoding Regions of the Human Genome. Science, 324(5925), pp.389-92.
Pennisi, E., 2007. Breakthrough of the year. Human genetic variation. Science, 318(5858), pp.1842-43.
Pennisi, E., 2013. The CRISPR Craze. Science, 341(6148), pp.833-36.
Reece, J.B. et al., 2011. Genomes and Their Evolution: New approaches have accelerated the pace of genome sequencing. In B. Wilbur, ed. Campbell Biology. 9th ed. San Francisco: Pearson Education, Inc. pp.473-74.
Reece, J.B. et al., 2011. The Molecular Basis of Inheritance: DNA is the Genetic Material. In Campbell Biology. 9th ed. San Francisco: Pearson Education, Inc. pp.351-56.
Reece, J.B. et al., 2011. The Structure and Function of Large Biological Molecules: Nucleic acids store, transmit, and help express hereditary information. In B. Wilbur, ed. Campbell Biology. 9th ed. San Francisco: Pearson Education Inc. p.133.
Reed, B. et al., 2008. Genome-Wide Occupancy of SREBP1 and Its Partners NFY and SP1 Reveals Novel Functional Roles and Combinatorial Regulation of Distinct Classes of Genes. PLoS Genet, 4(7).
Ren, S. et al., 2003. Regulation of oxysterol 7alpha-hydroxylase (CYP7B1) in the rat. Metabolism, 52(5), pp.636-42.
Rice, S.A., 2009. DNA (raw material of evolution). In S.A. Rice, ed. Encyclopedia of Evolution. New York: Infobase Publishing. p.134.
Rosanoff, A. & Seelig, M.S., 2004. Comparison of Mechanism and Functional Effects of Magnesium and Statin Pharmaceuticals. J Am Coll Nutr, 23(5), pp.501-05.
Rose, K. et al., 2001. Neurosteroid Hydroxylase CYP7B vivid reporter activity in dentate gyrus of gene-targeted mice and abolition of a widespread pathway of steroid and oxysterol hydroxylation. Journal of Biological Chemistry, 276, pp.23937-44.
Rose, K.A. et al., 1997. Cyp7b, a novel brain cytochrome P450, catalyzes the synthesis of neurosteroids 7alpha-hydroxy dehydroepiandrosterone and 7alpha-hydroxy pregnenolone. Proceedings of the National Academy of Sciences of the United States of America, 94(10), pp.4925-30.
Saher, G. et al., 2005. High cholesterol level is essential for myelin membrane growth. Nat Neurosci, 8(4), pp.468-75.
Saher, G. et al., 2009. Cholesterol Regulates the Endoplasmic Reticulum Exit of the Major Membrane Protein P0 Required for Peripheral Myelin Compaction. J Neurosci, 29(19), pp.6094-104.
Saher, G., Quintes, S. & Nave, K., 2011. Cholesterol: a novel regulatory role in myelin formation. The Neuroscientist, 17(1), pp.79-93.
Samuelsson, T., 2012. Genomics and Bioinformatics. New York: Cambridge University Press.
Sanger, F., 1981. Determination of Nucleotide Sequences in DNA. Science, 214(4526), pp.1205-10.
Schlicker, A., 2005. A Global Approach to Comparative Genomics: Comparison of Functional Annotation over the Taxonomic Tree. Master Thesis.
Schwartz, S. et al., 2000. PipMaker--a web server for aligning two genomic DNA sequences. Genome Res, 10(4), pp.577-86.
Setchel, K.D. et al., 1998. Identification of a new inborn error in bile acid synthesis: mutation of the oxysterol 7alpha-hydroxylase gene causes severe neonatal liver disease. J Clin Invest, 102(9), pp.1690-703.
Shafaati, M., O'Driscoll, R., Björkhem, I. & Meaney, S., 2009. Transcriptional regulation of cholesterol 24-hydroxylase by histone deacetylase inhibitors. Biochem Biophys Res Commun, 378(4), pp.689-94.
Sievers, F. et al., 2011. Fast, scalable generation of high-quality protein multiple sequence alignments using Clustal Omega. Mol Syst Biol, 7, p.539.
Solovyev, V., 2002. Finding genes by computer: probabilistic and discriminative approaches. In T. Jiang, T. Smith, Y. Xu & M. Zhang, eds. Current Topics in Computational Biology. Massachusetts: The MIT Press. pp.365-401.
Solovyev, V., Kosarev, P., Seledsov, I. & Vorobyev, D., 2006. Automatic annotation of eukaryotic genes, pseudogenes and promoters. Genome Biol, 7(1), p.S10.
Solovyev, V.V., Shahmuradov, I.A. & Salamov, A.A., 2010. Identification of promoter regions and regulatory sites. Methods Mol Biol, 674, pp.57-83.
Stapleton, G. et al., 1995. A novel cytochrome P450 expressed primarily in brain. J Biol Chem, 270(50), pp.29739-45.
Stiles, A., McDonald, J., Bauman, D. & Russell, D., 2009. CYP7B1: One Cytochrome P450, Two Human Genetic Diseases, and Multiple Physiological Functions. J Biol Chem, 284(42), pp.28485-89.
Strachan, T. & Read, A., 2011. Comparative Genomics. In T. Strachan, ed. Human Molecular Genetics. 4th ed. New York: Garland Science. p.306.
Sun, L.-P., Li, L., Goldstein, J.L. & Brown, M.S., 2005. Insig Required for Sterol-mediated Inhibition of Scap/SREBP Binding to COPII Proteins in Vitro. J Biol Chem, 280, pp.26483-90.
Thompson, J., Higgins, D. & Gibson, T., 1994. CLUSTAL W: improving the sensitivity of progressive multiple sequence alignment through sequence weighting, position-specific gap penalties and weight matrix choice. Nucleic Acids Res, 22(22), pp.4673-80.
Thompson, W. et al., 2004. Decoding Human Regulatory Circuits. Genome Res, 14(10a), pp.1967-74.
Touchman, J., 2010. Comparative Genomics. Nature Education Knowledge, 3(10), p.13.
van der Most, P. et al., 2009. Statins: Mechanisms of neuroprotection. Prog Neurobiol, 88, pp.64-75.
Wasserman, W.W. et al., 2000. Human-mouse genome comparisons to locate regulatory sites. Nat Genet, 26(2), pp.225-28.
Watson, J.D. & Crick, F.H.C., 1953. A Structure for Deoxyribose Nucleic Acid. Nature, 171, pp.737-38.
Weiling, F., 1991. Historical study: Johann Gregor Mendel 1822–1884. Am J Med Genet, 40(1), pp.1-25.
Wheelan, S.J., Church, D.M. & Ostell, J.M., 2001. Spidey: A tool for mRNA-to-Genomic Alignments. Genome Res, 11(11), pp.1952-57.
Wilcox, C. et al., 2007. Coordinate up-regulation of TMEM97 and cholesterol biosynthesis genes in normal ovarian surface epithelial cells treated with progesterone: implications for pathogenesis of ovarian cancer. BMC Cancer, 7, p.223.
Ye, J., McGinnis, S. & Madden, T.L., 2006. BLAST: improvements for better sequence analysis. Nucleic Acids Res, 34(Web Server), pp.W36-39.
Zhang, Z. & Gerstein, M., 2003. Of mice and men: phylogenetic footprinting aids the discovery of regulatory elements. J Biol, 2(2), p.11.
Zvelebil, M. & Baum, J.O., 2008. Producing and Analyzing Sequence Alignments. In D. Holdsworth, ed. Understanding Bioinformatics. New York: Garland Science. pp.89-90.
Appendix I
Exact positions of the exons of each species, predicted using Spidey.
Human
Armadillo
Cat
Cow
Dog
Dolphin
Elephant
Guinea Pig
Horse
Mouse
Rabbit
