Homology Modeling of the Human PAX8 Protein and Mechanisms for Sequence-Specific DNA Recognition
Abhishek Dabral
School of Biology, Georgia Institute of Technology
What is PAX?
The PAX gene family encodes a group of transcription factors that have been conserved through millions of years of evolution and play roles in early development.
Pax proteins are transcriptional regulators with critical roles in mammalian development; mutations of PAX genes cause profound developmental defects.
PAX Organization
► All PAX proteins have a paired domain (PD), which spans 128 amino acids near the N-terminus and consists of two helix-turn-helix (HTH) motifs.
► Sequence conservation among PAX proteins is highest in the paired domain but also extends to a paired-type homeodomain (HD) and to a stretch of residues between the paired domain and the homeodomain called the octapeptide (OP).
PAX Structure
► The PD is composed of amino- and carboxy-terminal subdomains, each of which is made up of 3 alpha helices resembling the HTH (helix-turn-helix) motif found in all HDs.
► The third helix of PD and HD proteins interacts with the major groove of the DNA.
► PDs do not adopt a fixed structure unless bound to DNA; this flexibility lends the domain great diversity as a protein.
In mammals, 9 PAX genes have been identified.
PAX genes are divided into 4 subgroups based on:
► Genomic Structure
► Sequence Similarity
► Conserved Function
PAX Subgroups
PAX 8 is the only member of the family expressed in thyroid tissue.
PAX 8 cooperates with TTF1 (Thyroid Transcription Factor 1) to influence thyroid-specific gene regulation.
Pax8 is essential for correct development of the thyroid gland: inactivation of the Pax8 gene causes absence of follicular cells and therefore absence of thyroid hormone.
PAX 8 is co-expressed with the Wilms' tumor gene (WT1) during kidney development, suggesting a possible interaction.
The PAX Families
Splice Variants in PAX 8
Alternative splicing in the PAX gene by inclusion or exclusion of exons 7 and/or 8 has produced several known products, but the biological significance of the variants is unknown.
The human PAX8 gene generates at least five different alternatively spliced transcripts encoding different PAX8 isoforms.
. . . .10 . . . .20 . . . .30 . . . .40 . . . .50 . . . .60 . . . .70 . . . .80 . . . .90 . . . 100 . . . 110 . . . 120 . . . 130 . . . 140 . . . 150 . . . 160 . . . 170 . . . 180 . . . 190 . . . 200 . . . 210 . . . 220 . . . 230 . . . 240 . . . 250
pax8A_mRNA 1:GGGAACAAACTTCAGAAGGAGGAGAGACACCGGGCCCAGGGCACCCTCGCGGGCGGACCCAAGCAGTGAGGGCCTGCAGCCGGCCGGCCAGGGCAGCGGCAGGCGCGGCCCGGACCTACGGGAGGAAGCCCCGAGCCCTCGGCGGGCTGCGAGCGACTCCCCGGCGATGCCTCACAACTCCATCAGATCTGGCCATGGAGGGCTGAACCAGCTGGGAGGGGCCTTTGTGAATGGCAGACCTCTGCCGGAA: 250
pax8B_mRNA 1:GGGAACAAACTTCAGAAGGAGGAGAGACACCGGGCCCAGGGCACCCTCGCGGGCGGACCCAAGCAGTGAGGGCCTGCAGCCGGCCGGCCAGGGCAGCGGCAGGCGCGGCCCGGACCTACGGGAGGAAGCCCCGAGCCCTCGGCGGGCTGCGAGCGACTCCCCGGCGATGCCTCACAACTCCATCAGATCTGGCCATGGAGGGCTGAACCAGCTGGGAGGGGCCTTTGTGAATGGCAGACCTCTGCCGGAA: 250
pax8C_mRNA 1:GGGAACAAACTTCAGAAGGAGGAGAGACACCGGGCCCAGGGCACCCTCGCGGGCGGACCCAAGCAGTGAGGGCCTGCAGCCGGCCGGCCAGGGCAGCGGCAGGCGCGGCCCGGACCTACGGGAGGAAGCCCCGAGCCCTCGGCGGGCTGCGAGCGACTCCCCGGCGATGCCTCACAACTCCATCAGATCTGGCCATGGAGGGCTGAACCAGCTGGGAGGGGCCTTTGTGAATGGCAGACCTCTGCCGGAA: 250
pax8D_mRNA 1:GGGAACAAACTTCAGAAGGAGGAGAGACACCGGGCCCAGGGCACCCTCGCGGGCGGACCCAAGCAGTGAGGGCCTGCAGCCGGCCGGCCAGGGCAGCGGCAGGCGCGGCCCGGACCTACGGGAGGAAGCCCCGAGCCCTCGGCGGGCTGCGAGCGACTCCCCGGCGATGCCTCACAACTCCATCAGATCTGGCCATGGAGGGCTGAACCAGCTGGGAGGGGCCTTTGTGAATGGCAGACCTCTGCCGGAA: 250
pax8E_mRNA 1:GGGAACAAACTTCAGAAGGAGGAGAGACACCGGGCCCAGGGCACCCTCGCGGGCGGACCCAAGCAGTGAGGGCCTGCAGCCGGCCGGCCAGGGCAGCGGCAGGCGCGGCCCGGACCTACGGGAGGAAGCCCCGAGCCCTCGGCGGGCTGCGAGCGACTCCCCGGCGATGCCTCACAACTCCATCAGATCTGGCCATGGAGGGCTGAACCAGCTGGGAGGGGCCTTTGTGAATGGCAGACCTCTGCCGGAA: 250
. . . 260 . . . 270 . . . 280 . . . 290 . . . 300 . . . 310 . . . 320 . . . 330 . . . 340 . . . 350 . . . 360 . . . 370 . . . 380 . . . 390 . . . 400 . . . 410 . . . 420 . . . 430 . . . 440 . . . 450 . . . 460 . . . 470 . . . 480 . . . 490 . . . 500
pax8A_mRNA 251:GTGGTCCGCCAGCGCATCGTAGACCTGGCCCACCAGGGTGTAAGGCCCTGCGACATCTCTCGCCAGCTCCGCGTCAGCCATGGCTGCGTCAGCAAGATCCTTGGCAGGTACTACGAGACTGGCAGCATCCGGCCTGGAGTGATAGGGGGCTCCAAGCCCAAGGTGGCCACCCCCAAGGTGGTGGAGAAGATTGGGGACTACAAACGCCAGAACCCTACCATGTTTGCCTGGGAGATCCGAGACCGGCTCC: 500
pax8B_mRNA 251:GTGGTCCGCCAGCGCATCGTAGACCTGGCCCACCAGGGTGTAAGGCCCTGCGACATCTCTCGCCAGCTCCGCGTCAGCCATGGCTGCGTCAGCAAGATCCTTGGCAGGTACTACGAGACTGGCAGCATCCGGCCTGGAGTGATAGGGGGCTCCAAGCCCAAGGTGGCCACCCCCAAGGTGGTGGAGAAGATTGGGGACTACAAACGCCAGAACCCTACCATGTTTGCCTGGGAGATCCGAGACCGGCTCC: 500
pax8C_mRNA 251:GTGGTCCGCCAGCGCATCGTAGACCTGGCCCACCAGGGTGTAAGGCCCTGCGACATCTCTCGCCAGCTCCGCGTCAGCCATGGCTGCGTCAGCAAGATCCTTGGCAGGTACTACGAGACTGGCAGCATCCGGCCTGGAGTGATAGGGGGCTCCAAGCCCAAGGTGGCCACCCCCAAGGTGGTGGAGAAGATTGGGGACTACAAACGCCAGAACCCTACCATGTTTGCCTGGGAGATCCGAGACCGGCTCC: 500
pax8D_mRNA 251:GTGGTCCGCCAGCGCATCGTAGACCTGGCCCACCAGGGTGTAAGGCCCTGCGACATCTCTCGCCAGCTCCGCGTCAGCCATGGCTGCGTCAGCAAGATCCTTGGCAGGTACTACGAGACTGGCAGCATCCGGCCTGGAGTGATAGGGGGCTCCAAGCCCAAGGTGGCCACCCCCAAGGTGGTGGAGAAGATTGGGGACTACAAACGCCAGAACCCTACCATGTTTGCCTGGGAGATCCGAGACCGGCTCC: 500
pax8E_mRNA 251:GTGGTCCGCCAGCGCATCGTAGACCTGGCCCACCAGGGTGTAAGGCCCTGCGACATCTCTCGCCAGCTCCGCGTCAGCCATGGCTGCGTCAGCAAGATCCTTGGCAGGTACTACGAGACTGGCAGCATCCGGCCTGGAGTGATAGGGGGCTCCAAGCCCAAGGTGGCCACCCCCAAGGTGGTGGAGAAGATTGGGGACTACAAACGCCAGAACCCTACCATGTTTGCCTGGGAGATCCGAGACCGGCTCC: 500
. . . 510 . . . 520 . . . 530 . . . 540 . . . 550 . . . 560 . . . 570 . . . 580 . . . 590 . . . 600 . . . 610 . . . 620 . . . 630 . . . 640 . . . 650 . . . 660 . . . 670 . . . 680 . . . 690 . . . 700 . . . 710 . . . 720 . . . 730 . . . 740 . . . 750
pax8A_mRNA 501:TGGCTGAGGGCGTCTGTGACAATGACACTGTGCCCAGTGTCAGCTCCATTAATAGAATCATCCGGACCAAAGTGCAGCAACCATTCAACCTCCCTATGGACAGCTGCGTGGCCACCAAGTCCCTGAGTCCCGGACACACGCTGATCCCCAGCTCAGCTGTAACTCCCCCGGAGTCACCCCAGTCGGATTCCCTGGGCTCCACCTACTCCATCAATGGGCTCCTGGGCATCGCTCAGCCTGGCAGCGACAA: 750
pax8B_mRNA 501:TGGCTGAGGGCGTCTGTGACAATGACACTGTGCCCAGTGTCAGCTCCATTAATAGAATCATCCGGACCAAAGTGCAGCAACCATTCAACCTCCCTATGGACAGCTGCGTGGCCACCAAGTCCCTGAGTCCCGGACACACGCTGATCCCCAGCTCAGCTGTAACTCCCCCGGAGTCACCCCAGTCGGATTCCCTGGGCTCCACCTACTCCATCAATGGGCTCCTGGGCATCGCTCAGCCTGGCAGCGACAA: 750
pax8C_mRNA 501:TGGCTGAGGGCGTCTGTGACAATGACACTGTGCCCAGTGTCAGCTCCATTAATAGAATCATCCGGACCAAAGTGCAGCAACCATTCAACCTCCCTATGGACAGCTGCGTGGCCACCAAGTCCCTGAGTCCCGGACACACGCTGATCCCCAGCTCAGCTGTAACTCCCCCGGAGTCACCCCAGTCGGATTCCCTGGGCTCCACCTACTCCATCAATGGGCTCCTGGGCATCGCTCAGCCTGGCAGCGACAA: 750
pax8D_mRNA 501:TGGCTGAGGGCGTCTGTGACAATGACACTGTGCCCAGTGTCAGCTCCATTAATAGAATCATCCGGACCAAAGTGCAGCAACCATTCAACCTCCCTATGGACAGCTGCGTGGCCACCAAGTCCCTGAGTCCCGGACACACGCTGATCCCCAGCTCAGCTGTAACTCCCCCGGAGTCACCCCAGTCGGATTCCCTGGGCTCCACCTACTCCATCAATGGGCTCCTGGGCATCGCTCAGCCTGGCAGCGACAA: 750
pax8E_mRNA 501:TGGCTGAGGGCGTCTGTGACAATGACACTGTGCCCAGTGTCAGCTCCATTAATAGAATCATCCGGACCAAAGTGCAGCAACCATTCAACCTCCCTATGGACAGCTGCGTGGCCACCAAGTCCCTGAGTCCCGGACACACGCTGATCCCCAGCTCAGCTGTAACTCCCCCGGAGTCACCCCAGTCGGATTCCCTGGGCTCCACCTACTCCATCAATGGGCTCCTGGGCATCGCTCAGCCTGGCAGCGACAA: 750
. . . 760 . . . 770 . . . 780 . . . 790 . . . 800 . . . 810 . . . 820 . . . 830 . . . 840 . . . 850 . . . 860 . . . 870 . . . 880 . . . 890 . . . 900 . . . 910 . . . 920 . . . 930 . . . 940 . . . 950 . . . 960 . . . 970 . . . 980 . . . 990 . . .1000
pax8A_mRNA 751:GAGGAAAATGGATGACAGTGATCAGGATAGCTGCCGACTAAGCATTGACTCACAGAGCAGCAGCAGCGGACCCCGAAAGCACCTTCGCACGGATGCCTTCAGCCAGCACCACCTCGAGCCGCTCGAGTGCCCATTTGAGCGGCAGCACTACCCAGAGGCCTATGCCTCCCCCAGCCACACCAAAGGCGAGCAGGGCCTCTACCCGCTGCCCTTGCTCAACAGCACCCTGGACGACGGGAAGGCCACCCTG:1000
pax8B_mRNA 751:GAGGAAAATGGATGACAGTGATCAGGATAGCTGCCGACTAAGCATTGACTCACAGAGCAGCAGCAGCGGACCCCGAAAGCACCTTCGCACGGATGCCTTCAGCCAGCACCACCTCGAGCCGCTCGAGTGCCCATTTGAGCGGCAGCACTACCCAGAGGCCTATGCCTCCCCCAGCCACACCAAAGGCGAGCAGGGCCTCTACCCGCTGCCCTTGCTCAACAGCACCCTGGACGACGGGAAGGCCACCCTG:1000
pax8C_mRNA 751:GAGGAAAATGGATGACAGTGATCAGGATAGCTGCCGACTAAGCATTGACTCACAGAGCAGCAGCAGCGGACCCCGAAAGCACCTTCGCACGGATGCCTTCAGCCAGCACCACCTCGAGCCGCTCGAGTGCCCATTTGAGCGGCAGCACTACCCAGAGGCCTATGCCTCCCCCAGCCACACCAAAGGCGAGCAGGGCCTCTACCCGCTGCCCTTGCTCAACAGCACCCTGGACGACGGGAAGGCCACCCTG:1000
pax8D_mRNA 751:GAGGAAAATGGATGACAGTGATCAGGATAGCTGCCGACTAAGCATTGACTCACAGAGCAGCAGCAGCGGACCCCGAAAGCACCTTCGCACGGATGCCTTCAGCCAGCACCACCTCGAGCCGCTCGAGTGCCCATTTGAGCGGCAGCACTACCCAGAGGCCTATGCCTCCCCCAGCCACACCAAAGGCGAGCAGGGC......................................................: 946
pax8E_mRNA 751:GAGGAAAATGGATGACAGTGATCAGGATAGCTGCCGACTAAGCATTGACTCACAGAGCAGCAGCAGCGGACCCCGAAAGCACCTTCGCACGGATGCCTTCAGCCAGCACCACCTCGAGCCGCTCGAGTGCCCATTTGAGCGGCAGCACTACCCAGAGGCCTATGCCTCCCCCAGCCACACCAAAGGCGAGC...........................................................: 941
. . .1010 . . .1020 . . .1030 . . .1040 . . .1050 . . .1060 . . .1070 . . .1080 . . .1090 . . .1100 . . .1110 . . .1120 . . .1130 . . .1140 . . .1150 . . .1160 . . .1170 . . .1180 . . .1190 . . .1200 . . .1210 . . .1220 . . .1230 . . .1240 . . .1250
pax8A_mRNA 1001:ACCCCTTCCAACACGCCACTGGGGCGCAACCTCTCGACTCACCAGACCTACCCCGTGGTGGCAGATCCTCACTCACCCTTCGCCATAAAGCAGGAAACCCCCGAGGTGTCCAGTTCTAGCTCCACCCCTTCCTCTTTATCTAGCTCCGCCTTTTTGGATCTGCAGCAAGTCGGCTCCGGGGTCCCGCCCTTCAATGCCTTTCCCCATGCTGCCTCCGTGTACGGGCAGTTCACGGGCCAGGCCCTCCTCT:1250
pax8B_mRNA 1001:ACCCCTTCCAACACGCCACTGGGGCGCAACCTCTCGACTCACCAGACCTACCCCGTGGTGGCAG..........................................................................................................................................................................................:1064
pax8C_mRNA 1001:ACCCCTTCCAACACGCCACTGGGGCGCAACCTCTCGACTCACCAGACCTACCCCGTGGTGGCAG...............................................................................CTCCGCCTTTTTGGATCTGCAGCAAGTCGGCTCCGGGGTCCCGCCCTTCAATGCCTTTCCCCATGCTGCCTCCGTGTACGGGCAGTTCACGGGCCAGGCCCTCCTCT:1171
pax8D_mRNA 946:..........................................................................................................................................................................................................................................................: 946
pax8E_mRNA 941:..........................................................................................................................................................................................................................................................: 941
. . .1260 . . .1270 . . .1280 . . .1290 . . .1300 . . .1310 . . .1320 . . .1330 . . .1340 . . .1350 . . .1360 . . .1370 . . .1380 . . .1390 . . .1400 . . .1410 . . .1420 . . .1430 . . .1440 . . .1450 . . .1460 . . .1470 . . .1480 . . .1490 . . .1500
pax8A_mRNA 1251:CAGGGCGAGAGATGGTGGGGCCCACGCTGCCCGGATACCCACCCCACATCCCCACCAGCGGACAGGGCAGCTATGCCTCCTCTGCCATCGCAGGCATGGTGGCAGGAAGTGAATACTCTGGCAATGCCTATGGCCACACCCCCTACTCCTCCTACAGCGAGGCCTGGCGCTTCCCCAACTCCAGCTTGCTGAGTTCCCCATATTATTACAGTTCCACATCAAGGCCGAGTGCACCGCCCACCACTGCCAC:1500
pax8B_mRNA 1064:...GGCGAGAGATGGTGGGGCCCACGCTGCCCGGATACCCACCCCACATCCCCACCAGCGGACAGGGCAGCTATGCCTCCTCTGCCATCGCAGGCATGGTGGCAGGAAGTGAATACTCTGGCAATGCCTATGGCCACACCCCCTACTCCTCCTACAGCGAGGCCTGGCGCTTCCCCAACTCCAGCTTGCTGAGTTCCCCATATTATTACAGTTCCACATCAAGGCCGAGTGCACCGCCCACCACTGCCAC:1311
pax8C_mRNA 1172:CAGGGCGAGAGATGGTGGGGCCCACGCTGCCCGGATACCCACCCCACATCCCCACCAGCGGACAGGGCAGCTATGCCTCCTCTGCCATCGCAGGCATGGTGGCAGGAAGTGAATACTCTGGCAATGCCTATGGCCACACCCCCTACTCCTCCTACAGCGAGGCCTGGCGCTTCCCCAACTCCAGCTTGCTGAGTTCCCCATATTATTACAGTTCCACATCAAGGCCGAGTGCACCGCCCACCACTGCCAC:1421
pax8D_mRNA 946:......GAGAGATGGTGGGGCCCACGCTGCCCGGATACCCACCCCACATCCCCACCAGCGGACAGGGCAGCTATGCCTCCTCTGCCATCGCAGGCATGGTGGCAGGAAGTGAATACTCTGGCAATGCCTATGGCCACACCCCCTACTCCTCCTACAGCGAGGCCTGGCGCTTCCCCAACTCCAGCTTGCTGAGTTCCCCATATTATTACAGTTCCACATCAAGGCCGAGTGCACCGCCCACCACTGCCAC:1190
pax8E_mRNA 941:.......................................................................................................AGGAAGTGAATACTCTGGCAATGCCTATGGCCACACCCCCTACTCCTCCTACAGCGAGGCCTGGCGCTTCCCCAACTCCAGCTTGCTGAGTTCCCCATATTATTACAGTTCCACATCAAGGCCGAGTGCACCGCCCACCACTGCCAC:1088
. . .1510 . . .1520 . . .1530 . . .1540 . . .1550 . . .1560 . . .1570 . . .1580 . . .1590 . . .1600 . . .1610 . . .1620 . . .1630 . . .1640 . . .1650 . . .1660 . . .1670 . . .1680 . . .1690 . . .1700 . . .1710 . . .1720 . . .1730 . . .1740 . . .1750
pax8A_mRNA 1501:GGCCTTTGACCATCTGTAGTTGCCATGGGGACAGTGGGAGCGACTGAGCAACAGGAGGACTCAGCCTGGGACAGGCCCCAGAGAGTCACACAAAGGAATCTTTATTTATTACATGAAAAATAACCACAAGTCCAGCATTGCGGCACACTCCCTGTGTGGTTAATTTAATGAACCATGAAAGACAGGATGACCTTGGACAAGGCCAAACTGTCCTCCAAGACTCCTTAATGAGGGGCAGGAGTCCCAGGGA:1750
pax8B_mRNA 1312:GGCCTTTGACCATCTGTAGTTGCCATGGGGACAGTGGGAGCGACTGAGCAACAGGAGGACTCAGCCTGGGACAGGCCCCAGAGAGTCACACAAAGGAATCTTTATTTATTACATGAAAAATAACCACAAGTCCAGCATTGCGGCACACTCCCTGTGTGGTTAATTTAATGAACCATGAAAGACAGGATGACCTTGGACAAGGCCAAACTGTCCTCCAAGACTCCTTAATGAGGGGCAGGAGTCCCAGGGA:1561
pax8C_mRNA 1422:GGCCTTTGACCATCTGTAGTTGCCATGGGGACAGTGGGAGCGACTGAGCAACAGGAGGACTCAGCCTGGGACAGGCCCCAGAGAGTCACACAAAGGAATCTTTATTTATTACATGAAAAATAACCACAAGTCCAGCATTGCGGCACACTCCCTGTGTGGTTAATTTAATGAACCATGAAAGACAGGATGACCTTGGACAAGGCCAAACTGTCCTCCAAGACTCCTTAATGAGGGGCAGGAGTCCCAGGGA:1671
pax8D_mRNA 1191:GGCCTTTGACCATCTGTAGTTGCCATGGGGACAGTGGGAGCGACTGAGCAACAGGAGGACTCAGCCTGGGACAGGCCCCAGAGAGTCACACAAAGGAATCTTTATTTATTACATGAAAAATAACCACAAGTCCAGCATTGCGGCACACTCCCTGTGTGGTTAATTTAATGAACCATGAAAGACAGGATGACCTTGGACAAGGCCAAACTGTCCTCCAAGACTCCTTAATGAGGGGCAGGAGTCCCAGGGA:1440
pax8E_mRNA 1089:GGCCTTTGACCATCTGTAGTTGCCATGGGGACAGTGGGAGCGACTGAGCAACAGGAGGACTCAGCCTGGGACAGGCCCCAGAGAGTCACACAAAGGAATCTTTATTTATTACATGAAAAATAACCACAAGTCCAGCATTGCGGCACACTCCCTGTGTGGTTAATTTAATGAACCATGAAAGACAGGATGACCTTGGACAAGGCCAAACTGTCCTCCAAGACTCCTTAATGAGGGGCAGGAGTCCCAGGGA:1338
. . .1760 . . .1770 . . .1780 . . .1790 . . .1800 . . .1810 . . .1820 . . .1830 . . .1840 . . .1850 . . .1860 . . .1870 . . .1880 . . .1890 . . .1900 . . .1910 . . .1920 . . .1930 . . .1940 . . .1950 . . .1960 . . .1970 . . .1980 . . .1990 . . .2000
pax8A_mRNA 1751:AAGAGAACCATGCCATGCTGAAAAAGACAAAATTGAAGAAGAAATGTAGCCCCCAGCCGGTACCCACCAAAGGAGAGAAGAAGCAATAGCCGAGGAACTTGGGGGGATGGCGAATGGTTCCTGCCCGGGCCCAAGGGGTGCACAGGGCACCTCCATGGCTCCATTATTAACACAACTCTAGCAATTATGGACCATAAGCACTTCCCTCCAGCCCACAAGTCACAGCCTGGTGCCGAGGCTCTCCTCACCA:2000
pax8B_mRNA 1562:AAGAGAACCATGCCATGCTGAAAAAGACAAAATTGAAGAAGAAATGTAGCCCCCAGCCGGTACCCACCAAAGGAGAGAAGAAGCAATAGCCGAGGAACTTGGGGGGATGGCGAATGGTTCCTGCCCGGGCCCAAGGGGTGCACAGGGCACCTCCATGGCTCCATTATTAACACAACTCTAGCAATTATGGACCATAAGCACTTCCCTCCAGCCCACAAGTCACAGCCTGGTGCCGAGGCTCTCCTCACCA:1811
pax8C_mRNA 1672:AAGAGAACCATGCCATGCTGAAAAAGACAAAATTGAAGAAGAAATGTAGCCCCCAGCCGGTACCCACCAAAGGAGAGAAGAAGCAATAGCCGAGGAACTTGGGGGGATGGCGAATGGTTCCTGCCCGGGCCCAAGGGGTGCACAGGGCACCTCCATGGCTCCATTATTAACACAACTCTAGCAATTATGGACCATAAGCACTTCCCTCCAGCCCACAAGTCACAGCCTGGTGCCGAGGCTCTCCTCACCA:1921
pax8D_mRNA 1441:AAGAGAACCATGCCATGCTGAAAAAGACAAAATTGAAGAAGAAATGTAGCCCCCAGCCGGTACCCACCAAAGGAGAGAAGAAGCAATAGCCGAGGAACTTGGGGGGATGGCGAATGGTTCCTGCCCGGGCCCAAGGGGTGCACAGGGCACCTCCATGGCTCCATTATTAACACAACTCTAGCAATTATGGACCATAAGCACTTCCCTCCAGCCCACAAGTCACAGCCTGGTGCCGAGGCTCTCCTCACCA:1690
pax8E_mRNA 1339:AAGAGAACCATGCCATGCTGAAAAAGACAAAATTGAAGAAGAAATGTAGCCCCCAGCCGGTACCCACCAAAGGAGAGAAGAAGCAATAGCCGAGGAACTTGGGGGGATGGCGAATGGTTCCTGCCCGGGCCCAAGGGGTGCACAGGGCACCTCCATGGCTCCATTATTAACACAACTCTAGCAATTATGGACCATAAGCACTTCCCTCCAGCCCACAAGTCACAGCCTGGTGCCGAGGCTCTCCTCACCA:1588
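The alignment above marks regions missing from an isoform with runs of dots. As a minimal sketch, assuming each aligned row is available as a plain string in which "." denotes a gap, such runs could be located as follows. The function name gap_runs and the toy rows are illustrative and are not taken from the slides; the reported positions are alignment columns, not transcript coordinates.

```python
from typing import List, Tuple


def gap_runs(row: str, gap: str = ".") -> List[Tuple[int, int]]:
    """Return (start, end) alignment columns, 1-based inclusive, of each gap run."""
    runs = []
    start = None
    for col, char in enumerate(row, start=1):
        if char == gap:
            if start is None:
                start = col          # a new run of gap characters begins here
        elif start is not None:
            runs.append((start, col - 1))  # the run ended on the previous column
            start = None
    if start is not None:            # the row ends inside a gap run
        runs.append((start, len(row)))
    return runs


# Toy rows (hypothetical, not the PAX8 sequences).
toy_alignment = {
    "isoformA": "ACGTACGTACGT",
    "isoformB": "ACGT....ACGT",
    "isoformC": "ACGTAC......",
}

for name, row in toy_alignment.items():
    print(name, gap_runs(row))
```

Comparing the gap runs of two isoforms against the full-length row gives a quick view of which aligned segments each variant lacks.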
[Figure: PAX 8 gene and protein map. Labels on the figure: 5' UTR, ATG, UAG, exons 1-11, siRNA.]
Protein sequence shown on the figure:
MPHNSIRSGHGGLNQLGGAFVNGRPLPE
VVRQRIVDLAHQGVRPCDISRQLRVSHGCVSKILGRYYETGSIRPGVIGGSKPKVATPKVVEKIGDYKRQNPTMFAWEIRDRLL
LAEGVCDNDTVPSVSSINRIIRTKVQQPFNLPMDSCVATKSLSPGHTLIPSSAVTPPESPQSDSLGSTYSINGLLGIAQPGSDK
RKMDDSDQDSCRLSIDSQSSSSGPRKHLRTDAFSQHHLEPLECPFERQHYPEAYASPSHTKGEQG LYPLPLLNSTLDDGKATLT
PSNTPLGRNLSTHQTYPVVAD PHSPFAIKQETPEVSSSSSTPSSLSSSAFLDLQQVGSGVPPFNAFPHAASVYGQFTGQALLS
GREMVGPTLPGYPPHIPTSGQGSYASSAIAGMVAG SEYSGNAYGHTPYSSYSEAWRFPNSSLLSSPYYYSSTSRPSAPPTTAT
AFDHL
Figure legend (colour key lost in extraction): paired domain; octapeptide; partial homeodomain; activation domain; repression domain; intron with de novo CpG island; translocation breakpoints with PPAR-gamma; intron/exon boundaries.
Splice Variants in PAX 8
What questions could a PAX 8 model answer?
Better understanding of:
Paired domain-DNA interaction
► Biological function of the PD
Function of the N and C subdomains
► Specific DNA contacts made by them
► Do they cooperate with each other? Does one affect the function of the other, and how?
Effects of mutations
► Relation to the abnormal phenotype
Why Homology modeling?
► No solved X-ray structure for our target protein, i.e. PAX 8
Moreover:
► Solving an X-ray structure is both time-consuming and expensive
► Only a small number of proteins can be made to form crystals, and a crystal is not the protein's native state.
Why Homology modeling?
► No solved NMR structure for our target protein, i.e. PAX 8
Moreover:
► NMR does not work too well for protein complexes.
► Very time-consuming
Steps for Homology Modeling
► Obtain Target Sequence
► Get Information about Target Protein
► Template Selection (Crystal Structures)
► Initial Model
► Validate Model
Tools and resources named on the flowchart:
► Sequence Database (GenBank)
► Blastp, CDD
► BLAST PDB database
► Clean PDB files
► Create alignment of target with template sequences (convert aln to ali)
► CLUSTALW
► MODELER
► MODELER TOOLBOX
► WHAT IF, PROCHECK, 3D JIGSAW, Esypred, SWISS Model, FUGUE
► RAMPAGE
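The "convert aln to ali" step on the flowchart can be sketched in a few lines. The example below is a minimal sketch, not the procedure used for the slides: it assumes the common PIR-style layout for modeling alignment files (a ">P1;code" line, a second line of ten colon-separated fields, then the gapped sequence ending in "*"). The function names parse_clustal and to_pir, the toy sequences, and the residue range and chain given for the template are all hypothetical.

```python
# Toy Clustal-format text (hypothetical sequences, for illustration only).
CLUSTAL_TEXT = """CLUSTAL W (1.83) multiple sequence alignment

target          HGGLNQLGG-AFV
template        HSGVNQLGGVFV-
                * *:*****

target          NGRP
template        NGRP
                ****
"""


def parse_clustal(text: str) -> dict:
    """Collect the gapped sequence of each named row in a Clustal alignment."""
    seqs = {}
    for line in text.splitlines():
        # Skip blank lines, the header, and conservation lines (leading spaces).
        if not line.strip() or line.startswith("CLUSTAL") or line[0].isspace():
            continue
        parts = line.split()
        name, fragment = parts[0], parts[1]
        seqs[name] = seqs.get(name, "") + fragment
    return seqs


def to_pir(code: str, aligned_seq: str, kind: str = "sequence",
           start="", start_chain: str = "", end="", end_chain: str = "",
           width: int = 60) -> str:
    """Format one aligned sequence as a PIR-style record (assumed layout)."""
    # Ten fields: type, code, first residue, first chain, last residue,
    # last chain, then four descriptive fields left empty here.
    fields = [kind, code, str(start), start_chain, str(end), end_chain,
              "", "", "", ""]
    body = aligned_seq + "*"  # "*" terminates the sequence
    wrapped = "\n".join(body[i:i + width] for i in range(0, len(body), width))
    return f">P1;{code}\n{':'.join(fields)}\n{wrapped}\n"


seqs = parse_clustal(CLUSTAL_TEXT)
print(to_pir("target", seqs["target"]))
# The residue range and chain below are placeholders, not values from the slides.
print(to_pir("template", seqs["template"], kind="structureX",
             start=2, start_chain="A", end=17, end_chain="A"))
```

The residue numbers given for a template must match the numbering in its cleaned PDB file, which is one reason the flowchart lists "Clean PDB files" before the alignment step.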
The Template Structure
PAX6
5822580|pdb|6PAX|A Chain A, Crystal Structure Of The Human Pax-6 Paired Domain-Dna Complex Reveals A General Model For Pax Protein-Dna Interactions
Length = 133
Score = 198 bits (503), Expect = 3e-52
Identities = 92/123 (74%), Positives = 107/123 (86%)

Query: 10  HGGLNQLGGAFVNGRPLPEVVRQRIVDLAHQGVRPCDISRQLRVSHGCVSKILGRYYETG 69
           H G+NQLGG FVNGRPLP+  RQRIV+LAH G RPCDISR L+VS+GCVSKILGRYY TG
Sbjct: 2   HSGVNQLGGVFVNGRPLPDSTRQRIVELAHSGARPCDISRILQVSNGCVSKILGRYYATG 61

Query: 70  SIRPGVIGGSKPKVATPKVVEKIGDYKRQNPTMFAWEIRDRLLAEGVCDNDTVPSVSSIN 129
           SIRP  IGGSKP+VATP+VV KI  YK++ P++FAWEIRDRLL+EGVC ND +PSVSSIN
Sbjct: 62  SIRPRAIGGSKPRVATPEVVSKIAQYKQECPSIFAWEIRDRLLSEGVCTNDNIPSVSSIN 121

Query: 130 RII 132
           R++
Sbjct: 122 RVL 124
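The "Identities = 92/123 (74%)" figure in the BLAST report can be checked directly from the two aligned rows. The sketch below is illustrative only: the function name percent_identity is hypothetical, the sequences are the Query (PAX8) and Sbjct (6PAX chain A) rows from the report, and the percentage is truncated rather than rounded because that is what matches the reported 74% (92/123 is about 74.8%).

```python
from typing import Tuple


def percent_identity(query: str, subject: str) -> Tuple[int, int, int]:
    """Return (identities, alignment length, truncated percent identity)."""
    if len(query) != len(subject):
        raise ValueError("aligned sequences must have the same length")
    # Count columns where both rows carry the same residue (gaps never count).
    identities = sum(1 for q, s in zip(query, subject) if q == s and q != "-")
    length = len(query)
    return identities, length, 100 * identities // length


# Query row: PAX8 residues 10-132 as shown in the BLAST report.
QUERY = (
    "HGGLNQLGGAFVNGRPLPEVVRQRIVDLAHQGVRPCDISRQLRVSHGCVSKILGRYYETG"
    "SIRPGVIGGSKPKVATPKVVEKIGDYKRQNPTMFAWEIRDRLLAEGVCDNDTVPSVSSIN"
    "RII"
)
# Sbjct row: 6PAX chain A residues 2-124 as shown in the BLAST report.
SBJCT = (
    "HSGVNQLGGVFVNGRPLPDSTRQRIVELAHSGARPCDISRILQVSNGCVSKILGRYYATG"
    "SIRPRAIGGSKPRVATPEVVSKIAQYKQECPSIFAWEIRDRLLSEGVCTNDNIPSVSSIN"
    "RVL"
)

print(percent_identity(QUERY, SBJCT))  # (92, 123, 74)
```

The "Positives" count additionally needs a substitution matrix to decide which mismatches score above zero, so it is not reproduced here.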
Target-Template Alignment
DNA Contacts
MODEL
Hypothetical DNA fit of the model
Validation
1. Swiss Model (http://swissmodel.expasy.org)
WhatCheck report generated for the SWISS-MODEL request:
► No errors in amino acid nomenclature
► Improper dihedral angle distribution OK: the RMS Z-score for all improper dihedrals in the structure is within normal range.
► Normal bond angle variability.
► A few residues had abnormal backbone torsion angles.
► A few pairs of atoms had abnormally short interatomic distances.
Overall, the model conforms to the common refinement constraints.
Ramachandran plot
(http://raven.bioc.cam.ac.uk/rampage.php)
Residue [ 43 :ARG] ( 68.15, 44.04) in Allowed region
Residue [ 73 :LYS] (-118.37, -75.31) in Allowed region
Number of residues in favoured region (~98.0% expected) : 119 ( 98.3%)
Number of residues in allowed region ( ~2.0% expected) : 2 ( 1.7%)
Number of residues in outlier region : 0 ( 0.0%)
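The percentages in the RAMPAGE summary follow from the three residue counts. As a small sketch (the function name ramachandran_summary is hypothetical; the counts 119, 2 and 0 are the ones reported above), the arithmetic is:

```python
def ramachandran_summary(favoured: int, allowed: int, outlier: int) -> dict:
    """Return the percentage of residues in each Ramachandran region."""
    total = favoured + allowed + outlier
    if total == 0:
        raise ValueError("no residues to summarise")
    counts = {"favoured": favoured, "allowed": allowed, "outlier": outlier}
    # Percentages are rounded to one decimal place, as in the report.
    return {region: round(100 * n / total, 1) for region, n in counts.items()}


summary = ramachandran_summary(favoured=119, allowed=2, outlier=0)
print(summary)  # {'favoured': 98.3, 'allowed': 1.7, 'outlier': 0.0}
```

With 121 residues evaluated, the model sits close to the expected split of roughly 98% favoured and 2% allowed, with no outliers.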
Main Chain-Side Chain Contacts
Source: MolProbity, an interactive macromolecular structure validation tool provided by the Richardson laboratory, Duke University.
Limitations
► Could not model the entire protein due to the lack of homologous structures and an extensive loop region, which is tough to model.
► The paired box region may undergo some structural changes in the presence of the partial homeodomain (cooperativity in DNA binding).
► The DNA contacts made by the model may differ from those of the template due to the presence of other, non-identical residues.
References
Simon C. Lovell, Ian W. Davis, W. Bryan Arendall III, Paul I. W. de Bakker, J. Michael Word, Michael G. Prisant, Jane S. Richardson, David C. Richardson (2003). Structure validation by C-alpha geometry: phi, psi, and C-beta deviation. Proteins: Structure, Function, and Genetics. 50: 437-450.

How is Genetic Information Passed on from the.pptxHow is Genetic Information Passed on from the.pptx
How is Genetic Information Passed on from the.pptx
 
Proteomics a search tool for vaccines
Proteomics a search tool for vaccinesProteomics a search tool for vaccines
Proteomics a search tool for vaccines
 
HSPs.ppt
HSPs.pptHSPs.ppt
HSPs.ppt
 
Biological databases
Biological databasesBiological databases
Biological databases
 
PhyloGenes Webinar Spring 2020
PhyloGenes Webinar Spring 2020PhyloGenes Webinar Spring 2020
PhyloGenes Webinar Spring 2020
 
Nuclear Transport And Its Effect On Breast Cancer Tumor Cells
Nuclear Transport And Its Effect On Breast Cancer Tumor CellsNuclear Transport And Its Effect On Breast Cancer Tumor Cells
Nuclear Transport And Its Effect On Breast Cancer Tumor Cells
 

pax8b

  • 1. Homology Modeling of the human PAX8 Protein and mechanisms for sequence-specific DNA recognition. Abhishek Dabral, School of Biology, Georgia Institute of Technology
  • 2. What is PAX? The PAX gene family encodes a group of transcription factors that have been conserved through millions of years of evolution and play roles in early development. Pax proteins are transcriptional regulators that have critical roles in mammalian development; mutations of PAX genes cause profound developmental defects.
  • 3.
  • 4. PAX Organization ► All PAX proteins have a paired domain (PD), which spans 128 amino acids near the N-terminus and consists of two helix-turn-helix (HTH) motifs. ► Sequence conservation among PAX proteins is highest in the paired domain but can also extend to a paired-type homeodomain (HD) and to a stretch of residues between the paired domain and the homeodomain called the octapeptide (OP).
  • 5. PAX Structure ► The PD is composed of amino- and carboxy-terminal subdomains, each of which is made up of 3 alpha helices resembling the HTH (helix-turn-helix) motif found in all HDs. ► The third helix of PD and HD proteins interacts with the major groove of the DNA. ► A PD need not adopt a fixed structure unless it is bound to DNA, which lends it great diversity as a protein.
  • 6. PAX Subgroups In mammals, 9 PAX genes have been identified. PAX genes are divided into 4 subgroups based on: ► Genomic Structure ► Sequence Similarity ► Conserved Function
  • 7. The PAX Families PAX 8 is the only member of the family expressed in thyroid tissue. PAX 8 cooperates with TTF1 (Thyroid Transcription Factor 1) to influence thyroid-specific gene regulation. Pax8 is extremely important for the correct development of the thyroid gland because inactivation of the Pax8 gene causes absence of follicular cells, and therefore absence of thyroid hormone. PAX 8 is co-expressed with the Wilms' tumor gene (WT1) during kidney development, suggesting a possible interaction.
  • 8. Splice Variants in PAX 8 Alternative splicing in the PAX gene by inclusion or exclusion of exons 7 and/or 8 has produced several known products, but the biological significance of the variants is unknown. The human PAX8 gene generates at least five different alternatively spliced transcripts encoding different PAX8 isoforms.
  • 9. Splice Variants in PAX 8 [Figure: nucleotide alignment of the five PAX8 splice-variant mRNAs (pax8A_mRNA to pax8E_mRNA), with gaps shown as dots. The rows are identical at the 5' end and first diverge after alignment position 941, where pax8E begins a gap; the variants differ by internal gaps further downstream. Below the alignment is a transcript schematic (5' UTR, ATG, exons 1 to 11, UAG, siRNA) annotated with the encoded protein sequence. Legend: paired domain; octapeptide; partial homeodomain; activation domain; repression domain; intron with de novo CpG island; translocation breakpoints with PPAR-gamma; intron/exon boundaries.]
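As a rough illustration of how the skipped segments in such an alignment could be located, the sketch below reports the gap runs in one aligned row. The row contents and the names `toy_rows` and `gap_runs` are hypothetical and are not taken from the PAX8 alignment; only the dot gap character follows the slide.

```python
import re

def gap_runs(row, gap="."):
    """Return 1-based inclusive (start, end) alignment columns of each gap run."""
    pattern = re.escape(gap) + "+"
    return [(m.start() + 1, m.end()) for m in re.finditer(pattern, row)]

# Hypothetical toy rows: variantB lacks a segment present in variantA.
toy_rows = {
    "variantA": "ATGGCCAAGGTTCTGA",
    "variantB": "ATGGCC......CTGA",
}

for name, row in toy_rows.items():
    print(name, gap_runs(row))
```

Comparing the reported column ranges with known exon boundaries would then suggest which exon each variant omits.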
  • 10. What questions could a PAX 8 model answer? Better understanding of: Paired Domain-DNA interaction: ► Biological function of PD. Function of N and C subdomains: ► Specific DNA contacts made by them ► Do they cooperate with each other, does one affect the function of the other, and how? Effects of mutations: ► Relation to the abnormal phenotype
  • 11. Why Homology modeling? ► No solved X-ray structure for our target protein, i.e. PAX 8. Moreover: ► X-ray structure determination is both time consuming and expensive ► Only a small number of proteins can be made to form crystals, and a crystal is not the protein's native state.
  • 12. Why Homology modeling? ► No solved NMR structure for our target protein, i.e. PAX 8. Moreover: ► NMR does not work too well for protein complexes. ► Very time consuming
  • 13. Steps for Homology Modeling ► Obtain Target Sequence ► Get Information about Target Protein ► Template Selection (Crystal Structures) ► Clean PDB files ► Create alignment of target with template sequences (Convert aln to ali) ► Initial Model ► Validate Model. Toolbox: Sequence Database (Genbank); Blastp, CDD; BLAST PDB database; CLUSTALW; MODELLER; WHAT IF, PROCHECK, 3D JIGSAW, Esypred, SWISS Model, FUGUE, RAMPAGE
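The "Convert aln to ali" step can be sketched as a small conversion from CLUSTAL text to PIR-style records of the kind MODELLER reads. This is a minimal sketch, not the script used for the slides. The labels `PAX8` and `6PAX`, the short sequence fragments, and the function names are illustrative. The ten-field header line is left mostly empty here, which is an assumption; a real template entry would need its residue range and chain filled in.

```python
def parse_clustal(text):
    """Join the interleaved blocks of a CLUSTAL alignment into one string per sequence."""
    seqs = {}
    for line in text.splitlines():
        # Skip blank lines, the CLUSTAL header, and conservation lines
        # (conservation lines start with whitespace).
        if not line.strip() or line.startswith("CLUSTAL") or line[0].isspace():
            continue
        name, chunk = line.split()[:2]
        seqs[name] = seqs.get(name, "") + chunk
    return seqs

def to_pir(seqs, templates=()):
    """Format sequences as PIR records; names listed in `templates` are marked structureX."""
    records = []
    for name, seq in seqs.items():
        kind = "structureX" if name in templates else "sequence"
        # Ten colon-separated fields; everything after the code is left
        # empty in this sketch (assumption, see lead-in).
        fields = [kind, name] + [""] * 8
        records.append(f">P1;{name}\n{':'.join(fields)}\n{seq}*\n")
    return "\n".join(records)

# Illustrative two-block alignment fragment.
ALN = """\
CLUSTAL W (1.83) multiple sequence alignment

PAX8            HGGLNQLGGA
6PAX            HSGVNQLGGV
                *.*:*****.

PAX8            FVNGRPLPEV
6PAX            FVNGRPLPDS
                ********:
"""

if __name__ == "__main__":
    print(to_pir(parse_clustal(ALN), templates={"6PAX"}))
```

Gap characters pass through unchanged, since both formats use `-` for gaps.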
  • 14. The Template Structure PAX6 5822580|pdb|6PAX|A Chain A, Crystal Structure Of The Human Pax-6 Paired Domain-Dna Complex Reveals A General Model For Pax Protein-Dna Interactions. Length = 133. Score = 198 bits (503), Expect = 3e-52, Identities = 92/123 (74%), Positives = 107/123 (86%)
      Query: 10  HGGLNQLGGAFVNGRPLPEVVRQRIVDLAHQGVRPCDISRQLRVSHGCVSKILGRYYETG 69
                 H G+NQLGG FVNGRPLP+  RQRIV+LAH G RPCDISR L+VS+GCVSKILGRYY TG
      Sbjct: 2   HSGVNQLGGVFVNGRPLPDSTRQRIVELAHSGARPCDISRILQVSNGCVSKILGRYYATG 61
      Query: 70  SIRPGVIGGSKPKVATPKVVEKIGDYKRQNPTMFAWEIRDRLLAEGVCDNDTVPSVSSIN 129
                 SIRP  IGGSKP+VATP+VV KI  YK++ P++FAWEIRDRLL+EGVC ND +PSVSSIN
      Sbjct: 62  SIRPRAIGGSKPRVATPEVVSKIAQYKQECPSIFAWEIRDRLLSEGVCTNDNIPSVSSIN 121
      Query: 130 RII 132
                 R++
      Sbjct: 122 RVL 124
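The identity figure in the BLAST summary can be checked by counting matching columns in the aligned blocks. The sketch below transcribes the query (PAX8) and subject (6PAX chain A) rows from the alignment above; the names `BLOCKS` and `count_identities` are illustrative. It does not compute the "Positives" value, which would need a substitution matrix.

```python
# Aligned (query, subject) blocks transcribed from the BLAST output above.
BLOCKS = [
    ("HGGLNQLGGAFVNGRPLPEVVRQRIVDLAHQGVRPCDISRQLRVSHGCVSKILGRYYETG",
     "HSGVNQLGGVFVNGRPLPDSTRQRIVELAHSGARPCDISRILQVSNGCVSKILGRYYATG"),
    ("SIRPGVIGGSKPKVATPKVVEKIGDYKRQNPTMFAWEIRDRLLAEGVCDNDTVPSVSSIN",
     "SIRPRAIGGSKPRVATPEVVSKIAQYKQECPSIFAWEIRDRLLSEGVCTNDNIPSVSSIN"),
    ("RII", "RVL"),
]

def count_identities(query, subject):
    """Return (identical columns, alignment length) for two equal-length aligned rows."""
    if len(query) != len(subject):
        raise ValueError("aligned rows must have the same length")
    matches = sum(q == s and q != "-" for q, s in zip(query, subject))
    return matches, len(query)

query = "".join(q for q, _ in BLOCKS)
subject = "".join(s for _, s in BLOCKS)
matches, length = count_identities(query, subject)

# Integer division reproduces the whole-number percentage shown on the slide.
print(f"Identities = {matches}/{length} ({100 * matches // length}%)")
```

The count agrees with the reported 92/123, which supports the transcription of the alignment rows.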
  • 18. Hypothetical DNA fit of the model
  • 19. Validation 1. Swiss Model (http://swissmodel.expasy.org) WhatCheck Report generated for your SWISS MODEL request: ► No errors in amino acid nomenclature ► Improper dihedral angle distribution OK: the RMS Z-score for all improper dihedrals in the structure is within normal range. ► Normal bond angle variability. ► A few residues had abnormal backbone torsion angles. ► A few pairs of atoms had abnormally short interatomic distances. Overall the model conforms to the common refinement constraints.
  • 20. Ramachandran plot (http://raven.bioc.cam.ac.uk/rampage.php) Residue [ 43 :ARG] ( 68.15, 44.04) in Allowed region. Residue [ 73 :LYS] (-118.37, -75.31) in Allowed region. Number of residues in favoured region (~98.0% expected): 119 (98.3%). Number of residues in allowed region (~2.0% expected): 2 (1.7%). Number of residues in outlier region: 0 (0.0%)
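The percentages in the Ramachandran summary follow directly from the residue counts. A minimal sketch of the arithmetic, with illustrative variable names:

```python
# Residue counts per Ramachandran region, as reported on the slide.
counts = {"favoured": 119, "allowed": 2, "outlier": 0}

total = sum(counts.values())
percent = {region: round(100 * n / total, 1) for region, n in counts.items()}

for region, n in counts.items():
    print(f"{region:>8}: {n:3d} ({percent[region]:.1f}%)")
```

The total of 121 evaluated residues is an inference from the three counts; the slide does not state it directly.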
  • 21. Main Chain-Side Chain Contacts Source: MolProbity, an interactive macromolecular structure validation tool provided by the Richardson laboratory, Duke University.
  • 22. Limitations ► Could not model the entire protein due to the lack of homologous structures and an extensive loop region, which is tough to model. ► The paired box region may undergo some structural changes in the presence of the partial homeodomain (cooperativity in DNA binding). ► The DNA contacts made by the model may differ from the template due to the presence of other non-identical residues.
  • 24. 2.
  • 25. 3.
  • 26. 4.
  • 27. 5.
  • 28. 6. Simon C. Lovell, Ian W. Davis, W. Bryan Arendall III, Paul I. W. de Bakker, J. Michael Word, Michael G. Prisant, Jane S. Richardson, David C. Richardson (2003) Structure validation by C-alpha geometry: phi, psi, and C-beta deviation. Proteins: Structure, Function, and Genetics. 50: 437-450. 7.
  • 29. 8.