50% social chromosomes in ants, 50% bioinformatics for genomics in emerging model organisms. Given at #CTBio http://pathogenomics.bham.ac.uk/blog/2013/07/cream-teas-and-bioinformatics-balti-and-bioinformatics-goes-on-its-holidays/
Apologies - videos and transitions are largely missing as part of the PDF conversion.
The work referenced here includes:
http://dx.doi.org/10.1038/nature11832
http://dx.doi.org/10.1073/pnas.1009690108
http://sequenceserver.com
https://github.com/yeban/afra & http://afra.sbcs.qmul.ac.uk
https://github.com/monicadragan/GeneValidator
26. Allozyme screen: social form associated with the Gp-9 locus
[Figure: frequency of the most common allele (axis 0.3 to 1.0) per locus, for single-queen vs multiple-queen colonies. Loci: Est-6, Est-4, G3pdh-1, Ca-4, Pgm-4, Ddh-1, Pro-5, Pgm-3, Acoh-5, acoh-1, Acy-1, Pgm-1, Aat-2, Gp-9.]
Ken Ross and colleagues
Laurent Keller and colleagues
27. Single queen form Multiple queen form
Ken Ross and colleagues
Laurent Keller and colleagues
Social form completely associated to Gp-9 locus
28. bbbbBB BB Bb bb
Ken Ross and colleagues
Laurent Keller and colleagues
Single queen form Multiple queen form
Social form completely associated to Gp-9 locus
29. bbBB BB Bb
x
Gp-9 bb females rare
Ken Ross and colleagues
Laurent Keller and colleagues
Single queen form Multiple queen form
Social form completely associated to Gp-9 locus
30. BB BB Bb
Ken Ross and colleagues
Laurent Keller and colleagues
Single queen form Multiple queen form
Social form completely associated to Gp-9 locus
31. BB BB Bb
x
Ken Ross and colleagues
Laurent Keller and colleagues
Single queen form Multiple queen form
Social form completely associated to Gp-9 locus
32. BB BB Bb
x x
Ken Ross and colleagues
Laurent Keller and colleagues
Social form completely associated to Gp-9 locus
Single queen form Multiple queen form
33. BB BB Bb
x x x
Ken Ross and colleagues
Laurent Keller and colleagues
Single queen form Multiple queen form
(>15% )(< 5% )
Social form completely associated to Gp-9 locus
34. •Is this gene the single überregulator?
Social form completely associated to Gp-9 locus
35. •Is this gene the single überregulator?
Social form completely associated to Gp-9 locus
•Only 14 allozyme markers were used ⟹ Gp-9 marks maybe 1/14th of the genome?
[Figure repeated from slide 26: frequency of the most common allele at each allozyme locus, single-queen vs multiple-queen colonies.]
38. Are other genes linked to Gp-9?
Sequenced:
•a Gp-9 B ♂ genome
39. Sequencing from haploid males (for easier assembly):
Single ♂: 45× (330bp-insert paired reads)
His brothers: 11× (normal single-end reads) + 4× (8,000 & 20,000bp-insert paired reads)
B 20x
The genome of a Gp-9 B ♂ fire ant
Assembly approach:
1. Assemble short Illumina reads with SOAPdenovo → N50: 3,600 bp
2. Chop assembly into “fake 454 reads” (300 bp)
3. Assemble fake + real 454 reads with Newbler → N50: 720,000 bp
10,000 scaffolds (100 biggest scaffolds: 50% of genome)
Total: 350,000,000 bp assembled. The rest: repeats
Wurm et al 2011
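The N50 figures and the "fake read" step above can be illustrated with a small sketch. This is not the pipeline used for the fire ant genome: the function names, the toy lengths and the `step` parameter are assumptions made for illustration (the slide gives only the 300 bp read length, not the spacing between fake reads).

```python
def n50(lengths):
    """Return the N50: the length L such that sequences of length >= L
    together contain at least half of all assembled bases."""
    if not lengths:
        raise ValueError("need at least one sequence length")
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if running * 2 >= total:
            return length


def chop(sequence, read_length=300, step=None):
    """Cut one contig into fixed-length pseudo-reads.

    `step` (distance between read starts) is a hypothetical parameter;
    by default the reads do not overlap.
    """
    if step is None:
        step = read_length
    if len(sequence) <= read_length:
        return [sequence]
    last_start = len(sequence) - read_length
    starts = list(range(0, last_start + 1, step))
    # Add one final read so the end of the contig is always covered.
    if starts[-1] != last_start:
        starts.append(last_start)
    return [sequence[s:s + read_length] for s in starts]


if __name__ == "__main__":
    toy_lengths = [100, 200, 300, 400, 500]  # made-up contig lengths
    print("N50:", n50(toy_lengths))
    contig = "ACGT" * 250  # a 1,000 bp toy contig
    reads = chop(contig, read_length=300, step=150)
    print(len(reads), "fake reads of", len(reads[0]), "bp")
```

Overlapping fake reads (a `step` smaller than `read_length`) give the second assembler more opportunities to join them back together, which is why the sketch exposes that choice.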
41. ★ Expansion of lipid-processing gene families (for Cuticular Hydrocarbons)
★ 420 putative olfactory receptors (more than any other insect!)
★ Functional DNA-methylation system
The genome of the fire ant
Some findings:
Wurm et al 2011
[Figure: phylogenetic tree of S. invicta olfactory receptors, with tip labels of the form SiOR01968+4, SiOR02694+35, SiOR04609+20; scale bar 0.05.]
42. ★ Expansion of lipid-processing gene families (for Cuticular Hydrocarbons)
★ 420 putative olfactory receptors (more than any other insect!)
★ Functional DNA-methylation system
★ Ant-specific duplication and subfunctionalisation of vitellogenin (in bees: involved in reproduction & division of labor)
The genome of the fire ant
Some findings:
Wurm et al 2011
[...] significance of these duplication events in vitellogenins, odor perception genes, and a family of lipid-processing genes. We also discuss additional features of interest in the fire ant genome relevant to the complex social biology of this species, including sex determination genes, DNA methylation genes, telomerase, and the insulin and juvenile hormone pathways.

Vitellogenins. In contrast to other insects that mainly have only one or two vitellogenins, the fire ant genome harbors four adjacent [...] regulation of life span (27, 28) and division of labor (29). Quantitative RT-PCR shows that Vg1 and Vg4 are preferentially expressed in workers and Vg2 and Vg3 in queens (Fig. 3C, SI Materials and Methods, and Table S1G). Vitellogenin expression in S. invicta workers is surprising because they lack ovaries. Given the superorganism properties of ant societies, the expression patterns suggest that vitellogenins underwent neo- or subfunctionalization after duplication to acquire caste-specific functions.

Odor Perception. Consistent with studies in other insects, we find a single S. invicta ortholog to DmOr83b, a broadly expressed olfactory receptor (OR) required to interact with other ORs for Drosophila and Tribolium castaneum olfaction (30–32). Beyond OR83b, OR number varies greatly between insect species. Blast searches and GeneWise searches using an HMM profile constructed with aligned ORs from N. vitripennis (33) and Pogonomyrmex barbatus identified more than 400 loci in the S. invicta genome with significant sequence similarity to ORs. Preliminary work on gene model reconstruction identified 297 intact full-length proteins. Many S. invicta ORs are in tandem arrays (Fig. S2A) and derive from recent expansions. S. invicta may thus harbor the largest identified insect OR repertoire because there are 10 ORs in Pediculus humanus (34), 60 in Drosophila, 165 in A. mellifera, 225 in N. vitripennis (33), and 259 in T. castaneum (32). The large numbers of N. vitripennis and T. castaneum ORs are thought to be due to current or past difficulties in host and food finding. As has been suggested for A. mellifera (35), the large number of S. invicta ORs may result from the importance of chemical communication in ants. The odorant-binding proteins (OBPs) are another family of genes also known to play roles in chemosensation in Drosophila (36). Intriguingly, the social organization of S. invicta colonies is completely associated with se[...]

Fig. 2. Taxonomic distribution of best blastp hits of S. invicta proteins to the nonredundant (nr) protein database (E < 10^-5). Results were first plotted using MEGAN software (22) and then branches with fewer than 20 hits were removed, branch lengths were reduced for compactness, and tree topology was adjusted to reflect consensus phylogenies (23, 24).
Hit counts visible in the figure: No hits 3424; Not assigned 274; Cnidaria 100; Nematoda 25; Deuterostomia 173; Arachnida 50; Paraneoptera 577; Diptera 404; Lepidoptera 29.

[Figure 3: (A) Vg1, Vg4, Vg3 and Vg2 side by side, with position marks at 2,330,000 bp and 2,360,000 bp; (B) vitellogenin tree with Solenopsis Vg1 to Vg4, Apis Vg, Bombus Vg, Nasonia Vg1 and Vg2, Pteromalus Vg, Encarsia Vg, Pimpla Vg and Athalia Vg; (C) expression of Vg1 to Vg4 in workers (W) vs queens (Q).]
43. Social form completely associated to Gp-9 locus
BB BB Bb
x x x
Single queen form Multiple queen form
(>15% )(< 5% )
Are other genes linked to Gp-9?
44. Are other genes linked to Gp-9?
Sequenced:
•a Gp-9 B ♂ genome
•a Gp-9 b ♂ genome
“Next Generation Genotyping.”
RAD sequencing
48. RAD sequencing of haploid ♂ for SNP discovery & genotyping
38 Gp-9 B males and 38 Gp-9 b males
[Figure: each male shown with a short sequence label: AACTG, GGCCT, AAGGT for Gp-9 B males; CCAGT, TAAAT, GGAAT for Gp-9 b males.]
49. RADseq: sequencing the same 0.01% of the genome in many individuals
Identify polymorphism ➔ individual x locus genotype table:
      A  B  C  D  E  F
L1    A  C  A  A  C  C
L2    G  G  T  -  T  G
L3    -  A  G  A  -  G
L4    C  -  -  G  G  C
L5    T  T  C  T  C  -
L6    G  A  A  -  -  G
2,419 loci
38 B♂ & 38 b♂
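As a rough illustration of what "identify polymorphism" means for a table like this, the sketch below summarises each locus of the six-locus toy table from the slide. The function names and the `max_missing` threshold are assumptions made for illustration; the filters actually applied to the 2,419 loci are not stated in the talk.

```python
# Toy genotype table from the slide: one string per locus, one character per
# haploid individual (A to F); "-" marks a locus not recovered in that male.
INDIVIDUALS = ["A", "B", "C", "D", "E", "F"]
GENOTYPES = {
    "L1": "ACAACC",
    "L2": "GGT-TG",
    "L3": "-AGA-G",
    "L4": "C--GGC",
    "L5": "TTCTC-",
    "L6": "GAA--G",
}


def summarise_locus(calls, missing="-"):
    """Return the alleles seen at a locus and how much data is missing."""
    observed = [call for call in calls if call != missing]
    alleles = sorted(set(observed))
    return {
        "alleles": alleles,
        "polymorphic": len(alleles) > 1,
        "missing_fraction": (len(calls) - len(observed)) / len(calls),
    }


def usable_loci(table, max_missing=0.25):
    """Names of polymorphic loci with at most `max_missing` missing calls.

    `max_missing` is a hypothetical cutoff, not a value from the talk.
    """
    keep = []
    for locus, calls in table.items():
        summary = summarise_locus(calls)
        if summary["polymorphic"] and summary["missing_fraction"] <= max_missing:
            keep.append(locus)
    return keep


if __name__ == "__main__":
    print("individuals:", " ".join(INDIVIDUALS))
    for locus, calls in GENOTYPES.items():
        print(locus, summarise_locus(calls))
    print("usable loci:", usable_loci(GENOTYPES))
```

Because the males are haploid, each call is a single base, so no heterozygote handling is needed in this sketch.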
PCA: Principal Component Analysis
[Figure: amount of variance explained per principal component (y axis: % variance explained, 0 to 30; x axis: components 1 to 19, then 20+). Values in reading order: 12.7%, 6.1%, 5.4%, 4.8%, 4.7%, 3.9%, 3.5%, 3.2%, 3.1%, 2.9%, 2.8%, 2.6%, 2.4%, 2.3%, 2.2%, 2.0%, 1.9%, 1.7%, 1.6%, and 30.2% for 20+.]
55. [Figure: genetic map of the Gp-9 region with markers E3, E17 and A22, shown for SB (Gp-9 B male) and Sb (Gp-9 b male).]
Why non-recombining? Structural differences
using FISH: Fluorescence in situ Hybridization
John Wang @Taipei
56. X X X Y
Maybe several
rearrangements
Predictions:
•genes in S are responsible for phenotype?
SB SB: single queen colony
SB Sb: multiple queen colony
57. Most BB vs Bb gene expression differences map to S
Non-recombining region of S contains 800 genes
Gene Expression Patterns for a Social Trait
Gene expression: Gp-9 Bb vs BB workers in multiple queen colonies
20 of 29 significant genes are in the SB/Sb region (p < 10^-10)
Similar for BB vs Bb queens; & for B vs b males.
Wang et al 2008
58. Predictions:
•genes in S are responsible for phenotype?
•Sb is degenerating?
probably!
⟹ directional (antagonistic?) selection?
X X X Y
Maybe several
rearrangements
SB SB: single queen colony
SB Sb: multiple queen colony
59. Is Sb degenerating?
Actually quite similar to SB:
•99.8% of non-gap sequences are identical
•(Almost) no SB or Sb-specific sequence
•genes seem to be intact in Sb
But clearly: relaxation of purifying selection
•Sb contains more small repeats
•Introns bigger in Sb than SB
60. Sb is degenerating:
repeats cause bad assembly
[Figure: scaffold length (bp, axis 0 to 6,000,000) by region, for the Gp-9B male and Gp-9b male genome assemblies. Regions: normally recombining regions from all 16 linkage groups (in each assembly); SB region without recombination in Gp-9 Bb queens; Sb region without recombination in Gp-9 Bb queens. Significance groups in plot order: [a]; [a], [b]; [a]; [c].]
[b] vs. [c]: p < 10^-4
[a] vs. [c]: p < 10^-7
61. Is Sb degenerating?
Actually quite similar to SB:
•99.8% of non-gap sequences are identical
•(Almost) no SB or Sb-specific sequence
•genes seem to be intact in Sb
But clearly: relaxation of purifying selection
•Sb contains more small repeats
•Introns bigger in Sb than SB
•Sb contains more big repeats ⟹ bad assembly
•dN/dS bigger in S than rest of the genome
Probably ♂ haploidy = strong purifying selection
⟹ slower degeneration
62. Age of the region based on dS
[Figure: two histograms of dS (x axis 0.00 to 0.20, y axis count): leafcutter dS (leafcutterDndsSubset$dS) and Gp-9-linked Solenopsis dS (subset(dndsdata, gp9linked == TRUE)$dS).]
Leafcutter common ancestor: 8,000,000-10,000,000 years ago
Maximum Likelihood Estimation of SB/Sb age: 280,000-425,000 years
⟹ little time for degeneration
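The idea of dating a region from dS can be shown with a deliberately crude strict-clock scaling. This is not the maximum likelihood estimate reported on the slide: it simply assumes that synonymous divergence accumulates at the same rate in both comparisons. The function name is made up, and the two dS inputs are placeholders chosen so the arithmetic is easy to follow, not values from the talk.

```python
def scaled_age(ds_target, ds_reference, reference_age_years):
    """Scale a calibrated divergence time by a ratio of synonymous divergences.

    Assumes a strict molecular clock shared by both comparisons, which is a
    simplification of the maximum likelihood approach named on the slide.
    """
    if ds_reference <= 0:
        raise ValueError("reference dS must be positive")
    if ds_target < 0:
        raise ValueError("target dS cannot be negative")
    return reference_age_years * ds_target / ds_reference


if __name__ == "__main__":
    # Placeholder inputs (hypothetical, for illustration only).
    ds_leafcutter = 0.10   # typical dS between two leafcutter species
    ds_sb_vs_sb = 0.004    # typical dS between SB and Sb alleles

    # Calibration range from the slide: 8 to 10 million years.
    for reference_age in (8_000_000, 10_000_000):
        age = scaled_age(ds_sb_vs_sb, ds_leafcutter, reference_age)
        print(f"calibration {reference_age:,} years -> about {age:,.0f} years")
```

The sketch makes the dependence explicit: the estimated age scales linearly with both the calibration date and the ratio of the two dS values, so uncertainty in either carries straight through.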
63. Summary
Solenopsis invicta queen number determined by Gp-9 genotypes:
•only BB workers ➔ single BB queen
•with Bb workers ➔ multiple Bb queens
Genome sequencing + RAD Genotyping
•Gp-9 marks ~4% of genome
•social chromosomes are like sex chromosomes: SB is like X; Sb is like Y
some relaxation of purifying selection
but haploid males ➔ strong purifying selection
Structural differences between SB and Sb ➔ no recombination
SB and Sb stopped recombining ~400,000 years ago.
Ants are cool.
65. 1. “Can you BLAST this for me?”
Antgenomes.org SequenceServer
BLAST made easy
(well, we’re trying...)
Web server:
Anurag Priyam & Git community - http://sequenceserver.com
BLAST on a 48-core, 512 GB fat machine via ssh
66. http://www.sequenceserver.com/
1. Installing (requires a BLAST+ install)
gem install sequenceserver
2. Configure.
# .sequenceserver.conf
bin: ~/ncbi-blast-2.2.25+/bin/
database: /Users/me/blast_databases/
Do you have BLAST-formatted databases? If not:
sequenceserver format-databases /path/to/fastas
3. Launch.
sequenceserver
### Launched SequenceServer at: http://0.0.0.0:4567
67. Gene prediction
Evidence
Dozens of software algorithms: dozens of predictions
[Legend: Intron, Exon, UTR]
Consensus making:
This fails 20% of the time:
•missing pieces
•extra pieces
•incorrect merging
•incorrect splitting
Visual inspection...
required.
70. Gene prediction
Evidence
Dozens of software algorithms: dozens of predictions
[Legend: Intron, Exon, UTR]
Consensus making:
This fails 20% of the time:
•missing pieces
•extra pieces
•incorrect merging
•incorrect splitting
Visual inspection required.
Manual fixing required.
1 gene: 20 minutes - 3 days
15,000 genes * 20 species = impossible.
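A back-of-the-envelope calculation shows why this is called impossible. The gene count, species count, 20% failure rate and 20-minute lower bound come from the slides; the 8-hour working day and 220 working days per year are assumptions added only to convert minutes into person-years.

```python
GENES_PER_SPECIES = 15_000      # from the slide
SPECIES = 20                    # from the slide
FAILURE_RATE = 0.20             # "This fails 20% of the time"
MIN_MINUTES_PER_GENE = 20       # lower bound from the slide

HOURS_PER_WORKDAY = 8           # assumption
WORKDAYS_PER_YEAR = 220         # assumption


def person_years(n_genes, minutes_per_gene):
    """Convert a number of genes and a per-gene effort into person-years."""
    hours = n_genes * minutes_per_gene / 60
    return hours / HOURS_PER_WORKDAY / WORKDAYS_PER_YEAR


if __name__ == "__main__":
    total_genes = GENES_PER_SPECIES * SPECIES
    failed_genes = round(total_genes * FAILURE_RATE)

    inspect_all = person_years(total_genes, MIN_MINUTES_PER_GENE)
    fix_failures = person_years(failed_genes, MIN_MINUTES_PER_GENE)

    print(f"{total_genes:,} gene models in total")
    print(f"inspecting all at 20 min each: {inspect_all:.1f} person-years")
    print(f"fixing only the {failed_genes:,} failures: {fix_failures:.1f} person-years")
```

Even this best case, where every gene takes only 20 minutes, comes out at decades of work for one curator; the 3-day upper bound per gene would be far worse.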
71. 3. Crowd-sourcing gene model fixing
https://github.com/yeban/afra
http://afra.sbcs.qmul.ac.uk
Anurag Priyam
Attract:
•volunteers:
•Prestige
•Line on CV
•students (coursework & cash)
•“gold farmers”
[Screenshot: curation workflow with Begin, Curate, Being curated and Submit steps]
WebApollo + Facebook + Badges + Points + Redundancy
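One way to read "Redundancy" is that the same gene is given to several curators and a model is accepted only when enough of them agree. The sketch below is a hypothetical illustration of that idea, not a description of how Afra works: the function name, the representation of a gene model as a tuple of exon coordinates, and the `min_submissions` threshold are all assumptions.

```python
from collections import Counter


def consensus(submissions, min_submissions=3):
    """Return the gene model a strict majority of curators agree on, else None.

    Each submission is a hashable gene model, here a tuple of
    (exon_start, exon_end) pairs. `min_submissions` is a hypothetical
    threshold below which no decision is made.
    """
    if len(submissions) < min_submissions:
        return None
    model, votes = Counter(submissions).most_common(1)[0]
    # Require more than half of all submissions to match exactly.
    if votes * 2 > len(submissions):
        return model
    return None


if __name__ == "__main__":
    model_a = ((100, 250), (400, 620))
    model_b = ((100, 250), (400, 650))  # differs in the last exon end

    print(consensus([model_a, model_b, model_a]))   # two of three agree
    print(consensus([model_a, model_b]))            # too few submissions
```

Requiring exact agreement is strict; a real system might instead compare models exon by exon, but that choice is outside what the slide specifies.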
72. Ecology & Evolution & Vital-IT
@ Lausanne
Evolve & Psych @ Queen Mary
LAURENT KELLER
JOHN WANG
DEWAYNE SHOEMAKER
OKSANA RIBA-GROGNUZ
MINGKWAN NIPITWATTANAPHON
y.wurm@qmul.ac.uk
M Corona, S Nygaard, BG Hunt, KK Ingram, L
Falquet, M Nipitwattanaphon, D Gotzek, MB Dijkstra,
J Oettler, F Comtesse, CJ Shih, WJ Wu, CC Yang, J
Thomas, E Beaudoing, S Pradervand, V Flegel, ED
Cook, R Fabbretti, H Stockinger, L Long, WG
Farmerie, J Oakey, JJ Boomsma, P Pamilo, SV Yi, J
Heinze, MAD Goodisman, L Farinelli, K Harshman, N
Hulo, L Cerutti, Ioannis Xenarios
Anurag Priyam &
Monica Dragan