SNP discovery in African taurine and Zebu cattle by whole genome sequencing of pooled DNA


Poster by Noyes HA, Agaba M, Anderson SI, Archibald AL, Ashelford K, Bradley D, Brass A, Finlayson HA, Hanotte O, Kay S, Kemp SJ, Khodadadi M, Law AS, Lu Z, Smith S, Talbot R, and Hall N, for the BecA Opening, Nairobi, 5 November 2010.

Noyes HA (1), Agaba M (2), Anderson SI (3), Archibald AL (3), Ashelford K (1), Bradley D (4), Brass A (6), Finlayson HA (3), Hanotte O (5), Kay S (1), Kemp SJ (1,2), Khodadadi M (6), Law AS (3), Lu Z (3), Smith S (3), Talbot R (3), Hall N (1)

1 University of Liverpool, Liverpool, UK
2 International Livestock Research Institute, Nairobi, Kenya
3 Roslin Institute, University of Edinburgh, Edinburgh, UK
4 Trinity College Dublin, Ireland
5 University of Nottingham, Nottingham, UK
6 University of Manchester, Manchester, UK

Acknowledgements: These studies were funded by the Wellcome Trust.

N’Dama (B. taurus) and Boran (B. indicus) differ in response to trypanosome infection

[Photographs: N’Dama; Boran]

Tsetse flies transmit trypanosomiasis and effectively exclude cattle from large areas of Central Africa. Zebu breeds such as Boran are preferred by farmers but are generally highly susceptible to trypanosomiasis, as are European taurine breeds. African taurine breeds such as N’Dama can survive and grow under moderate tsetse challenge despite carrying parasites in the bloodstream, a phenomenon known as trypanotolerance. Consequently, trypanotolerant breeds still predominate in areas of West Africa that are under significant tsetse challenge.

Origins of African Cattle

Taurine cattle (Bos taurus) were domesticated in the Middle East, and possibly in Africa as well, 6,000–10,000 years ago. These animals spread throughout the continent and adapted to endemic diseases such as trypanosomiasis as they went. Indicine cattle (Bos indicus, zebu) were domesticated in India from aurochsen that may have diverged from the aurochsen of the Middle East as much as 200–500 kya. They were introduced into Africa and are less well adapted to trypanosomiasis, but they are more manageable and preferred by farmers.
Introgression of Bos indicus alleles into Africa

[Figure: map] Estimated Bos indicus (dark) and Bos taurus (light) proportions in African and neighbouring cattle populations, from microsatellite data. Zebu (Bos indicus) alleles predominate in East Africa and decline in frequency towards the west and south of the continent. Note the pockets of trypanotolerant pure taurine cattle in forest West Africa, which are indicated by red boxes. Dots indicate sample locations and include European and African taurine as well as zebu. Hanotte et al Science 2002 vol. 296 336-9.

Methods

Sequencing the genomes of N’Dama and Boran

The genomes of Boran and N’Dama were sequenced with two objectives:

  • To develop a panel of marker SNP that can be used for whole genome association studies in African cattle.
  • To identify candidate QTL SNP and structural variants (SV).

N’Dama samples were collected in the Gambia. Four pools of five DNA samples were sequenced on the Illumina Genome Analyzer II to a total of 15.35x coverage, and SNP were selected with a Phred quality score (Q) > 20 and coverage > 2.

A pool of 10 Boran DNA samples from the ILRI herd in Kenya was sequenced on the ABI SOLiD to 11.42x coverage, and SNP were selected that had a confidence score > 0.9 and coverage > 1.

Sahiwal is a Bos indicus breed that was sequenced to assist in the identification of Bos indicus and Bos taurus alleles in Boran. A pool of DNA from ten animals was sequenced to a depth of 12x.

Short read sequences were aligned to the University of Maryland Bovine Assembly version 3 (UMD3) of the Bos taurus Hereford reference sequence (Genome Biology 2009 vol. 10 (4) pp. R42).

Annotating the genomes of N’Dama and Boran to identify potentially functional SNP that correlate with phenotype

A Taverna workflow was developed to annotate SNP within genes. SNP positions were converted from UMD3 to Bta4 co-ordinates using liftOver from UCSC. All SNP within genes were classified with the Ensembl API, and SNP classified as non-synonymous were then assessed using a local copy of Polyphen (Nucleic Acids Research 30: 3894-3900). Results are shown in Table 1. The workflow is being adapted for human and mouse and is available from the authors as a virtual machine.

[Figure: Taverna workflow for genome annotation]

Table 1. SNP consequences in N’Dama, Boran and Sahiwal. Counts of SNP relative to the Hereford bovine reference sequence. SNP in genes were annotated with the workflow described above.

Freeman et al 2006 Animal Genetics 37 1-9.

[Figure: Regions with low minor allele frequencies that may be under selection] Minor allele frequencies (MAF) of moving 50 kb windows in a 4 Mb region around TLR5 and SUSD4 on Bta16, calculated by the method of Rubin et al (Nature 2010 464: 587–591). Regions of low minor allele frequency might be under selection. TLR5 is in a QTL for trypanotolerance and lies under the first trough of the double dip (red circle); SUSD4 lies under the second. Their MAF are 0.049 and 0.048 respectively. The horizontal line through the MAF plot is at 20.44, the mean MAF.

Results

The location, nature and potential impact of the putative SNP are summarised in Table 1. Particularly interesting is the large number of stop-gained SNP. Many of these might represent errors in annotation, for example two genes annotated as one, but they are clearly candidates for further investigation as QTL genes. Sahiwal had many more SNP than N’Dama and Boran.
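To put the difference in SNP numbers on a common scale, the sketch below takes counts from Table 1 and computes each breed's total relative to N’Dama, the fraction of SNP that kept a position after liftOver to Bta4, and the number of stop-gained SNP per million lifted-over SNP. It is an illustration only: the function and variable names are hypothetical and are not part of the poster's workflow.

```python
# Counts copied from Table 1 (SNP relative to the Hereford reference sequence).
TOTAL_SNP_UMD3 = {"Boran": 11_458_009, "N'Dama": 11_112_844, "Sahiwal": 21_960_387}
LIFTED_TO_BTA4 = {"Boran": 10_799_611, "N'Dama": 9_847_097, "Sahiwal": 19_789_571}
STOP_GAINED = {"Boran": 179, "N'Dama": 840, "Sahiwal": 1_197}


def fold_vs(counts, baseline):
    """Each breed's count divided by the baseline breed's count."""
    return {breed: n / counts[baseline] for breed, n in counts.items()}


def liftover_retention():
    """Fraction of UMD3.0 SNP that were lifted over to Bta4."""
    return {b: LIFTED_TO_BTA4[b] / TOTAL_SNP_UMD3[b] for b in TOTAL_SNP_UMD3}


def stop_gained_per_million():
    """STOP_GAINED calls per million SNP lifted over to Bta4."""
    return {b: 1e6 * STOP_GAINED[b] / LIFTED_TO_BTA4[b] for b in STOP_GAINED}


if __name__ == "__main__":
    summaries = [
        ("Total SNP relative to N'Dama", fold_vs(TOTAL_SNP_UMD3, "N'Dama")),
        ("Fraction retained by liftOver", liftover_retention()),
        ("Stop gained per million SNP", stop_gained_per_million()),
    ]
    for title, values in summaries:
        print(title)
        for breed, value in values.items():
            print(f"  {breed:8s} {value:10.3f}")
```

Dividing by the lifted-over total is one way to separate "more SNP overall" from "more SNP of a particular class" when comparing breeds sequenced to different depths or on different platforms.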
It was expected that Bos indicus would have more SNP relative to the Hereford Bos taurus reference sequence because of the 200,000 years separating Sahiwal from Hereford. However, it is not clear why Boran, which is mainly of Bos indicus origin, did not have a similar number of SNP. This difference may be because N’Dama and Sahiwal were sequenced on the Illumina platform and Boran was sequenced on the ABI SOLiD platform.

Minor allele frequencies were calculated using the method of Rubin et al (Nature 2010 464: 587–591). Regions of low minor allele frequency might be under selection. Evidence for selection was found around TLR5 within a QTL on chromosome 16 (see figure). Other regions also showed evidence of selection.

Mapping Trypanotolerance QTL

[Figure: QTL with p<0.0043 for PCV, body weight and parasitaemia. Panel labels: Cattle; Tsetse; Cattle and tsetse; N’Dama (tolerant); Boran (relatively susceptible). Hanotte et al PNAS 2003 7443-7448]

The N’Dama and Boran each contribute trypanotolerance alleles at 5 of the 10 most significant QTL, indicating that a synthetic breed could have even higher tolerance than the N’Dama.

N’Dama males were crossed with Boran females to generate F1 individuals that were subsequently crossed to create an F2 mapping population of 177 animals composed of a few large families. These animals were challenged and monitored for parasitaemia, body weight and packed cell volume (PCV), leading to the identification of 10 major QTL (Hanotte et al PNAS 2003 100 7443-8). The confidence intervals of the QTL were large, making it difficult to identify candidate genes. Interestingly, the susceptible Boran cattle were found to carry the resistance alleles at five out of ten loci. Boran have evidently developed some resistance to disease, suggesting that a synthetic breed might be even more resistant than either parent.

Table 1

Consequence                           Boran       N’Dama      Sahiwal
Total SNP mapped to UMD3.0       11,458,009   11,112,844   21,960,387
SNP lifted over to Bta4          10,799,611    9,847,097   19,789,571
Within exon                         111,754      123,605      222,731

Annotated by Ensembl API
STOP_LOST                                 7           23           25
STOP_GAINED                             179          840        1,197
SYNONYMOUS_CODING                    33,916       30,909       57,073
NON_SYNONYMOUS_CODING                20,013       27,751       45,237
3PRIME_UTR                           16,606       17,117       31,770
5PRIME_UTR                            2,557        3,880        7,227
UPSTREAM                              4,120        5,042        9,477
DOWNSTREAM                            6,412        7,541       14,005
INTRONIC                             24,249       26,268       49,228
INTERGENIC                               58           73          139
SPLICE_SITE                           1,537        2,026        3,353
WITHIN_NON_CODING_GENE                2,099        2,126        3,960
WITHIN_MATURE_miRNA                       1            9           40

Annotation of non-synonymous SNP by Polyphen
ANNOTATED BY POLYPHEN                17,797       26,466       40,060
BENIGN                               13,429       14,161       24,619
POSSIBLY DAMAGING                     1,686        3,552        5,388
PROBABLY DAMAGING                     1,931        5,908        8,176
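
The Polyphen rows of the table are easier to compare across breeds as proportions. The following is a minimal sketch that uses only the counts shown above; the dictionary layout and function names are hypothetical. The three prediction classes do not add up to the number of SNP annotated by Polyphen, and the poster does not say how the remainder was classified, so the sketch reports that remainder separately rather than assigning it to a class.

```python
# Polyphen counts copied from the table above (non-synonymous SNP only).
POLYPHEN = {
    "Boran": {"annotated": 17_797, "benign": 13_429,
              "possibly_damaging": 1_686, "probably_damaging": 1_931},
    "N'Dama": {"annotated": 26_466, "benign": 14_161,
               "possibly_damaging": 3_552, "probably_damaging": 5_908},
    "Sahiwal": {"annotated": 40_060, "benign": 24_619,
                "possibly_damaging": 5_388, "probably_damaging": 8_176},
}


def damaging_fraction(breed):
    """Share of Polyphen-annotated SNP predicted possibly or probably damaging."""
    row = POLYPHEN[breed]
    damaging = row["possibly_damaging"] + row["probably_damaging"]
    return damaging / row["annotated"]


def unclassified(breed):
    """Annotated SNP not accounted for by the three classes listed in the table."""
    row = POLYPHEN[breed]
    listed = row["benign"] + row["possibly_damaging"] + row["probably_damaging"]
    return row["annotated"] - listed


if __name__ == "__main__":
    for breed in POLYPHEN:
        print(f"{breed:8s} damaging fraction = {damaging_fraction(breed):.3f}, "
              f"not in listed classes = {unclassified(breed)}")
```

Using the number annotated by Polyphen as the denominator, rather than the NON_SYNONYMOUS_CODING count, keeps the proportions restricted to SNP that actually received a prediction.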