Indian Journal of Biotechnology
Vol. 7, October 2008, pp 478-481
The PCR amplification, sequencing and computer-aided analysis of
ovine αS1-casein gene promoter
S K Bhure* and B Sharma1
*Project Directorate on Animal Disease Monitoring and Surveillance, IVRI Campus, Hebbal, Bangalore 560 024, India
Indian Veterinary Research Institute, Izatnagar 243 122, India
Received 12 October 2006; revised 13 February 2008; accepted 15 April 2008
The paper reports the 5′-flanking sequence of the ovine αS1-casein gene promoter (αS1-CSNGP) of 2185 bp. It showed many
deletions, substitutions and a 12 bp addition compared to the bovine sequence. The comparative study showed 2136 bp of
5′-flanking region and 49 bp of exon I sequence. The exon I sequence contained two ribosomal binding sites. The
computational analysis showed presence of core promoter elements, viz., TATA box, CAAT box and initiator sequence.
However, no typical GC box was found. Of five known mammary gland specific sequences, three sequences, viz., milk box,
Groenen structure and Yu Lee 6, were found. The 220 bp Groenen structure contained other milk protein gene specific
sequences (MGF, MPBF, Yu Lee 2, 4 and 5, and Oka box C) and hormone responsive elements (PRE, PRL-RE). Other
HREs (GRE, CRE, GHRE and IRE) and ubiquitous transcription factor binding sites were also present. These milk protein
gene specific regulatory sequences and HREs are responsible for tissue specific and multi-hormone regulation of the ovine αS1-CSNG.
Keywords: Ovine αS1-casein gene promoter, gene regulation, transcription factor binding sites, hormone responsive elements
The production of transgenic animals, expressing
the foreign DNA sequence introduced into their
genome, is a powerful technique for both biological
research and bio-industry. The promoters of milk
protein genes have been used to produce, in
mammary glands, human proteins that are essential for the
treatment of many diseases1,2. The elements which
regulate tissue specific expression have to be present
in the promoter region for production of the heterologous
protein of interest. So far, studies utilizing the bovine
αS1-casein gene (αS1-CSNG) promoter have demonstrated
high levels of expression in the milk of transgenic
animals3,4. The characterization of transcriptional
regulatory sequences in the promoter region of any
gene is an important prerequisite for understanding
the interplay of the various regulatory/hormone
interactions which control its expression. However,
this kind of study requires mammary gland specific
cell lines that are capable of expressing a
heterologous protein gene consistently. The
computer-aided search of milk protein gene promoters
for various cis-acting DNA elements and trans-acting
factors will simplify the transcription regulation
studies to a great extent. The promoter of the
ovine αS1-CSNG has not been characterized.
Therefore, the promoter region was cloned and
sequenced. The sequence was then submitted to
GenBank database of the National Center for
Biotechnology Information (NCBI, acc. no.
Materials and Methods
The genomic DNA was isolated from 5 mL of
sheep blood as per the method described by
Sambrook and Russell5. The DNA isolated from
polymorphonuclear leucocytes was used as template
for PCR. A set of primers was designed from the
conserved regions of heterologous sequences of
αS1-CSNG available in GenBank (NCBI). The hot-start PCR conditions were standardized by using
gradient PCR with varied concentrations of MgCl2.
The PCR mixture contained 25 pM of each primer
(forward primer, 5′-CCA GAT GGG CAT GAA
AAA GGA-3′ and reverse primer, 5′-AAC CCA
AGA CTG GGA AGA AG-3′), 200 µM of each dNTP
and 1.5 U of XT-Taq DNA polymerase (Bangalore
Genei, India) in a final volume of 50 µL. The hot-start PCR was performed as follows: denaturation at
94°C for 60 sec; annealing at 65°C for 60 sec;
extension at 72°C for 2 min in 35 cycles. The PCR
product was gel purified using the GenElute Gel
Extraction Kit (Sigma, USA) as per the
manufacturer’s protocol and cloned in the pGEM-T Easy
cloning vector (Promega, USA). Two recombinant
clones were sequenced completely for both strands
(ABI Prism 310).
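As a rough cross-check of the primer pair quoted above, the following Python sketch computes length, GC content and an approximate melting temperature. The primer sequences are taken from the text; the helper names are illustrative, and the Wallace rule (2 °C per A/T, 4 °C per G/C) is only a coarse approximation that is assumed here for convenience, not the method used to choose the 65°C annealing temperature.

```python
# Primer sequences as given in the text (spaces removed).
FORWARD = "CCAGATGGGCATGAAAAAGGA"
REVERSE = "AACCCAAGACTGGGAAGAAG"


def gc_count(primer: str) -> int:
    """Number of G and C bases in the primer."""
    p = primer.upper()
    return p.count("G") + p.count("C")


def wallace_tm(primer: str) -> int:
    """Approximate Tm in deg C by the Wallace rule: 2*(A+T) + 4*(G+C).

    Assumption: this rule of thumb is used only as a rough guide.
    """
    p = primer.upper()
    at = p.count("A") + p.count("T")
    return 2 * at + 4 * gc_count(p)


def summarize(primer: str) -> dict:
    """Collect basic primer statistics in one dictionary."""
    return {
        "length": len(primer),
        "gc_percent": round(100 * gc_count(primer) / len(primer), 1),
        "wallace_tm": wallace_tm(primer),
    }


if __name__ == "__main__":
    for name, seq in (("forward", FORWARD), ("reverse", REVERSE)):
        print(name, summarize(seq))
```

Both primers give similar estimates under this rule, which is consistent with using a single annealing temperature for the pair.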
All the related milk protein gene sequences were
downloaded from GenBank, NCBI. The sequences
were edited for 5′-flanking regions and checked for
sequence homology with ovine αS1-CSNGP using
MegAlign of DNAstar molecular biology software.
Homology map of 5′-flanking regions was
constructed between ovine αS1-CSNGP and different
casein genes of sheep, goat, cattle, buffalo, yak, rabbit
and rat. Computer analysis of putative cis- and trans-regulatory sequences in the promoter region of ovine
αS1-CSNG was carried out with Gene Tool Lite and
DNASIS molecular biology software. A separate
database was prepared in DNASIS for the various milk
protein associated consensus transcription factor-binding sequences available in the literature.
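The idea of scanning a promoter against a small database of consensus sequences can be sketched in Python as follows. This is not the DNASIS procedure; it is a minimal illustration that assumes consensus motifs are written with IUPAC ambiguity codes and matched exactly. The motif labels and the toy sequence are hypothetical; only the motif strings themselves are quoted from this paper.

```python
import re

# IUPAC nucleotide ambiguity codes mapped to regular-expression classes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "S": "[CG]", "W": "[AT]",
    "K": "[GT]", "M": "[AC]", "B": "[CGT]", "D": "[AGT]",
    "H": "[ACT]", "V": "[ACG]", "N": "[ACGT]",
}

# Small illustrative database; the labels are hypothetical names.
MOTIFS = {
    "glucocorticoid hexanucleotide": "TGTYCT",
    "TATA-like": "TTTAAATA",
    "CAAT-like": "CAAAAT",
    "CRE-like": "TGACATCA",
}


def find_motif(sequence: str, consensus: str) -> list:
    """Return 0-based start indices of all (possibly overlapping) matches."""
    body = "".join(IUPAC[base] for base in consensus.upper())
    # A lookahead lets overlapping occurrences be reported as well.
    pattern = re.compile("(?=" + body + ")")
    return [m.start() for m in pattern.finditer(sequence.upper())]


def scan(sequence: str, motifs: dict = MOTIFS) -> dict:
    """Map each motif label to the list of its match positions."""
    return {name: find_motif(sequence, cons) for name, cons in motifs.items()}


if __name__ == "__main__":
    # Toy sequence for demonstration only; it is NOT the ovine promoter.
    toy = "GGTGTTCTAACAAAATCCTTTAAATAGG"
    for name, hits in scan(toy).items():
        print(name, hits)
```

Exact matching is the strictest possible criterion; a search that tolerates mismatches, as implied by the percentage homologies reported for some motifs, would need a scoring step that is not shown here.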
Results and Discussion
The full-length cloned fragment was 2185 bp. The
comparison of 3′-end sequence of cloned 2185 bp
fragment with that of 5′-end 45 bp ovine αS1-CSNG
mRNA showed absolute homology. This 49 bp region
is exon I sequence, which is conserved across rat
α-, β-, γ-casein and bovine αS1-casein genes6. This
exon I sequence was edited from the ovine DNA
fragment to get the 5′-flanking region of the ovine αS1-CSNG. The sequence showed 91.2% homology
with goat (GenBank acc. no. AJ504712), 91.7% with
cattle (GenBank acc. no. X59856), 81.9% with yak
(GenBank acc. no. AF194983), 89.9% with bovine
αS1-CSN (GenBank acc. no. AF529305 segment 1).
Thus, comparison showed significant homology with
closely related species (bovine, goat, buffalo and yak)
and low/insignificant homology was observed with
camel, rabbit and rat αS1-casein gene. The highest
homology was with bovine and caprine αS1-CSNG.
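Percentage homology figures of this kind can be illustrated with a simple identity calculation over two already aligned sequences. The sketch below is an assumption about one common definition (identical columns divided by compared columns, with a gap against a base counted as a mismatch); the similarity index reported by MegAlign may be defined differently, and the example sequences are invented.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent identity between two pre-aligned sequences of equal length.

    Assumptions: columns where both sequences have a gap are ignored,
    and a gap opposite a base counts as a mismatch.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    matches = 0
    for x, y in zip(aligned_a.upper(), aligned_b.upper()):
        if x == gap and y == gap:
            continue  # uninformative column
        compared += 1
        if x == y:
            matches += 1
    if compared == 0:
        raise ValueError("no comparable columns")
    return 100.0 * matches / compared


if __name__ == "__main__":
    # Invented toy alignment with one gap and one substitution.
    print(percent_identity("ACGT-ACGTA", "ACGTTACGAA"))
```

How gaps are scored matters for promoters that differ by insertions and deletions, so values from different tools are not always directly comparable.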
Because the complete 5′-flanking sequence of bovine
αS1-CSNG was available and showed significant
homology with the ovine amplicon, the bovine sequence
(GenBank acc. no. X59856) was used for
further analysis, also in view of the close
evolutionary relationship and presumably unmodified
transcription factor binding site preferences. The
ovine promoter region showed additional
sequences at -1005 to -992, -240 to -237 and -221 to -218. A 12 bp additional sequence was also noted,
whose role, if any, in influencing promoter activity
requires further study.
On the basis of sequence comparison with bovine
αS1-casein gene promoter6, we predicted a putative
transcription start site, CCA+1TCA, which has the
same sequence as the initiator consensus
YYA+1(T/A)YY. Exon I showed two ribosomal
binding sites, CCTTGATCA, centered at +5/+13, and
GCTGCTTC at +26/+33 6-8. The amplified ovine αS1-CSNGP contained the complete non-coding exon I except
for the last four nucleotides, CAAG.
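Positions such as -29 or +26 are given relative to the putative transcription start site (+1). A short Python sketch of how such positions could be mapped onto the 2185 bp cloned fragment is given below. It assumes the usual convention that there is no position 0, and that the fragment consists of 2136 bp of 5′-flanking region followed by 49 bp of exon I, as stated in the text; the function names are illustrative.

```python
FLANK_LEN = 2136   # 5'-flanking region, positions -2136 .. -1
EXON_LEN = 49      # amplified part of exon I, positions +1 .. +49
FRAGMENT_LEN = FLANK_LEN + EXON_LEN  # 2185 bp cloned fragment


def position_to_index(pos: int) -> int:
    """Convert a TSS-relative position to a 0-based index in the fragment.

    Assumption: numbering skips zero (-1 is followed directly by +1).
    """
    if pos == 0:
        raise ValueError("position 0 does not exist in this numbering")
    index = FLANK_LEN + pos if pos < 0 else FLANK_LEN + pos - 1
    if not 0 <= index < FRAGMENT_LEN:
        raise ValueError("position lies outside the cloned fragment")
    return index


def index_to_position(index: int) -> int:
    """Convert a 0-based fragment index back to a TSS-relative position."""
    if not 0 <= index < FRAGMENT_LEN:
        raise ValueError("index lies outside the cloned fragment")
    return index - FLANK_LEN if index < FLANK_LEN else index - FLANK_LEN + 1


if __name__ == "__main__":
    for pos in (-2136, -29, -1, 1, 49):
        print(pos, position_to_index(pos))
```

With this convention the first base of the fragment is -2136 and the last is +49, so the two parts add up to the full 2185 bp.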
The comparison with known consensus sequences
of both ubiquitous and specific transcription factor
motifs described for milk protein gene promoters
showed several motifs common to different milk
protein gene promoters as described previously9-13.
The data of consensus sequences showing stringent
homology are arranged as milk protein specific
sequences (Table 1) and hormone receptor consensus
sequences (Table 2). The motifs are distributed
throughout the ovine αS1-casein 5′-flanking region.
Table 1: Mammary gland-specific transcription factors and consensus sequences analysed in ovine αS1-CSNGP
Mammary specific factor
Mammary cell activating factor
Mammary gland specific
Yin and Yang factor 1
-196, -1420, -1514, -1712
Oka box C
Mammary gland specific
Table 2: Hormone-receptor consensus sequences in ovine αS1-CSNGP
Hormone response element
Glucocorticoid responsive element
Progesterone responsive element
Cyclic AMP responsive element
Growth hormone unit
-1602, -1351, -278
-1335, -499, -419
Rat prolactin unit
IRE1 factor, rat insulin 1 unit, MAMM system
However, most of the milk gene promoter specific
motifs are clustered between the transcription initiation
site and -155, except for MAF and an upstream MGF
site. The sequence was also searched for basal
promoter elements, viz., transcription start
site/initiator sequence, TATA signal, GC box and
CAAT box. The ovine αS1-CSNGP contains a
sequence, TTTAAATA, at -29, which showed
homology with the TATA box of cattle, buffalo,
camel and rat αS1-CSNG and the γ-CSN gene of rat. A
sequence, CAAAAT, resembling the CAAT box of the rabbit
β-casein gene promoter (GenBank
acc. no. X15735), was found at -57. MGF/MPBF/STAT5 sequences
were located at -98 and -1937 11-15. MGF is a
transcription factor discovered initially in the
mammary epithelial cells of lactating animals and is a
novel member of the cytokine-regulated transcription
factor gene family and known to mediate prolactin
responsiveness of milk protein gene expression. The
MGF/MPBF/STAT5 site found in ovine αS1-CSNGP
may be presumed to confer prolactin hormone
induction. In ovine αS1-CSNGP, the DNA segment
between -240 and -20 showed 96% homology to the
“Groenen structure” consensus sequence11. This 220
bp DNA segment contains MGF/MPBF, Yu Lee 2, 4,
5 and 6, Oka box C, PRE, PRL-RE, and γ- and β-interferon responsive elements. There are four
sequences showing 65-70% homology to the
consensus milk box sequence as described by Laird
et al10. Five mammary gland specific sequences
associated with milk protein genes have been
reported, viz., milk box, Groenen structure, Yu Lee
sequence 1 and 6, and Oka box A16; three of them are
present in the ovine sequence. The presence of these
mammary gland specific sequences contributes to the
tissue specificity of the promoter (Table 1).
Milk protein gene expression is regulated by a
combination of steroid and polypeptide hormones,
with prolactin, insulin, glucocorticoids and
estrogens being the most important positive regulators and
progesterone the main negative regulator of gene
expression17. The hexanucleotide TGTYCT, which is
part of a number of glucocorticoid receptor binding
sites18, is located at -278, -1351 and -1602. A
sequence, ATTTCCGATGT, at -116 showed
homology to the rabbit progesterone receptor binding
sequence at -110 19. A sequence, CTGATTA, at -40
showed resemblance to the rat prolactin unit20 but is
present in inverted orientation relative to the
orientation of the gene. The sequence GCCATCTG at
-1421 showed homology to the rat insulin unit21 and
TGACATCA at -1748 to a human promoter CRE
element22.
These results agree well with experimental data
showing that the expression of milk protein genes is
subject to hormone regulation by glucocorticoids,
progesterone, prolactin and insulin17. The milk
protein gene expression is also regulated by
mammary tissue specific transcription factors11.
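Because some elements, such as the prolactin-unit-like sequence described above, occur in inverted orientation, a motif search has to consider both strands. The following Python sketch shows one way to do this by also searching for the reverse complement of the motif on the given strand. The function names and the toy sequences are hypothetical; the motif CTGATTA is quoted from the text.

```python
import re

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Reverse complement of an unambiguous DNA sequence (A, C, G, T only)."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def find_both_strands(sequence: str, motif: str) -> list:
    """Return sorted (0-based start, strand) pairs for a motif.

    '+' means the motif occurs as written; '-' means its reverse
    complement occurs, i.e. the motif lies on the opposite strand.
    A palindromic motif is reported on both strands.
    """
    seq = sequence.upper()
    hits = []
    for strand, pattern in (("+", motif.upper()), ("-", reverse_complement(motif))):
        # Lookahead so that overlapping occurrences are not skipped.
        for m in re.finditer("(?=" + re.escape(pattern) + ")", seq):
            hits.append((m.start(), strand))
    return sorted(hits)


if __name__ == "__main__":
    # Toy sequences for demonstration only; not the ovine promoter.
    print(find_both_strands("AACTGATTAGG", "CTGATTA"))
    print(find_both_strands("GGTAATCAGCC", "CTGATTA"))
```

Reporting the strand together with the position keeps the orientation information that is needed when comparing a site with its counterpart in another species.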
The other ubiquitous transcription factor binding
sites found in the promoter region include AP 1, AP
2, AP 3, W-element and TTS. Two possible types of
enhancer elements (PEA 3-CS) were also found. However,
the ubiquitously expressed Oct 1 transcription factor,
which is involved in the regulation of expression of
many tissue specific and housekeeping genes, was not
found in the ovine αS1-CSNGP23.
The temporal and tissue-specific expression of milk
protein genes is controlled by distinct classes of co-operating and antagonistic transcription
factors, which are associated with multiple,
sometimes clustered, binding sites. The number and
position of potential binding sites can play a decisive
role in the outcome of these synergistic and
antagonistic interactions. The general theme is that
common consensus sequences are present in all
promoters, but their spatial arrangement differs between
species, which also holds
true for ovine αS1-CSNGP. Deletion of the
tissue specific regulatory sequences and
certain negative regulatory elements could make the promoter
useful for the construction of an inducible eukaryotic
expression vector. The computational analysis
showed the presence of mammary gland specific
regulatory elements, which can make ovine αS1-CSNGP useful for transgenic vector construction.
Acknowledgement
The authors sincerely thank the Director, Indian
Veterinary Research Institute, Izatnagar and the Indian
Council of Agricultural Research, New Delhi for
providing the necessary facilities and financial
support during the research work.
References
1 Wilmut I, Archibald A L, McClenaghan M, Simons J P,
Whitelaw C B et al, Production of pharmaceutical proteins in
milk, Experientia, 47 (1991) 905-912.
2 Houdebine L M, Production of pharmaceutical proteins from
transgenic animals, J Biotechnol, 34 (1994) 269-287.
3 Toman P D, Pieper F, Sakai N, Karatzas C, Platenburg E
et al, Expression of HBsAg gene in transgenic goats under
direction of bovine α-S1 casein control sequence, Transgenic
Res, 8 (1999) 415-427.
4 Tan X H, Cheng X, Zhou J, Chen H X, Un F Y et al, Bovine
α-S1-casein gene sequences direct expression of a variant of
human tissue plasminogen activator in the milk of transgenic
mice, Yi-Chuan-Xue-Bao, 28 (2001) 405-410.
5 Sambrook J & Russell D W, Molecular cloning: A
laboratory manual, 3rd edn (Cold Spring Harbor, New York)
6 Koczan D, Hobom G & Seyfert H M, Genomic organization
of the bovine α-S1-casein gene, Nucleic Acids Res, 19 (1991)
7 Mercier J C, Gaye P, Soulier S, Hue-Delahaie D & Villotte J
L, Construction and identification of recombinant plasmids
carrying cDNAs coding for ovine αS1-, αS2-, β-, κ-casein
and β-lactoglobulin. Nucleotide sequence of αS1-casein
cDNA, Biochimie, 67 (1985) 959-971.
8 Yu-Lee L Y, Richter-Mann L, Couch C H, Stewart A F,
Mackinlay A G et al, Evolution of the casein multigene
family: Conserved sequences in the 5′-flanking and exon
regions, Nucleic Acids Res, 14 (1986) 1883-1902.
9 Hall L, Emery D C, Davies M S, Parker D & Craig R K,
Organization and sequence of human α-lactalbumin gene,
Biochem J, 242 (1987) 735-742.
10 Laird J E, Jack L, Hall L, Boulton A P & Parker D, Structure
and expression of the guinea-pig α-lactalbumin gene,
Biochem J, 254 (1988) 85-94.
11 Groenen M A M, Dijkhof R J, van der Poel J J, van Diggelen
R & Verstege E, Multiple octamer binding sites in the
promoter region of bovine αS2-casein gene, Nucleic Acids
Res, 20 (1992) 4311-4318.
12 Yoshimura M & Oka T, Isolation and structural analysis of
mouse β-casein gene, Gene, 78 (1989) 267-275.
13 Watson C J, Gordon K E, Robertson M & Clark A J,
Interaction of DNA-binding proteins with milk protein gene
promoter in vitro: Identification of mammary gland specific
factor, Nucleic Acids Res, 19 (1991) 6603-6610.
14 Schmitt-Ney M, Doppler W, Ball R K & Groner B, β-Casein
gene promoter activity is regulated by hormone-mediated
relief of transcriptional repression and a mammary gland-specific nuclear factor, Mol Cell Biol, 7 (1991) 3745-3755.
15 Wakao H, Gouilleux F & Groner B, Mammary gland factor
(MGF) is a novel member of the cytokine regulated
transcription factor gene family and confers the prolactin
response, Eur Mol Biol Organ J, 13 (1994) 2182-2191.
16 Malewski T & Zwierzchowski L, Computer-aided analysis of
potential transcription-factor binding sites in rabbit β-casein
gene promoter, BioSystems, 36 (1995) 109-119.
17 Vonderhaar B K & Ziska S E, Hormonal regulation of milk
protein gene expression, Annu Rev Physiol, 51 (1989)
18 Scheidereit C, Geisse S, Westphal H M & Beato M, The
glucocorticoid receptor binds to defined nucleotide
sequences near the promoter of mouse mammary tumor
virus, Nature (Lond), 304 (1983) 749-752.
19 von der Ahe D, Janich S, Scheidereit C, Renkawitz R, Schutz
G et al, Glucocorticoid and progesterone receptors bind to
the same sites in two hormonally regulated promoters,
Nature (Lond), 313 (1985) 706-709.
20 Schuster W A, Treacy M N & Martin F, Tissue specific
trans-acting factor interaction with proximal rat prolactin
gene promoter sequences, EMBO J, 6 (1988) 1721-33.
21 Ohlsson H, Karlsson O & Edlund T, A beta specific protein
binds to two major regulatory sequences of insulin gene
enhancer, Proc Natl Acad Sci USA, 85 (1988) 4228-31.
22 Lichtenheld M G & Podack E R, Structure of human perforin
gene. A simple gene organization with interesting potential
regulatory sequences, J Immunol, 143 (1989) 4267-4274.
23 Zhao F Q, Zheng Y, Dong B & Oka T, Cloning, genomic
organization, expression, and effect on β-casein promoter
activity of a novel isoform of the mouse Oct-1 transcription
factor, Gene, 326 (2004) 175-187.