Human Genome
What is the Human Genome?
The human (Homo sapiens) genome is the complete set of human genetic information, stored as DNA
sequences within the 23 chromosome pairs of the cell nucleus and in a small DNA molecule found within
individual mitochondria.
It includes both protein-coding DNA genes and noncoding DNA.
Haploid human genomes consist of three billion DNA base pairs, while diploid genomes have twice the DNA
content.
Although the genomes of individual humans differ significantly, these differences are considerably smaller
than the differences between humans and their closest living relatives, the chimpanzees and bonobos.
Non-coding DNA
In genomics and related disciplines, noncoding DNA sequences are components of an organism's DNA that
do not encode protein sequences.
Some noncoding DNA is transcribed into functional noncoding RNA molecules, while other sequences are not
transcribed or give rise to RNA transcripts of unknown function.
Many noncoding DNA sequences have important biological functions, as indicated by comparative
genomics studies that report some regions of noncoding DNA that are highly conserved, sometimes on
time-scales representing hundreds of millions of years, implying that these noncoding regions are under
strong evolutionary pressure and purifying selection.
For example, in the genomes of humans and mice, which diverged from a common ancestor 65–75 million
years ago, protein-coding DNA sequences account for only about 20% of conserved DNA, with the remaining
80% of conserved DNA represented in noncoding regions.
Linkage mapping often identifies chromosomal regions associated with a disease with no evidence of
functional coding variants of genes within the region, suggesting that disease-causing genetic variants lie in
the noncoding DNA.
Noncoding functional RNA
Noncoding RNA molecules play many essential roles in cells, especially in the many reactions of protein
synthesis and RNA processing.
Non-coding RNA genes include highly abundant and functionally important RNAs such as transfer RNA (tRNA)
and ribosomal RNA (rRNA), as well as RNAs such as snoRNAs, microRNAs, siRNAs, snRNAs, exRNAs,
and piRNAs, and the long ncRNAs, which include examples such as Xist and HOTAIR.
A non-coding RNA (ncRNA) is a functional RNA molecule that is not translated into a protein.
Cis- and Trans-regulatory elements
Cis-regulatory elements are sequences that control the transcription of a nearby gene. Cis-elements may be
located in 5' or 3' untranslated regions or within introns. Trans-regulatory elements control the transcription of
a distant gene.
Promoters facilitate the transcription of a particular gene and are typically upstream of the coding
region. Enhancer sequences may also exert very distant effects on the transcription levels of genes.
Introns
Introns are non-coding sections of a gene that are transcribed into the precursor mRNA sequence but ultimately
removed by RNA splicing during processing to mature messenger RNA. Many introns appear to be mobile
genetic elements.
An intron is any nucleotide sequence within a gene that is removed by RNA splicing while the final mature
RNA product of a gene is being generated.
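As a rough illustration of splicing, the sketch below removes intron intervals from a precursor sequence and joins the remaining exons. The function name `splice`, the coordinate convention (0-based, end-exclusive) and the toy sequence are assumptions made for this example, not part of the original notes; real splicing is guided by splice-site signals rather than supplied coordinates.

```python
from typing import List, Tuple


def splice(pre_mrna: str, introns: List[Tuple[int, int]]) -> str:
    """Remove intron intervals (0-based, end-exclusive) and join the exons."""
    exons = []
    position = 0
    for start, end in sorted(introns):
        # Reject empty, overlapping, or out-of-range intervals.
        if start >= end or start < position or end > len(pre_mrna):
            raise ValueError(f"invalid intron interval: ({start}, {end})")
        exons.append(pre_mrna[position:start])  # exon before this intron
        position = end                          # skip over the intron
    exons.append(pre_mrna[position:])           # final exon
    return "".join(exons)


# Toy precursor: exon 1 + intron + exon 2 (hypothetical sequence).
pre_mrna = "AUGGCC" + "GUAAGUUUUCAG" + "UUUUAA"
mature_mrna = splice(pre_mrna, [(6, 18)])
print(mature_mrna)
```

Here the intron occupies positions 6 to 17, so the mature sequence is simply the two exons joined end to end.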
Studies of group I introns from Tetrahymena protozoans indicate that some introns appear to be selfish genetic
elements, neutral to the host because they remove themselves from flanking exons during RNA processing and
do not produce an expression bias between alleles with and without the intron.
Some introns appear to have significant biological function, possibly through ribozyme functionality that may
regulate tRNA and rRNA activity as well as protein-coding gene expression, evident in hosts that have become
dependent on such introns over long periods of time.
For example, the trnL-intron is found in all green plants and appears to have been vertically inherited for several
billion years, including more than a billion years within chloroplasts and an additional 2–3 billion years prior
in the cyanobacterial ancestors of chloroplasts.
Pseudogenes
Pseudogenes are DNA sequences, related to known genes, that have lost their protein-coding ability or are
otherwise no longer expressed in the cell.
Pseudogenes arise from retrotransposition or genomic duplication of functional genes, and become "genomic
fossils" that are nonfunctional due to mutations that prevent the transcription of the gene, such as mutations
within the gene promoter region, or that fatally alter the translation of the gene, such as premature stop codons
or frameshifts.
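The two translation-disrupting features mentioned above can be checked with a few lines of code. This is a minimal sketch under simplifying assumptions: the function name `disabling_features` and the example sequences are hypothetical, and a length that is not a multiple of three is used only as a crude proxy for a frameshift (a proper check would compare the sequence against its functional parent gene).

```python
from typing import List

STOP_CODONS = {"TAA", "TAG", "TGA"}


def disabling_features(cds: str) -> List[str]:
    """List features that would disrupt translation of a coding sequence."""
    cds = cds.upper()
    problems = []
    # Crude frameshift proxy: the reading frame no longer divides evenly.
    if len(cds) % 3 != 0:
        problems.append("length not a multiple of three (possible frameshift)")
    # Split into complete codons, ignoring any leftover bases at the end.
    usable = len(cds) - len(cds) % 3
    codons = [cds[i:i + 3] for i in range(0, usable, 3)]
    # A stop codon anywhere before the last complete codon is premature.
    for index, codon in enumerate(codons[:-1]):
        if codon in STOP_CODONS:
            problems.append(f"premature stop at codon {index + 1}")
            break
    return problems


functional = "ATGGCTGAAAAATAA"   # ATG GCT GAA AAA TAA
nonsense = "ATGGCTTAAAAATAA"     # third codon mutated to TAA
shifted = "ATGGCGAAAAATAA"       # one base deleted from the functional copy

for name, seq in [("functional", functional), ("nonsense", nonsense), ("shifted", shifted)]:
    print(name, disabling_features(seq))
```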
Repeat sequences, transposons and viral elements
Transposons and retrotransposons are mobile genetic elements.
Retrotransposon repeated sequences, which include long interspersed nuclear elements (LINEs) and short
interspersed nuclear elements (SINEs), account for a large proportion of the genomic sequences in many
species.
Telomeres
A telomere is a region of repetitive nucleotide sequences at each end of a chromatid, which protects the end
of the chromosome from deterioration or from fusion with neighboring chromosomes.
Telomere regions deter the degradation of genes near the ends of chromosomes by allowing chromosome
ends to shorten, which necessarily occurs during chromosome replication.
Without telomeres, the genomes would progressively lose information and be truncated after cell division
because the synthesis of Okazaki fragments requires RNA primers attaching ahead on the lagging strand. Over
time, due to each cell division, the telomere ends become shorter.
During cell division, enzymes that duplicate DNA cannot continue their duplication all the way to the end of
chromosomes. If cells divided without telomeres, they would lose the ends of their chromosomes, and the
necessary information they contain.
Telomeres are disposable buffers that cap the ends of the chromosomes. They are consumed during cell
division and are replenished by an enzyme, telomerase reverse transcriptase.
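The buffer idea can be made concrete with a toy model that counts how many divisions a telomere survives when each division removes a fixed number of base pairs and telomerase optionally adds some back. Everything here is an assumption for illustration: the function name, the parameters and the numbers are hypothetical and are not measured biological values.

```python
from typing import Optional


def divisions_until_exhausted(
    telomere_bp: int,
    loss_per_division: int,
    telomerase_gain: int = 0,
    max_divisions: int = 10_000,
) -> Optional[int]:
    """Return the division at which the telomere buffer runs out.

    Returns None if the buffer is still present after max_divisions.
    """
    if loss_per_division <= 0:
        raise ValueError("loss_per_division must be positive")
    remaining = telomere_bp
    for division in range(1, max_divisions + 1):
        # Each replication shortens the end; telomerase may restore part of it.
        remaining = remaining - loss_per_division + telomerase_gain
        if remaining <= 0:
            return division
    return None


# Illustrative numbers only.
print(divisions_until_exhausted(10_000, 100))                       # no telomerase
print(divisions_until_exhausted(10_000, 100, telomerase_gain=100))  # fully replenished
```

In this model, once the buffer reaches zero any further shortening would start to remove the genes near the chromosome end, which is the loss of information the notes describe.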
Coding sequences (protein-coding genes)
Protein coding sequences are DNA sequences that are transcribed into mRNA and in which the corresponding
mRNA molecules are translated into a polypeptide chain.
Every three nucleotides, termed a codon, in a protein coding sequence encode one amino acid in the
polypeptide chain. In some cases, different chassis may either map a given codon to a different amino acid or
may use different codons more or less frequently.
In the Registry, protein coding sequences begin with a start codon (usually ATG) and end with a stop codon
(usually with a double stop codon TAA TAA). Protein coding sequences are often abbreviated with the
acronym CDS.
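The codon-to-amino-acid mapping can be sketched as a short translation routine. The example assumes the standard genetic code, which, as noted above, may not hold for every chassis; the function name `translate` and the example sequence are illustrative choices, not anything defined by the Registry.

```python
from itertools import product

# Standard genetic code, listed in TCAG order; "*" marks a stop codon.
BASES = "TCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): amino_acid
    for codon, amino_acid in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(cds: str) -> str:
    """Translate a CDS into one-letter amino acids, stopping at the first stop codon."""
    cds = cds.upper()
    if len(cds) % 3 != 0:
        raise ValueError("CDS length must be a multiple of three")
    if not cds.startswith("ATG"):
        raise ValueError("CDS is expected to begin with the start codon ATG")
    protein = []
    for i in range(0, len(cds), 3):
        amino_acid = CODON_TABLE[cds[i:i + 3]]
        if amino_acid == "*":  # stop codon: translation ends here
            break
        protein.append(amino_acid)
    return "".join(protein)


# ATG GCT GAA AAA followed by the double stop TAA TAA.
print(translate("ATGGCTGAAAAATAATAA"))
```

The second TAA is never reached during translation; it serves only as a backup stop signal.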
Although protein coding sequences are often considered to be basic parts, they can in fact be composed of
one or more regions, called protein domains. Thus, a protein coding sequence could be entered either as a
basic part or as a composite part of two or more protein domains.
The N-terminal domain of a protein coding sequence is special in a number of ways. First, it always contains a
start codon, spaced at an appropriate distance from a ribosomal binding site. Second, many coding regions
have special features at the N terminus, such as protein export tags and lipoprotein cleavage and attachment
tags. These occur at the beginning of a coding region, and therefore are termed Head domains.
A protein domain is a sequence of amino acids that folds relatively independently and is evolutionarily
shuffled as a unit among different protein coding regions. The DNA sequence of such a domain must maintain
in-frame translation, and thus its length is a multiple of three bases. Since these protein domains lie within a
protein coding sequence, they are called Internal domains. Certain Internal domains have particular functions in
protein cleavage or splicing and are termed Special Internal domains.
Similarly, the C-terminal domain of a protein is special, containing at least a stop codon. Other special
features, such as degradation tags, are also required to be at the extreme C-terminus. Again, these domains
cannot function when internal to a coding region, and are termed Tail domains.
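The constraints on Head, Internal and Tail domains can be summarized as a small assembly check. This is only a sketch of the rules stated above; `assemble_cds` and the example sequences are hypothetical and do not correspond to any actual Registry tool or part.

```python
from typing import List

STOP_CODONS = {"TAA", "TAG", "TGA"}


def assemble_cds(head: str, internals: List[str], tail: str) -> str:
    """Join Head, Internal and Tail domains after checking the stated rules."""
    parts = [head, *internals, tail]
    for part in parts:
        # Every domain must keep translation in frame.
        if len(part) % 3 != 0:
            raise ValueError(f"domain length {len(part)} is not a multiple of three")
    if not head.startswith("ATG"):
        raise ValueError("Head domain must begin with the start codon ATG")
    if tail[-3:] not in STOP_CODONS:
        raise ValueError("Tail domain must end with a stop codon")
    return "".join(parts)


# Hypothetical domains: a Head, one Internal, and a Tail with a double stop.
cds = assemble_cds("ATGAAA", ["GCTGAA"], "CATTAATAA")
print(cds)
```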
Human Genetic Disorders
Some genetic disorders only cause disease in combination with the appropriate environmental factors (such as
diet).
With this caveat, genetic disorders may be described as clinically defined diseases caused by genomic DNA
sequence variation.
In the most straightforward cases, the disorder can be associated with variation in a single gene. For example,
cystic fibrosis is caused by mutations in the CFTR gene, and is the most common recessive disorder in Caucasian
populations, with over 1,300 different mutations known.[52]
Disease-causing mutations in specific genes are usually severe in terms of gene function and are fortunately
rare; individual genetic disorders are therefore similarly rare.
However, since there are many genes that can vary to cause genetic disorders, in aggregate they constitute a
significant component of known medical conditions, especially in pediatric medicine.
Molecularly characterized genetic disorders are those for which the underlying causal gene has been
identified. Currently there are approximately 2,200 such disorders annotated in the OMIM database.