DNA Sequencing & Data Analysis Techniques

DNA Sequencing & Data Analysis
30-Nov-2016
Dr. Jassim Mohammed Abdo
Director of Duhok Research Center
PhD in Molecular Biology & Immunology
Issued by Ludwig-Maximilians University
Munich, Germany

History
 1953 - structure of DNA established as a double
helix.
 1970 - first method of DNA sequencing involved a
location specific primer extension strategy.
 1977 - Frederick sanger published a method for DNA
sequencing with chain terminating inhibitors.

 1977 - Allan Maxam and Walter Gilbert developed
DNA sequencing by chemical degradation.
 1977 - the first genome to be sequenced was that of
bacteriophage φX174.
 1990 - several new methods are developed in the mid
to late 90’s.
 2003 - Complete Human Genome Project.

• The first sequence of the human genome was
obtained using so-called »first generation«
sequencing technology.
• In the following years, »second« or »next
generation« sequencing (NGS) technologies were
developed, characterized by massive parallelization,
improved automation and speed, and, most
importantly, greatly reduced price.

For example, in 2001 the cost of sequencing a human genome
was almost $100 million; in 2015 it was just $1,245.

Primary nucleic acid (NA) sequence can be produced by
Sanger-based or NGS technologies

Sanger sequencing
Statistically, you get every possible fragment size.
(Figure modified after de.academic.ru)

How to read the sequence?
modified after www.mun.ca.jpg
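
The idea behind reading a Sanger ladder can be sketched in a few lines of Python. This is a simplified illustration, not a model of a real instrument: the primer is ignored, every possible termination is assumed to occur exactly once, and the function names `sanger_fragments` and `read_gel` are hypothetical. Each chain-terminating ddNTP produces fragments that end at every position of its base, and sorting all fragments by length gives the sequence of the newly synthesised strand.

```python
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}

def sanger_fragments(template):
    """Return, per ddNTP, the lengths of the terminated fragments.

    The template is given 5'->3'; the polymerase builds the new strand
    as its reverse complement. Simplification: no primer, and each
    possible termination happens exactly once.
    """
    new_strand = "".join(COMPLEMENT[base] for base in reversed(template))
    lanes = {"A": [], "C": [], "G": [], "T": []}
    for length, base in enumerate(new_strand, start=1):
        # A fragment of this length ends with the ddNTP of this base.
        lanes[base].append(length)
    return lanes

def read_gel(lanes):
    """Read the sequence by sorting fragments from shortest to longest."""
    base_at_length = {
        length: base for base, lengths in lanes.items() for length in lengths
    }
    return "".join(base_at_length[n] for n in sorted(base_at_length))

lanes = sanger_fragments("GATTACA")
print(lanes)            # fragment lengths per ddNTP reaction
print(read_gel(lanes))  # sequence of the synthesised strand, 5'->3'
```

The shortest fragment corresponds to the first base after the primer, which is why a gel or capillary trace is read from the smallest fragment upwards.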

Overview of major DNA reading technologies

Three major nucleotide databanks host primary sequence data

The batch submissions originate mostly from
sequencing centers

Primary sequence databases are synchronised and
every sequence receives a unique identifier

One sequence entry contains three categories of
different types of information

Data analysis
• BioEdit
• Chromas
• DNASTAR
• Lasergene
• Gegenees
• DNA Master
• OligoAnalyzer
• DNA Club

After a sample has been sequenced, the result needs to be
interpreted. The normal steps in the process include:
1. Open the chromatogram file, check the quality of the sequence
and determine the length of high-quality sequence.
2. Differentiate the vector sequence (if used) from the insert's
sequence, using restriction sites as markers.
3. If you know exactly what the sequence should be, make a pairwise
alignment of the DNA sequences using BioEdit, ClustalW or
NCBI's BLAST2.
4. If the DNA sequence contains variations from the consensus,
perform a pairwise alignment of the predicted peptide
sequences.
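
Steps 3 and 4 can be illustrated with a small, self-contained Python sketch. It is not what BioEdit, ClustalW or BLAST do internally; it uses a plain Needleman-Wunsch global alignment with arbitrarily chosen scores (match +1, mismatch -1, gap -2), the standard genetic code, and two made-up example sequences. The function names are hypothetical.

```python
from itertools import product

# Standard genetic code, codons ordered with bases T, C, A, G.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product("TCAG", repeat=3), AMINO_ACIDS)
}

def global_align(a, b, match=1, mismatch=-1, gap=-2):
    """Needleman-Wunsch global alignment; returns both gapped strings."""
    score = [[0] * (len(b) + 1) for _ in range(len(a) + 1)]
    for i in range(1, len(a) + 1):
        score[i][0] = i * gap
    for j in range(1, len(b) + 1):
        score[0][j] = j * gap
    for i in range(1, len(a) + 1):
        for j in range(1, len(b) + 1):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(score[i - 1][j - 1] + sub,
                              score[i - 1][j] + gap,
                              score[i][j - 1] + gap)
    # Trace back from the bottom-right corner to recover the alignment.
    out_a, out_b, i, j = [], [], len(a), len(b)
    while i > 0 or j > 0:
        sub = match if i > 0 and j > 0 and a[i - 1] == b[j - 1] else mismatch
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + sub:
            out_a.append(a[i - 1]); out_b.append(b[j - 1]); i -= 1; j -= 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1]); out_b.append("-"); i -= 1
        else:
            out_a.append("-"); out_b.append(b[j - 1]); j -= 1
    return "".join(reversed(out_a)), "".join(reversed(out_b))

def translate(dna):
    """Translate frame 1 of a DNA sequence; incomplete last codon is dropped."""
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna) - 2, 3))

expected = "ATGGCCAAGTAA"  # made-up reference sequence
observed = "ATGGCTAAGTAA"  # made-up sequencing result

aligned_expected, aligned_observed = global_align(expected, observed)
differences = [pos for pos, (x, y)
               in enumerate(zip(aligned_expected, aligned_observed), start=1)
               if x != y]
print(differences)                               # positions that differ
print(translate(expected), translate(observed))  # predicted peptides
```

In this example the DNA sequences differ at one position, but both translate to the same peptide, so the variation is silent. That is the reason for checking the predicted peptide sequences before deciding whether a variation matters.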

Analysing sequences and chromatograms
Molecular biologists sequence samples of DNA for a huge
range of reasons and we will explore this fundamental
technique here. 'Dye terminator sequencing' is currently the
preferred method used by molecular biologists for
sequencing of DNA samples.

Chromatogram files can be opened with BioEdit, Chromas, Lasergene (DNASTAR) and similar programs.
These programs display the chromatogram of the sequence; it is up to you to
determine the reliability of the sequence.
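
One common way to judge reliability is by per-base quality values (Phred scores), where a score Q corresponds to an estimated error probability of 10^(-Q/10). The sketch below assumes the quality values have already been extracted from the chromatogram file into a list; the threshold of Q20 (1 error in 100) is a frequently used convention, not a rule from this presentation, and the function names and example data are hypothetical.

```python
def error_probability(q):
    """Convert a Phred quality score into an estimated error probability."""
    return 10 ** (-q / 10)

def longest_high_quality_run(quals, threshold=20):
    """Return (start, end) of the longest run of bases with Q >= threshold.

    Indices are 0-based and `end` is exclusive, so sequence[start:end]
    is the high-quality part of the read.
    """
    best_start, best_end = 0, 0
    run_start = None
    # A sentinel value below any threshold closes a run at the end.
    for i, q in enumerate(list(quals) + [float("-inf")]):
        if q >= threshold:
            if run_start is None:
                run_start = i
        elif run_start is not None:
            if i - run_start > best_end - best_start:
                best_start, best_end = run_start, i
            run_start = None
    return best_start, best_end

sequence = "GCATGCCTAA"                             # made-up read
quals = [8, 12, 25, 40, 45, 50, 38, 15, 30, 9]      # made-up quality values

start, end = longest_high_quality_run(quals)
print(start, end, sequence[start:end])
print(error_probability(20))
```

Low-quality bases typically cluster at the beginning and end of a Sanger read, so the usable sequence is usually one stretch in the middle of the trace.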

How to submit a sequence to NCBI
We use BankIt if:
• we have a single sequence, a simple set of sequences (for example: 16S rRNA, matK,
ITS/rRNA, amoE, tefB, cytb, or COI sets), or a small batch of different sequences
• we prefer to use a web-based submission tool
• the feature annotation for our sequences is not complicated
• we do not require advanced sequence analysis tools
We use Sequin if:
• we prefer to work on our submission off-line
• we have a sequence or sequences that are complex
• we would like graphical viewing and editing options, including an alignment editor
• we would like the option to have network access to related analytical tools

DNA machine can sequence human genomes in hours

GenBank Sequence Submission Policy
• The GenBank database is intended for new sequence data that is determined and
annotated by the submitter.
• Sequences built or derived from other GenBank primary data, intended for
the Third Party Annotation (TPA) database, may be submitted through BankIt.
• The following types of submissions are NOT acceptable:
  • sequences less than 200 nucleotides long, unless they represent complete
  exons, non-coding RNAs (ncRNAs), microsatellites or ancient DNA
  • non-contiguous sequences that have been artificially joined; for example,
  multiple exons without their intervening introns or without a 'gap'
  representing any missing sequence
  • single sequences that are a mix of molecule types, such as a mix of genomic
  and mRNA sequence data
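
The three exclusion rules above can be written as a simple pre-submission check. This is only an illustrative sketch: the function `submission_problems`, its parameters and the category labels are hypothetical and are not part of BankIt, Sequin or any NCBI tool.

```python
# Categories that may be shorter than 200 nucleotides (labels are illustrative).
SHORT_SEQUENCE_EXCEPTIONS = {"complete exon", "ncRNA", "microsatellite", "ancient DNA"}

def submission_problems(length, category=None,
                        molecule_types=("genomic DNA",),
                        joined_without_gap=False):
    """Return a list of policy problems; an empty list means none were found."""
    problems = []
    if length < 200 and category not in SHORT_SEQUENCE_EXCEPTIONS:
        problems.append("shorter than 200 nucleotides")
    if joined_without_gap:
        problems.append("non-contiguous sequences artificially joined")
    if len(set(molecule_types)) > 1:
        problems.append("mix of molecule types")
    return problems

print(submission_problems(150))
print(submission_problems(150, category="microsatellite"))
print(submission_problems(900, molecule_types=("genomic DNA", "mRNA")))
```

A check like this only covers the rules listed on the slide; the final decision on a submission rests with GenBank staff.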
