DNA data storage

Course name : Seminar(IT331)
Faculty Guide : Prof. Nikita Patel
Presented By : Ravi Vaniya (15IT036)
Sanat Dhobi (15IT027)

From Magnetic drive to Genomic drive

Synopsis
 Introduction
 History (Evolution of Memory Storage Devices)
 Challenges of Big Data
 What is DNA? Why DNA? (A biological perspective)
 DNA data storage
 How is data stored? (Algorithms, techniques, etc.)
 Current research worldwide (case study by Microsoft)
 Pros and Cons
 Application and Future scope

Introduction
 Deoxyribonucleic acid (DNA) is a molecule that
carries the genetic (hereditary) instructions used
in the growth, development and functioning of all
known living organisms and many viruses.
 Most DNA molecules consist of
two biopolymer strands coiled around each other
to form a double helix.
 The information in DNA is stored as a code made
up of four nitrogen bases: adenine (A), guanine
(G), cytosine (C), and thymine (T).
 Nucleotide = Nitrogen base + Sugar + Phosphate.
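
Because DNA uses a four-letter alphabet, each base can in principle carry two bits. The sketch below illustrates that idea only; the particular bit-to-base table and the function names encode and decode are assumptions made for illustration, not a scheme taken from this presentation or from any research group. Practical schemes add error correction and avoid long runs of the same base.

```python
# Illustrative mapping of two bits per base (an assumption, not a published scheme).
BITS_TO_BASE = {"00": "A", "01": "C", "10": "G", "11": "T"}
BASE_TO_BITS = {base: bits for bits, base in BITS_TO_BASE.items()}


def encode(data: bytes) -> str:
    """Turn bytes into a string of A/C/G/T, four bases per byte."""
    bits = "".join(f"{byte:08b}" for byte in data)
    return "".join(BITS_TO_BASE[bits[i:i + 2]] for i in range(0, len(bits), 2))


def decode(strand: str) -> bytes:
    """Invert encode(): read two bits per base and regroup them into bytes."""
    bits = "".join(BASE_TO_BITS[base] for base in strand)
    return bytes(int(bits[i:i + 8], 2) for i in range(0, len(bits), 8))


if __name__ == "__main__":
    strand = encode(b"Hi")
    print(strand)          # CAGACGGC
    print(decode(strand))  # b'Hi'
```

With this table, one byte always becomes four bases, so the strand length is four times the data length.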

History
(Evolution of Memory Storage Devices)

Earlier devices
 In the mid-1700s – Punch card
Used to input both programs and data.
Used as early as 1725 in the textile industry (for controlling
mechanized textile looms).
 In 1946 – Selectron tube
Capacity: 32 to 512 bytes.
A 4096-bit Selectron was 10 inches long and 3 inches wide.
Cons: expensive, with production problems.

Earlier devices …
 In 1932 – Magnetic drum memory
Memory capacity - 10 kB.
 In 1951 – Magnetic tape
 In 1956 – Hard disk drive
 IBM Model 350 - It had 50 24-inch discs with a total storage
capacity of 5 million characters (just under 5 MB).
 In 1971 – First Floppy drive (Diskette).
 In 1978 – Compact disc
 In 1980 – Hard disk drive (First 1 GB drive)

After 1990s …
 DVD and Flash storage (like SD card).
 Micro drive
 Holography.
 Cloud storage.

History
 The idea of recording, storing and retrieving
information on DNA molecules was first proposed
by Mikhail Neiman.
 He published the idea in 1964–65 in the
Radiotekhnika journal, USSR (now Russia).
At that time the technology was
referred to as MNeimONics (Mikhail Neiman
OligoNucleotides).

Introduction
 What is big data?
Big data is a term for data sets that are so large or complex that
traditional data processing application software is inadequate to deal
with them.
 Problem for existing DBMS…
 Solutions..
1. Use software/framework
2. Some new technology

Issues
1. Data Volume
2. Data Velocity
3. Data Variety
4. Data Value
5. Data Complexity
Example: Google Maps

Challenges
 Privacy and security
 Data access and sharing of information
 Analytical challenges
 Human resource and manpower
 Technical – fault tolerance, scalability, quality of data

Solution – 1 : Framework/Software
 Hadoop
Hadoop is an open-source framework (by Apache) for storing and processing big
data in a distributed environment across clusters of computers using simple
programming models. It is designed to scale up from single servers to thousands of
machines, each offering local computation and storage.
 Let’s see how Hadoop works.

[Figure: Traditional Approach vs. Google’s Solution]
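
Hadoop's processing model is MapReduce: input is split into chunks, a map step emits key/value pairs from each chunk, the pairs are grouped by key, and a reduce step combines each group. The sketch below imitates those three steps for a word count in a single Python process. It does not use any Hadoop API, and the names map_phase, shuffle, reduce_phase and word_count are illustrative assumptions.

```python
from collections import defaultdict


def map_phase(chunk: str):
    """Map: emit (word, 1) for every word in one input chunk."""
    for word in chunk.lower().split():
        yield word, 1


def shuffle(pairs):
    """Shuffle: group all emitted values by their key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(groups):
    """Reduce: combine the values of each key into a single count."""
    return {key: sum(values) for key, values in groups.items()}


def word_count(chunks):
    """Run map, shuffle and reduce over a list of text chunks."""
    pairs = (pair for chunk in chunks for pair in map_phase(chunk))
    return reduce_phase(shuffle(pairs))


if __name__ == "__main__":
    # Each string stands in for a chunk that a cluster would hand to a different machine.
    print(word_count(["big data", "big DNA data", "DNA"]))
```

In a real cluster the map and reduce steps run on many machines in parallel; here they run one after another, which is enough to show the data flow.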

Why DNA ?
1. Density of information that can be stored
- one gram of single-stranded DNA could store as much as an exabyte
(10^18 bytes).
2. DNA storage is not rewritable
- good for archiving records
3. Preservation
- DNA can still be sequenced from dried mummies thousands of
years old, but such sequences are rarely complete.
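
To put the density figure in perspective, the arithmetic can be sketched as below. It takes the estimate of about one exabyte per gram quoted above at face value; the function name grams_needed is an illustrative assumption, and real systems would need extra DNA for redundancy and addressing.

```python
# Estimate quoted above: about one exabyte (10^18 bytes) per gram of DNA.
BYTES_PER_GRAM = 10**18


def grams_needed(num_bytes: int) -> float:
    """Mass of DNA (in grams) needed for num_bytes, ignoring any overhead."""
    return num_bytes / BYTES_PER_GRAM


if __name__ == "__main__":
    petabyte = 10**15
    print(grams_needed(petabyte))   # grams for one petabyte
    print(grams_needed(5_000_000))  # grams for the IBM Model 350's ~5 MB
```

Under this estimate, one petabyte would need about a milligram of DNA.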

Polymerase Chain Reaction
 PCR is a technique to make many copies of a specific DNA region in
vitro (in a test tube rather than an organism).
 PCR relies on a thermostable DNA polymerase, Taq polymerase, and
requires DNA primers designed specifically for the DNA region of
interest.
 In PCR, the reaction is repeatedly cycled through a series of
temperature changes, which allow many copies of the target region to
be produced.
 PCR has many research and practical applications. It is routinely used
in DNA cloning, medical diagnostics, and forensic analysis of DNA.
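
The effect of repeated cycling can be sketched with a simple growth model: in the ideal case every cycle doubles the number of copies of the target region. The function name pcr_copies and the efficiency parameter are assumptions for illustration; real reactions fall short of perfect doubling and level off in later cycles.

```python
def pcr_copies(initial_copies: int, cycles: int, efficiency: float = 1.0) -> float:
    """Copies after a number of cycles.

    efficiency = 1.0 means perfect doubling per cycle (idealized);
    lower values model a reaction that copies only part of the templates.
    """
    return initial_copies * (1 + efficiency) ** cycles


if __name__ == "__main__":
    # One starting molecule, 30 ideal cycles: 2**30 copies.
    print(pcr_copies(1, 30))
    # Same run at an assumed 90% efficiency per cycle.
    print(pcr_copies(1, 30, efficiency=0.9))
```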

Advantages
 The density of stored information is very high: one gram of
single-stranded DNA could store as much as an exabyte.
 DNA storage is not rewritable, which makes it well suited to archiving records.
 DNA can be preserved for a long time.
 DNA can maintain its integrity without any power supply. Also, its
small size and weight make it easy to store and transport.
 DNA is less susceptible to technical failures.

Disadvantages
 High cost of DNA synthesis (around US$12,400 per
megabyte of data stored).
 Data is read back at low speed.
 DNA is not rewritable, i.e. the stored information cannot be updated
without repeating the entire storage process.
 DNA does not allow random access either: to access a
particular part of the stored data, the entire stored information must
be decoded.
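
The cost figure above scales linearly with data size, which the short sketch below makes concrete. It uses only the per-megabyte figure quoted in this list; the function name synthesis_cost_usd and the choice of 10^6 bytes per megabyte are assumptions for illustration.

```python
# Figure quoted above: around US$12,400 per megabyte of data stored.
COST_PER_MB_USD = 12_400


def synthesis_cost_usd(num_bytes: int, bytes_per_mb: int = 10**6) -> float:
    """Estimated synthesis cost in US dollars for num_bytes of data."""
    return num_bytes / bytes_per_mb * COST_PER_MB_USD


if __name__ == "__main__":
    gigabyte = 10**9
    print(synthesis_cost_usd(gigabyte))  # cost of one gigabyte at the quoted rate
```

At the quoted rate, a single gigabyte would cost about US$12.4 million to synthesize, which is why DNA is currently considered only for archival data.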

