Presentation offered at http://www.smartiotlondon.com/2016-seminar-programme/big-data-and-genomics-the-future-of-genetic-engineering
Bioinformatics: the marriage of biology and Big Data, and how this will change the way we perform genetic engineering.
This presentation explains how our company (Alkol Biotech) compares DNA strands, focusing on its development of the “EunergyCane” sugarcane crop: Europe’s only sugarcane variety. It explains the tools the company uses, such as Big Data, Machine Learning, and Fast Sequencing.
Learning Outcomes:
1 – Learn about the new field of “Bioinformatics”, which is the marriage of IT and biology
2 – Learn how Big Data is changing the game in genetic engineering
3 – Learn which tools are used and what results to expect
2. Low sequencing costs = lots of data
As the cost of sequencing a genome decreases, the DNA of more and
more organisms becomes publicly available, meaning more data
3. Low sequencing costs = lots of data
This problem grows if we consider initiatives such as the 1000 Genomes
Project, or if we were to sequence everyone in the US today (313 Exabytes)
4. genomics = lots of data
In fact, the number of “bytes” involved in each genome is in the
range of millions to billions
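
To make the “millions to billions” range concrete, here is a minimal sketch of the arithmetic. It assumes either one byte per base (plain text, as in a FASTA file) or a packed 2 bits per base, and it ignores quality scores, repeated reads, and metadata, which make real sequencing output far larger. The genome sizes are approximate round figures chosen for illustration, and `genome_bytes` is a hypothetical helper, not part of any tool named in this talk.

```python
# Rough storage estimate for a single genome sequence.
# Assumption: only the bases themselves are stored (no quality scores,
# no repeated reads, no metadata).

def genome_bytes(base_pairs: int, bits_per_base: int = 2) -> int:
    """Return the number of bytes needed to store `base_pairs` bases."""
    total_bits = base_pairs * bits_per_base
    return (total_bits + 7) // 8  # round up to whole bytes

# Approximate genome sizes in base pairs (illustrative round figures).
EXAMPLE_GENOMES = {
    "E. coli": 4_600_000,
    "human": 3_100_000_000,
}

for name, size in EXAMPLE_GENOMES.items():
    packed = genome_bytes(size)                     # 2 bits per base
    as_text = genome_bytes(size, bits_per_base=8)   # 1 byte per base
    print(f"{name}: {packed:,} bytes packed, {as_text:,} bytes as text")
```

Under these assumptions a bacterial genome lands in the millions of bytes and a human genome in the hundreds of millions to billions.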
5. genomics = lots of data
And if you are still unconvinced, take the “MinION”, by UK company
Oxford Nanopore, which sells for US$900, is the size of a USB stick, and
can sequence a human genome in 8 hours
7. Tools used = complicated
For genomics data we use ADAM, BLAST and several comparison tools
ADAM is an open-source, high-performance, distributed platform for
genomic analysis. ADAM defines:
1 - A data schema and layout on disk
2 - A Scala API
3 - A command line interface
BLAST is an alignment tool which finds regions of similarity between
sequences, for example matching “shotgun” chunks against a known strand.
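
As a rough illustration of how “shotgun” chunks can be matched against a longer strand, the sketch below indexes short seeds (k-mers) of a reference and places each chunk where it agrees best. This is a toy seed-and-compare scheme written for this note, not BLAST itself and not the ADAM API. The function names, the toy reference, and the choice of k = 4 are all assumptions made for illustration.

```python
from collections import defaultdict

def build_kmer_index(reference: str, k: int) -> dict:
    """Map every k-letter seed of the reference to its start positions."""
    index = defaultdict(list)
    for i in range(len(reference) - k + 1):
        index[reference[i:i + k]].append(i)
    return index

def best_placement(read: str, reference: str, index: dict, k: int):
    """Return (offset, matches) for the best ungapped placement of `read`,
    or None if no seed of the read occurs in the reference."""
    best = None
    tried = set()
    for j in range(len(read) - k + 1):
        for pos in index.get(read[j:j + k], ()):
            offset = pos - j  # where the read would start on the reference
            if offset in tried or offset < 0 or offset + len(read) > len(reference):
                continue
            tried.add(offset)
            window = reference[offset:offset + len(read)]
            matches = sum(a == b for a, b in zip(read, window))
            if best is None or matches > best[1]:
                best = (offset, matches)
    return best

# Toy data (made up for illustration, not real sequences).
reference = "ACGTTGCATGCCGATAGCTA"
index = build_kmer_index(reference, k=4)
for read in ["TTGCATGC", "CGATTGCT", "AAAAAAAA"]:
    print(read, best_placement(read, reference, index, k=4))
```

Real tools add gapped extension, scoring matrices, and statistical significance on top of this basic idea.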
8. An example = our project
We are currently using Big Data to find promising strands among millions
of DNA sequences, using the tools just described, as I will explain now
9. How we use it = to build new crops
The biobased industry (biofuels, bioplastics, etc.) is currently
trying to adapt to unsuitable feedstocks. That is exactly the opposite of
what mankind did with food, where it adapted crops to its feeding needs
10. Sugarcane = much more than sugar!
Among the feedstocks currently used by the biobased industries, one
stands out: sugarcane. However, it currently grows only in tropical
regions. A pity, considering the number of products derived from it.
11. Eunergycane = European sugarcane
Thus, being able to adapt sugarcane to grow in Europe would allow many
new products to be produced sustainably. We are halfway through that
project with our EUnergyCane variety, the only genuinely European one
12. a pine tree and an edelweiss?
Perhaps the only thing a pine tree and an edelweiss have in common
is that both can withstand cold places.
13. Looking for a philosopher’s stone
Thus, a comparison between the DNA strands of the pine tree and of the
edelweiss should reveal common regions, one of which may be responsible,
for example, for giving a crop the ability to withstand the cold
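
The idea of comparing two strands to reveal a common region can be sketched with a classic longest-common-substring search. This is a minimal illustration under simplifying assumptions (exact matches only, a single best region, short inputs). It is not the company's actual pipeline, and the two strings below are invented toy sequences, not real pine or edelweiss DNA.

```python
def longest_common_region(a: str, b: str) -> str:
    """Return the longest exact substring shared by `a` and `b`
    (the first one found in `a` if several have the same length)."""
    best_len, best_end = 0, 0
    prev = [0] * (len(b) + 1)  # match lengths for the previous row of `a`
    for i in range(1, len(a) + 1):
        curr = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            if a[i - 1] == b[j - 1]:
                # Extend the common run that ended at a[i-2], b[j-2].
                curr[j] = prev[j - 1] + 1
                if curr[j] > best_len:
                    best_len, best_end = curr[j], i
        prev = curr
    return a[best_end - best_len:best_end]

# Hypothetical toy strands, for illustration only.
strand_a = "ATGGCGTACGTTAGC"
strand_b = "TTCGTACGTTCCA"
print(longest_common_region(strand_a, strand_b))  # prints CGTACGTT
```

This dynamic-programming version takes time proportional to the product of the two lengths, so genome-scale comparisons rely on indexed or distributed approaches instead.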
14. How we use it = to build new crops
This is how we develop our work: by analyzing DNA strands of crops which
can resist the cold in order to find that “Philosopher’s Stone” which, when
inserted into sugarcane, would make it able to grow in Europe. For that,
new techniques such as CRISPR/Cas9 avoid the use of plasmids and
GMOs
15. Conclusion = big data is much more
Big Data is not only for gathering customer data at banks and telcos, but also a
valuable tool for finding new and unsuspected data in any area of human
knowledge.
Its use in genomics may allow us to find cures for otherwise incurable
diseases, develop new crops with increased capabilities, and much more
Thank you
alcosta@alkolbiotech.co.uk