Genomics and Health data is nowadays one of the hot topics requiring lots of computations and specially machine learning. This helps science with a very relevant societal impact to get even better outcome. That is why Apache Spark and its ADAM library is a must have.
This talk will be twofold.
First, we'll show how Apache Spark, MLlib and ADAM can be plugged all together to extract information from even huge and wide genomics dataset. Everything will be packed into examples from the Spark Notebook, showing how bio-scientists can work interactively with such a system.
Second, we'll explain how these methodologies and even the datasets themselves can be shared at very large scale between remote entities like hospitals or laboratories using micro services leveraging Apache Spark, ADAM, Play Framework 2, Avro and Tachyon.