At the StampedeCon 2013 Big Data conference in St. Louis, Jeff Melching, Big Data Engineer and Architect at Monsanto, discussed Legacy Analysis: How Hadoop Streaming Enables Software Reuse – A Genomics Case Study. The bioinformatics domain, and computational genomics in particular, has always faced the problem of computing analytics against very large data sets. Traditionally, these analytics have leveraged grid and compute-farm technologies. Moreover, the analytics software and algorithms have been built up over the past 30 years through contributions from both the public and private sectors, written in a number of programming languages. When these software packages are brought in house and combined with the skills and preferences of internal bioinformatics researchers, the result is a myriad of different technologies linked together in an analytics pipeline. The rise of technologies like MapReduce in Hadoop has made the execution of such pipelines much more efficient, but what about all those analytic pipelines I have built up over the years that aren’t written in MapReduce? Do I have to rewrite them? Do I have to know Java? This talk will explain how Hadoop Streaming can help you reuse instead of rewrite. It will also touch on techniques for packaging and deploying Hadoop applications without having to centrally manage software versions on the cluster.
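The reuse idea rests on Hadoop Streaming's contract: a mapper or reducer is just an executable that reads records from stdin and writes tab-separated key/value pairs to stdout, so an existing script in any language can be dropped in without touching the Java MapReduce API. As a minimal sketch (the k-mer counting task and all names here are illustrative assumptions, not from the talk), a genomics-flavored mapper might look like:

```python
#!/usr/bin/env python
# Hypothetical Hadoop Streaming mapper: emits fixed-length k-mers from
# FASTA-style sequence lines on stdin as "kmer\t1" pairs on stdout.
# Any language that honors this stdin/stdout contract works the same way.
import sys

K = 8  # k-mer length; an illustrative choice


def emit_kmers(lines, k=K, out=sys.stdout):
    """Write one tab-separated 'kmer\\t1' record per k-mer found."""
    for line in lines:
        line = line.strip()
        if not line or line.startswith(">"):  # skip blanks and FASTA headers
            continue
        seq = line.upper()
        for i in range(len(seq) - k + 1):
            out.write(f"{seq[i:i + k]}\t1\n")


if __name__ == "__main__":
    emit_kmers(sys.stdin)
```

Such a script would be submitted with the standard streaming jar, along the lines of `hadoop jar hadoop-streaming.jar -input seqs -output counts -mapper kmer_mapper.py -reducer sum_reducer.py -file kmer_mapper.py -file sum_reducer.py` (file names assumed for illustration); the existing tool is shipped to the cluster per job rather than preinstalled, which is the versioning point the talk raises.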