VLDB 2013 Early Career Research Contribution Award Presentation
Abstract: Four years ago at VLDB 2009, a paper was published about a research prototype, called HadoopDB, that attempted to transform Hadoop, a batch-oriented scalable system designed for processing unstructured data, into a full-fledged parallel database system that can achieve real-time (interactive) query responses across both structured and unstructured data. In 2010 it was commercialized by Hadapt, a start-up that was formed to accelerate the engineering of the HadoopDB ideas and to harden the codebase for deployment in real-world, mission-critical applications. In this talk I will give an overview of HadoopDB and how it combines ideas from the Hadoop and database system communities. I will then describe how the project transitioned from a research prototype written by PhD students at Yale University into enterprise-ready software written by a team of experienced engineers. I will examine particular technical features that are required in enterprise Hadoop deployments, and technical challenges that we ran into while making HadoopDB robust enough to be deployed in the real world. The talk will conclude with an analysis of how starting a company affects the tenure process, and some thoughts for graduate students and junior faculty considering a similar path.