ABSTRACT: The ongoing big data revolution has revolutionized the way in which technology is used to empower new business segments like social networking and transform old business segments like traditional retail. However, the DNA that is used to build data processing platform is evolving quite rapidly. There is a plethora of competing tools, technologies, and “religion” for how to build state-of-the-art data analysis frameworks. In this talk, I will go over five ways to build scalable high-performance long-lasting data analysis frameworks in the wrong way. Surprisingly, the industry is full of examples of organization building frameworks in this “wrong” way. Since the “right” way to build a technology framework is dependent on the key business drivers, it is my hope that this talk will spur a discussion on what is the “right” way for Pinterest. The talk will focus on technologies including “data plumbing” (e.g. tools in the Hadoop ecosystem), and statistical modeling methods (e.g. R and Python). In this talk, I’ll try to connect to platform builders, data scientists, and business decision makers.
BIO: Jignesh Patel is a Professor in Computer Sciences at the University of Wisconsin-Madison, where he also earned his Ph.D. He has worked in the area of databases (now fashionably called “big data”) for over two decades. He has won several best paper awards, and industry research awards. He is the recipient of the Wisconsin COW teaching award, and the U. Michigan College of Engineering Education Excellence Award. He has a strong interest in seeing research ideas transition to actual products. His Ph.D. thesis work was acquired by NCR/Teradata in 1997, and he also co-founded Locomatix -- a startup that built a platform to power real-time data-driven mobile services. Locomatix became part of Twitter in 2013. He is an ACM Distinguished Scientist and an IEEE Senior Member. He also serves on the board of Lands’ End, and advises a number of startups.