Presto is an open source distributed SQL query engine for running interactive analytic queries against data sources of all sizes ranging from gigabytes to petabytes. Presto was designed and written from the ground up for interactive analytics and approaches the speed of commercial data warehouses while scaling to the size of organizations like Facebook. One key feature in Presto is the ability to query data where it lives via a uniform ANSI SQL interface. Presto’s connector architecture creates an abstraction layer for anything that can be expressed in a row-like format, such as HDFS, Amazon S3, Azure Storage, NoSQL stores, relational databases, Kafka streams and even proprietary data stores. Furthermore, a single Presto query can combine data from multiple sources, allowing for analytics across your entire organization.
This talk will be co-presented by Facebook and Teradata, the two largest contributors to Presto. The talk will focus on Presto’s ability to query virtually any data source via it’s connector interface. Facebook and Teradata will present some of their use cases of Presto querying various data sources, discuss the existing connectors in Presto, and describe the anatomy of a connector.