How to build a SQL-based data warehouse for a trillion rows in Python
By Ville Tuulos
http://tuulos.github.io/pydata-2014/...
Upcoming SlideShare
Loading in …5
×

How to build a SQL-based data warehouse for a trillion rows in Python by Ville Tuulos PyData SV 2014

717 views

Published on

In this talk, we show how and why AdRoll built a custom, high-performance data warehouse in Python which can handle hundreds of billions of data points with sub-minute latency on a small cluster of servers. This feat is made possible by a non-trivial combination of compressed data structures, meta-programming, and just-in-time compilation using Numba, a compiler for numerical Python. To enable smooth interoperability with existing tools, the system provides a standard SQL-interface using Multicorn and Foreign Data Wrappers in PostgreSQL.

Published in: Technology
0 Comments
0 Likes
Statistics
Notes
  • Be the first to comment

  • Be the first to like this

No Downloads
Views
Total views
717
On SlideShare
0
From Embeds
0
Number of Embeds
19
Actions
Shares
0
Downloads
5
Comments
0
Likes
0
Embeds 0
No embeds

No notes for slide

How to build a SQL-based data warehouse for a trillion rows in Python by Ville Tuulos PyData SV 2014

  1. 1. How to build a SQL-based data warehouse for a trillion rows in Python By Ville Tuulos http://tuulos.github.io/pydata-2014/#/

×