2. AGENDA
What we do @Lookout
Analytics Architecture
Event Ingestion Pipelines
Storm
Questions
3. BIO
Data Engineer at Lookout, started 2013
Previously at Demandbase, Project Perf. Corp.
6 years of Data Engineering
From Mumbai, India
etl.svbtle.com
14. AD HOC QUERYING
Hive CLI - Command-line interface to Hive
Hue - Toad-style GUI for ad hoc queries on Hive
RStudio - Statistical analysis
Shiny - Reporting/Querying tool based on R
Sparkle Pony (homegrown Ruby app) - MySQL querying for stakeholders
Hadoop File System Browser
24. LANDING DATA IN HADOOP
Topologies write data to a landing directory in Hadoop using HDFS Bolt
Directories are rotated depending on latency requirements of downstream reports
Rotated directories are moved to the location of the table in Hive
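
The rotate-and-move step above can be sketched as follows. This is a minimal illustration only, not the Lookout implementation: it uses the local filesystem as a stand-in for HDFS, and the function name, the "current" landing subdirectory, and the "batch=<id>" directory naming are all hypothetical.

```python
import shutil
import tempfile
from pathlib import Path


def rotate_landing_dir(landing_root: Path, table_location: Path, batch_id: str) -> Path:
    """Move the filled landing directory under the table location.

    Assumptions (not from the slides): writers append to
    landing_root/"current", and each rotated directory lands at
    table_location/"batch=<batch_id>".
    """
    current = landing_root / "current"
    target = table_location / f"batch={batch_id}"
    if target.exists():
        # Refuse to merge into an existing batch directory.
        raise FileExistsError(f"{target} already exists")
    table_location.mkdir(parents=True, exist_ok=True)
    # Move the whole directory as one unit so readers of the table
    # location never see a half-written batch.
    shutil.move(str(current), str(target))
    # Recreate an empty landing directory for the next batch.
    current.mkdir(parents=True)
    return target


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        landing = Path(tmp) / "landing"
        table = Path(tmp) / "warehouse" / "events"
        (landing / "current").mkdir(parents=True)
        (landing / "current" / "part-0.txt").write_text("event-1\n")
        moved = rotate_landing_dir(landing, table, "0001")
        print(sorted(p.name for p in moved.iterdir()))
```

How often such a function is called would follow the latency requirement of the downstream report: a shorter rotation interval makes data visible in the table sooner, at the cost of more, smaller directories.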