1. Snowflakes in the Cloud
Real world experience on a new
approach for Big Data
Robert Fehrmann
Principal Architect @ Snagajob
3. About Me
● Master's degree in Computer Science from
“Technische Universitaet Braunschweig”
● 25 years building the data tier for
applications in different verticals
● Evangelist for polyglot data
environments
● Community involvement
(MongoDB User Groups / DevOps)
4. Agenda
● The Snagajob Story:
○ How we ended up in the Big Data world using
Snowflake
● Gotchas
○ Interesting stories on using Snowflake
5. Funnel Analysis
750,000 postings every day
600,000 unique visitors
X% find the posting interesting
Y% apply for the posting (candidate)
Z%
Using analytics to understand the funnel:
- Geographical Analysis
- Customer Analysis
- Historical Analysis
- Industry Analysis
- Click-through rate & abandoning the search
- What makes a Posting Interesting, ...
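The funnel above can be illustrated with a minimal sketch that counts unique visitors per stage and the conversion rate from the previous stage. The stage names (`search`, `view_posting`, `apply`), the event tuples, and the `funnel_rates` helper are all hypothetical; the slides do not give the real event schema or the values of X, Y, and Z.

```python
# Hypothetical funnel stages, ordered from widest to narrowest.
STAGES = ["search", "view_posting", "apply"]


def funnel_rates(events):
    """Return (stage, unique_visitors, rate_vs_previous_stage) tuples.

    `events` is an iterable of (visitor_id, stage) pairs. The rate is None
    for the first stage and whenever the previous stage had no visitors.
    """
    visitors = {stage: set() for stage in STAGES}
    for visitor_id, stage in events:
        if stage in visitors:
            # A set keeps each visitor once per stage, so repeat views do not inflate counts.
            visitors[stage].add(visitor_id)

    rows = []
    previous = None
    for stage in STAGES:
        count = len(visitors[stage])
        rate = None if previous in (None, 0) else count / previous
        rows.append((stage, count, rate))
        previous = count
    return rows


# Made-up sample data, for illustration only.
sample_events = [
    ("v1", "search"), ("v2", "search"), ("v3", "search"), ("v4", "search"),
    ("v1", "view_posting"), ("v2", "view_posting"), ("v1", "view_posting"),
    ("v1", "apply"),
]

for stage, count, rate in funnel_rates(sample_events):
    shown = "n/a" if rate is None else f"{rate:.0%}"
    print(f"{stage:<14}{count:>4}  {shown}")
```

This sketch treats each stage independently. A stricter funnel would count only visitors who also appeared in the preceding stage, and at the data volumes mentioned on the slide the same aggregation would more likely run as SQL inside the warehouse than in application code.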
6. Event Collection Framework V1
[Architecture diagram. Components shown: Web (×3), Message Bus, LB, Tracking Service (×2), Flume (×3), Hadoop, Hue, Impala, Report Console, SQL-DW, Looker, Vertica]
7. Evolution
[Timeline diagram]
● Years marked: 2012, 2014, 2015, 2016
● “We want to be a cloud based company” (Peter Harris, CEO)
● Data warehouse & platform software (on premise)
● Vertica Data Warehouse and Hadoop (each shown twice)
● Move to Cloud
● Doesn’t solve all problems
● Search continues for a true cloud solution till ….
8. Goals for Next Generation Solution
● Horizontal Scalability
● PaaS
● Stability
● Ease of Use
● Can’t be more expensive
13. Gotcha #1
● Problem:
○ Funnel Analysis got slower over time
● Base Metrics
○ 15 Billion rows
○ Analysis on monthly dataset: 2 minutes per run
○ 3 medium clusters in DW during business hours
27. Other Features
● Undrop (DB, Table, Schema): no restore required
● Clone (DB, Table, Schema): metadata-only operation
● Native JSON Parsing (as well as CSV, Avro, XML, Parquet)
● Automatic Encryption of Data
● Automatic Query Optimization (no tuning)
● All Data in one place (single source of truth)
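As a rough illustration of the first three features, the sketch below builds the corresponding Snowflake SQL statements as strings: `UNDROP TABLE`, `CREATE TABLE ... CLONE`, and the `column:path::type` syntax for reading a field out of a JSON column. The helper functions and the table and column names (`events`, `events_dev`, `payload`, `posting_id`) are assumptions made up for this example, not part of the talk. Nothing here connects to a database; the strings would be passed to whatever SQL client is in use.

```python
import re

# Unquoted identifier: starts with a letter or underscore, then letters, digits, _ or $.
_IDENT = re.compile(r"^[A-Za-z_][A-Za-z0-9_$]*$")


def _check(name):
    """Reject anything that is not a plain unquoted identifier."""
    if not _IDENT.match(name):
        raise ValueError(f"not a simple identifier: {name!r}")
    return name


def undrop_table(table):
    """Statement that brings back a dropped table without a restore."""
    return f"UNDROP TABLE {_check(table)}"


def clone_table(source, target):
    """Statement that creates a clone of an existing table."""
    return f"CREATE TABLE {_check(target)} CLONE {_check(source)}"


def json_field(table, column, path, cast="string"):
    """Statement that reads one field from a JSON column via column:path::type."""
    parts = ".".join(_check(part) for part in path.split("."))
    return f"SELECT {_check(column)}:{parts}::{_check(cast)} FROM {_check(table)}"


# Hypothetical names, for illustration only.
print(undrop_table("events"))
print(clone_table("events", "events_dev"))
print(json_field("events", "payload", "posting_id"))
```

The identifier check is deliberately strict so that the helpers cannot be used to assemble arbitrary SQL from untrusted input; quoted identifiers and fully qualified names are left out to keep the sketch short.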