Data Science at Scale @ barricade.io

@davidcoallier
Part of an amazing team at Barricade.io

Got all?
Having the three is real hard!

Is that it?
Well don’t forget your purpose.

You are not an economist.
ɪˈkɒnəmɪst/: Someone with all the answers, and none of the questions.

Analyse Results
You will be sad.

Conversate
Talk about your findings.

Good Chats
Imply egoless and collaborative data scientists.

1. Hacking
2. Maths & Stats
3. Expertise

1. Question
2. Be Pragmatic
3. Features
4. Analyse
5. Share

A team!
Rarely a single-person effort.

An Example
Fraud Prevention — Business Prevention

I knew better.
Obviously… duh

We didn’t share.
Science has historically been shared.

Empathise.
Use human language, not lingo.

We’re still small
About a billion data points a day.

Humble Beginnings
Typically… a Queue and an API.

This had issues.
Hard to scale, hard to decouple, etc.

Enter the
Lambda Architecture.

Speed Layer: U new behaviour from new data
Batch Layer: All classified behaviour since T
Serve Layer: Batch Layer ∪ Speed Layer
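
A minimal sketch of the serve layer as a set union, assuming each layer exposes its classified behaviour as a set of (entity, label) pairs. The names `batch_view`, `speed_view` and `serve`, and the sample values, are hypothetical and only illustrate the idea; they are not taken from the Barricade system.

```python
def serve(batch_view: frozenset, speed_view: frozenset) -> frozenset:
    """Serve layer: everything the batch layer knows, plus what the
    speed layer has seen since the last batch run."""
    return batch_view | speed_view


# Hypothetical, made-up classifications for illustration only.
# Batch layer: all classified behaviour since T.
batch_view = frozenset({("session-1", "normal"), ("session-2", "suspicious")})
# Speed layer: new behaviour from new data, not yet covered by a batch run.
speed_view = frozenset({("session-3", "normal")})

merged = serve(batch_view, speed_view)
```

In a real deployment the two views would live in separate stores and the union would happen at query time; the sets here only stand in for those stores.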

Kafka Queue.
Distributed messaging system
Append-only log
Consumers have offsets
Partition for parallelism
Replicate for redundancy
Message order guaranteed, per-partition
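
The properties above can be sketched with a toy in-memory log. This is not the Kafka client API: `ToyLog`, `append` and `poll` are hypothetical names, replication is left out, and the key-hashing scheme is an assumption chosen for the example.

```python
from collections import defaultdict
from zlib import crc32


class ToyLog:
    """Toy partitioned, append-only log with per-consumer offsets."""

    def __init__(self, num_partitions: int):
        # One append-only list per partition.
        self.partitions = [[] for _ in range(num_partitions)]
        # Each consumer tracks its own offset in every partition.
        self.offsets = defaultdict(lambda: [0] * num_partitions)

    def append(self, key: str, value):
        # Same key -> same partition, so order is kept per key.
        p = crc32(key.encode()) % len(self.partitions)
        self.partitions[p].append(value)
        return p, len(self.partitions[p]) - 1  # (partition, offset)

    def poll(self, consumer: str, partition: int, max_records: int = 10):
        # Read from the consumer's offset and advance it; the log is untouched.
        start = self.offsets[consumer][partition]
        records = self.partitions[partition][start:start + max_records]
        self.offsets[consumer][partition] = start + len(records)
        return records


log = ToyLog(num_partitions=2)
p, first_offset = log.append("user-a", "login")
_, second_offset = log.append("user-a", "checkout")
```

Because reading only moves a consumer's offset, a second consumer polling the same partition sees the same messages in the same order.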
