Greg Brockman
Andy Brody
Christian Anderson
Philipp Antoni
Carl Jackson
Jonas Schneider

Siddarth Chandrasekaran
Ludwig Pettersson
Nelson Elhage
Steve Woodrow
Jorge Ortiz
Participation
● ~7.5k participants from 97 countries, 6 continents
● ~9.5k unique IP logins
● 216 capturers
Solvers
CAPTURE THE FLAG

ARCHITECTURE
Or, an illustrated guide to flowcharts
What’s the holy grail of scaling?

Ø
Challenges
Scaling up to unknown capacity
- 53 instances, 800 ECU at peak

User isolation
Reliability and Availability (*-abilities)
git push: what happens?
git push → lvl0-asdf@stripe-ctf.com:level0
stripe-ctf.com. IN A

[Architecture diagram. Boxes shown: https://stripe-ctf.com/; gate (nginx, haproxy); ctfweb (sinatra); submitter (poseidon); colossus (LDAP, scoring); ctfdb (mongo); queue (RabbitMQ); build (docker); worker (docker); test case generator; gitcoin]
What Went Wrong
– containerization
– garbage collection
- containers, filesystems, disk space
– system stability
– bugs, misconfiguration
What Went Right
– containerization
– service architecture
- queueing, separation of roles
– load balancing
– horizontal scaling
Level 0
The Mysterious Program
Level 0: Sets
Level 0: mmap
● mmap, munmap - map or unmap files or devices into memory
● mmap the dictionary into memory
● You can actually mmap stdin as well!
● Binary search!
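A minimal sketch of the mmap-plus-binary-search idea, assuming the dictionary is a non-empty file of newline-terminated words sorted in byte order (for example with `LC_ALL=C sort`). The function names `open_dictionary` and `contains` are illustrative, not taken from any solution.

```python
import mmap


def open_dictionary(path):
    """Map the whole dictionary file read-only into memory."""
    with open(path, "rb") as f:
        # mmap keeps its own duplicate of the file descriptor.
        return mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ)


def contains(mm, word: bytes) -> bool:
    """Binary search over byte offsets of a byte-sorted word list."""
    lo, hi = 0, len(mm)  # lo and hi always sit on line boundaries
    while lo < hi:
        mid = (lo + hi) // 2
        # Find the start of the line that contains offset mid.
        start = mm.rfind(b"\n", lo, mid) + 1
        if start == 0:  # no newline in [lo, mid): the line starts at lo
            start = lo
        end = mm.find(b"\n", start)
        if end == -1:  # last line without a trailing newline
            end = len(mm)
        line = mm[start:end]
        if line == word:
            return True
        if line < word:
            lo = end + 1
        else:
            hi = start
    return False
```

No parsing or per-word allocation happens up front; the kernel pages in only the parts of the file the search touches.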
Level 0: Bloom filters
● Hash function: f(str) => int
● Look at the result of N hash functions.
● Probabilistic.
● False positives, but no false negatives.
● If you run into false positives, just push again!
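A minimal Bloom filter sketch along these lines. Deriving the N hash functions by salting SHA-1 with an index is an assumption made for brevity; the class and parameter names are illustrative.

```python
import hashlib


class BloomFilter:
    def __init__(self, num_bits: int, num_hashes: int):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray((num_bits + 7) // 8)

    def _positions(self, item: str):
        # N hash functions: one hash, salted with the function index.
        for i in range(self.num_hashes):
            digest = hashlib.sha1(f"{i}:{item}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, item: str) -> None:
        for pos in self._positions(item):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def __contains__(self, item: str) -> bool:
        # True may be a false positive; False is always correct.
        return all(
            self.bits[pos // 8] & (1 << (pos % 8))
            for pos in self._positions(item)
        )
```

Every added word is always reported as present. A word that was never added is usually rejected, but can occasionally be accepted, which is the false-positive case mentioned above.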
Level 0: Minimal perfect hashing
● Given dictionary D = {w₁, w₂, …, wₙ}
● use MATH to generate a hash function f
● f : D → {0..n-1} is one-to-one
● aka every word hashes to a different small integer
● So you can build a no-collisions hash table
● CMPH - C Minimal Perfect Hashing Library
● Build this ahead of time, link it to the binary
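A toy illustration of what "one-to-one onto {0..n-1}" means. It assumes a tiny dictionary and simply brute-forces a seed, which is not how CMPH works and does not scale to a real word list. The function names are hypothetical.

```python
import hashlib


def perfect_hash(word: str, seed: int, n: int) -> int:
    """Seeded hash of a word into the range 0..n-1."""
    digest = hashlib.sha1(f"{seed}:{word}".encode()).digest()
    return int.from_bytes(digest[:8], "big") % n


def find_perfect_seed(words, max_seed=100_000):
    """Try seeds until every word lands in its own slot."""
    n = len(words)
    for seed in range(max_seed):
        slots = {perfect_hash(w, seed, n) for w in words}
        if len(slots) == n:  # no two words share a slot
            return seed
    raise ValueError("no seed found; use a real MPH library for large sets")
```

Once a seed is found, a plain array of n entries indexed by `perfect_hash` works as a collision-free table for exactly those words.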
Level 1
Gitcoin
Gitcoin
commit 000000216ba61aecaafb11135ee43b1674855d6ff7
Author: Alyssa P Hacker <alyssa@example.com>
Date:   Wed Jan 22 14:10:15 2014 -0800
Give myself a Gitcoin! nonce: tahf8buC
diff --git a/LEDGER.txt b/LEDGER.txt
index 3890681..41980b2 100644
--- a/LEDGER.txt
+++ b/LEDGER.txt
@@ -7,3 +7,4 @@ carl: 30
gdb: 12
nelhage: 45
jorge: 30
+user-aph123: 1
Why crypto currencies?
– distributed currency system
– git security model
– massively parallel problems
– using the right tools for the job
Git object model
$ git cat-file -p 000000effe7d920b391a24633e7298469dcf51b5
tree 7da86a5b10ff6db916598b653ce63e1dc0cb73c8
parent 0000000df4815161b72f4c5ed23e9fbf5deed922
author Alyssa P Hacker <alyssa@example.com> 1391129313 +0000
committer Alyssa P Hacker <alyssa@example.com> 1391129313 +0000
Mined a Gitcoin!
nonce 0302d1e2

$ git show 000000
error: short SHA1 000000 is ambiguous.
error: short SHA1 000000 is ambiguous.
Remember wafflecopter?

Want to do as few rounds of SHA1 as possible
Compute prefix, update only for nonce
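A sketch of the "compute prefix, update only for nonce" trick using `hashlib`. It assumes the nonce is a fixed-width hex string at the very end of the commit body, so the `commit <length>\0` object header stays constant. The function name `mine` and its parameters are illustrative.

```python
import hashlib


def mine(body_prefix: bytes, target_prefix: str, nonce_len: int = 8):
    """Find a nonce so the git commit object hash starts with target_prefix.

    The hashed object is b"commit <len>\\0" + body_prefix + nonce.
    Returns (nonce, hex_digest).
    """
    total_len = len(body_prefix) + nonce_len
    header = b"commit %d\x00" % total_len
    # Hash the header and everything before the nonce exactly once.
    base = hashlib.sha1(header + body_prefix)
    for counter in range(16 ** nonce_len):
        nonce = f"{counter:0{nonce_len}x}".encode()
        h = base.copy()  # reuse the precomputed prefix state
        h.update(nonce)  # only the nonce bytes are hashed per attempt
        digest = h.hexdigest()
        if digest.startswith(target_prefix):
            return nonce, digest
    raise ValueError("nonce space exhausted")
```

SHA-1 works on 64-byte blocks, so the saving grows with the number of full blocks that precede the nonce.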
SHA1 is Embarrassingly Parallel

– Each miner can be totally independent
– Each miner requires little memory
– Each miner requires little code
Tools: a spoon (bash)
Tools: a shovel (go)
Tools: an army of backhoes (GPU)
Tools: big machines (ASIC)
Bonus Round
Hash Rates:
bash: 400 H/s
our go miners: 1.9 MH/s
100 cores EC2: 130 MH/s
GPU: 1-2 GH/s
Network: ~10 GH/s
WE DID IT!
Dogecoin is only at 80 GH/s
Level 2
DDOS Defense
Elephants and mice: a DDOS model
The stub of a Node.js proxy
Balance across the backends
1. Round robin
2. Choose backend with min. load
3. Randomize
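The three strategies can be sketched as interchangeable "pickers". This assumes the proxy tracks in-flight requests per backend in a dict called `load`; that dict and the function names are illustrative, not part of the level's stub.

```python
import itertools
import random


def round_robin(backends):
    """Cycle through the backends in order, ignoring load."""
    cycle = itertools.cycle(backends)
    return lambda load: next(cycle)


def least_loaded(backends):
    """Pick the backend with the fewest in-flight requests."""
    return lambda load: min(backends, key=lambda b: load[b])


def randomized(backends, rng=random):
    """Pick a backend uniformly at random."""
    return lambda load: rng.choice(backends)
```

Each picker takes the current load map and returns one backend, so they can be swapped without touching the rest of the proxy.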
Let the mice through
1. Reduce the overall load
2. Use an off-the-shelf solution
3. Learn to recognize a mouse
If you had a global view
Recognize a mouse
1. Threshold rate or number
2. Learn by hand or with automation
The top solutions
➔ Balance load in a simple way
➔ Learn to recognize a mouse
➔ Keep an eye on the backends
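A minimal sketch of the "threshold number" way to recognize a mouse: count requests per client and stop forwarding once a client exceeds a limit. The class name and the `max_requests` parameter are assumptions; a workable limit has to be learned from the traffic.

```python
from collections import Counter


class MouseDetector:
    def __init__(self, max_requests: int):
        self.max_requests = max_requests
        self.seen = Counter()  # client IP -> requests seen so far

    def allow(self, client_ip: str) -> bool:
        """Count the request and report whether the client still looks like a mouse."""
        self.seen[client_ip] += 1
        return self.seen[client_ip] <= self.max_requests
```

A rate threshold would look the same with the counter reset on a timer.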
Level 3
Instant Code Search
We’re sorry about the Scala
The problem
● Text search over ~100M of text
● Arbitrary substring search (not just whole words)
● There is a “free” indexing stage
● Distribute across up to 4 nodes
● Each node is limited to 500M of RAM
Search 101: Inverted Index
/tree/A: “the quick brown fox jumps over …”
/tree/B: “the fox is red”

“the” → [A, B]
“quick” → [A]
“brown” → [A]
“fox” → [A, B]
“red” → [B]
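The table above can be built in a few lines. This sketch assumes whitespace tokenization and uses only the visible part of the elided /tree/A text; `build_index` is an illustrative name.

```python
from collections import defaultdict


def build_index(docs):
    """Map each word to the set of documents that contain it."""
    index = defaultdict(set)
    for name, text in docs.items():
        for word in text.split():
            index[word].add(name)
    return index


docs = {
    "A": "the quick brown fox jumps over",  # truncated, as on the slide
    "B": "the fox is red",
}
index = build_index(docs)
```

A whole-word query is then a single dictionary lookup such as `index["fox"]`.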
Search 102: Arbitrary Substring
● “trigram index”
● Store an inverted index of trigrams
● “the quick brown fox …” →
“the”, “he_”, “e_q”, “_qu”, “qui”, “uic”, …

● To query, look up all trigrams in the search term, and intersect
● Search(“rown”) → index[“row”] ∩ index[“own”]
○ Check each result to verify the match
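A small sketch of the trigram scheme, assuming documents fit in memory as strings. Function names are illustrative. The final `term in docs[name]` test is the verification step: a document can contain every trigram of the term without containing the term itself.

```python
from collections import defaultdict


def trigrams(s):
    """All 3-character windows of s."""
    return {s[i:i + 3] for i in range(len(s) - 2)}


def build_trigram_index(docs):
    index = defaultdict(set)
    for name, text in docs.items():
        for gram in trigrams(text):
            index[gram].add(name)
    return index


def search(index, docs, term):
    grams = trigrams(term)
    if not grams:
        # Terms shorter than 3 characters have no trigrams: scan everything.
        candidates = set(docs)
    else:
        candidates = set.intersection(*(index.get(g, set()) for g in grams))
    # Verify each candidate, since the intersection can over-match.
    return sorted(name for name in candidates if term in docs[name])
```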
Sharding
● We give you four nodes
● …but they all run on the same physical node during grading
● And we didn’t resource-limit grading containers (other than memory)
● So you don’t actually get more CPU, disk I/O, or memory bandwidth
● Sharding ended up not really mattering
Winning the contest
● The spec is for arbitrary substring search
● But we only generate/query words from a dictionary
● Some words are substrings of other words…
● … but not too many
● Use an inverted index over dictionary words
Handling substrings
● Option A
○ substrings : word → [all words containing that word]
○ index : word → [list of lines containing that full word]

    for word in substrings[query]:
        results += index[word]

○ Can compute substrings table by brute search
○ When indexing lines, just split into words

● Option B
○ index : word → [all lines containing that word, including as a substring]
○ Need to do the substring search as you index each line
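Option A can be sketched as follows, assuming a small in-memory dictionary and lines split on whitespace. The helper names are illustrative, and the brute-force `build_substrings` is quadratic in the dictionary size.

```python
from collections import defaultdict


def build_substrings(dictionary):
    """word -> every dictionary word that contains it (brute-force search)."""
    return {w: [other for other in dictionary if w in other] for w in dictionary}


def build_line_index(lines):
    """word -> line numbers (1-based) where it appears as a full word."""
    index = defaultdict(list)
    for lineno, line in enumerate(lines, start=1):
        for word in set(line.split()):
            index[word].append(lineno)
    return index


def query(substrings, index, q):
    results = set()
    for word in substrings.get(q, []):
        results.update(index.get(word, []))
    return sorted(results)
```

Option B moves the substring work from query time to indexing time, so a lookup becomes a single table access.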
Other ways to beat the level
● Slurp the entire tree into RAM and use a decent string search
○ (not java.lang.String.indexOf() -- that’s slow!)

● Shell out to grep
○ GNU grep is fast
Level 4
SQLCluster
SQLCluster
● 5 SQLite nodes
● Queries submitted to all nodes
● Random network and node failures
● Must maintain full linearizability of queries
Octopus

http://wallpaperswide.com/angry_octopus-wallpapers.html
Octopus
● (Grumpy) network simulator
● Submits queries and checks for correctness
● Several “monkeys” manipulate the network:
○ Netsplit monkey
○ Lagsplit monkey
○ SPOF monkey
○ etc.
Consensus Algorithms
● Raft (“In Search of an Understandable Consensus Algorithm”, Diego Ongaro and John Ousterhout, 2013)
● Zab (“Zab: High-performance broadcast for primary-backup systems”, Flavio P. Junqueira, Benjamin C. Reed, and Marco Serafini, 2011)
● Paxos (“The Part-Time Parliament”, Leslie Lamport, 1998, originally submitted 1990)
● Viewstamped Replication (“Viewstamped Replication: A New Primary Copy Method to Support Highly Available Distributed Systems”, Brian M. Oki and Barbara H. Liskov, 1988)
etc. etc.
Consensus Algorithms

Almost everyone chose Raft
https://github.com/goraft/raft
Gotchas: Idempotency
● Octopus sends node1 a commit
● Node1 forwards it to the leader, node0
● Node0 processes it and sends it back
● Octopus kills the node0 ⇔ node1 link
vs.
● Octopus sends node1 a commit
● Node1 forwards it to the leader, node0
● Octopus kills the node0 ⇔ node1 link
Gotchas: Idempotency
● How do you tell these two cases apart?
● Naive: resubmit the query to find out!
● If the query was processed, return the old result
● Common trick: Idempotency tokens
UPDATE ctf3 SET friendCount=friendCount+10,
requestCount=requestCount+1, favoriteWord="jjfqcjamhpghnqq"
WHERE name="carl"; SELECT * FROM ctf3
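A minimal sketch of the idempotency-token idea on a single node, assuming the caller attaches a unique token to each logical query and reuses it on retry. The `Node` class, the token strings, and the cut-down table are hypothetical; in a cluster the token-to-result map would have to be replicated along with the log.

```python
import sqlite3


class Node:
    def __init__(self):
        self.db = sqlite3.connect(":memory:")
        self.results = {}  # idempotency token -> result of the first execution

    def execute(self, token: str, sql: str):
        """Run one SQL statement at most once per token."""
        if token in self.results:
            # Already processed: return the old result, do not re-run.
            return self.results[token]
        rows = self.db.execute(sql).fetchall()
        self.db.commit()
        self.results[token] = rows
        return rows


node = Node()
node.execute("t0", "CREATE TABLE ctf3 (name TEXT, friendCount INTEGER)")
node.execute("t1", "INSERT INTO ctf3 VALUES ('carl', 0)")
update = "UPDATE ctf3 SET friendCount=friendCount+10 WHERE name='carl'"
node.execute("t2", update)
node.execute("t2", update)  # retry after a lost reply: applied only once
```

Without the token, the retry would add 10 a second time, because a non-idempotent UPDATE cannot safely be resubmitted.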
Making it Fast
● Every top solution replaced sql.go
● Two main strategies:
○ Write your own sqlite (or enough of it to pass)
○ Use sqlite bindings, :memory: database

● These perform roughly equally well
● Raft has a few timers you can tune
● Golf network traffic
Octopus Vulnerabilities
[Diagram: nodes 0 to 4; image credit http://sweetclipart.com/octopus-line-art-756]
First Solve: Single Master
[Diagram: nodes 0 to 4]
Forward-to-Master
[Diagram: nodes 0 to 4]
Single Point of Failure Detection
[Diagram: nodes 0 to 4]
Redirect-to-Master
[Diagram: nodes 0 to 4, labeled "302 http://node0/sql"]
Leaderboard Top 10
● Everyone changed SQLite “bindings”
● Seven solutions used go-raft
● Two solutions used redirect-to-master
● One solution implemented Raft in C++

● If at first you don’t succeed…
○ Mean submissions: 1444 (stddev 1031)
○ Max: 3946
○ Min: 58 (the C++ solution)
Speculative: Time-based consensus
● All nodes were run on the same host
● Local clock: total ordering on events
● Leaderless state machine
● ???
● Profit?

● Similar ideas to Spanner (Google)

Stripe CTF3 wrap-up

  • 1. Greg Brockman, Andy Brody, Christian Anderson, Philipp Antoni, Carl Jackson, Jonas Schneider, Siddarth Chandrasekaran, Ludwig Pettersson, Nelson Elhage, Steve Woodrow, Jorge Ortiz
  • 2. Participation ● ~7.5k participants from 97 countries, 6 continents ● ~9.5k unique IP logins ● 216 capturers
  • 4. CAPTURE THE FLAG ARCHITECTURE Or, an illustrated guide to flowcharts
  • 5. What’s the holy grail of scaling? Ø
  • 6. Challenges ● Scaling up to unknown capacity - 53 instances, 800 ECU at peak ● User isolation ● Reliability and Availability (*-abilities)
  • 7. git push: what happens?
  • 8. [Architecture diagram] git push → lvl0-asdf@stripe-ctf.com:level0 ● DNS: stripe-ctf.com. IN A ● https://stripe-ctf.com/ ● Components: gate (nginx, haproxy), ctfweb (sinatra), submitter (poseidon), colossus (LDAP, scoring), queue (RabbitMQ), ctfdb (mongo), build (docker), worker (docker), test case generator, gitcoin
  • 9. What Went Wrong – containerization – garbage collection - containers, filesystems, disk space – system stability – bugs, misconfiguration
  • 10. What Went Right – containerization – service architecture - queueing, separation of roles – load balancing – horizontal scaling
  • 13. Level 0: mmap ● mmap,munmap - map or unmap files or devices into memory ● mmap the dictionary into memory ● You can actually mmap stdin as well! ● Binary search!
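
The mmap-plus-binary-search idea from the slide can be sketched as follows. This is a minimal illustration, not a contestant's solution: it assumes the dictionary is a newline-terminated word list sorted in byte order, and the function name `contains` and the sample words are made up for the example.

```python
import mmap
import os
import tempfile


def contains(mm, word):
    """Binary search for `word` in a memory-mapped, byte-sorted word list."""
    target = word.encode()
    lo, hi = 0, len(mm)
    while lo < hi:
        mid = (lo + hi) // 2
        # Find the start of the line that contains byte offset `mid`.
        nl = mm.rfind(b"\n", lo, mid)
        start = lo if nl == -1 else nl + 1
        # Find the end of that line (or the end of the file).
        end = mm.find(b"\n", start)
        if end == -1:
            end = len(mm)
        line = mm[start:end]
        if line == target:
            return True
        if line < target:
            lo = end + 1   # continue in the lines after this one
        else:
            hi = start     # continue in the lines before this one
    return False


# Build a tiny sorted dictionary file (hypothetical sample data).
words = sorted(["apple", "banana", "cherry", "date"])
with tempfile.NamedTemporaryFile(delete=False) as f:
    f.write(("\n".join(words) + "\n").encode())
    path = f.name

with open(path, "rb") as f:
    mm = mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ)
    found = contains(mm, "cherry")
    missing = contains(mm, "coconut")
    mm.close()
os.remove(path)
```

Because the file is mapped rather than read, only the pages touched by the search are loaded, which is the point of the approach when start-up time dominates.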
  • 14. Level 0: Bloom filters ● Hash function: f(str) => int ● Look at the result of N hash functions. ● Probabilistic. ● False positives, but no false negatives. ● If you run into false positives, just push again!
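
A Bloom filter along the lines described above might look like this. It is a sketch under assumptions: the slide does not say which hash functions were used, so this version derives N positions from salted SHA-1 digests, and the class name, sizes, and sample words are illustrative.

```python
import hashlib


class BloomFilter:
    def __init__(self, num_bits, num_hashes):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray((num_bits + 7) // 8)

    def _positions(self, item):
        # Derive N bit positions by salting the input with the hash index.
        for i in range(self.num_hashes):
            digest = hashlib.sha1(f"{i}:{item}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, item):
        for pos in self._positions(item):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def __contains__(self, item):
        # True may be a false positive; False is always correct.
        return all(
            self.bits[pos // 8] & (1 << (pos % 8))
            for pos in self._positions(item)
        )


bloom = BloomFilter(num_bits=1024, num_hashes=3)
for w in ["apple", "banana", "cherry"]:
    bloom.add(w)
```

Every added word is guaranteed to test as present. A word that was never added usually tests as absent, but can occasionally test as present, which is the false-positive case the slide jokes about.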
  • 15. Level 0: Minimal perfect hashing ● Given dictionary D = {w₁, w₂, … wₙ} ● use MATH to generate a hash function f ● f : D → {0..n-1} is one-to-one ● aka every word hashes to a different small integer ● So you can build a no-collisions hash table ● CMPH - C Minimal Perfect Hashing Library ● Build this ahead of time, link it to the binary
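
To make the one-to-one property concrete, here is a toy construction that simply tries seeds until a salted hash maps every word to a distinct slot in {0..n-1}. This is not the algorithm CMPH uses and it only scales to very small dictionaries; the function names and the sample words are assumptions for illustration.

```python
import hashlib


def _slot(seed, word, n):
    digest = hashlib.md5(f"{seed}:{word}".encode()).digest()
    return int.from_bytes(digest[:8], "big") % n


def build_minimal_perfect_hash(words, max_seeds=1_000_000):
    """Return (seed, f) where f maps each word to a unique slot in 0..n-1."""
    words = list(words)
    n = len(words)
    for seed in range(max_seeds):
        # A seed works if no two words land in the same slot.
        if len({_slot(seed, w, n) for w in words}) == n:
            return seed, (lambda w, seed=seed: _slot(seed, w, n))
    raise RuntimeError("no collision-free seed found")


dictionary = ["apple", "banana", "cherry", "date", "elder"]
seed, f = build_minimal_perfect_hash(dictionary)

# Build the no-collisions table: slot i holds exactly one word.
table = [None] * len(dictionary)
for w in dictionary:
    table[f(w)] = w
```

A lookup then hashes the query, reads one slot, and compares the stored word with the query, since f says nothing useful about words outside D.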
  • 17. Gitcoin
       commit 000000216ba61aecaafb11135ee43b1674855d6ff7
       Author: Alyssa P Hacker <alyssa@example.com>
       Date: Wed Jan 22 14:10:15 2014 -0800
           Give myself a Gitcoin!
           nonce: tahf8buC
       diff --git a/LEDGER.txt b/LEDGER.txt
       index 3890681..41980b2 100644
       --- a/LEDGER.txt
       +++ b/LEDGER.txt
       @@ -7,3 +7,4 @@ carl: 30
        gdb: 12
        nelhage: 45
        jorge: 30
       +user-aph123: 1
  • 18. Why crypto currencies? – distributed currency system – git security model – massively parallel problems – using the right tools for the job
  • 20. Git object model
       $ git cat-file -p 000000effe7d920b391a24633e7298469dcf51b5
       tree 7da86a5b10ff6db916598b653ce63e1dc0cb73c8
       parent 0000000df4815161b72f4c5ed23e9fbf5deed922
       author Alyssa P Hacker <alyssa@example.com> 1391129313 +0000
       committer Alyssa P Hacker <alyssa@example.com> 1391129313 +0000
       Mined a Gitcoin!
       nonce 0302d1e2
       $ git show 000000
       error: short SHA1 000000 is ambiguous.
       error: short SHA1 000000 is ambiguous.
  • 21. Remember wafflecopter? Want to do as few rounds of SHA1 as possible Compute prefix, update only for nonce
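
The "compute prefix, update only for nonce" trick can be sketched with `hashlib`, whose hash objects support `copy()`. Assumptions in this sketch: the nonce is the last thing in the commit body, it has a fixed width (git hashes a header containing the body length, so the length must not change), and a commit counts as mined when its hex digest sorts below a difficulty string. The slides do not spell out the exact rule, and the commit body below is made up.

```python
import hashlib


def mine(body_prefix, difficulty, nonce_len=8, max_tries=1_000_000):
    """Search for a nonce so that the commit hash sorts below `difficulty`."""
    # A git commit id is SHA-1 over "commit <byte length>\0" plus the body.
    body_len = len(body_prefix) + nonce_len + 1  # +1 for the trailing newline
    header = b"commit %d\x00" % body_len
    # Hash everything that never changes exactly once.
    base = hashlib.sha1(header + body_prefix)
    for i in range(max_tries):
        # Fixed-width hex nonce; assumes i fits in nonce_len hex digits.
        nonce = format(i, "0%dx" % nonce_len).encode()
        h = base.copy()          # reuse the precomputed prefix state
        h.update(nonce + b"\n")  # feed only the bytes that differ
        digest = h.hexdigest()
        if digest < difficulty:
            return nonce, digest
    return None


# Hypothetical commit body; a real one would carry tree, parent, and author lines.
body = b"tree example\n\nMined a Gitcoin!\n\nnonce "
result = mine(body, "001")
```

With difficulty "001" roughly one attempt in 4096 succeeds, so the search finishes quickly even in pure Python. Real difficulties need far more attempts, which is where the independent, low-memory parallelism on the next slide matters.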
  • 22. SHA1 is Embarrassingly Parallel – Each miner can be totally independent – Each miner requires little memory – Each miner requires little code
  • 23. Tools: a spoon (bash)
  • 25. Tools: an army of backhoes (GPU)
  • 27. Bonus Round Hash Rates ● bash: 400 H/s ● our Go miners: 1.9 MH/s ● 100 cores EC2: 130 MH/s ● GPU: 1-2 GH/s ● Network: ~10 GH/s
  • 28. WE DID IT! Dogecoin is only at 80 GH/s
  • 30. Elephants and mice: a DDOS model
  • 31. Elephants and mice: a DDOS model
  • 32. The stub of a Node.js proxy
  • 33. Balance across the backends
  • 34. Balance across the backends 1. Round robin
  • 35. Balance across the backends 1. Round robin 2. Choose backend with min. load
  • 36. Balance across the backends 1. Round robin 2. Choose backend with min. load 3. Randomize
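
The three balancing strategies listed above can be compared side by side in a small sketch. The level itself was a Node.js proxy; this Python version only illustrates the selection logic, and the class, method names, and backend labels are assumptions. Breaking least-loaded ties randomly is a choice made here, not something the slides state.

```python
import itertools
import random


class Balancer:
    def __init__(self, backends, rng=None):
        self.backends = list(backends)
        self.load = {b: 0 for b in self.backends}  # in-flight requests
        self._cycle = itertools.cycle(self.backends)
        self.rng = rng or random.Random()

    def round_robin(self):
        # 1. Hand requests to backends in a fixed rotation.
        return next(self._cycle)

    def least_loaded(self):
        # 2. Pick a backend with the fewest in-flight requests,
        #    breaking ties at random.
        low = min(self.load.values())
        return self.rng.choice([b for b in self.backends if self.load[b] == low])

    def randomized(self):
        # 3. Pick any backend uniformly at random.
        return self.rng.choice(self.backends)

    def start(self, backend):
        self.load[backend] += 1

    def finish(self, backend):
        self.load[backend] -= 1


balancer = Balancer(["a", "b", "c"], rng=random.Random(0))
rotation = [balancer.round_robin() for _ in range(4)]
balancer.start("a")
balancer.start("b")
choice = balancer.least_loaded()
```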
  • 37. Let the mice through
  • 38. Let the mice through 1. Reduce the overall load
  • 39. Let the mice through 1. Reduce the overall load 2. Use an off-the-shelf solution
  • 40. Let the mice through 1. Reduce the overall load 2. Use an off-the-shelf solution 3. Learn to recognize a mouse
  • 41. If you had a global view
  • 42. Recognize a mouse 1. Threshold rate or number 2. Learn by hand or with automation
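
A threshold on the number of requests per source is the simplest reading of option 1. The sketch below assumes sources are identified by a string such as an IP address and that the threshold is picked by hand; the class name and the sample addresses are hypothetical.

```python
from collections import Counter


class MouseDetector:
    def __init__(self, threshold):
        self.threshold = threshold
        self.seen = Counter()  # requests observed per source

    def allow(self, source):
        """Count the request and let it through while the source stays small."""
        self.seen[source] += 1
        return self.seen[source] <= self.threshold


detector = MouseDetector(threshold=3)
elephant = [detector.allow("10.0.0.1") for _ in range(5)]
mouse = [detector.allow("10.0.0.2") for _ in range(2)]
```

A rate-based variant would reset or decay the counters over time, so that a source is judged on recent behaviour rather than on its lifetime total.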
  • 43. The top solutions ➔ Balance load in a simple way ➔ Learn to recognize a mouse ➔ Keep an eye on the backends
  • 45. We’re sorry about the Scala
  • 46. The problem ● Text search over ~100M of text ● Arbitrary substring search (not just whole words) ● There is a “free” indexing stage ● Distribute across up to 4 nodes ● Each node is limited to 500M of RAM
  • 47. Search 101: Inverted Index /tree/A: “the quick brown fox jumps over …” /tree/B: “the fox is red” “the” [A, B] “quick” [A] “brown” [A] “fox” [A, B] “red” [B]
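
The inverted index from the slide can be built in a few lines. This sketch uses only the text visible on the slide (the first document is truncated there, so the remainder is left out), splits on whitespace, and uses sets in place of the slide's lists; the function name is an assumption.

```python
from collections import defaultdict


def build_index(docs):
    """Map each word to the set of documents that contain it."""
    index = defaultdict(set)
    for name, text in docs.items():
        for word in text.split():
            index[word].add(name)
    return index


docs = {
    "A": "the quick brown fox jumps over",
    "B": "the fox is red",
}
index = build_index(docs)
```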
  • 48. Search 102: Arbitrary Substring ● “trigram index” ● Store an inverted index of trigrams ● “the quick brown fox …” → “the”, “he_”, “e_q”, “_qu”, “qui”, “uic”, … ● To query, look up all trigrams in the search term, and intersect ● Search(“rown”) → index[“row”] ∩ index[“own”] ○ Check each result to verify the match
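
The trigram lookup, intersection, and verification step described above can be sketched as follows. Function names are assumptions, and queries shorter than three characters fall back to scanning every document, a case the slide does not cover.

```python
from collections import defaultdict


def trigrams(text):
    return {text[i:i + 3] for i in range(len(text) - 2)}


def build_trigram_index(docs):
    index = defaultdict(set)
    for name, text in docs.items():
        for tri in trigrams(text):
            index[tri].add(name)
    return index


def search(index, docs, term):
    if len(term) < 3:
        candidates = set(docs)  # no trigram to look up; check everything
    else:
        # Intersect the posting sets of every trigram in the query.
        candidates = set.intersection(
            *(index.get(tri, set()) for tri in trigrams(term))
        )
    # Trigram hits are necessary but not sufficient, so verify each one.
    return sorted(name for name in candidates if term in docs[name])


docs = {
    "A": "the quick brown fox jumps over",
    "B": "the fox is red",
}
index = build_trigram_index(docs)
```

The final filter matters because a document can contain all of a query's trigrams without containing them adjacently.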
  • 49. Sharding ● We give you four nodes ● …but they all run on the same physical node during grading ● And we didn’t resource-limit grading containers (other than memory) ● So you don’t actually get more CPU, disk I/O, or memory bandwidth ● Sharding ended up not really mattering
  • 50. Winning the contest ● The spec is for arbitrary substring search ● But we only generate/query words from a dictionary ● Some words are substrings of other words… ● … but not too many ● Use an inverted index over dictionary words
  • 51. Handling substrings ● Option A ○ substrings : word → [all words containing that word] ○ index : word → [list of lines containing that full word] ○ for word in substrings[query]: results += index[word] ○ Can compute substrings table by brute search ○ When indexing lines, just split into words ● Option B ○ index : word → [all lines containing that word, including as a substring] ○ Need to do the substring search as you index each line
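
Option A can be sketched directly from the slide's pseudocode. The dictionary and lines below are made-up sample data, and results are collected into a set so that a line matching through several words is reported once.

```python
from collections import defaultdict


def build_substrings(dictionary):
    # Brute-force table: word -> every dictionary word that contains it.
    return {w: [v for v in dictionary if w in v] for w in dictionary}


def build_index(lines):
    # word -> line numbers where that full word appears.
    index = defaultdict(list)
    for lineno, line in enumerate(lines):
        for word in set(line.split()):
            index[word].append(lineno)
    return index


def search(substrings, index, query):
    results = set()
    for word in substrings.get(query, []):
        results.update(index.get(word, []))
    return sorted(results)


dictionary = ["cat", "catalog", "dog"]
lines = ["catalog dog", "cat", "dog cat"]
substrings = build_substrings(dictionary)
index = build_index(lines)
```

The brute-force table is quadratic in the dictionary size, which is acceptable here because it is built once during the free indexing stage.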
  • 52. Other ways to beat the level ● Slurp the entire tree into RAM and use a decent string search ○ (not java.lang.String.indexOf() -- that’s slow!) ● Shell out to grep ○ GNU grep is fast
  • 54. SQLCluster ● 5 SQLite nodes ● Queries submitted to all nodes ● Random network and node failures ● Must maintain full linearizability of queries
  • 56. Octopus ● (Grumpy) network simulator ● Submits queries and checks for correctness ● Several “monkeys” manipulate the network: ○ Netsplit monkey ○ Lagsplit monkey ○ SPOF monkey ○ etc.
  • 57. Consensus Algorithms ● Raft (“In Search of an Understandable Consensus Algorithm”, Diego Ongaro and John Ousterhout, 2013) ● Zab (“Zab: High-performance broadcast for primary-backup systems”, Flavio P. Junqueira, Benjamin C. Reed, and Marco Serafini, 2011) ● Paxos (“The Part-Time Parliament”, Leslie Lamport, 1998, originally submitted 1990) ● Viewstamped Replication (“Viewstamped Replication: A New Primary Copy Method to Support Highly Available Distributed Systems”, Brian M. Oki and Barbara H. Liskov, 1988) etc. etc.
  • 58. Consensus Algorithms Almost everyone chose Raft https://github.com/goraft/raft
  • 59. Gotchas: Idempotency ● Octopus sends node1 a commit ● Node1 forwards it to the leader, node0 ● Node0 processes it and sends it back ● Octopus kills the node0 ⇔ node1 link vs. ● Octopus sends node1 a commit ● Node1 forwards it to the leader, node0 ● Octopus kills the node0 ⇔ node1 link
  • 60. Gotchas: Idempotency ● How do you tell between these two cases? ● Naive: resubmit the query to find out! ● If the query was processed, return the old result ● Common trick: Idempotency tokens UPDATE ctf3 SET friendCount=friendCount+10, requestCount=requestCount+1, favoriteWord=" jjfqcjamhpghnqq" WHERE name="carl"; SELECT * FROM ctf3
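
The idempotency-token trick can be illustrated with a single in-memory SQLite node. This is a sketch under assumptions: the forwarding node attaches a unique token to each query, the leader remembers the result per token, and the table is reduced to the two columns needed here. The class and method names are hypothetical.

```python
import sqlite3


class Leader:
    def __init__(self):
        self.db = sqlite3.connect(":memory:")
        self.db.execute(
            "CREATE TABLE ctf3 (name TEXT PRIMARY KEY, friendCount INTEGER)"
        )
        self.db.execute("INSERT INTO ctf3 VALUES ('carl', 0)")
        self.results = {}  # idempotency token -> result already returned

    def add_friends(self, token, name, amount):
        # A retried query carries the same token: replay the old result
        # instead of applying the update a second time.
        if token in self.results:
            return self.results[token]
        self.db.execute(
            "UPDATE ctf3 SET friendCount = friendCount + ? WHERE name = ?",
            (amount, name),
        )
        row = self.db.execute(
            "SELECT friendCount FROM ctf3 WHERE name = ?", (name,)
        ).fetchone()
        self.results[token] = row[0]
        return row[0]


leader = Leader()
first = leader.add_friends("token-1", "carl", 10)
retry = leader.add_friends("token-1", "carl", 10)   # link died, node resubmits
second = leader.add_friends("token-2", "carl", 10)  # a genuinely new query
```

In a replicated setup the token table would have to be part of the replicated state, so that a new leader also recognises retries.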
  • 61. Making it Fast ● Every top solution replaced sql.go ● Two main strategies: ○ Write your own sqlite (or enough of it to pass) ○ Use sqlite bindings, :memory: database ● These perform roughly equally well ● Raft has a few timers you can tune ● Golf network traffic
  • 63. First Solve: Single Master 0 4 1 3 2
  • 65. Single Point of Failure Detection 0 4 1 3 2
  • 69. Leaderboard Top 10 ● Everyone changed SQLite “bindings” ● Seven solutions used go-raft ● Two solutions used redirect-to-master ● One solution implemented Raft in C++ ● If at first you don’t succeed… ○ Mean submissions: 1444 (stddev 1031) ○ Max: 3946 ○ Min: 58 (the C++ solution)
  • 70. Speculative: Time-based consensus ● All nodes were run on the same host ● Local clock: total ordering on events ● Leaderless state machine ● ??? ● Profit? ● Similar ideas to Spanner (Google)
  • 71. Greg Brockman, Andy Brody, Christian Anderson, Philipp Antoni, Carl Jackson, Jonas Schneider, Siddarth Chandrasekaran, Ludwig Pettersson, Nelson Elhage, Steve Woodrow, Jorge Ortiz