Clojure In Production
Stackup March 10th, 2015
Alex Kehayias
@alexkehayias

Alex clojure in production stackup meetup 20150310 v1

Published on

031015

Published in: Education

  1. Clojure In Production
     Stackup, March 10th, 2015
     Alex Kehayias (@alexkehayias)
  2. About Shareablee
     • Help brands and publishers figure out what works and what doesn’t on social
     • Optimize content creation for engagement
     • Benchmark and predict performance
  3. What does that mean?
  4. Collect and analyze millions of pieces of content and billions of interactions on all social networks, constantly. Aggregate on demand and calculate metrics interactively.
  5. How do we make good on our promises?
  6. Le Stack … and frontend stuff
  7. What informs tech decisions
     • Speed of iteration
     • Ease of testing
     • Scaling story
     • Failure scenarios
     • Small team, low maintenance
     • Tradeoffs
     • Fragmentation
  8. What we have built
     • Scalable data collection framework powered by Apache Storm
     • Full replay at any time from archives to rebuild all data stores
     • Ephemeral batch jobs
     • Fast, automated time series down-sampling with tunable throttling
     • Excel templating library (soon to be open sourced)
     • Much much MUCH more…
  9. Collection Framework
     • Really nice DSL in Clojure for writing and testing Storm topologies

         ;; Split each incoming sentence on spaces, emit one tuple per word
         ;; anchored to the input tuple, then ack the input.
         (defbolt split-sentence ["word"] [tuple collector]
           (let [words (.split (.getString tuple 0) " ")]
             (doseq [w words]
               (emit-bolt! collector [w] :anchor tuple))
             (ack! collector tuple)))

     • Reusable components plug and play

         (defrabbitmqspout users-spout
           {"default" ["meta" "url" "opts"]
            "fail" ["meta" "reason"]}
           message-to-tuple
           dequeue-pred
           :exchange "USERS_EXCHANGE"
           :priority-queue "USERS_PRIORITY_QUEUE"
           :queue "USERS_QUEUE")

     • Need more parallelism?

         ;; :p is the parallelism hint for the bolt (5 here).
         (bolt-spec {"1" :shuffle} split-sentence :p 5)
  10. Full Replay
      • API calls are limited, archives are not
      • Keep it idempotent all the way through
      • Reuse existing Storm topologies, but different source; it’s all data

          (publish-msg {:date "2014-01-01"
                        :version "1.0"
                        :user-id "123456789"
                        :resource "media"
                        :service "instagram"})
  11. Ephemeral Batch Jobs
      • All the processing power, none of the maintenance
      • Terminate on completion (cheap)
      • Clojure/Cascalog/S3/EMR
      • Logic programming in the large

          (<- [?name ?type ?count]
              (users ?name ?user-id)
              (actions ?type ?user-id)
              (count ?count))
  12. The Future
      • Functional programming is the real deal
      • Moving away from Hadoop
      • Streams as a primary abstraction
      • Large-scale time series aggregation
      • Iterative prediction and classification
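
The "keep it idempotent all the way through" point on the Full Replay slide can be illustrated with a minimal sketch in plain Clojure. Everything here is an assumption made for illustration, not the Shareablee implementation: `msg-key`, `apply-msg`, and `replay` are hypothetical names, and an in-memory map stands in for a real data store. The idea is that writes are upserts keyed by fields of the message itself, so replaying the same archived message any number of times leaves the store in the same state.

```clojure
;; Hypothetical: derive a deterministic key from the message fields
;; shown on the Full Replay slide.
(defn msg-key
  [{:keys [service resource user-id date version]}]
  [service resource user-id date version])

;; Hypothetical stand-in for a data store: a plain map.
;; The write is an upsert keyed by msg-key, so applying the same
;; message a second time does not change the store.
(defn apply-msg
  [store msg]
  (assoc store (msg-key msg) msg))

;; Replay a sequence of archived messages into a store.
(defn replay
  [store msgs]
  (reduce apply-msg store msgs))

(def example-msg
  {:date "2014-01-01"
   :version "1.0"
   :user-id "123456789"
   :resource "media"
   :service "instagram"})

;; Replaying once, then replaying the same message twice more,
;; yields an identical store.
(def once (replay {} [example-msg]))
(def twice (replay once [example-msg example-msg]))
```

Here `(= once twice)` holds, which is the property that makes a full replay from archives safe to run at any time.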
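
One reading of the query on the Ephemeral Batch Jobs slide is: join `users` and `actions` on `?user-id`, then count rows per `?name`/`?type` pair. The sketch below expresses that reading in plain Clojure over in-memory vectors. It is not Cascalog and does not run on EMR; the sample data and the `action-counts` function are hypothetical, and it assumes each user id appears at most once in `users`.

```clojure
;; Hypothetical sample data: [name user-id] and [type user-id] tuples.
(def users
  [["alice" 1]
   ["bob" 2]])

(def actions
  [["like" 1]
   ["like" 1]
   ["comment" 1]
   ["like" 2]
   ["like" 99]]) ; user 99 has no entry in users and drops out of the join

(defn action-counts
  "Join actions to users on user id, then count rows per [name type]."
  [users actions]
  (let [name-by-id (into {} (map (fn [[uname id]] [id uname]) users))]
    (->> actions
         ;; keep drops actions whose user id has no matching user
         (keep (fn [[atype id]]
                 (when-let [uname (name-by-id id)]
                   [uname atype])))
         frequencies)))

(def result (action-counts users actions))
```

The declarative query states only the relations and the output variables; the join and the grouping that are spelled out by hand here are left to the query engine.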
