Building a high throughput rest api with scala

Slides from my talk at the Scala DC Meetup on Jan 15th, 2014.

Published in: Technology
2 Comments
  • @binkabir Yes, but at that time spray.io was not part of the Typesafe stack, and we also needed to build a web admin console, so Play Framework worked out to be the better choice.
    Now that spray is going to be merged into the Typesafe umbrella, we'll surely revisit it.
  • Great slides. Did you ever look at spray.io before?

Transcript

  • 1. Building a high throughput REST API with Scala + Play + Akka
    Bhaskar V. Karambelkar
    https://www.linkedin.com/in/bhaskarvk
    https://twitter.com/bhaskar_vk
    Scala DC-MD-NOVA meetup, Jan-15-2014
  • 2. Status quo
      • APIs used to be built with various protocols such as JDBC (stored procs), JMS, SOAP/HTTP, XML-RPC, and file transfer.
      • Issues
          – No uniformity.
          – Not firewall friendly.
          – Programming language dependency (JMS).
          – Not easy to test / document.
          – Not easy to scale, load-balance, fail over.
  • 3. Why Scala + Play + Akka
      • Needed an API that could successfully tackle the 4 Vs of Big Data, namely Volume, Velocity, Variety, Veracity.
      • Needed the API to be horizontally as well as vertically scalable.
      • Needed an “event driven” architecture / programming model.
      • Needed easy “HA”, “fail-over”, “concurrency”, “load balancing” constructs.
  • 4. Stack
      • Scala 2.10.3, Play 2.2.1, Akka 2.2.3.
      • Eclipse + ScalaIDE (4.0.0 M1).
      • MongoDB as a config data store + queue.
      • metrics-scala library for metrics.
      • WebJars library to manage JavaScript/CSS dependencies.
      • sbt for building, Jenkins for CI.
  • 5. 1.0 Architecture
  • 6. Architecture Cont.
      • Apache reverse proxy (HA, load balancing, fail-over, TLS termination).
      • The API farm receives JSON POSTs, parses the JSON, normalizes it to Scala objects, and uploads them to MongoDB, which acts as a queue.
      • The same API farm de-queues from Mongo and sends the data to the next hop in the pipeline.
      • A basic admin console written in AngularJS.
      • Eventual destinations: HDFS and Elasticsearch.
  • 7. Performance in production on first run
      • Slow JSON parsing, frequent OOMs, or even worse, JVM hangs (kill -9).
      • No transactions in MongoDB, so data loss in case of a crash/hang.
      • Not scalable beyond a certain load.
      • CPUs pegged at 60 to 70% utilization, with non-uniform core usage.
      • Heap usage high.
      • I/O bottlenecks.
      • Heavy en-queuing slowed down de-queuing, so queues filled up over time.
  • 8. Architecture 2.0
  • 9. Architecture 2.0 Cont.
      • Dedicated pipelines for clients.
      • Separate heavy traffic from light traffic.
      • Separate enqueue and de-queue into dedicated API server instances.
      • Compression all the way, even in Mongo.
      • Incremental JSON parsing.
      • Avoid unnecessary JSON -> Object -> BSON -> Object -> Stream conversions.
      • Changed logic so as to not lose data even in the event of an instance crash/hang.
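
As an illustration of the "compression all the way" point, here is a minimal sketch of compressing a JSON payload before it is queued and restoring it on the way out. The slides do not say which codec was used; gzip from the JDK is an assumption here, and the `PayloadCompression` object and its method names are hypothetical.

```scala
import java.io.{ByteArrayInputStream, ByteArrayOutputStream}
import java.nio.charset.StandardCharsets.UTF_8
import java.util.zip.{GZIPInputStream, GZIPOutputStream}

// Hypothetical helper; gzip is an assumed codec, not one named in the talk.
object PayloadCompression {

  /** Gzip a string (encoded as UTF-8) into a byte array. */
  def compress(text: String): Array[Byte] = {
    val buffer = new ByteArrayOutputStream()
    val gzip = new GZIPOutputStream(buffer)
    try gzip.write(text.getBytes(UTF_8))
    finally gzip.close() // close() writes the gzip trailer into the buffer
    buffer.toByteArray
  }

  /** Inflate gzip bytes back into the original UTF-8 string. */
  def decompress(bytes: Array[Byte]): String = {
    val gzip = new GZIPInputStream(new ByteArrayInputStream(bytes))
    try {
      val out = new ByteArrayOutputStream()
      val chunk = new Array[Byte](4096)
      var read = gzip.read(chunk)
      while (read != -1) { // -1 signals end of the compressed stream
        out.write(chunk, 0, read)
        read = gzip.read(chunk)
      }
      new String(out.toByteArray, UTF_8)
    } finally gzip.close()
  }
}
```

Keeping the payload as compressed bytes between the enqueue and de-queue servers would also fit the advice to avoid unnecessary object conversions, since the de-queue side can forward the bytes without re-parsing them.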
  • 10. Results
      • Platform stable.
      • CPU usage steady at 30 to 40%, with uniform distribution across cores.
      • Memory consumption under control, no more OOMs / hangs.
      • Increased throughput and scalability.
      • Very easy to scale further by creating more data paths.
  • 11. Buzzwords/Recommendations
      • Scala
          – Immutability everywhere; use case classes / immutable collections.
          – Monadic patterns everywhere (collections, Try, Option).
      • Akka
          – Prefer ! (tell) over ? (ask).
          – Tune dispatcher parameters; don't rely on the default dispatcher.
          – Give the scheduler its own dispatcher.
          – Routers with their own dispatcher for load-balancing actors writing to destinations.
          – CircuitBreaker to prevent cascading failures.
          – Throttler actor for throttling when required.
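
The Scala recommendations (case classes, immutability, `Try` and `Option`) can be sketched with the standard library alone. The `Event` shape and the pipe-separated input format below are hypothetical, since the slides do not show the real schema.

```scala
import scala.util.{Failure, Success, Try}

object EventNormalizer {
  // Hypothetical event shape; the real schema is not given in the slides.
  final case class Event(source: String, severity: Int, message: Option[String])

  /** Parse "source|severity|message"; the message field may be absent. */
  def parse(line: String): Try[Event] = Try {
    val parts = line.split('|')
    require(parts.length >= 2, s"expected at least 2 fields: $line")
    // toInt may throw; Try captures that as a Failure instead of propagating it
    Event(parts(0), parts(1).toInt, parts.drop(2).headOption)
  }

  /** Cap severity at 10 via copy (no mutation), then render or report the failure. */
  def describe(line: String): String =
    parse(line).map(e => e.copy(severity = math.min(e.severity, 10))) match {
      case Success(e)   => s"${e.source}:${e.severity}:${e.message.getOrElse("-")}"
      case Failure(err) => s"rejected (${err.getClass.getSimpleName})"
    }
}
```

Here a malformed line becomes a `Failure` value that the caller must handle, and a missing message is an explicit `None` rather than a null.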
  • 12. Buzzwords/Recommendations
      • Play
          – Prefer non-blocking/async calls whenever possible.
          – Use WebJars for managing JavaScript/CSS dependencies.
          – For huge JSONs, use an incremental JSON parser + Play's Iteratee framework.
      • JVM
          – Use Java 7.
          – Profile and tune GC and memory params.
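
The "prefer non-blocking/async calls" advice can be illustrated with plain `scala.concurrent.Future` composition. This sketch uses only the standard library rather than Play's own action API, and the `parse` and `enqueue` stages are hypothetical stand-ins for the real work.

```scala
import scala.concurrent.{ExecutionContext, Future}

object AsyncPipeline {
  // Hypothetical stage: validate the body and return its trimmed length.
  def parse(body: String)(implicit ec: ExecutionContext): Future[Int] = Future {
    val trimmed = body.trim
    if (trimmed.isEmpty) throw new IllegalArgumentException("empty body")
    trimmed.length
  }

  // Hypothetical stage: pretend to hand the payload to a queue.
  def enqueue(size: Int)(implicit ec: ExecutionContext): Future[String] =
    Future(s"queued $size chars")

  /** Chain the stages without blocking a thread; a failure becomes a response string. */
  def handle(body: String)(implicit ec: ExecutionContext): Future[String] =
    parse(body)
      .flatMap(size => enqueue(size))
      .recover { case e: Exception => s"failed: ${e.getMessage}" }
}
```

Nothing in `handle` waits on a result; the caller receives a `Future` immediately and the stages run on whatever `ExecutionContext` is supplied.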
  • 13. Some Numbers
      • Current load
          – 2.5 billion events / day (> 30 K/sec sustained).
          – 2 to 3 TB / day.
          – Expected to grow by 5x to 10x.
      • Current h/w count
          – 2 data paths with 4 enqueue and 4 de-queue API servers in each path.
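
A quick back-of-the-envelope check of these figures, as a sketch. It assumes load is spread evenly over the day and across all enqueue servers, and that 1 TB means 10^12 bytes; neither assumption comes from the slides.

```scala
object LoadEstimate {
  val eventsPerDay: Double = 2.5e9
  val secondsPerDay: Int = 24 * 60 * 60 // 86,400

  // Daily average rate, assuming traffic is uniform across the day.
  val avgEventsPerSec: Double = eventsPerDay / secondsPerDay

  // 2 data paths x 4 enqueue servers per path.
  val enqueueServers: Int = 2 * 4

  // Assumes an even split of events across enqueue servers.
  val avgPerEnqueueServer: Double = avgEventsPerSec / enqueueServers

  /** Average event size in bytes for a given daily volume (1 TB taken as 1e12 bytes). */
  def bytesPerEvent(terabytesPerDay: Double): Double =
    terabytesPerDay * 1e12 / eventsPerDay

  def main(args: Array[String]): Unit = {
    println(f"average events/sec:        $avgEventsPerSec%.0f")
    println(f"per enqueue server (avg):  $avgPerEnqueueServer%.0f")
    println(f"bytes/event at 2 TB/day:   ${bytesPerEvent(2.0)}%.0f")
    println(f"bytes/event at 3 TB/day:   ${bytesPerEvent(3.0)}%.0f")
  }
}
```

Under these assumptions the daily average works out to roughly 28.9 K events/sec, which is in the same range as the "> 30 K/sec sustained" figure once busier periods are taken into account, or about 3.6 K events/sec per enqueue server and 800 to 1,200 bytes per event.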
  • 14. Future …
      • Waiting for the Typesafe platform to stabilize a bit (akka-io, spray, akka-cluster).
      • More reactive than the current implementation (Play Futures, Iteratees).
      • ReactiveMongo (currently we use Casbah).
      • Evaluating Scala for use in the analytics pipeline (Spark framework, Cascading).
  • 15. Thank You!
      • Questions? Comments?
