From Java to Scala at CrowdMix
A one-year journey, growing a set of microservices written in Java 8 into a Scala-based
reactive system
Stefano Galarraga - galarragas@gmail.com - @stefgalarraga
Emanuele Blanco - emanuele.blanco@gmail.com - @manub
What Does CrowdMix Do?
• A social network focused on music
• The model is based on crowds
• People can share different types of content in the crowds they have joined
• Music obviously is the most interesting content
• System has been designed for scalability from day 1
• We don’t own any music content but we allow people to share and listen to
tracks across different streaming services
Who are we?
Stefano
• Started (reluctantly) working in Java in 1997
• Worked mostly in Java since then (some C/C++ and C#)
• Got interested in Functional Programming around 2011/2012
• Started with Haskell then moved to Scala
• Working primarily in Scala since 2013
• Almost all my open source activity is in Scala
• AKKA enthusiast
• Working mostly in Big Data recently
Who are we?
Emanuele:
• Coding for fun since early 2000s, for profit since 2008
• Comes from a Java/Groovy background
• Started studying Scala in 2011, using it professionally in 2014
• Took both of Coursera’s Scala courses in 2013 (pre-Scala Center)
• Helpful, but still need real life expertise!
• Creator of scalatest-embedded-kafka
(https://github.com/manub/scalatest-embedded-kafka)
CM Architecture diagram today (“legacy” in red)
Some history - The CM Dev Team and Platform - Start
Early 2015:
A small team of developers, some basic (not so micro)services (profiles, feeds
and music matching)
Tech Stack:
• Java 8
• Dropwizard
• Cassandra
Some history - The CM Dev Team and Platform - Now
Now: Still quite a small team. The team grew and then recently shrank
Tech Stack:
• Scala, Cassandra, Kafka
• REST services in Spray and some in Play
• AKKA Streams for Kafka Consumers (reactive-kafka)
• Some “pure” AKKA actor code
• Legacy: Java 8, Dropwizard, Groovy (fading away)
• Spark (in Scala) for Batch processing
• Kotlin is gradually replacing Java on the Android part too!!
• And Swift is replacing Objective-C
Some history - The CM Dev Team and Platform
Early 2015
• ~ 5 developers setting up the first microservices (profiles, feeds and music matching) in Java 8 using
Dropwizard
• Scala wasn’t well known by the team, fear of not being able to hire Scala devs
June 2015
• Big (and fast) growth of the tech team, many of the devs joining had Scala experience
• Java 8 - while better than previous versions - still felt verbose for the Scala guys, trying to convince the rest of
the team to adopt Scala for expressiveness and conciseness
September 2015
• First microservices in Scala, new functionality in the feeds service written in Scala, music matching completely
rewritten (for other reasons too)
• Starting to experiment with AKKA Streams (1.0)
• More Scala Devs joining (mostly Big Data)
• Beginning to write first ETL processors in Spark
Some history - The CM Dev Team and Platform
Autumn 2015
• All newly-developed services are in Scala
• AKKA streams is the way to go for Kafka event processors
Today
• Only three services still have some parts in Java
• Some test code is still in Groovy
• All Scala code for the rest
• General consensus about moving everything to Scala
• Of the original set of developers with no Scala background:
• Everyone is capable of maintaining the code, though some need review/supervision
on specific parts
• Around half are reasonably fluent in Scala and autonomous
• Others left the team
Some history - The CM Dev Team and Platform
Today:
• Codebase
• Scala code: 1,858 files for 132,926 lines
• Java code: 1,093 files for 93,293 lines
• Groovy code: 237 files for 28,120 lines
• Client platform
• iOS: from Objective-C to Swift
• Android: from Java to Kotlin
Moving to Scala - Why
• Many of the devs coming from Scala projects were feeling very limited with
Java, even though Java 8 represents an improvement from earlier versions
• Moving towards a more reactive platform:
• Non blocking async code is much easier to write in Scala (Futures, Actors, Scala Async)
• Scala offers some lighter-weight and better-performing REST frameworks (such as Spray/AKKA
Http, Finch/Finagle or Play!)
• AKKA Streams are extremely effective for writing event processing pipelines
• Quite Java dev friendly too
• Nowadays Scala attracts more talent, people who work in Scala are more likely to pick up
other new technologies instead of “sticking with the well known”
• Beware: “Senior” developers are often reluctant to learn something new!!!
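To illustrate the point about non-blocking async code, here is a minimal sketch using only the standard library's Futures. The types and functions (Profile, FeedItem, ProfilePage, fetchProfile, fetchFeed, loadPage) are hypothetical names made up for this example and are not taken from the CrowdMix codebase.

```scala
import scala.concurrent.{ExecutionContext, Future}

// Hypothetical domain types, for illustration only.
final case class Profile(userId: String, displayName: String)
final case class FeedItem(userId: String, text: String)
final case class ProfilePage(profile: Profile, feed: List[FeedItem])

object ProfilePages {
  // Stand-ins for remote calls; a real service would get these Futures
  // from an HTTP or database client.
  def fetchProfile(userId: String)(implicit ec: ExecutionContext): Future[Profile] =
    Future(Profile(userId, s"User $userId"))

  def fetchFeed(userId: String)(implicit ec: ExecutionContext): Future[List[FeedItem]] =
    Future(List(FeedItem(userId, "shared a track")))

  // Both calls are started before the for-comprehension, so they can run
  // concurrently; the comprehension only combines the two results.
  def loadPage(userId: String)(implicit ec: ExecutionContext): Future[ProfilePage] = {
    val profileF = fetchProfile(userId)
    val feedF    = fetchFeed(userId)
    for {
      profile <- profileF
      feed    <- feedF
    } yield ProfilePage(profile, feed)
  }
}
```

The caller receives a Future[ProfilePage] and no thread is blocked while waiting; a REST layer that accepts Futures can complete the response when the value arrives.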
Moving to Scala - Gains
• Attracting/keeping good/right people
• Most devs with experience in scalable systems were very keen on using Scala
• Most of the developers asked during interviews how much Java code they would have
to maintain
• Frameworks:
• Testing frameworks: Much more expressive, compact. Better tests
• Property-based testing, better fixtures
• Initially tried to get rid of JUnit using Spock (on Groovy), but ScalaTest is far superior
• Reactive frameworks:
• See above. Java has good stuff too (Hystrix, ...)
• Data processing layer:
• Spark is available for Java but fits Scala much better
Moving to Scala - Losses
• Need to fill some gaps:
• Metrics: Out of the box support in Dropwizard, had to integrate/complement for Spray
• Not too much work but need some effort (Kamon + some of our code)
• Circuit breakers: Nothing stops you from using Hystrix, but once you start writing Future-based
APIs there is some impedance mismatch to handle.
• AKKA circuit-breaker is a good replacement with limited features
• API Docs:
• Good support for Swagger in Dropwizard and Jersey. Not so good for Spray
• Still swagger 1.0 and doesn’t interact well with the code (actually found a 2.0
compliant..)
• Need to train/supervise developers
• Small slowdown in the beginning, some entropy to contain in the transition
Moving to Scala - What worked and why
• Had a good percentage of developers with real Scala experience
• Good amount of “cautious” developers, nobody was pushing towards more esoteric Scala
functions
• no Scalaz/Cats, limited usage of implicit conversions and parameters
• Every Java dev was able to pair with an experienced Scala one
• When not pairing, active code reviews using PRs
• Data pipeline frameworks are limiting the scope and helping to avoid getting
lost in Scala land
• Spark, AKKA Streams
• Almost everybody seemed keen to learn Scala and to port Java code
• No “rewrite everything” approach; instead
• New code gets preferably written in Scala
• Old code gets ported when there’s a business need to change it
Moving to Scala - What didn’t work so well
• Keeping the REST output code in Java and replacing the service layer with
Scala
• Calling Scala code from Java is painful
• Problems in writing a non-blocking service layer, ended up limiting the amount of gain
• ListenableFuture and CompletionStage may translate into Scala Futures with some help
• I would probably move in the opposite way and then replace the Java code entirely
• Gradle and ScalaTest don’t really work seamlessly…
• It’s easy to forget @RunWith
• sbt is still the best option to build Scala, but it’s not the easiest tool around
• Scala is great, but the compiler is still way slower than javac
• Dotty to the rescue? http://scala-lang.org/news/roadmap-next/
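The "with some help" needed to turn a Java CompletionStage into a Scala Future can be sketched with a Promise, using only the standard library. The object and method names (JavaFutures, toScala) are hypothetical, chosen for this example; Scala 2.13's scala.jdk.FutureConverters and the scala-java8-compat library provide ready-made converters, and the same Promise-based pattern can be applied to Guava's ListenableFuture through its callback API.

```scala
import java.util.concurrent.CompletionStage
import java.util.function.BiConsumer
import scala.concurrent.{Future, Promise}

object JavaFutures {
  // Bridges a Java CompletionStage into a Scala Future without blocking:
  // a callback completes the Promise when the Java stage finishes.
  def toScala[T](stage: CompletionStage[T]): Future[T] = {
    val promise = Promise[T]()
    stage.whenComplete(new BiConsumer[T, Throwable] {
      override def accept(value: T, error: Throwable): Unit =
        // Java signals failure with a non-null Throwable.
        if (error != null) promise.failure(error) else promise.success(value)
    })
    promise.future
  }
}
```

With a helper like this, a Scala service layer can consume Java async APIs and stay Future-based end to end, instead of blocking on get().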
Summary
• It worked for us
• We are not a common case probably
• Limited legacy
• Good Scala expertise in the team
• Suggestions for others trying to do the same:
• Get Scala devs with real development experience to pair/mentor
• Not just having completed the Coursera courses => they maintained sw written in Scala
• Spread them around
• Stick to the Principle of Least Power
http://www.lihaoyi.com/post/StrategicScalaStylePrincipleofLeastPower.html
• Adopt a strict “simplified Scala” strategy for the beginning
• Play is usually a good gateway drug
• Spray and AKKA HTTP are not
• Consider Finch/Finatra/Finagle
Summary
From: http://blog.goodstuff.im/yes-virginia-scala-is-hard (very old post!!)
So, how can you figure out if Scala will be "easy" or "hard" for your organization:
Your company has speakers at JavaOne, OSCON, Strange Loop, QCon: Scala will be easy
Lunch-time discussions involve the criteria for moving from a developer to a senior developer: Scala
will be hard
Your developers can write code in NotePad if they have to: Easy
Your developers stare blankly or say 3 "Hail Marys" when they hear the name "Zed Shaw": Scala == Hard
Developers all follow Dean Wampler on Twitter: Scala Easy
Your developers come in at 9:15 and leave before 6 and don't check work email at night: Hard
Summary
Questions?


Editor's Notes

  • #3 Ste
  • #4 Ste
  • #5 Ema
  • #6 Ste + Ema
  • #7 Ema We should describe the situation at company startup, the kind of developer, why Java 8 was chosen
  • #8 Ema We could describe the size changes 5 - 20 - 12
  • #9 Stefano Are dates correct, any other hot milestone? You are the best on it Ema, since you did most of the initial injection work
  • #10 Stefano
  • #11 Need to collect the stats
  • #12 Ema Any other good reason? I added the fact people that do Scala are generally more keen to learn
  • #13 Ema
  • #14 Ema/Ste
  • #15 Ste
  • #16 Ste/Ema I added the fact that if you have ListenableFuture or CompletionStage you can work with Scala futures, but it require
  • #17 Ste