Building Distributed Systems in Scala
A presentation to Emerging Technologies for the Enterprise
April 8, 2010 – Philadelphia, PA
About @al3x
‣   At Twitter since 2007
‣   Working on the Web
    since 1995
‣   Co-author of
    Programming Scala
    (O’Reilly, 2009)
‣   Into programming
    languages,
    distributed systems.
About Twitter
‣   Social messaging – a
    new way to
    communicate
‣   Launched in
    mid-2006
‣   Hit the mainstream in
    2008
‣   50+ million tweets per
    day (600+ per
    second)
‣   Millions of users
    worldwide
Technologies Used At Twitter
Languages                         Frameworks
‣   Ruby, JavaScript              ‣   Rails
‣   Scala                         ‣   jQuery
‣   lil’ bit of C, Python, Java


Data Storage                      Misc.
‣   MySQL                         ‣   memcached
‣   Cassandra                     ‣   ZooKeeper
‣   HBase (Hadoop)                ‣   Jetty
                                  ‣   so much more!
Why Scala?
‣   A language that’s both fun and productive.
‣   Great performance (on par with Java).
‣   Object-oriented and functional programming,
    together.
‣   Ability to reuse existing Java libraries.
‣   Flexible concurrency (Actors, threads, events).
‣   A smart community with infectious momentum.
Hawkwind
A case study in (re)building
a distributed system in Scala.
Requirements
‣   Search for people by name, username, eventually
    by other attributes.
‣   Order the results some sensible way (ex: by
    number of followers).
‣   Offer suggestions for misspellings/alternate names.
‣   Handle case-folding and other text normalization
    concerns on the query string.
‣   Return results in about a second, preferably less.
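The normalization requirement can be sketched as follows. This is a minimal illustration, not Hawkwind's code: `QueryNormalizer` and `normalize` are hypothetical names, and the exact rules (lowercasing, stripping accents, turning punctuation into spaces) are assumptions about what "text normalization" might cover.

```scala
import java.text.Normalizer
import java.util.Locale

object QueryNormalizer {
  // Hypothetical helper, not from the talk: fold case, strip accents,
  // and collapse punctuation and whitespace in a query string.
  def normalize(raw: String): String = {
    // NFD splits an accented letter into a base letter plus combining marks.
    val decomposed = Normalizer.normalize(raw, Normalizer.Form.NFD)
    decomposed
      .replaceAll("\\p{M}+", "")              // drop the combining marks
      .toLowerCase(Locale.ROOT)               // locale-independent lowercasing
      .replaceAll("[^\\p{L}\\p{N}_]+", " ")   // runs of anything else become one space
      .trim
  }
}
```

Lowercasing is only an approximation of full Unicode case folding, which is one reason this stays a sketch.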
Finding People on Twitter
[Screenshots: Twitter people search, annotated in turn with "results", "suggestion", and "speedy!"]
First Attempt: acts_as_solr
‣   Crunched on time, so we wanted the fastest
    route to working user search.
‣   Uses the Solr distribution/platform from Apache
    Lucene.
‣   Tries to make Rails integration straightforward
    and idiomatic.
‣   Easy to get running, hard to operationalize.
In the Interim: A Move to SOA
‣   Stopped thinking of our architecture as just a
    Rails app and the components that orbit it.
‣   Started building isolated services that
    communicate with the rest of the system via
    Thrift (an RPC and server framework).
‣   Gives us the freedom to change the underlying
    implementation of a service without modifying the
    rest of the system.
Thrift Example
   struct Results {
     1: list<i64> people
     2: string suggestion
     3: i32 processingTime /* milliseconds */
     4: list<i32> timings
     5: i32 totalResults
   }

   service NameSearch {
     Results find(1: string name, 2: i32 maxResults, 3: bool wantSuggestion)

     Results find_with_ranking(1: string name, 2: i32 maxResults,
                               3: bool wantSuggestion, 4: Ranker ranking)
   }
Second Attempt: Hawkwind 1
‣   A quick (three weeks) bespoke Scala project to
    “stop the bleeding”.
‣   Vertically but not horizontally scalable: no
    sharding, no failover, machine-level redundancy.
‣   Ran into memory and disk space limits.
‣   Reused Java code but didn’t offer nice Scala
    wrappers or rewrites.
‣   Still, planned to grow 10x, grew 25x!
Goals for Hawkwind 2
‣   Horizontally scalable: sharded corpus,
    replication of shards, easy to grow the service.
‣   Faster.
‣   Higher-quality results.
‣   Better use of Scala (language features,
    programming style).
‣   Maintainable code base, make it easy to add
    features.
High-Level Concepts
‣   Shards: pieces of the user corpus.
‣   Replicas: copies of shards.
‣   Document Servers.
‣   Merge Servers.
‣   Every machine gets the same code, can be
    either a Document Server or a Merge Server.
Hawkwind 2 High-Level Architecture

                                   Internet
                                      |
                       queries for users, API requests
                                      |
                                Rails Cluster
                                      |
                  Thrift call to semi-random Merge Server
                                      |
              Merge Server      Merge Server      Merge Server
                                      |
           Thrift calls to semi-random replica of each shard
                                      |
   Shard 1      Shard 1      Shard 2      Shard 2      Shard 3      Shard 3
  Doc Server   Doc Server   Doc Server   Doc Server   Doc Server   Doc Server
                                      |
               periodic deliveries of sharded user corpus
                                      |
                               Hadoop (HBase)
Taking Care of Data
‣   A Hadoop job gathers up the user data and slices it
    into shards.
‣   A cron job fetches these data dumps several times
    per day.
‣   To load a new corpus on a Document Server, simply
    restart the process.
‣   Redundancy and staggered scheduling keep the
    system from running too hot while restarts are in
    progress.
What a Document Server does
‣   On startup, load Thrift-serialized User objects.
‣   Populate an Inverted Index, Map, and Trie with
    normalized attributes of those User objects.
‣   Once ready, listen for queries.
‣   Answering a query basically means looking
    stuff up in those pre-populated data structures.
‣   Maintains a connection pool for Thrift requests,
    wrapping org.apache.commons.pool.
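A rough sketch of the lookup structures described above, under stated assumptions: `User`, `Index`, and `find` are hypothetical stand-ins (the real User objects are Thrift-generated), tokenization is a plain lowercase-and-split, and the Trie is left out for brevity. It builds a Map and an inverted index once, then answers queries purely by lookup.

```scala
object DocumentIndex {
  // Hypothetical record; fields are assumptions for illustration.
  final case class User(id: Long, username: String, name: String, followers: Int)

  final class Index(users: Seq[User]) {
    // Map: user id -> User, used to turn matching ids back into records.
    private val byId: Map[Long, User] = users.map(u => u.id -> u).toMap

    // Inverted index: token -> ids of users whose name or username has that token.
    private val postings: Map[String, Set[Long]] =
      users
        .flatMap(u => tokens(u.name + " " + u.username).map(t => t -> u.id))
        .groupBy(_._1)
        .map { case (token, pairs) => token -> pairs.map(_._2).toSet }

    // Every query token must match; results are ordered by follower count.
    def find(query: String, maxResults: Int): Seq[User] = {
      val perToken = tokens(query).map(t => postings.getOrElse(t, Set.empty[Long]))
      val ids = if (perToken.isEmpty) Set.empty[Long] else perToken.reduce(_ intersect _)
      ids.toSeq.map(byId).sortBy(u => -u.followers).take(maxResults)
    }
  }

  // Simplified tokenizer: lowercase, split on whitespace.
  private def tokens(s: String): Seq[String] =
    s.toLowerCase.split("\\s+").filter(_.nonEmpty).toSeq
}
```

Because everything is built at construction time, loading a new corpus means building a new `Index`, which matches the restart-to-reload approach described earlier.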
What a Merge Server does
‣   Gets queries.
‣   Fans out queries to Document Servers.
‣   Waits for queries to come back using a custom
    ParallelFuture class, which wraps a number of
    java.util.concurrent classes.
‣   Merges together the result sets, re-ranks them,
    and ships ‘em back to the requesting client.
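The fan-out and merge steps above might look roughly like this. The sketch swaps in the standard library's `scala.concurrent.Future` for the custom ParallelFuture class, and `Hit`, `ShardQuery`, and `query` are hypothetical names; each `ShardQuery` stands in for a Thrift call to one replica of one shard.

```scala
import scala.concurrent.{Await, ExecutionContext, Future}
import scala.concurrent.duration._
import scala.util.Try

object MergeSketch {
  // Hypothetical result row: a user id plus the score used for re-ranking.
  final case class Hit(userId: Long, score: Int)

  // Stand-in for a Thrift call to a Document Server.
  type ShardQuery = String => Seq[Hit]

  private implicit val ec: ExecutionContext = ExecutionContext.global

  def query(q: String, shards: Seq[ShardQuery], maxResults: Int, timeoutMillis: Long): Seq[Hit] = {
    // Fan out: one future per shard; a failing shard contributes no hits.
    val perShard: Seq[Future[Seq[Hit]]] =
      shards.map(shard => Future(shard(q)).recover { case _: Exception => Seq.empty[Hit] })

    // Wait for every shard, or give up with no results once the deadline passes.
    val all: Seq[Seq[Hit]] =
      Try(Await.result(Future.sequence(perShard), timeoutMillis.millis))
        .getOrElse(Seq.empty[Seq[Hit]])

    // Merge the per-shard result sets, re-rank, and truncate.
    all.flatten.sortBy(h => -h.score).take(maxResults)
  }
}
```

Returning nothing on timeout is a simplification; a production merge layer would more likely return whatever shards answered in time.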
How to model a distributed system?
‣   Literal decomposition: classes for all
    architectural components (Shard, Replica, etc.).
‣   Each component knows/does as little as
    possible.
‣   Isolate mutable state, test carefully.
‣   Cleanly delegate calls.
Literal Decomposition: Replica
case class Replica(shard: Shard, server: Server) {
  private val log = Logger.get
  val BACKEND_ERROR = Stats.getCounter("backend_timeout")

  // Run a query on this replica's server, timed under "replica-query".
  def query(q: Query): DocResults = w3c.time("replica-query") {
    server.thriftCall { client =>
      // logic goes here
    }
  }

  // Health check: ping the server over Thrift and return its answer.
  def ping(): Boolean = server.thriftCall { client =>
    log.debug("calling ping via thrift for %s", server)
    val rv = client.ping()
    log.debug("ping returned %s from %s", rv, server)
    rv
  }
}
Literal Decomposition: Server
case class Server(hostname: String, port: Int) {
  val pool = ConnectionPool(hostname, port)
  private val log = Logger.get

  // Borrow a client from the connection pool and apply f to it.
  def thriftCall[A](f: Client => A) = {
    log.debug("making thriftCall for server %s", this)
    pool.withClient { client => f(client) }
  }

  // Pair this server with the shard the ShardMap says it hosts.
  def replica: Replica = {
    Replica(ShardMap.serversToShards(this), this)
  }
}
Hawkwind 2 Query Call Graph

    MergeLayer.query
          |
    ShardMap.query                  <- what’s this?
          |
    shard.replicaManager ! query
          |
    shard.query
          |
    randomReplica()
          |
    replica.query
          |
    server.thriftCall
          |
    NameSearchDocumentLayerClient.find
ShardMap: Isolating Mutable State
‣   A singleton and an Actor.
‣   Contains a map from Servers to their
    corresponding Shards.
‣   Also contains a map from Shards to the Replicas
    of those shards.
‣   Responsible for populating and managing
    those maps.
‣   Send it a message to evict or reinsert a Replica.
‣   Fans out queries to Shards.
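One way to picture the "send it a message" interface is below. This is not an Actor: the sketch uses a lock around a single mutable field as a stand-in for an actor mailbox, and `Evict`, `Reinsert`, `send`, and `replicasOf` are hypothetical names. The point it illustrates is that all mutation goes through one narrow entry point.

```scala
object ShardMapSketch {
  final case class Server(hostname: String, port: Int)
  final case class Shard(id: Int)

  // Messages the map understands (names are assumptions).
  sealed trait Message
  final case class Evict(shard: Shard, server: Server) extends Message
  final case class Reinsert(shard: Shard, server: Server) extends Message

  final class ShardMap(initial: Map[Shard, Set[Server]]) {
    // The only mutable field; every access goes through this object's lock.
    private var live: Map[Shard, Set[Server]] = initial

    def send(msg: Message): Unit = synchronized {
      msg match {
        case Evict(shard, server) =>
          live = live.updated(shard, live.getOrElse(shard, Set.empty[Server]) - server)
        case Reinsert(shard, server) =>
          live = live.updated(shard, live.getOrElse(shard, Set.empty[Server]) + server)
      }
    }

    // Readers get an immutable snapshot, never the mutable field itself.
    def replicasOf(shard: Shard): Set[Server] = synchronized {
      live.getOrElse(shard, Set.empty[Server])
    }
  }
}
```

Keeping the stored values immutable means a caller holding an old snapshot can never observe a half-applied eviction.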
ReplicaHealthChecker
‣   Much like the ShardMap, a singleton and an
    Actor.
‣   Maintains mutable lists of unhealthy Replicas
    (“the penalty box”).
‣   Constantly checking to see if evicted Replicas
    are healthy again (back online).
‣   Sends messages to itself – an effective Actor
    technique.
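The penalty box idea can be sketched as a single sweep over the evicted replicas. Everything here is hypothetical: `PenaltyBox`, `evicted`, and `sweep` are illustrative names, `ping` stands in for the Thrift ping call, and `reinsert` stands in for a message to the ShardMap. The real checker is an Actor that messages itself to keep sweeping; this sketch leaves the scheduling to the caller.

```scala
import scala.util.Try

object HealthCheckSketch {
  final case class Replica(name: String)

  final class PenaltyBox(ping: Replica => Boolean, reinsert: Replica => Unit) {
    // Replicas currently considered unhealthy.
    private var unhealthy: Set[Replica] = Set.empty

    def evicted(r: Replica): Unit = synchronized { unhealthy += r }

    // One sweep: ping everything in the box, release whatever answers,
    // and return the replicas that are still unhealthy.
    def sweep(): Set[Replica] = synchronized {
      // A ping that throws counts as a failed ping.
      val recovered = unhealthy.filter(r => Try(ping(r)).getOrElse(false))
      recovered.foreach(reinsert)
      unhealthy = unhealthy -- recovered
      unhealthy
    }
  }
}
```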
Challenges, Large and Small
‣   Fast importing of huge serialized Thrift object
    dumps.
‣   Testing the ShardMap and ReplicaHealthChecker
    (mutable state wants to hurt you).
‣   Efficient accent normalization and filtering for
    special characters.
‣   Working with the Apache Commons object pool.
‣   Breaking out different ranking mechanisms in a
    clean, reusable way.
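For the last challenge, one possible shape is to treat each ranking mechanism as an `Ordering` behind a small trait. This is only a sketch: the talk does not show how the `Ranker` in the Thrift interface is defined, so `Candidate`, `ByFollowers`, and `ExactMatchFirst` are assumptions made up for illustration.

```scala
object RankingSketch {
  // Hypothetical candidate row; fields are assumptions.
  final case class Candidate(userId: Long, followers: Int, exactMatch: Boolean)

  // A ranker is an ordering over candidates, best first.
  trait Ranker { def ordering: Ordering[Candidate] }

  object ByFollowers extends Ranker {
    val ordering: Ordering[Candidate] = Ordering.by((c: Candidate) => -c.followers)
  }

  object ExactMatchFirst extends Ranker {
    // false sorts before true, so negating exactMatch puts exact matches first;
    // ties are broken by follower count.
    val ordering: Ordering[Candidate] =
      Ordering.by((c: Candidate) => (!c.exactMatch, -c.followers))
  }

  def rank(cs: Seq[Candidate], ranker: Ranker): Seq[Candidate] =
    cs.sorted(ranker.ordering)
}
```

New mechanisms then amount to new `Ranker` objects, with no change to the code that calls `rank`.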
Libraries & Tools
Things that make working in Scala
way more productive.
sbt – the Simple Build Tool
‣   Scala’s answer to Ant and Maven.
‣   Sets up new projects.
‣   Maintains project configuration, build tasks,
    and dependencies in pure Scala. Totally
    open-ended.
‣   Interactive console.
‣   Will run tasks as soon as files in your project
    change – automatically compile and run tests!
Ostrich
‣   Gather statistics about your application.
‣   Counters, gauges, and timings.
‣   Share stats via JMX, a plain-text socket, a web
    interface, or log files.
‣   Ex:
          Stats.time("foo") {
            timeConsumingOperation()
          }
Configgy
‣   Manages configuration files and logging.
‣   Flexible file format, can include files in other files.
‣   Inheritance, variable substitution.
‣   Tunable logging, logging with Scribe.
‣   Subscription API: push and validate
    configuration changes to running processes.
‣   Ex:
      val foo = config.getString("foo")
Specs + xrayspecs
 ‣   A behavior-driven development (BDD) testing
     framework for Scala.
 ‣   Elegant, readable, fun-to-write tests.
 ‣   Support for several mocking frameworks (we
     like Mockito).
 ‣   Test concurrent operations, time, much more.
 ‣   Ex:
"suggestion with a List of null does not blow up" in {
  MergeLayer.suggestion("steve", List(null)) mustEqual None
}
Questions?
Follow me at twitter.com/al3x
Learn with us at engineering.twitter.com
Work with us at jobs.twitter.com

Building Distributed Systems in Scala

  • 1. Building Distributed Systems in Scala A presentation to Emerging Technologies for the Enterprise April 8, 2010 – Philadelphia, PA TM
  • 2. About @al3x ‣ At Twitter since 2007 ‣ Working on the Web since 1995 ‣ Co-author of Programming Scala (O’Reilly, 2009) ‣ Into programming languages, distributed systems.
  • 3. About Twitter ‣ Social messaging – a new way to communicate ‣ Launched in mid-2006 ‣ Hit the mainstream in 2008 ‣ 50+ million tweets per day (600+ per second) ‣ Millions of users worldwide
  • 4. Technologies Used At Twitter Languages: ‣ Ruby, JavaScript ‣ Scala ‣ lil’ bit of C, Python, Java Frameworks: ‣ Rails ‣ jQuery Data Storage: ‣ MySQL ‣ Cassandra ‣ HBase (Hadoop) Misc.: ‣ memcached ‣ ZooKeeper ‣ Jetty ‣ so much more!
  • 5. Why Scala? ‣ A language that’s both fun and productive. ‣ Great performance (on par with Java). ‣ Object-oriented and functional programming, together. ‣ Ability to reuse existing Java libraries. ‣ Flexible concurrency (Actors, threads, events). ‣ A smart community with infectious momentum.
  • 6. Hawkwind A case study in (re)building a distributed system in Scala.
  • 7. Requirements ‣ Search for people by name, username, eventually by other attributes. ‣ Order the results some sensible way (ex: by number of followers). ‣ Offer suggestions for misspellings/alternate names. ‣ Handle case-folding and other text normalization concerns on the query string. ‣ Return results in about a second, preferably less.
  • 9. Finding People on Twitter results
  • 10. Finding People on Twitter suggestion results
  • 11. Finding People on Twitter speedy! suggestion results
  • 12. First Attempt: acts_as_solr ‣ Crunched on time, so we wanted the fastest route to working user search. ‣ Uses the Solr distribution/platform from Apache Lucene. ‣ Tries to make Rails integration straightforward and idiomatic. ‣ Easy to get running, hard to operationalize.
  • 13. In the Interim: A Move to SOA ‣ Stopped thinking of our architecture as just a Rails app and the components that orbit it. ‣ Started building isolated services that communicate with the rest of the system via Thrift (an RPC and server framework). ‣ Allows us freedom to change the underlying implementation of services without modifying the rest of the system.
  • 14. Thrift Example
      struct Results {
        1: list<i64> people
        2: string suggestion
        3: i32 processingTime /* milliseconds */
        4: list<i32> timings
        5: i32 totalResults
      }

      service NameSearch {
        Results find(1: string name, 2: i32 maxResults, 3: bool wantSuggestion)
        Results find_with_ranking(1: string name, 2: i32 maxResults,
                                  3: bool wantSuggestion, 4: Ranker ranking)
      }
  • 15. Second Attempt: Hawkwind 1 ‣ A quick (three weeks) bespoke Scala project to “stop the bleeding”. ‣ Vertically but not horizontally scalable: no sharding, no failover, machine-level redundancy. ‣ Ran into memory and disk space limits. ‣ Reused Java code but didn’t offer nice Scala wrappers or rewrites. ‣ Still, planned to grow 10x, grew 25x!
  • 16. Goals for Hawkwind 2 ‣ Horizontally scalable: sharded corpus, replication of shards, easy to grow the service. ‣ Faster. ‣ Higher-quality results. ‣ Better use of Scala (language features, programming style). ‣ Maintainable code base, make it easy to add features.
  • 17. High-Level Concepts ‣ Shards: pieces of the user corpus. ‣ Replicas: copies of shards. ‣ Document Servers. ‣ Merge Servers. ‣ Every machine gets the same code, can be either a Document Server or a Merge Server.
  • 18. Hawkwind 2 High-Level Architecture [diagram] Internet (queries for users, API requests) → Rails Cluster → Thrift call to semi-random Merge Server → three Merge Servers → Thrift calls to semi-random replica of each shard → six Doc Servers (two replicas each of Shard 1, Shard 2, and Shard 3). Hadoop (HBase) makes periodic deliveries of the sharded user corpus to the Doc Servers.
  • 19. Taking Care of Data ‣ A Hadoop job gathers up the user data and slices it into shards. ‣ A cron job fetches these data dumps several times per day. ‣ To load a new corpus on a Document Server, simply restart the process. ‣ Redundancy and staggered scheduling keep the system from running too hot while restarts are in progress.
  • 20. What a Document Server does ‣ On startup, load Thrift serialized User objects. ‣ Populate an Inverted Index, Map, and Trie with normalized attributes of those User objects. ‣ Once ready, listen for queries. ‣ Answering a query basically means looking stuff up in those pre-populated data structures. ‣ Maintains a connection pool for Thrift requests, wrapping org.apache.commons.pool.
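The lookup path slide 20 describes can be sketched as a small in-memory inverted index. This is only an illustration under stated assumptions: `User`, `NameIndex`, and their fields are hypothetical stand-ins for the Thrift-generated objects, the Map and Trie from the slide are left out, and ranking by follower count is borrowed from the requirements slide rather than from Hawkwind's actual code.

```scala
import java.util.Locale

// Hypothetical stand-in for the Thrift-serialized User object.
final case class User(id: Long, name: String, username: String, followers: Int)

// A toy index built once at startup, then only read from.
final class NameIndex(users: Seq[User]) {
  // Case-fold and split on whitespace; real normalization would do more.
  private def tokens(s: String): Seq[String] =
    s.toLowerCase(Locale.ROOT).split("\\s+").filter(_.nonEmpty).toSeq

  private val byId: Map[Long, User] = users.map(u => u.id -> u).toMap

  // Inverted index: normalized token -> IDs of users whose name or username contains it.
  private val inverted: Map[String, Set[Long]] =
    users
      .flatMap(u => (tokens(u.name) ++ tokens(u.username)).map(_ -> u.id))
      .groupBy(_._1)
      .map { case (token, pairs) => token -> pairs.map(_._2).toSet }

  // IDs of users matching every query token, most-followed first.
  def find(query: String, maxResults: Int): Seq[Long] = {
    val qs = tokens(query)
    if (qs.isEmpty) Seq.empty
    else
      qs.map(t => inverted.getOrElse(t, Set.empty[Long]))
        .reduce(_ intersect _)
        .toSeq
        .sortBy(id => (-byId(id).followers, id))
        .take(maxResults)
  }
}
```

Because the index is immutable after construction, answering a query needs no locking, which fits the "restart the process to load a new corpus" approach from slide 19.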
  • 21. What a Merge Server does ‣ Gets queries. ‣ Fans out queries to Document Servers. ‣ Waits for queries to come back using a custom ParallelFuture class, which wraps a number of java.util.concurrent classes. ‣ Merges together the result sets, re-ranks them, and ships ‘em back to the requesting client.
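The fan-out and merge steps on slide 21 might look roughly like the sketch below. It uses the standard library's `scala.concurrent.Future` instead of the custom `ParallelFuture` class the slide mentions, whose API is not shown in the deck. `Hit`, `MergeSketch`, and the function-per-shard signature are assumptions made for illustration.

```scala
import scala.concurrent.{Await, ExecutionContext, Future}
import scala.concurrent.duration.FiniteDuration

// Hypothetical result row: a user ID plus the score used for re-ranking.
final case class Hit(userId: Long, score: Int)

object MergeSketch {
  // Each shard is modeled as a plain function from query string to hits.
  def query(
      shards: Seq[String => Seq[Hit]],
      q: String,
      maxResults: Int,
      timeout: FiniteDuration
  )(implicit ec: ExecutionContext): Seq[Hit] = {
    // Fan out: start one asynchronous call per shard.
    val pending: Seq[Future[Seq[Hit]]] = shards.map(shard => Future(shard(q)))
    // Wait for every shard; throws if any call fails or the timeout passes.
    val all: Seq[Hit] = Await.result(Future.sequence(pending), timeout).flatten
    // Merge and re-rank: highest score first, ties broken by user ID.
    all.sortBy(h => (-h.score, h.userId)).take(maxResults)
  }
}
```

In this sketch one slow or failed shard fails the whole query. A production merge layer would more likely return partial results from the shards that did answer.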
  • 22. How to model a distributed system? ‣ Literal decomposition: classes for all architectural components (Shard, Replica, etc.). ‣ Each component knows/does as little as possible. ‣ Isolate mutable state, test carefully. ‣ Cleanly delegate calls.
  • 23. Literal Decomposition: Replica
      case class Replica(val shard: Shard, val server: Server) {
        private val log = Logger.get
        val BACKEND_ERROR = Stats.getCounter("backend_timeout")

        def query(q: Query): DocResults = w3c.time("replica-query") {
          server.thriftCall { client =>
            // logic goes here
          }
        }

        def ping(): Boolean = server.thriftCall { client =>
          log.debug("calling ping via thrift for %s", server)
          val rv = client.ping()
          log.debug("ping returned %s from %s", rv, server)
          rv
        }
      }
  • 24. Literal Decomposition: Server
      case class Server(val hostname: String, val port: Int) {
        val pool = ConnectionPool(hostname, port)
        private val log = Logger.get

        def thriftCall[A](f: Client => A) = {
          log.debug("making thriftCall for server %s", this)
          pool.withClient { client => f(client) }
        }

        def replica: Replica = {
          Replica(ShardMap.serversToShards(this), this)
        }
      }
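`thriftCall` and `pool.withClient` on slide 24 follow what is often called the loan pattern: the pool lends a client to a function and takes it back afterwards. A minimal sketch of that idea follows. `SimplePool` is a hypothetical toy built on `java.util.concurrent.ArrayBlockingQueue`, not the Apache Commons pool wrapper the deck refers to.

```scala
import java.util.concurrent.ArrayBlockingQueue

// A toy fixed-size pool of pre-built clients (hypothetical, for illustration only).
final class SimplePool[C](clients: Seq[C]) {
  private val idle = new ArrayBlockingQueue[C](math.max(1, clients.size))
  clients.foreach(c => idle.put(c))

  // Lend a client to f and always return it, even if f throws.
  def withClient[A](f: C => A): A = {
    val client = idle.take() // blocks until a client is free
    try f(client)
    finally idle.put(client)
  }

  // Number of clients currently not on loan.
  def available: Int = idle.size
}
```

The caller never holds a client outside the function it passes in, so a leaked connection would require deliberately storing the client somewhere else.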
  • 25. Hawkwind 2 Query Call Graph
      MergeLayer.query
      ShardMap.query
      shard.replicaManager ! query
      shard.query
      randomReplica()
      replica.query
      server.thriftCall
      NameSearchDocumentLayerClient.find
  • 26. Hawkwind 2 Query Call Graph (same graph, with a “what’s this?” callout placed near ShardMap.query)
  • 27. ShardMap: Isolating Mutable State ‣ A singleton and an Actor. ‣ Contains a map from Servers to their corresponding Shards. ‣ Also contains a map from Shards to the Replicas of those shards. ‣ Responsible for populating and managing those maps. ‣ Send it a message to evict or reinsert a Replica. ‣ Fans out queries to Shards.
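The idea on slide 27, keeping all mutable routing state in one place, can be sketched without actors. The version below is an assumption-heavy illustration: it guards one immutable map with `synchronized` rather than using an Actor as Hawkwind does, and `Shard`, `Replica`, and `ShardTable` are simplified hypothetical types, not the classes from the deck.

```scala
import scala.util.Random

// Hypothetical simplified types for illustration.
final case class Shard(id: Int)
final case class Replica(shard: Shard, host: String)

// The only mutable field is one reference to an immutable map.
final class ShardTable(initial: Map[Shard, Set[Replica]]) {
  private var live: Map[Shard, Set[Replica]] = initial

  // Take a replica out of rotation.
  def evict(r: Replica): Unit = synchronized {
    live = live.updated(r.shard, live.getOrElse(r.shard, Set.empty[Replica]) - r)
  }

  // Put a replica back into rotation.
  def reinsert(r: Replica): Unit = synchronized {
    live = live.updated(r.shard, live.getOrElse(r.shard, Set.empty[Replica]) + r)
  }

  // Pick any live replica of the shard, or None if all are evicted.
  def randomReplica(shard: Shard, rnd: Random = new Random): Option[Replica] = synchronized {
    val candidates = live.getOrElse(shard, Set.empty[Replica]).toVector
    if (candidates.isEmpty) None else Some(candidates(rnd.nextInt(candidates.size)))
  }
}
```

An actor gives the same guarantee differently: messages are handled one at a time, so evict and reinsert never interleave.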
  • 28. ReplicaHealthChecker ‣ Much like the ShardMap, a singleton and an Actor. ‣ Maintains mutable lists of unhealthy Replicas (“the penalty box”). ‣ Constantly checking to see if evicted Replicas are healthy again (back online). ‣ Sends messages to itself – an effective Actor technique.
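The "penalty box" from slide 28 can be approximated with a small class that remembers unhealthy replicas and re-admits them when a ping succeeds. This is a sketch under assumptions: `PenaltyBox`, its method names, and the use of plain strings for replicas are all hypothetical, and a timer or scheduler calling `recheck()` would stand in for the actor sending messages to itself.

```scala
// Tracks unhealthy replicas by name; ping is supplied by the caller.
final class PenaltyBox(ping: String => Boolean) {
  private var benched: Set[String] = Set.empty

  // Put a replica in the penalty box.
  def bench(replica: String): Unit = synchronized { benched += replica }

  def isBenched(replica: String): Boolean = synchronized { benched(replica) }

  // One health-check pass: ping every benched replica and release those that answer.
  def recheck(): Set[String] = {
    val snapshot = synchronized(benched)
    // Ping outside the lock so slow replicas do not block bench() callers.
    val recovered = snapshot.filter { r =>
      try ping(r) catch { case _: Exception => false }
    }
    synchronized { benched = benched -- recovered }
    recovered
  }
}
```

The set returned by `recheck()` is what a caller would feed back into the routing table to put those replicas in rotation again.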
  • 29. Challenges, Large and Small ‣ Fast importing of huge serialized Thrift object dumps. ‣ Testing the ShardMap and ReplicaHealthChecker (mutable state wants to hurt you). ‣ Efficient accent normalization and filtering for special characters. ‣ Working with the Apache Commons object pool. ‣ Breaking out different ranking mechanisms in a clean, reusable way.
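One common way to approach the accent normalization and special-character filtering listed on slide 29 is Unicode decomposition via `java.text.Normalizer`. The sketch below is not Hawkwind's implementation, and it makes no claim about efficiency; `QueryNormalizer` is a hypothetical name, and keeping underscores (since usernames may contain them) is an assumption.

```scala
import java.text.Normalizer
import java.util.Locale

object QueryNormalizer {
  def normalize(raw: String): String = {
    // NFD splits accented letters into a base letter plus combining marks.
    val decomposed = Normalizer.normalize(raw, Normalizer.Form.NFD)
    decomposed
      .replaceAll("\\p{M}+", "")               // drop the combining marks
      .replaceAll("[^\\p{L}\\p{N}_\\s]", " ")  // replace other special characters with spaces
      .trim
      .replaceAll("\\s+", " ")                 // collapse runs of whitespace
      .toLowerCase(Locale.ROOT)                // case-fold
  }
}
```

Applying the same function to both the indexed attributes and the incoming query string keeps the two sides comparable.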
  • 30. Libraries & Tools Things that make working in Scala way more productive.
  • 31. sbt – the Simple Build Tool ‣ Scala’s answer to Ant and Maven. ‣ Sets up new projects. ‣ Maintains project configuration, build tasks, and dependencies in pure Scala. Totally open-ended. ‣ Interactive console. ‣ Will run tasks as soon as files in your project change – automatically compile and run tests!
  • 32. Ostrich ‣ Gather statistics about your application. ‣ Counters, gauges, and timings. ‣ Share stats via JMX, a plain-text socket, a web interface, or log files. ‣ Ex: Stats.time("foo") { timeConsumingOperation() }
  • 33. Configgy ‣ Manages configuration files and logging. ‣ Flexible file format, can include files in other files. ‣ Inheritance, variable substitution. ‣ Tunable logging, logging with Scribe. ‣ Subscription API: push and validate configuration changes to running processes. ‣ Ex: val foo = config.getString("foo")
  • 34. Specs + xrayspecs ‣ A behavior-driven development (BDD) testing framework for Scala. ‣ Elegant, readable, fun-to-write tests. ‣ Support for several mocking frameworks (we like Mockito). ‣ Test concurrent operations, time, much more. ‣ Ex: "suggestion with a List of null does not blow up" in { MergeLayer.suggestion("steve", List(null)) mustEqual None }
  • 35. Questions? Follow me at twitter.com/al3x Learn with us at engineering.twitter.com Work with us at jobs.twitter.com

Editor's Notes

  1. This is literally all there is to this class!