Distributed & highly available server
applications in Java and Scala
Max Alexejev, Aleksei Kornev
JavaOne Moscow 2013
24 April 2013
What is talkbits?
Architecture
by Max Alexejev
Lightweight SOA
Key principles
● S1, S2 - edge services
● Each service is 0..1 servers
and 0..N clients built together
● No special "broker" services
● All services are stateless
● All instances are equal
What about state?
State is kept in specialized
distributed systems and fronted
by specific services.
Example follows...
Case study: Talkbits backend
Recursive call
Requirements for a distributed RPC system
Must have and nice to have
● Elastic and reliable discovery - should handle nodes brought up and
shut down transparently and not be a SPOF itself
● Support for N-N topology of client and server instances
● Disconnect detection and transparent reconnects
● Fault tolerance - for example, by retrying on the remaining instances when
the called instance goes down
● Built-in client backoff - i.e., clients should not overload servers
when load spikes, as far as possible
● Configurable load distribution - i.e., which server instance to call for
this specific request
● Configurable networking layer (keepalives & heartbeats, timeouts,
connection pools, etc.)
● Distributed tracing facilities
● Portability among different platforms
● Distributed stack traces for exceptions
● Transactions
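The fault-tolerance requirement above (retrying on the remaining instances when the called one goes down) can be sketched in a few lines of plain Java. This is only an illustration under stated assumptions: `FailoverCaller` and `callWithFailover` are hypothetical names, the instance list is assumed to come from some discovery mechanism, and a real RPC library would add timeouts, backoff and load balancing on top.

```java
import java.util.List;
import java.util.function.Function;

/** Hypothetical helper: tries a call against each known instance until one succeeds. */
final class FailoverCaller {
    private FailoverCaller() {}

    static <I, R> R callWithFailover(List<I> instances, Function<I, R> call) {
        RuntimeException last = null;
        for (I instance : instances) {
            try {
                return call.apply(instance);   // first successful instance wins
            } catch (RuntimeException e) {
                last = e;                      // remember the failure, try the next instance
            }
        }
        // Reached only when the list is empty or every instance failed.
        throw new IllegalStateException("all instances failed", last);
    }
}
```

The caller sees a single result or a single failure and does not need to know which instance answered.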
Key principles to be lightweight and get rid of architectural waste
● Java SE
● No containers. Even servlet containers are lightweight and embedded
● Standalone applications: unified configuration, deployment, metrics,
logging, single development framework - more on this later
● All launched instances are equal and process requests - no "special"
nodes or "active-standby" patterns
● Minimal dependencies and JAR size
● Minimal memory footprint
● One service - one purpose
● Highly tuned for this one purpose (app, JVM, OS, HW)
● Isolated fault domains - i.e., a single data source or external service is
fronted by one service only
No bloatware in technology stack!
"Lean" services
Talkbits implementation choices
Finagle library
(twitter.github.io/finagle) acts
as a distributed RPC
framework.
Services are written in Java
and Scala and use the Thrift
communication protocol.
Apache ZooKeeper (zookeeper.apache.org)
provides reliable service discovery mechanics. Finagle has a nice built-in
integration with ZooKeeper.
Finagle server: networking
Finagle is built on top of Netty - an asynchronous, event-driven network application framework with non-blocking I/O.
Finagle codec
trait Codec[Req, Rep]
class ThriftClientFramedCodec(...) extends Codec[ThriftClientRequest, Array[Byte]] {
  pipeline.addLast("thriftFrameCodec", new ThriftFrameCodec)
  pipeline.addLast("byteEncoder", new ThriftClientChannelBufferEncoder)
  pipeline.addLast("byteDecoder", new ThriftChannelBufferDecoder)
  ...
}
Finagle comes with ready-made codecs for
Thrift, HTTP, Memcache, Kestrel, HTTP streaming.
Finagle services and filters
// Service is simply a function from request to a future of response.
trait Service[Req, Rep] extends (Req => Future[Rep])
// Filter[A, B, C, D] converts a Service[C, D] to a Service[A, B].
abstract class Filter[-ReqIn, +RepOut, +ReqOut, -RepIn]
  extends ((ReqIn, Service[ReqOut, RepIn]) => Future[RepOut])
abstract class SimpleFilter[Req, Rep] extends Filter[Req, Rep, Req, Rep]
// Service transformation example
val serviceWithTimeout: Service[Req, Rep] =
  new RetryFilter[Req, Rep](..) andThen
  new TimeoutFilter[Req, Rep](..) andThen
  service
Finagle comes with
rate limiting, retries,
statistics, tracing,
uncaught exception
handling, timeouts and
more.
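The service and filter idea on this slide can be approximated in plain Java. The sketch below is not Finagle's API: it uses `java.util.concurrent.CompletableFuture` (Java 9+ for `orTimeout` and `failedFuture`) as a stand-in for Finagle's `Future`, and the names `Service`, `SimpleFilter`, `Filters.timeout` and `Filters.retry` are hypothetical, chosen only to mirror the Scala signatures above.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.TimeUnit;
import java.util.function.Function;

/** A service is a function from a request to a future of a response. */
interface Service<Req, Rep> extends Function<Req, CompletableFuture<Rep>> {}

/** A simple filter wraps a service and yields a service of the same type. */
interface SimpleFilter<Req, Rep> {
    CompletableFuture<Rep> apply(Req request, Service<Req, Rep> next);

    /** Places this filter in front of the given service. */
    default Service<Req, Rep> andThen(Service<Req, Rep> service) {
        return request -> apply(request, service);
    }
}

final class Filters {
    private Filters() {}

    /** Fails the response future if the wrapped service needs longer than millis. */
    static <Req, Rep> SimpleFilter<Req, Rep> timeout(long millis) {
        return (request, next) -> next.apply(request).orTimeout(millis, TimeUnit.MILLISECONDS);
    }

    /** Re-issues the request up to maxRetries more times when the response future fails. */
    static <Req, Rep> SimpleFilter<Req, Rep> retry(int maxRetries) {
        return (request, next) -> attempt(request, next, maxRetries);
    }

    private static <Req, Rep> CompletableFuture<Rep> attempt(
            Req request, Service<Req, Rep> next, int retriesLeft) {
        return next.apply(request)
            .<CompletableFuture<Rep>>handle((rep, err) -> {
                if (err == null) return CompletableFuture.completedFuture(rep);
                if (retriesLeft == 0) return CompletableFuture.failedFuture(err);
                return attempt(request, next, retriesLeft - 1);   // try again
            })
            .thenCompose(f -> f);   // flatten the nested future
    }
}
```

With these pieces, `Filters.<String, String>retry(2).andThen(Filters.<String, String>timeout(1000).andThen(service))` plays the role of `serviceWithTimeout`: each attempt gets its own timeout, and the retry filter wraps the whole thing.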
Functional composition
Given Future[A]
Sequential composition
def map[B](f: A => B): Future[B]
def flatMap[B](f: A => Future[B]): Future[B]
def rescue[B >: A](rescueException: PartialFunction[Throwable, Future[B]]): Future[B]
Concurrent composition
def collect[A](fs: Seq[Future[A]]): Future[Seq[A]]
def select[A](fs: Seq[Future[A]]): Future[(Try[A], Seq[Future[A]])]
And more
times(), whileDo() etc.
Functional composition on RPC calls
Sequential composition
val nearestChannel: Future[Channel] =
  metadataClient.getUserById(uuid) flatMap {
    user => geolocationClient.getNearestChannelId( user.getLocation() )
  } flatMap {
    channelId => metadataClient.getChannelById( channelId )
  }
Concurrent composition
val userF: Future[User] = metadataClient.getUserById(uuid)
val bitsCountF: Future[Integer] = metadataClient.getUserBitsCount(uuid)
val avatarsF: Future[List[Avatar]] = metadataClient.getUserAvatars(uuid)
// Future.join keeps each result's type; Future.collect would yield an untyped Seq here.
val (user, bitsCount, avatars) =
  Future.join(userF, bitsCountF, avatarsF).get()
*All this stuff works in Java just like in Scala, but does not look as cool.
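For comparison, here is roughly how the same two composition styles read in Java. This is a sketch under assumptions, not the Talkbits code: `CompletableFuture` stands in for Finagle's `Future`, and the client methods are hypothetical in-memory stubs that return made-up values immediately instead of performing RPC calls.

```java
import java.util.List;
import java.util.concurrent.CompletableFuture;

final class CompositionSketch {
    private CompositionSketch() {}

    // Hypothetical stand-ins for the RPC clients; each completes immediately.
    static CompletableFuture<String> getUserById(String uuid) {
        return CompletableFuture.completedFuture("user-" + uuid);
    }
    static CompletableFuture<Integer> getUserBitsCount(String uuid) {
        return CompletableFuture.completedFuture(42);
    }
    static CompletableFuture<List<String>> getUserAvatars(String uuid) {
        return CompletableFuture.completedFuture(List.of("avatar-1.png"));
    }
    static CompletableFuture<String> getNearestChannelId(String user) {
        return CompletableFuture.completedFuture("channel-near-" + user);
    }

    /** Sequential: the second call starts only after the first has completed. */
    static CompletableFuture<String> nearestChannelId(String uuid) {
        return getUserById(uuid).thenCompose(CompositionSketch::getNearestChannelId);
    }

    /** Concurrent: all three calls are issued up front, then combined into one result. */
    static CompletableFuture<String> profileSummary(String uuid) {
        CompletableFuture<String> userF = getUserById(uuid);
        CompletableFuture<Integer> bitsCountF = getUserBitsCount(uuid);
        CompletableFuture<List<String>> avatarsF = getUserAvatars(uuid);
        // join() does not block here because allOf completes only after all three are done.
        return CompletableFuture.allOf(userF, bitsCountF, avatarsF)
            .thenApply(done -> userF.join() + " has " + bitsCountF.join()
                + " bits and " + avatarsF.join().size() + " avatar(s)");
    }
}
```

`thenCompose` corresponds to `flatMap`, and `allOf` followed by `thenApply` covers the role of collecting several futures into one.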
Finagle server: threading model
To achieve high performance (throughput), never block the worker
threads.
For blocking IO or long computations, delegate to a FuturePool.
val diskIoFuturePool = FuturePool(Executors.newFixedThreadPool(4))
diskIoFuturePool { scala.io.Source.fromFile(..) }
Boss thread accepts new
client connections and
binds NIO Channel to a
specific worker thread.
Worker threads perform
all client IO.
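The same "never block the worker threads" rule can be illustrated without Finagle. The sketch below is a plain-Java analogue of the FuturePool idea, not Finagle's implementation: `BlockingPool` is a hypothetical name, and it simply hands blocking work to a dedicated fixed-size executor and returns a future to the caller.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.function.Supplier;

/** Runs blocking work on its own fixed-size pool so that I/O worker threads stay free. */
final class BlockingPool implements AutoCloseable {
    private final ExecutorService executor;

    BlockingPool(int threads) {
        this.executor = Executors.newFixedThreadPool(threads);
    }

    /** Schedules the blocking task on the pool and returns at once with a future. */
    <T> CompletableFuture<T> submit(Supplier<T> blockingTask) {
        return CompletableFuture.supplyAsync(blockingTask, executor);
    }

    /** Stops accepting new tasks; already submitted tasks still run to completion. */
    @Override
    public void close() {
        executor.shutdown();
    }
}
```

Sizing the pool separately from the worker threads keeps slow disk or database calls from starving the threads that serve client IO.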
More gifts and bonuses from Finagle
In addition to all said before, Finagle has
● Load distribution in N-N topologies - HeapBalancer ("least active
connections") by default
● Client backoff strategies - comes with TruncatedBinaryBackoff
implementation
● Failure detection
● Failover/Retry
● Connection Pooling
● Distributed Tracing (Zipkin project based on Google Dapper paper)
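To make the client backoff item concrete, here is a minimal sketch of truncated binary exponential backoff in general. It is not Finagle's TruncatedBinaryBackoff implementation; `TruncatedBackoff`, `ceilingMillis` and `nextDelayMillis` are hypothetical names, and positive `baseMillis` and `capMillis` values are assumed.

```java
import java.util.concurrent.ThreadLocalRandom;

/** Truncated binary exponential backoff: the delay ceiling doubles per attempt up to a cap. */
final class TruncatedBackoff {
    private TruncatedBackoff() {}

    /** Upper bound of the delay before the given retry attempt (0-based). */
    static long ceilingMillis(int attempt, long baseMillis, long capMillis) {
        long delay = baseMillis;
        // Stop doubling once the cap is reached, which also avoids overflow.
        for (int i = 0; i < attempt && delay < capMillis; i++) {
            delay *= 2;
        }
        return Math.min(delay, capMillis);
    }

    /** Random delay in [0, ceiling], so that many clients do not retry in lockstep. */
    static long nextDelayMillis(int attempt, long baseMillis, long capMillis) {
        long ceiling = ceilingMillis(attempt, baseMillis, capMillis);
        return ThreadLocalRandom.current().nextLong(ceiling + 1);
    }
}
```

With a base of 100 ms and a cap of 5000 ms, the ceilings grow as 100, 200, 400, 800 and so on until they stay at 5000.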
Finagle, Thrift & Java: lessons learned
Pros
● Gives a lot out of the box
● Production-proven and stable
● Active development community
● Lots of extension points in the library
Cons
● Good for Scala, usable with Java
● Works well with Thrift and HTTP (plus trivial protocols), but lacks
support for Protobuf and other stuff
● Poor exception-handling experience in Java (no Scala pattern matching)
and ugly code
● finagle-thrift is a pain (old libthrift version lock-in, Cassandra
dependency clashes, cannot return nulls, and more). All of these problems
are avoidable, though.
● Cluster scatters and never gathers when the whole ZooKeeper ensemble
is down.
Finagle: competitors & alternatives
Trending
● Akka 2.0 (Scala, OpenSource) by Typesafe
● ZeroRPC (Python & Node.js, OpenSource) by DotCloud
● RxJava (Java, OpenSource) by Netflix
Old
● JGroups (Java, OpenSource)
● JBoss Remoting (Java, OpenSource) by JBoss
● Spread Toolkit (C/C++, Commercial & OpenSource)
Configuration, deployment,
monitoring and logging
by Aleksei Kornev
Get stuff done...
Typical application
Architecture of talkbits service
One way to configure the service, logs and metrics.
One way to package and deploy the service.
One way to launch the service.
Bundled in one-jar.
Delivery
One delivery unit. Contains:
● Java service - in a single executable fat-jar.
● Installation script - [re]installs the service on the machine and
registers it in /etc/init.d
● Init.d script - contains instructions to start, stop and
restart the JVM and get a quick status.
Logging
Configuration
● SLF4J as the API; all other logging libraries are redirected to it
● Logback as a logging implementation
● Each service logs to /var/log/talkbits/... (application logs, GC logs)
● Daily rotation policy applied
● Also sent to loggly.com for aggregation, grouping etc.
Aggregation
● loggly.com
● sshfs for analyzing logs with Linux tools such as grep, tail, less,
etc.
Aggregation alternatives
Splunk.com, Flume, Scribe, etc...
Metrics
Application metrics and health checks are implemented with the Coda Hale Metrics
library (metrics.codahale.com), which reports metrics via JMX.
The Jolokia JVM agent (www.jolokia.org/agent/jvm.html) exposes JMX beans
via REST (JSON over HTTP), using the JVM's built-in HTTP server.
The monitoring agent uses the Jolokia REST interface to fetch metrics and sends
them to the monitoring system.
All metrics are divided into common metrics (HW, JVM, etc.) and
service-specific metrics.
Deployment
Fabric (http://fabfile.org) is used for
environment provisioning and
service deployment.
Process
● A Fabric script provisions a new environment
(or uses an existing one) according to the cluster
scheme
● Amazon instances are
automatically tagged with
a list of services (i.e., instance roles)
● The Fabric script reads the instance roles
and deploys (or redeploys) the
appropriate components.
Monitoring
As our monitoring platform we chose Datadog (datadoghq.com). Datadog is a SaaS
product that is easy to integrate into your infrastructure. The Datadog agent is
open source and implemented in Python. Many predefined
check sets (plugins, or integrations) for popular products are available out of the box,
including JVM, Cassandra, ZooKeeper and Elasticsearch.
Datadog provides a REST API.
Alternatives
● Nagios, Zabbix - these require a bearded admin on the team. We wanted to
go SaaS and outsource infrastructure as far as possible.
● Amazon CloudWatch, LogicMonitor, ManageEngine, etc.
Process
Each service has its own monitoring agent instance on a single machine. If a
node has the 'monitoring-agent' role in the roles tag of its EC2 instance, a
monitoring agent will be installed for each service on this node.
Talkbits cluster structure
QA
Max Alexejev
HTTP://RU.LINKEDIN.COM/PUB/MAX-ALEXEJEV/51/820/AB9
http://www.slideshare.net/MaxAlexejev/
MALEXEJEV@GMAIL.COM
Aleksei Kornev
aleksei.kornev@gmail.com
