Talkbits service architecture and deployment
by Aleksei Kornev
Get stuff done...
Typical application
Architecture of talkbits service
One way to configure service, logs, metrics.
One way to package and deploy service.
One way to launch service.
Bundled in one-jar.
Delivery
One delivery unit. Contains:
Java service
In a single executable fat-jar.
Installation script
[Re]installs service on the machine, registers it in /etc/init.d
Init.d script
Contains instructions to start, stop, restart JVM and get quick status.
Logging
Configuration
• SLF4J as the logging API; all other logging libraries are redirected to it
• Logback as the logging implementation
• Each service logs to /var/log/talkbits/... (application logs, GC logs)
• Daily rotation policy applied
• Logs are also sent to loggly.com for aggregation, grouping, etc.
Aggregation
• loggly.com
• sshfs for analyzing logs with Linux tools such as grep, tail, less, etc.
Aggregation alternatives
Splunk.com, Flume, Scribe, etc...
Metrics
Application metrics and health checks are implemented with the CodaHale
Metrics library (metrics.codahale.com), which reports metrics via JMX.
The Jolokia JVM agent (www.jolokia.org/agent/jvm.html) exposes JMX beans
via REST (JSON / HTTP), using the JVM's internal HTTP server.
The monitoring agent uses the Jolokia REST interface to fetch metrics and
sends them to the monitoring system.
All metrics are divided into common metrics (HW, JVM, etc.) and
service-specific metrics.
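A minimal sketch of how a monitoring agent might read one JMX attribute through Jolokia's HTTP read operation and sort it into one of the two metric groups. The base URL (Jolokia's default JVM-agent port 8778 on localhost), the set of "common" MBean domains, and all function names are assumptions for illustration, not the Talkbits agent itself.

```python
import json
from urllib.parse import quote
from urllib.request import urlopen

# Assumption: Jolokia JVM agent listening on its default port.
JOLOKIA_BASE = "http://localhost:8778/jolokia"

# Assumption: MBean domains treated as "common"; anything else is service-specific.
COMMON_DOMAINS = {"java.lang", "java.nio"}


def read_url(mbean, attribute, base=JOLOKIA_BASE):
    """Build a Jolokia GET 'read' URL for one MBean attribute."""
    # MBean names containing '/' need Jolokia's own escaping; not handled here.
    return "{}/read/{}/{}".format(base, quote(mbean, safe=":=,"), quote(attribute))


def parse_value(body):
    """Return the 'value' field of a Jolokia JSON response, or raise on error."""
    doc = json.loads(body)
    if doc.get("status") != 200:
        raise RuntimeError("Jolokia error: {}".format(doc.get("error")))
    return doc["value"]


def metric_group(mbean):
    """Classify an MBean as 'common' or 'service' by its JMX domain."""
    domain = mbean.split(":", 1)[0]
    return "common" if domain in COMMON_DOMAINS else "service"


def fetch(mbean, attribute, base=JOLOKIA_BASE):
    """Fetch one attribute over HTTP (requires a running Jolokia agent)."""
    with urlopen(read_url(mbean, attribute, base), timeout=5) as resp:
        return parse_value(resp.read().decode("utf-8"))
```

For example, `fetch("java.lang:type=Memory", "HeapMemoryUsage")` would return the heap usage values of a JVM that has the agent attached.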
Deployment
Fabric (http://fabfile.org) is used for
environment provisioning and
service deployment.
Process
• Fabric script provisions a new environment (or uses an existing one) according to the cluster scheme
• Amazon instances are automatically tagged with the list of services (i.e., instance roles)
• Fabric script reads the instance roles and deploys (or redeploys) the appropriate components.
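The role-driven step above can be sketched as a small lookup. This is an illustration only: it assumes the roles are stored in a single EC2 tag as a comma-separated string, and the role names and the role-to-component mapping are hypothetical. The actual Fabric tasks and EC2 calls are left out.

```python
# Assumption: hypothetical role -> components mapping; the real cluster scheme
# is not given in the slides.
COMPONENTS_BY_ROLE = {
    "api": ["talkbits-api"],
    "storage": ["cassandra"],
    "search": ["elasticsearch"],
}


def parse_roles(tag_value):
    """Split the value of the roles tag into a clean list of role names."""
    # Assumption: roles are comma-separated within one tag value.
    return [role.strip() for role in tag_value.split(",") if role.strip()]


def components_for(tag_value, mapping=COMPONENTS_BY_ROLE):
    """Return the components to deploy for an instance, in role order, no duplicates."""
    components = []
    for role in parse_roles(tag_value):
        for component in mapping.get(role, []):
            if component not in components:
                components.append(component)
    return components
```

A deployment task would then iterate over `components_for(tag_value)` for each instance and install or reinstall each component.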
Monitoring
As our monitoring platform we chose Datadog (Datadoghq.com). Datadog is a
SaaS product that is easy to integrate into your infrastructure. The Datadog
agent is open source and implemented in Python. There are many predefined
check sets (plugins, or integrations) for popular products out of the box,
including JVM, Cassandra, Zookeeper and ElasticSearch.
Datadog provides a REST API.
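As a rough sketch of what submitting a metric over that REST API could look like, the snippet below only builds the JSON body for a single gauge point. It assumes the v1 `series` endpoint and payload shape. The metric name, host and tags are hypothetical, and the HTTP call and API key handling are omitted, so the current Datadog API reference should be checked before relying on it.

```python
import json
import time

# Assumption: v1 metrics submission endpoint; authentication is not shown.
SERIES_URL = "https://api.datadoghq.com/api/v1/series"


def build_series_payload(metric, value, host, tags=(), timestamp=None):
    """Build the JSON-serializable body for one gauge data point."""
    ts = int(time.time()) if timestamp is None else int(timestamp)
    return {
        "series": [
            {
                "metric": metric,
                "points": [[ts, value]],  # list of [unix_timestamp, value] pairs
                "type": "gauge",
                "host": host,
                "tags": list(tags),
            }
        ]
    }


# Hypothetical metric name, host and tag, for illustration only.
body = json.dumps(build_series_payload("talkbits.api.requests", 42, "node-1", ["role:api"]))
```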
Alternatives
• Nagios, Zabbix: require a bearded admin on the team. We wanted to
go SaaS and outsource infrastructure as far as possible.
• Amazon CloudWatch, LogicMonitor, ManageEngine, etc.
Process
Each service on a machine has its own monitoring agent instance. If a
node has the 'monitoring-agent' role in the roles tag of its EC2 instance,
a monitoring agent will be installed for each service on this node.
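The rule above can be expressed as a short function. This is a sketch, assuming the roles tag has already been parsed into a list and that every role other than 'monitoring-agent' names a service. The function name is hypothetical.

```python
MONITORING_ROLE = "monitoring-agent"


def services_needing_agent(roles):
    """Return the services on a node that should each get a monitoring agent."""
    if MONITORING_ROLE not in roles:
        # No 'monitoring-agent' role on this node: install nothing.
        return []
    # Assumption: every other role corresponds to one service on the node.
    return [role for role in roles if role != MONITORING_ROLE]
```

A node tagged with `["api", "search", "monitoring-agent"]` would get two agent instances, one for `api` and one for `search`, while a node without the 'monitoring-agent' role gets none.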
Talkbits cluster structure
Q&A
Aleksei Kornev
aleksei.kornev@gmail.com
Max Alexejev
malexejev@gmail.com
