The
Automation
  Factory


    nathan@milford.io
    blog.milford.io
    twitter.com/NathanMilford
    github.com/nmilford
This is NOT strictly a
  Cassandra talk.



  ♫ There's no earthly way of knowing ♫
This is an infrastructure talk.




       ♫ How your infrastructure's growing. ♫
Startups move fast.

        Priorities change.

Infrastructure needs to be able to
            pivot, too.


        ♫ Who knows where business is going.
         Or which way the data's flowing. ♫
When you scale up,

so do your problems.




     ♫ Drives imploding?
      IO plateauing? ♫
Not to mention unexpected
           disasters.




We lost a whole data center during
         Hurricane Sandy.
          ♫ Is a hurricane a'blowing? ♫
How do you keep up with growth?




       ♫ There's no earthly way of knowing ♫
How do you deal with failure?




      ♫ Are the status LEDs a'glowing?
       Is the server reaper mowing? ♫
How do you deal with too much
          success?




      ♫ Yes! The danger must be growing
       For the data keeps on flowing. ♫
What do you do?




♫ And they're certainly not showing
 any signs that they are slowing! ♫
Hold your breath.

  Make a wish.

 Automate!
♫ Come with me
            And you'll be
In a world of systems automation ♫
♫ Take a look
            And you’ll see
      Into my Chef lucubrations

        So log in, install, begin
With the Chef cookbook of my creation
    What you'll see might require
            Explanation ♫
♫ If you want to view paradise
 Simply go to GitHub and view it
 Pull requests welcome, go to it
    Want to change the code
       A merge will do it ♫




        https://github.com/linkedin/glu/
       https://github.com/octo/collectd/
       https://github.com/opscode/chef/
      https://github.com/saltstack/salt/
    https://github.com/outbrain/onering/
 https://github.com/nmilford/chef-cassandra/
https://github.com/rabbitmq/rabbitmq-server/
def discover_cassandra_schema
  require 'cassandra-cql'
  server = "#{node[:ipaddress]}:#{node[:Cassandra][:rpc_port]}"

  # Connect to the local node; give up quietly if it is not reachable.
  db = CassandraCQL::Database.new(server) rescue nil
  return nil unless db

  # Map each keyspace name to the names of its column families.
  schema = {}
  db.keyspaces.each do |ks|
    schema[ks.name] = ks.column_families.collect { |cfname, _cfobj| cfname }
  end

  # Drop the system and OpsCenter keyspaces.
  schema.delete("system")
  schema.delete("OpsCenter")
  schema
end

♫ There is no life I know
To compare with writing automation
Write it once
You’ll be free ♫
*clickity*

*clickity*

*clickity*

♫ To play Diablo 3 ♫
♫ If you want to scale past a petabyte
 Just install Chef, Salt and Graphite
If you want to sleep the whole night
         Automate the world
          It will be all right♫
♫ There is no life I know
To compare with writing automation
          Write it once
         You’ll be free ♫
♫ If you truly wish to be.♫
The
     Automation
       Factory
     A Journey from Bare Metal
     to Active Cassandra Node


Cassandra NYC 2011




http://www.slideshare.net/nmilford/cassandra-for-sysadmins
2 Years Later
●   80 billion impressions a month.

●   4 clusters for disparate
    use-cases, more in planning.

●   73 Cassandra nodes
    across 3 data centers.
Mo' Servers,
   Mo' Problems

We've got multiple cages of servers.


   So... yeah... you can see where
     automation might help :)
Automation Attack Plan




●   Provisioning!
●   Orchestration!
●   Config Management!
●   Command and Control!
●   Monitoring and Alerting!
Provisioning
●
    Started with Cobbler (which is Awesome!)
●
    High-performance infrastructures are snowflakes
    and can get out of hand fast.




●
    No existing tool worked completely, end to end,
    and the tool won't write itself.
We Built Our Own: Onering




Note: I am only a moderate Lord of the Rings fan, and the guy who did most of the work on it, Gary Hetzel, is a
Star Trek fan. We are not responsible for any LotR puns.
                                https://github.com/outbrain/onering/
Onering: Provisioning &
    Orchestration
       ●
           Initiates/manages provisioning
           and inventory.
       ●
           Acts as an orchestration layer in
           our automation.
       ●
           Keeps all metadata, which is
           searchable.
       ●
           Has a CLI tool and REST API to
           work with.
       ●
           Acts as our single point of truth
           & final authority on state.
Onering Provisioning Workflow
➔
    Developers put in machine requests by role for the quarterly order.
➔
    Machines show up, get racked and powered on.
➔
    Machines boot into the Razor microkernel and report to Onering.
➔
    Appropriate nodes get kickstarted & bootstrapped into the roles specified.
➔
    Additional nodes sit idle in 'allocatable' state.
➔
    Once the OS is installed, configuration is handed off to...
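The workflow above reads like a small state machine, so here is a minimal sketch of it in Ruby. This is not Onering's code or data model: only the 'allocatable' state is named in the talk, and every other state name, the `Machine` class and its methods are hypothetical, made up for illustration.

```ruby
# Hypothetical lifecycle states. Only 'allocatable' comes from the talk;
# the rest are illustrative names for the workflow steps.
TRANSITIONS = {
  'requested'    => ['racked'],
  'racked'       => ['discovered'],
  'discovered'   => ['provisioning', 'allocatable'],
  'allocatable'  => ['provisioning'],
  'provisioning' => ['installed'],
  'installed'    => []
}.freeze

# A machine tracked by an inventory system (hypothetical class).
class Machine
  attr_reader :name, :role, :state

  def initialize(name, role = nil)
    @name = name
    @role = role
    @state = 'requested'
  end

  # Move to next_state, refusing transitions the table does not allow.
  def advance(next_state)
    unless TRANSITIONS.fetch(@state).include?(next_state)
      raise ArgumentError, "#{@name}: cannot go from #{@state} to #{next_state}"
    end
    @state = next_state
    self
  end

  # After reporting in: nodes with a role get kickstarted, the rest sit idle.
  def after_discovery
    advance(role ? 'provisioning' : 'allocatable')
  end
end

node = Machine.new('cass1', 'cassandra')
node.advance('racked').advance('discovered').after_discovery
puts "#{node.name} is #{node.state}"
```

Keeping the allowed transitions in one table makes it cheap to reject impossible jumps, such as a machine that was never racked suddenly reporting as installed.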
Config Management: Chef
●
  Onering bootstraps into a Chef run.
●
  Chef installs all the system stuff.
●
  Chef sets up Java and tunes the OS how we like.
●
  Chef runs the Cassandra Cookbook.
include_recipe "java"

package "apache-cassandra1" do
  action :install
end

template "/etc/cassandra/conf/cassandra.yaml" do
  owner "cassandra"
  group "cassandra"
  mode "0755"
  source "cassandra.yaml.erb"
end



                        https://github.com/opscode/chef/
Cassandra Cookbook does it all!
                          ●
                              Builds/mounts disks.
                          ●
                              Handles multiple clusters,
                              different versions.
                          ●
                              Generates configs (in some
                              cases automatically based
                              on hardware profile).
                          ●
                              Connects to local instance
                              and gets the schema.
                          ●
                              Generates collectd config
                              and maintenance script.
                          ●
                              Schedules maintenance.
         https://github.com/nmilford/chef-cassandra
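As a rough illustration of generating config from a hardware profile, the sketch below derives JVM heap sizes from a node's memory and core count. It follows the heuristic in the stock cassandra-env.sh as I understand it (heap is the larger of "half of RAM, capped at 1 GB" and "a quarter of RAM, capped at 8 GB"; new generation is 100 MB per core, capped at a quarter of the heap). The helper name is hypothetical, and this is not necessarily what the chef-cassandra cookbook does.

```ruby
# Hypothetical helper, not taken from the chef-cassandra cookbook.
# Returns suggested heap sizes in MB for a node with the given hardware.
def cassandra_heap_sizes(total_mem_mb, cpu_cores)
  half     = [total_mem_mb / 2, 1024].min   # half of RAM, capped at 1 GB
  quarter  = [total_mem_mb / 4, 8192].min   # quarter of RAM, capped at 8 GB
  max_heap = [half, quarter].max

  # Young generation: 100 MB per core, but never more than a quarter of the heap.
  new_size = [100 * cpu_cores, max_heap / 4].min

  { max_heap_mb: max_heap, heap_newsize_mb: new_size }
end

sizes = cassandra_heap_sizes(32_768, 8)
puts "MAX_HEAP_SIZE=#{sizes[:max_heap_mb]}M HEAP_NEWSIZE=#{sizes[:heap_newsize_mb]}M"
```

In a cookbook, the two inputs would typically come from the node's hardware attributes, and the result would be passed to a template.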
Glu: Continuous Deployment
                ●   Not related to getting a C* node
                    to production, but it's how we get
                    apps there.
                ●   Built at LinkedIn.
                ●   Onering talks to it!
●
  Holds deployment metadata.
●
  Maven builds an RPM and dumps it to a repo.
●
  Glu-Agent yum installs it and performs checks.


                    https://github.com/linkedin/glu
Command & Control: Salt
Distributed commands:
salt '*ny*' cassandra.column_families
salt 'cass*' cassandra.compactionstats
salt '*stg*' cassandra.info
salt 'cass1.ny.*' cassandra.keyspaces
salt -E 'cass1-(stg|prod)' cassandra.netstats
salt '*' cassandra.tpstats

Scary commands:
salt '*' --batch-size 25% service.restart cassandra
salt '*' -b2 cmd.run 'nodetool -h $(hostname) -p 7199 snapshot'

We actually wrap Salt in Onering to provide AAA, as well as to allow Onering
metadata to be used for node targeting.

                              https://github.com/saltstack/salt
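What makes the "scary" commands survivable is batching: only a slice of the cluster is touched at a time. The sketch below shows the idea in Ruby under simplifying assumptions. The helper names are hypothetical, the rounding rule is my own choice rather than Salt's, and fixed slices are a simplification of how a real batch runner may schedule work.

```ruby
# Hypothetical helpers illustrating rolling batches; not Salt's implementation.

# Turn a spec such as "25%" or "2" into a batch size of at least one host.
def batch_size(total_hosts, spec)
  size = if spec.end_with?('%')
           (total_hosts * spec.to_f / 100).ceil   # rounding up is an assumption
         else
           spec.to_i
         end
  [size, 1].max
end

# Split the host list into consecutive batches of that size.
def batches(hosts, spec)
  hosts.each_slice(batch_size(hosts.size, spec)).to_a
end

hosts = (1..8).map { |i| "cass#{i}" }
batches(hosts, '25%').each_with_index do |group, i|
  puts "batch #{i + 1}: #{group.join(', ')}"
end
```

With a replicated Cassandra cluster, keeping each batch small relative to the replication factor is what lets a rolling restart proceed without taking data offline.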
Monitoring




 Is Hard...
Common Monitoring & Events Bus
●
    A single infrastructure-wide bus for systems
    data:
    –   Metrics
    –   Events
    –   Metadata
●
    Collectd as systems agent.
●
    RabbitMQ as message bus.
●
    Graphite as metrics endpoint.
●
    Working on an events mechanism.
●
    Each layer should be interchangeable.
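To make the bus concrete, here is a minimal sketch of what a single metric looks like by the time it reaches Graphite, using carbon's plaintext line format (`path value timestamp`). The helper names are hypothetical, and the `site` path segment is an assumption: the render URL later in the deck only shows a wildcard in that position.

```ruby
# Hypothetical helpers; the path layout is inferred from the render URL
# in this deck, and the meaning of the site segment is an assumption.

# Build a dotted metric path. Dots in the hostname would otherwise be
# read as path separators, so they are replaced with underscores.
def metric_path(site, host, *parts)
  (['collectd', 'machines', site, host.tr('.', '_')] + parts).join('.')
end

# Format one metric in Graphite's plaintext format: "path value timestamp\n".
def graphite_line(path, value, time = Time.now)
  "#{path} #{value} #{time.to_i}\n"
end

path = metric_path('ny', 'cass2-01.example.com', 'GenericJMX', 'ReadStage', 'PendingTasks')
print graphite_line(path, 3)
```

Because the payload is a plain line of text, each layer (agent, bus, endpoint) can be swapped out as long as the replacement produces or consumes the same format.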
Collectd
 ●
     Been around forever.
 ●
     Had to rebuild the JMX plugin to not use OpenJDK.
 ●
     Easy to write plugins and extend.
 ●
     Writes to RabbitMQ out of the box.
 ●
     Easy to templatize config for Chef.
<% @node[:Cassandra][:Keyspaces].each do |ks, cfs| -%>
<%   cfs.each do |cf| -%>
       Collect "<%= ks %>.<%= cf %>"
       Collect "KeyCache.<%= ks %>.<%= cf %>"
       Collect "RowCache.<%= ks %>.<%= cf %>"
<%   end -%>
<% end -%>
                        https://github.com/octo/collectd
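To see what a template like this produces, the sketch below renders an equivalent one outside of Chef with Ruby's standard ERB library. It assumes the schema is a hash of keyspace name to column family names, as returned by the discovery function earlier in the deck. The keyspace and column family names are made up, and a local variable stands in for the Chef `@node` attribute.

```ruby
require 'erb'

# Stand-alone copy of the template, using a local `keyspaces` variable
# instead of Chef's @node attributes (an adaptation for this sketch).
COLLECT_TEMPLATE = <<~'EOT'
  <% keyspaces.each do |ks, cfs| -%>
  <%   cfs.each do |cf| -%>
  Collect "<%= ks %>.<%= cf %>"
  Collect "KeyCache.<%= ks %>.<%= cf %>"
  Collect "RowCache.<%= ks %>.<%= cf %>"
  <%   end -%>
  <% end -%>
EOT

# Render the Collect lines for a { keyspace => [column families] } hash.
def render_collect_lines(keyspaces)
  ERB.new(COLLECT_TEMPLATE, trim_mode: '-').result_with_hash(keyspaces: keyspaces)
end

# Example data is invented for illustration.
puts render_collect_lines('Metrics' => %w[hits clicks])
```

Each column family yields three Collect lines, so a new column family shows up in monitoring on the next Chef run without anyone editing the collectd config by hand.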
RabbitMQ
●
    Lots of apps support AMQP.
●
    Shovel plugin for multi-site.
●
    Pretty stable.
●
    I'm not mad at it.




                https://github.com/rabbitmq/rabbitmq-server
Graphite




●
    Plays well with RabbitMQ.
●
    Easy to get metrics into.
●
    Scads of functions.
●
    Easy to get meaningful data out of.


                      https://launchpad.net/graphite
Graphite Render, Activate!
http://graphite/render?
width=800
&height=600
&from=-2hours
&until=now
&target=sortByMaxima(highestCurrent(collectd.machines
.*.cass2*.GenericJMX.ReadStage.PendingTasks,5))
&target=sortByMaxima(highestCurrent(collectd.machines
.*.cass2*.GenericJMX.MutationStage.PendingTasks,5))
&hideLegend=false
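A URL like the one above is tedious to assemble by hand, so here is a minimal sketch that builds it with Ruby's standard library. The function name and its defaults are hypothetical; the parameter names (`width`, `height`, `from`, `until`, `target`, `hideLegend`) are the ones shown on the slide.

```ruby
require 'uri'

# Hypothetical helper: build a Graphite render URL for one or more targets.
# The keyword is `to:` because `until` is a reserved word in Ruby.
def graphite_render_url(base, targets, from: '-2hours', to: 'now', width: 800, height: 600)
  params = [
    ['width', width.to_s],
    ['height', height.to_s],
    ['from', from],
    ['until', to]
  ]
  targets.each { |t| params << ['target', t] }   # one target parameter per series
  params << ['hideLegend', 'false']
  "#{base}/render?#{URI.encode_www_form(params)}"
end

stages = %w[ReadStage MutationStage]
targets = stages.map do |stage|
  "sortByMaxima(highestCurrent(collectd.machines.*.cass2*.GenericJMX.#{stage}.PendingTasks,5))"
end
puts graphite_render_url('http://graphite', targets)
```

Passing the parameters as a list of pairs rather than a hash is what allows `target` to appear more than once, and `URI.encode_www_form` takes care of escaping the parentheses and commas.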
Alerting: Nagios Self Serve
●
    Uses Onering for new node discovery.
●
    Developers add their own alerts based off of
    Graphite data.
●
    Ops get fewer alerts and are not a bottleneck.
●
    Devs are more engaged.
●
    Everyone is happy.
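A self-serve alert based on Graphite data can be very small. The sketch below is one way such a check might look, not the tooling described in the talk: it takes the JSON that Graphite's render API returns with `format=json`, picks the worst most-recent value, and maps it onto the standard Nagios plugin exit codes. The function names and thresholds are hypothetical, and fetching the JSON over HTTP is left out.

```ruby
require 'json'

# Standard Nagios plugin exit codes.
NAGIOS_EXIT = { ok: 0, warning: 1, critical: 2, unknown: 3 }.freeze

# From Graphite JSON ([{ "target": ..., "datapoints": [[value, ts], ...] }]),
# take each series' latest non-null value and return the highest one.
def worst_latest_value(render_json)
  JSON.parse(render_json).map { |series|
    series['datapoints'].map(&:first).compact.last
  }.compact.max
end

# Compare a value against warning and critical thresholds.
def check_threshold(value, warn, crit)
  return :unknown if value.nil?
  return :critical if value >= crit
  return :warning if value >= warn
  :ok
end

# Sample payload, invented for illustration.
sample = '[{"target":"cass1","datapoints":[[1,10],[null,20]]},' \
         '{"target":"cass2","datapoints":[[7,10],[12,20]]}]'
status = check_threshold(worst_latest_value(sample), 10, 50)
puts "PendingTasks #{status.to_s.upcase} (exit #{NAGIOS_EXIT[status]})"
```

A real plugin would finish with `exit NAGIOS_EXIT[status]`; returning "unknown" when there is no data keeps a broken metrics pipeline from looking like a healthy service.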
Questions?

SQL Database Design For Developers at php[tek] 2024SQL Database Design For Developers at php[tek] 2024
SQL Database Design For Developers at php[tek] 2024
 
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
 
08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men
 
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
 
Understanding the Laravel MVC Architecture
Understanding the Laravel MVC ArchitectureUnderstanding the Laravel MVC Architecture
Understanding the Laravel MVC Architecture
 
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
 
08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men
 
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
 
Vulnerability_Management_GRC_by Sohang Sengupta.pptx
Vulnerability_Management_GRC_by Sohang Sengupta.pptxVulnerability_Management_GRC_by Sohang Sengupta.pptx
Vulnerability_Management_GRC_by Sohang Sengupta.pptx
 
Key Features Of Token Development (1).pptx
Key  Features Of Token  Development (1).pptxKey  Features Of Token  Development (1).pptx
Key Features Of Token Development (1).pptx
 
Unblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen FramesUnblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen Frames
 
AI as an Interface for Commercial Buildings
AI as an Interface for Commercial BuildingsAI as an Interface for Commercial Buildings
AI as an Interface for Commercial Buildings
 
How to convert PDF to text with Nanonets
How to convert PDF to text with NanonetsHow to convert PDF to text with Nanonets
How to convert PDF to text with Nanonets
 
Enhancing Worker Digital Experience: A Hands-on Workshop for Partners
Enhancing Worker Digital Experience: A Hands-on Workshop for PartnersEnhancing Worker Digital Experience: A Hands-on Workshop for Partners
Enhancing Worker Digital Experience: A Hands-on Workshop for Partners
 
The Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptxThe Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptx
 
Slack Application Development 101 Slides
Slack Application Development 101 SlidesSlack Application Development 101 Slides
Slack Application Development 101 Slides
 

The Automation Factory

  • 1. The Automation Factory nathan@milford.io blog.milford.io twitter.com/NathanMilford github.com/nmilford
  • 2. This is NOT strictly a Cassandra talk. ♫ There's no earthly way of knowing ♫
  • 3. This is an infrastructure talk. ♫ How your infrastructure's growing. ♫
  • 4. Startups move fast. Priorities change. Infrastructure needs to be able to pivot, too. ♫ Who knows where business is going. Or which way the data's flowing. ♫
  • 5. When you scale up, so do your problems. ♫ Drives imploding? IO plateauing? ♫
  • 6. Not to mention unexpected disasters. We lost a whole data center during Hurricane Sandy. ♫ Is a hurricane a'blowing? ♫
  • 7. How do you keep up with growth? ♫ There's no earthly way of knowing ♫
  • 8. How do you deal with failure? ♫ Are the status LEDs a'glowing? Is the server reaper mowing? ♫
  • 9. How do you deal with too much success? ♫ Yes! The danger must be growing For the data keeps on flowing. ♫
  • 10. What do you do? ♫ And they're certainly not showing any signs that they are slowing! ♫
  • 11. Hold your breath. Make a wish. Automate!
  • 12. ♫ Come with me And you'll be In a world of systems automation ♫
  • 13. ♫ Take a look And you’ll see Into my Chef lucubrations So login, Install, begin With the Chef cookbook of my creation What you'll see might require Explanation ♫
  • 14. ♫ If you want to view paradise Simply go to GitHub and view it Pull requests welcome, go to it Want to change the code A merge will do it ♫
        https://github.com/linkedin/glu/
        https://github.com/octo/collectd/
        https://github.com/opscode/chef/
        https://github.com/saltstack/salt/
        https://github.com/outbrain/onering/
        https://github.com/nmilford/chef-cassandra/
        https://github.com/rabbitmq/rabbitmq-server/
  • 15. def discover_cassandra_schema
          require 'cassandra-cql'
          schema = {}
          server = "#{node[:ipaddress]}:#{node[:Cassandra][:rpc_port]}"
          # Connect to the local node; db is nil if the connection fails.
          db = CassandraCQL::Database.new("#{server}") rescue nil
          if db
            # Map each keyspace name to the names of its column families.
            db.keyspaces.collect{|s| schema[s.name] = s.column_families.collect{|cfname, cfobj| cfname } }
            # Skip internal keyspaces.
            schema.delete("system")
            schema.delete("OpsCenter")
            return schema
          end
          return nil
        end
        ♫ There is no life I know To compare with writing automation Write it once You’ll be free ♫
  • 17. ♫ If you want to scale past a petabyte Just install Chef, Salt and Graphite If you want to sleep the whole night Automate the world It will be all right♫
  • 18. ♫ There is no life I know To compare with writing automation Write it once You’ll be free ♫
  • 19. ♫ If you truly wish to be.♫
  • 20. The Automation Factory A Journey from Bare Metal to Active Cassandra Node nathan@milford.io blog.milford.io twitter.com/NathanMilford github.com/nmilford
  • 22. 2 Years Later
        ● 80 billion impressions a month.
        ● 4 clusters for disparate use-cases, more in planning.
        ● 73 Cassandra nodes across 3 data centers.
  • 23. Mo' Servers, Mo' Problems We got multiple cages of servers. So... yeah... you can see where automation might help :)
  • 24. Automation Attack Plan
        ● Provisioning!
        ● Orchestration!
        ● Command and Control!
        ● Config Management!
        ● Monitoring and Alerting!
  • 25. Provisioning
        ● Started with Cobbler (which is awesome!).
        ● High-performance infrastructures are snowflakes and can get out of hand fast.
        ● No tool worked completely, end to end, and the tool won't write itself.
  • 26. We Built Our Own: Onering
        Note: I am only a moderate Lord of the Rings fan, and the guy who did most of the work on it, Gary Hetzel, is a Star Trek fan. We are not responsible for any LotR puns.
        https://github.com/outbrain/onering/
  • 27. Onering: Provisioning & Orchestration
        ● Initiates/manages provisioning and inventory.
        ● Acts as an orchestration layer in our automation.
        ● Keeps all metadata, which is searchable.
        ● Has a CLI tool and REST API to work with.
        ● Acts as our single point of truth & final authority on state.
  • 29. Onering Provisioning Workflow
        ➔ Developers put in machine requests by role for the quarterly order.
        ➔ Machines show up, get racked and powered on.
        ➔ Machines boot into the Razor microkernel and report to Onering.
        ➔ Appropriate nodes get kickstarted & bootstrapped into the roles specified.
        ➔ Additional nodes sit idle in 'allocatable' state.
        ➔ Once the OS is installed, configuration is handed off to...
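The workflow above can be read as a small state machine. The following is a minimal sketch of that idea, not Onering's actual model: only 'allocatable' is a state name from the slides, and every other state name, the `Machine` struct, and the `after_discovery` helper are hypothetical.

```ruby
# Hypothetical lifecycle states; only 'allocatable' is named in the slides.
TRANSITIONS = {
  'racked'       => %w[discovered],
  'discovered'   => %w[provisioning allocatable],
  'allocatable'  => %w[provisioning],
  'provisioning' => %w[installed],
  'installed'    => %w[configured], # hand-off to config management
  'configured'   => []
}.freeze

# A machine with an optional role and a current lifecycle state.
Machine = Struct.new(:name, :role, :state) do
  # Move to a new state, refusing transitions the table does not list.
  def advance(to)
    unless TRANSITIONS.fetch(state).include?(to)
      raise ArgumentError, "#{name}: #{state} -> #{to} is not allowed"
    end
    self.state = to
    self
  end
end

# Machines requested for a role get provisioned; the rest sit idle.
def after_discovery(machine)
  machine.role ? 'provisioning' : 'allocatable'
end

spare = Machine.new('node42', nil, 'racked')
spare.advance('discovered').advance(after_discovery(spare))
puts spare.state
```

Keeping the allowed transitions in one table makes it easy for a single service to act as the final authority on state, which is the role the slides give Onering.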
  • 30. Config Management: Chef
        ● Onering bootstraps into a Chef run.
        ● Chef installs all the system stuff.
        ● Chef sets up Java and tunes the OS how we like.
        ● Chef runs the Cassandra Cookbook.
        include_recipe "java"
        package "apache-cassandra1" do
          action :install
        end
        template "/etc/cassandra/conf/cassandra.yaml" do
          owner "cassandra"
          group "cassandra"
          mode "0755"
          source "cassandra.yaml.erb"
        end
        https://github.com/opscode/chef/
  • 31. Cassandra Cookbook does it all!
        ● Builds/mounts disks.
        ● Handles multiple clusters, different versions.
        ● Generates configs (in some cases automatically based on hardware profile).
        ● Connects to local instance and gets the schema.
        ● Generates collectd config and maintenance script.
        ● Schedules maintenance.
        https://github.com/nmilford/chef-cassandra
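As an illustration of generating config from a hardware profile, here is a minimal sketch that derives JVM heap sizes from RAM and core count. It borrows the heuristic from Cassandra's stock cassandra-env.sh. Whether the cookbook applies the same rule is an assumption, and `heap_sizes` is a hypothetical helper name.

```ruby
# Heap sizing in the style of the stock cassandra-env.sh heuristic:
#   max heap = max(min(RAM/2, 1024 MB), min(RAM/4, 8192 MB))
#   new size = min(100 MB per core, max heap / 4)
def heap_sizes(ram_mb, cores)
  half     = [ram_mb / 2, 1024].min
  quarter  = [ram_mb / 4, 8192].min
  max_heap = [half, quarter].max
  new_size = [100 * cores, max_heap / 4].min
  { max_heap_mb: max_heap, heap_newsize_mb: new_size }
end

puts heap_sizes(16_384, 8).inspect
```

In a Chef recipe the two inputs would typically come from node attributes collected by Ohai, and the result would feed a template.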
  • 32. Glu: Continuous Deployment
        ● Not related to getting a C* node to production, but it's how we get apps there.
        ● Built at LinkedIn.
        ● Onering talks to it!
        ● Holds deployment metadata.
        ● Maven builds an RPM, dumps it to a repo.
        ● Glu-Agent yum installs it and performs checks.
        https://github.com/linkedin/glu
  • 35. Command & Control:
        Distributed commands:
        salt '*ny*' cassandra.column_families
        salt 'cass*' cassandra.compactionstats
        salt '*stg*' cassandra.info
        salt 'cass1.ny.*' cassandra.keyspaces
        salt -E 'cass1-(stg|prod)' cassandra.netstats
        salt '*' cassandra.tpstats
        Scary commands:
        salt '*' --batch-size 25% service.restart cassandra
        salt '*' -b2 cmd.run "nodetool -h $(hostname) -p 7199 snapshot"
        We actually wrap salt in Onering to provide AAA, as well as to allow use of Onering metadata for node targeting.
        https://github.com/saltstack/salt
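To make the targeting and batching in those commands concrete, here is a minimal sketch of how glob targets, regex targets and a batch size select nodes. It is plain Ruby that imitates the behaviour and does not call Salt. The minion IDs and helper names are hypothetical, the start-anchored regex match and the rounding of percentages are assumptions, and Salt itself works through nodes as a rolling window rather than the fixed slices used here.

```ruby
# Hypothetical minion IDs; real ones would come from Salt or Onering.
MINIONS = %w[
  cass1.ny.example.com cass2.ny.example.com
  cass1-stg.la.example.com cass1-prod.la.example.com
  web1.ny.example.com web2.la.example.com
  db1.ny.example.com db2.la.example.com
].freeze

# Glob targeting, as in: salt 'cass*' ...
def match_glob(ids, pattern)
  ids.select { |id| File.fnmatch(pattern, id) }
end

# Regex targeting, as in: salt -E 'cass1-(stg|prod)' ...
# Assumption: the pattern is anchored at the start of the ID.
def match_regex(ids, pattern)
  re = Regexp.new("\\A(?:#{pattern})")
  ids.select { |id| re.match?(id) }
end

# Turn a batch spec such as "25%" or "2" into a node count (at least 1).
def batch_count(total, spec)
  n = spec.end_with?('%') ? (total * spec.to_f / 100).floor : spec.to_i
  [n, 1].max
end

# Split the targeted nodes into groups that are handled one after another.
def batches(ids, spec)
  ids.each_slice(batch_count(ids.size, spec)).to_a
end

puts match_glob(MINIONS, '*ny*').inspect
puts match_regex(MINIONS, 'cass1-(stg|prod)').inspect
puts batches(MINIONS, '25%').inspect
```

Batching is what makes the "scary" commands tolerable: only a fraction of the cluster restarts at any one time.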
  • 37. Common Monitoring & Events Bus
        ● A single infrastructure-wide bus for systems data:
          – Metrics
          – Events
          – Metadata
        ● Collectd as systems agent.
        ● RabbitMQ as message bus.
        ● Graphite as metrics endpoint.
        ● Working on an events mechanism.
        ● Each layer should be interchangeable.
  • 38. Collectd
        ● Been around forever.
        ● Had to rebuild the JMX plugin to not use OpenJDK.
        ● Easy to write plugins and extend.
        ● Writes to RabbitMQ out of the box.
        ● Easy to templatize config for Chef.
        <% @node[:Cassandra][:Keyspaces].each do |ks| -%>
          <% ks[1].each do |cf| -%>
            Collect "<%= ks[0] %>.<%= cf %>"
            Collect "KeyCache.<%= ks[0] %>.<%= cf %>"
            Collect "RowCache.<%= ks[0] %>.<%= cf %>"
          <% end -%>
        <% end -%>
        https://github.com/octo/collectd
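The template from the Collectd slide can be tried outside Chef with Ruby's standard ERB library. This is a minimal sketch: the `TemplateContext` class and the sample keyspace and column family names are hypothetical, and the attribute hash simply mirrors the shape returned by `discover_cassandra_schema` (keyspace name mapped to column family names).

```ruby
require 'erb'

# The collectd template from the slide, written flush left.
TEMPLATE = <<~'EOT'
  <% @node[:Cassandra][:Keyspaces].each do |ks| -%>
  <% ks[1].each do |cf| -%>
  Collect "<%= ks[0] %>.<%= cf %>"
  Collect "KeyCache.<%= ks[0] %>.<%= cf %>"
  Collect "RowCache.<%= ks[0] %>.<%= cf %>"
  <% end -%>
  <% end -%>
EOT

# Stand-in for the Chef template context: it only exposes @node.
class TemplateContext
  def initialize(node)
    @node = node
  end

  # trim_mode '-' makes "-%>" swallow the newline that follows it.
  def render(template)
    ERB.new(template, trim_mode: '-').result(binding)
  end
end

# Hypothetical schema: one keyspace with two column families.
node = { Cassandra: { Keyspaces: { 'Metrics' => %w[events users] } } }
output = TemplateContext.new(node).render(TEMPLATE)
puts output
```

Each keyspace/column family pair yields three `Collect` lines, so the collectd config tracks the live schema without hand edits.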
  • 39. RabbitMQ
        ● Lots of apps support AMQP.
        ● Shovel plugin for multi-site.
        ● Pretty stable.
        ● I'm not mad at it.
        https://github.com/rabbitmq/rabbitmq-server
  • 40. Graphite
        ● Plays well with RabbitMQ.
        ● Easy to get metrics into.
        ● Scads of functions.
        ● Easy to get meaningful data out of.
        https://launchpad.net/graphite
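Part of why metrics are easy to get in is that Graphite's plaintext format is just `path value timestamp`, one metric per line. Below is a minimal sketch of building such lines. The path scheme (host, service, metric) and both helper names are hypothetical, and the slides do not say whether this exact format is what travels over RabbitMQ.

```ruby
# Build a dotted metric path, replacing characters (including the dots in
# host names) that would otherwise split or break the path.
def metric_path(*parts)
  parts.map { |part| part.to_s.gsub(/[^A-Za-z0-9_-]/, '_') }.join('.')
end

# One metric in Graphite's plaintext format: "path value timestamp\n".
def graphite_line(path, value, time = Time.now)
  "#{path} #{value} #{time.to_i}\n"
end

path = metric_path('cass1.ny', 'cassandra', 'pending_compactions')
print graphite_line(path, 12)
```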
  • 43. Alerting: Nagios Self Serve
        ● Uses Onering for new node discovery.
        ● Developers add their own alerts based off of Graphite data.
        ● Ops get fewer alerts and are not a bottleneck.
        ● Devs are more engaged.
        ● Everyone is happy.
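An alert based on Graphite data can be as small as a threshold check on the latest datapoint. The following is a minimal sketch, assuming the JSON shape returned by Graphite's render API (a list of series, each with `datapoints` as `[value, timestamp]` pairs) and the standard Nagios plugin exit codes. The thresholds, target name and helper names are hypothetical, and this is not the self-serve tooling from the talk.

```ruby
require 'json'

# Standard Nagios plugin exit codes.
EXIT_CODES = { ok: 0, warning: 1, critical: 2, unknown: 3 }.freeze

# Return the most recent non-null value of the first series, or nil.
def latest_value(render_json)
  series = JSON.parse(render_json).first
  return nil unless series
  point = series['datapoints'].reverse.find { |value, _ts| !value.nil? }
  point && point[0]
end

# Compare the latest value with the thresholds; higher is worse.
def check(render_json, warn:, crit:)
  value = latest_value(render_json)
  status =
    if value.nil? then :unknown
    elsif value >= crit then :critical
    elsif value >= warn then :warning
    else :ok
    end
  [EXIT_CODES[status], "#{status.to_s.upcase} - latest value: #{value.inspect}"]
end

# Hypothetical render output; a trailing null is common for the newest bucket.
sample = '[{"target":"cass1_ny.cassandra.pending_compactions",' \
         '"datapoints":[[10,1],[95,2],[null,3]]}]'
code, message = check(sample, warn: 80, crit: 90)
puts message
```

A real plugin would fetch the JSON over HTTP and finish with `exit code` so that Nagios picks up the status.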