• Like
  • Save
Push jobs: an orchestration building block for private Chef
Upcoming SlideShare
Loading in...5
×
 

Push jobs: an orchestration building block for private Chef

on

  • 4,261 views

Push jobs is a new feature in Opscode Private Chef that will allow a user to run commands across hundreds of chef managed servers. Push Jobs leverages Erlang/OTP and ZeroMQ to provide scalable and ...

Push jobs is a new feature in Opscode Private Chef that will allow a user to run commands across hundreds of chef managed servers. Push Jobs leverages Erlang/OTP and ZeroMQ to provide scalable and fault tolerant execution.

In this talk I’ll cover the general motivation behind the design and an architectural overview of the system. This will include details of we used Erlang and ZeroMQ to build a robust, scalable system. I’ll also do a demo of the push job feature in action, covering the push jobs server, execution client and knife command line interface.

Statistics

Views

Total Views
4,261
Views on SlideShare
2,729
Embed Views
1,532

Actions

Likes
5
Downloads
14
Comments
0

5 Embeds 1,532

http://www.opscode.com 849
http://www.getchef.com 680
http://translate.googleusercontent.com 1
https://www.google.se 1
https://www.linkedin.com 1

Accessibility

Categories

Upload Details

Uploaded via as Adobe PDF

Usage Rights

© All Rights Reserved

Report content

Flagged as inappropriate Flag as inappropriate
Flag as inappropriate

Select your reason for flagging this presentation as inappropriate.

Cancel
  • Full Name Full Name Comment goes here.
    Are you sure you want to
    Your message goes here
    Processing…
Post Comment
Edit your comment

    Push jobs: an orchestration building block for private Chef Push jobs: an orchestration building block for private Chef Presentation Transcript

    • The Opscode Push Jobs ServiceMark AndersonApril 28, 2013
    • Push jobsin a command line• knife job start -quorum 90% chef-client --search role:webapp• Finds all nodes with role webapp• Submits a job with quorum of 90% to the pushy server.• Checks quorum• Starts job on available nodes• Gathers success and failures• And will do this for ten nodes...or a thousand
    • Push jobsWhy not use X?• We wanted to build a tool that could be deeply integrated into chef.• Integrated with authentication model• Clients use their client key to authenticate to the server• Users use their keys to send commands to the api• Integrated with the authorization model• Groups control access now• Eventually there will be fine grained ACLs• Integrated with search and other Chef features• Scalability
    • Push jobsServer• Erlang service• Extends the Chef REST API• Job creation and tracking• Push client configuration• Controls the clients via ZeroMQ• Heartbeating to track node availability• Command execution
    • Push jobsClient• Simple ruby client• Receives heartbeats from the server• Sends back heartbeats to the server• Executes commands• Configuration requirements are minimal• The client initiates all connections to the server• Most configuration is via chef API call using the client key• Opens ZeroMQ connections to server for all other communication
    • Push jobsThe lifecycle of a jobServerClientJobAcceptedSendCommandClientsACKWait forQuorumStart ExecClientsExecCollectResults
    • Push jobsKnife extension• All control for pushy jobs is via extensions to the chef API• Node status• Job control• start• stop• status• Job listing
    • Pushy Demo
    • Push JobsDemoChef/PushyServerChicoHarpoGrouchoGummoZeppo
    • The nitty gritty
    • Internals:Client server interaction• The client initiates all connections to the server• The client authenticates to the server and receives• A session key and TTL• ZeroMQ connection information (ports, heartbeat rate, etc)• Subscribes via ZeroMQ to server heartbeats (1 to many)• Connects via ZeroMQ to the server (1-1)• Sends heartbeats to the server as long as it receives server heartbeats• Awaits commands from the server
    • Security• Protocol security• We leverage the existing API signing mechanism to exchange session keys• All ZeroMQ messages are signed• HMAC SHA256 signing protocol protects point to point messages• RSA 2048/SHA1 protects broadcast messages (just like the chef API)• Relies on the SSL chain of trust to the server.
    • Access control• Access rights controlled by groups• ‘push_job_writers’ group controls job creation and deletion• ‘push_job_readers’ group controls read access to job status and results• Whitelist for commands• The client rejects commands that aren’t on the whitelist• In the future we’d like to do finer grained access control• Perhaps persistent job templates with their own access rights and commands
    • Implementation
    • Why erlang?• Chef 11 work has been very successful• Easier integration• The process (think threads) model allows a great deal of parallelism• Every node has an erlang process to track its state• Every job has a process to track its state
    • Server process structureMessageswitchHeartbeatgeneratorREST APIClientsJobMonitorJobMonitorJobMonitorJobMonitorJobMonitorClientMonitorClientMonitorClientMonitorClientMonitorClientMonitor
    • Why ZeroMQ?• Abstracts away much of the pain of socket libraries• Reliable delivery• Portable• Broad language support• Proven scalability
    • Some impedance mismatch• ZeroMQ provides a lot of goodness for asynchronous execution• That is really helpful in many languages• Erlang doesn’t need that so much, and encourages finer grained tasks• ZeroMQ hides a lot of interesting state• It turns out we care about whether a node dies and comes back• Need a better ZeroMQ/Erlang glue library• ZeroMQ 3 offers some very interesting options for future work
    • Performance and scalability results• We can run a job over 2000 nodes• 15 sec heartbeats• c1.medium• Bottlenecks• Heartbeats consume a lot of resources• Everything goes through router process for zeromq messages
    • Moving towards a federated architectureChef APISQLErchefSolr SQLPush ReportingSQLSQLAuth
    • Where are we now?
    • Availability: Limited release• Time to get people using it• Private Chef only for now• Hosted Chef deferred• Scalability: hosted has more than 2k nodes• Security: ZeroMQ messages aren’t encrypted• Open Source: Eventually
    • Future directions• Scalability improvements• Our biggest customers will want more• Jobs as first class, persistent objects• More control of execution of jobs,• Inside a job: (push jobs makes an excellent DDOS tool)• Between jobs: Chaining results between jobs• Push jobs as a building block• Conducting experiments using it for building block for continuous delivery
    • Questions?