What Big Data Folks Need to Know About DevOps

  • I’ve been a developer and system administrator for well over 10 years. In that time I’ve worked in a number of environments, from mom & pop startups to huge enterprise software shops. I’ve built fully automated infrastructures for internal and external use and hacked on everything in between. Now, I do training, services and evangelism for Opscode.
  • Why did you come today, and what do you hope to learn?
  • DevOps is more than just a buzzword. It’s Developers and Operations working together. That might sound obvious, but it’s not.
  • To quote Tim O’Reilly, DevOps is the ability to create and deploy reliable software to an unreliable platform that scales horizontally.
  • DevOps is a cultural movement in Development and Operations. It’s Agile realized at the business level, not just in Development. It’s about building trust between Dev and Ops. Development can’t throw code over the fence and expect it to just work anymore; they need to be responsible for performance (and get those guys pagers). Operations can’t justify “uptime” above the business; they need to work with Development to make sure the business is rolling out features. Put them in the same space; they’re on the same team.
  • Once you have people working together, you’ve got to trust them to get things done. Enable each member of your team to have a voice and you’ll get better results.
  • Back it up with metrics. Don’t just monitor for health, monitor for production. Once you’ve got numbers you can make steady change and understand your results.
  • You need to be thinking in terms of automating everything you can, so value can be derived from development and operations and you can get down to business instead of tweaking and tinkering. Hand-tuning a dozen machines should not be your business’ edge.
  • Your infrastructure is not a unique snowflake. With very few exceptions, there is no secret sauce in building servers. Let’s focus on deploying applications in a repeatable, continuous fashion. Infrastructure as Code means that you can tear down and replace your business from version control, data backups and bare metal resources. Want to run on Rackspace instead of EC2? Let’s do it in an hour instead of weeks. How are you going to make this happen?
  • At a high level, Chef is a Ruby library for managing infrastructure primitives. It is a systems integration platform built for scale.
  • Chef gives you the primitives to answer the questions: How do you want to model your data? How do you configure your systems and integrate them together? It also gives you an API you can use to work with your infrastructure.
  • Idempotent: multiple applications of an operation do not change the result.
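To make the idea concrete, here is a minimal plain-Ruby sketch of an idempotent operation. It is not Chef code: `ensure_line` is a hypothetical helper invented for this illustration, and the config lines are made up. The only point is that running the same operation a second time leaves the result unchanged.

```ruby
# Hypothetical helper (not part of Chef): make sure `line` is present in
# `lines`, appending it only when it is missing.
# Returns true when a change was made, false when nothing had to be done.
def ensure_line(lines, line)
  return false if lines.include?(line)

  lines << line
  true
end

config = ["Listen 80"]
first_run  = ensure_line(config, "KeepAlive On")
second_run = ensure_line(config, "KeepAlive On")

puts config.inspect
puts "first run changed something:  #{first_run}"
puts "second run changed something: #{second_run}"
```

The second call reports no change, which is the behavior you want from every step of a configuration run: it is safe to apply again and again.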
  • Data driven means
  • Most users start with the default configurations, because they’re field-tested and peer-reviewed.
  • Apache licensed, with well over 200 external contributors. A thriving and active user base.
  • There’s More Than One Way To Do It. It’s a Perl motto, but it holds true. We give you the tools; you decide how to work it.
  • Let’s talk about how Chef works.
  • The Chef Client is an agent executable that wraps the Chef libraries and uses them to configure your system.
  • The Chef Server is a publishing system. You store data on the server, and it provides an API to access and search the data.
  • We use CouchDB because it stores JSON and has a nice REST API.
  • Chef is open source, and we have a product called the Opscode Platform. It has the same API as the Open Source Chef Server.
  • A Node is an abstraction of a server. With the Chef Server, node state data is persisted between runs. The edge node does all the heavy lifting.
  • Attributes == data.
  • Roles are another abstraction: they describe a set of configuration functionality for nodes, such as webserver, loadbalancer, database master, etc.
  • Resources are an abstraction we feed data into. When you write recipes in Chef, you create resources for the things you want to configure.
  • Providers are the abstraction over the commands or API calls that will configure the resource to be in the state you have defined.
  • Actions are relevant to the provider: they are the commands or API calls made to configure the resource. Package resources can have many different providers.
  • Providers can be platform specific. Resources are mapped via the platform to the correct provider.
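As a rough illustration of that mapping, the sketch below picks a package provider from the node's platform. It is plain Ruby, not Chef's implementation: the lookup table, the method names `provider_for` and `install_command`, and the exact command strings are all assumptions chosen for the example.

```ruby
# Hypothetical platform-to-provider table for a "package" resource.
# Chef's real mapping covers many more platforms and providers.
PACKAGE_PROVIDERS = {
  "ubuntu"   => :apt,
  "debian"   => :apt,
  "centos"   => :yum,
  "redhat"   => :yum,
  "mac_os_x" => :macports,
  "gentoo"   => :portage
}.freeze

# Illustrative install command templates, one per provider.
INSTALL_COMMANDS = {
  apt:      "apt-get install -y %s",
  yum:      "yum install -y %s",
  macports: "port install %s",
  portage:  "emerge %s"
}.freeze

# Look up the provider for a platform, failing loudly if none is known.
def provider_for(platform)
  PACKAGE_PROVIDERS.fetch(platform) do
    raise ArgumentError, "no package provider for platform #{platform}"
  end
end

# Build the command the chosen provider would run to install a package.
def install_command(platform, package_name)
  format(INSTALL_COMMANDS.fetch(provider_for(platform)), package_name)
end

%w[ubuntu centos mac_os_x].each do |platform|
  puts "#{platform}: #{install_command(platform, 'apache2')}"
end
```

The resource declaration ("install apache2") stays the same on every platform; only the provider behind it changes.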
  • Order matters: both the order of resources in a recipe and the order of the recipes applied in run lists. Sidebar about the “Why Order Matters: Turing Equivalence in Automated Systems Administration” paper and RPM installation.
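A toy model can show what "in order" means for a recipe. The sketch below is plain Ruby and only imitates the look of the recipe DSL: `ToyRecipe` and `build_recipe` are hypothetical names, and the real resource collection in Chef holds resource objects rather than strings.

```ruby
# Toy stand-in for a recipe and its resource collection (not Chef's classes).
class ToyRecipe
  attr_reader :collection

  def initialize
    @collection = []
  end

  # Each DSL-style call records "type[name]" in the order it is evaluated.
  def package(name)
    @collection << "package[#{name}]"
  end

  def template(path)
    @collection << "template[#{path}]"
  end

  def service(name)
    @collection << "service[#{name}]"
  end
end

# Evaluate a block as if it were the body of a recipe file.
def build_recipe(&block)
  recipe = ToyRecipe.new
  recipe.instance_eval(&block)
  recipe
end

recipe = build_recipe do
  package "apache2"
  template "/etc/apache2/apache2.conf"
  service "apache2"
end

# The collection preserves declaration order, so the package is handled
# before the config file that depends on it.
recipe.collection.each_with_index { |res, i| puts "#{i + 1}. #{res}" }
```

Swapping the `package` and `template` lines would swap their positions in the collection, which is exactly why ordering in a recipe deserves attention.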
  • Cookbooks encapsulate all the components that recipes need to configure the infrastructure.
  • Cookbooks are a directory of code components. Recipes are the core libraries you use to configure something, but cookbooks can contain other libraries used in recipes.
  • Find and share cookbooks on cookbooks.opscode.com.
  • Data bags are bags, and items in the bags. Anyone play D&D, NWN, etc.? Bag of holding! Users, application information, network info, cabinet/rack locations: describe components of your infrastructure with data, and use that data to configure systems.
  • Freeform; this one describes a user.
  • You can use data bags in recipes!
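Here is a small plain-Ruby sketch of the pattern, runnable without a Chef Server. The data bag is faked with an in-memory array; the first item reuses fields from the user item shown in the slides, while the second item, `find_items` and `plan_for` are made up for illustration. In a real recipe you would use Chef's `search` and declare actual resources instead of building strings.

```ruby
# In-memory stand-in for a "users" data bag.
USERS_BAG = [
  { "id" => "jtimberman", "uid" => 7004, "groups" => "sysadmin", "shell" => "/usr/bin/zsh" },
  { "id" => "guest",      "uid" => 7005, "groups" => "visitor",  "shell" => "/bin/sh" }
].freeze

# Hypothetical replacement for a search query: exact match on one field.
def find_items(bag, field, value)
  bag.select { |item| item[field] == value }
end

# Describe the resources a recipe would declare for each matching user.
def plan_for(users)
  users.flat_map do |u|
    name = u["id"]
    home = "/home/#{name}"
    [
      "user[#{name}] uid=#{u['uid']} shell=#{u['shell']}",
      "directory[#{home}/.ssh] mode=0700",
      "template[#{home}/.ssh/authorized_keys] mode=0600"
    ]
  end
end

puts plan_for(find_items(USERS_BAG, "groups", "sysadmin"))
```

Adding a sysadmin then becomes a data change (a new item in the bag) rather than a code change in the recipe.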
  • Knife is the “swiss army knife” tool of Chef. It primarily works with the Chef Server API, but it can also interact with other APIs such as cloud providers.
  • Knife can be used for many things; search is one of them.
  • Now that we’ve covered the basics of Chef, let’s see how to use Chef to automate deploying Hadoop clusters to Amazon EC2 with Cluster Chef.
  • If you’re not familiar with Hadoop... well, you came to the right place. Hadoop consists of HDFS, a distributed file system that provides high-throughput access to application data, and MapReduce, a programming framework for writing applications that rapidly process vast amounts of data in parallel. A typical Hadoop cluster has 2 master pieces: the NameNode and the JobTracker. The NameNode manages the file system metadata and the DataNodes store the actual data. You can have 1 or more DataNodes in your cluster. The Secondary NameNode periodically merges the NameNode’s edit log into the file system image; it is a checkpointing helper, not a standby. It’s deprecated and has been replaced by other nodes in more recent versions of Hadoop. Flip can tell you more about that later.
  • The JobTracker manages the job queue, scheduling and organizing work for the TaskTracker nodes. You can have 1 or more TaskTracker nodes in your cluster.
  • Knife is ready to go; we set up Cluster Chef as outlined in the prerequisites.
  • We download the cookbooks that were shared on the cookbook site, then upload them to the Chef Server. The two are discrete and separate: the nodes running Chef don’t talk to the cookbooks site. The Cluster Chef repository bundles up several cookbooks from Opscode’s repository and provides a number of its own in “site-cookbooks”. You may remember editing this in the Prerequisites.
  • Cluster Chef has cookbooks for Hadoop, Cassandra, HBase, R, Hive, Pig and more. Cookbooks contain recipes, and recipes are how systems are configured. You add recipes to Roles or the run_list to get the behavior you want.
  • Roles contain attributes and our run_lists. You add a role to your nodes to get the behavior you want. Ordering is important!
  • Cluster Chef adds a layer over Chef’s Roles, managing the creation and naming of the nodes and ensuring enough of them are created. Let’s just focus on the Roles though. The “master” facet uses the “hadoop_master” role, making our master a combination of the namenode, secondary namenode and jobtracker. For our example, our master is also a “hadoop_worker”. This works for our small-scale demo, but you could easily put different Hadoop components on different nodes as you scale up and need dedicated servers for each service.
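To see how roles turn into an ordered list of recipes, here is a plain-Ruby sketch of run list expansion. It is not Chef's or Cluster Chef's code. The role names follow the ones in this talk, but the `hadoop::...` recipe names, the `ROLES` table and `expand_run_list` are assumptions made for the example, and the sketch has no protection against roles that include each other in a loop.

```ruby
# Hypothetical role definitions: each role maps to its own run list.
ROLES = {
  "hadoop_master"            => ["role[hadoop_namenode]",
                                 "role[hadoop_secondarynamenode]",
                                 "role[hadoop_jobtracker]"],
  "hadoop_worker"            => ["role[hadoop_datanode]", "role[hadoop_tasktracker]"],
  "hadoop_namenode"          => ["recipe[hadoop::namenode]"],
  "hadoop_secondarynamenode" => ["recipe[hadoop::secondarynamenode]"],
  "hadoop_jobtracker"        => ["recipe[hadoop::jobtracker]"],
  "hadoop_datanode"          => ["recipe[hadoop::datanode]"],
  "hadoop_tasktracker"       => ["recipe[hadoop::tasktracker]"]
}.freeze

# Recursively replace role[...] entries with their run lists, preserving
# order and keeping only the first occurrence of a repeated recipe.
def expand_run_list(run_list, roles = ROLES)
  expanded = run_list.flat_map do |entry|
    match = entry.match(/\Arole\[(.+)\]\z/)
    match ? expand_run_list(roles.fetch(match[1]), roles) : [entry]
  end
  expanded.uniq
end

# The demo master is both a hadoop_master and a hadoop_worker.
puts expand_run_list(["role[hadoop_master]", "role[hadoop_worker]"])
```

Putting `role[hadoop_worker]` first would move the datanode and tasktracker recipes ahead of the namenode, which is why the order of a run list matters.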
  • Provisioning is the first step. Usually Chef is going to launch the machines individually, but Cluster Chef allows you to launch them in bulk. We need some computers on the internet. For our demonstration they’re going to be a Hadoop master and worker nodes. They could easily be load balancers, webservers, database servers or whatever. We launch those with a cloud API. Every cloud does this. Chef talks to clouds via the Fog library.
  • Test our knife cluster command. If all of our prerequisites are in place, this is going to work just fine.
  • Kinda exciting, isn’t it? Let’s take a look at the output and see what’s going on... Cluster Chef is going to create the EC2 Security Groups we need, get our vanilla Ubuntu 10.04 AMI launched and bootstrap it with Chef. It extends the functionality of “knife ec2 server create”.
  • knife ec2 server create is the typical way to create our servers; we’re letting Cluster Chef manage them for us instead.
  • For some reason, the initial startup is still finicky, but it is at least down to only two passes for Hadoop. Flip can talk about this if he wants; for now it’s up so you can run it. We’re going to use knife to search for our hadoop_master, stop Hadoop, fix some permissions and re-run our chef-client.
  • Now let’s get our workers working and ready to go. We’re going to use Cluster Chef to launch 2 workers, as outlined in our demohadoop.rb cluster file.
  • We need to open up the proxy server so we can spelunk a bit on the cluster.
  • We now have our 3 node cluster up and running, with minimal touch. The really exciting thing here is that this is easy to deploy and expand; it’s predictable and repeatable. We could add further instrumentation to automatically start working on our data. For now, Flip’s going to give us a couple minutes of hands-on demonstration.

Transcript

  • 1. What Big Data Folks Need to Know About DevOps. Speaker: Matt Ray, Technical Evangelist ‣ matt@opscode.com ‣ @mattray. Copyright © 2011 Opscode, Inc - All Rights Reserved
  • 2-5. Developer, SysAdmin, Hacker, Community Manager. Many biz & dev environments. Opscode: Training, Services & Evangelism. http://www.flickr.com/photos/anotherphotograph/2100904507/sizes/o/
  • 6-10. Developers? Systems Administrators? “BigData” Hacker? “Business” People? http://www.flickr.com/photos/timyates/2854357446/sizes/l/
  • 11-12. DevOps: tools + culture
  • 13. Culture
  • 14-15. Trust (but verify)
  • 16. Automation
  • 17. Infrastructure as Code
  • 19. Chef is an API for your Infrastructure
  • 20-25. Principles: Idempotent, Data-driven, Sane defaults, Hackability, TMTOWTDI
  • 26. Multiple applications of an operation do not change the result
  • 27. We start with APIs, you supply data
  • 28. Defaults are sane, but easily changed
      option :json_attribs,
        :short => "-j JSON_ATTRIBS",
        :long => "--json-attributes JSON_ATTRIBS",
        :description => "Load attributes from a JSON file or URL",
        :proc => nil

      option :node_name,
        :short => "-N NODE_NAME",
        :long => "--node-name NODE_NAME",
        :description => "The node name for this client",
        :proc => nil
  • 29. Open source and community
  • 31. TMTOWTDI
  • 32. http://www.brooklynstreetart.com/theBlog/wp-content/uploads/2008/12/swedish_chef_bork-sleeper-cell.jpg
  • 33. Chef Client runs on your systems
  • 34. Clients talk to a Chef Server
  • 35. RESTful API w/ JSON responses
  • 36. Opscode Platform: the central, highly scalable, multi-tenant configuration service from Opscode... a hosted Chef Server. Copyright © 2010 Opscode, Inc. – Confidential – Do Not Redistribute
  • 37. We call each system you configure a Node. http://www.flickr.com/photos/peterrosbjerg/3913766224/
  • 38. Nodes have Attributes. Callouts: Kernel info! Platform info! Hostname and IP!
      {
        "kernel": {
          "machine": "x86_64",
          "name": "Darwin",
          "os": "Darwin",
          "version": "Darwin Kernel Version 10.4.0: Fri Apr 23 18:28:53 PDT 2010; root:xnu-1504.7.4~1/RELEASE_I386",
          "release": "10.4.0"
        },
        "platform_version": "10.6.4",
        "platform": "mac_os_x",
        "platform_build": "10F569",
        "domain": "local",
        "os": "darwin",
        "current_user": "jtimberman",
        "ohai_time": 1278602661.60043,
        "os_version": "10.4.0",
        "uptime": "18 days 17 hours 49 minutes 18 seconds",
        "ipaddress": "10.13.37.116",
        "hostname": "cider",
        "fqdn": "cider.local",
        "uptime_seconds": 1619358
      }
  • 39. Nodes have a Run List: what Roles or Recipes to apply, in order
  • 40. Nodes have Roles. http://www.flickr.com/photos/laenulfean/374398044/
  • 41. Roles have a Run List: what Roles or Recipes to apply, in order
  • 42. Chef manages Resources on Nodes: remote_file, link, cookbook_file, service, ruby_block, template, execute, package, bash, git, log, deploy, user, http_request
  • 43-48. Resources... Declare a description of the state a part of the node should be in. ‣ Have a type ‣ Have a name ‣ Have parameters ‣ Take action to put the resource in the declared state. http://www.flickr.com/photos/xiaming/382205902/sizes/l/
      package "apache2" do
        version "2.2.11-2ubuntu2.6"
        action :install
      end

      template "/etc/apache2/apache2.conf" do
        source "apache2.conf.erb"
        owner "root"
        group "root"
        mode 0644
        action :create
      end
  • 49. Resources take action through Providers
  • 50-51. Providers... Know how to actually perform the actions specified by a resource. Multiple providers per resource type: Apt, Yum, Rubygems, Portage, Macports, FreeBSD Ports, etc. http://www.flickr.com/photos/affableslinky/562950216/
  • 52-55. Resources, Platform, Provider. http://www.flickr.com/photos/acurbelo/2628837104/sizes/o/
  • 56. Recipes are lists of Resources. http://www.flickr.com/photos/roadsidepictures/2478953342/sizes/o/
  • 57-60. Recipes... Apply resources in the order they are specified. ‣ Evaluates resources in the order they appear (package "apache2" is 1, template "/etc/apache2/apache2.conf" is 2) ‣ Adds each resource to the Resource Collection. http://www.flickr.com/photos/roadsidepictures/2478953342/sizes/o/
      [
        "package[apache2]",
        "template[/etc/apache2/apache2.conf]"
      ]
  • 61-62. Order Matters. http://www.infrastructures.org/papers/turing/turing.html
  • 63. Cookbooks are packages for Recipes
  • 64-69. Common Cookbook Components: recipes/default.rb, files/, templates/, attributes/default.rb, metadata.rb
  • 70. Cookbooks are shareable! cookbooks.opscode.com
  • 71. Data bags store arbitrary data
  • 72. A user data bag item...
      % knife data bag show users jtimberman
      {
        "comment": "Joshua Timberman",
        "groups": "sysadmin",
        "ssh_keys": "ssh-rsa SUPERSEKRATS jtimberman@cider",
        "files": {
          ".zshrc": { "mode": "0644", "source": "dot-zshrc" },
          ".vimrc": { "mode": "0644", "source": "dot-vimrc" }
        },
        "id": "jtimberman",
        "uid": 7004,
        "shell": "/usr/bin/zsh",
        "openid": "http://jtimberman.myopenid.com/"
      }
  • 73. Data bags make recipes awesome-r (that’s totally a word)
  • 74-76.
      sysadmins = search(:users, 'groups:sysadmin')

      sysadmins.each do |u|
        user u['id'] do
          uid u['uid']
          shell u['shell']
          comment u['comment']
          supports :manage_home => true
          home "/home/#{u['id']}"
        end

        directory "/home/#{u['id']}/.ssh" do
          owner u['id']
          group u['id']
          mode 0700
        end

        template "/home/#{u['id']}/.ssh/authorized_keys" do
          source "authorized_keys.erb"
          owner u['id']
          group u['id']
          mode 0600
          variables :ssh_keys => u['ssh_keys']
        end
      end
  • 77. Command-line API utility, Knife. http://www.flickr.com/photos/myklroventine/3474391066/
  • 78-80. Nodes, Roles, DataBags are Searchable
      % knife search node "role:webserver"
      search(:users, "groups:sysadmin")
  • 81. Cluster Chef
  • 82. Hadoop HDFS: NameNode, Secondary NN*, DataNode(s)
  • 83. Hadoop MapReduce: JobTracker, TaskTracker(s)
  • 84. Let’s Get Cooking. Prerequisites are already in place, right? http://bit.ly/dda-chef
  • 85. Push the Cookbooks. These run as root, kids. Let’s not blindly trust the upstream too much!
      $ cd $CLUSTER_CHEF_PATH
      $ knife cookbook upload --all
  • 86. Cookbooks, Recipes: datanode.rb, jobtracker.rb, namenode.rb, secondarynamenode.rb, tasktracker.rb, more!
  • 87. Push the Roles
      $ for foo in roles/*.rb ; do knife role from file $foo & sleep 1 ; done
  • 88. Cluster Chef’s Facets, Roles. hadoop_master: hadoop_namenode, hadoop_secondarynamenode, hadoop_jobtracker. hadoop_worker: hadoop_datanode, hadoop_tasktracker
  • 89. Provisioning Nodes: demohadoop-master-i-77f2661b, demohadoop-worker-i-e390148f, demohadoop-worker-i-ff901493
  • 90. Is this thing on?
      $ knife cluster
      Available cluster subcommands: (for details, knife SUB-COMMAND --help)

      ** CLUSTER COMMANDS **
      knife cluster launch CLUSTER_NAME FACET_NAME (options)
      knife cluster show CLUSTER_NAME FACET_NAME (options)
      knife cluster bootstrap CLUSTER_NAME FACET_NAME SERVER_FQDN (options)
  • 91. Let’s launch our Hadoop Cluster!
      $ knife cluster launch demohadoop master --bootstrap
  • 92-97. knife ec2 server create: Creates EC2 instance via API. Retrieves local configuration. SSH to instance: ‣ Writes chef configuration and authentication ‣ Installs Ruby and Chef ‣ Runs Chef with specified run list. Cluster Chef extends this: security groups, picks the AMI, builds the number of specified nodes.
  • 98. Still a bit of tweaking
      $ knife ssh "role:demohadoop_master" "sudo service hadoop-0.20-datanode stop; sudo service hadoop-0.20-namenode stop; sudo service hadoop-0.20-tasktracker stop; sudo service hadoop-0.20-jobtracker stop; sudo service hadoop-0.20-secondarynamenode stop; sudo -u hdfs hadoop fs -chown -R hbase:hbase /hadoop/hbase; sudo chef-client" -x ubuntu -a ec2.public_hostname -i ~/.chef/keypairs/demohadoop.pem
  • 99. Hadoop Workers!
      $ knife cluster launch demohadoop worker --bootstrap
  • 100. Is it really on? ‣ Configure your network settings to use a SOCKS proxy ‣ http://ec2-public-ip-address.compute-1.amazonaws.com ‣ copy & paste the SSH command ‣ Profit!
  • 101. Our Hadoop Cluster is Operational...
  • 102. Resources/Questions: www.opscode.com/chef. IRC and Mailing lists: ‣ irc.freenode.net #chef ‣ lists.opscode.com. Twitter: ‣ @opscode, #opschef ‣ @mattray. Questions?