the NML project

Published in: Technology
  • the NML project

    1. 1. the NML project <yanglei@snda.com>
    2. 2. Before we start...
        This is a purely technical discussion; don’t bring politics in. That is, none of these:
        • Which dept. should be in charge?
        • Why not develop in PHP/Java, since nobody else in the company can program in Ruby?
        • How to integrate NML into XX system?
    3. 3. Goal: Out-of-band Server Management
        1. Extremely configurable OS install via SOL (Serial over LAN)
        2. An intelligent system to control the whole process, with minimum human intervention
        3. Build an open-source matrix of server/OS distro combinations
    4. 4. Status
        Member: me, wangjunyan (docs)
        Subproject member: lijiehui (LXC: Linux container environment)
        GitHub: https://github.com/op-sdo-com/nml Fork us!
    5. 5. Status
        Special thanks to wangjunyan and dinghaifeng!
    6. 6. HP is a coward (think of their WebOS)
        IBM, Dell, HP
        HP closed the IPMI port (UDP 623), forcing customers to use iLO.
        Practically, iLO is okay. But you need to buy a license before using remote console redirection, while IBM & Dell let you do anything!
    7. 7. Work through
        10.132.17.100-150 (prod. IP range)
        10.132.17.200-250 (IPMI IP range)
        One-to-one mapping (dynamic IP allocation is impossible for now, but this can be improved)
        The current solution is neither secure nor sufficiently isolated.
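A minimal sketch of the one-to-one mapping, assuming it is kept as an explicit static table. The names `PROD_RANGE`, `IPMI_RANGE`, `IPMI_FOR` and `ipmi_address` are hypothetical, and the single pair in the table is only a guess based on the addresses used elsewhere in this deck; the slides give nothing beyond the two ranges.

```ruby
require "ipaddr"

# The two ranges from the slide (Ruby ranges with ".." are inclusive).
PROD_RANGE = IPAddr.new("10.132.17.100")..IPAddr.new("10.132.17.150")
IPMI_RANGE = IPAddr.new("10.132.17.200")..IPAddr.new("10.132.17.250")

# Hypothetical static table: production IP => IPMI (BMC) IP.
IPMI_FOR = {
  "10.132.17.109" => "10.132.17.200"
}.freeze

# Look up the IPMI address for a production address, checking both ranges.
def ipmi_address(prod_ip)
  unless PROD_RANGE.cover?(IPAddr.new(prod_ip))
    raise ArgumentError, "#{prod_ip} is outside the prod. range"
  end
  ipmi = IPMI_FOR.fetch(prod_ip) { raise KeyError, "no IPMI mapping for #{prod_ip}" }
  unless IPMI_RANGE.cover?(IPAddr.new(ipmi))
    raise ArgumentError, "#{ipmi} is outside the IPMI range"
  end
  ipmi
end
```

A lookup table is used here instead of arithmetic on the last octet because the slides do not say how the pairs are assigned.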
    8. 8. Work through
        1. Set the machine to boot from PXE, then restart it:
            ipmitool -I lanplus -U ibm3550 -H 10.132.17.200 -P XX chassis bootdev pxe
            ipmitool -I lanplus -U ibm3550 -H 10.132.17.200 -P XX chassis power cycle
        2. Configure the DHCP server to reply by MAC and refuse any other DHCP request (!!)
        PS: dhcp3 supports dynamic configuration updates via OMAPI. See man dhcpd.conf.
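Step 1 could be wrapped in a small helper along these lines. This is only a sketch: `ipmitool_cmd` and `pxe_reinstall_cmds` are hypothetical names, not part of NML, and the only ipmitool flags used are the ones shown on the slide. Commands are built as argument arrays so nothing passes through a shell.

```ruby
# Build one ipmitool invocation as an argv array (no shell quoting involved).
def ipmitool_cmd(host:, user:, password:, action:)
  ["ipmitool", "-I", "lanplus", "-U", user, "-H", host, "-P", password, *action]
end

# The two steps from the slide: select PXE as the boot device, then power cycle.
def pxe_reinstall_cmds(host:, user:, password:)
  [
    ipmitool_cmd(host: host, user: user, password: password, action: %w[chassis bootdev pxe]),
    ipmitool_cmd(host: host, user: user, password: password, action: %w[chassis power cycle])
  ]
end

# Usage (not executed here): run both steps, stopping at the first failure.
# pxe_reinstall_cmds(host: "10.132.17.200", user: "ibm3550", password: "XX").each do |cmd|
#   system(*cmd) or abort("ipmitool failed")
# end
```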
    9. 9. Architecture
        NML encapsulates all the intelligence in HTTP.
        DHCP and iPXE configurations are kept to a minimum.
        Centralized configuration is easy to maintain.
    10. 10. Work through
            host aoti_200 {
                # eth0, eth1
                hardware ethernet 00:1A:64:99:E7:50;
                # hardware ethernet 00:1A:64:99:E7:52;
                fixed-address 10.132.17.109;
                server-name "10.132.17.108";
                if exists user-class and option user-class = "iPXE" {
                    filename "http://10.132.17.108/nml/ipxe";
                } else {
                    filename "undionly.kpxe";
                }
            }
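The host block above boots in two stages: the plain PXE ROM is handed undionly.kpxe (iPXE itself), and once iPXE is running it asks DHCP again with user-class "iPXE" and receives the HTTP URL. Since every machine needs one such block, it could be generated rather than typed. The sketch below assumes a hypothetical helper `dhcp_host_stanza`; how NML actually produces its DHCP configuration is not shown in the slides.

```ruby
# Render a dhcpd.conf host stanza shaped like the one on the slide.
# The squiggly heredoc (<<~) strips the common leading indentation.
def dhcp_host_stanza(name:, mac:, ip:, master:)
  <<~CONF
    host #{name} {
      hardware ethernet #{mac};
      fixed-address #{ip};
      server-name "#{master}";
      if exists user-class and option user-class = "iPXE" {
        filename "http://#{master}/nml/ipxe";
      } else {
        filename "undionly.kpxe";
      }
    }
  CONF
end

puts dhcp_host_stanza(name: "aoti_200", mac: "00:1A:64:99:E7:50",
                      ip: "10.132.17.109", master: "10.132.17.108")
```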
    11. 11. Work through: iPXE vs. PXE
        iPXE liberates us from TFTP (stupid UDP). iPXE supports HTTP (even iSCSI), so the system scales.
        iPXE lays the foundation for an automatic assessment management platform.
    12. 12. Work through
            #!ipxe
            chain http://nml.snda.com/nml/chain/${manufacturer}/${product}/${uuid}?mac=${net0/mac}
        ${manufacturer}, ${product}, ${uuid}, ${net0/mac} are variables exposed by the BIOS.
        Humans make mistakes, but the BIOS does not.
        PS: This is probably the earliest stage at which hardware info can be obtained. Early == Accurate
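On the server side, the chain request carries everything needed to identify the machine. A minimal sketch of pulling the fields back out of that URL follows; `ChainRequest` and `parse_chain_request` are hypothetical names, the sample values in the usage line are made up, and it assumes the client percent-encodes values that contain spaces.

```ruby
require "uri"

# The hardware facts carried by the chain URL.
ChainRequest = Struct.new(:manufacturer, :product, :uuid, :mac)

# Split /nml/chain/<manufacturer>/<product>/<uuid>?mac=<mac> into its fields.
def parse_chain_request(url)
  uri = URI.parse(url)
  match = %r{\A/nml/chain/([^/]+)/([^/]+)/([^/]+)\z}.match(uri.path)
  raise ArgumentError, "unexpected path: #{uri.path}" unless match

  fields = match.captures.map { |part| URI.decode_www_form_component(part) }
  query = URI.decode_www_form(uri.query.to_s).to_h
  ChainRequest.new(*fields, query["mac"])
end

req = parse_chain_request(
  "http://nml.snda.com/nml/chain/IBM/System%20x3550/" \
  "00000000-0000-0000-0000-000000000000?mac=00:1a:64:99:e7:50"
)
puts req.product
```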
    13. 13. Work through
        From now on, all network communication is done through HTTP. This is also where the intelligence comes in:
            get '/nml/pxelinux.cfg/:uuid' do
              uuid = params[:uuid]
              install(uuid, get_ipaddr(uuid), get_gateway(uuid), get_hostname(uuid),
                      get_iface(uuid), get_baudrate(uuid), get_release(uuid))
            end
    14. 14. Work through
            # Build the pxelinux configuration for one machine.
            def install(uuid, ipaddr, gateway, hostname, iface, baudrate, release)
              indent = " " * 4
              head = "serial 0 #{baudrate}\ntimeout 50\nlabel pxeboot"
              tail = "default ubuntu-installer/amd64/boot-screens/vesamenu.c32"
              kernel = indent + "kernel %s/linux" % [release]
              # static ip configuration, avoid dhcp in the preseeding stage
              configs = [
                "console-tools/archs=skip-config",
                "console-keymaps-at/keymap=us",
                "vga=normal",
                "netcfg/confirm_static=true",
                "netcfg/disable_dhcp=true",
                "netcfg/get_hostname=#{hostname}",
                "netcfg/get_domain=.nml",
                "netcfg/get_nameservers=%s" % [@@dns],
                "netcfg/get_ipaddress=#{ipaddr}",
                "netcfg/get_netmask=255.255.255.0",
                "netcfg/get_gateway=#{gateway}",
                "console=ttyS0,#{baudrate}n8",
                "interface=#{iface}",
                "initrd=#{release}/initrd.gz",
                "auto url=http://%s/%s/preseed/#{uuid}" % [@@master, @@base]
              ]
              # One kernel command line: the installer settings, then "-- quiet".
              append = indent + "append " + configs.join(" ") + " -- quiet"
              [head, kernel, append, tail].join("\n") + "\n"
            end
    15. 15. Architecture
        What is preseed?
        Preseed is kickstart for Debian.
        Kickstart is a set of answers to the questions you are asked when you install a system manually.
    16. 16. Architecture
        NML tries to provide maximum flexibility from the bottom. Policy makers decide how to utilize it.
        Maximum flexibility == each machine can pull its own configuration set.
        NML tries hard to be OS/hardware independent. (Goal 3: build a matrix)
    17. 17. Architecture
        I know real-world ops desperately want consistency, but that is policy. NML focuses on mechanism.
        Why does flexibility matter? Any real-world examples?
        1. Let the system generate a distinct password for every machine. I love elegant solutions to security.
        2. Gain access to the partition manager. (ext3, ext4, btrfs and LVM!)
        3. Moving the prelinux script to the preseeding stage ensures continuous integration of company policy. (Lesson: policies can never be applied without powerful infra.)
        4. Automatic network interface configuration. The Ubuntu installer smartly applies the network configuration to /etc/network/interfaces; CentOS’s anaconda does the equivalent on its side.
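Example 1 (a distinct password per machine) needs very little code. A minimal sketch, assuming a hypothetical in-memory store keyed by machine UUID; how NML actually stores or hashes passwords is not described in the slides.

```ruby
require "securerandom"

# Return the password for a machine, generating it on first use.
# `store` is any Hash-like object mapping uuid => password (hypothetical).
def password_for(store, uuid)
  store[uuid] ||= SecureRandom.alphanumeric(20)
end

store = {}
first = password_for(store, "machine-a")
again = password_for(store, "machine-a")
puts first == again
```

The same machine always gets the same password back, so the value can be written into its preseed and still be looked up later.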
    18. 18. Architecture: Preseed/Kickstart vs. image clone
        • Preseeding is slow. Although the installer can use a yum/apt mirror to speed up package downloads, the entire retrieve-prepare-configure cycle can’t be optimized further.
        • Image cloning is suitable for creating VMs (Xen, LXC, etc.), but it is too dumb to do anything intelligent.
        But we want the best of both worlds! Solution:
            n_preseed = normalize(uuid.preseed, uuid.hardware)
            if n_preseed.exists?
              n_preseed.clone(server_ip, uuid)
            else
              install(uuid)
            end
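The clone-or-install decision could be sketched as follows. What `normalize` really does is not stated in the slides; this sketch assumes it reduces a preseed to a stable key by dropping comments and blank lines and combining the result with the hardware model. `image_key` and `provision_plan` are hypothetical names.

```ruby
require "digest"

# Reduce a preseed plus hardware model to a stable key (assumed normalization:
# strip whitespace, drop blank lines and comment lines, then hash).
def image_key(preseed_text, hardware)
  normalized = preseed_text.lines.map(&:strip)
                           .reject { |line| line.empty? || line.start_with?("#") }
                           .join("\n")
  Digest::SHA256.hexdigest("#{hardware}\n#{normalized}")
end

# Clone when an image with the same key already exists, otherwise install.
def provision_plan(image_keys, preseed_text, hardware)
  key = image_key(preseed_text, hardware)
  image_keys.include?(key) ? [:clone, key] : [:install, key]
end

action, key = provision_plan([], "d-i netcfg/disable_dhcp boolean true\n", "IBM x3550")
puts action
```

In practice the per-machine values (hostname, IP address, password) would also have to be factored out before hashing, otherwise no two machines would ever share an image.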
    19. 19. Architecture
        1. The yum/apt mirror ensures a 99% cache hit rate; all packages are pulled from the LAN. The local master only maintains the cache.
        2. Why not mirror the upstream repo directly?
            1. The bandwidth of an upstream mirror is likely to fluctuate (e.g., us.archive.ubuntu.com).
            2. Most packages will never be downloaded. In fact, a standard installation of CentOS 6.0 needs fewer than 380 packages, whereas a full-fledged repo contains 15K (2.5%).
        3. Repo implementations
            1. Yum: nginx error_page + proxy_pass + ppull.rb; upstream mirror: mirrors.sdo.com (Why not proxy_cache? Because nginx has some issues with range requests when proxy_cache is enabled.)
            2. Apt: apt-cacher-ng; upstream mirror: mirror.lupaworld.com
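The yum setup reads like a pull-through cache: serve the file locally if it is there, otherwise fetch it from upstream once and keep it. The sketch below shows that idea in plain Ruby under that assumption. It is not ppull.rb, whose contents are not shown in the slides; `fetch_package` is a hypothetical name, and the upstream fetch is passed in as a block so the example runs without a network.

```ruby
require "fileutils"

# Return [:hit, bytes] from the local cache, or fetch via the block,
# store the result, and return [:miss, bytes].
def fetch_package(cache_dir, rel_path)
  raise ArgumentError, "unsafe path: #{rel_path}" if rel_path.split("/").include?("..")

  local = File.join(cache_dir, rel_path)
  return [:hit, File.binread(local)] if File.file?(local)

  body = yield(rel_path)                 # upstream fetch, supplied by the caller
  FileUtils.mkdir_p(File.dirname(local))
  File.binwrite("#{local}.part", body)   # write to a temp name first...
  File.rename("#{local}.part", local)    # ...then rename, so readers never see a partial file
  [:miss, body]
end
```

With this shape only packages that some installer has actually requested end up on the local master, which matches the point that most of a full repo is never downloaded.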
    20. 20. The Matrix
        Columns (OS distros): Ubuntu 10.04 | Ubuntu 11.04 | CentOS 5.6 | CentOS 6.0 | RHEL 5.6 | RHEL 6.0 | Arch Linux | FreeBSD | Gentoo | Fedora | Debian
        Rows (servers):
            IBM x3550: Y Y Y Y
            HP ProLiant DL360 G5:
            IBM x3550 M2:
            Dell PowerEdge R610:
            HP ProLiant DL385 G2:
            IBM BladeCenter LS22:
        • Y means both i386 and amd64 have passed
        • Y* means M[ij] needs extra configuration
    21. 21. Architecture
        1. Why does hardware depend on the OS distro?
            Every OS distro may bring surprises. E.g., the radeon driver in Ubuntu 11.04 (codename natty) is incompatible with the IBM x3550: you get a kernel panic after installation.
        2. What’s the purpose of supporting all Linux distros?
            • We want Total World Domination
            • NML is about mechanism, not policy
            • Linode supports all distros on Xen! Our task is easier.
        3. Is it time-consuming to support all Linux distros?
            Just do it.
    22. 22. Questions?
        One obvious question: What is NML?
