
How to Write a Mesos Scheduler, So Simple Even a Monkey Could Do It (サルでもわかるMesos schedulerの作り方)


Slides from my talk at Tokyo Mesos Meetup #1


  1. Writing a Mesos Scheduler, So Simple Even a Monkey Could Do It. Mesos Tokyo Meetup #1, Feb 2015, @wallyqs
  2. About me. Name: Wally (ワリ). Twitter: https://twitter.com/wallyqs GitHub: https://github.com/wallyqs From Mexico :)
  3. Interests: infrastructure, distributed systems, next-generation PaaS. I like fast deploys with high scalability & availability. Literate programming; heavy Org mode user.
  4. Communities: Google Summer of Code
  5. Communities: HackerSchool alumni. A programmers' retreat based in New York. Great community!
  6. Org mode activity: Org mode Ruby parser, used at GitHub for rendering .org files! Added syntax highlighting support and many other improvements and bug fixes.
  7. Agenda: Why Mesos? Why implement our own scheduler? What does a Mesos scheduler do? Communication flow between components. Basic scheduler implementation styles. Examples.
  8. Why Mesos?
  9. Background: experience operating a PaaS for 3 years now.
  10. Short-term benefits, but…
  11. A full-stack PaaS means a fork is almost unavoidable: keeping up with the community, priorities mismatch, platform too tightly coupled, can only deploy web workloads, Conway's Law, etc.
  12. How does Mesos help? By adding another level of indirection!
  13. A set of APIs instead of a full-stack approach. We can implement a scheduler with the logic we need. No vendor lock-in. No falling behind the OSS community because of a fork. No roadmap mismatch.
  14. But how does Mesos work?
  15. Basic components of a Mesos cluster: a Mesos Master, many Mesos Slaves, Schedulers (also called frameworks), and Executors.
  16. All communication is done via HTTP.
  17. Communication flow between components
  18. Example: Master running at 192.168.0.7:5050, Slave at 192.168.0.7:5051, Scheduler at 192.168.0.7:59508, Executor at 192.168.0.7:58006.
  19. Discovery between Master and Slaves. Slaves announce themselves to the Master, and the Master pings each slave:
      POST /slave(1)/PING HTTP/1.0
      User-Agent: libprocess/slave-observer(1)@192.168.0.7:5050
      Connection: Keep-Alive
      Transfer-Encoding: chunked
      The slave pongs back:
      POST /slave-observer(1)/PONG HTTP/1.0
      User-Agent: libprocess/slave(1)@192.168.0.7:5051
      Connection: Keep-Alive
  20. The Scheduler starts and registers with the Master:
      POST /master/mesos.internal.RegisterFrameworkMessage HTTP/1.1
      Host: 192.168.0.7:5050
      User-Agent: Go 1.1 package http
      Content-Length: 44
      Connection: Keep-Alive
      Content-Type: application/x-protobuf
      Libprocess-From: scheduler(1)@192.168.0.7:59508
      Accept-Encoding: gzip
  21. The Master ACKs the registration to the Scheduler:
      POST /scheduler(1)/mesos.internal.FrameworkRegisteredMessage HTTP/1.0
      User-Agent: libprocess/master@192.168.0.7:5050
      Connection: Keep-Alive
      Transfer-Encoding: chunked
  22. Then the Master starts offering resources to the Scheduler:
      POST /scheduler(1)/mesos.internal.ResourceOffersMessage HTTP/1.0
      User-Agent: libprocess/master@192.168.0.7:5050
      Connection: Keep-Alive
      Transfer-Encoding: chunked
      (offer payload: cpus: 2 from slave(1)@192.168.0.7:5051)
  23. The Scheduler accumulates offers and launches tasks through the Master, which will give the job to a Slave with the resources to run it:
      POST /master/mesos.internal.LaunchTasksMessage HTTP/1.1
      Host: 192.168.0.7:5050
      User-Agent: Go 1.1 package http
      Content-Length: 260
      Connection: Keep-Alive
      Content-Type: application/x-protobuf
      Libprocess-From: scheduler(1)@192.168.0.7:59508
      Accept-Encoding: gzip
  24. The Master submits the job from the Scheduler to the Slave:
      POST /slave(1)/mesos.internal.RunTaskMessage HTTP/1.0
      User-Agent: libprocess/master@192.168.0.7:5050
      Connection: Keep-Alive
      Transfer-Encoding: chunked
  25. The Executor is started and registers back with the Slave:
      POST /slave(1)/mesos.internal.RegisterExecutorMessage HTTP/1.0
      User-Agent: libprocess/executor(1)@192.168.0.7:58006
      Connection: Keep-Alive
      Transfer-Encoding: chunked
  26. The Slave ACKs to the Executor that it is aware of it:
      POST /executor(1)/mesos.internal.ExecutorRegisteredMessage HTTP/1.0
      User-Agent: libprocess/slave(1)@192.168.0.7:5051
      Connection: Keep-Alive
      Transfer-Encoding: chunked
  27. Then the Slave submits the task to the Executor:
      POST /executor(1)/mesos.internal.RunTaskMessage HTTP/1.0
      User-Agent: libprocess/slave(1)@192.168.0.7:5051
      Connection: Keep-Alive
      Transfer-Encoding: chunked
  28. The Executor continually reports task status to the Slave:
      POST /slave(1)/mesos.internal.StatusUpdateMessage HTTP/1.0
      User-Agent: libprocess/executor(1)@192.168.0.7:58006
      Connection: Keep-Alive
      Transfer-Encoding: chunked
  29. Then the Slave escalates the status to the Master:
      POST /master/mesos.internal.StatusUpdateMessage HTTP/1.0
      User-Agent: libprocess/slave(1)@192.168.0.7:5051
      Connection: Keep-Alive
      Transfer-Encoding: chunked
  30. And so on, and so on…
  31. Responsibilities of the Scheduler and Executor. Scheduler: receive resource offers and launch tasks; process status updates about the tasks. Executor: run tasks; update the status of the tasks.
  32. Basic example: CommandScheduler. Since even a monkey has to be able to follow this, we'll use Go rather than Scala. :P A super-simple CommandScheduler (like mesos-exec in C++, but in Go), using https://github.com/mesos/mesos-go. The default Mesos Executor (/usr/local/libexec/mesos/mesos-executor) is all we need.
      Usage:
      go run command_scheduler.go -address=192.168.0.7:5050 -task-count=2 -cmd="while true; do echo hello world; done"
  33. Imports
      package main

      import (
          "flag"
          "fmt"
          "net"
          "strconv"

          "github.com/gogo/protobuf/proto"
          mesos "github.com/mesos/mesos-go/mesosproto"
          util "github.com/mesos/mesos-go/mesosutil"
          sched "github.com/mesos/mesos-go/scheduler"
      )
  34. The CommandScheduler type implements the Scheduler interface:
      type CommandScheduler struct {
          tasksLaunched int
          tasksFinished int
          totalTasks    int
      }
  35. The Scheduler interface. It's enough to implement ResourceOffers and StatusUpdate; the other methods can be left as stubs for now: ResourceOffers, StatusUpdate, Registered, Reregistered, Disconnected, OfferRescinded, FrameworkMessage, SlaveLost, ExecutorLost, Error.
  36. Implementing ResourceOffers. The Master gives the Scheduler offers. The important information contained in an offer includes the resources available (disk, cpu, mem) and the ID of the slave that holds those resources.
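Before the mesos-go version on the next slide, the offer arithmetic can be sketched standalone. The `offer` and `resource` types and their fields here are simplified stand-ins for the mesosproto types, not the real API; the point is the filter-and-accumulate pattern over scalar resources.

```go
package main

import "fmt"

// resource is a simplified stand-in for a scalar mesos.Resource.
type resource struct {
	name  string
	value float64
}

// offer is a simplified stand-in for mesos.Offer: a slave ID plus resources.
type offer struct {
	slaveID   string
	resources []resource
}

// sumScalar adds up every scalar resource with the given name,
// mirroring the FilterResources-then-accumulate pattern in the slides.
func sumScalar(o offer, name string) float64 {
	total := 0.0
	for _, r := range o.resources {
		if r.name == name {
			total += r.value
		}
	}
	return total
}

func main() {
	o := offer{
		slaveID: "slave(1)@192.168.0.7:5051",
		resources: []resource{
			{"cpus", 2}, {"cpus", 2}, {"mem", 2812},
		},
	}
	fmt.Println("cpus:", sumScalar(o, "cpus"), "mem:", sumScalar(o, "mem"))
	// prints: cpus: 4 mem: 2812
}
```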
  37. Code
      func (sched *CommandScheduler) ResourceOffers(driver sched.SchedulerDriver, offers []*mesos.Offer) {
          for _, offer := range offers {
              cpuResources := util.FilterResources(offer.Resources, func(res *mesos.Resource) bool {
                  return res.GetName() == "cpus"
              })
              cpus := 0.0
              for _, res := range cpuResources {
                  cpus += res.GetScalar().GetValue()
              }
              memResources := util.FilterResources(offer.Resources, func(res *mesos.Resource) bool {
                  return res.GetName() == "mem"
              })
              mems := 0.0
              for _, res := range memResources {
                  mems += res.GetScalar().GetValue()
              }
              fmt.Println("Received Offer <", offer.Id.GetValue(), "> with cpus=", cpus, " mem=", mems)
              remainingCpus := cpus
              remainingMems := mems
  38. Code. Point #0: the Scheduler is responsible for using resources correctly. Point #1: TaskId needs to be unique somehow. Point #2: for a task to run it needs the SlaveId, which is contained in the offer.
      var tasks []*mesos.TaskInfo
      for sched.tasksLaunched < sched.totalTasks &&
          CPUS_PER_TASK <= remainingCpus && // Point #0
          MEM_PER_TASK <= remainingMems {
          sched.tasksLaunched++
          // Point #1
          taskId := &mesos.TaskID{
              Value: proto.String(strconv.Itoa(sched.tasksLaunched)),
          }
          task := &mesos.TaskInfo{
              Name:    proto.String("go-cmd-task-" + taskId.GetValue()),
              TaskId:  taskId,
              SlaveId: offer.SlaveId, // Point #2
              Resources: []*mesos.Resource{
                  util.NewScalarResource("cpus", CPUS_PER_TASK),
                  util.NewScalarResource("mem", MEM_PER_TASK),
              },
              Command: &mesos.CommandInfo{
                  Value: proto.String(*jobCmd),
              },
          }
          fmt.Printf("Prepared task: %s with offer %s for launch\n", task.GetName(), offer.Id.GetValue())
          tasks = append(tasks, task)
          remainingCpus -= CPUS_PER_TASK
          remainingMems -= MEM_PER_TASK
      }
      fmt.Println("Launching", len(tasks), "tasks for offer", offer.Id.GetValue())
      driver.LaunchTasks([]*mesos.OfferID{offer.Id}, tasks, &mesos.Filters{RefuseSeconds: proto.Float64(1)})
  39. Implementing StatusUpdate
  40. Code. Point #0: use the task ID (status.TaskId.GetValue()) to decide what to do. Point #1: in this example, the scheduler stops if any task dies.
      func (sched *CommandScheduler) StatusUpdate(driver sched.SchedulerDriver, status *mesos.TaskStatus) {
          // Point #0: status.TaskId.GetValue()
          fmt.Println("Status update: task", status.TaskId.GetValue(), "is in state", status.State.Enum().String())
          if status.GetState() == mesos.TaskState_TASK_FINISHED {
              sched.tasksFinished++
          }
          if sched.tasksFinished >= sched.totalTasks {
              fmt.Println("Total tasks completed, stopping framework.")
              driver.Stop(false)
          }
          if status.GetState() == mesos.TaskState_TASK_LOST ||
              status.GetState() == mesos.TaskState_TASK_KILLED || // Point #1
              status.GetState() == mesos.TaskState_TASK_FAILED {
              fmt.Println(
                  "Aborting because task", status.TaskId.GetValue(),
                  "is in unexpected state", status.State.String(),
                  "with message", status.GetMessage(),
              )
              driver.Abort()
          }
      }
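The decision logic in the handler above can be isolated from the driver for a quick check. `taskState`, `decide`, and the constants below are simplified stand-ins for the mesosproto task states, not the mesos-go API; the branch structure mirrors the slide.

```go
package main

import "fmt"

// taskState is a simplified stand-in for mesos.TaskState.
type taskState int

const (
	taskRunning taskState = iota
	taskFinished
	taskLost
	taskKilled
	taskFailed
)

// decide mirrors the StatusUpdate logic above: count finished tasks,
// stop once they are all done, and abort on an unexpected terminal state.
func decide(state taskState, tasksFinished, totalTasks int) (int, string) {
	if state == taskFinished {
		tasksFinished++
	}
	if tasksFinished >= totalTasks {
		return tasksFinished, "stop"
	}
	if state == taskLost || state == taskKilled || state == taskFailed {
		return tasksFinished, "abort"
	}
	return tasksFinished, "continue"
}

func main() {
	n, action := decide(taskFinished, 3, 4)
	fmt.Println(n, action) // prints: 4 stop
	_, action = decide(taskKilled, 0, 4)
	fmt.Println(action) // prints: abort
}
```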
  41. Finally, main. The only thing we need to do is pass the scheduler to the driver configuration, then run the driver.
      func main() {
          fwinfo := &mesos.FrameworkInfo{
              User: proto.String(""),
              Name: proto.String("GoCommandScheduler"),
          }
          bindingAddress := parseIP(*address)
          config := sched.DriverConfig{
              Scheduler: &CommandScheduler{
                  tasksLaunched: 0,
                  tasksFinished: 0,
                  totalTasks:    *taskCount,
              },
              Framework:      fwinfo,
              Master:         *master,
              BindingAddress: bindingAddress,
          }
          driver, err := sched.NewMesosSchedulerDriver(config)
          if err != nil {
              fmt.Println("Unable to create a SchedulerDriver:", err)
              return
          }
          if stat, err := driver.Run(); err != nil {
              fmt.Println("Framework stopped with status", stat.String(), "and error:", err)
          }
      }
  42. Done!
      go run examples/command_scheduler.go -address="192.168.0.7" -master="192.168.0.7:5050" -logtostderr=true -task-count=4 -cmd="ruby -e '10.times { puts :hellooooooo; sleep 1 }'"
      Initializing the CommandScheduler...
      Framework Registered with Master &MasterInfo{Id:*20150225-174751-117483712-5050-13334, Ip:*117483712, Port:*5050, Pid:*master@192.168.0.7:5050, Hostname:*192.168.0.7, XXX_unrecognized:[],}
      Received Offer <20150225-174751-117483712-5050-13334-O0> with cpus= 4 mem= 2812
      Prepared task: go-cmd-task-1 with offer 20150225-174751-117483712-5050-13334-O0 for launch
      Prepared task: go-cmd-task-2 with offer 20150225-174751-117483712-5050-13334-O0 for launch
      Prepared task: go-cmd-task-3 with offer 20150225-174751-117483712-5050-13334-O0 for launch
      Prepared task: go-cmd-task-4 with offer 20150225-174751-117483712-5050-13334-O0 for launch
      Launching 4 tasks for offer 20150225-174751-117483712-5050-13334-O0
      Status update: task 1 is in state TASK_RUNNING
      Status update: task 3 is in state TASK_RUNNING
      Status update: task 2 is in state TASK_RUNNING
      Status update: task 4 is in state TASK_RUNNING
  43. What about containers? Since Mesos 0.20 we can use ContainerInfo too. Example:
      task := &mesos.TaskInfo{
          Name:    proto.String("go-cmd-task-" + taskId.GetValue()),
          TaskId:  taskId,
          SlaveId: offer.SlaveId,
          // Executor: sched.executor,
          Resources: []*mesos.Resource{
              util.NewScalarResource("cpus", CPUS_PER_TASK),
              util.NewScalarResource("mem", MEM_PER_TASK),
          },
          Command: &mesos.CommandInfo{
              Value: proto.String(*jobCmd),
          },
          Container: &mesos.ContainerInfo{ // the key point
              Type: mesos.ContainerInfo_DOCKER.Enum(),
              Docker: &mesos.ContainerInfo_DockerInfo{
                  Image: proto.String(*dockerImage),
                  // Network: mesos.ContainerInfo_DockerInfo_BRIDGE.Enum(),
                  // PortMappings: []*mesos.ContainerInfo_DockerInfo_PortMapping{},
              },
          },
      }
  44. Example
      sudo docker ps
      CONTAINER ID   IMAGE          COMMAND               CREATED          STATUS          PORTS   NAMES
      1a8b3c964c3e   redis:latest   "/bin/sh -c redis-"   17 minutes ago   Up 17 minutes           mesos-88de0870-b613-4bda-9ed4-30995834ccab
  45. What about fault tolerance?
  46. First, the Scheduler needs to keep track of all the task info too:
      type FaultTolerantCommandScheduler struct {
          tasksLaunched int
          tasksFinished int
          totalTasks    int
          tasksList     []*mesos.TaskInfo
      }
  47. The ResourceOffers handler. For a task to be valid it needs a SlaveID, so in ResourceOffers we only launch the ones without a SlaveID:
      var tasksToLaunch []*mesos.TaskInfo
      for _, task := range sched.tasksList {
          // Check whether it is already running or not (has a SlaveID)
          if task.SlaveId == nil {
              fmt.Println("[OFFER]", offer.SlaveId, "will be used for task:", task)
              task.SlaveId = offer.SlaveId
              remainingCpus -= CPUS_PER_TASK
              remainingMems -= MEM_PER_TASK
              tasksToLaunch = append(tasksToLaunch, task)
          }
      }
      if len(tasksToLaunch) > 0 {
          fmt.Println("[OFFER] Launching", len(tasksToLaunch), "tasks for offer", offer.Id.GetValue())
          driver.LaunchTasks([]*mesos.OfferID{offer.Id}, tasksToLaunch, &mesos.Filters{RefuseSeconds: proto.Float64(1)})
      }
  48. The StatusUpdate handler. When we receive a StatusUpdate we can handle it here; the task will then be rescheduled the next time ResourceOffers runs:
      if status.GetState() == mesos.TaskState_TASK_KILLED {
          taskId, _ := strconv.Atoi(*status.GetTaskId().Value)
          fmt.Println("[STATUS] TASK_KILLED:", taskId)
          sched.tasksList[taskId-1].SlaveId = nil
      }
      if status.GetState() == mesos.TaskState_TASK_FAILED {
          taskId, _ := strconv.Atoi(*status.GetTaskId().Value)
          fmt.Println("[STATUS] TASK_FAILED:", taskId)
          sched.tasksList[taskId-1].SlaveId = nil
      }
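The reschedule-by-clearing-SlaveId trick in the two slides above can be simulated end to end without a cluster. The `task` type and helpers below are simplified stand-ins for mesos.TaskInfo and the two handlers, not the actual mesos-go types; a nil slave pointer plays the role of "needs (re)scheduling".

```go
package main

import "fmt"

// task is a simplified stand-in for mesos.TaskInfo: a name plus the slave
// it is assigned to. A nil slaveID means the task needs (re)scheduling.
type task struct {
	name    string
	slaveID *string
}

// markLost clears the assignment, mirroring the StatusUpdate handler
// that sets SlaveId = nil on TASK_KILLED / TASK_FAILED.
func markLost(t *task) { t.slaveID = nil }

// assignPending assigns every unassigned task to the offered slave and
// returns how many were (re)launched, mirroring the ResourceOffers handler.
func assignPending(tasks []*task, offeredSlave string) int {
	launched := 0
	for _, t := range tasks {
		if t.slaveID == nil {
			s := offeredSlave
			t.slaveID = &s
			launched++
		}
	}
	return launched
}

func main() {
	s1 := "slave(1)"
	tasks := []*task{{"go-cmd-task-1", &s1}, {"go-cmd-task-2", &s1}}

	markLost(tasks[1]) // pretend task 2 was TASK_KILLED
	n := assignPending(tasks, "slave(2)")
	fmt.Println("relaunched:", n, "on", *tasks[1].slaveID)
	// prints: relaunched: 1 on slave(2)
}
```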
  49. Conclusions. Creating your own custom schedulers is not that complicated. Easy to extend and to wrap HTTP APIs around to build the logic you need. A good pluggable solution!
  50. Thank you for listening.
