Distributed Development


Slides from the talk given at the 28 meeting of

Published in: Technology

  1. 1. Distributed Development<br />Project FreeSpace<br />Alexey<br />Dmitri<br />
  2. 2. This Talk Is About…<br />Speed<br />How to get things done faster<br />Quality<br />How to get feedback faster<br />How to get more testing done<br />Manageability<br />Cloud monitoring & control<br />Decentralization/fault-tolerance<br />Not only development!<br />
  3. 3. Joel’s Test <br />Do you use source control?<br />Can you make a build in one step?<br />Do you make daily builds?<br />Do you have a bug database?<br />Do you fix bugs before writing new code?<br />Do you have an up-to-date schedule?<br />Do you have a spec?<br />Do programmers have quiet working conditions?<br />Do you use the best tools money can buy?<br />Do you have testers?<br />Do new candidates write code during their interview?<br />Do you do hallway usability testing?<br />Typical answer: No.<br />
  4. 4. Best Tools Money Can Buy?<br />Hardware<br />Fast CPU<br />Lots of RAM<br />SSDs<br />Multiple monitors<br />Software<br />Commercial issue tracking<br />Paid source code hosting<br />File sync services<br />ReSharper<br />
  5. 5. In large/complex projects<br />IDE interaction is slow<br />Code analysis is slow<br />Compilation is slow<br />Testing is slow<br />(Re)deployment is slow<br />
  6. 6. IDE interaction is slow<br />IDEs are slow, but we cannot ditch them<br />We have very few software options for optimizing IDEs<br />E.g., VS is both disk I/O-bound (SSD a must) and CPU-bound<br />We cannot relocate, e.g., the ReSharper cache into a distributed service<br />IDEs can be spawned on the same project on many machines<br />Multiple screens/remote desktop windows<br />Synchronizable with Dropbox, SugarSync, etc.<br />But project/solution reloads in VS will kill you<br />
  7. 7. Code analysis is slow<br />In-depth analysis of either compiled or source code is computationally intensive<br />NDepend, FxCop and others can all be run remotely<br />Not just on the build server<br />Most of these tools output report files<br />Can send these to origin<br />Some of these tools can be made to work on per-file/per-project rather than per-solution<br />
  8. 8. Compilation is slow<br />Compilation is<br />C#/VB.NET – acceptable<br />F# – slow<br />C++ – atrociously slow<br />Made worse by pre/post-build<br />PostSharp<br />Entity Framework<br />Code Contracts<br />Moles<br />Etc.<br />VS compilation process inefficient<br />Will rebuild projects that haven’t changed<br />Will not parallelize by default<br />MSBuild is parallelizable<br />/m:n<br />Can spawn multiple processes<br />
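
The `/m:n` switch mentioned on this slide can be sketched as a small helper that assembles the MSBuild command line. This is only an illustration: the function name `msbuild_command` and the solution file `App.sln` are hypothetical, and the choice of one process per logical CPU is an assumption, not a recommendation from the talk.

```python
import os

def msbuild_command(solution, max_processes=None):
    """Return an argument list asking MSBuild to build with n parallel processes."""
    if max_processes is None:
        # Assumption: one MSBuild process per logical CPU, falling back to 1.
        max_processes = os.cpu_count() or 1
    if max_processes < 1:
        raise ValueError("max_processes must be at least 1")
    # /m:n is MSBuild's maximum-CPU-count switch referenced on the slide.
    return ["msbuild", solution, f"/m:{max_processes}"]

# "App.sln" is a hypothetical solution name used only for illustration.
command = msbuild_command("App.sln", 4)
print(" ".join(command))
```

The returned list could be passed to `subprocess.run` on a machine where MSBuild is on the path.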
  9. 9. Testing is (painfully) slow<br />Unit testing is badly parallelized<br />MbUnit’s [Parallelizable]<br />Same in NUnit 3<br />Can easily parallelize at different granularity<br />Test case/method<br />Test fixture<br />Test assembly<br />
  10. 10. Fear of Builds/Tests<br />Developers are loath to compile or run tests too frequently<br />Disruption of focus leads them to<br />Surf the web<br />Go for coffee<br /><insert your pastime here><br />Everyone loses<br />Loss of concentration/motivation<br />Developers never ‘in the zone’<br />TDD does not work<br />
  11. 11. Who cares?<br />Developers<br />Employers<br />Fixed salary<br />Don’t care about TTM<br />Accustomed to substandard tools/equipment<br />View compilation as a one-time process<br />Don’t care about frequent/continuous testing<br />See manual deployment as normal<br />More concerned with saving money than getting things done<br />Uninformed about good/best practices<br />Not concerned with quick delivery (in case of service companies that charge by the hour) <br />
  12. 12. Problem: nobody recognizes compilation/testing as wasted time<br />
  13. 13. How to speed things up?<br />Optimize or buy a faster computer<br />SSDs<br />More RAM<br />Faster CPUs<br />Costly! Has to be done for every developer.<br />Alternatively… use existing infrastructure<br />Both physical (e.g., dev machines) and virtual<br />Distribute workload between machines<br />Use idle resources – no need to buy new machines.<br />
  14. 14. Status Quo<br />Computers get faster<br />More cores per CPU<br />Faster hard drives (SSD, hybrid)<br />Software gets more demanding<br />Windows eats more and more RAM & HDD<br />VS is slower<br />Everyone else follows suit<br />The overall development experience is not getting any better<br />
  15. 15. Why Distributed?<br />Resource under-utilization<br />A typical enterprise (IT-specific or not) is unlikely to use 100% of processing resources<br />Resource overload<br />Bottlenecks in servers<br />Resource costs<br />Server-grade hardware<br />Reliability concerns (e.g. RAID)<br />
  16. 16. Three Pillars of Distribution<br />Get data on everyone’s machine<br />Cloud storage/file sync<br />Verification necessary<br />Get machines chatting with one another<br />XMPP client on each node<br />Load balance and optimize execution plan<br />Send commands to do work, get results<br />Work items synchronized via cloud storage<br />Redundancy/reliability guarantees<br />Integrate with existing systems<br />Easy because XMPP uses XML<br />
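
Because commands travel as XML, a work request can be sketched as an XMPP-style `<message>` element carrying a payload. The `<work>` child element, its `action` attribute and the address `node2@example.org` are assumptions made up for this illustration; they are not the FreeSpace wire format, and a real XMPP extension would also declare its own namespace.

```python
import xml.etree.ElementTree as ET

def build_command(to, action, target):
    """Serialize a work request as a message element with a hypothetical <work> payload."""
    message = ET.Element("message", {"to": to, "type": "normal"})
    work = ET.SubElement(message, "work", {"action": action})
    work.text = target
    return ET.tostring(message, encoding="unicode")

def parse_command(xml_text):
    """Return (recipient, action, target) from a serialized work request."""
    message = ET.fromstring(xml_text)
    work = message.find("work")
    return message.get("to"), work.get("action"), work.text

# Hypothetical node address and project name.
wire = build_command("node2@example.org", "build", "Core.csproj")
recipient, action, target = parse_command(wire)
```

Any existing system that can read XML can consume such a message, which is the integration point the slide alludes to.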
  17. 17. Scale Vectors<br />Core<br />Machine<br />Better support for Multi-Core<br />Exists in some cases<br />MSBuild<br />MbUnit (+ NUnit 3)<br />Could be leveraged in the general case<br />Not easy!<br />Needs to mind end user’s preferences<br />Support arbitrary networks (both on- and off-site)<br />Need to control code sync (security)<br />Can go for full resource utilization (esp. off-hours)<br />Speculative processing<br />E.g., Monte-Carlo simulations<br />Operations which are prohibitively resource-intensive<br />E.g., mutation tests<br />
  18. 18. Leveraging the Model<br />Compilation<br />Compiling on dev’s machine is counterproductive<br />Compilation of some languages (C++, Scala) takes far too much time<br />But the problem exists everywhere (.Net, Java)<br />Deployment<br />Testing<br />Large test base cannot be run on a dev’s machine<br />CI is not the answer<br />Constrained to a single machine<br />Can be distributed, but not straightforward<br />Code analysis<br />Very costly<br />Coverage analysis<br />Extremely costly<br />
  19. 19. Challenges<br />Load balancing<br />File synchronization<br />Security<br />
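
As one illustration of the load-balancing challenge, the sketch below assigns the longest tasks first to whichever machine currently has the least work. This greedy heuristic is a stand-in chosen for brevity, not the scheduler used by the project; the task names, machine names and time estimates are hypothetical, and machines are assumed to be equally fast.

```python
import heapq

def balance(tasks, machines):
    """Greedy longest-first assignment.

    tasks: {task name: estimated seconds}; machines: list of machine names.
    Returns {machine: [task names]}.
    """
    heap = [(0.0, machine) for machine in machines]
    heapq.heapify(heap)
    plan = {machine: [] for machine in machines}
    for name, cost in sorted(tasks.items(), key=lambda kv: kv[1], reverse=True):
        load, machine = heapq.heappop(heap)  # least-loaded machine so far
        plan[machine].append(name)
        heapq.heappush(heap, (load + cost, machine))
    return plan

# Hypothetical workload: four tasks and two idle machines.
plan = balance({"a": 8, "b": 5, "c": 4, "d": 3}, ["m1", "m2"])
```

Here `m1` receives `a` and `d` (11 seconds) and `m2` receives `b` and `c` (9 seconds). A production balancer would also need to account for machine speed and for the user's own activity on that machine.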
  20. 20. Compilation<br />MSBuild<br />Builds all major types of VS projects<br />Can parallelize locally (/m:n, n=# of processes)<br />Builds block VS<br />Build on the UI thread<br />Builds often inefficient<br />Cannot build only projects affected by changes<br />Cannot use multiple machines<br />
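
Building only the projects affected by a change amounts to walking the dependency graph in reverse from the changed projects. The following is a minimal sketch of that idea, assuming the dependency map is already known; the project names are hypothetical and nothing here reads real project files.

```python
from collections import deque

def affected_projects(deps, changed):
    """Return the changed projects plus everything that transitively depends on them.

    deps: {project: set of projects it depends on}.
    """
    # Invert the graph: for each project, who depends on it?
    dependents = {}
    for project, requires in deps.items():
        for required in requires:
            dependents.setdefault(required, set()).add(project)
    affected = set(changed)
    queue = deque(changed)
    while queue:
        current = queue.popleft()
        for dependent in dependents.get(current, ()):
            if dependent not in affected:
                affected.add(dependent)
                queue.append(dependent)
    return affected

# Hypothetical solution layout.
deps = {
    "Core": set(),
    "Data": {"Core"},
    "Web": {"Data"},
    "Tools": {"Core"},
    "Docs": set(),
}
to_rebuild = affected_projects(deps, ["Data"])
```

With this layout a change in `Data` requires rebuilding `Data` and `Web` only; `Core`, `Tools` and `Docs` are left alone.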
  21. 21. Distributed Compilation<br />Dramatically speed up solution build<br />Determine project dependencies<br />Build different projects on separate machines<br />Use multiple MSBuild processes per machine<br />Depending on CPU count & power<br />Does not distract the developer<br />Development machine usable without interruption<br />Quicker feedback on errors<br />Makes it possible to implement a continuous build policy<br />Build on every file save<br />
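
Once project dependencies are known, the build can be split into waves: every project in a wave depends only on earlier waves, so the projects within one wave can be sent to separate machines. The sketch below uses Python's standard `graphlib` module (Python 3.9 or later); the project names are hypothetical, and grouping into waves is one possible execution plan rather than the one the project necessarily uses.

```python
from graphlib import TopologicalSorter

def build_waves(deps):
    """Group projects into waves that can each be built in parallel.

    deps: {project: set of projects it depends on}.
    Raises graphlib.CycleError if the dependencies are circular.
    """
    sorter = TopologicalSorter(deps)
    sorter.prepare()
    waves = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())  # all projects whose dependencies are built
        waves.append(ready)
        sorter.done(*ready)
    return waves

# Hypothetical solution layout.
waves = build_waves({
    "Core": set(),
    "Data": {"Core"},
    "Web": {"Data"},
    "Tools": {"Core"},
    "Docs": set(),
})
```

For this layout the waves are `Core` and `Docs`, then `Data` and `Tools`, then `Web`. A finer-grained scheduler could start `Web` as soon as `Data` finishes rather than waiting for the whole second wave.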
  22. 22. Distributed Testing<br />Testing is slow<br />Unit testing is largely not parallelized<br />MbUnit [Parallelizable]<br />NUnit will only support it in v3<br />Not parallelized between several machines<br />Testing in specific environment difficult<br />Requires complicated (possibly manual) deployment processes<br />Developer typically only tests on their own box (+ maybe CI server)<br />Distributed testing ensures tests work everywhere<br />
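
One simple way to spread a test suite over several machines is to hash each test name into a shard, so every node can work out its own share without a central coordinator. This is a sketch under that assumption, not a description of how Gallio or the project distributes tests; the function names and test names are hypothetical.

```python
import hashlib

def shard_for(test_name, shard_count):
    """Map a test name to a shard index that is stable across machines and runs."""
    digest = hashlib.sha256(test_name.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % shard_count

def shard_tests(test_names, shard_count):
    """Split test names into shard_count lists, one list per machine."""
    shards = [[] for _ in range(shard_count)]
    for name in sorted(test_names):
        shards[shard_for(name, shard_count)].append(name)
    return shards

# Hypothetical test names.
names = ["ParserTests.Empty", "ParserTests.Nested", "DbTests.Insert", "DbTests.Delete"]
shards = shard_tests(names, 3)
```

Hash-based sharding ignores how long each test takes, so shards may finish at different times; weighting by recorded durations would even that out.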
  23. 23. Side Effects<br />Side effects are unwelcome on users’ machines<br />Environment changes may have undesired consequences<br />Builds are typically exempt from this<br />They do not affect anything beyond solution work folders<br />Unit tests may or may not affect host system<br />Integration tests typically do affect hosts<br />Require ‘clean’ set-ups<br />
  24. 24. Isolation<br />No side-effects<br />Irrelevant, just take care of load<br />Side effects<br />Process-level virtualization (for existing machines)<br />JauntePE<br />App-V<br />Virtual machines<br />Hyper-V<br />ESX<br />
  25. 25. Virtualized Testing<br />Creation of multiple physical nodes is costly<br />Physical machine re-configuration takes too much time<br />Can configure a virtual test environment with<br />Hyper-V<br />System Center Virtual Machine Manager<br />Virtual/physical migrations<br />Different hardware requirements<br />Multi-CPU system<br />Fast disks<br />Very large amounts of RAM<br />
  26. 26. Project FreeSpace<br />Private Cloud Infrastructure<br />XMPP + file sync<br />Single-MSI deployment<br />Plugin architecture<br />Fully self-updating<br />Decentralized service orchestration<br />Self-organizing<br />Each node has identical capability<br />Easy to administer<br />
  27. 27. Project FreeSpace Features<br />Distributed Compilation<br />Initially MSBuild<br />Distributed Unit Testing<br />Initially via Gallio test automation framework<br />Distributed Integration Testing<br />Virtual machine management<br />Initially via Hyper-V<br />Distributed deployment<br />E.g., create new VM for testers with appropriate binaries etc.<br />
  28. 28. Benefit Summary<br />Better than Continuous Integration<br />Better than Continuous Testing<br />Better than local compilation<br />Better than local testing<br />
  29. 29. Better Than Continuous Integration<br />Good for long-running builds/tests<br />Happens on a single machine<br />Can set up, e.g., multiple instances, but it’s not straightforward<br />Not designed for distributed builds<br />Not optimized for idle processing<br />Assumes server is dedicated<br />Does not give immediate feedback<br />Typically works on commit<br />I.e., detects source control changes<br />
  30. 30. Better Than Continuous Testing<br />Testing is often more costly than compilation<br />Typically, tests run on commit<br />Continuous testing (e.g., Mighty Moose) systems ensure that<br />Tests run on each save<br />Only tests affected by changes are executed<br />Fast feedback…<br />But not instant – you still need to recompile.<br />Given the option, why not build/test things all the time?<br />
  31. 31. Better Than Local Compilation<br />Does not block IDE<br />Scales across your network<br />Much faster builds<br />Immediate feedback<br />
  32. 32. Better Than Local Testing<br />Much faster recompilation<br />Tests do not tax developer CPU<br />Allows for immediate testing in different environments<br />Tests happen in parallel (where possible)<br />
  33. 33. That’s all!<br />Questions?<br />