Tech Challenges in a Large-Scale Agile Project

Harald Søvik and Morten Forfang

Computas AS, Lysaker Torg 45, 1327 Lysaker, Norway
{hso,mfo}@computas.com, http://www.computas.com

Abstract. A five-year, 25-man Java project effort, which started with a waterfall-like methodology and adopted Scrum after less than a year, has been concluded. We present three key technical challenges, briefly analyze their consequences and discuss the solutions we adopted. First, we discuss how we modularized our architecture, module delineation principles, coupling and the trade-offs of abstraction. Then we discuss testing environments, their automation, their relation to branches and the effect on customer involvement and feedback. Finally, we discuss the benefits and disadvantages of representing domain knowledge declaratively. For all three challenges we discuss how the project's agility was affected.

Key words: java, scrum, agile, modularization, coupling, technical challenge, module bloat, dependency, application layer, pojo, ide, testing, branch, continuous integration, deploy, maven, database updates, domain expert, footprint, declarative knowledge, business logic, domain knowledge, process

1 Introduction

A 5-year-long Java and agile project has recently come to a conclusion. It was conducted between Computas (a Norwegian consultancy firm) and a major Norwegian governmental authority. As a rather large project (manning 20-30 developers throughout), it faced a lot of challenges, but overall the project is recognized as a major success. This experience report focuses on three major technical challenges that arose during the project. We believe those challenges can partially be attributed to the Scrum methodology. For each of them, we try to identify the consequence and the cause, then follow up with any solutions we tried and an analysis of whether the problem was successfully solved.
1.1 A brief history

The project started out using a crossover variant of the waterfall methodology with long-term iterations, but turned towards agile after 8 months of development. The customer and domain experts needed to see, touch and feel the system
being developed. Many of them were previously not familiar with software development. Being able to see the results of their decisions quickly made them more confident and eager.

To support parallel development, the project was organized into 4 different teams, of which 3 were committing to the development branch, and one team worked on maintenance on the production branch. The teams were "self-sufficient" in technical expertise, meaning that multiple programmers could be working simultaneously on the same module or layer of the system. The teams were also assigned a pool of domain experts suitable for the tasks they had chosen for the next iteration.

The project was divided into 4 overall domain-centric periods, each supporting a specific business target. All of these focused on similar goals:

– Enable self-service or more automated customer interaction
– Reduce the number of internal databases and managed systems
– Increase data quality
– Unify business processes to ensure quality

2 Coupling and modularization

The dependency topology of a system and its relation to evolvability is a current, active research field [3]. It is predominantly argued that modularization, low coupling and high cohesion enhance salient non-functional characteristics of a system, like analyzability, integrity, changeability and testability [4], [7]. In this section, we explore some of the trade-offs we have found between modularization and various forms of efficiency. We use the terms subsystem, cohesion, coupling and framework as defined by [6], and component and collaboration as defined by [8].

2.1 Subsystems and frameworks

The project consists of a series of core subsystems and legacy systems. In this article we focus on the Mats subsystem. This subsystem depends crucially on the Framesolutions framework.
The Framesolutions framework is a fairly comprehensive set of customizable, proprietary components that many of our projects use. It provides core functionality on all application layers and, at the same time, a fairly rigid set of programming patterns. The framework is maintained centrally, separately from the project source code.

2.2 The Mats subsystem

For the purposes of this article, the Mats subsystem features four build-time modules: the application server, the web server, the client and a common module. The run-time and compile-time dependencies between these are illustrated in figure 1.
Fig. 1. The build-time and runtime dependencies between important Mats subsystem modules.

Technologically, the first two are realized as Java enterprise archives running on JBoss 4.2.3 servers, the client is a Java application and the common module is a jar file included in all the other modules.

Overall, this particular way of modularizing the Mats subsystem strikes a good balance between the increased complexity of having many modules, decreased intra-module coupling and increased module cohesion. There are however a couple of major ill effects:

– The common module got bloated.
– New, distinct functional areas were not handled well enough as separate collaborations with proper dependency management.

Let's first briefly discuss the module bloat. On the surface of it, there are few reasons why one shouldn't put a piece of code or a resource in the common module. It is always present, and the developer doesn't have to think about possible reuse: as long as the entity in question is put in common, it can always be reused by some other module. This idea was thought to be a very good one in a large project, even more so after we turned agile. We believed that a large, agile project should have a lot of supportive means to enhance reusability. When multiple teams were working on multiple tasks, it could be difficult to spot a candidate for reuse, but if the code was put in the very same module, we believed the programmers could hardly overlook its "reusability". The idea was not very successful: while reuse inside a specific sub-domain (read: team) has sometimes been possible, cross-domain (read: cross-team) reuse is virtually non-existent.

This leads to a bloated common module, lowers cohesion and heightens coupling. At runtime we increase the overall footprint, since the common module is present with all the other modules. At build time we increase the build time
since the common module is bloated. Finally, each of the non-common modules becomes unnecessarily complex, since there is much in the common module that is unwanted and unneeded by the module in question.

Now, let's move on to dependency management for new collaborations. Given some functionality, you may choose to put it into one of the four mentioned modules, but you cannot choose to put it in a module governed by functional area. For example, if you are writing web functionality for the farm animal domain, you can choose to put the code in the web or the common module, but there is no module that takes care of the farm animal domain.

This leads to modules characterized by having general-purpose and very domain-specific code side by side. Clearly this lowers the modules' cohesion and calls for sustained developer discipline to avoid increasing intra-module coupling.

With the comprehensive, automatic test approach and environment in the Mats project, we categorize the tests according to test level and domain. Characterizing tests by domain alleviates the problem of discerning which functional area a particular piece of code belongs to. This doesn't improve module coupling, but it makes maintenance and refactoring easier. For example:

    @Test(groups = {"unit", "husdyr"},
          dependsOnMethods = {"validererNyIndivid"},
          expectedExceptions = IllegalArgumentException.class)
    public void kanIkkeMerkesSammeDagenEllerTilbakeITid() {}

(from Mats 4.0b-SNAPSHOT, depending on TestNG version 5.8)

Here "husdyr" (farm animal) is a category corresponding to a fairly large functional subdomain.

2.3 Application Layer Separation

From a developer's technical perspective, we let the Mats subsystem (cf. section 2.2) be a single project. The Framesolutions framework and the code pertaining to the Integration Platform are kept apart.
Modern IDEs are getting quite good at providing instant, global symbol search and dependency tracking¹. This means that all the previously mentioned modules appear as one seamless whole.

The Framesolutions framework (cf. section 2.1) provides two quite powerful mechanisms that make the Mats subsystem (cf. section 2.2) module boundaries easier to handle.

First, there is the concept of a Framesolutions manager. Such a manager serves as a "bridge" over a layer or between modules. The bridge may well cross physical hosts and application servers. An example invocation from Mats is

    ImportMeldingManager.getInstance()
        .getSisteFoerstemottakerAdresser(importoer, foersteMottaker)

¹ The Mats project predominantly uses IntelliJ IDEA version 8.
This code will look identical in all four Mats subsystem modules. For example, if it is run on the web server, it will remote-call a server stub on the application server. The code is easy for a developer to understand, quite condensed and not far from the ideal of a POJO method invocation. Just this one line is needed; no annotation or other configuration.

Second, there is the concept of a soft pointer, which transparently deals with persistent objects that are not "owned", i.e. where the association between a class A and a class B is not such that CRUD operations on one are continued on the other. The soft pointer mechanism is, together with a cache, implemented such that no matter where in a virtual machine an instance is soft-referred to, it always points to the same, single persisted instance. This removes the burden of keeping track of multiple editions of the same object. It furthermore makes it possible to make intelligent, automated decisions w.r.t. persistence cascading and lazy loading.

Generally, this particular configuration makes it very easy to create functionality covering the entire vertical stack, from the GUI in the thick client or on the web, through business logic, down to the persistence or integration service layer. The code looks surprisingly similar on all application layers. Developers are overall very happy with this arrangement.

One would perhaps think that this great freedom would cause the inter-layer APIs to be in too great flux, and that the entire subsystem architecture would over time lose necessary integrity. Due to the rigour of the Framesolutions framework (cf. section 2.1), which enforces quite strong patterns on all layers, this does in general not happen. There are however still some major unintended effects:

– Just looking at the code in the IDE, it is not always clear on what layer (client, web server, application server) one is coding.
Due to the powerful Framesolutions abstraction features the code will work, but, not surprisingly, it often becomes highly resource-demanding.
– Since the entire Mats subsystem source is editable and available in the IDE, the build time on the individual developer host becomes intolerable. The IDE's automatic synchronization features also take an irritating amount of time.

These unintended effects were partially handled by using a hierarchical module structure, dividing all code into one of the Mats subsystem modules, cf. section 2.2. However, this was not a very effective layering, since a lot of code still resided in the common module. Other mitigating actions were to introduce faster developer hardware (particularly SSDs) and to optimize the build procedure.

It remains a problem, and a fundamental trade-off, whether to let the developer use the powerful abstraction features of the Framesolutions framework or write more module-dependent, efficient, lower-level code.

2.4 Summary

Overall we've found that developers are quite happy with the productivity boost they get from layer abstraction, having all the source in the IDE and the lack
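The soft pointer mechanism described in section 2.3 is proprietary, but its core idea, a VM-wide cache guaranteeing a single canonical instance per persisted identity, can be sketched in plain Java. All class and method names below are our own invention for illustration, not the actual Framesolutions API:

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

// Illustrative sketch only: a soft pointer resolves lazily through a shared,
// VM-wide cache, so every pointer with the same id yields the same instance.
final class SoftPointer<T> {

    // One canonical instance per (type, id). A real implementation would
    // also evict entries and delegate loading to the persistence layer.
    private static final Map<String, Object> CACHE = new ConcurrentHashMap<>();

    private final String key;
    private final Function<String, T> loader;

    SoftPointer(Class<T> type, String id, Function<String, T> loader) {
        this.key = type.getName() + ":" + id;
        this.loader = loader;
    }

    @SuppressWarnings("unchecked")
    T get() {
        // computeIfAbsent gives lazy loading plus instance identity for free.
        return (T) CACHE.computeIfAbsent(key, loader::apply);
    }
}
```

Two soft pointers created independently for the same id resolve to the very same object, which is what removes the burden of tracking "multiple editions of the same object" mentioned above.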
of bureaucracy when dealing with a small number of ever-present modules. We clearly could have introduced more modules, stronger barriers between them or different delineation principles. This could have bought us better cohesion and lower intra-module coupling. Chances are, we would also have introduced more overhead, increased inter-module coupling and increased demands on developer discipline.

3 Testing environments

To support the idea of agile development with domain experts, several testing environments were accessible to the experts throughout the project. These were introduced as early as possible, so that the experts could get comfortable with the system and the quirks of using a system under development.

We were working with two branches of the code (most of the time), and both of them had a range of environments associated with them. First, the development ("trunk") branch was the target of all changes bound for the next major release. Second, the maintenance branch held error fixes and small changes that could be deployed to production very often. Each of these branches had three classes of testing environments: continuous build [2], nightly build and manual, full-scale build. Thus, there were six different environments to maintain at all times. We invested heavily in infrastructure and build automation for these environments.

3.1 Effect

Because of the wide variety of testing environments, the experts always had an appropriate environment for their testing needs. At first, it proved a little difficult for them to realize how to employ this diversity, and it was frequently necessary to remind them of which environment it would be advisable to test a new feature or a bug fix in. After the first major release, most people got acquainted with the idea and became very happy with the different environments. See Table 1 for the different applications.
                Continuous               Nightly                Manual
  Development   Experimental features    New, stable features   Production-ready features
  Maintenance   Experimental bug fixes   Stable bug fixes       Regression tests

  Table 1. Different testing environments

A key feature in maintaining all of these environments was having completely automated build and deploy processes. This proved incredibly valuable, both for scheduled (nightly) builds and whenever a new feature had to be tested immediately (continuous). To be able to automate the deploy process, it was close
  7. 7. Tech challenges In a Large-Scale Agile Project 7 to necessary gather all build steps under a single command. This was success- fully implemented with Maven. This helped assure similar builds on multiple platforms, i.e. different integration environments and development platforms. 3.2 Difficulties Database structured updates proved to be difficult to automate, and were thus handled manually. It would have been possible to automate within the applica- tion, but the customer technicians discouraged such an approach. It was believed that an automated handling would weaken the exhibited control over database schema. After spending a lot of time reimbursing the necessity of continuous testing, many of the domain experts got used to the idea of having a environment at hand all day and night. This notion was also backed by a stable production sys- tem. Thus, whenever a testing environment failed to deploy correctly at night, or a programming error lead to a critical failure of the application - the domain experts would quickly let us know that they were inhibited from doing their work efficiently. To counter this, the service monitoring tool used in production were also employed to testing environments - with email and even SMS alerts. The testing environments were subject to the same uptime requirements as produc- tion. Of course - there were no really dramatic consequences when a fault lead to unplanned downtime. 3.3 Conclusion The biggest con of this approach was the impact on the development platform. Whereas nightly deployment worked fine, frequent deployments by developers or the continuous integration engine proved very time consuming. It is thus our conclusion that a more flexible and dynamic approach should be available for developers. This was partially implemented with hot code reloading tools, like Java hotspot and JRebel. 
These approaches were partially successful: many changes were handled and deployed seamlessly, but changes to initialization code and structurally modeled data had to be handled otherwise.

Testing environments and infrastructure are at the heart of an agile project. The ability to quickly recognize new or altered features, and to give feedback to the appropriate developers or domain experts, is a stepping stone in enabling agility.

4 Declaratively represented knowledge

A key concept in this project was business process modeling: the separation of domain knowledge from implementation. In artificial intelligence, cognitive architecture work and knowledge modeling, the distinction between declarative and procedural knowledge has long been deemed important [1], [5]. This idea can be paraphrased as separating what to do from how to do it. It allows the
  8. 8. 8 Harald Søvik and Morten Forfang domain programmers to work decoupled from the technical programmers. The processes can be modeled and mocked and remodeled, until the ideal workflow is reached. A few vendors offers systems implemented this way, and Computas is per- haps the leading supplier in Norway. We believed that this technique would fit quite well in an agile project. It would be possible to change the behaviour of the system without altering the technical details, and visa versa: Altering the technical implementation without altering the business processes. 4.1 Effect This approach presents challenges to the way a customer (and developers) man- age software. Since the knowledge is decoupled from the program logic, it is possible to alter the behaviour of a running program. Thus, whenever the developers decided to refactor the business logic, or the domain experts decided to alter the business processes, it would introduce no overhead for the other party. And this proved to be a correct assumption. Even though most of the business logic stayed the same once written, the business pro- cess often changed in subsequent iterations. This made the domain experts rater self sufficient in many questions, and made it possible to adapt to experiences in a quick fashion. However, we underestimated the need of technical means to handle this dis- tinction: It should not be necessary to rebuild and redeploy such a system when developing and testing processes. Nevertheless, developers often did so to ”ensure correctness and consistency”. 4.2 Difficulties It might sound trivial, but postponing to figure out how to correctly reload pro- cess definitions at runtime reduced the development turnaround significantly. And not to mention the developers dissatisfaction of having to wait for changes to deploy before testing. Thus, it became clear that the application had to be de- signed for ”hot knowledge reloading”. 
This challenge was solved by implementing a re-initialization feature: whenever process definitions or other domain-defined data changed, the developer was able to reload the definitions and start using them right away. This feature is not used in production, but would theoretically allow process managers to alter a process in the middle of the day, without scheduling downtime or employing technical personnel.

Also, our proprietary domain modeling language (although BPEL-compliant) was of course unsupported in any integrated development environment (IDE). The language itself has a dedicated IDE which is well implemented and very functional for modeling purposes, but the functionality provided by our code IDE [9] was oblivious to such a data representation, and provided no development support whatsoever. In hindsight, we should have realized the cost and implemented a support plugin for our code IDE, so that it would treat changes to
process code the same way it treated technical code, that is, provide support for syntax highlighting, autocompletion and code deployment.

4.3 Conclusion

The use of proprietary data formats can be both rewarding and expensive. One needs to be aware of the necessary level of support, and accept the cost of maintaining it. Declarative knowledge representations are potentially a major catalyst when working with domain experts, but actually implementing the knowledge requires a high level of expertise. Here too, one should be very careful with respect to the tools one chooses for implementation and assistance.

5 Summary

This experience report outlines three difficulties we experienced during the project. Although the details may appear minor, the impact of improvements was significant because of the project's size. We believe it is incredibly important not to neglect problems that were manageable in smaller projects but may scale fast in large projects, and to give the developers an arena where they can give an early warning of such challenges.

We would like to emphasize some key issues from this report:

– Abstraction and generalization should be driven by necessity and experience, not guesswork and intentions.
– Abstraction is not a substitute for decoupling and formally controlled module interfaces.
– Treat your testing environment like a small production environment. Define a Service Level Agreement for your domain experts and testers. Automate everything.
– Do not underestimate the accumulated overhead caused by small problems being ignored over a lengthy period of time.
– Do not underestimate the support that can be provided by modern-day development tools.

References

1. Brachman R. and Levesque H.: Knowledge Representation and Reasoning, 1st ed., Morgan Kaufmann, 2004, ISBN 978-1558609327
2.
Fowler M.: Continuous Integration, http://www.martinfowler.com/articles/continuousIntegration.html
3. Breivold H.: Software Architecture Evolution and Software Evolvability, Mälardalen University, Sweden, 2009
4. Breivold H. et al.: Analyzing Software Evolvability, 32nd IEEE International Computer Software and Applications Conference (COMPSAC), Finland, 2008
5. Kendal S. and Creen M.: An Introduction to Knowledge Engineering, 1st ed., Springer, 2006, ISBN 978-1846284755
6. Lethbridge T. and Laganière R.: Object-Oriented Software Engineering, 2nd ed., McGraw-Hill, 2005, ISBN 978-0073220345
7. Pressman R.: Software Engineering: A Practitioner's Approach, 5th ed., McGraw-Hill, ISBN 978-0073655789
8. Unified Modeling Language, version 2.x, www.uml.org
9. http://www.jetbrains.com/idea/
