  1. 1. The Web Giants Culture – Practices – Architecture AUGMENTED
  2. 2. Table of Contents
     Foreword, 6
     Introduction, 9
     Culture, 11
       The Obsession with Performance Measurement, 13
       Build vs Buy, 19
       Enhancing User Experience, 27
       Code crafters, 33
       Open Source Contribution, 41
       Sharing Economy platforms, 47
     Organization, 57
       Pizza Teams, 59
       Feature Teams, 65
       DevOps, 71
     Practices, 85
       Lean Startup, 87
       Minimum Viable Product, 95
       Continuous Deployment, 105
       Feature Flipping, 113
       Test A/B, 123
       Design Thinking, 129
       Device Agnostic, 143
       Perpetual beta, 151
     Architecture, 157
       Cloud First, 159
       Commodity Hardware, 167
       Sharding, 179
       TP vs. BI: the new NoSQL approach, 193
       Big Data Architecture, 201
       Data Science, 211
       Design for Failure, 221
       The Reactive Revolution, 227
       Open API, 235
     About OCTO Technology, 243
     Authors, 245
  3. 3. Foreword It has become a cliché to start a book, a talk or a preface by stating that the rate of change is accelerating. However, it is true: the world is changing faster, both because of the exponential rate of technology evolution and because of the central role of the user in today’s economy. It is also a change characterized by Marc Andreessen in his famous blog post as “software is eating the world”. Not only is software at the core of the digital economy, but producing software is changing dramatically too. This is not a topic for Web companies alone; it is a revolution that touches all companies. To cope with the changes in their environment, they need to reinvent themselves as software companies, with new ways of working, organizing themselves and producing digital experiences for their customers. This is why I am so pleased to write the preface to “The Web Giants”. I have been using this book intensely since the first French edition came on the market. I have given copies to colleagues both at Bouygues Telecom and at AXA, and I have made it a permanent reference in my own blogs, talks and writing. Why? It is the simplest, most pragmatic and most convincing set of answers to the question raised above: what to do in this software-infused, technology-enabled, customer-centric, fast-changing 21st century? This is not a conceptual book, a book about why you should do this or that. It is a beautifully written story about how software and service development is organized in some of the best-run companies in the world. First, this is a book about practices. The best way to grow change in a complex world is to adopt practices. It is the only way to learn: by doing. These practices are sorted into three categories: culture, organization and architecture; but there is a common logic and a systemic reinforcement. Practices are easier to pick up and less intimidating than methodologies or concepts. However, strong will and perseverance are required.
I will not spoil your reading by summarizing what OCTO found when they looked at the most common practices of the most successful software companies in the world. I will rather try to convince you that reading this book is an urgent task for almost everyone, based on four ideas. The first and foremost idea is that software systems must be built to change constantly. This is equally true for information systems, support systems, and embedded, web or mobile software. What we could define as customer engagement platforms are no longer complex systems that one designs and builds, but continuously evolving systems that are grown. This new generation of software systems is the core of the Web Giants. Constant evolution is mandatory to cope with exponential technology changes, and it is the only way to co-construct engagement platforms through customer feedback. The unpredictability of usage, especially social usage, means that digital experiences are software processes that can only be crafted through measurement and continuous improvement. This
  4. 4. critical change, from software being designed to software being grown, means that all companies that provide digital experiences to their customers must become software companies. A stable software support system could be outsourced, delegated or bought, but a constantly evolving, self-adaptive system becomes a core capability. This capability is deeply intertwined with the business, and its delivery processes and agents are to be valued and respected. The second key idea is that there exists a new way of building such software systems. We are facing two tremendous challenges: to churn out innovations at the rate expected by the market, and to constantly integrate new features while factoring out older ones, to avoid the suffocation by constant growth that plagued previous generations of software systems. The solution is a combination of open innovation (there are clearly more smart developers outside any company than inside) together with source-level “white box” integration and minimalist “platform” design principles. When all your code needs to be constantly updated to follow changes in the environment, the less you own the better. It is also time to bring source code back from the dark depths of “black box integration”. Open source culture is both about leveraging the treasure trove of what may be found in larger development communities and about mashing up composite applications by weaving source code that one may be proud of. Follow in the footsteps of the Web Giants: code that changes constantly is worth being well written, structured, documented and test-viewed by as many eyeballs as possible.
The third idea is another way of saying that “software is eating the world”: this book is not about software, it is about a new way of thinking about your company, whichever business you are in. Not surprisingly, many “known” practices such as agile development, lean startup, measure obsession or obsession about saving the customer’s time (the most precious commodity of the digital age) have found their way into OCTO’s list. By reading the practical testimonies from the Web Giants, you will see a new kind of customer-focused organization emerge. Thus, this is a book for everyone, not for geeks only. This is of the utmost importance since many of the change levers lie in the hands of stakeholders other than the software developers themselves. For instance, a key requirement for agility is to switch from solution requirements to problem requirements, allowing the solution to be co-developed by cross-functional teams as well as users. The last idea I would propose is that there is a price to pay for this transformation. There are technologies, tools and practices that you must acquire and learn. DevOps practices, such as continuous delivery or managing infrastructure as code, require mastering a set of tools and building skills; there is no “free lunch”. A key set of benefits from the Web Giants way of working comes from massive automation. This book also
  5. 5. shows some of the top recent technology patterns in the architecture section. Since this list is evolving by nature, the most important lesson is to create an environment where “doers” may continuously experience the tools of the future, such as massively parallel cloud programming, big data or artificial intelligence. A key consequence is that there is a true efficiency and competitiveness difference between those who do and those who don’t master the said set of tools and skills. In the world of technology, we often use the word “Barbarians” to talk about newcomers who leverage their software/technology skills to displace incumbents in older industries. This is not a question of mindset (trying to take legacy companies head-on is an age-old strategy for newcomers) but a matter of capabilities! As stated earlier, there would be other, more conceptual, ways to introduce the key ideas and practices that are pictured in this book. One could cite the best sources on motivation and collaborative work, such as Daniel Pink for instance. These Web Giants practices reflect the state of the art of managing intrinsic motivation. The same could be said about the best books on lean management and self-organization. The reference to Lean Startup is one of many subtle references to the influence of the Toyota Way on modern 21st-century forms of organization. Similarly, it would be tempting to invoke complex system theory (see Jurgen Appelo and his “Management 3.0” book for instance) to explain why the practices observed and selected by OCTO are the natural answer to the challenges of the increasingly changing and complex world that we live in. From a technology perspective, it is striking to see the similarity with the culture and organizational traits described by Salim Ismail, Michael Malone and Yuri van Geest in their book “Exponential Organizations”.
The beauty of this pragmatic approach is that you have almost all you need to know in a much shorter package, which is fun and engaging to read. To conclude this preface, I would advise you to read this book carefully, to share it with your colleagues, your friends and your children, when it’s time to think about what it means to do something that matters in this new world. It tells a story about the new way of working that you cannot afford to miss. Some of the messages (measuring everything, learning by doing, loving your code and respecting those who build things) may make the most seasoned manager smile, but times are changing. This is no longer a set of suggested, “nice-to-have” practices, as it might have been ten years ago. It is the standard of web-age software development, and de facto the only way for any company to succeed in the digital world. Yves Caseau - National Academy of Technologies of France, President of the ICT commission. Head of Digital of AXA Group
  6. 6. Introduction Something extraordinary is happening at this very moment; a sort of revolution is underway. Across the Atlantic, as well as in other parts of the world such as France, people are reinventing how to work with information technology. They are Amazon, Facebook, Google, Netflix and LinkedIn, to name but the most famous. This new generation of players has managed to shed old dogmas and examine afresh the issues at hand, coming up with new, radical and efficient solutions for long-standing IT problems. Computer scientists are well aware that when IT tools are introduced to a trade, the benefits of computerization can only be reaped if business processes are rethought in light of the new potential offered by technology. One trade, however, has thus far mostly managed to avoid upheavals in its processes: Information Technology itself. Many continued, and still continue, to build information systems the way one would build highways or bridges. There is a tendency to forget that the matter being handled on a daily basis is extremely volatile. By dint of hearing about Moore’s law,[1] its true meaning is forgotten: what couldn’t be done last year is possible today; what cannot be done today will be possible tomorrow. The beliefs and habits of the ecosystem we live in must be challenged at regular intervals. This thought is both terrifying and wonderful. Now that the pioneers have paved the way, it is important to revisit business processes. The new approaches laid out here offer significant increases in efficiency, proactivity, and the capacity for innovation, to be harnessed before the competition pulls the rug out from under your feet. The good news is that the Web Giants are not only paving the way; they espouse the vision of an IT community.
They are committed to the Open Source principle, communicate openly about their practices to appeal to potential recruits, and work in close collaboration with the research community. Their work methods are public knowledge and very accessible to those who care to delve. The aim of this book is to provide a synthesis of practices, technological solutions and the most salient traits of IT culture. Our hope is that it will inspire readers to make contributions to an information age capable of reshaping our world. This book is designed for both linear and thematic reading. Those who opt for the former may find some repetition. [1] An empirical law which states that computing power at a fixed price roughly doubles every 18 months.
  7. 7. Culture
  8. 8. The obsession with performance measurement, 13
     Build vs Buy, 19
     Enhancing the user experience, 27
     Code crafters, 33
     Developing Open Source, 41
  9. 9. The obsession with performance measurement
  11. 11. Description In IT, we are all familiar with quotes reminding us of the importance of performance measurement: “That which cannot be measured cannot be improved”; “Without measurement, it is all opinion”. Web Giants have taken this idea to the extreme, and most have developed a strong culture of performance measurement. The structure of their activities leads them in this direction. These activities often share three characteristics. First, for these companies, IT is their means of production. Their costs are therefore directly correlated to the optimal use of equipment and software, and improvements in the number of concurrent users or in CPU usage result in rapid ROI. Second, revenues are directly correlated to the efficiency of the service provided. As a result, improvements in conversion rates lead to rapid ROI. Third, they are surrounded by computers! And computers are excellent measurement instruments, so they may as well get the most out of them! Most Web Giants have made a habit of measuring everything: response times, the most visited web pages, the articles (content or sales pages) that work best, the time spent on individual pages... In short, nothing unusual, at first glance. But that’s not all! They also measure the heat generated by a given CPU, or the energy consumption of a transformer, as well as the average time between two hard disk failures (MTBF, Mean Time Between Failures).[1] This motivates them to build infrastructure that maximizes the energy efficiency of their installations, as these players closely monitor PUE, or Power Usage Effectiveness. Most importantly, they have learned to base their action plans on this wealth of metrics. [1]
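As a rough illustration of the two infrastructure metrics named above, the sketch below computes MTBF as total operating time divided by the number of failures, and PUE as total facility energy divided by the energy used by IT equipment alone. The function names and the sample figures are hypothetical and chosen only for illustration; they do not come from any Web Giant.

```python
def mtbf_hours(total_operating_hours: float, failure_count: int) -> float:
    """Mean Time Between Failures: operating time divided by failures observed."""
    if failure_count <= 0:
        raise ValueError("MTBF is undefined without at least one failure")
    return total_operating_hours / failure_count


def pue(total_facility_kwh: float, it_equipment_kwh: float) -> float:
    """Power Usage Effectiveness: facility energy over IT equipment energy.

    A value of 1.0 would mean every kWh goes to the IT equipment itself.
    """
    if it_equipment_kwh <= 0:
        raise ValueError("IT equipment energy must be positive")
    return total_facility_kwh / it_equipment_kwh


if __name__ == "__main__":
    # Hypothetical sample figures, for illustration only.
    print("MTBF (hours):", mtbf_hours(total_operating_hours=100_000.0, failure_count=4))
    print("PUE:", pue(total_facility_kwh=1_500.0, it_equipment_kwh=1_000.0))
```

Tracking such ratios over time, rather than as one-off readings, is what makes them usable as a basis for action plans.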
  12. 12. Part of this trend is A/B testing (see “A/B Testing” on p. 123 for further information), which consists of testing different versions of an application on different client groups. Does A work better than B? The best way to find out remains objective measurement: it yields concrete data that can defy common sense and reveal the limits of armchair expertise, as demonstrated by the website, which references A/B testing results. In an interview, Yassine Hinnach, then Senior Engineer Manager at LinkedIn, spoke of how LinkedIn teams were encouraged to quickly put to the test any technology designed to boost site performance. Decisions to adopt a given technology are thus made on the basis of observed metrics. has published an article presenting Amazon’s recipes for success, based on interviews with its CTO. Among the more interesting quotes, the following caught our attention: “Everyone must be able to experiment, learn, and iterate. Position, obedience, and tradition should hold no power. For innovation to flourish, measurement must rule.”[2] As another example of this approach, here is what Timothy B. Lee, a journalist for Wired and the New York Times, had to say about Google’s culture of performance measurement: “Rather than having intimate knowledge of what their subordinates are doing, Google executives rely on quantitative measurements to evaluate the company’s performance. The company keeps statistics on everything – page load times, downtime rates, click-through rates, etc. – and works obsessively to improve these figures. The obsession with data-driven management extends even to the famous free snacks, which are chosen based on careful analysis of usage patterns and survey results.”[3] [2] [3] odds.ars
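To make the question “Does A work better than B?” concrete, here is a minimal sketch of one common way to compare two conversion rates: a two-proportion z-test. The book does not say which statistical method any of these companies uses, so the choice of test, the function name and the sample counts are all assumptions made for illustration.

```python
import math
from typing import Tuple


def two_proportion_z_test(conv_a: int, n_a: int, conv_b: int, n_b: int) -> Tuple[float, float]:
    """Compare conversion rates of variants A and B.

    Returns (z, p_value), where p_value is two-sided under the normal approximation.
    """
    if n_a <= 0 or n_b <= 0:
        raise ValueError("both groups need at least one visitor")
    p_a = conv_a / n_a
    p_b = conv_b / n_b
    # Pooled rate under the null hypothesis that A and B convert equally well.
    pooled = (conv_a + conv_b) / (n_a + n_b)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    if std_err == 0:
        raise ValueError("no variation in the data; the test is undefined")
    z = (p_b - p_a) / std_err
    # Two-sided p-value: 2 * (1 - Phi(|z|)) equals erfc(|z| / sqrt(2)).
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


if __name__ == "__main__":
    # Hypothetical counts: 1,000 visitors per variant.
    z, p = two_proportion_z_test(conv_a=100, n_a=1000, conv_b=150, n_b=1000)
    print(f"z = {z:.2f}, p = {p:.4f}")
```

A small p-value suggests the observed difference is unlikely to be chance alone; the threshold for acting on it is a business decision, not something the test itself supplies.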
  13. 13. The consequences of this modus operandi run deep. A number of pure players display in their offices the motto “In God we trust. Everything else, we test”. This is more than just a nod to Deming;[4] it is a profoundly pragmatic approach to the issues at hand. An extreme example of this trend, verging on caricature, is Google’s “Project Oxygen”: a team of internal statisticians combed through HR data collected from within (annual performance reviews, feedback surveys, nominations for top-manager awards). They distilled the essence of what makes a good manager down to 8 rules. Reading through them, any manager worthy of the name would be struck by how jaw-droppingly obvious it all seems. However, they backed their claims with cold, hard data,[5] and that made all the difference! What about me? The French are fond of modeling, and are often less pragmatic than their English-speaking counterparts. Indeed, we believe that this constant and quick feedback loop of “hypothesis, measurement, decision” should be an almost systematic reflex in the Information Systems Department (ISD) world, and can be put into effect at a moment’s notice. The author of these lines still has painful memories of two four-hour meetings with ten people, organized to find out if shifting requests to the service layer to HTTP would have a “significant” impact on performance. Those meetings consumed 80 person-hours, or ten working days, which would have been more than enough for a developer to measure the answer directly, and at a much lower cost. OCTO consultants have also had the experience, several times over, of discovering that applications performed better when the cache that was used to improve performance was removed! The cure was therefore worse than the disease, and its alleged efficacy had never actually been measured. Management runs the risk of falling into the trap of believing that analysis by “hard data” is a done deal.
It may be a good idea to regularly check that this is indeed the case, and especially that the information gathered is put to use in decision-making. [4] “In God we trust; all others must bring data”, W. Edwards Deming. [5] Adam Bryant, Google’s Quest to Build a Better Boss, The New York Times Company, March 12, 2011
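The cache anecdote above lends itself to a small illustration of the “hypothesis, measurement, decision” loop: time the same workload with and without the cache, then decide from the numbers. This is only a sketch. `lookup` is a hypothetical stand-in for the real operation, and `functools.lru_cache` stands in for whatever cache is under suspicion.

```python
import time
from functools import lru_cache
from typing import Callable, Dict, Iterable


def lookup(key: int) -> int:
    """Hypothetical cheap operation standing in for the real service call."""
    return key * 2


# Hypothesis: wrapping `lookup` in a cache makes the workload faster.
cached_lookup = lru_cache(maxsize=1024)(lookup)


def measure(fn: Callable[[int], int], keys: Iterable[int], repeats: int = 5) -> float:
    """Return the best wall-clock time in seconds over `repeats` runs."""
    key_list = list(keys)
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        for key in key_list:
            fn(key)
        best = min(best, time.perf_counter() - start)
    return best


def decide(keys: Iterable[int]) -> Dict[str, object]:
    """Measure both variants on the same keys and keep the cache only if it wins."""
    key_list = list(keys)
    plain_s = measure(lookup, key_list)
    cached_s = measure(cached_lookup, key_list)
    return {"plain_s": plain_s, "cached_s": cached_s, "keep_cache": cached_s < plain_s}


if __name__ == "__main__":
    print(decide(range(10_000)))
```

For an operation as cheap as this one, the cache bookkeeping may well cost more than it saves, which is exactly the kind of result only a measurement reveals. A real comparison would use production-like access patterns.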
  14. 14. Nevertheless, it cannot be emphasized enough that an ecosystem fostering the application of said information makes up part of the recipe for success of Web Giants. Two other practices support the culture of performance metrics. Automated tests: it’s either red or green, no one can argue with that. This also ensures that it is always the same thing being measured. Short cycles: to measure, and especially to interpret, the data, one must be able to compare options “all other things being equal”. This is crucial. We recently diagnosed the steps undertaken to improve the performance of an application. But about a dozen other optimizations were made to the next release. How then can efficient optimizations be distinguished from those that are counterproductive?
  15. 15. Build vs Buy
  17. 17. Description One striking difference between the strategy of Web Giants and that of more conventional IT departments lies in how they arbitrate between Build and Buy. The issue is as old as computers themselves: is it better to invest in designing software that best fits your needs, or to use a software package that comes with the accumulated experience and R&D of a publisher (or community) who has had ample time to master the technology and the business issues? Most major firms have gone for the second option and have enshrined maximal use of software packages among their guiding principles, on the view that IT is not one of their core businesses and so is better left to professionals. The major Web companies have tended to do the exact reverse. This makes sense given that IT is precisely their core business, and as such is too sensitive to be left in the hands of outsiders. The resulting divergences are thus coherent. Nonetheless, it is useful to push the analysis one step further, because Web Giants have other motives too: first, being in control of the development process to ensure it is perfectly adjusted to their needs, and second, the cost of scaling up! These concerns are found in other IT departments too, meaning that it can be a good idea to look very closely into your software package decisions. Finding balanced solutions On the first point, one of the built-in flaws of software packages is that they are designed around the needs that arise most often among the publisher’s clients.[1] Your needs are thus only a small subset of what the software package is built to do. Adopting a software package by definition entails overkill, i.e. an overly complex solution not optimized for your needs, and one which has a price in terms of both execution and complexity, offsetting any savings made by not investing in the design and development of a complete application. [1] We will not dwell here on the fact that you should not stray too far from the standard out-of-the-box software package, as this can be (very) expensive in the long term, especially when there are new releases.
  18. 18. This is particularly striking in the software package data model. Much of the model’s complexity stems from the fact that the package is optimized for interoperability (a highly standardized Conceptual Data Model, extension tables, low model expressiveness as it is a meta-model...). However, the abstractions and the “hyper-genericity” that this leads to in software design have an impact on processing performance.[2] Moreover, Web Giants have constraints in terms of volumes, transaction speed and the number of simultaneous users which push the envelope of traditional architecture and which, in consequence, require fine-tuned optimizations determined by observed access patterns. Such read-intensive transactions must not be optimized in the same way as others, where the stakes will be determined by I/O write metrics. In short, to attain such results, you have to pop the hood and poke around in the engine, which is not something you will be able to do with a software package (all guarantees are revoked from the moment you fiddle with the innards). Because performance is an obsession for Web Giants, the overhead costs and limited possibilities for adjusting a software package make it quite simply unacceptable. Costs The second particularly critical point is of course the cost when scaling up. When the number of processors and servers increases, the costs rise very quickly, but not always in linear fashion, making some items more visible. And this is true of both business software packages and hardware. That is precisely one of the arguments which led LinkedIn to gradually replace their Oracle database with an in-house solution, Voldemort.[3]
In a similar vein, in 2010 we carried out a study on the main e-commerce sites in France: at the time, eight of the ten largest sites (in terms of annual turnover) ran on platforms developed in-house and two used e-commerce software packages. [2] When it is not a case of a cumbersome interface. [3] Yassine Hinnach, Évolution de l’architecture de LinkedIn, enjeux techniques et organisationnels, USI 2011
  19. 19. Web Giants thus prefer Build to Buy. But that is not all. They also make massive use of open source solutions (cf. “Developing open source”, p. 41). Linux and MySQL reign supreme in many firms. Development languages and technologies are almost all open source: very little .NET for example, but instead Java, Ruby, PHP, C(++), Python, Scala... And they do not hesitate to fork other projects: Google, for example, uses a heavily modified Linux kernel.[4] This is also the case for one of the main worldwide Global Distribution Systems. Most technologies making a stir today in the world of high-performance architecture are the result of developments carried out by Web Giants and then opened to the community: Cassandra, developed by Facebook; Hadoop and HBase, inspired by Google and developed by Yahoo!; Voldemort, by LinkedIn... This is a way, in fact, of combining the advantages of software perfectly tailored to your needs with improvements contributed by the development community and, as an added bonus, a market trained to use the technologies you use. Coming back to the example of LinkedIn, many of their technologies are grounded in open source solutions: Zoie, a real-time indexing and search system based on Lucene; Bobo, a faceted search library based on Lucene; Azkaban, a batch workflow job scheduler to manage Hadoop job dependencies; GLU, a deployment framework. [4]
  20. 20. How can I make it work for me? Does this mean I have to do away with software packages in my IT choices? Of course not, not for everything. Software packages can be the best solution: no one today would dream of reengineering a payroll system. However, ad hoc developments should be considered in certain cases, namely when the IT tool is key to the success of your business. Figure 1 lays out orientations in terms of strategy. The other context where specific developments can be the right choice is that of high performance: with companies turning to “full web solutions”, very few business software packages have the architecture to support the traffic intensity of some websites. As for infrastructure solutions, open source has become the norm: OSs and application servers foremost, and often also databases and message buses. Open source solutions are ideally suited to running the services of Web Giants. There is no doubt as to their capacity for performance and stability. One hurdle remains: reluctance on the part of CIOs to forgo the support found in software packages. And yet, when you look at what actually happens when there are problems with a commercial technical platform, it is rarely support from the publisher, handsomely paid for, which provides the solution, but rather networks of specialists and help forums on the Internet. [Figure 1, three tiers. SPECIFIC: innovations and strategic assets; unique, differentiating; perceived as a commercial asset. SOFTWARE PACKAGE: common to all industry organizations; perceived as a production asset. BPO[5]: resources; common to all organizations; perceived as a resource. Axis labels: Faster, Cheaper.] [5] Business Process Outsourcing.
  21. 21. For application platforms of the database or message bus type, the answer is less clear-cut because some commercial solutions include functionalities that you do not find in open source alternatives. However, if you are sending Oracle into regions where MySQL cannot follow, that means you have very sophisticated needs... which is not the case for 80% of the contexts we encounter!
  22. 22. Enhancing User Experience
  24. 24. Description Performance: a must One conviction shared by Web Giants is that users’ judgment of performance is crucial. Performance is directly linked to visitor retention and loyalty. How users feel about a particular service is linked to the speed with which the graphic interface is displayed. Most people have no interest in software architecture, server power, or the network latency of web-based services. All that matters is the impression of seamlessness. User-friendliness is no longer negotiable Web Giants have fully grasped this and speak of metrics in terms of “the bat of an eyelash”. In other words, it is a matter of fractions of a second. Their measurements, carried out notably through A/B testing (cf. “A/B Testing”, p. 123), are very clear: Amazon: a 100 ms increase in latency means a 1% loss in sales. Google: a page taking more than 500 ms to load loses 20% of traffic (pages visited). Yahoo!: more than 400 ms to load means 5 to 9% more abandonment. Bing: over 1 second to load means a loss of 2.8% in advertising revenue. How are these performances attained? In keeping with the Device Agnostic pattern (cf. “Device Agnostic”, p. 143), Web Giants develop native interfaces, or Web interfaces, to always offer the best possible user experience. In both cases, performance as perceived by the user must be maximized.
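To get a feel for what such figures imply, the sketch below turns the quoted Amazon ratio (1% of sales lost per 100 ms of added latency) into a back-of-the-envelope estimate. Extrapolating that ratio linearly to other latency values is purely an assumption made here, and the function name and sample revenue are hypothetical.

```python
def estimated_sales_loss(
    annual_sales: float,
    added_latency_ms: float,
    loss_per_100ms: float = 0.01,
) -> float:
    """Estimate sales lost to added latency.

    Assumes the loss scales linearly with latency, which the quoted
    figure does not guarantee; treat the result as an order of magnitude.
    """
    if annual_sales < 0 or added_latency_ms < 0:
        raise ValueError("sales and latency must be non-negative")
    return annual_sales * loss_per_100ms * (added_latency_ms / 100.0)


if __name__ == "__main__":
    # Hypothetical shop with 1,000,000 in annual sales and 250 ms of extra latency.
    print(estimated_sales_loss(annual_sales=1_000_000.0, added_latency_ms=250.0))
```

Even as a rough estimate, this kind of arithmetic helps decide whether a given optimization effort is worth its cost.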
  25. 25. 25 THE WEB GIANTS Native applications With the iPhone, Apple reintroduced applications developed for a specific device (stopping short of assembly language, however) to maximize perceived performance. Thus Java and Flash technologies are banished from the iPhone. The platform also uses visual tricks: when an app is launched, it displays the view as it appeared when the app was last loaded by the system, to strengthen the impression that launching is instantaneous, while the actual app loads in the background. On Android, Java applications are executed on a virtual machine optimized for the platform. They can also be written in C to maximize performance. Generally speaking, there is a consensus around native development, especially on mobile platforms: the application must be as tightly linked as possible to the device. Multi-platform technologies such as Java ME, Flash and Silverlight do not directly enhance the user experience and are therefore put aside. Web applications Fully loading a Web page usually takes between 4 and 10 seconds (including graphics, JavaScript, Flash, etc.). It would seem that perceived slowness in display is generally attributable for 5% to server processing and for 95% to browser processing. Web Giants have therefore taken considerable care to optimize the display of Web pages. By way of illustration, here is a list of the main good practices which most agree optimize user perception: • It is crucial to cache all static resources (graphics, CSS style sheets, JavaScript scripts, Flash animations, etc.) whenever possible. There are various HTTP cache technologies for this. It is important to become skillful at optimizing the life cycle of the resources in the cache. • It is also advisable to use a cache network, or Content Delivery Network (CDN), to bring the resources as close as possible to the end user and reduce network latency. We highly recommend that you have cache servers in the countries where the majority of your users live.
  26. 26. 26 CULTURE / ENHANCING THE USER EXPERIENCE • Downloading in the background is a way of masking sluggishness in the display of various elements on the page. • A common technique is to use sprites: the principle is to aggregate images in a single file to limit the number of requests needed to load them; the browser then selects each image on the fly (see the Gmail example below). • Using multiple domain names is a way to maximize parallelization when the browser loads resources simultaneously. One must bear in mind that browsers limit the number of simultaneous requests to a single domain. For example, ... loads its images from ... • Placing JavaScript resources at the very end of the page ensures that graphics appear as quickly as possible. • Using tools to minify the code (JavaScript, HTML, etc.), i.e. to remove all characters (line breaks, comments, etc.) which serve to read the code but not to execute it, and to shorten function names as much as possible. • Combining the various source code files, such as JavaScript files, into a single file whenever possible. Who makes it work for them? There are many examples of such practices among Web Giants, e.g. Google, Gmail, Viadeo, GitHub, Amazon, Yahoo!... References among Web Giants Google has the most extensive distributed cache network of all Web Giants: the search giant is said to have machines in all major cities, and even a private global network, although corroboration is difficult to come by. Google Search pushes the real-time user experience to the limit with its “Instant Search“, which loads search results as you type your query. This function stems from formidable technical skill and has aroused the interest of much of the architect community.
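The advice in the list of good practices above on caching static resources and managing their life cycle can be sketched as follows. This is one common approach, not a method the source attributes to any Web Giant: static files get a content hash in their name so that they can be cached for a long time, while pages are revalidated on each request. The helper names, the extension list and the one-year lifetime are assumptions made for the example; `Cache-Control`, `max-age`, `immutable` and `no-cache` are standard HTTP caching directives.

```python
import hashlib
from pathlib import PurePosixPath

# Hypothetical policy: which file types count as long-lived static resources.
STATIC_EXTENSIONS = {".css", ".js", ".png", ".jpg", ".gif", ".svg"}
ONE_YEAR_SECONDS = 365 * 24 * 60 * 60


def fingerprinted_name(path: str, content: bytes) -> str:
    """Embed a short content hash in the file name (app.css -> app.<hash>.css).

    A new version of the file gets a new name, so the old cached copy
    never needs to be invalidated.
    """
    p = PurePosixPath(path)
    digest = hashlib.sha256(content).hexdigest()[:8]
    return str(p.with_name(f"{p.stem}.{digest}{p.suffix}"))


def cache_control_for(path: str) -> str:
    """Return a Cache-Control header value for the given resource path."""
    suffix = PurePosixPath(path).suffix.lower()
    if suffix in STATIC_EXTENSIONS:
        # Fingerprinted static files can be cached for a long time.
        return f"public, max-age={ONE_YEAR_SECONDS}, immutable"
    # Pages and other dynamic content: revalidate before reuse.
    return "no-cache"


if __name__ == "__main__":
    print(fingerprinted_name("static/app.css", b"body { margin: 0 }"))
    print(cache_control_for("static/app.css"))
    print(cache_control_for("index.html"))
```

The design choice here is to move cache invalidation into the file name: a CDN or browser can keep the file indefinitely, and a deployment only has to update the references in the page.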
  27. 27. 27 THE WEB GIANTS Gmail images are reduced to a strict minimum (two sprite images, shown in Figure 1), and the site makes intensive use of caching and loads JavaScript in the background. Figure 1: Gmail sprite images. France Sites using or having used the content delivery network Akamai: How can I make it work for me? The consequences of display latency are the same for in-house applications within any IT department: users get fed up with the application and stop using it. In other words, this pattern applies perfectly to your own business. Sources • Eric Daspet, “Performance des applications Web, quoi faire et pourquoi ?“ USI 2011 (French only): > sessions/997-performance-des-applications-web-quoi-faire-et-pourquoi • Articles on Google Instant Search: > become-faster-with-5-7x-more-results.html > scenes.html Editor’s note: By definition, sprites are designed for screen display; we are unable to provide a better rendering of this example in print. Thank you for your understanding.
  28. 28. 28 THE WEB GIANTS Code Crafters
  29. 29. 29 THE WEB GIANTS CULTURE / CODE CRAFTERS Description Today Web Giants are there to remind us that a career as a developer can be just as prestigious as one as a manager or consultant. Indeed, some of the most striking successes of Silicon Valley have originated with one or several visionary geeks who are passionate about quality code. When these companies’ products gain in visibility, satisfying an increasing number of users means staying within the virtuous cycle of development quality, without which success can vanish as quickly as it came. This is why a software development culture is so important to Web Giants. It is based on a few key principles: • attracting and recruiting the best programmers, • investing in developer training and allowing them more independence, • gaining their loyalty through workplace attractiveness and pay scale, • being intransigent as to the quality of software development, because quality is non-negotiable. Implementation The first challenge the Giants face is thus recruiting the best programmers. They have become masters at the art, which is trickier than it might at first appear. One test often used by the majors is to have candidates write code. A test Facebook uses is FizzBuzz. This exercise, inspired by a drinking game which some of you might recognize, consists in displaying the first 1,000 integers, except for multiples of 3 or 5, where “Fizz“ or “Buzz“ respectively must be displayed, and except for multiples of both 3 and 5, where “FizzBuzz“ must be displayed. This little programming exercise weeds out 99.5% of the candidates. Similarly, to be hired by Google, between four and nine technical interviews are necessary.
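For reference, here is a minimal sketch of FizzBuzz as the exercise is usually posed: count upward from 1, printing “Fizz“ for multiples of 3, “Buzz“ for multiples of 5 and “FizzBuzz“ for multiples of both. The upper bound of 1,000 follows the text; the function names are illustrative, and nothing here is claimed to be the exact form of the test used at Facebook.

```python
def fizzbuzz(n: int) -> str:
    """Return the FizzBuzz word for n, or n itself as a string."""
    if n % 15 == 0:  # multiple of both 3 and 5: must be checked first
        return "FizzBuzz"
    if n % 3 == 0:
        return "Fizz"
    if n % 5 == 0:
        return "Buzz"
    return str(n)


def fizzbuzz_lines(limit: int = 1000) -> list[str]:
    """Return the FizzBuzz sequence for the integers 1..limit."""
    return [fizzbuzz(n) for n in range(1, limit + 1)]


if __name__ == "__main__":
    print("\n".join(fizzbuzz_lines(15)))
```

The usual stumbling block is the order of the checks: testing for 3 or 5 before testing for both means “FizzBuzz“ is never printed.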
  30. 30. 30 THE WEB GIANTS Salary is obviously to be taken into account. To have very good developers, you have to be ready to pay the price. At Facebook, Senior Software Engineers are among the best paid employees. Once programmers have joined your firm, the second challenge is to foster their development and fulfillment and to enrich their skills. In such companies, programmers are not considered code laborers to be watched over by a manager but key players. The Google model, which encourages developers to devote 20% of their time to R&D projects, is often cited as an example. This practice can give rise to contributions to open source projects, which provide many benefits to the company (cf. “Open Source Contribution“, p. 41). On its blog, for example, Netflix mentions its numerous open source initiatives, notably on Zookeeper and Cassandra. The benefit to Netflix is twofold: its developers gain recognition outside the company, while at the same time developing the Netflix platform. Another key element in developer loyalty is working conditions. The internet provides ample descriptions of the lengths to which Web Giants are willing to go to provide a pleasant workplace. The conditions are strikingly different from what one finds in most tech companies. But that is not all! Netflix, again, has built a culture which strongly focuses on its employees’ autonomy and responsibility. More recently, Valve, a video game publisher, sparked a buzz among developers when it published its Handbook, which describes a work culture that is highly demanding but also conducive to personal fulfillment. Lastly, 37signals, in its book Getting Real, lays out its very open practices, often the opposite of what one generally finds in such organizations. In addition to the efforts deployed in recruiting and holding on to programmers, there is also a strong culture of code and software quality. 
It is this culture that creates the foundations for moving and adapting quickly, all while managing mammoth technological platforms where performance and robustness are crucial. Web Giants are very close to the Software Craftsmanship[1] movement, which promotes a set of values and practices aiming to guarantee top-quality software and to provide as much value as possible to end-users. Within this movement, Google and GitHub have not hesitated to share their coding guidelines[2] . [1] [2] and
  31. 31. 31 THE WEB GIANTS How can I make it work for me? Recruiting: It is important to implement very solid recruitment processes when hiring your programmers. After a first interview to get a sense of the person you wish to recruit, it is essential to have the person code. You can propose a few technical exercises to assess the candidate’s expertise, but it is even more instructive to have them code as a pair with one of your developers, to see whether the two work well together. You can also ask programmers to show their own code, especially what they are most proud of, or most ashamed of. More than the code itself, discussions around coding will yield a wealth of information on the candidate. Also, did they put their code on GitHub? Do they take part in open source projects? If so, you will have representative samples of the code they can produce. Quality: Offer your developers the context which will allow them to continue producing top-quality software (since that is non-negotiable). Leave them time to write unit tests, to set up the development build you will need for Continuous Deployment (cf. “Continuous Deployment“, p. 105), to work in pairs, to hold design workshops in their business domain, to prototype. The practice known to have the most impact on quality is peer code review. This happens all too rarely in our sector. R&D: Giving your developers the chance to participate in R&D projects in addition to their work is a practice which can be highly profitable. It can generate innovation, contribute to project improvement and, in the case of Open Source, increase your company’s attractiveness for developers. It is also simply a source of motivation for this often neglected group. More and more firms are adopting the principle of Hackathons, popularized by Facebook, which consist in producing working software in one or two days of coding.
  32. 32. 32 THE WEB GIANTS Training: Training can be outsourced, but you can also profit from knowledge sharing among in-house developers, e.g. by organizing group programming workshops, commonly called “Dojos“.[3] Developers can gather for half a day, around a video projector, to share knowledge and learn together about specific technical issues. It is also a way to share developer practices and, within a team, to align on programming standards. Lastly, working on open source projects is also a way of learning about new technologies. Workplace: Where and how you work are important! Allowing independence, promoting openness and transparency, welcoming mistakes and keeping a sustainable pace are all practices that pay off in the long term. Associated patterns Pattern “Pizza Teams“, p. 59. Pattern “DevOps“, p. 71. Pattern “Continuous Deployment“, p. 105. Sources • Company culture at Netflix: > • What every good programmer should know: > programmer • List of all the programmer positions currently open at Facebook: • The highest salary at Facebook? Senior Software Engineer: ranked-2012-5?op=1 [3]
  33. 33. 33 THE WEB GIANTS CULTURE / CODE CRAFTERS • GitHub programming guidelines: • How GitHub grows: • Open source contributions from Netflix: html • The FizzBuzz test: • Getting Real: • The Software Craftsmanship manifesto: • The Google blog on tests: • The Happy Manifesto:
  34. 34. 34 THE WEB GIANTS Open Source Contribution
  35. 35. 35 THE WEB GIANTS Description Why is it that Web Giants such as Facebook, Google and Twitter do so much to develop Open Source? A technological edge is key to conquering the Web, whether it be to stand out from the competition by launching new services (remember when Gmail came out with all its storage space at a time when Hotmail was lording it over webmail?) or, more practically, to overcome inherent constraints such as the growth challenge linked to the expansion of their user base. On numerous occasions, Web Giants have pulled through by inventing new technologies. One would therefore think that their technological mastery, and the asset which is the code, would be carefully shielded from prying eyes. In fact, the widely shared pattern one finds is that Web Giants are not only major consumers of open source technology, they are also its main contributors. The pattern “developing open source“ consists of making public a software tool (library, framework...) developed and used in-house. The code is made available on a public server such as GitHub, under a free license of the Apache type for example, authorizing its use and adaptation by other companies. In this way, the code is potentially open to development by the entire world. Moreover, open source releases are traditionally accompanied by much publicity on the web and at programming conferences. Who makes it work for them? There are many examples. Among the most representative is Facebook and its Cassandra database, built to manage massive quantities of data distributed over several servers. It is interesting to note that among current users of Cassandra, one finds other Web Giants, e.g. Twitter and Digg, whereas Facebook has abandoned Cassandra in favor of another open source storage solution, HBase, launched by the company Powerset. With the NoSQL movement, the new foundations of the Web are today massively based on the technologies of the Giants.
  36. 36. 36 THE WEB GIANTS Facebook has furthermore opened several frameworks up to the community, such as its HipHop engine, which compiles PHP into C++, Thrift, a framework for developing services across languages, and Open Compute, an open hardware initiative which aims to optimize how datacenters function. But Facebook is not alone. Google has done the same with its user interface framework GWT, used notably in AdWords. Another example is the Tesseract Optical Character Recognition (OCR) tool, initially developed by HP and then by Google, which opened it up to the community a few years later. Lastly, one cannot name Google without citing Android, its open source operating system for mobile devices, not to mention its numerous scientific publications on storing and processing massive quantities of data. We are referring more particularly to its papers on Bigtable and MapReduce, which inspired the Hadoop project. The list could go on and on, so we will end with Twitter and its very trendy responsive-design CSS framework, Bootstrap, and with the excellent Ruby on Rails, extracted from the Basecamp project management software and opened up to the community by 37signals. Why does it work? Putting aside ideological considerations, we propose to explore various advantages to be drawn from developing open software. Open and free does not necessarily equate with price and profit wars. In fact, from one angle, opening up software is a way of nipping competition in the bud for specific technologies. Contributing to Open Source is a way of redefining a given technology sector while ensuring sway over the best available solution. For a long time, Google was the main sponsor of the Mozilla Foundation and its flagship project Firefox, to the tune of 80%: a way to diversify in order to counter Microsoft. Let us come back to our analysis of the three advantages.
  37. 37. 37 THE WEB GIANTS Promoting the brand By opening cutting-edge technology up to the community, Web Giants position themselves as leaders and pioneers. It implicitly communicates a spirit of innovation reigning in their halls, a constant quest for improvement. They show themselves as being able to solve big problems, as masters of technological prowess. Delivering a successful Open Source framework says that you solved a common problem faster or better than anyone else, and that, in a way, the problem is now behind you. Done and gone, you’re already moving on to the next. One step ahead of the game. To share a framework is to make a strong statement, to reinforce the brand. It is a way to communicate an implicit and primal message: “We are the best, don’t you worry“. And then, to avoid being seen as the new Big Brother, one can’t help feeling that the message also implied is: “We’re open, we’re good guys, fear not“.[2] Attracting - and keeping - the best This is an essential aspect which can be fostered by an open source approach. Because “displaying your code“ means showing part of your DNA, your way of thinking, of solving problems: show me your code and I will tell you who you are. It is the natural way of publicizing what exactly goes on in your company: the expertise of your programmers, your quality standards, what your teams work on day by day... It is a good means of attracting “compatible“ coders who will already have been following the projects led by your company. Developing Open Source thus helps you to spot the most dedicated, competent and motivated programmers, and when you hire them you are already sure they will fit easily into your ecosystem. In a manner of speaking, Open Source is like a huge trial period, open to all. [2] Google’s motto: “Don’t be evil“
  38. 38. 38 THE WEB GIANTS Attracting the best geeks is one thing, hanging on to them is another. On this point, Open Source can be a great way to offer your company’s best programmers a showcase open to the whole world. That way they can show their brilliance, within their company and beyond. Promoting Open Source bolsters your programmers’ resumes. It takes into account the Personal Branding needs of your staff, while keeping them happy at work. All programmers want to work in a place where programming is important, within an environment which offers a career path for software engineers. Spoken as a programmer. Improving quality Simply “thinking open source“ is already a leap forward in quality: opening up code (a framework) to the community first entails defining its contours, naming it, and describing the framework and its aim. That alone is a significant step towards improving the quality of your software, because it inevitably leads to breaking it up into modules and giving it structure. It also makes it easier to reuse the code in-house. It defines accountability within the code and even within teams. It goes without saying that programmers who are aware that their code will be checked (not to mention read by programmers the world over) will think twice before committing an untested method or a hastily assembled piece of code. Beyond making programmers more responsible, feedback from peers outside the company is always useful. How can I make it work for me? When properly used, Open Source can be an intelligent way not only to structure your R&D but also to assess programmer performance. The goal of this chapter was to explore the various advantages offered by opening up certain technologies. If you are not quite ready to make the jump culturally speaking, or if your information system is not ready yet, it can nonetheless be useful to play with the idea by taking a few simple-to-implement actions. 
Depending on the size of your company, launching your very first Open Source project can unfortunately be met with general indifference. We do not all have the powers of communication of Facebook. Beginning by
  39. 39. 39 THE WEB GIANTS contributing to Open Source projects already underway can be a good initial step for testing the culture within your teams. Another action which works towards the three advantages laid out here, following the example of Google and GitHub, is to write down your programming guidelines and publish them on the web. Another possibility is to encourage your programmers to open a development blog where they can discuss the main issues they have come up against. The Instagram Engineering Tumblr, run by Instagram, can be a very good source of inspiration. Sources • The Facebook developer portal, Open Source projects: • Open-Source Projects Released By Google: • The Twitter developer portal, Open Source projects: • Instagram Engineering Blog: • The rules for writing GitHub code: • A question on Quora: Open Source: “Why would a big company do open-source projects?“: open-source-projects
  40. 40. 40 THE WEB GIANTS Sharing Economy platforms
  41. 41. 42 THE WEB GIANTS CULTURE / SHARING ECONOMY PLATFORMS Description The principles at work in the platforms of the sharing economy (exponential business platforms) are one of the keys to the successes of the web giants and of other startups valued at over $1 billion (“unicorns“), such as BlaBlaCar, Cloudera and Social Finance, or at over $10 billion (“decacorns“), such as Uber, AirBnB, Snapchat and Flipkart (List and valuation of the Uni/Deca-corns). The latter are disrupting existing ecosystems, inventing new ones, wiping out others. And yet “Businesses never die, only business models evolve“ (To learn more, see: Philippe Silberzahn, “Relevez le défi de l’innovation de rupture“). Concerns over the risks of disintermediation are legitimate, given that digital technology has led to the development of numerous highly successful “exponential business platforms“ (see the article by Maurice Levy, “Se faire ubériser“). The article below begins with a recap of what is common to these platforms and then explores the main fundamentals necessary for building or becoming an exponential business platform. The wonderful world of the “Sharing economy“ There is a continuous stream of newcomers knocking at the door, progressively transforming many sectors of the economy and driving them towards a so-called “collaborative“ economy. Among other goals, this approach strives to develop a new type of relation: Consumer-to-Consumer (C2C). This is true e.g. in the world of consumer loans, where the company LendingHome (Presentation of LendingHome) is based on peer-2-peer lending. Another area of interest is blockchain technology, with the decentralisation and “peer-2-peer-isation“ of money through Bitcoin! What is most striking is that this type of relation can have an impact in unexpected places, such as personalised urban car services (e.g. Luxe and Drop Don't Park) and movers (Lugg as an “Uber/Lyft for moving“). Business platforms such as these favor peer-2-peer relations. 
They have achieved exponential growth by leveraging the multitudes (For further information, see: Nicolas Colin and Henri Verdier, L'âge de la multitude: Entreprendre et gouverner après la révolution numérique). Such models make it possible for very small structures to grow very quickly by generating revenues per employee which can be from 100 to 1000 times higher than
  42. 42. 43 THE WEB GIANTS in businesses working in the same sector but which are much larger. The fundamental question is then what has enabled some of them to become hits and to grow their popularity, in terms of both community and revenues. What are the ingredients in the mix, and how does one become successful so rapidly? At this stage, the contextual elements and common ground we discern are: • An often highly regulated market, where these platforms appear and then develop by providing new solutions which break away from regulations (for example the obligation for hotels to make at least 10% of their rooms disability friendly, which does not apply to individuals using the AirBnB system). • An as yet unmet need in supply and demand, which can make it possible to earn a living or to generate additional revenue for a better quality of life (cf. AirBnB's 2015 communication campaign on the subject), or at the least to share costs (BlaBlaCar). This point in particular raises crucial questions as to the very notion of work, its regulation and the taxation of platforms. • Strong friction in the experience of clients and citizens, to which the market has yet to provide a response (such as valet car services in large cities around the world where parking is completely saturated). • A deliberate strategy not to invest in material assets but rather to efficiently embrace the business of creating links between people. Given this understanding of the context, the 5 main principles we propose for becoming an exponential business platform are: • Develop your “network lock-in effect“. • Pair up algorithms with the user experience. • Develop trust. • Think user and be rigorous in execution. • Carefully choose your target when you launch platform experiments.
  43. 43. 44 THE WEB GIANTS “Network lock-in effect“ The more supply and demand grow and come together, the more indispensable your platform becomes. Indispensable because in the end that is where the best offers and the best deals are to be found, and where your friends are. There is an inflection point where the network of suppliers and users becomes the main asset, the central pillar. Attracting new users is no longer the principal preoccupation. This asset makes it possible to become the reference platform for your segment. This growth can provide a monopoly over its use case, especially if exclusive deals can be obtained through offers valid on your platform only. It can then extend to offers which build upon the first (for example Uber's position as an urban mobility platform has led it to diversify into a meal delivery service for restaurants). This is one of the elements which were very quickly theorised in the Lean Startup approach: the virality coefficient. The perfect match: User eXperience & Algorithms What is crucial in the platform is setting up the perfect match between supply and demand: speed in connecting people in time and/or space, lower prices than traditional systems, and even services that weren't possible before. For some, matching algorithms are the core of their operations, delivering on their daily promise of offering suggestions and possibilities for relevant connections within a few microseconds. The perfect match is a fine-tuned mix of stellar research into the user experience (all the way to the swipe!), often using a mobile-first approach to explore and offer services, and advanced algorithms to surface relevant associations. A telling example is the use of the “Swipe“ as a uniquely tailored user experience for fast browsing, as in the personal relationship app “Tinder“.
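The virality coefficient mentioned above under the network lock-in effect can be illustrated with a short calculation. The sketch below uses the common definition (average number of invitations sent per user multiplied by the share of invitations that convert into new users); the function names and the figures in the usage example are hypothetical, and the growth projection is a deliberately simplified model that ignores churn and market saturation.

```python
def viral_coefficient(invites_per_user: float, conversion_rate: float) -> float:
    """K = invitations sent per user x fraction of invitations that convert."""
    return invites_per_user * conversion_rate


def project_users(initial_users: float, k: float, cycles: int) -> list[float]:
    """Project total users over a number of invitation cycles.

    Simplifying assumption: only the users gained in one cycle send
    invitations in the next, and nobody leaves the platform.
    """
    totals = [initial_users]
    new_users = initial_users
    for _ in range(cycles):
        new_users = new_users * k  # users recruited by the previous cohort
        totals.append(totals[-1] + new_users)
    return totals


if __name__ == "__main__":
    k = viral_coefficient(invites_per_user=5, conversion_rate=0.3)
    print(k)
    print(project_users(1000, k, cycles=4))
```

In this model, a coefficient above 1 means each cohort recruits a larger one and growth compounds without further acquisition effort, while below 1 growth tails off.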
  44. 44. 45 THE WEB GIANTS Trust & security To get beyond the early adopters and reach the market majority, two elements are critical to the client experience: trust in the platform, and trust towards the other platform users (both consumers and providers). Who has not experienced stress when reserving their first AirBnB? Who has not wondered whether the Uber would actually be there? This level of trust conveyed by the platform and its users is so important that it has been one of the levers of growth, as with the BlaBlaCar ride-sharing platform, which thrived once transactions were handled by the platform. What happens to the confidential data provided to the platform? You may remember the recent hacking of personal data on the “Ashley Madison“ sites, affecting 37 million platform users who wanted total discretion (Revelations around the hacking of the Ashley Madison sites). Security is thus key to protecting platform transactions, safeguarding private data and reassuring users. Think user & excel in execution Above all, it is about realising that what the market and the clients want is not to be found in marketing plans, sales forecasts and key functionalities. The main questions to ask revolve around the triplet Client / Problem / Solution: Do I really have a problem that is worth solving? Is my solution the right one for my client? Will my client buy it? For how much? Use whatever you can to check your hypotheses: interviews, market studies, prototypes... To succeed, these platforms aim to reach production very quickly, iterating and improving while their competitors are still exploring their business plans. It is then a ferocious race between pioneers and copycats, because in this type of race the “winner takes all“ (For further reading, see The Second Machine Age, Erik Brynjolfsson and Andrew McAfee).
  45. 45. 46 THE WEB GIANTS Then excellence in execution becomes the other pillar. This operational excellence covers: • the platform itself and the users it “hosts“: active users, quality of the goods offered... • quality in rating, with numerous well-assessed offers... • offers which are mediated by the platform (comments, satisfaction surveys...). One may note in particular the example of AirBnB on the theme of excellence in execution beyond software: the quality of the descriptions of the lodgings, as well as beautiful photos, strongly differentiated it from the competition of the time (Craigslist) (A few words on the quality of the photos at AirBnB). Critical market size Critical market size is one of the elements which make it possible to rapidly reach a sufficiently strong network effect (speed in reaching a critical size is fundamental to not being overrun by copycats). Critical market size is made up of two aspects: • selecting the primary territories for deployment, most often cities or mega-cities, • ensuring deployment in other cities in the area, when possible in standardized regulatory contexts. You must therefore choose cities particularly concerned by the value propositions of your platform, where the number of early adopters is high enough to quickly gather feedback. Mega-cities in the Americas, Europe and Asia are therefore choice targets for experimental deployments. Lastly, during the generalisation phase, it is no surprise to see stakeholders deploying massively in the USA (a market of 350 million inhabitants, with standardised tax and regulatory environments, despite state and federal differences) or in China (where the Web giants are among the most impressive players, such as Alibaba, Tencent and Weibo), as well as in Russia.
  46. 46. 47 THE WEB GIANTS In Europe, cities such as Paris, Barcelona, London, Berlin, etc. are often prime choices for businesses. What makes it work for them? As examined above, there are many ingredients for exponentially scalable organizations and business models built on the platform model: strong possibilities for employees to self-organise, the User eXperience, continuous experimentation... algorithms (notably intelligent networking), and leveraging one's community. What about me? For IT and marketing departments, you can begin your thinking by exploring digital innovations (looking for new uses) that fit in with your business culture (based e.g. on Design Thinking). In certain domains, this approach can give you access to new markets or let you disrupt before the competition does. A recent example is that of Accor, which has entered the market of independent hotels through its acquisition of Fastbooking (Accor gets its hands on Fastbooking). Still in the area of self-disruption, two main strategies are coming to the fore. The first consists in coming back into the game, through partnerships or capital investments via incubators, without shouldering all of the risk. The other strategy, more ambitious and therefore riskier, is to take inspiration from these new approaches to transform from within. It is then important to examine whether some of your processes can be opened up and turned into an open platform, thereby leveraging the multitudes. In the distribution sector, for example, the question of positioning and opening up various strategic processes is raised: is it a good idea to turn your supply chain into a peer-2-peer platform so that SMEs can become consumers and not only providers in the supply chain? Are pharmacies next on the list of programmed uberisations, through stakeholders such as ...? In the medical domain, Doctolib has just raised €18 million to fund its development (Doctolib raises funds)...
  47. 47. Associated patterns
• Enhancing the user experience
• A/B Testing
• Feature Flipping
• Lean Startup
Sources
• List of unicorns:
• Philippe Silberzahn, “Relevez le défi de l’innovation de rupture”, Pearson
• Article by Maurice Levy, “Tout le monde a peur de se faire ubériser”: le-monde-a-peur-de-se-faire-uberiser-maurice-levy.html
• LendingHome presented by “C’est pas mon idée”: lassaut-du-credit.html
• Nicolas Colin and Henri Verdier, “L’âge de la multitude”, 2nd edition
• Ashley Madison hacking: extraconjugales-hack-adultere
• Erik Brynjolfsson, “Second âge de la machine”
• Quality of AirBnB photos:
• Accor met la main sur Fastbooking: met-la-main-sur-fastbooking.htm
• Doctolib raises 18M€: millions-d-euros-39826390.htm
  48. 48. Organization
  49. 49. Pizza Teams
Feature Teams
DevOps
  50. 50. Pizza Teams
  52. 52. Description
What is the right size for a team to develop great software? Organizational studies have been investigating the question of team size for years. Although the answers differ and seem to depend on criteria such as the nature of the tasks to be carried out, the average skill level and team diversity, there is consensus on a size of between 5 and 15 members.[1][5] Any fewer than 5 and the team is vulnerable to outside events and lacks creativity. Any more than 12 and communication becomes less efficient, coherence is lost, free-riding and power struggles increase, and the team’s performance drops rapidly with each added member.
This is obviously also true in IT. The firm Quantitative Software Management (QSM), which specializes in preserving and analyzing metrics from IT projects, has published some interesting statistics. If you like numbers, I highly recommend their Web site: it is chock-full of information! Based on a sample of 491 projects, QSM measured a loss of productivity and heightened variability as team size increases, with a quite clear break once one reaches 7 people. Correspondingly, average project duration increases and development effort skyrockets once one goes beyond 15.[6]
In a nutshell: if you want speed and quality, cut your team size!
Why mention such matters in a work devoted to Web Giants? Very simply because they are particularly aware of the importance of team size for project success, and routinely use techniques to keep size down.
[1] [2] [3] [4] [5] [6]
  53. 53. In fact the title of this chapter is inspired by the name Amazon gave to this practice:[7] if your team can’t be fed on two pizzas, then cut people. Admittedly these are American-size pizzas, but that still means about 8 people. Werner Vogels (Amazon VP and CTO) drove the point home with the following quote, which could almost be by Nietzsche:
Small teams are holy.
But Amazon is far from alone. To illustrate the importance that team dynamics have for Web Giants: Google hired Evan Wittenberg as manager of Global Leadership Development; the former academic was known, in part, for his work on team size. The same discipline is applied at Yahoo!, which limits its product teams to between 5 and 10 people in their first year. As for Viadeo, they have adopted the French pizza size approach, with teams of 5-6 people. Among startups, Instagram, Dropbox, Evernote... are known for having kept their development teams as small as possible for as long as possible.
How can I make it work for me?
A small, agile team will always be more efficient than a big lazy team; such is the conclusion that can be drawn from the accumulated literature on team size. In the end, you only need to remember it to apply it... and to steer clear of linear logic such as: “to go twice as fast, all you need is double the people!” Nothing could be more wrong! According to these studies, a team exceeding 15 people should set alarm bells ringing.[8][10]
[7] [8] [9] [10] human-nature
  54. 54. You then have two options:
• fight tooth and nail to prevent the team from growing and, if that fails, adopt the second solution;
• split the team up into smaller teams. But think very carefully before you do so, and bear in mind that a team is a group of people motivated around a common goal.
That is the subject of the following chapter, “Feature Teams”.
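One common way to see why the “double the people” logic fails is to count the pairwise communication paths inside a team, which grow as n(n-1)/2. This model is an assumption brought in here for illustration; it is not taken from the studies cited in this chapter, and the team sizes below are arbitrary.

```python
def communication_paths(team_size: int) -> int:
    """Number of distinct pairs of people in a team of the given size."""
    if team_size < 0:
        raise ValueError("team_size must be non-negative")
    return team_size * (team_size - 1) // 2


if __name__ == "__main__":
    # Illustrative sizes only: a small team, a two-pizza team, and larger ones.
    for size in (5, 8, 15, 30):
        print(f"{size:>2} people -> {communication_paths(size):>3} pairwise paths")
```

Under this simple model, doubling a team from 8 to 16 people more than quadruples the number of paths to keep alive, which is one plausible reading of the drop in efficiency described above.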
  55. 55. Feature Teams
  56. 56. Description
In the preceding chapter, we saw that Web Giants pay careful attention to the size of their teams. That is not all they pay attention to, however: they also often organize their teams around functionalities, as “feature teams”. A small and versatile team is a key to moving swiftly, and most Web Giants resist, as much as possible, multiplying the number of teams devoted to a single product. However, when a product is a hit, a dozen people no longer suffice to scale up. Even then, team size must remain small to ensure coherence, so it is the number of teams that must increase. This raises the question of how to delimit the perimeter of each. There are two main options:[1]
• segmenting into “technological” layers;
• segmenting by “functionality thread”.
By “functionality thread” we mean being in a position to deliver independent functionalities from beginning to end, providing a service to the end user. In contrast, one can divide teams along technological layers, with one team per type of technology: typically the presentation layer, the business layer, horizontal foundations, the database... This is generally the structure adopted in IT departments, each group working within its own specialty. However, whenever Time To Market becomes crucial, organization into technological layers, also known as Component Teams, begins to show its limitations. This is because Time To Market pressure often calls for Agile or Lean approaches, which means specification, development and release to production in the shortest possible cycles, if not on the fly.
[1] There are in truth other possible groupings, e.g. by release, geographic area, user segment or product family. They are beyond the scope of this work; some of these options are dead ends, others can be assimilated to functionality thread divisions.
  57. 57. The trouble with Component Teams is that you often find yourself with bottlenecks. Let us take the example laid out in Figure 1.
Figure 1. Functionalities 1, 2, 4 and 5 spread across component teams: Front, Back, Exchange and Base.
The red arrows indicate the first problem. The most important functionalities (functionality 1) are swamping the Front team. The other teams are left producing marginal elements for these functionalities, but nothing can be released until Team 1 has finished. There is not much the other teams can do to help (they do not share Team 1's specialty), so they are left twiddling their thumbs or stockpiling less important functionalities (and don’t forget that in Lean, stocks are bad...).
There’s worse. Functionality 4 needs all four teams to work together. The trouble is that, in Agile mode, each team carries out its detailed analysis individually, whereas what is needed here is a detailed impact analysis across the 4 teams. The detailed analysis therefore has to take place upstream, which is precisely what Agile strives to avoid. Similarly, downstream, the work of the 4 teams has to be synchronized for testing, which means waiting for the laggards. To limit the impact, task priorities have to be defined centrally for each team. And little by little, you find yourself with a scheduling department striving to synchronize all the work as best it can, leaving no room for team autonomy.
  58. 58. In short, you have a waterfall effect upstream, in analysis and planning, and a waterfall effect downstream, in testing and deploying to production. This type of dynamic is very well described in the work of Craig Larman and Bas Vodde, Scaling Lean and Agile.
Feature teams can correct these flaws: with each team working on a coherent functional subset, regardless of the technologies involved, they are capable of delivering value to the end client at any moment, with little need to call on other teams. This entails having all the skills needed to produce functionalities in each team, which can mean (among others) an architect, an interface specialist, a Web developer, a Java developer, a database expert and, yes, even someone to run it... because when taken to the extreme, you end up with the DevOps “you build it, you run it”, as described in the next chapter (cf. “DevOps”).
But then how do you ensure the technological coherence of the product, if each Java expert in each feature team takes decisions within their own perimeter? This issue is addressed by the principle of communities of practice. Peers from each specialty get together at regular intervals to compare practices and agree on technological strategies for the product being built.
Feature Teams have the added advantage that teams quickly build up business knowledge, which in turn fosters the developers' involvement in the quality of the final product.
In practice the method is of course messier than what we’ve laid out here: defining perimeters is no easy task, team dynamics can be complicated, communities of practice must be fostered... Despite the challenges, this way of organizing brings real benefits compared with hierarchical structures, and is much more effective and agile.
To come back to our Web Giants, this is the type of organization they tend to favor.
Facebook in particular, which communicates a great deal about its culture, focuses on teams that bring together all the talents needed to create a functionality.[2]
[2] article/0,28804,2036683_2037109_2037111,00.html
  59. 59. It is also the type of structure that Viadeo, Yahoo! and Microsoft[3] have chosen to develop their products.
How can I make it work for me?
Web Giants are not alone in applying the principles of Feature Teams. It is an approach also often adopted by software publishers. Moreover, Agile is spreading throughout our IT departments and is starting to be applied to bigger and bigger projects. Once your project reaches a certain size (3-4 teams), Feature Teams are the most effective answer, to the point where some IT departments naturally turn to that type of pattern.[4]
[3] Michael A. Cusumano and Richard W. Selby. 1997. How Microsoft builds software. Commun. ACM 40, 6 (June 1997), 53-61.
[4] retour-dexperience-lagilite-a-grande-echelle (French only).
  60. 60. DevOps
  62. 62. Description
The “DevOps” method is a call to rethink the divisions common in our organizations, which separate development, i.e. those who write application code (“Devs”), from operations, i.e. those who deploy and run the applications (“Ops”). Such thinking is certainly as old as CIOs, but it has found renewed life thanks notably to two groups. First, there are the agilists, who have minimized constraints on the development side and are now capable of delivering high-value software to their clients much more frequently. Then there are the experts in running production, namely the Web Giants (Amazon, Facebook, LinkedIn...), who have shared their experience of how they managed the Dev vs. Ops divide.
Beyond the intellectual beauty of the exercise, DevOps is mainly (if not entirely) geared towards reducing Time To Market (TTM). Obviously there are other positive effects, but the main priority, all things considered, is TTM (hardly surprising in the Web industry).
Dev and Ops: differing local concerns but a common goal
Organizational divides notwithstanding, the preoccupations of Development and Operations are indeed distinct and equally laudable:
Figure 1. The DevOps “wall of confusion”: different local targets and different cultures.
• Dev: seeking to innovate; local target: deliver new functionalities (of quality); product culture (software).
• Ops: seeking to rationalize; local target: guarantee that the application runs (stability); service culture (archiving, supervision, etc.).
  63. 63. Software development seeks heightened responsiveness (under pressure notably from the business and the market): developers have to move fast, add new functionalities, reorient work, refactor, upgrade frameworks, test deployment across all environments... The very nature of software is to be flexible and adaptable.
In contrast, Operations need stability and standardization. Stability, because it is often difficult to anticipate the impact of a given modification to the code, architecture or infrastructure: moving storage from a local disk to a server can affect response times, and a change in code can heavily affect CPU activity, which complicates capacity planning. Standardization, because Operations seek to ensure that certain rules (equipment configuration, software versions, network security, log file configuration...) are uniformly followed, to guarantee the quality of service of the infrastructure.
And yet both groups, Devs and Ops, have a shared objective: to make the system work for the client.
DevOps: capitalizing on Agility
Agility became a buzzword a little over ten years ago, its main objective being to reduce constraints in development processes. The Agile method introduced the notions of “short cycle”, “user feedback” and “Product Owner”, i.e. a person in charge of managing the roadmap, setting priorities, etc. Agility also shook up traditional management structures by introducing cross-silo teams (developers and operators), and played havoc with administrative departments.
Today, where those barriers have been removed, software development is most often carried out in one- to two-week iterations. The business sees the software evolve during the construction phase. It is now time to bring people from operations into the following phases:
  64. 64. • Provisioning / spinning up environments: in most firms, deploying to an environment can take between one and four months (even though environments are now virtualized). This is surprisingly long, especially when the challengers are Amazon or Google.
• Deployment: this is without doubt the phase where problems come to a head, as it creates the most instability; agile teams sometimes restrict themselves to one deployment per quarter to limit the impact on production. In order to guarantee system stability, these deployments are often carried out manually; they are therefore lengthy and can introduce errors. In short, they are risky.
• Incident resolution and meeting non-functional needs: Production is the other user of the software. Diagnosis must be fast, the problems and resilience stakes must be explained, and robustness must be taken into account.
DevOps is organized around 3 pillars: Infrastructure as Code (IaC), Continuous Delivery, and a culture of cooperation.
1. “Infrastructure as Code”, or how to reduce provisioning and environment deployment delays
One of the most visible friction points is the lack of collaboration between Dev and Ops in deployment phases. Furthermore, this is the activity that consumes the most resources: half of production time is taken up by deployment and incident management.
Figure 2. Source: study by Deepak Patil (Microsoft Global Foundation Services) in 2006, via a presentation modified by James Hamilton (Amazon Web Services), jrh/TalksAndPapers/JamesHamilton_POA20090226.pdf
  65. 65. And although it is difficult to establish general rules, it is highly likely that part of this cost (the 31% segment) could be reduced by automating deployment. There are many reliable tools available today to automate provisioning and deployment to new environments, ranging from setting up Virtual Machines to software deployment and system configuration.
Figure 3. Classification of the main tools (October 2012)
• Bootstrapping (VM instantiation / OS installation: installation of the operating system): OpenStack, VMWare vCloud, OpenNebula
• System configuration (deploy and install the services required for application execution: JVM, application servers...; configure these services: logs, ports, rights, etc.): Chef, Puppet, CFEngine
• Command and control / application service orchestration (deploy application code to services: war, PHP source, Ruby...; RDBMS deployment (figure...)): Capistrano, custom scripts (shell, Python...)
• CMDB: must reflect both the target configuration and the real-world system configuration
These tools (each in its own language) can be used to code infrastructure: to install and deploy an HTTP service for server applications, to create repositories for the log files... The range of services and associated gains is wide:
• Replicable and reliable processes (no user interaction, thus removing a source of errors), notably through their capacity to manage versions and roll back operations.
• Productivity: one-click deployment rather than a set of manual tasks, thus saving time.
• Traceability, to quickly understand and explain any failures.
  66. 66. • Reduced Time To Recovery: in a worst-case scenario, the infrastructure can be recreated from scratch, which is highly useful for recovery. In keeping with ideas stemming from Recovery Oriented Architecture, resilience can be addressed either by trying to prevent systems from failing, i.e. working on the MTBF (Mean Time Between Failures), or by accelerating repairs, i.e. working on the MTTR (Mean Time To Recovery). The second approach, although not always possible to implement, is the least costly.
Automation is also useful in organizations where many environments are necessary. In such organizations, the numerous environments are mostly kept available yet little used, because configuration takes too long.
Automation is furthermore a way of initiating a change in the collaboration culture between Dev and Ops, because it increases the possibilities for self-service for Dev teams, at the very least on the pre-production environments.
2. Continuous Delivery
Traditionally, in our organizations, the split between Dev and Ops comes to a head during deployment phases, when development hands off, or offloads, its code, which then continues on its long way through the production process. The following quote from Mary and Tom Poppendieck[1] puts the problem in a nutshell:
How long would it take your organization to deploy a change that involves just one single line of code?
The answer is of course not obvious, but in the end it is here that objectives diverge the most. Development seeks control over part of the infrastructure, for rapid deployment, on demand, to all environments. In contrast, production must see to making environments available, rationalizing costs and allocating resources (bandwidth, CPU...).
[1] Mary and Tom Poppendieck, Implementing Lean Software Development: From Concept to Cash, Addison-Wesley, 2006.
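The MTBF/MTTR trade-off raised under Infrastructure as Code can be made concrete with the standard steady-state availability formula, availability = MTBF / (MTBF + MTTR). The sketch below is only an illustration: the figures (500 hours between failures, 4 hours to recover) are hypothetical and do not come from the source.

```python
def availability(mtbf_hours: float, mttr_hours: float) -> float:
    """Steady-state availability: share of time the system is up."""
    if mtbf_hours <= 0 or mttr_hours < 0:
        raise ValueError("MTBF must be positive and MTTR non-negative")
    return mtbf_hours / (mtbf_hours + mttr_hours)


if __name__ == "__main__":
    # Hypothetical baseline, chosen only for illustration.
    baseline = availability(mtbf_hours=500, mttr_hours=4)
    # Option 1: fail half as often (work on the MTBF).
    fewer_failures = availability(mtbf_hours=1000, mttr_hours=4)
    # Option 2: recover twice as fast (work on the MTTR).
    faster_recovery = availability(mtbf_hours=500, mttr_hours=2)
    print(f"baseline:        {baseline:.4%}")
    print(f"fewer failures:  {fewer_failures:.4%}")
    print(f"faster recovery: {faster_recovery:.4%}")
```

In this model, halving the MTTR yields the same availability as doubling the MTBF, so the choice between the two comes down to cost, which is where the text argues that faster recovery wins.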
  67. 67. Ironically, the less often one deploys, the more the TTR (Time To Repair) increases, which reduces the quality of service to the end client.
Figure 4. Change size versus change frequency: huge changesets deployed rarely (high TTR) as opposed to tiny changesets deployed often (low TTR). Source: change-4608108
In other words, the more changes there are between releases (i.e. the higher the number of changes to the code), the lower the capacity to rapidly fix bugs following deployment, which increases TTR. This is the instability that Ops dread. Here again, addressing such waste can reduce the time taken up by incident management, as shown in Figure 2.
Figure 5. Size of deploy vs. incident TTR: units of changed code (lines changed per deploy) against TTR in minutes, for Sev 1 and Sev 2 incidents. Source: change-4608108
  68. 68. To finish, Figure 5, taken from a Flickr study, shows the correlation between TTR (and therefore the seriousness of incidents) and the amount of code deployed (and therefore the number of changes to the code).
However, continuous deployment is not easy and requires:
• Automation of the deployment and provisioning processes: Infrastructure as Code.
• Automation of the software build and deployment processes. Build automation becomes the production line that carries the software from source control to the various environments where it will be deployed. A new build system is thus necessary, including environment management, workflow management for compiling source code into binaries more quickly, generation of documentation and release notes to swiftly understand and fix any failures, the capacity to distribute testing across agents to reduce delays, and a constant guarantee of short cycle times.
• Taking these factors into account at the architecture level, and above all respecting the following principle: decouple functionality deployment from code deployment, using patterns such as Feature Flipping (cf. “Feature Flipping”), dark launch... This of course entails a new level of complexity, but offers the flexibility needed for this type of continuous deployment.
• A culture of measurement with user-oriented metrics. This is not only about measuring CPU consumption; it is also about correlating business and application metrics to understand and anticipate system behavior.
3. A culture of collaboration, if not an organizational model
These two practices, Infrastructure as Code and Continuous Delivery, can be implemented in traditional organizations (with Infrastructure as Code at Ops and Continuous Delivery at Dev). However, once development and production have each reached their local optimum and a good level of maturity, they will always be hampered by the organizational division.
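The principle stated above, decoupling functionality deployment from code deployment, can be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation used by any of the companies mentioned: the `FeatureFlags` class, the `new_checkout` flag name and the hash-based bucketing scheme are all hypothetical.

```python
import hashlib


class FeatureFlags:
    """Toy in-memory flag store: feature name -> percentage of users enabled."""

    def __init__(self, rollout_percent: dict[str, int]) -> None:
        self._rollout = dict(rollout_percent)

    def is_enabled(self, feature: str, user_id: str) -> bool:
        percent = self._rollout.get(feature, 0)  # unknown features stay off
        # Hash feature and user together so that each user lands in a
        # stable bucket between 0 and 99 for a given feature.
        digest = hashlib.sha256(f"{feature}:{user_id}".encode("utf-8")).digest()
        bucket = int.from_bytes(digest[:4], "big") % 100
        return bucket < percent


def checkout(user_id: str, flags: FeatureFlags) -> str:
    # Both code paths are deployed; the flag decides which one is exposed.
    if flags.is_enabled("new_checkout", user_id):
        return "new checkout"
    return "legacy checkout"


if __name__ == "__main__":
    flags = FeatureFlags({"new_checkout": 10})  # expose to roughly 10% of users
    exposed = sum(checkout(f"user-{i}", flags) == "new checkout" for i in range(1000))
    print(f"{exposed} of 1000 users see the new checkout")
```

Because the new code ships switched off, a deployment and a functional release become two separate decisions: raising the percentage exposes the functionality progressively, and setting it back to 0 withdraws it without redeploying.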
  69. 69. This is where the third pillar comes into its own: a culture of collaboration, nay cooperation, in which all teams become more independent rather than throwing problems at each other along the production process. This can mean, for example, giving Dev access to machine logs, providing them with the previous day's production data so that they can set up the integration environments themselves, or opening up the metrics and monitoring tools (or even displaying the metrics in open spaces)... All of this brings that much more flexibility to Dev and shares responsibility and information on “what happens in Prod”; these are in fact just so many low-value tasks that Ops would no longer have to shoulder.
The main cultural elements around DevOps could be summarized as follows:
• Sharing both technical metrics (response times, number of backups...) and business metrics (changes in generated profits...).
• Ops is also a client of the software. This can mean making changes to the software architecture and to developments in order to integrate monitoring tools more easily, to have relevant and useful log files, and to help diagnosis (and reduce the TTD, Time To Diagnose). To go further, certain Ops needs should be expressed as user stories in the backlog.
• A lean approach, with post-mortems that focus on root causes (the 5 whys) and on implementing countermeasures.
It remains, however, that in this model the existing zones of responsibility (especially development, software monitoring, datacenter use and support) are somewhat modified. Traditional firms give the project team priority. In that traditional model, deployment processes, software monitoring and datacenter management are spread out across several organizations.
  70. 70. Figure 6. Project teams. Diagram labels: business; software production flow (build), with project teams; production (run), with application management, technical management and service desk; monitoring; users. (Source: Cutter IT Journal, Vol. 24, No. 8, August 2011, modified)
Conversely, some stakeholders (especially Amazon) have taken this model very far by setting up multidisciplinary teams in charge of ensuring that the service works, from the client’s perspective (cf. “Feature Teams”). You build it, you run it. In other words, each team is responsible for its business, from Dev to Ops.
Figure 7. Product team – You build it, you run it. Diagram labels: software production flow; products/services (build and run); production, with service desk and infrastructure; users. (Source: Cutter IT Journal, Vol. 24, No. 8, August 2011, modified)
  71. 71. Moreover, it is within this type of organization that the notion of self-service takes on a different and fundamental meaning. One then sees one team managing the software and its use, and another team in charge of the datacenters. The dividing line is farther “upstream” than is usual, which allows scaling up and ensures a balance between agility and cost rationalization (e.g. linked to the datacenter architecture). The AWS Cloud is probably the result of this... It is something else altogether, but imagine an organization with product teams and production teams who would jointly offer services (in the ITIL sense) such as AWS or Google App Engine...
Conclusion
DevOps is thus nothing more than a set of practices to leverage improvements around:
• Tools to industrialize the infrastructure and reassure production as to how the infrastructure is used by development. Self-service is a concept hardwired into the Cloud. Public Cloud offers are mature on the subject, but some offers (for example VMWare) aim to reproduce the same methods internally. Even without reaching such levels of maturity, one can imagine using tools like Puppet, Chef or CFEngine...
• Architecture which makes it possible to decouple deployment cycles, to deploy code without deploying all functionalities... (cf. “Feature Flipping” and “Continuous Deployment”).
• Organizational methods, leading to the implementation of Amazon’s “Pizza Teams” pattern (cf. “Pizza Teams”) and of You build it, you run it.
• Processes and methodologies to make all these exchanges more fluid. How to deploy more often? How to limit risks by deploying progressively? How to apply the “flow” lessons from Kanban to production? How to rethink the communication and coordination mechanisms at work along the development/operations divide?
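The first strand, tooling that describes infrastructure as code, rests on operations that can be replayed safely: each step declares a desired state and only acts when the real state differs. The sketch below illustrates that idea in plain Python under stated assumptions; it does not use or imitate the API of Puppet, Chef or CFEngine, and the paths, the `myapp` name and the configuration content are hypothetical.

```python
import tempfile
from pathlib import Path


def ensure_directory(path: Path) -> bool:
    """Create the directory if it is missing; return True if a change was made."""
    if path.is_dir():
        return False
    path.mkdir(parents=True)
    return True


def ensure_file(path: Path, content: str) -> bool:
    """Write the file only if it is missing or differs; return True on change."""
    if path.is_file() and path.read_text() == content:
        return False
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(content)
    return True


def apply(root: Path) -> list[str]:
    """Converge `root` towards the desired state and report what changed."""
    changes = []
    # Hypothetical desired state: a log directory and an HTTP service config.
    if ensure_directory(root / "var" / "log" / "myapp"):
        changes.append("created log directory")
    if ensure_file(root / "etc" / "myapp" / "http.conf",
                   "port=8080\nlog_dir=/var/log/myapp\n"):
        changes.append("wrote http.conf")
    return changes


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        print("first run: ", apply(root))   # both resources are created
        print("second run:", apply(root))   # nothing left to change
```

Running the same description twice changes nothing the second time, which is what makes such a description replicable, traceable through its list of changes, and usable to rebuild an environment from scratch.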
  72. 72. In sum, these four strands make it possible to reach the DevOps goals: improve collaboration, trust and objective alignment between development and operations, giving priority to addressing the stickiest issues, summarized in Figure 8.
Figure 8. The three pillars (Infrastructure as Code, Continuous Delivery, culture of collaboration) and the associated benefits: faster provisioning, increased deployment reliability, faster incident resolution (MTTR), improved TTM, improved quality of service, operational efficiency, continuous improvement.
  73. 73. Sources
• White paper on the DevOps Revolution:
• Wikipedia article:
• Flickr presentation at the Velocity 2009 conference:
• Definition of DevOps by Damon Edwards:
• Article by John Allspaw on DevOps: just-happen-with-deployment/
• Article on the share of deployment activities in Operations: focus-on-deployment.html
• USI 2009 (French only): quelques-idees-issues-des-grands-du-web-pour-remettre-en-cause-vos-reflexes-d-architectes#webcast_autoplay
  74. 74. Practices
  75. 75. Lean Startup
Minimum Viable Product
Continuous Deployment
Feature Flipping
A/B Testing
Design Thinking
Device Agnostic
Perpetual beta
  76. 76. Lean Startup
  77. 77. Description
Creating a product is a very perilous undertaking. Figures show that 95% of all products and startups perish for want of clients. Lean Startup is an approach to product creation designed to reduce risks and the impact of failures by tackling organizational, business and technical aspects in parallel, through aggressive iterations. It was formalized by Eric Ries, and was strongly inspired by Steve Blank’s Customer Development.
Build – Measure – Learn
All products and functionalities start with a hypothesis. The hypothesis can stem from data collected on the ground or from simple intuition. Whatever its origin, the Lean Startup approach aims to:
• consider all ideas as hypotheses, whether they concern marketing or functionalities;
• validate all hypotheses as quickly as possible on the ground.
This last point is at the core of the Lean Startup approach. Each hypothesis, whether it comes from business, systems administration or development, must be validated, qualitatively as well as with metrics. Such an approach makes it possible to implement a learning loop for both the product and the client. Lean Startup rejects the approach that consists of developing a product for over a year, only to discover that the choices made (in marketing, functionalities, sales) threaten the entire organization. Testing is of the essence.
Figure 1. The Build – Measure – Learn loop: ideas, build, product, measure, data, learn.
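The idea of treating every idea as a hypothesis to be validated with metrics can be sketched as a tiny Build – Measure – Learn step. This is an illustration under stated assumptions, not a method prescribed by the source: the `Hypothesis` class, the conversion-rate metric, the 10% threshold and the visitor counts are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Hypothesis:
    """An idea stated so that a measurement can confirm or refute it."""
    statement: str
    min_conversion_rate: float  # success threshold fixed before measuring


def measure(visitors: int, conversions: int) -> float:
    """Turn raw observations from the field into the metric under test."""
    if visitors <= 0:
        raise ValueError("at least one visitor is needed to measure anything")
    return conversions / visitors


def learn(hypothesis: Hypothesis, visitors: int, conversions: int) -> str:
    """Compare the measured metric with the threshold set in advance."""
    rate = measure(visitors, conversions)
    return "validated" if rate >= hypothesis.min_conversion_rate else "invalidated"


if __name__ == "__main__":
    # Hypothetical example: a sign-up page built to test a pricing idea.
    idea = Hypothesis("Visitors will sign up for the paid plan", 0.10)
    print(idea.statement, "->", learn(idea, visitors=200, conversions=30))
    print(idea.statement, "->", learn(idea, visitors=200, conversions=10))
```

Fixing the threshold before collecting data is the design choice that matters here: it keeps the learning step honest, whatever metric a given team actually chooses.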