Mini-course "Practices of the Web Giants" at Global Code - São Paulo

Mini-course "Practices of the Web Giants"
Global Code - São Paulo
24/01/2014

Transcript

  • 1. WEB GIANTS: Innovations, Practices, Culture. Mathieu DESPRIEE, mde@octo.com. © OCTO 2014
  • 2.
  • 3. Soon in English!
  • 4. Digitalization. Yesterday, the Internet was a tool. Today, digital technologies are changing everything: the way we communicate, work, learn, do business… the way we live.
  • 5. http://postscapes.com/internet-of-things-examples/
  • 6.
      • All this data will end up in the IT system of some company, and they will make money from it: "Big data is the new oil".
      • It's not only about data: there will be new usages, new services… new competitors.
      • Sooner or later, every company will face the problems the web giants had to face.
  • 7.
  • 8. BIGGER, FASTER, BETTER
  • 9. BIGGER
  • 10.
  • 11. High-end machine / mainframe: highly redundant hardware; symmetric multi-processing; lots of CPU, RAM, disk.
  • 12. "Commodity hardware": x86 machines; pizza box with few CPUs and disks; no hardware redundancy.
  • 13. They measured everything:
      • Power efficiency of all hardware parts
      • Performance-to-power ratio, $ per transaction, etc.
      • Cost models of failures
    For them, commodity hardware is 3 to 12 times cheaper. They started to design datacenters based only on commodity hardware, and applications distributed over thousands of unreliable machines.
  • 14. Small is beautiful, but…
      • Having to deploy on many machines changes everything: you need to automate things.
      • Web giants are the champions of infrastructure automation; that's why they became champions of the cloud.
      • Application resilience must be completely redefined, since the hardware is not reliable and constantly fails.
      • Resilience must be handled by software, especially for databases.
  • 15. SHARDING, NoSQL
  • 16. NoSQL: "Not Only SQL", to go beyond RDBMS limitations
      • Google: BigTable
      • Amazon: DynamoDB
      • Facebook: Cassandra, sharded key-value MySQL
      • LinkedIn: Voldemort
      • etc.
  • 17. The need for speed… and availability
      • Amazon: 100 ms of latency degradation = -1% of revenues
      • Google: more than 500 ms in page load = -20% of page views
      • Yahoo: more than 400 ms in page load = +5 to 9% of bounce
      • Bing: more than 1 s in page load = -2.8% in ad revenues
      • Amazon: 1 min of unavailability = 50 K$ of revenue loss
    (The blink of an eye is 300 ms)
  • 18. New storage architectures and the CAP theorem: can pick only two!
      • "Consistency": all users have the same version of information. This is the RDBMS universe.
      • "Availability": users can access the system (read or write). A is also related to response time: the more you look for consistency, the worse the latency will be.
      • "Partition tolerance": the system continues to work in case of network partition, i.e. when different nodes cannot communicate.
    Large websites use "eventually consistent" datastores (NoSQL).
  • 19. NoSQL: a radically different approach to databases
      • Distributed storage, tolerating failure by replicating data
      • Consistency constraint is relaxed: eventual consistency
      • Focus is put on availability and low response times (low latency)
      • Linear horizontal scalability
      • Variety of data models: key/value, column-oriented, graph
  • 20. Different sharding approaches
      • Google: BigTable, with the distributed file system GFS
      • Amazon: famous paper about Dynamo, a key/value store organised in a replication ring with consistent hashing, and an original approach to eventual consistency
      • Facebook: Cassandra, inspired by both BigTable and Dynamo; also a specific design of a sharded MySQL used as a key/value store
      • …
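To make the "ring of replication with consistent hashing" on slide 20 more concrete, here is a minimal single-process sketch. It is not Dynamo's or Cassandra's actual implementation: the class name `HashRing`, the use of MD5, the 64 virtual nodes per machine and the replication factor of 3 are all assumptions chosen for illustration.

```python
import bisect
import hashlib


def _hash(key: str) -> int:
    """Map a string to a position on the ring (a large integer)."""
    return int(hashlib.md5(key.encode("utf-8")).hexdigest(), 16)


class HashRing:
    """Toy consistent-hash ring with virtual nodes and N-way replication."""

    def __init__(self, nodes, vnodes=64, replicas=3):
        self.replicas = replicas
        self._ring = []  # sorted list of (position, node) pairs
        for node in nodes:
            self.add_node(node, vnodes)

    def add_node(self, node, vnodes=64):
        # Each physical node owns several positions to spread the load.
        for i in range(vnodes):
            bisect.insort(self._ring, (_hash(f"{node}#{i}"), node))

    def remove_node(self, node):
        self._ring = [(pos, n) for pos, n in self._ring if n != node]

    def preference_list(self, key):
        """Walk clockwise from the key's position, collecting distinct nodes."""
        if not self._ring:
            return []
        distinct = len({n for _, n in self._ring})
        wanted = min(self.replicas, distinct)
        i = bisect.bisect(self._ring, (_hash(key), ""))
        result = []
        while len(result) < wanted:
            node = self._ring[i % len(self._ring)][1]
            if node not in result:
                result.append(node)
            i += 1
        return result
```

For example, `HashRing(["a", "b", "c", "d"]).preference_list("user:42")` returns three distinct machines that would hold copies of that key. Removing a machine that is not in that list leaves the list unchanged, which is the property that makes adding or losing nodes cheap compared with `hash(key) % number_of_nodes`.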
  • 21. BigData, Hadoop
  • 22. Exponential growth of capacities: CPU, memory, network bandwidth, storage… all of them have followed Moore's law. Source: http://strata.oreilly.com/2011/08/building-data-startups.html
  • 23. [Chart, 1990 to 2010, disk throughput in MB/s: IBM DTTA 35010 at 0.7 MB/s, Seagate Barracuda ATA IV, Seagate Barracuda 7200.10 at 64 MB/s.] Storage capacity: ×100,000. Throughput: ×91. We can store 100,000 times more data, but it takes 1,000 times longer to read it!
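The arithmetic behind slide 23 can be checked in a few lines. The figures (0.7 MB/s, 64 MB/s, ×100,000 capacity) come from the slide; the function name is an assumption made for this sketch.

```python
def full_read_slowdown(capacity_growth, old_mb_per_s, new_mb_per_s):
    """How much longer a full disk read takes when capacity outgrows throughput."""
    throughput_growth = new_mb_per_s / old_mb_per_s
    return throughput_growth, capacity_growth / throughput_growth


# Figures taken from the slide.
throughput_growth, slowdown = full_read_slowdown(100_000, 0.7, 64)
print(f"throughput grew x{throughput_growth:.0f}, full read takes x{slowdown:.0f} longer")
```

Throughput grew by a factor of about 91, so reading a full disk takes roughly 100,000 / 91, on the order of 1,000 times longer, which is the motivation for reading from many disks in parallel.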
  • 24. Google paper: MapReduce. Key principles:
      • Parallelize, distribute, and load-balance processing
      • Fault-tolerant (hide failure of nodes during the processing)
      • Co-location of processing and data
  • 25.
  • 26. Overview of Hadoop architecture: Integration with Information System; Querying; Advanced processing; Orchestration; Distributed Processing; Distributed Storage; Monitoring and Management
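The MapReduce programming model from slide 24 can be sketched in a single process with the classic word count. This only illustrates the map, shuffle and reduce steps; it does not show the distribution, fault tolerance or data co-location that a real framework provides, and the function names are assumptions.

```python
from collections import defaultdict
from itertools import chain


def map_phase(document):
    """Mapper: emit a (word, 1) pair for every word in one document."""
    return [(word.lower(), 1) for word in document.split()]


def shuffle(pairs):
    """Group emitted values by key, as the framework does between phases."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Reducer: combine all the values seen for one key."""
    return key, sum(values)


def word_count(documents):
    # In a real cluster each document (or block) would be mapped on the
    # machine that already stores it; here everything runs locally.
    mapped = chain.from_iterable(map_phase(doc) for doc in documents)
    return dict(reduce_phase(k, v) for k, v in shuffle(mapped).items())
```

Because each mapper only sees its own document and each reducer only sees one key, both phases can be spread across machines and re-run on another node if one fails.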
  • 27. A new way of doing BI and data analytics
      • Consider that all the data is valuable, and store everything: structured and unstructured data
      • Scale to petabytes of storage, at a low cost (Yahoo has a cluster of 42,000 nodes)
      • Don't force the data to match a predefined data model (tables and schema); instead use a "schema-on-read" approach
      • Don't move the data (ETL) to process it; instead move the processing to the data (MapReduce)
  • 28.
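A "schema-on-read" approach, as named on slide 27, can be illustrated as follows: raw records are stored untouched, and each reader applies its own schema at query time. The sample events, field names and the `read_with_schema` helper are hypothetical.

```python
import json

# Raw events stored as-is, including records that fit no predefined table.
RAW_EVENTS = [
    '{"user": "ana", "action": "view", "ms": 120}',
    '{"user": "bob", "action": "buy", "amount": 30.5}',
    "not json at all",
]


def read_with_schema(raw_lines, schema):
    """Project raw lines onto a schema given as {field: (converter, default)}."""
    for line in raw_lines:
        try:
            record = json.loads(line)
        except json.JSONDecodeError:
            continue  # unreadable for this schema; the raw data is kept anyway
        if not isinstance(record, dict):
            continue
        yield {
            field: convert(record[field]) if field in record else default
            for field, (convert, default) in schema.items()
        }


latency_schema = {"user": (str, None), "ms": (int, 0)}
rows = list(read_with_schema(RAW_EVENTS, latency_schema))
```

A different analysis can later read the same raw lines with another schema (for example `user` and `amount`) without any migration of the stored data.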
  • 29. Build vs. Buy
      • SPECIFIC (faster): strategic and innovative assets; unique, differentiating; perceived as a competitive advantage
      • COMMERCIAL SOFTWARE PACKAGES: common to all companies in a sector; perceived as an advantage for production
      • BPO (cheaper): resources; common to all companies; perceived as a resource
  • 30. They use and contribute massively to open source
      • Facebook: MySQL, Cassandra, Thrift, Open Compute (open source hardware and datacenter design)…
      • Google: Android, GWT, Chromium, Linux kernel… and, through their papers, GFS and MapReduce
      • LinkedIn: Voldemort, Kafka, Zoie…
      • Netflix: a huge list of software…
    "I trust software I hacked myself"
  • 31. A way to expose services of applications, to be re-used by others to build and enrich their own services and applications
  • 32.
  • 33. http://www.programmableweb.com/
  • 34.
  • 35. They take advantage of innovation made by others (individuals or companies): crowdsourced R&D
  • 36.
  • 37. Be a platform from the beginning. Memo from Jeff Bezos (2002):
      1) All teams will expose their data and functionality through service interfaces.
      2) Teams must communicate with each other through these interfaces.
      3) There will be no other form of interprocess communication allowed: no direct linking, no direct reads of another team's data store, no shared-memory model, no back-doors whatsoever. The only communication allowed is via service interface calls over the network.
      4) It doesn't matter what technology they use. HTTP, Corba, Pubsub, custom protocols: doesn't matter. Bezos doesn't care.
      5) All service interfaces, without exception, must be designed from the ground up to be externalizable. That is to say, the team must plan and design to be able to expose the interface to developers in the outside world. No exceptions.
      6) Anyone who doesn't do this will be fired.
      7) Thank you; have a nice day!
  • 38. Open API: why do it
      • Leverage effect: enrich your service portfolio and business opportunities with many partners
      • Do bigger things by using the "collective intelligence of the world"
      • Create an ecosystem around you
      • Improve the quality: If you want your APIs to be used, Companies of the world are looking at what you are doing → it puts pressure on you to improve
      • Attract talented people: the best way to attract good developers is that they will want to come and work with those who created these APIs
  • 39. FASTER. "One of the things we most value at Facebook engineering is moving fast."
  • 40.
  • 41.
      • "We try things. We celebrate our failures. This is a company where it is absolutely OK to try something that is very hard, have it not be successful, take the learning and apply it to something new." Eric Schmidt, former CEO of Google
      • "Move fast and break things." Mark Zuckerberg, Facebook
      • "Failure is totally OK. As long as you fail fast." Marissa Mayer, Yahoo
  • 42.
  • 43. "The minimum viable product is that version of a new product which allows a team to collect the maximum amount of validated learning about customers with the least effort." Eric Ries, pioneer of Lean Startup
  • 44.
  • 45. Short cycles to quickly validate each hypothesis
  • 46. Lean Startup example
  • 47.
  • 48.
  • 49.
  • 50. Multi-variant testing / Google Analytics
  • 51. Continuous Deployment
  • 52. "How long would it take your organization to deploy a change that involves just one single line of code?" Mary Poppendieck, From Concept To Cash
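The slide does not show how users are split between variants. One common approach, assumed here rather than taken from the talk, is deterministic bucketing by hash, so that a given user always sees the same variant without any stored state. The function and experiment names are hypothetical.

```python
import hashlib


def assign_variant(user_id, experiment, variants):
    """Deterministically map a user to one variant of an experiment."""
    digest = hashlib.sha256(f"{experiment}:{user_id}".encode("utf-8")).hexdigest()
    return variants[int(digest, 16) % len(variants)]


variants = ["control", "blue_button", "green_button"]
seen = {assign_variant(f"user-{i}", "checkout-test", variants) for i in range(1000)}
```

Including the experiment name in the hashed string keeps assignments independent between experiments, so a user who is in the control group of one test is not systematically in the control group of the next.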
  • 53.
      • 2 deployments per day
      • A deployment somewhere in the datacenters every 11 seconds; at any moment, an average of 10,000 servers are being updated
      • 10 deployments / day
  • 54. Why deploy continuously?
      • Improve Time To Market
      • Learn faster (and it needs metrics!)
      • Cycle: IDEAS → code fast → CODE → measure fast → DATA → learn fast → IDEAS
  • 55. Why deploy continuously?
      • Smaller changes = shorter time to recover
      • You reduce risk by lowering the impact of problems
  • 56. DevOps
      1. Infrastructure as Code
      2. Continuous Delivery
      3. Collaboration
  • 57. Infra as Code: industrialize and automate everything. Logstash, Chef, Puppet, Vagrant, Git, Capistrano, OpenStack, test-driven infrastructure
  • 58. Continuous Delivery: a pipeline to bring code to production
  • 59. Tools and practices
      • Continuous integration
      • TDD: Test-Driven Development (automated unit testing)
      • Code reviews
      • Continuous code auditing (Sonar…)
      • Functional test automation
      • Strong non-functional tests (performance, availability…)
      • Automated packaging and deployment, independent of target environment
      • Zero-downtime deployment
  • 60. Feature flipping
      • Pushing code to production != pushing a feature to production
      • Enable / disable a new feature in production in seconds
      • "Graceful degradation" during peaks of traffic
      • Can be used for A/B testing
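A minimal sketch of feature flipping as described on slide 60, assuming an in-memory flag store and a percentage-based rollout. The class `FeatureFlags`, the flag name `new_chat` and the rendering function are hypothetical; a production setup would read flags from a shared configuration service so that they can be changed without redeploying.

```python
import hashlib


class FeatureFlags:
    """In-memory flag store: feature name -> percentage of users enabled."""

    def __init__(self):
        self._rollout = {}

    def set_rollout(self, feature, percent):
        if not 0 <= percent <= 100:
            raise ValueError("percent must be between 0 and 100")
        self._rollout[feature] = percent

    def is_enabled(self, feature, user_id=""):
        # Unknown features are off: the code is deployed but stays dark.
        percent = self._rollout.get(feature, 0)
        digest = hashlib.sha256(f"{feature}:{user_id}".encode("utf-8")).hexdigest()
        return int(digest, 16) % 100 < percent


def render_home(flags, user_id):
    """The new code path ships to production but only runs when flipped on."""
    if flags.is_enabled("new_chat", user_id):
        return "home page with chat"
    return "home page"
```

Calling `flags.set_rollout("new_chat", 0)` during a traffic peak switches the feature off for everyone (graceful degradation), while an intermediate percentage exposes it to a stable subset of users, which is how the same mechanism supports A/B testing.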
  • 61. Datamodel evolution strategy example: code V.1 on datamodel version N; hybrid code V.1 + V.2 on datamodel versions N and N+1; code V.2 on datamodel version N+1
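The slide only names the stages (V.1, hybrid V.1 + V.2, V.2). One possible reading of the hybrid stage, assumed here, is code that can read both record layouts and always writes the new one, so that data is migrated progressively without downtime. The record fields (`name`, `first_name`, `last_name`) are hypothetical.

```python
def read_user(record):
    """Hybrid reader: accepts both the old and the new record layout."""
    if "first_name" in record:  # new layout (datamodel N+1)
        return record["first_name"], record["last_name"]
    first, _, last = record["name"].partition(" ")  # old layout (datamodel N)
    return first, last


def write_user(first, last):
    """Writer: always produces the new layout."""
    return {"first_name": first, "last_name": last}


def migrate_on_read(store, key):
    """Upgrade a record lazily, the first time the hybrid code touches it."""
    first, last = read_user(store[key])
    store[key] = write_user(first, last)
    return first, last


store = {"u1": {"name": "Ada Lovelace"}, "u2": {"first_name": "Alan", "last_name": "Turing"}}
```

Once every record has been rewritten in the new layout, the old branch of `read_user` can be deleted, which corresponds to the final "V.2 only" stage.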
  • 62. Dark Launch @ Facebook: "We chose to simulate the impact of many real users hitting many machines by means of a "dark launch" period in which Facebook pages would make connections to the chat servers, query for presence information and simulate message sends without a single UI element drawn on the page." YES! IT'S A LOAD TEST ON A PRODUCTION PLATFORM!
  • 63.
  • 64. You build it, you run it!
  • 65. 3. COLLABORATION (culture, organisation…): shared tools that facilitate interactions. Open the tools to the devs!
  • 66. BETTER
  • 67. Design for failure
  • 68. Netflix Hystrix: "Hystrix is a latency and fault tolerance library designed to isolate points of access to remote systems, services and 3rd party libraries, stop cascading failure and enable resilience in complex distributed systems where failure is inevitable."
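Hystrix is a Java library; the sketch below is not its API. It only illustrates, in Python, the circuit-breaker idea that this kind of library relies on to stop cascading failures: after repeated errors, calls to the remote system are skipped for a while and a fallback is returned immediately. The class name, thresholds and fallback mechanism are assumptions.

```python
import time


class CircuitBreaker:
    """Fail fast on a misbehaving dependency instead of piling up slow calls."""

    def __init__(self, max_failures=3, reset_after=30.0, clock=time.monotonic):
        self.max_failures = max_failures
        self.reset_after = reset_after  # seconds to stay open before a retry
        self._clock = clock
        self._failures = 0
        self._opened_at = None

    def call(self, func, fallback):
        if self._opened_at is not None:
            if self._clock() - self._opened_at < self.reset_after:
                return fallback()  # open: do not touch the remote system
            self._opened_at = None  # half-open: allow one trial call
        try:
            result = func()
        except Exception:
            self._failures += 1
            if self._failures >= self.max_failures:
                self._opened_at = self._clock()
            return fallback()
        self._failures = 0  # a success closes the circuit again
        return result
```

A caller would wrap each remote access, for example `breaker.call(fetch_recommendations, lambda: [])`, where `fetch_recommendations` is a hypothetical function: users get a degraded but fast answer while the dependency is down.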
  • 69. The cult of measurement
  • 70. "In God we trust. All others must bring data." W. Edwards Deming
  • 71. "Everyone must be able to experiment, learn and iterate. Position, obedience and tradition should hold no power. For innovation to flourish, measurement must rule." Werner Vogels, CTO of Amazon
  • 72. They measure everything:
      • usage
      • infrastructure, from datacenter to HDD
      • power consumption
      • operational process efficiency
      • …
      • self-service restaurant queue length!
      • management practices (Google)
    Good ideas come from the field, from real data, because managers always have biases when they try to interpret situations.
  • 73. Best size for teams: http://www.qsm.com/process_improvement_01.html
  • 74. Two-pizza teams
  • 75. Use a component-oriented organization? Teams: Team Front, Team Back, Team middleware, Team framework. Features: Feature 1, Feature 2, Feature 4, Feature 5
  • 76. Feature team = cross-functional team: Product Owner, UX designer, Developers, Testers, Ops
  • 77. Feature Teams at Spotify
  • 78.
  • 79. Software Craftsmanship
  • 80. If an idea is worth $1, a well-executed idea is worth $100… $1,000… $10,000,000!
  • 81. Attract and hire the best. WHAT FACEBOOK EMPLOYEES EARN (source: www.glassdoor.com/index.htm):
      • Senior software engineer: $132,503
      • Product manager: $130,143
      • User interface engineer: $129,136
      • Machine learning engineer: $123,379
      • Engineering manager: $123,379
      • Network engineer: $121,500
      • Business development manager: $115,000
      • Software engineer: $111,562
      • Project manager: $98,302
      • Operations engineer: $82,626
      • Site reliability engineer: $80,413
      • Software engineering intern: $74,700
      • Account executive: $62,124
    They are also known to have tough technical interviews, to get only the best developers.
  • 82. Develop the talents
      • Lots of training
      • Code review / pair programming
      • Mentoring
      • Slack time dedicated to R&D or personal projects
      • Hackathons
      • Strong open source involvement
  • 83. Keep people
  • 84. Software is eating the world. Be prepared for it!
  • 85. THANK YOU! To get these slides, to get the book in French (for free), or to be notified when the book is available in English, JUST SEND ME AN EMAIL: mde@octo.com