[JSDC 2016] Codex: Conditional Modules Strike Back

Netflix runs hundreds of multivariate AB tests a year, many of which help personalize the experience in the UI. This causes exponential growth in the number of user experiences we serve to members, with each unique experience resulting in a unique JS/CSS bundle. Pre-publishing millions of permutations to the CDN for each build of each UI simply does not work at Netflix scale.
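To make the growth concrete, here is a minimal sketch that multiplies the number of cells across independent tests. The function name `countExperiences` and the cell counts are illustrative assumptions, not figures from Netflix; the 100 tests with 3 cells each mirror the order of magnitude quoted in the slides (about 5.15e47).

```javascript
// Hypothetical helper: each entry in cellCounts is the number of variants
// (cells) for one independent AB test or personalization dimension.
function countExperiences(cellCounts) {
  // BigInt keeps the product exact once it grows past 2^53.
  return cellCounts.reduce((total, cells) => total * BigInt(cells), 1n);
}

// Two dimensions with 2 variants each (for example IE or not, new or old search).
const small = countExperiences([2, 2]);

// 100 independent tests with 3 cells each.
const large = countExperiences(new Array(100).fill(3));

console.log(small.toString());        // 4
console.log(large.toString().length); // 48 (a 48-digit number)
```

Every extra dimension multiplies the total, which is why publishing each combination ahead of time stops being practical.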

Instead, we've taken a novel approach by standing up a brand new Node.js service: Codex. Codex's sole responsibility is to build personalized JS/CSS bundles on the fly for our members as they move through the Netflix user experience. This frees up our UI teams to innovate rapidly on the UI itself, without having to worry about the costs of infrastructure and the complexity of pre-publishing to the CDN.
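As a rough illustration of building a bundle on the fly, the sketch below walks a conditional dependency map and keeps only the modules whose conditions match a member's flags. The artifact shape follows the JSON fragment on slides 79 to 81 of the transcript below; the function name `resolveModules`, the extra map entries, and the traversal order are assumptions for illustration, not Codex's actual algorithm.

```javascript
// Hypothetical artifact, shaped like the conditional map shown on the slides.
const artifact = {
  'home.js': {
    deps: ['dep1.js'],
    conditionalDeps: {
      'newSearch.js': { name: 'isNewSearch', value: true },
      'oldSearch.js': { name: 'isNewSearch', value: false },
    },
  },
  'dep1.js': { deps: [], conditionalDeps: {} },
  'newSearch.js': { deps: [], conditionalDeps: {} },
  'oldSearch.js': { deps: [], conditionalDeps: {} },
};

// Walk the graph from an entry point. Unconditional deps are always followed;
// conditional deps are followed only when the truths match the recorded value.
function resolveModules(graph, entry, truths) {
  const seen = new Set();
  const order = [];
  function visit(file) {
    if (seen.has(file)) return;
    seen.add(file);
    const node = graph[file] || { deps: [], conditionalDeps: {} };
    for (const dep of node.deps || []) visit(dep);
    for (const [dep, cond] of Object.entries(node.conditionalDeps || {})) {
      if (truths[cond.name] === cond.value) visit(dep);
    }
    order.push(file); // dependencies are emitted before the module that needs them
  }
  visit(entry);
  return order;
}

console.log(resolveModules(artifact, 'home.js', { isNewSearch: true }));
// [ 'dep1.js', 'newSearch.js', 'home.js' ]
```

The same artifact yields a different module list for each set of flags, so only the map needs to be stored per build rather than every possible bundle.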

As we stood up Codex, we learned a ton about building a horizontally scalable Node.js microservice. This talk is the story of how we built, designed, and scaled that service to meet the needs of our 80 million customers.
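One scaling lesson from the slides below (slides 133 to 146) is that artifacts moved from memory to disk, after which `JSON.parse` dominated CPU time until an LRU cache of parsed artifacts was added. The following is a minimal sketch of that idea under stated assumptions: `LruCache`, `getArtifact`, and the in-memory `rawStore` (a stand-in for the on-disk store) are hypothetical names, and the real service's cache is not described in the source.

```javascript
// Minimal LRU cache built on Map, which iterates keys in insertion order.
class LruCache {
  constructor(maxEntries) {
    this.maxEntries = maxEntries;
    this.entries = new Map();
  }
  get(key) {
    if (!this.entries.has(key)) return undefined;
    const value = this.entries.get(key);
    // Re-insert so this key becomes the most recently used.
    this.entries.delete(key);
    this.entries.set(key, value);
    return value;
  }
  set(key, value) {
    this.entries.delete(key);
    this.entries.set(key, value);
    if (this.entries.size > this.maxEntries) {
      // The first key in iteration order is the least recently used.
      this.entries.delete(this.entries.keys().next().value);
    }
  }
}

// Hypothetical stand-in for the on-disk store: build id -> raw JSON string.
const rawStore = new Map([
  ['web/v1', '{"home.js":{"deps":[]}}'],
  ['web/v2', '{"home.js":{"deps":["dep1.js"]}}'],
  ['tv/v5', '{"main.js":{"deps":[]}}'],
]);

let parseCount = 0;
const parsedCache = new LruCache(2);

function getArtifact(buildId) {
  const cached = parsedCache.get(buildId);
  if (cached !== undefined) return cached;
  const raw = rawStore.get(buildId);
  if (raw === undefined) return undefined;
  parseCount += 1; // JSON.parse only runs on a cache miss
  const parsed = JSON.parse(raw);
  parsedCache.set(buildId, parsed);
  return parsed;
}

getArtifact('web/v1'); // miss
getArtifact('web/v1'); // hit, no parse
getArtifact('web/v2'); // miss
getArtifact('tv/v5');  // miss, evicts web/v1
getArtifact('web/v1'); // miss again
console.log(parseCount); // 4
```

Bounding the cache by entry count keeps memory use predictable while sparing the CPU for the artifacts that are requested most often.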

Published in: Internet

  1. 1. Alex Liu. A Netflix Engineering Original. @stinkydofu, aliu@netflix.com. Codex: Conditional Modules Strike Back
  2. 2. #netflixeverywhere
  3. 3. contents I. The Problem II. Codex: Conditional Bundling III. Scaling for Netflix IV. Looking to the Future
  4. 4. i. the Problem
  5. 5. It's all about building JS bundles.
  6. 6. // bundle.js var a = require('a'); var b = require('b'); bundle.js
  7. 7. // bundle.js var a = require('a'); var b = require('b'); bundle.js
  8. 8. var bundles = [ 'bundle.js' ]; bundle.js home.js profile.js feed.js album.js signup.js
  9. 9. home.js profile.js feed.js signup.js account.js settings.js album.js photo.js login.js var bundles = [ 'home.js', 'profile.js', 'feed.js', 'signup.js', 'account.js', 'settings.js', 'album.js', 'photo.js', 'login.js' ];
  10. 10. var bundles = [ 'home.js', 'profile.js', 'feed.js', 'signup.js', 'account.js', 'settings.js', 'album.js', 'photo.js', 'login.js' ]; /home /profile /feed
  11. 11. var bundles = [ 'home.js', 'profile.js', 'feed.js', 'signup.js', 'account.js', 'settings.js', 'album.js', 'photo.js', 'login.js' ]; /home /profile /feed home.js profile.js feed.js
  12. 12. html5shiv es5-shim
  13. 13. var bundles = [ 'home.js', 'homeIE.js', 'profile.js', 'profileIE.js', 'feed.js', 'feedIE.js', 'signup.js', 'signupIE.js', ... ];
  14. 14. var bundles = [ 'home.js', 'homeIE.js', 'profile.js', 'profileIE.js', 'feed.js', 'feedIE.js', 'signup.js', 'signupIE.js', ... ];
  15. 15. var bundles = [ 'home.js', 'homeIE.js', 'profile.js', 'profileIE.js', 'feed.js', 'feedIE.js', 'signup.js', 'signupIE.js', ... ]; /home
  16. 16. var bundles = [ 'home.js', 'homeIE.js', 'profile.js', 'profileIE.js', 'feed.js', 'feedIE.js', 'signup.js', 'signupIE.js', ... ]; /home
  17. 17. var bundles = [ 'home.js', 'homeIE.js', 'profile.js', 'profileIE.js', 'feed.js', 'feedIE.js', 'signup.js', 'signupIE.js', ... ]; /home home.js NO
  18. 18. var bundles = [ 'home.js', 'homeIE.js', 'profile.js', 'profileIE.js', 'feed.js', 'feedIE.js', 'signup.js', 'signupIE.js', ... ]; /home home.js homeIE.js YES NO
  19. 19. Netflix AB testing!
  20. 20. AB test w/ multiple cells. Cells: Control (Cell 1), Cell 2, Cell 3. Movie Cover Art
  21. 21. AB test w/ multiple cells. Cells: Control (Cell 1), Cell 2, Cell 3. Movie Cover Art: 14%, 6%
  22. 22. Netflix AB testing!
  23. 23. Netflix AB testing!
  24. 24. home.js
  25. 25. home.js newSearch.js oldSearch.js
  26. 26. home.js oldSearch.js
  27. 27. home.js oldSearch.js
  28. 28. home.js newSearch.js
  29. 29. home.js newSearch.js oldSearch.js
  30. 30. home.js newSearch.js oldSearch.js jQuery ~80KB, React ~120KB
  31. 31. larger bundle size: • file sizes • time to download • memory usage • time to interactive (TTI)
  32. 32. old school Lego bricks were generic
  33. 33. new Lego is about specialization
  34. 34. hard to reuse specialized bricks
  35. 35. home.js newSearch.js oldSearch.js
  36. 36. home.js newSearch.js oldSearch.js
  37. 37. home.js newSearch.js oldSearch.js
  38. 38. // starting to look like a // lot of bundles... var bundles = [ 'homeNewSearch.js', 'homeNewSearchIE.js', 'homeOldSearch.js', 'homeOldSearchIE.js', ... ];
  39. 39. // starting to look like a // lot of bundles... var bundles = [ 'homeNewSearch.js', 'homeNewSearchIE.js', 'homeOldSearch.js', 'homeOldSearchIE.js', ... ]; 4x variations already!
  40. 40. Netflix runs hundreds of AB tests
  41. 41. Netflix runs hundreds of AB tests but we personalize on many other dimensions too
  42. 42. $|S_1| \cdot |S_2| \cdots |S_n| = |S_1 \times S_2 \times \cdots \times S_n|$
  43. 43. $|S_1| \cdot |S_2| \cdots |S_n| = |S_1 \times S_2 \times \cdots \times S_n|$, $3^{100} \approx 5.1537752 \times 10^{47}$
  44. 44. 40,000,000,000 bricks to reach the
  45. 45. 40,000,000,000 bricks to reach the 7,600,000,000,000,000 bricks to reach
  46. 46. 40,000,000,000 bricks to reach the 7,600,000,000,000,000 bricks to reach Enough bricks to reach 6.7812832e32 times.
  47. 47. that’s a #$*%^ ton of bundles!
  48. 48. https://xkcd.com/303/
  49. 49. Website's full bundle is 10MB+
  50. 50. how do we deal with conditional modules?
  51. 51. ii. Codex: Conditional Bundling
  52. 52. what if we generate on-demand?
  53. 53. what if we generate on-demand? 1. identify the UI variation 2. generate the bundle
  54. 54. how do we identify the UI variation?
  55. 55. AB Tests
  56. 56. AB Tests
  57. 57. AB Tests
  58. 58. truths: noun, plural [troothz, trooths]. A bucket of boolean flags used to build a personalized Netflix experience
  59. 59. { "webfonts": false, "instantSearch": true, "socialFeatures": false, "motionBanner": true, "html5video": true, "customScrollbar": true }
  60. 60. { "webfonts": false, "instantSearch": true, "socialFeatures": false, "motionBanner": true, "html5video": true, "customScrollbar": true } inputs and outputs are NOT 1:1
  61. 61. how do we generate the bundle?
  62. 62. home.js newSearch.js oldSearch.js
  63. 63. // home.js if (truths.isNewSearch === true) { require('./newSearch'); } else { require('./oldSearch'); }
  64. 64. home.js newSearch.js oldSearch.js
  65. 65. home.js newSearch.js oldSearch.js condition condition
  66. 66. home.js newSearch.js oldSearch.js condition condition
  67. 67. condition condition
  68. 68. isNewSearch !isNewSearch
  69. 69. // home.js if (truths.isNewSearch === true) { require('./newSearch'); } else { require('./oldSearch'); }
  70. 70. home.js newSearch.js oldSearch.js isNewSearch !isNewSearch
  71. 71. home.js newSearch.js newEntryPoint.js oldSearch.js !isNewSearch isNewSearch
  72. 72. git
  73. 73. git
  74. 74. git codex (node.js module)
  75. 75. git codex artifact (node.js module)
  76. 76. artifact
  77. 77. artifact
  78. 78. artifact
  79. 79. { "home.js": { "deps": [ "dep1.js", "dep2.js", "dep3.js" ], "conditionalDeps": { "newSearch.js": { "name": "isNewSearch", "value": true
  80. 80. ], "conditionalDeps": { "newSearch.js": { "name": "isNewSearch", "value": true }, "oldSearch.js": { "name": "isNewSearch", "value": false } } }
  81. 81. ], "conditionalDeps": { "newSearch.js": { "name": "isNewSearch", "value": true }, "oldSearch.js": { "name": "isNewSearch", "value": false } } }
  82. 82. it's a conditional map!
  83. 83. web/v1 web/v2 web/v3 web/v4
  84. 84. artifact truths
  85. 85. <html/> http://codex.nflxext.com/web/v1/83af
  86. 86. http://codex.nflxext.com/web/v1/83af
  87. 87. <script/> http://codex.nflxext.com/web/v1/83af
  88. 88. <script/> http://codex.nflxext.com/web/v1/83af
  89. 89. <script/> <script/> codex ? i got this! http://codex.nflxext.com/web/v1/83af
  90. 90. codex http://codex.nflxext.com/web/v1/83af http://codex.nflxext.com/{team}/{version}/{truths}
  91. 91. codex web/v1 http://codex.nflxext.com/web/v1/83af http://codex.nflxext.com/{team}/{version}/{truths}
  92. 92. codex { 83: newSearchTest, af: isChrome } web/v1 http://codex.nflxext.com/web/v1/83af http://codex.nflxext.com/{team}/{version}/{truths}
  93. 93. home.js
  94. 94. oldSearch.js home.js
  95. 95. oldSearch.js home.js
  96. 96. oldSearch.js home.js
  97. 97. newSearch.js oldSearch.js home.js
  98. 98. home.js newSearch.js oldSearch.js response times <= 80ms
  99. 99. codex http://codex.nflxext.com/web/v1/83af
  100. 100. <script/> codex here you go http://codex.nflxext.com/web/v1/83af
  101. 101. <script/> <script/> codex cached! here you go http://codex.nflxext.com/web/v1/83af
  102. 102. Recap • Build Time: build conditional graph (artifact) • Run Time: apply truths to artifact • Conditional bundling is transparent, universal, configuration free!
  103. 103. iii. Scaling For Netflix
  104. 104. web/v1 web/v2 tv/v5 tv/v7
  105. 105. Storage Metadata
  106. 106. Amazon S3 Amazon DynamoDB Storage Metadata
  107. 107. Build Time: Codex Artifact Management
  108. 108. web/v1 Build Time: Codex Artifact Management
  109. 109. web/v1 web/v1 Build Time: Codex Artifact Management saved!
  110. 110. SAVED! web/v1 web/v1 Build Time: Codex Artifact Management saved!
  111. 111. Build Time: Codex Artifact Management
  112. 112. activate web/v1 web/v1 activated! Build Time: Codex Artifact Management
  113. 113. Run Time: Codex Bundler
  114. 114. web/v1 here are the active build ids Run Time: Codex Bundler
  115. 115. here are the artifacts web/v1 here are the active build ids Run Time: Codex Bundler
  116. 116. here are the artifacts web/v1 here are the active build ids Run Time: Codex Bundler
  117. 117. prod prod-new canary
  118. 118. 16GB ought to be enough for us! 🙂
  119. 119. codex
  120. 120. codex
  121. 121. codex
  122. 122. 400+ artifacts!
  123. 123. 400+ artifacts! …and we ran out of memory
  124. 124. 32GB ought to be enough for us! 🤔
  125. 125. 800+ artifacts!
  126. 126. 800+ artifacts! …and we ran out of memory
  127. 127. 64GB ought to be enough for us…? 😭
  128. 128. 1600+ artifacts!
  129. 129. 1600+ artifacts! …and we ran out of memory. again.
  130. 130. 1600+ artifacts! Our teams will use as much as we give them. …and we ran out of memory. again.
  131. 131. Q: What's cheap, plentiful, and fast enough?
  132. 132. Q: What's cheap, plentiful, and fast enough? A: Disk.
  133. 133. codex LevelDB
  134. 134. codex LevelDB
  135. 135. 100% CPU usage.
  136. 136. what's the problem?
  137. 137. ~68%
  138. 138. v8::internal::Runtime_ParseJson
  139. 139. codex LevelDB
  140. 140. codex LevelDB JSON.parse
  141. 141. JSON.parse is slow. And blocks the CPU!
  142. 142. codex LevelDB JSON.parse LRU Cache: Saving the CPU
  143. 143. codex LevelDB JSON.parse LRU Cache: Saving the CPU
  144. 144. codex LevelDB JSON.parse LRU Cache: Saving the CPU
  145. 145. codex LevelDB JSON.parse LRU Cache: Saving the CPU
  146. 146. codex LevelDB JSON.parse LRU Cache: Saving the CPU
  147. 147. Breaking change to conditional graph traversal algorithm!
  148. 148. http://codex.nflxext.com/web/v1/83af
  149. 149. http://codex.nflxext.com/web/v1/83af old algorithm? new algorithm?
  150. 150. old algorithm? new algorithm? http://codex.nflxext.com/1.0.0/web/v1/83af http://codex.nflxext.com/2.0.0/web/v1/83af
  151. 151. 1.0.0 2.0.0 zuul
  152. 152. zuul 1.0.0 2.0.0
  153. 153. zuul 1.0.0 2.0.0
  154. 154. Good for now.
  155. 155. Good for now. Continue to look for engineering wins.
  156. 156. What about operational resiliency?
  157. 157. eu-west-1 us-west-2 us-east-1
  158. 158. eu-west-1 us-west-2 us-east-1
  159. 159. eu-west-1 us-west-2 ???
  160. 160. eu-west-1 us-west-2 ???
  161. 161. eu-west-1 us-west-2 ???
  162. 162. ??? ??? ???
  163. 163. <script/> <script/> codex
  164. 164. <script/> codex ???
  165. 165. codex <script/> ???
  166. 166. Recap • Management plane necessary at scale • Performance is critical (TTI) • Redundancy across 3 AWS regions • Resilient against CDN failure
  167. 167. iv. Looking To The Future
  168. 168. why not {bundler}?
  169. 169. how do we support tree shaking?
  170. 170. don't be afraid to challenge common convention.
  171. 171. don't make assumptions about the upper limits.
  172. 172. don't optimize before you understand the system.
  173. 173. use the scientific method: 1. gather data 2. formulate hypothesis 3. test hypothesis 4. repeat
  174. 174. engineer for fault tolerance
  175. 175. Netflix scale is challenging.
  176. 176. Lego Photo Credits: https://www.flickr.com/clement127/ https://www.flickr.com/jose_antonio_hidalgo_jimenez/ https://www.flickr.com/reiterlied/
  177. 177. Image Credits
  178. 178. Image Credits
  179. 179. Image Credits Artist: alecive (Alessandro Roncone) Iconset Homepage: https://github.com/alecive/FlatWoken
  180. 180. Alex Liu aliu@netflix.com @stinkydofu fin
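
Slide 90 gives the bundle URL shape as `http://codex.nflxext.com/{team}/{version}/{truths}`, and slide 150 shows a variant with a leading bundler version such as `1.0.0`. The sketch below only splits such a URL into its named parts; the function name `parseBundleRequest` and the field names `bundlerVersion` and `artifactKey` are assumptions, and how the `truths` segment (for example `83af`) is decoded into flags is not specified in the slides, so it is left as an opaque string.

```javascript
// Split a bundle request URL into the parts named on the slides.
// A fourth, leading path segment is treated as the bundler version (slide 150).
function parseBundleRequest(requestUrl) {
  const segments = new URL(requestUrl).pathname.split('/').filter(Boolean);
  if (segments.length !== 3 && segments.length !== 4) {
    throw new Error(`Unexpected bundle path: ${requestUrl}`);
  }
  const bundlerVersion = segments.length === 4 ? segments.shift() : null;
  const [team, version, truths] = segments;
  // artifactKey is a hypothetical lookup key for the stored conditional map.
  return { bundlerVersion, team, version, truths, artifactKey: `${team}/${version}` };
}

console.log(parseBundleRequest('http://codex.nflxext.com/web/v1/83af'));
console.log(parseBundleRequest('http://codex.nflxext.com/2.0.0/web/v1/83af'));
```

Because every input that affects the bundle appears in the path, identical requests map to identical URLs, which fits the cached responses shown on slide 101.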
