Scaling Instagram

Instagram scalability in practice

Transcript

  • 1. Scaling Instagram. AirBnB Tech Talk 2012. Mike Krieger, Instagram
  • 2. me: co-founder, Instagram. Previously: UX & front-end @ Meebo. Stanford HCI BS/MS. @mikeyk on everything
  • 3. communicating and sharing in the real world
  • 4. 30+ million users in less than 2 years
  • 5. the story of how we scaled it
  • 6. a brief tangent
  • 7. the beginning
  • 9. 2 product guys
  • 10. no real back-end experience
  • 11. analytics & python @ meebo
  • 12. CouchDB
  • 13. CrimeDesk SF
  • 14. let’s get hacking
  • 15. good components in place early on
  • 16. ...but were hosted on a single machine somewhere in LA
  • 17. less powerful than my MacBook Pro
  • 18. okay, we launched. now what?
  • 19. 25k signups in the first day
  • 20. everything is on fire!
  • 21. best & worst day of our lives so far
  • 22. load was through the roof
  • 23. first culprit?
  • 24. favicon.ico
  • 25. 404-ing on Django, causing tons of errors
  • 26. lesson #1: don’t forget your favicon
  • 27. real lesson #1: most of your initial scaling problems won’t be glamorous
  • 28. favicon
  • 29. ulimit -n
  • 30. memcached -t 4
  • 31. prefork/postfork
  • 32. friday rolls around
  • 33. not slowing down
  • 34. let’s move to EC2.
  • 35. scaling = replacing all components of a car while driving it at 100mph
  • 36. since...
  • 37. “"canonical [architecture]of an early stage startup in this era." (HighScalability.com)
  • 38. Nginx & Redis & Postgres & Django.
  • 39. Nginx & HAProxy & Redis & Memcached & Postgres & Gearman & Django.
  • 40. 24h Ops
  • 41. our philosophy
  • 42. 1 simplicity
  • 43. 2 optimize for minimal operational burden
  • 44. 3 instrument everything
  • 45. walkthrough: 1 scaling the database, 2 choosing technology, 3 staying nimble, 4 scaling for Android
  • 46. 1 scaling the db
  • 47. early days
  • 48. django ORM, postgresql
  • 49. why pg? postgis.
  • 50. moved db to its own machine
  • 51. but photos kept growing and growing...
  • 52. ...and only 68GB of RAM on biggest machine in EC2
  • 53. so what now?
  • 54. vertical partitioning
  • 55. django db routers make it pretty easy
  • 56. def db_for_read(self, model, **hints):
            if model._meta.app_label == 'photos':
                return 'photodb'
  • 57. ...once you untangle all your foreign key relationships
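Wiring the slide-56 router into a project takes one more setting. A minimal sketch of Django's database-router hook; the 'photodb' alias, 'photos' app label, and module path are illustrative stand-ins, not Instagram's actual configuration:

    # db_routers.py -- vertical partitioning via a Django DB router
    class PhotoRouter(object):
        def db_for_read(self, model, **hints):
            if model._meta.app_label == 'photos':
                return 'photodb'
            return None  # fall through to the 'default' database

        def db_for_write(self, model, **hints):
            if model._meta.app_label == 'photos':
                return 'photodb'
            return None

    # settings.py (illustrative):
    # DATABASES = {'default': {...}, 'photodb': {...}}
    # DATABASE_ROUTERS = ['myapp.db_routers.PhotoRouter']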
  • 58. a few months later...
  • 59. photosdb > 60GB
  • 60. what now?
  • 61. horizontal partitioning!
  • 62. aka: sharding
  • 63. “surely we’ll have hired someone experienced before we actually need to shard”
  • 64. you don’t get to choose when scaling challenges come up
  • 65. evaluated solutions
  • 66. at the time, none were up to the task of being our primary DB
  • 67. did it in Postgres itself
  • 68. what’s painful about sharding?
  • 69. 1 data retrieval
  • 70. hard to know what your primary access patterns will be w/out any usage
  • 71. in most cases, user ID
  • 72. 2 what happens if one of your shards gets too big?
  • 73. in range-based schemes (like MongoDB), you split
  • 74. A-H: shard0, I-Z: shard1
  • 75. A-D: shard0, E-H: shard2, I-P: shard1, Q-Z: shard3
  • 76. downsides (especially on EC2): disk IO
  • 77. instead, we pre-split
  • 78. many, many, many (thousands of) logical shards
  • 79. that map to fewer physical ones
  • 80. // 8 logical shards on 2 machines
        user_id % 8 = logical shard
        logical shards -> physical shard map
        { 0: A, 1: A, 2: A, 3: A,
          4: B, 5: B, 6: B, 7: B }
  • 81. // 8 logical shards on 4 machines (was 2)
        user_id % 8 = logical shard
        logical shards -> physical shard map
        { 0: A, 1: A, 2: C, 3: C,
          4: B, 5: B, 6: D, 7: D }
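In Python, the logical-to-physical map on slides 80-81 is just a dict lookup; a minimal sketch (names are illustrative):

    # 8 logical shards spread over physical machines; rebalancing means
    # editing this map, not rehashing user IDs.
    NUM_LOGICAL_SHARDS = 8
    LOGICAL_TO_PHYSICAL = {0: 'A', 1: 'A', 2: 'C', 3: 'C',
                           4: 'B', 5: 'B', 6: 'D', 7: 'D'}

    def physical_machine_for(user_id):
        logical_shard = user_id % NUM_LOGICAL_SHARDS
        return LOGICAL_TO_PHYSICAL[logical_shard]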
  • 82. little known but awesome PG feature: schemas
  • 83. not “columns” schema
  • 84. - database:
          - schema:
            - table:
              - columns
  • 85. machineA:
          shard0: photos_by_user
          shard1: photos_by_user
          shard2: photos_by_user
          shard3: photos_by_user
  • 86. machineA:                machineA’:
          shard0: photos_by_user   shard0: photos_by_user
          shard1: photos_by_user   shard1: photos_by_user
          shard2: photos_by_user   shard2: photos_by_user
          shard3: photos_by_user   shard3: photos_by_user
  • 87. machineA:                machineC:
          shard0: photos_by_user   shard0: photos_by_user
          shard1: photos_by_user   shard1: photos_by_user
          shard2: photos_by_user   shard2: photos_by_user
          shard3: photos_by_user   shard3: photos_by_user
  • 88. can do this as long as you have more logical shards than physical ones
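With each logical shard in its own Postgres schema, a query just prefixes the table name with that schema. A minimal sketch using psycopg2, reusing NUM_LOGICAL_SHARDS from the sketch above (host, database name, and table layout are assumptions):

    import psycopg2

    conn = psycopg2.connect(host='machineA', dbname='insta')  # illustrative

    def photos_for_user(user_id):
        # the shard number is derived locally from an int, so it is safe
        # to interpolate into the schema-qualified table name
        shard = user_id % NUM_LOGICAL_SHARDS
        cur = conn.cursor()
        cur.execute('SELECT * FROM shard%d.photos_by_user '
                    'WHERE user_id = %%s' % shard, (user_id,))
        return cur.fetchall()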
  • 89. lesson: take tech/tools you know and try first to adapt them into a simple solution
  • 90. 2 which tools where?
  • 91. where to cache / otherwise denormalize data
  • 92. we <3 redis
  • 93. what happens when a user posts a photo?
  • 94. 1 user uploads photo with (optional) caption and location
  • 95. 2 synchronous write to the media database for that user
  • 96. 3 queues!
  • 97. 3a if geotagged, async worker POSTs to Solr
  • 98. 3b follower delivery
  • 99. can’t have every user who loads her timeline look up everyone she follows and then their photos
  • 100. instead, everyone gets their own list in Redis
  • 101. media ID is pushed onto a list for every person who’s following this user
  • 102. Redis is awesome for this: rapid insert, rapid subsets
  • 103. when it’s time to render a feed, we take a small # of IDs and go look up info in memcached
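Slides 98-103 in code: fan-out on write with redis-py, hydrate on read from memcached. A minimal sketch; the key names, feed-length cap, and memcached helper are invented for illustration:

    import redis

    r = redis.Redis()           # the feed Redis instance (illustrative)
    MAX_FEED_LENGTH = 500       # keep each per-user list bounded

    def deliver_to_followers(media_id, follower_ids):
        # 3b follower delivery: push the media ID onto each follower's list
        for follower_id in follower_ids:
            key = 'feed:%d' % follower_id
            r.lpush(key, media_id)
            r.ltrim(key, 0, MAX_FEED_LENGTH - 1)

    def render_feed(user_id, count=30):
        # take a small number of IDs, then look up full info in memcached;
        # shrinking `count` is one way to shed load (shorter feeds)
        media_ids = r.lrange('feed:%d' % user_id, 0, count - 1)
        return [memcached_get(mid) for mid in media_ids]  # hypothetical helper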
  • 104. Redis is great for...
  • 105. data structures that are relatively bounded
  • 106. (don’t tie yourself to a solution where your in-memory DB is your main data store)
  • 107. caching complex objects where you want to do more than GET
  • 108. ex: counting, sub-ranges, testing membership
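Those three operations map directly onto Redis primitives; a quick sketch reusing the connection r from the feed sketch above (key names and IDs invented):

    media_id, user_id = 42, 7                            # illustrative IDs
    r.incr('media:%d:like_count' % media_id)             # counting
    r.zrangebyscore('media_by_time', 0, 1335000000)      # sub-ranges
    r.sismember('media:%d:likers' % media_id, user_id)   # membership test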
  • 109. especially when Taylor Swift posts live from the CMAs
  • 110. follow graph
  • 111. v1: simple DB table (source_id, target_id, status)
  • 112. who do I follow? who follows me? do I follow X? does X follow me?
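The slide-111 table answers all four questions with simple indexed queries; a minimal Django sketch (model and field names are assumptions):

    from django.db import models

    class Follow(models.Model):
        # one row per follow edge
        source_id = models.BigIntegerField(db_index=True)  # the follower
        target_id = models.BigIntegerField(db_index=True)  # the followee
        status = models.SmallIntegerField()                # e.g. active / blocked

    # who do I follow?    Follow.objects.filter(source_id=me)
    # who follows me?     Follow.objects.filter(target_id=me)
    # do I follow X?      Follow.objects.filter(source_id=me, target_id=x).exists()
    # does X follow me?   Follow.objects.filter(source_id=x, target_id=me).exists()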
  • 113. DB was busy, so we started storing a parallel version in Redis
  • 114. follow_all(300-item list)
  • 115. inconsistency
  • 116. extra logic
  • 117. so much extra logic
  • 118. exposing your support team to the idea of cache invalidation
  • 119. redesign took a page from twitter’s book
  • 120. PG can handle tens of thousands of requests with very light memcached caching
  • 121. two takeaways
  • 122. 1 have a versatile complement to your core data storage (like Redis)
  • 123. 2 try not to have two tools trying to do the same job
  • 124. 3 staying nimble
  • 125. 2010: 2 engineers
  • 126. 2011: 3 engineers
  • 127. 2012: 5 engineers
  • 128. scarcity -> focus
  • 129. engineer solutions that you’re not constantly returning to because they broke
  • 130. 1 extensive unit-tests and functional tests
  • 131. 2 keep it DRY
  • 132. 3 loose coupling using notifications / signals
  • 133. 4 do most of our work in Python, drop to C when necessary
  • 134. 5 frequent code reviews, pull requests to keep things in the ‘shared brain’
  • 135. 6 extensive monitoring
  • 136. munin
  • 137. statsd
  • 138. “how is the system right now?”
  • 139. “how does this compare to historical trends?”
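"Instrument everything" is cheap with statsd; a minimal sketch using the statsd Python client (metric names invented):

    import statsd

    stats = statsd.StatsClient('localhost', 8125)

    stats.incr('photos.uploaded')           # counter: "how is the system right now?"
    with stats.timer('feed.render_time'):   # timer: compare against historical trends
        render_feed(7)                      # from the feed sketch above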
  • 140. 4 scaling for Android
  • 141. 1 million new users in 12 hours
  • 142. great tools that enable easy read scalability
  • 143. redis: slaveof <host> <port>
  • 144. our Redis framework assumes 0+ read slaves
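A framework that "assumes 0+ read slaves" can just pick a connection per operation; a minimal sketch (hostnames illustrative):

    import random
    import redis

    master = redis.Redis(host='redis-master')
    slaves = [redis.Redis(host=h) for h in ['redis-slave-1', 'redis-slave-2']]

    def redis_for_read():
        # spread reads over the slaves; fall back to master if there are none
        return random.choice(slaves) if slaves else master

    def redis_for_write():
        return master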
  • 145. tight iteration loops
  • 146. statsd & pgfouine
  • 147. know where you can shed load if needed
  • 148. (e.g. shorter feeds)
  • 149. if you’re tempted to reinvent the wheel...
  • 150. don’t.
  • 151. “our app servers sometimes kernel panic under load”
  • 152. ...
  • 153. “what if we write a monitoring daemon...”
  • 154. wait! this is exactly what HAProxy is great at
  • 155. surround yourself with awesome advisors
  • 156. culture of openness around engineering
  • 157. give back; e.g. node2dm
  • 158. focus on making what you have better
  • 159. “fast, beautiful photo sharing”
  • 160. “can we make all of our requests take 50% of the time?”
  • 161. staying nimble = remind yourself of what’s important
  • 162. your users around the world don’t care that you wrote your own DB
  • 163. wrapping up
  • 164. unprecedented times
  • 165. 2 backend engineers can scale a system to 30+ million users
  • 166. key word = simplicity
  • 167. cleanest solution with as few moving parts as possible
  • 168. don’t over-optimize or expect to know ahead of time how the site will scale
  • 169. don’t think “someone else will join & take care of this”
  • 170. will happen sooner than you think; surround yourself with great advisors
  • 171. when adding software to the stack: only if you have to, optimizing for operational simplicity
  • 172. few, if any, unsolvable scaling challenges for a social startup
  • 173. have fun
