Distributed Cache for Beginners


  1. 1. Distributed Cache for Beginners. charsyam@naver.com, http://charsyam.wordpress.com
  2. 2. Idle server developer at NHN. Interests: Cloud, Big Data. Specialty: winging presentations!!!
  3. 3. What is Cache?
  4. 4. Cache is a component that transparently stores data so that future requests for that data can be served faster. (Wikipedia)
  5. 5. A cache stores, ahead of time, results that will be requested later, so that they can be served quickly.
  6. 6. Lots of Data
  7. 7. Cache
  8. 8. Lots of Data, Cache
  9. 9. CPU Cache
  10. 10. Browser Cache
  11. 11. Why Cache?
  12. 12. Use Case: Login
  13. 13. Use Case: Login. Common Case
  14. 14. Use Case: Login. Common Case: Read From DB. SELECT * FROM Users WHERE id='charsyam';
  15. 15. Typical DB setup: Master, REPLICATION/FailOver, Slave
  16. 16. Typical DB setup: the Master handles all traffic; the Slave is a standby for failures. Master, REPLICATION/FailOver, Slave
  17. 17. So the load piles up!!!
  18. 18. Time to choose!!!
  19. 19. Scale UP vs. Scale OUT
  20. 20. Scale UP
  21. 21. 1,000 TPS
  22. 22. 3,000 TPS: bring in a server that can handle 3x the load
  23. 23. Scale OUT
  24. 24. 1,000 TPS
  25. 25. 2,000 TPS
  26. 26. 3,000 TPS
  27. 27. Choose Scale Out!!!
  28. 28. Choose Scale Out!!! If you have plenty of money, Scale Up is fine too.
  29. 29. Request analysis: 70% reads, 30% writes
  30. 30. Conclusion: reads are 70%, so distribute the reads!!!
  31. 31. One Write Master + Multi Read Slave
  32. 32. Client: ONLY WRITE to the Master, ONLY READ from the Slaves. Master, REPLICATION, Slave, Slave, Slave
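
The "one write master, multiple read slaves" layout on the slides above can be sketched as a small router. This is only an illustration: the `ReplicatedRouter` class, the server names, and the round-robin policy are assumptions, not something the deck specifies.

```python
import itertools


class ReplicatedRouter:
    """Send writes to the master and spread reads across the slaves."""

    def __init__(self, master, slaves):
        self.master = master
        # Round-robin is an assumed policy; the slides do not name one.
        self._slaves = itertools.cycle(slaves)

    def for_write(self):
        return self.master

    def for_read(self):
        return next(self._slaves)


router = ReplicatedRouter("master", ["slave-1", "slave-2", "slave-3"])
print(router.for_write())                      # master
print([router.for_read() for _ in range(4)])   # slave-1, slave-2, slave-3, slave-1
```

Because replication is asynchronous, a read routed to a slave may briefly return older data than the master holds.
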
  33. 33. Eventual Consistency
  34. 34. Ex) Replication: Master, REPLICATION, Slave. Replication can lag when a Slave is under load or for other reasons, but the Slave eventually catches up.
  35. 35. Which login data must never differ, even for a moment?
  36. 36. Which login data must never differ, even for a moment? => Back to square one!!!
  37. 37. At first, we were happy.
  38. 38. Client, Master, REPLICATION, Slave (x5)
  39. 39. Then the load grew even more!!!
  40. 40. Client, Master, REPLICATION, Slave (x12)
  41. 41. The performance gain is negligible!!! WHY?
  42. 42. Machine I/O is a zero-sum game
  43. 43. Partitioning
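  44. 44. Scalable Partitioning: Client; PART 1 (Web Server, DBMS); PART 2 (Web Server, DBMS)
  45. 45. Partitioning: performance
  46. 46. Partitioning: performance, management issues
  47. 47. Partitioning: performance, management issues, cost
  48. 48. A simple solution
  49. 49. Use SSDs for the DB server disks, and more memory than data
  50. 50. Use SSDs for the DB server disks, and more memory than data => Money, money, money!!!
  51. 51. Why Cache?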
  52. 52. Use Case: Login
  53. 53. Use Case: Login. Read From Cache: Get charsyam
  54. 54. Use Case: Login. Apply Cache For Read
  55. 55. General: Cache Layer (Cache), Storage Layer (DBMS), Application Server; arrows: READ, WRITE, UPDATE, WRITE
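
A minimal sketch of applying a cache to the login read path: look in the cache first, fall back to the DB on a miss, and write to the DB before updating the cache. Plain dicts stand in for the cache server and the DBMS, and the `ProfileStore` class and its method names are hypothetical.

```python
class ProfileStore:
    """Cache-aside access to user profiles (dicts stand in for real servers)."""

    def __init__(self, db):
        self.db = db          # stand-in for the DBMS (storage layer)
        self.cache = {}       # stand-in for Memcache/Redis (cache layer)
        self.db_reads = 0     # counts how often the DB is actually read

    def get(self, user_id):
        profile = self.cache.get(user_id)
        if profile is None:               # cache miss: fall back to the DB
            self.db_reads += 1
            profile = self.db.get(user_id)
            if profile is not None:
                self.cache[user_id] = profile
        return profile

    def update(self, user_id, profile):
        self.db[user_id] = profile        # WRITE to the DBMS first
        self.cache[user_id] = profile     # then UPDATE the cache


store = ProfileStore({"charsyam": {"UserName": "charsyam"}})
store.get("charsyam")   # miss: reads the DB and fills the cache
store.get("charsyam")   # hit: served from the cache
print(store.db_reads)   # 1
```
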
  56. 56. Type 1: 1 Key – N Items. KEY: Profile:User_ID. Value: Profile (LastLoginTime, UserName, Host, Name)
  57. 57. Type 2: 1 Key – 1 Item. KEY: name:User_ID, Value: UserName. KEY: LastLoginTime:User_ID, Value: LastLoginTime
  58. 58. Pros And Cons, Type 1 (1 Key – N Items). Pros: just 1 get.
  59. 59. Pros And Cons, Type 1 (1 Key – N Items). Pros: just 1 get. Cons: if 1 item changes, the whole cache entry must be updated, which can cause a race condition.
  60. 60. Pros And Cons, Type 2 (1 Key – 1 Item). Pros: if 1 item changes, just change that item.
  61. 61. Pros And Cons, Type 2 (1 Key – 1 Item). Pros: if 1 item changes, just change that item. Cons: some items can be removed from the cache independently of the others.
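
The two key layouts can be compared side by side. In this sketch a dict stands in for the cache, JSON is an assumed serialization for the Type 1 value, and the helper names are hypothetical; only the key patterns (`Profile:User_ID`, `name:User_ID`, `LastLoginTime:User_ID`) come from the slides.

```python
import json

cache = {}  # stand-in for a key-value cache server


def save_type1(user_id, profile):
    # Type 1: one key holds the whole profile.
    cache[f"Profile:{user_id}"] = json.dumps(profile)


def load_type1(user_id):
    raw = cache.get(f"Profile:{user_id}")      # a single get
    return json.loads(raw) if raw is not None else None


def save_type2(user_id, profile):
    # Type 2: one key per item.
    for field, value in profile.items():
        cache[f"{field}:{user_id}"] = value


def load_type2(user_id, fields):
    # One get per field; any field may be missing if it was removed.
    return {field: cache.get(f"{field}:{user_id}") for field in fields}


profile = {"name": "charsyam", "LastLoginTime": "10:00"}
save_type1("charsyam", profile)
save_type2("charsyam", profile)

# Type 2 lets one item change without touching the others.
cache["LastLoginTime:charsyam"] = "11:00"
# Type 2 items can disappear independently, leaving a partial result.
del cache["name:charsyam"]
print(load_type2("charsyam", ["name", "LastLoginTime"]))
```
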
  62. 62. Data that changes / data that does not change
  63. 63. Divide: data that changes / data that does not change
  64. 64. Don't update the cache from a value you didn't read from the DB. Profile: LastLoginTime, UserName, Host, Name
  65. 65. Don't update the cache from a value you didn't read from the DB. EVENT: Update only LastLoginTime. Profile: LastLoginTime, UserName, Host, Name
  66. 66. Don't update the cache from a value you didn't read from the DB. EVENT: Update only LastLoginTime. Read Cache: User Profile. Profile: LastLoginTime, UserName, Host, Name
  67. 67. Don't update the cache from a value you didn't read from the DB. EVENT: Update only LastLoginTime. Read Cache: User Profile. Update Data & DB. Just Save Cache. Profile: LastLoginTime, UserName, Host, Name
  68. 68. Don't update the cache from a value you didn't read from the DB. EVENT: Update only LastLoginTime. Read Cache: User Profile. Update Data & DB. Just Save Cache. This creates a race condition. Profile: LastLoginTime, UserName, Host, Name
  69. 69. Need Global Lock
  70. 70. Need Global Lock => Skipping that today!!!
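
The race condition from the "read cache, update, save cache" sequence can be reproduced by hand. The sketch below interleaves two hypothetical workers explicitly instead of using threads, so the outcome is deterministic; the field values are made up for illustration.

```python
cache = {"Profile:charsyam": {"LastLoginTime": "09:00", "Host": "old-host"}}

# Both workers read the same cached profile before either one saves.
worker_a = dict(cache["Profile:charsyam"])
worker_b = dict(cache["Profile:charsyam"])

worker_a["LastLoginTime"] = "10:00"   # A updates only LastLoginTime
worker_b["Host"] = "new-host"         # B updates only Host

cache["Profile:charsyam"] = worker_a  # A saves the whole profile
cache["Profile:charsyam"] = worker_b  # B saves its stale copy over A's

# A's LastLoginTime update is lost.
print(cache["Profile:charsyam"])
# {'LastLoginTime': '09:00', 'Host': 'new-host'}
```
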
  71. 71. How to Test
  72. 72. Using Real Data: 100,000 requests over 25 million user IDs, reproduced from logs
  73. 73. Test Result: Memcache vs. MySQL, 136 vs. 1613 seconds
  74. 74. Result of Applying Cache
  75. 75. Select Query
  76. 76. CPU utilization
  77. 77. Consistent speed!!! Reduced load!!!
  78. 78. What is Hard?
  79. 79. Key Value – SYNC. Easy: 1. Update To DB, 2. Fail To Transaction. Fail!!!
  80. 80. Key Value – SYNC. HARD: 1. Update To DB, 2. Update to Cache. Fail!!!
  81. 81. Key Value – How: 1. Update To DB, 2. Update to Cache. Fail!!! => RETRY, BATCH, DELETE
  82. 82. Data! The most important thing
  83. 83. If the data is not important, updating the cache is not important: Login Count, Last Login Time, etc.
  84. 84. BUT!
  85. 85. If the data is important, updating the cache is important: Server Address, Data Path, etc.
  86. 86. HOW!
  87. 87. RETRY! Retry, Retry, Retry: solves over 9x% of cases. Or Delete Cache!!!
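
The "retry, or delete the cache" advice can be sketched as below. The function name, the three-attempt limit, and the use of `ConnectionError` as the failure signal are assumptions; `cache_set` and `cache_delete` are hypothetical callables standing in for a real cache client.

```python
def update_cache_with_retry(cache_set, cache_delete, key, value, attempts=3):
    """Try to update the cache; if every attempt fails, delete the key instead."""
    for _ in range(attempts):
        try:
            cache_set(key, value)
            return "updated"
        except ConnectionError:
            continue  # a real client would usually wait before retrying
    try:
        # A missing key is safer than a stale one: the next read reloads from the DB.
        cache_delete(key)
        return "deleted"
    except ConnectionError:
        return "failed"  # caller can hand this off to a batch job


store = {}
calls = {"n": 0}


def flaky_set(key, value):
    # Fails on the first two calls, then succeeds.
    calls["n"] += 1
    if calls["n"] < 3:
        raise ConnectionError("cache unavailable")
    store[key] = value


print(update_cache_with_retry(flaky_set, store.pop, "Profile:charsyam", "v2"))  # updated
```
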
  88. 88. Batch! Queuing Service
  89. 89. Diagram: Queue, Batch Processor, Cache Server; Error Log (x5)
  90. 90. Diagram: Queue, Batch Processor, Cache Server; Error Log (x5)
  91. 91. Diagram: Queue, Batch Processor, UPDATE, Cache Server; Error Log (x5)
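
One way to read the batch slides: cache updates that failed are recorded in an error log queue, and a batch processor replays them against the cache server later. The sketch below follows that reading; the in-memory `deque`, the function names, and `ConnectionError` as the failure signal are all assumptions.

```python
from collections import deque

error_log = deque()  # stand-in for a durable queuing service


def record_failure(key, value):
    """Remember a cache update that could not be applied."""
    error_log.append((key, value))


def run_batch(cache_set):
    """Replay queued updates; keep the ones that fail again. Returns how many remain."""
    remaining = deque()
    while error_log:
        key, value = error_log.popleft()
        try:
            cache_set(key, value)
        except ConnectionError:
            remaining.append((key, value))
    error_log.extend(remaining)
    return len(remaining)


cache = {}
record_failure("ServerAddress:charsyam", "10.0.0.1")
record_failure("DataPath:charsyam", "/data/1")
print(run_batch(cache.__setitem__))  # 0
print(cache)
```
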
  92. 92. Caches: Memcache, Redis
  93. 93. Memcache: Atomic Operation
  94. 94. Memcache: Atomic Operation, Key:Value
  95. 95. Memcache: Atomic Operation, Key:Value, Single Thread
  96. 96. Memcache: processes over 100,000 TPS
  97. 97. Memcache: max item size 1 MB
  98. 98. Memcache: LRU, ExpireTime
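
To make "LRU, ExpireTime" concrete, here is a toy in-process cache that evicts the least recently used key when full and drops entries after a time-to-live. It illustrates the two ideas only and is not how Memcache is implemented internally; the `LruTtlCache` class and its parameters are hypothetical.

```python
import time
from collections import OrderedDict


class LruTtlCache:
    """Toy cache with LRU eviction and per-item expiry."""

    def __init__(self, capacity, clock=time.monotonic):
        self.capacity = capacity
        self.clock = clock               # injectable so tests can control time
        self._items = OrderedDict()      # key -> (value, expires_at), oldest first

    def set(self, key, value, ttl):
        self._items[key] = (value, self.clock() + ttl)
        self._items.move_to_end(key)     # mark as most recently used
        while len(self._items) > self.capacity:
            self._items.popitem(last=False)  # evict the least recently used

    def get(self, key):
        entry = self._items.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if self.clock() >= expires_at:   # expired: drop it and report a miss
            del self._items[key]
            return None
        self._items.move_to_end(key)
        return value


cache = LruTtlCache(capacity=2)
cache.set("a", 1, ttl=60)
cache.set("b", 2, ttl=60)
cache.get("a")             # "a" is now the most recently used
cache.set("c", 3, ttl=60)  # evicts "b"
print(cache.get("b"))      # None
```
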
  99. 99. Redis: Key:Value
  100. 100. Redis: ExpireTime, Replication, Snapshot
  101. 101. Redis: Key:Value, Collections (Hash, List, Sorted Set)
  102. 102. Cautions
  103. 103. Item Size: 1~XXX KB. The smaller the item, the better.
  104. 104. Cache In Global Service
  105. 105. Memcached In Facebook • Facebook and Google and Many Companies • Facebook – 500 million logins per day (800 million per month) (latest; the figures below are from last year, 2010/04) – 70 million active users – user growth: 1 million every 4 days – 10,000 Web servers, 20 million Web requests per second – 805 Memcached servers -> 15TB, HitRate: 95% – 1,800 MySQL servers, Master/Slave (900 each) • Mem: 25TB, 500,000 SQL queries per second
  106. 106. Cache In Twitter (Old): API, WEB, Page Cache, DB, DB
  107. 107. Cache In Twitter (NEW): API, WEB, Page Cache, Fragment Cache, Row Cache, Vector Cache, DB, DB, DB
  108. 108. Redis In Wonga: Using Redis for DataStore Write/Read
  109. 109. Redis In Wonga: Flash Client, Ruby Backend
  110. 110. NoCache In Wonga: 1 million daily users, 200 million daily HTTP requests
  111. 111. NoCache In Wonga: 1 million daily users, 200 million daily HTTP requests, 100,000 DB operations per sec, 40,000 DB updates per sec
  112. 112. NoCache In Wonga
  113. 113. First Scale Out
  114. 114. First Setting – 3 Month
  115. 115. More Traffic
  116. 116. MySQL hiccups – DB Problems Began
  117. 117. Analysis About DBMS: ActiveRecord's status check caused 20% extra DB; 60% of all updates were done on the 'tiles' table
  118. 118. Applying Sharding 2xD
  119. 119. Result of Applying Sharding 2xD
  120. 120. Doubling MySQL
  121. 121. Result of Doubling MySQL
  122. 122. 50,000 TPS on EC2
  123. 123. Wonga chose Redis! But there were some failures => Skipping that today!!!
  124. 124. Result!!!
  125. 125. Consistent Hashing
  126. 126. Origin: K = 10000, N = 5. User Request, Proxy, Server (x5)
  127. 127. FAIL: redistribution of about 2000 users. K = 10000, N = 4. User Request, Proxy, Server (x5)
  128. 128. RECOVER: redistribution of about 2500 users. K = 10000, N = 5. User Request, Proxy, Server (x5)
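
For contrast with the "about 2000 users" figure above, the short calculation below counts how many of the K = 10000 users would change servers if they were placed with a naive `user_id % N` rule when N drops from 5 to 4. The modulo rule and the integer user IDs are assumptions used only for comparison; the slides do not describe this scheme.

```python
K = 10000  # number of users, as on the slides

# Naive placement: server index = user_id % N.
moved = sum(1 for user_id in range(K) if user_id % 5 != user_id % 4)

print(moved)    # 8000 users change servers under modulo placement
print(K // 5)   # 2000, the share that lived on the failed server
```

With modulo placement most users move even though only one server failed, which is the cost consistent hashing is meant to avoid.
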
  129. 129. Add A,B,C Server. Ring: A
  130. 130. Add A,B,C Server. Ring: A, B
  131. 131. Add A,B,C Server. Ring: A, B, C
  132. 132. Add Item 1. Ring: A, 1, B, C
  133. 133. Add Item 2. Ring: A, 1, B, 2, C
  134. 134. Add Item 3,4,5. Ring: A, 3, 1, 4, B, 5, 2, C
  135. 135. Fail!! B Server. Ring: A, 3, 4, B, 5, 2, C
  136. 136. Add Item 1 Again -> Allocated C Server. Ring: A, 3, 1, 4, B, 5, 2, C
  137. 137. Recover B Server -> Add Item 1. Ring: A, 3, 1, 4, B, 1, 5, 2, C
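
The ring walk-through above can be replayed with fixed positions. In this sketch each item belongs to the first server found clockwise from its position; the numeric positions for servers and items are made-up assumptions chosen to match the order shown on the slides, and the `owner` helper is hypothetical.

```python
import bisect


def owner(servers, position):
    """Return the first server at or clockwise after position, wrapping around."""
    ring = sorted((pos, name) for name, pos in servers.items())
    points = [pos for pos, _ in ring]
    idx = bisect.bisect_left(points, position) % len(ring)
    return ring[idx][1]


servers = {"A": 0, "B": 40, "C": 80}    # assumed positions on a 0..99 ring
items = {1: 20, 2: 60, 3: 10, 4: 30, 5: 50}

print({item: owner(servers, pos) for item, pos in items.items()})
# {1: 'B', 2: 'C', 3: 'B', 4: 'B', 5: 'C'}

without_b = {name: pos for name, pos in servers.items() if name != "B"}
print(owner(without_b, items[1]))   # C: item 1 is allocated to C while B is down
print(owner(servers, items[1]))     # B: after B recovers, item 1 maps to B again
```

After recovery the copy of item 1 written to C is still there but is no longer read, which is the duplicate shown on the last ring slide.
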
  138. 138. Thank You!
  139. 139. Real Implementation. Ring: A, C+2, C+3, B+3, A+1, A+4, B, B+2, C+1, A+2, B+1, A+3, C
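
The "Real Implementation" ring places several points per server (A+1, A+2, ...). A minimal sketch of that idea follows. The `HashRing` class, SHA-256 as the hash function, four points per server, and the `"A+0"` naming are assumptions; real client libraries differ in all of these details.

```python
import bisect
import hashlib


def _hash(text):
    # Map a string to a large integer position on the ring.
    return int(hashlib.sha256(text.encode("utf-8")).hexdigest(), 16)


class HashRing:
    """Consistent hash ring with several virtual points per server."""

    def __init__(self, replicas=4):
        self.replicas = replicas
        self._points = []   # sorted ring positions
        self._owner = {}    # ring position -> server name

    def add(self, server):
        for i in range(self.replicas):
            point = _hash(f"{server}+{i}")
            self._owner[point] = server
            bisect.insort(self._points, point)

    def remove(self, server):
        for i in range(self.replicas):
            point = _hash(f"{server}+{i}")
            del self._owner[point]
            self._points.remove(point)

    def lookup(self, key):
        # First point clockwise from the key's position (ring must not be empty).
        idx = bisect.bisect_left(self._points, _hash(key)) % len(self._points)
        return self._owner[self._points[idx]]


ring = HashRing()
for name in ("A", "B", "C"):
    ring.add(name)

keys = [f"user{i}" for i in range(1000)]
before = {key: ring.lookup(key) for key in keys}
ring.remove("B")
after = {key: ring.lookup(key) for key in keys}

# Only keys that lived on B are reassigned; everything else stays put.
print(all(after[k] == before[k] for k in keys if before[k] != "B"))  # True
```

Spreading each server over several points makes the load that a failed server leaves behind fall on more than one neighbor.
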
