Tweaking performance on high-load projects (Dmitriy Dumanskiy)

AI&Big Data Lab conference, April 12, 2014

Published in: Data & Analytics, Technology
  1. 1. Tweaking performance on high-load projects
  2. 2. Dmitriy Dumanskiy Cogniance, mGage project Java Team Lead
  3. 3. Project evolution Mgage Mobclix XXXX
  4. 4. Mgage delivery load: 3 billion req/month. ~8 c3.xlarge Amazon instances. Average load: 2400 req/sec. Peak: x10
  5. 5. Mobclix delivery load: 14 billion req/month. ~16 c3.xlarge Amazon instances. Average load: 6000 req/sec. Peak: x6
  6. 6. XXXX delivery load: 20 billion req/month. ~14 c3.xlarge Amazon instances. Average load: 11000 req/sec. Peak: x6
  7. 7. Is it a lot? Average load: 11000 req/sec
  8. 8. Twitter: new tweets, 15 billion a month. Average load: 5700 req/sec. Peak: x30
  9. 9. Delivery load
      Project | Requests per month | Max load per instance, req/sec | Requirements | Servers, AWS c3.xlarge
      Mgage | 3 billion | 300 | HTTP, time 95% < 60 ms | 8
      Mobclix | 14 billion | 400 | HTTP, time 95% < 100 ms | 16
      XXXX | 20 billion | 800 | HTTPS, time 99% < 100 ms | 14
  10. 10. Delivery load. c3.xlarge: 4 vCPU, 2.8 GHz Intel Xeon E5-2680. LA (load average): ~2-3. 1-2 cores reserved for sudden peaks
  11. 11. BE tech stacks. Mobclix: Spring, iBatis, MySQL, Solr, Vertica, Cascading, Tomcat; Mgage: Spring, Hibernate, Postgres, distributed Ehcache, Hadoop, Voldemort, JBoss; XXXX: Spring, Hibernate, MySQL, Solr, Cascading, Redis, Tomcat
  12. 12. Initial problem ● ~1000 req/sec ● Peaks 6x ● 99% HTTPS with response time < 100ms
  13. 13. Real problem ● ~85 mln active users, ~115 mln registered users ● 11.5 messages per user per day ● ~11000 req/sec ● Peaks 6x ● 99% HTTPS with response time < 100ms ● Reliable and scalable for future growth up to 80k
  14. 14. Architecture AdServer Console (UI) Reporting
  15. 15. Architecture Console (UI) MySql SOLR Master SOLR Slave SOLR Slave SOLR Slave
  16. 16. SOLR? Why? ● Pros: ○ Quick search on complex queries ○ Has a lot of built-in features (master-slave replication, RDBMS integration) ● Cons: ○ Only HTTP, embedded performs worse ○ Not easy for beginners ○ Max load is ~100 req/sec per instance
  17. 17. “Simple” query
      "-(-connectionTypes:" + "\"" + getConnectionType() + "\"" + " AND connectionTypes:[* TO *]) AND "
      + "-connectionTypeExcludes:" + "\"" + getConnectionType() + "\"" + " AND "
      + "-(-OSes:" + "(\"" + osQuery + "\" OR \"" + getOS() + "\")" + " AND OSes:[* TO *]) AND "
      + "-osExcludes:" + "(\"" + osQuery + "\" OR \"" + getOS() + "\")"
      + " AND (runOfNetwork:T OR appIncludes:" + getAppId() + " OR pubIncludes:" + getPubId() + " OR categories: (" + categoryList + "))"
      + " AND -appExcludes:" + getAppId() + " AND -pubExcludes:" + getPubId() + " AND -categoryExcludes:(" + categoryList + ") AND "
      + keywordQuery + " AND "
      + "-(-devices:" + "\"" + getHandsetNormalized() + "\"" + " AND devices:[* TO *]) AND "
      + "-deviceExcludes:" + "\"" + getHandsetNormalized() + "\"" + " AND "
      + "-(-carriers:" + "\"" + getCarrier() + "\"" + " AND carriers:[* TO *]) AND "
      + "-carrierExcludes:" + "\"" + getCarrier() + "\"" + " AND "
      + "-(-locales:" + "(\"" + locale + "\" OR \"" + langOnly + "\")" + " AND locales:[* TO *]) AND "
      + "-localeExcludes:" + "(\"" + locale + "\" OR \"" + langOnly + "\") AND "
      + "-(-segments:(" + segmentQuery + ") AND segments:[* TO *]) AND "
      + "-segmentExcludes:(" + segmentQuery + ")"
      + " AND -(-geos:" + geoQuery + " AND geos:[* TO *]) AND "
      + "-geosExcludes:" + geoQuery
  18. 18. Architecture MySql Solr Master SOLR Slave AdServer SOLR Slave AdServer SOLR Slave AdServer No-SQL
  19. 19. AdServer - Solr Slave. Delivery: volatile DeliveryData cache; Cron Job: DeliveryData tempCache = loadData(); cache = tempCache;
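The cache swap on this slide can be sketched in plain Java as follows. This is a minimal sketch, not the project's actual code: the class name `SnapshotCache`, the `Supplier`-based loader, and the scheduling period are assumptions made for illustration. The idea is that request threads only ever do one volatile read, while a background job builds the next snapshot off to the side and publishes it with a single reference assignment.

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.function.Supplier;

// Hypothetical holder: request threads read a snapshot, a periodic job swaps it.
final class SnapshotCache<T> {
    // volatile makes the fully built snapshot visible to readers after the swap
    private volatile T cache;
    private final Supplier<T> loader;

    SnapshotCache(Supplier<T> loader) {
        this.loader = loader;
        this.cache = loader.get(); // initial load before serving traffic
    }

    // Called on every delivery request: one volatile read, no locking.
    T get() {
        return cache;
    }

    // Called by the periodic job: build the new snapshot first, then publish it.
    void reload() {
        T tempCache = loader.get();
        cache = tempCache;
    }

    // Runs reload() at a fixed period; the period is an assumed parameter.
    ScheduledExecutorService scheduleReload(long period, TimeUnit unit) {
        ScheduledExecutorService scheduler = Executors.newSingleThreadScheduledExecutor();
        scheduler.scheduleAtFixedRate(this::reload, period, period, unit);
        return scheduler;
    }
}
```

Readers never see a half-built snapshot, because the reference is replaced only after `loader.get()` has returned. Note that `scheduleAtFixedRate` stops rescheduling if `reload()` throws, so a real loader would need its own error handling.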
  20. 20. Why no-sql? ● Realtime data ● Quick response time ● Simple queries by key ● 1-2 queries to no-sql on every request. Average load 10-20k req/sec and >120k req/sec in peaks. ● Cheap solution
  21. 21. Why Redis? Pros ● Easy and light-weight ● Low latency and response time. 99% is < 1ms. Average latency is ~0.2ms ● Up to 100k 'get' commands per second on c1.X-Large ● Cool features (atomic increments, sets, hashes) ● Ready AWS service — ElastiCache
  22. 22. Why Redis? Cons ● Single-threaded out of the box ● Using all cores requires sharding/clustering ● Scaling/failover is not easy ● Limited by max instance memory (240 GB on the largest AWS instance) ● Persistence/swapping may delay responses ● Cluster solution is not production ready
  23. 23. DynamoDB vs Redis
      Store | Price per month | Put, 95% | Get, 95% | Rec/sec
      DynamoDB | $58 | 300 ms | 150 ms | 50
      DynamoDB | $580 | 60 ms | 8 ms | 780
      DynamoDB | $5800 | 16 ms | 8 ms | 1250
      Redis (c1.medium) | $200 | 3 ms | < 1 ms | 4000
      ElastiCache (c1.xlarge) | $600 | < 1 ms | < 1 ms | 10000
  24. 24. What about others? ● Cassandra ● Voldemort ● Memcached
  25. 25. Redis RAM problem ● 1 user entry ~ from 80 bytes to 3kb ● ~85 mln users ● Required RAM ~ from 1 GB to 300 GB
  26. 26. Data compression speed
  27. 27. Data compression size
  28. 28. Data compression: JSON → Kryo binary → 4x less data → gzip → 2x less data == 8x less data. Now we need < 40 GB, plus less load on the network stack
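The gzip stage of this pipeline can be sketched with the JDK alone. The sketch below is an illustration under assumptions: the class name `GzipCodec` is hypothetical, and the Kryo serialization step is left out because it needs an external library, so the input is simply whatever byte array the serializer produced.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

// Hypothetical helper: gzip a serialized user entry before storing it in Redis.
final class GzipCodec {
    // Compresses the serialized bytes; closing the stream writes the gzip trailer.
    static byte[] compress(byte[] raw) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (GZIPOutputStream gzip = new GZIPOutputStream(buffer)) {
            gzip.write(raw);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buffer.toByteArray();
    }

    // Restores the original bytes after reading the value back.
    static byte[] decompress(byte[] packed) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (GZIPInputStream gzip = new GZIPInputStream(new ByteArrayInputStream(packed))) {
            byte[] chunk = new byte[4096];
            int read;
            while ((read = gzip.read(chunk)) != -1) {
                buffer.write(chunk, 0, read);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buffer.toByteArray();
    }
}
```

The actual ratio depends on the data. Very small entries can even grow, since gzip adds a fixed header and trailer, so the 2x figure on the slide should be read as a result for this project's entries rather than a general rule.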
  29. 29. AdServer BE Average response time — ~1.2 ms Load — 800 req/sec with LA ~4 c3.XLarge == 4 vCPU
  30. 30. AdServer BE ● Logging — 12% of time (5% on SSD); ● Response generation — 15% of time; ● Redis request — 50% of time; ● All business logic — 23% of time;
  31. 31. Reporting AdServer Hadoop ETL MySQL Console S3 S3 Delivery logs Aggregated logs
  32. 32. Log structure { "uid":"test", "platform":"android", "app":"xxx", "ts":1375952275223, "pid":1, "education":"Some-Highschool-or-less", "type":"new", "sh":1280, "appver":"6.4.34", "country":"AU", "time":"Sat, 03 August 2013 10:30:39 +0200", "deviceGroup":7, "rid":"fc389d966438478e9554ed15d27713f51", "responseCode":200, "event":"ad", "device":"N95", "sw":768, "ageGroup":"18-24", "preferences":["beer","girls"] }
  33. 33. Log structure ● 1 mln records == 0.6 GB. ● ~900 mln records a day == ~0.55 TB. ● 1 month is up to 20 TB of data. ● Zipped data is 10 times smaller.
  34. 34. Reporting Customer: “And we need fancy reporting” But 20 TB of data per month is huge. So what can we do?
  35. 35. Reporting Dimensions: device, os, osVer, screenWidth, screenHeight, country, region, city, carrier, advertisingId, preferences, gender, age, income, sector, company, language, etc... Use case: I want to know how many users saw my ad in San Francisco.
  36. 36. Reporting Geo table: Country, City, Region, CampaignId, Date, counters; Device table: Device, Carrier, Platform, CampaignId, Date, counters; Uniques table: CampaignId, UID
  37. 37. Predefined report types → aggregation by predefined dimensions → 500-1000 times less data 20 TB per month → 40 GB per month
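Aggregation by predefined dimensions can be illustrated with a small in-memory sketch. It is not the project's Hadoop job: the `Event` class, its fields, and the string key format are assumptions chosen to mirror the Geo table (Country, Region, City, CampaignId, Date, counters). Many raw log records collapse into one counter per distinct dimension combination, which is where the data reduction comes from.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch of the Geo report roll-up.
final class GeoReportAggregator {
    // One raw delivery-log event, reduced to the fields the Geo report needs.
    static final class Event {
        final String country;
        final String region;
        final String city;
        final long campaignId;
        final String date; // day bucket, e.g. "2014-04-12"

        Event(String country, String region, String city, long campaignId, String date) {
            this.country = country;
            this.region = region;
            this.city = city;
            this.campaignId = campaignId;
            this.date = date;
        }
    }

    // Collapses raw events into one counter per (country, region, city, campaign, date).
    static Map<String, Long> aggregate(List<Event> events) {
        Map<String, Long> counters = new HashMap<>();
        for (Event e : events) {
            String key = e.country + "|" + e.region + "|" + e.city + "|" + e.campaignId + "|" + e.date;
            counters.merge(key, 1L, Long::sum);
        }
        return counters;
    }
}
```

With this shape, the San Francisco use case becomes a lookup or a small range scan over the aggregated rows instead of a scan over the raw logs.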
  38. 38. Of course - Hadoop ● Pros: ○ Unlimited (depends) horizontal scaling ● Cons: ○ Not real-time ○ Processing time directly depends on code quality and on infrastructure cost. ○ Not all input can be scaled ○ Cluster startup is so... long
  39. 39. Alternatives? ● Storm ● Redshift ● Vertica ● Math models?
  40. 40. Elastic MapReduce ● Easy to set up ● Easy to extend ● Easy to monitor
  41. 41. Timing ● Hadoop (cascading) : ○ 25 GB in peak hour takes ~40min (-10 min). CSV output 300MB. With cluster of 4 c3.xLarge. ● MySQL: ○ Put 300MB in DB with insert statements ~40 min.
  42. 42. Timing ● Hadoop (cascading) : ○ 25 GB in peak hour takes ~40min (-10 min). CSV output 300MB. With cluster of 4 c3.xLarge. ● MySQL: ○ Put 300MB in DB with insert statements ~40 min. ● MySQL: ○ Put 300MB in DB with optimizations ~5 min.
  43. 43. Optimizations ● No “insert into”, only “load data”: ~10 times faster ● “ENGINE=MyISAM” vs “INNODB” when possible: ~5 times faster ● For “upsert”: temp table with “ENGINE=MEMORY”, which saves IO
  44. 44. Cascading
      Hadoop:
        void map(K key, V val, OutputCollector collector) { ... }
        void reduce(K key, Iterator<V> vals, OutputCollector collector) { ... }
      Cascading:
        Scheme sinkScheme = new TextLine(new Fields("word", "count"));
        Pipe assembly = new Pipe("wordcount");
        assembly = new Each(assembly, new Fields("line"), new RegexGenerator(new Fields("word"), ","));
        assembly = new GroupBy(assembly, new Fields("word"));
        Aggregator count = new Count(new Fields("count"));
        assembly = new Every(assembly, count);
  45. 45. Why cascading? Hadoop Job 1 → Hadoop Job 2 → Hadoop Job 3. The result of one job should be processed by another job
  46. 46. Lessons Learned
  47. 47. Redis sharding. AdServer → Redis shard 0, Redis shard 1, Redis shard 2. shardNumber = Math.abs(UID.hashCode() % 3)
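Shard selection by UID hash can be sketched as below, assuming the intent is to map every UID onto one of N shard indexes. The class name `ShardSelector` is hypothetical. The sketch uses a modulo rather than a division, and `Math.floorMod` rather than `%`, because `String.hashCode()` can be negative and a plain `%` would then return a negative index.

```java
// Hypothetical helper that maps a user id onto a shard index in [0, shardCount).
final class ShardSelector {
    private final int shardCount;

    ShardSelector(int shardCount) {
        if (shardCount <= 0) {
            throw new IllegalArgumentException("shardCount must be positive");
        }
        this.shardCount = shardCount;
    }

    // floorMod keeps the result non-negative even for negative hash codes.
    int shardFor(String uid) {
        return Math.floorMod(uid.hashCode(), shardCount);
    }
}
```

The same UID always lands on the same shard as long as `shardCount` stays fixed. Changing the count remaps most keys.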
  48. 48. Resharding problem All data already in shards, how to add new shards?
  49. 49. Resharding problem. Solution (AdServer, Old Shard, New Shard): 1. Get UID from the new shard. If not present: a) get UID from the old shard. 2. Save UID to the new shard. Removal script
  50. 50. Postgres partitioning ● Queries on small partitions ● Distributed index ● Smaller index size ● Small partitions may fit in RAM ● Easy to remove/move
  51. 51. Cost of IO. L1 cache: 3 cycles; L2 cache: 14 cycles; RAM: 250 cycles; Disk: 41 000 000 cycles; Network: 240 000 000 cycles
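The read path during resharding can be sketched as follows. This is an illustration under assumptions: plain `Map` objects stand in for the Redis connections of the old and new shard layouts, and the class name `MigratingStore` is hypothetical. Each read tries the new shard first, falls back to the old shard, and copies the entry forward, so data migrates lazily as users show up.

```java
import java.util.Map;

// Hypothetical lazy-migration wrapper around the old and new shard layouts.
final class MigratingStore {
    // Maps stand in for Redis clients here (assumption made for a runnable sketch).
    private final Map<String, String> oldShard;
    private final Map<String, String> newShard;

    MigratingStore(Map<String, String> oldShard, Map<String, String> newShard) {
        this.oldShard = oldShard;
        this.newShard = newShard;
    }

    String get(String uid) {
        // 1. Try the new shard first.
        String value = newShard.get(uid);
        if (value != null) {
            return value;
        }
        // 1a. Not migrated yet: fall back to the old shard.
        value = oldShard.get(uid);
        if (value != null) {
            // 2. Copy the entry forward so the next read hits the new shard.
            newShard.put(uid, value);
        }
        return value;
    }
}
```

The sketch leaves old entries in place after copying them; cleaning them up is a separate step.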
  52. 52. Cost of IO @Cacheable is everywhere
  53. 53. Hadoop Map input : 300 MB Map output : 80 GB
  54. 54. Hadoop ● mapreduce.map.output.compress = true ● codecs: GZip, BZ2 - CPU intensive ● codecs: LZO, Snappy ● codecs: JNI ~x10
  55. 55. Hadoop Consider Combiner
  56. 56. Hadoop Text, IntWritable, BytesWritable, NullWritable, etc Simpler - better
  57. 57. Hadoop Missing data: map(T value, ...) { Log log = parse(value); Data data = dbWrapper.getSomeMissingData(log.getCampId()); }
  58. 58. Hadoop Missing data: map(T value, ...) { Log log = parse(value); Data data = dbWrapper.getSomeMissingData(log.getCampId()); } Wrong
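The slide does not state the fix. One common way to avoid a database round trip per log record is to resolve each campaign at most once per task and keep the result in memory. The sketch below assumes that approach; `CachedLookup` and its `slowSource` function are hypothetical stand-ins for the `dbWrapper` call, not the project's code.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

// Hypothetical per-task cache in front of a slow lookup such as a DB query.
// Not thread-safe; intended for use from a single mapper thread.
final class CachedLookup<K, V> {
    private final Function<K, V> slowSource;
    private final Map<K, V> cache = new HashMap<>();
    int sourceCalls = 0; // exposed only to make the saving visible

    CachedLookup(Function<K, V> slowSource) {
        this.slowSource = slowSource;
    }

    V get(K key) {
        V value = cache.get(key);
        if (value == null) {
            // Only the first record for a given key pays for the slow call.
            value = slowSource.apply(key);
            sourceCalls++;
            cache.put(key, value);
        }
        return value;
    }
}
```

With millions of records and only a few thousand campaigns, the number of slow calls drops from one per record to one per distinct campaign per task.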
  59. 59. Hadoop Unnecessary data: map(T value, ...) { Log log = parse(value); Key resultKey = makeKey(log.getCampName(), ...); output.collect(resultKey, resultValue); }
  60. 60. Hadoop Unnecessary data: map(T value, ...) { Log log = parse(value); Key resultKey = makeKey(log.getCampName(), ...); output.collect(resultKey, resultValue); } Wrong
  61. 61. Hadoop Unnecessary data: RecordWriter.write(K key, V value) { Entity entity = makeEntity(key, value); dbWrapper.save(entity); }
  62. 62. Hadoop Unnecessary data: RecordWriter.write(K key, V value) { Entity entity = makeEntity(key, value); dbWrapper.save(entity); } Wrong
  63. 63. Hadoop public boolean equals(Object obj) { EqualsBuilder equalsBuilder = new EqualsBuilder(); equalsBuilder.append(id, otherKey.getId()); ... } public int hashCode() { HashCodeBuilder hashCodeBuilder = new HashCodeBuilder(); hashCodeBuilder.append(id); ... }
  64. 64. Hadoop public boolean equals(Object obj) { EqualsBuilder equalsBuilder = new EqualsBuilder(); equalsBuilder.append(id, otherKey.getId()); ... } public int hashCode() { HashCodeBuilder hashCodeBuilder = new HashCodeBuilder(); hashCodeBuilder.append(id); ... } Wrong
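The slide marks the builder-based version as wrong without giving the reason. A likely one is that every call allocates a builder object, which adds up when these methods run for each record. A hand-written version, sketched below for a hypothetical `CampaignKey` with assumed fields `id` and `country`, avoids those allocations.

```java
// Hypothetical key class with allocation-free equals and hashCode.
final class CampaignKey {
    private final long id;
    private final String country;

    CampaignKey(long id, String country) {
        if (country == null) {
            throw new IllegalArgumentException("country must not be null");
        }
        this.id = id;
        this.country = country;
    }

    @Override
    public boolean equals(Object obj) {
        if (this == obj) {
            return true;
        }
        if (!(obj instanceof CampaignKey)) {
            return false;
        }
        CampaignKey other = (CampaignKey) obj;
        return id == other.id && country.equals(other.country);
    }

    @Override
    public int hashCode() {
        // Same mixing as Long.hashCode, combined with the usual 31 multiplier.
        int result = (int) (id ^ (id >>> 32));
        return 31 * result + country.hashCode();
    }
}
```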
  65. 65. Hadoop public void map(...) { … for (String word : words) { output.collect(new Text(word), new IntVal(1)); } }
  66. 66. Hadoop public void map(...) { … for (String word : words) { output.collect(new Text(word), new IntVal(1)); } } Wrong
  67. 67. Hadoop class MyMapper extends Mapper { Text word = new Text(); IntVal one = new IntVal(1); public void map(...) { for (String w : words) { word.set(w); output.collect(word, one); } } }
  68. 68. Network. Per AdServer instance: incoming traffic ~100 Mb/sec, outgoing traffic ~50 Mb/sec. LB, all traffic: almost 10 Gb/sec
  69. 69. Amazon
  70. 70. AWS ElastiCache SLOWLOG GET
      1) 1) (integer) 35
         2) (integer) 1391709950
         3) (integer) 34155
         4) 1) "GET"
            2) "2ads10percent_rmywqesssitmfksetzvj"
      2) 1) (integer) 34
         2) (integer) 1391709830
         3) (integer) 34863
         4) 1) "GET"
            2) "2ads10percent_tteeoomiimcgdzcocuqs"
  71. 71. AWS ElastiCache 35ms for GET? WTF? Even Java is faster
  72. 72. AWS ElastiCache ● Strange timeouts (with SO_TIMEOUT 50ms) ● No replication to another cluster ● «Cluster» is not a cluster ● Cluster uses ordinary instances, so you pay for 4 cores while using 1
  73. 73. AWS Limits. You never know where ● Network limit ● PPS rate limit ● LB limit ● Cluster start time up to 20 mins ● Scalability limits ● S3 is slow for many files
  74. 74. Facts ● HTTP is 2x faster than HTTPS ● HTTPS keep-alive: +80% performance ● Java 7 is 40% faster than Java 6 (in our case) ● All IO operations minimized