Dmitry Tiagulskyi, Yaroslav Yermilov "It Scales Until It Doesn’t"

We are used to thinking that “high-load” means distributed systems, computing power, and application and kernel profiling. But sometimes you can’t simply scale your cluster. Maybe your hashmaps don’t fit in the server’s memory. Maybe you need single-digit millisecond latency. Maybe the cost is too high. Or your server is a … mobile phone. In this talk, we will show how popular and lesser-known algorithms, data structures, and systems tuning helped us overcome these blockers. Who said you don’t need to know algorithms nowadays?

It Scales Until It Doesn’t
yaroslav.yermilov@grammarly.com - Software Engineer @ Core Services
dima.tiagulskyi@grammarly.com - Software Engineer @ Core Services

References

● Algorithms & Data Structures for Language Models (Ngrams)
○ Thomas H. Cormen, Charles E. Leiserson, Ronald L. Rivest, and Clifford Stein. Introduction to Algorithms (specifically Chapter 11, “Hash Tables”, p. 253).
○ F. C. Botelho, R. Pagh, and N. Ziviani. 2007. Simple and space-efficient minimal perfect hash functions. In Proceedings of the 10th Workshop on Algorithms and Data Structures (WADS’07), pages 139–150. Springer LNCS vol. 4619.
○ Djamal Belazzougui, Fabiano Botelho, and Martin Dietzfelbinger. 2009. Hash, displace, and compress. In Algorithms - ESA 2009, pages 682–693.
○ Kenneth Heafield. 2011. KenLM: Faster and smaller language model queries. In Proceedings of the Sixth Workshop on Statistical Machine Translation, Edinburgh, UK, July. Association for Computational Linguistics.
○ Adam Pauls and Dan Klein. 2011. Faster and smaller n-gram language models. In Proceedings of ACL, Portland, Oregon.
○ David Talbot and Miles Osborne. 2007. Randomised language modelling for statistical machine translation. In Proceedings of ACL, pages 512–519, Prague, Czech Republic.
○ D. Guthrie, M. Hepple, and W. Liu. 2010. Efficient minimal perfect hash language models. In Proceedings of LREC’10, Valletta, Malta, May. European Language Resources Association (ELRA).

● AWS Virtualization, ENA
○ https://en.wikipedia.org/wiki/X86_virtualization
○ https://aws.amazon.com/ru/blogs/aws/elastic-network-adapter-high-performance-network-interface-for-amazon-ec2/
○ https://medium.com/@paccattam/aws-enhanced-networking-an-overview-aee8a852cf5c
○ AWS re:Invent 2017: Optimizing Network Performance for Amazon EC2 Instances: https://youtu.be/-dWgqtGKPfc
○ https://www.kernel.org/doc/Documentation/networking/scaling.txt
○ AWS re:Invent 2017: C5 Instances and the Evolution of Amazon EC2 Virtualization: https://youtu.be/LabltEXk0VQ
○ http://www.brendangregg.com/blog/2017-11-29/aws-ec2-virtualization-2017.html
○ https://blog.ubuntu.com/2017/04/05/ubuntu-on-aws-gets-serious-performance-boost-with-aws-tuned-kernel

● NUMA
○ https://www.redhat.com/files/summit/2014/summit2014_riel_chegu_w_0340_automatic_numa_balancing.pdf
○ https://www.cmg.org/wp-content/uploads/2015/10/numa.pdf
● Network Servers, IO Multiplexing & Epoll
○ https://eli.thegreenplace.net/2018/measuring-context-switching-and-memory-overheads-for-linux-threads/
○ Asynchronous IO with Boost.Asio: https://youtu.be/rwOv_tw2eA4
○ https://eklitzke.org/blocking-io-nonblocking-io-and-epoll
○ https://habr.com/post/416669/
● Netty
○ One Framework to rule them all by Norman Maurer: https://youtu.be/DKJ0w30M0vg
○ https://dou.ua/lenta/columns/netty-optimization/

● Benchmarking
○ http://www.brendangregg.com/blog/2018-06-30/benchmarking-checklist.html
○ http://www.brendangregg.com/activebenchmarking.html
○ http://www.brendangregg.com/usemethod.html
○ http://www.brendangregg.com/linuxperf.html
● Java Profiling
○ http://psy-lob-saw.blogspot.com/2016/02/why-most-sampling-java-profilers-are.html
○ https://github.com/jvm-profiling-tools/async-profiler
○ https://medium.com/netflix-techblog/java-in-flames-e763b3d32166
