Node.js Scaling in Highload
Timur Shemsedinov
KPI, Metarhia
You will rewrite all your code to reach each level
● 100 connections anyhow
● <50k connections single process
● ~500k connections single server
● >1 mln. connections single server with optimizations or cloud
● >2.5 mln. connections multiple servers
● >10 mln. connections clusterware or cloud
Scaling and performance factors:
● JavaScript, V8
● Node.js, libuv
● Code style and patterns
● Architectural decisions
● Dependencies
JavaScript is...
● Implemented and available everywhere
● Most common and widespread language
● Minimalistic syntax
● Libraries, community: NPM, Github and Stackoverflow
● Patterns: streams, factories, observers, queues, etc.
● Techniques: lambdas, closures, mixins, chaining, functors...
JavaScript is...
● Unpredictable optimizations in V8
● Async, blocking, long loops & recursion
● The Tower of Babel
● Crap in NPM & idiots in community
● Multiparadigm approach isn’t so easy
● Problems: mixins, poor types, change built-in classes...
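The "blocking, long loops" point can be illustrated with a minimal sketch. The helper name `sumInChunks` and the chunk size are assumptions made for illustration, not something from the talk. The loop is split into slices, and `setImmediate` yields to the event loop between slices so that pending I/O callbacks can run.

```javascript
'use strict';

// Hypothetical helper: sums an array in slices so that one long loop
// does not monopolize the event loop.
const sumInChunks = (items, chunkSize = 10000) =>
  new Promise((resolve) => {
    let total = 0;
    let index = 0;
    const step = () => {
      const end = Math.min(index + chunkSize, items.length);
      for (; index < end; index++) total += items[index];
      if (index < items.length) setImmediate(step); // yield to pending I/O
      else resolve(total);
    };
    step();
  });
```

Usage: `const total = await sumInChunks(bigArray);` inside an async function. The trade-off is slightly higher total latency for the computation in exchange for a responsive process.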
Node.js is...
● Asynchronous I/O, libuv, streams
● V8, stat optimization, hidden classes...
● Production ready infrastructure & runtime
● Stateful and in-memory processing
● Multi-paradigm: fp, data flow, type driven, oop, declarative, prototypes, async, mp, etc.
Important decisions
● Long-lived process vs per-request process
● Stateful vs Stateless
● Asynchronous I/O vs no I/O at all
● Algorithms vs Data Structures
● Server inside App vs App inside Server
● Lazy vs Immediate
2012-2019
Step 1 (2012-2013)
● Node.js cluster mode
● Nginx as reverse proxy
● MongoDB or MySQL as DBMS
● HTTP for static files and AJAX API
● SSE for events push
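A rough sketch of how cluster mode and SSE push from this step might fit together. The function names (`formatSse`, `startWorker`, `start`) and the idea of keeping open responses in a `Set` are assumptions for illustration; the talk does not show its code. Nothing is started at load time; `start(port)` would be called from the entry file.

```javascript
'use strict';

const cluster = require('cluster');
const http = require('http');
const os = require('os');

// Serialize one Server-Sent Events message (text/event-stream framing).
const formatSse = (event, data) =>
  `event: ${event}\ndata: ${JSON.stringify(data)}\n\n`;

// Worker: keeps SSE responses open and pushes events to all of them.
const startWorker = (port) => {
  const clients = new Set();
  const server = http.createServer((req, res) => {
    res.writeHead(200, {
      'Content-Type': 'text/event-stream',
      'Cache-Control': 'no-cache',
    });
    clients.add(res);
    res.on('close', () => clients.delete(res));
  });
  server.listen(port);
  const push = (event, data) => {
    for (const res of clients) res.write(formatSse(event, data));
  };
  return { server, push };
};

// Master: forks one worker per CPU core; workers share the listening port.
const start = (port) => {
  const isPrimary = cluster.isPrimary ?? cluster.isMaster; // older Node.js
  if (isPrimary) {
    for (let i = 0; i < os.cpus().length; i++) cluster.fork();
  } else {
    startWorker(port);
  }
};
```

In this sketch each worker only knows its own clients, so an event that must reach every browser would also need to be passed between workers.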
[Diagram: Step 1 (2012-2013). Client side: Browser. Server side: Nginx, Node.js master, three workers, web application, MongoDB or MySQL, FS, key-value storage. Connections: HTTP, SSE.]
Step 2 (2014-2015)
● Node.js cluster mode
● CDN for static files
● MongoDB as DBMS
● HTTP for API
● SSE for events push
Added: MongoDB
Removed: Nginx for static and API proxying, MySQL, Memcached
[Diagram: Step 2 (2014-2015). Client side: Browser. Server side: Node.js master, three workers, web application, Mongo DB. Connections: static HTTP, CDN, HTTP API, SSE.]
Step 3 (2016-2017)
● Node.js
● CDN for static files
● RAM for database
● WebSocket
○ for API
○ for events
Added: WebSocket, In-memory
Removed: HTTP API, SSE, Node.js cluster mode, MongoDB
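One way to picture "RAM for database" together with a single connection for both API calls and events is sketched below. The message shape (`id`, `method`, `args`, `event`), the `api` methods, and the `attach` helper are all hypothetical. The transport is left out on purpose: `send` stands for whatever writes a string to the socket.

```javascript
'use strict';

const { EventEmitter } = require('events');

// In-process state replaces the external DBMS in this sketch.
const accounts = new Map();
const bus = new EventEmitter();

// Hypothetical API methods operating directly on in-memory data.
const api = {
  setBalance: ({ id, amount }) => {
    accounts.set(id, amount);
    bus.emit('balance', { id, amount });
    return true;
  },
  getBalance: ({ id }) => accounts.get(id) ?? null,
};

// One connection carries both API replies and pushed events.
const attach = (send) => {
  const onEvent = (data) => send(JSON.stringify({ event: 'balance', data }));
  bus.on('balance', onEvent);
  const receive = (text) => {
    const { id, method, args } = JSON.parse(text);
    const known = Object.prototype.hasOwnProperty.call(api, method);
    const reply = known
      ? { id, result: api[method](args) }
      : { id, error: 'Method not found' };
    send(JSON.stringify(reply));
  };
  const close = () => bus.off('balance', onEvent);
  return { receive, close };
};
```

Because the state lives in one process, a write is visible to every connected client immediately, which is the main reason the external database and the separate event channel can be dropped in a design like this.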
[Diagram: Step 3 (2016-2017). Client side: Browser. Server side: Node.js master, three workers, web application, persistent storage. Connections: static HTTP, CDN, WebSocket.]
Step 4 (2017) for mobile apps
● Node.js
● RAM for database
● JSTP (over TCP or WS)
○ for API
○ for events
Added: JSTP
Removed: WebSocket, CDN (not needed for mobile apps)
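Moving from WebSocket to a protocol over raw TCP means the application has to split the byte stream into messages itself. The sketch below shows generic delimiter-based framing with JSON payloads. It is not the actual JSTP wire format; `createParser`, `encode`, and the newline delimiter are assumptions chosen for illustration.

```javascript
'use strict';

// Generic delimiter-based framing; NOT the actual JSTP wire format.
// JSON.stringify escapes newlines, so '\n' never appears inside a frame.
const DELIMITER = '\n';

const encode = (message) => JSON.stringify(message) + DELIMITER;

// Accumulates stream chunks and emits one parsed object per complete frame.
// TCP may split or merge writes, so a chunk can hold part of a frame
// or several frames at once.
const createParser = (onMessage) => {
  let buffer = '';
  return (chunk) => {
    buffer += chunk;
    let end = buffer.indexOf(DELIMITER);
    while (end !== -1) {
      const frame = buffer.slice(0, end);
      buffer = buffer.slice(end + 1);
      if (frame.length > 0) onMessage(JSON.parse(frame));
      end = buffer.indexOf(DELIMITER);
    }
  };
};
```

With a `net.Socket`, this could be wired as `socket.setEncoding('utf8'); socket.on('data', createParser(handleMessage));`, where `handleMessage` is the application's own handler.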
[Diagram: Step 4 (2017) for mobile apps. Client side: iOS & Android. Server side: Node.js master, three workers, mobile application, persistent storage. Connection: JSTP (TCP).]
Step 5 (2018-2019)
● Node.js
● GS for database
● Metarhia/protocol
○ for API
○ for events
○ for DB sync
○ Binary streams
Added: GS & DB sync Protocol
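The DB sync idea can be sketched with an append-only change log: a replica remembers the last version it has applied and asks for everything newer. This is a simplified one-way illustration, not the GS API; the `SyncedStore` class and its methods are hypothetical.

```javascript
'use strict';

// Hypothetical in-memory store with an append-only change log.
class SyncedStore {
  constructor() {
    this.data = new Map();
    this.log = []; // entries: { version, key, value }, version starts at 1
  }

  set(key, value) {
    const version = this.log.length + 1;
    this.data.set(key, value);
    this.log.push({ version, key, value });
    return version;
  }

  // Changes that a replica at the given version has not seen yet.
  changesSince(version) {
    return this.log.slice(version);
  }

  // Apply changes received from another node; returns the new version.
  apply(changes) {
    for (const change of changes) {
      // Skip entries that were already applied or arrive out of order.
      if (change.version !== this.log.length + 1) continue;
      this.data.set(change.key, change.value);
      this.log.push(change);
    }
    return this.log.length;
  }
}
```

A replica would call `replica.apply(server.changesSince(lastVersion))` after reconnecting. Conflict resolution for writes made on both sides is out of scope for this sketch.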
[Diagram: Step 5 (2018-2019). Client side: iOS & Android. Server side: Node.js master, three App instances, GS (one standalone and three more), mobile application, persistent storage. Connections: JSTP, API, Sync.]
Features / Aspects
● Data Synchronization
● Offline
● Interactivity
● Low latency / availability
● Highload
● High connectivity
● High interconnection
● Scalability
● Big data
● Big memory & in-memory DB
● Integration flexibility
Late period
Future alternatives
Think about...
● Linear scaling
● Client-side balancing without bottleneck
● Sharding of data and calculations
● High connectivity
● High interconnection
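Client-side balancing and sharding can be combined in a small sketch: the client hashes a key and picks the server itself, so no central balancer sits in the request path. The host names and the helper names `shardOf` and `pickServer` are assumptions for illustration only.

```javascript
'use strict';

const crypto = require('crypto');

// Hypothetical server list that the client receives at startup.
const servers = [
  'node1.example.com',
  'node2.example.com',
  'node3.example.com',
];

// Stable hash of a key mapped to a shard index in [0, count).
const shardOf = (key, count) => {
  const digest = crypto.createHash('sha256').update(String(key)).digest();
  return digest.readUInt32BE(0) % count;
};

// The client chooses the server by key; the same key always
// maps to the same server while the list is unchanged.
const pickServer = (key, list = servers) => list[shardOf(key, list.length)];
```

Plain modulo hashing remaps most keys when the number of servers changes; consistent hashing is a common alternative when shards are added or removed often.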
Bad practices checklist
● Mixing abstraction layers
● Business logic in middleware
● require() not at initialization
● Hells: callback, async.js, promise
● Server-side load balancing
● Dependencies and vendor lock
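Two items from this checklist lend themselves to a short sketch: loading modules at initialization and keeping asynchronous flow flat. The function `saveAndRead` is a hypothetical example, not code from the talk.

```javascript
'use strict';

// Load dependencies once at initialization, not inside request handlers:
// the first require() of a module reads and compiles it synchronously,
// which would block the event loop in the middle of serving a request.
const fs = require('fs');
const path = require('path');

// Flat async/await flow in place of nested callbacks or long promise chains.
const saveAndRead = async (dir, name, text) => {
  const file = path.join(dir, name);
  await fs.promises.writeFile(file, text);
  const content = await fs.promises.readFile(file, 'utf8');
  await fs.promises.unlink(file);
  return content;
};
```

Each step reads top to bottom, and a failure in any step rejects the returned promise, so a single `try`/`catch` at the call site covers the whole sequence.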
Usage overview in Applied fields
● Business (corporate IS including ERP, BI & BPM)
● Economics (including Supply Chain Management, CRM)
● Commerce and trading (R-T and High-Frequency Trading)
● Enterprise (Knowledge Management, CAD/CAM, DBs)
● Education (University IS, remote learning and e-Learning)
● Medicine (MDS, Diagnostics, Equipment control)
● Governance (Databases, Public services, Collaboration)
● Social networking (Instant messaging, Content publishing)
● Communications (Voice/video, IM, file exchange)
● Mass media and Highload interactive TV (second screen)
● Clusterware for massively parallel distributed cloud apps
● Production Automation, Telemetry and Internet of Things
● Fog/dew computing (use client devices for distributed data processing)
The main idea
again
Questions
Timur Shemsedinov
telegram: t.me/metarhia & t.me/nodeua
meetup.com/NodeUA
meetup.com/HowProgrammingWorks
tshemsedinov@github
tshemsedinov@facebook
marcusaurelius@habrahabr
timur.shemsedinov@gmail.com
