Scalable Web Apps
Scalable web apps - execution time vs development time.

Usage Rights

© All Rights Reserved

Scalable Web Apps: Presentation Transcript

  • Scalable web apps: execution time vs development time. Piotr Pelczar, me@athlan.pl
  • Types of scaling
    • Vertical scaling (scale up)
    • Horizontal scaling (scale out)
  • Think about your app as a worker, not a single instance. [Diagram: a load balancer in front of Server #1 (App #1, App #2), Server #2 (App #3, App #4) and Server #3 (App #5).]
  • Think about your app as a worker, not a single instance. [Diagram: more than one load balancer in front of App #1 to App #5, spread over Server #1 to Server #n.]
  • Sessions. We need common, fast, persistent storage for sessions.
  • Sessions. [Diagram: the same load-balanced servers and apps, all sharing one session storage.]
  • Sessions - Redis
    • Key-value in-memory database (hash table based)
    • Scalable up to 1k nodes
    • Partitioning with query routing
    • Non-blocking master-slave replication between nodes
    • Clustered (currently not production ready)
    http://athlan.pl/symfony2-redis-session-handler/
  • Redis - Partitioning with query routing
    • [Diagram: Node #1, Node #2, Node #3; the query is sent to a random node, a miss is passed on to another node, and a hit ends the lookup.]
    • Also supported:
      – Client-side partitioning (the app calls the appropriate node)
      – Proxy-assisted partitioning (a proxy selects the appropriate node)
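Client-side partitioning can be sketched in a few lines. This is only an illustration under stated assumptions: the node addresses and the `node_for_key` helper are hypothetical, and plain modulo hashing is used for brevity (it remaps most keys when the node list changes, which is why real clients often prefer consistent hashing).

```python
import hashlib

# Hypothetical node addresses, not taken from the slides.
NODES = ["redis-1:6379", "redis-2:6379", "redis-3:6379"]


def node_for_key(key, nodes=NODES):
    """Pick the node responsible for a key by hashing the key.

    The same key always maps to the same node, so the app can call
    the appropriate node directly without asking a proxy.
    """
    digest = hashlib.md5(key.encode("utf-8")).digest()
    index = int.from_bytes(digest[:4], "big") % len(nodes)
    return nodes[index]


if __name__ == "__main__":
    for session_id in ("session:alice", "session:bob"):
        print(session_id, "->", node_for_key(session_id))
```

Every app instance that shares the same node list computes the same mapping, so no coordination between instances is needed.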
  • Centralized logging
    • Logs should be centralized so that you do not have to inspect each node separately
    • Approaches:
      – File replication (rsync + cron)
      – syslog (easy to integrate with log4j)
        • syslogd over UDP, port 514
        • rsyslog over TCP, stores data in a db
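As a rough sketch of the syslog approach from the application side, Python's standard `logging.handlers.SysLogHandler` sends records over UDP by default. The host `127.0.0.1`, the logger name `webapp.node1` and the `build_syslog_logger` helper are assumptions for illustration; in a real setup the address would point at the central log server.

```python
import logging
import logging.handlers


def build_syslog_logger(name, host="127.0.0.1", port=514):
    """Return a logger that ships records to a syslog daemon over UDP."""
    logger = logging.getLogger(name)
    logger.setLevel(logging.INFO)
    # SysLogHandler uses a UDP (datagram) socket unless told otherwise.
    handler = logging.handlers.SysLogHandler(address=(host, port))
    handler.setFormatter(logging.Formatter("%(name)s: %(levelname)s %(message)s"))
    logger.addHandler(handler)
    return logger


# Hypothetical node name; each node would use its own.
logger = build_syslog_logger("webapp.node1")
logger.info("request handled in %d ms", 42)
```

Because UDP is fire-and-forget, the call does not fail when nothing is listening, which is convenient for the app but means lost messages go unnoticed.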
  • Common storage, no local changes!
    • Keep storage available to all nodes
      – Symfony2 Gaufrette Bundle
        • FTP
        • Amazon S3
        • OpenCloud
        • AzureBlobStorage
        • Rackspace
  • Architecture. [Diagram: a load balancer in front of Server #1 (App #1, App #2), Server #2 (App #3, App #4) and Server #3 (App #5), with shared session storage, a files storage abstraction and centralized logging.]
  • Continuous Integration
    • To keep all nodes up to date, you need CI
    • Automate disabling nodes, building and deploying
      – Jenkins CI
  • Continuous Integration
    1. Disable the service on the node
    2. Deploy/build the app
       1. Copy files
       2. Update the db schema (liquibase, ORM schema update)
       3. Execute scripts
    3. Restart the service
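The three deployment steps can be sketched as a rolling loop over the nodes. The `disable`, `deploy` and `enable` callables are hypothetical hooks (for example wrappers around a load balancer command and a build script); the sketch only shows the ordering, not how Jenkins would run it.

```python
def rolling_deploy(nodes, disable, deploy, enable):
    """Update nodes one at a time so the others keep serving traffic.

    If deploy() raises, the loop stops and the failing node stays
    disabled, which keeps a broken build out of rotation.
    """
    updated = []
    for node in nodes:
        disable(node)  # 1. take the node out of the balancer
        deploy(node)   # 2. copy files, update the db schema, run scripts
        enable(node)   # 3. restart the service and put the node back
        updated.append(node)
    return updated


if __name__ == "__main__":
    def step(name):
        return lambda node: print(name, node)

    rolling_deploy(["web1", "web2"], step("disable"), step("deploy"), step("enable"))
```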
  • Balance the payload - HAProxy. Yeah guys, this is a logo :) No schema is needed, just imagine how it works.
    • Very, very fast proxy!
    • Software TCP/HTTP load balancer
    • Different node selection algorithms:
      – roundrobin (limit 4128)
      – static-rr
      – leastconn (lowest number of connections)
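To make the difference between two of these algorithms concrete, here is a toy sketch of round-robin and least-connections selection. It is not how HAProxy implements them (weights and dynamic server changes are ignored); the class names and the server names `webA` and `webB` are illustrative.

```python
import itertools


class RoundRobin:
    """Hand out servers in a fixed rotation."""

    def __init__(self, servers):
        self._cycle = itertools.cycle(servers)

    def pick(self):
        return next(self._cycle)


class LeastConn:
    """Hand out the server with the fewest active connections."""

    def __init__(self, servers):
        self.active = {server: 0 for server in servers}

    def pick(self):
        # Ties go to the server listed first.
        server = min(self.active, key=self.active.get)
        self.active[server] += 1
        return server

    def release(self, server):
        self.active[server] -= 1


if __name__ == "__main__":
    rr = RoundRobin(["webA", "webB"])
    print([rr.pick() for _ in range(4)])
```

Round-robin suits short, uniform requests; least-connections is usually the better fit when some connections stay open much longer than others.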
  • Balance the payload - HAProxy
    • You can check a node's status by pinging it
    • A dead node is excluded from the balancing strategy
    vi /etc/haproxy/haproxy.cfg
        option httpchk HEAD /check.txt HTTP/1.0
        server webA 192.168.0.102:80 check
        server webB 192.168.0.103:80 check
  • Balance the payload - HAProxy
    • Monitor node status by reading stats from the socket via socat.
    echo "show stat" | socat /tmp/haproxy.sock stdio
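The `show stat` command returns CSV whose header line starts with `# `. A small parser might look like the sketch below. The `SAMPLE` text is a shortened, made-up excerpt (real output has many more columns and rows), and the helper names are hypothetical.

```python
import csv
import io

# Shortened, made-up excerpt of "show stat" output for illustration only.
SAMPLE = """# pxname,svname,qcur,qmax,scur,smax,slim,stot,bin,bout,dreq,dresp,ereq,econ,eresp,wretr,wredis,status
web,webA,0,2,5,10,,120,3000,9000,,0,,0,0,0,0,UP
web,webB,3,4,0,7,,80,2000,6000,,0,,1,0,0,0,DOWN
"""


def parse_show_stat(text):
    """Turn the CSV text into a list of dicts keyed by column name."""
    text = text.lstrip()
    if text.startswith("# "):
        text = text[2:]  # drop the comment marker in front of the header
    return list(csv.DictReader(io.StringIO(text)))


def queue_lengths(rows):
    """Current queue length (qcur) per server; empty fields count as 0."""
    return {row["svname"]: int(row["qcur"] or 0) for row in rows}


def servers_down(rows):
    """Names of servers whose status column reads DOWN."""
    return [row["svname"] for row in rows if row["status"] == "DOWN"]


if __name__ == "__main__":
    rows = parse_show_stat(SAMPLE)
    print(queue_lengths(rows), servers_down(rows))
```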
  • Balance the payload - HAProxy
    • Monitor node status with the native stats web console
  • Nodes monitoring - Zabbix
    • Zabbix, centralized server monitoring
  • Zabbix + HAProxy
    UserParameter=haproxy.qcur[*], echo "show stat" | socat /tmp/haproxy.sock stdio | grep -i '$1' | sed 's/,/ /g' | awk '{print $$3}'
  • Reverse Proxy and Varnish cache
    • Global virtual user = global cache
    http://tomayko.com/writings/things-caches-do
  • Reverse Proxy – Expiration model http://tomayko.com/writings/things-caches-do
  • Reverse Proxy – Validation model http://tomayko.com/writings/things-caches-do
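The two HTTP caching models can be sketched as follows. In the expiration model the cache serves a stored response until its lifetime runs out; in the validation model the cache asks the origin whether its copy is still current and gets a 304 without a body if so. The class and function names are hypothetical, and time is passed in explicitly to keep the sketch deterministic.

```python
import hashlib


def make_etag(body):
    """Derive a validator from the response body."""
    return '"' + hashlib.sha256(body.encode("utf-8")).hexdigest()[:16] + '"'


class CachedResponse:
    def __init__(self, body, max_age, now):
        self.body = body
        self.etag = make_etag(body)
        self.expires_at = now + max_age  # as with Cache-Control: max-age


def is_fresh(entry, now):
    """Expiration model: serve from cache without contacting the origin."""
    return now < entry.expires_at


def revalidate(if_none_match, current_body):
    """Validation model, origin side: compare the validator sent by the cache."""
    if if_none_match == make_etag(current_body):
        return 304, None  # Not Modified: the cache keeps using its copy
    return 200, current_body


if __name__ == "__main__":
    entry = CachedResponse("hello", max_age=60, now=1000.0)
    print(is_fresh(entry, 1030.0), revalidate(entry.etag, "hello"))
```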
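  • Reverse Proxy and Varnish cache. [Diagram: Varnish on :80 in front of Apache on :81, which serves the App.]
  • Reverse Proxy and Varnish cache. [Diagram: HAProxy on :80 balancing two Varnish instances (:8080 and :8082), each in front of its own Apache (:8081 and :8083) serving the App.]
  • Reverse Proxy and Varnish cache. [Diagram: Varnish on :80 in front of HAProxy on :81, which balances two Apache instances (:8081 and :8082) serving the App.]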
  • Varnish and ESI
    <!DOCTYPE html>
    <html>
      <body>
        <!-- ... some content -->
        <!-- Embed the content of another page here -->
        <esi:include src="http://..." />
        <!-- ... more content -->
      </body>
    </html>
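Varnish resolves the `esi:include` tags itself, so the application never needs code like this. The sketch below only illustrates the idea: each include tag is replaced by a separately fetched (and separately cacheable) fragment. The `render_esi` helper and the `fetch` callable are assumptions for illustration.

```python
import re

# Matches self-closing include tags such as <esi:include src="/sidebar" />
ESI_INCLUDE = re.compile(r'<esi:include\s+src="([^"]*)"\s*/>')


def render_esi(page, fetch):
    """Replace every ESI include tag with the fragment returned by fetch(src)."""
    return ESI_INCLUDE.sub(lambda match: fetch(match.group(1)), page)


if __name__ == "__main__":
    fragments = {"/sidebar": "<aside>news</aside>"}
    page = '<body><esi:include src="/sidebar" /></body>'
    print(render_esi(page, fragments.__getitem__))
```

The payoff is that the page shell and each fragment can carry different cache lifetimes.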
  • Scaling databases - Master-slave
    • All data is redundant
    [Diagram: writes go to the master, which replicates to three slaves; reads are served by the slaves.]
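A minimal sketch of the read/write split, assuming the application picks a connection per statement. The `ReplicatedDatabase` class and the connection names are hypothetical, and the `SELECT` prefix check is a deliberate simplification.

```python
import itertools


class ReplicatedDatabase:
    """Route writes to the master and spread reads over the slaves."""

    def __init__(self, master, slaves):
        self.master = master
        self._slaves = itertools.cycle(slaves)

    def connection_for(self, sql):
        is_read = sql.lstrip().upper().startswith("SELECT")
        return next(self._slaves) if is_read else self.master


if __name__ == "__main__":
    db = ReplicatedDatabase("master", ["slave1", "slave2"])
    print(db.connection_for("INSERT INTO posts VALUES (1)"))
    print(db.connection_for("SELECT * FROM posts"))
```

Replication is typically asynchronous, so a read issued right after a write may still see the old data on a slave.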
  • MongoDB scaling
    • Common models to spread data over nodes:
      – range keys
      – hash keys
    • Many nodes on cheap machines
    • Not all data is duplicated on each node
  • MongoDB – range-based keys http://docs.mongodb.org
    • Awesome for range queries (data is fetched from a minimal number of nodes: query isolation)
    • Distributes data poorly over nodes when keys are monotonically increasing
  • MongoDB – hash-based keys http://docs.mongodb.org
    • Take note: not good for range queries, because results from many nodes must be merge-sorted; no query isolation in this case
    • Write scaling
      – Write to many nodes simultaneously (mind the readers-writer lock, where a write is exclusive)
  • MongoDB sharding and clustering http://docs.mongodb.org
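The trade-off between the two key models can be shown with a toy shard selector. This is not MongoDB's actual chunk or hashing logic; the bounds, shard count and function names are assumptions chosen to make the behaviour visible.

```python
import bisect
import hashlib


def range_shard(key, upper_bounds):
    """Range-based: shard i holds keys below upper_bounds[i]; the last shard takes the rest."""
    return bisect.bisect_right(upper_bounds, key)


def hash_shard(key, shard_count):
    """Hash-based: the shard is derived from a hash of the key."""
    digest = hashlib.md5(str(key).encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "big") % shard_count


if __name__ == "__main__":
    bounds = [1000, 2000]          # three shards: <1000, 1000..1999, >=2000
    new_ids = range(5000, 5100)    # monotonically increasing keys

    # Range keys: every new insert lands on the same (last) shard.
    print({range_shard(k, bounds) for k in new_ids})
    # Hash keys: the same inserts are spread over the shards.
    print({hash_shard(k, 3) for k in new_ids})
    # Range keys: a range query touches a single shard.
    print({range_shard(k, bounds) for k in range(1000, 2000)})
```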
  • CQRS
    • Command Query Responsibility Segregation
      – separate application service layers for writing to and reading from the DB (makes it possible to use different data sources, like RAM or DB)
  • CQRS
    • Examples
      – post-insert cache population
        • all SELECTs are served from the cache (even if stale)
        • consider LFU instead of LRU to invalidate the cache
      – pre-insert into memory
        • dump results periodically
    In both approaches it is convenient to use queues or a data bus!
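The post-insert cache population example can be sketched with two separate services. Plain dicts stand in for the database and the cache, and the class and method names are hypothetical; in a real system the cache update would more likely travel through a queue.

```python
class CommandService:
    """Write side: changes the source of truth, then populates the cache."""

    def __init__(self, database, cache):
        self.database = database
        self.cache = cache

    def create_post(self, post_id, title):
        self.database[post_id] = title  # write to the DB
        self.cache[post_id] = title     # post-insert cache population


class QueryService:
    """Read side: answers only from the cache and never touches the DB."""

    def __init__(self, cache):
        self.cache = cache

    def get_post(self, post_id):
        return self.cache.get(post_id)


if __name__ == "__main__":
    database, cache = {}, {}
    commands = CommandService(database, cache)
    queries = QueryService(cache)
    commands.create_post(1, "Hello")
    print(queries.get_post(1))
```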
  • Queues, RabbitMQ
    • RabbitMQ is based on AMQP (Advanced Message Queuing Protocol)
      – point-to-point
      – publish-and-subscribe
      – queueing, routing
    • AMQP is not JMS (Java Message Service is an API, not a protocol)
    • A happy Rabbit is an empty Rabbit: do not try to store any data (messages) in the queue system in persistent mode, to keep HA
  • Queues, RabbitMQ
    • Simple queue
    • Work queues (each message goes to one consumer)
    • Publish/Subscribe (each message goes to many consumers)
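The difference between a work queue and publish/subscribe comes down to who receives each message. The in-memory sketch below uses no broker and no RabbitMQ client; `WorkQueue` and `FanoutExchange` are illustrative names, and consumers are plain callables.

```python
class WorkQueue:
    """Each message is delivered to exactly one consumer, in rotation."""

    def __init__(self, consumers):
        self.consumers = list(consumers)
        self._next = 0

    def publish(self, message):
        consumer = self.consumers[self._next % len(self.consumers)]
        self._next += 1
        consumer(message)


class FanoutExchange:
    """Each message is delivered to every subscriber."""

    def __init__(self):
        self.subscribers = []

    def subscribe(self, consumer):
        self.subscribers.append(consumer)

    def publish(self, message):
        for consumer in self.subscribers:
            consumer(message)


if __name__ == "__main__":
    worker_a, worker_b = [], []
    tasks = WorkQueue([worker_a.append, worker_b.append])
    for task in ("t1", "t2", "t3"):
        tasks.publish(task)
    print(worker_a, worker_b)
```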
  • Box vs spread architecture
    • Box architecture
      – no scaling
      – easy to maintain
    [Diagram: one server running Webapp, Redis, RabbitMQ, Varnish and DB.]
  • Box vs spread architecture
    • Spread architecture
      – high availability
      – more integrations, more administration
    [Diagram: Server #1 runs RabbitMQ, Redis and HAProxy; Server #2 and Server #3 each run a Webapp, a DB shard and Varnish.]