Scaling Web Applications - 2009
Presentation Transcript

  • Scaling Web Applications Andrei Gheorghe idevelop.ro
  • The most common case: Shared Hosting
  • Where problems appear • Server processing power: CPU, RAM, etc. • Bandwidth • Storage capacity • Database
  • Web Server + DB Server
  • Load Balancing
  • Load Balancing • Hardware • Balancing is done at the packet transport level • Expensive, knows nothing about the application architecture • DNS Load Distribution ("Round Robin") • Statistically, distributes traffic evenly • Knows nothing about server availability • DNS caching problems can occur • Is a solution only at very large scale • Reverse Proxy
  • Reverse Proxy Load Balancing • A single front end for multiple servers • Security • Acceleration of SSL requests • Caching • nginx, squid, lighttpd
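A reverse proxy, unlike plain DNS round robin, can know which backends are alive. The following is a minimal sketch of that idea, not the algorithm of nginx, squid or lighttpd; the `RoundRobinBalancer` class and the backend addresses are hypothetical names chosen for illustration.

```python
from itertools import cycle


class RoundRobinBalancer:
    """Pick backends in rotation, skipping any that are marked as down."""

    def __init__(self, backends):
        self.backends = list(backends)
        self.down = set()
        self._rotation = cycle(self.backends)

    def mark_down(self, backend):
        self.down.add(backend)

    def mark_up(self, backend):
        self.down.discard(backend)

    def next_backend(self):
        # Look at each backend at most once per call.
        for _ in range(len(self.backends)):
            candidate = next(self._rotation)
            if candidate not in self.down:
                return candidate
        raise RuntimeError("no backend available")


# Hypothetical backend addresses.
balancer = RoundRobinBalancer(["10.0.0.1", "10.0.0.2", "10.0.0.3"])
first_round = [balancer.next_backend() for _ in range(3)]

balancer.mark_down("10.0.0.2")  # e.g. a failed health check
after_failure = [balancer.next_backend() for _ in range(4)]
```

Once a backend is marked down, requests simply rotate over the remaining ones, which is the availability awareness that DNS-based distribution lacks.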
  • Relational Databases: tables, columns, joins
  • MySQL Replication
  • MySQL Cluster • Data node – Clients do not interact with these directly • Management node – Configures and monitors the cluster • SQL node (mysqld process) – A MySQL server that connects to the data nodes to request or store information • Generally, each node will run on a separate host
  • MySQL Cluster • Synchronous Replication – Data is replicated across multiple nodes to ensure availability if a data node disconnects • Horizontal Data Partitioning – Data is automatically partitioned across all data nodes using an algorithm based on the primary key • Hybrid Storage – memory / disk • Shared Nothing – “no single point of failure”
  • Normalization • Means bringing the database to a “normal form” • Data is structured into tables with relations between them, and each piece of information appears only once • Ensures the information stays consistent when operations are performed on the database
  • Normalization / Denormalization • USERS: user_id, user_name, user_password • POSTS: post_id, post_author_id • COMMENTS: c_id, c_post_id, c_text
  • Normalization / Denormalization • USERS: user_id, user_name, user_password • POSTS: post_id, post_author_id, post_author_name • COMMENTS: c_id, c_post_id, c_text
  • Normalization / Denormalization • USERS: user_id, user_name, user_password • POSTS: post_id, post_author_id, post_author_name, post_comment_count • COMMENTS: c_id, c_post_id, c_text
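The three schema slides add `post_author_name` and `post_comment_count` to POSTS so that a post can be displayed without joining USERS or counting COMMENTS. A minimal sketch of what that implies at write time, using SQLite in place of MySQL so that it runs anywhere; the helper functions and sample data are assumptions, not part of the talk.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (user_id INTEGER PRIMARY KEY, user_name TEXT, user_password TEXT);
    CREATE TABLE posts (
        post_id INTEGER PRIMARY KEY,
        post_author_id INTEGER,
        post_author_name TEXT,                 -- denormalized copy of users.user_name
        post_comment_count INTEGER DEFAULT 0   -- denormalized count of comments
    );
    CREATE TABLE comments (c_id INTEGER PRIMARY KEY, c_post_id INTEGER, c_text TEXT);
""")


def add_post(author_id):
    # Copy the author's name into the post row when the post is created.
    name = conn.execute(
        "SELECT user_name FROM users WHERE user_id = ?", (author_id,)
    ).fetchone()[0]
    cur = conn.execute(
        "INSERT INTO posts (post_author_id, post_author_name) VALUES (?, ?)",
        (author_id, name),
    )
    return cur.lastrowid


def add_comment(post_id, text):
    # Keep the counter in step with the comments table in one transaction.
    with conn:
        conn.execute(
            "INSERT INTO comments (c_post_id, c_text) VALUES (?, ?)", (post_id, text)
        )
        conn.execute(
            "UPDATE posts SET post_comment_count = post_comment_count + 1 WHERE post_id = ?",
            (post_id,),
        )


# Placeholder sample data.
conn.execute("INSERT INTO users (user_id, user_name, user_password) VALUES (1, 'ana', 'x')")
post_id = add_post(1)
add_comment(post_id, "first")
add_comment(post_id, "second")

# A single-table read now returns everything needed to render the post header.
row = conn.execute(
    "SELECT post_author_name, post_comment_count FROM posts WHERE post_id = ?", (post_id,)
).fetchone()
```

The trade-off is that every write path must now maintain the copies; a user rename, for instance, would also have to update `post_author_name`.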
  • Key → Value Databases
  • Key → Value Databases • Distributed, persistent hash tables • "Eventual consistency" • Allow SELECTs with conditions • Require some degree of data denormalization • Inconsistencies must be handled manually, and the correct data propagated • MemcacheDB, CouchDB, Amazon SimpleDB, Hypertable, Google BigTable
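One way to picture "handling inconsistencies manually" under eventual consistency is a read that compares replicas and pushes the newest value back to the stale ones. This is a simplified sketch with plain dictionaries standing in for replicas and a hypothetical version number per value; it is not how any of the products listed above resolves conflicts.

```python
def write(replicas, key, value, version):
    """Store (version, value) on every replica that is reachable."""
    for replica in replicas:
        replica[key] = (version, value)


def read_with_repair(replicas, key):
    """Return the newest value and copy it to replicas that hold stale data."""
    found = [replica[key] for replica in replicas if key in replica]
    if not found:
        return None
    newest = max(found, key=lambda item: item[0])  # highest version wins
    for replica in replicas:
        if replica.get(key) != newest:
            replica[key] = newest  # propagate the correct data
    return newest[1]


replica_a, replica_b = {}, {}
write([replica_a, replica_b], "user:1:name", "ana", version=1)
write([replica_a], "user:1:name", "ana-maria", version=2)  # replica_b missed this write

value = read_with_repair([replica_a, replica_b], "user:1:name")
```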
  • Sharding
  • Vertical Sharding • One server for users, one server for search, etc. • JOINs between tables are done manually • Denormalizing the DB reduces the need for JOINs • Diagram labels: SEARCH, COMMENTS, USERS
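When tables live on different servers, the database can no longer join them, so the application does it. A minimal sketch, assuming two separate SQLite connections stand in for a users server and a posts server; the sample rows and the `posts_with_authors` helper are hypothetical.

```python
import sqlite3

users_db = sqlite3.connect(":memory:")  # stands in for the USERS server
posts_db = sqlite3.connect(":memory:")  # stands in for a separate posts server

users_db.execute("CREATE TABLE users (user_id INTEGER PRIMARY KEY, user_name TEXT)")
users_db.executemany("INSERT INTO users VALUES (?, ?)", [(1, "ana"), (2, "bogdan")])
posts_db.execute("CREATE TABLE posts (post_id INTEGER PRIMARY KEY, post_author_id INTEGER)")
posts_db.executemany("INSERT INTO posts VALUES (?, ?)", [(10, 1), (11, 2), (12, 1)])


def posts_with_authors():
    # Step 1: read the posts from their own server.
    posts = posts_db.execute(
        "SELECT post_id, post_author_id FROM posts ORDER BY post_id"
    ).fetchall()
    author_ids = sorted({author_id for _, author_id in posts})
    if not author_ids:
        return []
    # Step 2: fetch the matching users from the other server in one query.
    placeholders = ",".join("?" for _ in author_ids)
    rows = users_db.execute(
        f"SELECT user_id, user_name FROM users WHERE user_id IN ({placeholders})",
        author_ids,
    ).fetchall()
    names = dict(rows)
    # Step 3: join the two result sets in application code.
    return [(post_id, names.get(author_id)) for post_id, author_id in posts]


joined = posts_with_authors()
```

Storing a denormalized `post_author_name` on the posts server would remove steps 2 and 3 entirely, which is the point the slide makes about denormalization.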
  • Horizontal Sharding • Splitting the records of a table across multiple servers • The splitting algorithm is very important • Depending on the algorithm chosen, rebalancing the data when the topology changes can be difficult • A central dictionary can be used • transparent algorithm • easier to rebalance • can create a single point of failure (SPF) • Diagram labels: USR #1, USR #2, USR #3
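The difference between an algorithmic split and a central dictionary shows up when a server is added. The sketch below compares the two under simple assumptions: a modulo rule on the user ID versus a lookup table filled on first use. The shard names and the `ShardDirectory` class are hypothetical.

```python
SHARDS = ["usr1", "usr2", "usr3"]


def shard_by_modulo(user_id, shards=SHARDS):
    """Algorithmic placement: the shard follows directly from the ID."""
    return shards[user_id % len(shards)]


class ShardDirectory:
    """Central dictionary: placement is stored, not computed."""

    def __init__(self, shards):
        self.shards = list(shards)
        self.location = {}

    def shard_for(self, user_id):
        if user_id not in self.location:
            # Assign new users in rotation; existing users keep their shard.
            self.location[user_id] = self.shards[len(self.location) % len(self.shards)]
        return self.location[user_id]

    def add_shard(self, shard):
        self.shards.append(shard)


user_ids = range(1, 1001)

# Modulo rule: adding a fourth shard changes the answer for most users.
before = {u: shard_by_modulo(u) for u in user_ids}
after = {u: shard_by_modulo(u, SHARDS + ["usr4"]) for u in user_ids}
moved_by_modulo = sum(1 for u in user_ids if before[u] != after[u])

# Directory: existing users stay put, only new users land on the new shard.
directory = ShardDirectory(SHARDS)
dir_before = {u: directory.shard_for(u) for u in user_ids}
directory.add_shard("usr4")
moved_by_directory = sum(1 for u in user_ids if directory.shard_for(u) != dir_before[u])
```

With the modulo rule most existing users would have to be migrated after the topology change, while the directory moves none of them. The cost is that every lookup depends on the directory, which is the single point of failure the slide warns about.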
  • Advantages of sharding • High availability • If one server crashes, the application keeps working • Faster queries • Queries run on smaller chunks of data, so they execute faster • Higher write rate • Writes execute faster because, with no central server, they run in parallel
  • Cache
  • memcached • memcached -d -u www -m 2048 -l 10.0.0.8 -p 11211 • Distributed hash table, kept in RAM • set(key, value) get(key) delete(key) • value is usually an entire serialized object • E.g.: article + comments + author information • Client classes for interacting with memcached exist for practically any programming language, including PHP
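The `set` / `get` / `delete` operations are typically used in a read-through pattern: look in the cache first, fall back to the database, then store the serialized result. The sketch below uses a small in-memory stand-in (`InMemoryCache`, a hypothetical class) instead of a real memcached client so that it runs without a server; the article data and function names are also assumptions.

```python
import json


class InMemoryCache:
    """Stand-in exposing the three operations named on the slide."""

    def __init__(self):
        self._data = {}

    def set(self, key, value):
        self._data[key] = value

    def get(self, key):
        return self._data.get(key)

    def delete(self, key):
        self._data.pop(key, None)


cache = InMemoryCache()
stats = {"db_reads": 0}


def load_article_from_db(article_id):
    # Placeholder for the expensive queries: article + comments + author info.
    stats["db_reads"] += 1
    return {"id": article_id, "title": "Hello", "comments": ["nice"], "author": {"name": "ana"}}


def get_article(article_id):
    key = f"article:{article_id}"
    cached = cache.get(key)
    if cached is not None:
        return json.loads(cached)         # cache hit: no database work
    article = load_article_from_db(article_id)
    cache.set(key, json.dumps(article))   # store the whole object, serialized
    return article


def invalidate_article(article_id):
    # Call after the article changes so the next read rebuilds the entry.
    cache.delete(f"article:{article_id}")


first = get_article(7)
second = get_article(7)
reads_after_two_gets = stats["db_reads"]
invalidate_article(7)
third = get_article(7)
```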
  • memcached • "Least Recently Used" • In a network with several servers, memcached instances can be pooled by the client into a memcached cluster in which the cache is distributed across several nodes • memcached runs on Linux and Windows and can be started anywhere there is free RAM
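"Least Recently Used" means that when memory is full, the entry that has gone longest without being read or written is dropped first. A minimal sketch of that policy follows; the real memcached implementation is more involved, and the `LRUCache` class here is only an illustration.

```python
from collections import OrderedDict


class LRUCache:
    """Fixed-capacity cache that evicts the least recently used key."""

    def __init__(self, capacity):
        self.capacity = capacity
        self._items = OrderedDict()  # oldest entry first, newest last

    def get(self, key):
        if key not in self._items:
            return None
        self._items.move_to_end(key)  # a read counts as a use
        return self._items[key]

    def set(self, key, value):
        if key in self._items:
            self._items.move_to_end(key)
        self._items[key] = value
        if len(self._items) > self.capacity:
            self._items.popitem(last=False)  # drop the oldest entry


lru = LRUCache(capacity=2)
lru.set("a", 1)
lru.set("b", 2)
lru.get("a")     # "a" is now more recently used than "b"
lru.set("c", 3)  # cache is full, so "b" is evicted
```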
  • Session Clustering
  • Load Balancing Revisited
  • Session Clustering • Store in common filesystem • Not useful in multi-server environments • NFS will cache pages • Store in database • Very fast because you are only ever looking up primary keys • Make sure the DB has row locking (InnoDB), not table locking. • Store in memcached • Stored across several machines rather than just one. • A total machine failure now affects only a percentage of users rather than everyone.
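The claim that a machine failure affects only a percentage of users follows from how sessions are spread over the cache nodes. A minimal sketch, assuming the client picks a node by hashing the session ID; the node names and the `node_for` helper are hypothetical, and real client libraries often use consistent hashing instead of a plain modulo.

```python
import hashlib

NODES = ["cache-a", "cache-b", "cache-c", "cache-d"]  # hypothetical memcached hosts


def node_for(session_id, nodes=NODES):
    """Map a session ID to one node using a stable hash."""
    digest = hashlib.md5(session_id.encode("utf-8")).hexdigest()
    return nodes[int(digest, 16) % len(nodes)]


session_ids = [f"session-{i}" for i in range(10000)]
placement = {sid: node_for(sid) for sid in session_ids}

# Simulate a total failure of one machine and count the sessions it held.
lost = [sid for sid, node in placement.items() if node == "cache-b"]
lost_fraction = len(lost) / len(session_ids)
```

With four nodes, roughly a quarter of the sessions are lost when one node dies; the remaining users keep their sessions.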
  • Content Delivery Network • A collection of web servers distributed across multiple locations to deliver static content more efficiently to users. • The server selected for delivering content to a specific user is typically based on a measure of network proximity.
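Selecting a server by network proximity can be pictured as choosing the location with the lowest measured latency for a given user. This is only an illustrative sketch; the location names and latency figures are made up, and production CDNs usually make this decision through DNS or routing rather than in application code.

```python
def pick_edge_server(latencies_ms):
    """Return the server name with the smallest measured latency."""
    return min(latencies_ms, key=latencies_ms.get)


# Hypothetical round-trip times from one user to three CDN locations.
measured = {"bucharest": 12.0, "frankfurt": 35.5, "new-york": 120.0}
chosen = pick_edge_server(measured)
```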
  • Multiple Codebases • If the server and site architecture allow it, interesting things can be done by running different code • Using a reverse proxy, 10% of visitors can be sent to a 2.0 beta version of the site to observe how they interact with it • If things do not go as they should, you revert to the original code and only 10% of visitors were affected
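Sending 10% of visitors to a beta works best when each visitor consistently sees the same version. A minimal sketch of one way to do that, by hashing a visitor identifier into 100 buckets; the `codebase_for` function and the visitor IDs are assumptions, and in practice this rule would live in the reverse proxy configuration.

```python
import hashlib


def codebase_for(visitor_id, beta_percent=10):
    """Return 'beta' for a stable subset of visitors, 'stable' for the rest."""
    digest = hashlib.sha256(visitor_id.encode("utf-8")).hexdigest()
    bucket = int(digest, 16) % 100  # bucket in 0..99, fixed per visitor
    return "beta" if bucket < beta_percent else "stable"


visitors = [f"visitor-{i}" for i in range(10000)]
assignments = {v: codebase_for(v) for v in visitors}
beta_share = sum(1 for c in assignments.values() if c == "beta") / len(visitors)

# Rolling back is a one-parameter change: nobody is routed to the beta any more.
rolled_back = {v: codebase_for(v, beta_percent=0) for v in visitors}
```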
  • Case studies: highscalability.com
  • LAMP Shards Memcached Squid Smarty Imagemagick
  • • More than 4 billion queries per day • ~35M photos in squid cache (total) • ~2M photos in squid’s RAM • ~470M photos, 4 or 5 sizes of each • 38k req/sec to memcached (12M objects) • 2 PB raw storage (consumed about ~1.5TB on Sunday) • Over 400,000 photos being added every day
  • • Debian Linux, Apache, PHP, MySQL • memcached • MemcacheDB - distributed key-value storage system which conforms to memcache protocol →15,000 writes/second, 64,000 reads/second • Lots of servers
  • • 26 million uniques a month • 30 million users. • Uniques are only half that traffic. Traffic = unique web visitors + APIs + Digg buttons. • 2 billion requests a month • 13,000 requests a second, peak at 27,000 requests a second.
  • • Data are separated into separate clusters: User Actions, Users, Comments, Items, etc. • Asynchronous queuing architecture for near-term processing
  • Amazon Web Services
  • Simple Storage Service (S3) • Cloud storage service • Servers in the US / Europe • REST API • Storage: $0.150 / GB • Upload: $0.100 / GB • Download: $0.170 / GB • Twitter uses S3 for user pictures
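The per-GB prices on the slide can be turned into a rough bill. The sketch below simply multiplies each volume by its 2009 price from the slide; the example workload is hypothetical, and treating the storage price as a charge for one billing period is an assumption, since the slide does not state the period.

```python
# Prices in USD per GB, as listed on the slide (2009).
STORAGE_PER_GB = 0.150
UPLOAD_PER_GB = 0.100
DOWNLOAD_PER_GB = 0.170


def s3_cost(stored_gb, uploaded_gb, downloaded_gb):
    """Estimate the cost of one billing period from the three volumes."""
    return (
        stored_gb * STORAGE_PER_GB
        + uploaded_gb * UPLOAD_PER_GB
        + downloaded_gb * DOWNLOAD_PER_GB
    )


# Hypothetical workload: 100 GB stored, 10 GB uploaded, 200 GB downloaded.
estimate = s3_cost(stored_gb=100, uploaded_gb=10, downloaded_gb=200)
```

For this workload the download traffic (200 × $0.170 = $34) dominates the storage ($15) and upload ($1) charges.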
  • Elastic Compute Cloud (EC2) • On-demand server instances • In 5 minutes you can start a server with root access • $0.10 / hour, 99.95% guaranteed uptime (about 4 hours of downtime per year) • Static IP addresses can be allocated and complex architectures can be built • Fast access to S3
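The downtime figure follows directly from the uptime percentage. A short worked calculation, assuming a 365-day year and one instance running continuously at the slide's hourly price:

```python
HOURS_PER_YEAR = 365 * 24  # 8760


def max_downtime_hours(uptime_percent, hours=HOURS_PER_YEAR):
    """Hours per year that may be down while still meeting the uptime figure."""
    return hours * (1 - uptime_percent / 100)


downtime = max_downtime_hours(99.95)  # 8760 * 0.0005
yearly_cost = 0.10 * HOURS_PER_YEAR   # one instance, always on, at $0.10 / hour
```

This gives about 4.4 hours of allowed downtime per year, which the slide rounds to 4, and about $876 per year for a single always-on instance.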
  • SimpleDB • Distributed hash DB • Allows SELECTs with conditions • Queries limited to 5 seconds
  • thank you, come again