Application architecture for the rest of us - php xperts devcon 2012

Summarizes various architectural principles for enterprise applications

Published in: Technology

  1. APPLICATION ARCHITECTURE FOR THE REST OF US
     Presented by M N Islam Shihan
  2. Introduction
     - Target audience
     - What is architecture?
       - Architecture is the foundation of your application
       - Applications are not like skyscrapers
       - Enterprise vs. personal architecture
     - Why look ahead in architecture?
       - Adaptability with growth
       - Maintainability
       - Requirements never end
  3. Enterprise Architecture (cont…)
     - Security
     - Responsiveness
     - Extendibility
     - Availability
     - Load management
     - Distributed computation
     - Caching
     - Scalability
  4. Security
  5. Security (cont…)
     - Think about security first of all
     - Network security: Put a firewall and a reverse proxy in front of your network
     - SQL injection: Always escape field values in your queries, or bind them as query parameters
     - XSS (Cross-Site Scripting): Never trust user-provided data (or data grabbed from third-party sources), and never display it without sanitizing/escaping
     - CSRF (Cross-Site Request Forgery): Never let your forms be submitted from third-party sites
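The SQL injection and XSS advice on this slide can be sketched in a few lines. This is a minimal illustration in Python using only the standard library, not code from the talk; the `users` table and the `find_user` and `render_comment` helpers are hypothetical names chosen for the example.

```python
import html
import sqlite3


def find_user(conn, name):
    # The "?" placeholder lets the driver bind the value, so user input
    # is never spliced into the SQL text itself.
    return conn.execute(
        "SELECT id, name FROM users WHERE name = ?", (name,)
    ).fetchall()


def render_comment(text):
    # Escape HTML metacharacters before echoing user-provided data.
    return "<p>" + html.escape(text) + "</p>"


# Hypothetical in-memory table, only for the demonstration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
conn.execute("INSERT INTO users (name) VALUES ('alice')")

rows_ok = find_user(conn, "alice")
# A classic injection string is treated as a literal name and matches nothing.
rows_attack = find_user(conn, "alice' OR '1'='1")
```

Binding parameters keeps the query text fixed, which is generally more robust than escaping values by hand.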
  6. Security (cont…)
     - DDoS (Distributed Denial of Service): Enable real-time access monitoring to detect and prevent DDoS attacks
     - Session fixation: Regenerate the session key on every request
     - Always hash your security tokens/cookies with a new random salt per request or per session (or at a regular interval)
     - Stay tuned and up to date with security news and releases for all the tools and technologies you use
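One possible reading of the "hash tokens with a new random salt" advice is a keyed hash (HMAC) over a fresh salt plus the session identifier. The sketch below assumes that reading; `SERVER_KEY`, `issue_token` and `verify_token` are hypothetical names, and a real application would load the key from configuration rather than generate it at import time.

```python
import hashlib
import hmac
import secrets

# Hypothetical server-side secret; assumed to come from configuration in practice.
SERVER_KEY = secrets.token_bytes(32)


def _digest(salt, session_id):
    # Keyed hash over the salt and the session id.
    message = (salt + session_id).encode()
    return hmac.new(SERVER_KEY, message, hashlib.sha256).hexdigest()


def issue_token(session_id):
    # A new random salt for every issued token, so tokens never repeat.
    salt = secrets.token_hex(16)
    return salt, _digest(salt, session_id)


def verify_token(session_id, salt, token):
    # Constant-time comparison avoids leaking information through timing.
    return hmac.compare_digest(_digest(salt, session_id), token)
```

A token issued for one session fails verification for any other session, and reissuing for the same session yields a different salt and token.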
  7. Responsiveness
  8. Responsiveness (cont…)
     - Web applications should be as responsive as desktop applications
     - Plan well and make good use of JavaScript to achieve responsiveness
     - Detect browsers and provide a separate response/interface depending on the detected browser type
     - Use JavaScript unobtrusively
     - Use Ajax optimally
     - Use Comet programming instead of polling
     - Defer large computations to asynchronous processing through a job queue
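The job queue idea in the last bullet can be sketched with an in-process queue and one worker thread. This is only an illustration: a production setup would typically use a separate job server (the deck's reference slides list Gearman and message queues), and the `submit` function and the sum-of-squares "heavy work" are stand-ins invented for the example.

```python
import queue
import threading

jobs = queue.Queue()
results = {}


def worker():
    # Runs in the background and processes jobs one at a time.
    while True:
        job = jobs.get()
        if job is None:  # sentinel value: stop the worker
            jobs.task_done()
            break
        job_id, numbers = job
        results[job_id] = sum(n * n for n in numbers)  # stand-in for heavy work
        jobs.task_done()


def submit(job_id, numbers):
    # Called by the request handler: enqueue the work and return at once.
    jobs.put((job_id, numbers))
    return {"status": "accepted", "job": job_id}


threading.Thread(target=worker, daemon=True).start()
response = submit("report-1", [1, 2, 3])
jobs.join()  # only for this demo; a real handler would not wait here
```

The handler answers immediately with an "accepted" response, and the client can fetch the result later once the worker has stored it.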
  9. Extendibility
     - Implement and use a robust data access interface, so that it can be exposed easily via web services (such as REST, SOAP, JSONP)
     - Use architectural patterns and best practices
       - SOA (Service-Oriented Architecture)
       - MVC (Model View Controller)
     - Modular architecture with pluggability
     - Allow hooks and overrides through events
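"Hooks and overrides through events" could look like the following small dispatcher, where plugins register listeners that may transform a value before the core code uses it. The class name, the `title.render` event name and the listeners are all hypothetical, chosen only to illustrate the pattern.

```python
from collections import defaultdict


class EventDispatcher:
    def __init__(self):
        # Maps an event name to the listeners registered for it, in order.
        self._listeners = defaultdict(list)

    def on(self, event, listener):
        self._listeners[event].append(listener)

    def fire(self, event, payload):
        # Each listener receives the payload and returns a (possibly changed) one.
        for listener in self._listeners[event]:
            payload = listener(payload)
        return payload


dispatcher = EventDispatcher()
# Two hypothetical plugins hooking into the same event.
dispatcher.on("title.render", str.strip)
dispatcher.on("title.render", str.title)


def render_title(raw):
    # Core code fires the event instead of hard-coding the behavior.
    return dispatcher.fire("title.render", raw)
```

An event with no listeners returns the payload unchanged, so the core code works the same with or without plugins.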
  10. Availability
  11. Availability (cont…)
      - Implement a well-planned disaster recovery policy
      - Use version control for your sources
      - Use RAID for your storage devices
      - Keep a hot standby fallback for each of your primary data/content servers
      - Back up your source repository, files and data periodically
      - Archive your old data periodically
      - Where possible, give users a way to switch between current and archived data
  12. Load Management
  13. Load Management (cont…)
      - Monitor and benchmark your servers periodically and find the peak usage time
      - Optimize to support at least 150% of the peak-time load
      - Use web servers with high I/O performance
      - Introduce a load balancer to distribute load among multiple application servers
      - Start with a software load balancer (a.k.a. reverse proxy), then grow to a hardware load balancer only if necessary
      - Use CDNs to serve your static content
      - Use public CDNs to serve open source JavaScript or CSS files when possible
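The 150% headroom rule and the load balancer bullet can be put into numbers with a short sketch. The per-server throughput figure is an assumption you would take from your own benchmarks, and round-robin is just one simple distribution strategy, not necessarily the one the talk had in mind; `servers_needed` and `RoundRobinBalancer` are hypothetical names.

```python
import itertools
import math


def servers_needed(peak_rps, per_server_rps, headroom=1.5):
    # Capacity target is peak load times the headroom factor (150% by default).
    return math.ceil(peak_rps * headroom / per_server_rps)


class RoundRobinBalancer:
    def __init__(self, backends):
        # Cycle endlessly through the configured application servers.
        self._cycle = itertools.cycle(backends)

    def pick(self):
        return next(self._cycle)


# Hypothetical numbers: 1,000 requests/s at peak, 400 requests/s per server.
count = servers_needed(1000, 400)
balancer = RoundRobinBalancer(["app1", "app2", "app3"])
picked = [balancer.pick() for _ in range(4)]
```

With these assumed figures the target capacity is 1,500 requests per second, which rounds up to four servers.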
  14. Caching
      - To cache or not to cache?
        - Analyze the nature of the content and responses generated by your application very well
        - What to cache?
        - Analyze and set a proper expiry time
        - Invalidate the cache whenever content changes
        - Partial caching will also bring you speed
        - When is caching bad?
      - Understand the various types of web caches
        - Browser cache
        - Proxy cache
        - Gateway cache
  15. Caching (cont…)
      - Implement server-side caching
        - Runtime in-memory cache
          - Per request: global variables
          - Shared: Memcached
        - Persistent cache
          - Per server: file based, APC
          - Shared: DB based, Redis
        - Optimizers and accelerators: eAccelerator, XCache
      - Reverse proxy/gateway cache
        - Varnish cache
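The two caching rules from slide 14 (set an expiry time, invalidate when content changes) can be sketched as a tiny in-memory cache. This is an illustrative sketch, not an API of Memcached, APC or Redis; `TTLCache` and its injectable `clock` parameter are hypothetical, the latter added only so the expiry logic can be exercised without sleeping.

```python
import time


class TTLCache:
    def __init__(self, ttl_seconds, clock=time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock
        self._store = {}  # key -> (value, expiry timestamp)

    def get(self, key, compute):
        entry = self._store.get(key)
        now = self._clock()
        if entry is not None and entry[1] > now:
            return entry[0]  # still fresh: serve from cache
        value = compute()  # missing or expired: rebuild and store
        self._store[key] = (value, now + self._ttl)
        return value

    def invalidate(self, key):
        # Call this whenever the underlying content changes.
        self._store.pop(key, None)
```

A caller passes the expensive computation as a function, for example `cache.get("home", build_home_page)`, so the work runs only on a miss.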
  16. Distributed Computing
  17. Scalability
      - What the heck is this?
      - Scalability is the soul of enterprise architecture
      - Scalability pyramid
  18. Scalability (cont…): Vertical scalability (scaling up)
  19. Scalability (cont…): Horizontal scalability (scaling out)
  20. Scalability (cont…)
  21. Scalability: Scaling up (vertical) vs. scaling out (horizontal)
  22. Scalability: Database Scalability
      - Vertical: Add resources to the server as needed
        - In most cases this produces a single point of failure
      - Horizontal: Distribute/replicate data among multiple servers
      - Cloud services: Store your data in third-party data centers and pay according to your usage
  23. Scalability (cont…): Scaling Database
      - Scaling options
        - Master/slave: master for writes, slaves for reads
        - Cluster computing: single storage with multiple server nodes
        - Table partitioning: large tables are split among partitions
        - Federated tables: tables are shared among multiple servers
        - Distributed key-value stores
        - Distributed object DB
        - Database sharding
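The master/slave option ("master for writes, slaves for reads") implies some routing step in the application. A minimal sketch of that step follows; it assumes statements can be classified by their first keyword, which ignores transactions and replication lag, and the `ReadWriteRouter` class and server names are hypothetical.

```python
import itertools


class ReadWriteRouter:
    def __init__(self, master, replicas):
        self.master = master
        # Rotate through the read replicas, if any are configured.
        self._replicas = itertools.cycle(replicas) if replicas else None

    def route(self, sql):
        # Classify the statement by its first keyword (a simplification).
        verb = sql.lstrip().split(None, 1)[0].upper()
        if verb == "SELECT" and self._replicas is not None:
            return next(self._replicas)
        return self.master  # writes, and reads when no replica exists


router = ReadWriteRouter("master-db", ["replica-1", "replica-2"])
```

Reads that must observe a write just made would still need to go to the master, because replicas may lag behind.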
  24. Scalability (cont…): Database Sharding
      - Smaller databases are easier to manage
      - Smaller databases are faster
      - Database sharding can reduce costs
      - Needs one or more well-defined shard functions
      - "Don't do it, if you don't need to!" (37signals.com)
      - "Shard early and often!" (startuplessonslearned.blogspot.com)
  25. Scalability (cont…): Database Sharding
      - When is it appropriate?
        - High-transaction database applications
        - Mixed-workload database usage
          - Frequent reads, including complex queries and joins
          - Write-intensive transactions (CRUD statements, including INSERT, UPDATE, DELETE)
          - Contention for common tables and/or rows
        - General business reporting
          - Typical "repeating segment" report generation
          - Some data analysis (mixed with other workloads)
      - What to analyze?
        - Identify all transaction-intensive tables in your schema.
        - Determine the transaction volume your database is currently handling (or is expected to handle).
        - Identify all common SQL statements (SELECT, INSERT, UPDATE, DELETE), and the volumes associated with each.
        - Develop an understanding of the "table hierarchy" contained in your schema; in other words, the main parent-child relationships.
        - Determine the "key distribution" for transactions on high-volume tables, to determine whether they are evenly spread or concentrated in narrow ranges.
  26. Scalability (cont…): Database Sharding Challenges
      - Reliability
        - Automated backups
        - Database shard redundancy
        - Cost-effective hardware redundancy
        - Automated failover
        - Disaster recovery
      - Distributed queries
        - Aggregation of statistics
        - Queries that support comprehensive reports
  27. Scalability (cont…): Database Sharding Challenges (cont…)
      - Avoidance of cross-shard joins
      - Auto-increment key management
      - Support for multiple shard schemes
        - Session-based sharding
        - Transaction-based sharding
        - Statement-based sharding
      - Determine the optimum method for sharding the data
        - Shard by a primary key on a table
        - Shard by the modulus of a key value
        - Maintain a master shard index table
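Two of the sharding methods listed on slide 27 can be sketched directly: sharding by the modulus of a key value, and a lookup table playing the role of the master shard index. The function and class names are hypothetical, and the in-memory dictionary stands in for what would be a real index table.

```python
def shard_by_modulus(key, shard_count):
    # Shard function: the remainder decides which shard holds the row.
    return key % shard_count


class ShardIndex:
    """Explicit key -> shard map, standing in for a master shard index table."""

    def __init__(self, default_shard_fn):
        self._index = {}
        self._default = default_shard_fn

    def assign(self, key, shard):
        # Pin a key to a specific shard, for example after moving a hot customer.
        self._index[key] = shard

    def locate(self, key):
        if key in self._index:
            return self._index[key]
        return self._default(key)


index = ShardIndex(lambda key: shard_by_modulus(key, 4))
```

The modulus scheme needs no lookup but reshuffles most keys when the shard count changes, while the index table costs a lookup and lets individual keys be moved.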
  28. Scalability (cont…): Database Sharding
      - [Figure: example bookstore schema showing how data is sharded]
  29. Tools
      - Application framework
      - Load balancer with multiple application servers
      - Continuous integration
      - Automated testing
        - TDD (Test-Driven Development)
        - BDD (Behavior-Driven Development)
      - Monitoring
        - Services
        - Servers
        - Error logging
        - Access logging
      - Content Delivery Networks (CDN)
      - FOSS
  30. Think Ahead
  31. Think Ahead (cont…)
      - Understand the business model
      - Analyze requirements in the greatest detail
      - Plan for extendibility
      - Be agile, do incremental architecture
      - Create/use frameworks
      - SQL or NoSQL?
      - Sharding or clustering or both?
      - Cloud services?
  32. Guidelines
      - Enrich your knowledge: Read, read and read. Read anything available, from jokes to religions.
      - Follow patterns and best practices
      - Mix technologies
        - Don't let your tools/technologies limit your vision
        - Invent/customize technology if required
      - Use FOSS
        - Don't expect ready-made solutions
        - Find the closest match
        - Customize as needed
  33. Guidelines (cont…): Database Optimization
      - Use established and proven solutions
        - MySQL
        - PostgreSQL
        - MongoDB
        - Redis
        - Memcached
        - CouchDB
      - Understand and utilize indexing and full-text search
      - Use optimized DB structures and algorithms
        - Modified Preorder Tree Traversal (MPTT)
        - Map Reduce
      - ORM or not?
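Modified Preorder Tree Traversal stores a left and a right number on every node, so that all descendants of a node can be selected with a single range comparison instead of recursive queries. The sketch below numbers a small hypothetical category tree in memory; in a database the two numbers would be columns, and this is not the code of the phpMptt library listed in the references.

```python
def number_tree(tree, root):
    """Assign (left, right) numbers to every node in one depth-first walk."""
    bounds = {}
    counter = 1

    def visit(node):
        nonlocal counter
        left = counter  # number handed out on the way down
        counter += 1
        for child in tree.get(node, []):
            visit(child)
        bounds[node] = (left, counter)  # number handed out on the way up
        counter += 1

    visit(root)
    return bounds


def descendants(bounds, node):
    # A descendant's interval lies strictly inside its ancestor's interval.
    left, right = bounds[node]
    return sorted(n for n, (l, r) in bounds.items() if left < l and r < right)


# Hypothetical category tree: parent -> list of children.
tree = {"Food": ["Fruit", "Meat"], "Fruit": ["Red", "Yellow"]}
bounds = number_tree(tree, "Food")
```

The number of descendants of a node also follows from its own numbers as `(right - left - 1) // 2`, with no traversal at all.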
  34. Guidelines (cont…): Database Optimization
      - Optimize your queries
        - One big query is usually faster than many repeated smaller queries
        - Never be too lazy to write optimized queries
        - One Ring to Rule 'em All
      - Use a runtime in-memory cache
        - Filtering an in-memory cached dataset is much faster than executing a query in the DB
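The "one big query" bullet can be illustrated by fetching the same data two ways: one query per parent row, and a single join. The `authors` and `books` tables are hypothetical, and an in-memory SQLite database is used only so the sketch is self-contained; the cost of the repeated round trips matters far more against a networked database server.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE authors (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE books (id INTEGER PRIMARY KEY, author_id INTEGER, title TEXT);
    INSERT INTO authors VALUES (1, 'Ann'), (2, 'Bob');
    INSERT INTO books VALUES (1, 1, 'A1'), (2, 1, 'A2'), (3, 2, 'B1');
""")


def titles_repetitive(conn):
    # One query for the authors, then one more per author (N + 1 round trips).
    result = {}
    authors = conn.execute("SELECT id, name FROM authors ORDER BY id").fetchall()
    for author_id, name in authors:
        rows = conn.execute(
            "SELECT title FROM books WHERE author_id = ? ORDER BY id",
            (author_id,),
        ).fetchall()
        result[name] = [title for (title,) in rows]
    return result


def titles_single_query(conn):
    # The same data in a single round trip, grouped in application code.
    result = {}
    rows = conn.execute(
        "SELECT a.name, b.title FROM authors a "
        "INNER JOIN books b ON b.author_id = a.id ORDER BY a.id, b.id"
    ).fetchall()
    for name, title in rows:
        result.setdefault(name, []).append(title)
    return result
```

Both functions return the same mapping; the difference is that the first issues a number of queries that grows with the number of authors.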
  35. Guidelines (cont…): One Ring to Rule 'em All
      - Perform selection, then projection, then join
      - [Figure: tables A (1,000 records), B (1,000,000 records) and C (1,000,000,000 records), with the a_id link labeled]
      - A simple example: Write a standard SQL query to find all records with fields A.a1, B.b1 and C.c1 from tables A (id, a1, a2, a3, …, aP), B (id, a_id, b1, b2, b3, …, bQ) and C (id, b_id, c1, c2, c3, …, cR), given that A.aX, B.bY and C.cZ match the values 'X', 'Y' and 'Z' respectively. Assume that each of the tables A, B and C has a primary key on its id column, and that a_id and b_id are the foreign keys in B from A and in C from B respectively.
  36. Guidelines: One Ring to Rule 'em All (cont…)
      Solution 1

          SELECT A.a1, B.b1, C.c1
          FROM A, B, C
          WHERE A.id = B.a_id AND B.id = C.b_id
            AND A.aX = 'X' AND B.bY = 'Y' AND C.cZ = 'Z'

      Why does it suck?
      - Remember the sizes of the A, B and C tables?
      - A cross product of tables is always memory intensive. Why?
        - Evaluated naively, A x B x C has 1,000 x 1,000,000 x 1,000,000,000 records with (P + 1) + (Q + 2) + (R + 2) fields
        - Can you imagine the size of the in-memory result set of the joined tables?
        - It will be HUGE
  37. Guidelines: One Ring to Rule 'em All (cont…)
      Solution 2

          SELECT A.a1, B.b1, C.c1
          FROM A
          INNER JOIN B ON A.id = B.a_id
          INNER JOIN C ON B.id = C.b_id
          WHERE A.aX = 'X' AND B.bY = 'Y' AND C.cZ = 'Z'

      Why does it still suck?
      - If the joins are evaluated before the filters, A ⋈ B ⋈ C will produce (1,000 x 1,000,000) records to perform A ⋈ B, then another (1,000 x 1,000,000,000) records to compute (A ⋈ B) ⋈ C, and only then filter the records as defined by the WHERE clause.
      - The number of fields (P + 1 in A, Q + 2 in B and R + 2 in C) also contributes to memory consumption.
      - It is optimized, but still HUGE with respect to memory consumption and computation.
  38. Guidelines: One Ring to Rule 'em All (cont…)
      Optimal solution

          SELECT A.a1, B.b1, C.c1
          FROM (SELECT id, a1 FROM A WHERE aX = 'X') AS A
          INNER JOIN (SELECT id, b1, a_id FROM B WHERE bY = 'Y') AS B ON A.id = B.a_id
          INNER JOIN (SELECT id, c1, b_id FROM C WHERE cZ = 'Z') AS C ON B.id = C.b_id

      Why does this solution outperform the others?
      - Let's keep the explanation as an exercise :)
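As a sanity check that the three solutions from slides 36 to 38 are interchangeable, the sketch below runs them against a tiny in-memory SQLite database and compares the results. The sample rows are invented, each table keeps only the columns the queries touch, and the subqueries in the last statement use the aliases fa, fb and fc instead of reusing the table names. It demonstrates equivalence only; it says nothing about relative speed, which depends on the database engine's query planner.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE A (id INTEGER PRIMARY KEY, a1 TEXT, aX TEXT);
    CREATE TABLE B (id INTEGER PRIMARY KEY, a_id INTEGER, b1 TEXT, bY TEXT);
    CREATE TABLE C (id INTEGER PRIMARY KEY, b_id INTEGER, c1 TEXT, cZ TEXT);
    INSERT INTO A VALUES (1, 'a-one', 'X'), (2, 'a-two', 'N');
    INSERT INTO B VALUES (1, 1, 'b-one', 'Y'), (2, 1, 'b-two', 'N'),
                         (3, 2, 'b-three', 'Y');
    INSERT INTO C VALUES (1, 1, 'c-one', 'Z'), (2, 1, 'c-two', 'N'),
                         (3, 3, 'c-three', 'Z');
""")

# Solution 1: comma-separated tables with every condition in WHERE.
SOLUTION_1 = """
    SELECT A.a1, B.b1, C.c1
    FROM A, B, C
    WHERE A.id = B.a_id AND B.id = C.b_id
      AND A.aX = 'X' AND B.bY = 'Y' AND C.cZ = 'Z'
"""

# Solution 2: explicit joins, filters still applied in WHERE.
SOLUTION_2 = """
    SELECT A.a1, B.b1, C.c1
    FROM A
    INNER JOIN B ON A.id = B.a_id
    INNER JOIN C ON B.id = C.b_id
    WHERE A.aX = 'X' AND B.bY = 'Y' AND C.cZ = 'Z'
"""

# Optimal solution: select and project inside subqueries, then join.
OPTIMAL = """
    SELECT fa.a1, fb.b1, fc.c1
    FROM (SELECT id, a1 FROM A WHERE aX = 'X') AS fa
    INNER JOIN (SELECT id, b1, a_id FROM B WHERE bY = 'Y') AS fb ON fa.id = fb.a_id
    INNER JOIN (SELECT id, c1, b_id FROM C WHERE cZ = 'Z') AS fc ON fb.id = fc.b_id
"""

results = [conn.execute(q).fetchall() for q in (SOLUTION_1, SOLUTION_2, OPTIMAL)]
```

To see how a particular engine actually evaluates each form, inspect its query plan (for example with `EXPLAIN` in MySQL or PostgreSQL) rather than relying on the shape of the SQL alone.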
  39. Reference: Tools
      - Security
        - Nmap: http://nmap.org/
        - Nikto: http://cirt.net/Nikto2
        - List of tools: http://sectools.org/
      - Caching
        - APC: http://php.net/manual/en/book.apc.php
        - XCache: http://xcache.lighttpd.net/
        - eAccelerator: http://sourceforge.net/projects/eaccelerator/
        - Varnish Cache: https://www.varnish-cache.org/
        - Memcached: http://memcached.org/
        - Redis: http://redis.io/
      - Load balancer
        - HAProxy: http://haproxy.1wt.eu/
        - Pound: http://www.apsis.ch/pound/
  40. Reference: Tools (cont…)
      - NoSQL
        - MongoDB: http://www.mongodb.org/
        - CouchDB: http://couchdb.apache.org/
        - A complete list: http://nosql-database.org/
      - Distributed computing
        - Gearman: http://gearman.org/
      - Message queue/job server
        - RabbitMQ: http://www.rabbitmq.com/
        - ActiveMQ: http://activemq.apache.org/
      - Monitoring
        - Nagios: http://www.nagios.org/
      - Testing
        - Selenium: http://seleniumhq.org/
        - Cucumber: http://cukes.info/
        - Watir: http://watir.com/
        - PHPUnit: http://www.phpunit.de/manual/3.7/en/
      - MPTT
        - Shameless promotion: https://github.com/mnishihan/phpMptt
  41. Reference: Articles
      - Caching
        - http://www.mnot.net/cache_docs/
        - http://bit.ly/9cTJfA
      - Load balancing
        - http://en.wikipedia.org/wiki/Load_balancing_%28computing%29
        - http://1wt.eu/articles/2006_lb/index.html
      - Scalability and architecture
        - http://www.diranieh.com/DistributedDesign_1/Scalability.htm
        - http://www.infoq.com/presentations/Facebook-Software-Stack
        - http://99designs.com/tech-blog/blog/2012/01/30/infrastructure-at-99designs/
        - http://bit.ly/16cKu
      - Database sharding
        - http://www.codefutures.com/database-sharding/
        - http://bit.ly/Y3b3J
        - http://www.startuplessonslearned.com/2009/01/sharding-for-startups.html
      - CDN
        - http://bit.ly/sMRyxC
      - MPTT
        - http://www.sitepoint.com/hierarchical-data-database/
  42. Thank You
      - Join phpXperts [http://bit.ly/phpxperts]
      - Follow me on Twitter [http://twitter.com/mnishihan]
      - Subscribe on Facebook [http://fb.me/mnishihan]
  43. Questions???
      - I will be glad to answer :)
