Scalable Website C12009 1
1. What Do You Need to Know About
Creating and Running a Scalable
Web Site but Were Afraid to Ask?
Session Code: S304347
Chris Webster chris.webster@sun.com
Girish Balachandran girish.balachandran@sun.com
2. Learn some of the techniques we
used to create and build
zembly.com
2009 CommunityOne WEST Conference | san francisco, ca | developers.sun.com/events/communityone 2
5. Introduction - zembly.com
Browser based social application development
environment (wiki for applications)
•Content Management System for executing code
•Executable API repository
•many outgoing HTTP web API calls
•15% writes vs. 85% reads
Audience
•App Developers
•zembly App User – variable (one app has 30K+)
•API Provider – Major players (Flickr, Twitter) but extensible and
expanding
Concept > Stealth > Private Beta > Open Beta
launch
•only constant is change
6. Introduction – software supporting zembly
Development Tools
•NetBeans
•Subversion
•kenai.com
•Hudson
•Automated Testing (JUnit, Selenium)
Infrastructure
•Solaris
•Glassfish (Java EE 5, JPA)
•MySQL + Enterprise Monitor
•Apache
•Memcached
•AdventNET – Application Manager
•zenoss
7. Introduction – we are zembly
Engineering Team – Chris
•Platform
•Girish
•user interaction
Site engineering - Girish
•Chris
•monitoring, performance, DBA, etc.
•Operations
•data center operators - external
•system admin - external
9. Development process - overview
Agile development - 3-week sprints
Everyone can nominate work for the upcoming sprint
Planning meeting selects sprint deliverables
Engineers complete work
Build vetting
Weekly production deployment
10. Development process - planning
Strategic direction
Engineering discussions before nominations
Tasks must have
•Clear description, time estimate, and dependencies
•Account for demos, blogging, bugs, tests
No features added during planning
Load balancing for people
Retrospectives are important
11. Development process - code
Pair programming
Continuous Build
•compiler errors
•unit test failures
•JSLint – for JavaScript errors
•archive successful builds
12. Development process – tests
Deploy successful build for functional and automated UI
tests
•functional tests (blackbox testing)
•UI tests: JUnit and Selenium
Nightly build
•code coverage
•FindBugs
Nightly performance tests
•Repeatability
•needs historical data to detect trends
•model production traffic
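The point about needing historical data can be sketched as a simple check: compare the latest nightly result with a baseline built from earlier runs. This is only an illustration, not the zembly test suite; the function name `is_regression`, the 20% tolerance, and the sample timings are all assumptions.

```python
from statistics import median

def is_regression(history_ms, latest_ms, tolerance=0.20):
    """Return True if the latest nightly result is more than `tolerance`
    slower than the median of the earlier runs."""
    if not history_ms:
        return False  # no baseline yet, nothing to compare against
    baseline = median(history_ms)
    return latest_ms > baseline * (1 + tolerance)

# Hypothetical nightly response times (milliseconds) for one test scenario.
history = [120, 118, 125, 122, 119]
print(is_regression(history, 123))  # slightly above the baseline, within tolerance
print(is_regression(history, 160))  # far above the baseline
```

Using the median rather than the last run keeps one noisy night from hiding or faking a trend, which is why repeatable tests and stored history matter.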
13. Development Process – code lines
3 code lines
•trunk – must pass all automated tests
•staging – created from specific trunk version once a week.
•Deployed in clone of production
•exercise it for a week
•blockers fixed asap
•hackathon – one hour holistic testing
•the week keeps extending until this code line is perfected
•production – vetted production builds
•future P1 fixes go to the production branch and are pushed to production fast
14. Development process - experience
No nominations
Nominations without details
Growing backlog
Looooooonnnnnngggg planning meetings
Adding work in the middle of the sprint
No accounting for bug fixing, demos, tests ...
Regressions where we don't have tests
Urgent fixes needed in production
Users finding problems before monitoring
16. [Architecture diagram]
Load Balancer
•fronts zembly.COM, zembly.NET, and wiki/blog/forum
Static Content
•zembly.COM
•zembly.NET
•forum, wiki, blog
Dynamic Content
•z.COM Dynamic Site Content (2 instances)
•z.NET Dynamic Execution Content (2 instances)
•.COM/.NET Dynamic * Content
Dynamic Data Cache
•Memcached
Database
•Master (write)
•Read Slave (3 instances)
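The database tier in this diagram sends writes to the master and reads to the read slaves. A minimal sketch of that routing follows, assuming a hypothetical `ConnectionRouter` class and plain strings standing in for real database connections; the slides do not say how zembly implemented the split, and round-robin is just one possible policy.

```python
import itertools

class ConnectionRouter:
    """Route writes to the master and spread reads across the read slaves."""

    def __init__(self, master, slaves):
        if not slaves:
            raise ValueError("at least one read slave is required")
        self.master = master
        self._slaves = itertools.cycle(slaves)  # simple round-robin over slaves

    def for_write(self):
        # Every write goes to the single master.
        return self.master

    def for_read(self):
        # Each read goes to the next slave in turn.
        return next(self._slaves)

# Hypothetical host names; real code would hold connection pools instead.
router = ConnectionRouter("master-db", ["slave-1", "slave-2", "slave-3"])
print(router.for_write())
print([router.for_read() for _ in range(4)])
```

With a read-heavy workload like the 85% reads mentioned in the introduction, adding slaves increases read capacity without touching the write path.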
17. zembly database setup
[Diagram]
Master DB
•all writes go here
•Primary Master – local HD, RAID
•Standby Master – local HD, RAID
•replication between Primary Master and Standby Master
Backup
•Backup Slave
•1 backup / 10 mins
•MySQL dump and/or ZFS snapshot
•backup targets: Thumper (ZFS), Thumper, Tape
Slave DB 1, 2, 3 ... n
•fed by replication
•local HD, RAID1 each
18. Architecture - experiences
Perpetual architecture
•start simple and grow organically
•development and maintenance in line with site traffic
•monitor site traffic always
•keep game plan for the next level ready
Have exact same setup in staging
•could be a subset of production
•experimentation, debug, test, etc.
•changes run in staging for at least 2 weeks before moving to production
Monitoring built in to each layer of the stack
20-24. [Architecture diagram from slide 16, repeated as a build sequence; slides 22 and 24 add a "deploy!" label]
25. Architecture & Deployment - deployment
Rolling deployments
•automated homegrown script
•turn off monitoring
•no user impact during deployment
•firewall block of port
•localhost URL still accessible
•deployment verified thru localhost URL
•versioning of webserver static content
•special care for db structural updates
•changes are versioned
•existing table definition changes need more than 1 deployment
•reason?
•Glassfish restart takes up most of the deployment time
•parallelize deployment after first server
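The rolling deployment steps above can be sketched as follows. This is not the homegrown zembly script; `deploy_one`, `rolling_deploy`, `RecordingActions`, and the server names are hypothetical, and the stand-in actions only record calls instead of touching a firewall or app server. Turning monitoring off and on is left out for brevity.

```python
from concurrent.futures import ThreadPoolExecutor

def deploy_one(server, actions):
    """Deploy to one server while it is hidden from users."""
    actions.block_port(server)            # firewall blocks the public port
    actions.deploy(server)                # install the build, restart the app server
    if not actions.verify_local(server):  # check through the localhost URL
        # The port stays blocked, so a broken server never gets user traffic.
        raise RuntimeError(f"verification failed on {server}")
    actions.unblock_port(server)
    return server

def rolling_deploy(servers, actions, max_parallel=2):
    """Deploy to the first server alone, then to the rest in parallel."""
    if not servers:
        return []
    done = [deploy_one(servers[0], actions)]  # stop here if the first one fails
    with ThreadPoolExecutor(max_workers=max_parallel) as pool:
        # At most `max_parallel` servers are out of rotation at any time.
        done.extend(pool.map(lambda s: deploy_one(s, actions), servers[1:]))
    return done

class RecordingActions:
    """Stand-in that only records calls, for illustration and testing."""
    def __init__(self):
        self.log = []
    def block_port(self, server):
        self.log.append(("block", server))
    def deploy(self, server):
        self.log.append(("deploy", server))
    def verify_local(self, server):
        self.log.append(("verify", server))
        return True
    def unblock_port(self, server):
        self.log.append(("unblock", server))

actions = RecordingActions()
print(rolling_deploy(["app1", "app2", "app3"], actions))
```

Deploying the first server on its own catches a bad build before it spreads; parallelizing the rest shortens the total time, which matters when the app server restart dominates each step.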
27. Availability – experiences
differences between staging and production
hardware failure
unmonitored service
late night changes
human errors
data center problems
using services without SLA
lack of documentation
lack of backup (human, infrastructure)
28. Availability - failover
Failover
•reduce the effect on users
•unknown SLA -- have a workaround ready (hopefully automated)
•redundancy everywhere (including humans)!
•also helps horizontal scaling
•document and test the doc
•must be solid and dependable
•gives you a good night's sleep!
•get professional help
•mysql support
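One way to read "have a workaround ready (hopefully automated)" for a service with an unknown SLA is a wrapper that retries the call and then falls back to something local, such as a cached response. The sketch below assumes a hypothetical helper `call_with_fallback`; the retry count and the fallback behavior are illustrative choices, not something the slides specify.

```python
def call_with_fallback(primary, fallback, retries=2):
    """Try `primary` up to `retries` times, then hand the last error to `fallback`."""
    last_error = None
    for _ in range(retries):
        try:
            return primary()
        except Exception as exc:  # broad on purpose: any failure triggers the workaround
            last_error = exc
    return fallback(last_error)

def broken_service():
    # Hypothetical external API call that is currently failing.
    raise ConnectionError("service unavailable")

print(call_with_fallback(broken_service, lambda err: "cached response"))
print(call_with_fallback(lambda: "live response", lambda err: "cached response"))
```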
29. Availability - backups
Backup, backup and double backup
•users are forgiving about site outages, but not data loss
•consider data loss from software defects & human errors
•DB - mysql dump, mysql ZFS snapshot
•htdocs - filesystem replication
•ZFS snapshot & replication
•rsync works just fine also
•be aware of your restoration time
•mysql dumps are slower but always give you consistent data
•watch out for mysql settings that might prolong crash recovery
•innodb_log_file_size
•practice your maintenance mode
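Backups only help if they are actually being taken. A small check like the one below flags backup sources whose newest copy is too old. The function `stale_backups`, the source names, and the timestamps are hypothetical; the 10-minute threshold is borrowed from the "1 backup / 10 mins" note on the database setup slide.

```python
from datetime import datetime, timedelta

def stale_backups(latest_backup_times, now, max_age=timedelta(minutes=10)):
    """Return the names of backup sources whose newest backup is older than max_age."""
    return sorted(
        name
        for name, taken_at in latest_backup_times.items()
        if now - taken_at > max_age
    )

# Hypothetical timestamps of the most recent backup per source.
now = datetime(2009, 6, 1, 12, 0)
latest = {
    "mysql-dump": datetime(2009, 6, 1, 11, 55),
    "zfs-snapshot": datetime(2009, 6, 1, 11, 40),
    "htdocs-rsync": datetime(2009, 6, 1, 11, 58),
}
print(stale_backups(latest, now))
```

A check like this would typically feed the same alerting used for the rest of the site, so a silently failing backup job is noticed before a restore is needed.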
30. Availability – monitoring
Monitor everything
•appserver, DB, firewall, apache, even monitor the monitor!
•mysql monitor
•homegrown scripts
•log file monitors
•Adventnet
Use datacenter monitoring – if available
•advantages – call home for issues
•first level support for hardware should be 24x7 live and online
•human to troubleshoot your problem as soon as it occurs
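A log file monitor of the kind listed above can be very small. The sketch below assumes log lines that carry a level keyword such as SEVERE or ERROR and raises an alert once a threshold is reached; the pattern, the threshold, and the sample lines are assumptions, not the homegrown zembly scripts.

```python
import re

# Assumed level keywords; adjust to whatever the real logs contain.
ERROR_PATTERN = re.compile(r"\b(SEVERE|ERROR)\b")

def scan_log(lines, threshold=3):
    """Return (alert, matches): alert is True when at least `threshold` lines match."""
    matches = [line.rstrip("\n") for line in lines if ERROR_PATTERN.search(line)]
    return len(matches) >= threshold, matches

# Hypothetical log excerpt.
sample = [
    "INFO request served\n",
    "SEVERE connection pool exhausted\n",
    "ERROR timeout calling external API\n",
    "INFO request served\n",
    "SEVERE connection pool exhausted\n",
]
alert, matches = scan_log(sample)
print(alert, len(matches))
```

In practice the function would be fed the lines appended since the last run, for example from an open file handle, and the alert would go to whoever is on call.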
31. Availability – problem prevention
Change review
•avoid late night changes
•Trust, but verify
•document changes -- pay it forward
•build up a failure-proof recovery manual
Watch your traffic
•collect all sorts of stats
•to start with, don't worry about coherency of data
Monitor & penalize users for malicious activities
Keep all infrastructure software security patches current
•specifically the web facing ones – wiki, blog, forum, etc
Run a security audit on your site
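For "Monitor & penalize users for malicious activities", one common building block is a per-user rate limiter. The sketch below uses a sliding window; the class name `SlidingWindowLimiter` and the limits are hypothetical, and the slides do not say which mechanism zembly used.

```python
from collections import defaultdict, deque

class SlidingWindowLimiter:
    """Allow at most `max_requests` per user within `window_seconds`."""

    def __init__(self, max_requests, window_seconds):
        self.max_requests = max_requests
        self.window = window_seconds
        self._hits = defaultdict(deque)  # user -> timestamps of accepted requests

    def allow(self, user, now):
        hits = self._hits[user]
        # Drop timestamps that have fallen out of the window.
        while hits and now - hits[0] >= self.window:
            hits.popleft()
        if len(hits) >= self.max_requests:
            return False  # over the limit: reject, log, or penalize the user
        hits.append(now)
        return True

limiter = SlidingWindowLimiter(max_requests=2, window_seconds=10)
print([limiter.allow("user-a", t) for t in (0, 1, 2, 10)])
```

Timestamps are passed in explicitly so the behavior is easy to test; production code would pass the current time, for example from `time.monotonic()`.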
33. Performance – “run Forrest run”...fast
Monitor your site traffic for performance numbers
Read from slave DB always
Memcached is your friend
DB query optimizations take you really far
Start with best practice configurations -- adjust
systematically based on live data
Know your OS!
Know your network!
Have an automated performance test suite
Expert help [blog, support, ...]
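"Read from slave DB always" and "Memcached is your friend" combine naturally into a cache-aside read path: check the cache, fall back to a slave on a miss, and invalidate on write. In this sketch a plain dict stands in for a Memcached client, and `get_cached`, `update`, and the loader/writer callables are hypothetical names.

```python
def get_cached(cache, key, load_from_slave):
    """Return the value for `key`, reading from a slave only on a cache miss."""
    value = cache.get(key)
    if value is None:
        value = load_from_slave(key)  # read path never touches the master
        cache[key] = value
    return value

def update(cache, key, value, write_to_master):
    """Write to the master, then invalidate the cached copy."""
    write_to_master(key, value)
    cache.pop(key, None)

# Stand-ins: a dict as the cache and a dict as the database.
cache, database, slave_reads = {}, {"app:1": "v1"}, []

def load_from_slave(key):
    slave_reads.append(key)
    return database[key]

def write_to_master(key, value):
    database[key] = value

print(get_cached(cache, "app:1", load_from_slave))  # miss: goes to the slave
print(get_cached(cache, "app:1", load_from_slave))  # hit: served from the cache
update(cache, "app:1", "v2", write_to_master)
print(get_cached(cache, "app:1", load_from_slave), len(slave_reads))
```

One caveat this sketch ignores: with asynchronous replication a slave can briefly lag behind the master, so a read right after a write may still see the old value.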
34. What Do You Need to Know About
Creating and Running a Scalable Web
Site but Were Afraid to Ask?
Session Code: S304347
Chris Webster chris.webster@sun.com
Girish Balachandran girish.balachandran@sun.com