Scalability without going nuts

James Cox
Chief squirrel, smokeclouds
james@smokeclouds.com


what this is
( just an overview )


This is an overview of some of the areas I’ve focused on when investigating scalability.

There are no easy answers ‐ but hopefully these ideas will give you some directions for your own apps.

From something small comes something big ‐ I just made that up. We’re going to have fun making our apps work when there is more than one user, by looking at code, ops and more.

In particular we’ll wade through some language improvements and tips, some infrastructure planning tips, stuff to make MySQL better, and so on.

We’ll also touch on proxy/app servers and fileshares, and take some questions at the end, if we get that far.

I hope you’re all comfortable and mobile phones are all off, as I know we’re all busy people.

Right. Let’s begin.

The Language Performance Race

Rails isn’t fastest ( assembler is )


Rails isn’t the fastest ‐ that’s OK.

Life is about tradeoffs and compromise.

We pick Rails because it is easy and efficient to code in ‐ and we can refactor, scale and improve later. Or just buy more servers.

Refer to recent rants on Ruby perf etc...

Planning Trumps All ( even donald )


A bit of planning and process mapping will do more for your ability to scale than any later improvements, and usually rules out a rewrite if you’ve got the core of the project heading in the right direction.

Analyze
( don’t guess )


Once you have your planning arranged, don’t guess where performance is struggling ‐ get some actual numbers to benchmark against. To learn more, go watch the excellent peepcast httperf tutorial ‐ which I almost played instead of doing this talk!
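
A minimal sketch of getting a number instead of a guess, using Ruby's standard Benchmark library. The slow_lookup method and the best_time helper are made up for illustration; swap in whatever code path you suspect. For whole-request numbers, a tool like httperf pointed at a running server is the better fit.

```ruby
require 'benchmark'

# Hypothetical stand-in for the code path under suspicion.
def slow_lookup(n)
  (1..n).inject(0) { |sum, i| sum + i * i }
end

# Run the block several times and keep the best (least noisy) wall-clock time.
def best_time(runs = 5)
  (1..runs).map { Benchmark.realtime { yield } }.min
end

elapsed = best_time { slow_lookup(100_000) }
puts format('slow_lookup: %.4f s (best of 5)', elapsed)
```

Write the number down before you change anything, then measure again afterwards with the same input.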

Speed Perceived ( the easiest way )


There is always the “glamour” of making a high performance app which can handle all the requests 
you can possibly imagine. 

Not everyone can be a livejournal and actually make their servers push 98 MB/s on their 100MB 
network cards.

Find the areas of the app which the userbase perceives to be the slowest: it may be that you can make 
your app appear ‘faster’ by improving the UI/UX.

Work on these areas and then radiate outwards: it’s easier to refactor in chunks than as a whole 

(tangent: SOA architecture is not a bad idea....)

Focus on your app ( it’s usually cheaper )


So how can we make our app faster?

There are a number of techniques we can employ to make our apps better.  

Now to discuss some of them.....

Improving ActiveRecord:

:select, :limit, :offset
( take what you need )


‐ You don’t always need all that data. The problem hides itself when you are first building, but as you add data, having no limit/offset means you often end up grabbing too many rows.
‐ This is particularly important when using TCP connections to your database.
‐ Oftentimes an app is waiting for the data to transfer, so limit it to just the stuff you need.
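
To make that concrete: in Rails of this era the finder call looks roughly like Article.find(:all, :select => 'id, title', :limit => 10, :offset => 20), where Article is a hypothetical model. The sketch below is not ActiveRecord; sketch_sql is a made-up helper that only shows the shape of the SQL such options lead to, so you can see how much less you are asking the database to ship back. The exact SQL ActiveRecord emits depends on version and adapter.

```ruby
# Hypothetical helper (not part of ActiveRecord) showing the SQL shape you get
# from passing :select, :limit and :offset to a finder.
def sketch_sql(table, options = {})
  columns = options[:select] || '*'
  sql = "SELECT #{columns} FROM #{table}"
  sql += " LIMIT #{Integer(options[:limit])}" if options[:limit]
  sql += " OFFSET #{Integer(options[:offset])}" if options[:offset]
  sql
end

# Everything, every column: fine with 10 rows, painful with 10 million.
puts sketch_sql('articles')

# Only two columns, and only the page of rows actually being displayed.
puts sketch_sql('articles', :select => 'id, title', :limit => 10, :offset => 20)
```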


:include => :association
( keep it eager )


OK, so eager loading changes your query count from N+1 (one query for the parent rows, then one more per row for each association) to a single query.

Under the hood, this works by issuing a LEFT OUTER JOIN ‐ SQL for joining the tables together. An outer join keeps every row from the left‐hand table even when nothing matches on the other side; the unmatched columns come back as NULL.

High query counts are bad because they cause queueing for read/write on the table.
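
Here is a small simulation of the difference, with no database involved. FakeDb is a made-up in-memory stand-in that just counts how many "queries" it serves; in a real app the eager version would be something like Article.find(:all, :include => :author) with hypothetical Article and Author models.

```ruby
# In-memory stand-in for a database that counts every "query" it serves.
class FakeDb
  attr_reader :query_count

  def initialize
    @query_count = 0
    @articles = [{ :id => 1, :author_id => 10 },
                 { :id => 2, :author_id => 11 },
                 { :id => 3, :author_id => 10 }]
    @authors = { 10 => 'alice', 11 => 'bob' }
  end

  # One query for the parent rows.
  def all_articles
    @query_count += 1
    @articles
  end

  # One query per association lookup.
  def author(id)
    @query_count += 1
    @authors[id]
  end

  # One round trip returning articles already joined to their authors.
  def articles_with_authors
    @query_count += 1
    @articles.map { |a| a.merge(:author => @authors[a[:author_id]]) }
  end
end

lazy = FakeDb.new
lazy_names = lazy.all_articles.map { |a| lazy.author(a[:author_id]) }

eager = FakeDb.new
eager_names = eager.articles_with_authors.map { |a| a[:author] }

puts "lazy:  #{lazy.query_count} queries (1 for the rows + 1 per row)"
puts "eager: #{eager.query_count} query"
```

Both approaches return the same names; only the number of round trips changes, and the gap grows with every row you add.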


Model < CachedModel
( cache first, ask questions later )


So you’ve limited your query to the least amount of data necessary ‐ or you’re just looking up a single row. What next?

Cache your data in a fast retrieval store such as memcached. CachedModel is a nice ActiveRecord extension for this (even if it is a bit hairy).

N.b. this only works with simple ID‐based lookups ‐ for anything complex you need to use Cache.set and Cache.get.
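
For the "anything complex" case, the get/set dance usually ends up looking like the read-through sketch below. Everything here is an assumption for illustration: FakeCache is a Hash-backed stand-in for a memcached client (it only mimics get and set), and fetch_cached and the cache key are invented names, not part of CachedModel.

```ruby
# Hash-backed stand-in for a memcached client; only get/set are sketched.
class FakeCache
  def initialize
    @store = {}
  end

  def get(key)
    @store[key]
  end

  def set(key, value)
    @store[key] = value
  end
end

# Read-through lookup: return the cached value, or run the block (the
# expensive query), store its result and return it.
def fetch_cached(cache, key)
  cached = cache.get(key)
  return cached unless cached.nil?
  value = yield
  cache.set(key, value)
  value
end

cache = FakeCache.new
queries = 0
2.times do
  fetch_cached(cache, 'top_articles:page1') do
    queries += 1 # stands in for a complex find that hits the database
    ['first post', 'second post']
  end
end
puts "queries run: #{queries}"
```

The second call never touches the block. The hard part in real life is not this code but deciding when to expire the key.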


acts_as_cached
( built from experience )


A better alternative to CachedModel, but you have to add this as a method to a Model.

This is a bit more structured than CachedModel.
Built by CNET’s chow/chowhound team.


cache_fu
( in incubation )



@var ||= Model.find(...)
( keep your code dry )


Ever do a lookup ‐ current_user, current_page, or some other check ‐ that happens more than once in a request?

The ||= operator says: use the instance variable if it is already set, otherwise assign it from the query.
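
A runnable sketch of the idiom. RequestContext and find_user are made-up names; find_user stands in for the Model.find(...) call and counts how often it actually runs.

```ruby
class RequestContext
  attr_reader :lookups

  def initialize
    @lookups = 0
  end

  # Memoized: the expensive lookup runs on the first call only.
  def current_user
    @current_user ||= find_user
  end

  private

  # Hypothetical stand-in for Model.find(...).
  def find_user
    @lookups += 1
    { :id => 1, :login => 'james' }
  end
end

ctx = RequestContext.new
3.times { ctx.current_user }
puts "lookups: #{ctx.lookups}"
```

One thing to watch: ||= treats nil and false as "not set", so a lookup that legitimately returns nil will be repeated on every call.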


@@modulo ||= (52 % 100)
( run once, save forever )


@@ marks a class variable ‐ a quick way to compute a value once and keep it for the lifetime of the app process...
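
A small sketch of the run-once behaviour. The Pricing class and the computations counter are invented for illustration; the counter is only there to show that the right-hand side is evaluated a single time.

```ruby
class Pricing
  @@computations = 0

  # Computed on first use, then shared by every caller for the life of
  # the process.
  def self.modulo
    @@modulo ||= begin
      @@computations += 1
      52 % 100
    end
  end

  def self.computations
    @@computations
  end
end

3.times { Pricing.modulo }
puts "value: #{Pricing.modulo}, computed #{Pricing.computations} time(s)"
```

Remember that "lifetime of the app" means per process: each app server process in a cluster holds its own copy, so this suits values that never change rather than shared state.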

Improving ActionView:

template optimizer
( non-lazy views )


If you use semantic views ‐ Markaby, Builder ‐ or lots of helpers, you spend way too much time parsing the file just to get some HTML in the end...

link_to, image_tag, the form tags ‐ all helpers for HTML functions which are, honestly, for people who’ve gotten bored of writing HTML.

During each request the view rhtml has to be parsed and delivered ‐ this is expensive. It’s so expensive that in other languages, e.g. PHP, all the optimizers focus on serving up byte‐compiled scripts ‐ and this goes back to our first comment that assembler is faster.

So get your views back to the ‘compiled’ form and ditch those helpers early by optimizing your templates.

This should bring down the ‘Render’ part of the request log.


Publish Once
( caching always wins )


You’re going to have gotten your page load time to somewhat of an optimal level by now ‐ improving 
your database queries, and then pre‐compiling your templates.

Now consider if you can cache your pages.

Is this a highly trafficked content website? (caching is a must)
Can you get away with profile etc pages being cached till updated? (social networking site)


caches_page: bad
( nightmare to cleanup )


caches_page is the trick used to simply write out the entire page to disk... it can be tricky to keep up to date, and it is also hard work for a slow disk.

This also falls down if you have a loose URL schema: a site I’ve hacked on had about 500MB of content, but caches_page had generated 30GB of content. Why? Spiders will pervert your URL schema and cause it to generate waaaay too much content.

<% cache(:action => 'feature', :part => 'most_read') do %>
  <%= render :partial => 'article/most_read' %>
<% end %>


Drop a fragment cache into your view and save repetitive tasks.

It doesn’t yet work with robot‐coop’s memcache‐client as a fast store for fragments, but there is a memcache‐backed fragment store gem ‐ e.g. extended fragment cache.

Improving Sanity:

Follow Edge
( DHH Breaks Stuff )


Avoid Shared Hosting
( there’s only so much to go around )


When I was living at my family home, my brothers always used to share my stuff ‐ clothes, shower gel, 
aftershave ‐ you name it.

Same is true for server resources ‐ everyone’s gotta share.

Not all users play nice ‐ that crazy crawler on your box is taking up all the RAM and the spammer is getting you blacklisted.

Too many variables you can’t control ‐ VPS software is pretty harsh about setting process limits to save the box as a whole.

Underconfigured software ‐ every package is set up to work for everyone. Low performance: designed to encourage upgrades.

New Players
( always one )


SOME VPS hosts are getting it right ‐ Engine Yard, Rails Machine ‐ high‐performance focused servers

Expects trusted users ‐ won’t cater for the low‐end user

Expensive to buy into, low availability ‐ but often a worthwhile investment

Multiple Servers?
( work them hard )


One server or more?

It’s great if you have the infrastructure.... but do you know how to split them up?

Setup Hot
( universe is infinite )


There’s also performance in productivity ‐ it makes sense to mirror setups on each machine for hot‐backup as well as for predictability.

Capistrano will help you with this.

8 Server Gem
Proxy/Web Static (2)

Application Servers (4)

Database Layer (2)


It’s great if you have the infrastructure.... but do you know how to split them up?

Think of the shape of a ruby ‐ the top is a bit of a plateau, and that’s where you put static and proxy 
servers. You’ll want to load balance these for high availability ‐ but generally these scale very well as 
they don’t do much but route traffic and serve files.

The widest part ‐ those are your application servers, and you can grow these out to as many as you can imagine. This is your workhorse layer ‐ everything interesting happens here. Be careful you don’t have too many of these per proxy server ‐ if each proxy has too many to choose from, some of them can sit idle.

The bottom, hidden part is the best bit ‐ the database layer. This is a somewhat sacred layer: not many 
servers can play this part at once. Ensure you put your best machines at this level. You’re going to 
want to see high ram, good I/O throughput, lots of CPU power and plentiful disk space.

Playing Well Together
( there is only one sandpit )


So you’ve gotten your servers tagged up ‐ how do you assign them tasks?

With one of our clients, we had a situation where a mega busy ad‐server and a busy CMS were sharing the same database. It made sense to break them apart onto two servers ‐ the query stats backed that up.

... but we could put the admin, the front end app and the proxy servers on the same machines.

Why? Front end/admin work well together. Databases are heavy read/write, so two busy databases will fight/queue for file system access.

MySQL Tuning
( feed the beast )


OK, let’s cover some tips for getting MySQL to play nice.

Why MySQL over others? More for business reasons than tech ‐ it has a nice pathway to move on to a fully supported contract when you need it.

MySQL is also on the cusp of launching a really awesome NDB cluster ‐ this is basically a high availability memory store database which retains integrity via the standard server.

mysql> \s

mysql Ver 14.7 Distrib 4.1.19, for pc-linux-gnu (i686) using readline 4.3

Uptime: 10 hours 11 min 47 sec

Threads: 3  Questions: 10,171,505  Slow queries: 334  Opens: 224  Flush tables: 1  Open tables: 106  Queries per second avg: 277.100


This is a single machine, dual 2.4GHz xeon processor, hyperthreaded. 2GB RAM. Linux.

Yes it is possible to get some really high performance MySQL going ‐ you just need to get the settings 
right ‐ this is trial and error (mostly)

Had over a billion queries on an uptime of 60 days, but some ‘technician’ at the datacenter rebooted 
the wrong box. So I can’t show that off. shame!
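
If you want to sanity-check the status output above, the arithmetic is simple enough to redo by hand. This sketch only uses the three figures shown on the slide (uptime, Questions, Slow queries); the variable names are mine.

```ruby
# Figures copied from the status output above.
uptime_seconds = 10 * 3600 + 11 * 60 + 47
questions      = 10_171_505
slow_queries   = 334

# Average load, and the share of queries slow enough to be logged.
queries_per_second = questions.to_f / uptime_seconds
slow_ratio_percent = slow_queries.to_f / questions * 100

puts format('uptime: %d s', uptime_seconds)
puts format('queries/sec: %.1f', queries_per_second)
puts format('slow queries: %.4f%% of all questions', slow_ratio_percent)
```

The queries-per-second figure matches what MySQL reports, and the slow query share is a handy single number to track as you tune.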

# Query cache considered harmful: switch it off.
query_cache_size=0

# key_buffer_size is the size of the buffer used for index blocks.
key_buffer_size=100M

# The maximum size of one packet.
max_allowed_packet=1M

# The threshold (in seconds) above which a query counts as slow.
#long-query-time=3
# Where to log the slow queries.
log-slow-queries=/var/log/mysql_slow_queries


Some key variables I always have set...

The query cache is not always as useful as it seems ‐ OK for truly unoptimized, badly indexed stuff, not so good for when you need to manage the stack. Think of a logging table or a user table in a social network: when the data changes more quickly than the time it takes to create and query the cache, you’re in trouble.

It was also quickly written to make MySQL 4 less slow in response to a customer request.

key_buffer_size ‐ set it to as much spare RAM as you have ‐ this is the amount of memory MySQL will allocate for the buffer. If it has to keep allocating, it’ll do the work in chunks, which takes FOREVER.

The message buffer is initialised to net_buffer_length bytes, but can grow up to max_allowed_packet 
bytes when needed. Good if you’re passing around large objects such as images, articles, and so on ‐ set 
it high and forget about it (as long as your network can cope)

ALWAYS log slow queries ‐ and regularly check. This is your first port of call for optimizing your DB!!!

# if you use network (tcp) based connections

wait_timeout=90
net_write_timeout=180
net_read_timeout=60
max_connections=500

mysql> SHOW FULL PROCESSLIST; (for more info)


If your DB server is a different machine from your app server, it’s important to set these. Oftentimes I’ve seen setups where appservers are queuing due to long laggy timeouts and no available connections.

It’s OK to ditch AR
( DHH won’t get upset )


Sometimes it’s just simpler to drop out and craft a very focused query, use a stored procedure or function, use MySQL variables.... or force an index.

Just because you can’t do it in a #find doesn’t mean you shouldn’t do it (i.e. don’t sacrifice ultimate performance for manageability every time).

A good example, and not an easy one with standard AR: INSERT DELAYED is great for when you don’t need to know the id of the row inserted. Good for things like logs, stats etc.

Proxy > App
( warm up the pack, the engine’s running )


Best advice right now is to use nginx as a front end to a mongrel cluster (or two).

It’s very fast and scalable ‐ nginx is lightweight, and can handle upstream clusters with ease, as well as use fast onboard PCRE‐style regexes to handle different paths based on their needs.

mongrel, while not being the fastest in the pack, lets you scale out easily. Plus Zed is pretty clever, and he’ll fix stuff quickly.

Why use them? Lots of these ‘new’ http servers are more focused towards a smaller goalset ‐ they are 
designed to achieve one or two things. Apache HTTPD lets you embed almost any module imaginable 
in the chainset. It’s clear who’s going to be faster.

Event Driven?
( don’t presume your traffic )


You can use swiftiply and evented mongrel to move away from the high cost of threads. This is useful because Rails sits in one big loop for each request ‐ so tying up expensive threads waiting for your app to get done is not necessarily efficient. Perhaps try running it in an event loop.

I haven’t tried this yet in any kind of real‐world example ‐ but I’m really keen to see if it can scale (and stand up).

[Bar chart: Req/sec (mean) for nginx, litespeed, lighttpd(fcgi) and apache(fcgi). Bar values shown: 234, 220, 207 and 187; which bar belongs to which server is not recoverable from this export. Stats courtesy of http://blog.kovyrin.net/]


Clear alternatives if you aren’t scaling past one appserver ‐ these numbers are sort of indicative.

litespeed (a paid‐for product) has some nice numbers and an apparently easy‐to‐use interface ‐ a live tool for adding new lsapis on the fly.

lighttpd + apache: yes, straight FastCGI is good, but you can’t scale past four FCGI processes; mongrel can.

KeepAlive
( no point if you’re dead )


KeepAlive almost never works. 99% of the time, you’re going to benefit from just making your appserver/webserver ignore it. Most browsers now work around this to help improve perceived performance.

You can get the same kind of benefit by parallelizing your asset requests ‐ i.e. randomize from server1/server2 etc.

Edge rails supports this natively.
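
If you are not on edge, the idea is easy to sketch by hand. The hostnames and the asset_url helper below are assumptions for illustration, not the Rails implementation. Note one deliberate change from pure randomizing: the host is picked from a checksum of the path, so a given asset always lives at the same URL and the browser cache keeps working.

```ruby
require 'zlib'

# Hypothetical asset hostnames, all pointing at the same files.
ASSET_HOSTS = %w[assets0.example.com assets1.example.com
                 assets2.example.com assets3.example.com].freeze

# Pick a host from a checksum of the path, so the same asset always maps to
# the same hostname while different assets spread across all of them.
def asset_url(path, hosts = ASSET_HOSTS)
  host = hosts[Zlib.crc32(path) % hosts.size]
  "http://#{host}#{path}"
end

puts asset_url('/images/logo.png')
puts asset_url('/stylesheets/site.css')
```

Browsers limit parallel connections per hostname, so spreading assets over a handful of names lets more of them download at once.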

Hostname Lookup
( do not do this. ever. )


anything that interferes with the business of serving your webpage to the client is going to hurt your 
performance.

turn off hostname lookup, excessive logs, unused modules ‐‐ anything you really really don’t need.

make sure your apps are compiled to perform the best with your setup (except for MySQL where you 
should always use their compiled versions)

Do you use stats packages? Make sure the JS calls sit right before the closing </body> tag. These calls typically block, and the browser can’t do much till they return; placed last, you may get lucky and the browser will deal with complicated stuff like styles, or render the page to the screen, whilst waiting.

So be sure your stats package can handle your traffic before you stick it up there. (Hint: self‐installable 
stuff like mint can’t handle millions of hits per day without lots of hardware to support it)

Really bad stats?

perhaps use an async XMLHttpRequest to fire it, an IFrame or the onload handler....

NFS and Beyond
( sharing is good )


Are you pre‐caching on every server? Then use a shared file store!

It’s also easier to expire one store than many.

Be warned ‐ NFS traditionally hasn’t been known to scale as well as it could ‐ more recent versions are more performant.

Some NFS options you can turn off (you don’t always need to write, for example), and staying in sync is not always important: a small share you can just remount if it gets crazy.

Write over NFS
( be super efficient )


Zed pointed out this really brain‐dead simple efficiency. If you use NFS ‐ use it to write to your asset 
servers ‐ disk is cheap but the network tear down / start up is expensive. Don’t saturate your net card 
just passing data around again and again.

Always look for the simplest path.

MogileFS, NFS Clusters
( brainy sharing )


If you’re struggling under the load of lots of static assets (think YouTube or Flickr) and you can’t quite afford a network attached storage device with a petabyte of disk space, consider using up the many multi‐gigabyte disks you have in your servers!

Cluster up for NFS clusters (tricky but not impossible), where you can create a pseudo‐RAID over machines via software. Google for it.

Or use MogileFS and its HTTP DAV‐style API for grabbing your data chunks. RobotCOOP have a working library.

Tuning Recap
( were you listening? )


  1. Check for bottlenecks. Focus on perceived areas of slowness
  2. Improve by making users happy
  3. Look at your layout ‐ are your servers fighting for CPU/RAM time?
  4. Are you on a shared host and being kept in strict limits?
  5. Is your code optimal ‐ especially templates?
  6. Can you get more servers?
  7. Tuning your apps ‐ is the MySQL processlist showing lots of waiting queries?
  8. Are you running the most optimal HTTP setup?
  9. Is your cache causing you problems on the disk?
10. Attend one of our scalability talks ‐ starting in May. Ask the skillsmatter team here for more info.
11. Hire me.... or someone like me :)

Any Questions?


Resources -
talk: smokeclouds.com/scalability.pdf
me: smokeclouds.com :: imaj.es
blogs: brainspl.at :: blog.kovyrin.net :: caboo.se
app: mongrel.net :: litespeed.com
web: lighttpd.net :: nginx.net :: swiftcore.org
hosts: railsmachina.com :: engineyard.com
