Optimera STHLM 2011 - Mikael Berggren, Spotify
1. Is it Web Scale?
Mikael Berggren
miken@spotify.com
2. Who is this old guy?
• Web Team Lead @ Spotify
• Started back in the glory days of 2000
• Only works for companies beginning with an S (Spray, Stardoll & Spotify)
4. Success in scale fail
• Choose by buzz / trends
• Use technology suitable for something completely different
• Web frameworks
• Don’t measure or look at graphs
5. “We’re using X and it scales”
• There is no magic solution that scales out of the box
• Choose the technology you’re familiar with
• Middle bird gets the worm
6. What is scaling?
• It’s about lying :)
• Take control of your code
• Push instead of pull
• Find bottlenecks and fix them
• Avoid SPoF (single points of failure) as much as you can
• Measure and analyze
7. Cache is king!
• Cache on multiple instances
• Cache as close to the final result as possible
• Memcache is good
• but flat files are even better
8. Using memcache
• Local & Global instances
• Global: For everything that needs to be distributed and synchronized
• Local: For everything else
• Make sure failovers work correctly
• Use getMulti when possible
• Have several small instances instead of one large one
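The local/global split and the getMulti advice above can be sketched roughly as follows. This is a minimal illustration, not the setup used at Spotify or Stardoll: `FakeMemcache` is a hypothetical in-memory stand-in for a real memcache client, and the `session:` prefix used to decide what counts as shared data is an assumption made for the example.

```python
class FakeMemcache:
    """Hypothetical in-memory stand-in for a memcache client."""

    def __init__(self, name):
        self.name = name
        self.store = {}
        self.round_trips = 0  # counts simulated network calls

    def set(self, key, value):
        self.store[key] = value

    def get_multi(self, keys):
        # One simulated round trip, no matter how many keys are requested.
        self.round_trips += 1
        return {k: self.store[k] for k in keys if k in self.store}


# Assumption: keys with these prefixes must be shared between web servers.
SHARED_PREFIXES = ("session:",)


def fetch(keys, local, global_):
    """Send shared keys to the global instance and the rest to the local
    one, using a single batched get_multi call per instance."""
    to_global = [k for k in keys if k.startswith(SHARED_PREFIXES)]
    to_local = [k for k in keys if not k.startswith(SHARED_PREFIXES)]
    found = {}
    if to_global:
        found.update(global_.get_multi(to_global))
    if to_local:
        found.update(local.get_multi(to_local))
    return found


local = FakeMemcache("local")
global_ = FakeMemcache("global")
global_.set("session:42", {"user": "miken"})
local.set("page:start", "<html>start</html>")
local.set("page:about", "<html>about</html>")

result = fetch(
    ["session:42", "page:start", "page:about", "page:missing"],
    local,
    global_,
)
```

Four keys are resolved with one call to each instance instead of four separate lookups; the missing key is simply absent from the result, so the caller can regenerate it.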
9. File cache
• Fast and reliable
• No 3rd party dependencies
• Used often => Really fast access
• Easy to scale
• Atomic updates
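One common way to get atomic updates for a file cache is to write the new content to a temporary file in the same directory and then rename it over the old one. The sketch below assumes that approach; the function name `write_atomic` and the file names are made up for the example, and the slides do not say how the update was actually implemented.

```python
import os
import tempfile


def write_atomic(path, data):
    """Replace the file at `path` with `data` so that readers see either
    the old content or the new content, never a half-written file."""
    directory = os.path.dirname(os.path.abspath(path))
    # The temp file must be on the same filesystem for the rename to be atomic.
    fd, tmp_path = tempfile.mkstemp(dir=directory, prefix=".tmp-")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as tmp:
            tmp.write(data)
            tmp.flush()
            os.fsync(tmp.fileno())  # make sure the bytes are on disk
        os.replace(tmp_path, path)  # atomic rename over the old file
    except BaseException:
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
        raise


with tempfile.TemporaryDirectory() as cache_dir:
    cached_page = os.path.join(cache_dir, "start.html")
    write_atomic(cached_page, "<h1>version 1</h1>")
    write_atomic(cached_page, "<h1>version 2</h1>")
    with open(cached_page, encoding="utf-8") as f:
        final_content = f.read()
    leftover_files = sorted(os.listdir(cache_dir))
```

Because the web server only ever reads a complete file, the cache can be regenerated in the background without locking.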
10. Web frameworks? No thanks!
• Never built for your specific needs. If they are, you’re damn lucky :)
• Hard to control data and request handling your way (plugins, modules)
• A lot of overhead for each request
11. Database scaling
• Do your homework on indexes and queries
• Test your knowledge on indexes and queries
• Use slave nodes for reads
• Horizontal sharding for segments of users
• Vertical sharding for user-data
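The routing ideas above can be illustrated with a small sketch. Everything named here is hypothetical: the shard names, the table-to-database mapping, and the modulo scheme are assumptions chosen for brevity, not a description of the databases behind the talk.

```python
import random

# Hypothetical horizontal shards: each holds a segment of the users.
USER_SHARDS = ["users_db_0", "users_db_1", "users_db_2", "users_db_3"]

# Hypothetical vertical split: different kinds of user data live in
# different databases.
TABLE_TO_DB = {
    "profiles": "profile_db",
    "messages": "message_db",
}


def shard_for_user(user_id, shards=USER_SHARDS):
    """Horizontal sharding: pick a shard from the user id."""
    return shards[user_id % len(shards)]


def db_for_table(table):
    """Vertical sharding: pick a database from the kind of data."""
    return TABLE_TO_DB[table]


def node_for_query(is_write, primary, replicas):
    """Send writes to the primary and spread reads over the read nodes."""
    if is_write or not replicas:
        return primary
    return random.choice(replicas)


shard_a = shard_for_user(42)
shard_b = shard_for_user(7)
write_node = node_for_query(True, "db-primary", ["db-read-1", "db-read-2"])
read_node = node_for_query(False, "db-primary", ["db-read-1", "db-read-2"])
```

A plain modulo is easy to reason about but makes adding shards painful, since most users move to a different shard when the count changes; a lookup table or consistent hashing avoids that at the cost of extra complexity.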
12. RDB vs. NoSQL
• “No” in NoSQL stands for “Not Only”
• Use the correct storage for your purpose
• Don’t do as Digg.com…
13. Scaling at Stardoll
• >3500 dynamic PV/s
• Horizontal sharding of user-data
• Pre-generation of content
• Measuring render-time for each page
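Measuring render time per page can be as simple as wrapping each render function in a timer. The decorator below is a sketch under that assumption; `timed`, `render_times`, and the page name are invented for the example and say nothing about how Stardoll collected its numbers.

```python
import time
from collections import defaultdict
from functools import wraps

# Collected render times in seconds, keyed by page name.
render_times = defaultdict(list)


def timed(page_name):
    """Decorator that records how long each call to a render function takes."""
    def decorator(render):
        @wraps(render)
        def wrapper(*args, **kwargs):
            start = time.perf_counter()
            try:
                return render(*args, **kwargs)
            finally:
                # Record the time even if rendering raised an exception.
                render_times[page_name].append(time.perf_counter() - start)
        return wrapper
    return decorator


@timed("start_page")
def render_start_page(user):
    return f"<h1>Hello, {user}</h1>"


html = render_start_page("miken")
samples = render_times["start_page"]
```

In a real setup the samples would be shipped to a graphing system rather than kept in memory, so that slow pages show up on a graph instead of in a user complaint.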
14. Scaling at Spotify
• We said “Bye Wordpress!” and the servers were happy again
• 95% of all content: flat files
• Dependencies on shared services
• Handle huge spikes in traffic
16. Recommendations
• Have a code standard everyone must follow!
• Great error handling and logging
• Possibility to disable features/functionality
• Try to make requests asynchronous
• Avoid race conditions
• Monitor, measure and analyze !important
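The recommendation to be able to disable features can be sketched as a simple kill switch. The names below (`DISABLED_FEATURES`, `feature_enabled`, the `recommendations` feature) are hypothetical, and a real system would probably load the set from configuration rather than keep it in a module variable.

```python
# Names of features that are currently switched off (hypothetical).
DISABLED_FEATURES = set()


def feature_enabled(name):
    """Return True unless the feature has been explicitly disabled."""
    return name not in DISABLED_FEATURES


def render_sidebar():
    """Render the sidebar, leaving out the expensive part when disabled."""
    parts = ["menu"]
    if feature_enabled("recommendations"):
        parts.append("recommendations")
    return " + ".join(parts)


before = render_sidebar()
DISABLED_FEATURES.add("recommendations")  # e.g. during a traffic spike
after = render_sidebar()
```

The page keeps working with the feature off, which makes it possible to shed load during a spike without deploying new code.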