Scalability
Single Machine in a Garage
vs
Thousands of Clusters in Multiple Data Centers
Deepak Goyal
@WalmartLabs
Assumptions (that do not hold in practice)
• Network is reliable
• Latency does not exist
• Bandwidth is unlimited
• Network is secure
• Topology never changes
• Administrators are always available
• Network is homogeneous
CAP Theorem (Eric Brewer*)
Consistency
Every read receives the most recent write or an error
Availability
Every request receives a (non-error) response, without guarantee that it contains the most recent write
Partition Tolerance
The system continues to operate despite an arbitrary number of messages being dropped (or delayed) by the network between nodes
screw you and pick 2*
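The trade-off can be sketched with a toy replica. This is a minimal illustration, not a real datastore: the `Replica` class, the `mode` flag, and the `Unavailable` exception are all hypothetical names. During a partition, a CP replica returns an error rather than risk a stale read, while an AP replica answers with whatever it last saw.

```python
class Unavailable(Exception):
    """Raised by a CP replica that cannot confirm it has the latest write."""


class Replica:
    def __init__(self, mode):
        self.mode = mode          # "CP" or "AP" (hypothetical flag)
        self.value = None         # last value this replica has seen
        self.partitioned = False  # True while cut off from its peers

    def apply(self, value):
        # Replicated writes only arrive while the replica is reachable.
        if not self.partitioned:
            self.value = value

    def read(self):
        if self.partitioned and self.mode == "CP":
            # Consistency first: refuse to answer with possibly stale data.
            raise Unavailable("cannot reach peers")
        # Availability first (or no partition): answer with the local value.
        return self.value


def demo(mode):
    replica = Replica(mode)
    replica.apply("v1")
    replica.partitioned = True   # network partition begins
    replica.apply("v2")          # this write never arrives
    try:
        return replica.read()
    except Unavailable:
        return "error"
```

Here `demo("AP")` returns the stale `"v1"`, and `demo("CP")` returns `"error"`.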
Core Architecture of an application
1. Application or Web Server
• Application Server: responds to requests on any protocol
• Web Server: responds to requests primarily on HTTP/S
2. Database Server
• Responds to requests to manipulate data
Concepts
• Vertical Scaling (scale-up)
• Horizontal Scaling (scale-out)
• Vertical Partitioning
• Horizontal Partitioning
• Master-Slave and Master-Master
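The two partitioning styles can be contrasted on a small in-memory table. This is a minimal sketch under the usual definitions: vertical partitioning splits columns into separate tables that share a key, and horizontal partitioning splits rows across tables with identical columns. The function names and the `users` sample data are hypothetical.

```python
def vertical_partition(rows, key, column_groups):
    """Split columns: each group becomes its own table, all sharing `key`."""
    return [
        [{c: row[c] for c in (key, *columns)} for row in rows]
        for columns in column_groups
    ]


def horizontal_partition(rows, key, partitions):
    """Split rows: every partition keeps all columns but only some rows."""
    buckets = [[] for _ in range(partitions)]
    for row in rows:
        buckets[row[key] % partitions].append(row)
    return buckets


users = [
    {"id": 1, "name": "Ann", "email": "ann@example.com", "bio": "long text"},
    {"id": 2, "name": "Bob", "email": "bob@example.com", "bio": "long text"},
    {"id": 3, "name": "Cy", "email": "cy@example.com", "bio": "long text"},
]

# Frequently read columns in one table, bulky columns in another.
profile, details = vertical_partition(users, "id", [("name", "email"), ("bio",)])

# Rows split by id parity across two partitions.
even_ids, odd_ids = horizontal_partition(users, "id", 2)
```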
Master-Slave and Master-Master
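A minimal sketch of the master-slave arrangement, assuming the common setup where all writes go to the master, are copied to the slaves, and reads are spread across the slaves. The `Node` and `MasterSlaveCluster` classes are hypothetical, and replication is done synchronously here only to keep the example short. In a master-master setup every node would accept writes, which brings conflict resolution into play.

```python
class Node:
    def __init__(self, name):
        self.name = name
        self.data = {}


class MasterSlaveCluster:
    def __init__(self, master, slaves):
        self.master = master
        self.slaves = list(slaves)
        self._reads = 0  # counter used to rotate reads across slaves

    def write(self, key, value):
        # All writes go through the single master...
        self.master.data[key] = value
        # ...and are then copied to every slave (synchronously, for brevity).
        for slave in self.slaves:
            slave.data[key] = value

    def read(self, key):
        # Reads are served by the slaves in round-robin order.
        slave = self.slaves[self._reads % len(self.slaves)]
        self._reads += 1
        return slave.name, slave.data.get(key)
```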
Vertical Scaling vs Horizontal Scaling
Vertical Scaling
1. Processing power
• more cores
• more cache
2. Memory
• more RAM
• better RAM
3. More disk space
• more HDDs
• moving to SSDs
Horizontal Scaling
• Adding more (commodity/cheap) machines into the pool of resources.
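Scaling out can be sketched as a pool that hands requests to machines in rotation, so capacity grows by appending a machine rather than upgrading one. This is an illustrative sketch only: the `Pool` class and the `app-N` machine names are hypothetical, and a real load balancer would also handle health checks and uneven load.

```python
class Pool:
    def __init__(self, machines):
        self.machines = list(machines)
        self._counter = 0  # number of requests routed so far

    def add(self, machine):
        # Scale out: one more machine joins the pool of resources.
        self.machines.append(machine)

    def route(self):
        # Round-robin: pick the next machine in rotation.
        machine = self.machines[self._counter % len(self.machines)]
        self._counter += 1
        return machine


pool = Pool(["app-1", "app-2"])
first_three = [pool.route() for _ in range(3)]
pool.add("app-3")
next_three = [pool.route() for _ in range(3)]
```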
Failed Architectures*
If a design can be shown to fail on paper, it is BOUND TO fail in practice.
learn from the mistakes of others*
Failed Architecture
• A single machine with an app server and a DB server
Polyglot Persistence
Different kinds of data are best handled by different DB solutions
1. Key-Value
• Shopping Cart, Session Data
2. Document Store
• Completed Orders, Archival
3. RDBMS (SQL)
• Inventory Management
4. Graph Store
• Customer social graph
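The mapping above could be expressed as a small routing table that an application consults before persisting data. This is a sketch only: the kind names, the `STORE_FOR_KIND` table, and the `store_for` function are hypothetical, and the values are store categories rather than real client connections.

```python
# Hypothetical routing table: data kind -> store category from the list above.
STORE_FOR_KIND = {
    "shopping_cart": "key-value",
    "session": "key-value",
    "completed_order": "document",
    "archive": "document",
    "inventory": "rdbms",
    "social_graph": "graph",
}


def store_for(kind):
    """Return the store category for a data kind, or raise for unknown kinds."""
    try:
        return STORE_FOR_KIND[kind]
    except KeyError:
        raise ValueError(f"no store configured for {kind!r}") from None
```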
Sharding
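One simple way to shard is to hash each key and take the result modulo the number of shards. The sketch below assumes that scheme; `shard_for` and `moved_keys` are hypothetical names, and MD5 is used only as a stable, well-distributed hash, not for security. The second function counts how many keys land on a different shard when the shard count changes, which is the main weakness of modulo placement.

```python
import hashlib


def shard_for(key, shard_count):
    """Map a string key to a shard index in [0, shard_count)."""
    digest = hashlib.md5(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % shard_count


def moved_keys(keys, old_count, new_count):
    """Count keys whose shard changes when the shard count changes."""
    return sum(
        shard_for(key, old_count) != shard_for(key, new_count) for key in keys
    )


keys = [f"user-{i}" for i in range(1000)]
moved = moved_keys(keys, 4, 5)  # going from 4 to 5 shards relocates most keys
```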
Consistent Hashing (Caching)
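A minimal sketch of a consistent hash ring for a cache tier, assuming virtual nodes and MD5 as a stable hash. The `HashRing` class, its `replicas` parameter, and the `cache-x` node names are hypothetical, and the sketch assumes no two virtual points hash to the same position. Each key belongs to the first node point clockwise from the key's position, so adding a node only takes keys away from existing nodes and never shuffles keys among them.

```python
import bisect
import hashlib


def _hash(value):
    digest = hashlib.md5(value.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big")


class HashRing:
    def __init__(self, nodes=(), replicas=100):
        self.replicas = replicas  # virtual points per node
        self._points = []         # sorted hash positions on the ring
        self._owner = {}          # position -> node name
        for node in nodes:
            self.add(node)

    def add(self, node):
        for i in range(self.replicas):
            point = _hash(f"{node}#{i}")
            bisect.insort(self._points, point)
            self._owner[point] = node

    def remove(self, node):
        for i in range(self.replicas):
            point = _hash(f"{node}#{i}")
            self._points.remove(point)
            del self._owner[point]

    def node_for(self, key):
        if not self._points:
            raise LookupError("ring is empty")
        # First point clockwise from the key, wrapping around to the start.
        index = bisect.bisect(self._points, _hash(key)) % len(self._points)
        return self._owner[self._points[index]]


ring = HashRing(["cache-a", "cache-b", "cache-c"])
keys = [f"key-{i}" for i in range(1000)]
before = {k: ring.node_for(k) for k in keys}
ring.add("cache-d")
after = {k: ring.node_for(k) for k in keys}
moved = [k for k in keys if before[k] != after[k]]
```

Every key in `moved` now maps to `cache-d`; the rest stay where they were.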
