When you’re building the next killer mobile app, how can you ensure that your app is both stable and capable of near-instant data updates? The answer: Build a backend! Siva Katir says that there’s much more to building a backend than standing up a SQL server in your datacenter and calling it a day. Since different types of apps demand different backend services, how do you know what sort of backend you need? And, more importantly, how can you ensure that your backend scales so you can survive an explosion of users when you are featured in the app store? Siva discusses the common scenarios facing mobile app developers looking to expand beyond just the device. He’ll share best practices learned while building backends at PlayFab and other companies. Join Siva to learn how you can ensure that your app can scale safely and affordably into the millions of concurrent users and across multiple platforms.
CAN YOUR MOBILE
INFRASTRUCTURE SURVIVE
1 MILLION CONCURRENT USERS?
Siva Katir
PlayFab, Inc
Mobile Dev + Test 2016
Don’t be your own worst enemy!
The Simpsons: Tapped Out launched by
EA in 2012
Backend was so unprepared for massive
loads of traffic it was pulled for FIVE
months for total redesign
Went on to become a huge and long-
lasting hit in the market for many years
afterwards
Can your company afford to add an
extra 5 months to the development
cycle? Including lost marketing and
promotional spend? Including lost
mindshare? Including bad press?
Be your own guardian angel!
Loadout launched on Steam by
Edge of Reality
500x increase in players overnight
on being featured in Steam store
EC2 auto-scaling instantly brought up
atomic, replaceable servers to
handle the load
No downtime, no panic, no fires
DO YOU EVEN NEED A BACKEND?
Maybe! Maybe not!
What can my backend do for me?
Push updates without going
through full certification process
• New artwork? No problem!
• Message of the day!
• In-app purchase promotions!
Improve customer service
• Have an authoritative source for
what a client ‘has’
• Direct access to grant entitlements
to remediate issues
What can my backend do for me?
Support a single user across multiple
devices
• Recover a user’s session even if they
lose or replace their device
• Continue the same session across
multiple devices (phone to tablet)
Perform ‘trusted’ transactions
(especially around receipt verification)
• Clients are untrustworthy!
• Client-to-Provider transaction can only
say if a receipt is valid, NOT if a receipt
is valid for your app
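The receipt point above can be sketched as a server-side check. This is a minimal illustration, assuming the store's validation response has already been fetched and parsed into a dict; the field names (`status`, `bundle_id`, `product_id`, `transaction_id`) and the `ReceiptChecker` class are hypothetical, not any provider's actual schema. The replay check is an extra precaution that goes beyond what the slide states.

```python
from dataclasses import dataclass, field
from typing import Set, Tuple


@dataclass
class ReceiptChecker:
    """Server-side check that a store-validated receipt belongs to this app."""
    expected_bundle_id: str
    known_products: Set[str]
    seen_transactions: Set[str] = field(default_factory=set)

    def check(self, provider_response: dict) -> Tuple[bool, str]:
        # Step 1: the provider says the receipt itself is genuine.
        if provider_response.get("status") != "valid":
            return False, "provider rejected receipt"
        # Step 2: genuine is not enough; it must be a receipt for *our* app.
        if provider_response.get("bundle_id") != self.expected_bundle_id:
            return False, "receipt is for a different app"
        if provider_response.get("product_id") not in self.known_products:
            return False, "unknown product"
        # Step 3 (assumption, not from the slides): refuse reused transactions.
        txn = provider_response.get("transaction_id")
        if not txn or txn in self.seen_transactions:
            return False, "missing or already-used transaction"
        self.seen_transactions.add(txn)
        return True, "ok"
```

Only after `check` returns `True` would the backend grant the entitlement; the client is never asked whether the purchase succeeded.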
Know Your Project
What is your budget?
• What does it cost to host?
• What does it cost to run?
Who are your engineers?
• Do you have the in-house expertise to
manage all services?
• DevOps? Backend? Full-Stack?
Front-End?
• Are they willing to be on-call 24x7?
What do you need to put in the cloud?
Why?
Know Your Data
What data are you storing?
• User data
• Group data
• Application data
How does each piece of data need to
be queried?
• Can all data be looked up by a key?
• Need to do arbitrary field queries?
Is the data read and/or write heavy?
How much data do you expect to store
per user?
BUILDING A BACKEND 101
Not taught in schools!
Pick a Cloud Provider
Is your language well supported in
your provider?
How much self management is
required for each service?
How well is scalability built in?
Do you have region requirements?
• European data protection laws
• Russia and China have special data
laws
Large Needs or Small Needs?
Database + basic CRUD APIs?
• AWS Lambda!
Complex data + user management?
• AWS Mobile or Azure Mobile
Services!
Highly custom requirements?
• Roll your own on a public cloud
(PROCEED WITH CAUTION!)
Storing and Retrieving Data
Know your databases’ strengths
• MySQL – Very easy to get started with and
widely supported
• MS-SQL – Powerful query engine and incredibly
performant
• MongoDB – Can query against arbitrary fields
• DynamoDB – Very easy scaling and fast random
access
Know their weaknesses too
• MySQL – very hard to scale
• MS-SQL – still pretty hard to scale
• MongoDB – very hard to scale correctly and
maintain data integrity
• DynamoDB – can only query against
predefined indexes cost effectively
Storing and Retrieving Data
Novel solutions to database shortcomings
• Use multiple databases to take advantage of their
individual strengths
• Example: Store “index” data in SQL, while using
DynamoDB for actual data storage which clients use
• Allows you to store all data without needing to scale
a difficult to scale database
Keys:
• Have a way to reliably update the SQL database out
of the user’s flow
• Don’t treat the SQL store as authoritative
• Some tools can make this entirely seamless, such as
using DynamoDB Streams and Lambda to
update SQL
SQL:
{
"playerId": "00001",
"purchaseId": 1002092,
"purchaseValue": 0.99,
"purchaseDate": "03/01/2016 09:01:05"
}
DynamoDB:
{
"playerId": "00001",
"purchaseId": 1002092,
"purchasedItems":
[{
"itemName": "in_app_1",
"purchasePrice": 0.99
}]
}
SELECT purchaseId, purchaseValue FROM
sqlPurchaseTable WHERE purchaseDate > '3/1/2016'
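The split between a queryable SQL index and the authoritative key-value record can be sketched as follows. This is only an illustration under stated assumptions: SQLite stands in for the SQL store, the change record is a simplified, already-deserialized dict (`eventName`, `newItem`) rather than the real stream event format, the item is assumed to carry a `purchaseDate`, and dates are stored as ISO 8601 text so that string comparison orders them correctly. The function names are hypothetical.

```python
import sqlite3


def make_index_db() -> sqlite3.Connection:
    """Create the non-authoritative SQL index used only for ad hoc queries."""
    db = sqlite3.connect(":memory:")
    db.execute(
        "CREATE TABLE sqlPurchaseTable ("
        " purchaseId INTEGER PRIMARY KEY,"
        " playerId TEXT NOT NULL,"
        " purchaseValue REAL NOT NULL,"
        " purchaseDate TEXT NOT NULL)"  # ISO 8601 text sorts chronologically
    )
    return db


def apply_stream_record(db: sqlite3.Connection, record: dict) -> None:
    """Upsert one change record into the index, outside the user's request flow."""
    if record["eventName"] not in ("INSERT", "MODIFY"):
        return  # deletions are ignored in this sketch
    item = record["newItem"]
    value = sum(i["purchasePrice"] for i in item["purchasedItems"])
    # INSERT OR REPLACE keeps the handler idempotent if a record is redelivered.
    db.execute(
        "INSERT OR REPLACE INTO sqlPurchaseTable VALUES (?, ?, ?, ?)",
        (item["purchaseId"], item["playerId"], value, item["purchaseDate"]),
    )
    db.commit()
```

Because the index is rebuilt from the change records, it can lag or be regenerated without affecting what clients read from the primary store.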
Plan For Failure
Design for the worst, hope for the best
• Any machine can go down at any time
• No machine should be ‘special’
If any machine can go down then any
machine can also be brought up
Architect in failure behavior both up
and down the stack
• DB times out?
• Web server disk fails?
• Third-party provider goes down?
http://gunshowcomic.com/648
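One way to build failure behavior into a call path is to retry a bounded number of times and then degrade to a fallback. The sketch below assumes that timeouts and dropped connections surface as Python's built-in `TimeoutError` and `ConnectionError`; the function name, its parameters, and the jittered backoff policy are illustrative choices, not something prescribed by the talk.

```python
import random
import time


def call_with_fallback(operation, fallback, attempts=3, base_delay=0.1, sleep=time.sleep):
    """Try `operation` a few times with jittered exponential backoff, then use `fallback`."""
    for attempt in range(attempts):
        try:
            return operation()
        except (TimeoutError, ConnectionError):
            if attempt < attempts - 1:
                # Wait a random time up to base_delay * 2^attempt so that many
                # clients do not retry in lockstep against a struggling service.
                sleep(random.uniform(0, base_delay * 2 ** attempt))
    # Every attempt failed: degrade gracefully instead of crashing.
    return fallback()
```

A typical use would pass a database read as `operation` and a cached or default value as `fallback`.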
COMMON PITFALLS
It’s a trap!
Saving Data
Remote != Local
Do:
• Save only changed data
• Save data in batches
• Prepare for connection failures
• Prepare for client failures
• Prepare for server failures
Don’t:
• Save on a timer (unless it’s retrying)
• Save duplicated data
• Expect it to work
• Assume that it worked
http://cloudtweaks.com/
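The do and don't lists above can be sketched as a small client-side save queue: it tracks only changed fields, sends them as one batch, and keeps them pending until the server confirms. `SaveQueue` and the `send_batch` callable are hypothetical names; the transport is assumed to return `True` only on confirmed success and to raise `ConnectionError` or `TimeoutError` on network failure.

```python
class SaveQueue:
    """Collects changed fields and sends them as one batch; keeps them on failure."""

    def __init__(self, send_batch):
        self._send_batch = send_batch  # callable(dict) -> bool
        self._saved = {}    # last state the server acknowledged
        self._pending = {}  # changes not yet acknowledged

    def set(self, key, value):
        if key in self._saved and self._saved[key] == value:
            # Identical to what the server already has: nothing to send.
            self._pending.pop(key, None)
        else:
            self._pending[key] = value

    def flush(self) -> bool:
        """Send all pending changes in one request; return True only if confirmed."""
        if not self._pending:
            return True
        batch = dict(self._pending)
        try:
            ok = self._send_batch(batch)
        except (ConnectionError, TimeoutError):
            ok = False
        if not ok:
            return False  # changes stay pending so the caller can retry later
        self._saved.update(batch)
        for key in batch:
            self._pending.pop(key, None)
        return True
```

The caller decides when to call `flush` (for example, at a level end or when the app is backgrounded) and retries when it returns `False`.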
Loading Data
Easy Wins
• Client:
• Pre-load data during idle times
• Cache locally
• Assume data can fail to be loaded
• Assume data can arrive corrupted or out of
order
• Assume it will load slowly
• If security matters, connect via SSL
• Don’t connect directly to the data store
• Server:
• Cache data that is OK to serve stale
• Design data schemas to make each request
perform as few queries as possible
• Design authorization so that it requires no extra
queries, or at least as few as possible
Easy Fails
• Trying to implement a custom SSL service
• Trying to be clever with caching
• Assuming anything will work on the first try
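For data that is OK to serve stale, a deliberately simple time-to-live cache is often enough. The sketch below is an assumption-laden illustration: `TTLCache`, its `loader` callable, and the injectable `clock` are hypothetical names, and it makes no attempt at invalidation or eviction, in keeping with the warning about being clever with caching.

```python
import time


class TTLCache:
    """Serve a cached value until it is older than ttl_seconds, then reload it."""

    def __init__(self, loader, ttl_seconds, clock=time.monotonic):
        self._loader = loader      # callable(key) -> value, e.g. a database read
        self._ttl = ttl_seconds
        self._clock = clock        # injectable so tests can control time
        self._entries = {}         # key -> (expires_at, value)

    def get(self, key):
        now = self._clock()
        entry = self._entries.get(key)
        if entry is not None and entry[0] > now:
            return entry[1]        # still fresh enough: no query issued
        value = self._loader(key)  # missing or expired: one query to the source
        self._entries[key] = (now + self._ttl, value)
        return value
```

A message of the day is a natural fit: a TTL of a minute means at most one lookup per minute per server, however many clients ask.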
Scalability
Don’t optimize early
• Actually know what your bottlenecks
are; most likely it is NOT string
handling!
• Run a realistic load test with a
profiler to get actual useful data
Don’t run blind
• Know your KPIs before launch
• Track your KPIs in real time via counters
with Datadog, CloudWatch, etc.
• Set up alerting to your DRI (directly responsible individual)
Scalability
Know what infrastructure to scale and when
• Data
• API servers
• Load balancers
Design to scale horizontally, not vertically
• All services should be stateless unless they
don’t need to scale with number of users
• Don’t assume a server will exist minute to
minute
Keep a safe capacity margin in your
infrastructure
• 50% is reasonable
• Know how long it will take to increase capacity
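The capacity margin and the time-to-scale question can be turned into simple arithmetic. This sketch reads "50%" as provisioning 50% more capacity than the expected peak, which is one possible interpretation of the slide; the function names, the requests-per-second units, and the linear growth model are all assumptions for illustration.

```python
import math


def servers_needed(peak_rps: float, rps_per_server: float, margin: float = 0.5) -> int:
    """Servers required to carry the expected peak plus a safety margin."""
    if rps_per_server <= 0:
        raise ValueError("rps_per_server must be positive")
    return max(1, math.ceil(peak_rps * (1 + margin) / rps_per_server))


def can_scale_in_time(current_rps: float, growth_rps_per_min: float,
                      capacity_rps: float, provision_minutes: float) -> bool:
    """True if new capacity can arrive before the current headroom is used up."""
    if growth_rps_per_min <= 0:
        return True  # load is flat or falling
    minutes_until_full = (capacity_rps - current_rps) / growth_rps_per_min
    return minutes_until_full > provision_minutes
```

For example, with an expected peak of 10,000 requests per second and servers that each handle 500, `servers_needed(10000, 500)` gives 30 rather than the bare minimum of 20.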
Managing Connections
Use connection pooling
Don’t try to outsmart your language’s
connection management
Making a connection has a cost!
Don’t re-invent a protocol if an existing
one will do
• HTTP is way easier to debug than
WebSockets
• WebSockets stream data way more
efficiently than HTTP
• Both are safer than using raw TCP
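To show why pooling avoids the per-request connection cost, here is a minimal sketch of the idea using only the standard library, with SQLite standing in for a real database. `ConnectionPool` is a hypothetical class written for illustration; in practice, in line with the advice above, the pool that ships with your driver or framework is the better choice.

```python
import queue
import sqlite3
from contextlib import contextmanager


class ConnectionPool:
    """Fixed-size pool: connections are opened once and then reused."""

    def __init__(self, connect, size):
        self.created = 0
        self._idle = queue.Queue(maxsize=size)
        for _ in range(size):
            self._idle.put(connect())  # pay the connection cost up front
            self.created += 1

    @contextmanager
    def connection(self, timeout=5.0):
        # Blocks until a connection is free; raises queue.Empty on timeout.
        conn = self._idle.get(timeout=timeout)
        try:
            yield conn
        finally:
            self._idle.put(conn)  # always return it, even if the caller raised


# Usage: ten requests share two connections instead of opening ten.
pool = ConnectionPool(lambda: sqlite3.connect(":memory:"), size=2)
for _ in range(10):
    with pool.connection() as conn:
        conn.execute("SELECT 1").fetchone()
```

The bounded size also acts as back-pressure: when every connection is busy, callers wait or time out rather than opening more connections than the database can handle.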