Black Friday, Christmas, peak travel season—no matter what the event, there will come a time when your API infrastructure needs to meet increased business demand quickly. Without the proper, load-tested infrastructure in place, your IT team won't be able to meet the demand with the speed and agility required.
Learn how to prepare for high-traffic events. This webcast will get you started on a preparedness checklist and cover best practices to help you begin planning and testing.
We will discuss:
- planning for availability, scale, and capacity
- how to forecast load for peak events
- how to leverage analytics to detect when traffic is approaching full capacity
- where security fits in
Watch Video: http://youtu.be/EhBGRLGbzXY
Download Podcast: http://bit.ly/1FbQZJM
Apigee Edge – Nov 27th to Dec 1st, 2014
• Experienced zero downtime
• Supported 6 of the top 10 retailers
• Handled a 263% annual increase in API calls
• Managed over 8000 TPS peak loads for a single retailer
• Tested for 5x more capacity than required
• Proactively alerted customers of 50+ problems with their stack
https://pages.apigee.com/Black-Friday.html
Review, Optimize Existing APIs
Design
Scale, Redundancy, Reduced Latency with multi-region presence
Auto-scaling in the Apigee Public Cloud
Remove inefficiencies
Provide only what is needed right now - response pagination
JSON rather than XML
Fully leverage HTTP – e.g.: If-Modified-Since, compression
Improve/optimize the target servers, add capacity
Leverage out-of-the-box policies when possible
Re-evaluate the use of scripts, Java callouts
Relocate complex/custom logic to node.js
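The "fully leverage HTTP" item above can be illustrated with a conditional GET. This is a minimal sketch, not Apigee's implementation: the function name `conditional_get` and its return shape are hypothetical, and it assumes the server knows when the resource last changed. When the client's `If-Modified-Since` date is at least as recent as that timestamp, the server can answer 304 with no body instead of resending the payload.

```python
from datetime import datetime, timezone
from email.utils import format_datetime, parsedate_to_datetime


def conditional_get(last_modified, if_modified_since=None):
    """Return (status, headers) for a GET on a resource.

    `last_modified` must be a timezone-aware UTC datetime.
    `if_modified_since` is the raw request header value, if any.
    """
    headers = {"Last-Modified": format_datetime(last_modified, usegmt=True)}
    if if_modified_since:
        try:
            client_copy = parsedate_to_datetime(if_modified_since)
        except (TypeError, ValueError):
            client_copy = None  # unparseable header: send the full response
        if client_copy is not None:
            if client_copy.tzinfo is None:
                # Treat a date without an offset as UTC (assumption).
                client_copy = client_copy.replace(tzinfo=timezone.utc)
            # HTTP dates have one-second resolution, so drop microseconds.
            if last_modified.replace(microsecond=0) <= client_copy:
                return 304, headers  # client copy is current: no body needed
    return 200, headers


if __name__ == "__main__":
    changed = datetime(2014, 11, 26, 8, 0, tzinfo=timezone.utc)
    print(conditional_get(changed, "Wed, 26 Nov 2014 08:00:00 GMT"))
```

During a peak event, every 304 is a response body that the target server and the network did not have to carry.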
Add Caching
Response caching
Object caching
Take advantage of L1 and L2 caching
Cache both static and dynamic content
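To make the L1/L2 idea concrete, here is a rough sketch of a two-level lookup. It assumes L1 is a small, short-lived in-memory cache and L2 is a larger, longer-lived shared store; both are plain dictionaries here, and the class name `TwoLevelCache`, the TTL values, and the `load_from_target` callback are all hypothetical. It is not a description of how any particular product implements its cache.

```python
import time


class TwoLevelCache:
    """Look up L1 first, then L2, and only then call the target."""

    def __init__(self, l1_ttl=1.0, l2_ttl=60.0, clock=time.monotonic):
        self.l1 = {}  # key -> (expires_at, value); stands in for in-memory cache
        self.l2 = {}  # key -> (expires_at, value); stands in for a shared store
        self.l1_ttl = l1_ttl
        self.l2_ttl = l2_ttl
        self.clock = clock

    @staticmethod
    def _fresh(store, key, now):
        entry = store.get(key)
        if entry is not None and entry[0] > now:
            return entry
        store.pop(key, None)  # drop expired or missing entries
        return None

    def get(self, key, load_from_target):
        """Return (value, source) where source is 'L1', 'L2', or 'target'."""
        now = self.clock()
        entry = self._fresh(self.l1, key, now)
        if entry:
            return entry[1], "L1"
        entry = self._fresh(self.l2, key, now)
        if entry:
            # Promote to L1, but never outlive the L2 entry it came from.
            self.l1[key] = (min(now + self.l1_ttl, entry[0]), entry[1])
            return entry[1], "L2"
        value = load_from_target(key)  # cache miss: hit the backend once
        self.l1[key] = (now + self.l1_ttl, value)
        self.l2[key] = (now + self.l2_ttl, value)
        return value, "target"
```

The short L1 TTL keeps hot keys fast, while the longer L2 TTL shields the target server even after L1 entries have expired.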
Leverage API BaaS
Consider storing store, product, and inventory information in API BaaS
Take advantage of mobile features – location and push notifications – to optimize API use
Security Checklist
Confirm that all recent security vulnerabilities have been adequately addressed
Turn on SSL
Use OAuth
Multiplier - Assume 10x Increase in Traffic
Use multiplier & trailing 4 weeks average traffic as a baseline
Estimate both average as well as peak traffic
Consider industry trends
Base API Traffic Estimate on Prior 2-3 Years
Holiday promotions
Other end-of-year programs
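The multiplier approach above reduces to simple arithmetic. The sketch below assumes weekly average and weekly peak TPS figures are available from analytics; the function name `forecast_peak_event` and the sample numbers are hypothetical, and using the highest of the trailing four weekly peaks as the peak baseline is an assumption, since the slides only say to estimate both average and peak.

```python
from statistics import mean


def forecast_peak_event(weekly_avg_tps, weekly_peak_tps, multiplier=10):
    """Estimate event traffic from the trailing four weeks of data.

    weekly_avg_tps / weekly_peak_tps: one value per week, oldest first.
    multiplier: expected increase over baseline (the slides assume 10x).
    """
    if len(weekly_avg_tps) < 4 or len(weekly_peak_tps) < 4:
        raise ValueError("need at least four weeks of data")
    baseline_avg = mean(weekly_avg_tps[-4:])   # trailing 4-week average
    baseline_peak = max(weekly_peak_tps[-4:])  # assumption: worst recent peak
    return {
        "average_tps": baseline_avg * multiplier,
        "peak_tps": baseline_peak * multiplier,
    }


if __name__ == "__main__":
    # Made-up sample data: six weeks of average and peak TPS.
    print(forecast_peak_event([90, 100, 110, 120, 130, 140],
                              [300, 320, 310, 400, 380, 390]))
```

The result is only a starting point; prior-year holiday data and industry trends may justify a different multiplier.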
Leverage Analytics & Insights
Edge Analytics for prior years
Edge Analytics for current season
Insights for journey and predictive analytics
combine API data with other data to better understand trends
Ask questions – analyze potential issues
How does load affect SLA
How does load affect latency
How do spikes affect errors
How many new apps have been introduced
Are there apps that are inefficient
Why?
- API testing – like system testing – is a first class concern
- Load/Performance testing is used to understand system behavior in real-world situations
- Identify weaknesses in all layers
What?
Operational – test the test
- health checks
- SLA checks
- availability
Functional
- use case testing
- test based on how apps use APIs
Performance – Stress, Load, Soak, Spike
Stress: determine the breaking point
induce errors and determine how system degrades
know what system failure looks like
Load: determine effect of tolerable load
75-85% of stress test TPS
Effect on API latency
Effect on SLA
Soak: identify instability that occurs after extended use
75-80% of load test TPS
resource leaks
gradual performance degradation at millions of transactions (e.g.: inefficiencies in business function/algorithm, database design)
Spike:
vary between 75-125% of load test TPS
Observe behavior when traffic is not constant
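The percentages above chain together: the stress test finds the breaking point, the load test runs at a fraction of it, and the soak and spike tests are defined relative to the load test. A small sketch of that arithmetic follows; the function name `plan_tps_targets` and the default `load_fraction` of 0.80 (a point inside the 75-85% range) are illustrative choices, not part of the original slides.

```python
def plan_tps_targets(breaking_point_tps, load_fraction=0.80):
    """Derive load, soak, and spike TPS targets from the stress-test result.

    breaking_point_tps: TPS at which the stress test made the system fail.
    load_fraction: share of the breaking point used for the load test;
                   the slides suggest 75-85%.
    """
    if not 0.75 <= load_fraction <= 0.85:
        raise ValueError("load_fraction should be between 0.75 and 0.85")
    load = breaking_point_tps * load_fraction
    return {
        "load_tps": round(load),
        # Soak: 75-80% of load test TPS, held for an extended period.
        "soak_tps_range": (round(load * 0.75), round(load * 0.80)),
        # Spike: traffic varies between 75% and 125% of load test TPS.
        "spike_tps_range": (round(load * 0.75), round(load * 1.25)),
    }


if __name__ == "__main__":
    print(plan_tps_targets(10_000))
```

Note that with an 80% load fraction the top of the spike range equals the breaking point, so a spike test at that setting probes the edge of failure.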
Where?
- PROD is what matters
- test the PROD systems – performance, functional, operational
- and, test in both PROD and NON-PROD environments
When?
- operational/health check testing should be continuous
- know when PROD APIs are not working – SLA, errors
- functional tests should be continuous
- often failure of a single API negatively affects other APIs
- performance testing should be conducted
- at expected peak times (holiday season)
- whenever significant design/architectural changes have been made
Who?
- Operations
- DevOps – API developers
- Business teams
How?
- Tools – jmeter, Apache Bench, LoadUI/SoapUI, Nagios, etc.
And…
- test scripts are code
- use software engineering best practices
- manage them with version control
Operational and Functional Testing
health checks
SLA checks
- availability
In PROD because that’s what the customer sees
- PROD is what matters
Continuously
operational/health check testing should be continuous
- know when PROD APIs are not working – SLA, errors
functional tests should be continuous
- often failure of a single API negatively affects other APIs
By your DevOps/Operations Teams
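As an illustration of a continuous SLA check, the sketch below evaluates a batch of health-check probe results against availability and latency thresholds. Everything here is an assumption made for the example: the `Probe` record, the `sla_report` function, and the threshold defaults (99.9% availability, 500 ms at the 95th percentile) are hypothetical, and a real setup would collect probes with a monitoring tool such as those listed under "How?".

```python
from dataclasses import dataclass


@dataclass
class Probe:
    ok: bool            # did the health-check request succeed?
    latency_ms: float   # observed response time


def sla_report(probes, min_availability=0.999, max_p95_ms=500.0):
    """Summarize probes and list which SLA thresholds were breached."""
    if not probes:
        raise ValueError("no probes to evaluate")
    n = len(probes)
    availability = sum(p.ok for p in probes) / n
    latencies = sorted(p.latency_ms for p in probes)
    # Nearest-rank 95th percentile, computed with integer arithmetic.
    rank = (95 * n + 99) // 100
    p95_ms = latencies[rank - 1]
    breaches = []
    if availability < min_availability:
        breaches.append("availability")
    if p95_ms > max_p95_ms:
        breaches.append("latency")
    return {"availability": availability, "p95_ms": p95_ms, "breaches": breaches}
```

Running a check like this on a schedule against PROD, and alerting when `breaches` is non-empty, is one way to know quickly when an API stops meeting its SLA.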