AWS Meetup - Nordstrom Data Lab and the AWS Cloud

The Nordstrom Data Lab is building out an API that powers product recommendations for our customers online and beyond. Recommendo, our flagship product, was built from the ground up using Node.js and AWS in a little over three months. Since launch in November 2013, we've served up over three billion recommendations and survived Black Friday and Cyber Monday without breaking a sweat. We'll be sharing what we learned building and operating a high-traffic API on the AWS platform as a service, focusing on Node.js, Elastic Beanstalk, and DynamoDB. Additionally, we'll discuss some of the cultural challenges and opportunities presented when adopting the public cloud at a large corporate IT organization. In short, we believe there are tremendous advantages to be had for enterprises willing to make the leap to the cloud.

Published in: Technology
  1. Jason Wilson & David Von Lehman, presenting: AWS and the Nordstrom Data Lab
  2. Recommendo Overview
     • RESTful product recommendations API
     • Live on nordstrom.com in November
     • Service emails live in January
     • Lives in the AWS cloud – Elastic Beanstalk, DynamoDB, node.js
     • 3rd party rec vendors don’t tap into what is unique about Nordstrom or fashion
  3. By the Numbers
     • Over 4 billion recommendations served
     • >3 million API hits per day
     • 105 days between first commit and go-live (Aug 6 and Nov 19 respectively)
     • 5 servers with auto-scaling to 20 (turns out we don’t need them)
     • 90ms average request latency
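Two of the figures on the "By the Numbers" slide can be checked with simple arithmetic. The sketch below assumes the dates fall in 2013 (the abstract gives a November 2013 launch) and treats ">3 million API hits per day" as exactly 3,000,000 spread evenly over the day, which understates real peaks.

```javascript
// Sanity-check two figures from the "By the Numbers" slide.
// Assumption: both dates are in 2013, per the abstract's "launch in November 2013".
const MS_PER_DAY = 24 * 60 * 60 * 1000;

// Whole days between two UTC timestamps given in milliseconds.
function daysBetween(startUtcMs, endUtcMs) {
  return Math.round((endUtcMs - startUtcMs) / MS_PER_DAY);
}

// Average requests per second if the daily total were spread evenly.
function averagePerSecond(hitsPerDay) {
  return hitsPerDay / (24 * 60 * 60);
}

const firstCommit = Date.UTC(2013, 7, 6);  // months are zero-based: 7 = August
const goLive = Date.UTC(2013, 10, 19);     // 10 = November

console.log(daysBetween(firstCommit, goLive));      // prints 105
console.log(averagePerSecond(3000000).toFixed(1));  // prints 34.7
```

An average of roughly 35 requests per second is consistent with the slide's remark that the extra auto-scaling headroom turned out not to be needed.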
  4. 50/50 test against incumbent vendor
  5. How We Built It
     • Continuous integration and deployment from the first week
     • 90+ percent code coverage
     • Fewer moving parts == less to monitor, fewer ways for things to go wrong
     • Fully PaaS based to minimize sys admin responsibilities
     • How can we support this ourselves without carrying pagers?
  6. DynamoDB
     • Fully managed NoSQL database-as-a-service
     • Web API with SDK support for Python, Ruby, node.js, .NET, and Java
     • High performance queries, backed by SSD
     • Maintains predictable performance for data at any size through horizontal scale out
     • Auto replication across 3 availability zones
     • Need to understand data access patterns up front
     • Pay for only what you use/need – both storage and R/W throughput
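The "pay for R/W throughput" point is easier to reason about with a rough estimate. The sketch below uses DynamoDB's provisioned-capacity rule that one read capacity unit covers one strongly consistent read per second of an item up to 4 KB, with eventually consistent reads costing half. The item sizes, the 10x peak factor, and the idea that each API hit maps to one item read are assumptions for illustration, not figures from the talk.

```javascript
// Rough provisioned read-throughput estimate for a key-value read workload.
// Rule used: 1 read capacity unit (RCU) = one strongly consistent read per second
// of an item up to 4 KB; an eventually consistent read costs half as much.
const READ_UNIT_BYTES = 4 * 1024;

function readCapacityUnits(itemSizeBytes, readsPerSecond, eventuallyConsistent) {
  // Items larger than 4 KB consume one extra unit per additional 4 KB chunk.
  const unitsPerRead = Math.ceil(itemSizeBytes / READ_UNIT_BYTES);
  const total = unitsPerRead * readsPerSecond;
  return Math.ceil(eventuallyConsistent ? total / 2 : total);
}

// Hypothetical inputs: 2 KB items and a 10x peak over a ~35 req/s daily average.
const peakReadsPerSecond = 35 * 10;

console.log(readCapacityUnits(2048, peakReadsPerSecond, false)); // prints 350
console.log(readCapacityUnits(2048, peakReadsPerSecond, true));  // prints 175
console.log(readCapacityUnits(6000, 100, false));                // prints 200
```

Because item size feeds directly into the cost, this is one reason access patterns need to be understood up front: keeping each lookup to a single small item keeps the provisioned throughput low.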
  7. Node.js
     • JavaScript on the server atop the Google V8 engine
     • Asynchronous event loop makes it ideal for real-time data intensive applications
     • Vibrant open-source community around excellent npm package manager (50K+ packages)
     • Seeing increased adoption in enterprises including Wal-Mart, LinkedIn, PayPal, Dow Jones, Microsoft, New York Times
  8. JavaScript – Learn to Love It
     • No type checking, don’t find errors until runtime
     • Not classical OO
     • var keyword
     • Callback hell
     • Server debugging too hard
     • But wait...
     • Chrome and V8
     • Dynamic can be your friend
     • npm!
     • express, async, mocha
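The slide pairs "callback hell" with the async package as a remedy. The sketch below is a hand-rolled stand-in for the waterfall style of control flow that such libraries provide, so it runs without any dependencies; it is not the async library's code, and the `loadShopper`, `loadRecs`, and `format` steps are hypothetical.

```javascript
// Minimal waterfall helper: runs Node-style callback tasks in order, passing
// each task's results to the next one. A hand-rolled illustration only.
function waterfall(tasks, done) {
  let index = 0;
  function next(err, ...results) {
    if (err) return done(err);                           // stop at the first error
    if (index === tasks.length) return done(null, ...results);
    const task = tasks[index++];
    task(...results, next);
  }
  next(null);
}

// Hypothetical lookup steps; setImmediate stands in for real I/O.
function loadShopper(id, cb) { setImmediate(cb, null, { id, style: 'boots' }); }
function loadRecs(shopper, cb) { setImmediate(cb, null, shopper, ['sku-1', 'sku-2']); }
function format(shopper, recs, cb) { cb(null, `${shopper.id}: ${recs.join(', ')}`); }

// The steps read top to bottom instead of nesting three callbacks deep.
waterfall([
  (cb) => loadShopper('shopper-42', cb),
  loadRecs,
  format,
], (err, line) => {
  if (err) throw err;
  console.log(line); // prints "shopper-42: sku-1, sku-2"
});
```

Error handling also collapses to one place: any step that calls back with an error skips the remaining steps and goes straight to the final callback.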
  9. AWS Components
     • EC2 – Provides web-scale computing as a service.
     • ELB – Elastic Load Balancer. Routes incoming traffic to EC2 instances, scales up to meet demand.
     • Auto-scaling group – a logical collection of EC2 instances behind an ELB
  10. AWS Components
  11. Elastic Beanstalk
      • AWS PaaS – lightweight abstraction layer atop EC2/ELB with no additional costs
      • More transparent than Azure or Heroku
      • Supports Java, .NET, Python, Node.js, PHP, and Ruby
      • git push deployment
      • Auto-scaling group with custom triggers and auto applied config
      • Possible to configure the AMI including yum packages, environment variables, and more
      • Supports custom AMIs
      • Automated health checks
  12. Continuous Deployment
      • Development: git push to dev branch → Jenkins CI unit tests → git push to EB
      • Production: git pull dev → git checkout master → git merge dev → git push master → Jenkins CI unit tests → git push to EB (prod)
  13. Performance testing
      • Initial performance was poor.
      • Disable DNS caching when load testing against ELB.
      • Pre-warm ELB for higher upfront throughput
      • jmeter-ec2, bees with machine guns
  14. Early Perf results – YIKES! (chart axes: transactions per second; response time in seconds)
  15. Performance tuning
      • New Relic, Nodetime
        – Real-time performance monitoring of node runtime
      • node-mem-watch
        – Evented inspection of heap, gc events, leak events, and heap diffing
      • ssh into instances
  16. Real Performance
      • Pleasantly surprised
      • Average latency ~90ms
      • Dynamo response times <10ms
      • Handful of auto-scaling up and back events
      • One outage due to bad exception handling
  17. 400% 64%
  18. DynamoDB
  19. Lessons Learned / Pitfalls
      • True zero downtime deployment is difficult to achieve
      • Thoroughly explore the Elastic Beanstalk configuration options
      • Catch those errors – a rogue unhandled exception can bring it all down
      • Health checks that actually do something
      • Out of the box monitoring is pretty good
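Two of these lessons, catching errors and health checks that actually do something, can be sketched together in a plain Node request handler. This is a minimal illustration and not Recommendo's code: `createHandler`, `pingStore`, `getRecs`, and `fakeRes` are hypothetical names, and the dependencies are stubbed so the example runs without a network.

```javascript
// Sketch: contain handler errors, and make /health exercise a real dependency.
// All names are hypothetical; pingStore and getRecs take Node-style callbacks.
function createHandler({ pingStore, getRecs }) {
  function send(res, status, body) {
    res.statusCode = status;
    res.end(JSON.stringify(body));
  }

  return function handler(req, res) {
    try {
      if (req.url === '/health') {
        // Report healthy only if the data store answers, so the load balancer
        // stops routing to an instance that cannot actually serve requests.
        return pingStore((err) => {
          if (err) return send(res, 503, { status: 'unhealthy' });
          send(res, 200, { status: 'ok' });
        });
      }
      getRecs(req.url, (err, recs) => {
        if (err) return send(res, 500, { error: 'lookup failed' });
        send(res, 200, { recs });
      });
    } catch (thrown) {
      // A synchronous throw becomes a 500 for this one request instead of an
      // uncaught exception that takes down the whole process.
      send(res, 500, { error: 'internal error' });
    }
  };
}

// Demo with stubbed dependencies and a fake response object.
function fakeRes() {
  return { statusCode: 0, body: '', end(text) { this.body = text; } };
}
const handler = createHandler({
  pingStore: (cb) => cb(new Error('store unreachable')),
  getRecs: () => { throw new Error('bug in lookup'); },
});

const health = fakeRes();
handler({ url: '/health' }, health);
console.log(health.statusCode); // prints 503

const recs = fakeRes();
handler({ url: '/recs/123' }, recs);
console.log(recs.statusCode); // prints 500
```

The returned function has the `(req, res)` shape that `http.createServer` expects. Note that the `try`/`catch` only covers synchronous throws; errors raised later inside asynchronous callbacks still have to be reported through the callback's `err` argument.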
  20. Harness the Cloud (chart labels: On-Premise, IaaS, PaaS; % time; infrastructure; experience)
  21. Logging, Monitoring, Redundancy, Deployment Automation, High-Availability, Scalability, Iterative Development, Build to Experiment, Evolutionary Architecture, Change Tolerant, Frequent Releases, Small Teams; Agility vs. Industrial Strength; Security
  22. PaaS Venn Diagram (diagram labels: Robust Systems; Rapid Delivery; Platypus as a Service)
  23. Recommendo 2.0
      • SKU-based recommendations – size!
      • Truly personalized recs based on individual browse and purchase history
      • (Diagram labels: DynamoDB, Batch Recs, Real-Time Refinery, Scorer, Ingester, Redis, Streams)
  24. Additional AWS Services
      • ElastiCache and Redis
      • Elastic Beanstalk worker tiers
      • SQS
      • S3
  25. Wrap-Up
      • Recommendo – initial success, now building upon what we have learned
      • Node.js + DynamoDB + Elastic Beanstalk is a winning combination
      • Possible to out-perform an incumbent vendor solution in a competitively differentiating capability
      • Cloud and PaaS enable small teams to move quickly and deliver solid production-caliber systems
      • Incremental cost of “gold plating” steadily shrinking
      • Your company benefits when percent of resources devoted to core competency is maximized
  26. Thank you
      • Questions / comments?
      • @davidvlsea
      • ds@nordstrom.com
