InfoQ.com: News & Community Site
• 750,000 unique visitors/month
• Published in 4 languages (English, Chinese, Japanese and Brazilian Portuguese)
• Post content from our QCon conferences
• News 15-20 / week
• Articles 3-4 / week
• Presentations (videos) 12-15 / week
• Interviews 2-3 / week
• Books 1 / month
Watch the video with slide synchronization on InfoQ.com!
http://www.infoq.com/presentations/resiliency-website
http://www.infoq.com/presentations/nasa-big-data
Presented at QCon London
www.qconlondon.com
Purpose of QCon
- to empower software development by facilitating the spread of knowledge and innovation
Strategy
- practitioner-driven conference designed for YOU: influencers of change and innovation in your teams
- speakers and topics driving the evolution and innovation
- connecting and catalyzing the influencers and innovators
Highlights
- attended by more than 12,000 delegates since 2007
- held in 9 cities worldwide
From Instability to Resilience:
The Story of a Web Site
Richard Campbell
@richcampbell
  
Richard Campbell
• Background
  – First laid hands on a microcomputer in 1977, it’s been all downhill from there
  – Spent the last fifteen years helping companies scale software on a variety of platforms
• Currently
  – Architect and Consultant
  – Organizer of DevIntersection
  – Rabid Podcaster
  
Podcasts
• For .NET Developers
  – First published 2002
  – Two shows a week
  – 955 episodes in the archive
• For IT Pros
  – First published 2007
  – Once a week
  – 357 episodes in the archive
• For Tablet Developers
  – First published 2011
  – Once a week
  – 126 episodes so far
What is Resilience?

Resilience Definition
  
The Web Site Story
• Vertical market e-commerce system
• More than 200,000 discrete items
• Considered mature (version 3!)
• Busier than ever, but making less money
• Increasing tech support calls and complaints
  
Diagnostic Instrumentation
• Performance Monitor
  – CPU utilization
  – Requests/Sec
  – Requests Queued
  – .NET Heap Allocated
  
Initial Diagnosis
• CPU utilization high, not pinned
• 20-30 Requests/sec, steady
• Requests Queued jumps occasionally
• .NET Heap climbs to 800MB, then dumps
  – Memory leak?
• Worker process recycling every 20 minutes
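The sawtooth in these counters (heap climbs, then dumps) can be picked out mechanically from sampled values. The following is a minimal sketch, assuming the heap size is sampled at a fixed interval into a list. The function names, the 50% drop threshold, and the sample data are illustrative assumptions, not taken from the talk.

```python
from typing import List


def find_dumps(heap_mb: List[float], drop_fraction: float = 0.5) -> List[int]:
    """Return sample indexes where the heap fell by at least drop_fraction of the previous sample."""
    dumps = []
    for i in range(1, len(heap_mb)):
        previous, current = heap_mb[i - 1], heap_mb[i]
        if previous > 0 and (previous - current) / previous >= drop_fraction:
            dumps.append(i)
    return dumps


def average_interval(dumps: List[int], sample_minutes: float) -> float:
    """Average minutes between consecutive dumps; 0.0 if fewer than two dumps were seen."""
    if len(dumps) < 2:
        return 0.0
    gaps = [later - earlier for earlier, later in zip(dumps, dumps[1:])]
    return sum(gaps) / len(gaps) * sample_minutes


# Hypothetical samples taken every 5 minutes: climb to 800 MB, then dump.
samples = [200, 400, 600, 800, 200, 400, 600, 800, 200, 400]
dumps = find_dumps(samples)
interval = average_interval(dumps, sample_minutes=5)
print(dumps, interval)
```

With this made-up data the dumps land 20 minutes apart, which is the kind of regularity that points at a process recycle rather than random load.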
  
Taking Action
• Speed of response vs. comprehensive response
• Must have measurements before and after
• Addressing a memory leak with more memory!
  
First Actions & Results
• Switch to 64 Bit OS (but compile to 32 bit)
• Added 4GB of Memory (for a total of 8GB)
• Memory leak continued
  – Just took longer
  – Dump occurred at 2GB (maximum pool size)
  – Worker process recycling every two hours
• Complaint level drops dramatically
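The before and after measurements can be turned into an implied average growth rate, which makes "just took longer" concrete. This is a rough sketch only: it assumes the heap starts near zero after each recycle and that 2GB means 2048 MB. The function name is hypothetical.

```python
def growth_rate_mb_per_min(ceiling_mb: float, minutes_to_dump: float) -> float:
    """Average heap growth implied by hitting ceiling_mb after minutes_to_dump minutes."""
    return ceiling_mb / minutes_to_dump


# Before: heap dumped at 800 MB, recycling every 20 minutes.
before = growth_rate_mb_per_min(800, 20)
# After: heap dumped at 2 GB (taken as 2048 MB), recycling every two hours.
after = growth_rate_mb_per_min(2 * 1024, 120)

print(f"before: {before:.1f} MB/min, after: {after:.1f} MB/min")
```

The heap still grows without bound in both cases; the higher ceiling only delays the recycle. The two implied rates also differ, and the slides do not say why, so the numbers are best read as a reason to keep measuring rather than as an explanation.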
  
Character of the Web Site
• Accuracy
• Reliability
• Scalability
• Performance
• Where does Resilience fit?
  
Better Living through Hardware
• Adding more web servers
  – Spreading failure around
• Moving to the Cloud
  – Paying for failure by the hour
  
Increasing Understanding
• What is causing this memory leak?
  – Using .NET Memory Profiling to analyze
• Most memory consumed by cache
<foreshadow>
• Noticed large blocks of static memory (session objects)
</foreshadow>
  
When Cache Runs Amok
• Caching added to improve performance
• Cache objects being created around searches
• What are the chances of that cache object ever being used again?
• How do you expire a cache object?
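One common answer to the expiry question is a time-to-live on each entry. The slides do not say which policy the site used, so the sketch below is only an illustration. The class name, the injectable clock, and the 60-second lifetime are all assumptions.

```python
import time
from typing import Any, Callable, Dict, Optional, Tuple


class ExpiringCache:
    """Cache whose entries are dropped once they are older than ttl_seconds."""

    def __init__(self, ttl_seconds: float, clock: Callable[[], float] = time.monotonic) -> None:
        self._ttl = ttl_seconds
        self._clock = clock  # injectable so the behaviour can be exercised without sleeping
        self._entries: Dict[str, Tuple[float, Any]] = {}

    def put(self, key: str, value: Any) -> None:
        self._entries[key] = (self._clock(), value)

    def get(self, key: str) -> Optional[Any]:
        entry = self._entries.get(key)
        if entry is None:
            return None
        stored_at, value = entry
        if self._clock() - stored_at >= self._ttl:
            del self._entries[key]  # expired: remove it so the memory can be reclaimed
            return None
        return value

    def purge_expired(self) -> int:
        """Remove every expired entry, including ones nobody asks for again."""
        now = self._clock()
        expired = [k for k, (stored_at, _) in self._entries.items() if now - stored_at >= self._ttl]
        for key in expired:
            del self._entries[key]
        return len(expired)

    def __len__(self) -> int:
        return len(self._entries)


# Fake clock so the example is deterministic.
now = [0.0]
cache = ExpiringCache(ttl_seconds=60, clock=lambda: now[0])
cache.put("search:red shoes", ["item-1", "item-2"])
cache.put("search:blue hat", ["item-3"])
fresh = cache.get("search:red shoes")
now[0] = 61.0
stale = cache.get("search:red shoes")
purged = cache.purge_expired()
```

The explicit `purge_expired` matters here: entries that are never looked up again would otherwise never be evicted, which is exactly the situation a search cache creates.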
  
	
  
Instrumenting Cache
• Wrapped cache objects in a dictionary abstraction
• Kept counters for re-use
• Two hours of run time (one worker process recycle) results in 95% of cache objects never reused
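The wrapper described above can be sketched as a dictionary that keeps a read counter per entry. This is a minimal illustration, not the site's actual code. The class and method names are hypothetical, and the sample data is chosen only so the result matches the 95% figure.

```python
from typing import Any, Dict, Optional


class InstrumentedCache:
    """Dictionary-backed cache that counts how often each entry is read after being stored."""

    def __init__(self) -> None:
        self._values: Dict[str, Any] = {}
        self._hits: Dict[str, int] = {}

    def put(self, key: str, value: Any) -> None:
        self._values[key] = value
        self._hits.setdefault(key, 0)  # keep an existing counter if the key is overwritten

    def get(self, key: str) -> Optional[Any]:
        if key not in self._values:
            return None
        self._hits[key] += 1
        return self._values[key]

    def never_reused_fraction(self) -> float:
        """Fraction of stored entries that were never read back."""
        if not self._hits:
            return 0.0
        unused = sum(1 for count in self._hits.values() if count == 0)
        return unused / len(self._hits)


# Hypothetical workload: 20 distinct searches cached, only one ever requested again.
cache = InstrumentedCache()
for i in range(20):
    cache.put(f"search:{i}", [i])
cache.get("search:0")
fraction = cache.never_reused_fraction()
print(f"{fraction:.0%} of cache entries never reused")
```

Counting at the wrapper keeps the measurement out of the calling code, so the instrumentation can stay in place in production.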
  
	
  
Fixing Cache
• Don’t allow free form search cache
• Don’t let the users populate cache at all
• Treat the real problem
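One way to read the first two points is a cache that the application fills itself and that user requests can only read. The sketch below assumes a fixed set of keys known at startup (category pages, as a made-up example) and a hypothetical `load` function. Free-form searches are computed each time and never stored.

```python
from typing import Callable, Dict, Iterable, List


class PreloadedCache:
    """Cache filled once by the application; user requests can read it but never add to it."""

    def __init__(self, keys: Iterable[str], load: Callable[[str], List[str]]) -> None:
        self._load = load
        self._values: Dict[str, List[str]] = {key: load(key) for key in keys}

    def lookup(self, key: str) -> List[str]:
        cached = self._values.get(key)
        if cached is not None:
            return cached
        # Not a preloaded key (for example a free-form search): compute it, do not store it.
        return self._load(key)

    def __len__(self) -> int:
        return len(self._values)


calls: List[str] = []


def load_items(key: str) -> List[str]:
    """Stand-in for the real data access; records each call so reuse is visible."""
    calls.append(key)
    return [f"{key}-item"]


cache = PreloadedCache(["category:boots", "category:hats"], load_items)
boots = cache.lookup("category:boots")   # served from memory
cache.lookup("search:red boots 42")      # computed, not cached
cache.lookup("search:red boots 42")      # computed again
```

The cache size is now bounded by what the application chose to preload, so user behaviour can no longer grow it.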
  
Valuing Resiliency
• Nothing like failure to help determine value
• Owning your ROI
• Spending CapEx to reduce OpEx
  
Adding More Resiliency
• Bend further
  – Multiple servers provide redundancy
• Elasticity
  – Can we expand or shrink based on need?
• Better feedback
  – How do we know when we’re bent, when to expand and when to shrink?
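The expand-or-shrink question can be framed as a decision on a measured signal. The sketch below uses the average number of queued requests with two thresholds, so the pool does not flap between sizes. The function name, the thresholds, and the server limits are all assumptions for illustration; the talk does not give specific values.

```python
def desired_servers(
    current: int,
    queued_requests: float,
    expand_above: float = 50.0,
    shrink_below: float = 5.0,
    minimum: int = 2,
    maximum: int = 10,
) -> int:
    """Return the server count to run next, given the average queue depth."""
    if queued_requests > expand_above:
        return min(current + 1, maximum)
    if queued_requests < shrink_below:
        return max(current - 1, minimum)  # never drop below the redundancy floor
    return current  # between the two thresholds: leave the pool alone


print(desired_servers(current=3, queued_requests=80))  # busy: add a server
print(desired_servers(current=3, queued_requests=1))   # idle: remove a server
print(desired_servers(current=3, queued_requests=20))  # in between: no change
```

Keeping the minimum at two reflects the redundancy point: shrinking to a single server would give up the ability to bend.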
  
Clusters of Joy
• Successful failover is hardware, software & configuration together
• If it hasn’t been tested, it doesn’t work
• Routine failover is your friend!
  
The Plague of State
• Getting session out of the web server
• The battle of performance over scalability
• As little state as possible (and no less)
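Getting session out of the web server means any server can handle any request. A minimal sketch of that shape follows, with an in-memory class standing in for an out-of-process store (a database or cache server in practice). All class and method names are hypothetical.

```python
from typing import Dict, Optional


class SharedSessionStore:
    """Stand-in for an out-of-process session store shared by every web server."""

    def __init__(self) -> None:
        self._data: Dict[str, Dict[str, str]] = {}

    def load(self, session_id: str) -> Dict[str, str]:
        return dict(self._data.get(session_id, {}))  # copy, so callers cannot mutate the store

    def save(self, session_id: str, state: Dict[str, str]) -> None:
        self._data[session_id] = dict(state)


class WebServer:
    """Holds no per-user state of its own; everything goes through the shared store."""

    def __init__(self, name: str, store: SharedSessionStore) -> None:
        self.name = name
        self._store = store

    def add_to_cart(self, session_id: str, item: str) -> None:
        state = self._store.load(session_id)
        state["cart"] = item
        self._store.save(session_id, state)

    def read_cart(self, session_id: str) -> Optional[str]:
        return self._store.load(session_id).get("cart")


store = SharedSessionStore()
server_a = WebServer("a", store)
server_b = WebServer("b", store)
server_a.add_to_cart("session-1", "item-42")
cart_seen_by_b = server_b.read_cart("session-1")  # a different server sees the same session
```

The trade-off named in the slide shows up directly: every request now pays a round trip to the store, in exchange for servers that can be added, removed, or failed over freely.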
  
Instrumenting Production
• The only source of truth
• How can you build software that can test in production without impacting the user?
• Instrumentation driving features
  
Building Resiliency
• Resiliency is a journey, not a destination
• Know the character of your system
• Work from facts, not conjecture, to move the system forward
  

