© 2018 Adobe Systems Incorporated. All Rights Reserved. Adobe Confidential.
Use of Self-Healing Techniques to Improve the Reliability of a Dynamic
and Geo-Distributed Ad Delivery Service
Nicolas Brousse and Oleksii Mykhailov
Adobe Advertising Cloud
Serving All Media Content Across
Any Screen in Any Format
BEFORE
RFP, IO, human based orders
NOW
Programmatic Ad Buying with
Real Time Bidding
Latency
<50ms @ 95th percentile
High Traffic
300 billion requests a day
Huge Datasets
Billions of objects to store
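To put the slide's figures in perspective, the following sketch converts 300 billion requests a day into an average per-second rate and checks a latency sample against the 50 ms 95th-percentile budget. The two constants come from the slide; the function names and the nearest-rank percentile method are illustrative assumptions, not a description of how Adobe measures latency.

```python
import math

REQUESTS_PER_DAY = 300e9        # from the slide: 300 billion requests a day
SECONDS_PER_DAY = 24 * 60 * 60
LATENCY_BUDGET_MS = 50.0        # from the slide: <50ms at the 95th percentile


def average_requests_per_second(requests_per_day: float) -> float:
    """Average rate only; real traffic has peaks well above the average."""
    return requests_per_day / SECONDS_PER_DAY


def percentile_nearest_rank(samples, pct: float) -> float:
    """Nearest-rank percentile (an assumed method, chosen for simplicity)."""
    if not samples:
        raise ValueError("samples must not be empty")
    ordered = sorted(samples)
    rank = max(1, math.ceil(pct * len(ordered) / 100.0))
    return ordered[rank - 1]


def meets_latency_budget(samples_ms, budget_ms: float = LATENCY_BUDGET_MS,
                         pct: float = 95.0) -> bool:
    """True when the pct-th percentile of the samples is under the budget."""
    return percentile_nearest_rank(samples_ms, pct) < budget_ms
```

At these numbers the average works out to roughly 3.47 million requests per second, which is why a per-request latency budget measured at a high percentile is the natural service-level target.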
Ad Content Delivered To Eyeball
Traditional Ad
Serving Implements
GeoDNS GSLB
Inconsistent GeoDNS Routing
High Latency From Eyeball To Content Origin
Origin Failures Impact User Experience
Impact Campaign Performance and Revenue
Relying on GeoDNS to Figure Out Eyeball Location is
UNRELIABLE
Optimal Route
Actual Route
TCP and TLS
Handshakes Impact
Latency
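The handshake cost can be estimated with simple round-trip arithmetic. A TCP handshake takes one round trip, a full TLS 1.2 handshake takes two more, and a full TLS 1.3 handshake takes one; the request and response then need one further round trip. The sketch below is a simplified model under those assumptions: it ignores server processing time, session resumption, TCP Fast Open, and TLS 1.3 0-RTT, and the RTT values used with it are hypothetical.

```python
# Round trips needed before the client can send the first request byte.
TCP_HANDSHAKE_RTTS = 1
TLS_FULL_HANDSHAKE_RTTS = {"1.2": 2, "1.3": 1}


def time_to_first_response_ms(rtt_ms: float, tls_version: str = "1.2") -> float:
    """Estimate time until the first response arrives on a new connection.

    Simplified model: connection setup round trips plus one round trip
    for the request and response, with zero server processing time.
    """
    setup_rtts = TCP_HANDSHAKE_RTTS + TLS_FULL_HANDSHAKE_RTTS[tls_version]
    return (setup_rtts + 1) * rtt_ms


# Hypothetical RTTs: a distant origin versus a nearby endpoint.
far_origin_ms = time_to_first_response_ms(100.0, "1.2")
nearby_ms = time_to_first_response_ms(10.0, "1.2")
```

Because every round trip is multiplied by the RTT, the distance between the eyeball and the endpoint that terminates the connection dominates the total on a fresh connection.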
Datacenter Blackout
Network Outage
Human Errors
Natural Disaster
High Risk of Origin Failures
Service Unavailability
SOLUTION
Eyeball Traffic Accesses Content via Smart Edges
Smart Edges Are
Anycast POPs That
Manage Failover and
Self-Healing
A Few Words About Anycast
§ Shortest Path Routing Means:
  § Not Latency Aware
  § Not Congestion or Packet Loss Aware
§ Limited Control for Traffic Steering
§ Difficult Troubleshooting
§ Failover Leads to Packet RST for Active Sessions
§ Mitigation: a Large, Well-Distributed Number of POPs
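The first point can be illustrated with a toy comparison. The sketch below reduces route selection to AS-path length alone, which is only one of several attributes real BGP best-path selection considers, and contrasts it with a choice made on measured latency. The POP names, path lengths, and latency figures are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PopRoute:
    pop: str              # hypothetical POP name
    as_path_length: int   # number of AS hops to reach the POP
    latency_ms: float     # measured round-trip time, invisible to routing
    packet_loss: float    # measured loss ratio, also invisible to routing


def shortest_path_choice(routes):
    """Pick the POP with the fewest AS hops (simplified stand-in for BGP)."""
    return min(routes, key=lambda r: r.as_path_length)


def latency_aware_choice(routes):
    """Pick the POP an application-level measurement would prefer."""
    return min(routes, key=lambda r: r.latency_ms)


routes = [
    PopRoute("pop-a", as_path_length=2, latency_ms=120.0, packet_loss=0.02),
    PopRoute("pop-b", as_path_length=3, latency_ms=25.0, packet_loss=0.0),
]
routed_to = shortest_path_choice(routes).pop   # "pop-a"
fastest = latency_aware_choice(routes).pop     # "pop-b"
```

In this example the network delivers traffic to `pop-a` even though `pop-b` is faster and lossless, which is the gap a larger and better-distributed set of POPs helps to narrow.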
Anycast POPs
Improve Latency
e.g. 3X Faster
Automate
Failover and Recovery
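The slides do not describe the mechanism, so the following is only a minimal sketch of one common way to automate failover and recovery: a health tracker that marks an origin down after several consecutive failed checks and marks it up again after several consecutive successes. The class name, thresholds, and origin names are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class OriginHealth:
    name: str
    fail_threshold: int = 3      # consecutive failures before failover
    recover_threshold: int = 2   # consecutive successes before recovery
    healthy: bool = True
    _fails: int = 0
    _successes: int = 0

    def record(self, check_passed: bool) -> None:
        """Update state from one health-check result."""
        if check_passed:
            self._successes += 1
            self._fails = 0
            if not self.healthy and self._successes >= self.recover_threshold:
                self.healthy = True    # self-healing: origin is back in rotation
        else:
            self._fails += 1
            self._successes = 0
            if self.healthy and self._fails >= self.fail_threshold:
                self.healthy = False   # failover: stop sending traffic here


def pick_origin(origins):
    """Return the first healthy origin in preference order, or None."""
    for origin in origins:
        if origin.healthy:
            return origin.name
    return None
```

Requiring consecutive results in both directions is a design choice that avoids flapping: a single failed probe does not trigger failover, and a single good probe does not send traffic back to an origin that is still unstable.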
Inject Failures In
Production To Validate
Smart Edge Behavior
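As a rough illustration of the idea, the sketch below simulates a failure-injection run: one region is taken down on purpose, requests are sent through an edge, and the run reports whether every request was still served and whether any reached the failed region. This is a toy in-process model with hypothetical class and region names, not the tooling used in the talk.

```python
class Region:
    def __init__(self, name: str):
        self.name = name
        self.up = True

    def serve(self, request_id: int) -> str:
        if not self.up:
            raise ConnectionError(f"{self.name} is down")
        return f"{self.name}:{request_id}"


class SmartEdge:
    """Tries regions in preference order and skips those that fail."""

    def __init__(self, regions):
        self.regions = list(regions)

    def handle(self, request_id: int) -> str:
        for region in self.regions:
            try:
                return region.serve(request_id)
            except ConnectionError:
                continue
        raise RuntimeError("no region available")


def run_failure_injection(edge: SmartEdge, victim: Region, requests: int = 1000):
    """Take `victim` down, send traffic, then always restore it."""
    victim.up = False
    try:
        served = [edge.handle(i) for i in range(requests)]
    finally:
        victim.up = True   # restore even if the experiment itself fails
    prefix = victim.name + ":"
    return {
        "served": len(served),
        "hit_victim": sum(s.startswith(prefix) for s in served),
    }
```

The `finally` clause matters in a production experiment: the injected fault is removed no matter how the run ends, so a broken test cannot leave a region disabled.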
Human Failover vs Self-Healing
Fig. 1 Human Failover with Manual Recovery steps
Fig. 2 Automated Failover and Self-Healing Recovery
[Figure labels: > 1h, Failure, < 15min, < 30min, Region, Traffic Rerouted, Self Recover]
Note: since the paper's publication, we have reduced automated failover time to less than a few seconds. See demo.
Injecting Complete
Data Center Failure
at the Regional Level
LIVE
Recorded Demo
IEEE ISSRE 2018 - Use of Self-Healing Techniques to Improve the Reliability of a Dynamic and Geo-Distributed Ad Delivery Service
