SlideShare a Scribd company logo
1 of 52
Effective SOA
Lessons from Amazon, Google, and Lucidchart
By Derrick Isaacson
Can I get that
without the
bacon?
Said no one
ever
http://www.food.com/photo-finder/all/bacon?photog=1072593
http://baconipsum.com/?paras=1&type=all-meat&start-with-lorem=1
http://www.someecards.com/usercards/viewcard/MjAxMi03YWZiMjJiMTg3NDFhYTUy
Simplicity of Single Component Services
• I can’t remember if that getter function takes 100ns or 100ms. - Said no
engineer ever
• Should I try to model this server request as a “remote procedure call”?
• 6 orders of magnitude difference!
• My front-side bus fails for only 1 second every 17 minutes! - Said no
engineer ever
• 99.9% availability
• Our internet only supports .NET. - Said no engineer ever
• Do we need an SDK?
"A distributed system is at best a
necessary evil, evil because of the extra
complexity...
An application is rarely, if ever, intrinsically
distributed. Distribution is just the lesser of
the many evils, or perhaps better put, a
sensible engineering decision given the
trade-offs involved."
-David Cheriton, Distributed Systems Lecture Notes, ch. 1
Distributed System Architectures
Does it have to be “Service-oriented”?
http://upload.wikimedia.org/wikipedia/commons/d/da/KL_CoreMemory.jpg
Distributed Memory
RPC
<I’m>
<not>
<making>
<a>
<service>
<request>
<I’m>
<just>
<calling>
<a>
<procedure>
Distributed File System
mount -t nfs -o proto=tcp,port=2049 nfs-server:/ /mnt
Distributed Data Stores
• Replated MySQL
• Mongo
• S3
• RDS
• BigTable
• Cassandra
…
P2P
Streaming Media
Service-oriented Architectures
Social Bookmarking App
GET /profiles/123
GET /users/123
Calculate something
GET /users/123/permissions
If user can’t view profile
send 403
POST /eventFeed {new profile view}
GET /users/123/friends
GET /bookmarks?userId=123
GET /catalog/books?ids=1,3,10
Calculate something else
GET /bookmarks/trending
Send response
Lucidchart.com by Status Code
96.5%
2xx or
3xx
Lucidchart.com 1s+ Latencies
10.8%
> 1s
What Happened?!?
I though SOA was supposed to make my app better!
Simple SOA Availability
<98.7%
99.5%
99.8%
99.6%
.995 * .998 * .998 * .996 = 0.987
A distributed system is at
best a necessary evil
<98.7%
99.5%
99.8%
99.6%
The CAP Theorem
http://learnyousomeerlang.com/distribunomicon
The CAP Theorem1
• Safety – nothing bad ever happens
• Liveness – good things happen
• Unreliability – network dis-connectivity,
crash failures, message loss, Byzantine
failures, slowdown, etc.
• Consistency – every response sent to a
client is correct
• Availability – every request gets a
response
• Partition tolerance – operating in the
face of arbitrary failures
Consistency: Nothing Bad Happens
Assumption: Failures Happen
Availability Consistency
ResponseHandler<User> handler = new ResponseHandler<User>()
{
public User handleResponse(
final HttpResponse response) {
int status = response.getStatusLine().getStatusCode();
if (status >= 200 && status < 300) {
HttpEntity entity = response.getEntity();
return entity != null ? Parser.parse(entity) : null;
} else {
…
}
}
};
HttpGet userGet = new HttpGet("http://example.com/users/123");
User user = httpclient.execute(userGet, handler);
https://hc.apache.org/httpcomponents-client-4.3.x/examples.html
Works great to calculate a user!
GET /profiles/123
GET /users/123
Calculate something
GET /users/123/permissions
If user can’t view profile
send 403
POST /eventFeed {new profile view}
GET /users/123/friends
GET /bookmarks?userId=123
GET /catalog/books?ids=1,3,10
Calculate something else
GET /bookmarks/trending
Send response
Best Effort Availability -
Euphemism for not always available
Best Effort Consistency -
Euphemism for not always consistent
Google File System: relaxed consistency model
Throughput
Latency
Amazon Checkout
http://highscalability.com/amazon-architecture
“WOW
I really regret
sacrificing
consistency for
availability”
-said no amazon ever That’s $74 Billion
Hang Consistency!
Add
• Caching
• Timeouts
• Retries
• Guessing
• Anything!
Tip 1:
HTTP
Caching
Availability/Performance Consistency
Tip 2: HTTP Caching as Fallback
Tip 3: Retries
• Exponential backoffs & max retries
Tip 3: HTTP Caching Technologies
• Apache HttpComponents – HttpClient Cache
• Ehcache
• Redis
• Memcached
• CloudFront
• Akamai
• Berkeley DB
• AWS SNS (for notifying caches components of changes)
Segmenting Consistency and Availability
1. Data Partitioning
Shopping Cart
Warehouse Inventory DB
Segmenting
2. Operation Partitioning
Reads
Writes
Dynamo PNUTS&
Segmenting
3. Functional partitioning
User Service, Document Snapshots
Document Service
Segmenting
4. Hierarchical Partitioning
Leaves
Root
http://www.slashgear.com/google-data-center-hd-photos-hit-where-the-internet-lives-gallery-17252451/
Timeouts
Stop Guessing and Just Calculate It
• Max I/O wait time = # of threads * (CONNECT_TIMEOUT +
READ_TIMEOUT)
• 9 front end servers received 1900 requests in 60 seconds and 300
for Flickr resources (16%).
• 35 requests per server per minute
• Max 100 threads, => 6,000 thread seconds in one minute
• Goal: ensure < 10% of thread seconds spent blocked on Flickr I/O
• 600 < 35 requests * (CONNECT_TIMEOUT + READ_TIMEOUT)
• CONNECT_TIMEOUT + READ_TIMEOUT < 17 seconds
TCP Connect
Send
Request Block on socket read Read response
CONNECT_TIMEOU
T
READ_TIMEOUT
Best Effort Consistency System
99.9%
99.5%
99.8%
99.6%
Wow, my
pizza has too
much cheese
and toppings
Said no one
ever
http://upload.wikimedia.org/wikipedia/commons/6/60/Pizza_Hut_Meat_Lover's_pizza_3.JPG
“WOW
My system has
too much
caching,
timeouts, and
availability.”
-said no one ever
Questions?
golucid.co
http://www.slideshare.net/DerrickIsaacson
References
1. Perspectives on the CAP Theorem
2. Bacon Ipsum
3. Brewer’s Conjecture and the Feasibility of Consistent, Available, Partition-Tolerant
Web
4. The Google File System
5. Big Table
6. Amazon Architecture References
7. Apache HttpComponents
8. Apache HttpClient Cache
9. Ehcache

More Related Content

Similar to Effective Lessons for Building SOA from Amazon, Google and Lucidchart

Planning to Fail #phpne13
Planning to Fail #phpne13Planning to Fail #phpne13
Planning to Fail #phpne13Dave Gardner
 
Planning to Fail #phpuk13
Planning to Fail #phpuk13Planning to Fail #phpuk13
Planning to Fail #phpuk13Dave Gardner
 
Azure Cloud Patterns
Azure Cloud PatternsAzure Cloud Patterns
Azure Cloud PatternsTamir Dresher
 
Building data intensive applications
Building data intensive applicationsBuilding data intensive applications
Building data intensive applicationsAmit Kejriwal
 
Node.js Performance Case Study
Node.js Performance Case StudyNode.js Performance Case Study
Node.js Performance Case StudyFabian Frank
 
Rest in a Nutshell 2014_05_27
Rest in a Nutshell 2014_05_27Rest in a Nutshell 2014_05_27
Rest in a Nutshell 2014_05_27Derrick Isaacson
 
Mobile App Performance: Getting the Most from APIs (MBL203) | AWS re:Invent ...
Mobile App Performance:  Getting the Most from APIs (MBL203) | AWS re:Invent ...Mobile App Performance:  Getting the Most from APIs (MBL203) | AWS re:Invent ...
Mobile App Performance: Getting the Most from APIs (MBL203) | AWS re:Invent ...Amazon Web Services
 
Chaos Engineering on Cloud Foundry
Chaos Engineering on Cloud FoundryChaos Engineering on Cloud Foundry
Chaos Engineering on Cloud FoundryKarun Chennuri
 
Fronteers 20131205 the realtime web
Fronteers 20131205   the realtime webFronteers 20131205   the realtime web
Fronteers 20131205 the realtime webBert Wijnants
 
Openstack meetup lyon_2017-09-28
Openstack meetup lyon_2017-09-28Openstack meetup lyon_2017-09-28
Openstack meetup lyon_2017-09-28Xavier Lucas
 
When it Absolutely, Positively, Has to be There: Reliability Guarantees in Ka...
When it Absolutely, Positively, Has to be There: Reliability Guarantees in Ka...When it Absolutely, Positively, Has to be There: Reliability Guarantees in Ka...
When it Absolutely, Positively, Has to be There: Reliability Guarantees in Ka...confluent
 
Артем Оробец «На пути к low-latency»
Артем Оробец «На пути к low-latency»Артем Оробец «На пути к low-latency»
Артем Оробец «На пути к low-latency»DataArt
 
Reactive microservices with play and akka
Reactive microservices with play and akkaReactive microservices with play and akka
Reactive microservices with play and akkascalaconfjp
 
The Real World - Plugging the Enterprise Into It (nodejs)
The Real World - Plugging  the Enterprise Into It (nodejs)The Real World - Plugging  the Enterprise Into It (nodejs)
The Real World - Plugging the Enterprise Into It (nodejs)Aman Kohli
 
AWS re:Invent 2016: Amazon CloudFront Flash Talks: Best Practices on Configur...
AWS re:Invent 2016: Amazon CloudFront Flash Talks: Best Practices on Configur...AWS re:Invent 2016: Amazon CloudFront Flash Talks: Best Practices on Configur...
AWS re:Invent 2016: Amazon CloudFront Flash Talks: Best Practices on Configur...Amazon Web Services
 
John adams talk cloudy
John adams   talk cloudyJohn adams   talk cloudy
John adams talk cloudyJohn Adams
 
Developing on SQL Azure
Developing on SQL AzureDeveloping on SQL Azure
Developing on SQL AzureIke Ellis
 
How Netflix does Microservices
How Netflix does Microservices How Netflix does Microservices
How Netflix does Microservices Manuel Correa
 
DevOpsDays Austin: Helping Horses Become Unicorns, Chef's Operations Maturity...
DevOpsDays Austin: Helping Horses Become Unicorns, Chef's Operations Maturity...DevOpsDays Austin: Helping Horses Become Unicorns, Chef's Operations Maturity...
DevOpsDays Austin: Helping Horses Become Unicorns, Chef's Operations Maturity...Matt Ray
 

Similar to Effective Lessons for Building SOA from Amazon, Google and Lucidchart (20)

Planning to Fail #phpne13
Planning to Fail #phpne13Planning to Fail #phpne13
Planning to Fail #phpne13
 
Planning to Fail #phpuk13
Planning to Fail #phpuk13Planning to Fail #phpuk13
Planning to Fail #phpuk13
 
Azure Cloud Patterns
Azure Cloud PatternsAzure Cloud Patterns
Azure Cloud Patterns
 
Optimizing Uptime in SOA
Optimizing Uptime in SOAOptimizing Uptime in SOA
Optimizing Uptime in SOA
 
Building data intensive applications
Building data intensive applicationsBuilding data intensive applications
Building data intensive applications
 
Node.js Performance Case Study
Node.js Performance Case StudyNode.js Performance Case Study
Node.js Performance Case Study
 
Rest in a Nutshell 2014_05_27
Rest in a Nutshell 2014_05_27Rest in a Nutshell 2014_05_27
Rest in a Nutshell 2014_05_27
 
Mobile App Performance: Getting the Most from APIs (MBL203) | AWS re:Invent ...
Mobile App Performance:  Getting the Most from APIs (MBL203) | AWS re:Invent ...Mobile App Performance:  Getting the Most from APIs (MBL203) | AWS re:Invent ...
Mobile App Performance: Getting the Most from APIs (MBL203) | AWS re:Invent ...
 
Chaos Engineering on Cloud Foundry
Chaos Engineering on Cloud FoundryChaos Engineering on Cloud Foundry
Chaos Engineering on Cloud Foundry
 
Fronteers 20131205 the realtime web
Fronteers 20131205   the realtime webFronteers 20131205   the realtime web
Fronteers 20131205 the realtime web
 
Openstack meetup lyon_2017-09-28
Openstack meetup lyon_2017-09-28Openstack meetup lyon_2017-09-28
Openstack meetup lyon_2017-09-28
 
When it Absolutely, Positively, Has to be There: Reliability Guarantees in Ka...
When it Absolutely, Positively, Has to be There: Reliability Guarantees in Ka...When it Absolutely, Positively, Has to be There: Reliability Guarantees in Ka...
When it Absolutely, Positively, Has to be There: Reliability Guarantees in Ka...
 
Артем Оробец «На пути к low-latency»
Артем Оробец «На пути к low-latency»Артем Оробец «На пути к low-latency»
Артем Оробец «На пути к low-latency»
 
Reactive microservices with play and akka
Reactive microservices with play and akkaReactive microservices with play and akka
Reactive microservices with play and akka
 
The Real World - Plugging the Enterprise Into It (nodejs)
The Real World - Plugging  the Enterprise Into It (nodejs)The Real World - Plugging  the Enterprise Into It (nodejs)
The Real World - Plugging the Enterprise Into It (nodejs)
 
AWS re:Invent 2016: Amazon CloudFront Flash Talks: Best Practices on Configur...
AWS re:Invent 2016: Amazon CloudFront Flash Talks: Best Practices on Configur...AWS re:Invent 2016: Amazon CloudFront Flash Talks: Best Practices on Configur...
AWS re:Invent 2016: Amazon CloudFront Flash Talks: Best Practices on Configur...
 
John adams talk cloudy
John adams   talk cloudyJohn adams   talk cloudy
John adams talk cloudy
 
Developing on SQL Azure
Developing on SQL AzureDeveloping on SQL Azure
Developing on SQL Azure
 
How Netflix does Microservices
How Netflix does Microservices How Netflix does Microservices
How Netflix does Microservices
 
DevOpsDays Austin: Helping Horses Become Unicorns, Chef's Operations Maturity...
DevOpsDays Austin: Helping Horses Become Unicorns, Chef's Operations Maturity...DevOpsDays Austin: Helping Horses Become Unicorns, Chef's Operations Maturity...
DevOpsDays Austin: Helping Horses Become Unicorns, Chef's Operations Maturity...
 

More from Derrick Isaacson

UJUG Craftsmanship Roundup April 2017
UJUG Craftsmanship Roundup April 2017UJUG Craftsmanship Roundup April 2017
UJUG Craftsmanship Roundup April 2017Derrick Isaacson
 
Cargo Cult Security UJUG Sep2015
Cargo Cult Security UJUG Sep2015Cargo Cult Security UJUG Sep2015
Cargo Cult Security UJUG Sep2015Derrick Isaacson
 
Cargo Cult Security at OpenWest
Cargo Cult Security at OpenWestCargo Cult Security at OpenWest
Cargo Cult Security at OpenWestDerrick Isaacson
 
Cargo Cult Security 2014_01_18
Cargo Cult Security 2014_01_18Cargo Cult Security 2014_01_18
Cargo Cult Security 2014_01_18Derrick Isaacson
 
UJUG 2013 Architecture Roundup with Lucid Software
UJUG 2013 Architecture Roundup with Lucid SoftwareUJUG 2013 Architecture Roundup with Lucid Software
UJUG 2013 Architecture Roundup with Lucid SoftwareDerrick Isaacson
 
Scaling Web Services with Evolvable RESTful APIs - JavaOne 2013
Scaling Web Services with Evolvable RESTful APIs - JavaOne 2013Scaling Web Services with Evolvable RESTful APIs - JavaOne 2013
Scaling Web Services with Evolvable RESTful APIs - JavaOne 2013Derrick Isaacson
 

More from Derrick Isaacson (6)

UJUG Craftsmanship Roundup April 2017
UJUG Craftsmanship Roundup April 2017UJUG Craftsmanship Roundup April 2017
UJUG Craftsmanship Roundup April 2017
 
Cargo Cult Security UJUG Sep2015
Cargo Cult Security UJUG Sep2015Cargo Cult Security UJUG Sep2015
Cargo Cult Security UJUG Sep2015
 
Cargo Cult Security at OpenWest
Cargo Cult Security at OpenWestCargo Cult Security at OpenWest
Cargo Cult Security at OpenWest
 
Cargo Cult Security 2014_01_18
Cargo Cult Security 2014_01_18Cargo Cult Security 2014_01_18
Cargo Cult Security 2014_01_18
 
UJUG 2013 Architecture Roundup with Lucid Software
UJUG 2013 Architecture Roundup with Lucid SoftwareUJUG 2013 Architecture Roundup with Lucid Software
UJUG 2013 Architecture Roundup with Lucid Software
 
Scaling Web Services with Evolvable RESTful APIs - JavaOne 2013
Scaling Web Services with Evolvable RESTful APIs - JavaOne 2013Scaling Web Services with Evolvable RESTful APIs - JavaOne 2013
Scaling Web Services with Evolvable RESTful APIs - JavaOne 2013
 

Recently uploaded

Alluxio Monthly Webinar | Cloud-Native Model Training on Distributed Data
Alluxio Monthly Webinar | Cloud-Native Model Training on Distributed DataAlluxio Monthly Webinar | Cloud-Native Model Training on Distributed Data
Alluxio Monthly Webinar | Cloud-Native Model Training on Distributed DataAlluxio, Inc.
 
HR Software Buyers Guide in 2024 - HRSoftware.com
HR Software Buyers Guide in 2024 - HRSoftware.comHR Software Buyers Guide in 2024 - HRSoftware.com
HR Software Buyers Guide in 2024 - HRSoftware.comFatema Valibhai
 
Salesforce Certified Field Service Consultant
Salesforce Certified Field Service ConsultantSalesforce Certified Field Service Consultant
Salesforce Certified Field Service ConsultantAxelRicardoTrocheRiq
 
Asset Management Software - Infographic
Asset Management Software - InfographicAsset Management Software - Infographic
Asset Management Software - InfographicHr365.us smith
 
The Real-World Challenges of Medical Device Cybersecurity- Mitigating Vulnera...
The Real-World Challenges of Medical Device Cybersecurity- Mitigating Vulnera...The Real-World Challenges of Medical Device Cybersecurity- Mitigating Vulnera...
The Real-World Challenges of Medical Device Cybersecurity- Mitigating Vulnera...ICS
 
Unveiling the Tech Salsa of LAMs with Janus in Real-Time Applications
Unveiling the Tech Salsa of LAMs with Janus in Real-Time ApplicationsUnveiling the Tech Salsa of LAMs with Janus in Real-Time Applications
Unveiling the Tech Salsa of LAMs with Janus in Real-Time ApplicationsAlberto González Trastoy
 
ODSC - Batch to Stream workshop - integration of Apache Spark, Cassandra, Pos...
ODSC - Batch to Stream workshop - integration of Apache Spark, Cassandra, Pos...ODSC - Batch to Stream workshop - integration of Apache Spark, Cassandra, Pos...
ODSC - Batch to Stream workshop - integration of Apache Spark, Cassandra, Pos...Christina Lin
 
5 Signs You Need a Fashion PLM Software.pdf
5 Signs You Need a Fashion PLM Software.pdf5 Signs You Need a Fashion PLM Software.pdf
5 Signs You Need a Fashion PLM Software.pdfWave PLM
 
Short Story: Unveiling the Reasoning Abilities of Large Language Models by Ke...
Short Story: Unveiling the Reasoning Abilities of Large Language Models by Ke...Short Story: Unveiling the Reasoning Abilities of Large Language Models by Ke...
Short Story: Unveiling the Reasoning Abilities of Large Language Models by Ke...kellynguyen01
 
Adobe Marketo Engage Deep Dives: Using Webhooks to Transfer Data
Adobe Marketo Engage Deep Dives: Using Webhooks to Transfer DataAdobe Marketo Engage Deep Dives: Using Webhooks to Transfer Data
Adobe Marketo Engage Deep Dives: Using Webhooks to Transfer DataBradBedford3
 
Unit 1.1 Excite Part 1, class 9, cbse...
Unit 1.1 Excite Part 1, class 9, cbse...Unit 1.1 Excite Part 1, class 9, cbse...
Unit 1.1 Excite Part 1, class 9, cbse...aditisharan08
 
EY_Graph Database Powered Sustainability
EY_Graph Database Powered SustainabilityEY_Graph Database Powered Sustainability
EY_Graph Database Powered SustainabilityNeo4j
 
Project Based Learning (A.I).pptx detail explanation
Project Based Learning (A.I).pptx detail explanationProject Based Learning (A.I).pptx detail explanation
Project Based Learning (A.I).pptx detail explanationkaushalgiri8080
 
The Evolution of Karaoke From Analog to App.pdf
The Evolution of Karaoke From Analog to App.pdfThe Evolution of Karaoke From Analog to App.pdf
The Evolution of Karaoke From Analog to App.pdfPower Karaoke
 
Hand gesture recognition PROJECT PPT.pptx
Hand gesture recognition PROJECT PPT.pptxHand gesture recognition PROJECT PPT.pptx
Hand gesture recognition PROJECT PPT.pptxbodapatigopi8531
 
What is Binary Language? Computer Number Systems
What is Binary Language?  Computer Number SystemsWhat is Binary Language?  Computer Number Systems
What is Binary Language? Computer Number SystemsJheuzeDellosa
 
Call Girls in Naraina Delhi 💯Call Us 🔝8264348440🔝
Call Girls in Naraina Delhi 💯Call Us 🔝8264348440🔝Call Girls in Naraina Delhi 💯Call Us 🔝8264348440🔝
Call Girls in Naraina Delhi 💯Call Us 🔝8264348440🔝soniya singh
 
why an Opensea Clone Script might be your perfect match.pdf
why an Opensea Clone Script might be your perfect match.pdfwhy an Opensea Clone Script might be your perfect match.pdf
why an Opensea Clone Script might be your perfect match.pdfjoe51371421
 
Engage Usergroup 2024 - The Good The Bad_The Ugly
Engage Usergroup 2024 - The Good The Bad_The UglyEngage Usergroup 2024 - The Good The Bad_The Ugly
Engage Usergroup 2024 - The Good The Bad_The UglyFrank van der Linden
 

Recently uploaded (20)

Alluxio Monthly Webinar | Cloud-Native Model Training on Distributed Data
Alluxio Monthly Webinar | Cloud-Native Model Training on Distributed DataAlluxio Monthly Webinar | Cloud-Native Model Training on Distributed Data
Alluxio Monthly Webinar | Cloud-Native Model Training on Distributed Data
 
HR Software Buyers Guide in 2024 - HRSoftware.com
HR Software Buyers Guide in 2024 - HRSoftware.comHR Software Buyers Guide in 2024 - HRSoftware.com
HR Software Buyers Guide in 2024 - HRSoftware.com
 
Salesforce Certified Field Service Consultant
Salesforce Certified Field Service ConsultantSalesforce Certified Field Service Consultant
Salesforce Certified Field Service Consultant
 
Asset Management Software - Infographic
Asset Management Software - InfographicAsset Management Software - Infographic
Asset Management Software - Infographic
 
The Real-World Challenges of Medical Device Cybersecurity- Mitigating Vulnera...
The Real-World Challenges of Medical Device Cybersecurity- Mitigating Vulnera...The Real-World Challenges of Medical Device Cybersecurity- Mitigating Vulnera...
The Real-World Challenges of Medical Device Cybersecurity- Mitigating Vulnera...
 
Unveiling the Tech Salsa of LAMs with Janus in Real-Time Applications
Unveiling the Tech Salsa of LAMs with Janus in Real-Time ApplicationsUnveiling the Tech Salsa of LAMs with Janus in Real-Time Applications
Unveiling the Tech Salsa of LAMs with Janus in Real-Time Applications
 
ODSC - Batch to Stream workshop - integration of Apache Spark, Cassandra, Pos...
ODSC - Batch to Stream workshop - integration of Apache Spark, Cassandra, Pos...ODSC - Batch to Stream workshop - integration of Apache Spark, Cassandra, Pos...
ODSC - Batch to Stream workshop - integration of Apache Spark, Cassandra, Pos...
 
5 Signs You Need a Fashion PLM Software.pdf
5 Signs You Need a Fashion PLM Software.pdf5 Signs You Need a Fashion PLM Software.pdf
5 Signs You Need a Fashion PLM Software.pdf
 
Short Story: Unveiling the Reasoning Abilities of Large Language Models by Ke...
Short Story: Unveiling the Reasoning Abilities of Large Language Models by Ke...Short Story: Unveiling the Reasoning Abilities of Large Language Models by Ke...
Short Story: Unveiling the Reasoning Abilities of Large Language Models by Ke...
 
Adobe Marketo Engage Deep Dives: Using Webhooks to Transfer Data
Adobe Marketo Engage Deep Dives: Using Webhooks to Transfer DataAdobe Marketo Engage Deep Dives: Using Webhooks to Transfer Data
Adobe Marketo Engage Deep Dives: Using Webhooks to Transfer Data
 
Unit 1.1 Excite Part 1, class 9, cbse...
Unit 1.1 Excite Part 1, class 9, cbse...Unit 1.1 Excite Part 1, class 9, cbse...
Unit 1.1 Excite Part 1, class 9, cbse...
 
EY_Graph Database Powered Sustainability
EY_Graph Database Powered SustainabilityEY_Graph Database Powered Sustainability
EY_Graph Database Powered Sustainability
 
Project Based Learning (A.I).pptx detail explanation
Project Based Learning (A.I).pptx detail explanationProject Based Learning (A.I).pptx detail explanation
Project Based Learning (A.I).pptx detail explanation
 
Call Girls In Mukherjee Nagar 📱 9999965857 🤩 Delhi 🫦 HOT AND SEXY VVIP 🍎 SE...
Call Girls In Mukherjee Nagar 📱  9999965857  🤩 Delhi 🫦 HOT AND SEXY VVIP 🍎 SE...Call Girls In Mukherjee Nagar 📱  9999965857  🤩 Delhi 🫦 HOT AND SEXY VVIP 🍎 SE...
Call Girls In Mukherjee Nagar 📱 9999965857 🤩 Delhi 🫦 HOT AND SEXY VVIP 🍎 SE...
 
The Evolution of Karaoke From Analog to App.pdf
The Evolution of Karaoke From Analog to App.pdfThe Evolution of Karaoke From Analog to App.pdf
The Evolution of Karaoke From Analog to App.pdf
 
Hand gesture recognition PROJECT PPT.pptx
Hand gesture recognition PROJECT PPT.pptxHand gesture recognition PROJECT PPT.pptx
Hand gesture recognition PROJECT PPT.pptx
 
What is Binary Language? Computer Number Systems
What is Binary Language?  Computer Number SystemsWhat is Binary Language?  Computer Number Systems
What is Binary Language? Computer Number Systems
 
Call Girls in Naraina Delhi 💯Call Us 🔝8264348440🔝
Call Girls in Naraina Delhi 💯Call Us 🔝8264348440🔝Call Girls in Naraina Delhi 💯Call Us 🔝8264348440🔝
Call Girls in Naraina Delhi 💯Call Us 🔝8264348440🔝
 
why an Opensea Clone Script might be your perfect match.pdf
why an Opensea Clone Script might be your perfect match.pdfwhy an Opensea Clone Script might be your perfect match.pdf
why an Opensea Clone Script might be your perfect match.pdf
 
Engage Usergroup 2024 - The Good The Bad_The Ugly
Engage Usergroup 2024 - The Good The Bad_The UglyEngage Usergroup 2024 - The Good The Bad_The Ugly
Engage Usergroup 2024 - The Good The Bad_The Ugly
 

Effective Lessons for Building SOA from Amazon, Google and Lucidchart

Editor's Notes

  1. Effective SOA: Lessons from Amazon, Google, and LucidchartIt has been observed that &quot;A distributed system is at best a necessary evil, evil because of the extra complexity.&quot; Multiple nodes computing on inconsistent state with regular communication failures present entirely different challenges than those computer science students face in the classroom writing DFS algorithms. The past 30 years have seen some interesting theories and architectures to deal with these complexities in what we now call &quot;cloud computing&quot;. Some researchers worked on &quot;distributed memory&quot; and others built &quot;remote procedure calls&quot;. More commercially successful architectures of late have popularized ideas like the CAP theorem, distributed caches, and REST.Using examples from companies like Amazon and Google this presentation walks through some practical tips to evolve your service-oriented architecture. Google&apos;s Chubby service demonstrates how you can take advantage of CAP&apos;s &quot;best effort availability&quot; options and Amazon&apos;s &quot;best effort consistency&quot; services show the other end of the spectrum. Practical lessons learned from Lucidchart&apos;s forays into SOA share insight through quantitative analyses on how to make your system highly available.Bio:Derrick Isaacson is the Director of Engineering for Lucid Software Inc (lucidchart.com). He has a BS in EE from BYU and an MS in CS from Stanford. He&apos;s developed big services at Amazon, web platforms for Microsoft, and graphical apps at Lucidchart. Derrick has two patent applications at Microsoft and Domo. For fun he cycles, backpacks, and takes his son out in their truck.
  2. Multiple nodes computing on inconsistent state with regular communication failures present entirely different challenges than those computer science students face in the classroom writing DFS algorithms.
  3. Idea:Nothing’s more familiar to programmers than reading from and writing to memory? We access variables all day long. Why not make distributed state access look like simple memory access? We can use modern operating systems’ support for virtual memory to “swap in” memory that is located on another machine.Problem:How often do you go to access a variable and can’t because a section of memory is “down”?How do you provide a mutex to parallel threads of execution?How can the distributed memory layer be efficient when it has no knowledge of the application?
  4. Idea: Next to memory access, nothing’s more familiar to programmers than functional calls. Can we make distributed state transfer look like a simple procedure call? SOAP!Problems:How often do you retry a method call because the JVM failed to invoke it the first time?Why does incrementing a value take 100 milliseconds?Why does your internet only support .NET and PHP (stub compiler/SDK)?
  5. Idea:Easy network file sharing.NFS, AFS, GFSWorks great for files.
  6. Idea:Easy network file sharing.NFS, AFS, GFSWorks great for files.
  7. Idea:How could you steal bandwidth from universities and avoid infringement lawsuits at the same time?Problems:Mooching resources is a great business model but a terrible architecture if that’s not what you’re going for.
  8. Idea:I have so much state I don’t want to transfer it all in a single response.
  9. What’s the availability of the overall system if a single response for service A is calculated by making 4 total requests to services B, C, and D?If the average availabilities of those 3 components are as given, and the random values are modeled as IID, what is the maximum percentage of requests is service A able to calculate correctly?.995 * .998 * .998 * .996 = 0.987IID is a bad assumption for nearly any distributed system, but it illustrates the effective of naively distributing computation.When crash failures originating at service A are included, the total availability is &lt; 98.7%!That’s an average of 19 minutes of downtime per day!
  10. What’s the availability of the overall system if a single response for service A is calculated by making 4 total requests to services B, C, and D?If the average availabilities of those 3 components are as given, and the random values are modeled as IID, what is the maximum percentage of requests is service A able to calculate correctly?.995 * .998 * .998 * .996 = 0.987IID is a bad assumption for nearly any distributed system, but it illustrates the effective of naively distributing computation.When crash failures originating at service A are included, the total availability is &lt; 98.7%!That’s an average of 19 minutes of downtime per day!
  11. Conjecture made by UC Berkely computer scientist in 2000Gilbert &amp; Lynch published a formal proof
  12. Conjecture made by UC Berkely computer scientist in 2000Gilbert &amp; Lynch published a formal proof
  13. We want the user to 1) get a response (available) and 2) have it be consistent with the view of other nodes.From that end user definition, a slow, error status, or non-existent response are all “incorrect”.
  14. “In order to model partition tolerance, the networkwill be allowed to lose arbitrarily many messages sent from one node toanother.” – Gilbert &amp; Lynchhttp://lpd.epfl.ch/sgilbert/pubs/BrewersConjecture-SigAct.pdfIt becomes a fundamental tradeoff between availability and consistency.
  15. It drops below the SLA for a consistent, available response perhaps 10 of every 1000 requests.
  16. It turns out the usual approach to implementing a computation like this errs on the side of consistency. If a single service request fails, this calculation hangs or returns an error to the user.
  17. Web crawler, Big Table“our access patterns are highly stylized”“GFS has a relaxed consistency model that supports our highly distributed applications well but remains relatively simple and efficient to implement.”“Record append’s append-at-least-once semantics preserves each writer’s output. Readers deal with the occasional padding and duplicates as follows. Each record prepared by the writer contains extra information like checksums so that its validity can be verified. A reader can identify and discard extra padding and record fragments using the checksums. If it cannot tolerate the occasional duplicates (e.g., if they would trigger non-idempotent operations), it can filter them out using unique identifiers in the records, which are often needed anyway to name corresponding application entities such as web documents.”
  18. “For the checkout process you always want to honor requests to add items to a shopping cart because it&apos;s revenue producing. In this case you choose high availability. Errors are hidden from the customer and sorted out later.”
  19. Wow, that guy must be an engineer – said no one ever
  20. The Amazon Dynamo and Yahoo PNUTS data stores support high read availability while limiting write availability in the face of partitions. For example, Dynamo has a configurable number of replicas on which the data must be stored before a write is confirmed to the client.
  21. Network partitions are less frequent at the leaves of a geographically hierarchical system.
  22. The CAP theorem appears to have implications for scalability.“Intuitively, we think of a system a scalable if it can grow efficiently, using new resources efficiently to handle more load. In order to efficiently use new resources, there must be coordination among those resources;” – Gilbert &amp; Lynch
  23. What’s the availability of the overall system if a single response for service A is calculated by making 4 total requests to services B, C, and D?If the average availabilities of those 3 components are as given, and the random values are modeled as IID, what is the maximum percentage of requests is service A able to calculate correctly?.995 * .998 * .998 * .996 = 0.987IID is a bad assumption for nearly any distributed system, but it illustrates the effective of naively distributing computation.When crash failures originating at service A are included, the total availability is &lt; 98.7%!That’s an average of 19 minutes of downtime per day!