Expect the unexpected: Anticipate and prepare for failures in microservices
Bhakti Mehta
@bhakti_mehta
  
Introduction
•  Senior Software Engineer at Blue Jeans Network
•  Worked at Sun Microsystems/Oracle for 13 years
•  Committer to numerous open source projects, including GlassFish Application Server
  
My recent book
Previous book
  
Blue Jeans Network
•  Video conferencing in the cloud
•  Customers in all segments
•  Millions of users
•  Interoperable
•  Video sharing, content sharing
•  Mobile friendly
•  Solutions for large-scale events
  
What you will learn
•  Microservices architecture
•  Challenges at scale
•  Lessons learned, tips and practices to prevent cascading failures
•  Resilience planning at various stages
•  Real world examples
  
Top level architecture
[Architecture diagram; labels: Customer A, Customer B, Internet, SIP/H.323, HTTP/HTTPS, Proxy layer, Media Node, Connector Node, Web Server, Middleware services, Cache, Service discovery, Messaging, DB]
  
Microservices architecture

Path to microservices
•  Advantages
– Simplicity
– Isolation of problems
– Scale up and scale down
– Easy deployment
– Clear separation of concerns
– Heterogeneity and polyglotism
  
Microservices
•  Disadvantages
– Not a free lunch!
– Distributed systems are prone to failures
– Eventual consistency
– More effort in deployments and release management
– Challenges in testing services that evolve independently, regression tests, etc.
  
Monoliths to microservices
  
Resilient system
•  Processes transactions even when there are transient impulses or persistent stresses
•  Functions even when component failures disrupt normal processing
•  Accepts that failures will happen
•  Designs for crumple zones
  
Kinds of failures
•  Challenges at scale
•  Integration point failures
– Network errors
– Semantic errors
– Slow responses
– Outright hang
– GC issues
  
Anticipate failures at scale
•  Anticipate growth
•  Design for the next order of magnitude
•  Design for 10x, plan to rewrite for 100x
  
Resiliency planning Stage 1
•  When developing code
– Avoiding cascading failures
    •  Circuit breaker
    •  Timeouts
    •  Retry
    •  Bulkhead
    •  Cache optimizations
– Avoid malicious clients
    •  Rate limiting
  
Resiliency planning Stage 2
•  Planning for dealing with failures before deploy
– Load test
– A/B test
– Longevity test
  
Resiliency planning Stage 3
•  Watching out for failures after deploy
– Health check
– Metrics
  
Cascading failures
Caused by chain reactions
For example:
    One node in a load-balanced group fails
    Others need to pick up the work
    Eventually performance can degrade
  
Cascading failures with aggregation
 
Timeouts
•  Clients may prefer a response, whether it is
– a failure
– a success
– a job queued for later
All aggregation requests to microservices should have reasonable timeouts set
  
Types of timeouts
•  Connection timeout
– Max time allowed for a connection to be established before an error is raised
•  Socket timeout
– Max time of inactivity between two packets once the connection is established
  
Timeouts pattern
•  Timeouts + retries go together
•  Transient failures can be remedied with fast retries
•  However, problems in the network can last for a while, so there is a probability of retries failing
  
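Timeouts in code
In JAX-RS
import javax.ws.rs.client.Client;
import javax.ws.rs.client.ClientBuilder;
import org.glassfish.jersey.client.ClientProperties; // Jersey-specific property names
Client client = ClientBuilder.newClient();
client.property(ClientProperties.CONNECT_TIMEOUT, 5000); // ms to establish the connection
client.property(ClientProperties.READ_TIMEOUT, 5000);    // ms to wait for response data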
  
Retry pattern
•  Retry in case of network failures, timeouts or server errors
•  Helps with transient network errors such as dropped connections or server failover
  
Retry pattern
•  If one of the services is slow or malfunctioning and other services keep retrying, the problem becomes worse
•  Solution
– Exponential backoff
– Circuit breaker pattern
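The exponential backoff idea can be sketched roughly as follows. This is a minimal illustration, not code from the talk: the class name `BackoffRetry`, its parameters, and the doubling schedule with a cap are all assumptions, and a real client would retry only errors it knows to be retryable.

```java
import java.util.concurrent.Callable;

/** Hypothetical helper (not from the talk): retries a call with exponential backoff. */
final class BackoffRetry {

    /** Delay before retry number `attempt` (0-based): base * 2^attempt, capped at maxDelayMs. */
    static long delayMillis(int attempt, long baseDelayMs, long maxDelayMs) {
        long delay = baseDelayMs << Math.min(attempt, 20); // bound the shift to avoid overflow
        return Math.min(delay, maxDelayMs);
    }

    /** Runs the action up to maxAttempts times, sleeping longer after each failure. */
    static <T> T call(Callable<T> action, int maxAttempts, long baseDelayMs, long maxDelayMs)
            throws Exception {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be >= 1");
        }
        Exception last = null;
        for (int attempt = 0; attempt < maxAttempts; attempt++) {
            try {
                return action.call();
            } catch (Exception e) {
                last = e; // assumed transient for this sketch
                if (attempt < maxAttempts - 1) {
                    Thread.sleep(delayMillis(attempt, baseDelayMs, maxDelayMs));
                }
            }
        }
        throw last; // every attempt failed: surface the most recent error
    }
}
```

With a base delay of 100 ms and a cap of 1000 ms, the waits between attempts would be 100, 200, 400, 800, 1000, 1000 ms and so on, which gives a struggling service progressively more room to recover.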
  
Circuit breaker pattern
Circuit breaker: A circuit breaker is an electrical device used in an electrical panel that monitors and controls the amount of amperes (amps) being sent through
  
Circuit breaker pattern
•  Safety device
•  If a power surge occurs in the electrical wiring, the breaker will trip
•  Flips from “On” to “Off” and shuts off electrical power from that breaker
  
Circuit breaker
•  Netflix Hystrix follows the circuit breaker pattern
•  If a service’s error rate exceeds a threshold, it will trip the circuit breaker and block the requests for a specific period of time
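To make the trip-and-block behaviour concrete, here is a minimal sketch in plain Java. It is not Hystrix's API: the class `SimpleCircuitBreaker` and its parameters are hypothetical, and for brevity it trips on a count of consecutive failures rather than on a measured error rate.

```java
import java.util.function.LongSupplier;
import java.util.function.Supplier;

/** Hypothetical minimal circuit breaker; this is not Hystrix's API. */
final class SimpleCircuitBreaker {
    private final int failureThreshold; // consecutive failures that trip the breaker
    private final long openMillis;      // how long requests stay blocked once tripped
    private final LongSupplier clock;   // injectable time source, in milliseconds
    private int consecutiveFailures = 0;
    private boolean tripped = false;
    private long trippedAt = 0;

    SimpleCircuitBreaker(int failureThreshold, long openMillis, LongSupplier clock) {
        this.failureThreshold = failureThreshold;
        this.openMillis = openMillis;
        this.clock = clock;
    }

    /** True while the breaker is tripped and the blocking period has not yet elapsed. */
    synchronized boolean isOpen() {
        return tripped && clock.getAsLong() - trippedAt < openMillis;
    }

    /** Runs the action unless the breaker is open; returns the fallback on block or failure. */
    <T> T call(Supplier<T> action, Supplier<T> fallback) {
        if (isOpen()) {
            return fallback.get(); // fail fast: do not touch the struggling service
        }
        try {
            T result = action.get();
            recordSuccess();
            return result;
        } catch (RuntimeException e) {
            recordFailure();
            return fallback.get();
        }
    }

    private synchronized void recordSuccess() {
        consecutiveFailures = 0;
        tripped = false;
    }

    private synchronized void recordFailure() {
        consecutiveFailures++;
        if (consecutiveFailures >= failureThreshold) {
            tripped = true; // trip, or re-trip after a failed trial call
            trippedAt = clock.getAsLong();
        }
    }
}
```

Once the blocking period has elapsed, the next call is let through as a trial: a success closes the breaker again, while a failure re-trips it for another period.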
  
Bulkhead
•  Avoiding chain reactions by isolating failures
•  Helps prevent cascading failures
  
Bulkhead
•  An example of a bulkhead could be isolating the database dependencies per service
•  Similarly, other infrastructure components can be isolated, such as cache infrastructure
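The slides describe bulkheads at the infrastructure level. Inside a single service, the same isolation idea is often applied by capping the number of concurrent calls to each dependency, so that one slow dependency cannot use up every thread. The sketch below assumes that approach; the class name `Bulkhead` and its semaphore-based design are illustrative and not taken from the talk.

```java
import java.util.concurrent.Semaphore;
import java.util.function.Supplier;

/** Hypothetical semaphore-based bulkhead: one instance per downstream dependency. */
final class Bulkhead {
    private final Semaphore permits;

    Bulkhead(int maxConcurrentCalls) {
        this.permits = new Semaphore(maxConcurrentCalls);
    }

    /** Runs the action if a permit is free; otherwise returns the rejected value at once. */
    <T> T call(Supplier<T> action, Supplier<T> rejected) {
        if (!permits.tryAcquire()) {
            return rejected.get(); // compartment is full: reject instead of queueing
        }
        try {
            return action.get();
        } finally {
            permits.release(); // always free the slot, even if the action throws
        }
    }
}
```

A service might hold one `Bulkhead` for its database calls and another for its cache calls, so that a stall in one dependency leaves capacity for the other.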
  
Rate limiting
•  Restricting the number of requests that can be made by a client
•  A client can be identified based on the access token used
•  Additionally, clients can be identified based on IP address
  
Rate limiting
•  With JAX-RS, rate limiting can be implemented as a filter
•  This filter can check the access count for a client and, if it is within the limit, accept the request
•  Otherwise, return a 429 (Too Many Requests) error
•  Code at https://github.com/bhakti-mehta/samples/tree/master/ratelimiting
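The counting logic such a filter relies on can be sketched in plain Java. This is not the code from the linked repository: `FixedWindowRateLimiter`, the fixed-window scheme, and the in-memory map are assumptions chosen to keep the example self-contained. A JAX-RS request filter could call `statusFor` with the access token or IP address and abort the request on 429.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.LongSupplier;

/** Hypothetical fixed-window rate limiter keyed by client id (access token or IP). */
final class FixedWindowRateLimiter {
    static final int OK = 200;
    static final int TOO_MANY_REQUESTS = 429;

    private final int limit;          // max requests per client per window
    private final long windowMillis;  // window length in milliseconds
    private final LongSupplier clock; // injectable time source, in milliseconds
    // clientId -> {window start time, requests counted in this window}
    private final Map<String, long[]> windows = new HashMap<>();

    FixedWindowRateLimiter(int limit, long windowMillis, LongSupplier clock) {
        this.limit = limit;
        this.windowMillis = windowMillis;
        this.clock = clock;
    }

    /** Counts the request and returns true if the client is still within its limit. */
    synchronized boolean tryAcquire(String clientId) {
        long now = clock.getAsLong();
        long[] window = windows.get(clientId);
        if (window == null || now - window[0] >= windowMillis) {
            window = new long[] {now, 0}; // start a fresh window for this client
            windows.put(clientId, window);
        }
        if (window[1] >= limit) {
            return false;
        }
        window[1]++;
        return true;
    }

    /** HTTP status a filter could respond with for this request. */
    int statusFor(String clientId) {
        return tryAcquire(clientId) ? OK : TOO_MANY_REQUESTS;
    }
}
```

A fixed window is the simplest scheme to reason about; it allows short bursts at window boundaries, which a sliding window or token bucket would smooth out.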
  
Cache optimizations
•  Stores response information related to requests in temporary storage for a specific period of time
•  Ensures that the server is not burdened with processing those requests in future when responses can be fulfilled from the cache
  
Cache optimizations
Getting from first level cache
Getting from second level cache
Getting from the DB
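The three lookup steps can be sketched as a read-through chain. The class `TwoLevelCache` is hypothetical: the first level is assumed to be an in-process map, the second level is represented by a plain `Map` standing in for a shared cache, and the database is a function supplied by the caller. Expiry after a specific period of time is left out to keep the sketch short.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

/** Hypothetical read-through cache: first level, then second level, then the DB. */
final class TwoLevelCache<K, V> {
    private final Map<K, V> firstLevel = new ConcurrentHashMap<>(); // in-process cache
    private final Map<K, V> secondLevel;    // stand-in for a shared cache
    private final Function<K, V> database;  // loader used only when both caches miss

    TwoLevelCache(Map<K, V> secondLevel, Function<K, V> database) {
        this.secondLevel = secondLevel;
        this.database = database;
    }

    V get(K key) {
        V value = firstLevel.get(key);
        if (value != null) {
            return value; // served from the first level cache
        }
        value = secondLevel.get(key);
        if (value == null) {
            value = database.apply(key); // both caches missed: go to the DB
            if (value == null) {
                return null; // nothing to cache
            }
            secondLevel.put(key, value);
        }
        firstLevel.put(key, value); // warm the first level for the next request
        return value;
    }
}
```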
  
Dealing with latencies in response
•  Have a timeout for the aggregation service
•  Dispatch requests in parallel and collect responses
•  Associate a priority with all the responses collected
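One way to combine these points is sketched below: the aggregator dispatches all calls in parallel, waits up to one overall timeout, and returns whatever finished in time. The class `Aggregator` is hypothetical. As a simplification, priority is expressed only by the order of the input list, which the results preserve, and a thread pool is created per request where a real service would share one.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;

/** Hypothetical aggregator: parallel dispatch with one overall timeout. */
final class Aggregator {

    /** Returns the responses that completed within timeoutMs, in the order of `calls`. */
    static List<String> aggregate(List<Callable<String>> calls, long timeoutMs)
            throws InterruptedException {
        ExecutorService pool = Executors.newFixedThreadPool(Math.max(1, calls.size()));
        try {
            // invokeAll waits until all tasks finish or the timeout expires;
            // tasks still running at that point are cancelled.
            List<Future<String>> futures =
                    pool.invokeAll(calls, timeoutMs, TimeUnit.MILLISECONDS);
            List<String> results = new ArrayList<>();
            for (Future<String> future : futures) {
                if (future.isCancelled()) {
                    continue; // timed out: leave it out of the partial result
                }
                try {
                    results.add(future.get());
                } catch (ExecutionException e) {
                    // the call failed: skip it and keep the other responses
                }
            }
            return results;
        } finally {
            pool.shutdownNow();
        }
    }
}
```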
  
Handling partial failures: best practices
•  One service calls another, which can be slow or unavailable
•  Never block indefinitely waiting for the service
•  Try to return partial results
•  Provide a caching layer and return cached data
  
Asynchronous patterns
•  Pattern to deal with long-running jobs
•  Some resources may take longer to provide results
•  The client does not need to wait for the response
  
Reactive programming model
•  Use reactive programming constructs such as CompletableFuture in Java 8, ListenableFuture
•  RxJava
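As a small illustration of the `CompletableFuture` style, the sketch below combines two asynchronous results without blocking and substitutes a fallback if either fails. The class `ReactiveExample`, the method `profilePage`, and the message texts are made up for the example.

```java
import java.util.concurrent.CompletableFuture;

/** Hypothetical example of composing asynchronous results with CompletableFuture. */
final class ReactiveExample {

    /** Combines two pending results; yields a fallback string if either one fails. */
    static CompletableFuture<String> profilePage(CompletableFuture<String> userName,
                                                 CompletableFuture<Integer> meetingCount) {
        return userName
                .thenCombine(meetingCount, (name, count) -> name + " has " + count + " meetings")
                .exceptionally(error -> "profile unavailable");
    }
}
```

A caller would typically start the two lookups with `CompletableFuture.supplyAsync(...)` and attach further stages to the returned future rather than waiting on it.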
  
Asynchronous API
•  Reactive patterns
•  Message passing
– Akka actor model
•  Message queues
– Communication between services via shared message queues
– WebSockets
  
Logging
•  Complex distributed systems introduce many points of failure
•  Logging helps link events/transactions between the various components that make up an application or a business service
•  ELK stack
•  Splunk, syslog
•  Loggly
•  LogEntries
  
Logging best practices
•  Include a detailed, consistent pattern across service logs
•  Obfuscate sensitive data
•  Identify the caller or initiator as part of logs
•  Do not log payloads by default
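A possible shape for such log lines is sketched below. The `key=value` layout, the field names, and the masking rule (keep only the last four characters) are assumptions chosen for illustration, not a format prescribed in the talk.

```java
/** Hypothetical helper for consistent log lines with masked secrets. */
final class LogLine {

    /** Hides all but the last four characters of a sensitive value. */
    static String mask(String secret) {
        if (secret == null || secret.length() <= 4) {
            return "****";
        }
        return "****" + secret.substring(secret.length() - 4);
    }

    /** Same field order in every service, so log lines can be correlated by requestId. */
    static String format(String service, String requestId, String caller, String event) {
        return String.format("service=%s requestId=%s caller=%s event=%s",
                service, requestId, caller, event);
    }
}
```

The payload itself is deliberately not a parameter of `format`, in line with not logging payloads by default.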
  
Best practices when designing APIs for mobile clients
– Avoid chattiness
– Use the aggregator pattern
  
Resilience planning Stage 2
•  Before deploy
– Load testing
– Longevity testing
– Capacity planning
  
Load testing
•  Ensure that you test for load on APIs
– JMeter
•  Plan for longevity testing
  
Capacity planning
•  Anticipate growth
•  Design for handling exponential growth
  
Resilience planning Stage 3
•  After deploy
– Health check
– Metrics and monitoring
– Phased rollout of features
  
Health check
•  Memory
•  CPU
•  Threads
•  Error rate
•  If any of the checks exceeds a threshold, send an alert
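A threshold check of this kind might look like the sketch below. The class `HealthCheck` and every threshold value in it are illustrative assumptions. The readings come from standard JVM management beans, and the error rate is passed in because it would come from the service's own counters.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.OperatingSystemMXBean;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical health check: compares a few readings with illustrative thresholds. */
final class HealthCheck {
    static final double MAX_HEAP_USED_RATIO = 0.90; // assumed threshold
    static final double MAX_LOAD_PER_CPU = 1.50;    // assumed threshold
    static final int MAX_THREADS = 500;             // assumed threshold
    static final double MAX_ERROR_RATE = 0.05;      // assumed threshold

    /** Returns the name of every check that exceeds its threshold. */
    static List<String> evaluate(double heapUsedRatio, double loadPerCpu,
                                 int threads, double errorRate) {
        List<String> alerts = new ArrayList<>();
        if (heapUsedRatio > MAX_HEAP_USED_RATIO) alerts.add("memory");
        if (loadPerCpu > MAX_LOAD_PER_CPU) alerts.add("cpu");
        if (threads > MAX_THREADS) alerts.add("threads");
        if (errorRate > MAX_ERROR_RATE) alerts.add("error rate");
        return alerts;
    }

    /** Reads the current JVM values and evaluates them. */
    static List<String> checkJvm(double errorRate) {
        Runtime runtime = Runtime.getRuntime();
        double heapUsed = (double) (runtime.totalMemory() - runtime.freeMemory())
                / runtime.maxMemory();
        OperatingSystemMXBean os = ManagementFactory.getOperatingSystemMXBean();
        // getSystemLoadAverage() is negative when the platform cannot report it,
        // in which case the cpu check simply never fires.
        double loadPerCpu = os.getSystemLoadAverage() / os.getAvailableProcessors();
        int threads = ManagementFactory.getThreadMXBean().getThreadCount();
        return evaluate(heapUsed, loadPerCpu, threads, errorRate);
    }
}
```

A scheduler or an HTTP health endpoint could call `checkJvm` periodically and forward any non-empty result to the alerting channel.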
  
 	
  
Metrics
•  Response times, throughput
– Identify slow-running DB queries
•  GC rate and pause duration
– Garbage collection can cause slow responses
•  Monitor unusual activity
•  Third-party library metrics
– For example, Couchbase hits
– atop
  
Metrics
•  Load average
•  Uptime
•  Log sizes
  
Monitoring
[Diagram; labels: Monitoring server, Production Environment, Checks, Alerts, Email]
  
Monitoring stack
•  Log aggregation framework (Application)
•  Software analytics tool that monitors performance (OS / Application Code)
•  Collectd / Graphite (Network, Server)
  
Rollout of new features
•  Phase the rollout of new features
•  Have a way to turn features off if they are not behaving as expected
•  Alerts and more alerts!
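A phased rollout with an off switch can be sketched as a simple feature toggle. The class `FeatureToggle`, the percentage-based bucketing by user id, and the method names are assumptions for illustration. A production system would usually load these settings from configuration rather than hold them in memory.

```java
/** Hypothetical feature toggle: percentage rollout plus a kill switch. */
final class FeatureToggle {
    private volatile boolean enabled = true; // kill switch
    private volatile int rolloutPercent;     // 0..100

    FeatureToggle(int rolloutPercent) {
        this.rolloutPercent = rolloutPercent;
    }

    /** Widens or narrows the share of users who see the feature. */
    void setRolloutPercent(int percent) {
        this.rolloutPercent = percent;
    }

    /** Turns the feature off for everyone, regardless of the rollout percentage. */
    void kill() {
        this.enabled = false;
    }

    /** The same user always lands in the same bucket, so their experience is stable. */
    boolean isOnFor(String userId) {
        if (!enabled) {
            return false;
        }
        int bucket = Math.floorMod(userId.hashCode(), 100); // 0..99
        return bucket < rolloutPercent;
    }
}
```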
  
	
  
Real world examples
•  Netflix's Simian Army induces failures of services and even datacenters during the working day to test both the application's resilience and monitoring
•  Latency Monkey to simulate slow-running requests
•  WireMock to mock services
•  Saboteur to create deliberate network mayhem
  
Takeaway
•  Inevitability of failures
– Expect that systems will fail
– Failure prevention
  
References
•  https://commons.wikimedia.org/wiki/File:Bulkhead_PSF.png
•  https://en.wikipedia.org/wiki/Circuit_breaker#/media/File:Four_1_pole_circuit_breakers_fitted_in_a_meter_box.jpg
•  https://www.flickr.com/photos/skynoir/ Beer in hand: skynoir/Flickr/Creative Commons License
  
Questions
•  Twitter: @bhakti_mehta
•  Email: bhakti@bluejeans.com
  

More Related Content

What's hot

VMworld 2013: Deep Dive into vSphere Log Management with vCenter Log Insight
VMworld 2013: Deep Dive into vSphere Log Management with vCenter Log InsightVMworld 2013: Deep Dive into vSphere Log Management with vCenter Log Insight
VMworld 2013: Deep Dive into vSphere Log Management with vCenter Log Insight
VMworld
 
VMware Log Insight
VMware Log Insight VMware Log Insight
VMware Log Insight
Iwan Rahabok
 
DrupalCamp LA 2014 - A Perfect Launch, Every Time
DrupalCamp LA 2014 - A Perfect Launch, Every TimeDrupalCamp LA 2014 - A Perfect Launch, Every Time
DrupalCamp LA 2014 - A Perfect Launch, Every Time
Suzanne Aldrich
 
Adding Value in the Cloud with Performance Test
Adding Value in the Cloud with Performance TestAdding Value in the Cloud with Performance Test
Adding Value in the Cloud with Performance Test
Rodolfo Kohn
 
Performance testing virtualized systems v5
Performance testing virtualized systems v5Performance testing virtualized systems v5
Performance testing virtualized systems v5
Mentora
 
Database failover from client perspective
Database failover from client perspectiveDatabase failover from client perspective
Database failover from client perspective
Priit Piipuu
 
Datasheet weblogic midvisionextensionforibmraf
Datasheet weblogic midvisionextensionforibmrafDatasheet weblogic midvisionextensionforibmraf
Datasheet weblogic midvisionextensionforibmraf
MidVision
 
Towards the Cloud: Architecture Patterns and VDI Story
Towards the Cloud: Architecture Patterns and VDI StoryTowards the Cloud: Architecture Patterns and VDI Story
Towards the Cloud: Architecture Patterns and VDI Story
IT Expert Club
 
VMworld 2013: How to Replace Websphere Application Server (WAS) with TCserver
VMworld 2013: How to Replace Websphere Application Server (WAS) with TCserver VMworld 2013: How to Replace Websphere Application Server (WAS) with TCserver
VMworld 2013: How to Replace Websphere Application Server (WAS) with TCserver
VMworld
 
(ATS4-PLAT08) Server Pool Management
(ATS4-PLAT08) Server Pool Management(ATS4-PLAT08) Server Pool Management
(ATS4-PLAT08) Server Pool Management
BIOVIA
 
Through the JMX Window
Through the JMX WindowThrough the JMX Window
Through the JMX Window
C2B2 Consulting
 
High Availbilty In Sql Server
High Availbilty In Sql ServerHigh Availbilty In Sql Server
High Availbilty In Sql Server
Rishikesh Tiwari
 
Planning For Catastrophe with IBM WAS and IBM BPM
Planning For Catastrophe with IBM WAS and IBM BPMPlanning For Catastrophe with IBM WAS and IBM BPM
Planning For Catastrophe with IBM WAS and IBM BPM
WASdev Community
 
Microservices architecture ext
Microservices architecture extMicroservices architecture ext
Microservices architecture ext
Vikash Kodati
 
Building a Scalable Architecture for web apps
Building a Scalable Architecture for web appsBuilding a Scalable Architecture for web apps
Building a Scalable Architecture for web apps
Directi Group
 
Achieving Zero Downtime for SQL
Achieving Zero Downtime for SQLAchieving Zero Downtime for SQL
Achieving Zero Downtime for SQL
ScaleArc
 
Oracle Coherence Strategy and Roadmap (OpenWorld, September 2014)
Oracle Coherence Strategy and Roadmap (OpenWorld, September 2014)Oracle Coherence Strategy and Roadmap (OpenWorld, September 2014)
Oracle Coherence Strategy and Roadmap (OpenWorld, September 2014)
jeckels
 
Webinar Slides: Real-Time Replication vs. ETL - How Analytics Requires New Te...
Webinar Slides: Real-Time Replication vs. ETL - How Analytics Requires New Te...Webinar Slides: Real-Time Replication vs. ETL - How Analytics Requires New Te...
Webinar Slides: Real-Time Replication vs. ETL - How Analytics Requires New Te...
Continuent
 
URP? Excuse You! The Three Kafka Metrics You Need to Know
URP? Excuse You! The Three Kafka Metrics You Need to KnowURP? Excuse You! The Three Kafka Metrics You Need to Know
URP? Excuse You! The Three Kafka Metrics You Need to Know
Todd Palino
 
Monitoring Oracle SOA Suite
Monitoring Oracle SOA SuiteMonitoring Oracle SOA Suite
Monitoring Oracle SOA Suite
C2B2 Consulting
 

What's hot (20)

VMworld 2013: Deep Dive into vSphere Log Management with vCenter Log Insight
VMworld 2013: Deep Dive into vSphere Log Management with vCenter Log InsightVMworld 2013: Deep Dive into vSphere Log Management with vCenter Log Insight
VMworld 2013: Deep Dive into vSphere Log Management with vCenter Log Insight
 
VMware Log Insight
VMware Log Insight VMware Log Insight
VMware Log Insight
 
DrupalCamp LA 2014 - A Perfect Launch, Every Time
DrupalCamp LA 2014 - A Perfect Launch, Every TimeDrupalCamp LA 2014 - A Perfect Launch, Every Time
DrupalCamp LA 2014 - A Perfect Launch, Every Time
 
Adding Value in the Cloud with Performance Test
Adding Value in the Cloud with Performance TestAdding Value in the Cloud with Performance Test
Adding Value in the Cloud with Performance Test
 
Performance testing virtualized systems v5
Performance testing virtualized systems v5Performance testing virtualized systems v5
Performance testing virtualized systems v5
 
Database failover from client perspective
Database failover from client perspectiveDatabase failover from client perspective
Database failover from client perspective
 
Datasheet weblogic midvisionextensionforibmraf
Datasheet weblogic midvisionextensionforibmrafDatasheet weblogic midvisionextensionforibmraf
Datasheet weblogic midvisionextensionforibmraf
 
Towards the Cloud: Architecture Patterns and VDI Story
Towards the Cloud: Architecture Patterns and VDI StoryTowards the Cloud: Architecture Patterns and VDI Story
Towards the Cloud: Architecture Patterns and VDI Story
 
VMworld 2013: How to Replace Websphere Application Server (WAS) with TCserver
VMworld 2013: How to Replace Websphere Application Server (WAS) with TCserver VMworld 2013: How to Replace Websphere Application Server (WAS) with TCserver
VMworld 2013: How to Replace Websphere Application Server (WAS) with TCserver
 
(ATS4-PLAT08) Server Pool Management
(ATS4-PLAT08) Server Pool Management(ATS4-PLAT08) Server Pool Management
(ATS4-PLAT08) Server Pool Management
 
Through the JMX Window
Through the JMX WindowThrough the JMX Window
Through the JMX Window
 
High Availbilty In Sql Server
High Availbilty In Sql ServerHigh Availbilty In Sql Server
High Availbilty In Sql Server
 
Planning For Catastrophe with IBM WAS and IBM BPM
Planning For Catastrophe with IBM WAS and IBM BPMPlanning For Catastrophe with IBM WAS and IBM BPM
Planning For Catastrophe with IBM WAS and IBM BPM
 
Microservices architecture ext
Microservices architecture extMicroservices architecture ext
Microservices architecture ext
 
Building a Scalable Architecture for web apps
Building a Scalable Architecture for web appsBuilding a Scalable Architecture for web apps
Building a Scalable Architecture for web apps
 
Achieving Zero Downtime for SQL
Achieving Zero Downtime for SQLAchieving Zero Downtime for SQL
Achieving Zero Downtime for SQL
 
Oracle Coherence Strategy and Roadmap (OpenWorld, September 2014)
Oracle Coherence Strategy and Roadmap (OpenWorld, September 2014)Oracle Coherence Strategy and Roadmap (OpenWorld, September 2014)
Oracle Coherence Strategy and Roadmap (OpenWorld, September 2014)
 
Webinar Slides: Real-Time Replication vs. ETL - How Analytics Requires New Te...
Webinar Slides: Real-Time Replication vs. ETL - How Analytics Requires New Te...Webinar Slides: Real-Time Replication vs. ETL - How Analytics Requires New Te...
Webinar Slides: Real-Time Replication vs. ETL - How Analytics Requires New Te...
 
URP? Excuse You! The Three Kafka Metrics You Need to Know
URP? Excuse You! The Three Kafka Metrics You Need to KnowURP? Excuse You! The Three Kafka Metrics You Need to Know
URP? Excuse You! The Three Kafka Metrics You Need to Know
 
Monitoring Oracle SOA Suite
Monitoring Oracle SOA SuiteMonitoring Oracle SOA Suite
Monitoring Oracle SOA Suite
 

Viewers also liked

USGS Report on the Impact of Marcellus Shale Drilling on Forest Animal Habitats
USGS Report on the Impact of Marcellus Shale Drilling on Forest Animal HabitatsUSGS Report on the Impact of Marcellus Shale Drilling on Forest Animal Habitats
USGS Report on the Impact of Marcellus Shale Drilling on Forest Animal Habitats
Marcellus Drilling News
 
Nuvola: a tale of migration to AWS
Nuvola: a tale of migration to AWSNuvola: a tale of migration to AWS
Nuvola: a tale of migration to AWS
Matteo Moretti
 
Acts 6:1-7 ~ Organic Growth of the Early Church (pt. 1)
Acts 6:1-7 ~ Organic Growth of the Early Church (pt. 1)Acts 6:1-7 ~ Organic Growth of the Early Church (pt. 1)
Acts 6:1-7 ~ Organic Growth of the Early Church (pt. 1)
Laura Zielke
 
Catálogo Elk Sport 2016 2017
Catálogo Elk Sport 2016 2017Catálogo Elk Sport 2016 2017
Catálogo Elk Sport 2016 2017
Elk Sport
 
Deploying services: automation with docker and ansible
Deploying services: automation with docker and ansibleDeploying services: automation with docker and ansible
Deploying services: automation with docker and ansible
John Zaccone
 
Heterogenous Persistence
Heterogenous PersistenceHeterogenous Persistence
Heterogenous Persistence
Jervin Real
 
Distributed cat herding
Distributed cat herdingDistributed cat herding
Distributed cat herding
Jilles van Gurp
 
Urban legends - PJ Hagerty - Codemotion Amsterdam 2017
Urban legends - PJ Hagerty - Codemotion Amsterdam 2017Urban legends - PJ Hagerty - Codemotion Amsterdam 2017
Urban legends - PJ Hagerty - Codemotion Amsterdam 2017
Codemotion
 
Micro Services - Small is Beautiful
Micro Services - Small is BeautifulMicro Services - Small is Beautiful
Micro Services - Small is Beautiful
Eberhard Wolff
 
Adaptive Content Show & Tell - Austin Content
Adaptive Content Show & Tell - Austin ContentAdaptive Content Show & Tell - Austin Content
Adaptive Content Show & Tell - Austin Content
cdelk
 
Resume
ResumeResume
Data Visualization on the Tech Side
Data Visualization on the Tech SideData Visualization on the Tech Side
Data Visualization on the Tech Side
Mathieu Elie
 
Fuel cell
Fuel cellFuel cell
Fuel cell
Ahmed M. Elkholy
 
Open Secrets of the Defense Industry: Building Your Own Intelligence Program ...
Open Secrets of the Defense Industry: Building Your Own Intelligence Program ...Open Secrets of the Defense Industry: Building Your Own Intelligence Program ...
Open Secrets of the Defense Industry: Building Your Own Intelligence Program ...
Sean Whalen
 
Bsides Delhi Security Automation for Red and Blue Teams
Bsides Delhi Security Automation for Red and Blue TeamsBsides Delhi Security Automation for Red and Blue Teams
Bsides Delhi Security Automation for Red and Blue Teams
Suraj Pratap
 
Tubular Labs - Using Elastic to Search Over 2.5B Videos
Tubular Labs - Using Elastic to Search Over 2.5B VideosTubular Labs - Using Elastic to Search Over 2.5B Videos
Tubular Labs - Using Elastic to Search Over 2.5B Videos
Tubular Labs
 
"Mini Texts"
"Mini Texts" "Mini Texts"
"Mini Texts"
Emily Kissner
 
Splunk Dynamic lookup
Splunk Dynamic lookupSplunk Dynamic lookup
Splunk Dynamic lookup
Splunk
 
Secure Yourself, Practice what we preach - BSides Austin 2015
Secure Yourself, Practice what we preach - BSides Austin 2015Secure Yourself, Practice what we preach - BSides Austin 2015
Secure Yourself, Practice what we preach - BSides Austin 2015
Michael Gough
 
Honey Potz - BSides SLC 2015
Honey Potz - BSides SLC 2015Honey Potz - BSides SLC 2015
Honey Potz - BSides SLC 2015
Ethan Dodge
 

Viewers also liked (20)

USGS Report on the Impact of Marcellus Shale Drilling on Forest Animal Habitats
USGS Report on the Impact of Marcellus Shale Drilling on Forest Animal HabitatsUSGS Report on the Impact of Marcellus Shale Drilling on Forest Animal Habitats
USGS Report on the Impact of Marcellus Shale Drilling on Forest Animal Habitats
 
Nuvola: a tale of migration to AWS
Nuvola: a tale of migration to AWSNuvola: a tale of migration to AWS
Nuvola: a tale of migration to AWS
 
Acts 6:1-7 ~ Organic Growth of the Early Church (pt. 1)
Acts 6:1-7 ~ Organic Growth of the Early Church (pt. 1)Acts 6:1-7 ~ Organic Growth of the Early Church (pt. 1)
Acts 6:1-7 ~ Organic Growth of the Early Church (pt. 1)
 
Catálogo Elk Sport 2016 2017
Catálogo Elk Sport 2016 2017Catálogo Elk Sport 2016 2017
Catálogo Elk Sport 2016 2017
 
Deploying services: automation with docker and ansible
Deploying services: automation with docker and ansibleDeploying services: automation with docker and ansible
Deploying services: automation with docker and ansible
 
Heterogenous Persistence
Heterogenous PersistenceHeterogenous Persistence
Heterogenous Persistence
 
Distributed cat herding
Distributed cat herdingDistributed cat herding
Distributed cat herding
 
Urban legends - PJ Hagerty - Codemotion Amsterdam 2017
Urban legends - PJ Hagerty - Codemotion Amsterdam 2017Urban legends - PJ Hagerty - Codemotion Amsterdam 2017
Urban legends - PJ Hagerty - Codemotion Amsterdam 2017
 
Micro Services - Small is Beautiful
Micro Services - Small is BeautifulMicro Services - Small is Beautiful
Micro Services - Small is Beautiful
 
Adaptive Content Show & Tell - Austin Content
Adaptive Content Show & Tell - Austin ContentAdaptive Content Show & Tell - Austin Content
Adaptive Content Show & Tell - Austin Content
 
Resume
ResumeResume
Resume
 
Data Visualization on the Tech Side
Data Visualization on the Tech SideData Visualization on the Tech Side
Data Visualization on the Tech Side
 
Fuel cell
Fuel cellFuel cell
Fuel cell
 
Open Secrets of the Defense Industry: Building Your Own Intelligence Program ...
Open Secrets of the Defense Industry: Building Your Own Intelligence Program ...Open Secrets of the Defense Industry: Building Your Own Intelligence Program ...
Open Secrets of the Defense Industry: Building Your Own Intelligence Program ...
 
Bsides Delhi Security Automation for Red and Blue Teams
Bsides Delhi Security Automation for Red and Blue TeamsBsides Delhi Security Automation for Red and Blue Teams
Bsides Delhi Security Automation for Red and Blue Teams
 
Tubular Labs - Using Elastic to Search Over 2.5B Videos
Tubular Labs - Using Elastic to Search Over 2.5B VideosTubular Labs - Using Elastic to Search Over 2.5B Videos
Tubular Labs - Using Elastic to Search Over 2.5B Videos
 
"Mini Texts"
"Mini Texts" "Mini Texts"
"Mini Texts"
 
Splunk Dynamic lookup
Splunk Dynamic lookupSplunk Dynamic lookup
Splunk Dynamic lookup
 
Secure Yourself, Practice what we preach - BSides Austin 2015
Secure Yourself, Practice what we preach - BSides Austin 2015Secure Yourself, Practice what we preach - BSides Austin 2015
Secure Yourself, Practice what we preach - BSides Austin 2015
 
Honey Potz - BSides SLC 2015
Honey Potz - BSides SLC 2015Honey Potz - BSides SLC 2015
Honey Potz - BSides SLC 2015
 

Similar to Expect the unexpected: Prepare for failures in microservices

Expect the unexpected: Anticipate and prepare for failures in microservices b...
Expect the unexpected: Anticipate and prepare for failures in microservices b...Expect the unexpected: Anticipate and prepare for failures in microservices b...
Expect the unexpected: Anticipate and prepare for failures in microservices b...
Bhakti Mehta
 
Azure architecture design patterns - proven solutions to common challenges
Azure architecture design patterns - proven solutions to common challengesAzure architecture design patterns - proven solutions to common challenges
Azure architecture design patterns - proven solutions to common challenges
Ivo Andreev
 
WebLogic Stability; Detect and Analyse Stuck Threads
WebLogic Stability; Detect and Analyse Stuck ThreadsWebLogic Stability; Detect and Analyse Stuck Threads
WebLogic Stability; Detect and Analyse Stuck Threads
Maarten Smeets
 
Microservices for java architects it-symposium-2015-09-15
Microservices for java architects it-symposium-2015-09-15Microservices for java architects it-symposium-2015-09-15
Microservices for java architects it-symposium-2015-09-15
Derek Ashmore
 
Debugging Microservices - key challenges and techniques - Microservices Odesa...
Debugging Microservices - key challenges and techniques - Microservices Odesa...Debugging Microservices - key challenges and techniques - Microservices Odesa...
Debugging Microservices - key challenges and techniques - Microservices Odesa...
Lohika_Odessa_TechTalks
 
Tech talk microservices debugging
Tech talk microservices debuggingTech talk microservices debugging
Tech talk microservices debugging
Andrey Kolodnitsky
 
Performance and Scalability Tuning
Performance and Scalability TuningPerformance and Scalability Tuning
Performance and Scalability Tuning
Andres March
 
APIs, STOP Polling, lets go Streaming
APIs, STOP Polling, lets go StreamingAPIs, STOP Polling, lets go Streaming
APIs, STOP Polling, lets go Streaming
Phil Wilkins
 
Caching up is hard to do: Improving your Web Services' Performance
Caching up is hard to do: Improving your Web Services' PerformanceCaching up is hard to do: Improving your Web Services' Performance
Caching up is hard to do: Improving your Web Services' Performance
RTigger
 
Fault Tolerance in Distributed Environment
Fault Tolerance in Distributed EnvironmentFault Tolerance in Distributed Environment
Fault Tolerance in Distributed Environment
Orkhan Gasimov
 
Stay productive_while_slicing_up_the_monolith
Stay productive_while_slicing_up_the_monolithStay productive_while_slicing_up_the_monolith
Stay productive_while_slicing_up_the_monolith

Expect the unexpected: Prepare for failures in microservices

  • 1. Expect the unexpected: Anticipate and prepare for failures in microservices • Bhakti Mehta • @bhakti_mehta
  • 2. Introduction • Senior Software Engineer at Blue Jeans Network • Worked at Sun Microsystems/Oracle for 13 years • Committer to numerous open source projects including GlassFish Application Server
  • 6. Blue Jeans Network • Video conferencing in the cloud • Customers in all segments • Millions of users • Interoperable • Video sharing, content sharing • Mobile friendly • Solutions for large-scale events
  • 7. What you will learn • Microservices architecture • Challenges at scale • Lessons learned, tips and practices to prevent cascading failures • Resilience planning at various stages • Real world examples
  • 8. Top-level architecture (diagram labels: Customer A, Customer B, Internet, SIP / H.323, HTTP / HTTPS, media node, web server, middleware services, cache, service discovery, messaging, DB, proxy layer, connector node)
  • 10. Path to microservices • Advantages – Simplicity – Isolation of problems – Scale up and scale down – Easy deployment – Clear separation of concerns – Heterogeneity and polyglotism
  • 11. Microservices • Disadvantages – Not a free lunch! – Distributed systems are prone to failures – Eventual consistency – More effort in terms of deployments and release management – Challenges in testing the various services evolving independently, regression tests, etc.
  • 12. Monoliths to microservices
  • 13. Resilient system • Processes transactions even when there are transient impulses or persistent stresses • Functions even when component failures disrupt normal processing • Accepts that failures will happen • Designs for crumple zones
  • 14. Kinds of failures • Challenges at scale • Integration point failures – Network errors – Semantic errors – Slow responses – Outright hang – GC issues
  • 17. Anticipate failures at scale • Anticipate growth • Design for the next order of magnitude • Design for 10x, plan to rewrite for 100x
  • 19. Resiliency planning Stage 1 • When developing code – Avoiding cascading failures • Circuit breaker • Timeouts • Retry • Bulkhead • Cache optimizations – Avoid malicious clients • Rate limiting
  • 20. Resiliency planning Stage 2 • Planning for dealing with failures before deploy – Load test – A/B test – Longevity test
  • 21. Resiliency planning Stage 3 • Watching out for failures after deploy – Health check – Metrics
  • 23. Cascading failures • Caused by chain reactions • For example: one node in a load-balanced group fails, the others need to pick up its work, and eventually performance can degrade
  • 24. Cascading failures with aggregation
  • 25. Cascading failure with aggregation
  • 27. Timeouts • Clients may prefer a response – Failure – Success – Job queued for later • All aggregation requests to microservices should have reasonable timeouts set
  • 28. Types of timeouts • Connection timeout – Maximum time to wait for a connection to be established before reporting an error • Socket timeout – Maximum time of inactivity between two packets once the connection is established
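The two timeout types above map onto two separate settings in most HTTP clients. As a minimal sketch using only the JDK's `HttpURLConnection` (the class name `TimeoutConfig`, the URL and the values in the usage note are illustrative assumptions, not part of the talk), both can be set before any request is sent:

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.HttpURLConnection;
import java.net.URI;

// Hypothetical helper, for illustration only.
final class TimeoutConfig {
    /** Prepares (but does not open) an HTTP connection with both timeouts set, in milliseconds. */
    static HttpURLConnection prepare(String url, int connectMillis, int readMillis) {
        try {
            HttpURLConnection conn =
                    (HttpURLConnection) URI.create(url).toURL().openConnection();
            // Connection timeout: maximum wait for the connection to be established.
            conn.setConnectTimeout(connectMillis);
            // Socket (read) timeout: maximum wait for data once the connection is established.
            conn.setReadTimeout(readMillis);
            return conn;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

For example, `TimeoutConfig.prepare("http://localhost:8080/users", 2000, 5000)` would give up after 2 seconds if no connection can be made, and after 5 seconds of silence on an established connection.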
  • 29. Timeouts pattern • Timeouts and retries go together • Transient failures can be remedied with fast retries • However, network problems can last for a while, so immediate retries are likely to fail as well
  • 30. Timeouts in code • In JAX-RS (using Jersey's ClientProperties; values are in milliseconds): Client client = ClientBuilder.newClient(); client.property(ClientProperties.CONNECT_TIMEOUT, 5000); client.property(ClientProperties.READ_TIMEOUT, 5000);
  • 31. Retry pattern • Retry on network failures, timeouts or server errors • Helps with transient network errors such as dropped connections or server failover
  • 32. Retry pattern • If one of the services is slow or malfunctioning and other services keep retrying, the problem becomes worse • Solution – Exponential backoff – Circuit breaker pattern
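Exponential backoff, named above as one remedy for retry storms, can be sketched in a few lines. The helper below is an assumption-laden illustration (the class name `Retry`, its parameters and the choice to double the delay are ours, not from the talk): it retries a failing call a bounded number of times and doubles the wait after each failure.

```java
import java.util.function.Supplier;

// Hypothetical helper, for illustration only.
final class Retry {
    /** Calls op up to maxAttempts times, doubling the delay after each failed attempt. */
    static <T> T withBackoff(Supplier<T> op, int maxAttempts, long initialDelayMillis) {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be >= 1");
        }
        long delay = initialDelayMillis;
        RuntimeException last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return op.get();
            } catch (RuntimeException e) {
                last = e;
                if (attempt == maxAttempts) {
                    break; // out of attempts: report the last failure
                }
                try {
                    Thread.sleep(delay);
                } catch (InterruptedException ie) {
                    Thread.currentThread().interrupt();
                    throw new IllegalStateException("interrupted while backing off", ie);
                }
                delay *= 2; // exponential growth: d, 2d, 4d, ...
            }
        }
        throw last;
    }
}
```

A production version would usually also cap the delay and add random jitter so that many clients do not retry in lockstep; both are left out here to keep the sketch short.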
  • 33. Circuit breaker pattern • A circuit breaker is an electrical device used in an electrical panel that monitors and controls the amount of current (amps) being sent through the wiring
  • 34. Circuit breaker pattern • Safety device • If a power surge occurs in the electrical wiring, the breaker will trip • It flips from "On" to "Off" and shuts off electrical power from that breaker
  • 35. Circuit breaker • Netflix Hystrix follows the circuit breaker pattern • If a service's error rate exceeds a threshold, it trips the circuit breaker and blocks requests for a specific period of time
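To make the trip-and-block behavior concrete, here is a deliberately small circuit breaker sketch. It is not Hystrix and not the speaker's code: it counts consecutive failures instead of an error rate, takes an injectable clock so the open period can be tested, and serializes calls with `synchronized` for simplicity. All names are hypothetical.

```java
import java.util.function.LongSupplier;
import java.util.function.Supplier;

// Hypothetical, simplified breaker, for illustration only.
final class CircuitBreaker {
    private final int failureThreshold;
    private final long openMillis;
    private final LongSupplier clock; // returns the current time in milliseconds
    private int consecutiveFailures = 0;
    private long openedAt = -1; // -1 means the breaker has not tripped

    CircuitBreaker(int failureThreshold, long openMillis, LongSupplier clock) {
        this.failureThreshold = failureThreshold;
        this.openMillis = openMillis;
        this.clock = clock;
    }

    synchronized boolean isOpen() {
        return openedAt >= 0 && clock.getAsLong() - openedAt < openMillis;
    }

    /** Runs op unless the breaker is open; returns the fallback on rejection or failure. */
    synchronized <T> T call(Supplier<T> op, Supplier<T> fallback) {
        if (isOpen()) {
            return fallback.get(); // fail fast without touching the struggling service
        }
        try {
            T result = op.get();
            consecutiveFailures = 0; // success closes the breaker again
            openedAt = -1;
            return result;
        } catch (RuntimeException e) {
            consecutiveFailures++;
            if (consecutiveFailures >= failureThreshold) {
                openedAt = clock.getAsLong(); // trip (or re-trip after a failed trial call)
            }
            return fallback.get();
        }
    }
}
```

Once the open period has elapsed, the next call is let through as a trial: a success resets the breaker, a failure trips it again for another period.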
  • 37. Bulkhead • Avoiding chain reactions by isolating failures • Helps prevent cascading failures
  • 38. Bulkhead • An example of a bulkhead could be isolating the database dependencies per service • Similarly, other infrastructure components can be isolated, such as cache infrastructure
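The slides describe bulkheads at the infrastructure level (a database or cache per service). The same idea is often applied inside a process by capping how many concurrent calls one dependency may consume. The sketch below shows that in-process variant with a `Semaphore`; the class name and API are assumptions for illustration, not something taken from the talk.

```java
import java.util.concurrent.Semaphore;
import java.util.function.Supplier;

// Hypothetical in-process bulkhead, for illustration only.
final class Bulkhead {
    private final Semaphore permits;

    Bulkhead(int maxConcurrentCalls) {
        this.permits = new Semaphore(maxConcurrentCalls);
    }

    /** Runs op if a permit is free; otherwise returns the fallback immediately. */
    <T> T call(Supplier<T> op, Supplier<T> fallback) {
        if (!permits.tryAcquire()) {
            return fallback.get(); // compartment is full: reject instead of queueing
        }
        try {
            return op.get();
        } finally {
            permits.release();
        }
    }
}
```

Giving each downstream dependency its own `Bulkhead` instance means a slow dependency can exhaust only its own permits, not every request thread in the service.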
  • 39. Rate limiting • Restricting the number of requests that can be made by a client • The client can be identified based on the access token used • Additionally, clients can be identified based on IP address
  • 40. Rate limiting • With JAX-RS, rate limiting can be implemented as a filter • This filter can check the access count for a client and, if it is within the limit, accept the request • Otherwise, return a 429 (Too Many Requests) error • Code at https://github.com/bhakti-mehta/samples/tree/master/ratelimiting
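The counting logic such a filter needs can be sketched independently of JAX-RS. The following is not the code from the linked repository; it is a minimal fixed-window counter under assumed names (`RateLimiter`, `check`), keyed by whatever client identifier the filter extracts (access token or IP address), and returning the HTTP status the filter should answer with.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical fixed-window limiter, for illustration only.
final class RateLimiter {
    static final int OK = 200;
    static final int TOO_MANY_REQUESTS = 429;

    private final int limit;
    private final long windowMillis;
    // clientId -> {window start in ms, requests seen in this window}
    private final Map<String, long[]> windows = new HashMap<>();

    RateLimiter(int limit, long windowMillis) {
        this.limit = limit;
        this.windowMillis = windowMillis;
    }

    /** Returns 200 if the request is within the client's limit, 429 otherwise. */
    synchronized int check(String clientId, long nowMillis) {
        long[] w = windows.computeIfAbsent(clientId, k -> new long[] {nowMillis, 0});
        if (nowMillis - w[0] >= windowMillis) {
            w[0] = nowMillis; // start a new window
            w[1] = 0;
        }
        if (w[1] >= limit) {
            return TOO_MANY_REQUESTS;
        }
        w[1]++;
        return OK;
    }
}
```

A fixed window is the simplest scheme; it allows short bursts at window boundaries, which sliding-window or token-bucket limiters avoid at the cost of more bookkeeping.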
  • 41. Cache optimizations • Stores response information related to requests in temporary storage for a specific period of time • Ensures that the server is not burdened with processing those requests in future when responses can be fulfilled from the cache
  • 42. Cache optimizations • Getting from first-level cache • Getting from second-level cache • Getting from the DB
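The lookup order on the slide above (first-level cache, then second-level cache, then the database) can be sketched as a read-through chain. Everything here is an illustrative assumption: the first level is an in-process map, the second level stands in for a shared cache, the "database" is just a function, and expiry after a period of time is omitted for brevity.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

// Hypothetical read-through cache, for illustration only.
final class TwoLevelCache<K, V> {
    private final Map<K, V> firstLevel = new ConcurrentHashMap<>(); // local, in-process
    private final Map<K, V> secondLevel;                            // stand-in for a shared cache
    private final Function<K, V> database;                          // stand-in for the DB query

    TwoLevelCache(Map<K, V> secondLevel, Function<K, V> database) {
        this.secondLevel = secondLevel;
        this.database = database;
    }

    V get(K key) {
        V value = firstLevel.get(key);
        if (value != null) {
            return value; // first-level hit
        }
        value = secondLevel.get(key);
        if (value == null) {
            value = database.apply(key); // miss in both caches: go to the DB
            if (value == null) {
                return null;
            }
            secondLevel.put(key, value);
        }
        firstLevel.put(key, value); // populate the faster level on the way back
        return value;
    }
}
```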
  • 43. Dealing with latencies in response • Have a timeout for the aggregation service • Dispatch requests in parallel and collect responses • Associate a priority with all the responses collected
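A rough sketch of the first two points, using `CompletableFuture` from Java 8: requests are dispatched in parallel, the aggregator waits until one overall deadline, and whatever has not arrived by then is simply left out of the result. The class name, the map-based API and the string payloads are assumptions for illustration; prioritizing responses is not covered here.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;
import java.util.function.Supplier;

// Hypothetical aggregator, for illustration only.
final class Aggregator {
    /** Runs all calls in parallel and returns the responses that arrived before the deadline. */
    static Map<String, String> aggregate(Map<String, Supplier<String>> calls, long timeoutMillis) {
        Map<String, CompletableFuture<String>> futures = new LinkedHashMap<>();
        calls.forEach((name, call) -> futures.put(name, CompletableFuture.supplyAsync(call)));

        long deadline = System.nanoTime() + TimeUnit.MILLISECONDS.toNanos(timeoutMillis);
        Map<String, String> results = new LinkedHashMap<>();
        for (Map.Entry<String, CompletableFuture<String>> e : futures.entrySet()) {
            long remaining = Math.max(0, deadline - System.nanoTime());
            try {
                results.put(e.getKey(), e.getValue().get(remaining, TimeUnit.NANOSECONDS));
            } catch (InterruptedException ie) {
                Thread.currentThread().interrupt();
                break;
            } catch (ExecutionException | TimeoutException ex) {
                // Slow or failed service: omit it and return a partial result.
            }
        }
        return results;
    }
}
```

The deadline is shared by all calls, so the aggregation as a whole never takes noticeably longer than `timeoutMillis`, however many services are slow.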
  • 44. Handling partial failures: best practices • One service calls another, which can be slow or unavailable • Never block indefinitely waiting for the service • Try to return partial results • Provide a caching layer and return cached data
  • 45. Asynchronous patterns • Pattern to deal with long-running jobs • Some resources may take a long time to provide results • The client does not need to wait for the response
  • 46. Reactive programming model • Use reactive programming such as CompletableFuture in Java 8, ListenableFuture • RxJava
  • 47. Asynchronous API • Reactive patterns • Message passing – Akka actor model • Message queues – Communication between services via shared message queues – WebSockets
  • 48. Logging • Complex distributed systems introduce many points of failure • Logging helps link events/transactions between the various components that make up an application or a business service • ELK stack • Splunk, syslog • Loggly • Logentries
  • 49. Logging best practices • Include a detailed, consistent pattern across service logs • Obfuscate sensitive data • Identify the caller or initiator as part of logs • Do not log payloads by default
  • 50. Best practices when designing APIs for mobile clients – Avoid chattiness – Use the aggregator pattern
  • 51. Resilience planning Stage 2 • Before deploy – Load testing – Longevity testing – Capacity planning
  • 52. Load testing • Ensure that you test for load on APIs – JMeter • Plan for longevity testing
  • 53. Capacity planning • Anticipate growth • Design for handling exponential growth
  • 54. Resilience planning Stage 3 • After deploy – Health check – Metrics and monitoring – Phased rollout of features
  • 56. Health check • Memory • CPU • Threads • Error rate • If any of the checks exceeds a threshold, send an alert
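The threshold rule on the health-check slide can be sketched with the JDK's management beans. This is an assumption-heavy illustration: the metric names, the `HealthCheck` class and the idea of passing the error rate in from application counters are ours, and "sending an alert" is reduced to returning a list of messages.

```java
import java.lang.management.ManagementFactory;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical health check, for illustration only.
final class HealthCheck {
    /** Samples JVM-level readings; the error rate must be supplied by the application. */
    static Map<String, Double> sample(double errorRate) {
        Runtime rt = Runtime.getRuntime();
        Map<String, Double> readings = new LinkedHashMap<>();
        // Fraction of the maximum heap currently in use.
        readings.put("memory", (double) (rt.totalMemory() - rt.freeMemory()) / rt.maxMemory());
        // System load average over the last minute; negative if the platform cannot report it.
        readings.put("cpuLoadAverage",
                ManagementFactory.getOperatingSystemMXBean().getSystemLoadAverage());
        // Number of live threads in this JVM.
        readings.put("threads", (double) ManagementFactory.getThreadMXBean().getThreadCount());
        readings.put("errorRate", errorRate);
        return readings;
    }

    /** Returns one alert message per reading that exceeds its threshold. */
    static List<String> alerts(Map<String, Double> readings, Map<String, Double> thresholds) {
        List<String> alerts = new ArrayList<>();
        for (Map.Entry<String, Double> t : thresholds.entrySet()) {
            Double value = readings.get(t.getKey());
            if (value != null && value > t.getValue()) {
                alerts.add(t.getKey() + "=" + value + " exceeds threshold " + t.getValue());
            }
        }
        return alerts;
    }
}
```

In practice the alert list would be handed to whatever notification channel the monitoring setup uses, and the check would run on a schedule rather than on demand.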
  • 58. Metrics • Response times, throughput – Identify slow-running DB queries • GC rate and pause duration – Garbage collection can cause slow responses • Monitor unusual activity • Third-party library metrics – For example, Couchbase hits – atop
  • 59. Metrics • Load average • Uptime • Log sizes
  • 60. Monitoring (diagram labels: monitoring server, production environment, checks, alerts, email)
  • 61. Monitoring stack • Application: log aggregation framework • OS / application code: software analytics tool that monitors performance • Network, server: collectd / Graphite
  • 62. Rollout of new features • Phase the rollout of new features • Have a way to turn features off if they are not behaving as expected • Alerts and more alerts!
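Phased rollout with an off switch is commonly implemented with feature flags. The sketch below is one hypothetical way to do it (class name, percentage-based bucketing and the in-memory map are all assumptions, not from the talk): each feature has a rollout percentage, users are assigned to a stable bucket by hashing, and setting the percentage to zero turns the feature off for everyone.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical feature-flag registry, for illustration only.
final class FeatureFlags {
    private final Map<String, Integer> rolloutPercent = new ConcurrentHashMap<>();

    /** Enables the feature for roughly the given percentage of users (0 to 100). */
    void setRollout(String feature, int percent) {
        if (percent < 0 || percent > 100) {
            throw new IllegalArgumentException("percent must be between 0 and 100");
        }
        rolloutPercent.put(feature, percent);
    }

    /** Kill switch: turns the feature off for everyone. */
    void disable(String feature) {
        rolloutPercent.put(feature, 0);
    }

    /** Unknown features are off; a given user always lands in the same bucket for a feature. */
    boolean isEnabled(String feature, String userId) {
        int percent = rolloutPercent.getOrDefault(feature, 0);
        int bucket = Math.floorMod((feature + ":" + userId).hashCode(), 100);
        return bucket < percent;
    }
}
```

Because the bucket depends only on the feature name and user id, raising the percentage from 10 to 50 keeps the original 10 percent enabled and adds new users, which makes a gradual rollout predictable.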
  • 63. Real time examples • Netflix's Simian Army induces failures of services and even datacenters during the working day to test both the application's resilience and monitoring • Latency Monkey to simulate slow-running requests • WireMock to mock services • Saboteur to create deliberate network mayhem
  • 64. Takeaway • Inevitability of failures – Expect that systems will fail – Failure prevention
  • 66. References • https://commons.wikimedia.org/wiki/File:Bulkhead_PSF.png • https://en.wikipedia.org/wiki/Circuit_breaker#/media/File:Four_1_pole_circuit_breakers_fitted_in_a_meter_box.jpg • https://www.flickr.com/photos/skynoir/ Beer in hand: skynoir/Flickr/Creative Commons License
  • 67. Questions • Twitter: @bhakti_mehta • Email: bhakti@bluejeans.com