CAS Optimizations
-Abhishek Parwal
CAS in brief
An incoming ad request is processed at the Channel Ad Server (CAS) and fanned out to DSPs for ads.
Challenges
• Figure out the hidden performance bottlenecks
  – High latency, low throughput
Performance Gain Summary
                  Initially   Now       % change
Mean Time         611 ms      93 ms     85% reduction
95th Percentile   620 ms      150 ms    76% reduction
QPS               63          409       549% increase
Performance - Initially
Performance - Now
Alternatives explored
• Netty fits our requirements well, and we haven't found it more disadvantageous than the alternatives.
  – Used by Facebook, Twitter and many others for high-performance servers
• We may later experiment with Nifty/Swift, a Thrift server built on Netty – it provides handlers that do the Thrift decoding, plus a few other features.
  – https://github.com/facebook/nifty/
  – https://github.com/facebook/swift
• Other options explored:
  – Dropwizard framework, based on Jetty & Jersey – has built-in integration with Yammer metrics
  – Phoenix 3 / Servlet 3.0 async NIO / Tomcat container
1st shot
What we did
• Connection pool (see the Ning sketch after this list)
  – Initially
    • We used custom-written Netty client code for some partners and Ning for others – neither maintained a connection pool.
    • Had a bug that timed out the call if the response was not written within the cutoff time.
  – Now
    • Connection pool for outbound requests to partners, using Ning – connection setup and teardown time is saved. The integration team is following up with partners to switch to persistent connections.
• Extra threads
  – Had an ExecutionHandler that created a new thread to execute a pipeline handler every time a request arrived. It was unnecessary, since the handler did no blocking work.
  – Shared the worker-pool threads between the server and the client.
• HashedWheelTimer
  – Was polling every 5 ms to check for timeouts, which took around 16–20% CPU. Increased the tick interval to 20 ms.
• File logging
  – Reduced to a minimum.
  – Moved completely to Logback.
  – Introduced TurboFilters.
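For illustration, a minimal sketch of an outbound connection pool with Ning's AsyncHttpClient, assuming the 1.9.x API (method names differ slightly in 1.8.x); the limits, timeouts and partner URL are made-up values, not CAS's real settings:

    import com.ning.http.client.AsyncHttpClient;
    import com.ning.http.client.AsyncHttpClientConfig;
    import com.ning.http.client.Response;
    import java.util.concurrent.Future;

    public class PartnerClient {
        public static void main(String[] args) throws Exception {
            AsyncHttpClientConfig config = new AsyncHttpClientConfig.Builder()
                    .setAllowPoolingConnections(true)   // reuse keep-alive connections to partners
                    .setMaxConnectionsPerHost(200)      // illustrative limits only
                    .setMaxConnections(2000)
                    .setConnectTimeout(100)             // ms
                    .setRequestTimeout(150)             // ms – partner cutoff
                    .build();
            AsyncHttpClient client = new AsyncHttpClient(config);

            // One pooled connection is reused across calls to the same partner host.
            Future<Response> f = client.prepareGet("http://partner.example.com/bid").execute();
            System.out.println(f.get().getStatusCode());
            client.close();
        }
    }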
Hashed Wheel Timer – Latency
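The slide above is a latency chart for the timer. As a minimal sketch of the idea (shown with the Netty 4 API), the HashedWheelTimer is created with the coarser 20 ms tick; the timeout task and cutoff value are illustrative, not CAS's actual handler:

    import io.netty.util.HashedWheelTimer;
    import io.netty.util.Timeout;
    import io.netty.util.TimerTask;
    import java.util.concurrent.TimeUnit;

    public class TimeoutDemo {
        // One shared timer for the whole process; 20 ms tick instead of the old 5 ms polling.
        private static final HashedWheelTimer TIMER = new HashedWheelTimer(20, TimeUnit.MILLISECONDS);

        public static void main(String[] args) {
            // Schedule a cutoff for a (hypothetical) in-flight partner call.
            Timeout cutoff = TIMER.newTimeout(new TimerTask() {
                @Override
                public void run(Timeout timeout) {
                    System.out.println("partner call timed out");
                }
            }, 150, TimeUnit.MILLISECONDS);

            // If the response arrives first, cancel the pending timeout.
            cutoff.cancel();
            TIMER.stop();
        }
    }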
Channel Pipeline – Netty Basics
An IO request enters via Channel.read() and passes through the inbound handlers (Inbound Handler 1, Inbound Handler 2); the IO response passes through the outbound handlers (Outbound Handler 1, Outbound Handler 2) and leaves via Channel.write().
Channel Pipeline – Contd.
protected void initChannel(final SocketChannel ch) throws Exception {
    ChannelPipeline pipeline = ch.pipeline();
    pipeline.addLast("logging", loggingHandler);
    pipeline.addLast("incomingLimitHandler", incomingConnectionLimitHandler);
    pipeline.addLast("decoderEncoder", new HttpServerCodec());
    pipeline.addLast("aggregator", new HttpObjectAggregator(1024 * 1024));
}
Think of a handler in the pipeline as a filter in the servlet world, except that there are only filters and nothing special like a servlet at the end.
Next we did
• Moved from the Netty 3 framework to Netty 4
  – Advantages:
    • Less GC overhead and memory consumption
      – Removal of event objects
      – Buffer pooling
    • Better thread model – http://netty.io/wiki/new-and-noteworthy-in-4.0.html#wiki-h2-34
• Pool of Marshaller, Unmarshaller and DocumentBuilder instances; JAXBContext created once (see the pooling sketch after this list)
  – Used Apache commons-pool for the pooling
• The JSON parsing library we used for request parsing was slow
  – Should have used Jackson.
  – Now the request arrives Thrift-serialized instead.
• Persistent connections on the server side with UMP
  – A CAS timeout handler handles keep-alive connection timeouts; the existing Netty handlers fire read/write timeouts irrespective of whether the operation took place.
  – A thread was waiting unnecessarily for a lock in a synchronized block while sending the response. Release the lock as soon as possible.
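As an illustration of the marshaller pooling, a minimal sketch using Apache commons-pool2 (the slides only say "commons-pool", so the library version and the bound Ad class are assumptions):

    import java.io.StringWriter;
    import javax.xml.bind.JAXBContext;
    import javax.xml.bind.Marshaller;
    import javax.xml.bind.annotation.XmlRootElement;
    import org.apache.commons.pool2.BasePooledObjectFactory;
    import org.apache.commons.pool2.PooledObject;
    import org.apache.commons.pool2.impl.DefaultPooledObject;
    import org.apache.commons.pool2.impl.GenericObjectPool;

    public class MarshallerPool {

        @XmlRootElement
        public static class Ad {            // stand-in for the real CAS response type
            public String id = "creative-1";
        }

        // JAXBContext is thread-safe and expensive to build: create it once.
        private static final JAXBContext CONTEXT;
        static {
            try {
                CONTEXT = JAXBContext.newInstance(Ad.class);
            } catch (Exception e) {
                throw new ExceptionInInitializerError(e);
            }
        }

        // Marshaller is not thread-safe, so pool instances instead of creating one per request.
        private static final GenericObjectPool<Marshaller> POOL =
                new GenericObjectPool<>(new BasePooledObjectFactory<Marshaller>() {
                    @Override public Marshaller create() throws Exception {
                        return CONTEXT.createMarshaller();
                    }
                    @Override public PooledObject<Marshaller> wrap(Marshaller m) {
                        return new DefaultPooledObject<>(m);
                    }
                });

        public static String toXml(Object payload) throws Exception {
            Marshaller m = POOL.borrowObject();
            try {
                StringWriter out = new StringWriter();
                m.marshal(payload, out);
                return out.toString();
            } finally {
                POOL.returnObject(m);       // return the marshaller for reuse
            }
        }

        public static void main(String[] args) throws Exception {
            System.out.println(toXml(new Ad()));
        }
    }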
Removal of Event Objects
// Netty 3: one catch-all callback, dispatching on event type
class Before implements ChannelUpstreamHandler {
    public void handleUpstream(ChannelHandlerContext ctx, ChannelEvent e) {
        if (e instanceof MessageEvent) { ... }
        else if (e instanceof ChannelStateEvent) { ... }
        ...
    }
}

// Netty 4: one callback per event type, no event objects to allocate
class After implements ChannelInboundHandler {
    public void channelActive(ChannelHandlerContext ctx) { ... }
    public void channelInactive(ChannelHandlerContext ctx) { ... }
    public void channelRead(ChannelHandlerContext ctx, Object msg) { ... }
    public void userEventTriggered(ChannelHandlerContext ctx, Object evt) { ... }
    ...
}
Buffer Pooling
• Pooled implementation via ByteBufAllocator, based on jemalloc (see Facebook's talk: https://www.facebook.com/video/video.php?v=696488619305)
  – Why?
    • Doesn't waste memory bandwidth filling buffers with zeros: 'new byte[capacity]' no longer happens every time a message arrives or a response is sent.
  – Advantages:
    • 5x less frequent GC pauses: 45.5 vs. 9.2 times/min
    • 5x less garbage produced: 207.11 vs. 41.81 MiB/s
  – Disadvantage:
    • Memory management is in the user's hands: buffers must be released explicitly. Netty provides leak-reporting facilities.
• We are using pooled direct memory in CAS (see the sketch below).
  – Pooled – a memory pool maintained by Netty
  – Direct – memory allocated outside the JVM heap, so the JVM doesn't garbage-collect it
    • Direct buffers also exist in Java NIO's ByteBuffer.
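A rough sketch of how the pooled allocator is wired into a Netty 4 server bootstrap and how a buffer is released explicitly; the handler, port and pipeline here are illustrative, not the actual CAS pipeline:

    import io.netty.bootstrap.ServerBootstrap;
    import io.netty.buffer.ByteBuf;
    import io.netty.buffer.PooledByteBufAllocator;
    import io.netty.channel.ChannelHandlerContext;
    import io.netty.channel.ChannelInboundHandlerAdapter;
    import io.netty.channel.ChannelInitializer;
    import io.netty.channel.ChannelOption;
    import io.netty.channel.nio.NioEventLoopGroup;
    import io.netty.channel.socket.SocketChannel;
    import io.netty.channel.socket.nio.NioServerSocketChannel;

    public class PooledServer {
        public static void main(String[] args) throws Exception {
            NioEventLoopGroup group = new NioEventLoopGroup();
            try {
                ServerBootstrap b = new ServerBootstrap()
                        .group(group)
                        .channel(NioServerSocketChannel.class)
                        // Use the pooled (jemalloc-style) allocator for child channels.
                        .childOption(ChannelOption.ALLOCATOR, PooledByteBufAllocator.DEFAULT)
                        .childHandler(new ChannelInitializer<SocketChannel>() {
                            @Override
                            protected void initChannel(SocketChannel ch) {
                                ch.pipeline().addLast(new ChannelInboundHandlerAdapter() {
                                    @Override
                                    public void channelRead(ChannelHandlerContext ctx, Object msg) {
                                        ByteBuf in = (ByteBuf) msg;
                                        try {
                                            // ... process the request bytes ...
                                        } finally {
                                            in.release(); // pooled buffers must be released explicitly
                                        }
                                    }
                                });
                            }
                        });
                b.bind(8080).sync().channel().closeFuture().sync();
            } finally {
                group.shutdownGracefully();
            }
        }
    }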
Allocation time for Buffer
Thread Model
Thread Model contd.
Thread Model contd. – Sharing Event Loop
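The thread-model slides are diagrams; as a rough sketch of the event-loop sharing they describe (one worker group serving both the server and the outbound client, so no extra thread pool is needed), assuming Netty 4 – the group sizes are illustrative:

    import io.netty.bootstrap.Bootstrap;
    import io.netty.bootstrap.ServerBootstrap;
    import io.netty.channel.nio.NioEventLoopGroup;
    import io.netty.channel.socket.nio.NioServerSocketChannel;
    import io.netty.channel.socket.nio.NioSocketChannel;

    public class SharedEventLoop {
        public static void main(String[] args) {
            NioEventLoopGroup boss = new NioEventLoopGroup(1);
            NioEventLoopGroup worker = new NioEventLoopGroup(); // defaults to 2 * cores

            // Server side: accepts incoming ad requests.
            ServerBootstrap server = new ServerBootstrap()
                    .group(boss, worker)
                    .channel(NioServerSocketChannel.class);

            // Client side: fans the request out to DSPs, reusing the SAME worker group,
            // so inbound and outbound IO share event-loop threads instead of two pools.
            Bootstrap client = new Bootstrap()
                    .group(worker)
                    .channel(NioSocketChannel.class);

            // ... add handlers, bind the server, connect to partners ...

            worker.shutdownGracefully();
            boss.shutdownGracefully();
        }
    }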
Tools Used
• tcpdump, Wireshark
  – Ning was appending ":80" to the Host header, which the DSP didn't expect – even though this is allowed by the HTTP/1.1 RFC.
• JVisualVM, JProfiler, Eclipse Java Monitor
• Gatling, JMeter – for performance benchmarking
Further
• Use a standard Netty 4 based HTTP client API. Waiting on Ning to release an update.
  – Several good alternatives:
    • https://github.com/brunodecarvalho/http-client
    • https://github.com/timboudreau/netty-http-client
    • Facebook's Nifty / Swift
• Move from VMs to bare-metal boxes.
Thank You!

Editor's Notes

1. Initially: mean 611 ms, 95th percentile 620 ms, QPS 63. Now: mean 93 ms, 95th percentile 150 ms, QPS 409. Improvements: mean time reduced 85%, 95th percentile reduced 76%, QPS increased 549%.
2. Initially: mean 611 ms, 95th percentile 620 ms, QPS 63.
3. Now: mean 93 ms, 95th percentile 150 ms, QPS 409. Improvements: mean time reduced 85%, 95th percentile reduced 76%, QPS increased 549%.
4. https://blog.twitter.com/2013/netty-4-at-twitter-reduced-gc-overhead