Sunday, 19 August 12
Triage
Dealing with errors in production
PyCon Australia 2012

Luke Cawood / @lwcd
Lars Yencken / @larsyencken




99designs



[Architecture diagram, top to bottom: Balancer; Cache ×2; App ×4; Memcache, DB ×2, Queue; Worker]


[Architecture diagram, top to bottom: Balancer; Cache; App ×4; Memcache, DB, Queue; Worker]


Errors



Hmmm....







Triage
• Improve signal-to-noise ratio by aggregating similar errors
• Allow for claiming, resolving and ranking errors in terms of importance
• Integration with GitHub and build tools
• Play with new tools and technology
• Provide an open source alternative to commercial products in this space




Round 1 (Fight!)

• Errors continue to log directly to Mongo
• Aggregation via incremental MapReduce
• Deliver a prototype in one day


Scalability Fatality!

• Worked fine during development
• Production load caused the MapReduce to asplode!
• (Not that we have a lot of errors, right?!)


Round 2




(sub)zeroMQ

• Async error API using ZeroMQ pub/sub sockets
• MessagePack as error format (fast, binary)
• Aggregation in Python




Aggregation Method

• Generate a hash in Python based on the error document
• Query Mongo for the error hash
• Create or update the error document based on the outcome of the query, incrementing counters etc. where appropriate
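The three steps above can be sketched roughly as follows. This is an illustration only, not Triage's actual code: the grouping fields, the choice of SHA-1, and the record layout are all assumptions, and a plain dict stands in for the Mongo collection so the sketch runs on its own.

```python
import hashlib
import json

# Hypothetical choice of fields that make two errors "the same".
# Picking this set is the hard part; the slides do not say what Triage uses.
GROUPING_FIELDS = ("type", "message", "file", "line")


def error_hash(error):
    """Hash only the grouping fields, serialised in a stable key order."""
    key = {field: error.get(field) for field in GROUPING_FIELDS}
    canonical = json.dumps(key, sort_keys=True, separators=(",", ":"))
    return hashlib.sha1(canonical.encode("utf-8")).hexdigest()


def aggregate(store, error):
    """Create or update the aggregated record for one incoming error.

    `store` is a dict keyed by hash, standing in for the collection.
    """
    h = error_hash(error)
    doc = store.get(h)               # step 2: look up the hash
    if doc is None:                  # step 3a: first occurrence, create
        doc = {"hash": h, "count": 1, "first": error, "latest": error}
        store[h] = doc
    else:                            # step 3b: seen before, update counters
        doc["count"] += 1
        doc["latest"] = error
    return doc


if __name__ == "__main__":
    store = {}
    base = {"type": "ValueError", "message": "bad input",
            "file": "app.py", "line": 10}
    aggregate(store, dict(base, timestamp=1))
    aggregate(store, dict(base, timestamp=2))   # same group, count goes up
    aggregate(store, dict(base, line=11))       # different line, new group
    for doc in store.values():
        print(doc["hash"][:8], doc["count"])
```

Note that the lookup and the write are separate steps here, so two workers handling the same new error at once could both take the "create" branch.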



Scalability Fatality 2

• Multithreaded experiments
• Mongo optimisations
  • There is no schema
  • The cake is a lie
• Mongo ‘upsert’ rocks!

Updating like a boss
                       collection.update(criteria, document, upsert=False)
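The signature above shows the default, `upsert=False`; passing `upsert=True` asks Mongo to insert the document when nothing matches the criteria. A minimal sketch of what the two arguments might look like for error aggregation is below. The field names (`hash`, `count`, `latest`, `first_seen`, `last_seen`) and the helper `build_upsert` are hypothetical, and `$setOnInsert` needs MongoDB 2.4 or later, which is newer than this talk.

```python
def build_upsert(error_hash, error, now):
    """Return (criteria, update) for a single create-or-update call.

    All field names here are illustrative assumptions, not Triage's schema.
    """
    criteria = {"hash": error_hash}
    update = {
        # Bump the occurrence counter (starts from 0 on insert).
        "$inc": {"count": 1},
        # Always overwrite the most recent occurrence.
        "$set": {"latest": error, "last_seen": now},
        # Only written when the upsert creates a new document.
        "$setOnInsert": {"first_seen": now},
    }
    return criteria, update


if __name__ == "__main__":
    criteria, update = build_upsert("abc123", {"message": "boom"}, 1345334400)
    print(criteria)
    print(update)
    # With a live collection this would be sent as, for example:
    #   collection.update(criteria, update, upsert=True)      # API on the slide
    #   collection.update_one(criteria, update, upsert=True)  # PyMongo 3 and later
```

Because the match, the create and the counter increment happen in one server-side operation, this replaces the separate query-then-write steps of the aggregation method.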




Outcomes & future



Outcomes

• Getting the ‘right’ level of grouping is hard
• What to do with errors that just won't go away?
• Error occurrence count: what does this tell us?



Future

• Easier installation, package on PyPI
• Better language support (plz halp)
• Drop-in replacement for Airbrake etc.
• Client-side logging (JavaScript)
• Email-style filters & actions - ifttt.com

Thanks
• 99designs for research and development time
• Contributors:
  • Luke Cawood - Project lead
  • Josh Benham - Developer
  • Jamison Lu - Developer
• Additional assistance:
  • Lars Yencken - Operations
  • 99designs UX team




Thanks for listening!
https://github.com/lwc/triage




Triage: real-world error logging for web applications
