Ungooglable

This talk covers a basic methodology for finding and fixing problems in a live system. It covers general techniques for finding the source of issues quickly, workarounds, patching, digging into code, when and how to get help.

Transcript

  • 1. Ungooglable: managing “disasters” without losing your cool @eleddy
  • 2. This talk is for the Develadminisystemators who have to constantly deal with UNKNOWNS
  • 3. Three Commands ‣ Know thy system ‣ Know thy tools ‣ Know thy neighbors
  • 4. Stairway to Freedom: Prepare, Isolate, Damage Control, Diagnose, Patch, Clean, Fix, Document. Horizon of Intervention.
  • 5. Communicate ‣ To: bossman, coworkers ‣ Subject: Mayday! ‣ Priority: High ‣ Message: Dear Magic Makers - As some of you may already know, customers are having trouble retrieving their historical records because our archive server is not responding. I am investigating the issue now and will send an update in 20 minutes. Please field calls in the meantime. If someone could get me a Red Bull and some nacho cheese corn nuts, that would be stellar. Thanks!
  • 6. Prepare for the Worst ‣ Backups ‣ Local Data.fs ‣ Set a time limit
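
The preparation steps above can be sketched in a few lines of Python. This is a minimal illustration, not the speaker's tooling: `snapshot` and `TimeBox` are hypothetical names, and the plain file copy assumes nothing is writing to the file while it is copied. For a live Data.fs, a storage-aware backup tool such as repozo is the safer choice.

```python
import shutil
import time
from pathlib import Path


def snapshot(path, backup_dir):
    """Copy `path` into `backup_dir` under a timestamped name and return the copy."""
    src = Path(path)
    dest_dir = Path(backup_dir)
    dest_dir.mkdir(parents=True, exist_ok=True)
    stamp = time.strftime("%Y%m%d-%H%M%S")
    dest = dest_dir / f"{src.name}.{stamp}.bak"
    shutil.copy2(src, dest)  # copy2 also preserves timestamps and permissions
    return dest


class TimeBox:
    """Tracks a self-imposed time limit for the intervention (hypothetical helper)."""

    def __init__(self, minutes, clock=time.monotonic):
        self._clock = clock
        self._deadline = clock() + minutes * 60

    def expired(self):
        """True once the time limit has passed: time to escalate or ask for help."""
        return self._clock() >= self._deadline
```

Calling `snapshot("var/filestorage/Data.fs", "backups")` before touching anything, and checking `TimeBox(20).expired()` between steps, keeps both the safety net and the deadline explicit.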
  • 7. Disable Interference ‣ Disabled all backups and packing ‣ Opened up port 8080 to outside network ‣ Moved logs to temporary disk
  • 8. Isolation by Elimination ‣ Network: works for me ‣ Hardware: obvious, sporadic crazy shit ‣ Software: everything else ‣ Data: not recreatable locally
  • 13. Zopesplosion 3000 Architecture ‣ Apache, Varnish, HAProxy, CDN, APIs ‣ Zope (x7) ‣ ZEO 1-4, ZEO 5-8, ZEO 9-12 ‣ MySQL, MongoDB, SPARQL ‣ WTF mate
  • 26. How Zeo Cache Works (Machine A, Machine B) ‣ Mem. Cache: Zope “I Want X” → cache “I Need X” → Zeo returns X → X is modified to X′ → “Modified X” reaches the cache ‣ Disk Cache: Zope “I Want X” → X comes from the cache → X is modified to X′ → RESTART → Inconsistent State!
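
The cache slide can be illustrated with a toy model. This is a sketch under stated assumptions, not how ZEO is implemented: `Server`, `Client`, and `read_after_missed_update` are hypothetical names, and the model assumes that an invalidation sent while a client is down is simply missed and that a cache kept on disk is not verified on restart.

```python
class Server:
    """Stands in for the Zeo server: the authoritative copy of every object."""

    def __init__(self):
        self.data = {}
        self.clients = []

    def store(self, key, value):
        self.data[key] = value
        # Invalidate cached copies, but only on clients that are connected.
        for client in self.clients:
            if client.connected:
                client.cache.pop(key, None)


class Client:
    """Stands in for a Zope client with a local cache in front of the server."""

    def __init__(self, server, persistent_cache):
        self.server = server
        self.persistent_cache = persistent_cache
        self.cache = {}
        self.connected = True
        server.clients.append(self)

    def load(self, key):
        if key not in self.cache:  # cache miss: "I Need X"
            self.cache[key] = self.server.data[key]
        return self.cache[key]

    def stop(self):
        self.connected = False

    def start(self):
        if not self.persistent_cache:
            self.cache = {}  # a memory cache does not survive a restart
        self.connected = True


def read_after_missed_update(persistent_cache):
    """Return what the client sees for X after X changed while it was down."""
    server = Server()
    server.store("X", "X")
    client = Client(server, persistent_cache)
    client.load("X")  # warm the cache
    client.stop()
    server.store("X", "X'")  # modified while the client is down
    client.start()
    return client.load("X")
```

With `persistent_cache=False` the restarted client refetches and sees the modified value; with `persistent_cache=True` it keeps serving the stale copy, which is the inconsistent state the slide ends on.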
  • 27. Zopesplosion 3000 Architecture (diagram repeated from slide 13): Hot damn!
  • 28. Take time to make time ‣ Minimize customer angst ‣ Hang out in custom ‣ Acquisition is your friend ‣ Remember request and response
  • 30. Unique or Just Not Obvious? ‣ Zope, zeo, system logs ‣ System stats/monitoring
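
One way to check whether a problem is unique or just not obvious is to run the same quick summary over each log and compare the results. The sketch below is illustrative only: `summarize` is a hypothetical helper, and it assumes each log line carries its severity as a separate word (`ERROR`, `WARNING`, ...), which may not match your Zope, zeo, or system log format.

```python
from collections import Counter

LEVELS = ("CRITICAL", "ERROR", "WARNING")


def summarize(lines, levels=LEVELS):
    """Count log lines per severity and keep the first line seen for each level."""
    counts = Counter()
    first_seen = {}
    for line in lines:
        words = line.split()
        for level in levels:
            if level in words:
                counts[level] += 1
                first_seen.setdefault(level, line.rstrip("\n"))
                break  # count each line once, under its first matching level
    return counts, first_seen
```

Because any iterable of lines works, an open file object can be passed directly, for example `summarize(open("var/log/instance.log"))` (the path is an assumption).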
  • 31. Test Case + Estimate Fix Time ‣ Sarcoidosis! Probably not...
  • 32. Horizon of Intervention ‣ Can I handle this problem? (Yes / No) ‣ Can I do it in a timely manner? (Yes / No) ‣ If not: IRC, Plone-users, Friends, Colleagues
  • 33. Front End Errors ‣ Take the performance hit ‣ Disable the malfunctioning piece
  • 34. temporary patch ‣ full patch
  • 35. Have I mentioned the importance of working with BACKUPS yet? Especially when unfucking data...
  • 36. Clean up ‣ Disabled all backups and packing ‣ Opened up port 8080 to outside network ‣ Moved logs to temporary disk ‣ Disabled zopes 5-10
  • 38. Clean up ‣ Delete extra/bad files ‣ Scripts in version control ‣ Communicate
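
The clean-up slides only work if every temporary change was written down when it was made. A minimal sketch of that habit follows; `InterventionLog` is a hypothetical helper, not something from the talk, and the undo actions are whatever callables fit your setup.

```python
class InterventionLog:
    """Records temporary changes together with the action that undoes each one."""

    def __init__(self):
        self._entries = []

    def record(self, description, undo):
        """Note a change that was just made and how to reverse it."""
        self._entries.append((description, undo))

    def pending(self):
        """Descriptions of the changes that still need to be reverted."""
        return [description for description, _ in self._entries]

    def clean_up(self):
        """Undo every recorded change, most recent first; return what was undone."""
        undone = []
        while self._entries:
            description, undo = self._entries.pop()
            undo()
            undone.append(description)
        return undone
```

Undoing in reverse order is a deliberate choice: later changes often depend on earlier ones, so they are rolled back first.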
  • 39. I’ve got a fever, and the only solution... is MORE PATCH!
  • 40. ‣ Update/Close Tickets ‣ Integrate Test Cases ‣ Document Processes
  • 41. Handling Data Errors ‣ Network: works for me ‣ Hardware: obvious, sporadic crazy shit ‣ Software: everything else ‣ Data: not recreatable locally
  • 47. How Data is Stored ‣ root (app) ‣ Plone ‣ News (news.2010.09.08, news.2010.06.13) ‣ Members ‣ Events ‣ acl_users (users, roles) ‣ acl_users (users, roles) ‣ temp_folder
  • 48. The Basics ‣ ./bin/instance debug ‣ app ‣ dir, __dict__
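
Inside the debug prompt, `dir` and `__dict__` answer two different questions: what can this object do, and what does it actually store. The sketch below uses a made-up `NewsItem` class purely for illustration; in a real session you would pass objects reached from `app`.

```python
def describe(obj):
    """Return the public attribute names and the instance's own stored state."""
    public = [name for name in dir(obj) if not name.startswith("_")]
    state = dict(getattr(obj, "__dict__", {}))
    return public, state


class NewsItem:
    """Hypothetical content object, only here to have something to inspect."""

    def __init__(self, title):
        self.title = title
        self.tags = []

    def publish(self):
        return f"published {self.title}"
```

`describe(NewsItem("Hello"))` lists `publish` alongside `tags` and `title` in the first result, while the second contains only the stored data, which is usually where a data error hides.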
  • 49. Direct Connect
    Local file storage:
    >>> from ZODB.FileStorage import FileStorage
    >>> from ZODB.DB import DB
    >>> storage = FileStorage('var/filestorage/Data.fs')
    >>> db = DB(storage)
    >>> connection = db.open()
    >>> root = connection.root()
    Over ZEO:
    >>> from ZEO import ClientStorage
    >>> from ZODB import DB
    >>> address = ('10.0.1.5', 8001)
    >>> storage = ClientStorage.ClientStorage(address)
    >>> db = DB(storage)
    >>> connection = db.open()
    >>> root = connection.root()
    >>> root['app'] = PloneSite()
    >>> root['status'] = 'Running'
  • 50. Deleting an object
    >>> import transaction
    >>> del app.Plone.news['news-item-id']
    >>> transaction.commit()
  • 51. _p_changed
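
The slide only names `_p_changed`, so here is a toy model of why it matters when repairing data by hand. This is not the real `persistent.Persistent` class: `TrackedObject` and `commit` are hypothetical stand-ins that mimic one behavior, namely that a change is noticed on attribute assignment but not when a mutable attribute is modified in place.

```python
class TrackedObject:
    """Toy stand-in for a persistent object; NOT the real persistent.Persistent."""

    def __init__(self):
        object.__setattr__(self, "_p_changed", False)

    def __setattr__(self, name, value):
        object.__setattr__(self, name, value)
        if not name.startswith("_p_"):
            # Ordinary attribute assignment marks the object as dirty.
            object.__setattr__(self, "_p_changed", True)


def commit(obj):
    """Pretend to save: report whether the object would have been written."""
    dirty = obj._p_changed
    obj._p_changed = False
    return dirty


def demo():
    folder = TrackedObject()
    folder.items = []                # assignment: the change is noticed
    first = commit(folder)
    folder.items.append("news-1")    # in-place mutation: __setattr__ never runs
    second = commit(folder)
    folder.items.append("news-2")
    folder._p_changed = True         # tell the object explicitly that it changed
    third = commit(folder)
    return first, second, third
```

In this model the second commit silently skips the object, which is the failure mode to watch for after editing lists or dicts on a stored object; setting `_p_changed = True` before committing is the explicit fix.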
  • 52. When in doubt... ‣ PDB is your friend ‣ The source is your friend ‣ Throw a party for your friends
  • 53. ‣ Know your System ‣ Understand the Tools ‣ Be Nice to your Neighbors