Firebird Anti-Corruption Approach

Published in: Technology

  1. 1. Anti-corruption<br />How to prevent Firebird database corruption<br />Alexey Kovyazin,<br />IBSurgeon,<br />ak@ib-aid.com<br />
  2. 2. IBSurgeon – 8 years!<br />Products<br />IBFirstAID/FBFirstAID, etc.<br />FBScanner<br />FBDataGuard<br />Clients<br />Carl Zeiss Meditec, USA<br />Vneshtorgbank, Russia<br />Wells Fargo Bank, USA<br />Watermark Software, UK<br />Bas-X, Australia<br />Victoria University, New Zealand<br />Kingsway Management, UK<br />Team<br />Dmitry Kuzmenko<br />Alexey Kovyazin<br />Sergey Nikitin<br />Oleg Mateveev & team<br />Consultants<br />Dmitry Yemanov, Vlad Khorsun, Alex Peshkoff<br />Partners<br />IBPhoenix<br />
  3. 3. Alexey Kovyazin<br />Hosting & cloud partners<br />Yes, it’s me! <br />http://ru.linkedin.com/in/kovyazin<br />2006<br />2008<br />2009<br />2010<br />In 2007 we sold 3mln of Delphi to all Russian schools<br />
  4. 4. Agenda<br />Why bother?<br />Why does corruption happen?<br />Reasons<br />Symptoms<br />What to monitor to recognize a problem?<br />Problems with server <br />Problems with environment<br />Problems with database<br />Maintenance improvements to prevent corruption.<br />Backups<br />Why did we create FBDataGuard?<br />
  5. 5. Why bother?<br />Firebird databases become bigger and bigger every year<br />Information inside Firebird can cost $XXXXXX<br />Outage (corruption, backup/restore breaks) can cost $XXXXX too<br />Real-world examples<br />Bas-X, Australia – Firebird 2.x, 250Gb, no BLOBs, 250 users<br />Watermark Software, UK – Firebird 2.x, up to 400Gb, with BLOBs <br />Profitmed, Russia, medical distribution, Firebird 1.5, 65Gb, 250 users<br />1 Terabyte Firebird 2.1 database<br />http://www.ib-aid.com/articles/item104<br />3.8 billion records in the biggest table<br />
  6. 6. Corruption reasons & symptoms<br />Reasons<br />Misadministration<br />Hardware failures<br />Bugs<br />The actual reason often remains undiscovered<br />Symptoms<br />Repeatable<br />Trackable<br />Complementary<br />We can prevent corruption if we see its symptoms.<br />
  7. 7. Sample Firebird environment<br />Server<br />Database<br />Firebird<br />Copy of backups<br />Backups<br />
  8. 8. What to monitor at Firebird instance level<br />Is Server online?<br />General parameters<br />How much RAM?<br />Mb<br />#<br />Temp files?<br />Records to analyze<br />6 levels<br />Logs<br />Size of logs<br />Is it recommended? <br />Bugs, issues<br />Server version<br />
  9. 9. Firebird instance key parameters<br />Server availability<br />Consumed RAM <br />Temp files size <br />Temp files quantity<br />Records in logs<br />Logs’ size<br />Server version related issues<br />Need to watch for 7 key parameters which can indicate possible or actual problems<br />
  10. 10. What to monitor-1<br />General database checks<br />Database availability -> Outages, firewalls, stability<br />Log records related to the database in firebird.log -> early symptoms<br />Check metadata – validate all metadata -> early signs<br />Transactions<br />Transaction markers monitoring (garbage problems)<br />Limit (2 billion transactions between backup/restore)<br />Users<br />Min/max/avg users –> peak problems, design of application<br />
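The transaction markers on this slide are printed in the header page section of `gstat -h` output. The following is a minimal Python sketch of how such monitoring might look, assuming the header text has already been captured (for example from a scheduled `gstat -h` run). The function names, the `max_gap` threshold, the 80% warning share and the sample numbers are illustrative assumptions, not values from the talk; only the 2 billion limit comes from the slide.

```python
import re

# Marker labels as printed in the header page section of `gstat -h` output.
MARKERS = ("Oldest transaction", "Oldest active", "Oldest snapshot", "Next transaction")

def parse_markers(header_text):
    """Return {label: value} for every transaction marker found in the text."""
    found = {}
    for label in MARKERS:
        pattern = r"^\s*" + re.escape(label) + r"\s+(\d+)\s*$"
        match = re.search(pattern, header_text, re.MULTILINE)
        if match:
            found[label] = int(match.group(1))
    return found

def transaction_warnings(markers, max_gap=100_000, limit=2_000_000_000, limit_share=0.8):
    """Flag a large Oldest/Next gap (garbage growth) and a nearby transaction limit.

    max_gap and limit_share are hypothetical thresholds to be tuned per database.
    """
    warnings = []
    gap = markers["Next transaction"] - markers["Oldest transaction"]
    if gap > max_gap:
        warnings.append(f"gap between oldest and next transaction is {gap}")
    if markers["Next transaction"] > limit * limit_share:
        warnings.append("transaction counter is close to the limit; plan backup/restore")
    return warnings

# Illustrative header fragment, not taken from a real database.
SAMPLE = """\
Database header page information:
        Oldest transaction      1200
        Oldest active           1201
        Oldest snapshot         1201
        Next transaction        350000
"""

if __name__ == "__main__":
    for warning in transaction_warnings(parse_markers(SAMPLE)):
        print("WARNING:", warning)
```

Keeping the parsing separate from the threshold logic makes it possible to store the raw marker values over time and look at their trend, not only at a single snapshot.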
  11. 11. What to monitor-2<br />Database files<br />Single volume and multi-volume -> Volumes in bin<br />Paths – where files are stored (not on the same drive as temp files and backups!)<br />Sizes and growth limits -> Warnings about growth<br />Delta-files (nbackup)<br />Life-time and sizes -> Huge/aged delta problems<br />Backup files<br />Existence, sizes and growth limits -> Backup could kill database<br />
  12. 12. What to monitor-3<br />Number of formats per table<br />No more than 255; exceeding the limit -> corruption<br />Keep fewer formats in production; many formats -> performance problem<br />Non-activated and deactivated indices<br />Deactivated – explicitly deactivated (why deactivated?)<br />Non-activated – indicates problems during restore<br />
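A format-count check of this kind could be sketched in Python as follows. The query text assumes that the `RDB$FORMAT` column of the `RDB$RELATIONS` system table holds the current format number of each table; verify this against the system tables of the server version in use. The function name and the `warn_at` threshold are hypothetical, and the rows can come from any DB-API cursor connected to the database.

```python
# Assumed query: current format number per user table.
FORMATS_SQL = """
SELECT RDB$RELATION_NAME, COALESCE(RDB$FORMAT, 0)
FROM RDB$RELATIONS
WHERE COALESCE(RDB$SYSTEM_FLAG, 0) = 0
"""

FORMAT_LIMIT = 255  # limit quoted on the slide

def tables_near_format_limit(rows, warn_at=200):
    """Return (table, format_number) pairs at or above warn_at, highest first.

    rows is an iterable of (table_name, format_number) tuples, for example
    the result of cursor.execute(FORMATS_SQL) followed by cursor.fetchall().
    warn_at is a hypothetical early-warning threshold below FORMAT_LIMIT.
    """
    risky = [(name.strip(), fmt) for name, fmt in rows if fmt >= warn_at]
    return sorted(risky, key=lambda item: item[1], reverse=True)

if __name__ == "__main__":
    # Illustrative rows; system tables pad names with trailing spaces.
    sample_rows = [("ORDERS   ", 250), ("CUSTOMERS", 3), ("AUDIT_LOG", 201)]
    for table, fmt in tables_near_format_limit(sample_rows):
        print(f"{table}: format {fmt} of {FORMAT_LIMIT}")
```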
  13. 13. What to monitor-4<br />Periodical statistics (gstat) -> deep look into database<br />Firebird server version<br />Examples - problems with nbackup<br />Latest patches are recommended<br />Firebird fbclient.dll version<br />If fbclient.dll version <> fbserver version -> Problems (disconnects, 10054, errors)<br />Firebird installation size<br />Default database place is %Firebird%\Bin<br />Firebird logs size and paths<br />Big logs quickly exhaust space -> corruption<br />
  14. 14. Maintenance-1<br />Backups<br />Revolver (days, week, month copies) backups<br />Backup depth<br />Checking restore (need to check results)<br />Growth prognosis (if not enough space, backup should be canceled)<br />Control backup time (too long backup indicates problems)<br />Today<br />5..7<br />Yesterday<br />Weekly<br />
  15. 15. Big database requires individual maintenance plan <br />Maintenance plan depends on size of database and work mode (8x5, 24x7)<br />Backup scheme is not simple<br />Perform test restores separately<br />To be checked<br />Errors – check firebird.log and run error-checking queries on the live database<br />Metadata – check integrity <br />Metadata limits<br />Data & BLOBs – walk through data, check segmentation<br />Indices – check indices health<br />Transactions – any gaps, garbage growth, other problems<br />
  16. 16. Everyday minimal (!) maintenance plan for big database<br />
  17. 17. Example of backup plan for big Firebird database<br />Maintenance server<br />Main server<br />Firebird database<br />Nbackup copy<br />Checking restore<br />Gbak-b<br />And each step should be confirmed and reported.<br />
  18. 18. Maintenance-2<br />Indices<br />Recalculate indices statistics -> Performance<br />Selected or excluded<br />Check index status – active/inactive/non-activated -> Problems, corruptions<br />Check physical index health<br />Early signs of corruption<br />
  19. 19. Maintenance-3<br />Validate database with gfix<br />Don’t forget to shut down the database<br />Analysis (including firebird.log)<br />Metadata validation<br />Check important system tables<br />Firebird.log maintenance<br />When the log becomes very big, copy it to backup log files<br />And some more things…<br />
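The firebird.log maintenance step could be automated roughly as in the Python sketch below. The slide only says to copy a big log to backup log files; truncating the original afterwards, the size threshold and the archive naming scheme are assumptions made for this example. Lines written by the server between the copy and the truncation would be lost, so a quiet period or a stopped server is the safer moment to run it.

```python
import os
import shutil
import time

def rotate_log(log_path, max_bytes, archive_dir):
    """Archive log_path into archive_dir when it is larger than max_bytes.

    Returns the archive path, or None when no rotation was needed.
    Truncating the original after the copy is an assumption of this sketch.
    """
    if not os.path.exists(log_path) or os.path.getsize(log_path) <= max_bytes:
        return None
    os.makedirs(archive_dir, exist_ok=True)
    stamp = time.strftime("%Y%m%d-%H%M%S")
    archive_path = os.path.join(archive_dir, f"{os.path.basename(log_path)}.{stamp}")
    shutil.copy2(log_path, archive_path)  # keep timestamps of the original
    with open(log_path, "w"):
        pass  # truncate the live log in place
    return archive_path
```

A call such as `rotate_log("firebird.log", 100 * 1024 * 1024, "log-archive")` would archive the log once it exceeds 100 Mb; both the path and the threshold here are placeholders.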
  20. 20. And this is not enough!<br />Business wants a guarantee: even if hardware fails, data should be recoverable!<br />
  21. 21. A big job<br />Implement scripts<br />Check them in the test environment<br />Explore error messages and codes of Firebird<br />We spent 6 years getting the necessary information…<br />
  22. 22. That’s why we created FBDataGuard<br />
  23. 23. FBDataGuard does all of the above…<br />Watches database files, volumes, deltas, performs and checks backups in the right way<br />Verifies metadata, data and indices<br />Watches for errors, limits and wrong versions<br />Sends alerts and recommendations<br />
  24. 24. Example with TEMP<br />FBDataGuard found the temp files size = N<br />Free space at TEMP locations = M<br />M – N < X<br />Not enough space – administrator will get an alert and a recommendation to increase TEMP<br />
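The TEMP rule on this slide (alert when M – N < X) can be written down as a small Python sketch. The slide does not define X; it is treated here as a required reserve in bytes, and the function names are hypothetical. This illustrates the rule as stated on the slide and is not FBDataGuard's actual implementation.

```python
import shutil

def temp_space_alert(temp_files_bytes, free_bytes, reserve_bytes):
    """Return True when M - N < X.

    N = temp_files_bytes, M = free_bytes, X = reserve_bytes
    (X is an assumed, administrator-chosen reserve).
    """
    return free_bytes - temp_files_bytes < reserve_bytes

def check_temp_location(temp_dir, temp_files_bytes, reserve_bytes):
    """Apply the rule to a real directory using the free space reported by the OS."""
    free_bytes = shutil.disk_usage(temp_dir).free
    return temp_space_alert(temp_files_bytes, free_bytes, reserve_bytes)
```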
  25. 25. Example alert<br />Too big temporary files <br />Total size of all temporary files 3 Gb is more than recommended: 500 Mb <br />Firebird creates temporary files for some SQL queries (PLAN SORT). Too big size of temporary files can indicate performance problems. This is not a strictly defined number, so this threshold depends on particular database and application.<br />
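A check that could produce an alert of this kind might be sketched in Python as below. The `fb_*` file name pattern for Firebird temporary files is an assumption and should be adjusted to what actually appears in the TEMP locations of a given installation. The 500 Mb default mirrors the recommended value quoted in the alert, and the message wording is only loosely modeled on it.

```python
import glob
import os

RECOMMENDED_LIMIT = 500 * 1024 * 1024  # 500 Mb, the threshold quoted in the alert

def total_temp_size(temp_dir, pattern="fb_*"):
    """Sum the sizes of regular files in temp_dir whose names match pattern.

    The default pattern is an assumption about temp file naming.
    """
    total = 0
    for path in glob.glob(os.path.join(temp_dir, pattern)):
        if os.path.isfile(path):
            total += os.path.getsize(path)
    return total

def temp_size_alert(total_bytes, limit=RECOMMENDED_LIMIT):
    """Return an alert message when the total exceeds the limit, else None."""
    if total_bytes <= limit:
        return None
    return (f"Total size of all temporary files {total_bytes} bytes "
            f"is more than recommended: {limit} bytes")
```

As the alert text itself notes, the threshold is not a strictly defined number, so `limit` is best kept configurable per database.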
  26. 26. Index problem example<br />Non-activated indices usually indicate corruption (missed Foreign Keys)<br />FBDataGuard found non-activated index after restore<br />Administrator will get alert and recommendation to check indices<br />Possible performance problem prevented!<br />
  27. 27. Example of backup problem resolution<br />FBDataGuard found the backup size = M<br />FBDataGuard found free space at backups’ disk = N<br />M >= N<br />Not enough space<br />Backup cancelled, database status is set to Critical, administrator got alert<br />Corruption of backup was prevented!<br />
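The decision on this slide (cancel the backup when M >= N) is sketched below in Python. Estimating the expected size from the previous backup multiplied by a growth factor is an assumption that loosely follows the "growth prognosis" idea from the maintenance slides; the function names and the default factor of 1.1 are hypothetical.

```python
import shutil

def expected_backup_size(previous_backup_bytes, growth_factor=1.1):
    """Estimate the next backup size from the previous one (assumed heuristic)."""
    return int(previous_backup_bytes * growth_factor)

def backup_decision(expected_backup_bytes, free_bytes):
    """Return 'cancel' when M >= N (backup would not fit), otherwise 'run'."""
    return "cancel" if expected_backup_bytes >= free_bytes else "run"

def check_backup_disk(backup_dir, previous_backup_bytes, growth_factor=1.1):
    """Apply the rule to a real backup directory before starting the backup."""
    free_bytes = shutil.disk_usage(backup_dir).free
    expected = expected_backup_size(previous_backup_bytes, growth_factor)
    return backup_decision(expected, free_bytes)
```

Running the check before the backup starts means a backup that cannot fit is never begun, which avoids leaving a truncated backup file on a full disk.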
  28. 28. Example of backup problem alert<br />Job backup@[ server-0000 / db-0000 ] malfunction<br />Unexpected job backup@[ server-0000 / db-0000 ] error: There is not enough space on the disk <br />
  29. 29. Example of good backup notification<br />
  30. 30. Hardware and UNDELETE failures<br />HDD corruption<br />Flash-drive corruption<br />UNDELETE problem<br />
  31. 31. And even more – protects from hardware failures<br />Metadata repository<br />FBDataGuard Extractor extracts data from corrupted database and inserts to the new<br />New DB<br />Tables data<br />BLOBs<br />
  32. 32. Firebird DataGuard<br />Watches 26 important database and server parameters<br />Alerts for potential and real problems by email<br />Proper automation of database maintenance<br />Windows, Linux, MacOS, Firebird 1.5-2.1 (not 2.5 yet)<br />Special licensing for ISV (Independent Software Vendors) Firebird developers<br />
  33. 33. Get FBDataGuard 1 year<br />Free 1 year license for all attendees<br />Send request to dataguard@ib-aid.com<br />
  34. 34. Thank you<br />Questions? <br />dataguard@ib-aid.com<br />
