Debugging 2013 - Poul-Henning Kamp

Debugging for real programmers
Usage Rights: CC Attribution-NonCommercial-NoDerivs License

    Presentation Transcript

    • Preventive Debugging Poul-Henning Kamp phk@FreeBSD.org phk@Varnish.org @bsdphk
    • 1981 Debugging n. Removing a BUG, either by tinkering with the program or by amending the program specification so that the side effect of the bug is published as a desirable feature. See also: KLUDGE; ONE-LINE PATCH; STEPWISE REFINEMENT. - Stan Kelly-Bootle, The Devil's D.P. Dictionary
    • Varnish: A debugging nightmare
      * 40,000 threads
      * 1 TB of shared data structures
      * 1,000,000 requests per second
      * 30 Gbit/s traffic
    • 1949 As soon as we started programming, we found to our surprise that it wasn't as easy to get programs right as we had thought. Debugging had to be discovered. I can remember the exact instant when I realized that a large part of my life from then on was going to be spent in finding mistakes in my own programs. - Maurice Wilkes
    • 1974 Debugging is twice as hard as writing the code in the first place. Therefore, if you write the code as cleverly as possible, you are, by definition, not smart enough to debug it. - Brian W. Kernighan and P. J. Plauger The Elements of Programming Style
    • The crucial debugging insight:
      Q: Why are you debugging to begin with?
      A: Because you are not a perfect programmer.
    • Today's task: Optimize this:
      void
      debug(input)
      {
          do {
              idea = think(input);
              if (idea != NO_IDEA)
                  patch(idea);
              input += test();
          } while (unhappy(input));
      }
    • "Option" doesn't mean it's optional
      If you don't use -Wall -Werror by default, your code sucks by default.
      Also consider:
      -Wstrict-prototypes -Wpointer-arith -Wcast-qual -Wswitch -Wcast-align
      -Wchar-subscripts -Wnested-externs -Wformat -Wno-missing-field-initializers
      -Wmissing-prototypes -Wreturn-type -Wwrite-strings -Wshadow
      -Wunused-parameter -Winline -Wredundant-decls -Wextra -Wno-sign-compare
    • Get a second, third & fourth opinion
      * Use multiple compilers (LLVM vs. GCC)
      * lint(1)
      * FlexeLint ($1k)
      * Coverity
      * LLVM's static analysis tools
      * Different CPU/endianness/word-size
    • Random FlexeLint example:
      if (!(dfu_root->flags |= DFU_IFF_DFU))

      main.c 483 Info 820: Boolean test of a parenthesized assignment
      main.c 483 Info 774: Boolean within 'if' always evaluates to False [Reference: file main.c: line 483]
      main.c 483 Info 831: Reference cited in prior message
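Why the checker calls that test "always False": `|=` with a nonzero constant always produces a nonzero value, so negating it always gives 0 and the branch is dead code. The sketch below illustrates this under stated assumptions: `struct dfu_if` and the value of `DFU_IFF_DFU` are hypothetical stand-ins (the slide shows neither), and the `&` variant is only a guess at what the original author intended.

```c
#include <assert.h>

#define DFU_IFF_DFU 0x0001      /* hypothetical value; any nonzero bit behaves the same */

struct dfu_if {                 /* hypothetical stand-in for the real structure */
    unsigned flags;
};

/*
 * The expression as flagged: "|=" sets the bit and yields the new,
 * nonzero value, so "!" always produces 0. As a side effect the flag
 * is modified by what looks like a test.
 */
int
flag_missing_buggy(struct dfu_if *d)
{
    return (!(d->flags |= DFU_IFF_DFU));
}

/* Assumed intent: test the bit without modifying the flags. */
int
flag_missing_fixed(const struct dfu_if *d)
{
    return (!(d->flags & DFU_IFF_DFU));
}
```

Calling `flag_missing_buggy()` on a structure with no flags set returns 0 and leaves the bit set, which is the kind of silent mistake a second opinion from a static checker catches before any test run does.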
    • 1979 The most effective debugging tool is still careful thought, coupled with judiciously placed print statements. - Brian W. Kernighan Unix for Beginners
    • 2013 It's even easier to let the program just tell you where the bugs are. - Poul-Henning Kamp
    • Proactively eliminate doubt
      #include <assert.h>

      if (*q != '\0' && r == e) {
          if (b != vep->tag) {
              l = e - b;
              assert(l < sizeof vep->tag);
              memmove(vep->tag, b, l);
              vep->tag_i = l;
          }
          return (NULL);
      }
    • Performance price list (cheapest operations first)
      * char *p += 5;
      * strlen(p);
      * memcpy(p, q, l);
      * Locking
      * System Call
      * Context Switch
      * Disk Access
      * Filesystem
      Cost scale: from 10^-9 s (CPU), through Memory and Protection, to 10^-1 s (Mechanical)
    • What does assert() actually do?
      #define assert(e) ((e) ? (void)0 : \
          __assert(__func__, __FILE__, __LINE__, #e))

      void
      __assert(const char *func, const char *file, int line,
          const char *failedexpr)
      {
          (void)fprintf(stderr, "Assertion failed: "
              "(%s), function %s, file %s, line %d.\n",
              failedexpr, func, file, line);
          abort();
          /* NOTREACHED */
      }
    • Make your own asserts
      #define AZ(foo)     do {assert((foo) == 0);} while (0)
      #define AN(foo)     do {assert((foo) != 0);} while (0)
      #define XXXAZ(foo)  do {xxxassert((foo) == 0);} while (0)
      #define XXXAN(foo)  do {xxxassert((foo) != 0);} while (0)
      #define WRONG(expl) [...]
      #define INCOMPLETE(expl) [...]
      #define Lck_AssertHeld() [...]
      void WS_Assert(...);
      void MPL_Assert_Sane(...);
      void VTCP_Assert(...);
      ...
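One detail matters when function calls are wrapped in assert-style macros: the standard `<assert.h>` macro expands to nothing when `NDEBUG` is defined, which would silently remove the wrapped call as well. A minimal sketch of an always-on variant follows; the names `my_assert`, `my_assert_fail` and `pipe_roundtrip` are hypothetical and not taken from the Varnish source.

```c
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

/* Failure handler: report where the impossible happened, then die. */
void
my_assert_fail(const char *func, const char *file, int line,
    const char *expr)
{
    fprintf(stderr, "Assertion failed: (%s), function %s, file %s, line %d.\n",
        expr, func, file, line);
    abort();
}

/* Always-on assert: not affected by NDEBUG, so the expression always runs. */
#define my_assert(e) \
    ((e) ? (void)0 : my_assert_fail(__func__, __FILE__, __LINE__, #e))

#define AZ(foo) do { my_assert((foo) == 0); } while (0)   /* assert zero */
#define AN(foo) do { my_assert((foo) != 0); } while (0)   /* assert non-zero */

/* Push one byte through a pipe; every call that "cannot fail" is asserted. */
int
pipe_roundtrip(void)
{
    int fds[2];
    char c = 'x', d = 0;

    AZ(pipe(fds));
    AN(write(fds[1], &c, 1) == 1);
    AN(read(fds[0], &d, 1) == 1);
    AZ(close(fds[0]));
    AZ(close(fds[1]));
    return (d == c);
}
```

The short macro names keep the error check on the same line as the call, so checking a return value costs less typing than ignoring it.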
    • If it can't happen, assert that
      AZ(pipe(vwe->pipes));
      AZ(shutdown(sock, SHUT_WR));
      AZ(close(fd));
      AZ(fstatvfs(fd, &fsst));
      AZ(pthread_mutex_lock(&vsm_mtx));
      ...
      XXXAZ(unlink(vp->fname));
      XXXAZ(setgid(mgt_param.gid));
      ...
      sto = calloc(sizeof *sto, 1);
      XXXAN(sto);
      ...
    • Assert the locking situation
      static void
      ban_reload(const uint8_t *ban, unsigned len)
      {
          struct ban *b, *b2;
          int duplicate = 0;
          double t0, t1, t2 = 9e99;

          ASSERT_CLI();
          Lck_AssertHeld(&ban_mtx);
    • Be constructively paranoid
      static enum req_fsm_nxt
      cnt_deliver(struct worker *wrk, struct req *req)
      {
          char time_str[30];

          CHECK_OBJ_NOTNULL(wrk, WORKER_MAGIC);
          CHECK_OBJ_NOTNULL(req, REQ_MAGIC);
          CHECK_OBJ_NOTNULL(req->obj, OBJECT_MAGIC);
          CHECK_OBJ_NOTNULL(req->obj->objcore, OBJCORE_MAGIC);
          CHECK_OBJ_NOTNULL(req->obj->objcore->objhead, OBJHEAD_MAG
          CHECK_OBJ_NOTNULL(req->vcl, VCL_CONF_MAGIC);
          assert(WRW_IsReleased(wrk));
          assert(req->obj->objcore->refcnt > 0);
    • (Un)trusted types
      struct http {
          unsigned            magic;
      #define HTTP_MAGIC          0x6428b5c9

      struct http_conn {
          unsigned            magic;
      #define HTTP_CONN_MAGIC     0x3e19edd1

      struct wrk_accept {
          unsigned            magic;
      #define WRK_ACCEPT_MAGIC    0x8c4b4d59

      struct objcore {
          unsigned            magic;
      #define OBJCORE_MAGIC       0x4d301302
    • Miniobj.h
      #define ALLOC_OBJ(to, type_magic)               \
          do {                                        \
              (to) = calloc(sizeof *(to), 1);         \
              if ((to) != NULL)                       \
                  (to)->magic = (type_magic);         \
          } while (0)

      #define FREE_OBJ(to)                            \
          do {                                        \
              (to)->magic = (0);                      \
              free(to);                               \
          } while (0)

      #define CHECK_OBJ_NOTNULL(ptr, type_magic)      \
          do {                                        \
              assert((ptr) != NULL);                  \
              assert((ptr)->magic == type_magic);     \
          } while (0)
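To show how magic-number checks are used in practice, here is a small self-contained sketch. The three macros follow the Miniobj.h slide (with the conventional `calloc` argument order); `struct session`, `SESSION_MAGIC` and the `session_*` functions are hypothetical examples and do not come from Varnish.

```c
#include <assert.h>
#include <stdlib.h>

#define ALLOC_OBJ(to, type_magic)               \
    do {                                        \
        (to) = calloc(1, sizeof *(to));         \
        if ((to) != NULL)                       \
            (to)->magic = (type_magic);         \
    } while (0)

#define FREE_OBJ(to)                            \
    do {                                        \
        (to)->magic = (0);                      \
        free(to);                               \
    } while (0)

#define CHECK_OBJ_NOTNULL(ptr, type_magic)      \
    do {                                        \
        assert((ptr) != NULL);                  \
        assert((ptr)->magic == type_magic);     \
    } while (0)

struct session {                    /* hypothetical example type */
    unsigned    magic;
#define SESSION_MAGIC   0x5e551013  /* arbitrary value, unique per type */
    int         hits;
};

struct session *
session_new(void)
{
    struct session *sp;

    ALLOC_OBJ(sp, SESSION_MAGIC);
    assert(sp != NULL);
    return (sp);
}

/* Every entry point first checks that it was handed the right kind of object. */
int
session_hit(struct session *sp)
{
    CHECK_OBJ_NOTNULL(sp, SESSION_MAGIC);
    return (++sp->hits);
}

void
session_free(struct session *sp)
{
    CHECK_OBJ_NOTNULL(sp, SESSION_MAGIC);
    FREE_OBJ(sp);   /* zeroes the magic before free() to help expose stale pointers */
}
```

Passing a pointer to the wrong struct type, or one that was already released through `FREE_OBJ`, is likely to trip the magic check at the first function that touches it, rather than corrupting data somewhere far from the cause.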
    • A more helpful assert() handler
      Panic from VCL:
      PANIC: Had Panic header: fetch
      thread = (cache-worker)
      ident = FreeBSD,10.0-ALPHA4,amd64,-smalloc,-smalloc,-hcritbi
      Backtrace:
        0x44afeb: PAN_Init+3fb
        0x806e05ef0: _end+80673e5b8
        0x806c0201d: _end+80653a6e5
        0x45f458: VCL_recv_method+7e8
        0x4608f5: VCL_backend_response_method+1b5
        0x42d3c2: VBF_Fetch+1e32
        0x42c01b: VBF_Fetch+a8b
        0x44ddd0: Pool_Work_Thread+500
        0x474798: WRK_thread+1d8
        0x4745ef: WRK_thread+2f
    • ... even more helpful assert() handler
      busyobj = 0x803d3a020 {
        ws = 0x803d3a098 {
          id = "bo",
          {s,f,r,e} = {0x803d3bf98,+128,0x0,+57480},
        },
        do_stream
        bodystatus = 4 (length),
        },
        http[bereq] = {
          ws = 0x803d3a098[bo]
            "GET",
            "/foo",
            "HTTP/1.1",
            "X-Forwarded-For:",
            "Accept-Encoding: gzip",
            "X-Varnish: 1004",
            "Host:",
        },
        http[beresp] = {
          ws = 0x803d3a098[bo]
            "HTTP/1.1",
            "200",
            "Ok",
            "Panic: fetch",
            "Content-Length: 7",
        },
        ws = 0x803d3a218 {
          id = "(null)",
          {s,f,r,e} = {0x0,0x0,0x0,0x0},
        },
      }
    • How?
      Use pthread_setspecific(3) to tie state to the thread.
      In your __assert() function: use pthread_getspecific(3) to get that state.
      See also the backtrace(3) API (OS/compiler dependent).
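The thread-specific-state idea can be sketched roughly as follows. This is a minimal illustration, not Varnish's panic code: `struct thr_state`, `state_set`, `panic_describe`, `panic_fail` and `panic_assert` are hypothetical names, the state is reduced to two strings, and the backtrace(3) part is left out because it is platform dependent.

```c
#include <assert.h>
#include <pthread.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

struct thr_state {              /* hypothetical per-thread context */
    const char *name;           /* e.g. which kind of thread this is */
    const char *doing;          /* e.g. what it is working on right now */
};

static pthread_key_t state_key;
static pthread_once_t state_once = PTHREAD_ONCE_INIT;

static void
state_key_init(void)
{
    if (pthread_key_create(&state_key, NULL) != 0)
        abort();
}

/* Called by each thread to register the state a panic should dump. */
void
state_set(struct thr_state *st)
{
    pthread_once(&state_once, state_key_init);
    if (pthread_setspecific(state_key, st) != 0)
        abort();
}

/* Format the calling thread's registered state; works if none was set. */
int
panic_describe(char *buf, size_t len)
{
    const struct thr_state *st;

    pthread_once(&state_once, state_key_init);
    st = pthread_getspecific(state_key);
    if (st == NULL)
        return (snprintf(buf, len, "thread = (unknown)"));
    return (snprintf(buf, len, "thread = (%s), doing = %s",
        st->name, st->doing));
}

/* Assert handler: the usual location report plus the thread's own state. */
void
panic_fail(const char *func, const char *file, int line, const char *expr)
{
    char buf[256];

    panic_describe(buf, sizeof buf);
    fprintf(stderr, "Assertion failed: (%s), function %s, file %s, line %d.\n%s\n",
        expr, func, file, line, buf);
    abort();
}

#define panic_assert(e) \
    ((e) ? (void)0 : panic_fail(__func__, __FILE__, __LINE__, #e))
```

A worker would call `state_set()` once with a pointer to state it keeps updating; when any `panic_assert()` in that thread fails, the handler can report what the thread was doing without the call site passing anything in.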
    • Test that your code works
      Automate running of your test-cases.
      I mean, you do have test-cases? Right? Right?!
    • Varnish testing
      varnishtest(1) tool (5000 LOC)
      Interprets the "VTC" test-language:

      varnishtest "Does anything get through at all ?"

      server s1 {
          rxreq
          txresp -body "012345\n"
      } -start

      varnish v1 -vcl+backend {} -start

      client c1 {
          txreq -url "/"
          rxresp
          expect resp.status == 200
      } -run

      varnish v1 -expect n_object == 1
      varnish v1 -expect sess_conn == 1
    • Varnish testing
      * 336 VTC test-cases in 11 categories
      * Important category: regression tests for bugs
      * "make check" runs the test-cases (~10 minutes)
      * Jenkins tinderbox builds/tests ~10 platforms
      * gcov(1) used to monitor test-coverage (89%!)
    • Preventive Debugging Summary
      * Code defensively
      * Use tools to improve your code & coding
      * Assert() that you know what's going on
      * Fail ASAP
      * Dump useful info while you can
      * Know that your code works & can be executed
    • Does that work? Yes!
      * Varnish delivers a LOT of web-pages
      * We get approx. 1 crash report every 2-3 weeks
      * We almost never need gdb(1)
    • 1946 No loss should hit us that could have been avoided through timely care ("rettidig omhu"). - Shipowner A.P. Møller