FORECAST: Fast Generation of Accurate Context-Aware Signatures of Control-Hijacking Attacks

Alexey Smirnov, Huijia Lin, and Tzi-cker Chiueh
Experimental Computer Systems Lab, Department of Computer Science, SUNY Stony Brook

Contact Information

ECSL Lab at SUNY Stony Brook: http://www.ecsl.cs.sunysb.edu
E-mail: {alexey,lhj,chiueh}@cs.sunysb.edu

Introduction

An ideal signature should have low false-positive and false-negative rates. It should include all attack packets and represent each packet as a regular expression. Previous systems either used the last packet as the signature or required a large amount of malicious network data to build a signature.

We present FORECAST, a signature generation system that can generate a multi-packet signature, representing each packet as a regular expression, from a single attack instance. The whole process requires no human intervention and completes in a split second.

FORECAST has the following features:
  1) Attack detection (return address modification);
  2) Attack packet identification;
  3) Signature generation that represents each packet as a regular expression and adds a length constraint to the last packet.

Evaluation

Test suite:
  named - a DNS daemon from BIND 8.1
  ghttpd - an HTTP server
  drcatd - a remote cat program
  squid - a web caching daemon
  passlogd - a syslog sniffer

Instrumentation Overheads: [figure]
Signatures Evaluation: [figures: named signature, ghttpd signature]

FORECAST System Architecture

The FORECAST compiler is an extension to the GNU C Compiler. The FORECAST run-time library has functions for attack detection, packet identification, and signature generation. The main idea is to log each memory update that a program performs at run time in a memory updates log. At compile time, FORECAST instruments the program so that it can generate log records and detect an attack. At run time, the instrumented program generates memory updates log records and uses this information when an attack is detected.
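The logging idea from the system architecture can be pictured with a small sketch. It is not the FORECAST implementation: the record layout (source address, destination address, length, copy of the written bytes) and the names update_record, log_update, and LOGGED_ASSIGN are assumptions made for illustration, and a macro stands in for the instrumentation that the compiler extension would insert.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define LOG_CAPACITY 1024   /* hypothetical fixed log size */
#define DATA_MAX     16     /* bytes of the written value kept per record */

/* Hypothetical record layout: where the value came from, where it was
 * written, how many bytes, and a (possibly truncated) copy of them. */
struct update_record {
    const void   *src;             /* source address, NULL for a constant */
    void         *dst;             /* destination address */
    size_t        len;             /* number of bytes written */
    unsigned char data[DATA_MAX];  /* copy of the written bytes */
};

static struct update_record update_log[LOG_CAPACITY];
static size_t update_log_len = 0;

/* Append one record describing a completed write to dst.
 * Returns 0 on success, -1 if the log is full. */
static int log_update(const void *src, void *dst, size_t len)
{
    struct update_record *r;
    size_t n = len < DATA_MAX ? len : DATA_MAX;

    assert(dst != NULL);
    if (update_log_len == LOG_CAPACITY)
        return -1;
    r = &update_log[update_log_len++];
    r->src = src;
    r->dst = dst;
    r->len = len;
    memcpy(r->data, dst, n);
    return 0;
}

/* What an instrumented "x = y" might expand to (y must be an lvalue):
 * perform the store, then log it. */
#define LOGGED_ASSIGN(x, y) \
    do { (x) = (y); (void)log_update(&(y), &(x), sizeof(x)); } while (0)
```

After LOGGED_ASSIGN(a, b) the log holds one record whose src is &b and whose dst is &a, which is the information needed later to relate a corrupted location to the data that reached it.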
Attack Detection

Most control-hijacking attacks modify a control-sensitive data structure in the victim program, for example a return address. The attacker gains control when the function returns. At the function prolog, the return address is stored in the return address repository. At the function epilog, the return address on the stack is compared with the value stored in the return address repository. If they differ, an attack is detected.

Control and Data Dependencies

A data dependency is created between variables X and Y when X = Y is executed. A control dependency is created between X and Y when Y determines X's value through a conditional expression: if (Y > 1) X = 1; else X = 2. For each conditional and loop expression, FORECAST inserts a special log record. The memory updates log makes it possible to identify both control and data dependencies. Using data dependencies alone is insufficient, for example for an FTP server that requires authentication.

Memory Updates Logging

Memory updates log record: for an assignment statement X = Y, a log record is created; for X = Y + Z, an entry is created for each operand.

FTP Server Example

The example consists of the vulnerable FTP server code, the FTP server GET multi-packet attack, the memory updates log, and the signature.

A vulnerable FTP server code (the saved return address is in buf[17]):

    char buf[16];
    is_auth = is_user = 0;                    // user not authenticated initially
    while (1) {
        recv_packet(p);
        if (!strncmp(p, "QUIT", 4)) break;
        if (!strncmp(p, "USER", 4)) { is_user = 1; continue; }
        if (!strncmp(p, "PASS", 4) && is_user) { is_auth = 1; continue; }
        if (!is_auth) continue;               // authentication required
        if (!strncmp(p, "GET", 3)) {
            strcpy(buf, p + 4);               // overflow occurs
            send_file(buf);
        }
    }
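The return-address check described under Attack Detection can be sketched as a private stack of saved return addresses. This is a simulation under stated assumptions, not FORECAST's generated code: the stack frame is modeled as a byte array so the overflow stays well defined, the return address value is made up, and the names prolog_save, epilog_check, and handle_get are hypothetical. The return address slot is placed at offset 17 to follow the example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define REPO_DEPTH 64

/* Return address repository: a private stack of saved return addresses. */
static uint32_t repo[REPO_DEPTH];
static size_t repo_top = 0;

/* Function prolog: remember the return address. */
static void prolog_save(uint32_t ret_addr)
{
    assert(repo_top < REPO_DEPTH);
    repo[repo_top++] = ret_addr;
}

/* Function epilog: returns 1 if the return address found on the stack
 * differs from the saved copy (attack detected), 0 otherwise. */
static int epilog_check(uint32_t ret_addr_on_stack)
{
    assert(repo_top > 0);
    return repo[--repo_top] != ret_addr_on_stack;
}

#define FRAME_SIZE 64   /* size of the simulated frame */
#define BUF_SIZE   16   /* buf occupies frame[0 .. BUF_SIZE) */
#define RET_OFFSET 17   /* assumed position of the saved return address */

/* Simulated GET handler: copies arg into the frame without checking
 * BUF_SIZE, then runs the epilog check. Returns 1 on detection. */
static int handle_get(const char *arg)
{
    unsigned char frame[FRAME_SIZE] = {0};
    uint32_t ret = 0x08048abcu;          /* made-up return address */
    uint32_t on_stack;
    size_t n = strlen(arg) + 1;

    memcpy(frame + RET_OFFSET, &ret, sizeof ret);  /* the call saves it */
    prolog_save(ret);

    if (n > FRAME_SIZE)
        n = FRAME_SIZE;                  /* keep the simulation in bounds */
    memcpy(frame, arg, n);               /* models strcpy(buf, p + 4) */

    memcpy(&on_stack, frame + RET_OFFSET, sizeof on_stack);
    return epilog_check(on_stack);
}
```

A short file name leaves the saved slot untouched and handle_get returns 0; the long file name from the attack overwrites the slot and handle_get returns 1.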
The attack has three packets:

    USER alexey
    PASS my_pass
    GET long_file_name_that_will_overwrite_return_addr

Standard library functions are proxied, for example recv(socket, &buf, sizeof(buf), 0).

Memory updates log:

    <DIRA_RECV, &p, 11, "USER alexey">
    <DIRA_STRNCMP, &p, 4, NULL>
    <DIRA_COND, &p, 0, NULL>
    <NULL, &is_user, 4, is_user>
    <DIRA_RECV, &p, 12, "PASS my_pass">
    <DIRA_STRNCMP, &p, 4, NULL>
    <DIRA_COND, &p, 0, NULL>
    <DIRA_COND, &is_user, 0, NULL>
    <NULL, &is_auth, 4, is_auth>
    <DIRA_RECV, &p, 62, "GET …">
    <DIRA_COND, &is_auth, 0, NULL>
    <DIRA_STRNCMP, &p, 3, NULL>
    <DIRA_COND, &p, 0, NULL>
    <&p+4, &buf, strlen(p)-4+1, *(p+4)>

Signature:

    3            # number of packets
    11           # 1st packet length
    USER alexey
    12           # 2nd packet length
    PASS my_pass
    59           # 3rd packet length
    GET ? ... ?
    4 17         # length constraint size and start offset
    \0           # length constraint regexp
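The log above shows why data dependencies alone are not enough: only the GET packet flows into buf, while the USER and PASS packets matter only through the conditions on is_user and is_auth. The sketch below walks such a log backwards from the corrupted address. It is a simplified illustration, not FORECAST's algorithm: the record struct, the use of plain integers as addresses, and the rule that every condition logged between a receive and a relevant write counts as a control dependency are all assumptions, and DIRA_STRNCMP records are left out for brevity.

```c
#include <assert.h>
#include <stddef.h>

enum rec_kind { REC_RECV, REC_COND, REC_WRITE };

struct rec {
    enum rec_kind kind;
    unsigned long src;   /* address read (REC_WRITE), 0 for a constant */
    unsigned long dst;   /* address written or tested */
    unsigned long len;   /* bytes written; 0 for REC_COND */
    int packet;          /* 1-based packet number (REC_RECV only) */
};

#define MAX_LIVE 32

/* Addresses whose origin is still being searched for. */
struct live_set { unsigned long addr[MAX_LIVE]; size_t n; };

static void live_add(struct live_set *s, unsigned long a)
{
    for (size_t i = 0; i < s->n; i++)
        if (s->addr[i] == a)
            return;
    assert(s->n < MAX_LIVE);
    s->addr[s->n++] = a;
}

/* Remove every live address inside [dst, dst + len); return the count. */
static size_t live_kill(struct live_set *s, unsigned long dst,
                        unsigned long len)
{
    size_t removed = 0, keep = 0;
    for (size_t i = 0; i < s->n; i++) {
        if (s->addr[i] >= dst && s->addr[i] - dst < len)
            removed++;
        else
            s->addr[keep++] = s->addr[i];
    }
    s->n = keep;
    return removed;
}

/* Walk the log backwards from the corrupted address `target`.
 * Returns a bit mask with bit k-1 set for each packet k the target
 * depends on. Control dependencies are followed only if use_control
 * is non-zero. */
static unsigned trace_packets(const struct rec *recs, size_t n,
                              unsigned long target, int use_control)
{
    struct live_set live = { {0}, 0 };
    unsigned packets = 0;
    int after_relevant_write = 0;

    live_add(&live, target);
    for (size_t i = n; i-- > 0; ) {
        const struct rec *r = &recs[i];
        switch (r->kind) {
        case REC_WRITE:          /* data dependency: follow the source */
            if (live_kill(&live, r->dst, r->len) > 0) {
                if (r->src != 0)
                    live_add(&live, r->src);
                after_relevant_write = 1;
            }
            break;
        case REC_COND:           /* control dependency (assumed rule) */
            if (use_control && after_relevant_write)
                live_add(&live, r->dst);
            break;
        case REC_RECV:           /* a packet defined a live address */
            if (live_kill(&live, r->dst, r->len) > 0)
                packets |= 1u << (r->packet - 1);
            after_relevant_write = 0;
            break;
        }
    }
    return packets;
}
```

Applied to the log of the FTP example with the target set to offset 17 of buf, the trace reports all three packets when control dependencies are followed and only the third packet when they are ignored.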
