Testing Heuristics
  Andrew Lee, CISSP
  Chief Research Officer, ESET LLC
  [email_address]
What do you need?
  • Appropriateness of the methodology (or its correct application)
  • Repeatability
  • Independently verifiable
  • Validated sample sets
  • Adherence to safe and ethical practices in handling and testing samples
  • Understanding of what heuristic detection is (and what it's not)
A quick word on FP testing
  • No 'tricks'!
  • Appropriate "ItW" false positive set
  • Evaluation of FPs
  • 'Grey', unusual, or very strange and unlikely files will tend to penalize heuristic-based products
  • Defaults
  • Best settings
Junk / Corrupt files
  • Poor sample sets simply reinforce the cycle: the more junk added, the more detected
  • Using AV products to determine maliciousness is silly; it simply reinforces the cycle (Kaminski - Eicar 2006?)
"Time to Update"
  Product   Actual Time to Update / % missed (20 samples)   Average TtU
  X1        1 hour at 100% (20 upd)                         1 hour
  X2        4 hours at 5% (1 upd)                           4 hours
  X3        8 hours at 50% (10 upd)                         4 hours
  X4        30 hours at 20% (5 upd)                         6 hours
Actual TtU
  Product   Actual Time to Update / % missed   Average TtU (zero removed)
  X1        1 hour at 100%                     1 hour
  X2        4 hours at 5%                      4 hours
  X3        8 hours at 50%                     8 hours
  X4        30 hours at 20%                    30 hours
Mean time
  [Figure: each dot represents a different product]
Lies, Damned Lies and Statistics
  • The statistics are biased: the means for more successful products are (necessarily) calculated over fewer samples. This is not good for comparisons.
  • Concentrating on speed of update surely sends the wrong message to consumers, giving them the false impression that buying a product that releases a lot of updates very quickly is going to protect them better.
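The sample-count problem can be illustrated with a small sketch. The per-sample delays below are hypothetical and are not the figures from the slides; the product labels and function names are likewise made up for illustration. A delay of 0 is assumed to mean the sample was detected proactively, so no update was needed.

```python
from statistics import mean

# Hypothetical hours-to-update for 20 samples per product.
# 0.0 means the sample was detected proactively (no update needed).
delays = {
    "A": [1.0] * 20,          # missed every sample, but updated quickly
    "B": [0.0] * 19 + [4.0],  # missed a single sample
}

def summarize(hours):
    """Compare a mean over missed samples only with a mean over all samples."""
    missed = [h for h in hours if h > 0]
    return {
        "missed": len(missed),
        "proactive_rate": 1 - len(missed) / len(hours),
        # Mean over missed samples only ("zero removed").
        "mean_ttu_missed_only": mean(missed) if missed else 0.0,
        # Mean over every sample, counting proactive detections as 0 hours.
        "mean_ttu_all_samples": mean(hours),
    }

for product, hours in delays.items():
    print(product, summarize(hours))
```

In this made-up data, product B looks four times slower than product A on the "zero removed" mean, yet that mean rests on one sample while B detected 19 of 20 samples with no update at all.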
Retrospective (Frozen Update)
  • Selection of time period: 6 months? 3 months? 1 day? 1 hour?
  • Verification (is it possible to do real time?)
Frozen Update Pt II
  • What samples are important?
  • Is this a recursive process?
  • Single snapshot is not necessarily the most useful information
  • Performance over time
  • Sound statistical model
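One possible way to report frozen-update results over time rather than as a single snapshot is to attach an interval to each detection rate. The sketch below uses the Wilson score interval, which is an assumption on my part and not a method named in the slides; the snapshot counts are hypothetical.

```python
import math

def wilson_interval(detected, total, z=1.96):
    """Approximate 95% Wilson score interval for a detection rate."""
    if total == 0:
        raise ValueError("no samples")
    p = detected / total
    denom = 1 + z**2 / total
    centre = (p + z**2 / (2 * total)) / denom
    half = z * math.sqrt(p * (1 - p) / total + z**2 / (4 * total**2)) / denom
    return centre - half, centre + half

# Hypothetical results: (weeks since update freeze, detected, new samples seen)
snapshots = [(1, 45, 60), (4, 120, 200), (12, 210, 500)]

for weeks, detected, total in snapshots:
    low, high = wilson_interval(detected, total)
    rate = detected / total
    print(f"{weeks:>2} weeks: {rate:.1%} (95% CI {low:.1%} to {high:.1%})")
```

The interval widens as the sample count shrinks, which makes it visible when a short freeze period yields too few samples to support a comparison between products.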
To quote Dr Alan Solomon:
  1. If something is superb at detecting viruses, it's no use if it gives a lot of false alarms.
  2. Anything that relies on the user to make a correct decision, on matters that he is not likely to be able to decide about, is useless.
  3. You can receive something that is *exactly* what the salesman promised to deliver, and it's nevertheless useless.
 
Shameless plug
  • AVIEN Guide to Managing Malware in the Enterprise
  • http://www.smallblue-greenworld.co.uk/pages/avienguide.html

Testing Heuristic Detections
