Kaspersky Whitelisting Database Test
A test commissioned by Kaspersky Lab and performed by AV-Test GmbH
Date of the report: February 14th, 2013, last update: April 4th, 2013

Summary

Between November 2012 and January 2013, AV-Test performed a test of the Kaspersky Whitelisting Database in order to determine the coverage and quality of this service. In total, five different test cases were applied:

• Test Case 01 - Database Coverage
• Test Case 02 - Database Quality
• Test Case 03 - Database Speed
• Test Case 04 - Database False Rate
• Test Case 05 - Default Deny Mode

Together, these individual tests verify both the quality and the quantity of the service and the underlying data.

AV-TEST used data from its own clean file project, which comprises over 10 terabytes of data in over 20 million files. To build this set, AV-TEST downloads software from popular download portals as well as directly from vendors, installs it, and captures all created files together with their metadata.

Four different data sets extracted from the full AV-TEST clean file set were used to perform the test:

• Daily Set – files that were tested the same day they were discovered by AV-TEST, both installers and installed files
• Historic Set – files that were discovered by AV-TEST before the start of this test, both installers and installed files
• Installer Set – files that were discovered by AV-TEST before the start of this test, installers only, no installed files
• Windows Set – files from standard Windows installations

The Kaspersky Whitelisting Database is part of the Kaspersky Security Network (KSN) infrastructure, which is fully integrated into Kaspersky's consumer and corporate products in order to give every single customer real-time access to a huge amount of information and expertise collected worldwide.

The results of the test indicate that Kaspersky has very good coverage of files that were known to AV-TEST prior to the test (Historic Set and Installer Set), with over 90% of the files known to Kaspersky at the time of the test. About 50% of the new files (Daily Set), tested on the day of their discovery, were already known to Kaspersky. Finally, about 98% of the standard Windows files were known to the Kaspersky Whitelist at the time of the test; when only the relevant executables and libraries are considered, 99,9% of the files were known.
Methodology

Please refer to the Appendix for a detailed description of the methodology.
Test Results

The following table lists the number of files submitted to the Whitelisting service and the number of files that were known to the Whitelist at the time of the test.

File Set        Files Submitted   Files Known   Percentage of Known Files
Daily Set       253.191           125.427       49,54%
Historic Set    4.686.589         4.270.647     91,12%
Installer Set   231.598           208.600       90,07%
Windows Set     65.470            64.154        97,99%

As mentioned before, files that were published before the start of the test are covered very well by the Kaspersky Whitelist, and even new files published during the test are covered well. Furthermore, not all unknown files are actually a problem. It is not necessary for a Whitelisting service to cover text files or media files, so these could account for some of the unknown files. On the other hand, it is important to cover executable files and program libraries, as these are the files that need to be controlled. Similar observations can be made for different categories, markets or regions.

The full details of all test results are shown in the appendix. Below are the main findings from each test case.

Test Case 01 - Database Coverage

This test aims to verify the coverage of the database for particular areas of software as well as for different application types. The following categories are checked:

• The % coverage ratio of files associated with the consumer market
• The % coverage ratio of files associated with the corporate market
• The % coverage ratio of files associated with the System, Office, Tools and Games application types
• The % coverage ratio of files associated with different file format types

The charts below show the overall coverage as well as the coverage of the executable files subset.

Chart: Overall Coverage – Found 89,2%, Not Found 10,8%
Chart: Executable Files Coverage – Found 93,3%, Not Found 6,7%
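The percentages in the table above are plain coverage ratios: files known to the whitelist divided by files submitted. A minimal sketch of that computation, using the counts from the results table (the `coverage` helper name is illustrative, not part of the tested service):

```python
# Coverage ratio per file set: files known to the whitelist / files submitted.
# Counts are taken from the results table above.
SETS = {
    "Daily Set":     (253_191, 125_427),
    "Historic Set":  (4_686_589, 4_270_647),
    "Installer Set": (231_598, 208_600),
    "Windows Set":   (65_470, 64_154),
}

def coverage(submitted: int, known: int) -> float:
    """Return the share of submitted files known to the whitelist, in percent."""
    return 100.0 * known / submitted

for name, (submitted, known) in SETS.items():
    print(f"{name}: {coverage(submitted, known):.2f}%")
```

Running this reproduces the percentage column of the table, e.g. 49,54% for the Daily Set.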
The next chart shows the coverage of both the corporate and the consumer subsets of software, tested specifically to measure the ability of the Whitelisting service to identify software specific to each group. The corporate subset contains software such as Acronis® Backup and Security 2010, AutoCAD, Microsoft SQL Server Analysis Services, Crystal Reports 11, Microsoft® Office and so on. The consumer subset, in turn, was compiled from software such as Adobe Acrobat Reader, Apple iTunes, BitTorrent, DivX Player, Windows Movie Maker and others.

Because there is no clear separation between corporate and consumer usage for the most common software, coverage was also measured along an additional dimension. The tested subset of applications was divided into four common groups named System, Office, Tools and Games. The chart below represents the coverage for each of these four application types.

The System subset mostly consists of Microsoft Windows system files and 3rd-party drivers produced by 3Com, Canon, NVIDIA and others.

Chart: Coverage by Market Type – Consumer: Found 96,0%, Not Found 4,0%; Corporate: Found 97,5%, Not Found 2,5%
Chart: Coverage by Application Type – Games: Found 98,9%, Not Found 1,1%; Tools: Found 96,4%, Not Found 3,6%; Office: Found 94,6%, Not Found 5,4%; System: Found 95,0%, Not Found 5,0%
The Office subset consists of the whole Microsoft Office suite, Crystal Office, BillQuick®, NetMeeting and others.

The Tools subset contains entertainment software, video and audio codecs, development tools, and backup and recovery software; in short, software that is neither Office software nor Games. For example, the Tools subset was compiled from Acronis True Image, ApexSQL, CyberLink PowerDVD, Microsoft Visual Studio, Nero Burning ROM and many others.

Just as the corporate and consumer groups share different application types, the applications themselves consist of files that vary by format type. It is obvious that the injection or substitution of any of those files by malware may lead to significant changes in application behavior. For this reason, it is valuable from a Whitelisting perspective to ensure the integrity of all of an application's files, not only of its end-user launchers. The chart below represents the results for common file types.

Chart: Coverage by Format Type – End-user launcher: Found 93,1%, Not Found 6,9%; Environment/Bundle: Found 93,4%, Not Found 6,6%; Installers/Package: Found 78,2%, Not Found 21,8%; Resources: Found 90,5%, Not Found 9,5%; Media: Found 86,8%, Not Found 13,2%; Other: Found 82,1%, Not Found 17,9%

Test Case 02 - Database Quality

While the quantitative coverage results indicate "how many" files were known to the Whitelisting service, the qualitative results of "what was known" are important as well. The chart below represents how much valuable information was provided for the known files.¹

The Trusted Level results represent expert knowledge of whether a particular file is clean or malware. The Category results, in turn, represent information about the application category assigned by Kaspersky to a particular clean file. The set of categories defined by Kaspersky differs slightly from what was presented above as application types, in that Kaspersky has defined 16 high-level and 95 leaf categories.

The Certificates & Signatures results represent knowledge of the digital signatures and certificates that exist for a file. The Sources & Packages part provides information about a file's containers and the links from which the file or its container was downloaded. The last part, File Statistics, represents information on a file's popularity around the globe: the count of participants in the Kaspersky Security Network program who decided to trust or not to trust a particular file.

An important note about the Trusted Level is that its value may be in one of two states: "Known", which means that the final decision has already been made, and "In Process", which means that the final decision is still open. The chart below represents the distribution of the available Trusted Level values (present for 99% of the known files) across these states.

Chart: Information Provided for Known Files – Trusted Level: Present 99,0%, Not Present 1,0%; Category: Present 77,7%, Not Present 22,3%; Certificates & Signatures: Present 91,1%, Not Present 8,9%; Sources & Packages: Present 80,7%, Not Present 19,3%; File Statistics: Present 62,3%, Not Present 37,7%
Chart: Availability of Trusted Level – Known 96,3%, In Process 3,7%

¹ To measure the % coverage ratio of the presented information, the full subset of files known to the Whitelisting service at the time of the test was used as the expected count. However, the subset of files used to measure the coverage ratio for the Certificates & Signatures part was reduced to the subset of signed files whose signatures were verified locally. Similarly, for the Category part the subset of files was reduced to those that have both a valid digital signature and a FileVersionInfo structure, because this is a current limitation of Kaspersky's categorization process. The goal of these measurements is to compare the ratio of what was known against what should be known.
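The per-field presence percentages above can be computed by checking, for each known file's metadata record, which fields carry a value. The sketch below illustrates this; the field names follow the chart, but the record layout is an assumption for illustration, not Kaspersky's actual response format:

```python
# Share of known files for which each metadata field is present.
# Field names mirror the chart above; the dict-based record layout is
# illustrative only, not the real KSN response structure.
FIELDS = ["trusted_level", "category", "certificates", "sources", "statistics"]

def presence_ratios(records: list[dict]) -> dict[str, float]:
    """Percentage of records in which each field is present (non-empty)."""
    total = len(records)
    return {
        field: 100.0 * sum(1 for r in records if r.get(field)) / total
        for field in FIELDS
    }

records = [
    {"trusted_level": "Known", "category": "Games", "certificates": "signed"},
    {"trusted_level": "In Process"},
]
print(presence_ratios(records))
```

Applied to the full set of known files, this yields exactly the "Present" percentages shown in the chart.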
Test Case 03 – Database Speed

This test aimed to examine the responsiveness of the whitelisting database, based on the responses to queries made against the database. Specifically, this testing considered:

• The time taken for the Whitelist database to respond to verification requests against individual checksums.
• The time required for the addition of a previously unknown clean file into the whitelist after it was submitted to Kaspersky for verification.
• The time required for Kaspersky to respond to the reporting of, and take appropriate actions upon, a reported false positive result.

The table below shows the average response time per hash for different sizes of hash chunks submitted to the database.

Chunk Size                         Time Taken per Hash
Small (below 250 hashes)           1,121s
Medium (over 250 and below 500)    0,565s
Big (over 2.000 hashes)            0,558s
Very big (over 25.000 hashes)      0,145s

The results show that the database responds very fast, especially when big batches of hashes are sent. This is more than sufficient for use in a normal working environment.

Test Case 04 - Database False Rate

This test aimed to examine the trustworthiness or "correctness" of the database. The testing was specifically focused on the following areas:

• Number of false negatives, i.e. those files classified as genuine by the solution but in fact malicious, based on a set of samples drawn from AV-TEST's collections of recent malware.
• Number of false positives, i.e. those files classified by the solution as malicious, but in fact genuine.
• Availability of the database in terms of being able to respond to a file lookup request, irrespective of the result of the lookup.

          False Positives   False Negatives
Result    0,00%             0,00%

No false positives or false negatives were observed during the test.

Test Case 05 – Default Deny Mode

One purpose of a Whitelisting service is to support the Default Deny mode of modern Application Control, in which only known trusted applications are allowed to execute and everything else is blocked by default. It is therefore a vital requirement for a Whitelisting service to be able to identify critical system files. The last chart represents the quantitative coverage of modern Windows 8 files for both the 32-bit and 64-bit platforms.

When looking at the results of Test Case 01, it is obvious that the coverage of Windows files is very good; the details are shown in the following table.
Even though the coverage results are less than 100%, the remainder consists of non-executable binaries, images and other similar files, which do not affect the Default Deny mode.

Chart: Windows 8 Coverage – x64: Found 98,3%, Not Found 1,7%; x86: Found 97,5%, Not Found 2,5%
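The Default Deny logic described above can be sketched as a simple allow-list check: a file may execute only if its hash is known to the whitelist and its Trusted Level marks it as clean. This is a minimal illustration under stated assumptions, not Kaspersky's implementation; the whitelist lookup is modeled here as a plain dictionary:

```python
import hashlib

def sha256_of(data: bytes) -> str:
    """Hash of the file contents, used as the whitelist lookup key."""
    return hashlib.sha256(data).hexdigest()

def may_execute(file_bytes: bytes, whitelist: dict[str, str]) -> bool:
    """Default Deny: allow only files whose hash is whitelisted as clean."""
    level = whitelist.get(sha256_of(file_bytes))
    return level == "Known"  # anything unknown or still "In Process" is denied

# Illustrative whitelist mapping file hashes to their Trusted Level.
payload = b"example program bytes"
whitelist = {sha256_of(payload): "Known"}

print(may_execute(payload, whitelist))     # known trusted file: allowed
print(may_execute(b"unknown", whitelist))  # not in the whitelist: denied
```

This makes the coverage requirement concrete: any legitimate system file missing from the whitelist would be blocked in this mode, which is why near-complete coverage of Windows executables matters.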
Appendix

The following part describes the AV-TEST Whitelist testing methodology for a test of the Kaspersky Whitelisting services. The first part of this appendix describes the test collection used to carry out the test. The second part describes the actual test cases and the methodology used.

Test Collection

AV-TEST downloads hundreds of applications for the Windows operating system every day from many popular download sites (e.g. cnet.com, zdnet.com) and from the vendors directly (e.g. adobe.com, java.com, ibm.com). All in all, over 300 different sources are covered. We then actually install the downloaded programs. All downloaded and installed files are stored, along with detailed information about these files (product name, product version, filename and path, size, download URL, ...). So we always know which file belongs to which product, where we downloaded it from and how old it is. On average we add around 45,000 new, unique files per day, with a total size of 18-20 GB. Currently the clean file collection (historic set) holds over 10 TB of data in more than 20 million unique files. All types of files are included in this collection; it is not limited to PE files. The languages covered are English, German, French and Spanish.

Tests will be carried out both on the historic set and on the daily added new files over a certain period of time.

Test Methodology

Test Case 01 - Database Coverage (TC01)

TEST OBJECTIVE
This test aims to verify the coverage of the database, both for specific regions and for particular areas of software, such as consumer or corporate.
There are further categories that will be checked:

• The % coverage ratio of files associated with specific regions/languages
• The % coverage ratio of files associated with the consumer market
• The % coverage ratio of files associated with the corporate market
• Popularity, platform, category, file type and reputation

TEST DESCRIPTION
AV-TEST will check all new files over a period of two weeks, as well as the historic set, against the Kaspersky database and will later extract numbers for the different regions, markets and other categories. No pre-sorting of the collection is done.

The collection is known to include software in the English, German, French and Spanish languages. Furthermore, it is known that both consumer and corporate software is included.

The returned output for each hash is saved to a combined log file for analysis, and various measurements are also taken, such as the time taken to perform the lookup and the start/finish times for the process. This makes it possible to analyse the results after the test has finished.

Test Case 02 - Database Quality (TC02)

TEST OBJECTIVE
This test aimed at verifying the structure and value of the metainformation that is available in the database about each file from the test collection. The result also aimed to reflect the usefulness of any associated
information for analytical purposes according to a predetermined weighting system agreed in advance with Kaspersky. Each metainformation parameter was assigned a weight, and thus a metric-based approach to evaluating the quality of the Whitelisting database could be applied, based upon the composition and perceived importance of each parameter.

TEST DESCRIPTION
Using the output generated by TC01, the hashes for which data had been returned could be determined. Following this, the data can be extracted on a per-hash basis, and it can be checked how many of a predetermined set of data flags were returned. The specific data sets, or flags, used in this test were then awarded a weighted value based on their level of importance, according to the weighting system agreed in advance with Kaspersky. This allowed a weighted measure (expressed as a percentage score) of data completeness to be awarded to each returned hash, so that a judgment on the value and quality of the data could be made.

Test Case 03 - Database Speed (TC03)

TEST OBJECTIVE
This test aimed to examine the responsiveness of the whitelisting database, based on the responses to queries made against the database. Specifically, this testing considered:

• The time taken for the Whitelist database to respond to verification requests against individual checksums.
• The time required for the addition of a previously unknown clean file into the whitelist after it was submitted to Kaspersky for verification.
• The time required for Kaspersky to respond to the reporting of, and take appropriate actions upon, a reported false positive result.

TEST DESCRIPTION
The testing of database speed is again based on the data from TC01, which provides information about how long a request takes to answer.
Different chunk sizes (between 10 and 1000 hashes per request) will be used. The timings were subsequently analysed in order to determine the average time recorded for processing each of these groupings, as well as to ascertain the average time per single hash when processed in these groups.

Test Case 04 - Database False Rate (TC04)

TEST OBJECTIVE
This test aimed to examine the trustworthiness or "correctness" of the database. The testing was specifically focused on the following areas:

• Number of false negatives, i.e. those files classified as genuine by the solution but in fact malicious, based on a set of samples drawn from AV-TEST's collections of recent malware.
• Number of false positives, i.e. those files classified by the solution as malicious, but in fact genuine.
• Availability of the database in terms of being able to respond to a file lookup request, irrespective of the result of the lookup.

TEST DESCRIPTION
A list of 20,000 hashes associated with files contained within AV-TEST's malware collections was compiled, with the criterion being that these files had been collected just prior to the start of the test. Using the methodology employed for TC01, each of these hashes was then checked against the database, and the subsequent returns and outputs were recorded.

Subsequently, further analysis was conducted against the output from TC01 in order to determine how many of the known-good files were reported by the whitelist database as infected.

Test Case 05 – Default Deny Mode (TC05)

TEST OBJECTIVE