Web Application Forensics: Taxonomy and Trends
 


Web Application Forensics: Taxonomy and Trends
Term paper

Krassen Deltchev
Krassen.Deltchev@rub.de
5 September 2011

Ruhr-University of Bochum
Department of Electrical Engineering and Information Technology
Chair of Network and Data Security, Horst Görtz Institute

First examiner: Prof. Jörg Schwenk
Second examiner and supervisor: M.Sc. Dominik Birk
Contents

List of Figures
List of Tables
Abbreviations
Abstract
1. Introduction
   1.1. What is Web Application Forensics?
   1.2. Limitations of this paper
   1.3. Reference works
2. Intruder profiles and Web Attacking Scenarios
   2.1. Intruder profiling
   2.2. Current Web Attacking scenarios
   2.3. New Trends in Web Attacking deployment and preventions
3. Web Application Forensics
   3.1. Examples of Webapp Forensics techniques
   3.2. WebMail Forensics
   3.3. Supportive Forensics
4. Webapp Forensics tools
   4.1. Requirements for Webapp forensics tools
   4.2. Proprietary tools
   4.3. Open Source tools
5. Future work
6. Conclusion
Appendixes
Appendix A
   Application Flow Analysis
   WAFO victim environment preparedness
Appendix B
   Proprietary WAFO tools
   Open Source WAFO tools
   Results of the tools comparison
List of links
Bibliography
List of Figures

Figure 1: General Digital Forensics Classification, WAFO allocation
Figure 2: Web attacking scenario taxonomic construction
Figure 3: Digital Forensics: General taxonomy
Figure 4: WAFO phases, in Jess Garcia [1]
Figure 5: Extraneous White Space on Request Line, in [3]
Figure 6: Google Dorks example, in [3]
Figure 7: Malicious queries at Google search by spammers, in [3]
Figure 8: Faked Referrer URL by spammers, in [3]
Figure 9: RFI, pulling c99 shell, in [3]
Figure 10: Simple Classic SQLIA, in [3]
Figure 11: NBI evidence in Webapp log, in [3]
Figure 12: HTML representation of spam-mail (e-mail spoofing)
Figure 13: E-mail header snippet of the spam-mail in Figure 12
Figure 14: Spam-assassin sanitized malicious HTML redirection, from the example in Figure 12
Figure 15: Main PyFlag data flow, as [L26]
Figure 16: Improving the testing process of Web Application Scanners, Rafal Los [10]
Figure 17: Flow-based Threat Analysis, example, Rafal Los [10]
Figure 18: Forensics Readiness, in Jess Garcia [13]
Figure 19: MS LogParser general flow, as [L16]
Figure 20: LogParser scripting example, as [L17]
Figure 21: Splunk licenses features
Figure 22: Splunk, Windows Management Instrumentation and MSA (ISA) queries, at WWW
Figure 23: PyFlag: load preset and log file output, at WWW
Figure 24: apache-scalp or Scalp! log file output (XSS query), as [L25]

List of Tables

Table 1: Abbreviations
Table 2: A proposal for a general taxonomic approach, considering the complete WAFO description
Table 3: Example of a possible Webapp attacking scenario
Table 4: Standard vs. intelligent Web intruder
Table 5: Web Application Forensics Overview, in [15]
Table 6: A general taxonomy of the Forensics evidence, in [1]
Table 7: Common Players in Layer 7 Communication, in Jess Garcia [1]
Table 8: Traditional vs. reactive forensics approaches, in [13]
Table 9: Functional vs. security testing, Rafal Los [10]
Table 10: Standards & specifications of EFBs, Rafal Los [10]
Table 11: Basic EFD concepts [10]
Table 12: Definition of Execution Flow Action and Action Types, Rafal Los [10]
Table 13: TRR completion on LogParser, Splunk, PyFlag, Scalp!
Table 14: List of links
Abbreviations

Anti-Virus: AV
Application-Flow Analysis: AFA
Business-to-Business: B2B
Cloud-computing: CC
Cloud(-computing) Forensics: CCFO
Digital Forensics: DFO
Digital Image Forensics: DIFO
Execution-Flow-Based approach: EFB
Incident Response: IR
Microsoft: MS
Network Forensics: NFO
Non-persistent XSS: NP-XSS
NULL-Byte-Injection: NBI
Operating System(s): OS(es)
Operating System(s) Forensics: OSFO
Persistent (stored) XSS: P-XSS
Proof of Concept: PoC
Regular Expression: RegEx
Relational Database Management System: RDBMS
Remote File Inclusion: RFI
SQL Injection Attack(s): SQLIA
Tools requirements rules: TRR
Web Application Firewall(s): WAF(s)
Web Application Forensics: WAFO
Web Application Scanner: WAS
Web Attacking Scenario(s): WASC
Web Services Forensics: WSFO

Table 1: Abbreviations
Abstract

The topic of Web Application Forensics is challenging. There are not enough references discussing this subject, especially in the scientific communities. The term Web Application Forensics is often misunderstood and mixed up with IDS/IPS defensive security approaches. Another issue is to discern Web Application Forensics, short Webapp Forensics, from Network Forensics and Web Services Forensics, and in general to allocate it in the Digital/Computer Forensics classification.
Nowadays, Web platforms are growing vastly, not to mention the so-called Web 2.0 hype. Furthermore, business Web applications outgrow the common security knowledge and demand a rapid inventory of the current security best practices and approaches. The questions concerning the automation of defensive security and investigation methods are becoming undeniably important.
In this paper we address the questions concerning taxonomic approaches to Webapp Forensics, discuss trends related to this topic and debate the matter of automation tools for Webapp Forensics.

Keywords
Web Application Security, WebMail Security, Web Application Forensics, WebMail Forensics, Header Inspection, Plan Cache Inspection, Forensic Tools, Forensics Taxonomy, Forensics Trends
    • 1.Introduction1. IntroductionIn [1], Jess Garcia gives a definition of the term Forensics Readiness:“ Forensics Readiness is the “art” of Maximizing an Environments Ability to collect CredibleDigital Evidence”. This statement we should keep in mind in the further exposition of the paper. Itpoints out several important aspects. Foremost, forensics rely on maximal collection of digitalevidence. If the observed environment1 is not well prepared for forensic investigation, discoveringthe root, for the system is been attacked, could be: sophisticated, not efficient in time and even nondeterministic in finding an appropriate remediation of the problem.Another essential aspect of Forensics, as Jess Garcia, is- the forensic investigation is an art.It is obvious to point out furthermore that, defining best practices, concerning the properdeployment of forensic work, is unbefitting. An intelligent intruder will always find drawbacks insuch best-practice scenarios and try to exploit them as well to accomplish new attacks, completethem successfully and remain concealed.In this way of thoughts, appears the question, how can we suggest taxonomy, regarding forensicwork, if we are aware a priori of the risks such recipes include?We shall propose several general intruders strategies and profiling of the modern Web attacker inthe paper, keeping in mind not to hurt the universal validity of the statements we discuss. In somecases we shall give examples and paradigms through references, though only for the matter of thegood illustration of the statements in the current thesis.Let us describe more precisely the matters, concerning the Webapp Forensics in the next section.1.1. What is Web Application Forensics?Web Application Forensics( WAFO) is a post mortem investigation of a compromised WebApplication( Webapp) system. WAFO consider especially attacks on Layer 7 of the ISO/OSI model.In distinction to this, capturing and filtering of internet protocols on-the-fly is not a concern of theWebapp forensics. More precisely, such issues in general are in the focus of NetworkForensics( NFO). Nevertheless, examining the log files of such automated tools( IDS/ IPS/ trafficfilters/ WAF etc.) is supportive for the right deployment of the Webapp forensic investigation.As stated above, NFO examine in concrete such issues, thats why we should like to discern WebappForensics from it, keeping in mind the supportive function, which Network forensic tools cansupply to WAFO.Consequently, we should like to specifically allocate WAFO in the Digital Forensics( DFO)structure, because some main topics in DFO are not implicitly referred to Layer 7 of the ISO/OSIModel. Such should be designated as follows: Memory Investigations, Operating Systems Forensicsinvestigations, Secure Data Recovery on physical storage of OSes etc. Nevertheless, DFO considerinvestigations of image manipulations [L1], [L2], which in some cases could be also verysupportive for the proper deployment of WAFO.At last, we should categorize WAFO as a sub-class of Cloud Forensics( CCFO) [2]. Cloud1 we assume that, the reader understands the abstraction of the Webapp as a WAFO environment 7
    • 1.IntroductionForensics is a relatively new term in the Security communities. Historically, the existence of WebApplications lead in phase to the Cloud-Computing( CC). Concerning the complexity of the Webapplications, platforms and services presented by the CC, CCFO cover larger investigation areasthan the WAFO. As an example, WAFO is not explicitly observing fraud on Web Services. WebServices are covered by the Web Services Forensics( WSFO), another sub-class of CCFO, andshould be categorical discerned from WAFO, please read further.Let us illustrate the DFO taxonomic structure in the next Figure:Figure 1: General Digital Forensics Classification, WAFO allocationOn behalf of this short introduction of the different Computer Forensics categories, lets designateexplicitly the limitations of the paper. This concerns the better understanding of the papersexposition and explain the absence of examples, covering different exotic attacking scenarios.1.2. Limitations of this paperThis term paper discusses Web Application Forensics, which excludes topics as on-the-fly packetcapturing, packet inspection of sensitive data over ( security) internet protocols. Once again tomention, it does not cover attacks, or attacking scenarios on lower layer than Layer 7 ISO/ OSIModel. For the interested reader, a very good correlation of the Layer 7 Attacks and below,concerning Web Application Security and Forensics can be found at [3]. In distinction to WebServices Forensics [5] and CCFO [2], the presented paper covers only a small topic, concerning thevarieties of fraud Web Applications: • RIA( AJAX, RoR2, Flash, Silverlight et al.) ,2 RoR- Ruby on Rails, http://rubyonrails.org/ 8
    • 1.Introduction • static Web Applications, • dynamic Web Applications and Web Content( .asp(x), .php, .do etc. ), • other Web Implementations( like different CMSes), excluding research on fraud, concerning Web Services Security, or CC Implementations, but explicitly Web Applications.Due to the marginal limitations of the term paper, the reader shall find a couple of illustratingexamples, which do not pretend to cover the variety of illustrative scenarios of Web AttackingTechniques and Web Application Forensics approaches.For the reader concerned, attacks on Layer 7 are introduced and some of them discussed in detailat [4].Furthermore, we should denote a clarification, regarding the references in this paper, consideringtheir proper uniformity, as follows. General knowledge should be referenced by footnotes at theappropriate position. The scientifically approved works are indexed at the end of the paper in theBibliography, as ordinary. Non scientifically approved works, also video-tutorials, live videosnapshots of conferences, blogs etc. are indexed by the List of links after the Appendix of this paper.We should imply this strict references sources division, with respect to the Security ScientificCommunities. In addition to this, let us introduce some of the interesting related works dedicatedon the topic of WAFO.1.3. Reference worksAn extensive approach, covering the different aspects of Web Application Forensics, is given in thebook “Detecting Malice” [3], by Robert Hansen3. The interested reader can find much more thanjust WAFO discussions in this book, but in addition to these also examples of attacks on lower levelthan Layer 7, correlated to the WAFO investigations and many paradigms, derived from real-lifeWAFO investigations.The unprepared reader should notice that, the topics in the book, discussing WAFO tools, arelimited. The author of the book points out the sentence, that every WAFO investigation should beconsidered as unique, especially in its tactical accomplishment, therefore favoring of top automatedtools, should be assumed as inappropriate, please read further.Another interesting approach is given by SANS Institute as Practical Assignment, covering threenotable topics: penetration testing of a compromised Linux System, a post mortem WAFO on theobserved environment and discussions on the legal aspects of the Forensics investigation [6].Despite the fact that, this tutorial in its Version 1.4 is no more relying on an up-to-date example, itillustrates very important basics, concerning WAFO and can be used still as a fundamental readingfor further research on the WAFO topic.BSI4, Germany, describes in the Section, Forensic Toolkits, at “Leitfaden “IT-Forensik” [7], Version1.0, September 2010, different Forensic tools for automated analysis, many of them concerningimplicitly WAFO. The toolkits are compared by the following aspects: • analyzing of log-data,3 http://www.sectheory.com/bio.htm4 https://www.bsi.bund.de/EN/Home/home_node.html 9
    • 1.Introduction • tests, concerning time consistency, • tests, concerning syntax consistency, • tests, concerning semantic consistency, • log-data reduction, • log-data correlation, concerning integration and combining of different log-data sources in a consistent timeline, integration/ combining of events to super-events, • detection of timing correlations( MAC timings) between events.The given approaches can be related to WAFO log file analysis, which designates them asreasonable supportive WAFO investigation methods.Another tutorial, giving basic overview, which should be also considered as fundamental regardingWAFO research, is: “Web Application Forensics: The Uncharted Territory”, presented at [8].Although, the paper is published in 2002, it should not be categorized it in a speedy manner asobsolete.Other papers, articles and presentation papers, concerning specific WAFO aspects, complete thegroup of the related references, concerning the Web Application Forensics research in this termpaper. These should be referenced at the appropriate paragraphs in the papers exposition and not bediscussed individually in this section, furthermore.Lets describe the structure of the term paper. Chapter 2 should give a taxonomic illustration on thetopics, designating intruders profiling and modern Web Attacking Scenarios. Chapter 3 deliberatesWAFO investigation methods and techniques more detailed and concerns further discussion on thematter of signification of a possible WAFO taxonomy. In Chapter 4 are illustrated the WAFOinvestigation supportive tools. An important section outlines the questions, concerning therequirements of WAFO toolkits, which points out the reasonable aspects for determining the toolseither as relevant, or inappropriate for adequate WAFO investigations. Two major group of favoritetools should be designated: Proprietary Toolkits and Open Source solutions. Chapter 5 representsthe final discussion on the papers thesis and suggestions for future work on behalf of the discussedtopics in the former chapters. In Chapter 6 is deliberated the Conclusion on the proposed thesis.The Appendix demonstrates an additional information( tables, diagrams, screenshots and codesnippets) on specific topics, discussed in the exposition part of the paper.Let us proceed with the description of the Web Attacking Scenarios and ( Web) Intruder profiles. 10
    • 2.Intruder profiles and Web Attacking Scenarios2. Intruder profiles and Web Attacking ScenariosIn the introduction part of this thesis is outlined that, the scientifically approved research,concerning Web Application Forensics by the Security and Scientific Communities, should be stillconsidered as insufficient and as not well-established. Thats why, an appropriate categorization ofthe different Forensic Fields and the correct allocation of WAFO in the Digital Forensics hierarchyare adequately appointed as required in the former chapter, which satisfies one of the objectives ofthe current paper.For all that, this classification does not present a complete fundamental basis for further academicresearch on WAFO. Therefore, we should extend the abstract Model, concerning WAFO, byintroducing two other fundamentals: the profile of the modern Web intruder and methodologies asabstract schemae, current Cyber ( Web) attacks are accomplished by.Thus, we should follow the proposed schema for describing completely the aspects of WAFO, seethe following Table: 1. represent the Digital Forensics hierarchy and 2. allocate the field of interest, concerning WAFO, 3. explain the Security Model, WAFO is observing, by: • designating the intruder, • describing the victim environment( Webapps), • specifying the fraudulent methods; 4. demonstrate the WAFO tasks, supporting the security remediation planTable 2: A proposal for general taxonomic approach, considering the complete WAFO descriptionIn this way of thoughts, we should stress that, the intruders attacks on existing Web Applicationsand other Web Implementations nowadays, should be denoted as highly sophisticated. Such Webattacks are rapidly adaptive in their variations and alternations, and in some cases precarious to beeffectively sanitized. Example of such attacks like CSRF, Compounded SQLIA and CompoundedCSRF are described in [4]. A good representative in this group is the famous Sammy worm, whichis still wrongly considered to be a pure XSS Attack. Another confusing example demonstrate theThird Wave of XSS Attacks, DOM based XSS( DOMXSS) [20]. The fact that, DOMXSS attackscannot be detected by IDS/ IPS, or WAF systems, if the payload is obfuscated as an URLparameter, e.g. Web Application server do not record HTML parameters in the log file, but only theprimary URL prefix, should be designated as ominous. If the nature of such Attacking scenarios isfundamentally mistaken, then it is a matter of time that, attacks derivatives should success in theirfurther fraudulent activities on the Web.The task to sanitize a compromised Web application by CSRF is very difficult. It requires immenseefforts of Reverse Engineering and Source Code rectification in reasonable boundaries for time andefficiency. The more general problem is, Web Applications are per se not stealth5. Thus, hardening a5 Exceptions to these could be Intranet-Webapps, which designate another class of Webapps, concerning the term 11
    • 2.Intruder profiles and Web Attacking ScenariosWebapp is not equivalent to hardening of a local host. In other words, the utilization of knownpreventive techniques, like security-through-obscurity, should be anted to secured Intranet Webapplications, Admin Web Interfaces, non-public FTP servers etc., but commercial B2B Webapps,On-line Banking, Social Network Web sites, On-line magazines, WebMail applications and others.These last mentioned applications are meant to be employed from all over the world per definition;they exist, because of the huge amount of their users and customers per se. Thats why, the securingof such Web constructs is more complex and intensive. Of course, there are basic and advancedauthentication techniques applied to Web implementations, though these do not make the Webappstealth for intruders. They just apply the so called user restriction for using sensitive parts of theWeb implementation. In this way of thoughts, pointing out exaggerative cases of Web fraud likeChild pornography and personal image offending issues, is only the top of the iceberg of examplesfor Web crime. The problem is, nowadays Identity Theft and speculations with sensitive personaldata, should not be further categorized as exotic examples of existing Cyber crimes6 over theinternet on Web Platforms. Such crimes designate an everyday persistence. Social networks, socialand health insurance companies strive for more impressive Web representation. E-CommercePlatforms for daily monetary transactions are undeniable nowadays. We should not considernowadays Web 2.0 as a hype, we should keep in mind that, the former dynamic E-commerce Webrepresentations become nowadays sophisticated RIA Web platforms. Such Webapps respect thebetter marketing representation of the Business Logic of the firms, which profit depends at thepresent days on the complexity, rapidly changing dynamic adaption and more user-friendly featuresfor satisfying the Web customer at any time. These aspects explain the huge intruders interest forcompromising Web applications, and furthermore Web Services as well. There is no kind ofdeterministic conclusions on the prediction of Web Attacking Scenarios, or the amount of thedamage they cause every day.In [3], Robert Hansen compares the intensity of Web Attacks representations and amount ofdamage they cause comparatively to the computer viruses. Both of the security topics should notloose attention of the Security communities for a long period of time. Moreover, as already stated,their remediation could not be ascertained straight-forward. As we know, there is no defaultapproach for proper sanitization against computer viruses. The same statement is applicable forWebapp attacking scenarios. Rather, it is a matter of extensive 24/7/365 deployment of propersecurity hardening techniques and strategies, and the adaptive improvement of those. Knowing yourfriends is good, knowing your enemies is crucial. Lets proceed in this way of thoughts, after givingthis conclusive explanation for the motivational purpose of the paper, with the representation ofmodern Web fraud in detail as follows.2.1. 
Intruder profilingTwo general categories should be designated in this section: the standard intruder profile and theprofile of the intelligent intruder, performing terrible Cyber crime, short- intelligent intruder profile.We should use the adjective intelligent, describing the second intruders profile, as very reasonable,respecting the fact- if we as representatives of the Security Communities, pretend to possesknowledge and know-how, concerning the proper deployment of our duties, this kind of intrudersposses it too and much more. papers definitions, where extensive intruders effort is a pre-requirement for breaking the Intranet security, and should not be discussed here as relevant.6 http://www.justice.gov/criminal/cybercrime/ 12
    • 2.Intruder profiles and Web Attacking ScenariosThere are also fuzzy definitions of intruders, which designate states in between the abovementioned ones. In fact, these profiles are very agile in their representation. For example- a formerintelligent intruder should be categorized better as a latent one, and a motivated standard attackershould not be disrespected. This violator could fulfill the requirements of the category, related to theintelligent intruder profile, at any time with sufficient likelihood.In the category of standard intruder we should determine: script kiddies and hacker wannabes,“fans” of YouTube, or other video platforms, capturing knowledge and know-how from easy how-tovideo tutorials. Bad configured robots and spiders, and any other kind of not well educated, notenough motivated, even not enough skilled daily violators. Specific for this group of intruders is thelack of personal knowledge and know-how, utilization of well known attacking techniques andscenarios well-established on the Web. Such violators are ignorant to and disrespecting the noise7they produce, while trying to accomplish the attacks. These features explain the deduction- astandard attacking scenario, could be sanitized in greater likelihood with standard prevention andhardening techniques( best-practices). In cases of successfully deployed attack(s) on behalf of suchstandard scenarios, the investigation and detection approaches could be considered as standard withgreater likelihood too.For all that, there are cases, which represent attacking scenarios, designated as shadow scenarios. Itis not important, whether these are accomplished successfully, or not at the specific time of theattacks deployment. Their utilization is to cover the deployment of the real attacking scenario.Thats why, we should rather concern, whether these are cases of intelligent intruders attacks.The group of intelligent intruders should deliberate: former ethical hackers; pen testers; securityprofessionals, who have changed sides, disrespecting their duties; intelligently set up automatedtools for Web Intrusion, such as Web Scanners, Web Crawlers, Robots, Spiders etc.The most notable feature describing these representatives is the possession of inferior independentknowledge and know-how. Furthermore, patience, accuracy in the accomplishment of the attackingscenario deployment, strive to learn and assimilate new know-how.Interesting examples, related to this profile, are given at [3]. We should mention some types of suchones. Intelligent hackers are recruited by law firms to achieve a Proof of Concept( PoC) on atargeted Web implementation. If the PoC is positive, this could alter the outcome of the legal case,as this PoC could be used as decisive juristic evidence in most of the situations in account of thehacker recruiting law firm. Such intruders attacks are difficult to be detected right on time.Furthermore, there are other cases, where the damage of the accomplished attack is the determinantalarm after havoc is consequently presented. As already stated, the sanitization of the compromisedWeb Application(s) after such successful attacks is in some cases unfeasible and more often requiressophisticated methods to be achieved. Examples of these are CSRF compromised Webapps, like thecase: PDP GMail CSRF attack8, see also [4]. 
Therefore, reasonable supportive part to the accuratesanitization of the compromised Webapp, demonstrates the proper deployment of Web ApplicationForensics investigations.Lets mention several examples of modern Web Attacking Scenarios in the next section ofChapter 2.7 We should emphasize here: the Communication Complexity and amount of false positive attempts by the violator(s) in their strive to complete the intended Web attacking scenario(s), which should not be mistaken with the utilization of attacking techniques, where producing communication noise is the core of the attacking strategy, like different DDoS implementations: Fast Fluxing SQLIA, DDoS via XSS, DDoS via XSS with CSRF etc.8 http://www.gnucitizen.org/blog/google-gmail-e-mail-hijack-technique/ 13
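The notion of attacker noise referred to above can be made slightly more tangible. The following is a minimal sketch, not taken from the paper or from any tool it references, of how per-client noise might be roughly quantified from an Apache-style combined access log; the log path, the regular expression and the thresholds are assumptions introduced purely for illustration.

#!/usr/bin/env python3
"""Rough per-client 'noise' profiling from an Apache-style combined access log.

Illustrative sketch only: the log path, the regular expression and the
thresholds below are assumptions, not part of the paper or of any cited tool.
"""
import re
from collections import defaultdict

# Common/combined log format: host ident user [time] "request" status size ...
LOG_LINE = re.compile(
    r'^(?P<host>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] "(?P<request>[^"]*)" '
    r'(?P<status>\d{3}) \S+'
)

def profile_noise(log_path):
    stats = defaultdict(lambda: {"requests": 0, "errors": 0, "urls": set()})
    with open(log_path, encoding="utf-8", errors="replace") as log:
        for line in log:
            match = LOG_LINE.match(line)
            if not match:
                continue  # skip lines that do not fit the assumed format
            entry = stats[match.group("host")]
            entry["requests"] += 1
            entry["urls"].add(match.group("request"))
            if int(match.group("status")) >= 400:
                entry["errors"] += 1  # 4xx/5xx responses as a crude noise indicator
    return stats

def report(stats, min_requests=20, min_error_ratio=0.3):
    # A high error ratio across many distinct request lines hints at blind probing
    # (the 'standard intruder' pattern); low noise does not prove the opposite.
    for host, entry in sorted(stats.items(), key=lambda kv: -kv[1]["errors"]):
        ratio = entry["errors"] / entry["requests"]
        if entry["requests"] >= min_requests and ratio >= min_error_ratio:
            print(f"{host}: {entry['requests']} requests, "
                  f"{entry['errors']} error responses ({ratio:.0%}), "
                  f"{len(entry['urls'])} distinct request lines")

if __name__ == "__main__":
    report(profile_noise("access.log"))

Such a crude error-ratio heuristic can at best separate blind probing from quiet, targeted activity; it says nothing definitive about the intruder profile on its own.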
    • 2.Intruder profiles and Web Attacking Scenarios2.2. Current Web Attacking scenariosIn May, 2009 Joe McCray9 concludes in his presentation [9] on Advanced SQL Injection atLayerOne10, that Classic SQLIA should no more be categorized as a trend or conventional.In [4] Classic SQLIA are discussed as a part of the current SQLIA taxonomy till 2010. Despite thefact, their categorization by Joe McCray should be respected as reasonable. This controversial issueis presented at many of the current Web Attacking Vectors. To achieve a complete taxonomicapproach, pertaining to a concrete Webapp Attacking vector, many obsolete representations of theAttacking sub-classes should be illustrated, revering the real Web Environment. The mentionedabove Classic SQLIA illustrate obsolete and more over unfeasible Attacking Techniques,considering the properly employed modern defensive methods. The main reason, explaining thisissue is- Web platforms are vastly changing, not only according to its development aspects, butrather the attacking and security hardening scenarios, anted to them. Most likely, an intelligentintruder should not use obsolete techniques, because of the expectant presence of Web Applicationsecurity protection. Detecting deployment of obsolete Attacking Scenarios on a modern Webconstruct, could be classified as an investigation on the standard intruders profile. Nevertheless,this conclusion should not be underestimated, as previously discussed, see shadow scenarios.Lets give some interesting examples of current successful accomplished Web Attacks.In July, 2009 a dynamic CSRF Attack is accomplished on the Web Platform of Newsweek [4], [L4].The Tool, called MonkeyFist11, utilized for this first completely automated CSRF Attack, representsa Python- based small web server, configured via XML. The victim site is been already hardenedvia protecting of the generation of its dynamic elements by security tokens12 and strong session IDS.For all that, this new attacking technique achieves positive results, which designates open questions,concerning the impact of the See surfing sleeping giant.Another recent attack is the SQLIA over the British Navy Website[L5] in November, 2010, whichwas only meant to be a PoC by a Romanian hacker, that Web Application Security can be brokeneven at such high-level hardened Web Implementations.In April 2011, different mass infection by SQLIA is detected. 28000 Web sites are compromised,even several Apple Itunes Store index sites are infected. The SQLIA injects a PHP script, whichredirects the user to a cross-origin phishing site, pretending to deliver an on-line Anti-Virus( AV)protection. The attack is known in the Security Communities as LizaMoon Mass SQLIA13 [L6].The list of such impressive Web Attacking incidents can be proceeded, which should not beenumerated further in the paper. The interested reader should refer further to : • The Web Hacking Incidents Database14 • OWASP Top Ten Project159 http://www.linkedin.com/in/joemccray10 LayerOne- IT- Security conference, http://layerone.info11 http://www.neohaxor.org/2009/08/12/monkeyfist-fu-the-intro/12 The anti-CSRF token is originally suggested by Thomas Schreiber, in 2004: www.securenet.de/papers/Session_Riding.pdf13 http://blogs.mcafee.com/mcafee-labs/lizamoon-the-latest-sql-injection-attack14 http://projects.webappsec.org/w/page/13246995/Web-Hacking-Incident-Database15 http://www.owasp.org/index.php/Category:OWASP_Top_Ten_Project 14
    • 2.Intruder profiles and Web Attacking ScenariosAt the end of this Chapter lets deliberate some interesting trends, concerning the current WebAttacks.2.3. New Trends in Web Attacking deployment and preventionsDiscussing the deployment of Web Attacks, we should consider a more realistic approach, forcategorizing Web Attacking Vectors. As mentioned above, there are two general profiles of the WebIntruders. Keeping in mind, the differences of the Attacks deployment and the level of Attackssophistication, it should be more appropriate to discuss the accomplishment of Web AttackingScenarios, rather than the deployment of Web Application Attacks. In such Attacking Scenarios,which represent a fundamental construct, the Web Attacks should be denoted as executiontechniques in a given attacking setting. This allows us to define single layer attacks, multi-layerattacks, and special attacking sequences as specific implementations in the realization of the WebAttacking Scenario. Such scenarios can adequately illustrate the intention of the different profiles ofWeb Intruders. In distinction to the intelligent Web Intruder, the standard Intruder tries toaccomplish a simple attacking scenario, reduced to the utilization of a special Web attackingtechnique. The Web attacking scenario represents a simple deployment construct: try a well-established attacking procedure(s) and wait for result(s), no matter what.As mentioned above, the intelligent Intruder utilizes more sophisticated scenarios. Some of themcould be planned and sequentially accomplished in a long period of time, till achieving the expectedresult(s). There are cases in which the intelligent attacker could gain enough feedback from thevictim application and thus intentionally reduce the attacking scenario to the deployment of one or acompact amount of attacking techniques, which resembles the scenario to the level of the standardintruders scenario. Nevertheless, important aspects like utilization of non-standard attackingtechniques and less noise at the attacking environment obviously discern the one profile from theanother. These conclusions should be extended in the Chapters, concerning the more detailedrepresentation of WAFO.Lets illustrate the Web Application Scenario construction in the next Figure:Figure 2: Web attacking scenario taxonomic construction 15
    • 2.Intruder profiles and Web Attacking ScenariosThe proposed construct should be extended in the next Table, which denotes an example of apossible Web attacking scenario: Example Attack on well-known CMS [inject c99 shell on the CMS, as a paradigm] Scenario • What is the particular goal: PoC, ID Theft, destroying Personal Image etc. • determine the CMS version, • determine the technical implementation type: concurrent attacking, or sequentially attacking of specific Webapp modules • localize the modules to be compromised: Web Front-end, RDBMS, WebMail interface, News feeder etc. • if CMS version obsolete: • find published exploits( at best 0days16) and utilize them to gather feedback from the victim environment • respect scanning noise as low as possible • if version is up-to-date utilize: • blind application scanning techniques with noise reduction and wait for positive feedback • analyze the results and proceed with further more specific attacking techniques • if success, utilize a refinement of the attack and if of interest, wait for CMS admins reaction- gives feedback on sanitization response time, efforts, utilized hardening techniques etc. • if not successful: • audit the gathered feedback • wait for new published 0day exploits • develop a 0day(s) independently • utilize an scenario sequence execution loop till achieving the goal with respect to: • ( communication) attacking noise • and...try to stay concealedTechnique(s) XSS: SQLIA: CSRF CSFU Particular ... Common well-(these should * NP-XSS17 * error 0day(s) establishedbe ordered, or * P-XSS response like:reordered * timing sniffing for openaccording the SQLIA ... admin debuggingattacking console access onscenario) port 1099 Procedures NP-XSS: Error response SQLIA: ... ( these should • detect dynamic modules on the Webapp, • Step 1, be ordered, or • find variables to be compromised, • Step 2, reordered as • craft the malicious GET- Request and appropriate) taint the input value of the variable to be • … exploited • Step n; • gather feedback • resemble the procedure till expected results are achieved • spread the malicious link to as many as possible Confused Deputies[4]Table 3: Example of possible Webapp attacking scenario16 http://netsecurity.about.com/od/newsandeditorial1/a/aazeroday.htm17 NP- XSS denotes non-persistent XSS; P-XSS abbreviates the Persistent XSS 16
    • 2.Intruder profiles and Web Attacking ScenariosHow this respects the proposed profiles of modern Web intruders, should be illustrated as: Profile Standard Intruder Intelligent Intruder Attacking Scenario static: highly dynamically adaptive18 execution remains on the level of published and well-established Web attacks Techniques static: could remain static, but preferably (as a comment: … better watch it on the Cyber criminal would adapt YouTube19, see [4]) according the successful completion of the Attacking Scenario Procedures static: Could be static, but preferably the “... just copy and pase”, Intruder should seek for a 0day(s) 0day with less likelihoodTable 4: Standard vs. Intelligent Web intruderAnother important aspect, respecting the prevention and sanitization of successfully deployed WebApplication Attacking Scenarios, is illustrated by Rafal Los20 in his presentation at OWASPAppSecDC in October, 2010 [10]. Main topic of his research, concerns the Execution-Flow-Basedapproach as a supportive technique to the Web Application security (pen-)testing. The utilization ofWeb Application scanners( WAS) should be determined as impressive, supporting the pen-testingjob of the security professional/ ethical hacker and not to forget the intelligent intruder [11], [4].Indeed, WAS can effectively map the attacking surface of the Webapp, intended to be compromised.Still, open questions remain, like- do WAS provide full Webapp function- and data-flow coverage,which reports greater feedback, concerning a complete security auditing of the Web construct indetail. Most of the pen-testers/ ethical hackers, do not care what kind of functions, related to theWebapp, should be tested. If they do not exactly know the functional structure and the data-flow ofthe Web Application, how should they consider appropriate and complete functional coverageduring the pen-testing of the Webapp?The job of the pen-tester is to reveal exploits and drawbacks in the realization of a Web Applicationprior to the intelligent intruder. Consequently to this, appears the next question, what are theobjective parameters to designate the pen-testing job completed and well-done?As Rafal Los states, nowadays the pen-testing of Webapps, utilizing WAS, should be still digestedas “pointnscan web application security”. The security researcher suggests in his presentation that,a more reasonable Webapp hardening approach is the combination of the Applicationfunction-/data-flow analysis with the consequent security scanning of the observed Webimplementation. A valuable comparison between the Rafal Los indicated approach and the commonsecurity testing of Webapp(s), outlining the drawbacks of the second one, is given inTable 9, Appendix A.18 Respecting the current level of sanitization know-how, produced attacking noise, reactions of the security professionals to sanitize the particular Webapp, the specific goal for compromising the victim Webapp19 The author of the paper do not intend to be offensive to YouTube, nevertheless the facts are: this video on-line platform is well-established and popular, there are tons of videos, hosted on it, concerning: Classic SQLIA derivatives, XSS derivatives etc., which could be easily found and utilized by script kiddies, hacker wannabes ...20 http://preachsecurity.blogspot.com/ 17
    • 2.Intruder profiles and Web Attacking ScenariosLets summarize these drawbacks, as follows. The current Webapp pen-testing approaches viascanning tools do not deliver adequate functional coverage of modern and dynamic highsophisticated Web Applications. Furthermore, the Business Logic of the Webapp(s) is oftenunderestimated as a requirement for the proper pen-testing utilization. A complete coverage of thefunctional mapping of the Web Application could still not be approved. If the application executionflow is not explicitly conversant, the questions, regarding completeness and validity of the resultsfrom the tested data, should be denoted as open.Therefore, Rafal Los suggests, utilization of Application-Flow Analysis( AFA) in the preparationpart prior to the deployment of the specific Web Application scanning. This combination of the twoapproaches should deliver better results than those from the blind pointnscan examinations.Explanation of this approach is illustrated in Figures 16, 17 and Tables 10, 11, 12, given atAppendix A. For more information, please refer to [10], or consider studying the snapshot of thelive presentation[L7].We should designate these statements as highly applicable for the better utilization of WAFO, aswell. The lack of complete and precise knowledge of the functional structure and data flow of theforensically observed Webapp, should definitely detain the proper and accurate implementation ofWAFO. We should keep in mind these conclusions and extend them in the following Chapters ofthe paper.Lets proceed with the more detailed representation of the Web Application Forensics. 18
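Before moving on, the flow-coverage argument can be made concrete with a toy calculation. The sketch below is not taken from [10]; the page graph, the scanner trace and the coverage metric are invented for illustration. It merely compares the transitions a scan actually exercised against a manually prepared map of the application's expected flow.

#!/usr/bin/env python3
"""Toy flow-coverage check: which expected page transitions did a scan exercise?

Illustrative only: the flow map and the scanner trace below are invented;
a real application-flow analysis would be derived from the Webapp itself.
"""

# Manually prepared application-flow map: page -> pages reachable from it.
EXPECTED_FLOW = {
    "/login":            {"/account", "/login"},
    "/account":          {"/transfer", "/settings", "/logout"},
    "/transfer":         {"/transfer/confirm"},
    "/transfer/confirm": {"/account"},
    "/settings":         {"/account"},
}

# Ordered list of pages a scanner (or a manual test session) actually visited.
SCANNER_TRACE = ["/login", "/account", "/settings", "/account", "/logout"]

def transitions(trace):
    """Turn a visit sequence into the set of (from, to) transitions it covers."""
    return set(zip(trace, trace[1:]))

def coverage(expected_flow, trace):
    expected = {(src, dst) for src, dsts in expected_flow.items() for dst in dsts}
    covered = transitions(trace) & expected
    missed = expected - covered
    return covered, missed

if __name__ == "__main__":
    covered, missed = coverage(EXPECTED_FLOW, SCANNER_TRACE)
    print(f"transition coverage: {len(covered)}/{len(covered) + len(missed)}")
    for src, dst in sorted(missed):
        print(f"  not exercised: {src} -> {dst}")

Missed transitions, such as the transfer confirmation step in this toy example, are exactly the blind spots that a point-and-scan run would leave both for the pen-tester and, later, for the forensic investigator.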
    • 3.Web Application Forensics3. Web Application ForensicsThe main task, this Chapter represents, is to proceed further with the taxonomic description ofWAFO, by describing the victim environment, e.g. to designate in detail the Web application inproduction environment. This should be specifically utilized on behalf of the facts: explaining, howWebapp forensics is applied to this environment; determining, what are the main concerning aspectsto WAFO; establishing these statements via particular examples and outlining collaborativetechniques, which extend the proper WAFO investigation. See again Table 2.We proposed in the former Chapters that, utilizing WAFO on behalf of best practices and onlyshould not be considered as reasonable. Presuming this, we should emphasize further explicitlythat, trial-and-error approaches and conclusions,relying on personal experience and high-levelskills, can not be approved as sufficient requirements for proper WAFO deployment.On the one hand we discover high information abundance, concerning the prior discussedcomplexity aspects of RIA Webapps, on the other the impulse for applying appropriate WAFO onthese high-level sophisticated applications is immense.Once again, this confirms the need for proper taxonomy- not best-practices, presenting a recipeshaping of the Web Application Forensics investigation, but categorizations, approved to beuniversally valid and compact in their representation. Lets conclude the illustration of the Webappforensics categorization and extend the described taxonomic aspects heretofore.Respecting the post mortem strategies, after intruders attack is successfully accomplished anddamage is presented, we specify two general approaches for Webapp sanitization- IncidentResponse( IR) and Web Application Forensics. In a word, the differences between them, should beoutlined as follows. The remediation scenario, applied to the compromised application and focusedon the regaining of the implementations complete functionality, is the main concern of the IncidentResponse. In distinction to this, the Forensics investigation focuses on gathering the maximumcollection of evidence, which is relevant for the IR utilization and should be employed to a court ofjurisdiction, if required.Lets demonstrate the complete overview of the Digital Forensics structure and point out thedependencies between IR and CFO, as well as, the dependencies between WAFO and the otherForensics fields. This is illustrated in the next Figure 3. 19
    • 3.Web Application ForensicsFigure 3: Digital Forensics: General taxonomyFor the reader concerned, please refer to [12], where IR and Forensics approaches are compared indetail. More general representation on the topics IR and Forensics should be found at [1], [13], [14].In this way of thoughts, we should derive and should specify the following fundamentalquestions( *), concerning WAFO: 1. how can we describe an environment as ready for Forensics investigations, 2. what evidence should we look for and 3. what is the definition of their location, 4. how can we extract the payload of the Forensics evidence raw data, concerning its proper application in the further steps of IR.Lets designate the general procedure in the implementation of WAFO. The next Figure 4: 20
Figure 4: WAFO phases, in Jess Garcia [1]

The figure illustrates, respecting universal validity, the following steps in the WAFO deployment:
• Seizure: the problem is designated,
• Preliminary Analysis: preparation for the specific WAFO investigation,
• Investigation/Analysis loop: analyzing the collected evidence and proceeding in this manner until the collection of evidence is maximal and complete.

In this way of thought, we should underscore the standard tasks WAFO utilizes, as in [15]:
1. Understand the "normal" flow of the application
2. Capture application and server configuration files
3. Review log files: Web Server, Application Server, Database Server, Application
4. Identify potential anomalies: malicious input from the client, breaks in normal web access trends, unusual referrers, mid-session changes to cookie values
5. Determine a remediation plan
Table 5: Web Application Forensics Overview, in [15]

Let us categorize the evidence, as an argumentation to the second fundamental question, see (2,*), in Table 6:

Digital Forensics evidence:
• Human Testimony
• Environmental
• Network traffic
• Network Devices
• Host: Operating Systems, Databases, Applications
• Peripherals
• External Storage
• Mobile Devices
• ... ANYTHING!
Table 6: A general taxonomy of the Forensics evidence, in [1]

To specify the source of the different Forensics evidence, see (3,*), we should clarify the Players, as Jess Garcia in [1], contributing to the Layer 7 communication, see Table 7:

Type of Players and their implementation in the Web traffic:
• Network Traffic
• Common Operating Systems
• Client Side: (Web) Browsers
• Server Side: Web Servers, Application Servers, Database Servers
Table 7: Common Players in Layer 7 Communication, in Jess Garcia [1]

A reasonable WAFO should present an inspection and analysis of all evidence these Players produce, which consists of: inspecting the network traffic logs (including logs of supportive applications such as NIDS, IDS, IPS), analysis of the host OS logs (incl. HIPS, HIDS, event logs etc.), header and cookie inspection of the users' browsers, inspection of the server logs belonging to the Web application architecture, cache inspection etc. As proposed in the former Chapter 2, this is not a simple task, especially when the Webapp is highly process-driven (e.g. AJAX, Silverlight, Flash etc.). It may require additional application-flow analysis, which presumes explicit knowledge of the functional and data-flow map of the Webapp. The human factor should not be underestimated in this regard. Finally, there is also the important matter of the legal aspects related to the deployment of the WAFO investigation, which the security professional should be aware of and maintain during the Web Application Forensics process. We do not discuss this matter in detail; the interested reader can find more information on this topic in [16] and, as already proposed, in [7]. With respect to the fourth fundamental question, see (4,*), focusing on the evidence payload extraction, we discuss this in more detail in Section 3.1 of this Chapter.
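Two of the anomaly classes from Table 5, unusual referrers and mid-session changes to cookie values, lend themselves to simple automation. The following is a minimal sketch under stated assumptions: it presumes a tab-separated application log with the columns time, session id, referer and session cookie, which is not a standard server log format and is not taken from [15]; the trusted-referrer list is likewise invented.

#!/usr/bin/env python3
"""Flag unusual referrers and mid-session cookie changes (cf. Table 5).

Illustrative sketch: it assumes a tab-separated application log with the
columns  time <TAB> session_id <TAB> referer <TAB> session_cookie ,
which is not a standard format; real deployments would adapt the parsing
to whatever the Web or application server actually records.
"""
import csv
from collections import defaultdict
from urllib.parse import urlparse

OWN_HOSTS = {"www.example.org", "example.org"}          # assumed "own" domains
TRUSTED_REFERRERS = OWN_HOSTS | {"www.google.com", "-"}  # assumed benign referrers

def review(log_path):
    cookie_seen = defaultdict(set)    # session_id -> cookie values observed so far
    with open(log_path, newline="", encoding="utf-8") as handle:
        for row in csv.reader(handle, delimiter="\t"):
            if len(row) != 4:
                continue              # ignore malformed lines
            ts, session_id, referer, cookie = row

            # Unusual referrer: anything not on the small trusted list.
            host = urlparse(referer).netloc or referer
            if host not in TRUSTED_REFERRERS:
                print(f"[{ts}] session {session_id}: unusual referrer {referer!r}")

            # Mid-session cookie change: same session id, new cookie value.
            if cookie_seen[session_id] and cookie not in cookie_seen[session_id]:
                print(f"[{ts}] session {session_id}: cookie value changed "
                      f"({len(cookie_seen[session_id])} previous value(s))")
            cookie_seen[session_id].add(cookie)

if __name__ == "__main__":
    review("application.log")

In practice such checks are only meaningful if the session identifier and cookie values are actually logged, which is again a question of Forensics Readiness.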
    • 3.Web Application ForensicsAn environment, which is not prepared for Forensics investigation in an appropriate manner: • application logging is not present or not adequate adjusted, • no kind of supportive forensic tools are applied to the WAFO environment( IDS/ IPS etc.), • users are not well trained for Forensics collaboration;could detain the Web Application Forensics investigation in a way that, the evidence collection isconsiderably incomplete and WAFO could not be anted to the environment, at all [1]. Thats why,the matter of Forensics Readiness should be approved as fundamental in the taxonomy of WAFO,concerning the Preliminary Analysis phase of the Web Application Forensics deployment.An illustrative example of the Forensics Readiness, should be found in [13], referenced in AppendixA, Figure 18. As we specified the general taxonomy, respecting WAFO victim environment, letsproceed with further examples, designating the deployment of different Web Application Forensicstechniques. On one hand, they demonstrate in a more illustrative manner the papers exposition; onthe other, refer to the reasonable question argumentation on how WAFO payload data is gainedfrom evidence in practice.3.1. Examples of Webapp Forensics techniquesIn this Section we should describe different cases of WAFO deployment, concerning Client Sideand Server Side forensics analysis, on given real-life examples, organized as follows: main topic,possible attacks, WAFO techniques illustration.Extraneous White Space on the Request LineThis example is discussed in [3], which provides evidence for anomalies in HTTP requests, storedin the Webapp server log. The whitespace between the requesting URL and the protocol should beconsidered as suspicious. In the next Figure is illustrated a poorly constructed robot, whichobviously intends to accomplish a remote file inclusion:Figure 5: Extraneous White Space on Request Line, in [3]Google DorksExploiting the Google search capabilities, may be illustrated with the next search query [3]: 23
    • 3.Web Application Forensicsintitle:”Index of” master.passwdThe produced evidence should appear in the server logs as follows:Figure 6: Google Dorks example, in [3]The author of the book [3] states, that such requests are still very un-targeted, because of the factthat such requests are chaotic, in term of, the target is not explicitly specified in the search query.Nevertheless, they should not be considered underestimated. In respect to this, follows the nextexample, produced by spammers, utilizing the Google search engine for the same purpose:Figure 7: Malicious queries at Google search by spammers, in [3]Faking a Referring URLA great21 job for faking Referrer URL22 credentials is done by spammers. In the next example thefaked part of the URL is presented in the anchor identifier, which is unique for accessing differentparts on the displayed web page content. Such GET requests should not be approved as valid logfile entries via clicks on the Web page, because the Web server reproduces the whole Web page anddo not matter explicitly about its content, thus such log entry should be determined as malicious and, once again to be mentioned, not produced by a regular Web surfing activity:Figure 8: faked Referrer URL by spammers, in [3]Remote File InclusionA good example for Common Request URL Attacks could be illustrated by the next Remote FileInclusion( RFI)23 attempt stored in the Web Server log:Figure 9: RFI, pulling c99 shell, in [3]The attempt to pull the well known c99 shell on the running machine on behalf of a GET Request isobvious. The c99 shell is classified as a malicious PHP backdoor. There is a great likelihood that,Web intruders try to inject and execute such kind of code on Open Source PHP Webapps, likedifferent PHP-based CMSes, or PHP-forums. In most cases RFIs are deployed to extend thestructure of compromised machines and support the utilization of botnets.21 great job in terms of, discussing the algorithmic approach as security professionals and by no means as favoring the malicious intentions of the Cyber criminal22 RFC 173823 http://projects.webappsec.org/w/page/13246955/Remote-File-Inclusion 24
A simple Classic SQLIA
The following general example illustrates a SQL injection attack (SQLIA) [4] against a PHP Webapp by means of a malicious GET request:
Figure 10: Simple Classic SQLIA, in [3]
The intruder tries to compromise the admin account of the Webapp using a tautology-based classic SQLIA: password=' or 1=1 --. To place the apostrophe, the whitespace and the equals-sign ASCII characters in the GET request, they are substituted by their URL-encoding representations %27, %20 and %3D.

NULL-Byte-Injection
A NULL-Byte-Injection (NBI, see http://projects.webappsec.org/w/page/13246949/Null-Byte-Injection) can likewise be accomplished by means of a GET request:
Figure 11: NBI evidence in Webapp log, in [3]
In the same manner as in the former example, the NUL ASCII character is URL-encoded here as %00. The attack targets the Perl login.cgi script and uses the NBI to open the sensitive .cgi file.

The provided examples illustrate different header-inspection cases as part of server-side forensics. The list can be extended by further paradigms related to client-side browser investigation techniques, such as Browser Session-Restore Forensics [17] and cookie inspection. We do not consider further illustrations of WAFO techniques in this section, with respect to the limited scope of the term paper; the interested reader should refer to [3] and [15] for more information. Since both the SQLIA and the NBI example only become readable after URL decoding, a short decoding sketch is given below.
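The following lines are an illustrative sketch only; the sample request strings are hypothetical reconstructions in the spirit of Figures 10 and 11, not the original log entries. They show how Python's standard library could be used to decode URL-encoded payloads before inspection.

    from urllib.parse import unquote

    # %27 -> ', %20 -> space, %3D -> =, %00 -> NUL: decoding makes the
    # injected syntax visible to the human examiner and to later filters.
    samples = [
        "/login.php?user=admin&password=x%27%20or%201%3D1%20--%20",  # hypothetical SQLIA
        "/cgi-bin/login.cgi?file=secret.doc%00.html",                 # hypothetical NBI
    ]

    for raw in samples:
        decoded = unquote(raw)
        # repr() keeps the NUL byte visible instead of printing an invisible character
        print(raw, "->", repr(decoded))

This corresponds to TRR 10 in Section 4.1: a tool that decodes URL data makes the evidence searchable in a readable format.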
Let us now proceed to an example concerning WebMail forensics.

3.2. WebMail Forensics
Web-based mail (WebMail) represents a separate construct within a Web application. Many firms deploy Web-based mail services (Yahoo, Amazon etc.), and WebMail constitutes another data-input source of a Webapp; the effort to compromise Web-based mail implementations therefore still matters. The next Figure 12 illustrates a faked (spam) e-mail:
Figure 12: HTML representation of spam-mail (e-mail spoofing)
This is the last case study in the exposition of examples. The spam-mail is representative of one of the most utilized attacking techniques concerning WebMail: e-mail spoofing. Accordingly, we show a fragment of the mail header, see Figure 13:
Figure 13: e-mail header snippet of the spam-mail in Figure 12
A small sketch of how the relevant header fields could be extracted from a stored message is given below; the manual inspection follows in the next paragraphs.
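The sketch below is an assumption-laden illustration rather than the method used for Figure 13; the file name suspicious_mail.eml is hypothetical. It uses Python's standard email package to print the Received chain and the sender-related headers that the following analysis relies on.

    from email import policy
    from email.parser import BytesParser

    # Header fields worth cross-checking when spoofing is suspected.
    HEADERS_OF_INTEREST = ("From", "Sender", "Return-Path", "X-Envelope-Sender")

    def summarize_headers(path):
        """Print the Received chain and sender-related headers of a stored e-mail."""
        with open(path, "rb") as fh:
            msg = BytesParser(policy=policy.default).parse(fh)
        for hop in msg.get_all("Received") or []:
            # collapse folded header lines into a single readable line
            print("Received:", " ".join(hop.split()))
        for name in HEADERS_OF_INTEREST:
            value = msg.get(name)
            if value:
                print(name + ":", value)

    summarize_headers("suspicious_mail.eml")

The correlation of these fields, described next, is still a manual task; the script merely collects them in one place.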
A further supportive attacking technique is e-mail sniffing, which is not discussed in this paper; the interested reader is referred to [18], [19]. The author of this paper received the illustrated spam-mail in January 2011. (At this point the author would like to thank the Rechenzentrum at Ruhr-University Bochum for sanitizing the spam-mail with SpamAssassin right on time; see http://www.rz.ruhr-uni-bochum.de/ and http://spamassassin.apache.org/.) Let us demonstrate a WebMail header inspection on this example, as already shown in Figure 13, which explains the e-mail spoofing attempt. On the one hand, inspecting the Received header, the domain appears to be valid and to belong to facebook.com (checked via http://www.mtgsy.net/dns/utilities.php); on the other hand, the Return-Path header as well as the X-Envelope-Sender header reveal a totally different sender. The domain specified there appears to belong to a home-building company in the US; moreover, there is another very similar domain, cedarhomes.com.au. Inspecting the Sender header next, the sender name appears to be a common name in Australia (cf. http://search.ancestry.com.au). The correlation of the evidence is illustrative, and, more importantly, the e-mail-spoofing attempt is identified.

Another crucial matter concerns the same spam-mail. A more detailed investigation of its HTML content, provoked by the suspicious appearance of the hyperlink in Figure 12 ("…, please click here to unsubscribe.", second row from the bottom of the HTML mask), reveals the following dangerous HTML tag content, see the next figure:
Figure 14: Spam-assassin sanitized malicious HTML redirection, from example Figure 12
It appears that the spam-mail is intelligently devised: the intruder is not actually interested only in spamming e-mail accounts. More likely, a receiver who does not use social platforms, or simply dislikes such e-mails, will click on the unsubscribe link, which leads to a malicious site. Modern versions of the Mozilla Firefox browser detect the compromised, malicious domain promelectroncert.kiev.ua and warn the user in time. This example illustrates why WebMail forensics matters; a small sketch of how such link targets could be extracted for review follows.
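As a purely illustrative sketch, with the file name mail_body.html and the reference domain facebook.com being assumptions for this example and not part of the original analysis, the hyperlink targets of an HTML mail body could be collected and compared against the claimed sender's domain as follows:

    from html.parser import HTMLParser
    from urllib.parse import urlparse

    class LinkExtractor(HTMLParser):
        """Collect the target host of every <a href="..."> in an HTML mail body."""
        def __init__(self):
            super().__init__()
            self.hosts = []

        def handle_starttag(self, tag, attrs):
            if tag == "a":
                for name, value in attrs:
                    if name == "href" and value:
                        self.hosts.append(urlparse(value).netloc)

    def link_hosts(html_body):
        parser = LinkExtractor()
        parser.feed(html_body)
        return parser.hosts

    # Link targets whose host differs from the claimed sender's domain
    # deserve a closer look during the WebMail inspection.
    claimed_domain = "facebook.com"
    with open("mail_body.html", encoding="utf-8") as fh:
        for host in link_hosts(fh.read()):
            if host and not host.endswith(claimed_domain):
                print("suspicious link target:", host)

A mismatch between the displayed link text and the actual target host, as in the example above, is a strong hint of a phishing or redirection attempt.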
Thus, we conclude this section and proceed to the last part of Chapter 3, concerning collaborative approaches from the other forensics investigation fields that support WAFO.

3.3. Supportive Forensics
In this section we briefly discuss the supporting part played by Network, Digital Image and (OS-)Database Forensics in extending the evidence collection for a WAFO investigation. Log data derived from IDS/IPS prevention systems supports a more precise detection of the intruder's activities on the Webapp and of the IP provenance. As described earlier, the amount of noise the intruder produces over the network is sufficient to determine the violator's profile properly. In some cases, forensic investigation of digital images uploaded to a compromised Web application can lead to the successful detection of the intruder's origins. This underlines once again the recommendation to correlate the different payloads as forensic evidence extensively, which reduces false positives in the results and consequently leads to more precise attack detection. A very interesting example is pointed out in [3], page 285, concerning the Sharm el Sheikh case study.

Finally, we should mention the notable case in which WAFO is hampered by a lack of sufficient database log data. Root causes of such issues can be: concealing techniques the Web intruder applies to cover the traces of the attack, malfunction of the database engine, or a lack of proper WAFO readiness (the logging capabilities of the RDBMS are not adequately configured, etc.). In such cases the successful WAFO examination of a compromised RDBMS serving as back-end to a Webapp is fundamentally in doubt. Nevertheless, if the RDBMS application server has not been restarted since the time prior to the execution of the attacking scenario, there is a reasonable chance to extract important forensic evidence from the RDBMS plan cache; this essential approach is discussed in detail in [16].

The techniques for WAFO deployment discussed in this chapter should be considered manual techniques. If the observed environment is compact and the available evidence can be examined by a human in acceptable time and with acceptable effort, expanding the collection of such forensic techniques is undeniably fundamental and relevant. There are, however, many cases concerning modern Webapps in which the inspection of the log files exceeds human abilities, for instance logs produced by Web scanners amounting to a couple of gigabytes [L8], or in which the WAFO investigation has to be accomplished rapidly. In such cases the questions concerning the utilization of automated tools that enhance the deployment of Webapp forensics become undoubtedly significant. The next Chapter 4 introduces such tools, with respect to WAFO automation techniques.
4. Webapp Forensics tools
In [13], Jess Garcia proposes a categorization of forensics approaches into two classes: traditional forensics methods and reactive forensics methods. The main parameters designating the two classes are summarized in the next table, derived from [13]:

Traditional Forensics Approaches:           Reactive Forensics Approaches:
  • Slow                                      • Faster
  • Manual                                    • Manual/Automated
  • More accurate (if done properly)          • Risk of false positives/negatives
  • More forensically sound                   • Less forensically sound (?)
  • Older evidence                            • Fresher evidence
Table 8: Traditional vs. Reactive forensics Approaches, in [13]

Regarding the examples in Chapter 3, it must be stated that their detection can be established, in an acceptable amount of time, only by a well-trained security professional. Manually deployed WAFO investigations can be considered very precise with low error tolerance, though only if applied appropriately. As mentioned above, the complexity of current Web attacking scenarios makes such an investigation process unacceptable with respect to the time aspect. Business Webapps do not tolerate down-time, which is, however, needed if the Webapp image is to be processed for reasonable WAFO. This designates the dualistic nature of Web Application Forensics investigation: slow and precise versus faster and error-prone. On the one hand, WAFO should be deployed individually for every single case of a compromised Webapp; on the other hand, the utilization of new techniques, such as the employment of automated tools in the WAFO investigation, undoubtedly gains new (fresher) forensic evidence. This is very important for maximal forensic evidence collection, as already proposed. In this line of thought, the utilization of new automated techniques in WAFO is only acceptable after proper training prior to their implementation in a production environment. It is crucial to know the particular features of the automated tool to be used, the reactions of the Webapp environment when the tool is applied to it, and the level of transparency, i.e. the distance between the raw log-file data and the tool's feedback as evidence payload. Let us enumerate some fundamental requirement parameters that make WAFO automated tools appropriate for enforcement in the forensic investigation process.

4.1. Requirements for Webapp forensics tools
An essential categorization of the requirements for WAFO automated tools is given by Robert Hansen in [L9]. We designate them as tool requirement rules (TRR), as follows:
1. an automated tool candidate for WAFO should be able to parse log files in different formats
2. it should be able to take two independent and differently formatted logs and combine them
3. the WAFO tool must be able to normalize by time
4. it should be able to handle big log files in the range of GiB
5. it should allow utilization of regular expressions and binary logic on any observed parameter in the log file
6. the tool should be able to narrow down to a subset of logical culprits
7. the automated tool should allow implementation of white-lists
8. it should allow the construction of a probable-culprits list against which the security investigator can pivot
9. it should be able to maintain a list of suspicious requests that indicate a potential compromise
10. the WAFO tool should decode URL data so that it can be searched more easily in a readable format

As we will see in the further sections of this chapter, compliance with all of the requirements enumerated so far is still unfeasible. Let us give a short explanation of them, which defines them as an appropriate constitutive basis. Whether or not a specific tool satisfies all of these requirements, the list supports a more appropriate categorization of its capabilities and area(s) of use. Since current Webapps quite likely involve more than one type of Web server, parsing the different log formats is not an easy task. This is a fundamental reason to decide whether it is more appropriate to use specialized tools tied to a specific log-file format, or to look for an application with a wide variety of supported log-data formats. Two sufficient candidates are the Microsoft IIS file format and the Apache Web server log format (statistics on the market shares of the different Web servers can be found at http://news.netcraft.com/). In this line of thought, an important concern is how to combine the raw data from such concurrently running Web servers to achieve a better correlation of the evidence obtained by proper extraction of the payload from their log data. Furthermore, to detect coincidences, proper investigation of time-stamps is needed; a normalization by time is crucial. The matter of the sheer amount of collected log files has been discussed sufficiently above.

The aspects concerning the utilization of regular expressions are crucial as well. To illustrate this, consider the difference between regular-expression implementations on a black-list basis and those on a white-list basis, which introduces a further parameter into the requirements list. White-listing concerns cases in which the traced payload must conform to a well-defined construction; if the observed input string differs from this limited form, it is flagged as suspicious. An example is a regular expression (RegEx) filtering tampered data in Webapp input fields such as a login ID of e-mail type. Black-listing, on the contrary, specifies which constructs are wrong and suspicious by default. Such filters can be eluded in a simple manner by altering the injection code appropriately, so that the RegEx will most likely fail to detect it. A minimal sketch contrasting the two filtering philosophies is given below.
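The following fragment is a purely illustrative sketch; the patterns and the sample values are assumptions for this example and deliberately naive, not production filters. It contrasts a white-list check on an e-mail-type login ID with a black-list signature that a trivially altered injection already evades.

    import re

    # White-listing: accept only the well-defined shape of a valid value
    # (here a very simple e-mail-style login ID); everything else is suspicious.
    WHITELIST = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")

    # Black-listing: reject values containing a known-bad fragment.
    # The whitespace assumption (\s+) is exactly what comment-based
    # obfuscation such as "or/**/1=1" evades.
    BLACKLIST = re.compile(r"\bor\b\s+1\s*=\s*1", re.IGNORECASE)

    def classify(login_id):
        if WHITELIST.fullmatch(login_id):
            return "conforms to white-list"
        if BLACKLIST.search(login_id):
            return "matches black-list signature"
        return "suspicious: outside white-list, missed by black-list"

    for candidate in ("alice@example.org", "admin' or 1=1 --", "admin'or/**/1=1--"):
        print(candidate, "->", classify(candidate))

The third candidate illustrates the weakness discussed above: it is neither a valid login ID nor caught by the naive black-list signature, which is why white-list validation combined with maintained filter collections is preferable.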
It is a very controversial task to define a black-list RegEx that covers a whole class of malicious strings and remains precise (fresh). Furthermore, it is a challenge to implement a forensics tool with a minimal and compact collection of malicious signatures that remains universally valid. Probability analysis supporting timely detection of malicious signatures is a further challenging topic. Moreover, it is very useful if the tool is extensible by the forensics investigator, in the sense that the security professional is allowed to refresh and update the list of RegExes for detecting malicious payloads manually. The examples in Sections 3.1 and 3.2 already illustrate the importance of proper URL encoding and require no further discussion here. These conclusions advocate the statement that TRR 1 up to TRR 10 are relevant and fundamentally important for proper WAFO.

Let us present a couple of interesting candidates for WAFO automated tools in the next Sections 4.2 and 4.3. With the tool requirements basis already specified, we classify the tools in general into proprietary and Open Source ones and describe an appropriate tuple of each accordingly.

4.2. Proprietary tools
Taking business-related Webapps as the decisive criterion, we first describe business-to-business implementations of WAFO automated tools. Current representatives of this class are: EnCase [L10], FTK [L11], Microsoft LogParser [L12], Splunk [L13] etc. Measured against the WAFO tool requirements, the author of the paper considers the following two favorites in this category.

Microsoft LogParser
This forensics tool was developed by Gabriele Giuseppini (http://nl.linkedin.com/in/gabrielegiuseppini). A brief history of MS LogParser is given in [L15], [L16]. The application can be obtained and used for free, see [L12], although according to [L14] Microsoft rather regards it as "skunkware" and is reluctant to give official support for it. The current version of the tool is LogParser 2.2, released in 2005. An unofficial support site for the tool can be found at www.logparser.com (unfortunately, at the time of writing this site appears to be down).

The parser comprises three main units: an input engine, a SQL-like query engine core and an output engine. A good illustration of the tool's structure is given in [L16]; see Appendix B, Figure 19. MS LogParser supports many independent input file formats: IIS log files (Netmon capture logs), event log files, text files (W3C, CSV, TSV, XML etc.), Windows Registry databases, SQL Server databases, MS ISA Server log files, MS Exchange log files, SMTP protocol log files, extended W3C log files (such as firewall log files) etc. A further capability is that it can search for specific files in the observed file system and for specific Active Directory objects. Furthermore, the input engine can combine payloads from the different input file formats, which allows consolidated parsing and data correlation, so TRR 1 and TRR 2 are satisfied. Acceptable input data types are INTEGER, STRING, TIMESTAMP, REAL and NULL,
which satisfies TRR 3. According to [L17], parsing of the input data is achieved in efficient time, another positive feature of the tool. Once the data is supplied to the core engine, the forensics examiner can parse it using SQL-like queries. By default this is done from a standard command-line console, explained in detail in [21]. Before illustrating this with an example, let us mention that there are unofficial front-ends providing more user-friendly GUIs, such as simpleLPview00 (http://www.logparser.com/simpleLPview00.zip). However, as the domain logparser.com seems to be down during the development phase of this paper, the author was not able to test the GUI front-end. For the interested reader, GUI versions of MS LogParser are not limited to that front-end: developers can extend the MS LogParser UI via COM objects, see [L15], which also enables the forensics professional to extend the tool's abilities by programming custom input-format plug-ins. Let us illustrate the MS LogParser syntax, see [L15]:

C:\Logs>logparser "SELECT * INTO EventLogsTable FROM System" -i:EVT -o:SQL -database:LogsDatabase -iCheckpoint:MyCheckpoint.lpc

This example represents a SQL-like query, where the input file format specified by -i concerns the MS event logs; the output format is SQL, which means the results are stored in a database and can be filtered further as appropriate. An important option is -iCheckpoint, which provides the ability to set a checkpoint on the log files and thus achieve incremental parsing of the observed log data; this increases efficiency when parsing large log files and satisfies, to some degree, TRR 4. The next example, see [L15]:

C:\>logparser "SELECT ComputerName, TimeGenerated AS LogonTime, STRCAT(STRCAT(EXTRACT_TOKEN(Strings, 1, '|'), '\'), EXTRACT_TOKEN(Strings, 0, '|')) AS Username FROM \\SERVER01\Security WHERE EventID IN (552; 528) AND EventCategoryName = 'Logon/Logoff'" -i:EVT

demonstrates a simple string manipulation, which could be extended by RegExes and satisfies TRR 5 and 7. Further interesting paradigms can be found in [15], [L15], [L16], [L17].

Another notable aspect of MS LogParser is its ability to execute automated tasks. One approach is to write batch jobs for the tool and create system scheduler entries for their automated execution, see [L14]. Furthermore, the examiner can drive MS LogParser from Windows scripting, as in [L17]; Appendix B, Figure 20 illustrates this. The standard implementation scenario is as follows, see [L17]:
  • register the LogParser.dll
  • create the LogParser object
  • define and configure the input format object
  • define and configure the output format object
  • specify the LogParser query
  • execute the query and obtain the payload
This brief introduction of MS LogParser demonstrates its power without a doubt. However, the tool should be considered appropriate only for MS Windows based
environments, such as .asp, .aspx and .mspx Web applications. An open question remains regarding the proper examination of Silverlight implementations. Another possible issue is the iCheckpoint mechanism that configures the incremental parsing jobs: locating the .lpc configuration file(s) could easily lead the intruder to the log files related to the forensics jobs, which could then be exploited straight away.

Splunk
This tool is developed and maintained by Splunk Inc. (http://www.splunk.com/). Its current stable release is 4.2.2 (2011). Although the professional version of the tool is highly priced, there is a test version limited to 30 days and to a bounded amount of parsed log files of up to 500 MB; the test version can be used for free. Furthermore, there is community support for Splunk in the form of a mailing list and a community wiki hosted on the Splunk Inc. domain. Official support regarding Splunk documentation, version releases and FAQ/case studies is presented on the tool's website, which requires a free registration. Another advantage of Splunk is the on-the-fly official/community IRC support. A further interesting feature are the video tutorials uploaded by users and official professionals, demonstrating specific usage scenarios and case studies.

The tool has wide OS support: Windows, Linux, Solaris, Mac OS, FreeBSD, AIX and HP-UX. Splunk must be considered a highly hardware-consuming application (see http://www.splunk.com/base/Documentation/latest/installation/SystemRequirements). It was tested on an Intel Pentium T7700 machine with 3 GB of RAM under Windows XP Professional SP3 and Ubuntu Linux 10.04 Lucid Lynx. In both cases the setup ran flawlessly with little additional installation effort on the user's side. After successful installation Splunk registers a new user on the host OS, which can be deactivated. The tool is a Python-based application. It starts a Web server, an OpenSSL server and an OpenLDAP server, which interact with the different parsers for input data. The configuration of the different Splunk elements is implemented via XML, which allows them to be adjusted in a user-friendly way. Splunk has even broader input format support than MS LogParser, which makes the tool not only OS independent but also an input-format all-rounder. An interesting combination of Splunk with Nagios is discussed in [L18]. A screenshot of the officially advertised features of the tool is given in Appendix B, Figure 21. These aspects relate to TRR 1, 2, 3, 4 and 5; TRR 7, 9 and 10 should be tested more extensively in particular.

User interaction with Splunk takes place via a common Web browser. The different Splunk elements are organized on a dashboard, which can be reordered and arranged in a user-friendly manner. Let us present the main Splunk units in more detail. Their description is based on [L19], which concerns Splunk version 3.2.6; although Splunk was completely rewritten after version 4.0, the main business-logic units remain. In general, the idea behind this tool is not only to parse different log file formats and support different network protocols, but also to index the parsed data. Thus the tool acts as a veritable search engine, comparable to those widely known on the Internet today, and allows the user to accomplish user-friendly and precise searches on specific criteria; indeed, the query responses from the tail dashboard are significantly fast. Intuitively, we designate the first Splunk
unit as the index engine; it supports SNMP and syslog as well. The second unit is the search core engine. One can use different search operators on specific criteria (Boolean, nested, quoted, wildcards), which respects, as already stated, TRR 5 and 7. The third unit is the alert engine, which to some degree satisfies TRR 9; notifications can be sent via RSS, e-mail, SNMP, or even particular Web hyperlinks. The fourth unit implements the reporting ability of Splunk, TRR 2 and 3: on a specially prepared dashboard the user or forensic examiner obtains not only detailed results on the parsed payload in text format, but also derived information as interactive charts and graphs and specifically formatted tables according to the auditing jobs. These are well illustrated in Appendix B, Figure 22. An interesting example describes the reporting ability of Splunk to detect JavaScript onerror entries by means of a user-developed JSON script, see [L22]. The fifth and last unit is the sharing engine of Splunk. It reflects the aim of collaborative work by the users of this tool, where know-how exchange is encouraged; another motivation for this unit is a distributed Splunk environment, in which more than a single instance of Splunk serves the specific network. Further abilities of the forensic tool worth mentioning are scaling with the observed network and security of the parsed data.

This last feature deserves a more detailed discussion. As with MS LogParser, an open question remains whether the tool itself is hardened enough, considering the fact that the large amount of payload data is not only indexed but also presented in a user-friendly fashion. As Splunk is without a doubt an interface to every log file and protocol on the observed network, this bonding point is a highly attractive target for compromise. If an attacker succeeds in this matter, he obtains every detail of the observed network presented in a user-friendly format, which relieves the intruder from collecting valuable payload data himself and minimizes his or her penetration efforts. As the Splunk front-end is rendered in a Web browser, the attentive reader will notice that CSRF [4] and CSFU [L20] are respectable candidates for such attacking scenarios, especially combined with DOM-based XSS attacks [20], [L21], which can trigger the malicious events in the browser engine. If such scenarios can be realized, Splunk could turn into a favorite jump-start platform for exploiting secured networks instead of being utilized as a proper forensic investigation tool. This designates an essential aspect of future work on WAFO; we do not extend the discussion further, as it goes beyond the boundaries of the present paper. Let us now introduce the selected Open Source WAFO tools, as announced above.
    • 4.Webapp Forensics toolstool is pyflag-0.87-pre1, 2008. The tool is hosted at SourceForge34 and as an Open Source App itcan be obtained for free under the GPL. A support site is considered to be www.pyflag.net. Thisdomain also stores the PyFlag Wiki with presentations of the tool and video tutorials. A differentadvantage is the predefined for examination forensics image also hosted on the support site. Thisimage can be employed for the purposes of training on forensic investigation.The general structure of the tool can be described as follows. The python App antes a Web Serverfor displaying the parsing output and further, the collected input data is stored in a MySQL server,which allows the tool to operate with large amount of log files code lines, respecting TRR 4. The IOSource engine designates the interface to the forensic images, which enable the tool to operate withlarge scale input file types, as Splunk.As the observed image is loaded by the Loader engine in the Virtual File System, different scannerscan be utilized for gaining the forensic relevant payload from the raw data. For the readerconcerned, please refer to [L26]. The main PyFlag data flow is illustrated in the next Figure 15:Figure 15: Main PyFlag data flow, as [L26]PyFlag is natively written to support Unix-like OSes. A Windows based port is currently presentedon the support Web site, PyFlagWindows35. This makes the tool OS independent as well. ThePyFlag developers state that, the tool is not only a forensic investigation tool, but rather a richdevelopment framework. The tool can be used in two modes: either as a python shell, calledPyFlash, or as a userfriendly Web-GUI. The installation process requires some user input; moredetailed, common installation routines like: unpacking the archive to a destination on the host OS,configuring the source via ./configure on Linux systems, checking for dependency issues andutilizing the make install, are demanded.The first start of the tool requires from the forensics investigator to configure the MySQLadministrative account and the Upload directory. This location is crucial for the forensics images,which should be observed. In general PyFlag represents: a Web Application forensic tool( logfiles) , Network forensic tool( capture images via pcap) and an OS forensic investigation tool. Aswe denoted in the introduction of the paper, we should only concentrate on the log files analysis byPyFlag, discerning its other features considering NFO and OSFO( Operating System Forensics).The authors of the tool encourage the forensics investigators to correlate the different evidence fromWAFO, NFO and OSFO, as it was proposed already before.34 http://sourceforge.net/35 http://www.pyflag.net/cgi-bin/moin.cgi/PyFlagWindows 35
In more detail, PyFlag supports a variety of different and independent input file formats, such as the IIS log format, Apache log files, iptables and syslog formats, respecting TRR 1, 2 and 3. The tool also supports different levels of format customization; e.g. Apache logs can be parsed with the default format or with one customized by the security professional. Let us explain this. Once the installation is completely set up, the user works in the browser-GUI PyFlag environment. For analyzing a specific log file, PyFlag offers presets, which are templates that allow parsing a collection of log files of a specific class, e.g. the IIS log file format; the preset controls the driver that parses the specific log appropriately. The standard routine for setting up an IIS log file analysis is described in [22] as follows:
  • select "create Log Preset" from the PyFlag "Log Analysis" menu
  • select the "pyflag_iis_standard_log" file to test the preset against
  • select "IIS" as the log driver and start the parsing
A more extensive introduction to the WAFO use of the tool was presented at Linux.conf.au 2008; please consider watching the presentation video [L23]. After the tool starts to collect payload data from the input source, the forensics investigator can either employ predefined queries and thus minimize the parsing time on the fly, or wait for the complete data collection. The data noise in the obtained collection can also be reduced via white-listing, as per TRR 7. Moreover, after the data is collected, the examiner can apply index searching via natural-language-like queries, comparable to Splunk; these features explain the efficient searching in PyFlag. Another interesting aspect of the tool is the integration of GeoIP (Apache, see http://www.maxmind.com/app/mod_geoip). It can either be obtained from the Debian repository, which offers a smaller GeoIP collection, or downloaded from the GeoIP website as a complete collection. GeoIP makes it possible to parse the IPs and timestamps and correlate them to the geographic origin of the GET/POST requests in the log file, which respects TRR 3. The tool can also store the collected evidence payload in output formats such as .csv, which explains its use as a front-end to other tools applied in the investigation. An illustration of the PyFlag Web GUI is given in Appendix B, Figure 23. To conclude the tool's description, we should mention once more the open question of a possible compromise of the Web GUI, as explained for Splunk. A well-known attack concerning HTTP pollution on ModSecurity (http://www.modsecurity.org/) was presented by Luca Carettoni (http://www.linkedin.com/in/lucacarettoni) in 2009, where the IDS is exploited by an XSS instead of an image upload to the system. As mentioned above, this advocates that the tool should be revised for such kinds of exploits and especially rechecked for possible DOM-based XSS exploits in its own source.

Apache-scalp or Scalp!
This tool should be considered an explicitly WAFO-oriented investigation tool. Scalp! is developed by Romain Gaucher and the project is hosted on code.google.com. Its current version is 0.4 rev. 28 (2008). The tool is the only one of those described above that definitely deploys RegExes. It is a Python script, which can be run in the Python console on the common OSes, making it OS independent. The tool is published under the Apache License 2.0 and is designed for parsing Apache log files in particular, which restricts its usability to this class of log files and does not respect TRR 1 and 2.
It has been tested only on log files of a couple of MiB, which furthermore does not respect TRR 4. The tool's developer states that a C++-based version of the tool may be released for more efficient parsing of the log data; this should be a main topic of Scalp!'s future work. For all that, a great advantage of the tool is its use of the PHP-IDS RegEx filter (https://phpids.org/), which is nowadays perhaps one of the most powerful implementations for detecting the fingerprints of Web attacking vectors. That is why choosing this tool is very reasonable: it is the only one that explicitly claims to parse log files against known Web 2.0 security culprits such as SQLIA, XSS, CSRF, LFI, DoS, spam etc. The usage of the tool is as straightforward as that of every other common command-line pen-testing tool. Let us illustrate some of the running modes of Scalp!, as per [L24]:
  • exhaustive mode: the tool tests the complete log file, or a log file snippet, and does not break on the first payload pattern,
  • tough mode: parses the log data by means of the PHP-IDS RegEx, which reduces false positives in the output, respecting TRR 5, 7 and 9 (it is not straightforward whether this completely satisfies TRR 6, TRR 8 and TRR 10),
  • period mode: valuable for parsing within a time-frame limitation on the supplied log file, which satisfies TRR 3,
  • sample mode: the tool parses only an a priori specified sample of the log file and ignores the rest of the input data, respecting TRR 7,
  • attack mode: a very important mode, in which the forensics investigator can parse the log file against known Web attacking vectors, which in these cases satisfies TRR 6.
The output formats of Scalp! are TEXT, XML and HTML, see Appendix B, Figure 24. Let us provide an example of the apache-scalp usage, as in [L24]:
$ ./scalp-0.4.py -l /var/log/httpd_log -f ./default_filter.xml -o ./scalp-output --html
We should emphasize here that the leading symbol $ is typical of user-driven Unix-like terminal command lines and is not part of the Scalp! syntax itself. The -l option specifies the Apache Web server log file, the -f parameter specifies the filter file to include, in this case an up-to-date PHP-IDS RegEx collection (https://dev.itratos.de/projects/php-ids/repository/raw/trunk/lib/IDS/default_filter.xml), the -o option specifies the output file and --html denotes its type, illustrated in Figure 24.

To conclude this paragraph, we point out that Scalp! is a specific WAFO tool, which some might even classify as limited, yet it is quite useful for parsing log files of the Apache Web server class against known Web 2.0 culprits. This tool could be an excellent addition to Splunk or PyFlag, as both of them can parse huge log files efficiently and prepare appropriate output for apache-scalp, so the forensics investigator can then parse it more specifically by means of the mighty PHP-IDS RegEx. This combination of tools is recommended for proper WAFO on the Apache Web server. Let us summarize once again the TRR completion of the presented tools, see Table 13, Appendix B.
5. Future work
The exposition of the paper has outlined several aspects that require future work. Let us summarize them in a more systematic way in this chapter, following a simple schema: conclude what has been done so far, state what has not been done, and suggest what else should be done in the future.

The complexity of current Web applications has been described above and is obvious; accordingly, the sophistication of current Web attacking scenarios has been described as well. The conclusion that best practices cannot be regarded as sufficient approaches for the proper utilization of Web application security or Webapp Forensics is reasonable, and it advocates that an adequate WAFO taxonomy is highly required. The proposals and schemata for classifying the different WAFO aspects, and WAFO in general, are unambiguously compact: they clearly relate to the topic, they respect the requirement of being fundamental and universally valid, and there are neither tautologies nor redundancies in their construction. Furthermore, a classification of the requirement rules for the proper choice of WAFO automated tools has also been given; the TRRs enumerated in Section 4.1 are fundamental and specify a compact collection, as required. For all that, it has not been demonstrated that the proposed WAFO taxonomy is complete; this aspect represents a great opportunity for future work in the field of WAFO in general.

Another important matter is the automation of Webapp Forensics. Once again, an assessment of whether these rules enumerate a complete group of sufficient requirements has not been presented and should not be considered settled at this stage with respect to completeness. TRR 6 and TRR 8 are essential and call for future work. As Table 13 in Appendix B shows, there is no deterministic conclusion whether the tools observed in this paper satisfy all of the rules mentioned above or not. We noted in the exposition of the paper that several modern Web attacks, and the attacking scenarios constructed from them, are in a debatable number of cases unfeasible to fingerprint as a stable, static, abstract construct; representatives of this group, once again, are CSRF and CSFU, especially combined with XSS attacks such as DOM-based XSS. This raises the question whether we will be able to describe every single logical culprit and pivot against it, as in TRR 6; in addition, the probable culprits (TRR 8) represent the next great challenge for security professionals.

In this line of thought, another issue that is a reasonable candidate for future WAFO work is the implementation of strong RegEx filters in WAFO tools. Of the WAFO scanners observed in this paper, only Scalp! exclusively uses a strong RegEx collection, provided and regularly maintained by the PHP-IDS project; for the other tools this cannot be confirmed. Of course, the forensics examiner can apply such filters as the particular case requires, though the approach taken by Scalp! is more reliable. In a word, the decision making after inspecting the obtained log-file data still belongs to the forensics investigator. As Table 13 illustrates concretely, there is currently no available automated tool that can utilize Application Flow Analysis on a particular observed RIA Web application.
If we accept that the intruder's environment is non-deterministic, then we must strive for a precise image of the victim environment, i.e. the Web application. This conclusion motivates the next task for future WAFO work: the evaluation of Application Flow Analysis, as already proposed in Section 2.3. Whether future implementations of fully automated WAFO tools capable of decision making are possible or not, the more important issue is to move away from "point-and-scan web application security" and instead consider an AFA: a complete function- and data-flow mapping of the observed victim environment, combined with consequent precise application scanning that properly detects the real-world culprits compromising modern Web applications.

Finally, let us specify the decision-making problem concerning WAFO automated tools mentioned above. As designated earlier, the forensics professional still has to perform the following substantial tasks for the proper application of WAFO tools: set up the tool, update the filtering rules, trace the forensic image with them, and make the appropriate decisions. This supports the statement that WAFO tools deliver a semi-automated forensic investigation, see the previous Chapter 4, Table 6. If we want to propose fully automated WAFO tool solutions, this means solving an assignment problem (an optimization problem). In more detail, the WAFO tool should be able to perform iterative tasks in the input-data filtering process; further, it should be able to reorder the up-to-date scanning filters so as to produce fewer false positives and better attack detection. In some cases filters have to be optimized and require restructuring, or even need to be specified again from scratch, to meet the TRRs completely. Moreover, as already stated, new attacking techniques establish themselves on the Web time and again, so the tools should in some cases be able to construct entirely new detection filters. For all that, the open questions remain: how should rules be marked as obsolete, and what should be the comparative criterion from which fresher filters are derived, if a fully automated execution without human interaction is to be implemented? An important criterion should be the evaluation of the impact of successfully accomplished Web attacking technique(s); on this basis, the tool should be able to readjust and renew the filtering collection employed in the automated part of the WAFO investigation.
6. Conclusion
The present term paper gives an overview of the complex topic of Web Application Forensics. The main goal is a systematic approach to Webapp Forensics. This is achieved by proposing categorizations that aim to preserve universal validity and remain fundamental, while being assessed with respect to compactness and completeness. As a consequence, the following particular tasks have been accomplished: the localization of WAFO within the construct describing Digital Forensics in general, and the designation of the WAFO security model, explained by defining the profiles of current Web application intruders and a classification of present Web attacking scenarios. Thus, a complete illustration of the intruder's environment is proposed. Furthermore, a description of the victim environment is given through the discussion of the aspects: WAFO deployment phases, WAFO general tasks, WAFO evidence taxonomy and WAFO players classification. The matters pertaining to the automation of the Web Application Forensics investigation are also deliberated, and a fundamental and compact collection of requirements related to automated tools and their proper implementation in the WAFO investigation is proposed. As stated in Chapter 5, the validation of this requirements list remains an open question.

The second objective, concerning modern aspects of Web Application Forensics, is covered by the discussion of Web application penetration-testing trends and their valuable application to WAFO in the last section of Chapter 2 and at the end of Chapter 4. A more practical approach is also presented, as illustrative WAFO paradigms and case studies are enumerated by examples. Thus the thesis of the paper is covered, at least at the present time of the term paper's development. The abstractions and categorizations in this paper remain challenging for the future, with respect to their redefinition and accuracy, and to their evaluation in terms of compactness and completeness.
Appendixes

Appendix A
Application Flow Analysis

Q/A TEAM                                   INFOSECURITY TEAM
  Functions known                            Functions unknown
  Application understood                     Application unknown
  Rely on functional specifications          Rely on crawlers + experience + luck
  Coverage known                             Coverage unknown
  Highlight key business logic               Highlight "found" functionality
Table 9: Functional vs. Security testing, Rafal Los [10]

EFD    Execution Flow Diagram – Functional paths through the application logic
ADM    Application Data Mapping – Mapping data requirements against functional paths
Table 10: Standards & Specifications of EFBs, Rafal Los [10]

  • Nodes represent application states
  • Edges represent different actions
  • Graph(s) of flows through the application
  • Paths between nodes represent state changes
  • A set of paths is a flow
Table 11: Basic EFD Concepts [10]

Execution Flow Action    Something that causes a change in state
                         A human, server or browser-driven event
Action Types             Direct
                         Supplemental
                         Indirect
Table 12: Definition of Execution Flow Action and Action Types, Rafal Los [10]
Figure 16: Improving the Testing process of Web Application Scanners, Rafal Los [10]
Figure 17: Flow based Threat Analysis, Example, Rafal Los [10]
WAFO victim environment preparedness
Figure 18: Forensics Readiness, in Jess Garcia [13]
Appendix B
Proprietary WAFO tools

MS LogParser
Figure 19: MS LogParser general flow, as [L16]
Figure 20: LogParser-scripting example, as [L17]
Splunk
Figure 21: Splunk licenses features
Figure 22: Splunk, Windows Management Instrumentation and MSA (ISA) queries, at WWW
Open Source WAFO tools

PyFlag
Figure 23: PyFlag – load preset and log file output, at WWW

apache-scalp or Scalp!
Figure 24: apache-scalp or Scalp! log file output (XSS query), as [L25]
Results of the tools comparison
TRR completion on the presented tools:

TRR    MS LogParser    Splunk    PyFlag    Scalp!
 1        + (42)        + (43)    + (44)    X (45)
 2        +             +         +         X
 3        +             +         +         +
 4        +             +         +         X
 5        +             +         +         +
 6        ?             +         +         +
 7        ?             ?         ?         +
 8        ?             ?         ?         ?
 9        ?             +         ?         +
10        ?             +         ?         ?
Table 13: TRR completion on LogParser, Splunk, PyFlag, Scalp!

Legend:
  +  denotes: is definitely present
  X  denotes: is not present
  ?  denotes: is not explicitly officially stated, which requires future research to confirm these aspects (features)

Notes:
  (42) Supported input formats: IIS log files (Netmon capture logs), event log files, text files (W3C, CSV, TSV, XML etc.), Windows Registry databases, SQL Server databases, MS ISA Server log files, MS Exchange log files, SMTP protocol log files, extended W3C log files (such as firewall log files) etc.
  (43) Supported input formats: all possible WAFO-related input data types
  (44) Supported input formats: all possible WAFO-related input data types
  (45) Supported input formats: Apache log files
List of links
L1  LayerOne 2006, Andrew Immerman, Digital Forensics: http://www.youtube.com/watch?v=N8whBp2cp6A
L2  NIST Colloquium Series, Digital Forensics: http://www.youtube.com/watch?v=9DKJ6gP5lJY
L3  Cloud Computing: http://csrc.nist.gov/groups/SNS/cloud-computing/
L4  MonkeyFist Launches Dynamic CSRF Web Attacks: http://www.darkreading.com/insider-threat/167801100/security/application-security/218900214/index.html
L5  Hacker Hits British Navy Website with SQL Injection Attack: http://www.whitehatsec.com/home/news/10newsarchives/110810eweek.html
L6  LizaMoon Mass SQLIA: http://www.eweek.com/c/a/Security/LizaMoon-Mass-SQL-Injection-Attack-Points-to-Rogue-AV-Site-852537/
L7  Rafal Los, Into the Rabbit Hole: Execution Flow-based Web Application Testing: http://www.youtube.com/watch?v=JJ_DdgRlmb4&feature=related
L8  Jeremiah Grossman, Our infrastructure -- Assessing Over 2,000 websites: http://jeremiahgrossman.blogspot.com/2010/09/our-infrastructure-assessing-over-2000.html
L9  Robert Hansen, Web Server Log Forensics App Wanted: http://ha.ckers.org/blog/20100613/web-server-log-forensics-app-wanted/
L10 EnCase: http://www.guidancesoftware.com/
L11 FTK: http://accessdata.com/products/forensic-investigation/ftk
L12 Microsoft LogParser: http://www.microsoft.com/download/en/details.aspx?displaylang=en&id=24659
L13 Splunk: http://www.splunk.com
L14 Steve Bunting, Log Parser (Microsoft), The "Swiss Army Knife" for Intrusion Investigators and Computer Forensics Examiners: http://www.stevebunting.org/udpd4n6/forensics/logparser.htm
L15 Professor Windows - May 2005, Gabriele Giuseppini, How Log Parser 2.2 Works: http://technet.microsoft.com/en-us/library/bb878032.aspx
L16 Marc Grote, Using the Logparser Utility to Analyze Exchange/IIS Logs: http://www.msexchange.org/tutorials/using-logparser-utility-analyze-exchangeiis-logs.html?printversion
L17 The Scripting Guys, Tales from the Script, January 2005: http://www.microsoft.com/germany/technet/datenbank/articles/600634.mspx
L18 Splunk: Next time your beeper goes off, turn to Splunk: http://www.nagios.org/products/enterprisesolutions/splunk
L19 Bill Varhol, What the Splunk?: http://www.ethicalhacker.net/content/view/206/2/
L20 Petko D. Petkov, Cross-site File Upload Attacks: http://www.gnucitizen.org/blog/cross-site-file-upload-attacks/
L21 Mario Heiderich, DOMXSS - Angriffe aus dem Nirgendwo: http://it-republik.de/php/artikel/DOMXSS---Angriffe-aus-dem-Nirgendwo-3565.html?print=0
L22 Carl Yestrau, JavaScript Error Logging with Splunk: http://www.splunk.com/view/SP-CAAACJK
L23 Gavin Jackson, Incident Response using PyFlag - the Forensic and Log Analysis GUI: http://mirror.linux.org.au/linux.conf.au/2008/Thu/mel8-099a.ogg
L24 Romain Gaucher, Apache-scalp: How it works: http://code.google.com/p/apache-scalp/
L25 Scalp: Logfile-Analyzer findet Web-Angriffe: http://www.linux-magazin.de/NEWS/Scalp-Logfile-Analyzer-findet-Web-Angriffe
L26 PyFlag manual: http://pyflag.sourceforge.net/Documentation/manual/index.html
Table 14: List of links
Bibliography
[1] Jess Garcia, Web Forensics, 2006
[2] Dominik Birk, Forensic Identification and Validation of Computational Structures in Distributed Environments, 2010
[3] Robert Hansen, Detecting Malice, 2009
[5] Anoop Singhal, Murat Gunestas, Duminda Wijesekara, Forensics Web Services (FWS), 2010
[4] Krassen Deltchev, New Web 2.0 Attacks, 2010
[6] Kevin Miller, Fight crime. Unravel incidents... one byte at a time., 2003
[7] BSI, Leitfaden "IT-Forensik", 2010
[8] Ory Segal, Web Application Forensics: The Uncharted Territory, 2002
[20] Shreeraj Shah, Hacking Browsers' DOM - Exploiting Ajax and RIA, 2010
[9] Joe McCray, Advanced SQL Injection, 2009
[10] Rafal Los, Into the Rabbit Hole: Execution Flow-based Web Application Testing, 2010
[11] Larry Suto, Analyzing the Effectiveness and Coverage of Web Application Security Scanners, 2009
[12] Felix C. Freiling, Bastian Schwittay, A Common Process Model for Incident Response and Computer Forensics, 2007
[13] Jess Garcia, Proactive & Reactive Forensics, 2005
[14] Lance Müller, User Panel: Forensics & Incident Response: It's important to have options!, 2008
[15] Rohyt Belani, Chuck Willis, Web Application Incident Response & Forensics: A Whole New Ball Game!, 2007
[16] Edgar Weippl, Database Forensics, 2009
[17] Harry Parsonage, Web Browser Session Restore Forensics, 2010
[18] William L. Farwell, Email Forensics, 2002
[19] Thomas Akin, WebMail Forensics, 2003
[21] Gabriele Giuseppini et al., Microsoft Log Parser Toolkit: A Complete Toolkit for Microsoft's Undocumented Log Analysis Tool, 2005
[22] Dr. Michael Cohen et al., PyFlag, Forensic and Log Analysis GUI, 2006