
International Journal of Engineering Research and Applications (IJERA) is an open access online peer reviewed international journal that publishes research and review articles in the fields of Computer Science, Neural Networks, Electrical Engineering, Software Engineering, Information Technology, Mechanical Engineering, Chemical Engineering, Plastic Engineering, Food Technology, Textile Engineering, Nano Technology & science, Power Electronics, Electronics & Communication Engineering, Computational mathematics, Image processing, Civil Engineering, Structural Engineering, Environmental Engineering, VLSI Testing & Low Power VLSI Design etc.

Published in: Technology, Business


Bijayalaxmi Purohit, Pawan Prakash Singh / International Journal of Engineering Research and Applications (IJERA), ISSN: 2248-9622, www.ijera.com, Vol. 3, Issue 3, May-Jun 2013, pp. 1311-1316

Data leakage analysis on cloud computing

Bijayalaxmi Purohit, Pawan Prakash Singh
(Department of Computer Science Engineering, Suresh Gyan Vihar University, Jaipur)

ABSTRACT
Cloud describes the use of a collection of services, applications, information, and infrastructure. It is like a pool of resources and services, such as computation, network, and information storage, available in a pay-as-you-go manner. This paper mainly focuses on the major security concerns about cloud computing. The major areas of focus are information protection, virtual desktop security, network security, and virtual security. In today's business world, many organizations use information systems to manage their sensitive and business-critical information. The need to protect such a key component of the organization cannot be overemphasized. Data Loss/Leakage Prevention (DLP) has been found to be one of the effective ways of preventing data loss. DLP solutions detect and prevent attempts, whether intentional or unintentional, by people who are authorized to access sensitive information to copy or send that data without authorization. DLP is designed to detect potential data breach incidents in a timely manner, and it does so by monitoring data.

Keywords – Data leakage analysis in cloud computing; Data leakage prevention in cloud computing; Checking sensitivity of data; Data security in cloud

I. INTRODUCTION
Data loss means a loss of data that occurs on any device that stores data. It is a problem for anyone who uses a computer. Data loss happens when data is physically or logically removed from the organization, either intentionally or unintentionally.
Data loss has become one of the biggest problems in organizations today, and organizations are responsible for overcoming it. Data leakage is an incident in which the confidentiality of information has been compromised. It refers to an unauthorized transmission of data from within an organization to an external destination. The data that is leaked is typically private in nature and deemed confidential, whereas data loss is the loss of data due to deletion, a system crash, etc. Together, both terms can be referred to as a data breach, which is one of the biggest fears that organizations face today.

Data Loss/Leakage Prevention (DLP) is a computer security term for systems that identify, monitor, and protect data in use, data in motion, and data at rest[1]. DLP identifies sensitive content by using deep content analysis to peer inside files and network communications. DLP is mainly designed to protect information assets with minimal interference in business processes. It also enforces protective controls to prevent unwanted incidents. DLP can also be used to reduce risk, improve data management practices, and even lower compliance costs. Such systems are designed to detect and prevent unauthorized use and transmission of confidential information. Vendors also refer to the term as Data Leak Prevention, Information Leak Detection and Prevention (ILDP), Information Leak Prevention (ILP), Content Monitoring and Filtering (CMF), Information Protection and Control (IPC), or Extrusion Prevention System, by analogy to intrusion-prevention system[1].

In this paper, we deal with both terms, data loss and data leakage, in analyzing how DLP technology helps in minimizing the data loss/leakage problem. The study is performed as case research on DLP technology from an organizational perspective.

II. BACKGROUND
Research Approach
There are three kinds of research approaches in scientific research: quantitative research, qualitative research, and mixed research. Different researchers give different definitions of quantitative research, qualitative research, and mixed research. Here are some of them.

Quantitative Approach: Quantitative research involves strategies of inquiry such as experiments and surveys, and collects data on predetermined instruments that yield statistical data. Generally, quantitative research aims at explanation, which primarily answers the question "why?" Quantitative data collection is based on precise measurement using structured and validated data collection instruments such as closed-ended items, behavioral responses, and rating scales.

In addition, quantitative research is defined as social research that employs empirical methods and empirical statements. The author states that an empirical statement is a descriptive statement about "what is the case in the real world" rather than "what ought to be the case"[2].
Therefore, quantitative research is essentially about collecting numerical data to explain a particular phenomenon; particular questions seem immediately suited to being answered using quantitative methods.

Qualitative Approach: Qualitative research involves strategies of inquiry such as narratives, phenomenologies, ethnographies, grounded theory studies, or case studies.

Generally, qualitative research is a type of scientific research that aims at understanding, which primarily answers the question "how?" Qualitative data collection is based on in-depth interviews, participant observation, field notes, and open-ended questions. Here the researcher is the primary data collection instrument.

"Participant observation [for collecting data on naturally occurring behaviors in their usual contexts], in-depth interviews [for collecting data on individual perspectives and experiences], and focus groups [also called group interviews, effective in eliciting data on cultural groups]" are some kinds of qualitative research methods[3].

Mixed Approach: Mixed research involves the mixing of quantitative and qualitative methods. The mixed approach involves strategies of inquiry such as collecting data either simultaneously or sequentially to best understand the research problem. The data collection involves gathering both numeric and textual information. The study begins with a broad survey and then focuses on qualitative, open-ended interviews to collect detailed views from participants.
There are three ways of mixing the data: merging the data, connecting the data, and embedding the data. It is not enough to simply collect and analyze the quantitative and qualitative data; they need to be mixed together so that they form a more complete picture of the problem than they do when standing alone. From the above details, we believe our research follows the qualitative approach. The research therefore does not require the statistical analysis that the quantitative approach suggests. The purpose of this research is to gain a detailed understanding of how DLP technology minimizes the data loss problem in the organization[4].

Research Strategy
Generally, a research strategy is a way of collecting and analyzing empirical evidence by following some logic. A research design is the logic that links the data to be collected and the conclusions to be drawn to the initial questions of the study; it ensures coherence. There are five major research strategies: experiments, research surveys, archival analysis, histories, and case studies. Each strategy has its own strengths and weaknesses and can be utilized for all three research purposes: exploratory, descriptive, and explanatory. Case study research involves the study of an issue explored through one or more cases within a bounded system. The author also states that it is a qualitative approach in which the investigator explores a case in detail through in-depth data collection involving multiple sources of information, and reports a case description and case-based themes[5]. The intent of case analysis exists in three variations: the single instrumental case study, the multiple (collective) case study, and the intrinsic case study[5]. In a single instrumental case study, the researcher focuses on an issue and then selects one bounded case to illustrate the issue. In a collective case study, one issue is again selected, but the inquirer selects multiple case studies to illustrate the issue.
The intrinsic case study focuses on the case itself because the case presents an unusual and unique situation.

This research is therefore designed in the form of a case study, a single instrumental case study to be more definite. The research focuses on a phenomenon, namely "How does DLP technology help in minimizing the data loss/leakage problem in conjunction with previously used technologies in the organization?", and it was examined in three different ways: the product, the people, and the process[6].

Data Collection Method
Generally, qualitative research emphasizes the human factor in order to understand people's behavior, knowledge, attitudes, and fears. Qualitative research involves qualitative data that are obtained through methods such as surveys or interviews, on-site observations, and focus groups. Data are "the empirical evidence or information one gathers carefully according to rules or procedures"[7]. We also found that the aim of a data collection strategy is to obtain answers from different sources; this lets the researcher describe, compare, and relate one characteristic to another and demonstrate that certain features exist in certain categories.

A case study is a qualitative approach in which the investigator explores a case in detail through in-depth data collection involving multiple sources of information (such as observation, interviews, documents, and audiovisual materials) and reports a case description and case-based themes. Basically, there are two types of data collection methods: primary and secondary[7].

Primary data collection: This uses three different types of strategies: interviewing, questioning, and observation. It is the most substantial method in all qualitative inquiry. It is first-hand information collected through various methods such as observation, interviewing, mailing, etc.

Secondary data collection: This is data that has been collected and processed by other researchers for purposes different from the one it is used for here. It is a very common practice for companies and organizations to collect, process, utilize, and store data in support of their operations. Secondary data are mostly collected from sources such as magazines, newspapers, TV, the internet, reviews, and research articles.

For this research, interviews, observations, documents, and reports have been extensively used as forms of data collection. The main data are captured from the company's internal knowledge base (real-time or empirical data), as one of our researchers is working for the organization on a Data Loss Prevention project and the other researcher conducted interviews to gather the project details with respect to the thesis outline. Both closed and open-ended questions were used during the interviews, and the interviews were conducted by email. Along with this, security journals and DLP books such as Data Leak Prevention (ISACA) were used in collecting the data.

Classification of Information Leakage
In [8], the author classifies information leakage into three levels: a leak of a document containing confidential data can be classified as unintentional, intentional, or malicious[8].

Unintentional Leak:
1. Attach document
2. Zip and send
3. Copy & paste
Unintentional leakage normally occurs when a user mistakenly sends confidential data or information to a third party or the wrong recipient. This is done without any personal intention. For instance, an employee mistakenly sends an email with an attached document that contains confidential data to the wrong person or to a vendor.

Intentional Leak:
Intentional leakage normally occurs when a user tries to send a confidential document without being aware of company policy and finally sends it anyhow.
This usually happens when a user bypasses the security rules, regulations, or devices without trying to gain personal benefit. For instance, an employee renames a document folder and partially copies the data from it.

Intentional Leak:
1. Document rename
2. Document type change
3. Partial data copy
4. Remove keyword

Malicious Leak:
Malicious leakage is usually caused when a user deliberately tries to sneak confidential data past the security rules, policies, or product.

Malicious Leak:
1. Character encoding
2. Print screen
3. Password protected
4. Self-extracting archive
5. Hide data
For instance, an employee sneaks confidential data out of an enterprise system, sends it through email, and may even introduce a vulnerability into the system.

III. DATA LEAKAGE PREVENTION USING MYDLP
MyDLP is open source, all-in-one data loss prevention software that runs with multi-site configurations on network servers and endpoint computers. The MyDLP development project has made its source code available under the terms of the GNU General Public License. MyDLP is one of the first free software projects for data loss prevention.

MyDLP allows you to monitor, inspect, and prevent all outgoing confidential data without the hassle. With painless deployment and configuration, an easy-to-use policy interface, and great performance, IT administrators and security officers are able to combat data leakage.

With MyDLP you can:
1. Block or quarantine outgoing confidential data from your organization's network via mail and web. Archive suspicious files.
2. Monitor removable device usage in your organization and block or quarantine confidential files copied onto devices such as USB memory sticks or smartphones.
3. Block or quarantine print jobs that contain confidential information.
4. Discover confidential data on network storage, databases, workstations, and laptops in your organization.
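As an illustration of the three-level classification of information leakage given in Section II, the leak techniques listed there can be held in a small lookup table. This is only a sketch under stated assumptions: the names `LeakLevel`, `TECHNIQUES`, and `classify` are hypothetical and do not come from MyDLP or from [8]; only the technique names and their levels are taken from the lists above.

```python
from enum import Enum
from typing import Optional


class LeakLevel(Enum):
    """The three leak levels described in the classification above."""
    UNINTENTIONAL = "unintentional"
    INTENTIONAL = "intentional"
    MALICIOUS = "malicious"


# Technique names copied from the lists in the text; the table itself
# is an illustrative assumption, not part of any DLP product.
TECHNIQUES = {
    "attach document": LeakLevel.UNINTENTIONAL,
    "zip and send": LeakLevel.UNINTENTIONAL,
    "copy & paste": LeakLevel.UNINTENTIONAL,
    "document rename": LeakLevel.INTENTIONAL,
    "document type change": LeakLevel.INTENTIONAL,
    "partial data copy": LeakLevel.INTENTIONAL,
    "remove keyword": LeakLevel.INTENTIONAL,
    "character encoding": LeakLevel.MALICIOUS,
    "print screen": LeakLevel.MALICIOUS,
    "password protected": LeakLevel.MALICIOUS,
    "self-extracting archive": LeakLevel.MALICIOUS,
    "hide data": LeakLevel.MALICIOUS,
}


def classify(technique: str) -> Optional[LeakLevel]:
    """Return the leak level for a technique name, or None if it is unknown."""
    return TECHNIQUES.get(technique.strip().lower())
```

A DLP analyst could use such a table to tag an observed incident with a level before deciding how strictly to respond; unknown techniques return `None` rather than a guessed level.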
Figure 1. End points in cloud computing.

I. EXPERIMENT AND RESULT
Figure 2. Dashboard of MyDLP.
Figure 2 is an excerpt of the dashboard of the deployed DLP suite. The dashboard is a graphical user interface that provides a complete snapshot of the DLP. It categorizes incidents by Network, Endpoint, and Datacenter. It also shows the number of events or incidents generated and their status. The number of incidents triggered can be seen in the dashboard, broken down by both the type of policy and the content blade. It is a very user-friendly environment, with a lot of information available on a mouse click.

Figure 3. DLP logs.
The admin console has different options for performing admin activities:
  • Status and Overview tab: shows the types of devices and their status (active or inactive).
  • Users and Groups tab: provides the functionality to create, delete, and/or modify roles and user access to DLP.
  • Network, Endpoint, and Datacenter tabs: system status is displayed based on device type (Sensor, ICAP server, Grid Worker, etc.).
  • Notification: automatic alerts are set for whenever a device or feature fails to perform its job, i.e. an email alert is sent when any of the devices or services are hit or stop working.
  • Settings: provides the functionality to set various thresholds.
  • Support: opens up a knowledge base for quick help. This saves a lot of labor, since in an organization with a very large deployment it is a tedious and difficult job to keep track of all the devices and services.

Content Analysis and Policy Application
All three DLP products make use of content analysis (detection of sensitive content in documents or messages) and application of policy (a specification of how to handle sensitive documents or messages)[9].

Content Blades
Content blades are highly accurate pattern-matching detectors of sensitive content.
DLP supports two kinds of content blades:
1. Described-content blades are detailed descriptions of sensitive content, and may contain terms, regular expressions, programmatic entities, and other factors to accurately detect classes of sensitive content such as Social Security Numbers. Approximately 150 pre-defined "expert" content blades are available for immediate use in the DLP product, and they can be customized, or other content blades unique to the organization can be created.
2. Fingerprinted-content blades (or "fingerprints") are mathematical descriptors of individual sensitive documents or fragments of documents. They will "match" any copies of those documents or fragments found anywhere in the organization.
Fingerprints of known sensitive documents are created and then used to ensure that unauthorized copies of the documents are not being used.

The DLP products use content blades to perform content analysis on intercepted messages, stored files, and files being manipulated by users. Each document or message is assigned a score, or risk factor, depending on how strongly it matches a content blade[10].
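The two kinds of content blades and the risk factor described above can be illustrated with a minimal Python sketch. It does not show how MyDLP or any other DLP product implements content blades: the Social Security Number regular expression, the word-shingle hashing scheme, and the names `described_score`, `fingerprints`, and `fingerprint_score` are all assumptions chosen for illustration.

```python
import hashlib
import re

# Hypothetical described-content blade: a pattern for US Social Security
# Numbers written as ddd-dd-dddd. Real blades combine many more factors.
SSN_PATTERN = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")


def described_score(text: str) -> int:
    """Risk factor for the described blade: the number of SSN-like matches."""
    return len(SSN_PATTERN.findall(text))


def fingerprints(text: str, size: int = 5) -> set:
    """Hash every run of `size` consecutive words (simple word shingling)."""
    words = re.findall(r"\w+", text.lower())
    return {
        hashlib.sha256(" ".join(words[i:i + size]).encode("utf-8")).hexdigest()
        for i in range(max(len(words) - size + 1, 0))
    }


def fingerprint_score(known: set, candidate_text: str) -> float:
    """Fraction of a known document's fingerprints found in the candidate."""
    if not known:
        return 0.0
    return len(known & fingerprints(candidate_text)) / len(known)


# Fingerprint a known sensitive document once, then score outgoing text.
SENSITIVE = "the quarterly revenue forecast is strictly confidential until release"
KNOWN = fingerprints(SENSITIVE)
```

For example, `fingerprint_score(KNOWN, message)` is higher the more of the sensitive document a message reproduces, which matches the idea of a score that depends on how strongly the content matches a blade. Hashing word runs rather than whole files is a design choice that lets partial copies be detected as well.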
Policies
Figure 4. DLP policies.
Policies are sets of rules that specify when to create an event (a record that a sensitive document or message has been detected) and how to act on, or remediate, that event. A policy can base its decision on the results of content analysis (the risk factor, or severity, of the analyzed content) and on non-content-based factors such as the identity of the message sender or the destination of the user action. Figure 4 gives an excerpt of policies in DLP[11].

Approximately 150 pre-defined "expert" policy templates are available for immediate use in the product; however, new policies can be created and existing policies customized as per organizational requirements.

Incidents: When a sufficient number of events occur, DLP creates incidents that a security officer can evaluate, taking appropriate steps to manually remediate the security issues that they represent.

A dedicated workflow is followed to analyze the root cause and carry out the remediation process. A dedicated team working on security incidents handles the workflow. A watch list is maintained and a vigilant eye is kept on all the users; if security incidents are repeated, appropriate action is taken involving other departments such as legal and compliance.

The figure below shows the high-level workflow of the Critical Incident Response Center team. The figure below gives an understanding of the types of incidents or events, the date and time they occurred, the severity level, sender or owner details, the protocol used, and the exact file name or information, along with details of the type of policy violated.

From the GUI, incidents can also be differentiated based on type (Network, Datacenter, and Endpoint). As required, they can be filtered by date range (day, week, or month).

Figure 5. Blocking mail containing confidential data.
The above figure is an example of how we can prevent data leakage in cloud computing. Here we can observe how a confidential mail was blocked when an authorized user tried to send it to another endpoint, because the authorized person has not been granted permission to send confidential data. This is a clear example of checking the sensitivity of data in cloud computing.

IV. CONCLUSION
In this paper, we analyze data leakage prevention and why it can balance data security and user convenience. We also suggest a way to prevent data leakage by using MyDLP technology, and we have shown the implementation details of this technology in our experiment section. It is also relatively easy to implement DLP technology that deals with cloud security[12]. Our future work will focus on data leakage analysis in cloud computing using a virtual cloud network.

V. ACKNOWLEDGEMENTS
We are grateful to Mrs. Savita Shiwani for her unprecedented guidance and suggestions for this paper.

REFERENCES
[1] Ma Jun, Wang Zhiying, Ren Jiangchun, Wu Jiangjiang, Cheng Yong and Mei Songzhu, "The Application of Chinese Wall Policy in Data Leakage Prevention," in International Conference on Communication Systems and Network Technologies, 2012.
[2] Charles Perez, Babiga Birregah, Marc Lemercier, "The Multi-layer imbrication for data leakage prevention from mobile devices," in IEEE 11th International Conference on Trust, Security and Privacy in Computing and Communications, 2012.
[3] Zhang Xiaosong, Liu Fei, Chen Ting, Li Hua, "Research and Application of the Transparent Data Encryption in Intranet Data Leakage Prevention," in International Conference on Computational Intelligence and Security, 2009.
[4] Janusz Marecki, Mudhakar Srivatsa, Pradeep Varakantham, "A Decision Theoretic Approach to Data Leakage Prevention," in IEEE International Conference on Social Computing / IEEE International Conference on Privacy, Security, Risk and Trust.
[5] M. Srivatsa, P. Rohatgi, S. Balfe, and S. Reidt, "Securing information flows: A metadata framework," in Proceedings of 1st IEEE Workshop on Quality of Information for Sensor Networks (QoISN), 2008.
[6] D. Roberts, G. Lock, and D. Verma, "Holistan: A Futuristic Scenario for International Coalition Operations," in Proceedings of Fourth International Conference on Knowledge Systems for Coalition Operations (KSCO), 2007.
[7] Zhao Yong, Liu Jiqian, Han Zhen, Shen Changxiang, "The Application of Information Leakage Defendable Model in Enterprise Intranet," Journal of Computer Research and Development, 44(5), pp. 761-767, 2007.
[8] Wang Lei, Zhuang Yi, Pan Long-ping, "Design and implementation of file watching system based on mandatory access control," Computer Application, Vol. 26, No. 12, Dec. 2006.
[9] Lei Zheng, Zhao-feng Ma, Ming Gu, "Techniques of File System Filter Driver-based and Security-enhanced Encryption System," Journal of Chinese Computer Systems, Vol. 28, No. 7, July 2007.
[10] Shufen Liu, Zhagxiang Zhang, Yaorui Cui, Lin tao Wu, "A New Information Leakage Defendable Model," in 9th International Conference on Computer-Aided Industrial Design and Conceptual Design (CAID/CD 2008), pp. 109-112, Nov. 2008.
[11] Microsoft Corporation, "Using Encrypting File System," published November 2005.
[12] Ulf T. Mattsson, CTO Protegrity, "A Practical Implementation of Transparent Encryption and Separation of Duties in Enterprise Databases, Protection against External and Internal Attacks on Databases," in Seventh IEEE International Conference on E-Commerce Technology (CEC 2005), pp. 559-565, 19-22 July 2005.