Advanced Computing: An International Journal ( ACIJ ), Vol.3, No.4, July 2012
DOI : 10.5121/acij.2012.3403
‘CodeAliker’ - Plagiarism Detection on the Cloud
Nitish Upreti1 and Rishi Kumar2
1 Department of Computer Science and Engineering, AMITY University, Noida, India.
nitishupreti@gmail.com
2 Department of Computer Science and Engineering, AMITY University, Noida, India.
rishikumar182000@gmail.com
Abstract
Plagiarism is a pressing problem that academics face at every level of the educational system.
With the advent of digital content, the challenge of ensuring the integrity of academic work has
been amplified. This paper discusses a precise definition of plagiarized computer code, the
various solutions available for detecting plagiarism, and the building of a cloud platform for
plagiarism detection.
‘CodeAliker’, the application we developed, automates the submission of assignments and the
associated review process for essay text as well as computer code. It has been made available
under the GNU General Public License as Free and Open Source Software.
Keywords
Plagiarism, String matching, Cloud Computing
1. Introduction
An insightful look into the state of academic integrity and its implications gives us the major
motivation for pursuing the subject. The issue holds utmost significance, as the intellectual
standards of an individual pursuing academia are established around the ability to produce
authoritative work. Plagiarism is thus lethal. Every year a large number of students and scholars
submit a huge volume of material to their respective mentors and professors. Due to the sheer
amount of text involved, manual scrutiny is infeasible. Analyzing the situation, we found no
existing work in the public domain that solved the problem faced by educational institutes
worldwide. Most of the alternatives were either closed source or catered to only a fraction of the
entire problem. At the outset, we explore the sensitive aspect of classifying documents as
‘authentic’ or ‘plagiarized’. We then analyze numerous approaches to plagiarism detection, and
advance to our chief goal of implementing an engine and leveraging the cloud platform for
scalable and robust plagiarism detection. Alex Aiken’s MOSS [1] is chosen as the key approach
for building the application. Results and conclusion follow, where we present our observations
and learning.
2. Classification of Text
Broadly, the text submitted to such a system can be either an essay, that is, plain text in a
natural language, or computer code in any popular language such as C, C++, Java or Ruby. It is
comparatively easy to figure out whether an essay text has been plagiarized; source code
copying, however, is a delicate issue, with only a fine line drawn between ‘code reuse’,
‘collaboration’, ‘non-citation’ and ‘plagiarism’. With learning themes such as ‘Code Reuse’ in
an OOP system and the ‘Don’t re-invent the wheel’ coding philosophy, the distinction is even
more blurred. With hardly any definition in place, designing a system capable of accurately
identifying copying instances is infeasible. Hence concrete definitions need to be in place.
Little work has been done on the topic; the only concrete input comes from the work of Georgina
Cosma and Mike Joy [2]. Their work follows a survey-based approach among U.K. academics for
finding the right answers and opinions. However, for implementing a practical solution we need
to have our own precise judgment on the problem rather than a crude hypothesis.
A submitted code assignment consists of comments and the actual source code. Comments, which
occur in almost all sophisticated code bases, could be a major potential resource for identifying
plagiarism instances. Nonetheless, they present a major pitfall and could lead to
suspicious-looking false positives, as copyright statements occur frequently as comments. For the
purpose of CodeAliker we choose to strip off comments so as to avoid any such issues. Moving
forward, we address a multitude of other questions. Is using an external library or API an
instance of plagiarism? For most of the cases we found that library use without citation is
legitimate, as libraries are a central part of any sophisticated computer program. CodeAliker
thus filters out import statements and library includes. An intuitive User Interface design can
also be part of a submission; code for the design is put under scrutiny by CodeAliker, but looking
at the design manually is much more effective than plain UI code checking.
3. Approaches to Plagiarism Detection
Various approaches to plagiarism detection exist, and their performance and speed vary to a
great extent. Also, certain plagiarism detection schemes are more suitable for data sets with a
specific structure and nature. A rich taxonomy can be summarized in the diagram below.
Figure 3.1: Plagiarism Detection Approaches (Web Scraping Based; Local Database Based, divided
into Structured and Non-Structured; Non-Structured divided into Fingerprinting Techniques, String
Matching Techniques and Parameterized Matching Techniques)
Web scraping based approaches use the World Wide Web to check for plagiarism instances
against a large corpus of data. The scope of web scraping is huge and a lot of published work
exists on such systems. Our focus for this research is on systems based on a local database
compiled from assignments submitted by students taking the classes and from past years'
submissions.
Local database based approaches can be either Structured or Non-Structured. The Structured
approach creates a graph model of the information in the document. This approach is used mostly
with code-based assignments.
Non-Structured techniques are the most popular ones and are useful on a wide variety of text
material. They are classified based on the algorithm used: Document Fingerprinting, String
Matching and Parameterized Matching are the popular ones [3].
Tools based on the fingerprint approach work by creating “fingerprints” for each file, which
contain statistical information about the file, such as the average number of terms per line, the
number of unique terms, and the number of keywords [4]. The DUP tool [5] is based on a
parameterized matching algorithm, which detects identical and near-duplicate sections of source
code by matching source-code sections whose identifiers have been substituted (renamed)
systematically [3].
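The idea of matching under systematic renaming can be illustrated with a small Python sketch. It is not the algorithm used by DUP [5]; it only shows the underlying intuition, assuming a simple regular-expression tokenizer and a tiny, hypothetical keyword list. Each identifier is replaced by a placeholder numbered by its first appearance, so two fragments that differ only by a consistent renaming produce the same token sequence.

```python
import re

# Hypothetical, deliberately tiny keyword list for a C-like language.
KEYWORDS = {"for", "while", "if", "else", "return", "int"}
# Identifiers, integer literals, or any single non-space character.
TOKEN = re.compile(r"[A-Za-z_]\w*|\d+|\S")


def parameterize(source):
    """Replace every identifier with a placeholder numbered by first appearance."""
    names = {}
    tokens = []
    for tok in TOKEN.findall(source):
        if re.match(r"[A-Za-z_]", tok) and tok not in KEYWORDS:
            # The same identifier always maps to the same placeholder.
            tokens.append(names.setdefault(tok, "p%d" % len(names)))
        else:
            tokens.append(tok)
    return tokens


original = "for (i = 0; i < n; i++) total = total + i;"
renamed = "for (k = 0; k < m; k++) sum = sum + k;"
print(parameterize(original) == parameterize(renamed))
```

A fragment in which the renaming is not consistent (for example, adding `n` instead of `i` in the last term) yields a different sequence and is therefore not reported as a parameterized match.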
String matching algorithms are quite popular and effective. MOSS [1], YAP3 [6], JPlag [7] and
Sherlock [8] are some of the popular tools available. CodeAliker is based on MOSS [1], which
employs string matching using k-grams, where a k-gram is a contiguous substring of length k.
Winnowing, a local fingerprinting algorithm, is also used to ensure that matches of a certain
length are detected.
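As a brief illustration of the k-gram definition, the following Python sketch lists every k-gram of a string. The function name and the sample text are chosen here for illustration only and are not taken from CodeAliker.

```python
def kgrams(text, k):
    """Return every contiguous substring of length k, in order of position."""
    return [text[i:i + k] for i in range(len(text) - k + 1)]


print(kgrams("adorunrun", 5))
# ['adoru', 'dorun', 'orunr', 'runru', 'unrun']
```

A text of n characters has n - k + 1 k-grams (none if n is smaller than k), which is why a selection step such as winnowing is needed to keep the fingerprint small.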
4. Designing the Engine with Ruby
There were several motives for choosing MOSS as the core of CodeAliker’s engine. Ruby was
used to implement the engine after considering several important factors. The language provides
excellent text processing libraries and encourages an agile development methodology and Test
Driven Development (TDD). Moreover, it is ready for the web, with excellent frameworks
available.
MOSS is highly effective for plagiarism detection on text of different natures. It can also be
scaled to handle a large volume of data. MOSS also guarantees that matches of a certain length
are detected [1].
The engine consists of three major modules: Text Filter, Hasher and Winnower. All of the
components can be customized with easy-to-write configuration files.
The Text Filter has a key role to play when processing code assignments. Based on the approach
MOSS suggests, comments are stripped off, text is lowercased, identifiers are replaced with a
dummy symbol, language-specific keywords are removed and punctuation with no semantic
meaning is stripped off. Filtered text with the noise eliminated is thus obtained.
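The filtering steps can be sketched in Python as follows. This is an illustrative approximation for C-like input, not CodeAliker's Ruby implementation: the keyword list, the dummy symbol `v`, the punctuation treated as noise and the regular expressions are all assumptions made for the example, and comment markers inside string literals are not handled. Line numbers are kept alongside the filtered text so that later stages can still refer back to the original file.

```python
import re

KEYWORDS = {"int", "void", "return", "if", "else", "for", "while"}  # hypothetical subset
BLOCK_COMMENT = re.compile(r"/\*.*?\*/", re.DOTALL)
LINE_COMMENT = re.compile(r"//[^\n]*")
IDENTIFIER = re.compile(r"[a-z_]\w*")


def filter_source(source):
    """Return (line_number, filtered_text) pairs for lines that survive filtering."""
    # Remove block comments but keep their newlines so line numbers stay valid.
    source = BLOCK_COMMENT.sub(lambda m: "\n" * m.group().count("\n"), source)
    source = LINE_COMMENT.sub("", source)
    result = []
    for number, line in enumerate(source.split("\n"), start=1):
        line = line.strip().lower()
        # Library includes and import statements are filtered out entirely.
        if line.startswith("#include") or line.startswith("import "):
            continue
        # Keywords are dropped; every other identifier becomes the dummy symbol.
        line = IDENTIFIER.sub(lambda m: "" if m.group() in KEYWORDS else "v", line)
        # Whitespace, semicolons and commas are treated as noise in this sketch.
        line = re.sub(r"[\s;,]", "", line)
        if line:
            result.append((number, line))
    return result


sample = "#include <stdio.h>\n/* copyright\n   notice */\nint main() {\n  int total = 0; // counter\n  return total;\n}\n"
print(filter_source(sample))
# [(4, 'v(){'), (5, 'v=0'), (6, 'v'), (7, '}')]
```

Because identifiers collapse to one symbol, renaming `total` to any other name leaves the filtered text unchanged.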
The filtered text is then fed to the Hasher, which calculates hashes for the given text. A rolling
hash function based on the well-known Rabin-Karp algorithm is employed to calculate hashes
quickly. With each hash value calculated, the corresponding line number where the text occurred
is stored. This later helps in presenting the user with the locations where plagiarized text is
present.
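A minimal Python sketch of such a hasher is given below. It assumes the filtered text arrives as (line number, text) pairs, uses a base of 256 (an assumption; the paper does not state one) and takes the modulus 10001 from the parameter values reported for CodeAliker. Each k-gram hash is derived from the previous one in constant time, and the line on which the k-gram starts is stored with it.

```python
def hash_kgrams(lines, k, base=256, q=10001):
    """Return (hash, start_line) for every k-gram of the concatenated text.

    `lines` is a list of (line_number, text) pairs; `base` is an assumed radix.
    """
    # Pair every character with the line it came from.
    chars = [(ch, number) for number, text in lines for ch in text]
    if len(chars) < k:
        return []
    high = pow(base, k - 1, q)  # weight of the character that leaves the window
    h = 0
    for ch, _ in chars[:k]:
        h = (h * base + ord(ch)) % q
    result = [(h, chars[0][1])]
    for i in range(k, len(chars)):
        leaving = ord(chars[i - k][0])
        # Drop the oldest character, shift, and append the newest one.
        h = ((h - leaving * high) * base + ord(chars[i][0])) % q
        result.append((h, chars[i - k + 1][1]))
    return result


for value, line in hash_kgrams([(1, "abc"), (2, "defgh")], k=5):
    print(value, line)
```

Reducing modulo a small q keeps the values compact at the cost of occasional collisions, so a shared hash is a strong hint of shared text rather than a proof.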
The Winnower is an implementation of the ‘Robust Winnowing’ algorithm defined by MOSS [1]. A
set of hashes is chosen as the fingerprint of a document. Line number information is still
preserved.
The Winnower needs to be configured with a value ‘k’ for the k-gram, a threshold value ‘t’
and a modulus value ‘q’. If there is a substring match at least as long as the guarantee
threshold, ‘t’, then this match is detected, and we do not detect any matches shorter than
the noise threshold, ‘k’ [1]. The hash values computed are too large and hinder a
scalable implementation; hence a value ‘q’ is used as the modulus.
For CodeAliker we found the sweet spot with the values k = 5, t = 8 and q = 10001.
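In the winnowing scheme of [1], the window size follows from the two thresholds as w = t - k + 1, so the values above give w = 8 - 5 + 1 = 4. The Python sketch below illustrates robust winnowing over a list of (hash, line) pairs: each window contributes its minimum hash, ties are broken in favour of the hash already selected for the previous window, and otherwise the rightmost minimum is taken. The hash sequence is an illustrative one, and the function is a straightforward O(n·w) rendering for clarity rather than CodeAliker's Ruby code.

```python
def robust_winnow(hashes, w):
    """Select fingerprints from a list of (hash, line) pairs using windows of size w."""
    if 0 < len(hashes) < w:
        w = len(hashes)  # assumption: treat a short document as a single window
    selected = []
    previous = -1  # position selected for the previous window
    for start in range(len(hashes) - w + 1):
        window = range(start, start + w)
        smallest = min(hashes[i][0] for i in window)
        # Keep the earlier choice while it is still in the window and still minimal.
        if previous >= start and hashes[previous][0] == smallest:
            continue
        # Otherwise select the rightmost occurrence of the minimum.
        previous = max(i for i in window if hashes[i][0] == smallest)
        selected.append(hashes[previous])
    return selected


values = [77, 74, 42, 17, 98, 50, 17, 98, 8, 88, 67, 39, 77, 74, 42, 17, 98]
pairs = [(h, position) for position, h in enumerate(values)]  # position stands in for a line
print(robust_winnow(pairs, w=4))
# [(17, 3), (17, 6), (8, 8), (39, 11), (17, 15)]
```

Since every window of w consecutive k-gram hashes contributes at least one fingerprint, any shared substring of length t or more is guaranteed to share a selected hash.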
The documents are compared based on their final fingerprints, with plagiarism instances being
reported line by line. The check for essay-based assignments is very similar, with the Filter
step omitted.
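The comparison step can be sketched in Python as below, assuming each fingerprint is a list of (hash, line) pairs. The function names and the overlap ratio are hypothetical additions for illustration; the paper itself only states that matches are reported line by line.

```python
def matching_lines(primary, other):
    """Return sorted (primary_line, other_line) pairs whose fingerprint hashes agree."""
    index = {}
    for value, line in other:
        index.setdefault(value, set()).add(line)
    pairs = set()
    for value, line in primary:
        for other_line in index.get(value, ()):
            pairs.add((line, other_line))
    return sorted(pairs)


def overlap(primary, other):
    """Fraction of the primary document's distinct hashes also found in the other one."""
    mine = {value for value, _ in primary}
    theirs = {value for value, _ in other}
    return len(mine & theirs) / len(mine) if mine else 0.0


primary = [(17, 3), (8, 5), (39, 9)]
other = [(8, 2), (39, 4), (50, 7)]
print(matching_lines(primary, other))  # [(5, 2), (9, 4)]
print(overlap(primary, other))
```

Because hashes are reduced modulo q, two different k-grams can occasionally collide, so reported line pairs are candidates for manual review rather than confirmed copies.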
5. Building a Cloud Application
The most interesting part of our research is building a cloud application for the engine. For
building the web application we employ the Ruby on Rails platform.
Ruby on Rails, a full-stack framework for Ruby, is excellent for agile development and
sustainable productivity. It boasts a highly modular design, excellent package management
capabilities, database abstraction with an ORM (Object Relational Mapping) library and ease of
deployment.
The application is built with the MVC (Model-View-Controller) design pattern inherent in
the Rails framework. CodeAliker aims to ease the workflow involved by automating the entire
process. To achieve this, an authentication-based system is introduced for professors, where the
assignments for each class they teach are available to them as a separate group.
The professor can mark any one of the assignments as primary and check all the possible
plagiarized instances with respect to that assignment, thus getting a complete view of the
scenario effortlessly and in a non-ad-hoc fashion. Academics can also manually supervise the
submissions and reviews.
While the traditional delivery of software services has been mainstream, bringing the cloud
into perspective changes the entire scenario. The cloud tends to centralize our resources, code
base and data onto an always-available depot. Hardware resources can be accurately utilized, the
load on a high-demand system can be catered to and the system can be easily scaled.
Configuration management is also made effortless. Any organization with requirements for such a
system is in need of the cloud. The platform allows changes to be pushed onto the codebase with
the push of a button, rather than relying on extensive upgrade packs.
For plagiarism detection in a university or an academic institute, the needs are critical and
point towards the cloud: the computing requirements are addressed, and the system is centralized
and easily scaled.
For hosting CodeAliker on the cloud, Heroku, a cloud platform, has been employed.
The source code is publicly available at: https://github.com/Myth17/CodeAliker
The application is available for free use at: http://codealiker.heroku.com/
6. Results
The results for CodeAliker display the assignment marked as primary on the left, while the
suspected plagiarism instances are previewed stacked onto each other on the right. To present a
clearer picture, the plagiarized instances are marked with line numbers in order to aid manual
scrutiny and present a more cohesive report.
Figure 6.1
Figure 6.2
7. Conclusion
We have analyzed the entire scenario of plagiarism detection, while figuring out the problems
and solutions for developing an application for the purpose. The major accomplishments of the
research are our ability to define a precise definition of code plagiarism, an understanding of
the different approaches towards plagiarism detection, a practical implementation of the MOSS
engine with fine-tuned parameters, and the building of a scalable web application hosted on a
cloud platform.
References
[1] S. Schleimer, D. S. Wilkerson and A. Aiken, “Winnowing: Local Algorithms for Document
Fingerprinting,” Proc. 2003 ACM SIGMOD International Conference on Management of Data
(SIGMOD '03), pp. 76-85, 2003.
[2] G. Cosma and M. Joy, “Towards a Definition of Source-Code Plagiarism,” IEEE Trans.
Education, vol. 51, no. 2, May 2008.
[3] G. Cosma and M. Joy, “An Approach to Source-Code Plagiarism Detection and Investigation
Using Latent Semantic Analysis,” IEEE Trans. Computers, vol. 61, no. 3, p. 379, March 2012.
[4] M. Mozgovoy, “Desktop Tools for Offline Plagiarism Detection in Computer Programs,”
Informatics in Education, vol. 5, no. 1, pp. 97-112, 2006.
[5] B. Baker, “On Finding Duplication and Near-Duplication in Large Software Systems,” Proc. IEEE
Second Working Conf. Reverse Eng., pp. 85-95, 1995.
[6] M.J. Wise, “YAP3: Improved Detection of Similarities in Computer Program and Other Texts,”
Proc. 27th SIGCSE Technical Symp., pp. 130-134, 1996.
[7] L. Prechelt, G. Malpohl, and M. Philippsen, “Finding Plagiarisms Among a Set of Programs with
JPlag,” J. Universal Computer Science, vol. 8, no. 11, pp. 1016-1038, 2002.
[8] M. Joy and M. Luck, “Plagiarism in Programming Assignments,” IEEE Trans. Education, vol. 42,
no. 2, pp. 129-133, May 1999.
Authors
Nitish Upreti is a computer science student at AMITY School of Engineering and
Technology, Noida, India. His fields of interest include Algorithm Design, Artificial
Intelligence and building scalable web applications.
Rishi Kumar is a computer science faculty member at Amity School of Engineering and
Technology, Noida, India. His fields of interest include Artificial Intelligence,
Expert Systems and Image Processing.

A REPORT On DETECTION OF PHISHING WEBSITE USING MACHINE LEARNINGA REPORT On DETECTION OF PHISHING WEBSITE USING MACHINE LEARNING
A REPORT On DETECTION OF PHISHING WEBSITE USING MACHINE LEARNING
 
Paper id 22201490
Paper id 22201490Paper id 22201490
Paper id 22201490
 
Finding Zero-Days Before The Attackers: A Fortune 500 Red Team Case Study
Finding Zero-Days Before The Attackers: A Fortune 500 Red Team Case StudyFinding Zero-Days Before The Attackers: A Fortune 500 Red Team Case Study
Finding Zero-Days Before The Attackers: A Fortune 500 Red Team Case Study
 
Machine Learning in Static Analysis of Program Source Code
Machine Learning in Static Analysis of Program Source CodeMachine Learning in Static Analysis of Program Source Code
Machine Learning in Static Analysis of Program Source Code
 
MAPREDUCE IMPLEMENTATION FOR MALICIOUS WEBSITES CLASSIFICATION
MAPREDUCE IMPLEMENTATION FOR MALICIOUS WEBSITES CLASSIFICATIONMAPREDUCE IMPLEMENTATION FOR MALICIOUS WEBSITES CLASSIFICATION
MAPREDUCE IMPLEMENTATION FOR MALICIOUS WEBSITES CLASSIFICATION
 
MAPREDUCE IMPLEMENTATION FOR MALICIOUS WEBSITES CLASSIFICATION
MAPREDUCE IMPLEMENTATION FOR MALICIOUS WEBSITES CLASSIFICATIONMAPREDUCE IMPLEMENTATION FOR MALICIOUS WEBSITES CLASSIFICATION
MAPREDUCE IMPLEMENTATION FOR MALICIOUS WEBSITES CLASSIFICATION
 
Learning activity 4
Learning activity 4Learning activity 4
Learning activity 4
 
H1803044651
H1803044651H1803044651
H1803044651
 
A Platform for Application Risk Intelligence
A Platform for Application Risk IntelligenceA Platform for Application Risk Intelligence
A Platform for Application Risk Intelligence
 
Software Analytics: Towards Software Mining that Matters (2014)
Software Analytics:Towards Software Mining that Matters (2014)Software Analytics:Towards Software Mining that Matters (2014)
Software Analytics: Towards Software Mining that Matters (2014)
 
AI Based Student S Assignments Plagiarism Detector
AI Based Student S Assignments Plagiarism DetectorAI Based Student S Assignments Plagiarism Detector
AI Based Student S Assignments Plagiarism Detector
 
Edi text
Edi textEdi text
Edi text
 

More from acijjournal

Jan_2024_Top_read_articles_in_ACIJ.pdf
Jan_2024_Top_read_articles_in_ACIJ.pdfJan_2024_Top_read_articles_in_ACIJ.pdf
Jan_2024_Top_read_articles_in_ACIJ.pdfacijjournal
 
Call for Papers - Advanced Computing An International Journal (ACIJ) (2).pdf
Call for Papers - Advanced Computing An International Journal (ACIJ) (2).pdfCall for Papers - Advanced Computing An International Journal (ACIJ) (2).pdf
Call for Papers - Advanced Computing An International Journal (ACIJ) (2).pdfacijjournal
 
cs - ACIJ (4) (1).pdf
cs - ACIJ (4) (1).pdfcs - ACIJ (4) (1).pdf
cs - ACIJ (4) (1).pdfacijjournal
 
cs - ACIJ (2).pdf
cs - ACIJ (2).pdfcs - ACIJ (2).pdf
cs - ACIJ (2).pdfacijjournal
 
7thInternational Conference on Data Mining & Knowledge Management (DaKM 2022)
7thInternational Conference on Data Mining & Knowledge Management (DaKM 2022)7thInternational Conference on Data Mining & Knowledge Management (DaKM 2022)
7thInternational Conference on Data Mining & Knowledge Management (DaKM 2022)acijjournal
 
7thInternational Conference on Data Mining & Knowledge Management (DaKM 2022)
7thInternational Conference on Data Mining & Knowledge Management (DaKM 2022)7thInternational Conference on Data Mining & Knowledge Management (DaKM 2022)
7thInternational Conference on Data Mining & Knowledge Management (DaKM 2022)acijjournal
 
7thInternational Conference on Data Mining & Knowledge Management (DaKM 2022)
7thInternational Conference on Data Mining & Knowledge Management (DaKM 2022)7thInternational Conference on Data Mining & Knowledge Management (DaKM 2022)
7thInternational Conference on Data Mining & Knowledge Management (DaKM 2022)acijjournal
 
4thInternational Conference on Machine Learning & Applications (CMLA 2022)
4thInternational Conference on Machine Learning & Applications (CMLA 2022)4thInternational Conference on Machine Learning & Applications (CMLA 2022)
4thInternational Conference on Machine Learning & Applications (CMLA 2022)acijjournal
 
7thInternational Conference on Data Mining & Knowledge Management (DaKM 2022)
7thInternational Conference on Data Mining & Knowledge Management (DaKM 2022)7thInternational Conference on Data Mining & Knowledge Management (DaKM 2022)
7thInternational Conference on Data Mining & Knowledge Management (DaKM 2022)acijjournal
 
3rdInternational Conference on Natural Language Processingand Applications (N...
3rdInternational Conference on Natural Language Processingand Applications (N...3rdInternational Conference on Natural Language Processingand Applications (N...
3rdInternational Conference on Natural Language Processingand Applications (N...acijjournal
 
4thInternational Conference on Machine Learning & Applications (CMLA 2022)
4thInternational Conference on Machine Learning & Applications (CMLA 2022)4thInternational Conference on Machine Learning & Applications (CMLA 2022)
4thInternational Conference on Machine Learning & Applications (CMLA 2022)acijjournal
 
Graduate School Cyber Portfolio: The Innovative Menu For Sustainable Development
Graduate School Cyber Portfolio: The Innovative Menu For Sustainable DevelopmentGraduate School Cyber Portfolio: The Innovative Menu For Sustainable Development
Graduate School Cyber Portfolio: The Innovative Menu For Sustainable Developmentacijjournal
 
Genetic Algorithms and Programming - An Evolutionary Methodology
Genetic Algorithms and Programming - An Evolutionary MethodologyGenetic Algorithms and Programming - An Evolutionary Methodology
Genetic Algorithms and Programming - An Evolutionary Methodologyacijjournal
 
Data Transformation Technique for Protecting Private Information in Privacy P...
Data Transformation Technique for Protecting Private Information in Privacy P...Data Transformation Technique for Protecting Private Information in Privacy P...
Data Transformation Technique for Protecting Private Information in Privacy P...acijjournal
 
Advanced Computing: An International Journal (ACIJ)
Advanced Computing: An International Journal (ACIJ) Advanced Computing: An International Journal (ACIJ)
Advanced Computing: An International Journal (ACIJ) acijjournal
 
E-Maintenance: Impact Over Industrial Processes, Its Dimensions & Principles
E-Maintenance: Impact Over Industrial Processes, Its Dimensions & PrinciplesE-Maintenance: Impact Over Industrial Processes, Its Dimensions & Principles
E-Maintenance: Impact Over Industrial Processes, Its Dimensions & Principlesacijjournal
 
10th International Conference on Software Engineering and Applications (SEAPP...
10th International Conference on Software Engineering and Applications (SEAPP...10th International Conference on Software Engineering and Applications (SEAPP...
10th International Conference on Software Engineering and Applications (SEAPP...acijjournal
 
10th International conference on Parallel, Distributed Computing and Applicat...
10th International conference on Parallel, Distributed Computing and Applicat...10th International conference on Parallel, Distributed Computing and Applicat...
10th International conference on Parallel, Distributed Computing and Applicat...acijjournal
 
DETECTION OF FORGERY AND FABRICATION IN PASSPORTS AND VISAS USING CRYPTOGRAPH...
DETECTION OF FORGERY AND FABRICATION IN PASSPORTS AND VISAS USING CRYPTOGRAPH...DETECTION OF FORGERY AND FABRICATION IN PASSPORTS AND VISAS USING CRYPTOGRAPH...
DETECTION OF FORGERY AND FABRICATION IN PASSPORTS AND VISAS USING CRYPTOGRAPH...acijjournal
 
Detection of Forgery and Fabrication in Passports and Visas Using Cryptograph...
Detection of Forgery and Fabrication in Passports and Visas Using Cryptograph...Detection of Forgery and Fabrication in Passports and Visas Using Cryptograph...
Detection of Forgery and Fabrication in Passports and Visas Using Cryptograph...acijjournal
 

More from acijjournal (20)

Jan_2024_Top_read_articles_in_ACIJ.pdf
Jan_2024_Top_read_articles_in_ACIJ.pdfJan_2024_Top_read_articles_in_ACIJ.pdf
Jan_2024_Top_read_articles_in_ACIJ.pdf
 
Call for Papers - Advanced Computing An International Journal (ACIJ) (2).pdf
Call for Papers - Advanced Computing An International Journal (ACIJ) (2).pdfCall for Papers - Advanced Computing An International Journal (ACIJ) (2).pdf
Call for Papers - Advanced Computing An International Journal (ACIJ) (2).pdf
 
cs - ACIJ (4) (1).pdf
cs - ACIJ (4) (1).pdfcs - ACIJ (4) (1).pdf
cs - ACIJ (4) (1).pdf
 
cs - ACIJ (2).pdf
cs - ACIJ (2).pdfcs - ACIJ (2).pdf
cs - ACIJ (2).pdf
 
7thInternational Conference on Data Mining & Knowledge Management (DaKM 2022)
7thInternational Conference on Data Mining & Knowledge Management (DaKM 2022)7thInternational Conference on Data Mining & Knowledge Management (DaKM 2022)
7thInternational Conference on Data Mining & Knowledge Management (DaKM 2022)
 
7thInternational Conference on Data Mining & Knowledge Management (DaKM 2022)
7thInternational Conference on Data Mining & Knowledge Management (DaKM 2022)7thInternational Conference on Data Mining & Knowledge Management (DaKM 2022)
7thInternational Conference on Data Mining & Knowledge Management (DaKM 2022)
 
7thInternational Conference on Data Mining & Knowledge Management (DaKM 2022)
7thInternational Conference on Data Mining & Knowledge Management (DaKM 2022)7thInternational Conference on Data Mining & Knowledge Management (DaKM 2022)
7thInternational Conference on Data Mining & Knowledge Management (DaKM 2022)
 
4thInternational Conference on Machine Learning & Applications (CMLA 2022)
4thInternational Conference on Machine Learning & Applications (CMLA 2022)4thInternational Conference on Machine Learning & Applications (CMLA 2022)
4thInternational Conference on Machine Learning & Applications (CMLA 2022)
 
7thInternational Conference on Data Mining & Knowledge Management (DaKM 2022)
7thInternational Conference on Data Mining & Knowledge Management (DaKM 2022)7thInternational Conference on Data Mining & Knowledge Management (DaKM 2022)
7thInternational Conference on Data Mining & Knowledge Management (DaKM 2022)
 
3rdInternational Conference on Natural Language Processingand Applications (N...
3rdInternational Conference on Natural Language Processingand Applications (N...3rdInternational Conference on Natural Language Processingand Applications (N...
3rdInternational Conference on Natural Language Processingand Applications (N...
 
4thInternational Conference on Machine Learning & Applications (CMLA 2022)
4thInternational Conference on Machine Learning & Applications (CMLA 2022)4thInternational Conference on Machine Learning & Applications (CMLA 2022)
4thInternational Conference on Machine Learning & Applications (CMLA 2022)
 
Graduate School Cyber Portfolio: The Innovative Menu For Sustainable Development
Graduate School Cyber Portfolio: The Innovative Menu For Sustainable DevelopmentGraduate School Cyber Portfolio: The Innovative Menu For Sustainable Development
Graduate School Cyber Portfolio: The Innovative Menu For Sustainable Development
 
Genetic Algorithms and Programming - An Evolutionary Methodology
Genetic Algorithms and Programming - An Evolutionary MethodologyGenetic Algorithms and Programming - An Evolutionary Methodology
Genetic Algorithms and Programming - An Evolutionary Methodology
 
Data Transformation Technique for Protecting Private Information in Privacy P...
Data Transformation Technique for Protecting Private Information in Privacy P...Data Transformation Technique for Protecting Private Information in Privacy P...
Data Transformation Technique for Protecting Private Information in Privacy P...
 
Advanced Computing: An International Journal (ACIJ)
Advanced Computing: An International Journal (ACIJ) Advanced Computing: An International Journal (ACIJ)
Advanced Computing: An International Journal (ACIJ)
 
E-Maintenance: Impact Over Industrial Processes, Its Dimensions & Principles
E-Maintenance: Impact Over Industrial Processes, Its Dimensions & PrinciplesE-Maintenance: Impact Over Industrial Processes, Its Dimensions & Principles
E-Maintenance: Impact Over Industrial Processes, Its Dimensions & Principles
 
10th International Conference on Software Engineering and Applications (SEAPP...
10th International Conference on Software Engineering and Applications (SEAPP...10th International Conference on Software Engineering and Applications (SEAPP...
10th International Conference on Software Engineering and Applications (SEAPP...
 
10th International conference on Parallel, Distributed Computing and Applicat...
10th International conference on Parallel, Distributed Computing and Applicat...10th International conference on Parallel, Distributed Computing and Applicat...
10th International conference on Parallel, Distributed Computing and Applicat...
 
DETECTION OF FORGERY AND FABRICATION IN PASSPORTS AND VISAS USING CRYPTOGRAPH...
DETECTION OF FORGERY AND FABRICATION IN PASSPORTS AND VISAS USING CRYPTOGRAPH...DETECTION OF FORGERY AND FABRICATION IN PASSPORTS AND VISAS USING CRYPTOGRAPH...
DETECTION OF FORGERY AND FABRICATION IN PASSPORTS AND VISAS USING CRYPTOGRAPH...
 
Detection of Forgery and Fabrication in Passports and Visas Using Cryptograph...
Detection of Forgery and Fabrication in Passports and Visas Using Cryptograph...Detection of Forgery and Fabrication in Passports and Visas Using Cryptograph...
Detection of Forgery and Fabrication in Passports and Visas Using Cryptograph...
 

Recently uploaded

Call Girls Service Nagpur Tanvi Call 7001035870 Meet With Nagpur Escorts
Call Girls Service Nagpur Tanvi Call 7001035870 Meet With Nagpur EscortsCall Girls Service Nagpur Tanvi Call 7001035870 Meet With Nagpur Escorts
Call Girls Service Nagpur Tanvi Call 7001035870 Meet With Nagpur EscortsCall Girls in Nagpur High Profile
 
(RIA) Call Girls Bhosari ( 7001035870 ) HI-Fi Pune Escorts Service
(RIA) Call Girls Bhosari ( 7001035870 ) HI-Fi Pune Escorts Service(RIA) Call Girls Bhosari ( 7001035870 ) HI-Fi Pune Escorts Service
(RIA) Call Girls Bhosari ( 7001035870 ) HI-Fi Pune Escorts Serviceranjana rawat
 
Introduction and different types of Ethernet.pptx
Introduction and different types of Ethernet.pptxIntroduction and different types of Ethernet.pptx
Introduction and different types of Ethernet.pptxupamatechverse
 
MANUFACTURING PROCESS-II UNIT-2 LATHE MACHINE
MANUFACTURING PROCESS-II UNIT-2 LATHE MACHINEMANUFACTURING PROCESS-II UNIT-2 LATHE MACHINE
MANUFACTURING PROCESS-II UNIT-2 LATHE MACHINESIVASHANKAR N
 
(PRIYA) Rajgurunagar Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
(PRIYA) Rajgurunagar Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...(PRIYA) Rajgurunagar Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
(PRIYA) Rajgurunagar Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...ranjana rawat
 
VIP Call Girls Service Kondapur Hyderabad Call +91-8250192130
VIP Call Girls Service Kondapur Hyderabad Call +91-8250192130VIP Call Girls Service Kondapur Hyderabad Call +91-8250192130
VIP Call Girls Service Kondapur Hyderabad Call +91-8250192130Suhani Kapoor
 
High Profile Call Girls Nagpur Meera Call 7001035870 Meet With Nagpur Escorts
High Profile Call Girls Nagpur Meera Call 7001035870 Meet With Nagpur EscortsHigh Profile Call Girls Nagpur Meera Call 7001035870 Meet With Nagpur Escorts
High Profile Call Girls Nagpur Meera Call 7001035870 Meet With Nagpur EscortsCall Girls in Nagpur High Profile
 
Call Girls in Nagpur Suman Call 7001035870 Meet With Nagpur Escorts
Call Girls in Nagpur Suman Call 7001035870 Meet With Nagpur EscortsCall Girls in Nagpur Suman Call 7001035870 Meet With Nagpur Escorts
Call Girls in Nagpur Suman Call 7001035870 Meet With Nagpur EscortsCall Girls in Nagpur High Profile
 
Microscopic Analysis of Ceramic Materials.pptx
Microscopic Analysis of Ceramic Materials.pptxMicroscopic Analysis of Ceramic Materials.pptx
Microscopic Analysis of Ceramic Materials.pptxpurnimasatapathy1234
 
Introduction to IEEE STANDARDS and its different types.pptx
Introduction to IEEE STANDARDS and its different types.pptxIntroduction to IEEE STANDARDS and its different types.pptx
Introduction to IEEE STANDARDS and its different types.pptxupamatechverse
 
Introduction to Multiple Access Protocol.pptx
Introduction to Multiple Access Protocol.pptxIntroduction to Multiple Access Protocol.pptx
Introduction to Multiple Access Protocol.pptxupamatechverse
 
(ANVI) Koregaon Park Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
(ANVI) Koregaon Park Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...(ANVI) Koregaon Park Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
(ANVI) Koregaon Park Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...ranjana rawat
 
(SHREYA) Chakan Call Girls Just Call 7001035870 [ Cash on Delivery ] Pune Esc...
(SHREYA) Chakan Call Girls Just Call 7001035870 [ Cash on Delivery ] Pune Esc...(SHREYA) Chakan Call Girls Just Call 7001035870 [ Cash on Delivery ] Pune Esc...
(SHREYA) Chakan Call Girls Just Call 7001035870 [ Cash on Delivery ] Pune Esc...ranjana rawat
 
Booking open Available Pune Call Girls Koregaon Park 6297143586 Call Hot Ind...
Booking open Available Pune Call Girls Koregaon Park  6297143586 Call Hot Ind...Booking open Available Pune Call Girls Koregaon Park  6297143586 Call Hot Ind...
Booking open Available Pune Call Girls Koregaon Park 6297143586 Call Hot Ind...Call Girls in Nagpur High Profile
 
Decoding Kotlin - Your guide to solving the mysterious in Kotlin.pptx
Decoding Kotlin - Your guide to solving the mysterious in Kotlin.pptxDecoding Kotlin - Your guide to solving the mysterious in Kotlin.pptx
Decoding Kotlin - Your guide to solving the mysterious in Kotlin.pptxJoão Esperancinha
 
HARDNESS, FRACTURE TOUGHNESS AND STRENGTH OF CERAMICS
HARDNESS, FRACTURE TOUGHNESS AND STRENGTH OF CERAMICSHARDNESS, FRACTURE TOUGHNESS AND STRENGTH OF CERAMICS
HARDNESS, FRACTURE TOUGHNESS AND STRENGTH OF CERAMICSRajkumarAkumalla
 
Processing & Properties of Floor and Wall Tiles.pptx
Processing & Properties of Floor and Wall Tiles.pptxProcessing & Properties of Floor and Wall Tiles.pptx
Processing & Properties of Floor and Wall Tiles.pptxpranjaldaimarysona
 

Recently uploaded (20)

Call Girls Service Nagpur Tanvi Call 7001035870 Meet With Nagpur Escorts
Call Girls Service Nagpur Tanvi Call 7001035870 Meet With Nagpur EscortsCall Girls Service Nagpur Tanvi Call 7001035870 Meet With Nagpur Escorts
Call Girls Service Nagpur Tanvi Call 7001035870 Meet With Nagpur Escorts
 
(RIA) Call Girls Bhosari ( 7001035870 ) HI-Fi Pune Escorts Service
(RIA) Call Girls Bhosari ( 7001035870 ) HI-Fi Pune Escorts Service(RIA) Call Girls Bhosari ( 7001035870 ) HI-Fi Pune Escorts Service
(RIA) Call Girls Bhosari ( 7001035870 ) HI-Fi Pune Escorts Service
 
★ CALL US 9953330565 ( HOT Young Call Girls In Badarpur delhi NCR
★ CALL US 9953330565 ( HOT Young Call Girls In Badarpur delhi NCR★ CALL US 9953330565 ( HOT Young Call Girls In Badarpur delhi NCR
★ CALL US 9953330565 ( HOT Young Call Girls In Badarpur delhi NCR
 
Introduction and different types of Ethernet.pptx
Introduction and different types of Ethernet.pptxIntroduction and different types of Ethernet.pptx
Introduction and different types of Ethernet.pptx
 
MANUFACTURING PROCESS-II UNIT-2 LATHE MACHINE
MANUFACTURING PROCESS-II UNIT-2 LATHE MACHINEMANUFACTURING PROCESS-II UNIT-2 LATHE MACHINE
MANUFACTURING PROCESS-II UNIT-2 LATHE MACHINE
 
(PRIYA) Rajgurunagar Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
(PRIYA) Rajgurunagar Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...(PRIYA) Rajgurunagar Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
(PRIYA) Rajgurunagar Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
 
VIP Call Girls Service Kondapur Hyderabad Call +91-8250192130
VIP Call Girls Service Kondapur Hyderabad Call +91-8250192130VIP Call Girls Service Kondapur Hyderabad Call +91-8250192130
VIP Call Girls Service Kondapur Hyderabad Call +91-8250192130
 
9953056974 Call Girls In South Ex, Escorts (Delhi) NCR.pdf
9953056974 Call Girls In South Ex, Escorts (Delhi) NCR.pdf9953056974 Call Girls In South Ex, Escorts (Delhi) NCR.pdf
9953056974 Call Girls In South Ex, Escorts (Delhi) NCR.pdf
 
High Profile Call Girls Nagpur Meera Call 7001035870 Meet With Nagpur Escorts
High Profile Call Girls Nagpur Meera Call 7001035870 Meet With Nagpur EscortsHigh Profile Call Girls Nagpur Meera Call 7001035870 Meet With Nagpur Escorts
High Profile Call Girls Nagpur Meera Call 7001035870 Meet With Nagpur Escorts
 
Call Girls in Nagpur Suman Call 7001035870 Meet With Nagpur Escorts
Call Girls in Nagpur Suman Call 7001035870 Meet With Nagpur EscortsCall Girls in Nagpur Suman Call 7001035870 Meet With Nagpur Escorts
Call Girls in Nagpur Suman Call 7001035870 Meet With Nagpur Escorts
 
Microscopic Analysis of Ceramic Materials.pptx
Microscopic Analysis of Ceramic Materials.pptxMicroscopic Analysis of Ceramic Materials.pptx
Microscopic Analysis of Ceramic Materials.pptx
 
Introduction to IEEE STANDARDS and its different types.pptx
Introduction to IEEE STANDARDS and its different types.pptxIntroduction to IEEE STANDARDS and its different types.pptx
Introduction to IEEE STANDARDS and its different types.pptx
 
Introduction to Multiple Access Protocol.pptx
Introduction to Multiple Access Protocol.pptxIntroduction to Multiple Access Protocol.pptx
Introduction to Multiple Access Protocol.pptx
 
(ANVI) Koregaon Park Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
(ANVI) Koregaon Park Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...(ANVI) Koregaon Park Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
(ANVI) Koregaon Park Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
 
(SHREYA) Chakan Call Girls Just Call 7001035870 [ Cash on Delivery ] Pune Esc...
(SHREYA) Chakan Call Girls Just Call 7001035870 [ Cash on Delivery ] Pune Esc...(SHREYA) Chakan Call Girls Just Call 7001035870 [ Cash on Delivery ] Pune Esc...
(SHREYA) Chakan Call Girls Just Call 7001035870 [ Cash on Delivery ] Pune Esc...
 
Booking open Available Pune Call Girls Koregaon Park 6297143586 Call Hot Ind...
Booking open Available Pune Call Girls Koregaon Park  6297143586 Call Hot Ind...Booking open Available Pune Call Girls Koregaon Park  6297143586 Call Hot Ind...
Booking open Available Pune Call Girls Koregaon Park 6297143586 Call Hot Ind...
 
Decoding Kotlin - Your guide to solving the mysterious in Kotlin.pptx
Decoding Kotlin - Your guide to solving the mysterious in Kotlin.pptxDecoding Kotlin - Your guide to solving the mysterious in Kotlin.pptx
Decoding Kotlin - Your guide to solving the mysterious in Kotlin.pptx
 
HARDNESS, FRACTURE TOUGHNESS AND STRENGTH OF CERAMICS
HARDNESS, FRACTURE TOUGHNESS AND STRENGTH OF CERAMICSHARDNESS, FRACTURE TOUGHNESS AND STRENGTH OF CERAMICS
HARDNESS, FRACTURE TOUGHNESS AND STRENGTH OF CERAMICS
 
Processing & Properties of Floor and Wall Tiles.pptx
Processing & Properties of Floor and Wall Tiles.pptxProcessing & Properties of Floor and Wall Tiles.pptx
Processing & Properties of Floor and Wall Tiles.pptx
 
DJARUM4D - SLOT GACOR ONLINE | SLOT DEMO ONLINE
DJARUM4D - SLOT GACOR ONLINE | SLOT DEMO ONLINEDJARUM4D - SLOT GACOR ONLINE | SLOT DEMO ONLINE
DJARUM4D - SLOT GACOR ONLINE | SLOT DEMO ONLINE
 

Advanced Computing: An International Journal (ACIJ), Vol.3, No.4, July 2012
DOI : 10.5121/acij.2012.3403

‘CodeAliker’ - Plagiarism Detection on the Cloud

Nitish Upreti1 and Rishi Kumar2
Department of Computer Science and Engineering, AMITY University, Noida, India.
nitishupreti@gmail.com
Department of Computer Science and Engineering, AMITY University, Noida, India.
rishikumar182000@gmail.com

Abstract
Plagiarism is a pressing problem that academics face at every level of the educational system. With the advent of digital content, the challenge of ensuring the integrity of academic work has been amplified. This paper discusses a precise definition of plagiarized computer code, the various solutions available for detecting plagiarism, and the building of a cloud platform for plagiarism disclosure. ‘CodeAliker’, the application we developed, automates the submission of assignments and the associated review process for essay text as well as computer code. It has been made available under the GNU General Public License as Free and Open Source Software.

Keywords
Plagiarism, String matching, Cloud Computing

1. Introduction
A close look at the state of academic integrity and its implications gives us the major motivation for pursuing the subject. The issue is of utmost significance because the intellectual standing of an individual pursuing an academic career is established around the ability to produce authoritative work. Plagiarism is thus lethal. Every year a large number of students and scholars submit a huge volume of material to their respective mentors and professors. Due to the sheer amount of text involved, manual scrutiny is infeasible. Analyzing the situation, we found no existing work in the public domain that solved the problem faced by educational institutes worldwide. Most of the alternatives were either closed source or catered to only a fraction of the entire problem.
Working on this issue, we first explore the sensitive question of classifying documents as ‘authentic’ or ‘plagiarized’. We then analyze numerous approaches to plagiarism detection, and advance to our chief goal of implementing an engine and leveraging the cloud platform for scalable and robust plagiarism detection. Alex Aiken’s MOSS [1] is chosen as the key approach for building the application. Results and conclusions follow, where we present our observations and learning.

2. Classification of Text
Broadly, the text submitted to such a system is either an essay, that is, plain text in a natural language, or computer code in a popular language such as C, C++, Java or Ruby. It is comparatively easy to determine whether an essay has been plagiarized; source code copying, however, is a delicate issue, with only a fine line drawn between ‘code reuse’, ‘collaboration’, ‘non-citation’ and ‘plagiarism’. With learning themes such as ‘Code Reuse’ in an OOP system and ‘Don’t re-invent the wheel’ code philosophies, the distinction is even more blurred. With hardly any definition in place, designing a system capable of accurately identifying copying instances is infeasible. Hence concrete definitions need to be in place. Little work has been done on the topic; the only concrete input comes from the work of Georgina Cosma and Mike Joy [2]. Their work follows a survey-based approach among U.K. academics to find the right answers and opinions. However, to implement a practical solution we need our own precise judgment on the problem rather than a crude hypothesis.

A submitted code assignment consists of comments and the actual source code. Comments, which occur in almost all sophisticated code bases, could be a major resource for identifying plagiarism instances. Nonetheless they present a major pitfall and could lead to suspicious false positives, because copyright statements occur frequently as comments. For the purpose of CodeAliker we choose to strip off comments so as to avoid any such issues.

Moving forward, we address a multitude of other questions. Is using an external library or API an instance of plagiarism? In most cases we found that library use without citation is legitimate, as libraries are a central part of any sophisticated computer program. CodeAliker thus filters out import statements and library includes. An intuitive User Interface design can also be a part of a submission; the code for the design is put under scrutiny by CodeAliker, but looking at the design manually is much more effective than plain UI code checking.

3. Approaches to Plagiarism Detection
Various approaches to plagiarism detection exist, and their performance and speed vary to a great extent. Certain plagiarism detection schemes are also more suitable for data sets with a specific structure and nature. A rich taxonomy is summarized in the diagram below.

[Figure 3.1 Plagiarism Detection Approaches. Plagiarism detection divides into Web Scraping Based and Local Database Based approaches; Local Database Based approaches are Structured or Non-Structured; Non-Structured approaches comprise Fingerprinting Techniques, String Matching and Parameterized Techniques.]
Web scraping based approaches use the World Wide Web to check for plagiarism instances against a large corpus of data. The scope of web scraping is huge, and a lot of published work exists on such systems. Our focus for this research is on systems based on a local database compiled from assignments submitted by students taking the classes and from past years' submissions.

Local database based approaches can be either Structured or Non-Structured. The Structured approach creates a graph model of the information in the document and is used mostly with code-based assignments. Non-Structured techniques are the most popular and are useful on a wide variety of text material. They are classified by the algorithm used: Document Fingerprinting, String Matching and Parameterized Matching are the popular ones [3].

Tools based on the fingerprint approach work by creating “fingerprints” for each file, consisting of statistical information about the file such as the average number of terms per line, the number of unique terms, and the number of keywords [4]. The DUP tool [5] is based on a parameterized matching algorithm, which detects identical and near-duplicate sections of source code by matching source-code sections whose identifiers have been substituted (renamed) systematically [3]. String matching algorithms are quite popular and effective; MOSS [1], YAP3 [6], JPlag [7], and Sherlock [8] are some of the popular tools available.

CodeAliker is based on MOSS [1], which employs string matching using k-grams, where a k-gram is a contiguous substring of length k. Winnowing, a local fingerprinting algorithm, is also used to ensure that matches of a certain length are detected.

4. Designing the Engine with Ruby

There were various motives for choosing MOSS as the core of CodeAliker’s engine. Ruby was chosen to implement the engine after considering several important factors.
The language provides excellent text processing libraries and encourages an agile development methodology and Test Driven Development (TDD). Moreover, it is ready for the web, with excellent frameworks available.

MOSS is highly effective for plagiarism detection on text of different kinds. It can also be scaled to handle a large volume of data, and it guarantees that matches of a certain length are detected [1].

The engine consists of three major modules: Text Filter, Hasher and Winnower. All of the components can be customized with easy-to-write configuration files.

The Text Filter plays a key role when processing code assignments. Following the approach MOSS suggests, comments are stripped off, text is lowercased, identifiers are replaced with a dummy symbol, language-specific keywords are removed, and punctuation with no semantic meaning is stripped off. The result is filtered text with the noise eliminated.

The filtered text is then fed to the Hasher, which calculates hashes for the given text. A rolling hash function based on the well-known Rabin-Karp algorithm is employed to calculate the hashes quickly. With each hash value, the line number where the corresponding text occurred is stored. This later helps in showing the user where the plagiarized text is present.

The Winnower is an implementation of the ‘Robust Winnowing’ algorithm defined by MOSS. A subset of the hashes is chosen as the fingerprint of a document. Line number information is still preserved.
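The three modules described above can be pictured with a minimal Ruby sketch. It is an illustration under stated assumptions, not CodeAliker's actual source: the method names (`filter`, `kgram_hashes`, `winnow`, `fingerprint`), the tiny keyword list, the hash base 256, and the handling of only `//` and `#` line comments are all hypothetical. The window size w = t - k + 1 follows the winnowing paper [1], and the default k, t and q are the values this paper reports for CodeAliker.

```ruby
# Hypothetical sketch of a Text Filter -> Hasher -> Winnower pipeline.
# Names, keyword list and base are illustrative assumptions.

KEYWORDS = %w[if else while for return int void class def end].freeze
BASE = 256

# Text Filter: strip line comments, lowercase, drop keywords, map every
# identifier to the dummy symbol "v", drop whitespace and the separators ; ,
# Returns [char, line_number] pairs so that positions survive filtering.
def filter(source)
  chars = []
  source.each_line.with_index(1) do |line, line_no|
    code = line.sub(%r{(//|#).*}, "").downcase
    code.scan(/[a-z_][a-z0-9_]*|[^\s;,]/) do |token|
      next if KEYWORDS.include?(token)
      token = "v" if token.match?(/\A[a-z_]/)
      token.each_char { |c| chars << [c, line_no] }
    end
  end
  chars
end

# Hasher: Rabin-Karp rolling hash of every k-gram, reduced modulo q.
# Returns [hash, line_number_of_first_char] pairs.
def kgram_hashes(chars, k, q)
  return [] if chars.size < k
  codes = chars.map { |c, _| c.ord }
  high = BASE.pow(k - 1, q) # weight of the character leaving the window
  h = codes.first(k).reduce(0) { |acc, c| (acc * BASE + c) % q }
  out = [[h, chars[0][1]]]
  (k...codes.size).each do |i|
    h = ((h - codes[i - k] * high) * BASE + codes[i]) % q
    out << [h, chars[i - k + 1][1]]
  end
  out
end

# Winnower: in every window of w hashes pick the minimum. Ties keep the
# previous pick when it is still in the window, else take the rightmost.
def winnow(hashes, w)
  return [] if hashes.empty?
  picked = []
  last = nil
  (0..[hashes.size - w, 0].max).each do |start|
    window = (start...[start + w, hashes.size].min)
    min = window.map { |i| hashes[i][0] }.min
    pos = if last && window.cover?(last) && hashes[last][0] == min
            last
          else
            window.to_a.reverse.find { |i| hashes[i][0] == min }
          end
    picked << hashes[pos] unless pos == last
    last = pos
  end
  picked
end

# Full pipeline; returns the fingerprint as [hash, line_number] pairs.
def fingerprint(source, k: 5, t: 8, q: 10_001)
  winnow(kgram_hashes(filter(source), k, q), t - k + 1)
end
```

Under this sketch, two snippets with the same structure that differ only in identifier names and comments yield the same fingerprint, because the filter maps every identifier to the same dummy symbol before hashing.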
The Winnower needs to be configured with three parameters: the value ‘k’ for the k-grams, a threshold value ‘t’ and a modulus value ‘q’. If there is a substring match at least as long as the guarantee threshold ‘t’, then this match is detected, and no match shorter than the noise threshold ‘k’ is detected [1]. The computed hash values are too large and hinder a scalable implementation; hence a value ‘q’ is used as the modulus. For CodeAliker we found the sweet spot with the values k = 5, t = 8 and q = 10001.

The documents are compared based on their final fingerprints, with plagiarism instances reported line by line. Checking essay-based assignments is surprisingly similar, with the Filter step omitted.

5. Building a Cloud Application

The most interesting part of our research is building a cloud application around the engine. For the web application we employ the Ruby on Rails platform. Ruby on Rails, a full-stack framework for Ruby, is excellent for agile development and sustainable productivity. It boasts a highly modular design, excellent package management capabilities, database abstraction with an ORM (Object Relational Mapping) library, and ease of deployment. The application is built with the MVC (Model-View-Controller) design pattern inherent in the Rails framework.

CodeAliker aims to ease the workflow by automating the entire process. To achieve this, an authentication-based system is introduced for professors, in which the assignments for each class they take are available to them as a separate group. The professor can mark any one of the assignments as primary and check all possible plagiarized instances with respect to that assignment, thus getting a complete view of the scenario effortlessly and in a non-ad-hoc fashion. Academics can also manually supervise the submissions and reviews.
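The primary-assignment check can be sketched as follows. This is a hypothetical illustration, not the application's actual code: it assumes each fingerprint is an array of [hash, line_number] pairs, and the method names `matches` and `review` as well as the file names in the usage note are invented for the example.

```ruby
# Hypothetical sketch: compare one primary fingerprint against the others.
# A fingerprint is assumed to be an array of [hash, line_number] pairs.

# Returns sorted [primary_line, other_line] pairs whose hashes coincide.
def matches(primary, other)
  lines_by_hash = Hash.new { |h, key| h[key] = [] }
  other.each { |hash, line| lines_by_hash[hash] << line }
  primary.flat_map do |hash, line|
    lines_by_hash.fetch(hash, []).map { |other_line| [line, other_line] }
  end.uniq.sort
end

# Ranks submissions by the number of shared fingerprint hashes, dropping
# submissions that share nothing with the primary assignment.
def review(primary, submissions)
  submissions.map { |name, fp| [name, matches(primary, fp)] }
             .reject { |_, pairs| pairs.empty? }
             .sort_by { |name, pairs| [-pairs.size, name] }
end
```

For example, with a primary fingerprint `[[17, 1], [8, 3], [39, 4]]`, a submission `"bob.rb"` holding `[[8, 1], [39, 2], [17, 7]]` is ranked ahead of `"alice.rb"` holding `[[17, 2], [50, 5]]`, and each result carries the line pairs that a reviewer would inspect manually.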
While the traditional delivery of software services has been mainstream, bringing the cloud into perspective changes the entire scenario. The cloud tends to centralize our resources, code base and data onto an always-available depot. Hardware resources can be utilized accurately, load on a high-demand system can be catered to, and the system can be scaled easily. Configuration management is also made effortless. Any organization with requirements for such a system is in need of the cloud. The platform allows changes to be pushed onto the codebase at the push of a button, rather than relying on extensive upgrade packs.

For plagiarism detection in a university or an academic institute, the needs are critical and point towards the cloud: the computing requirements are addressed, and the system is centralized and easily scaled.

For hosting CodeAliker on the cloud, Heroku, a cloud platform, has been employed.

The source code is publicly available at: https://github.com/Myth17/CodeAliker

The application is available for free use at: http://codealiker.heroku.com/
6. Results

Figure 6.1

Figure 6.2

The results for CodeAliker display the assignment marked as primary on the left, while the suspected plagiarism instances are previewed stacked onto each other on the right. To present a clearer picture, the plagiarized instances are marked with their line numbers in order to aid manual scrutiny and present a more cohesive report.

7. Conclusion

We have analyzed the entire scenario of plagiarism detection, while figuring out the problems and solutions for developing an application for the purpose. The major accomplishments of the research are our ability to define a precise definition of code plagiarism, understanding different approaches towards plagiarism detection, a practical implementation of the MOSS engine with fine-tuned parameters, and building a scalable web application hosted on a cloud platform.
References

[1] Saul Schleimer, Daniel S. Wilkerson and Alex Aiken, “Winnowing: Local Algorithms for Document Fingerprinting,” SIGMOD '03: Proceedings of the 2003 ACM SIGMOD International Conference on Management of Data, pp. 76-85, 2003.
[2] Georgina Cosma and Mike Joy, “Towards a Definition of Source Code Plagiarism,” IEEE Transactions on Education, vol. 51, no. 2, May 2008.
[3] Georgina Cosma and Mike Joy, “An Approach to Source-Code Plagiarism Detection and Investigation Using Latent Semantic Analysis,” IEEE Transactions on Computers, vol. 61, no. 3, p. 379, March 2012.
[4] M. Mozgovoy, “Desktop Tools for Offline Plagiarism Detection in Computer Programs,” Informatics in Education, vol. 5, no. 1, pp. 97-112, 2006.
[5] B. Baker, “On Finding Duplication and Near-Duplication in Large Software Systems,” Proc. IEEE Second Working Conf. Reverse Eng., pp. 85-95, 1995.
[6] M.J. Wise, “YAP3: Improved Detection of Similarities in Computer Program and Other Texts,” Proc. 27th SIGCSE Technical Symp., pp. 130-134, 1996.
[7] L. Prechelt, G. Malpohl, and M. Philippsen, “Finding Plagiarisms Among a Set of Programs with JPlag,” J. Universal Computer Science, vol. 8, no. 11, pp. 1016-1038, 2002.
[8] M. Joy and M. Luck, “Plagiarism in Programming Assignments,” IEEE Trans. Education, vol. 42, no. 2, pp. 129-133, May 1999.

Authors

Nitish Upreti is a computer science student at Amity School of Engineering and Technology, Noida, India. His fields of interest include Algorithm Design, Artificial Intelligence and building scalable web applications.

Rishi Kumar is a computer science faculty member at Amity School of Engineering and Technology, Noida, India. His fields of interest include Artificial Intelligence, Expert Systems and Image Processing.