Copy/Paste detector
for source code on
JavaScript
Andrey Kucherenko
Lead Software Engineer @ EPAM Systems (Kiev)
ABOUT
Andrey Kucherenko
Lead Software Engineer @ EPAM Systems (Kiev)
https://github.com/kucherenko/jscpd
ABOUT TOPIC
Copy/Paste detector for source code on
JavaScript
● Reasons
● Algorithm
● Architecture
● Tools
● Plans
REASONS
EXISTING TOOLS
PHPCPD
ALGORITHM
Rabin-Karp: String Matching Algorithm
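As a refresher, Rabin-Karp compares hashes of fixed-size windows instead of the windows themselves, and updates the hash in constant time as the window slides. The following is a minimal sketch on plain strings; the function name, `BASE`, and `MOD` are illustrative assumptions, not values taken from jscpd.

```javascript
// Minimal Rabin-Karp sketch: return every index where `pattern` occurs in `text`.
// BASE and MOD are illustrative choices (assumptions), not taken from jscpd.
const BASE = 256;
const MOD = 1000003;

function rabinKarp(text, pattern) {
  const n = text.length;
  const m = pattern.length;
  const matches = [];
  if (m === 0 || m > n) return matches;

  // Weight of the leading character in a window: BASE^(m-1) mod MOD.
  let high = 1;
  for (let i = 0; i < m - 1; i++) high = (high * BASE) % MOD;

  // Hash the pattern and the first window of the text.
  let patternHash = 0;
  let windowHash = 0;
  for (let i = 0; i < m; i++) {
    patternHash = (patternHash * BASE + pattern.charCodeAt(i)) % MOD;
    windowHash = (windowHash * BASE + text.charCodeAt(i)) % MOD;
  }

  for (let i = 0; i + m <= n; i++) {
    // Equal hashes only suggest a match; compare the text to rule out collisions.
    if (windowHash === patternHash && text.slice(i, i + m) === pattern) {
      matches.push(i);
    }
    if (i + m < n) {
      // Roll the window: drop text[i], append text[i + m].
      windowHash = (windowHash - text.charCodeAt(i) * high) % MOD;
      windowHash = (windowHash * BASE + text.charCodeAt(i + m)) % MOD;
      if (windowHash < 0) windowHash += MOD;
    }
  }
  return matches;
}

console.log(rabinKarp('abracadabra', 'abra')); // → [ 0, 7 ]
```

For copy/paste detection the same idea can be applied to a sequence of tokens rather than characters, so that whitespace and formatting differences do not hide a duplicate.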
ALGORITHM
const answer = 6 * 7;
const click = event => console.log(event)
function hello(name) {
  console.log(`Hello ${name}`);
}
ALGORITHM
{ "type": "Keyword", "value": "const" },
{ "type": "Identifier", "value": "answer" },
{ "type": "Punctuator", "value": "=" },
{ "type": "Numeric", "value": "6" },
{ "type": "Punctuator", "value": "*" },
{ "type": "Numeric", "value": "7" },
{ "type": "Punctuator", "value": ";" },
{ "type": "Keyword", "value": "const" },
{ "type": "Identifier", "value": "click" },
{ "type": "Punctuator", "value": "=" },
{ "type": "Identifier", "value": "event" },
{ "type": "Punctuator", "value": "=>" },
{ "type": "Identifier", "value": "console" },
{ "type": "Punctuator", "value": "." },
{ "type": "Identifier", "value": "log" },
{ "type": "Punctuator", "value": "(" },
{ "type": "Identifier", "value": "event" },
{ "type": "Punctuator", "value": ")" },
{ "type": "Keyword", "value": "function" },
{ "type": "Identifier", "value": "hello" },
{ "type": "Punctuator", "value": "(" },
{ "type": "Identifier", "value": "name" },
{ "type": "Punctuator", "value": ")" },
{ "type": "Punctuator", "value": "{" },
{ "type": "Identifier", "value": "console" },
{ "type": "Punctuator", "value": "." },
{ "type": "Identifier", "value": "log" },
{ "type": "Punctuator", "value": "(" },
{ "type": "Template", "value": "`Hello ${" },
{ "type": "Identifier", "value": "name" },
{ "type": "Template", "value": "}`" },
{ "type": "Punctuator", "value": ")" },
{ "type": "Punctuator", "value": ";" },
{ "type": "Punctuator", "value": "}" }
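Each token can then be reduced to a short fingerprint, so that identical tokens always map to identical values. The slides do not say which hash function is used; the sketch below assumes the first 10 hex characters of a SHA-1 digest over `type:value`, and `hashToken` is a hypothetical helper name.

```javascript
const crypto = require('crypto');

// Assumption: fingerprint = first 10 hex chars of SHA-1 over "type:value".
// The hash function actually used in the talk is not stated.
function hashToken(token) {
  return crypto
    .createHash('sha1')
    .update(`${token.type}:${token.value}`)
    .digest('hex')
    .slice(0, 10);
}

// The first eight tokens from the list above.
const tokens = [
  { type: 'Keyword', value: 'const' },
  { type: 'Identifier', value: 'answer' },
  { type: 'Punctuator', value: '=' },
  { type: 'Numeric', value: '6' },
  { type: 'Punctuator', value: '*' },
  { type: 'Numeric', value: '7' },
  { type: 'Punctuator', value: ';' },
  { type: 'Keyword', value: 'const' },
];

const tokenHashes = tokens.map(hashToken);

// Both `const` keywords produce the same fingerprint.
console.log(tokenHashes[0] === tokenHashes[7]); // → true
```

Hashing the type together with the value keeps, for example, an identifier named `const` in another language distinct from the keyword.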
ALGORITHM
'3237124366', '07c4845853', '0b70d4ffe0', '7bb34abce5',
'59fb864552', '9f769a5e93', 'c913cc056b', '3237124366',
'4b09c07aa1', '0b70d4ffe0', '06b9bc0ef4', 'caf867a125',
'bd6845a7e7', '7fb7e4e540', '5fb26ad7b9', '66e5a4ee55',
'06b9bc0ef4', '3aa73104c6', '4a43a19740', 'bca573934e',
'66e5a4ee55', 'da5400f4e2', '3aa73104c6', 'fecd1071ce',
'bd6845a7e7', '7fb7e4e540', '5fb26ad7b9', '66e5a4ee55',
'4ab7ca70d5', 'da5400f4e2', '7c61a943f3', '3aa73104c6',
'c913cc056b', 'a120515c8c'
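The list above holds one hash per token, and repeated runs of hashes point at repeated code. One way such a sequence could be scanned is sketched below: slide a window of `minTokens` hashes and remember where each window was first seen. `findDuplicateRuns` and `minTokens` are hypothetical names, and a plain `Map` lookup stands in for the rolling hash; this is not jscpd's actual implementation.

```javascript
// Hypothetical helper: report positions where the same run of `minTokens`
// consecutive token hashes occurs more than once.
function findDuplicateRuns(hashes, minTokens) {
  const firstSeen = new Map(); // window key -> index of its first occurrence
  const duplicates = [];
  for (let i = 0; i + minTokens <= hashes.length; i++) {
    const key = hashes.slice(i, i + minTokens).join('|');
    if (firstSeen.has(key)) {
      duplicates.push({ first: firstSeen.get(key), second: i });
    } else {
      firstSeen.set(key, i);
    }
  }
  return duplicates;
}

// The token hashes from the slide, in order.
const hashes = [
  '3237124366', '07c4845853', '0b70d4ffe0', '7bb34abce5',
  '59fb864552', '9f769a5e93', 'c913cc056b', '3237124366',
  '4b09c07aa1', '0b70d4ffe0', '06b9bc0ef4', 'caf867a125',
  'bd6845a7e7', '7fb7e4e540', '5fb26ad7b9', '66e5a4ee55',
  '06b9bc0ef4', '3aa73104c6', '4a43a19740', 'bca573934e',
  '66e5a4ee55', 'da5400f4e2', '3aa73104c6', 'fecd1071ce',
  'bd6845a7e7', '7fb7e4e540', '5fb26ad7b9', '66e5a4ee55',
  '4ab7ca70d5', 'da5400f4e2', '7c61a943f3', '3aa73104c6',
  'c913cc056b', 'a120515c8c',
];

console.log(findDuplicateRuns(hashes, 4));
// → [ { first: 12, second: 24 } ]
```

The single hit corresponds to the token run `console . log (`, which appears in both the arrow function and `hello`. A real detector would use a much larger minimum run so that such short idioms are not reported.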
ARCHITECTURE
TOKENS
Esprima
acorn
TOOLS
SUPPORTED LANGUAGES
JavaScript Go Swift
CoffeeScript Python Objective-C
TypeScript CSS/SASS Perl
Java C# Lua
C/C++ HTML Scala
PHP XML/XSLT Other…
FEATURES
● Reporters: json, xml, xslt, console
● Blame: show the authors of copy/pasted code
● Use language semantics (e.g. skip comments)
● Community extensions: gulp-jscpd, html-reporter, grunt-jscpd, etc.
STATISTICS
REPORTS
PLANS
● More reporters
● Cross-project detection
● New API
● Improve performance (add cache, add Bloom filters, etc.)
● Persistent store (NoSQL, etc.)
● Reports for a given period
QUESTIONS
Questions?