Extended subtree a new similarity function for tree structured data


EXTENDED SUBTREE: A NEW SIMILARITY FUNCTION FOR TREE
STRUCTURED DATA
ABSTRACT:
Although several distance or similarity functions for trees have
been introduced, their performance is not always satisfactory
in many applications, ranging from document clustering to
natural language processing. This research proposes a new
similarity function for trees, namely Extended Subtree (EST),
where a new subtree mapping is proposed. EST generalizes the
edit-based distances by providing new rules for subtree mapping.
Further, the new approach seeks to resolve the problems and
limitations of previous approaches. Extensive evaluation
frameworks are developed to evaluate the performance of the
new approach against previous proposals. Clustering and
classification case studies utilizing three real-world and one
synthetic labeled data sets are performed to provide an unbiased
evaluation where different distance functions are investigated.
The experimental results demonstrate the superior performance
of the proposed distance function. In addition, an empirical
runtime analysis demonstrates that the new approach is one of
the best tree distance functions in terms of runtime efficiency.
EXISTING SYSTEM:
Tree-structured data is used extensively in today’s
information technology. Trees can model many
information systems like XML and HTML. User behavior in a
website (visited pages), proteins, and DNA can be modeled with
a tree. Moreover, programming language compilers parse the
code into a tree as a first step. Consequently, in many
applications involving tree-structured data, tree comparison is
required. The applications include document clustering, natural
language processing, cross-browser compatibility, and automatic
web testing. Several tree comparison approaches have
already been introduced to address this domain. Edit-based distances
are a well-known family of tree distances based on mapping and
edit operations.
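To make the idea of an edit-based distance concrete, the following is a minimal sketch of one simple member of that family: a unit-cost, top-down ordered distance in which a root may be relabelled and whole child subtrees may be inserted or deleted. The Node class and the function names are assumptions introduced for illustration only; they are not taken from the referenced paper, and this is not the EST function.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Node:
    """Hypothetical labelled, ordered tree node (illustration only)."""
    label: str
    children: List["Node"] = field(default_factory=list)


def size(t: Node) -> int:
    """Number of nodes in the subtree rooted at t."""
    return 1 + sum(size(c) for c in t.children)


def top_down_distance(a: Node, b: Node) -> int:
    """Unit-cost, top-down ordered edit distance between trees a and b."""
    # Relabelling the root costs 1 when the labels differ.
    cost = 0 if a.label == b.label else 1
    m, n = len(a.children), len(b.children)
    # d[i][j]: cost of turning a's first i child subtrees into b's first j.
    d = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(1, m + 1):
        d[i][0] = d[i - 1][0] + size(a.children[i - 1])  # delete a subtree
    for j in range(1, n + 1):
        d[0][j] = d[0][j - 1] + size(b.children[j - 1])  # insert a subtree
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            d[i][j] = min(
                d[i - 1][j] + size(a.children[i - 1]),
                d[i][j - 1] + size(b.children[j - 1]),
                # Map child i onto child j, keeping sibling order.
                d[i - 1][j - 1]
                + top_down_distance(a.children[i - 1], b.children[j - 1]),
            )
    return cost + d[m][n]
```

With this sketch, two trees that hold the same children in a different order, such as a(b, c) and a(c, b), are at distance 2 rather than 0, because the mapping between siblings has to preserve their order.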
DISADVANTAGES OF EXISTING SYSTEM:

- Order-preserving rules may prevent mapping between
  similar nodes.
- According to the one-to-one condition, any node in a tree
  can be mapped into only one node in another tree, leading to
  inappropriate mappings with respect to similarity.
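The two limitations above can be illustrated on the child labels of a single node. The sketch below is an assumption-laden toy, not code from the referenced paper: ordered_matches counts the largest one-to-one match that preserves sibling order (a longest common subsequence), and unordered_matches counts the largest one-to-one match when order is ignored. Both function names and the sample labels are hypothetical.

```python
from collections import Counter
from typing import Sequence


def ordered_matches(xs: Sequence[str], ys: Sequence[str]) -> int:
    """Largest order-preserving, one-to-one match (LCS length)."""
    prev = [0] * (len(ys) + 1)
    for x in xs:
        cur = [0]
        for j, y in enumerate(ys, start=1):
            if x == y:
                cur.append(prev[j - 1] + 1)
            else:
                cur.append(max(prev[j], cur[j - 1]))
        prev = cur
    return prev[-1]


def unordered_matches(xs: Sequence[str], ys: Sequence[str]) -> int:
    """Largest one-to-one match when sibling order is ignored."""
    return sum((Counter(xs) & Counter(ys)).values())


# Hypothetical child labels of two page roots: same blocks, reordered.
page_a = ["header", "menu", "content"]
page_b = ["menu", "header", "content"]
```

Here ordered_matches(page_a, page_b) is 2 while unordered_matches(page_a, page_b) is 3, so the order-preserving rule leaves one similar node unmapped. Likewise, unordered_matches(["item"], ["item", "item", "item"]) is 1: under the one-to-one condition, two of the three repeated nodes cannot be mapped at all.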
PROPOSED SYSTEM:
We propose a new similarity function for tree-structured
data, namely Extended Subtree (EST). The new
similarity function avoids these problems by preserving the
structure of the trees. That is, the new mapping rules map
subtrees rather than individual nodes. The motivation for proposing
EST is to enhance the edit-based mappings
by generalizing the one-to-one and order-preserving mapping
rules. Consequently, EST introduces new rules for subtree
mapping. This new approach seeks to resolve the problems and
limitations of edit-based approaches.
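The general idea of mapping subtrees rather than nodes can be sketched with a toy measure. This is explicitly not the EST algorithm, whose mapping rules and weights are not given in this abstract. The sketch only shows what changes when identical complete subtrees are matched without the one-to-one and sibling-order restrictions. The Node class, all function names, and the normalization are assumptions, and labels are assumed not to contain parentheses or commas.

```python
from dataclasses import dataclass, field
from typing import List, Set


@dataclass
class Node:
    """Hypothetical labelled tree node (illustration only)."""
    label: str
    children: List["Node"] = field(default_factory=list)


def size(t: Node) -> int:
    """Number of nodes in the subtree rooted at t."""
    return 1 + sum(size(c) for c in t.children)


def encode(t: Node) -> str:
    """String encoding of the complete subtree rooted at t."""
    return t.label + "(" + ",".join(encode(c) for c in t.children) + ")"


def all_subtrees(t: Node) -> Set[str]:
    """Encodings of every complete subtree of t."""
    found = {encode(t)}
    for c in t.children:
        found |= all_subtrees(c)
    return found


def covered(t: Node, pool: Set[str]) -> int:
    """Nodes of t lying in a complete subtree that also occurs in pool."""
    if encode(t) in pool:
        return size(t)
    return sum(covered(c, pool) for c in t.children)


def toy_subtree_similarity(a: Node, b: Node) -> float:
    """Share of nodes, over both trees, covered by shared subtrees."""
    hits = covered(a, all_subtrees(b)) + covered(b, all_subtrees(a))
    return hits / (size(a) + size(b))


# r(x(y), x(y)) versus r(x(y)): the repeated subtree x(y) is mapped twice.
tree_a = Node("r", [Node("x", [Node("y")]), Node("x", [Node("y")])])
tree_b = Node("r", [Node("x", [Node("y")])])
```

In this example both copies of x(y) in tree_a are matched against the single x(y) in tree_b, something a one-to-one node mapping would not allow, and toy_subtree_similarity(tree_a, tree_b) evaluates to 0.75.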
ADVANTAGES OF PROPOSED SYSTEM:

- It provides runtime efficiency.
- It is evaluated with clustering and classification frameworks.
SYSTEM CONFIGURATION:
HARDWARE REQUIREMENTS:
Processor - Pentium IV
Speed - 1.1 GHz
RAM - 512 MB (min)
Hard Disk - 40 GB
Keyboard - Standard Windows Keyboard
Mouse - Two or Three Button Mouse
Monitor - LCD/LED
SOFTWARE REQUIREMENTS:
Operating System : Windows XP
Coding Language : .NET
Database : SQL Server 2005
Tool : Visual Studio 2008

REFERENCE:
Ali Shahbazi, Student Member, IEEE, and James Miller, Member, IEEE,
“Extended Subtree: A New Similarity Function for Tree Structured Data,”
IEEE Transactions on Knowledge and Data Engineering,
vol. 26, no. 4, April 2014.


