LeMeniz Infotech 
36, 100 feet Road, Natesan 
Nagar(Near Indira Gandhi Statue, Next 
to Fish-O-Fish), Pondicherry-605 005 
Call: 0413-4205444, +91 99625 88976, 
95663 55386. 
For More Projects Titles Visits : www.lemenizinfotech.com | Call Us : 9962588976 
/9566355386 
Do Your Projects With Domain Experts 
SCALABLE KEYWORD SEARCH ON LARGE RDF DATA 
ABSTRACT 
Keyword search is a useful tool for exploring large RDF datasets. Existing
techniques rely either on constructing a distance matrix to prune the search
space or on building summaries from the RDF graphs for query processing. These
techniques have serious limitations when dealing with realistic, large RDF data with
tens of millions of triples. Furthermore, the existing summarization techniques may
lead to incorrect or incomplete results. To address these issues, an effective
summarization algorithm is proposed to summarize the RDF data. Given a
keyword query, the summaries lend significant pruning power to exploratory
keyword search and result in much better efficiency than previous work.
Unlike existing techniques, this search algorithm always returns correct results.
In addition, the summaries can be updated incrementally and efficiently.
Experiments on both benchmark and large real RDF datasets show that these
techniques are scalable and efficient.
AIM 
The aim is to design a scalable and exact solution that handles realistic RDF
datasets with tens of millions of triples.
INTRODUCTION 
RDF (Resource Description Framework) is the de facto standard for data
representation on the Web. It is therefore no surprise that we are inundated with
large amounts of rapidly growing RDF data from disparate domains. For instance, the
Linked Open Data (LOD) initiative integrates billions of entities from hundreds of
sources. Just one of these sources, the DBpedia dataset, describes more than 3.64
million things using more than 1 billion RDF triples, and it contains numerous
keywords.
Keyword search is an important tool for exploring and searching large data
corpora whose structure is either unknown or constantly changing. Keyword
search has therefore already been studied in the context of relational databases,
XML documents, and, more recently, graphs and RDF data.
EXISTING SYSTEM 
Keyword search on generic graphs 
For keyword search on generic graphs, many techniques assume that graphs fit in
memory, an assumption that breaks for big RDF graphs.
For instance, some existing approaches maintain a distance matrix for all vertex
pairs and clearly do not scale for graphs with millions of vertices.
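
A back-of-the-envelope estimate helps show why an all-pairs distance matrix does not scale. The sketch below assumes a dense matrix with one 4-byte distance per ordered vertex pair. The entry size, the function names, and the vertex counts are illustrative assumptions, not figures from the work summarized here.

```python
def distance_matrix_bytes(num_vertices: int, bytes_per_entry: int = 4) -> int:
    """Memory needed for a dense all-pairs distance matrix (assumed layout)."""
    return num_vertices * num_vertices * bytes_per_entry


def human_readable(num_bytes: int) -> str:
    """Format a byte count using decimal (power-of-1000) units."""
    units = ["B", "KB", "MB", "GB", "TB", "PB"]
    value = float(num_bytes)
    for unit in units:
        if value < 1000 or unit == units[-1]:
            return f"{value:.1f} {unit}"
        value /= 1000


# Illustrative graph sizes; only the quadratic growth matters here.
for n in (10_000, 1_000_000, 10_000_000):
    print(f"{n:>10,} vertices -> {human_readable(distance_matrix_bytes(n))}")
```

Under these assumptions, ten thousand vertices need about 400 MB, while ten million vertices need about 400 TB, which is far beyond main memory.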
Furthermore, these works do not consider how to handle updates. A typical
approach to keyword search here is backward search. When backward search is
used to find a Steiner tree in the data graph, the underlying problem is NP-hard.
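
The idea behind backward search can be sketched as follows. This is a simplified illustration, not the algorithm of any specific system mentioned here: it expands from the vertices matching each keyword along reversed edges and picks the root that reaches every keyword with the fewest total hops, rather than computing a Steiner tree. The function names and the toy graph are hypothetical.

```python
from collections import deque


def backward_distances(reverse_adj, sources):
    """BFS over reversed edges: hops from each vertex to its nearest source."""
    dist = {v: 0 for v in sources}
    queue = deque(sources)
    while queue:
        v = queue.popleft()
        for u in reverse_adj.get(v, ()):  # u has an edge u -> v
            if u not in dist:
                dist[u] = dist[v] + 1
                queue.append(u)
    return dist


def best_root(edges, keyword_hits):
    """Return (root, total_hops) for the cheapest vertex reaching all keywords."""
    reverse_adj = {}
    for u, v in edges:
        reverse_adj.setdefault(v, []).append(u)
    per_keyword = [backward_distances(reverse_adj, hits)
                   for hits in keyword_hits.values()]
    # Candidate roots are vertices reached from every keyword.
    common = set(per_keyword[0]).intersection(*per_keyword[1:])
    if not common:
        return None
    return min(((r, sum(d[r] for d in per_keyword)) for r in common),
               key=lambda pair: (pair[1], pair[0]))


# Hypothetical toy graph: directed edges and the vertices matching each keyword.
toy_edges = [("paper1", "author1"), ("paper1", "venue1"), ("paper2", "author1")]
toy_hits = {"alice": ["author1"], "vldb": ["venue1"]}
print(best_root(toy_edges, toy_hits))
```

Note that this sketch keeps the whole graph in memory, which is exactly the assumption that breaks for large RDF graphs.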
Large graph data 
The graph data are first partitioned into small subgraphs by heuristics. In this
version of the problem, the authors assume that edges crossing partition
boundaries are weighted. A partition is treated as a supernode, and edges crossing
partitions are superedges. The supernodes and superedges form a new graph, which
is considered a summary of the underlying graph data. By recursively
partitioning and building summaries, a large graph can eventually be summarized
by a summary small enough to fit into memory for query processing.
During query evaluation, the corresponding supernodes containing the queried
keywords are unfolded, and the respective portions of the graph are fetched from
external memory for query processing. This approach cannot be applied to RDF.
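
One level of the supernode/superedge summary described above can be sketched as below. The partition assignment is taken as given (the heuristics are not specified here), and summing the weights of crossing edges is an assumption made for illustration. All names are hypothetical.

```python
from collections import defaultdict


def summarize(edges, partition_of):
    """Collapse partitions into supernodes and crossing edges into superedges."""
    supernodes = defaultdict(set)
    for vertex, part in partition_of.items():
        supernodes[part].add(vertex)
    superedges = defaultdict(float)
    for u, v, weight in edges:
        pu, pv = partition_of[u], partition_of[v]
        if pu != pv:  # only edges crossing a partition boundary are kept
            superedges[(pu, pv)] += weight  # assumed aggregation: sum of weights
    return dict(supernodes), dict(superedges)


def unfold(supernodes, keyword_vertices):
    """Supernodes holding a keyword vertex; these would be fetched from disk."""
    wanted = set(keyword_vertices)
    return {part for part, members in supernodes.items() if members & wanted}


# Hypothetical weighted graph with a precomputed two-way partition.
toy_edges = [("a", "b", 1.0), ("b", "c", 2.0), ("c", "d", 1.0), ("a", "d", 0.5)]
toy_partition = {"a": 0, "b": 0, "c": 1, "d": 1}
nodes, links = summarize(toy_edges, toy_partition)
print(nodes, links, unfold(nodes, ["c"]))
```

Applying `summarize` again to the resulting graph, with a coarser partition, would give the recursive summarization the text describes.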
Keyword search for RDF data 
Search is first applied to the schema/summary of the data to identify promising
relations that could contain all the queried keywords. Then, by translating
these relations into search patterns and executing them against the RDF data, the
actual subgraphs are retrieved.
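
The translation step can be illustrated with a small sketch that turns schema-level relations into a SPARQL search pattern. This is an assumed, simplified translation, not the one used by any existing system: the function name, the example.org URIs, and the use of `rdfs:label` with `CONTAINS` for keyword matching are all hypothetical choices, and keyword escaping is ignored.

```python
LABEL = "http://www.w3.org/2000/01/rdf-schema#label"


def pattern_to_sparql(schema_edges, keyword_for_class):
    """Translate schema-level edges into a SPARQL SELECT query (illustrative)."""
    variables = {}

    def var(cls):
        # One variable per class; a real translator must handle repeated classes.
        return variables.setdefault(cls, f"?v{len(variables)}")

    lines = []
    for subject_cls, predicate, object_cls in schema_edges:
        lines.append(f"  {var(subject_cls)} <{predicate}> {var(object_cls)} .")
    for cls, variable in list(variables.items()):
        lines.append(f"  {variable} a <{cls}> .")
    # Keyword classes are assumed to appear in schema_edges.
    for i, (cls, keyword) in enumerate(keyword_for_class.items()):
        lines.append(f"  {var(cls)} <{LABEL}> ?l{i} .")
        lines.append(f'  FILTER(CONTAINS(LCASE(STR(?l{i})), "{keyword.lower()}"))')
    return "SELECT * WHERE {\n" + "\n".join(lines) + "\n}"


# Hypothetical schema relation: a Paper has an author that is a Person.
query = pattern_to_sparql(
    [("http://example.org/Paper", "http://example.org/author",
      "http://example.org/Person")],
    {"http://example.org/Person": "Alice"},
)
print(query)
```

Because the pattern is derived from the schema alone, executing it may return nothing when no real subgraph has that shape.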
Disadvantages
• Returns incorrect answers: the keyword search returns answers that do
not correspond to real subgraphs, or misses valid matches from the
underlying RDF data.
• Cannot scale to typical RDF datasets with tens of millions of
triples.
PROPOSED SYSTEM
• Design a scalable and exact solution that handles realistic RDF datasets
with tens of millions of triples.
• Use the SPARQL query language to process the RDF data efficiently.
• Efficiently retrieve every partition from the data by using SPARQL
queries together with any RDF store, without explicitly storing the partitions.
• The approach starts by splitting the RDF graph into multiple smaller
partitions. It then defines a minimal set of common type-based structures
that summarizes the partitions. Intuitively, the summary keeps track of the
distinct structures from all the partitions.
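
The summarization idea in the last point can be sketched roughly as follows. This is a deliberately simplified illustration under stated assumptions, not the proposed algorithm: each partition is taken to be the outgoing edges of one subject, entities are replaced by their `rdf:type`, and two partitions share a structure only if their type-level edge sets are identical. The function name, the `"Untyped"` placeholder, and the toy triples are hypothetical.

```python
from collections import defaultdict

RDF_TYPE = "rdf:type"


def summarize_partitions(triples):
    """Group triples into per-subject partitions; keep distinct type structures."""
    type_of = {s: o for s, p, o in triples if p == RDF_TYPE}
    partitions = defaultdict(set)
    for s, p, o in triples:
        if p != RDF_TYPE:
            # Replace the object by its type; untyped values get a placeholder.
            partitions[s].add((p, type_of.get(o, "Untyped")))
    summary = defaultdict(list)
    for subject, out_edges in partitions.items():
        structure = (type_of.get(subject, "Untyped"), frozenset(out_edges))
        summary[structure].append(subject)  # remember which partitions match
    return dict(summary)


# Hypothetical toy data: two papers with the same shape, one person with a name.
toy_triples = [
    ("p1", RDF_TYPE, "Paper"), ("p2", RDF_TYPE, "Paper"),
    ("a1", RDF_TYPE, "Person"), ("a2", RDF_TYPE, "Person"),
    ("p1", "author", "a1"), ("p2", "author", "a2"),
    ("a1", "name", "Alice"),
]
for structure, members in summarize_partitions(toy_triples).items():
    print(structure, members)
```

Here three partitions collapse into two type-based structures, which is the kind of reduction that lets a summary prune the search space before the data itself is touched.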
Advantages
• Better efficiency than previous approaches
• Overcomes scalability issues
• Always returns correct results
LITERATURE SUMMARY 
For keyword search on generic graphs, many techniques assume that graphs fit in
memory, an assumption that breaks for big RDF graphs. For instance, some existing
approaches maintain a distance matrix for all vertex pairs and clearly do not scale
for graphs with millions of vertices. Furthermore, these works do not consider how
to handle updates.
He et al. proposed a tractable problem that does not aim to find a Steiner tree and
can be answered by using backward search.
Keyword search over large graph data has also been studied. The graph data are
first partitioned into small subgraphs by heuristics. It is assumed that edges
crossing partition boundaries are weighted. A partition is treated as a supernode,
and edges crossing partitions are superedges. The supernodes and superedges form
a new graph, which is considered a summary of the underlying graph data. By
recursively partitioning and building summaries, a large graph can eventually be
summarized by a summary small enough to fit into memory for query
processing. During query evaluation, the corresponding supernodes containing the
keywords being queried are unfolded, and the respective portions of the graph are
fetched from external memory for query processing.
Hardware requirements:
Processor : Any processor above 500 MHz.
RAM : 128 MB.
Hard Disk : 10 GB.
Compact Disk : 650 MB.
Input device : Standard keyboard and mouse.
Output device : VGA and high-resolution monitor.
Software requirements:
Operating System : Windows family.
Language : Java (JDK 1.5).
Database : RDF store.

More Related Content

What's hot

What's hot (20)

EVALUATING CASSANDRA, MONGO DB LIKE NOSQL DATASETS USING HADOOP STREAMING
EVALUATING CASSANDRA, MONGO DB LIKE NOSQL DATASETS USING HADOOP STREAMINGEVALUATING CASSANDRA, MONGO DB LIKE NOSQL DATASETS USING HADOOP STREAMING
EVALUATING CASSANDRA, MONGO DB LIKE NOSQL DATASETS USING HADOOP STREAMING
 
Enterprise knowledge graphs
Enterprise knowledge graphsEnterprise knowledge graphs
Enterprise knowledge graphs
 
Content + Signals: The value of the entire data estate for machine learning
Content + Signals: The value of the entire data estate for machine learningContent + Signals: The value of the entire data estate for machine learning
Content + Signals: The value of the entire data estate for machine learning
 
Data Skills for Digital Era
Data Skills for Digital EraData Skills for Digital Era
Data Skills for Digital Era
 
Efficient Practices for Large Scale Text Mining Process
Efficient Practices for Large Scale Text Mining ProcessEfficient Practices for Large Scale Text Mining Process
Efficient Practices for Large Scale Text Mining Process
 
3 classification
3  classification3  classification
3 classification
 
DS4G
DS4GDS4G
DS4G
 
Analysing Large Citation Network
Analysing Large Citation NetworkAnalysing Large Citation Network
Analysing Large Citation Network
 
Big Data
Big DataBig Data
Big Data
 
Big data and hadoop
Big data and hadoopBig data and hadoop
Big data and hadoop
 
An Advanced IR System of Relational Keyword Search Technique
An Advanced IR System of Relational Keyword Search TechniqueAn Advanced IR System of Relational Keyword Search Technique
An Advanced IR System of Relational Keyword Search Technique
 
10 Popular Hadoop Technical Interview Questions
10 Popular Hadoop Technical Interview Questions10 Popular Hadoop Technical Interview Questions
10 Popular Hadoop Technical Interview Questions
 
Massive scale analytics with Stratosphere using R
Massive scale analytics with Stratosphere using RMassive scale analytics with Stratosphere using R
Massive scale analytics with Stratosphere using R
 
Fishing Graphs in a Hadoop Data Lake by Jörg Schad and Max Neunhoeffer at Big...
Fishing Graphs in a Hadoop Data Lake by Jörg Schad and Max Neunhoeffer at Big...Fishing Graphs in a Hadoop Data Lake by Jörg Schad and Max Neunhoeffer at Big...
Fishing Graphs in a Hadoop Data Lake by Jörg Schad and Max Neunhoeffer at Big...
 
Sustainability Investment Research Using Cognitive Analytics
Sustainability Investment Research Using Cognitive AnalyticsSustainability Investment Research Using Cognitive Analytics
Sustainability Investment Research Using Cognitive Analytics
 
Search powered by deep learning smart data 2017
Search powered by deep learning smart data 2017Search powered by deep learning smart data 2017
Search powered by deep learning smart data 2017
 
A Blended Approach to Analytics at Data Tactics Corporation
A Blended Approach to Analytics at Data Tactics CorporationA Blended Approach to Analytics at Data Tactics Corporation
A Blended Approach to Analytics at Data Tactics Corporation
 
How to migrate to GraphDB in 10 easy to follow steps
How to migrate to GraphDB in 10 easy to follow steps How to migrate to GraphDB in 10 easy to follow steps
How to migrate to GraphDB in 10 easy to follow steps
 
Data Tactics Data Science Brown Bag (April 2014)
Data Tactics Data Science Brown Bag (April 2014)Data Tactics Data Science Brown Bag (April 2014)
Data Tactics Data Science Brown Bag (April 2014)
 
What is Big Data?
What is Big Data?What is Big Data?
What is Big Data?
 

Viewers also liked

Riverside Resources Corporate Presentation
Riverside Resources Corporate PresentationRiverside Resources Corporate Presentation
Riverside Resources Corporate Presentation
mhallaran
 
Алексей Янин (Websellers): эффективная реклама в социальных медиа
Алексей Янин (Websellers): эффективная реклама в социальных медиаАлексей Янин (Websellers): эффективная реклама в социальных медиа
Алексей Янин (Websellers): эффективная реклама в социальных медиа
Ekaterina Giganova
 
педмарафон 2013
педмарафон 2013педмарафон 2013
педмарафон 2013
Elena Loseva
 
Delay dependent stability of single-loop controlled grid-connected inverters ...
Delay dependent stability of single-loop controlled grid-connected inverters ...Delay dependent stability of single-loop controlled grid-connected inverters ...
Delay dependent stability of single-loop controlled grid-connected inverters ...
LeMeniz Infotech
 
Watkin presentation 4
Watkin presentation 4Watkin presentation 4
Watkin presentation 4
Sarah Hoss
 
Артем Чучакин (Идея Нова) о ботах и мертвых душах в социальных медиа
Артем Чучакин (Идея Нова) о ботах и мертвых душах в социальных медиаАртем Чучакин (Идея Нова) о ботах и мертвых душах в социальных медиа
Артем Чучакин (Идея Нова) о ботах и мертвых душах в социальных медиа
Ekaterina Giganova
 

Viewers also liked (20)

Forgather
ForgatherForgather
Forgather
 
Images
ImagesImages
Images
 
Blogger did you know!
Blogger   did you know!Blogger   did you know!
Blogger did you know!
 
Riverside Resources Corporate Presentation
Riverside Resources Corporate PresentationRiverside Resources Corporate Presentation
Riverside Resources Corporate Presentation
 
Алексей Янин (Websellers): эффективная реклама в социальных медиа
Алексей Янин (Websellers): эффективная реклама в социальных медиаАлексей Янин (Websellers): эффективная реклама в социальных медиа
Алексей Янин (Websellers): эффективная реклама в социальных медиа
 
педмарафон 2013
педмарафон 2013педмарафон 2013
педмарафон 2013
 
βρες τη σωστή λέξη
βρες τη σωστή λέξηβρες τη σωστή λέξη
βρες τη σωστή λέξη
 
Delay dependent stability of single-loop controlled grid-connected inverters ...
Delay dependent stability of single-loop controlled grid-connected inverters ...Delay dependent stability of single-loop controlled grid-connected inverters ...
Delay dependent stability of single-loop controlled grid-connected inverters ...
 
A privacy preserving framework for managing mobile ad requests and billing in...
A privacy preserving framework for managing mobile ad requests and billing in...A privacy preserving framework for managing mobile ad requests and billing in...
A privacy preserving framework for managing mobile ad requests and billing in...
 
Berlin Gay Bars
Berlin Gay BarsBerlin Gay Bars
Berlin Gay Bars
 
How to Get Involved in the Magento Community #mm16ar
How to Get Involved in the Magento Community #mm16arHow to Get Involved in the Magento Community #mm16ar
How to Get Involved in the Magento Community #mm16ar
 
HAB-Brochure
HAB-BrochureHAB-Brochure
HAB-Brochure
 
Costume
CostumeCostume
Costume
 
Watkin presentation 4
Watkin presentation 4Watkin presentation 4
Watkin presentation 4
 
Magazin son
Magazin sonMagazin son
Magazin son
 
Артем Чучакин (Идея Нова) о ботах и мертвых душах в социальных медиа
Артем Чучакин (Идея Нова) о ботах и мертвых душах в социальных медиаАртем Чучакин (Идея Нова) о ботах и мертвых душах в социальных медиа
Артем Чучакин (Идея Нова) о ботах и мертвых душах в социальных медиа
 
A fast acquisition all-digital delay-locked loop using a starting-bit predict...
A fast acquisition all-digital delay-locked loop using a starting-bit predict...A fast acquisition all-digital delay-locked loop using a starting-bit predict...
A fast acquisition all-digital delay-locked loop using a starting-bit predict...
 
Imp
ImpImp
Imp
 
A new control strategy for distributed static compensators considering transm...
A new control strategy for distributed static compensators considering transm...A new control strategy for distributed static compensators considering transm...
A new control strategy for distributed static compensators considering transm...
 
Evaluation question 1 (digi pack)
Evaluation question 1 (digi pack)Evaluation question 1 (digi pack)
Evaluation question 1 (digi pack)
 

Similar to Scalable keyword search on large rdf data

Iaetsd mapreduce streaming over cassandra datasets
Iaetsd mapreduce streaming over cassandra datasetsIaetsd mapreduce streaming over cassandra datasets
Iaetsd mapreduce streaming over cassandra datasets
Iaetsd Iaetsd
 

Similar to Scalable keyword search on large rdf data (20)

IEEE 2014 JAVA DATA MINING PROJECTS Scalable keyword search on large rdf data
IEEE 2014 JAVA DATA MINING PROJECTS Scalable keyword search on large rdf dataIEEE 2014 JAVA DATA MINING PROJECTS Scalable keyword search on large rdf data
IEEE 2014 JAVA DATA MINING PROJECTS Scalable keyword search on large rdf data
 
Az31349353
Az31349353Az31349353
Az31349353
 
Processing cassandra datasets with hadoop streaming based approaches
Processing cassandra datasets with hadoop streaming based approachesProcessing cassandra datasets with hadoop streaming based approaches
Processing cassandra datasets with hadoop streaming based approaches
 
Iaetsd mapreduce streaming over cassandra datasets
Iaetsd mapreduce streaming over cassandra datasetsIaetsd mapreduce streaming over cassandra datasets
Iaetsd mapreduce streaming over cassandra datasets
 
fast nearest neighbor search with keywords
fast nearest neighbor search with keywordsfast nearest neighbor search with keywords
fast nearest neighbor search with keywords
 
Keyword Query Routing
Keyword Query RoutingKeyword Query Routing
Keyword Query Routing
 
Ramya ppt.pptx
Ramya ppt.pptxRamya ppt.pptx
Ramya ppt.pptx
 
An incremental and distributed inference methodfor large scale ontologies bas...
An incremental and distributed inference methodfor large scale ontologies bas...An incremental and distributed inference methodfor large scale ontologies bas...
An incremental and distributed inference methodfor large scale ontologies bas...
 
IJET-V3I2P14
IJET-V3I2P14IJET-V3I2P14
IJET-V3I2P14
 
ast nearest neighbor search with keywords
ast nearest neighbor search with keywordsast nearest neighbor search with keywords
ast nearest neighbor search with keywords
 
Fast nearest neighbor search with keywords
Fast nearest neighbor search with keywordsFast nearest neighbor search with keywords
Fast nearest neighbor search with keywords
 
JPJ1422 Fast Nearest Neighbour Search With Keywords
JPJ1422   Fast Nearest Neighbour Search With KeywordsJPJ1422   Fast Nearest Neighbour Search With Keywords
JPJ1422 Fast Nearest Neighbour Search With Keywords
 
OUTCOME ANALYSIS IN ACADEMIC INSTITUTIONS USING NEO4J
OUTCOME ANALYSIS IN ACADEMIC INSTITUTIONS USING NEO4JOUTCOME ANALYSIS IN ACADEMIC INSTITUTIONS USING NEO4J
OUTCOME ANALYSIS IN ACADEMIC INSTITUTIONS USING NEO4J
 
2.Introduction to NOSQL (Core concepts).pptx
2.Introduction to NOSQL (Core concepts).pptx2.Introduction to NOSQL (Core concepts).pptx
2.Introduction to NOSQL (Core concepts).pptx
 
Cr25555560
Cr25555560Cr25555560
Cr25555560
 
Big data analytics: Technology's bleeding edge
Big data analytics: Technology's bleeding edgeBig data analytics: Technology's bleeding edge
Big data analytics: Technology's bleeding edge
 
Bridging the gap between the semantic web and big data: answering SPARQL que...
Bridging the gap between the semantic web and big data:  answering SPARQL que...Bridging the gap between the semantic web and big data:  answering SPARQL que...
Bridging the gap between the semantic web and big data: answering SPARQL que...
 
The future of Big Data tooling
The future of Big Data toolingThe future of Big Data tooling
The future of Big Data tooling
 
Big Data SSD Architecture: Digging Deep to Discover Where SSD Performance Pay...
Big Data SSD Architecture: Digging Deep to Discover Where SSD Performance Pay...Big Data SSD Architecture: Digging Deep to Discover Where SSD Performance Pay...
Big Data SSD Architecture: Digging Deep to Discover Where SSD Performance Pay...
 
keyword query routing
keyword query routingkeyword query routing
keyword query routing
 

More from LeMeniz Infotech

Interleaved digital power factor correction based on the sliding mode approach
Interleaved digital power factor correction based on the sliding mode approachInterleaved digital power factor correction based on the sliding mode approach
Interleaved digital power factor correction based on the sliding mode approach
LeMeniz Infotech
 
Bumpless control for reduced thd in power factor correction circuits
Bumpless control for reduced thd in power factor correction circuitsBumpless control for reduced thd in power factor correction circuits
Bumpless control for reduced thd in power factor correction circuits
LeMeniz Infotech
 
A bidirectional three level llc resonant converter with pwam control
A bidirectional three level llc resonant converter with pwam controlA bidirectional three level llc resonant converter with pwam control
A bidirectional three level llc resonant converter with pwam control
LeMeniz Infotech
 
Efficient single phase transformerless inverter for grid tied pvg system with...
Efficient single phase transformerless inverter for grid tied pvg system with...Efficient single phase transformerless inverter for grid tied pvg system with...
Efficient single phase transformerless inverter for grid tied pvg system with...
LeMeniz Infotech
 
Highly reliable transformerless photovoltaic inverters with leakage current a...
Highly reliable transformerless photovoltaic inverters with leakage current a...Highly reliable transformerless photovoltaic inverters with leakage current a...
Highly reliable transformerless photovoltaic inverters with leakage current a...
LeMeniz Infotech
 
Grid current-feedback active damping for lcl resonance in grid-connected volt...
Grid current-feedback active damping for lcl resonance in grid-connected volt...Grid current-feedback active damping for lcl resonance in grid-connected volt...
Grid current-feedback active damping for lcl resonance in grid-connected volt...
LeMeniz Infotech
 

More from LeMeniz Infotech (20)

A fast fault tolerant architecture for sauvola local image thresholding algor...
A fast fault tolerant architecture for sauvola local image thresholding algor...A fast fault tolerant architecture for sauvola local image thresholding algor...
A fast fault tolerant architecture for sauvola local image thresholding algor...
 
A dynamically reconfigurable multi asip architecture for multistandard and mu...
A dynamically reconfigurable multi asip architecture for multistandard and mu...A dynamically reconfigurable multi asip architecture for multistandard and mu...
A dynamically reconfigurable multi asip architecture for multistandard and mu...
 
Interleaved digital power factor correction based on the sliding mode approach
Interleaved digital power factor correction based on the sliding mode approachInterleaved digital power factor correction based on the sliding mode approach
Interleaved digital power factor correction based on the sliding mode approach
 
Bumpless control for reduced thd in power factor correction circuits
Bumpless control for reduced thd in power factor correction circuitsBumpless control for reduced thd in power factor correction circuits
Bumpless control for reduced thd in power factor correction circuits
 
A bidirectional single stage three phase rectifier with high-frequency isolat...
A bidirectional single stage three phase rectifier with high-frequency isolat...A bidirectional single stage three phase rectifier with high-frequency isolat...
A bidirectional single stage three phase rectifier with high-frequency isolat...
 
A bidirectional three level llc resonant converter with pwam control
A bidirectional three level llc resonant converter with pwam controlA bidirectional three level llc resonant converter with pwam control
A bidirectional three level llc resonant converter with pwam control
 
Efficient single phase transformerless inverter for grid tied pvg system with...
Efficient single phase transformerless inverter for grid tied pvg system with...Efficient single phase transformerless inverter for grid tied pvg system with...
Efficient single phase transformerless inverter for grid tied pvg system with...
 
Highly reliable transformerless photovoltaic inverters with leakage current a...
Highly reliable transformerless photovoltaic inverters with leakage current a...Highly reliable transformerless photovoltaic inverters with leakage current a...
Highly reliable transformerless photovoltaic inverters with leakage current a...
 
Grid current-feedback active damping for lcl resonance in grid-connected volt...
Grid current-feedback active damping for lcl resonance in grid-connected volt...Grid current-feedback active damping for lcl resonance in grid-connected volt...
Grid current-feedback active damping for lcl resonance in grid-connected volt...
 
Connection of converters to a low and medium power dc network using an induct...
Connection of converters to a low and medium power dc network using an induct...Connection of converters to a low and medium power dc network using an induct...
Connection of converters to a low and medium power dc network using an induct...
 
Stamp enabling privacy preserving location proofs for mobile users
Stamp enabling privacy preserving location proofs for mobile usersStamp enabling privacy preserving location proofs for mobile users
Stamp enabling privacy preserving location proofs for mobile users
 
Sbvlc secure barcode based visible light communication for smartphones
Sbvlc secure barcode based visible light communication for smartphonesSbvlc secure barcode based visible light communication for smartphones
Sbvlc secure barcode based visible light communication for smartphones
 
Read2 me a cloud based reading aid for the visually impaired
Read2 me a cloud based reading aid for the visually impairedRead2 me a cloud based reading aid for the visually impaired
Read2 me a cloud based reading aid for the visually impaired
 
Privacy preserving location sharing services for social networks
Privacy preserving location sharing services for social networksPrivacy preserving location sharing services for social networks
Privacy preserving location sharing services for social networks
 
Pass byo bring your own picture for securing graphical passwords
Pass byo bring your own picture for securing graphical passwordsPass byo bring your own picture for securing graphical passwords
Pass byo bring your own picture for securing graphical passwords
 
Eplq efficient privacy preserving location-based query over outsourced encryp...
Eplq efficient privacy preserving location-based query over outsourced encryp...Eplq efficient privacy preserving location-based query over outsourced encryp...
Eplq efficient privacy preserving location-based query over outsourced encryp...
 
Analyzing ad library updates in android apps
Analyzing ad library updates in android appsAnalyzing ad library updates in android apps
Analyzing ad library updates in android apps
 
An exploration of geographic authentication scheme
An exploration of geographic authentication schemeAn exploration of geographic authentication scheme
An exploration of geographic authentication scheme
 
Dotnet IEEE Projects 2016-2017 | Dotnet IEEE Projects Titles 2016-2017
Dotnet IEEE Projects 2016-2017 | Dotnet IEEE Projects Titles 2016-2017Dotnet IEEE Projects 2016-2017 | Dotnet IEEE Projects Titles 2016-2017
Dotnet IEEE Projects 2016-2017 | Dotnet IEEE Projects Titles 2016-2017
 
Context based access control systems for mobile devices
Context based access control systems for mobile devicesContext based access control systems for mobile devices
Context based access control systems for mobile devices
 

Recently uploaded

Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers:  A Deep Dive into Serverless Spatial Data and FMECloud Frontiers:  A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
Safe Software
 
Why Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire businessWhy Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire business
panagenda
 

Recently uploaded (20)

Rising Above_ Dubai Floods and the Fortitude of Dubai International Airport.pdf
Rising Above_ Dubai Floods and the Fortitude of Dubai International Airport.pdfRising Above_ Dubai Floods and the Fortitude of Dubai International Airport.pdf
Rising Above_ Dubai Floods and the Fortitude of Dubai International Airport.pdf
 
The Zero-ETL Approach: Enhancing Data Agility and Insight
The Zero-ETL Approach: Enhancing Data Agility and InsightThe Zero-ETL Approach: Enhancing Data Agility and Insight
The Zero-ETL Approach: Enhancing Data Agility and Insight
 
Corporate and higher education May webinar.pptx
Corporate and higher education May webinar.pptxCorporate and higher education May webinar.pptx
Corporate and higher education May webinar.pptx
 
TrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data DiscoveryTrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
 
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers:  A Deep Dive into Serverless Spatial Data and FMECloud Frontiers:  A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
 
AI+A11Y 11MAY2024 HYDERBAD GAAD 2024 - HelloA11Y (11 May 2024)
AI+A11Y 11MAY2024 HYDERBAD GAAD 2024 - HelloA11Y (11 May 2024)AI+A11Y 11MAY2024 HYDERBAD GAAD 2024 - HelloA11Y (11 May 2024)
AI+A11Y 11MAY2024 HYDERBAD GAAD 2024 - HelloA11Y (11 May 2024)
 
WSO2 Micro Integrator for Enterprise Integration in a Decentralized, Microser...
WSO2 Micro Integrator for Enterprise Integration in a Decentralized, Microser...WSO2 Micro Integrator for Enterprise Integration in a Decentralized, Microser...
WSO2 Micro Integrator for Enterprise Integration in a Decentralized, Microser...
 

Scalable keyword search on large RDF data

  • 1. SCALABLE KEYWORD SEARCH ON LARGE RDF DATA
       ABSTRACT
       Keyword search is a useful tool for exploring large RDF datasets. Existing techniques either rely on constructing a distance matrix for pruning the search space or build summaries from the RDF graphs for query processing. These techniques have serious limitations in dealing with realistic, large RDF data with tens of millions of triples. Furthermore, the existing summarization techniques may lead to incorrect or incomplete results. To address these issues, an effective summarization algorithm is proposed to summarize the RDF data. Given a keyword query, the summaries lend significant pruning power to exploratory keyword search and result in much better efficiency than previous works. Unlike existing techniques, this search algorithm always returns correct results. In addition, the summaries can be updated incrementally and efficiently. Experiments on both benchmark and large real RDF data sets show that these techniques are scalable and efficient.
       AIM
       The aim is to design a scalable and exact solution that handles realistic RDF datasets with tens of millions of triples.
  • 2. INTRODUCTION
       RDF (Resource Description Framework) is the de facto standard for data representation on the Web, so it is no surprise that we are inundated with large amounts of rapidly growing RDF data from disparate domains. For instance, the Linked Open Data (LOD) initiative integrates billions of entities from hundreds of sources. Just one of these sources, the DBpedia dataset, describes more than 3.64 million things using more than 1 billion RDF triples, and it contains numerous keywords. Keyword search is an important tool for exploring and searching large data corpora whose structure is either unknown or constantly changing. Accordingly, keyword search has been studied in the context of relational databases, XML documents, and more recently graphs and RDF data.
       EXISTING SYSTEM
       Keyword search on generic graphs
       For keyword search on generic graphs, many techniques assume that graphs fit in memory, an assumption that breaks for big RDF graphs. Some existing approaches maintain a distance matrix for all vertex pairs, and clearly do not scale for graphs with millions of vertices.
  • 3. Furthermore, these works do not consider how to handle updates. A typical approach used here for keyword search is backward search. When backward search is used to find a Steiner tree in the data graph, the problem is NP-hard.
       Large graph data
       The graph data are first partitioned into small subgraphs by heuristics. In this version of the problem, the authors assume that edges across the boundaries of the partitions are weighted. A partition is treated as a supernode, and edges crossing partitions are superedges. The supernodes and superedges form a new graph, which is considered a summary of the underlying graph data. By recursively partitioning and building summaries, a large graph can eventually be reduced to a summary small enough to fit into memory for query processing. During query evaluation, the corresponding supernodes containing the keywords being queried are unfolded, and the respective portions of the graph are fetched from external memory for query processing. This approach cannot be applied to RDF.
       Keyword search for RDF data
       Search is first applied on the schema/summary of the data to identify promising relations that could contain all the keywords being queried. Then, by translating these relations into search patterns and executing them against the RDF data, the actual subgraphs are retrieved.
  • 4. Disadvantages
       - Returns incorrect answers, i.e., the keyword search returns answers that do not correspond to real subgraphs, or misses valid matches from the underlying RDF data.
       - Does not scale to typical RDF datasets with tens of millions of triples.
       PROPOSED SYSTEM
       - Design a scalable and exact solution that handles realistic RDF datasets with tens of millions of triples.
       - Use the SPARQL query language to process the RDF data efficiently.
       - Efficiently retrieve every partition from the data by using SPARQL queries together with any RDF store, without explicitly storing the partitions.
       - The approach starts by splitting the RDF graph into multiple, smaller partitions. It then defines a minimal set of common type-based structures that summarizes the partitions. Intuitively, the summary keeps track of the distinct structures from all the partitions.
  • 5. Advantages
       - Better efficiency
       - Overcomes scalability issues
       - Better results
       LITERATURE SUMMARY
       For keyword search on generic graphs, many techniques assume that graphs fit in memory, an assumption that breaks for big RDF graphs. For instance, some existing approaches maintain a distance matrix for all vertex pairs, and clearly do not scale for graphs with millions of vertices. Furthermore, these works do not consider how to handle updates. He et al. proposed a tractable problem that does not aim to find a Steiner tree and can be answered by using backward search. Keyword search over large graph data has also been studied. The graph data are first partitioned into small subgraphs by heuristics. It is assumed that edges across the boundaries of the partitions are weighted. A partition is treated as a supernode, and edges crossing partitions are superedges. The supernodes and superedges form a new graph, which is considered a summary of the underlying graph data. By recursively partitioning and building summaries, a large graph can eventually be reduced to a summary small enough to fit into memory for query processing. During query evaluation, the corresponding supernodes containing the
  • 6. keywords being queried are unfolded, and the respective portions of the graph are fetched from external memory for query processing.
       Hardware requirements:
       Processor : Any processor above 500 MHz
       RAM : 128 MB
       Hard Disk : 10 GB
       Compact Disk : 650 MB
       Input device : Standard keyboard and mouse
       Output device : VGA and high-resolution monitor
       Software requirements:
       Operating System : Windows family
       Language : JDK 1.5
       Database : RDF
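
The supernode and superedge summary described under "Large graph data" can be illustrated with a small Python sketch. It is not the implementation from the surveyed work: the function name `build_summary`, the input shapes, and the choice to keep the minimum weight among the edges that cross the same pair of partitions are all assumptions made for illustration. The partition assignment is taken as given, since the slides only say it is produced by heuristics.

```python
from collections import defaultdict


def build_summary(edges, partition_of):
    """Collapse each partition into a supernode and crossing edges into superedges.

    edges:        iterable of (source, target, weight) tuples
    partition_of: dict mapping every vertex to its partition id
    Returns (supernodes, superedges), where supernodes maps a partition id to
    the set of vertices it contains, and superedges maps a directed pair of
    partition ids to the smallest weight among the edges crossing between them
    (keeping the minimum is an assumption of this sketch).
    """
    supernodes = defaultdict(set)
    for vertex, pid in partition_of.items():
        supernodes[pid].add(vertex)

    superedges = {}
    for source, target, weight in edges:
        p_src, p_dst = partition_of[source], partition_of[target]
        if p_src == p_dst:
            continue  # edge stays inside one partition, so it is hidden in the supernode
        key = (p_src, p_dst)
        superedges[key] = min(weight, superedges.get(key, weight))

    return dict(supernodes), superedges


# Hypothetical toy graph: four vertices split into two partitions.
example_edges = [("a", "b", 1), ("b", "c", 5), ("a", "d", 2), ("c", "d", 1)]
example_partition = {"a": 0, "b": 0, "c": 1, "d": 1}
supernodes, superedges = build_summary(example_edges, example_partition)
```

Applying the same function again to the summary graph, with a coarser partition assignment, corresponds to the recursive summarization the slides mention.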
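
The proposed system's idea of splitting the RDF graph into partitions and keeping only the distinct type-based structures can be sketched in the same way. This is a deliberately simplified illustration, not the proposed algorithm itself: it assumes each partition is just the set of outgoing non-type triples of one subject, that every entity has at most one `rdf:type`, and that two partitions share a structure only when their typed edges match exactly. The names `summarize`, `RDF_TYPE`, and `UNTYPED`, and the sample triples, are hypothetical.

```python
from collections import defaultdict

RDF_TYPE = "rdf:type"
UNTYPED = "UNTYPED"  # placeholder for literals and entities without a declared type


def summarize(triples):
    """Group one-hop partitions by their type-based structure.

    triples: iterable of (subject, predicate, object) strings
    Returns a dict mapping each distinct structure to the list of partition
    roots that share it. A structure is (root type, frozenset of
    (predicate, object type) pairs).
    """
    triples = list(triples)

    # Record the declared type of each entity (last declaration wins).
    type_of = {s: o for s, p, o in triples if p == RDF_TYPE}

    # Partition step (assumption): one partition per subject, one hop deep.
    partitions = defaultdict(set)
    for s, p, o in triples:
        if p != RDF_TYPE:
            partitions[s].add((p, o))

    # Summarization step: replace entities by their types and deduplicate.
    summary = defaultdict(list)
    for root, out_edges in partitions.items():
        structure = (
            type_of.get(root, UNTYPED),
            frozenset((p, type_of.get(o, UNTYPED)) for p, o in out_edges),
        )
        summary[structure].append(root)
    return dict(summary)


# Hypothetical sample data: three space missions, two of which share a structure.
example_triples = [
    ("URI1", RDF_TYPE, "SpaceMission"),
    ("URI1", "launchpad", "URI2"),
    ("URI2", RDF_TYPE, "Building"),
    ("URI3", RDF_TYPE, "SpaceMission"),
    ("URI3", "launchpad", "URI4"),
    ("URI4", RDF_TYPE, "Building"),
    ("URI5", RDF_TYPE, "SpaceMission"),
    ("URI5", "launchpad", "URI2"),
    ("URI5", "name", "Apollo 11"),
]
example_summary = summarize(example_triples)
```

In this toy data the partitions rooted at `URI1` and `URI3` collapse into a single structure, so the summary stores two structures for three partitions. A keyword search could consult the summary first and fetch from the RDF store only the partitions whose structure can connect the queried keywords.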