ONTOLOGY BASED WEB CRAWLER
SUBMITTED BY:
Sachin Murwariya (9910103457)
WHAT IS RSS
RSS is a standard format for syndicating headlines and other content.
RSS is written in XML (eXtensible Markup Language), a markup language
similar to HTML. Every field is defined, and tags denote each field's classification.
As in HTML, well-formed markup requires that every opened tag is also closed.
Example: <title> Title of Item in Feed </title>
RSS has been around for more than a decade, but only recently has the standard been
embraced by bloggers, webmasters and large news portals as a means of distributing
information in a standardized format.
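A minimal sketch of reading such a feed, assuming Python and its standard xml.etree.ElementTree module. The sample feed, its URLs and the helper name parse_items are made up for illustration and are not part of the project.

```python
import xml.etree.ElementTree as ET

# Hypothetical two-item RSS 2.0 feed, inlined so the example needs no network.
SAMPLE_FEED = """<rss version="2.0">
  <channel>
    <title>Example News</title>
    <link>http://example.com/</link>
    <description>Sample headlines</description>
    <item>
      <title> Title of Item in Feed </title>
      <link>http://example.com/item1</link>
      <description>First sample item.</description>
    </item>
    <item>
      <title>Second Item</title>
      <link>http://example.com/item2</link>
      <description>Second sample item.</description>
    </item>
  </channel>
</rss>"""


def parse_items(feed_xml):
    """Return a list of dicts, one per <item> element in the feed."""
    root = ET.fromstring(feed_xml)
    items = []
    for item in root.iter("item"):
        items.append({
            # findtext returns the default when the child tag is missing
            "title": item.findtext("title", default="").strip(),
            "link": item.findtext("link", default="").strip(),
            "description": item.findtext("description", default="").strip(),
        })
    return items


items = parse_items(SAMPLE_FEED)
for entry in items:
    print(entry["title"], "->", entry["link"])
```

If a tag is opened but never closed, ET.fromstring raises ET.ParseError, which is how the open/close rule shows up in practice.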
WHAT IS ONTOLOGY BASED WEB CRAWLER
• We present News Personalization using the Semantic
Recommender, a news recommender system that applies
Semantic Web technologies to describe and relate news
content and user preferences in order to produce enhanced
recommendations.
APPLICATIONS
◦ User profile construction
◦ Semantics-based recommendation
◦ Delivering categorised news items
BENEFITS:
• Helps users stay constantly updated
• Ease of operation
• Users can collect information from multiple
sources into a single data stream.
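Collecting several sources into a single stream can be sketched as follows, assuming each source already delivers its items newest first. The item fields, the sample data and the function name merge_sources are hypothetical and only serve the illustration.

```python
import heapq
from datetime import datetime

# Hypothetical items from two sources; each list is already sorted newest first.
source_a = [
    {"source": "A", "title": "Budget announced", "published": datetime(2014, 3, 2, 9, 0)},
    {"source": "A", "title": "Election dates set", "published": datetime(2014, 3, 1, 8, 0)},
]
source_b = [
    {"source": "B", "title": "Team wins final", "published": datetime(2014, 3, 2, 12, 0)},
    {"source": "B", "title": "Storm warning issued", "published": datetime(2014, 2, 28, 18, 0)},
]


def merge_sources(*sources):
    """Merge per-source item lists (each newest first) into one newest-first list."""
    # heapq.merge needs every input sorted in the same direction as the output,
    # so reverse=True matches the newest-first order of the inputs.
    return list(heapq.merge(*sources, key=lambda item: item["published"], reverse=True))


stream = merge_sources(source_a, source_b)
for item in stream:
    print(item["published"].isoformat(), item["source"], item["title"])
```

heapq.merge reads its inputs lazily, so the same approach would also work with generators that yield items as feeds are fetched.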
PROBLEM STATEMENT
• The extremely large volume of online news has created
an urgent need for tools that let users effectively and
efficiently browse topics, detect temporal trends, and
search for news of interest.
• To meet this need, we are building an ONTOLOGY BASED WEB
CRAWLER to extract valuable information from large
online news collections.
TEST PLAN
The purpose of testing is quality assurance, verification and
validation, or reliability estimation.
Unit testing
Component testing
Integration testing
Validation testing
System testing
ARCHITECTURE :
METHODS IN USE:
1. Crawling Algorithm
2. Concept Based Algorithm
3. Recommendation Algorithm
CRAWLING ALGORITHM
CONCEPT BASED ALGORITHM
RECOMMENDATION ALGORITHM
• Recommender systems typically produce a list of
recommendations in one of two ways: through
collaborative filtering or content-based filtering.
• Collaborative filtering approaches build a model
from a user's past behavior (items previously
purchased or selected) and then use that model to
predict items that the user may be interested in.
• Content-based filtering approaches use a series of
discrete characteristics of an item to recommend
additional items with similar properties.
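The two approaches can be sketched side by side. This is an illustration under stated assumptions, not the project's recommendation algorithm: the news items, tags, reading histories and function names are hypothetical, content similarity is measured with cosine similarity over tag counts, and user similarity with the Jaccard index.

```python
import math
from collections import Counter

# Hypothetical news items, each described by a few discrete characteristics (tags).
ITEM_TAGS = {
    "n1": ["sports", "cricket"],
    "n2": ["sports", "football"],
    "n3": ["politics", "election"],
    "n4": ["sports", "cricket", "india"],
}

# Hypothetical reading history: user -> set of item ids already selected.
HISTORY = {
    "alice": {"n1"},
    "bob": {"n1", "n4"},
    "carol": {"n3"},
}


def cosine(a, b):
    """Cosine similarity between two sparse count vectors (Counter objects)."""
    dot = sum(a[k] * b[k] for k in a if k in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def content_based(user):
    """Rank unseen items by how similar their tags are to the tags the user has read."""
    seen = HISTORY[user]
    profile = Counter(tag for item in seen for tag in ITEM_TAGS[item])
    scores = {
        item: cosine(profile, Counter(tags))
        for item, tags in ITEM_TAGS.items() if item not in seen
    }
    # Highest score first; item id breaks ties so the order is deterministic.
    return sorted(scores, key=lambda item: (-scores[item], item))


def collaborative(user):
    """Rank unseen items by the summed similarity of the other users who selected them."""
    seen = HISTORY[user]
    scores = Counter()
    for other, items in HISTORY.items():
        if other == user:
            continue
        union = seen | items
        similarity = len(seen & items) / len(union) if union else 0.0
        for item in items - seen:
            scores[item] += similarity
    # Items reachable only through users with zero similarity are dropped.
    return sorted((i for i in scores if scores[i] > 0), key=lambda item: (-scores[item], item))


print(content_based("alice"))
print(collaborative("alice"))
```

With this sample data the content-based ranking for alice is n4, n2, n3, while the collaborative ranking contains only n4, because bob is the only user whose history overlaps with hers.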
IMPLEMENTATION
• Login page
• Search using keyword
THANK YOU

JIIT;Project 2013-14; CSE/IT
