Impulse Technologies
                                      Beacons U to World of technology
        044-42133143, 98401 03301,9841091117 ieeeprojects@yahoo.com www.impulse.net.in
            A Survey on Region Extractors From Web Documents
   Abstract
          Extracting information from web documents has become a research area in
   which new proposals appear year after year. This has motivated several
   researchers to work on surveys that attempt to provide an overall picture of the
   many existing proposals. Unfortunately, none of these surveys provides a complete
   picture, since they do not take region extractors into account. These tools act as
   a kind of preprocessor, since they help information extractors focus on the regions
   of a web document that contain relevant information. With the increasing complexity
   of web documents, region extractors are becoming a necessity for extracting
   information from many web sites. Beyond information extraction, region extractors
   have also found their way into information retrieval, focused web crawling, topic
   distillation, adaptive content delivery, mashups, and metasearch engines. In this
   article, we survey the existing proposals regarding region extractors and compare
   them side by side.




  Your own ideas, or any project from any company, can be implemented
at a better price (all projects can be done in Java or DotNet, whichever the student wants)

