SlideShare a Scribd company logo
CS-690 Fall-2009 Querying Capability Modelling and Construction of Deep Web Sources Paper by: Liangcai Shu, Weiyi Meng, Hai Fe, and Clement Yu Presentation by: Fabian Alenius
Problem definition ,[object Object]
Information in Deep Web is 400-550 times larger than the surface web
Crawlers need to extract
information
Problem definition ,[object Object]
A form submission consists of a set of values, one for every field of the form
Querying Capability ,[object Object]
An attribute is a field in a form
A query is a set of attributes
A set query instances are a set of attribute and value pairs, e.g. {<Author, Tolkien>, <Title, Bilbo>}
Four types of attributes Functional Attribute Range Attribute Categorical Attribute Value-Infinite Attribute
Conditional Dependency ,[object Object]
Conditional dependency – value of  C  insignificant if  A  is null
A -> C
Attribute Unions ,[object Object]
T is not attribute type, but value type (e.g. phone, zip code, etc.)
Valid queries  – accepted;  Invalid queries  – rejected

More Related Content

Similar to Querying Capability Modeling

Table Recognition
Table RecognitionTable Recognition
Table Recognition
Giorgio Orsi
 
Web clustring engine
Web clustring engineWeb clustring engine
Web clustring engine
FACTS Computer Software L.L.C
 
Fasterflect
FasterflectFasterflect
Fasterflect
Buu Nguyen
 
Exploiting web search engines to search structured
Exploiting web search engines to search structuredExploiting web search engines to search structured
Exploiting web search engines to search structured
Nita Pawar
 
Linq
LinqLinq
Linq
tnkreddy
 
Linq
LinqLinq
Linq
tnkreddy
 
Building Semantic Web Portals with WebML
Building Semantic Web Portals with WebMLBuilding Semantic Web Portals with WebML
Building Semantic Web Portals with WebML
Marco Brambilla
 
Ado.Net Data Services (Astoria)
Ado.Net Data Services (Astoria)Ado.Net Data Services (Astoria)
Ado.Net Data Services (Astoria)
Igor Moochnick
 
Topic Maps: Theory & Practice
Topic Maps: Theory & PracticeTopic Maps: Theory & Practice
Topic Maps: Theory & Practice
bbater
 
Breaking down data silos with the open data protocol
Breaking down data silos with the open data protocolBreaking down data silos with the open data protocol
Breaking down data silos with the open data protocol
Woodruff Solutions LLC
 
Letting In the Light: Using Solr as an External Search Component
Letting In the Light: Using Solr as an External Search ComponentLetting In the Light: Using Solr as an External Search Component
Letting In the Light: Using Solr as an External Search Component
Jay Luker
 
Ellerslie User Group - ReST Presentation
Ellerslie User Group - ReST PresentationEllerslie User Group - ReST Presentation
Ellerslie User Group - ReST Presentation
Alex Henderson
 
Introducing AWS AppSync: serverless data driven apps with real-time and offli...
Introducing AWS AppSync: serverless data driven apps with real-time and offli...Introducing AWS AppSync: serverless data driven apps with real-time and offli...
Introducing AWS AppSync: serverless data driven apps with real-time and offli...
Amazon Web Services
 
Lerman Vvs14 Ef Tips And Tricks
Lerman Vvs14  Ef Tips And TricksLerman Vvs14  Ef Tips And Tricks
Lerman Vvs14 Ef Tips And Tricks
Julie Lerman
 
Structured Document Search and Retrieval
Structured Document Search and RetrievalStructured Document Search and Retrieval
Structured Document Search and Retrieval
Optum
 
Understanding the Entity API Module
Understanding the Entity API ModuleUnderstanding the Entity API Module
Understanding the Entity API Module
Sergiu Savva
 
Simple Data Binding
Simple Data BindingSimple Data Binding
Simple Data Binding
Doncho Minkov
 
devLink - What's New in C# 4?
devLink - What's New in C# 4?devLink - What's New in C# 4?
devLink - What's New in C# 4?
Kevin Pilch
 
Linq Sanjay Vyas
Linq   Sanjay VyasLinq   Sanjay Vyas
Linq Sanjay Vyas
rsnarayanan
 
Hibernate Session 4
Hibernate Session 4Hibernate Session 4
Hibernate Session 4
b_kathir
 

Similar to Querying Capability Modeling (20)

Table Recognition
Table RecognitionTable Recognition
Table Recognition
 
Web clustring engine
Web clustring engineWeb clustring engine
Web clustring engine
 
Fasterflect
FasterflectFasterflect
Fasterflect
 
Exploiting web search engines to search structured
Exploiting web search engines to search structuredExploiting web search engines to search structured
Exploiting web search engines to search structured
 
Linq
LinqLinq
Linq
 
Linq
LinqLinq
Linq
 
Building Semantic Web Portals with WebML
Building Semantic Web Portals with WebMLBuilding Semantic Web Portals with WebML
Building Semantic Web Portals with WebML
 
Ado.Net Data Services (Astoria)
Ado.Net Data Services (Astoria)Ado.Net Data Services (Astoria)
Ado.Net Data Services (Astoria)
 
Topic Maps: Theory & Practice
Topic Maps: Theory & PracticeTopic Maps: Theory & Practice
Topic Maps: Theory & Practice
 
Breaking down data silos with the open data protocol
Breaking down data silos with the open data protocolBreaking down data silos with the open data protocol
Breaking down data silos with the open data protocol
 
Letting In the Light: Using Solr as an External Search Component
Letting In the Light: Using Solr as an External Search ComponentLetting In the Light: Using Solr as an External Search Component
Letting In the Light: Using Solr as an External Search Component
 
Ellerslie User Group - ReST Presentation
Ellerslie User Group - ReST PresentationEllerslie User Group - ReST Presentation
Ellerslie User Group - ReST Presentation
 
Introducing AWS AppSync: serverless data driven apps with real-time and offli...
Introducing AWS AppSync: serverless data driven apps with real-time and offli...Introducing AWS AppSync: serverless data driven apps with real-time and offli...
Introducing AWS AppSync: serverless data driven apps with real-time and offli...
 
Lerman Vvs14 Ef Tips And Tricks
Lerman Vvs14  Ef Tips And TricksLerman Vvs14  Ef Tips And Tricks
Lerman Vvs14 Ef Tips And Tricks
 
Structured Document Search and Retrieval
Structured Document Search and RetrievalStructured Document Search and Retrieval
Structured Document Search and Retrieval
 
Understanding the Entity API Module
Understanding the Entity API ModuleUnderstanding the Entity API Module
Understanding the Entity API Module
 
Simple Data Binding
Simple Data BindingSimple Data Binding
Simple Data Binding
 
devLink - What's New in C# 4?
devLink - What's New in C# 4?devLink - What's New in C# 4?
devLink - What's New in C# 4?
 
Linq Sanjay Vyas
Linq   Sanjay VyasLinq   Sanjay Vyas
Linq Sanjay Vyas
 
Hibernate Session 4
Hibernate Session 4Hibernate Session 4
Hibernate Session 4
 

Recently uploaded

A Comprehensive Guide to DeFi Development Services in 2024
A Comprehensive Guide to DeFi Development Services in 2024A Comprehensive Guide to DeFi Development Services in 2024
A Comprehensive Guide to DeFi Development Services in 2024
Intelisync
 
leewayhertz.com-AI in predictive maintenance Use cases technologies benefits ...
leewayhertz.com-AI in predictive maintenance Use cases technologies benefits ...leewayhertz.com-AI in predictive maintenance Use cases technologies benefits ...
leewayhertz.com-AI in predictive maintenance Use cases technologies benefits ...
alexjohnson7307
 
Taking AI to the Next Level in Manufacturing.pdf
Taking AI to the Next Level in Manufacturing.pdfTaking AI to the Next Level in Manufacturing.pdf
Taking AI to the Next Level in Manufacturing.pdf
ssuserfac0301
 
HCL Notes and Domino License Cost Reduction in the World of DLAU
HCL Notes and Domino License Cost Reduction in the World of DLAUHCL Notes and Domino License Cost Reduction in the World of DLAU
HCL Notes and Domino License Cost Reduction in the World of DLAU
panagenda
 
Energy Efficient Video Encoding for Cloud and Edge Computing Instances
Energy Efficient Video Encoding for Cloud and Edge Computing InstancesEnergy Efficient Video Encoding for Cloud and Edge Computing Instances
Energy Efficient Video Encoding for Cloud and Edge Computing Instances
Alpen-Adria-Universität
 
Skybuffer SAM4U tool for SAP license adoption
Skybuffer SAM4U tool for SAP license adoptionSkybuffer SAM4U tool for SAP license adoption
Skybuffer SAM4U tool for SAP license adoption
Tatiana Kojar
 
Skybuffer AI: Advanced Conversational and Generative AI Solution on SAP Busin...
Skybuffer AI: Advanced Conversational and Generative AI Solution on SAP Busin...Skybuffer AI: Advanced Conversational and Generative AI Solution on SAP Busin...
Skybuffer AI: Advanced Conversational and Generative AI Solution on SAP Busin...
Tatiana Kojar
 
Driving Business Innovation: Latest Generative AI Advancements & Success Story
Driving Business Innovation: Latest Generative AI Advancements & Success StoryDriving Business Innovation: Latest Generative AI Advancements & Success Story
Driving Business Innovation: Latest Generative AI Advancements & Success Story
Safe Software
 
How to Interpret Trends in the Kalyan Rajdhani Mix Chart.pdf
How to Interpret Trends in the Kalyan Rajdhani Mix Chart.pdfHow to Interpret Trends in the Kalyan Rajdhani Mix Chart.pdf
How to Interpret Trends in the Kalyan Rajdhani Mix Chart.pdf
Chart Kalyan
 
Let's Integrate MuleSoft RPA, COMPOSER, APM with AWS IDP along with Slack
Let's Integrate MuleSoft RPA, COMPOSER, APM with AWS IDP along with SlackLet's Integrate MuleSoft RPA, COMPOSER, APM with AWS IDP along with Slack
Let's Integrate MuleSoft RPA, COMPOSER, APM with AWS IDP along with Slack
shyamraj55
 
Trusted Execution Environment for Decentralized Process Mining
Trusted Execution Environment for Decentralized Process MiningTrusted Execution Environment for Decentralized Process Mining
Trusted Execution Environment for Decentralized Process Mining
LucaBarbaro3
 
Recommendation System using RAG Architecture
Recommendation System using RAG ArchitectureRecommendation System using RAG Architecture
Recommendation System using RAG Architecture
fredae14
 
Building Production Ready Search Pipelines with Spark and Milvus
Building Production Ready Search Pipelines with Spark and MilvusBuilding Production Ready Search Pipelines with Spark and Milvus
Building Production Ready Search Pipelines with Spark and Milvus
Zilliz
 
Fueling AI with Great Data with Airbyte Webinar
Fueling AI with Great Data with Airbyte WebinarFueling AI with Great Data with Airbyte Webinar
Fueling AI with Great Data with Airbyte Webinar
Zilliz
 
Salesforce Integration for Bonterra Impact Management (fka Social Solutions A...
Salesforce Integration for Bonterra Impact Management (fka Social Solutions A...Salesforce Integration for Bonterra Impact Management (fka Social Solutions A...
Salesforce Integration for Bonterra Impact Management (fka Social Solutions A...
Jeffrey Haguewood
 
Introduction of Cybersecurity with OSS at Code Europe 2024
Introduction of Cybersecurity with OSS  at Code Europe 2024Introduction of Cybersecurity with OSS  at Code Europe 2024
Introduction of Cybersecurity with OSS at Code Europe 2024
Hiroshi SHIBATA
 
Programming Foundation Models with DSPy - Meetup Slides
Programming Foundation Models with DSPy - Meetup SlidesProgramming Foundation Models with DSPy - Meetup Slides
Programming Foundation Models with DSPy - Meetup Slides
Zilliz
 
5th LF Energy Power Grid Model Meet-up Slides
5th LF Energy Power Grid Model Meet-up Slides5th LF Energy Power Grid Model Meet-up Slides
5th LF Energy Power Grid Model Meet-up Slides
DanBrown980551
 
TrustArc Webinar - 2024 Global Privacy Survey
TrustArc Webinar - 2024 Global Privacy SurveyTrustArc Webinar - 2024 Global Privacy Survey
TrustArc Webinar - 2024 Global Privacy Survey
TrustArc
 
UI5 Controls simplified - UI5con2024 presentation
UI5 Controls simplified - UI5con2024 presentationUI5 Controls simplified - UI5con2024 presentation
UI5 Controls simplified - UI5con2024 presentation
Wouter Lemaire
 

Recently uploaded (20)

A Comprehensive Guide to DeFi Development Services in 2024
A Comprehensive Guide to DeFi Development Services in 2024A Comprehensive Guide to DeFi Development Services in 2024
A Comprehensive Guide to DeFi Development Services in 2024
 
leewayhertz.com-AI in predictive maintenance Use cases technologies benefits ...
leewayhertz.com-AI in predictive maintenance Use cases technologies benefits ...leewayhertz.com-AI in predictive maintenance Use cases technologies benefits ...
leewayhertz.com-AI in predictive maintenance Use cases technologies benefits ...
 
Taking AI to the Next Level in Manufacturing.pdf
Taking AI to the Next Level in Manufacturing.pdfTaking AI to the Next Level in Manufacturing.pdf
Taking AI to the Next Level in Manufacturing.pdf
 
HCL Notes and Domino License Cost Reduction in the World of DLAU
HCL Notes and Domino License Cost Reduction in the World of DLAUHCL Notes and Domino License Cost Reduction in the World of DLAU
HCL Notes and Domino License Cost Reduction in the World of DLAU
 
Energy Efficient Video Encoding for Cloud and Edge Computing Instances
Energy Efficient Video Encoding for Cloud and Edge Computing InstancesEnergy Efficient Video Encoding for Cloud and Edge Computing Instances
Energy Efficient Video Encoding for Cloud and Edge Computing Instances
 
Skybuffer SAM4U tool for SAP license adoption
Skybuffer SAM4U tool for SAP license adoptionSkybuffer SAM4U tool for SAP license adoption
Skybuffer SAM4U tool for SAP license adoption
 
Skybuffer AI: Advanced Conversational and Generative AI Solution on SAP Busin...
Skybuffer AI: Advanced Conversational and Generative AI Solution on SAP Busin...Skybuffer AI: Advanced Conversational and Generative AI Solution on SAP Busin...
Skybuffer AI: Advanced Conversational and Generative AI Solution on SAP Busin...
 
Driving Business Innovation: Latest Generative AI Advancements & Success Story
Driving Business Innovation: Latest Generative AI Advancements & Success StoryDriving Business Innovation: Latest Generative AI Advancements & Success Story
Driving Business Innovation: Latest Generative AI Advancements & Success Story
 
How to Interpret Trends in the Kalyan Rajdhani Mix Chart.pdf
How to Interpret Trends in the Kalyan Rajdhani Mix Chart.pdfHow to Interpret Trends in the Kalyan Rajdhani Mix Chart.pdf
How to Interpret Trends in the Kalyan Rajdhani Mix Chart.pdf
 
Let's Integrate MuleSoft RPA, COMPOSER, APM with AWS IDP along with Slack
Let's Integrate MuleSoft RPA, COMPOSER, APM with AWS IDP along with SlackLet's Integrate MuleSoft RPA, COMPOSER, APM with AWS IDP along with Slack
Let's Integrate MuleSoft RPA, COMPOSER, APM with AWS IDP along with Slack
 
Trusted Execution Environment for Decentralized Process Mining
Trusted Execution Environment for Decentralized Process MiningTrusted Execution Environment for Decentralized Process Mining
Trusted Execution Environment for Decentralized Process Mining
 
Recommendation System using RAG Architecture
Recommendation System using RAG ArchitectureRecommendation System using RAG Architecture
Recommendation System using RAG Architecture
 
Building Production Ready Search Pipelines with Spark and Milvus
Building Production Ready Search Pipelines with Spark and MilvusBuilding Production Ready Search Pipelines with Spark and Milvus
Building Production Ready Search Pipelines with Spark and Milvus
 
Fueling AI with Great Data with Airbyte Webinar
Fueling AI with Great Data with Airbyte WebinarFueling AI with Great Data with Airbyte Webinar
Fueling AI with Great Data with Airbyte Webinar
 
Salesforce Integration for Bonterra Impact Management (fka Social Solutions A...
Salesforce Integration for Bonterra Impact Management (fka Social Solutions A...Salesforce Integration for Bonterra Impact Management (fka Social Solutions A...
Salesforce Integration for Bonterra Impact Management (fka Social Solutions A...
 
Introduction of Cybersecurity with OSS at Code Europe 2024
Introduction of Cybersecurity with OSS  at Code Europe 2024Introduction of Cybersecurity with OSS  at Code Europe 2024
Introduction of Cybersecurity with OSS at Code Europe 2024
 
Programming Foundation Models with DSPy - Meetup Slides
Programming Foundation Models with DSPy - Meetup SlidesProgramming Foundation Models with DSPy - Meetup Slides
Programming Foundation Models with DSPy - Meetup Slides
 
5th LF Energy Power Grid Model Meet-up Slides
5th LF Energy Power Grid Model Meet-up Slides5th LF Energy Power Grid Model Meet-up Slides
5th LF Energy Power Grid Model Meet-up Slides
 
TrustArc Webinar - 2024 Global Privacy Survey
TrustArc Webinar - 2024 Global Privacy SurveyTrustArc Webinar - 2024 Global Privacy Survey
TrustArc Webinar - 2024 Global Privacy Survey
 
UI5 Controls simplified - UI5con2024 presentation
UI5 Controls simplified - UI5con2024 presentationUI5 Controls simplified - UI5con2024 presentation
UI5 Controls simplified - UI5con2024 presentation
 

Querying Capability Modeling

Editor's Notes

  1. blabla
  2. Functional Attribute – only affects the display of results Range Attribute – specifies a range, for example price or year Categorical Attribute – fixed set of distinct value, usually selections lists Value-Infinite Attribute – Infinite number of possible values, typically textboxes Value-Infinite can be categorized into more types, like phone-number, zip code, etc. Not addressed in this paper
  3. Enough to fill in one field
  4. A query is deemed accepted even if it returns no results Goal is find the types of queries that are accepted
  5. First extract HTML form and output XML Feed into Query Capability Analyzer Submit Queries and get result Output is a set of atomic queries
  6. Recall is lower than precision Method is conservative, if no strong evidence the system treats as rejected