BY
K.RAJASEKHAR REDDY
        (08Q61A0528)
Contents:
 Introduction
 Wrappers
 Clustering
 System Description
 Working
 Types
 Advantages and Disadvantages
 Conclusion
Introduction:
STAVIES is a system for
 information extraction through
 automatic web wrapper generation
 using clustering techniques.
STAVIES is used in:
 Automatic Information Discovery.


 Extraction of structured web data.
WRAPPERS
 Piece of software to extract the
 useful information from web data
 sources.

 The extracted data is referred to as
 structural tokens.
Categories of Wrappers:
 Site-specific wrappers:
    Extract information from a single web page
    or a family of web pages.
 Generic wrappers:
   Can be applied to almost any page,
   regardless of its structure.
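To make the contrast concrete, here is a minimal sketch of a site-specific wrapper. The page template (`<li class="item">` with a `<b>` title and a `<span>` price) and the names `ITEM` and `site_specific_wrapper` are hypothetical, chosen only for illustration. The pattern works only for pages that follow this one template, which is the limitation a generic wrapper tries to avoid.

```python
import re

# Hypothetical template: one <li class="item"> per record on this one site.
ITEM = re.compile(
    r'<li class="item">\s*<b>(?P<title>[^<]+)</b>\s*'
    r'<span>(?P<price>[^<]+)</span>\s*</li>'
)


def site_specific_wrapper(html):
    """Return one dict per record that matches the site's template."""
    return [m.groupdict() for m in ITEM.finditer(html)]


page = (
    '<ul>'
    '<li class="item"><b>Book A</b><span>10.00</span></li>'
    '<li class="item"><b>Book B</b><span>12.50</span></li>'
    '</ul>'
)
records = site_specific_wrapper(page)
```

A page from another site, with different markup, would yield an empty list, so the pattern would have to be rewritten by hand for each new source.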
CLUSTERING
Process of organizing an input data
 set so that data points in the same
 cluster are more similar to each other
 than to points in different clusters.
Quality Evaluation Measures:
 Cluster Compactness:
 Evaluates how well the subsets of the input are redistributed
 by the clustering system, compared with the whole input set.
 Cluster Separation:
 Indicates the overall dissimilarity among the output
 clusters.
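The two measures can be sketched for one-dimensional data as follows. The formulas are assumptions made for illustration, not necessarily the ones STAVIES uses: compactness is taken as the mean cluster variance divided by the variance of the whole input set (lower means tighter clusters), and separation as the mean distance between cluster centroids (higher means more dissimilar clusters).

```python
from statistics import mean


def variance(points):
    """Mean squared deviation of 1-D points from their mean."""
    m = mean(points)
    return mean((p - m) ** 2 for p in points)


def compactness(clusters):
    """Mean cluster variance relative to the variance of the whole input."""
    all_points = [p for c in clusters for p in c]
    total = variance(all_points)
    if total == 0:
        return 0.0  # all points identical: nothing to compare against
    return mean(variance(c) for c in clusters) / total


def separation(clusters):
    """Mean absolute distance between centroids over all cluster pairs."""
    centroids = [mean(c) for c in clusters]
    pairs = [
        (a, b)
        for i, a in enumerate(centroids)
        for b in centroids[i + 1:]
    ]
    if not pairs:
        return 0.0  # a single cluster has no other cluster to be far from
    return mean(abs(a - b) for a, b in pairs)


clusters = [[1.0, 2.0, 3.0], [11.0, 12.0, 13.0]]
cmp_value = compactness(clusters)
sep_value = separation(clusters)
```

For the sample input the clusters are tight and far apart, so the compactness ratio is small and the separation is large.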
System Description
 Two modules


     1.Transformation module

     2.Extraction module
Phases:
 Preparation Phase:
   1. Validation, correction and XHTML
      generation.

   2. Tree transformation and terminal
      node selection.
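A rough sketch of the second step, assuming terminal nodes are the non-empty text nodes of the page tree and that each one is described by the path of tags leading to it. The class `TerminalNodeCollector` and the function `terminal_nodes` are hypothetical names. Python's tolerant `html.parser` stands in here for the validation, correction and XHTML generation step, which this sketch does not implement.

```python
from html.parser import HTMLParser

# Elements that never have children or a closing tag.
VOID_TAGS = {"br", "img", "hr", "input", "meta", "link"}


class TerminalNodeCollector(HTMLParser):
    """Collects (tag path, text) pairs for every non-empty text node."""

    def __init__(self):
        super().__init__()
        self.path = []        # tags currently open, root first
        self.terminals = []   # collected (path, text) pairs

    def handle_starttag(self, tag, attrs):
        if tag not in VOID_TAGS:
            self.path.append(tag)

    def handle_endtag(self, tag):
        # Pop back to the matching open tag, tolerating unclosed children.
        if tag in self.path:
            while self.path and self.path.pop() != tag:
                pass

    def handle_data(self, data):
        text = data.strip()
        if text:
            self.terminals.append((tuple(self.path), text))


def terminal_nodes(html):
    """Parse the page and return its terminal nodes in document order."""
    parser = TerminalNodeCollector()
    parser.feed(html)
    parser.close()
    return parser.terminals


page = "<html><body><ul><li>A</li><li>B</li></ul><p>Footer</p></body></html>"
nodes = terminal_nodes(page)
```

The two list items end up with the same tag path while the footer paragraph has a different one, which is the kind of regularity the segmentation phase can exploit.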
• Segmentation Phase:
   1. Nodes Comparison.

   2. Hierarchical clustering.

   3. Cluster evaluation and target area
      discovery.

   4. Boundary selection.
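The first two steps can be illustrated with a small sketch. The similarity measure (shared tag-path prefix) and the single-linkage merge rule with a fixed threshold are assumptions for illustration; the slides do not state which comparison measure or linkage STAVIES uses. The names `path_similarity` and `agglomerate` are hypothetical.

```python
def path_similarity(a, b):
    """Fraction of the longer tag path that the two paths share as a prefix."""
    common = 0
    for x, y in zip(a, b):
        if x != y:
            break
        common += 1
    return common / max(len(a), len(b), 1)


def agglomerate(paths, threshold):
    """Single-linkage agglomerative clustering over node indices.

    Repeatedly merges the two most similar clusters and stops when the
    best remaining pair falls below the threshold.
    """
    clusters = [[i] for i in range(len(paths))]
    while len(clusters) > 1:
        best, best_pair = -1.0, None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                sim = max(
                    path_similarity(paths[a], paths[b])
                    for a in clusters[i]
                    for b in clusters[j]
                )
                if sim > best:
                    best, best_pair = sim, (i, j)
        if best < threshold:
            break
        i, j = best_pair
        clusters[i].extend(clusters.pop(j))
    return clusters


paths = [
    ("html", "body", "ul", "li"),
    ("html", "body", "ul", "li"),
    ("html", "body", "ul", "li"),
    ("html", "body", "p"),
]
groups = agglomerate(paths, threshold=0.9)
target_area = max(groups, key=len)  # assume the largest cluster holds the records
```

Here the three list items merge into one cluster and the paragraph stays alone. Treating the largest cluster as the target area is only one possible heuristic for the cluster evaluation step.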
• Information Retrieval Phase:

    1. Information Extraction component.
Working:
Experimental Results:
Types:
 OMINI



 MDR
Advantages:
 Executes in less than 0.4 sec.


 No human assistance is required.


 High performance.
Disadvantage:
 Hard to apply to free text and
 non-template pages.
Conclusion
 STAVIES saves precious time and effort.
 Tested successfully on more than 63,000
  HTML pages from 50 different web
 data sources.
THANK YOU.
Queries????
