Search Engines
Presented by
ganesh kavhar
What Are They?
 Four Components
 A database of references to webpages
 An indexing robot that crawls the WWW
 An interface
   Enables users to submit queries
   Displays results
 Information retrieval system
 Each is unique, but most work in much the same way
Database
 Where user's query is matched
 Contains only essential parts of pages
 Only includes pages that were indexed
 Search engines are always out of date: the database only reflects pages as of their last crawl
Web Crawler
 A robot that follows links
 Records data it finds
 Words in the webpage
 Metadata
 ALT attributes in IMG tags
 Robots Exclusion Protocol (robots.txt)
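
As a rough illustration of the crawler slide, the sketch below records the words, metadata, ALT attributes and links of a single page, then filters the links through a robots.txt rule set. It uses only the Python standard library. The sample page, the robots.txt content, the `ExampleBot` user agent and the `PageRecorder` class are all made up for illustration, and a real crawler would also fetch pages over the network and queue the allowed links.

```python
from html.parser import HTMLParser
from urllib.robotparser import RobotFileParser


class PageRecorder(HTMLParser):
    """Hypothetical helper: collects what a crawler might record from one page."""

    def __init__(self):
        super().__init__()
        self.links = []      # href targets the robot could follow next
        self.words = []      # lowercased words from the page's text nodes
        self.metadata = {}   # <meta name=... content=...> pairs
        self.alt_texts = []  # ALT attributes of <img> tags

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "a" and attrs.get("href"):
            self.links.append(attrs["href"])
        elif tag == "meta" and attrs.get("name"):
            self.metadata[attrs["name"].lower()] = attrs.get("content", "")
        elif tag == "img" and attrs.get("alt"):
            self.alt_texts.append(attrs["alt"])

    def handle_data(self, data):
        self.words.extend(data.lower().split())


# Made-up page and robots.txt, used only to exercise the sketch.
SAMPLE_PAGE = """
<html><head>
<title>Example</title>
<meta name="description" content="A sample page">
</head><body>
<p>Search engines index words</p>
<img src="logo.png" alt="Example logo">
<a href="/about.html">About</a>
<a href="/private/notes.html">Notes</a>
</body></html>
"""

SAMPLE_ROBOTS = """
User-agent: *
Disallow: /private/
""".splitlines()

recorder = PageRecorder()
recorder.feed(SAMPLE_PAGE)

robots = RobotFileParser()
robots.parse(SAMPLE_ROBOTS)

# Keep only the links that the exclusion rules allow this robot to fetch.
allowed = [link for link in recorder.links
           if robots.can_fetch("ExampleBot", "http://example.com" + link)]
print(allowed)
```

Here the link under `/private/` is dropped because the sample robots.txt disallows that path for every user agent.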
Search Engine Interfaces
 Gathers input from users
 Presents results from the IR system
 Often in ranked order
Search Engine Interfaces
 Input
 User requirements
   Search expression, search limits
 Presentation style
   Presentation format, search type
Search Engine Interfaces
 Output
 Results
 Descriptions
 Clusters
Search Term Matching
 Trying to find a match in the database
 Two main methods
 Keyword searching
   Matching single terms, computing cosine similarity
 Concept-based searching
   Examining clusters of words
   Attempt to determine the meaning of the query and find records related to that meaning
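
The "computing cosine" step in keyword searching can be sketched as follows. This is a minimal bag-of-words version that assumes plain whitespace tokenization and raw term counts. The documents, the query and the `cosine_similarity` function name are invented for the example, and real engines typically weight terms rather than using raw counts.

```python
import math
from collections import Counter


def cosine_similarity(query, document):
    """Cosine of the angle between two bag-of-words term-count vectors."""
    q = Counter(query.lower().split())
    d = Counter(document.lower().split())
    dot = sum(q[term] * d[term] for term in q)
    norm_q = math.sqrt(sum(c * c for c in q.values()))
    norm_d = math.sqrt(sum(c * c for c in d.values()))
    if norm_q == 0 or norm_d == 0:
        return 0.0
    return dot / (norm_q * norm_d)


# Made-up records standing in for the search engine's database.
docs = {
    "doc1": "web crawler follows links",
    "doc2": "search engine ranks web pages",
    "doc3": "cooking pasta at home",
}
query = "web search engine"

# Rank the records by similarity to the query, best match first.
ranked = sorted(docs, key=lambda name: cosine_similarity(query, docs[name]),
                reverse=True)
print(ranked)
```

A record that shares no terms with the query scores 0, and a record identical to the query scores 1.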
Basic IR Features
 Boolean operators
 AND, OR, NOT, grouping
 Extended operators
 NEAR, ADJACENT, phrase quotes (")
 Stop word deletion
 Stemming
 Searching in fields (e.g. host)
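
Several of the features above can be shown together in a small sketch: stop word deletion and stemming when indexing, then AND, OR and NOT as set operations over an inverted index. The stop word list, the crude suffix-stripping `stem` function and the three documents are illustrative assumptions only. A production system would use a proper stemmer, and the extended operators and field searching are not covered here.

```python
STOP_WORDS = {"a", "an", "the", "of", "in", "and", "or", "to"}  # illustrative list


def stem(word):
    """Very crude suffix stripping, a stand-in for a real stemming algorithm."""
    for suffix in ("ing", "es", "s"):
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def index_terms(text):
    """Lowercase the text, drop stop words, and stem what is left."""
    return {stem(w) for w in text.lower().split() if w not in STOP_WORDS}


def build_index(docs):
    """Inverted index: term -> set of ids of the documents containing it."""
    index = {}
    for doc_id, text in docs.items():
        for term in index_terms(text):
            index.setdefault(term, set()).add(doc_id)
    return index


# Made-up documents, used only to exercise the sketch.
docs = {
    1: "the crawler follows links",
    2: "crawlers index the pages",
    3: "ranking of indexed pages",
}
index = build_index(docs)


def lookup(word):
    """Ids of documents matching a single query term (stemmed like the index)."""
    return index.get(stem(word.lower()), set())


crawler_and_pages = lookup("crawler") & lookup("pages")   # AND
links_or_ranking = lookup("links") | lookup("ranking")    # OR
pages_not_crawler = lookup("pages") - lookup("crawler")   # NOT
print(crawler_and_pages, links_or_ranking, pages_not_crawler)
```

Because of stemming, the query term "crawler" also matches the document that says "crawlers".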
Ranked Output
 Most SEs produce ranked lists by applying
simple rules:
 Early words are more important
 Title is very important
 Frequency of occurrence matters for some
 Infrequent words matter more
 Modification date
 Google is different:
 PageRank™ method, based on popularity
 Links as money
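
One common way to express the "infrequent words matter more" rule above is inverse document frequency, sketched below as log(N / df). The slides do not say which formula any particular engine uses, so treat this as an assumption. The function name and the sample documents are invented for the example.

```python
import math


def inverse_document_frequency(docs):
    """Map each term to log(N / df), where df counts the docs containing it."""
    n_docs = len(docs)
    doc_freq = {}
    for text in docs:
        for term in set(text.lower().split()):
            doc_freq[term] = doc_freq.get(term, 0) + 1
    return {term: math.log(n_docs / df) for term, df in doc_freq.items()}


# Made-up collection: "search" is in every document, "cluster" in only one.
docs = [
    "search engines rank pages",
    "search engines crawl pages",
    "search results cluster",
]
idf = inverse_document_frequency(docs)
print(idf["search"], idf["pages"], idf["cluster"])
```

A term that appears in every document gets weight 0, so it does nothing to separate one result from another, while a rare term gets the largest weight.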
Googlebombing
 Google spoofed from the lecture list
 first hit from 1992
 Official GoogleBlog explanation
What about the Invisible Web?
 Also known as the Deep Web
 Documents that are on the WWW but
not indexed by Search Engines
 Some are available only by submitting
forms
 Some are not generally accessible (in
subnets)
 Some are not in (X)HTML format
The Invisible Web Isn't So Invisible Anymore…
 More search engines parse non-(X)HTML now than before
 Because of awareness of the problem, companies are making more content available using
 Stable URLs
 Robot-friendly sitemaps
 But much content is still not indexed
But, there's still plenty of important yet invisible docs
 How to find them?
 Many of them are in databases
 No one search engine covers everything
 Use database tools from the university's library
 Especially for research articles
 Use multiple search engines or a meta-crawler
 Dogpile is the most famous
Search Engines
A Summary of Practical Advice
How To Succeed With SEs
 As a surfer:
 If you don't know what you are looking for
   Use multiple SEs, or a meta-crawler
   Search within results
 If you know what you are looking for
   Use multiple SEs, or a meta-crawler
   Use Boolean expressions or search within results
   Consider specialized engines
How To Succeed With SEs
 As a creator:
 HTML level
   Always use ALT attributes with <IMG>, etc.
   Avoid frames
 Make it easier to index
   Don't expect SEs to find your pages
   Make links between your pages
   Use metadata
     Informal: <meta name="description" …>
     Formal: Dublin Core and others
 Increase your page's popularity
   Don't use systematic reciprocal linking: rings, exchanges, lists
   The PageRank™ a link passes on is inversely proportional to the linking page's outdegree
How To Succeed With SEs
 As a creator (cont.)
 For surfers:
 Use <meta name="description" …>
 Don't expect surfers to start at the top of your hierarchy
   Don't rely on a hierarchy
   Include a context map near the top of each page
   Don't use frames
   Think through dynamic content implications
   Stickiness… is for another day
Follow on
 https://ganeshmkavhar.000webhostapp.com/
 https://github.com/ganeshkavhar
 https://www.csharpcorner.com/members/ganesh-kavhar


Editor's Notes

  1. The PageRank™ (PR) a link gives you is proportional to the PR of the page that links to you, but also inversely proportional to that page's outdegree
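
The note above can be turned into a small iterative sketch: each page splits its current PR evenly among the pages it links to, so a link from a page with many outgoing links is worth less. The damping factor of 0.85, the fixed iteration count, the handling of pages with no outgoing links and the four-page link graph are all assumptions made for illustration. The slides do not describe how Google actually computes the values.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Iteratively estimate PR; links maps each page to the pages it links to.

    Assumes every link target also appears as a key of `links`.
    """
    pages = list(links)
    n = len(pages)
    rank = {page: 1.0 / n for page in pages}
    for _ in range(iterations):
        new_rank = {page: (1.0 - damping) / n for page in pages}
        for page, targets in links.items():
            if not targets:
                # Page with no outgoing links: spread its PR over all pages.
                for other in pages:
                    new_rank[other] += damping * rank[page] / n
                continue
            share = rank[page] / len(targets)  # PR divided by outdegree
            for target in targets:
                new_rank[target] += damping * share
        rank = new_rank
    return rank


# Made-up link graph: C is linked from A, B and D; nothing links to D.
web = {
    "A": ["B", "C"],
    "B": ["C"],
    "C": ["A"],
    "D": ["C"],
}
ranks = pagerank(web)
for page in sorted(ranks, key=ranks.get, reverse=True):
    print(page, round(ranks[page], 3))
```

In this graph C ends up with the highest value because three pages link to it, and D with the lowest because no page links to it.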