SEARCH ENGINES OTHER
THAN GOOGLE
DR MAYANK TRIVEDI
UNIVERSITY LIBRARIAN & SENATE MEMBER
SMT. HANSA MEHTA LIBRARY
THE MAHARAJA SAYAJIRAO UNIVERSITY OF
BARODA
VADODARA-390 001
E-MAIL : LIBRARIAN-HML@MSUBARODA.AC.IN
DATE : 4TH AUGUST, 2021
1
DICTIONARY DEFINITIONS
search
COMPUTING (transitive verb) to examine a computer
file, disk, database, or network for particular
information
engine
something that supplies the driving force or energy to
a movement, system, or trend
search engine
a computer program that searches for particular
keywords and returns a list of documents in which
they were found, especially a commercial service
that scans documents on the Internet
2
WHAT IS A SEARCH ENGINE
 A search engine is a software system that is
designed to carry out web searches.
 It searches the World Wide Web in a
systematic way for particular information
specified in a textual web search query.
 The search results are generally presented as a
list of results, often referred to as search
engine results pages (SERPs).
 The information may be a mix of links to web
pages, images, videos, infographics, articles,
research papers, and other types of files.
3
WHAT IS A SEARCH ENGINE (CONT.)
 Search engines rely on intelligent applications known as spiders, robots, or bots
that crawl the World Wide Web, following links from
website to website.
 The data the robots compile from the web is used to
generate a retrievable index of websites.
 Results are displayed in order of keyword significance.
 The display of results differs from search engine to search
engine.
 The engine uses the keywords to search for documents that relate to
them and then puts the results in order of relevance to
the topic that was searched for.
 Search engines are important because, with over 8 billion web
pages available, it would otherwise be impossible to find the
information that is specifically needed.
 This is why search engines are used to filter the information
on the internet and turn it into results that each
individual can easily access and use within a matter of seconds.
4
WHAT ARE THEY?
 Four Components
 A database of references to webpages
 An indexing robot that crawls the WWW
 An interface
 Enables users to submit queries
 Displays results
 Information retrieval system
 Each search engine is unique, but most work in much the same way
5
FUNCTIONING
 Designed to help people find information stored
on other sites.
 Select pieces from the Internet based on
important keywords.
 Keep an index of the words they find and where
they find them.
 They allow users to look for words or combinations of
words found in that index.
 A search engine operates in the following order:
 Crawling : Follow links to find information
 Indexing : Record what words appear where
 Ranking : What information is a good match to a user
query? What information is inherently good?
 Displaying : Find a good format for the information
 Serving : Handle queries, find pages, display results
6
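The indexing, ranking and serving steps above can be sketched in a few lines of Python. This is only a minimal illustration, not how any real engine works: the `PAGES` dictionary stands in for pages a crawler has already fetched, the URLs are made up, and the ranking simply sums word counts.

```python
from collections import defaultdict

# Hypothetical mini "web": URL -> page text (stands in for crawled pages).
PAGES = {
    "a.example/lions": "asiatic lion habitat in gir forest",
    "b.example/cats": "big cats include the lion and the tiger",
    "c.example/trees": "forest trees and forest birds",
}

def build_index(pages):
    """Indexing: record which words appear on which page, and how often."""
    index = defaultdict(dict)
    for url, text in pages.items():
        for word in text.lower().split():
            index[word][url] = index[word].get(url, 0) + 1
    return index

def search(index, query):
    """Serving: score each page by the summed counts of the query words."""
    scores = defaultdict(int)
    for word in query.lower().split():
        for url, count in index.get(word, {}).items():
            scores[url] += count
    # Ranking: highest score first; ties are broken by URL for a stable order.
    return sorted(scores, key=lambda u: (-scores[u], u))

index = build_index(PAGES)
print(search(index, "lion forest"))
```

The index is built once and reused for every query, which is why a search engine can answer in seconds without re-reading the web each time.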
FUNCTIONING..
 1. Spiders: To find information on the hundreds of
millions of Web pages that exist, a search engine employs
special software robots, called spiders, to build lists of the
words found on Web sites. Spiders take a Web page's
content and create key search words that enable online users
to find the pages they're looking for.
 2. Crawling: When a spider is building its lists, the
process is called Web crawling. In order to build and maintain a
useful list of words, a search engine's spiders have to look
at a lot of pages.
 3. Indexing: The words found are stored in an index for fast access to the data.
 4. Meta tags: The contents of each page are then
analyzed to determine how it should be indexed (for
example, words are extracted from the titles, headings, or
special fields).
 Keyword Density: The ratio of the number of
occurrences of a word or phrase on a page to the total
number of words on the page.
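As a small worked example of the keyword density ratio, the sketch below counts single words only; handling multi-word phrases is left out, and the function name and sample text are illustrative assumptions.

```python
import re

def keyword_density(text, keyword):
    """Return occurrences of `keyword` divided by the total number of words."""
    words = re.findall(r"[a-z0-9']+", text.lower())
    if not words:
        return 0.0
    return words.count(keyword.lower()) / len(words)

page = "Search engines index pages. A search engine ranks pages for each search."
print(f"{keyword_density(page, 'search'):.2%}")
```

Here the word "search" appears 3 times among 12 words, so the density is 3/12.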
7
ORIGIN
 The first internet search engines predate the
debut of the Web in December 1990: Knowbot
Information Service multi-network user
search was first implemented in 1989.
 The first well documented search engine that
searched content files, namely FTP files,
was Archie, which debuted on 10 September
1990
8
INFORMATION RETRIEVAL
 Search engines belong to the field of information retrieval (IR)
 Searching authors, titles and subjects in library
card catalogs or computers
 Document classification and categorization, user
interfaces, data visualization, filtering
 Should make it easy to retrieve the information of interest
 IR results can be inexact as long as the error is
insignificant
 Data is usually natural language text, which is not
always well structured and could be semantically
ambiguous
 Goal: To retrieve all the documents which are
relevant to a query while retrieving as few
non-relevant documents as possible
9
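The goal stated above is usually measured with two ratios, precision (how much of what was retrieved is relevant) and recall (how much of what is relevant was retrieved). The sketch below computes both for made-up document identifiers; the names are illustrative assumptions.

```python
def precision_recall(retrieved, relevant):
    """Return (precision, recall) for a set of retrieved documents."""
    retrieved, relevant = set(retrieved), set(relevant)
    hits = retrieved & relevant
    precision = len(hits) / len(retrieved) if retrieved else 0.0
    recall = len(hits) / len(relevant) if relevant else 0.0
    return precision, recall

retrieved = ["d1", "d2", "d3", "d4"]   # what the engine returned
relevant = ["d2", "d4", "d7"]          # what the user actually needed
p, r = precision_recall(retrieved, relevant)
print(f"precision={p:.2f} recall={r:.2f}")
```

Retrieving fewer non-relevant documents raises precision, while retrieving more of the relevant ones raises recall; the two often pull against each other.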
IMPORTANCE
 Visibility and Rankings
When searching for a service or product online, users are more likely to choose one of the
top five suggestions that the search engine shows them.
 Web Traffic
SEO increases your organic search engine traffic, in turn increasing the number of
visitors your page sees each day.
 Trustworthy
The better your SEO score is, the higher you’ll appear on search engines like Google and
Bing. While ranking higher on Google is appealing to all brands because of increased
visibility, a secondary benefit is the trust you gain with potential customers.
 User Experience
A well-optimized website clearly communicates what product or service is being offered,
how to obtain it and answers any questions surrounding it. By catering the site build to
the user’s experience, search engines like Google and Bing are able to easily pull the
information they need to then relay to users.
 Growth
SEO is key to the growth of your brand. The higher you rank on a search engine for a
variety of high-volume keywords, the more organic (aka non-paid) web traffic your site
will receive
 A website that is well-optimized is more likely to gain more customers and make
more sales.
10
USEFULNESS
 Search engines essentially act as filters for the wealth of
information available on the internet.
 Allow users to quickly and easily find information
 Provide users with search results that lead to relevant information
on high-quality websites.
 Search engines use complex algorithms to assess websites and
web pages and assign them a ranking for relevant search
phrases.
 Shopping, Research, Entertainment
 In addition to searching text, search engines will also let you search
for graphics, sounds and other kinds of files.
 Search engines also provide search access to databases of third
parties which allow you to search through corporate reports,
telephone listings, yellow pages, zip codes and numerous other
information databases.
 Search engines can also perform some calculations.
 You can also use search engines for conversions, like converting
Celsius to Fahrenheit.
 If you type convert followed by the units you want to convert, Google
returns the answer. Examples include "convert 15 liters to gallons"
or "convert 3pm cst to est."
11
TYPES OF SEARCH
ENGINES
 CRAWLER BASED
 DIRECTORIES
 HYBRID SEARCH ENGINES
 META SEARCH ENGINES
12
CRAWLER BASED
 These types of search engines use a "spider" or a
"crawler" to search the Internet.
 The crawler digs through individual web
pages, pulls out keywords and then adds the
pages to the search engine's database.
 Crawler-based search engines are good when you
have a specific search topic.
 Google and Yahoo are examples of crawler
search engines.
13
DIRECTORIES
 Directories depend on human editors to create
their listings or database. Yahoo Directory,
Open Directory and LookSmart are a few
examples.
 Human-powered directories are good when you
are interested in a general topic.
14
HYBRID SEARCH ENGINES
 Hybrid search engines are search engines that
use both crawler-based searches and
directory searches to obtain their results.
 Examples: Yahoo.com, Google.com
15
META SEARCH ENGINES
 These transmit user-supplied keywords simultaneously
to several individual search engines to actually carry out
the search.
 Search results returned from all the search engines can be
integrated, duplicates can be eliminated and additional
features such as clustering by subjects within the
search results can be implemented by meta-search
engines.
 meta engines search multiple engines
 getting combined results from a variety of engines
 do not have their own databases
 but have their own business models affecting
results
 a number of techniques used
 interesting ones: clustering, statistical analyses
16
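The merging and duplicate removal that meta engines perform can be sketched as follows. The engine names, URLs and the reciprocal-rank scoring rule are assumptions chosen for illustration; real meta engines use their own, often undisclosed, techniques.

```python
def merge_results(results_by_engine):
    """Merge ranked URL lists, drop duplicates, and sum reciprocal-rank scores."""
    scores = {}
    sources = {}
    for engine, urls in results_by_engine.items():
        for rank, url in enumerate(urls, start=1):
            key = url.rstrip("/")  # crude normalisation so near-identical URLs merge
            scores[key] = scores.get(key, 0.0) + 1.0 / rank
            sources.setdefault(key, []).append(engine)
    ranked = sorted(scores, key=lambda k: (-scores[k], k))
    return [(url, sources[url]) for url in ranked]

# Hypothetical result lists from two underlying engines.
results = {
    "engine_a": ["https://x.example/", "https://y.example"],
    "engine_b": ["https://y.example", "https://z.example"],
}
merged = merge_results(results)
for url, found_by in merged:
    print(url, found_by)
```

Keeping the list of engines that returned each URL mirrors the way some meta engines show the source of every result so that overlap can be compared.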
SAMPLE OF META ENGINES
Dogpile
results from a number of leading search engines; gives source, so overlap can be
compared
Surfwax
gives statistics and text sources & linking to sources
Teoma
results with suggestions for narrowing; links resources derived; originated at Rutgers
Turbo10
provides results in clusters; engines searched can be edited
 Large directory
 Complete Planet
 directory of over 70,000 databases & specialty engines
 Results with graphical displays
 Vivisimo
 clusters results; innovative
 Webbrain
 results in tree structure – fun to use
Kartoo
results in display by topics of query
17
DATABASE
 Where user's query is matched
 Contains only essential parts of pages
 Only includes pages that were indexed
 Search engines are always out of date
18
WEB CRAWLER
 A robot that follows links
 Records data it finds
 Words in the webpage
 Metadata
 ALT attributes in IMG tags
 Robot Exclusion Protocol
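Under the Robot Exclusion Protocol, a site publishes a robots.txt file that tells crawlers which paths to stay out of. A minimal sketch using Python's standard `urllib.robotparser` module is shown below; the robots.txt content, the site address and the `ExampleBot` agent name are made up for illustration.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content, as a site might publish it.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

def allowed(url, agent="ExampleBot"):
    """Return True if the rules permit `agent` to fetch `url`."""
    return parser.can_fetch(agent, url)

print(allowed("https://site.example/index.html"))
print(allowed("https://site.example/private/report.html"))
```

A well-behaved crawler runs a check like this before requesting each page; the protocol is advisory, so it only works when the robot chooses to honour it.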
19
SEARCH ENGINE INTERFACES
 Gathers input from users
 Presents results from the IR system
 Often in ranked order
 Input
 User requirements
 Search expression, search limits
 Presentation style
 Presentation format , search type
 Output
 Results
 Descriptions
 Clusters
20
SEARCH TERM MATCHING
 Trying to find a match in the database
 Two main methods
 Keyword searching
 Matching single terms, computing cosine similarity
 Concept-based searching
 Examining clusters of words
 Attempt to determine meaning of query and find records
related to that meaning
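For keyword searching, the cosine measure compares a query and a document as word-count vectors. The sketch below is a bare-bones version that assumes plain whitespace tokenisation and raw counts, with none of the weighting a production system would add.

```python
import math
from collections import Counter

def cosine_similarity(query, document):
    """Cosine of the angle between the word-count vectors of two texts."""
    q = Counter(query.lower().split())
    d = Counter(document.lower().split())
    dot = sum(q[term] * d[term] for term in q)
    norm_q = math.sqrt(sum(v * v for v in q.values()))
    norm_d = math.sqrt(sum(v * v for v in d.values()))
    if norm_q == 0 or norm_d == 0:
        return 0.0
    return dot / (norm_q * norm_d)

print(cosine_similarity("asiatic lion", "asiatic lion gir"))
print(cosine_similarity("asiatic lion", "forest birds"))
```

A score of 1 means the two texts use the same words in the same proportions, and 0 means they share no words at all.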
21
HOW IT WORKS?
 crawlers, spiders: go out to find content
 in various ways go through the web looking for new & changed
sites
 periodic, not for each query
 no search engine works in real time
 some search engines do it for themselves, others not
 buy content from companies such as Inktomi
 for a number of reasons crawlers do not cover all of the
web – just a fraction
 what is not covered is “invisible web”
 organizing content: labeling, arranging
 indexing for searching – automatic
 keywords and other fields
 arranging by URL popularity, e.g. PageRank as used by Google
 classifying as directory
 mostly human handpicked & classified
 as a result of different organization we have basically two
kinds of search engines:
 search – input is a query that is then searched & displayed
 directory – classified content – a class is displayed
 and fused: directories have now also search capabilities & vice versa
22
ELABORATION (CONT.)
 databases, caches: storing content
 humongous files usually distributed over many computers
 query processor: searching, retrieval, display
 takes your query as input
 engines have differing rules for how it is handled
 displays ranked output
 some engines also cluster output and provide visualization
 at the other end is your browser
 all search engines have these basic parts in common
 BUT the actual processes – methods how they do it –
are based on various algorithms & they differ
 most are proprietary with details kept mostly secret but
based on well known principles from information retrieval
or classification
 to some extent Google is an exception – they
published their method
23
BASIC IR FEATURES
 Boolean operators
 AND, OR, NOT, grouping
 Extended operators
 NEAR, ADJACENT, (")
 Stop word deletion
 Stemming
 Searching in fields (e.g. host)
24
WHAT ABOUT THE INVISIBLE WEB?
 Also known as the Deep Web
 Documents that are on the WWW but not
indexed by Search Engines
 Some are available only by submitting forms
 Some are not generally accessible (in subnets)
 Some are not in (X)HTML format
 More search engines parse non-(X)HTML now
than before
 Because of awareness of the problem companies
are making more content available using
 Stable URLs
 Robot-friendly sitemaps
 But much content is still not indexed
25
PLENTY OF IMPORTANT YET INVISIBLE
DOCS
 How to find them?
 Many of them are in databases
 No one search engine covers everything
 Use database tools
 Especially for research articles
 Use multiple search engines or a meta-
crawler
 dogpile is the most famous
26
RANKED OUTPUT
 Most SEs produce ranked lists by applying
simple rules:
 Early words are more important
 Title is very important
 Frequency of occurrence matters for some
 Infrequent words matter more
 Modification date
 Google is different:
 PageRank™: a method based on popularity
 Links as money
27
COVERAGE DIFFERENCES
 No engine covers more than a fraction of WWW
 estimates: none more than 16%
 hard (even impossible) to discern & compare coverage, but
they differ substantially in what they cover
 in addition:
 many national search engines
 own coverage, orientation, governance
 many specialized or domain search engines
 own coverage geared to subject of interest
 many comprehensive sources independent of search engines
 some have compilations of evaluated web sources
28
SEARCHING DIFFERENCES
 Substantial differences among search engines on
searching, retrieval display
 need to know how they work & differ in respect to
 defaults in searching a query
 searching of phrases, case sensitivity, categories
 searching of different fields, formats, types of resources
 advanced search capabilities and features
 possibilities for refinement, using relevance feedback
 display options
 personalization options
29
HOW TO SUCCEED WITH SES
 As a surfer:
 If you don't know what you are looking for
 Use multiple SEs, or a meta-crawler
 Search within results
 Use Boolean expressions
 Consider specialized engines
 As a creator:
 HTML level
 Always use ALT attributes with <IMG>, etc.
 Avoid frames
 Make it easier to index
 Don't expect SEs to find your pages
 Make links between your pages
 Use metadata
 Informal: <meta name="description" …>
 Formal: Dublin core and others
 Increase your pages' popularity
 Don’t use systematic reciprocal linking: rings, exchanges, lists
 The PageRank™ a page passes through each link is inversely proportional to its out-degree
30
BASICS
 Generally, the more keywords you use in your
search, the more specific and accurate your
results will be.
 For example, a search about the Asiatic lion in
Gir, Junagadh, India will produce better
results if you search for the words “Asiatic Lion,
India, Junagadh” than if you search for just
“Asiatic Lion”.
 Some search engines will also perform this same
function when you place a + sign in front of the
keywords, such as +Student +Contests.
31
HOW TO USE SE
 “+” before a word in a search will locate
documents which definitely contain the word.
 “-” before a word will exclude documents containing that word from the
search.
 Placing words between quotation marks (“ ”) will
search for the exact phrase between the quotes.
 Using “OR” between search terms will search for
each term separately.
 Examples :
 +BLACK +BLUE: The search results will contain
documents which contain the word black and the word
blue.
 BLACK -BLUE: Those documents will be returned which
contain the word black but not the word blue.
 “BLACK BLUE”: Those documents will be returned which
include the phrase black blue (the two words placed together).
 BLACK OR BLUE: Those documents will be returned
which contain the term black or the term blue.
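The four query forms above can be imitated with a small matcher. This is a sketch under simplifying assumptions: plain words are treated as OR alternatives, `shlex` is used only to group quoted phrases, and the sample documents are made up. Real engines parse queries in their own ways.

```python
import shlex

def matches(query, document):
    """Return True if `document` satisfies +required, -excluded, "phrase" and OR terms."""
    text = " ".join(document.lower().split())
    words = set(text.split())
    any_of = []  # plain terms: at least one of them must appear
    for token in shlex.split(query.lower()):
        if token == "or":
            continue
        if " " in token:  # a quoted phrase arrives as one token containing spaces
            if f" {token} " not in f" {text} ":
                return False
        elif token.startswith("+"):
            if token[1:] not in words:
                return False
        elif token.startswith("-"):
            if token[1:] in words:
                return False
        else:
            any_of.append(token)
    return not any_of or any(term in words for term in any_of)

DOCS = ["black and blue ink", "black ink", "black blue stripes", "blue sky"]
for query in ["+black +blue", "black -blue", '"black blue"', "black OR blue"]:
    print(query, "->", [doc for doc in DOCS if matches(query, doc)])
```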
32
HOW TO USE SE
 Building a more complex query requires the use of Boolean operators that
allow you to refine and extend the terms of the search. The Boolean
operators most often seen are:
 AND - All the terms joined by "AND" must appear in the pages or
documents. Some search engines substitute the operator "+" for the word AND.
 OR - At least one of the terms joined by "OR" must appear in the pages or
documents.
 NOT - The term or terms following "NOT" must not appear in the pages or
documents. Some search engines substitute the operator "-" for the word NOT.
 FOLLOWED BY - One of the terms must be directly followed by the other.
 NEAR - One of the terms must be within a specified number of words of
the other.
 Specify the words clearly (+, -)
 Use Advanced Search when necessary
 Provide as many particular terms as possible
 If looking for a company, institution, or organization, try: www.name [.com |
.edu | .org | .gov | country code]
 Some search engines specialize in particular areas
 For broad queries, try to use Web directories as starting points
 Anyone can publish data on the Web, so the information you get from
search engines might not be accurate.
33
TIPS FOR BETTER SEARCH (GOOGLE)
 Get Specific With Quotes
 Use the Minus Sign to Remove Words
 Use the Asterisk as a Wildcard: The asterisk acts like a wildcard in a search. This is useful if you do not know
part of a phrase or you forget exactly how a word or name is spelled.
 Search Within a Site: To search within a site, just put site: in front of the domain, followed by your
keywords. For example, "site:writerswrite.com harry potter" will return the Harry Potter coverage on that
site.
 Calculations: Google also has a built-in calculator. You can enter a math equation like 15+25 or 18*2343 or
553/17 and Google will return the answer within a calculator that appears in your results. It will also
convert units and even graph equations.
 Definitions: Google also returns definitions. Use define: followed by the word you want defined. For
example, "define:perquisition" will provide the definition for perquisition. You can also include wildcards.
For example: define:onomato* will return the definition for onomatopoeia.
 Related Sites: Google will also provide you with sites that are related to other sites. If you use
"related:google.com" it returns Yahoo, Bing, DuckDuckGo and other search tools.
 Search Recently Updated Webpages: If you want to narrow your search results to recent information
you can use the news search tab or you can click the tools tab. You will see that the default search in tools is
"any time." This can be narrowed to past hour, past 24 hours, past week, past month, past year or a custom
range.
 Google Alerts: Use Google Alerts to stay up-to-date on your research topics. You can configure keyword
searches that you want to get updates on when there is new content available. You can get updates by email
or as RSS feeds.
 Random Facts with Google: Google will return a trivia question and answer if you use this keyword.
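Several of the tips above (quotes, the minus sign, site:) are simply text typed into the search box, so they can be combined programmatically. The helper below is a hypothetical sketch that assembles such a query string and URL-encodes it; the function name and parameters are assumptions made for illustration.

```python
from urllib.parse import urlencode

def google_search_url(terms, site=None, exclude=(), phrase=None):
    """Build a search URL that combines a phrase, keywords, exclusions and site:."""
    parts = []
    if phrase:
        parts.append(f'"{phrase}"')            # exact phrase in quotes
    parts.append(terms)
    parts.extend(f"-{word}" for word in exclude)  # minus sign removes words
    if site:
        parts.append(f"site:{site}")           # restrict results to one site
    query = " ".join(part for part in parts if part)
    return "https://www.google.com/search?" + urlencode({"q": query})

print(google_search_url("harry potter", site="writerswrite.com"))
```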
34
GOOGLE
 Initially known as BackRub, Google began as a research
project of Larry Page, who enrolled in Stanford’s computer
science graduate program in 1995. There, he met fellow CS
student Sergey Brin. The two stayed in touch as Page began
looking into the behavior of linking on the World Wide Web.
 In 2001, Google employee Paul Buchheit started work on an
email product designed to address the company’s increasing
internal communications and storage needs.
 On April 1st, 2004, Gmail launched to the public with 1GB
of storage and advanced search capabilities, dwarfing the
limitations imposed by popular competing email products of
the time, many of which offered just a few megabytes of
storage.
 “Maps can be useful and fun,” said Google when it first
introduced Maps in 2005. The web-only service provided
step-by-step directions and zoomable maps, with a smattering of
businesses like hotels available to search.
35
GOOGLE
 No. 1 Position in Google Gets 33% of Search Traffic
 GOOGLE TURNS 20: RESHAPED THE WORLD
 Arguably, no technology company is more responsible for shaping the
modern internet, and modern life, than Google.
 Many people use Google software to search the repository
of human knowledge, communicate, perform work, consume
media, and navigate the endlessly vast internet.
 Segmentation of search – Google will try to categorize
information more, for example Google Book Search, (US)
government search, blog search, etc.
 Semantic Web – Google search engine is becoming more
sophisticated, taking account of synonyms, page structure and
user intent.
 Searching the cloud – as people become more confident to store
information on "cloud" hard drives, there will be a need to search
these.
 Real-time – searching what people are writing at the moment to
catch the latest buzz and get really up to the minute information.
 Mobile search – as we use mobiles for information, we will need
search tools to search them, so mobile websites will need to be
formatted for searchability.
36
GOOGLE
 Top Search engines market share worldwide
 Google dominates the field of search engines with more than
a 90% market share.
 The second most popular search engine on the market is
Bing with 2.78%.
 Other companies have even smaller
percentages: Yahoo 1.6%, Baidu 0.92%, Yandex 0.85%,
and DuckDuckGo 0.5%.
 The growth rate of global internet users is 8.2% per year.
 Google searches completed per year grew around 10% per
year
 There are 40,000+ Google searches every second
 Google processes more than 3.5 billion searches every
day, and 1.2 trillion searches every year.
 92.26% of all global searches take place on Google.
37
GOOGLE
 Google is the world’s largest search engine, and over 1 billion
people use its products and services.
 Google is an indexing and searching service which provides us with
connections to websites and pages based on our specific search queries.
 Studies show that Google indexes 35 trillion web pages and more
than 2.4 million searches happen through the search engine every
minute.
 However, some studies show that the internet as a whole is home to a
whopping 17.5 quadrillion different pages.
 While Google still has significant room to grow, it has by far the biggest
collection of web pages available on the internet.
 This massive number of web pages available gives Google promising
marketing potential, which makes it an essential marketing tool
for businesses.
 There are more than 3.5 billion Google searches conducted every day.
 76% of all global searches take place on Google.
 Google Search Index contains more than 100,000,000 GB.
 16 – 20% of all annual Google search results are new.
 More than 60% of Google searches come from mobile devices.
38
LIST OF SEARCH ENGINES
39
 Image Search Engines
 1. TinEye - this is a reverse image search site.
Upload an image from your computer or enter
the web address of an image online, and TinEye
will list the other places online where the same or a similar
image can be found.
 2. OpenClipArt - search for open source clip
art you can use any way you wish. 100% free
image search engine.
 3. Pixabay - over a quarter million
illustrations, photos, vector and clip art you
can use without attribution in digital and printed
form, even for commercial applications.
 4. Flickr - one of the oldest and most well
known image search websites. Flickr was
owned by Yahoo and has belonged to SmugMug since 2018.
 5. FreeImages - can download and use
images or upload and share images. Includes
250,000+ high resolution, high quality
photographs from photographers.
 6. PrivateLee - a privacy enhanced image
search engine - not so much for downloading
and using images but rather being able to conduct
image searches without being tracked.
 7. Giphy - search for and find animated gif
images. You can perform a search
 Privacy Search Engines
 We all know search engines gather and
store all kinds of information about you.
 1. Ixquick - this one is actually a proxy
server. When you search using Ixquick,
their computers act as a middleman or
buffer so your information and data are
filtered out and stripped.
 2. StartPage - sister site to Ixquick.
Both are certified search engines that
do not record your IP address or track
your searches in any way.
 3. PrivateLee - does not use cookies or
any other tracking data so your
internet searches are not compiled, saved
or shared. Has both web and image search.
 4. DuckDuckGo - non tracking search
engine. Searches can be customized to
provide region specific results and language
settings.
 5. LookSeek and DarkLookSeek - both
of these are from the same company.
LIST OF….
40
 People Search Engines
 1. Spokeo - you can search for
someone on Spokeo by entering a
name, online username, email address or
phone number.
 2. Inforegistry - web based
background check that allows you to
perform a people information search and
instantly find the current address, phone
number, marital status and additional
information about any United States citizen.
 3. PeekYou - a decent free
people search engine that allows you to
track down a person or people by name,
username, address, phone, email address or
even interests.
 4. Background Report 360 -
background investigations service and
person search providing instant background
check results on any person, including court
records and personal records such as
addresses and phone numbers of the person
being looked up.
 5. Everify - online
background investigations tool and
persons search with 1 billion+ people
records.
 6. Inteligator - comprehensive
background check service that provides
detailed information on anyone residing in the
United States with a records search including
sex offender, public records, marriage and
divorce, property records and criminal records
history.
 7. GovRegistry - a
background check, criminal records and
sex offender search engine that provides
comprehensive background information on any
US citizen.
 Torrent Search Engines
 A torrent (BitTorrent) is often seen as a way to get things free
(that normally are not free). It is
actually a file sharing system in which multiple
servers or computers each store pieces
of a file, which can be a PDF, software program,
music, movie, or just about any other type of file.
 1. Torrentz - a meta-search engine
combining results from dozens of torrent
search engines.
 2. Toorgle - another torrent meta search
engine that pulls torrent search results
from multiple torrent search sites.
 3. KickAssTorrents - large torrent search
engine.
INFOSEEK/EXCITE/WEBCRAWLER
41
 InfoSeek is one of the best search
engines for finding information.
 In addition to searching the Web,
InfoSeek provides guides to
popular subjects which contain
links to recommended sites.
 InfoSeek also groups all of your
results that occur in the same
website.
 Searches can be further limited by
using the advanced search,
reachable by following the advanced
search link on InfoSeek.
 Searching for news stories is also
available on InfoSeek.
 In addition to searching, you can
also reference information such
as maps, Roget's thesaurus and
Webster's dictionary.
 Excite provides a search engine
that will crunch through webpages,
and also provides searches of
recent news stories, site
reviews, a shopping guide and
more.
 Site reviews and recent news stories on Excite can also be
found on Webcrawler.
 Excite's "Power Search" will allow
you to further limit your search.
 It lets you specify words your results must NOT
contain in addition to the words they
MUST contain.
 It also allows you to search only
through websites that Excite
recommends, or to bring up only titles
in your results.
 You can personalize Excite under the
"My Excite" section.
LYCOS/ALTAVISTA
42
 Lycos is another major search
engine that is also becoming more
of a media site, similar to
Excite, Yahoo and InfoSeek.
 Uses a technology called WiseWire,
which brings up particular pages
that pertain to your query and
allows users to rate these pages
at the same time.
 Offers an advanced search option, dubbed
"Lycos Pro."
 Lets you search images, sounds and products,
as well as Usenet
postings, message boards, and
personal homepages.
 Lycos also provides web
reviews, through its "Top 5%
of the Web" service.
 AltaVista is a great search engine
if you are trying to pull up a
large number of webpages
relating to your search.
 Most effective when you are
involved in a very specific search
or when you are searching for
recently added or updated
webpages.
 Offers a variety of search
options you may not be aware of.
 Includes a number of specialty
options for use when searching
newsgroups
 It is not as useful if you are seeking
a website on a general subject
because AltaVista will bring up
more results than you will want,
including many inappropriate
listings.
YAHOO/HOTBOT
43
 Yahoo : Fast becoming a major
media leader.
 One drawback to searching on
Yahoo is that it will often bring up
insignificant webpages related
to your topic.
 You can browse the listings in Yahoo by category
from Yahoo's front page.
 Yahoo also provides searchable
news stories culled from various
major sources.
 Yahoo provides daily picks and an
internet magazine called Yahoo
Internet Life.
 Yahoo also provides online
diversions in the way of message
boards, chat, email addresses and
instant messaging (Yahoo pager).
 HotBot : Useful when you want a large number of results
and are searching a specific
subject.
 Provides browsing by subject
through website reviews and has
licensed reviews from LookSmart.
 Offers SuperSearch, which allows you to
restrict your web search by
date, by domain suffix (e.g.
.com, .net), by continent and by
media type such as audio or video.
 Supports searching through recent news
stories with its news search service,
Newsbot.
 Searches for businesses, people,
newsgroups, domain names,
discussion groups and
shareware.
BEST ALTERNATIVE SEARCH ENGINES
 Some general search engines beyond the top three — Google, Bing, and Baidu.
 DuckDuckGo : Concerned about online privacy? DuckDuckGo prides itself
on being the search engine that does not track or personalize your searches and
results. They even offer handy visual guides on Google tracking and filter
bubbling.
 If you’re an iOS user, you can set DuckDuckGo to be the default search engine
in Safari. It’s also an option for Safari on macOS.
 Ecosia : Want trees planted while you search? That’s what Ecosia does! Simply
run your normal searches and Ecosia will donate its surplus income to
conservationist organizations that plant trees.
 Dogpile : If you want results from the top three search engines, but don’t
want to go to them individually, try Dogpile.
 WolframAlpha : Looking for a search engine based on computation and
metrics? Try WolframAlpha. It will give you website data, historical
information by date, unit conversions, stock data, sports statistics, and more.
You can see examples by topic to learn more.
 Gigablast : an open-source search engine. While it doesn’t always get
things right, it does provide a retro look, quick results, and a feature
similar to the now-defunct Google Instant.
 Startpage : If you are looking to search without being tracked, Startpage is
another solid option.
 Qwant : a Paris-based search engine dedicated to protecting your
privacy. It aims to preserve
the “digital ecosystem” by remaining neutral.
44
SOCIAL NETWORK SPECIFIC ADVANCED
SEARCH
 Facebook Search : Want to see a particular search
across different areas of Facebook? Use
Facebook’s advanced search options. You can view
search results for people, pages, places, groups, and
more.
 LinkedIn People/job/answer Search : If you want
to find some new connections on LinkedIn, use
the Advanced People Search.
 LinkedIn offers job seekers an Advanced Job
Search to find jobs using the above information plus
experience level and industry.
 LinkedIn Answers is a great way to gain
exposure and build authority in your industry.
Use the Answers Advanced Search to find the perfect
questions to answer.
 Twitter Search : Twitter’s Advanced Search is a
great way to find better results on Twitter.
45
SOCIAL SEARCH
 Keyhole allows you to search for hashtags,
keywords, @mentions, and URLs. Want to see
how your latest blog post was shared across
social networks? Just select URL on Keyhole and
put in the URL and you’ll see who has shared it.
 Social Mention allows you to search across
multiple types of networks including blogs,
microblogs, bookmarks, comments, events,
images, news, and more.
 Use BuzzSumo if you have a topic in mind and
want to see which articles on the web were
most shared for that particular search.
There is a paid version that can give you access
to more tools for each topic.
46
FORUMS
 Want to participate in forums in your industry?
Use a forum search engine to find results specifically
from forums.
 BoardReader allows you to search forums and
narrow results down by date (last day through
last year) and language.
 Google Forum
47
BLOG
 Blog Search Engine aptly describes this search
engine. Search blogs and blog posts using
keywords. It’s not perfect, but it’s better than a
general search.
48
DOCUMENTS, EBOOKS, AND PRESENTATIONS
 Google Advanced Search allows you to search
for specific types of documents. Looking
specifically for PDFs? Set that as your criteria.
 Scribd is the largest social reading and
publishing network that allows you to discover
original written content across the web.
 SlideShare is the largest community for
sharing presentations. If you missed a
conference or webinar, there’s a good chance the
slides from your favorite speakers are here.
49
IMAGE SEARCH
 Flickr offers an advanced search screen to find photos,
screenshots, illustrations, and videos on their network.
 Pinterest allows you to search for anything visual –
clothing, cars, floors, airplanes, etc. – and pin it to your
favorites. Just be sure you don’t steal copyrighted work.
 Bing offers an image search that starts out with the top
trending images, then leads to images which can be filtered
by size, layout, and other criteria.
 Google Advanced Image Search allows you to get even
more specific about the images you are looking for,
including specifying whether they are faces, photos, clip
arts, or line drawings.
 Have you seen an image around the web and want to know
where it came from? That’s what TinEye is for. Just put
your image in the search box and TinEye will find where
that image has appeared around the web.
50
CREATIVE COMMONS MEDIA
 Need media created by others to use on your
website?
 Looking for only images that you can repurpose,
use for commercial purposes, or modify?
 Try Creative Commons Search, which will allow
you to look through multiple sources including
Flickr, Google Images, Wikimedia, and YouTube.
 Wikimedia Commons has over 12 million files
in their database of freely usable images, sound
bites, and videos. Use the search box or browse
by categories for different types of media.
51
VIDEO SEARCH
 Yahoo Video Search allows you to search through
video content from their own network, YouTube,
Dailymotion, Metacafe, Myspace, Hulu, and other
online video providers for videos on any topic.
 Sidereel allows you to go beyond YouTube to find
shows on dozens of streaming platforms like HBO and
Hulu. If you’re looking for streaming videos, you’ll
likely find it here.
 AOL Video aggregates the day’s best clips from
around the web, but you can also use it as a search
engine.
 With Google Video Search you’ll be able to search
for videos on any topic and filter your results by
duration, date when uploaded, video source, and
much more.
 YouTube
52
WEBSITE DATA & STATISTICS
 CrunchBase offers insight into your favorite
online brands and companies. Listings will tell
you which people are associated with a company, along with
contact information, related videos, screenshots, and
more.
 SimilarWeb allows you to search for website or
app profiles based on specific domains or app
names. Domains with a high volume of traffic will
have data including total regional visitors per month,
pageviews online vs. mobile, demographics, sites
that similar audiences like, and more.
 BuiltWith allows you to search for domains and see
the technology they use, including
analytics, content management systems,
coding, and widgets. You can also click on any of
the products to see usage trends, industries using the
technology, and more.
53
DIFFICULTIES OF BUILDING A SEARCH
ENGINE
 Built by companies that hide the technical
details
 Distributed data
 High percentage of volatile data
 Large volume
 Unstructured and redundant data
 Quality of data
 Heterogeneous data
 Dynamic data
 How to specify a query from the user
 How to interpret the answer provided by
the system
54
USER PROBLEMS
 Do not exactly understand how to provide a
sequence of words for the search
 Not aware of the input requirement of the
search engine.
 Problems understanding Boolean logic, so the
users cannot use advanced search
 Novice users do not know how to start using
a search engine
 Do not care about advertisements, so no
funding
 Around 85% of users only look at the first
page of results, so relevant answers might be
skipped
55
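Boolean logic, listed above as a stumbling block for users, can be illustrated with a very small sketch. The three documents and the function names below are hypothetical, invented only for this example; real search engines use far larger indexes and more elaborate query parsing.

```python
from collections import defaultdict

# Hypothetical three-document collection, invented for illustration.
DOCS = {
    1: "black cat on a blue mat",
    2: "black coffee and black tea",
    3: "blue sky over the sea",
}


def build_index(docs):
    """Map each word to the set of document ids that contain it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for word in text.lower().split():
            index[word].add(doc_id)
    return index


INDEX = build_index(DOCS)


def search_and(a, b):
    """Documents containing both words."""
    return INDEX.get(a, set()) & INDEX.get(b, set())


def search_or(a, b):
    """Documents containing at least one of the words."""
    return INDEX.get(a, set()) | INDEX.get(b, set())


def search_not(a, b):
    """Documents containing the first word but not the second."""
    return INDEX.get(a, set()) - INDEX.get(b, set())


print(search_and("black", "blue"))  # AND narrows the result
print(search_or("black", "blue"))   # OR widens the result
print(search_not("black", "blue"))  # NOT excludes documents
```

AND can only shrink a result set and OR can only grow it, which is the point novice users most often get backwards.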
GOOGLE
 Google LLC is an American multinational technology
company that specializes in Internet-related services and
products, which include online advertising technologies,
a search engine, cloud computing, software, and
hardware.
 It is one of the Big Five companies in the
American information technology industry along
with Amazon, Facebook, Apple, and Microsoft
 According to the latest NetMarketShare report, 74.52% of
searches were powered by Google and only 7.98% by
Bing.
 Google also dominates the mobile/tablet search
engine market, with a 93% share.
56
BING
 Microsoft Bing (formerly known simply as Bing) is
a web search engine owned and operated by Microsoft.
 The service has its origins in Microsoft's previous search
engines: MSN Search, Windows Live Search and
later Live Search.
 Bing provides a variety of search services, including
web, video, image and map search products. It is
developed using ASP.NET.
 Bing is Microsoft’s attempt to challenge Google in
the area of search.
 Despite its efforts, Microsoft has still not managed to
convince users that its search engine can
produce better results than Google.
57
YAHOO! SEARCH
 Yahoo (styled as Yahoo!) is an American web
services provider.
 It is headquartered in Sunnyvale, California and
is owned by Verizon Media, pending sale
to investment funds managed by Apollo Global
Management
 Since October 2011, Yahoo Search has been powered by
Bing.
 Yahoo is still the most popular email provider
and according to some studies holds the
fourth place in search.
58
ASK
 Ask.com (originally known as Ask Jeeves) is
a question answering–focused e-business founded in
1996 by Garrett Gruener and David
Warthen in Berkeley, California.
 Ask.com receives
approximately 0.05% of the search share.
 ASK is based on a question/answer format where
most questions are answered by other users or are in the
form of polls.
 It also has general search functionality, but the
results returned lack quality compared to Google or
even Bing and Yahoo.
59
AOL SEARCH
 AOL (stylized as Aol., formerly a company known
as AOL Inc. and originally known as America
Online) is an American web portal and online
service provider based in New York City.
 According to NetMarketShare, the once-famous
AOL is still among the top 10 search engines, with a
market share close to 0.04%.
 The AOL network includes many popular web sites
like engadget.com, techcrunch.com and
huffingtonpost.com.
60
BAIDU
 Baidu was founded in 2000 and it is the most
popular search engine in China.
 Its market share is increasing steadily and,
according to Wikipedia, Baidu serves
billions of search queries per month.
 It is currently ranked at position 4 in the Alexa
Rankings, and at No. 1 in China.
 As Google maintains its stronghold in the global
internet search arena, Baidu, Inc. has the upper
hand in China, with 72.37% of the nation's
market share as of May 2021.
61
WOLFRAMALPHA
 WolframAlpha is different from all the other search
engines.
 They market it as a Computational Knowledge
Engine which can give you facts and data for a
number of topics.
 It can do all sorts of calculations, for example if
you enter “mortgage 2000” as input it will
calculate your loan amount, interest paid etc. based
on a number of assumptions.
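The kind of calculation described above can be sketched with the standard annuity formula. This is not WolframAlpha's actual method: the reading of "2000" as a monthly payment, the 5% annual rate, and the 30-year term are all assumptions chosen for illustration, and the function names are hypothetical.

```python
def loan_principal(monthly_payment, annual_rate, years):
    """Loan amount that a fixed monthly payment repays (standard annuity formula)."""
    n = years * 12          # number of monthly payments
    r = annual_rate / 12    # monthly interest rate
    if r == 0:
        return monthly_payment * n
    return monthly_payment * (1 - (1 + r) ** -n) / r


def total_interest(monthly_payment, annual_rate, years):
    """Total paid over the term minus the amount originally borrowed."""
    total_paid = monthly_payment * years * 12
    return total_paid - loan_principal(monthly_payment, annual_rate, years)


# Assumed inputs: 2000 per month, 5% annual rate, 30-year term.
principal = loan_principal(2000, 0.05, 30)
interest = total_interest(2000, 0.05, 30)
print(f"loan amount:   {principal:,.2f}")
print(f"interest paid: {interest:,.2f}")
```

Changing the assumed rate or term changes both figures, which is why a tool of this kind has to state its assumptions alongside the answer.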
62
DUCKDUCKGO
 It has a number of advantages over the other search
engines.
 It has a clean interface, it does not track users, it
is not overloaded with ads, and it has a number of
very nice features (only one page of results, you can
search other web sites directly, etc.).
 Privacy
 Update: According to DuckDuckGo traffic stats, as of
October 2018, DuckDuckGo was serving more than 30
million searches per day.
63
INTERNET ARCHIVE
 The Internet Archive is an American digital
library with the stated mission of "universal access
to all knowledge".
 It provides free public access to collections of
digitized materials, including websites, software
applications/games, music, movies/videos, moving
images, and millions of books.
 In addition to its archiving function, the Archive is an
activist organization, advocating a free and open
Internet.
 Archive.org is the Internet Archive's search engine.
 It is a very useful tool if you want to trace the history of a
domain and examine how it has changed over the
years.
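As a minimal sketch of tracing a domain's history, the snippet below only builds URLs and makes no network request. It assumes the Wayback Machine's public availability endpoint and its timestamped snapshot URL pattern; the helper names and the example domain are hypothetical.

```python
from urllib.parse import urlencode


def availability_query(domain, timestamp=None):
    """Build a Wayback Machine availability-API URL for a domain."""
    params = {"url": domain}
    if timestamp:
        params["timestamp"] = timestamp  # format YYYYMMDD, e.g. "20100101"
    return "https://archive.org/wayback/available?" + urlencode(params)


def snapshot_url(domain, timestamp):
    """Build a link to the archived copy closest to the given timestamp."""
    return f"https://web.archive.org/web/{timestamp}/{domain}"


def yearly_snapshots(domain, first_year, last_year):
    """One snapshot link per year, useful for seeing how a site changed over time."""
    return [snapshot_url(domain, f"{year}0101")
            for year in range(first_year, last_year + 1)]


for link in yearly_snapshots("example.com", 2018, 2021):
    print(link)
```

Opening the yearly links in a browser shows the archived copy nearest to 1 January of each year, if one exists.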
64
YANDEX.RU
 Yandex is a Russian Dutch-domiciled
multinational corporation providing over
70 Internet-related products and services, including
transportation, search and information services, e-
commerce, navigation, mobile applications, and
online advertising.
 According to Alexa, Yandex.ru is among the 30
most popular websites on the Internet, with a
ranking position of 4 in Russia.
 Yandex presents itself as a technology
company that builds intelligent products and
services powered by machine learning.
65
DOGPILE
 Dogpile is a metasearch engine for
information on the World Wide Web that
fetches results from Google, Yahoo!, Yandex,
Bing, and other popular search engines,
including those from audio and video content
providers such as Yahoo!
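How a metasearch engine might combine results can be sketched as follows. This is not Dogpile's actual algorithm, which is not described here: the reciprocal-rank scoring, the engine names, and the sample URLs are all assumptions made for illustration.

```python
def merge_results(results_by_engine):
    """Merge ranked URL lists, remove duplicates, and order by combined score.

    Each URL earns 1/rank from every engine that returns it, so URLs that
    several engines rank highly rise to the top.
    """
    scores = {}
    sources = {}
    for engine, urls in results_by_engine.items():
        for rank, url in enumerate(urls, start=1):
            key = url.rstrip("/")  # crude normalisation so near-identical URLs match
            scores[key] = scores.get(key, 0.0) + 1.0 / rank
            sources.setdefault(key, []).append(engine)
    # Highest score first; ties are broken alphabetically for stable output.
    ranked = sorted(scores, key=lambda k: (-scores[k], k))
    return [(url, sources[url]) for url in ranked]


# Hypothetical result lists from two unnamed engines.
SAMPLE = {
    "engine_a": ["https://example.org/", "https://example.com/a", "https://example.net"],
    "engine_b": ["https://example.com/a", "https://example.org", "https://example.edu"],
}

for url, engines in merge_results(SAMPLE):
    print(url, engines)
```

Keeping the list of source engines next to each URL lets the overlap between engines be compared at a glance.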
66
SEARCH ENGINE OPTIMIZATION (SEO)
 SEO is the process of improving the volume and quality of traffic to a website
from search engines.
 A higher ranking when someone searches a term in your industry increases
your brand’s visibility online.
 More opportunities to convert qualified prospects into customers.
 SEO can help your brand stand above others as a trustworthy company and
further improve the user’s experience with your brand and website.
 As a marketing strategy for increasing a site's relevance, SEO considers how
search algorithms work and what people search for.
 Search engines often index millions of pages for certain key words, and your
website can be buried deep on page 100 or worse if it is not optimized properly.
 Building or adjusting a website so that it comes up on the first or
second page of search engine results is the aim of SEO.
 If your page is not optimized it will not get a high ranking and you will not
get results, no matter how many search engines the site has been submitted to.
 Effective SEO may require changes to the html source code of a site.
 SEO tactics may be incorporated into web site development and design.
 The term "search engine friendly" may be used to describe web site designs,
menus, content management systems, and shopping carts that are easy to optimize.
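The point that SEO may require changes to a site's HTML can be illustrated with a small audit sketch using Python's standard html.parser module. The three checks (a title, a meta description, and alt text on images) are common recommendations chosen as assumptions for this example, not a complete or authoritative SEO checklist; the class and function names are hypothetical.

```python
from html.parser import HTMLParser


class SeoAudit(HTMLParser):
    """Collect a few basic on-page signals while parsing an HTML document."""

    def __init__(self):
        super().__init__()
        self.in_title = False
        self.title = ""
        self.has_description = False
        self.images_missing_alt = 0

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "title":
            self.in_title = True
        elif tag == "meta" and (attrs.get("name") or "").lower() == "description":
            self.has_description = bool((attrs.get("content") or "").strip())
        elif tag == "img" and not (attrs.get("alt") or "").strip():
            # Simplification: decorative images may legitimately have empty alt text.
            self.images_missing_alt += 1

    def handle_endtag(self, tag):
        if tag == "title":
            self.in_title = False

    def handle_data(self, data):
        if self.in_title:
            self.title += data


def audit(html):
    """Return a list of human-readable problems found in the page."""
    parser = SeoAudit()
    parser.feed(html)
    problems = []
    if not parser.title.strip():
        problems.append("missing <title>")
    if not parser.has_description:
        problems.append("missing meta description")
    if parser.images_missing_alt:
        problems.append(f"{parser.images_missing_alt} image(s) without alt text")
    return problems


page = "<html><head></head><body><img src='a.png'></body></html>"
print(audit(page))
```

A page that passes these checks is not guaranteed a high ranking; the sketch only shows the kind of source-level change the slide refers to.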
67
LIMITATIONS
 Every search engine has limitations as to
 coverage
 meta engines just follow coverage limitations & have more of their
own
 search capabilities
 finding quality information
 some have compromised search with economics
 becoming little more than advertisers
 but search engines are also often victims
of spam indexing
 affecting what is included and how ranked
68
FACTS
 Most search engines have vanished.
 Google is a big player.
 63% of Internet users use a search engine in
a given session.
 Approximately 94 million adults use the
internet on an average day.
 This means approximately 59.22 MILLION
people use search engines in an average
day.
 Microsoft realized the Internet is here to stay
 i. Dominates the browser market.
 ii. Realizes search is critical.
69
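The 59.22 million figure above follows directly from the two numbers quoted on the slide. The short check below reproduces it, assuming (as the slide does) that the per-session share of 63% can be applied to the daily number of internet users.

```python
daily_internet_users = 94_000_000   # adults online on an average day (from the slide)
search_share = 0.63                 # share of users who use a search engine (from the slide)

daily_search_users = daily_internet_users * search_share
print(f"{daily_search_users / 1_000_000:.2f} million")  # prints "59.22 million"
```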
CONCLUSION
 The primary goal is to provide high quality and relevant
search results over a rapidly growing World Wide Web. For example,
Google employs a number of techniques to improve search
quality, including PageRank, anchor text, and proximity
information.
 Google is a complete architecture for gathering web
pages, indexing them, and performing search queries
over them.
 A search engine is a really useful tool in the present era of the web.
 There are many search engines available in the market, but the
most popular search engine is Google.
 So, to get topmost results on the web, we have to use search
engine optimization techniques.
 Both on-page and off-page search engine optimization
techniques are important for better search results.
 Of the three flavors of SEO, the White Hat SEO technique is the
best and the most durable in the long term.
70
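The page rank technique mentioned in the conclusion can be sketched as a simple power iteration. This is a textbook simplification and not Google's production system; the damping factor of 0.85, the iteration count, and the four-page link graph are assumptions chosen for illustration.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Simplified PageRank over a dict mapping each page to its outgoing links.

    Every link target must itself be a key of the dict.
    """
    pages = list(links)
    n = len(pages)
    rank = {page: 1.0 / n for page in pages}
    for _ in range(iterations):
        # Every page starts each round with the small "random jump" share.
        new_rank = {page: (1.0 - damping) / n for page in pages}
        for page, outgoing in links.items():
            if outgoing:
                # A page splits its current rank evenly among the pages it links to.
                share = damping * rank[page] / len(outgoing)
                for target in outgoing:
                    new_rank[target] += share
            else:
                # A page with no links spreads its rank evenly over all pages.
                for target in pages:
                    new_rank[target] += damping * rank[page] / n
        rank = new_rank
    return rank


# Hypothetical four-page web: C is linked to by A, B, and D.
GRAPH = {"A": ["B", "C"], "B": ["C"], "C": ["A"], "D": ["C"]}

ranks = pagerank(GRAPH)
for page in sorted(ranks, key=ranks.get, reverse=True):
    print(page, round(ranks[page], 3))
```

In this sketch the page with the most incoming links ends up with the highest score, and the page nobody links to ends up with the lowest, which is the intuition behind link-based ranking.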
 As a final word, if you search “What is the best search
engine?” you will get an answer that Google is the best
and most popular search engine and Bing is in the
second place (on a Global level).
 The list is by no means complete and for sure many
more will be created in the future but as far as the first
places are concerned, Google and Bing will hold the
lead positions for years to come.
 India must have its own search engine
 Library professionals have to work hard to be on par
with Google.
71
Thank You……
Email : librarian-hml@msubaroda.ac.in
SLIDESHARE LINK:-
https://www.slideshare.net/mayanktrivedi21
72

Search Engines Other than Google

  • 1.
    SEARCH ENGINES OTHER THANGOOGLE DR MAYANK TRIVEDI UNIVERSITY LIBRARIAN & SENATE MEMBER SMT. HANSA MEHTA LIBRARY THE MAHARAJA SAYAJIRAO UNIVERSITY OF BARODA VADODARA-390 001 E-MAIL : LIBRARIAN-HML@MSUBARODA.AC.IN DATE : 4TH AUGUST, 2021 1
  • 2.
    DICTIONARY DEFINITIONS search COMPUTING (transitiveverb) to examine a computer file, disk, database, or network for particular information engine something that supplies the driving force or energy to a movement, system, or trend search engine a computer program that searches for particular keywords and returns a list of documents in which they were found, especially a commercial service that scans documents on the Internet 2
  • 3.
    WHAT IS SEARCHENGINE  A search engine is a software system that is designed to carry out web searches.  They search the World Wide Web in a systematic way for particular information specified in a textual web search query.  The search results are generally presented in a line of results, often referred to as search engine results pages (SERPs)  The information may be a mix of links to web pages, images, videos, infographics, articles, research papers, and other types of files. 3
  • 4.
    WHAT IS SEARCHENGINE  Intelligent applications known as spiders, robots, or bots that crawl over the World Wide Web following links from website to website.  The data compiled from the web by robots is utilized to generate a retrievable index of a Website.  Displays results according to the order of the significant keyword.  The display of results differs from search engine to search engine.  It uses the keywords to search for documents that relate to these key words and then puts the result in order of relevance to the topic that was searched for.  Search engines are important because with over 8 billion web pages available, it would be impossible to search for the information that is specifically needed.  This is why search engines are used to filter the information that is on the internet and transform it into results that each individual can easily access and use within the matter of seconds 4
  • 5.
    WHAT ARE THEY? Four Components  A database of references to webpages  An indexing robot that crawls the WWW  An interface  Enables users to submit queries  Displays results  Information retrieval system  Each is unique, but are mostly the same 5
  • 6.
    FUNCTIONING  Designed tohelp people find information stored on other sites.  Select pieces from the Internet based on important keywords.  Keep an index of the words they find and where they find them.  They allow users to look for words or combinations of words found in that index.  Search engine operates, in the following order :  Crawling : Follow links to find information  Indexing : Record what words appear where  Ranking : What information is a good match to a user query? What information is inherently good?  Displaying : Find a good format for the information  Serving : Handle queries, find pages, display results 6
  • 7.
    FUNCTIONING..  1. Spiders:To find information on the hundreds of millions of Web pages that exist, a search engine employs special software robots, called spiders, to build lists of the words found on Web sites."Spiders" take a Web page's content and create key search words that enable online users to find pages they're looking for.  2. Crawling: When a spider is building its lists, the process is called Web. In order to build and maintain a useful list of words, a search engine's spiders have to look at a lot of pages.  1. Indexing: For fast accessing of data.  2. Meta tag: The contents of each page are then analyzed to determine how it should be indexed (for example, words are extracted from the titles, headings, or special fields).  Keyword Density: The ratio of the number of occurrences of a word or phrase on a page to the total number of words on the page. 7
  • 8.
    ORIGIN  The firstinternet search engines predate the debut of the Web in December 1990: Knowbot Information Service multi-network user search was first implemented in 1989.  The first well documented search engine that searched content files, namely FTP files, was Archie, which debuted on 10 September 1990 8
  • 9.
    INFORMATION RETRIEVAL  SearchEngine is in the field of IR  Searching authors, titles and subjects in library card catalogs or computers  Document classification and categorization, user interfaces, data visualization, filtering Should easily retrieve interested information  IR can be inaccurate as long as the error is insignificant  Data is usually natural language text, which is not always well structured and could be semantically ambiguous  Goal: To retrieve all the documents which are relevant to a query while retrieving as few non- relevant documents as possible 9
  • 10.
    IMPORTANCE  Visibility andRankings When searching for a service or product online, users are more likely to choose one of the top five suggestions that the search engine shows them.  Web Traffic SEO increases your organic search engine traffic, in turn increasing the number of visitors your page sees each day.  Trustworthy The better your SEO score is, the higher you’ll appear on search engines like Google and Bing. While ranking higher on Google is appealing to all brands because on increased visibility, a secondhand benefit is the trust you gain with potential customers.  User Experience A well-optimized website clearly communicates what product or service is being offered, how to obtain it and answers any questions surrounding it. By catering the site build to the user’s experience, search engines like Google and Bing are able to easily pull the information they need to then relay to users.  Growth SEO is key to the growth of your brand. The higher you rank on a search engine for a variety of high-volume keywords, the more organic (aka non-paid) web traffic your site will receive  A website that is well-optimized is more likely to gain more customers and make more sales. 10
  • 11.
    USEFULNESSS  Search enginesessentially act as filters for the wealth of information available on the internet.  Allow users to quickly and easily find information  Provide users with search results that lead to relevant information on high-quality websites.  Search engines use complex algorithms to assess websites and web pages and assign them a ranking for relevant search phrases.  Shopping, Research, Entertainment  In addition to searching text, search engines will also let you search for graphics, sounds and other kinds of files.  Search engines also provide search access to databases of third parties which allow you to search through corporate reports, telephone listings, yellow pages, zip codes and numerous other information databases.  Search engines can also perform some calculations.  You can also use search engines for conversions, like converting Celsius to Fahrenhit.  If you use convert and then the units you want to convert, Google returns the answers. Examples include "covert 15 liters to gallons" or "convert 3pm cst to est." 11
  • 12.
    TYPES OF SEARCH ENGINES CRAWLER BASED  DIRECTORIES  HYBRID SEARCH ENGINES  META SEARCH ENGINES 12
  • 13.
    CRAWLER BASED  Thesetypes of search engines use a "spider" or a "crawler" to search the Internet.  The crawler digs through individual web pages, pulls out keywords and then adds the pages to the search engine's database.  Crawler-based search engines are good when you have a specific search topic.  Google and Yahoo are examples of crawler search engines. 13
  • 14.
    DIRECTORIES  Directories dependon human editors to create their listings or the database. Yahoo Directory, Open Directory and Look Smart are few examples.  Human-powered directories are good when you are interested in a general topic of search 14
  • 15.
    HYBRID SEARCH ENGINES Hybrid search engines are search engines that use both crawler based searches and directory searches to obtain their results .  Example:- Yahoo.com- Google.com 15
  • 16.
    META SEARCH ENGINES These transmit user-supplied keywords simultaneously to several individual search engines to actually carry out the search.  Search results returned from all the search engines can be integrated, duplicates can be eliminated and additional features such as clustering by subjects within the search results can be implemented by meta-search engines.  meta engines search multiple engines  getting combined results from a variety of engines  do not have their own databases  but have their own business models affecting results  a number of techniques used  interesting ones: clustering, statistical analyses 16
  • 17.
    SAMPLE OF METAENGINES Dogpile results from a number of leading search engines; gives source, so overlap can be compared Surfwax gives statistics and text sources & linking to sources Teoma results with suggestions for narrowing; links resources derived; originated at Rutgers Turbo10 provides results in clusters; engines searched can be edited  Large directory  Complete Planet  directory of over 70,000 databases & specialty engines  Results with graphical displays  Vivisimo  clusters results; innovative  Webbrain  results in tree structure – fun to use Kartoo results in display by topics of query 17
  • 18.
    DATABASE  Where user'squery is matched  Contains only essential parts of pages  Only includes pages that were indexed  Search engines are always out of date 18
  • 19.
    WEB CRAWLER  Arobot that follows links  Records data it finds  Words in the webpage  Metadata  ALT attributes in IMG tags Robot Exclusion Protocol 19
  • 20.
    SEARCH ENGINE INTERFACES Gathers input from users  Presents results from the IR system  Often in ranked order  Input  User requirements  Search expression, search limits  Presentation style  Presentation format , search type  Output  Results  Descriptions  Clusters 20
  • 21.
    SEARCH TERM MATCHING Trying to find a match in the database  Two main methods  Keyword searching  Matching single terms, computing cosine  Concept-based searching  Examining clusters of words  Attempt to determine meaning of query and find records related to that meaning 21
  • 22.
    HOW IT WORKS? crawlers, spiders: go out to find content  in various ways go through the web looking for new & changed sites  periodic, not for each query  no search engine works in real time  some search engines do it for themselves, others not  buy content from companies such as Inktomi  for a number of reasons crawlers do not cover all of the web – just a fraction  what is not covered is “invisible web”  organizing content: labeling, arranging  indexing for searching – automatic  keywords and other fields  arranging by URL popularity - PageRank as Google  classifying as directory  mostly human handpicked & classified  as a result of different organization we have basically two kinds of search engines:  search – input is a query that is then searched & displayed  directory – classified content – a class is displayed  and fused: directories have now also search capabilities & vice versa 22
  • 23.
    ELABORATION (CONT.)  databases,caches: storing content  humongous files usually distributed over many computers  query processor: searching, retrieval, display  takes your query as input  engines have differing rules how handled  displays ranked output  some engines also cluster output and provide visualization  at the other end is your browser  all search engines have these basic parts in common  BUT the actual processes – methods how they do it – are based on various algorithms & they differ  most are proprietary with details kept mostly secret but based on well known principles from information retrieval or classification  to some extent Google is an exception – they published their method 23
  • 24.
    BASIC IR FEATURES Boolean operators  AND, OR, NOT, grouping  Extended operators  NEAR, ADJACENT, (")  Stop word deletion  Stemming  Searching in fields (e.g. host) 24
  • 25.
    WHAT ABOUT THEINVISIBLE WEB?  Also known as the Deep Web  Documents that are on the WWW but not indexed by Search Engines  Some are available only by submitting forms  Some are not generally accessible (in subnets)  Some are not in (X)HTML format  More search engines parse non-(X)HTML now than before  Because of awareness of the problem companies are making more content available using  Stable URLs  Robot-friendly sitemaps  But much content is still not indexed 25
  • 26.
    PLENTY OF IMPORTANTYET INVISIBLE DOCS  How to find them?  Many of them are in databases  No one search engine covers everything  Use database tools  Especially for research articles  Use multiple search engines or a meta- crawler  dogpile is the most famous 26
  • 27.
    RANKED OUTPUT  MostSEs produce ranked lists by applying simple rules:  Early words are more important  Title is very important  Frequency of occurrence matters for some  Infrequent words matter more  Modification date  Google is different:  PageRankTM method based on popularity  Links as money 27
  • 28.
    COVERAGE DIFFERENCES  Noengine covers more than a fraction of WWW  estimates: none more than 16%  hard (even impossible) to discern & compare coverage, but they differ substantially in what they cover  in addition:  many national search engines  own coverage, orientation, governance  many specialized or domain search engines  own coverage geared to subject of interest  many comprehensive sources independent of search engines  some have compilations of evaluated web sources 28
  • 29.
    SEARCHING DIFFERENCES  Substantialdifferences among search engines on searching, retrieval display  need to know how they work & differ in respect to  defaults in searching a query  searching of phrases, case sensitivity, categories  searching of different fields, formats, types of resources  advance search capabilities and features  possibilities for refinement, using relevance feedback  display options  personalization options 29
  • 30.
    HOW TO SUCCEEDWITH SES  As a surfer:  If you don't know what you are looking for  Use multiple SEs, or a meta-crawler  Search within results  Use Boolean expressions or search within results  Consider specialized engines  As a creator:  HTML level  Always use ALT attributes with <IMG>, etc.  Avoid frames  Make it easier to index  Don't expect SEs to find your pages  Make links between your pages  Use metadata  Informal: <meta name="description" …>  Formal: Dublin core and others  Increase your pages popularity  Don’t use systematic reciprocal linking: rings, exchanges, lists  Page Rank™ is inversely proportional to out degree 30
  • 31.
    BASICS  Generally themore keywords you use in your search the more specific and accurate your results will be.  For example, a search for the Asiatic Lion, India, Geer, Junagadh, will produce better results if you search for the words “Asiatic Lion, India, Junagadh" than if you search for just “Asiatic Lion"  Some search engines will also perform this same function when you place a + sign in front the keywords such as +Student +Contests. 31
  • 32.
    HOW TO USESE  “+” before a word in a search will locate for documents which definitely contain the word.  “-” before a word will exclude that word from search.  Placing words between quotation marks will “ ” search for phrase between the quotes.  Using “or” between search phrase will search or each term separately.  Examples :  +BLACK+BLUE: The search results will contain documents which contain the word black and the word blue.  BLACK-BLUE: Those documents will be returned which contain the word black but not the word blue.  “BLACK BLUE”: Those documents will be returned which include the phrase black blue. (placed together).  BLACK OR BLUE: Those documents will be returned which contain the term black or the term blue. 32
  • 33.
    HOW TO USESE  Building a more complex query requires the use of Boolean operators that allow you to refine and extend the terms of the search. The Boolean operators most often seen are:  AND - All the terms joined by "AND" must appear in the pages or documents. Some search engines substitute the operator "+" for the word AND.  OR - At least one of the terms joined by "OR" must appear in the pages or documents.  NOT - The term or terms following "NOT" must not appear in the pages or documents. Some search engines substitute the operator "-" for the word NOT.  FOLLOWED BY - One of the terms must be directly followed by the other.  NEAR - One of the terms must be within a specified number of words of the other.  Specify the words clearly (+, -)  Use Advanced Search when necessary  Provide as many particular terms as possible  If looking for a company, institution, or organization, try: www.name [.com | .edu | .org | .gov | country code]  Some searching engine specialize in some areas  For broad queries, try to use Web directories as starting points  Anyone can publish data on the Web, so information that they get from search engines might not be accurate. 33
  • 34.
    TIPS FOR BETTERSEARCH(GOOGLE)  Get Specific With Quotes  Use the Minus Sign to Remove Words  Use the Asterisk as a Wildcard: The asterisk acts like a wildcard in a search. This is useful if do not know part of a phrase or you forget exactly how a word or name is spelled.  Search Within a Site: To search within a site just put site: in front of the domain followed by your keywords. For example, "site:writerswrite.com harry potter" will return to the Harry Potter coverage on our site.  Calculations: Google also has a built-in calculator. You can enter a math equation like 15+25 or 18*2343 or 553/17 and Google will return the answer within a calculator that appears in your results. It will also convert units and even graph equations.  Definitions: Google also returns definitions. Use define: followed by the word you want defined. For example, "define:perquisition" will provide the definition for perquisition. You can also include wildcards. For example: define:onomato* will return the definition for onomatopoeia.  Related Sites: Google will also provide you with sites that are related to other sites. If you use "related:google.com" it returns Yahoo, Bing, DuckDuckGo and other search tools.  Search Recently Updated Webpages: If you want to narrow your search results to recent information you can use the news search tab or you can click the tools tab. You will see that the default search in tools is "any time." This can be narrowed to past hour, past 24 hours, past week, past month, past year or a custom range.  Google Alerts: Use Google Alerts to stay up-to-date on your research topics. You can configure keyword searches that you want to get updates on when there is new content available. You can get updates by email or as RSS feeds.  Random Facts with Google: Google will return a trivia question and answer if you use this keyword. 34
  • 35.
    GOOGLE  Initially knownas BackRub, Google began as a research project of Larry Page, who enrolled in Stanford’s computer science graduate program in 1995. There, he met fellow CS student Sergey Brin. The two stayed in touch as Page began looking into the behavior of linking on the World Wide Web.  In 2001, Google employee Paul Buchheit started work on an email product designed to address the company’s increasing internal communications and storage needs.  On April 1st, 2004, Gmail launched to the public with 1GB of storage and advanced search capabilities, dwarfing the limitations imposed by popular competing email products of the time, many of which offered just a few megabytes of storage.  ”Maps can be useful and fun,” said Google when it first introduced Maps in 2005. The web-only renders provide step- by-step directions and zoomable maps with a smattering of businesses like hotels available to search. 35
  • 36.
    GOOGLE  No. 1Position in Google Gets 33% of Search Traffic  GOOGLE TURNS 20: RESHAPED THE WORLD  No technology company is arguably more responsible for shaping the modern internet, and modern life, than Google.  Many of those people use Google software to search the repository of human knowledge, communicate, perform work, consume media, and maneuver the endlessly vast internet  Segmentation of search – Google would try and categorize information more, for example Google Book Search, (US) government search, blog search, etc.  Semantic Web – Google search engine is becoming more sophisticated, taking account of synonyms, page structure and user intent.  Searching the cloud – as people become more confident to store information on "cloud" hard drives, there will be a need to search these.  Real-time – searching what people are writing at the moment to catch the latest buzz and get really up to the minute information.  Mobile search – as we use mobiles for information, we will need search tools to search them, so mobile websites will need to be formatted for searchability. 36
  • 37.
    GOOGLE  Top Searchengines market share worldwide  Google dominates the field of search engines with more than a 90% market portion.  The second most popular search engine on the market is Bing with 2.78%  Other companies have even smaller Percentages: Yahoo 1.6%, Baidu 0.92%, Yandex 0.85%, and DuckDuckGo 0.5%.  The growth rate of global internet users is 8.2% per year.  Google searches completed per year grew around 10% per year  There are 40,000+ Google searches every second  Google processes more than 3.5 billion searches every day, and 1.2 trillion searches every year.  92.26% of all global searches take place on Google. 37
  • 38.
    GOOGLE  Google isthe world’s largest search engine, and it has over 1 billion people who use it’s products and services.  Google is an indexing and searching service which provides us with connections to websites and pages based on our specific search queries.  Studies show that Google indexes 35 trillion web pages and more than 2.4 million searches happen through the search engine every minute.  However, some studies show that the internet as a whole is home to a whopping 17.5 quadrillion different pages.  While Google still has significant room to grow, it has by far the biggest collection of web pages available on the internet.  This massive number of web pages available gives Google promising marketing potential, which makes it an essential marketing tool for businesses.  There are more than 3.5 billion Google searches conducted every day.  76% of all global searches take place on Google.  Google Search Index contains more than 100,000,000 GB.  16 – 20% of all annual Google search results are new.  More than 60% of Google searches come from mobile devices. 38
  • 39.
    LIST OF SEARCH ENGINES
     Image Search Engines
     1. TinEye - a reverse image search site. Upload an image from your computer or enter the web address of an image online, and TinEye will list the places online where the same or a similar image can be found.
     2. OpenClipArt - search for open source clip art you can use any way you wish. 100% free image search engine.
     3. Pixabay - over a quarter million illustrations, photos, vector and clip art images you can use without attribution in digital and printed form, even for commercial applications.
     4. Flickr - one of the oldest and most well known image search websites. Flickr is owned by Yahoo.
     5. FreeImages - download and use images, or upload and share images. Includes 250,000+ high-resolution, high-quality photographs from photographers.
     6. PrivateLee - a privacy-enhanced image search engine, not so much for downloading and using images as for conducting image searches without being tracked.
     7. Giphy - search for and find animated GIF images. You can perform a search
     Privacy Search Engines
     Search engines gather and store all kinds of information about you.
     1. Ixquick - this one is actually a proxy server. When you search using Ixquick, their computers act as a middleman or buffer, so your information and data are filtered out and stripped.
     2. StartPage - sister site to Ixquick. Both are certified search engines that do not record your IP address or track your searches in any way.
     3. PrivateLee - does not use cookies or any other tracking data, so your internet searches are not compiled, saved or shared. Has both web and image search.
     4. DuckDuckGo - non-tracking search engine. Searches can be customized to provide region-specific results and language settings.
     5. LookSeek and DarkLookSeek - both of these are from the same company.
  • 40.
    LIST OF….
     People Search Engines
     1. Spokeo - you can search for someone on Spokeo by entering a name, online username, email address or phone number.
     2. Inforegistry - web-based background check that lets you perform a people information search and instantly find the current address, phone number, marital status and additional information about any United States citizen.
     3. Peek You - a decent free people search engine that lets you track down a person or people by name, username, address, phone, email address or even interests.
     4. Background Report 360 - background investigation service and person search providing instant background check results, including court records and personal records such as addresses and phone numbers of the person being looked up.
     5. Everify - online background investigation tool and person search with 1 billion+ people records.
     6. Inteligator - comprehensive background check service that provides detailed information on anyone residing in the United States, with a records search covering sex offender records, public records, marriage and divorce, property records and criminal records history.
     7. GovRegistry - a background check, criminal records and sex offender search engine that provides comprehensive background information on any US citizen.
     Torrent Search Engines
     A torrent (BitTorrent) is a way to get things free (that normally are not free). A torrent is actually a file sharing system in which multiple servers or computers each store pieces of a file, which can be a PDF, software program, music, movie, or just about any type of file.
     1. Torrentz - a meta-search engine combining results from dozens of torrent search engines.
     2. Toorgle - another torrent meta-search engine that pulls torrent search results from multiple torrent search sites.
     3. KickAssTorrents - large torrent search engine.
  • 41.
    INFOSEEK/EXCITE/WEBCRAWLER
     InfoSeek is one of the best search engines for finding information.
     In addition to searching the Web, InfoSeek provides guides to popular subjects, which contain links to recommended sites.
     InfoSeek also groups all of your results that occur in the same website.
     Searches can be further limited by using the advanced search, reachable by following the advanced search link on InfoSeek.
     Searching for news stories is also available on InfoSeek.
     In addition to searching, you can also consult reference information such as maps, Roget's thesaurus and Webster's dictionary.
     Excite provides a search engine that will crunch through webpages, and also provides searches of recent news stories, site reviews, a shopping guide and more.
     Site reviews on Excite can also be found on WebCrawler.
     Recent news stories.
     Excite's "Power Search" lets you further limit your search.
     It lets you specify words the results must NOT contain, in addition to the words they MUST contain.
     It also lets you search only through websites that Excite recommends, or bring up only titles in your results.
     You can personalize Excite under the "My Excite" section.
  • 42.
    LYCOS/ALTAVISTA
     Lycos is another major search engine that is also becoming more of a media site, similar to Excite, Yahoo and InfoSeek.
     It uses a technology called WiseWire, which brings up pages that pertain to your query and lets users rate these pages at the same time.
     It has an advanced search option, dubbed "Lycos Pro."
     It can search for images, sounds and products, and search through Usenet postings, message boards, and personal homepages.
     Lycos also provides web reviews through its "Top 5% of the Web" service.
     AltaVista is a great search engine if you are trying to pull up a large number of webpages relating to your search.
     It is most effective when you are involved in a very specific search or when you are searching for recently added or updated webpages.
     It offers a variety of search options you may not be aware of.
     It includes a number of specialty options for use when searching newsgroups.
     It is not as useful if you are seeking a website on a general subject, because AltaVista will bring up more results than you will want, including many inappropriate listings.
  • 43.
    YAHOO/HOTBOT
     Yahoo: Fast becoming a major media leader.
     One drawback to searching on Yahoo is that it will often bring up insignificant webpages related to your topic.
     Listings can be browsed by category from Yahoo's front page.
     Yahoo also provides searchable news stories culled from various major sources.
     Yahoo provides daily picks and an internet magazine called Yahoo Internet Life.
     Yahoo also provides online diversions in the form of message boards, chat, email addresses and instant messaging (Yahoo pager).
     HotBot: Useful when you want a large number of results and are searching a specific subject.
     Provides browsing by subject through website reviews and has licensed reviews from LookSmart.
     Offers SuperSearch, which lets you restrict your web search by date, by domain suffix (e.g. .com, .net), by continent and by media type such as audio or video.
     Searches through recent news stories with its news search service, Newsbot.
     Searches for businesses, people, newsgroups, domain names, discussion groups and shareware.
  • 44.
    BEST ALTERNATIVE SEARCH ENGINES
     Some general search engines beyond the top three: Google, Bing, and Baidu.
     DuckDuckGo: Concerned about online privacy? DuckDuckGo prides itself on being the search engine that does not track or personalize your searches and results. They even offer handy visual guides on Google tracking and filter bubbling.
     If you’re an iOS user, you can set DuckDuckGo to be the default search engine in Safari. It’s also an option for Safari on macOS.
     Ecosia: Want trees planted while you search? That’s what Ecosia does! Simply run your normal searches and Ecosia will donate its surplus income to conservationist organizations that plant trees.
     Dogpile: If you want results from the top three search engines, but don’t want to go to them individually, try Dogpile.
     WolframAlpha: Looking for a search engine based on computation and metrics? Try WolframAlpha. It will give you website data, historical information by date, unit conversions, stock data, sports statistics, and more. You can see examples by topic to learn more.
     Gigablast: An open-source search engine. While it doesn’t always get things right, it offers a retro look, fast results, and a feature similar to the now-defunct Google Instant.
     Startpage: If you are looking to search without being tracked, Startpage is another solid option.
     Qwant: A Paris-based search engine dedicated to protecting your privacy. They present themselves as the first search engine to protect users’ privacy and preserve the “digital ecosystem” by remaining neutral.
  • 45.
    SOCIAL NETWORK SPECIFIC ADVANCED SEARCH
     Facebook Search: Want to see a particular search across different areas of Facebook? Use Facebook’s advanced search options. You can view search results for people, pages, places, groups, and more.
     LinkedIn People/Job/Answer Search: If you want to find some new connections on LinkedIn, use the Advanced People Search.
     LinkedIn offers job seekers an Advanced Job Search to find jobs using the above information plus experience level and industry.
     LinkedIn Answers is a great way to gain exposure and build authority in your industry. Use the Answers Advanced Search to find the perfect questions to answer.
     Twitter Search: Twitter’s Advanced Search is a great way to find better results on Twitter.
  • 46.
    SOCIAL SEARCH
     Keyhole allows you to search for hashtags, keywords, @mentions, and URLs. Want to see how your latest blog post was shared across social networks? Just select URL on Keyhole and put in the URL and you’ll see who has shared it.
     Social Mention allows you to search across multiple types of networks including blogs, microblogs, bookmarks, comments, events, images, news, and more.
     Use Buzzsumo if you have a topic in mind and want to see which articles on the web were most shared for that particular search. There is a paid version that can give you access to more tools for each topic.
  • 47.
    FORUMS
     Want to participate on forums in your industry? Use this search engine to find results specifically on forums.
     BoardReader allows you to search forums and narrow results down by date (last day through last year) and language.
     Google Forum
  • 48.
    BLOG
     Blog Search Engine aptly describes this search engine. Search blogs and blog posts using keywords. It’s not perfect, but it’s better than a general search.
  • 49.
    DOCUMENTS, EBOOKS, AND PRESENTATIONS
     Google Advanced Search allows you to search for specific types of documents. Looking specifically for PDFs? Set that as your criteria.
     Scribd is the largest social reading and publishing network that allows you to discover original written content across the web.
     SlideShare is the largest community for sharing presentations. If you missed a conference or webinar, there’s a good chance the slides from your favorite speakers are here.
  • 50.
    IMAGE SEARCH
     Flickr offers an advanced search screen to find photos, screenshots, illustrations, and videos on their network.
     Pinterest allows you to search for anything visual (clothing, cars, floors, airplanes, etc.) and pin it to your favorites. Just be sure you don’t steal copyrighted work.
     Bing offers an image search that starts out with the top trending images, then leads to images which can be filtered by size, layout, and other criteria.
     Google Advanced Image Search allows you to get even more specific about the images you are looking for, including specifying whether they are faces, photos, clip art, or line drawings.
     Have you seen an image around the web and want to know where it came from? That’s what TinEye is for. Just put your image in the search box and TinEye will find where that image has appeared around the web.
  • 51.
    CREATIVE COMMONS MEDIA
     Looking for media created by others to use on your website?
     Looking for only images that you can repurpose, use for commercial purposes, or modify?
     Try Creative Commons Search, which lets you look through multiple sources including Flickr, Google Images, Wikimedia, and YouTube.
     Wikimedia Commons has over 12 million files in their database of freely usable images, sound bites, and videos. Use the search box or browse by categories for different types of media.
  • 52.
    VIDEO SEARCH
     Yahoo Video Search allows you to search through video content from their own network, YouTube, Dailymotion, Metacafe, Myspace, Hulu, and other online video providers for videos on any topic.
     Sidereel allows you to go beyond YouTube to find shows on dozens of streaming platforms like HBO and Hulu. If you’re looking for streaming videos, you’ll likely find them here.
     AOL Video aggregates the day’s best clips from around the web, but you can also use it as a search engine.
     With Google Video Search you’ll be able to search for videos on any topic and filter your results by duration, date when uploaded, video source, and much more.
     YouTube
  • 53.
    WEBSITE DATA & STATISTICS
     CrunchBase offers insight into your favorite online brands and companies. Listings will tell you the people who are associated with a company, contact information, related videos, screenshots, and more.
     SimilarWeb allows you to search for website or app profiles based on specific domains or app names. Domains with a high volume of traffic will have data including total regional visitors per month, pageviews online vs. mobile, demographics, sites that similar audiences like, and more.
     BuiltWith allows you to search for domains and see the technology they use, including analytics, content management systems, coding, and widgets. You can also click on any of the products to see usage trends, industries using the technology, and more.
  • 54.
    DIFFICULTIES OF BUILDING A SEARCH ENGINE
     Built by companies that hide the technical details
     Distributed data
     High percentage of volatile data
     Large volume
     Unstructured and redundant data
     Quality of data
     Heterogeneous data
     Dynamic data
     How to specify a query from the user
     How to interpret the answer provided by the system
  • 55.
    USER PROBLEMS
     Do not exactly understand how to provide a sequence of words for the search
     Not aware of the input requirements of the search engine
     Problems understanding Boolean logic, so the users cannot use advanced search
     Novice users do not know how to start using a search engine
     Do not care about advertisements, which means no funding
     Around 85% of users only look at the first page of the results, so relevant answers might be skipped
  • 56.
    GOOGLE  Google LLCis an American multinational technology company that specializes in Internet-related services and products, which include online advertising technologies, a search engine, cloud computing, software, and hardware.  It is one of the Big Five companies in the American information technology industry along with Amazon, Facebook, Apple, and Microsoft  According to the latest netmarketshare report74.52% of searches were powered by Google and only 7.98% by Bing.  Google is also dominating the mobile/tablet search engine market share with 93%! 56
  • 57.
    BING  Microsoft Bing(formerly known simply as Bing) is a web search engine owned and operated by Microsoft.  The service has its origins in Microsoft's previous search engines: MSN Search, Windows Live Search and later Live Search.  Bing provides a variety of search services, including web, video, image and map search products. It is developed using ASP.NET.  Bing is Microsoft’s attempt to challenge Google in the area of search.  Despite their efforts they still did not manage to convince users that their search engine can produce better results than Google. 57
  • 58.
    YAHOO! SEARCH
     Yahoo (styled as yahoo!) is an American web services provider.
     It is headquartered in Sunnyvale, California and is owned by Verizon Media, pending sale to investment funds managed by Apollo Global Management.
     Since October 2011, Yahoo search has been powered by Bing.
     Yahoo is still the most popular email provider and, according to some studies, holds fourth place in search.
  • 59.
    ASK  Ask.com (originallyknown as Ask Jeeves) is a question answering–focused e-business founded in 1996 by Garrett Gruener and David Warthen in Berkeley, California.  Formerly known as Ask Jeeves, Ask.com receives approximately 0.05% of the search share.  ASK is based on a question/answer format where most questions are answered by other users or are in the form of polls.  It also has the general search functionality but the results returned lack quality compared to Google or even Bing and Yahoo. 59
  • 60.
    AOL SEARCH
     AOL (stylized as Aol., formerly a company known as AOL Inc. and originally known as America Online) is an American web portal and online service provider based in New York City.
     According to net market share, the once-famous AOL is still in the top 10 search engines, with a market share close to 0.04%.
     The AOL network includes many popular web sites like engadget.com, techcrunch.com and huffingtonpost.com.
  • 61.
    BAIDU  Baidu wasfounded in 2000 and it is the most popular search engine in China.  It’s market share is increasing steadily and according to Wikipedia, Baidu is serving billion of search queries per month.  It is currently ranked at position 4, in the Alexa Rankings. And Rank No. 1 in China  As Google maintains its stronghold in the global internet search arena, Baidu, Inc.,has the upper hand in China, with 72.37% of the nation's market share as of May 2021. 61
  • 62.
    WOLFRAMALPHA
     WolframAlpha is different from all the other search engines.
     They market it as a Computational Knowledge Engine which can give you facts and data for a number of topics.
     It can do all sorts of calculations. For example, if you enter “mortgage 2000” as input, it will calculate your loan amount, interest paid, etc. based on a number of assumptions.
  • 63.
    DUCKDUCKGO
     Has a number of advantages over the other search engines.
     It has a clean interface, it does not track users, it is not loaded with ads, and it has a number of very nice features (only one page of results, you can search other web sites directly, etc.).
     Privacy
     Update: According to DuckDuckGo traffic stats, as of October 2018, DuckDuckGo was serving more than 30 million searches per day.
  • 64.
    INTERNET ARCHIVE
     The Internet Archive is an American digital library with the stated mission of "universal access to all knowledge".
     It provides free public access to collections of digitized materials, including websites, software applications/games, music, movies/videos, moving images, and millions of books.
     In addition to its archiving function, the Archive is an activist organization, advocating a free and open Internet.
     Archive.org is the Internet Archive search engine.
     It is a very useful tool if you want to trace the history of a domain and examine how it has changed over the years.
  • 65.
    YANDEX.RU
     Yandex is a Russian Dutch-domiciled multinational corporation providing over 70 Internet-related products and services, including transportation, search and information services, e-commerce, navigation, mobile applications, and online advertising.
     According to Alexa, Yandex.ru is among the 30 most popular websites on the Internet, with a ranking position of 4 in Russia.
     Yandex presents itself as a technology company that builds intelligent products and services powered by machine learning.
  • 66.
    DOGPILE  Dogpile isa metasearch engine for information on the World Wide Web that fetches results from Google, Yahoo!,Yandex, Bing, and other popular search engines, including those from audio and video content providers such as Yahoo! 66
  • 67.
    SEARCH ENGINE OPTIMIZATION (SEO)
     SEO is the process of improving the volume and quality of traffic to a website from search engines.
     A higher ranking when someone searches a term in your industry increases your brand’s visibility online.
     It gives more opportunities to convert qualified prospects into customers.
     SEO can help your brand stand above others as a trustworthy company and further improve the user’s experience with your brand and website.
     As a marketing strategy for increasing a site's relevance, SEO considers how search algorithms work and what people search for.
     Search engines often index millions of pages for certain keywords, and your website can be buried deep on page 100 or worse if it is not optimized properly.
     Building or adjusting a website so that it comes up on the first or second page of search engine results is the goal of an SEO service.
     If your page is not optimized, it will not get a high ranking and you will not get results, no matter how many search engines the site has been submitted to.
     Effective SEO may require changes to the HTML source code of a site.
     SEO tactics may be incorporated into web site development and design.
     The term "search engine friendly" may be used to describe web site designs, menus, content management systems and shopping carts that are easy to optimize.
  • 68.
    LIMITATIONS
     Every search engine has limitations as to:
     coverage (meta engines just follow these coverage limitations and have more of their own)
     search capabilities
     finding quality information
     Some have compromised search with economics, becoming little more than advertisers.
     But search engines are also often victims of spam indexing, which affects what is included and how it is ranked.
  • 69.
    FACTS  Most searchengines have vanished.  Google is a big player.  63% of Internet users use a search engine in a given session.  Approximately 94 million adults use the internet on an average day.  This means approximately 59.22 MILLION people use search engines in an average day.  Microsoft realized Internet is here to stay  i. Dominates the browser market.  ii. Realizes search is critical. 69
  • 70.
    CONCLUSION
     The primary goal is to provide high-quality and relevant search results over a rapidly growing World Wide Web. For example, Google employs a number of techniques to improve search quality, including PageRank, anchor text, and proximity information.
     Google is a complete architecture for gathering web pages, indexing them, and performing search queries over them.
     A search engine is a really useful tool in the present era of the web.
     There are many search engines available in the market, but the most popular search engine is Google.
     So to get topmost results on the web, we have to use search engine optimization techniques.
     Both on-page and off-page search engine optimization techniques are important for better search results.
     Of the three flavors of SEO, the White Hat technique is the best and also the most sustainable in the long term.
  • 71.
     As a final word, if you search “What is the best search engine?” you will get an answer that Google is the best and most popular search engine and Bing is in second place (on a global level).
     The list is by no means complete, and many more search engines will surely be created in the future, but as far as the first places are concerned, Google and Bing will hold the lead positions for years to come.
     India must have its own search engine.
     Library professionals have to work hard to be on par with Google.
  • 72.
    Thank You……
    Email: librarian-hml@msubaroda.ac.in
    SLIDESHARE LINK: https://www.slideshare.net/mayanktrivedi21