2. Sometimes called a search engine service or search service.
A website built as a tool for finding files
on the Internet.
The files are gathered by a computer program (software)
called a robot, spider, crawler,
wanderer, or worm.
3. The robot program works its way across the networks of the Internet, collecting data
from websites and storing it in its own database.
It then builds an index from the stored files to support searching
(title, full text, size, URL, etc.).
There are no selection criteria for the collection of files.
4. A search engine is a searchable database of Internet files
collected by a computer program, called a crawler,
robot, worm, or spider.
An index is created from the collected files, e.g., title, full
text, date last modified, URL, language, etc.
Results are ranked by relevance; this will vary among
search engines.
http://www.internettutorials.net/world-of-search-engines.asp
5. Search Engines:
are tools that help you to find things on the Internet. One example of a
popular search engine is Google.
(http://www.sag.org/content/new-media-glossary)
Search Engines:
Web services which help search through Internet addresses for user-
defined terms or topics in which you are interested.
(www.starrsites.com/glossary.htm)
6. A web search engine is designed to search for information on
the World Wide Web.
The search results are usually presented in a list and are commonly
called hits.
The results may consist of web pages, images, and
other types of files.
Some search engines also mine data available in databases or open
directories.
Unlike Web directories, which are maintained by human editors,
search engines operate algorithmically or are a mixture of
algorithmic and human input.
(http://en.wikipedia.org/wiki/Search_Engines)
7. A search engine consists of three parts
1. Spider: a program that roams the Web collecting new and updated pages
and stores them in the database
(Program that traverses the Web from link to link, identifying and reading pages)
2. Index: a database containing copies of the Web pages gathered by the spider
(Database containing a copy of each Web page gathered by the spider)
3. Search and retrieval mechanism: technology that looks up search terms in the database
----> and displays the results on screen in order of relevance
(Technology that enables you to search the index and that returns results in a
relevancy-ranked order)
8. In essence, a search engine consists of three components:
Spider: Program that traverses the Web from link to link,
identifying and reading pages
Index: Database containing a copy of each Web page or other
file gathered by the spider
Search and retrieval mechanism: Technology that enables you to
search the index and that returns results in a relevancy-
ranked order
(http://www.internettutorials.net/world-of-search-engines.asp)
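The three components can be pictured with a minimal in-memory sketch. Everything here is an assumption made for illustration: the "Web" is a small hypothetical dictionary of example.* pages, and relevance is approximated by simple word counts, which is far cruder than what real engines do.

```python
from collections import defaultdict, deque

# Hypothetical miniature "Web": URL -> (page text, outgoing links).
TOY_WEB = {
    "http://a.example/": ("rain and snow in winter",
                          ["http://b.example/", "http://c.example/"]),
    "http://b.example/": ("rain rain go away", ["http://a.example/"]),
    "http://c.example/": ("snow report for skiers", []),
}

def spider(start_url, web):
    """Spider: follow links breadth-first and return {url: text} for each page reached."""
    seen, queue, pages = {start_url}, deque([start_url]), {}
    while queue:
        url = queue.popleft()
        text, links = web[url]
        pages[url] = text
        for link in links:
            if link in web and link not in seen:
                seen.add(link)
                queue.append(link)
    return pages

def build_index(pages):
    """Index: map each word to {url: number of occurrences on that page}."""
    index = defaultdict(dict)
    for url, text in pages.items():
        for word in text.lower().split():
            index[word][url] = index[word].get(url, 0) + 1
    return index

def search(index, term):
    """Search and retrieval: URLs containing the term, most occurrences first."""
    hits = index.get(term.lower(), {})
    return sorted(hits, key=lambda url: (-hits[url], url))

pages = spider("http://a.example/", TOY_WEB)
index = build_index(pages)
print(search(index, "rain"))
```

The page that mentions rain twice is listed ahead of the page that mentions it once, which is the toy counterpart of relevancy ranking.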
9. There are two major types of search engines:
•Individual: An individual engine uses a spider to collect its
own searchable index.
•Meta: A meta engine searches multiple individual engines
simultaneously. It does not have its own index, but uses
the indexes collected by the spiders of other search
engines. This type of engine is covered later in this
tutorial under the topic of Meta Search Engines.
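As a rough sketch of the meta idea, the function below passes one query to several engines and merges what comes back without keeping an index of its own. The two engine functions are hypothetical stand-ins that return fixed example.* URLs, and the engines are queried one after another rather than simultaneously to keep the code short.

```python
def meta_search(query, engines):
    """Send the query to each engine and merge the results, dropping duplicates."""
    merged, seen = [], set()
    for name, engine in engines.items():
        for url in engine(query):
            if url not in seen:
                seen.add(url)
                merged.append((url, name))  # remember which engine found it first
    return merged

# Hypothetical stand-ins for individual engines with their own indexes.
def engine_one(query):
    return ["http://a.example/", "http://b.example/"]

def engine_two(query):
    return ["http://b.example/", "http://c.example/"]

results = meta_search("rain", {"one": engine_one, "two": engine_two})
for url, source in results:
    print(url, "from engine", source)
```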
16. When you have a well-defined topic or idea to research
When your topic is obscure
When you are looking for a specific site
When you want to search the full text of millions of Web pages
When you want to retrieve a large number of Web sites on your topic
When you want to search for particular types of documents, sites,
file types, languages, date last modified, geographical location, etc.
17. Use a search engine to look for narrowly specific topics such as
John Lennon, French wine, or digital libraries project,
which may not be found at all, or may return only
a handful of websites, if you use subject directories.
36. Before you search, make a plan!
Putting together a search is a three-step process.
1. Identify your concepts
When planning your search, break down your topic into its
separate concepts. Let's say you're interested in the effects of global
warming on crops. In this case, you have two concepts: GLOBAL
WARMING and CROPS.
37. 2. Make a list of search terms for each concept
Once you have identified your concepts, list the terms which
describe each concept. Some concepts may have only one term, while
others may have many.
GLOBAL WARMING: global warming, greenhouse effect, greenhouse gases, climate change
CROPS: crops, crop yields, crop production, food supply
These lists are a suggestion. Depending on the focus of your
search, there may be other terms more suited to what you're looking for.
38. 3. Specify the logical relationships among your search terms
Once you know the words you want to search, you need to
establish the logical relationships among them using Boolean logic:
AND, OR, NOT.
To keep things simple, you don't need to use all the words
you've compiled in a single search. The words are there to help you
experiment with different searches until you find what you want.
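One way to picture the three steps is a small helper that ORs the terms within each concept and ANDs the concepts together. This is only a sketch: the function name build_query is made up for this example, and the parenthesized AND/OR syntax it produces is an assumption, since the exact syntax accepted differs among search engines.

```python
def build_query(concepts):
    """OR the terms within each concept, then AND the concepts together."""
    groups = []
    for terms in concepts:
        # Quote multi-word terms so they are kept together as phrases.
        quoted = [f'"{term}"' if " " in term else term for term in terms]
        groups.append("(" + " OR ".join(quoted) + ")")
    return " AND ".join(groups)

concepts = [
    ["global warming", "climate change"],  # concept 1: GLOBAL WARMING
    ["crops", "food supply"],              # concept 2: CROPS
]
print(build_query(concepts))
# ("global warming" OR "climate change") AND (crops OR "food supply")
```

Swapping terms in and out of the two lists is a quick way to experiment with different searches.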
39. Boolean AND search
Let's start with a very simple two-word search. In this type of
search, we want Web pages that contain both of our search terms. This
is Boolean AND logic. This is probably the most common type of search
that people want to do.
In our example, we're asking for documents that contain the
words rain and snow. To do this, we simply type the two words into the
search box with a space between them. This is the default logic on
Google and nearly all other general search engines on the Web.
Notice how both words appear in the results. This is exactly
what we wanted.
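Behind the scenes, an AND search can be thought of as intersecting the sets of documents that contain each word. The sketch below assumes a tiny hypothetical collection of three documents and a simple word-to-documents index; it is not how any particular engine is implemented.

```python
# Hypothetical toy collection: document id -> text.
docs = {
    1: "rain and snow fell overnight",
    2: "heavy rain expected",
    3: "snow closes mountain pass",
}

def build_index(docs):
    """Map each word to the set of document ids that contain it."""
    index = {}
    for doc_id, text in docs.items():
        for word in set(text.lower().split()):
            index.setdefault(word, set()).add(doc_id)
    return index

def and_search(index, query):
    """Return ids of documents that contain every word in the query."""
    words = query.lower().split()
    if not words:
        return set()
    result = set(index.get(words[0], set()))
    for word in words[1:]:
        result &= index.get(word, set())  # keep only documents that also have this word
    return result

index = build_index(docs)
print(and_search(index, "rain snow"))  # {1}
```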
41. A variant of an AND search is the plus sign (+). In many search
engines, the plus sign signals an AND search. It guarantees that the
words or phrases you include in your search will appear in your search
results. For example, +rain +snow. In most search engines, you don't
need to use the plus sign - the search engine will assume it. But it
doesn't hurt to use it.
Google takes the plus sign seriously. If you include the plus sign,
it will search for your words exactly as you have typed them. This keeps
Google from adding synonyms to your search to bring in a wider set of
results. In other words, the plus sign disables the synonyms feature.
42. Boolean OR search
What if we want results that include either the word rain or
the word snow? This calls for Boolean OR logic. An easy way to
ensure this is to use the advanced search page. Most search engines
have such an option and it's very useful. Notice how the two search
terms were typed into the line labeled "one or more of these words".
The search results include pages with just the word rain or just
the word snow, exactly as we wanted.
Google requires that the word OR be typed in CAPITAL
LETTERS. So do some other search engines. Since this may not be
easy to remember, it's best to go to the advanced search page and let the
search engine do the rest.
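In set terms, an OR search keeps any document that contains at least one of the words. The sketch below assumes a hypothetical four-document collection and treats only the capitalized word OR as the operator; both choices are illustrative and not a description of any real engine.

```python
# Hypothetical toy collection: document id -> text.
docs = {
    1: "rain and snow fell overnight",
    2: "heavy rain expected",
    3: "snow closes mountain pass",
    4: "sunny and dry",
}

def or_search(docs, query):
    """Return ids of documents containing at least one of the query words."""
    # Only the uppercase token OR is treated as the operator and dropped.
    words = [token.lower() for token in query.split() if token != "OR"]
    return {
        doc_id
        for doc_id, text in docs.items()
        if any(word in text.lower().split() for word in words)
    }

print(sorted(or_search(docs, "rain OR snow")))  # [1, 2, 3]
```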
45. Boolean NOT search
Sometimes you want to retrieve documents that do not contain a
particular word. This can help when associated words are not really
relevant and can muddy the focus of your results. To do this, place a
minus sign (-) in front of the word you want to exclude.
Let's go back to our rain-snow example. In this case, we want
documents that contain the word rain, but not the word snow. So,
we've placed the minus sign immediately in front of the word snow:
rain -snow
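The effect of the minus sign can be sketched as a filter: words without a minus must be present, and words with a minus must be absent. The document collection and the function name not_search are hypothetical and only serve the illustration.

```python
# Hypothetical toy collection: document id -> text.
docs = {
    1: "rain and snow fell overnight",
    2: "heavy rain expected",
    3: "snow closes mountain pass",
}

def not_search(docs, query):
    """Words prefixed with '-' must be absent; all other words must be present."""
    required, excluded = set(), set()
    for token in query.lower().split():
        if token.startswith("-") and len(token) > 1:
            excluded.add(token[1:])
        else:
            required.add(token)
    hits = set()
    for doc_id, text in docs.items():
        words = set(text.lower().split())
        if required <= words and not (excluded & words):
            hits.add(doc_id)
    return hits

print(not_search(docs, "rain -snow"))  # {2}
```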
48. Beware!
(1) You may end up excluding relevant pages with this
technique. Proceed with care.
(2) You may also end up with results that you don't want.
When you look at the results screen above, you see that the term
rain has different meanings. To search for the weather phenomenon of
rain, it would be a good idea to add a semantically meaningful word such
as weather, storm or the like.
The more specific your terminology, the better your results will
be.
50. Phrase Search
Some words naturally appear in the context of a phrase, for
example, freedom of the press. To search on phrases in most search
engines, simply enclose the phrase within double quotes:
"freedom of the press"
Phrases are especially important when there are stop words in
your search. These are "little" words such as a, and, the, in, it, etc. Most
search engines tend to ignore these words. If you want to be sure they
are included in your search results, enclose them with the rest of your
search within quotation marks. You can also put a plus sign (+) in front
of them. Yahoo! suggests a combination of quotation marks and the
plus sign, e.g., "+in thing".
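What the double quotes ask for can be sketched as a check that the words occur next to each other and in the same order, stop words included. The three sample documents are made up for this illustration, and real engines use more efficient positional indexes than this direct scan.

```python
def phrase_search(docs, phrase):
    """Return ids of documents where the phrase's words appear consecutively, in order."""
    target = phrase.lower().split()
    if not target:
        return set()
    n = len(target)
    hits = set()
    for doc_id, text in docs.items():
        words = text.lower().split()
        # Slide a window of n words across the document and compare.
        if any(words[i:i + n] == target for i in range(len(words) - n + 1)):
            hits.add(doc_id)
    return hits

# Hypothetical toy collection: document id -> text.
docs = {
    1: "freedom of the press is protected",
    2: "the press reported on freedom of speech",
    3: "press freedom index",
}
print(phrase_search(docs, "freedom of the press"))  # {1}
```

Documents 2 and 3 contain some of the same words but not the exact phrase, so only document 1 is returned.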
52. Field Search
Field searching is an optional way to focus your search results.
With general search engines, you're searching the full text of many
millions of pages, and field searching can help you retrieve results that
may be more manageable.
For example, you can search for words that appear within a
particular Web site, within the URL (Web address), in the page title,
and so on.
The exact technique for doing this can differ among search
engines, so be sure to check out the Help pages before proceeding.
Let's consider a couple of examples on Google.
57. Notice that all the results come from the site nasa.gov.
You can also go to the advanced search page on Google
to conduct this search.
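Restricting results to one site can be sketched as a filter on the host part of each URL. The URLs below are illustrative only (the paths are invented), and the rule of also accepting subdomains is an assumption of this sketch rather than a statement about how Google handles it.

```python
from urllib.parse import urlparse

def site_filter(urls, site):
    """Keep URLs whose host is the given site or one of its subdomains."""
    site = site.lower()
    kept = []
    for url in urls:
        host = (urlparse(url).hostname or "").lower()
        if host == site or host.endswith("." + site):
            kept.append(url)
    return kept

# Illustrative URLs; only the host names matter here.
urls = [
    "http://www.nasa.gov/missions",
    "http://science.nasa.gov/earth",
    "http://www.example.com/nasa.gov",  # nasa.gov appears only in the path
    "http://notnasa.gov/",              # different host that merely ends in the same letters
]
print(site_filter(urls, "nasa.gov"))
```

Only the first two URLs pass the filter, because the match is made on the host and not on the full address.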
58. Natural language search
A few search engines encourage you to type your search as a
"normal" question or sentence, rather than concern yourself with
Boolean logic.
This is sometimes known as a natural language search. On
these engines, a variety of sophisticated techniques are working behind
the scenes to analyze your search and return relevant results.
Hakia is a good example of this type of engine. Give it a try
and see what you think.
60. Fill in the blanks (*)
The *, or wildcard, is a little-known feature that can be very
powerful.
If you include * within a query, it tells Google to try to treat
the star as a placeholder for any unknown term(s) and then find the
best matches.
For example, the search [ Google * ] will give you
results about many of Google's products (go to next page and next page --
we have many products).
The query [ Obama voted * on the *
bill ] will give you stories about different votes on different bills.
Note that the * operator works only on whole words, not parts of words.
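The whole-word behavior of the star can be approximated with a regular expression in which each * stands for one or more complete words. This is a sketch for matching text locally, not Google's actual matching logic, and the sample sentences are invented for the illustration.

```python
import re

def wildcard_to_regex(query):
    """Compile a query in which each standalone * stands for one or more whole words."""
    parts = []
    for token in query.split():
        if token == "*":
            parts.append(r"\w+(?:\s+\w+)*?")  # one word, optionally followed by more
        else:
            parts.append(re.escape(token))    # any other token is matched literally
    return re.compile(r"\b" + r"\s+".join(parts) + r"\b", re.IGNORECASE)

pattern = wildcard_to_regex("Obama voted * on the * bill")
print(bool(pattern.search("Obama voted yes on the energy bill")))  # True
print(bool(pattern.search("Obama voted on the bill")))             # False
```

The second sentence does not match because each star must be filled by at least one word.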
62. Further reading
1. Bet Search Tips
(http://www.internettutorials.net/best-bet-search.asp)
2. Boolean Searching on the Internet
(http://www.internettutorials.net/boolean.asp)