7. Suggested Answers:
1. Crawler search engines rely on sophisticated computer
programs called “spiders,” “crawlers,” or “bots” that surf the
Internet, locating webpages, links, and other content that is
then stored in the search engine’s page repository.
Web directories are categorized listings of webpages created
and maintained by humans. Because websites are only included
after being reviewed by a person, it is less likely that search
results will contain irrelevant websites.
2. An index helps search engines efficiently locate relevant
pages containing keywords used in a search.
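To make this concrete, here is a minimal sketch of an inverted
index, the core data structure behind this kind of lookup. The
pages and query are invented for illustration.

from collections import defaultdict

pages = {
    "page1": "cheap flights to paris",
    "page2": "paris hotels and restaurants",
    "page3": "cheap hotels in rome",
}

# Build the index: each word maps to the set of pages containing it.
index = defaultdict(set)
for page_id, text in pages.items():
    for word in text.split():
        index[word].add(page_id)

# Answering a query becomes a set intersection, not a full scan.
query = ["cheap", "hotels"]
print(set.intersection(*(index[word] for word in query)))  # {'page3'}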
3. The methods by which search engines determine page rank
vary and the specific algorithms they use are often carefully
guarded trade secrets. In some cases, a search engine may use
hundreds of different criteria to determine which pages appear
at the top of a SERP. Google, for instance, claims to use over
200 “clues” to determine how it ranks pages (Google.com,
2014). According to Dover and Dafforn (2011), all these factors
can be grouped into two categories: relevance and popularity (or
“authority”).
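As a hedged illustration only (the real algorithms are trade
secrets), a ranker that grouped its signals this way might combine
the two categories as a weighted sum. The weights and scores below
are invented.

def rank_score(relevance: float, popularity: float,
               w_relevance: float = 0.6, w_popularity: float = 0.4) -> float:
    """Combine the two signal groups into one sortable score."""
    return w_relevance * relevance + w_popularity * popularity

pages = {"pageA": (0.9, 0.3), "pageB": (0.6, 0.8)}
serp = sorted(pages, key=lambda p: rank_score(*pages[p]), reverse=True)
print(serp)  # ['pageB', 'pageA']: pageB's popularity outweighs pageA's
             # higher relevance at these particular weights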
4. Businesses utilize search engine optimization (SEO) to
improve their website’s organic listings on SERPs. No payments
are made to the search engine service for organic search
listings.
Pay-per-click (PPC) advertising refers to paid search listings
where advertisers pay search engines based on how many people
click on the ads.
5. Google Trends (google.com/trends) will help you identify
current and historical interest in the topic by reporting
relevance.
Amount of text on the page that appears relevant: The proportion
of relevant to non-relevant text can influence relevance.
Backlinks from relevant sites and Web directories: Webpages
that are listed in relevant categories of Web directories are more
likely to be relevant because they were reviewed by human
editors.
SERP click-through rate (CTR): Searchers are more likely to
click on listings that contain relevant content.
On-page factors: Metadata (such as page titles and page descriptions)
and descriptive URLs should reflect the page content. People
use the information in search listings to determine if a link
contains relevant information. This affects CTR.
Dwell time and bounce rate are impacted by how relevant a
website’s content is. Long dwell times and low bounce rates
suggest content relevant to the search.
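A minimal sketch of how these two engagement signals are computed,
using invented session data (a "bounce" here is a single-page
session):

sessions = [
    {"pages_viewed": 1, "seconds_on_site": 8},
    {"pages_viewed": 4, "seconds_on_site": 310},
    {"pages_viewed": 2, "seconds_on_site": 95},
]

bounce_rate = sum(s["pages_viewed"] == 1 for s in sessions) / len(sessions)
avg_dwell = sum(s["seconds_on_site"] for s in sessions) / len(sessions)
print(f"bounce rate: {bounce_rate:.0%}, average dwell time: {avg_dwell:.0f}s")
# bounce rate: 33%, average dwell time: 138s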
Search engines want their customers to be satisfied. As a result,
SERP ranking is influenced by factors that impact user
satisfaction. Factors that are likely to influence a search
engine’s user satisfaction rating are:
Dwell time: Users who stay on a site longer are probably more
satisfied.
Site speed: Slow page loading time on websites reduces
satisfaction.
Reading level: Reading levels that are too high or too low
frustrate users.
Hacked sites, malware, and spam reduce user satisfaction
significantly.
Website satisfaction surveys: Google created user satisfaction
surveys that webmasters can embed in their websites. Positive
responses to these surveys can improve ranking.
Barriers to content: Making people register, provide names, or
fill out forms to get to content has a negative impact on user
satisfaction.
Other factors: Too many ads, page-not-found errors, duplicate
content/pages, content copied from other websites, and spam in
comment sections all detract from user satisfaction.
2. A backlink is an external link that points back to a site.
Google has changed its ranking methods over time; the assumption
is that backlinks are still very important, but not weighted as
heavily as they used to be. Weighing a number of other factors
yields a better picture of the relevance of sites to the search.
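A simplified PageRank-style iteration illustrates why backlinks
act as votes: pages linked to by important pages become important
themselves. The link graph below is invented, and real rankers
combine many more signals.

links = {  # page -> pages it links to
    "A": ["B", "C"],
    "B": ["C"],
    "C": ["A"],
}
rank = {p: 1 / len(links) for p in links}
damping = 0.85

for _ in range(50):  # iterate until the scores settle
    new_rank = {}
    for page in links:
        inbound = sum(rank[src] / len(outs)
                      for src, outs in links.items() if page in outs)
        new_rank[page] = (1 - damping) / len(links) + damping * inbound
    rank = new_rank

print({p: round(rank[p], 3) for p in links})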
3. Black hat tactics try to trick the search engine into thinking a
website has high-quality content, when in fact it does not. The
search engines now have stronger detection systems in place, and
when these tactics are discovered, Google and other search
engines will usually punish the business by dramatically lowering
the website’s rank so that it does not show up on SERPs at all.
4. Using Web analytics programs like Google Analytics,
companies can determine how many people visit their site, what
specific pages they visit, how long they spend on the site, and
which search engines are producing the most traffic. More
sophisticated SEO practitioners will also analyze keyword
conversion rates, or the likelihood that using a particular
keyword to optimize a page will result in conversions (i.e.,
when a website visitor converts to a buyer). These are just a few
of the many metrics used to measure the effectiveness of SEO
strategies.
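A minimal sketch of keyword conversion-rate analysis as described
above, with invented traffic numbers (real figures would come
from an analytics tool such as Google Analytics):

keyword_stats = {
    # keyword: (visits it drove, purchases it led to)
    "running shoes": (1200, 36),
    "cheap sneakers": (2500, 25),
    "marathon trainers": (400, 28),
}

for keyword, (visits, conversions) in keyword_stats.items():
    print(f"{keyword}: {conversions / visits:.1%} conversion rate")
# "marathon trainers" converts best (7.0%) despite the least
# traffic, so it may deserve more optimization effort.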
5. Perhaps the most important action an organization can take to
improve its website’s ranking and satisfy website visitors is
provide helpful content that is current and updated regularly.
When SEO practices are combined with valuable content,
websites become easier to find in search engines but, more
importantly, contribute to building brand awareness, positive
attitudes toward the brand, and brand loyalty.
want their ad to appear in and time of day. These factors allow
for additional customer targeting designed to help advertisers
reach the consumers most likely to purchase their products.
A quality score is determined by factors related to the user’s
experience. Ads that are considered more relevant (and therefore
more likely to be clicked on) will cost less and are more likely
to run in a top position.
Relevant ads are good for all parties—the search engine makes
more money from clicked ads, the advertiser experiences more
customers visiting its site, and the customer is more likely to
find what he or she is looking for.
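A hedged sketch of how such an auction might work: search engines
are widely described as ranking ads by bid combined with quality
score, so a more relevant ad can win a top slot at a lower cost.
The exact formulas are proprietary; the advertisers, bids, and
scores below are invented.

ads = {
    # advertiser: (max bid in dollars, quality score on a 1-10 scale)
    "advertiser_a": (2.00, 4),
    "advertiser_b": (1.20, 9),
}

ad_rank = {name: bid * quality for name, (bid, quality) in ads.items()}
for name in sorted(ad_rank, key=ad_rank.get, reverse=True):
    print(name, ad_rank[name])
# advertiser_b wins the top slot (10.8 vs 8.0) despite bidding
# less, because its ad is judged more relevant.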
4. The effectiveness of PPC ads is heavily influenced by factors
on the webpages that ads are linked to. For instance, sometimes
companies create product-oriented ads, but then link to the main
page of their website instead of a page with information about
the product in the ad. Other factors include landing page design,
effectiveness of the call to action, and the quality of the
shopping cart application. A PPC campaign will not be very
effective if the website is not attractive to consumers once they
reach it.
5. Click-through rates (CTRs)—By themselves, CTRs do not
measure the financial performance of an ad campaign. But they
are useful for evaluating many of the decisions that go into a
campaign, such as keyword selection and ad copy.
Keyword conversion—High CTRs are not always good if they
do not lead to sales. Since the cost of the campaign is based on
how many people click an ad, you want to select keywords that
lead to sales (conversions), not just site visits. PPC advertisers
monitor which keywords lead to sales and focus on those in
future campaigns.
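A minimal sketch of these two metrics with invented campaign
numbers:

impressions = 50_000
clicks = 1_000
conversions = 40
cost_per_click = 0.75

ctr = clicks / impressions
cost_per_conversion = clicks * cost_per_click / conversions
print(f"CTR: {ctr:.1%}")                                   # 2.0%
print(f"Cost per conversion: ${cost_per_conversion:.2f}")  # $18.75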
highlighting of search terms, but also related words or phrases
that may not have appeared in the original query. These can be
used in future searches simply by clicking on them.
Full-text similarity search. Users can submit a block of text or
even a full document to find similar content.
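A minimal sketch of one common way to implement this, using
TF-IDF vectors and cosine similarity. It assumes the scikit-learn
library is installed; the documents and query text are invented,
and real engines may use other techniques.

from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

documents = [
    "Quentin Tarantino directed several acclaimed films.",
    "A guide to growing green vegetables at home.",
    "Classic movies by famous directors, reviewed.",
]
query_text = "films by well-known movie directors"

vectorizer = TfidfVectorizer()
doc_vectors = vectorizer.fit_transform(documents)
query_vector = vectorizer.transform([query_text])

# Rank stored documents by cosine similarity to the submitted text.
scores = cosine_similarity(query_vector, doc_vectors)[0]
print(documents[scores.argmax()])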
Search on semantic/syntactic annotations. This approach would
allow a user to indicate the “syntactic role the term plays—for
instance, the part-of-speech (noun, verb, etc.)—or its semantic
meaning—whether it’s a company name, location, or event.” For
instance, a keyword search on the word “center” would produce
too many results. Instead, a search query could be written using
a syntax such as the following:
<organization> center </organization>
This would only return documents where the word “center” was
part of an organization’s name. Google currently allows you to
do something similar to specify the kind of files you are looking
for (e.g., filetype:pdf).
Concept search. Search engines could return results with related
concepts. For instance, if the original query was “Tarantino
films,” documents would be returned that contain the word
“movies” even if not the word “films.”
Ontology-based search. Ontologies define the relationships
between data. An ontology is based on the concept of “triples”:
subject, predicate, and object. This would allow the search
engine to answer questions such as “What vegetables are
green?” The search engine would return results about
“broccoli,” “spinach,” “peas,” “asparagus,” “Brussels sprouts,”
and so on.
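A minimal sketch of the triple idea, using a hand-built fact list
rather than a real ontology language (an RDF/SPARQL version
appears later in this section):

triples = [
    ("broccoli", "is_a", "vegetable"),
    ("broccoli", "has_color", "green"),
    ("spinach", "is_a", "vegetable"),
    ("spinach", "has_color", "green"),
    ("carrot", "is_a", "vegetable"),
    ("carrot", "has_color", "orange"),
]

# "What vegetables are green?" becomes two triple patterns.
vegetables = {s for s, p, o in triples if p == "is_a" and o == "vegetable"}
green = {s for s, p, o in triples if p == "has_color" and o == "green"}
print(vegetables & green)  # {'broccoli', 'spinach'}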
Semantic Web search. This approach would take advantage of
content tagged with metadata as previously described in this
section. Search results are likely to be more accurate than
keyword matching.
Faceted search. Faceted search provides a means of refining
results based on predefined categories called facets. For
instance, a search on “colleges” might result in options to
“refine this search by. . .” location, size, degrees offered,
private or public, and so on. Faceted search tools available
today tend to focus on a specific domain, such as Wikipedia or
Semidico, a search tool for biomedical literature.
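A minimal sketch of facet refinement over invented college
records:

colleges = [
    {"name": "College A", "location": "OH", "size": "small", "type": "private"},
    {"name": "College B", "location": "OH", "size": "large", "type": "public"},
    {"name": "College C", "location": "CA", "size": "small", "type": "private"},
]

def refine(results, **facets):
    """Keep only records matching every selected facet value."""
    return [r for r in results if all(r[f] == v for f, v in facets.items())]

print(refine(colleges, location="OH", type="private"))  # College A only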
Clustered search. This is similar to a faceted search, but without
the predefined categories. Visit Carrot2.org to better understand
this concept. After conducting a search, click on the “foamtree”
option to see how you can refine your search. The refining
options are extracted from the content in pages of the initial
search.
Natural language search. Natural language search tools attempt
to extract words from questions such as “How many countries
are there in Europe?” and create a semantic representation of
the query. Initially, this is what people hoped search engines
would evolve toward, but Grimes wonders if we have become so
accustomed to typing just one or two words into our queries that
writing out a whole question may seem like too much work.
2. Much of the world’s digital information is stored in files
structured so that they can only be read by the programs that
created them. With metadata, the content of these files can be
labeled with tags describing the nature of the information,
where it came from, or how it is arranged, essentially making
the Web one large database that can be read and used by a wide
variety of applications.
The semantic Web will make it possible to access information
about real things (people, places, contracts, books, chemicals,
etc.) without worrying about the details associated with the
nature or structure of the data files, pages, and databases where
these things are described or contained (Hendler and Berners-
Lee, 2010).
3. The first stage was Web 1.0 (The Initial Web) - A Web of
Pages. Pages or documents are “hyperlinked,” making it easier
than ever before to access connected information.
The second stage was Web 2.0 (The Social Web) - A Web of
Applications. Applications are created that allow people to
easily create, share, and organize information.
The third stage is Web 3.0 (The Semantic Web) - A Web of
Data. Information within documents or pages is tagged with
metadata, allowing users to access specific information across
platforms, regardless of the original structure of the file, page,
or document that contains it. It turns the Web into one giant
database.
4. Context defines the intent of the user; for example, trying to
purchase music, to find a job, or to share memories with friends
and family.
Personalization refers to the user’s personal characteristics that
impact how relevant the content, commerce, and community are
to an individual.
Vertical search, as you have read, focuses on finding
information in a particular content area, such as travel, finance,
legal, and medical.
The current Web is disjointed, requiring us to visit different
websites to get content, engage in commerce, and interact with
our social networks (community). The future Web will use
context, personalization, and vertical search to make content,
commerce, and community more relevant and easier to access
(Mitra, 2007).
5. The semantic Web utilizes additional languages that have
been developed by the W3C. These include the Resource Description
Framework (RDF), the Web Ontology Language (OWL), and the SPARQL
Protocol and RDF Query Language (SPARQL).
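A minimal sketch of RDF triples queried with SPARQL, assuming the
third-party Python library rdflib is installed. The example.org
namespace and all facts are invented.

from rdflib import Graph, Literal, Namespace, RDF

EX = Namespace("http://example.org/")
g = Graph()

# RDF stores facts as subject-predicate-object triples.
g.add((EX.broccoli, RDF.type, EX.Vegetable))
g.add((EX.broccoli, EX.hasColor, Literal("green")))
g.add((EX.carrot, RDF.type, EX.Vegetable))
g.add((EX.carrot, EX.hasColor, Literal("orange")))

# SPARQL queries the triples: "What vegetables are green?"
results = g.query("""
    PREFIX ex: <http://example.org/>
    SELECT ?veg WHERE {
        ?veg a ex:Vegetable ;
             ex:hasColor "green" .
    }
""")
for row in results:
    print(row.veg)  # http://example.org/broccoli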
2. Netflix recommends movies to customers that are similar to
movies they have already watched.
Pandora creates recommendations or playlists based on song
attributes.
3. Many collaborative filtering systems use purchase history to
identify similarities between customers. In principle, however,
any customer characteristic that improves the quality of
recommendations could be used, such as patterns of consumer
behavior, interests, ratings, reviews, social media contacts and
conversations, media use, financial information, and so on.
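A minimal sketch of collaborative filtering on purchase history:
recommend items bought by the most similar customer, with
similarity measured here by Jaccard overlap of purchase sets. The
customers and purchases are invented.

purchases = {
    "alice": {"book", "lamp", "mug"},
    "bob":   {"book", "mug", "plant"},
    "carol": {"sofa", "rug"},
}

def jaccard(a, b):
    return len(a & b) / len(a | b)

def recommend(customer):
    others = {c: items for c, items in purchases.items() if c != customer}
    nearest = max(others, key=lambda c: jaccard(purchases[customer], others[c]))
    # Suggest what the nearest neighbor bought that this customer has not.
    return others[nearest] - purchases[customer]

print(recommend("alice"))  # {'plant'}, via nearest neighbor bob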
4. Answers may vary.
Content-based filtering recommends products whose features are
similar to the features of items the customer has interacted with
in the past.
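A minimal sketch of content-based filtering: score candidate
products by how many features they share with an item the
customer liked. The products and features are invented.

features = {
    "action_movie_1": {"action", "thriller", "90s"},
    "action_movie_2": {"action", "thriller", "sequel"},
    "romcom_1": {"romance", "comedy"},
}

liked = "action_movie_1"
candidates = {p: f for p, f in features.items() if p != liked}
best = max(candidates, key=lambda p: len(features[liked] & candidates[p]))
print(best)  # action_movie_2 shares the most features with the liked item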
5. Content-based filtering, collaborative filtering,
knowledge-based systems, and demographic systems.
6. Hybrid recommendation engines develop recommendations
based on some combination of the methodologies described
(content-based filtering, collaborative filtering, knowledge-
based systems, and demographic systems).
Weighted hybrid: Results from different recommenders are
assigned a weight and combined numerically to determine a
final set of recommendations. Relative weights are determined
by system tests to identify the levels that produce the best
recommendations (see the sketch after this list).
Mixed hybrid: Results from different recommenders are
presented alongside one another.
Cascade hybrid: Recommenders are assigned a rank or priority.
If a tie occurs (with two products assigned the same
recommendation value), results from the lower-ranked systems
are used to break ties from the higher-ranked systems.
Compound hybrid: This approach combines results from two
recommender systems from the same technique category (e.g.,
two collaborative filters), but uses different algorithms or
calculation procedures.
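A minimal sketch of the weighted hybrid approach referenced
above: scores from two recommenders are combined numerically with
weights that would, in practice, be tuned through system tests.
All items, scores, and weights here are invented.

content_scores = {"item1": 0.9, "item2": 0.4, "item3": 0.7}
collab_scores = {"item1": 0.2, "item2": 0.8, "item3": 0.6}
weights = (0.5, 0.5)  # (content-based, collaborative)

combined = {
    item: weights[0] * content_scores[item] + weights[1] * collab_scores[item]
    for item in content_scores
}
for item, score in sorted(combined.items(), key=lambda kv: -kv[1]):
    print(item, round(score, 2))
# item3 (0.65) ranks first, ahead of item2 (0.60) and item1 (0.55)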