7. Suggested Answers:
1. Crawler search engines rely on sophisticated computer
programs called “spiders,” “crawlers,” or “bots” that surf the
Internet, locating webpages, links, and other content that is
then stored in the search engine’s page repository.
Web directories are categorized listings of webpages created
and maintained by humans. Because websites are only included
after being reviewed by a person, it is less likely that search
results will contain irrelevant websites.
2. An index helps search engines efficiently locate relevant
pages containing keywords used in a search.
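To make this concrete, here is a minimal sketch of an inverted
index, the core data structure behind this kind of lookup. The
pages and query are invented for illustration.

from collections import defaultdict

pages = {
    "page1": "cheap flights to paris",
    "page2": "paris hotels and restaurants",
    "page3": "cheap hotels in rome",
}

# Build the index: each word maps to the set of pages containing it.
index = defaultdict(set)
for page_id, text in pages.items():
    for word in text.split():
        index[word].add(page_id)

# Answering a query becomes a set intersection, not a full scan.
query = ["cheap", "hotels"]
print(set.intersection(*(index[word] for word in query)))  # {'page3'}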
3. The methods by which search engines determine page rank
vary and the specific algorithms they use are often carefully
guarded trade secrets. In some cases, a search engine may use
hundreds of different criteria to determine which pages appear
at the top of a SERP. Google, for instance, claims to use over
200 “clues” to determine how it ranks pages (Google.com,
2014). According to Dover and Dafforn (2011), all these factors
can be grouped into two categories: relevance and popularity (or
“authority”).
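As a hedged illustration only (the real algorithms are trade
secrets), a ranker that grouped its signals this way might combine
the two categories as a weighted sum. The weights and scores below
are invented.

def rank_score(relevance: float, popularity: float,
               w_relevance: float = 0.6, w_popularity: float = 0.4) -> float:
    """Combine the two signal groups into one sortable score."""
    return w_relevance * relevance + w_popularity * popularity

pages = {"pageA": (0.9, 0.3), "pageB": (0.6, 0.8)}
serp = sorted(pages, key=lambda p: rank_score(*pages[p]), reverse=True)
print(serp)  # ['pageB', 'pageA']: pageB's popularity outweighs pageA's
             # higher relevance at these particular weights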
4. Businesses utilize search engine optimization (SEO) to
improve their website’s organic listings on SERPs. No payments
are made to the search engine service for organic search
listings.
Pay-per-click (PPC) advertising refers to paid search listings
where advertisers pay search engines based on how many people
click on the ads.
5. Google Trends (google.com/trends) will help you identify
current and historical interest in the topic by reporting
relevance.
Amount of text on the page that appears relevant: The proportion
of relevant to non-relevant text can influence relevance.
Backlinks from relevant sites and Web directories: Webpages
that are listed in relevant categories of Web directories are more
likely to be relevant because they were reviewed by human
editors.
SERP click-through rate (CTR): Searchers are more likely to
click on listings that contain relevant content.
On-page factors: Metadata (such as page titles and page descriptions)
and descriptive URLs should reflect the page content. People
use the information in search listings to determine if a link
contains relevant information. This affects CTR.
Dwell time and bounce rate are impacted by how relevant a
website’s content is. Long dwell times and low bounce rates
suggest content relevant to the search.
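A minimal sketch of how these two engagement signals are computed,
using invented session data (a "bounce" here is a single-page
session):

sessions = [
    {"pages_viewed": 1, "seconds_on_site": 8},
    {"pages_viewed": 4, "seconds_on_site": 310},
    {"pages_viewed": 2, "seconds_on_site": 95},
]

bounce_rate = sum(s["pages_viewed"] == 1 for s in sessions) / len(sessions)
avg_dwell = sum(s["seconds_on_site"] for s in sessions) / len(sessions)
print(f"bounce rate: {bounce_rate:.0%}, average dwell time: {avg_dwell:.0f}s")
# bounce rate: 33%, average dwell time: 138s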
Search engines want their customers to be satisfied. As a result,
SERP ranking is influenced by factors that impact user
satisfaction. Factors that are likely to influence a search
engine’s user satisfaction rating are:
Dwell time: Users who stay on a site longer are probably more
satisfied.
Site speed: Slow page loading time on websites reduces
satisfaction.
Reading level: Reading levels that are too high or too low
frustrate users.
Hacked sites, malware, and spam reduce user satisfaction
significantly.
Website satisfaction surveys: Google created user satisfaction
surveys that webmasters can embed in their websites. Positive
responses to these surveys can improve ranking.
Barriers to content: Making people register, provide names, or
fill out forms to get to content has a negative impact on user
satisfaction.
Other factors: Too many ads, page-not-found errors, duplicate
content/pages, content copied from other websites, and spam in
comment sections all detract from user satisfaction.
2. A backlink is an external link that points back to a site.
Google has changed its ranking methods over time; the assumption
is that backlinks are still very important, but not weighted as
heavily as they used to be. Weighing a number of other factors
yields a better picture of the relevance of sites to the search.
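A simplified PageRank-style iteration illustrates why backlinks
act as votes: pages linked to by important pages become important
themselves. The link graph below is invented, and real rankers
combine many more signals.

links = {  # page -> pages it links to
    "A": ["B", "C"],
    "B": ["C"],
    "C": ["A"],
}
rank = {p: 1 / len(links) for p in links}
damping = 0.85

for _ in range(50):  # iterate until the scores settle
    new_rank = {}
    for page in links:
        inbound = sum(rank[src] / len(outs)
                      for src, outs in links.items() if page in outs)
        new_rank[page] = (1 - damping) / len(links) + damping * inbound
    rank = new_rank

print({p: round(rank[p], 3) for p in links})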
3. Black hat tactics try to trick the search engine into thinking a
website has high-quality content, when in fact it does not. The
search engines now have stronger detection systems in place, and
when these tactics are discovered, Google and other search
engines will usually punish the business by dramatically lowering
the website’s rank so that it does not show up on SERPs at all.
4. Using Web analytics programs like Google Analytics,
companies can determine how many people visit their site, what
specific pages they visit, how long they spend on the site, and
which search engines are producing the most traffic. More
sophisticated SEO practitioners will also analyze keyword
conversion rates, or the likelihood that using a particular
keyword to optimize a page will result in conversions (i.e.,
when a website visitor converts to a buyer). These are just a few
of the many metrics used to measure the effectiveness of SEO
strategies.
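A minimal sketch of keyword conversion-rate analysis as described
above, with invented traffic numbers (real figures would come
from an analytics tool such as Google Analytics):

keyword_stats = {
    # keyword: (visits it drove, purchases it led to)
    "running shoes": (1200, 36),
    "cheap sneakers": (2500, 25),
    "marathon trainers": (400, 28),
}

for keyword, (visits, conversions) in keyword_stats.items():
    print(f"{keyword}: {conversions / visits:.1%} conversion rate")
# "marathon trainers" converts best (7.0%) despite the least
# traffic, so it may deserve more optimization effort.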
5. Perhaps the most important action an organization can take to
improve its website’s ranking and satisfy website visitors is
provide helpful content that is current and updated regularly.
When SEO practices are combined with valuable content,
websites become easier to find in search engines but, more
importantly, contribute to building brand awareness, positive
attitudes toward the brand, and brand loyalty.
want their ad to appear in and time of day. These factors allow
for additional customer targeting designed to help advertisers
reach the consumers most likely to purchase their products.
A quality score is determined by factors related to the user’s
experience. Ads that are considered more relevant (and therefore
more likely to be clicked on) will cost less and are more likely
to run in a top position.
Relevant ads are good for all parties—the search engine makes
more money from clicked ads, the advertiser experiences more
customers visiting its site, and the customer is more likely to
find what he or she is looking for.
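A hedged sketch of how such an auction might work: search engines
are widely described as ranking ads by bid combined with quality
score, so a more relevant ad can win a top slot at a lower cost.
The exact formulas are proprietary; the advertisers, bids, and
scores below are invented.

ads = {
    # advertiser: (max bid in dollars, quality score on a 1-10 scale)
    "advertiser_a": (2.00, 4),
    "advertiser_b": (1.20, 9),
}

ad_rank = {name: bid * quality for name, (bid, quality) in ads.items()}
for name in sorted(ad_rank, key=ad_rank.get, reverse=True):
    print(name, ad_rank[name])
# advertiser_b wins the top slot (10.8 vs 8.0) despite bidding
# less, because its ad is judged more relevant.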
4. The effectiveness of PPC ads is heavily influenced by factors
on the webpages that ads are linked to. For instance, sometimes
companies create product-oriented ads, but then link to the main
page of their website instead of a page with information about
the product in the ad. Other factors include landing page design,
effectiveness of the call to action, and the quality of the
shopping cart application. A PPC campaign will not be very
effective if the website is not attractive to consumers once they
reach it.
5. Click-through rates (CTRs)—By themselves, CTRs do not
measure the financial performance of an ad campaign. But they
are useful for evaluating many of the decisions that go into a
campaign, such as keyword selection and ad copy.
Keyword conversion—High CTRs are not always good if they
do not lead to sales. Since the cost of the campaign is based on
how many people click an ad, you want to select keywords that
lead to sales (conversions), not just site visits. PPC advertisers
monitor which keywords lead to sales and focus on those in
future campaigns.
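A minimal sketch of these two metrics with invented campaign
numbers:

impressions = 50_000
clicks = 1_000
conversions = 40
cost_per_click = 0.75

ctr = clicks / impressions
cost_per_conversion = clicks * cost_per_click / conversions
print(f"CTR: {ctr:.1%}")                                   # 2.0%
print(f"Cost per conversion: ${cost_per_conversion:.2f}")  # $18.75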
highlighting of search terms, but also related words or phrases
that may not have appeared in the original query. These can be
used in future searches simply by clicking on them.
Full-text similarity search. Users can submit a block of text or
even a full document to find similar content.
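A minimal sketch of one common way to implement this, using
TF-IDF vectors and cosine similarity. It assumes the scikit-learn
library is installed; the documents and query text are invented,
and real engines may use other techniques.

from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

documents = [
    "Quentin Tarantino directed several acclaimed films.",
    "A guide to growing green vegetables at home.",
    "Classic movies by famous directors, reviewed.",
]
query_text = "films by well-known movie directors"

vectorizer = TfidfVectorizer()
doc_vectors = vectorizer.fit_transform(documents)
query_vector = vectorizer.transform([query_text])

# Rank stored documents by cosine similarity to the submitted text.
scores = cosine_similarity(query_vector, doc_vectors)[0]
print(documents[scores.argmax()])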
Search on semantic/syntactic annotations. This approach would
allow a user to indicate the “syntactic role the term plays—for
instance, the part-of-speech (noun, verb, etc.)—or its semantic
meaning—whether it’s a company name, location, or event.” For
instance, a keyword search on the word “center” would produce
too many results. Instead, a search query could be written using
a syntax such as the following:
<organization> center </organization>
This would only return documents where the word “center” was
part of an organization’s name. Google currently allows you to
do something similar to specify the kind of files you are looking
for (e.g., filetype:pdf).
Concept search. Search engines could return results with related
concepts. For instance, if the original query was “Tarantino
films,” documents would be returned that contain the word
“movies” even if not the word “films.”
Ontology-based search. Ontologies define the relationships
between data. An ontology is based on the concept of “triples”:
subject, predicate, and object. This would allow the search
engine to answer questions such as “What vegetables are
green?” The search engine would return results about
“broccoli,” “spinach,” “peas,” “asparagus,” “Brussels sprouts,”
and so on.
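A minimal sketch of the triple idea, using a hand-built fact list
rather than a real ontology language (an RDF/SPARQL version
appears later in this section):

triples = [
    ("broccoli", "is_a", "vegetable"),
    ("broccoli", "has_color", "green"),
    ("spinach", "is_a", "vegetable"),
    ("spinach", "has_color", "green"),
    ("carrot", "is_a", "vegetable"),
    ("carrot", "has_color", "orange"),
]

# "What vegetables are green?" becomes two triple patterns.
vegetables = {s for s, p, o in triples if p == "is_a" and o == "vegetable"}
green = {s for s, p, o in triples if p == "has_color" and o == "green"}
print(vegetables & green)  # {'broccoli', 'spinach'}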
Semantic Web search. This approach would take advantage of
content tagged with metadata as previously described in this
section. Search results are likely to be more accurate than
keyword matching.
Faceted search. Faceted search provides a means of refining
results based on predefined categories called facets. For
instance, a search on “colleges” might result in options to
“refine this search by. . .” location, size, degrees offered,
private or public, and so on. Faceted search tools available
today tend to focus on a specific domain, such as Wikipedia or
Semidico, a search tool for biomedical literature.
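A minimal sketch of facet refinement over invented college
records:

colleges = [
    {"name": "College A", "location": "OH", "size": "small", "type": "private"},
    {"name": "College B", "location": "OH", "size": "large", "type": "public"},
    {"name": "College C", "location": "CA", "size": "small", "type": "private"},
]

def refine(results, **facets):
    """Keep only records matching every selected facet value."""
    return [r for r in results if all(r[f] == v for f, v in facets.items())]

print(refine(colleges, location="OH", type="private"))  # College A only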
Clustered search. This is similar to a faceted search, but without
the predefined categories. Visit Carrot2.org to better understand
this concept. After conducting a search, click on the “foamtree”
option to see how you can refine your search. The refining
options are extracted from the content in pages of the initial
search.
Natural language search. Natural language search tools attempt
to extract words from questions such as “How many countries
are there in Europe?” and create a semantic representation of
the query. Initially, this is what people hoped search engines
would evolve toward, but Grimes wonders if we have become so
accustomed to typing just one or two words into our queries that
writing out a whole question may seem like too much work.
2. Much of the world’s digital information is stored in files
structured so that they can only be read by the programs that
created them. With metadata, the content of these files can be
labeled with tags describing the nature of the information,
where it came from, or how it is arranged, essentially making
the Web one large database that can be read and used by a wide
variety of applications.
The semantic Web will make it possible to access information
about real things (people, places, contracts, books, chemicals,
etc.) without worrying about the details associated with the
nature or structure of the data files, pages, and databases where
these things are described or contained (Hendler and Berners-
Lee, 2010).
3. The first stage was Web 1.0 (The Initial Web) - A Web of
Pages. Pages or documents are “hyperlinked,” making it easier
than ever before to access connected information.
The second stage was Web 2.0 (The Social Web) - A Web of
Applications. Applications are created that allow people to
easily create, share, and organize information.
The third stage is Web 3.0 (The Semantic Web) - A Web of
Data. Information within documents or pages is tagged with
metadata, allowing users to access specific information across
platforms, regardless of the original structure of the file, page,
or document that contains it. It turns the Web into one giant
database.
4. Context defines the intent of the user; for example, trying to
purchase music, to find a job, or to share memories with friends
and family.
Personalization refers to the user’s personal characteristics that
impact how relevant the content, commerce, and community are
to an individual.
Vertical search, as you have read, focuses on finding
information in a particular content area, such as travel, finance,
legal, and medical.
The current Web is disjointed, requiring us to visit different
websites to get content, engage in commerce, and interact with
our social networks (community). The future Web will use
context, personalization, and vertical search to make content,
commerce, and community more relevant and easier to access
(Mitra, 2007).
5. The semantic Web utilizes additional languages that have
been developed by the W3C. These include the Resource Description
Framework (RDF), the Web Ontology Language (OWL), and the SPARQL
Protocol and RDF Query Language (SPARQL).
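A minimal sketch of RDF triples queried with SPARQL, assuming the
third-party Python library rdflib is installed. The example.org
namespace and all facts are invented.

from rdflib import Graph, Literal, Namespace, RDF

EX = Namespace("http://example.org/")
g = Graph()

# RDF stores facts as subject-predicate-object triples.
g.add((EX.broccoli, RDF.type, EX.Vegetable))
g.add((EX.broccoli, EX.hasColor, Literal("green")))
g.add((EX.carrot, RDF.type, EX.Vegetable))
g.add((EX.carrot, EX.hasColor, Literal("orange")))

# SPARQL queries the triples: "What vegetables are green?"
results = g.query("""
    PREFIX ex: <http://example.org/>
    SELECT ?veg WHERE {
        ?veg a ex:Vegetable ;
             ex:hasColor "green" .
    }
""")
for row in results:
    print(row.veg)  # http://example.org/broccoli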
2. Netflix recommends movies to customers that are similar to
movies they have already watched.
Pandora creates recommendations or playlists based on song
attributes.
3. Many collaborative filtering systems use purchase history to
identify similarities between customers. In principle, however,
any customer characteristic that improves the quality of
recommendations could be used, such as patterns of consumer
behavior, interests, ratings, reviews, social media contacts and
conversations, media use, financial information, and so on.
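A minimal sketch of collaborative filtering on purchase history:
recommend items bought by the most similar customer, with
similarity measured here by Jaccard overlap of purchase sets. The
customers and purchases are invented.

purchases = {
    "alice": {"book", "lamp", "mug"},
    "bob":   {"book", "mug", "plant"},
    "carol": {"sofa", "rug"},
}

def jaccard(a, b):
    return len(a & b) / len(a | b)

def recommend(customer):
    others = {c: items for c, items in purchases.items() if c != customer}
    nearest = max(others, key=lambda c: jaccard(purchases[customer], others[c]))
    # Suggest what the nearest neighbor bought that this customer has not.
    return others[nearest] - purchases[customer]

print(recommend("alice"))  # {'plant'}, via nearest neighbor bob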
4. Answers may vary.
Content-based filtering recommends products whose features are
similar to the features of items the customer has interacted with
in the past.
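A minimal sketch of content-based filtering: score candidate
products by how many features they share with an item the
customer liked. The products and features are invented.

features = {
    "action_movie_1": {"action", "thriller", "90s"},
    "action_movie_2": {"action", "thriller", "sequel"},
    "romcom_1": {"romance", "comedy"},
}

liked = "action_movie_1"
candidates = {p: f for p, f in features.items() if p != liked}
best = max(candidates, key=lambda p: len(features[liked] & candidates[p]))
print(best)  # action_movie_2 shares the most features with the liked item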
5. Content-based filtering, collaborative filtering,
knowledge-based systems, and demographic systems.
6. Hybrid recommendation engines develop recommendations
based on some combination of the methodologies described
(content-based filtering, collaborative filtering, knowledge-
based systems, and demographic systems).
Weighted hybrid: Results from different recommenders are
assigned a weight and combined numerically to determine a
final set of recommendations. Relative weights are determined
by system tests to identify the levels that produce the best
recommendations (see the sketch after this list).
Mixed hybrid: Results from different recommenders are
presented alongside one another.
Cascade hybrid: Recommenders are assigned a rank or priority.
If a tie occurs (with two products assigned the same
recommendation value), results from the lower-ranked systems
are used to break ties from the higher-ranked systems.
Compound hybrid: This approach combines results from two
recommender systems from the same technique category (e.g.,
two collaborative filters), but uses different algorithms or
calculation procedures.
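A minimal sketch of the weighted hybrid approach referenced
above: scores from two recommenders are combined numerically with
weights that would, in practice, be tuned through system tests.
All items, scores, and weights here are invented.

content_scores = {"item1": 0.9, "item2": 0.4, "item3": 0.7}
collab_scores = {"item1": 0.2, "item2": 0.8, "item3": 0.6}
weights = (0.5, 0.5)  # (content-based, collaborative)

combined = {
    item: weights[0] * content_scores[item] + weights[1] * collab_scores[item]
    for item in content_scores
}
for item, score in sorted(combined.items(), key=lambda kv: -kv[1]):
    print(item, round(score, 2))
# item3 (0.65) ranks first, ahead of item2 (0.60) and item1 (0.55)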