@trabit @toddkeup
Mastering On-Site Search
Todd Keup, CEO
Magnifisites
Ralf Schwoebel, CEO
Tradebit.com
NOLA 2013
Meet the geeks
Ralf Schwoebel
www.tradebit.com
Todd Keup
www.magnifisites.com
on-site search is not
simply an input field!
Scope of search technology
• Searching for content (the basics)
• Related Products, tag clouds, etc.
• Geo-IP sensitive content (& fraud control)
• User behavioral targeting
⇒ Your choice of technology potentially
opens or hinders many options!
Search: the basics
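The speaker notes for this slide describe correcting typos on the fly and offering a "did you mean" alternative. A minimal sketch of that idea, assuming you can export the list of terms in your own index: the `suggest` function, the sample vocabulary, and the 0.75 similarity cutoff are illustrative assumptions, not part of the talk.

```python
import difflib


def suggest(query, vocabulary, cutoff=0.75):
    """Replace each unknown word in the query with its closest indexed term.

    `vocabulary` is assumed to be the list of terms found in your own index.
    Words with no sufficiently close match are left unchanged.
    """
    known = set(vocabulary)
    corrected = []
    for word in query.lower().split():
        if word in known:
            corrected.append(word)
            continue
        # get_close_matches ranks candidates by similarity ratio.
        matches = difflib.get_close_matches(word, vocabulary, n=1, cutoff=cutoff)
        corrected.append(matches[0] if matches else word)
    return " ".join(corrected)


# Hypothetical vocabulary for illustration only.
vocabulary = ["elvis", "presley", "songs", "albums", "download"]
print(suggest("elvus presley songs", vocabulary))
```

When the suggestion differs from what the visitor typed, you can show results for the suggestion and still offer a link to search for the original spelling.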
Search: bonus material
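The speaker notes for this slide point at the "People also search for" box and suggest building something similar from your own logs. One simple way to approximate it, sketched here under the assumption that your search log can be grouped into per-visitor sessions, is to count which queries occur together. The function name and the sample sessions are hypothetical.

```python
from collections import Counter, defaultdict
from itertools import permutations


def related_searches(sessions, top_n=3):
    """Map each query to the queries that most often share a session with it."""
    co_counts = defaultdict(Counter)
    for queries in sessions:
        # Normalize and de-duplicate so one session counts each pair once.
        unique = {q.strip().lower() for q in queries}
        for a, b in permutations(unique, 2):
            co_counts[a][b] += 1
    return {
        query: [other for other, _ in counts.most_common(top_n)]
        for query, counts in co_counts.items()
    }


# Hypothetical sessions; in practice these come from your own search log.
sessions = [
    ["elvis presley", "johnny cash"],
    ["elvis presley", "johnny cash", "jerry lee lewis"],
    ["elvis presley", "buddy holly"],
]
related = related_searches(sessions)
print(related["elvis presley"][0])
```

Raw co-occurrence favors popular queries, so a production version would probably normalize the counts, but this is enough to start testing related-product ideas.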
Search: getting started
Content is in database, text, html … mixed?
Plan your index, build your index(es).
Plan your links, no duplicate content!
How will you update the index?
How will it perform?
Will it scale?
How will you search it?
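For the "no duplicate content" point, one option is to normalize every URL before it goes into the index so that query-string variants of the same page collapse into one entry. This is a sketch only: the ignored parameter names and the trailing-slash rule are assumptions you would adapt to your own site.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Assumed examples of parameters that do not change the page content.
IGNORED_PARAMS = {"sessionid", "sort", "utm_source", "utm_medium", "utm_campaign"}


def canonical_url(url):
    """Return a normalized form of the URL to use as the index key."""
    parts = urlsplit(url)
    kept = sorted(
        (key, value)
        for key, value in parse_qsl(parts.query)
        if key.lower() not in IGNORED_PARAMS
    )
    # Treat "/news/" and "/news" as the same resource (an assumption).
    path = parts.path.rstrip("/") or "/"
    return urlunsplit(
        (parts.scheme.lower(), parts.netloc.lower(), path, urlencode(kept), "")
    )


def dedupe(urls):
    """Keep one canonical entry per page, in first-seen order."""
    seen = {}
    for url in urls:
        seen.setdefault(canonical_url(url), url)
    return list(seen)


print(dedupe([
    "http://Example.com/news/?id=7&sort=date&sessionid=abc",
    "http://example.com/news?sessionid=xyz&id=7",
]))
```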
Search: tricks of the trade
Use your own index logs
and
search engine referrals
to gather data
Personal
Environmental
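Combining your own search log with search engine referrals could look roughly like the sketch below. It assumes the referrer still carries the visitor's query in a known parameter (`q` for Google and Bing, `p` for Yahoo); many engines now withhold the query, so a missing term should be treated as normal. The function names are hypothetical.

```python
from collections import Counter
from urllib.parse import parse_qs, urlsplit

# Query-string parameter that carries the search terms, per engine.
QUERY_PARAMS = {"google": "q", "bing": "q", "yahoo": "p"}


def referral_terms(referrer):
    """Return the search terms from a search engine referrer, or None."""
    parts = urlsplit(referrer)
    host_labels = (parts.hostname or "").split(".")
    for engine, param in QUERY_PARAMS.items():
        if engine in host_labels:
            values = parse_qs(parts.query).get(param, [])
            return values[0].strip().lower() if values else None
    return None


def count_terms(onsite_queries, referrers):
    """Tally searches from the on-site log together with referral searches."""
    counts = Counter(q.strip().lower() for q in onsite_queries if q.strip())
    for referrer in referrers:
        terms = referral_terms(referrer)
        if terms:
            counts[terms] += 1
    return counts


counts = count_terms(
    ["elvis presley", "Sphinx search"],
    [
        "http://www.google.com/search?q=Elvis+Presley",
        "http://search.yahoo.com/search?p=sphinx+search",
        "http://example.com/page?q=ignored",
    ],
)
print(counts.most_common(2))
```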
Search: worth mentioning
Google: Enhanced Link Attribution
You can tag your pages to implement an enhanced link-tracking
functionality that lets you:
• See separate information for multiple links on a page that all
have the same destination. For example, if there are two links
on the same page that both lead to the Contact Us page, then
you see separate click information for each link.
• See when one page element has multiple destinations. For
example, a Search button on your page is likely to lead to
multiple destinations.
• Track buttons, menus, and actions driven by JavaScript.
Source: https://support.google.com/analytics/answer/2558867
Introducing …
Technology that might help you …
Sphinx
Sphinx is an open source search server
http://sphinxsearch.com/
How to use Sphinx
• Download and install
• Configure
• Index
• Test
• Implement
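The Configure step comes down to writing a `sphinx.conf` with a source, an index, and a `searchd` section. As a rough sketch, the helper below renders a minimal configuration for a MySQL-backed index. The helper itself, the credentials, the paths, and the query are placeholders; the directive names follow the classic Sphinx configuration format, so check them against the documentation for your version.

```python
def render_sphinx_conf(index_name, sql, data_dir, min_word_len=1):
    """Render a minimal sphinx.conf for one MySQL source and one index."""
    lines = [
        f"source {index_name}_src",
        "{",
        "    type      = mysql",
        f"    sql_host  = {sql['host']}",
        f"    sql_user  = {sql['user']}",
        f"    sql_pass  = {sql['password']}",
        f"    sql_db    = {sql['database']}",
        # The first selected column must be a unique integer document id.
        f"    sql_query = {sql['query']}",
        "}",
        "",
        f"index {index_name}",
        "{",
        f"    source       = {index_name}_src",
        f"    path         = {data_dir}/{index_name}",
        f"    min_word_len = {min_word_len}",
        "    morphology   = stem_en",
        "}",
        "",
        "searchd",
        "{",
        "    listen   = 9312",
        f"    log      = {data_dir}/searchd.log",
        f"    pid_file = {data_dir}/searchd.pid",
        "}",
    ]
    return "\n".join(lines) + "\n"


# Placeholder values for illustration only.
conf = render_sphinx_conf(
    "test1",
    {
        "host": "localhost",
        "user": "test",
        "password": "secret",
        "database": "test",
        "query": "SELECT id, title, content FROM documents",
    },
    "/var/data/sphinx",
    min_word_len=3,
)
print(conf)
```

With the file saved as `sphinx.conf`, running `indexer --config sphinx.conf --all` should build the index and `searchd --config sphinx.conf` should start the daemon, which covers the Index and Test steps.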
How we are using Sphinx
But there's more!
Stay tuned for part 2....

Pubcon New Orleans 2013 on-site search with Todd Keup

Editor's Notes

  1. Good Morning! We want to thank Brett Tabke and his organization for all their hard work in putting a conference like this together. Each time we attend we find ourselves beneficiaries of the knowledge shared at this gathering. Thanks Brett, for the opportunity to not only be here, but to be here once again as speakers. We would also like to thank Ben Cook of Direct Match Media for volunteering to facilitate this session. But most of all thank you for being here today. We are honored by your presence and the privilege to share what we are able regarding on-site search. For those of you that are familiar with the WebmasterWorld web site and the forums at WebmasterWorld, Ralf is an active member there and goes by the nickname “pontifex”. Todd is also an active member and one of the moderators of the PHP Server Side Scripting Forum. He goes by the nickname “coopster”. We want you to know that we would absolutely love the opportunity to make your personal acquaintance today. We are approachable and friendly. Please don't hesitate to introduce yourself.
  2. Todd: Let me introduce us. Before I expose you to the German accent of my friend Ralf, who is founder and CEO of Tradebit.com, a huge affiliate-powered download marketplace, I will present some on-site search tidbits flavored with a Wisconsin accent. We both run small and large sites ourselves and have to implement search solutions for our own sites or for our customers.
  3. On-site search is not just the input field you might be imagining right now. As always with complex technology, there is much more to it, and we will try to give you a solid overview of WHAT is really in it and how YOU can benefit from it!
  4. It really isn't just about an input field, trust us. What happens when you rely on another source for your on-site search, such as Google Custom Search (CSE: http://www.google.com/cse/ )? Have you ever tried it? If so, has it ever failed to meet your expectations, or your visitors' expectations? Can you think of any benefits to having your own on-site search? How about: performance and latency (no need to ask another server); controlled indexes (you custom-index your own content); search engine referrer queries (you can include searches from multiple engines in your index); and a customized experience (you control the output format!). That last one is huge: mutating the landing page. More on that later. If you answered with any of these responses, you are in the right session today. We are going to share what we can regarding on-site search. You don't need to install an outside search engine's technology and rely on their servers, bandwidth, internet connections, their logo … etc. People use a search engine to locate and get to your site; now let's keep them here and use our own internal search technology to help our users locate the information they want on our site.
  5. Basic search. Typos. Data sources? Let's start out with a search on Google. Why? Why not. They haven't become one of the leading search engines for nothing! We can learn from Google and apply the same to our own on-site search. Notice how typographical errors are handled. Auto-correction is being applied on-the-fly and the user experience is being maintained. We see what we have typed in so far, where Google thinks we have made an error and an alternative list of known resources. And it is stated that way. Plus a nice option asking if you *really* want to search for the misspelled word instead. And in the meantime the search page is actually being rendered so you have a visual. Ah, yes, I *DID* mean Elvis. Thank you Mr. Web Site Programmer. Wait a minute, stick with me … there's even more …
  6. We see songs we can listen to, and if we go one click further we are presented with purchase options. If it were our own site, we would like to have the purchase ability right here though … hey wait a minute, that's what this presentation is all about! Also, have a look at the lower right of the display, where you can see what "People also search for". Promote other products based on data that has been analyzed. How can we capture that type of information? The same way Google, Amazon and others that have it right are doing it: you analyze your own logs and start making decisions and attempts based on known criteria and tracking patterns. But I'm getting ahead of myself; Ralf is going to cover more of that shortly. Back to basics …
  7. In order to have on-site search you need something to search! You have your content, yes, but is that enough? Where does the data reside? Is it in database tables? Text files? Static HTML? A combination of these resources? Plan how you will index your data. This is important. You might have a product database table, you might have recipes, you might have … the list goes on. Or all your site pages may be in a single database table. You can build a single index or multiple indexes. Or you may need to pull data in from multiple resource types, as we just mentioned. How will you do so? Does the search technology you intend to implement allow you to do so? Does it have an indexer you can use, or will you write your own? I mean, somehow you have to get your data indexed before it can be searched, right? What about your links? More than likely each of these resources is available at a unique URL. And depending on how you build your index, you may end up serving duplicate content. How? Well, don't forget that your index is built the way you instruct it to be built. You could end up indexing a page under a URL that differs from the original but serves exactly the same page. Consider query strings, especially for items like news or calendars. How will you update your index? Manually? When will you update your index? Hourly? Monthly? How much of your data will you update in your index? All 500,000 pages at once? Does your search technology allow you to use a main and delta approach? How will it perform? Will it scale? If you are a MySQL user, are you still running FULLTEXT queries with MATCH AGAINST? If so, you need to talk to my friend Ralf after the presentation. He can offer you some enlightenment. Lastly, how will you search against your index? Can you use Perl, PHP, or some other type of API? A custom approach? Does the technology you intend to implement support your intent?
  8. Your own search feature should be capturing data, and you should be combining that with search engine referral data. Why? To find out what people are searching for! Why was that search important? Use it! Use it for profiling behavior. Learn from Amazon! They do this well. Plus there are additional benefits, such as fraud protection. What?! Fraud protection? Yes. As part of our session today I had an inclination to interview Ralf with some on-site search questions. And I learned a neat little trick from him regarding fraud protection. He's going to share that today, as well as quite a bit more on this profiling idea. For example, if I know you visited my site in Safari, should I present Windows software to you? Considerations.
  9. Do you have multiple links to the same resource on any of your site pages? If so, maybe you don't need them both. Start using this information in your own design and search attack. Track and learn from this data. Also note that second bullet. We touched on this earlier. Free tools for tracking destinations. Follow the link here in the slide or search for "enhanced link attribution" for more information.
  10. Sphinx is free! Open Source! Sphinx was created by Andrew Aksyonoff. The company, Sphinx Technologies, is a privately held US company, created in late 2007 by Andrew Aksyonoff, creator and primary developer of Sphinx, and Peter Zaitsev, former head of the High Performance group at MySQL AB and a world-class expert in database technologies. Today it is an office-less company with about 10 employees spanning all the time zones, working online. A colleague, no, a dear friend, introduced me to Sphinx in 2006 and I have been using it on client sites ever since. It is stable, fast, and open source. And you can configure and tweak it to your specific needs. And today I am presenting with that very same friend, Ralf Schwoebel. Ralf was dealing with "Big Data" long before the term was coined and before most of the world outside of the search engines themselves was thinking about the idea. And guess what? What works for big data users also applies to small customers. Why not? Have you met an organization yet that doesn't want to be treated the same as "the big guys"? Exactly.
  11. I did my first presentation on Sphinx search in 2010 and I showed how to implement it on your site, step by step, particularly for non-technical folks. A lot has changed since then, but the same concepts still hold true and you can get on board! You can do this! Download and installation: for Windows users, I recommend compiling with libstemmer. If you are downloading the binary, be sure to select the binary which includes libstemmer. This additional library includes features that are going to come in handy. Open up the documentation and follow the installation steps for your platform. You may want to set up a separate folder for your configuration details as well as the data details (indexer files and logs). Configuration: start with a basic configuration and use the test database provided. Once you are up and running and are more comfortable, you can begin your own custom setup. For now, finish these steps using the test data and test configuration. Run the indexer. This step builds the index from your sample database table and your configuration, sphinx.conf. Test your installation from the command line. That's it. You will want to start the service daemon upon system boot.
  12. No, we are not supporting Nike ID customized shoes. But I went out to their web site to use a similar slide from the past and their current Green Bay Packer colors are hideous. But I want you to remember customization; that's the key word here. Magnifisites is a custom software development corporation supporting and hosting multiple domains. We host sites that run Sphinx implementations but with separate indexers, data and logs, and each site implements different search strategies. We have learned a lot from providing this type of service. For example, some sites have acronyms, so we may want to set our minimum word length low, say to 3 characters. Another site provides products and recipes to go along with those products. We have created separate indexes to maximize the performance and landing page manipulation for these products, which in turn increases sales, including additional item sales because the recipe calls for a certain product. All can be added to the cart at the same time and delivered to the customer's door in a single click. The last time I checked, Google Custom Search (CSE) doesn't offer this feature. The point? If the shoe fits, wear it. If not, tweak it to meet your needs. Customizing your installation isn't quite as easy as customizing your shoe at nike.com. However, whether you decide to customize it yourself or hire somebody else to do the modifications, there are a few simple steps you might follow to make the process more manageable. If you are not using version control software to commit and roll back updates, then at least do yourself a favor ... Make a backup copy of each document you intend to modify. Make a backup copy of any database table you intend to modify. Make your modifications on a development implementation first. Test, test, test. Plan the appropriate implementation date and time. If necessary, let your user base know about the planned system downtime or update implementation period. Perform the update. Test, test, test. Watch your logs for errors or issues.