Integrating Google Search Appliance with Mura CMS

An overview of integrating Google Search Appliance with Mura CMS. Presented at MuraCon 2012 by Ajay Sathuluri.


  1. Integrating Google Search Appliance with Mura CMS
     Ajay Sathuluri, @sathuluri
  2. About Me
     ∗ Ajay Sathuluri
     ∗ Sr. Architect at ICF International
     ∗ Using ColdFusion since ’98
     ∗ Server Tuning, Administration, Load Testing
     ∗ I like spending time with my kids and wife.
  3. What are we covering?
     ∗ Google Search Appliance
       ∗ Configuring a Crawl
       ∗ Control Access to Content
       ∗ Configuring Database Crawl
       ∗ Collections / Front Ends
       ∗ Crawl Diagnostics
     ∗ Configuring GSA with Mura CMS Plugin (FW/1)
     ∗ Search
     ∗ Search Results
  4. Google Search Appliance - Home
  5. Configuring a Crawl
     ∗ Before starting a crawl, you must configure the crawl path so that it includes only the information you want to make available in search results.
     ∗ Use the Crawl and Index > Crawl URLs page in the Admin Console to enter URLs.
     ∗ URLs are case-sensitive.
     ∗ Configure your network to disallow search appliance connectivity outside of your intranet.
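To make the case-sensitivity point on the crawl-path slide concrete, here is a minimal Python sketch. It uses plain prefix matching, which is a simplification and not the appliance's own URL pattern syntax; the host name, the path, and the `in_crawl_path` helper are all hypothetical.

```python
# Hypothetical crawl-path check using plain, case-sensitive prefix matching.
# This simplifies the idea on the slide; it is NOT the appliance's pattern syntax.

FOLLOW_PREFIXES = [
    "http://intranet.example.com/site/",  # assumed example host and path
]


def in_crawl_path(url: str, prefixes=FOLLOW_PREFIXES) -> bool:
    """Return True if url starts with one of the prefixes (case-sensitive)."""
    return any(url.startswith(prefix) for prefix in prefixes)


if __name__ == "__main__":
    for candidate in (
        "http://intranet.example.com/site/about/",
        "http://intranet.example.com/Site/about/",  # differs only in case
    ):
        print(candidate, in_crawl_path(candidate))
```

Because the comparison is case-sensitive, the second URL falls outside the configured path even though it differs from the first only in capitalization.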
  6. Google Search Appliance – Crawl URL
  7. Configuring a Crawl
     ∗ Demo
  8. Control Access to Content
     ∗ robots.txt
     ∗ meta tag
     ∗ no-crawl Directories
  9. Control Access to Content (2): robots.txt
     ∗ The Google Search Appliance always obeys the rules in robots.txt; this behavior cannot be overridden.
     ∗ A robots.txt file is not mandatory.
     ∗ It is located in the web server's root directory.
     ∗ For the search appliance to be able to access the robots.txt file, the file must be public.
     ∗ It includes one or more Disallow: or Allow: rules, for example:
         User-agent: gsa-crawler
         Disallow: /personal_records/
         Disallow: /admin/
         Allow: /
         Allow: /personal_records/mypersonal.doc
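The robots.txt rules on this slide can be tried out locally before deploying them. The sketch below uses Python's standard `urllib.robotparser` as a stand-in for a crawler. The test paths and the `build_parser` and `is_allowed` helpers are assumptions for illustration, and Python's parser is not the appliance's own implementation.

```python
from urllib.robotparser import RobotFileParser

# Rules copied from the slide.
ROBOTS_TXT = """\
User-agent: gsa-crawler
Disallow: /personal_records/
Disallow: /admin/
Allow: /
Allow: /personal_records/mypersonal.doc
"""


def build_parser(robots_txt: str) -> RobotFileParser:
    """Parse robots.txt text held in memory (no network access)."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


def is_allowed(parser: RobotFileParser, path: str, agent: str = "gsa-crawler") -> bool:
    """Ask Python's parser whether the given user agent may fetch the path."""
    return parser.can_fetch(agent, path)


if __name__ == "__main__":
    parser = build_parser(ROBOTS_TXT)
    # Hypothetical paths chosen for illustration.
    for path in ("/index.cfm", "/admin/settings.cfm", "/personal_records/mypersonal.doc"):
        print(path, is_allowed(parser, path))
```

Different parsers can resolve overlapping rules differently, such as the Disallow for /personal_records/ and the Allow for a single file beneath it. Treat the local result for that path as a hint only, and confirm it against the appliance's documentation or Crawl Diagnostics.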
  10. Control Access to Content (3): meta tag
     ∗ Prevent the search appliance crawler (as well as other crawlers) from indexing or following links in a specific HTML page.
     ∗ Embed a robots meta tag in the head of the HTML page.
     ∗ The search appliance crawler obeys the index, noindex, follow, and nofollow directives in meta tags, for example:
         <meta name="robots" content="index, nofollow">
         <meta name="robots" content="noindex, nofollow">
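As a rough way to check which robots directives a rendered page actually carries, the following sketch reads the robots meta tag with Python's standard `html.parser`. The class and helper names are hypothetical. Treating a missing directive as "allowed" is an assumption based on the usual robots meta convention, not something stated on the slide.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collects the directives from <meta name="robots" content="..."> tags."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr_map = {name: (value or "") for name, value in attrs}
        if attr_map.get("name", "").lower() != "robots":
            return
        for token in attr_map.get("content", "").split(","):
            token = token.strip().lower()
            if token:
                self.directives.add(token)


def robots_directives(html: str) -> set:
    """Return the set of robots directives found in the page."""
    parser = RobotsMetaParser()
    parser.feed(html)
    return parser.directives


def may_index(directives: set) -> bool:
    # Assumption: indexing is allowed unless "noindex" is present.
    return "noindex" not in directives


def may_follow(directives: set) -> bool:
    # Assumption: following links is allowed unless "nofollow" is present.
    return "nofollow" not in directives


if __name__ == "__main__":
    page = '<html><head><meta name="robots" content="index, nofollow"></head></html>'
    found = robots_directives(page)
    print(sorted(found), may_index(found), may_follow(found))
```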
  11. Control Access to Content (4): no-crawl Directories
     ∗ The Google Search Appliance does not crawl any directories named "no_crawl".
     ∗ You can prevent the search appliance from crawling files and directories by:
       ∗ Creating a directory called "no_crawl".
       ∗ Putting the files and subdirectories you do not want crawled under the no_crawl directory.
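To audit which URLs on a site would fall under the no_crawl convention, a small check like the one below may help. It is a sketch: the `under_no_crawl` helper and the example URLs are hypothetical, and the exact, case-sensitive match on the directory name is an assumption about how the appliance compares names.

```python
from urllib.parse import urlparse


def under_no_crawl(url: str) -> bool:
    """Return True if any directory segment of the URL path is exactly 'no_crawl'."""
    segments = urlparse(url).path.split("/")
    # The last segment is the file name (or empty after a trailing slash),
    # so only the segments before it are treated as directories.
    return "no_crawl" in segments[:-1]


if __name__ == "__main__":
    for url in (
        "http://intranet.example.com/docs/no_crawl/draft.pdf",
        "http://intranet.example.com/docs/no_crawl_notes/readme.html",
        "http://intranet.example.com/docs/public/index.cfm",
    ):
        print(url, under_no_crawl(url))
```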
  12. Configuring Database Crawl
     ∗ Database data source information enables the search appliance to access content stored in a database.
     ∗ To configure a database crawl, provide database data source information.
     ∗ Use the Crawl and Index > Databases page in the Admin Console.
     ∗ After you create a new database data source, click the Sync link to start a database crawl.
  13. Google Search Appliance – Databases
  14. Collections
     ∗ A collection lets you search over a specific part of the index.
     ∗ For example, you may want to create a products collection or a faq collection that supports searches only within the products or FAQs part of your index.
     ∗ The maximum number of collections for a search appliance is 200.
     ∗ Use the Crawl and Index > Collections page: in the Collection Name text box, type a name for the new collection.
     ∗ Manage collections by:
       ∗ Editing a Collection
       ∗ Exporting and Importing a Collection Configuration
       ∗ Deleting a Collection
  15. Google Search Appliance – Collections
  16. Front Ends
     ∗ A front end enables you to change the look and feel of the search and search result pages your users access.
     ∗ You can customize these pages to display your organization's colors, fonts, and design. If you have multiple collections, you can make each front end appear in a different format and have its own configuration options.
     ∗ Use the Serving > Front Ends page: in the Front End Name field, enter a name for the new front end.
     ∗ Manage front ends by:
       ∗ Editing a Front End
       ∗ Deleting a Front End
  17. Google Search Appliance – Front Ends
  18. Crawl Diagnostics
     ∗ Crawl diagnostics provide detailed information about appliance crawl status for a domain, host, directory, or URL.
  19. Google Search Appliance - Crawl Diagnostics
  20. Google Search Appliance – Secret Recipe
     "The appliance uses a sophisticated algorithm to generate the results bla… bla ..."
  21. Mura – Plugin
     ∗ Deploy Mura Plugin
  22. GSA Plugin - Search
     ∗ Search Code
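The plugin's search code is not reproduced in this transcript. As a rough illustration of what a search request to the appliance can look like, here is a Python sketch that builds a query URL. It is not the plugin's code. The host name and the `build_search_url` helper are hypothetical, and the parameters (`q`, `site`, `client`, `output`, `start`, `num`) follow my understanding of the search protocol described in the appliance documentation listed under Resources.

```python
from urllib.parse import urlencode


def build_search_url(
    host: str,
    query: str,
    collection: str = "default_collection",  # assumed collection name
    front_end: str = "default_frontend",     # assumed front end name
    start: int = 0,
    num: int = 10,
) -> str:
    """Build a search URL that asks the appliance for raw XML results."""
    params = {
        "q": query,               # the user's search terms
        "site": collection,       # which collection to search
        "client": front_end,      # which front end's settings to apply
        "output": "xml_no_dtd",   # XML results without a DTD reference
        "start": start,           # zero-based offset of the first result
        "num": num,               # number of results per page
    }
    return f"http://{host}/search?{urlencode(params)}"


if __name__ == "__main__":
    # gsa.example.com is a placeholder host, not a real appliance.
    print(build_search_url("gsa.example.com", "mura cms", collection="products"))
```

Passing the collection and front end as parameters is how the Collections and Front Ends slides connect to the search request: the same index can serve several differently scoped and styled searches.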
  23. GSA Plugin - Results
     ∗ Search results code
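The results code is likewise not included in the transcript. The sketch below shows one way to read the appliance's XML results, using Python's standard `xml.etree.ElementTree`. It is not the plugin's code. The sample XML is hand-written for illustration, not captured from an appliance, and the element names (`GSP`, `RES`, `R`, `U`, `T`, `S`) follow my understanding of the XML reference listed under Resources.

```python
import xml.etree.ElementTree as ET

# Hand-written sample for illustration; real responses carry more elements.
SAMPLE_XML = """<GSP VER="3.2">
  <Q>mura cms</Q>
  <RES SN="1" EN="2">
    <M>2</M>
    <R N="1">
      <U>http://intranet.example.com/about/</U>
      <T>About Us</T>
      <S>Who we are and what we do.</S>
    </R>
    <R N="2">
      <U>http://intranet.example.com/products/</U>
      <T>Products</T>
      <S>An overview of our products.</S>
    </R>
  </RES>
</GSP>"""


def parse_results(xml_text: str) -> list:
    """Return one dict per <R> element with its URL, title, and snippet."""
    root = ET.fromstring(xml_text)
    results = []
    for result in root.iter("R"):
        results.append(
            {
                "url": result.findtext("U", default=""),
                "title": result.findtext("T", default=""),
                "snippet": result.findtext("S", default=""),
            }
        )
    return results


if __name__ == "__main__":
    for row in parse_results(SAMPLE_XML):
        print(row["title"], "->", row["url"])
```

Titles and snippets in real responses may contain escaped markup for highlighted terms, so a display template would need to decide whether to render or strip it.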
  24. GSA Plugin – DEMO
     ∗ DEMO
  25. Google Search Appliance – Secret Recipe
  26. Resources
     ∗ http://docs.getmura.com/
     ∗ http://www.getmura.com/marketplace/apps/fw1-plugin-template/
     ∗ https://developers.google.com/search-appliance/documentation/614/
     ∗ https://developers.google.com/search-appliance/documentation/614/xml_reference
     ∗ http://www.robotstxt.org/meta.html
     ∗ http://muracms.com/forum/
  27. Acknowledgements
     Thanks to Oğuz Demirkapi for helping to prepare the presentation.
  28. Q&A?
