Beyond Simple Search – Adding Business Value in the Enterprise
Kathlina (Kathy) M. Phillips
Vice President, Technology Manager Enterprise Search Services (ESS)
Tom Lutmer
eBusiness Systems Consultant, Enterprise Search Services (ESS)
Lucene Revolution
May 2, 2013
© 2011 Wells Fargo Bank, N.A. All rights reserved. Internal use.
Agenda
• Who are We? What Do We Do?
• Search Architecture
• Beyond Simple Search
• Search Applications – Value & Techniques
• Look to the Future
• Q & A
Our Intranet – Served by ESS
265,000 team members
(potential users – all time zones)
Enterprise Search Services (ESS)
2+ million unstructured docs
20+ million structured content
1300+ domains
10,000+ websites
SharePoint
OpenText
Documentum
WebSphere, ColdFusion
Blogs, Wikis, Social Spaces
.NET, ASP, PHP, JSP, etc.
2+ million queries/month
Search Business Value for Wells Fargo
[Diagram: Enterprise Scoped Search; Site Specific (×4); Customer Impact (×2); Business Analysis; Business Intelligence]
Enterprise:
• Time savings and efficiency
• Reduce rework and duplication
• Timely and updated communications
• Collaboration and knowledge sharing
Site Specific:
• Timely access to notifications, forms, group communications
• Knowledge base applications
Customer Impact:
• Customer support
• Timely access to notifications, forms, processes, procedures
• Knowledge base applications
Business Analysis:
• Deeper level analysis
• Results only relevant in context of application
• Structured and unstructured content
Business Intelligence:
• Connects relationship of data
• Results only relevant in context of application
• Structured and unstructured content
Wells Fargo Intranet
Search Architecture
Enterprise Search Web Services (JSON, XML, HTML)
Internal Service – able to switch to results from different search engines (not dependent on any one search solution)
Administration: Best Bets / Autocomplete Admin Interface / Metrics; View / Query Server Management
Search applications: Hosted Search Apps; Custom Search Apps; Intranet Websites; Search Apps using Web Service XML; Search Apps using Web Service JSON
Search engines: FAST ESP Search; LucidWorks (Lucene/Solr) Search; Optional Other Search
Content acquisition: FAST Web Crawl; Database Connectors; OpenText Connector; LucidWorks Connectors; Other Custom Connectors
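The "not dependent on any one search solution" idea can be sketched as a thin service layer that hides the engine behind one interface. This is only an illustration under stated assumptions: the class name, the two stand-in backends, the example URLs, and the JSON shape are all hypothetical and are not taken from the ESS implementation.

```python
import json
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Hit:
    title: str
    url: str


# Hypothetical stand-ins for real engines; each maps a query string to a list of hits.
def fast_backend(query: str) -> List[Hit]:
    return [Hit(f"FAST result for {query}", "http://intranet.example/fast")]


def solr_backend(query: str) -> List[Hit]:
    return [Hit(f"Solr result for {query}", "http://intranet.example/solr")]


class SearchService:
    """Single entry point; search apps never talk to an engine directly."""

    def __init__(self, backends: Dict[str, Callable[[str], List[Hit]]], active: str):
        self.backends = backends
        self.active = active

    def switch(self, name: str) -> None:
        """Route later queries to a different engine without changing callers."""
        if name not in self.backends:
            raise KeyError(f"unknown engine: {name}")
        self.active = name

    def search(self, query: str) -> str:
        """Return results in one engine-neutral JSON shape."""
        hits = self.backends[self.active](query)
        return json.dumps({"query": query, "results": [vars(h) for h in hits]})


service = SearchService({"fast": fast_backend, "solr": solr_backend}, active="fast")
print(service.search("benefits"))
service.switch("solr")
print(service.search("benefits"))
```

Because callers only see the JSON shape returned by `search`, swapping the engine is a configuration change on the service side rather than a change in every search app.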
Search Services Admin Interface
Autocomplete and Best Bets Management
Search Services Admin Interface
Query Management
Search Services Admin Interface
View Management
Query Manager Name / View Manager Name
Enterprise Scope Applications
Enterprise Scope - Typical keyword intranet search; access, find, retrieve information across a variety of web sites
Challenges: Crawling, Access, Noise in Results, Poor/Inconsistent Quality Content
Techniques: Removal of content, Scripting to improve quality/normalize, Metrics to verify depth/scope, autocomplete, social feedback (click through, best bets, tagging)
[Diagram: Example Crawl Configuration for One Enterprise Scope App. Labels: Crawler; Recreate HTML for Single Sign-on Pages; SharePoint Connector; Full Crawl, Specific Sites; 2 Hop Crawl, All “Published” Sites; Scripting for Meta Data (×3); Index]
Internet vs. intranet search results
Internet: Paying Customers
• Higher quality content in top results
• Tuned results by working directly with search solution (paid for tuning)
• Mostly HTML/web pages
• Searches usually tuned for mass appeal (popular searches)
Intranet: Co-workers
• Lower quality content overall
• Quality of content varies widely
• Larger variety of content types
• Searches vary between popular mass appeal and many very specific to current task
Crawling Challenges
Web Crawler
• High Quality Content with Good Metadata
• Low Quality Content with Bad/Poor Metadata
• Missing Body Content – JavaScript Built or Browser Dependent
• Duplicates – Domain Name, Dynamic Scripts, Published Multiple Times, Upper/Lowercase
• Crawl Rates, Depth, Link Following Methods
• Authentication – Custom, Incorrect, Single Sign-on
• Proxies, Firewalls, Robots
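One way to picture the duplicate problem (domain name aliases, dynamic script parameters, upper/lowercase URLs) is a URL canonicalization step before indexing. The sketch below is an assumption about how such a step might look, not the crawler configuration used by ESS; the alias table, the ignored parameter names, and the example hosts are hypothetical.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Hypothetical alias table: alternate host names that serve the same site.
HOST_ALIASES = {"hr": "hr.intranet.example"}
# Hypothetical query parameters that do not change the page content.
IGNORED_PARAMS = {"sessionid", "ts"}


def canonical_url(url: str, case_insensitive_paths: bool = True) -> str:
    """Map URL variants of the same page to one key (the port is ignored in this sketch)."""
    parts = urlsplit(url)
    host = (parts.hostname or "").lower()
    host = HOST_ALIASES.get(host, host)
    path = parts.path or "/"
    if case_insensitive_paths:
        # Only safe when the web server itself ignores case in paths.
        path = path.lower()
    params = sorted(
        (k, v) for k, v in parse_qsl(parts.query) if k.lower() not in IGNORED_PARAMS
    )
    return urlunsplit((parts.scheme.lower(), host, path, urlencode(params), ""))


def dedupe(urls):
    """Keep the first URL seen for each canonical key."""
    seen, unique = set(), []
    for u in urls:
        key = canonical_url(u)
        if key not in seen:
            seen.add(key)
            unique.append(u)
    return unique


print(dedupe([
    "http://HR/Forms/Leave.ASPX?ts=1",
    "http://hr.intranet.example/forms/leave.aspx",
    "http://hr.intranet.example/forms/other.aspx",
]))
```

Content published multiple times under unrelated URLs would not be caught this way; that case needs a content fingerprint rather than a URL rule.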
Post Processing and Scripting
Scripting
Metadata Augmentation
Transformation
Tables/Matching
Text Extraction
Rules / Regex
Code / Logic (Complex/Unique)
Content Removal
Merge/Copy Metadata
ScriptPreprocessorUpdateController
Index
DataSource
Script File (*.js) Update Handler
Do Not Index
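The scripting techniques listed on this slide (content removal, transformation tables, regex text extraction, merge/copy metadata, "Do Not Index") can be illustrated with a small per-document function. The slide indicates the real scripts are JavaScript files (*.js) run before the update handler; the Python below is only a sketch of the same ideas, and every field name, rule, and table entry in it is hypothetical.

```python
import re
from typing import Optional

# Hypothetical transformation table: raw department codes to display names.
DEPARTMENT_NAMES = {"hr": "Human Resources", "it": "Information Technology"}
# Hypothetical rule: documents whose title matches are kept out of the index.
DO_NOT_INDEX = re.compile(r"^(test page|untitled)", re.IGNORECASE)
# Hypothetical boilerplate to strip from the body before indexing.
BOILERPLATE = re.compile(r"Skip to main content\s*", re.IGNORECASE)


def preprocess(doc: dict) -> Optional[dict]:
    """Return a cleaned copy of doc, or None if it should not be indexed."""
    title = doc.get("title", "").strip()
    if DO_NOT_INDEX.match(title):
        return None  # "Do Not Index"

    out = dict(doc)
    # Content removal: drop navigation boilerplate from the body.
    out["body"] = BOILERPLATE.sub("", doc.get("body", ""))
    # Transformation table: normalize the department code.
    code = doc.get("dept", "").lower()
    out["dept"] = DEPARTMENT_NAMES.get(code, doc.get("dept", ""))
    # Text extraction by regex: copy a form number from the body into metadata.
    match = re.search(r"\bForm\s+([A-Z]{2}-\d+)\b", out["body"])
    if match:
        out["form_number"] = match.group(1)
    # Merge/copy metadata: fall back to the first body line when the title is empty.
    if not title:
        lines = out["body"].strip().splitlines()
        out["title"] = lines[0] if lines else ""
    return out


print(preprocess({
    "title": "",
    "dept": "HR",
    "body": "Skip to main content Leave request\nUse Form HR-102 to apply.",
}))
```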
Site Specific – Self Service
Site Specific: Keyword intranet search for a smaller scoped set of content or single website
Technique: Self Service
Copy code to include on their site
Example Site Specific
Customer Impact Applications
Customer Impact – Keyword intranet search with interactivity around a specific business function
• Customer support
• Timely access to notifications, forms, group communications
• Knowledge base applications
Challenges & Techniques:
Security and Performance; Custom User Interfaces and Metadata
Security Architecture
Content Acquisition: Content & ACLs; Content – no ACLs; Database with ACL mapping
Content may/may not include ACLs at acquisition time
Query/Index: Enterprise Search Web Service; Authentication & Match ACL at Query; User Group ACL Caching
Lock direct access to Solr or other
Websites: Security at Website: all or nothing
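"Authentication & Match ACL at Query" with "User Group ACL Caching" can be sketched as a filter applied by the web service after the user is authenticated. This is a minimal illustration, not the ESS security design: the in-memory directory, the `acl` field name, and the choice to treat documents without an ACL as open content are all assumptions.

```python
from functools import lru_cache
from typing import Dict, FrozenSet, List

# Hypothetical directory data: user -> groups.
_DIRECTORY: Dict[str, FrozenSet[str]] = {
    "alice": frozenset({"hr-staff", "all-team-members"}),
    "bob": frozenset({"all-team-members"}),
}


@lru_cache(maxsize=10_000)
def groups_for(user: str) -> FrozenSet[str]:
    """Cached group lookup, standing in for user group ACL caching."""
    return _DIRECTORY.get(user, frozenset())


def authorized_hits(user: str, hits: List[dict]) -> List[dict]:
    """Keep hits whose ACL shares at least one group with the user.

    A hit with no "acl" key is treated as open content; that default is an
    assumption of this sketch.
    """
    groups = groups_for(user)
    return [h for h in hits if "acl" not in h or groups & set(h["acl"])]


hits = [
    {"id": 1, "acl": ["hr-staff"]},
    {"id": 2, "acl": ["all-team-members"]},
    {"id": 3},
]
print([h["id"] for h in authorized_hits("bob", hits)])
```

Filtering of this kind only protects content if search apps cannot reach the engine directly, which is why the slide pairs it with locking direct access to Solr or other engines.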
Business Analysis Applications
Business Analysis – specialty search solutions for deeper level analysis
Search App: Web App; Phonetic Libraries (Apache Commons Codec); Content Index (Lucene); Thesaurus Index (Lucene)
Results: Businesses with MN or Minnesota will show up (results match MN or Minnesota)
Business Intelligence Applications
Business Intelligence: search across structured and unstructured data sources for discovery and reporting
Once search results are returned, these sliders can be used to filter to specific results:
• Companies with FICO > 750 and
• Gross annual sales > $2 million
• In Scottsdale
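The slider example amounts to applying range and value filters to a result set that a search has already returned. A minimal sketch, assuming hypothetical field names and made-up sample companies (none of this data comes from the presentation):

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Company:
    name: str
    fico: int
    gross_annual_sales: float
    city: str


def apply_sliders(
    hits: List[Company], min_fico: int, min_sales: float, city: str
) -> List[Company]:
    """Narrow an existing result set; both thresholds are exclusive, as on the slide."""
    return [
        c
        for c in hits
        if c.fico > min_fico
        and c.gross_annual_sales > min_sales
        and c.city.lower() == city.lower()
    ]


# Made-up search results for illustration only.
hits = [
    Company("A", 780, 3_000_000, "Scottsdale"),
    Company("B", 750, 5_000_000, "Scottsdale"),
    Company("C", 800, 2_500_000, "Phoenix"),
    Company("D", 790, 1_000_000, "Scottsdale"),
]
print([c.name for c in apply_sliders(hits, 750, 2_000_000, "Scottsdale")])
```

In a Solr-backed application the same narrowing would more likely be pushed to the engine as filter queries rather than done in application code; the in-memory version here is just the simplest way to show the logic.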
Where are We Headed?
TO DO:
• Social tags
• Best Bets
• Integrating Click Through
• Metrics, metrics, metrics
• Clustering
• Semantics
• Big Data
Trending:
• Enterprise Search – “gateway” to search apps
• Site Search/Embedded search – Value Add Rising
• Business Intelligence – Value Add Rising
• Quality Audits & Metrics to show value
• Social/Logs/Feedback for relevancy & personalization
• New User Interfaces – mobile, interactive, embedded
Questions?
Thank you!
Kathlina (Kathy) M. Phillips
Tom Lutmer
