Effective Strategies for Searching Oracle UCM Fishbowl Solutions Oracle Universal Content Management Experts Since 1999
Where’s My CONTENT? One of the key features of a CMS is the ability to search and FIND needed content. A Forrester study in 2009 found that “Of respondents planning to increase ECM use, 61% said content sharing was the most important driver, followed by compliance, 51%; improved search, 45%; and cost-effective automation, 44%.”
Survey What version of Content Server are you running? 10g, 7.5, earlier? What’s your search index? Verity? Metadata only? Oracle DB or MS SQL Server Full Text? Oracle SES, Google, or FAST? Do you have any problems with searching your repository?
Background Verity VDK The integrated Verity VDK search solution was previously the default indexing solution. The Verity VDK integration is no longer available. In the past the search solution was pretty well defined, but now there are a number of options. Both new and existing clients need to decide how they will power indexing of their UCM content.
What options are available? Database Metadata only indexing Database Full Text Oracle 9i, 10g, and 11g MS SQL Server External Indexes Oracle Secure Enterprise Search (SES) Fast Verity integration Google Mini 3rd Party integration Others?
Which to use??? Is full text necessary? Metadata only indexing creates an overall simpler, lighter weight architecture Do you want to search for things not managed in Content Server? Enterprise search (custom or 3rd party product required) Does your database support full text search? (With the features you need?)
Search Features Search Vocabulary (Boolean, proximity, zone, stemming, case sensitivity, range searching?) Spell checking Results pagination Ability to limit size of search results (MaxResults) Ability to return large hit count
Search Features (cont’d) Sorting on different types or multiple fields at once (score, metadata, number or date fields) Relevance or scoring ability and quality Snippet or summary returned with results Keyword match highlighting (PDF Highlighting)
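To make a few of the search features above concrete, here is a minimal sketch of sorting on multiple fields at once, capping the result set, and paginating. It is generic Python, not Content Server code: the hit records, field names, and the `sort_hits` and `page` helpers are all hypothetical, and `max_results` only mimics the idea of a MaxResults-style limit.

```python
from datetime import date

# Hypothetical search hits; field names are illustrative only.
hits = [
    {"title": "A", "score": 0.9, "released": date(2009, 3, 1)},
    {"title": "B", "score": 0.9, "released": date(2009, 5, 1)},
    {"title": "C", "score": 0.4, "released": date(2009, 6, 1)},
    {"title": "D", "score": 0.7, "released": date(2008, 1, 1)},
]


def sort_hits(rows):
    # Sort by score (highest first), then by release date (newest first).
    return sorted(rows, key=lambda r: (-r["score"], -r["released"].toordinal()))


def page(rows, page_number, page_size, max_results):
    # Cap the result set first, then slice out the requested page (1-based).
    capped = rows[:max_results]
    start = (page_number - 1) * page_size
    return capped[start:start + page_size]


ordered = sort_hits(hits)
first_page = page(ordered, 1, page_size=2, max_results=3)
second_page = page(ordered, 2, page_size=2, max_results=3)
print([h["title"] for h in first_page], [h["title"] for h in second_page])
```

When comparing index options, it is worth checking whether the engine can do this kind of multi-field sort and capping itself, or whether it has to be done on the returned rows as in this sketch.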
Indexing Features Stop words Stemming Dictionaries available (multi-language support and support of multiple languages in a single collection) File formats supported Parametric search or faceted search
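As a rough illustration of what stop words and stemming do at indexing time, the sketch below drops common words and strips a few suffixes so that related word forms index to the same term. The stop word list, the suffix list, and the `stem` and `index_terms` helpers are deliberately naive assumptions; real search engines use language-specific dictionaries and far more careful stemming rules.

```python
# Illustrative stop word and suffix lists; real engines ship per-language dictionaries.
STOP_WORDS = {"the", "a", "an", "of", "and", "for", "in", "to"}
SUFFIXES = ("ing", "es", "ed", "s")


def stem(word):
    # Strip the first matching suffix, keeping at least three characters.
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def index_terms(text):
    # Lowercase, trim punctuation, drop stop words, then stem what is left.
    words = [w.strip(".,;:!?\"'()").lower() for w in text.split()]
    return [stem(w) for w in words if w and w not in STOP_WORDS]


print(index_terms("Searching the indexed documents"))
```

With this scheme a query for "searches" and a document containing "searching" both reduce to the same term, which is the behavior to test for when evaluating an engine's stemming support.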
Indexing Features (cont’d) Performance Index size (relative to indexed content) Page limit (how much content can be indexed?) Indexing Latency Ability to index large amounts of numeric data Metadata indexing (lots of metadata?) Rebuild required on metadata additions? Indexing SDK return value confirms success
Other Considerations Scalability Platform support (Hardware/OS) Support Availability Documentation available Cost of indexing solution and hardware
Verity Verity was purchased by Autonomy, which is focusing its development efforts on its IDOL platform. Oracle stopped distributing Verity to new customers in June 2008. As of February 28, 2009 the Verity components are no longer available through any distribution channels, including media request or download. Only solution to support PDF Highlighting
Database Search Options DisableTotalItemsSearchQuery Improves search performance when set to false Substring searches may perform poorly Substring should not be the default search operator SearchSortOptions component Can be used to optimize search performance Recommend indexing commonly searched fields for improved performance! CaseInsensitiveSearch component For Oracle databases
Metadata Only Indexing All databases supported by Content Server can be used to support metadata only indexing Good for Records Management or scanning solutions where content access isn’t as important and full text indexing is not necessary Good for DAM instances where content is not indexable.
Database Full Text Indexing Oracle Text 11g database is recommended Score based sorting Performance improvements Faceted search available Oracle Text 10g is functional (9i not recommended) Microsoft SQL Server also supported No Score based sorting
Secure Enterprise Search Oracle intends to provide a limited license with UCM to index UCM content Available when 11g is released (most likely) Recommended where Content Server database does not support full text indexing Similar indexing features to Oracle Text 11g Can be used as enterprise search solution with additional licensing
Other external indexes Fast Now owned by Microsoft Fast but heavy hardware requirements 10g integration supported, but not currently available Few customers are using this option Google Mini integration Available from 3rd party includes lease of Google search appliance Custom integration Component architecture would allow for a Lucene or other search index integration
Oracle Text 11g Faceted Search Three fields are configured for drill-down out of the box, including Security Group and Account. One additional field could be added. You can filter your search by any of these fields. This faceted search feature could be added to other search results templates to add value to an interface
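The drill-down idea can be sketched in a few lines: count how many results fall under each value of a facet field, then filter the results when the user picks a value. This is a generic Python illustration, not how Oracle Text computes facets; the result rows, the field names, and the `facet_counts` and `drill_down` helpers are hypothetical.

```python
from collections import Counter

# Hypothetical search results; field names are illustrative only.
results = [
    {"title": "Q1 report", "security_group": "Public", "account": "Finance"},
    {"title": "Q2 report", "security_group": "Secure", "account": "Finance"},
    {"title": "Handbook", "security_group": "Public", "account": "HR"},
]


def facet_counts(rows, fields):
    # For each facet field, count how many results carry each value.
    return {field: Counter(row[field] for row in rows) for field in fields}


def drill_down(rows, field, value):
    # Keep only the results whose facet field matches the chosen value.
    return [row for row in rows if row[field] == value]


facets = facet_counts(results, ["security_group", "account"])
public_only = drill_down(results, "security_group", "Public")
print(dict(facets["security_group"]), [r["title"] for r in public_only])
```

In a search results template, the counts would typically be rendered as links next to the result list, with each link re-running the search restricted to that value.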
Content Rating Allows users to rate content. Content rating is included with search results
Predictive Search Component Google-type feature that suggests search terms as the user types a query Captures user search terms and stores those terms that return results Fishbowl Solutions component offered to complement any search index solution
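The behavior described above (store only the terms that returned results, then suggest them as the user types) could be sketched roughly as follows. This is not the Fishbowl component's implementation; the `SuggestionStore` class, its method names, and the ranking by usage count are assumptions made for illustration.

```python
from collections import Counter


class SuggestionStore:
    """Hypothetical store of past queries that returned at least one result."""

    def __init__(self):
        self._counts = Counter()

    def record(self, query, hit_count):
        # Keep only terms that actually returned results.
        term = query.strip().lower()
        if term and hit_count > 0:
            self._counts[term] += 1

    def suggest(self, prefix, limit=5):
        # Return stored terms starting with the prefix, most used first.
        prefix = prefix.strip().lower()
        if not prefix:
            return []
        matches = [(t, n) for t, n in self._counts.items() if t.startswith(prefix)]
        matches.sort(key=lambda item: (-item[1], item[0]))
        return [term for term, _ in matches[:limit]]


store = SuggestionStore()
store.record("invoice 2009", 12)
store.record("Invoice template", 3)
store.record("invoice template", 7)
store.record("invoce", 0)  # no hits, so the misspelling is never suggested
print(store.suggest("inv"))
```

Because terms with zero hits are never stored, suggestions steer users toward queries known to find content, whichever index sits behind the search.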
Conclusions Many customers will need to select a new search option in the not too distant future The first question is whether metadata-only indexing is sufficient, whether enterprise search is required, and/or whether a full text index is required There are a lot of options, but not much objective information is available on how to compare them (some of this is expected to be forthcoming with the 11g release) A custom integration may add significant value