EMC ANZ Momentum User Group 2011 - Tech Track - EMC Documentum xPlore

EMC Documentum xPlore:
The Next Generation for Fulltext Search in the Documentum Platform, Uwe Ziemer

Published in: Technology, Business

  1. 1. EMC Documentum xPlore: The Next Generation for Fulltext Search in the Documentum Platform
     Uwe Ziemer, Solution Architect
     © Copyright 2011 EMC Corporation. All rights reserved.
  2. 2. Agenda
     • Introducing Documentum xPlore, Re-cap
     • xPlore Roadmap
     • xPlore Readiness
     • Documentum xPlore Functional Overview
  3. 3. Documentum xPlore re-cap
     • Native EMC software, limited third party
     • Next generation search built on EMC’s native XML database xDB in addition to Lucene
     • Provides structured and unstructured search leveraging XML and XQuery standards
     • ‘No charge’ feature for the Content Server
     • Replaces FAST Instream
       – FAST Instream is no longer supported after end-of-year 2011
     [Architecture diagram labels: xPlore API; Indexing Services; Search Services; CPS; Node/Data Management; Analytics; Admin; Services; xDB API; xDB query process & optimization; xDB transaction, index & page mgmt]
  4. 4. xPlore Advantage, at a Glance
     Item | xPlore | Legacy FAST Instream | Customer Benefit
     xPlore enforcement of Documentum Security | ✔ | ✖ | Under-privileged users response times 9x better than with FAST
     Collection routing and collection capping | ✔ | ✖ | Rebalancing of additional instances and nodes not required. Can separate old data from new data. Not possible or extremely difficult to do with FAST.
     Collection query routing | ✔ | ✖ | Enables query partitioning in xPlore. FAST sends query to all nodes and partitions.
     Native-computed facets | ✔ | ✖ | Result windows can be ~40x deeper than with FAST (10,000 vs. 250) [D6.6]
  5. 5. xPlore Advantage, at a Glance
     Item | xPlore | Legacy FAST Instream | Customer Benefit
     Admin Query & Ingestion analytics + other tools | ✔ | ✖ | Significantly eases troubleshooting burden
     Folder descend optimisations | ✔ | ✖ | Improved response time and more constant memory usage
     Date range “betweens” | ✔ | ✖ | Better response time for range queries (D6.6+)
  6. 6. xPlore Advantage, at a Glance
     Item | xPlore | Legacy FAST Instream | Customer Benefit
     Horizontal scaling with multiple nodes | ✔ | ✖ | Able to expand capacity without ‘fork-lifting index data’ to the brand new servers.
     VMware support | ✔ | ✖ | Massive reduction of HW resources. Quicker time to deploy.
     NAS support | ✔ | ✖ | Storage virtualization. Eases deployment and maintenance.
     Improved special character handling | ✔ | ✖ | Better search results.
     64 bit OS and VM support | ✔ | ✖ | Allows for fewer servers to be deployed and managed.
  7. 7. xPlore Advantage, at a Glance
     Item | xPlore | Legacy FAST Instream | Customer Benefit
     Backward compatible with client code | ✔ | ✖ | No need to upgrade existing clients.
     Dual mode migration support | ✔ | ✖ | Significantly reduced migration risk.
     In-depth language support | ✔ | ✖ | Greater deployment flexibility, and better search results.
     Support for wide range of content formats | ✔ | ✖ | Additional content types can be indexed.
     Hot-backups and other HA/DR improvements | ✔ | ✖ | *Multiple High Availability & Disaster Recovery options
  8. 8. Documentum Performance & Scalability: xPlore vs. legacy FAST Instream in Documentum environments
     Test | Result
     Single node day-forward indexing | xPlore, over 50% higher ingestion throughput on single node
     Single node multi-user query test | xPlore over 48% higher query throughput; xPlore consumes 35% less resources
     ‘Save-to-Search Latency’ | xPlore latency at least ½ that of FAST in most cases
     System size | xPlore tested at 40+ million documents prior to GA
  9. 9. Agenda
     • Introducing Documentum xPlore, Re-cap
     • xPlore Roadmap
     • xPlore Readiness
     • Documentum xPlore Functional Overview
  10. 10. xPlore 1.1, Just Released
     Item | Customer Benefit
     Improved ‘Fuzzy Search’ (Diacritics insensitive, and Levenshtein distance) | Provides significantly better results when using proximity searches, i.e. matching more than just the exact value
     Fully 64-bit (new text extractor) | Provides the ability to examine larger documents
     Additional languages (Spanish, Italian, French, Korean and Japanese) | Provides more flexibility in deployment and the ability to index more content
     Spell Check (Levenshtein distance) | Limited spell check capability to detect errors in search criteria
     SSL connection option to content server (Windows only) | Provides increased security when content is sent to xPlore during indexing
     Hybrid security mode (configurable) | In environments where there is a high focus on security, provides the ability to check user permissions once search results are returned to content server
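The fuzzy search and spell check rows above both name Levenshtein distance and diacritics insensitivity. As a rough illustration of those two ideas only, here is a minimal Python sketch. It is not xPlore's implementation: the function names, the `max_distance` threshold, and the sample vocabulary are all assumptions made for the example.

```python
import unicodedata


def strip_diacritics(text: str) -> str:
    """Remove combining accent marks so 'résumé' compares equal to 'resume'."""
    decomposed = unicodedata.normalize("NFKD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def levenshtein(a: str, b: str) -> int:
    """Edit distance counting single-character inserts, deletes, substitutions."""
    previous = list(range(len(b) + 1))  # distances from "" to each prefix of b
    for i, ch_a in enumerate(a, start=1):
        current = [i]  # distance from a[:i] to ""
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            current.append(min(previous[j] + 1,          # delete ch_a
                               current[j - 1] + 1,       # insert ch_b
                               previous[j - 1] + cost))  # substitute
        previous = current
    return previous[-1]


def suggest(term, vocabulary, max_distance=1):
    """Hypothetical helper: vocabulary words within max_distance edits, nearest first."""
    folded = strip_diacritics(term.lower())
    scored = [(levenshtein(folded, strip_diacritics(w.lower())), w) for w in vocabulary]
    return [w for distance, w in sorted(scored) if distance <= max_distance]
```

For instance, `suggest("documentm", ["documentum", "document", "xplore"])` would keep both near matches and drop the unrelated word.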
  11. 11. xPlore 1.1, Just Released (continued)
     Item | Customer Benefit
     Freshness algorithm improved | Better tuning of relevancy providing improved precision, recall, or ordering
     Performance improvements for wildcard search | Faster search results when using wildcards
     Online re-index supported | Provides the ability to re-generate the indexes in the background without impact to users
  12. 12. xPlore 1.2 Candidate Features
     Item | Customer Benefit
     Thesaurus support | Improved quality of search allowing return of matches on words with similar meaning
     Additional languages (Hebrew, Russian, Brazilian Portuguese, Arabic) | Provides more flexibility in deployment and the ability to index more content
     Spell check using external dictionary | Full spell check capability to detect errors in search criteria
     Ingestion improvements | A set of features intended to improve the rate at which content is indexed under various situations
     Warm up | Provides the ability to prepare the application for use prior to user sign on so that users experience better performance as a result
  13. 13. xPlore 1.3 Candidate Features
     Item | Customer Benefit
     Cloud integration enhancements | A set of functionality enabling xPlore to be deployed in the cloud
     Additional language (Dutch) | Provides more flexibility in deployment and the ability to index more content
     XCP 2.0 support | Search experience and index configuration driven by the XCP data model
  14. 14. Agenda
     • Introducing Documentum xPlore, Re-cap
     • xPlore Roadmap
     • xPlore Readiness
     • Documentum xPlore Functional Overview
  15. 15. Product Readiness
     • Extensive beta program for xPlore
       – October 2009 to April 2010
       – 13 customers
     • Controlled Release Program with highest number of participants in IIG history
       – 32 highly qualified customers representing variety of industries (Pharma, Financial, Government, Partners, Media, Aerospace)
       – July to November, 2010
     • Many customers already in production + internal systems
     • Technical Support, Sales, Professional Services, Partners all trained and ready to go
  16. 16. xPlore Compatibility
     • xPlore compatible with existing clients (Webtop, CenterStage, etc)
       – Existing clients will work seamlessly with xPlore (customizations to use VQL will require modifications)
       – Existing clients will benefit from improved xPlore performance (Folder Descend, underprivileged queries, etc)
     • Dual Mode or “Rolling Upgrade”: ability to run FAST & xPlore on the same repository
       – Allows for easy transition of users from FAST to xPlore with no downtime
       – Dual Mode significantly reduces migration risks
     • Barring any internal policy constraints, use the 64-bit version of xPlore
     • xPlore available on Windows & Linux only
       – Content Server can be on any supported OS (Windows, Linux, Solaris, HP, AIX)
     • xPlore compatible with 6.5 SP2 and newer Content Server versions
  17. 17. Your Mission: Migrate to xPlore
     • xPlore is designed with migration from FAST in mind
     • Plan for your migration to Documentum xPlore in 2011
     • So how do you get there?
  18. 18. FAST to xPlore Transition Period
     [Timeline figure: both FAST & Documentum xPlore available; FAST and xPlore 1.1 shown against Content Server releases 6.5 SP1, 6.5 SP2, 6.5 SP3, 6.6, 6.7]
     • Transition Period
       – 2011 will be critical transition period as both FAST & xPlore will be supported on 6.5 SP2, 6.5 SP3 & 6.6
       – FAST will no longer ship with D6.7 but can still be downloaded and installed for a Dual Mode transition
       – Support of both engines allows customers to take advantage of Dual Mode Migrations (“Rolling Upgrade”)
  19. 19. EMC P.S. xPlore Migration Service Offering
     Work Package 1 | Planning and Kickoff
     Work Package 2 | Current Full-Text Environment Analysis
     Work Package 3 | xPlore System Sizing and hardware recommendations
     Work Package 4 | Development environment deployment & Re-index
     Work Package 5 | UAT environment deployment & Re-index
     Work Package 6 | Pre-Production Testing
     Work Package 7 | Production Deployment Planning & Implementation
     Work Package 8 | Decommission FAST
     Includes
     • Application Impact Planning
     • Production Sizing
     • Planning for High Availability and Disaster Recovery
     • Migration Planning
     • Hardware and Software setup
     • Pre-production Testing
     • User Acceptance Testing
     • Production Cutover
       – Part 1: Content migration & Initial user testing
       – Part 2: User cutover and FAST cutoff
  20. 20. Agenda
     • Introducing Documentum xPlore, Re-cap
     • xPlore Roadmap
     • xPlore Readiness
     • Documentum xPlore Functional Overview
  21. 21. Quality of Search: Language Support
     • xPlore leverages industry leading linguistic analysis technology that will allow us to certify with many languages, as required
     • xPlore uses lemmatization to help improve retrieval
       – Searches for “run” will find documents with “running” & “runs”
       – xPlore stores original form & lemma for each token to ensure at least exact match capability for all languages
     • Language identification done for metadata and content separately
       – e.g. metadata is Chinese, content is English
       – Metadata fields used for language id are configurable
       – “Default language” can be configured as fallback when language is not supported
     Sidebar:
     • xPlore 1.0 supports English, German, Chinese
     • xPlore 1.1 adds French, Italian, Spanish, Japanese & Korean
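The point about storing both the original form and the lemma for each token can be pictured with a toy inverted index. The sketch below is an illustration under stated assumptions, not xPlore's linguistic pipeline: the `LEMMAS` table is a hypothetical stand-in for a real analyzer, and `TinyIndex` is a name invented for this example.

```python
from collections import defaultdict

# Hypothetical toy lemma table; a real linguistic analyzer would supply this.
LEMMAS = {"running": "run", "runs": "run", "ran": "run"}


def lemma_of(token: str) -> str:
    """Map a token to its lemma, falling back to the token itself."""
    return LEMMAS.get(token, token)


class TinyIndex:
    """Keeps two postings maps: one keyed by original form, one keyed by lemma."""

    def __init__(self):
        self.exact = defaultdict(set)  # original token -> doc ids
        self.lemma = defaultdict(set)  # lemma -> doc ids

    def add(self, doc_id, text):
        for token in text.lower().split():
            self.exact[token].add(doc_id)
            self.lemma[lemma_of(token)].add(doc_id)

    def search(self, term, exact_only=False):
        term = term.lower()
        if exact_only:
            # Exact match still works even for tokens with no known lemma.
            return set(self.exact.get(term, set()))
        return set(self.lemma.get(lemma_of(term), set()))
```

With documents containing "runs", "running" and "run", a default search for "run" would return all three, while `exact_only=True` would return only the document with the literal token.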
  22. 22. Quality of Search: Document Format Support
     • xPlore supports over 150 unique file types
       – 800+ versions of formats (e.g. Word 2003, Word 2010)
     • Certain formats can be turned off for indexing if desired by modifying the dm_format object and setting can_index to ‘False’
       – Metadata will still be indexed
  23. 23. Native Security
     • Content Server Security replicated to xPlore
       – Enabled by default when xPlore is configured
       – Index Agent replicates ACLs and Groups to xPlore [shared or dedicated]
       – Fulltext queries will leverage native security including MACLs
     • Hybrid mode can be configured to check security on content server as well
     • Efficient deep facet computation with security enforcement
     • Enables efficient searches on large result sets (underprivileged users)
     [Diagram labels: Content Server; xPlore; “Security evaluation done natively within xPlore”; “Content Server will replicate security information into xPlore”]
  24. 24. Trends within Documentum Environments: Consolidation and Virtualization
     • Small repositories are getting consolidated into larger ones
     • The full text environments for multiple repositories are being consolidated into multi-instance xPlore deployments
     • These are being deployed in virtual environments (critical for xPlore migrations)
     [Diagram: Repository A and Repository B merge into Consolidated Repository (A & B), served by an xPlore domain for the consolidated repository; Repository C is served by its own xPlore domain]
  25. 25. xPlore Consolidation Support
     • Typically consolidated repositories map to different object types within single repository
     • The objects from these types can be mapped to multiple “collections” within xPlore
       – Mapped to single or multiple xPlore instances
     • These collections can be queried independently of each other
     [Diagram: Repository A and Repository B merge into Consolidated Repository (A & B); the domain for the consolidated repository contains Collection A and Collection B]
  26. 26. Data Management: Collection based querying
     • In DQL statement, collection hint can be used to indicate that the search will only be executed against the target collection
     • Limiting query to a specific collection can improve performance and/or precision
     Example from the slide: select …. from dm_sysobject search document contains test search in collection ( <collection A> )
     [Diagram: Domain A containing Collection A and Collection B]
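Why restricting a query to one collection can help performance may be easier to see in a toy model. The sketch below is purely conceptual: `Domain`, its methods, and the "documents scanned" counter are assumptions invented for this example, and it does not represent xPlore's API or the DQL hint syntax.

```python
class Domain:
    """Toy model of a search domain holding named collections of documents."""

    def __init__(self):
        self.collections = {}  # collection name -> {doc_id: text}

    def add(self, collection, doc_id, text):
        self.collections.setdefault(collection, {})[doc_id] = text

    def search(self, term, collection=None):
        """Return (sorted matching doc ids, number of documents scanned)."""
        if collection is None:
            targets = list(self.collections.values())  # query every collection
        else:
            targets = [self.collections.get(collection, {})]  # only the target
        hits, scanned = [], 0
        for docs in targets:
            for doc_id, text in docs.items():
                scanned += 1
                if term.lower() in text.lower().split():
                    hits.append(doc_id)
        return sorted(hits), scanned
```

In this model, a search scoped to one collection touches only that collection's documents, which is the intuition behind both the performance and the precision claim on the slide.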
  27. 27. Native Facet Computation: FIND WHAT YOU NEED FAST
     • xPlore provides native facet support that respects your enterprise’s security rules (ACLs)
     • Designed to provide efficient and deep facets (configurable)
     • Available with CenterStage 1.1 but could also be integrated into Webtop through customization
     • Benefits of facets:
       – Facets provide the user with an effective way to drill down to the desired result quickly
       – Replaces paging or scrolling through results
       – Facets are a way to give users “Advanced Search” in a simpler, more intuitive way
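Facets that respect ACLs mean that counts are computed only over documents the searching user is allowed to see. A minimal sketch of that idea follows. The record layout (an `acl` set of user and group names per document) and the function names are assumptions for illustration, not Documentum's security model or xPlore's facet API.

```python
from collections import Counter


def visible(doc, user, groups):
    """A document is visible if its ACL names the user or one of the user's groups."""
    return user in doc["acl"] or bool(groups & doc["acl"])


def facet_counts(results, user, groups, field):
    """Count values of `field` over only the results this user may see."""
    counts = Counter(doc[field] for doc in results if visible(doc, user, groups))
    return counts.most_common()  # highest counts first


# Hypothetical result set for the usage example.
RESULTS = [
    {"id": 1, "format": "pdf", "acl": {"alice", "finance"}},
    {"id": 2, "format": "pdf", "acl": {"finance"}},
    {"id": 3, "format": "doc", "acl": {"bob"}},
    {"id": 4, "format": "pdf", "acl": {"hr"}},
]
```

Here `facet_counts(RESULTS, "bob", {"finance"}, "format")` would count documents 1, 2 and 3 but never reveal that document 4 exists, since the filter runs before counting.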
  28. 28. Auditing & Metrics
     • Metric information (statistics) is maintained for all indexing activity by default
     • Auditing is used to track query activity
       – Off by default but can be turned on in xPlore Administrator
     • Leveraged by xPlore Administrator built-in reports and can be used to generate custom reports
  29. 29. xPlore Administrator
     • Web based Administrative UI for Documentum xPlore
     • Greatly simplifies day to day configuration and diagnostics of Documentum xPlore
     • Allows for administration of either single node or multinode deployments
     • Exposes power of metrics and auditing information via the reporting component
  30. 30. xPlore Administrator: Reporting Mechanism Example, Top N Slowest Queries
     • Leveraging metrics and audit information in xPlore, this report returns the slowest queries for any time period
  31. 31. xPlore Administrator: Reporting Mechanism, Top N Slowest Queries (an example)
     1. Enter date range
     2. Enter desired criteria, e.g. “Time to first result”
     3. Select number of results or slow queries you wish to see
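The three steps above (date range, criterion, number of results) amount to a filter followed by a top-N selection. As a rough sketch of how a custom report over exported audit data might do the same, assuming a hypothetical record layout with `start_time` and millisecond timing fields (these field names are not xPlore's audit schema):

```python
import heapq
from datetime import datetime


def top_n_slowest(audit_records, start, end, n=10, metric="time_to_first_result_ms"):
    """Return the n records with the largest `metric` whose start_time is in [start, end]."""
    in_range = (r for r in audit_records if start <= r["start_time"] <= end)
    return heapq.nlargest(n, in_range, key=lambda r: r[metric])


# Hypothetical audit records for the usage example.
AUDIT = [
    {"query_id": "q1", "start_time": datetime(2011, 3, 1, 9, 0), "time_to_first_result_ms": 120},
    {"query_id": "q2", "start_time": datetime(2011, 3, 1, 9, 5), "time_to_first_result_ms": 950},
    {"query_id": "q3", "start_time": datetime(2011, 3, 2, 14, 0), "time_to_first_result_ms": 400},
    {"query_id": "q4", "start_time": datetime(2011, 4, 1, 8, 0), "time_to_first_result_ms": 5000},
]
```

Calling `top_n_slowest(AUDIT, datetime(2011, 3, 1), datetime(2011, 3, 31), n=2)` would ignore the April record and return the two slowest March queries, slowest first.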
  32. 32. xPlore Administrator: Reporting Mechanism Example
     [Annotated report screenshot; callouts reconstructed from interleaved extraction text]
     • “Query Id” can be used to retrieve the actual XQuery for further diagnosis
     • “User Name” helps give you more context
     • “Date” and “Start Time” help you correlate to other events (i.e. network issues, load, etc)
     • “Time to First Result” is typically the most important measure for end user satisfaction
     • “Total Hits” tells you if the query was selective
     • “Results Denied by security filter” tells you if it was an underprivileged user query
     • “Summary” tells you if summary was returned as part of results
     [The callout text for “Total Search Time” and the stray fragment “problem” could not be recovered]
  33. 33. Resources Posted to Powerlink
     • White Paper: EMC Documentum xPlore Migration and Implementation - Best Practices Planning
     • Documentum Introduces Powerful New Search - Video
     • Migrating to New Search Documentum xPlore - Video
     • Powerful New Documentum xPlore to Replace FAST Instream Search - Announcement
     • FAQ Plus
     • EMC Developer Network
       – https://community.emc.com/docs/DOC-8945
  34. 34. Q&A
  35. 35. THANK YOU
