Deep Dive: Advanced Search Technologies
 

Even with recent advancements in predictive coding, tried-and-true search tactics such as keyword searching, concept searching, topic grouping, near de-duplication, and email threading will continue to play an important role in ediscovery filtering, review, and production across the Electronic Discovery Reference Model (EDRM).

Usage Rights

© All Rights Reserved

    Presentation Transcript

    • Discussion Overview
      • Case Law and Industry Guidance: The Role of Searching in Ediscovery
      • Back to the Basics: Keyword Searching Tips
      • Deep Dive: Advanced Searching Technologies
    • Judicial Viewpoints on Keyword Searching
      • Court required parties to “confer on the development of reasonable search terms” instead of compelling production without a list of proposed search terms provided by the requesting party
      • “Common practice governing the discovery of [ESI] requires the use of search terms . . . If the producing party generates the search terms on its own, the inevitable result will be complaints that the search terms were inadequate”
      • EEOC v. McCormick & Schmick’s Seafood Restaurants, Inc., 2012 WL 380048 (D. Md. Feb. 3, 2012).
    • Analyzing Search Methods
      • Keyword searching plays an important role in winnowing document sets for discovery
      • Objective of search: high recall and precision
        » Recall – fraction of relevant documents found during review
        » Precision – fraction of identified documents that actually are relevant
      • In this example, fruit is relevant; broccoli is not.
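The two definitions above can be made concrete with a small calculation. This is a minimal sketch, not part of the original slides; the function name `recall_precision` and the fruit/broccoli document IDs are hypothetical and only echo the slide's example.

```python
def recall_precision(relevant, retrieved):
    """Return (recall, precision) for a search result.

    relevant  -- IDs of all documents that truly are relevant
    retrieved -- IDs of the documents the search returned
    """
    relevant, retrieved = set(relevant), set(retrieved)
    true_hits = relevant & retrieved
    recall = len(true_hits) / len(relevant) if relevant else 0.0
    precision = len(true_hits) / len(retrieved) if retrieved else 0.0
    return recall, precision


# Hypothetical collection: fruit is relevant, broccoli is not.
relevant = {"apple", "banana", "cherry", "grape"}
retrieved = {"apple", "banana", "cherry", "broccoli_1", "broccoli_2"}

recall, precision = recall_precision(relevant, retrieved)
print(recall, precision)  # 0.75 0.6
```

Here the search found three of the four fruit (recall 0.75), but two of its five hits were broccoli (precision 0.6). Broadening a search tends to raise recall at the cost of precision, which is why both numbers are tracked together.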
    • Designing Effective Keyword Searches
      1. Understand your search engine
        » Learn how each operator works (OR, AND, PROXIMITY, etc.)
        » Be aware of operator precedence (Boolean or left-to-right) and use parentheses to clarify
        » Work with ediscovery provider to create an alternative strategy for lengthy searches that may “time out”
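To see why operator precedence matters, consider the query `fraud OR audit AND memo` evaluated two ways. This is an illustrative sketch using plain Python booleans, not the syntax of any particular review platform; the sample document and the helper `has` are hypothetical.

```python
doc = "quarterly fraud report"
words = set(doc.lower().split())


def has(term):
    """True if the term appears as a whole word in the sample document."""
    return term in words


# Query as typed: fraud OR audit AND memo

# Boolean precedence: AND binds tighter than OR.
boolean_precedence = has("fraud") or (has("audit") and has("memo"))

# Left-to-right evaluation: operators are applied in the order written.
left_to_right = (has("fraud") or has("audit")) and has("memo")

print(boolean_precedence)  # True
print(left_to_right)       # False
```

The same query text hits the document under one evaluation order and misses it under the other, so explicit parentheses are the safest way to state the intended grouping.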
    • Designing Effective Keyword Searches
      2. Develop a search strategy
        » Run broad searches for date-range culling, etc. then use results as scope for sub-level searches
        » Save searches and search results for future use and reference
        » Find on-point documents and use “similar” documents and concepts to provide additional key terms
        » Know your universe (foreign language requires foreign keywords!)
    • Designing Effective Keyword Searches
      3. Build smart keyword lists
      • Use a text editor to reduce errors
        » Programs that format text can cause difficulty
        » Use a program like Notepad and place each term on a separate line
        » Spell check
        » Be aware of commonly misspelled keywords or privilege terms
      • Understand the impact of your key terms
        » Be flexible: account for word/phrase permutations – use a “Data Dictionary”
        » Over-inclusive? Under-inclusive?
        » “Noise words” increase likelihood of false hits
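The list-hygiene tips above (one term per line, no word-processor formatting, watch for noise words) can be partly automated. The following is a rough sketch under stated assumptions: the `NOISE_WORDS` set is a made-up illustration, and real search engines publish their own noise-word lists, which should be used instead.

```python
import unicodedata

# Illustrative only; consult the search engine's own noise-word list.
NOISE_WORDS = {"the", "and", "of", "a", "to"}

# Characters that word processors often substitute for plain ASCII.
FANCY_TO_PLAIN = {"\u201c": '"', "\u201d": '"', "\u2018": "'", "\u2019": "'"}


def clean_keyword_list(raw_text):
    """Return (terms, noise): de-duplicated terms, and those that are noise words."""
    terms, seen = [], set()
    for line in raw_text.splitlines():
        term = unicodedata.normalize("NFKC", line)
        for fancy, plain in FANCY_TO_PLAIN.items():
            term = term.replace(fancy, plain)
        term = " ".join(term.split())  # trim and collapse whitespace
        key = term.lower()
        if not term or key in seen:
            continue  # skip blank lines and case-insensitive duplicates
        seen.add(key)
        terms.append(term)
    noise = [t for t in terms if t.lower() in NOISE_WORDS]
    return terms, noise


raw = "\u201cattorney client\u201d\nprivileged\nPrivileged\n\nthe\n  work   product "
terms, noise = clean_keyword_list(raw)
print(terms)  # ['"attorney client"', 'privileged', 'the', 'work product']
print(noise)  # ['the']
```

The function flags noise words rather than deleting them, since whether a flagged term should stay in the list is a judgment for the legal team.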
    • Advanced Searching Technologies
      What are some “new and evolving” search methods?
      • Will cover in this presentation:
        1. Concept Searching
        2. Topic Grouping
        3. Language Identification
        4. Email Threading
        5. Near De-Duplication
        6. Sampling
      • Will not cover in this presentation – hot, evolving topic!
        » Technology-assisted Review
    • 1. Keyword Searching vs. Concept Searching
      • Keyword Searching
        » Uses search terms to retrieve documents that contain those exact terms
        » Standard practice; generally accepted in the courts
      • Concept Searching
        » Allows reviewers to find documents with similar conceptual terms even if they do not contain the exact search terms
        » Seldom used for filtering; increasingly used for review
        » Emerging as a technology alternative
    • 2. Topic Grouping
      • Documents automatically grouped by theme without human input
      • Topic grouping will group similar documents and label them for quick identification
      • Users do not need to “seed” the processing engine by providing keywords
    • 3. Language Identification
      • This technology can identify all languages in a document as well as the primary language and pass this information along via a metadata field
      • A legal team needs to know what languages are in a collection, and the volume of foreign language documents
      • Reports can help determine whether to use machine translations, foreign language reviewers, or a combination
    • 4. Email Threading
      • Identifies and groups for review e-mail conversations based on content
      • Using actual content of the e-mails to identify e-mail threads is the most reliable method, as it will not fail to recognize a thread if the subject line changes or if e-mails are exchanged across different e-mail applications
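One simple way to picture content-based threading is to treat a message as a reply when its body contains the full text of an earlier message as quoted material. The sketch below is a toy illustration of that idea only, not how any commercial threading engine works; the function names and sample messages are hypothetical, and subject lines are deliberately never consulted.

```python
def normalize(body):
    """Strip '>' quote markers, collapse whitespace, and lowercase."""
    lines = [line.lstrip("> ").strip() for line in body.splitlines()]
    return " ".join(" ".join(lines).split()).lower()


def thread_by_content(emails):
    """Group message IDs into threads using body containment only.

    emails -- dict mapping message ID to body text
    """
    norm = {eid: normalize(body) for eid, body in emails.items()}
    # Shortest first, so an original message is seen before replies quoting it.
    order = sorted(norm, key=lambda eid: len(norm[eid]))
    parent = {}
    for i, eid in enumerate(order):
        # Prefer the longest earlier message that is fully quoted in this one.
        for earlier in reversed(order[:i]):
            if norm[earlier] and norm[earlier] in norm[eid]:
                parent[eid] = earlier
                break

    def root(eid):
        while eid in parent:
            eid = parent[eid]
        return eid

    threads = {}
    for eid in emails:
        threads.setdefault(root(eid), []).append(eid)
    return list(threads.values())


emails = {
    "m1": "Can you send the Q3 forecast?",
    "m2": "Attached.\n> Can you send the Q3 forecast?",
    "m3": "Thanks, got it.\n> Attached.\n> > Can you send the Q3 forecast?",
    "m4": "Team lunch is moved to noon.",
}
threads = thread_by_content(emails)
print(threads)  # [['m1', 'm2', 'm3'], ['m4']]
```

Because only the bodies are compared, the three related messages stay together even if a sender rewrote the subject line along the way. Real messages with edited or trimmed quotes would need fuzzier matching than exact containment.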
    • 5. Near De-Duplication
      • Reviewers can quickly identify and compare documents that are very similar to one another but are not exact duplicates
      • Technology assesses document set’s similarities, identifying the most uniquely representative documents as “the core”
        » All related documents are then grouped around the core
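A common textbook way to measure "very similar but not identical" is to compare overlapping word sequences (shingles) with Jaccard similarity. The sketch below uses that technique with a greedy grouping in which the first document of each group serves as its core. It is an assumption-laden illustration, not the method of any specific ediscovery product; the threshold of 0.5 and the sample documents are arbitrary.

```python
def shingles(text, k=3):
    """Return the set of k-word sequences in the text."""
    words = text.lower().split()
    if len(words) < k:
        return {tuple(words)}
    return {tuple(words[i:i + k]) for i in range(len(words) - k + 1)}


def jaccard(a, b):
    """Size of the intersection divided by size of the union."""
    union = a | b
    return len(a & b) / len(union) if union else 1.0


def group_near_duplicates(docs, threshold=0.5):
    """Greedy grouping: a document joins the first core it resembles,
    otherwise it becomes the core of a new group."""
    groups = []  # (core_id, core_shingles, member_ids)
    for doc_id, text in docs.items():
        sh = shingles(text)
        for core_id, core_sh, members in groups:
            if jaccard(sh, core_sh) >= threshold:
                members.append(doc_id)
                break
        else:
            groups.append((doc_id, sh, [doc_id]))
    return {core_id: members for core_id, _, members in groups}


docs = {
    "A": "please review the attached draft contract before friday",
    "B": "please review the attached draft contract before monday",
    "C": "lunch menu for the cafeteria this week",
}
groups = group_near_duplicates(docs)
print(groups)  # {'A': ['A', 'B'], 'C': ['C']}
```

Documents A and B differ by one word and share five of seven distinct shingles, so B is grouped around core A, while C starts its own group. A reviewer could then compare B against A instead of reading both in full.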
    • 6. Sampling: Defensibility & Quality Control
      • Sampling is the practice of looking at a certain % of documents in a data set or particular folder of data
        » Strengthens the defensibility of the process
        » Helps validate what you have (and equally important, do not have) in your production set
        » May take place iteratively throughout the review process or prior to production
          – During ongoing quality control
          – At the end to assess completeness of review
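As a rough sketch of what "looking at a certain % of documents" can look like in practice, the code below draws a simple random sample and estimates the rate of relevant documents with a normal-approximation margin of error. The function names, the 5% fraction, the 95% z-value of 1.96, and the reviewer results are all illustrative assumptions; the slides do not prescribe a sampling method or confidence level.

```python
import math
import random


def sample_documents(doc_ids, fraction, seed=0):
    """Draw a simple random sample covering the given fraction of the set."""
    rng = random.Random(seed)  # fixed seed so the draw can be reproduced
    size = max(1, round(len(doc_ids) * fraction))
    return rng.sample(list(doc_ids), size)


def estimate_rate(flags, z=1.96):
    """Return (observed rate, margin of error) from 0/1 reviewer calls.

    Uses the normal approximation, which is reasonable for large samples
    where the rate is not extremely close to 0 or 1.
    """
    n = len(flags)
    p = sum(flags) / n
    margin = z * math.sqrt(p * (1 - p) / n)
    return p, margin


# Hypothetical: 10,000 documents were culled as non-responsive; sample 5%.
culled_ids = range(10_000)
sample = sample_documents(culled_ids, fraction=0.05)

# Hypothetical reviewer results: 25 of the 500 sampled documents were relevant.
flags = [1] * 25 + [0] * 475
rate, margin = estimate_rate(flags)
print(len(sample), round(rate, 3), round(margin, 3))
```

Recording the seed, the sample size, and the resulting estimate at each iteration gives a paper trail that supports the defensibility point made above.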