Open Source eDiscovery Presentation for "Women in eDiscovery" Houston, TX 12/15/2011
Open source e_discovery

Presentation for Women in eDiscovery, Houston, TX

Published in: Technology
  • Today I am going to talk about open source eDiscovery. What are my qualifications for doing so? I wrote FreeEed, the first and so far the only open source software for eDiscovery. But more than that, when you are dealing with open source, you become part of the ecosystem. You share and get back. Thus one can speak for all.
  • Past: how I came to do it. Present: FreeEed capabilities. Future: what FreeEed will do. Vision.
  • Transcript of "Open source e_discovery"

    1. Open Source eDiscovery. Presentation for "Women in eDiscovery", Houston, TX, 12/15/2011
    2. Open source eDiscovery
       • Pre-history
       • Present capabilities
       • Foreseeable future
       • Vision
    3. Qualifications
       • MS Math
       • MS Computer Science
       • Mensa, Languages (10)
       • Oil: patents, books, awards, software
       • Projects...
       • JD - eDiscovery
       • eDiscovery 1
       • eDiscovery 2
       • Free Discovery
    4. Following the People with Luck
       • Watch the people who made it
    5. My first project: writing eDiscovery for 1 computer
       • Ending with 30
    6. My second project: writing eDiscovery for an unlimited cluster
       • Ending with BigData
    7. Big Data! Enter Hadoop
    8. Hadoop = Big Data
    9. Big Data History
       • 2004 - Google reveals their big data technology
       • 2005 - It becomes open source with Hadoop
       • 2008 - eDiscovery on the cluster
       • 09-11 - Big Data explosion
    10. Writing a book
       • Hadoop in Practice for Manning
    11. Getting invited
       • YouTube
       • Microsoft Bing
       • Facebook
       • Google
       • Yahoo
    12. So what is FreeEed
       • Applied knowledge gained from eDiscovery applications and competitor analysis
       • Big Data
       • Open source
    13. Built for Big Data
       • Write the code once, make it work either on 1 or on 1000s of computers
          • One machine
          • Many private computers (cluster)
          • Many rented Amazon computers
    14. What is a cluster
       • Many computers organized together
    15. What is a Hadoop cluster?
       • A group of computers ready to work together
       • Hadoop allows them to share the workload
       • Fault-tolerant
    16. What is open source?
       • Many programmers working together
    17. Open source for eDiscovery
       • Low cost for the user
       • Ideal for in-house implementation
       • Better code quality
       • Open collaboration
       • Fast development using existing open source tools and applications
    18. FreeEed present capabilities
       • Text extraction
       • Culling
       • Flexible search syntax
       • Scalability
       • PDF Imaging
       • Runs on Windows, Mac, Linux, Hadoop cluster
    19. FreeEed processing stages
       • Staging, maintaining the integrity of the data
       • Processing - text/native/exceptions/pdf
       • Review - Concordance/Future review platform
    20. FreeEed screens
       • Project, Settings, History
    21. FreeEed immediate future - 3 months
       • Amazon cloud processing
       • Multiple enhancements (imaging, deduping, OCR, etc.)
    22. Next organizational steps
       • Support
       • Development
       • In-house EDD
    23. Exciting future steps
       • Enhanced capabilities based on cloud power
       • iPad/Chrome tablet eDiscovery
       • Big Data technology for review
       • Text Understanding: predictive coding, automated privilege review, clustering, email chains
       • Advanced FreeEed technology will be a powerful weapon in future legal battles
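
The idea on slide 13, "write the code once, make it work either on 1 or on 1000s of computers", can be illustrated with a small map/reduce-style sketch. This is not FreeEed's code and does not use Hadoop; the keyword set, the function names, and the sample documents are all hypothetical. It only shows the shape of the approach: the per-document step and the combining step are written independently of where they run, so a single machine can loop over them, and a cluster framework could in principle distribute the same two steps.

```python
from collections import defaultdict

# Hypothetical search terms used to cull documents.
KEYWORDS = {"contract", "invoice"}


def map_document(doc_id, text):
    """Map step: for one document, emit (keyword, doc_id) for each keyword found."""
    words = set(text.lower().split())
    return [(kw, doc_id) for kw in sorted(KEYWORDS & words)]


def reduce_hits(pairs):
    """Reduce step: group document ids by the keyword they matched."""
    hits = defaultdict(list)
    for kw, doc_id in pairs:
        hits[kw].append(doc_id)
    return {kw: sorted(ids) for kw, ids in hits.items()}


def run_on_one_machine(documents):
    """Single-machine driver: apply the map step to every document, then reduce."""
    pairs = []
    for doc_id, text in documents.items():
        pairs.extend(map_document(doc_id, text))
    return reduce_hits(pairs)


# Hypothetical sample data.
documents = {
    "a.txt": "Signed contract attached",
    "b.txt": "Lunch on Friday?",
    "c.txt": "Invoice for the contract",
}
result = run_on_one_machine(documents)
print(result)
```

Only `run_on_one_machine` knows that everything happens in one process. Because `map_document` looks at a single document and `reduce_hits` only combines emitted pairs, the same two functions could be run by a different driver across many computers.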