Open Source eDiscovery Presentation for "Women in eDiscovery" Houston, TX 12/15/2011
Open source e_discovery

Presentation for Women in eDiscovery, Houston, TX

Published in: Technology

  1. Open Source eDiscovery: Presentation for "Women in eDiscovery", Houston, TX, 12/15/2011
  2. Open source eDiscovery
       • Pre-history
       • Present capabilities
       • Foreseeable future
       • Vision
  3. Qualifications
       • MS Math
       • MS Computer Science
       • Mensa, Languages (10)
       • Oil: patents, books, awards, software
       • Projects...
       • JD - eDiscovery
       • eDiscovery 1
       • eDiscovery 2
       • Free Discovery
  4. Following the People with Luck
       • Watch the people who made it
  5. My first project: writing eDiscovery for 1 computer
       • Ending with 30
  6. My second project: writing eDiscovery for an unlimited cluster
       • Ending with BigData
  7. Big Data! Enter Hadoop
  8. Hadoop = Big Data
  9. Big Data History
       • 2004 - Google reveals their big data technology
       • 2005 - It becomes open source with Hadoop
       • 2008 - eDiscovery on the cluster
       • 2009-2011 - Big Data explosion
  10. Writing a book
       • Hadoop in Practice for Manning
  11. Getting invited
       • YouTube
       • Microsoft Bing
       • Facebook
       • Google
       • Yahoo
  12. So what is FreeEed
       • Applied knowledge gained from eDiscovery applications and competitor analysis
       • Big Data
       • Open source
  13. Built for Big Data
       • Write the code once, make it work either on 1 or on 1000s of computers
           • One machine
           • Many private computers (cluster)
           • Many rented Amazon computers
  14. What is a cluster
       • Many computers organized together
  15. What is a Hadoop cluster?
       • A group of computers ready to work together
       • Hadoop allows them to share the workload
       • Fault-tolerant
  16. What is open source?
       • Many programmers working together
  17. Open source for eDiscovery
       • Low cost for the user
       • Ideal for in-house implementation
       • Better code quality
       • Open collaboration
       • Fast development using existing open source tools and applications
  18. FreeEed present capabilities
       • Text extraction
       • Culling
       • Flexible search syntax
       • Scalability
       • PDF Imaging
       • Runs on Windows, Mac, Linux, Hadoop cluster
  19. FreeEed processing stages
       • Staging, maintaining the integrity of the data
       • Processing - text/native/exceptions/pdf
       • Review - Concordance/Future review platform
  20. FreeEed screens
       • Project, Settings, History
  21. FreeEed immediate future - 3 months
       • Amazon cloud processing
       • Multiple enhancements (imaging, deduping, OCR, etc.)
  22. Next organizational steps
       • Support
       • Development
       • In-house EDD
  23. Exciting future steps
       • Enhanced capabilities based on cloud power
       • iPad/Chrome tablet eDiscovery
       • Big Data technology for review
       • Text Understanding: predictive coding, automated privilege review, clustering, email chains
       • Advanced FreeEed technology will be a powerful weapon in future legal battles
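
As a rough illustration of the "write the code once" idea from the Built for Big Data slide, and of "share the workload" from the Hadoop cluster slide, the sketch below runs the same map and reduce steps either sequentially or across a pool of workers. It is not FreeEed or Hadoop code. The sample documents, the keyword list, and the function names (`map_doc`, `reduce_hits`, `cull`, `cull_parallel`) are all hypothetical, and a local thread pool merely stands in for a cluster.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor

# Hypothetical sample documents; real input would be extracted text.
DOCS = {
    "memo1.txt": "The merger contract was signed in Houston.",
    "mail2.txt": "Lunch on Friday?",
    "memo3.txt": "Please review the contract before the merger closes.",
}
# Hypothetical culling keywords.
KEYWORDS = {"contract", "merger"}


def map_doc(item):
    """Map step: emit (keyword, doc_name) pairs for one document."""
    name, text = item
    words = {w.strip(".,?!").lower() for w in text.split()}
    return [(kw, name) for kw in sorted(KEYWORDS & words)]


def reduce_hits(mapped):
    """Reduce step: count matching documents per keyword."""
    counts = Counter()
    for pairs in mapped:
        for kw, _name in pairs:
            counts[kw] += 1
    return dict(counts)


def cull(docs, mapper=map):
    """Run the same map and reduce code with any map-like function."""
    return reduce_hits(mapper(map_doc, docs.items()))


def cull_parallel(docs, workers=4):
    """Same logic, with the map step spread over a pool of workers."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return cull(docs, mapper=pool.map)


if __name__ == "__main__":
    print(cull(DOCS))           # one worker
    print(cull_parallel(DOCS))  # several workers, same code
```

The only thing that changes between the two runs is the `mapper` argument. A real cluster framework would also handle data distribution and fault tolerance, which this sketch does not attempt.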
