FreeEed presentation

Published in: Technology

  1. Hadoop-based Open Source eDiscovery: FreeEed (Easy as popcorn)
  2. Business (legal) use case
     • Duty to disclose information: FRCP Rule 26
     • Preserve relevant information
     • Produce information on request
     • Keep the information for X years
     • Sanctions for obstruction
     • Sanctions for non-compliance
  3. Before the thirties
     • The courtroom was full of surprises
  4. Civil discovery changes this
  5. Discovery basics
     • Obligations of the parties
     • At the start of a lawsuit, or when litigation becomes possible, preserve relevant data
     • Produce data on request, within deadlines
     • Review the data before production
     • Can request eDiscovery from opponents
     • Store and archive
  6. Interesting facts about eDiscovery
     • Most of these figures are proprietary or under NDA
     • Representative case size: 5 GB to 500 GB
     • Processing cost: $5 to $200 per GB, typically about $100
     • Takes 25-50% of the litigation budget
     • Days to process and months to review
     • Preservation: 3-7 years
     • 500 providers, including 10 major ones
  7. Challenges of eDiscovery
     • Data sizes in the TB range
     • Seasonal loads, tight deadlines
     • Hundreds of file formats
     • Heavy read/write load in review
     • Text analytics is of paramount importance
     • Huge price tags obstruct justice
  8. FreeEed main features
     • Open source Hadoop-based eDiscovery:
         • As scalable as Hadoop
         • Fast review with NoSQL
         • Scales with the lawsuit in time and volume
         • Data preservation and archiving with VMs
         • Only possible with an open source license
  9. Design goals
     • Built on open source components
     • Big Data scalable
     • Preservation, chain of custody, archiving
     • Scalable both technically and as a business
     • Stable (don’t laugh, people get different results on different runs)
     • Closed-source compatible (Microsoft and Azure too)
  10. Packaging architecture
      • Comes as VMs
      • Grab as few or as many as you want
      • No mixing of matters
      • No ethical problems
      • Preserve for as many years as you want
      • 1 VM = 1 corn, FreeEed = free popcorn
  11. FreeEed makes lawyers happy
  12. FreeEed: Architecture
  13. FreeEed popcorn is very popular with lawyers, legal techs, IT, etc.
  14. FreeEed popcorn
      • Deploy on laptops, servers or cloud
      • One node or any number of nodes
      • Scalable storage
      • Different cooking recipes
      • No mixing of matters
      • Easy archiving
      • Easy deletion
  15. Processing architecture
      • Based on a golden-image VM
      • Controlled cluster start in any environment
      • Index / cull on the fly or later
      • Immediately searchable
  16. Cluster start-up on EC2
  17. Cloud integration
      • Downloadable VMs
      • Same VMs on Amazon AWS
      • Amazon VMs are very convenient
      • Immediate deployment
      • Any hardware configuration you need
      • Control lots of power from a limited-power laptop
      • Azure: working with Microsoft
  18. Review architecture
      • Lucene
      • Solr
      • HBase
      • Lucene indexes created in reducers and combined in Solr
      • For small matters, write directly to Solr
  19. Review screen
  20. Review capabilities
      • Search
      • Cull down
      • View text and metadata
      • Tag documents
      • Export as images or as native files
  21. Bird’s-eye view – EDRM
  22. Left of EDRM – Legal Hold
      • FreeEedCollect
      • Architecture: https://github.com/markkerzner/FreeEedCollect
      • ZooKeeper/MapReduce/Flume/HDFS
  23. Right of EDRM – Org. charts
      • Partnership with Sintelix
  24. Analytics – network of actors
      • Partnership with Sintelix
  25. FreeEed and data governance
      • Virtualization for data preservation
      • Scalable processing
      • Archiving
      • Document groups are not mixed
      • Data format stored together with software that understands it
  26. Hadoop & Big Data applications
      • Other related applications
          • Financial: text analytics
          • Energy: analytics of documents and procedures
      • Actual ongoing projects
  27. FreeEed as a learning tool
      • Hundreds of downloads
      • Dozens of active users
      • Real-world Hadoop application
      • Many developers download it to learn
      • Complex, real, but manageable
  28. FreeEed adoption – who is trying our “popcorn”?
      • Large law firms
      • Small law firms and solos
      • Government agencies
      • Universities
      • Enterprises
      • Developers learning Big Data
  29. Looking forward
      • Add:
          • Collection
          • Analytics
          • Community
          • Integrations
          • Implementations
  30. How you can use FreeEed
      • For its intended purpose
          • Large law firms
          • Small firms and solos
          • Pro se litigants
      • Integrate into legal IT
      • Start a similar document management project
  32. Q&A
      • Thank you!
      • People usually ask:
          • How can I put my data in the cloud?
          • Is it safe?
          • Do you do OCR, PST, OST, etc.?
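
To put the figures on slide 6 in perspective, the arithmetic can be worked through in a few lines. This is only a back-of-the-envelope sketch: the numbers are the ones quoted on the slide, while the function and variable names are illustrative and not part of FreeEed.

```python
# Figures quoted on slide 6; all names here are illustrative assumptions.
CASE_SIZE_GB = (5, 500)        # representative case size range, in GB
COST_PER_GB_USD = (5, 200)     # quoted processing cost range, USD per GB
TYPICAL_COST_PER_GB_USD = 100  # quoted typical cost, USD per GB


def processing_cost(size_gb: float, cost_per_gb: float) -> float:
    """Return the processing cost in USD for a matter of the given size."""
    return size_gb * cost_per_gb


# Cheapest and most expensive combinations of the quoted ranges.
low = processing_cost(CASE_SIZE_GB[0], COST_PER_GB_USD[0])
high = processing_cost(CASE_SIZE_GB[1], COST_PER_GB_USD[1])
# A large (500 GB) matter at the typical rate.
typical_large = processing_cost(CASE_SIZE_GB[1], TYPICAL_COST_PER_GB_USD)

print(f"Smallest case at lowest rate: ${low:,.0f}")
print(f"Largest case at highest rate: ${high:,.0f}")
print(f"Largest case at typical rate: ${typical_large:,.0f}")
```

Under these quoted ranges, processing a single large matter can cost tens of thousands of dollars, which is the price barrier slide 7 refers to.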
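
Slide 18 says that indexes are created in reducers and then combined. The idea can be illustrated with a small in-memory simulation. This is a sketch of the general pattern only: it uses plain Python dictionaries rather than Lucene, Solr, or Hadoop, and every function name in it is hypothetical.

```python
from collections import defaultdict


def tokenize(text):
    """Very naive tokenizer: lowercase and split on whitespace."""
    return text.lower().split()


def partition(docs, n):
    """Split documents into n groups, one per simulated reducer."""
    parts = [dict() for _ in range(n)]
    for i, (doc_id, text) in enumerate(sorted(docs.items())):
        parts[i % n][doc_id] = text
    return parts


def build_shard(docs):
    """Build a partial inverted index (term -> set of doc ids) for one group."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for term in tokenize(text):
            index[term].add(doc_id)
    return index


def merge_shards(shards):
    """Combine the partial indexes into a single searchable index."""
    merged = defaultdict(set)
    for shard in shards:
        for term, doc_ids in shard.items():
            merged[term] |= doc_ids
    return merged


documents = {
    "doc1": "Contract draft attached",
    "doc2": "Lunch on Friday",
    "doc3": "Signed contract returned",
}

# Each simulated reducer indexes its own slice; the shards are merged afterwards.
shards = [build_shard(part) for part in partition(documents, 2)]
index = merge_shards(shards)

print(sorted(index["contract"]))
```

Because each shard is built independently, the indexing work can be spread across as many workers as there are partitions, and only the merge step needs to see all shards.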
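
The review steps listed on slide 20 (search, cull down, tag) can be sketched as operations on a small in-memory document set. This is an illustration of the workflow under simplifying assumptions, not FreeEed's review interface; the `Document` class, its fields, and the helper functions are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Document:
    """Hypothetical review record: extracted text, metadata, reviewer tags."""
    doc_id: str
    text: str
    metadata: dict
    tags: set = field(default_factory=set)


def search(docs, keyword):
    """Return documents whose text contains the keyword (case-insensitive)."""
    needle = keyword.lower()
    return [d for d in docs if needle in d.text.lower()]


def cull(docs, custodian):
    """Cull down to documents held by one custodian, based on metadata."""
    return [d for d in docs if d.metadata.get("custodian") == custodian]


def tag(docs, label):
    """Attach a review tag to every document in the list."""
    for d in docs:
        d.tags.add(label)


corpus = [
    Document("d1", "Merger terms attached", {"custodian": "alice"}),
    Document("d2", "Merger rumors", {"custodian": "bob"}),
    Document("d3", "Team lunch", {"custodian": "alice"}),
]

hits = search(corpus, "merger")   # keyword search
reviewed = cull(hits, "alice")    # narrow by custodian
tag(reviewed, "responsive")       # mark for production

for d in corpus:
    print(d.doc_id, sorted(d.tags))
```

Only `d1` matches both the keyword and the custodian filter, so it is the only document that ends up tagged.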
