Rabj freebase all

  1. The Anatomy of a Large-Scale Human-Computation Engine. Shailesh Kochhar, Stefano Mazzocchi, Praveen Paritosh. Freebase August Meetup
  2. 1: Freebase & Human Computation; 2: Example – Stanford Library; 3: RABJ; 4: Consensus. Aug 18, 2010 Freebase Meetup
  3. Freebase: structured database; 12 MM entities, 300 MM triples/facts
  4. Where does the data come from?
  5. Community contributions; Mass Data Loads
  6. Human Judgments Improve Both
  7. Community: Simplify contribution through games
  8. http://typewriter.freebaseapps.com/
  9. Community: Simplify contribution through games; Enable QA for Gridworks loads
  10. (no text on slide)
  11. Mass Data Loads. Precision: QA for >99% accuracy
  12. Book Edition QA
  13. Mass Data Loads. Precision: QA for >99% accuracy; Coverage: Manual reconciliation
  14. matchmaker: http://matchmaker2.freebaseapps.com/
  15. 1: Freebase & Human Computation; 2: Example – Stanford Library; 3: RABJ; 4: Consensus
  16. Reconcile Stanford Library Catalog with freebase.com
  17. Stanford Library Catalog: 4.4 MM book editions; 1.3 MM English book editions; 1.2 MM English books; 600K authors
  18. For Freebase, identity is key: match books, match authors
  19. Automatic matching insufficient; trained judges needed to decide hard cases
  20. How to get this?
  21. RABJ: Redundant Array of Brains in a Jar
  22. What? Abstraction; powers human judgment (HJ) applications; 3.1 MM judgments in 16 months
  23. Provides primitive elements for more sophisticated applications
  24. Questions; Judgments; Queues; Agents
  25. Design Constraints
  26. Content-agnostic; Dynamic data; Low latency
  27. Architecture
  28. Questions contain metadata, pointers to dynamic content; Questions added to queues; Metadata allows slicing and dicing
  29. Acre applications pull questions from RABJ; RABJ matches judge to available tasks; Acre renders question, sends judgment back
  30. Declarative consensus: Yes: 3, No: 3, Skip: 4, Invalid: 3, Max: 6; RABJ notifies agents when consensus is reached
  31. Scale
  32. 2.3 MM questions; 3.1 MM judgments; 500+ queues; 20+ applications
  33. 1: Freebase & Human Computation; 2: Example – Stanford Library; 3: RABJ; 4: Consensus
  34. Always have leftovers
  35. Perfect Consensus? Not!
  36. Evaluating QAers
  37. Explore: http://rabj.freebaseapps.com/explorer Create: http://wiki.freebase.com/wiki/RABJ_Tutorial Reference: http://wiki.freebase.com/wiki/RABJ_API/
  38. Questions?
