Rabj freebase all

Transcript

  • 1. The Anatomy of a Large-Scale Human-Computation Engine. Shailesh Kochhar, Stefano Mazzocchi, Praveen Paritosh. Freebase August Meetup
  • 2. 1: Freebase & Human Computation; 2: Example – Stanford Library; 3: RABJ; 4: Consensus. Aug 18, 2010 Freebase Meetup
  • 3. Freebase: Structured database. 12 MM entities, 300 MM triples/facts
  • 4. Where does the data come from?
  • 5. Community contributions; Mass Data Loads
  • 6. Human Judgments Improve Both
  • 7. Community: Simplify contribution through games
  • 8. http://typewriter.freebaseapps.com/
  • 9. Community: Simplify contribution through games; Enable QA for Gridworks loads
  • 10.
  • 11. Mass Data Loads. Precision: QA for >99% accuracy
  • 12. Book Edition QA
  • 13. Mass Data Loads. Precision: QA for >99% accuracy. Coverage: Manual reconciliation
  • 14. matchmaker: http://matchmaker2.freebaseapps.com/
  • 15. 1: Freebase & Human Computation; 2: Example – Stanford Library; 3: RABJ; 4: Consensus
  • 16. Reconcile Stanford Library Catalog with freebase.com
  • 17. Stanford Library Catalog: 4.4MM book editions; 1.3MM English book editions; 1.2MM English books; 600K authors
  • 18. For Freebase, identity is key: match books, match authors
  • 19. Automatic matching insufficient. Trained judges needed to decide hard cases
  • 20. How to get this?
  • 21. RABJ: Redundant Array of Brains in a Jar
  • 22. What? Abstraction. Powers human judgment (HJ) applications. 3.1MM judgments in 16 months
  • 23. Provides primitive elements for more sophisticated applications
  • 24. Questions, Judgments, Queues, Agents
  • 25. Design Constraints
  • 26. Content-agnostic; Dynamic data; Low latency
  • 27. Architecture
  • 28. Questions contain metadata, pointers to dynamic content. Questions added to queues. Metadata allows slicing and dicing
  • 29. Acre applications pull questions from RABJ. RABJ matches judge to available tasks. Acre renders question, sends judgment back
  • 30. Declarative consensus: Yes: 3, No: 3, Skip: 4, Invalid: 3, Max: 6. RABJ notifies agents when consensus is reached
  • 31. Scale
  • 32. 2.3 MM questions; 3.1 MM judgments; 500+ queues; 20+ applications
  • 33. 1: Freebase & Human Computation; 2: Example – Stanford Library; 3: RABJ; 4: Consensus
  • 34. Always have leftovers
  • 35. Perfect Consensus? Not!
  • 36. Evaluating QAers
  • 37. Explore: http://rabj.freebaseapps.com/explorer Create: http://wiki.freebase.com/wiki/RABJ_Tutorial Reference: http://wiki.freebase.com/wiki/RABJ_API/
  • 38. Questions?
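
The declarative consensus rule on slide 30 (Yes: 3, No: 3, Skip: 4, Invalid: 3, Max: 6) can be illustrated with a small sketch. The slides do not say how RABJ evaluates the rule, so the interpretation below is an assumption: a question reaches consensus as soon as any one answer collects its threshold number of judgments, and a question that collects Max judgments without reaching any threshold becomes a "leftover" (slide 34). The names `CONSENSUS_RULE`, `MAX_JUDGMENTS`, and `evaluate` are hypothetical and are not part of the RABJ API.

```python
from collections import Counter

# Thresholds copied from slide 30. How RABJ actually interprets them is an
# assumption of this sketch, not something the slides state.
CONSENSUS_RULE = {"yes": 3, "no": 3, "skip": 4, "invalid": 3}
MAX_JUDGMENTS = 6


def evaluate(judgments, rule=CONSENSUS_RULE, max_judgments=MAX_JUDGMENTS):
    """Replay judgments in arrival order and report the question's state.

    Returns a (status, answer) pair where status is one of:
      "consensus": some answer reached its threshold (answer is set)
      "leftover":  max_judgments were collected without consensus
      "open":      more judgments are still needed
    """
    counts = Counter()
    for n, answer in enumerate(judgments, start=1):
        if answer not in rule:
            raise ValueError(f"unknown answer: {answer!r}")
        counts[answer] += 1
        # Assumed rule: the first answer to hit its threshold wins.
        if counts[answer] >= rule[answer]:
            return ("consensus", answer)
        # Assumed rule: stop asking once the cap is reached.
        if n >= max_judgments:
            return ("leftover", None)
    return ("open", None)


if __name__ == "__main__":
    print(evaluate(["yes", "no", "yes", "yes"]))
    print(evaluate(["yes", "no", "skip", "yes", "no", "skip"]))
    print(evaluate(["yes", "no"]))
```

Under this reading, six judgments split as two Yes, two No, and two Skip never reach a threshold, which is one way the leftovers mentioned on slide 34 could arise.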