Wikipedia : Workshop


Published in: Education, Technology
  1. 2. <ul><li>We know Wikipedia exists in English… How many languages do you think it exists in? </li></ul><ul><li>Which is the largest Wikipedia? How large is it? </li></ul><ul><li>What is the size of the Hindi Wikipedia? Telugu? Tamil? </li></ul><ul><li>Who uses Wikipedia content? </li></ul><ul><li>Who creates content in Wikipedia? </li></ul><ul><li>Who corrects the content? </li></ul><ul><li>Who owns the content? </li></ul><ul><li>Do you know how to contribute to Wikipedia? </li></ul>
  2. 4. <ul><li>[15 Min] What is WikiBhasha? </li></ul><ul><li>[15 Min] Demos… </li></ul><ul><li>[40 Min] Hands-on session to create some data </li></ul><ul><li>[20 Min] Feedback </li></ul><ul><ul><li>Feedback questionnaire </li></ul></ul><ul><ul><li>Q/A </li></ul></ul><ul><ul><li>Interact with us on your ideas… </li></ul></ul>
  3. 5. <ul><li>Started as a research project at Microsoft Research India (WikiBABEL) </li></ul><ul><li>Released as an open-source tool (WikiBhasha) </li></ul>WikiBABEL explores community creation of data for computational linguistics research (our first focus: parallel data)
  4. 6. <ul><li>Rough content using Machine Translation </li></ul><ul><li>Community to correct rough content for their purposes… </li></ul>Diagram labels: Machine Translation System; Collaborative Translation Cache; Linguistic Resources; WikiBABEL on Wikipedia Article to Target Wikipedia
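The loop on this slide (machine translation produces a rough draft, the community corrects it, and corrections are kept for reuse) might be sketched roughly as follows. This is only an illustration: `TranslationCache`, its methods, and the toy translator are hypothetical names invented here, not part of WikiBABEL or WikiBhasha.

```javascript
// Hypothetical sketch of a collaborative translation cache.
// None of these names come from WikiBABEL/WikiBhasha; they are assumptions.
class TranslationCache {
  constructor(machineTranslate) {
    // Stand-in for a machine translation system: (sentence) => rough translation.
    this.machineTranslate = machineTranslate;
    // Source sentence -> community-corrected translation.
    this.corrections = new Map();
  }

  // Prefer a community correction; fall back to rough machine output.
  translate(sentence) {
    if (this.corrections.has(sentence)) {
      return { text: this.corrections.get(sentence), origin: "community" };
    }
    return { text: this.machineTranslate(sentence), origin: "machine" };
  }

  // Record a correction contributed by a user.
  correct(sentence, corrected) {
    this.corrections.set(sentence, corrected);
  }

  // Corrected pairs double as parallel data (source/target sentence pairs).
  parallelData() {
    return Array.from(this.corrections, ([source, target]) => ({ source, target }));
  }
}

// Usage with a toy translator that only tags its input.
const toyTranslate = (sentence) => "[rough] " + sentence;
const cache = new TranslationCache(toyTranslate);
cache.translate("Hello world");                   // served by the toy translator
cache.correct("Hello world", "नमस्ते दुनिया");       // a user fixes the rough output
cache.translate("Hello world");                   // now served from the cache
```

Keeping corrections keyed by source sentence is one simple design choice; it lets the same store serve both later translations and the collection of sentence pairs.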
  5. 7. <ul><li>Wikipedia provides a compelling user scenario </li></ul><ul><ul><ul><li>Highly skewed content in different languages </li></ul></ul></ul><ul><ul><ul><li>Passionate Wikipedia communities </li></ul></ul></ul>
  6. 8. <ul><li>WikiBhasha announced by MSR and WMF ( 18 Oct, ’10 ) </li></ul><ul><ul><li>Open sourced as MediaWiki Extension </li></ul></ul><ul><ul><li>Released publicly in </li></ul></ul>
  7. 9. <ul><li>Designed WikiBhasha as a thin edit layer </li></ul><ul><ul><li>Stays on Wikipedia </li></ul></ul><ul><ul><li>User contribution submitted to Wikipedia </li></ul></ul>Diagram labels: WikiBABEL UX; WikiBhasha 2.0; User Community; CTF; Dictionary; Cloud Services APIs; Wikipedia
  8. 10. <ul><li>WikiBhasha is designed to be modular & extensible </li></ul><ul><ul><li>Open-sourced, so the community can contribute to and enhance it </li></ul></ul>Architecture diagram labels: Cloud Services Layer; WikiBhasha CORE Components; Source/Target Wiki System Interface; GUI Components (Wikipedia-specific UI and Workflow); WikiBABEL [Edit]; WikiBABEL-CORE; Source/Target Wiki System Interface (Wiki APIs for Content Pull/Push, Content & User Management, …); User Management (Authentication, User Credentials Management, User Preferences/Skills, Contributions Tracking, …); Linguistic Resources (Mono-/Bi-lingual Dictionaries, Thesauri, …); Lang. Technology Components (Machine Translation, Transliteration, Summarization, …); Content Management (Content Discovery, Versions, Tagging, Notification Lists, …); User-Experience (Linguistically Aware, Wiki-site Aware Workflow Engine); Contextual Help (Domain-specific, Context-specific, User-Contribution Aware Help, …); Communication (Message Boards, Email/Alert Mechanisms, Wikis, …); User-Interface (Generic UI Components, Scratch Pad, …); WikiBhasha UI/UX/Integration Components Layer; 3rd Party Linguistic Services; MediaWiki Software; MediaWiki Extensions; MediaWiki Layer; Wikipedia; CTF
  9. 11. <ul><li>WikiBhasha is available as a bookmarklet/Wikipedia user-script </li></ul><ul><ul><li>Please contribute to your Wikipedia! </li></ul></ul><ul><li>WikiBhasha source code is available as a MediaWiki Extension </li></ul><ul><ul><li>Please enhance it! </li></ul></ul>
  10. 12. <ul><li>Working with Wikipedia communities around the world </li></ul><ul><li>Workshops planned in several regions </li></ul><ul><ul><li>India in Nov 2010 – Mar 2011 </li></ul></ul><ul><ul><ul><li>Allahabad, Banaras, Delhi, Chennai, Trichy, Hyderabad… </li></ul></ul></ul><ul><ul><li>Egypt in Dec 2010 – Jan 2011 </li></ul></ul><ul><ul><li>Brazil and Mexico in Mar – Apr 2011 … </li></ul></ul><ul><li>Goal: To understand adoption! </li></ul>
  11. 13. <ul><li>WikiBhasha announced by MSR and WMF ( 18 Oct, ’10 ) </li></ul><ul><ul><li>Open sourced as MediaWiki Extension </li></ul></ul><ul><ul><li>Released publicly in </li></ul></ul>
  12. 14. <ul><li>Covered in 20+ languages across the world </li></ul><ul><ul><li>Widely covered on social media (Facebook & Twitter) </li></ul></ul>
  13. 15. <ul><li>500K+ Hits & ~100K+ Visitors </li></ul><ul><li>From more than 50 countries </li></ul><ul><li>Parallel data: ~4000 Sentences & ~100K Words </li></ul>Chart: Total Hits
  14. 16. <ul><li>Articles on WikiBhasha created in 7 Wikipedias </li></ul><ul><li>WikiBhasha files localized in 2 languages </li></ul><ul><li>A MediaWiki Bugzilla tracks bugs and feature enhancements </li></ul><ul><ul><li>So far about 50 issues discussed & 22 bugs filed </li></ul></ul><ul><ul><li>12 bugs fixed </li></ul></ul>
  15. 17. <ul><li>Much appreciation </li></ul><ul><ul><ul><li>“Hats off to Microsoft for putting aside commercial concerns for once and doing something positive for society.” Taskado/Wikimedia blog </li></ul></ul></ul><ul><ul><ul><li>“I applaud Microsoft for making the client open source.” </li></ul></ul></ul><ul><ul><ul><li>“Excellent tool indeed! Very intuitive and efficient.” Raoul/RMC Blog </li></ul></ul></ul><ul><li>And some skepticism on… </li></ul><ul><ul><ul><li>What are you planning to do with the data? </li></ul></ul></ul><ul><ul><ul><li>Why are you sharing it? </li></ul></ul></ul><ul><li>And the inevitable comparison with Google Translator Toolkit (GTTk) </li></ul><ul><ul><ul><li>“Hmm interesting. … I don’t think Google agreed to release any of their code. Will they now?” Nil Einne/Wikimedia Blog </li></ul></ul></ul><ul><ul><ul><li>“How does this compare with the Google’s tool?” WikiBhasha Forum </li></ul></ul></ul>
  16. 18. <ul><li>The Wikipedia contribution is still minimal </li></ul><ul><ul><li>100+ sessions on Wikipedia through WikiBhasha </li></ul></ul><ul><ul><li>4000+ sentence pairs through CTF mechanism </li></ul></ul><ul><ul><li>100,000+ words of parallel corpora collected </li></ul></ul><ul><li>Wikimedia Foundation: “Engage the communities” </li></ul><ul><li>Research and Data goals </li></ul>
  17. 21. <ul><li>Tutorial on Contribution… </li></ul><ul><li>Do you have a Wikipedia login? </li></ul><ul><ul><ul><li>If so, please log in </li></ul></ul></ul><ul><ul><ul><li>Or, be “Anonymous” </li></ul></ul></ul><ul><li>If you want to create a Wikipedia login </li></ul><ul><ul><ul><li>Visit ANY Wikipedia article & click “Login” or “Create Login” </li></ul></ul></ul><ul><ul><ul><li>Create Login </li></ul></ul></ul><ul><ul><ul><li>Login </li></ul></ul></ul>
  18. 23. <ul><li>As a bookmarklet </li></ul><ul><ul><li>Visit site </li></ul></ul><ul><ul><li>Install from the “Installation” tab </li></ul></ul><ul><li>As a Wikipedia user-script (with Wikipedia login) </li></ul><ul><ul><li>From the “WikiBhasha.MSR” user </li></ul></ul><ul><ul><ul><li>Add the following to your default skin .js file: importScript("User:WikiBhasha.MSR/WikiBhasha.js"); </li></ul></ul></ul>
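For the user-script route, a slightly more defensive version of that one-line addition could look like the sketch below. The script path is the one given on the slide and `importScript` is the loader function Wikipedia exposes to user scripts; the wrapper `loadWikiBhasha` and the constant name are hypothetical, added only so the snippet does nothing when it runs outside a Wikipedia page.

```javascript
// Sketch of what might go into a personal skin .js page.
// The path is quoted from the slide; the wrapper is a hypothetical addition.
const WIKIBHASHA_SCRIPT = "User:WikiBhasha.MSR/WikiBhasha.js";

// Calls importScript only if the given environment provides it,
// and reports whether the script was requested.
function loadWikiBhasha(env) {
  if (typeof env.importScript === "function") {
    env.importScript(WIKIBHASHA_SCRIPT);
    return true;
  }
  return false;
}

// On a Wikipedia page the global object carries importScript.
loadWikiBhasha(globalThis);
```

The plain `importScript(...)` line from the slide is sufficient on Wikipedia itself; the guard only matters if the same file is ever evaluated elsewhere.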
  19. 24. <ul><li>Choose any English/Hindi Wikipedia article </li></ul><ul><li>Go to the English Wikipedia article </li></ul><ul><ul><ul><li>Invoke “WikiBhasha Beta” </li></ul></ul></ul><ul><ul><ul><li>Choose the language </li></ul></ul></ul><ul><ul><ul><li>Invoke “WikiBhasha” once more in the Hindi article </li></ul></ul></ul><ul><li>Or, go to the Hindi Wikipedia article </li></ul><ul><ul><ul><li>Invoke “WikiBhasha Beta” </li></ul></ul></ul>
  20. 26. <ul><li>3-Step Process </li></ul><ul><ul><li>Collect: For exploring English articles & translating </li></ul></ul><ul><ul><ul><ul><li>Use this step ONLY for translation corrections </li></ul></ul></ul></ul><ul><ul><li>Compose: For composing the target article </li></ul></ul><ul><ul><ul><ul><li>Use this step for moving corrected content AND writing new content </li></ul></ul></ul></ul><ul><ul><li>Submit: You are done! </li></ul></ul><ul><ul><ul><ul><li>Submit to Wikipedia! </li></ul></ul></ul></ul>
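The Collect, Compose, Submit flow above can be modelled as a small session object that only allows each action in its own step. This is a minimal sketch for illustration: every function and field name here is hypothetical and is not taken from the WikiBhasha source.

```javascript
// Hypothetical model of the three steps on the slide; not WikiBhasha's code.
function createSession() {
  return { step: "collect", corrections: [], article: [] };
}

// Throws if an action is attempted in the wrong step.
function requireStep(session, expected) {
  if (session.step !== expected) {
    throw new Error(`expected step "${expected}", but session is in "${session.step}"`);
  }
}

// Collect: only translation corrections are recorded here.
function correctTranslation(session, source, corrected) {
  requireStep(session, "collect");
  session.corrections.push({ source, corrected });
}

// Move on from Collect to Compose.
function startCompose(session) {
  requireStep(session, "collect");
  session.step = "compose";
}

// Compose: move a corrected sentence into the target article...
function moveCorrected(session, index) {
  requireStep(session, "compose");
  session.article.push(session.corrections[index].corrected);
}

// ...or write brand-new content directly in the target language.
function writeNew(session, text) {
  requireStep(session, "compose");
  session.article.push(text);
}

// Submit: finish the session and return the composed article text.
function submit(session) {
  requireStep(session, "compose");
  session.step = "submit";
  return session.article.join("\n");
}
```

A session would typically call `correctTranslation` a few times, then `startCompose`, then any mix of `moveCorrected` and `writeNew`, and finally `submit`; in the real tool the last step sends the text to Wikipedia rather than returning a string.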