Wikipedia : Workshop
Uploaded as Microsoft PowerPoint. Usage rights: © All Rights Reserved

Presentation Transcript

    • We know Wikipedia exists in English… How many languages do you think it exists in?
    • What is the largest Wikipedia? Size?
    • What is the size of Hindi Wikipedia? Telugu? Tamil?
    • Who uses Wikipedia content?
    • Who creates content in Wikipedia?
    • Who corrects the content?
    • Who owns the content?
    • How to contribute to Wikipedia? Do you know?
    • [15 Min] What is WikiBhasha?
    • [15 Min] Demos…
    • [40 Min] Hands-on: create some data
    • [20 Min] Feedback
      • Feedback Questionnaire
      • Q/A
      • Interact with us on your ideas…
    • Started as a Research Project in Microsoft Research India (WikiBABEL)
    • Released as an open-source tool (WikiBhasha)
    WikiBABEL explores community creation of data for Computational Linguistics research (our first focus: parallel data)
    • Rough content using Machine Translation
    • Community corrects the rough content for its own purposes…
    [Diagram labels: Machine Translation System; Collaborative Translation Cache; Linguistic Resources; WikiBABEL on Wikipedia; Article to Target Wikipedia]
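The idea on this slide (machine translation supplies rough content, the community corrects it, and the corrections become parallel data) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the `TranslationCache` class, its method names, and the stub translator are hypothetical and are not taken from the WikiBABEL or WikiBhasha code.

```javascript
// Hypothetical sketch of a collaborative translation cache.
// machineTranslate is any function that maps a source sentence to a rough translation.
class TranslationCache {
  constructor(machineTranslate) {
    this.machineTranslate = machineTranslate;
    this.corrections = new Map(); // source sentence -> community-corrected translation
  }

  // Prefer a community correction; fall back to the rough machine translation.
  translate(sentence) {
    if (this.corrections.has(sentence)) {
      return { text: this.corrections.get(sentence), source: "community" };
    }
    return { text: this.machineTranslate(sentence), source: "machine" };
  }

  // Record a user's corrected translation for a source sentence.
  correct(sentence, correctedText) {
    this.corrections.set(sentence, correctedText);
  }

  // The stored corrections double as parallel data (sentence pairs).
  parallelData() {
    return Array.from(this.corrections, ([source, target]) => ({ source, target }));
  }
}

// Usage with a stub translator (a real system would call an MT service here).
const demoCache = new TranslationCache((s) => `[rough] ${s}`);
demoCache.correct("Hello", "नमस्ते");
console.log(demoCache.translate("Hello"));
console.log(demoCache.translate("Goodbye"));
```

Keying the cache on the source sentence keeps the sketch short; a real system would also need to key on the language pair.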
    • Wikipedia provides a compelling user scenario
        • Highly skewed content in different languages
        • Passionate Wikipedia communities
    • WikiBhasha announced by MSR and WMF (18 Oct 2010)
      • Open sourced as MediaWiki Extension
      • Released publicly at www.WikiBhasha.org
    • Designed WikiBhasha as a thin edit layer
      • Stays on Wikipedia
      • User contribution submitted to Wikipedia
    [Diagram labels: WikiBABEL UX; WikiBhasha 2.0; User Community; CTF; Dictionary; Cloud Services APIs; Wikipedia]
    • WikiBhasha designed to be modular & extensible
      • Open-sourced, so community can contribute/enhance
    [Architecture diagram; component labels in the order they appear]
    • Cloud Services Layer
    • WikiBhasha CORE Components
    • Source/Target Wiki System Interface
    • GUI Components (Wikipedia-specific UI and Workflow)
    • WikiBABEL [Edit]
    • WikiBABEL-CORE
    • Source/Target Wiki System Interface (Wiki APIs for Content Pull/Push, Content & User Management, …)
    • User Management (Authentication, User Credentials Management, User Preferences/Skills, Contributions Tracking, …)
    • Linguistic Resources (Mono-/Bi-lingual Dictionaries, Thesauri, …)
    • Lang. Technology Components (Machine Translation, Transliteration, Summarization …)
    • Content Management (Content Discovery, Versions, Tagging, Notification Lists, …)
    • User-Experience (Linguistically Aware Wiki-site Aware Workflow Engine)
    • Contextual Help (Domain-specific, Context-specific, User-Contribution Aware Help…)
    • Communication (Message Boards, Email/Alert Mechanisms, Wikis, …)
    • User-Interface (Generic UI Components, Scratch Pad, …)
    • WikiBhasha UI/UX/Integration Components Layer
    • 3rd Party Linguistic Services
    • MediaWiki Software
    • MediaWiki Extensions
    • MediaWiki Layer
    • Wikipedia
    • CTF
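The "Content Pull" part of the wiki system interface can be pictured as a request to the MediaWiki Action API. The sketch below only builds such a request URL; the function name `buildContentPullUrl` is hypothetical, and nothing here is claimed to match how WikiBhasha itself talks to Wikipedia. The query parameters (`action=query`, `prop=revisions`, `rvprop=content`, `rvslots=main`) are standard Action API parameters for fetching a page's current wikitext.

```javascript
// Hypothetical helper: builds an Action API URL that asks for the
// current wikitext of one page on a given language edition of Wikipedia.
function buildContentPullUrl(lang, title) {
  const params = new URLSearchParams({
    action: "query",
    prop: "revisions",
    rvprop: "content",
    rvslots: "main",
    titles: title,
    format: "json",
  });
  return `https://${lang}.wikipedia.org/w/api.php?${params.toString()}`;
}

// Example: the request URL for an English article.
console.log(buildContentPullUrl("en", "Machine translation"));
```

Pushing content back requires an authenticated edit request with a token, which is left out of this sketch.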
    • WikiBhasha is available as a bookmarklet / Wikipedia user-script
      • Please contribute to your Wikipedia!
    • WikiBhasha source code available as a MediaWiki Extension
      • http://svn.wikimedia.org/viewvc/mediawiki/trunk/extensions/WikiBhasha
      • Please enhance it!
    • Working with Wikipedia Communities around the world
    • Workshops planned in several regions
      • India in Nov 2010 – Mar 2011
        • Allahabad, Banaras, Delhi, Chennai, Trichy, Hyderabad…
      • Egypt in Dec 2010-Jan 2011
      • Brazil and Mexico in Mar-Apr 2011 …
    • Goal: To understand adoption!
    • WikiBhasha announced by MSR and WMF (18 Oct 2010)
      • Open sourced as MediaWiki Extension
      • Released publicly at www.WikiBhasha.org
    • Covered in 20+ languages across the world
      • Widely covered on social media (FB & Twitter)
    • 500K+ Hits & ~100K+ Visitors
    • From more than 50 countries
    • Parallel data: ~4000 Sentences & ~100K Words
    [Chart: Total Hits]
    • Articles on WikiBhasha created in 7 Wikipedias
    • WikiBhasha files localized in 2 languages
    • A MediaWiki Bugzilla tracks bugs and feature enhancements
      • So far about 50 issues discussed & 22 bugs filed
      • 12 bugs fixed
    • Much appreciation
        • “Hats off to Microsoft for putting aside commercial concerns for once and doing something positive for society.” Taskado/Wikimedia blog
        • “I applaud Microsoft for making the client open source.”
        • “Excellent tool indeed! Very intuitive and efficient.” Raoul/RMC Blog
    • And some skepticism on…
        • What are you planning to do with the data?
        • Why are you sharing it?
    • And the inevitable comparison with GTTk
        • “Hmm interesting. … I don’t think Google agreed to release any of their code. Will they now?” Nil Einne/Wikimedia Blog
        • “How does this compare with the Google’s tool?” WikiBhasha Forum
    • The Wikipedia contribution is still minimal
      • 100+ sessions on Wikipedia through WikiBhasha
      • 4000+ sentence pairs through CTF mechanism
      • 100,000+ words of parallel corpora collected
    • Wikimedia Foundation: “Engage the communities”
    • Research and Data goals
    • Tutorial on Contribution…
    • Do you have a Wikipedia login?
        • If so, please log in
        • Or, be “Anonymous”
    • If you want to create a “Wikipedia” login
        • Visit ANY Wikipedia article & click “Login” or “Create Login”
        • Create Login
        • Login
    • As a bookmarklet
      • Visit the www.WikiBhasha.org site
      • Install from the “Installation” tab
    • As a Wikipedia user-script (with Wikipedia login)
      • From the “WikiBhasha.MSR” user
        • Add the following to your default skin .js file: importScript("User:WikiBhasha.MSR/WikiBhasha.js");
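For the user-script route, the line above is all that is strictly needed. A slightly more defensive variant is sketched below, assuming the skin .js file runs where MediaWiki's global `importScript` function is defined; the wrapper `loadWikiBhasha` is a hypothetical name introduced only for this example.

```javascript
// Hypothetical wrapper around the user-script import from the slide.
// loader is expected to be MediaWiki's importScript when run on a wiki page.
function loadWikiBhasha(loader) {
  const page = "User:WikiBhasha.MSR/WikiBhasha.js"; // script page named in the slide
  if (typeof loader !== "function") {
    return false; // not running inside a wiki page, so there is nothing to load
  }
  loader(page);
  return true;
}

// In a skin .js file importScript should exist; elsewhere this call does nothing.
loadWikiBhasha(typeof importScript === "function" ? importScript : undefined);
```

The guard only matters if the same file is ever evaluated outside Wikipedia; on a wiki page the one-line import behaves the same way.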
    • Choose any English/Hindi Wikipedia article
    • Go to the English Wikipedia article
        • Invoke “WikiBhasha Beta”
        • Choose language
        • Invoke “WikiBhasha” once more in the Hindi article
    • Or, Go to the Hindi Wikipedia article
        • Invoke “WikiBhasha Beta”
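Moving between an English article and its Hindi counterpart relies on the two articles being linked across language editions. As a rough illustration, the sketch below asks the MediaWiki Action API for that link using `prop=langlinks` with the `lllang` filter. The function names are hypothetical, and this is not claimed to be what WikiBhasha does internally.

```javascript
// Hypothetical helper: builds an Action API URL that asks the source wiki
// for the title of the same article in the target language.
function buildLangLinkUrl(sourceLang, title, targetLang) {
  const params = new URLSearchParams({
    action: "query",
    prop: "langlinks",
    lllang: targetLang,
    titles: title,
    format: "json",
  });
  return `https://${sourceLang}.wikipedia.org/w/api.php?${params.toString()}`;
}

// Pulls the linked title out of a parsed API response, or returns null if
// the article has no counterpart. The default JSON format stores the title
// under "*"; formatversion=2 uses "title", so both are checked.
function extractTargetTitle(response) {
  const pages = (response.query && response.query.pages) || {};
  for (const page of Object.values(pages)) {
    for (const link of page.langlinks || []) {
      return link.title || link["*"];
    }
  }
  return null;
}

console.log(buildLangLinkUrl("en", "India", "hi"));
```

A `null` result corresponds to the case where the target-language article does not exist yet and has to be created.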
    • 3 Step Process
      • Collect: For exploring English articles & translating
          • Use this step ONLY for translation corrections
      • Compose: For composing target article
          • Use this step for moving corrected content AND writing new content
      • Submit: You are done!
          • Submit to Wikipedia!
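The three steps above can be read as a small state machine in which each kind of action is allowed only in its own step. The sketch below is a toy model of that rule, written for this transcript; the function and field names are hypothetical and do not come from the WikiBhasha source.

```javascript
// Toy model of the Collect -> Compose -> Submit workflow.
const STEPS = ["collect", "compose", "submit"];

function createSession() {
  return { step: "collect", corrections: [], article: [], submitted: false };
}

// Move to the next step; stays on "submit" once the last step is reached.
function advance(session) {
  const i = STEPS.indexOf(session.step);
  if (i < STEPS.length - 1) {
    session.step = STEPS[i + 1];
  }
  return session.step;
}

// Collect: only translation corrections are recorded here.
function correctTranslation(session, text) {
  if (session.step !== "collect") {
    throw new Error("Translation corrections belong to the Collect step");
  }
  session.corrections.push(text);
}

// Compose: corrected content is moved in and new content is written.
function addToArticle(session, text) {
  if (session.step !== "compose") {
    throw new Error("Article text is added in the Compose step");
  }
  session.article.push(text);
}

// Submit: returns the composed article text that would be sent to Wikipedia.
function submit(session) {
  if (session.step !== "submit") {
    throw new Error("Finish Collect and Compose before submitting");
  }
  session.submitted = true;
  return session.article.join("\n");
}

// Usage: one pass through the three steps.
const demoSession = createSession();
correctTranslation(demoSession, "Corrected sentence");
advance(demoSession);
addToArticle(demoSession, demoSession.corrections[0]);
addToArticle(demoSession, "A newly written sentence");
advance(demoSession);
console.log(submit(demoSession));
```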