
Masters thesis defense talk


Published in: Technology, Education


  1. Towards Web 3.0: Harnessing Collective Intelligence of Humans for Knowledge Acquisition and Web Accessibility. Presenter: Deepti Aggarwal. Advisors: Prof. Venkatesh Choppella and Prof. Vasudeva Varma. Reviewers: Dr. Raghu Reddy and Dr. Priyanka Srivastava.
  2. Evolution of the Web. The World Wide Web is a large collection of interconnected documents. Timeline markers on the slide: 1990, 2003, 2010, 2020.
     • Web 1.0: the web as an information portal; focus on ownership; static web pages; meaning is dictated.
     • Web 2.0: the web as a social platform; focus on community; user-generated content; meaning is socially constructed.
     • Web 3.0: the web as a personalized, portable web; focus on the individual; semantic and portable pages; meaning is socially constructed and contextually reinvented.
  3. Getting to Web 3.0: Major hurdles
     • Scattered data: where should I look for the data?
     • Excess of data: which data is best for me?
     • Understanding data: how can I understand the available data?
  4. Getting to Web 3.0: Through context and semantics
     • A web where every piece of data carries its own semantics, and the context of the content is defined by the data.
     • A web that is capable of reading and understanding the user's context.
     • CONTEXT refers to why the content is relevant and to whom.
     • SEMANTICS refers to the meaning of data and how it is relevant to a given context.
  5. Web 3.0: Possible applications
     • Personalized web: content and advertising that match user preferences and choices.
     • Data on demand: no need for browsing when all databases are semantically connected to each other.
     • Multi-lingual web: easy access to sources available in varied languages.
     Slide labels: Knowledge acquisition (Extraction and Validation); Accessibility (Re-narration).
  6. Getting to Web 3.0: Methodology and contributions
     • Methodology (Human Computer Interaction): research through design (Zimmerman 2007); prototyping, user studies, analysis, discussions.
     • Contributions: three prototypes and their studies.
     • Power of Friends, uPick: extracting and validating information.
     • Alipi: making the web more accessible through re-narration.
  7. Problem: Extract and validate information. Exploration: Power of Friends, an online friendsourcing game.
  8. Problem: Extracting and validating community-related information. Friends on social networks possess a variety of information about each other. Applications: personalizing one’s browsing and targeted advertising. Issues: information is scattered, and no one is an expert.
  9. Existing approaches. Task: extract information about a person X. Approach 1: ask X (21 Questions). Approach 2: ask X’s friends (Bernstein et al. 2008). Problem: revealing the truth involves social awkwardness.
  10. Motivation
      • Get the opinion of friends: Looking-Glass Self Theory.
      • Ask everyone: Cultural Consensus Theory.
      • Ensure privacy: Secure Multi-party Computation.
      • Make it fun: Power of Ten.
  11. Our approach: Crowd Consensus. Ask X’s friend to guess the opinion of X’s other friends. Benefits: tackles social awkwardness in an engaging and fun way.
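The Crowd Consensus idea above can be read as a simple aggregation: every friend guesses what X’s other friends would say, and the most common guess is taken as the community’s answer. The following is only a minimal sketch of that reading. The function name `consensus_answer`, the sample guesses, and the tie handling are assumptions for illustration, not details from the talk.

```python
from collections import Counter


def consensus_answer(guesses):
    """Return (answer, share) for the most common guess.

    `guesses` maps a friend's name to that friend's guess about what
    X's other friends would answer. Returns (None, 0.0) when there are
    no guesses. Ties go to the guess that was seen first (assumption).
    """
    if not guesses:
        return None, 0.0
    counts = Counter(guesses.values())
    answer, votes = counts.most_common(1)[0]
    return answer, votes / len(guesses)


# Hypothetical guesses from four friends about X's favourite pastime.
sample = {"A": "cricket", "B": "cricket", "C": "chess", "D": "cricket"}
answer, share = consensus_answer(sample)
```

Returning the share alongside the answer makes it possible to flag questions on which the community is split, rather than silently accepting a narrow majority.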
  12. Power of Friends: Our proposed game. A single-player, asynchronous social game.
  13. User study of Power of Friends
      • Seven communities, 67 participants (40 female).
      • Ten questions per game play, related to the likes, hobbies and daily activities of community members.
      • Task: play the game online.
      • Four sessions: demographic information and questions about bonding, game demonstration, game play, and interview.
  14. Results of the study
      Community Id | Number of questions correctly identified
      C1 | 6/10
      C2 | 8/10
      C3 | 5/10
      C4 | 7/10
      C5 | 6/10
      C6 | 8/10
      C7 | 7/10
      Communities C2 and C6 were more accurate. There was a correlation between the performance of a community and the bonding level among its members.
  15. Study findings
      • It is challenging: “It requires a lot of thinking. I wish I knew my coworkers better.”
      • It creates a social impact: “It is not possible that my friend … knows cooking, I think she hates it. I have to ask her.”
      • It explores the social awkwardness of answering a given question: “It is a cool way of giving my answer ... No one knows my answer except me.”
  16. Study findings (contd.)
      • It creates a sense of connectedness among people: “It’s kind of fun to see how accurately my thinking aligns with my friends.”
      • 25% of the participants got confused while playing and needed reminders of the game strategy.
      • 30% recommended a multi-player setting, 10% a time-based challenge, and 60% publishing the game on Facebook.
  17. Design themes
      • Identify the level of bonding among friends, as it impacts their performance in the game.
      • Include questions about every group member.
      • Select the questions carefully, keeping the interests of the members in mind.
      • Allow participants to generate questions.
  18. Discussions and future work
      • Exploring an indirect mode of interaction for larger communities (IRB approved).
      • A comparative study between the direct and indirect modes of answering questions is planned.
      • Publishing the game on Facebook (social media interaction).
      Personalized web: content and advertising that match user preferences and choices.
  19. Problem: Extracting and validating information. Exploration 2: uPick, a crowdsourcing system for extracting named entities.
  20. Problem scenario: Acquiring accurate and up-to-date information about Sachin from various web sources.
  21. Problem: Extracting useful data on demand.
  22. Difficulty in processing the English language: “You see sir, I can talk English, I can walk English, I can laugh English, I can run English, because English is such a funny language.” (Amitabh Bachchan in the movie Namak Halal)
  23. Other problems
      • Co-reference: “Sachin Tendulkar was born in Bombay. He studied in Sharadashram...”
      • Acronym: “Sachin Tendulkar was born in Bombay. Master Blaster is …”
      • Ambiguity: “Sachin remembered his father last night … He said he loved poems.”
      • Abbreviations: “Sachin Tendulkar was born in Bombay. Tendlya is …”
  24. Constructs of a sentence: Named entities and relations
      • A named entity (NE) is an atomic element in a body of text.
      • Types: person, organization, location, etc.
      • Different named entities, when linked together, form a relation.
      • Example: “Sachin Tendulkar was born in Bombay”. Subject: “Sachin Tendulkar”, NE of type ‘Person’. Relation: “was born in”, NE of type ‘Verb’. Object: “Bombay”, NE of type ‘Location’.
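The subject, relation and object structure on this slide maps naturally onto a small record type. The sketch below is one possible representation, not the data model used in uPick; the class names `Entity` and `Relation` and the method `as_sentence` are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Entity:
    """A named entity: a span of text plus its type label."""
    text: str
    kind: str  # e.g. "Person", "Verb", "Location", as on the slide


@dataclass(frozen=True)
class Relation:
    """Three linked named entities: subject, relation, object."""
    subject: Entity
    predicate: Entity
    obj: Entity

    def as_sentence(self):
        # Join the three parts back into a readable statement.
        return f"{self.subject.text} {self.predicate.text} {self.obj.text}"


# The example from the slide.
born = Relation(
    subject=Entity("Sachin Tendulkar", "Person"),
    predicate=Entity("was born in", "Verb"),
    obj=Entity("Bombay", "Location"),
)
```

Making the records frozen keeps them hashable, so relations can be stored in sets or used as dictionary keys when votes are counted later.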
  25. Extracting relationships among NEs: Required process
      • Identify part-of-speech constructs: noun, verb, adjective, etc.
      • Determine co-references, abbreviations and acronyms.
      • Connect them together to form a relationship.
  26. Existing approach: Automated techniques
      • Natural Language Processing based: rule based.
      • Machine Learning based: supervised and unsupervised learning.
      • Other methods: vocabulary based.
      • Hybrid: NLP and vocabulary based.
      • Issues: dependency, scalability.
  27. uPick: Our proposed system. A crowdsourcing system to extract named entity relationships from documents.
  28. uPick: Working
      • Step 1: Extract NEs and relations using a POS tagger and the relation extraction rules proposed by Chen.
      • Step 2: Present the extracted relations to a crowd in the form of a game (challenge).
      • Step 3: Collect the generated responses.
      • Step 4: Filter the relations by collecting the majority votes and comparing them against the expert-filtered relations.
  29. Processing of the generated data
      • With the help of human experts, we collected valid relations for each document from the automatically generated relations (step 1). These relations form a ground truth dataset for further validation.
      • We compare the collected responses from each game against the expert-corrected facts stored in the database and filter out erroneous response data.
      • The relation instances receiving a majority are taken as true facts corresponding to the document.
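The majority-vote step and the comparison against the expert ground truth can be sketched as follows. This is an illustrative reading of the slide, not the uPick implementation: the function names `majority_valid` and `compare_with_experts`, the strict-majority rule, and the sample votes are all assumptions.

```python
def majority_valid(votes_by_relation):
    """Return the relations that a strict majority of players marked valid.

    `votes_by_relation` maps a relation (any hashable value) to a list of
    booleans, one per player: True means the player marked it as valid.
    """
    accepted = set()
    for relation, votes in votes_by_relation.items():
        yes = sum(1 for v in votes if v)
        if 2 * yes > len(votes):  # strict majority; ties are rejected
            accepted.add(relation)
    return accepted


def compare_with_experts(accepted, expert_valid):
    """Split crowd-accepted relations against the expert ground truth."""
    return {
        "confirmed": accepted & expert_valid,  # crowd and experts agree
        "missed": expert_valid - accepted,     # valid, but crowd rejected
        "spurious": accepted - expert_valid,   # crowd accepted, experts did not
    }


# Made-up votes from three players on two candidate relations.
votes = {
    ("Sachin Tendulkar", "was born in", "Bombay"): [True, True, False],
    ("Bombay", "was born in", "Sachin Tendulkar"): [False, True, False],
}
expert_valid = {("Sachin Tendulkar", "was born in", "Bombay")}

accepted = majority_valid(votes)
report = compare_with_experts(accepted, expert_valid)
```

Rejecting ties is a conservative choice: a relation only counts as a fact when more players accept it than reject it.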
  30. 30. + n  User study of uPick Supervised laboratory study, 12 participants (8 females). n  Three sessions: training, game play and interview. n  Four documents: Ashoka Maurya, Sachin Tendulkar, Shahrukh Khan, and Sonia Gandhi. n  Procedure: Read the given text and select the relations from the given list. 30
  31. Study results
      Measure | D1 | D2 | D3 | D4
      Total number of presented relations | 37 | 39 | 40 | 33
      Correctly identified valid relations | 19 | 18 | 19 | 15
      Valid relations incorrectly identified as invalid | 5 | 6 | 4 | 1
      Correctly identified invalid relations | 12 | 12 | 16 | 15
      Invalid relations incorrectly identified as valid | 1 | 3 | 1 | 2
      Accuracy (correctly identified relations / total relations) | 84% | 77% | 87% | 91%
      Accuracy using automated techniques only (valid relations / total relations) | 61% | 57% | 49% | 65%
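The crowd accuracy row can be recomputed from the four count rows using the slide's definition (correctly identified relations divided by total relations). The counts below are copied from the slide; the dictionary keys and the function name `accuracy` are my own labels. The sketch only covers the crowd accuracy row, not the automated baseline.

```python
# Counts per document, copied from the study results slide.
RESULTS = {
    "D1": {"total": 37, "valid_ok": 19, "valid_missed": 5, "invalid_ok": 12, "invalid_missed": 1},
    "D2": {"total": 39, "valid_ok": 18, "valid_missed": 6, "invalid_ok": 12, "invalid_missed": 3},
    "D3": {"total": 40, "valid_ok": 19, "valid_missed": 4, "invalid_ok": 16, "invalid_missed": 1},
    "D4": {"total": 33, "valid_ok": 15, "valid_missed": 1, "invalid_ok": 15, "invalid_missed": 2},
}


def accuracy(row):
    """Correctly identified relations (valid and invalid) over all relations."""
    return (row["valid_ok"] + row["invalid_ok"]) / row["total"]


for doc, row in RESULTS.items():
    print(f"{doc}: {accuracy(row):.1%}")
```

The four counts in each column add up to the column's total, and the computed values agree with the reported 84%, 77% and 91% for D1, D2 and D4; D3 works out to exactly 87.5%, which the slide reports as 87%.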
  32. Discussions and future work
      • Helpful in remembering facts related to a text, so it could be used in online education systems.
      • Turn it into an engaging game.
      • Leaderboards and persistent scoring.
      Data on demand: no need for browsing when all databases are semantically connected to each other.
  33. Problem: Making the web accessible. Exploration: Alipi, an online crowdsourcing system for re-narration.
  34. Problem scenario: Accessibility of web content. How can a person who does not know English understand web pages on fire safety? Solution: re-narration. Example: a web page on fire safety is re-narrated in Hindi.
  35. Why are the existing approaches not sufficient?
      • Single point of control and authority.
      • The author is forced to anticipate the target audience.
      • Transferring authorship is difficult.
  36. Alipi: A re-narration framework (Dinesh et al. 2012)
      • The user rewrites different sections of a web page.
      • Distribution of the point of control from author to users.
      • A step from target audience to target communities.
      • Follows the principle of “the best content for each one”.
  37. Alipi architecture
  38. Alipi architecture: Creating and storing the re-narrations
  39. Alipi architecture: Displaying a re-narrated page to the user
  40. Alipi prototype. 1. Open the website http::// Enter the page of interest, here, 2. Click on the “Re-narrate” button.
  41. Alipi prototype: Steps to re-narrate a page. 3. Select a section of the web page and re-narrate the element. 4. Publish your re-narration by providing the target community.
  42. Alipi prototype: Steps to see the available re-narrations. 3. After clicking the “Re-narrations” button, choose a re-narration from the available list. 4. The queried page will change to show the re-narrated element.
  43. My contribution: Testing the feasibility of Alipi
      • Setting: IIIT-H R&D showcase, 70 participants (45 male).
      • Objective: to find out what motivates users to use Alipi, and for what sorts of tasks.
      • Task: re-narrate a web page (the IIIT-H web page, a page on Indian culture, or any other page) and later check the available re-narrations.
      • Four phases: demographics, training, system experience and questionnaire.
  44. Findings of the study
      • Participants appreciated both roles, re-narrator and reader; the preference varied between known and unknown domains.
      • Re-narrators preferred text-based re-narrations over video and audio re-narrations, to avoid setting up a camera and bandwidth issues.
      • Readers preferred re-narrations in mixed media, to get a rich experience.
      • The majority wanted to re-narrate for their friends and to see re-narrations from people they know, because preferences are known.
      • Participants found the interface design non-intuitive and hard to follow, but found the system very useful for sharing information.
  45. My contribution 2: Alipi browser plugin
      • Allows dynamic filtration based on the user profile.
      • By-passes the URL.
      • Decentralized and editable user profile.
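Profile-based filtration could look roughly like the sketch below: the plugin keeps an editable profile on the user's side and shows only the re-narrations that match it. This is a guess at the general shape, not Alipi's actual code or data model. The classes `Renarration` and `Profile`, their fields, and the matching rule are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Renarration:
    """One re-narration of a page element (hypothetical fields)."""
    page_url: str
    language: str
    community: str
    text: str


@dataclass
class Profile:
    """An editable user profile kept by the plugin (hypothetical fields)."""
    languages: set = field(default_factory=set)
    communities: set = field(default_factory=set)


def filter_renarrations(renarrations, profile):
    """Keep re-narrations in a language the user reads.

    If the profile lists communities, also require a community match;
    an empty community set means no community restriction (assumption).
    """
    return [
        r for r in renarrations
        if r.language in profile.languages
        and (not profile.communities or r.community in profile.communities)
    ]


# Made-up re-narrations of one page.
available = [
    Renarration("page-1", "Hindi", "students", "..."),
    Renarration("page-1", "Hindi", "farmers", "..."),
    Renarration("page-1", "Telugu", "students", "..."),
]
profile = Profile(languages={"Hindi"}, communities={"students"})
matches = filter_renarrations(available, profile)
```

Because the filter is a plain function of the profile, editing the profile and re-running it is enough to change what the user sees.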
  46. Discussions and future work
      • How can we check the credibility of a re-narration: filtration of noisy re-narrations, ranking based on public voting?
      • How can we improve our selection algorithm to take into account rapidly growing online communities, dialects of a geographical location, and the vicinity of the region mentioned by the user?
      • What could be the security implications of the Alipi architecture?
      Multi-lingual web: easy access and interoperability among contents between different languages.
  47. Summary and the way ahead
  48. Summary of my work
      • Power of Friends (knowledge acquisition: extraction and validation). Personalized web: content and advertising that match user preferences and choices.
      • uPick (knowledge acquisition: extraction and validation). Data on demand: no need for browsing when all databases are semantically connected to each other.
      • Alipi (accessibility: re-narration). Multi-lingual web: easy access and interoperability among contents between different languages.
  49. Future work
      • Can the proposed Crowd Consensus framework reduce the number of iterations required for crowdsourcing tasks?
      • Using the belief modality, can we develop a mathematical model to check the accuracy of answers generated with the Crowd Consensus approach and to determine the conditions under which the accuracy may deviate?
      • Can the proposed uPick approach enhance the experience of students while reading textbooks?
      • How can we check the relatedness of a re-narration (generated with the Alipi tool) to the original document, as well as to other available re-narrations of the same web page?
  50. References
      • C. Cooley. Human Nature & Social Order - Ppr. Social Science Classics Series. Transaction Pub, 1964.
      • M. S. Bernstein, D. Tan, G. Smith, M. Czerwinski, and E. Horvitz. Personalization via friendsourcing. ACM Trans. Comput.-Hum. Interact., 17(2):6:1–6:28, May 2008.
      • P.-S. Chen. English sentence structure and entity-relationship diagrams. Information Sciences, 29(2-3):127–149, 1983.
      • S. C. Weller. Cultural consensus theory: Applications and frequently asked questions. Field Methods, 19(4):339–368, 2007.
  51. References (contd.)
      • I. Tuomi. Data is more than knowledge: implications of the reversed knowledge hierarchy for knowledge management and organizational memory. J. Manage. Inf. Syst., 16(3):103–117, Dec. 1999.
      • S. Sekine. Named Entity: History and Future. 2004.
      • W. Du and M. J. Atallah. Secure multi-party computation problems and their applications: a review and open problems. In Proceedings of the 2001 workshop on New security paradigms, NSPW ’01, pages 13–22, New York, NY, USA, 2001. ACM.
      • Z. Syed, E. Viegas, and S. Parastatidis. Automatic discovery of semantic relations using mindnet. LREC, 2010.
  52. References (contd.)
      • 21 Questions.
      • Mindnet. default.aspx?id=69647.
      • Power of 10. of 10.
      • Stanford pos tagger. tagger.shtml.
  53. Related publications
      • D. Aggarwal, R. A. Khot, and V. Choppella. Power of Friends: When Friends Guess About their Friends’ Guess. In Proceedings of the SIGCHI conference on Human factors in computing systems, CHI ’13, Paris, France, 2013, ACM.
      • D. Aggarwal, R. A. Khot, V. Varma, and V. Choppella. UPICK: Crowdsourcing Based Approach to Extract Relations Among Named Entities. In Proceedings of IndiaHCI, Pune, India, 2012 (Accepted as full paper).
      • T. B. Dinesh, S. Uskudrali, S. Sastry, D. Aggarwal, and V. Choppella. Alipi: A framework for re-narrating web pages. In Proceedings of the International Cross-Disciplinary Conference on Web Accessibility, W4A ’12, pages 22:1-4, Lyon, France, 2012, ACM.
      • D. Aggarwal, R. A. Khot, A. K. Dey, and V. Choppella. Crowd Consensus: Friendsourcing based approach to generate cultural beliefs. In preparation.
  54. Public demonstrations
      • Presented “Alipi: Making the web Inclusive and Accessible for All” in IIIT-Hyderabad R&D Showcase, Hyderabad, India, 2013.
      • Presented “Crowdsourcing Based Approach to Extract Relations Among Named Entities” in OpenData Camp Hyderabad Meet, Hyderabad, India, 2012.
      • Poster presentations on “Power of Friends: Rethinking Games With a Purpose” and “Alipi: A renarration Web” in IIIT-Hyderabad R&D Showcase, Hyderabad, India, 2012.
  55. Special thanks: Prof. Venkatesh Choppella, Prof. Vasudeva Varma, Dr. T. B. Dinesh, Prof. Anind Dey, reviewers, study participants, IIIT-H faculty, friends, and my family.
  56. Web 3.0… Web of opportunities! This is just the beginning! Thank you! For more details: