Ranking Resources in Folksonomies by Exploiting Semantic and Context-specific Information
2. KOM – Multimedia Communications Lab 2
Source: www.icon-finder.com, www.flickr.com, www.delicious.com, www.crokodil.de
Folksonomies
Definition of folksonomy, adapted from [HJSS06]
Users 𝑈
Resources 𝑅
Tags 𝑇
Tag assignment relation 𝑇𝐴𝑆 ⊆ 𝑈 × 𝑅 × 𝑇
A tag assignment: Bob, sugar loaf
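The definition above can be sketched in a few lines of Python. This is only an illustration: the class name `TagAssignment`, the function `build_folksonomy`, and the example identifiers (`photo-1`, `alice`) are hypothetical and do not come from the slides.

```python
from typing import NamedTuple


class TagAssignment(NamedTuple):
    """One element of the tag assignment relation TAS (user, resource, tag)."""
    user: str
    resource: str
    tag: str


def build_folksonomy(assignments):
    """Derive the sets U, R, T from a collection of tag assignments."""
    tas = set(assignments)
    users = {a.user for a in tas}
    resources = {a.resource for a in tas}
    tags = {a.tag for a in tas}
    return users, resources, tags, tas


# Hypothetical example data.
example = [
    TagAssignment("bob", "photo-1", "sugar loaf"),
    TagAssignment("bob", "photo-1", "rio"),
    TagAssignment("alice", "fcbarcelona.com", "barca"),
]
U, R, T, TAS = build_folksonomy(example)
```

Storing TAS as a set of triples keeps the relation free of duplicates, which matches its definition as a subset of 𝑈 × 𝑅 × 𝑇.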
3. KOM – Multimedia Communications Lab 3
Ranking Resources in Folksonomies
Task of Ranking of Resources: “Rank resources, such that they are in descending order of relevance towards an information need.”
Find me a resource, given as query-entity (adapted from [Bog09]):
user: Interests match
resource: More like this
tag: Guided search
4. KOM – Multimedia Communications Lab 4
How to Actually Rank in Folksonomies?
FolkRank [HJSS06]: state-of-the-art, graph-based
Based on PageRank’s random surfer [PBMW99]
“How probable is it that I go to B, given that I am at A?”
|𝑅(𝑢, 𝑡)|
Restart
α = 1/3
Describes context
Estimates relevance
Estimates authority
Ranking 𝒗
[Figure: example graph with the nodes fcbarcelona.com, messi, barca and barcaFan; edge weights 1, 2, 3; transition probabilities 1/5, 1/4, 3/5, 1/3, 1/2, 3/4, 2/3; ranking shares 45%, 29%, 16%, 10%]
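The random surfer with restart can be sketched as follows. This is a minimal illustration under stated assumptions, not the FolkRank algorithm of [HJSS06] in full: every tag assignment adds weight 1 to the three undirected edges between its user, resource and tag, and `alpha` is read as the probability of jumping back to the query-entity. The function names `build_graph` and `restart_walk` are hypothetical.

```python
from collections import defaultdict


def build_graph(tas):
    """Undirected weighted graph over typed nodes ("u"/"r"/"t", name).

    Assumption: each (user, resource, tag) triple adds weight 1 to the
    user-resource, user-tag and resource-tag edges.
    """
    w = defaultdict(lambda: defaultdict(float))
    for user, resource, tag in tas:
        nodes = (("u", user), ("r", resource), ("t", tag))
        for a in nodes:
            for b in nodes:
                if a != b:
                    w[a][b] += 1.0
    return w


def restart_walk(w, query, alpha=1 / 3, iterations=100):
    """Random walk that restarts at `query` with probability `alpha`."""
    nodes = list(w)
    score = {n: 1.0 / len(nodes) for n in nodes}
    for _ in range(iterations):
        # Restart mass goes to the query-entity only.
        nxt = {n: (alpha if n == query else 0.0) for n in nodes}
        for a in nodes:
            total = sum(w[a].values())
            for b, weight in w[a].items():
                # Probability of going to b when at a: weight / total.
                nxt[b] += (1 - alpha) * score[a] * weight / total
        score = nxt
    return score


# Hypothetical example data.
tas = {
    ("bob", "fcbarcelona.com", "barca"),
    ("bob", "fcbarcelona.com", "messi"),
    ("ann", "dallascowboys.com", "cowboys"),
}
scores = restart_walk(build_graph(tas), query=("u", "bob"))
ranked_resources = sorted(
    (n for n in scores if n[0] == "r"), key=lambda n: -scores[n]
)
```

Because `tas` is a set of distinct triples, the weight of a user-tag edge equals the number of resources that the user labelled with that tag. Restarting at the query-entity biases the scores towards its neighbourhood, while the walk itself rewards well-connected nodes.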
5. KOM – Multimedia Communications Lab 5
Challenges of FolkRank
Assumption about folksonomy-structure violated
Concept drift
Ambiguity
Multi-facetedness of entities
Including quality attributes of a resource
Authority signals (e.g. PageRank on the Web)
Hub signals
Source: www.icon-finder.com
[Figure: example graphs labelled “authority signals” and “hub signals”; the resource IJCAI-Proceedings.pdf with the tags AI (topic), Barcelona (location) and Artificial Intelligence (topic); the tags football, soccer and news; edge weights of 1, some marked “?”]
7. KOM – Multimedia Communications Lab 7
Proposed Approaches
IncentiveScore: Concept drift
InteliScore: Concept drift
HITSonomy: Inclusion of quality attributes of resources
VSScore: Extensive description of resources/query-entity
8. KOM – Multimedia Communications Lab 8
HITSonomy
Inspired by HITS [Kle99]
FolkRank ‘thinks’ unidirectional
“How probable is it that I go to B, given that I am at A?”
HITSonomy ‘thinks’ bidirectional
Additionally: “How probable is it that I came from B, given that I am at A?”
Describes context
Estimates relevance & authority
Estimates relevance & hub
Combined scores yield ranking 𝒗
[Figure: two-node example graphs A, B with edge weights 1 and 2 and the transition probabilities 1/3, 2/3, 2/4, 2/4]
9. KOM – Multimedia Communications Lab 9
VSScore
Idea
Port ranking task to vector space model [MRS08] used in text retrieval
[Figure: example vector for “Cowboys” with the weights 1, 0, 0.8, 0.3 and 0.2 over entities such as barca, barcaFan and dallascowboys.com]
A term (usually) represents a semantic concept
Problem
No content information of resources (in this work)
Solution
Entities in folksonomy can be viewed as semantic concepts
Represent resources’ content by their context
Represent any entity by its context (e.g. a query-entity)
[Figure: two example documents (terms Barcelona, Messi, FCB and terms Dallas, Cowboys, Football) represented as vectors over the terms Barcelona and Cowboys with the values 2, 0, 0, 3, compared via δ]
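The idea of representing entities by their context can be sketched as below. The slides do not specify how context vectors are built or which measure δ stands for, so two assumptions are made here: the context of an entity is the count of entities it co-occurs with in tag assignments, and similarity is the cosine. All function names are hypothetical, and entity names are assumed to be unique across users, resources and tags.

```python
import math
from collections import Counter


def context_vector(tas, entity):
    """Count how often `entity` co-occurs with other entities in TAS."""
    vec = Counter()
    for triple in tas:
        if entity in triple:
            for other in triple:
                if other != entity:
                    vec[other] += 1
    return vec


def cosine(a, b):
    """Cosine similarity of two sparse vectors (0.0 if either is empty)."""
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def rank_resources(tas, query):
    """Rank all resources by similarity of their context to the query's."""
    resources = {resource for _, resource, _ in tas}
    q = context_vector(tas, query)
    scored = [
        (cosine(q, context_vector(tas, r)), r) for r in resources if r != query
    ]
    return [r for _, r in sorted(scored, key=lambda pair: (-pair[0], pair[1]))]


# Hypothetical example data: (user, resource, tag).
tas = {
    ("bob", "fcbarcelona.com", "barca"),
    ("bob", "fcbarcelona.com", "messi"),
    ("ann", "dallascowboys.com", "cowboys"),
}
ranking = rank_resources(tas, "bob")
```

Since the query is just another entity with a context vector, the same function serves a user, a tag or a resource as query-entity.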
11. KOM – Multimedia Communications Lab 11
Evaluation Setup
BibSonomy corpus
Methodologies
LeavePostOut [JMH+07]
LeaveRTOut
Post: All tag assignments between user and resource
RT: All tag assignments between tag and resource
Assumption: “Tag assignment indicates relevance of resource towards information need represented by user or tag”
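The two hold-out schemes can be sketched as simple set operations on the tag assignment relation. This is an illustration of the definitions of Post and RT given above, not the evaluation code used for the slides; the function names and example data are hypothetical.

```python
def leave_post_out(tas, user, resource):
    """Remove the post of (user, resource): all its tag assignments."""
    held_out = {a for a in tas if a[0] == user and a[1] == resource}
    return tas - held_out, held_out


def leave_rt_out(tas, tag, resource):
    """Remove the RT of (tag, resource): all its tag assignments."""
    held_out = {a for a in tas if a[2] == tag and a[1] == resource}
    return tas - held_out, held_out


# Hypothetical example data: (user, resource, tag).
tas = {
    ("bob", "fcbarcelona.com", "barca"),
    ("bob", "fcbarcelona.com", "messi"),
    ("ann", "fcbarcelona.com", "barca"),
}
train_post, test_post = leave_post_out(tas, "bob", "fcbarcelona.com")
train_rt, test_rt = leave_rt_out(tas, "barca", "fcbarcelona.com")
```

A ranking algorithm is then run on the remaining assignments with the user (or tag) as query-entity, and the held-out resource counts as relevant under the assumption quoted above.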
12. KOM – Multimedia Communications Lab 12
Evaluation Parameters
FolkRank
LeavePostOut: given user as query-entity, find me resources
Restart probability
13. KOM – Multimedia Communications Lab 13
Evaluation Parameters
HITSonomy
LeavePostOut: given user as query-entity, find me resources
Restart probability
Weighted arithmetic mean of authority and hub score
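The weighted arithmetic mean of the two scores can be written as a one-line combination. The parameter name `weight` and the function `combine_scores` are hypothetical; the slides do not state which weights were evaluated.

```python
def combine_scores(authority, hub, weight=0.5):
    """Weighted arithmetic mean: weight * authority + (1 - weight) * hub."""
    keys = authority.keys() | hub.keys()
    return {
        k: weight * authority.get(k, 0.0) + (1 - weight) * hub.get(k, 0.0)
        for k in keys
    }


# Hypothetical scores for two resources.
combined = combine_scores(
    {"r1": 0.8, "r2": 0.2}, {"r1": 0.2, "r2": 0.8}, weight=0.75
)
```

With `weight=1.0` only the authority score counts, and with `weight=0.0` only the hub score.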
14. KOM – Multimedia Communications Lab 14
Evaluation Results
LeavePostOut: 1 out
Given user as query-entity find me resources
HITSonomy and VSScore significantly more effective than FolkRank
Wilcoxon signed rank test on AveragePrecision
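AveragePrecision for a single ranking can be computed as sketched below. This is the standard definition, not code from the evaluation; the function name is hypothetical. The Wilcoxon signed rank test would then be applied to the paired per-query values of two algorithms and is not shown here.

```python
def average_precision(ranking, relevant):
    """Mean of the precision values at the ranks of the relevant items."""
    hits = 0
    total = 0.0
    for rank, item in enumerate(ranking, start=1):
        if item in relevant:
            hits += 1
            total += hits / rank
    return total / len(relevant) if relevant else 0.0


# With a single held-out resource, the value is 1 / rank of that resource.
ap_single = average_precision(["a", "b", "c"], {"b"})
ap_double = average_precision(["a", "b", "c"], {"a", "c"})
```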
15. KOM – Multimedia Communications Lab 15
Evaluation Results
LeavePostOut: 33% out
Given user as query-entity find me resources
HITSonomy and VSScore significantly more effective than FolkRank
Wilcoxon signed rank test on AveragePrecision
16. KOM – Multimedia Communications Lab 16
Conclusion
HITSonomy and VSScore can beat the state-of-the-art
In different resource ranking tasks
Depending on LeavePostOut/LeaveRTOut, thus the conditions of the query-entity
Other proposed algorithms did not perform as well
Most pairwise statistical significance comparisons won:
Methodology | Interests match | Guided search
LeavePostOut | HITSonomy | HITSonomy
LeaveNPostsOut | HITSonomy | HITSonomy
LeaveRTOut | FolkRank, HITSonomy, IncentiveScore, InteliScore | VSScore
LeaveNRTsOut | FolkRank, HITSonomy, IncentiveScore | HITSonomy, VSScore
17. KOM – Multimedia Communications Lab 17
Contributions
Disambiguation algorithms not evaluated
Tag Assignment Context
Post Context
Taxonomy for graph-based scoring/ranking algorithms
Implemented and evaluated
Presented algorithms for ranking in folksonomies
AInheritScore and AScore for ranking in folksonomies extended by activities
Various other ideas for ranking described
Tag type labeling of evaluation corpus
Analysis for CROKODIL application scenario
Graph-based ranking framework
18. KOM – Multimedia Communications Lab 18
Future Work
Parameterization of proposed algorithms
Ranking task
Evaluation
Creation of corpora
Efficient computation
Explainability
Preprocessing of folksonomy corpus
…
E.g. VSScore using HITSonomy result as context-description
19. KOM – Multimedia Communications Lab 19
Bibliography
[Bog09] T. Bogers. Recommender Systems for Social Bookmarking. PhD Thesis, Tilburg University, 2009.
[BSB+08] D. Böhnstedt, P. Scholl, B. Benz, C. Rensing, R. Steinmetz, and B. Schmitz. Einsatz persönlicher Wissensnetze im Ressourcen-basierten Lernen. In Proceedings of the 6th e-Learning Fachtagung Informatik, pages 113–124, 2008.
[HJSS06] A. Hotho, R. Jäschke, C. Schmitz, and G. Stumme. Information Retrieval in Folksonomies: Search and Ranking. In Proceedings of the 3rd European Semantic Web Conference on the Semantic Web: Research and Applications, pages 411–426, 2006.
[JMH+07] R. Jäschke, L. Marinho, A. Hotho, L. Schmidt-Thieme, and G. Stumme. Tag Recommendations in Folksonomies. 2007.
[MRS08] C. Manning, P. Raghavan, and H. Schütze. Introduction to Information Retrieval. Cambridge University Press, 2008.
[Kle99] J. Kleinberg. Authoritative Sources in a Hyperlinked Environment. Journal of the ACM, 46:604–632, 1999.
[PBMW99] L. Page, S. Brin, R. Motwani, and T. Winograd. The PageRank Citation Ranking: Bringing Order to the Web. Technical Report 1999-66, Stanford InfoLab, 1999.
Editor's Notes
Name CROKODIL as the application scenario in which this thesis has been done
Explain information need and relevance briefly
Explain Graph creation briefly
Example
Explain how relevance and authority are determined
Give the principal idea of IncentiveScore and InteliScore
Explain LeavePostOut, LeaveRTOut and give brief example for the different ranking tasks (interests match, guided search)
Recall biased jump from example
Recall biased jump from example
How to combine authority&relevance and hub&relevance score?
Explain vioplot
AveragePrecisions not normally distributed -> no t.test
About 1/3 of the resources are thus removed from the user
E.g. VSScore with HITS ranking as context description or VSScore with context described in external corpus
E.g. ranking for tag recommendation
Evaluation in CROKODIL scenario to determine true utility for activities (learning task)
CROKODIL corpus would be great to have true assessment of tag types as manual labeling is cumbersome
Efficient computation is usually important for creation of ranking: VSScore is slow or has to be stored
Scrutability can be desirable