This document describes an NLP Rankings system and platform that ranks NLP programs in the United States based on publication data. It collects publications from 2010-2019 from major NLP venues and matches author emails to universities. It assigns scores to universities based on publication weights. The platform allows visualizing rankings over time and analyzing trends at the university, author, and user levels. Next steps include expanding the time horizon for further analysis and identifying research interests and trends using clustering and topic modeling.
NLP Rankings: Publication-based Ranking System and Platform for NLP Research
4. INTRODUCTION
• Growing demand to analyze unstructured data brings tremendous attention to the field of Natural Language Processing (NLP)
• NLP is a relatively new field compared to other well-established disciplines and programs
• Limited information is available to assess the quality of the NLP research environment at different universities
Introduction Related Works NLP Rankings Demonstration Analysis Conclusion
5. INTRODUCTION - PURPOSE
• Provide insights regarding NLP programs in the United States to the research community by creating a customizable ranking dedicated to NLP
  • particularly for current faculty and prospective students interested in NLP
7. RELATED WORKS
• Generic University Rankings
  • U.S. News Rankings
  • QS World University Rankings
• Publication-Based University Rankings
  • NTU Rankings
  • CSRankings
8. RELATED WORKS – GENERIC RANKINGS
• U.S. News Rankings
  • Ranks universities in the United States based on
    1. expert opinions about program excellence
    2. statistical indicators that measure the quality of a school’s faculty, research, and students
  • Data used to calculate the rankings comes from statistical surveys answered by academic professionals
  • Opinion-based
• QS World University Rankings
  • Ranks universities in the world based on
    1. Academic Reputation (40%)
    2. Employer Reputation (10%)
    3. Faculty/Student Ratio (20%)
    4. Citations per Faculty (20%)
    5. International Faculty Ratio (5%)
    6. International Student Ratio (5%)
  • Opinion-based
9. RELATED WORKS – PUBLICATION-BASED RANKINGS
• NTU Rankings
  • Ranks universities in the world based on
    1. Research Productivity (25%)
    2. Research Impact (35%)
    3. Research Excellence (40%)
  • Rankings reflect a university’s research output in terms of publication quantity and quality
  • Research quality is measured by citations, which may be susceptible to citation cartels
10. RELATED WORKS – PUBLICATION-BASED RANKINGS
• CSRankings
  • Compiled by Emery Berger
  • Metric-based ranking system that ranks Computer Science programs
  • Ranks universities by their presence at prestigious publication venues
  • Ranking scores change as faculty move
  • Publication venues carry equal value
  • Limited venues selected for NLP programs
12. NLP RANKINGS – DATA COLLECTION
• Publications from 2010 to 2019 are collected from the ACL Anthology
• Conferences and venues selected
  1. Annual Meeting of the Association for Computational Linguistics (ACL)
  2. Computational Linguistics (CL)
  3. International Conference on Computational Linguistics (COLING)
  4. Conference on Computational Natural Language Learning (CoNLL)
  5. European Chapter of ACL (EACL)
  6. Conference on Empirical Methods in NLP (EMNLP)
  7. International Joint Conference on NLP (IJCNLP)
  8. North American Chapter of ACL (NAACL)
  9. Transactions of the Association for Computational Linguistics (TACL)
  10. Workshop and demonstration papers (WS)
• Total number of publications: 24,896
• By academic authors in the US: 6,261
• Total number of unique authors: 24,838
• Unique authors in the US: 7,426
13. NLP RANKINGS – PUBLICATIONS OVER TIME
[Figure: Number of NLP publications over the last 10 years]
[Figure: Number of NLP authors over the last 10 years]
14. NLP RANKINGS – AUTHOR-UNIVERSITY MATCHING
• Email addresses are extracted using a comprehensive group of regular expressions
• The email addresses of publication authors are important for determining institutional authorship
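The extraction step can be sketched as follows. This is a minimal illustration, not the authors' actual group of expressions: the two patterns (`PLAIN` and `GROUPED`), the function name `extract_emails`, and the sample addresses are all assumptions made for the example. The grouped pattern covers the `{a,b}@domain` shorthand that often appears in paper headers.

```python
import re

# Hypothetical patterns (assumptions, not the slides' real expressions).
# PLAIN matches ordinary addresses; GROUPED matches "{a, b}@domain".
PLAIN = re.compile(r"[A-Za-z0-9._+-]+@[A-Za-z0-9-]+(?:\.[A-Za-z0-9-]+)+")
GROUPED = re.compile(r"\{([^{}]+)\}\s*@\s*([A-Za-z0-9-]+(?:\.[A-Za-z0-9-]+)+)")


def extract_emails(text):
    """Return the email addresses found in `text`, in order of appearance."""
    found = []
    # Expand each "{a, b}@domain" group into one address per user name.
    for m in GROUPED.finditer(text):
        for user in m.group(1).split(","):
            found.append((m.start(), f"{user.strip()}@{m.group(2)}"))
    # Collect ordinary addresses.
    for m in PLAIN.finditer(text):
        found.append((m.start(), m.group(0)))
    # Stable sort by position keeps grouped users in their written order.
    found.sort(key=lambda pair: pair[0])
    return [email for _, email in found]


header = "{jdoe, asmith}@cs.example.edu and bob.lee@example.org"
emails = extract_emails(header)
```

The domain part of each address (everything after `@`) is what would then be mapped to a university.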
15. NLP RANKINGS – AUTHOR-UNIVERSITY MATCHING
• The order in which email addresses are presented might not match the order of the authors
• A list of email addresses is pseudo-generated from the authors’ names under the following conventions
  • firstname lastname
  • f (m) lastname
  • lastname f (m)
  • firstname
  • lastname
  • f (m) l
• Match authors and email addresses by Levenshtein distance
  • Start with the minimum of the matrix
Levenshtein distance matrix (rows: extracted emails; columns: pseudo-generated emails)
Email 1: 20 15 20 20 16 20 14 4 13 10 1 12 15 18 18 15 16 13
Email 2: 15 18 16 16 16 16 15 18 20 18 16 18 7 8 15 5 1 7
Email 3: 7 6 12 8 2 10 17 13 20 18 14 17 18 16 15 16 15 18
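The matching step above can be sketched as follows. This is an illustration under stated assumptions: the slides only say to start with the minimum of the matrix, so the greedy loop that keeps taking the smallest remaining distance is an assumed continuation, and the function names and sample names/addresses are hypothetical. Only the local part of each address (before `@`) is compared.

```python
def levenshtein(a, b):
    """Classic dynamic-programming edit distance between strings a and b."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost))
        prev = curr
    return prev[-1]


def candidates(name):
    """Pseudo-generate email local parts from a full name (six conventions)."""
    parts = name.lower().split()
    first, last = parts[0], parts[-1]
    # First initial plus any middle initials: the "f (m)" in the conventions.
    initials = first[0] + "".join(p[0] for p in parts[1:-1])
    return [first + last, initials + last, last + initials,
            first, last, initials + last[0]]


def match(authors, emails):
    """Greedily pair authors with emails, smallest distance first (assumed)."""
    cells = []
    for ei, email in enumerate(emails):
        local = email.split("@")[0].lower()
        for ai, author in enumerate(authors):
            d = min(levenshtein(local, c) for c in candidates(author))
            cells.append((d, ei, ai))
    cells.sort()
    pairs, used_emails, used_authors = {}, set(), set()
    for d, ei, ai in cells:
        if ei not in used_emails and ai not in used_authors:
            pairs[authors[ai]] = emails[ei]
            used_emails.add(ei)
            used_authors.add(ai)
    return pairs


authors = ["Jane Q Doe", "Alan Smith", "Bob Lee"]
emails = ["asmith@x.example.edu", "lee@y.example.edu", "jqdoe@z.example.edu"]
pairs = match(authors, emails)
```

Because each pairing removes one row and one set of columns from consideration, every author receives at most one address even when the emails are listed in a different order from the names.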
16. NLP RANKINGS – SCORING MECHANISM
• Different publication conferences and venues carry different weights
  • major venues (ACL, CL, EMNLP, TACL, NAACL): 3
  • other conferences: 2
  • workshops/demonstrations: 1
• Credit for each publication is evenly distributed among all authors
  • each author receives a score of $\frac{w}{a}$ for each publication (venue weight $w$ divided by the number of authors $a$)
• Institutional score = sum of the scores of authors who dedicate their work to the institution
• Students’ contributions also count (not just those of faculty)
• If an author moves, their previous score is not transferred to the new institution
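The scoring rule can be illustrated with a short sketch. The venue weights come from the slide; the record layout (a venue code plus one affiliation per author), the function names, and the placeholder institutions are assumptions for the example. Storing the affiliation on each publication means a score stays with the institution where the work was done, which matches the rule that scores do not move with an author.

```python
from collections import defaultdict

# Venue weights as listed on the slide.
MAJOR_VENUES = {"ACL", "CL", "EMNLP", "TACL", "NAACL"}
WORKSHOP_VENUES = {"WS"}


def venue_weight(venue):
    """Return 3 for major venues, 1 for workshops/demos, 2 otherwise."""
    if venue in MAJOR_VENUES:
        return 3
    if venue in WORKSHOP_VENUES:
        return 1
    return 2


def institution_scores(publications):
    """Sum per-author shares (w / a) by institution.

    publications: iterable of (venue, affiliations), where affiliations
    holds one institution name per author (hypothetical record layout).
    """
    scores = defaultdict(float)
    for venue, affiliations in publications:
        share = venue_weight(venue) / len(affiliations)
        for institution in affiliations:
            scores[institution] += share
    return dict(scores)


# Placeholder data, not taken from the slides.
publications = [
    ("ACL", ["Univ A", "Univ A", "Univ B"]),  # 3 / 3 = 1.0 per author
    ("COLING", ["Univ B"]),                   # 2 / 1 = 2.0
    ("WS", ["Univ A", "Univ C"]),             # 1 / 2 = 0.5 per author
]
scores = institution_scores(publications)
```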
21. ANALYSIS – UNIVERSITY-LEVEL ANALYSIS
• Carnegie Mellon University remained 1st for the past ten years
• Ranking score gaps are more significant between top universities
• Top universities remained largely competitive over time
• Most top 50 universities showed an upward movement in ranking year over year
  • the average rank change between 2010 and 2019 is 15.52
22. ANALYSIS – UNIVERSITY-LEVEL ANALYSIS
Rank | Institution | 2010 | 2011 | 2012 | 2013 | 2014 | 2015 | 2016 | 2017 | 2018 | 2019
1 | Carnegie Mellon University | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1
2 | University of Washington | 5 | 6 | 10 | 4 | 7 | 3 | 3 | 3 | 2 | 2
3 | Stanford University | 7 | 8 | 5 | 3 | 2 | 2 | 2 | 4 | 3 | 4
4 | Johns Hopkins University | 10 | 5 | 6 | 6 | 6 | 4 | 5 | 2 | 4 | 3
5 | Columbia University | 4 | 2 | 2 | 2 | 3 | 5 | 6 | 15 | 23 | 12
6 | Massachusetts Institute of Technology | 9 | 11 | 13 | 10 | 10 | 6 | 6 | 6 | 7 | 5
7 | University of Illinois at Urbana-Champaign | 2 | 10 | 9 | 7 | 12 | 10 | 4 | 7 | 12 | 8
8 | University of California, Berkeley | 3 | 9 | 8 | 8 | 9 | 9 | 8 | 5 | 19 | 21
9 | University of Pennsylvania | 11 | 13 | 3 | 19 | 13 | 13 | 13 | 10 | 5 | 6
10 | University of Maryland | 8 | 7 | 4 | 12 | 5 | 8 | 10 | 9 | 14 | 13
23. ANALYSIS – UNIVERSITY TREND CLUSTERING
• Hierarchical cluster analysis is used to cluster universities by their similarity in trends, using the Ward variance minimization algorithm
• Grouped into 3 major clusters
  • Red: 26 high-tier universities (top 30)
  • Blue: Carnegie Mellon University
  • Green: 189 mid-lower tier universities
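Ward's criterion can be shown with a small dependency-free sketch. It is not the code behind the slide: the function name, the stopping rule (merge until `k` clusters remain), and the toy trend vectors are assumptions. At each step it merges the two clusters whose union increases the within-cluster sum of squares the least, which for clusters of sizes n_a and n_b with centroids c_a and c_b is n_a·n_b/(n_a+n_b)·‖c_a − c_b‖².

```python
def ward_clusters(points, k):
    """Agglomerative clustering with Ward's criterion until k clusters remain.

    points: list of equal-length numeric sequences (e.g. yearly scores).
    Returns a sorted list of clusters, each a sorted list of point indices.
    """
    clusters = [{"members": [i], "centroid": list(p)}
                for i, p in enumerate(points)]

    def merge_cost(a, b):
        # Increase in within-cluster sum of squares if a and b are merged.
        na, nb = len(a["members"]), len(b["members"])
        sq = sum((x - y) ** 2 for x, y in zip(a["centroid"], b["centroid"]))
        return na * nb / (na + nb) * sq

    while len(clusters) > k:
        # Pick the pair of clusters that is cheapest to merge.
        i, j = min(
            ((i, j) for i in range(len(clusters))
             for j in range(i + 1, len(clusters))),
            key=lambda ij: merge_cost(clusters[ij[0]], clusters[ij[1]]),
        )
        a, b = clusters[i], clusters[j]
        na, nb = len(a["members"]), len(b["members"])
        merged = {
            "members": a["members"] + b["members"],
            # Size-weighted mean of the two centroids.
            "centroid": [(na * x + nb * y) / (na + nb)
                         for x, y in zip(a["centroid"], b["centroid"])],
        }
        clusters = [c for idx, c in enumerate(clusters)
                    if idx not in (i, j)] + [merged]
    return sorted(sorted(c["members"]) for c in clusters)


# Toy score trends (hypothetical), one row per university.
trends = [[1, 1, 1], [1, 2, 1], [10, 10, 10], [10, 11, 10], [50, 52, 55]]
groups = ward_clusters(trends, k=3)
```

For real data, a library routine such as SciPy's `scipy.cluster.hierarchy.linkage` with `method="ward"` would be the more practical choice; the loop above only makes the merge rule explicit.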
24. ANALYSIS – SUB-CLUSTER EXAMPLE
¡ Very similar ranking score trend
¡ Could possibly suggest similar research interest
¡ Information Sciences Institute
¡ Kevin Knight: NLP, machine translation, automata theory and decipherment
¡ University of California, Berkeley
¡ Dan Klein: Unsupervised language acquisition, Machine translation, Information extraction
¡ University of Illinois at Urbana-Champaign
¡ Dan Roth: Artificial Intelligence, natural language understanding
¡ University of Maryland
¡ Philip Resnik: Machine translation, Computational social science, Computational psycholinguistics and neurolinguistics
25. ANALYSIS – AUTHOR-LEVEL ANALYSIS
¡ Author ranking is calculated by summing the publication scores of each author
¡ Universities that these top NLP authors have worked for or are working at
1. Carnegie Mellon University
2. Stanford University
3. Columbia University
¡ Universities with only one or two top 100 NLP authors
¡ Language, Information, and Learning lab at Yale (LILY)
¡ Dragomir Radev (Top 100 NLP Author, NLP Faculty at Yale)
¡ Brown Laboratory for Linguistic Information Processing (BLLIP)
¡ Eugene Charniak (Top 100 NLP Author, NLP Faculty at Brown)
26. ANALYSIS – AUTHOR-LEVEL ANALYSIS
Universities Attended by Top 100 NLP Authors
27. ANALYSIS – WEIGHT-CONTRIBUTION INDEX
¡ Rankings as a sum of scores may be deceiving because they are computed at the university level
¡ Individual performance is measured at the author level
¡ Jorge E. Hirsch (2005): h-index
¡ The largest h such that h papers each have at least h citations
¡ Encourages a large number of high-quality publications
¡ Citations can be misleading
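For reference, the h-index definition above can be written as a short function. The citation counts in the usage lines are hypothetical.

```python
def h_index(citations):
    """Largest h such that at least h papers have h or more citations each."""
    ranked = sorted(citations, reverse=True)
    h = 0
    for position, count in enumerate(ranked, start=1):
        if count >= position:
            h = position
        else:
            break
    return h


if __name__ == "__main__":
    # Hypothetical citation counts for two authors.
    print(h_index([10, 8, 5, 4, 3]))
    print(h_index([25, 8, 5, 3, 3]))
```

The second example shows why citations can mislead: one very highly cited paper barely moves the index.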
¡ Weight-contribution index (wc-index) for NLP Rankings
¡ Different publication venues carry different weights (some are more major than others)
¡ Score for each author in each publication = a fraction (numerator and denominator not legible in the source)
¡ The index is the number of papers with score > 1
¡ Identifies how active and independent researchers are
¡ Shows the behavior and current status of researchers
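A minimal sketch of the wc-index follows. It assumes the per-publication score is the venue weight divided by the number of authors; that formula is a guess, since the terms of the fraction on the slide are not legible, and the weights and papers below are hypothetical.

```python
def author_score(venue_weight, num_authors):
    """ASSUMED score: venue weight shared equally among a paper's authors."""
    return venue_weight / num_authors


def wc_index(papers, threshold=1.0):
    """Count the papers whose per-author score is strictly above the threshold."""
    return sum(
        1
        for venue_weight, num_authors in papers
        if author_score(venue_weight, num_authors) > threshold
    )


# Hypothetical (venue weight, number of authors) pairs for one researcher.
EXAMPLE_PAPERS = [(3, 2), (3, 3), (1, 1), (3, 1), (2, 1)]

if __name__ == "__main__":
    print(wc_index(EXAMPLE_PAPERS))
```

Under this assumption, papers at heavily weighted venues with few co-authors count toward the index, which fits the stated goal of identifying active and independent researchers.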
28. ANALYSIS – WEIGHT-CONTRIBUTION INDEX
                                       wc-index
Rank  Name                    h-index*  2015 2016 2017 2018 2019
1     Dan Roth                      50     5   13   17   25   31
2     Noah A. Smith                 53     7    8   11   17   23
3     Dan Klein                     47     6    9   18   23   26
4     Christopher D. Manning        90     6   12   15   18   20
5     Eduard Hovy                   54     2    5    6    7   12
6     Mohit Bansal                  30     1    4    7   20   30
7     Vincent Ng                    30     5    8   10   11   11
8     Luke Zettlemoyer              45     4    7    7   10   16
9     Claire Cardie                 43     2    3    7   13   16
10    Graham Neubig                 32     0    2    5   13   18
11    Chris Dyer                    53     5    5    5    5    5
12    Heng Ji                       38     0    2    3    3    5
13    Kevin Knight                  42     3    7    9   11   11
14    William Yang Wang             24     3    3    6   13   19
15    Jason Eisner                  32     4    9   12   16   19
16    Regina Barzilay                      5    8   10   12   14
17    Mona Diab                     33     0    1    1    4    4
18    Dan Jurafsky                  63     3    5    7    7    7
19    Nizar Habash                  33     2    3    4    7    9
20    Jordan Boyd-Graber            33     2    5    8   10   14
21    Kathleen McKeown              33     4    6    8    9   14
22    Mark Dredze                   49     8   11   11   13   14
23    Percy Liang                   45     4   10   12   15   18
24    Rada Mihalcea                        2    4    7    8   10
25    Yejin Choi                    38     0    2    4    5    5
26    Kevin Gimpel                  27     1    1    4    6    9
27    Jacob Eisenstein              30     6   10   12   15   15
28    Tom Mitchell                  54     4    6    9   11   13
29    Yang Liu                      26     4    5    5    5    5
30    Bing Liu                      69     2    3    5    5    5
joined Google, his recent publications (after
2017) are under Google’s authorship
29. ANALYSIS – USER ANALYSIS
¡ February 12, 2020 - March 28, 2020 (46 days)
¡ total of 3,913 accesses
¡ 1,219 distinct IP addresses
¡ Time period viewed
¡ 97.3% viewed the default 2010-2019
¡ 2015 and 2016 are the start years that are checked the most, followed by 2018 and 2017
¡ Interested in most recent years
¡ Suggests beginning of the emerging interest in NLP
30. ANALYSIS – USER ANALYSIS
¡ Weight Customization
¡ 99.2% use the default weights
¡ Suggests that users agree with the proposed values
¡ Re-Visit Frequency (Of the 1,219 unique IP addresses)
¡ 73.9% of the users only viewed the site once
¡ 18.7% used it twice
¡ 3.0% used it on three different days
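The re-visit frequencies above could be derived from access logs roughly as follows. This sketch assumes each log record reduces to an (IP address, date) pair and that a "visit" means a distinct day on which an IP accessed the site; the record layout and sample values are hypothetical.

```python
from collections import Counter


def revisit_histogram(accesses):
    """Map number of distinct visit days -> number of IP addresses."""
    days_per_ip = {}
    for ip, day in accesses:
        days_per_ip.setdefault(ip, set()).add(day)
    return Counter(len(days) for days in days_per_ip.values())


def revisit_shares(accesses):
    """Share of unique IP addresses for each number of distinct visit days."""
    histogram = revisit_histogram(accesses)
    total = sum(histogram.values())
    return {days: count / total for days, count in sorted(histogram.items())}


# Hypothetical access records: (IP address, ISO date).
SAMPLE_ACCESSES = [
    ("192.0.2.1", "2020-02-12"),
    ("192.0.2.1", "2020-02-12"),  # same day, so still one visit day
    ("192.0.2.2", "2020-02-12"),
    ("192.0.2.2", "2020-03-01"),
    ("192.0.2.3", "2020-02-20"),
    ("192.0.2.4", "2020-03-28"),
]

if __name__ == "__main__":
    print(revisit_shares(SAMPLE_ACCESSES))
```

Counting distinct days rather than raw accesses keeps a single busy session from looking like repeated use.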
Histogram of unique IP re-visit frequency
32. CONCLUSION
¡ NLP Rankings is a tool to evaluate and identify NLP programs in the United States
¡ Proposes different methods to evaluate NLP programs and researchers
33. NEXT STEPS
1. Platform running time is relatively short
¡ With a longer time horizon, further analysis can be conducted to reevaluate the usefulness of the NLP Rankings platform
¡ Especially during application seasons in the Fall
2. Cluster Analysis and Topic Modeling
¡ Research interest and focus at different universities are also important factors
¡ Identify main NLP research interests at each institution
3. Trend Analysis and Topic Modeling
¡ Identify trending research topics over the past decade
34. REFERENCES
¡ A. F. M. A. Al-Juboori, Y. Na and F. Ko, "University ranking and evaluation: Trend and existing approaches," The 2nd International Conference on Next Generation Information Technology, Gyeongju, 2011, pp. 137-142.
¡ Hirsch, J. E. (2005), An index to quantify an individual's scientific research output. Proceedings of the National Academy of Sciences Nov 2005, 102 (46) 16569-16572.
¡ Isidro F. Aguillo, José Luís Ortega & Mario Fernández (2008) Webometric Ranking of World Universities: Introduction, Methodology, and Future Developments, Higher Education in Europe, 33:2-3, 233-244.
¡ Jin, B., Liang, L., Rousseau, R. et al. The R- and AR-indices: Complementing the h-index. CHINESE SCI BULL 52, 855–863 (2007).
¡ McPherson, Michael A. (2012), Ranking U.S. Economics Programs by Faculty and Graduate Publications: An Update Using 1994–2009 Data. Southern Economic Journal: July 2012, Vol. 79, No. 1, 71–89.
¡ Morse, Robert. “How U.S. News Calculated the 2021 Best Graduate Schools Rankings.” U.S. News, https://www.usnews.com/education/best-graduate-schools/articles/how-us-news-calculated-the-rankings.
¡ “NTU Ranking – Indicators.” NTU Ranking, http://nturanking.lis.ntu.edu.tw/methodoloyg/indicators.
¡ “QS World University Rankings – Methodology.” QS World University Rankings, https://www.topuniversities.com/qs-world-university-rankings/methodology.