(i.e., a single concept can exist in more than one place in different contexts). Within the hierarchy, each meaning, called a concept in MeSH terminology, is represented by its Unique ID, its MeSH heading (default name), and its Tree Numbers (contexts). The tree numbers are in fact all the distinct drill paths leading from the root of the hierarchy to the concept. Each tree number denotes a level and encompasses all levels below it. Every journal article is indexed with about 10-15 descriptors, allowing us to compare researchers with similar numbers of publications. We do, however, eliminate polysemous descriptors (those that occur in multiple tree paths) so as not to include unrelated research areas.
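The tree-number handling described above can be sketched as follows. This is a minimal illustration, not the authors' code: it assumes tree numbers are dot-separated strings, and the descriptor names and tree numbers in the usage lines are made up for the example rather than taken from a MeSH release. The function names `ancestors` and `drop_polysemous` are hypothetical.

```python
def ancestors(tree_number):
    """Return every enclosing level of a dotted tree number, root first.

    Assumption: a tree number is a dot-separated path, so each proper
    prefix of the path names a level that encompasses it.
    """
    parts = tree_number.split(".")
    return [".".join(parts[:i]) for i in range(1, len(parts))]


def drop_polysemous(descriptor_trees):
    """Keep only descriptors that occur in exactly one tree path.

    descriptor_trees maps a descriptor name to its list of tree numbers.
    Descriptors with several tree numbers (polysemous) are discarded.
    """
    return {
        name: trees[0]
        for name, trees in descriptor_trees.items()
        if len(trees) == 1
    }


# Illustrative toy data only; names and numbers are invented.
toy = {
    "descriptor_x": ["C04.557.337"],
    "descriptor_y": ["A01.456.505", "A09.371"],  # two contexts: polysemous
}
kept = drop_polysemous(toy)
levels = ancestors("C04.557.337")
```

Here `kept` retains only `descriptor_x`, and `levels` lists the two enclosing levels of its tree number.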
Measuring Researcher Diversity and its Impact on Awards
Tanu Malik, Computation Institute
Andrey Rzhetsky, Department of Human Genetics
Ian Foster, Computation Institute
University of Chicago, Argonne National Laboratory
History
• Leonardo da Vinci: Renaissance polymath, painter, sculptor, architect, musician, mathematician, engineer, inventor, anatomist, geologist, cartographer, botanist, and writer
• Bohr, Darwin, Einstein
Biological species
• Short term: competition
• Long term: changing environments
• Figure labels: Competition, Niche Differences
Adapted from: Levine, J. M. & HilleRisLambers, J. (2012) The Maintenance of Species Diversity. Nature Education Knowledge 3(10):59
Biological species: Specialist/Generalist
• Short term: competition
• Long term: changing environments
• Figure labels: Competition, Niche Differences
Adapted from: Levine, J. M. & HilleRisLambers, J. (2012) The Maintenance of Species Diversity. Nature Education Knowledge 3(10):59
Science Research
• Short term: competition on topics
• Long term: changing funding situation
• Figure labels: Competition, Niche Differences
Why is this important?
• Research articles whose coauthors are in different departments at the same university receive more citations than those authored in a single department (Katz et al., 1997).
• Multi-university collaborations that include a top-tier university were found to produce the highest-impact research articles (Jones et al., 2008).
• It has also been demonstrated that scholarly work covering a range of fields, and patents generated by larger teams of co-authors, tend to have greater impact over time (Wuchty et al., 2007).
• In the area of nanotechnology, authors who have a diverse set of collaborators tend to write articles that have higher impact (Rafols et al., 2010).
• Finally, diverse groups can, depending on the type of task, outperform individual experts or even groups of experts (Page, 2007).
Individual Focus
• Some mathematicians are birds, others are frogs. Birds fly high in the air, frogs live in the mud below. (Freeman Dyson, AMS Einstein Lecture, 2008)
• “Foxes”, individuals who know many little things, tend to make better predictions about future outcomes than “hedgehogs”, who focus on one big thing (Tetlock, 2005)
• Individuals’ degree of focus is positively correlated with the quality of their contributions (Adamic, 2010)
Goals and Problems
• Goal: Quantify the ability of each class of researchers (specialists/generalists) to compete in the near term and to adapt to changing funding requirements in the long term.
  A. How to determine specialist and generalist researchers?
  B. How to quantify the ability to compete/adapt?
A. Researcher Diversity
• Based on researchers' publication history, determine whether their interests are highly varied or focused
• Researcher profiles created from PubMed
Creating Researcher Profiles
• Author Disambiguation
  – Data mining methods
  – Microsoft Academic Search
    • Automated profiles of users
    • Web scraping
    • Person’s organization and domain of interest as disambiguating features
  – Harvard Profiles
    • Directly links to PubMed
    • Also takes an input of publications claimed by an author
Controlled Vocabulary
• Medical Subject Headings (MeSH)
  – Poly-hierarchy of 25,186 medical concepts
Researcher Diversity
• Shannon’s Entropy: $H = -\sum_i p_i \log_2 p_i$
  – $p_i$: proportion of an individual’s contributions in category $i$
  – Category = MeSH term
  – Frequency over years
• Figure: example entropy values 0.00, 0.41, 0.82, 1.00, 1.59, 2.00
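As a minimal sketch of the entropy measure on this slide, the following computes Shannon's entropy over the MeSH terms attached to one researcher's publications. The base-2 logarithm is an assumption, chosen because it matches the example values 1.00, 1.59, and 2.00 for two, three, and four equally used categories. The function name `shannon_entropy` and the placeholder terms are hypothetical.

```python
from collections import Counter
from math import log2


def shannon_entropy(terms):
    """Shannon entropy (in bits) of a list of category labels.

    p_i is the share of the researcher's term occurrences that fall
    in category i; categories with zero count never appear in Counter.
    """
    counts = Counter(terms)
    total = sum(counts.values())
    if total == 0:
        return 0.0
    return sum((c / total) * log2(total / c) for c in counts.values())


# Placeholder labels standing in for MeSH terms.
focused = shannon_entropy(["t1", "t1", "t1", "t1"])
two_way = shannon_entropy(["t1", "t2"])
four_way = shannon_entropy(["t1", "t2", "t3", "t4"])
```

A researcher whose work falls under a single term gets entropy 0, and the value grows as contributions spread more evenly over more terms.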
Shannon vs. Stirling
• Variety: how many different areas an individual contributes to
• Balance: how evenly their efforts are distributed among these areas
• Similarity: how related those areas are
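Shannon's entropy captures variety and balance but not similarity. One common way to fold similarity in is a Rao-Stirling style index, the sum over ordered pairs of distinct categories of p_i * p_j * d_ij. The sketch below is an illustration under stated assumptions, not the measure used in the talk: the distance `tree_distance`, based on the shared prefix of two tree numbers, is a hypothetical choice, and the tree numbers in the usage lines are invented.

```python
def tree_distance(a, b):
    """Hypothetical distance between two dotted tree numbers.

    1 minus the fraction of leading levels they share, so identical
    paths give 0 and paths in different top-level trees give 1.
    """
    pa, pb = a.split("."), b.split(".")
    shared = 0
    for x, y in zip(pa, pb):
        if x != y:
            break
        shared += 1
    return 1 - shared / max(len(pa), len(pb))


def rao_stirling(proportions, distance=tree_distance):
    """Sum of p_i * p_j * d_ij over ordered pairs with i != j."""
    items = list(proportions.items())
    return sum(
        pi * pj * distance(i, j)
        for i, pi in items
        for j, pj in items
        if i != j
    )


# Same variety and balance, different similarity (invented tree numbers).
related = rao_stirling({"C04.557": 0.5, "C04.588": 0.5})
unrelated = rao_stirling({"C04.557": 0.5, "A01.236": 0.5})
```

Both profiles have the same Shannon entropy, but the profile spanning two unrelated trees scores higher because its categories are farther apart.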
B. Quantifying the Ability to Compete
• Entropy has a negative correlation with measures of impact and productivity, viz. the h-index and the g-index.
• This result, in a way, reconfirms Adamic’s result of a positive correlation between specialization and productivity.
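For reference, the two impact measures and the correlation can be sketched as below. The definitions are the standard ones: h is the largest number such that h papers have at least h citations each, and g is the largest number such that the top g papers together have at least g squared citations (this variant caps g at the number of papers). The choice of Pearson correlation is an assumption, since the slide does not say which coefficient was used.

```python
from math import sqrt


def h_index(citations):
    """Largest h such that h papers have at least h citations each."""
    ranked = sorted(citations, reverse=True)
    h = 0
    for rank, cites in enumerate(ranked, start=1):
        if cites >= rank:
            h = rank
    return h


def g_index(citations):
    """Largest g (at most the paper count) such that the top g papers
    have at least g**2 citations in total."""
    ranked = sorted(citations, reverse=True)
    g, running = 0, 0
    for rank, cites in enumerate(ranked, start=1):
        running += cites
        if running >= rank * rank:
            g = rank
    return g


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)
```

In use, one would compute an entropy value and an h-index (or g-index) per researcher and pass the two lists to `pearson`; a negative coefficient corresponds to the relationship stated on the slide.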
Geniuses, Birds, Beavers, Frogs
• Geniuses: dwell on many topics at all times (8-9)
• Birds: dwell on many topics over their research career, but only a few topics at any given time
• Beavers: specialists whose focus is interdisciplinary
• Frogs: highly focused
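The difference between dwelling on many topics at all times and dwelling on many topics only across a career suggests comparing entropy over the whole career with entropy within each year. The sketch below is one hypothetical way to compute both numbers; it does not reproduce the talk's classification, it sets no thresholds, and the year keys and term labels are placeholders.

```python
from collections import Counter
from math import log2


def entropy(terms):
    """Shannon entropy (bits) of a list of category labels."""
    counts = Counter(terms)
    total = sum(counts.values())
    if total == 0:
        return 0.0
    return sum((c / total) * log2(total / c) for c in counts.values())


def career_and_yearly_entropy(terms_by_year):
    """Return (entropy over all years pooled, mean per-year entropy).

    terms_by_year maps a year to the list of terms used that year.
    """
    pooled = [t for terms in terms_by_year.values() for t in terms]
    yearly = [entropy(terms) for terms in terms_by_year.values()]
    mean_yearly = sum(yearly) / len(yearly) if yearly else 0.0
    return entropy(pooled), mean_yearly


# Placeholder profiles: one topic per year versus both topics every year.
bird_like = career_and_yearly_entropy({2001: ["t1", "t1"], 2002: ["t2", "t2"]})
genius_like = career_and_yearly_entropy({2001: ["t1", "t2"], 2002: ["t1", "t2"]})
```

Both placeholder profiles have the same career-level entropy, but only the second keeps a high entropy within each year, which is the pattern the slide attributes to geniuses.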
Future Work
• Larger datasets
• Researchers in the long tail are specialists; generalists are in the head of the distribution
Summary
• A framework to understand researcher diversity
• Quantification of researcher diversity with productivity and awards
• Negative correlation of diversity with productivity and positive correlation with awards
• Use more accurate author disambiguation methods