
What's in a Scientist name?


Applying onomastics in scientometrics.
Presentation at IREG Symposium on Academic Excellence.

Our friend Tania Vichnevskaia of the French National Institute for Health (INSERM) presented the paper ‘Applying onomastics to scientometrics’ yesterday at the IREG international symposium organised by the University of Maribor and Shanghai Jiao Tong University.

In 2014, NamSor, a private start-up company, was approached by a European country to help measure the ‘brain drain’ affecting its competitiveness in the BioTech sector and to produce a global map of its scientific Diaspora (who they are, where they are and what they are doing). The objective was to build up the country’s international scientific cooperation and to engage its Diaspora.

Serendipity led the analysts to discover interesting patterns in the way scientists’ names relate to co-authorship and citation, not just for this particular country but globally.

Last year, at the ICOS2014 conference at Glasgow University, we presented how data mining millions of scientific articles in the PubMed/PMC life sciences database uncovered striking patterns in the way scientists’ names correlate with whom they publish with and whom they cite in their papers.

We were interested in mining the large commercial bibliographic databases (Thomson WoS, Scopus) because, compared to PubMed, they offer better data quality on citations and useful additional information:

– firstly, they have the full name in addition to the short name cited with just initials; this significantly reduces the error rate of onomastic classification

– secondly, they link scientists to research institutions (affiliations) and geographies (country of affiliation); this allows additional analysis on the topic of Diasporas and brain drain, comparing, for example, the research output of Chinese / Chinese American scientists in the US with that of scientists in Mainland China;

– thirdly, those databases have a larger coverage in terms of scientific disciplines, allowing comparison between different fields of research.
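
The second point can be illustrated with a minimal sketch. It assumes each author record already carries a country of affiliation and an onomastic class assigned by some classifier. The records, field layout and counts below are hypothetical and are not taken from the study.

```python
from collections import Counter

# Hypothetical author records:
# (country of affiliation, onomastic class, number of publications).
# Illustrative values only; they do not come from the study.
records = [
    ("US", "Chinese", 3),
    ("US", "British", 5),
    ("China", "Chinese", 4),
    ("US", "Chinese", 2),
]


def output_by_group(rows):
    """Sum publication counts per (affiliation country, onomastic class) pair."""
    totals = Counter()
    for country, onomastic_class, n_pubs in rows:
        totals[(country, onomastic_class)] += n_pubs
    return totals


totals = output_by_group(records)

# Compare Chinese-named authors affiliated in the US with those in China.
print("US / Chinese:", totals[("US", "Chinese")])
print("China / Chinese:", totals[("China", "Chinese")])
```

Grouping on the pair rather than on the country alone is what separates a Diaspora from the host country's total output.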

So a collaboration started between NamSor and bibliometric experts at INSERM, the French National Institute for Health, to evaluate and visualize the effects of migration, Diaspora engagement and possibly cultural biases in science.

Published in: Data & Analytics

  1. APPLYING ONOMASTICS TO SCIENTOMETRICS. Tania Vichnevskaia, French National Institute for Health, 2015-01-19
  2. Cultural bias and diaspora through publications
     • Applying onomastics to approach cultural bias in research
     • Analysis of International Cancer Research publications through co-authorship and inter-citation
  3. Two approaches combined: onomastics & bibliometrics
     • Onomastics is the study of the origin, history, and use of proper names
     • Bibliometrics is a statistical and structural analysis of written publications:
       • Quantitative: number of publications, number of citations, ranking of the top 1% or 10% of cited publications
       • Structural: configuration of co-authorship and inter-citation on different levels: Authorship/University/Country
  4. YOU are ALL onomasticians! Active participants at the IREG conference:
     • Prof. Gero Federkeil, CHE (Coordinator of Multi-Ranking)
     • Prof. Nian Cai Liu, Jiao Tong University in Shanghai
     • Prof. Seeram Ramakrishna, The National University of Singapore
     • Prof. Santo Fortunato, Aalto University
     • Prof. Karin Stana Kleinschek, The University of Maribor
     • Prof. Henryk Ratajczak, member of Czech Academy of Sciences
     • Prof. Edvard Kobal, Slovenian Science Foundation
     • Roberta Sinatra, PhD, Northeastern University
     • Tania Vichnevskaia, French National Institute for Health (INSERM)
     • Prof. Andrée Sursock, Senior Adviser at EUA
     • Prof. Øivind Andersen, The University of Oslo
     What can YOU tell from the names?
  5. Data in Cancer Research
     • China: The Fudan University Cancer Hospital
     • USA: The Dana-Farber Cancer Institute
     • France: French Comprehensive Cancer Centers (FCCC)
     • Japan: National Cancer Center
     • Poland: All applicable institutes
     • Slovenia: All applicable institutes
  6. Thomson Web of Science
     • For this study, we used the Thomson Web of Science database
     • Web of Science provides access to the world's leading citation databases. Multidisciplinary content covers over 12,000 of the highest impact journals.
     • Thomson Web of Science contains information on scientists' names, affiliation and citation (possibility to qualify Diaspora)
     • In the corpus:
       • 15k articles, 68k authors
       • cited: 32k articles, 168k authors
       • 17 million citing-author / cited-author occurrences
  7. Name origin recognition: does it work?
     Matrix with affiliation (row) and onomastic class (column). Excerpt as shown on the slide, which elides part of the matrix with '... ...':

     Affiliation      French  British  Chinese  Polish  German  Japanese  Italian
     US                 2358     7072     2406     430    2384       680     1243
     France             9375      318       88      93     370        42      607
     China                13       35     5690       1      19        76       30
     Poland               23       16        3    5038     174         5       13
     Great Britain       191     1786       57      52     145        28      115
     Japan                 9       10       26       7       7      3739       10
     Germany             121      102       22      92    1819         9       60
     Italy                83        4        1      11      31         3     2026

     • The USA was considered as a ‘melting pot’ of all origins, not having an onomastic class of its own.
     • Name origin recognition precision varies.
     • We can establish the strong relationship between affiliation and ‘onomastic classes’ for China, Japan, Poland, Slovenia.
     • Ex: 84% of scientists affiliated to Poland (row ‘Poland’) have Polish names (column ‘Polish’).
  8. Percentage of publications and citations
     • Chart: Percentage of Cancer Research publications in the world
     • Chart: Percentage of Cancer Research documents cited relative to world publications
  9. Each corpus has a different profile
     • Chart: Breakdown by co-authors' onomastic classes
  10. Chinese scientists in Cancer Research: Fudan, China. Who do they cite?
     • Authors in the Chinese corpus, affiliated to China, with Chinese or Taiwanese names
     • Most cited countries of affiliation: 1) US 2) China 3) Japan
     • Most cited onomastic class: 1) Chinese (Chinese scientists in the US) 2) British 3) Japanese
  11. Chinese scientists in Cancer Research: Dana-Farber, US. Who do they cite?
     • Scientists in the US corpus, affiliated to the US, with Chinese or Taiwanese names
     • Most cited countries of affiliation: 1) US 2) Great Britain 3) Japan
     • Most cited onomastic class: 1) British 2) Chinese 3) Japanese
  12. Diaspora: a key ‘hidden’ dimension
     • Ranking by Country & Affiliation would entirely differ from Ranking by Onomastic class (i.e. including Diasporas)
     • Onomastics sheds light on the strong relationship that exists between China and its Diaspora, especially between the institutes of excellence in China and the US
     • So China: Brain Drain or Brain Gain?
  13. Cancer Research in Poland and Slovenia: examining the ‘brain drain’
     • In the Polish corpus, we look at co-authors with Polish names, affiliated abroad. Top countries: 1. US, 2. Great Britain, 3. Germany.
     • In the Slovenian corpus, we look at co-authors with Slovenian names, affiliated abroad. Top countries: 1. Great Britain, 2. US, 3. Germany.
  14. Conclusion
     • Applied onomastics can help view differences in:
       • Co-authorship and citing patterns in various countries
       • Diasporas' citing patterns (ex. Chinese in the US vs. non-Chinese in the US vs. Chinese in China)
       • ‘Brain drain’ and migration patterns
     • The above has a potential impact on ranking, and should be further analysed
     • More questions raised than answered!
  15. Thank you!
     • Tatiana Vichnevskaia, Contact: 
     • About NamSor software and API, Contact: 
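
The affiliation / onomastic class matrix on slide 7 can be explored with a short script. This is a minimal sketch, not the authors' method. The counts are transcribed from the slide, and the helper names `class_shares` and `dominant_class` are hypothetical. Shares are computed over the seven listed classes only, so they may differ from figures quoted on the slide (such as the 84% for Poland) if those were computed over a fuller matrix.

```python
# Onomastic classes (columns) as listed on slide 7.
classes = ["French", "British", "Chinese", "Polish", "German", "Japanese", "Italian"]

# Counts per country of affiliation (rows), transcribed from slide 7.
matrix = {
    "US": [2358, 7072, 2406, 430, 2384, 680, 1243],
    "France": [9375, 318, 88, 93, 370, 42, 607],
    "China": [13, 35, 5690, 1, 19, 76, 30],
    "Poland": [23, 16, 3, 5038, 174, 5, 13],
    "Great Britain": [191, 1786, 57, 52, 145, 28, 115],
    "Japan": [9, 10, 26, 7, 7, 3739, 10],
    "Germany": [121, 102, 22, 92, 1819, 9, 60],
    "Italy": [83, 4, 1, 11, 31, 3, 2026],
}


def class_shares(counts):
    """Return each count as a fraction of the row total (listed classes only)."""
    total = sum(counts)
    return [c / total for c in counts]


def dominant_class(country):
    """Return (class name, share) for the largest onomastic class in a row."""
    shares = class_shares(matrix[country])
    best = max(range(len(classes)), key=lambda i: shares[i])
    return classes[best], shares[best]


for country in matrix:
    name, share = dominant_class(country)
    print(f"{country}: {name} {share:.1%}")
```

Normalising by row makes affiliations of very different sizes comparable, which is what makes the contrast between the US row and the others visible.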