Investigation of Partition Cells as a Structural Basis Suitable for Assessments of Individual Scientists 
STI 2014, Version 04.09.2014 p. 1 
Nadine Rons 
Research Coordination Unit, Vrije Universiteit Brussel (VUB)
Reference domains for specialized entities 
Context & options 
Investigated method: Partition cells 
(journal-based structures smaller than subject categories in global databases) 
I. Closer fit to real publication records? 
II. Level of accuracy compared to customary levels? 
Potential effect 
Remaining issues & questions
To measure, or not to measure ? 
Citation-based figures in assessments of individual scientists ? 
Issues 
Domain-dependent publication and citation behaviour, 
contributions of co-authors, career stage, data accuracy, … 
Context 
More emphasis in research policy on individual excellence 
(e.g. ERC Advanced Grants, peer review based selection) 
-> Need for suitable indicators
Performance compared to … 
… domain related 'standards' or reference values: averages, 
thresholds, …, e.g.: 
– Field normalized citation impact 
e.g. CPP/FCSm, MNCS 
– Threshold for highly cited papers 
e.g. top x%; CSS outstandingly cited papers class 
... for individual scientists 
These require a reference domain delineated more accurately than by 
subject categories, reflecting the appropriate citation 
characteristics at approximately specialty level
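
The two field-normalized indicators named on this slide differ in where the averaging happens. A minimal sketch, assuming the usual definitions (CPP/FCSm as a ratio of sums, MNCS as a mean of per-paper ratios); the function names and the three-paper record are hypothetical and only illustrate the arithmetic:

```python
from statistics import mean


def cpp_fcsm(citations, expected):
    """Ratio of sums: total citations divided by total expected citations."""
    return sum(citations) / sum(expected)


def mncs(citations, expected):
    """Mean of ratios: average over papers of citations / expected citations."""
    return mean(c / e for c, e in zip(citations, expected))


# Hypothetical record of three papers (illustrative numbers, not from the slides).
citations = [10, 2, 0]
expected = [5.0, 4.0, 2.0]  # reference values of each paper's reference domain

print(round(cpp_fcsm(citations, expected), 3))
print(round(mncs(citations, expected), 3))
```

Both indicators depend directly on the expected values, so a change of reference domain (subject category versus a smaller structure) shifts the result for the same publication record.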
Options for approaching a specialty 
Import fine-grained classification schemes maintained by the 
research community 
(e.g. Chemical Abstracts) 
Calculate approximations of specialties using paper-based algorithms 
(involving e.g. bibliographic coupling, co-citation, direct citation, co-words, or a 
combination) 
This paper: 
Use journal-based structures that are smaller than subject 
categories 
(of intermediate size between journals and entire subject categories)
On specialties, journals & subject categories 
• Articles on a given subject are published in a nucleus of 
periodicals more particularly devoted to the subject + with 
smaller productivities in several other groups of journals. 
(Bradford, 1934) 
≈ 
• Journals contain articles on a number of different subjects 
in varying proportions. The journal is too broad a unit of 
analysis to reveal the structure of specialties. (Small, 1974) 
• Journals are assigned to WoS-categories by subjective, 
heuristic methods, incl. journal citation patterns. (Pudovkin & 
Garfield, 2002) 
WANTED: Structures generating 'standards' applicable to specialties.
Smaller journal-based structures 
A partition of a set X is a set of nonempty subsets of X (blocks, parts or cells of 
the partition) such that every element x in X is in exactly one of these subsets. 
[Diagram: overlapping subject categories A and B, with X = A ∪ B divided into Cell A \ B, Cell A ∩ B and Cell B \ A] 
• Higher precision than subject categories + stability of a journal-based 
structure 
• Cell: publications of interest to a specific set of subject categories 
(influencing citation characteristics) 
Each cell of the partition contains all publications associated to exactly 
the same combination of subject categories: 
A only: Cell C_A; A and B: Cell C_A;B; B only: Cell C_B 
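
The definition above can be sketched in a few lines: each article goes into the cell keyed by the exact set of subject categories it carries. This is an illustrative sketch, not the author's implementation; `partition_cells` and the four-article input are hypothetical.

```python
from collections import defaultdict


def partition_cells(article_categories):
    """Group article ids by the exact combination of subject categories."""
    cells = defaultdict(list)
    for article_id, categories in article_categories.items():
        # frozenset makes the combination order-independent and hashable.
        cells[frozenset(categories)].append(article_id)
    return dict(cells)


# Hypothetical articles with the subject categories inherited from their journals.
articles = {
    "a1": {"A"},
    "a2": {"A", "B"},
    "a3": {"B"},
    "a4": {"B", "A"},
}

cells = partition_cells(articles)
for key, members in cells.items():
    print(";".join(sorted(key)), sorted(members))
```

Because every article has exactly one category combination, every article lands in exactly one cell, which is what makes the result a partition.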
From subject categories to partition cells 
JCR Edition 2011                              Science      Social Sciences 
Number of articles                            1145591      132104 
Number of subject categories                  176          56 
  range associated to articles/journals       1-6          1-5 
  mean number associated to articles          1.6          1.4 
% of articles in subject categories 
  with size in range ]10000, 50000]           55.0%        8.6% 
  with size in range ]5000, 10000]            26.8%        24.5% 
  with size in range ]1000, 5000]             17.3%        59.5% 
  with size in range ]500, 1000]              0.7%         6.1% 
  with size in range ]100, 500]               0.2%         1.4% 
Number of partition cells                     1714         458 
% of articles in cells 
  with size in range ]10000, 50000]           21.3%        0.0% 
  with size in range ]5000, 10000]            17.9%        18.7% 
  with size in range ]1000, 5000]             33.6%        41.8% 
  with size in range ]500, 1000]              11.0%        12.3% 
  with size in range ]100, 500]               13.4%        17.3% 
  with size in range ]0, 100]                 2.9%         9.8% 
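
The size-class rows in the table use intervals that are open on the left and closed on the right, so ]100, 500] means 100 < size <= 500. A minimal sketch of how such shares could be computed from a list of cell sizes; the function name and the five cell sizes are hypothetical, not the JCR data:

```python
def share_by_size_class(cell_sizes, bounds):
    """Percentage of all articles sitting in cells whose size is in ]lo, hi]."""
    total = sum(cell_sizes)
    shares = {}
    for lo, hi in bounds:
        # A cell contributes all of its articles to the class its size falls in.
        in_class = sum(size for size in cell_sizes if lo < size <= hi)
        shares[(lo, hi)] = 100.0 * in_class / total
    return shares


# Hypothetical cell sizes (number of articles per cell).
cell_sizes = [12000, 6000, 1500, 400, 100]
bounds = [(10000, 50000), (5000, 10000), (1000, 5000),
          (500, 1000), (100, 500), (0, 100)]

shares = share_by_size_class(cell_sizes, bounds)
for (lo, hi), pct in shares.items():
    print(f"]{lo}, {hi}]: {pct:.1f}%")
```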
Partition cells — Two perspectives 
I. Closer fit to publication records of individual scientists ? 
Sample: ERC Advanced Grants, 1st Call (2008), 2 Panels 
'Mathematical foundations' M (21) 
'Fundamental constituents of matter' F (14) 
Distribution of articles over cells (2000-2007) 
II. Level of accuracy compared to customary levels ? 
Mean expected number of citations per publication 
Threshold number of citations for highly cited publications 
Calculation per cell (2 domains) vs. customary accuracy levels with 
calculation per publication year & per citation window
I. Concentration of real publication records 
Grantee  # Cells  # Articles  Top shares &  Combination of subject categories defining the cell 
                  2000-2007   secondaries 
M1       1        11          100%          Mathematics 
M8       3        14          71%           Physics, Mathematical 
                              21%           Mathematics 
M11      9        27          56%           Mathematics, Applied 
                              19%           Engineering, Multidisciplinary;Mathematics, Interdisciplinary Applications;Mechanics 
M13      4        12          50%           Statistics & Probability 
                              33%           Physics, Mathematical 
M16      8        19          37%           Computer Science, Interdisciplinary Applications;Physics, Mathematical 
                              26%           Mathematics, Applied 
M20      12       22          23%           Computer Science, Artificial Intelligence 
                              14%           Mathematics, Applied 
F1       4        34          74%           Physics, Particles & Fields 
                              21%           Physics, Multidisciplinary 
F2       4        19          63%           Astronomy & Astrophysics;Physics, Particles & Fields 
                              21%           Physics, Multidisciplinary 
F3       8        117         50%           Optics;Physics, Atomic, Molecular & Chemical 
                              42%           Physics, Multidisciplinary 
F5       10       85          47%           Physics, Multidisciplinary 
                              18%           Multidisciplinary Sciences 
F6       9        98          46%           Optics 
                              28%           Physics, Multidisciplinary 
F13      14       69          29%           Physics, Fluids & Plasmas 
                              25%           Physics, Multidisciplinary 
Data sourced from Thomson Reuters Web of Knowledge (formerly referred to as ISI Web of Science). Web of Science (WoS) accessed online 04-21.10.2013.
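
The shares in this table are the fractions of a grantee's articles that fall in each cell, largest first. A minimal sketch for grantee M8; the per-cell article counts 10, 3 and 1 are an assumption inferred from the reported 14 articles and the 71%, 21% and 7% shares, and `cell_shares` is a hypothetical helper:

```python
from collections import Counter


def cell_shares(cell_per_article):
    """Return (cell, rounded percentage of articles), largest share first."""
    counts = Counter(cell_per_article)
    total = len(cell_per_article)
    return [(cell, round(100 * n / total)) for cell, n in counts.most_common()]


# One entry per article, naming the cell the article falls in.
# Counts 10 / 3 / 1 are inferred from the percentages, not stated on the slide.
m8 = (
    ["Physics, Mathematical"] * 10
    + ["Mathematics"] * 3
    + ["Physics, Mathematical;Physics, Multidisciplinary;Physics, Particles & Fields"]
)

shares = cell_shares(m8)
for cell, pct in shares:
    print(f"{pct}%  {cell}")
```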
II. Accuracy levels compared 
A comparison in two domains 
Mathematics domain: C_M, C_M;MA, C_MA; M = Mathematics, MA = Mathematics, Applied 
Physics sub-domain: C_AA, C_AA;PPF, C_PPF; AA = Astronomy & Astrophysics, PPF = Physics, Particles & Fields 
Absolute relative difference between reference values compared for successive … 
                                          Mean expected number       Threshold number of citations 
                                          of citations per article   for outstandingly cited articles 
… Publication years                       3%-18%                     0%-31% 
  (articles 2005, 2006, 2007)             0%-9%                      1%-20% 
… Citation window lengths                 29%-51%                    8%-59% 
  (3, 4, 5 years)                         18%-37%                    19%-42% 
≈ 
… Partition cells C defined by            6%-42%                     3%-36% 
  combinations of subject categories      0%-28%                     0%-20% 
Data sourced from Thomson Reuters Web of Knowledge (formerly referred to as ISI Web of Science). Web of Science (WoS) accessed online 10.01-30.05.2013.
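
The comparison above rests on absolute relative differences between successive reference values. The slides do not state the denominator, so the sketch below assumes the difference is taken relative to the first value of each pair; the function names and the three yearly values are hypothetical.

```python
def abs_rel_diff(reference, other):
    """Absolute difference, relative to the first value (assumed denominator)."""
    return abs(other - reference) / reference


def successive_diffs(values):
    """Absolute relative differences between each value and the next one."""
    return [abs_rel_diff(a, b) for a, b in zip(values, values[1:])]


# Hypothetical expected citations per article for three publication years.
expected = {2005: 5.0, 2006: 4.0, 2007: 4.4}

diffs = successive_diffs([expected[year] for year in sorted(expected)])
print([f"{d:.0%}" for d in diffs])
```

The same function applies unchanged to successive citation window lengths or to neighbouring partition cells, which is what allows the three kinds of differences to be compared on one scale.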
When does it matter in particular? 
Articles 2000-2007; expected citation rate with citation window 5 years, given as range for publication years 2000-2007 
# Articles  % Articles  Expected citation rate  Subject category / cell 
56006       100.0%      6.4-8                   Subject category PHYSICS, MATHEMATICAL 
                                                (articles associated to multiple subject categories fractionally counted) 
Cells in subject category PHYSICS, MATHEMATICAL defined by combinations of subject categories: 
18594       33.2%       8.7-10.5                PHYSICS, FLUIDS & PLASMAS;PHYSICS, MATHEMATICAL 
8455        15.1%       4.9-6.4                 PHYSICS, MATHEMATICAL 
7612        13.6%       4.8-6.1                 PHYSICS, MATHEMATICAL;PHYSICS, MULTIDISCIPLINARY 
4652        8.3%        7.3-10.5                COMPUTER SCIENCE, INTERDISCIPLINARY APPLICATIONS;PHYSICS, MATHEMATICAL 
4416        7.9%        5.2-7.4                 MATHEMATICS, APPLIED;PHYSICS, MATHEMATICAL 
3379        6.0%        3.8-13.9                MATHEMATICS, INTERDISCIPLINARY APPLICATIONS;PHYSICS, MATHEMATICAL;PHYSICS, MULTIDISCIPLINARY 
2443        4.4%        1.5-2.2                 PHYSICS, APPLIED;PHYSICS, CONDENSED MATTER;PHYSICS, MATHEMATICAL 
1744        3.1%        4-6.4                   PHYSICS, MATHEMATICAL;PHYSICS, NUCLEAR;PHYSICS, PARTICLES & FIELDS 
1547        2.8%        7-8.9                   MATHEMATICS, APPLIED;PHYSICS, MATHEMATICAL;PHYSICS, MULTIDISCIPLINARY 
…           …           …                       … 
Data sourced from Thomson Reuters Web of Knowledge (formerly referred to as ISI Web of Science). Web of Science (WoS) accessed online 16.07.2014.
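
One way to read this table is to ask which cells have an expected citation rate range lying entirely above or below the range of the whole subject category. A minimal sketch using three rows copied from the table; the `position` helper and its three labels are hypothetical, and treating a shared endpoint as overlapping is an assumption:

```python
CATEGORY_TOTAL = 56006        # articles in PHYSICS, MATHEMATICAL
CATEGORY_RANGE = (6.4, 8.0)   # expected citation rate range of the category

# cell name -> (number of articles, lower bound, upper bound), from the table
cells = {
    "PHYSICS, FLUIDS & PLASMAS;PHYSICS, MATHEMATICAL": (18594, 8.7, 10.5),
    "PHYSICS, MATHEMATICAL": (8455, 4.9, 6.4),
    "PHYSICS, APPLIED;PHYSICS, CONDENSED MATTER;PHYSICS, MATHEMATICAL": (2443, 1.5, 2.2),
}


def position(cell_range, category_range):
    """Locate a cell's range relative to the category range."""
    lo, hi = cell_range
    cat_lo, cat_hi = category_range
    if lo > cat_hi:
        return "above"
    if hi < cat_lo:
        return "below"
    return "overlapping"  # ranges share at least one value


summary = {
    name: (round(100 * n / CATEGORY_TOTAL, 1), position((lo, hi), CATEGORY_RANGE))
    for name, (n, lo, hi) in cells.items()
}
for name, (pct, where) in summary.items():
    print(f"{pct}%  {where}  {name}")
```

Cells labelled above or below are those where a subject category-based reference value would misstate the expected citation rate for every publication year in the range.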
How much can it matter? 
Normalized Mean Citation Rate, citation window 5 years 
Grantee M8, 14 articles 2000-2007 
71% in Cell Physics, Mathematical (4.9-6.4 citations per article) 
21% in Cell Mathematics (2.1-3.1 citations per article) 
7% in Cell Physics, Mathematical;Physics, Multidisciplinary;Physics, Particles & Fields 
(2.3-5.8 citations per article) 
Mean expected number of citations 
Cell-based: 66.9 
Subject category-based: 93.5 
Effect on indicator results (cell-based vs. subject category-based) 
Factor: 1.4 
Compare: CPP/FCSm significantly far below (< 0.5), below (0.5 - 0.8), around (0.8 - 1.2), 
above (1.2 - 1.5), and far above (> 1.5) the international impact standard of the field 
Data sourced from Thomson Reuters Web of Knowledge (formerly referred to as ISI Web of Science). Web of Science (WoS) accessed online 04-09.10.2013 & 
16.06-16.07.2014.
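
The factor of 1.4 follows from the two expected values on this slide: a normalized citation rate is the observed citations divided by the expected citations, so replacing 93.5 by 66.9 in the denominator multiplies the result by 93.5 / 66.9. A minimal sketch; the observed citation count of 100 is hypothetical, and the handling of the exact band boundaries in `classify` is an assumption:

```python
EXPECTED_CELL_BASED = 66.9       # from the slide
EXPECTED_CATEGORY_BASED = 93.5   # from the slide


def classify(value):
    """Map a normalized citation rate to the verbal bands quoted on the slide."""
    if value < 0.5:
        return "far below"
    if value < 0.8:
        return "below"
    if value <= 1.2:
        return "around"
    if value <= 1.5:
        return "above"
    return "far above"


factor = EXPECTED_CATEGORY_BASED / EXPECTED_CELL_BASED
print(round(factor, 1))

observed = 100  # hypothetical number of citations received by the 14 articles
for label, expected in [("subject category-based", EXPECTED_CATEGORY_BASED),
                        ("cell-based", EXPECTED_CELL_BASED)]:
    rate = observed / expected
    print(f"{label}: {rate:.2f} ({classify(rate)})")
```

With this hypothetical count, the same record moves from one verbal band to the next purely through the choice of reference domain.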
Conclusions 
Advantages 
• Stable journal-based structure 
• Closer fit to specialty than possible with larger subject categories 
• Possibility to differentiate between cells with different citation 
characteristics 
• Readily available, for all disciplines 
New and remaining issues 
• Minority of articles in small cells 
• When does a reference set fit closely enough? 
• Multidisciplinary journals and interdisciplinary research 
Questions and opportunities 
• Indicators & validation (vs. peer review results) 
• Effects for different specialties and indicators 
• Other contexts besides individual scientists
Thank you for your attention! 
Nadine Rons 
Vrije Universiteit Brussel (VUB) 
VUB RD Dept, Research Coordination Unit 
Pleinlaan 2, B-1050 Brussels, Belgium 
Nadine.Rons@vub.ac.be 
http://rd-ir.vub.ac.be/en_GB/people/show/id/554 
http://be.linkedin.com/pub/nadine-rons/55/2a/436
