Is peer review any good? A quantitative analysis of peer review
This is a presentation of the paper in which we focus on the analysis of peer reviews and reviewers' behavior in conference review processes. We report on the development, definition, and rationale of a theoretical model for peer review processes that supports the identification of appropriate metrics for assessing the processes' main properties. We then apply the proposed model and analysis framework to data sets of conference paper reviews, and discuss in detail the results, their implications, and their possible use in improving the analyzed peer review processes.

Transcript

  • 1. Is peer review any good? A quantitative analysis of peer review
    Fabio Casati, Maurizio Marchese, Azzurra Ragone, Matteo Turrini
    University of Trento
    http://eprints.biblio.unitn.it/archive/00001654/01/techRep045.pdf


  • 2. Initial Goals
    •  Understand how well peer review works
    •  Understand how to improve the process
    •  Metrics + Analysis
       –  (refer to liquid doc)
    •  Focus only on the gatekeeping aspect
    "Not everything that can be counted counts, and not everything that counts can be counted." -- Albert Einstein
  • 3. Metric Dimensions
    –  Quality: divergence (Kendall distance), disagreement, robustness
    –  Effort vs. quality: effort-invariant alternatives
    –  Unbiasing: biases
    –  Fairness
    –  Efficiency
  • 4. Data Sets
    •  Around 7000 reviews from various conferences in the CS field (more on the way)
       –  Large, medium, small
       –  Some with "young reviewers"

  • 5. Is peer review effective? Does it work?
    •  And what does it mean to be effective? HOW do we measure it?
    •  Easier to measure/detect "problems"
    •  Peer review ranking vs. ideal ranking

  • 6. Comparing rankings
    [figure: two rankings of the same set of papers shown side by side, with paper IDs such as 28, 33, 17, 89, 2, 45, 67 appearing in different orders]
  • 7. Ideal ranking (?)
    •  Success in a subsequent phase
    •  Citations
    Suggested reading: Positional effects on citation and readership in arXiv, by Haque and Ginsparg
  • 8. Comparing rankings
    Example: t = 3, N = 10; metrics: divergence Div(t, N) and Kendall τ (see the sketch below)
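
A minimal sketch of how these two comparison metrics can be computed, assuming Div(t, N) is the fraction of the ideal top-t papers missed by the peer-review top-t over N papers; the slides do not spell out the exact formula, and the paper IDs below are made up for illustration:

    from itertools import combinations

    def divergence(review_rank, ideal_rank, t):
        # Assumed definition: Div(t, N) = |ideal top-t \ review top-t| / t.
        # Both rankings list the same N paper IDs, best first.
        review_top = set(review_rank[:t])
        ideal_top = set(ideal_rank[:t])
        return len(ideal_top - review_top) / t

    def kendall_distance(rank_a, rank_b):
        # Normalized Kendall tau distance: the fraction of paper pairs
        # that the two rankings order differently.
        pos_b = {paper: i for i, paper in enumerate(rank_b)}
        pairs = list(combinations(rank_a, 2))
        discordant = sum(1 for x, y in pairs if pos_b[x] > pos_b[y])
        return discordant / len(pairs)

    # Example with N = 10 and t = 3, as on the slide (IDs are invented).
    review = [28, 33, 17, 89, 2, 45, 67, 12, 54, 71]
    ideal = [2, 17, 67, 28, 89, 33, 45, 12, 54, 71]
    print(divergence(review, ideal, t=3))   # 2/3: only paper 17 is in both top-3 sets
    print(kendall_distance(review, ideal))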

  • 10. Results: peer review ranking vs. citation count
    [figure: divergence Div plotted against normalized t]
  • 11. Randomness and reliability
    •  Quality-related but independent of the criteria for the "ideal" ranking
    •  Basic stats
    •  Disagreement
    •  Robustness
    •  Biases
  • 12. Quality-related Metrics: Statistics
    Distribution of marks (integer marks)
    [figure: probability of each integer mark from 0 to 10; probabilities range up to about 0.18]
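
As a rough illustration of the statistic behind this figure, a sketch that computes the empirical probability of each integer mark; the marks list below is a made-up stand-in for the real review data:

    from collections import Counter

    def mark_distribution(marks):
        # Empirical probability of each integer mark (e.g., on a 0-10 scale).
        counts = Counter(marks)
        total = len(marks)
        return {mark: counts[mark] / total for mark in sorted(counts)}

    # Made-up marks; the real data set has around 7000 reviews.
    marks = [6, 7, 5, 8, 6, 4, 7, 9, 6, 5, 3, 7]
    for mark, probability in mark_distribution(marks).items():
        print(f"{mark}: {probability:.2f}")
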
  • 13. Disagreement
    •  Measures the difference between the marks given by the reviewers on the same contribution.
    •  The rationale behind this metric is that in a review process we expect some kind of agreement between reviewers. (A sketch of one possible computation follows below.)
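
A minimal sketch of one way to compute such a metric, assuming the disagreement on a paper is the mean absolute pairwise difference between its marks, normalized by the mark scale; the reshuffling baseline used on the next slide is also sketched, and all data below is made up:

    import random
    from itertools import combinations

    def paper_disagreement(marks, scale=10):
        # Mean absolute pairwise difference of the marks on one paper,
        # normalized by the mark scale (assumed definition).
        pairs = list(combinations(marks, 2))
        return sum(abs(a - b) for a, b in pairs) / (len(pairs) * scale)

    def process_disagreement(reviews):
        # Average disagreement over all papers; reviews maps paper -> marks.
        return sum(paper_disagreement(m) for m in reviews.values()) / len(reviews)

    def reshuffled_baseline(reviews, trials=1000, seed=0):
        # Baseline: randomly reassign all marks to papers, keeping the
        # number of reviews per paper, and recompute the disagreement.
        rng = random.Random(seed)
        all_marks = [m for marks in reviews.values() for m in marks]
        total = 0.0
        for _ in range(trials):
            rng.shuffle(all_marks)
            it = iter(all_marks)
            shuffled = {p: [next(it) for _ in marks] for p, marks in reviews.items()}
            total += process_disagreement(shuffled)
        return total / trials

    # Made-up toy data: paper ID -> marks from its reviewers.
    reviews = {"p1": [6, 7, 5], "p2": [3, 8, 4], "p3": [9, 9, 8]}
    print(process_disagreement(reviews))
    print(reshuffled_baseline(reviews))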


  • 14. Normalized Disagreement (after discussion)

                 C1      C2                     C3
    Computed     0.27    0.32 (high variance)   0.26 (high variance)
    Reshuffled   0.34    0.40                   0.32
  • 15. Robustness
    •  Sensitivity to small variations in the marks
       –  Tries to assess the impact of small indecisions in giving the mark (e.g., 6 vs. 7…), as in the sketch below
    •  Measures divergence after applying an ε-variation to the marks
    •  Results: reasonably robust except for the conference managed by young researchers
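
A minimal sketch of this idea, assuming the ε-variation is a uniform random perturbation of each paper's average mark and robustness is read off the resulting top-t divergence; the slides do not give the exact procedure, and the marks below are made up:

    import random

    def ranking_from_marks(avg_marks):
        # Rank paper IDs by average mark, best first.
        return sorted(avg_marks, key=avg_marks.get, reverse=True)

    def perturbed_divergence(avg_marks, t, eps=0.5, trials=1000, seed=0):
        # Average top-t divergence between the original ranking and rankings
        # recomputed after adding a uniform perturbation in [-eps, +eps]
        # to every mark (assumed form of the epsilon-variation).
        rng = random.Random(seed)
        base_top = set(ranking_from_marks(avg_marks)[:t])
        total = 0.0
        for _ in range(trials):
            noisy = {p: m + rng.uniform(-eps, eps) for p, m in avg_marks.items()}
            total += len(base_top - set(ranking_from_marks(noisy)[:t])) / t
        return total / trials

    # Made-up average marks per paper; a high value signals low robustness.
    avg_marks = {"p1": 7.3, "p2": 7.1, "p3": 5.0, "p4": 8.2}
    print(perturbed_divergence(avg_marks, t=2))
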
  • 16. Metric dimensions
    –  Quality: divergence (Kendall distance), disagreement, statistics, robustness
    –  Unbiasing: biases
    –  Effort
    –  Fairness
    –  Efficiency
  • 17. Fairness
    •  Definition: a review process is fair if and only if the acceptance of a contribution does not depend on the particular set of PC members that reviews it
    •  The key is in the assignment of a paper to reviewers: a paper assignment is unfair if the specific assignment influences (makes more predictable) the fate of the paper.


  • 18. Potential biases
    •  Rating bias: reviewers are biased if they consistently give higher/lower marks than their colleagues who are reviewing the same paper (see the sketch below)
    •  Affiliation bias
    •  Topic bias
    •  Country bias
    •  Gender bias
    •  …
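
A minimal sketch of how such a rating bias can be estimated, assuming a reviewer's bias is the average difference between their mark and the mean mark of their co-reviewers on the same papers; the exact formula and normalization used in the paper are not given on the slides, and the data below is made up:

    from collections import defaultdict

    def rating_biases(reviews):
        # Per-reviewer rating bias: the average of (own mark - mean mark of
        # co-reviewers) over all papers the reviewer handled together with
        # at least one co-reviewer. reviews maps paper -> {reviewer: mark}.
        diffs = defaultdict(list)
        for marks in reviews.values():
            if len(marks) < 2:
                continue  # no co-reviewers to compare against
            for reviewer, mark in marks.items():
                others = [m for r, m in marks.items() if r != reviewer]
                diffs[reviewer].append(mark - sum(others) / len(others))
        return {r: sum(d) / len(d) for r, d in diffs.items()}

    # Made-up toy data: paper -> reviewer -> mark.
    reviews = {
        "p1": {"alice": 8, "bob": 5, "carol": 6},
        "p2": {"alice": 7, "bob": 4},
        "p3": {"bob": 6, "carol": 6},
    }
    print(rating_biases(reviews))  # alice tends to rate above her co-reviewers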

  • 19. Computed Normalized Rating Biases

                       C2      C3      C4
    top accepting      3.44    1.52    1.17
    top rejecting     -2.78   -2.06   -1.17
    > +|min bias|      5%      9%      7%
    < -|min bias|      4%      8%      7%

                                               C2    C3    C4
    Unbiasing effect (divergence)              9%    11%   14%
    Unbiasing effect (reviewers affected)      16    5     4