1. PRESENTATION TOPICS
BIBLIOMETRICS
SCIENTOMETRICS
CITATION ANALYSIS
CONTENT ANALYSIS
Presented By
Sumit Ranjan
Junior Research Fellow
Dept. of Library & Information Science (SIST)
BBAU, LUCKNOW
2. What is Bibliometrics?
Bibliometrics literally means "book measurement", but the term is applied to
all kinds of documents (with journal articles as the dominant kind). What is
measured is not the physical properties of documents but statistical patterns
in variables such as authorship, sources, subjects, geographical origins, and
citations. Bibliometrics is defined as “the study of quantitative aspects of
the production, dissemination and use of recorded information”. Herubel
explained bibliometrics as “a quantitative analysis of publications for the
purpose of ascertaining specific kind of phenomenon”.
• “… all studies which seek to quantify processes of written
communication.”
Pritchard
• “… the quantitative treatment of the properties of recorded discourse
and behavior pertaining to it.”
Fairthorne
• Recorded communication (‘literature’) -> quantitative methods
3. Alan Pritchard 1969
Coined the term “bibliometrics” in the paper “Statistical
bibliography or bibliometrics?”, published in the Journal
of Documentation (1969)
Defined it as “the application of mathematics and statistical methods
to books and other media of communication”
Journal of Documentation (1969) 25(4):348-349
4. Also used to study more than books and articles
…
Scientometrics
• covering science in general, not just
publications
Informetrics
• all information objects
Webometrics or cybermetrics
• web connections, manifestations
• using bibliometric techniques to study the
relationships or properties of different sites
on the web
and other related metrics …
5. Reasons for quantitative studies of
literature
Analysis of structure and dynamics
• search for regularities - predictions possible
Understanding of patterns
• “order out of documentary chaos”
• verification of models, assumptions
Rationale for policies & design
6. Applications of Bibliometrics
To identify research trends and growth of knowledge.
To estimate comprehensiveness of secondary periodicals.
Library selection, weeding, policies
Information organization
Information management
To identify users of different subjects.
To identify authorship and its trends in documents on various
subjects.
To study past and present publishing trends and forecast future ones.
To predict productivity of publishers, individual authors,
organizations and countries.
7. What is Studied?
Governed by data available in documents or
information resources in general - that which
can be counted
author(s)
origin
• organization, country, language
source
• journal, publisher, patent …
8. What … more
contents
• text, parts of text, subject, classes
representation
citations
• to a document, in a document, co-citation
utilization
• circulation, various uses
links
any other quantifiable attribute
9. Tools
Science Citation Index
Compilation of variables from journals in
a subject
Use data
Publication counts from indexes, or
other data bases
Web structures, links
10. Variable: Authors
number in a subject, field, institution, country
growth
correlation with indicators like GNP, energy etc.
productivity e.g. Lotka’s law
collaboration - co-authorship, associated
networks
dynamics - productive life, epidemics
papers/author in a subject
mapping
11. Variable: Origin
Rates of production, size, growth by
• country, institution, language, subject
Comparison between these
Correlation with economic & other
indicators
12. Variable: Sources
Concentration most often on journals
Growth, dynamics, numbers
• information explosion - exponential laws
• time movements, life cycles
Scatter - quantity/yield distribution
• Bradford’s law
Various distributions
• by subject, language, country
13. Variable: Contents
Analysis of texts
• distribution of words – Zipf’s law
• words, phrases in various parts
• subject analysis, classification
• co-word analysis
15. Variable: citations
Studied a lot; many pragmatic results
• basis for citation indexes, Web of Science, impact
factors, co-citation studies etc.
Derived:
• number of references in articles
• number of citations to articles
– research front; citation classics
• bibliographic coupling
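The derived measures listed above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `references` data (papers P1 to P3 citing works A to D) is hypothetical, and the function names are invented here, not taken from any citation index.

```python
from collections import Counter
from itertools import combinations

# Hypothetical toy data: each citing paper maps to the set of works it references.
references = {
    "P1": {"A", "B", "C"},
    "P2": {"A", "B"},
    "P3": {"B", "C", "D"},
}

def citation_counts(refs):
    """Number of citations received by each cited work."""
    counts = Counter()
    for cited in refs.values():
        counts.update(cited)
    return counts

def bibliographic_coupling(refs):
    """Number of references shared by each pair of citing papers."""
    return {
        (p, q): len(refs[p] & refs[q])
        for p, q in combinations(sorted(refs), 2)
    }

def co_citation(refs):
    """How often each pair of cited works appears in the same reference list."""
    counts = Counter()
    for cited in refs.values():
        counts.update(combinations(sorted(cited), 2))
    return counts

received = citation_counts(references)
coupling = bibliographic_coupling(references)
cocited = co_citation(references)
```

Bibliographic coupling links two citing papers through the references they share, while co-citation links two cited works through the papers that cite both.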
17. Alfred J. Lotka 1926
• Statistics—the frequency distribution of scientific
productivity
Purpose: to "determine, if possible, the part which
men of different caliber contribute to the
progress of science"
– Looked at Chemical Abstracts Index, then Geschichtstafeln
der Physik
• J. Washington Acad. Sci. 16:317-325
18. Lotka’s law: $x^{n} \cdot y = C$
The number of authors y in a given subject who each
produce x publications is inversely proportional to
x raised to the power n.
• Where:
– x = number of publications
– y = no. of authors credited with x
publications
– n = constant (approximately 2 for scientific
subjects)
– C = constant
• With n = 2: the inverse square law of scientific productivity
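A minimal sketch of how the law can be used to estimate author counts, assuming the constant C is fixed by the number of authors with exactly one publication (for x = 1, C = y). The function name and the figure of 100 single-paper authors are illustrative assumptions, not data from Lotka's study.

```python
def lotka_expected_authors(single_paper_authors, max_papers, n=2.0):
    """Expected number of authors y with x papers, from x**n * y = C.

    C is taken from the x = 1 case, where C = 1**n * y(1).
    """
    c = single_paper_authors
    return {x: c / x ** n for x in range(1, max_papers + 1)}

# Hypothetical field in which 100 authors published exactly one paper.
expected = lotka_expected_authors(100, 4)
for x, y in expected.items():
    print(f"{x} paper(s): about {y:.1f} authors")
```

With n = 2, the expected number of authors with two papers is a quarter of those with one, and with three papers a ninth.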
19. Lotka's Law - scientific publications
$x^{n} \cdot y = C$
[Figure: plot of Lotka's law; axis label: No. of authors]
20. Samuel Clement Bradford 1934
Distribution of quantity vs yield of sources of information on
specific subjects
• he studied journals as sources, but the approach applies to other sources
• which journals produce how many articles in a subject, and how are
they distributed? or
• How are articles in a subject scattered across journals?
Purpose: to develop a method for identification of the most
productive journals in a subject & deal with what he called
“documentary chaos”
First published in: Engineering (1934) 137:85-86, then in his book
Documentation, (1948)
21. Bradford’s law
"If scientific journals are arranged in order of
decreasing productivity of articles on a given
subject, they may be divided into a nucleus of
periodicals more particularly devoted to the
subject and several groups or zones containing
the same number of articles as the nucleus,
when the numbers of periodicals in the nucleus
and succeeding zones will be as $1 : n : n^{2} : n^{3}$ …"
22. Bradford’s law: $f(x) = a + b \log x$
• Where:
f(x) = total no. of references in the first x most
productive journals
a, b = constants
It describes the scattering of articles on a particular
topic over different journals
23. Bradford's Law of Scattering –
an idealized example

No. of source journals   No. of articles per source   Total no. of articles
 1                       60                           60
 2                       35                           70
   Zone total:  3 journals, 130 articles
 1                       30                           30
 2                       25                           50
 2                        9                           18
 4                        8                           32
   Zone total:  9 journals, 130 articles
10                        6                           60
 7                        5                           35
 5                        4                           20
 5                        3                           15
   Zone total: 27 journals, 130 articles
24. Bradford's Law of Scattering – zones
Nucleus: 3 sources, 130 articles
Zone 2: 9 sources, 130 articles
Zone 3: 27 sources, 130 articles
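The zones in the idealized example can be reproduced with a short script. This is a sketch, not Bradford's own procedure: it ranks the journals by productivity and cuts a new zone whenever the running article total reaches one third of all articles. The function name `bradford_zones` is an assumption made for illustration.

```python
# (number of journals, articles per journal), from the idealized example
table = [(1, 60), (2, 35), (1, 30), (2, 25), (2, 9),
         (4, 8), (10, 6), (7, 5), (5, 4), (5, 3)]

def bradford_zones(rows, zone_count=3):
    """Split journals, ranked by productivity, into zones of equal article yield."""
    # Expand to one entry per journal, most productive first.
    journals = sorted((a for n, a in rows for _ in range(n)), reverse=True)
    target = sum(journals) / zone_count
    zones, zone_journals, zone_articles = [], 0, 0
    for articles in journals:
        zone_journals += 1
        zone_articles += articles
        # Close the zone once it holds its share; the last zone takes the rest.
        if zone_articles >= target and len(zones) < zone_count - 1:
            zones.append((zone_journals, zone_articles))
            zone_journals, zone_articles = 0, 0
    zones.append((zone_journals, zone_articles))
    return zones

zones = bradford_zones(table)
for journals_in_zone, articles_in_zone in zones:
    print(journals_in_zone, "journals,", articles_in_zone, "articles")
```

For this data the journal counts per zone are 3, 9 and 27, so the multiplier n is 3. Real data rarely divides this cleanly, so the zone boundaries there are only approximate.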
25. George Kingsley Zipf 1935
• The psycho-biology of language: an introduction to
dynamic philology (1935)
• Human behavior and the principle of least effort: An
introduction to human ecology (1949)
• Looked, among other things, at frequency distributions of
words in given texts
– counted the word distribution in James Joyce’s Ulysses
• Provided an explanation of why the observed
distributions occur, known as the
Principle of least effort
26. Zipf’s law: r • f = c
• Where:
r = rank (in terms of frequency)
f = frequency (no. of times the given word
is used in the text)
c = constant for the given text
• For a given text the rank of a word multiplied by
the frequency is a constant
• Works well for high-frequency words, less well
for low-frequency ones; hence a number of modifications
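A rank-frequency table of the kind Zipf worked with can be computed as follows. This is a minimal sketch: the tokenizer (lowercase runs of letters and apostrophes) and the sample sentence are assumptions for illustration, and a text this short will not show a stable constant.

```python
import re
from collections import Counter

def rank_frequency(text):
    """Return (rank, word, frequency, rank * frequency) rows, most frequent first."""
    words = re.findall(r"[a-z']+", text.lower())
    ranked = Counter(words).most_common()
    return [(r, w, f, r * f) for r, (w, f) in enumerate(ranked, start=1)]

sample = "the cat and the dog and the bird"
rows = rank_frequency(sample)
for rank, word, freq, product in rows:
    print(rank, word, freq, product)
```

On a long text, the last column (r times f) is the quantity the law predicts to be roughly constant for the high-frequency words.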
27. Scientometrics: Definition
It is the quantitative study of science
output or outcome in any form, not just
records or bibliographies. It comprises all
the metrics studies related to science
indicators, citation analyses, research
evaluation, etc.
28. Scientometrics
The measurement of science
– not the use of measurement in science
The quantitative study of scientific
communications
Using bibliometric methods
29. Scope….
It is a field wherein the flow of information and
behavior of information are analyzed, measured and
quantitative relations are established
It is a scientific field wherein developments in the
measurement of the impact of information are assessed
continuously.
Scientometrics mostly deals with the analysis of science
data.
30. Topics dealt with…
Growth and obsolescence of literature.
Measures of scientific productivity (often referred to
as the author productivity).
Quantitative aspects of library and information
studies, including journal productivity, rank
distribution of words, etc.
Co-citation, bibliographic coupling, co-word analysis,
etc.
Identifying relations among various disciplines,
structure of subjects, national mapping of science, etc.
31. Scientometrics: Applications
Documentation – where it can count the number of
journals that constitute the core, secondary sources
and periphery of a discipline by analyzing the quantity
of journals needed to cover 50% of the information in
a given area of science
Science policy – where it provides indicators to
measure productivity and scientific quality, thereby
supplying a basis for evaluating and orienting R & D.
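The documentation use case, counting the journals needed to cover 50% of the information in an area, can be sketched as below. The article counts are hypothetical and the function name is an assumption. "Information" is approximated here by the number of articles per journal.

```python
def journals_for_coverage(article_counts, share=0.5):
    """Smallest number of top-ranked journals whose articles reach the given share."""
    ranked = sorted(article_counts, reverse=True)
    needed = share * sum(ranked)
    covered = 0
    for count, articles in enumerate(ranked, start=1):
        covered += articles
        if covered >= needed:
            return count
    return len(ranked)

# Hypothetical articles per journal in one subject area.
counts = [60, 35, 35, 30, 25, 25, 9, 9, 8, 8, 8, 8]
core_size = journals_for_coverage(counts)
print(core_size, "of", len(counts), "journals cover half of the articles")
```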
32. Citation Analysis: Definition
Citation analysis is the examination of the
frequency, patterns, and graphs of citations in
articles and books. It uses citations in scholarly
works to establish links to other works or other
researchers. Citation analysis is one of the most
widely used methods of bibliometrics. For
example, bibliographic coupling and co-citation
are association measures based on citation
analysis.
33. Citation Analysis
Using citations (footnotes) as the raw
data for bibliometric studies
Who footnotes whom/who is footnoted
by whom
Can be used to assess the influence of an
individual on a field of study
34. Problems in Citation Analysis
Citing behavior is little understood
Citer motivations are little acknowledged
Matthew Effect (Robert K. Merton)
– "To him who has shall be given, and he
shall have abundance: but from him who does
not have, even that which he has shall be taken
away."
Obliteration by incorporation
35. Content Analysis
• Content analysis is considered a scholarly method
in the humanities by which texts are studied as to
authorship, authenticity, or meaning.
• Content analysis is used to explore the content of various
media (books, journals, web resources, etc.) in order
to discover how particular issues are presented.
• Content analysis is a summarizing, quantitative
analysis of messages that relies on the scientific
method and is not limited as to the types of
variables that may be measured or the context in
which the messages are created or presented.
36. What is Content Analysis?
A technique that enables researchers to study
human behavior in an indirect way
– through an analysis of people's communications
The analysis of the
– written contents of a communication
– Examples?
– Reviewing resumes from job applicants
37. Steps for Conducting CA
Define terms
– Knowledge construction, socializing, presence
Specify unit of analysis
– Words, sentences, phrases, paintings, audio, video
Find data and sample it
Generate coding scheme
Check inter-rater reliability
Analyze!!!
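The inter-rater reliability step can be checked numerically. The slide does not say which statistic to use, so this sketch assumes Cohen's kappa, a common choice for two coders. The category labels and the two coders' decisions are hypothetical.

```python
from collections import Counter

def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa: agreement between two coders, corrected for chance."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("both raters must code the same non-empty set of units")
    n = len(rater_a)
    # Share of units on which the two coders chose the same category.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Agreement expected by chance, from each coder's category frequencies.
    freq_a, freq_b = Counter(rater_a), Counter(rater_b)
    expected = sum(freq_a[c] * freq_b[c] for c in freq_a) / (n * n)
    if expected == 1:
        return 1.0  # both coders used one identical category throughout
    return (observed - expected) / (1 - expected)

# Hypothetical codes assigned by two coders to the same six units.
coder_1 = ["social", "social", "knowledge", "knowledge", "social", "knowledge"]
coder_2 = ["social", "knowledge", "knowledge", "knowledge", "social", "knowledge"]
kappa = cohens_kappa(coder_1, coder_2)
print(round(kappa, 2))
```

A kappa of 1 means perfect agreement and 0 means agreement no better than chance. Which threshold counts as acceptable depends on the study.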
38. Let’s look at examples
Corporate Blog study
Blog study: collaboration
Socialization online study
Let’s code together!
• Choose one entry from one of the
class blogs
• Copy-paste to Word and then Code it
together
41. Qualitative Analysis
Content Analysis
Identifying, Coding, Categorizing the primary patterns in the
data
• Interaction styles in online discussion:
• Complexity of response
• Question type
• Levels of argumentation & negotiation
• Socializing
• Coding Scheme
Creates a scheme which clusters words and phrases into
conceptual categories for purposes of counting
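A coding scheme of this kind, clustering words into conceptual categories for counting, might look like the sketch below. The categories echo the interaction styles listed above, but the indicator words, the sample blog entry and the function name are all hypothetical.

```python
import re
from collections import Counter

# Hypothetical coding scheme: category -> indicator words.
CODING_SCHEME = {
    "socializing": {"hello", "thanks", "welcome"},
    "question": {"why", "how", "what"},
    "argumentation": {"because", "however", "therefore"},
}

def code_text(text, scheme=CODING_SCHEME):
    """Count how many words in the text fall into each category of the scheme."""
    word_to_category = {w: cat for cat, words in scheme.items() for w in words}
    counts = Counter({cat: 0 for cat in scheme})
    for word in re.findall(r"[a-z']+", text.lower()):
        if word in word_to_category:
            counts[word_to_category[word]] += 1
    return dict(counts)

entry = "Thanks for the post! Why does this work? Because, however odd, it does."
coded = code_text(entry)
print(coded)
```

Word lists are only a starting point. In practice human coders usually judge whole phrases or sentences against the scheme.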
42. Computerized Content Analysis
Adolescent Writings of Napoleon
Bonaparte
Analysis of verbal behavior
• scores on scales for depression, anxiety, &
preoccupation with sickness
• coincide with the available
biographical evidence (childhood)
43. Unit of Analysis?
Words
Phrases
Sentences
Paragraphs
Blog entries
Video segments
Picture