
Bibliometrix Seminar

Bibliometric Research Synthesis
bibliometrix: An R-tool for comprehensive science mapping analysis

In the seminar we propose and use a unique tool, developed in the R language, which follows a classic logical bibliometric workflow that we reconstruct. We have designed and produced an R-tool for comprehensive bibliometric analyses. R is a language and environment for statistical computing and graphics. It provides a wide variety of statistical and graphical techniques and is highly extensible. In addition to enabling statistical operations, it is an object-oriented and functional programming language; hence, you can automate your analyses and create new functions. It has an open-software nature, which means it is well supported by the user community and new functions are regularly contributed by users, many of whom are prominent statisticians. As it is programmed in R, the proposed tool is flexible, can be rapidly upgraded, and can be integrated with other statistical R-packages. It is therefore useful in a constantly changing field such as bibliometrics.

Published in: Science

  1. PhD Seminar Series: Bibliometric Research Synthesis. bibliometrix: An R-tool for comprehensive science mapping analysis. Massimo Aria and Corrado Cuccurullo (massimo.aria@unina.it; corrado.cuccurullo@unicampania.it). Bibliometrix package: www.bibliometrix.org
  2. Practice experience
     • Seminar goals: The aim of this seminar cycle is twofold. First, we want to bring together in a single seminar cycle all the knowledge on research synthesis through a recommended workflow, from problem formulation to report writing. Second, we present our open-source bibliometrix R-package for performing comprehensive bibliometric analyses, and discuss why bibliometrix is a valid tool for bibliometric studies. We illustrate the main bibliometrix functions in the workflow, using topics selected by the seminar participants.
     • Lab activities: In this seminar, doctoral students synthesize a large volume of studies and organize articles into intellectual and conceptual maps that represent distinct research streams and their interrelationships. The assignment culminates with doctoral students presenting (a) graphical representations of their bibliometric analysis, (b) a depiction of how each stream relates to other streams, and (c) a list of the main contributing authors and/or works in each stream and substream.
     • Effective learning and usefulness: The exercise proves useful for all involved. Doctoral students learn to analyse and conceptualize vast amounts of literature, a useful skill for any researcher. Moreover, they will be able to use bibliometric maps to identify gaps and trends in the literature. Given the value of this experience, we ask doctoral students to present and defend their own maps of the literature.
  3. Key readings
     • Research synthesis
       • Thomé, A. M. T., Scavarda, L. F., & Scavarda, A. J. (2016). Conducting systematic literature review in operations management. Production Planning & Control, 27(5), 408-420.
       • Cooper, H. (2015). Research synthesis and meta-analysis: A step-by-step approach (Vol. 2). Sage Publications.
       • Briner, R. B., & Denyer, D. (2012). Systematic review and evidence synthesis as a practice and scholarship tool. In Rousseau, D. M. (Ed.), The Oxford handbook of evidence-based management. Oxford University Press.
       • Moher, D., Liberati, A., Tetzlaff, J., Altman, D. G., & Prisma Group. (2009). Preferred reporting items for systematic reviews and meta-analyses: The PRISMA statement. PLoS Medicine, 6(7), e1000097.
       • Massaro, M., Dumay, J., & Guthrie, J. (2016). On the shoulders of giants: Undertaking a structured literature review in accounting. Accounting, Auditing & Accountability Journal, 29(5), 767-801.
       • Webster, J., & Watson, R. T. (2002). Analyzing the past to prepare for the future: Writing a literature review. MIS Quarterly, xiii-xxiii.
       • Torraco, R. J. (2005). Writing integrative literature reviews: Guidelines and examples. Human Resource Development Review, 4(3), 356-367.
     • General science mapping workflow
       • Aria, M., & Cuccurullo, C. (2017). bibliometrix: An R-tool for comprehensive science mapping analysis. Journal of Informetrics, 11(4), 959-975.
       • Cobo, M. J., López-Herrera, A. G., Herrera-Viedma, E., & Herrera, F. (2011). Science mapping software tools: Review, analysis, and cooperative study among tools. Journal of the American Society for Information Science and Technology.
     • bibliometrix R-package (http://www.bibliometrix.org)
       • Bibliometrix tutorial
       • Bibliometrix function map
     • Citation indicators
       • Waltman, L. (2016). A review of the literature on citation impact indicators. Journal of Informetrics, 10(2), 365-391.
     • Intellectual map
       • Yang, S., Han, R., Wolfram, D., & Zhao, Y. (2016). Visualizing the intellectual structure of information science (2006–2015): Introducing author keyword coupling analysis. Journal of Informetrics, 10(1), 132-150.
     • Co-word analysis and research front
       • Cuccurullo, C., Aria, M., & Sarto, F. (2016). Foundations and trends in performance management: A twenty-five years bibliometric analysis in business and public administration domains. Scientometrics. DOI: 10.1007/s11192-016-1948-8.
       • Cobo, M. J., López-Herrera, A. G., Herrera-Viedma, E., & Herrera, F. (2011). An approach for detecting, quantifying, and visualizing the evolution of a research field: A practical application to the fuzzy sets theory field. Journal of Informetrics, 5(1), 146-166.
     • Data science and big data
       • George, G., Osinga, E. C., Lavie, D., & Scott, B. A. (2016). Big data and data science methods for management research. Academy of Management Journal, 59(5), 1493-1507.
       • Sivarajah, U., Kamal, M. M., Irani, Z., & Weerakkody, V. (2017). Critical analysis of Big Data challenges and analytical methods. Journal of Business Research, 70, 263-286.
  4. Key readings in Strategy
     • Conceptual
       • Nag, R., Hambrick, D. C., & Chen, M. J. (2007). What is strategic management, really? Inductive derivation of a consensus definition of the field. Strategic Management Journal, 28(9), 935-955.
       • Hoskisson, R. E., Hitt, M. A., Wan, W. P., & Yiu, D. (1999). Theory and research in strategic management: Swings of a pendulum. Journal of Management, 25(3), 417-456.
       • Adcroft, A., & Willis, R. (2008). A snapshot of strategy research 2002-2006. Journal of Management History, 14(4), 313-333.
     • Bibliometric articles (general)
       • Ramos‐Rodríguez, A. R., & Ruíz‐Navarro, J. (2004). Changes in the intellectual structure of strategic management research: A bibliometric study of the Strategic Management Journal, 1980–2000. Strategic Management Journal, 25(10), 981-1004.
       • Nerur, S. P., Rasheed, A. A., & Natarajan, V. (2008). The intellectual structure of the strategic management field: An author co‐citation analysis. Strategic Management Journal, 29(3), 319-336.
       • Furrer, O., Thomas, H., & Goussevskaia, A. (2008). The structure and evolution of the strategic management field: A content analysis of 26 years of strategic management research. International Journal of Management Reviews, 10(1), 1-23.
       • Phelan, S. E., Ferreira, M., & Salvador, R. (2002). The first twenty years of the Strategic Management Journal. Strategic Management Journal, 23(12), 1161-1168.
       • Ronda‐Pupo, G. A., & Guerras‐Martin, L. Á. (2012). Dynamics of the evolution of the strategy concept 1962–2008: A co‐word analysis. Strategic Management Journal, 33(2), 162-188.
       • Maia, J. L., Serio, L. C., & Alves Filho, A. G. (2015). Almost two decades after: A bibliometric effort to map research on strategy as practice using two data sources. European Journal of Economics, Finance and Administrative Sciences, 73, 7-31.
       • Acedo, F. J., Barroso, C., & Galan, J. L. (2006). The resource‐based theory: Dissemination and main trends. Strategic Management Journal, 27(7), 621-636.
       • Vogel, R., & Güttel, W. H. (2013). The dynamic capability view in strategic management: A bibliometric review. International Journal of Management Reviews, 15(4), 426-446.
       • Di Stefano, G., Peteraf, M., & Verona, G. (2010). Dynamic capabilities deconstructed: A bibliographic investigation into the origins, development, and future directions of the research domain. Industrial and Corporate Change, 19(4), 1187-1204.
       • Dagnino, G. B., Levanti, G., Minà, A., & Picone, P. M. (2015). Interorganizational network and innovation: A bibliometric study and proposed research agenda. Journal of Business & Industrial Marketing, 30(3/4), 354-377.
  5. Context
     • Topic relevance: The number of academic publications is increasing at a rapid pace, and it is becoming increasingly unfeasible to remain current with everything that is being published. Moreover, the emphasis on empirical contributions has resulted in voluminous and fragmented research streams. This hampers the ability to accumulate knowledge and to collect evidence from previous research papers. Literature reviews are therefore assuming a crucial role in synthesizing past research findings to use the existing knowledge base effectively, advance a line of research, and provide evidence-based insight into the practice of exercising and sustaining professional judgment and expertise.
     • Bibliometrics: Scholars use different qualitative and quantitative literature reviewing approaches to understand and organize earlier findings. Among these, bibliometrics has the potential to introduce a systematic, transparent, and reproducible review process based on the statistical measurement of science, scientists, or scientific activity. Compared with other techniques, bibliometrics provides more objective and reliable analyses. The overwhelming volume of new information, conceptual developments, and data is the milieu in which bibliometrics becomes useful: it provides a structured analysis of a large body of information to infer trends over time and themes researched, identify shifts in the boundaries of disciplines, detect the most prolific scholars and institutions, and present the “big picture” of extant research.
     • Bibliometrics for: research evaluation; science mapping; altmetrics
  6. Bibliometrix
     • Complexity of bibliometric analysis: Although the use of bibliometrics has over time been extended to all disciplines, bibliometric analysis is complex because it entails several steps that employ numerous and diverse analysis and mapping software tools, which are frequently available only under commercial licenses. These difficulties are compounded by the fact that few researchers and practitioners are trained in how to review literature and identify evidence-based practices. The cumbersome nature of the process limits the potential of bibliometrics, especially for scholars who have no general programming skills. Recently, automated workflows that assemble specialized software into a comprehensive and organized data flow have begun to emerge for bibliometrics. They are particularly well suited to multi-step analyses using different types of software tools.
     • Bibliometrix, one tool for the whole bibliometric workflow: In the seminar we propose and use a unique tool, developed in the R language, which follows a classic logical bibliometric workflow that we reconstruct. We have designed and produced an R-tool for comprehensive bibliometric analyses. R is a language and environment for statistical computing and graphics. It provides a wide variety of statistical and graphical techniques and is highly extensible. In addition to enabling statistical operations, it is an object-oriented and functional programming language; hence, you can automate your analyses and create new functions. It is open-source software, which means it is well supported by the user community, and new functions are regularly contributed by users, many of whom are prominent statisticians. As it is programmed in R, the proposed tool is flexible, can be rapidly upgraded, and can be integrated with other statistical R-packages. It is therefore useful in a constantly changing field such as bibliometrics.
  7. Visits: 13,481 (last 12 months, as of March 2017). Aria, Cuccurullo (2017), JoI
  8. Bibliometrix in Chinese
  9. Recommended workflow for science mapping: Study design, Data collection, Data analysis, Data visualization, Interpretation
     • Data retrieval (database)
     • Data loading and converting
     • Data cleaning
     • Network extraction
     • Data normalization
     • Data reduction
     • Software tools for science mapping
     • R-packages for bibliometric analysis
  10. The scientific document is the basic unit of a complex relational system: co-citations, word co-occurrences, collaborations
  11. Bibliometrix
  12. Study design
  13. Data collection: main steps
      • Data retrieval
      • Data downloading
      • Data importing and converting
      • Bibliographic dataframe, with fields per document (Doc): Authors, Title, Abstract, Source, Keywords, Affiliation, …
  14. Bibliographic dataframe: an example (field tags)
  15. Data collection (PRISMA diagram)
      • Keywords for query (Boolean operators)
      • Timespan & timeslices
      • Language (English)
      • Types of documents (articles, …)
      • Subject categories (Mgmt, Fin, Ops, …)
      • Sources (ABS, 2015; one-journal or …)
  16. Data analysis
      • Coupling: two works (A & B) refer to a common work (a)
      • Co-citation: two works (a & b) are cited together by a common work (A)
      • Intellectual structure
      • Conceptual structure (research front)
  17. Main functions: software-assisted workflow steps, bibliometrix functions, and descriptions
      Data loading and converting
      • readFiles(): loads a sequence of Scopus and Clarivate Analytics WoS export files into R
      • convert2df(): creates a bibliographic data frame
      • retrievalByAuthorID(): uses Scopus API search to obtain information about the documents of a set of authors, identified by Scopus ID
      Descriptive bibliometric analysis
      • biblioAnalysis(): returns an object of class bibliometrix
      • summary() and plot(): summarize the main results of the bibliometric analysis
      • citations(): identifies the most cited references or authors
      • localCitations(): identifies the most cited local authors
      • dominance(): calculates the authors’ dominance ranking
      • Hindex(): measures productivity and citation impact of a scholar
      • lotka(): estimates Lotka’s law coefficients for scientific productivity
      • keywordGrowth(): calculates yearly cumulative occurrences of top keywords/terms
      • keywordAssociation(): associates authors' keywords to Keywords Plus
      Document x Attribute matrix creation
      • metaTagExtraction(): extracts field tags other than those in the standard WoS/Scopus coding
      • termExtraction(): extracts and stems terms from textual fields (abstract, title, author's keywords, and others) of a bibliographic data frame
      • cocMatrix(): computes a Document x Attribute matrix
      Normalization
      • normalizeSimilarity(): calculates association strength, inclusion index, Jaccard’s coefficient, and Salton’s similarity coefficient among objects of a bibliographic network
      Data reduction
      • conceptualStructure(): creates a conceptual structure map of a scientific field using MCA and clustering
      Network matrix creation
      • biblioNetwork(): calculates the most frequently used networks: bibliographic coupling, co-citation, collaboration, and co-occurrence
      • histNetwork(): creates a historical co-citation network from a bibliographic data frame
      Mapping
      • networkPlot(): plots a bibliographic network using an internal R library or the VOSviewer software
      • histPlot(): plots a historical co-citation network
      • conceptualStructure(): plots a conceptual structure map of a scientific field using MCA and clustering
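The four similarity measures named for normalizeSimilarity() can be illustrated with a small Python sketch. It uses the textbook definitions of these indices on a hypothetical co-occurrence matrix; whether the package applies exactly these formulas or any additional scaling is an assumption, and the function name `similarity` is invented for this illustration.

```python
import math

# Hypothetical co-occurrence matrix for three items (X, Y, Z):
# C[i][j] = number of documents in which items i and j occur together,
# C[i][i] = total number of occurrences of item i.
C = [
    [3, 1, 2],
    [1, 3, 0],
    [2, 0, 4],
]


def similarity(C, i, j, kind):
    """Normalize the raw co-occurrence count c_ij by the item frequencies."""
    c_ij, s_i, s_j = C[i][j], C[i][i], C[j][j]
    if kind == "association":  # association strength
        return c_ij / (s_i * s_j)
    if kind == "inclusion":    # inclusion index
        return c_ij / min(s_i, s_j)
    if kind == "jaccard":      # Jaccard's coefficient
        return c_ij / (s_i + s_j - c_ij)
    if kind == "salton":       # Salton's cosine
        return c_ij / math.sqrt(s_i * s_j)
    raise ValueError(f"unknown similarity: {kind}")


# Compare the four normalizations for the pair (X, Z).
for kind in ("association", "inclusion", "jaccard", "salton"):
    print(kind, round(similarity(C, 0, 2, kind), 3))
```

All four measures divide the same raw count by a different function of the two item frequencies, which is why frequent items no longer dominate the map after normalization.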
  18. Descriptive analysis: WIP on Big Data
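One of the descriptive indicators listed among the main functions is the h-index. As a minimal sketch of the standard definition (the largest h such that h documents have at least h citations each), and not of the package's own implementation, it could be computed as follows; the function name `h_index` and the citation counts are hypothetical.

```python
def h_index(citations):
    """Return the largest h such that h documents have >= h citations each."""
    ranked = sorted(citations, reverse=True)
    h = 0
    for rank, count in enumerate(ranked, start=1):
        if count >= rank:
            h = rank  # the top `rank` documents all have at least `rank` citations
        else:
            break
    return h


# Hypothetical citation counts for one scholar's documents.
print(h_index([10, 8, 5, 4, 3]))
```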
  19. Matrix “Document x Attribute”
      • A document’s attributes are connected to each other through the document itself: author(s) to journal, keywords to publication date, etc.
      • An attribute is an item of information associated with the document and stored in a field tag within the bibliographic data frame (e.g., authors, publication source, keywords, cited references, affiliations).
      • These connections between attributes generate binary rectangular matrices (Document x Attribute) that, in some cases, can be represented as bipartite networks.
      • Furthermore, scientific publications regularly contain references to other scientific works. This generates a further network, namely a co-citation or coupling network.
      • These networks are analyzed in order to capture meaningful properties of the underlying research system, and in particular to determine the influence of bibliometric units such as scholars and journals.
  20. Matrix Document × Reference (cocMatrix function)
      Matrix A (Document × Reference); rows are citing documents, columns are cited documents:
               Ref X  Ref Y  Ref Z
      Doc A      1      0      1
      Doc B      0      1      0
      Doc C      1      0      1
      Doc D      0      1      0
      Doc E      0      0      1
      Doc F      1      1      0
      Doc G      0      0      1
      The same matrix can be drawn as a bipartite graph linking the citing documents (A to G) to the cited documents (X, Y, Z).
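To make the construction concrete, the Document × Reference matrix of this slide can be rebuilt from per-document reference lists. This is a plain-Python sketch of the idea, not the cocMatrix() implementation; the dictionary `cited` and the helper `doc_by_attribute` are hypothetical names, and the data simply mirror the Doc A to G example.

```python
# Toy data mirroring the slide: each citing document maps to the references it cites.
cited = {
    "A": ["X", "Z"],
    "B": ["Y"],
    "C": ["X", "Z"],
    "D": ["Y"],
    "E": ["Z"],
    "F": ["X", "Y"],
    "G": ["Z"],
}


def doc_by_attribute(records):
    """Return (row labels, column labels, binary Document x Attribute matrix)."""
    docs = sorted(records)
    attrs = sorted({a for values in records.values() for a in values})
    column = {a: j for j, a in enumerate(attrs)}
    matrix = [[0] * len(attrs) for _ in docs]
    for i, d in enumerate(docs):
        for a in records[d]:
            matrix[i][column[a]] = 1  # 1 means document d carries attribute a
    return docs, attrs, matrix


docs, refs, A = doc_by_attribute(cited)
for d, row in zip(docs, A):
    print("Doc", d, row)
```

The same helper works for any attribute (authors, keywords, sources): only the lists stored per document change.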
  21. Bibliographic networks (biblioNetwork function)
      • Bibliographic coupling: $B_{coup} = A_{CR} \times A_{CR}'$
      • Co-citation: $B_{cocit} = A_{CR}' \times A_{CR}$
      • Collaboration: $B_{coll} = A_{AU}' \times A_{AU}$ (among authors, universities, departments, countries)
      • Co-word: $B_{coc} = A_{ID}' \times A_{ID}$
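The difference between the products comes down to which side is transposed. The following Python sketch applies the coupling and co-citation formulas to the Doc A to G example; the helpers `transpose` and `matmul` are hypothetical and written only for this illustration, not taken from biblioNetwork().

```python
def transpose(m):
    """Swap rows and columns of a list-of-lists matrix."""
    return [list(col) for col in zip(*m)]


def matmul(a, b):
    """Plain matrix product of two list-of-lists matrices."""
    cols = transpose(b)
    return [[sum(x * y for x, y in zip(row, col)) for col in cols] for row in a]


# Document x Reference matrix A_CR (rows: Doc A..G, columns: Ref X, Y, Z).
A_CR = [
    [1, 0, 1],
    [0, 1, 0],
    [1, 0, 1],
    [0, 1, 0],
    [0, 0, 1],
    [1, 1, 0],
    [0, 0, 1],
]

B_coup = matmul(A_CR, transpose(A_CR))   # documents x documents (7 x 7)
B_cocit = matmul(transpose(A_CR), A_CR)  # references x references (3 x 3)

print(len(B_coup), "x", len(B_coup[0]))
print(len(B_cocit), "x", len(B_cocit[0]))
print("References shared by Doc A and Doc C:", B_coup[0][2])
```

Multiplying by the transpose on the right links the rows (citing documents), while multiplying on the left links the columns (cited references, authors, or keywords).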
  22. Co-citation coupling
      “Co-citation coupling” is the mirror image of “bibliographic coupling”.
      • Co-citation coupling is a method used to establish a subject similarity between two documents.
      • If papers A and B are both cited by paper C, they may be said to be related to one another, even though they don't directly cite each other.
      • If papers A and B are both cited by many other papers, they have a stronger relationship. The more papers they are cited by, the stronger their relationship is.
  23. Co-citation network
      • A co-citation network can be obtained using the general formulation: $B_{cocit} = A_{CR}' \times A_{CR}$
      • Like matrix $B_{coup}$, matrix $B_{cocit}$ is symmetric.
      • The main diagonal of $B_{cocit}$ contains the number of times each reference is cited in our data frame.
      • In other words, the diagonal element $B_{ii}$ is the number of local citations of reference $i$.
  24. Co-citation analysis: $B_{cocit} = A_{CR}' \times A_{CR}$
      $A_{CR}$ (Document × Reference):
               Ref X  Ref Y  Ref Z
      Doc A      1      0      1
      Doc B      0      1      0
      Doc C      1      0      1
      Doc D      0      1      0
      Doc E      0      0      1
      Doc F      1      1      0
      Doc G      0      0      1
      $A_{CR}'$ (Reference × Document):
               Doc A  Doc B  Doc C  Doc D  Doc E  Doc F  Doc G
      Ref X      1      0      1      0      0      1      0
      Ref Y      0      1      0      1      0      1      0
      Ref Z      1      0      1      0      1      0      1
  25. Co-citation analysis (2)
      Matrix $B_{cocit}$, with the degree of each reference in the co-citation network (nodes X, Y, Z):
             X  Y  Z   Degree
      X      3  1  2     3
      Y      1  3  0     1
      Z      2  0  4     2
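The numbers on this slide can be read directly off the matrix. As a short Python sketch (variable names are hypothetical), the diagonal gives the local citations of each reference, the off-diagonal row sums give the degree shown on the slide, and the non-zero off-diagonal cells are the weighted edges of the co-citation network.

```python
labels = ["X", "Y", "Z"]

# Co-citation matrix B_cocit as shown on the slide.
B_cocit = [
    [3, 1, 2],
    [1, 3, 0],
    [2, 0, 4],
]
n = len(labels)

# A co-citation matrix is symmetric: B[i][j] == B[j][i].
is_symmetric = all(B_cocit[i][j] == B_cocit[j][i] for i in range(n) for j in range(n))

# Diagonal: how many documents in the collection cite each reference.
local_citations = {labels[i]: B_cocit[i][i] for i in range(n)}

# Degree as used on the slide: sum of co-citation counts with the other references.
degree = {labels[i]: sum(B_cocit[i][j] for j in range(n) if j != i) for i in range(n)}

# Weighted edge list of the co-citation network (upper triangle, non-zero cells only).
edges = [
    (labels[i], labels[j], B_cocit[i][j])
    for i in range(n)
    for j in range(i + 1, n)
    if B_cocit[i][j] > 0
]

print(is_symmetric, local_citations, degree, edges)
```

Y and Z are never cited together, so the network has only the two edges X-Y and X-Z, with X acting as the connecting node.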
  26. Document co-citation
      • Position, proximity and bubble diameter
      • Clusters
      • Strength of linkages
      • Bridge papers
  27. Journal co-citation
  28. Collaboration network
  29. Historiograph
      • Historiographic analysis generates chronological tables as well as historiographs which highlight the most-cited works in and outside the collection.
      • It helps scholars quickly identify the most significant work on a topic and trace its year-by-year historical development.
  30. Co-word analysis through networks
  31. Co-word analysis through MCA
  32. Thematic map (thematic network). Quadrants: motor themes; highly developed and isolated themes (niches); emerging or declining themes; basic and transversal themes
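The slide names the four quadrants but not the axes. Assuming the usual strategic-diagram reading, in which each theme is placed by its centrality (links to other themes) and density (internal cohesion), the quadrant assignment can be sketched in Python as below. The axes, the median cut-offs, the function name `classify_theme`, and the theme values are all assumptions made for illustration, not details taken from the slides.

```python
from statistics import median


def classify_theme(centrality, density, centrality_cut, density_cut):
    """Assign a theme to a quadrant of the thematic map (assumed axes)."""
    high_centrality = centrality >= centrality_cut
    high_density = density >= density_cut
    if high_centrality and high_density:
        return "motor theme"
    if high_density:
        return "highly developed and isolated theme (niche)"
    if high_centrality:
        return "basic and transversal theme"
    return "emerging or declining theme"


# Hypothetical themes with (centrality, density) scores.
themes = {
    "theme 1": (0.8, 0.7),
    "theme 2": (0.2, 0.9),
    "theme 3": (0.9, 0.1),
    "theme 4": (0.1, 0.2),
}

# Assumed cut-offs: the median of each axis splits the map into four quadrants.
c_cut = median(c for c, _ in themes.values())
d_cut = median(d for _, d in themes.values())

quadrants = {name: classify_theme(c, d, c_cut, d_cut) for name, (c, d) in themes.items()}
for name, quadrant in quadrants.items():
    print(name, "->", quadrant)
```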
  33. What’s next?
      • Shiny
      • Lab of bibliometrics and data-knowledge discovery
      • Bibliometrix R community
      • Bibliometrix social (follow us!)
        • https://www.facebook.com/bibliometrix/
        • https://twitter.com/search?q=%23bibliometrix&src=typd
      • We are already working on new developments. They concern:
        • the extension of compatibility to other bibliographic databases such as PubMed
        • the search for grey literature
        • the improvement of reference disambiguation by string metric-based algorithms
        • the introduction of direct citation and tri-citation analysis
        • the use of hybrid methods that combine bibliometric and semantic approaches. This last development includes term-burst detection through expectile smoothing, thematic mapping and evolution, and latent semantic analysis
