A wide variety of indices exist for rating scientific contributions, including the impact factor and the h-index.
These are quantitative analyses of research papers published in the past, and are therefore unable to
incorporate into the assessment the growth or deterioration of the research area: whether the research area
of a particular paper is in decline or, conversely, on a growing trend. Moreover, conventional
rating indices may assign higher ratings to papers that are hardly referenced in other papers nowadays
although they were frequently cited in the past. This study proposes a new type of scientific contribution ranking
index, the "Growing Degree of Research Area and Variance Values Index (GV-Index)". The GV-Index is
computed by a principal component analysis based on an estimated value obtained by the PageRank
algorithm, which takes into account the growing degree of the research area and its variance. We also
propose a visualization system for a scientist's network using the GV-Index.
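The abstract does not give the GV-Index formula, so the following is only a minimal sketch of the plain PageRank step on a citation graph. The function name `pagerank`, the damping factor of 0.85, the iteration count, and the toy graph are assumptions made for illustration; the growth-degree weighting, the variance term, and the principal component analysis described above are not shown.

```python
def pagerank(citations, damping=0.85, iterations=50):
    """Plain PageRank over a citation graph.

    `citations` maps each paper to the list of papers it cites.
    Returns a dict of paper -> score; the scores sum to 1.
    """
    # Collect every paper that appears as a citing or a cited paper.
    papers = set(citations)
    for cited in citations.values():
        papers.update(cited)
    n = len(papers)

    rank = {p: 1.0 / n for p in papers}
    for _ in range(iterations):
        new_rank = {p: (1.0 - damping) / n for p in papers}
        for p in papers:
            out = citations.get(p, [])
            if out:
                # A paper passes its score evenly to the papers it cites.
                share = damping * rank[p] / len(out)
                for q in out:
                    new_rank[q] += share
            else:
                # A paper with no references spreads its score over all papers.
                share = damping * rank[p] / n
                for q in papers:
                    new_rank[q] += share
        rank = new_rank
    return rank


# Hypothetical toy graph: papers A and B both cite C.
scores = pagerank({"A": ["C"], "B": ["C"], "C": []})
```

In this toy graph the cited paper C ends up with a higher score than A or B, which is the behaviour a citation-based importance estimate is expected to show.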
K index: A New Dynamic Performance Indicator (Riyad Khazali)
This paper introduces a new dynamic Research Performance Indicator (DRPI) as an attempt to effectively measure the productivity of a researcher by considering all of his/her cited papers, the articles' age, the number of co-authors of each cited paper, and the order of each co-author. Unlike the h-index and other indices such as the G, H(2), W, AR, E, K, Hw, and J indices, which do not differentiate between co-authors in the same field and which ignore articles below a certain citation threshold or with a high number of citations, the proposed k-index strives to map the research outcome of each co-author onto a more practical measure based on his/her collective research contribution, the article's age, the number and order of co-authors, and the total number of citations, thus highlighting the complete effort of each co-author separately. The k-index uses a recursive geometric sequence that distributes the merits of each article, as fairly as possible, among all co-authors. The effectiveness of the proposed k-index is demonstrated against the well-known h-index by investigating the research outcomes of researchers in the same field who have almost the same h-index. As expected, the aging factor of the proposed measure implies that if a researcher's number of citations does not increase over time, his/her research index should decrease, and this is successfully reflected by the proposed k-index.
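The description above does not state the k-index formula, so the sketch below only illustrates the general idea of splitting one article's credit among co-authors with a geometric sequence. The function name `coauthor_shares`, the ratio of 0.5, and the normalization are hypothetical choices, not the paper's definition.

```python
def coauthor_shares(n_authors, ratio=0.5):
    """Split one unit of credit among co-authors by author order.

    Author i (0-based) gets a weight of ratio**i, and the weights are
    normalized so that the shares sum to 1. The ratio is an assumed value.
    """
    if n_authors < 1:
        raise ValueError("an article needs at least one author")
    weights = [ratio ** i for i in range(n_authors)]
    total = sum(weights)
    return [w / total for w in weights]


# Three co-authors: the first author receives the largest share.
shares = coauthor_shares(3)
```

With a ratio below 1, each later author receives a smaller share than the one before, and a single author keeps the full credit.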
Improved author profiling through the use of citation classes (bartthijs)
Bart Thijs, Koenraad Debackere, Wolfgang Glänzel
Presented at the STI 2014 Conference at Leiden University, Leiden, The Netherlands
Dynamic extraction of key paper from the cluster using variance values of cit... (IJDKP)
In recent research on mapping the academic landscape, citation network analysis is
common, and automated clustering of many academic papers has been achieved by making good use of
various techniques. However, specifying the features of each area identified by automated clustering, or
dynamically extracting key papers in each research area, has not yet been achieved. In this study, therefore,
we propose a method for dynamically specifying the key papers in each area identified by clustering. We
investigate the variance of the publication years of the cited literature and calculate each cited
paper's importance by applying the variance values to the PageRank algorithm.
Construction of Keyword Extraction using Statistical Approaches and Document ... (IJERA Editor)
Organizing the continuing growth of dynamic unstructured documents is a major challenge for field experts.
Handling such unorganized documents is expensive, and clustering them helps
to reduce the cost. Document clustering by analysing the keywords of the documents is one of the best methods to
organize unstructured dynamic documents, and statistical analysis is the best adaptive method to extract the
keywords from the documents. In this paper, an algorithm is proposed to cluster the documents. It has two
parts: the first part extracts the keywords using a statistical method, and the second part constructs the clusters by
keyword using an agglomerative method. The proposed algorithm achieves more than 90% accuracy.
The presentation discusses theses, research papers, review articles, and technical reports: organization of theses and reports, formatting issues, citation methods, references, and effective oral presentation of research. It also covers quality indices of research publications: impact factor, immediacy factor, h-index, and other citation indices. Verbal consent was obtained from Prof. Dr. C. B. Bhatt (at 4:15 pm on 26-11-2016 at Hall A-2, GTU, Chandkheda) to post the presentation online for the benefit of the research scholar community.
Modified CiteScore metric for reducing the effect of self-citations (TELKOMNIKA JOURNAL)
Elsevier B.V. launched a scholarly metric called CiteScore (CS) on December 8, 2016. Up till
then, the journal impact factor (JIF) owned by Clarivate Analytics (Thomson Reuters) was the only trusted
metric for journal evaluation. As noted by Teixeira da Silva & Memon (2017), CS offers some observed
advantages over JIF. The potentials of CiteScore as a viable metric are still emerging. The paper briefly
introduces a variant of the CiteScore that can be used in quantifying the impact of researchers and their
institutions. The ultimate aim is to reduce the numerical effect of self-citations (SC) in academic publishing.
The reduction is designed to discourage SC, not to diminish it. The reasons for the adopted
methodology are discussed extensively. The proposed modified CiteScore metric is simple, transparent
and constructed to ensure integrity in academic publication. The results showed that the proposed modified
CiteScore is a better option than the traditional CiteScore and hence, can be applied in impact
determination, the ranking of authors and their institutions, and evaluation of scientists for a grant award.
The approach used in this paper is entirely new in two ways; first, a metric similar to journal ranking is
proposed for ranking authors and their institutions and secondly, disproportionate scores are awarded to
different sources of citations to reduce perceived dishonesty in academic publications. In conclusion, this
research is one of very few to report the effect of SC on CiteScore. Hitherto, the effect of SC has always
been on the journal impact factor (IF).
Grey Multi Criteria Decision Making Methods (IJSRP Journal)
Multi-Criteria Decision Making is the most well-known branch of decision making. In some cases, determining the exact value of attributes is difficult, and their values can be considered uncertain data. This paper presents two different Multi-Criteria Decision Making methods based on grey numbers. The two methods are used to obtain the final ranking of the alternatives and to select the best one under grey numbers. Finally, an illustrative example is presented and the results are analyzed.
It's a fully detailed topic about editing, coding, and tabulation of data in research work.
The editing, coding, and tabulation of data are explained in this ppt.
Data analysis using spss for two sample t-test tutorial (Daniel Sarpong)
This beginner's manual for students, researchers, and data analysts provides a visual step-by-step approach for conducting data analysis using the Statistical Package for the Social Sciences (SPSS). It uses screen captures of the software to simplify the steps needed to carry out the commands for the statistical methods commonly employed in data analysis.
Co-word analyses study the co-occurrence of pairs of items (for example, keywords) that are representative in a document, to identify relations between the ideas presented in the
texts.
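A minimal sketch of the counting step behind a co-word analysis is shown here. The function name `coword_counts` and the sample keyword lists are made up for illustration, and real studies typically go on to normalize or cluster these counts.

```python
from collections import Counter
from itertools import combinations


def coword_counts(documents):
    """Count how many documents each pair of keywords appears in together.

    `documents` is a list of keyword lists, one list per document.
    Pairs are stored as alphabetically sorted tuples.
    """
    counts = Counter()
    for keywords in documents:
        # Use each keyword once per document, in a stable order.
        for pair in combinations(sorted(set(keywords)), 2):
            counts[pair] += 1
    return counts


# Made-up keyword lists for three documents.
docs = [
    ["citation", "pagerank", "clustering"],
    ["pagerank", "citation"],
    ["clustering", "keyword"],
]
pairs = coword_counts(docs)
```

Pairs that co-occur in many documents, such as "citation" and "pagerank" in this sample, suggest a relation between the ideas those keywords stand for.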
Data imputation is used to fill in missing data values, as missing data have a negative effect on the computational validity of models. This study develops a genetic algorithm (GA) to optimize imputation of missing cost data for fans used in road tunnels by the Swedish Transport Administration (Trafikverket). The GA imputes the missing cost data using an optimized valid data period. The results show highly correlated data (R-squared 0.99) after imputing the missing data. The GA therefore provides a wide search space for optimizing imputation and creating complete data. The complete data can be used for forecasting and life cycle cost analysis. Ritesh Kumar Pandey | Dr Asha Ambhaikar, "Data Imputation by Soft Computing", published in International Journal of Trend in Scientific Research and Development (ijtsrd), ISSN: 2456-6470, Volume-2 | Issue-4, June 2018, URL: http://www.ijtsrd.com/papers/ijtsrd14112.pdf http://www.ijtsrd.com/computer-science/real-time-computing/14112/data-imputation-by-soft-computing/ritesh-kumar-pandey
The Flemish Government gave its approval in principle for the launch of a third cluster of 40 artificial turf pitches under the Flemish Sports Infrastructure Plan (Vlaams Sportinfrastructuurplan). The plan provides for public-private partnership (PPP), which should accelerate the elimination of part of the shortage of sports infrastructure in Flanders.
A BRIEF DESCRIPTION OF WHAT FLICKR IS
Flickr (pronounced /fliker/) is a website that lets users store, organize, search, sell, and share photographs or videos online.
It has a community of users who share photographs and videos they have created themselves. The community is governed by rules of conduct and terms of use that promote good content management.
Flickr's popularity is due mainly to its image-management tools, which let authors tag their photographs and explore and comment on other users' images.
Flickr is a tool that can be used to enhance photography classes.
It supports image searches by tag, by date, and by Creative Commons license.
Other features include RSS and Atom feeds and an API that lets independent developers create services and applications linked to Flickr. The service is built on standard HTML and HTTP features, which makes it usable across multiple platforms and web browsers. The tagging and text-editing interface uses AJAX, which is also compatible with the great majority of browsers. Images (photographs or videos) can also be sent by e-mail.
Since 2013:
Total storage of 1 terabyte (1,024 gigabytes).
Maximum photo size of 200 MB.
Maximum video size of 1 GB, 3 minutes.
Gallery views limited to the 200 most recent images.
The ability to publish any of your photos.
Images displayed at lower resolution (originals are stored without loss of information).
Creative Brief and Creative Briefing / "Why to" and "How to" (Rainer Buehler)
How do we set tasks for service providers, colleagues, creative teams, or consultants precisely, with a clear goal, and in an inspiring way? The presentation deck "Creative Brief and Creative Briefing" provides answers.
Visual mining of science citation data for benchmarking scientific and techno... (Gurdal Ertek)
In this paper we present a study in which we visually analyzed science citation data to investigate the competitiveness of the world's countries in selected categories of science. The dataset includes the number of papers published and the number of citations made in the ESI (Essential Science Indicators) database in 2004, and lists these values for practically every country in the world. In analyzing the data, we employ methods and software tools developed and used in the data mining and information visualization fields of computer science. Some of the questions we seek to answer in this study are the following:
(a) Which countries are most competitive in the selected categories of science (i.e., Engineering, Computer Science, Economics & Business)?
(b) What types of correlations exist between different categories of science? For example, do countries with many published papers in Engineering also have many papers published in Computer Science or Economics & Business?
(c) Which countries produce the most influential papers? This analysis is needed since a country may have many papers published, but these papers may be cited very rarely.
(d) Can we gain useful and actionable insights by combining science citation data with socioeconomic and geographical data?
http://research.sabanciuniv.edu.
Does it Matter Which Citation Tool is Used to Compare the h-index of a Group ... (Nader Ale Ebrahim)
The h-index retrieved from citation indexes (Scopus, Google Scholar, and Web of Science) is used to measure scientific performance and research impact based on the number of publications and citations of a scientist. It is also easily available and may be used for performance measures of scientists and for recruitment decisions. The aim of this study is to investigate the difference between the outputs of these three citation databases, namely Scopus, Google Scholar, and Web of Science, based on the h-index of a group of highly cited researchers (Nobel Prize-winning scientists). The purposive sampling method was adopted to collect the required data. The results showed that there is a significant difference in the h-index between the three citation indexes; the Google Scholar h-index was higher than the h-index in the other two databases. It was also concluded that there is a significant positive relationship between h-indices based on Google Scholar and Scopus. The citation indexes of Scopus, Google Scholar, and Web of Science may be useful for evaluating the h-index of scientists, but they have some limitations as well.
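For reference, the h-index itself is the largest number h such that a scientist has h papers with at least h citations each. A minimal sketch follows; the function name `h_index` and the citation counts are made up for illustration.

```python
def h_index(citations):
    """Return the largest h such that h papers have at least h citations each."""
    ranked = sorted(citations, reverse=True)
    h = 0
    for position, count in enumerate(ranked, start=1):
        if count >= position:
            h = position
        else:
            break
    return h


# Made-up citation counts for one researcher's five papers.
example = h_index([10, 8, 5, 4, 3])  # → 4
```

Because each database indexes a different set of citing documents, the same researcher's list of citation counts, and therefore the resulting h-index, can differ between Scopus, Google Scholar, and Web of Science.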
Paradoxical betweenness in Academic endeavors and research metrics (Saptarshi Ghosh)
"Publish or perish" is an aphorism describing the pressure to publish academic work in order to succeed in an academic career. ... The pressure to publish has been cited as a cause of poor work being submitted to academic journals.
The history of humanity, especially in the post-Renaissance era, shows the contribution of research and its output, in terms of publications, patents, and technology transfer, in paving the way for societal prosperity. Scientific writing and research publication are fundamental components indicative of academic excellence and are supposedly tied to Research and Development (R&D) funding. Thus the budget allocation and expenditure towards R&D is considered a vital parameter for the advancement of science and technology and also for social and economic well-being.
CORRELATING R & D EXPENDITURE AND SCHOLARLY PUBLICATION OUTPUT USING K-MEANS ... (Zac Darcy)
The history of humanity, especially in the post-Renaissance era, shows the contribution of research and its output, in
terms of publications, patents, and technology transfer, in paving the way for societal prosperity. Scientific
writing and research publication are fundamental components indicative of academic excellence and
are supposedly tied to Research and Development (R&D) funding. Thus the budget allocation and
expenditure towards R&D is considered a vital parameter for the advancement of science and
technology and also for social and economic well-being. In view of the high aspirations of society and
government at large, it becomes indispensable to investigate whether scholarly output goes hand in
hand with the R&D budget or not. We present in this communication a systematic
clustering approach based on the K-means algorithm to reveal the impact of R&D expenditure on the extent of
research publications. Two independent sources of data, research and development expenditure (as a
percentage of Gross Domestic Product, GDP) and scientific and technical journal articles, are brought
together in this comprehensive study. From an empirical perspective, the present study found that there exists a
positive linear correlation between R&D expenditure and the number of research publications.
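The linear correlation mentioned above can be measured with the Pearson coefficient. The sketch below is a generic illustration, not the study's code: the function name `pearson_r` and the sample figures are made up, and the K-means clustering step is not shown.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length, at least 2 values")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    if spread_x == 0 or spread_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / (spread_x * spread_y)


# Made-up figures: R&D spending (% of GDP) and article counts for four countries.
rd_share = [0.5, 1.0, 2.0, 3.0]
articles = [1200, 2500, 5100, 7400]
r = pearson_r(rd_share, articles)
```

A value of r close to +1 indicates a strong positive linear relationship, a value close to -1 a strong negative one, and a value near 0 little linear relationship.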
MODELING PROCESS CHAIN OF METEOROLOGICAL REANALYSIS PRECIPITATION DATA USING ... (Zac Darcy)
In this paper, we propose process-chain and knowledge-base models for meteorological reanalysis
datasets that help scientists working in the field of climate, and in particular on rainfall evolution, to
resolve uncertainty about spatial resources (data, processes) when monitoring rainfall evolution. Because rainfall
evolution is the subject of much research, various methods for processing meteorological reanalysis datasets have been
proposed. The meteorological reanalysis datasets currently available are voluminous and heterogeneous in
terms of source and of spatial and temporal resolution. The use of these meteorological reanalysis datasets may
resolve data uncertainty. In addition, phenomena such as rainfall evolution require the analysis of time
series of meteorological reanalysis datasets and the development of automated and reusable processing
chains for monitoring rainfall evolution. We propose to formalize these processing chains by modeling
abstract and concrete models based on existing interoperability standards. The modeled processing
chains will be capitalized on and distributable in operational environments. Our modeling approach
uses Work-Context concepts. These concepts require organizing human resources, data, and processes in
order to establish a knowledge base connecting the latter two. This knowledge base will be used to resolve
uncertainty about meteorological reanalysis dataset resources for monitoring rainfall evolution.
Match By Match Detailed Schedule Of The ICC Men's T20 World Cup 2024.pdf (mouthhunt5)
20 Teams, One Trophy: What to Expect from the ICC Men's T20 World Cup 2024
The ICC Men's T20 World Cup 2024 is set to be an exciting event, co-hosted by the West Indies and the USA from June 1 to June 29, 2024. This edition of the tournament will feature a record 20 teams divided into four groups, competing across 55 matches for the prestigious title.
Italy vs Albania Soul and sacrifice' are the keys to success for Albania at E...Eticketing.co
We offer UEFA Euro 2024 Tickets to admirers who can get Italy vs Albania Tickets through our trusted online ticketing marketplace. Eticketing. co is the most reliable source for booking Euro Cup Final Tickets. Sign up for the latest Euro Cup Germany Ticket alert.
Narrated Business Proposal for the Philadelphia Eaglescamrynascott12
Slide 1:
Welcome, and thank you for joining me today. We will explore a strategic proposal to enhance parking and traffic management at Lincoln Financial Field, aiming to improve the overall fan experience and operational efficiency. This comprehensive plan addresses existing challenges and leverages innovative solutions to create a smoother and more enjoyable experience for our fans.
Slide 2:
Picture this: It’s a crisp fall afternoon, driving towards Lincoln Financial Field. The atmosphere is electric—tailgaters grilling, fans in Eagles jerseys creating a sea of green and white. The air buzzes with camaraderie and anticipation. You park, join the throng, and make your way to your seat. The stadium roars as the Eagles take the field, sending chills down your spine. Each play is a thrilling dance of strategy and skill. This is what being an Eagles fan is all about—the joy, the pride, and the shared experience.
Slide 3:
But now, the day is marred by frustration. The excitement wanes as you struggle to find a parking spot. The congestion is overwhelming, and tempers flare. The delays mean you miss the pre-game excitement, the tailgate camaraderie, and even the opening kick-off. After the game, the joy of victory or the shared solace of defeat is overshadowed by the stress of navigating out of the parking lot. The gridlock, honking horns, and endless waiting drain the energy and joy from what should have been an unforgettable experience.
Our proposal aims to eliminate these frustrations, ensuring that from arrival to departure, your experience is extraordinary. Efficient parking and smooth traffic flow are key to maintaining the high spirits and excitement that make game days special.
Slide 4:
The Philadelphia Eagles are not just a premier NFL team; they are an integral part of the community, hosting games, concerts, and various events at Lincoln Financial Field. Our state-of-the-art stadium is designed to provide a world-class experience for every attendee. Whether it's the thrill of game day, the excitement of a live concert, or the camaraderie of community events, we pride ourselves on delivering a fan-first experience and maintaining operational excellence across all our activities. Our commitment to our fans and community is unwavering, and we continuously strive to enhance every aspect of their experience, ensuring they leave with unforgettable memories.
Slide 5:
Recent trends show an increasing demand for efficient event logistics. Our customer feedback has consistently highlighted frustrations with parking and traffic. Surveys indicate that a significant number of fans are dissatisfied with the current parking situation. Comparisons with other venues like Citizens Bank Park and Wells Fargo Center reveal that we lag in terms of parking efficiency and convenience. These insights underscore the urgent need for innovation to meet and exceed fan expectations.
Slide 6:
As we delve into the intricacies of our operations, one glaring issue emer
Belgium vs Slovakia Belgium Euro 2024 Golden Generation Faces Euro Cup Final ...Eticketing.co
We offer Euro Cup Tickets to admirers who can get Belgium vs Slovakia Tickets through our trusted online ticketing marketplace. Eticketing.co is the most reliable source for booking Euro Cup Final Tickets. Sign up for the latest Euro Cup Germany Ticket alert.
Hesan Soufi's Legacy: Inspiring the Next GenerationHesan Soufi
Hesan Soufi's impact on the game extends far beyond his on-field exploits. With his humility, sportsmanship, and unwavering commitment to excellence, Soufi has become a role model for aspiring footballers worldwide. His legacy lies not only in his achievements but also in the inspiration he provides to the next generation of talented players.
Turkey vs Georgia Tickets: Turkey's Provisional Squad for UEFA Euro 2024, Key...Eticketing.co
Euro Cup Germany fans worldwide can book Euro 2024 Tickets from our online platform www.eticketing.co.Fans can book Euro Cup 2024 Tickets on our website at discounted prices.
Netherlands vs Austria Netherlands Face Familiar Foes in Euro Cup Germany Gro...Eticketing.co
The Netherlands are in Group D in Euro Cup Germany - and, unpaid to this, they will be coming up against familiar foes. Remarkably, they have played France, who have fashioned some of the greatest players of all time, 30 times throughout history. Despite France being more effective in major competitions, including captivating the World Cup in 2018, Holland have the greater head-to-head record.
We offer Euro Cup Tickets to admirers who can get Netherlands vs Austria Tickets through our trusted online ticketing marketplace. Eticketing.co is the most reliable source for booking Euro Cup Final Tickets. Sign up for the latest Euro Cup Germany Ticket alert.
UEFA Euro 2024 Tickets | Euro 2024 Tickets | Netherlands vs Austria Tickets
However, in 2023, they played one another twice, with France endearing both matches 4-0 and 2-1 individually. Against Poland and Austria, the Netherlands also have a stout record, winning just under half the matches. They faced Austria at Euro 2020, engaging 2-0, and they haven't lost to Poland since 1979.
The lettering is on the wall for Holland to qualify for the knockouts, but nothing is failsafe. The Netherlands kickstart their Euros campaign against Poland on Sunday, June 16th. In Hamburg, they will have to go up against one of the best strikers in the world, Robert Lewandowski.
Netherlands vs Austria: Tough Challenges Await the Netherlands in Euro Cup Germany
Five days later, they travel south to face France in Leipzig, a side led by Kylian Mbappe - one of the finest players in the world currently and one of the most impressive players in his nation's history. To conclude, they face Austria in Berlin, knowing it could be the end of the road if they don't perform.
Ronald Koeman is widely considered one of the more successful Dutch managers in Premier League history, considering the nation has a reputation for struggling to replicate their talents in England. The former Everton manager went against that script and shone — and now he is back managing his nation.
UEFA Euro 2024 Tickets | Euro 2024 Tickets | Euro Cup Germany Tickets | Netherlands vs Austria Tickets
Euro fans worldwide can book Euro Cup Germany Tickets from our online platform, www.eticketing.co. Fans can book Euro Cup 2024 Tickets on our website at discounted prices.
Netherlands vs Austria: Ronald Koeman's Tactical Approach For UEFA Euro 2024
As well as being the highest-scoring defender in history, Koeman is a man with immense tactical knowledge. He returned to manage Holland at the start of 2023 after it was announced Louis van Gaal would retire. His life back in the dugout with the team wasn't easy, as he lost his first match 4-0 to France after going 3-0 down within 21 minutes.
However, he eventually helped them qualify for Euro Cup Germany. The 61-year-old likes to organize his team with a defensive mindset. Some might call it pragmatic as he defends with minimal space between the lines, but that's often needed for international football.
Spain vs Croatia Spain aims to put aside the RFEF crisis as they chase Euro C...Eticketing.co
We offer UEFA Euro 2024 Tickets to admirers who can get Spain vs Croatia Tickets through our trusted online ticketing marketplace. Eticketing. co is the most reliable source for booking Euro Cup Final Tickets. Sign up for the latest Euro Cup Germany Ticket alert.
Turkey vs Georgia Turkey's Road to Redemption and Euro 2024 Prospects.pdfEticketing.co
Euro Cup Germany fans worldwide can book Euro 2024 Tickets from our online platform www.eticketing.co.Fans can book Euro Cup 2024 Tickets on our website at discounted prices.
According to the report, the consumption of video content related to IPL 2024 has seen significant growth, nearly 3 times more than the previous season, reflecting an increasing interest of fans.
Croatia vs Italy Can Luka Modrić Lead Croatia to Euro Cup Germany Glory in Hi...Eticketing.co
Euro 2024 fans worldwide can book Croatia vs Italy Tickets from our online platform www.eticketing.co. Fans can book Euro Cup Germany Tickets on our website at discounted prices.
International Journal of Data Mining & Knowledge Management Process (IJDKP) Vol.3, No.5, September 2013
DOI: 10.5121/ijdkp.2013.3501
GV-INDEX: SCIENTIFIC CONTRIBUTION RATING INDEX THAT TAKES INTO ACCOUNT THE GROWTH DEGREE OF RESEARCH AREA AND VARIANCE VALUES OF THE PUBLICATION YEAR OF CITED PAPER
Akira Otsuki (1) and Masayoshi Kawamura (2)
(1) Tokyo Institute of Technology, Tokyo, Japan
(2) MK future software, Ibaraki, Japan
ABSTRACT
There is a wide variety of scientific contribution rating indices, including the impact factor and the h-index. These are used for quantitative analyses of research papers published in the past, and are therefore unable to incorporate into the assessment the growth, or deterioration, of the research area: whether the research area of a particular paper is in decline or, conversely, growing. Moreover, the conventional rating indices may give high ratings to papers that were frequently cited in the past but are hardly referenced in other papers nowadays. This study proposes a new type of scientific contribution ranking index, the "Growing Degree of Research Area and Variance Values Index (GV-Index)". The GV-Index is computed by a principal component analysis based on an estimated value obtained by the PageRank algorithm, which takes into account the growing degree of the research area and its variance. We also propose a visualization system for a scientist’s network using the GV-Index.
KEYWORDS
Scientific Contribution Rating Index, Principal Component Analysis, Bibliometrics, Database
1. INTRODUCTION
Typical scientific contribution indexes, such as the h-index, g-index, A-index and R-index, have conventionally been computed from literature published in the past, so their values tend to be higher for well-experienced scientists or for those who have a larger number of colleagues. In addition, if a scientist's papers were cited by many papers in the past, the index value will be high even if those papers are no longer cited today. Therefore, this study calculates "the growing degree of the research area" and the "variance values of the publication year of the cited literature" as observation values for principal component analysis. We then propose a method for calculating a new synthetic variable (an estimated scientific contribution index for a scientist) by conducting principal component analysis on these two observation values.
2. PRECEDING STUDIES
This section describes preceding studies, from Section 2.1 onward.
2.1. h-index
The h-index was proposed by Hirsch, J.E. [1]. A scientist has an h-index of h if h of his or her papers have each received at least h citations. Table 1 shows an example of the h-index.
Table.1 Example of h-Index
Scientist | Papers (Number of citation) | h-index
A | Paper A(9), Paper B(7), Paper C(5), Paper D(4), Paper E(4) | 4
B | Paper A(35), Paper B(9), Paper C(5), Paper D(3), Paper E(1) | 3
2.2. g-index
Egghe, L. [2] proposed the g-index as a modification of the h-index. For the calculation of the g-index, the same ranking of a publication set (papers in decreasing order of the number of citations received) is used as for the h-index. Egghe defines the g-index "as the highest number g of papers that together received g^2 or more citations". In contrast to the h-index, the g-index gives more weight to highly cited papers. Writing cit_i for the number of citations of the i-th paper in this ranking, the condition is
$g^2 \le \sum_{i \le g} cit_i$   (1)
2.3. A-index
The proposal to use the average number of citations as a variant of the h-index was made by Jin [3]. The A-index (as well as the m-index, r-index, and AR-index) includes in the calculation only papers that are in the Hirsch core. It is defined as the average number of citations of papers in the Hirsch core.
$A = \frac{1}{h} \sum_{j=1}^{h} cit_j$   (2)
2.4. R-index
With the A-index, the better scientist is 'punished' for having a higher h-index, as the A-index involves a division by h. Therefore, instead of dividing by h, the authors suggest taking the square root of the sum of citations in the Hirsch core to calculate the index. Jin et al. [4] proposed this new index as the R-index, as it is calculated using a square root. Because the R-index measures the citation intensity in the Hirsch core, the index can be very sensitive to just a very few papers receiving extremely high citation counts.
$R = \sqrt{\sum_{j=1}^{h} cit_j}$   (3)
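The four indexes described in this section can be checked on small citation lists. The following Python sketch is illustrative only: the function names are hypothetical, and capping the g-index at the number of papers is an assumed convention, not something stated in this paper. The sample lists are the citation counts of Table 1 and the eight citation counts listed for Agrawal, R. in Table 4.

```python
import math


def h_index(citations):
    """Largest h such that h papers have at least h citations each."""
    ranked = sorted(citations, reverse=True)
    return sum(1 for rank, cites in enumerate(ranked, start=1) if cites >= rank)


def g_index(citations):
    """Largest g such that the top g papers together have at least g^2 citations.

    Assumption: g is capped at the number of papers in the list.
    """
    ranked = sorted(citations, reverse=True)
    total, g = 0, 0
    for rank, cites in enumerate(ranked, start=1):
        total += cites
        if total >= rank * rank:
            g = rank
    return g


def hirsch_core(citations):
    """The h most cited papers, where h is the h-index."""
    ranked = sorted(citations, reverse=True)
    return ranked[:h_index(citations)]


def a_index(citations):
    """Average number of citations of the papers in the Hirsch core."""
    core = hirsch_core(citations)
    return sum(core) / len(core) if core else 0.0


def r_index(citations):
    """Square root of the total citations of the papers in the Hirsch core."""
    return math.sqrt(sum(hirsch_core(citations)))


scientist_a = [9, 7, 5, 4, 4]       # Table 1, scientist A
scientist_b = [35, 9, 5, 3, 1]      # Table 1, scientist B
agrawal = [3649, 1305, 734, 640, 485, 398, 71, 31]  # Table 4 citation counts

print(h_index(scientist_a), h_index(scientist_b))     # 4 3
print(a_index(agrawal), round(r_index(agrawal), 2))   # 914.125 85.52
```

The last line reproduces the A-index and R-index reported for Agrawal, R. in Table 4, which suggests the table was computed with these definitions.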
2.5. Problems of preceding studies
The problem with existing scientist evaluation indexes is that they are used for quantitative analyses of research papers published in the past, and are therefore unable to incorporate into the assessment the growth, or deterioration, of the research area: whether the research area of a particular paper is in decline or, conversely, growing. None of the indexes above considers the growth rate of the research area. Namely, if a research area is already obsolete, a paper in it carries little meaning even if it has a lot of citations. We think that it is very important to consider the growth rate of the research area, yet past scientist evaluation indexes do not. For these reasons, past scientist evaluation indexes still have issues with quality assessment. The next chapter therefore presents the concept of this study, which aims to improve quality assessment.
3. CONCEPT
In order to solve the problem described in the previous chapter, we calculate an index using principal component analysis based on the following two observed values. The calculated index is called the "GV-index (Growing degree of research area and Variance values index)". The GV-index is intended for journal papers.
① Growing Degree of Research Area
② The PageRank algorithm considering the degree of dispersion of the publication years of the cited papers
Value ① evaluates whether the research area is growing, and value ② evaluates the importance of scientists. The PageRank algorithm [8] is a technique used to determine the most “important” page quantitatively by using calculations in the presence of mutual referencing relations such as hyperlink structures. In this study, the importance of each paper is calculated using this algorithm. That is to say, assuming that the sum of the scores of the citations that “flow out” to each paper and the sum of the scores of the citations that “flow in” from each paper are equal to each other, this sum is considered the score of the pertinent paper, and papers with higher scores are considered more important. By applying the variance value to the calculation of the score of citations that “flow in” from each paper, it is possible to identify the key papers in each area. Whereas the conventional algorithm assigns scores equally when there are multiple citations that “flow in,” in this study the weights reflect the state of variance in the citation year, on the assumption that more citations will “flow in” to papers with higher variance values. We propose a new scientist evaluation index obtained by principal component analysis of these two observation values.
3.1. Calculation of Cluster growth
First, we calculate the cluster growth rate as an observed value for principal component analysis. The clusters in this study are based on a random network. The random network was proposed by Paul Erdős and Alfréd Rényi [9-11] in 1960. A random network is a network in which edges are placed at random between the nodes. We use the Newman method as the clustering method in this study. The groups of papers identified by clustering were then labelled with research areas by experts. We now describe the steps to create a random network. Let the total number of nodes be "N" and the probability of existence of each edge be "p". Assume that, at first, N nodes are prepared. In this case, the maximum possible number of edges is given by formula (4).
$\frac{N(N-1)}{2}$   (4)
Each edge is created with the same probability p. The resulting random network based on probability p has, on average, as many edges as given by formula (5).
$\frac{pN(N-1)}{2}$   (5)
Also, the average number of edges per node <k> is given by formula (6).
$\langle k \rangle = p(N-1)$   (6)
As we have seen, a random network can be made by giving N and p. In a random network, each edge between nodes exists with the same probability and the network is not clustered. The probability that two randomly chosen nodes are linked together equals p, so the clustering coefficient of a random network is given by formula (7).
$C_{rand} = p = \frac{\langle k \rangle}{N-1}$   (7)
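The random-network quantities of formulas (4) to (7) are simple enough to compute directly. The following Python sketch is a minimal illustration, not the implementation used in this study; the function names, the seed and the values N = 100 and p = 0.05 are hypothetical.

```python
import itertools
import random


def random_network_stats(n, p):
    """Closed-form quantities of formulas (4)-(7) for a random network."""
    max_edges = n * (n - 1) // 2      # (4) every pair of nodes is linked
    expected_edges = p * max_edges    # (5) each pair is linked with probability p
    mean_degree = p * (n - 1)         # (6) average number of edges per node <k>
    clustering = p                    # (7) clustering coefficient of a random network
    return max_edges, expected_edges, mean_degree, clustering


def sample_random_network(n, p, seed=0):
    """Draw one random network: keep each possible edge with probability p."""
    rng = random.Random(seed)
    return [pair for pair in itertools.combinations(range(n), 2) if rng.random() < p]


max_edges, expected_edges, mean_degree, clustering = random_network_stats(100, 0.05)
edges = sample_random_network(100, 0.05)
print(max_edges, expected_edges, mean_degree, clustering)
# Realized edge count and realized mean degree of one sampled network.
print(len(edges), 2 * len(edges) / 100)
```

A single sampled network will deviate from the averages in (5) and (6); the closed-form values describe the expectation over many such networks.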
Then, we calculate the clustering coefficient for each fiscal year. For example, to calculate the clustering coefficient year by year from FY2008 through FY2012, the clustering coefficient in FY2008 is the initial value, called SC_rand. Let the year be "Y" (so that Y-1 is the number of elapsed years) and the clustering coefficient in that year be YC_rand. Next, we calculate the "Growing Degree of Research Area" by equation (8), applying the CAGR (Compound Average Growth Rate) [12].
$CG_Y = \left( \frac{YC_{rand}}{SC_{rand}} \right)^{\frac{1}{Y-1}} - 1$   (8)
The exponent 1/(Y-1) adjusts for the elapsed years. We then calculate CG_Y for each fiscal year from the publication year of the paper to date, and use CG_Y as an observed value for principal component analysis.
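As a worked illustration of the growth calculation, the Python sketch below assumes that equation (8) has the usual compound-growth form, (YC_rand / SC_rand)^(1/(Y-1)) - 1, with Y counted from the initial year as Y = 1. The function name and the clustering coefficients are hypothetical values chosen for the example, not data from this study.

```python
def growth_degree(initial_coefficient, current_coefficient, year_index):
    """Compound growth from the initial year (year_index = 1) to year_index.

    Assumed form: (current / initial) ** (1 / (year_index - 1)) - 1.
    """
    if year_index < 2:
        raise ValueError("year_index must be at least 2 (one elapsed year)")
    if initial_coefficient <= 0:
        raise ValueError("initial_coefficient must be positive")
    ratio = current_coefficient / initial_coefficient
    return ratio ** (1.0 / (year_index - 1)) - 1.0


# Hypothetical clustering coefficients for FY2008 (initial value) through FY2012.
coefficients = [0.100, 0.110, 0.121, 0.120, 0.115]
for year_index, value in enumerate(coefficients[1:], start=2):
    fiscal_year = 2007 + year_index
    print(fiscal_year, round(growth_degree(coefficients[0], value, year_index), 4))
```

A positive value indicates that the cluster has grown on average per year since the initial year, and a negative value indicates decline.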
3.2. Calculation of importance of scientists by the PageRank algorithm considering the degree of dispersion of the cited papers year
Next, we calculate the importance of scientists by the PageRank algorithm considering the degree of dispersion of the publication years of the cited papers, as an observed value for principal component analysis. First, we investigate the period over which a paper was cited by examining the variance (standard deviation) of the publication years of the cited papers. The common method for obtaining the standard deviation is expressed as follows.
$\sigma = \sqrt{\frac{1}{n} \sum_{i=1}^{n} (P_i - a)^2}$   (9)
In formula (9), P_1, P_2, ..., P_{n-1}, P_n are the sample of periods and a is their arithmetic average. The variance is then calculated as the arithmetic mean of $(P_i - a)^2$, and the value obtained from the standard deviation is stored as the variance. We then apply this variance value to the PageRank algorithm [13]. The formula of the PageRank algorithm is expressed as follows.
$PR(A) = (1-d) + d \sum_{i=1}^{n} \frac{PR(T_i)}{C(T_i)}$   (10)
Here T_1, ..., T_n are the nodes that link to A, and C(T_i) is the number of outlinks of T_i.
d is a parameter that can be any real number in the range [0,1]. Formula (10) starts from an arbitrary value given to each node in the graph and is calculated repeatedly until the change in the values no longer exceeds the designated threshold. Once the calculations are complete, the most important node is determined. Formula (10) is set so that the sum total of the inlink scores and the sum total of the outlink scores are equal, and this sum total is regarded as the page score, so pages with higher scores are designated more valuable. Our method, however, applies the variance value of formula (9) to the score calculations of the inlinks and the outlinks. Whereas past algorithms would distribute scores evenly when there were multiple outlinks, for example, our method calculates on the assumption that the points flow towards higher variance. As a result, the importance of scientists can be calculated in a way that reflects the dispersion of the referenced years. This formula is expressed as follows.
(11)
The Y of PR_Y represents the relevant year. Because variance is generally expressed as "standard deviation^2", the inlink and outlink variances are written in that form. Finally, we calculate PR_Y for each fiscal year from the publication year of the paper to date, and use PR_Y as an observed value for principal component analysis.
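The idea of letting scores flow towards higher variance can be sketched in code. The Python example below is only one possible reading and should not be taken as formula (11): it assumes that each citing paper splits its score among the papers it cites in proportion to their variance values, instead of evenly, and it uses the unnormalized PageRank form (1 - d) + d * (sum of incoming shares). The function names, the damping value d = 0.85, the tolerance and the sample data are all hypothetical.

```python
from statistics import pvariance


def year_variance(years):
    """Population variance (standard deviation squared) of a list of years."""
    return pvariance(years)


def variance_weighted_pagerank(cites, variance, d=0.85, tol=1e-10, max_iter=1000):
    """Iterate PR(A) = (1 - d) + d * sum over citing papers T of PR(T) * share(T, A).

    cites[T] lists the papers that T cites. share(T, A) is the variance of A divided
    by the total variance of all papers cited by T (an assumed weighting); the
    conventional algorithm would use 1 / len(cites[T]) instead.
    """
    nodes = sorted(set(cites) | {a for targets in cites.values() for a in targets})
    if not nodes:
        return {}
    score = {node: 1.0 for node in nodes}
    for _ in range(max_iter):
        new_score = {node: 1.0 - d for node in nodes}
        for citing, targets in cites.items():
            total = sum(variance[a] for a in targets)
            for a in targets:
                # Fall back to an even split when all cited papers have zero variance.
                share = variance[a] / total if total > 0 else 1.0 / len(targets)
                new_score[a] += d * score[citing] * share
        change = max(abs(new_score[node] - score[node]) for node in nodes)
        score = new_score
        if change < tol:  # stop once the scores no longer move
            break
    return score


# Hypothetical data: years associated with the citations of papers "A" and "B".
variance = {
    "A": year_variance([2001, 2005, 2009]),  # spread over many years
    "B": year_variance([2008, 2008, 2009]),  # concentrated in time
    "C1": 0.0,
    "C2": 0.0,
}
cites = {"C1": ["A", "B"], "C2": ["A", "B"]}
scores = variance_weighted_pagerank(cites, variance)
print(scores["A"] > scores["B"])  # True: more score flows to the higher-variance paper
```

With equal variance values the shares reduce to an even split, so the sketch falls back to the conventional behaviour described above.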
3.3. Calculation of Scientific Evaluation Index “GV-index” using Principal Component Analysis
We calculate the scientific evaluation index using principal component analysis based on the observation values of the previous sections. This index is called the “GV-index”. Principal component analysis is a mathematical procedure that synthesizes a new variable from two or more variables. First, we prepare the data frame of the observation values (Table 2). Next, we perform principal component analysis on this data frame.
Table.2 The example of the observation value data frame
No | CG_Y | PR_Y
1 | CG_2012 | PR_2012
2 | CG_2011 | PR_2011
3 | CG_2010 | PR_2010
4 | CG_2009 | PR_2009
5 | CG_2008 | PR_2008
Principal component loadings are the weights with which each variable contributes to the synthetic variable. They correspond to the partial regression coefficients in regression analysis. Accordingly, the synthetic variable (GV-index) is expressed by formula (12) in this study.
GV-index = Comp.1 * CG_Y + Comp.1 * PR_Y   (12)
n is an integer, and Comp.1 is the first synthetic variable of CG_Y and PR_Y. The two variables are used after standardization. Next, we explain the effect of the observation values on the GV-index. If the values of CG_Y and PR_Y are negative, they have a negative effect on the GV-index. Conversely, if the values of CG_Y and PR_Y are positive, they have a positive effect on the GV-index. Therefore, the GV-index is able to take into account both the cluster growth and the importance of scientists given by the PageRank algorithm considering the degree of dispersion of the publication years of the cited papers.
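For two standardized variables, the first principal component has a closed form, which makes formula (12) easy to illustrate without a statistics package. The Python sketch below is an assumption-laden illustration rather than the computation used in this study: it standardizes with the population standard deviation, uses the fact that the 2x2 correlation matrix [[1, r], [r, 1]] has largest eigenvalue 1 + |r| with eigenvector (1, sign(r)) / sqrt(2), and takes the CG_Y and PR_Y columns of Table 8 as sample input. The function names are hypothetical.

```python
import math
from statistics import mean, pstdev


def standardize(values):
    """Rescale to zero mean and unit (population) standard deviation."""
    centre, spread = mean(values), pstdev(values)
    return [(v - centre) / spread for v in values]


def first_component(cg, pr):
    """Loadings, standard deviation and scores of the first principal component."""
    z_cg, z_pr = standardize(cg), standardize(pr)
    # Pearson correlation of the two standardized variables.
    r = sum(a * b for a, b in zip(z_cg, z_pr)) / len(z_cg)
    sign = 1.0 if r >= 0 else -1.0
    loadings = (1.0 / math.sqrt(2.0), sign / math.sqrt(2.0))
    # Formula (12): weighted sum of the standardized observation values.
    scores = [loadings[0] * a + loadings[1] * b for a, b in zip(z_cg, z_pr)]
    return loadings, math.sqrt(1.0 + abs(r)), scores


cg = [1319, 1205, 1057, 914, 727]                    # CG_Y column of Table 8
pr = [0.02269, 0.02336, 0.02521, 0.02720, 0.02744]   # PR_Y column of Table 8
loadings, std_dev, scores = first_component(cg, pr)
print(loadings, std_dev)
for year, score in zip(range(2012, 2007, -1), scores):
    print(year, round(score, 4))
```

The overall sign of a principal component is arbitrary, so a tool may report the loadings and scores with both signs flipped relative to this sketch.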
3.4. Visualization of “GV-index”
Fig.1 is an image of the visualization of the GV-index. The left side shows the reference relations between clusters (communities). The clusters are listed from the top in descending order of cluster size. To indicate the reference relationships between the clusters, the same cluster list is shown twice. Each cluster represents a research area. The right side is the scientist network map. If you select one cluster from the cluster list on the left side, the details (Scientist Network Map) of that cluster are shown on the right side. Each node is a scientist, and nodes are drawn larger as the GV-index value is greater. The scientist network map shows the selected cluster (research area) as well as other clusters (other research areas). This makes it possible to comprehend the importance or position of scientists in the research area, and furthermore to identify the scientists who affect other research areas.
Figure 1. The image of visualization of GV-index (Citation Network Map)
4. EVALUATION EXPERIMENT
4.1. Outline of Evaluation Experiment and Result of Evaluation Experiment
This section confirms the effectiveness of the GV-index by comparing it with the h-index, g-index, A-index and R-index. First, we retrieved papers from the "Web of Science (journal database)" [14] using the query "Data Mining". The search period is from 1960 to 2012. The number of papers acquired is about 31,000. Table 3 shows the five most cited papers among these 31,000. We chose the four scientists in Table 3 as the target scientists of this evaluation experiment.
Table.3 The papers of cited number top 5
(Journal DB is the Web of Science, and Query is the "Data Mining")
No | Citation | Authors | Paper’s name | Publication year
1 | 3649 | Agrawal, R., et al. | Mining association rules between sets of items in large databases | 2009
2 | 1594 | Fawcett, T., et al. | An introduction to ROC analysis | 2006
3 | 1340 | Zimmermann, P., et al. | GENEVESTIGATOR. Arabidopsis microarray database and analysis toolbox | 2004
4 | 1305 | Agrawal, R., et al. | Mining sequential patterns | 2005
5 | 749 | Foster, I., et al. | Grid services for distributed system integration | 2002
Tables 4-7 list the papers of the four scientists. These paper lists satisfy the following two conditions: one of the four scientists (Table 3) is the first author, and the research area is data mining. Tables 4-7 also show the calculated h-index, g-index, A-index, R-index and GV-index. The calculation method of the GV-index is described in the next section.
Table.4 Compare of each index in Agrawal,R
GV-index | h-index | g-index | A-index | R-index | Number of Citation | Name of the Papers
1.353000 |  | 1(1) 3649 |  |  | 3649 | Mining association rules between sets of items in large databases, 1993.
1.361491 |  | 2(4) 4954 |  |  | 1305 | Mining sequential patterns, 1995.
1.363350 |  | 3(9) 5688 |  |  | 734 | Privacy-preserving data mining, 2000.
1.373500 |  | 4(16) 6328 |  |  | 640 | Automatic subspace clustering of high dimensional data for data mining applications, 1998.
1.292764 |  | 5(25) 6813 |  |  | 485 | Database mining: A performance perspective, 1993.
1.352239 |  | 6(36) 7211 |  |  | 398 | Parallel mining of association rules, 1996.
1.373296 |  | 7(49) 7282 |  |  | 71 | Automatic subspace clustering of high dimensional data, 2005.
1.277639 | 8 | 8(64) 7313 | 914.125 | 85.52 | 31 | Securing electronic health records without impeding the flow of information, 2007.
1.343410 (average) |  |  |  |  |  |
Table.5 Compare of each index in Fawcett,T.
GV-index | h-index | g-index | A-index | R-index | Number of Citation | Name of the Papers
1.244574 |  | 1(1) 1613 |  |  | 1613 | An introduction to ROC analysis, 2006.
1.373041 |  | 2(4) 2062 |  |  | 449 | Adaptive fraud detection, 1995.
1.247522 |  | 3(9) 2112 |  |  | 50 | Using rule sets to maximize ROC performance, 2001.
1.146067 | 4 | 4(16) 2125 | 531.25 | 46.10 | 13 | PRIE: A system for generating rulelists to maximize ROC performance, 2008.
1.252801 (average) |  |  |  |  |  |
Table.6 Compare of each index in Zimmermann, P.
GV-index | h-index | g-index | A-index | R-index | Number of Citation | Name of the Papers
1.404585 | 1 | 1(1) 1226 | 1226 | 35.01 | 1226 | GENEVESTIGATOR. Arabidopsis microarray database and analysis toolbox
Table.7 Compare of each index in Foster, I.
GV-index | h-index | g-index | A-index | R-index | Number of Citation | Name of the Papers
1.403287 |  | 1(1) 749 |  |  | 749 | Grid services for distributed system integration, 2002.
1.400020 | 2 | 2(4) 782 | 391.00 | 27.96 | 33 | Data integration in a bandwidth-rich world, 2003.
1.401654 (average) |  |  |  |  |  |
4.2. Calculation of GV-index
First, we prepare the two observation values (CG_Y and PR_Y) described previously. We then prepare a data frame like Table 8. Table 8 is the sample for Zimmermann, P.
Table.8 The sample of observation values data frame
No | CG_Y | PR_Y
2012 | 1319 | 0.02269
2011 | 1205 | 0.02336
2010 | 1057 | 0.02521
2009 | 914 | 0.02720
2008 | 727 | 0.02744
This study uses R-2.15 [15] as the principal component analysis tool. R provides two functions for principal component analysis, prcomp() and princomp(). Because they do not differ much, this study uses princomp(). Fig.2 is an example of the result of the principal component analysis using princomp(). The argument "cor=TRUE" specifies that the analysis uses the correlation matrix, that is, it standardizes the raw data. The principal component loadings can be output by specifying the "loadings = TRUE" argument to "summary()".
> data <- read.csv("/mydata.csv",head=F)
> data2 <- princomp(data, cor=TRUE)
> data2
Call:
princomp(x = data, cor = TRUE)
Standard deviations:
Comp.1 Comp.2
1.4045851 0.1647446
2 variables and 9 observations.
> summary(data2, loadings=TRUE)
Importance of components:
Comp.1 Comp.2
Standard deviation 1.4045851 0.16474458
Proportion of Variance 0.9864296 0.01357039
Cumulative Proportion 0.9864296 1.00000000
Loadings:
Comp.1 Comp.2
V1 -0.707 -0.707
V2 0.707 -0.707
>
Figure 2. Sample of result of the principal component analysis (using R-2.15)
Because principal component analysis creates as many synthetic variables as there are observed variables, we need to adopt an appropriate number of synthetic variables and truncate the others. This study uses the standard deviation and the cumulative contribution rate as the evaluation criteria. In principal component analysis, the weights are calculated so that the variance of the synthetic variable is maximized. Therefore, the larger these values, the better the synthetic variable. Also, it is usual to keep the principal components up to a cumulative contribution rate of more than 80%. In this case (Fig.2), because Comp.1 explains more than 98% (0.98) of all data, Comp.2 and below can be truncated. The principal component scores (Comp.1) are shown in Tables 4-7. The last row of the GV-index column in Tables 4-7 shows the average when there are multiple papers.
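The contribution rates in Fig.2 follow directly from the reported standard deviations: each component's share is its variance (standard deviation squared) divided by the total variance. The Python sketch below recomputes them and applies the 80% rule; the function names and the default threshold argument are hypothetical, and only the two standard deviations are taken from Fig.2.

```python
def contribution_rates(std_devs):
    """Proportion of total variance explained by each component."""
    variances = [s * s for s in std_devs]
    total = sum(variances)
    return [v / total for v in variances]


def components_to_keep(std_devs, threshold=0.80):
    """Smallest number of leading components whose cumulative rate reaches the threshold."""
    cumulative = 0.0
    for count, rate in enumerate(contribution_rates(std_devs), start=1):
        cumulative += rate
        if cumulative >= threshold:
            return count
    return len(std_devs)


fig2_std_devs = [1.4045851, 0.1647446]  # "Standard deviations" reported in Fig.2
rates = contribution_rates(fig2_std_devs)
print([round(rate, 4) for rate in rates])   # [0.9864, 0.0136]
print(components_to_keep(fig2_std_devs))    # 1
```

The recomputed rates agree with the "Proportion of Variance" row of Fig.2, and a single component is enough to pass the 80% threshold.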
4.3. Discussion for the Result of Evaluation Experiment
The numbers of papers published by Zimmermann, P. and Foster, I. are small, as shown in
Tables 4-7. In contrast, the numbers of papers published by Agrawal, R. and Fawcett, T. are
large. The values of the h-index, g-index, A-index and R-index rose in proportion to the number
of published papers. Figs. 3-7 compare the indices. The h-index, g-index and R-index were small
for Zimmermann, P. Incidentally, the calculated A-index of Zimmermann, P. is high. This is
because the A-index averages the citation counts of the papers in the h-core, and since only
one paper by Zimmermann, P. qualifies, his A-index is simply the citation count (1,226) of
that paper. According to an expert, Zimmermann, P. invented the e-mail encryption software
package Pretty Good Privacy [16] and has served as a Fellow at the Stanford Law School's
Center for Internet and Society. However, although he is a prominent scientist, he has produced
only a few applicable papers, and as such his calculated h-index, g-index and R-index values
were low. Conversely, the GV-index was calculated without being influenced by the low number
of published papers.
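The single-paper case discussed above can be checked with a short sketch. It uses the commonly cited definitions of the four indices, which this section does not spell out: the h-index (Hirsch), the g-index [2], the A-index as the mean citation count of the h-core, and the R-index [4] as the square root of the h-core citation total. Capping the g-index at the number of papers is an assumption, since some variants allow it to exceed the paper count. The function names are illustrative.

```python
import math

def h_index(citations):
    """Largest h such that h papers have at least h citations each."""
    ranked = sorted(citations, reverse=True)
    h = 0
    for rank, c in enumerate(ranked, start=1):
        if c >= rank:
            h = rank
        else:
            break
    return h

def g_index(citations):
    """Largest g (capped at the paper count, an assumption) such that
    the top g papers together have at least g**2 citations."""
    ranked = sorted(citations, reverse=True)
    g, total = 0, 0
    for rank, c in enumerate(ranked, start=1):
        total += c
        if total >= rank * rank:
            g = rank
    return g

def h_core(citations):
    """Citation counts of the h most cited papers."""
    return sorted(citations, reverse=True)[:h_index(citations)]

def a_index(citations):
    """Mean citation count of the h-core papers."""
    core = h_core(citations)
    return sum(core) / len(core) if core else 0.0

def r_index(citations):
    """Square root of the total citation count of the h-core papers."""
    return math.sqrt(sum(h_core(citations)))

# One applicable paper with 1,226 citations, as reported for Zimmermann, P.
single = [1226]
h, g = h_index(single), g_index(single)
a, r = a_index(single), r_index(single)
```

For this input the h-index and g-index are both 1 and the R-index is about 35, while the A-index equals the raw citation count of 1,226, which is why only the A-index appears high.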
Figure 3. GV-index
Figure 4. h-index
Figure 5. g-index
Figure 6. A-index
Figure 7. R-index
With regard to the experimental results above, we discuss what enabled a more qualitative
evaluation. Fig. 8 compares the transitions of the cluster-size growth rate for Agrawal, R.'s
data mining papers (eight in total). In Fig. 9, "Mining association rules between sets of items
in large databases, 1993" (hereinafter referred to as "Agrawal, R. 1993: Database") is extracted
from Fig. 8 on its own to show the transition of its cluster-size growth rate.
"Agrawal, R. 1993: Database" has the largest number of citations (3,649) in this evaluation
experiment, as shown in Table 4. However, the number of papers in the research area of this
paper has been declining since 2010, as shown in Fig. 8. Therefore, its GV-index value was low,
as shown in Table 4.
Figure 8. Comparison of the transitions of the cluster-size growth rate for the data mining
articles by Agrawal, R.
Figure 9. Transition of the cluster-size growth rate for "Agrawal, R. 1993: Database"
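As a rough illustration of how a growing and a declining research area differ, the sketch below computes a compound annual growth rate (CAGR, the measure described in reference [12]) and year-over-year growth for a series of yearly cluster sizes. Whether the growth rates plotted in Figs. 8 and 9 were computed in exactly this way is an assumption, and the cluster sizes below are hypothetical values, not read from the figures.

```python
def cagr(start_value, end_value, periods):
    """Compound annual growth rate over `periods` years."""
    if start_value <= 0 or periods <= 0:
        raise ValueError("start_value and periods must be positive")
    return (end_value / start_value) ** (1.0 / periods) - 1.0

def yearly_growth(sizes):
    """Year-over-year growth rates of a cluster-size series."""
    return [(b - a) / a for a, b in zip(sizes, sizes[1:])]

# Hypothetical cluster sizes (papers per year), NOT taken from Fig. 8 or Fig. 9.
growing = [100, 120, 150, 190]
declining = [200, 180, 150, 120]

growing_rate = cagr(growing[0], growing[-1], len(growing) - 1)
declining_rate = cagr(declining[0], declining[-1], len(declining) - 1)
```

A positive rate marks a growing area and a negative rate a declining one. Under the GV-index, a highly cited paper in a declining area would be expected to score lower than raw citation counts suggest.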
5. CONCLUSION
In this study, having surveyed the representative existing scientific contribution indices, we
pointed out that they are confined to quantitative analyses of articles published in the past.
Concretely, one problem is that a purely quantitative analysis cannot assess the degree of
growth of a research domain, that is, whether the research area addressed by an article is
declining or developing. Another problem is that a work that was often cited in the past but is
seldom cited now can still receive a high rating. To deal with these problems, this study first
calculated the importance values of scientists with the PageRank algorithm, taking into account
the degree of growth of the research domain and the variance of the years of the cited articles,
and used them as the observation data for principal component analysis. It then proposed a new
scientific contribution index (GV-index) obtained by carrying out principal component analysis
on these data. Further, we implemented a scientist network map based on the GV-index. In the
evaluation experiment, the values of the h-index, g-index, A-index and R-index grew in
proportion to the number of citations. In contrast, the GV-index could be calculated in
consideration of the growth rate of the research area.
REFERENCES
[1] Hirsch, J.E. (2007). Does the h index have predictive power? Proceedings of the National Academy of Sciences of the United States of America, 104(49), pp. 19193-19198.
[2] Egghe, L. (2006). Theory and practise of the g-index. Scientometrics, 69(1), pp. 131-152.
[3] Jin, B.H. (2006). h-Index: An evaluation indicator proposed by scientist. Science Focus, 1(1), pp. 8-9. (In Chinese)
[4] Jin, B.H., Liang, L.M., Rousseau, R., Egghe, L. (2007). The R- and AR-indices: Complementing the h-index. Chinese Science Bulletin, 52(6), pp. 855-863.
[5] Alonso, S., Cabrerizo, F.J., Herrera-Viedma, E., Herrera, F. (2010). hg-index: A new index to characterize the scientific output of researchers based on the h- and g-indices. Scientometrics, 82(2), pp. 391-400.
[6] Jin, B.H., Liang, L.M., Rousseau, R., Egghe, L. (2007). The R- and AR-indices: Complementing the h-index. Chinese Science Bulletin, 52(6), pp. 855-863.
[7] Antonakis, J., Lalive, R. (2008). Quantifying Scholarly Impact: IQp Versus the Hirsch h. Journal of the American Society for Information Science and Technology, 59(6), pp. 956-969.
[8] Page, L., Brin, S., Motwani, R., Winograd, T. (1998). The PageRank Citation Ranking: Bringing Order to the Web.
[9] Erdős, P., Rényi, A. (1959). On random graphs. Publicationes Mathematicae Debrecen, 6, pp. 290-297.
[10] Erdős, P., Rényi, A. (1960). On the evolution of random graphs. Magyar Tud. Akad. Mat. Kut. Int. Közl., 5, pp. 17-61.
[11] Erdős, P., Rényi, A. (1961). On the strength of connectedness of a random graph. Acta Math. Acad. Sci. Hungar., 12, pp. 261-267.
[12] http://www.investopedia.com/terms/c/cagr.asp
[13] Brin, S., Page, L. (1998). The anatomy of a large-scale hypertextual Web search engine. Computer Networks and ISDN Systems, 30(1-7).
[14] http://thomsonreuters.com/web-of-science/
[15] http://cran.r-project.org/bin/macosx/old/R-2.15.3.pkg/
[16] http://www.philzimmermann.com/JA/background/index.html
AUTHORS
Akira Otsuki
Akira Otsuki received his Ph.D. in engineering from Keio University (Japan) in 2012. He is
currently an associate professor at Tokyo Institute of Technology (Japan) and an officer of the
Japan Society of Information and Knowledge (JSIK). His research interests include analysis of
big data, data mining, academic landscape, and new knowledge creation support systems. He
received the JSIK Best Paper Award in 2012 and an Editage Inspired Researcher Grant in 2012.
Masayoshi Kawamura
Masayoshi Kawamura is a system engineer (Japan). He received his M.S. degree from
Kyoto Institute of Technology (Japan) in 1998. His research interests include image
processing, digital signal processing, and statistical data analysis.