ABSTRACT
This paper presents a webometric analysis of the academic search
engine result pages (SERPs) of the Chinese-language term of
“Wikipedia” across major Chinese-speaking regions of mainland
China, Hong Kong and Taiwan. Because of the academic
outcome, the findings can also be interpreted for further metaanalysis,
or “research about research”, of the Wikipedia research
in Chinese-language literatures. The findings cover the results
from four major search platforms: CNKI Scholar, Google Scholar
China, Google Scholar Hong Kong and Google Scholar Taiwan.
Cross tabulation of the results shows the major institutions
(journals and academic departments) and scholarly archives for
Chinese-language Wikipedia research. The findings suggest that
there exists a divide between mainland Chinese academic
sources/search results on one hand, and Hong Kong/Taiwanese
ones on the other. Meta-analysis based on academic SERPs have
implications for identifying the gaps and potentials in
internationalization of Wikipedia research.
What do Chinese-language microblog users do with Baidu Baike and Chinese Wiki...Hanteng Liao
ABSTRACT
This paper presents a case study of information engagement based on microblog posts gathered from Sina Weibo and Twitter that mentioned the two major Chinese-language user-generated encyclopaedias. The content analysis shows that microblog users not only engaged in public discussions by using and citing both encyclopaedias, but also shared their perceptions and experiences more generally with various online platforms and China’s filtering/censorship regime to which user-generated content and activities are subjected. This exploratory study thus raises several research and practice questions on the links between public discussions and information engagement on user-generated platforms.
Liao and petzold opensym berlin wikipedia geolinguistic normalizationHanteng Liao
This paper proposes a method of geo-linguistic normalization to advance the existing comparative analysis of open collaborative communities, with multilingual Wikipedia projects as the example. Such normalization requires data regarding the potential users and/or resources of a geolinguistic unit.
Prior empirical and theoretical work has discussed the role of dominant search engine plays in the function of information gatekeeping on the Web, and there are reports on the high ranking of Wikipedia website among the search engine result pages (SERP). However, little research has been conducted on non-Google search engines and non-English versions of user-generated encyclopedias. This paper proposes a method to quantify the “display” gatekeeping differences of the SERP ranking and presents findings based on the Chinese SERP data. Based on 2,500 mainly-Chinese-language search queries, the data set includes the SERP outcome of four Chinese-speaking regions (mainland China, Singapore, Hong Kong and Taiwan) provided by three major search engines (Baidu, and Google and Yahoo), covering over 97% of the search engine market in each region. The findings, analysed and visualized using network analysis techniques, demonstrate the followings: major user-generated encyclopedias are among the most visible; localization factors matter (certain search engine variants produce the most divergent outcomes, especially mainland Chinese ones). The indicated strong effects of “network gatekeeping” by search engines also suggest similar dynamics inside user-generated encyclopedias.
What do Chinese-language microblog users do with Baidu Baike and Chinese Wiki...Hanteng Liao
ABSTRACT
This paper presents a case study of information engagement based on microblog posts gathered from Sina Weibo and Twitter that mentioned the two major Chinese-language user-generated encyclopaedias. The content analysis shows that microblog users not only engaged in public discussions by using and citing both encyclopaedias, but also shared their perceptions and experiences more generally with various online platforms and China’s filtering/censorship regime to which user-generated content and activities are subjected. This exploratory study thus raises several research and practice questions on the links between public discussions and information engagement on user-generated platforms.
Liao and petzold opensym berlin wikipedia geolinguistic normalizationHanteng Liao
This paper proposes a method of geo-linguistic normalization to advance the existing comparative analysis of open collaborative communities, with multilingual Wikipedia projects as the example. Such normalization requires data regarding the potential users and/or resources of a geolinguistic unit.
Prior empirical and theoretical work has discussed the role of dominant search engine plays in the function of information gatekeeping on the Web, and there are reports on the high ranking of Wikipedia website among the search engine result pages (SERP). However, little research has been conducted on non-Google search engines and non-English versions of user-generated encyclopedias. This paper proposes a method to quantify the “display” gatekeeping differences of the SERP ranking and presents findings based on the Chinese SERP data. Based on 2,500 mainly-Chinese-language search queries, the data set includes the SERP outcome of four Chinese-speaking regions (mainland China, Singapore, Hong Kong and Taiwan) provided by three major search engines (Baidu, and Google and Yahoo), covering over 97% of the search engine market in each region. The findings, analysed and visualized using network analysis techniques, demonstrate the followings: major user-generated encyclopedias are among the most visible; localization factors matter (certain search engine variants produce the most divergent outcomes, especially mainland Chinese ones). The indicated strong effects of “network gatekeeping” by search engines also suggest similar dynamics inside user-generated encyclopedias.
Slides prepared and presented by Prof Dr Nara at Unimas 2012. For more detail, go to http://de-run.blogspot.com/2012/08/webometrics-and-launching-of-unimas-new.html
A handout of "Afternoon Talk on Impact Factor and Improved Access to Papers" (July 2018)
The pptx file is available: <http://hdl.handle.net/2115/71206>
No other medium has taken a more meaningful place in our life in such a short time than the world wide largest data network, the World Wide Web. However, when searching for information in the data network, the user is constantly exposed to an ever growing ood of information. This is both a blessing and a curse at the same time. The explosive growth and popularity of the world wide web has resulted in a huge number of information sources on the Internet. As web sites are getting more complicated, the construction of web information extraction systems becomes more difficult and time consuming. So the scalable automatic Web Information Extraction WIE is also becoming high demand. There are four levels of information extraction from the World Wide Web such as free text level, record level, page level and site level. In this paper, the target extraction task is record level extraction. Nwe Nwe Hlaing | Thi Thi Soe Nyunt | Myat Thet Nyo "The Data Records Extraction from Web Pages" Published in International Journal of Trend in Scientific Research and Development (ijtsrd), ISSN: 2456-6470, Volume-3 | Issue-5 , August 2019, URL: https://www.ijtsrd.com/papers/ijtsrd28010.pdfPaper URL: https://www.ijtsrd.com/computer-science/world-wide-web/28010/the-data-records-extraction-from-web-pages/nwe-nwe-hlaing
Trinity Kings World Leadership: Family Franchising Systems: Creating A Cultur...Terrell Patillo
1 Corinthians 14:40
Easy-to-Read Version (ERV)
40 But everything should be done in a way that is right and orderly.
1 Peter 5:2-3
Easy-to-Read Version (ERV)
2 take care of the group of people you are responsible for. They are God’s flock.[a] Watch over that flock because you want to, not because you are forced to do it. That is how God wants it. Do it because you are happy to serve, not because you want money. 3 Don’t be like a ruler over those you are responsible for. But be good examples to them.
Introduction MA Data, Culture and Society | University of Westminster, UKslejay
Datafication, the transformation of our everyday lives into digital data, poses great risks and opportunities for contemporary societies. This new MA course addresses, explores and researches this transformation. Industries increasingly rely on big data and dataficiation. Students therefore need analytical and practical skills to work with data in various sectors. The interdisciplinary course combines hands-on and applied approaches with theoretical learning. It encourages collaboration, group work and problem-based learning. Students will learn about analytical approaches to big data, algorithms, the Internet of Things, artificial intelligence, blockchain and other cutting-edge technologies. We will discuss and explore what the implications of such technologies for identities, politics, the economy and societies are.
Students will also be introduced to practical skills when it comes to the use, analysis and visualisation of data (such as data/text mining, social network analysis, digital discourse analysis, digital ethnography, sentiment analysis, geospatial analysis). Graduates from this programme will be fully capable and confident to combine these skills during their careers. Students who complete the MA Data, Culture and Society can work in a wide variety of sectors connected to data and the media and creative industries.
More information:
https://www.westminster.ac.uk/computer-science-and-software-engineering-journalism-and-mass-communication-courses/2019-20/september/full-time/data-culture-and-society-ma
Slides prepared and presented by Prof Dr Nara at Unimas 2012. For more detail, go to http://de-run.blogspot.com/2012/08/webometrics-and-launching-of-unimas-new.html
A handout of "Afternoon Talk on Impact Factor and Improved Access to Papers" (July 2018)
The pptx file is available: <http://hdl.handle.net/2115/71206>
No other medium has taken a more meaningful place in our life in such a short time than the world wide largest data network, the World Wide Web. However, when searching for information in the data network, the user is constantly exposed to an ever growing ood of information. This is both a blessing and a curse at the same time. The explosive growth and popularity of the world wide web has resulted in a huge number of information sources on the Internet. As web sites are getting more complicated, the construction of web information extraction systems becomes more difficult and time consuming. So the scalable automatic Web Information Extraction WIE is also becoming high demand. There are four levels of information extraction from the World Wide Web such as free text level, record level, page level and site level. In this paper, the target extraction task is record level extraction. Nwe Nwe Hlaing | Thi Thi Soe Nyunt | Myat Thet Nyo "The Data Records Extraction from Web Pages" Published in International Journal of Trend in Scientific Research and Development (ijtsrd), ISSN: 2456-6470, Volume-3 | Issue-5 , August 2019, URL: https://www.ijtsrd.com/papers/ijtsrd28010.pdfPaper URL: https://www.ijtsrd.com/computer-science/world-wide-web/28010/the-data-records-extraction-from-web-pages/nwe-nwe-hlaing
Trinity Kings World Leadership: Family Franchising Systems: Creating A Cultur...Terrell Patillo
1 Corinthians 14:40
Easy-to-Read Version (ERV)
40 But everything should be done in a way that is right and orderly.
1 Peter 5:2-3
Easy-to-Read Version (ERV)
2 take care of the group of people you are responsible for. They are God’s flock.[a] Watch over that flock because you want to, not because you are forced to do it. That is how God wants it. Do it because you are happy to serve, not because you want money. 3 Don’t be like a ruler over those you are responsible for. But be good examples to them.
Introduction MA Data, Culture and Society | University of Westminster, UKslejay
Datafication, the transformation of our everyday lives into digital data, poses great risks and opportunities for contemporary societies. This new MA course addresses, explores and researches this transformation. Industries increasingly rely on big data and dataficiation. Students therefore need analytical and practical skills to work with data in various sectors. The interdisciplinary course combines hands-on and applied approaches with theoretical learning. It encourages collaboration, group work and problem-based learning. Students will learn about analytical approaches to big data, algorithms, the Internet of Things, artificial intelligence, blockchain and other cutting-edge technologies. We will discuss and explore what the implications of such technologies for identities, politics, the economy and societies are.
Students will also be introduced to practical skills when it comes to the use, analysis and visualisation of data (such as data/text mining, social network analysis, digital discourse analysis, digital ethnography, sentiment analysis, geospatial analysis). Graduates from this programme will be fully capable and confident to combine these skills during their careers. Students who complete the MA Data, Culture and Society can work in a wide variety of sectors connected to data and the media and creative industries.
More information:
https://www.westminster.ac.uk/computer-science-and-software-engineering-journalism-and-mass-communication-courses/2019-20/september/full-time/data-culture-and-society-ma
This review demonstrates that using these websites can provide researchers with valuable sources of data and research, facilitating access to current literature and specialized scientific content. For optimal results, diversifying sources of research and using multiple search engines based on need and specialization is recommended
2013 CrossRef Annual Meeting, How CrossRef has Accelerated Science and Its Pr...Crossref
This presenation provides a brief oral history highlighting how literature review has been revolutionized by CrossRef. It will also cover FundRef and why it is so important to Federal agencies. Finally, it mentiones the status of the public access as encouraged by OSTP and how FundRef potentially enables CHORUS to play a major role.
Universities and their web-based library services : a study of their relati...Sangeeta Dhamdhere
Sangeeta Namdev Dhamdhere and Egbert de Smet(2017). " A quantitative analysis of library web-services and their rankings related to the universities context". Abstract accepted at 9th Qualitative and Quantitative Methods in Libraries International Conference during 23-26 May 2017 held at The Savoy Hotel, Limerick city, Ireland.
Open access for researchers, policy makers and research managers - Short ver...Iryna Kuchma
Presented at Open Access: Maximising Research Impact, April 23 2009, New Bulgarian University Library, Sofia. Open access for researchers: enlarged audience, citation impact, tenure and promotion. Open access for policy makers and research managers:
new tools to manage a university’s image and impact. How to maximize the visibility of research publications, improve the impact and influence of the work, disseminate the results of the research, showcase the quality of the research in the Universities and research institutions, better measure and manage the research in the institution, collect and curate the digital outputs, generate new knowledge from existing findings, enable and encourage collaboration, bring savings to the higher education sector and better return on investment. What are the key functions for research libraries?
Relationship of Google Scholar Versions and Paper CitationsNader Ale Ebrahim
The number of citations that a paper has received is the most commonly used indicator to measure the quality of research. Researchers, journals, and universities want to receive more citations for their scholarly publications to increase their h-index, impact factor, and ranking respectively. In this paper, we tried to analyses the effect of the number of available Google Scholar versions of a paper on citations count. We analyzed 10,162 papers which are published in Scopus database in year 2010 by Malaysian top five universities. Then we developed a software to collect the number of citations and versions of each paper from Google Scholar automatically. The result of spearman correlation coefficient revealed that there is positive significant association between the number of Google Scholar versions of a paper and the number of times a paper has been cited.
Similar to Chinese-language literature about Wikipedia: a metaanalysis of academic search engine result pages (20)
DevOps and Testing slides at DASA ConnectKari Kakkonen
My and Rik Marselis slides at 30.5.2024 DASA Connect conference. We discuss about what is testing, then what is agile testing and finally what is Testing in DevOps. Finally we had lovely workshop with the participants trying to find out different ways to think about quality and testing in different parts of the DevOps infinity loop.
Epistemic Interaction - tuning interfaces to provide information for AI supportAlan Dix
Paper presented at SYNERGY workshop at AVI 2024, Genoa, Italy. 3rd June 2024
https://alandix.com/academic/papers/synergy2024-epistemic/
As machine learning integrates deeper into human-computer interactions, the concept of epistemic interaction emerges, aiming to refine these interactions to enhance system adaptability. This approach encourages minor, intentional adjustments in user behaviour to enrich the data available for system learning. This paper introduces epistemic interaction within the context of human-system communication, illustrating how deliberate interaction design can improve system understanding and adaptation. Through concrete examples, we demonstrate the potential of epistemic interaction to significantly advance human-computer interaction by leveraging intuitive human communication strategies to inform system design and functionality, offering a novel pathway for enriching user-system engagements.
Unlocking Productivity: Leveraging the Potential of Copilot in Microsoft 365, a presentation by Christoforos Vlachos, Senior Solutions Manager – Modern Workplace, Uni Systems
A tale of scale & speed: How the US Navy is enabling software delivery from l...sonjaschweigert1
Rapid and secure feature delivery is a goal across every application team and every branch of the DoD. The Navy’s DevSecOps platform, Party Barge, has achieved:
- Reduction in onboarding time from 5 weeks to 1 day
- Improved developer experience and productivity through actionable findings and reduction of false positives
- Maintenance of superior security standards and inherent policy enforcement with Authorization to Operate (ATO)
Development teams can ship efficiently and ensure applications are cyber ready for Navy Authorizing Officials (AOs). In this webinar, Sigma Defense and Anchore will give attendees a look behind the scenes and demo secure pipeline automation and security artifacts that speed up application ATO and time to production.
We will cover:
- How to remove silos in DevSecOps
- How to build efficient development pipeline roles and component templates
- How to deliver security artifacts that matter for ATO’s (SBOMs, vulnerability reports, and policy evidence)
- How to streamline operations with automated policy checks on container images
Generative AI Deep Dive: Advancing from Proof of Concept to ProductionAggregage
Join Maher Hanafi, VP of Engineering at Betterworks, in this new session where he'll share a practical framework to transform Gen AI prototypes into impactful products! He'll delve into the complexities of data collection and management, model selection and optimization, and ensuring security, scalability, and responsible use.
Securing your Kubernetes cluster_ a step-by-step guide to success !KatiaHIMEUR1
Today, after several years of existence, an extremely active community and an ultra-dynamic ecosystem, Kubernetes has established itself as the de facto standard in container orchestration. Thanks to a wide range of managed services, it has never been so easy to set up a ready-to-use Kubernetes cluster.
However, this ease of use means that the subject of security in Kubernetes is often left for later, or even neglected. This exposes companies to significant risks.
In this talk, I'll show you step-by-step how to secure your Kubernetes cluster for greater peace of mind and reliability.
Observability Concepts EVERY Developer Should Know -- DeveloperWeek Europe.pdfPaige Cruz
Monitoring and observability aren’t traditionally found in software curriculums and many of us cobble this knowledge together from whatever vendor or ecosystem we were first introduced to and whatever is a part of your current company’s observability stack.
While the dev and ops silo continues to crumble….many organizations still relegate monitoring & observability as the purview of ops, infra and SRE teams. This is a mistake - achieving a highly observable system requires collaboration up and down the stack.
I, a former op, would like to extend an invitation to all application developers to join the observability party will share these foundational concepts to build on:
In his public lecture, Christian Timmerer provides insights into the fascinating history of video streaming, starting from its humble beginnings before YouTube to the groundbreaking technologies that now dominate platforms like Netflix and ORF ON. Timmerer also presents provocative contributions of his own that have significantly influenced the industry. He concludes by looking at future challenges and invites the audience to join in a discussion.
GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...James Anderson
Effective Application Security in Software Delivery lifecycle using Deployment Firewall and DBOM
The modern software delivery process (or the CI/CD process) includes many tools, distributed teams, open-source code, and cloud platforms. Constant focus on speed to release software to market, along with the traditional slow and manual security checks has caused gaps in continuous security as an important piece in the software supply chain. Today organizations feel more susceptible to external and internal cyber threats due to the vast attack surface in their applications supply chain and the lack of end-to-end governance and risk management.
The software team must secure its software delivery process to avoid vulnerability and security breaches. This needs to be achieved with existing tool chains and without extensive rework of the delivery processes. This talk will present strategies and techniques for providing visibility into the true risk of the existing vulnerabilities, preventing the introduction of security issues in the software, resolving vulnerabilities in production environments quickly, and capturing the deployment bill of materials (DBOM).
Speakers:
Bob Boule
Robert Boule is a technology enthusiast with PASSION for technology and making things work along with a knack for helping others understand how things work. He comes with around 20 years of solution engineering experience in application security, software continuous delivery, and SaaS platforms. He is known for his dynamic presentations in CI/CD and application security integrated in software delivery lifecycle.
Gopinath Rebala
Gopinath Rebala is the CTO of OpsMx, where he has overall responsibility for the machine learning and data processing architectures for Secure Software Delivery. Gopi also has a strong connection with our customers, leading design and architecture for strategic implementations. Gopi is a frequent speaker and well-known leader in continuous delivery and integrating security into software delivery.
Sudheer Mechineni, Head of Application Frameworks, Standard Chartered Bank
Discover how Standard Chartered Bank harnessed the power of Neo4j to transform complex data access challenges into a dynamic, scalable graph database solution. This keynote will cover their journey from initial adoption to deploying a fully automated, enterprise-grade causal cluster, highlighting key strategies for modelling organisational changes and ensuring robust disaster recovery. Learn how these innovations have not only enhanced Standard Chartered Bank’s data infrastructure but also positioned them as pioneers in the banking sector’s adoption of graph technology.
Pushing the limits of ePRTC: 100ns holdover for 100 daysAdtran
At WSTS 2024, Alon Stern explored the topic of parametric holdover and explained how recent research findings can be implemented in real-world PNT networks to achieve 100 nanoseconds of accuracy for up to 100 days.
Threats to mobile devices are more prevalent and increasing in scope and complexity. Users of mobile devices desire to take full advantage of the features
available on those devices, but many of the features provide convenience and capability but sacrifice security. This best practices guide outlines steps the users can take to better protect personal devices and information.
Secstrike : Reverse Engineering & Pwnable tools for CTF.pptx
Chinese-language literature about Wikipedia: a metaanalysis of academic search engine result pages
1. Note. A pre-print version accepted for presentation at Opensym 2014. It may be different from the final version for the conference proceeding.
Chinese-language literature about Wikipedia: a meta-
analysis of academic search engine result pages
Han-Teng Liao
Oxford Internet Institute
University of Oxford
Oxford, United Kingdom
hanteng@gmail.com
Bin Zhang
Center for the Studies of Information Resources
Wuhan University
Wuhan 430072, P. R. China
zb0205@126.com
ABSTRACT
This paper presents a webometric analysis of the academic search
engine result pages (SERPs) of the Chinese-language term of
“Wikipedia” across major Chinese-speaking regions of mainland
China, Hong Kong and Taiwan. Because of the academic
outcome, the findings can also be interpreted for further meta-
analysis, or “research about research”, of the Wikipedia research
in Chinese-language literatures. The findings cover the results
from four major search platforms: CNKI Scholar, Google Scholar
China, Google Scholar Hong Kong and Google Scholar Taiwan.
Cross tabulation of the results shows the major institutions
(journals and academic departments) and scholarly archives for
Chinese-language Wikipedia research. The findings suggest that
there exists a divide between mainland Chinese academic
sources/search results on one hand, and Hong Kong/Taiwanese
ones on the other. Meta-analysis based on academic SERPs have
implications for identifying the gaps and potentials in
internationalization of Wikipedia research.
Categories and Subject Descriptors
[World Wide Web]: Web searching and information discovery –
Web search engines, Page and site ranking
Keywords
Chinese Internet, Chinese research on Wikipedia, academic
search engines, academic databases.
1. INTRODUCTION
The ideal of using human knowledge to engage citizens has often
been subject to parochial and national concerns despite the
universal ideal of enlightenment encyclopedias. During the
European enlightenment, geographic and linguistic barriers were
among the major challenges for knowledge collection and
diffusion [2, 6, 12]. Facing similar challenges, the Wikimedia
Foundation, the hosting organization for all Wikipedia projects,
has targeted the “Global South” regions of Brazil, India, and the
Arabic language countries for engagement [3]. Nonetheless, some
research has suggested that cultural and linguistic factors have
prevented wider acceptance of Wikipedia, especially the Chinese
and Korean versions [14, 16]. Thus, internationalization of
Wikipedia research is needed for better understanding of the
Wikipedia research around the world beyond English-language
literatures.
This paper aims to contribute to such internationalization efforts
by looking at the current published Chinese-language literature as
reported by major academic search engine results.
2. Methods
The exploratory method mimics the information seeking actions
likely to be executed by Chinese-language Wikipedia researchers.
The query of the Chinese-language term “Wikipedia” is submitted
to major academic search engine platforms, and then the search
engine result pages (SERPs) are scraped and mined for further
analysis.
Because of the academic outcome, the expected findings can be
interpreted for further meta-analysis, or “research about research”
[1], of the Wikipedia research in Chinese-language literatures. It
should be noted, however, the meta-analysis applied to social
science research often involves a hypothesis that are being
examined by a body of work. The work to be presented in this
paper, on the other hand, is more of a descriptive meta-analysis of
search engine result pages, or “search about research”. The aim of
the research is to identify search result patterns based on the
results provided by popular Chinese-language search platform as
an exploratory study, not to test a hypothesis normally seen in
meta-analysis.
2.1 Data selection and data sets
This study includes CNKI scholar and three localized versions of
Google Scholar (China, Hong Kong and Taiwan).
CNKI refers to Chinse National Knowledge Infrastructure that
was established in 1999 by Tsinghua University and Tsinghua
Tongfang, with the aim to “achieve full social sharing and
dissemination of knowledge resources”[5, 7, 17].
For data collection, the Chinese term “Wikipedia” was submitted
to the four academic search engines, with the number of results
shown in Table 1. Because mainland China uses simplified
Chinese script and Hong Kong and Taiwan use traditional
Chinese script[10], the queries of “Wikipedia” were submitted to
the academic search engine platforms differently accordingly. For
mainland Chinese platforms, the query of “维基百科” is used.
For Hong Kong and Taiwan, the query of “維基百科” is used.
Each search engine reported having different number of results:
ranging from the highest one reported by Google Scholar China to
the lowest one reported by CNKI scholar. However, these
numbers do not correspond to the numbers of results that users
can actually view. For instance, CNKI scholar has a limit at 500
and the Google Scholar has a cap at 1000, thereby imposing a
limit on the number of samples for this research. The second row
Permission to make digital or hard copies of all or part of this work for
personal or classroom use is granted without fee provided that copies are
not made or distributed for profit or commercial advantage and that
copies bear this notice and the full citation on the first page. To copy
otherwise, or republish, to post on servers or to redistribute to lists,
requires prior specific permission and/or a fee.
OpenSym '14 August 27 - 29 2014, Berlin, Germany
Copyright 2014 ACM xxx-x-xxxx-xxxx-x/xx/xx ...$xx.xx .
2. 2
of Table 1 shows the number of results actually scraped and
studied in this paper. The dataset examined in this paper thus
contains 3464 data points in total, each of which contains, if
available, meta information of title, authors, institutions, links and
short descriptions of the result.
Table 1. Number of search engine result items
CNKI
Scholar
Google Scholar
China (CN)
Google Scholar
Hong Kong (HK)
Google Scholar
Taiwan (TW)
Reported 464 27,300 3,950 3,950
Scrapped and
studied
464 1,000 1,000 1,000
2.2 Cross-tabulation analysis
By extracting the meta information from the search results,
several cross-tabulation is made to examine the patterns regarding
the major publishing institutions, archive platforms, and
publication years.
3. Results
The findings of crosstabs are presented as follows. As the results
for Google Scholar Hong Kong and Google Scholar Taiwan are
almost identical (only 13 different out of 1000 results), only the
results for Google Scholar Taiwan will be listed if the crosstabs
results are identical for both Google Scholar Taiwan and Google
Scholar Hong Kong.
3.1 Major (publishing) institutions
Table 2 lists the most frequently-appearing source institutions,
which include academic journals, degree-granting institutions and
popular science magazines. Only those receive 10 or more hits in
total are listed.
Table 2. Most frequently appearing source institutions
Institutions
(in Chinese)
Institutions
CNKI
Scholar
Google
Scholar
CN
Google
Scholar
TW
Grand
Total
情报学报 Journal of the China Society for Scientific
and Technical Information
0 65 0 65
国家图书馆学刊 Journal of the National Library of China 1 41 0 42
臺灣大學企業管理碩
士專班學位論文
MBA dissertations, National Taiwan
University
0 0 29 29
臺灣師範大學工業科
技教育學系學位論文
Department of Industrial Technology
Education Dissertations, Taiwan Normal
University
0 0 28 28
圖文傳播藝術學報 NTUA Department of Graphic
Communication Arts
0 0 23 23
现代图书情报技术 Modern Technology of Library and 5 12 0 17
互联网周刊 China Internet Weekly 6 10 0 16
图书情报工作 Library and Information Service 5 9 0 14
青年记者 "Youth Reporter" magazine 3 11 0 14
中文信息学报 Journal of Chinese Information Processing 6 7 0 13
互联网天地 China Internet Magazine 7 6 0 13
上海交通大学 Shanghai Jiao Tong University 13 0 0 13
情报资料工作 Information and Documentation Services 3 9 0 12
PAR表演藝術雜誌 Performance Arts Review 0 0 12 12
图书馆理论与实践 Library Theory and Practice 5 6 0 11
情报理论与实践 Information Studies: Theory & Application 5 6 0 11
哈尔滨工业大学 Harbin Institute of Technology 10 0 0 10
情报杂志 Journal of Intelligence 7 3 0 10
现代情报 Modern Information 2 8 0 10
The academic journals contain mostly mainland Chinese
information and library science journals, including Journal of the
China Society for Scientific and Technical Information, Journal of
the National Library of China, Modern Technology
of Library and Information, Library and Information Service,
Journal of Chinese Information Processing, Information and
Documentation Services, Library Theory and Practice,
Information Studies: Theory & Application, Journal of
Intelligence, and Modern Information.
Taiwanese journals such as NTUA Department of Graphic
Communication Arts, and Performance Arts Review are also listed
in Table 2. The listed degree-granting academic bodies include
both Taiwanese (MBA and education dissertations) and mainland
Chinese (Shanghai Jiao Tong University, and Harbin Institute of
Technology) ones.
The remaining listed institutions include three popular science
magazines: China Internet Weekly, "Youth Reporter"(author’s
translation) magazine, and China Internet Magazine. A clear
division is shown in Table 2 between the mainland Chinese and
Taiwanese results. (The Hong Kong results are identical to the
Taiwanese ones.) Table 2 shows no overlapping results.
Between CNKI Scholar and Google Scholar China, substantial
overlapping results exists for journals and magazines. Google
Scholar China does not seem to include as many degree
dissertations as CNKI Scholar.
3.2 Major (archive) platforms
Table 3 lists the most frequently appearing source platforms for
each search engine results. (The Hong Kong results are identical
to the Taiwanese ones.) As these data points indicate potential
clicks to the websites that provide either the metadata or full texts
of the citation, they provide important insights into the major
academic (archive) database online platforms in the Chinese-
speaking regions.
Table 3. Most frequently appearing source platforms
CNKI
Scholar
Google Scholar
China
Google Scholar
Taiwan
Grand Total
CNKI platfrom 464 219 0 683
Education Taiwan 0 2 605 607
Miscellaneous 0 346 248 594
CQVIP platform 0 378 0 378
Airitilibrary platform 0 0 147 147
Wanfang Data platform 0 55 0 55
Grand Total 464 1000 1000 2464
It is clear that CKNI Scholar is the academic database itself with
all the links pointing to its own sites. In contrast, Google Scholar
China links outwards to three mainland Chinese academic
databases: CNKI platform, CQVIP platform, Wangfang Data
platform. CQVIP refers to Chongqing VIP Information Co., Ltd.,
a company has its historical roots under the Chinse Ministry of
Science and Technology and is a strategic partner of Google
search since 2005. Wangfang Data is another joint enterprise by
various Chinese research institutes and publishers [5, 7, 17].
Google Scholar Taiwan links to mostly Taiwanese education
websites with the domain names of “edu.tw” and a Taiwanese
academic database called Airitilibrary platform.
Again, judging from the archive academic database results, there
is little overlapping among the search results. The only exceptions
are first the links to CNKI platform by CNKI Scholar and Google
Scholar China, and second the two links to Taiwanese educational
websites by Google Scholar China and Google Scholar Taiwan.
3. 3
When compared with CNKI Scholar, Google Scholar China is
more neutral in terms of platform choices given to the users.
Table 4 shows the same data points listed in Table 3 with the
details of the subdomain names. CNKI Scholar and Google
Scholar China links to different web interfaces hosted by the
CNKI, with heavy concentration of links in certain subdomain
names. Google Scholar Taiwan’s links to Taiwanese educational
websites are much less concentrated.
Table 4. Most frequently appearing source platforms and
their subdomain names
CNKI
Scholar
Google Scholar
China
Google Scholar
Taiwan
Grand Total
CNKI platfrom
www.cnki.net 447 0 0 447
www.cnki.com.cn 0 103 0 103
cdmd.cnki.com.cn 0 103 0 103
cpfd.cnki.com.cn 0 13 0 13
dbpub.cnki.net 13 0 0 13
d.scholar.cnki.net 4 0 0 4
Subtotal 464 219 0 683
Education Taiwan*
thesis.lib.ncu.edu.tw 0 0 75 75
pc01.lib.ntust.edu.tw 0 1 52 53
other* 0 1 478 479
Subtotal 0 2 605 607
Other websites 0 346 248 594
CQVIP platform
www.cqvip.com 0 377 0 377
2010.cqvip.com 0 1 0 1
Subtotal 0 378 0 378
Airitilibrary platform
www.airitilibrary.com 0 0 147 147
Wanfang Data platform
d.wanfangdata.com.cn 0 55 0 55
Grand Total 464 1000 1000 2464
Note*: Taiwanese educational websites include 89 instances, with
9 subdomain names having more than 30 links to them.
3.3 Growth and freshness
The reported publication years can provide some insights into the
growth and freshness of the found Chinese-language results on
Wikipedia. Figure 1 shows the results from 2002 to 2014,
indicating a general trend of growth from 2002 to 2011. (The
Hong Kong results are identical to the Taiwanese ones.)
0
20
40
60
80
100
120
140
160
CNKI
Scholar
Google
Scholar CN
Google
Scholar HK
Google
Scholar TW
Figure 1. The trendlines for number of publications each year
One should not jump into the conclusion by interpreting the
results of the years of 2012-2014 as indication of decline. The
new publications may still be in the process of archiving or
indexing. In early 2012, based on the doctoral and master theses
from Taiwanese and mainland Chinese databases, we observed a
growing trend from 2003 to 2009 (for Taiwan) or 2010 (for
mainland China), and then a decrease in numbers from 2010
onwards to 2012 [9]. In light of the new findings shown in Figure
1, the 2011 results indicate a continuing growth for the Google
Scholar China and CNKI scholar results, and a rebound for the
Google Scholar Taiwan results. Hence, it is more likely that there
exists a two- to three-year window where fresh publications can
be archived and indexed by major Chinese-language academic
databases.
4. Discussion
The status of the Chinese-language literature regarding
Wikipedia, based on the SERPs from major academic search
platforms across the regions of mainland China, Taiwan and Hong
Kong, provide interesting findings as follows.
First, the presence of major institutions of academic journals,
degree-granting bodies, and popular science magazines indicate a
substantial interest of these regions in Wikipedia. The distribution
appears to focus on information and library science, management
studies, popular science on the Internet. Little overlapping exists
though between mainland Chinese and Hong Kong/Taiwanese
results.
Second, the linking patterns show a distinct choice of academic
databases used by these academic search platforms. CNKI
Scholar seems to keep the links to its own websites. Google
Scholar China links more evenly among three mainland Chinese
academic databases: CNKI, CQVIP and Wangfang Data.
In addition to several Taiwanese educational websites, Google
Scholar Taiwan links to a Taiwanese academic database
Airitilibrary platform.
Third, the trend graph based on the reported publication years
indicate a growing number of publications from 2002 to 2011. In
conjunction with the previous observation made in 2012[9], the
currently shown decreasing results for 2012-2014 are likely the
outcomes of the possible two- to three-year window for a
Chinese-language publication to be archived and indexed.
The findings show a worrying situation where there exists little
overlapping results between the mainland Chinese and Hong
Kong/Taiwanese results. It is worrying because Chinese
Wikipedia project is actually an open collaboration project
contributed and managed by Chinese Wikipedians across these
Chinese-speaking regions and beyond[11]. Wikipedia researchers
who intend to conduct a literature review on Chinese-language
literatures must, at the current situation, acknowledge the division
of the found academic search results and find literature across
Chinese-language academic search platforms. Otherwise, the
coverage of the literature review is likely to be bounded and thus
limited to a certain regional perspective.
After trying entering different Chinese scripts into different
search engines, we found that the Google Scholar results appear
to return different results if the keywords were typed in different
scripts. In contrast, the CNKI Scholar results appear to return
similar and even smaller number of results. It means that users of
Google Scholar can simply switch the keywords between
traditional Chinese and simplified Chinese scripts for different
scopes of results. It also presents a search engine design challenge
in integrating sources that contain some degree of geolinguistic
4. 4
variations [10]. This research has captured the majority of users
choice of Chinese scripts for each of the regions of mainland
China, Hong Kong and Taiwan, future research can be conducted
to include both scripts.
Another direction for future research is to consider the results of
other academic search engines in China such as CQVIP platform
and Wangfang Data. It is noted that Baidu, the biggest search
engine company in China, does not offer any academic search.
5. Conclusion
Internet internationalization has been an important aspect of its
development [15] and the internationalization of media studies
has been slowly catching up to examine the implications[4].
Wikipedia research faces the similar situation where
internationalization of methods and findings is needed for better
research and practices. This paper contributes to such an effort
with a meta-analysis of academic SERPs for Chinese-language
literatures.
Indeed, digital support for the main East Asian languages
(Chinese, Japanese and Korean) becomes an important milestone
for the Internet’s internationalization. As put by a major East
Asian Science Technology and Society (STS) scholar, Nakayama
Shigeru [13]:
East Asians are accustomed to dealing with a multibyte system,
in contrast to Western monobyte reductionist culture. It may be
that in the future our multibyte culture will prove advantageous
for dealing with complex systems.
It is in this technical and linguistic context that Chinese
Wikipedia contributes to the global Wikimedia movement with
their internationalization innovations in multi-script editing
platforms and automatic conversions that are now part of the
MediaWiki codes [8].
However, the Chinese-language Wikipedia research seems to be
lagging if not stagnant in promoting academic exchanges across
Chinese-speaking regions. Future research is needed to examine
whether the citations of Chinese-language Wikipedia research are
also divided between mainland China on one side and Hong
Kong/Taiwan on the other. The findings here clearly indicate
such a divide, in terms of academic search engine platforms,
publication institutions and academic archive databases.
Similar research efforts can be conducted for other languages
which have contributed to the Wikipedia research, for instance,
Spanish, Arabic, Korean and Japanese. The meta analysis of
academic SERPs cannot replace in-depth literature review,
though, future work is still needed to foster the scientific
exchange beyond just English language. Nevertheless, the meta
analysis based on major academic SERPs conducted here presents
a set of reproducible procedures to monitor the current status of
Wikipedia research through the information windows that are
used daily by academics and students, thereby pointing
researchers to the gaps and potentials in internationalization of
Wikipedia research.
6. ACKNOWLEDGMENTS
We acknowledge the open source software tools called Scrapy for
making the web crawling and scraping tasks easier.
7. REFERENCES
[1] Card, N.A. 2012. Applied Meta-analysis for Social Science
Research. Guilford Press.
[2] Delanty, G. 2009. The Cosmopolitan Imagination: The
Renewal of Critical Social Theory. Cambridge University
Press.
[3] Global Development - Meta: 2012.
https://meta.wikimedia.org/wiki/Global_Development.
Accessed: 2012-03-19.
[4] Goggin, G. and McLelland, M. 2009. The
internationalization of the internet and its implications for
media studies. Internationalizing Media Studies. D.K.
Thussu, ed. Routledge. 294–307.
[5] Hsu, C.-H.[ 徐 嘉 晧 ] 2008. A Comparative Study on
Chinese Electronic Journal Systems between Taiwan and
China [ 海峽 兩岸中文電子期刊系統之比較研究].
National Chengchi University.
[6] Israel, J.I. 2001. Radical Enlightenment: Philosophy and
the Making of Modernity 1650-1750. Oxford University
Press, USA.
[7] Li, J.L.[ 李 金 兰 ] 2011. Comparative Analysis on
Resources of CNKI, Wanfang Data and Weipu
Information. [CNKI, 万 方 , 维 普 资 源 比 较 与 分 析 ].
Information Research [情报探索]. 4, (2011), 59–61.
[8] Liao, H.-T. 2009. Conflict and Consensus in the Chinese
version of Wikipedia. IEEE Technology and Society
Magazine. 28, 2 (2009), 49–56.
[9] Liao, H.-T. 2012. Growth of Academic Interest in
Wikipedia From Major Chinese-speaking regions: Has it
peaked? Oxford Internet Institute Blog.
[10] Liao, H.-T. 2013. How does localization influence online
visibility of user-generated encyclopedias? A case study on
Chinese-language Search Engine Result Pages (SERPs).
[11] Liao, H.-T. 2014. The Cultural Politics of User-Generated
Encyclopedias: Comparing Chinese Wikipedia and Baidu.
University of Oxford.
[12] Roche, D. 2006. Encyclopedias and the diffusion of
knowledge. The Cambridge History of Eighteenth-Century
Political Thought. M. Goldie and R. Wokler, eds.
Cambridge University Press. 172–194.
[13] Shigeru, N.(中山茂) 2001. The Digital Revolution and
East Asian Science. Historical Perspectives on East Asian
Science, Technology, and Medicine. A.K.L. Chan et al.,
eds. World Scientific.
[14] Shim, J. and Yang, J. 2009. Why is Wikipedia not more
widely accepted in Korea and China? Factors affecting
knowledge-sharing adoption. Decision Line. 40, 2 (2009),
12–15.
[15] Stone, A. 2003. Internationalizing the internet. IEEE
Internet Computing. 7, 3 (May 2003), 11–12.
[16] Suo, H. 2007. The development of Wiki sites in China: An
exploratory study. [中国维基网站发展原因之初探].
Harmonious Society, Civic Society, and Mass Media [和谐
社 会 、 公 民 社 会 与 大 众 媒 介 ]. China University
Communication Press [中国传媒大学出版社].
[17] Tan, J.[谭捷] et al. 2010. A Comparative Study of Chinese
Academic Journal Databases [中文学术期刊数据库的比
较研究]. Document, Information & Knowledge [图书情报
知识]. 4 (2010), 4–13.