This document analyzes the relative popularity of 10 Hong Kong universities by searching for their abbreviations in the Corpus of Global Web-based English and counting occurrences. The University of Hong Kong is found to be the most mentioned, with 1124 occurrences, while Hong Kong Shue Yan University is the least mentioned, with only 14 occurrences. The document suggests this may reflect differences in universities' online promotion and open learning opportunities more than academic status. It concludes by noting that the analysis is informal and that the corpus methodology used has limitations.
Corpus Report
Copyright by Charles Ko Ka Shing, 2012. Know more about
IHBCM, www.ihbcm.webs.com; Google/Yahoo: IHBCM/Charles KoKaShing; www.charlesksko.webs.com
Originally published on SlideShare in ppt format
Corpus Report: Comparison of the popularity of Hong Kong universities around
the globe
Email: ihbcmcharles@gmail.com
Aim and Objectives:
• To introduce readers to the power of Corpus Linguistics
• To stimulate the readers’ curiosity about the use of corpora as a research
methodology of any kind
Foreword
What is Corpus?
Corpus (Latin plural corpora, English plural corpuses or corpora) is Latin for
body. It may refer to: Habeas corpus, a legal mechanism to end detention of a
suspect; Corpus delicti, a legal term meaning "body of the crime...
http://en.wikipedia.org/wiki/Corpus :
• [Corpus in Linguistics, Applied Linguistics and Corpus Linguistics]
– “Text corpus, in linguistics, a large and structured set of texts
– Speech corpus, in linguistics, a large set of speech audio files”
What is Corpus Linguistics?
“[…] corpus linguistics is a whole system of methods and principles of
how to apply corpora in language studies and teaching/learning, it
certainly has a theoretical status. Yet theoretical status is not theory in
itself…”
(McEnery et al. 2006: 7f.)
Introduction
In this paper, I use the most up-to-date corpus, the Corpus of Global Web-based
English (GloWbE)¹, to compare the “fame” of 10 selected Hong Kong
“universities”, in order to report, initially and informally, how people from
around the world mention the names of the following education institutions:
1. The Open University of Hong Kong (OUHK)
2. The Hong Kong Polytechnic University (PolyU)
3. City University of Hong Kong (CityU)
4. Hong Kong Shue Yan University (SYU)
5. Hong Kong Academy for Performing Arts (HKAPA)
6. The Hong Kong Institute of Education (HKIEd)
7. The Chinese University of Hong Kong (CUHK)
8. The University of Hong Kong (HKU)
9. Hong Kong Baptist University (HKBU)
10. The Hong Kong University of Science & Technology (HKUST)
¹ The Corpus of Global Web-based English (GloWbE), http://corpus2.byu.edu/glowbe/, which is
developed on the corpus2 website, covers 20 varieties of English.
Methodology
I typed the abbreviation of each institution’s name, one by one, into the
corpus search to find out which institution’s name is mentioned the most
across the varieties of English (or Englishes) around the world.
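The counting step described above can be illustrated with a minimal Python sketch. It does not query GloWbE itself; it only counts whole-word matches of each abbreviation in a local text. The function name `count_abbreviations`, the sample sentence, the simple tokenisation rule and the case-insensitive matching are all assumptions made for illustration, not part of the original procedure.

```python
import re
from collections import Counter

# Abbreviations taken from the list of institutions in this report.
ABBREVIATIONS = ["OUHK", "PolyU", "CityU", "SYU", "HKAPA",
                 "HKIEd", "CUHK", "HKU", "HKBU", "HKUST"]


def count_abbreviations(text, abbreviations=ABBREVIATIONS):
    """Count whole-word, case-insensitive matches of each abbreviation."""
    # Assumption: a token is a run of letters or digits. The real GloWbE
    # tokenisation and matching rules may differ.
    tokens = Counter(t.lower() for t in re.findall(r"[A-Za-z0-9]+", text))
    return {abbr: tokens[abbr.lower()] for abbr in abbreviations}


# Hypothetical sample text, NOT taken from GloWbE.
sample = "I studied at HKU. My sister prefers CityU, but HKU and HKUST are closer."
counts = count_abbreviations(sample)

# Print the abbreviations from most to least mentioned.
for abbr, n in sorted(counts.items(), key=lambda item: item[1], reverse=True):
    print(abbr, n)
```

Because whole tokens are compared, "HKU" is not counted inside "HKUST", which mirrors a search made without wildcards.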
Analysis and Evaluation
It is found that the worldwide “status” (occurrences in the GloWbE corpus) of
the University of Hong Kong is the highest, while that of Hong Kong Shue
Yan University is the lowest.
In addition, the following table shows the word frequency² of all ten selected
Hong Kong institutions, in rank order (1 = highest frequency; 10 = lowest
frequency):

Rank  Word type  Occurrences in GloWbE
1     HKU        1124
2     CUHK       640
3     HKUST      590
4     PolyU      341
5     CityU      164
6     OUHK       161
7     HKIEd      78
8     HKBU       65
9     HKAPA      28
10    SYU        14

(See more data in Appendix 2.)
² In computational linguistics, a frequency list is a sorted list of words (word types) together with their
frequency, where frequency here usually means the number of occurrences in a given corpus.
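As a small illustration of a frequency list, the Python sketch below sorts the occurrence counts reported in this section and adds each name's share of the total. The counts are the ones given above; the function name `frequency_list` and the share column are additions for illustration only and do not appear in the original analysis.

```python
# Occurrence counts as reported in this section (GloWbE searches).
counts = {
    "HKU": 1124, "CUHK": 640, "HKUST": 590, "PolyU": 341, "CityU": 164,
    "OUHK": 161, "HKIEd": 78, "HKBU": 65, "HKAPA": 28, "SYU": 14,
}


def frequency_list(counts):
    """Return (rank, word type, occurrences, share of total), highest first."""
    total = sum(counts.values())
    ranked = sorted(counts.items(), key=lambda item: item[1], reverse=True)
    return [(rank, word, n, n / total)
            for rank, (word, n) in enumerate(ranked, start=1)]


rows = frequency_list(counts)
for rank, word, n, share in rows:
    print(f"{rank:>2}  {word:<6} {n:>5}  {share:6.1%}")
```

The share column is only relative to these ten search terms, so it says nothing about how often the names occur relative to the corpus as a whole.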
It is surprising that, in the world of the GloWbE corpus, the OUHK ranks higher
than the HKBU and the HKIEd, although it is generally agreed that the
academic status of the OUHK is lower than that of these two tertiary
institutions (e.g., at http://www.4icu.org/hk/, see Figure 1 below, sorted by
the 2013 university web ranking according to 4icu.org; in terms of general
facilities and academic support, the OUHK does not provide as much as the
HKBU and the HKIEd).
Figure 1
However, it may also be interpreted that the OUHK’s promotion of open
learning, especially during 2012-13³, has been done better than that of the
two universities (i.e. HKBU and HKIEd). More people can therefore receive an
OUHK education online, know its name and mention it more often, and hence its
name occurs more often in the corpus than those of the HKBU and the HKIEd.
(However, I did not look closely at each word type’s KWIC, Key Word in
Context, so the results may not be completely accurate: some occurrences may
refer to things other than the selected universities, and other names may be
used in the corpus to represent the HKBU and the HKIEd.⁴)
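To show what a KWIC (Key Word in Context) check involves, here is a minimal Python sketch that lists a few words to the left and right of every occurrence of a keyword, so that each hit can be read and judged by eye. The function name `kwic`, the `width` parameter, the tokenisation rule and the sample sentences are hypothetical; the sentences are not taken from GloWbE.

```python
import re


def kwic(text, keyword, width=3):
    """Return (left context, keyword, right context) for each whole-word match."""
    # Assumption: a token is a run of letters or digits.
    tokens = re.findall(r"[A-Za-z0-9]+", text)
    lines = []
    for i, token in enumerate(tokens):
        if token.lower() == keyword.lower():
            left = " ".join(tokens[max(0, i - width):i])
            right = " ".join(tokens[i + 1:i + 1 + width])
            lines.append((left, token, right))
    return lines


# Hypothetical sentences, NOT taken from GloWbE.
sample = ("She enrolled in a distance course at OUHK last year. "
          "The HKU library opens late, and HKU students like it.")

hits = kwic(sample, "HKU")
for left, word, right in hits:
    print(f"{left:>30} [{word}] {right}")
```

Reading the context lines is what would reveal hits that do not refer to a university at all, which is the source of inaccuracy acknowledged above.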
In the future, there is definitely room for scholars to research the most
mentioned names of universities in Hong Kong in each variety of English,
adding an extra dimension to the research; researchers could even study the
most mentioned names of world universities in the corpus, adding one more
possible dimension.
³ The GloWbE corpus was released in April 2013, and it is a 2012-2013 corpus. (http://corpus.byu.edu/)
⁴ N.B. Throughout the whole process of data collection, I did not use any wildcards, and no tag set was
involved.
Limitation and Conclusion
I want to clarify that this is NOT an academic report or any form of article;
it just aims to stimulate the readers’ curiosity about the use of corpora as a
research methodology of any kind. The report may only succeed in concluding
that the names of OUHK, HKIEd, HKBU, HKAPA, and SYU are not very familiar to
academics around the world, or at least that they are the less familiar ones
among the 10 selected institutions in the GloWbE corpus.
Reference
McEnery, T., Xiao, R. & Tono, Y. (2006). Corpus-based language studies: an
advanced resource book. London/New York: Routledge.
Appendices
Appendix 1
Find Corpora on Yahoo!
http://tw.search.yahoo.com/search?fr=fp-tab-web-t&ei=UTF-8&p=corpus
Example:
The Corpus of Contemporary American English (COCA) is the largest
freely-available corpus of English, and the only large and balanced
corpus of American English. The corpus was created by Mark Davies of
Brigham Young University, and it is used by tens of thousands of users
every month (linguists, teachers, translators, and other researchers).
COCA is also related to other large corpora that we have created.
Source: http://corpus.byu.edu/coca/
Appendix 2
Some more data of
HKAPA
CITYU
POLYU
HKU
OUHK