LET Kansai Chapter Methodology SIG, FY2023 1st Meeting: Presentation Slides (Yoichi Watari)

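I Want to Know What English Teachers Want to Know About Language Assessment
A Pilot Survey Based on Kremmel & Harding (2020)
Japan Association for Language Education and Technology (LET), Kansai Chapter, Methodology SIG
FY2023 1st Meeting (Kansai University Umeda Campus)
June 18, 2023
Yoichi Watari
School of Global Studies, Chukyo University
watariyoichi@mecl.chukyo-u.ac.jp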

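Background (1)
• Pressure toward the "integration of instruction and assessment" and toward proving "English proficiency"
• The Course of Study, "performance tests," and so on
• Expanding use of (dubious) external tests
• The current state of English teachers' assessment of English language proficiency
• Cf. 出口マクドナルド・福田・亘理 (2019)
• When asked for an A2-level task, only about 46% managed to produce one (n = 83)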
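Implementation of performance tests
○ In elementary schools, 97.2% conduct performance tests to assess speaking, which means almost every elementary school does so.
○ In junior high schools, 90.1% conduct performance tests for both speaking and writing.
○ In senior high schools, 48.6% of schools conduct performance tests for both speaking and writing.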
[Figure: bar charts on a 0–100% axis. Elementary schools: 97.2%, 97.0%, 97.4% (in parentheses: 96.8%, 96.6%, 97.0%). Junior high schools: 90.1%, 88.6%, 90.5%, 91.0% (in parentheses: 90.5%, 89.2%, 90.7%, 91.4%).]
English proficiency of junior high school students (by prefecture and designated city)
[Figure: bar chart of percentages by prefecture and designated city, ranging from 34.6% to 86.6%. Legend: proportion of students judged to have English proficiency equivalent to CEFR A1 or above; proportion of students who have obtained a qualification equivalent to CEFR A1 or above; R3 (FY2021). Target: 50% (Third Basic Plan for the Promotion of Education). R4 (FY2022) mean for CEFR A1 or above: 49.2%.]
Source: MEXT, "Survey on the Implementation of English Education" (英語教育実施状況調査)
https://www.mext.go.jp/a_menu/kokusai/gaikokugo/1415043_00004.htm

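Background (2)
• Isn't this a situation in which we should take the lyrics of "Sorafune" (『宙船』) to heart?
• One way or another, English teachers will continue to be required to assess proficiency through highly valid tasks
• If so, rather than handing that judgment over to external tests and deepening dependence on them (and thereby serving up a feast for the people behind those organizations),
• it is better (even if it takes time, and for everyone's sake) for English teachers themselves, as professionals, to acquire Language Assessment Literacy (LAL) and row onward on their own
• I want to tell them: "Don't hand your oar to anyone else"
What should English teachers know about LAL? (RQ1)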

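• From a componential view to a multidimensional, developmental view (Cf. Fulcher, 2012; Vogt & Tsagari, 2014)
• Taylor (2013) → development of the empirically grounded LAL survey (n = 1,086)
• 10 factors hypothesized → 71 − 18 = 53 items, 9-factor model (cumulative contribution ratio, CCR = 73%)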
Kremmel & Harding (2020)
Towards a Comprehensive, Empirical Model of Language Assessment Literacy across Stakeholder Groups: Developing the Language Assessment Literacy Survey
Benjamin Kremmel (a) and Luke Harding (b)
(a) University of Innsbruck, Innsbruck, Austria; (b) Lancaster University, Lancaster, UK
ABSTRACT
While scholars have proposed different models of language assessment literacy (LAL), these models have mostly comprised prescribed sets of components based on principles of good practice. As such, these models remain theoretical in nature, and represent the perspectives of language assessment researchers rather than stakeholders themselves. The project from which the current study is drawn was designed to address this issue through an empirical investigation of the LAL needs of different stakeholder groups. Central to this aim was the development of a rigorous and comprehensive survey which would illuminate the dimensionality of LAL and generate profiles of needs across these dimensions. This paper reports on the development of an instrument designed for this purpose: the Language Assessment Literacy Survey. We first describe the expert review and pretesting stages of survey development. Then we report on the results of an exploratory factor analysis based on data from a large-scale administration (N = 1086), where respondents from a range of stakeholder groups across the world judged the LAL needs of their peers. Finally, selected results from the large-scale administration are presented to illustrate the survey’s utility, specifically comparing the responses of language teachers, language testing/assessment developers and language testing/assessment researchers.
Introduction
Given the widespread use of language assessments for decision-making across an increasing
number of social domains (education, immigration and citizenship, professional certification), it
has become vital to raise awareness and knowledge of good practice in language assessment for
a wide range of stakeholder groups. Scholars have thus called for the promotion of language
assessment literacy (LAL) not only for teachers and assessment developers, the two groups most
typically involved with language assessments, but also for score users, policymakers and students
(among others) (e.g. Baker, 2016; Deygers & Malone, 2019). For such groups, a heightened
awareness of the principles and practice of language assessment would ideally lead to more
informed discussion of assessment matters, clarity around good practice in using language
assessments, and ultimately more robust decision-making on the basis of assessment data
LANGUAGE ASSESSMENT QUARTERLY
2020, VOL. 17, NO. 1, 100–120
https://doi.org/10.1080/15434303.2019.1674855
Previous research (1)

Language Testing
30(3) 403–412
© The Author(s) 2013
Reprints and permissions:
sagepub.co.uk/journalsPermissions.nav
DOI: 10.1177/0265532213480338
ltj.sagepub.com
Communicating the theory, practice and principles of language testing to test stakeholders: Some reflections
Lynda Taylor
University of Bedfordshire, UK
Abstract
The 33rd Language Testing Research Colloquium (LTRC), held in June 2011 in Ann Arbor,
Michigan, included a conference symposium on the topic of assessment literacy. This event brought
together a group of four presenters from different parts of the world, each of whom reported
on their recent research in this area. Presentations were followed by a discussant slot that
highlighted some thematic threads from across the papers and raised various questions for the
professional language testing community to consider together. One point upon which there was
general consensus during the discussion was the need for more research to be undertaken and
published in this complex and challenging area. It is particularly encouraging, therefore, to see a
coherent set of studies on assessment literacy brought together in this special issue of Language
Testing and it will undoubtedly make an important contribution to the steadily growing body of
literature on this topic, particularly as it concerns the testing of languages. This brief commentary
revisits some of the themes originally raised during the LTRC 2011 symposium, considers how
these have been explored or developed through the papers in this special issue and reflects on
some future directions for our thinking and activity in this important area.
Keywords
Assessment literacy, language assessment literacy, test stakeholders
LTRC 2011 in Michigan celebrated 50 years since the publication in 1961 of the
seminal work in our field by Professor Robert Lado, one-time Director of the English
Taylor (2013)
Previous research (0)
activity. The resulting profile for each stakeholder group is likely to look somewhat different. For example, a profile for test writers may cover a wide range of content dimensions fairly evenly and in some depth. A profile for classroom language teachers, however, may end up focusing strongly on the practical know-how needed for creating tests but have a much lighter focus on measurement theory or ethical principles; the latter may need to be touched upon only briefly at a surface level. While a profile for university administrators will address those aspects of the assessment literacy construct that relate to understanding the nature of test instruments and the meaning of their scores for decision-making purposes, other aspects such as how to construct and validate tests need not receive much attention.
Figure 2(a–d) attempts to illustrate what differential assessment literacy might look like for these three groups and for the community of professional language testing experts. It should be noted that the labelled dimensions on the eight axes (i.e. knowledge of theory, technical skills, etc.) are hypothesized from the discussion of possible AL/LAL components across various papers in this special issue, while the values (i.e. 0–4) are hypothesized according to the different stages of literacy suggested by Pill and Harding. The diagrams are for illustrative purposes only, to show how it might be possible to conceptualize and represent differential AL/LAL; the actual characterization
[Figure 2, panels (a)–(d): radar charts, each scaled 0–4 on eight axes: Knowledge of theory; Technical skills; Principles and concepts; Language pedagogy; Sociocultural values; Local practices; Personal beliefs/attitudes; Scores and decision making]
Figure 2. Differential AL/LAL profiles for four constituencies.
(a) Profile for test writers.
(b) Profile for classroom teachers.
(c) Profile for university administrators.
(d) Profile for professional language testers.
Taylor’s model. Yan, Zhang, and Fan (2018) also used the profiles as a point of comparison
a study of language teachers’ LAL needs in China. However, the speculative nature of
profiles, the “etic” view they embody, and the need to broaden the profiles to a wider gr
of stakeholders represents an important gap in LAL research. In addressing this gap, the pres
study aimed to elaborate and validate Taylor’s profiles by means of a large-scale survey t
invited a range of stakeholder groups to assess their needs and identify how important t
consider various aspects of LAL for members of their group/profession. Specifically, two resea
questions were posed:
(1) To what extent are hypothetically different dimensions of language assessment liter
empirically distinct?
(2) To what extent, and in what ways, do the needs of different stakeholder groups vary w
respect to identified dimensions?
Method
Instrument development
A number of existing LAL survey instruments have been reported in the research literature – m
prominently Fulcher’s (2012) survey, which has been modified for use in numerous research conte
and the survey used by Vogt and Tsagari (2014) to evaluate assessment literacy across Euro
However, as these surveys were designed primarily for teachers, and therefore for a different purp
to the present instrument, they accordingly may not reflect the full range of assessment-rela
activities that would be undertaken by a range of different stakeholder groups. Thus, in orde
develop a language assessment literacy survey to be used by a range of stakeholders to assess their o
groups’ needs, we had two clear guiding aims: (1) the survey would need to be comprehensive,
feasible to complete among populations where motivation to engage with LAL may be low; and (2)
survey items would need to be intelligible across the wide-range of stakeholder groups suggested
Taylor (2013). This necessitated a multi-stage development process which spanned almost 24 mon
(see Figure 2).
Version 1.0: simplified definitions
2015
Versions 2.0 – 2.4: elaborated definitions; multi-item scales
Expert review 1: 6 professors in LTA
Pre-test 1: 62 participants; QUAN/QUAL feedback
Versions 2.5 – 2.10: further refinement to wording
Pre-test 2: 25 participants
Expert review 2: 2 language assessment experts (with expertise in questionnaire design)
Version 2.11: final version created
Survey launched May 2017
Data pulled from Qualtrics platform November 2017
With thanks to "Wassā-paisen" (a senior colleague) for the kindness

Figure 2. Overview of instrument development process.

and a modification of both Taylor’s (2013) initial framework, and our own hypothesized dimensions based on Taylor’s work. The evolution of these dimensions across the three stages – initial framework, hypothesized dimensions, data-driven factor structure – is summarized in Table 6.
Table 4. Eigenvalues for 9-factor solution.
        Initial Eigenvalues                     Extraction Sums of Squared Loadings
Factor  Total   % of Variance  Cumulative %     Total   % of Variance  Cumulative %
1       22.065  44.129         44.129           21.755  43.509         43.509
2        4.634   9.267         53.397            4.346   8.691         52.201
3        2.242   4.485         57.882            1.880   3.759         55.960
4        1.840   3.680         61.561            1.549   3.098         59.059
5        1.317   2.634         64.196            1.060   2.121         61.179
6        1.259   2.518         66.713             .979   1.959         63.138
7        1.134   2.269         68.982             .866   1.731         64.869
8        1.040   2.079         71.061             .760   1.519         66.388
9        1.013   2.026         73.088             .671   1.343         67.731
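The "% of Variance" and "Cumulative %" columns of Table 4 can be reproduced from the initial eigenvalues alone. The sketch below is only an illustration: it assumes that 50 variables entered the factor analysis (the item numbers listed in Table 5 add up to 50, and 22.065 / 50 matches the reported 44.129%), which is an inference from the tables rather than something stated on the slides. The names `N_ITEMS` and `variance_explained` are made up for this example.

```python
# Initial eigenvalues copied from Table 4 (Kremmel & Harding, 2020).
eigenvalues = [22.065, 4.634, 2.242, 1.840, 1.317, 1.259, 1.134, 1.040, 1.013]

# Assumption: 50 variables entered the analysis (inferred, not stated on the slide).
N_ITEMS = 50


def variance_explained(values, n_items):
    """Return a list of (percent of variance, cumulative percent) per factor."""
    rows = []
    cumulative = 0.0
    for value in values:
        percent = value / n_items * 100
        cumulative += percent
        rows.append((percent, cumulative))
    return rows


rows = variance_explained(eigenvalues, N_ITEMS)
for factor, (percent, cumulative) in enumerate(rows, start=1):
    print(f"Factor {factor}: {percent:6.3f}%  cumulative {cumulative:6.3f}%")

# Count the factors whose initial eigenvalue exceeds 1.
n_above_one = sum(1 for value in eigenvalues if value > 1)
print("Factors with eigenvalue > 1:", n_above_one)
```

Under that assumption the cumulative figure for nine factors comes out at about 73.09%, which is the 73% quoted as CCR on the summary slide. All nine eigenvalues exceed 1; whether that cutoff was the authors' retention rule is not stated in the excerpt.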
Table 5. The nine factors of LAL as represented in the final version of the LAL survey.
Factor    Label                                               Item numbers                                              α
Factor 1  Developing and administering language assessments   62, 68, 61, 63, 64, 66, 70, 69, 65, 60, 67, 58, 59, 17    .96
Factor 2  Assessment in language pedagogy                     8, 7, 6, 5, 1, 21                                         .89
Factor 3  Assessment policy and local practices               12, 11, 38, 14, 39, 22                                    .88
Factor 4  Personal beliefs and attitudes                      46, 47, 45, 48                                            .93
Factor 5  Statistical and research methods                    50, 49, 51, 52                                            .95
Factor 6  Assessment principles and interpretation            32, 31, 3, 10                                             .85
Factor 7  Language structure, use and development             28, 27, 26, 29, 33                                        .85
Factor 8  Washback and preparation                            24, 25, 23, 19                                            .87
Factor 9  Scoring and rating                                  56, 55, 53                                                .85
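The α column in Table 5 is Cronbach's alpha, an internal-consistency index computed from the item variances and the variance of the summed scale. As a reminder of how such a value is obtained, here is a minimal sketch using only the standard library. The ratings in `toy` are invented for illustration and have nothing to do with the actual survey data; `cronbach_alpha` is a hypothetical helper name.

```python
from statistics import pvariance


def cronbach_alpha(item_scores):
    """Cronbach's alpha for a scale.

    item_scores: one list per item, each holding the ratings of the same
    respondents in the same order.
    """
    k = len(item_scores)
    sum_item_variances = sum(pvariance(item) for item in item_scores)
    # Total score per respondent across all items of the scale.
    totals = [sum(ratings) for ratings in zip(*item_scores)]
    return k / (k - 1) * (1 - sum_item_variances / pvariance(totals))


# Hypothetical 0-4 ratings: 3 items answered by 5 respondents.
toy = [
    [0, 1, 2, 3, 4],
    [0, 1, 2, 3, 4],
    [1, 1, 2, 3, 3],
]
alpha = cronbach_alpha(toy)
print(f"alpha = {alpha:.2f}")
```

Population variance is used for both the items and the totals; using sample variance throughout would give the same ratio.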
Previous research (2)
0 = not knowledgeable at all
1 = slightly knowledgeable
2 = moderately knowledgeable
3 = very knowledgeable
4 = extremely knowledgeable
This scale had been developed and modified during pre-testing, and provided the most useful way of
assessing the perceptions of needs among different roles/professional groups. An almost identical
question was used for a set of items (grouped together) which referred to skills rather than types of
knowledge (see Appendix 1).

The present survey, based on Kremmel & Harding (2020)
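Method (1)
Japanese renderings of the five response options:
0 = 全く知らない (not knowledgeable at all)
1 = わずかに知っている (slightly knowledgeable)
2 = まあまあ知っている (moderately knowledgeable)
3 = よく知っている (very knowledgeable)
4 = きわめてよく知っている (extremely knowledgeable)
For items (49)–(71), "skilled" was translated as 長けている/いない ("skilled / not skilled")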
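When responses come back from the form as Japanese labels, they need to be coded onto the 0–4 scale before means can be computed. A minimal sketch of that step follows; it assumes the labels arrive exactly as written on the slide and that they map to 0–4 in the order of the original English scale. The names `SCALE`, `code_responses`, and `mean_score`, and the sample answers, are hypothetical.

```python
# Japanese response labels mapped to the 0-4 scale of the original survey.
# Assumption: exported labels match these strings exactly.
SCALE = {
    "全く知らない": 0,            # not knowledgeable at all
    "わずかに知っている": 1,      # slightly knowledgeable
    "まあまあ知っている": 2,      # moderately knowledgeable
    "よく知っている": 3,          # very knowledgeable
    "きわめてよく知っている": 4,  # extremely knowledgeable
}


def code_responses(labels):
    """Convert a list of Japanese labels into numeric scores."""
    return [SCALE[label] for label in labels]


def mean_score(labels):
    """Mean numeric score for one respondent on one factor."""
    scores = code_responses(labels)
    return sum(scores) / len(scores)


# Hypothetical answers of one respondent to a four-item factor.
answers = ["全く知らない", "わずかに知っている", "わずかに知っている", "よく知っている"]
print(code_responses(answers), mean_score(answers))
```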

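The present survey, based on Kremmel & Harding (2020)
Method (2)
• 53 − 10 items (2 factors) = a 43-item Japanese version
• Plus age, years of teaching experience, number of schools worked at, type of teacher-training program, and degree
• URL shared → online responses [Google Form]
• December 2022 to April 2023 (n = 53)
• Junior and senior high school English teachers in Shizuoka, Mie, and Miyazaki prefectures
• Recruited at training sessions and through supervisors (指導主事)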
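The item arithmetic above can be cross-checked against the item numbers listed in Table 5. The sketch below assumes that the two omitted factors are F3 and F5, which is inferred from the fact that the results report only F1, F2, F4, F6, F7, F8, and F9. Note that Table 5 itself lists 50 items, so dropping F3 and F5 leaves 40; the slide's counts of 53 and 43 presumably include a few items that do not belong to any factor (the results mention an item labelled F0_18), but that reading is a guess.

```python
# Item numbers per factor, copied from Table 5 of Kremmel & Harding (2020).
FACTOR_ITEMS = {
    "F1": [62, 68, 61, 63, 64, 66, 70, 69, 65, 60, 67, 58, 59, 17],
    "F2": [8, 7, 6, 5, 1, 21],
    "F3": [12, 11, 38, 14, 39, 22],
    "F4": [46, 47, 45, 48],
    "F5": [50, 49, 51, 52],
    "F6": [32, 31, 3, 10],
    "F7": [28, 27, 26, 29, 33],
    "F8": [24, 25, 23, 19],
    "F9": [56, 55, 53],
}

# Assumption: these are the two factors left out of the Japanese version.
EXCLUDED = {"F3", "F5"}

total_in_table = sum(len(items) for items in FACTOR_ITEMS.values())
dropped = sum(len(FACTOR_ITEMS[name]) for name in EXCLUDED)
kept = total_in_table - dropped

print(f"items in Table 5: {total_in_table}, dropped: {dropped}, kept: {kept}")
```

The 10 dropped items match the slide's "(2 factors) 10 items" exactly, which supports the F3/F5 reading.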
→ What differences are seen by career stage and school type? (RQ2)

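Results (1)
Data: bit.ly/LAQ2023wtrych (.csv)
Age   Bachelor's (non-teacher-training)   Bachelor's (teacher-training)   Professional school of teacher education   Master's/Doctorate
20s    5                                   1                               1                                          0
30s   10                                   0                               0                                          4
40s    9                                   3                               3                                          2
50s    7                                   2                               0                                          3
60s    2                                   1                               0                                          0

      Years of teaching   Number of schools
n     53                  53
M     18.15               4.23
SD    10.32               2.06
Min    2                  1
Max   39                  8
[Figure panels: age, years of teaching, degree, number of schools]
→ Age and years of teaching came out reasonably well balanced, but…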
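The age-by-degree table can be checked for internal consistency by summing its margins. The sketch below copies the counts from the table; the English column keys are labels made up for this example. The row totals should reproduce the age-group sizes used in the age-group results, and the column totals the undergraduate/graduate split used in the degree comparison.

```python
# Counts copied from the age-by-degree table.
# Columns: bachelor's (non-teacher-training), bachelor's (teacher-training),
# professional school of teacher education, master's/doctorate.
AGE_BY_DEGREE = {
    "20s": [5, 1, 1, 0],
    "30s": [10, 0, 0, 4],
    "40s": [9, 3, 3, 2],
    "50s": [7, 2, 0, 3],
    "60s": [2, 1, 0, 0],
}

row_totals = {age: sum(counts) for age, counts in AGE_BY_DEGREE.items()}
col_totals = [sum(column) for column in zip(*AGE_BY_DEGREE.values())]
grand_total = sum(row_totals.values())

# Collapse the four columns into undergraduate vs. graduate.
undergraduate = col_totals[0] + col_totals[1]
graduate = col_totals[2] + col_totals[3]

print(row_totals)
print(col_totals, grand_total)
print(f"undergraduate = {undergraduate}, graduate = {graduate}")
```

The margins give 7, 14, 17, 12, and 3 respondents by age group and a 40 versus 13 split by degree, summing to n = 53.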

Results (2)
Appendix 4. Descriptive statistics of LAL needs for three key stakeholder groups (Kremmel & Harding, 2020, p. 120)
                                                    LTA developers   LTA researchers   Language teachers
                                                    (n = 198)        (n = 138)         (n = 645)a
                                                    M      SD        M      SD         M      SD
Developing and administering language assessments   3.35   .59       3.28   .60        2.53   .87
Assessment in language pedagogy                     2.53   .83       3.12   .70        2.96   .72
Assessment policy and local practices               2.75   .77       3.01   .82        2.28   .86
Personal beliefs and attitudes                      3.21   .85       3.28   .74        2.83   .89
Statistical and research methods                    3.25   .80       3.38   .74        2.10   1.03
Assessment principles and interpretation            3.60   .52       3.63   .49        2.94   .79
Language structure, use and development             3.19   .70       3.25   .61        3.02   .73
Washback and preparation                            2.85   .82       3.04   .74        3.01   .79
Scoring and rating                                  3.45   .68       3.31   .79        2.83   .83
a Note, for the Language teachers group: n = 644 for Personal beliefs and attitudes and Assessment principles and interpretation; n = 643 for Statistical and research methods and Scoring and rating

Present sample (n = 53)
      M      SD
F1    1.03   0.72
F2    1.11   0.95
F4    1.19   0.95
F6    1.19   0.91
F7    1.46   0.87
F8    0.92   0.87
F9    1.68   0.71
[Figure: plot by factor (F1, F2, F4, F6, F7, F8, F9)]
→ Compared with the results of Kremmel & Harding (2020)…
• The mean for every factor is markedly lower
• Among them, F9 > F7 > … > F8
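The comparison above can be made explicit by lining up the language-teacher means from Kremmel & Harding's Appendix 4 with the means of the present sample. The sketch below copies both sets of values from the slide. One caveat applies: the two studies measure different things (judged needs in Kremmel & Harding, current self-assessed knowledge here), so the "gap" is descriptive only. The dictionary names are invented for this example.

```python
# Means for language teachers in Kremmel & Harding (2020), Appendix 4.
KH_TEACHERS = {"F1": 2.53, "F2": 2.96, "F4": 2.83, "F6": 2.94,
               "F7": 3.02, "F8": 3.01, "F9": 2.83}

# Means in the present sample (n = 53).
THIS_SAMPLE = {"F1": 1.03, "F2": 1.11, "F4": 1.19, "F6": 1.19,
               "F7": 1.46, "F8": 0.92, "F9": 1.68}

# Difference between judged need and self-assessed knowledge, per factor.
gaps = {f: round(KH_TEACHERS[f] - THIS_SAMPLE[f], 2) for f in THIS_SAMPLE}

# Factors ordered from highest to lowest mean in the present sample.
ranking = sorted(THIS_SAMPLE, key=THIS_SAMPLE.get, reverse=True)

for factor in ranking:
    print(f"{factor}: sample M = {THIS_SAMPLE[factor]:.2f}, gap = {gaps[factor]:.2f}")
```

The ordering starts with F9 and F7 and ends with F8, as on the slide, and every gap exceeds one scale point, the widest being F8 and the narrowest F9.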
This scale had been developed and modified during pre-testing, and p
assessing the perceptions of needs among different roles/profession
question was used for a set of items (grouped together) which referre
Respondents who completed the 71 items were asked to provide
responses using a sliding scale (0% to 100% confident), and to comp
gender, age, years of experience in role/profession, country of residen
professional/learning role. A space for open-ended comments was al
finally asked if they would like to continue on to provide a self-assess
skill on the same set of items (the analysis of self-assessment data
current paper).
Main trial sample
We did not use a probability sampling approach in the main trial
population for each category was unknown (e.g., there is no reliable d
teachers worldwide). We also faced a challenge in gaining acces
stakeholder groups and encouraging them to complete the survey. Th
of language professionals (e.g., teachers, examiners) working within o
dispersed and difficult to reach. For that reason, we implemented

Results (3) Correlations
→ Correlations with respondent attributes are weak, but correlations among the factors are generally high (.69–.89)
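The inter-factor values reported here are presumably Pearson correlations between respondents' factor scores, although the slide does not name the coefficient. A minimal standard-library sketch of that computation follows. The two score lists are invented for illustration and are not taken from the survey data; `pearson_r` is a hypothetical helper name.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length lists."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    covariance = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    return covariance / sqrt(ss_x * ss_y)


# Hypothetical factor scores for six respondents.
f6_scores = [0.50, 1.00, 1.25, 2.00, 0.75, 1.50]
f7_scores = [0.80, 1.20, 1.40, 2.20, 1.00, 1.40]

r = pearson_r(f6_scores, f7_scores)
print(f"r = {r:.2f}")
```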

Results (4) By age group
      Overall (n = 53)   20s (n = 7)    30s (n = 14)   40s (n = 17)   50s (n = 12)
      M      SD          M      SD      M      SD      M      SD      M      SD
F1    1.03   0.87        0.99   0.26    0.99   0.68    1.10   0.86    1.11   0.80
F2    1.11   1.03        0.64   0.74    1.15   0.88    1.29   0.96    1.36   1.06
F4    1.19   1.03        1.21   0.82    1.27   0.94    1.18   0.93    1.40   1.07
F6    1.19   1.05        0.93   0.84    1.25   0.86    1.26   0.94    1.40   1.03
F7    1.46   1.08        1.29   0.86    1.41   0.79    1.45   0.91    1.75   0.90
F8    0.92   0.95        0.79   0.85    0.88   0.88    1.04   0.89    1.08   0.93
F9    1.68   0.81        1.62   0.52    1.60   0.66    1.63   0.80    1.83   0.44
F1: Developing and administering language assessments
F2: Assessment in language pedagogy
F4: Personal beliefs and attitudes
F6: Assessment principles and interpretation
F7: Language structure, use and development
F8: Washback and preparation
F9: Scoring and rating
→ The younger the age group, the more respondents may feel they lack knowledge, particularly of F2
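The observation about F2 can be read directly off the group means. The sketch below copies the means for the 20s to 50s groups from the table and computes, for each factor, the difference between the oldest and youngest groups shown. It is a descriptive check on rounded means only, with small group sizes, and not a significance test. The names `AGE_MEANS` and `gap_50s_vs_20s` are invented for this example.

```python
# Group means copied from the age-group table, ordered 20s, 30s, 40s, 50s.
AGE_MEANS = {
    "F1": [0.99, 0.99, 1.10, 1.11],
    "F2": [0.64, 1.15, 1.29, 1.36],
    "F4": [1.21, 1.27, 1.18, 1.40],
    "F6": [0.93, 1.25, 1.26, 1.40],
    "F7": [1.29, 1.41, 1.45, 1.75],
    "F8": [0.79, 0.88, 1.04, 1.08],
    "F9": [1.62, 1.60, 1.63, 1.83],
}

# Difference between the 50s and 20s group means for each factor.
gap_50s_vs_20s = {f: round(m[-1] - m[0], 2) for f, m in AGE_MEANS.items()}
largest_gap = max(gap_50s_vs_20s, key=gap_50s_vs_20s.get)

# Factors whose mean never drops from one age group to the next.
non_decreasing = [
    f for f, m in AGE_MEANS.items()
    if all(a <= b for a, b in zip(m, m[1:]))
]

print(gap_50s_vs_20s)
print("largest gap:", largest_gap)
print("non-decreasing across age groups:", non_decreasing)
```

F2 shows the widest spread between the 20s and 50s groups (0.72 scale points), which is the pattern the arrow above points to.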

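Results (5) By age group 2
• F2 Assessment in language pedagogy, F6 Assessment principles and interpretation, F7 Language structure, use and development, F9 Scoring and rating -> gradually rising?
• F1 Developing and administering language assessments -> gradually falling? Spreading out?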

Results (6) By school type
      Overall (n = 53)   Junior high (n = 26)   Senior high (n = 22)
      M      SD          M      SD              M      SD
F1    1.03   0.87        0.94   0.70            1.03   0.73
F2    1.11   1.03        0.96   0.89            1.17   0.99
F4    1.19   1.03        1.13   1.02            1.07   0.89
F6    1.19   1.05        1.00   0.87            1.26   0.93
F7    1.46   1.08        1.33   0.85            1.43   0.91
F8    0.92   0.95        0.91   0.89            0.77   0.81
F9    1.68   0.81        1.37   0.66            1.49   0.66
[Figure label: Scoring and rating]
→ Within this sample, no notable differences by school type are observed

Results (7) By degree
      Overall (n = 53)   Undergraduate (n = 40)   Graduate (n = 13)
      M      SD          M      SD                M      SD
F1    1.03   0.87        0.99   0.66              1.14   0.89
F2    1.11   1.03        1.05   0.90              1.32   1.11
F4    1.19   1.03        1.13   0.97              1.38   0.90
F6    1.19   1.05        1.06   0.86              1.60   0.98
F7    1.46   1.08        1.35   0.86              1.63   0.84
F8    0.92   0.95        0.85   0.87              1.12   0.90
F9    1.68   0.81        1.48   0.66              1.51   0.73
[Figure label: Scoring and rating]
→ Graduate-degree holders score higher overall, and their knowledge of F6 in particular may have been deepened
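One way to put the undergraduate/graduate differences on a common footing is a standardized mean difference. The sketch below computes Cohen's d with a pooled standard deviation from the summary statistics in the table. This is an illustrative addition, not an analysis reported on the slides, and with n = 13 in the graduate group the estimates are imprecise. The names `cohens_d` and `DEGREE_STATS` are invented for this example.

```python
from math import sqrt


def cohens_d(m1, sd1, n1, m2, sd2, n2):
    """Cohen's d (group 2 minus group 1) using the pooled standard deviation."""
    pooled_sd = sqrt(((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / (n1 + n2 - 2))
    return (m2 - m1) / pooled_sd


UNDERGRAD_N, GRAD_N = 40, 13

# factor: (undergraduate M, undergraduate SD, graduate M, graduate SD)
DEGREE_STATS = {
    "F1": (0.99, 0.66, 1.14, 0.89),
    "F2": (1.05, 0.90, 1.32, 1.11),
    "F4": (1.13, 0.97, 1.38, 0.90),
    "F6": (1.06, 0.86, 1.60, 0.98),
    "F7": (1.35, 0.86, 1.63, 0.84),
    "F8": (0.85, 0.87, 1.12, 0.90),
    "F9": (1.48, 0.66, 1.51, 0.73),
}

effects = {
    factor: cohens_d(m_u, sd_u, UNDERGRAD_N, m_g, sd_g, GRAD_N)
    for factor, (m_u, sd_u, m_g, sd_g) in DEGREE_STATS.items()
}

for factor, d in effects.items():
    print(f"{factor}: d = {d:.2f}")
```

On these summary figures every d is positive, F6 has the largest value (about 0.6), and F9 is close to zero, which is in line with the reading given above.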

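Results (8) By item
• Items for which two or more respondents chose "extremely knowledgeable":
• F6_31: The concept of reliability (the accuracy or consistency of an assessment)
• F6_32: The concept of validity (how well an assessment measures what it sets out to measure)
• F7_26: How language skills develop (e.g., reading, listening, writing, speaking)
• F7_27: How foreign/second languages are learned
• F9_55: Scoring closed-response items (e.g., multiple-choice questions)
→ In F1 and F8, not a single item received a rating of 4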

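Results (9) By item
• Items with the most "not knowledgeable at all" responses (top items, chosen by 28 to 24 respondents):
• F1_17: How to teach others about language assessment
• F7_29: How social values can influence the design and use of language assessments
• F8_19: How to prepare learners to take language assessments
• F0_18: How to recognize when an assessment is being used inappropriately
• F2_21: How to give useful feedback on the basis of an assessment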

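Discussion and summary (1)
• RQ1: What should English teachers know about LAL?
• Judged by the standard of TESOL educators, all 7 (9) factors
• In particular: how to share and teach language assessment, consideration of social factors, judging appropriateness, and feedback to learners
• RQ2: Are there differences by career stage and school type?
• LAL does seem to rise over the course of a career and/or through graduate school, but this is still far from clear
• A broader survey is needed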

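Discussion and summary (2)
• Perhaps I should also have asked about F3 Assessment policy and local practices (regret)
• To what extent do English teachers need F5, knowledge of statistics and research methods? (open question)
• And still, the data just won't come in (sigh)
• The survey asked about current knowledge and skills -> whether respondents themselves feel a need for them is a separate question
• It is a bit late to say so, but isn't the number of items too unevenly spread across factors?
• Perhaps the place to start, after all, is to adapt the teacher-oriented Fulcher (2012) or Vogt & Tsagari (2014) and collect data (qualitatively)?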

2.2. Please specify if you need training in the following domains
(Options: None / Yes, basic training / Yes, more advanced training)
a) Giving grades ☐ ☐ ☐
b) Finding out what needs to be taught/learned ☐ ☐ ☐
c) Placing students onto courses, programs, etc. ☐ ☐ ☐
d) Awarding final certificates (from school/program; local, regional or national level) ☐ ☐ ☐
3. Content and concepts of LTA
3.1. Please specify if you were trained in the following domains.
(Options: Not at all / A little (1-2 days) / More advanced)
1. Testing/Assessing:
a) Receptive skills (reading/listening) ☐ ☐ ☐
b) Productive skills (speaking/writing) ☐ ☐ ☐
c) Microlinguistic aspects (grammar/vocabulary) ☐ ☐ ☐
d) Integrated language skills ☐ ☐ ☐
e) Aspects of culture ☐ ☐ ☐
2. Establishing reliability of tests/assessment ☐ ☐ ☐
3. Establishing validity of tests/assessment ☐ ☐ ☐
4. Using statistics to study the quality of tests/assessment ☐ ☐ ☐
3.2. Please specify if you need training in the following domains
(Options: None / Yes, basic training / Yes, more advanced training)
1. Testing/Assessing:
a) Receptive skills (reading/listening) ☐ ☐ ☐
b) Productive skills (speaking/writing) ☐ ☐ ☐
c) Microlinguistic aspects (grammar/vocabulary) ☐ ☐ ☐
d) Integrated language skills ☐ ☐ ☐
e) Aspects of culture ☐ ☐ ☐
2. Establishing reliability of tests/assessment ☐ ☐ ☐
3. Establishing validity of tests/assessment ☐ ☐ ☐
4. Using statistics to study the quality of tests/assessment ☐ ☐ ☐
Q3 Are there any skills that you still need?
Q4 Please look at each of the following topics in language testing.
For each one please decide whether you think this is a topic that should be included in a course
on language testing.
Indicate your response as follows:
5 = essential
4 = important
3 = fairly important
2 = not very important
1 = unimportant
A. History of Language Testing ◦ 1 ◦ 2 ◦ 3 ◦ 4 ◦ 5
B. Procedures in language test design ◦ 1 ◦ 2 ◦ 3 ◦ 4 ◦ 5
C. Deciding what to test ◦ 1 ◦ 2 ◦ 3 ◦ 4 ◦ 5
D. Writing test specifications/blueprints ◦ 1 ◦ 2 ◦ 3 ◦ 4 ◦ 5
E. Writing test tasks and items ◦ 1 ◦ 2 ◦ 3 ◦ 4 ◦ 5
F. Evaluating language tests ◦ 1 ◦ 2 ◦ 3 ◦ 4 ◦ 5
G. Interpreting scores ◦ 1 ◦ 2 ◦ 3 ◦ 4 ◦ 5
H. Test analysis ◦ 1 ◦ 2 ◦ 3 ◦ 4 ◦ 5
I. Selecting tests for your own use ◦ 1 ◦ 2 ◦ 3 ◦ 4 ◦ 5
J. Reliability ◦ 1 ◦ 2 ◦ 3 ◦ 4 ◦ 5
K. Validation ◦ 1 ◦ 2 ◦ 3 ◦ 4 ◦ 5
L. Use of statistics ◦ 1 ◦ 2 ◦ 3 ◦ 4 ◦ 5
M. Rating performance tests (speaking/writing) ◦ 1 ◦ 2 ◦ 3 ◦ 4 ◦ 5
N. Scoring closed-response items ◦ 1 ◦ 2 ◦ 3 ◦ 4 ◦ 5
O. Classroom assessment ◦ 1 ◦ 2 ◦ 3 ◦ 4 ◦ 5
P. Large-scale testing ◦ 1 ◦ 2 ◦ 3 ◦ 4 ◦ 5
Q. Standard setting ◦ 1 ◦ 2 ◦ 3 ◦ 4 ◦ 5
R. Preparing learners to take tests ◦ 1 ◦ 2 ◦ 3 ◦ 4 ◦ 5
S. Washback on the classroom ◦ 1 ◦ 2 ◦ 3 ◦ 4 ◦ 5
T. Test administration ◦ 1 ◦ 2 ◦ 3 ◦ 4 ◦ 5
U. Ethical considerations in testing ◦ 1 ◦ 2 ◦ 3 ◦ 4 ◦ 5
V. The uses of tests in society ◦ 1 ◦ 2 ◦ 3 ◦ 4 ◦ 5
W. Principles of educational measurement ◦ 1 ◦ 2 ◦ 3 ◦ 4 ◦ 5
Q5 Which was the last language testing book you studied or used in class?
What did you like about the book? What did you dislike about the book?
Q6 What do you think are essential topics in a book on practical language testing?
Q7 What other features (e.g. glossary/activities etc) would you most like to see in a book on
practical language testing?
Q8 Do you have any other comments that will help me to understand your needs in a book on
practical language testing?
Q9 How would you rate your knowledge and understanding of language testing?
5 = very good
4 = good
3 = average
2 = poor
1 = very poor
Fulcher (2012) Vogt & Tsagari (2014)
But somehow it doesn't quite feel right…

References
• 出口マクドナルド友香理・福田純也・亘理陽一 (2019).「高等学校における英語運用能力アセスメントの現状と課題: 静岡県内高校のパフォーマンス・タスク分析」『教育実践総合センター研究紀要』29, 162–168.
• Fulcher, G. (2012). Assessment literacy for the language classroom. Language Assessment Quarterly, 9(2), 113–132. DOI: 10.1080/15434303.2011.642041
• Kremmel, B., & Harding, L. (2020). Towards a comprehensive, empirical model of language assessment literacy across stakeholder groups: Developing the Language Assessment Literacy Survey. Language Assessment Quarterly, 17(1), 100–120. DOI: 10.1080/15434303.2019.1674855
• Taylor, L. (2013). Communicating the theory, practice and principles of language testing to test stakeholders: Some reflections. Language Testing, 30(3), 403–412. DOI: 10.1177/0265532213480338
• Vogt, K., & Tsagari, D. (2014). Assessment literacy of foreign language teachers: Findings of a European study. Language Assessment Quarterly, 11(4), 374–402. DOI: 10.1080/15434303.2014.960046
