QUALITY ASSURANCE ON
INTERNAL ATTRIBUTES OF
GOOD LANGUAGE-MEASURING
DEVICES (RELIABILITY)
AMIRUL FAISAL RIZZA
TESTS
TOOLS / INSTRUMENTS
English test: tools / instruments
to draw out evidence
of the existence of
English abilities
Good instruments:
The hidden English abilities
are guaranteed to be
observable.
Bad instruments:
1. Damage measurement and
evaluation.
2. Cannot describe the real
language abilities of the test takers.
Requirements for a good instrument
1. Reliability
2. Validity
3. Practicality / usability
4. Economy
RELIABLE = STABLE = CONSISTENT
Reliability
• A reliable test is a test that produces stable,
consistent scores.
• Test scores demonstrate consistency or stability no
matter who administers the test (rater or inter-rater consistency).
• The scores are consistent no matter when or where the test
is administered.
Reliability
– Observed Score is the data gathered by the researcher.
– True Score is the actual, unknown value that corresponds to the
construct of interest.
– Error comes in two kinds:
  – Systematic Error is variation that results from constructs of disinterest
    (constructs other than the one being measured).
  – Unsystematic / Random Error is nonsystematic variation in the observed
    scores.
Observed Score = True Score + (Measurement) Error
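
The difference between the two kinds of error can be illustrated with a small simulation. This is only a sketch under stated assumptions: the true score of 8.0, the error spread of 0.5, the bias of 1.0, and the helper name simulate_observed are all hypothetical and do not come from the slides.

```python
import random
import statistics


def simulate_observed(true_score, n, random_sd, systematic_bias=0.0, seed=0):
    """Return n observed scores: true score + systematic bias + random error."""
    rng = random.Random(seed)  # seeded so the sketch is repeatable
    return [
        true_score + systematic_bias + rng.gauss(0.0, random_sd)
        for _ in range(n)
    ]


TRUE_SCORE = 8.0  # hypothetical true score, not taken from the slides

# Random error only: individual scores scatter around the true score.
random_only = simulate_observed(TRUE_SCORE, n=10_000, random_sd=0.5)

# Random error plus a constant systematic error of +1.0 (hypothetical).
with_bias = simulate_observed(
    TRUE_SCORE, n=10_000, random_sd=0.5, systematic_bias=1.0
)

mean_random_only = statistics.mean(random_only)
mean_with_bias = statistics.mean(with_bias)

print(f"mean with random error only: {mean_random_only:.2f}")
print(f"mean with systematic error:  {mean_with_bias:.2f}")
```

Over many observations the random error tends to average out, so the mean stays close to the true score, while the systematic error shifts every observation in the same direction and does not average out.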
Consistency between raters (inter-rater consistency)

Students   Rater 1   Rater 2
A          8         8
B          8.6       8.6
C          9         9
D          8         8
E          9.4       9.4

Perfect consistency
Students   Rater 1   Rater 2
A          8.2       8
B          8.6       8.8
C          8.9       9
D          8         8.1
E          9.3       9.4

Consistent test
Students   Administered on Wednesday   Administered on Friday
A          7.8                         8
B          8.6                         8.3
C          9.1                         9
D          8                           8.2
E          9.3                         9.4

Consistent across time
Inconsistency between raters (inter-rater inconsistency)

Students   Administered on Wednesday   Administered on Friday
A          3                           8
B          8.6                         2
C          3.1                         9
D          8                           8.2
E          9.3                         5
Students   Administered on Wednesday   Administered on Friday
A          4                           8
B          8.6                         2
C          5                           9
D          6                           8.2
E          8                           8

Inconsistency across time
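
One way to put a number on the consistency shown in the tables above is a correlation coefficient between the two columns of scores. The sketch below uses Pearson's r, which is an assumption here since the slides do not name a specific coefficient at this point; the helper pearson_r and the dictionary keys are illustrative names. The scores are copied from the tables.

```python
import math


def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of scores."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(ss_x * ss_y)


# Score pairs from the tables above (students A to E).
tables = {
    "perfect consistency": ([8, 8.6, 9, 8, 9.4], [8, 8.6, 9, 8, 9.4]),
    "consistent test": ([8.2, 8.6, 8.9, 8, 9.3], [8, 8.8, 9, 8.1, 9.4]),
    "consistent across time": ([7.8, 8.6, 9.1, 8, 9.3], [8, 8.3, 9, 8.2, 9.4]),
    "inconsistent (first table)": ([3, 8.6, 3.1, 8, 9.3], [8, 2, 9, 8.2, 5]),
    "inconsistent across time": ([4, 8.6, 5, 6, 8], [8, 2, 9, 8.2, 8]),
}

results = {name: pearson_r(x, y) for name, (x, y) in tables.items()}

for name, r in results.items():
    print(f"{name}: r = {r:.2f}")
```

A coefficient near 1 means the two sets of scores order and space the students almost identically. A low or negative value signals the kind of inconsistency shown in the last two tables. A correlation only captures how the scores move together, so it does not by itself show whether one rater or one occasion is systematically higher.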
How do we determine whether a
measurement is reliable?
Reliability is estimated using
these APPROACHES:
Test-Retest
Parallel Forms
Internal Consistency
TEST-RETEST
Administers the same test twice to the same group of
subjects on different testing occasions.
The same instrument is used again, and the same
subjects are involved.
The repetition is done on a different day.
TIME                  ACTIVITY                   PARTICIPANTS                   RESULT
Wednesday, 3/10/18    Vocabulary test            50 students of SMA 1 Bangsal   Result 1
Wednesday, 10/10/18   The same vocabulary test   50 students of SMA 1 Bangsal   Result 2
Advantages
• Only one set of the test is needed.
Disadvantages
• Requires two testing occasions.
• It is not easy to create similar
conditions on different testing
occasions.
• If the two administrations are too
close together, the test takers may
still remember the content of the test.
• If the second administration comes too
long after the first, the test takers'
performance may be affected.
• May cause boredom, ailments, and
the like.
Parallel Forms
Requires two or more sets of tests.
Each set is made equal to the other sets in every
aspect of the test.
Equal in:
Test format
Test length
The level of difficulty
Discrimination indexes used
Time allocation
Test content
SET A (administered on Tuesday) and SET B (administered on Friday)
-> Administered to a group of students
-> Scores produced from completing Set A, and scores produced from completing Set B
-> Correlational analysis
Advantages
• Offers more variation across the sets of
the test.
Disadvantages
• Time-consuming to make two
or more sets of the test.
• It is not easy to keep the students
motivated while taking the second
test.
Internal Consistency
Based on the logic that if the items in the test are highly
correlated, the test is said to be reliable.
Before developing a test, the theoretical ability to be
measured by the test should be defined.
The items of the test should be constructed to measure
a single ability (technically called a "construct").
Tests with higher internal consistency more accurately
measure the construct intended by the test developers.
Concept level: Vocabulary ability
  Dimension level: Synonym
    Indicator level: Indicator 1
      Test item level: Item 1, Item 2, Item 3
    Indicator level: Indicator 2
      Test item level: Item 1, Item 2, Item 3, Item 4
  Dimension level: Antonym
    Indicator level: Indicator 1
      Test item level: Item 1, Item 2, Item 3
    Indicator level: Indicator 2
      Test item level: Item 1, Item 2, Item 3, Item 4
Test level: A test of vocabulary containing the assembled and selected items
Test takers' scores
(Hypothetical dichotomous scoring on 10 items)

Students      Item 1  2  3  4  5  6  7  8  9  10   Total score
A                  1  1  0  1  0  1  1  0  1  0    6
B                  1  0  1  0  1  0  1  1  1  0    6
C                  0  1  1  1  1  1  1  1  1  1    9
D                  1  0  1  0  0  0  1  1  1  0    5
E                  1  0  0  0  1  1  1  1  1  1    7
F                  1  1  0  1  1  1  1  1  1  1    9
Item total         5  3  3  3  4  4  6  5  6  3
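
A common single-number summary of how well the items in a score matrix like this hang together is Cronbach's alpha. The slides do not name this coefficient at this point, so treat the sketch below as an illustration under that assumption; the function name cronbach_alpha and the variable names are hypothetical. The 0/1 responses are copied from the table above.

```python
import statistics

# Item responses (items 1 to 10) for students A to F, from the table above.
responses = {
    "A": [1, 1, 0, 1, 0, 1, 1, 0, 1, 0],
    "B": [1, 0, 1, 0, 1, 0, 1, 1, 1, 0],
    "C": [0, 1, 1, 1, 1, 1, 1, 1, 1, 1],
    "D": [1, 0, 1, 0, 0, 0, 1, 1, 1, 0],
    "E": [1, 0, 0, 0, 1, 1, 1, 1, 1, 1],
    "F": [1, 1, 0, 1, 1, 1, 1, 1, 1, 1],
}


def cronbach_alpha(rows):
    """Cronbach's alpha from a list of per-student item-score lists.

    Population variances are used for both the items and the totals,
    so the ratio between them is consistent.
    """
    k = len(rows[0])  # number of items
    item_vars = [statistics.pvariance(col) for col in zip(*rows)]
    total_var = statistics.pvariance([sum(row) for row in rows])
    return (k / (k - 1)) * (1 - sum(item_vars) / total_var)


rows = list(responses.values())
student_totals = {name: sum(items) for name, items in responses.items()}
item_totals = [sum(col) for col in zip(*rows)]
alpha = cronbach_alpha(rows)

print("student totals:", student_totals)
print("item totals:   ", item_totals)
print(f"Cronbach's alpha: {alpha:.3f}")
```

Items that every student answers correctly have zero variance, so they add nothing to the item-variance sum and do not help distinguish between test takers. With only six hypothetical students, the resulting value is an illustration of the arithmetic rather than a meaningful estimate.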
Approaches to perform internal consistency
• Split-half: the items are split into two halves, and each test taker's score on
the first half of the items is compared with the score on the second half.
• The split can be into the first and second halves of the items, or into odd- and
even-numbered items.
• Split-half has some drawbacks (see DRAWBACKS OF SPLIT-HALF).
• Inter-item estimation: the item scores are correlated with one another
within the same test.
This is called inter-item correlation.
The scores obtained on each item are correlated with those on every other item:
item 1 is correlated with items 2, 3, 4, 5, 6, 7, 8, 9, 10;
item 2 is correlated with items 1, 3, 4, 5, 6, 7, 8, 9, 10; and so on.
Examples:
Split-half: 1st half

Test takers'   Item 1  2  3  4  5    Total
identity                             (set A)
A                   1  1  0  1  0    3
B                   0  1  1  0  1    3
C                   1  1  1  1  1    5
D                   1  1  0  0  0    2
E                   1  0  1  1  0    3
F                   1  0  1  0  1    3
TOTAL SCORE         5  4  4  3  3
Split-half: 2nd half

Test takers'   Item 6  7  8  9  10   Total
identity                             (set A)
A                   1  1  0  1  0    3
B                   0  1  1  0  1    3
C                   1  0  1  1  1    4
D                   1  1  0  0  0    2
E                   1  0  1  1  1    4
F                   0  0  1  0  1    2
TOTAL SCORE         4  3  4  3  4
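
The two half-test tables above can be turned into a split-half estimate by correlating each test taker's total on the first half with the total on the second half. The sketch below does this with Pearson's r and then applies the Spearman-Brown step-up, 2r / (1 + r), to estimate reliability for the full-length test. Both choices are assumptions, since the slides do not name a specific coefficient or correction; the helper pearson_r and the variable names are hypothetical.

```python
import math

# Item responses from the two split-half tables (students A to F).
first_half = {
    "A": [1, 1, 0, 1, 0],
    "B": [0, 1, 1, 0, 1],
    "C": [1, 1, 1, 1, 1],
    "D": [1, 1, 0, 0, 0],
    "E": [1, 0, 1, 1, 0],
    "F": [1, 0, 1, 0, 1],
}
second_half = {
    "A": [1, 1, 0, 1, 0],
    "B": [0, 1, 1, 0, 1],
    "C": [1, 0, 1, 1, 1],
    "D": [1, 1, 0, 0, 0],
    "E": [1, 0, 1, 1, 1],
    "F": [0, 0, 1, 0, 1],
}


def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of scores."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(ss_x * ss_y)


students = sorted(first_half)
first_totals = [sum(first_half[s]) for s in students]
second_totals = [sum(second_half[s]) for s in students]

# Correlation between the two half-test totals.
r_half = pearson_r(first_totals, second_totals)

# Spearman-Brown step-up: estimated reliability of the full-length test.
r_full = 2 * r_half / (1 + r_half)

print("first-half totals: ", first_totals)
print("second-half totals:", second_totals)
print(f"half-test correlation: {r_half:.3f}")
print(f"stepped-up estimate:   {r_full:.3f}")
```

The step-up is applied because each half contains only five items, and a correlation between two short halves tends to understate the reliability of the ten-item test as a whole.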
DRAWBACKS OF SPLIT-HALF
Does not fully reflect the true value of the reliability of the
test (Kline, 1993:11).
Different splits may produce different reliability results
(Cronbach, 1951).
Test length affects the reliability of the test: the more
items in the test, the more reliable the test is (Wiersma and
Jurs, 1990:163).
Example of inter-item estimation
Two or more raters evaluate students' speaking
skills.
The scoring may be based on several aspects. A
statistical analysis, usually a t-test, may be used to
analyze the data.
A correlational analysis may be applied to examine how
close the scores given by the two raters are.