A quasi experimental evaluation design study comparing the impact of using the Continuous Assessment strategy in intervention and control schools in Zambia
Describes the implementation of the Grade 4 Basic Competence Tests Programme in Zambia, focusing on setting learning targets and training teachers in assessment. It also covers the development of valid assessment materials.
The document outlines the components of a BERMUTU education program in Indonesia between 2008-2013. The program aims to improve education quality through teacher competency reform. It describes 4 components: 1) reforming teacher education, 2) strengthening continuous teacher professional development locally, 3) revising teacher accountability and incentive systems, and 4) increasing monitoring and evaluation of teacher performance and student achievement.
The document outlines the various documents and reports required for a school's development plan, implementation of activities, monitoring and evaluation, stakeholder engagement, and training and development. It includes requirements for strategic improvement plans, organization charts, club activities and reports, engagement with external stakeholders like Rotary clubs, and monitoring of student performance and teacher training. Requirements include development plans, meeting minutes, accomplishment reports, agreements, financial documents, assessments and more to comprehensively document the school's programs and partnerships.
OPCRF aligned with PD Priorities for SY 2020-2023 (Divine Dizon)
This document is the Office Performance Commitment and Review Form (OPCRF) for the 2021-2022 school year for a school in Mabalacat City, Pampanga. The OPCRF outlines the key responsibilities and performance indicators for the school principal based on the Philippine Professional Standards for School Heads. It includes domains for leading strategically, managing school operations and resources, and developing others. Performance will be evaluated on factors such as developing the school improvement plan, monitoring and evaluation processes, financial management, and supporting the professional growth of teachers.
The document outlines guidelines and processes for providing technical assistance to schools. It describes technical assistance as a process aimed at professional help and guidance for improvement. The objectives are to describe technical assistance, analyze processes for providing it, demonstrate readiness in applying guidelines, and appreciate adherence to standards. The technical assistance mechanism involves assessing needs, planning, implementation, monitoring, evaluation, and adjustment. Key steps include organizing provider teams, assessing school needs, designing plans relevant to recipients, implementing plans, and verifying the process with documents. Crucial technical assistance areas for strengthening school-based management are also listed.
The Quality Assurance and Accountability Division of DepEd-Regional Office VII held a two-day training program on the School Monitoring, Evaluation and Adjustment System for division trainers from June 13-14, 2013. The training aimed to demonstrate understanding of school monitoring and evaluation, describe the school MEA process, validate MEA tools, commit to implementing MEA in schools, and develop district and school training plans. Eighty-six participants from Region VII's 19 school divisions attended and evaluated the training positively. The training successfully achieved its objectives of building understanding and capacity for monitoring, evaluation, and adjustment in schools.
This document outlines the School Monitoring, Evaluation and Adjustment (SMEA) Plan for the 2021-2022 school year for a school in Mabalacat City, Pampanga. It details the plan across four domains: 1) Leading Strategically, 2) Managing School Operations and Resources, 3) Focusing on Teaching and Learning, and 4) Developing Self and Others. For each domain, it lists the objectives, major outputs, suggested activities, performance indicators, and timeline. The plan aims to monitor progress, evaluate outcomes, and make adjustments to ensure the school's strategic plans and operations align with its vision and support improved teaching, learning, and performance.
The document summarizes the results of the Regional Monitoring, Evaluation and Adjustment (RMEA) conducted in the first quarter of 2013. Key findings include:
- The region achieved 92.31% of its 118 planned outputs, with most divisions accomplishing all targets except for minor shortfalls.
- Issues raised included lack of personnel and equipment repair needs. Lessons focused on benefits of planning and teamwork.
- As of April, the region had spent 7.53% of its 28.3 million peso budget, mostly on office supplies, travel, and communication. Expenditures varied between divisions.
- Recommendations included addressing personnel and equipment needs, enhancing monitoring systems, and adopting long-term
This document provides guidance for numeracy teaching in Prep and Years 1-2. It outlines the knowledge teachers require, essential numeracy skills and concepts to focus on, recommended assessments, and advice on planning differentiated instruction using the e5 instructional model. Teachers are advised to use assessment data to determine individual student needs and focus on number, patterns, addition/subtraction, measurement, geometry, data, and time concepts appropriate for each year level. Ongoing monitoring of student progress is recommended, along with providing feedback and opportunities for self-assessment. A range of teaching strategies including explicit instruction, questioning, discussion and hands-on activities with concrete materials should be used.
This document is an Individual Performance Commitment and Review Form for a teacher named Armando D. Ison. It outlines his key result areas, duties and responsibilities, objectives, timeline, and targets for the review period from June to March 2014. The form details objectives and targets for teaching-learning process, pupil/student outcomes, community involvement, professional growth and development, and professional ethics. It commits Mr. Ison to specific goals in each key result area to be evaluated by his rater, Principal Famie C. Apay.
Monitoring And Evaluation Framework For The K 12 Education And Training Syste... (Wesley Schwalje)
This presentation advances a performance management framework for the K-12 education system that aligns ministry and sector strategies with the development goals established by the Qatar National Development Strategy 2011-2016 and the Qatar National Vision 2030. Policy-based KPIs were conceived to measure system performance relative to the achievement of the overarching policy aims of quality, equity, and portability. Output KPIs were conceived to measure the effectiveness of education and training system interventions in terms of achieving academic, social, and economic outcomes which are precursors to the future development of Qatar.
The document is an individual performance review of Evelyn Mercado, a Master Teacher II, by her rater Richard Broñola, the school principal. It evaluates Mercado's performance from June 2015 to March 2016 based on key result areas including professional growth, instructional competence, instructional supervision, technical assistance, and additional factors. Mercado's performance is rated using indicators and supporting documents for each objective. The review will be used to assess Mercado's work and identify areas for improvement or additional training.
This document provides a summary report of a two-day monitoring and evaluation (M&E) training conducted for policy staff at the Ministry of Finance in Afghanistan. The training aimed to build participants' capacity in M&E and equip them with skills to effectively plan and implement M&E of programs, particularly those under the Tokyo Mutual Accountability Framework. A total of 15 staff members from various departments attended the training, which covered terminology, concepts, tools and the importance of M&E. Participants engaged in group work and discussions. Based on an evaluation, the training was successful in enhancing understanding of key M&E topics.
Quality Assurance and institutional accreditation performance indicators and ... (Ganesh Shukla)
This document discusses quality assurance and institutional accreditation in India, focusing on the National Institutional Ranking Framework (NIRF), the National Assessment and Accreditation Council (NAAC), and Internal Quality Assurance Cells (IQACs). It outlines the aims, functions, criteria and benefits of quality assurance processes and accrediting bodies in India. Key points include that NAAC accredits higher education institutions based on several criteria and parameters, NIRF provides annual rankings of Indian universities based on teaching, research, graduation outcomes and other metrics, and IQACs were established to promote continuous quality improvement within institutions.
Rubrics for IPCRF of Teachers per Objective of their KRAs (DIEGO Pomarca)
This document outlines rubrics used to evaluate teachers' performance on various objectives within their Key Result Areas (KRAs) at Pangpang National High School. It provides performance indicators, scales for rating teachers as outstanding, very satisfactory, satisfactory, unsatisfactory or poor on objectives related to lesson planning, facilitating learning, classroom management, and monitoring students. Sample computations are shown to calculate a teacher's average score for each objective based on their ratings in quality, efficiency and timeliness. The rubrics provide a framework for conducting impartial performance reviews of teachers according to clear and measurable standards.
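The averaging described above can be sketched as a short calculation. This is an illustrative sketch only: it assumes the common scheme of three subratings (quality, efficiency, timeliness) on a 1-5 scale averaged per objective; the function name and rounding are not taken from the rubrics themselves.

```python
def objective_score(quality, efficiency, timeliness):
    """Average the three subratings (each on a 1-5 scale) for one objective.

    The equal weighting of quality, efficiency and timeliness is an
    assumption for illustration, not a rule quoted from the rubrics.
    """
    for rating in (quality, efficiency, timeliness):
        if not 1 <= rating <= 5:
            raise ValueError("ratings must be on the 1-5 scale")
    return round((quality + efficiency + timeliness) / 3, 3)

# Example: a teacher rated 5 (quality), 4 (efficiency), 4 (timeliness)
score = objective_score(5, 4, 4)  # 4.333
```

A teacher's overall rating would then be some combination of these per-objective scores, typically weighted by the objective's assigned weight in the form.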
This paper attempts to address the problems that Cavite State University Naic Branch is facing: a drop in enrollment, a drop in licensure examination passing rates, the job competitiveness of graduates, and university instructors' qualifications.
This guide is made with you – our school heads, teachers, school staff, and other school stakeholders – in mind. In crafting this guide, we consulted with planning experts and experts from the field – principals, supervisors, and teachers – to ensure that School Improvement Planning becomes easier and more effective for you.
The document is an Individual Performance Commitment and Review form for the principal of Dapdap Elementary School. It outlines her key result areas, objectives, performance indicators and actual results for the review period of June 2015 to March 2016. The principal's performance was rated based on her fulfillment of objectives related to instructional leadership, learning environment, human resource management, community partnership, school leadership and utilization of resources. She received the highest rating for most objectives which demonstrated meeting targets for learner outcomes, submission of required documents, professional development activities and community engagement.
This document contains a performance review for a teacher for the period of April 2016. It evaluates the teacher's performance across several key result areas (KRAs) based on objectives, timelines, weights, and indicators. For each KRA, such as teaching-learning process and students' learning outcomes, the teacher's performance is rated on a scale of 1-5 based on the degree to which various quality, efficiency and timeliness indicators were met. The teacher received high ratings, often "Outstanding" or "Very Satisfactory" across most KRAs, indicating strong teaching performance over the review period.
This document provides a summary of the monitoring, evaluation, and adjustment activities for a school from September 2021 to June 2022. It includes:
1) A table showing the physical and financial accomplishments of 20 planned activities across 5 domains of school leadership. 95% of the activities were completed, and the full allocated budget of 92,000 was spent.
2) An analysis of quarterly physical accomplishment and funds utilization rates, finding that goals were achieved and budgets optimally used each quarter.
3) Tables showing the status of staff deployment, with 26 teaching and non-teaching positions filled and no new positions created.
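The physical accomplishment and funds utilization rates reported in a summary like this are simple ratios. A minimal sketch follows; the function and field names are assumptions for illustration, not taken from the report itself.

```python
def mea_rates(completed, planned, spent, allocated):
    """Return (physical accomplishment %, funds utilization %), rounded to 2 dp.

    completed/planned are activity counts; spent/allocated are amounts
    in the same currency unit.
    """
    physical = round(100 * completed / planned, 2)
    utilization = round(100 * spent / allocated, 2)
    return physical, utilization

# 19 of 20 planned activities completed; 92,000 of the 92,000 allocation spent
print(mea_rates(19, 20, 92_000, 92_000))  # (95.0, 100.0)
```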
This document summarizes the accomplishments of DepEd Tambayan Elementary School for the month of June in several key result areas including teaching and learning, pupil outcomes, community involvement, and professional growth and development. Several targets were set and achieved across different objectives, such as preparing daily lesson logs with appropriate teaching materials, facilitating learning through innovative strategies, monitoring student attendance and progress, conducting parent-teacher association meetings, and undertaking professional development activities like action research and training attendance. Supporting documentation is provided for each reported output including documents, data, pictures and certificates to validate that the targets were achieved.
Individual performance commitment and review form for regular teachers (Rai Blanquera)
This document is an individual performance commitment and review form for a regular teacher. It outlines the teacher's objectives, key responsibilities, and performance indicators across seven major areas for the rating period: student development, staff development, curriculum development, physical facilities, fiscal management, records management, and community development/parent involvement. The teacher's performance will be evaluated based on completing objectives, meeting responsibilities, and achieving targets for the various performance indicators listed in the form.
This operational plan aims to increase enrollment, improve student welfare, and develop faculty. Key strategies include intensive enrollment campaigns, upgrading facilities, strengthening guidance services, offering scholarships, and providing faculty development opportunities. Success will be measured through indicators such as enrollment numbers, student satisfaction ratings, and the percentage of faculty completing doctoral programs. Regular monitoring and evaluation of activities will allow corrective measures to ensure goals are achieved.
The document outlines several division programs and projects for the 2018-2019 school year (SY) aimed at improving curriculum implementation and student performance. Key programs include:
1. HI-TEACH - Focuses on instructional supervision, technical assistance for teachers, and capacity building for school heads to ensure full implementation of K-12 curriculum.
2. POWER IT UP - Implements intervention, reinforcement, and enhancement activities to improve student performance in all subject areas to at least 75% proficiency.
3. I-LIKHA - Contextualizes instructional materials to address 21st century learner needs and support teachers through localization of resources.
4. AGAP - Strengthens the assessment
This document provides a summary of the final evaluation report of the GEAR UP San Francisco program from May 2014. The key findings are:
1) GEAR UP students in the first two cohorts had higher GPAs, graduation rates, college attendance rates, and were more likely to take steps towards college like visits and SATs than similar students before the program.
2) Traditionally underserved students particularly benefited from GEAR UP's services.
3) GEAR UP helped strengthen a college-going culture in the schools and supported school-wide college access events.
4) GEAR UP coordinators' qualifications and stable presence on campus helped them effectively support students. The variety of services provided was crucial
The document provides guidelines for implementing the Results-Based Performance Management System (RPMS) aligned with the Philippine Professional Standards for Teachers (PPST) for public school teachers for School Year 2021-2022. It outlines 18 PPST indicators that will be used in evaluating teacher performance. Teachers' performance periods will cover August 1, 2021 to July 31, 2022. It provides the timeline and steps for the four phases of the RPMS cycle, and describes the assessment tools and means of verification that will be used to measure teacher performance on the indicators.
The document outlines an assessment plan to evaluate the Instructional Supervision Program in Oman. It will evaluate the program using a nine step model that includes: 1) defining the purpose and scope of the evaluation; 2) specifying evaluation questions; 3) designing the evaluation; 4) creating a data collection plan; 5) collecting data through surveys and interviews of supervisors and teachers; 6) analyzing the data; 7) documenting findings; 8) disseminating results; and 9) providing feedback for program improvement. The evaluation aims to assess the effectiveness of instructional supervision in teaching and the supervisors' training. It will answer whether instructional supervision plays an essential role in the learning process in Oman.
This document provides guidelines for assessment in the Life Orientation subject for the National Certificates (Vocational) qualification. It outlines the principles and objectives of assessment for vocational qualifications set by the National Qualifications Framework. Assessment includes an internal continuous assessment component and an external summative assessment component. It provides details on moderation of assessment, types of assessment, planning assessment, and strategies for collecting evidence. The guidelines are intended to help lecturers develop a coherent integrated assessment system for Life Orientation that complies with NQF requirements.
This document provides guidelines for assessment in the Life Orientation subject for the National Certificates (Vocational) qualification. It outlines the principles and objectives of assessment for vocational qualifications set by the National Qualifications Framework. Assessment includes an internal continuous assessment component and an external summative assessment component. It provides details on moderation of assessment, types of assessment, planning assessment, and strategies for collecting evidence. The guidelines are intended to help lecturers develop a coherent integrated assessment system for Life Orientation that complies with NQF requirements.
The document summarizes the progress and targets of the Basic Education Sector Reform Agenda (BESRA) Implementation and Accountability Plan from 2010 to 2012. It discusses objectives to improve learning outcomes and increase the number of schools reaching higher levels of school-based management. It also outlines targets and status updates for developing systems for quality assurance, teacher development, learning resources, and pre-school and alternative learning education programs.
Aspects of the Zambian Ministry of Education's policy on assessmentWilliam Kapambwe
The document discusses the Ministry of Education's policy on assessment in Zambia. It outlines the background of assessment reform beginning in 1992 with "Focus on Learning" which aimed to improve the quality of education. The ministry's 1996 policy "Educating Our Future" emphasized school-based continuous assessment and establishing basic competency levels. The assessment procedures were shifted to an outcome-based approach involving criterion referencing and authentic assessment. The strategic plan for 2003-2007 aimed to set minimum education standards and provide a competency-based curriculum supported by learning resources.
It refers to the collection of information on which judgment might be made about the worth and the effectiveness of a particular programme. It includes making those judgments so that decision might be made about the future of programme, whether to retain the program as it stand, modify it or throw it out altogether.
Professional Development PPT slides.pptxNqobile Nkosi
The document provides details about a training session on implementing the Quality Management System (QMS) for school-based educators in South Africa. It includes an agenda for Session 1 which covers topics such as the purpose and guiding principles of QMS, roles and responsibilities, performance standards and criteria, the implementation process, and appraisal instruments. Educators are required to attend the training to earn professional development points. The training aims to orient educators to the QMS framework as outlined in the relevant collective agreement.
The document outlines an orientation on the Basic Education Monitoring and Evaluation Framework (BEMEF) which provides guidance for monitoring and evaluating DepEd's programs and projects to ensure they are achieving their goals. It discusses the objectives, scope, principles and types of monitoring and evaluation under the BEMEF. The agenda also includes presentations on enabling mechanisms, indicators, operationalizing monitoring and evaluation, the roles of different governance levels, and developing a theory of change.
The document discusses the revised process for assessment and accreditation of higher education institutions by the National Assessment and Accreditation Council (NAAC) in India. Some key points:
1. The revised process aims to make the accreditation process more robust, objective, transparent, scalable and ICT-enabled, with reduced duration.
2. The revisions are based on feedback from stakeholders like academic experts and institutions. It resulted in developing technology-enabled and user-friendly assessment frameworks.
3. A new manual was developed for the accreditation of dual-mode universities which offer both conventional and distance learning programs, based on inputs from expert committees and stakeholders.
4. The manual will
The document provides information on school performance accountability in the Philippines. It defines accountability and discusses key principles like having a client focus, being performance oriented, ensuring transparency and integrating accountability mechanisms. It outlines who is accountable for what at different levels from the school to the central office. It also discusses approaches to monitoring and evaluation, and tools for accountability like the school report card.
The document summarizes the Quality Assurance and Accountability Framework (QAAF) adopted by the Department of Education in the Philippines. The QAAF provides a roadmap to build a culture of quality in the Department. It has the following key objectives: 1) Highlight the strategic importance of schools in providing quality education; 2) Strengthen support to schools from divisions and regions; 3) Define system boundaries between DepEd units; 4) Facilitate sharing of best practices; 5) Ensure education standards and management systems are in place; 6) Foster continuous improvement. The QAAF is based on a quality management model and emphasizes functional literacy, learners' outcomes, schools as the core unit, management levels and processes, and
This manual provides the framework for instructional supervision across different governance levels from region to school. It outlines the organizational structure and functions and responsibilities of instructional supervisors at each level. The roles include providing technical assistance and support to teachers, monitoring and evaluating instruction, identifying needs and implementing interventions to improve teaching and learning. The overall goal is to enhance education quality by supporting teachers and creating an enabling environment.
Pilot-tesing, Monitoring and Evaluating the Implementation of CurriculumVirginia Sevilla
This is the continuation of Curriculum Development Lesson 3 Module III which is "Pilot-tesing, Monitoring and Evaluating the Implementation of Curriculum"
The document discusses Continuous and Comprehensive Evaluation (CCE), which is a system for evaluating students that covers their overall development. CCE aims to assess all aspects of a child during their time at school, minimize stress, provide regular and comprehensive assessments, and help teachers and students. It assesses scholastic areas like academic subjects as well as co-scholastic areas such as life skills, activities, attitudes and values. Formative assessment provides feedback during learning, while summative assessment evaluates learning at the end of a period. However, CCE also has limitations like being difficult in large classes and requiring trained teachers.
UIACS provides academic and consultancy services to schools to improve student outcomes. This includes curriculum design, instruction support, accreditation services, and school improvement planning. UIACS consists of several specialized divisions that help schools with these areas. Some of the key services mentioned include curriculum development, online student assessments, facilitating the accreditation process, and providing guidance on school improvement initiatives.
1. Formative assessment occurs throughout instruction to monitor student learning and provide feedback, while summative assessment evaluates student learning at the end of a unit or course.
2. Both forms of assessment are important, as formative assessment guides ongoing instruction and learning, while summative assessment evaluates the achievement of learning outcomes and can be used for grading purposes.
3. Effective assessment involves aligning learning objectives, instructional activities, and evaluation methods to obtain a full picture of student understanding.
This document summarizes the history of assessment reforms in India from ancient to modern times. It discusses key education commissions and policies that shaped assessment practices. The National Education Policy 2020 aims to transform assessment to focus on competency and learning rather than rote memorization. It recommends continuous evaluation, multisource assessment, and reducing the high-stakes nature of board exams. Guidelines will be developed to align assessment across schools while supporting gifted students. Overall the policy seeks to shift assessment practices from testing memorization to evaluating higher-order skills.
This document outlines interim policies for assessment and grading during distance learning due to COVID-19. It defines key terms like formative and summative assessment and discusses using a variety of assessment strategies to understand student learning. Teachers are responsible for designing flexible assessments and providing timely feedback, while students and parents are responsible for communicating challenges. Assessment methods should align with learning standards and allow students multiple ways to demonstrate understanding. Both formative and summative assessments will continue remotely, with an emphasis on feedback and remediation to support student growth.
The AP Regional initiative on QA of TVET Qualifications in the context of ASE...OECD CFE
This document summarizes UNESCO's work on quality assurance of technical and vocational education and training (TVET) qualifications in Asia and the Pacific. It discusses the development of country studies, guidelines, and principles to help countries strengthen their TVET quality assurance systems. The guidelines were published in 2017 to provide a framework for stakeholders to develop, monitor, and improve the effectiveness of their TVET qualifications systems.
This document provides guidance on assessing school-based management practices in the Philippines. It outlines three levels (scales) of practice for six dimensions of school-based management. The dimensions include school leadership, strategic planning, resource management, teaching and learning, stakeholder partnerships, and monitoring and evaluation. Schools can use the assessment tool to determine their current level of practice for each dimension and identify areas for improvement. The assessment results will help schools and the Department of Education develop targeted support and interventions to strengthen school-based management.
Similar to Ca Baseline and Post test assessment report 2007 12 oct07 (20)
Effective administration and management of high stake assessementWilliam Kapambwe
1. The document discusses the effective administration and management of high-stake assessment examinations by the Council of Zambia.
2. It emphasizes the importance of continuous assessment conducted by teachers to improve student learning and reduce anxiety around exams.
3. Moderation by the Examinations Council of Zambia is needed to ensure fairness and catch any bias or malpractice, as teachers' assessments can vary in relation to standards between classes and schools.
Test construction (for content staff) eg feb08 erpWilliam Kapambwe
The document outlines the test construction procedure for assembling operational and field test forms for the CAPS assessment program. It involves content specialists using an item bank and test construction specifications to select items and create initial pull lists, which are then reviewed by psychometricians who analyze item and test statistics to ensure specifications are met. The process involves iterations between content specialists and psychometricians until final pull lists and test maps are approved, after which the operational forms are prepared for administration.
General Framework for Setting Examination Papers and Test PapersWilliam Kapambwe
The document provides guidance on developing test specifications and examination papers, including defining test content and mapping domains, using taxonomies to classify learning objectives, and selecting assessment methods that align with domains of learning. It discusses Bloom's taxonomy and provides examples of verbs for different cognitive levels. Assessment options are described for various learning domains, including cognitive, affective, and psychomotor. Frameworks like Romiszowski's are presented for relating knowledge and skills to test construction. The importance of congruence between learning outcomes and assessment methods is emphasized.
An overview of the assessment tasks for all the six subject areasWilliam Kapambwe
The document summarizes the assessment schemes for the six learning areas in primary school: 1) Literacy and Language, 2) Integrated Science, 3) Creative and Technology Studies, 4) Mathematics, 5) Social and Development Studies, and 6) Community Studies. Assessment includes practical activities and pencil-and-paper tests at the end of months 1-3 in each term. The learning areas use pupil-centered teaching methods and cover age-appropriate topics that increase in complexity from grades 1 to 5. Continuous assessment allows teachers to monitor individual student progress in each subject area.
The role of strategic planning in effecting change the realtionshiop between ...William Kapambwe
Strategic planning is a management tool that helps organizations focus their energy and work towards common goals. It determines an organization's direction over the next year or more. The strategic planning process involves three major activities: strategic analysis, setting strategic direction through goals and strategies, and action planning to implement strategies. While strategic planning is a disciplined process, it is also creative and allows for flexibility as insights are gained.
The document discusses the characteristics of outcomes-based assessment. It outlines that outcomes-based assessment focuses on measuring what learners know and can do rather than content coverage. Key characteristics include integrating assessment with teaching, using varied assessment strategies like portfolios and performances, and focusing on mastery learning and criterion-referenced assessment. The purposes of assessment in an outcomes-based curriculum are to identify learner needs, track progress, diagnose problems, and judge the effectiveness of learning programs. Recording and reporting of assessment should describe learner progress, strengths, and weaknesses.
The document provides an introduction to philosophy, outlining its main goals and branches. It discusses how philosophy originated under Socrates and his development of the Socratic method. It describes the core areas of philosophy including ethics, metaphysics, epistemology, and logic. It also covers the demands and rewards of studying philosophy.
Continuous assessment as a relevant tool to quality products of learners in e...William Kapambwe
The document discusses continuous assessment as a relevant tool for quality education. It defines key concepts like curriculum and assessment and examines the relationship between learning and assessment. Different assessment types and curriculum planning models are described. The principles of the process curriculum model emphasize developing learners through a variety of authentic and participatory assessments over time, making continuous assessment well-suited as it focuses on individual progress, understanding over rote knowledge, and both formative and summative feedback. Implementation of a continuous assessment pilot program in Zambia from 2006 to 2009 observed positive impacts from its use.
The document discusses the components of emotional intelligence that effective leaders possess. It summarizes Daniel Goleman's research finding that emotional intelligence is crucial for leadership success more than IQ or technical skills. The five components of emotional intelligence are self-awareness, self-regulation, motivation, empathy, and social skills. Each component is then defined in one to three sentences. For example, self-awareness means understanding one's own emotions and having candor in assessing strengths and weaknesses. Self-regulation allows one to control feelings and not panic. Motivation provides drive to achieve beyond expectations. Empathy means considering how others feel in decision making. Social skills help manage relationships and move people in the desired direction.
Training session on talent management and developmentWilliam Kapambwe
1. The document summarizes lessons learned from a short course on test construction presented by representatives from Zambia, Nepal, Philippines, and Zimbabwe.
2. Key lessons included understanding item response theory and classical test theory, developing valid and reliable tests, using taxonomies to clarify learning outcomes, and the importance of item banking and reporting test results.
3. The presenters outlined strategies for disseminating the lessons within their respective countries, which involved workshops, training programs, and collaborating with examination boards.
Power point for the techniques for constructing exam itemsWilliam Kapambwe
The document discusses techniques for constructing examination questions and assessing student learning. It covers constructing objective test items like multiple choice and matching, as well as subjective items like short answer and essays. Tips are provided for writing different item types and ensuring item-objective congruence. A variety of assessment options for different learning domains and continuous assessment techniques are also outlined.
The document provides guidance on developing test specifications and examination papers, including defining test content and mapping, using taxonomies to classify learning objectives, choosing appropriate assessment methods based on the cognitive, affective, and psychomotor domains being assessed, and ensuring congruency between learning outcomes and assessment techniques. It discusses Bloom's and Romiszowski's taxonomies and provides examples of verbs to use for different levels. The conclusion emphasizes the importance of aligning assessments with the intended learning outcomes.
Leveraging Generative AI to Drive Nonprofit InnovationTechSoup
In this webinar, participants learned how to utilize Generative AI to streamline operations and elevate member engagement. Amazon Web Service experts provided a customer specific use cases and dived into low/no-code tools that are quick and easy to deploy through Amazon Web Service (AWS.)
This presentation was provided by Rebecca Benner, Ph.D., of the American Society of Anesthesiologists, for the second session of NISO's 2024 Training Series "DEIA in the Scholarly Landscape." Session Two: 'Expanding Pathways to Publishing Careers,' was held June 13, 2024.
Gender and Mental Health - Counselling and Family Therapy Applications and In...PsychoTech Services
A proprietary approach developed by bringing together the best of learning theories from Psychology, design principles from the world of visualization, and pedagogical methods from over a decade of training experience, that enables you to: Learn better, faster!
Beyond Degrees - Empowering the Workforce in the Context of Skills-First.pptxEduSkills OECD
Iván Bornacelly, Policy Analyst at the OECD Centre for Skills, OECD, presents at the webinar 'Tackling job market gaps with a skills-first approach' on 12 June 2024
Ca Baseline and Post test assessment report 2007 12 oct07
Ministry of Education
DRAFT Technical Report
of the
Pre- and Post-Pilot Testing for the
Continuous Assessment Programme
in Lusaka, Southern and Western Provinces
Coordinated by the
Examinations Council of Zambia
Research and Test Development Department
Under the Direction of the
Continuous Assessment Steering and Technical Committees
Ministry of Education
Lusaka, Zambia
October 2007
Table of Contents
ACKNOWLEDGMENTS
CHAPTER ONE: BACKGROUND
1.1 Introduction to Continuous Assessment
1.2 Definition of Continuous Assessment
1.3 Challenges in the Implementation of Continuous Assessment
1.4 Guidelines for Implementation of Continuous Assessment
1.5 Plan for Implementation of Continuous Assessment
CHAPTER TWO: EVALUATION METHODOLOGY
2.1 Objectives
2.2 Design
2.3 Sample
2.4 Instruments
2.5 Administration
2.6 Data Capture and Scoring
2.7 Data Analysis
CHAPTER THREE: ASSESSMENT RESULTS
3.1 Psychometric Characteristics
3.2 Classical Test Theory
3.3 Item Response Theory
3.4 Scaled Scores
3.5 Vertical Scaled Scores
3.6 Comparison between Pilot and Comparison Groups
3.7 Comparison across Regions
3.8 Performance Categories
CHAPTER FOUR: SUMMARY AND CONCLUSIONS
APPENDIX 1: ITEM STATISTICS BY SUBJECT
APPENDIX 2: SCORES AND FREQUENCIES - GRADE 5 PRE-TESTS
APPENDIX 3: SCORES AND FREQUENCIES - GRADE 5 POST-TESTS
APPENDIX 4: HISTOGRAMS BY SUBJECT AND GROUP
ACKNOWLEDGMENTS
The Continuous Assessment Joint Steering and Technical Committees and the
Examinations Council of Zambia wish to express profound gratitude for the
professional and material support provided by the Provincial Education Offices,
District Education Boards, Educational Zone staff in the different districts, school
administrators, teachers and pupils. Without this support, the baseline and post-pilot
assessment exercises would not have succeeded.
We also thank the management of the Directorate for Curriculum and
Assessment in the Ministry of Education for its professional support to the
Continuous Assessment programme in general and to the assessment exercises in
particular. We wish specifically to thank the Director for Standards and Curriculum,
the Director for the Examinations Council of Zambia, and the Chief Curriculum
Specialist for allowing their personnel to take part in the assessment exercise.
Finally, we wish to express our appreciation to USAID and the EQUIP2 Project
for providing the finances and technical support towards the Continuous Assessment
programme in Zambia.
All of the participants and stakeholders listed above have played a crucial role in not
only developing and implementing the Continuous Assessment programme, but
have also been supportive of the quantitative evaluation of the programme presented
in this technical paper. It is because of their interest in improving student learning
outcomes that the Continuous Assessment programme has had the necessary
financial, administrative and technical support. Our hope is that the programme will
prove to be valuable for all of the pupils and teachers in Zambian schools.
Chapter One: Background
1.1 Introduction to Continuous Assessment
Over the years in Zambia, the education system has not been able to provide
enough spaces for all learners to proceed from Grade 7 to Grade 8, from
Grade 9 to Grade 10, and from Grade 12 to higher learning institutions. The
system has used examinations for selection of those to proceed to the next
level and for the certification of candidates; however, this has been done
without formal consideration of school-based assessment as a component
in the final examinations, with the exception of some practical subjects.
The 1977 Educational Reforms explicitly provided for the use of Continuous
Assessment (CA). Later, national policy documents, particularly Educating
Our Future (1996) and Ministry of Education’s Strategic Plan 2003-2007,
stated the need for integrating school-based continuous assessment into the
education system, including the development of strategies to combine CA
results with the final examination results for purposes of pupil certification and
selection.
Furthermore, the national education policy, as stated in Educating Our Future,
stipulated that the Ministry of Education will develop procedures that will
enable teachers to standardise their assessment methods and tasks for use
as an integral part of school-based CA. The education policy document also
stated that the Directorate of Standards, in cooperation with the Examinations
Council of Zambia (ECZ), will determine how school-based CA can be better
conducted so that it can contribute to the final examination results for pupil
certification and promotion to the subsequent levels. The policy also stated
that the Directorate of Standards, with input from the ECZ, will determine
when school-based CA can be introduced.
In order to set in motion the implementation of school-based CA, the ECZ
convened a preparatory workshop from 16th to 22nd November 2003 in
Kafue. Ninety (90) participants from various stakeholder institutions took
part. The objectives of the preparatory workshop were to:
• Recommend a plan for developing and implementing CA;
• Recommend a training plan for preparing teachers in implementing CA;
• Explore ways of ensuring transparency, reliability, validity and
comparability in using CA results;
• Agree on common assessment tasks and learning outcomes to be
identified in the syllabuses for CA;
• Discuss the development of a teacher’s manual on CA; and
• Discuss the nature of summary forms for recording marks that should be
provided to schools.
1.2 Definition of Continuous Assessment
Continuous assessment is defined as an on-going, diagnostic, classroom-
based process that uses a variety of assessment tools to measure learner
performance. CA is a formative evaluation tool conducted during the teaching
and learning process with the aim of influencing and informing the overall
instructional process. It is the assessment of the whole learner on an ongoing
basis over a period of time, where cumulative judgments of the learner’s
abilities in specific areas are made in order to facilitate further positive
learning (Le Grange & Reddy, 1998).¹
The data generated from CA should be useful in assisting teachers to plan for
the learning by individual pupils. It also should assist teachers in identifying
the unique understanding of each learner in a classroom by informing the
pupil of the level of instructional attainment, helping to target opportunities that
promote learning, and reducing anxiety and other problems associated with
examinations. CA has been shown to have positive impacts on student learning
outcomes in hundreds of educational settings (Black & Wiliam, 1998).²
CA is made up of a variety of assessment methods that can be formal or
informal. It takes place during the learning process when it is most necessary,
making use of criterion referencing rather than norm referencing and providing
feedback on how learners are changing.
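As a purely illustrative aside (not part of the report), the distinction between criterion referencing and norm referencing can be sketched in a few lines of Python. The mastery cut-off, the "top half" rule, and the pupils' marks below are all hypothetical, chosen only to show that the same marks can yield different judgments under the two approaches:

```python
def criterion_referenced(marks, mastery_cutoff=0.75):
    """Judge each pupil against a fixed competency criterion (hypothetical cut-off)."""
    return {pupil: ("competent" if score >= mastery_cutoff else "needs support")
            for pupil, score in marks.items()}

def norm_referenced(marks, top_fraction=0.5):
    """Judge each pupil relative to the group, e.g. whether they rank in the top half."""
    ranked = sorted(marks, key=marks.get, reverse=True)
    cutoff_rank = max(1, int(len(ranked) * top_fraction))
    top = set(ranked[:cutoff_rank])
    return {pupil: ("above median" if pupil in top else "below median")
            for pupil in marks}

# Hypothetical marks for four pupils (fractions of full marks)
marks = {"A": 0.80, "B": 0.78, "C": 0.60, "D": 0.55}
print(criterion_referenced(marks))  # A and B both meet the fixed criterion
print(norm_referenced(marks))       # A and B happen to form the top half here
```

Under criterion referencing, every pupil who reaches the standard is judged competent regardless of how classmates perform, which is why it suits the formative, progress-tracking role that CA plays.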
1.3 Challenges in the Implementation of Continuous Assessment
There are several areas in which the implementation of CA in the classroom
will present challenges. Some of these are listed below.
• Large class sizes in most primary schools are a major problem. It is
common to find classes of 60 and above in Zambian classrooms.
Teachers are expected to mark and keep records of the progress of all of
these learners.
• CA can be time-consuming for teachers. As a result, teachers worry that
time spent on remediation and enrichment is excessive, and many do not
believe they can finish the syllabus while implementing CA.
• CA will not be successfully implemented if teaching resources and
equipment in schools are inadequate. Teachers need materials and equipment
such as stationery, computers and photocopiers (and electricity).
• There may be resistance from school administrators and teachers
if they feel left out of the process of developing the CA programme.
• CA requires the cooperation of communities and parents. If they do not
understand what is expected of them, they may resist and hence affect the
success of the programme.
¹ Le Grange, L.L. & Reddy, C. (1998). Continuous Assessment: An Introduction and Guidelines to Implementation. Cape Town, South Africa: Juta.
² Black, P. & Wiliam, D. (1998). Assessment and classroom learning. Assessment in Education, 5(1), 7–74.
1.4 Guidelines for Implementation of Continuous Assessment
A teachers’ guide on the implementation of continuous assessment at the
basic school level was developed with the involvement of curriculum
specialists, Standards officers, Examinations specialists, Provincial Education
Officials, District Education Officials, Zonal in-Service training providers,
school administrators and teachers.
The Teachers’ Guide on CA comprises the following:
• Sample record forms;
• Description of the CA schemes;
• Instructions for preparing and administering assessment materials;
• Marking and moderation of the CA marks;
• Recording and reporting assessment results; and
• Monitoring of the implementation of the CA.
The Teachers’ Guide also specifies the roles of stakeholders as follows:
Teachers
• Plan assessment tasks, projects and mark schedules;
• Teach, guide and supervise pupils in implementing given tasks;
• Conduct the assessment in line with given guidelines;
• Mark and record the results;
• Provide correction and remedial work to the pupils;
• Inform the head teacher and parents on the performance of the child;
• Advise and counsel the pupils on their performance in class tasks;
• Take part in internal moderation of pupils’ results.
School Administrators
• Provide an enabling environment, such as the procurement of teaching
and learning materials;
• Act as links between the school and other stakeholders like ECZ,
traditional leaders, politicians and parents;
• Ensure validity, reliability and comparability through moderation of CA;
• Compile CA results and hand them to ECZ.
Parents
• Provide professional, moral, financial and material support to pupils.
• Continuously monitor their children’s attendance and performance
• Take part in making and enforcing school rules.
• Attend open days and witness the giving of prizes (rewards) to outstanding
pupils in terms of performance.
Standards Officers
• Interpret Government of Zambia policy on education;
• Monitor education policy implementation at various levels of the education
system;
• Advise and evaluate the extent to which the education objectives have
been achieved;
• Ensure that acceptable assessment practices are conducted;
• Monitor the overall standards of education.
Guidance Teachers/School Counsellors
• Prepare and store record cards for CA;
• Counsel pupils, teachers and parents/ guardians on CA and feedback;
• Take care of the pupils’ psycho-social needs;
• Make referrals for pupils to access other specialized assistance/support.
Heads of Department/Senior Teachers/Section Heads
• Monitor and advise teachers in the planning, setting, conducting, marking
and recording of CA results;
• Ensure validity, reliability and dependability of CA by conducting internal
moderation of results;
• Hold departmental meetings to analyze the assessment;
• Provide or make available the teaching and learning materials;
• Compile a final record of CA results and hand them over to Guidance
Teachers for onward submission to the ECZ.
District Resource Centre Coordinators
• Ensure adequate in service training for teachers in planning, conducting,
marking, moderating and recording results at school level in the district;
• Monitor the conduct of CA in the schools and district;
• Professionally guide teachers to ensure provision of quality education at
school level.
Provincial Resource Centre Coordinators
• Ensure adequate in-service training for teachers for them to be effective in
planning, conducting, marking, moderating and recording CA results;
• Monitor the conduct of CA in the province;
• Professionally guide teachers to ensure provision of quality education at
provincial level.
Examinations Specialist
• Analyse and moderate CA results and certify candidates;
• Integrate CA results with terminal examination results;
• Determine grade boundaries;
• Certify the candidates;
• Disseminate the results of candidates.
Monitors
As monitors of the CA programme, various officials and stakeholders will look
out for the following documents and information:
• Progress chart;
• Record of CA results and analysis;
• Marked evidence of pupils’ CA work on remedial activities;
• Analysis of performance by gender;
• Pupil’s Record Cards;
• CA plans or schedules and schemes;
• Evidence of pupils’ work;
• CA administration;
• Evidence of remedial work;
• Availability of planned remedial work in the classroom;
• Availability of the teacher’s guide;
• Sample CA tasks;
• Evidence of a variety of CA tasks;
• Teacher’s record of pupils’ performance.
1.5 Plan for Implementation of Continuous Assessment
CA in Zambia is planned to roll out over a period of several years. This will
allow for proper stakeholder support and evaluation. The following list
provides a brief timeline of important CA activities through 2008:
• Creation of CA Steering and Technical Committees (2005);
• Development of assessment schemes, teacher’s guides, model
assessment tasks booklets and recordkeeping forms (2005);
• Design of quantitative evaluation methodology with focus on student
learning outcomes (2005);
• Implementation of CA pilot in Phase 1 schools: Lusaka, Southern and
Western regions (2006);
• Baseline report on student learning outcomes (2006);
• Implementation of CA pilot in Phase 2 schools: Central, Copperbelt and
Eastern Regions (2007);
• Expansion of modified CA pilot to community schools (2007);
• Post-test report on student learning outcomes (2007);
• Implementation of CA pilot in Phase 3 schools: Luapula, Northern and
Northwestern Regions (2008);
• Discussion of scaling up of CA pilot and systems-level planning for
combining Grade 7 end-of-cycle summative test scores with CA scores for
selection and certification purposes (2008).
Chapter Two: Evaluation Methodology
2.1 Objectives
The main objective of the quantitative evaluation is to determine whether the
CA programme has had positive effects on student learning outcomes. The
evaluation allows for a determination of whether pupils’ academic
performance has changed as a result of the CA intervention, as well as the
extent of the change in performance.
2.2 Design
The evaluation design is quasi-experimental, with pre-test and post-tests
administered to intervention (pilot) and control (comparison) groups. It
features a pre-test at the beginning of Grade 5 and post-tests at the end of
Grades 5, 6, and 7. The pilot and comparison groups will be compared at
each time point in 6 subject areas to see if there are differences in test scores
from the baseline to the post-tests by group (see Figures 1 and 2 below). 3
Figure 1: Pre-Test and Post-Test, Pilot and Control Group Design
[Diagram: the Pilot Group and the Control Group are each assessed at four
time points: Grade 5 pre-test, Grade 5 post-test, Grade 6 post-test and
Grade 7 post-test.]
Figure 2: Expected Results from the Evaluation
[Line chart: scaled scores (200-650) plotted across the four assessments
(G5 pre-test, G5 post-test, G6 post-test, G7 post-test); the Pilot line is
expected to rise above the Control line after the pre-test.]
3 For more information, refer to the Summary of the Continuous Assessment Program August 2007 by
the Examinations Council of Zambia and the EQUIP2-Zambia project.
With the matched pairs random assignment design, it was expected that the
two groups, pilot and control, would have similar mean scores on the pre-test.
However, with a successful intervention, it was expected that the pilot group
would score higher than the control group on the subsequent post-tests.
2.3 Sample
The sample included all the 2006 (pre-test) and 2007 (post-test) Grade 5
basic school pupils in Lusaka, Southern and Western Provinces in the 24 pilot
(intervention) and 24 comparison (control) schools. The schools were chosen
using matched pairs by geographic location, school size, and grade levels as
matching variables, followed by random assignment to pilot and comparison
status. CA activities were implemented in pilot schools but not in the
comparison schools.
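As a sketch of this procedure, matched-pairs random assignment can be illustrated in a few lines of Python; the school records, field names, and matching keys below are hypothetical illustrations, not the actual sampling frame:

```python
import random

def assign_matched_pairs(schools, seed=42):
    """Pair schools on the matching variables, then randomly assign one
    of each pair to the pilot group and the other to the control group."""
    # Sort so that schools with similar location, size and grade span are adjacent.
    ordered = sorted(schools, key=lambda s: (s["location"], s["size"], s["grades"]))
    rng = random.Random(seed)
    pilot, control = [], []
    # Walk the sorted list two at a time; each consecutive pair is a matched pair.
    for a, b in zip(ordered[0::2], ordered[1::2]):
        if rng.random() < 0.5:
            pilot.append(a); control.append(b)
        else:
            pilot.append(b); control.append(a)
    return pilot, control

# Hypothetical example: four schools forming two matched pairs.
schools = [
    {"name": "A", "location": "Lusaka", "size": 620, "grades": "1-7"},
    {"name": "B", "location": "Lusaka", "size": 600, "grades": "1-7"},
    {"name": "C", "location": "Western", "size": 310, "grades": "1-7"},
    {"name": "D", "location": "Western", "size": 300, "grades": "1-7"},
]
pilot, control = assign_matched_pairs(schools)
```

Because each pair is split at random, pre-test means in the two groups should be similar, which is the expectation tested in Section 3.6.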
2.4 Instruments
Student achievement for the Grade 5 baseline and post-pilot administrations
was measured using multiple choice tests with 30 items (30 points per test).
The test development process included the following steps:
• Review of the curriculums for each subject area;
• Development of test specifications;
• Development of items;
• Piloting of items;
• Data reviews of item statistics;
• Forms pulling (selecting items for final test papers).
The test instruments were developed by teams of Curriculum Specialists,
Standards Officers, Examination Specialists and Teachers. The baseline tests
(pre-tests) were developed based on the Grade 4 syllabus and the post-pilot
tests (post-tests) were developed based on the Grade 5 syllabus.
2.5 Administration
The ECZ organized the administration of both pre-test and post-test papers.
Teams comprising an Examination Specialist, a Standards Officer and a
Curriculum Specialist were sent to each region to supervise the
administration. District Education officials, School Administrators and
Teachers were involved in the actual administration of the tests. All of the
Grade 5 pupils in the pilot and comparison schools sat for six tests, one in
each of the six subject areas (English, Mathematics, Social and Development
Studies, Integrated Science, Creative and Technology Studies and
Community Studies). The baseline tests (Grade 4 syllabus) were administered
to the students at the beginning of Grade 5, in February 2006. The post-pilot
tests (Grade 5 syllabus) were administered in February 2007.
Note that there will be two more administrations of post-tests for the cohort of
students in the three provinces. These will take place in February 2008
(Grade 6 syllabus) and November 2008 (Grade 7 syllabus). This process will
be repeated in Phases 2 and 3 schools (see Table 1 below).
Table 1: Implementation Plan for CA Pilot

Phase                                       2006      2007      2008      2009      2010
Phase 1 (Lusaka, Southern, Western)         Grade 5   Grade 6   Grade 7
Phase 2 (Central, Copperbelt, Eastern)                Grade 5   Grade 6   Grade 7
Phase 3 (Luapula, Northern, Northwestern)                       Grade 5   Grade 6   Grade 7
2.6 Data Capture and Scoring
Data were captured using Optical Mark Readers (OMR) and scored by use of
the Faim software at the ECZ. Through this process, item scores for all
students were converted into electronic format and data files were produced
for analysis.
2.7 Data Analysis
Data were analysed by use of the Statistical Package for Social Sciences
(SPSS). Scores and frequencies by subject were generated. Analysed data
were presented in tabular, chart and graphical forms. Additional analyses
were conducted using WINSTEPS (item response theory Rasch modelling)
software. SPSS was used for scaling the pupils’ scores.
Chapter Three: Assessment Results
3.1 Psychometric Characteristics
An initial step in determining the results from the assessments was to conduct
analyses to determine the psychometric characteristics of the assessments.
Both the Standards for Educational and Psychological Testing (1999) 4 and
the Code of Fair Testing Practices in Education (2004) 5 include standards for
identifying quality items. Items should assess only knowledge or skills that are
identified as part of the domain being tested and should avoid assessing
irrelevant factors (e.g., ambiguity, grammatical errors, sensitive content
or language).
Both quantitative and qualitative analyses were conducted to ensure that
items on both Grade 5 baseline and post-pilot tests met satisfactory
psychometric guidelines. The statistical evaluations of the items are presented
in two parts, using classical test theory (CTT) and item response theory (IRT),
which is sometimes called modern test theory. 6 The two measurement
models generally provide similar results, but IRT is particularly useful for test
scaling and equating. CTT analyses included 1) difficulty index (p-value), 2)
discrimination index (item-test correlations), and 3) test reliability (Cronbach's
Alpha for an estimate of internal consistency reliability). IRT analyses
included (1) calibration of items, and (2) examination of item difficulty index
(i.e., b-parameter).
3.2 Classical Test Theory
Difficulty Indices (p)
All multiple-choice items were evaluated in terms of item difficulty according to
standard classical test theory practices. Difficulty was defined as the average
proportion of points achieved on an item by the students. It was calculated by
obtaining the average score on an item and dividing by the maximum possible
score for the item. Multiple-choice items were scored dichotomously (1 point
vs. no points, or correct vs. incorrect), so the difficulty index was simply the
proportion of students who correctly answered the item. All items on Grade 5
pre-tests and post-tests had four response options. Table 2 shows the
average p-values for each test. Note that this may also be calculated by
taking the average raw score of all students divided by the maximum points
(30) per test.
4 American Educational Research Association, American Psychological Association, and National
Council on Measurement in Education (1999). Standards for Educational and Psychological Testing.
Washington, DC: American Educational Research Association.
5 Joint Committee on Testing Practices (2004). Code of Fair Testing Practices in Education.
Washington, DC: American Psychological Association.
6 For more information, see Crocker, L. and Algina, J. (1986). Introduction to Classical and Modern
Test Theory. New York: Harcourt Brace.
Table 2: Overall Test Difficulty Estimates by Subject Area

Subject Area                       # Items   Pre-test Mean p-value   Post-test Mean p-value
English                               30             0.40                    0.37
Social and Developmental Studies      30             0.34                    0.42
Mathematics                           30             0.41                    0.40
Integrated Science                    30             0.33                    0.36
Creative and Technology Studies       30             0.35                    0.36
Community Studies                     30             0.32                    0.37
Items that are answered correctly by almost all students provide little
information about differences in student ability, but they do indicate
knowledge or skills that have been mastered by most students. Similarly,
items that are correctly answered by very few students may indicate
knowledge or skills that have not yet been mastered by most students, but
such items provide little information about differences in student ability. In
general, to provide the best measurement, difficulty indices should range from
near-chance performance of about 0.20 (for four-option, multiple-choice
items) to 0.90. In general, the item difficulty indices for both Grade 5 pre-tests
and post-tests were within generally acceptable and expected ranges (see
Appendix 1 for a complete list of p-values for all items on each test).
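Because the items are scored dichotomously, the difficulty index is simply the proportion of correct responses. A minimal Python sketch, using an invented 0/1 response matrix:

```python
def difficulty_indices(responses):
    """p-value per item: mean of the 0/1 scores across students.
    `responses` is a list of per-student lists of 0/1 item scores."""
    n_students = len(responses)
    n_items = len(responses[0])
    return [sum(student[i] for student in responses) / n_students
            for i in range(n_items)]

# Hypothetical data: five students, three items.
responses = [
    [1, 0, 1],
    [1, 1, 0],
    [0, 0, 1],
    [1, 0, 0],
    [1, 1, 1],
]
p_values = difficulty_indices(responses)  # item 1 answered correctly by 4 of 5 students
```

Note that the mean of the item p-values equals the average raw score divided by the number of items, which is the shortcut the text mentions for computing the overall test p-value.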
Item Discrimination (Item-Test or Point-Biserial Correlations)
One desirable feature of an item is that the higher performing students do
better on the item than lower performing students. The correlation between
student performance on a single item and total test score is a commonly used
measure of this characteristic of an item. Within classical test theory, the item-
test (or point-biserial) correlation is referred to as the item’s discrimination
because it indicates the extent to which successful performance on an item
discriminates between high and low scores on the test. The theoretical range
of these statistics is –1 to +1, with a typical range from 0.2 to 0.6.
Discrimination indices can be thought of as measures of how closely an item
assesses the same knowledge and skills assessed by other items contributing
to the total score. Discrimination indices for Grade 5 are presented in Table 3.
Table 3: Overall Test Discrimination Estimates by Subject Area

Subject Area                       # Items   Pre-test Mean Pt-bis   Post-test Mean Pt-bis
English                               30            0.46                   0.48
Social and Developmental Studies      30            0.38                   0.45
Mathematics                           30            0.37                   0.41
Integrated Science                    30            0.35                   0.43
Creative and Technology Studies       30            0.38                   0.44
Community Studies                     30            0.29                   0.43
On average, the discrimination indices were within acceptable and expected
ranges (i.e., 0.20 to 0.60). The positive discrimination indices indicate that
students who performed well on individual items tended to perform well
overall on the test. There were no items on the instruments that had near-zero
discrimination indices (see Appendix 1 for a complete list of the point-biserial
correlations for all items on each pre-test and post-test per subject area).
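The point-biserial discrimination index is the Pearson correlation between the 0/1 item scores and the total test scores. A minimal sketch with invented data (the totals below include the item itself; the report does not say whether a corrected item-total correlation was used, so that choice is an assumption):

```python
import math

def point_biserial(item_scores, total_scores):
    """Pearson correlation between 0/1 item scores and total test scores."""
    n = len(item_scores)
    mx = sum(item_scores) / n
    my = sum(total_scores) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(item_scores, total_scores))
    sx = math.sqrt(sum((x - mx) ** 2 for x in item_scores))
    sy = math.sqrt(sum((y - my) ** 2 for y in total_scores))
    return cov / (sx * sy)

# Hypothetical data: six students; the three highest scorers answered the item correctly.
item = [1, 1, 1, 0, 0, 0]
total = [28, 25, 20, 15, 12, 10]
r = point_biserial(item, total)  # strongly positive: high scorers got the item right
```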
Test Reliabilities
Although an individual item’s statistical properties are an important focus, a
complete evaluation of an assessment must also address the way items
function together and complement one another.
There are a number of ways to estimate an assessment’s reliability. One
possible approach is to give the same test to the same students at two
different points in time. If students receive the same scores on each test, then
the extraneous factors affecting performance are small and the test is reliable.
(This is referred to as test-retest reliability.) A potential problem with this
approach is that students may remember items from the first administration or
may have gained (or lost) knowledge or skills in the interim between the two
administrations. A solution to the ‘remembering items’ problem is to give a
different, but parallel test at the second administration. If the student scores
on each test correlate highly, the test is considered reliable. (This is known as
alternate forms reliability, because an alternate form of the test is used in
each administration.) This approach, however, does not address the problem
that students may have gained (or lost) knowledge or skills in the interim
between the two administrations. In addition, the practical challenges of
developing and administering parallel forms generally preclude the use of
parallel forms reliability indices. One way to address these problems is to split
the test in half and then correlate students’ scores on the two half-tests; this in
effect treats each half-test as a complete test. By doing this, the problems
associated with an intervening time interval, and of creating and administering
two parallel forms of the test, are alleviated. This is known as a split-half
estimate of reliability. If the two half-test scores correlate highly, items on the
two half-tests must be measuring very similar knowledge or skills. This is
evidence that the items complement one another and function well as a
group. This also suggests that measurement error will be minimal.
The split-half method requires a judgment regarding the selection of which
items contribute to which half-test score. This decision may have an impact on
the resulting correlation; different splits will give different estimates of
reliability. Cronbach (1951) 7 provided a statistic, α (alpha), that avoids this
concern about the split-half method. Cronbach’s α gives an estimate of the
average of all possible splits for a given test. Cronbach’s α is often referred to
as a measure of internal consistency because it provides a measure of how
well all the items in the test measure one single underlying ability. Cronbach’s
α is computed using the following formula:
7 Cronbach, L. J. (1951). Coefficient Alpha and the Internal Structure of Tests. Psychometrika, 16,
297–334.
α = (n / (n − 1)) × (1 − Σ σ²(Yi) / σx²)

where, i : item (the summation Σ runs over i = 1 to n),
n : total number of items,
σ²(Yi) : individual item variance, and
σx² : total test variance
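Cronbach’s α can be computed directly from an item-score matrix. The sketch below uses population variances (dividing by N) to match the formula above; dividing by N − 1 instead would change the estimate slightly. The four-student response matrix is invented for illustration:

```python
def cronbach_alpha(responses):
    """Cronbach's alpha for a list of per-student lists of item scores."""
    n_items = len(responses[0])

    def variance(values):
        # Population variance, matching the sigma^2 terms in the formula.
        m = sum(values) / len(values)
        return sum((v - m) ** 2 for v in values) / len(values)

    # Individual item variances sigma^2(Y_i).
    item_vars = [variance([s[i] for s in responses]) for i in range(n_items)]
    # Total test variance sigma_x^2, from each student's total score.
    totals = [sum(s) for s in responses]
    total_var = variance(totals)
    return (n_items / (n_items - 1)) * (1 - sum(item_vars) / total_var)

# Hypothetical responses: four students, three items.
responses = [[1, 1, 1], [1, 1, 0], [0, 0, 1], [0, 0, 0]]
alpha = cronbach_alpha(responses)
```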
For standardized tests, reliability estimates should be approximately 0.80 or
higher. According to Table 4, the pre-test reliabilities ranged from 0.63
(Community Studies) to 0.87 (English). The reliability estimate for Community
Studies was low due to the absence of a national curriculum for use in test
construction. In contrast, the reliability estimates for the post-tests ranged
from 0.83 (Mathematics) to 0.89 (English). The post-tests likely had higher
reliability estimates because the test developers had gained experience since
developing the baseline tests.
Table 4: Test Reliability Estimates by Subject Area

Subject Area                       # Items   Pre-test Coefficient Alpha   Post-test Coefficient Alpha
English                               30              0.87                         0.89
Social and Developmental Studies      30              0.80                         0.87
Mathematics                           30              0.79                         0.83
Integrated Science                    30              0.76                         0.85
Creative and Technology Studies       30              0.80                         0.86
Community Studies                     30              0.63                         0.85
3.3 Item Response Theory
Item Response Theory (IRT) uses mathematical models to define a
relationship between an unobserved measure of student ability, usually
referred to as theta ( θ ), and the probability ( p ) of getting a dichotomous item
correct. In IRT, it is assumed that all items are independent measures of the
same construct or ability (i.e., the same θ ). The process of determining the
specific mathematical relationship between θ and p is referred to as item
calibration. Once items are calibrated, they are defined by a set of parameters
which specify a non-linear relationship between θ and p . 8
8 For more information about item calibration, see the following references: Lord, F.M. and Novick,
M.R. (1968). Statistical Theories of Mental Test Scores. Boston, MA: Addison-Wesley; Hambleton,
R.K. and Swaminathan, H. (1984). Item Response Theory: Principles and Applications. New York:
Springer.
For the CA programme, a 1-parameter or Rasch model was implemented.
The Rasch model defines the probability that a student with ability level θ
gives a correct response to item i as:

Pi(θ) = exp[D(θ − bi)] / (1 + exp[D(θ − bi)])

where, i = item,
b = item difficulty, and
D = a normalizing constant equal to 1.701.
In IRT, item difficulty (bi) and student ability (θ) are measured on a scale of
−∞ to +∞. A scale of −3.0 to +3.0 is used operationally in educational
assessment programmes, with −3.0 indicating low student ability or an easy
item and +3.0 indicating high student ability or a difficult item. The bi
parameter for an item is the position on the ability scale where the probability
of a correct response is 0.50.
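The Rasch probability above is straightforward to evaluate; for example (with the normalizing constant D given in the text):

```python
import math

D = 1.701  # normalizing constant from the model above

def rasch_p(theta, b):
    """Probability of a correct response for ability theta and item difficulty b."""
    z = D * (theta - b)
    return math.exp(z) / (1 + math.exp(z))

# When ability equals item difficulty, the probability is exactly 0.50,
# which is how the b-parameter is defined above:
p_equal = rasch_p(0.0, 0.0)
# An able student on an easy item answers correctly with high probability:
p_easy = rasch_p(2.0, -1.0)
```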
The IRT analyses were conducted using the WINSTEPS software. The
item parameter files resulting from the analyses are provided in Appendices 2
and 3. This presentation is direct output from WINSTEPS. 9 Raw scores were
then scaled using the item response theory model, with a range of 100-500
(see Appendices 2 and 3 for the raw score to scale score conversion tables
for each subject area).
3.4 Scaled Scores
The Grade 5 pre-test and post-test scores in each subject area are reported
on a scale that ranges from 100 to 500. Students’ raw scores or total number
of points, on the pre-tests and post-tests are translated to scaled scores using
a data analysis process called scaling. Scaling simply converts raw points
from one scale to another. In the same way that distance can be expressed in
miles or kilometres, or monetary value can be expressed in terms of U.S.
dollars or Zambian Kwacha, student scores on both pre and post-tests could
be expressed as raw scores (i.e., number of points) or scaled scores.
Cut points were established on the raw score scale both for the pre-tests and
post-tests (see Section 3.8 “Performance Levels” for an explanation of how
these cut points were determined). Once the raw score cut points were
determined via standard setting, the next step was to compute theta cuts
using the test characteristic curve (TCC) mapping procedure and then
calculate the transformation coefficients that would be used to place students’
raw scores onto the theta scale then onto the scaled score used for reporting.
As previously stated, student scores on the assessments are reported in
integer values from 100 to 500 with two scores representing cut scores on
each assessment. Two cut points (Unsatisfactory/Satisfactory and
Satisfactory/Advanced) were pre-set at 250 and 350, respectively.
9 See the WINSTEPS user’s manual for additional details regarding this output (at
http://www.winsteps.com).
Figure 3: Scaled Score Conversion Procedure
1. Raw score cuts obtained from standard setting;
2. Conversion of the raw score cuts into theta cuts (θ1 and θ2) using TCC
mapping;
3. Calculation of the scaling constants (m and b) using the theta cuts (θ1, θ2)
and the scaled score cuts (250 and 350);
4. Calculation of scaled scores using m(θ) + b.
The scaled scores are obtained by a simple linear transformation of the theta
score using the values of 250 and 350 on the scaled score metric and the
associated theta cut points to define the transformation. The scaling
coefficients were calculated using the following formulae:
b = 250 − m(θ1)
b = 350 − m(θ2)
m = (350 − 250) / (θ2 − θ1)
Where m is the slope of the line providing the relationship between the theta
and scaled scores, b is the intercept, θ 1 is the cut score on the theta score
metric for the Unsatisfactory/Satisfactory cut (i.e., corresponding to the raw
score cut for Unsatisfactory/Satisfactory), and θ 2 is the cut score on the theta
score metric for the Satisfactory/Advanced cut (i.e., corresponding to the raw
score cut for Satisfactory/Advanced). Scaled scores were then calculated
using the following linear transformation (see Figure 3):

Scaled Score = m(θ) + b
Where, θ represents a student’s theta (or ability) score. The values obtained
using this formula were rounded to the nearest integer and then truncated
such that no student received a score below 100 or above 500. Table 5
presents the mean raw score for each grade/subject area combination in the
pre-tests and post-tests.
It is important to note that converting from raw scores to scaled scores does
not change the students’ performance-level classifications. For the Zambia
CA programme, a score of 250 is the cut score between Unsatisfactory and
Satisfactory and a score of 350 is the cut score between Satisfactory and
Advanced. This is true regardless of which subject area, grade, or year one
may be concerned with.
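The cut-based linear transformation described above can be sketched as follows. The theta cuts below are invented for illustration; rounding to the nearest integer and truncation to the 100-500 reporting range follow the description in the text:

```python
def scaling_coefficients(theta1, theta2, cut1=250.0, cut2=350.0):
    """Slope m and intercept b placing the theta cuts at the scaled-score cuts."""
    m = (cut2 - cut1) / (theta2 - theta1)
    b = cut1 - m * theta1
    return m, b

def scaled_score(theta, m, b, lo=100, hi=500):
    """Linear transform m(theta) + b, rounded and truncated to [lo, hi]."""
    return max(lo, min(hi, round(m * theta + b)))

# Hypothetical theta cuts from TCC mapping:
m, b = scaling_coefficients(theta1=-0.8, theta2=0.9)
# The cuts themselves map exactly onto the scaled-score cuts of 250 and 350,
# and extreme abilities are truncated to the ends of the reporting range.
```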
Scaled scores supplement the pre-test and post-test results by providing
information about the position of a student’s results within a performance
level. For instance, if the range for a performance level is 200 to 250, a
student with a scaled score of 245 is near the top of the performance level,
and close to the next higher performance level.
School level scaled scores are calculated by computing the average of
student-level scaled scores. Table 5 provides the raw score averages for each
of the subject areas, while Table 6 provides the same information in scaled
scores.
Table 5: Grade 5 Mean Raw Scores by Subject Area

                                             Grade 5 Pre-test         Grade 5 Post-test
Subject Area                      # Items    N     Mean   Std. Dev.   N     Mean   Std. Dev.
English                              30      3798  12.2   6.5         4025  11.7   7.1
Social and Developmental Studies     30      3962  10.1   5.3         4104  13.2   6.6
Mathematics                          30      3883  12.3   5.3         4127  12.4   5.8
Integrated Science                   30      4039   9.9   4.9         4135  11.1   6.3
Creative and Technology Studies      30      4032  10.5   5.3         4097  11.7   6.2
Community Studies                    30      4037   9.5   4.0         4141  11.2   6.4
According to Table 5, overall mean raw scores (with both pilot and
comparison groups taken together) across the subject areas on the pre-test
ranged from 9.5 (Community Studies) to 12.3 (Mathematics) out of a possible
30 points. In contrast, the overall mean raw scores for the post-tests ranged
from 11.1 (Integrated Science and Creative and Technology Studies) to 13.2
(Social and Developmental Studies). From Table 6, the scaled score
averages for the Grade 5 pre-tests ranged from 214 (Community Studies) to
239 (English) on the 100-500 scale. In contrast, the scaled score averages
for the post-tests ranged from 233 (English) to 262 (Mathematics).
Table 6: Grade 5 Mean Scaled Scores by Subject Area

                                             Grade 5 Pre-test          Grade 5 Post-test
Subject Area                      # Items    N     Mean    Std. Dev.   N     Mean    Std. Dev.
English                              30      3798  238.8   83.7        4025  233.4   88.1
Social and Developmental Studies     30      3962  230.5   86.2        4104  241.2   83.9
Mathematics                          30      3883  222.4   89.2        4127  261.9   72.6
Integrated Science                   30      4039  226.5   80.2        4135  245.7   73.7
Creative and Technology Studies      30      4032  224.1   85.3        4097  244.3   83.0
Community Studies                    30      4037  214.0   83.7        4141  236.9   72.3
It was stated earlier that the scaled score is a simple linear transformation of
the raw scores, using the values of 250 and 350 on the scaled score metric.
A student’s relative position on the raw score metric does not change under
this scale transformation.
Note that the primary interest of this evaluation is not whether the raw scores
and/or scaled scores increase or decrease from pre-test to post-test. These
differences will occur mainly through variations in test difficulty. The main
analysis will compare the relative changes in the two groups, i.e., pilot and
comparison, across the two time points, i.e., pre-test to post-test. At a later
point, post-tests will also be conducted when the cohort of students is in
Grade 6 and Grade 7, followed by extended analyses for the two additional
time points.
3.5 Vertical Scaled Scores
In vertical scaling, tests that vary in difficulty level, but that are intended to
measure similar constructs, are placed on the same scale. Placing different
tests on the same scale can be accomplished in a number of ways, such as
linking items across the tests or social moderation. For the CA programme, a
social moderation (Linn, 1993) procedure was employed for vertical scaling. 10
In social moderation, assessments are developed in reference to a common
content framework. Performance of individual students, and schools, is
measured against a single set of common standards. For Zambia, an analysis
of the Grade 4 and 5 curriculums showed that the content was vertically
aligned, i.e., students were expected to progress in their learning along the
same constructs from one grade level to the next. This allowed the test
developers to link the pre-tests and post-tests through common performance
standards. The visual representation of the vertical scaling scheme for the CA
programme is shown below.
Figure 4: Vertical Scaling Scheme
Grade 5 Pre-test: cut scores at 250 and 350
Grade 5 Post-test: cut scores at 350 and 450
Grade 6 Post-test: cut scores at 450 and 550
Grade 7 Post-test: cut scores at 550 and 650
In other words, students who were classified as Advanced on the Grade 5
pre-test (i.e., end of Grade 4 syllabus) would also be considered Satisfactory
on the Grade 5 post-test (i.e., end of Grade 5 syllabus), students classified as
Advanced on the Grade 5 post-test would be considered Satisfactory on the
Grade 6 post-test (end of Grade 6 syllabus), and so on through Grade 7. In
the vertical scaled score matrix, students who earned a grade level scaled
score of 250 on the Grade 5 post-test would also earn a vertical scaled score
of 350 (because 350 is the equivalent grade level scaled score on the Grade
5 pre-test). Therefore, grade level scaled scores and vertical scaled scores
differ by a constant value of 100 points. The mean vertical scaled scores for
each subject are shown in Table 7.

10 Linn, R. L. (1993). Linking results of distinct assessments. Applied Measurement in Education, 6(1),
83-102.
Table 7: Grade 5 Mean Vertical Scaled Scores by Subject Area

                                             Grade 5 Pre-test          Grade 5 Post-test
Subject Area                      # Items    N     Mean    Std. Dev.   N     Mean    Std. Dev.
English                              30      3798  238.8   83.7        4025  333.4   88.1
Social and Developmental Studies     30      3962  230.5   86.2        4104  341.2   83.9
Mathematics                          30      3883  222.4   89.2        4127  361.9   72.6
Integrated Science                   30      4039  226.5   80.2        4135  345.6   73.7
Creative and Technology Studies      30      4032  224.1   85.3        4097  344.4   83.0
Community Studies                    30      4037  214.0   83.7        4141  336.9   72.3
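Under the scheme in Figure 4, a vertical scaled score is the grade-level scaled score shifted by 100 points for each post-test administration after the Grade 5 pre-test. A minimal sketch:

```python
def vertical_scaled_score(grade_level_score, administration):
    """Shift a grade-level scaled score onto the vertical scale.
    `administration` is 0 for the Grade 5 pre-test, 1 for the Grade 5
    post-test, 2 for the Grade 6 post-test, and 3 for the Grade 7
    post-test; each step adds 100 points."""
    return grade_level_score + 100 * administration

# A grade-level score of 250 on the Grade 5 post-test maps to a vertical score of 350:
v = vertical_scaled_score(250, 1)
```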
Figure 5 shows the mean vertical scaled scores on the pre- and post-tests
across the subject areas. Vertical scaled scores for the pre-test are simply
the grade level scaled scores. As expected, vertical scaled scores for the
Grade 5 post-test are higher than the Grade 5 pre-test scaled scores.
Figure 5: Vertical Scaled Mean Scores by Subject Area
[Bar chart: mean vertical scaled scores (0-400) for the pre-test (PRE) and
post-test (POST) in each subject area: Eng., SDS, Math., ISC, CTS and CS.]
3.6 Comparison between Pilot and Comparison Groups
The comparisons between the pilot and comparison groups were made in both raw scores and vertical scaled scores. Raw scores on the pre- and post-tests are not on the same scale, since the two tests differ in difficulty; the raw score comparison was nevertheless made for simplicity. The comparison is more relevant, valid, and informative when made on the vertical scaled scores, since the vertical scaled scores for the pre- and post-tests are on the same scale.
Raw Scores
Table 8 shows that the raw score mean differences between the pilot and
comparison schools on the Grade 5 pre-tests were small for each subject
area. The mean differences, analyzed using t-tests, were statistically
significant only in English and Mathematics, with the pupils in the comparison group performing better than those in the pilot group (p<.05). In the other four
subjects, the t-tests showed no significant differences between the two groups
on the baseline. In raw score terms, the differences in English and Mathematics were about half a point, while those for the other subjects were at most two-tenths of a point. These results reflected the expectation of
very small differences on the pre-tests, since the schools were randomly
assigned to one of the two groups based on a matched pairs design.
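The baseline group comparison rests on independent-samples t-tests. As a rough illustration, a Welch t statistic can be computed directly from summary statistics like those in Table 8 (shown here for the English pre-test). Note that the report's actual comparisons were made at the school level, so this student-level calculation is illustrative only, and the report does not specify which t-test variant was used.

```python
import math

def welch_t(mean1, sd1, n1, mean2, sd2, n2):
    """Welch's t statistic for two independent samples,
    computed from summary statistics."""
    se = math.sqrt(sd1 ** 2 / n1 + sd2 ** 2 / n2)
    return (mean1 - mean2) / se

# English pre-test: pilot vs. comparison (values from Table 8).
t = welch_t(11.9, 6.4, 1785, 12.4, 6.6, 2013)
# For large samples, |t| > 1.96 indicates significance at p < .05.
print(round(t, 2), abs(t) > 1.96)
```

Consistent with the text, the statistic is negative (the comparison group scored higher) and exceeds the critical value in magnitude.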
Table 8: Mean Raw Scores by Subject Area and Group
                                              Grade 5 Pre-test          Grade 5 Post-test
Subject Area                      Group       N†    Mean   Std. Dev.    N†    Mean   Std. Dev.
English                           Pilot       1785  11.9   6.4          1773  13.3*  1.6
                                  Comparison  2013  12.4*  6.6          1967  12.2   1.6
                                  Total       3798  12.2   6.5          3740  12.8   1.6
Social and Developmental Studies  Pilot       1907  10.0   5.2          1895  14.9*  1.3
                                  Comparison  2055  10.2   5.5          2008  13.7   1.3
                                  Total       3962  10.1   5.3          3903  14.3   1.3
Mathematics                       Pilot       1861  12.0   5.3          1849  13.8*  1.4
                                  Comparison  2022  12.6*  5.3          1975  13.2   1.4
                                  Total       3883  12.3   5.3          3824  13.5   1.4
Integrated Science                Pilot       1961  9.8    4.9          1949  13.2*  1.9
                                  Comparison  2078  9.9    4.9          2031  11.2   1.8
                                  Total       4039  9.9    4.9          3980  12.2   1.9
Creative and Technology Studies   Pilot       1967  10.5   5.2          1955  12.9*  1.5
                                  Comparison  2065  10.6   5.4          2018  11.7   1.5
                                  Total       4032  10.5   5.3          3973  12.3   1.5
Community Studies                 Pilot       1979  9.5    4.0          1967  13.4*  1.6
                                  Comparison  2058  9.5    3.9          2011  12.5   1.6
                                  Total       4037  9.5    4.0          3978  13.0   1.6
* Significant at p<0.05; † represents adjusted weighted sample size.
The differences between the two groups for all subject areas on the Grade 5 post-test (also in Table 8) were evaluated using an Analysis of Covariance (ANCOVA), with the pre-test scores as the covariates. In other words, the pre-test scores were statistically adjusted so that the groups could be compared on an equal basis on the post-tests. Using the raw scores, the results were statistically significant in each of the subject areas, with the pilot group outperforming the comparison group (p<.05).
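Conceptually, ANCOVA with a pre-test covariate is equivalent to regressing post-test scores on the pre-test score plus a group indicator; the coefficient on the group indicator is the covariate-adjusted group difference that ANCOVA tests. The sketch below demonstrates this with a tiny made-up data set (not the study's data); the `ols` helper is our own minimal least-squares routine, not the study's software.

```python
def ols(X, y):
    """Ordinary least squares via the normal equations (X'X)b = X'y,
    solved with Gaussian elimination (fine for a few predictors)."""
    k = len(X[0])
    XtX = [[sum(row[i] * row[j] for row in X) for j in range(k)] for i in range(k)]
    Xty = [sum(row[i] * yi for row, yi in zip(X, y)) for i in range(k)]
    A = [XtX[i][:] + [Xty[i]] for i in range(k)]  # augmented matrix
    for c in range(k):
        p = max(range(c, k), key=lambda r: abs(A[r][c]))  # partial pivot
        A[c], A[p] = A[p], A[c]
        for r in range(c + 1, k):
            f = A[r][c] / A[c][c]
            for j in range(c, k + 1):
                A[r][j] -= f * A[c][j]
    beta = [0.0] * k
    for c in reversed(range(k)):
        beta[c] = (A[c][k] - sum(A[c][j] * beta[j] for j in range(c + 1, k))) / A[c][c]
    return beta

# Toy data (made up for illustration): post-test score modelled as
# intercept + pre-test score + group dummy (1 = pilot, 0 = comparison).
pre = [10, 12, 14, 9, 11, 13, 10, 12]
group = [1, 1, 1, 1, 0, 0, 0, 0]
post = [15, 16, 18, 14, 12, 14, 11, 13]
X = [[1.0, p, g] for p, g in zip(pre, group)]
_, b_pre, b_group = ols(X, post)
# b_group is the pilot-vs-comparison difference adjusted for the
# pre-test covariate -- the quantity the ANCOVA evaluates.
print(round(b_group, 2))  # 3.46
```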
Note that all statistical comparisons were made at the school level, not at the
student level. This was due to changes in student population at each school
from pre-test to post-test. The design was based on cohorts (student groups over time) and not on panels (the same students over time). A panel design would have been statistically possible, but it would also have led to skewed results due to student attrition.
Vertical Scaled Scores
As stated earlier, vertical scaled scores on the pre- and post-tests were computed independently for both the pilot and comparison groups and are measured on the same scale (i.e., the vertical scale). This makes the comparison more relevant and valid for assessing the impact of CA in the pilot schools relative to the comparison schools.
Table 9: Mean Vertical Scaled Scores by Subject Area and Group
                                              Grade 5 Pre-test           Grade 5 Post-test
Subject Area                      Group       N†    Mean    Std. Dev.    N†    Mean    Std. Dev.
English                           Pilot       1785  236.1   82.4         1773  352.3*  20.3
                                  Comparison  2013  241.2*  84.8         1967  339.9   20.3
                                  Total       3798  238.8   83.7         3740  346.1   20.3
Social and Developmental Studies  Pilot       1907  229.1   84.3         1895  362.4*  17.7
                                  Comparison  2055  231.8   87.9         2008  346.2   17.7
                                  Total       3962  230.5   86.2         3903  354.3   17.7
Mathematics                       Pilot       1861  217.8   89.3         1849  380.5*  17.1
                                  Comparison  2022  226.7*  88.9         1975  373.1   17.1
                                  Total       3883  222.4   89.2         3824  376.8   17.1
Integrated Science                Pilot       1961  225.5   80.1         1949  369.5*  20.4
                                  Comparison  2078  227.4   80.4         2031  348.0   20.4
                                  Total       4039  226.5   80.2         3980  358.8   20.4
Creative and Technology Studies   Pilot       1967  223.0   84.0         1955  357.1*  16.0
                                  Comparison  2065  225.1   86.5         2018  343.5   16.0
                                  Total       4032  224.1   85.3         3973  350.3   16.0
Community Studies                 Pilot       1979  213.7   84.3         1967  365.8*  22.1
                                  Comparison  2058  214.2   83.1         2011  352.8   22.1
                                  Total       4037  214.0   83.7         3978  359.3   22.1
* Significant at p<0.05
Table 9 shows that the vertical scaled score mean differences between the
pilot and comparison schools on the Grade 5 pre-tests were small for each
subject area. The mean differences in all six subject areas, analyzed using t-
tests, were not statistically significant (p>.05). In contrast, when the differences between the two groups for all subject areas on the Grade 5 post-test (also in Table 9) were evaluated using an ANCOVA (with the pre-test scores as the covariates), the results were statistically significant in all subject areas, with the pilot group outperforming the comparison group (p<.05).
Figures 6 through 11 show the differences in vertical scaled scores from the Grade 5 pre-test to the Grade 5 post-test for each of the subject areas. The graphs clearly show the greater score increases of the pilot groups in all subject areas except Mathematics, where the pilot group's advantage was less evident than in the other subjects, though the pilot group started off lower.
Figure 6: English Mean Vertical Scores by Group
Figure 7: Social & Dev. Studies Mean Vertical Scores by Group
Figure 8: Mathematics Mean Vertical Scores by Group
Figure 9: Integrated Science Mean Vertical Scores by Group
Figure 10: Creative & Tech. Studies Mean Vertical Scores by Group
Figure 11: Community Studies Mean Vertical Scores by Group
[Each figure plots the mean vertical scaled scores (200-400) of the Pilot and Comparison groups at the Grade 5 pre-test and post-test.]
3.7 Comparison across Regions
While not the focus of the evaluation, the next two sections provide useful information on student performance. Tables 10 and 11 contain a brief analysis of the scores disaggregated by region. As with the overall analyses, the comparisons across
the three regions were made in raw scores and vertical scaled scores. Lusaka
Region consistently had the highest mean scores (both raw scores and
vertical scaled scores) in all subjects on the Grade 5 pre-tests, followed by
Western and Southern. The same pattern of results was also observed for
Grade 5 post-tests.
Table 10: Subject Area Mean Raw Scores by Region
                                            Grade 5 Pre-test          Grade 5 Post-test
Subject Area                      Region    N     Mean   Std. Dev.    N     Mean   Std. Dev.
English                           Southern  1010  11.0   6.2          1157  10.4   6.6
                                  Western   994   11.7   5.9          1103  11.9   6.7
                                  Lusaka    1794  13.1   6.9          1765  12.4   7.5
                                  Total     3798  12.2   6.5          4025  11.7   7.1
Social and Developmental Studies  Southern  1014  9.4    4.8          1214  11.7   6.0
                                  Western   1112  9.9    4.9          1125  13.2   6.1
                                  Lusaka    1836  10.7   5.8          1765  14.1   7.0
                                  Total     3962  10.1   5.3          4104  13.2   6.6
Mathematics                       Southern  1002  11.5   5.4          1226  11.1   5.2
                                  Western   1086  12.2   5.2          1120  12.7   5.3
                                  Lusaka    1795  12.9   5.2          1781  13.0   6.3
                                  Total     3883  12.3   5.3          4127  12.4   5.8
Integrated Science                Southern  1025  9.2    4.4          1212  9.6    5.4
                                  Western   1151  9.4    4.6          1154  11.7   6.4
                                  Lusaka    1863  10.6   5.3          1769  11.8   6.7
                                  Total     4039  9.9    4.9          4135  11.1   6.3
Creative and Technology Studies   Southern  1016  9.6    4.8          1205  9.9    5.6
                                  Western   1140  10.2   5.0          1146  11.3   6.0
                                  Lusaka    1876  11.2   5.7          1790  11.9   6.9
                                  Total     4032  10.5   5.3          4141  11.2   6.4
Community Studies                 Southern  1015  9.0    3.5          1191  10.5   5.3
                                  Western   1146  9.4    4.3          1122  11.5   6.0
                                  Lusaka    1876  9.8    4.0          1784  12.7   6.8
                                  Total     4037  9.5    4.0          4097  11.7   6.2
Table 11: Subject Area Mean Vertical Scaled Scores by Region
                                            Grade 5 Pre-test           Grade 5 Post-test
Subject Area                      Region    N     Mean    Std. Dev.    N     Mean    Std. Dev.
English                           Southern  1010  224.1   80.3         1157  317.3   82.8
                                  Western   994   232.3   72.9         1103  335.0   81.0
                                  Lusaka    1794  250.7   89.3         1765  343.0   94.1
                                  Total     3798  238.8   83.7         4025  333.4   88.1
Social and Developmental Studies  Southern  1014  218.5   77.4         1214  321.7   76.7
                                  Western   1112  226.4   79.1         1125  341.1   78.1
                                  Lusaka    1836  239.6   93.6         1765  354.7   89.5
                                  Total     3962  230.5   86.2         4104  341.2   84.0
Mathematics                       Southern  1002  209.2   91.0         1226  346.6   66.1
                                  Western   1086  219.9   86.2         1120  366.6   65.5
                                  Lusaka    1795  231.3   89.0         1781  369.5   79.3
                                  Total     3883  222.4   89.2         4127  361.9   72.6
Integrated Science                Southern  1025  215.7   72.1         1212  328.9   63.5
                                  Western   1151  218.1   76.1         1154  353.0   74.2
                                  Lusaka    1863  237.5   85.5         1769  352.4   78.0
                                  Total     4039  226.5   80.2         4135  345.7   73.7
Creative and Technology Studies   Southern  1016  209.8   77.9         1191  327.6   70.7
                                  Western   1140  218.9   79.7         1122  340.7   79.5
                                  Lusaka    1876  234.9   90.8         1784  357.7   90.3
                                  Total     4032  224.1   85.3         4097  344.3   83.0
Community Studies                 Southern  1015  204.2   74.8         1205  323.4   64.3
                                  Western   1146  213.1   88.6         1146  338.7   66.8
                                  Lusaka    1876  219.8   84.6         1790  344.9   79.1
                                  Total     4037  214.0   83.7         4141  336.9   72.3
3.8 Performance Categories
Depending on test difficulty and score distributions, performance categories
were established for each of the tests using a procedure called standard
setting. An Angoff (1971)11 standard-setting method was implemented to set the cut scores between Unsatisfactory and Satisfactory and between Satisfactory and Advanced for both the pre-tests and post-tests.
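Under the Angoff method, each judge estimates, for every item, the probability that a minimally competent student would answer it correctly; the cut score is the sum of the mean estimates across judges. The sketch below illustrates the arithmetic with hypothetical judge ratings (the ratings, test size, and function name are ours; the report does not publish its panel data).

```python
def angoff_cut_score(ratings):
    """Angoff cut score: sum over items of the mean, across judges, of
    the estimated probability that a minimally competent student
    answers the item correctly.
    ratings[j][i] = judge j's probability estimate for item i."""
    n_judges = len(ratings)
    n_items = len(ratings[0])
    item_means = [sum(judge[i] for judge in ratings) / n_judges
                  for i in range(n_items)]
    return round(sum(item_means))

# Hypothetical ratings from 3 judges on a 5-item test.
ratings = [
    [0.6, 0.5, 0.7, 0.4, 0.8],
    [0.5, 0.6, 0.6, 0.5, 0.7],
    [0.7, 0.4, 0.8, 0.3, 0.9],
]
print(angoff_cut_score(ratings))  # 3
```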
The resultant cut scores are presented in Tables 12 and 13. In English, for example, students who scored 1-12 on the pre-test would be classified as Unsatisfactory, students who scored 13-21 would be classified as Satisfactory, and students who scored 22-30 would be classified as Advanced. For Mathematics, the corresponding pre-test ranges are 1-13 Unsatisfactory, 14-19 Satisfactory, and 20-30 Advanced. The post-test ranges for each subject area differ from those on the pre-tests because the pre-tests and post-tests covered different content and had different levels of difficulty.
11
Angoff, W. H. (1971). Scales, Norms, and Equivalent Scores. In R.L. Thorndike (Ed.) Educational
Measurement (2nd ed.). (pp. 508-560). Washington, DC: American Council on Education.
Table 12: Performance Categories for Pre-tests by Subject
                                              Grade 5 Pre-test
                                  1 Unsatisfactory  2 Satisfactory  3 Advanced
Subject Area                      (Fail)            (Pass)          (Pass)
English                           1-12              13-21           22-30
Social and Developmental Studies  1-10              11-17           18-30
Mathematics                       1-13              14-19           20-30
Integrated Science                1-10              11-17           18-30
Creative and Technology Studies   1-11              12-18           19-30
Community Studies                 1-10              11-15           16-30
Table 13: Performance Categories for Post-tests by Subject
                                              Grade 5 Post-test
                                  1 Unsatisfactory  2 Satisfactory  3 Advanced
Subject Area                      (Fail)            (Pass)          (Pass)
English                           1-12              13-21           22-30
Social and Developmental Studies  1-13              14-21           22-30
Mathematics                       1-10              11-19           20-30
Integrated Science                1-10              11-20           21-30
Creative and Technology Studies   1-11              12-21           22-30
Community Studies                 1-11              12-19           20-30
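Mapping a raw score onto these performance categories is a simple threshold lookup. The sketch below uses the English pre-test cut scores from Table 12 (Satisfactory starts at 13, Advanced at 22); the function name is ours.

```python
def classify(score, sat_cut, adv_cut):
    """Map a raw score to a performance category, given the lowest
    score of the Satisfactory band and of the Advanced band."""
    if score >= adv_cut:
        return "Advanced"
    if score >= sat_cut:
        return "Satisfactory"
    return "Unsatisfactory"

# English pre-test cut scores from Table 12.
print(classify(12, 13, 22))  # Unsatisfactory
print(classify(13, 13, 22))  # Satisfactory
print(classify(22, 13, 22))  # Advanced
```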
Tables 14 and 15 provide the percentages of students classified into the three performance categories by subject. On the pre-test, the percentages in each category by group were similar for most of the subjects. For instance, in Integrated Science, similar percentages of students were in a passing category (Satisfactory or Advanced) for the pilot (34%) and comparison (33%) groups. On the post-test, however, there were some differences between the groups, mostly in favour of the pilot group. In Integrated Science, 53% of students in the pilot group were Satisfactory or above vs. 43% of students in the comparison group. The percentages favoured the pilot group on the post-test, with the exception of Mathematics, where the rounded percentage passing was the same in the pilot (65%) and comparison (65%) groups.
Table 14: Percentages of Students in Performance Categories for Pre-tests
                                                        Grade 5 Pre-test
                                              1 Unsatisfactory  2 Satisfactory  3 Advanced
Subject Area                      Group       (Fail)            (Pass)          (Pass)
English                           Pilot       63.0              27.2            9.8
                                  Comparison  59.7              28.2            12.1
Social and Developmental Studies  Pilot       62.8              26.9            10.3
                                  Comparison  64.4              24.0            11.6
Mathematics                       Pilot       64.3              26.2            9.5
                                  Comparison  60.1              29.4            10.5
Integrated Science                Pilot       65.9              25.6            8.5
                                  Comparison  67.3              22.9            9.8
Creative and Technology Studies   Pilot       67.5              22.9            9.6
                                  Comparison  68.4              20.1            11.5
Community Studies                 Pilot       66.8              25.4            7.8
                                  Comparison  66.8              24.8            8.4
Table 15: Percentages of Students in Performance Categories for Post-tests
                                                        Grade 5 Post-test
                                              1 Unsatisfactory  2 Satisfactory  3 Advanced
Subject Area                      Group       (Fail)            (Pass)          (Pass)
English                           Pilot       60.0              26.5            13.5
                                  Comparison  64.0              24.0            11.9
Social and Developmental Studies  Pilot       51.4              33.4            15.3
                                  Comparison  59.3              30.6            10.2
Mathematics                       Pilot       35.2              53.9            10.9
                                  Comparison  34.8              56.3            8.9
Integrated Science                Pilot       46.7              40.2            13.1
                                  Comparison  57.3              36.0            6.7
Creative and Technology Studies   Pilot       54.5              35.1            10.4
                                  Comparison  62.3              31.0            6.7
Community Studies                 Pilot       50.4              33.9            15.6
                                  Comparison  54.4              36.2            9.5
Chapter Four: Summary and Conclusions
The main objective of the evaluation was to determine whether the CA
programme is having positive effects on student learning outcomes in the first
year of implementation. This was accomplished by measuring and comparing
the levels of learning achievement of pupils in pilot (intervention) and
comparison (control) schools. A baseline (pre-test) assessment occurred
before implementation of the proposed interventions at the beginning of
Grade 5 in randomly selected pilot schools. This created a basis upon which
the impact of CA was measured at the end of the Grade 5 pilot year.
A sample of 48 schools was selected from Lusaka, Southern and Western
Provinces using a matched pairs design and random assignment, resulting in
24 pilot schools and 24 comparison schools. Student achievement for the
Grade 5 baseline and post-test administrations was measured using multiple-choice tests in six subject areas with 30 items each (30 points per test). The Grade 5 baseline tests were based on the Grade 4 curriculum, while the Grade 5 post-tests were based on the Grade 5 curriculum. Overall, the psychometric characteristics of the tests were very satisfactory on both the pre-tests and post-tests. Items were within acceptable difficulty (p-value) ranges and discrimination (point-biserial correlation) levels, and the tests were found to be reliable, using Cronbach's Alpha as an estimate of internal consistency reliability.
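Cronbach's Alpha, the internal consistency estimate referred to above, can be computed from an examinee-by-item score matrix: it compares the sum of the item variances with the variance of the total scores. A minimal sketch with a tiny made-up matrix (not the study's data; right/wrong items coded 1/0):

```python
def cronbach_alpha(item_scores):
    """Cronbach's Alpha from a matrix of item scores
    (rows = examinees, columns = items)."""
    k = len(item_scores[0])  # number of items

    def var(xs):  # population variance
        m = sum(xs) / len(xs)
        return sum((x - m) ** 2 for x in xs) / len(xs)

    item_vars = [var([row[i] for row in item_scores]) for i in range(k)]
    total_var = var([sum(row) for row in item_scores])
    return (k / (k - 1)) * (1 - sum(item_vars) / total_var)

# Tiny illustrative matrix: 4 examinees, 3 right/wrong items.
scores = [
    [1, 1, 1],
    [1, 1, 0],
    [0, 1, 0],
    [0, 0, 0],
]
print(round(cronbach_alpha(scores), 2))  # 0.75
```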
Performance of the schools on the baseline and post-tests was compared using mean raw scores and mean vertical scaled scores. The vertical scaled score comparison was found to be more relevant, valid, and informative, since the school mean scores on both the baseline and post-tests were evaluated on the same measurement scale (i.e., the vertical scale). In addition, statisticians generally prefer scaled scores for longitudinal comparisons, since the scale is equal-interval, making comparisons more accurate.
Overall, the pupils’ scores on the baseline pre-test were very similar in the
pilot and comparison schools. The comparison schools scored slightly higher
on the English and Mathematics tests, but the score differences for the two
groups on the other four tests were minimal. On the post-test, which was
administered after one year of the CA programme, the scores of the pilot
schools on all six tests were significantly higher than those in the comparison
schools. This provides strong initial evidence that the CA programme had a
significantly positive effect on pupil learning outcomes.
When the performance of the schools on the baseline and post-tests was compared by region, Lusaka Region consistently had the highest mean scores in all subjects on the Grade 5 pre-tests and post-tests, followed by
Western and Southern. The number of schools by region was too small to
make statistically valid region-by-region comparisons of pre-test to post-test
scores for the pilot and comparison groups.
Students were also classified into three performance level categories
(Unsatisfactory, Satisfactory, and Advanced) in each subject area based on
their performance in baseline and post-tests. On the pre-tests, the percentages in each category by group were similar for most of the subjects.
However, on the post-test, there were differences in favour of the pilot group
in virtually all subjects. For instance, in Integrated Science, 53% of students in
the pilot group were Satisfactory and above vs. 43% of students in the
comparison group. This provided strong evidence that a greater percentage of
students in the pilot group were achieving a passing score on the post-test
than those in the comparison group.
The next round of post-tests in the Phase 1 schools will be administered when
the same cohort of pupils completes Grade 6. This will be followed by a final
test administration (a third post-test) when the cohort of pupils completes
Grade 7. At that point, with four time points (a baseline and three post-tests),
more substantial conclusions will be drawn on the effectiveness of the CA
programme.
Note also that the evaluation process is being repeated in the Phase 2 and
Phase 3 schools, which will provide a complete national quantitative
evaluation of the programme at the end of Year 5 of implementation (2010).
Based on guidance from the CA Steering Committee, results from the
evaluation will be used at a selected point in the implementation period as a
criterion for scaling up the CA programme to other primary schools in Zambia.