
Deciding on the Sampling Strategy 决定抽样策略


Shanghai International Program for Development Evaluation Training Asia-Pacific Finance and Development Center; 200 Panlong Road-Shanghai, October 14, 2008

Published in: Education, Technology


  • 1. Deciding on the Sampling Strategy 决定抽样策略 Shanghai International Program for Development Evaluation Training Asia-Pacific Finance and Development Center 200 Panlong Road-Shanghai, October 14, 2008 Ray C. Rist
  • 2.
    • Ray C. Rist
    • Knowledge & Evaluation Capacity Development Advisor, Independent Evaluation Group of the World Bank
    • President, International Development Evaluation Association (IDEAS)
    Dadang Solihin Indonesia Delegation Shanghai International Program for Development Evaluation Training
  • 3. Introduction 引言
    • Introduction to Sampling
    • 抽样简介
    • Types of Samples: Random and Non-Random
    • 样本的类型:随机和非随机
    • How Confident and Precise Do You Need to Be?
    • 你需要多大的置信度和精确度?
    • How Large a Sample Do You Need?
    • 你需要多大的样本?
    • Where to Find a Sampling Statistician?
    • 到哪去找抽样调查统计员 ?
  • 4. Introduction to Sampling 抽样简介
  • 5. Sampling 抽样
    • Is it possible to collect data from the entire population? (census)
    • 收集总体的数据(普查)可能吗?
      • If so, we can talk about what is true for the entire population
      • 如果可以,我们能够说出总体的真实情况
      • Often we cannot (time/cost)
      • 经常的情况是我们不能 ( 时间 / 成本 )
      • If not, we can use a smaller subset: a SAMPLE
      • 如果不能,我们可以使用一个较小的子集:样本
  • 6. Sampling Concepts 抽样的概念
  • 7. Concepts 概念
    • Population
    • 总体
      • the total set of units
      • 所有单位的全集
    • Sample
    • 样本
      • a subset of the population
      • 总体的一个子集
    • Sampling Frame
    • 抽样框架
      • list from which to select your sample
      • 一个列表,从中可以选取你要的样本
  • 8. More Sampling Concepts 更多的抽样概念
    • Sample Design
    • 样本设计
      • methods of sampling (probability or non-probability)
      • 抽样的方法(概率抽样或非概率抽样 )
    • Parameter
    • 参数
      • characteristic of the population
      • 总体的特征
    • Statistic
    • 统计量
      • characteristic of a sample
      • 样本的特征
  • 9. Random Sample 随机样本
    • A random sample allows us to make estimates about the larger population based on what we learn from the subset
    • 一个 随机样本 允许我们基于从该子集(样本)所了解的情况,做出有关一个更大总体的估计
    • Works like a lottery: every unit has an equal chance of being selected
    • 类似抽奖:每个单位都有相同的被抽中机会
    • Advantages:
    • 优点:
      • eliminates selection bias
      • 消除选择偏差
      • able to generalize to the population
      • 能够推断总体
      • cost-effective
      • 成本划算
  • 10. Types of Samples: Random and Non-Random 样本的类型:随机样本和非随机样本
  • 11. Types of Random Samples 随机样本的类型
    • Simple random sample
    • 简单随机样本
    • Random interval sample
    • 随机间隔样本
    • Stratified random sample
    • 分层随机样本
    • Random cluster sample
    • 随机整群样本
    • Multi-stage random sample
    • 多阶段随机样本
    • Combination random sample
    • 合并随机样本
  • 12. Simple Random Sample 简单随机样本
    • Simplest
    • 最简单的一类样本
    • Establish a sample size and proceed to randomly select units until we reach the sample size
    • 先确定样本大小,然后进行随机地抽取直到获得预定数量的样本
    • Uses a random number table to select units
    • 采用一个随机数字表来选取样本单位
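
The procedure on this slide can be sketched in a few lines of Python. This is a minimal illustration, not the presenter's own method: the standard library's pseudo-random generator stands in for the random number table, and the frame of 500 numbered units and the sample size of 20 are hypothetical values.

```python
import random

def simple_random_sample(frame, sample_size, seed=None):
    """Draw sample_size distinct units from the sampling frame,
    giving every unit the same chance of selection."""
    rng = random.Random(seed)  # the seed only makes the draw repeatable
    return rng.sample(frame, sample_size)

# Hypothetical sampling frame: units numbered 1..500
frame = list(range(1, 501))
sample = simple_random_sample(frame, 20, seed=2008)
print(sorted(sample))
```

Sampling is done without replacement, so no unit can be selected twice.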
  • 13. Random Interval Sample 随机间隔样本
    • Used when there is a sequential population that is not already enumerated and would be difficult or time consuming to enumerate
    • 在一个序列总体还未被列举或者列举过于费事且耗时的情况下使用
    • Uses a random number table to select intervals
    • 用一个随机数字表来选取间隔
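
One possible reading of random interval sampling, sketched in Python. The slide does not say how the intervals are drawn, so this sketch assumes each gap is drawn uniformly between 1 and `2 * mean_interval - 1`, which makes the average gap equal to `mean_interval`. The function name, the file labels, and the interval of 50 are hypothetical.

```python
import random

def random_interval_sample(units, mean_interval, seed=None):
    """Walk through a sequential population once, skipping a randomly
    drawn number of units between selections. The population does not
    need to be enumerated in advance."""
    rng = random.Random(seed)
    selected = []
    # Assumption: gaps are uniform on 1..(2 * mean_interval - 1)
    next_pick = rng.randint(1, 2 * mean_interval - 1)
    for position, unit in enumerate(units, start=1):
        if position == next_pick:
            selected.append(unit)
            next_pick += rng.randint(1, 2 * mean_interval - 1)
    return selected

# Hypothetical example: files on a shelf, handled one after another
files = (f"file-{i:04d}" for i in range(1, 1001))
picked = random_interval_sample(files, mean_interval=50, seed=1)
print(len(picked), picked[:3])
```

Because the units are consumed from a generator, the sketch never needs the full list or its length, which is the situation the slide describes.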
  • 14. Stratified Random Sample 分层随机样本
    • Use when specific groups must be included that might otherwise be missed by using a simple random sample
    • 总体中有若干个特定的子类,样本必须把这些子类都包含进来,但如果使用简单随机样本的话可能会遗漏某些子类。这时要使用分层随机样本。
      • usually a small proportion of the population
      • 通常是总体的一小部分
  • 15. Stratified Random Sample 分层随机样本
    • Total Population 总体
      • Sub-population 子总体 → simple random sample 简单随机样本
      • Sub-population 子总体 → simple random sample 简单随机样本
      • Sub-population 子总体 → simple random sample 简单随机样本
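
The stratified design (split the population into sub-populations, then take a simple random sample from each) might look like this in Python. This is a sketch under stated assumptions: the urban/rural strata, their sizes, and the equal allocation of 25 units per stratum are all hypothetical.

```python
import random
from collections import defaultdict

def stratified_random_sample(units, stratum_of, per_stratum, seed=None):
    """Group units into strata, then draw a simple random sample
    of up to per_stratum units inside each stratum."""
    rng = random.Random(seed)
    strata = defaultdict(list)
    for unit in units:
        strata[stratum_of(unit)].append(unit)
    return {
        name: rng.sample(members, min(per_stratum, len(members)))
        for name, members in strata.items()
    }

# Hypothetical population: 950 urban and 50 rural households
population = [("urban", i) for i in range(950)] + [("rural", i) for i in range(50)]
sample = stratified_random_sample(
    population, stratum_of=lambda unit: unit[0], per_stratum=25, seed=3
)
print({name: len(chosen) for name, chosen in sample.items()})
```

With equal allocation the small group is deliberately over-represented, so estimates for the whole population would need to weight each stratum by its share of the population.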
  • 16. Random Cluster Sample 随机整群抽样
    • Another form of random sampling
    • 另一种随机抽样
    • A cluster is any naturally occurring aggregate of the units to be sampled. Use cluster sampling when:
    • 即将被抽样的任何自然发生的单位的聚合,在下列情况下使用:
      • you do not have a complete list of everyone in the population of interest but have a list of the clusters in which they occur or
      • 你没有一个完整的名单,但是有一个整群名单,参与者包括其中
      • you have a complete list of everyone, but they are so widely distributed that it would be too time consuming and expensive to send data collectors out to a simple random sample
      • 你有一个完整的名单,但是他们过于分散,致使让数据收集者出去进行一个简单的随机抽样过于费时而且成本高
  • 17. Random Cluster Sample (Cont.)
    • For example, you may not have a single, complete list of every AIDS patient in your country but you might have a list of every AIDS clinic.
    • You can randomly select AIDS clinics and then randomly select patients within each of the selected clinics.
    • The drawback of cluster samples is that they may not yield an accurate representation of the population.
  • 18. Random Cluster Sample (Cont.)
    • For example, you may want to interview 200 AIDS patients, but these 200 may be selected from only four randomly sampled clinics because of resource constraints. It is possible that the clinics may serve populations that are too similar in terms of economic background or other characteristics, and therefore may not be representative of all AIDS patients.
    • In this situation, you might be better off asking knowledgeable people to identify the range of different kinds of clinics. That way you can check whether your cluster sample is a fairly representative mix of types of clinics. If it is not, you will at least know the direction of the bias.
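
The clinic example can be sketched as a two-stage draw: first randomly select clinics, then randomly select patients within each selected clinic. The frame of 40 clinics with 120 patients each is hypothetical; the 4 clinics and 200 patients follow the example above.

```python
import random

def two_stage_cluster_sample(clusters, n_clusters, n_per_cluster, seed=None):
    """Stage 1: randomly select clusters.
    Stage 2: randomly select units inside each selected cluster."""
    rng = random.Random(seed)
    chosen = rng.sample(sorted(clusters), n_clusters)
    return {
        name: rng.sample(clusters[name], min(n_per_cluster, len(clusters[name])))
        for name in chosen
    }

# Hypothetical frame: 40 clinics, each with its own list of 120 patient IDs
clinics = {
    f"clinic-{c:02d}": [f"c{c:02d}-p{p:03d}" for p in range(120)]
    for c in range(40)
}
sample = two_stage_cluster_sample(clinics, n_clusters=4, n_per_cluster=50, seed=11)
print({name: len(patients) for name, patients in sample.items()})
```

Only a list of clinics is needed up front; patient lists are required only for the four clinics that are drawn.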
  • 19. Multi-stage Random Sample 多阶段随机抽样
    • Combines two or more forms of random sampling
    • 将两种及两种以上的随机抽样形式结合起来
    • Most commonly, it begins with random cluster sampling and then applies simple random sampling or stratified random sampling
    • 最常见的是,从随机整群抽样开始,然后应用简单随机抽样或者分层随机抽样
  • 20. Combination Random Samples 合并随机样本
    • More than one random sampling technique is used
    • 采用不只一种随机抽样技巧
  • 21. Drawback of Random Cluster and Multi-stage Random Sampling 随机整群抽样以及多阶段随机抽样的缺陷
    • May not yield an accurate representation of the population
    • 可能无法准确代表抽样总体
  • 22. Summary of Random Sampling Process 随机抽样过程概述
    • Step 步骤: Process 过程
    • Step 1: Obtain a complete listing of the entire population 取得总体的完整列表
    • Step 2: Assign each case a number 对总体内的所有个体进行编号
    • Step 3: Randomly select the sample using a random numbers table 使用随机数表随机地抽取样本
    • Step 4: When no numbered listing exists or is not practical to create: 当不存在一个经过编号的列表或在操作上很难形成这样一个列表,则:
      • take a random start 随机地开始
      • select every nth case 每隔 n 个个体选取一个作为样本
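
Step 4 (a random start, then every nth case) can be sketched as follows. The 100 cases and the interval of 10 are hypothetical, and the random start is assumed to fall within the first n cases.

```python
import random

def every_nth_with_random_start(cases, n, seed=None):
    """Pick a random start among the first n cases, then take every n-th case."""
    rng = random.Random(seed)
    start = rng.randrange(n)  # index 0..n-1
    return cases[start::n]

# Hypothetical: 100 cases in their natural order
cases = list(range(1, 101))
sample = every_nth_with_random_start(cases, n=10, seed=7)
print(sample)
```

If the ordering of the cases has a cycle that matches the interval, this approach can introduce bias, so the order of the list is worth checking first.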
  • 23. Non-Random Samples 非随机样本
    • Can be more focused
    • 能够更具有针对性
    • Can make sure a small sample is representative
    • 能够保证一个小样本具有代表性
    • Cannot make inferences to a larger population
    • 无法推断一个更大总体的情况
  • 24. Types of Non-random Samples 非随机样本的种类
    • Convenience 方便: whoever is easiest to contact or whatever is easiest to observe 最容易联系到的任何人,或最容易观察到的任何事
    • Purposeful (judgment) 目的明确: set criteria to achieve a specific mix of participants 设定标准,来达到一个特定组合的参与者群体
    • Snowball 滚雪球效应: ask people who else you should interview 询问人们你还应该采访谁
  • 25. Forms of Purposeful Samples 有目的的样本的种类
    • Typical cases (median)
    • 典型案例 (中间类型)
    • Maximum variation (heterogeneity)
    • 最大变化(异质性)
    • Quota
    • 配额
    • Extreme case
    • 极端例子
    • Confirming and disconfirming cases
    • 确认以及否认案例
  • 26. Bias and Non-random Sampling 偏差与非随机抽样
    • Were people selected in a biased way?
    • 选人的方法是否有偏差?
    • Are they substantially different from the rest of the population?
    • 抽取的样本是否与总体的其它部分有重大的不同?
    • Collect some data to show that the people selected are fairly similar to the larger population (e.g. demographics)
    • 收集一些数据来表明所选择的人与总体相当地相似(例如人口统计)
  • 27. Combinations: Random and Non-Random 合并:随机样本和非随机样本
    • Example:
    • 示例:
    • Non-randomly select two schools from poorest communities and two from the wealthiest communities
    • 从最贫困的社区内选取 2 所学校,并且从最富裕的社区内选取 2 所学校
    • Select a random sample of students from these four schools
    • 从这 4 所学校中随机选取学生样本
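
The school example combines a non-random step with a random one, which can be sketched as below. All school names, income figures, and student IDs are made up for illustration, and ranking communities by a single income figure is an assumption about how "poorest" and "wealthiest" would be decided.

```python
import random

def poorest_and_wealthiest(schools, k=2):
    """Non-random step: purposely pick the k schools from the poorest
    and the k schools from the wealthiest communities."""
    ranked = sorted(schools, key=lambda school: school["community_income"])
    return ranked[:k] + ranked[-k:]

def sample_students(selected_schools, per_school, seed=None):
    """Random step: simple random sample of students within each school."""
    rng = random.Random(seed)
    return {
        school["name"]: rng.sample(school["students"], per_school)
        for school in selected_schools
    }

# Hypothetical schools with made-up community income figures
incomes = [1200, 5400, 800, 9100, 3000, 7600, 950, 8800]
schools = [
    {
        "name": f"School {i}",
        "community_income": income,
        "students": [f"S{i}-{j:03d}" for j in range(300)],
    }
    for i, income in enumerate(incomes, start=1)
]
selected = poorest_and_wealthiest(schools)
students = sample_students(selected, per_school=30, seed=27)
print([school["name"] for school in selected])
```

Results from such a design describe students in these four schools; they do not generalize to all schools, because the schools themselves were not randomly selected.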
  • 28. Possibility of Error 误差的概率
    • Is the sample different from the population?
    • 样本与总体不同?
    • Statistics: data derived from random samples
    • 样本统计量:从随机样本得出的数据
  • 29. How confident do you wish to be? 你希望要多大的置信度
    • confidence level
    • 置信水平
      • E.g., 90% (you are 90% certain that your sample results are an accurate estimate for the population as a whole)
      • 例如 90% (能够 90% 地确定你的样本统计量是总体的准确估计值)
    • The higher the confidence level, the larger the sample needed
    • 置信水平越高,所需要的样本就越大
  • 30. Confidence Standard 标准的置信水平
    • Standard is 95%
    • 标准的置信水平是 95%
    • 19 of 20 samples would have found similar results
    • 20 个样本中有 19 个样本具有相似的样本统计量
    • We are 95% certain that the population parameter lies between the lower and upper limits of the confidence interval calculated from the sample
    • 我们可以 95% 地确定,总体参数位于根据样本计算得出的置信区间的上下限之间
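
The "19 of 20 samples" idea can be checked with a small simulation. This is an illustrative sketch, not part of the original slides: it assumes a true population share of 50%, samples of 384 units, and the usual normal-approximation interval with z = 1.96.

```python
import math
import random

def coverage(true_share, sample_size, trials, z=1.96, seed=None):
    """Fraction of repeated random samples whose confidence interval
    contains the true population share."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        # Draw one random sample and count the "yes" answers
        yes = sum(rng.random() < true_share for _ in range(sample_size))
        p_hat = yes / sample_size
        margin = z * math.sqrt(p_hat * (1 - p_hat) / sample_size)
        if p_hat - margin <= true_share <= p_hat + margin:
            hits += 1
    return hits / trials

share_covered = coverage(true_share=0.5, sample_size=384, trials=2000, seed=30)
print(f"{share_covered:.1%} of intervals contain the true value")
```

The printed share should come out close to 95%, that is, roughly 19 out of every 20 simulated samples.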
  • 31. Confidence Interval 置信区间
    • Sometimes called sampling error, margin of error, or precision
    • 有时也被称为抽样误差,误差幅度或者精度
    • Example:
    • 例如:
      • in polls 48% for, 52% against, with (+/- 3%)
      • 民意测验表明 48% 赞成, 52% 反对。(误差率正负 3% )
      • actually means 45% to 51% for and 49% to 55% against
      • 实际上就是, 45%-51% 的人赞成, 49-55% 的人反对
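
The poll arithmetic can be written out as a short Python sketch. The `margin_of_error` function uses the common normal-approximation formula for a share from a simple random sample, which is an assumption on my part rather than something stated on the slide, and the sample size of 1,067 is a hypothetical poll size.

```python
import math

def interval(estimate_pct, margin_pct):
    """Lower and upper bound implied by an estimate and its margin of error."""
    return estimate_pct - margin_pct, estimate_pct + margin_pct

def margin_of_error(share, sample_size, z=1.96):
    """Normal-approximation margin of error (z = 1.96 assumes 95% confidence)."""
    return z * math.sqrt(share * (1 - share) / sample_size)

for_range = interval(48, 3)
against_range = interval(52, 3)
print("for:", for_range, "against:", against_range)

# The ranges overlap, so the poll alone cannot say which side leads
ranges_overlap = for_range[1] >= against_range[0]
print("ranges overlap:", ranges_overlap)

# A hypothetical poll of 1,067 people gives a margin of about 3 points
print(f"margin for n=1067: {margin_of_error(0.5, 1067):.3f}")
```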
  • 32. Sample Size 样本容量
    • By increasing sample size, you increase accuracy and decrease margin of error
    • 通过增大样本容量,你就提高了精确度,同时降低了边际误差
    • The larger the margin of error, the less precise your results will be
    • 边际误差越大,样本统计量的精确度就越小
    • The smaller the population, the smaller the needed sample size for a given confidence level and margin of error, but the larger the needed ratio of the sample size to the population size.
    • 总体越小,在给定置信水平和边际误差的前提下,需要的样本容量就越小,但是样本与总体的比率就越大
    • Aim for a 95% confidence level and a margin of error of +/- 5%
    • 力求达到 95% 的置信水平和 +/- 5% 的边际误差
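  • 33. Sample Sizes for Large Populations 较大总体的样本容量
    • Rows: Confidence Level 置信水平; values listed by Precision 精确度 (margin of error)
    • 90%: 271 (5%), 752 (3%), 1,691 (2%), 6,765 (1%)
    • 95%: 384 (5%), 1,067 (3%), 2,401 (2%), 9,604 (1%)
    • 99%: 666 (5%), 1,848 (3%), 4,144 (2%), 16,576 (1%)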
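
The 90% and 95% figures for large populations can be reproduced with the standard formula n = z² · p(1 − p) / e², sketched below. The slide does not state how its figures were derived, so this is an assumption: it takes the most conservative share p = 0.5, the usual z-values 1.645 and 1.96, and rounding to the nearest whole number.

```python
# Assumed z-values for each confidence level (not stated on the slide)
Z_VALUES = {"90%": 1.645, "95%": 1.96}
MARGINS = (0.05, 0.03, 0.02, 0.01)

def sample_size(z, margin, share=0.5):
    """Sample size for a large population: z^2 * p * (1 - p) / e^2,
    rounded to the nearest whole unit."""
    return round(z ** 2 * share * (1 - share) / margin ** 2)

table = {
    level: [sample_size(z, margin) for margin in MARGINS]
    for level, z in Z_VALUES.items()
}
for level, sizes in table.items():
    print(level, sizes)
```

Rounding up instead of to the nearest unit would be slightly more conservative. The 99% row is left out because it depends on how z is rounded: z = 2.576 gives 664 at a 5% margin, against 666 in the figures above.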
  • 34. Summary of Sampling Size 样本容量小结
    • Accuracy and precision can be improved by increasing the sample size
    • 精确性可以通过增加样本大小来提高
    • The standard to aim for is a 95% confidence level and a margin of error of +/- 5%
    • 目标是达到 95% 的置信程度,和正负 5% 的误差幅度
    • The larger the margin of error, the less precise the results will be
    • 误差幅度越大,结果越不精确
    • The smaller the population, the larger the needed ratio of the sample size to the population size
    • 样本总量越小,样本容量占总量所需的比例越大
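
The last point (smaller populations need a larger sampling ratio) can be illustrated with the standard finite population correction, n = n₀ / (1 + (n₀ − 1) / N). The slides do not give this formula, so treat it as an assumption; n₀ = 384 is the large-population sample size for 95% confidence and a 5% margin, and the population sizes are hypothetical.

```python
import math

def adjusted_sample_size(population_size, n_large=384):
    """Apply the finite population correction to the large-population
    sample size n_large, rounding up to a whole unit."""
    return math.ceil(n_large / (1 + (n_large - 1) / population_size))

results = {}
for population_size in (100, 1_000, 10_000, 1_000_000):
    n = adjusted_sample_size(population_size)
    results[population_size] = n
    print(population_size, n, f"{n / population_size:.1%}")
```

The required sample shrinks as the population shrinks, while the share of the population that must be sampled grows.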
  • 35. Where to Find a Sampling Statistician 到哪去找抽样统计师?
    • American Statistical Association (ASA) directory of statistical consultants
    • 美国统计协会( ASA )统计咨询师名录
    • Alliance of Statistics Consultants (统计咨询师联盟)
    • HyperStat Online
  • 36. Thank you! 谢谢!
  • 37. Dadang Solihin’s Profile
    • Dadang Solihin is currently Director for Regional Development Performance Evaluation at the Indonesian National Development Planning Agency (Bappenas). He holds an MA degree in Economics from the University of Colorado, USA. His previous post was Director for System and Reporting of Development Performance Evaluation at Bappenas.
    • Besides working as Assistant Professor at the Graduate School of Asia-Pacific Studies, Waseda University, Tokyo, Japan, he is also active as Associate Professor at the University of Darma Persada, Jakarta, Indonesia.
    • He has received training around the globe, including the Shanghai International Program for Development Evaluation Training (2008); the Public Officials Capacity Building Training Program for Government Innovation, Seoul, Korea (2007); the Advanced International Training Programme of Information Technology Management, Karlstad, Sweden (2005); the Training Seminar on Land Use and Management, Taiwan (2004); Developing Multimedia Applications for Managers, Kuala Lumpur, Malaysia (2003); Applied Policy Development Training, Vancouver, Canada (2002); the Local Government Administration Training Course, Hiroshima, Japan (2001); and the Regional Development and Planning Training Course, Sapporo, Japan (1999). He has published more than five books on local autonomy.
    • You can reach Dadang Solihin by email at [email_address] or on his mobile at +62812 932 2202