The document discusses key concepts related to sampling methods in marketing research. It defines sampling elements, population, sampling frame, and sampling unit. It presents formulas for calculating sample size when estimating means of continuous variables and proportions. The formula for means involves variables like confidence level (Z), standard deviation (s), and tolerable error (e). The formula for proportions uses variables like confidence level (Z), estimated proportion (p), and tolerable error (e). The document provides an example of each formula and discusses limitations of the formulas related to number of centers, multiple questions, and cell size in analysis.
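The two sample-size calculations described above can be sketched as follows. The deck's own formulas are not reproduced in this excerpt, so the sketch assumes the standard textbook forms n = (Z·s/e)² for a mean and n = Z²·p·(1 − p)/e² for a proportion; the function names and the example inputs are illustrative only.

```python
import math


def sample_size_for_mean(z: float, s: float, e: float) -> int:
    """Sample size for estimating a mean: n = (Z * s / e)^2, rounded up.

    z: Z value for the chosen confidence level (e.g. 1.96 for 95%)
    s: estimated standard deviation of the variable
    e: tolerable error, in the same units as s
    """
    return math.ceil((z * s / e) ** 2)


def sample_size_for_proportion(z: float, p: float, e: float) -> int:
    """Sample size for estimating a proportion: n = Z^2 * p * (1 - p) / e^2."""
    return math.ceil(z ** 2 * p * (1 - p) / e ** 2)


# Hypothetical inputs: 95% confidence, s = 10, tolerable error of 2 units.
print(sample_size_for_mean(z=1.96, s=10, e=2))
# Hypothetical inputs: 95% confidence, p = 0.5, tolerable error of 5 points.
print(sample_size_for_proportion(z=1.96, p=0.5, e=0.05))
```

Using p = 0.5 gives the largest required sample for a given error, which is why it is a common default when no prior estimate of the proportion is available.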
Unlike the mean, median and mode, which generally describe the center of a distribution, percentiles, deciles and quartiles characterize specific locations within the distribution.
A simple explanation of Regression | Regression versus Causation | Regression versus Correlation
The presentation aims at explaining the basic concept of regression. It also shows how regression is different from causation and correlation.
For further explanation, check out the YouTube link: https://youtu.be/SELNQs9b-XY
Descriptive statistics are methods of describing the characteristics of a data set. They include calculating things such as the average of the data, its spread and the shape it produces.
APPLICATION OF STATISTICS IN BUSINESS with Graphs | Business Statistics, by Hassan Shaheer
APPLICATION OF STATISTICS IN BUSINESS
WHAT IS STATISTICS ?
Meaning
Significance of STATISTICS
ROLE OF STATISTICS IN ACCOUNTING, FINANCE, MARKETING, PRODUCTION & ECONOMICS
Quantitative Data Graphs, Pie Charts, Dot Plots & Pareto Charts
A hypothesis is usually considered the principal instrument in research and quality control. Its main function is to suggest new experiments and observations. In fact, many experiments are carried out with the deliberate object of testing a hypothesis. Decision makers often face situations in which they want to test a hypothesis on the basis of available information and then take decisions on the basis of that test. In Six Sigma methodology, hypothesis testing is a substantial tool used in the analysis phase of a Six Sigma project so that improvement proceeds in the right direction.
Decision theory as the name would imply is concerned with the process of making decisions. The extension to statistical decision theory includes decision making in the presence of statistical knowledge which provides some information where there is uncertainty. The elements of decision theory are quite logical and even perhaps intuitive. The classical approach to decision theory facilitates the use of sample information in making inferences about the unknown quantities. Other relevant information includes that of the possible consequences which is quantified by loss and the prior information which arises from statistical investigation. The use of Bayesian analysis in statistical decision theory is natural. Their unification provides a foundational framework for building and solving decision problems. The basic ideas of decision theory and of decision theoretic methods lend themselves to a variety of applications and computational and analytic advances.
This presentation was given by our group in our research methods class. It will be useful for PhD and master's students studying quantitative and qualitative methods. It covers the definition of a sample, the purpose of sampling, stages in the selection of a sample, types of sampling in quantitative research, types of sampling in qualitative research, and ethical considerations in data collection.
BUS308 – Week 5 Lecture 1
A Different View
Expected Outcomes
After reading this lecture, the student should be familiar with:
1. What a confidence interval for a statistic is.
2. What a confidence interval for differences is.
3. The difference between statistical and practical significance.
4. The meaning of an Effect Size measure.
Overview
Years ago, a comedy show used to introduce new skits with the phrase “and now for
something completely different.” That seems appropriate for this week’s material.
This week we will look at evaluating our data results in somewhat different ways. One of
the criticisms of the hypothesis testing procedure is that it only shows one value, when it is
reasonably clear that a number of different values would also cause us to reject or not reject a
null hypothesis of no difference. Many managers and researchers would like to see what these
values could be; and, in particular, what are the extreme values as help in making decisions.
Confidence intervals will help us here.
The other criticism of the hypothesis testing procedure is that we can “manage” the
results, or ensure that we will reject the null, by manipulating the sample size. For example, if
we have a difference in a customer preference between two products of only 1%, is this a big
deal? Given the uncertainty contained in sample results, we might tend to think that we can
safely ignore this result. However, if we were to use a sample of, say, 10,000, we would find
that this difference is statistically significant. This, for many, seems to fly in the face of
reasonableness. We will look at a measure of “practical significance,” meaning the likelihood of
the difference being worth paying any attention to, called the effect size to help us here.
Confidence Intervals
A confidence interval is a range of values that, based upon the sample results, most likely
contains the actual population parameter. The “most likely” element is the level of confidence
attached to the interval, 95% confidence interval, 90% confidence interval, 99% confidence
interval, etc. They can be created at any time, with or without performing a statistical test, such
as the t-test.
A confidence interval may be expressed as a range (45 to 51% of the town’s population
support the proposal) or as a mean or proportion with a margin of error (48% of the town
supports the proposal, with a margin of error of 3%). This last format is frequently seen with
opinion poll results, and simply means that you should add and subtract this margin of error from
the reported proportion to obtain the range. With either format, the confidence percent should
also be provided.
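The conversion between the two reporting formats is simple arithmetic; a minimal sketch using the 48% and 3% figures from the example above (the function name is illustrative):

```python
def interval_from_margin(estimate: float, margin: float) -> tuple[float, float]:
    """Turn 'estimate with a margin of error' into a (low, high) range."""
    return (estimate - margin, estimate + margin)


# 48% support with a 3% margin of error.
low, high = interval_from_margin(48.0, 3.0)
print(f"{low:.0f}% to {high:.0f}% of the town supports the proposal")
```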
Confidence intervals for a single mean (or proportion) are fairly straightforward to
understand, and relate to t-test outcomes simply. Details on how to construct the interval will be
given in this week’s second lecture. We want to understand how to interpret and understa.
This presentation addresses sample size determination for the social sciences. A simple example is provided so that everyone can understand and explain sample size determination.
Answer the questions in one paragraph 4-5 sentences.
· Why did the class collectively sign a blank check? Was this a wise decision; why or why not? we took a decision all the class without hesitation
· What is something that I said individuals should always do; what is it; why wasn't it done this time? Which mitigation strategies were used; what other strategies could have been used/considered? individuals should always participate in one group and take one decision
SAMPLING MEAN:
DEFINITION:
The term sampling mean is a statistical term used to describe the properties of statistical distributions. In statistical terms, the sample mean from a group of observations is an estimate of the population mean. Given a sample of size n, consider n independent random variables X1, X2, ..., Xn, each corresponding to one randomly selected observation. Each of these variables has the distribution of the population, with mean μ and standard deviation σ. The sample mean is defined to be $\bar{X} = \frac{1}{n}(X_1 + X_2 + \cdots + X_n)$.
WHAT IT IS USED FOR:
It is also used to measure the central tendency of the numbers in a data set. It can also be said that it is nothing more than a balance point between the high numbers and the low numbers.
HOW TO CALCULATE IT:
To calculate this, just add up all the numbers, then divide by how many numbers there are.
Example: what is the mean of 2, 7, and 9?
Add the numbers: 2 + 7 + 9 = 18
Divide by how many numbers (i.e., we added 3 numbers): 18 ÷ 3 = 6
So the Mean is 6
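The same calculation as a short sketch; the function name is illustrative, and Python's standard library offers `statistics.mean` for the same purpose.

```python
def mean(values):
    """Add up all the numbers, then divide by how many numbers there are."""
    return sum(values) / len(values)


print(mean([2, 7, 9]))  # 18 / 3
```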
SAMPLE VARIANCE:
DEFINITION:
The sample variance, s², is used to calculate how varied a sample is. A sample is a select number of items taken from a population. For example, if you are measuring American people’s weights, it wouldn’t be feasible (from either a time or a monetary standpoint) for you to measure the weights of every person in the population. The solution is to take a sample of the population, say 1000 people, and use that sample to estimate the actual weights of the whole population.
WHAT IT IS USED FOR:
The sample variance helps you to figure out the spread of the data you have collected or are going to analyze. In statistical terminology, it can be defined as the average of the squared differences from the mean.
HOW TO CALCULATE IT:
Given below are steps of how a sample variance is calculated:
· Determine the mean
· Then for each number: subtract the Mean and square the result
· Then work out the mean of those squared differences.
To work out the mean, add up all the values then divide by the number of data points.
First add up all the values from the previous step.
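But how do we say "add them all up" in mathematics? We use the Greek letter Sigma: Σ
The handy Sigma Notation says to sum up as many terms as we want.
· Next we need to divide by the number of data points, which is simply done by multiplying by "1/N". (Strictly, dividing by N gives the variance of the data treated as a whole population; the usual sample variance s² divides by N − 1 instead.)
Statistically it can be stated by the following:
· $\sigma^2 = \frac{1}{N}\sum_{i=1}^{N}(x_i - \mu)^2$
· This value is the variance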
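The steps above can be sketched as follows. The `ddof` parameter is a hypothetical addition (the name is borrowed from a common convention): `ddof=0` divides by N exactly as the steps describe, while `ddof=1` divides by N − 1, which is the usual definition of the sample variance. The data reuse the numbers from the mean example rather than the rose-bush example, which is cut off in this excerpt.

```python
import statistics


def variance(values, ddof=0):
    """Average of the squared differences from the mean.

    ddof=0 divides by N (as in the steps above);
    ddof=1 divides by N - 1 (the usual sample variance).
    """
    n = len(values)
    m = sum(values) / n                              # step 1: the mean
    squared_diffs = [(x - m) ** 2 for x in values]   # step 2: squared differences
    return sum(squared_diffs) / (n - ddof)           # step 3: average them


data = [2, 7, 9]
print(variance(data))            # divides by N
print(variance(data, ddof=1))    # divides by N - 1
# Cross-check against the standard library.
print(statistics.pvariance(data), statistics.variance(data))
```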
EXAMPLE:
Sam has 20 Rose Bushes.
The number of flowers on each b.
Confidence Interval Module
One of the key concepts of statistics enabling statisticians to make incredibly accurate predictions is called the Central Limit Theorem. The Central Limit Theorem is defined in this way:
· For samples of a sufficiently large size, the real distribution of means is almost always approximately normal.
· The distribution of means gets closer and closer to normal as the sample size gets larger and larger, regardless of what the original variable looks like (positively or negatively skewed).
· In other words, the original variable does not have to be normally distributed.
· This is because, if we as eccentric researchers, drew an almost infinite number of random samples from a single population (such as the student body of NMSU), the means calculated from the many samples of that population will be normally distributed and the mean calculated from all of those samples would be a very close approximation to the true population mean. It is this very characteristic that makes it possible for us, using sound probability based sampling techniques, to make highly accurate statements about characteristics of a population based upon the statistics calculated on a sample drawn from that population.
· Furthermore, we can calculate a statistic known as the standard error of the mean (abbreviated s.e.) that describes the variability of the distribution of all possible sample means in the same way that we used the standard deviation to describe the variability of a single sample. We will use the standard error of the mean (s.e.) to calculate the statistic that is the topic of this module, the confidence interval.
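The behaviour described in the points above can be checked with a small simulation. This is a sketch under stated assumptions: the population is an exponential distribution with mean 1 and standard deviation 1 (chosen only because it is strongly skewed), and the sample size, number of samples and seed are arbitrary.

```python
import random
import statistics


def simulate_sample_means(sample_size, n_samples, seed=0):
    """Draw many random samples from a skewed population and return their means."""
    rng = random.Random(seed)
    means = []
    for _ in range(n_samples):
        sample = [rng.expovariate(1.0) for _ in range(sample_size)]
        means.append(sum(sample) / sample_size)
    return means


means = simulate_sample_means(sample_size=50, n_samples=2000)

# The mean of the sample means should sit close to the population mean (1.0).
print(round(statistics.fmean(means), 3))
# Their spread should be close to sigma / sqrt(n) = 1 / sqrt(50).
print(round(statistics.stdev(means), 3), round(1 / 50 ** 0.5, 3))
```

Plotting a histogram of `means` would show the roughly bell-shaped curve, even though the underlying variable is far from normal.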
The formula that we use to calculate the standard error of the mean is:
$\text{s.e.} = s / \sqrt{N - 1}$
where s = the standard deviation calculated from the sample; and
N = the sample size.
So the formula tells us that the standard error of the mean is equal to the
standard deviation divided by the square root of the sample size minus 1.
This is the preferred formula for practicing professionals as it accounts for errors that may be a function of the particular sample we have selected.
THE CONFIDENCE INTERVAL (CI)
The formula for the CI is a function of the sample size (N).
For sample sizes ≥ 100, the formula for the CI is:
CI = (the sample mean) ± Z(s.e.).
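A minimal sketch of the two formulas, using the module's definition of the standard error (division by the square root of N − 1). The input numbers are hypothetical, since the worked example in this excerpt is cut off before its data appear, and Z = 1.96 assumes a 95% confidence level.

```python
import math


def standard_error(s, n):
    """Standard error of the mean as defined in this module: s / sqrt(N - 1)."""
    return s / math.sqrt(n - 1)


def confidence_interval(sample_mean, s, n, z=1.96):
    """CI = sample mean +/- Z * s.e. (intended for sample sizes >= 100)."""
    se = standard_error(s, n)
    return (sample_mean - z * se, sample_mean + z * se)


# Hypothetical sample: mean 10, standard deviation 4, N = 101.
low, high = confidence_interval(sample_mean=10, s=4, n=101)
print(round(low, 3), round(high, 3))
```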
Let’s look at an example to see how this formula works.
* Please use a pdf doc. “how to solve the problem”, I have provided for you under the “notes” link.
Example 1
Suppose that we conducted interviews with 140 randomly selected individuals (N = 140) in a large metropolitan area. We assured these individuals that their answers would remain confidential, and we asked them about their law-breaking behavior. Among other questions the individuals were asked to self-report the number of times per month they exceeded the speed limit. One of the objectives of the study was to estimate (make an inference about) the average nu.
O&M Statistics – Inferential Statistics: Hypothesis Testing
Inferential Statistics
Hypothesis testing
Introduction
In this week, we transition from confidence intervals and interval estimates to hypothesis testing, the basis for inferential statistics. Inferential statistics means using a sample to draw a conclusion about an entire population. A test of hypothesis is a procedure to determine whether sample data provide sufficient evidence to support a position about a population. This position or claim is called the alternative or research hypothesis.
“It is a procedure based on sample evidence and probability theory to determine whether the hypothesis is a reasonable statement” (Mason & Lind, pg. 336).
This Week in Relation to the Course
Hypothesis testing is at the heart of research. In this week, we examine and practice a procedure to perform tests of hypotheses comparing a sample mean to a population mean and a test of hypotheses comparing two sample means.
The Five-Step Procedure for Hypothesis Testing (you need to show all 5 steps – these contain the same information you would find in a research paper – allows others to see how you arrived at your conclusion and provides a basis for subsequent research).
Step 1
State the null hypothesis – equating the population parameter to a specification. The null hypothesis is always one of status quo or no difference. We call the null hypothesis H0 (H sub zero). It is the hypothesis that contains an equality.
State the alternate hypothesis – The alternate is represented as H1 or HA (H sub one or H sub A). The alternate hypothesis is the exact opposite of the null hypothesis and represents the conclusion supported if the null is rejected. The alternate will not contain an equal sign of the population parameter.
Most of the time, researchers construct tests of hypothesis with the anticipation that the null hypothesis will be rejected.
Step 2
Select a level of significance (α) which will be used when finding critical value(s).
The level you choose (alpha) indicates how confident we wish to be when making the decision.
For example, a .05 alpha level means that, if the null hypothesis is actually true, there is a 5% chance of wrongly rejecting it (what is called the likelihood of committing a Type 1 error); equivalently, we are working at a 95% confidence level.
The level of significance is set by the individual performing the test. Common significance levels are .01, .05, and .10. It is important to always state what the chosen level of significance is.
Step 3
Identify the test statistic – this is the formula you use given the data in the scenario. Simply put, the test statistic may be a Z statistic, a t statistic, or some other distribution. Selection of the correct test statistic will depend on the nature of the data being tested (sample size, whether the population standard deviation is known, whether the data is known to be normally distributed).
The sampling distribution of the test statistic is divided into t.
Quantitative Methods/Choosing a Sample.pptx
Choosing a Sample
Leedy, P., and Ormrod, J., Practical Research. (8th ed.)
Fink, A. 1995. From the Survey Toolkit published by Sage.
Choosing a Sample to Survey
Population – the group to be covered by your research plan
Sample – a subset of your population
Generalize results – only if the sample is representative of the population
Probability sampling
Non-probability sampling
Probability sampling –
Random Sampling
Each member of the population has an equal chance of being selected.
Probability sampling –
Stratified Random Sampling
Take equal samples from each group (layers, strata).
Probability sampling –
Proportional Stratified Sampling
Take equal proportions of samples from each group (layers, strata).
Probability sampling –
Cluster Sampling
Take equal proportions of samples from certain regions only.
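The four probability sampling schemes above can be sketched with the standard `random` module. The population, the stratum and region names, and the sample sizes are all made up for illustration, and the cluster version follows the slide's description (sample a proportion within a few chosen regions).

```python
import random


def simple_random_sample(population, k, rng):
    """Every member has an equal chance of being selected."""
    return rng.sample(population, k)


def stratified_sample(strata, k_per_stratum, rng):
    """Take the same number of members from each stratum."""
    return {name: rng.sample(members, k_per_stratum)
            for name, members in strata.items()}


def proportional_stratified_sample(strata, fraction, rng):
    """Take the same proportion of members from each stratum."""
    return {name: rng.sample(members, round(len(members) * fraction))
            for name, members in strata.items()}


def cluster_sample(clusters, n_clusters, fraction, rng):
    """Pick a few clusters (regions) at random, then sample only within them."""
    chosen = rng.sample(sorted(clusters), n_clusters)
    return {name: rng.sample(clusters[name], round(len(clusters[name]) * fraction))
            for name in chosen}


rng = random.Random(42)  # fixed seed so the sketch is repeatable
groups = {"north": list(range(0, 60)), "south": list(range(60, 100))}
everyone = groups["north"] + groups["south"]

print(simple_random_sample(everyone, 10, rng))
print(stratified_sample(groups, 5, rng))
print(proportional_stratified_sample(groups, 0.10, rng))
print(cluster_sample(groups, 1, 0.10, rng))
```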
Non-probability Sampling –
Convenience Sampling
No attempt to have a representative sample
Examples:
Survey people in your neighborhood.
Customer satisfaction cards in a restaurant.
Survey all companies who have had projects done by NWMOC.
Survey all Human Resources Managers at Stout Career Fair.
Sample size?
Entire population, if N<100
20-50% of population, if 100 < N < 2000
About 400, if N > 2000
Affects the time and cost of the study, the precision of statistical results
Be sure to consider the response rate
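These rules of thumb can be written down as a small helper. The slide does not say what to do when N is exactly 100 or exactly 2000, so this sketch assumes both fall in the middle band; the function names and the 30% response rate in the usage lines are illustrative.

```python
import math


def rule_of_thumb_sample_size(population_size):
    """Return a (low, high) range for the sample size, following the slide."""
    n = population_size
    if n < 100:
        return (n, n)                                  # survey the entire population
    if n <= 2000:                                      # assumption: boundaries land here
        return (round(0.20 * n), round(0.50 * n))      # 20-50% of the population
    return (400, 400)                                  # about 400


def invitations_needed(target_responses, response_rate):
    """How many people to contact so that enough of them actually respond."""
    return math.ceil(target_responses / response_rate)


print(rule_of_thumb_sample_size(1000))
print(invitations_needed(400, 0.30))
```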
Sampling Bias
Bias – an influence, condition, or set of conditions which distort the data
Sampling bias – is the sample random?
Examples:
Political polls by phone interview
A mail survey of alumni satisfaction, with 30% response rate
Quantitative Methods/Confidence Intervals.pptx
Confidence Intervals
Some adapted from http://stattrek.com/estimation/confidence-interval.aspx and http://www.stat.yale.edu/Courses/1997-98/101/confint.htm
Confidence Interval
Statisticians use a confidence interval to describe the amount of uncertainty associated with a sample estimate of a population parameter.
It gives an estimated range of values that is likely to include an unknown population parameter;
the estimated range is calculated from a set of sample data.
Confidence Interval
Gives the probability that the interval produced by the sample method includes the true value of the parameter
You must assume a normal distribution
Confidence Interval Selection
Common choices for the confidence level are 0.90, 0.95, and 0.99. These levels correspond to percentages of the area of the normal density curve. For example, a 95% confidence interval covers 95% of the normal curve --
Normal Distribution
Confidence Intervals
Supp ...
Steps of hypothesis testing
Select the appropriate test
So far we’ve learned a couple of variations on z- and t-tests
See next slide for how to select
State your research hypothesis and your null hypothesis
State them in English
Then in math
Describe the NULL distribution
Starting here is where you become a skeptic and assume the null is true!
For one-sample tests, you will need to determine μ
(For two-sample tests, you don’t need to worry about μ)
Compute the relevant standard error
Determine your critical value(s)
Keep in mind whether it is a directional or non-directional test
Compute the test statistic
Compare the test stat to the critical value(s) and make your decision
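The steps above, applied to a one-sample z-test, can be sketched as follows. The numbers are hypothetical, the function name is illustrative, and the one-tailed branch assumes the research hypothesis predicts a larger mean (upper tail).

```python
import math
from statistics import NormalDist


def one_sample_z_test(sample_mean, mu, sigma, n, alpha=0.05, two_tailed=True):
    """Return (test statistic, critical value, reject-null decision)."""
    # Compute the relevant standard error.
    se = sigma / math.sqrt(n)
    # Determine the critical value, splitting alpha for a non-directional test.
    tail_area = alpha / 2 if two_tailed else alpha
    critical = NormalDist().inv_cdf(1 - tail_area)
    # Compute the test statistic.
    z = (sample_mean - mu) / se
    # Compare the test statistic to the critical value and decide.
    reject = abs(z) > critical if two_tailed else z > critical
    return z, critical, reject


# Hypothetical data: null says mu = 100, sigma = 15; sample of 100 has mean 103.
z, critical, reject = one_sample_z_test(sample_mean=103, mu=100, sigma=15, n=100)
print(round(z, 3), round(critical, 3), reject)
```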
When to use each test
All of these tests require that the sampling distribution is normal
Either because population is normal or, thanks to central limit theorem, sample size is very large
All of these tests require that the measures be quantitative variables, that is interval/ratio
(Not all quantitative variables are normal, BUT all normal variables are quantitative. So if someone tells you a variable is normal, you know it is also quantitative.)
When to use each test, cont’d
1 Sample z-test
Comparing one sample mean to a population mean
And you do know σ (population SD)
2 sample z-test
Comparing two sample means to each other
And you do know $\sigma_{M_1 - M_2}$ (standard error of the difference of means)
1 sample t-test
Comparing one sample mean to a population mean
You only know s (sample SD)
2 sample t-test
Comparing two sample means to each other
You only know s1 and s2 (sample SDs)
Dependent sample t-test
You have two scores coming from each person, such as if you measured them before and after an experimental manipulation.
Compute the differences between the two scores, then treat like a 1 sample t
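The selection rules above reduce to a small decision function. This is a sketch with hypothetical parameter names: `population_sd_known` stands for knowing σ (or the standard error of the difference of means in the two-sample case), and `paired` marks the case of two scores per person.

```python
def choose_test(n_samples, population_sd_known, paired=False):
    """Pick a test following the selection rules listed above."""
    if paired:
        # Two scores per person: analyse the differences like a 1-sample t.
        return "dependent-sample t-test"
    if n_samples not in (1, 2):
        raise ValueError("these rules cover one or two samples only")
    kind = "z" if population_sd_known else "t"
    return f"{n_samples}-sample {kind}-test"


print(choose_test(1, population_sd_known=True))
print(choose_test(2, population_sd_known=False))
print(choose_test(2, population_sd_known=False, paired=True))
```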
What is α?
Put on your skeptic’s hat: you believe the null hypothesis is true
But you’re willing to be convinced you’re wrong
If the test statistic is sufficiently improbable, you will change your mind and decide the null hypothesis is false
What is “sufficiently” improbable?
When your test statistic is more extreme than your critical values
Critical values are selected so that only a small fraction of the entire distribution is more extreme than the critical values
This “small fraction” is called α
Conventionally, α is usually set to .05, that is 5%
Directionality of a test
Is a test simply about whether there is a difference, regardless of direction?
If so, it is a non-directed, or undirected, or two-tailed test
Your α must be evenly split between the two tails
For the conventional α = .05, that means each tail should have .025 or 2.5% of the total distribution
Is the test predicting one mean will be bigger than another? Or is it predicting one mean will be less than another?
If so, it is a directional, or directed, or one-tailed test
Put all your α in a single tail
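How α is placed in the tails changes the critical value. A minimal sketch for a z-test, using `statistics.NormalDist` from the Python standard library (the function name is illustrative):

```python
from statistics import NormalDist


def critical_z(alpha=0.05, tails=2):
    """Positive critical z value with alpha split across `tails` tails."""
    if tails not in (1, 2):
        raise ValueError("tails must be 1 or 2")
    # Each tail holds alpha / tails of the distribution.
    return NormalDist().inv_cdf(1 - alpha / tails)


print(round(critical_z(0.05, tails=2), 3))  # .025 in each tail
print(round(critical_z(0.05, tails=1), 3))  # all .05 in a single tail
```

The one-tailed critical value is smaller, so a result in the predicted direction needs a less extreme test statistic to be declared significant.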
Special note on one-tailed tests
Step 3 of our procedure is a little awkward when we have one-tailed tests
How do you descr.
Topic: What is Reliability and its Types?
Student Name: Kanwal Naz
Class: B.Ed 1.5
Project Name: "Young Teachers' Professional Development (TPD)"
Project Founder: Prof. Dr. Amjad Ali Arain
Faculty of Education, University of Sindh, Pakistan
2. Basic Terminology in Sampling
Sampling Element: This is the unit about which information is sought by
the marketing researcher for further analysis and action.
The most common sampling element in marketing research is a human
respondent who could be a consumer, a potential consumer, a dealer or a
person exposed to an advertisement, etc.
But some other possible elements for a study could be companies,
families or households, retail stores and so on.
Population: This is not the entire population of a given geographical area,
but the pre-defined set of potential respondents (elements) in a
geographical area.
For example, a population may be defined as "all mothers who buy
branded baby food in a given area" or "all teenagers who watch MTV in
the country" or "all adult males who have heard about or use the
AQUAFRESH brand of toothpaste" or similar definitions in line with the
study being done.
Slide 1
3. Sampling Frame
This is a subset of the defined target population, from which we can
realistically select a sample for our research.
For example, we may use a telephone directory of Mumbai as a
sampling frame to represent the target population defined as "the
adult residents of Mumbai".
Obviously, there would be a number of elements (people) who fit
our population definition, but do not figure in the telephone
directory. Similarly, some who have moved out of Mumbai recently
would still be listed.
Thus, a sampling frame is usually a practical listing of the
population, or a definition of the elements or areas which can be
used for the sampling exercise.
Slide 2
4. Sampling Unit
If individual respondents form the sample elements, and if we directly
select some individuals in a single step, the sampling unit is also the
element. That is, both the unit and the element are the same.
But in most marketing research, there is a multi-stage selection.
For example, we may first select areas or blocks in a city or town. These
form the first stage Sampling Units.
Then, we may select specific streets within a block or area, and these are
called second stage sampling units.
Then we may select apartments or houses - the third stage sampling units.
At the last stage, we reach the individual sampling element - the
respondent we wanted to meet.
Slide 3
5. The Sample Size Calculation
It is not a formula alone that determines sample size in actual
marketing research. Sampling in practice is based on science, but is
also an art.
The basic assumptions made while computing sample sizes through
the use of formulae are sometimes not met in practice. At other
times, there are other factors which are influential in increasing or
decreasing sample sizes obtained through the use of formulae.
For now, remember that sample size is decided based on
• use of formulae,
• experience of similar studies,
• time and budget constraints,
• output or analysis requirements,
• number of segments of the target population,
• number of centres where the study is conducted, etc.
Slide 4
6. Slide 5
There are two formulas depending on variable type, used for computing
sample size for a study. The first is used when the critical variable studied
is an interval-scaled one.
Formula for Sample Size Calculation when Estimating Means
(for Continuous or Interval Scaled Variables)
The formula for computing ‘n’, the sample size required to do the study, is
$$ n = \left( \frac{Z s}{e} \right)^2 $$
Let us examine one by one what the quantities ‘Z’, ‘s’, and ‘e’ represent.
We will then apply the same to an example to see how it works.
7. Z : The ‘Z’ value represents the Z score from the standard normal
distribution for the confidence level desired by the researcher. For
example, a 95 percent confidence level would indicate (from a
standard normal distribution for a 2-sided probability value of 0.95)
a ‘z’ score of 1.96. Similarly, if the researcher desires a 90 percent
confidence level, the corresponding ‘z’ score would be 1.645
(again, from the standard normal distribution, for a ‘2’ sided
probability of 0.90).
Generally, 90 or 95 percent confidence is adequate for most
marketing research studies. A 100 percent confidence level is not
practical, as it means we have to take a census of the entire
population, instead of using a sample.
We will use z = 1.96, equivalent to a 95 percent confidence level,
in our example.
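The Z scores quoted above can be checked without a printed table. The following is a minimal sketch, assuming Python's standard `statistics.NormalDist`; the function name `z_for_confidence` is a hypothetical helper introduced only for illustration.

```python
from statistics import NormalDist


def z_for_confidence(confidence: float) -> float:
    """Two-sided Z score for a confidence level given as a fraction (e.g. 0.95)."""
    if not 0 < confidence < 1:
        # A 100 percent confidence level has no finite Z score.
        raise ValueError("confidence must be strictly between 0 and 1")
    # Two-sided: leave (1 - confidence) / 2 in each tail of the standard normal.
    upper_point = 1 - (1 - confidence) / 2
    return NormalDist().inv_cdf(upper_point)


print(round(z_for_confidence(0.95), 3))  # 1.96
print(round(z_for_confidence(0.90), 3))  # 1.645
```

The guard on the input mirrors the remark that a 100 percent confidence level is not usable with a sample.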
Slide 6
8. s : The ‘s’ represents the standard deviation of the variable which
we are trying to measure from the study. At the planning stage this is an unknown
quantity, since we have not taken a sample yet. So, the question of computing
the value of ‘s’ from sample data does not arise.
However, we can use a rough estimate of the standard deviation for the
variable being measured. This estimate can be obtained in the following ways –
If past studies have measured this variable, we can use the standard deviation of
the variable from one of the studies from the recent past. It serves as a good
approximation.
A very small sample can be taken as a test or pilot sample, only for the purpose
of roughly estimating the sample standard deviation of the concerned variable.
If the minimum and maximum values of the variable can be estimated, then the
range of the variable’s values is known. Range = Maximum value – Minimum
value. Assuming the variable is approximately normally distributed, about 99.7 percent of its
values would lie within ± 3 standard deviations of the mean, so we could get an
approximate value of the standard deviation by dividing the range by 6.
The logic of this is that the Range then spans roughly 6 standard deviations.
Therefore, Range, when divided by 6, should give a reasonable rough estimate of the
standard deviation.
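The Range ÷ 6 rule of thumb is a one-line calculation. A minimal sketch follows; `sd_from_range` is a hypothetical name, and the rule itself is only as good as the assumption that the values are roughly normally distributed.

```python
def sd_from_range(minimum: float, maximum: float) -> float:
    """Rough standard deviation estimate: Range divided by 6."""
    if maximum <= minimum:
        raise ValueError("maximum must be greater than minimum")
    # Range is treated as spanning about 6 standard deviations (mean +/- 3).
    return (maximum - minimum) / 6


# A 1-to-10 rating scale has Range = 9, so the estimate is 9 / 6.
print(sd_from_range(1, 10))  # 1.5
```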
Slide 7
9. e : The third value required for calculating the sample size required for the
study is ‘e’, called tolerable error in estimating the variable in question. This can
be decided only by the researcher or his sponsor for the study. The lower the
tolerance, the higher will be the sample size. The higher the tolerable error, the
smaller will be the sample size required.
Now, let us take an example of the use of the above formula, to see how it works.
Let us assume we are doing a customer satisfaction study for a washing machine.
We are measuring satisfaction on a scale of 1 to 10. 1 represents "Not at all
satisfied", and 10 represents "Completely Satisfied". The scale would look like this
on a questionnaire –
Customer Satisfaction Scale
1 2 3 4 5 6 7 8 9 10
We will assume that the questionnaire consists only of 7-8 questions, all of them
using this 10-point scale. Therefore, the variable we are trying to measure or
estimate through the survey is Customer Satisfaction, which is being measured on
this 10-point scale.
Slide 8
10. We will apply the formula discussed for sample size calculation, and
check for its usefulness.
$$ n = \left( \frac{Z s}{e} \right)^2 $$
is the formula for variables which are continuous, or scaled.
Z : Let us assume we want a 95 percent confidence level in our
estimate of customer satisfaction level from the study. Then, from the
standard normal distribution tables (for a 2-sided probability value of
0.95), the Z value is 1.96.
s : Let us assume that such a customer satisfaction study was not
conducted in the past by us. We have no idea of the standard deviation
of the variable “Customer Satisfaction”. We can then use the rough
approximation of Range divided by 6 to estimate the sample standard
deviation.
In this case, the lowest value of customer satisfaction is 1, and the
highest value is 10. Thus, the Range of values for this variable is 10 – 1 =
9. Therefore, the estimated sample standard deviation becomes 9/6 = 1.5.
Slide 9
11. e : The tolerable error is expressed in the same units as
the variable being measured or estimated by the study. Thus,
we have to decide how much error (on a scale of 1 to 10) we
can tolerate in the estimate of average customer satisfaction.
Let us say we put the value at ± 0.5. That means we are
putting the value of ‘e’ as 0.5. This means we would like our
estimate of customer satisfaction to be within 0.5 of the actual
value, with a confidence level of 95 percent (decided earlier
while setting the ‘Z’ value).
Slide 9 contd….
12. Slide 10
Now, we have all 3 values required for calculating
‘n’, the sample size. So let us calculate ‘n’.
$$ n = \left( \frac{Z s}{e} \right)^2 = \left( \frac{1.96 \times 1.5}{0.5} \right)^2 = (1.96 \times 3)^2 = 34.57 $$
or 35 (approximately).
Therefore, a sample size of 35 would give us an
estimate of customer satisfaction measured on a 1–10
point scale, with 95 percent confidence level, and
error level maintained within ± 0.5 of the actual
value.
If we were to tighten our tolerance level of error (e)
to ± 0.25 instead of ± 0.5, we would have to take a
larger sample.
‘n’ would then be equal to
$$ n = \left( \frac{1.96 \times 1.5}{0.25} \right)^2 = (1.96 \times 6)^2 = 138.3 $$
or 138 (approximately).
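The two calculations above can be reproduced with a short function. This is a minimal sketch of n = (Z s / e)^2; the name `sample_size_mean` is hypothetical. Rounding up with `math.ceil` is a conservative convention assumed here, not something the slides prescribe, so the second case comes out as 139 rather than the slide's rounded 138.

```python
import math


def sample_size_mean(z: float, s: float, e: float) -> float:
    """Unrounded sample size for estimating a mean: (z * s / e) ** 2."""
    if e <= 0:
        raise ValueError("tolerable error e must be positive")
    return (z * s / e) ** 2


for e in (0.5, 0.25):
    n = sample_size_mean(z=1.96, s=1.5, e=e)
    # Assumption: round up so the achieved error never exceeds e.
    print(e, round(n, 2), math.ceil(n))
```

Halving the tolerable error roughly quadruples the required sample, because `e` enters the formula squared.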
13. Similarly, for any change in the estimate of ‘s’ or the value of ‘Z’ we choose to
set, the value of ‘n’, the sample size, would change.
In general, sample size would increase if
• standard deviation ‘s’ is higher
• confidence level required is higher
• error tolerance ‘e’ is lower
The major things to remember in the above formula are that
1. ‘Z’ value is set based on the confidence level we desire.
2. ‘s’ value is estimated from past studies involving the same variable, or from
the approximate formula of Range ÷ 6, if we can estimate the
Range of values for the variable in question.
3. ‘e’ value is also set by us.
Slide 11
14. Formula for Sample Size Calculation when Estimating Proportions
In cases where the variable being estimated is a proportion or a percentage, a
variation of the formula mentioned earlier should be used.
Such variables are typically found in questions that have a dichotomous
scale, with only two choices for an answer. For example, regular users
versus non-users. If we are estimating the proportion of respondents who
are regular users of our brand of toothpaste, say, we might use following
formula to determine sample size.
Here, the formula is
$$ n = p q \left( \frac{Z}{e} \right)^2 $$
Let us look at the meaning of each of the terms on the right hand side of the
formula.
Slide 12
15. ‘p’ is the frequency of occurrence of something expressed as a
proportion. For example, if the number of users you would expect to find
in a sample is 1 out of every 4 respondents, ‘p’ would be ¼ or 0.25. ‘q’ is
simply the frequency of non-occurrence of the same event, and is
calculated as (1-p). In other words, ‘p’ and ‘q’ always add up to 1. Here
again, it should be noted that we are actually trying to determine ‘p’ or
estimate ‘p’ by doing our survey. So, the estimate of ‘p’ that we use to
compute ‘n’ in the formula is either a very rough guess based on prior
studies, or on some other data. It is used only to calculate the sample size
‘n’. Only after doing the study will we have our true estimate of ‘p’, the
proportion of users in the population. It is similar to the problem
mentioned earlier (in the estimation of means for continuous variables)
when we used an estimate of ‘s’ before doing the actual study, only for the
purpose of computing sample size.
Z : ‘Z’ is the confidence level-related value of the standard normal
variable, as discussed in the earlier section. It is equal to 1.645 for 90
percent confidence level, and 1.96 for 95 percent confidence level (from
the standard normal distribution table).
Slide 13
16. e : ‘e’ is once again, the tolerable level of error in
estimating ‘p’ that the researcher has to decide. If we decide
that we can tolerate only a 3 percent error, ‘e’ has to be
expressed in terms of the same units as ‘p’. So, a 3 percent
tolerable error would translate into e = 0.03 because ‘p’ is a
proportion, with values ranging from 0 to 1 only. ‘q’ is also a
proportion, with the same range of values, and p+q is equal to
1.
Slide 13 contd….
17. Slide 14
Example of Use of Formula for Proportions
Let us plug in some numbers to see how the formula
works. Assuming we are trying to estimate the
proportion of the population who use our toothpaste
brand AQUA, let us assume that we want a
confidence level of 95 percent in our results (which
means Z = 1.96), and ‘e’ is 0.03, as discussed above.
‘p’, from previous studies or from prior knowledge,
is estimated as 0.25 for the purpose of sample size
determination.
Then,
$$ n = p q \left( \frac{Z}{e} \right)^2 = (0.25)(0.75) \left( \frac{1.96}{0.03} \right)^2 = (0.25)(0.75)(4268.4) = 800 $$
Therefore, we need a sample size of 800 respondents
to estimate the true value of ‘p’, with a 95 percent
confidence level, and with an error tolerance of ±
0.03 from the true value.
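The toothpaste example can be checked in a few lines. This is a minimal sketch of n = p q (Z / e)^2, with `sample_size_proportion` as a hypothetical function name.

```python
def sample_size_proportion(p: float, z: float, e: float) -> float:
    """Unrounded sample size for estimating a proportion: p * q * (z / e) ** 2."""
    if not 0 <= p <= 1:
        raise ValueError("p must be a proportion between 0 and 1")
    if e <= 0:
        raise ValueError("tolerable error e must be positive")
    q = 1 - p  # frequency of non-occurrence
    return p * q * (z / e) ** 2


n = sample_size_proportion(p=0.25, z=1.96, e=0.03)
print(round(n, 1))  # 800.3
```

Note that `e` is passed as 0.03, not 3, because it must be in the same units as `p`.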
18. Here, like in the earlier formula, the sample size is higher if
The confidence level is higher
The error tolerance is lower
But, the relationship between sample size and estimated ‘p’ is
somewhat different. The sample size increases as ‘p’ increases
from 0 to 0.5, but decreases thereafter, as ‘p’ increases from 0.5 to
1. Thus, other things being equal, sample size required is
maximum if ‘p’ is equal to 0.5. This is because the formula also
contains ‘q’ which is equal to (1-p). The product of ‘p’ and ‘q’ is
maximum when p = 0.5, q = 0.5 (0.5 x 0.5 = 0.25). At all other ‘p’
values, the product of ‘p’ and ‘q’ is less than 0.25. Therefore, the
sample size formula gives the highest value when p = 0.5.
This also gives us an easy way out of estimating the value of ‘p’, if
past information is not available. We can simply set the value of
‘p’ to 0.5, because that will give us the maximum sample size. This
could be an overestimated sample size, but it can never
underestimate sample size.
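The effect of ‘p’ on the required sample can be seen by evaluating the same formula over several values. The sketch below assumes Z = 1.96 and e = 0.03 as in the example; the candidate values of p are arbitrary illustrations.

```python
def sample_size_proportion(p: float, z: float = 1.96, e: float = 0.03) -> float:
    """Unrounded sample size for a proportion: p * (1 - p) * (z / e) ** 2."""
    return p * (1 - p) * (z / e) ** 2


candidates = [0.1, 0.25, 0.5, 0.75, 0.9]  # illustrative guesses for p
sizes = {p: sample_size_proportion(p) for p in candidates}
largest_p = max(sizes, key=sizes.get)

for p, n in sizes.items():
    print(p, round(n, 1))
print("largest sample needed at p =", largest_p)
```

The sizes are symmetric around 0.5 (p = 0.25 and p = 0.75 give the same answer), which is why setting p = 0.5 is the safe default when no prior estimate exists.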
Slide 15
19. Limitations of Formulae
Number of Centres
Most studies deal with multiple locations spread across the country. If the data is
to be analysed separately for each geographical segment, the overall sample size
obtained from the formula has to be split into these geographical centres or
segments. In such cases, we may intervene, and fix a minimum sample size for
each centre / city.
Multiple Questions
Different varieties and scales of variables are used in a questionnaire. Our
assumption in using the above formulae was that we have only one major type of
variable in the questionnaire – either a continuous variable or a proportion.
Actually, we have many different types of variables in any commonly used
questionnaire. This may require formulas to be used for each different scale / type
of variable. Then, we have to reconcile the different sample sizes arrived at for
each different variable type. Usually, the easy way out in such cases is to take the
maximum sample size which is calculated, for one important variable in the
questionnaire.
Cell Size in Analysis
Slide 16
20. There may be 5 income categories among our respondents, and 4 age
categories. This creates a table with 5x4, or 20 cells. Now, even though the
overall sample size was adequate for simple analysis, the sample size in some
of these 20 cells may not be adequate. There are various rules of thumb used
to overcome or prevent such problems. One says that each cell must have a
minimum of 10 entries for us to do any analysis using that cell. Such problems
can be overcome more easily if we know in advance what types of analysis we
are likely to do. In other words, blank formats of output tables can be specified
before doing the study.
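A cell-size check of this kind is easy to automate once the output table is specified. The sketch below is only an illustration: the category labels, the respondent counts and the function name `undersized_cells` are all made up, and the minimum of 10 per cell is the rule of thumb quoted above.

```python
from collections import Counter
from itertools import product


def undersized_cells(respondents, row_levels, col_levels, min_per_cell=10):
    """Return {cell: count} for every cell of the cross-table below the minimum."""
    counts = Counter(respondents)  # missing cells count as 0
    return {
        cell: counts[cell]
        for cell in product(row_levels, col_levels)
        if counts[cell] < min_per_cell
    }


# Hypothetical data: 5 income categories x 4 age categories = 20 cells.
income_levels = ["I1", "I2", "I3", "I4", "I5"]
age_levels = ["A1", "A2", "A3", "A4"]
respondents = []
for cell in product(income_levels, age_levels):
    n_in_cell = 3 if cell == ("I5", "A4") else 12  # one deliberately thin cell
    respondents.extend([cell] * n_in_cell)

thin = undersized_cells(respondents, income_levels, age_levels)
print(len(respondents), thin)
```

Here the overall sample of 231 looks comfortable, yet one of the 20 cells still falls below the minimum.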
Time and Budget Constraints
Many a time, a study has to be done quickly to aid decision-making, or to prevent
competitors from learning too much about possible marketing strategy changes.
There may also be budget constraints, because more money has been spent in
product development, or in promotions, etc. Sampling design has to keep in
mind both the time and budget constraints for the study, before finalising a
sampling plan.
The Role of Experience in Determination of Sample Size
Given the many limitations in using formulae to determine the “right” sample
size, past experience of conducting marketing research studies is often used to
Slide 17
21. We will now discuss some of the commonly used sampling techniques,
their merits and demerits.
Sampling Techniques can be classified under two major types –
probability and non-probability.
Probability Sampling Techniques
These are techniques where each sampling unit (usually a household or
individual in a marketing research study) has a known probability of being
included in the sample. The probability of inclusion need not be equal for
every sampling unit. In some methods, it is equal, and in some others, it is
unequal. But it should be a known probability, for it to be classified as a
probability sampling method.
The other major distinguishing feature of probability sampling methods is
that they are unbiased. The scheme of selection of units from the target
population is pre-specified, and then the sample is selected according to
the scheme, not according to any biases or preferences of the researcher.
Slide 18
22. In practice, there are quite a few difficulties in using the probability
sampling methods. In such cases, the best feasible theoretical
method with minor modifications may be used. The major types of
probability sampling techniques are –
• Simple Random Sampling
• Stratified Random Sampling
• Cluster Sampling
• Systematic Sampling
• Multi-stage or Combination Sampling
Slide 18 contd...
23. Simple Random Sampling
This technique is conceptually the easiest to understand, but quite difficult to
implement in a realistic marketing research project. To illustrate what it is,
assume that we wish to estimate the average income level of 100 employees of
a company. We do not have access to their income levels, so we have to
interview them and find out their income level. We have a time constraint,
and we just need a quick estimate. Assume that we have decided we would be
happy with a sample of 5, randomly selected from the 100. How do we select
the sample?
If we wish to use simple random sampling we could make a list of all 100
employees. Then, a number could be allotted to each employee. We could
then write these 100 numbers on small pieces of paper, one number on each.
Shuffling these folded pieces of paper, we can draw 5 pieces out of the 100,
and use these employees as our sample.
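The slips-of-paper procedure corresponds to drawing without replacement from the numbered list. A minimal sketch using Python's standard `random` module follows; the function name and the optional `seed` argument are illustrative assumptions.

```python
import random


def simple_random_sample(population_size: int, sample_size: int, seed=None):
    """Draw employee numbers without replacement, each equally likely."""
    rng = random.Random(seed)  # seed only makes a draw repeatable
    employee_numbers = range(1, population_size + 1)  # employees numbered 1..N
    return sorted(rng.sample(employee_numbers, sample_size))


chosen = simple_random_sample(population_size=100, sample_size=5)
print(chosen)
```

The practical difficulty is not the draw itself but obtaining the complete numbered list to draw from.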
Slide 19
24. This appears very easy to do when there is a relatively small number of
people to pick from. But when we deal with typical marketing research
problems, the numbers are quite large, and more importantly, the exact
numbers are not known. This creates a very practical difficulty for the
marketing researcher who wishes to use Simple Random Sampling.
Imagine trying to procure a list of all Indian consumers of toilet soap, for
a study into their brand preferences. It is an impossible task, and
therefore, Simple Random Sampling, strictly speaking, is infeasible.
But it is possible to use modifications of the basic technique, with
reasonable checks and balances to keep the method unbiased in practice.
Slide 19 contd...
25. Slide 20
Stratified Random Sampling
In this technique, the total target population is
divided into strata or segments on the basis of some
important variables. For example, a consumer
population may be divided into age brackets of below
25, 25-40 and above 40 years. Then, a sample is
taken from each of the strata defined earlier.
Practically, the overall sample size is first calculated,
using a formula of the type discussed earlier, or based
on judgement and experience. This overall sample is
then divided into sub-samples for each stratum or
segment. There are two ways of doing this – called
proportionate stratification, and disproportionate
stratification. We will illustrate, based on our
example of the 3 age-based strata.
Total Sample Size for Proportionate Stratified
Sample
First, to compute the overall sample size for a
proportionate stratified sample, we have to use a
modified formula,
$$ n = \left( \frac{Z}{e} \right)^2 \sum_i W_i S_i^2 $$
26. instead of the earlier formula discussed at the
beginning of this chapter. The pre-condition for
using this formula is that we need to know the
standard deviation (estimated) of the concerned
variable for each of the strata S1, S2, S3, etc. We also
have to assign a weight to each stratum, which is Wi
in the formula above. Wi is generally calculated as a
proportion of number of people in stratum ‘i’, to the
number of people in all the strata. In other words,
$$ W_i = \frac{N_i}{N} $$
where Ni is the population of stratum ‘i’,
and ‘N’ is the total population targeted
for the study.
For calculating the weights, therefore, we must have
at least an estimate of the distribution of our target
population among the strata. We also need Si , the
standard deviation of the variable being estimated,
for each stratum. These are not always easy to get.
Slide 20 contd...
27. Slide 21
However, we will illustrate, assuming we are trying
to gather data for a Customer Satisfaction Study for a
T.V. Channel. Let us assume we want to know the
overall Customer Satisfaction level among three age
groups – below 25, 25 to 40 and above 40, for an
entertainment channel such as Sony. We want to
determine the customer satisfaction on a 7 point
scale, 1 being low satisfaction level, and 7 being high
satisfaction level.
Our formula for total sample size, we recall, is
$$ n = \left( \frac{Z}{e} \right)^2 \sum_i W_i S_i^2 $$
28. Slide 22
We will now assume that
Z = 1.96 (assuming 95 percent confidence level)
e = 0.05 (tolerable error on the 7 point scale)
We will assume that for the three age-based strata,
the weights and standard deviations are known or can
be calculated. A rough estimate of the standard
deviation ‘s’ (overall) is given by the formula (Range
÷ 6). Range is 7–1 = 6 because the maximum value
of the rating can be 7, and minimum can be 1.
Therefore
$$ s = \frac{\text{Range}}{6} = \frac{6}{6} = 1 $$
We will now assume that S1, S2, S3, the standard
deviations of customer satisfaction, are 1.2, 0.9 and
0.7 for the three age-based strata we have described.
Also, let us assume that 40 percent of the target
population of TV watchers is in the 40 plus age
group, 30 percent is in the 25-40 age group and 30
percent is in the below 25 age group. The weights
for the age groups W1, W2, W3 will then be (from the
lower age group to the higher) 0.3, 0.3 and 0.4. The
values are written again below –
S1 = 1.2    W1 = 0.3
S2 = 0.9    W2 = 0.3
S3 = 0.7    W3 = 0.4
29. Slide 23
Now, applying the formula, we get
$$ n = \left( \frac{Z}{e} \right)^2 \sum_i W_i S_i^2 = \left( \frac{1.96}{0.05} \right)^2 \left[ (0.3)(1.2)^2 + (0.3)(0.9)^2 + (0.4)(0.7)^2 \right] $$
$$ = 1536 \, [0.871] = 1338 \text{ (approx.)} $$
This is the total sample size required. (Note that if
we had used the formula for simple random sampling
discussed earlier, sample size n would have been
(using s=1 as estimated above) equal to 1536. So,
stratified sampling has led to a smaller sample size of
1338 for the same z and e values.)
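The total for the proportionate stratified sample, and the comparison with the unstratified figure, can be reproduced as follows. This is a minimal sketch with hypothetical function names; the weights and standard deviations are the assumed values from the example.

```python
def stratified_total_proportionate(z, e, weights, sds):
    """Total n for proportionate stratification: (z/e)^2 * sum(W_i * S_i^2)."""
    if abs(sum(weights) - 1) > 1e-9:
        raise ValueError("stratum weights must add up to 1")
    return (z / e) ** 2 * sum(w * s ** 2 for w, s in zip(weights, sds))


weights = [0.3, 0.3, 0.4]  # below 25, 25-40, above 40
sds = [1.2, 0.9, 0.7]      # assumed stratum standard deviations

n_stratified = stratified_total_proportionate(1.96, 0.05, weights, sds)
n_unstratified = (1.96 * 1.0 / 0.05) ** 2  # earlier formula with s = 1

print(round(n_stratified), round(n_unstratified))  # 1338 1537
```

The unrounded unstratified value is 1536.64, which the slides quote as 1536.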
30. Slide 24
To split this total sample of 1338 into proportionately stratified sub-samples,
we simply use the same weights as determined earlier. Thus, the
sample size for stratum 1 (below 25 age group) would be
1338 x W1 = 1338 x 0.3 = 401
For stratum 2, it would be
1338 x W2 = 1338 x 0.3 = 401
For stratum 3 (above 40 age group), it would be
1338 x W3 = 1338 x 0.4 = 536 (approx.; 535.2 is taken as 536 so that the three sub-samples add up to 1338)
Thus, we would take a sample of 401, 401 and 536 from each of the three
strata. The total sample size is maintained at 1338.
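Proportionate allocation is simply the total multiplied by each weight. The sketch below (hypothetical function name) returns the unrounded shares so that the rounding step stays visible; how to round so that the parts still add up to the total is a judgement the researcher makes.

```python
def proportionate_allocation(total_n, weights):
    """Split a total sample across strata in proportion to stratum weights."""
    return [total_n * w for w in weights]


shares = proportionate_allocation(1338, [0.3, 0.3, 0.4])
print([round(share, 1) for share in shares])  # [401.4, 401.4, 535.2]
```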
31. Slide 25
Disproportionate Stratified Sampling
One of the keys to effective sampling is to take a sample as large or as small as
required, neither too high nor too low. But in practice, we need to know the
variability of the population to be able to achieve an accurate sampling plan.
As we know intuitively, the higher the variability among the population (of the
variable we are measuring or estimating), the higher the sample size required from
the population.
As an illustration (though exaggerated), if we know that all the population is of
exactly the same characteristics, then a sample size of 1 is enough to tell us the
characteristics of the entire population.
At the other extreme, if the population is extremely variable, each unit having its
own different characteristics, we would need a very large sample to accurately
represent the population. Most populations do not fall into extreme zones, and
generally strata or segments consist of units that are similar to each other.
When doing stratified sampling, we would probably go for disproportionate
stratified samples if the variability of the variable being estimated is different from
segment to segment. If the variability is the same, we could take a proportionate
stratified sample. We measure variability by the standard deviation of the
32. Slide 26
The formula for the total sample size calculation
(for disproportionate sampling) is
$$ n = \left( \frac{Z}{e} \right)^2 \left( \sum_i W_i S_i \right)^2 $$
This is slightly different from the formula used in
case of proportionate stratified sampling.
To illustrate, let us use the same example of three
age-based strata, and check how to use a
disproportionate sample in the same.
$$ n = \left( \frac{1.96}{0.05} \right)^2 \left[ (0.3)(1.2) + (0.3)(0.9) + (0.4)(0.7) \right]^2 = (1536)(0.8281) = 1272 \text{ (approx.)} $$
Thus, we see that compared to the proportionate
stratified sample, we have got a lower sample size,
for the same level of tolerable error (e) and Z (1.96,
95 percent confidence level). In general, we will note
that disproportionate stratified samples tend to be
more efficient (lower sample sizes are obtained), than
proportionate stratified samples, because we allocate
sample size according to the variability in the strata.
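The two totals can be put side by side in code. This is a minimal sketch using the example's assumed weights and standard deviations; both function names are hypothetical.

```python
def total_proportionate(z, e, weights, sds):
    """(z/e)^2 * sum(W_i * S_i^2)"""
    return (z / e) ** 2 * sum(w * s ** 2 for w, s in zip(weights, sds))


def total_disproportionate(z, e, weights, sds):
    """(z/e)^2 * (sum(W_i * S_i))^2"""
    return (z / e) ** 2 * sum(w * s for w, s in zip(weights, sds)) ** 2


weights = [0.3, 0.3, 0.4]
sds = [1.2, 0.9, 0.7]

n_prop = total_proportionate(1.96, 0.05, weights, sds)
n_disp = total_disproportionate(1.96, 0.05, weights, sds)
print(round(n_prop), round(n_disp))  # 1338 1272
```

When the stratum standard deviations are all equal, the two expressions coincide, which matches the advice to use proportionate stratification in that case.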
33. Slide 27
We have yet to allocate the sub-samples to the strata.
We will now do that. The criterion for doing so
would be to do it in proportion to the variation in a
given stratum, compared to the total variation in all
strata.
In other words,
$$ n_i = \frac{N_i S_i}{\sum_j N_j S_j} \, n $$
In our three strata,
ni = Sample size for stratum ‘i’
n = Total sample size = 1272 (calculated above)
Ni = Proportion of population belonging to stratum ‘i’
Si = Standard deviation of the variable (customer
satisfaction) in stratum ‘i’
We have assumed
N1 = 0.3    S1 = 1.2
N2 = 0.3    S2 = 0.9
N3 = 0.4    S3 = 0.7
n = 1272 from our calculation
34. Slide 28
Therefore, the sample size in stratum 1 (age group
below 25) is
$$ n_1 = \frac{(0.3)(1.2)}{(0.3)(1.2) + (0.3)(0.9) + (0.4)(0.7)} \times 1272 = \frac{0.36}{0.91} \times 1272 = 503 $$
Similarly,
$$ n_2 = \frac{(0.3)(0.9)}{0.91} \times 1272 = \frac{0.27}{0.91} \times 1272 = 377 $$
and,
$$ n_3 = \frac{(0.4)(0.7)}{0.91} \times 1272 = \frac{0.28}{0.91} \times 1272 = 391 $$
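The allocation in proportion to (population share x standard deviation) can be sketched as below. The function name is hypothetical and the inputs are the assumed values from the example. Because each share is rounded separately, the rounded sub-samples add up to 1271 rather than 1272.

```python
def disproportionate_allocation(total_n, pop_shares, sds):
    """Allocate total_n to strata in proportion to N_i * S_i."""
    products = [n_i * s_i for n_i, s_i in zip(pop_shares, sds)]
    denominator = sum(products)  # 0.36 + 0.27 + 0.28 = 0.91 in the example
    return [total_n * product / denominator for product in products]


allocation = disproportionate_allocation(1272, [0.3, 0.3, 0.4], [1.2, 0.9, 0.7])
print([round(share) for share in allocation])  # [503, 377, 391]
```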
35. Slide 29
Thus, the sample is divided into the three age groups in proportion to the
variation in customer satisfaction, and not in proportion to the number of
respondents in each stratum.
For example, the below 25 segment has the largest sample size of 503, even
though it has only 0.3 or 30 percent of the population. If we had gone for
proportionate stratified sampling, this segment would have got a sample size
of 0.3 x 1272 = 382 only. This would have been under-representative for
this segment.
We have discussed the pros and cons of proportionate and disproportionate
stratified sampling in these two sections. The reason for such an extensive
discussion is because many of the questions about sampling efficiency get
answered when we think about the need for stratification.
It has been researched and proven that if feasible, stratified sampling is the
most efficient method of probabilistic sampling. That is, for a given sample
size, it produces less sampling error than either simple random sampling or
cluster sampling.
36. We now move on to a discussion of other probabilistic methods of sampling.
Cluster Sampling / Area Sampling
A major difference between previously discussed methods of sampling and cluster
sampling is that a group of objects / units for sampling is selected in cluster
sampling.
A cluster is a group of sampling units or elements, which can be identified, listed
and a sample of which can be chosen. Theoretically, a cluster could be on the
basis of any criterion. But in practice, clusters tend to be found either in terms of
geographical areas, or membership of some groups such as a church, a club, or a
social organisation.
When the clusters are selected on the basis of geographical area, it is also called
Area Sampling.
If cluster sampling is only a single stage procedure, then
1. A list of all available clusters should be prepared.
2. All clusters should be numbered.
3. A sample of clusters (number to be decided by researcher) should be
randomly drawn.
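The three steps of single-stage cluster sampling map directly onto a few lines of code. The following is a minimal sketch; the cluster names and the function `draw_clusters` are invented for illustration.

```python
import random


def draw_clusters(cluster_names, n_clusters, seed=None):
    """List the clusters, number them from 1, then randomly draw n_clusters."""
    numbered = dict(enumerate(cluster_names, start=1))  # steps 1 and 2
    rng = random.Random(seed)
    chosen_numbers = sorted(rng.sample(sorted(numbered), n_clusters))  # step 3
    return {number: numbered[number] for number in chosen_numbers}


# Hypothetical list of city blocks serving as clusters.
blocks = ["Block A", "Block B", "Block C", "Block D", "Block E", "Block F"]
selected = draw_clusters(blocks, n_clusters=3)
print(selected)
```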
Slide 30
37. Slide 31
Practically, most of the time, 2 or more stages of sampling take place.
Out of the clusters selected in the first stage, a sample of units
(households) is generally taken, because the number of people in a cluster
is usually too large for sampling purposes.
One problem with cluster sampling is that the members of a cluster tend to
be similar – for example, people living in a block or neighbourhood come
from the same socio-economic background; have similar tastes, buying
behaviour, etc.
In general, cluster sampling is statistically inferior to simple random
sampling and stratified random sampling. Its sample tends to be less
representative than the other two methods. In other words, it produces
more sampling error for the same sample size, when compared to the other
two methods.
But on the positive side, the cost of cluster sampling is also usually lower.
So, the researcher may be able to justify using this technique on the
grounds of low cost and convenience.
38. Systematic Sampling
Systematic sampling is very similar to Simple Random Sampling, and easier to
practice. Just as we do in a simple random sample, we start with a list of all
sampling units or respondents in the population. We first compute the sample
size required, based on a formula.
Once the sample size (n) is decided, we divide the list of the total population (N units)
into n parts of (N ÷ n) units each. From the first part of sampling units,
we pick one at random. Thereafter, we pick every (N ÷ n)th item from the
remaining parts.
To illustrate, say we have a population of 300 students, for some research. We
need a sample of 15 out of these. The sampling fraction is 15/300 which means
1 out of every 20 students will be selected, on an average.
We divide the list into 15 parts of 300/15 = 20 students each. Out of the first 20 students, we choose
any one at random. Let us say we choose student number 7 (all students are
listed). Thereafter, we choose student numbers 7+20, 7+20+20, 7+20+20+20
and so on in a systematic sampling plan. Therefore, the selected students will be
numbers 7, 27, 47, 67, 87, 107, 127, 147, 167, 187, 207, 227, 247, 267 and 287.
All these 15 students will comprise our total sample for the study.
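The selection rule is a random start followed by a fixed step. A minimal sketch follows, assuming for simplicity that the population size is an exact multiple of the sample size; the function name and its `start` and `seed` arguments are illustrative.

```python
import random


def systematic_sample(population_size, sample_size, start=None, seed=None):
    """Pick a random start in the first interval, then every k-th unit after it."""
    if population_size % sample_size != 0:
        raise ValueError("this sketch assumes N is an exact multiple of n")
    interval = population_size // sample_size  # k = N / n
    if start is None:
        start = random.Random(seed).randint(1, interval)
    if not 1 <= start <= interval:
        raise ValueError("start must fall within the first interval")
    return [start + i * interval for i in range(sample_size)]


students = systematic_sample(population_size=300, sample_size=15, start=7)
print(students)
```

With 300 students, a sample of 15 and a start of 7, the interval is 20 and the last selected student is number 287.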
Slide 32
39. In an ordered list according to the criterion of interest, systematic
sampling produces a more representative sample than simple random
sampling. For example, if all students were arranged in ascending
order of age, a systematic sample would produce a sample consisting
of all age groups.
However, a potential drawback also exists. If the list is drawn up such
that every 20th student were similar on the characteristic we are
estimating, either by chance or design, then systematic samples can go
very wrong. So a list should be examined to see that there is no
cyclicality which coincides with our sampling interval.
Slide 32 contd...
40. Slide 33
Multistage or Combination Sampling
As the name indicates, in this type of sampling, we do not choose the final sample
in one stage. We combine two or more stages, and sometimes 2 or more different
methods of probability sampling.
We have already talked about 2-stage Area Samples while discussing Cluster
Sampling. Usually, multi-stage methods have to be used when doing research on a
national scale.
We may divide the national-level target population for our survey into clusters or
some such units. For example, we may divide India into 5 metro clusters, 20 class
A towns, 200 class B towns, and take our first stage sample as 1 metro, 3 class A
towns, and 10 class B towns, based on our sampling plan.
In the second stage, we may choose a stratified sample based on household income
and age of respondent. In such a case, we are using a two stage sampling plan,
which is a combination of Cluster Sampling, and Stratified Random Sampling.
If we go on sampling by geographical area based clusters in all the stages, it could
be a 3 or 4 stage cluster sample.
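The two-stage plan described above can be sketched as follows. The town counts come from the example in the text; everything else (town names, the household list, the income bands, five households per band) is hypothetical, and the second stage stratifies on income only to keep the sketch short.

```python
import random

rng = random.Random(42)  # fixed seed so the sketch is repeatable

# Stage 1: cluster sampling of towns, using the counts from the text.
town_frame = {
    "metro": [f"metro_{i}" for i in range(1, 6)],
    "class_A": [f"A_{i}" for i in range(1, 21)],
    "class_B": [f"B_{i}" for i in range(1, 201)],
}
stage1_plan = {"metro": 1, "class_A": 3, "class_B": 10}
selected_towns = [
    town
    for town_class, n in stage1_plan.items()
    for town in rng.sample(town_frame[town_class], n)
]

# Stage 2: stratified random sample of households within one selected town.
def stratified_sample(households, stratum_key, per_stratum, rng):
    strata = {}
    for h in households:
        strata.setdefault(h[stratum_key], []).append(h)
    sample = []
    for members in strata.values():
        # Random selection inside each stratum.
        sample.extend(rng.sample(members, min(per_stratum, len(members))))
    return sample

# Hypothetical household list for one town: 90 households in three income bands.
households = [{"id": i, "income": ["low", "middle", "high"][i % 3]} for i in range(90)]
stage2 = stratified_sample(households, "income", per_stratum=5, rng=rng)

print(len(selected_towns), len(stage2))
```

In a real study the second stage would be repeated in every selected town, and the strata would combine income with the age of the respondent.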
41. Slide 34
Non-Probability Sampling Techniques
We have so far discussed probability sampling techniques. In reality, because of
various difficulties involved in obtaining reliable lists of the desired target
population, it is difficult to use a textbook probability sampling prescription.
Therefore, some compromises could be made, or approximate probability-type
sampling procedures may be used. Some of the non-probabilistic techniques may
also be used explicitly in cases where it is not feasible to use probability based
methods.
The major difference is that in non-probability techniques, the extent of bias in
selecting a sample is not known. This makes it difficult to say anything about the
representativeness or accuracy of the sample. Nevertheless, if done
conscientiously, some of these are good approximations for the probability
sampling techniques.
There are four major non-probability sampling techniques. These are –
Quota Sampling
Judgement Sampling
Convenience Sampling
Snowball Sampling
42. Slide 35
Quota Sampling
The first method, quota sampling, is very similar to stratified random sampling.
The first step of deciding on the strata, or segments which the population is divided
into, is actually the same.
The second step, of calculating a total sample size, and allocating it to the various
strata, is also the same. The major difference is that random selection of
respondents is not strictly adhered to. More liberty is given to the field worker to
select enough respondents to complete the segment-wise quota.
In practice, unless there are untrained field workers, or the field supervision is lax,
the results produced by a quota sample could be very similar to the one produced
by a stratified random sample. But there is no guarantee that it would be similar.
In practice, many researchers use quota sampling, because it saves time, compared
with stratified random sampling. For example, if a household is locked, a quota
sample would permit the field worker to use a substitute household in the same
apartment block. But with a stratified random sample, he would be expected to
make a second or third attempt at different times of the day to contact the same
locked household. This would increase the time taken to complete the required
“quota”.
43. Slide 36
Judgement Sampling
This is not used often, as it is difficult to justify. The method relies only on the
judgement of the researcher as to who should be in the sample.
It obviously suffers from researcher bias. If a different researcher were to do
the same study, he would be likely to select an entirely different kind of sample.
Convenience Sampling
This is employed usually in pre-testing of questionnaires. It involves picking
any available set of respondents convenient for the researcher to use.
For example, students could be used as a sample by a marketing researcher who
lives in a college town. They (the students) need not be representative of the
target population for the study, for the product being researched.
Other examples of convenience sampling include on-the-street interviews, interviews at
any other meetings, or interviews with employees of one office block or factory. Another
common example of convenience sampling is the one by TV reporters who
44. Snowball Sampling
This technique is used when the population being sought
is a small one, and chances of finding them by traditional
means are low. For example, to find owners of Mercedes
Benz cars in a city, we may go to one or two, and ask
them if they know anyone else who owns one. They in
turn are asked for more names of owners.
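The referral chain can be sketched as a breadth-first walk over "who named whom". The owner names and the `referrals` table below are invented for illustration; in the field, each entry would come from asking a respondent for more names.

```python
from collections import deque

def snowball_sample(seeds, referrals, max_size):
    """Follow referrals breadth-first, starting from the seed respondents."""
    found = list(dict.fromkeys(seeds))[:max_size]  # drop duplicate seeds, keep order
    seen = set(found)
    queue = deque(found)
    while queue and len(found) < max_size:
        person = queue.popleft()
        for named in referrals.get(person, []):
            if named not in seen and len(found) < max_size:
                seen.add(named)
                found.append(named)
                queue.append(named)  # this person will be asked for names too
    return found

# Hypothetical referral data: each owner names other owners they know.
referrals = {
    "owner_1": ["owner_3", "owner_4"],
    "owner_2": ["owner_4"],
    "owner_3": ["owner_5"],
    "owner_4": [],
    "owner_5": ["owner_1"],
}
sample = snowball_sample(["owner_1", "owner_2"], referrals, max_size=10)
print(sample)
```

The sample can only reach people connected to the starting owners, which is one reason this technique counts as non-probability sampling.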
Slide 36 contd...
45. Slide 37
Census Versus Sample
It would appear from our discussion of sampling that it is not possible to do a
census in marketing research. Strictly speaking, it is possible to do one if the
population size is small. For example, if 200 solar cooker owners exist in a
town, it may be possible to meet all of them, if their addresses were available,
or could be obtained.
In some cases, like a survey of distributors or dealers, or even industrial
buyers, it may make sense to do a census if it is feasible. Particularly if
opinions or buying behaviour of respondents in a small population are likely
to be widely divergent.
But in most cases, if populations are reasonably large or very large, it makes
little sense to do a census. One major reason is that it may simply take too
long. Data may arrive too late for decision-making. Inaccuracies also are
likely to be a function of the volume of data collected. We will discuss these
in the next section under the subject “Sampling and Non-sampling Errors”.
46. Slide 38
Types of Errors in Marketing Research
Any research study has an error margin associated with it. No method is foolproof,
as we will see, including a census. This is because there are two major types of
errors associated with a research study. These are called –
• Sampling Error or Random Error
• Non-sampling or Human Error
Sampling Error
This is the error which occurs due to the selection of some units and non-selection of
other units into the sample. It is controllable if the selection of the sample is done in a
random, unbiased way. In other words, if a probability sampling technique is used, it
is possible to control this error. In general, this error reduces as sample size
increases.
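To see how sampling error shrinks with sample size, here is a minimal sketch for the mean of a continuous variable, assuming a simple random sample and the usual relation e = Z·s/√n between tolerable error (e), confidence level (Z), standard deviation (s) and sample size (n). The values Z = 1.96 and s = 10 are hypothetical.

```python
import math

def margin_of_error(z, s, n):
    """Sampling error of a sample mean: e = z * s / sqrt(n)."""
    return z * s / math.sqrt(n)

z = 1.96   # hypothetical: about 95% confidence
s = 10.0   # hypothetical standard deviation

for n in (100, 400, 1600):
    print(n, round(margin_of_error(z, s, n), 3))
```

Each fourfold increase in sample size only halves the error, so the gains from larger samples taper off.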
47. Non-sampling Error
This is the effect of various errors in doing the study, by the interviewer, data entry
operator or the researcher himself. Handling a large quantity of data is not an easy
job, and errors may creep in at any stage of the research. The data entry person
may interchange the column of ‘yes’ and ‘no’ responses while entering or
compiling data, or the interviewer may cheat by not filling up the questionnaire in
the field, and instead, fudge the data. Or, the respondent may say one thing, but
another may be recorded by mistake. These errors are usually proportionate to the
sample size. That is, the larger the sample size, the larger the non-sampling error.
Also, it is difficult to estimate the size of non-sampling error. But we can use some
controls on the quality of manpower, and supervise effectively to minimize it.
Slide 38 contd...
48. Slide 39
Total Error
1. This is the total of sampling error + non-sampling error.
2. Out of this, the sampling error can be estimated in the case of probability
samples, but not in the case of non-probability samples.
3. Non-sampling errors can be controlled through hiring better field workers,
qualified data entry persons, and good control procedures throughout the
project.
4. One important outcome of this discussion of errors is that the total error is
usually unknown. However, we may have to accept a higher non-sampling error
when we try to reduce sampling error by increasing the sample size of the
study, not to mention the higher cost of a larger sample.
5. Therefore, it is worthwhile to optimise total error by optimising the sample
size, rather than going blindly for the largest possible sample size.
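The trade-off in points 4 and 5 can be pictured with a toy model. This is purely illustrative: it assumes sampling error falls as A/√n and non-sampling error grows as B·n (in line with the earlier statement that it is roughly proportionate to sample size), and the constants A and B are invented. In practice the non-sampling error is hard to estimate, so these curves would not be known.

```python
import math

A = 20.0    # hypothetical scale of sampling error
B = 0.001   # hypothetical non-sampling error added per respondent

def total_error(n):
    sampling = A / math.sqrt(n)   # shrinks as the sample grows
    non_sampling = B * n          # grows with the volume of data handled
    return sampling + non_sampling

# Search a range of candidate sample sizes for the smallest total error.
best_n = min(range(1, 5001), key=total_error)

print(best_n, round(total_error(best_n), 3))
print(round(total_error(100), 3), round(total_error(5000), 3))
```

Under these assumed constants, both a very small and a very large sample give a higher total error than a moderate one, which is the sense in which sample size is "optimised" rather than maximised.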