The document provides an overview of univariate statistical analysis and inferential statistics, including key concepts like population and sample distributions, measures of central tendency and dispersion, the normal distribution, sampling distributions, confidence intervals, and how these statistical techniques are used to make inferences about populations based on samples. It also discusses important steps in the data analysis process like data preparation, selecting appropriate analysis strategies and techniques based on the research objectives and data types.
3. Review sampling
• You want to see a new movie this weekend,
so you go to a website and check out
previews of what’s on.
• Is this sampling?
• How good a sample would this be?
5. Learning Objectives
• Understand and explain the need for data
preparation techniques such as editing,
coding, cleaning and statistically adjusting the
data where required
• Develop a data analysis strategy based on
specific research objectives
• Identify the factors influencing the selection of
an appropriate data analysis strategy
• Outline various analysis techniques
6. Data Preparation Process
Prepare preliminary plan of data analysis
Check questionnaires
Edit
Code
Transcribe
Clean data
Statistically adjust the data
Select a data analysis strategy
7. Questionnaire Checking
• Review all questionnaires for completeness
and interviewing quality
• Unacceptable questionnaires include those where:
– Parts of the questionnaire are incomplete
– Skip patterns have not been followed
– Responses show little variance
– Pages are missing
– The questionnaire was returned late
– The respondent does not fit the selection
criteria
8. Data Editing
• A review of the questionnaires with the
objective of increasing accuracy and
precision.
• Identify responses that are:
– Illegible
– Incomplete
– Inconsistent
– Ambiguous
9. Data Editing cont.
• Treatment of unsatisfactory responses
– Return to the field
• Recontact the respondent
– Assign missing values
• If the number of unsatisfactory responses is
small
• Key variables are not missing
– Discard unsatisfactory respondents (cases)
• Proportion of unsatisfactory responses is small
• Sample size is large
• Unsatisfactory respondents do not differ from
satisfactory respondents
• Responses to key variables are missing
10. Data Coding
• Assigning a code [number] to each possible
response to each question [variable]
– Structured questionnaires [pre-coded]
– Unstructured questions [post-coding]
• Category codes should be mutually exclusive
and collectively exhaustive.
• Category codes should be assigned for critical
issues even if no one mentions them.
11. A Basic Questionnaire
1. In a typical month, how many times would you say you visit a fast-food restaurant? (Tick one box only)
None | One | Two | Three | Four | Five | Six or more
2. On your last visit to a fast-food restaurant, what was the dollar amount you spent on food and beverages?
Under $2.00 | $2.01 - $6.00 | $6.01 - $10.00 | $10.01 - $14.00 | More than $14.00 | Don’t remember
3. How many of these restaurants would you say you visited in the past two months? Tick as many as apply.
KFC | Wendy’s | McDonald’s | Hungry Jack’s | Pizza Hut | Red Rooster | Other | Have not visited any of these establishments
4. On a scale of 1 to 5, with 1 being strongly disagree and 5 being strongly agree, how would you rate fast-food restaurants on the following dimensions:
I only visit those fast-food establishments that are conveniently located to my home | 1 2 3 4 5
I prefer to visit fast-food restaurants that serve healthy/nutritious food | 1 2 3 4 5
The price of food items is not important when visiting a fast-food restaurant | 1 2 3 4 5
All fast-food restaurants should offer some type of child’s menu or kid’s meal | 1 2 3 4 5
5. How many children do you have living at home?
None | One | Two | Three | Four | Five or more
6. Into which category does your total annual household income fall?
Under $20,000 | $20,000 - $39,999 | $40,000 - $59,999 | $60,000 or more
12. Coding the Questionnaire
Variable number | Variable name | Coding instruction (99 = missing value)
1 | Number of visits per month | 0 = None; 1 = One; 2 = Two; 3 = Three; 4 = Four; 5 = Five; 6 = Six or more
2 | Amount spent | 1 = Under $2.00; 2 = $2.01 - $6.00; 3 = $6.01 - $10.00; 4 = $10.01 - $14.00; 5 = More than $14.00; 6 = Don’t remember
3.1 | Visited KFC | 1 = Yes; 0 = No
13. Coding the Questionnaire cont.
3.2 | Visited Wendy’s | 1 = Yes; 0 = No
3.3 | Visited McDonald’s | 1 = Yes; 0 = No
3.4 | Visited Hungry Jack’s | 1 = Yes; 0 = No
3.5 | Visited Pizza Hut | 1 = Yes; 0 = No
3.6 | Visited Red Rooster | 1 = Yes; 0 = No
3.7 | Visited other establishment | 1 = Yes; 0 = No
3.8 | Have not visited any establishment | 1 = Yes; 0 = No
4.1 | Visit conveniently located stores | 1 = Strongly disagree; 2 = Disagree; 3 = Neither agree nor disagree; 4 = Agree; 5 = Strongly agree
4.2 | Prefer healthy fast food stores | As above
14. Coding the Questionnaire cont.
4.3 | Price is not important | As above
4.4 | Children’s menu is important | As above
5 | Number of children | 0 = None; 1 = One; 2 = Two; 3 = Three; 4 = Four; 5 = Five or more
6 | Annual household income | 1 = Under $20,000; 2 = $20,000 - $39,999; 3 = $40,000 - $59,999; 4 = $60,000 or more
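As a rough illustration of how such a coding scheme could be applied in software, the sketch below stores the codes for questions 1 and 6 in a dictionary and falls back to 99 for a blank or unrecognised answer. The variable names (`visits_per_month`, `household_income`) and the helper `code_response` are hypothetical, and the income labels are taken from the questionnaire wording.

```python
# Codes for two questions from the coding scheme; 99 marks a missing value.
MISSING = 99

CODEBOOK = {
    "visits_per_month": {
        "None": 0, "One": 1, "Two": 2, "Three": 3,
        "Four": 4, "Five": 5, "Six or more": 6,
    },
    "household_income": {
        "Under $20,000": 1,
        "$20,000 - $39,999": 2,
        "$40,000 - $59,999": 3,
        "$60,000 or more": 4,
    },
}


def code_response(variable, answer):
    """Return the numeric code for an answer, or MISSING if blank or unrecognised."""
    if answer is None:
        return MISSING
    return CODEBOOK[variable].get(answer.strip(), MISSING)


# Hypothetical respondent who answered question 1 and skipped question 6.
raw = {"visits_per_month": "Three", "household_income": None}
coded = {variable: code_response(variable, answer) for variable, answer in raw.items()}
print(coded)  # {'visits_per_month': 3, 'household_income': 99}
```

Keeping the codes in one place makes it easier to check that the categories are mutually exclusive and collectively exhaustive.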
15. Transcribing
• Transferring coded data from the questionnaire to
a computer to be used for analysis.
• Variations to manual transcribing:
– CATI or CAPI
– Mark sense forms and optical scanning
– UPC
– Computerised sensory analysis systems
• For verification of the entire dataset, re-enter the
responses
17. Data Cleaning
• Consistency check
– Out of range [see study status]
– Logically inconsistent
[e.g., does not own the product but is a heavy user]
– Extreme values
[e.g., indiscriminately responding the same way on all attributes]
18. Example: Out of Range
Study Status
Valid cases | Frequency | Percent | Valid Percent | Cumulative Percent
Full time student | 923 | 91.8 | 91.8 | 91.8
Part time student | 81 | 8.1 | 8.1 | 99.9
3.00 | 1 | 0.1 | 0.1 | 100.0
Total | 1005 | 100.0 | 100.0 |
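A frequency count like the one in this table can be used to flag out-of-range codes automatically. The sketch below rebuilds the table's counts and assumes, for illustration only, that 1 = full time student and 2 = part time student are the only valid codes.

```python
from collections import Counter

# Assumed valid codes: 1 = full time student, 2 = part time student.
VALID_CODES = {1, 2}

# Rebuild the frequencies from the table: 923, 81, and one stray value of 3.
study_status = [1] * 923 + [2] * 81 + [3]

frequencies = Counter(study_status)
out_of_range = {code: n for code, n in frequencies.items() if code not in VALID_CODES}

for code, n in sorted(frequencies.items()):
    print(f"code {code}: {n} cases ({100 * n / len(study_status):.1f}%)")
print("out of range:", out_of_range)  # out of range: {3: 1}
```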
19. Data Cleaning cont.
• Treatment of missing responses
– Substitute a neutral value [substitute the ‘mean’
response of the variable]
– Substitute an imputed response [use the
respondent’s pattern of responses to other
questions]
– Casewise deletion [respondents with any missing
values are discarded from the analysis]
– Pairwise deletion [use only cases or respondents
with complete responses for each calculation]
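The sketch below contrasts three of these treatments on a small, made-up data set: mean substitution, casewise deletion and pairwise deletion. The item names and ratings are hypothetical, and imputation from a respondent's other answers is not shown because it depends on the chosen model.

```python
from statistics import mean

# Hypothetical ratings on two 1-5 scale items; None marks a missing response.
responses = [
    {"q1": 4, "q2": 5},
    {"q1": None, "q2": 3},
    {"q1": 2, "q2": None},
    {"q1": 5, "q2": 4},
]
items = ["q1", "q2"]

# Neutral value: replace each missing response with that item's mean.
item_means = {i: mean(r[i] for r in responses if r[i] is not None) for i in items}
mean_filled = [
    {i: r[i] if r[i] is not None else item_means[i] for i in items}
    for r in responses
]

# Casewise deletion: keep only respondents with no missing values at all.
casewise = [r for r in responses if all(r[i] is not None for i in items)]

# Pairwise deletion: a calculation on q1 alone uses every case that answered q1.
pairwise_q1 = [r["q1"] for r in responses if r["q1"] is not None]

print(len(casewise))     # 2
print(len(pairwise_q1))  # 3
```

Casewise deletion keeps two of the four respondents here, while a pairwise calculation on `q1` keeps three, which is why pairwise deletion retains more data when missing values are scattered.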
20. Statistically Adjusting the Data
• Weighting
– Each case is assigned a weight to reflect its
importance relative to other cases, often used to
make the sample more representative of a target
population
• Variable re-specification
– Transformation of data to create new variables or
modify existing variables to better suit the
research objectives by summing several variables,
log transformations, dummy variables [see next
slide]
• Scale transformation
– Manipulation of scale values to ensure
comparability with other scales or otherwise make
the data suitable for analysis [when data is not
normally distributed].
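To make the first two adjustments concrete, the sketch below weights a small, made-up sample toward assumed population proportions and adds a log-transformed variable and a dummy variable. All figures, including the 50/50 population split, are assumptions for illustration.

```python
import math

# Hypothetical cases: the sample is 75% female, while the target
# population is assumed to be 50% female.
cases = [
    {"gender": "F", "spend": 12.0},
    {"gender": "F", "spend": 8.0},
    {"gender": "F", "spend": 10.0},
    {"gender": "M", "spend": 20.0},
]
population_share = {"F": 0.5, "M": 0.5}   # assumed target proportions
sample_share = {"F": 0.75, "M": 0.25}     # proportions observed in `cases`

for case in cases:
    g = case["gender"]
    # Weighting: population share divided by sample share.
    case["weight"] = population_share[g] / sample_share[g]
    # Variable re-specification: a log transformation and a dummy variable.
    case["log_spend"] = math.log(case["spend"])
    case["is_female"] = 1 if g == "F" else 0

unweighted_mean = sum(c["spend"] for c in cases) / len(cases)
weighted_mean = (
    sum(c["weight"] * c["spend"] for c in cases) / sum(c["weight"] for c in cases)
)
print(unweighted_mean, round(weighted_mean, 2))
```

The weighted mean is higher than the unweighted mean because the under-represented male case counts for more once the weights are applied.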
21. Variable re-specification: Composite variables
•Aesthetics of a
website
•Measured using two
items
–“The website is
visually pleasing”
–“The website is
visually appealing”
–Combine these two
items to create a new
variable “Aesthetics
of a website” – this
new variable is used
with further analysis
in place of the two
items.
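One common way to combine the two items is to average them for each respondent. The slide does not say whether a mean or a sum is intended, so the averaging below is an assumption, as are the key names and ratings.

```python
from statistics import mean

# Hypothetical 1-7 ratings from three respondents on the two items.
respondents = [
    {"visually_pleasing": 6, "visually_appealing": 7},
    {"visually_pleasing": 3, "visually_appealing": 4},
    {"visually_pleasing": 5, "visually_appealing": 5},
]

# Composite variable: the mean of the two items for each respondent.
for r in respondents:
    r["aesthetics"] = mean([r["visually_pleasing"], r["visually_appealing"]])

print([r["aesthetics"] for r in respondents])
```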
22. Variable re-specification: Recode variables
(to recode negatively-worded scale items)
Role Overload (1 = Strongly Disagree, 2 = Disagree, 3 = Disagree Somewhat, 4 = Neither agree nor disagree, 5 = Agree Somewhat, 6 = Agree, 7 = Strongly Agree)
I have too much work to do, to do everything well | 1 2 3 4 5 6 7
The amount of work I am asked to do is fair | 1 2 3 4 5 6 7
I never seem to have enough time to get everything done | 1 2 3 4 5 6 7
•Role overload is measured by 3 items.
•Which item is reverse-coded?
•We need to recode this item so that all items run in the same
direction.
•In SPSS, recode the reverse-coded item so that 1→7, 2→6, 3→5, 4→4, 5→3,
6→2, 7→1.
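Outside SPSS, the same recode can be written as a single subtraction: on a scale from 1 to 7 the reversed score is 8 minus the original. The sketch below assumes the second item ("The amount of work I am asked to do is fair") is the reverse-coded one; the key names and answers are hypothetical.

```python
SCALE_MIN, SCALE_MAX = 1, 7


def reverse_code(score):
    """Map 1->7, 2->6, ..., 7->1 on a seven-point scale."""
    return SCALE_MAX + SCALE_MIN - score


# Hypothetical respondent; the "fair" item is assumed to be the reverse-coded one.
answers = {"too_much_work": 6, "workload_is_fair": 2, "never_enough_time": 7}
answers["workload_is_fair"] = reverse_code(answers["workload_is_fair"])

# After recoding, a higher score means more role overload on every item.
role_overload = sum(answers.values()) / len(answers)
print(answers["workload_is_fair"])  # 6
print(round(role_overload, 2))
```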
23. Variable re-specification: Recode variables
(to collapse a continuous variable) cont.
•“Overall, I’m satisfied with my job” was measured using a seven-point scale.
•When we perform data analysis (particularly cross-tabs) we may wish to have fewer categories for brevity.
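A collapse of this kind might look like the sketch below, which groups the seven points into three categories. The cut points and category labels are assumptions chosen for illustration; the slide does not specify them.

```python
def collapse(score):
    """Collapse a 1-7 satisfaction score into three categories.

    The cut points are an assumption made for this sketch.
    """
    if not 1 <= score <= 7:
        raise ValueError(f"score out of range: {score}")
    if score <= 3:
        return "Dissatisfied"
    if score == 4:
        return "Neutral"
    return "Satisfied"


scores = [1, 4, 7, 5, 2, 6]  # hypothetical responses
collapsed = [collapse(s) for s in scores]
print(collapsed)
```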
24. Strategy for Data Analysis
• Determine the type of data which is available
[nominal, ordinal, interval, ratio]
• Decide what needs to be discussed in order to tell
‘the story’
• Choose techniques to best get information on
specific parts of what has to be discussed
• Run the results
• Determine what the results mean, what patterns
can be seen, what kind of statistical decisions
should be made
• Write about the results to explain what is going on
to someone who does not like numbers and has
never heard of statistics
25. Overview of Techniques
• Descriptive Statistics
– Frequency distribution and cross
tabulations
– Measures of central tendency [mean,
median, mode]
– Measures of dispersion [range,
interquartile range, standard deviation]
– Shape [skewness, kurtosis]
• Inferential Statistics
– Parametric tests [Z or t test, paired t
test]
– Non-parametric tests [Chi-square]
26. Descriptive and inferential statistics
• Descriptive statistics are used to describe
characteristics of a population or sample.
• Inferential statistics are used to make
inferences about a population from a
sample of that population.
27. Sample statistics and population
parameters
• Sample statistics are variables in a sample or
measures computed from sample data.
• Population parameters are variables in a
population or measured characteristics of the
population.
• But, generally we do not know what these
population parameters are and that is why we
use samples.
28. Frequency distributions
• Frequency distribution involves a process of
recording the number of times a particular
value of a variable occurs.
• Percentage distribution is a distribution of
relative frequency.
• Probability is the long–run relative frequency
with which an event will occur.
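A frequency distribution and the matching percentage distribution can be produced with a simple count. The sketch below uses made-up answers to a "visits per month" question.

```python
from collections import Counter

# Hypothetical answers to "number of visits per month" (codes 0-6).
visits = [0, 1, 1, 2, 2, 2, 3, 6, 1, 2]

# Frequency distribution: how many times each value occurs.
frequency = Counter(visits)
# Percentage distribution: each frequency as a share of all responses.
percentage = {value: 100 * n / len(visits) for value, n in frequency.items()}

for value in sorted(frequency):
    print(f"{value}: {frequency[value]} ({percentage[value]:.0f}%)")
```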
30. Measures of central tendency
• Mean: arithmetic average
• Median: the midpoint
– The value below which half the values
in a distribution fall.
• Mode: the value that occurs most often.
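The three measures can give quite different answers for the same data. In the sketch below, with made-up amounts, one large value pulls the mean well above the median.

```python
from statistics import mean, median, mode

amounts = [4, 6, 6, 8, 10, 12, 45]  # hypothetical amounts spent, in dollars

print(mean(amounts))    # 13
print(median(amounts))  # 8
print(mode(amounts))    # 6
```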
31. Measures of dispersion
• The tendency of observations to depart from
the central tendency.
• Range: distance between the smallest and
largest values.
• Deviation scores: how far any observation is
from the mean.
– Average deviation
• Variance: measure of variability or dispersion
– Its square root is the standard deviation.
32. Measures of dispersion
• Standard deviation: quantitative index of a
distribution’s spread.
– Using square root of variance reverts to the
original measurement units.
33. The normal distribution
• A symmetrical, bell–shaped distribution that
describes the expected probability distribution
of many chance occurrences.
– More than 99% of its values (about 99.7%) are
within ±3 standard deviations of its mean.
34. The normal distribution
• Standardised normal distribution has:
– symmetry about its mean
– infinite number of cases
– total area under the curve (total probability)
equal to 1
– mean of 0 and standard deviation of 1.
• Standardised value = (Value to be transformed – Mean) / Standard deviation
$$Z = \frac{X - \mu}{\sigma}$$
35. An example of standardised value
• Toy manufacturer has mean sales of 9000 units and standard
deviation of 500 units.
• Wishes to know whether wholesalers will demand between 7500
and 9625 units.
$$Z = \frac{X - \mu}{\sigma} = \frac{7500 - 9000}{500} = -3.00$$
$$Z = \frac{X - \mu}{\sigma} = \frac{9625 - 9000}{500} = 1.25$$
• Referring to Table 12.8, we find that:
– When Z = –3.00, the area under the curve between the mean and Z = 0.499.
– When Z = 1.25, the area under the curve between the mean and Z = 0.394.
– The total area between the two Z values = 0.499 + 0.394 = 0.893.
– There is a 0.893 probability that sales will fall in that range.
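The table lookup can be cross-checked numerically. The sketch below assumes sales are normally distributed with the stated mean and standard deviation, and uses the upper bound of 9625 units that appears in the Z calculation. It relies on `statistics.NormalDist` from the Python standard library instead of a printed Z table.

```python
from statistics import NormalDist

mean_sales = 9000
sd_sales = 500
lower, upper = 7500, 9625

# Standardised values: Z = (X - mean) / standard deviation
z_lower = (lower - mean_sales) / sd_sales
z_upper = (upper - mean_sales) / sd_sales

# Probability that sales fall between the two bounds,
# taken as a difference of cumulative probabilities.
sales = NormalDist(mu=mean_sales, sigma=sd_sales)
probability = sales.cdf(upper) - sales.cdf(lower)

print(z_lower, z_upper, round(probability, 3))
```

The result agrees with the 0.893 obtained by adding the two table areas.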
37. Population, sample, and sampling
distribution
• Population distribution: a frequency
distribution of the elements of a population.
• Sample distribution: a frequency distribution
of a sample.
• Sampling distribution: a theoretical probability
distribution of sample means for all possible
samples of a certain size drawn from a
particular population.
38. Population, sample, and sampling
distribution
• Standard error of the mean: the standard
deviation of the sampling distribution.
• Sampling distribution is important because it
addresses the question of ‘What would
happen if we were to draw a large number of
samples, each having n elements, from a
specified population?’
40. Central–limit theorem
• Central–limit theorem states that as the
sample size increases, the distribution of the
mean of a random sample taken from
practically any population approaches a
normal distribution.
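A small simulation can make the theorem concrete. This is an illustrative sketch, not part of the original slides: the population is assumed to be exponential (strongly skewed, with mean 1 and standard deviation 1), and the sample sizes, number of draws, and seed are arbitrary choices.

```python
import random
import statistics

random.seed(42)  # fixed seed so the sketch is repeatable


def sample_means(n, draws=5000):
    """Means of `draws` random samples of size n from an exponential population."""
    return [
        statistics.fmean(random.expovariate(1.0) for _ in range(n))
        for _ in range(draws)
    ]


def skewness(values):
    """Average cubed standardised deviation; close to 0 for a symmetric distribution."""
    m = statistics.fmean(values)
    s = statistics.pstdev(values)
    return statistics.fmean(((v - m) / s) ** 3 for v in values)


means_small = sample_means(2)   # very small samples: still clearly skewed
means_large = sample_means(50)  # larger samples: close to bell-shaped

print(round(skewness(means_small), 2), round(skewness(means_large), 2))
print(round(statistics.fmean(means_large), 3), round(statistics.stdev(means_large), 3))
```

As n grows, the skewness of the sample means shrinks toward 0, their average stays near the population mean, and their spread approaches the standard error σ/√n (about 0.141 for n = 50).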
41. Confidence intervals
• A confidence interval estimate is based on
the knowledge that the population mean is
the sample mean plus or minus a small
sampling error.
– After calculating an interval estimate, we
can determine how probable it is that the
population mean will fall within this range
of statistical values.
• Confidence level is a percentage that
indicates the long–run probability that the
results will be correct.
42. Confidence intervals
• $\mu = \bar{X} \pm E$
where E = range of sampling error
• $E = Z_{c.l.} S_{\bar{X}}$
where $Z_{c.l.}$ = value of Z at a specified confidence level (c.l.) and
$S_{\bar{X}}$ = standard error of the mean
• $\mu = \bar{X} \pm Z_{c.l.} S_{\bar{X}}$
where $S_{\bar{X}} = \frac{S}{\sqrt{n}}$, S = standard deviation and n = sample size
• Thus, $\mu = \bar{X} \pm Z_{c.l.} \frac{S}{\sqrt{n}}$
43. An example of confidence intervals
• Sporting goods store caters to working women who golf.
• Survey of 100 respondents showed the mean age is 37.5 years and
standard deviation of 12.0 years.
• Wishes to be 95% confident that the interval estimate will include
the population parameter.
$$\mu = \bar{X} \pm Z_{c.l.} \frac{S}{\sqrt{n}} = 37.5 \pm Z_{c.l.} \frac{12.0}{\sqrt{100}}$$
• Including 95% of the area requires that 47.5% of the distribution
on each side be included.
• Referring to Table B.2 in Appendix B, we find that 0.475
corresponds to the Z-value 1.96. Thus:
$$\mu = 37.5 \pm (1.96)(1.2) = 37.5 \pm 2.352$$
• We can be 95% confident that µ lies in the range of 35.15 to 39.85 years.
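The same interval can be reproduced in a few lines. This is a sketch using the figures from the example (mean 37.5, standard deviation 12.0, n = 100, as implied by the √100 in the formula). The Z value comes from `statistics.NormalDist().inv_cdf` in place of the printed table; the variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

sample_mean = 37.5   # mean age from the survey
sample_sd = 12.0     # standard deviation from the survey
n = 100              # sample size
confidence = 0.95

# Z value that leaves (1 - confidence) / 2 in each tail; about 1.96 for 95%.
z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)

std_error = sample_sd / sqrt(n)  # standard error of the mean
margin = z * std_error           # E, the range of sampling error

lower = sample_mean - margin
upper = sample_mean + margin

print(round(z, 2), round(lower, 2), round(upper, 2))
```

Changing `confidence` to 0.99 widens the interval, which shows the trade-off between confidence level and precision.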
44. Frequency Distributions
• A count of the number of responses
associated with different values of the
variable
Where did you hear about VU's Open Day?

                             Frequency   Percent   Valid Percent   Cumulative Percent
Valid     Radio                     39      12.7            12.8                 12.8
          Newspaper                 29       9.4             9.5                 22.3
          Internet site             25       8.1             8.2                 30.5
          Friend/Relation           52      16.9            17.0                 47.5
          School                   160      51.9            52.5                100.0
          Total                    305      99.0           100.0
Missing   System                     3       1.0
Total                              308     100.0
45. Frequency Distributions cont.
Age of respondent

                             Frequency   Percent   Valid Percent   Cumulative Percent
Valid     18 or under              197      64.0            64.6                 64.6
          19 - 29                   71      23.1            23.3                 87.9
          Over 29                   37      12.0            12.1                100.0
          Total                    305      99.0           100.0
Missing   System                     3       1.0
Total                              308     100.0
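The columns of such a table can be derived from raw responses as follows. This is a sketch: the response list is rebuilt from the age-of-respondent counts in the table, with `None` standing in for the three missing answers, and the variable names are illustrative.

```python
from collections import Counter

# Raw responses rebuilt from the counts in the age table; None = missing answer.
responses = (
    ["18 or under"] * 197 + ["19 - 29"] * 71 + ["Over 29"] * 37 + [None] * 3
)

counts = Counter(r for r in responses if r is not None)
n_total = len(responses)        # all respondents, including missing
n_valid = sum(counts.values())  # respondents who answered

rows = []
cumulative = 0.0
for category in ["18 or under", "19 - 29", "Over 29"]:
    frequency = counts[category]
    percent = 100 * frequency / n_total        # base: all respondents
    valid_percent = 100 * frequency / n_valid  # base: valid answers only
    cumulative += valid_percent                # running total of valid percent
    rows.append(
        (category, frequency, round(percent, 1),
         round(valid_percent, 1), round(cumulative, 1))
    )

for row in rows:
    print(row)
```

"Percent" and "Valid Percent" differ only in the denominator: 308 respondents versus the 305 who gave an answer.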
46. Bar Chart Produced from Frequency
Distributions
[Bar chart for "The course offered": percentage of respondents in each importance category (Very important, Important, Of some importance, Of little importance, Of absolutely no importance). Bar labels shown: 38%, 34%, 18%, 6%, 4%.]
47. Frequencies for
Multiple Response Questions
• Example of a question using multiple-response
formatting
Q9. Which of the following people had an influence on your choice of university?
Parents 01
Friends 02
Ex-VU student 03
Teacher at high school 04
Careers teacher at high school 05
Colleagues 06
Other 07
48. Frequencies for Multiple Response
Questions
Influence on choice of university (Value tabulated = 1)

Dichotomy label                   Name   Count   Pct of Responses   Pct of Cases
Influenced by Parents             Q9A      420               26.4           42.3
Influenced by friends             Q9B      331               20.8           33.4
Influenced by student             Q9C      149                9.4           15.0
Teacher at high school            Q9D      158                9.9           15.9
Careers teacher at high school    Q9E      259               16.3           26.1
Colleagues                        Q9F       88                5.5            8.9
Other                             Q9G      184               11.6           18.5
Total responses                           1589              100.0          160.2
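The difference between "Pct of Responses" and "Pct of Cases" is only the denominator. The sketch below uses a tiny hypothetical data set of four respondents (not the survey data above) to show why the cases column can sum to more than 100%.

```python
from collections import Counter

# Hypothetical answers (not the survey data): each set holds the options
# one respondent ticked on a multiple-response question.
answers = [
    {"Parents", "Friends"},
    {"Parents"},
    {"Parents", "Teacher at high school", "Other"},
    {"Friends"},
]

counts = Counter(option for chosen in answers for option in chosen)
total_responses = sum(counts.values())  # every tick counts once
n_cases = len(answers)                  # number of respondents

# Share of all ticks: always sums to 100.
pct_of_responses = {k: 100 * v / total_responses for k, v in counts.items()}

# Share of respondents who ticked the option: can sum to more than 100.
pct_of_cases = {k: 100 * v / n_cases for k, v in counts.items()}

print(counts["Parents"], round(pct_of_responses["Parents"], 1), pct_of_cases["Parents"])
print(sum(pct_of_cases.values()))
```

In the slide's table the cases column totals 160.2%, meaning respondents named about 1.6 influences each on average.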
49. Statistics Associated with Frequency
Distributions: Measures of Location
• Mean
– ‘average’
• Mode
– The value that occurs most frequently.
– Most appropriate for categorical data.
• Median
– Middle value in the data set when the data are
arranged in ascending or descending order.
50. Mean, Mode and Median

                         Mean              Mode                                Median
Type of data             Interval, Ratio   Nominal, Ordinal, Interval, Ratio   Ordinal, Interval, Ratio
Influenced by outliers   Yes               No                                  No
51. Statistics Associated with Frequency
Distributions: Measures of Variability
• Range
– The difference between the largest and smallest
values of a distribution.
• Interquartile range
– The range of a distribution encompassing the
middle 50 percent of the observations.
• Variance and Standard deviation
– Variance is the mean squared deviation of all the
values from the mean. The standard deviation is
its square root: it measures the typical spread
(deviation) from the mean and is expressed in the
same units as the original observations.
• Coefficient of variation
– The standard deviation expressed as a
percentage of the mean.
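These measures can be sketched with the standard library. The `values` list is hypothetical. Quartile conventions differ between packages, so the interquartile range here reflects the default "exclusive" method of `statistics.quantiles`, and the sample standard deviation (n - 1 denominator) is assumed for the coefficient of variation.

```python
import statistics

# Hypothetical data (not from the slides).
values = [4, 7, 8, 10, 12, 15, 21]

# Range: largest minus smallest value.
value_range = max(values) - min(values)

# Interquartile range: spread of the middle 50% of observations.
q1, q2, q3 = statistics.quantiles(values, n=4)
iqr = q3 - q1

# Coefficient of variation: standard deviation as a percentage of the mean.
cv = 100 * statistics.stdev(values) / statistics.mean(values)

print(value_range, iqr, round(cv, 2))
```

Because the coefficient of variation is unit-free, it allows the spread of variables measured on different scales to be compared.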
57. Notes on writing up results
• Do not simply repeat the numbers in the table as
part of the discussion
• The discussion should focus on the patterns in the
data
• Percentages (rather than numbers) are more
generalisable to the population.
• However, keep in mind that because of sampling
error the percentage in the population will not
exactly match that of the sample
• We rarely care about the sample itself, except
for what it tells us about the population it is
supposed to represent