This document provides an overview of statistics concepts for physical education. It discusses:
- Types of data including qualitative, quantitative, discrete, and continuous variables.
- Levels of measurement including nominal, ordinal, interval, and ratio scales.
- Descriptive statistics for summarizing data such as measures of central tendency and variation.
- Inferential statistics for making inferences about populations based on samples using techniques like hypothesis testing.
- Key aspects of hypothesis testing including the null and alternative hypotheses, p-values, and types of errors.
Statistics is the study of the collection, analysis, interpretation, presentation, and organization of data.
The word "statistics" appears to be derived from the Latin word ‘status’, the Italian word ‘statista’, or the German word ‘Statistik’, all of which refer to a political state.
Facts expressed numerically are called statistics, such as data on income or on the heights and weights of a class.
However, mere facts or aggregates of facts cannot be called statistics.
For example, the numbers 151, 182, 169, 158, 162, 148, etc. on their own are not statistics.
But once we state that these numbers are the heights of students in a particular class, they become statistics.
3. 1-3
Session Flow
• Data
• Types of Data
• Various measurements of data
• Data Analysis :
– Descriptive
– Inferential Statistics
4. 1-4
What is Data?
Data is a collection of facts or information from which
conclusions may be drawn
5. 1-5
Types of Data
A. Qualitative or Attribute data - the characteristic being studied is
nonnumeric.
E.g.: gender, religious affiliation, state of birth, country represented, words, images,
videos
B. Quantitative data - information is reported numerically.
E.g.: time (in seconds) for a 400 m race, prize money won by a tennis player, or number of
boundaries scored in a match.
6. 1-6
Quantitative Variables - Classifications
Quantitative variables can be classified as either discrete or
continuous.
A. Discrete variables can only assume certain values.
EXAMPLE: the number of goals in a football match, or the number of wickets taken by a bowler in a cricket match
(1, 2, 3, …, etc.)
B. Continuous variables can assume any value within a
specified range.
EXAMPLE: the height of an athlete or the weight of a boxer.
8. 1-8
Data Collection
The 5 W’s of data collection are:
1. What data is to be collected?
2. From whom is the data to be collected?
3. Who will collect the data?
4. From where will the data be collected?
5. When is the data collected?
9. 1-9
Data Collection Methods
Primary data collection method
• Involves data collection directly from the subjects by the researcher or a
trained data collector.
• e.g. surveys, interviews, observations, etc.
Secondary data collection method
• Involves the use of data that were collected for purposes
other than the current research.
• e.g. diaries, nurses' notes, care plans, patient medication records, statistical
abstracts, census reports, and other published or unpublished data
12. 1-12
Ordinal-Level Data
Properties:
• Data classifications are represented by
sets of labels or names (high, medium,
low) that have relative values.
• Because of the relative values, the
data classified can be ranked or
ordered.
13. 1-13
Interval-Level Data
Properties:
• Data classifications are ordered
according to the amount of the
characteristic they possess.
• Equal differences in the
characteristic are represented
by equal differences in the
measurements.
Example: Women’s dress sizes listed on the table.
14. 1-14
Ratio-Level Data
Practically all quantitative data is recorded on the ratio level of
measurement.
Ratio level is the “highest” level of measurement.
Properties:
• Data classifications are ordered according to the amount of the characteristics they
possess.
• Equal differences in the characteristic are represented by equal differences in the
numbers assigned to the classifications.
• The zero point is the absence of the characteristic and the ratio between two
numbers is meaningful.
15. 1-15
Four Levels of Measurement
Nominal level - data that is classified into categories
and cannot be arranged in any particular order.
EXAMPLES: eye color, gender, religious affiliation.
Ordinal level – data arranged in some order, but the
differences between data values cannot be
determined or are meaningless.
EXAMPLE: During a taste test of 4 soft drinks, Thums Up was ranked
number 1, Sprite number 2, Seven-up number 3, and Fanta
number 4.
Interval level - similar to the ordinal level, with the
additional property that meaningful amounts of
differences between data values can be determined.
There is no natural zero point.
EXAMPLES: Temperature on the Fahrenheit scale, garment size,
Likert scale.
Ratio level - the interval level with an inherent zero starting
point. Differences and ratios are meaningful for this
level of measurement.
EXAMPLES: Monthly income of surgeons, or distance traveled by
manufacturer’s representatives per month.
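One way to keep the four levels apart is to record which operations are meaningful at each one. The sketch below is illustrative only: the `MEANINGFUL` table and the `is_meaningful` helper are hypothetical names, and the entries simply restate the properties listed in this deck (the mode for categories, the median once an order exists, differences and the mean from the interval level up, ratios only at the ratio level).

```python
# Hypothetical lookup table: operations that are meaningful at each level
# of measurement, restating the properties listed on the slides.
MEANINGFUL = {
    "nominal": {"mode"},
    "ordinal": {"mode", "median"},
    "interval": {"mode", "median", "mean", "difference"},
    "ratio": {"mode", "median", "mean", "difference", "ratio"},
}


def is_meaningful(level, operation):
    """Return True if `operation` is meaningful for data at `level`."""
    return operation in MEANINGFUL[level]


# Fahrenheit temperature is interval-level: differences are meaningful,
# but ratios are not, because there is no natural zero point.
print(is_meaningful("interval", "difference"))
print(is_meaningful("interval", "ratio"))
```

Each level keeps everything allowed at the level before it and adds one more property, which is why the ratio level is described as the "highest" level.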
17. 1-17
Why Know the Level of Measurement of Data?
• The level of measurement of the data dictates the
calculations that can be done to summarize and
present the data.
• It also determines the statistical tests that should be
performed on the data.
19. 1-19
Types of Analysis
• Descriptive statistics uses the data to provide
descriptions of the population, either through
numerical calculations or graphs or tables.
• Inferential statistics makes inferences and
predictions about a population based on a sample
of data taken from the population in question.
20. 1-20
Descriptive Statistics
Summarizing Data:
– Central Tendency (or Groups’ “Middle Values”)
• Mean
• Median
• Mode
– Variation (or Summary of Differences Within Groups)
• Range
• Interquartile Range
• Variance
• Standard Deviation
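The measures listed above can be computed with Python's standard `statistics` module. This is a minimal sketch rather than a prescribed method: it reuses the six heights quoted in the introduction, the `describe` helper is a hypothetical name, and the quartiles use the module's default "exclusive" method (other conventions give slightly different interquartile ranges).

```python
import statistics

# The six heights quoted in the introduction (units are not stated there).
heights = [151, 182, 169, 158, 162, 148]


def describe(values):
    """Return common measures of central tendency and variation."""
    # Quartiles with the module's default "exclusive" method.
    q1, _, q3 = statistics.quantiles(values, n=4)
    return {
        "mean": statistics.mean(values),
        "median": statistics.median(values),
        "modes": statistics.multimode(values),    # every most-frequent value
        "range": max(values) - min(values),
        "iqr": q3 - q1,
        "variance": statistics.variance(values),  # sample variance (n - 1)
        "std_dev": statistics.stdev(values),      # sample standard deviation
    }


summary = describe(heights)
for name, value in summary.items():
    print(f"{name}: {value}")
```

Because all six heights are distinct, every value ties as a mode here, which shows why the mode is more informative for categorical or repeated values than for continuous measurements.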
21. 1-21
Choosing summary statistics
Which average and measure of spread?
Data type | Average | Measure of spread
Scale, normally distributed | Mean | Standard deviation
Scale, skewed data | Median | Interquartile range
Categorical, ordinal | Median | Interquartile range
Categorical, nominal | Mode | None
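The choices on this slide can be written as a small lookup. The function below is a sketch under stated assumptions: `choose_summary` and its argument names are hypothetical, and whether scale data counts as normally distributed is passed in by the caller rather than tested.

```python
def choose_summary(data_type, normally_distributed=True):
    """Return (average, measure of spread) for a data type, per the slide.

    data_type is "scale", "ordinal" or "nominal"; normally_distributed
    only matters for scale data.
    """
    if data_type == "scale":
        if normally_distributed:
            return ("mean", "standard deviation")
        return ("median", "interquartile range")  # skewed scale data
    if data_type == "ordinal":
        return ("median", "interquartile range")
    if data_type == "nominal":
        return ("mode", None)  # no measure of spread for nominal data
    raise ValueError(f"unknown data type: {data_type!r}")


print(choose_summary("scale", normally_distributed=False))
print(choose_summary("nominal"))
```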
22. 1-22
Which graph?
1st variable | Only 1 variable | 2nd variable: Scale | 2nd variable: Categorical
Scale | Histogram | Scatter plot | Box-plot
Categorical | Pie / Bar chart | Box-plot | Stacked / multiple bar chart
23. 1-23
Flow chart of commonly used descriptive statistics and graphical illustrations
Exploring data
• Descriptive statistics
– Categorical data: Frequency; Percentage (Row, Column or Total)
– Continuous data: Measure of location: Mean; Median
– Continuous data: Measure of variation: Standard deviation; Range (Min, Max); Inter-quartile range (LQ, UQ)
• Graphical illustrations
– Categorical data: Bar chart; Clustered bar charts (two categorical variables)
– Continuous data: Histogram (can be plotted against a categorical variable); Box & Whisker plot (can be plotted against a categorical variable); Dot plot (can be plotted against a categorical variable); Scatter plot (two continuous variables)
28. 1-28
Parametric or Non-parametric?
• Parametric tests are restricted to data that:
1) show a normal distribution
2) are independent of one another
3) are on the same continuous scale of measurement
• Non-parametric tests are used on data that:
1) show a distribution other than normal
2) are dependent or conditional on one another
3) in general, do not have a continuous scale of measurement
31. 1-31
Parametric & Non-Parametric tests
Purpose of test | Parametric test | Non-parametric test
Compare central value (mean / median) with a specific value | One-sample t / Z test | Wilcoxon signed-rank
Compare central values of two independent samples | Two-sample t / Z test | Mann-Whitney
Compare central values of two dependent samples | Paired t test | Wilcoxon signed-rank
Compare central values of three or more samples (one variable) | One-way ANOVA | Kruskal-Wallis
Compare central values of three or more samples (two variables) | Two-way ANOVA | Friedman
Compare independence of two categorical variables | Chi-square
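Several of the non-parametric tests listed above work from the ordering of the observations rather than from their actual values. As an illustration, the sketch below computes only the Mann-Whitney U statistic by pairwise comparison; it does not compute a p-value, the function name `mann_whitney_u` is hypothetical, and the sprint times are made-up data. In practice a statistics package would normally be used for the full test.

```python
def mann_whitney_u(sample_a, sample_b):
    """Return (U_a, U_b) for two independent samples.

    U_a counts the pairs (a, b) in which a is larger than b; a tie
    contributes 0.5. U_a + U_b always equals len(a) * len(b).
    """
    u_a = 0.0
    for a in sample_a:
        for b in sample_b:
            if a > b:
                u_a += 1.0
            elif a == b:
                u_a += 0.5
    u_b = len(sample_a) * len(sample_b) - u_a
    return u_a, u_b


# Hypothetical 100 m sprint times (seconds) for two training groups.
group_a = [11.2, 11.5, 11.9, 12.1]
group_b = [11.8, 12.3, 12.4, 12.6]
print(mann_whitney_u(group_a, group_b))  # (2.0, 14.0)
```

A U value close to 0 for one group means its observations are almost all smaller than the other group's, regardless of how the values are distributed.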
32. 1-32
p-value
The p-value is the probability that the test statistic would take a value as extreme as, or more extreme than, the observed test statistic when H0 is true.
33. 1-33
p-value
• Smaller and smaller p-values → stronger and stronger evidence against H0
• For typical analysis, using the standard α = 0.05
cutoff, the null hypothesis is
- rejected when p ≤ 0.05 and
- not rejected when p > 0.05.
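The cutoff rule on this slide amounts to a single comparison. A minimal sketch, in which the function name `decide` and the returned strings are hypothetical choices:

```python
def decide(p_value, alpha=0.05):
    """Apply the cutoff rule: reject H0 when p <= alpha."""
    if not 0.0 <= p_value <= 1.0:
        raise ValueError("a p-value must lie between 0 and 1")
    return "reject H0" if p_value <= alpha else "do not reject H0"


print(decide(0.03))  # reject H0
print(decide(0.05))  # reject H0 (the boundary counts as rejection)
print(decide(0.20))  # do not reject H0
```

Note that the second outcome is "do not reject H0", not "accept H0": a large p-value means the sample gives no strong evidence against the null hypothesis.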
34. 1-34
Hypothesis
• A hypothesis is a statement regarding a
characteristic of one or more populations.
• Hypothesis testing is a procedure, based on
sample evidence and probability, used to test
statements regarding a characteristic of one or
more populations.
35. 1-35
Steps of Hypothesis Testing
• Define the null hypothesis
• Decide the alternative hypothesis
• Calculate the test statistic
• Find the table (critical) value of the test statistic for the chosen level of
significance
• Draw the conclusion by comparing the test statistic with its
table value
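The steps above can be traced in a short worked example. This is a sketch under stated assumptions, not a general recipe: it uses a two-tailed one-sample Z test, the population standard deviation is assumed to be known, the numbers are hypothetical, and the table value is taken from the standard normal distribution via `statistics.NormalDist` instead of a printed table.

```python
from math import sqrt
from statistics import NormalDist


def one_sample_z_test(sample_mean, mu0, sigma, n, alpha=0.05):
    """Two-tailed one-sample Z test using the table-value approach."""
    # Steps 1 and 2: H0: mu = mu0 versus H1: mu != mu0.
    # Step 3: calculate the test statistic.
    z = (sample_mean - mu0) / (sigma / sqrt(n))
    # Step 4: table (critical) value for the chosen level of significance.
    z_critical = NormalDist().inv_cdf(1 - alpha / 2)
    # Step 5: conclude by comparing the statistic with the table value.
    reject = abs(z) > z_critical
    return z, z_critical, reject


# Hypothetical data: 36 athletes with mean height 165, tested against a
# claimed population mean of 160 with known standard deviation 12.
z, z_crit, reject = one_sample_z_test(sample_mean=165, mu0=160, sigma=12, n=36)
print(f"z = {z:.2f}, table value = {z_crit:.2f}, reject H0: {reject}")
```

Here z = 2.50 exceeds the table value of about 1.96, so H0 is rejected at the 5% level of significance.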
36. 1-36
Null Hypothesis
The null hypothesis, denoted H0, is a statement to
be tested.
The null hypothesis is a statement of no change, no
effect or no difference and is assumed true until
evidence indicates otherwise.
38. 1-38
Different Alternative Hypothesis
1. Equal versus not equal (two-tailed test)
H0: parameter = some value
H1: parameter ≠ some value
2. Equal versus less than (one-tailed, left-tailed test)
H0: parameter = some value
H1: parameter < some value
3. Equal versus greater than (one-tailed, right-tailed test)
H0: parameter = some value
H1: parameter > some value
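The choice of alternative hypothesis decides which tail (or tails) of the distribution counts as "extreme". The sketch below shows this for a Z statistic only; the function name `z_p_value` and the labels "two-sided", "less" and "greater" are hypothetical, and the value z = 2.5 is an illustrative input.

```python
from statistics import NormalDist


def z_p_value(z, alternative="two-sided"):
    """Return the p-value of a Z statistic for the chosen alternative."""
    cdf = NormalDist().cdf  # standard normal cumulative distribution
    if alternative == "two-sided":  # H1: parameter != some value
        return 2 * (1 - cdf(abs(z)))
    if alternative == "less":       # H1: parameter < some value (left tail)
        return cdf(z)
    if alternative == "greater":    # H1: parameter > some value (right tail)
        return 1 - cdf(z)
    raise ValueError(f"unknown alternative: {alternative!r}")


for alt in ("two-sided", "less", "greater"):
    print(f"{alt}: p = {z_p_value(2.5, alt):.4f}")
```

The same test statistic can therefore lead to different conclusions depending on the alternative, which is why H1 should be fixed before the data are examined.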