This document discusses statistical analysis and data science concepts. It covers descriptive statistics like mean, median, mode, and standard deviation. It also discusses inferential statistics including hypothesis testing, confidence intervals, and linear regression. Additionally, it discusses probability distributions, random variables, and the normal distribution. Key concepts are defined and examples are provided to illustrate statistical measures and probability calculations.
2. Managerial Decisions
• How many programmers should I staff for?
• What is the right level of inventory for our new product manufacturing?
• Where should we open our new retail store?
• What will next year's revenue be?
• Are we on the right or wrong track?
• How much should I invest in advertising?
5. Descriptive statistics
Descriptive statistics uses numerical and graphical methods to look for patterns in a data set, to summarize the information it reveals, and to present that information in a convenient form.
• Average
• Spread
• Range
• Frequency
• Histogram
• Mode
• Scatter Plot
• Interquartile Range
6. Inferential statistics
• Hypothesis test
• Z-score
• ANOVA
• Confidence interval
• Margin of error
• Ordinary least squares
• t-test
• F-test
7. Types of Data
Type of Data | Definition | Example
Nominal | The categories are in no logical order and have no particular relationship | Your previous degree
Ordinal | Can be ranked/ordered but not measured | College rankings
Interval scale | Set of numerical measurements in which the distance between numbers is of a known, constant size | Temperature in Celsius
Ratio scale | Ratios are meaningful | Sales of a new product

Source of Data | Definition | Example
Observational | Analyst does not control the data generation process | Stock returns on BSE
Experimental | Analyst has good control over data generation | Clinical trials for drug efficacy
8. A Few Examples
Classify the type of data in each case:
1. The length of time until a pain reliever begins to work.
2. Ranking of racers in MotoGP.
3. The number of colors used in a statistics textbook.
4. The brand of refrigerator in a home.
5. The overall satisfaction rating of a new car.
6. The number of files on a computer’s hard disk.
7. The pH level of the water in a swimming pool.
8. The number of staples in a stapler.
9. Population & Sample
Population: A collection, or set, of individuals, objects, or events whose properties are to be analyzed. Typically, there are too many experimental units in a population to consider every one.
Sample: A subset of the population.
10. Measure of Central Tendency
Mode: The value in the data that occurs most frequently
Mean: The average of a given set of numbers
Sample mean: $\bar{x} = \frac{1}{n} \sum_{i=1}^{n} x_i$
Population mean: $\mu = \frac{1}{N} \sum_{i=1}^{N} x_i$
Percentiles: The pth percentile of a group of numbers is the value below which p% of the numbers in the group lie.
Position of the pth percentile in the sorted data = (n+1)p/100, where n is the number of data points
Median: The 50th percentile
Quartiles: The percentiles that break the distribution of the data into quarters: the 1st quartile (25th percentile) and the 3rd quartile (75th percentile)
Interquartile Range (IQR): Difference between the 3rd and 1st quartiles
value Frequency
18 4
19 1
20 3
21 1
22 2
23 2
24 1
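As a rough illustration, the value/frequency table above can be expanded into raw observations and summarized with Python's standard library. The helper name `percentile` and the linear interpolation between neighbouring ranks are assumptions; the slide only gives the position rule (n+1)p/100.

```python
from statistics import mean, mode

# Frequency table from the slide: value -> frequency
freq = {18: 4, 19: 1, 20: 3, 21: 1, 22: 2, 23: 2, 24: 1}

# Expand the table into a sorted list of raw observations
data = sorted(value for value, count in freq.items() for _ in range(count))

def percentile(sorted_values, p):
    """Value at 1-based position (n+1)p/100, interpolating between ranks (assumed)."""
    n = len(sorted_values)
    pos = min(max((n + 1) * p / 100, 1), n)  # clamp to the valid ranks 1..n
    lower = int(pos)                         # rank at or just below the position
    frac = pos - lower                       # fractional part of the position
    if lower == n:
        return float(sorted_values[-1])
    below, above = sorted_values[lower - 1], sorted_values[lower]
    return below + frac * (above - below)

sample_mean = mean(data)
sample_mode = mode(data)
median = percentile(data, 50)
q1, q3 = percentile(data, 25), percentile(data, 75)
iqr = q3 - q1

print(sample_mean, sample_mode, median, q1, q3, iqr)
```

Other software may use a different interpolation rule for quartiles, so Q1 and Q3 can differ slightly between tools.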
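12. Measure of Variability
Range: Difference between the largest and smallest numbers in a given data set
Variance: The average squared deviation of the data points from their mean
Sample variance: $S^2 = \frac{1}{n-1} \sum_{i=1}^{n} (x_i - \bar{x})^2$, where $\bar{x}$ is the sample mean
Population variance: $\sigma^2 = \frac{1}{N} \sum_{i=1}^{N} (x_i - \mu)^2$, where $\mu$ is the population mean
Standard Deviation: Square root of the variance of the data set
Sample sd: $S = \sqrt{S^2}$
Population sd: $\sigma = \sqrt{\sigma^2}$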
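A minimal sketch of these measures, reusing the value/frequency table from the central-tendency slide as example data. It relies on the standard-library `statistics` module, where `variance` and `stdev` divide by n - 1 (sample) and `pvariance` and `pstdev` divide by N (population).

```python
from statistics import pstdev, pvariance, stdev, variance

# Value -> frequency table from the central-tendency slide
freq = {18: 4, 19: 1, 20: 3, 21: 1, 22: 2, 23: 2, 24: 1}
data = [value for value, count in freq.items() for _ in range(count)]

data_range = max(data) - min(data)   # largest minus smallest value
sample_var = variance(data)          # divides by n - 1
population_var = pvariance(data)     # divides by N
sample_sd = stdev(data)              # square root of the sample variance
population_sd = pstdev(data)         # square root of the population variance

print(data_range, sample_var, population_var, sample_sd, population_sd)
```

The sample variance is always a little larger than the population variance for the same numbers, because it divides by n - 1 rather than N.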
14. Histogram
• A histogram is a chart made of bars, where the height of each bar represents the frequency of values
• Frequencies can be absolute (counts) or relative
• Relative frequency is the count of data points in a bar divided by the total number of data points
15. Boxplot
A boxplot is a graphical display of the five-number summary of the distribution of the data: minimum, first quartile, median, third quartile, and maximum
16. Skewness
Skewness is a measure of the degree of asymmetry of a frequency distribution
17. Kurtosis
Kurtosis is a measure of the peakedness of a distribution
Kurtosis of the normal distribution is 3
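One common way to compute these two measures is through central moments: skewness as m3 / m2^1.5 and kurtosis as m4 / m2^2. This is only one of several conventions (an assumption here, since the slides give no formula), but it is the one under which the normal distribution has kurtosis 3. The function names and the small data sets are hypothetical.

```python
def central_moment(values, k):
    """k-th central moment: the average of (x - mean)**k over the data."""
    m = sum(values) / len(values)
    return sum((x - m) ** k for x in values) / len(values)

def skewness(values):
    """Moment-based skewness: 0 for symmetric data, > 0 for a long right tail."""
    return central_moment(values, 3) / central_moment(values, 2) ** 1.5

def kurtosis(values):
    """Moment-based (non-excess) kurtosis: equals 3 for a normal distribution."""
    return central_moment(values, 4) / central_moment(values, 2) ** 2

symmetric = [1, 2, 3, 4, 5]            # hypothetical symmetric data
right_skewed = [1, 1, 1, 2, 2, 3, 10]  # hypothetical data with a long right tail

print(skewness(symmetric), skewness(right_skewed), kurtosis(symmetric))
```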
18. Random Variable
• What is a random variable?
• How do we summarize a random variable?
• How do we pictorially represent a probability distribution?
19. Random Variable
A random variable describes the probabilities for an uncertain future numerical outcome of a random process
It is a variable that can take on several possible values
It is random because there is some chance associated with each possible value
Random variables are of 2 types
• Discrete
• Continuous
20. Probability Distribution
• Probability
o Long-run relative frequency of a random event occurring
o Different from subjective beliefs
• A probability distribution is a rule that identifies the possible outcomes of a random variable and assigns a probability to each
• A discrete distribution has a finite (or countable) number of possible values
o E.g. face value of a card, number of students in a class
• A continuous distribution can take any value in some range
o E.g. salaries per month, temperature in a month
21. PDF & CDF of Random Variable
The PDF (probability distribution function) of a discrete random variable x is the relative frequency distribution of x. It is a graph, table, or formula that gives the possible values of x and the probability p(x) associated with each value.
For all x, the PDF must satisfy $0 \le p(x) \le 1$ and $\sum_{x} p(x) = 1$
The CDF (cumulative distribution function) F(x) of a discrete random variable is
$F(x) = P(X \le x) = \sum_{i \le x} P(i)$
X p(X=x) F(x)
0 0.1 0.1
1 0.2 0.3
2 0.3 0.6
3 0.2 0.8
4 0.1 0.9
5 0.1 1.00
Total 1.00
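The two conditions on p(x) and the running sum that defines F(x) can be checked with a short sketch. The dictionary name `pmf` is an assumption; the probabilities are the ones in the table above.

```python
from itertools import accumulate
from math import isclose

# p(X = x) from the table above
pmf = {0: 0.1, 1: 0.2, 2: 0.3, 3: 0.2, 4: 0.1, 5: 0.1}

# Condition 1: every probability lies between 0 and 1
all_valid = all(0 <= p <= 1 for p in pmf.values())
# Condition 2: the probabilities sum to 1 (up to floating-point rounding)
sums_to_one = isclose(sum(pmf.values()), 1.0)

# F(x) = P(X <= x) is the running total of p(x)
cdf = dict(zip(pmf, accumulate(pmf.values())))

for x in pmf:
    print(x, pmf[x], round(cdf[x], 2))
```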
22. Example
Toss a fair coin three times and define x = number of heads.
Outcome Probability x
HHH 1/8 3
HHT 1/8 2
HTH 1/8 2
THH 1/8 2
HTT 1/8 1
THT 1/8 1
TTH 1/8 1
TTT 1/8 0
P(x = 0) = 1/8
P(x = 1) = 3/8
P(x = 2) = 3/8
P(x = 3) = 1/8
x p(x)
0 1/8
1 3/8
2 3/8
3 1/8
[Figure: Probability histogram for x]
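The distribution of x in this example can be reproduced by listing all eight equally likely outcomes and counting heads. This is a sketch using exact fractions; the variable names are assumptions.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

# All 2**3 equally likely outcomes of three tosses, e.g. "HHT"
outcomes = ["".join(tosses) for tosses in product("HT", repeat=3)]

# Count how many outcomes give each number of heads
heads_count = Counter(outcome.count("H") for outcome in outcomes)

# p(x) = (number of outcomes with x heads) / (total number of outcomes)
pmf = {x: Fraction(count, len(outcomes)) for x, count in sorted(heads_count.items())}

for x, p in pmf.items():
    print(x, p)
```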
23. Quick exercise
A card is chosen at random from a deck of cards:
• What is the probability of getting an ace?
• What is the probability of getting a card less than 3?
What is the probability of getting 1 head if I toss 2 unbiased coins?
What is the probability of getting 2 heads if I toss 3 unbiased coins?
24. An Example
X p(X=x)
0 0.4
1 0.25
2 0.2
3 0.05
4 0.1
• X is the daily sales of TVs at a store
• What is the probability of a sale?
• What is the probability of selling at least three TVs?
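One way to answer the two questions, assuming "a sale" means that at least one TV is sold on a given day (the slide does not define it). The name `tv_pmf` is hypothetical; the probabilities come from the table above.

```python
# p(X = x) for the daily TV sales from the table above
tv_pmf = {0: 0.4, 1: 0.25, 2: 0.2, 3: 0.05, 4: 0.1}

# P(a sale) = P(X >= 1) = 1 - P(X = 0), using the complement rule
p_sale = 1 - tv_pmf[0]

# P(X >= 3) = P(X = 3) + P(X = 4)
p_at_least_three = sum(p for x, p in tv_pmf.items() if x >= 3)

print(round(p_sale, 2), round(p_at_least_three, 2))
```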
25. Expected Value or Mean
• The expected value or mean (µ) of a random variable is the weighted average of its values
‒ The probabilities serve as weights
‒ $E(x) = \sum_{i=1}^{n} x_i \, p(X = x_i)$
• What is the mean number of TVs sold per day?
• What does this imply?
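A short sketch of the weighted average for the TV sales distribution given in the earlier example. The function name `expected_value` is an assumption.

```python
# p(X = x) for the daily TV sales example
tv_pmf = {0: 0.4, 1: 0.25, 2: 0.2, 3: 0.05, 4: 0.1}

def expected_value(pmf):
    """Weighted average of the values, with the probabilities as weights."""
    return sum(x * p for x, p in pmf.items())

mean_tvs = expected_value(tv_pmf)
print(round(mean_tvs, 2))
```

The result is a long-run average per day, so it need not be a value that X can actually take.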
26. Variance and Standard Deviation
• Both are measures of the variation or uncertainty in a random variable
• Variance (σ²): The weighted average of the squared deviations from the mean
‒ Probabilities serve as weights
‒ $\sigma^2(x) = \sum_{i=1}^{n} (x_i - \mu)^2 \, p(X = x_i) = E[(x - \mu)^2]$
‒ Units are the square of the units of the variable
‒ Another way: Var(X) = E(X²) − [E(X)]²
• Standard Deviation (σ): Square root of the variance
‒ Has the same units as the variable
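Both routes to the variance can be compared on the TV sales distribution from the earlier example. This is a sketch only; the variable names are assumptions.

```python
from math import sqrt

# p(X = x) for the daily TV sales example
tv_pmf = {0: 0.4, 1: 0.25, 2: 0.2, 3: 0.05, 4: 0.1}

# Mean: probability-weighted average of the values
mu = sum(x * p for x, p in tv_pmf.items())

# Definition: weighted average of squared deviations from the mean
var_definition = sum((x - mu) ** 2 * p for x, p in tv_pmf.items())

# Shortcut: E(X^2) - [E(X)]^2
mean_of_squares = sum(x ** 2 * p for x, p in tv_pmf.items())
var_shortcut = mean_of_squares - mu ** 2

# Standard deviation, in the same units as X
sd = sqrt(var_definition)

print(round(var_definition, 4), round(var_shortcut, 4), round(sd, 4))
```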
27. Sum of Random Variables
Let X1 and X2 be two random variables with means µ1 and µ2 and standard deviations σ1 and σ2, and suppose Y = aX1 + bX2
‒ What is the mean of Y?
E[Y] = aE[X1] + bE[X2]
‒ What is the standard deviation of Y?
If X1 and X2 are independent, Var(Y) = a²Var(X1) + b²Var(X2), so SD(Y) = √Var(Y)
• Independent: When the value taken by one random variable does not affect the value taken by the other random variable
‒ E.g. rolls of 2 dice
• Dependent: When the value of one random variable gives us more information about the other random variable
‒ E.g. height and weight of students
28. Example
Let X1 and X2 be the outcomes associated with a toss of a pair of dice
E(X1)=E(X2)=3.5
SD(X1)=SD(X2)=1.708
Compute the following:
E(X1+X2)=
SD(X1+X2)=
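One way to check the answers is to list all 36 equally likely outcomes of two fair, independent dice and compare the result with the rules for sums of independent random variables. The variable names are assumptions.

```python
from itertools import product
from math import sqrt

faces = range(1, 7)

# Mean and variance of a single fair die
mean_die = sum(faces) / 6
var_die = sum((x - mean_die) ** 2 for x in faces) / 6

# All 36 equally likely sums X1 + X2
sums = [x1 + x2 for x1, x2 in product(faces, repeat=2)]
mean_sum = sum(sums) / len(sums)
var_sum = sum((s - mean_sum) ** 2 for s in sums) / len(sums)
sd_sum = sqrt(var_sum)

# For independent dice: E(X1 + X2) = 2 * E(X1), Var(X1 + X2) = 2 * Var(X1)
print(mean_sum, 2 * mean_die)
print(round(var_sum, 4), round(2 * var_die, 4), round(sd_sum, 3))
```

Note that the standard deviations themselves do not add: SD(X1+X2) is smaller than SD(X1) + SD(X2).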
29. The Empirical Rule
For data that are approximately normally distributed (bell-shaped):
• Approximately 68% of the data points will be within 1 standard deviation of the mean
• Approximately 95% of the data points will be within 2 standard deviations of the mean
• Approximately 99.7% (almost all) will lie within 3 standard deviations of the mean
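For a normal distribution, these percentages can be recovered from the cumulative distribution function. A minimal sketch using `statistics.NormalDist` from the Python standard library (Python 3.8 or later); the dictionary name `within_k` is an assumption.

```python
from statistics import NormalDist

standard_normal = NormalDist()  # mean 0, standard deviation 1

# P(-k <= Z <= k) = F(k) - F(-k) for k = 1, 2, 3 standard deviations
within_k = {k: standard_normal.cdf(k) - standard_normal.cdf(-k) for k in (1, 2, 3)}

for k, prob in within_k.items():
    print(f"within {k} sd: {prob:.4f}")
```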
30. Normal distribution
• The graph of the PDF is a bell-shaped curve
• The normal random variable takes values from -∞ to +∞
• It is symmetric and centered around the mean (which is also the median and the mode)
• Any normal distribution can be specified with just 2 parameters: the mean (µ) and the standard deviation (σ)
• We write this as X~N(µ,σ²)
32. Probability Calculation for Continuous Distribution
• The probability associated with any single value of the random variable is always zero
• Probability of values being in a range = area under the PDF curve in that range
• Area under the entire curve always equals 1
33. Z-scores, Standard Normal Distribution
For every value x of the random variable X, we calculate its z-score: $z = \frac{x - \mu}{\sigma}$
Interpretation: How many standard deviations away from the mean is the value?
If X~N(µ,σ²), then
‒ Z-scores have a normal distribution with µ=0 and σ=1
‒ i.e. Z~N(0,1)
‒ This is the standard normal distribution
• Inverse Transformation
‒ X = µ + zσ
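The z-score z = (x − µ)/σ and the inverse transformation undo each other, which a short sketch can confirm. The function names and the values µ = 100, σ = 15 are hypothetical, chosen only for illustration.

```python
def z_score(x, mu, sigma):
    """Number of standard deviations that x lies from the mean."""
    return (x - mu) / sigma

def from_z(z, mu, sigma):
    """Inverse transformation: recover x from its z-score."""
    return mu + z * sigma

mu, sigma = 100.0, 15.0          # hypothetical mean and standard deviation
z = z_score(130.0, mu, sigma)    # 130 is two standard deviations above the mean
x_back = from_z(z, mu, sigma)    # maps the z-score back to the original value

print(z, x_back)
```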
34. Probability Calculation for Normal Distribution
• Consider a normal distribution X~N(µ,σ²)
• Methods to calculate P(X ≤ x)
‒ Use R: pnorm(x, µ, σ)
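For readers working in Python rather than R, a rough counterpart of `pnorm(x, µ, σ)` is `statistics.NormalDist(mu, sigma).cdf(x)` from the standard library (Python 3.8 or later). The wrapper name `normal_cdf` and the values µ = 100, σ = 15 are hypothetical.

```python
from statistics import NormalDist

def normal_cdf(x, mu, sigma):
    """P(X <= x) for X ~ N(mu, sigma^2)."""
    return NormalDist(mu, sigma).cdf(x)

mu, sigma = 100.0, 15.0  # hypothetical parameters

# Direct calculation of P(X <= 130)
p_direct = normal_cdf(130.0, mu, sigma)

# Same probability via the z-score and the standard normal distribution
z = (130.0 - mu) / sigma
p_via_z = NormalDist().cdf(z)

print(round(p_direct, 4), round(p_via_z, 4))
```

Both routes give the same answer, which is why a single standard normal table is enough for any normal distribution.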