What is statistics?
What is probability theory?
What is descriptive statistics?
What is inferential statistics?
------------------------------------------------------------
Basic definitions
Measures of non-central tendency
Measures of central tendency
Measures of dispersion
Moments
Graphical representation of information
Histograms
36. Measures of central tendency in MS EXCEL. Note that MATLAB and MS EXCEL use different algorithms to compute the mode.
38. Measures of central tendency in MATLAB. Note that MATLAB and MS EXCEL use different algorithms to compute the mode.
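The slides do not say how the two algorithms differ. One way they can disagree is the tie-breaking rule when several values share the highest frequency. The Python sketch below contrasts two such rules: "smallest tied value", which is what MATLAB's `mode` documents, and "first tied value in data order", which is only an assumption here about how a spreadsheet might behave. The function names and the data are hypothetical.

```python
from collections import Counter


def mode_smallest(values):
    """Most frequent value; ties are resolved by taking the smallest one."""
    counts = Counter(values)
    top = max(counts.values())
    return min(v for v, c in counts.items() if c == top)


def mode_first_seen(values):
    """Most frequent value; ties are resolved by order of appearance."""
    counts = Counter(values)
    top = max(counts.values())
    for v in values:
        if counts[v] == top:
            return v


# Hypothetical data: 5 and 2 both appear twice
data = [5, 2, 5, 2, 9]
print(mode_smallest(data), mode_first_seen(data))
```

With tied data the two rules return different values, so it is worth checking which rule a tool applies before comparing results across tools.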
43. Sample variance and sample standard deviation. Always use these formulas to compute the variance and the standard deviation unless stated otherwise. See: http://en.wikipedia.org/wiki/Variance http://en.wikipedia.org/wiki/Standard_deviation
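The formulas on this slide are not reproduced in the transcript. Assuming they are the usual sample definitions with the n - 1 denominator, a minimal Python sketch looks like this; the function names and the data are hypothetical.

```python
import math


def sample_variance(values):
    """Sample variance: sum of squared deviations divided by n - 1."""
    n = len(values)
    if n < 2:
        raise ValueError("need at least two observations")
    mean = sum(values) / n
    return sum((v - mean) ** 2 for v in values) / (n - 1)


def sample_std(values):
    """Sample standard deviation: square root of the sample variance."""
    return math.sqrt(sample_variance(values))


# Hypothetical data, mean 5 and squared deviations summing to 32
data = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]
print(sample_variance(data), sample_std(data))
```

Dividing by n instead of n - 1 gives the population variance, which is smaller for the same data.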
74. Histograms with MS EXCEL
MS EXCEL 2003: http://www.bloggpro.com/creating-histograms-in-excel/
MS EXCEL 2007: http://www.bloggpro.com/creating-a-simple-histogram-in-excel-2007/
Or search the web: http://www.google.com/search?q=histograms+excel+2007
76. 80th percentile. The Y axis is in units of frequency only.
88. Histogram of travel time (US census, 2000). The area under the curve equals the number of cases, 124 million. This diagram plots the count/width values from the table.
90. Relative frequency histogram of travel time (US census, 2000). The area under the curve equals 1. This diagram plots the count/total/width values from the table.
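The two scalings named on these slides, count/width and count/total/width, can be checked with a few lines of Python. This is a sketch on hypothetical class intervals and counts, not the census table: summing height times width recovers the total count in the first case and 1 in the second.

```python
# Hypothetical class intervals (edges) and counts, not the census data
edges = [0, 5, 10, 20, 30, 60]
counts = [10, 30, 40, 15, 5]

widths = [hi - lo for lo, hi in zip(edges, edges[1:])]
total = sum(counts)

# Bar heights under the two scalings
freq_density = [c / w for c, w in zip(counts, widths)]         # count/width
rel_density = [c / total / w for c, w in zip(counts, widths)]  # count/total/width

# Area = sum of height * width over all bars
area_counts = sum(h * w for h, w in zip(freq_density, widths))
area_relative = sum(h * w for h, w in zip(rel_density, widths))

print(area_counts, area_relative)
```

Dividing by the width matters whenever the class intervals are unequal, as in this example; otherwise wide classes would look taller than they should.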
Sturges, H. A. (1926). "The choice of a class interval". Journal of the American Statistical Association: 65–66.
Scott, David W. (1979). "On optimal and data-based histograms". Biometrika 66 (3): 605–610. doi:10.1093/biomet/66.3.605.
Freedman, David; Diaconis, P. (1981). "On the histogram as a density estimator: L_2 theory". Zeitschrift für Wahrscheinlichkeitstheorie und verwandte Gebiete 57 (4): 453–476.
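The slide only cites these papers. The Python sketch below uses the forms in which the three rules are commonly quoted, which is an assumption about what the slide intended: Sturges gives a number of bins, ceil(log2 n) + 1, while Scott and Freedman-Diaconis give a bin width, 3.49 s n^(-1/3) and 2 IQR n^(-1/3) respectively. The function names and the data are hypothetical, and the quartiles come from Python's `statistics.quantiles` with its default method, so other tools may report a slightly different IQR.

```python
import math
import statistics


def sturges_bins(n):
    """Number of bins by Sturges' rule: ceil(log2(n)) + 1."""
    return math.ceil(math.log2(n)) + 1


def scott_width(values):
    """Bin width by Scott's rule: 3.49 * s * n^(-1/3), s = sample std."""
    n = len(values)
    return 3.49 * statistics.stdev(values) * n ** (-1 / 3)


def freedman_diaconis_width(values):
    """Bin width by the Freedman-Diaconis rule: 2 * IQR * n^(-1/3)."""
    n = len(values)
    q1, _, q3 = statistics.quantiles(values, n=4)
    return 2 * (q3 - q1) * n ** (-1 / 3)


# Hypothetical data
data = [1, 2, 3, 4, 5, 6, 7, 8]
print(sturges_bins(len(data)), scott_width(data), freedman_diaconis_width(data))
```

A bin width can be turned into a bin count by dividing the data range by the width and rounding up.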
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
% 2006 Author Hideaki Shimazaki
% Department of Physics, Kyoto University
% shimazaki at ton.scphys.kyoto-u.ac.jp
% Please feel free to use/modify this program.
%
% Data: the duration for eruptions of
% the Old Faithful geyser in Yellowstone National Park (in minutes)
clear all;
x = [4.37 3.87 4.00 4.03 3.50 4.08 2.25 4.70 1.73 4.93 1.73 4.62 ...
     3.43 4.25 1.68 3.92 3.68 3.10 4.03 1.77 4.08 1.75 3.20 1.85 ...
     4.62 1.97 4.50 3.92 4.35 2.33 3.83 1.88 4.60 1.80 4.73 1.77 ...
     4.57 1.85 3.52 4.00 3.70 3.72 4.25 3.58 3.80 3.77 3.75 2.50 ...
     4.50 4.10 3.70 3.80 3.43 4.00 2.27 4.40 4.05 4.25 3.33 2.00 ...
     4.33 2.93 4.58 1.90 3.58 3.73 3.73 1.82 4.63 3.50 4.00 3.67 ...
     1.67 4.60 1.67 4.00 1.80 4.42 1.90 4.63 2.93 3.50 1.97 4.28 ...
     1.83 4.13 1.83 4.65 4.20 3.93 4.33 1.83 4.53 2.03 4.18 4.43 ...
     4.07 4.13 3.95 4.10 2.27 4.58 1.90 4.50 1.95 4.83 4.12];

x_min = min(x);
x_max = max(x);

N_MIN = 4;                  % Minimum number of bins (integer)
                            % N_MIN must be more than 1 (N_MIN > 1).
N_MAX = 50;                 % Maximum number of bins (integer)

N = N_MIN:N_MAX;            % # of Bins
D = (x_max - x_min) ./ N;   % Bin Size Vector
C = zeros(size(N));         % Preallocate the cost vector

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
% Computation of the Cost Function
for i = 1:length(N)
    edges = linspace(x_min, x_max, N(i)+1);   % Bin edges

    ki = histc(x, edges);                     % Count # of events in bins
    ki = ki(1:end-1);                         % Drop the extra last entry (values equal to x_max)

    k = mean(ki);                             % Mean of event count
    v = sum( (ki-k).^2 ) / N(i);              % Variance of event count

    C(i) = ( 2*k - v ) / D(i)^2;              % The Cost Function
end

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
% Optimal Bin Size Selection
[Cmin, idx] = min(C);
optD = D(idx);                                % *Optimal bin size
edges = linspace(x_min, x_max, N(idx)+1);     % Optimal segmentation

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
% Display an Optimal Histogram and the Cost Function
% Note: hist treats a vector second argument as bin centers.
subplot(1,2,1); hist(x, edges); axis square;
subplot(1,2,2); plot(D, C, 'k.', optD, Cmin, 'r*'); axis square;
As an example we consider data collected by the U.S. Census Bureau on time to travel to work (2000 census, Ref. Table 2). The census found that there were 124 million people who work outside of their homes. Respondents tend to round the travel times they report, which is a common phenomenon when collecting data from people.
This histogram shows the number of cases per unit interval as the height of each bar, so that the area of each bar equals the number of people in the survey who fall into that category.
This histogram differs from the first only in the vertical scale. The area of each bar is the fraction of the total that its category represents, and the total area of all bars equals 1. The curve displayed is a simple density estimate. This version shows proportions and is also known as a unit area histogram. In other words, a histogram represents a frequency distribution by means of rectangles whose widths represent class intervals and whose areas are proportional to the corresponding frequencies. The bars are drawn next to one another only to make the data easier to compare.