Statistics module 1

What are the bivariate PDF and the bivariate PMF?

1. Bivariate PDF: if (X, Y) are continuous random variables, then they have a bivariate PDF f(x, y).
2. Bivariate PMF: if (X, Y) are discrete random variables, then they have a bivariate PMF p(x, y).

What is a statistic? What are the two purposes of statistics? Name some statistics.

A statistic is a quantity computed from the sample data. It serves two purposes: descriptive (summarising properties of the data) and inferential (estimating parameters of the population).
1. Moment statistics
   > These are SAMPLE / EMPIRICAL versions of a rv's moments.
   > They can be viewed as descriptive statistics because they describe certain properties of the data.
   > They are also INFERENTIAL statistics: the sample mean x̄ is used to estimate the population mean µ.
   a) Sample mean: x̄ = (1/n) ∑ x_i, where i = 1 to n
   b) Sample variance: s² = (1/(n − 1)) ∑ (x_i − x̄)², where i = 1 to n
   c) Sample sd: s = √(s²)
2. Order statistics
   > kth order statistic x_(k)
   > Sample minimum x_(1)
   > Sample maximum x_(n)
   > Sample quantiles: sample median, 1st quartile, 3rd quartile, IQR
3. Frequency statistics
   > Another type of descriptive statistic
   > Sample / empirical pdf, pmf and cdf
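The moment statistics above can be computed directly from their formulas. The sketch below is only an illustration: the function names and the data values are made up, not taken from the course material.

```python
import math

def sample_mean(xs):
    """Sample mean: (1/n) * sum of x_i."""
    return sum(xs) / len(xs)

def sample_variance(xs):
    """Sample variance with the n - 1 divisor."""
    n = len(xs)
    if n < 2:
        raise ValueError("need at least two observations")
    xbar = sample_mean(xs)
    return sum((x - xbar) ** 2 for x in xs) / (n - 1)

def sample_sd(xs):
    """Sample standard deviation: square root of the sample variance."""
    return math.sqrt(sample_variance(xs))

data = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]  # made-up observations
xbar = sample_mean(data)      # 40 / 8
s2 = sample_variance(data)    # 32 / 7
s = sample_sd(data)
print(xbar, s2, s)
```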

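What is the difference between a percentile and a quantile?

1. A quantile is indexed by a probability p, written as a decimal between 0 and 1.
   » The p-quantile is denoted by π_p.
   » If we plot the outcomes of the rv X on the x-axis and the pdf f(x) on the y-axis, the area under the curve to the left of π_p is p.
   » p = Pr(X ≤ π_p) = F(π_p) = ∫ f(x) dx, integrated from −∞ to π_p
   » The quantile function is the inverse of the cdf: π_p = F⁻¹(p).
2. A percentile is the same quantity with the level written as a percentage: percentile level = 100 × p.
   » π_p is also called the 100p-th percentile, so the 50th percentile is the 0.5 quantile.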
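To see the "quantile is the inverse of the cdf" relationship numerically, here is a small sketch using the standard normal distribution from Python's `statistics` module. The choice of distribution and of p = 0.975 is an assumption made purely for illustration.

```python
from statistics import NormalDist

Z = NormalDist(mu=0.0, sigma=1.0)  # standard normal, chosen only as an example

p = 0.975
pi_p = Z.inv_cdf(p)      # quantile: the x with F(x) = p
p_back = Z.cdf(pi_p)     # applying the cdf recovers p

percentile_level = 100 * p  # the same point is the 97.5th percentile
print(pi_p, p_back, percentile_level)
```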

Practice example: 1. What are the type 7 and type 6 ways of calculating quantiles? 2. What are the symbols for the sample median, sample 1st quartile and sample 3rd quartile? 3. What is the IQR?

1. Type 7 and type 6 both estimate the pth sample quantile by an order statistic, π̂_p = x_(k), and differ only in what k means. There is not much difference between them when n is large; the difference becomes obvious when n is small.
   » Type 7 (the default in R): k = (n − 1)p + 1
   » Type 6: k = (n + 1)p
   » Here p is the quantile level, x_(k) is the kth order statistic of x, and π̂_p denotes the pth sample quantile.
   Example: how to calculate the sample median π̂_0.5 with n = 10 (type 7).
   a) Calculate k: k = 1 + (10 − 1)(0.5) = 5.5, so π̂_0.5 = x_(5.5).
   b) Use linear interpolation to calculate x_(5.5): x_(5.5) = x_(5) + 0.5 (x_(6) − x_(5)).
2. Symbols: sample median π̂_0.5, sample 1st quartile π̂_0.25, sample 3rd quartile π̂_0.75.
3. IQR: the difference between the 3rd and 1st sample quartiles, π̂_0.75 − π̂_0.25. About 50% of the sample lies in this range.
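The type 6 and type 7 rules can be sketched in a few lines. This is an illustrative implementation, not the course's or R's code: the helper names and data are made up, and clamping k to the range 1..n (needed for type 6 at extreme p) is an assumption about how to handle the edges.

```python
import math

def order_stat(sorted_xs, k):
    """x_(k) for a possibly fractional, 1-indexed k, by linear interpolation."""
    n = len(sorted_xs)
    k = min(max(k, 1.0), float(n))   # assumption: clamp k to [1, n]
    j = math.floor(k)                # integer part
    r = k - j                        # fractional part
    if j >= n:
        return sorted_xs[n - 1]
    return sorted_xs[j - 1] + r * (sorted_xs[j] - sorted_xs[j - 1])

def sample_quantile(xs, p, qtype=7):
    """pth sample quantile using the type 6 or type 7 definition of k."""
    xs = sorted(xs)
    n = len(xs)
    if qtype == 7:
        k = (n - 1) * p + 1
    elif qtype == 6:
        k = (n + 1) * p
    else:
        raise ValueError("only types 6 and 7 are sketched here")
    return order_stat(xs, k)

data = [3, 7, 8, 12, 13, 14, 18, 21, 25, 30]  # made-up sample, n = 10

median7 = sample_quantile(data, 0.5, qtype=7)   # k = 5.5
q1_7 = sample_quantile(data, 0.25, qtype=7)     # k = 3.25
q3_7 = sample_quantile(data, 0.75, qtype=7)     # k = 7.75
q1_6 = sample_quantile(data, 0.25, qtype=6)     # k = 2.75
q3_6 = sample_quantile(data, 0.75, qtype=6)     # k = 8.25
iqr7 = q3_7 - q1_7
iqr6 = q3_6 - q1_6
print(median7, q1_7, q3_7, iqr7, q1_6, q3_6, iqr6)
```

With only ten observations the two types agree on the median but give different quartiles, which matches the remark that the difference shows up for small n.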

What are the 6 descriptive statistics in R?

1. Sample minimum: x_(1)
2. 1st sample quartile: π̂_0.25
3. Sample median: π̂_0.5
4. Sample mean: x̄
5. 3rd sample quartile: π̂_0.75
6. Sample maximum: x_(n)
Note: these are sample statistics used to approximate the corresponding population quantities, which is why π̂_p (with a hat) denotes the pth sample quantile.
Remember: the population 25th and 75th percentiles are denoted by π_0.25 and π_0.75, and the population median by π_0.5.
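As a rough Python analogue of this six-number summary, the sketch below uses the standard `statistics` module. The function name and data are made up; `method="inclusive"` is used because it follows the same k = (n − 1)p + 1 rule as type 7.

```python
import statistics

def six_number_summary(xs):
    """Min, 1st quartile, median, mean, 3rd quartile, max of a sample."""
    q1, median, q3 = statistics.quantiles(xs, n=4, method="inclusive")
    return {
        "min": min(xs),
        "q1": q1,
        "median": median,
        "mean": statistics.fmean(xs),
        "q3": q3,
        "max": max(xs),
    }

data = [3, 7, 8, 12, 13, 14, 18, 21, 25, 30]  # made-up sample
summary = six_number_summary(data)
for name, value in summary.items():
    print(f"{name:>6}: {value}")
```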

Explain the concept of Normal QQ plots.

1. In a normal QQ (quantile-quantile) plot, we compare our data set with a normal distribution to see whether the data plausibly come from a normal distribution.
2. To do that, we assume the rv X has a normal distribution and aim to compare the sample quantiles x_(k) with the quantiles of that normal distribution. But a normal distribution has two parameters, µ and σ², and without any information about them its quantiles cannot be calculated. So we standardise the random variable: writing X = µ + σZ with Z standard normal, each quantile of X equals µ + σ × (the matching quantile of Z).
3. After plotting the sample quantiles against the theoretical standard normal quantiles, if the points lie close to a straight line with intercept µ and slope σ, the normal model is appropriate.
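The intercept-and-slope reading of a normal QQ plot can be checked numerically without drawing anything. The sketch below is a simulation under stated assumptions: the data are generated from a normal distribution with µ = 10 and σ = 2, the plotting positions (i − 0.5)/n are one common convention chosen here for simplicity, and the helper names are made up.

```python
import random
from statistics import NormalDist

def normal_qq_points(xs):
    """Pair standard normal quantiles with the sorted sample values."""
    n = len(xs)
    std_normal = NormalDist()
    # Assumption: plotting positions (i - 0.5) / n for i = 1..n.
    theoretical = [std_normal.inv_cdf((i - 0.5) / n) for i in range(1, n + 1)]
    return theoretical, sorted(xs)

def fit_line(us, vs):
    """Least-squares intercept and slope of vs against us."""
    n = len(us)
    ubar = sum(us) / n
    vbar = sum(vs) / n
    sxy = sum((u - ubar) * (v - vbar) for u, v in zip(us, vs))
    sxx = sum((u - ubar) ** 2 for u in us)
    slope = sxy / sxx
    return vbar - slope * ubar, slope

rng = random.Random(1)  # fixed seed so the run is repeatable
sample = [rng.gauss(10.0, 2.0) for _ in range(5000)]

theoretical, empirical = normal_qq_points(sample)
intercept, slope = fit_line(theoretical, empirical)
print(intercept, slope)  # expected to be near 10 and 2 respectively
```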

What does covariance measure?

Covariance measures the joint variability of two random variables.
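A small sketch of the sample version of covariance shows what "joint variability" means in practice: it is positive when the two variables move together and negative when they move in opposite directions. The function name and data are made up, and the n − 1 divisor is assumed by analogy with the sample variance.

```python
def sample_covariance(xs, ys):
    """Sample covariance: (1/(n-1)) * sum of (x_i - xbar)(y_i - ybar)."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two samples of equal size, at least 2")
    xbar = sum(xs) / n
    ybar = sum(ys) / n
    return sum((x - xbar) * (y - ybar) for x, y in zip(xs, ys)) / (n - 1)

x = [1, 2, 3, 4, 5]
y_up = [2, 4, 6, 8, 10]     # increases with x
y_down = [10, 8, 6, 4, 2]   # decreases as x increases

cov_up = sample_covariance(x, y_up)
cov_down = sample_covariance(x, y_down)
print(cov_up, cov_down)
```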

What is Data visualisation?

Data visualisation deals with different ways of graphically summarising data from random variables.
1. Box plot: a convenient way of comparing data from two different groups.
2. Scatter plot: for comparing data from two variables (usually continuous) to see whether they are related.
3. Histograms and smoothed pdfs: for continuous rvs.
4. QQ plots: for comparing the similarity of two probability distributions.
5. Normal QQ plots: for comparing a data set with a normal distribution.

What does empirical mean?

Empirical means derived from data.

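What is the formula for calculating an ORDER STATISTIC?

For a fractional index k, write k = j + r, where j is the integer part and 0 ≤ r < 1.
1st way (linear interpolation): x_(k) = x_(j) + r (x_(j+1) − x_(j))
2nd way (linear combination): x_(k) = (1 − r) x_(j) + r x_(j+1)
Both ways give the same value.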

What is the Law of Large Numbers (LLN)? What is the Central Limit Theorem (CLT)?

Law of Large Numbers:
» If X1, ..., Xn are iid rvs, then as n gets larger the sample average gets closer and closer to the true (population) mean µ.
Central Limit Theorem:
» As we collect more data points (n gets larger), the distribution of the STANDARDISED sample mean, (X̄ − µ) / (σ/√n), converges to the standard normal distribution.
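Both statements can be illustrated with a short simulation. The sketch below assumes exponential data with rate 1 (so µ = 1 and σ = 1); that distribution, the sample sizes, and the seed are arbitrary choices for illustration only.

```python
import math
import random

rng = random.Random(2024)  # fixed seed so the run is repeatable
MU = 1.0      # mean of an exponential with rate 1
SIGMA = 1.0   # standard deviation of an exponential with rate 1

def sample_mean_exp(n):
    """Average of n iid exponential(rate=1) draws."""
    return sum(rng.expovariate(1.0) for _ in range(n)) / n

# LLN: the sample mean settles near MU as n grows.
lln_means = {n: sample_mean_exp(n) for n in (10, 1_000, 100_000)}

def standardised_mean(n):
    """(X-bar - mu) / (sigma / sqrt(n)) for one simulated sample of size n."""
    return (sample_mean_exp(n) - MU) / (SIGMA / math.sqrt(n))

# CLT: standardised means should behave roughly like standard normal draws,
# so about 95% of them should fall within +/- 1.96.
z_values = [standardised_mean(50) for _ in range(2_000)]
coverage = sum(abs(z) <= 1.96 for z in z_values) / len(z_values)

print(lln_means)
print(coverage)
```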

Why is the MGF so useful?

The MGF uniquely determines the distribution. Hence, knowing the MGF is the same as knowing the distribution. That means two random variables with the same MGF follow the same distribution.

Explain the concept of QQ plots.

Suppose we observe two data sets:
1. x1, ..., xn
2. y1, ..., yn
We want to compare the distributions of these two data sets to see whether they come from populations with the same distribution. If they do, they will have approximately the same quantiles, so plotting the sample quantiles of one data set against those of the other should give points close to the line y = x.
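For two samples of the same size n, the QQ comparison reduces to pairing the order statistics. The sketch below is a minimal illustration with made-up data and a made-up helper name; it computes the points that would be plotted rather than drawing them.

```python
def qq_points(xs, ys):
    """Pair the order statistics of two equal-sized samples: (x_(i), y_(i))."""
    if len(xs) != len(ys):
        raise ValueError("this sketch assumes samples of equal size")
    return list(zip(sorted(xs), sorted(ys)))

xs = [5.1, 2.3, 9.8, 4.4, 7.0, 3.6]   # made-up data set 1
ys = [3.9, 7.2, 2.0, 10.1, 4.6, 5.0]  # made-up data set 2

points = qq_points(xs, ys)
# Largest vertical distance of any point from the line y = x.
max_gap = max(abs(x - y) for x, y in points)

for x, y in points:
    print(x, y)
print(max_gap)
```

Small gaps relative to the spread of the data suggest the two samples have similar quantiles.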

Tail Probability

Pr(X > x) = 1 − F(x) is called the tail probability of X. It is also called the survival function.
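Given any cdf F, the tail probability is one subtraction away. The sketch below uses a standard normal cdf and a hand-written exponential cdf as examples; both choices, the rate 0.5, and the helper names are assumptions for illustration.

```python
import math
from statistics import NormalDist

def tail_probability(cdf, x):
    """Pr(X > x) = 1 - F(x) for a given cdf F."""
    return 1.0 - cdf(x)

def exp_cdf(x, rate=0.5):
    """cdf of an exponential distribution with the given rate (assumed)."""
    return 1.0 - math.exp(-rate * x) if x >= 0 else 0.0

normal_tail = tail_probability(NormalDist().cdf, 1.96)  # roughly 0.025
exp_tail = tail_probability(exp_cdf, 2.0)               # equals exp(-1)
print(normal_tail, exp_tail)
```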

If X and Y are two independent random variables, what is the MGF of the sum?

The MGF of the sum is the product of the individual MGFs: M_{X+Y}(t) = M_X(t) × M_Y(t).
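This product rule can be verified numerically for small discrete distributions. The sketch below assumes two independent Bernoulli variables with made-up success probabilities; it builds the pmf of the sum by convolution and compares its MGF with the product of the individual MGFs.

```python
import math

def mgf(pmf, t):
    """M(t) = E[exp(tX)] for a discrete pmf given as {value: probability}."""
    return sum(prob * math.exp(t * value) for value, prob in pmf.items())

def pmf_of_sum(pmf_x, pmf_y):
    """pmf of X + Y when X and Y are independent (discrete convolution)."""
    out = {}
    for x, px in pmf_x.items():
        for y, py in pmf_y.items():
            out[x + y] = out.get(x + y, 0.0) + px * py
    return out

pmf_x = {0: 0.7, 1: 0.3}  # Bernoulli with p = 0.3 (made-up)
pmf_y = {0: 0.4, 1: 0.6}  # Bernoulli with p = 0.6 (made-up)
pmf_s = pmf_of_sum(pmf_x, pmf_y)

for t in (-1.0, 0.0, 0.5, 2.0):
    lhs = mgf(pmf_s, t)                   # MGF of the sum
    rhs = mgf(pmf_x, t) * mgf(pmf_y, t)   # product of the MGFs
    print(t, lhs, rhs)
```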

How are these distributions denoted this semester? Bernoulli, Binomial, Poisson, Uniform, Exponential

X ~ Be(p)       # Bernoulli
X ~ Bi(n, p)    # Binomial
X ~ Pn(λ)       # Poisson
X ~ Unif(a, b)  # Continuous uniform distribution
» If a = 0 and b = 1, this is known as the uniform distribution over the UNIT interval, also called the standard uniform distribution.
X ~ Exp(λ)      # Exponential
Memoryless property of the exponential distribution:
» Pr(X > y + x | X > y) = Pr(X > x)
» Given that X has already exceeded y, the probability that it exceeds y by a further x equals the unconditional probability Pr(X > x). The current probability is not influenced by what happened in the past.
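The memoryless property can be checked from the exponential survival function. The sketch below assumes λ is a rate, so that Pr(X > x) = exp(−λx); the values of λ, x and y are arbitrary illustrations.

```python
import math

def exp_survival(x, lam):
    """Pr(X > x) for an exponential rv, assuming lam is the rate."""
    return math.exp(-lam * x)

lam, x, y = 0.5, 2.0, 3.0  # made-up values

# Pr(X > y + x | X > y) = Pr(X > y + x) / Pr(X > y)
conditional = exp_survival(y + x, lam) / exp_survival(y, lam)
unconditional = exp_survival(x, lam)

print(conditional, unconditional)
```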

How are the pdf and pmf denoted? How is the sample space of a discrete rv denoted in this chapter?

pdf: X ~ f
pmf: X ~ p
The sample space for a discrete rv is denoted by Ω (omega), the set of all possible outcomes. (In Elements of Probability, Ω ≈ S_

