8.1: Distributions to Estimate Population Parameters
The exact values of population parameters such as the population proportion, population mean, and population standard deviation (or variance) are usually unknown. They are fixed values that can only be estimated from data collected from samples. The corresponding estimates are the sample proportion, the sample mean, and the sample standard deviation (or variance). Each of these sample statistics varies from sample to sample and therefore has its own distribution. To estimate a population parameter, that distribution must be related to a specific probability distribution, which supplies the values needed to build an interval around the estimate.
When certain conditions are met, such as a large sample size (generally more than 30), random and unbiased sampling, a normally distributed population, and normally distributed sample data, estimating population parameters becomes straightforward. However, these conditions can neither be assumed for a given sample nor be achieved in every study. In such cases, estimation requires other distributions.
To estimate the population proportion from the sample proportion, the z distribution and the z table are used. The sample data need not follow the standard normal distribution, but the distribution of the sample proportion should be at least approximately symmetric and normal. The sample proportion serves as the point estimate of the population proportion, and z scores are then used to construct confidence intervals around it.
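As a minimal sketch of this procedure, the following Python code computes a point estimate and a normal-approximation confidence interval for a proportion. The function name, the counts (120 successes in 400 trials), and the 95% confidence level are illustrative assumptions, not values from this section; the critical z value is taken from Python's standard library instead of a printed z table.

```python
from math import sqrt
from statistics import NormalDist


def proportion_confidence_interval(successes, n, confidence=0.95):
    """Normal-approximation interval for a population proportion."""
    p_hat = successes / n  # sample proportion = point estimate
    # Two-sided critical value, e.g. about 1.96 for 95% confidence
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin = z * sqrt(p_hat * (1 - p_hat) / n)  # z times the standard error
    return p_hat, p_hat - margin, p_hat + margin


# Hypothetical data: 120 successes observed in a sample of 400
p_hat, low, high = proportion_confidence_interval(successes=120, n=400)
print(f"point estimate = {p_hat:.3f}, 95% CI = ({low:.3f}, {high:.3f})")
```

With these assumed counts the point estimate is 0.300 and the interval runs from roughly 0.255 to 0.345. The approximation is only reasonable when the sample is large enough for the sample proportion to be approximately normally distributed.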
The z distribution can also be used to estimate the population mean, but this requires prior knowledge of the population standard deviation (or variance). The sample mean is the point estimate of the population mean, and z scores are used to construct a confidence interval around it at the desired confidence level.
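The case of a known population standard deviation can be sketched in the same way. The sample values, the assumed population standard deviation of 1.5, and the function name below are hypothetical and chosen only for illustration.

```python
from math import sqrt
from statistics import NormalDist, mean


def z_interval_for_mean(sample, sigma, confidence=0.95):
    """Confidence interval for the mean when the population SD is known."""
    x_bar = mean(sample)  # sample mean = point estimate
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)  # two-sided critical value
    margin = z * sigma / sqrt(len(sample))  # z times the standard error
    return x_bar, x_bar - margin, x_bar + margin


# Hypothetical measurements; sigma = 1.5 is assumed to be known in advance
sample = [52.1, 49.8, 50.6, 51.3, 48.9, 50.2, 49.5, 51.0, 50.4]
x_bar, low, high = z_interval_for_mean(sample, sigma=1.5)
print(f"point estimate = {x_bar:.2f}, 95% CI = ({low:.2f}, {high:.2f})")
```

Note that the interval width depends only on the known standard deviation, the sample size, and the confidence level, not on the spread observed in the sample itself.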
In most realistic situations, the population standard deviation needed to estimate the population mean is not known a priori. In such cases, estimation of the population mean is based on the Student t distribution, with the sample standard deviation used in place of the population value. The t distribution is symmetric and bell-shaped like the standard normal distribution, but it has heavier tails. Its shape (how flat or peaked it is) changes with the degrees of freedom (that is, with the sample size), and it approaches the standard normal distribution as the degrees of freedom increase. The Student t distribution is particularly useful when the sample size is below 30.
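A small sketch of a t-based interval follows. Because Python's standard library has no t distribution, the critical values are hard-coded from a standard t table (two-sided, 95% confidence) for a few degrees of freedom only; the dictionary, the function name, and the sample data are assumptions made for this example.

```python
from math import sqrt
from statistics import mean, stdev

# Two-sided 95% critical values from a standard t table, keyed by
# degrees of freedom. Only a short excerpt is included here.
T_CRITICAL_95 = {4: 2.776, 9: 2.262, 14: 2.145, 29: 2.045}


def t_interval_for_mean(sample):
    """95% confidence interval for the mean when the population SD is unknown."""
    n = len(sample)
    t_crit = T_CRITICAL_95[n - 1]  # raises KeyError if df is not in the excerpt
    x_bar = mean(sample)           # point estimate of the population mean
    s = stdev(sample)              # sample standard deviation (divides by n - 1)
    margin = t_crit * s / sqrt(n)
    return x_bar, x_bar - margin, x_bar + margin


# Hypothetical small sample (n = 10, so 9 degrees of freedom)
sample = [12.1, 11.6, 12.4, 11.9, 12.0, 12.3, 11.8, 12.2, 11.7, 12.0]
x_bar, low, high = t_interval_for_mean(sample)
print(f"point estimate = {x_bar:.2f}, 95% CI = ({low:.2f}, {high:.2f})")
```

For this assumed sample the interval is roughly 11.82 to 12.18. The critical value 2.262 is larger than the corresponding z value of about 1.96, so the t interval is wider, which reflects the extra uncertainty from estimating the standard deviation from a small sample.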
Estimating the population standard deviation (or variance) requires the chi-square distribution, which is not symmetric but skewed to the right. The skew changes with the degrees of freedom (that is, with the sample size): it decreases as the degrees of freedom increase, and the distribution is approximately normal at sample sizes above about 90. The chi-square distribution makes it possible to estimate the population standard deviation (or variance) even from small samples.
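The sketch below shows one common way to build such an interval, assuming the sample comes from a normally distributed population. The lower and upper chi-square critical values are hard-coded from a standard chi-square table (95% confidence) for a few degrees of freedom; the dictionary, the function name, and the data are hypothetical.

```python
from math import sqrt
from statistics import variance

# (lower 2.5% point, upper 97.5% point) from a standard chi-square table,
# keyed by degrees of freedom. Only a short excerpt is included here.
CHI2_CRITICAL_95 = {4: (0.484, 11.143), 9: (2.700, 19.023), 19: (8.907, 32.852)}


def chi2_interval_for_variance(sample):
    """95% confidence interval for the population variance (normal population assumed)."""
    df = len(sample) - 1
    chi2_low, chi2_high = CHI2_CRITICAL_95[df]  # KeyError if df is not in the excerpt
    s2 = variance(sample)                       # sample variance (divides by n - 1)
    # Dividing by the larger critical value gives the lower limit, and vice versa
    return s2, df * s2 / chi2_high, df * s2 / chi2_low


# Hypothetical small sample (n = 10, so 9 degrees of freedom)
sample = [12.1, 11.6, 12.4, 11.9, 12.0, 12.3, 11.8, 12.2, 11.7, 12.0]
s2, var_low, var_high = chi2_interval_for_variance(sample)
print(f"sample variance = {s2:.4f}, 95% CI = ({var_low:.4f}, {var_high:.4f})")
print(f"95% CI for the SD = ({sqrt(var_low):.3f}, {sqrt(var_high):.3f})")
```

For this assumed sample the interval for the standard deviation is roughly 0.18 to 0.47. Because the chi-square distribution is skewed, the sample variance does not sit at the center of the interval; the upper limit lies much farther from it than the lower limit.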