7.8: Sample Size Calculation
Determining the sample size is one of the first steps in planning random sampling or an experiment. The sample size is the total number of units, observations, or (in some cases) groups from which data are collected to estimate a population parameter. As the name suggests, it is the size of the sample drawn from the population, not the size of the population itself.
The sample size is fundamental to any study design because it determines the effort, time, funding, and other resources the study will require. It should not be chosen arbitrarily: the precision of the estimates and the choice of statistical test often depend on it, and results from an arbitrarily chosen sample size are difficult to interpret. Too small a sample yields imprecise estimates and can lead to unreliable or wrong conclusions, whereas too large a sample consumes more resources than necessary and can be challenging to handle when the data are analyzed.
Although choosing a sample size may sound complicated, an appropriate value for a given population parameter can be estimated in a simple way. The sample size, denoted n (the population size is denoted N), is obtained by rearranging the formula for the margin of error. When an estimate of the proportion is available, for example from a pilot sample or an earlier study, that point estimate is used. When no estimate is available, the proportion can be assumed to be 0.5, which gives the largest and therefore most conservative sample size. The sample size can be estimated in a similar way when the parameter of interest is the population mean, using an estimate of the population variance.
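The rearrangement of the margin-of-error formula can be written out as follows. This is a sketch that assumes the usual normal approximation and a large population. The symbols E (margin of error), z_{α/2} (critical value for the chosen confidence level), p̂ (estimated proportion), and σ (population standard deviation) are notation introduced here for illustration and do not appear in the text above.

```latex
% Proportion: solve the margin-of-error formula for n
E = z_{\alpha/2}\sqrt{\frac{\hat{p}\,(1-\hat{p})}{n}}
\quad\Longrightarrow\quad
n = \frac{z_{\alpha/2}^{2}\,\hat{p}\,(1-\hat{p})}{E^{2}}

% Mean: the same rearrangement with the standard deviation
E = z_{\alpha/2}\,\frac{\sigma}{\sqrt{n}}
\quad\Longrightarrow\quad
n = \left(\frac{z_{\alpha/2}\,\sigma}{E}\right)^{2}

% Worked example: 95% confidence (z \approx 1.96), unknown proportion (\hat{p} = 0.5), E = 0.05
n = \frac{(1.96)^{2}\,(0.5)(0.5)}{(0.05)^{2}} = \frac{0.9604}{0.0025} = 384.16
\quad\Longrightarrow\quad n = 385 \text{ (rounded up)}
```

The result is always rounded up to the next whole number, because rounding down would give a margin of error slightly larger than the one requested.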
The sample size depends largely on the pre-decided significance level (or, equivalently, the confidence level), the distribution and variability of the data, and the pre-decided margin of error, which is commonly set between 0.03 and 0.05. When the population is large relative to the sample, the required sample size depends not on the population size but on the desired confidence level and the margin of error. The margin of error and the confidence level should be chosen based on the study question, the hypothesis, the amount of variation, the availability of samples, the accessibility of the population, and the resources or effort available.
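The effect of the confidence level and the margin of error on the sample size can be explored with a short script. The following is a minimal sketch in Python, assuming the normal approximation and a large population. The function names `z_critical`, `sample_size_proportion`, and `sample_size_mean` are hypothetical helpers written for this illustration and are not part of any library.

```python
from math import ceil
from statistics import NormalDist


def z_critical(confidence: float) -> float:
    """Two-sided standard normal critical value for a confidence level."""
    alpha = 1.0 - confidence  # significance level
    return NormalDist().inv_cdf(1.0 - alpha / 2.0)


def sample_size_proportion(margin: float, confidence: float = 0.95,
                           p: float = 0.5) -> int:
    """Sample size needed to estimate a proportion within +/- margin.

    p defaults to 0.5, the conservative choice when no estimate exists.
    """
    z = z_critical(confidence)
    return ceil(z ** 2 * p * (1.0 - p) / margin ** 2)


def sample_size_mean(sigma: float, margin: float,
                     confidence: float = 0.95) -> int:
    """Sample size needed to estimate a mean within +/- margin.

    sigma is an assumed (or pilot-study) population standard deviation.
    """
    z = z_critical(confidence)
    return ceil((z * sigma / margin) ** 2)


if __name__ == "__main__":
    # Compare common confidence levels and margins of error.
    for confidence in (0.90, 0.95, 0.99):
        for margin in (0.03, 0.05):
            n = sample_size_proportion(margin, confidence)
            print(f"confidence={confidence:.2f}, margin={margin:.2f}: n={n}")
```

At 95% confidence with an assumed proportion of 0.5, this gives n = 385 for a margin of error of 0.05 and n = 1068 for a margin of 0.03. A tighter margin of error or a higher confidence level increases the required sample size, and the population size N does not enter the calculation.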