7.8: Sample Size Calculation

JoVE Core
Statistics

In statistics, the sample size, denoted as n, can refer to the number of observations or the number of replicates.

In the example of the survey for legal protection of rainforests, the total number of respondents—10,000—is the sample size. However, this is an arbitrarily decided number.

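To determine the sample size, for instance, when planning a new survey to estimate a proportion, already known data can be used in the following rearranged form of the margin-of-error equation, where z is the critical value, p̂ is the sample proportion, q̂ = 1 - p̂, and E is the margin of error:

$$n = \frac{z_{\alpha/2}^{2}\,\hat{p}\,\hat{q}}{E^{2}}$$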

Here, the sample proportion of 0.85 from the known sample and the critical value of 1.96 for a 95% confidence level are used. The margin of error is fixed in advance at 3% in this example, but it can be chosen between 2% and 5%.

Solving for n gives approximately 544.2, which is rounded up to the next whole number, 545.
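
The arithmetic above can be checked with a short Python sketch. It assumes the usual rearranged margin-of-error formula for a proportion, n = z² · p̂ · (1 - p̂) / E², and the function name `sample_size_for_proportion` is a hypothetical helper chosen for illustration, not something defined in the lesson.

```python
import math

def sample_size_for_proportion(p_hat, margin, z=1.96):
    """Smallest n whose margin of error for a proportion is at most `margin`.

    Assumes n = z^2 * p_hat * (1 - p_hat) / margin^2 (hypothetical helper).
    """
    q_hat = 1.0 - p_hat
    n_exact = (z ** 2) * p_hat * q_hat / margin ** 2
    # Round up: rounding down would leave the margin of error slightly too wide.
    return math.ceil(n_exact)

# Values from the example: p_hat = 0.85, E = 0.03, z = 1.96 (95% confidence).
n = sample_size_for_proportion(p_hat=0.85, margin=0.03)
print(n)  # 545
```

The unrounded value is about 544.2, so rounding up gives the 545 quoted in the text.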

Notice that the sample size is affected by the margin of error and the critical value, but not by the population size. The required sample size is larger when the confidence level is higher or the value of E is smaller.
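
A small sketch can make this dependence concrete. It assumes the same formula for a proportion, n = z² · p̂ · (1 - p̂) / E², and derives the critical value from the confidence level with the standard library's `statistics.NormalDist`; the helper names `critical_value` and `required_n` are illustrative only. Note that the population size does not appear anywhere in the calculation.

```python
import math
from statistics import NormalDist

def critical_value(confidence):
    """Two-sided standard normal critical value, about 1.96 for 0.95."""
    return NormalDist().inv_cdf(1.0 - (1.0 - confidence) / 2.0)

def required_n(p_hat, margin, confidence):
    """Sample size for a proportion (hypothetical helper, rounded up)."""
    z = critical_value(confidence)
    return math.ceil(z ** 2 * p_hat * (1.0 - p_hat) / margin ** 2)

# Vary the confidence level with E fixed at 0.03 and p_hat at 0.85.
by_confidence = {c: required_n(0.85, 0.03, c) for c in (0.90, 0.95, 0.99)}

# Vary the margin of error with the confidence level fixed at 95%.
by_margin = {e: required_n(0.85, e, 0.95) for e in (0.02, 0.03, 0.05)}

print(by_confidence)  # n grows as the confidence level rises
print(by_margin)      # n grows as the margin of error shrinks
```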

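When no estimate of the proportion is available to determine the sample size, the proportion can be assumed to be 0.5. This value maximizes the product p̂q̂ and therefore gives the largest, most conservative sample size.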
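
As a rough illustration of why 0.5 is a safe default, the sketch below compares the required sample size for several assumed proportions at a 95% confidence level (z = 1.96) and a 3% margin of error. The formula n = z² · p̂ · (1 - p̂) / E² and the proportions other than 0.85 are assumptions made for this example.

```python
import math

Z_95 = 1.96    # critical value for a 95% confidence level
MARGIN = 0.03  # margin of error used in the example

def required_n(p_hat):
    """Sample size for a proportion, rounded up (hypothetical helper)."""
    return math.ceil(Z_95 ** 2 * p_hat * (1.0 - p_hat) / MARGIN ** 2)

# Illustrative proportions; only 0.85 comes from the survey example.
sizes = {p: required_n(p) for p in (0.10, 0.30, 0.50, 0.70, 0.85)}
print(sizes)

# The product p_hat * (1 - p_hat) peaks at 0.5, so that choice never
# underestimates the sample size needed for the chosen E and z.
print(sizes[0.50])  # 1068
```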

Knowledge of the sample size is the first requirement to conduct random sampling or an experiment. The sample size is the total number of units, observations, or groups (in some cases) used to get the data to estimate a population parameter. As the name suggests, the sample size is that of the sample drawn from the population and differs from the population size.

The sample size for a given experiment or sampling effort is fundamental to any study design. It determines the effort, time, funding, and other resources the study requires. It should not be chosen arbitrarily, because the estimation method and the type of statistical test often depend on the sample size, and results from an arbitrarily chosen sample size cannot be interpreted appropriately. Too small a sample can lead to biased or wrong conclusions, whereas too large a sample is often difficult to handle when the data are analyzed.

Although deciding on the sample size sounds complicated, there is a simple way to estimate an appropriate sample size for a given population parameter. The sample size, denoted as n (the population size is denoted as N), is estimated from the formula for the margin of error. When a sample proportion is known, its actual value is used as the point estimate. When no estimate of the proportion is available, it can be assumed to be 0.5, and the sample size is calculated with that value. Similarly, the sample size can be estimated when the population mean or variance is the parameter of interest.
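
For the case of a population mean, a minimal sketch follows. It assumes the standard textbook formula n = (z · σ / E)² with a known (or previously estimated) population standard deviation σ, which the text does not spell out. The function name and the numbers σ = 15 and E = 2 are hypothetical, chosen only to show the calculation.

```python
import math

def sample_size_for_mean(sigma, margin, z=1.96):
    """Sample size needed to estimate a mean to within `margin`.

    Assumes n = (z * sigma / margin)^2 with sigma treated as known
    (hypothetical helper; the result is rounded up).
    """
    return math.ceil((z * sigma / margin) ** 2)

# Hypothetical inputs: sigma = 15 units, margin of error = 2 units, 95% level.
n_mean = sample_size_for_mean(sigma=15, margin=2)
print(n_mean)  # 217
```

As with proportions, a larger critical value or a smaller margin of error increases n, and a more variable population (larger σ) does as well.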

Sample size determination largely depends on the chosen significance level (or confidence level), the distribution of the data and the sample, and the chosen margin of error, commonly between 0.03 and 0.05. The sample size does not depend on the population size but on the desired confidence level and the margin of error. The margin of error and the confidence level should be decided based on the study question, the hypothesis, the amount of variation, the availability of samples, the accessibility of the population, and the resources or effort available.