The sample size (n) required to estimate the prevalence of an event in an infinite population is calculated with the following formula:

$$n = \frac{z^{2}\, p\,(1 - p)}{e^{2}}$$

Where:
| Symbol | Description |
| --- | --- |
| p | The assumed prevalence of the event in the population under study. |
| z | The critical value of the standard normal distribution for the chosen level of confidence. The levels of confidence frequently used in biological studies are 90%, 95%, and 99%; the corresponding z values are 1.64, 1.96, and 2.58 respectively. |
| e | The maximum absolute error that the user is willing to accept. For example, if one assumes a prevalence of 0.40 and a relative error of 0.10, the absolute error is 0.04 (that is, 0.40 × 0.10). In general, the relative error should be ≤ 0.20. |
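
The formula for an infinite population can be sketched in Python as follows. This is a minimal illustration, not the calculator's actual code: the function name `sample_size_infinite` is hypothetical, the formula is taken as n = z² p (1 − p) / e² from the definitions above, and rounding the result up to the next whole individual is an assumption.

```python
import math


def sample_size_infinite(p, relative_error, z=1.96):
    """Sample size for estimating a prevalence in an infinite population.

    p              -- assumed prevalence (0 < p < 1)
    relative_error -- accepted relative error (0 < relative_error <= 1)
    z              -- critical value of the standard normal distribution
    """
    e = p * relative_error         # absolute error, e.g. 0.40 * 0.10 = 0.04
    return z**2 * p * (1 - p) / e**2


# Worked example from the text: prevalence 0.40, relative error 0.10, 95% confidence.
n = sample_size_infinite(0.40, 0.10, z=1.96)
print(round(n, 2))   # 576.24
print(math.ceil(n))  # 577 (rounded up; an assumption, not stated in the text)
```
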
The sample size may be adjusted (na) for the size of the study population (N) using the finite-population correction:

$$n_{a} = \frac{n}{1 + \dfrac{n - 1}{N}}$$

Where:
| Symbol | Description |
| --- | --- |
| n | The sample size calculated for an infinite population. |
| N | The size of the population under study. |
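
The adjustment for a finite population can be sketched as below. The sketch assumes the adjustment takes the common finite-population correction form na = n / (1 + (n − 1) / N); the function name `adjust_for_population` and the population of 1000 are hypothetical, chosen only for illustration.

```python
import math


def adjust_for_population(n, population_size):
    """Adjust an infinite-population sample size n for a population of size N.

    Assumes the common finite-population correction na = n / (1 + (n - 1) / N).
    """
    return n / (1 + (n - 1) / population_size)


# n = 576.24 is the infinite-population result for p = 0.40,
# relative error 0.10 and z = 1.96.
na = adjust_for_population(576.24, 1000)
print(math.ceil(na))  # 366
```

As N grows, the adjusted value approaches n, so the correction matters mainly when the infinite-population sample size is a sizeable fraction of the population.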

The calculation takes the following inputs:

| Parameter | Description |
| --- | --- |
| Expected prevalence | The assumed prevalence of the event in the population under study (usually based on previous studies, field data, or the literature). When no information is available, a value of 0.50 yields the maximum sample size for a given absolute error. Acceptable values: ≥ 0 and ≤ 1. |
| Acceptable relative error | A measure of the desired precision. For example, if you assume a prevalence of 0.40 and a relative error of 0.10, the result will have a precision of ± 0.04 (that is, 0.40 × 0.10). In this case 0.04 is the absolute error. In general, the relative error should be ≤ 0.20. Acceptable values: ≥ 0 and ≤ 1. |
| Level of confidence | The confidence that the user wants to have in the results. Acceptable values: 90%, 95%, or 99%. |
| Population size | The number of individuals in the population under study. Acceptable values: any positive whole number. |
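
The four inputs can be combined and validated in a single routine, sketched below. Everything here is illustrative: the name `required_sample_size` and the `Z_VALUES` table are hypothetical, the finite-population step assumes the common correction n / (1 + (n − 1) / N), and rounding up is an assumption. The checks are also slightly stricter than the stated ranges, rejecting a prevalence of exactly 0 or 1 and a relative error of 0, because those values make the absolute error zero or the result degenerate.

```python
import math

# z values for the three supported confidence levels, as listed in the text.
Z_VALUES = {90: 1.64, 95: 1.96, 99: 2.58}


def required_sample_size(prevalence, relative_error, confidence, population=None):
    """Return the sample size, rounded up, for the given inputs.

    population=None means the population is treated as infinite.
    """
    if confidence not in Z_VALUES:
        raise ValueError("confidence must be 90, 95 or 99")
    if not 0 < prevalence < 1:
        raise ValueError("prevalence must be strictly between 0 and 1")
    if not 0 < relative_error <= 1:
        raise ValueError("relative error must be > 0 and <= 1")
    if population is not None and (not isinstance(population, int) or population < 1):
        raise ValueError("population must be a positive whole number")

    z = Z_VALUES[confidence]
    e = prevalence * relative_error                    # absolute error
    n = z**2 * prevalence * (1 - prevalence) / e**2    # infinite population
    if population is not None:
        n = n / (1 + (n - 1) / population)             # assumed finite correction
    return math.ceil(n)


print(required_sample_size(0.40, 0.10, 95))                   # 577
print(required_sample_size(0.40, 0.10, 95, population=1000))  # 366
```
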
Instituto Nacional de Tecnología Agropecuaria
EpiCentre, IVABS, Massey University