Under two-stage sampling, the sample size required to detect an event that is present in a population is determined by calculating two quantities independently: the number of herds from which individuals will be sampled, and the number of individuals to sample per herd.
Where:
| CL | The level of confidence. |
| e | The number of detectable individuals with the event in the population. This value is the product of the population size (N) and the detectable prevalence. The detectable prevalence is the product of the expected prevalence (p) and the sensitivity (Se) of the diagnostic method: e = N × p × Se |
| Nh | The number of herds in the population. |
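The expression itself is not reproduced in this text. The symbols listed above match the inputs of the widely used Cannon and Roe approximation for detecting at least one positive unit in a finite population. Assuming that is the expression intended here, the number of herds to sample, written n_h (a symbol introduced only for illustration), would be:

$$
n_h = \left[\, 1 - (1 - CL)^{1/e} \,\right] \times \left[\, N_h - \frac{e - 1}{2} \,\right]
$$

The result is rounded up to the next whole herd. A lower sensitivity reduces e and therefore increases n_h.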
The number of individuals to sample per herd is given by:
Where:
| CL | The level of confidence. |
| e | The number of detectable individuals with the event in the population. This value is the product of the population size (N) and the detectable prevalence. The detectable prevalence is the product of the expected prevalence (p) and the sensitivity (Se) of the diagnostic method: e = N × p × Se |
| Ni | The number of individuals per herd. |
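The two calculations can be sketched in code. This is a minimal illustration, not ProMESA's implementation: it assumes the Cannon and Roe approximation, n = [1 − (1 − CL)^(1/e)] × [N − (e − 1)/2], which fits the symbols defined above but is not shown in this text. The function name `detection_sample_size` and the example figures are hypothetical.

```python
import math


def detection_sample_size(population: int, prevalence: float,
                          sensitivity: float, confidence: float) -> int:
    """Units to sample to detect at least one positive unit.

    Assumes the Cannon and Roe approximation; this is an assumption
    about the formula, not a confirmed ProMESA internal.
    """
    # e = N x p x Se: expected number of detectable units in the population
    e = population * prevalence * sensitivity
    if e <= 0:
        raise ValueError("N x p x Se must be greater than zero")
    n = (1 - (1 - confidence) ** (1 / e)) * (population - (e - 1) / 2)
    # Round up, and never request more units than the population holds
    return min(population, math.ceil(n))


# Stage 1: number of herds, using Nh and the prevalence of positive herds
herds = detection_sample_size(population=500, prevalence=0.05,
                              sensitivity=0.9, confidence=0.95)

# Stage 2: individuals per herd, using Ni and the prevalence of positive animals
animals = detection_sample_size(population=100, prevalence=0.10,
                                sensitivity=0.9, confidence=0.95)

print(f"Sample {herds} herds and {animals} animals in each")
```

The same function serves both stages because the definitions above use the same expression for e in each; only the population size and the prevalence change.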
The sample size calculated this way takes into account the sensitivity (Se) of the diagnostic method: the lower the Se, the larger the sample size. Test specificity (Sp) is not considered in this calculation. Imperfect specificity produces false positive results and increases the probability of a Type II error (that is, considering a population as affected by an event when it is actually free of it). Based on the maximum accepted probability of a Type II error stated by the user, ProMESA calculates how many false positive results are expected in a sample of size n (calculated by ProMESA) taken at random from a population whose prevalence equals the minimum expected prevalence stated by the user.
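How ProMESA performs this calculation is not described here. The following is a minimal sketch under stated assumptions, not ProMESA's algorithm: each sampled individual is treated as an independent draw that is unaffected with probability 1 − p and, if unaffected, tests positive with probability 1 − Sp, so the number of false positives follows a binomial distribution. Both function names are hypothetical.

```python
from math import comb


def expected_false_positives(n: int, prevalence: float, specificity: float) -> float:
    """Mean number of false positives in a random sample of size n."""
    return n * (1 - prevalence) * (1 - specificity)


def false_positive_cutoff(n: int, prevalence: float,
                          specificity: float, beta: float) -> int:
    """Smallest count c with P(false positives > c) <= beta.

    Assumption: false positives ~ Binomial(n, (1 - p) * (1 - Sp)).
    """
    q = (1 - prevalence) * (1 - specificity)
    cumulative = 0.0
    for c in range(n + 1):
        # Add P(X = c) to the running cumulative probability
        cumulative += comb(n, c) * q ** c * (1 - q) ** (n - c)
        if 1 - cumulative <= beta:
            return c
    return n


mean_fp = expected_false_positives(n=100, prevalence=0.10, specificity=0.98)
cutoff = false_positive_cutoff(n=100, prevalence=0.10, specificity=0.98, beta=0.05)
print(f"Expected false positives: {mean_fp:.1f}; cut-off at beta = 0.05: {cutoff}")
```

Under these assumptions, observing more positives than the cut-off would be unlikely to result from false positives alone. With perfect specificity both values are zero.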
There are three scenarios:
The sensitivity (Se) and specificity (Sp) of the diagnostic method are calculated as follows:
| Diagnostic strategy | Sensitivity | Specificity |
| Only one test (Ts 1) | = Se Ts 1 | = Sp Ts 1 |
| Two tests in parallel | = 1 - ( 1 - Se Ts 1 ) × ( 1 - Se Ts 2 ) | = Sp Ts 1 × Sp Ts 2 |
| Two tests in series | = Se Ts 1 × Se Ts 2 | = 1 - ( 1 - Sp Ts 1 ) × ( 1 - Sp Ts 2 ) |
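The combinations in the table can be expressed as a short function. This is an illustrative sketch: the function name `combined_accuracy` and the example values are hypothetical, and the formulas assume that the two tests err independently of each other.

```python
def combined_accuracy(se1: float, sp1: float, se2: float, sp2: float,
                      strategy: str) -> tuple[float, float]:
    """Return (sensitivity, specificity) for two tests used together."""
    if strategy == "parallel":
        # Positive if either test is positive: Se rises, Sp falls
        se = 1 - (1 - se1) * (1 - se2)
        sp = sp1 * sp2
    elif strategy == "series":
        # Positive only if both tests are positive: Se falls, Sp rises
        se = se1 * se2
        sp = 1 - (1 - sp1) * (1 - sp2)
    else:
        raise ValueError("strategy must be 'parallel' or 'series'")
    return se, sp


for strategy in ("parallel", "series"):
    se, sp = combined_accuracy(0.90, 0.95, 0.80, 0.98, strategy)
    print(f"{strategy}: Se = {se:.3f}, Sp = {sp:.3f}")
```

With a single test, no combination is needed and the test's own Se and Sp are used directly.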
This procedure calculates the sample size needed to determine whether an event is present in a population above a stated level of prevalence when individuals are selected by two-stage sampling. The two-stage method is advised when a complete list of the individuals in the population is not available but a complete list of herds or farms is. The procedure produces two results: (1) the number of herds from which individuals are to be chosen; and (2) the number of individuals per herd to include in the sample.
| Level of confidence | The confidence that the user wants to have in the results. Acceptable values: 90%, 95% or 99%. |
| Probability of making a Type II error | Also known as beta. This is the probability of concluding that the population under study is affected by the event when it is actually free of it. This type of error is due to lack of specificity of the diagnostic method. If the specificity of a test were perfect, the probability of false positive results would be zero and this type of error would never occur. Unfortunately, the perfect method does not exist, and it is important to estimate how many misclassifications are to be expected with the available diagnostic tests. Acceptable values: ≥ 0 and ≤ 1. |
| Minimum expected prevalence of positive herds | The lowest proportion of affected herds expected if the event really existed in the region under study. Acceptable values: ≥ 0 and ≤ 1. |
| Number of herds | The total number of farms in the population under study. Acceptable values: any positive integer. |
| Minimum expected prevalence of positive animals | The lowest proportion of affected animals expected within an affected herd if the event really existed in the region under study. Acceptable values: ≥ 0 and ≤ 1. |
| Number of animals per herd | The mean number of animals per herd. Acceptable values: any positive integer. |
| Sensitivity | The probability that an individual having the event under study will be identified as positive by the diagnostic test. Acceptable values: ≥ 0 and ≤ 1. |
| Specificity | The probability that an individual not having the event under study will be identified as negative by the diagnostic test. Acceptable values: ≥ 0 and ≤ 1. |
The combination (parallel or series interpretation) must be entered when two diagnostic tests are used.
Parallel interpretation: an individual is considered to be positive if one or both tests produce a positive result. This method increases the sensitivity but decreases the specificity.
Series interpretation: an individual is considered to be positive if both tests produce a positive result. This method decreases the sensitivity but increases the specificity.
Instituto Nacional de Tecnología Agropecuaria
EpiCentre, IVABS, Massey University