TABLE 1. Individual diagnoses by one referee and six independent observers using the original WHO goitre classification
| Subject | Referee | Observer 1 | Observer 2 | Observer 3 | Observer 4 | Observer 5 | Observer 6 |
| 1 | 1b | 1b | 1b | 1b | 1b | 2 | 1b |
| 2 | 2 | 2 | 2 | 2 | 2 | 2 | 2 |
| 3 | 0 | 1a | 1b | 1b | 1a | 2 | 1b |
| 4 | 2 | 1b | 1b | 2 | 1a | 1a | 2 |
| 5 | 1b | 1a | 1b | 1b | 1b | 1a | 1b |
| 6 | 1a | 1a | 0 | 1a | 0 | 1a | 0 |
| 7 | 1a | 1a | 1a | 1b | 1a | 1a | 1a |
| 8 | 3 | 3 | 3 | 3 | 2 | 3 | 3 |
| 9 | 1b | 1b | 1b | 1b | 1b | 1a | 0 |
| 10 | 1a | 1b | 0 | 1a | 0 | 0 | 0 |
| 11 | 2 | 2 | 1b | 1a | 1b | 1b | 1b |
| 12 | 2 | 1b | 1a | 2 | 1b | 1b | 0 |
| 13 | 2 | 2 | 2 | 2 | 1b | 1b | 1b |
| 14 | 1b | 2 | 1b | 0 | 1b | 0 | 0 |
| 15 | 1b | 1b | 1b | 1b | 1b | 1a | 1b |
| 16 | 0 | 1a | 0 | 0 | 0 | 0 | 0 |
| 17 | 0 | 0 | 0 | 1a | 0 | 1a | 0 |
| 18 | 1b | 2 | 3 | 2 | 1b | 2 | 1b |
| 19 | 0 | 1a | 1b | 1a | 1a | 1a | 1b |
| 20 | 3 | 3 | 3 | 3 | 1b | 3 | 3 |
| 21 | 1b | 1b | 2 | 1b | 1b | 1a | 1b |
| 22 | 1b | 2 | 2 | 2 | 1b | 1a | 1b |
| 23 | 1b | 3 | 1b | 2 | 2 | 1a | 1b |
| 24 | 2 | 2 | 2 | 2 | 2 | 2 | 2 |
| 25 | 2 | 2 | 2 | 1b | 1b | 1a | 1b |
TABLE 2. Agreement across all categories of the WHO goitre classification for each pair of raters composed of one observer and his referee (paired kappa and standard error of kappa)
| Referee | Observer | N | Observed agreement (p0) | Expected agreement (pe) | Paired kappa (SE) |
| A | 1 | 25 | 0.56 | 0.23 | 0.42 (0.10) |
| A | 2 | 25 | 0.60 | 0.25 | 0.46 (0.11) |
| A | 3 | 25 | 0.52 | 0.27 | 0.33 (0.10) |
| A | 4 | 25 | 0.60 | 0.25 | 0.46 (0.10) |
| A | 5 | 25 | 0.27 | 0.17 | 0.11 (0.08) |
| A | 6 | 25 | 0.60 | 0.26 | 0.45 (0.10) |
| B | 7 | 24 | 0.63 | 0.20 | 0.52 (0.10) |
| B | 8 | 23 | 0.52 | 0.20 | 0.39 (0.10) |
| B | 9 | 24 | 0.50 | 0.18 | 0.38 (0.09) |
| B | 10 | 24 | 0.42 | 0.20 | 0.26 (0.09) |
| C | 11 | 19 | 0.53 | 0.26 | 0.35 (0.12) |
| C | 12 | 21 | 0.52 | 0.23 | 0.37 (0.11) |
| C | 13 | 19 | 0.63 | 0.27 | 0.49 (0.12) |
| C | 14 | 17 | 0.47 | 0.21 | 0.32 (0.11) |
| C | 15 | 21 | 0.67 | 0.27 | 0.54 (0.11) |
| D | 16 | 23 | 0.52 | 0.24 | 0.36 (0.10) |
| D | 17 | 24 | 0.54 | 0.23 | 0.40 (0.10) |
| D | 18 | 23 | 0.30 | 0.17 | 0.15 (0.08) |
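The columns of Table 2 can be illustrated with a short Python sketch that recomputes the first row (referee A versus observer 1) from the ratings in Table 1. The function name `paired_kappa` is hypothetical. The standard-error formula is an assumption: the source does not say which estimator was used, and the sketch uses the large-sample standard error under the null hypothesis of chance agreement (Fleiss, Cohen and Everitt, 1969), which happens to land close to the tabulated 0.10.

```python
from collections import Counter
from math import sqrt

# Ratings transcribed from Table 1, in row order (subjects 1 to 25).
referee = ["1b", "2", "0", "2", "1b", "1a", "1a", "3", "1b", "1a", "2", "2", "2",
           "1b", "1b", "0", "0", "1b", "0", "3", "1b", "1b", "1b", "2", "2"]
observer_1 = ["1b", "2", "1a", "1b", "1a", "1a", "1a", "3", "1b", "1b", "2", "1b", "2",
              "2", "1b", "1a", "0", "2", "1a", "3", "1b", "2", "3", "2", "2"]


def paired_kappa(a, b):
    """Return (p0, pe, kappa, se0) for two raters' label sequences.

    p0    : observed proportion of subjects rated identically
    pe    : agreement expected by chance from the raters' marginals
    kappa : (p0 - pe) / (1 - pe); undefined when pe == 1
    se0   : assumed estimator, the large-sample SE of kappa under the
            null hypothesis of chance agreement
    """
    n = len(a)
    p0 = sum(x == y for x, y in zip(a, b)) / n
    count_a, count_b = Counter(a), Counter(b)
    categories = set(count_a) | set(count_b)
    pa = {c: count_a[c] / n for c in categories}
    pb = {c: count_b[c] / n for c in categories}
    pe = sum(pa[c] * pb[c] for c in categories)
    kappa = (p0 - pe) / (1 - pe)
    s = sum(pa[c] * pb[c] * (pa[c] + pb[c]) for c in categories)
    se0 = sqrt(pe + pe ** 2 - s) / ((1 - pe) * sqrt(n))
    return p0, pe, kappa, se0


p0, pe, kappa, se0 = paired_kappa(referee, observer_1)
print(f"p0={p0:.3f} pe={pe:.3f} kappa={kappa:.3f} se0={se0:.3f}")
```

The unrounded results sit within rounding distance of the first row of Table 2, which suggests, without proving, that the tabulated values were obtained this way.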
TABLE 3. Agreement on a single category "no goitre" for each pair of raters composed of one observer and his referee (paired kappa and standard error of the kappa)
| Referee | Observer | N | Observed agreement (p0) | Expected agreement (pe) | Paired kappa (SE) |
| A | 1 | 25 | 0.88 | 0.81 | 0.37 (0.15) |
| A | 2 | 25 | 0.84 | 0.79 | 0.25 (0.19) |
| A | 3 | 25 | 0.84 | 0.73 | 0.40 (0.20) |
| A | 4 | 25 | 0.84 | 0.73 | 0.40 (0.20) |
| A | 5 | 25 | 0.80 | 0.76 | 0.17 (0.20) |
| A | 6 | 25 | 0.72 | 0.65 | 0.20 (0.19) |
| B | 7 | 24 | 0.79 | 0.64 | 0.42 (0.19) |
| B | 8 | 23 | 0.87 | 0.64 | 0.64 (0.20) |
| B | 9 | 24 | 0.75 | 0.66 | 0.27 (0.18) |
| B | 10 | 24 | 0.88 | 0.64 | 0.65 (0.19) |
| C | 11 | 19 | 0.79 | 0.51 | 0.57 (0.21) |
| C | 12 | 21 | 0.67 | 0.53 | 0.29 (0.20) |
| C | 13 | 19 | 0.95 | 0.50 | 0.89 (0.22) |
| C | 14 | 17 | 0.76 | 0.56 | 0.47 (0.20) |
| C | 15 | 21 | 0.81 | 0.51 | 0.61 (0.21) |
| D | 16 | 23 | 0.96 | 0.81 | 0.78 (0.20) |
| D | 17 | 24 | 0.92 | 0.78 | 0.62 (0.20) |
| D | 18 | 23 | 0.91 | 0.71 | 0.70 (0.20) |
The category of interest is "no goitre" (stage 0); all other categories are combined into a single "goitre" (stages 1a-3) category.
TABLE 4. Agreement on a single category "visible goitre" for each pair of raters composed of one observer and his referee (paired kappa and standard error of the kappa)
| Referee | Observer | N | Observed agreement (p0) | Expected agreement (pe) | Paired kappa (SE) |
| A | 1 | 25 | 0.76 | 0.52 | 0.50 (0.20) |
| A | 2 | 25 | 0.80 | 0.53 | 0.58 (0.20) |
| A | 3 | 25 | 0.72 | 0.60 | 0.31 (0.17) |
| A | 4 | 25 | 0.76 | 0.54 | 0.48 (0.20) |
| A | 5 | 25 | 0.68 | 0.56 | 0.27 (0.20) |
| A | 6 | 25 | 0.84 | 0.58 | 0.62 (0.18) |
| B | 7 | 24 | 1.00 | 0.51 | 1.00 (0.20) |
| B | 8 | 23 | 0.83 | 0.52 | 0.64 (0.21) |
| B | 9 | 24 | 0.83 | 0.54 | 0.64 (0.19) |
| B | 10 | 24 | 0.75 | 0.54 | 0.45 (0.19) |
| C | 11 | 19 | 0.79 | 0.66 | 0.38 (0.22) |
| C | 12 | 21 | 0.76 | 0.61 | 0.39 (0.22) |
| C | 13 | 19 | 0.79 | 0.71 | 0.27 (0.16) |
| C | 14 | 17 | 0.82 | 0.61 | 0.55 (0.24) |
| C | 15 | 21 | 0.90 | 0.59 | 0.77 (0.21) |
| D | 16 | 23 | 0.74 | 0.47 | 0.51 (0.18) |
| D | 17 | 24 | 0.67 | 0.49 | 0.35 (0.19) |
| D | 18 | 23 | 0.61 | 0.44 | 0.30 (0.15) |
The category of interest is "visible goitre" (stages 2 and 3); all other categories are combined into a single "no visible goitre" (stages 0-1b) category.
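The single-category analyses of Tables 3 and 4 amount to recoding each rating as "in the category of interest" or "not" before computing kappa. A minimal Python sketch of that recoding follows, using the referee and observer 1 columns of Table 1; the helper names `collapse` and `kappa_stats` are hypothetical and not taken from the source.

```python
from collections import Counter

# Ratings transcribed from Table 1, in row order (subjects 1 to 25).
referee = ["1b", "2", "0", "2", "1b", "1a", "1a", "3", "1b", "1a", "2", "2", "2",
           "1b", "1b", "0", "0", "1b", "0", "3", "1b", "1b", "1b", "2", "2"]
observer_1 = ["1b", "2", "1a", "1b", "1a", "1a", "1a", "3", "1b", "1b", "2", "1b", "2",
              "2", "1b", "1a", "0", "2", "1a", "3", "1b", "2", "3", "2", "2"]


def collapse(ratings, stages_of_interest):
    """Recode each rating as True (category of interest) or False (all others)."""
    return [r in stages_of_interest for r in ratings]


def kappa_stats(a, b):
    """Return (p0, pe, kappa) for two raters' label sequences."""
    n = len(a)
    p0 = sum(x == y for x, y in zip(a, b)) / n
    count_a, count_b = Counter(a), Counter(b)
    pe = sum(count_a[c] * count_b[c] for c in set(count_a) | set(count_b)) / n ** 2
    return p0, pe, (p0 - pe) / (1 - pe)


# "No goitre" (stage 0) versus "goitre" (stages 1a-3), as in Table 3.
no_goitre = kappa_stats(collapse(referee, {"0"}), collapse(observer_1, {"0"}))

# "Visible goitre" (stages 2 and 3) versus stages 0-1b, as in Table 4.
visible = kappa_stats(collapse(referee, {"2", "3"}), collapse(observer_1, {"2", "3"}))

print("no goitre:      p0=%.2f pe=%.2f kappa=%.2f" % no_goitre)
print("visible goitre: p0=%.2f pe=%.2f kappa=%.2f" % visible)
```

Collapsing raises both the observed and the expected agreement, because two categories are much easier to agree on by chance than five, so a high p0 in these tables does not by itself imply a high kappa.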
TABLE 5. Determination of averaged group kappa (SE = approximate standard error) among multiple observers on three different goitre categorizations
| Goitre scale^a | Group | N | Observed agreement (p0) | Expected agreement (pe) | Averaged group kappa (SE) |
| I | A | 25 | 0.48 | 0.24 | 0.31 (0.13) |
| I | B | 23 | 0.46 | 0.22 | 0.31 (0.13) |
| I | C | 15 | 0.48 | 0.23 | 0.32 (0.16) |
| I | D | 23 | 0.42 | 0.22 | 0.32 (0.13) |
| II | A | 25 | 0.86 | 0.76 | 0.41 (0.29) |
| II | B | 23 | 0.83 | 0.72 | 0.37 (0.29) |
| II | C | 15 | 0.77 | 0.57 | 0.46 (0.25) |
| II | D | 23 | 0.93 | 0.71 | 0.75 (0.18) |
| III | A | 25 | 0.77 | 0.57 | 0.46 (0.20) |
| III | B | 23 | 0.77 | 0.56 | 0.49 (0.20) |
| III | C | 15 | 0.84 | 0.64 | 0.54 (0.27) |
| III | D | 23 | 0.42 | 0.54 | 0.39 (0.20) |
a. I = five-stage WHO classification; II = presence/absence of goitre; III = presence/absence of visible goitre.
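The source does not spell out how the group-level indices in Table 5 were computed. As an illustration only, the Python sketch below assumes a Fleiss-type multi-rater kappa in which the referee is pooled with the six observers: p0 is the mean proportion of agreeing rater pairs per subject, and pe is the sum of squared overall category proportions. The function name `group_kappa` is hypothetical. Applied to the ratings of Table 1 on the five-stage scale, this assumption gives values close to the first row of Table 5, but it should not be read as the authors' stated method.

```python
from collections import Counter

# One line per subject of Table 1: referee first, then observers 1 to 6.
TABLE_1 = """
1b 1b 1b 1b 1b 2 1b
2 2 2 2 2 2 2
0 1a 1b 1b 1a 2 1b
2 1b 1b 2 1a 1a 2
1b 1a 1b 1b 1b 1a 1b
1a 1a 0 1a 0 1a 0
1a 1a 1a 1b 1a 1a 1a
3 3 3 3 2 3 3
1b 1b 1b 1b 1b 1a 0
1a 1b 0 1a 0 0 0
2 2 1b 1a 1b 1b 1b
2 1b 1a 2 1b 1b 0
2 2 2 2 1b 1b 1b
1b 2 1b 0 1b 0 0
1b 1b 1b 1b 1b 1a 1b
0 1a 0 0 0 0 0
0 0 0 1a 0 1a 0
1b 2 3 2 1b 2 1b
0 1a 1b 1a 1a 1a 1b
3 3 3 3 1b 3 3
1b 1b 2 1b 1b 1a 1b
1b 2 2 2 1b 1a 1b
1b 3 1b 2 2 1a 1b
2 2 2 2 2 2 2
2 2 2 1b 1b 1a 1b
"""

ratings = [line.split() for line in TABLE_1.strip().splitlines()]


def group_kappa(rows):
    """Fleiss-type kappa for subjects each rated by the same number of raters.

    Assumption: this is one plausible reading of "averaged group kappa",
    not a formula given in the source.
    """
    n_subjects = len(rows)
    n_raters = len(rows[0])
    pairs = n_raters * (n_raters - 1)
    # Per subject: share of ordered rater pairs that gave the same stage.
    per_subject = [
        sum(k * (k - 1) for k in Counter(row).values()) / pairs for row in rows
    ]
    p0 = sum(per_subject) / n_subjects
    # Chance agreement from the pooled distribution of all ratings.
    pooled = Counter(stage for row in rows for stage in row)
    total = n_subjects * n_raters
    pe = sum((k / total) ** 2 for k in pooled.values())
    return p0, pe, (p0 - pe) / (1 - pe)


p0, pe, kappa = group_kappa(ratings)
print(f"p0={p0:.3f} pe={pe:.3f} kappa={kappa:.3f}")
```

Scales II and III could be handled the same way by recoding each stage to a two-level label before calling `group_kappa`.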