# Standard Deviation Interobserver Agreement

The two clinical studies that provided data for this work were approved by the Regional Committees of Scientific Ethics in Southern Denmark (Project ID: S-20120100 and S-20120137). All procedures in these studies involving human participants were in accordance with the ethical standards of the institutional and/or national research committee and with the 1964 Helsinki Declaration and its later amendments or comparable ethical standards. Patients in both clinical studies consented to participate before the study began.

Repeatability versus reproducibility: repeatability describes the closeness of agreement between measurements made under the same conditions, i.e. in the same laboratory, by the same observer and with the same equipment (PET scanner, image reconstruction software), within a short interval. Reproducibility describes the closeness of agreement between measurements of identical subjects under all possible conditions, i.e. with different laboratories, observers or PET scanners, or when assessing day-to-day variation.

In agreement studies that focus solely on the difference between paired measurements, as in our Study 1, the data are ideally presented using Bland-Altman plots, possibly improved by log transformation of the original data and by accounting for heterogeneity and/or trends across the measurement scale [10, 20]. In Study 1, we observed the duality between the Bland-Altman limits of agreement on the one hand and the corresponding repeatability coefficient (RC) on the other. In fact, several authors of recent agreement studies have defined the repeatability coefficient (or reproducibility coefficient) as 1.96 times the standard deviation of the paired differences [21-25], which is algebraically equal to 2.77 times the within-subject standard deviation in simple settings such as our Study 1. Lodge et al. defined the RC as 2.77 times the within-subject standard deviation [26].

Reliability refers to the ability of a test to distinguish between patients despite measurement error, whereas agreement focuses on the measurement error itself [11]. Reliability assessment is well established and is generally done using intraclass correlation coefficients (ICCs) [12, 13].

With Method 2, we begin by forming a third column that contains the absolute value of the difference between the two measurements of each pair. In a second step, we calculate the mean and standard deviation of this third column. Both values carry information about observer variability; the mean is the average absolute difference between the first and second measurements. Finally, in the less commonly used Method 3, we form the third column by calculating the standard deviation of each individual pair of measurements. Here, too, we calculate the mean and standard deviation of the third column.
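The two column-based calculations described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' code: the function names `method2_abs_diff` and `method3_pair_sd` and the sample values are made up for the example, and the sketch assumes that "standard deviation" means the sample standard deviation (n - 1 in the denominator).

```python
import math
import statistics


def method2_abs_diff(first, second):
    """Method 2: mean and SD of the absolute pairwise differences."""
    abs_diffs = [abs(a - b) for a, b in zip(first, second)]
    return statistics.mean(abs_diffs), statistics.stdev(abs_diffs)


def method3_pair_sd(first, second):
    """Method 3: mean and SD of the per-pair sample standard deviations."""
    pair_sds = [statistics.stdev([a, b]) for a, b in zip(first, second)]
    return statistics.mean(pair_sds), statistics.stdev(pair_sds)


# Hypothetical repeated measurements (illustrative values only).
first = [4.0, 5.5, 6.1, 7.2, 3.9]
second = [4.4, 5.1, 6.6, 7.0, 4.5]

m2_mean, m2_sd = method2_abs_diff(first, second)
m3_mean, m3_sd = method3_pair_sd(first, second)

print(f"Method 2: mean = {m2_mean:.3f}, SD = {m2_sd:.3f}")
print(f"Method 3: mean = {m3_mean:.3f}, SD = {m3_sd:.3f}")
print(f"Ratio of means (Method 2 / Method 3) = {m2_mean / m3_mean:.4f}")
```

Both functions take the two measurement series as equal-length sequences, so the same data set feeds either method and the results can be compared directly.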

Although this seems unnecessarily complicated, it has a hidden advantage: although calculated from the same data set, the mean and SD of the standard deviations of the individual pairs are exactly √2 times smaller than the observer variability calculated with Method 2. All three methods can be reported as calculated or after normalization, i.e. after dividing by the mean of the pair of measurements, which gives the percent or relative variability.
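The factor of √2 follows directly from the definition of the sample standard deviation for two values. As a short worked derivation (the symbols $x_1$, $x_2$, $d$, $s_{\text{pair}}$ and $s_w$ are notation introduced here for illustration, and the sample SD with $n - 1$ in the denominator is assumed), let $d = x_1 - x_2$ and $\bar{x} = (x_1 + x_2)/2$:

$$
s_{\text{pair}} = \sqrt{\frac{(x_1 - \bar{x})^2 + (x_2 - \bar{x})^2}{2 - 1}}
= \sqrt{2 \left(\frac{d}{2}\right)^2}
= \frac{|d|}{\sqrt{2}}
$$

Every entry in the Method 3 column is therefore the corresponding Method 2 entry divided by √2, so the mean and SD of the column scale by the same factor. The same factor appears to explain the two definitions of the RC: if the two measurement errors are independent with a common within-subject standard deviation $s_w$, the paired differences have standard deviation $\sqrt{2}\, s_w$, so that

$$
1.96 \cdot \sqrt{2}\, s_w \approx 2.77\, s_w
$$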
