Nephrology and Dialysis

Pitfalls of statistical analysis and clinical interpretation of the obtained estimates on the example of patients with kidney disease. Part IV: ROC analysis and special assessments of biomarker informativeness

https://doi.org/10.28996/2618-9801-2022-1-99-113

Abstract

Many classifiers are currently proposed as markers that can detect a disease (screening markers) or confirm it (diagnostic or prognostic markers). Accuracy (Acc), the proportion of correct classifications, is a metric often used to evaluate the effectiveness of a diagnostic test. Although Acc is widely used in publications as a measure of test effectiveness, it is in fact not one: Acc can reach high values even when the marker and the outcome are completely unrelated. A more balanced estimate is the Matthews correlation coefficient (MCC). Another useful evaluation metric is the F-measure, in particular the traditional F1-score, which is the harmonic mean of sensitivity ("recall") and positive predictive value ("precision"). This metric gives a fuller assessment of the ability of a test to recognize patients with the disease, but not of its ability to discriminate between sick and healthy subjects, because it does not take true negative results into account. When the marker is not binary but a continuous quantitative variable, it is important to identify a cut-off threshold for classifying subjects as sick or healthy by the marker value, so that the test can serve its intended task more effectively. Traditionally, ROC analysis is used for this purpose, with the optimal threshold chosen by the Youden index (the maximum distance from the diagonal reference line on the ROC plot) or by the K-index (the minimum distance from the ROC curve to the upper left corner of the plot). Such a utilitarian approach is applicable when the threshold provides high values of both sensitivity and specificity (more than 0.9).
In most cases, the threshold is chosen by maximizing certain estimates, or by reaching their minimum acceptable value: sensitivity, specificity, positive or negative predictive value, relative risk or odds ratio, likelihood ratio, and so on. The choice of estimate depends on the task the marker is meant to carry out.
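
The metrics named in the abstract can be illustrated with a minimal Python sketch. It is not taken from the article: the function names, the 2x2 counts and the marker values are hypothetical, and a marker value at or above the cut-off is assumed to mean a positive test. The first part computes Acc, F1 and MCC for a marker that is statistically independent of the outcome; the second part scans candidate cut-offs and picks one by the Youden index (sensitivity + specificity - 1) and one by the distance to the upper left corner of the ROC plot.

```python
import math


def confusion_metrics(tp, fp, fn, tn):
    """Accuracy, F1 and MCC from the four cells of a 2x2 table."""
    total = tp + fp + fn + tn
    acc = (tp + tn) / total
    f1_den = 2 * tp + fp + fn  # F1 ignores true negatives
    f1 = 2 * tp / f1_den if f1_den else 0.0
    mcc_den = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / mcc_den if mcc_den else 0.0
    return {"acc": acc, "f1": f1, "mcc": mcc}


def roc_points(values, labels):
    """(cutoff, sensitivity, specificity) for every observed marker value.

    Assumption: marker >= cutoff is a positive test; label 1 means diseased.
    """
    pos = sum(labels)
    neg = len(labels) - pos
    points = []
    for cutoff in sorted(set(values)):
        tp = sum(1 for v, y in zip(values, labels) if v >= cutoff and y == 1)
        tn = sum(1 for v, y in zip(values, labels) if v < cutoff and y == 0)
        points.append((cutoff, tp / pos, tn / neg))
    return points


def best_by_youden(points):
    """Cut-off with the largest Youden index J = Se + Sp - 1."""
    return max(points, key=lambda p: p[1] + p[2] - 1)


def best_by_corner(points):
    """Cut-off closest to the ideal point (Se = 1, Sp = 1) of the ROC plot."""
    return min(points, key=lambda p: math.hypot(1 - p[1], 1 - p[2]))


# Hypothetical table: 10% prevalence, marker positive in 10% of subjects
# regardless of disease status, so marker and outcome are independent.
unrelated = confusion_metrics(tp=10, fp=90, fn=90, tn=810)
print(unrelated)

# Hypothetical continuous marker measured in ten subjects.
values = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
labels = [0, 0, 0, 0, 1, 0, 0, 1, 1, 1]
points = roc_points(values, labels)
print(best_by_youden(points))
print(best_by_corner(points))
```

In the first example Acc is high only because most subjects are healthy and test negative, while MCC is zero. In the second example the two ROC criteria select the same cut-off, but on other data they can disagree.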

About the Authors

A. B. Zulkarnaev
Moscow Regional Research and Clinical Institute
Russian Federation


E. V. Parshina
Saint Petersburg State University Hospital
Russian Federation


V. A. Fedulkina
Moscow Regional Research and Clinical Institute
Russian Federation


Review

For citations:


Zulkarnaev A.B., Parshina E.V., Fedulkina V.A. Pitfalls of statistical analysis and clinical interpretation of the obtained estimates on the example of patients with kidney disease. Part IV: ROC analysis and special assessments of biomarker informativeness. Nephrology and Dialysis. 2022;24(1):99-113. (In Russ.) https://doi.org/10.28996/2618-9801-2022-1-99-113

This work is licensed under a Creative Commons Attribution 4.0 License.


ISSN 1680-4422 (Print)
ISSN 2618-9801 (Online)