Abstract
Background
Chronic hepatitis B (CHB) infection is the major risk factor for hepatocellular carcinoma (HCC).
Objective
To develop machine-learning models for predicting the individual risk of HCC development in CHB-infected patients.
Methods
Machine learning models were constructed using features from follow-up visits of CHB patients to predict a diagnosis of HCC within 6 months after each index follow-up visit. We developed 4 model variants: two using all features, with alpha-fetoprotein (AFP) (AFA) and without AFP (AFN), and two using selected features, with AFP (SFA) and without AFP (SFN). Performance was evaluated using 10-fold cross-validation on a derivation cohort and further validated on an independent cohort.
Results
In the derivation cohort of 2,382 patients, of whom 117 developed HCC, AFA achieved higher sensitivity (0.634, 95% confidence interval [CI]: 0.559–0.708) and specificity (0.836; 0.830–0.842) than AFN (sensitivity 0.553; 0.476–0.630 and specificity 0.786; 0.779–0.792). SFA also achieved higher sensitivity (0.683; 0.611–0.755 vs. 0.658; 0.585–0.732) and specificity (0.756; 0.749–0.763 vs. 0.744; 0.737–0.751) than SFN. The performance of SFA and SFN was then tested in an independent cohort of 162 patients, of whom 57 developed HCC. In this cohort, SFA achieved a sensitivity of 0.634 (0.522–0.746) and a specificity of 0.657 (0.615–0.699), while SFN achieved a sensitivity of 0.690 (0.583–0.798) and a specificity of 0.651 (0.609–0.693).
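As an illustration of how sensitivity and specificity estimates with 95% CIs of the kind reported above can be obtained from confusion-matrix counts, the following minimal sketch may help. It assumes a normal-approximation (Wald) interval; the abstract does not state which interval method was used, so this is an assumption and not necessarily the authors' procedure. The function names and the example counts are hypothetical and are not taken from the study.

```python
import math


def proportion_with_wald_ci(successes, total, z=1.96):
    """Return (estimate, lower, upper) for a proportion.

    Uses the normal-approximation (Wald) interval, which is an
    assumption here: the source does not name its CI method.
    """
    if total <= 0:
        raise ValueError("total must be positive")
    p = successes / total
    half_width = z * math.sqrt(p * (1.0 - p) / total)
    # Clip to [0, 1] because the Wald interval can overshoot the bounds.
    return p, max(0.0, p - half_width), min(1.0, p + half_width)


def sensitivity_specificity(tp, fn, tn, fp, z=1.96):
    """Compute sensitivity and specificity, each with a Wald CI.

    sensitivity = TP / (TP + FN): fraction of true HCC cases flagged.
    specificity = TN / (TN + FP): fraction of non-cases correctly cleared.
    """
    return {
        "sensitivity": proportion_with_wald_ci(tp, tp + fn, z),
        "specificity": proportion_with_wald_ci(tn, tn + fp, z),
    }


if __name__ == "__main__":
    # Hypothetical counts for illustration only (not study data).
    result = sensitivity_specificity(tp=80, fn=20, tn=90, fp=10)
    for name, (estimate, lower, upper) in result.items():
        print(f"{name}: {estimate:.3f} ({lower:.3f}-{upper:.3f})")
```

Because predictions are made at each index follow-up visit, the denominators in such a calculation would be visits rather than patients. Repeated visits from the same patient are correlated, so a simple Wald interval may understate the uncertainty.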
Conclusion
The machine learning models demonstrate good performance for predicting the short-term risk of HCC development and may potentially be used to tailor surveillance intervals for CHB patients.