Machine learning algorithms have become popular in diabetes research, especially for glucose prediction from continuous glucose monitoring (CGM) data. We investigated the design choices in a case-based reasoning (CBR) approach to glucose prediction from CGM data. The design choices concerned the distance function (city-block, Euclidean, cosine, Pearson’s correlation), the number of observations, and the adaptation of the solution (average, weighted average, linear regression) used in the model. Each choice was evaluated using five-fold cross-validation to establish its impact on the prediction error. Our best models achieved a mean absolute error of 13.35 ± 3.04 mg/dL for a prediction horizon (PH) of 30 min, and 30.23 ± 6.50 mg/dL for PH = 60 min. The experiments were performed using data from 20 subjects recorded in free-living conditions. This paper also addresses the problem of using small datasets to test blood glucose prediction models and to assess their prediction error. We proposed, for the first time, a methodology for estimating the impact of the number of subjects (i.e., dataset size) on the distribution of the prediction error of the model. The proposed methodology is based on Monte Carlo cross-validation with systematic reduction of the number of subjects in the dataset. An implementation of the methodology was used to gauge how the prediction error changes as the number of subjects in the dataset increases, which allows the prediction error to be projected for the case in which the dataset is extended with new subjects.
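The retrieval and adaptation steps named above can be illustrated with a minimal sketch. It is not the authors' implementation. The function names (`distance`, `cbr_predict`), the representation of a case as a fixed-length window of past CGM readings paired with the glucose value observed one prediction horizon later, the conversion of cosine and Pearson similarity into distances as one minus the similarity, and the inverse-distance weights are all assumptions made for illustration. The linear regression adaptation is omitted.

```python
import numpy as np


def distance(a, b, kind="euclidean"):
    """Distance between two equal-length glucose windows (hypothetical helper)."""
    a, b = np.asarray(a, float), np.asarray(b, float)
    if kind == "cityblock":
        return float(np.abs(a - b).sum())
    if kind == "euclidean":
        return float(np.sqrt(((a - b) ** 2).sum()))
    if kind == "cosine":
        # Assumption: distance taken as 1 - cosine similarity.
        return float(1.0 - a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))
    if kind == "pearson":
        # Assumption: distance taken as 1 - Pearson correlation.
        return float(1.0 - np.corrcoef(a, b)[0, 1])
    raise ValueError(f"unknown distance: {kind}")


def cbr_predict(query, case_windows, case_outcomes, k=3,
                kind="euclidean", adaptation="average"):
    """Retrieve the k nearest stored cases and adapt their outcomes."""
    d = np.array([distance(query, w, kind) for w in case_windows])
    nearest = np.argsort(d)[:k]  # indices of the k most similar cases
    outcomes = np.asarray(case_outcomes, float)[nearest]
    if adaptation == "average":
        return float(outcomes.mean())
    if adaptation == "weighted":
        # Assumption: inverse-distance weights; epsilon avoids division by zero.
        w = 1.0 / (d[nearest] + 1e-9)
        return float((w * outcomes).sum() / w.sum())
    raise ValueError(f"unknown adaptation: {adaptation}")


# Toy case base: three past windows (mg/dL) and the value seen one PH later.
case_windows = [[100, 110, 120], [100, 100, 100], [200, 190, 180]]
case_outcomes = [140, 100, 160]
prediction = cbr_predict([100, 110, 120], case_windows, case_outcomes, k=2)
```

In this toy case base the two nearest cases have outcomes 140 and 100 mg/dL, so the plain average gives 120 mg/dL. The weighted variant is dominated by the exact match and returns a value very close to 140 mg/dL.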