Table 6 Performance comparison of all models on the training and test sets. The \(R^{2}\) score and root mean squared error (RMSE) indicate how well we can fit the true reading progress. Accuracy measures whether the direction of progress is correctly predicted, and the F1 score combines precision and recall. Since accuracy and F1 score are equal here, precision (positive predictive value) and recall (sensitivity) have the same value. For all metrics except RMSE, higher values indicate a better fit. Test and train values are close to one another, which suggests that no strong over-fitting occurred. We ultimately chose linear regression and ridge regression as the final models

From: Both sides of the story: comparing student-level data on reading performance from administrative registers to application generated data from a reading app

| Split | Model | Accuracy | \(R^{2}\) | RMSE | F1 |
|-------|-------|----------|-----------|------|----|
| Test | linear regression | 0.621 | 0.159 | 16.400 | 0.621 |
| Test | ridge regression | 0.621 | 0.180 | 16.197 | 0.621 |
| Test | extremely randomized trees | 0.568 | 0.114 | 16.743 | 0.568 |
| Test | LightGBM | 0.636 | 0.141 | 16.492 | 0.636 |
| Test | random forest | 0.575 | 0.086 | 17.005 | 0.575 |
| Test | SVR (RBF kernel) | 0.594 | 0.136 | 16.534 | 0.594 |
| Test | SVR (linear kernel) | 0.620 | 0.157 | 16.338 | 0.620 |
| Train | linear regression | 0.648 | 0.158 | 16.568 | 0.648 |
| Train | ridge regression | 0.642 | 0.167 | 16.481 | 0.642 |
| Train | extremely randomized trees | 0.680 | 0.293 | 14.968 | 0.680 |
| Train | LightGBM | 0.660 | 0.197 | 15.951 | 0.660 |
| Train | random forest | 0.698 | 0.403 | 13.751 | 0.698 |
| Train | SVR (Gaussian kernel) | 0.657 | 0.193 | 15.990 | 0.657 |
| Train | SVR (linear kernel) | 0.663 | 0.172 | 16.198 | 0.663 |
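The caption describes four metrics: \(R^{2}\) and RMSE on the predicted reading progress itself, and accuracy and F1 on the *direction* of progress. As a minimal sketch of how these could be computed (the data here is invented for illustration, and "direction" is assumed to mean the sign of the progress value; the paper's exact pipeline may differ):

```python
import numpy as np

# Hypothetical true and predicted reading-progress values (not the paper's data)
y_true = np.array([12.0, -3.5, 8.2, -1.0, 20.1, 4.4])
y_pred = np.array([10.5, -2.0, 6.9, 2.3, 18.0, 5.1])

# R^2: 1 - SS_res / SS_tot, how much variance of the true progress is explained
ss_res = np.sum((y_true - y_pred) ** 2)
ss_tot = np.sum((y_true - y_true.mean()) ** 2)
r2 = 1 - ss_res / ss_tot

# RMSE: root mean squared error of the regression fit
rmse = np.sqrt(np.mean((y_true - y_pred) ** 2))

# Direction of progress: did the model predict the correct sign?
dir_true = y_true > 0
dir_pred = y_pred > 0
accuracy = np.mean(dir_true == dir_pred)

# F1 for the positive-direction class: harmonic mean of precision and recall
tp = np.sum(dir_pred & dir_true)
precision = tp / dir_pred.sum()
recall = tp / dir_true.sum()
f1 = 2 * precision * recall / (precision + recall)
```

With this definition, accuracy and F1 need not coincide in general (here precision and recall differ), which is why the caption's observation that they match for every model is informative: it indicates balanced precision and recall in the reported results.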