Evaluating the construct validity of text embeddings with application to survey questions

EPJ Data Science

Table 4 Results of the predictive validity analysis. r is the average Pearson’s correlation between predicted and observed scores. Δ% (in parentheses) is the percentage change in r relative to the baseline r of 0.187. 95% CI is the 95% confidence interval around r.

	Lasso r (Δ%)	Lasso 95% CI	RF r (Δ%)	RF 95% CI
TF	0.106 (−43.316)	[0.102, 0.110]	0.337 (80.007)	[0.333, 0.341]
TF-IDF	0.092 (−50.802)	[0.087, 0.096]	0.323 (72.830)	[0.319, 0.327]
Random 300	0.149 (−20.321)	[0.144, 0.153]	0.331 (77.066)	[0.327, 0.335]
Random 768	0.116 (−37.968)	[0.111, 0.120]	0.334 (78.614)	[0.330, 0.338]
Random 1024	0.069 (−63.102)	[0.065, 0.073]	0.338 (80.520)	[0.333, 0.342]
fastText	0.204 (9.261)	[0.200, 0.209]	0.356 (90.439)	[0.352, 0.360]
GloVe	0.107 (−42.781)	[0.103, 0.111]	0.347 (85.664)	[0.343, 0.351]
BERT-base-uncased	0.195 (4.278)	[0.191, 0.200]	0.411 (119.994)	[0.407, 0.415]
BERT-large-uncased	0.151 (−19.251)	[0.147, 0.155]	0.378 (102.260)	[0.374, 0.382]
All-DistilRoBERTa	0.188 (0.535)	[0.183, 0.192]	0.374 (100.228)	[0.370, 0.378]
All-MPNet-base	0.119 (−36.364)	[0.115, 0.123]	0.406 (117.135)	[0.402, 0.410]
USE	0.186 (−0.535)	[0.182, 0.191]	0.386 (106.272)	[0.382, 0.390]
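
The Δ% column can be read as a relative change against the baseline correlation. The following Python sketch illustrates that reading. It assumes the formula is (r − baseline) / baseline × 100, which is inferred from the caption rather than stated in it. The helper name `pct_change` and the constant `BASELINE_R` are hypothetical. Because the sketch uses the rounded r values printed in the table, some recomputed figures may differ slightly from the printed Δ% values, possibly because those were derived from unrounded correlations.

```python
# Baseline correlation as stated in the Table 4 caption.
BASELINE_R = 0.187


def pct_change(r: float, baseline: float = BASELINE_R) -> float:
    """Percentage change of r relative to the baseline (assumed formula)."""
    return (r - baseline) / baseline * 100.0


# Rounded Lasso correlations copied from three rows of Table 4.
lasso_r = {
    "TF": 0.106,
    "BERT-base-uncased": 0.195,
    "USE": 0.186,
}

for name, r in lasso_r.items():
    # Print with three decimals to match the table's precision.
    print(f"{name}: {pct_change(r):+.3f}%")
```

Under this reading, a negative value means the representation predicts worse than the baseline and a positive value means it predicts better.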