From: Evaluating the construct validity of text embeddings with application to survey questions
Embedding | Lasso r (Δ%) | Lasso 95% CI | RF r (Δ%) | RF 95% CI |
---|---|---|---|---|
TF | 0.106 (−43.316) | [0.102, 0.110] | 0.337 (80.007) | [0.333, 0.341] |
TF-IDF | 0.092 (−50.802) | [0.087, 0.096] | 0.323 (72.830) | [0.319, 0.327] |
Random 300 | 0.149 (−20.321) | [0.144, 0.153] | 0.331 (77.066) | [0.327, 0.335] |
Random 768 | 0.116 (−37.968) | [0.111, 0.120] | 0.334 (78.614) | [0.330, 0.338] |
Random 1024 | 0.069 (−63.102) | [0.065, 0.073] | 0.338 (80.520) | [0.333, 0.342] |
fastText | 0.204 (9.261) | [0.200, 0.209] | 0.356 (90.439) | [0.352, 0.360] |
GloVe | 0.107 (−42.781) | [0.103, 0.111] | 0.347 (85.664) | [0.343, 0.351] |
BERT-base-uncased | 0.195 (4.278) | [0.191, 0.200] | 0.411 (119.994) | [0.407, 0.415] |
BERT-large-uncased | 0.151 (−19.251) | [0.147, 0.155] | 0.378 (102.260) | [0.374, 0.382] |
All-DistilRoBERTa | 0.188 (0.535) | [0.183, 0.192] | 0.374 (100.228) | [0.370, 0.378] |
All-MPNet-base | 0.119 (−36.364) | [0.115, 0.123] | 0.406 (117.135) | [0.402, 0.410] |
USE | 0.186 (−0.535) | [0.182, 0.191] | 0.386 (106.272) | [0.382, 0.390] |
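
The table does not state which baseline the Δ% values refer to. As a rough reading aid, and assuming Δ% is an ordinary relative change, Δ% = 100 · (r − baseline) / baseline, the baseline implied by each cell can be recovered by inverting that formula. The sketch below does this for a few rows copied from the table; the function name `implied_baseline` and the row labels are illustrative, not taken from the article.

```python
# Rows copied from the table: (label, r, delta_percent).
rows = [
    ("Lasso / TF", 0.106, -43.316),
    ("Lasso / BERT-base-uncased", 0.195, 4.278),
    ("Lasso / USE", 0.186, -0.535),
    ("RF / TF", 0.337, 80.007),
    ("RF / BERT-base-uncased", 0.411, 119.994),
]


def implied_baseline(r, delta_percent):
    """Invert delta% = 100 * (r - baseline) / baseline to recover the baseline.

    Assumption: the table's delta% is a plain relative change against a
    single baseline r; the table itself does not confirm this.
    """
    return r / (1.0 + delta_percent / 100.0)


baselines = {label: implied_baseline(r, d) for label, r, d in rows}
for label, b in baselines.items():
    print(f"{label}: implied baseline r = {b:.3f}")
```

For these rows the implied baseline comes out at roughly r ≈ 0.187 in both the Lasso and RF columns, which suggests (but does not prove) that all Δ% values are measured against the same reference score.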