From: Evaluating the construct validity of text embeddings with application to survey questions
Text representation | Lasso: score (Δ% vs. baseline 0.240) | Lasso 95% CI | RF: score (Δ% vs. baseline 0.240) | RF 95% CI |
---|---|---|---|---|
TF | 0.247 (2.970) | [0.233, 0.261] | 0.226 (−5.958) | [0.211, 0.240] |
TF-IDF | 0.248 (3.478) | [0.233, 0.264] | 0.228 (−5.061) | [0.211, 0.245] |
Random 300 | 0.245 (2.247) | [0.233, 0.258] | 0.231 (−3.568) | [0.219, 0.244] |
Random 768 | 0.245 (1.899) | [0.232, 0.258] | 0.231 (−3.842) | [0.218, 0.243] |
Random 1024 | 0.247 (3.082) | [0.234, 0.261] | 0.230 (−4.206) | [0.218, 0.242] |
fastText | 0.240 (−0.056) | [0.229, 0.251] | 0.227 (−5.299) | [0.216, 0.239] |
GloVe | 0.245 (2.239) | [0.234, 0.257] | 0.227 (−5.273) | [0.215, 0.240] |
BERT-base-uncased | 0.243 (1.391) | [0.230, 0.257] | 0.222 (−7.527) | [0.211, 0.233] |
BERT-large-uncased | 0.245 (2.067) | [0.232, 0.258] | 0.225 (−6.051) | [0.215, 0.236] |
All-DistilRoBERTa | 0.240 (−0.102) | [0.228, 0.252] | 0.224 (−6.601) | [0.213, 0.235] |
All-MPNet-base | 0.245 (1.975) | [0.232, 0.258] | 0.223 (−7.088) | [0.211, 0.235] |
USE | 0.241 (0.309) | [0.228, 0.254] | 0.224 (−6.478) | [0.213, 0.236] |
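
The Δ% values in parentheses appear to be the percentage change of each score relative to the 0.240 value given in the column header. That reading is an assumption based on the arithmetic, not something the table states, and the names `percent_change` and `BASELINE` below are illustrative only. A minimal sketch of the calculation, using a few rounded Lasso scores copied from the table:

```python
def percent_change(score: float, baseline: float) -> float:
    """Relative change of `score` versus `baseline`, in percent."""
    return (score - baseline) / baseline * 100.0


# Value shown in the table header; assumed here to be the baseline score.
BASELINE = 0.240

# Rounded Lasso scores copied from the table rows.
lasso_scores = {"TF": 0.247, "fastText": 0.240, "USE": 0.241}

for name, score in lasso_scores.items():
    # A positive result means the score is above the baseline.
    print(f"{name}: {percent_change(score, BASELINE):+.3f}%")
```

Because the table reports scores rounded to three decimals, values recomputed this way only approximate the printed Δ% figures, which were presumably derived from unrounded scores.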