Table 3. Results of Content Validity Analysis: Prediction Accuracy Scores of Probing Classifiers. Sentence length is converted into a categorical variable with four levels (“0-10”, “11-12”, “13-15” and “16-25”); basic concept, concrete concept and formulation are also categorical, with 13, 117 and 5 levels, respectively.

From: Evaluating the construct validity of text embeddings with application to survey questions

 

|                    | Length | Basic concept | Concrete concept | Formulation | Average |
|--------------------|--------|---------------|------------------|-------------|---------|
| Simple Majority    | 0.389  | 0.010         | 0.029            | 0.255       | 0.171   |
| Random 300         | 0.102  | 0.198         | 0.440            | 0.742       | 0.371   |
| Random 768         | 0.148  | 0.198         | 0.509            | 0.694       | 0.387   |
| Random 1024        | 0.074  | 0.198         | 0.548            | 0.731       | 0.388   |
| TF                 | 0.148  | 0.198         | 0.636            | 0.770       | 0.438   |
| TF-IDF             | 0.167  | 0.198         | 0.493            | 0.690       | 0.387   |
| fastText           | 0.093  | 0.173         | 0.711            | 0.656       | 0.408   |
| GloVe              | 0.194  | 0.192         | 0.908            | 0.642       | 0.484   |
| BERT-base-uncased  | 0.657  | 0.175         | 0.815            | 0.944       | 0.648   |
| BERT-large-uncased | 0.620  | 0.153         | 0.739            | 0.908       | 0.605   |
| All-DistilRoBERTa  | 0.407  | 0.198         | 0.916            | 0.776       | 0.574   |
| All-MPNet-base     | 0.481  | 0.198         | 0.929            | 0.805       | 0.603   |
| USE                | 0.454  | 0.198         | 0.903            | 0.853       | 0.602   |
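
The caption describes sentence length being converted into four categorical levels before probing. A minimal sketch of that binning is given here, under the assumption that length is an integer count per sentence (the unit, words or tokens, is not stated in the table). The names `LENGTH_BINS`, `length_level` and `row_average` are hypothetical and are not taken from the article. The second helper only illustrates that the Average column is consistent with an unweighted mean of the four task accuracies; the table itself does not say how it was computed.

```python
# Bin edges copied from the Table 3 caption; treating them as inclusive
# integer ranges is an assumption.
LENGTH_BINS = [(0, 10), (11, 12), (13, 15), (16, 25)]


def length_level(n_units: int) -> str:
    """Map a sentence length to one of the four categorical levels."""
    for low, high in LENGTH_BINS:
        if low <= n_units <= high:
            return f"{low}-{high}"
    raise ValueError(f"length {n_units} is outside the 0-25 range in the caption")


def row_average(scores: list[float]) -> float:
    """Unweighted mean of the task accuracies, rounded to three decimals."""
    return round(sum(scores) / len(scores), 3)


# Example: a 14-unit sentence falls into the third level.
level = length_level(14)

# Example: the BERT-base-uncased row (Length, Basic concept,
# Concrete concept, Formulation) from the table.
bert_base_avg = row_average([0.657, 0.175, 0.815, 0.944])
```

Here `level` is `"13-15"`, and `bert_base_avg` matches the 0.648 reported in the Average column for BERT-base-uncased.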