Table 1 Overview of Pretrained Text Embedding Models Investigated in this Study

From: Evaluating the construct validity of text embeddings with application to survey questions

Model          Name                  Dimension  File size
fastText       cc.en.300.bin         300        2.44 GB
GloVe          glove.840B.300d       300        2.03 GB
BERT           BERT-base-uncased     768        420 MB
BERT           BERT-large-uncased    1024       1.25 GB
Sentence-BERT  All-DistilRoBERTa-V1  768        292 MB
Sentence-BERT  All-MPNet-base-V2     768        418 MB
USE            USE-V4                512        916 MB