Table 1 Datasets Used for Performance Comparison

From: Predicting and explaining behavioral data with structured feature space decomposition

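| Prediction task | Dataset | # of samples | # of features |
|---|---|---|---|
| Classification | Breast Cancer (Original) [28] | 683 | 9 |
| Classification | Spambase | 4,601 | 57 |
| Classification | SPECTF [29] | 267 | 44 |
| Classification | Parkinsons [30] | 195 | 22 |
| Classification | Stack Exchange | 1,026,225 | 12 |
| Classification | Khan | 680,551 | 17 |
| Classification | Digg | 1,000,000 | 19 |
| Classification | Twitter | 5,000,000 | 19 |
| Classification | Duolingo | 767,718 | 18 |
| Regression | App Energy [31] | 19,735 | 27 |
| Regression | Building (Sales) [32] | 372 | 103 |
| Regression | Building (Costs) [32] | 372 | 103 |
| Regression | Pole Telecommunication [33] | 15,000 | 48 |
| Regression | Breast Cancer (Prognostic) [28] | 194 | 32 |
| Regression | Boston Housing [34] | 506 | 13 |
| Regression | Triazines [35] | 186 | 60 |
| Regression | Parkinsons (Motor) [36] | 5,875 | 16 |
| Regression | Parkinsons (Total) [36] | 5,875 | 16 |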