Table 3 Overview of the Spooky Books Data Set

From: Enriching feature engineering for short text samples by language time series analysis

 

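| Author | Number of samples | Total number of tokens | Sample length (tokens): average | Sample length (tokens): standard deviation |
|---|---|---|---|---|
| EAP | 7900 | 232,184 | 29.4 | 21.1 |
| HPL | 5635 | 173,979 | 30.9 | 15.3 |
| MWS | 6044 | 188,824 | 31.2 | 24.8 |
| Overall | 19,579 | 594,987 | 30.4 | 20.9 |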
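
The columns of Table 3 are related by simple arithmetic: the average sample length should equal the total number of tokens divided by the number of samples, and the "Overall" row should equal the sum of the per-author rows. The following is a minimal sketch of that consistency check, not code from the article. The numbers are copied from the table, while the names `table3`, `overall`, and `mean_length` are illustrative assumptions. The standard deviations cannot be recomputed this way, since they depend on the individual sample lengths.

```python
# Values copied from Table 3:
# (number of samples, total number of tokens, reported average length)
table3 = {
    "EAP": (7900, 232_184, 29.4),
    "HPL": (5635, 173_979, 30.9),
    "MWS": (6044, 188_824, 31.2),
}
overall = (19_579, 594_987, 30.4)


def mean_length(samples: int, tokens: int) -> float:
    """Average tokens per sample, rounded to one decimal as in the table."""
    return round(tokens / samples, 1)


# The "Overall" counts should be the sums of the per-author counts.
total_samples = sum(samples for samples, _, _ in table3.values())
total_tokens = sum(tokens for _, tokens, _ in table3.values())

# Collect any row whose reported average disagrees with tokens / samples.
mismatches = [
    label
    for label, (samples, tokens, reported) in {**table3, "Overall": overall}.items()
    if mean_length(samples, tokens) != reported
]

print("total samples:", total_samples)
print("total tokens:", total_tokens)
print("rows with inconsistent averages:", mismatches)
```

An empty list of inconsistent rows means the reported averages agree with the sample and token counts at the table's one-decimal precision.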