
Table 3 Overview of the Spooky Books Data Set

From: Enriching feature engineering for short text samples by language time series analysis

Author    Number of samples   Total number of tokens   Sample length (tokens)
                                                       Average   Standard deviation
EAP       7900                232,184                  29.4      21.1
HPL       5635                173,979                  30.9      15.3
MWS       6044                188,824                  31.2      24.8
Overall   19,579              594,987                  30.4      20.9
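
The averages in the table appear to be the total token count divided by the number of samples, rounded to one decimal place, and the "Overall" row appears to be the sum of the three author rows. The following is a minimal sketch that checks this reading against the reported values. The names `rows` and `average_length` are hypothetical and chosen only for illustration. The standard deviations cannot be recomputed from the totals alone, so they are left out.

```python
# Values copied from the table: (number of samples, total tokens, reported average).
rows = {
    "EAP": (7900, 232184, 29.4),
    "HPL": (5635, 173979, 30.9),
    "MWS": (6044, 188824, 31.2),
    "Overall": (19579, 594987, 30.4),
}


def average_length(samples: int, tokens: int) -> float:
    """Mean sample length in tokens, rounded to one decimal as in the table."""
    return round(tokens / samples, 1)


# Assumption: the "Overall" row is the sum of the three per-author rows.
authors = ["EAP", "HPL", "MWS"]
total_samples = sum(rows[a][0] for a in authors)
total_tokens = sum(rows[a][1] for a in authors)

for label, (samples, tokens, reported) in rows.items():
    # Print the recomputed average next to the reported one for comparison.
    print(label, average_length(samples, tokens), reported)

print("Summed samples:", total_samples, "Summed tokens:", total_tokens)
```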