Impact and dynamics of hate and counter speech online

EPJ Data Science

Table 2 Classification scores for the panel of the top 25 experts used to label the Reply Tree tweets. γ is a confidence threshold on \(S_{h}(t)=2p_{h}(t)-1\), viz., \(|S_{h}(t)|\ge \gamma \). “Percent Labeled” is the percentage of examples in the test set that were labeled as either hate or counter at a confidence level of γ. The top row (γ = 0) represents traditional accuracy measures, which compare favorably to previous studies that used smaller unbalanced data sets and achieved F1 scores ranging from 0.49 to 0.77 [32]. The remaining rows are more nuanced, because at these thresholds not every example in the test set is labeled, whether correctly or incorrectly, and thus these rows need to be viewed with cautious optimism; see [32] for an in-depth discussion of these issues.

Accuracy Scores for Classification System
γ	Precision	Recall	F1	Percent Labeled
0.00	0.763	0.762	0.762	100%
0.20	0.827	0.827	0.827	76%
0.40	0.877	0.876	0.877	57%
0.60	0.917	0.917	0.917	41%
0.80	0.958	0.958	0.958	25%
0.90	0.977	0.977	0.977	15%
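
The thresholding described in the caption can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the example probabilities, and the convention that a positive score means hate and a non-positive score means counter are all assumptions made for illustration. Only the relations \(S_{h}(t)=2p_{h}(t)-1\) and \(|S_{h}(t)|\ge \gamma \) come from the caption.

```python
from typing import Optional, Sequence


def confidence_score(p_hate: float) -> float:
    """Map a hate probability p_h(t) in [0, 1] to a score S_h(t) in [-1, 1]."""
    return 2.0 * p_hate - 1.0


def label_with_threshold(p_hate: float, gamma: float) -> Optional[str]:
    """Return a label only when |S_h(t)| >= gamma, otherwise None (unlabeled).

    Assumption: positive scores are read as hate, non-positive as counter.
    """
    score = confidence_score(p_hate)
    if abs(score) < gamma:
        return None
    return "hate" if score > 0 else "counter"


def percent_labeled(p_hates: Sequence[float], gamma: float) -> float:
    """Percentage of examples that receive a label at threshold gamma."""
    labeled = sum(1 for p in p_hates if label_with_threshold(p, gamma) is not None)
    return 100.0 * labeled / len(p_hates)


# Hypothetical probabilities, not data from the paper.
example_probs = [0.95, 0.55, 0.10, 0.50, 0.72]
for gamma in (0.0, 0.4, 0.85):
    print(gamma, percent_labeled(example_probs, gamma))
```

Raising γ leaves more examples unlabeled, which is why precision, recall and F1 in the lower rows are computed over a shrinking share of the test set.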