
Table A2.3 The difference between two average PRAB(N) values within a row (Region) in Table 4 is declared significant (based on \(\alpha =0.03\)) if it exceeds 2.65 in absolute value. Key findings from the post hoc analysis of differences between average PRAB(N) values are listed in this table.

From: Sweet tweets! Evaluating a new approach for probability-based sampling of Twitter

1. Average PRAB(N) values from Uniform samples ranged from about 30% to about 45% across the Regions and were very consistent across Size within Region. They were significantly larger than those from any other Method, with one exception: for query size 360 in Pittsburgh, the average PRAB(N) value from Uniform samples was not significantly different from that of Recent samples.

2. Recent samples from Atlanta had PRAB(N) values that were consistent across all levels of Size. In Baltimore and Pittsburgh, the Recent PRAB(N) averages improved significantly with larger sample sizes. In Phoenix, PRAB(N) averages for 540 and 720 queries were indistinguishable from each other but significantly better than the average for 360 queries. For Chicago, we saw the opposite behavior: performance degraded for larger sample sizes, with 720 queries performing significantly worse than 360.

3. Across Regions and query sizes, Recent samples produced significantly higher PRAB(N) values, on average, than VBEST-SRS and VBEST-SYS, which were indistinguishable from each other.

4. VBEST-SYS and VBEST-SRS clearly had the best performance among the four Methods, with average PRAB(N) values between about 7% and 11% depending on Region.
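The decision rule in the caption (a difference between two average PRAB(N) values in the same row is significant if its absolute value exceeds 2.65) can be illustrated with a minimal Python sketch. The function name `significant_pairs` and the values in `example_row` are hypothetical and chosen only for illustration; they are not taken from Table 4, and the sketch assumes the threshold is expressed in the same units as the PRAB(N) averages.

```python
from itertools import combinations

# Critical difference stated in the table caption.
THRESHOLD = 2.65


def significant_pairs(averages, threshold=THRESHOLD):
    """Return {(a, b): a_minus_b} for every pair of entries whose
    absolute difference exceeds the threshold."""
    flagged = {}
    for (name_a, value_a), (name_b, value_b) in combinations(averages.items(), 2):
        diff = value_a - value_b
        if abs(diff) > threshold:
            flagged[(name_a, name_b)] = diff
    return flagged


# Hypothetical averages for one Region and one Size (NOT values from Table 4).
example_row = {
    "Uniform": 40.0,
    "Recent": 15.0,
    "VBEST-SRS": 9.0,
    "VBEST-SYS": 8.0,
}

flagged = significant_pairs(example_row)
for pair, diff in flagged.items():
    print(pair, round(diff, 2))
```

With these illustrative numbers, every pair is flagged except VBEST-SRS versus VBEST-SYS, whose difference of 1.0 falls below the threshold.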