Table A1.1 Twitter Search API parameters and values used in sampling and gathering Tweets for our experiment

From: Sweet tweets! Evaluating a new approach for probability-based sampling of Twitter

| Twitter API parameter | Values used in our experiment | Notes |
| --- | --- | --- |
| keyword | ‘-filter:retweets’ | We exclude Retweets. This parameter setting allows us to gather all Tweets that are not Retweets, regardless of content. |
| result_type | ‘recent’ | Options include ‘recent’, ‘mixed’ or ‘popular’. For the popular and mixed sampling methods we selected ‘popular’ and ‘mixed’, respectively; for all other methods we selected ‘recent’. |
| count | 100 | The count values range from 10 to 100 for the TSAPI version we used in our experiment. |
| lang | ‘en’ | We gathered English-language Tweets within the MSAs. |
| tweet_mode | ‘extended’ | This setting returns the full text of the Tweet; the default option truncates the body of the Tweet beyond a character limit set by the API. |
| geocode | Latitude, longitude and radius in miles | These parameters define the geographic center and radius of the circle that the TSAPI uses as the location filter for Tweets. This step is not the geo-filtering we discuss in Sect. 3.2.2, but rather a first pass that restricts the samples to Tweets from the approximate geographical areas of the MSAs. The values entered for geocode varied by MSA, as illustrated in Table 1. |
| max_id | various | The time of the desired query was converted to a Tweet ID as described, and this parameter was used only with the Uniform, SRS and VBEST methods. |
| until | 2020-11-26; 2020-11-27 and so on…to 2021-01-01 | This parameter value was set to the date of the day following the field period date for which data collection is desired. It provides a non-inclusive upper bound on the date of the Tweets returned by the TSAPI. |
| since_id | various | This parameter represents a time stamp corresponding to midnight of the day for which Tweet samples are desired, which ensures that our daily samples do not extend beyond the given day. We converted a midnight time stamp into a synthetic Tweet ID and used these values for this parameter. |
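
The time-stamp-to-ID conversion used for max_id and since_id, and the way the settings in this table combine into one request, can be illustrated with a minimal sketch. It is not the authors' code. It assumes the publicly documented Snowflake layout of Tweet IDs (milliseconds since the Twitter epoch of 1288834974657, shifted left by 22 bits), it assumes midnight is taken in UTC, and it assumes the table's "keyword" row corresponds to the standard search endpoint's `q` parameter. The function names `timestamp_to_tweet_id`, `tweet_id_to_timestamp` and `build_search_params` are hypothetical.

```python
from datetime import date, datetime, timedelta, timezone

# Snowflake epoch in milliseconds since 1970-01-01 UTC (assumed ID layout).
TWITTER_EPOCH_MS = 1288834974657
UNIX_EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def timestamp_to_tweet_id(dt: datetime) -> int:
    """Return the smallest synthetic Tweet ID whose embedded time equals dt."""
    if dt.tzinfo is None:
        raise ValueError("dt must be timezone-aware")
    ms = (dt - UNIX_EPOCH) // timedelta(milliseconds=1)
    if ms < TWITTER_EPOCH_MS:
        raise ValueError("dt predates the Snowflake epoch")
    # The low 22 bits (worker and sequence numbers) are left at zero.
    return (ms - TWITTER_EPOCH_MS) << 22


def tweet_id_to_timestamp(tweet_id: int) -> datetime:
    """Recover the millisecond time stamp embedded in a Snowflake-style ID."""
    ms = (tweet_id >> 22) + TWITTER_EPOCH_MS
    return UNIX_EPOCH + timedelta(milliseconds=ms)


def build_search_params(day, lat, lon, radius_miles, query_time=None):
    """Assemble the settings from the table for one field-period day.

    The coordinates and radius are placeholders supplied by the caller;
    the table only says they vary by MSA.
    """
    # Assumption: midnight in UTC marks the start of the field-period day.
    midnight = datetime(day.year, day.month, day.day, tzinfo=timezone.utc)
    params = {
        "q": "-filter:retweets",       # the table's "keyword" row
        "result_type": "recent",       # 'popular' or 'mixed' for those methods
        "count": 100,
        "lang": "en",
        "tweet_mode": "extended",
        "geocode": f"{lat},{lon},{radius_miles}mi",
        # Non-inclusive upper bound: the day after the field-period day.
        "until": (day + timedelta(days=1)).isoformat(),
        # Lower bound: synthetic ID for midnight of the field-period day.
        "since_id": timestamp_to_tweet_id(midnight),
    }
    if query_time is not None:
        # Per the table, max_id applies only to Uniform, SRS and VBEST.
        params["max_id"] = timestamp_to_tweet_id(query_time)
    return params
```

Calling `build_search_params(date(2020, 11, 25), lat, lon, radius)` with an MSA's coordinates yields `until` equal to 2020-11-26, the first value listed in the table. Passing a `query_time` within that day adds a `max_id` that caps the results at Tweets posted up to that moment.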