From: Sweet tweets! Evaluating a new approach for probability-based sampling of Twitter
Tweet access method | Description |
---|---|
1. Popular | One of three options for the result_type parameter of the TSAPI: returns the most popular Tweets matching the query, with popularity determined by Twitter. |
2. Mixed | The current default method for the result_type parameter of the TSAPI: returns both “popular” and “recent” Tweets as part of the query. Popular Tweets are determined by Twitter. |
3. Recent | A user-selectable option for the result_type parameter of the TSAPI that returns the most recent Tweets. If more than 100 recent Tweets match the query, additional queries can be submitted in sequence to obtain collections of Tweets that run chronologically from 11:59:59.999 pm of a given day back to midnight at the beginning of that day, as described in https://developer.twitter.com/en/docs/twitter-api/v1/Tweets/timelines/guides/working-with-timelines. |
4. Uniform | A series of evenly spaced time points within a given day is determined, and a single query is submitted for each selected time point using the TSAPI with the result_type parameter set to “recent”. For this method, we randomly select a starting time point within a sampling interval whose length is determined by the number of queries desired, and then determine the subsequent, evenly spaced points. The identified time points are then converted to Tweet IDs and used as the max_id parameters in the TSAPI. |
5. VBEST-SYS | A circular systematic random sample (without replacement) of a desired size is taken from the sampling frame of Tweet PSUs constructed by the VBEST algorithm. The right-most endpoint of each selected Tweet PSU interval is then used in a TSAPI query with result_type set to “recent”. One query is submitted per selected Tweet PSU. |
6. VBEST-SRS | A simple random sample (without replacement) of a desired size is taken from the sampling frame of Tweet PSUs constructed by the VBEST algorithm. The right-most endpoint of each selected Tweet PSU interval is then used in a TSAPI query with result_type set to “recent”. One query is submitted per selected Tweet PSU. |
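
The selection steps behind methods 4 to 6 can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation. It assumes Tweet IDs follow the Snowflake layout (milliseconds since 1288834974657 in the bits above the lowest 22), leaves those lowest 22 bits at zero when converting a time point, treats the day as a UTC day, and represents the VBEST frame as a hypothetical list of `(left_id, right_id)` pairs. All function names are illustrative, and the circular systematic step uses a skip of `floor(N / n)`, which is one common convention.

```python
import random
from datetime import datetime, timezone

TWITTER_EPOCH_MS = 1288834974657  # Snowflake epoch in Unix milliseconds
MS_PER_DAY = 24 * 60 * 60 * 1000


def time_ms_to_tweet_id(ts_ms):
    """Smallest Snowflake-style ID whose embedded timestamp is ts_ms."""
    return (ts_ms - TWITTER_EPOCH_MS) << 22


def uniform_max_ids(day_start_ms, n_queries, rng):
    """Method 4: random start in the first interval, then evenly spaced points."""
    interval = MS_PER_DAY / n_queries
    start = rng.random() * interval  # falls in [0, interval)
    ids = []
    for k in range(n_queries):
        # Clamp to guard against floating-point rounding at the day boundary.
        offset = min(int(start + k * interval), MS_PER_DAY - 1)
        ids.append(time_ms_to_tweet_id(day_start_ms + offset))
    return ids


def circular_systematic_sample(frame, n, rng):
    """Method 5: random start anywhere in the frame, fixed skip, wrap around."""
    size = len(frame)
    if not 0 < n <= size:
        raise ValueError("sample size must be between 1 and the frame size")
    step = size // n  # (n - 1) * step < size, so no unit is selected twice
    start = rng.randrange(size)
    return [frame[(start + j * step) % size] for j in range(n)]


def simple_random_sample(frame, n, rng):
    """Method 6: simple random sample without replacement."""
    return rng.sample(frame, n)


def max_ids_from_psus(psus):
    """Use the right-most endpoint of each selected PSU as a max_id value."""
    return [right for _left, right in psus]


rng = random.Random(2020)
day_start_ms = int(datetime(2020, 1, 1, tzinfo=timezone.utc).timestamp() * 1000)
uniform_ids = uniform_max_ids(day_start_ms, 24, rng)

# Toy frame of 50 contiguous ID intervals standing in for VBEST output.
toy_frame = [(i * 1000, i * 1000 + 999) for i in range(50)]
sys_ids = max_ids_from_psus(circular_systematic_sample(toy_frame, 7, rng))
srs_ids = max_ids_from_psus(simple_random_sample(toy_frame, 7, rng))
```

Each value in `uniform_ids`, `sys_ids`, or `srs_ids` would then be passed as the max_id parameter of one TSAPI query with result_type set to “recent”; the query call itself is omitted here.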