Table 3 Pipeline Resource Consumption. The smallest node in our cluster has 120GB RAM; however, testing shows that TFIDF and Word2vec can both run with much less RAM (16GB to 32GB).

From: Keyword expansion techniques for mining social movement data on social media

| Pipeline | Ave. Hours Per Loop (Single Thread) | Total Mturk Cost | PC Requirement | Notes |
|---|---|---|---|---|
| Linder-BERT | 118.0 | $28.0 | Tesla-K80 GPU | Partial data processing (20% tweets) |
| Word2vec | 1.0 | $300.0 | 120GB RAM | – |
| King | 122.0 | $102.0 | 120GB RAM | Partial data clustering (max 2 million tweets) |
| Linder | 24.0 | $31.0 | 180GB RAM | – |
| TFIDF | 0.0 | $29.0 | 120GB RAM | – |