Figure 2 | EPJ Data Science

Figure 2

From: Measuring sustainable tourism with online platform data

Unsupervised learning techniques applied to TripAdvisor data. (A) Heatmap of the loadings of the four main principal components, obtained by dimensionality reduction of the 33 continuous variables in the data set. The algorithm identifies four main dimensions in the data: accommodation size and user interaction (PC1), user rating (PC2), location (PC3), and quality (PC4). (B) Summary statistics of the four clusters identified by k-means clustering. The accommodations can be grouped according to quality and user interaction variables. The share of the GreenLeader outcome variable differs across clusters, ranging from 2% to 19%. (C) Two-dimensional representation (PC1, PC2) of the TripAdvisor data (10% sample), grouped into four clusters (panels) and colored by GreenLeader versus other accommodations. The unsupervised learning algorithms split the data into distinct groups with varying proportions of GreenLeader accommodations.
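The pipeline summarized in the caption (principal component loadings, k-means on the component scores, then the GreenLeader share per cluster) could be sketched roughly as follows. This is not the authors' code: the data are synthetic random numbers, the function names `pca`, `kmeans`, and `cluster_shares` are hypothetical, and standardizing the variables before PCA and using a plain Lloyd's-style k-means are assumptions made only for illustration.

```python
import numpy as np


def pca(X, n_components):
    """PCA via SVD on standardized columns (standardization is an assumption).

    Returns (scores, loadings): scores has shape (n_samples, n_components),
    loadings has shape (n_features, n_components).
    """
    Z = (X - X.mean(axis=0)) / X.std(axis=0)
    # Rows of Vt are the principal directions, ordered by explained variance.
    _, _, Vt = np.linalg.svd(Z, full_matrices=False)
    loadings = Vt[:n_components].T
    scores = Z @ loadings
    return scores, loadings


def assign(X, centers):
    """Index of the nearest center (squared Euclidean distance) for each row."""
    d = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    return d.argmin(axis=1)


def kmeans(X, k, n_iter=100, seed=0):
    """Plain Lloyd's-style k-means; returns (labels, centers)."""
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(n_iter):
        labels = assign(X, centers)
        # Recompute each center; keep the old one if a cluster is empty.
        new_centers = np.array([
            X[labels == j].mean(axis=0) if np.any(labels == j) else centers[j]
            for j in range(k)
        ])
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return assign(X, centers), centers


def cluster_shares(labels, outcome, k):
    """Share of positive outcomes (e.g. GreenLeader) in each non-empty cluster."""
    return {j: float(outcome[labels == j].mean())
            for j in range(k) if np.any(labels == j)}


# Synthetic stand-in for the data: 500 accommodations, 33 continuous variables,
# and a binary outcome with an arbitrary 10% base rate (all assumptions).
rng = np.random.default_rng(42)
X = rng.normal(size=(500, 33))
green = rng.random(500) < 0.10

scores, loadings = pca(X, n_components=4)   # loadings: what a panel-A heatmap would show
labels, centers = kmeans(scores, k=4)       # four clusters, as in panel B
shares = cluster_shares(labels, green, k=4) # outcome share per cluster
```

Plotting `scores[:, 0]` against `scores[:, 1]`, split by `labels` and colored by `green`, would give a view analogous to panel C. Because the inputs here are random, the resulting loadings and shares carry no meaning; only the sequence of steps is illustrated.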
