Offline biases in online platforms: a study of diversity and homophily in Airbnb

How diverse are sharing economy platforms? Are they fair marketplaces, where all participants operate on a level playing field, or are they large-scale online aggregators of offline human biases? Often portrayed as easy-to-access digital spaces whose participants receive equal opportunities, such platforms have recently come under fire due to reports of discriminatory behaviours among their users, and have been associated with gentrification phenomena that exacerbate preexisting inequalities along racial lines. In this paper, we focus on the Airbnb sharing economy platform, and analyse the diversity of its user base across five large cities. We find it to be predominantly young, female, and white. Notably, we find this to be true even in cities with a diverse racial composition. We then introduce a method based on the statistical analysis of networks to quantify behaviours of homophily, heterophily and avoidance between Airbnb hosts and guests. Depending on cities and property types, we do find signals of such behaviours relating both to race and gender. We use these findings to provide platform design recommendations, aimed at exposing and possibly reducing the biases we detect, in support of a more inclusive growth of sharing economy platforms.


I. INTRODUCTION
Sharing economy platforms are new manifestations of century-old phenomena. Resource circulation systems that facilitate the exchange of underutilized goods or services between consumers have long existed in the form of flea markets, garage sales, and second-hand shops, to name just a few. However, what used to be small-scale, local instances of collaborative consumption have now become massive online marketplaces, where face-to-face interactions have been replaced by technology-mediated ones [1].
A fundamental question arises about the role that such decentralized, largely unregulated online platforms play in our societies. Often thought of as level playing fields, where all participants receive the same opportunities, sharing economy platforms might instead end up acting as online aggregators of well-known offline human dynamics and biases. Indeed, a number of studies have suggested that some of the big sharing economy players are acting as accelerators of gentrification phenomena that are already underway in large cities. For example, Airbnb has led to the emergence of short-term rent gaps between different areas of New York City [2] and has contributed to exacerbating the affordable housing crisis in Los Angeles [3]. These phenomena, in turn, typically accelerate preexisting divides along racial lines, fostering inequalities between the Airbnb community and the communities living in the neighbourhoods where Airbnb has a significant presence (see, e.g., [4]).
Moreover, sharing economy platforms have faced multiple allegations of discrimination taking place within the platforms themselves. For example, Uber drivers were found to be twice as likely to cancel trips requested by passengers with African American-sounding names as trips requested by passengers with White-sounding names (even though Uber penalizes drivers for cancellations) [5]. Similarly, Airbnb hosts were found to be turning down potential guests based on their racial background [6].
While headway has been made in tackling the unintended consequences brought about by sharing economy platforms, the debate on their socio-economic impact is still in its infancy, and relies only on a handful of studies or on anecdotal evidence. This, in turn, delays the execution of targeted interventions to expose, and possibly reduce, such consequences. The goal of this paper is to help inform this debate by performing a large-scale empirical analysis aimed at detecting systematic statistical evidence of 'offline' biases taking place in online sharing economy platforms.
We study the Airbnb hospitality service, and focus first on the composition of its user base, with the aim of assessing its diversity both in general terms and contextually with respect to the city hosting it. Second, we employ a network methodology to assess the statistical significance of host-guest interactions in Airbnb. In particular, we focus on homophily, i.e., the social phenomenon where people gravitate towards those like themselves [7], and on its opposite, heterophily. We also study the tendency to avoid members of a social group with different social traits, which we refer to as avoidance. While avoidance is universally deemed unacceptable, homophily has sometimes been perceived as 'natural', and thus judged in a more accepting way. However, several studies have shown that the aggregation of slightly biased individual preferences can lead to unintended and collectively undesirable consequences, as evidenced by Schelling's work on urban racial segregation [8,9], and Neal's work on school children's development [10].
In performing this study, we make three contributions:
• We gather data about Airbnb hosts, guests, and their interactions for five cities spanning three continents (Airbnb Data section). These are Amsterdam (The Netherlands), Dublin (Ireland), Hong Kong (China), and Chicago and Nashville (U.S.). We chose them so as to cover geographically (and culturally) different cities, as well as variations in size, population composition, and cost of living.
• We study the diversity of the Airbnb user base in the above five cities along the dimensions of gender, age, and race. We find the Airbnb community to be predominantly female, and overwhelmingly young and White. In line with the aforementioned literature, we find the majority of hosts to be White even in cities whose racial composition is significantly more diverse (Results section).
• We model Airbnb's peers and interactions as nodes and edges in a bipartite graph, and use a statistical method based on network rewiring to systematically identify edges (i.e., guest-host pairings) that cannot be attributed to chance (Method section). We apply this method to the five cities under study and, depending on the specific city and property type, find signals of homophily, heterophily and avoidance. We find such signals to be rather strong in the case of gender, rather weak (although still statistically significant) in the case of race, and mostly absent in the case of age (Results section).
These results echo other findings in the literature (see next section), and provide concrete evidence about how sharing economy platforms are being appropriated in different city contexts, possibly resulting in large divides between the online communities who can enjoy the benefits of the sharing economy and the 'offline' urban communities who are most exposed to its expansion. They also offer an opportunity to inform the design of tailored technology interventions aimed at exposing, and possibly reducing, certain behaviours, while also providing the means to monitor their effects (Discussion section).

II. RELATED WORK
Upon its inception, the Internet was expected to create a global level playing field, where the inequalities of the 'offline' world would be overcome thanks to easy access to digital opportunities. Yet, reality has been very different. As is invariably the case, different social groups are not equally equipped to face technological innovation in its early stages, which typically exacerbates preexisting inequalities [11].
The sharing economy, as a whole, has been no exception. Indeed, a handful of studies have shown that the ability to seize the sharing economy's opportunities is often severely limited by geographical and socio-economic constraints. For example, Airbnb listings are usually more concentrated in wealthier, more attractive areas populated by young and tech-savvy residents [12]. Similarly, TaskRabbit users from areas with low socioeconomic status and/or low population density were found to have a harder time both when selling their services and when seeking to outsource work to potential taskers [13], while individuals living in deprived Chicago suburbs have been found to have a harder time getting an Uber ride [14].
Similarly, the Internet's promise to circumvent physical barriers and improve communication between social groups has not always been upheld. For example, episodes of racial discrimination in online social networks have been extensively documented [15,16]. Also, a vast amount of scholarly work has been devoted to understanding the formation of online preferential relationships between individuals. This has often been explained either in terms of interest-based homophily, e.g., showing the impact of ideological homophily in determining the opinions and content individuals are exposed to on social media [17], or in terms of homophily driven by demographics.
Studies of early social networks, e.g., MySpace [18], have identified race, gender, and age as the main demographic features driving online homophily, and these elements have kept recurring in more recent studies. Indeed, evidence of racial [19] and gender [20] homophily has been reported in Facebook and Twitter, respectively, and evidence of both has been documented in the social networks underpinning location sharing applications [21]. Age homophily is somewhat less studied, but is still documented in a study of the Facebook social graph [22] and in niche environments such as virtual worlds [23,24].
Our work follows this stream of literature and investigates whether well-known 'offline' biases also take place in sharing economy platforms. A handful of recent studies have started to look at such platforms from this perspective. Indeed, recently published work [5] found evidence of both gender and racial discrimination in Uber and Lyft, as female passengers were disproportionately taken on longer and more expensive routes, while passengers with African American-sounding names were twice as likely to receive trip cancellations from Uber drivers compared to passengers with White-sounding ones (even though Uber penalizes drivers for cancellations). Similarly, another study [25] found gender and race to have an impact on worker evaluations in online freelance marketplaces.
Evidence of biased behaviour was also found in Airbnb by means of a field experiment [6]. In particular, guests were found to be 16% less likely to have their booking accepted if they had a distinctly African American-sounding name when compared to identical guests with White-sounding names. Similarly, in [26] it was found that non-Black hosts charged on average 12% more for an equivalent rental compared to Black hosts, and similar results were replicated in a subsequent study on Airbnb [27], where Asian and Hispanic hosts were found to rent at prices 9.3% and 9.6% lower, respectively, than their White counterparts.
While the above works investigate some specificities of user demographics and interactions in sharing economy platforms, a systematic analysis of these dimensions across the fundamental features of gender, age, and race is still lacking. This work aims at filling this void, by providing (i) an overview of the composition and diversity of Airbnb's community, and (ii) a quantitative method to dissect the anatomy of user-user interactions in sharing economy platforms (and Airbnb in particular), providing statistical evidence of homophily and avoidance between certain user groups.

III. AIRBNB DATA
In order to perform this study, we needed two types of data: demographic characteristics of hosts and guests (i.e., gender, age, race), and their pairing dynamics (i.e., who stayed with whom). Since we hypothesise that peers' behaviours might vary in different geographic (and cultural) contexts, we chose to perform this study at the city level, rather than treating the whole of Airbnb as a single analytical context.
To begin with, we accessed city snapshots that the website InsideAirbnb [48] already makes available. We chose five cities (Amsterdam, Chicago, Dublin, Hong Kong, Nashville) so as to have high geographic diversity (these cities span three different continents), as well as high diversity in terms of population composition and cost of living. Records of Airbnb hosts, guests and stays go from 2008 to 2016 for all cities except Nashville, whose Airbnb records start in 2009. For each city, InsideAirbnb makes available a full list of host IDs (from their 'listings' file). We used these IDs to query the Airbnb website and further acquire a host profile picture, the type(s) of property they were renting out (i.e., full property, private room in a shared property, or shared room), and the full list of IDs of all guests that ever left a review for that host (and for what property). We then further queried the Airbnb website with the guest IDs to acquire their profile pictures.
Since Airbnb does not explicitly make available a peer's gender, age and race as attribute-value pairs in the peer's profile, we used image processing software on the collected profile pictures to automatically extract this information. In particular, we first used face localisation software to detect whether the profile picture contained a human face, and if so, to identify the portion of the picture containing it. We tested both FaceReact [49] and Indico [50] on a sample of 50 Airbnb images, manually curated so as to contain a mix of pictures with and without human faces, and with and without background clutter. We found Indico to be significantly more accurate, especially for human images taken at an angle rather than straight-facing the camera. We thus continued only with the latter. Having extracted the bounding box containing a human face, we then used face recognition software to extract attributes. We tested Betaface [51], Sightcorp F.A.C.E. [52], and Face++ [53] on a subset of 250 Airbnb images. We found all three to be equally accurate when detecting gender. Sightcorp was found to be significantly more reliable in recognising age groups, and Betaface in extracting race (note that our analyses will focus exclusively on race, not on ethnicity; in particular, we will focus on three main race categories, i.e., White, Black, and Asian). We thus worked with Sightcorp and Betaface in parallel.
We manually verified their accuracy on all 250 test images, and found the confidence levels reported by both products to be 0.3 or higher (on a scale from 0 to 1) on images annotated correctly. Hence, we kept this value as a threshold for the ensuing automatic annotation; furthermore, we only retained pictures for which both face recognition software products agreed on both gender and ethnicity. To understand how robust our results are when varying facial annotation accuracy, we repeated all our analyses after (i) increasing the above threshold to 0.5, and (ii) manipulating the data by changing the race annotation on a random sub-sample of the images. The results obtained from such analyses are reported in Appendix A.
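The retention rule described above (a minimum confidence level plus agreement between the two tools) can be illustrated with a minimal sketch. The `Annotation` record, the `accept` function, and the idea that each tool returns a single confidence score per picture are assumptions made for illustration only; they do not reproduce the actual output format of Sightcorp or Betaface.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

CONFIDENCE_THRESHOLD = 0.3  # threshold stated in the text (0.5 in the robustness check)


@dataclass
class Annotation:
    """Hypothetical output of one face-attribute tool for one profile picture."""
    gender: str
    race: str
    confidence: float  # assumed to lie in [0, 1]


def accept(a: Annotation, b: Annotation,
           threshold: float = CONFIDENCE_THRESHOLD) -> Optional[Tuple[str, str]]:
    """Return (gender, race) if both tools are confident enough and agree, else None."""
    if a.confidence < threshold or b.confidence < threshold:
        return None  # at least one tool is below the confidence threshold
    if a.gender != b.gender or a.race != b.race:
        return None  # the two tools disagree, so the picture is discarded
    return (a.gender, a.race)
```

Raising `threshold` to 0.5 in this sketch corresponds to the stricter setting used in the robustness analysis.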
In terms of pairing dynamics, Airbnb does not make visible who stays with whom, nor whether a stay request has been refused or cancelled. However, what it does make visible are the reviews that hosts and guests leave for one another. We use these as proxies for the actual pairing dynamics. Studies show that over 65% of stays result in a guest review and 72% result in a host review [28], so most stays are indeed captured by reviews. At present, it is not known whether those who do not leave reviews in Airbnb belong to specific user groups; a past survey study of Tripadvisor reviews [29] did find that certain age and gender groups were more vocal than others, and this might also be the case in this context. Although the method we present next is still applicable, the validity of some of our findings might be impacted, and we will come back to this when we discuss limitations and future work (Conclusion section).
Summary statistics about the number of hosts, guests and pairings that we collected and annotated for each city under study are reported in Table VIII.

IV. METHOD
We model Airbnb hosts and guests as nodes in a bipartite graph, with a directed g → h edge with weight w_gh representing the number of times guest g stayed with host h. Since we hypothesise that pairing dynamics may vary across cities, as well as across types of rented property (full property vs. shared, the latter comprising both private and shared rooms), we create and analyse a total of 10 (5 cities × 2 property types) bipartite networks.
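As an illustration of this representation, the following minimal sketch builds the weighted guest → host edges from a list of review records. The function names and the toy identifiers are hypothetical; the only assumption taken from the text is that each review stands for one stay.

```python
from collections import Counter


def build_network(reviews):
    """Map each (guest, host) pair to w_gh, the number of recorded stays."""
    return Counter((guest, host) for guest, host in reviews)


def out_strength(weights):
    """Total number of stays per guest (sum of outgoing edge weights)."""
    totals = Counter()
    for (guest, _host), w in weights.items():
        totals[guest] += w
    return totals


def in_strength(weights):
    """Total number of stays per host (sum of incoming edge weights)."""
    totals = Counter()
    for (_guest, host), w in weights.items():
        totals[host] += w
    return totals


# Toy example: guest g1 stayed twice with host h1.
reviews = [("g1", "h1"), ("g1", "h1"), ("g2", "h1"), ("g2", "h2")]
weights = build_network(reviews)
```

Keeping guests and hosts in separate positions of the edge key keeps the two node sets distinct, even if a person appears on the platform in both roles.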
Each such network is analysed using a statistical rewiring approach designed to assess the significance of pairing patterns in each of the cities studied. More precisely, the method starts from a null hypothesis that a given guest-host pairing occurred randomly. It then proceeds to verify whether this hypothesis holds by creating ensembles of null network models through the rewiring of the original networks' edges, and by comparing the properties of such null network models against those observed in the actual, empirical networks. Crucially, the procedure is designed to preserve the heterogeneity of the original networks, as it produces null network configurations where the number of stays that each guest and host have had are both kept intact, therefore preserving correlations between demographic features and activity on Airbnb. In the following, we describe the details of this methodology, and the rationale for adopting it.
Starting from an empirical bipartite network, we create a randomised version of it by iteratively performing xSwap operations [30]. These amount to selecting two guest nodes (g_1 and g_2) and one of their corresponding host nodes (h_1 and h_2, respectively) at random, erasing the existing edges (g_1 → h_1 and g_2 → h_2) between both pairs, then reassigning them to each other (g_1 → h_2 and g_2 → h_1), as shown in Figure 1 (left). Should either of the selected edges have a weight larger than one, the strength of the link is reduced by one and a unit weight is redistributed to the new host node (see Figure 1, right). These operations preserve both the outgoing weight of guest nodes and the incoming weight of host nodes. Therefore, repeated xSwap operations yield configurations which are exactly equivalent to the original networks in terms of their heterogeneity, but are instead fully randomized in terms of the relationships between their nodes [31]. In order to determine how many xSwap operations were needed before the rewired network configurations could be considered distinct enough (i.e., sufficiently uncorrelated) from their original counterparts, we computed Kendall's correlation coefficient as suggested in [32].
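The swap operation can be sketched as follows. This is a simplified illustration rather than the implementation used in the study: it assumes the network is stored as a list with one (guest, host) entry per stay, so that an edge of weight w appears w times and moving one entry corresponds to redistributing a unit of weight. The function name `xswap` and its parameters are hypothetical.

```python
import random


def xswap(stays, n_swaps, seed=None):
    """Rewire a list of unit stays by exchanging hosts between random pairs.

    Each entry of `stays` is one (guest, host) stay. Every swap keeps the
    number of stays of each guest and of each host unchanged.
    `n_swaps` counts attempted swaps, including skipped ones.
    """
    rng = random.Random(seed)
    stays = list(stays)  # work on a copy, leave the input untouched
    for _ in range(n_swaps):
        i, j = rng.sample(range(len(stays)), 2)  # two distinct stays
        (g1, h1), (g2, h2) = stays[i], stays[j]
        if g1 == g2 or h1 == h2:
            continue  # exchanging hosts would not change the network
        stays[i], stays[j] = (g1, h2), (g2, h1)
    return stays
```

Because only the host entries are exchanged, the per-guest and per-host stay counts of the returned list match those of the input by construction.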
For each of the 10 empirical networks under consideration, we generated an ensemble of 1000 randomized network configurations with the above rewiring procedure, thus preserving the original networks' degree sequences. Since this rewiring procedure generates null configurations that do not deviate substantially from the original networks themselves, we can resort to a fairly parsimonious numerical investigation based on such a relatively small number of null configurations. On this configuration ensemble, we computed the frequency of interaction between different groups of hosts and guests based on gender, age, and race (e.g., female guests to male hosts, White guests to White hosts, etc.). We then took the 95% confidence interval to define lower and upper bound probabilities for the occurrence of each pairing combination. This range represents the expected probabilities of pairing combinations taking place by chance. We then computed the pairing probabilities among the same groups in the original empirical network: should the actual values fall below or above the 'by-chance' range, we take them to be statistically significant under/over-expressions of certain group interactions with respect to the null hypothesis. Note that the over-expression of interactions between two groups (or within a group) does not necessarily translate into the under-expression of interactions between other groups; we will use the former to detect a tendency of certain groups to associate, and the latter to separately detect a tendency to avoid certain groups instead.
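The comparison between observed and 'by-chance' pairing frequencies can be sketched as below. The text does not specify how the 95% interval is computed, so this sketch assumes empirical 2.5th and 97.5th percentiles of the null ensemble; all function names and the attribute dictionaries are hypothetical.

```python
import math


def pairing_frequency(stays, guest_attr, host_attr, guest_group, host_group):
    """Fraction of stays that link a guest in guest_group to a host in host_group."""
    hits = sum(1 for g, h in stays
               if guest_attr[g] == guest_group and host_attr[h] == host_group)
    return hits / len(stays)


def null_interval(null_values, level=0.95):
    """Empirical central interval of the frequencies measured on the null ensemble."""
    vals = sorted(null_values)
    tail = (1 - level) / 2
    lo = vals[math.floor(tail * (len(vals) - 1))]
    hi = vals[math.ceil((1 - tail) * (len(vals) - 1))]
    return lo, hi


def classify(observed, null_values, level=0.95):
    """Label an observed frequency relative to the 'by-chance' range."""
    lo, hi = null_interval(null_values, level)
    if observed > hi:
        return "over-expressed"
    if observed < lo:
        return "under-expressed"
    return "compatible with null"
```

In use, `null_values` would hold one `pairing_frequency` value per rewired configuration (1000 in the study), and `observed` the same quantity measured on the empirical network.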

V. RESULTS
In this section, we report the results obtained when applying the above method to analyse Airbnb users' pairing dynamics. Before doing so, we provide an overview of the demographic characteristics of Airbnb hosts and guests, whose profile pictures we scraped and annotated using the process described in the Airbnb Data section. We will discuss and elaborate on these results in the Discussion section.

A. Demographics
In Table IX we report summary statistics of Airbnb users in terms of gender, broken down by city and by property rental type (full vs. shared). As shown, Airbnb has a predominantly female user base: in all cities under study, the majority of both hosts and guests were found to be female, regardless of the property type, with this proportion getting close to or exceeding 60% in a number of cases.
We next consider race, once again as it varies by city and by property rental type (see Table X). We focus our attention on hosts only in this case, so as to compare our results with available census information on the demographics of the cities under study. In all cities, the majority of hosts were found to be racially White, even in those with a markedly diverse racial makeup (e.g., Hong Kong, Chicago, Nashville). We shall comment extensively on this finding in the next section.
The last demographic feature we consider is age. Table IV reports the breakdown of each considered city's host population based on quintiles (we found the summary statistics to be very similar in the case of hosts, as well as when separating hosts based on full or shared property rentals). As shown in Table IV, Airbnb has a predominantly young community, with the vast majority of users found to be between their mid-twenties and late thirties.

B. Rewiring Analysis
We now present results of Airbnb host/guest pairing dynamics and compare them with the results of the rewiring analytical method illustrated above. We break down results by focusing on one demographic characteristic at a time for both hosts and guests. In this work, we did not investigate interactions between different features (e.g., between age of a host and gender of a guest); the same methodology could be used in the future to study these interactions.
Gender-related pairings. As shown in Table XI, we found very different, strongly city-dependent patterns. Indeed, we found same-gender interactions to be prevalent in some cities (e.g., Amsterdam), while in others we found interactions between hosts and guests of different genders to be prevalent (e.g., Nashville). Yet, these results per se are not necessarily informative, as they partially echo each city's Airbnb population composition, and become statistically relevant only when compared with the expected rate of interaction measured under the null hypothesis of random interaction encoded in the above link rewiring procedure. As reported in Table XI, depending on the city and property type we detect very different over/under-expression patterns. Namely, when looking at full property rentals, we find same-gender interactions to be over-expressed (homophily), and interactions between different genders to be under-expressed (avoidance), in all cities. Things change considerably when looking at shared properties, where in the case of Dublin we find interactions between different genders to be over-expressed (heterophily) and, symmetrically, same-gender interactions to be under-expressed, while in Nashville we find interactions to be compatible with the null hypothesis.
Race-related pairings. Results capturing pairing dynamics in terms of race are shown in Tables VI and VII for full and shared properties, respectively, once again broken down per city. As a general observation, we can note that the signals we detect are not as strong as the ones detected for gender, with most over/under-expressions being of the order of fractions of a percentage point. Yet, strictly speaking, in most cases we detect an over-expression of White-to-White, Asian-to-Asian, and Black-to-Black guest-host pairings, regardless of whether the property was a shared rental or not (although with notable exceptions in Dublin and Nashville). Of note, the presence of racial homophily was found to be strongest in Hong Kong, both in terms of White-to-White and Asian-to-Asian guest-host pairings.
If we now shift focus to pairing dynamics between different racial groups, we find results to be almost symmetric: that is, the majority of pairings between hosts and guests from different racial groups were under-expressed, although with a few exceptions (see, e.g., Dublin's full property case), and these results appear to be largely independent of the property rental type.
We verified the robustness of the above findings against possible confounding factors associated with the users' economic status. Namely, we checked whether the homophily and avoidance patterns shown in Tables VI and VII might arise as a byproduct of correlations between racial background and wealth/income. We used the number of Airbnb properties owned and the price charged for a week-long stay as proxies for a host's income. We then performed two types of analysis: (i) a matched pair analysis [33], to measure the rate of interaction of White guests across groups of White and non-White hosts with similar levels of income; and (ii) a rewiring analysis on the sub-networks obtained after removing all hosts belonging to the top and bottom thirds of the income distribution. In both cases, by focusing our attention on hosts of similar economic status, we try to control for hosts' wealth. The results obtained from the matched pair analysis were statistically significant only in Hong Kong and Chicago (full properties); in these settings, they do confirm those reported in Table VI. The results obtained from the rewiring analysis were all significant and fully in line with the ones reported in Tables VI and VII. Full results are reported in Appendix B.

VI. DISCUSSION
In this section, we discuss the results presented above, trying to offer an interpretation based on their societal context, and proposing recommendations concerning the design of sharing economy platforms. We break down the discussion into two parts: we first consider Airbnb's user base in each city under study, and reflect upon it relative to the city's demographic, economic, and historical context; we then move our discussion to our findings on pairing dynamics.

A. City Demographics vs. Airbnb Community
Echoing the findings from other studies of the sharing economy, our investigation into the user base of Airbnb revealed a disparity between its communities and the city-level demographics surrounding them, in terms of age, gender and race. As far as age is concerned, we found Airbnb hosts and guests to be overwhelmingly young (mid-twenties to mid-thirties). This can be interpreted as a reflection of the broader age-related digital divide phenomenon [34].
In terms of gender, we have found the Airbnb community to be predominantly female. In 2015, Airbnb reported that 54% of their guests were female [35]. Based on the data we have collected up to 2017 for the cities under study, this percentage seems to be substantially higher (60% and above), both for hosts and for guests, and both for private and for shared rental properties. We do not know whether this is a signal of an evolutionary mechanism, whereby more female than male users join the platform attracted by homophily, as they see more female users already engaged with it.
Perhaps the most notable results were found in terms of race. In the following, we focus our attention only on hosts, so as to compare our results with available census information on the demographics of the cities under study. In all cities, the majority of hosts were found to be racially White. In particular, Dublin was found to have the highest proportion of White hosts (at 96% for full property rental); this is expected for a city whose resident population was reported to be 90% racially White in the latest Census (2011) [54].
Things are considerably different in the more diverse cities we analyzed, where we found well-known inequalities along racial lines to be largely replicated in Airbnb's interactions, with systematic evidence of a divide between the Airbnb community and the local demographics. Indeed, we found up to 64% of Airbnb hosts in Hong Kong to be White (for full property rentals), even though 92% of the population is reported to be of Chinese (i.e., Asian) racial background as per the 2016 Census [55]. Hong Kong is well known to be plagued by rising wealth inequality [36] and exorbitant property prices, which are known to be among the most unaffordable in major cities [37]. This is coupled with high income inequality, with Hong Kong's racially White population amassing relatively high household income, while the same does not hold for the majority of the local Chinese population [38]. Owning a spare property to rent out might thus be a privilege mostly in the hands of the White population.
Similarly of interest is Amsterdam's large majority of racially White hosts, resting at roughly 90% of the total Airbnb user pool. This too is substantially higher than expected, given that a third of Amsterdam's population is composed of migrants recognized to be of non-western racial origins [56]. This finding is in line with research conducted by the Netherlands' Central Commission for Statistics (CCS), which has highlighted the existence of societal integration issues [39] amongst the Netherlands' non-western population, which we speculate to be reflected in a decreased ability to secure properties to rent out.
Chicago and Nashville were found to have the highest proportion of Black hosts and guests recorded among the cities under study, averaging around 3-4% across the two property rental types. However, this result too is notable, given that the most recent census reports Chicago's and Nashville's total populations to be 32% [57] and 28% [58] racially Black, respectively, an order of magnitude more than what we found on Airbnb. This resonates with recent research suggesting that Airbnb is a conduit for racial gentrification where the old, local community members of a neighborhood lose out in housing and in wealth [4].

B. Homophily and Avoidance
Our statistical investigation on pairing dynamics detected evidence of homophily both in terms of gender and race. Gender homophily is well documented to be 'built in' even in young children [40,41], so it is not surprising that we could detect it in our results too. Conversely, it was interesting to see it supplanted by heterophilous behavior (i.e., a statistical over-expression of interactions between hosts and guests of different genders) in the case of Nashville, regardless of the property type, and both in Dublin and Hong Kong when switching from full property to shared property rentals. This is even more interesting when considering that full properties obviously do not imply any shared space between hosts and guests, and often allow them to avoid any live interaction through automated check-in procedures. We speculate that a possible explanation behind this might lie in the different communities that naturally self-select based on property type, with those selecting shared accommodation most likely being more open-minded and prone to meeting different people (see [42] for similar findings in couchsurfing platforms).
In addition, we detected weaker statistical signals of homophily when analysing pairing dynamics based on racial background. Once again, this is somewhat to be expected, as racial homophily has been detected in a broad variety of social environments [7], ranging from labour markets [43] to online social networks [19]. Symmetrical to this, we also detected phenomena of racial avoidance (still with quite weak statistical signals in most cases), i.e., under-expressions of relationships between guests and hosts belonging to different racial groups. This, again, resonates with pre-existing literature. For example, racial avoidance has been found to partially explain relocation patterns within countries [44]. These results were accompanied by a few exceptions where we detected under-expressed homophily (White-to-White relationships in Dublin's and Nashville's full properties) and heterophily (e.g., Asian-to-White relationships in Dublin).
Since our results are of a purely statistical nature, we can only highlight which relationships are over- or under-represented, without making any claims on causality. In particular, we are in no position to distinguish between avoidance and outright discrimination. Yet, some of the trends we observe are worrying and raise questions about potential countermeasures that platforms might adopt in order to monitor such trends and possibly control them. For example, following research on unconscious biases [45], platforms could design interventions aimed at providing users with detailed information about the peers they chose to interact with (or not) in the past, possibly highlighting systematic preferences or deviations from the outcomes that would be obtained under an unbiased selection process. Interventions could also go a step beyond raising awareness of individual behaviours. For example, they could encourage behaviours that enhance heterophily (which we already detected in some cities) by means of incentive systems similar to those that are already in place to promote service excellence (e.g., rewarding outstanding Airbnb hosts with a 'superhost' status). In this fashion, users with a history of interactions with peers from different racial backgrounds could be rewarded with badges or statuses highlighting their role as diversity champions. Last but not least, platforms could incentivise users to give up potentially unnecessary steps in the interaction process where additional, and potentially biasing, information about other peers is usually acquired; for example, Airbnb's 'instant booking' option, where a guest's request is automatically accepted by the platform without an explicit consent action from the host, has been an exemplary step in this direction.

C. Limitations
We ought to acknowledge three main limitations of this work. First, in the generation of the null network models, we have not enforced the preservation of temporal constraints (i.e., it is possible for a stay that occurred between guest $g_1$ and host $h_1$ in year $y_1$ to be swapped with a stay between $g_1$ and $h_2$, despite host $h_2$ joining Airbnb only in year $y_2 > y_1$). We chose to adopt this simplified approach under the assumption that the demographic composition of Airbnb has not changed significantly between 2008 and 2016 (e.g., women have consistently made up the majority of the Airbnb host community [46]). In the future, we will consider generating null network models that preserve timing constraints (see, e.g., [47]).
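To make the temporal constraint concrete, the following is a minimal sketch of what a time-respecting swap move might look like. It is not the procedure used in this paper (which does not enforce timing constraints), and the data layout (stays as `(guest, host, year)` tuples, a `host_join_year` dictionary) and the function name `temporal_swap` are assumptions made purely for illustration.

```python
import random

def temporal_swap(stays, host_join_year, rng):
    """Attempt one swap of hosts between two randomly chosen stays.

    stays: list of (guest, host, year) tuples (hypothetical layout).
    host_join_year: dict mapping each host to the year it joined.
    The number of stays per guest and per host is preserved.
    Returns True if the swap was applied, False if it was rejected.
    """
    i, j = rng.sample(range(len(stays)), 2)
    g1, h1, y1 = stays[i]
    g2, h2, y2 = stays[j]
    # A swap between identical hosts or identical guests changes nothing.
    if h1 == h2 or g1 == g2:
        return False
    # Temporal constraint: a host cannot receive a stay dated
    # before the year in which it joined the platform.
    if host_join_year[h2] > y1 or host_join_year[h1] > y2:
        return False
    stays[i] = (g1, h2, y1)
    stays[j] = (g2, h1, y2)
    return True
```

Under this sketch, the problematic case mentioned above (guest $g_1$ reassigned to a host that joined only after year $y_1$) is simply rejected, at the cost of a lower acceptance rate for proposed swaps.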
Second, our findings rely on the accuracy of several image processing tools used to automatically annotate profile pictures in terms of gender, age and race. If the accuracy of these annotations is low, then the findings are void. In this paper, we have tried to reduce this risk by cross-validating annotations across several image processing tools, and by verifying the robustness of our findings with respect to variations in their accuracy. Even so, we had to disregard any user whose profile picture did not present a (recognisable) human face, or for whom the estimated confidence of the annotation was low. Platform owners are most likely in possession of more accurate demographic information, explicitly provided at the time of user registration; they could thus skip the image annotation step (Airbnb Data section) and directly use this information to annotate nodes in the bipartite graph, then proceed with the application of the statistical network analysis method we proposed (Method section) to extract more robust results.
Finally, we ought to acknowledge that our analysis of Airbnb pairing dynamics was limited to what the platform makes externally visible (i.e., reviews that hosts and guests leave to one another after a stay); results might differ if one had the opportunity to apply our method to the whole history of interactions (including stays that resulted in no reviews, and reservation requests that were cancelled/refused). Once again, platform owners do possess the whole interaction history, and might thus want to repeat this study so as to validate our findings on a complete network of host/guest stays. Yet, we have reason to believe our results would hold regardless. Indeed, the large samples our analysis relies on are such that only major differences in the tendency to leave reviews between groups would affect the significance of the findings reported in this paper.

VII. CONCLUSION
In this paper, we have gathered and analyzed data to assess the diversity of the Airbnb community, especially with respect to the cities where it is embedded, and we have presented a method based on the statistical analysis of networks to detect homophily, heterophily and avoidance between different groups in the Airbnb community. To the best of our knowledge, network rewiring techniques, and, more generally, null network ensembles have never been employed as a tool to detect bias, and this application represents an element of novelty of our work.
Our findings suggest that, in all cities under study, certain user groups (e.g., young, White, female) are substantially over-represented compared to the local population; furthermore, statistically significant signals of gender and racial homophily were detected, across all cities and regardless of the property rental type.
Taken together, our findings echo the sentiment that perhaps, contrary to all branding, the sharing economy community might not be that diverse. Rather, platforms such as Airbnb might be acting as accelerators for gentrification processes that are already well underway in major cities. While policy and legislation interventions are needed to regulate who benefits from sharing economy platforms [12], technological considerations also deserve attention: for example, Airbnb, like most sharing economy platforms, requires hosts to have a traditional bank account at the time of registration, to which money will then be deposited when guests visit. This might hinder the ability of many hosts from socioeconomically deprived backgrounds to join the platform in the first place (which might be reflected in the very low representation from certain racial backgrounds in our studies); alternative solutions, such as on-demand payment services provided by platforms like BitPesa [59], could lower the barrier to entry.
The results presented in the main paper have been obtained by using two facial annotation software tools (Sightcorp and Betaface) in parallel. The user pictures retained in our dataset were only those for which both products provided the same annotations with a confidence higher than 0.3 (on a scale from 0 to 1). As a robustness test of our results, we repeated our analyses on a restricted dataset limited to images for which both tools provided the same annotations with a confidence higher than 0.5.
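The agreement filter described above can be sketched as follows. This is only an illustration: the input layout (one dictionary per tool, mapping a user identifier to a `(label, confidence)` pair) and the function name `retain_agreeing` are hypothetical and do not reflect the actual output format of either product. The sketch also assumes that the threshold must be exceeded by both tools.

```python
def retain_agreeing(annotations_a, annotations_b, threshold=0.3):
    """Keep users on which two annotation tools agree with enough confidence.

    annotations_a, annotations_b: dicts mapping a user id to a
    (label, confidence) pair, with confidence in [0, 1] (hypothetical layout).
    Returns a dict mapping each retained user id to the agreed label.
    """
    kept = {}
    for user, (label_a, conf_a) in annotations_a.items():
        if user not in annotations_b:
            continue  # no face detected by the second tool
        label_b, conf_b = annotations_b[user]
        # Assumption: both confidences must exceed the threshold.
        if label_a == label_b and conf_a > threshold and conf_b > threshold:
            kept[user] = label_a
    return kept
```

Calling the function with `threshold=0.5` would yield the restricted dataset used for the robustness test.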
The summary statistics about the number of hosts, guests and pairings that we collected and annotated with confidence ≥ 0.5 for each city are reported in Table VIII. The number of annotated users decreases slightly with the higher confidence threshold, but the demographic features of this subset of data remain largely unchanged with respect to those reported in the main paper (see Tables IX and X).
In Tables XI and XII we report the results obtained from the rewiring analysis on this restricted dataset on gender and race, respectively. As can be seen, the over- and under-expression patterns we find in the gender-related pairings are exactly the same as those obtained with the lower confidence threshold reported in the main paper. The same applies to the race-related pairings, which we find to be consistent with those obtained with the lower threshold in all but a few cases (see, e.g., Black-White pairings in Nashville's full property rentals).
As an additional robustness check of our results, we repeated our analysis on race-related pairings after an artificial manipulation of the data. Namely, we altered the race annotation of a randomly selected sample made of 5% of all White users in each city, changing their annotation to Black or Asian with probability 1/2 each. The results obtained from the rewiring analysis on this manipulated dataset are shown in Table XIII. As one might expect, this leads to a few changes with respect to the results presented in the main paper. For example, the over-expression of White-White interactions is eliminated in the case of Amsterdam. However, most of the homophily, heterophily, and avoidance patterns reported in the main paper do not change.
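The label manipulation used in this robustness check can be sketched as below. The data layout (a dictionary mapping user identifiers to a race label) and the function name `perturb_white_labels` are assumptions for illustration only; the sketch would be applied separately to the users of each city.

```python
import random

def perturb_white_labels(race_by_user, fraction=0.05, seed=None):
    """Relabel a random fraction of White users as Black or Asian.

    race_by_user: dict mapping a user id to "White", "Black" or "Asian"
    (hypothetical layout). Returns a new dict; the input is not modified.
    """
    rng = random.Random(seed)
    white_users = sorted(u for u, r in race_by_user.items() if r == "White")
    n_flip = round(fraction * len(white_users))
    perturbed = dict(race_by_user)
    for user in rng.sample(white_users, n_flip):
        # Each selected user becomes Black or Asian with probability 1/2.
        perturbed[user] = rng.choice(["Black", "Asian"])
    return perturbed
```

The rewiring analysis would then be rerun on the perturbed labels, and its over- and under-expression patterns compared with those obtained on the original annotations.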

FIG. 1: xSwap rewiring moves. Left: swap of two links with unit weight. Right: swap of a unit weight subtracted from two links with weights larger than one.

TABLE I: Number of hosts, guests, and host-guest pairs annotated for each city analysed

TABLE II: Airbnb host and guest population by gender (F=female)

TABLE IV: Quintiles of Airbnb's host age distributions for full property rentals. Q1 denotes the bottom 20% of the age distribution, Q2 denotes users falling between the bottom 20% and 40% of the age distribution, and so on.

TABLE VI: Pairings between racial backgrounds of Airbnb guests and hosts (W=White, A=Asian, B=Black) in full property rentals. Values in brackets represent 95% confidence level intervals obtained from the rewiring analysis, while values below them denote the corresponding empirically observed frequencies. Upward green (downward red) arrows highlight over-expressed (under-expressed) values.

TABLE VII: Pairings between racial backgrounds of Airbnb guests and hosts (W=White, A=Asian, B=Black) in shared property rentals. Values in brackets represent 95% confidence level intervals obtained from the rewiring analysis, while values below them denote the corresponding empirically observed frequencies. Upward green (downward red) arrows highlight over-expressed (under-expressed) values.

TABLE VIII: Number of hosts, guests, and host-guest pairs annotated for each city analysed when setting the annotation

TABLE IX: Airbnb host and guest population by gender (F=female) in the dataset restricted to images annotated with

TABLE XII: Pairings between racial backgrounds of Airbnb guests and hosts (W=White, A=Asian, B=Black) in the dataset restricted to images annotated with confidence higher than 0.5. Values in brackets represent 95% confidence level intervals obtained from the rewiring analysis, while values below them denote the corresponding empirically observed frequencies. Upward green (downward red) arrows highlight over-expressed (under-expressed) values.

TABLE XIV: Results of the matched pair analysis. Column headers: City, Property, White/White, non-White/White, Pairs, Stays, p-value.

TABLE XV: Pairings between racial backgrounds of Airbnb guests and hosts (W=White, A=Asian, B=Black) in the subnetworks obtained by removing all hosts belonging to the top and bottom thirds of the distribution of prices charged for a week-long stay. Values in brackets represent 95% confidence level intervals obtained from the rewiring analysis, while values below them denote the corresponding empirically observed frequencies. Upward green (downward red) arrows highlight over-expressed (under-expressed) values.