Urban magnetism through the lens of geo-tagged photography
© Paldino et al. 2015
Received: 27 March 2015
Accepted: 12 May 2015
Published: 29 May 2015
There is an increasing trend of people leaving digital traces through social media. This reality opens new horizons for urban studies. With this kind of data, researchers and urban planners can detect many aspects of how people live in cities and can also suggest how to transform cities into more efficient and smarter places to live in. In particular, these digital trails can be used to investigate the tastes of individuals, and what attracts them to live in a particular city or to spend their vacation there. In this paper we propose an unconventional way to study how people experience the city, using information from geotagged photographs that people take at different locations. We compare the spatial behavior of residents and tourists in the 10 most photographed cities around the world. The study was conducted on both a global and a local level. On the global scale we analyze the 10 most photographed cities and measure how attractive each city is for people visiting it from other cities within the same country or from abroad. For the purpose of our analysis we construct the users’ mobility network and measure the strength of the links between each pair of cities as a level of attraction of people living in one city (i.e., origin) to the other city (i.e., destination). On the local level we study the spatial distribution of user activity and identify the photographed hotspots inside each city. The proposed methodology and the results of our study are a low-cost means to characterize touristic activity within a certain location and can help cities strengthen their touristic potential.
Keywords: city attractiveness; big data; human mobility; urban planning; tourism study; smart city; complex systems; collective sensing; geo-tagged Flickr
The traces of communication and information technologies are currently considered to be an efficient and consolidated way of collecting large and useful data sources for urban studies. There are in fact various ways to electronically track human behavior, and the most widespread one is collecting data from mobile phones [1, 2]. It has already been demonstrated that this technique can be used as an accurate method for understanding crowds and individual mobility patterns [3–5], for classifying how land is used [6–9], or for delineating regional boundaries [10–12]. Moreover, it was shown how to identify some user characteristics from mobile call patterns, for example how to determine whether a user is a tourist or a resident. However, when it comes to studying human mobility patterns, exploring detailed call records is not the only possibility - other sources of big data, collected from digital maps, electronic toll systems, credit card payments [16, 17], Twitter, circulation of bank notes, vehicle GPS traces and geotagged photographs [21–23], can also be successfully applied.
The focus of this paper is on geotagged photographs, which provide novel insights into how people visit and experience a city, revealing aspects of mobility and tourism and discovering the attractions in the urban landscape. In the past, photography was already considered a good means of inquiry in architecture and urban planning, being used for understanding landscapes. Moreover, Girardin et al. showed that it was possible to define a measure of city attractiveness by exploring big data from photograph sharing websites. They analyzed two types of digital footprints generated by mobile phones that were in physical proximity to the New York City Waterfalls: cellular network activity from AT&T and photographic activity from Flickr. They distinguished between attractiveness and popularity. Regarding attractiveness, they defined the Comparative Relative Strength indicator to compare the activity in one area of interest with the overall activity of the city. They measured the attractiveness of a particular event in New York City. In this study we consider the attractiveness of the overall area of the city over three years, comparing the attractiveness and spatial distribution of activities in different cities.
Discovering how to increase the global attractiveness of a city or the local attractiveness of its hotspots requires knowing the differences between the visits made by residents and tourists. While both residents and tourists take photographs at locations that they consider important, their reasons for taking photographs differ. This knowledge helps us to understand the different usages of the urban infrastructure in people’s spare time. The overall goal is to find ways in which passively collected data can be used for low-cost applications that inform urban innovation. These information trends can be of interest for planning, forecasting of economic activity, tourism, or transportation. Finally, a comparative study of cities from different parts of the world is relevant for discerning how strongly the patterns of human behavior depend on a particular city.
In this paper we define the global attractiveness of a city as the absolute number of photographs taken in it by tourists, while the local attractiveness of hotspots within a city is defined by the spatial distribution of photographs taken by all users (either local residents or tourists). In our analysis we use a dataset that consists of more than 100 million publicly shared geotagged photographs taken over a period of 10 years. The dataset is divided into 8,910 files denoting 3,015 different locations (e.g., cities or certain areas of interest such as Niagara Falls), where for almost every location three different labels are given: resident, tourist and unknown, denoting people who live in the area, people who are visiting the area, and uncategorized users.
3 Definition of user home cities and countries
In order to determine whether there is any difference between how residents and tourists are attracted to a certain location, we have to determine the home city and country of each user in the dataset. Even though the dataset has the tags resident, tourist and unknown, the given categorization is not comprehensive and in some cases is not consistent. For instance, for more than 85% of users in the dataset the home city is not defined (i.e., their photographs are always in unknown files). In addition, for almost 25% of the users whose home city is defined, two or more cities are listed as home cities, making the given categorization inconsistent.
For the aforementioned reasons, we used our own criteria to determine whether a person lives in the area where he/she took a photograph. We consider a person to be a resident of the location where he/she took the highest number of photographs (at least 10) over the longest period of time (longer than 180 days), calculated as the time between the first and the last photograph taken at that location. Once we determine a user’s home city, he/she automatically becomes a tourist in all other cities in the dataset. The category ‘tourist’ in this sense denotes many different kinds of visitors, including business visitors. However, most people taking photographs at locations other than their home cities in fact act like tourists at that particular time.
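The residency rule described above can be sketched in a few lines of Python. This is only an illustration, not the authors’ implementation: the function name `infer_home_city`, the input layout (a list of `(city_id, timestamp)` pairs for a single user) and the strict reading that one location must lead on both photograph count and time span are all assumptions made for the example.

```python
from datetime import datetime, timedelta

MIN_PHOTOS = 10      # at least 10 photographs at the location
MIN_SPAN_DAYS = 180  # first-to-last photograph span must exceed 180 days


def infer_home_city(photos):
    """Return the inferred home city for one user, or None if undetermined.

    `photos` is an iterable of (city_id, datetime) pairs (hypothetical layout).
    """
    stats = {}  # city_id -> [photo count, first timestamp, last timestamp]
    for city, taken_at in photos:
        if city not in stats:
            stats[city] = [1, taken_at, taken_at]
        else:
            entry = stats[city]
            entry[0] += 1
            entry[1] = min(entry[1], taken_at)
            entry[2] = max(entry[2], taken_at)
    if not stats:
        return None

    count = {city: s[0] for city, s in stats.items()}
    span = {city: (s[2] - s[1]).days for city, s in stats.items()}

    # Candidate: most photographs, ties broken by the longer time span.
    best = max(stats, key=lambda city: (count[city], span[city]))

    # Assumed strict reading: the candidate must also have the longest span.
    if (count[best] >= MIN_PHOTOS
            and span[best] > MIN_SPAN_DAYS
            and span[best] == max(span.values())):
        return best
    return None


# Made-up example: 12 photographs in "rome" spread over 440 days, 3 in "paris".
start = datetime(2008, 1, 1)
example = [("rome", start + timedelta(days=40 * i)) for i in range(12)]
example += [("paris", start + timedelta(days=d)) for d in range(3)]
home = infer_home_city(example)
```

Under this reading a user whose most photographed location is not also the one with the longest time span is left unclassified, which is one possible reason why only a minority of users receive a home city.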
Of almost 1 million users who took photographs between 2007 and 2010, we were able to determine the home city and country for only 11% using our criteria. However, these users took more than 40% of all photographs (i.e., almost 30 million). Our classification was inconsistent with the initial categorization (i.e., users whose photographs were listed in only one resident file) for less than 2% of users. Moreover, for every city in the dataset we identified its country code, allowing us to classify tourists as domestic or foreign: domestic tourists come from the same country as the considered city, while foreign tourists are all the others, i.e., visitors from different countries. Finally, for every city all the observed activity is assigned to residents, domestic tourists or foreign tourists, and for each of these categories we keep the following data: user id, geo coordinates and, for tourists, their home city id together with their country code and continent id.
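The three-way assignment of activity described above amounts to two comparisons. The sketch below assumes that each user already has a home city and home country code and that each city has a country code; the function name and the string labels are illustrative choices, not identifiers from the dataset.

```python
def visitor_category(home_city, home_country, city, city_country):
    """Classify a user's activity in `city` (hypothetical helper).

    Returns "resident", "domestic tourist" or "foreign tourist".
    """
    if home_city == city:
        return "resident"
    if home_country == city_country:
        # Same country as the visited city, but a different home city.
        return "domestic tourist"
    return "foreign tourist"


# Made-up examples with ISO-style country codes.
print(visitor_category("Rome", "IT", "Rome", "IT"))
print(visitor_category("Chicago", "US", "New York City", "US"))
print(visitor_category("Berlin", "DE", "Paris", "FR"))
```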
4.1 Global attractiveness
Returning to our original question of what attracts people to a certain location, we start our analysis by looking at different locations and their absolute global attractiveness, quantified by the number of photographs taken in them by either domestic or foreign tourists, leaving out the contribution of their residents. Once we had determined user home cities and countries, we calculated global attractiveness by ranking locations by the total number of photographs taken in them by tourists (i.e., people residing outside the considered city) from all over the world. We find that the top 10 cities by photographs are: New York City, London, Paris, San Francisco, Washington, Barcelona, Chicago, Los Angeles, Rome and Berlin. In order to see how strong the impact of short-distance domestic visitors on this ranking might be, we compared it with a ranking based only on the activity of foreign users in a city. Surprisingly, the difference is not that large - New York and London are still the two leading cities (just switching order), Paris and San Francisco are still within the top 5, while Rome, which has the lowest place in this new ranking among all the cities mentioned, is still the 23rd most photographed city in the world with respect to the activity of foreign tourists. The cities we selected are therefore important destinations not only for all visitors (including domestic ones), but also for foreign visitors.
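The two rankings compared above (all tourists versus foreign tourists only) can be produced with the same counting routine. The following is a minimal sketch assuming one `(city, category)` record per photograph; the function name, the category labels and the toy records are made up for illustration and do not come from the dataset.

```python
from collections import Counter


def rank_cities(photo_records, categories):
    """Rank cities by the number of photographs taken by the given categories.

    `photo_records` is an iterable of (city, category) pairs, one per photograph.
    """
    counts = Counter(city for city, category in photo_records
                     if category in categories)
    # Sort by descending count; break ties alphabetically for determinism.
    return [city for city, _ in sorted(counts.items(),
                                       key=lambda item: (-item[1], item[0]))]


# Made-up toy records, not values from the dataset.
records = (
    [("A", "domestic")] * 5 + [("A", "foreign")] * 2 + [("A", "resident")] * 9
    + [("B", "domestic")] * 1 + [("B", "foreign")] * 4
    + [("C", "foreign")] * 3
)
all_tourists = rank_cities(records, {"domestic", "foreign"})
foreign_only = rank_cities(records, {"foreign"})
```

In this toy case city A leads the all-tourist ranking but drops to last place once domestic visitors are excluded, which is the kind of shift the comparison in the text is designed to detect.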
This ranking is also highly consistent with a previously reported one - all of our top 10 cities happen to be among the first 15 cities mentioned there. Two other global rankings of city visitor attractiveness worth mentioning in this context are the ones presented by Euromonitor and MasterCard. One should not really expect them to be consistent with our ranking, as they are built on diverse (and sometimes heterogeneous) sources of data and try to include all visits, not necessarily only those of tourists who are willing to take photographs as in our study; we nevertheless compare them against our ranking. One city could often attract more people while another, attracting fewer, is more picturesque and motivates those fewer visitors to take more photos, which would result in a higher total photographic activity. However, all 10 of our top cities are included in Euromonitor’s top destinations list. It is worth mentioning that this is no longer the case for the newer version of Euromonitor’s ranking from 2015 - for example, Washington falls out of the top 100 world destinations according to their recent estimate. This serves as a good example of how dynamic the world is, and it is not too surprising that our ranking, built on data from before 2010, is more consistent with the older version of Euromonitor’s report. Moreover, we found that our top 3 cities - New York, London and Paris - are also the top 3 (although in a different order) in MasterCard’s ranking, while in total 7 of our top 10 cities (all but Berlin, Washington and Chicago) are mentioned among the ‘Global Top 20 Destination Cities by International Overnight Visitor Spend’ in 2014.
Further, we focus this study on the global attractiveness of these 5 US cities and 5 European Union (EU) cities, and aggregate the remaining information as the rest of Europe, the rest of the US and the rest of the world.
Table 1 Heterogeneity of Flickr usage: total number of photographs taken worldwide by residents of different areas versus their official population in 2008. Column: photographs per 1,000 residents. Rows include New York City, Rest of EU, Rest of the US and Rest of the world.
In order to use the O/D flows of geotagged photography as a proxy for actual human mobility between cities across the globe, an appropriate normalization for the above heterogeneity is required. One way of doing this is to normalize the O/D flows shown in Figure 2 by the number of photographs taken per 1,000 residents of the origin location, as reported in Table 1. However, this would require assuming that the dataset is homogeneously representative of the different modes of travel behavior, which is questionable given the sparsity and heterogeneity of the dataset. Therefore, in the further analysis we refrain from extrapolating the original O/D flows, defined by the actual number of photographs taken, to represent actual human mobility. Instead, we focus our analysis on the actual photographic activity of the users, keeping in mind that flows of activity from different origins might have different representativeness across the entire human population and might not represent the entire variety of types of human activity from the considered origins in the considered destinations. Nevertheless, we believe that photographic activity by itself is an important component of visitor behavior and might serve as a relevant proxy for measuring the visual attractiveness of a city for its visitors.
The cumulative incoming flow for each destination in the O/D network from all origins other than the destination itself represents the destination’s total global attractiveness in terms of the geotagged photographic activity of tourists. Normalized by the population of the destination, this measure becomes the relative global attractiveness of the destination, stating how much visitor activity the location has per capita of its residential population.
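Both measures defined above follow directly from the O/D weights. The sketch below assumes the network is stored as a dictionary mapping `(origin, destination)` pairs to photograph counts; that layout, the function names and the toy numbers are assumptions for illustration only.

```python
def total_attractiveness(flows, destination):
    """Incoming photographs from every origin except the destination itself."""
    return sum(weight for (origin, dest), weight in flows.items()
               if dest == destination and origin != destination)


def relative_attractiveness(flows, destination, population):
    """Total attractiveness per resident of the destination."""
    return total_attractiveness(flows, destination) / population[destination]


# Made-up toy network: keys are (origin, destination), values are photographs.
flows = {
    ("A", "A"): 100,  # loop edge: residents of A photographing in A
    ("B", "A"): 30,
    ("C", "A"): 20,
    ("A", "B"): 10,
    ("B", "B"): 50,
}
population = {"A": 1000, "B": 500}

total_a = total_attractiveness(flows, "A")
relative_a = relative_attractiveness(flows, "A", population)
```

The loop edge `("A", "A")` is deliberately excluded, matching the definition that residents’ own activity does not count towards the attractiveness of their home city.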
Additionally, for every city we can define a measure that shows how mobile its residents are (i.e., whether their activity is home-oriented or not) by looking at the ratio of the loop-edge weight to the total outgoing weight in the O/D matrix (i.e., user activity in the home city compared to the total activity). We depict these results in Figure 3(c). Although this ratio is nearly flat, varying between 50% and 60%, an interesting pattern appears when looking at the destinations of activities. Again, American and European patterns are surprisingly distinctive - while American tourists seem to be mostly engaged in domestic tourism, EU citizens travel more abroad. This difference can be explained by the fact that the US is much bigger and at the same time much more geographically diverse than the EU countries of our studied cities. American users thus have more options when engaged in domestic tourism and are consequently less likely to travel to foreign destinations.
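The home-orientation ratio described above is the loop-edge weight divided by the sum of all outgoing weights of an origin. A minimal sketch, again assuming a dictionary of `(origin, destination)` photograph counts with made-up numbers:

```python
def home_activity_ratio(flows, origin):
    """Share of an origin's total activity that takes place in its home city."""
    outgoing = sum(weight for (o, _dest), weight in flows.items() if o == origin)
    if outgoing == 0:
        raise ValueError(f"no recorded activity for origin {origin!r}")
    return flows.get((origin, origin), 0) / outgoing


# Made-up toy network: keys are (origin, destination), values are photographs.
flows = {
    ("A", "A"): 60,  # loop edge: activity of A's residents at home
    ("A", "B"): 25,
    ("A", "C"): 15,
    ("B", "A"): 40,
}
ratio_a = home_activity_ratio(flows, "A")
```

Here residents of A take 60 of their 100 photographs at home, giving a ratio of 0.6; an origin with no loop edge, such as B in this toy data, gets a ratio of zero.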
Relative strength of the links between each pair of cities, normalized by the null-model estimation (table of city pairs, starting with New York City)
4.2 Local attractiveness
Variance of the fitted lognormal distributions (table by city, starting with New York City)
Top photographed area (table by city, starting with New York City)
1. Consider the top n density cells in the grid and let a be the lowest activity level among them.
2. Select all the cells with activity higher than or equal to a and divide them into spatially connected components (cells with at least one common vertex are considered connected).
3. If the resulting number of connected components is equal to or higher than n (it could be higher if several cells have the same activity level a), stop the algorithm and define the selected connected components as the hotspots.
4. Otherwise, if the number of components is \(t < n\), select the \(n-t\) top activity cells from the remaining ones (not covered by the selected components) and let a be the lowest activity level among them. Repeat from Step 2.
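The four steps above can be sketched as follows. The grid is assumed to be a dictionary mapping `(row, column)` cell indices to activity levels, and cells sharing at least one vertex are treated as 8-neighbours; the function names, the data layout and the early exit when the grid is exhausted are assumptions of this sketch rather than details given in the text.

```python
def connected_components(cells):
    """Split a set of (row, col) cells into 8-connected components."""
    unvisited = set(cells)
    components = []
    while unvisited:
        start = unvisited.pop()
        component = {start}
        stack = [start]
        while stack:
            row, col = stack.pop()
            for d_row in (-1, 0, 1):
                for d_col in (-1, 0, 1):
                    neighbour = (row + d_row, col + d_col)
                    if neighbour in unvisited:
                        unvisited.remove(neighbour)
                        component.add(neighbour)
                        stack.append(neighbour)
        components.append(component)
    return components


def find_hotspots(activity, n):
    """Return hotspots as a list of cell sets, following the four steps."""
    ranked = sorted(activity, key=lambda cell: (-activity[cell], cell))
    if not ranked or n <= 0:
        return []
    # Step 1: threshold a is the lowest activity among the top n cells.
    a = activity[ranked[min(n, len(ranked)) - 1]]
    while True:
        # Step 2: select cells with activity >= a and group them spatially.
        selected = {cell for cell in ranked if activity[cell] >= a}
        components = connected_components(selected)
        t = len(components)
        # Step 3: enough components, so they are the hotspots.
        if t >= n:
            return components
        # Step 4: lower a using the n - t best cells not yet selected.
        remaining = [cell for cell in ranked if cell not in selected]
        if not remaining:
            return components  # assumed stop: no cells left to add
        a = activity[remaining[min(n - t, len(remaining)) - 1]]


# Made-up toy grid: two adjacent busy cells, one isolated cell, one quiet cell.
grid = {(0, 0): 10, (0, 1): 9, (5, 5): 8, (9, 9): 1}
hotspots = find_hotspots(grid, 2)
```

In the toy grid the two most active cells are adjacent and merge into one component, so the threshold is lowered once to admit the isolated cell at `(5, 5)` as the second hotspot. Because every remaining cell has activity strictly below the current threshold, each pass enlarges the selection and the loop terminates.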
Hotspots in New York City and Rome (table). New York City: New York City Hall, Metropolitan Museum of Art, American Museum of Natural History, Grand Central Terminal. Rome: San Giovanni in Laterano square, Auditorium Parco della Musica, Caio Cestio Pyramide, IED - Istituto Europeo di Design, Church San Paolo Fuori Le Mura, Parco dei Caduti, Passeggiata del Gianicolo.
Values of q exponents for residents and tourists in the different cities (table by city, starting with New York City)
5 Summary and conclusion
The importance of cities in our society is well established, and it is evident that cities play a crucial role, as more than half of the world’s population lives in them. ‘Rethinking’ cities is thus a key component of the world’s sustainable development paradigm. The first and most direct way of doing that is by rethinking the way we plan them. In order to become a better planner, one needs to start considering people’s needs. People need not only efficiency, better transportation and green energy, but also a better experience of living in cities, enjoying the things that interest them and that they find attractive. In this study we thus analyzed cities through city attractiveness and the patterns derived from it. The novelty of the study lies in the kind of data used: geotagged photographs from publicly available photograph sharing web sites (e.g., Flickr).
Over the last decade big data analyses have been increasingly utilized in urban planning (e.g., analysis of cell phone records for transportation planning). However, information from geotagged photographs has not been analyzed very often, although it can provide an additional layer of information useful for urbanism in general. Namely, the photographs taken indicate places in cities that are important enough for people to visit and to leave their digital trails there. By analyzing the global dataset of geotagged photographs we identified the 10 most photographed cities, which happened to be distributed evenly between the US and the EU. Focusing on these top 10 destinations, we studied spatial patterns of visitor attraction versus the behavior of resident users, and analyzed people’s mobility between those 10 cities and other places around the world.
Although intercity origin/destination fluxes depend on the distance between two cities in a rather predictable way, links between American and European cities are surprisingly asymmetric. Namely, links going from American origins to EU destinations are on average stronger than the ones going in the opposite direction. Another clearly distinctive pattern between EU and American cities is related to the structure of the photographic activity within them. The results showed that in US cities residents take most of the photographs, while domestic tourists account for most of the rest, leaving little for foreigners. The activity in EU cities is much more diverse, showing a higher fraction of touristic and specifically foreign activity. Finally, when investigating the qualitative structure of destinations, American and EU patterns are again surprisingly distinctive - while Americans seem mostly engaged in domestic tourism, Europeans travel more abroad than within their own home countries.
Moreover, we examined photographic activity at the local scale, comparing attraction patterns for residents, domestic tourists and foreign tourists within a city. The spatial distribution of photographic activity of all those user categories follows the same universal pattern - the activity density distributions of all types of users closely follow a log-normal law, while the shapes of the curves for area size vs. activity quintile appear to be strongly consistent. However, the areas covered by tourist activities are always smaller than the areas covered by residents, with the only exception of Los Angeles, where domestic tourist activities cover larger areas than those of city residents. The ratio between the areas covered by tourists and by city residents differs for domestic and foreign tourists and is always higher for domestic tourists, with the only exception of Berlin, where those ratios are almost the same.
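As an illustration of what fitting a log-normal law to activity densities involves, the sketch below computes the maximum-likelihood parameters from a list of per-cell densities: the mean and the variance of the logarithms. The fitting procedure actually used in the study is not described in this text, so the estimator, the function name and the handling of empty cells are assumptions.

```python
import math
import statistics


def fit_lognormal(densities):
    """Maximum-likelihood (mu, sigma^2) of a log-normal fit to positive values.

    Cells with zero density are skipped because their logarithm is undefined
    (an assumption of this sketch).
    """
    logs = [math.log(d) for d in densities if d > 0]
    if len(logs) < 2:
        raise ValueError("need at least two positive density values")
    mu = statistics.fmean(logs)
    # Population variance of the logs is the ML estimate of sigma^2.
    sigma2 = statistics.pvariance(logs, mu)
    return mu, sigma2


# Made-up densities whose logarithms are 0, 1 and 2, plus one empty cell.
mu, sigma2 = fit_lognormal([math.exp(0.0), math.exp(1.0), math.exp(2.0), 0.0])
```

Comparing `sigma2` between user categories in the same city is one way to express whether, for example, foreign activity is spread more or less unevenly than residential activity.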
Once again, we find that the activities within American and European cities differ from the quantitative standpoint. First, tourists visiting American cities (with the exception of Chicago) explore them more extensively, covering more of the areas of residential activity. More strikingly, the variance of the foreign activity density distribution is always higher than that of the residential activity for all the European cities and lower for all the American cities. Finally, we identified the hotspots in each city, focusing on its most photographed places. We noted that hotspot attractiveness follows a power-law distribution, where the exponent of this distribution serves as an indicator of how focused people’s attention is on the major attractions compared to how distributed it is among a number of places. For all cities with the exception of Berlin, the activity of tourists, and especially of foreign tourists, appeared to be more concentrated on the major attractions.
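One common way to estimate such an exponent, shown here purely as a sketch, is an ordinary least-squares fit of log activity against log rank over the hotspots of a city. The exact definition of the exponent q and the fitting method used in the study are not given in this text, so the rank-size form activity ∝ rank^(-q), the function name and the toy data are assumptions.

```python
import math


def rank_size_exponent(activities):
    """Estimate q in activity ~ rank**(-q) by least squares in log-log space."""
    values = sorted((a for a in activities if a > 0), reverse=True)
    if len(values) < 2:
        raise ValueError("need at least two hotspots with positive activity")
    xs = [math.log(rank) for rank in range(1, len(values) + 1)]
    ys = [math.log(value) for value in values]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    s_xx = sum((x - mean_x) ** 2 for x in xs)
    s_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # The fitted slope is negative for decaying activity; q is its magnitude.
    return -s_xy / s_xx


# Made-up hotspot activities that decay exactly as rank**(-1.5).
toy = [1000.0 / rank ** 1.5 for rank in range(1, 6)]
q = rank_size_exponent(toy)
```

Under this convention a larger q means that activity falls off faster with rank, i.e. attention is more concentrated on the top few hotspots.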
To conclude, by showing differences between people visiting US and EU cities, our study revealed interesting patterns in human activity. The results of our study are useful for understanding what has to be enhanced in cities and where it may be appropriate to increase services targeting different categories of users. In the past those questions were traditionally answered by analyzing different available datasets such as hotel information or survey data. However, collecting or gaining access to such datasets usually requires significant effort and/or expense, whereas geotagged photography is publicly available and provides a unique global perspective for addressing many research questions at both global and local scales. In future work we will also consider the longitudinal perspective of data analysis by showing how the observed human patterns evolve over time.
The authors wish to thank the MIT SMART Program, Accenture, Air Liquide, BBVA, The Coca Cola Company, Emirates Integrated Telecommunications Company, The ENEL foundation, Ericsson, Expo 2015, Ferrovial, Liberty Mutual, The Regional Municipality of Wood Buffalo, Volkswagen Electronics Research Lab, UBER and all the members of the MIT Senseable City Lab Consortium for supporting the research. We further thank the MIT research support committee via the NEC and Bushbaum research funds, as well as the research project ‘Managing Trust and Coordinating Interactions in Smart Networks of People, Machines and Organizations’, funded by the Croatian Science Foundation. Finally, we would like to thank Eric Fisher for providing the dataset for this research, Alexander Belyi for help with the visualizations, and Jameson L Toole, Yingxiang Yang and Lauren Alexander for useful recommendations at the early stage of this work.
Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.
1. Ratti C, Pulselli RM, Williams S, Frenchman D (2006) Mobile landscapes: using location data from cell-phones for urban analysis. Environ Plan B, Plan Des 33(5):727-748
2. Reades J, Calabrese F, Sevtsuk A, Ratti C (2007) Cellular census: explorations in urban data collection. IEEE Pervasive Comput 6(3):30-38
3. González MC, Hidalgo CA, Barabási AL (2008) Understanding individual human mobility patterns. Nature 453:779-782
4. Kung KS, Greco K, Sobolevsky S, Ratti C (2014) Exploring universal patterns in human home-work commuting from mobile phone data. PLoS ONE 9(6):e96180. doi:10.1371/journal.pone.0096180
5. Hoteit S, Secci S, Sobolevsky S, Ratti C, Pujolle G (2014) Estimating human trajectories and hotspots through mobile phone data. Comput Netw 64:296-307
6. Reades J, Calabrese F, Ratti C (2009) Eigenplaces: analysing cities using the space-time structure of the mobile phone network. Environ Plan B, Plan Des 36(5):824-836
7. Toole JL, Ulm M, González MC, Bauer D (2012) Inferring land use from mobile phone activity. In: Proceedings of the ACM SIGKDD international workshop on urban computing. ACM, New York, pp 1-8
8. Grauwin S, Sobolevsky S, Moritz S, Godor I, Ratti C (2014) Towards a comparative science of cities: using mobile traffic records in New York, London and Hong Kong. In: Computational approaches for urban environments. Geotechnologies and the environment, vol 13, pp 363-387
9. Pei T, Sobolevsky S, Ratti C, Shaw SL, Li T, Zhou C (2014) A new insight into land use classification based on aggregated mobile phone data. Int J Geogr Inf Sci 28(9):1-20
10. Ratti C, Sobolevsky S, Calabrese F, Andris C, Reades J, Martino M, Claxton R, Strogatz SH (2010) Redrawing the map of Great Britain from a network of human interaction. PLoS ONE. doi:10.1371/journal.pone.0014248
11. Sobolevsky S, Szell M, Campari R, Couronné T, Smoreda Z, Ratti C (2013) Delineating geographical regions with networks of human interactions in an extensive set of countries. PLoS ONE 8(12):e81707. doi:10.1371/journal.pone.0081707
12. Amini A, Kung K, Kang C, Sobolevsky S, Ratti C (2014) The impact of social segregation on human mobility in developing and industrialized regions. EPJ Data Sci 3:6. doi:10.1140/epjds31
13. Furletti B, Gabrielli L, Renso C, Rinzivillo S (2012) Identifying users profiles from mobile calls habits. ACM SIGKDD International Workshop on Urban Computing
14. Fisher D (2007) Hotmap: looking at geographic attention. IEEE Trans Vis Comput Graph 13(6):1184-1191
15. Houée M, Barbier C (2008) Estimating foreign visitors flows from motorways toll management system. In: 9th international forum on tourism statistics
16. Sobolevsky S et al (2014) Mining urban performance: scale-independent classification of cities based on individual economic transactions. ASE BigDataScience 2014, Stanford, CA, preprint. arXiv:1405.4301
17. Sobolevsky S, Sitko I, Tachet des Combes R, Hawelka B, Murillo Arias J, Ratti C (2014) Money on the move: big data of bank card transactions as the new proxy for human mobility patterns and regional delineation. The case of residents and foreign visitors in Spain. In: 2014 IEEE international congress on big data, Anchorage, AK
18. Hawelka B, Sitko I, Beinat E, Sobolevsky S, Kazakopoulos P, Ratti C (2014) Geo-located Twitter as proxy for global mobility patterns. Cartogr Geogr Inf Sci 41(3):260-271
19. Brockmann D, Hufnagel L, Geisel T (2006) The scaling laws of human travel. Nature 439:462-465
20. Kang C, Sobolevsky S, Liu Y, Ratti C (2013) Exploring human movements in Singapore: a comparative analysis based on mobile phone and taxicab usages. In: Proceedings of the 2nd ACM SIGKDD international workshop on urban computing. ACM, New York, p 1
21. Girardin F, Calabrese F, Dal Fiore F, Ratti C, Blat J (2008) Digital footprinting: uncovering tourists with user-generated content. IEEE Pervasive Comput 7(4):36-43
22. Crandall DJ, Backstrom L, Huttenlocher D, Kleinberg J (2009) Mapping the world’s photos. In: WWW’09: proceedings of the 18th international conference on world wide web. ACM, New York, pp 761-770
23. Zheng YT, Zha ZJ, Chua TS (2012) Mining travel patterns from geotagged photos. ACM Trans Intell Syst Technol 3(3):1-18
24. Spirn A (1998) The language of landscape, pp 3-81
25. Girardin F, Vaccari A, Gerber A, Biderman A, Ratti C (2009) Quantifying urban attractiveness from the distribution and density of digital footprints. Int J Spat Data Infrastruct Res 4:175-200
26. Sinkiene J, Kromolcas S (2010) Concept, direction and practice of city attractiveness improvement. Public Policy Adm 31:147-154
27. Van den Berg L, Van der Meer J, Otgaar AHJ (2007) The attractive city: catalyst of sustainable urban development. In: Ache P, Lehtovuori P (eds) European urban and metropolitan planning. Proceedings of the first openings. Seminar 12th October 2007 YTK-espoo. Centre for urban and regional studies publication C67, pp 48-63
28. Krzywinski M, Schein J, Birol I, Connors J, Gascoyne R, Horsman D, Jones SJ, Marra MA (2009) Circos: an information aesthetic for comparative genomics. Genome Res 19(9):1639-1645. doi:10.1101/gr.092759.109
29. Krings G, Calabrese F, Ratti C, Blondel VD (2009) Urban gravity: a model for inter-city telecommunication flows. J Stat Mech Theory Exp 2009(07):L07003