The effect of Pokémon Go on the pulse of the city: a natural experiment

Pokémon Go, a location-based game that uses augmented reality techniques, received unprecedented media coverage due to claims that it allowed for greater access to public spaces, increasing the number of people out on the streets, and generally improving health, social, and security indices. However, the true impact of Pokémon Go on people’s mobility patterns in a city is still largely unknown. In this paper, we perform a natural experiment using data from mobile phone networks to evaluate the effect of Pokémon Go on the pulse of a big city: Santiago, capital of Chile. We found significant effects of the game on the floating population of Santiago compared to movement prior to the game’s release in August 2016: in the following week, up to 13.8% more people spent time outside at certain times of the day, even if they do not seem to go out of their usual way. These effects were found by performing regressions using count models over the states of the cellphone network during each day under study. The models used controlled for land use, daily patterns, and points of interest in the city. Our results indicate that, on business days, there are more people on the street at commuting times, meaning that people did not change their daily routines but slightly adapted them to play the game. Conversely, on Saturday and Sunday night, people indeed went out to play, but favored places close to where they live. Even if the statistical effects of the game do not reflect the massive change in mobility behavior portrayed by the media, at least in terms of expanse, they do show how ‘the street’ may become a new place of leisure. This change should have an impact on long-term infrastructure investment by city officials, and on the drafting of public policies aimed at stimulating pedestrian traffic.


1
Pokémon Go is a location-based mobile game about capturing and "evolving" virtual characters that appear to exist in the same real-world location as players. As soon as it came out, people of all ages seemed to be caught in the frenzy of walking everywhere trying to find the next pocket monster. As it became a worldwide hit, the game fueled a flurry of speculation about its potential effects. It has been suggested, for example, that the game made kids and adults move out of the living room and into the open air, and that tourist attractions would draw more people if they had a PokéStop (a place to check in and get items), among many other claims. Real or not, Pokémon Go has had some real-world effects, some of which are not ideal: governments issuing alerts about playing the game in minefields, 1 players searching for Pokémon in "inappropriate" places like the Holocaust Museums and the White House, 2 and even accidents. 3 The game was so popular that in some countries it reached engagement rates that surpassed those of mainstream social platforms like Twitter and Facebook. 4 The general perception is that games like Pokémon Go and its predecessor, Ingress [1], could make whole populations change their mobility patterns through a reward system: earning more points by catching creatures, getting to certain places and checking in, among other well-known gamification techniques. Ingress and Pokémon Go have a laxer definition of "check-points" than a city's usual Points of Interest (POIs, such as museums and parks), including, for example, graffiti [2] and hidden heritage [3]. Thus, these games may help motivate visits to kinds of places that differ from the usual POIs. Because people tend to visit few POIs in their daily routines [4], playing the game implies that people would tend to visit places different from those they would usually visit.
If this is the case, and considering that Pokémon is one of the most successful media franchises in the world [5], providing empirical evidence in favor of (or against) this folk hypothesis would help us understand the extent to which these games make people change their habits.
In this study, we seek to quantify the Pokémon Go effect on the pulse of a city, as seen from its floating population patterns. The "floating population" concept denotes the number of people present in a given area during a specific period of time, but who do not necessarily reside there. For instance, people who work in a business district are part of its floating population, since they probably reside elsewhere. Given the successful launch of Pokémon Go in Chile and the availability of mobile phone network data, we are able to ask sophisticated questions about floating population, such as whether Pokémon Go has an effect on it at the city scale and, if so, what the characteristics of these effects are.
In Chile, Pokémon Go was officially launched on August 3rd, 2016. Reports indicate that more than one million people downloaded the game within the next five days. 5 To understand the game's effects, we based our analyses on a set of mobile communications records from Telefónica Movistar, the largest telecommunications company in Chile, with a market share of 33% in 2016. We used a dataset that follows the Call Detail Records (CDR) structure, i.e., a dataset built for billing purposes, covering July 27th to August 10th. CDR datasets usually include logs of mobile phone calls, SMS messages, and data-type network events (e.g., Web browsing, application usage, etc.), aggregated by a context-dependent amount of downloaded information [6]. Even though we cannot know who is playing the game, we hypothesize that this is not needed to evaluate the city-level effect: our aim is to measure how many people were on the streets before and after the release of the game.
We followed a natural experiment approach whereby we evaluated floating population patterns at two specific intervals of time: seven days before and seven days after the launch of Pokémon Go. We restricted the analysis to a specific set of devices to ensure that we only analyzed floating population patterns of active users who live in the city (see Section 2.1 for the filtering process). Given this, we assumed, conservatively, that a higher number of connected mobile devices meant a higher number of people on the streets. Then, using regression models suitable for count data, we evaluated whether the launch of the game had an impact on the floating population by individually testing each regression factor for significance. Our aim was to measure the factor associated with the game itself [7].
We explored how spatio-temporal properties of mobility were related to the Pokémon Go effect. To do so, we also used complementary datasets that are either generally available, such as travel surveys, or that could be approximated using mobile phone data (e.g., land use). Hence, the methods presented in this paper can be used to perform a similar analysis in other cities, as well as to monitor changes in patterns.
The contributions of this work are two-fold. First, we introduce a methodology to analyze mobile records that allowed us to identify behavioral changes at the city level. Second, we provide a case study of a city in a developing country, Santiago de Chile, and report several empirical insights about the observed phenomena. To the best of our knowledge, this is the first large-scale study on the effect of an external factor (a location-based augmented reality game) on the floating population of a city.
The main findings of our work are as follows. First, there is a significant effect of the availability of Pokémon Go on Santiago's floating population patterns, in a model that includes covariates accounting for daily patterns, land use, and available points of interest. The highest effect during business hours was found at 12:31, with 13.8% more people connected to mobile networks. After business hours, the strongest effect was found at 21:31, with 9.6% more people connected to mobile networks. Second, during business hours, the effect is significant at commuting times between important places (such as home and work) and at break hours (e.g., lunch time). We conclude that people adapted their routines to play the game, concentrating geographically in places with a high floating population. Unlike the effects found during business hours, Pokémon Go players were scattered around the city at night, which hints at the possibility that people played the game near their places of residence at times when they were usually indoors. Finally, we discuss our findings in the light of practical and theoretical implications in the areas of urbanism and the social life of the city.

2
We focused our study on Santiago, the most populated city in the country, with almost 8 million inhabitants. Comprising an area of 867.75 square kilometers, urban Santiago is composed of 35 independent administrative units called municipalities. The city has experienced accelerated growth in the last few decades, a trend that has been predicted to continue at least until 2045 [8]. Chile, and Santiago in particular, is one of the developing regions of South America with the highest mobile phone penetration index. There are about 132 mobile subscriptions per 100 people. 6 Santiago's growth and the general availability of mobile phones make it an excellent city in which to perform research based on mobile communication data.

Datasets
We use the following complementary datasets:

Santiago Travel Survey and Traffic Analysis Zones. The Santiago 2012 Travel Survey 7 (also known as Origin-Destination survey, or ODS) contains 96,013 trips from 40,889 users. The results of this survey are used in the design of public policies related to transportation and land use. The survey includes traffic analysis zones of the entire Metropolitan Region, encompassing Santiago and nearby cities. We use this zoning resource for two reasons. On the one hand, the extent and boundaries of each area within a zone take residential and floating population density, administrative boundaries, and city infrastructure into account. This enables the comparison of several phenomena between zones. On the other hand, it allows us to integrate other sources of information, providing results that can be compared to other datasets such as land use properties [9].

6 https://goo.gl/sjWEjS
The complete survey includes 866 zones; however, we were interested in urban areas of a single city. Since these are densely populated, we restricted our analysis to zones with a surface of under 20 square km. As a result, the maximum zone area is 18.37 square km, with a mean of 1.34 and a median of 0.72 square km. Finally, we were interested in zones that have both cell phone towers and Pokémon points of interest (see Figure 1), resulting in 499 zones covering 667 square km, about 77% of Santiago.

Ingress Portals/Pokémon Points of Interest. Before Pokémon Go, Niantic Labs launched Ingress [1] in 2012. Ingress is a location-based game where players choose one team (from two available) and try to hack (take control of) several portals placed in real locations worldwide. Portal locations are crowd-sourced and include "a location with a cool story, a place of historical or educational value," "a cool piece of art or unique architecture," "a hidden-gem or a hyper-local spot," among others. 8 The definition of a portal, thus, includes points of interest that go beyond the definition used in, for instance, check-in based social networks [2]. Once a portal is "hacked," it belongs to the corresponding faction. A set of portals belonging to the same faction defines the limit of an area controlled by it. Since players need to be close to portals to hack them, the game makes players explore the city to find portals to hack and conquer for their own teams.
Pokémon Go shares many game mechanics with Ingress, including the team concept. The main difference between the games is that while in Ingress players capture portals, in Pokémon Go they capture wild pocket monsters. A subset of Ingress portals is defined to be a PokéStop (a place to check in and get items) or a PokéGym (a place to battle against the Pokémon of other factions). In this paper, both are referred to as PokéPoints. Additionally, there are hidden Pokémon respawn points, where different creatures tend to appear. The main mechanic of the game is that players must walk around and explore to find creatures to capture. Note that all players see the same creatures, and one creature may be captured by many players. The game motivates walking in two ways: first, players walk to points of interest that are scattered around the city (see Figure 1); and second, by walking 1, 5 or 10 kilometers, players can hatch eggs containing random pocket monsters that have better biological properties than those caught in the wild.

Mobile Communication Records. Telefónica has 1,464 cell phone towers in the municipalities under consideration. We studied an anonymized Call Detail Records (CDR) dataset from Telefónica Chile. This dataset contains records from seven days prior to the launch of Pokémon Go in 2016 (from July 27th to August 2nd) and seven days after (from August 4th to August 10th). We did not take into account the day of the official launch of Pokémon Go, as there was no specific hour at which the game became officially and generally available. Also note that the dataset contains pre-paid and contract subscriptions from Telefónica.
The dataset contains data-type events rather than voice CDRs [6]. Unlike typical Call Detail Records for voice, each data event has only one assigned tower, as there is no need for a destination tower. Each event has a size attribute that indicates the number of KiB downloaded since the last registered event. We did not analyze the records from the entire customer population in Santiago. Instead, we applied the following filtering procedure: (i) we filtered out those records that do not fall within the limits of the zones from the travel survey, and also those with a timestamp outside the range between 6:00 AM and 11:59 PM; (ii) to be considered, mobile devices must have been active every day under study, because a device that does not show regular events may belong to a tourist or someone who is not from the city, or may not exhibit human-like mobility patterns, as with point-of-sale terminals (which are mostly static); (iii) only devices that downloaded more than 2.5 MiB and less than 500 MiB per day were included, as values outside this range indicate either inactivity or unusual activity for a human (i.e., the device could be running an automated process); (iv) we used a Telefónica categorization scheme that associates an anonymized device ID with a certain category of service, for example pay-as-you-go, contractual, enterprise, etc. This gives us a good idea of the general kind of account holders. Every step of this procedure was taken to ensure that events were triggered by humans. After applying these filters, the dataset comprised records from 142,988 devices.
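As an illustration only, steps (i) to (iii) of the filtering procedure could be sketched as follows. The event layout `(device_id, day, hour, kib, in_zone)`, the function name, and the choice to compute daily volumes after step (i) are assumptions made for this sketch; step (iv) depends on an internal Telefónica categorization and is not reproduced.

```python
from collections import defaultdict

KIB_PER_MIB = 1024


def select_devices(events, n_days, min_mib=2.5, max_mib=500.0):
    """Return the device ids that pass filtering steps (i)-(iii).

    `events` is an iterable of (device_id, day, hour, kib, in_zone) tuples;
    this layout is hypothetical, not the real CDR schema.
    """
    daily_kib = defaultdict(lambda: defaultdict(float))
    for device_id, day, hour, kib, in_zone in events:
        # (i) keep events inside a survey zone between 6:00 and 23:59
        if in_zone and 6 <= hour <= 23:
            daily_kib[device_id][day] += kib

    selected = set()
    for device_id, per_day in daily_kib.items():
        # (ii) the device must be active on every day under study
        if len(per_day) != n_days:
            continue
        # (iii) daily download volume must stay within the plausible range
        if all(min_mib * KIB_PER_MIB < kib < max_mib * KIB_PER_MIB
               for kib in per_day.values()):
            selected.add(device_id)
    return selected


# Made-up events for a two-day study period.
events = [
    ("a", 1, 9, 3000, True), ("a", 2, 20, 4000, True),  # regular user
    ("b", 1, 9, 3000, True),                             # missing day 2
    ("c", 1, 9, 100, True), ("c", 2, 9, 100, True),      # almost inactive
    ("d", 1, 3, 3000, True), ("d", 2, 3, 3000, True),    # events before 6:00 only
]
print(select_devices(events, n_days=2))
```

In this toy input only device `a` satisfies all three conditions.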
Our filtering procedure ensures that a positive difference in the number of connections between two different days represents more people within a given zone. Depending on conditions such as time and location, we may interpret that some of those people are on the street. For instance, we may look at typical times when people commute, or at places where people are either inevitably outdoors (e.g., in a park) or inevitably in residential areas, which tend to have WiFi networks. 9

Land Use Clusters. We may take each traffic analysis zone to belong to one category of land use: residential, business, and areas with mixed activities (e.g., business plus recreation or shopping activities, etc.). These categories are the result of our previous work on land use and CDR data, which is based on hierarchical clustering of time-series of connections at each zone of the city [9].

Approach
This study uses a natural experiment approach to measure the Pokémon Go effect at the city scale. To do this, we analyzed the change in population patterns before and after the launch of Pokémon Go as evidenced by CDR data. First, we smoothed the number of connected devices at each cell tower according to several snapshots of the tower network. A snapshot is the status of the cell phone network in a given time-window [10]. Then, we aggregated these device counts at the zone level to define a set of observations that we evaluated in a regression model. We took into account covariates that enabled us to isolate and quantify the Pokémon Go effect.

Device counts at each tower and zone level aggregation. Let $e \in E$ be a network event, where $|E|$ is the total number of such events. A network event $e$ is a tuple $(d, u, b, z)$, where $d$ is a timestamp with a granularity of one minute, $u$ is some (anonymized) user id, $b$ is a tower id, and $z$ is one of the previously defined geographical areas of Santiago. For each tower $b$ and time $d$ we built a time-series $B_{d,b}$ which represents the number of unique users from $E$ connected to $b$ at $d$. Due to the sparsity of CDR data, it is possible that $B$ is not continuous. As a consequence, the time-series could be null ($B = 0$) at a point in time where there were active devices at the corresponding tower. To account for this sparsity and obtain a continuous time-series, we smoothed each time-series $B$ using Locally Weighted Scatterplot Smoothing (LOWESS) interpolation [11], obtaining $B'_{d,b}$. To obtain a LOWESS curve, several non-parametric polynomial regressions are performed in a moving window. The size of this window is the bandwidth parameter of the model. In our implementation, this value is 30, which is interpreted as follows: each connection influences (i.e., is counted into) its corresponding location during 30 minutes.

Then, for each zone $z$, we aggregated all time-series $B'_{d,b_i}$ into $S_{d,z}$ by computing the sum of all time-series $B'_{d,b_i}$ where the tower $b_i$ lies in $z$ (determined using a point-in-polygon test, as in [12]). Finally, each $S_{d,z}$ time-series represents the floating population profile of each zone under study.
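The smoothing and aggregation steps can be illustrated with a minimal, dependency-free sketch. It is not the implementation used in the study: the smoother below is a single-pass, tricube-weighted local linear fit (LOWESS without its robustness iterations), the mapping of the bandwidth of 30 to a half-window of 15 minutes is an assumption, and all function names and the toy geometry are hypothetical.

```python
def tricube(u):
    """Tricube kernel weight for a scaled distance u."""
    u = abs(u)
    return (1 - u ** 3) ** 3 if u < 1 else 0.0


def local_linear_smooth(counts, half_window=15):
    """Smooth a per-minute series with a tricube-weighted local linear fit."""
    n = len(counts)
    smoothed = []
    for i in range(n):
        xs = range(max(0, i - half_window), min(n, i + half_window + 1))
        ws = [tricube((x - i) / (half_window + 1)) for x in xs]
        sw = sum(ws)
        mx = sum(w * x for w, x in zip(ws, xs)) / sw
        my = sum(w * counts[x] for w, x in zip(ws, xs)) / sw
        sxx = sum(w * (x - mx) ** 2 for w, x in zip(ws, xs))
        sxy = sum(w * (x - mx) * (counts[x] - my) for w, x in zip(ws, xs))
        slope = sxy / sxx if sxx > 0 else 0.0
        # Device counts cannot be negative, so clamp the fitted value at zero.
        smoothed.append(max(0.0, my + slope * (i - mx)))
    return smoothed


def point_in_polygon(x, y, polygon):
    """Ray-casting test; `polygon` is a list of (x, y) vertices."""
    inside = False
    n = len(polygon)
    for k in range(n):
        x1, y1 = polygon[k]
        x2, y2 = polygon[(k + 1) % n]
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def aggregate_zone(tower_series, tower_xy, polygon):
    """Sum the series of every tower located inside the zone polygon."""
    members = [b for b, (x, y) in tower_xy.items()
               if point_in_polygon(x, y, polygon)]
    length = len(next(iter(tower_series.values())))
    return [sum(tower_series[b][d] for b in members) for d in range(length)]


# Toy example: two towers inside a square zone, one outside (made-up data).
zone = [(0, 0), (2, 0), (2, 2), (0, 2)]
raw = {"t1": [0, 0, 6, 0, 0, 0], "t2": [3, 0, 0, 0, 4, 0], "t3": [9] * 6}
xy = {"t1": (1, 1), "t2": (0.5, 1.5), "t3": (5, 5)}
smooth = {b: local_linear_smooth(series, half_window=2) for b, series in raw.items()}
print(aggregate_zone(smooth, xy, zone))
```

Smoothing each tower before summing mirrors the order described above; minutes with no registered events receive a non-zero estimate from their neighbors.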
Measuring the Pokémon Go effect. To measure the city-wide Pokémon Go effect we considered the availability of the game as a city intervention, which started on the day of the launch of the game. Our hypothesis is that the following days would show an effect of the game if people went out, regardless of being players or not, and that this effect would show in the number of people connected at each zone of the city.
To do so, we used Negative Binomial (NB) regression [13,7] applied to our dataset at 1-minute intervals during a day. The NB regression model has been used frequently to analyze over-dispersed count data, i.e., data where the variance is much larger than the mean, a situation the Poisson model does not accommodate [14]. For every minute under study (note that we restricted ourselves to the period from 6 AM until 12 AM), we performed an NB regression using the observed device counts at each zone of the city on all available days in the dataset. The model is specified as follows:

$$\log \mu(t) = \log a + \beta_0 + \beta_1\,\text{PoGo} + \beta_2\,\text{DayOfWeek} + \beta_3\,\text{LandUse} + \beta_4\,\text{PokéPoints},$$

where $\mu(t)$ is the expected value of the number of active devices within a zone at time $t$. The PoGo factor is a binary variable that has a value of 0 when the game was not available, and 1 when it was available. The covariates DayOfWeek (with values business_day, Saturday, and Sunday) and LandUse (with values residential, business_only, and mixed_activities) account for the fluctuations in population on different days according to land use. Both factors use dummy coding because they are categorical. The covariate PokéPoints represents the number of PokéStops and PokéGyms, which are proxies for points of interest within an area, accounting for the number of potential attracting places in each zone. The exposure value $a$ represents the surface area of each zone. Because urbanists designed each zone taking into account population density, transportation infrastructure, and administrative boundaries, the exposure parameter also allowed us to control indirectly for these potential covariates.
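To make the role of each term concrete, the sketch below evaluates the expected device count of a single zone, assuming the usual log-link form of a negative binomial regression with the zone area as exposure. The coefficient values are invented for illustration and are not estimates from the study. The choice of business_day and residential as reference levels of the dummy coding is also an assumption.

```python
import math

# Hypothetical coefficients for one minute t (NOT the study's estimates).
COEF = {
    "intercept": 3.0,
    "pogo": 0.10,
    # Reference levels (assumed): business_day and residential.
    "day_of_week": {"Saturday": -0.20, "Sunday": -0.35},
    "land_use": {"business_only": 0.40, "mixed_activities": 0.25},
    "poke_points": 0.015,
}


def expected_devices(area_km2, pogo, day_of_week, land_use, poke_points, coef=COEF):
    """Expected count: exp(log(a) + b0 + b1*PoGo + dummies + b4*PokePoints)."""
    eta = math.log(area_km2) + coef["intercept"] + coef["pogo"] * pogo
    # Dummy coding: a reference level contributes 0 to the linear predictor.
    eta += coef["day_of_week"].get(day_of_week, 0.0)
    eta += coef["land_use"].get(land_use, 0.0)
    eta += coef["poke_points"] * poke_points
    return math.exp(eta)


before = expected_devices(1.34, 0, "business_day", "mixed_activities", 12)
after = expected_devices(1.34, 1, "business_day", "mixed_activities", 12)
print(after / before)  # equals exp(0.10), whatever the other covariates are
```

Because the exposure enters as `log(area)`, doubling the zone area doubles the expected count while leaving every ratio between conditions unchanged.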
The model output allows the following interpretation: the β coefficient assigned to each factor represents the difference of the logarithm of expected counts in a zone at time t, if all other factors are held equal. Since $\beta = \log \mu_1 - \log \mu_0 = \log(\mu_1 / \mu_0)$, the difference of logarithms equals the logarithm of the ratio between the expected counts with the game available ($\mu_1$) and without it ($\mu_0$). The exponential of this coefficient is defined as the Incidence Rate Ratio, $\mathrm{IRR}_\beta(t) = e^{\beta(t)}$. We built a time-series of $\mathrm{IRR}_\beta(t)$ values for each factor. By analyzing these time-series we determined when, in terms of time-windows within a day, there were significant effects for each factor.

3
Our aim was to measure the effect of Pokémon Go on the number of people and their mobility patterns in a city. As stated in our methods section, the first step is to obtain a smoothed number of devices connected to each tower per minute during the seven days before and after the launch of the game. Figure 2 (a) shows the normalized number of events per minute at a specific tower (Moneda Metro Station) during July 27th. The chart shows that there are many minutes (dots) without registered events, i.e., their position on the y-axis is 0. This shows the sparsity of the data, since even this tower, located in a business area with a high rate of public transportation traffic, has minutes without events. We used LOWESS interpolation to approximate the number of people even in those minutes where there are no events. The continuous lines in Figure 2 show the resulting smoothed curves; a window of 30 minutes captured interesting regularities without picking up noise produced by the sparsity of the data and without smoothing the dataset too much. Figure 2 (b) shows the aggregation of towers within Zone 44 for three different days: a week before the launch of the game, the day of the launch, and a week after.
Zone 44 contains a park (Parque O'Higgins), which is not usually visited at night, hinting that the availability of the game may have influenced a different behavior.

City-level connections. Figure 3 shows the city-level aggregated distributions of: (a) number of connected devices, and (b) downloaded information. Note that in both figures the distributions are normalized by dividing each value by the global maximum value. The figure considers three categories of days: before, during, and after the launch of Pokémon Go. In terms of connections, one can see that the patterns are stable across most days, with the total number of connections generally higher when Pokémon Go was available. This suggests, intuitively, that there were more people connected to the network, presumably playing the game. Two rather surprising effects are found on Mondays between 10 AM and 12 PM when Pokémon Go was available, and on Tuesdays at about 12 PM when it was not yet available. In the first case, we hypothesize that since it was the first Monday after the launch of the game, people were trying it out. In the second case, there does not seem to be any explanation for the sudden drop in connections before the launch of the game, but it might be due to general network outages. Notice that the curve of Pokémon Go availability in the same time period behaves as expected.
In terms of downloaded information, the patterns present smaller variations across days, and some of them present identical behavior. We cannot assume that more data implies people playing, due to how the game works: the user has to keep the screen active for the game, and most of the game's data transfers are text-encoded packages of information (e.g., position updates about nearby pocket monsters, in JSON format). Conversely, social media applications can use more data, due to the size of images and the streaming of audio and video. Hence, we leave data-traffic analysis for future work. However, one effect that is clear is how the game was massively downloaded on its launch day, Wednesday, August 3rd.
A word should be said about Saturdays. The first Saturday after the release of Pokémon Go exhibits the highest number of device connections found in the dataset. While it is tempting to attribute any effect found to extraneous factors (for example, the Olympic Games were on TV at the same time as our study), Figure 3 (b) shows that Saturdays exhibit similar behavior in terms of data consumption. Thus, it is unlikely that the Olympic Games were watched massively, at least not using mobile phones.

Negative Binomial Regressions. After aggregating the smoothed counts for each zone, we performed an NB regression for every 1-minute snapshot across the days in our dataset. In other words, for each minute, the observations are the aggregated zone counts for all days, for all zones, at that specific minute. As a result, we obtained a time-series of regressions representing the dynamics of each factor in our model. Figure 4 shows the Incidence Rate Ratios (IRR) for each factor, as well as the distribution of model dispersion (α). An IRR of one means that the factor does not explain change. Although the significance of each factor is evaluated minute by minute by applying a z-test, the visualization of each factor's IRR allows us to determine significance visually: a factor is significant if its 95% confidence interval does not intersect y = 1.
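The relation between a regression coefficient, its IRR, and the visual significance check can be sketched as follows. The coefficient is chosen so that the IRR equals 1.138, which corresponds to the 13.8% increase reported for the lunch-time peak. The standard error is a made-up value, and the use of a Wald interval (estimate ± 1.96 standard errors on the log scale) is an assumption about how such intervals are typically computed, not a detail confirmed by the text.

```python
import math


def irr_with_ci(beta, std_err, z=1.96):
    """Return (IRR, lower, upper) for a log-scale coefficient and its standard error."""
    return (math.exp(beta),
            math.exp(beta - z * std_err),
            math.exp(beta + z * std_err))


def is_significant(lower, upper):
    """A factor is flagged when its confidence interval excludes IRR = 1."""
    return lower > 1.0 or upper < 1.0


def percent_change(irr):
    """Convert an incidence rate ratio into a percentage change in counts."""
    return (irr - 1.0) * 100.0


# beta chosen so that IRR = 1.138; std_err = 0.05 is hypothetical.
irr, lower, upper = irr_with_ci(beta=math.log(1.138), std_err=0.05)
print(round(percent_change(irr), 1), is_significant(lower, upper))
```

An interval that straddles 1 (for example, a small coefficient with the same standard error) would not be flagged, which matches reading significance from whether the plotted band crosses y = 1.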
The day of week covariates captured behavior expected for weekends. There are fewer people (IRR is significantly less than 1) in the morning for both kinds of days. The Saturday factor also captured a portion of the night effect (its IRR at 9:59 PM is 1.278). This means that not all people who went out on the Saturday when the game was available can be explained by the Pokémon Go factor. The land use covariates captured the dynamics of high floating population during the day, given the bell shape of their distributions, and their IRRs are significant during the whole day. The PokéPoints covariate, which is a proxy for points of interest in the city, is also significant during the whole day. Its maximum IRR is 1.019.

We then analyzed the time windows when the Pokémon Go effect is significant. Table 1 summarizes these time windows, most of which are a few minutes long. However, there are two windows with prominent lengths: from 11:58 AM to 12:46 PM, and from 9:24 PM to 10:12 PM. These time windows also contain the highest Pokémon Go IRR values found per window: 1.138 and 1.096, respectively. Thus, all other factors being held equal, the availability of the game increased the number of people connected to mobile towers in the city by 13.8% at lunch time and 9.6% at night.
Finally, notice that we also tested for statistical interactions between the regression factors, but they were not significant. Additionally, we tested the model without the covariates, having only the intercept and the Pokémon Go effect. In particular, the longest time windows presented similar results and lengths, indicating that the model is robust.

Explaining the Pokémon Go Effect. Given the specification of our model, our results are city-wide. To explore results at a finer geographical level, we estimated the difference between two time-series: the mean of connection counts per zone after and before the launch of the game, separating observations between business days and weekends. Then, we adjusted the time-series according to the surface area of each zone, and we obtained a time-series per zone that indicates whether it had, on average, more or fewer people connected after the launch of the game.
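A minimal sketch of this per-zone comparison is shown below. It assumes that "adjusting by surface area" means dividing the difference by the zone area in square kilometers, which the text does not state explicitly; function names and counts are made up for illustration.

```python
def mean_series(daily_series):
    """Element-wise mean of several equally long per-minute series."""
    n_days = len(daily_series)
    return [sum(values) / n_days for values in zip(*daily_series)]


def density_difference(before_days, after_days, area_km2):
    """Mean(after) minus mean(before), per minute, divided by the zone area."""
    before = mean_series(before_days)
    after = mean_series(after_days)
    return [(a - b) / area_km2 for a, b in zip(after, before)]


# Toy zone with three "minutes" and two days in each period (made-up counts).
diff = density_difference(
    before_days=[[10, 20, 30], [12, 18, 30]],
    after_days=[[14, 22, 28], [16, 24, 30]],
    area_km2=2.0,
)
print(diff)  # positive values mean more devices per square km after the launch
```

In practice the same computation would be run twice per zone, once on business days and once on weekend days, to keep the two regimes separate.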
To find whether these differences correlate with the number of PokéPoints per square kilometer in each zone, we computed a Pearson correlation for all the minutes of the day with maximum IRR values (see Table 1). These correlations vary during the day, as shown in Figure 5. For business days, the highest correlation was found at 12:31 (r = 0.59, p < 0.001). For weekends, the highest correlation was found at 21:38 (r = 0.44, p < 0.001). The effect is stronger at lunch time on business days, and at night on weekends.

Figure 6 displays four choropleth maps of Santiago. The top row contains two maps, both showing differences per zone at 12:31 PM. The left map (a) displays business days, and the right map (b) displays weekends. Similarly, the bottom row displays differences at 21:38 (business days in (c), weekends in (d)). Regarding the correlations described in Figure 5, one can see that Fig. 6 (a) shows a highly concentrated effect in the city's historical center. Reportedly, players visited this place at all times of the day, both because of its location within the city and because of the availability of PokéPoints. 10 In contrast, weekends present a more diversified effect on the city, especially at night, with many areas showing highly positive differences. A careful exploration of map (d) reveals that zones with higher differences contain or are near parks and public plazas.
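For reference, the Pearson coefficient used in the correlation analysis above can be computed as in the following sketch. The per-zone values are invented, and the p-value is not computed here.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equally long sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# One value per zone at a fixed minute of the day (made-up numbers).
poke_points_per_km2 = [0.5, 2.0, 3.5, 8.0, 12.0]
difference_per_km2 = [0.1, 0.4, 0.3, 1.2, 1.5]
r = pearson_r(poke_points_per_km2, difference_per_km2)
print(round(r, 2))
```

Repeating this computation for each selected minute yields a time-series of correlations like the one the text describes.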
We also compared how the Pokémon Go effect related to urban mobility, by graphically comparing the significant time-windows with the trip density distribution from the travel survey. Figure 7A shows the position of each time-window as a bar, and the trip densities as lines for business days, Saturdays, and Sundays. In the morning, the significant time-windows appear when the trip density is increasing or when it reaches a local maximum on business days. They also appear after noon, potentially linked to lunch breaks during business hours. All other effects, such as those at night, do not appear to have any correlation with trip behavior. Given that the travel survey is already about five years old, it was also important to compare our results to current trip distributions. We applied a work-in-progress algorithm, based on previous work on inferring trips from CDR data [15]. Figure 7B shows that, in general terms, trip distributions and the effects still hold for CDR-detected trips. This is not surprising, given how hard it is to change a whole population's general mobility patterns without any exogenous circumstances, at least at the 5-year scale. In any case, this lends further validation to our main results.

4
We performed a natural experiment at the city scale comparing the behavior of a subset of the population before and after the launch of Pokémon Go in Santiago. We found that the availability of the game increased the number of people that connected to the Internet on their mobile phones by 13.8% at lunch time and 9.6% at night. A further exploration of the relationships among urban mobility patterns, mobile connectivity, and points of interest revealed that there are two primary ways in which the Pokémon Go effect is noticeable. On the one hand, people take advantage of commuting times and breaks during the day to play the game. Thus, players tend to be near their work/study places, which are mostly downtown.
On the other hand, on weekends at night the effect is more diversified, implying that people tend to play the game in places near their homes.
In his book "The Great Good Place", Ray Oldenburg discussed the need for third places in the city: "In order for the city and its neighborhoods to offer the rich and varied association that is their promise and potential, there must be neutral ground upon which people may gather. There must be places where individuals may come and go as they please, in which no one is required to play host, and in which we all feel at home and comfortable" [16]. The concept comes from the designation of home and work (or study) as first and second places in one's life. Nowadays, third places are facing two challenges. First, virtual worlds [17] and social networks [18] provide social infrastructure that is similar to that of third places, but without requiring people to leave the first or second place. Second, perceptions of crime and violence are making the city feel less safe than it really is. In Santiago, this has been called fear of life [19]. Hence, the usage of location-based augmented reality games may help to alleviate both situations, by placing virtual worlds on top of physical reality, and by motivating people to go out and walk around their neighborhoods, as our results indicate.
Because Pokémon Go has the effect of increasing the number of pedestrians on the street, it has the potential to convert the city into a third place given its social features. Motivating the presence of pedestrians is important and relates to a theory put forth by Jane Jacobs, who claims that there are four conditions that must be met for a city to be lively and safe. These conditions are: the presence of pedestrians at different times of the day, the availability of mixed uses in districts, the mixture of old and new buildings, and the availability of many crossings for pedestrians [20]. Even though testing theories like these is difficult, some work has already been done. For instance, the city of Seoul found some evidence in favor of the theory by using household surveys [21]. Recently, mobile communication records have been used to validate Jacobs' theories in Italian cities [22]. Studies like those mentioned above [21,22] perform an ex-post analysis of whether lively places comply with theories. A more granular approach would be to perform natural experiments like ours. To the best of our knowledge, this is the first experiment of its kind performed using data-type mobile records. Our method makes it possible to measure the effect not only of long-term but also of short-term interventions, opening a path to quantify how much specific actions help improve quality of life in the city. For instance, crime data could be correlated with results obtained with our method to test whether the safety and liveliness theories hold after specific interventions. Additionally, as our figures have shown (cf. Fig. 6), our method is able to produce maps apt for patch dynamics visualization to monitor population density [23].

Limitations. One limitation of our study is the lack of application usage identification. Thus, even though we included several covariates in our model to account for other effects, the Pokémon Go factor may still be confounded with effects from other sources.
Another possible confounding variable is the presence of events that may attract many people at certain hours, influencing the Pokémon Go effect. In particular, since popular events such as important football games or cultural presentations were being held in Santiago at the same time as Pokémon Go was launched, we felt the need to test for an effect. Adding those events as a new land use category at the zone and the time when the popular events were held, we found that they account for only 0.01 percent of observations. Therefore, popular events had either negligible effects or no effect at all in our study.
While it can be argued that most of the Pokémon Go media appearances focus on specific situations, like those related to museums, tourism, and physical activity of some players, our empirical results uncovered the extent to which the availability of the game is related to the number of people on the street. Even though mobile phone datasets are usually not public, it is becoming more common to have access to this kind of information thanks to several initiatives in opening and sharing data [6]. This kind of analysis is almost costless for telecommunications companies to perform, because mobile records are already extracted and stored for billing purposes.
A further issue involves the nature of the dataset itself. As we mentioned in the paper, Telefónica Movistar has a little more than a third of the Chilean mobile market share. This is the largest portion of the market (the other 60% is owned by many other smaller companies, with the second largest one at 20%). This does, however, introduce some data biases, although they are hard to identify and quantify. One such bias is population bias: Telefónica customers are not distributed uniformly over the geographical space under study. Since we do not have access to the Telefónica database of customers, we cannot tell how representative a given cell-phone tower is of Telefónica customers versus those of other telcos.
A final note of caution concerns weekend effects: since we only have one weekend, we cannot be completely confident about the results. Unfortunately, this is the dataset we have. In any case, our study was meant to be at "city" level, and only analyzed the different days to explore the nature of the effect.
Regarding a potential novelty effect in our results, it may be the case that not all Pokémon Go players during the first few days were still engaged with the game later. Indeed, the number of Pokémon Go players has gone down enormously. While the novelty effect might be real, our study was not aimed at evaluating the popularity of the game. Instead, its purpose was to quantify the city-level effects, something that could have been more concentrated in time: in Chile, the game was highly anticipated by the users and the media, because it was released almost one month later than in the USA.
Future Work. User modeling and classification may help to categorize users into Pokémon players and non-players, in order to study the individual effects of the game. This can be done, for instance, by estimating their daily routines from their CDR-based trajectories [24,15], as well as their home and work locations [9]. Using these methodologies, we could learn whether they visited unknown places, or whether they walked more slowly or faster. An epidemiological analysis of player behavior [25] could help evaluate whether social interactions influence city exploration as a result of the Team Battle Dynamics featured in the game. Another avenue is a study of whether the use of the game is correlated with crime-rate reduction in public places. Such a study could be linked with urban theories about street safety, could be useful for urban planners or the police, and could provide evidence for some of Jacobs' theories.
Finally, given the dataset we had available, we measured the immediate change in mobility patterns for the whole city of Santiago. However, it would have been very interesting to show whether the changes in behavioral patterns held for some weeks further into the future, perhaps even when the popularity of the game was declining (see the provisos above). Such a result might have opened up innovative uses of game-based strategies of this kind to modify citizens' behavioral routines.

5
Mobile Phone Data Analysis. Our work builds on the extensive literature on the analysis of mobile records. We refer the reader to comprehensive surveys of research in this wide area [26,27]. Here, we focus on describing the specific topics that intersect with our research. In order to analyze our data, we borrowed the concept of a "snapshot," i.e., the status of the cell phone network during a specific interval of time [10]. The comparison of snapshots of the network enables us to find which time instants actually show interesting and, in our case, statistically significant differences. In contrast, previous work has focused mostly on studying the network when it showed higher traffic volume or population density [28], without controlling for covariates or population size, which we do in this work. We controlled for population size by fixing the set of users to only those active every single day under study, minimizing "noise" in the form of one-off users, for instance. We also make use of the concept of floating population profile derived from our own and other similar work [9,29,30,31,32,33]. An interesting result is that research based on this methodology has proven consistent across different cities, enabling urban planners to compare cities with respect to their land use patterns [32], as well as to study how rhythms of life differ according to socio-cultural factors [34].
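The user-fixing step described above can be illustrated with a minimal sketch. It assumes a hypothetical record layout of (user_id, day, snapshot, tower_id) tuples held in a list; the function names and the layout are illustrative assumptions, not the pipeline used in the study.

```python
from collections import Counter, defaultdict


def users_active_every_day(records):
    """Return the ids of users seen on every day present in `records`.

    `records` is a list of (user_id, day, snapshot, tower_id) tuples.
    This layout is a hypothetical one chosen for illustration.
    """
    users_by_day = defaultdict(set)
    for user_id, day, _snapshot, _tower in records:
        users_by_day[day].add(user_id)
    if not users_by_day:
        return set()
    # A user is kept only if they appear in the set of every day.
    return set.intersection(*users_by_day.values())


def snapshot_counts(records):
    """Count distinct retained users per (day, snapshot, tower)."""
    keep = users_active_every_day(records)
    # Deduplicate so that a user is counted once per day, snapshot and tower.
    seen = {
        (day, snapshot, tower, user_id)
        for user_id, day, snapshot, tower in records
        if user_id in keep
    }
    return Counter((day, snapshot, tower) for day, snapshot, tower, _ in seen)


# Toy data: only user "a" is present on both days.
records = [
    ("a", 1, 0, "T1"),
    ("b", 1, 0, "T1"),
    ("a", 2, 0, "T1"),
    ("c", 2, 0, "T2"),
    ("a", 1, 0, "T1"),  # repeated connection, counted once
]
counts = snapshot_counts(records)
```

In this toy example, users "b" and "c" are dropped as one-off users, so every remaining count refers to the same fixed population across days.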
According to a recent survey of urban sensing research [6], local event analysis is a key area of mobile phone data analysis. Local events are usually defined as unusual gatherings or movements of massive amounts of people (e.g., protests, emergencies, sports, natural events, etc.) [35,36,37,38]. Hence, the unit of analysis is a single event with time and space constraints. Our method, instead, works at the city level, with less tightly prescribed time and location constraints than those above. One thing in common between our work and the cited references is that all analyses have been performed ex-post, which makes it possible to create spatio-temporal signatures of places [29]. Even though these approaches allow us to analyze and understand the city, they do not allow measurement of city-scale phenomena due to their assumptions of locality.
Another relevant area is prediction and forecasting of human mobility [39,40,41,42], which makes it possible to understand distributions [42] and limits of predictability in human mobility [41]. Those methods focus on the big picture of mobility and rely on probabilistic models that need extensive datasets covering large periods of time, unlike ours, which only analyzed a two-week period. Another approach uses regression [40], as we do. However, our method differs: instead of using longitudinal data in only one regression model, for which Poisson models are better suited [40,14], we use many consecutive Negative Binomial models, one for each time snapshot of the network, thus avoiding violating the assumptions of the Poisson model, and controlling for daily rhythms at the same time [43]. As the dispersion value showcases (cf. Figure 4), the NB regression was a correct choice, because the dispersion parameter α is greater than zero [44]. In our case, factors that could have caused over-dispersion include aggregation and non-uniform spatial distribution of the units of analysis [45].
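To make the role of the dispersion value α concrete, the following is a minimal sketch of an over-dispersion check. It assumes the common NB2 parameterization, in which the variance of a count with mean μ is μ + αμ², and it uses a simple method-of-moments estimate with no covariates. This is a simplification for illustration, not the regression-based estimate reported in the paper; the function name and the toy counts are hypothetical.

```python
from statistics import mean, variance


def moment_alpha(counts):
    """Method-of-moments estimate of the NB2 dispersion parameter.

    Under NB2, Var(Y) = mu + alpha * mu**2, so
    alpha = (Var(Y) - mu) / mu**2.
    A value near zero is compatible with a Poisson model; a clearly
    positive value indicates over-dispersion.
    """
    mu = mean(counts)
    if mu == 0:
        raise ValueError("alpha is undefined when the mean count is zero")
    var = variance(counts)  # sample variance (n - 1 denominator)
    return (var - mu) / (mu * mu)


# Hypothetical device counts per zone for a single snapshot.
snapshot = [0, 2, 4, 10, 4]
alpha = moment_alpha(snapshot)  # mean 4, variance 14, so alpha = 10 / 16
```

Repeating such a check for each snapshot gives a rough indication of whether a Poisson model would understate the variance at that time of day. A negative estimate would point to under-dispersion instead.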
Augmented Reality and Location-based Games. The effects of augmented reality games and applications within the city have been anticipated for more than one decade [46,47]. However, the limits of mainstream hardware have impeded their general implementation and adoption at different times. Most studies about the impact of those games have been small-scale only [48]. Even though smartphone technology has allowed location-based augmented reality games to become more commonplace in the last few years, until the launch of Pokémon Go, they still lacked the cultural impact needed to have a considerable effect on the city. As Frank Lantz is quoted in [49], in relation to the game PacManhattan: "If you want to make games like this you have to work hard to recruit an audience for them, you can't just make up something awesome and then hope that people fall into it" [50]. Since, as we discussed above, Pokémon is one of the most successful media franchises in the world [5], it offers a unique opportunity to study both the impact of a location-based augmented reality game and the effect of an intervention at the city scale when it comes to population mobility.

6
In this paper, we studied how Pokémon Go affected the floating population patterns of a city. The game led to notable pedestrian phenomena in many parts of the world. This is extraordinary in the sense that it happened without the usual triggers like war, climate change, famine, violence, or natural catastrophes. In this regard, to the best of our knowledge, this is the first large-scale study on the effect of augmented reality games on city-level urban mobility. Given the massive popularity of the Pokémon brand, and its cultural impact in many parts of the world, we believe that the effects we found represent a good approximation of mobility change in a large city. Jane Jacobs theorized that the streets need more pedestrians to be safe and lively [20]. Using CDR data, it has been shown that this is true in at least some cities [22]. Thus, one of the most important conclusions of our work is that cities may not need to change their infrastructure in the short term to motivate pedestrians to go out. A game about imaginary creatures lurking in neighborhoods, which can be collected using cell phones, encourages people to go out and make streets more lively and safe when they are commuting, or when they have free time during lunch or at night.
In summary, this study identified and investigated the effect of a specific type of phenomenon in the pulse of a city, measured through its floating population mobility patterns and usage of Pokémon Go. Our methods can be used to perform other natural experiments related to urban mobility, enabling measurement of the impact of city-wide interventions, and using the results to inform public policy changes.
We thank Alonso Astroza for providing a crowdsourced list of Ingress Portals validated as Pokémon Go PokéStops and PokéGyms. The analysis was performed using Jupyter Notebooks [51], jointly with the statsmodels [52] and pandas [53] libraries. All the maps in this paper include data from ©OpenStreetMap contributors and tiles from ©CartoDB. We also thank Telefónica R&D in Santiago for facilitating the data for this study, in particular Pablo García Briosso. Finally, we thank the anonymous reviewers for the insightful comments that helped to improve this paper.
EG and LF designed the experiment. EG, LF and DC performed data analysis. All authors participated in manuscript preparation.
The Telefónica Movistar mobile phone records have been obtained directly from the mobile phone operator through an agreement between the Data Science Institute and Telefónica R&D. This mobile phone operator retains ownership of these data and imposes standard provisions on their sharing and access which guarantee privacy. Anonymized datasets are available from Telefónica R&D Chile (http://www.tidchile.cl) for researchers who meet the criteria for access to confidential data. Other datasets used in this study are either derived from mobile records, or publicly available.