In this Section, we first present the dataset used in this study and how it was collected (Sect. 2.1). We then introduce the gravity model and the radiation model as ways to capture human mobility (Sect. 2.2). We also discuss the methodology behind the COVID Gravity Model (CGM) and why an extended gravity model is needed to better capture international mobility during the COVID-19 pandemic (Sect. 2.3). Finally, we briefly discuss the metrics adopted to evaluate the models (Sect. 2.4).

### 2.1 Dataset

Here, we describe the measurement infrastructure we leverage to collect network data from one of the largest commercial mobile network operators (MNOs) in the UK, with 27.2 million subscribers as of May 2021. In particular, we detail the dataset we have built and the metrics we use to capture the international activity of smartphone devices.

#### 2.1.1 Measurement infrastructure

In this study, we use a passive measurement approach to retrieve anonymized information about the devices attached to the antennas of the mobile network operator that provided the data. Each measurement carries (1) the anonymized user ID, (2) the SIM mobile country code (MCC) and mobile network code (MNC), (3) the first eight digits of the device International Mobile Equipment Identity (IMEI), (4) the timestamp, and other information. We also collect a device’s unique ID assigned by the Global System for Mobile Communications Association (GSMA), which describes properties of the device such as manufacturer, brand and model name, operating system, and supported radio bands. In this way, we can distinguish between smartphones (likely used as primary devices by mobile users) and Internet of Things devices. In this study, we use only the measurements related to smartphones. Additional information on the measurement infrastructure can be found in [34].

#### 2.1.2 International patterns extraction

Mobile phones are a ubiquitous technology that has been rapidly adopted worldwide [35]. Most people traveling within a country or internationally carry *at least* one device that uses Radio Base Stations (RBSs) to interact with other devices (e.g., to send and receive calls and messages and to connect to the internet). Whenever people traveling with connected devices cross a border, their devices need to connect to the radio network of another (local) operator to continue working correctly. For example, the mobile phone of a person traveling from Italy to the UK will have to connect to the network of a UK telecommunication operator. That operator will collect information about the device, including the country where the connected SIM is registered. The latter can be extracted from the MCC, a three-digit code that identifies the origin of the SIM [35]. These data let us quantify incoming international mobility. Outgoing international mobility can also be captured, because telecommunication operators know which of their SIMs are connected to other operators’ networks.

In this study, to quantify international mobility, we count (1) the number of foreign mobile phones connected to the operator’s network per day as a proxy for incoming international mobility, and (2) the number of the operator’s SIMs in mobile phones connected to a foreign network as a proxy for outgoing international mobility. Other devices (e.g., modems, tablets, and wearable devices) are excluded from this study. In this way, we can quantify both incoming and outgoing international mobility in near real time (e.g., with a one-day delay).
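The inbound count described above can be sketched as follows. This is a minimal illustration, not the operator's pipeline: the record layout `(day, user_id, sim_mcc, is_smartphone)`, the function name, and the sample records are all hypothetical. The only external facts used are the MCC allocations (234 and 235 for the United Kingdom, 222 for Italy, 214 for Spain, 208 for France).

```python
from collections import defaultdict

# Hypothetical record layout: (day, anonymized_user_id, sim_mcc, is_smartphone).
# sim_mcc is the three-digit mobile country code carried by the SIM.
HOME_MCCS = {"234", "235"}  # MCCs allocated to the United Kingdom


def inbound_roamers_per_day(records):
    """Count distinct foreign-SIM smartphones seen per (day, SIM MCC)."""
    seen = defaultdict(set)
    for day, user_id, sim_mcc, is_smartphone in records:
        if not is_smartphone:      # drop modems, wearables, IoT devices
            continue
        if sim_mcc in HOME_MCCS:   # native subscriber, not an inbound roamer
            continue
        seen[(day, sim_mcc)].add(user_id)
    return {key: len(users) for key, users in seen.items()}


# Made-up sample records, for illustration only.
records = [
    ("2020-03-05", "u1", "222", True),   # Italian SIM
    ("2020-03-05", "u1", "222", True),   # same device seen again that day
    ("2020-03-05", "u2", "222", True),
    ("2020-03-05", "u3", "214", True),   # Spanish SIM
    ("2020-03-05", "u4", "234", True),   # UK SIM, ignored
    ("2020-03-05", "u5", "208", False),  # French SIM in a non-smartphone
    ("2020-03-06", "u1", "222", True),
]
counts = inbound_roamers_per_day(records)
```

Counting distinct user IDs rather than raw measurements avoids inflating the flow when a device produces several measurements on the same day. The outbound proxy would follow the same pattern, keeping home-MCC SIMs observed on foreign networks instead.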

### 2.2 Modeling international mobility

In this Section, we describe how international mobility patterns can be modeled with roaming traces. In the literature, there are two main ways to model mobility flows: the gravity model [36] and the radiation model [37]. The gravity model mimics Newton’s law of gravitation and assumes that the number of trips decreases as the distance between places increases. In this model, the populations of the origin and of the destination play the role of masses. The radiation model [37], similarly to the intervening opportunities model [38], assumes instead that the number of trips is driven by the opportunities offered by the origin and destination locations: people eventually travel to the nearest location that provides adequate opportunities.

#### 2.2.1 Gravity model

In 1946, George K. Zipf proposed a model to estimate mobility flows, drawing an analogy with Newton’s law of universal gravitation [36]. The gravity model is based on the assumption that the number of travelers between two locations increases with the populations living there and decreases with the distance between them [9]. Given its ability to generate spatial flows and traffic demand between locations, the gravity model has been used in various contexts such as transport planning [39], spatial economics [40], and the modeling of epidemic spreading patterns [41–43]. In particular, the gravity model estimates mobility flows between the areas *i* and *j* according to the following function

$$ T_{i,j} \propto m_{i} m_{j} f(r_{ij}), $$

(1)

where the masses \(m_{i}\) and \(m_{j}\) are related to the populations of locations *i* and *j*, respectively, while \(f(r_{ij})\) is a function of the distance between *i* and *j*, commonly called the friction factor or deterrence function. There are two common ways to model the deterrence function, namely (i) assuming an exponential decay:

$$ f(r_{ij}) = \exp (- \beta r_{ij}) $$

(2)

or (ii) assuming a power decay of the flows with respect to the distance:

$$ f(r_{ij}) = r_{ij} ^{-\beta }. $$

(3)

The parameters of the function need to be calibrated. In this work, we searched for the best parameters using the curve-fitting utilities of SciPy [44]. The main limitations of the gravity model are that (i) it requires, at least, the estimation and calibration of \(\beta\), which makes the results sensitive to its value; and (ii) this calibration requires empirical data on actual movements, which are not always available. As a result, this approach is a strong simplification of the actual flows, and the results may not reflect real mobility.
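As an illustration of the calibration step, the sketch below fits a gravity model with the power-law deterrence function of Eq. (3) using `scipy.optimize.curve_fit`. It is not the exact procedure used in this work. The proportionality constant `K`, the choice to fit in log space, and the synthetic populations, distances, and flows are assumptions made only for the example.

```python
import numpy as np
from scipy.optimize import curve_fit


def log_gravity(X, log_k, beta):
    """Log of K * m_i * m_j * r_ij**(-beta); fitted in log space for stability."""
    m_i, m_j, r = X
    return log_k + np.log(m_i) + np.log(m_j) - beta * np.log(r)


# Synthetic origin-destination pairs (illustrative values, not from the study).
rng = np.random.default_rng(0)
m_i = rng.uniform(1e6, 8e7, size=50)      # origin populations
m_j = rng.uniform(1e6, 8e7, size=50)      # destination populations
r = rng.uniform(200.0, 3000.0, size=50)   # distances in km

# Noise-free flows generated from known parameters, so the fit can be checked.
true_log_k, true_beta = np.log(1e-9), 2.0
flows = np.exp(log_gravity((m_i, m_j, r), true_log_k, true_beta))

# curve_fit accepts a (k, M)-shaped array when the model has k predictors.
X = np.vstack([m_i, m_j, r])
(log_k_hat, beta_hat), _ = curve_fit(log_gravity, X, np.log(flows), p0=[0.0, 1.0])
```

Because the synthetic flows contain no noise, the fit should recover `true_beta` closely. With real roaming counts, the estimate of \(\beta\) depends on the noise and on which origin-destination pairs are observed, which is the sensitivity noted above.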

#### 2.2.2 Radiation model

To address some of the limitations of the gravity model, the radiation model has been proposed [37]. This model is an extension of the intervening opportunities model [38] in which we assume that a traveler chooses the destination of a trip in two steps. First, each possible destination is assigned a value representing the opportunities it offers to the traveler. This number *k* is drawn from a distribution \(p(k)\) representing the quality of the opportunity. Then, all the opportunities are ranked according to distance, and the traveler goes to the nearest location with an opportunity value higher than a threshold. The threshold is randomly sampled from the same distribution \(p(k)\). Therefore, the number of people commuting from *i* to *j* can be modeled with

$$ T_{ij} = T_{i} \frac{m_{i} m_{j}}{(m_{i} + s_{ij})(m_{i} + m_{j} + s_{ij})} $$

(4)

where \(T_{i}\) is the total number of travelers leaving *i* and \(s_{ij}\) is the total population within a circle of radius \(r_{ij}\) centered at *i*, excluding the populations of *i* and *j*. Differently from the gravity model, there are no parameters to calibrate. The radiation model has been reported to better capture long-term migration patterns and to have a high degree of accuracy at the intra-country scale [37, 45]. The radiation model we adopted is implemented in the scikit-mobility library [46].
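To make the role of the intervening population concrete, the sketch below evaluates the radiation model in its standard form from [37], \(T_{ij} = T_{i}\, m_{i} m_{j} / \bigl((m_{i} + s_{ij})(m_{i} + m_{j} + s_{ij})\bigr)\), where \(T_{i}\) is the total outflow from *i* and \(s_{ij}\) is the population within distance \(r_{ij}\) of *i*, excluding *i* and *j*. It is a plain-Python illustration, not the scikit-mobility implementation. The function name, the planar coordinates, and the populations and outflows are assumptions chosen for the example.

```python
import math


def radiation_flows(populations, coords, outflows):
    """Return {(i, j): T_ij} for the radiation model in its standard form."""
    n = len(populations)
    flows = {}
    for i in range(n):
        for j in range(n):
            if i == j:
                continue
            r_ij = math.dist(coords[i], coords[j])
            # s_ij: population within radius r_ij of i, excluding i and j.
            s_ij = sum(
                populations[k]
                for k in range(n)
                if k not in (i, j) and math.dist(coords[i], coords[k]) <= r_ij
            )
            m_i, m_j = populations[i], populations[j]
            flows[(i, j)] = (
                outflows[i] * m_i * m_j / ((m_i + s_ij) * (m_i + m_j + s_ij))
            )
    return flows


populations = [100.0, 50.0, 200.0]              # illustrative masses
coords = [(0.0, 0.0), (1.0, 0.0), (3.0, 0.0)]   # planar coordinates
outflows = [10.0, 10.0, 10.0]                   # T_i: trips leaving each origin
flows = radiation_flows(populations, coords, outflows)
```

In this toy setting, the flow from location 0 to the large but distant location 2 is damped by the population of location 1, which lies in between. Distance enters only through the ranking of intervening populations, which is why no deterrence parameter needs to be fitted.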

Although the radiation model has been applied successfully in various settings, some results highlight that the model does not adequately account for the spatial scale [47, 48]. Some studies go further and limit the application of the radiation model to urban or metropolitan areas [49], because the parameter-free design of the model limits its capability to capture human mobility.

### 2.3 COVID gravity model

In this work, we argue that the gravity model has limitations when modeling human mobility during the COVID-19 pandemic. In particular, the gravity model assumes that flows of people are determined by the populations of the origin and destination and by the distance between them. However, during the COVID-19 pandemic, travel restrictions and travel bans also play an important role. Consider an origin and two different destinations with the same population and at the same distance: by definition, the gravity model will output the same flow of people towards both. However, the destinations may have different restrictions in place (e.g., quarantines, travel bans), and thus the flows may be significantly different. Therefore, we argue that capturing only distances and populations is not enough and that the restrictions should be explicitly taken into consideration.

In this Section, we adapt the gravity model to also take restriction levels into consideration. We call this version of the gravity model the COVID Gravity Model (CGM).

The information about restriction levels is provided by the Oxford Stringency Index (SI) [50]. It is a composite measure based on nine response indicators, including school closures, workplace closures, and travel bans. The Oxford SI is provided at different spatial aggregations, including the national one, and it takes values from 0 to 100, where lower numbers indicate lower restrictions. The Oxford SI is computed every day starting from the 22nd of January 2020. As this study is focused on European countries, we investigate the period from the 5th of March to the 30th of May. Indeed, starting from March 5, European countries began to adopt non-pharmaceutical interventions to counter the spread of the pandemic (e.g., school closures in Italy and self-isolation in Germany).

In addition to populations and distances, CGM considers the Oxford SI of the origin country and the Oxford SI of the destination country.

Mathematically, we can model \(T_{i,j}\) of CGM as a negative binomial regression with multiple parameters to fit [51]:

$$ T_{i,j} = \exp \bigl(\epsilon + \alpha \log (P_{i}) + \beta \log (P_{j}) + \gamma \log \bigl(f(r_{ij})\bigr) + \delta _{1} \mathrm{SI}_{i} + \delta _{2} \mathrm{SI}_{j}\bigr). $$

(5)
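The effect of the stringency terms in Eq. (5) can be illustrated by evaluating the expected flow for two destinations that differ only in their SI. This is a sketch under stated assumptions: the coefficient values are hypothetical rather than fitted, the function name is illustrative, and the deterrence function is taken as \(f(r) = r\) so that the distance decay is carried entirely by a negative \(\gamma\).

```python
import math


def cgm_flow(p_i, p_j, r_ij, si_i, si_j, coef):
    """Expected flow under Eq. (5), assuming f(r) = r."""
    linear = (
        coef["epsilon"]
        + coef["alpha"] * math.log(p_i)
        + coef["beta"] * math.log(p_j)
        + coef["gamma"] * math.log(r_ij)   # decay expressed through gamma < 0
        + coef["delta_1"] * si_i
        + coef["delta_2"] * si_j
    )
    return math.exp(linear)


# Hypothetical coefficients, chosen only to illustrate the mechanism.
coef = {
    "epsilon": -20.0, "alpha": 1.0, "beta": 1.0,
    "gamma": -1.5, "delta_1": -0.01, "delta_2": -0.02,
}

# Same origin, same destination population and distance, different SI.
lenient = cgm_flow(6.0e7, 4.7e7, 1200.0, 50.0, 20.0, coef)
strict = cgm_flow(6.0e7, 4.7e7, 1200.0, 50.0, 80.0, coef)
ratio = strict / lenient   # reduces to exp(delta_2 * (80 - 20))
```

A plain gravity model would return identical flows for the two destinations. Here the ratio depends only on \(\delta_{2}\) and the difference in SI, so with a negative \(\delta_{2}\) the stricter destination receives a smaller flow.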

### 2.4 Evaluation metrics

The Sørensen–Dice index, also called Common Part of Commuters (CPC) [8, 9], is a well-established measure to compute the similarity between real flows, \(y^{r}\), and generated flows, \(y^{g}\):

$$ \mathrm{CPC} = \frac{2 \sum_{i,j} \min (y^{g}(l_{i}, l_{j}), y^{r}(l_{i}, l_{j}))}{\sum_{i,j} y^{g}(l_{i}, l_{j}) + \sum_{i,j} y^{r}(l_{i}, l_{j})} $$

(6)

CPC is a non-negative number contained in the closed interval \([0, 1]\), with 1 indicating a perfect match between the generated flows and the ground truth and 0 indicating no overlap between them. Note that when the generated total outflow is equal to the real total outflow, CPC is equivalent to the accuracy, i.e., the fraction of trips’ destinations correctly predicted by the model. In this work, we use CPC to evaluate the performance of the gravity model, the radiation model, and CGM.
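A direct transcription of Eq. (6) may help in reading the metric. The sketch assumes flows are stored as dictionaries keyed by (origin, destination), which is a choice made for this example; the country pairs and counts are made up.

```python
def common_part_of_commuters(real, generated):
    """CPC (Eq. 6) between two flow dictionaries keyed by (origin, destination)."""
    pairs = set(real) | set(generated)
    # A pair missing from one dictionary is treated as a zero flow.
    overlap = sum(min(real.get(p, 0.0), generated.get(p, 0.0)) for p in pairs)
    total = sum(real.values()) + sum(generated.values())
    return 2.0 * overlap / total if total > 0 else 0.0


# Illustrative flows with equal totals (200 trips each).
real = {("IT", "UK"): 100.0, ("ES", "UK"): 50.0, ("FR", "UK"): 50.0}
generated = {("IT", "UK"): 80.0, ("ES", "UK"): 70.0, ("FR", "UK"): 50.0}
cpc = common_part_of_commuters(real, generated)
```

In this example both flow sets total 200 trips and share 180 of them, so CPC coincides with the fraction of correctly assigned trips, 0.9.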

We also compute the Information Gain (IG). Given the real flows at a given time step over *n* locations \(y^{r} = \{y_{1}^{r},y_{2}^{r},\dots ,y_{n}^{r}\}\) and the generated flows for the same spatial and temporal reference \(y^{g} = \{y_{1}^{g},y_{2}^{g},\dots ,y_{n}^{g}\}\), IG is defined as follows

$$ \operatorname{IG}\bigl(y^{r},y^{g}\bigr) = \sum _{i=1}^{n} \frac{y_{i}^{r}}{N} \log \frac{y_{i}^{r}}{y_{i}^{g}}, $$

(7)

where *N* is the sum over all the elements in \(y^{r}\). IG is a non-negative error metric, with lower numbers indicating better performance. We use the Information Gain implemented in scikit-mobility [46].
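For reference, Eq. (7) can be transcribed as follows. This is a plain-Python sketch of the formula as written, not the scikit-mobility implementation. It assumes every generated flow is strictly positive and treats a zero real flow as contributing nothing to the sum; the flow values are made up.

```python
import math


def information_gain(real, generated):
    """Eq. (7): sum_i (y_r[i] / N) * log(y_r[i] / y_g[i]), with N = sum(y_r)."""
    n_total = sum(real)
    return sum(
        (y_r / n_total) * math.log(y_r / y_g)
        for y_r, y_g in zip(real, generated)
        if y_r > 0  # a zero real flow contributes nothing (0 * log 0 -> 0)
    )


# Illustrative flows over three locations, with equal totals.
real = [100.0, 50.0, 50.0]
generated = [80.0, 70.0, 50.0]
ig = information_gain(real, generated)
```

When the generated flows equal the real ones, every logarithm vanishes and IG is zero. Larger values indicate that the generated distribution diverges more from the observed one.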