To measure mobility resilience, we used geo-tagged tweets from several types of disasters (Table 1). The data sets were obtained from two Dryad digital repositories: http://datadryad.org/resource/doi:10.5061/dryad.88354 [58], originally collected by Wang et al. [41], and https://datadryad.org//resource/doi:10.5061/dryad.15fv2, collected by Kryvasheyeu et al. [59].

To validate our approach of using social media data, we collected New York City taxi data covering the same period as the Hurricane Sandy Twitter data. The data were obtained from a repository hosted by the New York City Taxi and Limousine Commission (http://www.nyc.gov/html/tlc/html/about/trip_record_data.shtml). Each observation in the data represents a trip, and there were 12,892,877 trips in total in the study period.

The Hurricane Sandy data contain tweets from several places, including the USA, Canada, Mexico and other countries. To measure resilience for a city or a state in response to Hurricane Sandy, we applied appropriate location filters. For example, a trip can be made entirely within New York City, or it can have only its origin or its destination there. Since displacements are calculated in six-hour periods, when calculating resilience for New York City with a location filter applied, only the displacements within New York City are considered in each six-hour period. If a location filter is not applied, both the displacements within New York City and those having an origin or a destination in New York City are considered. Apart from the Hurricane Sandy data, the remaining data sets consist of city-specific tweets from cities that were subject to a disruptive event. Thus, a location filter or constraint is not required for these cases.
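The two filtering options described above can be illustrated with a minimal sketch. It assumes the region is approximated by a rectangular bounding box, which is a simplification introduced here; the function names and the box coordinates are hypothetical and not taken from the study.

```python
def in_bbox(point, bbox):
    """Return True if a (lat, lon) point lies inside bbox = (south, west, north, east)."""
    lat, lon = point
    south, west, north, east = bbox
    return south <= lat <= north and west <= lon <= east


def keep_displacement(origin, destination, bbox, location_filter=True):
    """Decide whether a displacement counts towards the region.

    location_filter=True  -> both end points must be inside the region.
    location_filter=False -> at least one end point must be inside the region.
    """
    o_in = in_bbox(origin, bbox)
    d_in = in_bbox(destination, bbox)
    return (o_in and d_in) if location_filter else (o_in or d_in)


# Hypothetical rectangular box, used only to illustrate the two options.
EXAMPLE_BBOX = (40.0, -75.0, 41.0, -73.0)
```

With the filter on, a displacement that leaves the box is dropped; with the filter off, it is kept as long as one of its end points lies inside the box.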

In this study, we apply the concept of resilience to understand human mobility under a disaster. Following the basic definition of resilience, we define *mobility resilience* as the ability of a mobility infrastructure system responsible for the movement of a population to manage shocks and return to a steady state in response to an extreme event. Such events include hurricanes, earthquakes, terrorist attacks, winter storms, wildfires, floods, and others. We propose a simple method based on human movement data that uses *normalized per user displacement* as a key indicator of human mobility. By comparing per user displacements with typical displacements, the proposed method can detect a disruptive event from movement data and calculate the maximum deviation from normal conditions and the recovery time. Finally, applying the concept of the resilience triangle, we estimate the resilience and the transient loss of resilience for an event detected by the method. The proposed method can take any kind of movement data as input, including coordinates from mobile phone call records, GPS observations, social media posts and many others. In this paper, we present our resilience analysis based on social media data from multiple types of disasters.

### 2.1 Extracting location time series of a user

First, the coordinates of a user are sorted in ascending order by timestamp. If there are not enough users for an hourly analysis, we can divide each day into four periods: 12 AM to 6 AM, 6 AM to 12 PM, 12 PM to 6 PM and 6 PM to 12 AM. From the sorted time series, the locations (i.e., latitude and longitude) of each user are extracted in six-hour intervals for each day.

$$ P_{u}^{d,t} = \bigl\{ ( x, y ) \vert ( x, y ) \in ( \text{latitude}, \text{longitude of a region} ) \bigr\} , $$

(2)

where \(P_{u}^{d,t}\) denotes the set of locations of a user *u* on day *d* in period *t*, with \(d \in ( \text{days in the dataset} )\), \(t \in ( \text{periods in a day} )\), and \(u \in (\text{users in the dataset})\).
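As an illustration of this extraction step, the following minimal sketch groups timestamped points into per-user, per-day, per-period sets. The record layout `(user, timestamp, lat, lon)` and the function names are assumptions made for the example, not part of the original data format.

```python
from collections import defaultdict
from datetime import datetime


def period_index(ts, hours_per_period=6):
    """Map a timestamp to its period of the day (0..3 for six-hour periods)."""
    return ts.hour // hours_per_period


def extract_location_series(records, hours_per_period=6):
    """Group (user, timestamp, lat, lon) records into series[user][(day, period)].

    Records are sorted by user and timestamp first, so the points inside each
    group stay in ascending time order.
    """
    series = defaultdict(lambda: defaultdict(list))
    for user, ts, lat, lon in sorted(records, key=lambda r: (r[0], r[1])):
        key = (ts.date(), period_index(ts, hours_per_period))
        series[user][key].append((lat, lon))
    return series
```

Each inner list then plays the role of \(P_{u}^{d,t}\) for one user, day and period.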

### 2.2 Displacement metric

From the set of locations of a user, distances between consecutive points are calculated using the Haversine formula [60] shown in Equation (3). Given a pair of points (latitude and longitude), the Haversine formula calculates the great-circle distance between them. Although the most appropriate distance would be the actual traveled distance (the distance along the traveled road or air path), it is impossible to obtain this distance from social media data due to the lack of trajectory information. Euclidean distance is the straight-line distance between two points, which rarely matches the real road or air distance. We adopted the Haversine distance because it accounts for the curvature of the earth and, for small distances, is almost the same as the Euclidean distance. Thus, the Haversine distance is preferable to the Euclidean distance and is the most suitable for air distance. The Haversine distance has been adopted by many previous studies [40, 41, 61, 62] related to human mobility. To the best of our knowledge, Canberra distance is not suitable for human mobility analysis because it tends to calculate the distance in a higher dimensional space.

For calculating displacements, a user must have at least two locations within a six-hour interval. Otherwise, the user is not considered in that interval.

$$ C=2r\times \sin^{-1} \Biggl( \sqrt{\sin^{2} \biggl( \frac{\phi_{2} - \phi_{1}}{2} \biggr) + \cos \phi_{1} \cos \phi_{2} \sin^{2} \biggl( \frac{\varphi_{2} - \varphi_{1}}{2} \biggr)} \Biggr), $$

(3)

where *r* is the radius of the earth, *ϕ* is latitude and *φ* is longitude. The displacement between two consecutive points is calculated for each user in every six-hour interval.
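Equation (3) can be sketched in a few lines of Python. The mean earth radius of 6371 km and the function names are assumptions made for this example; the text does not state which radius or unit was used.

```python
import math

EARTH_RADIUS_KM = 6371.0  # assumed mean earth radius; not specified in the text


def haversine_km(lat1, lon1, lat2, lon2, r=EARTH_RADIUS_KM):
    """Great-circle distance between two (lat, lon) points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    # Clamp to 1.0 to guard against rounding error for near-antipodal points.
    return 2 * r * math.asin(math.sqrt(min(1.0, a)))


def displacements(points):
    """Distances between consecutive (lat, lon) points.

    Returns an empty list when there are fewer than two points, which matches
    the rule that such a user is not considered in that interval.
    """
    return [haversine_km(*p, *q) for p, q in zip(points, points[1:])]
```

A user with *n* points in an interval contributes *n* − 1 displacements.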

The average displacement for an interval is calculated by dividing the sum of the displacements by the total number of unique users contributing to those displacements. Thus,

$$ D^{d_{i}, t_{p}} = \biggl\{ \frac{\sum_{d_{i}, t_{p}}^{d_{i}, t_{p+ \Delta t}} C^{d,t}}{\sum_{d_{i}, t_{p}}^{d_{i}, t_{p+ \Delta t}} u^{d,t}} \biggr\} , $$

(4)

where \(D^{d_{i}, t_{p}}\) represents the average displacements from period \(t_{p}\) to \(t_{p+ \Delta t}\) for day \(d_{i}\). The term \(\sum_{d_{i}, t_{p}}^{d_{i}, t_{p+ \Delta t}} C^{d,t}\) indicates the summation of the displacements for all users from \(t_{p}\) to \(t_{p+ \Delta t}\) for day \(d_{i}\) and the term \(\sum_{d_{i}, t_{p}}^{d_{i}, t_{p+ \Delta t}} u^{d,t}\) represents the total number of users contributing to these displacements within this period. Here Δ*t* is the time interval to calculate human mobility. In this study, \(\Delta t=6\) hours is chosen considering the availability of enough users within the interval.

If a user has more than two observations in a given period, we calculate all the displacements between the consecutive points. To calculate the average displacement in a given period on a given day, we sum all the displacements of all the users and divide the sum by the total number of unique users having displacements in that period on that day. However, we do not normalize by the number of displacements observed for a user. During a disaster, human mobility can be affected through both the distance and the frequency of displacements. For example, suppose that before a disaster an individual used to make 4 displacements or trips of 2 miles each in a given 6-hour period, and that during the disaster the same individual makes 2 trips of 2 miles each in a 6-hour period. If we normalized by the number of observations (i.e., calculated displacement per trip per user), we would obtain 2 miles/trip for this user in both the pre-disaster and disaster periods, although the individual's mobility decreased significantly during the disaster. Thus, when calculating the average displacement, we do not normalize by the number of observations, so that we can capture the effect of a disaster on both trip frequencies and distances.
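The per user averaging of Equation (4) and the trip example above can be checked with a short sketch. The dictionary layout (user to list of displacement lengths in one period) and the function names are assumptions made for illustration.

```python
def average_displacement(user_displacements):
    """Sum of all displacements divided by the number of contributing users.

    user_displacements maps a user to the list of displacement lengths in one
    period. Users without any displacement are excluded from the count.
    Returns None when no user contributes.
    """
    contributing = {u: d for u, d in user_displacements.items() if d}
    if not contributing:
        return None
    total = sum(sum(d) for d in contributing.values())
    return total / len(contributing)


def displacement_per_trip(user_displacements):
    """Alternative normalization by number of trips, shown only for contrast."""
    trips = [x for d in user_displacements.values() for x in d]
    return sum(trips) / len(trips) if trips else None


before = {"a": [2.0, 2.0, 2.0, 2.0]}  # 4 trips of 2 miles
during = {"a": [2.0, 2.0]}            # 2 trips of 2 miles
```

Here `average_displacement` gives 8 miles per user before and 4 miles per user during the disaster, whereas `displacement_per_trip` gives 2 miles per trip in both cases and hides the drop in mobility.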

### 2.3 Extraction of typical and actual displacements time series

The mobility dataset to be used for a resilience analysis should cover the pre-disaster, disaster, and post-disaster periods. Using the average displacement values in the pre-disaster period, we can make four sets of typical values for the four periods considered in a day. These four sets of typical values are calculated separately for weekdays and weekends.

$$\begin{aligned} &D_{\mathrm{weekday}}^{t} = \bigl\{ D^{d,t} \text{ where}, d\in(\text{pre-disaster weekdays})\bigr\} , \end{aligned}$$

(5)

$$\begin{aligned} &D_{\mathrm{weekend}}^{t} = \bigl\{ D^{d,t}\text{ where}, d\in(\text{pre-disaster weekend days})\bigr\} , \end{aligned}$$

(6)

where \(D_{\mathrm{weekday}}^{t}\) represents the set of displacements at period *t* considering only weekdays in the pre-disaster period. Similarly, \(D_{\mathrm{weekend}}^{t}\) represents the set of displacements at period *t* considering only weekends in the pre-disaster period. For instance, if we have 4 periods per day and select the first 7 days as the pre-disaster period, then for each period we have a set of 5 displacement values for weekdays and a set of 2 values for weekends. The mean and standard deviation of these sets are used to check whether the actual displacement at the corresponding period of a day is typical or not. To capture this effect, we compute a standardized displacement, the *Z* score, for each actual displacement using the equation given below:

$$ Z^{d,t} = \begin{cases} \dfrac{D^{d,t} - \text{mean of } D_{\mathrm{weekday}}^{t}}{\text{standard deviation of } D_{\mathrm{weekday}}^{t}} & \text{if } d \in (\text{weekdays}), \\ \dfrac{D^{d,t} - \text{mean of } D_{\mathrm{weekend}}^{t}}{\text{standard deviation of } D_{\mathrm{weekend}}^{t}} & \text{otherwise}, \end{cases} $$

(7)

where \(Z^{d,t}\) represents the *Z* score at day *d* and period *t*. If *d* is a weekday, typical displacements for weekdays are used to compare; and if *d* is a weekend day, typical displacements for weekends are used.
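The *Z* score computation of Equation (7) can be sketched as follows. The tuple layout `(is_weekend, period, displacement)` and the function names are assumptions, and the sketch uses the sample standard deviation (`statistics.stdev`), which is also an assumption since the text does not say whether the sample or population form was used.

```python
from statistics import mean, stdev


def typical_stats(pre_disaster):
    """Mean and standard deviation of pre-disaster displacements.

    pre_disaster is a list of (is_weekend, period, displacement) tuples.
    Returns {(is_weekend, period): (mean, sd)}. Each group needs at least
    two values, otherwise statistics.stdev raises StatisticsError.
    """
    groups = {}
    for is_weekend, period, value in pre_disaster:
        groups.setdefault((is_weekend, period), []).append(value)
    return {key: (mean(vals), stdev(vals)) for key, vals in groups.items()}


def z_score(value, is_weekend, period, stats):
    """Standardize an actual displacement against the matching typical values.

    Raises ZeroDivisionError if the pre-disaster values are all identical.
    """
    mu, sd = stats[(is_weekend, period)]
    return (value - mu) / sd
```

A weekday observation is compared only with the weekday statistics of the same period, and likewise for weekend days.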

### 2.4 Extreme event detection

An extreme event can disrupt human mobility by either increasing mobility or decreasing mobility. We consider two parameters for detecting an extreme event: a threshold *z* score (*α*) and the number of time intervals (*τ*). The first parameter checks the amount of deviation from typical values and the second parameter checks how long this deviation persists.

$$ \mathrm{Event}_{d_{i}, t_{p}}^{d_{j}, t_{q}} = \Biggl\{ Z^{d,t}: Z^{d,t} \leq \alpha_{l} \mbox{ and } \sum_{d_{i}, t_{p}}^{d_{j}, t_{q}} d,t \geq\tau \Biggr\} $$

(8)

or,

$$ \mathrm{Event}_{d_{i}, t_{p}}^{d_{j}, t_{q}} = \Biggl\{ Z^{d,t}: Z^{d,t} \geq \alpha_{u} \mbox{ and } \sum_{d_{i}, t_{p}}^{d_{j}, t_{q}} d,t \geq\tau \Biggr\} . $$

(9)

Equations (8) and (9) represent the event detection for decreased and increased mobility, respectively, where \(\mathrm{Event}_{d_{i}, t_{p}}^{d_{j}, t_{q}}\) represents an extreme event from day \(d_{i}\) period \(t_{p}\) to day \(d_{j}\) period \(t_{q}\); \(d_{i}\), \(d_{j} \in( \text{days in data set})\) and \(t_{p}, t_{q} \in(\text{periods in a day})\); \(\alpha_{l}, \alpha_{u}\) represent the lower and upper thresholds of the *Z* score; and *τ* represents the threshold number of periods for which the *Z* score stays above or below the threshold *Z* score. These parameters (\(\alpha, \tau\)) can be selected to identify shorter or longer extreme events depending on the type of disaster and the area affected by it.

We recommend selecting the threshold values based on the decision maker’s needs. For instance, a small threshold on event duration (*τ*) can capture mobility resilience for events that last only a short time (such as rainfall, thunderstorms, special events etc.). In contrast, a larger duration threshold is recommended if we want to calculate resilience only for events that last longer, such as hurricanes, typhoons etc.

On the other hand, the threshold on the *Z* score captures the amount of deviation caused by an event. The lower threshold (\(\alpha_{\mathrm{l}} \)) and upper threshold (\(\alpha_{\mathrm{u}} \)) are used for capturing events with decreased and increased mobility, respectively. Again, the selection of the thresholds depends on the decision maker’s needs. For example, if we want to capture only the events that cause a large deviation from the normal condition, a very small \(\alpha_{\mathrm{l}}\) and a very large \(\alpha_{\mathrm{u}}\) should be chosen.

A threshold is meant to separate the condition in which the level of human mobility deviates significantly (\(<\alpha_{{l}} \mbox{ or } > \alpha_{{u}} \)) from the normal condition for a significant amount of time (≥*τ*). In our study, we selected the threshold values that best capture the actual timing of the events (landfall time, earthquake strike time etc.). Another approach would be to fit a distribution to the pre-disaster data and choose a threshold that differs significantly from the normal condition at a 90% or 95% confidence level. In our case, however, there are not enough data to fit a distribution for most of the events. We consider variability between weekdays and weekends only.
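One way to read Equations (8) and (9) is sketched below. It assumes that an event is a maximal run of consecutive periods whose *Z* scores all lie at or beyond the same threshold and whose length is at least *τ*; this run-based reading, the index-based output and the function name are assumptions made for the example.

```python
def detect_events(z_scores, alpha_l, alpha_u, tau):
    """Find runs of at least tau consecutive periods beyond a Z score threshold.

    Returns a list of (start_index, end_index, kind) with an inclusive end
    index, where kind is "decrease" (Z <= alpha_l) or "increase" (Z >= alpha_u).
    """
    def label(z):
        if z is None:
            return None
        if z <= alpha_l:
            return "decrease"
        if z >= alpha_u:
            return "increase"
        return None

    events = []
    start, kind = None, None
    # A trailing None acts as a sentinel so that a run ending at the last
    # period is still closed and reported.
    for i, z in enumerate(list(z_scores) + [None]):
        current = label(z)
        if current != kind:
            if kind is not None and i - start >= tau:
                events.append((start, i - 1, kind))
            start, kind = i, current
    return events
```

With six-hour periods, `tau=3` requires the deviation to persist for at least 18 hours, so a single deviating period is ignored, while `tau=1` reports it as an event.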

### 2.5 Resilience calculation

Once an extreme event has been detected, the maximum deviation and recovery time can be easily calculated. We define human mobility resilience as the ability of a mobility infrastructure system responsible for the movement of the population of a community to manage shocks and return to a steady state in response to an extreme event. Bruneau et al. [10] introduced an equation for calculating resilience loss, as shown in Equation (1). As applied to the infrastructures of a community, Bruneau’s equation [10] computes the loss of resilience as the size of the degradation of the expected quality of an infrastructure over time. But community resilience as a whole should be calculated with respect to all possible extreme events. When applied to people and their environment, Norris et al. [6] used the term resilience as a metaphor in which a transient dysfunction occurs during a crisis due to the degradation of quality of life. A resilient community can adapt to the situation after the event, while a vulnerable community goes through a persistent dysfunction [6]. Here, using Bruneau’s approach, we calculate the loss of resilience, which is equivalent to the size of the dysfunction/degradation of human mobility. But mobility resilience represents a long-term property of a community in response to all possible crisis events. Since we determine the loss of resilience in response to a single crisis only, we call the size of the degradation the transient loss of resilience, defined as:

$$\text{Transient loss of resilience}, \mathrm{TLR}= \int_{t_{0}}^{t_{1}} \bigl[100-Q ( t ) \bigr]\,dt, $$

where TLR is the transient loss of resilience, that is, the area (see Fig. 1(a)) between the horizontal line at 100 and the quality curve \(Q(t)\) from \(t_{0}\) to \(t_{1}\), the recovery period of the event.

A schematic representation of this equation (see Fig. 1(a)) is known as a resilience triangle. From this triangle, the transient loss of resilience in any extreme event can be calculated as the area formed by the dashed lines and the vertical line (see Fig. 1(a)). Inspired by the resilience triangle, we represent the resilience by dividing this area into smaller trapezoids (see Fig. 1(b) and 1(c)) with height equal to the time increment (six hours) considered in the analysis. This assumption is required since, unlike an idealized quality function, a real-world quality function indicating human mobility drops from and returns to its typical values gradually. Thus, using smaller trapezoids reduces the approximation error in the calculation.

In our analysis, we use the level of human mobility as a proxy for the quality of the mobility infrastructure system. Thus, we define \(Q(t)\) as the ratio of average actual displacements to average typical displacements of a population at a time period *t*. If an actual displacement is equal to a typical displacement, the ratio is 1 and the value of the quality function is 100. In our case, \(Q(t)\) at a period *t* represents how different the level of human mobility is from its typical level at the same period before the disaster. We obtain the recovery time (\(\mathrm{t}_{0}\) to \(\mathrm{t}_{1} \)) (see Fig. 1) from the extreme event detection phase. Here, \(\mathrm{t}_{0}\) is the starting point of the detected extreme event and \(\mathrm{t}_{1}\) is its end point. The duration between \(\mathrm{t}_{0}\) and \(\mathrm{t}_{1}\) is the recovery time. The sum of the areas of all the small trapezoids is the transient loss of resilience (indicated by transient loss of resilience in Fig. 1(b) and 1(c)). The residual area (indicated by resilience in Fig. 1) represents the value of resilience during the recovery period. For increased mobility, the areas considered in the resilience calculation are bounded by the maximum quality percentage/ratio (see Fig. 1(c)).
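The trapezoid summation can be sketched as follows. The function names are hypothetical, \(Q(t)\) is expressed as a percentage, and the result is in percent-hours. Treating increased mobility through the absolute deviation from 100 is an assumption of this sketch; it does not reproduce the bounded construction of Fig. 1(c).

```python
def quality(actual, typical):
    """Q(t) as a percentage: 100 * actual / typical displacement per period."""
    return [100.0 * a / t for a, t in zip(actual, typical)]


def transient_loss_of_resilience(q, dt_hours=6.0):
    """Approximate the integral of [100 - Q(t)] over the event with trapezoids.

    q holds the Q(t) values at successive period boundaries from t0 to t1.
    The absolute deviation from 100 is used so that increased mobility also
    yields a positive loss (an assumption of this sketch).
    """
    loss = [abs(100.0 - v) for v in q]
    return sum((a + b) / 2.0 * dt_hours for a, b in zip(loss, loss[1:]))
```

For example, a quality curve that drops from 100 to 50 and returns to 100 over two six-hour periods forms a triangle with base 12 hours and height 50, so the sketch returns 300 percent-hours.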

The selection of the thresholds *α* and *τ* affects both event detection and resilience calculation. For a given *α* value, the threshold *τ* determines whether a deviated state of human mobility will be considered a disruptive event or not. But once an event is detected, *τ* has no effect on the calculated resilience value. The selection of the *α* threshold directly affects the duration of an event, which in turn affects the resilience calculation. For instance, a larger \(\alpha_{u}\) and a smaller \(\alpha_{l}\) will reduce the duration of an event, and hence the calculated resilience loss will be smaller. Although the resilience calculation depends on the selection of the thresholds, the ranking of events with respect to resilience will not change provided that the same threshold values are chosen for all the events.

When interpreting the resilience and transient loss of resilience values, we should consider certain aspects of the resilience metric. At a fundamental level, the proposed resilience metric measures the impact of a disruption on the mobility of a population and its infrastructures. The greater the impact a disruption has on human mobility, the greater the transient loss of resilience will be. We treat increases and decreases of the mobility level similarly when calculating the transient loss of resilience, since both situations indicate impacts on the typical level of population mobility. However, the loss of resilience calculated from decreased mobility should not be compared with that calculated from increased mobility. For increased mobility, our resilience calculation is limited because the scenario is unbounded (see Fig. 1(c)). Furthermore, increased mobility does not necessarily indicate better performance of the mobility infrastructure system. It is more likely that in such a situation the infrastructure system has collapsed, forcing people to travel further.