- Regular article
- Open Access
Methods for quantifying effects of social unrest using credit card transaction data
- Xiaowen Dong1,
- Joachim Meyer2,
- Erez Shmueli2,
- Burçin Bozkaya3 and
- Alex Pentland1
- Received: 23 July 2017
- Accepted: 18 March 2018
- Published: 13 April 2018
Abstract
Societal unrest and similar events are important for societies, but it is often difficult to quantify their effects on individuals, hindering timely and effective policy-making in emergencies, in particular during localized social shocks such as protests. Traditionally, effects are assessed through economic indicators or surveys with relatively low temporal and spatial resolutions. In this work, we compute two behavioral indexes, based on the use of credit card transaction data, for measuring the economic effects of a series of protests on consumer actions and personal consumption. Using data from a metropolitan area in an OECD country, we show that protests affect consumers’ shopping frequency and spending, but in noticeably different ways. The effects show strong temporal and spatial patterns, vary between neighborhoods and customers of different socio-demographic characteristics as well as between merchants of different categories, and suggest interesting subtleties in purchase behavior such as displaced or delayed shopping activities. Our method can generally serve for the real-time monitoring of the effects of major social shocks or events on urban economy and consumer sentiment, providing high-resolution and cost-effective measurement tools to complement traditional economic indicators.
Keywords
- Social shocks
- Economic effect
- Consumer behavior
- Spatiotemporal pattern
- Credit card transaction
1 Introduction
The routine of daily life is punctuated by extraordinary public events, ranging from major sports and cultural events to terror attacks or disasters. Some events are local, and some are national or even wider; some affect only part of the population, while others affect the entire population. Although they differ from natural events and disasters, certain social events, such as a protest, a riot, a large police action, or a bomb explosion, can still have characteristics of natural emergencies, with fatalities and damage to property. However, when such a localized social shock occurs, we have mostly qualitative or at best retrospective survey data about its effect on the surrounding community. Consequently, it is impossible to know how well authorities managed the side-effects of the event, and thus difficult to develop effective event policies for use by police and municipal authorities.
In particular, understanding the effects of major social shocks or events on the economy and consumer behavior can have important implications [1–7]. Existing measures of economic behavior and attitudes towards economic situations use surveys to assess consumer confidence or consumer sentiment, asking about the intentions to purchase goods or to make large investments. These include objective measures published by national agencies, such as the measure of personal consumption expenditures in the monthly personal income and outlays report, provided by The Bureau of Economic Analysis (BEA) [8]. Other indicators are based on consumers’ subjective reports, such as the monthly consumer confidence index (CCI) issued by The Conference Board [9], and the monthly index of consumer sentiment (ICS) published by the University of Michigan [10]. These data and the actual economic behavior, as expressed in consumption, can be strongly correlated [11], and they can therefore be used to evaluate the effects of events. However, their temporal resolution is low (consumer confidence and sentiment surveys are usually conducted on a monthly basis). Their spatial resolution is low, too, since the measures aggregate information over large areas and usually entire economies. Finally, the information usually cannot be used to look at the differential effect of events on different segments of a society.
The limitations of traditional approaches may be overcome by using digital tools to study human and social behavior, analyzing large-scale quantitative data in emerging fields such as computational social science [12, 13]. The analysis of large amounts of data accumulated in social networks, cellular phone records, or credit card transactions may allow us to gain a new understanding of social processes. Such data have been widely utilized to study situation awareness and response to emergencies [14–18], mostly due to natural events and disasters, as well as the dynamics of communication and information propagation that immediately follow [19–22]. However, relatively little work has so far aimed to understand the impact of social events, and especially localized social shocks, using quantitative behavioral data.
Einav et al. recently highlighted the new possibilities of conducting economic research in the age of “Big Data” [23]. Within this scope, in this work, we show how a particular type of data, namely credit card transaction records, can be used to quantify and understand the repercussions localized social events have on individuals’ economic behavior, quantifying when, where and how an event has an effect. In particular, credit card data help us overcome the limitations of traditional indicators by allowing us to measure purchase behavior directly, and, together with information about the types of merchants at which purchases are made, to infer what kinds of products are purchased. Notably, purchases at physical shops (rather than those made online) are clearly linked to specific spatial locations and time stamps. Furthermore, credit card issuers have information about the demographics and economic status of the credit card holder, which enables us to measure the effects of events on different demographic groups in the population. This would allow the authorities to develop targeted and timely policies in case of emergencies.
2 Data and methods
To demonstrate the use of this method, we analyzed more than ten million credit card transaction records provided by a major financial institution in an OECD country, covering more than 100 thousand individuals and more than 100 thousand merchants over a period of three months. Each record consists of the date, time and amount (in local currency) for one credit card transaction, along with anonymized customer and store IDs. Additional information about the customers and stores is also available, such as customers’ gender and the neighborhoods in which they live, as well as the category of the stores and the neighborhoods in which they are located.
Figure 1. Visualization of the hourly number of transactions in the metropolitan area during the period the data set covers. The horizontal axis corresponds to the index of the days in the data set, and the vertical axis corresponds to the index of the specific hours of each day
1. consumer action: the number of unique customers who made transactions at stores inside the neighborhood;
2. personal consumption: the median spending amount in local currency for all transactions inside the neighborhood.
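The two indexes defined above can be illustrated with a minimal sketch. It assumes each transaction is available as a record with hypothetical fields `neighborhood`, `day`, `customer_id` and `amount` (these names, and the toy records, are illustrative assumptions rather than the actual data schema), and it computes both indexes per neighborhood and per day.

```python
from collections import defaultdict
from statistics import median


def behavioral_indexes(transactions):
    """Return {(neighborhood, day): (n_unique_customers, median_amount)}.

    `transactions` is an iterable of dicts with the hypothetical keys
    "neighborhood", "day", "customer_id" and "amount".
    """
    customers = defaultdict(set)   # unique customers per (neighborhood, day)
    amounts = defaultdict(list)    # transaction amounts per (neighborhood, day)
    for t in transactions:
        key = (t["neighborhood"], t["day"])
        customers[key].add(t["customer_id"])
        amounts[key].append(t["amount"])
    return {key: (len(customers[key]), median(amounts[key])) for key in customers}


# Toy records, invented purely for illustration.
toy = [
    {"neighborhood": "A", "day": 1, "customer_id": "c1", "amount": 10.0},
    {"neighborhood": "A", "day": 1, "customer_id": "c1", "amount": 30.0},
    {"neighborhood": "A", "day": 1, "customer_id": "c2", "amount": 20.0},
    {"neighborhood": "B", "day": 1, "customer_id": "c3", "amount": 5.0},
]
indexes = behavioral_indexes(toy)
print(indexes[("A", 1)])  # (2, 20.0): two unique customers, median amount 20.0
```

Counting unique customers rather than transactions keeps a single heavy shopper from inflating the consumer action index, and the median makes the consumption index less sensitive to a few very large purchases.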
Figure 2. Time series of the two behavioral indexes in a given neighborhood on Mondays and Saturdays: (a) number of customers; (b) median spending
3 Results
We use the score of relative deviation from the mean for the two behavioral indexes, computed in each neighborhood for each day in the data set, to measure the impact of a series of protests that took place in a central site of the metropolitan area during a period of one month. In particular, we focus on the temporal, spatial, heterogeneous, and integral economic effect of these protest events.
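As a rough illustration of the score, the sketch below computes the relative deviation of an observed index value from the mean of its reference values. The function name, the choice of reference values (here, the same weekday over a six-week reference period) and the numbers are assumptions made for this example; the exact definition used in the analysis may differ in detail.

```python
from statistics import mean


def relative_deviation(value, reference_values):
    """Relative deviation of `value` from the mean of `reference_values`.

    A score of -0.4 means the index was 40% below its reference mean;
    0 means no change.
    """
    ref_mean = mean(reference_values)
    if ref_mean == 0:
        raise ValueError("reference mean is zero; relative deviation undefined")
    return (value - ref_mean) / ref_mean


# Hypothetical number of customers on six reference Saturdays (invented values).
reference_saturdays = [120, 100, 110, 90, 105, 105]
# Hypothetical number of customers on the Saturday of a major event.
score = relative_deviation(63, reference_saturdays)
print(round(score, 3))  # -0.4
```

Because the score is normalized by the reference mean, neighborhoods with very different baseline activity can be compared on the same scale.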
3.1 Temporal variation of effects
Figure 3. The impact of a series of protest events on the two behavioral indexes, in neighborhoods within 2 km (blue), from 2 to 4 km (red), and above 4 km (orange) from the protest site: (a) number of customers; (b) median spending. The dash-dot purple line indicates zero change, and the dashed green and cyan lines indicate Day 62 and Day 77, respectively
To confirm our findings, we test the statistical significance of the results shown in Fig. 3(a), (b) in the Appendix (see Table A3(a)). On the days of the two major events, we see a significant drop in both behavioral indexes for all three distances in Fig. 3(a), (b). In addition, there are statistically significant differences between distances. On days following the major protests (those after Day 62 and Day 77), the number of customers decreased significantly in neighborhoods within 4 km from the protest site, but not beyond. On the other hand, there was no significant decrease in the median spending at any distance, suggesting that the people who went out shopping did not tend to spend less money than usual.
Figure 4. The 2-D histograms of the number of neighborhoods in terms of the two behavioral indexes for Day 62 in the data set: (a) within 2 km; (b) from 2 to 4 km; (c) above 4 km. The horizontal and vertical dashed lines indicate changes of a magnitude less than 0.1
3.2 Spatial decay of effects
Figure 5. The exponential fit to the median magnitude of decrease in the two behavioral indexes for neighborhoods that are located at different distances from the event center: (a) number of customers on Day 62; (b) number of customers on Day 77; (c) median spending on Day 62; (d) median spending on Day 77
These results indicate that, first, the amount of money people spent when making a purchase was less clearly affected by the protests, compared to the change in the number of customers, and the negative effect spread less far from the event center as well. Second, it seems that the event on Day 77 had a stronger impact on consumer actions within the close vicinity of the event center, which is indicated by a higher initial quantity of 1.25 in the exponential fit and reflected by the magnitude of the first two points in Fig. 5(a), (b). This effect, however, decayed spatially faster compared to Day 62, which is suggested by a smaller mean distance of 2.86 km and reflected by the generally smaller magnitude from the third point (3 km) onwards in Fig. 5(a), (b). On the other hand, on Day 62 personal consumption was more affected close to the event center, while the negative effect on this index spread further away on Day 77. Such differences might be due to people’s responses to events changing over time. Finally, we can also study the event effect as a function of the estimated travel time (by car) from the event center, thus taking into account geographical constraints and transportation accessibility. The results are similar (in terms of the exponentially decaying patterns) and are presented in Fig. A2 in the Appendix. We further show in Fig. A3 and Fig. A4 in the Appendix the event effect on the individual neighborhoods. Overall, these results demonstrate the possibility of utilizing transaction data to quantify the spatial decay of event effects.
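To make the two fitted quantities concrete, the following sketch fits a decay of the form magnitude = a · exp(−d / λ), where a plays the role of the initial quantity and λ that of the mean distance. The fitting procedure is not specified in the text, so ordinary least squares on the logarithm of the magnitudes is used here as a simple stand-in, and the data points are synthetic values generated for illustration.

```python
import math


def fit_exponential_decay(distances_km, magnitudes):
    """Fit magnitude = a * exp(-d / lam) by least squares on log(magnitude).

    Returns (a, lam): the initial quantity at distance 0 and the mean
    (characteristic) distance of the decay. All magnitudes must be positive.
    """
    n = len(distances_km)
    if n != len(magnitudes) or n < 2:
        raise ValueError("need at least two (distance, magnitude) pairs")
    if any(m <= 0 for m in magnitudes):
        raise ValueError("magnitudes must be positive for a log-linear fit")
    logs = [math.log(m) for m in magnitudes]
    mean_d = sum(distances_km) / n
    mean_y = sum(logs) / n
    sxx = sum((d - mean_d) ** 2 for d in distances_km)
    sxy = sum((d - mean_d) * (y - mean_y) for d, y in zip(distances_km, logs))
    slope = sxy / sxx
    if slope >= 0:
        raise ValueError("data do not decay with distance")
    intercept = mean_y - slope * mean_d
    return math.exp(intercept), -1.0 / slope


# Synthetic, noise-free points generated with a = 0.8 and lam = 3.0 km.
distances = [0.5, 1.5, 2.5, 3.5, 4.5]
magnitudes = [0.8 * math.exp(-d / 3.0) for d in distances]
a, lam = fit_exponential_decay(distances, magnitudes)
print(round(a, 3), round(lam, 3))  # 0.8 3.0
```

A larger a corresponds to a stronger effect next to the event center, while a smaller λ corresponds to an effect that fades over a shorter distance. A nonlinear least-squares fit on the untransformed magnitudes would weight the points differently and could give somewhat different estimates on noisy data.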
3.3 Heterogeneous effects
Social events may affect neighborhoods with different characteristics, and the purchase behavior of people from different demographic groups, in different ways. They may also have a differential effect on purchases from different types of merchants. We therefore study the heterogeneous effects of the social events in the following scenarios, focusing on neighborhoods within 4 km from the event center.
3.3.1 Socio-economic status
Figure 6. The impact of a series of protest events in neighborhoods of higher socio-economic status (blue) and lower socio-economic status (red): (a) number of customers; (b) median spending. The dash-dot purple line indicates zero change, and the dashed green and cyan lines indicate Day 62 and Day 77, respectively
3.3.2 Political preference
Figure 7. The impact of a series of protest events in neighborhoods of higher political conservatism (blue) and lower political conservatism (red): (a) number of customers; (b) median spending. The dash-dot purple line indicates zero change, and the dashed green and cyan lines indicate Day 62 and Day 77, respectively
We note that the neighborhoods we analyze here are centrally located, hence it is possible that both the socio-economic status and political preference of the neighborhoods mainly reflect the typical profile of the residents but not that of the visiting customers. The results presented here are therefore mainly behavioral changes observed in these neighborhoods. However, according to our data, 54% (46%) of transactions in the wealthier (less wealthy) neighborhoods being studied come from their residents or residents of other wealthy (less wealthy) neighborhoods in the city, while 61% (48%) of transactions in the conservative (liberal) neighborhoods come from their residents or residents of other conservative (liberal) neighborhoods. Therefore, we believe that shopping activities observed in these neighborhoods reflect to a certain extent the behavioral change of people of similar characteristics in terms of wealth and political preference. Tests for statistical significance of the differences between the two groups in Fig. 7 are presented in Table A3(c) in the Appendix.
3.3.3 Gender difference
Figure 8. The impact of a series of protest events averaged over all neighborhoods for male (blue) and female (red) customers: (a) number of customers; (b) median spending. The dash-dot purple line indicates zero change, and the dashed green and cyan lines indicate Day 62 and Day 77, respectively
3.3.4 Store category
Figure 9. The impact of a series of protest events averaged over all neighborhoods for grocery stores (blue), family clothing stores (red) and restaurants (orange): (a) number of customers; (b) median spending. The dash-dot purple line indicates zero change, and the dashed green and cyan lines indicate Day 62 and Day 77, respectively
3.4 Integral economic effects
Although societal unrest has led to major changes in people’s purchase behavior, particularly on the days of major events, it is interesting to investigate whether there exists an integral economic effect due to displaced or delayed shopping activities.
3.4.1 Displaced shopping
The possibility of displaced shopping may be suggested by Fig. 4(c) where it can be seen that certain neighborhoods further away from the event center, in particular those corresponding to the four squares on the top right, actually saw increased activities on Day 62. In this case, each of the squares represents a single neighborhood (with color indicating the number of stores in the neighborhood), which we analyze in detail as follows.
Increased shopping activities in these four neighborhoods were mainly due to (i) new merchants, and (ii) new customers that did not appear in the reference period. First, the neighborhood corresponding to the red square (100% increase in customers and 60% in spending) saw many payments at a fitness club and, since Day 62 happened to be the first day of the month, these may correspond to people paying subscription fees at the club. As the reference period does not contain first days of the month, this fitness club did not appear as a merchant in the reference period, and therefore the corresponding purchase behavior was not observed before.
Second, the two neighborhoods corresponding to the two light blue squares (220%/120% increase in customers and 20%/100% in spending) saw increased activities due to their close proximity to a new theme park and shopping complex, where those purchases were made on a Saturday (Day 62) at stores that were newly opened in the shopping complex.
Finally, instead of being driven by new merchants, the increased activities in the neighborhood corresponding to the last square (60% increase in customers and 100% in spending) were mainly driven by new customers. Indeed, 13 out of the 19 customers who made purchases in that neighborhood on Day 62 never visited the neighborhood in the reference period, and at the same time they did not visit any neighborhood they used to go to on the same day of the week. This suggests a pattern of displaced shopping where these customers shifted their shopping locations, which might be due to the influence of the protest events.
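The check described above can be sketched as a simple set computation. This is one possible reading, under stated assumptions: visits are given as hypothetical `(customer_id, neighborhood)` pairs, the reference visits have already been restricted to the same day of the week, and a customer counts as displaced if none of the neighborhoods visited on the event day appears among that customer's reference neighborhoods. The toy visits are invented for illustration.

```python
from collections import defaultdict


def displaced_customers(event_day_visits, reference_visits, neighborhood):
    """Return (displaced, shoppers) for `neighborhood` on the event day.

    Both visit arguments are iterables of (customer_id, neighborhood) pairs.
    `shoppers` are all customers seen in `neighborhood` on the event day;
    `displaced` are those shoppers whose event-day neighborhoods do not
    overlap at all with the neighborhoods they visited in the reference period.
    """
    usual = defaultdict(set)   # customer -> neighborhoods in reference period
    for customer, place in reference_visits:
        usual[customer].add(place)
    today = defaultdict(set)   # customer -> neighborhoods on the event day
    for customer, place in event_day_visits:
        today[customer].add(place)
    shoppers = {c for c, places in today.items() if neighborhood in places}
    # Disjointness implies the customer is also new to `neighborhood` itself.
    displaced = {c for c in shoppers if today[c].isdisjoint(usual[c])}
    return displaced, shoppers


# Toy visits, invented purely for illustration.
reference = [("c1", "N1"), ("c2", "N2"), ("c3", "N9")]
event_day = [("c1", "N9"), ("c2", "N9"), ("c2", "N2"), ("c3", "N9")]
displaced, shoppers = displaced_customers(event_day, reference, "N9")
print(sorted(displaced), len(shoppers))  # ['c1'] 3
```

In this toy example only c1 is flagged: c2 still visited a usual neighborhood on the event day, and c3 already shopped in N9 during the reference period.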
In summary, displaced shopping may indeed have taken place, but because these neighborhoods are further away from the event center, other factors could influence shopping behavior as well, such as those that came with a special day like Day 62 (being the first day of the month as well as a Saturday that came shortly after the opening of a shopping complex).
3.4.2 Delayed shopping
We see from Fig. 3 that, in terms of both number of customers and median spending, shopping activities generally increased on mid-week days after the weekends when a majority of the protests took place. For the number of customers, this increase is most obvious in Fig. 9(a) for purchases at grocery stores. Compared to clothing stores and restaurants, grocery stores provide products needed for daily life, and it is therefore possible that people waited during the weekends and got the necessary shopping done shortly after the major protests. On the other hand, the increase in median spending on mid-week days was mainly present in neighborhoods within 2 km from the event center, and is most clearly synchronized with the increased spending in neighborhoods of lower socio-economic status shown in Fig. 6(b). Given that median spending dropped significantly in these neighborhoods on weekends of major protests, this suggests that people may have decided to save in uncertain situations and regained some confidence for make-up purchases after the protest events.
Despite the discernible patterns of displaced or delayed shopping activities, the integral economic effect of the protest events remains largely negative, as is evident in Fig. A6, where the change in total sales is illustrated for neighborhoods at different distances from the event center.
4 Discussion
Our study demonstrates the potential of using pervasive and passively collected behavioral data to quantify the impact of localized social shocks and events on an urban population. We found that major events, such as societal unrest due to protests in our case, can alter people’s purchase behavior, both in terms of consumers’ tendency to visit shops in certain areas and in terms of their willingness to spend money. From a temporal perspective, personal consumption levels recover relatively quickly from the negative influence of the events. Spatially, on days of major events, consumption levels also seem more resilient, although both measures show clear spatial exponential decay. Event effects also differ between groups in the society, defined in terms of demographics, socio-economic status, and political conservatism, and between categories of merchants. Finally, the existence of discernible patterns such as displaced or delayed shopping activities suggests that the effect of social events was not always distributed in the way one would imagine (e.g., it is not purely a function of geographical distance), and may help explain the rather unintuitive resilience of (or lack of change in) personal consumption as a result of inelastic spending driven by needs.
Our analysis has certain limitations. The data set of credit card transaction records used in this study is based on a sample (about 10%) of all the individual customers of one financial institution and does not include people without credit cards. Therefore, sampling bias could exist and could potentially influence the results. Credit card transaction data may also represent only part of people’s daily spending, as people may choose to pay with cash in certain scenarios. Furthermore, the data set only covers a period of three months, and the observations might be affected by seasonality. Finally, there might exist unobserved external events that bias our results, even though the strong spatiotemporal patterns observed on Days 62 and 77 are probably mainly caused by the major events on those days.
The real-time monitoring of economic behavior can have important social, economic and political implications. As a research tool, it makes it possible to assign quantitative values to the inherently elusive effects of social events. One advantage of the method we present here is the ability to analyze events with relatively high temporal and spatial resolutions. These allow a fine-grained understanding of events, beyond what is often possible with traditional measures. Also, the measures are less affected by demand characteristics or other biases, inherent in methods such as surveys. Such understanding and measures can potentially be used to develop timely and effective policies targeted at specific communities and demographic groups, and can often be critical in emergency situations.
With our method we can also evaluate statements made, for instance, by politicians or the media, such as that people are greatly upset by certain events. We can compare them to the observed changes in behavior, as expressed in the number of customers making purchases and the average amount spent in a purchase. If people go about their activities as usual, the events probably have less impact, compared to when measures of behavior differ strongly from the usual values. At a practical level, these analyses may allow authorities to prepare for timely interventions to minimize possible negative consequences, possibly through a prediction mechanism that could estimate the recovery time of individual economic activities at an early stage of the event.
The framework we present here complements, rather than replaces, other methods such as traditional economic indicators or surveys. Combinations of it with other tools help create a comprehensive picture of the dynamic response of a population to social events.
The two latter steps aim to remove inactive customers and stores to prevent them from biasing subsequent analyses. The specific thresholds of ten customers and ten stores used here have little effect on the amount of filtered data: with alternative thresholds of five or three customers/stores, 2.0 or 2.1 million transaction records, respectively, are left after the filtering process.
An alternative would be to compute a z-score that also captures the variability of the statistic. However, computing the sample standard deviation over a relatively small number of points may lead to large variation in the z-score and hence bias the subsequent analysis. We therefore choose the relative deviation from the mean, which is a more stable measure and is also adopted in [20, 22].
One exception is that we remove three holidays within these six weeks, when shopping activities were clearly boosted and would introduce obvious bias in the analysis.
Note that the behavioral indexes are computed as relative changes with respect to a reference period; therefore, the possibility that activities decrease further away from the city center because fewer shops are available has already been taken into account.
In our data set, we have 34% female customers who account for 38% of the transactions, which is moderately imbalanced. However, since the behavioral indexes are computed at the neighborhood level as relative changes with respect to a reference period, and gender split is assumed to stay reasonably stable across the reference and analysis periods, such imbalance would not cause an issue in the analysis and statistical tests presented in the paper.
Declarations
Acknowledgements
X. Dong was supported by a Swiss National Science Foundation Mobility fellowship while completing this work. The authors are grateful to the financial institution that provided the credit card transaction data for this research.
Availability of data and materials
Data sufficient to reproduce all results in this paper will be made available upon request.
Authors’ contributions
XD and JM conceived and designed the study. XD, JM, ES and BB conducted the analyses. XD, JM and ES wrote the manuscript. All authors reviewed the manuscript. All authors read and approved the final manuscript.
Competing interests
The authors declare that they have no competing interests.
Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.
References
- Long PT, Perdue RR (1990) The economic impact of rural festivals and special events: assessing the spatial distribution of expenditures. J Travel Res 28(4):10–14
- Burgan B, Mules T (1992) Economic impact of sporting events. Ann Tour Res 19(4):700–710
- Crompton JL, Mckay SL (1994) Measuring the economic impact of festivals and events: some myths, misapplications and ethical dilemmas. Festiv Manage Event Tour 2:33–43
- Mondello MJ, Rishe P (2004) Comparative economic impact analyses: differences across cities, events, and demographics. Econ Dev Q 18(4):331–342
- Getz D (2005) Event management and event tourism, 2nd edn. Cognizant Communication Corporation, New York
- Herrero LC, Sanz JA, Devesa M, Bedate A, del Barrio MJ (2006) The economic impact of cultural events: a case-study of Salamanca 2002, European capital of culture. Eur Urban Reg Stud 13(1):41–57
- Jong-A-Pin R (2009) On the measurement of political instability and its impact on economic growth. Eur J Polit Econ 25(1):15–29
- Personal income and outlays, U.S. Department of Commerce, Bureau of Economic Analysis (BEA). http://www.bea.gov/newsreleases/national/pi/pinewsrelease.htm
- Consumer confidence index, The Conference Board. https://www.conference-board.org/data/consumerconfidence.cfm
- Surveys of consumers, University of Michigan. http://www.sca.isr.umich.edu
- Carroll CD, Fuhrer JC, Wilcox DW (1994) Does consumer sentiment forecast household spending? If so, why? Am Econ Rev 84:1397–1408
- Lazer D, Pentland A, Adamic L, Aral S, Barabási A-L, Brewer D, Christakis N, Contractor N, Fowler J, Gutmann M, Jebara T, King G, Macy M, Roy D, Alstyne MV (2009) Computational social science. Science 323(5915):721–723
- Pentland A (2014) Social physics: how good ideas spread. Penguin, Baltimore
- Sakaki T, Okazaki M, Matsuo Y (2010) Earthquake shakes Twitter users: real-time event detection by social sensors. In: Proceedings of the international conference on World Wide Web, pp 851–860
- Yates D, Paquette S (2011) Emergency knowledge management and social media technologies: a case study of the 2010 Haitian earthquake. Int J Inf Manag 31(1):6–13
- Yin J, Lampert A, Cameron M, Robinson B, Power R (2012) Using social media to enhance emergency situation awareness. IEEE Intell Syst 27(6):52–59
- Imran M, Castillo C, Diaz F, Vieweg S (2015) Processing social media messages in mass emergency: a survey. ACM Comput Surv 47(4):67:1–67:38
- Martínez EA, Rubio MH, Martinez RM, Arias JM, Patane D, Zerbe A, Kirkpatrick R, Luengo-Oroz M (2016) Measuring economic resilience to natural disasters with big economic transaction data. In: Proceedings of the data for good exchange
- Kapoor A, Eagle N, Horvitz E (2010) People, quakes, and communications: inferences from call dynamics about a seismic event and its influences on a population. In: Proceedings of AAAI symposium on artificial intelligence for development, pp 51–56
- Bagrow JP, Wang D, Barabási A-L (2011) Collective response of human populations to large-scale emergencies. PLoS ONE 6(3):e17680
- Altshuler Y, Fire M, Shmueli E, Elovici Y, Bruckstein A, Pentland A, Lazer D (2013) The social amplifier—reaction of human communities to emergencies. J Stat Phys 152(3):399–418
- Gao L, Song C, Gao Z, Barabási A-L, Bagrow JP, Wang D (2014) Quantifying information flow during emergencies. Sci Rep 4:3997
- Einav L, Levin J (2014) Economics in the age of big data. Science 346(6210):715
- Dong X, Jahani E, Morales AJ, Bozkaya B, Lepri B, Pentland A (2016) Purchase patterns, socioeconomic status, and political inclination. In: International conference on computational social science
- Hochberg Y, Tamhane AC (1987) Multiple comparison procedures. Wiley, New York