  • Regular article
  • Open access

Investigating causality in human behavior from smartphone sensor data: a quasi-experimental approach


Smartphones and wearables have become an indispensable part of our daily lives. Their improved sensing and computing capabilities bring new opportunities for monitoring and analyzing human behavior. Most work so far has focused on detecting correlation rather than causation among features extracted from smartphone data. However, pure correlation analysis does not offer a sufficient understanding of human behavior. Causation analysis, in contrast, could allow scientists to identify factors that have a causal effect on health and well-being issues such as obesity, stress and depression, and to suggest actions to deal with them. Detecting causal relationships in this kind of observational data is nevertheless challenging since, in general, subjects cannot be randomly exposed to an event.

In this article, we discuss the design, implementation and evaluation of a generic quasi-experimental framework for conducting causation studies on human behavior from smartphone data. We demonstrate the effectiveness of our approach by investigating the causal impact of several factors such as exercise, social interactions and work on stress level. Our results indicate that exercising and spending time outside the home and working environment have a positive effect on participants’ stress level (i.e., they lower reported stress), while reduced working hours only slightly impact stress.

1 Introduction

Nowadays, people generate vast amounts of data through the devices they interact with during their daily activities, leaving a rich variety of digital traces. Indeed, our mobile phones have been transformed into powerful devices with increased computational and sensing power, capable of capturing any communication activity, including both mediated and face-to-face interactions. User location can be easily monitored and activities (e.g., running, walking, standing, traveling on public transit, etc.) can be inferred from raw accelerometer data captured by our smartphones [1, 2]. Even more complex information such as our emotional state or our stress level can be inferred either by processing voice signals captured by the smartphone’s microphone [3, 4] or by combining information extracted from several sensors that correlates with our mood [5–9]. Moreover, we keep track of our daily schedule by using digital calendars and we use social media to share our experiences, opinions and emotions with our friends. Wearable devices that are able to monitor physical indicators with a very high level of accuracy are also increasingly popular.

Leveraging this rich variety of human-generated information could provide new insights into a variety of open research questions in several scientific domains such as sociology, psychology, behavioral finance and medicine. For example, several works have demonstrated that online social media could act as crowd-sensing platforms; the aggregated opinions posted in online social media have been used to predict movie revenues [10], election results [11] or even stock market prices [12]. Social influence effects in social networks have also been investigated in several projects, either using observational data [13, 14] or by conducting randomized trials [15, 16]. Other works use mobility traces in order to study social patterns [17] or to model the spreading of contagious diseases [18]. Moreover, smartphones are increasingly used to monitor and better understand the causes of health problems such as addictions, obesity, stress and depression [9, 19, 20]. Smartphones enable continuous and unobtrusive monitoring of human behavior and, therefore, could allow scientists to conduct large-scale studies using real-life data rather than lab-constrained experiments. In this direction, in [21] the authors attempt to explain sleeping disorders reported by individuals by investigating the correlations between sociability, mood and sleeping quality, based on data captured by mobile phone sensors and surveys. Also, in [22] the authors study the links between unhealthy habits, such as poor-quality eating and lack of exercise, and the eating and exercise habits of the user’s social network. However, both studies are based on correlation analysis and, consequently, they are not sufficient for deriving valid conclusions about the causal links between the examined variables. For example, an observed correlation between the eating and exercising habits of a social group does not necessarily imply that the eating and exercise habits of individuals are influenced by their social group and, therefore, could be modified by changing someone’s social group. Instead, the observed correlation could be due to the fact that people tend to have social relationships with people with similar habits.

The efficient exploitation of human-generated data in order to uncover causal links among factors of interest remains an open research issue. Some works have proposed the use of randomized trials [15, 16]. According to this technique, the causal effects of an event or treatment are examined by exposing a randomly selected subset of participants (treatment group) to this event and comparing the result with the corresponding outcome in a control group (i.e., a subset of participants who have not been exposed to the event). By randomly assigning participants to treatment and control groups it is ensured that, on average, there will be no systematic difference in the baseline characteristics of the participants between the two groups. Baseline characteristics are any characteristics of the subjects that could be related to the study (e.g., in a clinical study the age and the previous health status of the subjects could be considered as baseline characteristics). While randomized trials represent a reliable way to detect causal relationships, they require the direct intervention of scientists in participants’ lives, which is sometimes unethical or just not feasible. Moreover, such experimental studies cannot exploit the vast amount of observational data that are produced daily.

Detecting causal relationships in observational data is challenging since subjects cannot be randomly exposed to an event. Thus, subjects that are exposed to a treatment may systematically differ from subjects that are not. In order to eliminate any bias due to differences in the baseline characteristics of exposed and unexposed subjects, scientists need to gather and process information about several factors that could influence the result of the study. There are two main methodologies that can be applied to control such factors: structural equation modeling [23, 24] and quasi-experimental designs [25]. According to the former, the causal effect is estimated using multivariate regression. In detail, the variable representing the causal effect of an event or treatment is regressed using as predictors the variable representing the treatment as well as all the baseline characteristics of the subjects of the study that could influence the result. Structural equation modeling is based on the assumption that the regression model has been correctly specified. False assumptions about the linearity or non-linearity of the model or failure to correctly specify the regression coefficients may result in misleading conclusions. On the other hand, methods based on quasi-experimental designs do not require the specification of a model. Instead, they attempt to emulate randomized trials by exploiting inherent characteristics of the observational data. This can be achieved by comparing groups of treated and control subjects with similar baseline characteristics (matching design).

Causality studies on human behavior have so far relied mainly on paper records. For example, the causal impact of social influence on human behavior has been extensively studied [26–30]. In contrast, in this work we highlight the benefits of leveraging digital devices in order to continuously and unobtrusively collect data that facilitate studies on human behavior. The purpose of this work is to propose a generic causal inference framework for the analysis of human behavior using digital traces. More specifically, we demonstrate the potential of automatically processing human-generated observational digital data in order to conduct causal inference studies based on quasi-experimental techniques. We support our claim by presenting an analysis of the causal effects of daily activities, such as exercising, socializing or working, on stress, based on data gathered by smartphones from 48 students who were involved in the StudentLife project [31] at Dartmouth College for a period of 10 weeks. The main goal of the StudentLife project is the study of the mental health, academic performance and behavioral trends of this group of students using mobile phone sensor data. It is also worth noting that, although previous studies have provided evidence of peer influence on individuals’ mood [27, 29], the available information about the social network of participants is not sufficient to examine the impact of such factors on stress level. To the best of our knowledge, this is the first work presenting an observational causality study using digital data gathered by smartphones.

Information about participants’ daily social interactions as well as their exercise and work/study schedule is not directly measured; instead, we use raw GPS and accelerometer traces in order to infer high-level information which serves as an implicit indicator of the variables of interest.

No active participation of the users is required, e.g., answering pop-up questionnaires. We automatically assign semantics to locations in order to group them into four categories: home, work/university, socialization venues and gym/sports center. By grouping locations into these four categories and continuously monitoring the spatio-temporal traces of users, we can derive high-level information as follows:

  • Work/University. By analyzing the daily time that users spend at their workplace we can infer their working schedule. Prolonged sojourn time at work/university could be considered as an indicator of increased workload.

  • Home. The time that participants spend at home could serve as a rough indicator of their social interactions. Prolonged sojourn time at home could imply limited social interactions or social interactions with a restricted number of people. In general, spending time outside home usually involves some social interaction. An estimation of the total daily time that participants spend at any place apart from their home and working environment could serve as a rough indicator of their non-work-related social interactions.

  • Socialization Venues. By monitoring users’ visits to socialization venues such as pubs, bars, restaurants, etc., we can infer the time that they spend relaxing and socializing outside home during a day.

  • Gym/Sports-center. Indoor workout can be captured by tracking participants’ visits to gyms or sports centers. Outdoor activity can be measured using accelerometer data.
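The derivation of sojourn times from labeled location traces can be illustrated with a minimal sketch. The input format (a list of `(label, start, end)` visits in seconds since midnight) and the function name `sojourn_until` are assumptions made for illustration; they are not part of the StudentLife dataset or of the pipeline described here.

```python
from collections import defaultdict


def sojourn_until(visits, t):
    """Total seconds spent per location label from midnight until time t.

    `visits` is a hypothetical list of (label, start_second, end_second)
    tuples for one user and one day.
    """
    totals = defaultdict(int)
    for label, start, end in visits:
        # Clip each visit to the window [0, t] before accumulating.
        overlap = min(end, t) - max(start, 0)
        if overlap > 0:
            totals[label] += overlap
    return dict(totals)


# Illustrative day: at home until 8:00, at university 9:00-13:00,
# at a socialization venue 19:00-21:00.
visits = [
    ("home", 0, 8 * 3600),
    ("work/university", 9 * 3600, 13 * 3600),
    ("socialization venue", 19 * 3600, 21 * 3600),
]
by_noon = sojourn_until(visits, 12 * 3600)
```

Under these assumptions, the time spent outside home and university until a given time is simply the sum over all labels other than `home` and `work/university`.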

2 Causal inference framework

Our causality analysis is based on Rubin’s counterfactual framework [32]. According to this framework, a causal problem is formulated as a counterfactual statement which examines what the outcome would have been if an object had been exposed to an event. Since it is impossible to observe for the same object both the result of exposure and non-exposure to an event, causal inference is based on comparing the outcomes of equivalent treatment and control groups, i.e., treatment and control units with similar baseline characteristics. In this section, we discuss a methodology for causal inference in observational data.

The first step of the analysis is the description of the variables of the study. A causality study involves the following variables:

  1. cause or treatment variable X: an independent variable that influences the values of another variable. The treatment variable is usually binary, denoting whether an object of the study has been exposed to a treatment or not. Treatment could also be a discrete variable when different levels of treatment are considered;

  2. effect or outcome variable Y: a dependent variable which can be manipulated by changing the variable that represents the cause;

  3. a set of N variables \(\mathbf{Z} = \{Z^{1}, Z^{2}, \ldots, Z^{N}\}\), which describes the baseline characteristics of the objects of the study.

In the second step of the analysis we define the units of the study. Each unit corresponds to a set of attributes, derived from the variables of the study, which describe an object (e.g., a person or a thing) in a specific time period. We can use multiple units describing a single object in different time intervals. Thus, a unit \(u_{o,t}\) that describes an object o at time t corresponds to a set of values \(\{X_{u_{o, t}}, Y_{u_{o, t}}, Z^{1}_{u_{o, t}}, Z^{2}_{u_{o, t}}, \ldots, Z^{N}_{u_{o, t}}\}\). Given that, in a causation study, the treatment should temporally precede the effect, the value \(X_{u_{o, t}}\) should correspond to the treatment that has been applied to object o before time t. In the remainder of the paper, the simplified notation u will be used to describe a unit \(u_{o, t}\).

In order to claim that a value of a variable Y has been caused by a value of a variable X, there should be an association between the occurrence of these two values and there should be no other plausible explanation of this association [25]. The first part of this requirement can be examined by performing a simple statistical analysis. However, excluding any other explanation of the observed association is a hard problem since both the treatment and the effect variable may be driven by a third variable. Variables that correlate with both the outcome and the treatment are called confounding variables or confounders. In Figure 1 we provide a graphical representation of the dependencies between the treatment, outcome and confounding variables. The identification of the confounders requires a correlation analysis between each variable \(Z^{i} \in\mathbf{Z}\) and the variables X and Y.

Figure 1 Confounder diagram. Graphical representation of the relationships among the treatment X, outcome Y and the set of confounding variables \(\mathcal{C}\).

An unbiased causality study requires that the assignment of units to treatments is independent of the outcome conditional on the confounding variables. While in experimental studies this requirement is satisfied by randomly assigning units to treatments, in observational studies we could eliminate confounding bias by comparing units with similar values on their confounding variables but different treatment values (matching design). Let us consider a binary treatment X, a group of treated units U and a group of control units V such that \(X_{u} = 1\), \(\forall u \in U\) and \(X_{v} = 0\), \(\forall v \in V\). Let us also consider a set of confounding variables \(\mathcal{C}\). Ideally, each unit \(u \in U\) will be matched with a unit \(v \in V\) if \(\mathcal{C}^{i}_{u} = \mathcal{C}^{i}_{v}\), \(\forall\mathcal{C}^{i} \in\mathcal{C}\). However, perfect matching is usually not feasible. Thus, treated units need to be matched with the most similar control units. Several methods have been proposed to create balanced treated and control pairs [33]. After applying a matching method, it is necessary to check whether the treated and control groups are sufficiently balanced by estimating the standardized mean difference between the groups or by applying graphical methods such as quantile-quantile plots, cumulative distribution function plots, etc. [34]. If sufficient balance has not been achieved, the matching method has to be revised.
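As an illustration of matching treated units with their most similar control units, the following minimal sketch performs nearest-neighbor matching with replacement on standardized confounder values. This is only one of the possible matching methods and is not the one used later in this study; the unit representation (dicts with an `id` and a tuple of confounder values under `c`) and the function name are assumptions made for the example.

```python
import math


def nearest_neighbor_match(treated, control):
    """Pair each treated unit with its closest control unit (with replacement).

    Units are dicts whose "c" key holds a tuple of confounder values.
    """
    n = len(treated[0]["c"])
    all_units = treated + control
    # Scale each confounder by its standard deviation so that no single
    # variable dominates the distance.
    scales = []
    for i in range(n):
        vals = [u["c"][i] for u in all_units]
        mean = sum(vals) / len(vals)
        sd = math.sqrt(sum((v - mean) ** 2 for v in vals) / len(vals))
        scales.append(sd if sd > 0 else 1.0)

    def dist(a, b):
        return math.sqrt(
            sum(((x - y) / s) ** 2 for x, y, s in zip(a["c"], b["c"], scales))
        )

    return [(u, min(control, key=lambda v: dist(u, v))) for u in treated]


# Illustrative units with two confounders each (made-up values).
treated = [{"id": "t1", "c": (1.0, 10.0)}, {"id": "t2", "c": (5.0, 2.0)}]
control = [
    {"id": "c1", "c": (1.2, 9.0)},
    {"id": "c2", "c": (4.0, 3.0)},
    {"id": "c3", "c": (9.0, 9.0)},
]
pairs = nearest_neighbor_match(treated, control)
```

Whatever matching method is chosen, the resulting pairs still have to pass the balance check described above before any effect is estimated.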

Finally, if any confounding bias has been sufficiently eliminated, the treatment effect can be estimated by comparing the effect variable Y of the matched treated and control units. Let us define as G the set of paired treated and control units and \(N_{G}\) the number of pairs. Then, the average treatment effect (ATE) can be estimated as follows:

$$ \mathit{ATE} = \frac{\sum_{(u, v) \in G} (Y_{u} - Y_{v})}{N_{G}}. $$
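The estimator above is the mean outcome difference over matched pairs. A minimal sketch, assuming the pairs are given as `(Y_u, Y_v)` tuples (the function name and the sample values are illustrative only):

```python
def average_treatment_effect(pairs):
    """Mean of Y_u - Y_v over matched (treated, control) outcome pairs."""
    if not pairs:
        raise ValueError("no matched pairs")
    return sum(y_u - y_v for y_u, y_v in pairs) / len(pairs)


# Three made-up matched pairs of outcomes.
pairs = [(2.0, 3.0), (1.0, 1.0), (2.0, 4.0)]
ate = average_treatment_effect(pairs)  # (-1 + 0 - 2) / 3
```

A negative value here means that, on average, treated units have a lower outcome than their matched controls.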

In Figure 2 we provide a graphical representation of the causal inference methodology.

Figure 2 Description of the causal inference process in observational data using a quasi-experimental matching design.

3 Dataset description

The StudentLife dataset contains a rich variety of information that was captured either through smartphone sensors or through pop-up questionnaires. In this study we use only GPS location traces, accelerometer data, a calendar with the deadlines for the modules that students attend during the term and students’ responses to questionnaires about their stress level. Students answer these questionnaires one or more times per day.

We use the location traces of the users to create location clusters. Location traces are provided either through GPS or through WiFi or cellular networks. To each location cluster, we assign one of the following labels: home, work/university, gym/sports-center, socialization venue and other. Labels are assigned automatically without the need for user intervention (a detailed description of the clustering and location labeling process is presented in Additional file 1).

We use information extracted from both accelerometer data and location traces to infer whether participants had any exercise (either at the gym or outdoors). The StudentLife dataset does not contain raw accelerometer data. Instead, it provides an activity classification obtained by continuously sampling and processing accelerometer data. The activities are classified as stationary, walking, running and unknown.

We also use the calendar with students’ deadlines, which is provided by the StudentLife dataset, as an additional indicator of students’ workload. We define as \(\mathcal{D}_{\mathrm{deadline}}^{u}\) the set of all days on which the student u has a deadline. We define a variable \(D^{u, d}\) that represents how many deadlines are close to the day d for a user u as follows:

$$ D^{u, d} = \begin{cases} \sum_{j \in\mathcal{D}_{\mathrm{deadline}}^{u} :\, j-T_{\mathrm{days}}< d< j} \frac{1}{j-d}, & \mbox{if such a deadline } j \mbox{ exists},\\ 0, & \mbox{otherwise}. \end{cases} $$

Thus, \(D^{u, d}\) will be equal to zero if there are no deadlines within the next \(T_{\mathrm{days}}\) days, where \(T_{\mathrm{days}}\) is a constant threshold; otherwise, \(D^{u, d}\) will be inversely proportional to the number of days remaining until the deadline. In our experiments we set the \(T_{\mathrm{days}}\) threshold equal to 3. We found that with this value the correlation between the stress level of the participants and the variable \(D^{u, d}\) is maximized.
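The deadline variable can be computed directly from its definition. The sketch below assumes that days are represented as integer indices (e.g., day of the term); the function name `deadline_pressure` is hypothetical.

```python
def deadline_pressure(day, deadline_days, t_days=3):
    """D^{u,d}: sum of 1/(j - d) over deadlines j with j - t_days < d < j."""
    return sum(1.0 / (j - day) for j in deadline_days if j - t_days < day < j)


# Made-up deadline days for one student.
deadlines = {10, 11, 20}
d9 = deadline_pressure(9, deadlines)    # deadlines in 1 and 2 days: 1/1 + 1/2
d10 = deadline_pressure(10, deadlines)  # only the deadline on day 11 counts
d5 = deadline_pressure(5, deadlines)    # no deadline within the window
```

Note that, with strict inequalities and a threshold of 3, only deadlines that are one or two days ahead contribute; a deadline on day d itself does not.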

Finally, the StudentLife dataset includes responses of the participants to the Big Five Personality test [35]. The Big Five Personality Traits describe human personality using five dimensions: openness, conscientiousness, extroversion, agreeableness, and neuroticism. The personality traits of participants can be used to describe some baseline characteristics of the units and, for this reason, we include them in the study.

4 Causality analysis

We apply the causal inference framework described in Section 2 in order to assess the causal impact of factors like exercising, socializing, working or spending time at home on stress level.

4.1 Variables

Initially, we define the variables that will be included in the study as follows:

  1. \(H^{u, d}_{t}\): denotes the total time in seconds that the user u spent at home during day d until time t;

  2. \(U^{u, d}_{t}\): denotes the total time in seconds that the user u spent at university during day d until time t;

  3. \(O^{u, d}_{t}\): denotes the total time in seconds that the user u spent in any place apart from his/her home or university during day d until time t;

  4. \(E^{u, d}_{t}\): denotes the total time in seconds that the user u spent exercising during day d before time t (it is estimated using both location traces and accelerometer data);

  5. \(\mathit{SC}^{u, d}_{t}\): denotes the total time in seconds that the user u spent at any socialization or entertainment venue during day d before time t;

  6. \(S^{u, d}_{t}\): denotes the stress level of user u that was reported on day d and time t. Stress level is reported one or more times per day. Thus, in contrast with the above mentioned variables, \(S^{u, d}_{t}\) is not continuously measured;

  7. \(\mathit{PS}^{u, d}\): denotes the last stress level that was reported by user u during the day \(d-1\). This variable remains constant within a day;

  8. \(D^{u, d}\): represents the upcoming deadlines as described in Eq. (2);

  9. \(E^{u}\), \(N^{u}\), \(A^{u}\), \(C^{u}\), \(O^{u}\): these five variables denote, respectively, the extroversion, neuroticism, agreeableness, conscientiousness and openness of user u based on his/her Big Five Personality Traits score.

4.2 Units

In this study, we examine the effects of five treatments, denoted by the variables \(H^{u, d}_{t_{i}}\), \(U^{u, d}_{t_{i}}\), \(O^{u, d}_{t_{i}}\), \(E^{u, d}_{t_{i}}\) and \(\mathit{SC}^{u, d}_{t_{i}}\), on the stress level of participants, which is described by the variable \(S^{u, d}_{t}\). A unit of the study corresponds to a set of attributes derived from the variables of the experiment. All the variables are sampled every 4 hours; thus, there are at most six samples per day for each participant. Let \(T = \{4\mbox{ am}, 8\mbox{ am}, 12\mbox{ pm}, 4\mbox{ pm}, 8\mbox{ pm}, 12\mbox{ am}\}\) be the set of sampling times and \(t_{i}\) the ith element of T. Then, a unit corresponds to the set of variables \(P^{u, d}_{t_{i}} = (H^{u, d}_{t_{i}}, U^{u, d}_{t_{i}}, O^{u, d}_{t_{i}}, E^{u, d}_{t_{i}}, \mathit{SC}^{u, d}_{t_{i}}, S^{u, d}_{t_{i}}, \mathit{PS}^{u, d}, D^{u, d})\). Since the variable \(S^{u, d}_{t}\) is not continuously measured, it is not feasible to sample it at time \(t_{i}\). Instead, we define \(S^{u, d}_{t_{i}}\) as the average stress level of user u on day d between time \(t_{i}\) and \(t_{i+1}\). Thus, \(S^{u, d}_{t_{i}}\) is estimated as follows:

$$ S^{u, d}_{t_{i}} = E\bigl\{ S^{u, d}_{t}\bigr\} ,\quad \mbox{for } t_{i} \leq t \leq t_{i+1}. $$

If there are no stress level reports during this time interval, then the unit that corresponds to the set of variables \(P^{u, d}_{t_{i}}\) will be discarded.
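The construction of the stress value of each unit can be sketched as follows. The code assumes that stress reports are given as `(hour, level)` pairs with hours counted from midnight of day d, and that the window following each sampling time is always four hours long (the text does not specify how the window after the last sampling time of the day is handled); all names are hypothetical.

```python
from statistics import mean

SAMPLING_HOURS = (4, 8, 12, 16, 20, 24)
WINDOW_HOURS = 4  # assumption: t_{i+1} - t_i is always four hours


def stress_per_sampling_time(reports):
    """Map each sampling time t_i to the mean stress reported in [t_i, t_i + 4h].

    Sampling times with no report in their window are omitted, mirroring
    the discarding of units without a stress value.
    """
    units = {}
    for t_i in SAMPLING_HOURS:
        levels = [s for hour, s in reports if t_i <= hour <= t_i + WINDOW_HOURS]
        if levels:
            units[t_i] = mean(levels)
    return units


# Made-up reports: two in the morning window, one in the afternoon.
reports = [(9.5, 2), (10.0, 4), (17.25, 3)]
units = stress_per_sampling_time(reports)
```

Because both ends of the interval are inclusive, as in the equation above, a report made exactly at a sampling time contributes to two consecutive windows.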

4.3 Detection of confounding variables

In order to conduct a reliable causation study based on observational data we need to define the confounding variables. While there is a large number of factors that could influence the stress level of participants, the study could be biased only by factors that have a direct influence on both the stress level and the variable that is considered as treatment in the study. Thus, in our case we need to specify factors that could influence both the daily activities of participants and their stress level. For example, the workload of students can influence their activities (e.g., in periods with increased workload some students may choose to change their workout schedule, etc.) and their stress level. Since the workload cannot be directly measured using only sensor data from smartphones, we use as confounding variables other variables that provide implicit indicators of workload, such as the time that students spend at home and university and their deadlines. Moreover, participants’ choice to do an activity may exclude another activity from their schedule and it may also influence their stress level. For example, someone may choose to spend some time in a pub instead of following his/her normal workout schedule. The previous day’s stress level may also influence both the next day’s activities and stress level. Finally, several studies have demonstrated that stress level fluctuations are affected by personality traits [8]. In general, more positive and extrovert people tend to be able to handle stress better than people with a high neuroticism score. Moreover, personality characteristics may correlate with the daily schedule that people follow. For example, more extrovert people may spend less time at home and more time in social activities. In order to define the covariates of the study, we conduct a correlation analysis on the variables of interest. Since the relationship among the variables may not be linear, we apply the Kendall rank correlation. The p-values of the Kendall correlation are presented in Table 1.

Table 1 p-values of Kendall correlation under the null hypothesis that the examined variables are independent

Based on these results, the time that students spend at home does not correlate with their stress level. Thus, the variable \(H^{u, d}_{t_{i}}\) will not be included in the causality study. The causal impact of each treatment variable \(U^{u, d}_{t_{i}}\), \(O^{u, d}_{t_{i}}\), \(E^{u, d}_{t_{i}}\) and \(\mathit{SC}^{u, d}_{t_{i}}\) on the effect variable \(S^{u, d}_{t_{i}}\) will be examined using all the variables that correlate with both the treatment and effect based on Table 1 as confounding variables. We consider a correlation to be significant enough if the p-value is smaller than 0.1. In Table 2, we present the confounding variables that will be used for each examined treatment. While the variables \(O^{u, d}_{t_{i}}\) and \(\mathit{SC}^{u, d}_{t_{i}}\) are strongly correlated, we do not include \(\mathit{SC}^{u, d}_{t_{i}}\) in the set of confounding variables when the treatment is the variable \(O^{u, d}_{t_{i}}\), since our goal is to study the impact of spending time in any place (including socialization venues) apart from home and working environment.

Table 2 Confounding variables for the different applied treatments
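The selection rule described above (a variable is a confounder if it correlates significantly with both the treatment and the effect) can be sketched as follows. The p-values in the example are made up for illustration and are not the values of Table 1; the function name and the dictionary layout are assumptions.

```python
def select_confounders(p_values, treatment, outcome, candidates, threshold=0.1):
    """Return candidates significantly correlated with both treatment and outcome.

    `p_values` maps frozenset({a, b}) to the p-value of the correlation
    test between variables a and b.
    """
    def significant(a, b):
        return p_values[frozenset((a, b))] < threshold

    return [
        z for z in candidates
        if z not in (treatment, outcome)
        and significant(z, treatment)
        and significant(z, outcome)
    ]


# Hypothetical p-values (NOT taken from Table 1).
p_values = {
    frozenset(("U", "D")): 0.01, frozenset(("S", "D")): 0.02,
    frozenset(("U", "PS")): 0.50, frozenset(("S", "PS")): 0.001,
    frozenset(("U", "N")): 0.03, frozenset(("S", "N")): 0.04,
}
confounders = select_confounders(p_values, "U", "S", ["D", "PS", "N"])
```

In this made-up example, `PS` is excluded because it is not significantly correlated with the treatment, even though it is strongly correlated with the outcome.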

4.4 Creation of treated and control groups

After defining the confounding variables of the study, we need to split the units into control and treatment groups. We consider binary treatments by applying thresholds to the examined treatment variables. Thus, for each of the four examined treatments (i.e., \(U^{u, d}_{t_{i}}\), \(O^{u, d}_{t_{i}}\), \(E^{u, d}_{t_{i}}\), \(\mathit{SC}^{u, d}_{t_{i}}\)) the units are split as follows:

  1. \(U^{u, d}_{t_{i}}\): treatment units are all the units with \(U^{u, d}_{t_{i}}< E\{U^{u, d}_{t_{i}}\} - \alpha\cdot E\{U^{u, d}_{t_{i}}\}\) and control units are all the units with \(U^{u, d}_{t_{i}} \geq E\{U^{u, d}_{t_{i}}\} + \alpha\cdot E\{U^{u, d}_{t_{i}}\}\), for a constant \(\alpha\in[0, 1)\). Thus, we consider the treatment value to be positive when the university sojourn time is relatively small;

  2. \(O^{u, d}_{t_{i}}\): treatment units are all the units with \(O^{u, d}_{t_{i}}>E\{O^{u, d}_{t_{i}}\} + \alpha\cdot E\{O^{u, d}_{t_{i}}\}\) and control units are all the units with \(O^{u, d}_{t_{i}} \leq E\{O^{u, d}_{t_{i}}\} - \alpha\cdot E\{O^{u, d}_{t_{i}}\}\). Thus, we consider the treatment value to be positive when the time spent in any non-work-related place outside home is relatively large;

  3. \(E^{u, d}_{t_{i}}\): treatment units are all the units with \(E^{u, d}_{t_{i}}>0\), i.e., all the units that denote that a user u had some exercise on day d before time \(t_{i}\). The control group contains the units with \(E^{u, d}_{t_{i}}=0\);

  4. \(\mathit{SC}^{u, d}_{t_{i}}\): similarly to the treatment variable \(E^{u, d}_{t_{i}}\), treatment units are units with \(\mathit{SC}^{u, d}_{t_{i}} > 0\) and control units are units with \(\mathit{SC}^{u, d}_{t_{i}} = 0\).

Thus, when the treatment variables \(U^{u, d}_{t_{i}}\) and \(O^{u, d}_{t_{i}}\) are considered, units are classified as treated and untreated based on the time participants have spent at university or at any place apart from their home and university, respectively. However, in order to examine the impact of exercising and visiting socialization venues, the binary treatments are defined by considering only whether there was some exercising activity or a visit to a socialization place or not. We do not study the impact of these factors by considering also the duration of these events since the amount of the data is not sufficiently large.
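The threshold-based split can be sketched as follows, assuming that the expectation is taken as the mean over the values being split and that units falling between the two thresholds are dropped. The function name and the sample values are illustrative.

```python
from statistics import mean


def split_by_threshold(values, alpha, treat_if_low):
    """Split unit indices into (treated, control) around mean * (1 -/+ alpha).

    Units whose value falls between the two thresholds are discarded.
    """
    m = mean(values)
    low, high = m - alpha * m, m + alpha * m
    treated, control = [], []
    for i, v in enumerate(values):
        if treat_if_low:
            # University time: treated when strictly below the low threshold.
            if v < low:
                treated.append(i)
            elif v >= high:
                control.append(i)
        else:
            # Time outside: treated when strictly above the high threshold.
            if v > high:
                treated.append(i)
            elif v <= low:
                control.append(i)
    return treated, control


# Made-up sojourn times in seconds (mean 250, alpha = 0.15).
values = [100, 240, 260, 400]
uni_split = split_by_threshold(values, 0.15, treat_if_low=True)
out_split = split_by_threshold(values, 0.15, treat_if_low=False)
```

For exercise and socialization venues the split is simpler: a unit is treated if its value is greater than zero and control otherwise. Larger values of α discard more units, which trades sample size for a sharper contrast between the groups.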

Each of the examined treatment variables describes some user behavior or activity from the start of the day to some time \(t_{i}\). Consequently, the comparison of two units with different sampling times \(t_{i}\) is not valid. Thus, we create a group of pairs of treated and control units \(G_{t_{i}}\) for each one of the 6 sampling times \(t_{i}\) such that each treated unit \(P^{(u, d)}_{t_{i}}\) is matched with a control unit \(P^{(u, d)'}_{t_{i}}\) with similar values on its confounding variables. Then, the average treatment effect is estimated as follows:

$$ \mathit{ATE} = \frac{\sum_{t_{i}}\sum_{(P^{(u, d)}_{t_{i}}, P^{(u, d)'}_{t_{i}}) \in G_{t_{i}}} (S_{P^{(u, d)}_{t_{i}}} - S_{P^{(u, d)'}_{t_{i}}})}{\sum_{t_{i}} N_{G_{t_{i}}}}. $$

If there is no causal effect of the examined treatment on the stress level then the average treatment effect should be zero. We use a t-test in order to decide whether the observed average treatment effect is statistically significant.
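The pooled estimate and its significance test can be sketched as below. The text does not state which variant of the t-test is used; this sketch assumes a paired (one-sample) t statistic on the matched differences and computes only the statistic, not the p-value. The function name and the data are hypothetical.

```python
import math
from statistics import mean, stdev


def pooled_ate_and_t(groups):
    """Pool matched-pair stress differences across sampling times.

    `groups` maps each sampling time t_i to a list of
    (treated_stress, control_stress) pairs, i.e. the set G_{t_i}.
    Returns the ATE and the t statistic for H0: mean difference = 0.
    Assumes at least two pairs whose differences are not all identical.
    """
    diffs = [s_t - s_c for pairs in groups.values() for s_t, s_c in pairs]
    ate = mean(diffs)
    # Paired t statistic: mean difference divided by its standard error.
    t_stat = ate / (stdev(diffs) / math.sqrt(len(diffs)))
    return ate, t_stat


# Made-up matched pairs for two sampling times.
groups = {
    8: [(2.0, 3.0), (1.0, 3.0)],
    12: [(2.0, 2.0), (1.0, 2.0)],
}
ate, t_stat = pooled_ate_and_t(groups)
```

The statistic would then be compared against a t distribution with one degree of freedom fewer than the number of pairs.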

4.5 Balance check

In order to create balanced treated and control pairs of units we apply the Genetic Matching method [36]. Genetic Matching is a multivariate matching method that applies an evolutionary search algorithm to estimate weights for each confounding variable in order to achieve an optimal covariate balance. In order to assess whether the treated and control pairs are sufficiently balanced, we check the standardized mean difference for each confounding variable of the study. We indicate with \(\mathcal{C}\) the set of confounding variables. For each confounding variable \(c \in\mathcal{C}\), the standardized mean difference is estimated as follows:

$$ \mathit{SMD}_{c} = \frac{\sum_{\forall t_{i}} \sum_{\forall(P^{(u, d)}_{t_{i}}, P^{(u, d)'}_{t_{i}}) \in G_{t_{i}}} (c_{P^{(u, d)}_{t_{i}}} - c_{P^{(u, d)'}_{t_{i}}})}{\sum_{\forall t_{i}} N_{G_{t_{i}}}} \Big/ \sqrt{\sigma_{c}^{T=1}} , $$

where \(\sigma_{c}^{T=1}\) denotes the variance of the confounding variable c for the treated units. The remaining bias from a confounding variable c is considered to be insignificant if the absolute value of \(\mathit{SMD}_{c}\) is smaller than 0.1 [34].
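The balance check for a single confounder can be sketched as follows. The sketch uses the sample variance of the treated values; whether the sample or population variance is intended is not stated in the text, so this is an assumption, as are the function name and the data.

```python
import math
from statistics import mean, variance


def standardized_mean_difference(pairs):
    """SMD of one confounder given (treated_value, control_value) matched pairs."""
    treated = [c_t for c_t, _ in pairs]
    mean_diff = mean([c_t - c_c for c_t, c_c in pairs])
    # Normalize by the standard deviation of the confounder among treated units.
    return mean_diff / math.sqrt(variance(treated))


# Made-up confounder values for four matched pairs.
pairs = [(1.0, 1.0), (2.0, 2.0), (3.0, 2.5), (4.0, 4.0)]
smd = standardized_mean_difference(pairs)
balanced = abs(smd) < 0.1
```

In a full study this check would be repeated for every confounding variable, and the matching would be revised if any of them exceeded the threshold.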

5 Results

We conduct a causal inference study for each one of the four examined treatments that were discussed above. In each study, we use as confounding variables all the variables that are presented in Table 2. We report our findings collectively for the whole population. We also repeated these studies separately for participants with high and low extroversion and participants with high and low neuroticism scores in order to investigate whether some of the examined treatments have a different causal impact on these sub-populations. We decided to conduct additional studies separately for these sub-populations because neuroticism and extroversion are strongly correlated with stress level according to Table 1. Participants are classified as highly extroverted if their extroversion score is higher than the average extroversion score; otherwise, they are classified as members of the low extroversion sub-population. Correspondingly, we define two sub-populations of participants with high neuroticism scores (i.e., participants with a neuroticism score higher than the average) and participants with low neuroticism scores. In Figure 3 we present the distribution of the neuroticism and extroversion scores of the participants.

Figure 3 Distribution of neuroticism and extroversion scores.

In Figure 4 we show the average treatment effect (ATE) normalized by the average stress level of the control units along with the 95% confidence intervals for each one of the four examined treatment variables. For the treatment variables \(U^{u, d}_{t_{i}}\) and \(O^{u, d}_{t_{i}}\) we present results for α equal to 0, 0.05, 0.1 and 0.15. We do not present results for larger α values since the number of samples that are discarded is large and the remaining data are not sufficient for statistically significant conclusions. In Figure 5 and Table 3 we present the standardized difference, as described in Eq. (5), for all the confounding variables that were used in each one of the causation studies. According to our results, the standardized difference between treated and control samples is smaller than 0.1 for all the confounding variables thus any confounding bias has been sufficiently minimized.

Figure 4. Treatment effect. Percentage improvement in the stress level of treated units compared to control units when each of the examined treatments is applied. Percentage improvement is estimated as \(\frac{\mathit{ATE}}{E\{S^{(u, d)'}_{t_{i}}\}}\times 100\). Results are presented along with the 95% confidence interval.
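
As a rough illustration of the normalization used in this figure, the sketch below computes a percentage improvement and an approximate 95% interval from matched pairs. Several points are assumptions rather than details taken from the article: the input is a list of hypothetical (treated, control) stress pairs, the effect is taken as control minus treated so that a positive value means lower stress for treated units, the interval uses a normal approximation (z = 1.96) on the paired differences, and the control mean in the denominator is treated as fixed.

```python
from math import sqrt
from statistics import mean, stdev

def percent_improvement(pairs, z=1.96):
    """Return (improvement_pct, half_width_pct) from matched pairs.

    pairs: list of (treated_stress, control_stress) tuples, one per match.
    """
    # Positive difference: the treated unit reports less stress than its match.
    diffs = [c - t for t, c in pairs]
    control_mean = mean(c for _, c in pairs)
    ate = mean(diffs)
    # Standard error of the mean paired difference.
    se = stdev(diffs) / sqrt(len(diffs))
    # Normalize by the average stress of the control units, as a percentage.
    scale = 100.0 / control_mean
    return ate * scale, z * se * scale

# Made-up stress reports for illustration only (not data from the study).
example_pairs = [(2.0, 3.0), (3.0, 3.0), (2.0, 2.0), (3.0, 4.0)]
improvement, half_width = percent_improvement(example_pairs)
```

An effect would be read as statistically significant at this level when the interval `improvement ± half_width` excludes zero.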

Figure 5. Balance check. Standardized difference between treated and control samples for each confounding variable when the applied treatment is (a) the variable \(U^{u, d}_{t_{i}}\) and (b) the variable \(O^{u, d}_{t_{i}}\). The standardized difference for all the confounding variables is less than 0.1, thus the groups are balanced.
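
A balance check of this kind can be sketched as follows. The formula used here is the common pooled-standard-deviation form of the standardized difference (as in the propensity score literature, e.g. [34]); Eq. (5) of the article may define it differently, so treat the formula, the function names and the sample values as assumptions made for illustration.

```python
from math import sqrt
from statistics import mean, variance

def standardized_difference(treated, control):
    """Absolute difference in means divided by the pooled standard deviation."""
    pooled_sd = sqrt((variance(treated) + variance(control)) / 2.0)
    if pooled_sd == 0.0:
        # Both groups are constant: balanced only if the constants coincide.
        return 0.0 if mean(treated) == mean(control) else float("inf")
    return abs(mean(treated) - mean(control)) / pooled_sd

def is_balanced(treated_by_var, control_by_var, threshold=0.1):
    """True if every confounding variable is below the balance threshold."""
    return all(
        standardized_difference(treated_by_var[name], control_by_var[name]) < threshold
        for name in treated_by_var
    )

# Made-up values of one confounder (hours of sleep) for illustration only.
sleep_treated = [7.0, 8.0, 6.0, 7.5]
sleep_control = [7.1, 7.9, 6.2, 7.4]
d = standardized_difference(sleep_treated, sleep_control)
balanced = is_balanced({"sleep": sleep_treated}, {"sleep": sleep_control})
```

The 0.1 threshold is the one quoted in the caption; the check only covers confounders that were actually measured.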

Table 3. Standardized difference between treated and control samples for each of the confounding variables when the applied treatments correspond to the variables \(\mathit{SC}^{u, d}_{t_{i}}\) and \(E^{u, d}_{t_{i}}\)

Our results indicate that the time that students spend at university has only a weak causal impact on stress level, and only when participants’ samples are split into treatment and control groups using an α value of 0.15. In detail, participants report a 3.1% (with confidence interval ±0.7%) lower stress level on days when their sojourn time at university is at least 15% lower than the average university sojourn time of the whole population, compared to days when it is at least 15% higher than the average. However, when the analysis is limited to people with high extroversion scores, there is no statistically significant evidence that the time students spend at university has any causal effect on stress. When smaller α values are considered, the estimated effect is close to zero for the examined set of students.
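
The role of α in splitting samples can be illustrated with a short sketch. It assumes a simple thresholding rule: a sample counts as being on the "high" side if its value exceeds (1 + α) times the average, on the "low" side if it falls below (1 − α) times the average, and is discarded otherwise. The function name, the `treated_when_above` switch and the sample values are hypothetical; which side counts as treated, and which average is used, should follow the treatment definitions given earlier in the article.

```python
def assign_group(value, avg, alpha, treated_when_above=True):
    """Assign a sample to 'treated', 'control' or None (discarded)."""
    upper = (1.0 + alpha) * avg
    lower = (1.0 - alpha) * avg
    if value > upper:
        return "treated" if treated_when_above else "control"
    if value < lower:
        return "control" if treated_when_above else "treated"
    # Inside the band around the average: the sample is not used.
    return None

# Made-up daily minutes spent outside, for illustration only.
minutes_outside = [30, 55, 60, 66, 95]
avg = 60
groups = [assign_group(v, avg, alpha=0.15) for v in minutes_outside]
```

Under this rule a larger α widens the discarded band around the average, which is why fewer samples remain as α grows.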

Based on our results, the time that students spend in any place other than their home and university has a strong causal impact on their stress level. As depicted in Figure 4(b), when the whole set of participants is considered, students reported around 3% (with confidence interval ±0.65%) lower stress on days when they spent more time outside than the average, compared to days when they spent less time outside (i.e., \(\alpha= 0\)). Similar results are observed when the study is repeated separately for students with high and low extroversion and for students with high and low neuroticism scores (the observed differences are not statistically significant given the 95% confidence intervals of the study). When the value of α is increased, the causal impact of the examined variable is stronger. For \(\alpha= 0.15\), the improvement in stress level for students who spend more time outside is 14.45% (with confidence interval ±1.5%) when the total population is considered. The results are similar when the study is limited to students with high extroversion scores or to students with low neuroticism scores. However, the examined variable has a significantly lower impact on stress level when only students with high neuroticism scores or only students with low extroversion scores are considered.

In Figure 4(c), we examine the impact of exercising and of visiting socialization venues on stress level. While the variable \(\mathit{SC}^{u, d}_{t_{i}}\) is strongly correlated with stress level, according to our results there is no causal link between them. This indicates that, while people benefit from spending time outside their home or working environment in general, there is no statistically significant benefit from visiting specific venues. Finally, exercising has a positive effect on the stress level of the examined population. When we examine the four sub-populations separately, we observe that exercising has a stronger positive effect on the stress level of participants with high neuroticism scores, while there is no statistically significant benefit for people with high extroversion scores. The impact on people with low neuroticism scores is also weak.

6 Discussion

In this work, we have presented a framework for detecting causal links in human behavior using mobile phone sensor data. We have studied the causal effects of several factors, such as working, exercising and socializing, on the stress level of 48 students using data captured by smartphone sensors. Our study does not consider the impact of social influence on the stress level of individuals, mainly because of dataset limitations (see endnote a). Our results suggest that exercising and spending time outside home or university have a strong positive causal effect on participants’ stress level, i.e., they lower reported stress. We have also demonstrated that the time participants stay at university has a positive causal impact on their stress level only when it is considerably lower than the average daily university sojourn time, and even then the effect is small.

Moreover, we have observed that some of the examined factors have a different impact on the stress level of students with high extroversion scores and on that of students with high neuroticism scores. More specifically, more extroverted students benefit more from spending time outside home or university, while more neurotic students benefit more from exercising.

Our study mainly relies on raw sensor data that can be easily captured with smartphones. We have demonstrated that information extracted by simply monitoring users’ location and activity (through the accelerometer) can serve as an implicit indicator of several factors of interest, such as their working and exercising schedule as well as their daily social interactions. Inferring this high-level information from raw sensor data instead of pop-up questionnaires has three main advantages: (1) it offers a more accurate representation of participants’ activities over time, since data are collected continuously; (2) data are collected in an unobtrusive way, without requiring participants to provide any feedback, which minimizes the risk that some users will quit the study because they are dissatisfied with the amount of feedback they need to provide; (3) data gathered through pop-up questionnaires may not be objective, since participants may provide false responses either intentionally or unintentionally. On the other hand, inferences based on sensor data could also be inaccurate, either because of noisy sensor measurements or because the variable of interest is inferred from the sensed data rather than directly measured. For example, in our case we assume that a visit to a sports center implies that the user did some exercise. However, the user may have visited this place to attend a sports event or just to meet friends. Assessing the degree of uncertainty involved in inferring information from sensor measurements, and incorporating this uncertainty into the causation study, represents an interesting research area for further investigation.

This study involves a limited number of participants who do not constitute a representative sample of the population; therefore, general conclusions about the causal impact of the examined factors on stress level cannot be extrapolated from it. However, the purpose of this article is to demonstrate the potential of using smartphones for conducting large-scale studies of human behavior, rather than to present a thorough investigation of the factors influencing the stress level of the participants.

Finally, any causal inference study based on observational data could be biased if relevant confounding variables are missing. However, conducting experimental studies is not feasible in many cases for practical or ethical reasons. Smartphones as well as wearable devices can capture a large variety of data and offer useful information about users’ daily activities. Additional information that may be needed in a study could be provided by the users through pop-up questionnaires. Thus, by leveraging this technology, scientists could obtain sufficient information to conduct reliable causal inference studies.


Endnotes

  a. Some studies provide evidence that the mood of individuals is influenced by the mood of their peers (see, for example, [27, 29]). However, dataset limitations do not allow us to investigate whether the stress experienced by a participant could influence his/her social circle. In order to examine this aspect, we created a friendship network based on the phone calls and SMSes of the users. However, the resulting friendship network is composed of only 19 students out of 48 (i.e., only 19 students had at least one friendship link to another student). Moreover, not all users are active on every day of the study (e.g., some users do not report their stress level every day). In order to study the impact of friends’ stress, we need to consider only samples for which we have both the stress level of the student under consideration and the stress level of his/her friends. This reduces the size of our dataset by 73%, and the remaining sample is not sufficient to derive statistically significant results. For this reason, the impact of the social network of individuals is not considered in this study.


References

  1. Miluzzo E et al. (2008) Sensing meets mobile social networks: the design, implementation and evaluation of the CenceMe application. In: Proceedings of SenSys’08. ACM, New York, pp 337-350

  2. Ravi N, Dandekar N, Mysore P, Littman ML (2005) Activity recognition from accelerometer data. In: Proceedings of AAAI’05, vol 3, pp 1541-1546

  3. Lu H, Rabbi M, Chittaranjan GT, Frauendorfer D, Mast MS, Campbell AT, Gatica-Perez D, Choudhury T (2012) StressSense: detecting stress in unconstrained acoustic environments using smartphones. In: Proceedings of UbiComp’12. ACM, New York, pp 351-360

  4. Rachuri KK, Musolesi M, Mascolo C, Rentfrow PJ, Longworth C, Aucinas A (2010) Emotionsense: a mobile phones based adaptive platform for experimental social psychology research. In: Proceedings of UbiComp’10. ACM, New York, pp 281-290

  5. Ma Y, Xu B, Bai Y, Sun G, Zhu R (2012) Daily mood assessment based on mobile phone sensing. In: Proceedings of BNS’12. IEEE, Los Alamitos, pp 142-147

  6. Bauer G, Lukowicz P (2012) Can smartphones detect stress-related changes in the behaviour of individuals? In: Proceedings of PerCom Workshops’12. IEEE, Los Alamitos, pp 423-426

  7. Bogomolov A, Lepri B, Pianesi F (2013) Happiness recognition from mobile phone data. In: Proceedings of SocialCom’13. IEEE, Los Alamitos, pp 790-795

  8. Bogomolov A, Lepri B, Ferron M, Pianesi F, Pentland AS (2014) Daily stress recognition from mobile phone data, weather conditions and individual traits. In: Proceedings of Multimedia’14. ACM, New York, pp 477-486

  9. Canzian L, Musolesi M (2015) Trajectories of depression: unobtrusive monitoring of depressive states by means of smartphone mobility traces analysis. In: Proceedings of UbiComp’15. ACM, New York

  10. Asur S, Huberman BA (2010) Predicting the future with social media. In: Proceedings of the 2010 IEEE/WIC/ACM international conference on web intelligence and intelligent agent technology (WI-IAT), vol 1. IEEE, Los Alamitos, pp 492-499

  11. Tumasjan A, Sprenger TO, Sandner PG, Welpe IM (2010) Predicting elections with Twitter: what 140 characters reveal about political sentiment. In: Proceedings of the fourth international AAAI conference on weblogs and social media, pp 178-185

  12. Bollen J, Mao H, Zeng X (2011) Twitter mood predicts the stock market. J Comput Sci 2(1):1-8

  13. Cha M, Haddadi H, Benevenuto F, Gummadi PK (2010) Measuring user influence in Twitter: the million follower fallacy. In: Proceedings of the fourth international AAAI conference on weblogs and social media, pp 10-17

  14. Anger I, Kittl C (2011) Measuring influence on Twitter. In: Proceedings of KDD’11. ACM, New York, p 31

  15. Bond RM, Fariss CJ, Jones JJ, Kramer ADI, Marlow C, Settle JE, Fowler JH (2012) A 61-million-person experiment in social influence and political mobilization. Nature 489(7415):295-298

  16. Muchnik L, Aral S, Taylor SJ (2013) Social influence bias: a randomized experiment. Science 341(6146):647-651

  17. Onnela J-P, Saramäki J, Hyvönen J, Szabó G, Lazer D, Kaski K, Kertész J, Barabási A-L (2007) Structure and tie strengths in mobile communication networks. Proc Natl Acad Sci USA 104:7332-7336

  18. Colizza V, Barrat A, Barthelemy M, Vespignani A (2006) Prediction and predictability of global epidemics: the role of the airline transportation network. Proc Natl Acad Sci USA 103:2015

  19. Marsch LA (2012) Leveraging technology to enhance addiction treatment and recovery. J Addict Dis 31(3):313-318

  20. Pejovic V, Musolesi M (2014) Anticipatory mobile computing for behaviour change interventions. In: Proceedings of MCSS’14. ACM, New York, pp 1025-1034

  21. Moturu ST, Khayal I, Aharony N, Pan W, Pentland AS (2011) Using social sensing to understand the links between sleep, mood, and sociability. In: Proceedings of SocialCom’11. IEEE, Los Alamitos, pp 208-214

  22. Madan A, Moturu ST, Lazer D, Pentland AS (2010) Social sensing: obesity, unhealthy eating and exercise in face-to-face networks. In: Proceedings of wireless health 2010. ACM, New York, pp 104-110

  23. Mouchart M, Russo F, Wunsch G (2009) Structural modelling, exogeneity, and causality. In: Causal analysis in population studies. Springer, Berlin, pp 59-82

  24. Spirtes P, Glymour CN, Scheines R (2000) Causation, prediction, and search. MIT Press, Cambridge

  25. Shadish WR, Cook TD, Campbell DT (2002) Experimental and quasi-experimental designs for generalized causal inference. Wadsworth Cengage Learning, Boston

  26. Rosenquist JN, Murabito J, Fowler JH, Christakis NA (2010) The spread of alcohol consumption behavior in a large social network. Ann Intern Med 152(7):426-433

  27. Fowler JH, Christakis NA et al. (2008) Dynamic spread of happiness in a large social network: longitudinal analysis over 20 years in the Framingham Heart Study. Br Med J 337:2338

  28. Cacioppo JT, Fowler JH, Christakis NA (2009) Alone in the crowd: the structure and spread of loneliness in a large social network. J Pers Soc Psychol 97(6):977

  29. Rosenquist JN, Fowler JH, Christakis NA (2011) Social network determinants of depression. Mol Psychiatry 16(3):273-281

  30. Christakis NA, Fowler JH (2013) Social contagion theory: examining dynamic social networks and human behavior. Stat Med 32(4):556-577

  31. Wang R et al. (2014) StudentLife: assessing mental health, academic performance and behavioral trends of college students using smartphones. In: Proceedings of UbiComp’14. ACM, New York, pp 3-14

  32. Holland PW (1986) Statistics and causal inference. J Am Stat Assoc 81(396):945-960

  33. Stuart EA (2010) Matching methods for causal inference: a review and a look forward. Stat Sci 25(1):1

  34. Austin PC (2011) An introduction to propensity score methods for reducing the effects of confounding in observational studies. Multivar Behav Res 46(3):399-424

  35. Bolger N, Schilling EA (1991) Personality and the problems of everyday life: the role of neuroticism in exposure and reactivity to daily stressors. J Pers 59(3):355-386

  36. Diamond A, Sekhon JS (2013) Genetic matching for estimating causal effects: a general multivariate matching method for achieving balance in observational studies. Rev Econ Stat 95(3):932-945


Acknowledgements

This work was supported through the EPSRC grants “Trajectories of Depression: Investigating the Correlation between Human Mobility Patterns and Mental Health Problems by means of Smartphones” (EP/L006340/1) and “MACACO: Mobile context-Adaptive CAching for Content-Centric networking” (EP/L018829/1).

Author information

Corresponding author

Correspondence to Fani Tsapeli.

Additional information

Competing interests

The authors declare that they have no competing interests.

Authors’ contributions

FT and MM designed the study. FT analyzed the data and prepared the figures. FT and MM wrote the manuscript.

Rights and permissions

Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License, which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.

Cite this article

Tsapeli, F., Musolesi, M. Investigating causality in human behavior from smartphone sensor data: a quasi-experimental approach. EPJ Data Sci. 4, 24 (2015).
