
Temporal social network reconstruction using wireless proximity sensors: model selection and consequences


The emerging technologies of wearable wireless devices open entirely new ways to record various aspects of human social interactions in a broad range of settings. Such technologies make it possible to log the temporal dynamics of face-to-face interactions by detecting the physical proximity of participants. However, despite the wide usage of this technology and the collected datasets, precise reconstruction methods transforming the raw recorded communication data packets into social interactions are still missing.

In this study we analyse a proximity dataset collected during a longitudinal social experiment aiming to understand the co-evolution of children’s language development and social network. Physical proximity and verbal communication of hundreds of pre-school children and their teachers were recorded over three years using autonomous wearable low power wireless devices. The dataset is accompanied by three annotated ground truth datasets, which record the time, distance, relative orientation, and interaction state of interacting children for validation purposes.

We use this dataset to explore several pipelines of dynamical event reconstruction, including previously applied naïve approaches and methods based on Hidden Markov Models or Long Short-Term Memory models, some of them combined with supervised pre-classification of interaction packets. We find that while naïve models yield the worst reconstruction, Long Short-Term Memory models provide the most precise way to reconstruct real interactions, with up to \({\sim} 90\%\) accuracy. Finally, we simulate information spreading on the reconstructed networks obtained by the different methods. Results indicate that a small improvement in network reconstruction accuracy may lead to significantly different spreading dynamics, while sometimes large differences in accuracy have no obvious effects on the dynamics. This demonstrates not only the importance of precise network reconstruction but also of the careful choice of the reconstruction method in relation to the data collected. Skipping this initial step in any study may seriously mislead conclusions made about the emerging properties of the observed network or any dynamical process simulated on it.

1 Introduction

The precise observation of the dynamics of face-to-face interactions between people has been a major challenge in social studies [1]. Such observations were commonly limited to small-scale settings over short periods of time [2]. Yet such studies promise keys to better understand the formation of social ties, the emergence of social groups, psychological well-being [3], or how information or influence diffuses via personal interactions. Recently developed wearable wireless devices made a giant leap in this direction possible, as they allowed for large-scale experiments observing offline interactions in multiple settings such as schools [4, 5], museums or conferences [6], hospitals [7, 8], or even between animals [9, 10]. These experiments highlighted novel behavioural patterns [11, 12] and their consequences on ongoing dynamical processes like epidemic spreading [13–16] or the adoption of different behavioural forms [12]. However, all these studies share some methodological similarities. First, although they have been implemented in rather different ways using centralised radio-frequency identification (RFID) [11, 16–21] or autonomous LPWD [8, 22, 23] technologies, they all provide the same output as sequences of radio signal pairs. Second and more importantly, using the collected data streams, the reconstruction methods of temporal interactions were commonly based on naïve assumptions [18, 24], which may seem convenient at first, but have indisputable consequences on the reconstructed event structure and any observed process taking place on it. To address this shortcoming, in this paper, starting from the recorded raw communication data, we explore multiple temporal network reconstruction methods to find the best way to rebuild the original sequence of interactions.
In addition, we will show that even a small improvement in the reconstruction accuracy may have dramatic effects on the dynamics of ongoing processes simulated on the temporal network.

The dynamics of human actions and interactions are conventional topics in social and behavioural sciences, while their modern study has shaped a new, distinguishable area in the field of computational social science [25] called human dynamics. This transition has been fuelled by some radically new digital data collection technologies [26], which make it possible to track human behaviour at the individual level within large populations. On one hand, they provide automatised data collection methods for in vivo observations of millions of people without the usual observational biases but with certain limits in control and reproducibility. Popular methods rely on mobile communication devices [27] or online communication platforms [28], just to give a few examples. On the other hand, online experimental settings, crowd-sourcing services, and new behavioural tracking technologies provided by Internet-of-Things (IoT) solutions like RFIDs [17], IoT-LPWD [8, 22, 23], in addition to reality mining/personal logs [29, 30], real-time surveillance [31], or smart/GPS enabled devices [32, 33] provide the opportunity to precisely observe the behaviour of a selected group of people in more controlled settings.

However, these new opportunities open new challenges as the recorded raw data do not translate easily to knowledge. To bridge this gap, careful methodologies need to be developed to obtain unbiased, precise proxy descriptions of human behaviour. For example, in the case of RFID and LPWD tracking, information about the physical proximity of two people comes as pairs of data sequences recording mutually shared packets with certain frequencies and varying signal strength. However, these devices are not flawless: they are influenced by external noise, and may lose packets for several reasons, e.g., due to interference with other devices, communication overload, physical obstacles, humidity level, or unfavourable orientation of interacting people. The reconstruction of the real temporal interactions from such noisy and incomplete data sequences translates to a multidisciplinary challenge involving signal processing, communication protocols, social behavioural studies and advanced techniques of statistical data analysis. Supervised learning methods can be especially useful for this task, as they are able to recognise recurrent dynamical patterns associated with ongoing or missing interactions. As a result, we may obtain a sequence of proxy interactions between people, which can be best represented as a temporal network [34] coding simultaneously the dynamics and structure of the social fabric. The precise estimation of these interactions has important consequences. Misplaced or wrongly identified interactions may lead to completely different network dynamics and in turn to falsely claimed time-respecting paths. Such paths determine the possible ways in which information or influence can spread in the observed temporal structure, thus even a minor change in their structure can lead to radically different outcomes of co-evolving processes like information diffusion [35, 36], opinion dynamics [37–39] or epidemic spreading [40].

In this paper, we focus on human proxy data collected in a large-scale social study called DyLNet [22]. This experiment, explained in detail in Sect. 2, aims to understand the language development of children at an early age, by observing the verbal and social interactions of pre-school children and their teachers over three years. Physical proximity and verbal interactions of the participants are recorded using voice enabled decentralised Low Power Wireless tags. By focusing on the proximity data, and relying on ground truth data recorded simultaneously in controlled settings, we explore several supervised reconstruction methods of the temporal social interactions. As demonstrated in Fig. 1, first we build a binary interaction sequence from the raw data of packet exchange between the LPW badges of a pair of individuals, and then use it to reconstruct the time and duration of the mutual interactions among the participants in order to obtain a temporal network representation of the social interaction dynamics. As we explain in Sect. 3.1, we translate the first level problem to a classification task, while in Sect. 3.2 we explore multiple naïve and advanced supervised learning methods to solve the final reconstruction problem of the dynamical interaction sequences. Further, in Sect. 4, via data-driven simulation of spreading processes, we demonstrate that while commonly used naïve reconstruction methods consistently overestimate the number of interactions, with advanced supervised learning methods even a minor improvement in the reconstruction performance can have radical effects on the dynamics of an ongoing process.

Figure 1

Temporal network reconstruction pipeline. Starting (a) from mutually observed packets of pairs of LPWD tags, (b) we train models using the raw and annotated data to (c) reconstruct interaction and non-interaction periods between individuals to (d) reconstruct events and ultimately (e, f) build a temporal network

2 Experimental setting and data collection

Our primarily analysed dataset was collected during a large-scale social experiment, which employed IoT-LPWD technology to track human interactions. More precisely, it involved, in each year of the project, about 170 pre-school children (between age \(2\frac {1}{2}\) and \(6\frac {1}{2}\)) in 7 classes, together with about 30 teachers and assistants. During this experiment, their physical proximity and verbal interactions have been recorded using voice enabled LPWD badges. The experiment has been ongoing in a French pre-school over three years, for one week in each month during school periods (i.e. 30 weeks in total). The original goal of this study has been to analyse the relations between child socialisation and language development in pre-school, by describing the co-evolution between social network dynamics (changes in the social relationships) and the language dynamics in the networks (inter-individual influences and changes in language skills).

2.1 Data collection

For the purpose of our analysis, we concentrate on the proximity data recorded by LPWD badges worn by each of the participants. In our architecture, developed by SEQUANTA [23], each badge is associated with a unique ID and uses the IEEE 802.15.4 low-rate wireless standard to communicate. Badges broadcast a ‘hello’ packet with 0 dBm transmission power for \(384~\mu \mbox{s}\) every 5 seconds. For communication, they use the carrier-sense multiple access (CSMA) protocol, thus they first listen to the dedicated channel, and only transmit packets if it is clear, to avoid collisions. When a badge is not transmitting, it is listening to capture incoming packets from other devices in its vicinity. Due to the CSMA protocol, a random offset can appear in the timing of the packet transmission, which can accumulate and cause considerable delay as compared to a global clock signal. For this reason, the time of each badge is synchronised with a central device, connected to a computer, which propagates the global time to all badges. Badges that fall out of time synchronisation stop transmitting packets. This way, each sent packet is time-stamped. Beyond the broadcast global time signal, our architecture is completely distributed as it is built from autonomous badges, which record data locally on their own flash memory card. Every day, badges are plugged into a USB board to re-charge their batteries overnight, and at the end of each measurement period (typically after a week), we gather data from their flash memory cards to a computer, and re-initialise them ahead of the next recording session. Note that there are other wireless architectures developed for similar purposes, such as OpenBeacon [17] or Open badges [30], but they typically use RFID technology with centralised communication protocols and a different philosophy for data collection.

2.2 Ground truth datasets

Beyond the experiments, we collected multiple annotated ground truth datasets for training and validation. The first one (GT1) was recorded by a researcher sitting in a classroom and logging the state of interaction/no-interaction between a pair of children and their relative orientation using the scan sampling method (7 pairs of children were observed for \(\varSigma =3\mbox{ hours }35\mbox{ minutes}\), in ranges of 17–54 minutes). The advantage of this setting was that annotated data were gathered in situ with all noise and interference present (the class containing 20–25 individuals wearing a badge). To obtain annotated data in more controlled settings, a second dataset (GT2) was recorded by statically positioning a pair of children at a given distance and orientation from each other for periods of up to 10 minutes in a separate room (4 pairs of children positioned in two different settings each, observation time in a given position range was 2–10 minutes, summing up to 56 minutes of observation). These data provided stationary observations of packet exchange; however, they failed to capture the dynamic nature of social interactions. Finally, a third dataset (GT3) was recorded by a researcher sitting in a classroom and logging the relative distance and orientation of the individuals present, both children and adults, during regular class activities (28 individuals wearing a badge observed for 30 min). The advantage of this dataset was that it made possible a direct comparison of the RSSI values collected by the badges with the actual distance between social partners. Both the GT1 and GT3 datasets were collected using the scan sampling method [41] (data recorded at intervals of 10 seconds for GT1, and at intervals of 2 minutes for GT3) with the Animal Observer application for iPad [42]. For a more detailed description of how ground truth data were collected see Appendix A.

2.3 Environmental dependencies and parameters

During the experiments, each badge recorded a time-stamped sequence of packets, which were broadcast by other badges in its vicinity. More precisely, a sequence recorded by a given badge consists of (t, ID, RSSI) tuples, where t is the time of observation, ID is the unique identifier of the observed badge, and RSSI is the received signal strength indicator of the observed packet transmitted as a radio signal. In Fig. 2a we show the distribution of RSSI values recorded over a week (24 hours, 196 badges). Values were observed between −24 and \(-94\mbox{ dBm}\) with a bimodal distribution. One peak, above \(-45\mbox{ dBm}\), corresponds to the situation when badges are stored in a box close to each other, thus communicating with strong radio signals. The other peak, below \(-75\mbox{ dBm}\), corresponds to any other observations, including ones capturing real social interactions, with a \(-94\mbox{ dBm}\) lower limit hardwired in the badge configuration. Observed RSSI values can depend on external factors like distance, body orientation, battery status, or humidity conditions. While battery status should not be an issue here as badges are charged overnight, and we can control for distance and orientation (as explained below), we cannot account for changing weather conditions, which can cause some fluctuations in our measurements. In addition, the potential conflict of signals within \(1~\mu \mbox{s}\) may induce accidental loss of observed packets in the case of interactions within large social groups.

Figure 2

LPWD packet statistics. Panel (a) shows the distribution of RSSI values (\(\mbox{bin-width} = 3\)) observed by single badges over a week (24 hours, 196 badges); (b) Correlation between distance (x-axis) and RSSI (y-axis) values shown as a density plot (based on GT3 dataset); (c) Recorded RSSI values as a function of time in settings where the relative orientation of participants has been changed (based on GT2 dataset; position FF: face to face, SS: side by side, BB: back to back; distance: 10 cm, 1 m, 2 m)

Social as well as verbal interactions depend upon the relative distance and orientation of the participants, which should be reflected by the RSSI values of captured packets in our experiments. The strength of transmitted radio signals depends on distance and orientation as they are effectively absorbed by the water of the human body. These dependencies are demonstrated by the measurements depicted in Fig. 2b and c. In Fig. 2b, the density plot of RSSI values (y-axis) of captured packets shows a non-linear negative correlation with the distance between participants (x-axis). This measure suggests, for a realistic maximum distance of 1.5–2 meters for verbal interactions, a corresponding RSSI range between \(-70\dots -75\mbox{ dBm}\), while high intensity regions for lower RSSI values are due to noise and situations of close, non-interacting proximities. This is verified by other measures based on the GT2 dataset (see Fig. 2c), where RSSI values remain within this range for several orientations and change radically only when participants are back to back. However, by visual inspection of these results alone, it is very difficult to determine a precise RSSI threshold separating real and false social interactions. To better solve this task, we next frame this question as a classification problem to distinguish between packets indicating real and false social interactions, which we can then use to reconstruct the temporal network.

3 Temporal network reconstruction

In our pipeline, we are going to reconstruct the temporal network from raw data in five main steps, as demonstrated in Fig. 1. First, we discuss how to arrive from the recorded data at a handshake pair sequence, whose items indicate mutual handshakes between interacting badges. Then we perform a binary state classification process, turning the handshake pair sequence into a binary sequence in which each item indicates the mutual social interaction state. Finally, we propose several methods to reconstruct the real dyadic temporal interactions with duration, which in turn provides us with a temporal network capturing time-varying interactions between a larger group of individuals.

We separate the binary state classification step from event reconstruction as our first goal was to create a binary sequence of interactions from the raw data, to which previously defined methods can be applied. In addition, we found this approach necessary as we identified two potential sources of error affecting the reconstruction process. One is due to the fluctuations of recorded signal strengths of transmitted packets, while the other is caused by packet loss or interference, which induce uncertainties in present or absent handshake pairs. This second type of error makes it difficult to reconstruct events with longer duration, a problem for which earlier studies provided only overly simplified heuristic solutions with limited precision. We will explore various methods relying on this two-step approach, but also show a method which solves the problem at once by taking packets with signal strength values as input and directly reconstructing temporal interactions with duration.

3.1 Binary signal reconstruction

Physical proximity between two participants, A and B, within the right distance range and orientation should appear as a sequence of consecutive mutual ‘handshakes’ of badges for the duration of their interaction. To obtain the sequence of these handshakes, we take the sequences of packets observed by the badges of A and B and match those packets which correspond to the mutual observation of the two participants. In other words, since packets are transmitted every 5 seconds, we match two packets into a single handshake event (see Fig. 1b) if they appear within ±2.5 seconds of each other and they refer to the opposite ID (for A from B and for B from A). Missing packets are also recorded in the handshake sequence (see Fig. 1b empty red arrows) and are assigned a default RSSI value, \(-95\mbox{ dBm}\), outside the possible RSSI range, which we can easily distinguish from observed packets. To clearly distinguish ‘fake signal’ from ‘real signal’, we appended one item called ‘\(\mathit{pair}\_\mathit{state}\)’ to each handshake RSSI pair to indicate the number of ‘real signals’ in the pair. This variable can take values 0, 1 or 2, and it makes it possible to code for the presence of ‘fake signals’ while keeping the normalisation of RSSI values possible for the coming reconstruction methods. Thus the encoding of each handshake pair becomes a vector (\(\mathit{RSSI}_{A}\), \(\mathit{RSSI}_{B}\), \(\mathit{pair}\_\mathit{state}\)), forming a sequence of handshake events recording all information for the reconstruction task.
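The handshake encoding described above can be illustrated with a minimal sketch. It assumes that time is discretised into consecutive 5-second slots and that at most one packet per direction falls into each slot, which is a simplification of the ±2.5 second matching rule; the function names and the toy packet lists are hypothetical and not part of the original pipeline.

```python
MISSING_RSSI = -95.0  # default value outside the observable RSSI range (from the text)
SLOT = 5.0            # badges broadcast one 'hello' packet every 5 seconds

def rssi_in_slot(packets, slot_start):
    """Return the RSSI of the first packet observed in [slot_start, slot_start + SLOT), or None."""
    for t, rssi in packets:
        if slot_start <= t < slot_start + SLOT:
            return rssi
    return None

def build_handshakes(a_from_b, b_from_a, t_start, n_slots):
    """Build (RSSI_A, RSSI_B, pair_state) vectors, one per 5-second slot.

    a_from_b: (time, RSSI) packets that badge A received from badge B
    b_from_a: (time, RSSI) packets that badge B received from badge A
    """
    handshakes = []
    for k in range(n_slots):
        slot_start = t_start + k * SLOT
        rssi_a = rssi_in_slot(a_from_b, slot_start)
        rssi_b = rssi_in_slot(b_from_a, slot_start)
        # pair_state counts how many of the two packets were really observed (0, 1 or 2)
        pair_state = (rssi_a is not None) + (rssi_b is not None)
        handshakes.append((
            rssi_a if rssi_a is not None else MISSING_RSSI,
            rssi_b if rssi_b is not None else MISSING_RSSI,
            pair_state,
        ))
    return handshakes

# Toy example with invented packets: B misses A in the second slot, A misses B in the third
a_from_b = [(0.4, -68.0), (5.1, -70.0)]
b_from_a = [(1.0, -66.0), (10.2, -72.0)]
sequence = build_handshakes(a_from_b, b_from_a, t_start=0.0, n_slots=3)
```

Slots without any observed packet are kept in the sequence with `pair_state` equal to 0, so the output has a fixed 5-second resolution regardless of packet loss.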

In order to determine if a handshake pair should be considered as a state of social interaction, we use GT1, where we recorded the start and end time of each social interaction so that we could mark each handshake pair as an interaction or non-interaction event. This is shown in Fig. 3a, where we plot handshake pairs using their RSSI values as coordinates. Colours code a handshake being an interaction (green) or non-interaction (red). Since different handshake pairs could appear with the same RSSI values but different interaction states, in Fig. 3a we represent their fraction at a given location with a small pie chart. The strong diagonal component indicates that the RSSI values of mutual observations are very similar to each other, as expected, while the interactions seem to separate from non-interactions around \({\sim} {-}70\mbox{ dBm}\), which corresponds well to the earlier estimated threshold range. To solve this classification problem in a more systematic way, we trained a logistic regression model on the annotated GT1 dataset. As input we gave vectors of handshake pairs and we used their annotated labels for the training task. As output we received a probability for each state to be a real social interaction, and we thresholded this probability at 0.5 to assign 0/1 states to each handshake pair. The obtained decision boundary is shown as a grey line in Fig. 3a; it appears to be linear, except close to the boundaries, where saw-teeth appear due to the two-dimensional projection of a three-dimensional decision surface. With this method we reached 77.28% accuracy with 10-fold cross validation (for further details see Table 1) in classifying a handshake pair as a real social interaction or not. This way we can turn our sequence of handshakes into a binary signal (demonstrated in Fig. 1c), by assigning \(1/0\) to interaction and non-interaction events every 5 seconds.
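To make the classification step concrete, the following is a minimal sketch of a logistic regression on handshake vectors, trained by plain gradient descent. The feature scaling, learning rate, number of epochs and the toy training data are all assumptions made for illustration; they are not the settings or data used in the study.

```python
import math

def features(handshake):
    rssi_a, rssi_b, pair_state = handshake
    # Assumed scaling: centre RSSI near -75 dBm, divide by 10; the last entry is the bias term
    return [(rssi_a + 75.0) / 10.0, (rssi_b + 75.0) / 10.0, pair_state / 2.0, 1.0]

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def predict_proba(weights, handshake):
    """Probability that a handshake pair corresponds to a real social interaction."""
    return sigmoid(sum(w * x for w, x in zip(weights, features(handshake))))

def predict(weights, handshake, threshold=0.5):
    """Binary state: 1 for interaction, 0 for non-interaction (threshold 0.5 as in the text)."""
    return int(predict_proba(weights, handshake) >= threshold)

def train(handshakes, labels, lr=0.5, epochs=2000):
    """Fit the weights by batch gradient descent on the logistic loss."""
    X = [features(h) for h in handshakes]
    weights = [0.0] * 4
    for _ in range(epochs):
        grad = [0.0] * 4
        for x, y in zip(X, labels):
            error = sigmoid(sum(w * xi for w, xi in zip(weights, x))) - y
            for j in range(4):
                grad[j] += error * x[j]
        weights = [w - lr * g / len(X) for w, g in zip(weights, grad)]
    return weights

# Invented toy data: strong mutual signals labelled as interactions, weak or missing ones not
TOY_HANDSHAKES = [(-62.0, -64.0, 2), (-66.0, -65.0, 2), (-68.0, -70.0, 2), (-69.0, -67.0, 2),
                  (-82.0, -80.0, 2), (-88.0, -95.0, 1), (-95.0, -95.0, 0), (-78.0, -81.0, 2)]
TOY_LABELS = [1, 1, 1, 1, 0, 0, 0, 0]

weights = train(TOY_HANDSHAKES, TOY_LABELS)
binary_sequence = [predict(weights, h) for h in TOY_HANDSHAKES]
```

Keeping `predict_proba` separate from `predict` mirrors the distinction between the probability output and the thresholded binary output of the classifier.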

Figure 3

Reconstructing binary sequences and interactions. (a) Scatter plot of RSSI values of pairs of interacting (green) and non-interacting (red) badges observed in GT1, with decision boundary presented as grey line; (b–e) Demonstration of reconstruction strategies of (b) an observed binary sequence of interactions using (c) a naïve method with \(\mathit{gap}=1\) threshold, (d) a HMM with window size \(\mathit{win}=3\) and (e) a BiLSTM with window size \(\mathit{win}=3\)

Table 1 Confusion matrix with accuracy for logistic regression to reconstruct binary signals

3.2 Interaction state reconstruction methods

Using the obtained binary sequences, which we call un-reconstructed sequences from now on, our next task is to reconstruct the real interactions that appeared between pairs of participants. The general problem here is to identify false interaction events, which were induced by interference and thus should appear as actual non-interactions, and to reconstruct true ones, which were missed due to packet loss. As this is the most challenging task in our methodological pipeline, we are going to follow three different methodological tracks. We will start with a naïve approach commonly used in the literature, then we will explore variants of the Hidden Markov model (HMM) and the Long Short-Term Memory (LSTM) model to find the best solution for this dynamical reconstruction task. Note that while the naïve method only reconstructs interaction periods, the two learning methods naturally adapt to the inverse problem and also reconstruct non-interacting periods with falsely observed interactions in the middle.

3.2.1 Naïve reconstruction model

Consecutive binary signals in a sequence (following each other at 5-second intervals here) can be merged into long interaction periods (we call them events) with duration equal to the length of the continuous interaction. These events are separated by non-interaction gaps. If such a non-interaction gap is induced by an accidental packet loss, it is presumably very short. On the other hand, if it is due to a real break of social engagement, it may occupy a longer period. Based on this assumption we can design a very simple reconstruction method, where we merge two interaction periods if they are separated only by a sequence of non-interaction events shorter than a given gap threshold value. This naïve reconstruction method is demonstrated in Fig. 3c, where we use \(\mathit{gap}=1\) to reconstruct the sequence observed in Fig. 3b. This method has been used conventionally in most of the RFID social experiments so far, typically choosing the threshold to be \(\mathit{gap}=0\), thus merging only directly consecutive interaction packets (what we call the non-reconstructed method here), or \(\mathit{gap}=1\), corresponding to a gap smaller than 40 seconds. This choice has been challenged recently by Elmer et al. [24], who identified an optimal threshold of 75 seconds for the best reconstruction accuracy.
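The naïve gap-filling rule can be sketched in a few lines. The sketch assumes that a run of non-interaction states is filled when its length is at most `gap` slots and it is enclosed by interaction states on both sides; whether the threshold is inclusive is an interpretation, and the function names are hypothetical.

```python
def fill_gaps(states, gap):
    """Set to 1 every run of 0s of length <= gap that lies between two 1s."""
    out = list(states)
    last_one = None  # index of the most recent interaction state
    for i, s in enumerate(states):
        if s == 1:
            if last_one is not None and 0 < i - last_one - 1 <= gap:
                for j in range(last_one + 1, i):
                    out[j] = 1
            last_one = i
    return out

def to_events(states, slot=5):
    """Merge consecutive 1s into (start_time, duration) events, both in seconds."""
    events = []
    start = None
    for i, s in enumerate(list(states) + [0]):  # the sentinel 0 closes a trailing event
        if s == 1 and start is None:
            start = i
        elif s != 1 and start is not None:
            events.append((start * slot, (i - start) * slot))
            start = None
    return events

observed = [1, 0, 1, 1, 0, 0, 1, 0, 0, 0, 1]
filled = fill_gaps(observed, gap=2)
events = to_events(filled)
```

With `gap=0` the input is returned unchanged, which corresponds to the non-reconstructed case; leading and trailing non-interaction runs are never filled because they are not enclosed by interactions.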

3.2.2 Hidden Markov model

The second reconstruction method we chose is the Hidden Markov Model. To set the parameters of the HMM in a supervised way, we used the GT1 dataset with the annotated states of handshake sequences as the hidden state sequence, and the sequence of binary states (explained in Sect. 3.1) as the observations. After training on the annotated data (GT1), we determined the conditional transition probabilities between hidden states as the transition matrix, the conditional emission probabilities from hidden states to observation states as the emission matrix, and the initial state probabilities as the start matrix. In turn, we used these as parameters for the Viterbi algorithm to solve the most likely sequence problem and used the output as the reconstructed sequence.
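As an illustration of this procedure, the sketch below estimates the start, transition and emission matrices by counting over annotated sequences and decodes with a log-space Viterbi algorithm. The add-one smoothing (`alpha`) and the toy sequences are assumptions introduced here to keep the example well defined; they are not taken from the study.

```python
import math

def fit_hmm(hidden_seqs, observed_seqs, n_states=2, n_obs=2, alpha=1.0):
    """Supervised estimate of start, transition and emission probabilities by counting."""
    start = [alpha] * n_states
    trans = [[alpha] * n_states for _ in range(n_states)]
    emit = [[alpha] * n_obs for _ in range(n_states)]
    for hidden, observed in zip(hidden_seqs, observed_seqs):
        start[hidden[0]] += 1
        for t, (h, o) in enumerate(zip(hidden, observed)):
            emit[h][o] += 1
            if t > 0:
                trans[hidden[t - 1]][h] += 1

    def normalise(row):
        total = sum(row)
        return [v / total for v in row]

    return normalise(start), [normalise(r) for r in trans], [normalise(r) for r in emit]

def viterbi(observed, start, trans, emit):
    """Most likely hidden state sequence given the observations."""
    n = len(start)
    score = [math.log(start[s]) + math.log(emit[s][observed[0]]) for s in range(n)]
    back = []  # back[t][s]: best predecessor of state s at step t + 1
    for o in observed[1:]:
        prev = score
        pointers, score = [], []
        for s in range(n):
            best = max(range(n), key=lambda r: prev[r] + math.log(trans[r][s]))
            pointers.append(best)
            score.append(prev[best] + math.log(trans[best][s]) + math.log(emit[s][o]))
        back.append(pointers)
    state = max(range(n), key=lambda s: score[s])
    path = [state]
    for pointers in reversed(back):
        state = pointers[state]
        path.append(state)
    return path[::-1]

# Invented annotated clip: true states and the noisy binary observations
true_states = [[1, 1, 1, 0, 0]]
noisy_states = [[1, 0, 1, 0, 0]]
start, trans, emit = fit_hmm(true_states, noisy_states)

# Decoding with hand-set parameters: the isolated 0 is reconstructed as an interaction
decoded = viterbi([1, 1, 0, 1, 1], [0.5, 0.5],
                  [[0.9, 0.1], [0.1, 0.9]], [[0.8, 0.2], [0.2, 0.8]])
```

In the decoding example the persistent transition matrix makes a single missing observation cheaper to explain as an emission error than as two state changes, which is the mechanism the HMM uses to repair packet loss.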

To enrich the information coded in the input sequence, instead of providing a sequence of binary values for each time step, we define a backward window, which contains some short-term information preceding the actual state being reconstructed. More precisely, as demonstrated in Fig. 3d, we define a tuple of win items (there \(\mathit{win}=3\)), where the last one is the state to reconstruct while the others are the previous states in the sequence. Applying this envelope definition to a binary state sequence, we create an envelope with backward signals for each signal, thus transforming a binary state sequence into an embedded envelope sequence. Subsequently, we use these transformed envelopes instead of binary signals to define the hidden states and observation states, as well as to determine all the matrices. Finally, as an output of the Viterbi algorithm we obtain a sequence of envelopes, with the last item of each envelope as the predicted interaction/non-interaction state of each time step. Note that we tried multiple other envelope methods (not reported here) coding different distance information between actual and last interaction packets, but obtained worse performance than in the present case.
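The envelope transformation can be sketched as follows. Padding the beginning of the sequence with non-interaction states and encoding each envelope as an integer symbol are assumptions made here for illustration; the text does not specify how the first `win - 1` steps or the symbol alphabet were handled.

```python
def backward_envelopes(states, win, pad=0):
    """For each time step, return the tuple of the last `win` states ending at that step."""
    padded = [pad] * (win - 1) + list(states)
    return [tuple(padded[i:i + win]) for i in range(len(states))]

def envelope_to_symbol(envelope):
    """Encode a binary envelope as an integer, usable as an HMM state or observation symbol."""
    code = 0
    for bit in envelope:
        code = code * 2 + bit
    return code

def last_items(envelopes):
    """Recover the per-step binary state from a decoded envelope sequence."""
    return [env[-1] for env in envelopes]

binary_states = [1, 0, 1, 1]
envelopes = backward_envelopes(binary_states, win=3)
symbols = [envelope_to_symbol(env) for env in envelopes]
```

With `win=3` each step is described by one of eight possible symbols, so the transition and emission matrices grow from 2×2 to 8×8.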

3.2.3 Bi-directional LSTM model

The Hidden Markov Model has two limitations in terms of reconstructing the real interaction signal. First, it can only consider states from the past, while states in the future may be also important for the actual state to predict. Second, it is a Markov model thus it can consider only short-term temporal correlations between the actual and previous states. We tried to overcome this shortcoming by introducing longer observation windows for each state, which helped to learn longer temporal correlations yet they were very limited to the actual window size.

Bidirectional recurrent neural networks offer simultaneous solutions to these two problems as they can be trained using input information in the past and future of a specific time frame [43] (for demonstration see Fig. 3e). In particular, the Bidirectional Long Short-Term Memory (BiLSTM) model has been shown to perform well on dynamical signal reconstruction. This model was initially adopted in speech recognition and was shown to improve model performance on sequence classification problems. In practice, it trains two LSTM models on a complete input sequence from opposite directions, one on the input sequence as it is, and another on a reversed copy. The outputs of each time step from the two LSTM models are merged and passed to the next layer, thus providing some additional context to perform the learning task better.

We applied this model in three different settings to find the best performing one. In one case, that we call BiLSTM-bin, the input of the model was a sequence of binary states we obtained from the classifier’s binary output as explained in Sect. 3.1. In the second setting, that we call BiLSTM-logi, the input sequence was also generated from the classifier but, instead of binary states, it was a sequence of probabilities obtained as the direct output of the logistic regression before thresholding it. Finally, the third case, that we call BiLSTM-RSSI, is not relying on the sequence of classified states, but instead it takes directly sequences of encoded handshake pair vectors (\(\mathit{RSSI}_{A}\), \(\mathit{RSSI}_{B}\), \(\mathit{pair}\_\mathit{state}\)). This solution has the advantage to skip one step of the reconstruction pipeline and to use a more complex set of information, but it needs to solve the same problem using noisy RSSI signals without pre-processing.

3.3 Event reconstruction

To train all these models we used GT1 since it was recorded in the most realistic setting. These data were built from 7 observation clips of 1290, 3060, 3200, 1230, 1740, 1350 and 1030 sec, covering 3 hours 35 minutes combined. For training and validation purposes, we evenly divided observations longer than 3000 seconds (2 clips) into 3 shorter periods, and retained 10 observation clips, all with length between 1000 and 1740 seconds (for a total of 3 hours and 18 minutes). To determine the best hyper-parameter set for each method, we applied a nested cross validation strategy. In the outer loop, we selected one clip (each of them once) for testing purposes, thus we kept it out of the training of the model in this round. On the remaining 9 clips we performed a traditional 9-fold cross validation. Considering all combinations, we could compute the average accuracy over the 9 possible divisions into training and validation sets in order to screen hyper-parameter dependencies. Subsequently, we could repeat this 10 times to obtain the average test accuracy with the selected best hyper-parameters. Note that while computing averages we took into account the variance in length of the actual clips used for validation or testing.
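The nested cross validation strategy can be outlined with a short sketch. It assumes that the best hyper-parameter is selected separately within each outer round and that averages are weighted by clip length; the `evaluate` callback, the toy clips and the toy scoring rule are hypothetical placeholders standing in for the training and scoring of an actual model.

```python
def weighted_mean(values, weights):
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)

def nested_cv(clips, grid, evaluate):
    """Leave-one-clip-out nested cross validation.

    evaluate(train_clips, held_out_clip, hyper) must return an accuracy in [0, 1].
    Returns the length-weighted test accuracy and the hyper-parameter chosen in each round.
    """
    test_scores, test_weights, chosen = [], [], []
    for i, test_clip in enumerate(clips):
        rest = clips[:i] + clips[i + 1:]  # the test clip is kept out of this round

        def validation_score(hyper):
            # Inner loop: each remaining clip serves once as the validation set
            scores = [evaluate(rest[:j] + rest[j + 1:], rest[j], hyper)
                      for j in range(len(rest))]
            return weighted_mean(scores, [len(c) for c in rest])

        best = max(grid, key=validation_score)
        chosen.append(best)
        test_scores.append(evaluate(rest, test_clip, best))
        test_weights.append(len(test_clip))
    return weighted_mean(test_scores, test_weights), chosen

# Toy stand-ins: clips of different length and a score that peaks at window size 6
toy_clips = [[0] * n for n in (4, 6, 5)]

def toy_evaluate(train_clips, held_out_clip, hyper):
    return 1.0 - abs(hyper - 6) / 10.0

test_accuracy, selected = nested_cv(toy_clips, grid=[2, 4, 6, 8], evaluate=toy_evaluate)
```

Passing the model as a callback keeps the loop structure identical for the naïve, HMM and BiLSTM variants, with only `evaluate` and the hyper-parameter grid changing.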

3.3.1 Hyper-parameters

All BiLSTM models have two hyper-parameters defining their architecture: the number of hidden neurons and the number of hidden layers. In our computations, we decided to use architectures with a single hidden layer, which was a sufficient choice for the relatively small training data we have. At the same time, we explored the dependency of the models on the number of hidden neurons with a grid search. The results summarised in Appendix B suggested that the performance of the models depended only weakly on this hyper-parameter, but indicated different optimal values for their best performance, as summarised in Table 2.

Table 2 Selected optimal hyper-parameters as number (No.) of hidden neurons, optimal (Opt.) window size, average test accuracy values, and the corresponding standard deviations for each model

The most important hyper-parameter controlling the performance of all of the methods was the window size, which determined the length of temporal correlation a given method could consider. For the naïve model, this window can be associated with the gap parameter (see Fig. 3c). In case of the HMM model, as shown in Fig. 3d, it is the size of the window that the model considers from the past to infer the actual state. Finally for the BiLSTM models, this window was defined as an envelope of equal number of states before (past) and after (future) relative to the actual state to reconstruct (see Fig. 3e).
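The two kinds of windows described above (past-only for the HMM, symmetric for the BiLSTM models) can be illustrated with a short sketch. The function and parameter names are hypothetical, padding the sequence edges with a constant is an assumption, and the text does not state whether the BiLSTM window size counts one side of the envelope or both; here `half` is the number of states on each side.

```python
from typing import List, Sequence


def past_windows(states: Sequence[float], size: int, pad: float = 0.0) -> List[List[float]]:
    """For each position t, return the `size` states preceding t followed by state t."""
    padded = [pad] * size + list(states)
    return [padded[t : t + size + 1] for t in range(len(states))]


def symmetric_windows(states: Sequence[float], half: int, pad: float = 0.0) -> List[List[float]]:
    """For each position t, return `half` states before t, state t, and `half` states after t."""
    padded = [pad] * half + list(states) + [pad] * half
    return [padded[t : t + 2 * half + 1] for t in range(len(states))]
```

Each returned window is the context a model would see when inferring the state at its reference position.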

To choose the best window size, we took it as a parameter and computed how the average accuracy over the validation sets depends on it. As the results in Fig. 4a show, the reconstruction accuracy of each model depends strongly on the selected window size. First, in the case of the naïve method, increasing the size of the filled non-interaction gap raises the accuracy up to a maximum at \(\mathit{gap}=6\) (see footnote 2). This corresponds to a gap length of 35 seconds, which is somewhat smaller than the gap window size of 75 seconds reported by Elmer et al. [24] on another RFID dataset. In the case of the HMM model, the best performance corresponds to the same window size \(\mathit{win}=6\). For the BiLSTM models, the accuracy increases with the window size but levels off at window size \(\mathit{win}=27\) for the BiLSTM-bin, \(\mathit{win}=25\) for the BiLSTM-logi, and \(\mathit{win}=27\) for the BiLSTM-RSSI, beyond which the reconstruction accuracy decreases.
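As an illustration of the naïve method, the sketch below fills short runs of non-interaction states that lie between two interaction states. It assumes a binary state sequence sampled at regular steps; the function name is hypothetical. With `gap=0` the input is returned unchanged, which corresponds to the unreconstructed sequence mentioned in footnote 2.

```python
from typing import List, Sequence


def fill_gaps(states: Sequence[int], gap: int) -> List[int]:
    """Set runs of at most `gap` zeros to 1 when they lie between two ones."""
    out = list(states)
    last_one = None                       # index of the most recent interaction state
    for i, s in enumerate(states):
        if s == 1:
            # Number of zeros between the previous interaction state and this one.
            if last_one is not None and 0 < i - last_one - 1 <= gap:
                for j in range(last_one + 1, i):
                    out[j] = 1
            last_one = i
    return out
```

Leading and trailing runs of zeros are never filled, because they are not enclosed by interaction states on both sides.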

Figure 4
Accuracy of temporal network reconstruction (a) Average validation accuracy values as a function of window size for the naïve, HMM, BiLSTM-bin, BiLSTM-logi and BiLSTM-RSSI models, with fixed number of hidden neurons for BiLSTM models as summarised in Table 2. (b) Distribution of accuracy values shown as box-plots for the different models with optimal hyper-parameters values summarised in Table 2. Horizontal white bar inside box is median and white star is average

3.3.2 Performance of network reconstruction

After computing the average accuracy values over the test sets, we found that, surprisingly, all methods performed the reconstruction task relatively well (see Table 2 and Appendix B for the confusion matrices). Even the non-reconstructed sequence reaches a surprisingly high accuracy of 77.28%. The naïve method, commonly used in other studies, performs clearly better with 83.36%, closely matching the performance (84.25%) of the considerably more complicated HMM model. It is evident, however, that of all tested models the BiLSTM methods solve the temporal network reconstruction task best. They all provide accuracy at least 4 percentage points higher than any other method, reaching 88.34% for the BiLSTM-bin method, close to the values of 89.02% and 90.03% for the BiLSTM-logi and BiLSTM-RSSI methods respectively. More importantly, the best performing BiLSTM methods are also the ones whose accuracy fluctuates the least over the different test cases. This is reflected by the standard deviation values reported in Fig. 4b and Table 2, where all performance measures are summarised. In summary, these results suggest that the pipelines built on binary classification and logistic regression are among the best performing ones, but the BiLSTM-RSSI model trained directly on the RSSI values of interaction pairs provides an equally good but simpler solution.

3.4 The reconstructed temporal network

The different models we introduced may reconstruct temporal networks with different characteristics. First of all, differences may arise because some models label a given event as an interaction while others label it as absent. This can be easily demonstrated by looking at the rates of interactions reconstructed by each model, as shown in Fig. 5a for a single morning period (2.5 hours). There, evidently, the highest event rate appears for the unreconstructed signal (naïve method with \(\mathit{gap}=0\)), where we only merge consecutive packets labelled as interactions by the binary classifier. Relative to the unreconstructed sequence, each reconstruction method considerably reduces the rate of identified interaction events. The naïve method, still a very simple model, which merges events at most 35 seconds apart, has the second highest event rate. The HMM method provides a lower event rate, while the BiLSTM-logi, BiLSTM-bin and BiLSTM-RSSI methods are closely grouped with the lowest rates, reconstructing about four times fewer events than in the unreconstructed case.
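The event rates compared above rest on turning a sequence of states into discrete events. A minimal sketch of that step is given below, assuming a binary state sequence on a single link sampled every `step` seconds; the names and the sampling step are hypothetical and not taken from the text.

```python
from typing import List, Sequence, Tuple


def extract_events(states: Sequence[int], step: float) -> List[Tuple[float, float]]:
    """Merge consecutive interaction states (1) into (start_time, duration) events."""
    events, start = [], None
    for i, s in enumerate(list(states) + [0]):    # sentinel closes a trailing event
        if s == 1 and start is None:
            start = i
        elif s != 1 and start is not None:
            events.append((start * step, (i - start) * step))
            start = None
    return events


def event_rate(states: Sequence[int], step: float) -> float:
    """Number of events per hour of observation."""
    hours = len(states) * step / 3600.0
    return len(extract_events(states, step)) / hours
```

Filling gaps before extracting events merges neighbouring events into longer ones, which is why every reconstruction method lowers the event rate relative to the unreconstructed sequence.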

Figure 5
Characterisation of the reconstructed network. (a) Rates of reconstructed events by different methods (for colours see key). (b) Inter-event time distribution between original and reconstructed event on single links. (c) Distribution of the duration of interactions. Dashed lines on panels (b) and (c) depict approximating power-law functions with exponents 1.8 and 2.1 respectively. (d) Weighted static representation of a reconstructed network using the BiLSTM-RSSI method. Here link widths are proportional to the time spent together when both nodes were present and node sizes are proportional to node degrees. Nodes are coloured according to the original class partitions with darker nodes indicating adults (teachers, assistants or interns). For better visualisation we removed links which correspond to the weakest 3% of weights

Despite these large differences in the reconstructed volume, the \(P(\tau )\) inter-event time distributions between interactions on single links (shown in Fig. 5b) and the \(P(\mathit{dur})\) distributions of the duration of interactions (Fig. 5c) have very similar shapes. These distributions all have broad tails ranging over several orders of magnitude and can be approximated well by power-law functions with exponents \(\alpha =1.8\) and \(\gamma =2.1\) respectively. Interestingly, this scaling is very similar to earlier observations in independent RFID studies [18, 44]. On the one hand, this match supports our experimental setting and observations; on the other hand, it suggests that the heterogeneities present in the dynamics of face-to-face interactions may be universal, with similar characteristics in independent systems.
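The text does not say how the approximating exponents were obtained. One common way to estimate a power-law exponent from samples is the continuous maximum-likelihood estimator, sketched below on synthetic data. This is an illustrative stand-in and not necessarily the procedure used for Fig. 5; the function name and `x_min` are hypothetical, and real durations are multiples of the sampling step, for which the continuous estimator is only approximate.

```python
import math
import random
from typing import Sequence


def powerlaw_exponent(samples: Sequence[float], x_min: float) -> float:
    """Continuous maximum-likelihood estimate of a in P(x) ~ x^(-a) for x >= x_min."""
    tail = [x for x in samples if x >= x_min]
    return 1.0 + len(tail) / sum(math.log(x / x_min) for x in tail)


# Synthetic check: paretovariate(k) draws from a density proportional to x^-(k+1), x >= 1,
# so k = 1.1 corresponds to an exponent of 2.1.
rng = random.Random(0)
durations = [rng.paretovariate(1.1) for _ in range(20000)]
gamma_hat = powerlaw_exponent(durations, x_min=1.0)
```

On the synthetic sample the estimate should land close to the generating exponent, with an uncertainty that shrinks as the number of tail samples grows.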

To demonstrate the structure of the reconstructed network, we chose the BiLSTM-RSSI method, as it was one of the best performing models with the smallest variance in accuracy. Using this method, we reconstructed the events recorded in five consecutive mornings (15 hours observation combined) for 165 children and 25 adults (teachers, assistants and interns), and aggregated the obtained interaction sequences into a static network structure. Link weights in this representation were defined as per hour interaction rates between participants. This network is visualised in Fig. 5d, where we draw links with width proportional to the time the connected nodes interacted during periods when they were both present. The size of the nodes reflects their degrees, while their colours indicate the class they belong to (with darker colours indicating teachers, assistants and interns). This network structure has several interesting characteristics. First, the network is heterogeneous in degree, which is a common characteristic of social networks. Second, it recovers well the expected community structure, where children of the same class connect densely together, including the teaching staff in charge of that group.
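The aggregation step can be sketched as follows. The sketch takes one possible reading of "per hour interaction rate", namely seconds of interaction per hour of co-presence; the event format, the `copresence_hours` input and the function names are assumptions made for illustration.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple

Pair = Tuple[str, str]


def aggregate_network(
    events: Iterable[Tuple[str, str, float]],
    copresence_hours: Dict[Pair, float],
) -> Dict[Pair, float]:
    """Return link weights as seconds of interaction per hour of co-presence.

    `events` are (u, v, duration_in_seconds) records; `copresence_hours` maps each
    alphabetically sorted pair to the hours both nodes were present (hypothetical input).
    """
    total = defaultdict(float)
    for u, v, duration in events:
        total[tuple(sorted((u, v)))] += duration   # undirected link
    return {pair: t / copresence_hours[pair] for pair, t in total.items()}


def degrees(weights: Dict[Pair, float]) -> Dict[str, int]:
    """Number of distinct neighbours of each node in the aggregated network."""
    deg = defaultdict(int)
    for u, v in weights:
        deg[u] += 1
        deg[v] += 1
    return dict(deg)
```

Normalising by co-presence time rather than by the total observation time avoids penalising pairs in which one participant was absent for part of the period.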

4 Spreading processes on reconstructed networks

Temporal social interactions are far from random; they are highly correlated in time and structure. They are characterised by heterogeneous bursty dynamics [45], which potentially appear due to causal correlations between events. Such causally related adjacent events, sharing at least one person in common, build long time-respecting paths [34, 36], which are extremely important as they determine how information, epidemics or influence can flow in the temporal network.

Consequently, the precise inference of temporal interactions in a network is extremely important for studying not only the emergent structure but also any ongoing process, such as language evolution, information spreading or epidemics. To demonstrate this issue, we take all the event reconstruction methods we explored and study how the temporal networks reconstructed in different ways influence the dynamics of a simple information spreading process. More precisely, we use the susceptible-infected (SI) model [46], one of the simplest prototypical models of information spreading, which can be used to simulate the fastest possible spreading under certain conditions. This model, defined on temporal networks, assumes that each node in the network is initially in the susceptible state except for a single randomly selected node, which is set to be infected at a randomly selected time. Infection can be transferred with rate β from an infected to a susceptible node (i.e. \(S\xrightarrow[]{\beta } I\)) only at the time and in the direction of their temporal interactions. For \(\beta =1\) the model is equivalent to a breadth-first-search process, realising the fastest possible information spreading scenario with given initial conditions in the actual temporal network. However, if \(\beta <1\), the process is arguably less sensitive to local fluctuations in the temporal network, as it can take routes other than the shortest paths to reach nodes, and thus spreads more slowly on the same network. Note that, due to the finite observation period of temporal interactions, in our simulations we divided a 150-minute observation period into a 30-minute and a 120-minute time window. We selected 800 random seeds from the first window and simulated the SI process for 120 minutes in each case. This way we obtained simulated spreading curves of the same length that we could easily average.
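A minimal sketch of an SI process on a temporal network is given below. It is an illustration under stated assumptions rather than the simulation code used in the study: contacts are treated as undirected and instantaneous, β is interpreted as a per-contact transmission probability, contacts sharing a timestamp are processed in sorted order, and all names are hypothetical.

```python
import random
from typing import Dict, Iterable, Optional, Tuple


def simulate_si(
    events: Iterable[Tuple[float, str, str]],
    seed: str,
    t_start: float,
    beta: float = 1.0,
    horizon: Optional[float] = None,
    rng: Optional[random.Random] = None,
) -> Dict[str, float]:
    """Return the infection time of every node reached from `seed`.

    `events` are (time, u, v) contacts. Only contacts with time in
    [t_start, t_start + horizon] can transmit, each with probability `beta`.
    """
    rng = rng or random.Random()
    t_end = float("inf") if horizon is None else t_start + horizon
    infected = {seed: t_start}
    for t, u, v in sorted(events):                 # process contacts in time order
        if t < t_start or t > t_end:
            continue
        # Transmission needs exactly one infected endpoint at contact time.
        if (u in infected) == (v in infected):
            continue
        if beta >= 1.0 or rng.random() < beta:
            infected[v if u in infected else u] = t
    return infected
```

In the setting described above, the seed node and `t_start` would be drawn from the first 30 minutes and `horizon` set to 120 minutes (7200 seconds), with the run repeated for each of the 800 seeds and the resulting curves averaged.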

To depict our simulation results, in Fig. 6a we show the average spreading curves for each model for the \(\beta =1\) case, while the inset shows, for lower β values, the average time the process needed to reach 90% infection on each reconstructed network. Figure 6b shows the corresponding distributions of the time to reach 90% infection in each case, again for \(\beta =1\). All these results indeed demonstrate large differences between spreading dynamics simulated on temporal networks reconstructed with the different methods, even though they all relied on the same raw observation sequences. Not surprisingly, the speed of spreading is in general largely determined by the overall number of events that the different models reconstructed, as already shown in Fig. 5a. A larger number of interactions means a larger number of possible transitions between the same set of nodes over the same period. Following this logic, the unreconstructed network spread the infection the fastest, while the BiLSTM models were the slowest. However, there is an important exception, which reflects our main conclusion here. The naïve method considerably reduced the event rate compared to the unreconstructed sequence (see Fig. 5a), but when it comes to disseminating information, this seems to make no difference. This is suggested by the corresponding spreading curves in Fig. 6a, which are almost indistinguishable, and by the distributions of the 90% infection time, which have almost the same average and standard deviation (see Fig. 6b). At the same time, these results seem to be consistent over a range of β values. From the inset of Fig. 6a it is evident that at small β values the spreading is strongly stochastic, fluctuations are very large, and the process takes a long time to spread. As we increase β, the spreading becomes faster on each network. More importantly, after an initial β regime, the spreading processes evolve with relative speeds on the different structures similar to those observed in the deterministic \(\beta =1\) case. This indicates that even in stochastic settings (\(\beta <1\)) the dynamical process is sensitive to the precise reconstruction of the underlying temporal network.
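The 90% infection time used in these comparisons can be computed from the infection times of a single simulation run as sketched below. The input is assumed to be a mapping from node to infection time; the function name is hypothetical, and returning `None` when the threshold is never reached within the simulated window is a design choice of this sketch.

```python
import math
from typing import Dict, Optional


def time_to_fraction(
    infection_times: Dict[str, float],
    n_nodes: int,
    t_start: float,
    fraction: float = 0.9,
) -> Optional[float]:
    """Elapsed time until `fraction` of all nodes are infected, or None if never reached."""
    # The small offset guards against floating-point round-up in fraction * n_nodes.
    needed = math.ceil(fraction * n_nodes - 1e-9)
    times = sorted(infection_times.values())
    if len(times) < needed:
        return None
    return times[needed - 1] - t_start
```

Averaging this quantity over many seeds gives one point of the inset in Fig. 6a, and its distribution over seeds corresponds to one box in Fig. 6b.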

Figure 6
Characterisation of information spreading on reconstructed networks. (a) Average spreading curves of susceptible-infected processes with \(\beta =1\) simulated on different reconstructed temporal networks as indicated in the key. Inset shows the β dependencies of the speed of spreading processes measured as the average \(t_{90\%}\) 90% infection time. (b) Distributions of 90% infection times as box-plots for the different reconstruction methods in case of \(\beta =1\). Distribution averages are represented by stars. For the parameters of the SI simulations see the main text

In conclusion, when using wireless proximity sensors to capture temporal interactions, (a) it is very important to carefully reconstruct events from the raw data and not only rely on simplistic intuitive conditions, otherwise the constructed temporal network will be biased by noise and overestimated event rates and will lead to unreliable outcomes of simulated dynamical processes; and (b) it is not enough to choose the best reconstruction method by its final accuracy, but it is crucial to choose carefully the reconstruction pipeline, which balances between good reconstruction performance and matching the purpose of the actual system under study.

5 Discussion

The goal of this work was manifold. First, we developed a filtering and temporal network reconstruction pipeline to obtain the best approximation of temporal social interaction sequences from proximity data recorded via wearable wireless devices. We used ground truth data recorded in various settings and explored different reconstruction strategies involving supervised methods of classification and sequence reconstruction. We found that, while all tested methods provide reasonable performance, naïve methods commonly used in the literature show the worst performance. At the same time, bi-directional LSTM methods, which take into account information from the past and future of the state being predicted, solve the reconstruction task best, with accuracy up to \({\sim} 90\%\).

Furthermore, we wanted to highlight the importance of precise reconstruction of temporal interactions from raw data. Over the last few years, experiments using wearable wireless devices have provided an ideal way to study collective social phenomena through the precise recording of temporal social interactions of people or animals in various settings. At the same time, these datasets have become resources for studying ongoing dynamical processes such as epidemics [13, 47], opinion dynamics [48], etc. evolving on the temporal social fabric. However, without careful reconstruction of social interactions, any study addressing the dynamics or structure of the evolving networks, or any ongoing collective dynamics, would risk drawing wrong or inaccurate conclusions. We demonstrated the sensitivity of this issue by simulating susceptible-infected processes on the reconstructed networks, which follow significantly different scenarios depending on the method used for event reconstruction, even for methods with comparable accuracy.

Finally, we wanted to showcase a large-scale longitudinal social experiment, which records the proxy social and verbal interactions of hundreds of pre-school children and their teachers and assistants with high temporal resolution. Our experiment is ongoing, but in the end it will provide observations about the dynamics of language development and social network of children over three years. Using these data in our upcoming research, we plan to study linguistic and social similarities at different levels of organisation of the social network: the collective level (the whole school, classes, or children groups with similar socio-cultural background considered as a community), the intermediate level (friendship groups), the dyadic level (connected pairs of children) and the individual level (each child with their specific characteristics).

First, we plan to detect the temporal relationships between social network dynamics and changes in children’s language. We shall adopt two approaches to disentangle the mutual influences between socialisation and language. To evaluate social influence on language skills, we will test whether the change of social distance between individuals predicts the linguistic distance between them as well. Conversely, to assess the effect of language on social relationships (homophily), we will also investigate whether the linguistic distance between individuals predicts the social distance between them. An interesting future research direction would be to uncover the fine-grained differences between the reconstructed networks, to understand which of their features are important to reconstruct and which are insignificant for dynamical processes.

Second, we aim at measuring, quantifying and modelling the processes that have long-term influence on social and linguistic development. Our three-year longitudinal follow-up design makes it possible to analyse the processes underlying the dynamics between changes in the social network and language skills. In particular, we will examine the effect of integration within a new community: does a community always have a homogenising role, by absorbing linguistic change, or, by contrast, can it accommodate the linguistic usages brought by new members and augment these by disseminating them through the community?

Like any data-driven study intending to predict or infer human behaviour, our study also has limitations. First of all, the collected data contain certain noise, which cannot be corrected by any existing method. Noise is also inevitably present in the ground truth data, which at the same time code only a finite set of configurations used for training, while rare and exceptional scenarios may remain unobserved. These limitations, together with the stochastic nature of human behaviour, mean that any reconstruction of human actions or interactions can always be improved. Finally, although we paid special attention to de-noising, pre-filtering, model selection, and the exploration of the hyper-parameter space of each model, the optimal inference pipeline we identified is surely not universal and may be different for data from other wireless proximity sensors.

Beyond scientific merit, our results highlight the importance of the careful design of event reconstruction in studies using wireless sensors. We demonstrated this in the case of LPWD based experiments recording social interactions, but it is important more generally in any study relying on similar data collection methods. This way, we hope that our study contributes not only to the better design of coming scientific studies but also to future emerging technologies.


Notes

  1. The employed technology and its implementation complies with the requirements of the product standard EN50566 following the basic restrictions of the European Council recommendations 1999/519/EC.

  2. Note that in case the threshold was \(\mathit{gap}=0\), we obtain the unreconstructed sequence.


References

  1. Goffman E (2017) Interaction ritual: essays in face-to-face behavior. Routledge, London

  2. Duncan S, Fiske DW (2015) Face-to-face interaction: research, methods, and theory. Routledge, London

  3. Kawachi I, Berkman LF (2001) Social ties and mental health. J Urban Health 78(3):458–467

  4. Stehlé J, Voirin N, Barrat A, Cattuto C, Isella L, Pinton J-F, Quaggiotto M, Van den Broeck W, Régis C, Lina B et al. (2011) High-resolution measurements of face-to-face contact patterns in a primary school. PLoS ONE 6(8):23176

  5. Fournet J, Barrat A (2014) Contact patterns among high school students. PLoS ONE 9(9):107878

  6. Isella L, Stehlé J, Barrat A, Cattuto C, Pinton J-F, Van den Broeck W (2011) What’s in a crowd? Analysis of face-to-face behavioral networks. J Theor Biol 271(1):166–180

  7. Martinet L, Crespelle C, Fleury E, Boëlle P-Y, Guillemot D (2018) The link stream of contacts in a whole hospital. Soc Netw Anal Min 8(1):59

  8. Duval A, Obadia T, Martinet L, Boëlle P-Y, Fleury E, Guillemot D, Opatowski L, Temime L (2018) Measuring dynamic social contacts in a rehabilitation hospital: effect of wards, patient and staff characteristics. Sci Rep 8(1):1686

  9. Psorakis I, Voelkl B, Garroway CJ, Radersma R, Aplin LM, Crates RA, Culina A, Farine DR, Firth JA, Hinde CA et al. (2015) Inferring social structure from temporal data. Behav Ecol Sociobiol 69(5):857–866

  10. Krause J, Krause S, Arlinghaus R, Psorakis I, Roberts S, Rutz C (2013) Reality mining of animal social systems. Trends Ecol Evol 28(9):541–551

  11. Waber BN, Olguin Olguin D, Kim T, Pentland A (2010) Productivity through coffee breaks: changing social networks by changing break structure. Available at SSRN 1586375

  12. Chancellor J, Layous K, Margolis S, Lyubomirsky S (2017) Clustering by well-being in workplace social networks: homophily and social contagion. Emotion 17(8):1166

  13. Stehlé J, Voirin N, Barrat A, Cattuto C, Colizza V, Isella L, Régis C, Pinton J-F, Khanafer N, Van den Broeck W et al. (2011) Simulation of an seir infectious disease model on the dynamic contact network of conference attendees. BMC Med 9(1):87

  14. Machens A, Gesualdo F, Rizzo C, Tozzi AE, Barrat A, Cattuto C (2013) An infectious disease model on empirical networks of human contact: bridging the gap between dynamic network data and contact matrices. BMC Infect Dis 13(1):185

  15. Lucet J-C, Laouenan C, Chelius G, Veziris N, Lepelletier D, Friggeri A, Abiteboul D, Bouvet E, Mentre F, Fleury E (2012) Electronic sensors for assessing interactions between healthcare workers and patients under airborne precautions. PLoS ONE 7(5):37893

  16. Obadia T, Silhol R, Opatowski L, Temime L, Legrand J, Thiébaut AC, Herrmann J-L, Fleury E, Guillemot D, Boelle P-Y et al. (2015) Detailed contact data and the dissemination of staphylococcus aureus in hospitals. PLoS Comput Biol 11(3):1004170

  17. OpenBeacon. Accessed: 2019-06-26

  18. Cattuto C, Van den Broeck W, Barrat A, Colizza V, Pinton J-F, Vespignani A (2010) Dynamics of person-to-person interactions from distributed rfid sensor networks. PLoS ONE 5(7):11596

  19. Barrat A, Cattuto C, Colizza V, Pinton J-F, Broeck WVd, Vespignani A (2008) High resolution dynamical mapping of social interactions with active rfid. arXiv preprint. arXiv:0811.4170

  20. Konomi S, Inoue S, Kobayashi T, Tsuchida M, Kitsuregawa M (2006) Supporting colocated interactions using rfid and social network displays. IEEE Pervasive Comput 5(3):48–56

  21. Kibanov M, Atzmueller M, Scholz C, Stumme G (2014) Temporal evolution of contacts and communities in networks of face-to-face human interactions. Sci China Inf Sci 57(3):1–17

  22. Nardy A, Fleury E, Chevrot J-P, Karsai M, Buson L, Bianco M, Rousset I, Dugua C, Liégeois L, Barbu S, Crespelle C, Busson A, Léo Y, Bouchet H, Dai S (2016) DyLNet – language dynamics, linguistic learning, and sociability at preschool: benefits of wireless proximity sensors in collecting big data. Working paper or preprint

  23. SEQUANTA. Accessed: 2019-12-17

  24. Elmer T, Chaitanya K, Purwar P, Stadtfeld C (2019) The validity of rfid badges measuring face-to-face interactions. Behav Res Methods, 51:2120–2138

  25. Lazer D, Pentland A, Adamic L, Aral S, Barabási A-L, Brewer D, Christakis N, Contractor N, Fowler J, Gutmann M et al. (2009) Computational social science. Science 323(5915):721–723

  26. Vespignani A (2009) Predicting the behavior of techno-social systems. Science 325(5939):425–428

  27. Blondel VD, Decuyper A, Krings G (2015) A survey of results on mobile phone datasets analysis. EPJ Data Sci 4(1):10

  28. Abdesslem FB, Parris I, Henderson T (2012) Reliable online social network data collection. In: Computational social networks. Springer, London, pp 183–210

  29. Eagle N, Pentland AS (2006) Reality mining: sensing complex social systems. Pers Ubiquitous Comput 10(4):255–268

  30. Lederman O, Calacci D, MacMullen A, Fehder DC, Murray FE, Pentland A (2017) Open badges: a low-cost toolkit for measuring team communication and dynamics. arXiv preprint. arXiv:1710.01842

  31. Haritaoglu I, Harwood D, Davis LS (2000) W/sup 4: real-time surveillance of people and their activities. IEEE Trans Pattern Anal Mach Intell 22(8):809–830

  32. Ashbrook D, Starner T (2003) Using gps to learn significant locations and predict movement across multiple users. Pers Ubiquitous Comput 7(5):275–286

  33. Stopczynski A, Sekara V, Sapiezynski P, Cuttone A, Madsen MM, Larsen JE, Lehmann S (2014) Measuring large-scale social networks with high resolution. PLoS ONE 9(4):95978

  34. Holme P, Saramäki J (2012) Temporal networks. Phys Rep 519(3):97–125

  35. Karsai M, Kivelä M, Pan R, Kaski K, Kertész J, Barabási A-L, Saramäki J (2011) Small but slow world: how network topology and burstiness slow down spreading. Phys Rev E 83(2):025102

  36. Kivelä M, Cambe J, Saramäki J, Karsai M (2018) Mapping temporal-network percolation to weighted, static event graphs. Sci Rep 8(1):12357

  37. Moinet A, Barrat A, Pastor-Satorras R (2018) Generalized voterlike model on activity-driven networks with attractiveness. Phys Rev E 98(2):022303

  38. Karsai M, Perra N, Vespignani A (2014) Time varying networks and the weakness of strong ties. Sci Rep 4:4001

  39. Li M, Dankowicz H (2019) Impact of temporal network structures on the speed of consensus formation in opinion dynamics. Phys A, Stat Mech Appl 523:1355–1370

  40. Masuda N, Holme P (2017) Temporal network epidemiology. Springer, Berlin

  41. Altmann J (1974) Observational study of behavior sampling methods. Behaviour 49:227–267

  42. Animal Observer: an iPad app designed to collect animal behavior and health data.

  43. Schuster M, Paliwal KK (1997) Bidirectional recurrent neural networks. IEEE Trans Signal Process 45(11):2673–2681

  44. Zhao K, Stehlé J, Bianconi G, Barrat A (2011) Social network dynamics of face-to-face interactions. Phys Rev E 83(5):056109

  45. Karsai M, Jo H-H, Kaski K (2018) Bursty human dynamics. Springer, London

  46. Barrat A, Barthelemy M, Vespignani A (2008) Dynamical processes on complex networks. Cambridge University Press, Cambridge

  47. Chowdhury B, Chowdhury MU, Sultana N (2009) Real-time early infectious outbreak detection systems using emerging technologies. In: 2009 international conference on advances in recent technologies in communication and computing. IEEE, pp 506–508

  48. Maity SK, Manoj TV, Mukherjee A (2012) Opinion formation in time-varying social networks: the case of the naming game. Phys Rev E 86(3):036110

  49. Santos AJ, Vaughn BE, Bost KK (2008) Specifying social structures in preschool classrooms: descriptive and functional distinctions between affiliative subgroups. Acta Ethol 11(2):101–113



Acknowledgements

We are especially grateful to all the children, their families, and the members of the school teaching staff who took part in our experiment. This work has been supported by the DyLNet ANR project (ANR-16-CE28-0013). MK, JPC and EF are grateful for partial support from the SoSweet ANR project (ANR-15-CE38-0011). MK acknowledges support from the DataRedux ANR project (ANR-19-CE46-0008) and the SoBigData++ H2020 project. SD benefits from sponsorship by China Scholarship Council (CSC) for his PhD program.

Author information

Contributions

All authors participated in the design of the data collection experiment. HB collected the data, SD and MK designed the reconstruction methods, SD implemented and ran the supervised training experiments. SD and MK wrote the first draft of the manuscript while all authors participated in its finalisation. All authors read and approved the final manuscript.

Corresponding author

Correspondence to Márton Karsai.

Ethics declarations

Ethics approval and consent to participate

Experiments were approved by relevant ethics committees, COERLE (Comité Opérationnel d’Evaluation des Risques Légaux et Ethiques of INRIA institute, favourable opinion no. 2017-014) and CNIL (Commission Nationale de l’Informatique et des Libertés, favourable opinion Avis CIL_UGA-2017_0980683). The study was conducted within a French primary school with permission from the relevant authorities of Education Nationale. Written informed consent was obtained from all adults of the teaching staff and from the parents of all the children who took part in the study.

Competing interests

The authors declare that they have no competing interests.

Additional information

Sicheng Dai, Hélène Bouchet and Márton Karsai contributed equally to this work.


Appendix A: Experimental design of ground truth data collection

While designing the ground truth data collection, special attention was paid to the feasibility and reliability of our observation method, in order to decrease human errors during observations while obtaining a meaningful ground truth dataset. To meet all these requirements we adopted the following logic and conditions:

  • To record the ground truth data for GT1 and GT3, a researcher in the classroom monitored the behaviour of the children and recorded their interaction state and their relative orientation or position at regular intervals. To quantify interactions, distance and orientation, we chose a scan sampling strategy (i.e. observations of states at predetermined time intervals) [41]. Another option was focal sampling (i.e. continuous recording of interaction events) [41]; however, social interactions (and the children’s positions and distances too) were sometimes too short and fluctuating for continuous observation, so it was impossible to record the beginning and end of each interaction event without video recording.

    In addition, for scan sampling, we made trials to find the appropriate time intervals between two observations (scans) that allowed us to record data without loss (i.e. the shortest step possible for recording the data of interest without taking the risk to miss one observation point). In the end, we chose 10 second steps for pair observations (GT1) and 2 minute steps for group observations (GT3).

  • For GT1 observations, we worked during free play time to be able to observe spontaneous play interactions. To decrease the possible noise due to fleeting behaviours and random moves, observations were carried out on older children from the middle (4-5 years old) or the grand class (5-6 years old). Importantly, we focused on only one pair of children at a time to record their state (interacting or not) and relative position every 10 seconds. This way we reduced the possible human observation errors in this setting to a minimum.

    Regarding the scoring of the interaction/no-interaction state, we set the criteria for interaction based on the literature [49] and on the field expertise of the participating researchers from earlier similar experiments. Specifically, we considered two children to be “interacting” if they were within arm’s reach (i.e. less than 1 meter from each other) and either playing together (e.g. cooperatively manipulating construction blocks, kitchen toys, puzzles...) or playing alongside each other (e.g. making a drawing next to each other). These situations typically involved talking to each other at times.

  • Regarding GT3, we selected the most appropriate and stable conditions to observe the distance and orientation of as many children as possible. In practice, we performed observations only with older children (grand class) when their movements inside the classroom were limited for an extended period of time. This was possible during specific activities that involved staying in place (such as sitting together on a bench to listen to the teacher reading a book, or sitting around tables in small groups to do written work, etc.) rather than moving around freely (as during free play time).

  • Measuring distances between children is very much subject to inter- (and even intra-) individual variations. To minimise such fluctuations in recording the distances in GT3, we used a customised behavioural observation app originally developed to record the positions and distances of animals in a fixed environment (Animal Observer application for iPad [42]). This app displays a scaled map of the classroom with indicated reference objects (furniture, doors, windows, etc.) and allows an observer to record the positions of individuals on this map as a function of time. From this temporal location dataset, the inter-individual distances between children were computed afterwards. Relative to the objects on the classroom map, the location of children could be estimated precisely with a very small error margin (around 10 cm), something that would have been impossible to assess accurately through “naked eye” estimations and “hand” recording.
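
The step from recorded map positions to inter-individual distances can be illustrated with a minimal sketch. It assumes that each scan is available as a mapping from a child identifier to an (x, y) position in metres on the classroom map; the function name `pairwise_distances`, the identifiers and the coordinates are hypothetical and do not come from the app or the dataset.

```python
import math
from itertools import combinations


def pairwise_distances(positions):
    """Return the Euclidean distance for every unordered pair of children.

    positions: dict mapping a child id to an (x, y) tuple in metres
    (assumed data layout, chosen for illustration only).
    """
    return {
        (a, b): math.dist(positions[a], positions[b])
        for a, b in combinations(sorted(positions), 2)
    }


# One hypothetical scan with three children.
scan = {"c1": (0.0, 0.0), "c2": (3.0, 4.0), "c3": (3.0, 0.0)}
distances = pairwise_distances(scan)
```

Applying such a function to every scan of a session would give one distance value per pair and per time step.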

To further check the impact of potential errors due to human encoding, we conducted a randomisation experiment in which we induced noise in the already collected data. More precisely, we randomly selected 5%, 10%, 15%, and 20% of the observation points from the ground truth data sequences and flipped their annotated flags from interaction to non-interaction or vice versa, to add noise to our original observations. By remeasuring the accuracy of the BiLSTM-RSSI method on these randomised data, we found that the average and variance of the accuracy are rather robust against such small induced noise, with the average decreasing only slightly, as summarised in Table 3.

Table 3 Average and standard deviation of the accuracy of the BiLSTM-RSSI reconstruction method trained on GT1 after introducing random noise of various levels. Average values are computed over 20 independent realisations
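
The label-flipping step of this randomisation experiment can be sketched as follows. This is only an illustration under stated assumptions: the ground truth is taken to be a list of binary flags (1 for interaction, 0 for non-interaction), and the function name `flip_labels`, the seed and the toy sequence are hypothetical.

```python
import random


def flip_labels(labels, fraction, rng):
    """Flip a randomly chosen fraction of binary labels (0 <-> 1)."""
    n_flip = round(fraction * len(labels))
    # Sample distinct positions so that no label is flipped twice.
    chosen = rng.sample(range(len(labels)), n_flip)
    noisy = list(labels)
    for i in chosen:
        noisy[i] = 1 - noisy[i]
    return noisy


rng = random.Random(0)           # fixed seed, for reproducibility only
labels = [0, 1] * 50             # toy sequence of 100 observation points
noisy = flip_labels(labels, 0.10, rng)
n_changed = sum(a != b for a, b in zip(labels, noisy))
```

Repeating this for each noise level and for several independent realisations, then retraining and re-evaluating the model each time, would give the kind of averages and standard deviations reported in Table 3.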

Appendix B: Parameters and performance of reconstruction methods

In this Appendix, we summarise the parametrisation of the different reconstruction methods together with the confusion matrices corresponding to their best performing parameters, as summarised in Sect. 3.3 of the main text.

B.1 Naïve method

The naïve method has a single parameter \(\mathit{gap}\), which determines the maximum length of non-interaction gaps between two interaction events to be filled automatically with interaction states. \(\mathit{gap}=0\) is a special case, as it corresponds to the non-reconstructed signal where no state has been filled. We explored \(\mathit{gap}\) from 0 to 9, corresponding to non-interaction gaps of 0 to 45 seconds in increments of 5 seconds. We found that in our setting the best reconstruction is reached with \(\mathit{gap}=6\), with an accuracy of 0.834, as summarised in the confusion matrix in Table 4. Consequently, if a gap longer than 35 seconds appears in the interaction sequence of two individuals, the two participants have most probably broken off their actual social interaction, and the events before and after the gap should be considered separately.

Table 4 Confusion matrix with accuracy of the naïve event reconstruction method with \(\mathit{gap}=6\)
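
The gap-filling rule of the naïve method can be sketched in a few lines. The sketch assumes a binary sequence with one state per 5 second time step and treats the gap length as inclusive (runs of at most `gap` non-interaction steps enclosed by interaction states are filled); both the function name `fill_gaps` and the inclusive convention are assumptions made for illustration.

```python
def fill_gaps(states, gap):
    """Fill runs of 0s of length <= gap that lie between two 1s.

    states: list of 0/1 values, one per time step.
    gap: maximum run length (in time steps) to be filled; gap=0
    returns the sequence unchanged.
    """
    filled = list(states)
    last_one = None  # index of the most recent interaction state
    for t, s in enumerate(states):
        if s == 1:
            if last_one is not None and 0 < t - last_one - 1 <= gap:
                for k in range(last_one + 1, t):
                    filled[k] = 1
            last_one = t
    return filled


raw = [1, 0, 0, 1, 0, 0, 0, 1, 0]
reconstructed = fill_gaps(raw, 2)
```

In this toy sequence the first run of two zeros is filled, the run of three zeros is left open, and the trailing zero is untouched because no interaction state follows it.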

B.2 Hidden Markov model

When parametrising the HMM with annotated data, we used maximum likelihood estimation to compute three matrices. Taking the transition matrix as an example, if the number of times hidden state \(i\) at time \(t\) transits to hidden state \(j\) at time \(t+1\) is \(A_{ij}\), then the estimated transition probability \(\hat{a}_{ij}\) is computed as \(\hat{a}_{ij} = A_{ij}/\sum_{k = 0}^{N-1} A_{ik}\), where \(N\) is the number of hidden states. The same method is applied to compute the emission matrix and the initial matrix.
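
As a minimal sketch of this count-and-normalise estimate, the following code computes the transition matrix from annotated state sequences. It assumes hidden states encoded as integers from 0 to N-1; the function name `estimate_transitions` and the toy sequences are hypothetical, and rows without any observed transition are set to zero as an arbitrary convention.

```python
def estimate_transitions(sequences, n_states):
    """Maximum likelihood estimate of the transition probabilities.

    counts[i][j] plays the role of A_ij: the number of times state i
    at time t is followed by state j at time t+1.
    """
    counts = [[0] * n_states for _ in range(n_states)]
    for seq in sequences:
        for i, j in zip(seq, seq[1:]):
            counts[i][j] += 1
    probs = []
    for row in counts:
        total = sum(row)
        # Normalise each row; rows never visited stay at zero.
        probs.append([c / total if total else 0.0 for c in row])
    return probs


annotated = [[0, 0, 1, 1, 1, 0], [1, 1, 0, 0]]
a_hat = estimate_transitions(annotated, 2)
```

For these two toy sequences the counts are \(A_{00}=2\), \(A_{01}=1\), \(A_{10}=2\) and \(A_{11}=3\), so the first row of the estimate is (2/3, 1/3) and the second row is (2/5, 3/5).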

With the embedded envelop sequence, we first pad each sequence at its beginning with 0s (indicating non-interaction states), using a padding of size \(\mathit{winsize}-1\). We then use the transformed envelop signals instead of the binary signals to define the hidden states and the observation states, as well as to determine all the matrices. Finally, as the output of the Viterbi algorithm we obtain a sequence of envelops, where the last item of each envelop is the predicted interaction/non-interaction state of the corresponding time step.

The reconstruction accuracy and the confusion matrix of the HMM method are shown in Table 5 for window size \(\mathit{win}=6\).
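
The envelop embedding described here can be sketched as a sliding window over the padded sequence. This is an illustration under the assumption that each envelop is simply the tuple of the last winsize states ending at the current time step; the function names `embed_envelopes` and `last_items` are hypothetical.

```python
def embed_envelopes(states, winsize):
    """Turn a binary sequence into one envelop (tuple) per time step.

    The sequence is padded with winsize-1 zeros at the beginning, so
    the envelop at step t covers the states from t-winsize+1 to t.
    """
    padded = [0] * (winsize - 1) + list(states)
    return [tuple(padded[t:t + winsize]) for t in range(len(states))]


def last_items(envelopes):
    """Read the per-step state back from a sequence of envelops."""
    return [env[-1] for env in envelopes]


states = [1, 0, 1]
envelopes = embed_envelopes(states, 3)
recovered = last_items(envelopes)
```

Each distinct tuple can then be treated as one symbol of the HMM, and taking the last item of every decoded envelop gives back a binary sequence of the original length.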

Table 5 Confusion matrix with accuracy of the HMM model with window size \(\mathit{win}=6\)

B.3 Bi-directional LSTM methods

For each BiLSTM method we used an envelop of size \(\mathit{winsize}\), centred on the state we wanted to reconstruct (as demonstrated in Fig. 3e). We pad void signals at the beginning and at the end of each sequence, with size \(\lfloor \mathit{winsize}/2 \rfloor \) on each side. More precisely, the padded void signal is the vector \((-95,-95,0)\) for BiLSTM-RSSI, the vector \((1,0)\) for BiLSTM-logi, and the single number 0 for BiLSTM-bin.

We merged the outputs of the two LSTMs by concatenation, which doubles the size of the output passed to the next layer. For the training, we split our labelled data into 10 clips of around 25 minutes each, then used nested cross validation to select the best hyper-parameters and to examine the performance of each reconstruction method. The confusion matrices of the three BiLSTM reconstruction tasks are shown in Tables 6, 7 and 8. The accuracy of BiLSTM-RSSI reached 0.9003, which is the best among all the tested methods.
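
The symmetric padding and windowing can be sketched as below. The sketch assumes an odd window size, so that the window is exactly centred on the target step; the function name `centred_windows` and the sample signal values are made up for illustration, while the void vector \((-95,-95,0)\) is the one quoted above for BiLSTM-RSSI.

```python
def centred_windows(signal, winsize, void):
    """Return one window of length winsize per time step.

    The signal is padded with winsize // 2 void items on each side,
    so that for odd winsize the middle item of window t is signal[t].
    """
    half = winsize // 2
    padded = [void] * half + list(signal) + [void] * half
    return [padded[t:t + winsize] for t in range(len(signal))]


VOID_RSSI = (-95, -95, 0)  # void signal for BiLSTM-RSSI, as stated in the text
# Hypothetical three-component input vectors, values chosen arbitrarily.
signal = [(-60, -62, 1), (-70, -71, 1), (-80, -85, 1)]
windows = centred_windows(signal, 3, VOID_RSSI)
```

The same function would cover the other two variants by passing `(1, 0)` or `0` as the void item.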

Table 6 Confusion matrix with accuracy of the BiLSTM-RSSI for event reconstruction
Table 7 Confusion matrix with accuracy of the BiLSTM-logi for event reconstruction
Table 8 Confusion matrix with accuracy of the BiLSTM-bin for event reconstruction
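
The nested cross validation over clips can be illustrated by enumerating the data splits only. The source does not specify the fold structure, so this sketch assumes a leave-one-clip-out scheme at both levels: the outer loop holds out one clip for testing and the inner loop holds out one of the remaining clips for hyper-parameter selection. The function name `nested_cv_splits` and the clip indices are hypothetical.

```python
def nested_cv_splits(clip_ids):
    """Yield (test_clip, validation_clip, training_clips) triples.

    Outer loop: one clip is held out to measure final performance.
    Inner loop: one of the remaining clips is held out to compare
    hyper-parameters; all other clips are used for training.
    """
    for test in clip_ids:
        rest = [c for c in clip_ids if c != test]
        for val in rest:
            train = [c for c in rest if c != val]
            yield test, val, train


clips = list(range(10))  # 10 labelled clips, indexed 0..9
splits = list(nested_cv_splits(clips))
```

The key property is that the test clip of an outer fold is never seen during training or hyper-parameter selection for that fold.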

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit

About this article

Cite this article

Dai, S., Bouchet, H., Nardy, A. et al. Temporal social network reconstruction using wireless proximity sensors: model selection and consequences. EPJ Data Sci. 9, 19 (2020).
