Complex decision-making strategies in a stock market experiment explained as the combination of few simple strategies

Many studies have shown that there are regularities in the way human beings make decisions. However, our ability to obtain models that capture such regularities and can accurately predict unobserved decisions is still limited. We tackle this problem in the context of individuals who are given information about the evolution of market prices and asked to guess the direction of the market. We use a network inference approach with stochastic block models (SBM) to find the model and network representation that is most predictive of unobserved decisions. Our results suggest that users mostly use recent information (about the market and about their previous decisions) to guess. Furthermore, the analysis of SBM groups reveals a set of strategies used by players to process information and make decisions that is analogous to behaviors observed in other contexts. Our study provides an example of how to quantitatively explore human behavior strategies by representing decisions as networks and using rigorous inference and model-selection approaches.


Introduction
In recent years, thanks to the widespread use of the internet, e-mail and mobile phone technologies, we have been able to gather large amounts of data that have enabled the large-scale characterization of specific traits of human behavior [1,2,3,4,5]. Indeed, a number of studies have shown that humans display statistically regular patterns in the way they move, communicate or make decisions, which makes them identifiable [6,7,8,9,10,11,12]. Despite the success in characterizing such systems, there has been comparatively less work to assess whether there are interpretable models that are truly predictive of unobserved behavior [11].
A compelling example is the study of decisions made by individuals when playing dyadic games that represent social dilemmas. A recent study [6] identified five different patterns in the strategies individuals use to play these games; these individual strategies, or behavioral phenotypes, deviate from optimal rational behavior and can be associated with common human attitudes such as jealousy, optimism or altruism [6,13,14]. However, such a model is not the best for making predictions of unobserved individual decisions in the context of dyadic games. Indeed, a Bayesian modeling approach using a network representation of individual decisions in dyadic games showed that a model in which individuals mix three simple strategies is more predictive of unobserved individual decisions than a five-phenotype model [15]. Moreover, the Bayesian modeling approach reveals that the way individuals perceive games is different from the expectations based on game theory arguments [15].
Here, we use a similar network inference approach to model and understand the decision-making process in the context of stock markets. Specifically, we consider the situation in which individuals have to guess the short-term direction (up or down) of a stock market based on real data given a reduced set of available sources of information, such as the evolution of the market in the previous time step, average historical market trends, information on other markets or the advice of an expert [16]. Several questions arise in this context; for example: in an environment in which we are given more information than what we can process, which sources of information affect the decisions we make the most? How do we use this information to make decisions? Can we identify recurrent, context-dependent patterns of information usage that are predictive of unobserved decisions?
To address these questions, we assume that the information available to an individual fully defines the context in which she makes a decision; the decision is then the result of a specific strategy in which the individual exploits the available information. To define these strategies, we assume that there are some underlying patterns of behavior so that we can identify groups of individuals that display similar decision patterns and at the same time, identify groups of contexts that are similarly perceived by individuals. In our approach we exploit the fact that we can represent the decisions made by individuals in a specific context (for instance, the market went up in the previous round and the individual made a correct guess) as a bipartite graph connecting individuals and contexts, and use inference techniques developed for complex networks. In particular, we use stochastic block models, a type of group-based generative models that are amenable to Bayesian inference methods, including rigorous model selection [17,15]. Our choice is further motivated by the fact that models in which individuals can mix more than one strategy have been shown to be successful at predicting unobserved individual decisions in other contexts [18,10,15].
With our network inference approach, we are able to rigorously compare network data representations that define contexts using different types of information (for instance, information about the previous round or a historical trend). We find that users are rather Markovian when it comes to processing available information: individuals are best described if we assume that they only use information from the previous and current round to make their decisions. We also find that, consistent with some previous analysis [16], individuals use the information of the previous and current rounds in different ways to construct four distinct strategies: a switching strategy, in which individuals make a different decision at each time step; an optimistic strategy, in which individuals tend to predict that the market will go up; a repeating strategy, in which individuals repeat their previous decision; and a win-stay lose-shift strategy, in which individuals copy the previous market move.
Our inference approach thus makes it possible to identify the best representation of the data in terms of predictability. Additionally, through this representation we can explore the regularities in the way individual players use information to make decisions, thus providing a valuable illustration of the power of inference methodologies to advance our understanding of human behavior from data.

Dataset
Here, we consider the data set collected in the Mr. Banks social experiment [16] [1]. In this experiment, participants played a game that consisted of correctly predicting the evolution of a simulated market, that is, whether it would go up (↑; stocks increase their value) or down (↓; stocks lose value). In total, 280 people participated in this experiment. Participants played for 25 rounds that corresponded to 25 days of a real stock market. The experiment used 30 different time series of 25 days in a real market, taken from the daily prices of the Spanish IBEX, the German DAX and the S&P500 from the United States in the period from 01/02/2006 to 12/29/2009.
In each round, players had access to the following sources of information: the evolution of the stock market during the previous month, a simulated expert's advice that, by design, was correct 60% of the time, the trend of the same market in other places in the world, the average trend of the market over the previous 5 and 30 days, and the daily changes of direction of the market during the 30 previous days. We refer to this information as the context of the player.

Network models and inference
Model selection. To find the best model for our data, we compare models in terms of their ability to predict unobserved decisions. Asymptotically, and most often in practice as well [17], this is equivalent to using Bayesian inference for model selection. In the Bayesian setting, the best model is the one that has the largest posterior probability $p(M|D)$ (or equivalently the model that has the shortest description length [17,19]). Here, we maximize the posterior $p(M|D)$ to obtain model parameters, and follow a predictive approach to find, first, which type of SBM better describes observed data and, second, which is the network representation with which we get the most accurate predictions of unobserved decisions.
Network representation. Our goal is to predict whether a given player p will guess that the market is going up or down when exposed to a specific context c, given a set of past observed decisions $R^o$ of the player herself and other individual players in different contexts. As noted above, a context c is the information accessible to the player before making her decision.
We represent the data as a bipartite network in which nodes are players and contexts. We draw an edge with value 1 or 0 between a player p and a context c if, in that context, the player guessed ↑ or ↓, respectively. For instance, consider that the available information at round t is: 1) whether the guess of the player at round t − 1 was right or wrong, C = {R, W}; and 2) the market evolution at round t − 1, B = {↑, ↓}. Then, there are four possible contexts, one for each CB combination (R↑, R↓, W↑ and W↓), and we would represent our data as a bipartite network with N players and four different contexts.
[1] The datasets analysed during the current study are available in the Zenodo repository, https://zenodo.org/record/50429#.YDOCwqvPxPY
Because we have information about the guessing history of all users, it would be possible to build many bipartite graphs in which contexts consider only current information, current information and information from the previous round, current information and information from the previous two rounds, etc. However, for our inference approach to work properly, we need to keep the number of possible contexts small enough to detect statistical patterns between individuals and contexts. Besides that, we have no a priori justification to assume that some choices of context are better than others. In our approach, assessing the predictive power over unobserved decisions allows us to select both the best model and the best set of contexts to represent the data.
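As a rough illustration of this representation, the following Python sketch builds the player-context edge counts for the two-source example (outcome of the previous guess and previous market move). The record format, the labels `"up"`/`"down"` and the function name `build_bipartite_counts` are assumptions made for the example and are not taken from the original data set.

```python
from collections import Counter
from itertools import product

# Hypothetical record format: (player_id, previous_outcome, previous_market, guess),
# with previous_outcome in {"R", "W"} and previous_market, guess in {"up", "down"}.
OUTCOMES = ("R", "W")
MARKET = ("up", "down")

# The four contexts of the example: every combination of C and B.
CONTEXTS = list(product(OUTCOMES, MARKET))


def build_bipartite_counts(records):
    """Return {(player, context): Counter} counting up/down guesses per edge."""
    edges = {}
    for player, outcome, market, guess in records:
        context = (outcome, market)
        edges.setdefault((player, context), Counter())[guess] += 1
    return edges


# Made-up decisions, for illustration only.
records = [
    ("p1", "R", "up", "up"),
    ("p1", "R", "up", "up"),
    ("p1", "W", "down", "down"),
    ("p2", "R", "up", "down"),
]
edges = build_bipartite_counts(records)
print(len(CONTEXTS))
print(edges[("p1", ("R", "up"))])
```

Storing counts per edge, rather than a single 0/1 value, keeps repeated visits of a player to the same context; richer context definitions only change how the `context` tuple is built.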
Single-membership and mixed-membership stochastic block models. In our approach, we assume that there are statistical regularities in the way individuals make decisions in different contexts. These regularities define the strategic behaviors of players. In the network representation, we assume that the statistical regularities take the form of groups of players and groups of contexts with similar connections.
Consistent with this assumption, we model player decisions using stochastic block models [20,21,22,23]. Stochastic block models are simple generative network models that assume the existence of groups of nodes (players and contexts), and that the probability that a pair of nodes connects (that is, that a player guesses up or down in a given context) depends exclusively on the groups to which the nodes belong.
Specifically, because the network we consider has two types of nodes, players and contexts, we consider bipartite stochastic block models [24,18,25]. In addition, we consider two types of stochastic block models: one in which each player and context can belong to a single group (SBM), and another one in which players and contexts can belong to multiple groups simultaneously with different weights (mixed-membership SBM, MMSBM) [26,27].
Formally, we have a set U of players and a set I of contexts, and the observed decisions $R^o = \{r_{pc}\}$ that players p ∈ U make in contexts c ∈ I. In the data we consider, each player p in a specific context c has to guess the direction of the market in the next round; therefore the decision $d_{pc}$ is binary: ↑ or ↓.
We assume that there are K groups of players and L groups of contexts, and that the probability that player p in group k, in context c in group $\ell$, makes decision $d_{pc} = {\uparrow}$ is given by $p_{k\ell}$, where p is the matrix of connection probabilities between pairs of groups. Note that, because there are only two possible decisions, the probability that $d_{pc} = {\downarrow}$ is $1 - p_{k\ell}$.
In the case of MMSBM, we allow players and contexts to belong to more than one group. We therefore introduce a membership vector for players $\theta_p$, such that $\theta_{pk}$ is the probability that player p belongs to group k. Analogously, we introduce a membership vector for contexts $\eta_c$ such that $\eta_{c\ell}$ is the probability that context c belongs to group $\ell$. Because these vectors represent probabilities they are subject to the normalization conditions:
$$\sum_{k=1}^{K} \theta_{pk} = 1, \qquad \sum_{\ell=1}^{L} \eta_{c\ell} = 1.$$
Note that membership vectors become binary in the single-membership SBM, so that if player p belongs exclusively to group k, then $\theta_{pk} = 1$ and $\theta_{pk'} = 0$ for all $k' \neq k$.
In the general MMSBM, the probability that player p makes decision $d_{pc} = {\uparrow}$ in context c is:
$$\Pr[d_{pc} = {\uparrow}] = \sum_{k,\ell} \theta_{pk}\, \eta_{c\ell}\, p_{k\ell},$$
and the probability of $d_{pc} = {\downarrow}$ is obtained by replacing $p_{k\ell}$ with $1 - p_{k\ell}$. SBMs further assume that each decision is independent from the others (conditionally on the group memberships), so that the probability of observing the data given the model parameters θ, η, p (or likelihood) can be expressed as the product of the probabilities of each individual decision:
$$p(R^o \mid \theta, \eta, p) = \prod_{(p,c) \in R^o} \Big[\sum_{k,\ell} \theta_{pk}\, \eta_{c\ell}\, p_{k\ell}\Big]^{n^{\uparrow}_{pc}} \Big[\sum_{k,\ell} \theta_{pk}\, \eta_{c\ell}\, (1 - p_{k\ell})\Big]^{n^{\downarrow}_{pc}},$$
where $n^{\uparrow}_{pc}$ is the number of times player p guesses ↑ in context c, and $n^{\downarrow}_{pc}$ is the number of times player p guesses ↓ in context c.
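To make the mixed-membership likelihood concrete, here is a minimal Python sketch. It assumes the standard mixed-membership form, in which the probability of guessing ↑ is the sum over group pairs of θ_pk · η_cl · p_kl, and it stores the counts of ↑ and ↓ guesses per player-context pair as a tuple. The function names and the toy numbers are illustrative assumptions, not values from the study.

```python
import math


def prob_up(theta_p, eta_c, p):
    """Probability of guessing up: sum over k, l of theta_p[k] * eta_c[l] * p[k][l]."""
    return sum(
        theta_p[k] * eta_c[l] * p[k][l]
        for k in range(len(theta_p))
        for l in range(len(eta_c))
    )


def log_likelihood(counts, theta, eta, p):
    """Log-likelihood of the observed counts.

    counts maps (player, context) -> (n_up, n_down).
    Assumes 0 < prob_up < 1 for every pair that has observations.
    """
    total = 0.0
    for (player, context), (n_up, n_down) in counts.items():
        q = prob_up(theta[player], eta[context], p)
        if n_up:
            total += n_up * math.log(q)
        if n_down:
            total += n_down * math.log(1.0 - q)
    return total


# Toy parameters: K = 2 player groups, L = 1 context group (made-up values).
theta = {"a": [1.0, 0.0], "b": [0.5, 0.5]}
eta = {"c": [1.0]}
p = [[0.9], [0.1]]
counts = {("a", "c"): (3, 1), ("b", "c"): (1, 1)}
print(prob_up(theta["b"], eta["c"], p))
print(log_likelihood(counts, theta, eta, p))
```

Player `"a"` behaves as in a single-membership SBM (a binary membership vector), while player `"b"` mixes the two groups equally.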
Inference and prediction. A priori, we are agnostic about the values that the model parameters should take. We therefore use a non-informative prior so that the posterior probability of the model is proportional to the likelihood, $p(\theta, p, \eta \mid R^o) \propto p(R^o \mid \theta, p, \eta)$. We then find the model parameters that maximize the posterior. In the case of the SBM we use simulated annealing to find the set of model parameters $(\theta^*, \eta^*, p^*)$ that maximizes the posterior probability [15]. In the case of the MMSBM, we use the expectation maximization approach described in [28,15] (see Materials and Methods). We make our predictions about unobserved decisions using these maximum a posteriori parameters.
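The expectation maximization step for the MMSBM can be sketched in plain Python as follows. This is only an illustration under stated assumptions: it uses the textbook EM updates for a mixed-membership block model with binary outcomes (responsibilities proportional to θ_uk · η_il · p_kl(r), followed by normalized expected counts), random initialization and a fixed number of iterations. The function name `fit_mmsbm`, its arguments and the toy counts are hypothetical, and the procedure of [28,15] may differ in detail.

```python
import math
import random


def fit_mmsbm(counts, n_players, n_contexts, K, L, n_iter=100, seed=0):
    """EM sketch for a bipartite MMSBM with binary decisions.

    counts maps (player_index, context_index) -> (n_up, n_down).
    Returns (theta, eta, p_up, history); history holds the log-likelihood
    of the parameters at the start of each iteration.
    """
    rng = random.Random(seed)

    def random_simplex(n):
        w = [rng.random() + 1e-3 for _ in range(n)]
        s = sum(w)
        return [x / s for x in w]

    theta = [random_simplex(K) for _ in range(n_players)]
    eta = [random_simplex(L) for _ in range(n_contexts)]
    p_up = [[rng.uniform(0.1, 0.9) for _ in range(L)] for _ in range(K)]
    history = []

    for _ in range(n_iter):
        acc_theta = [[0.0] * K for _ in range(n_players)]
        acc_eta = [[0.0] * L for _ in range(n_contexts)]
        acc_up = [[0.0] * L for _ in range(K)]   # expected number of "up" guesses
        acc_all = [[0.0] * L for _ in range(K)]  # expected number of guesses
        loglik = 0.0
        for (u, i), (n_up, n_down) in counts.items():
            for n_obs, is_up in ((n_up, True), (n_down, False)):
                if n_obs == 0:
                    continue
                # E-step: unnormalized responsibilities omega_ui(k, l)
                w = [[theta[u][k] * eta[i][l]
                      * (p_up[k][l] if is_up else 1.0 - p_up[k][l])
                      for l in range(L)] for k in range(K)]
                z = sum(sum(row) for row in w)
                loglik += n_obs * math.log(z)
                for k in range(K):
                    for l in range(L):
                        r = n_obs * w[k][l] / z
                        acc_theta[u][k] += r
                        acc_eta[i][l] += r
                        acc_all[k][l] += r
                        if is_up:
                            acc_up[k][l] += r
        history.append(loglik)
        # M-step: normalized expected counts (rows without data are kept as is)
        theta = [[x / sum(row) for x in row] if sum(row) > 0 else old
                 for row, old in zip(acc_theta, theta)]
        eta = [[x / sum(row) for x in row] if sum(row) > 0 else old
               for row, old in zip(acc_eta, eta)]
        p_up = [[acc_up[k][l] / acc_all[k][l] if acc_all[k][l] > 0 else p_up[k][l]
                 for l in range(L)] for k in range(K)]
    return theta, eta, p_up, history


# Made-up counts for 3 players and 2 contexts, for illustration only.
toy_counts = {(0, 0): (5, 0), (0, 1): (4, 1), (1, 0): (0, 5),
              (1, 1): (1, 4), (2, 0): (5, 0), (2, 1): (0, 5)}
theta, eta, p_up, history = fit_mmsbm(toy_counts, 3, 2, K=2, L=2)
print(history[0], history[-1])
```

Because EM only finds a local maximum, in practice one would typically restart from several random initializations and keep the run with the highest likelihood.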

Model selection
We start by looking for the model that best describes our data, that is, the most predictive one. We consider three different models and assess their ability to make predictions by measuring their accuracy on unobserved data using 5-fold cross-validation [17]. The first model we consider is a naive baseline, in which we use the most common observed decision of player p in context c as a prediction. If there are no observed decisions of player p in context c, we predict $d_{pc} = {\uparrow}$ because this is the most common decision (and the most common market move) in our data set. The other two models are the (single-membership) SBM and the (mixed-membership) MMSBM described above.
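The naive baseline described above can be sketched in a few lines of Python. The data layout (triples of player, context and guess) and the function names are assumptions for the example; the handling of ties, which the text does not specify, is also an assumption (ties fall back to ↑).

```python
from collections import Counter, defaultdict


def fit_naive_baseline(train):
    """train: iterable of (player, context, guess) with guess in {'up', 'down'}."""
    tallies = defaultdict(Counter)
    for player, context, guess in train:
        tallies[(player, context)][guess] += 1
    return tallies


def predict_naive(tallies, player, context, default="up"):
    """Most common observed guess of the player in the context."""
    tally = tallies.get((player, context))
    if not tally or tally["up"] == tally["down"]:
        # Unseen pair (as in the text) or tie (assumption): predict the default.
        return default
    return "up" if tally["up"] > tally["down"] else "down"


def accuracy(tallies, test):
    """Fraction of held-out decisions that the baseline predicts correctly."""
    hits = sum(predict_naive(tallies, p, c) == g for p, c, g in test)
    return hits / len(test)


# Made-up training and test decisions, for illustration only.
train = [("p1", "c1", "down"), ("p1", "c1", "down"), ("p1", "c1", "up"),
         ("p2", "c1", "up")]
test = [("p1", "c1", "down"), ("p2", "c2", "up"), ("p2", "c1", "down")]
model = fit_naive_baseline(train)
print(accuracy(model, test))
```

In a 5-fold cross-validation, `fit_naive_baseline` would be called on four folds and `accuracy` on the remaining one, and the five accuracies would then be averaged.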
Each set of contexts defines a different network representation. We compute the average predictive accuracy of each model over folds and over network representations (Fig. 5). We find that, overall, MMSBM models perform better than the naive baseline and single-membership SBMs. Therefore, in what follows we use MMSBMs to identify the best representation of the data and to analyze player strategies.
Identification of the most predictive network representation of the data
Next, we identify the most predictive network representation for our data so as to establish which pieces of information and mechanisms are being used in the decision-making process. As we have already mentioned, we have the full history of players' decisions, daily and average market evolution, and information on whether players consult the expert opinion before making their decisions. We find that past information beyond the previous round t − 1 is not relevant to predict decisions at round t (Fig. 2 and Supplementary Material). We therefore consider 23 different network representations that mostly consider information about the user and the market at round t − 1 and the expert's advice at round t, and compare the performance of MMSBMs (fit to each of the different network representations) at predicting unobserved data (Fig. 2). To reduce the effect of fold-to-fold variability, we use the average log-ratio of the accuracy of pairs of models as our metric to compare predictive performance:
$$Q_{S_1 S_2} = \frac{1}{N_{\mathrm{folds}}} \sum_{i=1}^{N_{\mathrm{folds}}} \log \frac{A_{S_1,i}}{A_{S_2,i}},$$
where $A_{S,i}$ is the predictive accuracy in fold i for representation S, and $N_{\mathrm{folds}}$ is the total number of folds. Note that by taking the logarithm of the ratio, we ensure that the metric is symmetric with respect to zero. We find that there are five representations that yield an almost identical accuracy, significantly higher than that of the other representations (Fig. 2). Among those, we select the simplest representation, which comprises 12 different contexts characterized by three sources of information available to players at round t: the market evolution at round t − 1 (↑ or ↓), the outcome of the player's guess at round t − 1 (right or wrong), and the expert's advice at round t (↑, not consulted, or ↓). Importantly, adding further information does not increase the predictive accuracy. Our analysis thus shows that players have short memory. This result is consistent with the findings of Ref. [16] on the same data set.
While it is possible that our data set is not big enough to capture effects beyond round t − 1, our result is consistent with other studies that have successfully used Markovian descriptions of human behavior in the analysis of decision-making processes [29,30,31].
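The average log-ratio used above to compare two representations can be computed as in the following Python sketch. The per-fold accuracies are made-up numbers used only to show the sign convention, the natural logarithm is an assumption (the text does not state the base), and `mean_log_ratio` is a hypothetical helper name.

```python
import math


def mean_log_ratio(acc_s1, acc_s2):
    """Average over folds of log(A_{S1,i} / A_{S2,i}); positive if S1 is more accurate."""
    if len(acc_s1) != len(acc_s2) or not acc_s1:
        raise ValueError("need the same, non-zero number of folds for both representations")
    return sum(math.log(a / b) for a, b in zip(acc_s1, acc_s2)) / len(acc_s1)


# Illustrative per-fold accuracies for two representations (5 folds).
acc_a = [0.70, 0.72, 0.68, 0.71, 0.69]
acc_b = [0.65, 0.66, 0.64, 0.67, 0.65]
q_ab = mean_log_ratio(acc_a, acc_b)
q_ba = mean_log_ratio(acc_b, acc_a)
print(q_ab, q_ba)
```

Swapping the two representations only flips the sign of the metric, which is the symmetry with respect to zero mentioned in the text.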

Each group of users has well-defined patterns of behavior
Next, we turn to the model parameters obtained for the most predictive representation of the data. In particular we focus on player groups, which we identify with guessing strategies [15]. As mentioned before, MMSBMs assume that players can mix several strategies and that contexts can also belong to different groups simultaneously. In our case, we find that the highest predictive accuracy is for an MMSBM with K = 4 groups of players (or strategies) and L = 8 groups of contexts (see Supplementary Material). Interestingly, we find that contexts tend to belong to a single group of contexts, while players have their memberships spread across different groups. In other words, the players' behavior is the result of mixing different strategies (see Supplementary Material).
The group-to-group probabilities $p_{k\ell}$ express the probability that a group of players k guesses ↑ when facing contexts that belong to group $\ell$, and therefore encapsulate all the information of the strategy of each group of players. For a more straightforward interpretation of the strategies (and because each context belongs mostly to a single group), we show the matrix $\tilde{p}_{kc} = \sum_{\ell} p_{k\ell}\, \eta_{c\ell}$ corresponding to the probability that a group of players k guesses ↑ in each context c (Fig. 5). We refer to each one of the rows in $\tilde{p}_{kc}$ as an elementary strategy, since players combine these elementary strategies to give rise to observed complex strategies. Figure 5 shows that elementary strategies are well defined because the $\tilde{p}_{kc}$ are often close to either zero or one.
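The per-context strategy matrix discussed above amounts to a small matrix product. The sketch below assumes that the matrix is obtained by summing p_kl · η_cl over context groups l, and uses made-up parameter values; `strategy_matrix` is a hypothetical helper name.

```python
def strategy_matrix(p, eta):
    """Return the K x C matrix with entries sum over l of p[k][l] * eta[c][l].

    p:   K x L list of group-to-group probabilities of guessing up.
    eta: C x L list of context membership vectors (each row sums to 1).
    """
    return [
        [sum(p_k[l] * eta_c[l] for l in range(len(eta_c))) for eta_c in eta]
        for p_k in p
    ]


# Toy example: 2 player groups, 2 context groups, 3 contexts (made-up values).
p = [[0.9, 0.1],
     [0.2, 0.8]]
eta = [[1.0, 0.0],   # context 0 belongs only to context group 0
       [0.0, 1.0],   # context 1 belongs only to context group 1
       [0.5, 0.5]]   # context 2 is split between both groups
p_tilde = strategy_matrix(p, eta)
for row in p_tilde:
    print(row)
```

When every context belongs to a single group, as the text reports is mostly the case, each column of the result is simply a copy of the corresponding column of `p`.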
To help summarize and interpret elementary strategies, we note that in the $\tilde{p}_{kc}$ matrix we can identify several simple, easily interpretable patterns that we can use as alternative building blocks to describe decision-making strategies:
• Win-stay (WS): at round t repeat the guess of round t − 1 if it was right.
The first three patterns were already identified in the behavior of some players in the original data collection study, as well as a general bias towards ↑ when making decisions [16]. However, whether these strategies were combined with others or used by different users was not explored.
To assess how each one of the elementary strategies aligns with these building blocks for behavioral patterns, we define for each group k a score $M_{kb}$ that quantifies the extent to which players within group k follow behavioral pattern b ∈ {MI, WS, LS, RPT, RPTU, RPTD, EXP}:
$$M_{kb} = \frac{1}{N_{kb}} \sum_{c \in S_b} N_{kc} \left[ 2\, q_{kc}(b) - 1 \right].$$
In the summation, $S_b$ represents the set of contexts in which behavioral pattern b can be observed (for example, while RPTD can be observed after a user guessed down in the previous round, RPTU cannot), $q_{kc}(b)$ is the probability that a user in group k follows pattern b in context c, $N_{kc}$ is the number of times a player in group k faces context c, and $N_{kb} = \sum_{c \in S_b} N_{kc}$ is the total number of observations of players in group k facing contexts $c \in S_b$. Therefore, a score $M_{kb} = 1$ means that group k always follows pattern of behavior b when facing contexts $c \in S_b$; a score $M_{kb} = -1$ means that the group does exactly the opposite of what the pattern prescribes in 100% of cases. For instance, if group k always guesses ↑ regardless of the context, the scores for RPTU and RPTD will be $M_{k,\mathrm{RPTU}} = 1$ (always guess ↑ after having guessed ↑) and $M_{k,\mathrm{RPTD}} = -1$ (always guess ↑ after having guessed ↓).
In Fig. 5, we show the $M_{kb}$ scores for each of the groups of players. First, we note that when the expert is consulted, three of the groups follow her advice, while the remaining group is not influenced by it. When the expert is not consulted, we find that each group follows a very specific strategy. Group $k_1$ strongly tends to change its previous decision (switching behavior); $k_2$ tends to guess ↑ (optimistic behavior); $k_3$ almost always repeats the previous decision (repeating behavior); and $k_4$ always follows a win-stay lose-shift (WS-LS) behavior.
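The score can be illustrated with a short Python sketch. It assumes that the score is the observation-weighted average of 2·q − 1 over the contexts in which the pattern can be observed, which is the simplest form consistent with the stated range from −1 (always the opposite) to 1 (always follows). The context labels `prev_up`/`prev_down`, the counts and the function name are made up for the example.

```python
def pattern_score(q, n, contexts_b):
    """Weighted score in [-1, 1] for one group and one behavioral pattern.

    q[c]: probability that the group follows the pattern in context c.
    n[c]: number of times players of the group faced context c.
    contexts_b: contexts in which the pattern can be observed (S_b).
    Assumed form: sum over c of n[c] * (2 * q[c] - 1), divided by the sum of n[c].
    """
    total = sum(n[c] for c in contexts_b)
    if total == 0:
        raise ValueError("the pattern is never observable for this group")
    return sum(n[c] * (2.0 * q[c] - 1.0) for c in contexts_b) / total


# Example from the text: a group that always guesses up.
n = {"prev_up": 12, "prev_down": 8}      # made-up observation counts
q_rptu = {"prev_up": 1.0}                # repeats "up" whenever it guessed up before
q_rptd = {"prev_down": 0.0}              # never repeats "down"
print(pattern_score(q_rptu, n, ["prev_up"]))
print(pattern_score(q_rptd, n, ["prev_down"]))
```

A group that follows a pattern in half of the relevant observations gets a score of zero, so scores near zero indicate no systematic use of that pattern.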

Discussion
Our analysis shows that players' guesses about market evolution are best described, in terms of predictive accuracy, as a mixture of elementary strategies, which we have termed switching, optimistic, repeating, and win-stay lose-shift (WS-LS).
Interestingly, not all of these elementary strategies are combined in the same way by all of the players (Fig. 5). The most used strategy overall is the WS-LS strategy, which on average explains 34% of players' decisions, that is, $\frac{1}{N} \sum_{u} \theta_{u,\mathrm{WS\text{-}LS}} = 0.34$. We observe this trend both for players using a single elementary strategy (low entropy in Fig. 5) and for those combining various elementary strategies (high entropy). This finding is consistent with the conclusions of numerous studies assessing the wide use of this behavior in various natural systems such as children's education [32,33], evolutionary systems [34,35], or games involving adversity [36].
By contrast, the optimistic behavior is seldom used by players who rely on a single strategy (Fig. 5a, d). Nonetheless, the optimistic strategy accounts on average for 25% of players' behavior, suggesting that it is common when used in combination with other strategies, which reflects the overall bias to guess ↑ regardless of the context in which the decision is made [16].
Finally, we note that the two remaining elementary behaviors (switching decisions and repeating the previous decision) are complementary to one another, since if a player does not repeat her decision, it means that she changes her decision. Therefore, a player combining both strategies with equal probability would generate a sequence of random guesses. To investigate whether these strategies are the result of a behavioral pattern or just capture random sequences of ↑ and ↓, we measure, for each player p, the normalized difference $D_p$ between the probabilities of using each of the two strategies:
$$D_p = \frac{\theta_{p,\mathrm{RPT}} - \theta_{p,\mathrm{SWI}}}{\theta_{p,\mathrm{RPT}} + \theta_{p,\mathrm{SWI}}}.$$
If $D_p = 1$, player p exclusively uses the repeating strategy; if $D_p = -1$, she exclusively uses the switching strategy; if $D_p = 0$, both strategies are used equally. We compute $D_p$ only for players that use either switching or repeating as a main strategy, to avoid considering players that mostly use other strategies (WS-LS or optimistic). We find that while there are two peaks at extreme values ($D_p = -1$ and $D_p = 1$), showing that many players use exclusively one of the two strategies, players have a tendency to repeat their previous guess (i.e. the distribution of $D_p$ is skewed towards $D_p = 1$). Indeed, this observation shows that some players behave according to a natural persistence cognitive bias observed when an individual has to make repeated decisions [37,38].
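The balance between repeating and switching can be computed per player as in the following Python sketch. It takes the normalized difference to be the difference of the two memberships divided by their sum (so that it lies between −1 and 1), and restricts the computation to players whose largest membership is repeating or switching. The membership values and the helper names are invented for the example.

```python
def main_strategy(theta_p):
    """Name of the elementary strategy with the largest membership."""
    return max(theta_p, key=theta_p.get)


def repeat_switch_balance(theta_p):
    """(theta_rpt - theta_swi) / (theta_rpt + theta_swi), or None if both are zero."""
    rpt, swi = theta_p["repeating"], theta_p["switching"]
    if rpt + swi == 0:
        return None
    return (rpt - swi) / (rpt + swi)


# Made-up membership vectors for four players.
players = {
    "p1": {"switching": 0.0, "optimistic": 0.1, "repeating": 0.9, "ws_ls": 0.0},
    "p2": {"switching": 0.8, "optimistic": 0.2, "repeating": 0.0, "ws_ls": 0.0},
    "p3": {"switching": 0.4, "optimistic": 0.1, "repeating": 0.4, "ws_ls": 0.1},
    "p4": {"switching": 0.1, "optimistic": 0.1, "repeating": 0.1, "ws_ls": 0.7},
}

balances = {
    name: repeat_switch_balance(theta_p)
    for name, theta_p in players.items()
    if main_strategy(theta_p) in ("repeating", "switching")
}
print(balances)
```

Player `p4` is left out because her main strategy is WS-LS, mirroring the filtering described in the text.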

Conclusion
In this work we have shown the suitability of the MMSBM for the study of social systems, particularly for modeling, predicting and understanding human decision-making processes. MMSBMs not only provide more accurate predictions than other state-of-the-art methods [15,18], but they are also interpretable.
A close analysis of the model parameters highlights the ability of these models to decompose complex behaviors into a linear combination of elementary behaviors. Our analysis helps identify basic patterns of behavior that are consistent with human behavior in other contexts. First, we observe an optimistic bias that suggests that individuals make decisions based on their desire that the market goes up. Second, we observe that many players use a win-stay lose-shift strategy, which is known to be very efficient in the long run in learning processes, games of adversity, co-evolution networks, etc. Last, we find that players also display a tendency to repeat previous decisions, a behavior observed when individuals have to perform repetitive tasks. Our study demonstrates that these behavioral patterns are hidden to the naked eye but can be obtained from the data using our approach.
MMSBMs have been widely used in several other fields, including quantitative and computational social science [15]. This work confirms their relevance in the study of social systems. Their predictive accuracy and interpretability could be exploited in studies and experiments in sociology and psychology in order to improve the understanding of human behavior [39]. Moreover, MMSBMs offer an alternative framework to the conventional population analysis tools used in the social sciences, and could therefore be used directly to test and inform state-of-the-art theories in cognitive science, sociology, psychology, etc., and perhaps to uncover previously inaccessible hidden thought mechanisms.

Materials and methods
In order to maximize the logarithm of the likelihood, we use a variational approach. Here $p_{kl}(r)$ denotes the probability that a pair of groups (k, l) produces decision r, so that $p_{kl}(\uparrow) = p_{kl}$ and $p_{kl}(\downarrow) = 1 - p_{kl}$. First, we use a trick to change the logarithm of a sum into a sum of logarithms:
$$
\begin{aligned}
\ln p(R^o \mid \theta, \eta, p) &= \sum_{(u,i) \in R^o} \ln \sum_{k,l} \omega_{ui}(k,l)\, \frac{\theta_{uk}\, \eta_{il}\, p_{kl}(r_{ui})}{\omega_{ui}(k,l)} \\
&\geq \sum_{(u,i) \in R^o} \sum_{k,l} \omega_{ui}(k,l) \ln \frac{\theta_{uk}\, \eta_{il}\, p_{kl}(r_{ui})}{\omega_{ui}(k,l)}.
\end{aligned}
\tag{8}
$$
In the first line of Eq. 8, we introduced the function $\omega_{ui}(k,l)$, which is the probability for a node u belonging to group k to be linked by an edge $r_{ui}$ to node i belonging to group l. Using Jensen's inequality, $\ln \bar{x} \geq \overline{\ln x}$, we arrive at the last line of Eq. 8. Then, one notices that this inequality becomes an equality for:
$$\omega_{ui}(k,l) = \frac{\theta_{uk}\, \eta_{il}\, p_{kl}(r_{ui})}{\sum_{k',l'} \theta_{uk'}\, \eta_{il'}\, p_{k'l'}(r_{ui})},$$
which is the update equation for the expectation step. In order to maximize the log-likelihood, we differentiate it with respect to θ, η and p, using Lagrange multipliers to account for the normalization constraints. We obtain:
$$\theta_{uk} = \frac{1}{d_u} \sum_{i:(u,i) \in R^o} \sum_{l} \omega_{ui}(k,l), \qquad \eta_{il} = \frac{1}{d_i} \sum_{u:(u,i) \in R^o} \sum_{k} \omega_{ui}(k,l), \qquad p_{kl}(r) = \frac{\sum_{(u,i) \in R^o \mid r_{ui} = r} \omega_{ui}(k,l)}{\sum_{(u,i) \in R^o} \omega_{ui}(k,l)},$$
with $d_u$ the degree of node u and $d_i$ the degree of node i.

Figure 2 Comparison matrix of data representations using predictive accuracy. Each row/column corresponds to a specific data representation S. We label them according to the information we are using to define the set of contexts: A=player's decision at t − 1, B=market evolution at t − 1, C=outcome of decision at t − 1, D=expert consultation, E=indications consulted, F/G=average over the 5 last/all rounds, H/I=market's evolution/outcome before the previous round. Each matrix element $Q_{S_1 S_2}$ corresponds to the average log ratio of predictive accuracies (see text) between representations $S_1$, $S_2$ and is colored following the colorbar on the right hand side. Note that if $Q_{S_1 S_2} > 0$ (red), $S_1$ (row) has a larger predictive accuracy than $S_2$ (column). If $Q_{S_1 S_2} < 0$ (blue), $S_1$ (row) has a lower predictive accuracy than $S_2$ (column).

A score of 1 means the group completely follows a basic pattern, and a score of -1 means it does the exact opposite of the basic pattern. Every group displays a preferential (if not systematic) pattern of behavior: K4 behaves according to the WS-LS strategy, K3 repeats the previous guess, K2 always guesses ↑ and K1 changes the previous guess.
Additionally, the expert's advice is either followed or ignored, but none of the groups systematically makes the opposite decision. Note that because the expert is consulted in less than 30% of the rounds, we split the data set into expert consulted and not consulted. From the former we get the score for the EXP strategy, while from the latter we obtain the scores for the other behavioral patterns.
Figure 5 Distribution of use of behavioral patterns by players. We compute the Shannon entropy of the θ membership vectors among four behavioral patterns: switching, optimistic, repeating, and win-stay lose-shift (WS-LS). We split the users according to their Shannon entropy: (a,d) Low entropy, including players who typically do not mix elementary strategies; (b,e) Medium entropy, including players who mix two or three elementary strategies; (c,f) High entropy, including players who mix three or four elementary strategies. (a,b,c) Use of each elementary strategy for players whose membership vectors have low, medium and high entropy, respectively. (d,e,f) Membership vectors. Each row represents one player. The color code quantifies the extent to which a player uses a behavioral pattern. We observe that the win-stay lose-shift strategy is widely used alone, and that optimistic is commonly used in combination with other behavioral patterns.

Figure 6 Balance between repeat and switch strategies. We define $D_p = \frac{\theta_{p,\mathrm{RPT}} - \theta_{p,\mathrm{SWI}}}{\theta_{p,\mathrm{RPT}} + \theta_{p,\mathrm{SWI}}}$, so that $D_p = 0$ means that both strategies are used by players to the same extent. This plot shows that a significant number of players are using exclusively one of the strategies; other players use a combination of those with a bias towards repeating their previous guess, rather than switching.