In the Same game, participants produced slightly higher levels of segregation and higher average scores than the model predicted (Figures 2(a), 3(a)). In 17 of the 20 trials, the participants separated into two distinct groups, each occupying a side or corner of the grid (as in Figure 1(i)). The Same game was the easiest game for the participants to complete: nearly all groups converged to the end-game segregation and average scores in less than 30 seconds of play [Figure S5 in Additional file 1]. This was because it quickly became apparent which side ‘belonged’ to which color.
For the Diverse game, the segregation was low and similar to the model predictions. The participants continued to move for the entire two minutes of the experiment [Figure S5 in Additional file 1] but they eventually achieved the expected high levels of integration (Figure 2(b)) and high average scores (Figure 3(b)).
The experimental results for the Same and Diverse game did not reveal marked levels of segregation, contradicting model predictions. In most cases, participants actually achieved a good degree of integration (Figure 2(c)). Yet, their scores were not always lower than the simulated agents’ scores (Figure 3(c)). The high level of segregation in the simulation arose because a small number of agents quickly formed a ‘front line’ with an optimal mix of own- and other-color neighbors. The agents not on this front line then moved to their own-color side of it, which provided higher utility than the other-color side (Figure 1(g)). In contrast, experiment participants achieved a more uniform mix (Figure 1(k)), similar to the pattern in the Diverse game (Figure 1(j)).
The experimental results for the Same or Different game also deviated from the model. The students tended to produce higher levels of segregation than the simulated agents (Figure 2(d)). Further, the average scores show that the participants scored lower than the simulated agents (Figure 3(d)). A priori, we expected that the participants would recognize that the ‘same’ strategy was much easier to coordinate on and hence that they would converge to high levels of segregation, as they easily did in the Same game. During the experiments, students made repeated references to such a strategy, shouting things like ‘all the yellows up to the top.’ Four groups attempted to carry out this approach, but it took only one or two contrarians to upset the pattern [Figure S5D in Additional file 1]. Despite decreasing the scores of their neighbors, these contrarians acted rationally, since they had all non-same neighbors and thus maximum utility. As a result, no group managed to achieve the simpler one-color-neighborhood solution. The students also made suggestions about checkerboard solutions like ‘we should alternate…blue then yellow’ or ‘we’ll stand in groups of four two of each with one square distance between.’ Three groups managed to coordinate (or nearly coordinate) to create these more difficult checkerboard solutions. The rest of the groups failed to get close to a mutually beneficial configuration within the allotted time.
The vast majority of moves made by the participants were consistent with an attempt to maximize the utility functions provided to them. Figure 4 shows the average latency until a move is made as a function of neighbor types for the four games. Movement patterns differed greatly between games, but were consistent across trials within a game. The timing of the moves reflected a strong tendency towards higher scoring configurations. For example, in the Same game individuals with zero same neighbors typically moved after less than 2 seconds, while those with 5 or more same neighbors would remain stationary for more than 20 seconds. The motivation to get high scores was also reflected in the students’ discussions and exclamations during gameplay. These were primarily about maximizing points, and at no time in any of the trials did any of the students make any wider reference to segregation. Both the actions (in terms of movements made in the game) and verbal expressions were thus consistent with our assumption that the participants saw the game in terms of utility maximization.
Although the participants followed the incentives we assigned them, they produced different outcomes from those seen in the models. Previous research suggests that this could be due to high levels of behavioral noise [8]. Our participants did inadvertently commit errors, but these errors were not the driving mechanism. Instead, it appears that the participants used a strategy that differed from the one implemented in the simulation models. In line with previous work [6–8], our model assumes the best-response strategy, according to which individuals change their position only if doing so increases their utility. This assumption implies that individuals are able not only to identify better positions but also to recognize when no better positions exist. However, the participants in our experiment differed from the simulation in two important ways. Firstly, they were usually unwilling to ‘satisfice’: the participants moved whenever they did not obtain the perfect score. This is evident from the high mobility in less-than-optimal positions in Figure 4(b)-(d). Secondly, when they moved, the participants did not necessarily move to better positions but rather appeared to choose their new location randomly [Figure S7 in Additional file 1]. Given the fast game dynamics, this choice of random directions is probably a consequence of cognitive limitations in identifying the optimal locations. Together, these two behavioral rules led to unpredictability, and no stable equilibrium arrangement was achieved.
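For illustration, the best-response rule described above can be sketched as follows. This is a minimal sketch rather than the simulation code used in the study. It assumes a rectangular grid stored as a list of lists (`None` for an empty cell, a color label otherwise), an eight-cell neighborhood, a candidate set consisting of all empty cells, and a caller-supplied `utility(same, other)` function. The names `neighbor_counts` and `best_response_move` are hypothetical.

```python
def neighbor_counts(grid, row, col, color):
    """Count same- and other-color occupants among the (up to) eight surrounding cells."""
    same = other = 0
    for d_row in (-1, 0, 1):
        for d_col in (-1, 0, 1):
            if d_row == 0 and d_col == 0:
                continue
            r, c = row + d_row, col + d_col
            if 0 <= r < len(grid) and 0 <= c < len(grid[0]):
                occupant = grid[r][c]
                if occupant is None:
                    continue
                if occupant == color:
                    same += 1
                else:
                    other += 1
    return same, other


def best_response_move(grid, row, col, utility):
    """Move the agent at (row, col) only if some empty cell strictly increases its utility.

    Assumption: every empty cell on the grid is a candidate destination.
    Returns the agent's (possibly unchanged) position.
    """
    color = grid[row][col]
    best_value = utility(*neighbor_counts(grid, row, col, color))
    best_cell = None
    # Lift the agent off the grid so it does not count as its own neighbor.
    grid[row][col] = None
    for r in range(len(grid)):
        for c in range(len(grid[0])):
            if grid[r][c] is None and (r, c) != (row, col):
                value = utility(*neighbor_counts(grid, r, c, color))
                if value > best_value:
                    best_cell, best_value = (r, c), value
    if best_cell is None:
        # No strictly better position exists: the agent stays put.
        grid[row][col] = color
        return (row, col)
    grid[best_cell[0]][best_cell[1]] = color
    return best_cell
```

An agent that cannot improve stays where it is, which is the behavior the participants did not show. The utility function is left as a parameter because the scoring rules of the four games are not restated here; a stand-in such as `lambda same, other: same` is enough to exercise the sketch.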
The participants’ tendency to make a random move whenever they found themselves in a sub-optimal position explains the mismatch with the predictions of the best-response simulation model. In the Same game, the participants obtained more segregated outcomes because they wanted to avoid being on the frontier between the two neighborhoods, as this made them more vulnerable to other participants’ moves. In the Same and Diverse game, the participants avoided segregation because they could not be satisfied with being at the periphery of their own-color neighborhood, as these positions entailed lower scores. In the Same or Different game, the participants failed to coordinate on the common-sense solution based on two groupings of yellow and blue because their scores could be easily lowered by one individual of the opposite color infiltrating a mono-color block.
To further test our explanation for the experimental results, we replicated the simulation without the best-response assumption. Instead, we assumed random relocation and no satisficing. In the new model, agents decide to move whenever their utility is less than the maximum. They then move to one of the four nearest available locations (up, down, left, or right), chosen at random. The predictions from this model match the observed outcomes better, particularly for the Same and Diverse game and the Same or Different game (Figure 5).
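The random-relocation rule can be sketched as follows. This is a hedged illustration, not the code used to produce Figure 5. It assumes that "nearest available location" means the first empty cell encountered when walking from the agent in each of the four directions, that the neighborhood consists of the eight surrounding cells, and that the grid is a list of lists with `None` marking an empty cell. The function names, the `utility(same, other)` signature, and the `max_utility` parameter are all hypothetical.

```python
import random


def count_neighbors(grid, row, col):
    """Return (same, other) counts among the (up to) eight cells around (row, col)."""
    color = grid[row][col]
    same = other = 0
    for d_row in (-1, 0, 1):
        for d_col in (-1, 0, 1):
            if d_row == 0 and d_col == 0:
                continue
            r, c = row + d_row, col + d_col
            if 0 <= r < len(grid) and 0 <= c < len(grid[0]):
                occupant = grid[r][c]
                if occupant is None:
                    continue
                if occupant == color:
                    same += 1
                else:
                    other += 1
    return same, other


def nearest_free_cell(grid, row, col, d_row, d_col):
    """Walk from (row, col) in one direction; return the first empty cell, or None."""
    r, c = row + d_row, col + d_col
    while 0 <= r < len(grid) and 0 <= c < len(grid[0]):
        if grid[r][c] is None:
            return (r, c)
        r, c = r + d_row, c + d_col
    return None


def random_relocation_step(grid, row, col, utility, max_utility, rng=random):
    """Apply one step of the no-satisficing, random-relocation rule to one agent.

    The agent stays only if its utility already equals the maximum; otherwise it
    jumps to a randomly chosen nearest empty cell (up, down, left, or right),
    regardless of whether that cell is better. Returns the agent's new position.
    """
    same, other = count_neighbors(grid, row, col)
    if utility(same, other) >= max_utility:
        return (row, col)
    candidates = []
    for d_row, d_col in ((-1, 0), (1, 0), (0, -1), (0, 1)):
        cell = nearest_free_cell(grid, row, col, d_row, d_col)
        if cell is not None:
            candidates.append(cell)
    if not candidates:
        return (row, col)  # boxed in on all four sides: nowhere to go
    new_row, new_col = rng.choice(candidates)
    grid[new_row][new_col] = grid[row][col]
    grid[row][col] = None
    return (new_row, new_col)
```

Unlike a best-response agent, an agent following this rule never compares the destination with its current position, so it can move to a worse cell and keeps moving until it reaches the maximum score.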