Politicians [11, 42], urban planners [13] and scholars [7, 43] have been debating solutions to segregation and the concentration of poverty in Europe and North America since the 1970s. One of the primary mechanisms developed, though not without criticism [43], is residential and social mixing [5, 44]. Policies developed under this approach aim at incentivising the mobility of segregated communities to other neighbourhoods in order to increase spatial diversity. Rearranging the spatial distribution of each community would be in line with recent research suggesting that diversity within neighbourhoods can actually increase positive contact among citizens belonging to different groups [45]. Beyond maximising geographic proximity, a parallel approach for increasing the mutual exposure of communities is to make individuals from different groups more similar, relying on the effects of homophily. Homophily is the well-known sociological principle that the more similar individuals are, the more frequent their interactions are expected to be [23, 46].

Our work builds on these fundamental debates and hypotheses, and relies in particular on the principle of homophilic interactions. We assume that calling behaviour can be understood as one behavioural feature [46] of individuals. Thus, reducing differences between communities in this regard (i.e. reducing behavioural segregation) may increase exposure and, subsequently, interaction between communities. With this aim in mind, in the following section we estimate the number of residents that would need to move from their current district, as well as the districts they would need to move to, in order to reduce behavioural segregation as measured by variations in CNs.

### 4.1 Minimising segregation: residential mixing as an optimisation problem

As discussed, residential mixing aims at promoting the mobility of segregated communities into other, less segregated neighbourhoods. Framing this idea within our definition of behavioural segregation, the problem can be rephrased as obtaining a mobility matrix **M**, where each entry \(m_{ji}\) stands for the fraction of refugees living in district *i* that would need to be relocated to district *j*, in order to maximise^{Footnote 1} the *p*-value of the \(\chi ^{2}\) homogeneity test. Our interpretation of the problem, although applied to call patterns and not spatial distributions, is very similar to the definition of the Dissimilarity Index [28, 29], which is usually interpreted as the percentage of the minority population that would need to relocate in order to perfectly integrate the residential distributions in a region.

The estimation of the best **M** can be formally defined as an optimisation problem. We begin with the case of “eliminating” differences between the RT and TT networks. The non-linear optimisation problem corresponds to

$$\begin{aligned} \mbox{maximize} &\quad \sum_{i} \mbox{ $p$-value} \bigl(\mathbf{{o}}_{i}^{ \mathrm{TT}},\hat{\mathbf{o}}_{i}^{\mathrm{RT}} \bigr) \end{aligned}$$

(2)

$$\begin{aligned} \mbox{s.t.} &\quad\sum_{j} m_{ji} = 1\quad \forall i \end{aligned}$$

(3)

$$\begin{aligned} & \quad \sum_{j} {\hat{\mathbf{o}}}_{ij} \leq f_{i}\quad \forall i \end{aligned}$$

(4)

$$\begin{aligned} &\quad 0 \leq m_{ji} \leq 1 \end{aligned}$$

(5)

$$\begin{aligned} \quad \mbox{where } &\quad \hat{\mathbf{O}}^{\mathrm{RT}} = \mathbf{M} { \mathbf{{O}}}^{ \mathrm{RT}}, \end{aligned}$$

(6)

where **O** is the matrix of original communication records, \(\hat{\mathbf{{O}}}\) is the resulting matrix of communication records after the mobility matrix has been applied, and each \(m_{ji}\) is an unknown to be obtained. The restriction in Eq. (3) guarantees that the total number of communications is maintained. That is, for each origin district, the fractions assigned to all destination districts must sum to one, so that the relocated communications add up to the total observed in the call record matrix **O**. The restriction in Eq. (4) requires that no district has more than \(f_{i}\) refugees. This restriction is important, as the definition of enclaves has to do with a high fraction of immigrants living in an area with respect to the total immigrant population in the region. In our case \(f_{i}\) is set such that the fraction of refugees living in a district never exceeds 10% of the total population. This percentage was chosen as a rounded upper bound based on the empirical observation that, under current conditions, the highest percentage of refugee population in a single district is 8%. The restriction in Eq. (5) simply ensures that each \(m_{ji}\) is bounded in the range \([0,1]\).

Unlike in the comparison between the TT and RT networks, when comparing the TT and RR networks, the destination groups of the calls are different. This requires some modifications to the optimisation problem in Eq. (2) when applied to the RR case. First, as we explained in Sect. 3.3, it is necessary to normalise the call destination counts by the different volumes of the target populations of the two datasets when computing the *p*-value. Second, in order to account for the fact that the refugees being moved are the same ones receiving calls (refugees call refugees in the RR network), we need to apply an additional transformation to change the destination districts of the calls directed at the relocated refugees. This can be done by multiplying the result of \(\mathbf{MO}^{\mathrm{RR}}\) by \(\mathbf{M}'\) (the transpose of **M**). For the definition of the optimisation problem, this means replacing Eq. (6) with \(\hat{\mathbf{O}}^{\mathrm{RR}} = \mathbf{M}{\mathbf{{O}}}^{ \mathrm{RR}}{\mathbf{M}}'\). Figure 5 provides a simplified example of the optimisation problem we propose.
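The two transformations and the constraints in Eqs. (3)–(5) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function names, the two-district toy matrices and the capacity values are hypothetical, and we assume that rows of **O** index origin districts and columns index destination districts.

```python
def matmul(a, b):
    """Plain-Python matrix product of two lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def transpose(a):
    return [list(row) for row in zip(*a)]

def is_valid_mobility(m, tol=1e-9):
    """Eq. (5): entries in [0, 1]; Eq. (3): fractions leaving each origin i sum to 1."""
    n = len(m)
    in_range = all(0.0 <= m[j][i] <= 1.0 for j in range(n) for i in range(n))
    sums_ok = all(abs(sum(m[j][i] for j in range(n)) - 1.0) <= tol for i in range(n))
    return in_range and sums_ok

def relocate_rt(m, o_rt):
    """Eq. (6): only the callers (rows) move, so O_hat = M O."""
    return matmul(m, o_rt)

def relocate_rr(m, o_rr):
    """RR case: callers and receivers both move, so O_hat = M O M'."""
    return matmul(matmul(m, o_rr), transpose(m))

def respects_capacity(o_hat, f):
    """Eq. (4): outgoing volume of each district stays within its cap f_i."""
    return all(sum(row) <= cap for row, cap in zip(o_hat, f))

# Toy example: half of the refugees in district 0 move to district 1.
M = [[0.5, 0.0],
     [0.5, 1.0]]
O = [[10.0, 2.0],
     [4.0, 8.0]]

print(is_valid_mobility(M))   # column sums are 1, entries are in [0, 1]
print(relocate_rt(M, O))      # rows (callers) are redistributed
print(relocate_rr(M, O))      # rows and columns are redistributed
```

In both cases the total number of calls is preserved, which is what the constraint in Eq. (3) is meant to guarantee.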

The high non-linearity of the problem, in both the RT and RR cases, does not allow us to obtain satisfactory results by directly optimising the problem in Eq. (2). The fundamental complication is due to the very low *p*-values obtained with the initial call densities, \(\mathbf{{o}}_{i}^{\mathrm{TT}}\) and \(\mathbf{{o}}_{i}^{\mathrm{RR}}\). From those values, we were unable to find good initialisations for the unknowns \(m_{ji}\) that were close enough to a satisfactory mobility matrix solution. Instead, we developed a two-step procedure based on two similar optimisation problems. In the first step, we modified the objective function (with equivalent restrictions) to find the mobility matrix that minimises the mean squared difference between the vectors \(\mathbf{{o}}_{i}^{\mathrm{TT}}\) and \(\hat{\mathbf{o}}_{i}^{\mathrm{RR}}\). In the second step, using the mobility matrix obtained in the first step as the initialisation, we minimised the sum of the \(\chi ^{2}\) values for the different vectors \(\mathbf{{o}}_{i}^{\mathrm{TT}}\) and \(\hat{\mathbf{o}}_{i}^{\mathrm{RR}}\). The solution to the optimisation problem was obtained using MATLAB R2017a. We used the *fmincon* function configured to use the Interior-Point algorithm.
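The two surrogate objectives of this procedure can be sketched in Python as follows. This is only an illustration of the quantities being minimised for a single district, not the MATLAB code used in the study, and it contains no optimiser. The function names are hypothetical, and comparing normalised call shares in the first objective is our assumption, since the text does not state how the vectors are scaled.

```python
def shares(counts):
    """Normalise a vector of call counts to proportions (assumes a non-zero total)."""
    total = sum(counts)
    return [c / total for c in counts]

def mean_squared_difference(o_tt, o_hat):
    """Step 1 objective (assumed to act on call shares rather than raw counts)."""
    p, q = shares(o_tt), shares(o_hat)
    return sum((x - y) ** 2 for x, y in zip(p, q)) / len(p)

def chi2_homogeneity_statistic(o_tt, o_hat):
    """Step 2 objective: Pearson chi-squared statistic of the 2 x k table
    formed by the two count vectors (assumes both vectors have non-zero totals)."""
    total_a, total_b = sum(o_tt), sum(o_hat)
    grand = total_a + total_b
    stat = 0.0
    for x, y in zip(o_tt, o_hat):
        column = x + y
        if column == 0:
            continue  # an empty destination contributes nothing
        expected_a = total_a * column / grand
        expected_b = total_b * column / grand
        stat += (x - expected_a) ** 2 / expected_a
        stat += (y - expected_b) ** 2 / expected_b
    return stat

# Identical call profiles give a zero objective in both steps.
print(mean_squared_difference([10, 20, 30], [20, 40, 60]))
print(chi2_homogeneity_statistic([10, 20, 30], [20, 40, 60]))
```

Summing the second function over all districts gives the quantity minimised in the second step. A smaller \(\chi ^{2}\) statistic corresponds to a larger *p*-value, which is why it can stand in for the objective in Eq. (2).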

This two-step process, which pursues the same goal as the original objective function in Eq. (2), gives satisfactory results, as Fig. 6 shows. Note again that, under the initial conditions, all of the districts indicated segregation in both the Refugee–Refugee and Refugee–Turkish cases. Figure 6 Panel A shows the results of mitigating segregation when considering the Refugee–Refugee network. We observe that, after the proposed mobility, segregation is reduced in 43% of the districts. When considering Refugee–Turkish communications, the results are comparable (see Fig. 6B): after promoting mobility, segregation is reduced in 40% of the districts.

### 4.2 Optimising behavioural vs. spatial segregation: the potential trade-offs

In order to establish a baseline for the outcome of our method, we compared our results with a process designed to minimise the Dissimilarity Index (DI). That is, minimise

$$ \frac{1}{2} \sum_{i}^{n} \biggl\vert \frac{{{c^{T}_{i}}}}{\sum_{j}^{n} {{c^{T}_{j}}}}- \frac{{{c^{R}_{i}}}}{\sum_{j}^{n} {{c^{R}_{j}}}} \biggr\vert , $$

where *n* corresponds to the number of districts, and \(c^{T}_{j}\) and \(c^{R}_{j}\) are the sums of all outgoing calls made from district *j* by the Turkish and refugee populations respectively, serving as a proxy for population. As in Sect. 4.1, we performed a separate optimisation for each of the RR and RT networks. In each case, in order to have a fair comparison with the results of our method, we imposed a constraint limiting the total number of residents to be relocated under the optimisation, which was set to the number relocated using our behavioural segregation optimisation described above. After optimisation, we compared the results in terms of the change in the DI, and in terms of the number of districts in which refugee and local call patterns did not exhibit significant differences. Clearly, each optimisation will do its own job better than the other (when minimising spatial segregation, we expect a better outcome for the DI than when we minimise call pattern differences), but seeing how distinct the outcomes are can point us towards potential trade-offs.
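The Dissimilarity Index defined above can be computed in a few lines of Python. This is a minimal sketch with a hypothetical function name and made-up call volumes for three districts, and it assumes both groups have a non-zero total.

```python
def dissimilarity_index(c_t, c_r):
    """Half the sum of absolute differences between the district shares
    of the two groups, using outgoing call volumes as a population proxy."""
    total_t, total_r = sum(c_t), sum(c_r)
    return 0.5 * sum(abs(t / total_t - r / total_r) for t, r in zip(c_t, c_r))

# Hypothetical outgoing call volumes per district for each group.
turkish_calls = [50, 30, 20]
refugee_calls = [20, 30, 50]
print(dissimilarity_index(turkish_calls, refugee_calls))  # approximately 0.3
```

The index is 0 when both groups are distributed identically across districts and 1 when they share no district at all.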

We note that the original DI calculated using our call volume-based population estimation for the RT network was 24%, while for the RR network it was 32%. Both are quite similar to the value calculated using official population data (around 30%). The optimal mobility matrices found in the optimisation of behavioural segregation increased the DI, to 53% in the RT case and 57% in the RR case. When minimising the DI, we reach 2.5% and 3% for the RT and RR cases respectively. With respect to behavioural segregation: in both cases, RT and RR, when optimising the DI, all of the districts remain significantly segregated (\(p\mbox{-value} < 0.01\)). This is in contrast to the optimisation minimising behavioural segregation, which reduced segregation in 40% of the districts. These results suggest that spatial segregation as measured by the Dissimilarity Index and behavioural segregation as we measure it here are somewhat different objectives with different optima. Optimising both measures may be desirable, and studying their effects on one another could be useful. An interesting prospect for future work would be to design a multi-objective function, in order to find points in the problem space where both spatial and behavioural segregation improve.

### 4.3 Economic incentives towards integration

From one perspective, social integration can be framed in terms of cost-benefit analysis [47]. In this conceptualisation, language acquisition, distance from family, and exposure to unfamiliar cultures can be considered costs, though they are difficult to quantify in economic terms. Housing costs, in contrast, are relatively easy to quantify. Aside from the characteristics of individual houses, this cost reflects a variety of factors including access to services, employment, and city resources [48–50]. As previously mentioned, rent prices are negatively related to refugee population, as Fig. 1C shows. This implies that some rent-reduction incentives might be effective in getting refugees to relocate out of enclaves. This could be an opportunity for public and private actors interested in increasing host-refugee integration in Turkey to adjust the cost-benefit analysis of refugee location choice by subsidising rent in targeted areas of the city, thereby encouraging refugees to live away from enclaves and making inter-group contact more frequent.

In support of the viability of using rental subsidies as a way to incentivise refugee location choice, we examined the overall change in rent payments that would occur under the new population distribution, considering rental markets for the 2017 period [27]. The proposed optimisation problem in Eqs. (2)–(6) provides us with information about the volume of communications that need to be shifted from one district to another. The density of communications originating from an area is known to be related to the population density of the area [51–53], as Fig. 7, drawn from the real population and CDR data, confirms. We can thus use outgoing call volume as a proxy for the number of residents for whom we need to incentivise mobility. When the optimisation is performed on the RR communication network, a total of 54,942 refugees would need to be relocated (12% of the refugee population). The resulting net increase in monthly rent cost is 11,709,295 (1,847,817€), which corresponds to 213 (34€) per person per month. When performed on the RT communication network, the optimisation resulted in a relocation of 212,100 refugees (approx. 40% of the population). This corresponds to a net rent increase of 52,430,540 (8,273,946€), or 247 (39€) per person per month.
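The per-person figures follow from dividing the net monthly rent increase by the number of relocated refugees. The short Python check below reproduces that arithmetic from the totals reported above. The helper name is hypothetical, and rounding to the nearest whole unit is our assumption about how the reported values were obtained.

```python
def per_person_per_month(net_monthly_increase, relocated):
    """Net monthly rent increase divided by the number of relocated refugees,
    rounded to the nearest whole currency unit (rounding is an assumption)."""
    return round(net_monthly_increase / relocated)

# Totals as reported in the text: (local currency, euros, relocated refugees).
rr_case = (11_709_295, 1_847_817, 54_942)
rt_case = (52_430_540, 8_273_946, 212_100)

for label, (local, euro, people) in (("RR", rr_case), ("RT", rt_case)):
    print(label, per_person_per_month(local, people), per_person_per_month(euro, people))
```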

As can be seen in Fig. 8A and C, the changes in rent payment approximate a normal distribution with a large variance, meaning that, under the adjusted population distribution, some refugees would save considerably on rent, while others would pay a higher price. The overall tendency, though, is an increase in rent costs. The distribution of these changes in rent cost over the districts of Istanbul at the level of the individual is provided in Fig. 9. These figures provide an individual (refugee) point of view in terms of the increase or reduction in cost of living. Panels B and D of Fig. 8, on the other hand, provide a governmental or organisational perspective. The maps indicate the total investment that would be required in each district in order to fully offset the increased rent payments of refugees. As we can see, the subsidies would be larger in the districts near the Bosphorus Strait. Surprisingly, these largest subsidies are not regularly distributed among adjacent districts, and they correspond to the densest areas of the province.