Diversity dilemmas: uncovering gender and nationality biases in graduate admissions across top North American computer science programs

Although different organizations have defined policies towards diversity in academia, many argue that minorities are still disadvantaged in university admissions due to biases. Extensive research has been conducted on detecting partiality patterns in the academic community. However, in the last few decades, limited research has focused on assessing gender and nationality biases in graduate admission results of universities. In this study, we collected a novel and comprehensive dataset containing information on approximately 14,000 graduate students majoring in computer science (CS) at the top 25 North American universities. We used statistical hypothesis tests to determine whether there is a preference for students’ gender and nationality in the admission processes. In addition to partiality patterns, we discuss the relationship between gender/nationality diversity and the scientific achievements of research teams. Consistent with previous studies, our findings show that there is no gender bias in the admission of graduate students to research groups, but we observed bias based on students’ nationality.


Introduction
Every year, many students from all over the world apply to pursue their graduate studies at top universities in North America [1]. Despite the committee-based nature of admission to many of these universities, professors still play a prominent role in accepting students and providing them with financial support [2]. As a result, students often directly contact faculty members to enhance their chances of admission. Furthermore, students who are admitted by a committee must find an academic advisor and research group, and faculty members have the authority to approve or reject these requests. Consequently, their research group may demonstrate a preference for accepting students of similar gender, country of origin, or previous universities. In this study, we aim to examine the existing biases in interactions with computer science faculty members at top North American universities and their preferences regarding nationality and gender when selecting graduate students for their research group.
In addition to establishing fair admission systems, it is crucial to enhance diversity in academia. Promoting diversity within universities enables them to have a greater impact on societies [3]. This is because institutions aim to address social issues, which cannot be effectively achieved without embracing diversity [4]. Furthermore, it is argued that being in a diverse environment can broaden students' horizons [5].
Most prestigious universities typically strive to ensure fairness in the admission process for their graduate programs. Various factors, such as merit, gender equality, and diversity, contribute to establishing a fair graduate admission system [2,6]. However, it is argued that admitting a greater number of marginalized students for graduate education at U.S. universities remains a contentious issue [7].
To the best of our knowledge, only three studies have focused on assessing gender or nationality bias in graduate admissions, and all of them were conducted prior to 2000. Bickel and Hammel [8] analyzed admission results from various schools at the University of California, Berkeley to examine the presence of a gender gap. They found statistically significant favoritism towards female applicants. Maxwell and Jones [9] employed adjustment techniques to compare admission rates between women and men in four graduate programs at the University of North Carolina, Chapel Hill. Their findings suggested that gender was not a significant factor in admission decisions. Subsequently, the authors of [10] discussed the influence of demographic attributes, such as gender and country of citizenship, on graduate admission decisions at top-ranked American universities. Their results indicated that these universities placed greater emphasis on admitting U.S. students, and female applicants received some degree of preference. Our work builds upon these studies by addressing questions regarding gender/nationality bias in more recent and comprehensive graduate admissions data. The dataset we collected for this study encompasses a larger number of students and includes a greater number of universities.
Some studies have examined the impact of gender/nationality diversity on the performance of research teams. In [11], the authors investigated the level of cultural diversity at which a research group achieves the highest performance. AlShebli et al. [12] analyzed author lists of research papers to explore the influence of diversity in characteristics such as gender and ethnicity on the success of research teams. Llorens et al. [13] demonstrated the existence of gender bias throughout scholars' academic careers, affecting aspects such as career opportunities, promotion, and grant allocation. They also proposed solutions at various levels to enhance diversity, highlighting its importance for scientific success. The authors of [14] examined different facets of gender diversity and reported its positive impact on creativity and performance in scientific domains. Kamerlin [15] addressed bias issues in academia and presented strategies to promote gender diversity in academic environments. Powell [16] utilized citation count to quantify the success of research papers and investigated its relationship with various aspects of diversity, such as gender, age, ethnicity, and affiliation, among the authors. In addition to citation count, we consider faculty members' h-index and publication count as measures of success for their research groups.
Many initiatives have been undertaken to enhance diversity in computer science. The author of [17] emphasized that these efforts should not be limited to achieving gender equity alone. Wilson [18] highlighted how his team in Hour of Code decided to translate their lectures into multiple languages and establish branches in more countries to promote diversity in computer science. One of the primary objectives of their program is to globalize computer science [19]. Increasing students' awareness of diversity and inclusion is a crucial step towards fostering a more diverse community of computer scientists [20]. These studies collectively underscore the significance of addressing diversity issues in academia.
In this study, we aim to address the following questions:
• Do professors exhibit a preference for admitting students of the same gender to their research group?
• Are they inclined to accept students who share their country of origin?
• How do these bias patterns evolve over time?
• Is there any correlation between the diversity of gender or nationality among team members and the research team's productivity?
Our contributions can be summarized as follows:
1. We provide a comprehensive description of the dataset collected for this study, highlighting its various features.
2. We analyze the gender distributions of students and faculty members and conduct hypothesis tests to examine the presence of gender bias in the selection of students for graduate study.
3. We investigate the distributions of advisors' and students' home countries and explore the existence of bias in this variable.
4. We construct an advisor-student relationship network using our dataset and calculate centrality metrics to identify the most influential countries in higher education.
5. We examine the trends in gender/nationality biases and diversities among advisor-student pairs over time using Mann-Kendall tests.
6. We assess the correlations between academic success and diversity measures to analyze the relationship between gender/nationality diversity and the performance of research groups.
The rest of this paper is structured as follows. The "Materials and methods" section offers an overview of the data collection process and delineates the diverse features within our dataset. Following this, our discoveries are outlined and analyzed in the "Results and discussion" section. In the "Future work" section, potential directions for future research are proposed. Finally, the "Conclusion" section succinctly summarizes the key takeaways of the paper.

Materials and methods
In this section, we define the techniques and metrics that we use in answering our research questions. Moreover, we describe the dataset that we collected for this study.

Methods
In this part, we introduce the algorithms and statistical tests utilized in our study.

Disparity filter
The disparity filter is a graph sparsification algorithm utilized to effectively reduce the number of edges in a network while preserving its multi-scale nature [21]. We apply this algorithm to remove insignificant edges from the advisor-student relationship network. Figure 1 provides an example of the application of the disparity filter algorithm.
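To make the filtering rule concrete, the following is a minimal sketch of the disparity filter for an undirected weighted graph. It assumes the standard null model in which an edge of weight w at a node with k neighbors and total strength s receives the significance score (1 - w/s)^(k - 1), and it keeps an edge when that score falls below a threshold for at least one endpoint. The function name, the edge representation, and the threshold value are illustrative assumptions and are not taken from our pipeline.

```python
from collections import defaultdict


def disparity_filter(edges, alpha=0.05):
    """Keep edges that are significant for at least one of their endpoints.

    edges: dict mapping an undirected pair (u, v) to a positive weight.
    alpha: significance threshold (illustrative value, an assumption).
    """
    strength = defaultdict(float)  # sum of incident edge weights per node
    degree = defaultdict(int)      # number of incident edges per node
    for (u, v), w in edges.items():
        for node in (u, v):
            strength[node] += w
            degree[node] += 1

    def score(node, w):
        k = degree[node]
        if k <= 1:
            # A node with a single edge gives no basis for a comparison,
            # so it never marks its edge as significant in this sketch.
            return 1.0
        return (1.0 - w / strength[node]) ** (k - 1)

    return {
        (u, v): w
        for (u, v), w in edges.items()
        if min(score(u, w), score(v, w)) < alpha
    }


# Hypothetical toy network: one dominant edge and three weak ones.
toy_edges = {("US", "A"): 100, ("US", "B"): 1, ("US", "C"): 1, ("US", "D"): 1}
backbone = disparity_filter(toy_edges)
```

In the toy network only the heavy edge survives, because it carries almost all of the hub's strength while the three light edges are indistinguishable from a uniform split.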

Louvain community detection
The Louvain community detection algorithm is utilized to identify communities within a large-scale network by optimizing the modularity. Modularity measures the difference between the actual number of edges within communities and the number expected if edges were placed at random, and the algorithm aims to maximize this difference. It employs a greedy approach with heuristics to solve the problem efficiently in polynomial time [22]. We apply this algorithm to detect communities within the advisor-student relationship network.

Leiden community detection
The Leiden community detection algorithm is an advancement of the Louvain algorithm. It employs a fast local move approach and iteratively refines partitions to ensure the connectedness of all detected communities. Compared to the Louvain algorithm, it offers improved speed and provides more accurate partitions [23]. We use this algorithm to identify communities within the advisor-student relationship network.
Figure 2 shows examples of the Louvain and Leiden community detection algorithms.
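Both algorithms search for a partition with high modularity. As an illustration of the quantity being optimized (not of the search procedures themselves), the sketch below evaluates the modularity of a given partition of an undirected weighted graph as the sum over communities of the internal edge-weight fraction minus the squared fraction of total strength. The function name and the toy graph are hypothetical.

```python
from collections import defaultdict


def modularity(edges, community):
    """Modularity Q of a partition of an undirected weighted graph.

    edges: dict (u, v) -> weight, each undirected edge listed once, no self-loops.
    community: dict node -> community label.
    """
    total = sum(edges.values())        # total edge weight m
    strength = defaultdict(float)      # weighted degree of each node
    internal = defaultdict(float)      # edge weight inside each community
    for (u, v), w in edges.items():
        strength[u] += w
        strength[v] += w
        if community[u] == community[v]:
            internal[community[u]] += w

    community_strength = defaultdict(float)
    for node, s in strength.items():
        community_strength[community[node]] += s

    # Q = sum_c [ L_c / m - (d_c / 2m)^2 ]
    return sum(
        internal[c] / total - (community_strength[c] / (2 * total)) ** 2
        for c in community_strength
    )


# Two triangles joined by a single bridge edge.
toy_edges = {
    ("a", "b"): 1, ("b", "c"): 1, ("a", "c"): 1,
    ("d", "e"): 1, ("e", "f"): 1, ("d", "f"): 1,
    ("c", "d"): 1,
}
split = {"a": 0, "b": 0, "c": 0, "d": 1, "e": 1, "f": 1}
merged = {node: 0 for node in split}
q_split = modularity(toy_edges, split)
q_merged = modularity(toy_edges, merged)
```

Separating the two triangles yields a higher modularity than placing all nodes in one community, which is the kind of improvement both algorithms look for at each step.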

Statistical analysis
In this part, we provide a description of the statistical methods employed in this study.

Proportion hypothesis test
The proportion hypothesis test is a statistical method that compares the ratio of an attribute in a population with a reference proportion. It also establishes a range of values that are likely to include the population proportion [24]. We utilize this technique to assess our research questions regarding biases in graduate admission.

Mann-Kendall test
The Mann-Kendall test is a nonparametric method that assesses the presence and direction of trends. It is particularly suitable for detecting monotonic trends that exhibit consistent increases or decreases over time [25]. We employ this technique to evaluate the trends of variables such as gender/nationality diversity over time.
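For readers unfamiliar with the test, the following is a minimal sketch of the Mann-Kendall statistic with the usual normal approximation. It assumes a series without tied values (no tie correction is applied) and is not necessarily the implementation used for our analysis; the function name and the example series are hypothetical.

```python
import math


def mann_kendall(series):
    """Return (S, z, two-sided p-value) for a time-ordered series without ties."""
    n = len(series)
    s = 0
    for i in range(n - 1):
        for j in range(i + 1, n):
            diff = series[j] - series[i]
            s += (diff > 0) - (diff < 0)  # sign of each later-minus-earlier pair

    variance = n * (n - 1) * (2 * n + 5) / 18.0  # valid when there are no ties
    if s > 0:
        z = (s - 1) / math.sqrt(variance)  # continuity correction
    elif s < 0:
        z = (s + 1) / math.sqrt(variance)
    else:
        z = 0.0
    p_value = math.erfc(abs(z) / math.sqrt(2))  # two-sided normal tail
    return s, z, p_value


# Hypothetical yearly values that rise steadily.
s_up, z_up, p_up = mann_kendall([0.10, 0.12, 0.15, 0.16, 0.19, 0.21, 0.22, 0.25, 0.27, 0.30])
s_down, z_down, p_down = mann_kendall([9, 8, 7, 6, 5, 4, 3, 2, 1, 0])
```

A positive S with a small p-value indicates an increasing monotonic trend, and a negative S with a small p-value indicates a decreasing one.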

Metrics
In this part, we provide the definitions of the measures that we calculate in this study.

Weighted degree centrality
Weighted degree centrality is defined for each node in a network by summing the weights of the edges connected to that node. The formula for weighted degree centrality is as follows:
$$C_D(u) = \sum_{v \in N(u)} w(v, u) \qquad (1)$$
where $N(u)$ is the set of neighbors of $u$, $v$ is a neighbor of $u$, and $w(v, u)$ is the weight of the edge between $v$ and $u$ [26].
We employ this measure to identify the countries whose faculty members accept the largest numbers of students from other countries.
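A minimal sketch of this computation for an undirected weighted edge list is shown below. The edge representation and the example weights are hypothetical and serve only to illustrate the summation.

```python
from collections import defaultdict


def weighted_degree(edges):
    """Sum the weights of all edges incident to each node.

    edges: dict mapping an undirected pair (u, v) to a weight.
    """
    centrality = defaultdict(float)
    for (u, v), w in edges.items():
        centrality[u] += w
        centrality[v] += w
    return dict(centrality)


# Hypothetical advisor-student pair counts between countries.
toy_edges = {("US", "China"): 4, ("US", "India"): 3, ("India", "China"): 1}
degrees = weighted_degree(toy_edges)
```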

Closeness centrality
For each node, closeness centrality is defined as the reciprocal of the average distance between that node and all other nodes in the network. The formula for closeness centrality is as follows:
$$C_C(u) = \frac{n}{\sum_{v} d(v, u)} \qquad (2)$$
where the sum runs over the vertices $v$ from which $u$ is reachable, $n$ represents the number of vertices that node $u$ is reachable from, and $d(v, u)$ denotes the geodesic distance between nodes $v$ and $u$ [27]. We utilize this metric to determine which countries are closer to the rest of the world in terms of admission results.
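The sketch below illustrates closeness as the number of other reachable nodes divided by the sum of their geodesic distances, that is, the reciprocal of the average distance. It assumes an undirected, unweighted graph in which distance is the hop count found by breadth-first search; the adjacency structure and function name are hypothetical.

```python
from collections import deque


def closeness(adjacency, u):
    """Closeness of node u: reachable-node count divided by total hop distance.

    adjacency: dict node -> iterable of neighbors (undirected graph).
    """
    distance = {u: 0}
    queue = deque([u])
    while queue:  # breadth-first search yields shortest hop counts
        node = queue.popleft()
        for neighbor in adjacency.get(node, ()):
            if neighbor not in distance:
                distance[neighbor] = distance[node] + 1
                queue.append(neighbor)

    reachable = len(distance) - 1        # exclude u itself
    total_distance = sum(distance.values())
    return reachable / total_distance if total_distance > 0 else 0.0


# Hypothetical path graph: a - b - c.
path = {"a": ["b"], "b": ["a", "c"], "c": ["b"]}
center = closeness(path, "b")
end = closeness(path, "a")
```

The middle node of the path is one hop from every other node and therefore obtains the highest closeness.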

Entropy
The entropy of a variable is defined as the average uncertainty of that variable based on its probability distribution. The formula for entropy is as follows:
$$H(X) = -\sum_{i} p_i \ln p_i \qquad (3)$$
where the base of the logarithm is $e$, and $p_i$ represents the probability of the $i$-th outcome in variable $X$ [28]. We employ this measure to calculate the diversity of an advisor's research team.
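As an illustration, the entropy of a research team can be computed from the list of its members' attribute values, with probabilities estimated by relative frequencies. The function name and the example teams are hypothetical.

```python
import math
from collections import Counter


def entropy(labels):
    """Natural-log entropy of the empirical distribution of the given labels."""
    counts = Counter(labels)
    total = sum(counts.values())
    return -sum((c / total) * math.log(c / total) for c in counts.values())


# Hypothetical teams described by students' home countries.
uniform_team = entropy(["Iran", "Iran", "Iran", "Iran"])
mixed_team = entropy(["Iran", "India", "China", "Canada"])
balanced_gender = entropy(["F", "M"])
```

A team whose members all share one value has zero entropy, while a team split evenly across k values reaches the maximum of ln k.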

Dataset
Data collection was the most challenging aspect of this study. We collected data from multiple websites, each with its own unique structure, using a combination of manual and automated approaches.
The data collection procedure consists of four steps: manual data gathering, data collection using crawlers, removal of unnecessary data, and preprocessing. We collected data from the top 25 universities in North America, as ranked by Quacquarelli Symonds (QS) in 2021 for computer science [29].

Manual data collection
Among all the faculty members in the computer science departments of each university, we randomly selected approximately 30 professors. We collected information such as the professor's academic rank, home country, gender, research areas, and academic performance metrics (h-index and citation count). We also completed the prior universities (alma maters) column by referring to the professors' resumes and information available on their websites, LinkedIn, and Google Scholar. To determine the gender, we relied on images or pronouns specified on their websites. If the birthplace was not explicitly stated, we used the location of their undergraduate university to determine their home country. We also gathered academic records, such as citation counts and h-indexes, from Google Scholar.
The academic rank of faculty members, including Assistant Professor, Associate Professor, and Professor, was typically available on the university's website. Table 1 presents the key information about faculty members that we collected from the university homepage and the professors' personal pages.
For the professor's field column, we initially obtained the professor's research interests from their website, resume, or in some cases, from Google Scholar. Next, we manually determined whether the professor's research interests were associated with one or more of the 13 primary fields of the Association for Computing Machinery (ACM) computer science field category [30]. Table 2 presents a sample mapping between professor interests and ACM subareas within our dataset.
After obtaining all the necessary information for each professor, we proceeded to gather the names of their students and any additional available information from their profiles. If any student-related information was available, we used it to populate the corresponding column; otherwise, we left it blank and planned to update it later with data collected from our crawlers in the next stage. Furthermore, after running the crawlers, we manually cross-checked the data to fill in any gaps using information available from other sources. The process of finding the information and collecting the data proved to be challenging and time-consuming, leading to the development of crawlers for different sections. Table 3 presents the student information available in our dataset.

Data collection using crawlers
We used the list of all students as input for the Google search engine to locate their websites and resumes, including their LinkedIn accounts. The next challenge was to automatically extract the required data from these websites and resumes to populate the information columns, such as degree, admission year, and alma maters. We also performed data cleaning on the output from the crawler and merged it with the primary dataset to ensure consistency and completeness. We used the Name2GAN website [31] to label a person's gender if it had not been identified manually. We checked the results of this tool on 3000 previously labeled records. The results show that the gender detection tool has an accuracy higher than 90%. To improve the quality of our dataset, we labeled manually the cases in which the uncertainty of the gender detection was high.

Irrelevant data removal
Since we recorded information about all students associated with each randomly selected professor, including visiting students, undergraduates, postdocs, masters, and PhD students, it was important to filter out irrelevant data and include only graduate students for our analysis. The final version of the dataset was completed on August 2, 2022, and it consisted of a total of 13,936 graduate students.

Preprocessing
The preprocessing stage consists of two phases:
1. Preparing the input for the crawlers.
2. Preparing the data for analysis.
The most crucial component of the preprocessing stage was creating a consistent list of institutions that could be used for analysis and for the Google Maps crawler. We also double-checked the address results for each university to ensure that the mapping between university and address was unique. We used these addresses to identify the students' countries of origin. In some cases, the home countries of students were improperly reported as a state rather than the country, and we corrected this during the preprocessing stage. Once the home country column was filled out, we standardized the names of the countries and prepared them for analysis. Additionally, the admission year column required cleaning, as it contained some implausible values, which we corrected.
Students' home country is determined based on explicit specifications, if available. If not explicitly specified, we first consider the country from which they earned an associate degree. If that information is not available, we use the location of their undergraduate university to determine their home country. Additionally, we utilized a crawler for the Google Maps API to search for the location of universities and schools, which provided us with the necessary addresses for further analysis.
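The fallback order described above can be sketched as a simple lookup chain. The record layout and the field names used here are hypothetical and do not correspond to the column names of our dataset.

```python
def infer_home_country(student):
    """Return the first available country in the documented priority order.

    student: dict with optional keys (hypothetical names):
      'stated_country'               explicitly specified home country
      'associate_degree_country'     country of the associate degree
      'undergrad_university_country' location of the undergraduate university
    """
    priority = (
        "stated_country",
        "associate_degree_country",
        "undergrad_university_country",
    )
    for key in priority:
        value = student.get(key)
        if value:  # skip missing or empty entries
            return value
    return None  # home country cannot be determined from these fields


explicit = infer_home_country({"stated_country": "Iran", "undergrad_university_country": "Canada"})
fallback = infer_home_country({"stated_country": "", "undergrad_university_country": "India"})
unknown = infer_home_country({})
```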

Data exploration
In this part, we present an overview of the key features of our dataset in order to gain insights into their distributions.

Advisors' gender
In this part, we examine the distributions of advisors' gender across other attributes. Figure 3 displays the mosaic plot depicting the relationship between advisors' gender and their academic rank. The majority of advisors in our dataset hold the professor rank, and the highest proportion of male advisors is also observed at the professor level. This finding aligns with the results of [32], which suggest that men have a greater likelihood of being promoted to the professor rank compared to women.
Figure 4 illustrates the distribution of female and male faculty members across different subfields of computer science in our dataset. The graph shows that the computing methodologies subfield has the highest number of advisors. This observation can be attributed to the growing significance of Artificial Intelligence, which falls under the computing methodologies category and is an interdisciplinary field [33,34]. The theory of computation and computer systems organization subfields represent the second and third largest groups, respectively.
Figure 5 shows the distribution of gender among computer science faculty members across different universities.

Advisors' academic performance metrics
In this part, we present the dispersion of academic performance metrics of the faculty members, including publication count, h-index, and citation count, which are crucial indicators of the success of their research teams. Figure 6 displays the boxplots of advisors' publication counts for each university. To enhance the resolution, advisors with more than 1000 publications were excluded from this plot.
Figure 7 illustrates the distribution of citation counts for faculty members at each university. To improve the clarity of the diagram, faculty members with a citation count exceeding 100,000 were excluded. Figure 8 presents the boxplots of h-indexes for faculty members at each university. It is worth noting that the h-index metric has fewer outliers compared to the previous metrics, indicating that it may be a better indicator for assessing the success of research groups [35].

Students' gender
In this part, we illustrate the distribution of students' gender against other features. Figure 9 presents a mosaic plot depicting the distribution of students' gender based on the degree they are pursuing (or have pursued) under the supervision of their advisor. The plot reveals that there are fewer women in graduate computer science programs, which aligns with the findings of Cuny and Aspray's study [36]. Additionally, the female-to-male ratio decreases as the degree level progresses from masters to doctorate, potentially indicating a lower tendency among women to pursue higher education [37].
Figure 10 displays the gender distribution of CS students across different universities.

Nationality distributions
In this part, we explore the distribution of nationalities among students and faculty members. Figure 11 presents the distribution of students' citizenship for each degree. It shows that the majority of students apply for doctoral programs, and the percentage of international students is higher than that of American and Canadian students. This finding is in line with the result of a study by Okahana and Zhou [38], which states that in Fall 2015, approximately 55% of students majoring in computer science or related programs were international students. Figure 12 displays the distribution of students' nationalities on the world map. The United States and Canada have been excluded to focus solely on international students. The map reveals that the majority of international students are from China, India, and Iran, respectively. This finding aligns with the results of [39], which indicate that graduate programs are predominantly composed of Chinese and Indian students.
Figure 13 displays the distribution of faculty members' home countries on the world map. The map reveals that the majority of advisors originated from the United States, followed by India, China, and Canada, respectively.
Figure 14 depicts the sorted bar plots of the 15 most common countries among faculty members and students.

Results and discussion
In this section, we provide a comprehensive explanation of our analyses and interpret the results we obtained.

Assessing gender partiality
In this part, we evaluate the presence of gender bias in admission decisions. We conduct a two-sided hypothesis test with a significance level of 0.05 to examine whether there is gender bias in the acceptance of graduate students into advisors' research groups. To accomplish this, we employ a simulation-based approach with 500 iterations [24]. In each iteration, we generate 13,759 advisor-student pairs, where the gender of each component is selected based on the observed ratio in our dataset. Specifically, the probability of an advisor being male is 0.788, and the probability of a student being male is 0.771. This simulation yields an approximately normal distribution with a mean of 0.6562 and a standard deviation of 0.0212, as depicted in Fig. 15. It is important to note that this distribution represents the values for the ratio of advisor-student pairs with the same gender, assuming no gender bias in admitting graduate students. In our dataset, the observed ratio of advisor-student pairs with the same gender is 0.6896. We will now test whether this observed value is likely to occur in the simulated distribution. Thus, our hypothesis test is formulated as follows:
$$H_0: p_{\text{common gender ratio}} = 0.6562, \qquad H_a: p_{\text{common gender ratio}} \neq 0.6562. \qquad (4)$$
Using a z-test, we obtained a p-value of 0.1152, which is higher than the significance level of 0.05. Therefore, we cannot reject the null hypothesis. In other words, the data does not provide strong evidence of gender bias in the admissions of graduate students. This finding is consistent with the results of Maxwell's study [9], which also concluded that gender is not a significant factor in graduate student acceptance.
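The procedure can be sketched as follows. This is an illustrative reimplementation under stated assumptions rather than the code used for the study: advisor and student genders are drawn independently with the reported probabilities, the example run uses fewer pairs and iterations than the study so that it finishes quickly, and the function names are hypothetical. The z-test step plugs in the reported null mean, standard deviation, and observed ratio.

```python
import math
import random
import statistics


def simulate_same_gender_ratio(n_pairs, p_advisor_male, p_student_male, iterations, seed=0):
    """Simulate the same-gender pair ratio when genders are assigned independently."""
    rng = random.Random(seed)
    ratios = []
    for _ in range(iterations):
        same = 0
        for _ in range(n_pairs):
            advisor_male = rng.random() < p_advisor_male
            student_male = rng.random() < p_student_male
            same += advisor_male == student_male
        ratios.append(same / n_pairs)
    return ratios


def two_sided_p_value(observed, null_mean, null_std):
    """Two-sided p-value of a z-test against a normal null distribution."""
    z = (observed - null_mean) / null_std
    return math.erfc(abs(z) / math.sqrt(2))


# Reduced-size run (the study used 13,759 pairs and 500 iterations).
ratios = simulate_same_gender_ratio(
    n_pairs=2000, p_advisor_male=0.788, p_student_male=0.771, iterations=100
)
simulated_mean = statistics.mean(ratios)

# z-test using the values reported in the text.
p_reported = two_sided_p_value(observed=0.6896, null_mean=0.6562, null_std=0.0212)
```

With the reported values the z-test gives a p-value of about 0.115, in line with the result above. The spread of a simulated null distribution depends on the simulation settings, so the sketch makes no claim about reproducing the reported standard deviation.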

Evaluating nationality bias
In this part, we aim to investigate the presence of nationality bias in advisor-student relationships. We conduct a two-sided hypothesis test with a significance level of 0.05 to assess the existence of such bias. Similar to the previous analysis, we employ a simulation-based approach with 500 iterations. For this analysis, we only consider international students who are not from the United States or Canada. At each iteration, we generate 4839 advisor-student pairs, where the nationality of each individual is selected with a probability equal to the observed ratio in the dataset. In each iteration, we calculate the ratio of advisor-student pairs with the same nationality. The resulting distribution, shown in Fig. 16, approximates a normal distribution with a mean of 0.0682 and a standard deviation of 0.0113. In our dataset, the proportion of advisor-student pairs with the same nationality is 0.1593. To assess the likelihood of observing such a ratio in the simulated distribution, we formulate the following hypothesis test:
$$H_0: p_{\text{common nationality ratio}} = 0.0682, \qquad H_a: p_{\text{common nationality ratio}} \neq 0.0682. \qquad (5)$$
Using a z-test, we obtain a p-value below $10^{-15}$, which is far lower than the chosen significance level of 0.05. Therefore, we reject the null hypothesis and conclude that there is strong evidence of nationality bias in admitting international graduate students. This bias may be attributed to advisors' familiarity with universities in their home country and their potential to make more accurate assessments of students who have graduated from those universities.
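As a worked check of the arithmetic, and assuming the z-statistic is formed from the simulated mean and standard deviation in the same way as for the gender test, the reported values give:

$$z = \frac{0.1593 - 0.0682}{0.0113} \approx 8.06$$

An observed ratio roughly eight standard deviations above the simulated mean lies far in the tail of a normal distribution, which is consistent with the very small p-value reported.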

Advisor-student relationship network
In this part, we present a cross-country advisor-student relationship network based on our dataset. The network is constructed by connecting the nationalities of students and their advisors with weighted edges. We apply the disparity filter algorithm [21] to eliminate insignificant edges and remove isolated nodes from the network. Figure 17 provides an overview of the advisor-student relationship network. In this visualization, the size of the nodes and labels corresponds to the weighted degree and closeness centralities, respectively. The thickness of the edges represents their weight, which indicates the number of advisor-student pairs between the respective countries. Additionally, the nodes are color-coded based on their community assignment, determined using the Louvain community detection algorithm [22].
The countries with the highest values for both centrality metrics are the United States, India, China, Canada, and Iran, respectively. This observation aligns with the previous findings that faculty members from these countries are prevalent in top universities. It serves as further evidence of the potential existence of nationality bias in advisor-student relationships.
In Fig. 18, the advisor-student relationship network is depicted with similar settings, but the Leiden algorithm [23] is utilized for community detection. According to the results, Sweden and Romania are assigned to different communities compared to Fig.

Exploring time effect
In this part, we analyze the changes in bias patterns over time. Specifically, we examine admissions from 2000 to 2021. For each year, we calculate the ratios of advisor-student pairs with the same gender and nationality. Figure 19 illustrates the time series of the same-gender ratio. The results of a Mann-Kendall test indicate that this time series exhibits a statistically significant decreasing trend (p < 0.01).
Figure 20 illustrates the proportions of advisor-student pairs with the same nationality across different acceptance years. We observe an increasing trend in these proportions, which is consistent with the results of a Mann-Kendall test (p < 0.01).
[Table 4 fragment] Advisor's number of publications divided by the years of her/his research experience

Investigating relationship between academic success and diversity
In this part, we aim to investigate whether there is a correlation between diversity in advisors' research groups and their academic success. To assess this relationship, we employ scientometrics, which are described in Table 4, as measures of research group success. Moreover, we consider the entropy of genders and nationalities among an advisor's students as measures of diversity within their research group. We calculate the academic success and diversity measures for 737 advisors in our dataset. Subsequently, we compute the correlations between these variables, as shown in Fig. 21. To assess the statistical significance of each correlation, we conduct a hypothesis test with a significance level of 0.01. Based on the results, the correlations between gender entropy and other variables are close to zero and not statistically significant. This suggests that there is no significant linear correlation between gender diversity and the performance of research groups. On the other hand, nationality entropy exhibits a moderate positive correlation with advisors' h-index. This implies that research teams with greater diversity in terms of nationality tend to have higher research productivity. Additionally, there are weak positive linear relationships between nationality diversity and the remaining academic success metrics. It is important to note that the h-index is considered a more reliable measure of academic success [35].

Analyzing trends of diversity
In this section, we discuss how gender and nationality diversities have changed over the past two decades. Once again, we employ the Mann-Kendall test to assess the strength of the observed trend. Figure 22 illustrates the increasing trend in gender entropy over time. According to the results of the Mann-Kendall test, the observed trend is highly statistically significant ($p < 10^{-5}$).
Figure 23 shows the time series of nationality entropy. As depicted, there has been a decrease in nationality diversity over time. The decline from 2016 to 2020 is particularly noticeable. The results of the Mann-Kendall test confirm that the observed trend is statistically significant (p < 0.01).

Future work
While our work presents a novel study analyzing gender and nationality biases in graduate admissions over recent decades, future research should aim to explore other crucial factors influencing admission decisions. These factors include academic background, religion, and politics, in order to provide a more comprehensive understanding of bias in graduate admissions. To achieve this, researchers could contemplate integrating our dataset with additional sources, such as institutional reports and the social media profiles of students and faculty members on platforms like Twitter, to glean fresh insights on this matter.
Moreover, another promising avenue for future research involves evaluating whether specific stages of the admissions process accentuate gender and nationality biases, and how these biases manifest diversely across various universities. For example, researchers could concentrate on distinct phases of the admissions process, such as committee decisions, to discern differing bias patterns.
Additionally, future investigations might delve into the correlation between gender and nationality diversity within computer science faculty and observed biases in graduate admissions. This analysis could yield insights into potential strategies for addressing these biases effectively.
Lastly, a valuable topic for future research could be assessing whether significant variations in gender and nationality biases exist across different subfields within computer science (e.g., artificial intelligence, systems, theory). Furthermore, exploring how these biases correlate with broader trends could provide valuable insights into the dynamics of bias within the field.

Conclusion
In this study, we analyzed the distribution of genders and nationalities among students and their advisors. We conducted two-sided hypothesis tests to examine the presence of bias in gender and home country within advisor-student relationships. Our findings indicate that there is no gender bias in admission results. However, our results confirm the existence of bias against international applicants based on nationality. Additionally, we explored centrality metrics in the advisor-student relationship network, revealing that the United States, India, and China are the dominant countries in CS academia, influencing the composition of students and faculty members in top North American universities. We investigated the trends in gender and nationality bias over time and observed a reduction in gender bias, while nationality bias has shown an increasing pattern. Furthermore, we established a positive relationship between diversity in the nationalities of research group members and their academic performance. Lastly, we demonstrated an increase in gender diversity over time, alongside a decline in nationality diversity.
We acknowledge a limitation regarding the data collected for this study. We cannot guarantee that each faculty member consistently includes all individuals on their webpage. While the majority of computer science professors at high-ranking universities update their homepage at least once a year, some faculty members may not update information about newly admitted students as frequently.
Universities can utilize the findings of this study to formulate and implement policies aimed at promoting diversity and equality among their graduate students. Furthermore, they can raise awareness among faculty members regarding the benefits, particularly in terms of scientific achievement, that arise from having a diverse research team. Universities can also encourage faculty members to actively consider admitting students from a variety of nationalities.
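The diversity trends summarized above are reported as entropy values (see Figure 22). As an illustration only, the following minimal sketch computes Shannon entropy over the category labels of a group. It assumes that the entropy in question is the standard Shannon entropy in bits; the function name and the example cohorts are hypothetical and are not taken from the study's dataset.

```python
import math
from collections import Counter


def shannon_entropy(labels):
    """Return the Shannon entropy (in bits) of a list of category labels.

    Assumption: diversity is measured as standard Shannon entropy; this is
    an illustrative sketch, not the study's actual implementation.
    """
    counts = Counter(labels)
    total = sum(counts.values())
    if total == 0:
        return 0.0
    # Sum of -p * log2(p) over the categories observed in the group.
    return sum(-(c / total) * math.log2(c / total) for c in counts.values())


# Hypothetical cohorts (not data from the study).
balanced = ["F", "M", "F", "M"]   # evenly split group
skewed = ["M", "M", "M", "F"]     # mostly one category
uniform = ["M", "M", "M", "M"]    # a single category only

for name, cohort in [("balanced", balanced), ("skewed", skewed), ("uniform", uniform)]:
    print(name, round(shannon_entropy(cohort), 3))
```

Under this measure, a group with a single category has zero entropy, and an evenly split two-category group reaches the maximum of one bit, so higher values correspond to more diverse groups.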

Figure 1 A sampled subgraph of the advisor-student relationship network. The subgraph is shown before (a) and after (b) the application of the disparity filter algorithm

Figure 2 The sampled subgraph of the advisor-student relationship network with specified communities. Communities are detected using Louvain (c) and Leiden (d) community detection algorithms

Figure 3 Mosaic plot of advisors' academic rank and gender. The numbers displayed on the bars indicate the percentage of advisors' gender, while the numbers on the gray section represent the percentage of faculty members in each rank. The width of each bar corresponds to the number of faculty members with a specific rank

Figure 4 Back-to-back bar plot of advisors' gender and their research fields

Figure 6 Boxplots of advisors' publication counts for each university

Figure 8 Boxplots of advisors' h-indexes for each university

Figure 8 presents the boxplots of h-indexes for faculty members at each university. It is worth noting that the h-index metric has fewer outliers compared to the previous metrics, indicating that it may be a better indicator for assessing the success of research groups [35].

Figure 10 Gender-disaggregated bar plot showing the count of students for each university. The numbers on each column represent the percentage of different genders in that particular university

Figure 11 Mosaic plot of students' degree and citizenship. The numbers on each bar represent the percentage of different citizenships, and the numbers on the gray section indicate the percentage of students in each degree. The width of each bar is proportional to the number of students in that particular degree

Figure 13 Distribution of advisors' countries of origin

Figure 15 Histogram of advisor-student common gender proportion

Figure 16 Histogram of advisor-student common nationality ratio

Figure 18 Cross-country advisor-student relationship network, with communities detected via Leiden algorithm

Figure 19 Time series of advisor-student identical gender ratio

Figure 21 Correlogram of academic success measures and gender/nationality diversity

Figure 22 Students' gender entropy across admission years

Table 1 Essential professor information

Table 2 Mapping between professor interests and ACM subareas

Table 3 Student information in our dataset

Table 4 Scientometrics and their explanations