Responsible team players wanted: an analysis of soft skill requirements in job advertisements

During the past decades the importance of soft skills for labour market outcomes has grown substantially. This carries implications for labour market inequality, since previous research shows that soft skills are not valued equally across race and gender. This work explores the role of soft skills in job advertisements by drawing on methods from computational science as well as on theoretical and empirical insights from economics, sociology and psychology. We present a semi-automatic approach based on crowdsourcing and text mining for extracting a list of soft skills. We find that soft skills are a crucial component of job ads, especially of low-paid jobs and jobs in female-dominated professions. Our work shows that soft skills can serve as partial predictors of the gender composition in job categories and that not all soft skills receive equal wage returns in the labour market. In particular, “female” skills are frequently associated with wage penalties. Our results expand the growing literature on the association between soft skills and wage inequality and highlight their importance for occupational gender segregation in labour markets.


Introduction
When it comes to jobs and careers, technical abilities and professional qualifications are important factors both from the perspective of an employer and of a new employee. However, as pointed out by recent studies [1][2][3], more and more attention is focused on soft skills, i.e. qualities that do not depend on the acquired knowledge and that are harder to quantify due to being related to one's emotional intelligence and personality traits. At the same time, they are extremely important because they facilitate human connections [4]. The Oxford dictionary, for instance, defines soft skills as "personal attributes that enable someone to interact effectively and harmoniously with other people". Between 1980 and 2012, jobs with high social skills requirements grew by around 10% as a share of the US labour force [5]. The increasing importance of soft skills in labor markets stems from the growth of the service sector, where interpersonal services are sold, as well as from the introduction of lean manufacturing, where an integrated skill set, comprising both hard and soft skills, has gained importance [6,7]. Observational studies have also shown that social features potentially related to soft skills (e.g. the variety of friendship connections and position diversity within a community) are positively correlated with economic outputs [8,9].
The growing importance of soft skills also carries implications for gender inequality in labour markets. Research has shown that certain societal groups are perceived as lacking important soft skills, e.g. evidence was found that black men are characterized as being less motivated than their white counterparts [10]. Additionally, not all types of soft skills are valued equally, e.g. based on gender stereotypes and beliefs about women's inferior status in the workplace, skills that are perceived as "female" are found to be associated with wage penalties [11][12][13]. On the other hand, recent scholarly debates engage in the discussion of a possible female advantage associated with the rising importance of people skills in contemporary labor markets [2][14][15][16].
Despite the growing importance of soft skills and their potential contributions to inequalities in labour markets, to date, we know surprisingly little about the role of "gendered soft skills" (i.e., soft skills that are stereotypically associated with one gender) in the job market [15,17]. Most prior scientific articles referring to skills and labor market outcomes construct indices of soft skills in which male and female connoted skills get added up, rather than making a distinction between them (see, for instance, [2,14]). This approach is useful, because the overall increasing importance of soft skills in contemporary labor markets [16] can be measured in an easy-to-grasp, single-index way. However, this coarse-grained measure can mask important differences in labor market outcomes with regard to gendered soft skills. We go beyond this relatively crude measure by introducing a semi-automatic approach for constructing an extensive list of soft skills from job advertisements, which we can use for soft skills detection. Combining this data on soft skills with what prior research has identified as commonly shared gender stereotypes (see, for instance, [18][19][20]) and official statistics about the proportion of women in various professional fields allows us to differentiate soft skills depending on their gender connotation. Thus we are able to establish new insights on the association of soft skills related to gender stereotypes and wages.
Additionally, we present evidence on the impact of soft skills on sex segregation in labor markets. Although the existing literature on supply-side mechanisms of occupational sorting, i.e. women making career choices based on potentially biased self-assessed beliefs about interests and capacities, is growing [21,22], the demand-side process, meaning the allocation of men and women into sex-typed occupations by employers, remains relatively understudied [17]. There is only a limited number of studies examining the influence of gendered wording on occupational choices. These studies use small-scale experiments and thus cover only a limited range of soft skills associated with gender stereotypes [19][23][24][25][26]. Utilizing our newly extracted dataset based on real job advertisements, we are able to examine the impact of soft skills in general and gendered soft skills in particular on occupational segregation.
Based on our unique dataset on soft skills in job ads, we find evidence that female connoted soft skills are associated with wage penalties, while soft skills perceived as being stereotypically male are linked to wage premiums. Our results show further that women are more likely to be found in occupations that are advertised using soft skills associated with female stereotypes and vice versa for men. This article is structured as follows: in Sect. 2, we present our methodology for extracting soft skill mentions from a large corpus of job advertisements. In Sect. 3, we scrutinize wage premiums and penalties associated with soft skills frequently mentioned in job ads based on a matching study. Next, the role of soft skills in reproducing gender segregation, i.e. the unequal distribution of men and women across occupations, is examined in Sect. 4. Finally, we present conclusions in Sect. 5 with a summary of our findings, their implications, limitations, and suggestions for future work.

Methods and data
In this section, we describe the datasets used in this work and our semi-automatic soft skill mining approach. Following this approach we first create clusters of soft skills, grouping similar soft skills together, and then detect soft skills in job ads by searching for the soft skill strings in job descriptions.

Data
Our analysis is based on a dataset containing 245,000 job advertisements (ads) from the United Kingdom (UK). This data is provided by the Adzuna job search engine, which collects job ads from hundreds of different websites. Each job ad entry contains the title, full description, job category, and salary of the job, among five other types of fields. Adzuna has classified the ads into 29 job categories, based on the source of the ad and the job's description. Table 1 illustrates the most distinctive soft skills for five selected job categories. Desired soft skills differ considerably depending on the job category. For instance, the three most distinctive skills for Teaching are enthusiastic, dedicated, professional, whereas for Accounting & Finance they are accurate, responsible, analytical abilities. The soft skill detection algorithm is described in Sect. 2.2.4.
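As an illustration of how the distinctiveness values of Table 1 can be obtained, the following minimal sketch computes, for one job category, the absolute difference between the percentage of ads in that category that contain a skill and the percentage of all ads that contain it (the definition given in the caption of Table 1). The function name, the data layout and the toy ads are assumptions made for this example and do not come from the Adzuna data.

```python
from collections import Counter

def distinctiveness(ads, category):
    """Return {skill: |% of ads in `category` with skill - % of all ads with skill|}.

    `ads` is a list of (category, set_of_skills) pairs (hypothetical layout).
    """
    in_category = [skills for cat, skills in ads if cat == category]
    if not in_category:
        raise ValueError("no ads in category %r" % category)
    all_counts = Counter(s for _, skills in ads for s in skills)
    cat_counts = Counter(s for skills in in_category for s in skills)
    result = {}
    for skill, n_all in all_counts.items():
        pct_category = 100.0 * cat_counts[skill] / len(in_category)
        pct_all = 100.0 * n_all / len(ads)
        result[skill] = abs(pct_category - pct_all)
    return result

# Toy data, invented for illustration only.
toy_ads = [
    ("Teaching", {"enthusiastic", "dedicated"}),
    ("Teaching", {"enthusiastic"}),
    ("Accounting & Finance", {"accurate"}),
    ("Accounting & Finance", {"accurate", "dedicated"}),
]
teaching = distinctiveness(toy_ads, "Teaching")
```

In this toy example, enthusiastic appears in every Teaching ad but in only half of all ads, so it is highly distinctive for Teaching, whereas dedicated is equally frequent inside and outside the category and therefore not distinctive.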
All experiments in this paper are conducted using the UK dataset, except for a crowdsourcing experiment needed for collecting an initial list of soft skills, which is described in Sect. 2.2.1. For this crowdsourcing experiment, a dataset posted by the Armenian human resource portal CareerCenter, consisting of 19,000 online job postings from the period 2004-2015, is more appropriate, because job requirements are listed in a separate field. Thus the workers do not need to read through the full ad, allowing us to annotate more ads and to collect a longer list of soft skills.

Soft skill mining
Our semi-automatic soft skill mining approach consists of the following steps: first, crowdworkers generate an initial set of potential soft skills, second, skills that seldom refer to candidates are removed, third, soft skills with a similar meaning are clustered into groups of skills, and fourth, soft skills are detected in new ads. These steps are summarized in Fig. 1 and explained in more detail in the following sections.
The resulting soft skills and their clusters are available at http://dx.doi.org/10.7802/1707.

Crowdsourcing a list of soft skills
The collection of soft skills was done through Figure Eight (formerly known as CrowdFlower), a crowdsourcing platform that allowed us to speed up our data collection process by submitting annotation tasks to online crowdworkers.

Table 1 note: Distinctiveness is defined as the absolute difference between the percentage of job ads that contain the skill within the given category (%) and the percentage of job ads that contain the skill within all categories.

Figure 1
The steps of our data-extraction process. We collect a list of soft skill clusters using crowdsourcing and then find occurrences of these clusters in a corpus of job ads.

First, each worker was given the following definition of soft skills:
In a nutshell soft skills can be identified as qualities that do not depend on acquired knowledge; they complement hard skills (also known as technical skills). According to Wikipedia soft skills "are a combination of interpersonal people skills, social skills, communication skills, character traits, attitudes, [. . .] social intelligence and emotional intelligence quotients".
This was followed by a list of soft skill examples and instructions for completing the tasks.
In particular, the workers were instructed to read the presented text, consisting of the "job description" and "required qualifications" fields, select whether the text contained any soft skills, and, if that was the case, they were instructed to copy and paste the smallest relevant part of text denoting each skill to an answer field. Additionally, the workers were instructed to remove unnecessary adjectives and complements, but not to alter the text in any other way. For instance, excellent communication skills with customers and partners had to be reported as communication skills. Before the actual annotation phase, the workers were supposed to pass a training phase and answer a set of test questions, for which we had provided the correct answers: they had to obtain an accuracy level of at least 60% to proceed further. These test questions also showed up randomly during the actual annotation phase to ensure that the minimum accuracy level of 60% was maintained.
In total, we annotated 1650 job ads by at least 3 different workers. The annotation effort was conducted in two batches. After both batches we computed the number of distinct soft skills as a function of the number of annotated ads, plotted in Fig. 2. The results show that the rate at which new soft skills are discovered slows down, although new skills were still found at the end of the data collection. However, when examining the skills found last, most of them turned out to be typos and other phrases unrelated to soft skills (these include "ability to work as a part of PSD team", which is a hard skill since PSD stands for personal security detail, and "unquestioned behaviour", which is highly ambiguous). Therefore, we decided to stop the annotation task after the second batch.
Figure 2
The cumulative number of discovered soft skills as function of annotated job ads. The rate of discovered soft skills slows down towards the end of our data collection. At the end, the newly discovered skills are mostly typos and other phrases unrelated to soft skills. The final, manually refined list consists of 948 unique soft skills.

To remove the typos as well as recurrent superfluous adjectives, the results were cleaned using a script. The script additionally removed extra whitespace and punctuation, and it corrected simple typos and misspellings by comparing the detected skill tokens to a whitelist of valid skill tokens. Thereafter, we manually reviewed the skills to remove all non-soft skills and to prune out tokens not relevant to the skill.
The final manually curated collection included 948 unique soft skills.

Removing ambiguous soft skills
The focus of this work is to analyze soft skill requirements for job applicants. However, soft skill phrases in job ads often do not refer to the required applicant characteristics, but instead describe the working environment or something else. For instance, independent could be used to describe an "independent business" or a home care assistant might be required to "help people to remain independent in their own homes." Therefore, it is crucial to be able to detect soft skills that refer to the candidate rather than something else.
To tackle this problem, we created another crowdsourcing task, instructing crowdworkers to annotate soft skill phrases in the context they appear, i.e. the job ads. We noticed that skills consisting of multiple tokens usually unambiguously refer to the candidate and therefore we only annotated the skills consisting of at most three words, that is, 582 out of the 948 skills found in the previous steps.
More specifically, for each one of these skills, we extracted 10 randomly sampled text snippets where the skill occurs, including 25 words before and after the skill. Then we asked crowdworkers to classify each snippet to one of the following three categories: Candidate, Company/Company environment, or Other. At least three answers were recorded for each text snippet.
Based on the annotations, we computed the following confidence score for each soft skill $s$:

$$\mathrm{confidence}(s) = \frac{\sum_{w \in W_c(s)} T(w)}{\sum_{w \in W(s)} T(w)},$$

where $W_c(s)$ denotes the workers who classified an occurrence of skill $s$ to refer to a candidate, $W(s)$ denotes the workers who assessed an occurrence of skill $s$, and $T(w)$ is the trust of a worker $w$. Trust is calculated by the crowdsourcing platform as the contributor's accuracy level in the current job, determined by his/her accuracy during the training phase, as explained in Sect. 2.2.1. Thus, the confidence score measures the proportion of votes for the Candidate category weighted by the trust of the workers who gave the votes.
We included the skills with a confidence value of at least 0.7 into the final list of soft skills. This value allowed us to retain 81.3% of the annotated skills (8.3% of trigram, 10.3% of bigram and 40.1% of single-word skills were discarded) while still having a relatively high confidence that the retained soft skill phrases actually refer to the candidate.
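The trust-weighted confidence score and the 0.7 cut-off described above can be sketched as follows. The representation of the annotations as (label, trust) pairs, the function name and the example votes are assumptions made for illustration; only the weighting scheme and the threshold are taken from the text.

```python
CONFIDENCE_THRESHOLD = 0.7  # cut-off stated in the text

def confidence(votes):
    """Share of trust-weighted votes that assign a skill to the candidate.

    `votes` is a list of (label, trust) pairs collected over all annotated
    snippets of one skill (hypothetical layout); label is one of
    "Candidate", "Company" or "Other".
    """
    total_trust = sum(trust for _, trust in votes)
    if total_trust == 0:
        raise ValueError("no usable votes for this skill")
    candidate_trust = sum(trust for label, trust in votes if label == "Candidate")
    return candidate_trust / total_trust

# Invented example votes for one ambiguous skill.
votes_independent = [
    ("Candidate", 0.9),
    ("Candidate", 0.8),
    ("Company", 0.6),
    ("Candidate", 0.7),
    ("Other", 0.5),
]
score = confidence(votes_independent)
keep_skill = score >= CONFIDENCE_THRESHOLD
```

With these invented votes the score is 2.4 / 3.5, which is slightly below 0.7, so the skill would be discarded as too ambiguous.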

Soft skill clustering
Many of the soft skills collected by the crowdworkers are synonyms or near-synonyms. The different versions of a skill result, e.g., from diverse ways of expressing the concept (team-worker, ability to work in a team), or from slightly different spellings (able to work in team). To unify the different variants, the collected soft skills were clustered by first employing an algorithmic approach and then refining the clusters manually. After experimenting with a small subset of soft skills, different algorithms and parameter settings, we decided upon the following procedure.
Each soft skill was first represented in the vector space by averaging the word2vec [27] embeddings of its tokens, excluding stopwords. We used 300-dimensional embeddings pre-trained on the GoogleNews dataset. Then, we employed an agglomerative clustering algorithm to cluster the embedding vectors using average linkage and the cosine distance measure. The clusters were finally reviewed and manually improved by split and merge operations and by reassigning some of the skills to more appropriate clusters, obtaining a final list of 190 clusters.
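The algorithmic part of the clustering step can be sketched in a few lines. This is a minimal pure-Python illustration, not the pipeline actually used: the two-dimensional toy vectors stand in for the 300-dimensional word2vec embeddings, the stop word list is invented, and the distance threshold of 0.3 at which merging stops is a hypothetical parameter that the text does not specify.

```python
import math

STOP_WORDS = {"a", "in", "to", "of", "the"}  # hypothetical, tiny list

# Toy 2-D vectors standing in for pre-trained word2vec embeddings.
TOY_EMBEDDINGS = {
    "team": (1.0, 0.1),
    "worker": (0.95, 0.15),
    "work": (0.9, 0.2),
    "communication": (0.1, 1.0),
    "skills": (0.2, 0.9),
}

def skill_vector(skill):
    """Average the embeddings of the skill's tokens, excluding stop words."""
    vectors = [TOY_EMBEDDINGS[t] for t in skill.lower().split()
               if t not in STOP_WORDS and t in TOY_EMBEDDINGS]
    dim = len(vectors[0])
    return [sum(v[i] for v in vectors) / len(vectors) for i in range(dim)]

def cosine_distance(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / (norm_u * norm_v)

def average_linkage_clusters(vectors, threshold):
    """Merge the two closest clusters until the closest pair exceeds `threshold`."""
    clusters = [[i] for i in range(len(vectors))]
    while len(clusters) > 1:
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                # Average linkage: mean pairwise distance between members.
                pairs = [(i, j) for i in clusters[a] for j in clusters[b]]
                d = sum(cosine_distance(vectors[i], vectors[j])
                        for i, j in pairs) / len(pairs)
                if best is None or d < best[0]:
                    best = (d, a, b)
        if best[0] > threshold:
            break
        _, a, b = best
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]
    return clusters

skills = ["team worker", "work in a team", "communication skills", "communication"]
clusters = average_linkage_clusters([skill_vector(s) for s in skills], threshold=0.3)
```

On the toy input, the two teamwork variants end up in one cluster and the two communication variants in another. In the approach described above, such automatically produced clusters were only a starting point and were subsequently refined by hand.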

Soft skill detection
In the final phase, our goal was to detect skill clusters in each job ad.
First, we preprocessed the job descriptions and the list of soft skills by lowercasing and removing stop words. We also removed the competence terms (able, skills, etc.) from most soft skills, if they were perceived as not being fundamental for skill identification, to avoid false negatives (e.g. capable of handling multiple tasks should match with abilities in handling multiple tasks). Still, for some skills, we kept the competence terms if they would have become too ambiguous, resulting in false positive detection (e.g. communication skills without the word skills would match with communication technologies).
Thereafter, we searched for each soft skill s in each job description. If s consisted of multiple tokens, we allowed for at most two extra words to occur before each token, in addition to stop words, which could be removed from certain skills without making them ambiguous. We also experimented with more liberal ways of matching skills, ignoring the word order of the skill tokens or lemmatizing the tokens, but these were found to decrease the precision of the detected skills significantly.
Soft skills were detected in 78% of the ads, with 45.5% mentioning at least 3 soft skills, attesting to the importance of soft skills in the labour market.
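The matching rule described above (ordered skill tokens, with at most two extra words allowed before each token after stop word removal) can be illustrated with the following sketch. The stop word list, the tokenizer and the function names are assumptions for this example, and the greedy left-to-right search is a simplification that may miss some valid matches when a skill token is repeated in the ad.

```python
import re

# Hypothetical, deliberately small stop word list.
STOP_WORDS = {"a", "an", "and", "as", "in", "of", "the", "to", "with"}

def preprocess(text):
    """Lowercase, keep alphabetic tokens, and drop stop words."""
    return [t for t in re.findall(r"[a-z]+", text.lower()) if t not in STOP_WORDS]

def contains_skill(ad_tokens, skill_tokens, max_extra=2):
    """True if the skill tokens occur in order, with at most `max_extra`
    other words between consecutive skill tokens."""
    if not skill_tokens:
        return False
    for start, token in enumerate(ad_tokens):
        if token != skill_tokens[0]:
            continue
        pos, matched = start, True
        for wanted in skill_tokens[1:]:
            # The next skill token must appear among the following
            # max_extra + 1 ad tokens.
            window = ad_tokens[pos + 1 : pos + 2 + max_extra]
            if wanted not in window:
                matched = False
                break
            pos += 1 + window.index(wanted)
        if matched:
            return True
    return False

skill = preprocess("work in team")
ad_hit = preprocess("We need someone with the ability to work in a busy, friendly team.")
ad_wrong_order = preprocess("Our team of engineers supports remote work.")
ad_gap_too_long = preprocess("You will work closely alongside several external team leads.")
```

Keeping the word order and the small gap limit reflects the observation above that more liberal matching strategies reduced precision.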

Related work on soft skill mining
The curation of hard skills has been addressed by LinkedIn [28], whereas Kivimäki et al. [29] proposed a system for automatic detection of new skills in free written text using a  [30] suggested a novel approach for collecting data on skills and gender imbalances through LinkedIn's advertising platform. Automatic classification of soft skills referring to a candidate vs. something else (e.g. the work environment), has been studied by Sayfullina et al. [31], using the crowdsourced data collected in this work as described in Sect. 2.2.2.

Salary and soft skills
One of our main research questions is how the presence of certain soft skills may affect wages.
Analyzing annual salaries of job ads, we found that low-paid job ads contain, on average, more soft skills than high-paid job ads. This is illustrated in Fig. 3 which shows the average number of soft skill mentions per job ad in four different salary groups. The ads with a salary (s) of s ≤ £20,000 have 3.52 soft skills on average, whereas ads with a salary of £60,000 < s ≤ £80,000 have only 2.97 soft skills on average. All paired differences between the salary groups are statistically significant (p < 0.001; two-tailed t-tests with unequal variances).
While the higher prevalence of soft skills in low paid jobs is interesting by itself, it does not reveal which soft skills tend to be associated with wage premiums and which ones with wage penalties. To address this question we conduct a matching study.

Matching study
In order to study the link between a job ad's soft skill requirements and its salary, we conduct a matching study [32]. The benefit of matching is that, in pairing a treated job ad (i.e. an ad with a given job title and job category that contains a specific skill) with its counterfactual (i.e. an ad with the same title and category but without the specific skill), we can control for a range of unobserved job category characteristics [33]. These characteristics include, for instance, work experience, since job titles often include qualifiers, such as head, senior, junior, or intern.
The specific matching strategy applied in this article is as follows: first, we group ads having the same job category $c$ and job title $t$, ignoring stop words and the word order of the title. We picked all titles occurring at least twice, resulting in 34,071 distinct titles and 158,658 ads. Given a soft skill $s$, a normalized salary reward is defined for each group from $M_{s,c,t}$ and $\bar{M}_{s,c,t}$, the average salaries of job ads belonging to job category $c$, having job title $t$, and containing or not containing skill $s$, respectively.
The individual rewards are then combined into an overall reward $r_s$, using $C_{s,c,t}$ and $\bar{C}_{s,c,t}$, the numbers of job ads belonging to job category $c$, having job title $t$, and containing or not containing skill $s$, respectively. Individual rewards are weighted by the number of ads, $\min(C_{s,c,t}, \bar{C}_{s,c,t})$, to prevent infrequent job titles from having a disproportionately large effect on the overall reward. In most cases, $\min(C_{s,c,t}, \bar{C}_{s,c,t}) = C_{s,c,t}$ since typically less than half of the ads from any category contain a given soft skill. Thus, the individual rewards are typically weighted by the number of ads containing the skill. A positive reward $r_s$ indicates that job ads that mention skill $s$ have on average a higher salary than other job ads from the same job category and the same job title that do not mention $s$.
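A minimal sketch of the matching computation is given below. It rests on two assumptions that are not confirmed by the text: the individual reward of a (category, title) group is taken to be the relative salary difference in percent, 100 · (M − M̄) / M̄, and the overall reward is taken to be the average of the individual rewards weighted by min(C, C̄). The stop word list, the dictionary layout of an ad and the toy ads are likewise invented for illustration.

```python
from collections import defaultdict

STOP_WORDS = {"a", "an", "and", "of", "the", "for", "in"}  # hypothetical list

def title_key(title):
    """Normalize a title by ignoring stop words and word order."""
    tokens = [t for t in title.lower().split() if t not in STOP_WORDS]
    return tuple(sorted(tokens))

def skill_reward(ads, skill):
    """Weighted average of per-group relative salary differences (in %).

    ASSUMPTION: individual reward = 100 * (M - M_bar) / M_bar, and the
    overall reward is a weighted average with weights min(C, C_bar).
    """
    groups = defaultdict(lambda: ([], []))
    for ad in ads:
        with_skill, without_skill = groups[(ad["category"], title_key(ad["title"]))]
        target = with_skill if skill in ad["skills"] else without_skill
        target.append(ad["salary"])

    weighted_sum = total_weight = 0.0
    for with_skill, without_skill in groups.values():
        weight = min(len(with_skill), len(without_skill))
        if weight == 0:
            continue  # no counterfactual available within this group
        mean_with = sum(with_skill) / len(with_skill)
        mean_without = sum(without_skill) / len(without_skill)
        reward = 100.0 * (mean_with - mean_without) / mean_without
        weighted_sum += weight * reward
        total_weight += weight
    return weighted_sum / total_weight if total_weight else None

# Toy ads, invented for illustration.
toy_ads = [
    {"category": "IT", "title": "Senior Developer", "skills": {"leadership"}, "salary": 55000},
    {"category": "IT", "title": "Developer Senior", "skills": set(), "salary": 50000},
    {"category": "IT", "title": "senior developer", "skills": set(), "salary": 50000},
    {"category": "Sales", "title": "Sales Assistant", "skills": {"leadership"}, "salary": 22000},
    {"category": "Sales", "title": "Sales Assistant", "skills": {"punctual"}, "salary": 20000},
]
leadership_reward = skill_reward(toy_ads, "leadership")
punctual_reward = skill_reward(toy_ads, "punctual")
```

In the toy data, ads mentioning leadership pay 10% more than their matched counterparts in both groups, while the single ad mentioning punctual pays less than its counterpart, giving a negative reward.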
To compute the statistical significance of an observed reward value, $r_{\mathrm{obs}}$, we conduct a permutation test as follows: each job ad consists of (i) a set of soft skills mentioned in the job description, (ii) job category and title, and (iii) salary. We shuffle the soft skill sets (i) between the ads and keep everything else ((ii) and (iii)) fixed. This shuffling is repeated 1000 times and after each shuffle, we compute a new reward $r_{\mathrm{rand}}$. The p-value for the null hypothesis that $|r_{\mathrm{obs}}| \le |r_{\mathrm{rand}}|$ is given simply by the fraction of $|r_{\mathrm{rand}}|$ values that are greater than or equal to $|r_{\mathrm{obs}}|$. If the fraction is below or equal to a threshold of $\alpha = 0.05$, we conclude that $r_{\mathrm{obs}}$ is statistically significant and mark the reward with a '*'. A reward with $p \le 0.01$ is marked by '**'.
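The permutation test can be sketched as follows. To keep the example short and self-contained, the reward is replaced by a simple stand-in statistic (the difference in mean salary between ads with and without the skill, ignoring job category and title); this stand-in, the function names, the fixed random seed and the toy data are assumptions made for illustration. The shuffling scheme, the number of shuffles and the significance markers follow the description above.

```python
import random

def mean_salary_gap(skill_sets, salaries, skill):
    """Stand-in statistic (NOT the matched reward): mean salary of ads with
    the skill minus mean salary of ads without it."""
    with_skill = [sal for sk, sal in zip(skill_sets, salaries) if skill in sk]
    without_skill = [sal for sk, sal in zip(skill_sets, salaries) if skill not in sk]
    if not with_skill or not without_skill:
        return 0.0
    return sum(with_skill) / len(with_skill) - sum(without_skill) / len(without_skill)

def permutation_p_value(skill_sets, salaries, skill, n_shuffles=1000, seed=0):
    """Fraction of shuffles whose |statistic| is at least the observed one."""
    rng = random.Random(seed)
    r_obs = mean_salary_gap(skill_sets, salaries, skill)
    shuffled = list(skill_sets)  # salaries (and, in general, titles) stay fixed
    hits = 0
    for _ in range(n_shuffles):
        rng.shuffle(shuffled)
        r_rand = mean_salary_gap(shuffled, salaries, skill)
        if abs(r_rand) >= abs(r_obs):
            hits += 1
    return hits / n_shuffles

def significance_marker(p):
    """'**' for p <= 0.01, '*' for p <= 0.05, empty string otherwise."""
    if p <= 0.01:
        return "**"
    if p <= 0.05:
        return "*"
    return ""

# Toy data with a strong, artificial salary difference.
toy_skill_sets = [{"leadership"}] * 10 + [set()] * 10
toy_salaries = [50000] * 10 + [20000] * 10
p_value = permutation_p_value(toy_skill_sets, toy_salaries, "leadership")
```

Because the toy salaries are perfectly separated by the skill, almost no random reassignment of skill sets reproduces a gap as large as the observed one, so the resulting p-value is very small.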

Results
The soft skills that are associated with the highest wage premiums or penalties are shown in Table 2. Most of the soft skills associated with wage premiums can also be considered a requirement for higher occupational positions. Soft skills such as delegation skills, team building skills and leadership imply that a certain kind of supervision and authority toward others is required [34]. In contrast, listening skills, willingness to learn, as well as being punctual, describe skills that entail a certain degree of subordination.
Our empirical observation that soft skills associated with wage premiums are also closely tied to leadership positions is in accordance with sociological occupational class theories. Previous research on occupational classes has identified the magnitude of a job's authority as one of the key determinants in assessing the job's position in the occupational class system [35,36]. Jobs that entail a high degree of authority also occupy a strategic position in the labour market: by monitoring their subordinates, employees in leadership positions are ensuring that a firm produces surplus. Given this powerful position, high degrees of authority entail a significant degree of bargaining power and thereby the possibility to demand higher than average wages [36]. Empirical research indeed supports this notion and shows that leadership skills are associated with wage premiums [37,38].
Additional supporting evidence for this particular reading of the results comes from psychology. We find that character traits associated with wage premiums, for instance delegation skills, team building skills, and strategic planning, are closely connected to skills psychological research has identified as leadership characteristics, i.e. management of personnel, visioning, as well as general strategic skills [39]. What is striking is that many of the aforementioned skills in Table 2 also correspond to gender stereotypes. Gender stereotypes are generalizations about commonly shared perceptions of female and male attributes. Previous research has shown that while women are described as embodying "communal behavior", such as kindness, loyalty, and warmth, men are characterized by "agentic traits", such as competitiveness and aggressiveness [20], and as possessing leadership abilities [18]. Common "agentic" traits, such as competitive and aggressive, have been filtered out as ambiguous (see Sect. 2.2.2), since they typically do not describe the desired characteristics of the job applicant. However, we still find several leadership traits to be associated with higher wages in Table 2. Moreover, "communal behavior" seems to be associated with wage penalties in Table 2 across the board (for instance: polite, dedication, friendly personality, and being calm).
Thus, Table 2 provides first evidence that male gender stereotypes are connected to wage premiums, whereas female gender stereotypes are connected to wage penalties in the labor market. To scrutinize this issue further, in the following section we examine the association between gender stereotypes and wages in more detail.

Gender and soft skills
In this section we scrutinize to what extent soft skills are associated with occupational sex segregation. Thereafter, we explore a possible relationship between wages and gendered soft skills.

Industry gender composition prediction
In what follows, we test whether soft skills can predict the gender composition of a job category. The proportion of women for each job category was approximated by mapping the job categories in our data to the nearest categories from UK Labour Market statistics as shown in Table 3.
We find that job ads in female-dominated job categories mention 3.20 soft skills on average, while ads in male-dominated job categories mention only 3.00 soft skills. The difference in means is statistically significant (p < 0.001; two-tailed t-test with unequal variances).
To predict the proportion of women in the category of a job ad, we used ordinary least squares (OLS) regression over job ads containing at least 3 different soft skills. Table 4 shows the soft skill clusters that are most predictive of female-dominated jobs (positive coefficients) and of male-dominated jobs (negative coefficients). Only those skill clusters that occurred more than 50 times and whose coefficient is statistically significant (p < 0.01) are shown. The table also indicates whether the reward associated with a soft skill is significant or not. The model obtained an $R^2$ score of 0.11.
A high proportion of women in a job category is associated with soft skills such as empathy, respectful, sensitivity and dedication. Skills such as marketing skills, ability to win new business, ability to lead project teams and analytical skills are negatively associated with women's shares in job categories, meaning they are mentioned more frequently in ads for male-dominated jobs. These results illustrate that with a few exceptions (e.g. delegation skills and managerial skills), the soft skills that are predictive of the job's gender composition are also closely associated with gender stereotypes. Thus, not only do skills associated with gender stereotypes about women potentially get lower rewards in labor markets (as suggested by Table 2), but we further find that some soft skills, which are distinctive of the gender composition within a job, are also stereotyped as being female. Put differently, not only does one potentially get paid less if one is carrying out tasks connoted as being female, but occupations carried out mainly by women are also advertised making use of those skills that are associated with wage penalties.
Our findings also suggest that there are two deviations from this pattern, i.e. delegation skills and managerial skills, which are soft skills that are associated with leadership (male) stereotypes but still predict a high proportion of women in an occupation. This finding, however, is in line with previous research, providing evidence that women will apply for leadership positions if the remaining part of the job ad is phrased using female stereotypes or gender neutral language [19,23,24].

Occupational segregation and gender-stereotypical soft skills
To more systematically analyze the claim that the gender composition of an occupation is shaped by gender stereotypes, we mapped our soft skill clusters to a list of twenty personality characteristics desired in men and another twenty characteristics desired in women, the so-called Bem Sex Role Inventory [18]. Out of these, we were able to map five feminine and seven masculine characteristics to similar soft skill clusters in our data, shown in Table 5. Based on the mappings, we set out to study the prevalence of the gender-stereotypical soft skills in job ads of female- and male-dominated industries. The percentage of ads containing a skill within the ads from female- (male-) dominated industries is denoted by $P_f$ ($P_m$). In the last column of Table 5 we show the percentage difference between these two percentages. A positive value means that the skill is used more in female-dominated industries and a negative value that it is used more in male-dominated industries.

Table 4
The first twelve soft skills are the strongest predictors for female oriented job ads (i.e. job ads for professions with a high proportion of women), while the last twelve rows correspond to the strongest predictors for male oriented ads (i.e. job ads for professions with a low proportion of women). Many of the found predictors correspond to common gender stereotypes. The third column lists the salary reward (r, see Eq. (2)), whilst the fourth shows the number of samples from the training set in which the skill clusters appear.

All feminine skills are more prevalent in female-dominated industries, whereas for masculine skills the picture is not as clear. For instance, analytical skill is used more than five times more often in male-dominated industries, while leadership is used almost twice as often in female-dominated industries, although both of these skills are stereotypically masculine according to Bem [18]. This finding, however, is in agreement with previous research, where evidence was found that although women will make inroads into occupations in which the skill set is in line with typically male features, this is not true the other way around [17,40]. Hence, although women try to push into male-dominated occupations, men do not do the same with regard to female-dominated occupations.
Table 5
The gender stereotypes listed by Bem [18] that could be mapped to one of our soft skill clusters. On average, the feminine stereotypes are associated with a wage penalty (r = -1.7), whereas the masculine stereotypes are associated with a premium (r = 2.6). The percentages of job ads within female- and male-dominated industries that mention a skill cluster are denoted by $P_f$ and $P_m$, respectively.

Our findings have implications for occupational sex segregation, that is, the unequal distribution of men and women across occupations in the labour market. Advertising female- or male-dominated jobs in accordance with the associated gender stereotypes reproduces cultural beliefs about these stereotypes and upholds the gender-typicality of occupations.
Previous research has shown that cultural beliefs about gender stereotypes influence the self-assessment of men and women [22,41]. These biased self-assessments have been shown to be a crucial factor in career choices [22]. Accordingly, experimental evidence suggests that if jobs are advertised using stereotypically male traits, women are less likely to think that they are suitable for the position [25] and, hence, hesitate to apply. Thus, by illustrating that real job advertisements that include female stereotypes are dominated by women, we provide large-scale evidence that job ads can be seen as part of a leaky pipeline [42], serving as the first sorting mechanism by which women are crowded out of male-dominated occupations in labor markets [19,23,25,26].
The results thereby suggest the importance of gender stereotypes in the reproduction of occupational segregation, i.e. the demand-side, and the corresponding selection of men and women in different occupations.
However, it is important to note that while our results establish a correlation between the usage of stereotypical soft skills and occupational segregation, studying the causal mechanisms between the two is beyond the scope of this paper. Nevertheless, this work supplements the much richer account of research examining the supply side of the unequal distribution of men and women across occupations, namely the influence of gendered individual preferences and respective assessments of one's own skills and capacities [21,22], by showing a connection between the demand-side, i.e. job ads, and occupational segregation.

Gendered soft skills and salary
Results in the previous section illustrated that soft skills corresponding to gender stereotypes are associated with the gender composition of the job category. In what follows, we examine to what extent these gendered soft skills are associated with wage premiums or penalties. Gender stereotypes may influence wages. More specifically, tasks that are linked to typically "female" responsibilities are often associated with wage penalties [43][44][45]. An explanation for the devaluation of "female" tasks is found in the ascribed lower status of women, i.e. gender status beliefs. Gender status beliefs are diffuse cultural beliefs on account of which men are rated more competent than women. These beliefs about women's lack of aptitude and competence are transferred to the labour market and thereby facilitate a devaluation of women and typically "female" tasks in the workplace [11]. Recent evidence, for instance, suggests that women are underrepresented in academic fields where practitioners believe that raw talent is needed in order to succeed. Women are simply seen as less brilliant than men and are therefore not hired in academic segments where beliefs about the need for innate talent are salient [46].
The rewards in Table 4 illustrate that soft skills that correspond to gender stereotypes about women, such as respectful, empathy and dedication, are predominantly associated with wage penalties (with the exception of sensitivity). A similar pattern is found in Table 2, where most of the soft skills related to stereotypes about women are associated with wage penalties, while the ones linked to leadership bring about wage premiums. Hence, our study presents evidence on the devaluation of soft skills related to gender stereotypes based on a large-scale list of soft skills derived from real job ads. We thereby confirm previous small-scale research, which found that, net of individual labour-market-relevant characteristics such as work experience, single tasks tied to female gender stereotypes (such as nurturing [43]) are associated with wage penalties [44,45].
Regarding male-dominated jobs, our results show that soft skills that are associated with commonly shared stereotypes about men, such as analytical skill and self starter [19], predict statistically significant wage premiums. Moreover, Table 4 illustrates that leadership skills, which are also stereotypically ascribed to men, do come with wage premiums (i.e., ability to win new business, ability to lead project teams, and ability to present ideas). However, we find that leadership skills associated with female-dominated occupations, such as delegation skills and managerial skills, are related to wage premiums as well. Taken together, soft skills that are associated with a high share of women in an occupation are more often related to wage penalties than soft skills that are associated with a high percentage of male incumbents. However, if the soft skills required in female-dominated occupations are leadership skills, they can also entail wage premiums.
To further explore the association between sex-typed gender stereotypes and wage penalties or premiums, we calculated the salary rewards r of the soft skill clusters that we found congruent with the personality traits from the Sex Role Inventory by Bem [18]. The rewards are listed in Table 5. We find that all masculine skills are associated with a positive reward, whereas three of the five feminine skills are associated with a penalty. The average rewards for masculine and feminine skills are 2.6 and -1.7, respectively. This difference is statistically significant (one-tailed t-test with equal variances; p = 0.014). This suggests that stereotypically masculine character traits are valued more in the workplace than feminine character traits.
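For readers unfamiliar with the test named above, the following Python sketch shows how a one-tailed, two-sample t-test with pooled (equal) variances can be computed. It is a sketch under stated assumptions, not the code used for this paper: the reward values are placeholders rather than the entries of Table 5, the helper names are hypothetical, and the p-value is obtained by simple numerical integration of the Student's t density, where a statistics library would normally be used instead.

```python
import math
from statistics import mean, variance


def pooled_t(a, b):
    """Two-sample t statistic with pooled (equal) variances, plus degrees of freedom."""
    n_a, n_b = len(a), len(b)
    df = n_a + n_b - 2
    pooled_var = ((n_a - 1) * variance(a) + (n_b - 1) * variance(b)) / df
    se = math.sqrt(pooled_var * (1.0 / n_a + 1.0 / n_b))
    return (mean(a) - mean(b)) / se, df


def t_pdf(x, df):
    """Density of Student's t distribution with `df` degrees of freedom."""
    norm = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return norm * (1 + x * x / df) ** (-(df + 1) / 2)


def upper_tail_p(t, df, steps=20000):
    """One-tailed p-value P(T > t), using Simpson's rule on [0, |t|] (steps must be even)."""
    if t == 0:
        return 0.5
    h = abs(t) / steps
    total = t_pdf(0.0, df) + t_pdf(abs(t), df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area = total * h / 3  # probability mass between 0 and |t|
    return 0.5 - area if t > 0 else 0.5 + area


# Placeholder rewards (assumption): NOT the values reported in Table 5.
masculine = [3.0, 1.5, 4.0, 2.0]
feminine = [-2.0, 1.0, -3.0, 0.5, -1.5]

# Alternative hypothesis: mean masculine reward > mean feminine reward.
t_stat, df = pooled_t(masculine, feminine)
p_value = upper_tail_p(t_stat, df)
```

The test is one-tailed because the hypothesis is directional: masculine traits are expected to carry higher rewards than feminine traits, so only the upper tail of the distribution counts as evidence.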
Based on the evidence provided, we find that the devaluation of women is mainly realized via gender stereotypes, while skills associated with male stereotypes, i.e. leadership skills, do receive wage premiums.

Discussion and conclusions
This study examined soft skills in the labour market and showed that soft skills are a crucial component of job ads, especially of low-paid jobs and jobs in female-dominated professions, and may therefore perpetuate labour market inequalities. To explore how soft skills relate to labour market outcomes, in particular wage premiums or penalties and the gender composition of occupations, we developed a semi-automatic approach for mining soft skills from job advertisements.
We would like to highlight three key findings of our study:
1. We found that not all soft skills are valued equally in the labour market: some are associated with wage premiums while others are linked to wage penalties.
2. Some soft skills are significant predictors of a job's gender composition. Utilizing solely soft skills, we can explain 11% of the variation in the gender composition of job categories. Soft skills that are associated with gender stereotypes, such as empathy and sensitivity for women, are significant predictors of a high percentage of women in the respective jobs, and the reverse holds for characteristics perceived as being "male". However, the selection of men and women into different occupations would in itself not be crucial for labour market inequality, as long as this segregation only implied that men and women work in different occupations and no other repercussions were attached. Previous research, however, has pointed out that wages paid in female-dominated occupations are lower than in male-dominated occupations [47][48][49]. Sex segregation in the labour market is thus perceived as a crucial factor in perpetuating wage differentials between men and women. Therefore, our results suggest that gender-stereotypical job ads serve as part of a leaky pipeline upholding gender wage inequality: their wording discourages women from applying to higher-paid male-dominated jobs in the first place and thereby contributes to the selection of women into lower-paying occupations.
3. Typically "female" soft skills, i.e. prescribed stereotypes about women, are mostly associated with wage penalties, while soft skills associated with leadership, and as such stereotypes that are associated with men, come with wage premiums, even after controlling for the job title and job category. Although, by drawing on empirical research from psychology, we could explain which tasks are associated with being "male" or "female", we believe that certain soft skills, such as being respectful and being curious, are probably important in any kind of job. Given this assumption, it is all the more compelling to find that while the former is associated with a high percentage of women in an occupation and with wage penalties, the latter comes with wage premiums and is found in job ads for male-dominated occupations. This hints, as discussed, at a general devaluation of tasks carried out by women in labour markets.
One might wonder whether women could not simply apply for jobs that are advertised using "male" soft skills and thereby circumvent possible wage penalties. Current evidence, however, shows that the solution is not that simple: women are less likely to be successful when applying for a male-dominated job and when violating female gender stereotypes [20,50,51].
This study is not without limitations. We next discuss them and briefly consider how they can be addressed in future research.
First, distinguishing whether a given soft skill is a necessity for a job or merely a useful asset is beyond the scope of this paper. The accuracy of the soft skill detection method, as well as the distinction between a soft skill being an asset or a necessity, could be improved by considering part-of-speech features. Second, although we were able to account for a considerable degree of unobserved occupational heterogeneity by using matching techniques, in order to rigorously test the impact of soft skills on wages, one would need to analyze whether the wage premiums or penalties associated with certain soft skills hold net of individual labour-market-relevant attributes. More to the point, we believe that work experience and job tenure are relevant confounders in our study. The particularly large premiums for leadership are very likely also connected to senior positions requiring professional expertise and longstanding on-the-job experience.
While work experience is to some extent controlled for by using the words of the job titles (e.g. senior and intern) as matching criteria, in some cases the expected work experience is indicated only in the job description, which is not used for matching. Given previous evidence that tasks associated with being "female", such as "nurturing skills", do impose a wage penalty net of individual characteristics [43], it is plausible that our results would be stable net of individual labour-market-relevant attributes as well. In future research this could be tested by linking the soft skills to individual survey data that include measures of individual work experience.
Regardless of these limitations, this study makes an important contribution to understanding the role of soft skills in the labour market. Combining computational methods with theoretical and empirical insights from economics, sociology and psychology enabled us to shed more light on how soft skills operate in the labour market. We showed that soft skills are a crucial component of job ads, especially of low-paid jobs and jobs in female-dominated professions. Furthermore, we found evidence that soft skills are associated with gender segregation across occupations and reinforce wage inequalities between men and women by rewarding typically "male" characteristics and penalizing "female" traits.
Grugulis and Vincent [6, p. 599] put it this way: "When it is an individual character that is being judged, evaluations based on gender and race are far more likely". Put differently, personal traits and characteristics, namely soft skills, are hard to evaluate and thus likely to be judged through proxies such as gender or race and the associated stereotypes, which in turn leads to discrimination. Our results support this observation, as they suggest that soft skills polarize labour market outcomes in terms of wages and occupational segregation. This polarization hits women, an already vulnerable group in labour markets, the hardest.