Inequality and cumulative advantage in science careers: a case study of high-impact journals

EPJ Data Science

Table 3 Summary statistics for two aggregate regression models

Journal set	$N_{p}$	A	B	S	p-val.	$N_{\mathrm{fit}}$	$R^{2}$
Economics	4-9	1,090	0.17(3)	−0.046(4)	$1 \times 10^{-5}$	9	0.93
Shuffled	4-9	21,800	−0.003(6)	0.0001(1)	0.68	9	0.03
Economics	10-20	373	0.17(2)	−0.021(4)	$5 \times 10^{-4}$	10	0.87
Shuffled	10-20	7,460	0.01(1)	−0.002(2)	0.23	10	0.17
Mgmt. Sci.	5-10	262	0.22(9)	−0.05(1)	$6 \times 10^{-3}$	10	0.63
Shuffled	5-10	5,240	−0.01(3)	0.004(4)	0.40	10	0.09
Mgmt. Sci.	11-20	62	0.5(1)	−0.07(2)	$4 \times 10^{-3}$	10	0.68
Shuffled	11-20	1,240	0.03(2)	0.005(4)	0.20	10	0.19
Nat./PNAS/Sci.	5-10	3,953	0.15(2)	−0.035(4)	$8 \times 10^{-6}$	10	0.93
Shuffled	5-10	79,060	−0.006(8)	0.002(1)	0.28	10	0.16
Nat./PNAS/Sci.	11-20	847	0.23(3)	−0.032(4)	$10^{-4}$	10	0.88
Shuffled	11-20	16,940	0.02(1)	−0.003(1)	0.05	10	0.36

Journal set	$N_{p}$	$N_{d}$	b	s	p-val.	A	$R^{2}$
Economics	4-9	6,183	0.19(3)	−0.053(7)	0	1,090	0.012
Economics	10-20	3,730	0.17(3)	−0.022(6)	$3 \times 10^{-4}$	373	0.005
Mgmt. Sci.	5-10	1,710	0.26(4)	−0.07(1)	0	262	0.020
Mgmt. Sci.	11-20	620	0.48(9)	−0.07(2)	$10^{-4}$	62	0.042
Nat./PNAS/Sci.	5-10	26,010	0.19(1)	−0.048(3)	0	3,953	0.013
Nat./PNAS/Sci.	11-20	8,470	0.23(2)	−0.032(4)	0	847	0.013

(Top) The regression model (ii) given by Eq. (8): A denotes the number of individual careers aggregated into each mean impact trajectory $\langle \tilde{z}(n) \rangle$. B and S are estimated by ordinary least squares and are reported along with the F-test p-value, the number $N_{\mathrm{fit}}$ of data points, and the coefficient of determination $R^{2}$. The number in parentheses is the standard error in the last digit shown. The ‘shuffled’ values are the parameter estimates obtained with our citation shuffling scheme, which conserves the empirical citation distribution and also allows the sample size to be increased by a factor of 20. We also include the management science careers for comparison, since the dataset contained a sufficient number of researcher profiles to analyze. Bold-faced p-values indicate the regressions with p ≤ 0.01.

(Bottom) The fixed-effects linear regression model (iii) given by Eq. (9), implemented with the command ‘xtreg, vce(robust) fe’ in Stata 11. We used the ‘vce(robust)’ Huber-White variance estimator to account for possible heteroscedasticity in the model errors. $N_{d}$ denotes the number of observations, b and s are the coefficient estimates of the fixed-effects model (the value in parentheses is the robust standard error in the last significant digit), and p-val. corresponds to the model F-statistic F(1, A − 1).
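
The value(error) notation used for the coefficient columns in both panels can be unpacked mechanically. The Python sketch below is an illustration only and is not part of the original analysis: the helper name `parse_estimate` is hypothetical, and it assumes, as the caption states, that the digits in parentheses give the standard error in units of the last decimal place shown.

```python
import re
from decimal import Decimal

# Matches strings such as "0.17(3)" or "−0.046(4)". The sign may be an ASCII
# hyphen-minus or the Unicode minus sign (U+2212) used in the table.
_ESTIMATE = re.compile(r"^\s*([+\-\u2212]?)(\d+)(?:\.(\d+))?\((\d+)\)\s*$")


def parse_estimate(text):
    """Return (value, standard_error) as floats for a 'value(error)' string.

    Hypothetical helper: the error digits are read as multiples of the last
    decimal place of the value, so "0.17(3)" means 0.17 with error 0.03.
    """
    match = _ESTIMATE.match(text)
    if match is None:
        raise ValueError(f"not in value(error) form: {text!r}")
    sign, whole, frac, err_digits = match.groups()
    frac = frac or ""

    # Decimal arithmetic keeps the intermediate values exact before the
    # final conversion to float.
    value = Decimal(f"{whole}.{frac}") if frac else Decimal(whole)
    if sign in ("-", "\u2212"):
        value = -value

    # Scale the error digits to the last decimal place shown in the value.
    error = Decimal(err_digits) * Decimal(10) ** (-len(frac))
    return float(value), float(error)


if __name__ == "__main__":
    # Entries copied from the Economics, N_p = 4-9 row of the top panel.
    for cell in ["0.17(3)", "\u22120.046(4)"]:
        value, error = parse_estimate(cell)
        print(cell, "->", value, "+/-", error)
```

Dividing a parsed value by its standard error gives a rough sense of how many standard errors the estimate lies from zero. The p-values reported in the table come from the model F-tests and are not reproduced by this helper.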