
Table 2 Comparison of Penumbra and GitHub datasets

From: The penumbra of open source: projects outside of centralized platforms are longer maintained, more academic and more collaborative

| Statistic | Fig. | Penumbra Mean | Penumbra Median | Penumbra CI | GitHub Mean | GitHub Median | GitHub CI | KS S | KS P |
|---|---|---|---|---|---|---|---|---|---|
| Files | 2(c) | 244.47 | 12 | [1, 859] | 156.07 | 9 | [1, 264] | 0.07 | <0.001 |
| Committers | 2(b) | 2.39 | 1 | [1, 6] | 2.08 | 1 | [1, 3] | 0.17 | <0.001 |
| Message Lengths | 2(f) | 29.24 | 20.80 | [7.00, 67.33] | 24.23 | 17.60 | [7.42, 56.00] | 0.13 | <0.001 |
| Editor Density | 2(d) | 1.12 | 1.00 | [1.00, 1.60] | 1.05 | 1.00 | [1.00, 1.30] | 0.20 | <0.001 |
| Burstiness | 3(d) | 4.86 | 2.88 | [0.50, 14.51] | 3.68 | 2.15 | [0.17, 11.24] | 0.13 | <0.001 |
| Commits | 2(a) | 67.12 | 8 | [1, 194] | 25.27 | 4 | [1, 57] | 0.20 | <0.001 |
| Branches | 2(e) | 1.74 | 1 | [1, 4] | 1.67 | 1 | [1, 5] | 0.03 | <0.001 |
| Age (hours) | 3(a) | 5528 | 883 | [0.1, 25556] | 2669 | 73 | [0.03, 16194] | 0.26 | <0.001 |
| Age / Commits | 3(b) | 283 | 39 | [0.02, 1261] | 193 | 9 | [0.01, 944] | 0.19 | <0.001 |
| Avg. Interevent | 3(c) | 375 | 43 | [0.05, 1547] | 257 | 11 | [0.02, 1130] | 0.19 | <0.001 |
| Team Size | 3(e) | 1.71 | 1.00 | [1.00, 3.92] | 1.42 | 1.00 | [1.00, 2.67] | 0.17 | <0.001 |

  1. Mean, median, and 5th and 95th percentile values (the CI columns) from the Penumbra and GitHub samples for each statistic. KS S and KS P are the Kolmogorov-Smirnov two-sample statistic and its corresponding p-value.
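
As a rough illustration of how the per-sample columns and the KS S column could be computed, the sketch below uses only the Python standard library. It is not the authors' code: the function names `summarize` and `ks_statistic` are hypothetical, the percentile interpolation method is an assumption (the source does not state which one was used), and the p-value (KS P) is deliberately left out.

```python
from bisect import bisect_right
from statistics import mean, median, quantiles


def summarize(values):
    """Return mean, median, and 5th/95th percentiles of one sample.

    Assumption: percentiles use the 'inclusive' interpolation method;
    the source table does not say which method was used.
    """
    # n=20 yields 19 cut points at 5%, 10%, ..., 95%.
    cuts = quantiles(values, n=20, method="inclusive")
    return {
        "mean": mean(values),
        "median": median(values),
        "p5": cuts[0],
        "p95": cuts[-1],
    }


def ks_statistic(sample_a, sample_b):
    """Two-sample Kolmogorov-Smirnov statistic.

    This is the largest absolute gap between the two empirical
    cumulative distribution functions (ECDFs), checked at every
    observed value, which is where the ECDFs change.
    """
    a_sorted = sorted(sample_a)
    b_sorted = sorted(sample_b)
    largest_gap = 0.0
    for x in set(a_sorted) | set(b_sorted):
        # Fraction of each sample that is <= x.
        ecdf_a = bisect_right(a_sorted, x) / len(a_sorted)
        ecdf_b = bisect_right(b_sorted, x) / len(b_sorted)
        largest_gap = max(largest_gap, abs(ecdf_a - ecdf_b))
    return largest_gap


if __name__ == "__main__":
    # Made-up commit counts, for illustration only (not data from the paper).
    penumbra_like = [1, 2, 8, 8, 15, 40, 194]
    github_like = [1, 1, 2, 4, 4, 9, 57]
    print(summarize(penumbra_like))
    print(summarize(github_like))
    print(ks_statistic(penumbra_like, github_like))
```

A statistic of 0 means the two empirical distributions coincide and 1 means the samples do not overlap at all, so the values of 0.03 to 0.26 in the table indicate modest but, given the reported p-values, statistically significant differences. In practice a library routine such as SciPy's `scipy.stats.ks_2samp` would likely be used instead, since it also returns the p-value.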