Practices of public procurement and the risk of corrupt behavior before and after the government transition in México

Abstract

Corruption has a significant impact on economic growth, democracy, and inequality, and it has severe consequences at the human level. Public procurement, where public resources are used to purchase goods or services from the private sector, is particularly susceptible to corrupt practices. However, government turnover may bring significant changes in the way public contracting is done and, thus, in the levels and types of corruption involved in public procurement. In this respect, México underwent a historic government transition in 2018, with the new government promising a crackdown on corruption. In this work, we analyze data from more than 1.5 million contracts from 2013 to 2020 to study to what extent this change of government affected the characteristics of public contracting, and we try to determine whether these changes affect how corruption takes place. To do this, we propose a statistical framework to compare the characteristics of the contracting practices within each administration, separating the contracts into different classes depending on whether or not they were made with companies that have now been identified as being involved in corrupt practices. We find that while the amount of resources spent with companies that turned out to be corrupt has decreased substantially, many of the patterns followed to contract these companies were maintained, and some of those in which changes did occur are suggestive of a larger risk of corruption.

Introduction

Because of its impact on economic growth [1, 2], democracy [3], and inequality [4], one of the biggest challenges that a government has to deal with is corruption. Transparency International defines corruption as the abuse of public power for private benefit [5], i.e. it assumes that corruption involves the participation of public officials. Considering this definition, a niche in which corruption can arise naturally is public procurement, where public and private sectors interact through contracting, mostly to purchase goods or services. Corruption at the level of public-private contracting has high costs in many areas. For example, if buyers favor some suppliers over others through corrupt decisions, bribes or patronage (clientelism), then market competition is affected [6, 7]. This lack of competition leads to a misallocation of resources, affecting areas such as budget composition [1], military and technological spending [4], social care [7], and may even change the market structure and dynamics [8, 9]. However, owing to the complexity of the contracts, the large sums of money involved, the number of participants, as well as the inherent complicity of public officials, this kind of corruption is difficult to identify, track, and prevent [10–12].

A government turnover, i.e. a change in the individuals and/or parties in power, due to elections or otherwise, may bring significant changes in the way public contracting is carried out, and thus, in the types of corrupt behavior that may occur in the context of public procurement [13]. Broms et al. propose that frequent elections with uncertain outcomes may compel corrupt elites to pursue predatory strategies; however, if there is a well-established party system, regular electoral uncertainty may motivate corrupt elites to exercise restraint [14]. In this context, Fazekas et al. found that fair electoral contest and heterogeneous power-sharing may have the potential to mitigate corrupt market distortions, even in systematically corrupt places [9]. On the other hand, while it seems to be true that government turnovers can diminish corruption in public procurement [9, 14], there is also evidence that a change in government can maintain corrupt behaviors, only changing the favored suppliers [15]. An example of this was found by the Mexican Institute for Competitiveness (IMCO by its Spanish acronym, [16]), which analyzed public procurement data from México, finding that in the change of government that took place between 2012 and 2013, the favored suppliers also changed, but not the amount of resources and contracts given to the new suppliers. This phenomenon was called “the compadres’ change” [17].

For years, México has ranked low in the Corruption Perceptions Index published by Transparency International [18, 19]. Mexican citizens consider corruption one of the biggest problems in the country, behind only violence and insecurity [19]. For years, government and corporate structures have been created, maintained, and adapted to obtain private benefits from public resources, which has had severe consequences for economic growth and human well-being in the country [20].

In 2018, México held the largest electoral contest in its history, with more than 56.6 million voters, representing a citizen participation of 62.62% [21]. In this election, a “leftist” candidate won the presidency for the first time. He received more votes than any previous winner in México’s history, obtaining 30.1 million votes (representing 53.1% of the voters), and his party won 388 congress positions of the 554 available (70% of the seats) [21]. This event marked not only a government turnover, but also an ideological and structural change in the government’s goals and methods [20, 22, 23]. In this paper, we study how this government change affected the characteristics of public contracting, and the extent to which these changes affect the forms in which corruption takes place. To do so, we analyzed data from more than 1.5 million contracts between the many agencies that make up the Mexican government and private suppliers, corresponding to the period 2013-2020, which includes the first two years of the new government (2019-2020) [24].

Many states have taken advantage of the new technology era to record their administrative activities. This data, which can be used to analyze the projects undertaken by an administration, their success and shortfalls, mainly describes the state’s procurement practices [25]. This large amount of administrative data collected by governments allows studying corruption from a new data-driven perspective [26]. A common approach to quantify corruption is by building risk factors from contract data [8, 9, 15, 27–32]. For example, single-bidder contracts have been shown to be effective in identifying and predicting corruption risk [8, 32–34] in various contexts, including studies on the relationship between corruption and political incumbency [33], and the effect of campaign contributions on corruption [34]. Other approaches include the application of network theory to measure the impact of single-bidder contracts on local corruption [8]. Network theory was also used to identify corruption risk distribution among countries. For those studies, the actors (buyers and suppliers) were represented by network nodes; these nodes were linked if they entered into the same public procurement contract, and the weight of the link was the fraction of single-bidder contracts between each pair [32].

Here we propose a somewhat different approach, taking advantage of specific public data that lists suppliers that were investigated and have been identified as having engaged in corrupt practices. Since 2013, México’s government has kept a list of companies that have been caught providing invoices for simulated operations (or EFOS for its Spanish acronym “Empresas que Facturan Operaciones Simuladas”) [35]. These companies sell fake receipts to buyers who use them to avoid taxes or to cover acts of embezzlement, among other things. Since 2013 there has also been a list of all the suppliers that have been caught engaging in one or several types of corrupt activities when participating in a public contract, for example, presenting fake documentation in order to win a contract, overcharging for their services, breaching contract, or diverting resources. These companies are labeled as “sanctioned suppliers and contractors” (or PCS by their Spanish acronym “Proveedores y Contratistas Sancionados”) [36].

It should be noted that to appear in these lists, these contractors were subject to a legal investigation, and some of the contractors are appealing the decision. Thus, these lists may change slightly over time as a result of companies winning their appeals, which removes them from the lists, as well as due to the conclusion of long-lasting investigations, which may add new companies. Nevertheless, from the available records, we identified the contracts in which companies suspected of corrupt activities participated.

We use this data to achieve the following goals:

  1. i)

    Our first goal is to develop a statistical framework to analyze whether there were changes in public contracting practices as a consequence of the government turnover in México, both in those contracts suspected of corruption (those in which EFOS or PCS companies participated) and those in which companies free of official corruption charges participated. To achieve this goal, each contract has to be described by a set of quantitative variables based on the available data.

  2. ii)

    The second goal is to use the data to build risk factors following the framework proposed by Fazekas’ group, and to test whether these indicators successfully describe and identify those contracts in which companies suspected of corrupt activities participated. We then use these indicators to determine whether the change in the procurement practices due to the government turnover brought a lower or higher risk of corruption, where risk of corruption, in this work, is to be understood as the fraction of potentially corrupt contracts.

The results in this paper are based solely on the study of the data, trying to avoid any political bias, and without use of any prior knowledge about the contracts’ participants.

Methods

Data

The list of public contracts from 2013 to 2020 was taken from CompraNet, the Mexican electronic system of public governmental information on public procurement [24]. This system is operated by an administrative unit designated by the Mexican federal agency for budget and public debt (SHCP, for “Secretaría de Hacienda y Crédito Público”). Registering contracts in this electronic system is mandatory for all those that operate with resources from the federal budget.Footnote 1 The contracts on this list have a specific set of variables or entries that describe them. The particular entries we use in this work are shown in Table 1.

Table 1 Set of the variables chosen to describe the features of each contract from the source list of public contracts [24]. In parentheses, a short explanation is given for the variables that are not self-explanatory, providing the motivation for considering each particular variable

The original data lists consisted of 1.6 M contracts. These records were curated to standardize the information available in each of them. We homogenized all the string variables to avoid issues with special characters or spacing,Footnote 2 and we deleted all the entries in which an important variable was omitted (for example, the buyer name, the amount spent, the type of procedure, etc.). Approximately 60 K contracts had this kind of problem (representing 4% of the data). Thus, from the 1.6 M contracts available in the original data lists, we consider 1.54 M. As mentioned above, from these 1.54 M records, we took only the variables shown in Table 1. The original set includes other variables that are not of interest for this work, such as the name of the buyers’ legal representative, or the suppliers’ webpage. Hereinafter, we will refer to this curated list of 1.54 M contracts as the source list.

The list of companies identified as EFOS is available on the site of the Mexican tax agency (SAT for “Servicio de Administración Tributaria”) [35],Footnote 3 and the list of PCS contractors is available on the official Mexican open data site [36]. The only variable of interest in these lists is the supplier’s name, which we use to identify the contracts in which these companies participated.

All data sets mentioned in this section, and other tools to reproduce the results shown here, are available at [37].

Contract classes

Our first step was to identify those public procurement contracts won by companies labeled as having been involved in corrupt activities. Since these companies have gone through a process to be labeled as either an EFOS or a PCS, we assume that they may be suspected of having engaged in corrupt behavior in all the contracts they participated in, regardless of whether the contracts occurred before or after the company was labeled. Thus, all the contracts in which the supplier is an EFOS or a PCS are classified as possibly corrupt, and we assign the corresponding label EFOS or PCS to them.

Following the principle of presumption of innocence, we label as NC (for Non-Corrupt) all the other contracts in the source list, i.e. those in which the supplier is free of official corruption charges. Thus, we have three classes of contracts labeled EFOS, PCS and NC, respectively. Of course, we are aware that it is very likely that some corrupt contracts went undetected and ended up in our NC class; however, we expect that they will have little statistical weight in this class, which represents the vast majority of the contracts.
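
The labeling step described above can be sketched in a few lines of Python. This is only an illustrative sketch, not the authors' code: the helper names `normalize_name` and `label_contract`, the specific normalization rules (accent and punctuation removal, upper-casing), the company names, and the choice to give EFOS precedence when a supplier appears in both lists are all assumptions made for the example.

```python
import unicodedata

def normalize_name(name):
    """Homogenize a supplier name: strip accents and punctuation,
    upper-case, and collapse repeated whitespace (assumed rules)."""
    decomposed = unicodedata.normalize("NFKD", name)
    no_accents = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    kept = "".join(ch for ch in no_accents.upper() if ch.isalnum() or ch.isspace())
    return " ".join(kept.split())

def label_contract(supplier, efos_names, pcs_names):
    """Assign a contract to the EFOS, PCS or NC class by supplier name."""
    key = normalize_name(supplier)
    if key in efos_names:  # assumption: EFOS wins if a name is in both lists
        return "EFOS"
    if key in pcs_names:
        return "PCS"
    return "NC"

# Hypothetical company names, for illustration only.
efos_names = {normalize_name("Comercializadora Fénix, S.A. de C.V.")}
pcs_names = {normalize_name("Constructora Río Azul SA de CV")}

suppliers = [
    "COMERCIALIZADORA FENIX SA DE CV",
    "constructora  rio azul, s.a. de c.v.",
    "Papelería del Centro",
]
labels = [label_contract(s, efos_names, pcs_names) for s in suppliers]
```

Exact matching on normalized names is the simplest option; it will miss suppliers whose names are abbreviated or misspelled differently across the lists.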

Contract description variables and risk factors

Once we have identified the class to which each contract belongs, we use the information given in the source list to build two variables that are descriptive of the buyers: the maximum number of contracts assigned by the buyer to the same supplier, and the maximum total amount spent by the buyer with the same supplier in each year. We call these variables T.Cont.Max and T.Spending.Max. These variables give us an idea of the budget managed by each buyer and its activity in the public procurement market (Table 2 - Type ii)). Unfortunately, the available data in the source list is not detailed enough to evaluate exactly the risk factors developed in [27, 28, 42]. We therefore propose approximate versions of the factors that give similar information about the features of the relationship between the buyers and suppliers, and we refer to these variables as Type iii) (see Table 2). Below we explain in detail the meaning of the risk factors we use:

  1. 1.

    RAD: As we mentioned above, single-bidder contracts have been shown to be effective identifiers of corruption risk, particularly because these direct processes hinder the possibility of competition, and the selection of winners may be influenced by illicit agreements. The Mexican Constitution (in its article 134 [43]) establishes that all public procurement should be made by open public contest, except in the exceptional cases contemplated in the Law of Acquisitions, Leases and Services of the Public Sector [38] and the Law of Public Works and Services Related to It [39]. Given these considerations, we take as a risk factor the “Fraction of single-bidder contracts”, proposed in [32] and defined as:

    $$ \mathit{RAD} = T.AD/T.Cont $$

    where T.AD is the total number of single-bidder contracts assigned directly by a buyer to a supplier in a year, and T.Cont is the total number of contracts between the specific buyer and supplier that year. If \({\mathbf{RAD}}\geq 0.5\), that is, if at least half of the contracts between a supplier and a buyer are single-bidder, we consider that the risk of corruption is high [32]. Strictly speaking, the variable is descriptive of the relation between each buyer and supplier each year. However, in what follows, we assign to each contract the value of RAD corresponding to the relation between the buyer and supplier entering into the contract.

  2. 2.

    Fav: Clientelism, or favoritism, is also a red flag for corruption. This practice refers to the situation in which the buyer favors a particular supplier by giving it an atypically high volume of contracts and money. While it is hard to detect other than in very obvious cases, the IMCO proposed a quantitative way to measure favoritism as follows [42]:

    $$\begin{aligned} \mathit{Fav} &= (0.33)\frac{T.\mathit{Cont}}{T.\mathit{Cont}.\mathit{Max}} +(0.66)\frac{T.\mathit{Spending}}{T.\mathit{Spending}.\mathit{Max}} \end{aligned}$$

    where T.Cont and T.Spending are the total number of contracts and the total spending made by the buyer with the supplier in a year, and T.Cont.Max and T.Spending.Max are the variables descriptive of the buyer explained above. This factor gives a score close to 1 (0.99 with the weights above) to a supplier that is the most favored by the buyer in both the number of contracts and the money spent. The rest of the suppliers are then scored with respect to this most favored possible supplier. If \({\mathbf{Fav}}\geq 0.9\), then we consider that the buyer-supplier relationship has a large risk of corruption [42]. As above, we assign to each contract the value of Fav corresponding to the relation between the buyer and supplier entering into the contract.

  3. 3.

    CPW and SPW: The IMCO also identifies a particular form of corruption in which a buyer and a company manipulate the conditions of a contract procedure, dividing an expensive contract into multiple smaller ones that are assigned to the company in a very short time interval (sometimes the same day). These smaller contracts are easier to assign directly to a single bidder, and are less likely to be scrutinized [42]. To identify this behavior, we consider two risk factors, “Contracts per Active Week” and “Spending per Active Week”, defined as:

    $$\begin{aligned}& \mathit{CPW} = T.\mathit{Cont}/\mathit{ActiveWeeks} \\& \mathit{SPW} = T.\mathit{Spending}/\mathit{ActiveWeeks} \end{aligned}$$

    where T.Cont and T.Spending are the same variables explained before, and ActiveWeeks is the number of weeks in which a supplier was assigned contracts by a buyer in each year. Then, if a large contract was split into many small ones, we expect that CPW would be large. We consider that there is a large risk of corruption if \({\mathbf{CPW}}\geq 5\) (i.e. the company received, on average, at least one contract per working day in its active weeks) and SPW ≥ 10K USD PPP.Footnote 4 As above, we assign to each contract the value of CPW and SPW corresponding to the relation between the buyer and supplier entering into the contract.
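
As an illustration of the definitions above, the following Python sketch computes RAD, Fav, CPW and SPW for every buyer-supplier-year relation from a list of contract records. It is a minimal sketch under stated assumptions, not the authors' implementation: the record field names (`buyer`, `supplier`, `year`, `week`, `amount`, `single_bidder`) and the functions `risk_factors` and `high_risk_flags` are hypothetical, T.Cont.Max and T.Spending.Max are assumed to be computed per buyer and per year, and amounts are assumed to be strictly positive and already expressed in USD PPP.

```python
from collections import defaultdict

def risk_factors(contracts):
    """Return {(buyer, supplier, year): {"RAD", "Fav", "CPW", "SPW"}}."""
    pairs = defaultdict(lambda: {"n": 0, "direct": 0, "spend": 0.0, "weeks": set()})
    for c in contracts:
        p = pairs[(c["buyer"], c["supplier"], c["year"])]
        p["n"] += 1                                    # T.Cont
        p["direct"] += 1 if c["single_bidder"] else 0  # T.AD
        p["spend"] += c["amount"]                      # T.Spending
        p["weeks"].add(c["week"])                      # weeks with a contract

    # Buyer-level maxima (T.Cont.Max, T.Spending.Max), assumed per year.
    max_n, max_spend = defaultdict(int), defaultdict(float)
    for (buyer, _, year), p in pairs.items():
        max_n[buyer, year] = max(max_n[buyer, year], p["n"])
        max_spend[buyer, year] = max(max_spend[buyer, year], p["spend"])

    factors = {}
    for (buyer, supplier, year), p in pairs.items():
        active_weeks = len(p["weeks"])
        factors[buyer, supplier, year] = {
            "RAD": p["direct"] / p["n"],
            # Assumes strictly positive amounts, so max_spend > 0.
            "Fav": 0.33 * p["n"] / max_n[buyer, year]
                   + 0.66 * p["spend"] / max_spend[buyer, year],
            "CPW": p["n"] / active_weeks,
            "SPW": p["spend"] / active_weeks,
        }
    return factors

def high_risk_flags(f, spw_threshold=10_000):
    """Apply the thresholds quoted in the text to one relation."""
    return {
        "RAD": f["RAD"] >= 0.5,
        "Fav": f["Fav"] >= 0.9,
        "split": f["CPW"] >= 5 and f["SPW"] >= spw_threshold,
    }

# Toy data: buyer A gives supplier X six direct awards in one week.
contracts = (
    [{"buyer": "A", "supplier": "X", "year": 2019, "week": 10,
      "amount": 2000.0, "single_bidder": True}] * 6
    + [{"buyer": "A", "supplier": "Y", "year": 2019, "week": 5,
        "amount": 1000.0, "single_bidder": True},
       {"buyer": "A", "supplier": "Y", "year": 2019, "week": 20,
        "amount": 1000.0, "single_bidder": False}]
)
factors = risk_factors(contracts)
flags_x = high_risk_flags(factors["A", "X", 2019])
flags_y = high_risk_flags(factors["A", "Y", 2019])
```

In this toy example supplier X exceeds all three thresholds, while supplier Y is flagged only by RAD, since exactly half of its contracts are single-bidder.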

Table 2 Set of variables designed to assess buyer features and corruption risk. Variables of type ii) describe features of the buyers. Variables of type iii) provide information about the relationship between the buyers and suppliers. These variables are approximate versions of those proposed in [27, 28, 42], see text

Therefore, considering the variables available in the source list, and the variables related to buyers’ features and risk factors, there are three types of items in the final data set:

  1. i)

    Items that describe the features of the contracts. For example, the procedure by which the supplier won the contract (single-bidder, public contest, etc.), the amount allocated to the contract, or the week in which the contract began (Table 1).

  2. ii)

    Items that describe the features of the buyers. These include the maximum amount spent by a buyer with a supplier, or the maximum number of contracts carried out by a buyer with a single supplier.

  3. iii)

    Items that give information about the relationship between the buyers and suppliers. These items are expected to work as risk factors for corruption. Examples of these are the fraction of single-bidder contracts awarded by a buyer to a supplier, or the favoritism of a buyer for a supplier.

In what follows, the metrics computed for variables of type i) and iii) will be the fraction of contracts that satisfy a certain property (e.g. the fraction of contracts in which a small company participated, or the fraction of contracts between buyers and suppliers characterized by a given value of RAD), and the metrics related to variables of type ii) will be the fraction of buyers with certain features (e.g. the fraction of buyers with a maximum spending larger than 5K USD PPP). Table S1 in the Additional file 1 shows the most common descriptive statistics of all the variables for each contract class.

Statistical analysis

One of the central goals of this work is to determine whether the government turnover brought along a methodological change in public procurement. To achieve this goal we first need to analyze whether, within the same government period, there exist significant differences between the three contract classes we defined. Specifically, we aim to verify that each contract class presents statistical differences in their characteristic variables (i.e. variables of types i) and ii)), when compared with the other classes. This will provide further justification for separating contracts into the three classes, and will help associate each class with certain characteristic description variables. After that, we compare the statistical profile of each class between the different governments, which will tell us whether there were significant changes in public procurement practices within the classes with the change of government. And finally, we test the risk factors proposed above, and determine if the government transition resulted in a higher or lower risk of corruption in public procurement.

Binomial Test and Kolmogorov-Smirnov Test

Since each contract has two kinds of descriptive variables, dummy variables and non-dummy variables, we use different tests to compare each kind of variable between classes and periods. To measure differences between classes we follow this procedure: for the set of dummy variables we use the Binomial Test (B-Test) [44, 45], and for the set of non-dummy variables we use the Kolmogorov-Smirnov Test (KS-Test) [46].Footnote 5 The specific steps are the following:

  1. 1.

    We separate the data corresponding to the two different government periods. The 1st period covers from 2013 to 2018, the 2nd period from 2019 to 2020.

  2. 2.

    For each contract class we extract the data for each variable for all the contracts belonging to the class in each period.

  3. 3.

    We compare classes by pairs in the same period performing the two-sample B-Test or the two-sample KS-Test for each variable.Footnote 6

  4. 4.

    We consider that there are significant statistical differences between contract classes in those variables for which:

    1. a.

      The B-Test results in a p-value \(p_{v}\leq 0.05\) and where the difference between fractions of the dummy variables is ≥0.1.

    2. b.

      The KS-Test results in a statistic \(D\geq 0.1\) and a p-value \(p_{v}\leq 0.05\).

    With these values we ensure that at least 10% of the contracts of one class present, in these variables, a different behavior from the contracts of the other class. The sample sizes for each variable type in each class and period, are in Table S2 in the Additional file 1.

The results of these comparisons are shown in sub-Sect. 3.2.
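
The decision rule in step 4 can be sketched as follows. This is an illustrative sketch only: the two-sample KS statistic is computed directly from the empirical CDFs and its p-value uses the standard asymptotic Kolmogorov series, which is an approximation; for the dummy variables, a pooled two-proportion z-test is used here as a stand-in, since the exact form of the two-sample B-Test is not specified in this section. All function names are hypothetical.

```python
import math
from bisect import bisect_right

def ks_two_sample(x, y):
    """Two-sample KS statistic D and an asymptotic (approximate) p-value."""
    xs, ys = sorted(x), sorted(y)
    n, m = len(xs), len(ys)
    d = 0.0
    for v in set(xs) | set(ys):  # empirical CDFs only jump at observed values
        d = max(d, abs(bisect_right(xs, v) / n - bisect_right(ys, v) / m))
    en = math.sqrt(n * m / (n + m))
    lam = (en + 0.12 + 0.11 / en) * d
    if lam < 0.1:  # truncated series is unreliable here; p is essentially 1
        return d, 1.0
    series = sum((-1) ** (k - 1) * math.exp(-2.0 * k * k * lam * lam)
                 for k in range(1, 101))
    return d, min(max(2.0 * series, 0.0), 1.0)

def two_proportion_test(k1, n1, k2, n2):
    """Absolute difference of fractions and two-sided z-test p-value
    (a stand-in for the paper's two-sample B-Test)."""
    p1, p2 = k1 / n1, k2 / n2
    pooled = (k1 + k2) / (n1 + n2)
    se = math.sqrt(pooled * (1.0 - pooled) * (1.0 / n1 + 1.0 / n2))
    if se == 0.0:
        return abs(p1 - p2), 1.0
    z = (p1 - p2) / se
    return abs(p1 - p2), math.erfc(abs(z) / math.sqrt(2.0))

def is_significant(effect, p_value, min_effect=0.1, alpha=0.05):
    """Thresholds from the text: effect size >= 0.1 and p-value <= 0.05."""
    return effect >= min_effect and p_value <= alpha

# Toy samples: a non-dummy variable shifted between two classes,
# and a dummy variable present in 60% vs. 40% of contracts.
d, p_ks = ks_two_sample(range(100), [v + 50 for v in range(100)])
diff, p_b = two_proportion_test(600, 1000, 400, 1000)
```

Requiring both a small p-value and a minimum effect size matters with samples of this size, because with hundreds of thousands of contracts even negligible differences yield very small p-values.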

Measuring differences between government periods

To measure the differences between government periods, we take a slightly different path than that taken above since there is a natural variability in the distribution functions within a period, and we seek to detect differences beyond this variability. Thus we perform the comparison as follows:

  1. 1.

    We separate the data of each government period by year.

  2. 2.

    For each contract class we compute:

    1. a.

      For each dummy variable: the fraction of contracts in which the variable is present, over all the contracts belonging to the class in every year.

    2. b.

      For each non-dummy variable: the cumulative distribution function (CDF) of the variableFootnote 7 over all the contracts belonging to the class in each year.

  3. 3.

    Then, for non-dummy variables, we compute the confidence interval (CI) at 99%Footnote 8 for each distribution, using the data from each year of the 1st period. For dummy variables we use a boxplot of the data from 2013-2018 for comparison.Footnote 9

  4. 4.

    Now, using the data from the contracts under the new government, those dummy variables whose fractions are outside of the minimum and maximum ranges of the boxplot for the corresponding data from the first period, and those non-dummy variables with at least 25% of their cumulative distribution curve lying outside of the CI of the corresponding data of the first period, are considered to have significant statistical differences beyond the natural variability of the distributions. Conversely, those dummy variables whose fractions are inside the corresponding boxplot’s ranges, or the non-dummy variables for which at least 75% of their cumulative distribution curves are within the corresponding CI, will be considered statistically equivalent, meaning that the behavior corresponding to this variable is similar between periods.

This method provides a quantitative tool to identify the differences (and similarities) of the different contract classes between periods.

The results of these comparisons are shown in sub-Sect. 3.3.
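
The comparison in step 4 can be illustrated with a short Python sketch. Several details are assumptions made for the example: the 99% confidence band is taken as a given input (how it is built from the yearly curves of the 1st period is not reproduced here), the "fraction of the curve" is measured over a common grid of evaluation points, and the boxplot range is taken to be the minimum and maximum of the yearly fractions of the 1st period. The function names and the numbers are hypothetical.

```python
def dummy_changed(first_period_fractions, new_fraction):
    """True if a 2nd-period yearly fraction falls outside the min-max
    range of the 1st-period yearly fractions (assumed boxplot range)."""
    low, high = min(first_period_fractions), max(first_period_fractions)
    return not (low <= new_fraction <= high)

def fraction_outside_band(cdf, lower, upper):
    """Share of grid points at which the CDF lies outside [lower, upper]."""
    outside = sum(1 for c, lo, hi in zip(cdf, lower, upper) if c < lo or c > hi)
    return outside / len(cdf)

def non_dummy_changed(cdf, lower, upper, threshold=0.25):
    """True if at least 25% of the curve lies outside the band."""
    return fraction_outside_band(cdf, lower, upper) >= threshold

# Toy example: yearly fractions of a dummy variable for 2013-2018.
first_period = [0.62, 0.65, 0.60, 0.66, 0.63, 0.64]
changed_2019 = dummy_changed(first_period, 0.71)
changed_2020 = dummy_changed(first_period, 0.61)

# Toy example: a 2nd-period CDF and a 1st-period band on the same grid.
cdf_new = [0.10, 0.30, 0.50, 0.70, 0.90, 1.00, 1.00, 1.00]
band_lo = [0.05, 0.20, 0.45, 0.60, 0.70, 0.80, 0.90, 0.95]
band_hi = [0.15, 0.25, 0.55, 0.65, 0.95, 1.00, 1.00, 1.00]
share = fraction_outside_band(cdf_new, band_lo, band_hi)
changed_cdf = non_dummy_changed(cdf_new, band_lo, band_hi)
```

With an evenly spaced grid, the share of grid points outside the band approximates the fraction of the curve outside it; a different grid spacing would weight regions of the distribution differently.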

Results

Context for comparison

To make a fair comparison between government periods and between different contract classes, we present how many resources were spent in each class, as well as how many contracts belonged to each class per year. Table 3 shows the approved federal budget (FB) per year (in USD PPP) [51–58], the Total Spending (TS) on public procurement according to the source list (in USD PPP), the Total number of Contracts (TC) made in each year, and the ratio TS/FB. We observe that the ratio TS/FB varies from 0.10 to 0.16 in the 1st government period (2013-2018, Fig. 1 - Top, blue dashed line, green circles), but after the change of government, the ratio fell to 0.05 and 0.06 for 2019 and 2020 (Fig. 1 - Top, blue dashed line, purple circles), i.e. even though the budget increased, the spending on public procurement fell by approximately half. Figure 1 - Top (orange dashed line) shows that the number of contracts also fell at the end of the 1st period, then continued to fall at the beginning of the 2nd period, and remained low during the second year of the 2nd period. This decrease may be a consequence of the policy of “republican austerity” imposed by the new government [59].

Figure 1

Comparison of the number of contracts and spending per year. Top: Ratio between the total spending reported in the contract source list and federal budget (blue dashed line - in USD PPP), and total number of contracts reported (orange dashed line - in thousands). The green circles represent the 1st government period (2013-2018), whereas the purple circles correspond to the 2nd government period (2019-2020). Center Left: Logarithmic plot of the number of contracts in each class. Center Right: Logarithmic plot of the resources spent in each class. Bottom Left: Logarithmic plot of the percentage of the total contracts in each class. Bottom Right: Logarithmic plot of the percentage of the total spending made in each class. Green and purple bars correspond to the 1st and 2nd government periods respectively

Table 3 Public data per year in the period 2013-2020. Approved federal budget (FB - extracted from [51–58]), total spending made on public procurement (TS - extracted from the source list), total number of contracts made (TC - extracted from the source list), and the ratio TS/FB. FB and TS are reported in USD PPP using the equivalences given in [41]

Considering these data, we compared the number of contracts and the amount of money spent on each contract class for every year (Table 4 - Rows 2 and 3, and Fig. 1 - Center). It is noticeable that the number of contracts in the EFOS class in 2019 and 2020, and the amount of money spent on them, fell by an order of magnitude. In contrast, the number of contracts and the total spending on contracts in the PCS class remained roughly on the same scale over several years, and only decreased significantly in 2020. For contracts in the NC class, the numbers are similar between periods.

Table 4 Data for each contract class per year in the period 2013-2020. Number of contracts, spending (indicated in USD PPP), percentage of total contracts (%TC) and percentage of total spending (%TS) made in each contract class. These data were extracted from the source list

We normalized the absolute numbers of contracts and the spending in each class by the corresponding TC and TS to compare the percentage of the Total Contracts (%TC) and of the Total Spending (%TS) in each year (Table 4 - Rows 4 and 5, and Fig. 1 - Bottom). We see that the NC class represents approximately 90% of the Total Spending and Total Contracts for all years. For the contracts in the PCS class, there were also no major changes between years in %TC and %TS up to 2020, when %TC fell roughly by half to 1.02% and %TS fell to 1.78%. However, for the contracts in the EFOS class, there was a large drop in the fraction of contracts assigned, and a corresponding decrease in resources spent, when the government changed.

Thus, the first important change in public procurement due to the government transition was a marked decrease in public spending and in the number of contracts made. This decrease was especially noticeable in the EFOS class.

Comparison between contract classes

To verify whether the three different contract classes have intrinsic, statistically significant, procedural differences between them, we computed a B-Test (for the set of dummy variables) and a KS-Test (for the non-dummy variables) comparing by pairs each variable that describes the contracts. This comparison was made between all the contract classes in the same government period.

First we compared the EFOS class vs. the PCS class. For the 1st period, Table S3 and Fig. S1 in the Additional file 1 show that there were 17 variables for which the statistical tests showed significant differences. These variables were of type i), i.e. variables that describe the features of the contracts. For example, we found that most of the contracts in the PCS class were made with companies classified as “large supplier” (labeled S.NOM), while the contracts in the EFOS class were mostly made with micro, small, or medium companies.Footnote 10 Figure 2 shows a subset of those variables for which contracts carried out with companies labeled as EFOS and PCS were significantly different in the 1st government period (2013-2018). For the variable called Spending (Left), we found that the contracts in the EFOS class (red circles solid line) tend to be more expensive than the contracts in the PCS class (blue triangles dashed line). We can also see in Fig. 2 - Right that the PCS companies obtain contracts of short duration (less than 3 weeks) more frequently than EFOS companies. However, in both classes there are a few contracts of long duration (more than one year). Finally, Fig. 2 - Center shows that the 1st period government made contracts with EFOS mainly during the 2nd half of the year (BeginningWeek), while companies labeled as PCS were contracted mostly during the 1st half of the year. For the 2nd government period, the comparison EFOS vs. PCS showed 14 variables with significant differences, all of them of type i) as well (Table S4 and Fig. S2 in the Additional file 1). Interestingly, 11 of these variables were also present in the set of variables that displayed differences in the 1st government period. Variables such as the fraction of contracts with large suppliers (S.NOM) and Spending behave similarly to the 1st period. Also, even though the PCS contracts tend to be shorter than EFOS contracts, some of the PCS contracts reached durations of up to two years, versus one year for the longest EFOS contracts (Fig. S2 in the Additional file 1). The remaining variables with significant differences are related to procedure type, procedure character, or contract type. For example, almost all the PCS contracts were for acquisitions (CT.ADQ), while less than half of the EFOS contracts were of this type (Table S4 in the Additional file 1). Also, the distribution of the variable for the single-bidder procedure (PT.AD) shows that this procedure is more frequent in the PCS class than in the EFOS class (Table S4 in the Additional file 1). These results highlight the fact that each of the corrupt contract classes has a different statistical profile, which in turn justifies keeping the corrupt classes separated. Further, given that the PCS contracts far outnumber the EFOS contracts every year (see Table 4), if we joined both classes into a single “corrupt” class, the statistics of this “joint” class would mostly reflect the statistics of the PCS class.

Figure 2
An example of a set of variables with significant differences between EFOS and PCS classes within the first government period. Left: Cumulative distribution function (CDF) for the variable Spending in the contracts of the EFOS class (red circles solid line) and in the PCS class (blue triangles dashed line). Center: CDF for the week of the year in which the contract began (BeginningWeek). Right: CDF for the duration of the contract (EBWeeks). The grey dashed line shows the value at which the CDFs showed the maximum difference
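The "maximum difference" marked by the grey dashed line in each panel can be illustrated with a short sketch. It computes the largest vertical gap between two empirical CDFs and the value at which it occurs. This section does not state which statistical test the comparison relied on, so the sketch shows only the gap itself; the function names and the duration samples are hypothetical and are not taken from the contract data.

```python
from bisect import bisect_right


def ecdf(sample):
    """Return a function x -> fraction of sample values <= x."""
    ordered = sorted(sample)
    n = len(ordered)
    return lambda x: bisect_right(ordered, x) / n


def max_cdf_gap(sample_a, sample_b):
    """Largest vertical distance between two empirical CDFs, and where it occurs."""
    cdf_a, cdf_b = ecdf(sample_a), ecdf(sample_b)
    # Both step functions only change at observed values, so it is
    # enough to evaluate the gap at those values.
    points = sorted(set(sample_a) | set(sample_b))
    return max((abs(cdf_a(x) - cdf_b(x)), x) for x in points)


# Hypothetical contract durations in weeks (illustration only).
efos_weeks = [2, 5, 8, 12, 20, 30, 52]
pcs_weeks = [1, 1, 2, 2, 3, 4, 60]

gap, location = max_cdf_gap(efos_weeks, pcs_weeks)
print(f"largest CDF gap {gap:.3f} at {location} weeks")
```

In this toy sample the short PCS durations pull the PCS curve up early, so the gap peaks at a small number of weeks, which is the qualitative pattern described for the Right panel.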

Regarding the comparison between EFOS and NC contracts, 10 variables presented significant differences for the 1st period; these variables were of types i) and ii) (Table S5 and Fig. S3 in the Additional file 1). It is noticeable that the two variables that showed the most prominent differences were those that characterize buyers’ features (i.e. variables of type ii)); these were the maximum number of contracts awarded by a buyer to a supplier (labeled T.Cont.Max), and the maximum amount of money spent by a buyer in contracts with a supplier (T.Spending.Max). These variables show that the EFOS class had a higher proportion of buyers with a proportionally stronger market activity (i.e. buyers characterized by having awarded a maximum number of contracts to a supplier T.Cont.Max ≥ 5) and larger budget (T.Spending.Max ≥ 1.9M USD PPP) than the buyers in the NC class (Fig. S3 in the Additional file 1). For the 2nd period the statistical tests gave 16 variables (again of the first two types) with significant differences (Table S6 and Fig. S4 in the Additional file 1). Here, too, the variables T.Cont.Max and T.Spending.Max were those that showed the most prominent differences. The behavior of these variables was similar to that observed in the 1st government period, giving the same differences between EFOS and NC classes for the buyers’ features.

The comparison between the PCS and NC contract classes for the 1st government period (Table S7 and Fig. S5 in the Additional file 1) showed significant differences in 10 variables, again distributed among the variable types i) and ii). For example, the variable for national procedure character (labeled PC.N) shows that the fraction of NC contracts made under national regulations, i.e. the contracting protocols that followed Mexican laws, was much higher than in the PCS class. The variable for large suppliers (S.NOM) shows that this kind of company was proportionally contracted with higher frequency in the PCS class than in the NC class. The variables for buyers’ features (T.Cont.Max and T.Spending.Max) presented differences similar to those discussed above: the PCS class had a higher proportion of buyers with a proportionally stronger market activity and larger budget than the buyers in the NC class. For the 2nd government period, the tests showed 13 variables with significant differences, again within the variable types i) and ii) (Table S8 and Fig. S6 in the Additional file 1). For example, the variable for large suppliers (S.NOM) behaves as in the 1st government period: large suppliers were more frequently contracted in the PCS class than in the NC class. Finally, Fig. S6 in the Additional file 1 shows that the PCS contracts tend to last longer than the contracts in the NC class.

Thus, we see that our three contract classes do present significant differences when compared with each other in the same government period, and that these differences are similar between periods. These results support the separation of the contracts into these three classes, and also the hypothesis that the corruption that occurs through EFOS and PCS follows different procedural patterns. At the same time, these two contract classes differ from the NC contracts. The next step is to identify whether each class presents differences between government periods.

Comparison between government periods

As mentioned before, to identify differences and similarities in public procurement between government periods, we compared the variables that define the contracts either by computing the confidence interval (CI) generated by the years of the 1st period (2013-2018) and verifying whether or not both curves corresponding to each year of the 2nd period (2019-2020) lie within the CI, or, for the dummy variables, checking if the results for the second period fall inside the box plot of the data of the first period. Since one of our goals is to try to determine whether the government transition that occurred in México at the end of 2018 brought a change in the procurement practices, we compared the data for each period in the three different contract classes. Our analysis of the data sets produced the following results.
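For the non-dummy variables, the comparison described above can be sketched as follows. The construction of the CI is defined in an earlier section that is not reproduced here, so this sketch assumes a pointwise band (mean of the yearly CDFs plus or minus a normal-approximation multiple of the standard error); the actual construction may differ. All function names, the grid, and the yearly samples are hypothetical.

```python
from bisect import bisect_right
from statistics import NormalDist, mean, stdev


def ecdf_on_grid(sample, grid):
    """Empirical CDF of `sample` evaluated at each grid point."""
    ordered = sorted(sample)
    return [bisect_right(ordered, x) / len(ordered) for x in grid]


def confidence_band(yearly_samples, grid, level=0.99):
    """Pointwise band across yearly CDFs: mean +/- z * standard error.

    Assumption: normal approximation; the paper's own CI may be built differently.
    """
    z = NormalDist().inv_cdf(0.5 + level / 2)
    curves = [ecdf_on_grid(s, grid) for s in yearly_samples]
    lower, upper = [], []
    for values in zip(*curves):  # one tuple of yearly CDF values per grid point
        centre = mean(values)
        half_width = z * stdev(values) / len(values) ** 0.5
        lower.append(centre - half_width)
        upper.append(centre + half_width)
    return lower, upper


def points_outside(sample, grid, lower, upper):
    """Grid points at which the CDF of `sample` leaves the band."""
    curve = ecdf_on_grid(sample, grid)
    return [x for x, v, lo, hi in zip(grid, curve, lower, upper) if v < lo or v > hi]


# Hypothetical contract durations (weeks) for six first-period years.
first_period = [
    [1, 2, 3, 4, 6, 7, 8, 12, 15, 30],
    [1, 2, 3, 4, 5, 7, 8, 12, 15, 30],
    [1, 2, 3, 4, 6, 7, 11, 12, 15, 30],
    [1, 2, 3, 4, 5, 7, 8, 9, 15, 18],
    [2, 3, 4, 5, 6, 8, 9, 14, 16, 25],
    [1, 1, 2, 3, 5, 6, 9, 13, 19, 33],
]
grid = [5, 10, 20, 40]
lower, upper = confidence_band(first_period, grid)

# Two hypothetical second-period years: one similar, one with longer contracts.
similar_year = [1, 3, 4, 5, 6, 8, 10, 12, 17, 36]
long_year = [4, 12, 14, 18, 22, 25, 30, 35, 38, 60]

print(points_outside(similar_year, grid, lower, upper))
print(points_outside(long_year, grid, lower, upper))
```

A second-period year counts as different from the first period only where its curve leaves the band, which mirrors the visual reading of the green area and purple curves in the figures.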

Main differences between both periods

Figure 3 shows the most significant differences between both periods for the three contract classes: EFOS, PCS, and NC. First, we observe that in the 1st period, 20% (on average) of the suppliers classified as EFOS were micro-companies (Fig. 3 - Top Left). In contrast, in the 2nd period, this fraction grew to 50% in 2019, and to 100% in 2020. This signals an important change in the way interactions with EFOS were carried out between both periods. Another important difference in this class was the contract duration (Fig. 3 - Bottom Left). In the 1st period, only 20% (on average) of the contracts with companies labeled as EFOS lasted more than ten weeks, while for the 2nd period, 40% of these contracts had a duration of more than ten weeks in 2019, and in 2020 this fraction grew to almost 90%. Even though the 2nd period government spent less money on EFOS, and granted them fewer contracts (Fig. 1), the majority of these contracts were of long duration. Fig. S7 in the Additional file 1 shows the remaining variables in which there were significant differences between periods. For example, we notice that the new government tends to contract EFOS at the beginning of the year (BeginningWeek), while the previous government did so mostly during the second half of the year.

Figure 3
Example variables illustrating the main differences between both government periods. Left: Differences between contracts in the EFOS class. Top Left: Fraction of “micro companies” (S.MIC) for the contracts in the EFOS class. In green, boxplot of the 6 years comprising the 1st government period. Bottom Left: Cumulative distribution function (CDF) for the number of weeks that the contract lasted (EBWeeks). The green area represents the CI at 99% generated by the data of the 1st period. Center: Differences between contracts in the PCS class. Top Center: Fraction of contracts in the PCS class with “large” companies (S.NOM). Bottom Center: CDF for the week in which the contract began (BeginningWeek). Right: Differences between contracts in the NC class. Top Right: Fraction of contracts following the NAFTA procedures (PC.ITLC) in the NC class. Bottom Right: CDF for the maximum spending made by a buyer with a single supplier (T.Spending.Max). In all graphs, the orange line corresponds to the mean of the 1st government period, and the two purple colored curves correspond to the first two years of the second government period

For the PCS class we observe that in the 1st government period, the PCS contracts were made mostly (57%) with small and medium companies (Fig. 3 - Top Center). In contrast, in the 2nd government period, the majority of these contracts went to large suppliers. On the other hand, with the new government, PCS contracting decreased in the middle of the year; i.e. the 2nd period government had a slightly lower tendency to contract PCS between weeks 20 and 40 than the 1st period government (Fig. 3 - Bottom Center). Fig. S8 in the Additional file 1 shows the remaining variables for which contracts in the PCS class presented significant differences between periods. For example, in the 1st period, 1% of contracts in the PCS class were leases (labeled CT.AR); this fraction grew to almost 1.5% in 2019, and to 2% in 2020.

Next, for the NC class, we notice that the 2nd period had an increase of 15% in the number of contracts made under the rules and regulations of the North American Free Trade Agreement (NAFTA) (Fig. 3 - Top Right). On the other hand, the two variables associated with buyer features had differences between periods. Figure 3 - Bottom Right shows that almost 80% of the buyers in the 1st period had a maximum spending (T.Spending.Max) less than 2K USD PPP, while in the 2nd period this percentage fell to almost 60%. Fig. S9 in the Additional file 1 shows that the maximum number of contracts made by a buyer with the same supplier in the 2nd period remains relatively close to the CI generated by the 1st period. Fig. S9 in Additional file 1 also shows the remaining variables for which NC contracts presented significant differences between periods. For example, the variable for single-bidder procedure type (PT.AD) shows that there was also a slight increase, from 75% to 80%, in the percentage of contracts won through this procedure from one period to the other.

Main similarities between both periods

Having identified the main differences between the contract classes in both government periods, it may also be helpful to identify the similarities, i.e. identify those variables that did not experience significant changes due to government transition. This will give an idea of which aspects of the practices related to EFOS and PCS contracts may have remained.

Figure 4 shows the main similarities between governments for the three contract classes. We notice, for example, that for the EFOS class, the percentage of contracts assigned to single-bidders (PT.AD) did not experience significant changes between periods, staying within the range of 50% to 60% (Fig. 4 - Top Left). There are also similarities in the percentage of contracts for leases (CT.AR, Fig. S10 in the Additional file 1).

Figure 4
Example variables illustrating the main similarities between both government periods. Left: Similarities between contracts in the EFOS class. Fraction of contracts assigned to single-bidders (PT.AD) in the EFOS class. In green, boxplot of the 6 years comprising the 1st government period. Center: Similarities between contracts in the PCS class. CDF for the Spending variable. Inset: A close up to the behavior of the CDF for small values of the Spending variable. The green area represents the CI at 99% generated by the data of the 1st period. Right: Similarities between contracts in the NC class. Fraction of contracts in the NC class made under an international legal framework (PC.I). In all graphs, the orange line corresponds to the mean of the 1st government period, and two purple colored curves correspond to the first two years of the second government period

For the PCS class, we found similarities in only one variable. For the Spending variable we found that 80% of the contracts were for less than 1M USD PPP in both periods, but in both periods there were a few contracts that reached 4M USD PPP (Fig. 4 - Center).

For the NC class, we found similarities in the procedure character, in which 7-11% of the contracts were made under international rules for both periods (Fig. 4 - Right). In Fig. S12 in the Additional file 1 we can observe that there are also similarities in the contract type. We find that almost 35% of the contracts were for services in the 1st period, and in the 2nd period this fraction remains at 38% in 2019, and 32% in 2020. It is also noticeable that the NC contracts for both periods had the same distribution for the duration variable (labeled EBWeeks), where, while most of them lasted less than 40 weeks, there were some that reached more than 200 weeks of duration.

These results show that corruption related to EFOS and PCS underwent important changes in specific features, such as the kind of contracted supplier, the duration of the contracts, and the time of the year in which the corrupt companies won the contracts. Other features did not change between governments, such as the fraction of contracts in the EFOS class won by single-bidder, or the distribution of resources spent on contracts of the PCS class. Also, while the NC class is in principle not legally related to corruption, the differences and similarities of the features that characterize this class of contracts from one period to the other provide an idea of how the practices of public procurement changed during the government transition.

Testing the risk factors

To reach our second goal, we tested the risk factors proposed in Sect. 2.1.2 and presented in Table 2 (labeled Type iii)). To do this we performed two experiments:

  1.

    First we measured the accuracy of each risk factor as a descriptor of the contracts corresponding to the corrupt classes (EFOS and PCS). This was done by computing the fractions of contracts within each corrupt class for which the value of the variables proposed as risk factors exceed the corruption thresholds. If either fraction is larger than 0.5, then the risk factor may be considered as a useful descriptor of the contracts in the corresponding corrupt class. These fractions can be interpreted as the conditional probabilities that a contract exceeds the corruption threshold of the variables proposed as risk factors, given that it belongs to the corresponding corrupt class.

  2.

    Second, we measured the accuracy of each risk factor to identify those contracts in which EFOS or PCS participated. To do this we compute the fractions of contracts that belong to either the EFOS or to the PCS class respectively, within the contracts for which the values of the variables proposed as risk factors exceed the corruption threshold. If either fraction is larger than the probability to find a contract of the corresponding corrupt class at random from the complete set of contracts, we consider that the risk factor is helpful identifying contracts in the corresponding class. Again, these fractions are the conditional probabilities that a contract belongs to either the EFOS or the PCS class, given that it exceeds the corresponding corruption threshold of the variables proposed as risk factors.
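The two experiments amount to computing two different conditional probabilities from the same contract list. A minimal sketch, assuming each contract is a record with a class label and a risk-factor value (the field names `cls` and `RAD`, the helper names, and the ten toy contracts are hypothetical), is:

```python
def descriptor_accuracy(contracts, risk, threshold, cls):
    """Experiment 1: P(risk >= threshold | class), the share of the class that is flagged."""
    in_class = [c for c in contracts if c["cls"] == cls]
    return sum(c[risk] >= threshold for c in in_class) / len(in_class)


def identifier_accuracy(contracts, risk, threshold, cls):
    """Experiment 2: P(class | risk >= threshold), the share of flagged contracts in the class."""
    flagged = [c for c in contracts if c[risk] >= threshold]
    return sum(c["cls"] == cls for c in flagged) / len(flagged)


def base_rate(contracts, cls):
    """Probability of drawing a contract of the class at random from the full list."""
    return sum(c["cls"] == cls for c in contracts) / len(contracts)


# Toy data (illustration only): class label and RAD value per contract.
toy = (
    [{"cls": "EFOS", "RAD": r} for r in (0.7, 0.6, 0.3)]
    + [{"cls": "PCS", "RAD": r} for r in (0.8, 0.6)]
    + [{"cls": "NC", "RAD": r} for r in (0.9, 0.8, 0.7, 0.6, 0.2)]
)

described = descriptor_accuracy(toy, "RAD", 0.5, "EFOS")
identified = identifier_accuracy(toy, "RAD", 0.5, "EFOS")
baseline = base_rate(toy, "EFOS")

# A risk factor is a useful descriptor if `described` > 0.5 and a useful
# identifier if `identified` > `baseline`; the two need not agree.
print(described > 0.5, identified > baseline)
```

The toy numbers are chosen so that the same risk factor passes the first criterion and fails the second, which shows that a factor can describe a class well without helping to single it out when flagged contracts are also common in the other classes.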

These experiments are done without separating the contracts between both government periods, i.e. we considered all data from 2013-2020 to test each risk factor. Once we determine the risk factors that most accurately describe and identify the corrupt contract classes, we determine whether the government turnover brought along a higher or lower risk of corruption using the same framework proposed in Sect. 2.2.2 for non-dummy variables.

Figure 5 shows the results of the first experiment, namely, the probabilities that a contract exceeds the corruption threshold (i.e. that the value of the variable RAD, which we have assigned as a descriptor of each contract, exceeds 0.5, and that the favoritism Fav exceeds 0.9) given that it belongs to each corrupt class respectively. We observe that the risk factor RAD has a nearly 60% accuracy describing contracts in the EFOS class and a nearly 80% accuracy describing contracts in the PCS class. On the other hand, less than 1% of the contracts in both classes presented a favoritism greater than 0.9. Fig. S13 in the Additional file 1 shows the accuracy for the remaining risk factors. It is noticeable that the presumably corrupt behavior which combines more than 5 contracts and 10K USD PPP spending per active week, which should describe the practice of dividing expensive contracts into smaller ones, showed an accuracy of less than 2% for contracts in the EFOS class and less than 30% for the PCS class. Fig. S14 of Additional file 1 shows the results of this 1st experiment from the perspective of a recall curve (Footnote 11) over a larger range of thresholds.

Figure 5
First experiment. CDF for RAD and Fav variables (Top and Bottom, respectively) for the three different classes EFOS (Left), PCS (Center), and NC (Right). The dashed grey lines indicate the threshold for each risk factor and the value of the CDF at which it was crossed. This value in the CDF indicates the probability to find a contract with the risk factor above the marked threshold given that the contract belongs to the corresponding class

Figure 6 shows the results of experiment 2: the fraction of EFOS (solid red line) and PCS (solid blue line) contracts within all those contracts that exceed a certain threshold, for a range of possible threshold values for the RAD and Fav variables (Left and Right, respectively). The dashed lines represent the probability to randomly find in the source list an EFOS (red) or PCS (blue) contract. We can observe that the RAD risk factor is not a good identifier for either of the corrupt classes, since neither of the solid lines is above its respective dashed line. The opposite happened for the Fav risk factor. Here we found that the probability to find an EFOS or PCS contract within those with a favoritism larger than the threshold (solid lines) is indeed higher than the probability to randomly find a corrupt contract in the source list (dashed lines). Fig. S13 in the Additional file 1 shows that the CPW and SPW variables are good at identifying the PCS class but not the EFOS class. As above, Fig. S14 in the Additional file 1 shows the results of the 2nd experiment from the perspective of a precision curve (Footnote 12) over a larger range of thresholds.

Figure 6
Second experiment. Probability for a contract to be corrupt (EFOS or PCS, solid lines) given it has an RAD \(\geq r_{i}\) (Left) or a Fav \(\geq f_{i}\) (Right), for several values of the thresholds, \(r_{i}=\{0.4, 0.5, 0.6\}\) and \(f_{i}=\{0.85, 0.9, 0.95\}\). The dashed lines correspond to the probability to find an EFOS (red) or PCS (blue) contract in the source list

Thus, even though Fav is not a good descriptor of corrupt behavior, since only 1% of the EFOS and PCS contracts have a favoritism larger than 0.9, it is a good identifier, since the fraction of both classes of corrupt contracts among the contracts with a value of Fav greater than 0.9 is significantly higher than their concentration in the complete sample. In contrast, the RAD variable is good as a descriptor of EFOS and PCS contracts, since more than 50% of these contracts occur in buyer-supplier relationships for which RAD ≥ 0.5. However, RAD is not a good identifier, since the probability to find an EFOS or PCS contract within all of those that exceed a given threshold for RAD is never larger than the probability of finding one by choosing at random from the complete sample. Surprisingly, the fraction of contracts in the EFOS class among the contracts with RAD ≥ 0.5 is actually lower than their concentration in the complete source list.

Now, to determine whether the government turnover brought a change of procurement practices associated with corruption, we need to analyse which of the risk factors is correlated with an increase in the fraction of potentially corrupt contracts. To do this, we compute linear models with the total fraction of contracts between a buyer and a supplier for which RAD ≥ 0.5 per year as the independent variable, and the fraction of contracts by single-bidder in each of the corrupt classes per year as the dependent variables. A similar procedure was done with the contracts with Fav ≥ 0.9 (Footnote 13). To take into account the heterogeneity between the number of contracts in each class per year, we weighted each observation by the number of contracts assigned to each class per year. Figure 7 shows the results of these analyses for the RAD risk factor. We observe that the increase in the fraction of contracts characterized by RAD ≥ 0.5 is linearly correlated with an increase of the fraction of potentially corrupt contracts in both classes, with an \(R^{2}=0.72\) for contracts in the EFOS class and \(R^{2} = 0.82\) for those in the PCS class. Both \(R^{2}\) are statistically significant with \(p_{v}\ll 0.01\). For the Fav risk factor the analyses show that there is no significant correlation between the fraction of contracts with Fav ≥ 0.9 and the fraction of potentially corrupt contracts (Fig. S15 in the Additional file 1). To confirm that the RAD variable is the best (among our variables) to predict corruption risk related to EFOS and PCS, we compute several multivariate regression models considering, one by one, variables such as procedure character PC, contract type CT, supplier’s size S, and the remaining risk factors. This experiment shows that the impact of RAD in the prediction of corruption risk is always significantly larger than that of the other variables (Tables S9 and S10 in the Additional file 1). 
Thus, we conclude that RAD is a good indicator of corruption risk, which is in line with previous results found in related works [32].
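The weighted linear model described above can be sketched with a closed-form weighted least-squares fit. This is only an illustration of the technique: the yearly fractions and contract counts below are invented, the function name is hypothetical, and the significance test behind the reported p-values is not reproduced.

```python
def weighted_linear_fit(x, y, w):
    """Weighted least squares for y = intercept + slope * x.

    Returns (intercept, slope, r2), where r2 is the weighted
    coefficient of determination of the fitted line.
    """
    total = sum(w)
    x_bar = sum(wi * xi for wi, xi in zip(w, x)) / total
    y_bar = sum(wi * yi for wi, yi in zip(w, y)) / total
    s_xy = sum(wi * (xi - x_bar) * (yi - y_bar) for wi, xi, yi in zip(w, x, y))
    s_xx = sum(wi * (xi - x_bar) ** 2 for wi, xi in zip(w, x))
    s_yy = sum(wi * (yi - y_bar) ** 2 for wi, yi in zip(w, y))
    slope = s_xy / s_xx
    intercept = y_bar - slope * x_bar
    r2 = s_xy ** 2 / (s_xx * s_yy)
    return intercept, slope, r2


# Invented yearly values for one corrupt class (eight years, 2013-2020).
rad_share = [0.70, 0.72, 0.71, 0.74, 0.75, 0.76, 0.80, 0.82]       # independent variable
single_bidder = [0.48, 0.50, 0.52, 0.55, 0.54, 0.60, 0.66, 0.70]   # dependent variable
n_contracts = [400, 380, 420, 390, 350, 300, 80, 20]               # weights

intercept, slope, r2 = weighted_linear_fit(rad_share, single_bidder, n_contracts)
print(f"intercept={intercept:.2f}, slope={slope:.2f}, R^2={r2:.2f}")
```

Weighting by the number of contracts keeps years with very few contracts in a class from dominating the fit, which is the stated reason for the weighting.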

Figure 7
Correlation between RAD and corruption risk. Linear model to correlate the fraction of contracts with RAD ≥ 0.5 per year (as independent variable) with the fraction of contracts by single-bidder per year (i.e. contracts for which PT.AD = 1), as dependent variable, for the EFOS and PCS classes (Left and Right, respectively). To avoid trivial auto-correlations, we exclude from the set corresponding to the independent variable the data of the dependent variables. The years corresponding to the 1st period are marked in green, those corresponding to the 2nd period in purple. Since the number of contracts varies widely from one year to another, the data was weighted by the number of contracts in each class per year to perform the least square fit. For the first model (EFOS class) we obtained an \(R^{2} = 0.72\), an intercept of −0.94, and a slope of 2.01. For the second model (PCS class) we obtained an \(R^{2} = 0.82\), an intercept of −0.41, and a slope of 1.42. All the parameters with \(p_{v}\ll 0.01\)

Then, considering the variable RAD as the best we have to describe corruption related to EFOS and PCS, and given its correlation with the risk of corruption, we can use the framework used in Sect. 2.2.2 to test how the risk of corruption changed with the government turnover. Figure 8 shows the CDF of the RAD variable, separated by periods. We can observe that the RAD variable in the 2nd period (purple colored lines) lies outside of the CI of the distributions of the old government for RAD values from 0.5 to 0.75. This indicates that, although corrupt contracts are less numerous, the large number of contracts assigned to single bidders means that the practices in the 2nd period present a higher risk of corruption than in the first period.

Figure 8
Corruption risk between periods. CDF for the RAD variable. The orange line corresponds to the mean of the 1st government period, the green area represents the CI at 99% generated by the data of the 1st period, and two purple colored curves correspond to the first two years of the second government period

Discussion

México experienced a historic government turnover in 2018 when, for the first time, a leftist candidate won the presidency, receiving the most votes of any candidate in México’s democratic history. The new government appeared to represent a complete change of regime, credibly promising to break with the widespread corrupt practices of previous governments. In this work we presented a statistical framework to identify to what extent public procurement practices underwent significant changes due to the government transition, focusing on those contracts related to companies labeled as EFOS (companies that provide invoices for simulated operations) or as PCS (sanctioned suppliers and contractors) for having engaged in some other kind of corrupt behavior. To do this, we analyzed data from more than 1.5 million government contracts spanning 2013 to 2020. We classified each contract in one of three different classes: EFOS or PCS if the contract had been carried out with a company identified as engaging in corrupt practices by the Mexican tax agency (SAT), or Non-Corrupt (NC) if it had not been identified as such. Once classified, we characterized each contract class by a set of variables that give information about the contract itself (i.e. the contract type, the supplier company’s size, the amount spent, etc.) or the buyers’ features (for example the maximum number of contracts given by the buyer to a single company in a year). We also proposed risk factors constructed considering the framework developed in [29, 32]. We tested which of these risk factors had the best accuracy to describe and identify the companies labeled as EFOS or PCS, and then compared between government periods to determine whether the risk of corruption decreased, increased or remained the same after the government transition.

Before discussing the detailed comparison between the two governments, a first important difference between them concerns the number of presumably corrupt contracts per year carried out during each government, as well as the total amount of resources spent in these contracts. In the first year of the new government, the number of EFOS contracts fell by a factor of four from the last year of the previous government (or a factor of eight with respect to the average of the six years of the previous government), and another factor of four, to a total of only 13 cases, in the second year. A decrease in the number of EFOS contracts represented a reduction by a factor of three with respect to the previous year (or a factor of approximately 11 with respect to the average of the previous government) in the amount of resources spent in this kind of contracts. Also, and representing many more resources, the number of contracts carried out with companies identified as PCS during the first year of the new government was comparable to those of the previous government, however, the amount spent on these contracts during the first year of the new government fell by a factor of two with respect to the last year of the previous government. In the second year, the number of contracts fell by a factor larger than 2, and the resources spent fell by another factor of 4. Thus, the data showed that there has been a significant reduction in both the total number as well as the relative fraction of corrupt contracts, and the corresponding resources spent on such contracts. Whether this reduction was due to an effective crackdown on corruption, a consequence of the austerity program undertaken by the new government, or some other reason, is a question the data cannot answer.

However, our purpose in this work was to try to go beyond the analysis of the total amounts of resources spent in contracts that were suspected of being corrupt. We attempted to establish whether the practices and warning flags regarding suspect contracts actually changed from one government period to the other. To do so, using the variables characterizing each contract, we began by verifying that the contract classes presented statistically significant differences between them in the same government period. Thus, we were able to assign a rough statistical profile to each class of contracts in each government period. For example, we found that contracts in the EFOS class were characteristically not carried out with large companies, and that this class had a large proportion of contracts for public work in comparison with the PCS and NC classes. On the other hand, the PCS class profile was distinguished by having very few contracts with micro, small, and medium size companies, and by acquisitions being the main activity in this class. Finally, both corrupt classes were identified by having a higher proportion of buyers with a proportionally stronger market activity and larger budget than the buyers in the NC class. These characteristic statistical profiles were maintained between periods. Having verified that the various classes (defined by whether or not the supplier had participated in corrupt activities, and if so, what kind of activities) also had significant statistical differences between them, the next step was to analyze whether, as a consequence of the government transition, the practices in public procurement underwent changes within each class.

Regarding contracts with EFOS, we identified that the new government tended to favor micro-suppliers more than the older government. Companies this size obtained only 20% of EFOS contracts (on average) in the 1st government period. This percentage grew to 50% in 2019 and to almost 100% in 2020. The reduction in the fraction of large EFOS suppliers may be due to government action targeting these companies as they represent a large drain of resources. Also, EFOS contracts’ duration suffered a significant change: the proportion of EFOS contracts longer than ten weeks grew from 20% to 40% in 2019, and to almost 90% in 2020. On the other hand, there were also similarities between periods for this class of contracts. For example, the percentage of the contracts assigned to single-bidders remained between 55%-65% in both periods.

In regards to contracts in the PCS class, these showed differences in the kind of suppliers that were favored, exhibiting a relative increase (from 40% to 65%) in the fraction of large companies that won contracts between both government periods. Moreover, the fraction of single-bidders in this class also increased from one government to another. On the other hand, the variable Spending showed that the way resources were distributed in PCS contracts remained roughly the same between periods.

The NC class presented differences mainly in the contract procedures and in the features of the buyers involved. In the 2nd government period, there was an increase of 20% of contracts made under NAFTA procedures compared to the 1st period. We also found that the kind of buyers that participated in the contracts in this class, tend to spend more money with a single company in the 2nd period than in the 1st period.

Our framework to analyze a set of contracts through certain characteristic variables allowed us to identify specific changes and similarities between contract classes and between government periods. But this approach did not tell us whether the risk of corruption increased or decreased after the government transition. To do this we considered specific indicators, namely RAD (fraction of single-bidder contracts), Fav (favoritism), CPW (contracts per active week) and SPW (spending per active week), based on previously proposed risk factors of corruption [27, 28, 42]. These can be used as red flags for those contracts for which these variables exceed a certain threshold. For example, a buyer-supplier relation (and all contracts in it) should be red flagged as risky if, say, RAD ≥ 0.5 [32], or if Fav ≥ 0.9 [42]. Our study showed that for the EFOS and PCS classes, the risk factor Fav performed well identifying corrupt contracts among those that exceeded the threshold of 0.9. Actually, it was twice as probable to find EFOS contracts in the subset of contracts with Fav≥0.9, than in the whole source list, and almost five times more probable to find PCS contracts. However, it had little accuracy as a descriptor since Fav exceeded the threshold 0.9 in less than 1% of companies labeled as having participated in corrupt activities. The variables CPW and SPW were not useful risk factors, since only a few EFOS and PCS contracts had more than five contracts per active week and expenditures for more than 10K USD PPP per active week, thus we were unable to identify the corruption scheme in which a buyer assigned many small contracts in a short period of time to the same supplier [42]. This may also suggest an important limitation of the effectiveness of these variables to predict corruption with the available data. 
Actually, having access to more specific information about the contracts and their participants, would be helpful to re-compute these risk factors and to re-test their effectiveness to describe and identify corruption.

On the other hand, the risk factor RAD was accurate in describing (on average) more than 50% and 80% of the companies labeled as EFOS and PCS, respectively. However, this result also implies that many corrupt contracts were won in open contest. This suggests that our corrupt classes may contain two kinds of illegal schemes: one in which public officials colluded with the supplier and frequently assigned it contracts as a single bidder (i.e. RAD ≥ 0.5), and another in which the contracted companies committed fraud without the government being involved. This second scheme is likely to occur in EFOS and PCS contracts that are mostly assigned through open contest (so RAD < 0.5). Here we focus on schemes of the first kind, as corruption implies government complicity [5]. We also found that the RAD variable was not efficient at identifying corrupt contracts among those with RAD ≥ 0.5. Indeed, the probability of finding a PCS contract in this subset is the same as the probability of finding one by choosing at random from the complete source list, and the probability of finding an EFOS contract among those with RAD ≥ 0.5 is significantly lower than for a random search in the complete source list. This result is explained by the fact that single-bidder contracts are very common in the NC class (Fig. 5, right), as common as in the PCS class and even more common than in the EFOS class. Thus, even though single-bidder contracts should be exceptional [38, 39, 43], they have become an extremely common procedure in México’s public procurement practices. Further, in line with previous findings [32], we found that the RAD variable is positively correlated with an increase in the risk of corruption. Given this correlation, the picture that emerges is that even though the absolute number of corrupt contracts has been reduced, the risk of corruption related to EFOS and PCS contracts has not decreased with the new government, since the fraction of single-bidder contracts, RAD, actually increased from one government to the next.
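To make the comparison of RAD between governments concrete, the following Python sketch computes RAD for each buyer-supplier relation and averages it per government period. It assumes that RAD is taken per relation and per period, and the field names (`period`, `buyer`, `supplier`, `single_bidder`) and the toy records are hypothetical; the actual computation in the study was done in R and may differ in detail.

```python
from collections import defaultdict


def rad_by_relation(contracts):
    """Fraction of single-bidder contracts per (period, buyer, supplier)."""
    counts = defaultdict(lambda: [0, 0])  # key -> [single-bidder, total]
    for c in contracts:
        key = (c["period"], c["buyer"], c["supplier"])
        counts[key][1] += 1
        if c["single_bidder"]:
            counts[key][0] += 1
    return {key: single / total for key, (single, total) in counts.items()}


def mean_rad_by_period(contracts):
    """Average RAD over all buyer-supplier relations in each period."""
    values = defaultdict(list)
    for (period, _buyer, _supplier), rad in rad_by_relation(contracts).items():
        values[period].append(rad)
    return {period: sum(v) / len(v) for period, v in values.items()}


# Toy records, invented for illustration only.
contracts = [
    {"period": "before", "buyer": "B1", "supplier": "S1", "single_bidder": True},
    {"period": "before", "buyer": "B1", "supplier": "S1", "single_bidder": False},
    {"period": "before", "buyer": "B2", "supplier": "S2", "single_bidder": False},
    {"period": "after", "buyer": "B1", "supplier": "S1", "single_bidder": True},
    {"period": "after", "buyer": "B1", "supplier": "S1", "single_bidder": True},
    {"period": "after", "buyer": "B2", "supplier": "S2", "single_bidder": True},
    {"period": "after", "buyer": "B2", "supplier": "S2", "single_bidder": False},
]

mean_rad = mean_rad_by_period(contracts)
```

In this toy example the mean RAD rises from 0.25 to 0.75 between the two periods even though the number of relations stays the same, which is the kind of shift the paragraph above reads as a risk that has not decreased.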

Overall, our study provides a framework to identify the properties and behavior of corruption in public procurement, as well as to evaluate how corrupt practices change during a government turnover. We used these tools to analyze the impact of the government transition in México on public procurement practices. The methodology may be a stepping stone toward new methods based on the variables that display differences between EFOS, PCS and NC, in order to better describe and identify corrupt contracts in large data sets.

Availability of data and materials

All data generated or analyzed during this study are included in the repository [37].

Notes

  1. These include federal, state, and municipal agencies. It should also be mentioned that we have no way of knowing how complete the list is, whether omissions are more frequent for contracts at one government level than at another, or whether “sensitive” contracts, such as those for military spending, have been omitted. Finally, we remark that the list only includes contracts between government agencies and private suppliers. Government agencies that act as suppliers to other government agencies are exempt from reporting in CompraNet.

  2. By law, all records in CompraNet must use the fiscal name of both the agency and the company. This ensures that the data contain unique identifiers of buyers and suppliers.

  3. For this study we use the lists of both definitive and presumed EFOS.

  4. We established the threshold for SPW after checking the behavior of contracts in the EFOS and PCS classes with \({\mathbf{CPW}}\geq 5\). Further, the precision/recall curves (Fig. S14 in Additional file 1) show that moving these thresholds does not change the results.

  5. We chose these particular tests because they do not require balanced sample sizes to give reliable comparison results.

  6. To compute the B-Test we follow the standard procedure to compute the \(z\)-score and the \(p\)-value given in [44, 45, 47], and for the KS-Test we use the R function ks.test from the dgof package [48, 49].

  7. To make this computation we use the ecdf R function from the stats package [48].

  8. We considered this value for the confidence interval because of the small amount of data available to compute it. For the computation of the CI we use the R function CI from the Rmisc package [48, 50].

  9. To generate the boxplot we use the R function boxplot from the graphics package [48].

  10. To test if this difference in the type of companies contracted was beyond the null hypothesis of randomly choosing a contractor, we analyzed the distribution of each company size in each class, finding that the choice of supplier indeed favored one particular company size for each different class.

  11. In the context of binary classifiers, recall is defined as the ratio between the positive instances correctly detected by the classifier and all positive instances.

  12. In the context of binary classifiers, precision is defined as the ratio between the positive instances correctly detected by the classifier and all the cases predicted as positive by the classifier.

  13. To avoid trivial auto-correlations, in all cases we exclude from the set corresponding to the independent variable the data of the dependent variables.
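
Notes 6 and 7 refer to the KS-Test and to the empirical cumulative distribution function, computed in the study with the R functions ks.test and ecdf. As a rough illustration of what these functions measure, the sketch below is a plain Python stand-in, not the authors' code: it builds an ECDF and the two-sample KS statistic (the largest vertical distance between two ECDFs). It does not compute a p-value and does not reproduce the discrete-distribution handling of the dgof package; the function names and sample values are hypothetical.

```python
from bisect import bisect_right


def ecdf(sample):
    """Return the empirical CDF of a non-empty sample as a function."""
    xs = sorted(sample)
    n = len(xs)
    if n == 0:
        raise ValueError("sample must not be empty")

    def cdf(x):
        # Fraction of observations less than or equal to x.
        return bisect_right(xs, x) / n

    return cdf


def ks_statistic(sample_a, sample_b):
    """Two-sample KS statistic: max |F_a(x) - F_b(x)| over all x.

    Both ECDFs are right-continuous step functions that only change at
    observed values, so it is enough to evaluate them at the pooled
    sample points.
    """
    cdf_a, cdf_b = ecdf(sample_a), ecdf(sample_b)
    points = sorted(set(sample_a) | set(sample_b))
    return max(abs(cdf_a(x) - cdf_b(x)) for x in points)


# Toy samples, invented for illustration only.
d_shifted = ks_statistic([1, 2, 3, 4], [3, 4, 5, 6])
d_same = ks_statistic([1, 2, 3], [1, 2, 3])
```

For the two shifted toy samples the statistic is 0.5, and for identical samples it is 0. Because the statistic depends only on the two ECDFs, it can be computed for samples of different sizes, which fits the use described in note 5.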

References

  1. Mauro P (1995) Corruption and growth. Q J Econ 110:681–712

  2. Hessami Z (2014) Political corruption, public procurement, and budget composition: Theory and evidence from OECD countries. Eur J Polit Econ 34:372–389

  3. Stockemer D, LaMontagne B, Scruggs L (2013) Bribes and ballots: The impact of corruption on voter turnout in democracies. Int Polit Sci Rev 34(1):74–90

  4. Gupta S, Davoodi H, Alonso-Terme R (2002) Does corruption affect income inequality and poverty? Econ Gov 3(1):23–45

  5. “What is corruption?” (2021) https://www.transparency.org/en/what-is-corruption. Accessed 24 October 2021

  6. North DC, Wallis JJ, Weingast BR, et al. (2009) Violence and social orders: A conceptual framework for interpreting recorded human history. Cambridge University Press, Cambridge

  7. Aidt TS (2016) Rent seeking and the economics of corruption. Const Polit Econ 27(2):142–157

  8. Wachs J, Yasseri T, Lengyel B, Kertész J (2019) Social capital predicts corruption risk in towns. R Soc Open Sci 6(4):182103

  9. Fazekas M, Wachs J (2020) Corruption and the network structure of public contracting markets across government change. Politics and Governance 8(2):153–166

  10. Baldi S, Bottasso A, Conti M, Piccardo C (2016) To bid or not to bid: That is the question: Public procurement, project complexity and corruption. Eur J Polit Econ 43:89–106

  11. OECD (2007) Integrity in public procurement: Good practice from A to Z. Organisation for Economic Co-operation and Development, Paris

  12. OECD (2015) Government at a Glance. Organisation for Economic Co-operation and Development, Paris

  13. Clingermayer JC, Feiock RC, Stream C (2003) Governmental uncertainty and leadership turnover: Influences on contracting and sector choice for local services. State Local Gov Rev 35(3):150–160

  14. Broms R, Dahlström C, Fazekas M (2019) Political competition and public procurement outcomes. Comp Polit Stud 52(9):1259–1292

  15. Dávid-Barrett E, Fazekas M (2019) Grand corruption and government change: an analysis of partisan favoritism in public procurement. European Journal on Criminal Policy and Research, 1–20

  16. IMCO - Instituto Mexicano para la Competitividad A.C. (2021) https://imco.org.mx. Accessed 12 April 2021

  17. IMCO, “Índice de riesgos de corrupción: El sistema mexicano de contrataciones públicas” (2017) https://imco.org.mx/indice-riesgos-corrupcion-sistema-mexicano-contrataciones-publicas/. Accessed 12 April 2021

  18. “Transparency International - Corruption Perception Index” (2013-2020) https://www.transparency.org/en/cpi/2020/index/nzl. Accessed 12 April 2021

  19. María AC (2015) Anatomía de la Corrupción. CIDE Instituto, México

  20. Meyer L (2019) El poder vacío: El agotamiento de un régimen sin legitimidad. DEBATE, México

  21. “INE - Instituto Nacional Electoral” (2018) https://computos2018.ine.mx/#/presidencia/nacional/1/1/1/1. Accessed 12 April 2021

  22. Muñoz Armenta A, Hernández García M, Gómez Romo de Vivar G, Mares Sánchez DA, Muñoz Canto CS, Álvarez Olivas IR, Díaz Sandoval M, Espejel Espinoza A, Martínez González VH, Corona Armenta G, et al. (2020) El triunfo de la izquierda en las elecciones de 2018: ¿Ideología o pragmatismo? Universidad de Guanajuato, Grañén Porrúa, México

  23. Hanrahan B, Fugellie PA (2019) Reflections on the transformation in México. J. Lat. Am. Cult. Stud. 28(1):113–137

  24. “CompraNet - Contratos Públicos” (2021) https://www.gob.mx/compranet/documentos/datos-abiertos-250375. Accessed 06 March 2021

  25. Kim G-H, Trimi S, Chung J-H (2014) Big-data applications in the government sector. Commun ACM 57(3):78–85

  26. Radermacher WJ (2018) Official statistics in the era of big data opportunities and threats. Int J Data Sci Anal 6(3):225–231

  27. Fazekas M, Tóth IJ, King LP (2013) Anatomy of grand corruption: A composite corruption risk index based on objective data, vol 2. Corruption Research Center Budapest Working Papers No. CRCB-WP/2013

  28. Fazekas M, Tóth IJ, King LP (2013) Corruption manual for beginners: ‘Corruption techniques’ in public procurement with examples from Hungary, vol 1. Corruption Research Center Budapest Working Paper No. CRCB-WP/2013

  29. Fazekas M, Tóth IJ, King LP (2016) An objective corruption risk index using public procurement data. Eur J Crim Policy Res 22(3):369–397

  30. Fazekas M, Skuhrovec J, Wachs J (2017) Corruption, government turnover, and public contracting market structure–insights using network analysis and objective corruption proxies. GTI Working Paper Series

  31. Fazekas M, Ferrali R, Wachs J (2018) Institutional quality, campaign contributions, and favouritism in US federal government contracting. Working Paper series: GTI-WP/2018:01. Government Transparency Institute

  32. Wachs J, Fazekas M, Kertész J (2020) Corruption risk in contracting markets: a network science perspective. International Journal of Data Science and Analytics, 1–16

  33. Klašnja M (2015) Corruption and the incumbency disadvantage: Theory and evidence. J Polit 77(4):928–942

  34. Charron N, Dahlström C, Fazekas M, Lapuente V (2017) Careers, connections, and corruption risks: Investigating the impact of bureaucratic meritocracy on public procurement processes. J Polit 79(1):89–104

  35. “Secretaría de Atención Tributaria - Empresa que Factura Operaciones Simuladas” (2021) http://omawww.sat.gob.mx/cifras_sat/Paginas/datos/vinculo.html?page=ListCompleta69B.html. Accessed 06 March 2021

  36. “Datos Abiertos - Proveedores y Contratistas Sancionados” (2021) https://datos.gob.mx/busca/dataset/proveedores-y-contratistas-sancionados. Accessed 06 March 2021

  37. Falcón-Cortés A, Aldana A, Larralde H Data Set of Public Procurement in México (2013-2020). ZENODO. https://doi.org/10.5281/zenodo.6110977

  38. “Ley de Adquisiciones, Arrendamientos y Servicios del Sector Público” (2021) http://www.diputados.gob.mx/LeyesBiblio/pdf/14_200521.pdf. Accessed 10 November 2021

  39. “Ley de Obras Públicas y Servicios relacionados con la misma” (2021) http://www.diputados.gob.mx/LeyesBiblio/pdf/56_200521.pdf. Accessed 10 November 2021

  40. “Secretaría de Economía - Clasificación de empresas” (2021) http://www.2006-2012.economia.gob.mx/mexico-emprende/empresas. Accessed 24 May 2021

  41. “OECD Data - Purchasing power parities (PPP)” (2000-2020) https://data.oecd.org/conversion/purchasing-power-parities-ppp.htm. Accessed 01 February 2022

  42. IMCO, “Mapeando la corrupción” (2018) https://mapeandolacorrupcion.mx/Anexo_Metodologico.pdf. Accessed 12 April 2021

  43. “Constitución Política de los Estados Unidos Mexicanos” (1917) http://www.diputados.gob.mx/LeyesBiblio/pdf_mov/Constitucion_Politica.pdf. Accessed 18 October 2021

  44. Conover WJ (1998) Practical nonparametric statistics, vol 350. Wiley, New York

  45. Hollander M, Wolfe DA, Chicken E (2013) Nonparametric statistical methods, vol 751. Wiley, New York

  46. Smirnov NV (1939) Estimate of deviation between empirical distribution functions in two independent samples. Bull Mosc Univ 2(2):3–16

  47. Oakley J Lecture notes in data science - mas113 part2: Data science (2021). http://www.jeremy-oakley.staff.shef.ac.uk/mas113/notes/index.html. Accessed 30 June 2021

  48. R Core Team (2020) R: A Language and Environment for Statistical Computing. R Foundation for Statistical Computing, Vienna

  49. Arnold TA, Emerson JW (2011) Nonparametric goodness-of-fit tests for discrete null distributions. R J 3(2):34–39

  50. Hope RM (2013) Rmisc: Ryan Miscellaneous. R package version 1.5

  51. “Presupuesto de Egresos de la Federación para el Ejercicio Fiscal 2013” (2012) https://www.dof.gob.mx/nota_detalle.php?codigo=5283490&fecha=27/12/2012. Accessed 06 March 2021

  52. “Presupuesto de Egresos de la Federación para el Ejercicio Fiscal 2014” (2013) https://www.dof.gob.mx/nota_detalle.php?codigo=5324132&fecha=03/12/2013. Accessed 06 March 2021

  53. “Presupuesto de Egresos de la Federación para el Ejercicio Fiscal 2015” (2014) http://www.dof.gob.mx/nota_detalle_popup.php?codigo=5374053. Accessed 06 March 2021

  54. “Presupuesto de Egresos de la Federación para el Ejercicio Fiscal 2016” (2015) http://dof.gob.mx/nota_detalle.php?codigo=5417699&fecha=27/11/2015. Accessed 06 March 2021

  55. “Presupuesto de Egresos de la Federación para el Ejercicio Fiscal 2017” (2016) https://www.dof.gob.mx/nota_detalle.php?codigo=5463184&fecha=30/11/2016. Accessed 06 March 2021

  56. “Presupuesto de Egresos de la Federación para el Ejercicio Fiscal 2018” (2017) http://www.dof.gob.mx/nota_detalle.php?codigo=5506080&fecha=29/11/2017. Accessed 06 March 2021

  57. “Presupuesto de Egresos de la Federación para el Ejercicio Fiscal 2019” (2018) http://dof.gob.mx/nota_detalle.php?codigo=5547479&fecha=28/12/2018. Accessed 06 March 2021

  58. “Presupuesto de Egresos de la Federación para el Ejercicio Fiscal 2020” (2019) https://www.dof.gob.mx/nota_detalle.php?codigo=5581629&fecha=11/12/2019. Accessed 06 March 2021

  59. “Proyecto alternativo de nación 2018-2024” (2017) https://repositoriodocumental.ine.mx/xmlui/bitstream/handle/123456789/94367/CG2ex201712-22-rp-5-2-a2.pdf. Accessed 13 July 2021

Acknowledgements

AFC thanks the DGAPA-UNAM Postdoctoral Scholarship for financial support. AFC thanks D Cervantes-Filoteo and JR Nicolás-Carlock for fruitful discussions.

Author information

Contributions

AFC, AA and HL provided the original idea and developed the theoretical framework. AFC obtained the original data, curated it and analyzed it. All authors discussed the general outline, the theoretical framework of the article, and contributed to comments and revisions. All authors read and approved the final manuscript.

Corresponding author

Correspondence to Andrea Falcón-Cortés.

Ethics declarations

Competing interests

The authors declare that they have no competing interests.

Additional information

Abbreviations

All abbreviations are defined in the tables in the main text.

Supplementary Information

Below is the link to the electronic supplementary material.

Supplementary information (PDF 5.3 MB)

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.

About this article

Cite this article

Falcón-Cortés, A., Aldana, A. & Larralde, H. Practices of public procurement and the risk of corrupt behavior before and after the government transition in México. EPJ Data Sci. 11, 19 (2022). https://doi.org/10.1140/epjds/s13688-022-00329-7

  • DOI: https://doi.org/10.1140/epjds/s13688-022-00329-7

Keywords

  • Public procurement
  • Corruption
  • Government transition
  • México