Flow of online misinformation during the peak of the COVID-19 pandemic in Italy

The COVID-19 pandemic has affected every human activity and, because of the urgency of finding proper responses to such an unprecedented emergency, it has generated a widespread societal debate. The online version of this discussion was not exempt from d/misinformation campaigns but, differently from what was witnessed in other debates, the flow of false information about COVID-19, intentional or not, put public health at severe risk, reducing the effectiveness of governments' countermeasures. In the present manuscript, we study the effective impact of misinformation in the Italian societal debate on Twitter during the pandemic, focusing on the various discursive communities. In order to extract the discursive communities, we focus on verified users, i.e. accounts whose identity is officially certified by Twitter. We then infer the various discursive communities from how verified users are perceived by standard ones: if two verified accounts are considered similar by unverified ones, we link them in the network of certified accounts. We first observe that, despite the subject being mostly scientific, the COVID-19 discussion shows a clear division into what turn out to be different political groups. Then, by using a commonly available fact-checking software (NewsGuard), we assess the reputation of the pieces of news exchanged. We filter the network of retweets (i.e. users re-broadcasting the same elementary piece of information, or tweet) from random noise and check the presence of messages displaying a url. The impact of misinformation posts reaches 22.1% in the right and center-right wing community, and its contribution is even stronger in absolute numbers, due to the activity of this group: 96% of all non reputable urls shared by political groups come from this community.


Introduction
The advent of the internet and online social media has promoted a more democratic access to information, increasing the offer of news sources, with a significant number of individual contributions too. Unfortunately, unmediated communication channels have generated an incredible amount of low-quality contents, polluting the online debate in several areas, like politics, healthcare, education, and environment [1]. For this reason, in the recent Joint Communication titled "Tackling COVID-19 disinformation -Getting the facts right" (June 10, 2020, https://bit.ly/35C1dGs), the High Representative of the Union for Foreign Affairs and Security Policy, while introducing the various d/misinformation campaigns that arose during the first months of the COVID-19 pandemic, presented an explicit declaration of intent: "Combatting the flow of disinformation, misinformation [...] calls for action through the EU's existing tools, as well as with Member States' competent authorities [...] enhancing citizens' resilience." The research regarding online social media, featuring the detection of misinformation campaigns and of the pollution of the online political debate has been the target of a great flow of recent research, e.g. [2][3][4][5][6][7][8][9][10][11]. Nevertheless, due to the societal relevance of the topic, the analysis of misinformation campaigns during the COVID-19 pandemic has immediately attracted several scholars, focusing on different facets of this phenomenon: on the Google trend related to Coronavirus arguments [12], on the existence of Facebook groups experiencing an extreme exposure to disinformation [13], on the evolution of the diffusion in Twitter of false information across several countries [14] and on the disinformation epidemiology on various online social platforms [15]. 
In the present paper, using Twitter as a benchmark, we shall consider the effective flow of online misinformation in Italy, one of the countries in Europe that have been affected the most by the COVID-19 pandemic [1] , and how this flow affected the various discursive communities, i.e., groups of users that debate on the pandemic. Since the debate is mostly centered on verified users, i.e., users whose identity is certified by Twitter, we start considering their interactions with unverified accounts. Following [10,11,16], our intuition is that two verified users that are perceived to be similar by unverified users, interact with (i.e., retweet and are retweeted by) the same accounts. In order to assess how many common unverified users are 'enough' to state that the two verified users are indeed similar, we use an entropy-based null-model as a benchmark [17,18]. In a nutshell, the entropy-based null-model is a network benchmark in which part of the information is constrained to the values observed in the real system and the rest is completely random. If the observations are not compatible with the null-model, then they cannot be explained by the constraints only and carry a non trivial information regarding the real system. Interestingly enough, we find that the main discursive communities are political, i.e., they involve politicians, political parties and journalists supporting a specific political ideal. While, at first sight, this may sound surprising -the pandemic debate was more on a scientific than on a political ground, at least in the very first phase of its abrupt diffusion -, it might be due to pre-existing echo chambers [19]. We then consider the news sources shared among the accounts of the various groups. 
With a hybrid annotation approach, based on independent journalists and on annotation carried out by members of our team, we categorised such sources as reputable and non reputable (in terms of the credibility of the published news and the transparency of the source). Finally, we extract the effective flow of content shared within the network: still following the approach of Refs. [10,11], we extend the entropy-based methodology to a directed bipartite network of users and posts. In this way, we are able to control not only the authorship activity and the retweeting attitude of the various accounts, but also the virality of the different messages, i.e., how many times a single message is shared. The various political groups display different online behaviours. In particular, the right wing community is more numerous and more active, even relative to the number of accounts involved, than the other communities. Interestingly enough, newly formed political parties, such as that of the former Italian Prime Minister Matteo Renzi, quickly imposed their presence on Twitter and on the online political debate, with a strong activity. Furthermore, the different political parties use different sources for getting information on the spreading of the pandemic. Notably, we observe that right and center-right wing accounts spread information from non reputable sources with a frequency almost 10 times higher than that of the other political groups. Interestingly, due to their outstanding activity, their impact, in terms of number of d/misinforming posts in the debate, is much greater than that of any other group. The paper is organised as follows: we describe the dataset in Section 3 and present the results of our analysis in Section 4. After discussing and commenting our results in Section 5, we introduce the methodology implemented in our analysis (Section 6).
[1] In Italy, since the beginning of the pandemic and at the time of writing, more than 2.5 million persons have contracted the Covid-19 virus: of these, more than 89k have died. Source: http://www.protezionecivile.gov.it/ Accessed February 4, 2021.

Related Work
As in any disaster, natural or otherwise, people are exposed to the spread of related online misinformation. This is the case of COVID-19: the physical pandemic was quickly complemented by the so-called COVID-19 infodemic, i.e. the diffusion of a great amount of low-quality information regarding the pandemic. Academia has stepped up its efforts to combat this infodemic. Here, we briefly review some of the most relevant articles in the area. Rovetta et al., in [12], explore the internet search activity related to COVID-19 from January to March 2020 and analyse article titles from the most read newspapers and government websites 'to investigate the attitudes of infodemic monikers circulating across various regions and cities in Italy'. The study reveals a growing regional and population-level interest in COVID-19 in Italy, highlighting how the majority of searches concern (often unfounded) remedies against the disease. The work in [14], by Gallotti et al., develops an Infodemic Risk Index to depict the risk of exposure to false online information in various countries around the world. Regarding healthcare news, the authors find that, even before the rise of the pandemic, entire countries were exposed to false stories that can severely threaten public health. Hossaini et al. [20] release COVIDLies, a dataset of 6761 expert-annotated tweets to evaluate the performance of existing NLP systems in detecting false stories about COVID-19. Still regarding datasets, the work by Zhou et al. [21] presents ReCOVery, a repository of more than 2k news articles on Coronavirus, together with more than 140k tweets testifying to the spreading of such articles on Twitter. Chen et al., in [22], present to the scientific community a multilingual COVID-19 Twitter dataset that they have been continuously collecting since January 2020. Celestini et al., in [13], collect and analyse over 1.5 M COVID-19-related posts in Italian.
Their findings are that, although controversial topics associated with the origin of the virus circulate online, discussions on such topics are negligible compared to those on mainstream news websites. Pierri et al., in [23], provide public access to online conversations of Italian users around vaccines on Twitter, an on-going collection capturing the Italian vaccine roll-out (on December 27, 2020). The authors also report a consistent amount of low-credibility information already circulating on Twitter alongside vaccine-related conversations. Still regarding the COVID-19 vaccination campaigns, De Verna et al. collect a Twitter dataset of English posts, giving statistics about hashtags, urls, and number of tweets over time through a dashboard. Sharma et al., in [24], consider the role of Twitter bots in the pandemic online debate. Moving away from the research trend of detecting teams of bots on the basis of features concerning coordination and synchronous behaviour between such accounts, they propose an approach to automatically uncover coordinated group behaviours from account activities and interactions between accounts, based on temporal point processes. Many works examine Twitter, because it provides public APIs for data collection. Yang et al. [25] instead analyse and compare the presence of links pointing to low-credibility content in both Twitter and Facebook posts. Misinformation 'superspreaders' and evidence of coordinated sharing of false stories about COVID-19 are present on both platforms. Still at a narrower granularity, Cinelli et al. in [15] carry out a massive analysis on Twitter, Instagram, YouTube, Reddit and Gab. The authors characterize COVID-19-related information spreading from questionable sources, finding different volumes of misinformation on each platform.
This brief literature overview on the COVID-19 infodemic, although not exhaustive, highlights that the spread of misinformation on pandemic-related issues on the internet and social media is a major issue. Scientists propose various methods to detect false information about the virus. Aligned with this line of research, in this manuscript we quantify the effective level of misinformation about the pandemic exchanged on Twitter during late winter and early spring in 2020 in Italy, with a special focus on the role of the Italian political communities.

Dataset
Using Twitter's streaming API, from February 21st to April 20th 2020 [2] we collected circa 4.5M tweets in Italian. The data set analysed is actually a subset of a larger corpus for which language was not a selection criterion for the download; we then selected the Italian messages. Due to the great amount of data downloaded, we ran into Twitter's download limit, even if quite seldom [3]; nevertheless, due to the validation procedures (see Sections 4 and 6), we expect the impact of Twitter's limitation to be negligible for the interpretation of the results. The data collection was keyword-based, with keywords related to the Covid-19 pandemic, including the most used version (coronavirus) and the first names for the virus. The complete lists of keywords used can be found in Table 1. Twitter's streaming API returns any tweet containing the keyword(s) in the text of the tweet, as well as in its metadata. It is worth noting that it is not always necessary to have each permutation of a specific keyword in the tracking list. For example, the keyword 'Covid' will return tweets that contain both 'Covid19' and 'Covid-19'. Table 1 lists a subset of the considered keywords and hashtags. Some hashtags overlap because an included keyword is a sub-string of another one, but we included both for completeness. Let us conclude this short section with a few final comments.
[2] We had an interruption of one day and 4 hours on February 27th and another of three days and 8 hours on March 10th, due to a connection breakdown. Moreover, because the validation procedure was applied to the network aggregated over the entire period of data collection, we expect the effect of the breakdown on the interpretation of our results to be limited.
[3] During the traffic peaks at the end of February, we ran into Twitter limits less than once a day.
First, we remark that, despite being relatively less popular than other OSNs (in Italy it is used by less than 5.8% of the population to access the latest information [26]), Twitter has a higher incidence of journalists and politicians than other platforms (Twitter is the second most used social network, after Facebook, with an incidence of 30% of journalists accessing it every day [27]), probably due to the limited number of characters of the messages shared, which is extremely suitable for short and fast communication, such as breaking news [27]. Finally, details about the health situation in Italy during the period of data collection can be found in the Supplementary Material, Section 1.1: 'Evolution of the Covid-19 pandemics in Italy'.

Discursive communities of verified users
Many results in the analysis of online social networks (OSN) show that users are highly clustered in groups of opinion [28][29][30][31][32][33][34][35][36]; indeed, those groups display some peculiar behaviours, such as the echo chamber effect [32,33]. Following the example of references [10,11], we leverage this clustering of users in order to detect discursive communities, i.e. groups of accounts interacting among themselves by retweeting on the same (covid-related) subjects. Remarkably, our procedure does not rely on the analysis of the text shared by the various users, but only on the retweeting activity among users. In the present subsection we examine how the information about the discursive communities of verified Twitter users can be extracted. On Twitter there are two distinct categories of accounts: verified and unverified users. Verified users have a tick close to the screen name; the platform itself, upon request from the user, has a procedure to check the authenticity of the account. Verified accounts are usually owned by politicians, journalists or VIPs in general, as well as ministers, newspapers, newscasts, companies and so on: for those kinds of users, the verification procedure guarantees the identity of their account and reduces the risk of malicious accounts tweeting in their name. Unverified accounts are for standard users: in this second case, we cannot trust any information provided by the users. The information carried by verified accounts has been studied extensively in order to have a sort of anchor for the related discussion [9-11, 16, 37, 38]. To detect the discursive communities we consider the bipartite network represented by verified (on one layer) and unverified (on the other layer) accounts: a link connects the verified user v with the unverified one u if v was retweeted by u at least once, and/or vice versa.
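As an illustration of the bipartite construction just described, a minimal Python sketch follows. The input format (a list of `(retweeter, author)` pairs), the set `verified` and the function names are assumptions made for the example and are not taken from the original pipeline. The sketch only builds the network and counts the unverified neighbours shared by each pair of verified users; it does not perform any statistical validation of those counts.

```python
from collections import defaultdict
from itertools import combinations

def build_bipartite(retweets, verified):
    """Return {verified_user: set of unverified neighbours}.

    retweets: iterable of (retweeter, author) pairs (hypothetical input format).
    verified: set of verified account ids.
    A link v-u exists if v was retweeted by u and/or u was retweeted by v.
    """
    neighbours = defaultdict(set)
    for retweeter, author in retweets:
        if author in verified and retweeter not in verified:
            neighbours[author].add(retweeter)
        elif retweeter in verified and author not in verified:
            neighbours[retweeter].add(author)
        # pairs inside the same layer are not links of the bipartite network
    return dict(neighbours)

def common_neighbours(neighbours):
    """Count shared unverified neighbours for every pair of verified users."""
    overlaps = {}
    for v, w in combinations(sorted(neighbours), 2):
        shared = len(neighbours[v] & neighbours[w])
        if shared:
            overlaps[(v, w)] = shared
    return overlaps

# Toy data: A, B, C are verified; u1, u2, u3 are unverified.
retweets = [("u1", "A"), ("u1", "B"), ("u2", "A"),
            ("u2", "B"), ("B", "u3"), ("u3", "C")]
verified = {"A", "B", "C"}
net = build_bipartite(retweets, verified)
print(common_neighbours(net))
```

The overlap counts are the raw observations; deciding whether a given overlap is large enough to be meaningful is a separate step that requires a null model.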
To extract the similarity of users, we compare the commonalities with a bipartite entropy-based null-model, the Bipartite Configuration Model (BiCM [39]), described in detail in subsection 6.1. The rationale is that two verified users that share many links to the same unverified accounts probably have similar visions, as perceived by the audience of unverified accounts. We then apply the method of [40] in order to get a statistically validated projection of the bipartite network of verified and unverified users. In a nutshell, the idea is to compare the number of common links measured on the real network with the expectations of an entropy-based null model fixing (on average) the degree sequence: if the associated p-value is so low that the overlap cannot be explained by the model, i.e. it is not compatible with the degree sequence expectations, the overlap carries non trivial information and we put a link connecting the two nodes in the (monopartite) projection of verified users. The top panel of Fig. 1 shows the network obtained by following the above procedure. The network resulting from the projection procedure will be called, in the rest of the paper, the validated network [4]. In order to get the communities of verified Twitter users, we applied the Louvain algorithm [41] to the undirected validated network. Such an algorithm, despite being one of the most effective and popular, is also known to be order dependent [42]. To get rid of this bias, we apply it iteratively N times (N being the number of nodes), each time after reshuffling the order of the nodes. Finally, we select the partition with the highest modularity. The network presents a strong community structure, composed of four main subgraphs. When analysing the emerging 4 communities, we find that they correspond to
1. Media and right and center-right wing parties (in steel blue)
2. Center-left wing (in dark red)
3. Movimento 5 Stelle (5 Stars Movement, or M5S; in dark orange)
4. Institutional accounts (in sky blue)
Details about the political situation in Italy during the period of data collection can be found in the Supplementary Material, Section 1.2: 'Italian political situation during the Covid-19 pandemic'. While the various groups display a quite evident homophily among their components, we further examined them by re-running the Louvain algorithm inside each of them, with the same care as above regarding the node order. Since the subcommunity structure is extremely rich, we refer the interested reader to Section 2 of the Supplementary Material for a more detailed description. In this manuscript we focus on the purely political subcommunities, highlighted in the lower panel of Fig. 1. Starting from the center-left wing, we find a darker red community, including the main politicians of the Italian Democratic Party (Partito Democratico, or PD), as well as its representatives in the European Parliament (Italian and others) and some EU commissioners. The violet red group is instead mostly composed of the representatives of Italia Viva, a new party founded by the former Italian prime minister Matteo Renzi (February 2014 - December 2016). In turn, the dark orange (M5S) community also shows the presence of a purely political subcommunity (in orange in the bottom panel of Fig. 4), which contains the accounts of politicians, parliament representatives and ministers of the 5 Stars Movement and journalists of Il Fatto Quotidiano, a newspaper supporting the Movimento 5 Stelle. Similar considerations apply to the steel blue community: the subcommunity of center-right and right wing parties (such as Forza Italia, Lega and Fratelli d'Italia, from now on FI-L-FdI) is represented in blue in the bottom panel of Fig. 4.
[4] The term validated should not be confused with the term verified, which instead denotes a Twitter user who has passed the formal authentication procedure by the social platform.
Finally, the sky blue community is mainly composed of Italian embassies around the world. Let us conclude with a final comment. In [11], the authors analysed with similar techniques the Italian Twitter debate on a political subject, namely migration policies. After cleaning the system from random noise, as in the present paper, the authors highlighted a group of coordinated accounts (the bot squad) increasing the visibility of more than a single genuine account. Despite the different final target, the division into communities resembles the one found here, with a few differences. First, in [11] media and center-right and right wing parties appeared in different communities from the very beginning; this is probably due to the fact that, in the present case, the criticisms raised by the main leaders of these parties regarding the management of the pandemic were promptly reported by the media, since these parties represented the opposition to the government. Secondly, in Ref. [11] M5S is not distinguishable from the right and center-right wing discursive community. This second point is not so surprising since, at the time of the data collection of the previous manuscript, M5S was governing in an alliance with Lega, the main right wing party in Italy, and Matteo Salvini, the leader of Lega, was the Minister

Domain names' analysis
Here, we report a series of analyses related to the domain names, hereafter simply called domains, that appear most often in the tweets of the validated network of verified users. The domains have been tagged according to their degree of credibility and transparency, as indicated by the independent software toolkit NewsGuard (https://www.newsguardtech.com/). The details of this procedure are reported below. As a first step, we considered the network of verified accounts, whose communities and subcommunities are shown in Fig. 1. On this topology, we labelled all domains that had been shared at least 20 times (counting both tweets and retweets). Table 2 shows the tags associated with the domains. In the rest of the paper, we shall be interested in quantifying the reliability of news sources publishing during the period of interest. Thus, for our analysis, we will not consider those sources corresponding to social networks, marketplaces, search engines, institutional sites, etc.; nevertheless, the information regarding their frequency is available to interested readers in the Supplementary Material. Tags R, ∼R and NR in Table 2 are used only for news sites, be they newspapers, magazines, TV or radio social channels, and they stand for Reputable, Quasi Reputable and Not Reputable, respectively. As mentioned above, we relied on NewsGuard, a plugin resulting from the joint effort of journalists and software developers aiming at evaluating news sites according to nine criteria concerning credibility and transparency. For evaluating the credibility level, the metrics consider, e.g., whether the news source regularly publishes false news, fails to distinguish between facts and opinions, or does not correct wrongly reported news. For transparency, instead, the tool takes into account, e.g., whether owners, founders or authors of the news source are publicly known, and whether advertisements are easily recognizable.
After combining the individual scores obtained on the nine criteria, the plugin associates with a news source a score from 1 to 100, where 60 is the minimum score for the source to be considered reliable. When reporting the results, the plugin provides details about the criteria that were met and those that were not. For the sake of completeness, the Supplementary Material reports the procedure adopted by NewsGuard journalists and editors to score each news site, the meaning of the score, and the textual information associated with the score. The material is taken from the NewsGuard website [5].
In order to have a sort of no-man's land and not to be too abrupt in the transition between reputability and non-reputability, when the score was between 55 and 65, we considered the source to be quasi reputable, ∼R.
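The tagging rule described above can be summarised by a short Python sketch. The function name and the treatment of the band endpoints are assumptions: the text does not state whether scores of exactly 55 or 65 fall inside the quasi reputable band, and here both are assumed to do so.

```python
def reputability_tag(score, low=55, high=65):
    """Map a NewsGuard-style score to one of the tags R, ~R, NR.

    Assumption: the endpoints `low` and `high` belong to the ~R band.
    Scores above the band are Reputable, scores below it Not Reputable.
    """
    if low <= score <= high:
        return "~R"
    return "R" if score > high else "NR"

# Example on a few hypothetical scores.
for s in (40, 55, 60, 65, 80):
    print(s, reputability_tag(s))
```

With this convention, the nominal reliability threshold of 60 sits in the middle of the ∼R band, so a source is tagged R or NR only when its score is clearly on one side of the threshold.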
It is worth noting that not all the domains in the dataset under investigation had been evaluated by NewsGuard at the time of our analysis. Those not yet evaluated by NewsGuard were annotated by three members of our team, who assessed the domains using a subset of the NewsGuard criteria. The final class was decided by majority voting (it never happened that the three annotators gave three different labels to the same domain). In the case of the network of verified users, considering only domains that appear at least 20 times, we have 80 domains annotated by NewsGuard and 42 domains annotated by our three annotators. We computed the Fleiss' kappa (κ) inter-rater agreement metric [43], which measures the level of agreement of different annotators on a task. The annotators showed a moderate agreement for the classification of domains, with κ = 0.63. Table 3 gives statistics about the number and kind of tweets, the number of urls and distinct urls (dist url), and the number of domains and users in the validated network of verified users. We clarify what we mean by these terms with an example: a domain for us corresponds to the so-called 'second-level domain' name [6], i.e., the name directly to the left of .com, .net, and any other top-level domain. For instance, repubblica.it, corriere.it, nytimes.com are considered domains by us. The url, instead, maintains here its standard definition [7]; an example is http://www.example.com/index.html. Table 4 shows the outcome of the domain annotation, according to the scores of NewsGuard or to those assigned by the three annotators when scores were not available from NewsGuard.
Table 3 Posts, urls, domains and users statistics in the validated network of verified users. "Tw" represents pure tweets, while "rt" indicates retweets. The number of tweets sharing a url is much higher than that of retweets; this is a known result for verified users, from which they appear to drive the online debate.
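For readers unfamiliar with the agreement metric and the majority vote used in the manual annotation, the following Python sketch shows how both can be computed. The toy labels are invented for the example and are not the annotations of the paper; the function names are likewise hypothetical.

```python
from collections import Counter

def fleiss_kappa(ratings, categories):
    """Fleiss' kappa for a list of items, each rated by the same number of raters.

    ratings: list of per-item label lists, e.g. [["R", "R", "NR"], ...].
    categories: list of all admissible labels.
    """
    n_items = len(ratings)
    n_raters = len(ratings[0])
    counts = [Counter(item) for item in ratings]
    # Observed agreement: fraction of agreeing rater pairs, averaged over items.
    p_items = [
        (sum(c[cat] ** 2 for cat in categories) - n_raters)
        / (n_raters * (n_raters - 1))
        for c in counts
    ]
    p_bar = sum(p_items) / n_items
    # Chance agreement from the overall share of each category.
    p_cat = [sum(c[cat] for c in counts) / (n_items * n_raters)
             for cat in categories]
    p_exp = sum(p ** 2 for p in p_cat)
    return (p_bar - p_exp) / (1 - p_exp)

def majority_label(labels):
    """Return the most frequent label among the annotators of one item."""
    return Counter(labels).most_common(1)[0][0]

# Invented toy annotations: 4 domains, 3 annotators each.
toy = [["R", "R", "R"], ["R", "R", "NR"], ["NR", "NR", "NR"], ["~R", "R", "~R"]]
print(fleiss_kappa(toy, ["R", "~R", "NR"]))
print([majority_label(item) for item in toy])
```

Note that `majority_label` presupposes that a majority exists, which the text reports was always the case for the three annotators.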
At first glance, the majority of the news domains belong to the Reputable category. The second highest percentage is that of the untagged domains (UNC). In fact, in our dataset there are many domains that occur only a few times. For example, there are 300 domains that appear in the dataset only once.
[5] NewsGuard rating process: https://www.newsguardtech.com/ratings/rating-process-criteria/
[6] https://en.wikipedia.org/wiki/Domain_name
[7]
Table 4 Annotation results over all the domains in the validated network of verified users. It is worth noticing that, while original posts tend to share reputable domains, this percentage strongly reduces for shared (retweeted) posts, in favour of other non classified sources. Indeed, in introducing arguments for the discussion, verified users preferably refer to reliable sources, while they are less rigorous when sharing others' messages.

Domain names' analysis of verified users
While in Section 3 of the Supplementary Material we analyse the reputability of the domains used in the various communities of verified users, here we focus on the urls shared in the purely political subcommunities, reported in Table 5. Broadly speaking, we examine the contribution of the different political parties, as represented on Twitter, to the spread of d/misinformation and propaganda. Table 5 clearly shows that the vast majority of the news coming from sources considered scarcely or non reputable is shared by the center-right and right wing subcommunity (FI-L-FdI). Notably, the percentage of non reputable sources shared by the FI-L-FdI accounts is more than 30 times that of the second community in the NR ratio ranking.
Table 5 Domains annotation per political subcommunity, validated network of verified users. Frequency of the various types of news sources. The incidence of Non Reputable sources in the center-right and right wing discursive community is almost 30 times that of the second community in the NR ratio ranking. In fact, the impact of NR sources is even greater in absolute numbers, due to the greater sharing activity of the users in this group (more than twice the value of the Movimento 5 Stelle subcommunity). For a greater detail in the annotations, consult the
Table 6 Posts, urls, domains and users statistics per political subcommunity, validated network of verified users: #post is the number of posts (divided into tweets and retweets) by the considered community, #url is the number of links shared, #dist url is the number of distinct urls, #domain is the number of distinct domains contained in all urls. While the number of (validated) verified users in the center-right and right wing subcommunity is lower than in any other political group, their activity in writing original posts is at least twice that of any other group. This difference is not present in the number of retweets.
Looking at
Figure 2 The directed validated projection of the retweet activity network: the communities are highlighted according to the political discursive community they belong to. All nodes not belonging to political discursive communities are in grey. Node sizes are proportional to their out-degree.

The validated retweet network
In the present subsection we examine the effective retweet network, i.e. the activity of user of sharing messages as a reaction to an interesting original tweet. By focusing on the effectiveness of the relations we mean to consider the non random flow of messages from user to user: indeed it may happen that a single tweet is shared simply because it is viral, because its retweeter is particularly active or because the account publishing the original tweet is extremely prolific. Instead we are interested in the flow that cannot be explained only by the activity of users or by the popularity of posts, in order to highlight the non-trivial sharing activity, distinguishing the relevant information from the random noise. We define a directed bipartite network in which one layer is composed by accounts and the other one by the tweets. An arrow connecting a user u to a tweet t represents the u writing the message t. The arrow in the opposite direction means that the user u is retweeting the message t. To filter out the random noise from this network, we make use of the directed version of the BiCM, i.e. the Bipartite Directed Configuration Model (BiDCM [44]), described in subsection 6.2. The BiDCM constrains the in-and out-degree sequences of nodes on both layers, in our representation the users' tweeting and retweeting activity and the virality of posts. In order to detect the non trivial flow of messages from user to user, for every (directed) couple of accounts, we compared the number of retweets observed in the real system with the expectation of the null model. If the amount of retweets cannot be explained by the theoretical model, we project a link from the author to the retweeter in the monopartite directed network of users. Due to the process of validation, we call this network directed validated projection. More details can be found in the subsection 6.3. 
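To make the quantities involved more concrete, the following Python sketch builds the observed side of the comparison described above: the number of retweets for every directed (author, retweeter) pair, together with the degree sequences that the null model keeps fixed on average. The data structures and function names are assumptions chosen for the example, and the sketch does not implement the BiDCM or the statistical validation itself.

```python
from collections import Counter

def observed_retweet_flow(authorship, retweets):
    """Count the retweets observed for each directed (author, retweeter) pair.

    authorship: dict tweet_id -> author (the user -> tweet arrows).
    retweets: list of (tweet_id, retweeter) pairs (the tweet -> user arrows).
    """
    flow = Counter()
    for tweet_id, retweeter in retweets:
        author = authorship.get(tweet_id)
        if author is None or author == retweeter:
            continue  # assumption: unknown authors and self-retweets are skipped
        flow[(author, retweeter)] += 1
    return flow

def degree_sequences(authorship, retweets):
    """Return the users' tweeting activity, their retweeting activity
    and the virality of each tweet."""
    tweets_written = Counter(authorship.values())
    retweets_made = Counter(user for _, user in retweets)
    virality = Counter(tweet for tweet, _ in retweets)
    return tweets_written, retweets_made, virality

# Toy data with three users and three tweets.
authorship = {"t1": "a", "t2": "a", "t3": "b"}
retweets = [("t1", "b"), ("t2", "b"), ("t1", "c"), ("t3", "a")]
flow = observed_retweet_flow(authorship, retweets)
written, made, virality = degree_sequences(authorship, retweets)
print(dict(flow))
```

In the validation step, each entry of `flow` would be compared with its expectation under the null model, and only the pairs whose observed count is statistically significant would become links of the directed projection.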
In order to infer the affiliation of unverified users to the various discursive communities, we use the labels obtained in the previous subsection for verified users and propagate them on the validated retweet network using the algorithm proposed in [45]. In the Section 6 of the Supplementary Material we show that propagating labels on the entire weighted retweet network, on its binary version or on the validated version is almost equivalent in order to get the labels for the users in the directed validated network. After label propagation, the representation of the political communities in the validated retweet network (displayed in Fig. 2) changes dramatically with respect to the case of the network of verified users: the center-right and right wing community is the most represented community in the whole network, with 11063 users (representing 21.1% of all the users in the validated network), followed by Italia Viva users with 8035 accounts (15.4% of all the accounts in the validated network). The impact of M5S and PD is much more limited, with, respectively, 3286 and 564 accounts. It is worth noting that this result is unexpected, due to the recent formation of Italia Viva. As in our previous study targeting the online propaganda [11], we observe that the most effective users in term of hub score [46] are almost exclusively from the center-right and right wing party: considering the first 100 hubs, only 4 are not from this group. Interestingly, 3 out of these 4 are verified users: Roberto Burioni, one of the most famous Italian virologists, ranking 32nd, Agenzia Ansa, a popular Italian news agency, ranking 61st, and Tgcom24, the popular newscast of a private TV channel, ranking 73rd. The fourth account is an online news website, ranking 88th: this is a unverified account which belongs to a non political community. 
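The hub ranking discussed above can be illustrated with a small Python sketch. It assumes that the hub score of [46] is the one of the HITS algorithm, computed here by plain power iteration; this identification, the toy edge list and the function name are assumptions of the sketch. Edges point from the author of a tweet to the retweeter, as in the directed validated projection, so accounts retweeted by many well-connected users obtain a high hub score.

```python
def hits(edges, iterations=100):
    """Power-iteration HITS on a list of directed (source, target) edges.

    Returns two dicts (hub, authority), each normalised to sum to 1.
    """
    nodes = sorted({n for edge in edges for n in edge})
    hub = {n: 1.0 for n in nodes}
    auth = {n: 1.0 for n in nodes}
    for _ in range(iterations):
        # Authority of a node: sum of the hub scores of the nodes pointing to it.
        auth = {n: 0.0 for n in nodes}
        for source, target in edges:
            auth[target] += hub[source]
        # Hub of a node: sum of the authority scores of the nodes it points to.
        new_hub = {n: 0.0 for n in nodes}
        for source, target in edges:
            new_hub[source] += auth[target]
        norm_a = sum(auth.values()) or 1.0
        norm_h = sum(new_hub.values()) or 1.0
        auth = {n: v / norm_a for n, v in auth.items()}
        hub = {n: v / norm_h for n, v in new_hub.items()}
    return hub, auth

# Toy projection: "a" is retweeted by b, c and d; "e" only by b.
edges = [("a", "b"), ("a", "c"), ("a", "d"), ("e", "b")]
hub, auth = hits(edges)
ranking = sorted(hub, key=hub.get, reverse=True)
print(ranking[:2])
```

Sorting the accounts by `hub` gives a ranking of the kind used in the text to identify the most effective users.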
Remarkably, in the top 5 hubs we find 3 of the top 5 hubs already found when considering the online debate on migrations from northern Africa to Italy [11]: in particular, a journalist of a neo-fascist online newspaper (non verified user), an extreme right activist (non verified user) and the leader of Fratelli d'Italia, Giorgia Meloni (verified user), who ranks 3rd in the hub score. Matteo Salvini (verified user), who was the first hub in [11], ranks 9th, surpassed by his party colleague Claudio Borghi (verified user), ranking 6th. The first hub in the present network is an (unverified) extreme right activist, posting videos against African migrants to Italy and accusing them of being responsible for the contagion and of violating lockdown measures. Table 7 shows the annotation results of all the domains tweeted and retweeted by users in the directed validated network. The annotation was made considering the domains occurring at least 100 times. Also in this case, the sites not yet evaluated by NewsGuard have been annotated by the same three members of our team. We have 100 domains annotated by NewsGuard and 53 domains annotated by the three annotators. Also in this case, the annotators showed a moderate agreement on the classification of domains, with κ = 0.57. There are important differences from Table 4: the majority of urls traceable to news sources is still considered reputable, but its incidence is much reduced. Interestingly enough, the impact of non reputable or nearly reputable sources is nearly 19% for original messages and 16% for retweets, against percentages around 3% and 2%, respectively, for verified users.
Table 7 Annotation results over all the domains - directed validated network. Differently from the analogous Table 4, in the case of all users the number of original messages is less than one half of that of retweets. This behaviour can be explained by the different role that verified users play in the debate: indeed, those accounts are drivers of the discussion and contribute mostly by proposing original messages. Interestingly enough, in passing to all users, the percentages of non reputable or nearly reputable sources rise from nearly 3% and 2% to nearly 19% and 16% for tweets and retweets, respectively.
Noticeably, by comparing these numbers with those of Table 6, which reports analogous statistics for the validated network of verified users, we can see that here the number of retweets is much higher than the number of tweets, while the opposite was true for verified users: verified users tend to tweet more than retweet, while users in the directed validated network, which also includes non verified users, retweet more, the number of retweets being even more than 5 times the number of tweets, depending on the community. This behaviour was already observed in [10,11] and it is essentially due to the preeminence of verified users in shaping the public debate on Twitter. It is also remarkable that verified users represent a minority of all users in the validated networks. Fig. 3 shows the trend of the number of posts containing urls over the period of data collection. The highest peak, which appears after the discovery of the first cases in Lombardy, corresponds to more than 68000 posts containing urls; a higher traffic is still present before the beginning of the Italian lockdown, while the activity settles down as the quarantine goes on [8]. Interestingly enough, the same reduction of the overall activity after the beginning of the lockdown was also detected in [14,22]. It is interesting to note that the incidence of NR sources is nearly constant over the entire period.
Tables 9 and 10 show the core of our analysis, that is, the distribution of reputable and non reputable news sources in the directed validated network, consisting of both verified and non-verified users. Again, we focus directly on the 4 political subcommunities identified in the previous subsection. The incidence of non reputable sources in the subcommunity of center-right and right wing parties reaches the impressive percentage of 22.1%, which is even greater than what was observed in Table 5 (i.e. 12.8%): the contribution of unverified users seems to further boost the diffusion of unreliable contents. It is even more alarming that the percentage of nearly reputable sources is large too: considering both non reputable and nearly reputable sources, the percentage is 34.2%. Thus, more than one third of the urls shared in the validated network by the FI-L-FdI subcommunity are at most nearly reputable. Table 10 offers another point of view. The #user column shows the number of users of the 4 different political subcommunities who share urls labelled as NR. In absolute numbers, the FI-L-FdI community shares the highest number of NR urls, being responsible for 96% of the NR urls shared by political subcommunities.
[8] The low peaks for February 27 and March 10 are due to an interruption in the data collection, caused by a connection breakdown.
Table 9 Domains annotation per political subcommunities - directed validated network. The contribution of urls of the FI-L-FdI community to the validated network is more than 3 times greater than that of any other discursive community, with 22% of non reputable urls. The interested reader can find more details in Table 5
Table 10 Share of dissemination of NR domains for each of the 4 subcommunities - directed validated network.
Due to its outstanding activity, the contribution of the center-right and right wing discursive community to the significant spread of non reputable news sources is impressive: nearly 96% of all NR urls spread by political subcommunities come from this group.

Domain analysis on the directed validated network
This behaviour is not only due to the greater number of users: in the FI-L-FdI subcommunity the accounts sharing NR urls are particularly active. In this group, the average number of (original) NR posts sent per user is 32.21, which is almost 6 times the average for M5S users (5.38 NR posts per user); IV and PD have averages of 4.48 and 1.00, respectively. The fraction of accounts retweeting NR sources among all users of the same community is extremely high too for FI-L-FdI (57.6% for FI-L-FdI, 23.5% for M5S, 5.79% for IV and 2.5% for PD; the percentages for tweets only are similar). It thus appears that FI-L-FdI contributes substantially to the diffusion of d/misinformation, not only in relation to the numbers of posts and users, but also in absolute numbers: out of over 1M tweets, more than 320k refer to a NR url.

Non Reputable sources shared in the effective flow of misinformation
As a final task, over the whole set of tweets produced or shared by the users in the directed validated network, we counted the number of times a message containing a url was shared by users belonging to different political communities, without considering the semantics of the tweets. Namely, we ignored whether the urls were shared to support or to oppose the presented arguments. Table 11 shows the most frequent (tweeted and retweeted) NR domains shared by the political communities; the number of occurrences is reported next to each domain. The first NR domains for FI-L-FdI in Table 11 are related to right, extreme right and neo-fascist propaganda, as is the case of imolaoggi.it, ilprimatonazionale.it and voxnews.info, recognised as disinformation websites by NewsGuard and by the two main Italian debunking websites, bufale.net and BUTAC.it. As shown in the table, some domains appear in more than one column, although with different numbers of occurrences, and are thus shared by users close to different political communities. However, since the semantics of the posts containing these domains was not investigated, the retweets of the links by more than one political community could be meant to oppose, and not to support, the opinions expressed in the original posts: indeed, here we only intend to present the most frequent NR domains.
Table 11 List of the most frequent NR domains, with relative occurrences, per political subcommunity. The count was made considering all posts by users of the directed validated network.

Discussion
Due to its impact on several dimensions of society, the online debate regarding the COVID-19 epidemic was the target of several early studies [12][13][14][15][20][21][22][23][24][25]. In the present article we examine the presence of d/misinformation campaigns in the Italian online societal debate about the pandemic during the peak of its first wave. Our analysis is based on a general methodology, reviewed in [17,18], to extract both the discursive communities and the effective flow of messages [10,11]: in both cases, we build an entropy-based null model, constraining part of the information of the real system, and compare the observations on the real network with this benchmark. In particular, we extracted the various discursive communities by focusing our attention on verified users, i.e. public figures whose identity has been checked directly by the Twitter platform. Indeed, we observed that, as in other cases [10,11,16], verified accounts lead the debate: their original posts (the tweets) are many more than their retweets, i.e. their messages sharing others' original tweets. Due to their role in the online debate, we examined in detail the activity of verified users. Furthermore, we focused on the effective flow of information in the online debate: by comparing the system with an entropy-based null model, we filter out all the random noise associated with the online activities of users. In this sense, we highlighted all the non trivial retweeting activities and further examined the properties of the filtered network, focusing on the incidence of non reputable sources shared in the debate. Although the results were obtained for a specific country, we believe that our approach, being general and unbiased by construction, is extremely useful to highlight non trivial properties and peculiarities.
In particular, when analyzing the outcome of our investigation, some features attracted our attention:
1 Persistence of clusters w.r.t. different discussion topics: In Caldarelli et al. [11], we focused on tweets concerned with immigration, an issue that has been central in the Italian political debate for years. In particular, using the same techniques we implemented here in order to extract the effective retweet network, we highlighted the presence of coordinated automated accounts effectively increasing the visibility of some accounts belonging to the same discursive community. Here, we discovered that the clusters and the echo chambers that were detected when analysing tweets about immigration are almost the same as those singled out when considering discussions concerned with Covid-19 [9]. This may seem surprising, because a discussion about Covid-19 may not be exclusively political, but also medical, social, economic, etc. From this we can argue that the clusters are political in nature and that, even when the topic of discussion changes, users remain in their cluster on Twitter. (It is, in fact, well known that journalists and politicians use Twitter for information and political propaganda, respectively.) Why political polarisation and the political vision of the world so strongly affect even the analysis of what should be an objective phenomenon is still an intriguing question.
2 (Dis)Similarities amongst offline and online behaviors of members and voters of parties: Maybe less surprisingly, political habits are also reflected in the degree of participation in the online discussions. In particular, among the parties on the center-left wing side, a small party (Italia Viva) shows a much more effective social presence than the larger party of the Italian center-left wing (Partito Democratico), which has many more active members and a larger parliamentary representation. More generally, there is a significant difference in social presence among the different political parties, and the amount of activity is not at all proportional to the size of the parties in terms of members and voters.
3 Spread of non reputable news sources: In the online debate about Covid-19, many links to non reputable news sources are posted and shared (non reputable as defined by NewsGuard, a toolkit ranking news websites based on criteria of transparency and credibility, led by veteran journalists and news entrepreneurs, https://www.newsguardtech.com/). The kind and the occurrences of the urls vary with the corresponding political community. Furthermore, the center-right and right wing discursive community is characterised by a relatively small number of verified users, corresponding to a very large number of acolytes, who are in turn very active, three times as much as those of the opposite communities in the partition. In particular, when considering the amount of retweets from poorly reputable news sites, this community is by far (one order of magnitude) more active than the others. As already noted in our previous publication [11], this extra activity could be explained by a more skilled use of the systems of propaganda: in that case, a massive use of bot accounts and a targeted activity against migrants (as resulted from the analysis of the hub list).
[9] Actually, in the analysis in [11] the center-right and right wing parties were distinct from the Media community. In the present analysis they are divided into two different groups only when the first community is further examined by running the community detection again.
While our work contributes to the literature regarding the analysis of the impact of d/misinformation on the online societal debate, it also paves the way for other crucial analyses. In particular, it is of interest to analyse the structure of the retweet network and how it may contribute to increasing the visibility of some of the influential accounts that we detected (this was, in part, the target of the analysis in [47]). In this sense, understanding the role of automated accounts in the network and in the diffusion of NR sources is also of utmost importance in order to tackle the problem of online d/misinformation.

Methods
In the present section we recall the main steps for the definition of an entropy-based null model; the interested reader can refer to the review [18]. We start by reviewing the Bipartite Configuration Model [39], which has been used for detecting the network of similarities of verified users. We then examine the extension of this model to bipartite directed networks [44]. Finally, we present the general methodology to project the information contained in a (directed or undirected) bipartite network, as developed in [40].

Bipartite Configuration Model
Let us consider a bipartite network $G^*_{\text{Bi}}$, in which the two layers are $L$ and $\Gamma$. Define $\mathcal{G}_{\text{Bi}}$ as the ensemble of all possible graphs with the same number of nodes per layer as in $G^*_{\text{Bi}}$. It is possible to define the entropy related to the ensemble as [48]:
$$S = -\sum_{G_{\text{Bi}} \in \mathcal{G}_{\text{Bi}}} P(G_{\text{Bi}}) \ln P(G_{\text{Bi}}), \qquad (1)$$
where $P(G_{\text{Bi}})$ is the probability associated to the instance $G_{\text{Bi}}$. Now we want to obtain the maximum entropy configuration, constraining some relevant topological information regarding the system. For the bipartite representation of verified and unverified users, a crucial ingredient is the degree sequence, since it is a proxy of the number of interactions (i.e. tweets and retweets) with the other class of accounts. Thus in the present manuscript we focus on the degree sequence. Let us then maximise the entropy (1), constraining the average over the ensemble of the degree sequence. It can be shown [40] that the probability distribution over the ensemble is
$$P(G_{\text{Bi}}) = \prod_{i \in L} \prod_{\alpha \in \Gamma} p_{i\alpha}^{m_{i\alpha}} \left(1 - p_{i\alpha}\right)^{1 - m_{i\alpha}},$$
where $m_{i\alpha}$ are the entries of the biadjacency matrix describing the bipartite network under consideration and $p_{i\alpha}$ is the probability of observing a link between the nodes $i \in L$ and $\alpha \in \Gamma$. The probability $p_{i\alpha}$ can be expressed in terms of the Lagrangian multipliers $x$ and $y$ for nodes on the $L$ and $\Gamma$ layers, respectively, as
$$p_{i\alpha} = \frac{x_i y_\alpha}{1 + x_i y_\alpha}. \qquad (3)$$
In order to obtain the values of $x$ and $y$ that maximize the likelihood of observing the real network, we need to impose the following conditions [49,50]:
$$\langle k_i \rangle = \sum_{\alpha \in \Gamma} p_{i\alpha} = k_i^* \quad \forall i \in L, \qquad \langle k_\alpha \rangle = \sum_{i \in L} p_{i\alpha} = k_\alpha^* \quad \forall \alpha \in \Gamma,$$
where the $*$ indicates quantities measured on the real network. Actually, the real network is sparse: the bipartite network of verified and unverified users has a connectance $\rho \simeq 3.58 \times 10^{-3}$. In this case the formula (3) can be safely approximated with the Chung-Lu configuration model, i.e.
$$p_{i\alpha} \simeq \frac{k_i^* k_\alpha^*}{m},$$
where $m$ is the total number of links in the bipartite network.
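The sparse-network approximation can be sketched in a few lines. The example below assumes the standard Chung-Lu form, in which the link probability is the product of the two observed degrees divided by the total number of links m; the function name and the toy biadjacency matrix are hypothetical and serve only as an illustration.

```python
def chung_lu_probabilities(biadjacency):
    """Chung-Lu link probabilities p[i][a] = k_i * k_a / m.

    biadjacency: list of rows of 0/1 entries (rows = layer L, columns = layer Gamma)
    """
    row_deg = [sum(row) for row in biadjacency]          # degrees on layer L
    col_deg = [sum(col) for col in zip(*biadjacency)]    # degrees on layer Gamma
    m = sum(row_deg)                                     # total number of links
    return [[k_i * k_a / m for k_a in col_deg] for k_i in row_deg]

# Toy biadjacency matrix (invented): 3 nodes on L, 4 nodes on Gamma
M = [[1, 0, 0, 0],
     [0, 1, 0, 0],
     [0, 0, 1, 1]]
p = chung_lu_probabilities(M)
expected_row_degrees = [sum(row) for row in p]
```

By construction the expected degree of every node equals its observed degree, which is the constraint imposed by the null model. On small or dense networks the formula can return values above 1, which is why the approximation is only appropriate in the sparse regime.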

Bipartite Directed Configuration Model
In the present subsection we consider the extension of the BiCM to directed bipartite networks and highlight the peculiarities of the network under analysis in this representation. The adjacency matrix describing a directed bipartite network of layers $L$ and $\Gamma$ has a peculiar block structure, once nodes are ordered by layer membership (here the nodes on the $L$ layer first):
$$A = \begin{pmatrix} O & M \\ N^{T} & O \end{pmatrix},$$
where the $O$ blocks represent null matrices (indeed they describe links connecting nodes inside the same layer: by construction they are exactly zero) and $M$ and $N$ are non zero blocks, describing links connecting nodes on layer $L$ with those on layer $\Gamma$ and vice versa. In general $M \neq N$, otherwise the network is not distinguishable from an undirected one. We can apply the same machinery of the section above, but extending the degree sequence to a directed degree sequence, i.e. considering the in- and out-degrees for nodes on the layer $L$,
$$k_i^{\text{out}} = \sum_{\alpha \in \Gamma} m_{i\alpha}, \qquad k_i^{\text{in}} = \sum_{\alpha \in \Gamma} n_{i\alpha}$$
(here $m_{i\alpha}$ and $n_{i\alpha}$ represent respectively the entries of matrices $M$ and $N$) and for nodes on the layer $\Gamma$,
$$k_\alpha^{\text{out}} = \sum_{i \in L} n_{i\alpha}, \qquad k_\alpha^{\text{in}} = \sum_{i \in L} m_{i\alpha}.$$
The definition of the Bipartite Directed Configuration Model (BiDCM, [44]), i.e. the extension of the BiCM above, follows closely the same steps described in the previous subsection. Interestingly enough, the probabilities relative to the presence of links from $L$ to $\Gamma$ are independent of the probabilities relative to the presence of links from $\Gamma$ to $L$. If $q_{i\alpha}$ is the probability of observing a link from node $i$ to node $\alpha$ and $q'_{i\alpha}$ the probability of observing a link in the opposite direction, we have
$$q_{i\alpha} = \frac{x_i^{\text{out}} y_\alpha^{\text{in}}}{1 + x_i^{\text{out}} y_\alpha^{\text{in}}}, \qquad q'_{i\alpha} = \frac{x_i^{\text{in}} y_\alpha^{\text{out}}}{1 + x_i^{\text{in}} y_\alpha^{\text{out}}},$$
where $x_i^{\text{out}}$ and $x_i^{\text{in}}$ are the Lagrangian multipliers relative to the node $i \in L$, respectively for the out- and the in-degrees, and $y_\alpha^{\text{out}}$ and $y_\alpha^{\text{in}}$ are the analogous quantities for $\alpha \in \Gamma$. In the present application we have some simplifications: the bipartite directed network representation describes users (on one layer) writing and retweeting posts (on the other layer).
If users are on the layer $L$ and posts on the opposite one, and $m_{i\alpha}$ represents the user $i$ writing the post $\alpha$, then $k_\alpha^{\text{in}} = 1 \ \forall \alpha \in \Gamma$, since each message cannot have more than one author. Notice that, since our constraints are conserved on average, we are considering, in the ensemble of all possible realisations, even instances in which $k_\alpha^{\text{in}} > 1$ or $k_\alpha^{\text{in}} = 0$, i.e. non physical ones; nevertheless the average is constrained to the right value, i.e. 1. The fact that $k_\alpha^{\text{in}}$ is the same for every $\alpha$ allows for a great simplification of the probability per link on $M$:
$$q_{i\alpha} = \frac{\left(k_i^{\text{out}}\right)^*}{N_\Gamma}, \qquad (9)$$
where $N_\Gamma$ is the total number of nodes on the $\Gamma$ layer. The simplification in (9) is extremely helpful in the projected validation of the bipartite directed network [10].
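A short numerical check may help to see why this simplification is consistent with the constraint on the tweets' in-degree. The sketch assumes that the simplified authorship probability is the same for every tweet and equals the number of tweets written by the user divided by the number of tweets; the function name and the toy numbers are hypothetical.

```python
def authorship_probabilities(tweets_written, n_tweets):
    """Probability that a given user is the author of a given tweet.

    tweets_written: dict user -> number of original tweets (out-degree on M)
    n_tweets:       total number of tweets (nodes on the Gamma layer)
    """
    return {user: k_out / n_tweets for user, k_out in tweets_written.items()}

# Toy example (invented): 3 tweets in total, written by two of three users
tweets_written = {"anna": 2, "bruno": 1, "carla": 0}
q = authorship_probabilities(tweets_written, n_tweets=3)

# Expected in-degree of any tweet: sum over users of the authorship probability
expected_in_degree = sum(q.values())
```

Since every tweet has exactly one author, the out-degrees sum to the number of tweets and the expected in-degree of each tweet is 1, as required.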

Validation of the projected network
The information contained in a bipartite (directed or undirected) network can be projected onto one of the two layers. The rationale is to obtain a monopartite network encoding the non trivial interactions among the two layers of the original bipartite network. The method is pretty general, once we have a null model in which probabilities per link are independent, as is the case for both the BiCM and the BiDCM [40]. The method is graphically depicted in Fig. 4 in the case of the BiCM; the case of the BiDCM is analogous. The first step is the definition of a bipartite motif that may capture the non trivial similarity (in the case of an undirected bipartite network) or flux of information (in the case of a directed bipartite network). This quantity can be captured by the number of V-motifs between users $i$ and $j$ [39,51],
$$V_{ij} = \sum_{\alpha \in \Gamma} m_{i\alpha} m_{j\alpha},$$
or by its directed extension
$$\mathcal{V}_{ij} = \sum_{\alpha \in \Gamma} m_{i\alpha} n_{j\alpha}$$
(note that $\mathcal{V}_{ij} \neq \mathcal{V}_{ji}$). We compare the abundance of these motifs with the null models defined above: all motifs that cannot be explained by the null model, i.e. whose p-values are statistically significant, are validated into the projection on one of the layers [40]. In order to assess the statistical significance of the observed motifs, we calculate the distribution associated to the various motifs. For instance, the expected value for the number of V-motifs connecting $i$ and $j$ in an undirected bipartite network is
$$\langle V_{ij} \rangle = \sum_{\alpha \in \Gamma} p_{i\alpha} p_{j\alpha},$$
where the $p_{i\alpha}$ are the probabilities of the BiCM. Analogously,
$$\langle \mathcal{V}_{ij} \rangle = \sum_{\alpha \in \Gamma} q_{i\alpha} q'_{j\alpha} = \frac{\left(k_i^{\text{out}}\right)^* \left(k_j^{\text{in}}\right)^*}{N_\Gamma},$$
where in the last step we use the simplification of (9) [10].
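The comparison between observed and expected V-motifs can be illustrated as follows for the undirected case. The observed count is the number of common neighbours of two nodes, and the expected count is the sum over the opposite layer of the products of the link probabilities. The matrices below are toy values invented for the example; in particular, the probability matrix is a hypothetical input and is not obtained from real data.

```python
def v_motifs(matrix, i, j):
    """Sum over the opposite layer of matrix[i][a] * matrix[j][a].

    With a 0/1 biadjacency matrix this is the observed number of V-motifs
    (common neighbours); with link probabilities it is the expected number.
    """
    return sum(a * b for a, b in zip(matrix[i], matrix[j]))

# Toy biadjacency matrix: nodes 0 and 1 share two neighbours
m = [[1, 1, 0, 0],
     [1, 1, 1, 0]]

# Hypothetical null-model link probabilities for the same two nodes
p = [[0.50, 0.50, 0.25, 0.25],
     [0.75, 0.75, 0.50, 0.50]]

observed = v_motifs(m, 0, 1)
expected = v_motifs(p, 0, 1)
```

Here the two nodes share 2 neighbours against an expectation of 1; whether such an excess is significant is decided by the statistical test described next.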
In both the directed and the undirected case, the distribution of the V-motifs or of their directed extensions is a Poisson-Binomial one, i.e. a binomial distribution in which each event has a different probability. In the present case, due to the sparsity of the analysed networks, we can safely approximate the Poisson-Binomial distribution with a Poisson one [52]. In order to state the statistical significance of the observed value, we calculate the related p-values according to the relative null models. Once we have a p-value for every detected V-motif, the related statistical significance can be established through the False Discovery Rate (FDR) procedure [53], which, unlike other multiple hypothesis testing procedures, controls the expected fraction of false positives among the rejected hypotheses. In our case, all rejected hypotheses identify the V-motifs that cannot be explained only by the ingredients of the null model and thus carry non trivial information regarding the system. In this sense, the validated projected network includes a link for every rejected hypothesis, connecting the nodes involved in the related motifs.
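The validation step can be sketched as follows, under two assumptions that should be read as such: the p-value is the upper tail of a Poisson distribution whose mean is the expected number of motifs, and the FDR procedure is implemented in its common Benjamini-Hochberg form (the text only refers to [53]). Function names, the significance level and the toy numbers are hypothetical.

```python
import math

def poisson_pvalue(observed, mean):
    """P(X >= observed) for X ~ Poisson(mean)."""
    if observed <= 0:
        return 1.0
    term = math.exp(-mean)          # P(X = 0)
    cdf = 0.0
    for k in range(observed):       # accumulate P(X <= observed - 1)
        cdf += term
        term *= mean / (k + 1)
    return max(0.0, 1.0 - cdf)

def fdr_validate(pvalues, alpha=0.05):
    """Benjamini-Hochberg step-up rule; returns the keys of rejected hypotheses."""
    ranked = sorted(pvalues.items(), key=lambda kv: kv[1])
    n = len(ranked)
    last = 0
    for rank, (_, p) in enumerate(ranked, start=1):
        if p <= rank * alpha / n:
            last = rank             # largest rank passing the threshold
    return {key for key, _ in ranked[:last]}

# Toy couples: (observed motifs, expected motifs under the null model)
couples = {("anna", "carla"): (6, 0.5),
           ("anna", "bruno"): (1, 0.8),
           ("bruno", "anna"): (2, 1.5)}
pvalues = {c: poisson_pvalue(obs, exp) for c, (obs, exp) in couples.items()}
validated = fdr_validate(pvalues)
```

Only the couple whose observed count is far above the expectation survives, and it becomes a link of the validated projection. For very small p-values, computing the tail as one minus the cumulative sum loses precision, so a production implementation would rely on a dedicated survival function.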

Availability of Data and Materials
The data that support the findings of this study are available from Twitter, but restrictions apply to the availability of these data, which were used under license for the current study, and so are not publicly available. Data are however available from the authors upon reasonable request and with permission of Twitter. NewsGuard data are proprietary and cannot be shared.

Competing interests
The authors have no competing interests.

Funding

Figure 4 Schematic representation of the projection procedure for bipartite undirected networks. a) An example of a real bipartite network. For the actual application, the two layers represent verified (turquoise) and unverified (gray) users, and a link between nodes of different layers is present if one of the two users retweeted the other one at least once. b) Definition of the Bipartite Configuration Model (BiCM) ensemble. Such an ensemble includes all possible link realisations, once the number of nodes per layer has been fixed. c) We focus our attention on nodes i and j, i.e. two verified users, and count the number of common neighbours (in magenta both the nodes and the links to their common neighbours). d) Subsequently, we compare this measure on the real network with the one on the ensemble. e) If this overlap is statistically significant with respect to the BiCM, we have a link connecting the two verified users in the projected network. The figure is an adaptation from [11].

Authors' contributions
All the authors devised the experiment and wrote the paper; FS, MPe and MPr collected and analysed the data.