
Analysis and classification of privacy-sensitive content in social media posts


User-generated contents often contain private information, even when they are shared publicly on social media and on the web in general. Although many filtering and natural language approaches for automatically detecting obscenities or hate speech have been proposed, determining whether a shared post contains sensitive information is still an open issue. The problem has been addressed by assuming, for instance, that sensitive contents are published anonymously, on anonymous social media platforms or with more restrictive privacy settings, but these assumptions are far from being realistic, since the authors of posts often underestimate or overlook their actual exposure to privacy risks. Hence, in this paper, we address the problem of content sensitivity analysis directly, by presenting and characterizing a new annotated corpus with around ten thousand posts, each one annotated as sensitive or non-sensitive by a pool of experts. We characterize our data with respect to the closely-related problem of self-disclosure, pointing out the main differences between the two tasks. We also present the results of several deep neural network models that outperform previous naive attempts at classifying social media posts according to their sensitivity, and show that state-of-the-art approaches based on anonymity and lexical analysis do not work in realistic application scenarios.

1 Introduction

The Web is pervaded with user-generated content, as Internet users have multiple and increasing ways to express themselves. They can post reviews of products, businesses, services and experiences; they can share their thoughts, pictures and videos through different social media platforms; they reply to surveys, forums and newsgroups, and some of them have their own blogs and web pages. Many companies encourage this behavior, because user-generated content attracts other users more than professional content does, and this increases their engagement on web platforms. However, texts, photos and videos posted by users may harm their own and others' privacy, thus exposing themselves (and other users) to many risks, from discrimination or cyberbullying to fraud and identity theft. Although user-generated content is often subject to moderation, also through automated recognition techniques for inappropriate content [1], hate speech [2] and cyberbullying [3], there is no control on the sensitivity of posted contents. It is worth noting that social media and forums are not the only platforms that store and publish private contents. Surveys or contact/helpdesk forms are other examples where users are free to enter any type of text and other contents, together with other, more structured personal information. Often, such data need to be transferred to third parties to be analyzed, and the lack of control on free-text fields could put the privacy of respondents at risk. A common quick solution consists in removing all such fields entirely, or sanitizing them automatically or by hand. However, existing automatic sanitization approaches [4–6] try to replace sensitive terms belonging to specific domains (e.g., medical or criminal records) with more general ones, and rely on existing knowledge bases and natural language processing techniques such as named entity recognition and linking.
In some cases, sanitization techniques destroy the informativeness (and sometimes the meaning itself) of the text.

Self-disclosure, i.e., the act of revealing personal information to others [7], is a social phenomenon that has also been extensively studied in relation with online forums [8], online support groups [9] and social media [10]. Although self-disclosure is also closely related to content sensitivity, it has often been investigated in the context of intrinsically sensitive topics, such as in forums related to health issues, intimate relationships, sex life, or forum sections explicitly devoted to people searching for support from strangers [11]. In these settings, the identity of the users is often masked by pseudonyms or kept entirely anonymous. Instead, general purpose social media platforms usually encourage the usage of the real identity, although this does not prevent their users from disclosing very private information [12–14]. Moreover, the sensitivity of social media texts is harder to detect, because the context of a post plays a fundamental role as well. Finally, social media posts are sometimes very short; yet, they may disclose a lot of private information.

To better understand the problem, let us observe the post in Fig. 1: it does not mention any sensitive term or topic, but discloses information about the author and his friend Alice Green, and contains hidden spatiotemporal references that are immediately clear from the context (the author is about to leave for a journey, which implies that he will be far from home for a month, disclosing potentially sensitive information). On the other hand, there may exist posts that contain very sensitive terms, but are not sensitive at all when contextualized correctly. An example is given by the post in Fig. 2, where several sensitive terms (struggling, suffering, COVID-19) and topics (health, economic crisis) are mentioned, but no private information is disclosed about any specific person. In these cases, the automatic assessment of text sensitivity could save a lot of rich information and help automate the sanitization process. Furthermore, an automatic warning system able to detect the true potential sensitiveness of a post may help a user decide whether to share it or not.

Figure 1

A potentially sensitive post. The post does not mention any sensitive term or topic, but discloses information about the author and his friend Alice Green, and contains hidden spatiotemporal references that are immediately clear from the context

Figure 2

A non-sensitive post mentioning sensitive topics and terms. The post contains several sensitive terms (struggling, suffering, COVID-19) and topics (health, economic crisis), but no private information is disclosed about any specific person

Indeed, the problem of assessing and characterizing the sensitivity of content posted on general purpose social media has already been studied but, due to the unavailability of specifically annotated text corpora, it has been tackled through the lens of anonymity, by assuming that sensitive contents are posted anonymously [15, 16], on anonymous platforms [17], or with more restrictive privacy settings [18], while non sensitive ones are posted by identifiable users and/or made available to everyone. However, as we pointed out in [19], anonymity and sensitivity are not straightforwardly related to each other. The decision to post anonymously could be determined solely by the sensitivity of the topic, but not by the sensitivity of the posted content itself. Analogously, many non anonymous social media posts contain very private information, just because their sensitivity [12] or their visibility [14] is underestimated by the content authors. These considerations make what we call the “anonymity assumption” too simplistic, or even unrealistic in practice. Other existing annotated corpora concern posts extracted from Reddit [11] and support groups for cancer patients [8, 9]. Unfortunately, these corpora focus on very specific (and intrinsically sensitive) topics or give a very restrictive interpretation of self-disclosure: in [11], for instance, only posts disclosing personal information or feelings about the authors are annotated as sensitive. Moreover, that corpus has a strong focus on mutually supportive communities and intimate relationships. To cope with these problems, we have very recently introduced a more general task, called content sensitivity analysis, as a machine learning task aimed at assigning a sensitivity score to content [19]. However, in that preliminary work, we modeled the problem as a simple bag-of-words classification task on a very small text dataset (fewer than 700 social media posts), with mild accuracy results (just above the majority classifier).

In this paper, we address the limitations of previous works by analyzing a new large corpus of nearly 10,000 text posts, all annotated as sensitive or non sensitive by humans, without assuming any implicit and forced link between anonymity and privacy. We provide an in-depth analysis of sensitive and non sensitive posts, and introduce several sequential deep neural network models that outperform bag-of-words classifiers. We also show that models trained according to the anonymity assumption do not work properly in realistic scenarios. Moreover, we study how the problem of self-disclosure is related to ours and show that existing text corpora are not adequate for analyzing the sensitivity of posts shared on general purpose social media platforms. To the best of our knowledge, this is the first work addressing the problem of directly and efficiently evaluating the real sensitivity of short text posts. It thus has the potential to become a new gold standard in content sensitivity analysis and self-disclosure, and could open new research opportunities for improving users' awareness of privacy and for performing privacy risk assessment or sanitization on data containing free-text fields.

Our paper is organized as follows. In Sect. 2, we review some closely related work and discuss its limitations. In Sect. 3, we formally define our concept of privacy-sensitive content, describe how we constructed our annotated corpus, and present the datasets used in our analysis. Section 4 contains an in-depth analysis of the lexical features characterizing sensitive content in the different datasets, while, in Sect. 5, we report the results of multiple classification tasks conducted under different settings. In Sect. 6 we discuss the results of the experiments in more detail and draw some general conclusions. Finally, Sect. 7 concludes by also presenting some future research perspectives.

2 Related work

With the success of online social networks and content sharing platforms, understanding and measuring the exposure of user privacy on the Web has become crucial [20, 21]. Thus, many different metrics and methods have been proposed with the goal of assessing the risk of privacy leakage in posting activities [22, 23]. Most research efforts, however, focus on measuring the overall exposure of users according to their privacy settings [24, 25] or position within the network [14]. Instead, the problem of characterizing and detecting the sensitivity of user-generated content has been the subject of very few studies in the last decade. One of the first works in this direction tried to address the problem with a lexicographic approach [26, 27]. Similarly to sentiment analysis or emotion detection, in fact, linguistic resources may help identify sensitive content in texts. In their work, Vasalou et al. leverage prototype theory and traditional theoretical approaches to construct and evaluate a dictionary intended for content analysis. Using an existing content analysis tool applied to several text corpora, they evaluate dictionary terms according to privacy-related categories. Interestingly, the same authors note that there is no consistent and uniform theory of privacy sensitivity.

To bypass this problem, several authors adopt a simplification: they assume that the sensitivity of contents is strictly related to the choice of posting them anonymously. This also makes the construction of annotated corpora easier, because one just needs to consider contents posted anonymously as sensitive, while posts shared with identifiable information can be considered non sensitive. Hence, for instance, Peddinti et al. adopt this strategy for analyzing anonymous and non anonymous posts on a famous question-and-answer website [15]. They analyze different basic machine learning models to predict whether a particular answer will be written anonymously. Similarly, Correa et al. define the sensitivity of a social media post as the extent to which users think the post should be anonymous [17]. They compare content posted on anonymous and non-anonymous social media sites both in terms of topics and from the linguistic point of view, and conclude that sensitivity is often subjective and may be perceived differently according to several aspects. Very recently, the same authors have published a sanitized version of nearly 90 million posts downloaded from Whisper, an anonymous social media platform [28]. Biega et al. conduct a similar study, but restrict the analysis to sensitive topics with the aim of measuring the privacy risks of the users [29]. It is worth noting that all these studies conclude that sensitivity is subjective.

Content sensitivity has been associated with privacy settings as well: similarly to anonymity, contents posted with restricted visibility are deemed sensitive. Yu et al. analyze sensitive pictures by learning the object-privacy correlation according to privacy settings, identifying categories of privacy-sensitive objects with a deep multi-task learning architecture [18]. They also use their model to customize privacy settings automatically and to sanitize images by blurring sensitive objects.

Text sanitization is another close research field whose goal is to find and hide personally identifiable information while preserving text utility. To this purpose, Jiang et al. present an information theoretic approach that hides sensitive terms behind more general but semantically related terms [30]. Similarly, Sanchez et al. propose several information theoretic approaches that detect and hide sensitive textual information while preserving its meaning by exploiting knowledge bases [4, 31, 32]. Iwendi et al., instead, focus on unstructured medical datasets and propose a framework to completely anonymize textual clinical records exploiting regular expressions, dictionaries and named entity recognition. Their method aims at replacing the detected protected health information with its available generalization, according to a well-known medical ontology [5]. Finally, Hassan et al. use word embeddings to evaluate the disclosure caused by textual terms on the entity to be protected, according to the similarity between their vector representations [6]. All the above mentioned methods rely on the identification of named entities or quasi-identifying terms, and try to replace them with semantically close, although more general, terms. Hence, they all leverage some kind of knowledge base or ontology, and work well on specific domains (e.g., medical documents, criminal records and so on). Instead, we address a more general notion of sensitivity, which also includes texts that may reveal sensitive or simply private users' habits, feelings or characteristics.
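The dictionary-and-replacement strategy shared by these approaches can be illustrated with a minimal sketch. The generalization map and the crude name pattern below are toy stand-ins for the knowledge bases and named entity recognizers the cited systems actually use, not a reproduction of any of them:

```python
import re

# Toy generalization map standing in for a knowledge base (hypothetical entries).
GENERALIZATIONS = {
    "leukemia": "illness",
    "armed robbery": "crime",
}

# Crude stand-in for a named entity recognizer: two capitalized words in a row.
PERSON_NAME = re.compile(r"\b[A-Z][a-z]+ [A-Z][a-z]+\b")

def sanitize(text: str) -> str:
    """Replace person names and known sensitive terms with generalizations."""
    out = PERSON_NAME.sub("[PERSON]", text)
    for term, general in GENERALIZATIONS.items():
        out = re.sub(re.escape(term), general, out, flags=re.IGNORECASE)
    return out
```

As the paragraph notes, such replacement preserves readability only when a suitable generalization exists in the knowledge base; otherwise it degrades the informativeness of the text.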

A closely related concept is the so-called self-disclosure, defined as the act of revealing personal information to others [7]. Self-disclosure has been widely studied well before the advent of modern social media, in particular for its implications in online support groups, online discussion boards and forums. For instance, Barak et al. study, among other things, the reciprocity of self-disclosure in online support groups and discussion forums, showing that there are substantial differences in how people behave in these two media types [8]. Yang et al., instead, analyze the differences in the degree of positive and negative self-disclosure in public and private channels of online cancer support groups [9]. They show that people tend to self-disclose more in public channels than in private ones. Moreover, negative self-disclosure is also more present in public online support channels than in private chats or emails. To achieve these results, the authors study lexical, linguistic, topic-related and word-vector features of a relatively small annotated corpus using support vector machines. Ma et al. conduct a questionnaire-based mixed-factorial survey experiment to answer several questions concerning the relationships that regulate anonymity, intimacy and self-disclosure in social media [10]. They show, for instance, that intimacy always regulates self-disclosure, while anonymity tends to increase the level of self-disclosure and decrease its regulation, in particular for content of negative valence. Differently from the previous works, Jaidka et al. directly address the problem of self-disclosure detection in texts posted in online forums, by reporting the results of a challenge concerning a relatively large annotated corpus made up of top posts collected from Reddit [11]. Contrary to [28], in this corpus, all posts are directly annotated according to their degree of informational and emotional self-disclosure.
The authors also intend to investigate the emotional and informational supportiveness of posts and to model the interplay between these two variables. Unfortunately, this corpus is not entirely suited to our purpose (i.e., detecting the sensitivity of text content on general purpose social media platforms) for four main reasons: first, the focus is on self-disclosure, although a post may reveal sensitive information about other people as well; second, posts on Reddit are published under pseudonyms, while general purpose social media foster the usage of real identities; third, a large part of the posts has been extracted from a subreddit explicitly devoted to people searching for other users' support; last but not least, all posts concern intimate relationships by design.

In conclusion, in our work, we do not make any “anonymity” or “privacy settings” assumption, since it has been shown that users tend to underestimate or simply overlook their privacy risk [12–14]. Consequently, we analyze and characterize sensitive posts directly. In a very preliminary version of our work, we tried to give a more generic definition of sensitivity [19]. However, our model was trained on very few posts and used simple bag-of-words classifiers, thus achieving mild accuracy results. In this work, we construct a much larger and more reliable dataset of social media posts, directly annotated according to their sensitivity, and use more sophisticated and accurate models to help decide whether a post is sensitive or not. Additionally, we provide further lexical and semantic insights about sensitive and non sensitive texts.

3 An annotated corpus for content sensitivity

In this section, we introduce the data that we use in our study. We first provide a conceptualization of “content sensitivity” also in relation with existing similar concepts; then, we describe how we construct our annotated corpus and provide some characterization of it.

3.1 Privacy-sensitive content

Content sensitivity is strictly related to the concept of self-disclosure [7], a communication process by which one person reveals any kind of personal information about themselves (e.g., feelings, goals, fears, likes, dislikes) to another. It has been described within social penetration theory as one of the main factors enabling relationship development [33, 34]. Due to the peculiarities of online communication (and its differences w.r.t. face-to-face communication), the social and psychological implications of self-disclosure on the Internet have been extensively studied as well [35]. For its implications on user privacy, self-disclosure has also been investigated in relation with privacy awareness, policies and control [36], and some rule-based detection techniques for self-disclosure in forums have been proposed [37], leading to some relatively large annotated corpora [11].

In this paper, we refer to content sensitivity as a more general concept than self-disclosure. In [19] we gave a preliminary, subjective and user-centric definition of privacy-sensitive content. In that work, we stated that a generic user-generated content object is privacy-sensitive if it makes the majority of users feel uncomfortable in writing or reading it because it may reveal some aspects of their own or others' private life to unintended people. This definition is motivated by the fact that each social media platform has its own peculiarities, and the amount and quality of social ties also play a fundamental role in regulating self-disclosure [10]. However, it has many drawbacks, since it relies on the subjective perception of users and on a notion of discomfort that can also be driven by other external factors. This also conditioned the preliminary annotation of a corpus, leading to poor detection results. Consequently, in this paper, we adopt a more objective definition of privacy-sensitive content.

Definition 1

(Privacy-sensitive content)

A generic user-generated content is privacy-sensitive if it discloses, explicitly or implicitly, any kind of personal information about its author or other identifiable persons.

Differently from the concept of self-disclosure, our definition explicitly mentions the disclosure of information concerning persons other than the author of the content. Furthermore, it also clearly includes contents that implicitly reveal personal information of any kind. For instance, the sentence “There's nothing worse than recovering from COVID-19” is apparently neutral. However, it is very likely that the person expressing this claim has also personally experienced the effects of SARS-CoV-2 infection.

3.2 Datasets

Most previous attempts at sensitivity analysis on text contents assume that sensitive posts are shared anonymously, while non sensitive posts are associated with real social profiles. Other available corpora do not explicitly rely on that distinction, but have been collected in very specific domains (e.g., health support groups [9]) or focus on limited types of self-disclosure (e.g., intimate/family relationships [11]). Hence, we will consider a new generic dataset with explicit “sensitive/non-sensitive” annotations. To this purpose, we first need a corpus consisting of mixed sensitive and non-sensitive posts. Twitter is not the most suitable source for that, because most public tweets are of limited interest to our analysis, while tweets with restricted access cannot be downloaded. Moreover, it is well known that users are significantly more likely to provide a more “honest” self-representation on Facebook [38, 39]. Consequently, Facebook posts are better suited to our purposes, but contents posted on personal profiles cannot be downloaded, while public posts and comments published on pages do not fit the bill as they are, in general, non sensitive. Furthermore, they would require a huge sanitization effort in order to make them available to the research community. Fortunately, one of the datasets described in [40], and released publicly, has all the required characteristics. It is a sample of 9917 Facebook posts (status updates) collected for research purposes in 2009–2012 within the myPersonality project [41], by means of a Facebook application that implemented several psychological tests. The application obtained consent from its users to record their data and use it for research purposes. All the posts have been sanitized manually by their curators: every proper name of a person (except famous ones, such as “Chopin” and “Mozart”) has been replaced with a fixed string. Famous locations (such as “New York City” and “Mexico”) have not been removed, either. Almost all posts are written in English, with an average length of 80 characters (the minimum and maximum lengths are, respectively, 2 and 435 characters). Since the recruitment was carried out on Facebook, the dataset suffers from the typical sample bias of the Facebook environment (some groups of people might be under- or over-represented). However, the same problem applies to other datasets as well [9, 11, 28].

All 9917 posts were presented to a pool of 12 volunteers (7 males and 5 females, aged from 24 to 41 years, mainly postgraduate/Ph.D. students and researchers), so as to have exactly three annotations per post. Hence, we formed four groups, each consisting of three annotators; every group was assigned from 2479 to 2485 posts. For each post, the volunteers had to state whether they considered the post sensitive, non-sensitive, or of unknown sensitivity. The choices also included a fourth option, unintelligible, used, for instance, to tag posts written in a language other than English. For each category, the annotators were given precise guidelines and examples (see Table 1). According to our guidelines, a post is “sensitive” if the text is understandable and the annotator is certain that it contains information that violates a person’s privacy (not necessarily the author of the post), because it contains, for instance: information about current or upcoming moves, events in the private sphere, or health or mental status; information about one’s habits or that can help geolocalize the author of the post or other people mentioned; information on one’s sentimental status; considerations that may hint at one’s political orientation or religious beliefs.

Table 1 Guidelines and examples for the annotations

At the end of the period allowed for the annotation, all volunteers had accomplished their assigned task and we computed some statistics regarding their agreement. In detail, for each group, we computed the Fleiss’ κ statistic [42], which measures the reliability of agreement between a fixed number of annotators. The results (reported in Table 2) show fair to moderate agreement in all groups, also considering that the number of possible categories is four. This result also demonstrates that deciding whether a post is sensitive or not is not straightforward, as shown by the percentage of identical annotations in each group: overall, at least 93.91% of posts have at least two identical annotations, but the percentage drops to 42.97% if we look for perfect agreement (three unanimous annotators). Apparently, there are differences among the four groups, but they are smoothed out by only considering posts with at least two “sensitive” or “non-sensitive” tags, as we will make precise later.
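For reference, Fleiss’ κ for a fixed number of raters per item can be computed as in the following sketch (variable names are ours; the statistics in Table 2 were computed per annotation group):

```python
from collections import Counter

def fleiss_kappa(ratings, categories):
    """Fleiss' kappa for a fixed number of raters per item.

    ratings: list of per-post label lists (here, 3 labels per post)
    categories: all possible labels (here, 4 annotation options)
    """
    n = len(ratings[0])  # raters per item
    N = len(ratings)     # number of items
    cat_totals = Counter()
    P_i = []
    for item in ratings:
        counts = Counter(item)
        cat_totals.update(counts)
        # proportion of agreeing rater pairs for this item
        agree = sum(c * c for c in counts.values())
        P_i.append((agree - n) / (n * (n - 1)))
    P_bar = sum(P_i) / N  # mean observed agreement
    # chance agreement from the marginal category proportions
    P_e = sum((cat_totals[c] / (N * n)) ** 2 for c in categories)
    return (P_bar - P_e) / (1 - P_e)
```

With three raters, unanimous posts contribute 1 to the observed agreement and 2-vs-1 posts contribute 1/3, which is why the percentages of two- and three-way identical annotations reported above drive the κ values.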

Table 2 Agreement computed according to Fleiss’ κ

In Table 3 we report the details of the annotations. Each column reports the number of posts that received exactly one, two or three annotations for each class. From this table it emerges that the majority of posts (7923) have been annotated at least once as non-sensitive, while the number of posts that received at least one “sensitive” annotation is considerably smaller (5826). In addition, the number of posts with unknown sensitivity drops drastically from 1529 to 7 when the number of annotations considered increases from one to three. This means that for almost all posts (except unintelligible ones) at least one annotator was able to determine their sensitivity.

Table 3 Details of the annotations. The last column contains the number of posts receiving at least one annotation for each class

Starting from all the annotations, we generate two datasets. The first one, which we call SENS2, contains all posts that received at least two “sensitive” or “non-sensitive” annotations. The second, called SENS3, contains all posts that received exactly three “sensitive” or “non-sensitive” annotations. With this choice, we automatically exclude all posts that have been annotated as “unknown” or “unintelligible” by at least two annotators. Notice that the proportion of sensitive posts is almost the same in both samples. The details of these two datasets are reported in Table 4. The average length of the posts (in terms of number of words) is relatively small (15 words, on average), a typical characteristic of social media text contents, but there is high variability (some posts are more than 85 words long).
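The derivation of SENS2 and SENS3 from the raw annotations can be sketched as follows; the dict-based representation of the annotation files is our assumption, not the released format:

```python
from collections import Counter

def build_datasets(annotations):
    """Split posts into SENS2/SENS3 based on their three annotations.

    annotations: dict mapping post_id -> list of 3 labels among
    "sensitive", "non-sensitive", "unknown", "unintelligible".
    """
    sens2, sens3 = {}, {}
    for post_id, labels in annotations.items():
        counts = Counter(labels)
        for label in ("sensitive", "non-sensitive"):
            if counts[label] >= 2:
                sens2[post_id] = label  # majority vote
            if counts[label] == 3:
                sens3[post_id] = label  # unanimous
    return sens2, sens3
```

Posts with two or three "unknown"/"unintelligible" tags never reach a majority of the two retained classes, so they are excluded from both datasets, as described above.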

Table 4 Details on the datasets used

For comparison purposes, we also use two additional datasets. The first consists of top posts extracted from two subreddits on Reddit [11]: “r/CasualConversations”, a sub-community where people are encouraged to share what’s on their mind about any topic, and “r/OffmyChest”, a mutually supportive community where deeply emotional things are shared. By design, all posts mention one of the following terms: boyfriend, girlfriend, husband, wife, gf, bf. The annotators were required to annotate each post according to the amount of emotional and informational disclosure it contains. Here, we consider all posts that do not disclose anything as “non sensitive”; all remaining posts are tagged as “sensitive”, in accordance with the choices made for annotating our dataset. We consider all 12,860 labeled training samples and the 5000 labeled test samples. Overall, 10,793 posts are labeled as “sensitive”, and 7067 as “non sensitive”. All the details are given in Table 4. The reader is referred to [11] for further details about this dataset.

The second dataset is an anonymity-based corpus following the example of [17], where sensitive posts are constituted of anonymous posts shared on Whisper (a popular social media platform allowing its users to post and share photo and video messages anonymously), while non-sensitive posts are taken from Twitter. Here, we generate ten samples, each consisting of a subset of 3336 sensitive posts selected randomly from a large collection of sanitized Whisper posts [28], and a subset of 5429 non-sensitive posts randomly picked from a large collection of tweets [43]. The numbers of sensitive and non-sensitive posts have been chosen to mimic the distribution observed in dataset SENS2. We filter out posts containing retweets or placeholders, as well as posts shorter than 9 characters or not written in English (according to the fastText model [44]). Then, from each remaining post, we remove any mentions and hashtags, in order to obtain samples of posts similar to the ones in SENS2 and SENS3. The ten samples are needed to limit any sampling bias.
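The filtering steps above can be sketched as follows. The `lang_id` callable stands in for a language identifier such as fastText’s lid.176 model, and the exact retweet and placeholder detection rules used in our pipeline are simplified here:

```python
import re

MENTION_OR_TAG = re.compile(r"[@#]\w+")  # mentions and hashtags
RT_MARKER = re.compile(r"\bRT\b")        # crude retweet marker

def clean_and_filter(posts, lang_id, min_len=9):
    """Filter and clean raw posts (a sketch of the pipeline above).

    lang_id: callable returning an ISO language code for a text,
    e.g. a wrapper around a fastText language identification model.
    """
    kept = []
    for text in posts:
        if RT_MARKER.search(text):           # drop retweets
            continue
        stripped = MENTION_OR_TAG.sub("", text).strip()
        if len(stripped) < min_len:          # drop very short posts
            continue
        if lang_id(stripped) != "en":        # keep English only
            continue
        kept.append(stripped)
    return kept
```

In practice the language check would call the pretrained model once per post; sampling the 3336 + 5429 posts per run is then a uniform random draw from the cleaned pools.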

4 Understanding sensitivity

In this section, we analyze our data in detail with the aim of characterizing sensitive and non-sensitive posts from a linguistic point of view. The goal of this analysis is to understand whether lexical features may help distinguish sensitive and non-sensitive content.

4.1 Analysis of the words

As a first analysis, we extract the most relevant terms for each class of posts in all datasets considered in our study. To this purpose, all terms are first stemmed. Then, we compute the total number of their occurrences and their relative frequency for each class, i.e., the number of occurrences of each word in each class (sensitive and non-sensitive) divided by its total number of occurrences. To avoid any bias, the number of occurrences and the relative frequency are computed on 10 random samples consisting of 500 sensitive and 500 non-sensitive posts. The results are then averaged over the 10 samples. Only words occurring at least 30 times are considered. The top-20 words ranked according to their average relative frequency in each class are shown in Tables 5, 6, 7 and 8. It is worth noting that, for the sensitive class, relative frequencies are in general much higher for WH+TW than for SENS2, SENS3 and OMC. Moreover, emergent words in WH+TW are mostly related to personal relationships, while most emergent terms in SENS2 and SENS3 are more generic and related to everyday life. This highlights one of the limitations of previous work based on anonymity, such as [17]: using different sources to gather anonymous and non-anonymous posts introduces a bias also in terms of discussion topics. Moreover, Table 7 shows the intrinsic bias of dataset OMC: the most prominent words for the sensitive class are related to friendship and personal feelings and wishes (e.g., friend, feel, would).
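The per-class relative frequency described above can be computed as in this sketch; posts are assumed to be pre-stemmed token lists, and the sampling and averaging over the 10 samples are omitted:

```python
from collections import Counter

def class_relative_frequencies(posts, labels, min_count=30):
    """Relative frequency of each (stemmed) word per class.

    posts: list of token lists (already stemmed)
    labels: parallel list of "sensitive"/"non-sensitive" tags
    Returns, per class, the share of each word's occurrences
    that falls in that class (only for words with >= min_count
    total occurrences).
    """
    per_class = {"sensitive": Counter(), "non-sensitive": Counter()}
    total = Counter()
    for tokens, label in zip(posts, labels):
        per_class[label].update(tokens)
        total.update(tokens)
    rel = {c: {} for c in per_class}
    for word, n in total.items():
        if n < min_count:
            continue
        for c in per_class:
            rel[c][word] = per_class[c][word] / n
    return rel
```

Ranking each class's words by this score and keeping the top 20 yields tables of the shape shown in Tables 5–8.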

Table 5 Most relevant words for each class in dataset SENS2
Table 6 Most relevant words for each class in dataset SENS3
Table 7 Most relevant words for each class in dataset OMC
Table 8 Most relevant words for each class in WH+TW

4.2 Analysis of the lexical features

As in [17], we categorize all words contained in each post into the different dictionaries provided by LIWC [45]. LIWC is a hierarchical linguistic lexicon that classifies words into meaningful psychological categories: for each post, LIWC counts the percentage of words that belong to each category. In addition, we also consider another, more specific, lexical resource, namely the Privacy Dictionary [26, 27]. It consists of dictionary categories derived using prototype theory according to traditional theoretical approaches to privacy. The categories, together with some example words, are presented in Table 9.

Table 9 Categories of the Privacy Dictionary [26]

Given 10 random samples consisting of 500 sensitive and 500 non-sensitive posts, we calculate the average percentage of sensitive and non-sensitive posts that contain words belonging to each dictionary, as well as the sensitive to non-sensitive ratio for each dictionary. For the psychological categories, we only list the dictionaries whose ratio exceeds 1.3 (i.e., the category is over-represented in sensitive posts) or falls below 0.7 (i.e., it is under-represented in sensitive posts) in each dataset. The results are shown in Table 10 (categories with a high sensitive to non-sensitive ratio are presented in bold), while the ratios for the privacy-related categories are all reported in Table 9. It is worth noting that the number of relevant dictionaries in Table 10 differs significantly from one dataset to another: it is minimum in SENS2 and maximum in WH+TW. Interestingly, some categories are relevant in all datasets (e.g., some personal pronouns, family, friends and female), while others are specific to individual corpora (anxiety and feelings appear only in OMC and WH+TW, money only in SENS2 and SENS3). Overall, lexical features seem to discriminate better on the OMC and WH+TW datasets than on ours, and this observation is even more evident for the Privacy Dictionary (Table 9). In our data, with the exception of the categories Law and Intimacy, almost all privacy categories are less represented in sensitive posts than in non-sensitive ones (ratios are less than one). Instead, almost all privacy categories are over-represented in sensitive posts belonging to WH+TW. In OMC, ratios are in general closer to one. These results confirm that relying on the anonymity of sources may introduce a strong lexical bias, while addressing sensitivity directly reveals less distinguishing lexical properties.
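The over/under-representation test above can be sketched as follows, with the 1.3 and 0.7 thresholds from the text; `category_ratios` is a hypothetical helper for illustration.

```python
def category_ratios(sensitive_pct, non_sensitive_pct, hi=1.3, lo=0.7):
    # sensitive_pct / non_sensitive_pct map each dictionary category to
    # the fraction of posts in that class containing at least one of
    # its words. Returns categories over- (> hi) and under- (< lo)
    # represented in the sensitive class.
    ratios = {c: sensitive_pct[c] / non_sensitive_pct[c]
              for c in sensitive_pct if non_sensitive_pct[c] > 0}
    over = {c: r for c, r in ratios.items() if r > hi}
    under = {c: r for c, r in ratios.items() if r < lo}
    return over, under
```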

Table 10 Psychological categories of LIWC [45]

This consideration is confirmed by a further experiment conducted to verify whether lexical features can help discriminate sensitive posts from non-sensitive ones. To this purpose, we set up a simple binary classification task, using a logistic regression (LR) classifier, a support vector machine (SVM) classifier with linear kernel, and a Random Forest (RF) classifier, all with default parameters. Each dataset is randomly divided into training (75%), validation (15%) and test (10%) sets: the same sets are employed in each experiment presented in this paper. Here, the training set is used for training the model, and the test set for performance evaluation. We train and test the classifiers on different feature sets: the one including all dictionaries, the one including only the psychological dictionaries, and the one consisting only of the privacy categories. Each post is then represented by a vector whose values are the percentages of words in the post belonging to each dictionary. Values are standardized to have zero mean and unit variance. According to the results presented in Table 11, WH+TW benefits from lexical features much more than all other datasets (in particular, OMC and the equally-sized SENS2). Another important observation concerns the impact of the privacy categories on classification. Apparently, some classification results are penalized by these features and, when the classifier is trained on privacy categories only, the performance drops drastically to that of the majority classifier. One explanation is that such a dictionary is built upon technical documents and is not intended as a general-purpose lexical resource, although some categories also apply to our data (e.g., Intimacy). This is also confirmed by the fact that this feature space is very sparse (non-zero values are around 2% in all datasets). Nevertheless, we have considered it in this analysis because it is the only existing lexical resource with a specific focus on privacy.
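The feature representation and the standardization step can be sketched as follows (the actual classifiers are scikit-learn's LR/SVM/RF with default parameters; `post_vector` and `standardize` are hypothetical helper names):

```python
from statistics import mean, pstdev

def post_vector(tokens, dictionaries):
    # Percentage of the post's words falling in each dictionary:
    # the feature vector fed to the classifiers.
    return [100 * sum(t in d for t in tokens) / len(tokens)
            for d in dictionaries]

def standardize(X):
    # Column-wise scaling to zero mean and unit variance
    # (constant columns are left unscaled).
    mus = [mean(col) for col in zip(*X)]
    sds = [pstdev(col) or 1.0 for col in zip(*X)]
    return [[(v - m) / s for v, m, s in zip(row, mus, sds)] for row in X]
```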

Table 11 Classification results (macro averaged F1-score) using dictionary features. Results on WH+TW are averaged on ten samples

4.3 In-depth analysis of dictionary-based classification results

To better understand the behavior of the classifiers, we analyze in detail the performance on the two classes (the sensitive and the non-sensitive one), in terms of F1-score and for each dataset, considering the best performing classifiers according to the macro-averaged F1-score (see Table 11). The results are reported in Table 12. As expected, the majority class (the non-sensitive one for every dataset except OMC) is the one for which the classifiers are the most accurate. However, from the classification point of view, WH+TW is the easiest dataset to analyze, as the two classes are better identified than in any other dataset, while on SENS2 and OMC the best classifiers achieve similar performances, only slightly better than those of the majority classifier for the most frequent class. For such datasets, dictionaries do not provide a reliable way to differentiate the two classes.

Table 12 Detailed classification results (F1-score) using dictionary features with the best classifier. Results on WH+TW are averaged on ten samples

Finally, we inspect the logistic regression classifier to identify the most relevant features for the sensitive class in each dataset. In Table 13 we report the top-20 relevant features together with the corresponding coefficients (the logarithms of the odds ratios). The results seem to confirm the conclusions reached with the previous experiments (feature names with capital initials are from the Privacy Dictionary [26, 27]). As a further analysis, we compute the Spearman's rank correlation coefficient (referred to as ρ in the following) among the feature coefficient vectors, in order to investigate the similarities among the different models. Not surprisingly, the two most similar logistic regression models are those computed on SENS2 and SENS3 (\(\rho =0.757\)). More interestingly, the model computed on WH+TW is more similar to the one computed on OMC (\(\rho =0.25118\)) than to those computed on SENS2 and SENS3 (\(\rho =0.1165\) and \(\rho =-0.0007\), respectively). This shows that the types of sensitivity captured by OMC and WH+TW have something in common, probably because the content of sensitive posts in both datasets is mostly related to family and intimate relationships. Finally, it is worth noting that the coefficients computed on OMC are more correlated with those computed on SENS3 (\(\rho =0.3277\)) than with those returned for SENS2 (\(\rho =0.1461\)). This can be explained by the fact that the annotators' agreement on SENS3 is the highest one: as a consequence, only highly sensitive posts (such as the ones tagged as sensitive in OMC, by construction) are marked as such. However, as already stated, we are interested in a more general concept of content sensitivity, which does not rely only on the most personal and intimate aspects of human life.
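For reference, Spearman's ρ between two coefficient vectors is the Pearson correlation of their ranks; a minimal sketch (ignoring tie handling, which does not arise for distinct coefficients) is:

```python
def spearman(x, y):
    # Rank each vector, then compute the Pearson correlation of the
    # ranks. Tie handling (average ranks) is omitted for brevity.
    def ranks(v):
        order = sorted(range(len(v)), key=lambda i: v[i])
        r = [0] * len(v)
        for rank, idx in enumerate(order):
            r[idx] = rank
        return r
    rx, ry = ranks(x), ranks(y)
    n = len(x)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = sum((a - mx) ** 2 for a in rx) ** 0.5
    sy = sum((b - my) ** 2 for b in ry) ** 0.5
    return cov / (sx * sy)
```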

Table 13 Top-20 relevant features and their coefficients computed by the logistic regression classifier for the sensitive class

5 Classifying posts according to their sensitivity

In this section, we provide the details of the experiments conducted within different classification scenarios, where the learning algorithms are applied directly to (embeddings of) the text data. Our goal is to measure the possible gain of recent state-of-the-art text classification techniques that treat text as sequences over the use of features extracted from dictionaries. In particular, we compare several convolutional and recurrent neural network architectures and a transformer-based neural network; in addition, we also consider some baselines consisting of standard classifiers applied to bag-of-words representations of the datasets, as in our previous work [19].

More in detail, we apply four different classifiers to each dataset: a one-dimensional Convolutional Neural Network (CNN), a Recurrent Neural Network (RNN) with gated recurrent unit (GRU) nodes, an RNN with long short-term memory (LSTM) nodes, and BERT [46], a pre-trained transformer-based network designed for learning language representation models. The CNN models have an embedding layer, followed by one or two one-dimensional convolutional layers (all with kernel size 8, Rectified Linear Unit as activation function, batch normalization and global average pooling), one or two dense layers, and one final dense layer of 2 nodes with softmax activation function. The exact number of nodes per layer of each model is reported in Table 14. The RNN models consist of one embedding layer, followed by one or two recurrent layers, one or two dense layers, and, finally, one dense layer of 2 nodes with softmax activation function. The number of layers and nodes of each model is reported in Table 15. The embedding layer projects each word of the input text into a word vector space: we use two different word embeddings pre-trained on Twitter data using GloVe [47].Footnote 4 Each recurrent layer is bidirectional and has a dropout rate of 0.5. BERT, instead, is trained on each dataset with a learning rate of \(5 \cdot 10^{-5}\) and early stopping on the accuracy of the validation set, with a patience of 5. Finally, the bag-of-words (BoW) models consist of standard classifiers trained on tf-idf features extracted from the text data after stemming and stopword removal. We use the same classifiers as in Sect. 4.2, i.e., a logistic regression (LR) classifier, a support vector machine (SVM) classifier with linear kernel, and a random forest (RF) classifier, all trained with default parameters. For all models, we use the same training, validation and test sets described in Sect. 4.
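The tf-idf weighting used by the BoW baselines can be sketched as follows. In practice the features come from scikit-learn; this stdlib version, with the hypothetical name `tfidf` and with stemming and stopword removal omitted, only illustrates the weighting scheme.

```python
import math
from collections import Counter

def tfidf(corpus):
    # Term frequency (normalized by document length) times inverse
    # document frequency, computed over a list of raw text documents.
    docs = [d.lower().split() for d in corpus]
    n = len(docs)
    df = Counter(w for d in docs for w in set(d))
    idf = {w: math.log(n / c) for w, c in df.items()}
    return [{w: (tf / len(d)) * idf[w] for w, tf in Counter(d).items()}
            for d in docs]
```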

Table 14 Detailed composition (number of neurons) of the Convolutional Neural Networks
Table 15 Detailed composition (number of neurons) of the Recurrent Neural Networks

In our experiments, we use the Python implementations of the algorithms provided by the Keras, scikit-learn,Footnote 5 and ktrain [48] libraries. All experiments are executed on a server with 32 Intel Xeon Skylake cores running at 2.1 GHz, 256 GB RAM, and one NVIDIA Tesla T4 GPU.

The results of the classification on the test sets are reported in Table 16. The results for WH+TW are averaged over the ten samples. We also compute the percentage gain of CNNs, RNNs and BERT w.r.t. the best bag-of-words classifier for each dataset. From these results, it emerges that the datasets that benefit most from recurrent neural networks and language models are SENS2, SENS3 and OMC (the gain is between 10.26% and 14.71%), while the maximum improvement for WH+TW is 8.80%. It is worth noting that the performance of BERT on OMC is similar to that achieved using the dictionary-based features (see Sect. 4.2) and significantly lower than that achieved by the same model on SENS2 and SENS3. One possible explanation for this phenomenon is that the posts in this dataset deal, by construction, with a limited number of very specific topics. We recall, in fact, that its posts have been extracted from a few targeted subreddits mentioning a few very specific terms (see Sect. 3.2). As a consequence, a language representation model like BERT does not help improve classification results to a great extent. SENS3, instead, also has the highest F1-score using BERT (0.89), but it is worth recalling that this dataset has less than half the posts of all other datasets. The high performance achieved by BERT on WH+TW can also be explained by the fact that sensitive and non-sensitive posts are derived from two different microblogging platforms. Although this point is out of the scope of our work, the choice of a particular social media platform (especially when it promotes anonymous contents) may have an impact on both the lexicon and the language style adopted by the users. Finally, CNNs are less effective than RNNs and BERT. In WH+TW, they perform similarly to or even worse than the bag-of-words models. More detailed classification results for BERT are given in Table 17.

Table 16 Classification results (macro-averaged F1-scores and percentage gain w.r.t the best bag-of-word classifier)
Table 17 Detailed classification results (F1-score) using BERT. Results on WH+TW are averaged on ten samples

To measure the generalization strength of the classification models, we conduct the following additional experiment. We train the classification models on the training set of each dataset, but instead of testing them on the respective test set, we use every other entire dataset as test set. Hence, for instance, every model learnt on the training set of SENS2 is tested on the entire SENS3, OMC and WH+TW datasets, and vice versa. To prevent any bias, when using SENS3 (resp. SENS2) as test set, instances that are also present in the training set of SENS2 (resp. SENS3) are removed. In Table 18 we report the macro-averaged F1-scores computed on the test sets reported in the columns using the training sets reported in the rows. We only show the results for the SVM trained on the bag-of-words representation and for BERT. Interestingly, when BERT is trained on SENS2, its performance is good when tested on SENS3 too. This is not that surprising, because SENS3 is a subset of SENS2 with less uncertainty on the class labels provided by the annotators (we recall that, in SENS3, the annotators' agreement is maximum). However, the most interesting results are those obtained by the classifiers trained on SENS2 and tested on WH+TW, and vice versa. In these cases, the training and test sets come from completely different sources, and BERT trained on WH+TW performs even worse than the bag-of-words model when tested on SENS2, while BERT trained on SENS2 achieves noticeably higher performance. It is worth noting that this difference in performance is the highest among all pairs of distinct datasets: in fact, the F1-score is 13% higher for BERT trained on SENS2 and tested on WH+TW than for the opposite configuration. The performance of BERT trained on OMC and tested on WH+TW is noticeably worse than that of the model trained on SENS2, although its performance on SENS2 and SENS3 is higher than that obtained by our datasets on the entire OMC dataset.
This could be a consequence of the better representation provided by the training set of OMC, in particular for the sensitive class. In fact, the F1-score for the sensitive class is 0.56 when the instances of SENS2 are predicted with BERT trained on the training set of OMC, while, for the opposite configuration, the F1-score is 0.39. For the pair of datasets composed of SENS3 and OMC, the same scores are, respectively, 0.57 and 0.36. It is worth noting that BERT trained on WH+TW achieves noticeably higher performance when tested on OMC than on SENS2 or SENS3. This confirms that the types of sensitivity captured by OMC and WH+TW are similar. For further analysis, we also conduct the same experiment with dictionary-based features (see Sect. 4.2), using the Random Forest classifier (DICT-RF in Table 18). The results show that the models trained on OMC and WH+TW do not perform well on our datasets (the F1-scores are between 0.19 and 0.33). Instead, the same models achieve better performances on their reciprocal test sets (macro-averaged F1-scores of 0.52 and 0.54), confirming that those datasets address similar problems (i.e., a more specific concept of self-disclosure than ours).
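The cross-corpus protocol above can be sketched as follows. This is an illustrative skeleton, not the authors' code: `fit_predict` stands for any classifier wrapper, and plain accuracy replaces the macro-averaged F1-score reported in the paper, for brevity.

```python
def cross_evaluate(train, corpora, fit_predict):
    # Train on one corpus' training split and test on every *other*
    # full corpus; posts also present in the training split are
    # removed first (as done for the SENS2/SENS3 pair).
    seen = {text for text, _ in train}
    scores = {}
    for name, data in corpora.items():
        held_out = [(t, y) for t, y in data if t not in seen]
        preds = fit_predict(train, [t for t, _ in held_out])
        gold = [y for _, y in held_out]
        scores[name] = sum(p == g for p, g in zip(preds, gold)) / len(gold)
    return scores
```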

Table 18 Cross-classification results (macro-averaged F1-scores). Classifiers are trained on the datasets reported in the row, and tested on the datasets reported in the columns

6 Discussion of the results

In this section, we discuss more in detail the results of the experiments described in Sects. 4 and 5 and draw some generalized conclusions.

In our paper, we have performed many different data analysis tasks with the aim of investigating whether state-of-the-art approaches to self-disclosure detection in text, and the related text corpora that have been made available to the public, are suited to identifying privacy-sensitive posts shared in general purpose social media. Our main target is the typical social media post, which, in principle, may deal with arbitrary topics and is communicated to different kinds of audiences, both in terms of extension (the number of profiles that can read the post) and type (close friends, acquaintances, general public). So far, the problem has been addressed by assuming that sensitive posts are published anonymously [15–17], or by considering a less general problem called self-disclosure [11]. In our experiments, not only have we shown the limitations of both approaches, but we have also pointed out the drawbacks of existing text corpora that might be used to train classification models capable of determining whether a given text is sensitive or not. Such corpora, in fact, are extracted from very specific sections of microblogging or forum platforms (e.g., dealing with family life or intimate relationships). As a result, they are not able to capture sensitive contents with wider topic coverage. Furthermore, we have created a new text corpus, consisting of around ten thousand Facebook posts, each annotated by three experts. In our corpus, sensitivity has a broader definition than self-disclosure, and we think that this better captures the actual privacy-sensitive content that can be found in general-purpose social media. Moreover, we do not make any anonymity assumption, in line with recent studies on the privacy paradox [12] and privacy fatigue [13] showing that many users tend to underestimate or simply overlook their privacy risk when posting on social media platforms.

All our experiments confirm that tackling the problem of content sensitivity by leveraging anonymity solves a less general problem than ours. By addressing sensitivity directly, we show that dictionary-based or bag-of-words approaches are not that effective. Sequential models such as Recurrent Neural Networks and language models, instead, lead to more accurate analyses and predictions and, more interestingly, introduce a significant performance gain on text annotated according to criteria that are not mediated by the lens of anonymity. Interestingly, OMC, a dataset that is specifically annotated according to self-disclosure [11], does not benefit from RNNs or BERT to such a great extent: the results of these deep learning algorithms are comparable with those obtained by Random Forests trained on lexical features. The generally mild performance of all types of classifiers on this dataset could be explained by the overrepresentation of the sensitive class (corresponding to posts containing some form of self-disclosure). Unfortunately, this is by design, also because the dataset has been published with a different objective (i.e., the study of affect in response to interactive content). More interestingly, the posts extracted according to the anonymity criterion (WH+TW) and those extracted following the classic definition of self-disclosure (OMC) share some common properties, as testified by the cross-classification results (Table 18) and the mild correlation of the relevant features for the logistic regression classifiers (Table 13). This is probably the result of the particular choice of sources for the posts composing the sensitive class of those corpora (a subreddit on family relationships for OMC and Whisper for WH+TW). Finally, our experiments have shown that, for our datasets, only RNNs and BERT provide a significant performance boost.
This phenomenon can be explained by the fact that, in general purpose social media, the context of a word or sentence (well captured by transformer-based language models) is better suited to modeling the sensitivity of a post than simple lexical features. It is worth noting that BERT achieves good performance on WH+TW too. However, in this case, its performance could be biased by the fact that sensitive and non-sensitive posts are extracted from two different social media platforms: consequently, the network may not be learning how to detect the sensitivity of a post, but rather its source. Although this point deserves further investigation, we leave it for future research work.

Although the results obtained and their analysis largely confirm our hypotheses, the extent of our work is in part limited by the fact that we have not controlled data acquisition, but have instead relied on a corpus of Facebook posts collected ten years ago for different research purposes (i.e., predicting some psychological traits of users according to their behavior on the well known social network). Currently, it is not possible to collect such data, as Facebook has been limiting the amount of information that can be obtained through its API since 2015. Nevertheless, it is the only available dataset composed of so-called profile status updates. Other available Facebook posts are crawled from public pages, so they could hardly fit our objectives. Moreover, although we think that our work could foster further research on related topics, its impact is mitigated by the rapid changes in the social media world. Currently, the most popular social platforms (e.g., Instagram, TikTok) are designed for sharing multimedia content such as pictures and short videos. Although many results on textual content presented in this paper (and in other similar research works) can be adapted or transferred to multimedia posts, new efforts should be undertaken to detect sensitive contents in pictures and videos accurately.

7 Conclusion

With the final goal of supporting privacy awareness and risk assessment, we have introduced a new way to address the problem of sensitivity analysis of user-generated content that does not rely on the so-called anonymity assumption. We have shown that the “lens of anonymity” can indeed distort the actual sensitivity of text posts. Consequently, differently from state-of-the-art approaches, we have measured sensitivity directly, and we have collected reliable sensitivity annotations for an existing corpus of around ten thousand social media posts. In our experiments, we have shown that our problem is more challenging than anonymity-driven ones, as lexical features are not sufficient for discriminating between sensitive and non-sensitive contents. Moreover, we have also investigated how the problem of self-disclosure is related to content sensitivity analysis, and shown that existing text corpora are not adequate for analyzing the sensitivity of posts shared on general purpose social media platforms. Instead, recent sequential deep neural network models may help achieve good accuracy. Our work could represent a new gold standard in content sensitivity analysis and could be used, for instance, in privacy risk assessment procedures involving user-generated content.Footnote 6

On the other hand, our analysis has also pointed out that predicting content sensitivity by simply classifying text cannot capture the manifold nature of privacy sensitivity with high accuracy. More complex and heterogeneous models should therefore be considered. An accurate content sensitivity analysis tool should probably consider lexical, semantic as well as grammatical features. Topics are certainly important, but sentence construction and lexical choices are also fundamental. Therefore, reliable solutions would consist of a combination of computational linguistic techniques, machine learning algorithms and semantic analysis. Finally, the success of picture and video sharing platforms (such as Instagram and TikTok) implies that any successful content sensitivity analysis tool should be able to cope with audiovisual contents and, in general, with multimodal/multimedia objects (an open problem in sentiment analysis as well [49]).

Availability of data and materials

The datasets used and analysed during the current study are available from the corresponding author on reasonable request.





  4. Available at

  5. Available at and

  6. The source code used in our experiments is available online at


  1. Papadamou K, Papasavva A, Zannettou S, Blackburn J, Kourtellis N, Leontiadis I et al. (2020) Disturbed YouTube for kids: characterizing and detecting inappropriate videos targeting Young children. In: Choudhury MD, Chunara R, Culotta A, Welles BF (eds) Proceedings of AAAI ICWSM 2020, held virtually, original venue, Atlanta, Georgia, USA, June 8-11, 2020. AAAI Press, Menlo Park, pp 522–533

  2. Anagnostou A, Mollas I, Tsoumakas G (2018) Hatebusters: a web application for actively reporting YouTube hate speech. In: Lang J (ed) Proceedings of IJCAI 2018, Stockholm, Sweden, July 13-19, 2018, pp 5796–5798

  3. Cheng L, Shu K, Wu S, Silva YN, Hall DL, Liu H (2020) Unsupervised cyberbullying detection via time-informed Gaussian mixture model. In: d’Aquin M, Dietze S, Hauff C, Curry E, Cudré-Mauroux P (eds) Proceedings of CIKM 2020, virtual event, Ireland, October 19–23, 2020. ACM, New York, pp 185–194

  4. Sánchez D, Batet M (2016) C-sanitized: A privacy model for document redaction and sanitization. J Assoc Inf Sci Technol 67(1):148–163.

  5. Iwendi C, Moqurrab SA, Anjum A, Khan S, Mohan S, Srivastava G (2020) N-sanitization: A semantic privacy-preserving framework for unstructured medical datasets. Comput Commun 161:160–171.

  6. Hassan F, Sanchez D, Domingo-Ferrer J (2021) Utility-preserving privacy protection of textual documents via word embeddings. In: IEEE transactions on knowledge and data engineering, pp 1–14

  7. Jourard SM (1971) Self-disclosure: an experimental analysis of the transparent self

  8. Barak A, Gluck-Ofri O (2007) Degree and reciprocity of self-disclosure in online forums. Cyberpsychol Behav Soc Netw 10(3):407–417

  9. Yang D, Yao Z, Kraut RE (2017) Self-disclosure and channel difference in online health support groups. In: Proceedings of the eleventh international conference on web and social media, ICWSM 2017, Montréal, Québec, Canada, May 15-18, 2017. AAAI Press, Menlo Park, pp 704–707

  10. Ma X, Hancock JT, Naaman M (2016) Anonymity, intimacy and self-disclosure in social media. In: Proceedings of the 2016 CHI conference on human factors in computing systems, San Jose, CA, USA, May 7-12, 2016. ACM, New York, pp 3857–3869.

  11. Jaidka K, Singh I, Liu J, Chhaya N, Ungar L (2020) A report of the CL-aff OffMyChest shared task: modeling supportiveness and disclosure. In: Proceedings of the 3rd workshop on affective content analysis (AffCon 2020) co-located with thirty-fourth AAAI conference on artificial intelligence (AAAI 2020), New York, USA, February 7, 2020. CEUR workshop proceedings, vol 2614., pp 118–129.

  12. Barth S, de Jong MDT (2017) The privacy paradox – investigating discrepancies between expressed privacy concerns and actual online behavior – A systematic literature review. Telemat Inform 34(7):1038–1058

  13. Choi H, Park J, Jung Y (2018) The role of privacy fatigue in online privacy behavior. Comput Hum Behav 81:42–51

  14. Pensa RG, di Blasi G, Bioglio L (2019) Network-aware privacy risk estimation in online social networks. Soc Netw Anal Min 9(1):15:1–15:15

  15. Peddinti ST, Korolova A, Bursztein E, Sampemane G (2014) Cloak and Swagger: understanding data sensitivity through the lens of user anonymity. In: Proceedings of IEEE SP 2014, pp 493–508

  16. Peddinti ST, Ross KW, Cappos J (2017) User anonymity on Twitter. IEEE Secur Priv 15(3):84–87

  17. Correa D, Silva LA, Mondal M, Benevenuto F, Gummadi KP (2015) The many shades of anonymity: characterizing anonymous social media content. In: Proceedings of ICWSM 2015, pp 71–80

  18. Yu J, Zhang B, Kuang Z, Lin D, Fan J (2017) iPrivacy: image privacy protection by identifying sensitive objects via deep multi-task learning. IEEE Trans Inf Forensics Secur 12(5):1005–1016

  19. Battaglia E, Bioglio L, Pensa RG (2020) Towards content sensitivity analysis. In: Berthold MR, Feelders A, Krempl G (eds) Proceedings of IDA 2020, Konstanz, Germany, April 27-29, 2020. Springer, Berlin, pp 67–79

  20. Oukemeni S, Rifà-Pous H, i Puig JMM (2019) Privacy analysis on microblogging online social networks: A survey. ACM Comput Surv 52(3):60:1–60:36

  21. Oukemeni S, Rifà-Pous H, i Puig JMM (2019) IPAM: information privacy assessment metric in microblogging online social networks. IEEE Access 7:114817–114836

  22. Wagner I, Eckhoff D (2018) Technical privacy metrics: A systematic survey. ACM Comput Surv 51(3):57:1–57:38

  23. Alemany J, del Val Noguera E, Alberola JM, García-Fornes A (2019) Metrics for privacy assessment when sharing information in online social networks. IEEE Access 7:143631–143645

  24. Liu K, Terzi E (2010) A framework for computing the privacy scores of users in online social networks. ACM Trans Knowl Discov Data 5(1):6:1–6:30

  25. Pensa RG, Blasi GD (2017) A privacy self-assessment framework for online social networks. Expert Syst Appl 86:18–31

  26. Gill AJ, Vasalou A, Papoutsi C, Joinson AN (2011) Privacy dictionary: a linguistic taxonomy of privacy for content analysis. In: Proceedings of ACM CHI 2011, pp 3227–3236

  27. Vasalou A, Gill AJ, Mazanderani F, Papoutsi C, Joinson AN (2011) Privacy dictionary: a new resource for the automated content analysis of privacy. J Am Soc Inf Sci Technol 62(11):2095–2105

  28. Mondal M, Correa D, Benevenuto F (2020) Anonymity effects: A large-scale dataset from an anonymous social media platform. In: Gadiraju U (ed) Proceedings of ACM HT 2020, virtual event, USA, July 13-15, 2020. ACM, New York, pp 69–74

  29. Biega JA, Gummadi KP, Mele I, Milchevski D, Tryfonopoulos C, Weikum G (2016) R-susceptibility: an IR-centric approach to assessing privacy risks for users in online communities. In: Proceedings of ACM SIGIR 2016, pp 365–374

  30. Jiang W, Murugesan M, Clifton C, Si L (2009) t-plausibility: semantic preserving text sanitization. In: Proceedings of the 12th IEEE international conference on computational science and engineering, CSE 2009. Vancouver, BC, Canada, August 29-31, 2009, IEEE Comput. Soc., Los Alamitos, pp 68–75.

  31. Sánchez D, Batet M, Viejo A (2013) Automatic general-purpose sanitization of textual documents. IEEE Trans Inf Forensics Secur 8(6):853–862.

  32. Sánchez D, Batet M, Viejo A (2014) Utility-preserving sanitization of semantically correlated terms in textual documents. Inf Sci 279:77–93.

  33. Altman I, Taylor DA (1973) Social penetration: the development of interpersonal relationships. Holt, Rinehart & Winston, New York

  34. Taylor DA (1968) The development of interpersonal relationships: social penetration processes. J Soc Psychol 75(1):79–90

  35. McKenna KYA, Bargh JA (2000) Plan 9 from cyberspace: the implications of the Internet for personality and social psychology. Personal Soc Psychol Rev 4(1):57–75

  36. Zlatolas LN, Welzer T, Hericko M, Hölbl M (2015) Privacy antecedents for SNS self-disclosure: the case of Facebook. Comput Hum Behav 45:158–167

  37. Umar P, Squicciarini AC, Rajtmajer SM (2019) Detection and analysis of self-disclosure in online news commentaries. In: Liu L, White RW, Mantrach A, Silvestri F, McAuley JJ, Baeza-Yates R et al. (eds) The world wide web conference, WWW 2019, San Francisco, CA, USA, May 13-17, 2019. ACM, New York, pp 3272–3278

  38. Jaidka K, Guntuku SC, Ungar LH (2018) Facebook versus Twitter: differences in self-disclosure and trait prediction. In: Proceedings of ICWSM 2018. AAAI Press, Menlo Park, pp 141–150

  39. Seabrook EM, Kern ML, Fulcher BD, Rickard NS (2018) Predicting depression from language-based emotion dynamics: longitudinal analysis of Facebook and Twitter status updates. J Med Internet Res 20(5):e168

  40. Celli F, Pianesi F, Stillwell D, Kosinski M (2013) Workshop on computational personality recognition: shared task. In: Proceedings of ICWSM 2013

  41. Kosinski M, Stillwell D, Graepel T (2013) Private traits and attributes are predictable from digital records of human behavior. Proc Natl Acad Sci USA 110(15):5802–5805

  42. Fleiss JL (1971) Measuring nominal scale agreement among many raters. Psychol Bull 76(5):378–382

  43. Cheng Z, Caverlee J, Lee K (2010) You are where you tweet: a content-based approach to geo-locating Twitter users. In: Huang J, Koudas N, Jones GJF, Wu X, Collins-Thompson K, An A (eds) Proceedings of ACM CIKM 2010, Toronto, Ontario, Canada, October 26-30, 2010. ACM, New York, pp 759–768

  44. Joulin A, Grave E, Bojanowski P, Mikolov T (2017) Bag of tricks for efficient text classification. In: Lapata M, Blunsom P, Koller A (eds) Proceedings of EACL 2017, Valencia, Spain, April 3–7, 2017. Short papers. Association for computational linguistics, vol 2, pp 427–431

  45. Tausczik YR, Pennebaker JW (2010) The psychological meaning of words: LIWC and computerized text analysis methods. J Lang Soc Psychol 29(1):24–54

  46. Devlin J, Chang M, Lee K, Toutanova K (2019) BERT: pre-training of deep bidirectional transformers for language understanding. In: Burstein J, Doran C, Solorio T (eds) Proceedings of NAACL-HLT 2019, Minneapolis, MN, USA, June 2-7, 2019. Association for Computational Linguistics, pp 4171–4186

  47. Pennington J, Socher R, Manning CD (2014) GloVe: global vectors for word representation. In: Moschitti A, Pang B, Daelemans W (eds) Proceedings of EMNLP 2014. ACL, pp 1532–1543

  48. Maiya AS (2020) ktrain: a low-code library for augmented machine learning. CoRR, 2020

  49. Poria S, Majumder N, Hazarika D, Cambria E, Gelbukh AF, Hussain A (2018) Multimodal sentiment analysis: addressing key issues and setting up the baselines. IEEE Intell Syst 33(6):17–25


Acknowledgements

The authors thank all the persons who have helped us in annotating the corpus.


Funding

This work was supported by Fondazione CRT (grant number 2019-0450).

Author information

Authors and Affiliations



Contributions

Conceptualization: RGP; Data curation: LB, RGP; Analysis: LB, RGP; Writing: LB, RGP. All authors read and approved the final manuscript.

Corresponding author

Correspondence to Ruggero G. Pensa.

Ethics declarations

Competing interests

The authors declare that they have no competing interests.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.

About this article

Cite this article

Bioglio, L., Pensa, R.G. Analysis and classification of privacy-sensitive content in social media posts. EPJ Data Sci. 11, 12 (2022).
