Discovering temporal regularities in retail customers’ shopping behavior
 Riccardo Guidotti^{1},
 Lorenzo Gabrielli^{1},
 Anna Monreale^{2},
 Dino Pedreschi^{2} and
 Fosca Giannotti^{1}
Received: 26 July 2017
Accepted: 26 February 2018
Published: 6 March 2018
Abstract
In this paper we investigate the regularities characterizing the temporal purchasing behavior of the customers of a retail market chain. Most of the literature studying purchasing behavior focuses on what customers buy while giving little importance to the temporal dimension. As a consequence, the state of the art does not allow capturing the temporal purchasing patterns of each customer. These patterns should describe the customer’s temporal habits, highlighting when she typically makes a purchase in correlation with information about the amount of expenditure, number of purchased items and other similar aggregates. This knowledge could be exploited for different purposes: setting temporal discounts to make customers’ purchases more regular with respect to time, setting personalized discounts in the day and time window preferred by the customer, providing recommendations for shopping time schedules, etc. To this aim, we introduce a framework for extracting from personal retail data a temporal purchasing profile able to summarize whether and when a customer makes her distinctive purchases. The individual profile describes a set of regular and characterizing shopping behavioral patterns, and the sequences in which these patterns take place. We show how to compare different customers by providing a collective perspective to their individual profiles, and how to group the customers with respect to these comparable profiles. By analyzing real datasets containing millions of shopping sessions we found that there is a limited number of patterns summarizing the temporal purchasing behavior of all the customers, and that they are sequentially followed in a finite number of ways. Moreover, we recognized regular customers characterized by a small number of temporal purchasing behaviors, and changing customers characterized by various types of temporal purchasing behaviors. 
Finally, we discuss how the profiles can be exploited both by customers to enable personalized services, and by the retail market chain to provide tailored discounts based on temporal purchasing regularity.
1 Introduction
The availability of huge amounts of retail data stimulates challenging questions that can be answered only by a deep and accurate analysis of different aspects related to customers’ shopping sessions. Retail data is a complex type of data containing various dimensions: what customers buy, when and where they make the purchases, and the relevance of each purchase in terms of money spent or number of items purchased. The choice of analyzing one set of dimensions rather than another depends on the kind of phenomena to be investigated. Considering all the dimensions can lead to very complex models or to weak generalizations. The most important dimension for understanding how customers schedule their shopping time is the temporal one. Most of the works in the literature focus on what customers buy [1, 2]. Just a few of them have also exploited the temporal dimension as a feature for enriching their models, which are based primarily on the items purchased [3–6]. However, very little attention has been given to the temporal dimension of shopping sessions considered on its own in order to extract a customer model that helps in understanding temporal purchasing habits. As a consequence, the methods proposed in the literature cannot capture the temporal purchasing patterns of each customer, which correlate her temporal habits with other information such as the amount of expenditure and the number of purchased items. This knowledge about the customers is important because it enables different marketing strategies tailored to the temporal and systematic behavior of each customer, as well as innovative services for the customer based on recommendations for shopping time schedules and on increasing her awareness. To the best of our knowledge, there is no previous work focusing only on the temporal dimension and using it as the main building block to construct an individual temporal purchasing profile. 
We do not claim that ignoring the items purchased and/or the shopping location may lead to an advantage, but we show that observing only the temporal dimension is crucial to completely understand the different times and expenses adopted by the customers when they go shopping.
In this paper, we are interested in understanding whether and when a customer typically makes retail purchases. Which of these temporal aspects of the shopping behavior are more systematic? Which are the regular sequences of the temporal patterns? To this aim, we propose a temporal purchasing profile able to describe the regular and characteristic temporal behaviors of an individual customer. Indeed, the individual person is the key element that lies in between a single purchase and a whole customer population. Each individual has her own regularities and habits outlining her behavior and making her a unique part of the mass. The analysis of individuals provides the basis for understanding the common regular patterns in the purchasing behavior also at collective level. Thus, our aim is to define individual and collective temporal profiles which can be employed for the analysis of the temporal dimension of the customers’ shopping sessions. In particular, our models enable a customer segmentation which considers the temporal components of purchases and permit explorative analyses of individuals from a new point of view. We conduct different case studies using the defined methodological framework with the aim of discovering customers’ temporal purchasing patterns and grouping the customers’ profiles to identify sets of customers with similar temporal behavior.
The main contributions of this work are the following: (i) we define the temporal purchasing profile as the set of temporal footprints and the sequence of footprints summarizing whether and when a customer typically purchases, and we provide the method for extracting these profiles; (ii) we define the collective perspective for making the individual, non-comparable profiles comparable, so that the shopping routines shared by different customers can be analyzed; (iii) we show the application of the whole analytic framework on a set of case studies on real datasets, one of them containing 7 years of retail data for 91k customers; (iv) we observe how the individual profiles and the collective perspective allow us to separate the customers into well-defined groups.
Our methodological framework empowers the discovery of various customer segmentations with respect to different temporal aspects. In particular, our study reveals three main typical collective behaviors characterizing the whole collection of customers on the basis of when they go to the shopping center: a daily spending behavior, capturing purchases made every day; a one-shop spending behavior, characterizing a regularity in which a week contains one predominant shopping session; and an occasional spending behavior, describing non-habitual shopping sessions related to a very small expenditure amount. Among the one-shop spending behaviors the analysis captures a further classification with respect to the expenditure amount: normal spending behavior with an expenditure of less than €50, high spending behavior with a typical expenditure between €50 and €100, and higher spending behavior with an expenditure higher than €100. By analyzing the number of different purchasing behaviors at individual and collective level we identify two categories of customers that we name regular and changing: a customer represented by a high number of temporal purchasing behavioral patterns is classified as changing, while a customer with a small number of temporal purchasing behaviors is classified as regular. Finally, we found various and diversified regular sequences explaining how the customers typically combine and follow their shopping behaviors from the temporal point of view.
The rest of the paper is organized as follows. In Sect. 2 we discuss the related literature. The individual profile is defined in Sect. 3, while in Sect. 4 we describe the approach to provide the collective perspective. In Sect. 5 we illustrate the case studies showing the effectiveness for data analytics of the proposed models and methods. Section 6 discusses possible applications of the proposed approach. Finally, Sect. 7 reports conclusions and future research directions.
2 Related work
While most of the works in the literature analyze the customer behavior by considering only aspects related to the items bought and without a specific focus on the temporal dimension, the analysis of the customer purchasing behavior presented in this paper uses the temporal dimension as a fundamental building block. In the literature there are various works analyzing and trying to predict the customer behavior. To this aim, they generally take into consideration aspects related to the items bought and not the time: which items the customers frequently buy, which items they pay attention to within a store, the changes of customer behavior on the basis of what they buy, etc. To the best of our knowledge no paper analyzes only the temporal dimension of customers’ shopping sessions to understand whether and when customers typically make characterizing and regular purchases.
Data mining is the classic approach used to analyze purchasing behaviors [1, 2]. However, it is generally hard to create a comprehensive model of overall customer behavior since each individual acts according to a personal utility function depending on various factors, which can be described by different types of user data. Therefore, multiplex approaches are used to understand the customer behavior, integrating multiple data sources and multiple types of data [7] to reach combined prediction results. In [8] the authors propose to represent the customer purchasing behavior using a directed graph retaining temporal information in a purchase sequence. In particular, given a target product, they build the graph for that product attaching as nodes the purchased products in subsequent shopping sessions, putting on the edges the days passed between two consecutive purchases and concatenating products belonging to the same category (e.g. two different types of milk). Then, they apply a graph mining technique on such networks to extract and analyze the occurring frequent patterns. In [9] it is shown how signals of RFID (Radio-Frequency IDentification) can be exploited to detect and record how customers browse stores, which items of clothes they pay attention to, and which items of clothes they usually match. In [10] the authors propose a mixture of non-homogeneous Poisson processes to discover the latent customer groups and conduct a soft-membership customer segmentation based on the dynamically observed purchase behavior. Yet, they exploit the temporal dimension to explain the purchasing patterns of the products at a global scale and their outcomes do not explain the customers’ behavior with respect to time. The temporal dimension of purchasing habits is exploited in [11] to understand how predictable consumers are in their merchant visitation patterns by using a Markov model for predicting the customer’s next shop location. 
Note that in these works the customer behavior is generalized at global level, while our model is personal and describes the customer habits and preferences in a concise way.
Other recent studies analyze purchasing data to understand changes in the customer behavior [12, 13] and whether a customer will switch from one brand to another [14]. In particular, in [12] the authors integrate variables such as recency, frequency, and monetary/demographic variables to establish a method for mining changes in shopping behavior. In [14] a method is developed for extracting useful knowledge from the individual purchase history of customers by combining information fusion techniques with data mining to predict whether a customer switches from one brand to another, or becomes loyal to a brand, and when a customer is likely to defect to a competitor. The models and methodology we propose not only help in understanding the changes in customers’ temporal shopping behaviors and their cyclic succession, but also unveil the regularities of these changes, if detected.
The temporal component of customer purchases is analyzed in [3–5]. In [4] the role of personal characteristics in time spent shopping is examined. In particular, the authors analyze the roles of time perceptions, brand and store loyalty, social, physical well-being, and demographic variables in predicting reported shopping time, including the hours spent at search and purchase. In [5] changes in cluster characteristics of supermarket customers over a 24-week period are studied by performing a temporal analysis that tries to detect the migrations of customers from one group to another. The temporal analysis presented is based on conventional and modified self-organizing maps. In [3, 6] the authors use a sophisticated version of entropy to study the customers’ behaviors in retail data both from the basket and the spatiotemporal point of view. In particular, they define a procedure to group similar baskets by exploiting a frequent pattern mining approach and creating in this way classes of probability for certain sets of items frequently purchased together. Their discovery is that predictable customers are also the more profitable ones. Although these papers consider the temporal dimension, we highlight that in our work time is the focus of the models and is not simply used as a feature for a temporal analysis. Also in [15], an analysis of the sequences of purchases exploiting Zipf-like distributions leads to the detection of five consumer groups. Customers in each group turned out to be similar also with respect to their age, gender, total expenditure, etc. Moreover, there are works aimed at understanding the behavior of customers in online shopping [16–19].
Finally, another set of works adopting the temporal dimension in shopping sessions is related to the task of recommending the items for the next basket. To solve this problem various methods have been adopted in the literature: collaborative filtering [20], Markov chains [21], supervised classification algorithms [22], deep neural networks [23], and temporal frequent pattern mining algorithms [24]. However, all these methods only exploit the temporal dimension and do not provide a way to understand how time affects the customers’ decisions and what the typical temporal shopping patterns are.
Our definition of temporal purchasing footprint of a customer is similar to the definition of user profile introduced in [25]. In [25] the authors extract from the call detail records of each user a profile summarizing the calls performed by the user. Their aim is to estimate the proportion of city users that can be classified as residents, commuters or visitors. We underline that the “profile” defined in [25] is just an aggregation of (a count of) the number of calls performed by the user along the various months separated between weekend and weekdays in a specific geographical area. It does not take into account any notion of behavior or regularity, it is a sort of “status” of the user. On the other hand, the profile defined in our work models a set of temporal purchasing patterns highlighting different ways of acting for the customers. It is able to explain the different behaviors adopted by the individual customers and does not report just their status.
3 Individual temporal purchasing profile
Definition 1
(Temporal purchasing unit)
Given a period τ of d̄ days, a temporal purchasing unit U of a customer c is a matrix \(U \in\mathbb{R}^{t \times d}\), where d is the number of day-intervals in τ with \(d \leq \bar{d}\), t is the number of time windows considered for each day-interval, and \(U_{ij}\) estimates the relevance of the purchases in the i-th time window of the j-th day-interval.
Given customer c, her sequence of temporally ordered shopping sessions \(S = \langle s_{1},\ldots, s_{n} \rangle\), the time window granularity t, the day-interval granularity d, and the width of the time period τ, then S can be segmented into an ordered sequence of units \(\widehat{S} = \langle U^{(1)},\ldots, U^{(m)} \rangle\) with \(m \leq n\), where each temporal purchasing unit \(U^{(i)}\) aggregates a set of shopping sessions with respect to t and d.
Our goal is to summarize for each customer the knowledge contained in Ŝ in a temporal purchasing profile describing the customer’s typical temporal behaviors. In order to define the profile we need to extract the “distinctive” temporal purchasing behaviors of customer c, i.e. her purchasing footprints.
Given the sequence of units Ŝ of customer c, we can detect groups of units which are similar with respect to a distance function δ based on the concept of temporal alignment and with respect to the relevant values considered. Thus, given a group G of similar units we define a temporal purchasing footprint F (footprint in short) as the representative of the group G. Each footprint F captures a temporal behavior characterizing the customer. We define the temporal purchasing footprint as follows:
Definition 2
(Temporal purchasing footprint)
We refer to the set of footprints of a customer with \(\mathcal {F} = \{ F^{(1)},\ldots, F^{(k)}\}\). Note that, for the extraction of the footprints, we are not considering the order of the units in Ŝ. Figure 1(c) shows an example of footprint using the days of the week as granularity for the day-interval and morning-afternoon-evening as time windows: the darker the cell, the higher the relevance, which corresponds to the amount in this case.
Given the groups \(\mathcal{G}\) and the footprints \(\mathcal{F}\) of a customer, we can replace each temporal purchasing unit in Ŝ with the footprint representing the group to which it belongs. We name the new sequence temporal purchasing footprint sequence (footprint sequence in short, see Fig. 1(d)). For example, given \(\mathcal{F} = \{ F^{(1)}, F^{(2)} \}\), \(\mathcal{G} = \{G_{1}, G_{2}\}\) where \(G_{1} = \{ U^{(1)}, U^{(4)} \} \), \(G_{2} = \{ U^{(2)}, U^{(3)} \}\), if \(\widehat{S} = \langle U^{(1)}, U^{(2)}, U^{(3)}, U^{(4)} \rangle\), then the corresponding footprint sequence is \(\widehat{F} = \langle F^{(1)}, F^{(2)}, F^{(2)}, F^{(1)} \rangle\). We define the temporal purchasing footprint sequence as:
Definition 3
(Temporal purchasing footprint sequence)
Given a customer c, her sequence of units Ŝ, her groups \(\mathcal{G}\) and her footprints \(\mathcal{F}\), we define the temporal purchasing footprint sequence as the sequence F̂ obtained replacing in Ŝ the units with the corresponding footprints in \(\mathcal{F}\) according to \(\mathcal{G}\).
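The replacement step of Definition 3 can be illustrated with a minimal Python sketch. The function name `footprint_sequence` and the representation of groups as a dictionary from footprint label to unit identifiers are assumptions made only for this illustration; they are not part of the original framework.

```python
def footprint_sequence(unit_ids, groups):
    """Replace each unit of an ordered sequence with its footprint label.

    unit_ids: ordered list of unit identifiers (the sequence of units).
    groups:   dict mapping a footprint label to the set of unit identifiers
              in the group it represents (hypothetical representation).
    """
    # Invert the grouping: unit identifier -> footprint label.
    unit_to_footprint = {
        unit: label for label, members in groups.items() for unit in members
    }
    return [unit_to_footprint[unit] for unit in unit_ids]


# Groups G1 = {U1, U4} and G2 = {U2, U3}, represented by F1 and F2.
groups = {"F1": {"U1", "U4"}, "F2": {"U2", "U3"}}
sequence = footprint_sequence(["U1", "U2", "U3", "U4"], groups)
print(sequence)
```

With these groups, the units U1 and U4 are mapped to F1 and the units U2 and U3 to F2, while the temporal order of the original sequence is preserved.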
Finally, we define the temporal purchasing profile (profile, see Fig. 1(e)) of a customer as:
Definition 4
(Temporal purchasing profile)
Given a customer c, her sequence of units Ŝ, and a distance function δ, we define the temporal purchasing profile of c as \(\mathcal{P}_{c} = \{ \mathcal{F}, \widehat{F} \}\) where \(\mathcal{F}\) is the set of footprints derivable from \(\mathcal {G}\) detected on Ŝ using δ, while F̂ is the corresponding footprint sequence.
4 Collective perspective of individual profiles
Individual profiles of different customers are not comparable because each customer can have a different number of footprints expressing different behaviors. Thus, in order to compare individual profiles of different customers we need to provide them with a collective perspective. Given the profiles \(\mathcal{P}_{b}\) and \(\mathcal{P}_{c}\) of customers b and c, this means making the footprints \(\mathcal{F}_{b}\) and \(\mathcal{F}_{c}\), and the footprint sequences \(\widehat{F}_{b}\) and \(\widehat{F}_{c}\), comparable. We start by specifying how to compare the footprints by defining the collective temporal purchasing footprint (collective footprint in short) as:
Definition 5
(Collective temporal purchasing footprint)
We underline that we use the expression customer collective footprints to indicate the collective perspective of a single customer \(\mathcal{C}_{c}\) and the expression collective footprints of all customers to indicate \(\mathcal{C}\).
Given the collective groups of collective footprints \(\mathcal{L}\), and the collective footprints of all the customers \(\mathcal{C}\), we can replace each individual footprint in the footprint sequences \(\{ \widehat {F}_{c} \}\) with the collective footprint representing the collective group to which it belongs. Therefore, for each customer c her footprint sequence \(\widehat {F}_{c}\) is mapped to an equivalent collective temporal purchasing sequence \(\widehat{C}_{c}\) (collective sequence in short). Similarly to the collective footprints, the collective sequences of all the customers are comparable with each other.
Definition 6
(Collective temporal purchasing sequence)
Given a customer c, her footprint sequence \(\widehat{F}_{c}\), the collective groups \(\mathcal{L}\) and the collective footprints of all customers \(\mathcal{C}\), we define the collective temporal purchasing sequence as the sequence \(\widehat{C}_{c}\) obtained replacing each footprint in \(\widehat{F}_{c}\) with the corresponding collective footprint in \(\mathcal{C}\) according to the groups \(\mathcal{L}\).
In addition, in order to better study the customers’ habits and to determine which are the most common subsequences, we define the regular temporal purchasing subsequences (regular subsequences in short) as:
Definition 7
(Regular temporal purchasing subsequences)
Given a customer c, her collective sequence \(\widehat{C}_{c}\) and a support threshold ω, we define the regular temporal purchasing subsequences as the set \(\mathcal{R}_{c} = \{ (R_{1}, w_{1}),\ldots, (R_{m}, w_{m})\}\), where each \(R_{i}\) is a subsequence of \(\widehat{C}_{c}\), \(w_{i}\) is its support and \(\forall i,\ w_{i} \geq\omega\).
In other words, among all the possible subsequences of \(\widehat {C}_{c}\), \(\mathcal{R}_{c}\) contains only the most representative ones for c. For example, if the subsequences of \(\widehat{C}_{c}\) are \((\langle C^{(1)}, C^{(1)} \rangle, 10)\), \((\langle C^{(1)}, C^{(1)}, C^{(2)} \rangle, 8)\), \((\langle C^{(1)}, C^{(2)} \rangle, 2)\), \((\langle C^{(2)}, C^{(1)} \rangle, 2)\), \((\langle C^{(2)}, C^{(2)} \rangle, 1)\), where the number is the support, i.e., the number of occurrences of that subsequence, then with \(\omega= 5\) only the first two are regular and contained in \(\mathcal{R}_{c}\). Given two customers b and c, with \(\mathcal{R}_{b}\) and \(\mathcal {R}_{c}\) derivable from \(\widehat{C}_{b}\) and \(\widehat{C}_{c}\), we can compare \(\mathcal{R}_{b}\) and \(\mathcal{R}_{c}\) with an appropriate distance function, e.g. Jaccard or cosine distance.
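As an illustration of Definition 7 and of the comparison between two customers, the following Python sketch counts subsequences by brute force and compares two sets of regular subsequences with the Jaccard distance. It rests on assumptions that are not stated in the text: subsequences are taken to be contiguous, of length at least two, with overlapping occurrences counted separately, and the Jaccard distance is computed on the subsequences only, ignoring their supports. The function names are hypothetical.

```python
from collections import Counter


def regular_subsequences(sequence, omega, min_len=2):
    """Return {subsequence: support} for subsequences with support >= omega.

    Assumption: subsequences are contiguous runs of labels with at least
    min_len elements, and overlapping occurrences are all counted.
    """
    counts = Counter(
        tuple(sequence[i:i + n])
        for n in range(min_len, len(sequence) + 1)
        for i in range(len(sequence) - n + 1)
    )
    return {sub: w for sub, w in counts.items() if w >= omega}


def jaccard_distance(r_b, r_c):
    """Jaccard distance between two sets of regular subsequences."""
    a, b = set(r_b), set(r_c)
    if not a and not b:
        return 0.0  # convention chosen here for two empty sets
    return 1.0 - len(a & b) / len(a | b)


# Collective sequences of two hypothetical customers (labels are illustrative).
r_b = regular_subsequences(list("AABAABAAB"), omega=3)
r_c = regular_subsequences(list("AAAAB"), omega=3)
print(r_b, r_c, jaccard_distance(r_b, r_c))
```

The brute-force enumeration is quadratic in the sequence length and is meant only to make the definition concrete, not as an efficient implementation.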
Finally, we define the collective perspective (see Fig. 3(b)) of a profile as follows:
Definition 8
(Collective perspective)
Given the profile \(\mathcal{P}_{c} = \langle\mathcal{F}, \widehat{F} \rangle\) of customer c, and the collective footprints of all customers \(\mathcal{C}\), the collective perspective of \(\mathcal {P}_{c}\) is defined as \(\mathcal{P}^{*}_{c} = \{ \mathcal{C}_{c}, \mathcal{R}_{c} \}\) where \(\mathcal{C}_{c} \subseteq\mathcal{C}\) are the customer collective footprints, and \(\mathcal{R}_{c}\) is the set of regular subsequences.
We implement the extraction of the regular subsequences using a suffix tree [26]. Given a customer c, her collective sequence \(\widehat{C}_{c}\) is transformed into a string where each character corresponds to the label of a customer collective footprint. Hence, we generate a suffix tree for each customer. Following a branch of the tree from the root to a leaf we can read a subsequence \(R_{i}\) and, on the leaf, we have the support \(w_{i}\) of the subsequence generating that branch. We set the support threshold ω in a data-driven way by observing the support distribution among the subsequences. In particular, we apply a technique known as the “knee method” [27], setting ω to the value of the knee. Given a set of pairs composed of items and their support, this method sorts the pairs according to their frequencies and returns the most representative ones, i.e., the pairs with a support greater than or equal to the support ω corresponding to the knee in the curve of the ordered frequencies. In this way ω is different for each customer and driven by personal data. For each customer, we cut the suffix tree considering only the regular subsequences, i.e., the subsequences \(R_{i}\) with support \(w_{i} \geq\omega\). The complexity of Algorithm 2 is dominated by the maximum between the complexity of the clustering algorithm (detectGroups) and the complexity of the construction of the suffix tree (extractRegularSubsequences) [26].
5 Case studies
In this section, we apply the proposed framework for temporal purchase analysis on real-world datasets. We show the individual temporal purchasing footprints, the effect of the collective perspective, and we analyze the most common regular subsequences for the customers segmented in similar groups. We underline that the proposed framework, as well as the other analytical approaches described in Sect. 2, is designed to extract knowledge from the data. None of them addresses a task whose outcome can be quantified (e.g. a prediction or a classification). As a consequence, the proposed framework is not eligible for comparison against these other methodologies. However, it is possible to instantiate the same framework on various datasets characterized by different properties.
In line with [24], we present a main case study on a large private real-world dataset of shopping sessions. Then, we show that the same framework can be easily exploited for the analysis of other publicly available datasets. In the main case study we show the overall potential of the proposed framework, while on the other datasets we highlight the modeling of the framework together with the principal findings and similarities with the main case study.
We underline that the proposed methodological framework for the analysis of the temporal dimension of shopping has the goal of providing profiles and behaviors for analyzing this kind of information. As a consequence, similarly to other approaches described in Sect. 2, we do not compare our methodology against other methods, but we show that the framework can be instantiated to analyze other similar datasets.
The rest of this section is organized as follows. In Sect. 5.1 we illustrate all the datasets analyzed. Section 5.2 details the model setting for the principal case study, and in Sects. 5.3, 5.4 and 5.5 we report the results for the individual footprints, collective footprints and sequences, respectively. By using additional information, in Sect. 5.6 we show that the various clusters, obtained using only the temporal dimension, capture diversified qualitative aspects of the customers (e.g., age and profession). Section 5.7 shows that the methodological framework can be instantiated for other case studies and that results similar to those of the principal case study are found. Finally, in Sect. 5.8 we demonstrate the clustering stability with respect to different choices of the number of collective footprints.
5.1 Real datasets
In the literature there are very few similar publicly available transactional datasets providing detailed temporal information about shopping sessions. Examples are the TaFeng^{2} and TMall^{3} datasets. The TaFeng dataset covers products like food, office supplies and furniture. It contains 817,741 transactions registered in four months (from 2000-11-01 to 2001-02-27) and belonging to 32,266 customers. The TMall dataset records four months (from 2014-04-15 to 2014-08-14) of transactions of an online e-commerce website. It is conceptually different from the previous datasets because a transaction does not model a purchase but the fact that a set of items have been observed in the same day. It contains 4298 transactions belonging to 884 users and 9531 brands considered as items. Even though these datasets refer to a time period remarkably shorter than the period observed in the UniCoop dataset, in order to show how our methodological framework can be easily instantiated in other case studies, we report in Sect. 5.7 some crucial analytical results on the TaFeng and TMall datasets.
5.2 Model setting
As humans we operate under the cadence of a seven-day week [28]. This cycle of activity is deeply rooted in human experience and in our psychological habits. Indeed, the alternation of weekdays drives our daily routines. Together with the previous observations, these are the reasons why we decided to adopt the week as time unit and to set the period \({\tau=7}\) and the number of day-intervals \({d=7}\). With respect to the time windows, by observing Fig. 4(upper right) we notice an M-shaped pattern: most of the shopping sessions happen in the morning or after working hours. As a consequence, in order to capture all the phases of growth and decrease of this curve, we summarize the trend using data-driven time windows, setting \(t=5\) and dividing the day as follows: 7–9, 10–12, 13–15, 16–18, 19–21, as highlighted in Fig. 4(lower right). Note that, without this time discretization and with a finer granularity (e.g. one time slot per hour), the discovered profiles would not be sufficiently easy to read, because we might have duplicated profiles for customers shopping in the same time slots. On the other hand, by adopting a coarser granularity (e.g. two time slots for morning and afternoon shopping sessions) we might miss some crucial differences in the temporal shopping behavior which are highlighted by the findings in the following sections. As relevance function rel we used the sum of the amount spent.
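The model setting above can be made concrete with a minimal Python sketch that segments a customer’s shopping sessions into weekly 5 × 7 units, using the sum of the amount spent as relevance. Several details are assumptions made for this illustration only: sessions are represented as (timestamp, amount) pairs, weeks follow the ISO calendar (Monday to Sunday), the hour ranges are treated as inclusive, sessions falling outside the five time windows are ignored, and weeks without sessions produce no unit. The names `TIME_WINDOWS`, `window_index` and `build_units` are hypothetical.

```python
from collections import defaultdict
from datetime import datetime

# Inclusive hour ranges of the five time windows: 7-9, 10-12, 13-15, 16-18, 19-21.
TIME_WINDOWS = [(7, 9), (10, 12), (13, 15), (16, 18), (19, 21)]


def window_index(hour):
    """Return the row index of the time window containing hour, or None."""
    for i, (start, end) in enumerate(TIME_WINDOWS):
        if start <= hour <= end:
            return i
    return None


def build_units(sessions):
    """Aggregate (timestamp, amount) sessions into weekly 5x7 units.

    Returns a time-ordered list of ((iso_year, iso_week), matrix) pairs,
    where matrix[i][j] is the total amount spent in time window i of
    weekday j (0 = Monday).
    """
    units = defaultdict(lambda: [[0.0] * 7 for _ in TIME_WINDOWS])
    for ts, amount in sessions:
        i = window_index(ts.hour)
        if i is None:
            continue  # assumption: sessions outside the windows are ignored
        iso = ts.isocalendar()
        units[(iso[0], iso[1])][i][ts.weekday()] += amount
    return sorted(units.items())


sessions = [
    (datetime(2018, 3, 5, 8, 30), 20.0),   # Monday morning
    (datetime(2018, 3, 5, 8, 45), 5.0),    # same window, amounts are summed
    (datetime(2018, 3, 10, 17, 0), 60.0),  # Saturday, window 16-18
    (datetime(2018, 3, 12, 11, 0), 12.5),  # Monday of the following week
    (datetime(2018, 3, 12, 23, 0), 3.0),   # outside every window, ignored
]
units = build_units(sessions)
```

Each resulting matrix plays the role of a temporal purchasing unit U with t = 5 rows and d = 7 columns.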
We implemented detectGroups in Algorithm 1 and Algorithm 2 using the k-means clustering algorithm [27]. The k-means algorithm requires specifying the number of clusters k.
For the extraction of the individual profiles, Algorithm 1 does not take the number of clusters k as input. Instead, it automatically estimates the number of clusters, i.e., the number of individual footprints for each customer, by running k-means for \(k \in[2, 50]\) and selecting as the number of clusters the k which can be considered the “knee” of the Sum of Squared Error (SSE) curve. The idea is to find the best k for each customer rather than using the same value for everyone. In particular, we select as knee the point on the SSE curve having the maximum distance from the straight line passing through the minimum and the maximum point of the SSE curve. On the other hand, for detectGroups in Algorithm 2 we used k-means with \(k \in[2, 150]\) and, again using the knee method on the SSE curve, we selected \(k = 45\) as the number of collective groups. In both cases, as distance function δ we used the cosine distance because, unlike the Euclidean or Manhattan distance, it does not suffer from the problem of sparseness. Indeed, a customer typically purchases once or twice per week, generating very sparse footprints F.
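The knee selection described above can be sketched in a few lines of Python. This is an illustrative reading of the rule, not the authors’ implementation: the function name `knee_index` is hypothetical, the curve is given as a list of SSE values indexed by position, and the SSE values in the example are made up.

```python
import math


def knee_index(values):
    """Index of the point farthest from the line through the max and min points.

    values: curve ordinates (e.g. SSE for increasing k); the abscissa of each
    point is taken to be its position in the list.
    """
    i_max = values.index(max(values))
    i_min = values.index(min(values))
    if i_max == i_min:
        return i_max  # flat curve: no knee can be identified
    x1, y1 = i_max, values[i_max]
    x2, y2 = i_min, values[i_min]
    norm = math.hypot(x2 - x1, y2 - y1)
    best_i, best_d = i_max, -1.0
    for i, y in enumerate(values):
        # Perpendicular distance of point (i, y) from the line.
        d = abs((y2 - y1) * i - (x2 - x1) * y + x2 * y1 - y2 * x1) / norm
        if d > best_d:
            best_i, best_d = i, d
    return best_i


# Hypothetical SSE values obtained by running k-means for k = 2, ..., 10.
ks = list(range(2, 11))
sse = [100.0, 60.0, 35.0, 20.0, 18.0, 17.0, 16.5, 16.0, 15.8]
best_k = ks[knee_index(sse)]
print(best_k)
```

Rescaling either axis multiplies every distance by the same factor, so the selected point does not depend on the units in which the SSE is expressed.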
Therefore, by applying Algorithm 1 and Algorithm 2 in sequence we obtain for each customer her profile \(\mathcal{P}_{c}\) and the collective perspective \(\mathcal{P}^{*}_{c}\).
5.3 Individual footprints analysis
In this section we analyze the individual temporal purchasing footprints \(\mathcal{F}_{c}\) contained in the temporal purchasing profiles \(\mathcal{P}_{c}\) extracted by employing Algorithm 1 on all the customers. Empty footprints are clustered together by default and are represented with an empty footprint. The extraction time is about 0.5–1.0 seconds per customer, depending on the number of non-empty footprints.
In light of this, the upcoming analysis aims at understanding how the collective perspective impacts the individual footprints, and the observed indicators.
5.4 Collective footprints analysis
In this section we analyze the customers’ collective footprints representing the collective perspective obtained using Algorithm 2 on the individual profiles \(\mathcal{P}_{c}\).
In order to show that this result is not due to chance, in Fig. 6(upper left) we also report the distribution generated by a null model: for each customer, each individual footprint is randomly assigned to a collective footprint. In other words, we preserve the number of individual footprints of each customer and the number of collective footprints, while we destroy the assignment returned by the clustering: each individual footprint is labeled with a randomly selected collective footprint instead of the one of the cluster it belongs to. This null distribution (in grey) has a Gaussian shape with mode ∼7 and, since it differs markedly from the bimodal one, it allows us to state that the bimodal distribution is not a chance result. Hence, the bimodal distribution statistically delineates two subsets of customers: the regular customers, represented by a limited set of collective footprints, and the changing customers, requiring a higher number of collective footprints.
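The null model can be sketched in a few lines. The example assumes that the clustering result is stored as a dictionary mapping each customer to the list of collective footprint labels of her individual footprints, and that random labels are drawn uniformly; both the data layout and the uniform choice are assumptions made for illustration.

```python
import random


def distinct_counts(assignments):
    """Number of distinct collective footprints used by each customer."""
    return {c: len(set(labels)) for c, labels in assignments.items()}


def null_model(assignments, n_collective, seed=0):
    """Randomly relabel every individual footprint.

    The number of individual footprints per customer and the number of
    collective footprints are preserved; the assignment produced by the
    clustering is destroyed.
    """
    rng = random.Random(seed)
    return {
        c: [rng.randrange(n_collective) for _ in labels]
        for c, labels in assignments.items()
    }
```

Comparing the histogram of `distinct_counts(assignments)` with that of `distinct_counts(null_model(assignments, 45))` reproduces the kind of comparison described above.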
Using 4 as threshold we obtain a 27%–83% partitioning. By comparing the individual with the collective footprints we can discriminate between two well-defined groups of customers. Regular customers are more predictable than changing customers since they adopt a smaller range of temporal footprints. Figure 6(upper right) illustrates the distribution of the ratio between the number of collective footprints and the number of individual footprints \(\vert \mathcal{C}_{c} \vert / \vert \mathcal{F}_{c} \vert \). For 37% of the customers each individual footprint belongs to a different collective group, while for the rest the collective perspective changes the personal definition of their behavior by putting two or more different individual footprints in the same collective group.
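As a small illustration, the ratio \(\vert \mathcal{C}_{c} \vert / \vert \mathcal{F}_{c} \vert \) and the regular/changing split can be computed as below. The sketch assumes that a customer is stored as a mapping from individual footprint identifiers to collective footprint identifiers, and that the threshold is inclusive on the regular side; the text does not specify the boundary case, so this is an assumption.

```python
def collective_ratio(individual_to_collective):
    """|C_c| / |F_c| for a single customer."""
    n_individual = len(individual_to_collective)
    n_collective = len(set(individual_to_collective.values()))
    return n_collective / n_individual


def customer_type(individual_to_collective, threshold=4):
    """Label a customer as regular or changing.

    Assumption: at most `threshold` distinct collective footprints means
    regular; the handling of the boundary value is not stated in the text.
    """
    n_collective = len(set(individual_to_collective.values()))
    return "regular" if n_collective <= threshold else "changing"
```

A ratio equal to 1 corresponds to the case in which every individual footprint falls into a different collective group.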
The distributions of purity and entropy, shown in the bottom row of Fig. 6, are also remarkably different from those in Fig. 5. For purity we observe a new group of \({\sim}10k\) pure customers, while for entropy, even though the distribution remains long-tailed, the units are now more unbalanced towards a small set of footprints representing the whole customer purchasing behavior. Moreover, the standard deviation σ grows for both measures. In addition, the average purity of a regular customer is 0.94, while it is just 0.19 for a changing customer. We notice a similar effect for entropy: the average entropy of a regular customer is 0.65, while it is 0.91 for a changing customer. This confirms the higher unpredictability of changing customers. The regular-changing partitioning is the first segmentation that emerges by employing our methodological framework. In the following we move beyond the regular-changing distinction, looking for other interesting temporal segmentations of the customers.
In Fig. 7, with the exception of collective footprints (29) and (38), all the collective footprints describe a one-shop behavior, i.e., most of the customers perform only one purchase per week. However, the day and time window of these one-shop purchases are spread among the various possible choices. For example, customers with a footprint represented by (1) spend about €37 on Sat 10–12, those having a footprint represented by (14) spend about €49 on Fri 10–12, and those represented by (4) spend about €55 on Fri 16–18. As anticipated by the M-shape in Sect. 5.1, the two most used time windows are 10–12 and 16–18, but there are also some collective footprints in “unusual” time windows, e.g. (39) and (40).
We notice that a shopping behavior for the same day and time window can be captured by different collective footprints (e.g. (1)–(2), (19)–(12)) with a different maximum amount. Thus, we classify these one-shop spending behaviors into three classes according to the maximum amount spent. We name normal spending footprint the collective footprints with a peak lower than €50, high spending footprint those with a peak between €50 and €100, and higher spending footprint those with a peak higher than €100.
Moreover, collective footprint (29) captures occasional shopping sessions where a maximum of €3 is spent, neither on a specific day nor in a specific time window. However, 87% of the customers exhibit the behavior described by this collective footprint. This indicates that even though each customer has a quite regular one-shop footprint, she sometimes makes purchases following an occasional spending footprint.
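The three spending classes can be expressed as a simple rule on the peak of a footprint. In this sketch a collective footprint is assumed to be a day-by-window matrix of amounts in euros; the treatment of the exact boundary values (€50 and €100) is an assumption, since the text does not specify it.

```python
def spending_class(footprint):
    """Classify a one-shop collective footprint by its peak amount (euros)."""
    peak = max(max(row) for row in footprint)
    if peak < 50:
        return "normal"
    if peak <= 100:  # assumption: both 50 and 100 fall in the "high" class
        return "high"
    return "higher"
```

For instance, a footprint whose only non-zero cell is €37 on Sat 10–12 would be labeled as a normal spending footprint.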
Finally, collective footprint (38) captures the behavior of customers that every morning (7–9) make a purchase spending at most €16. We name this behavior daily spending footprint. The customers having this footprint can be retirees who go to the shopping center every morning to satisfy only their daily needs, or workers going to the supermarket before work for buying their lunch. This is the second segmentation unveiled from the analysis of the temporal purchasing profiles.
The last analysis consists in discovering which are the most common orders in which these collective footprints are adopted by the customers.
5.5 Collective sequences analysis
Most of the clusters are characterized by a repetition of the same collective footprint, e.g. clusters 14, 18, 0, 15, 5, 6 and 19 in Fig. 7. Customers belonging to these clusters have a preferred moment to shop and/or need to shop at that particular moment. This behavior is probably driven by their weekly timetable. The fact that no no-shopping behavior separates these one-shop behaviors signals that they consume all the products purchased and need to repurchase every week, spending approximately the same amount. Cluster 14 reveals that the daily spenders also repeat their footprint regularly through the weeks. Clusters 30, 22 and 31 capture different permutations of collective footprints (29) and (−1), where (−1) indicates no-shopping. These customers generally purchase in subsequent weeks without a regular pattern. Indeed, they are mostly changing customers. Moreover, cluster 30 follows a Yes-No-Yes^{5} (YNY) pattern (complementary to 31), while customers in cluster 22 buy every week (YYY). Clusters 2 and 23 capture two different repetitions of a one-shop footprint following a NYN pattern, i.e., these customers deplete their storage in the first week, go shopping in the second week, and consume the new supplies in the third week. Cluster 4 is complementary to 2. Finally, clusters 1 and 8 mirror each other with a NNY and a YNN pattern. In the former, the customers do not purchase for two weeks and then spend about €60 on Saturday morning of the third week, while the customers of cluster 8 spend €45 on Monday morning and then do not need to purchase for two weeks. This last analysis shows another possible segmentation driven by the temporal sequences.
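The Yes/No notation used above can be derived mechanically from a sequence of weekly collective footprints. The sketch below assumes that a customer's history is a list of collective footprint identifiers, one per week, with −1 denoting no-shopping as in the text; the function names and the window length of three weeks are illustrative choices.

```python
from collections import Counter

NO_SHOPPING = -1  # identifier of the no-shopping footprint, as in the text


def yn_pattern(sequence):
    """Map weekly collective footprints to a string of Y (shop) / N (no shop)."""
    return "".join("N" if f == NO_SHOPPING else "Y" for f in sequence)


def window_patterns(sequence, n=3):
    """Count the Y/N patterns over all windows of n consecutive weeks."""
    pattern = yn_pattern(sequence)
    return Counter(pattern[i:i + n] for i in range(len(pattern) - n + 1))
```

A customer alternating footprint (29) with no-shopping weeks would therefore be summarized by the YNY and NYN patterns only.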
5.6 Learning more about the customers in different groups
In this section we further characterize the different groups of customers discovered in the UniCoop dataset. A first partitioning, which is related not to a specific behavior but rather to the number of collective temporal purchasing footprints adopted, is the one into regular customers and changing customers. Besides the differences highlighted in the previous sections, we observe that the amount spent by the two groups does not in itself depend on the partitioning into regular or changing, whereas its variability does. In particular, the standard deviation of the amount spent by a regular customer is 7.61, while it is 32.38 for a changing customer. This means that temporal regularity is reflected in spending regularity. The regular customers, having few different collective footprints, are also regular with respect to the amount spent (low standard deviation of the amount spent). On the other hand, changing customers, who can follow many different collective footprints, are also more eclectic with respect to the amount spent (high standard deviation of the amount spent). This is reasonable: if a customer has only small variations in the times she goes shopping, then it is likely that her regularity depends on a periodic purchasing plan that allows her to consume all the items bought. This leads to a high probability that, every time the customer goes to the shopping center, she will need a set of items approximately comparable to those of the previous purchase. This is not the case if the time between two shopping sessions varies a lot. Our claim is that a customer who is regular with respect to the temporal dimension is likely to be regular also with regard to the items bought.
5.7 Methodological framework portability
In this section we show that our framework can be instantiated for other case studies based on data with characteristics different from those of the UniCoop data. In particular, the application of our methodology must take into consideration the fact that these data contain no information about either the purchasing time or the amount spent. Nevertheless, we find results similar to those of the principal case study.
As for the principal case study, for the TaFeng and TMall datasets we adopt the week as time unit and set the period \({\tau=7}\) and the number of day intervals \({d=7}\). However, since the time of the shopping is not available, we cannot model it and we consequently use a single time window (\({t=1}\)). Moreover, the amount spent is not available either, thus as relevance function rel we used the sum of the number of items purchased in each shopping session. As before, we implemented detectGroups in Algorithm 1 and Algorithm 2 using the k-means clustering algorithm [27]. For the extraction of the individual profiles we estimated the number of clusters by running k-means for \(k \in[2, 40]\) and, as before, selecting as number of individual footprints the k corresponding to the “knee” of the SSE curve. Similarly, for detectGroups in Algorithm 2 we used k-means with \(k \in[2, 145]\) and, again using the knee method on the SSE curve, we selected \(k = 12\) for TaFeng and \(k = 15\) for TMall as number of collective footprints. For each customer we get both her profile \(P_{c}\) and the corresponding collective perspective \(P^{*}_{c}\).
5.8 Collective footprints validation
In this section we show that the technique adopted for selecting the number of clusters, both for the individual and for the collective footprints, is robust: small variations in the number of clusters do not change the overall conclusions of the paper. To this end, we used a validation measure alternative to the sum of squared error adopted for selecting the number of clusters. The silhouette coefficient is another useful criterion for assessing the natural number of clusters in a set of data [27]. It measures how similar an object is to its own cluster (cohesion) compared to other clusters (separation). The silhouette ranges from −1 to +1, where a high value indicates that the object is well matched to its own cluster and poorly matched to neighboring clusters. If most objects have a high value, then the clustering configuration is appropriate. If many points have a low or negative value, then the clustering configuration may have too many or too few clusters.
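For reference, the silhouette of an object i is \(s(i) = (b(i) - a(i)) / \max\{a(i), b(i)\}\), where \(a(i)\) is the mean distance from i to the other objects of its own cluster and \(b(i)\) is the smallest mean distance from i to the objects of any other cluster. The following is a minimal, quadratic-time sketch using the cosine distance; the handling of zero vectors and of singleton clusters follows common conventions and is an assumption, and at least two clusters are required.

```python
import math


def cosine_distance(u, v):
    """1 - cosine similarity; zero vectors get distance 1.0 (assumption)."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    if nu == 0 or nv == 0:
        return 1.0
    return 1.0 - dot / (nu * nv)


def silhouette_values(points, labels, dist=cosine_distance):
    """Silhouette s(i) of every point; requires at least two clusters."""
    values = []
    for i, (p, lab) in enumerate(zip(points, labels)):
        sums, counts = {}, {}
        for j, (q, other) in enumerate(zip(points, labels)):
            if i == j:
                continue  # a point is not compared with itself
            sums[other] = sums.get(other, 0.0) + dist(p, q)
            counts[other] = counts.get(other, 0) + 1
        if counts.get(lab, 0) == 0:
            values.append(0.0)  # singleton cluster: s(i) = 0 by convention
            continue
        a = sums[lab] / counts[lab]  # cohesion
        b = min(sums[c] / counts[c] for c in sums if c != lab)  # separation
        values.append((b - a) / max(a, b) if max(a, b) > 0 else 0.0)
    return values
```

Averaging the returned values gives the silhouette coefficient of the whole clustering, which can then be compared across different values of k.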

(NMI) Normalized Mutual Information score: a normalization of the Mutual Information score, which measures the mutual information between two clusterings.

(ARI) Adjusted Rand Index score: computes a similarity measure between two clusterings by considering all pairs of samples and counting the pairs that are assigned to the same or to different clusters in the predicted and true clusterings.

(VM) V-Measure score: the harmonic mean between homogeneity and completeness [33].
Collective clustering cross-validation with respect to the Normalized Mutual Information score, the Adjusted Rand Index score and the V-Measure score. For each measure, the mean and the standard deviation over five runs are reported
Dataset  NMI  ARI  VM 

Coop  0.73 ± 0.015  0.57 ± 0.044  0.72 ± 0.016 
TaFeng  0.78 ± 0.054  0.78 ± 0.012  0.78 ± 0.055 
TMall  0.65 ± 0.033  0.66 ± 0.077  0.65 ± 0.033 
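To make the V-Measure concrete, the sketch below computes it from two label lists using only the standard library, with homogeneity \(1 - H(C \mid K)/H(C)\) and completeness \(1 - H(K \mid C)/H(K)\). It is an illustrative implementation and not the code used for the table above; libraries such as scikit-learn offer equivalent functions (`v_measure_score`, `normalized_mutual_info_score`, `adjusted_rand_score` in `sklearn.metrics`).

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (natural logarithm) of a labeling."""
    n = len(labels)
    return -sum((c / n) * math.log(c / n) for c in Counter(labels).values())


def conditional_entropy(a, b):
    """H(a | b): uncertainty left on labeling a once labeling b is known."""
    n = len(a)
    joint = Counter(zip(a, b))
    b_counts = Counter(b)
    return -sum(
        (n_ab / n) * math.log(n_ab / b_counts[y])
        for (_, y), n_ab in joint.items()
    )


def v_measure(reference, predicted):
    """Harmonic mean of homogeneity and completeness."""
    h_ref, h_pred = entropy(reference), entropy(predicted)
    homogeneity = (
        1.0 if h_ref == 0
        else 1.0 - conditional_entropy(reference, predicted) / h_ref
    )
    completeness = (
        1.0 if h_pred == 0
        else 1.0 - conditional_entropy(predicted, reference) / h_pred
    )
    if homogeneity + completeness == 0:
        return 0.0
    return 2 * homogeneity * completeness / (homogeneity + completeness)
```

The score does not depend on how the clusters are numbered, so two runs that produce the same partition with permuted labels obtain a V-Measure of 1.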
6 Exploitation
The temporal purchasing profiles can be employed for a wide range of applications. In the previous section we showed how different groups of behaviors and customers can be easily identified by exploiting the profiles and their collective perspectives. Both the customers themselves and the retail market chains adopting this methodology can gain useful insights from the analysis of the knowledge extracted with the temporal purchasing footprints. In the following we discuss possible applications of the methodologies proposed in this paper and the benefits that can be drawn by the customers at the individual level and by the retail market chain at the collective level.
A first example is to provide the customers with a visualization of their typical temporal purchasing patterns to make them conscious of their behavior. The retail market chain could furnish each customer with a personal dashboard, like those theorized in [34, 35]. The visualization of the customer’s temporal purchasing profile \(\mathcal{P}_{c}\) might contribute to improving the customer’s self-awareness. This self-knowledge might lead the customer to change her temporal habits, with the possibility of saving money in case a more regular behavior leads her to spend less. Clearly, the change is possible only if it is compatible with the customer’s daily schedule and the customer desires to change. For example, a customer could discover whether she is a regular or a changing customer. The customer could react by assuming a more regular purchasing behavior, trying to adopt more one-shop patterns. Moreover, the customer could also monitor her shopping sessions to understand whether her purchasing behaviors remain stable or change over time. Besides this, since, as shown in the paper, the individual perspective alone is not sufficient to really understand who we are, the integration with the collective perspective provides a way to compare the individual behavior with those of the other customers, and to better understand who we are with respect to the others. For example, the collective perspective \(\mathcal{P}^{*}_{c}\) could reveal to the customer c that she is very similar to the mass in terms of day and time window but with a typically higher amount spent. Hence, c could try to change her weekly habits in order to experience less crowded shopping sessions. Another individual service which can be derived from the temporal purchasing profile is a sort of shopping reminder. Knowing the temporal purchasing habits, the system can interact with the customer’s personal calendar and remind or alert her that in x days a certain amount is going to be spent.
Finally, considering the typical purchasing behaviors of all the customers, an individual customer can be helped with tailored recommendations for her shopping time schedule, suggesting that she anticipate or postpone the purchasing time in order to find shorter queues at the supermarket checkout.
On the other hand, the retail market chain can exploit the collective footprints of all the customers \(\mathcal{C}\) and a customer segmentation like the one shown in Sect. 5.5 to offer personalized discounts. For example, the retail manager could employ the collective footprints to promote, for each customer, shopping in her favorite day and time window by applying a tailored discount, thus making the regular customers even more regular and also more profitable [3]. Furthermore, the analysis of the regular subsequences enables the retail manager to push customers who generally alternate one-shop weeks with no-shopping weeks to perform consecutive one-shop weeks in order to obtain special temporal discounts. Finally, the potential predictive power of the model could be capitalized on by shop managers. For example, the knowledge of the collective footprints could be used to improve the overall service, for instance by reorganizing the shifts of the employees or by rescheduling the disposal and replacement of the products on the shelves during opening hours. In addition, going back to possible personalized services offered by the shop manager, the individual temporal purchasing footprints of each customer can be used as features to improve existing recommender systems or to predict the next time that an individual will perform a shopping session.
7 Conclusion
In this paper we have proposed an approach to extract the regularities characterizing the temporal purchasing profile of customers. We have defined the temporal purchasing profile as formed by the temporal purchasing footprints and by the sequences in which these footprints take place. Then, we have described how to make the profiles comparable among different customers by providing them with a collective perspective. The collective perspective has enabled the analysis of many possible segmentations of the customers. The general methodological framework has been applied to a case study regarding retail customers, where we considered the week as temporal unit. Our extensive analysis of the case studies revealed that for most of the customers the individual profile differs from its collective perspective. Using this information, customers can be classified into regular and changing according to the number of behaviors needed to describe them. Moreover, we have outlined the typical patterns summarizing human behavior in scheduling the shopping time and their repetition through time.
The analytical results show that our framework enables the segmentation of customers with respect to different points of view. For example, we discovered segmentations based on: (a) the number of collective behaviors; (b) the shopping time and the amount of the expenditure; and (c) the frequency of the sequential order of specific behaviors.
As future work, we would like to extend the methodological framework in order to test the predictive power of the temporal profile by predicting when the next shopping session will take place and how much will be spent. Finally, in collaboration with UniCoop Tirreno, we would like to implement a web dashboard where a customer can provide her fidelity card number and visualize the patterns forming her temporal purchasing profile.
All the collective footprints can be found at https://github.com/riccotti/CustomerTemporalRegularities.
Declarations
Acknowledgements
We thank UniCoop Tirreno and Walter Fabbri for allowing us to analyze the data and to publish the results.
Availability of data and materials
A sample of the source code implementing the proposed methodological framework, a sample of the dataset used in the case study, and additional results will be available at publication time at the following link https://github.com/riccotti/CustomerTemporalRegularities.
Funding
This work is partially supported by the European Community’s H2020 Program under the funding scheme INFRAIA-1-2014-2015: 654024 SoBigData, http://www.sobigdata.eu/.
Authors’ contributions
All authors conceived the framework and the analysis and the various case studies. RG and AM elaborated the theoretical models. RG and LG implemented the framework and performed the analysis of the case studies. All authors analyzed and discussed the results and contributed to the manuscript. All authors read and approved the final manuscript.
Competing interests
The authors declare that they have no competing interests.
Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.
References
1. Agrawal R, Imielinski T, Swami AN (1993) Mining association rules between sets of items in large databases. In: SIGMOD, Washington, D.C., pp 207–216
2. Kim E, Kim W, Lee Y (2003) Combination of multiple classifiers for the customer’s purchase behavior prediction. Decis Support Syst 34(2):167–175
3. Guidotti R, Coscia M, Pedreschi D, Pennacchioli D (2015) Behavioral entropy and profitability in retail. In: IEEE international conference on data science and advanced analytics (DSAA). IEEE, pp 1–10
4. McDonald WJ (1994) Time use in shopping: the role of personal characteristics. J Retail 70(4):345–365
5. Lingras P, Hogo M, Snorek M, West C (2005) Temporal analysis of clusters of supermarket customers: conventional versus interval set approach. Inf Sci 172(1–2):215–240
6. Guidotti R (2017) Personal data analytics: capturing human behavior to improve self-awareness and personal services through individual and collective knowledge
7. Chen ZY, Fan ZP (2012) Distributed customer behavior prediction using multiplex data: a collaborative MK-SVM approach. Knowl-Based Syst 35:111–119
8. Yada K, Motoda H, Washio T, Miyawaki A (2004) Consumer behavior analysis by graph mining technique. In: Knowledge-based intelligent information and engineering systems, 8th international conference, KES 2004. Proceedings. Part II, pp 800–806
9. Shangguan L, Zhou Z, Zheng X, Yang L, Liu Y, Han J (2015) ShopMiner: mining customer shopping behavior in physical clothing stores with COTS RFID devices. In: Proceedings of the 13th ACM conference on embedded networked sensor systems. ACM, New York, pp 113–125
10. Luo L, Li B, Koprinska I, Berkovsky S, Chen F (2016) Discovering temporal purchase patterns with different responses to promotions. In: Proceedings of the 25th ACM international on conference on information and knowledge management. ACM, New York, pp 2197–2202
11. Krumme C, Llorente A, Cebrian M, Pentland A, Moro E (2013) The predictability of consumer visitation patterns. Sci Rep 3:1645
12. Chen MC, Chiu AL, Chang HH (2005) Mining changes in customer behavior in retail marketing. Expert Syst Appl 28(4):773–781
13. Song HS, Kim JK, Kim SH (2001) Mining the change of customer behavior in an Internet shopping mall. Expert Syst Appl 21(3):157–168
14. Hamuro Y, Katoh N, Edward IH, Cheung SL, Yada K (2003) Combining information fusion with string pattern analysis: a new method for predicting future purchase behavior. In: Information fusion in data mining. Springer, Berlin, pp 161–187
15. Di Clemente R, Luengo-Oroz M, Travizano M, Vaitla B, Gonzalez MC (2017) Sequence of purchases in credit card data reveal life styles in urban populations. arXiv:1703.00409
16. Padmanabhan B, Zheng Z, Kimbrough SO (2001) Personalization from incomplete data: what you don’t know can hurt. In: Proceedings of the seventh ACM SIGKDD international conference on knowledge discovery and data mining. ACM, New York, pp 154–163
17. Hansen T, Jensen JM, Solgaard HS (2004) Predicting online grocery buying intention: a comparison of the theory of reasoned action and the theory of planned behavior. Int J Inf Manag 24(6):539–550
18. Lin HF (2007) Predicting consumer intentions to shop online: an empirical test of competing theories. Electron Commer Res Appl 6(4):433–442
19. Van den Poel D, Buckinx W (2005) Predicting online-purchasing behaviour. Eur J Oper Res 166(2):557–575
20. Koren Y (2010) Collaborative filtering with temporal dynamics. Commun ACM 53(4):89–97
21. Rendle S, Freudenthaler C, Schmidt-Thieme L (2010) Factorizing personalized Markov chains for next-basket recommendation. In: Proceedings of the 19th international conference on world wide web. ACM, New York, pp 811–820
22. Cumby C, Fano A, Ghani R, Krema M (2004) Predicting customer shopping lists from point-of-sale purchase data. In: Proceedings of the tenth ACM SIGKDD international conference on knowledge discovery and data mining. ACM, New York, pp 402–409
23. Yu F, Liu Q, Wu S, Wang L, Tan T (2016) A dynamic recurrent model for next basket recommendation. In: Proceedings of the 39th international ACM SIGIR conference on research and development in information retrieval. ACM, New York, pp 729–732
24. Guidotti R, Rossetti G, Pappalardo L, Giannotti F, Pedreschi D (2017) Market basket prediction using user-centric temporal annotated recurring sequences. In: Data mining (ICDM), 2017 IEEE 17th international conference on. IEEE
25. Furletti B, Gabrielli L, Renso C, Rinzivillo S (2012) Identifying users profiles from mobile calls habits. In: Proceedings of the ACM SIGKDD international workshop on urban computing. ACM, New York, pp 17–24
26. Giegerich R, Kurtz S (1997) From Ukkonen to McCreight and Weiner: a unifying view of linear-time suffix tree construction. Algorithmica 19(3):331–353
27. Tan PN, Steinbach M, Kumar V et al. (2006) Introduction to data mining. Pearson Education, Upper Saddle River
28. Zerubavel E (1989) The seven day circle: the history and meaning of the week. University of Chicago Press, Chicago
29. Shannon CE (2001) A mathematical theory of communication. Mob Comput Commun Rev 5(1):3–55
30. Pappalardo L, Simini F, Rinzivillo S, Pedreschi D, Giannotti F, Barabási AL (2015) Returners and explorers dichotomy in human mobility. Nat Commun 6:8166
31. Guidotti R, Trasarti R, Nanni M, Giannotti F, Pedreschi D (2017) There’s a path for everyone: a data-driven personal model reproducing mobility agendas. In: IEEE international conference on data science and advanced analytics (DSAA). IEEE, pp 1–10
32. Kaufman L, Rousseeuw P (1987) Clustering by means of medoids. North-Holland, Amsterdam
33. Vinh NX, Epps J, Bailey J (2010) Information theoretic measures for clusterings comparison: variants, properties, normalization and correction for chance. J Mach Learn Res 11:2837–2854
34. de Montjoye YA, Shmueli E, Wang SS, Pentland AS (2014) openPDS: protecting the privacy of metadata through SafeAnswers. PLoS ONE 9(7):e98790
35. Vescovi M, Moiso C, Pasolli M, Cordin L, Antonelli F (2015) Building an ecosystem of trusted services via user control and transparency on personal data. In: Trust management IX. Springer, Berlin, pp 240–250