PDPNN: Modeling User Personal Dynamic Preference for Next Point-of-Interest Recommendation
Jinwen Zhong, Can Ma, Jiang Zhou, Weiping Wang
Computational Science - ICCS 2020, 2020-05-25. DOI: 10.1007/978-3-030-50433-5_4

Abstract. Next Point of Interest (POI) recommendation is an important aspect of information feeds for Location-Based Social Networks (LBSNs). The boom in LBSN platforms such as Foursquare, Twitter, and Yelp has motivated a considerable amount of research focused on POI recommendation within the last decade. Inspired by the success of deep neural networks in many fields, researchers are increasingly interested in using neural networks such as Recurrent Neural Networks (RNNs) to make POI recommendations. Compared to traditional methods like Factorizing Personalized Markov Chains (FPMC) and Tensor Factorization (TF), neural network methods show great improvement in general sequence prediction. However, the user's personal preference, which is crucial for personalized POI recommendation, is not addressed well in existing works. Moreover, the user's personal preference is dynamic rather than static, and can guide predictions in different temporal and spatial contexts. To this end, we propose a new deep neural network model called the Personal Dynamic Preference Neural Network (PDPNN). The core of the PDPNN model includes two parts: one part learns the user's personal long-term preferences from the historical trajectories, and the other part learns the user's short-term preferences from the current trajectory. By introducing a similarity function that evaluates the similarity between spatiotemporal contexts of the user's current trajectory and historical trajectories, PDPNN learns the user's personal dynamic preference from the user's long-term and short-term preferences. We conducted experiments on three real-world datasets, and the results show that our model outperforms current well-known methods.

Due to the prevalence of smart mobile devices, people frequently use Location-Based Social Networks (LBSNs) such as Foursquare, Twitter, and Yelp to post check-ins and share their life experiences. Point-of-interest (POI) recommendation has become an important way to help people discover attractive and interesting venues based on their historical preferences. Many merchants also use POI recommendation as an important channel to promote their products.

POI recommendation has been extensively studied in recent years. In the literature, Markov Chains (MC) [1, 2], collaborative filtering [3-6], and Recurrent Neural Networks [7-11] are the three main approaches. These methods commonly focus on capturing either the user's short-term preferences or long-term preferences. The user's short-term and long-term preferences are both important for achieving higher recommendation accuracy [12]. The short-term preference is greatly influenced by time and by the POIs the user has visited recently, while the long-term preference reflects the user's personal habits, and thus all historically visited POIs need to be considered. However, the user's personal long-term preference, which is crucial for personalized POI recommendation, is not addressed well in existing work. Moreover, the user's personal preference is dynamic rather than static, and can guide predictions in different temporal and spatial contexts.
To this end, we propose a new deep neural network model called the Personal Dynamic Preference Neural Network (PDPNN). The core of the PDPNN model includes two parts: one part learns the user's personal long-term preferences from the historical trajectories, and the other part learns the user's short-term preferences from the current trajectory. By introducing a similarity function that evaluates the similarity between spatiotemporal contexts of the user's current trajectory and historical trajectories, PDPNN learns the user's personal dynamic preference from the user's long-term and short-term preferences. We summarize our contributions in this paper as follows:

- We propose PDPNN, which models the user's personal dynamic preference by combining the long-term preference learned from historical trajectories with the short-term preference learned from the current trajectory, using a similarity function over their spatiotemporal contexts.
- We introduce a pre-training strategy and an inner epoch cache that reduce the per-epoch complexity of the personal dynamic preference module from O(n^2) to O(n) and greatly accelerate training.
- We conduct experiments on three real-world datasets, and the results show that our model outperforms current well-known methods.

Plenty of approaches have been proposed that focus on sequential data analysis and recommendation. Collaborative filtering based models, such as Matrix Factorization (MF) [3] and Tensor Factorization (TF) [4], are widely used for recommendation. These methods aim to cope with data sparsity and the cold start problem by capturing common short-term preferences among users, but cannot capture the user's personal long-term preferences. Other studies employ the properties of a Markov chain to capture sequential patterns, such as Markov Chains (MC) [1] and Factorizing Personalized Markov Chains (FPMC) [2]. FPMC models the user's preference and sequential information jointly by combining a factorization method and Markov Chains for next-basket recommendation. However, both MC and FPMC fall short in learning the long-term preference and the periodicity of the user's movement.

Recently, following the success of deep learning in speech recognition, vision and natural language processing, Recurrent Neural Networks (RNNs) have been widely used in sequential item recommendation [7-11]. The Spatial Temporal Recurrent Neural Network (STRNN) [8] utilizes an RNN architecture and linear interpolation to learn the regularity of sequential POI transitions. However, traditional RNNs suffer from vanishing gradients and error propagation when they learn long-term dependencies [13]. Therefore, special gating mechanisms such as the Long Short-Term Memory network (LSTM) [14] have been developed and widely used in recent work [9-11]. By controlling access to memory cells, LSTMs can alleviate the problem of long-term dependencies. The attention mechanism is a key advancement in deep learning in recent years, and it shows promising performance improvements for RNNs [15-17]. By introducing the attention mechanism, the Attention-based Spatio-Temporal LSTM network (ATST-LSTM) [9] can selectively focus on the relevant historical check-in records in a check-in sequence using spatiotemporal contextual information. However, these models are designed to learn short-term preferences and are not well suited to learning personal long-term preferences. They commonly add a user embedding to the RNN outputs, reducing the problem of user context learning to that of learning a static optimal user embedding. Learning long-term preferences with simple attention mechanisms is even more difficult, especially for users with only a small set of historical trajectories.

Let U = {u_1, ..., u_|U|} denote the user set and P = {p_1, ..., p_|P|} denote the POI set, where |U| and |P| are the total numbers of users and POIs, respectively. Each POI p_k is associated with a geographic location l_k = (l_a, l_o), where l_a and l_o denote the latitude and longitude of the POI location.
For a user u ∈ U, a check-in means that u visits a POI p_k at time t_k, which is denoted as the tuple (u, p_k, t_k). Any check-in sequence in which all time intervals between successive check-ins are less than a threshold T_delta is called a trajectory. Obviously, a user's historical check-ins will be segmented into many trajectories. We denote T^u_i = {(u, p_1, t_1), ..., (u, p_n, t_n)} as the i-th trajectory of the user u, with t_k − t_{k−1} ≤ T_delta for neighboring check-ins of the trajectory, where |T^u_i| is the total number of its check-ins. All the historical trajectories of the user u are denoted as T^u = {T^u_1, T^u_2, ..., T^u_n}.

When we are processing the i-th trajectory T^u_i, we need to take into account the previous historical trajectories of user u, which we call the trajectory context. We denote C^u_i = {T^u_1, T^u_2, ..., T^u_{i−1}} as the context of the current trajectory T^u_i. When there is no ambiguity, we will omit the superscript u. Formally, given a user u and all of their historical trajectories T^u = {T^u_1, T^u_2, ..., T^u_n}, the problem of next POI recommendation is to predict the next POI p_{n+1} that the user is most likely to visit.

Basic Framework. The PDPNN model receives the current trajectory T_i and its trajectory context C_i as input, where C_i consists of the same user's trajectories preceding T_i. The model learns the short-term preference mainly from the current trajectory T_i, and builds a personal dynamic preference from both T_i and its trajectory context C_i. The architecture of our proposed model PDPNN is shown in Fig. 1:

- (I) Attention-enhanced recurrent neural network (ARNN) module: We use an LSTM network to capture short-term and long-term spatial-temporal sequence patterns. Additionally, an attention mechanism is introduced to capture the weight of all hidden states of the trajectory sequence. This component is used as an important part of both the personal dynamic preference and the short-term preference learning modules.
- (II) Personal dynamic preference learning module: The user's personal preference, which is implicit in their historical trajectories, is important for personalized POI recommendation. This module obtains a representation of the personal dynamic preference from the user's current trajectory T_i and its trajectory context C_i. We first utilize the ARNN module to calculate the output hidden states of all trajectories in the trajectory context. Obviously, not every state helps to predict a POI, and the historical trajectories that are highly correlated with the current trajectory should have greater weight. To this end, we introduce an attention mechanism into the module to measure the spatiotemporal context similarity between the trajectory context and the current trajectory, and aggregate all the hidden states of the trajectory context into the personal dynamic preference representation.
- (III) Short-term preference learning module: The short-term preference, usually referred to as the sequential preference, is commonly learned from the current trajectory. We simply apply the ARNN module described above to get the hidden state of the current trajectory T_i.
- (IV) Classifier: This is the final output component, which unifies the user embedding, the last hidden state from the current trajectory and the aggregated attention from historical trajectories into a feature representation, and then predicts the next POI. The output of this module is the probability vector over every POI that the user is likely to visit next. Cross-entropy loss and L2 regularization are used to measure the total loss in this module.
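Before detailing each module, the input construction can be made concrete. The following is a minimal sketch of the trajectory segmentation by T_delta and the trajectory-context construction defined above; it is not taken from the paper's code, and all names (CheckIn, segment_trajectories, trajectory_contexts) are illustrative. T_delta is assumed to be given in the same unit as the check-in timestamps.

```python
from typing import List, NamedTuple, Tuple

class CheckIn(NamedTuple):
    user: str
    poi: int      # POI identifier p_k
    time: float   # check-in timestamp t_k
    lat: float    # l_a
    lon: float    # l_o

def segment_trajectories(checkins: List[CheckIn], t_delta: float) -> List[List[CheckIn]]:
    """Split one user's time-ordered check-ins into trajectories: a new
    trajectory starts whenever the gap between successive check-ins
    exceeds the threshold T_delta."""
    trajectories, current = [], []
    for c in sorted(checkins, key=lambda c: c.time):
        if current and c.time - current[-1].time > t_delta:
            trajectories.append(current)
            current = []
        current.append(c)
    if current:
        trajectories.append(current)
    return trajectories

def trajectory_contexts(trajectories: List[List[CheckIn]]) -> List[Tuple[List[CheckIn], List[List[CheckIn]]]]:
    """Pair each trajectory T_i with its trajectory context C_i, i.e. the
    user's earlier trajectories T_1, ..., T_{i-1}."""
    return [(trajectories[i], trajectories[:i]) for i in range(len(trajectories))]
```

Each (T_i, C_i) pair produced this way corresponds to one input of the PDPNN model.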
ARNN is an attention-enhanced recurrent neural network, which receives a trajectory as input and outputs the weighted sum of the hidden states of the trajectory via an attention mechanism. The ARNN model consists of an embedding layer, a recurrent layer and an attention layer. Each input trajectory contains a sequence of POI identifiers (ids) p, time intervals Δt, geographic distances Δs, longitudes l_o and latitudes l_a. The input POI id is transformed into a latent vector e_p by the embedding layer, so the input at step t can be written as I_t = (e_{p_t}, Δt_t, Δs_t, l_{o,t}, l_{a,t}).

We utilize a Long Short-Term Memory (LSTM) network in the recurrent layer. An LSTM unit maintains a memory cell and uses an input gate i_t, a forget gate f_t, and an output gate o_t to control what is written to, kept in, and read from the cell, producing a hidden state h_t at each step [14].

The attention mechanism aims to capture the importance of all the states of the sequence. Two kinds of attention mechanism have been proposed, known as additive attention [16] and dot-product attention [15]. Considering that an optimized matrix multiplication operation is much faster and more space-efficient in practice than an alignment calculation through a hidden layer [15], we choose dot-product attention to calculate the attention weight of each state in the attention layer. Following the work of [15], we introduce Z_u as the user context, which is learned during training. We get the weight β_t of each hidden state h_t with a softmax function:

β_t = exp(h_t · Z_u) / Σ_k exp(h_k · Z_u).

The overall trajectory state output O_n is then the weighted sum of all hidden states:

O_n = Σ_t β_t h_t.

For the convenience of subsequent reference, we summarize the above computation (embedding, recurrent and attention layers) as a function ARNN(·), so that O = ARNN(T).

Personal Dynamic Preference Learning Module: Unlike routine RNN models that treat each trajectory as a separate input, PDPNN attaches to each trajectory T_i the trajectory context C_i = {T_1, T_2, ..., T_{i−1}} of the same user to produce the input. The current trajectory can be padded to a fixed length, whereas the trajectory context cannot be padded or truncated because its length varies greatly; this makes model training difficult. To solve this problem, we separate the processing of the trajectory context from the processing of the current trajectory. For each trajectory context C_i, we use the ARNN model to calculate the hidden state output of each trajectory T_j in C_i:

O_j = ARNN(T_j), T_j ∈ C_i.

Among all historical trajectories in the trajectory context, those that are highly relevant to the current trajectory are considered to play a more important role in guiding POI prediction. To this end, we introduce a correlation function f(·) to calculate the weight of each historical trajectory, and use a softmax to normalize the weights of all historical trajectory states:

α_j = exp(f(T_i, T_j)) / Σ_{T_k ∈ C_i} exp(f(T_i, T_k)).

The function f(·) can be any function that measures the correlation between trajectories. In this paper, we use the Jaccard similarity of the trajectories' POI sets:

f(T_i, T_j) = |P_i ∩ P_j| / |P_i ∪ P_j|,

where P_i and P_j are the POI sets of T_i and T_j, respectively. We then multiply the weight of each context trajectory by its corresponding output latent state, and represent the user's personal dynamic preference H_i as the weighted sum of these products:

H_i = Σ_{T_j ∈ C_i} α_j O_j.
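As a concrete illustration of this aggregation, the following is a minimal NumPy sketch, not the authors' implementation: all names are illustrative, and the ARNN outputs O_j for the context trajectories are assumed to be precomputed as row vectors.

```python
import numpy as np

def jaccard_similarity(pois_i: set, pois_j: set) -> float:
    """f(T_i, T_j) = |P_i ∩ P_j| / |P_i ∪ P_j| over the trajectories' POI sets."""
    union = pois_i | pois_j
    return len(pois_i & pois_j) / len(union) if union else 0.0

def personal_dynamic_preference(current_pois: set,
                                context_pois: list,         # POI set of each T_j in C_i
                                context_states: np.ndarray  # O_j = ARNN(T_j), shape (|C_i|, d)
                                ) -> np.ndarray:
    """Softmax the similarity of each historical trajectory to the current one,
    then return the weighted sum of their ARNN output states (H_i)."""
    if len(context_pois) == 0:                 # no history yet: empty preference
        return np.zeros(context_states.shape[1])
    sims = np.array([jaccard_similarity(current_pois, p) for p in context_pois])
    weights = np.exp(sims - sims.max())
    weights /= weights.sum()                   # α_j via softmax
    return weights @ context_states            # H_i = Σ_j α_j O_j
```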
The short-term preference learning module is mainly used to capture the user's sequential POI transition preferences. The module consists of a user embedding layer and an ARNN model, and receives a user and a trajectory as input. Each trajectory is represented by a sequence of POI identifiers, time intervals, geographic distances, longitudes and latitudes, as described above. In real-world data, users are represented only by a user identifier u, which cannot precisely reflect the similarities and differences between users. An appropriate representation of a user should accurately capture such similarities, so that the model can relate the behavioral preferences of similar users. To this end, we use a fully connected layer to embed user identifiers into a latent space:

e_u = u W_u + b_u,

where u is the one-hot representation of the user identifier, and the parameters are W_u ∈ R^{|U|×d_u} and b_u ∈ R^{1×d_u}. The output of the short-term preference learning module is the user embedding e_u together with the current trajectory state O_i = ARNN(T_i).

Classifier: We consider next POI recommendation as a multiclass classification problem. The classifier module concatenates the user embedding, the user's personal dynamic preference and the current trajectory state as its input:

Q = [e_u; H_i; O_i].

Then, we feed Q into a fully connected layer with a softmax function to calculate the probability of each POI:

ŷ = softmax(Q W_s^T + b_s),

where the parameter W_s ∈ R^{|P|×d_p}, d_p is the dimension of Q, and b_s is the bias parameter. We adopt the cross-entropy loss between the ground truth POI y_t and the predicted POI ŷ_t, together with L2 regularization:

L = − Σ_t y_t log ŷ_t + λ ||Θ||^2,

where Θ denotes the parameter set of PDPNN, and λ is a pre-defined hyper-parameter for the L2 regularization to avoid overfitting. To optimize the above loss function, we use Stochastic Gradient Descent to learn the parameter set Θ.

Training the model with the short-term preference and the personal dynamic preference jointly is time-consuming, so we use a pre-trained model and an inner epoch cache to accelerate the training process. The ARNN is the core component of the PDPNN model, and it takes most of the time required by the training process. In other words, if the ARNN component could be trained as quickly as possible, the training speed of the whole model would be accelerated. To this end, we pre-train the ARNN component in a simpler model that connects the output of the ARNN directly to the classifier. Then, we load the pre-trained model and modify its network structure to fit the full model during training.

Each input of the PDPNN model contains the current trajectory and the trajectory context, which makes the training process time-consuming. For a user u with n trajectories {T_1, T_2, ..., T_n}, the PDPNN needs to process all of the n trajectories and their corresponding trajectory contexts. For each trajectory T_i, the PDPNN learns a short-term preference from T_i and the user's personal dynamic preference from its trajectory context C_i = {T_1, T_2, ..., T_{i−1}}, which means the PDPNN needs to run the ARNN component over C_i for every i. The overall training cost for a user with n trajectories is therefore O(n^2) ARNN evaluations. As part of the trajectory contexts of T_{i+1}, T_{i+2}, ..., T_n, the i-th trajectory T_i is recomputed n − i times during each training epoch. An intuitive optimization is to eliminate these repeated calculations. However, the trajectory context states change as the model parameters are updated during training, which means we cannot simply cache the state values for the whole training process. Considering that the change in these states between batches within one epoch is very small, the trajectory context output states can be cached within a training epoch and refreshed between epochs, eliminating a large number of approximately repeated calculations. To this end, we apply an inner epoch cache to store the trajectory context output states during model training, which reduces the calculation of each user's personal dynamic preference to linear complexity.
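The inner epoch cache can be sketched as follows. This is a minimal illustration of the caching idea rather than the authors' implementation; the `arnn` callable and all other names are assumptions for the example.

```python
class InnerEpochCache:
    """Caches trajectory-context output states within one training epoch.

    Within an epoch each trajectory's ARNN state is computed at most once
    and reused; the cache is cleared between epochs so that states follow
    the updated model parameters."""

    def __init__(self, arnn):
        self.arnn = arnn        # callable: trajectory -> output state O_j
        self._cache = {}

    def state(self, traj_id, trajectory):
        if traj_id not in self._cache:
            self._cache[traj_id] = self.arnn(trajectory)
        return self._cache[traj_id]

    def new_epoch(self):
        """Refresh between epochs: stale states are recomputed lazily."""
        self._cache.clear()

# Per epoch, a user's n trajectory states are computed O(n) times instead of O(n^2):
#   cache.new_epoch()
#   context_states = [cache.state(j, T[j]) for j in range(i)]   # context of T_i
```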
Datasets: We conducted experiments on three publicly available LBSN datasets: NYC, TKY and CA. NYC and TKY [18] are two datasets collected from users sharing their check-ins on Foursquare in New York and Tokyo, respectively. CA is a subset of a Foursquare dataset [19] that includes long-term, global-scale check-in data; we use the check-ins of users in California as the CA dataset in this paper. The check-in times of the above datasets range from April 2012 to September 2013. Each record contains an anonymized user identifier, the time, the POI id, the POI category, and the latitude and longitude of the check-in. To alleviate the problem of data sparsity, following previous work [12], we expand the set of trajectories by adding the sub-trajectories of the original trajectories. For all datasets, we choose 90% of each user's trajectories as the training set, and the remaining 10% as testing data.

Compared Methods: We compare PDPNN with several representative methods for location prediction:

- RNN [7]: A basic method for POI prediction, which has been successfully applied in word embedding and ad click prediction.
- AT-RNN: This method empowers the RNN model with an attention mechanism, which has been successfully applied in machine translation and vision processing.
- LSTM [14]: A variant of the RNN model, which contains a memory cell and three multiplicative gates to allow long-term dependency learning.
- ATST-LSTM [9]: A state-of-the-art method for POI prediction, which applies an attention mechanism to an LSTM network.
- PDPNN: Our approach, which learns a user's personal dynamic preference based on the current spatial and temporal context.

To evaluate the performance of all methods on the POI recommendation problem, we employ the commonly used recall@N metric. Recall@N is popular in ranking tasks; it measures whether the ground-truth next POI appears in the top N of the ranked prediction list, and a larger value indicates better performance. We conducted the experiments on a machine with an Intel Xeon CPU and an NVIDIA Tesla P4 GPU.

The performance comparison of the methods on the three datasets is shown in Table 2. The RNN, with no extra optimization, has the lowest baseline performance. The AT-RNN improves performance greatly, by 2 to 3 times over the RNN on all three datasets, which shows that the attention mechanism can alleviate the RNN's difficulty in learning long-term dependencies with gradient descent. The LSTM outperforms the RNN by a large margin because of its memory cell and forget gate design, but fails to outperform the AT-RNN on the TKY and CA datasets. ATST-LSTM turns out to be the strongest baseline and shows a significant improvement on the TKY and CA datasets, because it learns long-term dependencies better through its gate and attention mechanisms; this shows that the attention mechanism is an effective supplement to LSTM models. PDPNN outperforms the above baseline methods in almost all of the recall metrics.
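For reference, the recall@N values discussed here can be computed as the fraction of test check-ins whose ground-truth next POI falls in the model's top-N list. A minimal sketch (names are illustrative, not from the paper's code):

```python
import numpy as np

def recall_at_n(scores: np.ndarray, ground_truth: np.ndarray, n: int) -> float:
    """scores: (num_samples, num_pois) predicted POI probabilities;
    ground_truth: (num_samples,) index of the true next POI.
    Returns the fraction of samples whose true POI is in the top-n list."""
    top_n = np.argsort(-scores, axis=1)[:, :n]            # n highest-scoring POIs per sample
    hits = (top_n == ground_truth[:, None]).any(axis=1)   # true POI among the top n?
    return float(hits.mean())
```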
Compared with ATST-LSTM, PDPNN obtains recall@1, recall@5 and recall@10 improvements of 7-10% on NYC and CA, and recall@1 and recall@5 improvements of 5-7% on TKY. By modeling users' personal dynamic preferences elaborately, PDPNN can capture the intentions of user activities more accurately, which enables it to achieve better performance in next POI recommendation. It is worth noting that PDPNN has lower recall@10 and recall@20 than ATST-LSTM on the TKY dataset. One reason is that PDPNN treats POI recommendation as a classification problem, and its objective function optimizes the accuracy of the top prediction (recall@1). Another possible and interesting reason is that people's preferences vary widely from country to country.

To evaluate the efficiency of the inner epoch cache, we compare the average per-epoch training time of the model with and without the inner epoch cache. Figure 2 shows that PDPNN with the inner epoch cache trains 45×, 38× and 30× faster than the model without the cache on the NYC, TKY and CA datasets, respectively. With the inner epoch cache, the per-epoch training time on the TKY dataset is reduced from more than one day to one hour. This shows that the inner epoch cache can greatly accelerate the model training process.

In this paper, we have proposed a new approach to better model the user's personal dynamic preference for the next POI from the current trajectory supplemented with a trajectory context. Based on an attention-empowered LSTM network, we build a new neural network model named PDPNN for next POI recommendation. Moreover, to accelerate training over trajectory contexts, we propose an inner epoch cache to store the trajectory context output states during model training, which reduces the complexity of the personal dynamic preference module from O(n^2) to O(n) in each training epoch. We conduct experiments on three real-world datasets and show that our model outperforms current well-known methods.

References:
- Next place prediction using mobility Markov chains
- Factorizing personalized Markov chains for next-basket recommendation
- Matrix factorization techniques for recommender systems
- Temporal collaborative filtering with Bayesian probabilistic tensor factorization
- GeoMF: joint geographical modeling and matrix factorization for point-of-interest recommendation
- Collaborative filtering meets mobile recommendation: a user-centered approach
- Sequential click prediction for sponsored search with recurrent neural networks
- Predicting the next location: a recurrent model with spatial and temporal contexts
- An attention-based spatio-temporal LSTM network for next POI recommendation
- Where to go next: a spatiotemporal LSTM model for next POI recommendation
- SERM: a recurrent model for next location prediction in semantic trajectories
- Adaptation and evaluation of recommendations for short-term shopping goals. In: RecSys
- Learning long-term dependencies with gradient descent is difficult
- Bidirectional recurrent neural networks
- Hierarchical attention networks for document classification
- Neural machine translation by jointly learning to align and translate
- Attention is all you need
- Modeling user activity preference by leveraging user spatial temporal characteristics in LBSNs
- Participatory cultural mapping based on collective behavior data in location based social networks