Simultaneous people tracking and motion pattern learning

Sarath Kodagoda ⇑, Stephan Sehestedt
Centre for Autonomous Systems, Faculty of Engineering and Information Technology, University of Technology, Sydney, PO Box 123, Broadway, NSW 2007, Australia

Expert Systems with Applications 41 (2014) 7272–7280
Journal homepage: www.elsevier.com/locate/eswa
http://dx.doi.org/10.1016/j.eswa.2014.05.019
0957-4174/Crown Copyright © 2014 Published by Elsevier Ltd. All rights reserved.

⇑ Corresponding author. Tel.: +61 295147629; fax: +61 295142655.
E-mail addresses: Sarath.Kodagoda@uts.edu.au (S. Kodagoda), Stephan.Sehestedt@uts.edu.au (S. Sehestedt).

Article info
Article history: Available online 29 May 2014
Keywords: People tracking; Motion pattern learning; Human Robot Interaction

Abstract
The field of Human Robot Interaction (HRI) encompasses many difficult challenges, as robots need a better understanding of human actions. Human detection and tracking play a major role in such scenarios. One of the main challenges is to track people through long term occlusions, which arise from the agile nature of human navigation. However, in general humans do not make random movements. They tend to follow common motion patterns depending on their intentions and environmental/physical constraints. Therefore, knowledge of such common motion patterns could allow a robotic device to robustly track people even with long term occlusions. On the other hand, once robust tracking is achieved, the tracks can be used to enhance the common motion pattern models, allowing robots to adapt to new motion patterns that appear in the environment. Therefore, this paper proposes to learn human motion patterns based on the Sampled Hidden Markov Model (SHMM) and simultaneously track people using a particle filter tracker.
The proposed simultaneous people tracking and human motion pattern learning not only improved the tracking robustness compared to more conservative approaches, it also proved robust to prolonged occlusions and maintained identities. Furthermore, the integration of people tracking and on-line SHMM learning led to improved learning performance. These claims are supported by real world experiments carried out on a robot with a suite of sensors including a laser range finder.

1. Introduction

Successful Human Robot Interaction (HRI) requires a robot to have advanced abilities to carry out complex tasks. One such ability is robust people tracking. It has been identified as an important tool in HRI not only for safe operation (Schulz, Burgard, Fox, & Cremers, 2003) but also for collision avoidance (Bennewitz, Burgard, Cielniak, & Thrun, 2005) or to implement following behaviors (Bolić & Fernández-Caballero, 2011; Gockley, Forlizzi, & Simmons, 2007; Prassler, Bank, & Kluge, 2002). The conventional approach to people tracking is to use known motion models (Kluge, Kohler, & Prassler, 2001; Montemerlo, Thrun, & Whittaker, 1999), including probabilistic motion models (Tadokoro, Hayashi, Manabe, Nakami, & Takamori, 1995; Zhu, 1991). However, those methods have limited ability to model agile human movements. Therefore, some researchers opt not to use motion models (Francesc Serratosa & Amézquitaa, 2012; Kluge et al., 2001). Other techniques can track people in the vicinity of the sensors (Montemerlo et al., 1999; Schulz, Burgard, Fox, & Cremers, 2003). However, they do not provide solutions for tracking with long term occlusions. Rosencrantz, Gordon, and Thrun (2003) have overcome the problem of tracking with temporary occlusions, however the technique is not reliable when tracking outside the sensory range.
It has been observed that human motion often follows place dependent patterns, as it is influenced by a combination of social, psychological and physiological constraints (Altman, Rapoport, & Wohlwill, 1980; Arechavaleta, Laumond, Hicheur, & Berthoz, 2006; Dean, 1996; Hall, 1969). Therefore, if such a model could be learned by a robot, it could subsequently be used to improve people tracking.

Approaches to the learning of place dependent motion pattern models have been proposed in the past. Bennewitz et al. (2005) used a network of laser scanners to learn motion patterns of individual occupants of an office environment. After collecting a data set, Expectation Maximization (EM) was used to cluster trajectories for building a Hidden Markov Model (HMM). The HMM was used to implement collision avoidance behaviors. However, this technique relies on environment mounted sensors, which require infrastructure modifications and hence make deployment difficult. In Kanda, Glas, Shiomi, and Hagita (2009), multiple laser scanners were used to learn activity patterns of people in a shopping mall. The idea was to automatically identify potential customers in order to communicate with them. The approach seems appealing, however it uses six infrastructure mounted laser scanners and an off-line learning framework, which may limit its feasibility in many other robotic applications. Using stationary laser scanners, Luber, Tipaldi, and Arras (2011) proposed to learn a spatial affordance map based on a Poisson process.
The approach was shown to be a viable extension to Multi Hypothesis Tracking (MHT) with improved tracking capabilities and robustness. However, it uses stationary sensors. Vasquez, Fraichard, and Laugier (2009) proposed Growing Hidden Markov Models to learn motion patterns. Although it implements an on-line learning algorithm, the technique requires the observer to be stationary. Lookingbill, Lieb, Stavens, and Thrun (2005) used a small helicopter with a camera as a semi-stationary observer to monitor a single roundabout and build an activity histogram. This histogram was shown to improve target tracking notably, however it may be neither feasible nor suitable for indoor scenarios like the one considered in this paper. Bruce and Gordon (2004) used training data to learn goal locations in a small environment. This information was then used to provide a particle filter based tracker with an improved prediction model, which was shown to perform better than Brownian motion for prediction. This work relies on previously learned goal locations in a given environment.

In general, the methods proposed in the literature have several limitations, as described in the previous paragraphs: the requirement of infrastructure based sensors, difficulty of operation under partial observability or occlusions, and limited on-line adaptability and operation. Further, they are not capable of simultaneously improving both the model learning and the tracking. Therefore, here we propose simultaneous people tracking and motion pattern learning based on our previous work on SHMMs (Sehestedt, Kodagoda, & Dissanayake, 2010), which uses robot mounted sensors for on-line learning and can effectively handle partial observations with the limited FOV of the sensors. This framework allows the robot to improve its tracking abilities with the availability of a learned model, while improving the model learning with the feedback from the improved tracker.
The SHMM based framework was tested in an office environment using the robot LISA shown in Fig. 1, with appealing results. More specifically, the contributions of this paper are,
1. Formulation of the Sampled Hidden Markov Model for effective handling of on-line learning and model adaptation to capture the changes in the human motion patterns.
2. Synthesis of a theoretically sound probabilistic algorithm for simultaneous people tracking and motion pattern learning that improves both aspects.
3. Implementation and testing of the algorithm, obtaining superior results for long term occlusions in comparison with conventional (model based) methods.

Fig. 1. The LISA robot.

This paper is organized as follows. Section 2 gives a brief introduction to SHMMs for on-line learning of common human motion patterns. Section 3 introduces and formulates the simultaneous people tracking and motion pattern model learning. Section 4 presents experimental results showing the viability and effectiveness of the proposed approach. Finally, Section 5 summarizes the contributions and briefly discusses current and future work.

2. Sampled Hidden Markov Models

In this section, a brief introduction to SHMM learning is given. SHMMs provide a sparse representation of common motion patterns and can be learned on-line using a mobile robot's on-board sensors. The importance of this can be seen in Fig. 2, which illustrates a robot as a red circle and its current laser scan as a red outline. It can be observed that the field of view (FOV) of the robot is a small fraction of the size of the operating environment. Thus, any learning algorithm to be used in such environments needs the ability to learn incrementally from partial observations.

An SHMM is defined by its states S and state transition matrix A, where each state is represented by a set of weighted samples.
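As an illustration of how a weighted sample set can be reduced to a compact state description, the following minimal sketch computes a weighted mean and covariance for 2-D positions. The function name `weighted_mean_cov`, the 2-D position representation and the use of the biased (population) covariance are assumptions made for this example and are not taken from the paper's implementation.

```python
from typing import Sequence, Tuple

Point = Tuple[float, float]
Cov = Tuple[Tuple[float, float], Tuple[float, float]]


def weighted_mean_cov(samples: Sequence[Point],
                      weights: Sequence[float]) -> Tuple[Point, Cov]:
    """Summarise a weighted 2-D sample set by its mean and covariance."""
    if len(samples) != len(weights) or not samples:
        raise ValueError("need one weight per sample and at least one sample")
    total = sum(weights)
    if total <= 0.0:
        raise ValueError("weights must sum to a positive value")
    w = [wi / total for wi in weights]  # normalise so the weights sum to 1

    # Weighted mean of the x and y coordinates.
    mx = sum(wi * x for wi, (x, _) in zip(w, samples))
    my = sum(wi * y for wi, (_, y) in zip(w, samples))

    # Weighted (biased) covariance entries around that mean.
    sxx = sum(wi * (x - mx) ** 2 for wi, (x, _) in zip(w, samples))
    syy = sum(wi * (y - my) ** 2 for wi, (_, y) in zip(w, samples))
    sxy = sum(wi * (x - mx) * (y - my) for wi, (x, y) in zip(w, samples))
    return (mx, my), ((sxx, sxy), (sxy, syy))
```

For instance, four equally weighted samples at the corners of a 2 m square yield a mean at the centre of the square and a diagonal covariance.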
Although these sample sets can represent arbitrary probability distributions, for notational simplicity the states are defined by their means $\mu$ and covariances $\Sigma$ as,

$$S = s^{(i)} = \begin{bmatrix} \mu^{(i)} \\ \Sigma^{(i)} \end{bmatrix}, \quad 1 \le i \le N \tag{1}$$

where $N$ is the number of states in the model. Note that the model is time dependent, as learning is done incrementally; however, the time index is omitted for convenience of notation. The state transition matrix contains the probabilities of transitions from state $i$ to state $j$ as

$$A = a^{(ij)} = \begin{bmatrix} K^{(ij)} \\ P(s^{(j)} \mid s^{(i)}) \end{bmatrix}, \quad 1 \le i \le N \tag{2}$$

where $K^{(ij)}$ is the number of times a transition was observed, from which the probability of the transition $P(s^{(j)} \mid s^{(i)})$ can be calculated.

Fig. 2. A small robot observing its environment.

A particle filter based people tracker is used for learning the SHMM. Consider the situation in Fig. 3(a), where a person walked along the trajectory indicated by the arrow. The tracking algorithm produced a series of sample clusters, each of which could be interpreted as one state of an SHMM. To obtain a sparser representation of the observed trajectory, a subset of those sample clusters was used, as shown in Fig. 3(b), where the means, covariances and transitions of states are shown as red squares, red ellipses and red lines respectively. It is to be noted that the state transitions can be derived directly from the sequence of sample clusters, as the temporal order of the clusters is known.

Fig. 3. A basic example of SHMM learning.

Suppose another person is observed walking along the trajectory shown by the arrow in Fig. 3(c), and this information needs to be added immediately to the available motion pattern model. As part of the new trajectory overlaps the existing trajectory, a fusion mechanism is essential. Further, the non-overlapping states need to be appropriately added to the model.
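The bookkeeping behind the transition counts $K^{(ij)}$ can be sketched as follows. This is only a minimal illustration: the class name `TransitionTable`, the use of string state labels, and the normalisation of each count by the total number of transitions leaving state $i$ are assumptions, since the text states only that the probability can be calculated from the counts.

```python
from collections import defaultdict
from typing import Dict, Hashable, Sequence


class TransitionTable:
    """Sparse store of observed transition counts K(ij) between states."""

    def __init__(self) -> None:
        # counts[i][j] holds how often a transition from i to j was observed.
        self.counts: Dict[Hashable, Dict[Hashable, int]] = defaultdict(
            lambda: defaultdict(int))

    def observe(self, i: Hashable, j: Hashable) -> None:
        """Record one observed transition from state i to state j."""
        self.counts[i][j] += 1

    def observe_sequence(self, states: Sequence[Hashable]) -> None:
        """Record all transitions implied by a temporally ordered state list."""
        for i, j in zip(states, states[1:]):
            self.observe(i, j)

    def probability(self, i: Hashable, j: Hashable) -> float:
        """Estimate P(s_j | s_i) as K(ij) divided by all transitions out of i."""
        row = self.counts.get(i)
        if not row:
            return 0.0  # no transition out of i has been observed yet
        return row.get(j, 0) / sum(row.values())
```

Because only observed transitions are stored, the table stays small when the model is far from fully connected.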
The symmetric Kullback–Leibler divergence (KLD) (Kullback & Leibler, 1951) is used for this fusion. It finds associations between states in the SHMM and sample clusters obtained from people tracking as,

$$\mathrm{KLD}(s^{(i)} \,\|\, s^{(j)*}) = \mathrm{KLD}_{s}(s^{(i)} \,\|\, s^{(j)*}) + \mathrm{KLD}_{s^{*}}(s^{(j)*} \,\|\, s^{(i)}) \tag{3}$$

with $1 \le i \le N$ and $1 \le j \le K$, where $K$ is the number of sample clusters (candidate states, $s^{(j)*}$) reported by the tracker. Whenever an association is found, the new information is added to the corresponding state and the state transitions are updated by counting. The non-associated sections are added as new states.

The probabilistic formulation of the SHMM learning approach is given by the belief of motion patterns $D_t$ at time $t$, conditioned on the robot's location estimate and the people tracking results. That is, the belief

$$\mathrm{Bel}(D_t) = P(D_t \mid n_t, f_t, z_t, \ldots, n_0, f_0, z_0) \tag{4}$$

is to be computed, where $f_t$ is the robot localization hypothesis, $n_t$ is the people tracking result and $z_t$ is the sensor reading at time $t$. From this, an incremental update rule can be derived using Bayes' theorem as,

$$\mathrm{Bel}(D_t) = \frac{P(n_t \mid D_t, f_t, z_t, n_{t-1}, \ldots, n_0, f_0, z_0) \times P(D_t \mid f_t, z_t, n_{t-1}, \ldots, n_0, f_0, z_0)}{P(n_t \mid f_t, z_t, n_{t-1}, \ldots, n_0, f_0, z_0)} \tag{5}$$

Since the denominator is independent of $D_t$, this can be written as,

$$\mathrm{Bel}(D_t) = \eta\, P(n_t \mid D_t, f_t, z_t, n_{t-1}, \ldots, n_0, f_0, z_0) \times P(D_t \mid f_t, z_t, n_{t-1}, \ldots, n_0, f_0, z_0) \tag{6}$$

where $\eta = P(n_t \mid f_t, z_t, n_{t-1}, \ldots, n_0, f_0, z_0)^{-1}$ is a normalizing constant. This is the belief of $D$ at time $t$ given all past observations, sensor readings and observer poses. Obviously, this is not an efficient solution for updating the belief, since all observations of moving people, sensor data and observer poses up until time $t$ would have to be remembered. Therefore, it is assumed that observations and poses are conditionally independent of past observations and poses given $f_t$ and $D_t$, i.e. the system is Markov. Therefore,

$$\mathrm{Bel}(D_t) = \eta\, P(n_t \mid D_t, f_t, z_t)\, \overbrace{P(D_t \mid f_t, z_t, n_{t-1}, \ldots, n_0, f_0, z_0)}^{\text{prior belief}} \tag{7}$$

In fact, the last term of this equation is the belief at time $t-1$, and thus the final update rule is written as

$$\mathrm{Bel}(D_t) = \eta\, \overbrace{P(n_t \mid D_t, f_t, z_t)}^{\text{people tracking}}\, \mathrm{Bel}(D_{t-1}) \tag{8}$$

As given in the above equation, the belief $\mathrm{Bel}(D_t)$ can now be updated using only the most recent observations of moving people.

3. Simultaneous people tracking and motion pattern learning

In this section, the formulation is extended to simultaneous people tracking and motion model learning. The idea is to tightly integrate learning and tracking in order to improve the performance of both. Therefore, the goal is to estimate

$$P(D_t, n_t, f_t \mid z_t) \tag{9}$$

which is the joint probability of the motion pattern model $D_t$, the position estimates of people in the robot's FOV $n_t$ and the robot's location estimate $f_t$ at time $t$, given the latest sensor reading $z_t$. Assuming independence between these variables, it can be shown that

$$P(D_t, n_t, f_t \mid z_t) = P(D_t \mid n_t, z_t)\, P(f_t \mid z_t) \prod_{k=1}^{K} P(n_t^{(k)} \mid z_t) \tag{10}$$

where $K$ denotes the number of tracked people. It is to be noted that the first term on the right hand side is conditioned on $n_t$. Furthermore, it can be observed that the estimate of $D_t$ is conditionally independent of $z_t$ given $n_t$, which is estimated from the latest sensor reading, leading to

$$P(D_t, n_t, f_t \mid z_t) = P(D_t \mid n_t)\, P(f_t \mid z_t) \prod_{k=1}^{K} P(n_t^{(k)} \mid z_t) \tag{11}$$

The second term on the right hand side defines robot localization, and this estimate is commonly conditioned on the control input $u_t$, resulting in

$$P(D_t, n_t, f_t \mid z_t, u_t) = P(D_t \mid n_t)\, P(f_t \mid z_t, u_t) \prod_{k=1}^{K} P(n_t^{(k)} \mid z_t) \tag{12}$$

Intuitively, this equation could be solved from the right to the left, i.e. after states of people have been estimated, the result could be used to improve the robot localization.
The localization data in turn would be used to determine the global positions of detected people. Moreover, the result could be used in the first term to update the model of motion patterns. However, to fully exploit the idea of an incrementally learned model of motion patterns $D_t$, the simultaneous utilization and learning of the model is desirable. To accomplish this, the last term of the above equation can be made dependent on $D_t$:

$$P(D_t, n_t, f_t \mid z_t, u_t) = P(D_t \mid n_t)\, P(f_t \mid z_t, u_t) \prod_{k=1}^{K} P(n_t^{(k)} \mid z_t, D_t^{-}) \tag{13}$$

Note that when people tracking is performed, although time has already incremented to the current discrete time step $t$, the belief at this point is still $D_t = D_{t-1}$, which is indicated by the notation $D_t^{-}$. We now have a formulation of SHMM learning which explicitly accounts for a mobile observer and takes advantage of improved tracking while learning.

3.1. Motion prediction

There is no general model for human motion prediction. As such, very conservative prediction models like Brownian motion or the constant velocity model are commonly used (Luber et al., 2011). In rare cases, less conservative learned models have been presented (Bennewitz et al., 2005; Bruce & Gordon, 2004; Luber et al., 2011), however they often use stationary observers or off-line learning, which limits the potential mobile robotic applications. Here, we propose to utilize the learned motion models at the prediction stage of a particle filter (PF) based people tracker (Arulampalam, Maskell, & Gordon, 2002) as

$$P(n_t \mid n_{t-1}, D_t^{-}) \tag{14}$$

Fig. 4. Motion prediction using an SHMM.

Consider the scenario given in Fig. 4(a), with a person (blue ellipse) associated with state "a" in the model, state ellipses (red) and state transition lines (whose thicknesses are proportional to the magnitude of the transition probabilities). Then, in the next state, the location of the person can be predicted as indicated by the purple arrow based on dx and dy. In our current implementation, half of the samples are dedicated to SHMM based prediction and the other half are used for prediction based on the constant velocity model, to cater for previously unlearned motion patterns.

Fig. 5. The map used for experiments. Desk areas, corridors and common areas are marked as "D", "H" and "C" respectively. LISA's pose is shown by a red circle and the current laser reading is shown as a red outline. Note the limited FOV of LISA.

Fig. 6. A panoramic view of the large open office environment.

Fig. 7. Model learning with real world data.

Fig. 8. Experimental set up for the evaluation of the tracking performance. Two trajectories (blue and green) were considered.

Table 1. Number of track losses in 22 trials.

Experiment   Sample size
             50    100   200   500   1000
Stage 1      13    8     5     2     0
Stage 2      3     0     0     0     0
Stage 3      22    12    9     4     1
Stage 4      2     0     0     0     0

In more complex situations, there can be more state transitions based on the learned model (see Fig. 4(b)). There, the person associated with state d has two possible transitions, e and f. As the transition probabilities are available in the model, they are used to determine the number of samples that are allocated to each prediction. That is, predictions with higher transition probabilities are allocated more samples than predictions with lower transition probabilities.
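The allocation of samples described above can be sketched as follows. The function name `allocate_samples`, the `shmm_fraction` parameter and the largest-remainder rounding used to obtain integer sample counts are assumptions made for illustration; the text specifies only the half-and-half split and the proportionality to the transition probabilities.

```python
import math
from typing import Dict, Hashable, Tuple


def allocate_samples(transition_probs: Dict[Hashable, float],
                     n_total: int,
                     shmm_fraction: float = 0.5
                     ) -> Tuple[Dict[Hashable, int], int]:
    """Split n_total particles between SHMM transitions and constant velocity.

    Returns (samples per successor state, samples for constant velocity).
    """
    n_shmm = int(n_total * shmm_fraction)  # samples for SHMM based prediction
    total_p = sum(transition_probs.values())
    if total_p <= 0.0:
        # No learned transition is available: use constant velocity only.
        return {}, n_total

    # Ideal (fractional) share of each transition, proportional to P(s_j | s_i).
    exact = {j: n_shmm * p / total_p for j, p in transition_probs.items()}
    counts = {j: math.floor(v) for j, v in exact.items()}

    # Hand the samples lost to rounding to the largest fractional remainders.
    leftover = n_shmm - sum(counts.values())
    by_remainder = sorted(exact, key=lambda j: exact[j] - counts[j],
                          reverse=True)
    for j in by_remainder[:leftover]:
        counts[j] += 1
    return counts, n_total - n_shmm
```

With hypothetical transition probabilities of 0.75 and 0.25 for the two successors e and f of state d and a filter of 80 samples, this assigns 30 and 10 samples to the two SHMM predictions and keeps 40 for the constant velocity model.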
Hence the number of samples is estimated based on

$$N_D^{(j)} = P(s^{(j)} \mid s^{(i)}) \times N_D, \quad 1 \le i \le N,\; 1 \le j \le N \tag{15}$$

where $N$ is the number of states of the SHMM, $N_D$ is the number of samples dedicated to SHMM based prediction, and $N_D^{(j)}$ denotes the number of samples dedicated to the transition from the current state $i$ to state $j$.

3.2. Long term prediction

Long term prediction of people locations without observations is a complex and non-trivial problem due to the agile and complicated nature of human movements and the presence of occlusions. Without prior knowledge about movements or recent observations, conventional model based predictions tend to deviate from the actual people movements, leading to track losses. However, this may be solved to a great extent if knowledge of human motion patterns is available. In our approach, we utilize the learned human motion models based on the SHMM in the prediction stage of the particle filter, as described in the previous section. However, it is important to note that there is an inherent uncertainty in the learned model due to the gross nature of motion pattern learning. This gives rise to a slightly growing uncertainty of the estimator over time. Nevertheless, as the tracker still manages to follow the general trend of people motion, it has significantly fewer track losses.

Fig. 9. Learning while track losses occur.

4. Experimental results

Experiments were conducted using the Lightweight Integrated Social Autobot (LISA) shown in Fig. 1. LISA is a low cost robotic platform designed for fast and easy deployment. The base is an iRobot Create equipped with a small Intel Atom x86-64 computer together with a Hokuyo UTM-30LX laser range scanner (http://www.hokuyo-aut.jp) for perception. The software development environment is Player/Stage (http://playerstage.sourceforge.net/) and all the algorithms were implemented in C++ within the Orca software framework (Brugali et al., 2007). Fig. 5 shows a map of the environment generated by Simultaneous Localization and Mapping (SLAM), where desk areas and corridors are marked appropriately. The LISA robot is shown as a red circle, with the red outline illustrating the observed laser reading. Being a small robot, it has a significantly limited field of view due to the presence of furniture. The map spans approximately 32 m × 20 m. Fig. 6 shows a panoramic view of the office environment. It is important to note the complexity of the environment, with a large amount of clutter and semi-static objects like trash cans. Furthermore, some of the walls are made of glass, which contributes to perception issues.

4.1. Learning motion patterns

In this section, SHMM learning is presented with the robot LISA in the aforementioned office environment. Ten different subjects were included in this experiment and no modifications were made to the environment. The limited observability at most times means that the robot had to explore the environment to build a model of motion patterns. Furthermore, in order to observe longer trajectories the robot had to follow people, hence it was a truly mobile observer.

The series of images in Fig. 7 shows the evolution of an SHMM. Fig. 7(a) shows the robot following a person, where the person is represented by a yellow cylinder and the trajectory is shown as an orange line. The robot is shown as a red circle, where the red outline indicates the observed reading of the forward looking laser sensor. The observed trajectory exhibits a typical human motion and accordingly it is represented in the initial model in Fig. 7(b). Fig. 7(c) shows the model after more than 70 observed trajectories while the robot was on the move. The trajectories were successfully joined and compactly represented. The final representation, including more than 80 trajectories, is shown in Fig. 7(d) as unimodal Gaussian distributions. It can be noted that the trajectories are positioned correctly in free space rather than passing through obstacles on the map. Further, compared to grid based representations of motion patterns, greater efficiency is achieved, as the belief has to be maintained only in the relevant areas of interest (those with human motion) rather than over the entire space.

Fig. 10. Motion prediction during sensor failure in a complex situation. (a) Two people were tracked and associated with an SHMM with two disconnected trajectories. (b) LISA's sensor fails. Consequently, the position estimates of people were updated based on the SHMM. (c) Over an extended period of time, the tracks and identities were maintained using the information from the SHMM. (d) After the sensor resumed operation, both tracks could be recovered successfully and the identities were correctly maintained.

Fig. 11. Long term motion prediction.

4.2. Tracking robustness

In this experiment, the robot was positioned at an intersection while observing people trajectories, as shown in Fig. 8. The test data consists of 22 observed trajectories for the right turn and the same number for the left turn (44 in total). The experiment was divided into four stages. In the first stage (stage 1), people followed the trajectory indicated by the blue arrow (right turn) to evaluate the tracking performance. In the second stage (stage 2), a model of the blue trajectory was learned and used in the tracker to predict the motion. In the third stage (stage 3), the model from stage 2 was used, however people followed the green trajectory (left turn), i.e.
they exhibited a drastically different behavior compared to the previous observations. In the fourth stage (stage 4) of the experiment, the model was extended to include both types of trajectories and tested on people following either trajectory. Based on these tests, the tracking performance was evaluated in terms of the number of track losses, as shown in Table 1.

In stage 1, the subjects took a sharp right turn, as indicated by the blue arrow in Fig. 8, while being observed by the robot. As there was no previously learned SHMM, tracking was done using the near constant velocity model for motion prediction. Sample sizes of 50, 100, 200, 500 and 1000 were considered, and the observed numbers of track losses are summarized in the first row of Table 1. Not surprisingly, tracking with fewer samples resulted in a large number of track losses. This is due to the sharp turn, for which the constant velocity model could not provide a reasonable prediction. However, with an increasing number of samples, the tracker achieved fewer track losses.

In stage 2, the SHMM was first learned from the sharp right turn data before being used in the tracker. The algorithm was tested by replaying the data from stage 1 of the experiment. As shown in Table 1, the tracker performance improved significantly, with no track losses when 100 or more samples were used in the PF.

In stage 3, data of people following the right turn was used to learn the model, which was then tested on people following a left turn, i.e. their motion pattern diverged from the learned model. As expected, the number of track losses increased significantly, exceeding even that obtained with the constant velocity model (stage 1). It can also be seen that a small number of samples has an adverse effect on the tracking performance.
Finally, in stage 4, after adaptively learning the left turn in addition to the previously learned right turn, the tracker performed significantly better, with a loss of only 2 tracks even with 50 samples used in the PF.

4.3. Learning with track losses

The findings in stages 1 and 3 of the above experiment can be analysed further to understand the process of learning in the presence of track losses. As discussed before, in stage 1 LISA observed people following the blue trajectory in Fig. 8 without having an a priori motion pattern model. People were tracked with a sample size of 100 in this experiment. Fig. 9(a) shows the tracking result for the first observed person, where the track was lost during the sharp turn. Fig. 9(b) illustrates the learning result based on that trajectory. As can be seen in Table 1, a considerable number of tracks were lost during the process without contributing much to the learned model. However, once an extended track was observed, as shown in Fig. 9(c), the result was immediately used to update the model, as shown in Fig. 9(d). This model was further refined with a few more extended tracks to obtain the gross motion model. Keeping only the combined trajectory alleviates the need to store all the individual trajectories.

4.4. Occlusion handling and long term prediction

Consider a part of an office environment with two strictly separated motion trajectories, as shown in Fig. 10(a). The figure further shows two people who were tracked by LISA while both motions were predicted based on the motion pattern model. Then, in Fig. 10(b), a sensor failure occurred, which is indicated by the absence of the red outline of the laser scan. From this point, the two people tracks were maintained purely based on the predictions made by the motion pattern model, and the predicted estimates are shown as transparent purple cylinders in the figure. The sensor failure persisted for a while and the two people turned to their respective rights.
While the constant velocity model based tracker failed to track both targets, our SHMM based tracker could successfully track in such agile situations, as shown in Fig. 10(c). Furthermore, it could successfully maintain the identities of people. Finally, as in Fig. 10(d), after the sensor resumed its normal operation, the two people were re-detected, correcting the associated uncertainties of the filter.

Consider a hallway which is partly divided by a wall, as shown in Fig. 11(a), where the dividing wall is approximately eight meters long. A motion pattern model has been learned for this environment. At one time, a person appeared while the robot was observing the hallway from the left side. The person moved upwards in the figure and took the path to the right of the dividing wall, leaving the field of view of the robot. Once the person was occluded by the wall, there were no observations, so all samples were predicted according to the SHMM (Fig. 11(b)) for continuous tracking. Although the unavailability of observations led to a slight increase of uncertainty (indicated by the green covariance ellipse in Fig. 11(c)), it did not cause a track loss. The constant velocity model based implementation lost track within a couple of iterations of the particle filter. Once observations were available again towards the end of the track, the filter converged successfully, as shown in Fig. 11(d).

5. Conclusions and future work

5.1. Conclusions

Model based conventional trackers have significant difficulties in tracking with long term occlusions. Therefore, this paper proposed to use human motion pattern models in people tracking within a probabilistic framework. We presented simultaneous people tracking and human motion pattern learning based on Sampled Hidden Markov Models and a particle filter tracker.
Learning is achieved on a mobile observer, where only the robot's on-board sensors were used without relying on cumbersome infrastructural sensors. Since there was no dedicated learning phase, all new information was incorporated in the model immediately, improving the adaptability. This property is of central importance and particularly distinguishes our work from many of the previously presented approaches. The representation of motion as an SHMM is more memory efficient than grid based approaches, as the model only represents motion in areas of interest rather than the whole space. Furthermore, as the model is not fully connected, the transition matrix can be replaced by more compact data structures.

The knowledge of motion patterns has been shown to be useful not only in handling long term occlusions but also in maintaining identities. While a reduction in the tracking errors could be observed with the SHMM integrated tracker, the most significant improvement was observed in the tracking robustness, especially in complex situations like a sudden change of direction. The number of lost tracks was greatly reduced even when using small sample sizes for PF tracking. These effects in turn led to better convergence in motion pattern learning, which again highlights the benefits of on-line learning of human motion patterns. The integration of tracking and learning was shown to have mutual benefits.

5.2. Future work

Once the robot is placed in a new environment, it takes some time to learn the motion patterns. Until a reasonable motion pattern model is learned, the tracking performance of the current algorithm suffers in accuracy. Therefore, learning human behaviors directly from environmental features (through object affordance and hallucinated humans), rather than relying on observing humans in the environment, is a research direction of future interest.
Further, in the context of learning human spatial behavior, this paper has presented some of the fundamental abilities needed for lifelong learning, such as on-line learning, adaptability and the ability to include incomplete knowledge in incremental learning. Another key competency for lifelong learning under uncertainty in a changing environment is the forgetting of obsolete information, which is not addressed in this research and is considered a promising future direction.

Acknowledgments

This work was supported by the ARC Centre of Excellence programme, funded by the Australian Research Council (ARC) and the New South Wales State Government.

References

Altman, I., Rapoport, A., & Wohlwill, J. (1980). Environment and culture. Springer.
Arechavaleta, G., Laumond, J. P., Hicheur, H., & Berthoz, A. (2006). The nonholonomic nature of human locomotion: A modeling study. In IEEE/RAS-EMBS international conference on biomedical robotics and biomechatronics.
Arulampalam, M. S., Maskell, S., & Gordon, N. (2002). A tutorial on particle filters for online nonlinear/non-Gaussian Bayesian tracking. IEEE Transactions on Signal Processing, 50, 174–188.
Bennewitz, M., Burgard, W., Cielniak, G., & Thrun, S. (2005). Learning motion patterns of people for compliant robot motion. International Journal of Robotics Research, 24.
Bolić, J. M. G., & Fernández-Caballero, A. (2011). Agent-oriented modeling and development of a person-following mobile robot. Expert Systems with Applications, 38, 4280–4290.
Bruce, A., & Gordon, G. (2004). Better motion prediction for people-tracking. In IEEE international conference on robotics and automation (ICRA), Citeseer.
Brugali, D., Brooks, A., Cowley, A., Côté, C., Domínguez-Brito, A., Létourneau, D., Michaud, F., & Schlegel, C. (2007). Trends in component-based robotics. Springer tracts in advanced robotics (Vol. 30). Berlin/Heidelberg: Springer.
Dean, D. (1996). Museum exhibition: theory and practice. Routledge.
Francesc Serratosa, R. A., & Amézquitaa, N. (2012). A probabilistic integrated object recognition and tracking framework. Expert Systems with Applications, 39, 7302–7318.
Gockley, R., Forlizzi, J., & Simmons, R. (2007). Natural person-following behavior for social robots. In ACM/IEEE international conference on human-robot interaction (HRI) (pp. 17–24). New York, NY, USA: ACM. http://doi.acm.org/10.1145/1228716.1228720.
Hall, E. T. (1969). The hidden dimension – man’s use of space in public and private. London: The Bodley Head Ltd.
Kanda, T., Glas, D., Shiomi, M., & Hagita, N. (2009). Abstracting people’s trajectories for social robots to proactively approach customers. IEEE Transactions on Robotics, 25, 1382–1396. http://dx.doi.org/10.1109/TRO.2009.2032969.
Kluge, B., Kohler, C., & Prassler, E. (2001). Fast and robust tracking of multiple moving objects with a laser range finder. In IEEE international conference on robotics and automation (ICRA) (pp. 1683–1688).
Kullback, S., & Leibler, R. (1951). On information and sufficiency. The Annals of Mathematical Statistics, 22, 79–86.
Lookingbill, A., Lieb, D., Stavens, D., & Thrun, S. (2005). Learning activity-based ground models from a moving helicopter platform. In IEEE international conference on robotics and automation (ICRA) (pp. 3948–3953).
Luber, M., Tipaldi, G. D., & Arras, K. O. (2011). Place-dependent people tracking. The International Journal of Robotics Research. http://dx.doi.org/10.1177/0278364910393538.
Montemerlo, M., Thrun, S., & Whittaker, W. (1999). Conditional particle filters for simultaneous mobile robot localization and people-tracking. In IEEE international conference on robotics and automation (ICRA) (pp. 695–701).
Prassler, E., Bank, D., & Kluge, B. (2002). Motion coordination between a human and a mobile robot. In IEEE/RSJ international conference on intelligent robots and systems (IROS) (pp. 1228–1233).
Rosencrantz, M., Gordon, G., & Thrun, S. (2003). Locating moving entities in dynamic indoor environments with teams of mobile robots. In Second joint international conference on autonomous agents and multi agent systems (pp. 233–240).
Schulz, D., Burgard, W., Fox, D., & Cremers, A. B. (2003). People tracking with mobile robots using sample-based joint probabilistic data association filters. The International Journal of Robotics Research, 22, 99–116.
Sehestedt, S., Kodagoda, S., & Dissanayake, G. (2010). Models of motion patterns for mobile robotic systems. In IEEE/RSJ international conference on intelligent robots and systems (IROS) (pp. 4127–4132).
Tadokoro, S., Hayashi, M., Manabe, Y., Nakami, Y., & Takamori, T. (1995). On motion planning of mobile robots which coexist and cooperate with human. In IEEE/RSJ international conference on intelligent robots and systems (IROS) (pp. 518–523).
Vasquez, D., Fraichard, T., & Laugier, C. (2009). Growing hidden markov models: An incremental tool for learning and predicting human and vehicle motion. The International Journal of Robotics Research, 28, 1486–1506.
Zhu, Q. (1991). Hidden markov model for dynamic obstacle avoidance of mobile robot navigation. IEEE Transactions on Robotics and Automation, 7, 390–397.