The proliferation of personal tracking devices (e.g., smart wristbands) allows for the continuous and pervasive collection of an individual's behavioral and physiological data. These sensed multi-modal streams are often chronologically ordered (e.g., time series of heart rate measurements); we refer to them as temporal multi-modal sensory data in this thesis. Such data, drawn from multiple sensor sources, can benefit a wide spectrum of applications, such as personality inference, wellness assessment, and demographics prediction. Mining such data has also become a cornerstone of related research and applications.

However, temporal multi-modal sensory data poses unique challenges. Compared to other chronological sequences, such as music, temporal multi-modal sensory data is more sensitive to individual characteristics, i.e., similarities and differences in the data. To tackle this challenge, personal features should be given particular attention when mining such data for research purposes. For example, small fluctuations in heart rate data can significantly affect data analysis, so it is important to recover personal characteristics when imputing missing heart rate values. As such, this thesis aims to address the research question: how can we make full use of personal insights when modeling temporal multi-modal sensory data?

The main thrust of this dissertation is to advance personalized knowledge in temporal multi-modal sensory data mining and to propose novel solutions for its applications. Specifically, we present research across the end-to-end data mining procedure, including: i. introducing personalized characteristics in data preprocessing; ii. generating individual feature representations while capturing the temporal and semantic patterns in the data; iii. learning and aggregating personal insights from multiple sources in various applications.

This dissertation will not only provide practical solutions to domain-specific problems, but also emphasize the role of personal insights in handling temporal sensory data. It will shed light on future research on processing and learning from temporal multi-modal sensory data. Furthermore, the pervasiveness of self-tracking devices, coupled with methods for capturing personal traits, will play an important role in understanding individuals and designing content suited to them.