key: cord-0486506-rlathgs9 authors: Tashakori, Arvin; Zhang, Wenwen; Wang, Z. Jane; Servati, Peyman title: SemiPFL: Personalized Semi-Supervised Federated Learning Framework for Edge Intelligence date: 2022-03-15 journal: nan DOI: nan sha: dd98a97c3b75dd7de5d3d476a74a276fc37f432b doc_id: 486506 cord_uid: rlathgs9 Recent advances in wearable devices and Internet-of-Things (IoT) have led to massive growth in sensor data generated on edge devices. Labeling such massive data for classification tasks has proven to be challenging. In addition, data generated by different users bear various personal attributes and edge heterogeneity, rendering it impractical to develop a global model that adapts well to all users. Concerns over data privacy and communication costs also prohibit centralized data accumulation and training. This paper proposes a novel personalized semi-supervised federated learning (SemiPFL) framework to support edge users who have no labels or only limited labeled datasets, together with a sizable amount of unlabeled data that is insufficient to train a well-performing model. In this work, edge users collaborate to train a hyper-network in the server that generates personalized autoencoders for each user. After receiving updates from edge users, the server produces a set of base models for each user, which the users locally aggregate using their own labeled datasets. We comprehensively evaluate our proposed framework on various public datasets and demonstrate that SemiPFL outperforms state-of-the-art federated learning frameworks under the same assumptions. We also show that the solution performs well for users with no or limited labeled datasets, and that performance improves with more labeled data and more users, signifying the effectiveness of SemiPFL in handling edge heterogeneity and limited annotation.
By leveraging personalized semi-supervised learning, SemiPFL dramatically reduces the need for annotating data while preserving privacy in a wide range of application scenarios, from wearable health to IoT. Over the past years, the evolution of sensor and wearable technologies and Internet of Things (IoT) devices has led to a wealth of personalized time-series multi-sensory data, tackling various real-life problems, including human activity recognition [1]-[3], sleep stage identification [4], and fall detection [5]. The success in these application domains relies on supervised learning frameworks [6]-[8], where high-quality labeled data is necessary to train classification models in a strictly controlled context. However, acquiring a large amount of personalized labeled data in a centralized server is cost-prohibitive, time-consuming, and impractical due to complexity and privacy concerns [9]. (A. Tashakori is the corresponding author: arvin@ece.ubc.ca.) In summary, these problems have made it challenging to use an extensive amount of multi-sensor time-series data for predictive analytics tasks for each edge user. Addressing the above concerns motivates us to answer the following research questions: Can we develop a framework to help users participate in cooperative training without violating their privacy? Can they perform better than supervised learning methods with a few labeled data points? What is the influence of the number of available labeled instances on the quality of training? Can the framework provide personalized models adapted to individual users instead of one global model? What is the impact of the number of users who are collaborating? Federated learning can address the privacy concern regarding sharing of users' data with a centralized server.
It has gained much attention in recent years, with applications in detection of industrial anomalies [10], [11], air quality sensing [12], traffic forecasting [13], resource management in wireless networks [14]-[17], human-computer interaction [18], smart ocean [19], COVID-19 detection [20], [21], medical imaging [22], clinical decision systems [17], [23], and embedded intelligence [9], [24]-[28]. In federated learning frameworks, users collaboratively train a global model while keeping their data isolated on the edge devices [29]-[31]. The first method, introduced by Google in 2016, FedAvg [32], is widely used as the baseline [28], [33], [34]. In FedAvg, the server first sends a global model to each user. Edge users then update the model using their locally labeled data and communicate the results back to the server. Finally, the server updates the global model by averaging the received model parameters from the users. Despite its simple framework, FedAvg has demonstrated exceptional performance in several scenarios and applications [29], [32], [35]. At the same time, the performance of federated learning methods, including FedAvg, degrades in scenarios where edge users have heterogeneous attributes. To address this challenge, recent works have aimed at developing a personalized model for each user [26], [27]. An overview of personalized federated learning models is illustrated in Fig. 1. In personalized federated learning, users collaborate to develop personalized models instead of generating a global model [28], [33], [36]-[41]. However, these works primarily focus on image classification benchmarks such as CIFAR10 [42] and CIFAR100 [43], while applications such as multi-sensor and IoT classification are left unattended. Also, in most of these works the main assumption is that there exists a large set of labeled data on each user's side, which may be impractical. Data annotation is another big challenge to address.
Current solutions mainly employ supervised learning for sensor-based predictive analytics tasks. For example, in activity recognition, supervised learning is used to deduce activity categories from time-series sensory input data generated by edge sensors [44]-[46]. Data annotation is required by an expert in the field. In practice, we mostly have either no labels or only some labeled data, leaving the bulk of data unlabeled on the edge devices. Although federated learning can protect user privacy, most of the existing methods rely on a high degree of annotated data. However, due to the unpredictably variable characteristics of edge users and devices, obtaining labeled data at the edge is difficult, making the current solutions impractical. To address these shortcomings, semi-supervised learning, which sits between supervised and unsupervised learning, has been proposed to deal with the insufficient-labeled-data problem [47]. In semi-supervised learning, we have a small amount of labeled data and a large amount of unlabeled data. There have been multiple attempts to unify federated learning with semi-supervised learning [28], [33], [48]. Zhao et al. consider that users have a large amount of unlabeled data, while the server has a set of labeled data [48]. Saeed et al. developed a self-supervised method to learn valuable representations from users' unlabeled data [9]. However, none of these works investigates methods to solve the data heterogeneity problem. In this work, we aim to integrate semi-supervised learning with personalized federated learning for multi-sensory time-series-based classification. We assume that we have a set of labeled data from different distributions at the server, and a small set of labeled data and a large set of unlabeled data on the users' side. The proposed framework contains three main steps: first, the server learns to generate a personalized autoencoder for each user using a hyper-network.
Second, the server selects samples close to each user's distribution, transforms its dataset using the corresponding user's encoder, trains a set of base models that map the latent representation to output classes, and sends them to that specific user. Finally, the user aggregates the received models using its labeled dataset. Our main contributions are as follows: 1) We propose SemiPFL, a semi-supervised personalized federated learning framework, for edge intelligence. SemiPFL applies to a wide range of scenarios, from supervised learning settings to having no labeled data available on the user side. To the best of our knowledge, our solution is the first attempt to integrate personalized federated learning and semi-supervised learning for multi-sensory classification. 2) To our knowledge, this is the first work developing personalized models using a hyper-network for multi-sensory classification. 3) The proposed method achieves higher recognition accuracy than recent publications while using fully connected neural network architectures. 4) We extensively evaluate the proposed framework on publicly available datasets gathered from edge devices such as smartphones and wearable devices. We show better performance of the proposed method as more users collaborate in the framework and as more labeled instances become available on the user side. The proposed SemiPFL aims at unifying semi-supervised learning with personalized federated learning for multi-sensory time-series edge inputs. In subsequent sections, we present the details of the main building blocks of our method and related works. Fig. 2: Overview of the semi-supervised learning method used in this work. First, we train an autoencoder on the whole labeled and unlabeled data. Then, we use the encoder to transform the data to its latent representation, and then we use the labeled data to train a base classifier.
The final model is the trained encoder followed by the base classifier. Semi-supervised learning is one division of machine learning that falls between supervised and unsupervised learning [47], where generally we have a few labeled instances and a large number of unlabeled instances, and the number of labeled data points is insufficient to train a desirable supervised model [47]. One major challenge in supervised learning frameworks is the availability of labeled instances. Data annotation is usually time-consuming, expensive, and not accessible at each edge point. One popular approach to tackling semi-supervised learning is feature extraction by training an autoencoder on unlabeled instances. An autoencoder, an artificial neural network, learns a data representation from unlabeled data. Its objective is to transform the original data to a compressed representation and reconstruct it back to its original form without losing valuable information. An autoencoder (f^a(.) = f^a_dec(f^a_enc(.))) contains two sections: the encoder f^a_enc(.), which maps instances to their latent representation, and the decoder f^a_dec(.), which reconstructs the original data from its simplified representation. The learning objective of the autoencoder is to minimize the following loss function [49]:

L_a(X, f^a(X)) = || X - f^a(X) ||^2,   (1)

where L_a(., .) is the autoencoder loss function and X ∈ IR^{S×W} is the input data, with S the number of sensors and W the window width. This loss function penalizes the output of the autoencoder for being dissimilar to the input X [49]. Training a classification model under semi-supervised learning using the autoencoder requires two steps. In the first step, we train an autoencoder on all labeled and unlabeled instances. Then, using the encoder part of the trained autoencoder, we compress the original data to its latent representation, which provides a compact representation of the original instances. Subsequently, we train a base classifier using the transformed labeled data points.
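The two-step pipeline above can be sketched in a few lines of PyTorch. This is a minimal illustration, not the paper's implementation: the layer sizes, epoch counts, and random toy data are assumptions chosen only to make the sketch self-contained.

```python
import torch
import torch.nn as nn

# Illustrative sizes (assumptions): S sensors, window width W, latent dim, classes.
S, W, LATENT, CLASSES = 3, 30, 16, 5

encoder = nn.Sequential(nn.Flatten(), nn.Linear(S * W, LATENT), nn.ReLU())
decoder = nn.Sequential(nn.Linear(LATENT, S * W))
classifier = nn.Sequential(nn.Linear(LATENT, CLASSES))

def reconstruction_loss(x):
    # || X - f_dec(f_enc(X)) ||^2, averaged over the batch (Eqn. (1)).
    x_hat = decoder(encoder(x))
    return ((x.flatten(1) - x_hat) ** 2).mean()

# Step 1: fit the autoencoder on ALL windows (labeled + unlabeled), toy data here.
x_all = torch.randn(64, S, W)
ae_opt = torch.optim.Adam(list(encoder.parameters()) + list(decoder.parameters()), lr=1e-3)
for _ in range(5):
    ae_opt.zero_grad()
    reconstruction_loss(x_all).backward()
    ae_opt.step()

# Step 2: train the base classifier only on encoded LABELED windows.
x_lab, y_lab = torch.randn(16, S, W), torch.randint(0, CLASSES, (16,))
clf_opt = torch.optim.Adam(classifier.parameters(), lr=1e-3)
for _ in range(5):
    clf_opt.zero_grad()
    logits = classifier(encoder(x_lab).detach())  # encoder is frozen via detach
    nn.functional.cross_entropy(logits, y_lab).backward()
    clf_opt.step()

# Final model: encoder followed by the base classifier.
pred = classifier(encoder(x_lab)).argmax(dim=1)
```

Detaching the encoder output in step 2 keeps the representation fixed while the base classifier is trained, mirroring the separation of the two steps described in the text.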
The final classifier is thus the trained encoder followed by the base classifier [25], [49]. Fig. 2 gives an overview of this approach. IoT devices such as smartphones and wearable devices generate a massive amount of data every day. Traditional machine learning approaches require us to accumulate user data in a centralized database to train supervised models. However, this is not practical due to several challenges such as privacy and bandwidth limitations [9], [35], [50]. Moreover, the growing computational power of edge devices makes them suitable for local computation and data storage [48]. Federated learning seeks to provide the same collaborative training without sharing data. In FedAvg [32], the server aggregates all user models without particular non-iid data operations:

f^S_{r+1}(.) = (1/K) Σ_{j=1}^{K} f^j_r(.),   (2)

where f^j_r(.) denotes user j's model parameters in the r-th round, K is the total number of users, and f^S_{r+1}(.) is the global model for round r+1 that will be sent to all users at the beginning of that round. FedSCN [51] proposes a self-supervised approach based on the wavelet transform (WT) to learn valuable representations from unlabeled sensor inputs. FedHAR I [33] and FedHAR II [28] try to unify active learning and label propagation to annotate the local streams of unlabeled sensor data semi-automatically. Their objective is to generate a personalized model for each user. However, both approaches require at least some labels on the user side. FedBN [52] assumes that each user keeps its local batch normalization layers while collaborating to generate a global model. FedPer [53] tries to generate a personalized model for each user by preserving some local layers. SemiFL [25] addresses scenarios where users have only unlabeled instances and collaborate to generate a global model; the authors assume a set of labeled instances is available on the server side. FedProx [34] is a generalized version of FedAvg.
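The FedAvg aggregation of Eqn. (2) amounts to a parameter-wise mean over the user models. A minimal sketch (the linear model and user count are illustrative assumptions, not any paper's setup):

```python
import torch
import torch.nn as nn

def fedavg(user_models):
    """Average K user models parameter-wise into a new global state dict."""
    states = [m.state_dict() for m in user_models]
    return {key: torch.stack([s[key].float() for s in states]).mean(dim=0)
            for key in states[0]}

# Toy round: K users each hold a locally updated copy of a tiny model.
K = 4
users = [nn.Linear(8, 2) for _ in range(K)]

new_global = fedavg(users)            # server-side aggregation (Eqn. (2))
global_model = nn.Linear(8, 2)
global_model.load_state_dict(new_global)  # broadcast for round r + 1
```

In a real deployment only the state dicts travel between edge and server, never the raw data, which is the privacy argument made above.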
It allows partial information aggregation and adds a proximal term to FedAvg. FedHealth [27] is the first personalized federated learning method through transfer learning introduced for wearable healthcare devices. In FedHealth, users freeze the global model and fine-tune the last layers using their labeled dataset. FedHealth 2 [26] measures users' similarities with a pre-trained model and then aggregates all weighted models while users keep their batch normalization layers.

Fig. 3: Overview of SemiPFL. 1) The server selects a user, generates a personalized autoencoder using the user embedding α_j via a hyper-network, and sends it to the user. 2) The user fine-tunes the model using its unlabeled data and sends it back to the server. 3) The server updates the user embedding and hyper-network using the fine-tuned model. 4) The server encodes its labeled data using the encoder part of the user's autoencoder, 5) trains a set of base models using supervised learning and sends them to the user. 6) The user generates its base model by aggregating them using labeled data.

This section introduces SemiPFL, our personalized semi-supervised federated learning framework for time-series multi-sensory data. SemiPFL consists of three main steps: first, the server generates a personalized autoencoder using a hyper-network for each user. Second, the server selects samples close to each user's distribution, transforms its dataset using the corresponding user's encoder, trains a set of base models that map the latent representation to classes, and sends them to that specific user. In the third and final step, the user aggregates the received models using its labeled dataset. We first describe the problem formulation in subsequent sections and then discuss our training pipeline. In SemiPFL, similar to a semi-supervised learning scenario, we assume that users have a small labeled and a large unlabeled dataset. We also consider a set of labeled instances in the server.
In other words, we assume that we have K users, and each user j has a small set of labeled data D_j:

D_j = {(X^i_j, Y^i_j)}_{i=1}^{|D_j|},   (3)

where X^i_j, Y^i_j are the i-th input and output pair for user j, respectively. We also have a large unlabeled dataset for user j, denoted as U_j:

U_j = {X^i_j}_{i=1}^{|U_j|}.   (4)

To evaluate our method, we define E_j as user j's evaluation dataset, where X^i_j, Y^i_j are the i-th input and output pair in the evaluation dataset of user j, respectively:

E_j = {(X^i_j, Y^i_j)}_{i=1}^{|E_j|}.   (5)

Also, there is a high-resolution labeled dataset from M different distributions available on the server side:

D_S = {D^m_S}_{m=1}^{M}.   (6)

The goal of traditional federated learning methods such as FedAvg is to find a single global model f(.) by minimizing

f* = arg min_f Σ_{j=1}^{K} Σ_{(X,Y)∈D_j} l(f(X), Y),   (7)

where l(., .) denotes the loss function. In traditional models, although we could find a model that minimizes the general loss (Eqn. (7)), due to the domain shift between users, the general model would not perform satisfactorily for all users. The goal of SemiPFL is to generate a set of personalized models for the users. Particularly, our objective is to propose a framework where the server provides a personalized model f_j(.) for each user j. The overall objective can be formulated as

{f*_j}_{j=1}^{K} = arg min_{{f_j}_{j=1}^{K}} Σ_{j=1}^{K} Σ_{(X,Y)∈D_j} l(f_j(X), Y).   (8)

In Eqn. (8), since there is no dependence among {f_j(.)}_{j=1}^{K}, we can simplify Eqn. (8) to

f*_j = arg min_{f_j} Σ_{(X,Y)∈D_j} l(f_j(X), Y),  for j = 1, ..., K.   (9)

In other words, the personalized federated learning objective is to evaluate each personalized model on that user's own dataset. In the experimental evaluation section, we measure performance based on the average of each user's F1 and Kappa scores on their own dataset. The SemiPFL training pipeline consists of several steps that require communication between the server and users. The overall system model can be found in Fig. 3. The server keeps a list of user embeddings {α_j}_{j=1}^{K}, one for every user.
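Before walking through the training steps, the core hyper-network idea, a learnable per-user embedding α_j mapped to the flat parameter vector of that user's autoencoder, can be sketched as follows. All layer sizes and the flat-vector packing scheme here are illustrative assumptions, not the paper's architecture:

```python
import torch
import torch.nn as nn

EMB, IN, LATENT = 8, 90, 16              # embedding dim, 3x30 flattened window, latent dim
N_PARAMS = (IN * LATENT + LATENT) + (LATENT * IN + IN)  # enc + dec weights and biases

# Hyper-network: maps a user embedding alpha_j to autoencoder parameters.
hypernet = nn.Sequential(nn.Linear(EMB, 64), nn.ReLU(), nn.Linear(64, N_PARAMS))
alpha = nn.Parameter(torch.randn(EMB))   # embedding for one user j

flat = hypernet(alpha)                   # flat parameter vector for user j's autoencoder

# Unpack the flat vector into encoder/decoder tensors.
i = 0
def take(n, shape):
    global i
    t = flat[i:i + n].view(shape)
    i += n
    return t

enc_w = take(IN * LATENT, (LATENT, IN))
enc_b = take(LATENT, (LATENT,))
dec_w = take(LATENT * IN, (IN, LATENT))
dec_b = take(IN, (IN,))

# Run the generated autoencoder on a toy batch; the reconstruction loss
# backpropagates through the generated weights into hypernet AND alpha.
x = torch.randn(4, IN)
z = torch.relu(x @ enc_w.t() + enc_b)    # encoder
x_hat = z @ dec_w.t() + dec_b            # decoder
recon_loss = ((x - x_hat) ** 2).mean()
recon_loss.backward()
```

The key property is that gradients flow back into both the shared hyper-network weights and the per-user embedding, which is what lets the server personalize per user while still learning from everyone.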
In the first step, the server randomly selects one user j in every round, generates a personalized autoencoder model with its hyper-network h(.; θ) using the user embedding α_j, and sends it to the selected user:

f^a_j(.) = h(α_j; θ).   (10)

The hyper-network implementation is the same as in [61]. In the second step, the user updates the received autoencoder using its whole unlabeled and labeled dataset over T epochs and sends the updated autoencoder back to the server. In every epoch, the user selects a mini-batch γ ⊂ U_j ∪ {X | (X, Y) ∈ D_j} and updates the received autoencoder by gradient descent on the reconstruction loss:

f^a_j ← f^a_j − η ∇ Σ_{X∈γ} L_a(X, f^a_j(X)).   (11)

In the third step, the server updates the hyper-network parameters θ and the corresponding user embedding α_j: it calculates the difference ∆f^a_j between the received and sent model parameters and applies the chain-rule update of [61]:

θ ← θ − η (∇_θ f^a_j)^T ∆f^a_j,   α_j ← α_j − η (∇_{α_j} f^a_j)^T ∆f^a_j.   (12)

In the fourth step, the server generates a set of M personalized classifiers for user j. The server owns labeled datasets from M (≪ K) different distributions. From each of those M datasets, the server first selects samples more similar to the user j dataset. To do so, the server checks the distance between its data values and the values reconstructed by user j's fine-tuned autoencoder, keeping samples X whose autoencoder loss falls below the threshold τ:

L_a(X, f^a_j(X)) ≤ τ.   (13)

In the fifth step, the server encodes the selected samples using the encoder part of user j's autoencoder:

Z = f^{a,enc}_j(X).   (14)

The server trains a set of base models {f^{b,j}_{r,m}(.)}_{m=1}^{M} using the selected samples from each of the M available datasets and sends them to the corresponding user. The user initializes a weight χ_m for each of these base models and forms an initial personalized base model:

f_j(.) = Σ_{m=1}^{M} χ_m f^{b,j}_{r,m}(f^{a,enc}_j(.)).   (15)

In the sixth step, the user encodes its labeled instances using its fine-tuned autoencoder calculated earlier, freezes the base model parameters in Eqn. (15), and optimizes the model weights χ using its labeled dataset:

min_χ Σ_{(X,Y)∈D_j} l(f_j(X), Y),   subject to Σ_{m=1}^{M} χ_m = 1.   (16)

A summary of the SemiPFL algorithm can be found in Algorithm 1. In this section, we evaluate the effectiveness of our method using publicly available activity recognition and stress detection datasets. A summary of the datasets can be found in Table II. SemiPFL was developed using Python 3.8 and PyTorch 1.10.2 and trained on a PC with an Nvidia 2080Ti 11GB GPU and 64GB RAM. We first describe the datasets used in this study and their corresponding processing. Second, we explain our experimental setup. Third, we compare SemiPFL with other related federated learning frameworks. Fourth, we examine the impact of the number of available labeled instances. Finally, we evaluate the impact of the number of users collaborating during training. We employed five available human action recognition datasets and one stress detection dataset to evaluate our method. In this study, we used the following datasets: 1) Mobiact [54]: Mobiact includes accelerometer, gyroscope, and orientation data gathered from smartphones in participants' pockets. Twenty activities are recorded, such as standing, walking, jogging, and jumping. In Mobiact, 61 subjects with six trials per subject are recorded. To compare our results with FedHAR I [33] and FedSCN [9], we evaluated our algorithm on two different sets of outputs: five activities (standing, walking, sitting, jumping, and jogging) and ten activities (standing, car-step in, sitting on a chair, car-step out, jogging, jumping, stand-to-sit, stairs down, walking, and stairs up). 2) WISDM [55]: The WISDM dataset contains accelerometer and gyroscope data from smartphones and smartwatches separately, covering 18 different activities collected from 51 subjects. In our experiment, to compare our results with FedHAR I [33], we only considered the following daily activities: walking, jogging, stairs, standing, and sitting.
3) HHAR [57]: HHAR is one of the most widely used benchmarks for human activity recognition algorithms. Data are collected by smartphones and smartwatches separately, including 3-dimensional accelerometer and gyroscope data from 9 users. Six daily activities (biking, sitting, standing, walking, stair up, and stair down) are collected using different devices. 4) PAMAP2 [58], [59]: PAMAP2 contains 52-dimensional data from 3 synchronized IMUs on the chest, the dominant side's ankle, and the wrist of the dominant arm. A total of 24 activities are recorded from 8 subjects. We eventually selected ten activities from all eight subjects with 39-dimensional data to address missing data and a class imbalance problem. 5) WESAD [60]: WESAD features physiological data and motion modalities. Fifteen subjects' data are recorded from wearable sensors on both wrist and chest. WESAD contains three outputs: normal, amusement, and stress. 6) HAR-UCI [56]: HAR-UCI contains activity data collected from thirty participants doing six activities (walking, walking upstairs, walking downstairs, sitting, standing, and lying) while wearing a smartphone on the waist, which records 3-axial acceleration and 3-axial angular velocity at a constant rate of 50 Hz. In the past years, many researchers have tried to introduce metrics to evaluate performance in federated learning settings [62]. To evaluate our method, we calculate the federated average F1 score and Kappa score over all users:

F1_fed = (1/K) Σ_{j=1}^{K} F1(f_j, E_j),   Kappa_fed = (1/K) Σ_{j=1}^{K} Kappa(f_j, E_j),   (17)

where E_j is the evaluation dataset for user j, f_j is the personalized model for user j, and K is the total number of users available during training. We run each scenario multiple times with different seed values to calculate the average and standard deviation of each performance metric. It is important to note that the evaluation data is not used during the training phase.
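The per-user averaging of Eqn. (17) can be sketched with standard scikit-learn metrics. The label arrays below are toy placeholders standing in for each user's held-out predictions, not results from the actual models:

```python
# Federated average F1 / Cohen's kappa: score each personalized model on its
# OWN user's evaluation set, then average across users (Eqn. (17)).
from sklearn.metrics import f1_score, cohen_kappa_score

per_user_eval = [                        # (y_true, y_pred) per user, toy data
    ([0, 1, 1, 2, 2], [0, 1, 1, 2, 1]),  # user 0: one mistake
    ([0, 0, 1, 2, 2], [0, 0, 1, 2, 2]),  # user 1: perfect agreement
]
f1s = [f1_score(t, p, average="macro") for t, p in per_user_eval]
kappas = [cohen_kappa_score(t, p) for t, p in per_user_eval]

fed_f1 = sum(f1s) / len(f1s)
fed_kappa = sum(kappas) / len(kappas)
```

Averaging per-user scores (rather than pooling all predictions) is what makes the metric "federated": every user counts equally regardless of dataset size.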
Also, since the server randomly selects users, we eliminated those who did not participate in the training session from the final evaluation. We randomly selected a set of users for each dataset to serve as the server datasets, and kept the same users throughout our complete analysis for consistent results. For Mobiact, HAR-UCI, WISDM, HHAR, PAMAP2, and WESAD, we selected six, six, six, three, two, and three users' datasets, respectively, as server instances. We selected our hyperparameters using grid search. For the model architecture, we used four linear layers in the autoencoder (two layers for the encoder and two layers for the decoder). For the base model, we used two linear layers. After each linear layer, we added a ReLU activation and a dropout layer (dropout = 0.2). We used the Adam optimizer with a learning rate of 0.001 and a batch size of 128. For the threshold value, we selected τ = 0.05. We created our data tensors from the datasets using a sliding window of length W = 30, and normalized and balanced class distributions before use. For all datasets, we considered 30% of the data as an evaluation dataset and used the rest for training. We considered 20% of the remaining data points as labeled datasets and the rest as unlabeled data points. We randomly added 5, 10, 15, 20, 25, 30, and 40 labeled data points per class, respectively, to our setup and restarted training from scratch. We report all results at r = 2000 rounds. We also set T = 10. For the hyper-network, we borrowed the structure from pFedHN [61] and updated the architecture so that it outputs our autoencoder parameters. Table III demonstrates a comparison between our method and the most related studies investigating the effectiveness of federated learning frameworks for embedded edge intelligence. Here, we have classified these methods based on their attributes: personalized or not, labels in the server, labels in the user, and the model type.
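The windowing and split protocol described above (window width W = 30, then a 30% evaluation split and a 20% labeled portion of the remainder) can be sketched as follows. The stride, stream length, and sensor count are illustrative assumptions:

```python
import numpy as np

def sliding_windows(stream, W=30, stride=30):
    """Cut an (S sensors x T timesteps) stream into (N, S, W) window tensors."""
    S, T = stream.shape
    starts = range(0, T - W + 1, stride)
    return np.stack([stream[:, s:s + W] for s in starts])

rng = np.random.default_rng(0)
stream = rng.standard_normal((3, 3000))   # toy stream: 3 sensors, 3000 samples
X = sliding_windows(stream)               # -> (100, 3, 30) windows

# 30% evaluation; of the rest, 20% labeled and 80% unlabeled.
n = len(X)
n_eval = int(0.3 * n)
X_eval, X_rest = X[:n_eval], X[n_eval:]
n_lab = int(0.2 * len(X_rest))
X_labeled, X_unlabeled = X_rest[:n_lab], X_rest[n_lab:]
```

A non-overlapping stride equal to W is used here for simplicity; an overlapping stride would simply yield more windows from the same stream.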
To compare our results accurately with the selected methods, we chose similar pre-processing steps. Most of these works focus on federated learning where users collaborate to generate a global model; as seen in the table, these methods perform worse, with lower average Kappa and F1 scores. While SemiPFL uses an FCNN, which is simpler and faster to train than more complex architectures such as CNNs and LSTMs, it outperforms them all in terms of average F1 and Kappa values. With attributes such as labels in the server and no required labels for users, SemiPFL addresses broader application scenarios while providing better performance metrics than both recent semi-personalized works [26], [28], [33], [34], [52], [53] and earlier federated learning methods [9], [32], [48]. Unlike previous publications, SemiPFL covers a wide range of scenarios, from no labeled data to a fully supervised setting. This design lends itself to investigating the effect of increased labeling by incrementally adding labeled instances per class and measuring the improvements in average Kappa and F1 scores, as shown in Tables IVa, IVb, V, VI, VII, VIII, and IX. Having some labeled datasets on the server side enables SemiPFL to achieve high scores even without any user labels, with performance increasing as the number of labeled data points grows. This trend has been observed for different activity counts and datasets, while in some instances the performance metrics saturate beyond a certain number of labeled instances, suggesting an optimal amount of labeling. To investigate the impact of the number of users collaborating during the training process, we randomly selected five users for Mobiact and WISDM, then added five more users each time and retrained SemiPFL from scratch.
For PAMAP2 and HHAR, we randomly selected one user and added one more user in each training and testing of SemiPFL. For the HAR-UCI dataset, we randomly selected three users and then added three more users each time. For WESAD, we randomly selected two users and then added two more users each time. The average Kappa and F1 scores as different numbers of users are added to SemiPFL are also reported in Tables IVa, IVb, V, VI, VII, VIII, and IX. Unlike previous publications, we observe that, thanks to the sophisticated personalized representation of SemiPFL, adding more users increases the performance scores rather than decreasing them due to edge data heterogeneity. This is an important outcome, as it demonstrates the possibility of collaborative learning from edge nodes for the entire SemiPFL model. This paper introduces SemiPFL, a novel personalized semi-supervised learning method focusing on edge intelligence. Our approach trains a hyper-network that generates a personalized autoencoder to enable learning from each user's data representation. Furthermore, based on the user's autoencoder and the labeled datasets from different distributions available on the server side, the server generates a group of base models for the corresponding user. Finally, the user fine-tunes the weighted average of these base models to generate a personalized base model. We extensively evaluated the proposed method on five publicly available human action recognition datasets and one stress detection dataset collected from wearable devices. Our method outperforms personalized models and available federated learning frameworks under the same assumptions in terms of average F1 and Kappa scores. While SemiPFL performs well even without labeled data at the edge, performance increases with more labeled data and then saturates, signifying an optimal amount of labeled data for a given edge node.
In addition, we demonstrated that SemiPFL's performance increases with an increasing number of users, unlike other publications, highlighting the possibility of accommodating edge data heterogeneity in the SemiPFL platform. By leveraging semi-supervised learning, our framework dramatically reduces the need for annotating data.

References:
A survey on behavior recognition using WiFi channel state information
Sensor-based activity recognition
Activity recognition from accelerometer data
DeepSleepNet: A model for automatic sleep stage scoring based on raw single-channel EEG
A survey on fall detection: Principles and approaches
Human activity recognition and pattern discovery
Deep learning for sensor-based activity recognition: A survey
A survey on human activity recognition using wearable sensors
Federated self-supervised learning of multisensor representations for embedded intelligence
Deep anomaly detection for time-series data in industrial IoT: A communication-efficient on-device federated learning approach
Blockchain-based federated learning for device failure detection in industrial IoT
Federated learning in the sky: Aerial-ground air quality sensing framework with UAV swarms
Privacy-preserving traffic flow prediction: A federated learning approach
When deep reinforcement learning meets federated learning: Intelligent multi-timescale resource management for multi-access edge computing in 5G ultra-dense network
FogFL: Fog-assisted federated learning for resource-constrained IoT devices
Efficient federated learning algorithm for resource allocation in wireless IoT networks
A resource-constrained and privacy-preserving edge-computing-enabled clinical decision system: A federated reinforcement learning approach
Federated learning meets human emotions: A decentralized framework for human-computer interaction for IoT applications
Multi-agent DDPG-based deep learning for smart ocean federated learning IoT networks
Dynamic-fusion-based federated learning for COVID-19 detection
Federated learning for predicting clinical outcomes in patients with COVID-19
Secure, privacy-preserving and federated machine learning in medical imaging
Dynamic contract design for federated learning in smart healthcare applications
SelfHAR: Improving human activity recognition through self-training with unlabeled data
Semi-supervised federated learning for activity recognition
FedHealth 2: Weighted federated transfer learning via batch normalization for personalized healthcare
FedHealth: A federated transfer learning framework for wearable healthcare
FedHAR: Semi-supervised online learning for personalized federated human activity recognition
Federated learning in mobile edge networks: A comprehensive survey
Towards federated learning at scale: System design
Federated learning for healthcare informatics
Communication-efficient learning of deep networks from decentralized data
Personalized semi-supervised federated learning for human activity recognition
Federated optimization in heterogeneous networks
Federated learning: A survey on enabling technologies, protocols, and applications
Personalized federated learning with theoretical guarantees: A model-agnostic meta-learning approach
Personalized federated learning using hypernetworks
Personalized federated learning with Moreau envelopes
A. Navon, G. Chechik, and E. Fetaya, Personalized federated learning with Gaussian processes
Personalized federated learning: A meta-learning approach
Ditto: Fair and robust federated learning through personalization
CIFAR-10 (Canadian Institute for Advanced Research)
Deep convolutional neural networks on multichannel time series for human activity recognition
Human activity recognition with smartphone sensors using deep learning neural networks, Expert Systems with Applications
Deep, convolutional, and recurrent models for human activity recognition using wearables
A survey on semi-supervised learning
Semi-supervised federated learning for activity recognition
Deep learning
Advances and open problems in federated learning, Foundations and Trends in Machine Learning
Federated self-supervised learning of multisensor representations for embedded intelligence
FedBN: Federated learning on non-IID features via local batch normalization
Federated learning with personalization layers
Human daily activity and fall recognition using a smartphone's acceleration sensor
Smartphone and smartwatch-based biometrics using activities of daily living
A public domain dataset for human activity recognition using smartphones
Smart devices are different: Assessing and mitigating mobile sensing heterogeneities for activity recognition
Introducing a new benchmarked dataset for activity monitoring
Creating and benchmarking a new dataset for physical activity monitoring
Introducing WESAD, a multimodal dataset for wearable stress and affect detection
Personalized federated learning using hypernetworks
New metrics to evaluate the performance and fairness of personalized federated learning

He is currently a Ph.D. candidate at the Department of Electrical and Computer Engineering at the University of British Columbia. His research is on robust and personalized pose estimation using wearable smart devices. (Graduate Student Member, IEEE) received a B.Sc. degree in electrical engineering from Tianjin University, China. She is currently an MASc student at the University of British Columbia, Canada, working as a research assistant in the Flexible Electronics and Energy Lab (FEEL). Her research interests include statistical signal processing and machine learning, with applications in digital media and biomedical data analytics. He is the director of SmarT Innovations for Technology Connected Health (STITCH) and the Flexible Electronics and Energy Lab (FEEL), co-founder of the Centre for Flexible Electronics and Textile at UBC, and founder of Texavie Technologies Inc., a leading company developing advanced wearable health solutions. Prof. Servati has more than 100 papers in peer-reviewed journals and conferences and 90 patents and patent applications. He is the winner of the 2006 NSERC Canada-UK Millennium Research Award, the 2005 NSERC Doctoral Prize, the 2002/03 IEE Institution Premium for the Best Paper in Circuits, Devices, and Systems, and a Bronze Medal in the XXV International Physics Olympiad (I