key: cord-0431484-m64gp0jn
authors: Jastaniah, Khlood; Zhang, Ning; Mustafa, Mustafa A.
title: Privacy-Friendly Flexible IoT Health Data Processing with User-Centric Access Control
date: 2022-03-01
journal: nan
DOI: nan
sha: ee81c9f02996a67ed24d482a93d2e2e447caefa7
doc_id: 431484
cord_uid: m64gp0jn

This paper proposes a novel Single and Multiple user(s) data Aggregation (SAMA) scheme designed to support privacy-preserving aggregation of health data collected from users' IoT wearables. SAMA also deploys a user-centric approach to support flexible fine-grained access control. It achieves this through two key ideas. First, it uses a multi-key homomorphic cryptosystem (variant Paillier) to accommodate both single and multi-user data processing while preserving the privacy of users during the processing of their IoT health data. Second, it uses ciphertext-policy attribute-based encryption to support flexible access control, which ensures users are able to grant data access securely and selectively. Formal security and privacy analyses have shown that SAMA supports data confidentiality and authorisation. The scheme has also been analysed in terms of computational and communication overheads to demonstrate that it is more efficient than the relevant state-of-the-art solutions.

and friends may need to have (or benefit from having) access to the results of these analytic models. Therefore, healthcare systems should support wearable data processing and sharing with multiple and diverse data recipients [5] .

Healthcare provision via wearable devices has led to an increased number of applications that collect, store and analyse (usually with the assistance of cloud providers) users' sensitive wearable data at an unprecedented scale and depth [2], [6], [7]. However, this approach comes with mounting concerns over users' data privacy (data confidentiality). The concerns lead to two major issues. First, although data collection could be done via secure channels, processing of user wearable data is typically performed in plaintext and governed by service providers who may outsource/disclose the wearable data or analytic results to third parties [8], [9]. Second, users have no control over who accesses and shares the collected and processed data [10] [11] [12]. In addition, unauthorised exposure of personal health data violates the GDPR [13] and HIPAA [14] regulations, which advocate users' privacy protection and access control. Therefore, it is important to achieve secure data processing and sharing by adopting a user-centric approach, which places access control over data in the hands of users rather than service providers.

There are already attempts to tackle the aforementioned issues, which can be grouped into two approaches. The first approach is based on homomorphic encryption schemes, while the second approach uses attribute-based encryption schemes. Existing solutions related to the first approach either support secure data processing over data collected only from a single user [15] [16] [17] [18] [19] [20] or only from multiple users [12], [21] [22] [23] [24] [25] [26], but they do not efficiently support system models for both single and multiple user data processing scenarios. In addition, the solutions using the second approach [11], [12], [27] [28] [29] assume that data processing entities (typically third-party service providers) are trustworthy.

In modern healthcare systems, the enormous amount of sensitive wearable health data generated and some of the processing jobs are typically outsourced or delegated to third-party service providers, such as cloud providers, because users' devices are resource-constrained. In such cases, measures should be in place so that any threats from the third-party service providers can be addressed. Moreover, data owners should define fine-grained access control to both raw data and computed/aggregated data and specify who can access which data items (i.e. user-centric access control). In addition, there are multiple entities (e.g., healthcare professionals, researchers) who need to process and access different sets of data of specific individuals and/or of a group of people for different legitimate purposes [2], [12], and both types of data processing and access (single and multiple users data) should be supported and granted based on a user-centric approach [8]. Therefore, there is a need for secure data processing that can accommodate single and multiple user data cases, while supporting user-centric fine-grained data access capabilities.

To fill this research gap, we propose a novel privacy-preserving Single And Multiple user data Aggregation (SAMA) scheme that supports single and multiple users data processing over encrypted data and realises data sharing with fine-grained access control based on a user-centric approach. To the best of our knowledge, this paper is the first attempt to combine single and multiple users data processing with sharing of the processing results across multiple entities, with a focus on a user-centric design approach. To this end, the novel contributions of this work are three-fold:

• The design of SAMA: a novel privacy-preserving scheme to support secure aggregation of data collected from both single and multiple users and secure data sharing with fine-grained access control based on a user-centric approach. The secure aggregation of data is ensured by using the variant Paillier homomorphic encryption (VP-HE) scheme in a multi-key environment such that data from individual owners (users) are encrypted by multiple users' public keys in a twin-cloud architecture and data processing can be carried out over the encrypted data. The fine-grained access control of the processing result is supported by using Ciphertext-policy Attribute-based Encryption (CP-ABE), which gives data owners full control of the access rights over their data.

• The investigation of the SAMA scheme both theoretically (in terms of security) and through simulation (in terms of computational and communication costs): our results indicate that SAMA satisfies the specified set of security and privacy requirements with lower computational and communication cost on the user and data recipient side compared with [30].

The rest of this paper is organised as follows. Section II discusses the related work. Sections III and IV show design preliminaries and main building blocks used in the design of SAMA. This is followed by a detailed design of the SAMA scheme in Section V. Sections VI and VII detail SAMA's security/privacy analysis and performance evaluation, respectively. Finally, Section VIII concludes the paper. The acronyms used in the paper are shown in Table I.

II. RELATED WORK

Efforts have already been made to preserve the confidentiality of user data while data is being processed by deploying different advanced cryptographic techniques. One of the most widely used techniques is Homomorphic Encryption (HE), which allows operations on encrypted data. There are mainly two types of HE schemes: Fully HE (FHE) and partial HE (PHE).
FHE schemes support an arbitrary number of operations over encrypted data [31]. However, they are still impractical as they require high computational resources. Hence they are not suitable for use in wearable/portable devices. Some schemes [15] [16] [17] [18] [19] [20] have considered secure processing of data provided only by one (single) user, while other schemes [12], [21] [22] [23] [24] [25] [26] support secure processing of data coming only from different (multiple) users. However, in modern healthcare systems, in many cases, data recipients need to access the processing results of both single and multiple users' data. In such cases, the use of the above schemes has limitations. Firstly, HE proposals cannot efficiently support both single and multiple users data processing scenarios. Secondly, HE cannot support data sharing with multiple data recipients who require access to the same processing result.

To support secure and user-centric access control, there are proposals [11] , [12] , [27] [28] [29] adopting ABE schemes [33] . These proposals allow users to choose who can access their data, hence supporting fine-grained access control and multiuser access. ABE schemes can be classified into two types: ciphertext-policy ABE (CP-ABE) [34] and key-policy ABE (KP-ABE) [35] schemes. The main difference between the two types is the following. In the CP-ABE scheme, access structures are embedded with ciphertexts and users' attributes are embedded with the users' private keys, while with the KP-ABE scheme, the access structure is associated with the private keys of users and the ciphertexts are associated with attributes. Therefore, with the KP-ABE schemes, users do not have control over who can access the data; they can only control attributes assignments [34] . ABE schemes, on their own, do not support computations over encrypted data.

There are some existing proposals which combine secure data processing with access control. Ding et al. [30], [36] proposed flexible access control over the computation results of encrypted multiple users' data by combining ABE with HE schemes. The computation supports addition, subtraction, multiplication, division, etc. However, these proposals efficiently support neither processing over data of both single and multiple user(s) nor user-centric access control. Ruj and Nayak [37] combined Paillier HE with ABE to support privacy-preserving data aggregation and access control in the smart grid. However, in their proposal, the aggregated data needs to be decrypted and then re-encrypted with an access policy by a trusted authority, hence this solution places unconditional trust in the data manager. Tang et al. [38] proposed privacy-preserving fog-assisted health data sharing that supports a flexible user-centric approach using ABE. Patients send the abnormal values encrypted with a symmetric encryption scheme and define the access policy by encrypting the symmetric key with ABE. It also supports naive Bayes disease classification over the encrypted data at the fog node; however, it does not effectively support processing over data from multiple users. Pang and Wang [39] proposed privacy-preserving data mining operations on outsourced data from multiple parties under multi-key environments using VP-HE. The proposal supports sharing of processed data only with a data recipient (miner); however, it does not support user-centric and fine-grained data sharing with multiple users.

In summary, the state-of-the-art research in privacy-preserving data processing focuses on either single-user or multiple-user data processing; it does not support both use cases systematically. Furthermore, there are limited efforts on exploring the integration of privacy-preserving data processing with fine-grained user-centric access control to support secure data processing and secure data sharing among multiple users. This paper aims to address this knowledge gap by designing a solution that can efficiently support both single and multiple user data processing and fine-grained data sharing in a user-centric manner while protecting the data privacy of users (data owners).

In this section, we introduce the system and threat model, assumptions, notations, and design requirements of SAMA.

The system model used by SAMA consists of the following entities (see Fig. 1 ). Users are data owners who possess wearables and are willing to share the data collected from their wearables with various data recipients for their own personal benefits or for the collective benefit of society. Users' wearable data is usually collected and shared via their smartphone (gateway). Data recipients (DRs) are data consumers who wish to utilise users' wearable data in order to provide (personalised) services to users or society. Example DRs could be individuals such as the users themselves, their family members, friends, professionals (e.g., named GPs), organisations such as hospitals, research centers, insurance, or charities, etc. Two cloud service providers store and process data on behalf of users: Cloud A (CSP A ) provides users with storage and processing for users' data, and manages access requests, while Cloud B (CSP B ) cooperates with CSP A in data computations and access control. A Key authority (KA) plays the role of a key management organisation.

This section describes the threat model of the proposed SAMA scheme as follows. Users are trustworthy but curious. They outsource correct wearable data to cloud providers but are keen to learn other users' data. DRs are also trustworthy but curious. They make legitimate requests to access users' data, but they may be curious to access or find out other users' data. The CSPs are semi-honest (honest-but-curious) entities. They follow the protocol as per the specifications, yet they are curious about the sensitive information of users or any aggregated user data. The KA is considered a trustworthy entity. It performs all its duties honestly and never colludes with any other entities. The external adversary bounded by computational resources (not having access to quantum computers) is considered as untrustworthy or even malicious. The external attackers may utilize different kinds of network eavesdropping attacks and/or modify the data in transit or try to gain unauthorized access in an attempt to disrupt the system or the cloud servers.

The following assumptions are considered in the SAMA design. The communication channels among all entities are encrypted and authenticated. CSP A and CSP B do not collude with each other or with any other entities as they have a legal responsibility to prevent leakage of the users' sensitive data. All entities' identities are verified by the key authority before obtaining the encryption and decryption keys.

The proposed system should satisfy the following functional, security and privacy, and performance requirements.

1) Functional Requirements: (F1) Flexible data processing: SAMA should support both single and multiple user(s) data aggregation using the same system model and without substantially increasing the computational and communication overhead. (F2) Fine-grained access control: SAMA should support a flexible access policy for users and facilitate granting different access rights to a set of data recipients. (F3) User-centric: each user should control who is authorized to access the raw data collected from their wearables as well as the aggregated data that includes their raw data.

2) Security and Privacy Requirements: (S1) Data confidentiality: users' raw and aggregated data should be protected from unauthorised disclosure. (S2) Authorisation: only authorised DRs should be able to access users' aggregated data based on the user-defined access policy.

3) Performance Requirements: (P1) Efficiency: SAMA should be viable for wearables, which are devices with limited computational capabilities.

This section reviews briefly the Paillier cryptosystem [32] , the Variant-Paillier in Multikey cryptosystem [39] , and CP-ABE [34] , which are used in the SAMA scheme design. The notations used throughout the paper are presented in Table II .

Paillier cryptosystem [32] is a practical additive homomorphic encryption scheme proven to be semantically secure.

1) Paillier in Single-Key Environment: It consists of three algorithms: a key generation algorithm ($\mathrm{KGen}_{PE}$), an encryption algorithm ($\mathrm{Enc}_{PE}$), and a decryption algorithm ($\mathrm{Dec}_{PE}$).

• $\mathrm{KGen}_{PE}(k') \rightarrow (ppk, psk)$: Given a security parameter $k'$, select two large prime numbers $p$ and $q$. Compute $n = p \cdot q$ and $\lambda = \mathrm{lcm}(p - 1, q - 1)$. Define $L(x) = (x - 1)/n$. Select a generator $g \in \mathbb{Z}^*_{n^2}$. Compute $\mu = (L(g^{\lambda} \bmod n^2))^{-1} \bmod n$. The public key is $ppk = (n, g)$ and the private key is $psk = (\lambda, \mu)$.

• $\mathrm{Enc}_{PE}(ppk, m) \rightarrow c$: Given a message $m \in \mathbb{Z}_n$ and a public key $ppk = (n, g)$, choose a random number $r \in \mathbb{Z}^*_n$, and compute the ciphertext $c = \mathrm{Enc}_{PE}(ppk, m) = g^m \cdot r^n \bmod n^2$.

• $\mathrm{Dec}_{PE}(psk, c) \rightarrow m$: Given a ciphertext $c$ and a private key $psk = (\lambda, \mu)$, recover the message $m = \mathrm{Dec}_{PE}(psk, c) = L(c^{\lambda} \bmod n^2) \cdot \mu \bmod n$.
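The three algorithms above can be sketched in a few lines of Python (3.8+). This is a minimal toy illustration rather than the authors' implementation: the primes are tiny, and the generator is fixed to g = n + 1, a common choice that is an assumption here and not something the paper specifies. The function names are hypothetical.

```python
import math
import secrets


def L(x, n):
    """The function L(x) = (x - 1) / n from the key generation step."""
    return (x - 1) // n


def keygen(p, q):
    """Return (ppk, psk) for primes p and q (toy sizes, illustration only)."""
    n = p * q
    lam = (p - 1) * (q - 1) // math.gcd(p - 1, q - 1)  # lcm(p - 1, q - 1)
    g = n + 1  # assumption: a standard valid generator choice
    mu = pow(L(pow(g, lam, n * n), n), -1, n)
    return (n, g), (lam, mu)


def encrypt(ppk, m):
    """c = g^m * r^n mod n^2 with random r coprime to n."""
    n, g = ppk
    while True:
        r = secrets.randbelow(n - 1) + 1
        if math.gcd(r, n) == 1:
            break
    return (pow(g, m, n * n) * pow(r, n, n * n)) % (n * n)


def decrypt(ppk, psk, c):
    """m = L(c^lambda mod n^2) * mu mod n."""
    n, _ = ppk
    lam, mu = psk
    return (L(pow(c, lam, n * n), n) * mu) % n


ppk, psk = keygen(1009, 1013)
c1, c2 = encrypt(ppk, 42), encrypt(ppk, 58)
# Additive homomorphism: multiplying ciphertexts adds the plaintexts.
c_sum = (c1 * c2) % (ppk[0] ** 2)
```

Decrypting `c_sum` with `decrypt(ppk, psk, c_sum)` recovers the sum of the two plaintexts, which is the property SAMA relies on for aggregation.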

The variant Paillier scheme [39] is one of the recent variations of the Paillier cryptosystem. It is similar to the original scheme [32] with a slight modification in the key generation algorithm, which makes it suitable for a multi-user environment by generating a different public-private key pair for each user, with two trapdoor decryption algorithms. The scheme comprises four algorithms: key generation ($\mathrm{KGen}_{VP}$), encryption ($\mathrm{Enc}_{VP}$), decryption with a weak secret key ($\mathrm{WDec}_{VP}$), and decryption with a strong secret key ($\mathrm{SDec}_{VP}$).

Table II (partial): Notations
Enc_PE: encryption using PE
Dec_PE: decryption using PE
ppk_j, psk_j: PE key pair (public key, private key) of DR
pk: public parameters in CP-ABE
MK: master key in CP-ABE
sk: secret key in CP-ABE
Enc_ABE: encryption using CP-ABE
Dec_ABE: decryption using CP-ABE
AP_S / AP_M: single and multiple user(s) data access policy

• $\mathrm{KGen}_{VP}(k) \rightarrow (vpk, wsk, ssk)$: Given a security parameter $k$, choose $k + 1$ small odd prime factors $u, v_1, \ldots, v_i, \ldots, v_k$ and choose two large prime factors $v_p$ and $v_q$ such that $p$ and $q$ are large primes with the same bit length. Compute p and q as p = 2uv

Choose $t$ as a number or a product of multiple numbers from the set $(v_1, v_2, \ldots, v_i, \ldots, v_k)$, so that $t \mid \lambda$ holds naturally. Choose a random integer $g \in \mathbb{Z}^*_{n^2}$ that satisfies $g^{utn} = 1 \bmod n^2$ and $\gcd(L(g^{\lambda} \bmod n^2), n) = 1$. The public key is $vpk = (n, g, h)$, the weak secret key is $wsk = t$, and the strong secret key is $ssk = \lambda$.

• $\mathrm{Enc}_{VP}(vpk, m) \rightarrow c$: Given a message $m \in \mathbb{Z}_n$ and a public key $vpk = (n, g, h)$, choose a random number $r \in \mathbb{Z}_n$, and compute the ciphertext $c$ as $c = \mathrm{Enc}_{VP}(vpk, m) = g^m h^r \bmod n^2$.

• $\mathrm{WDec}_{VP}(wsk, c) \rightarrow m$: The decryption algorithm with a weak secret key decrypts only the ciphertext encrypted with the associated public key. Given $wsk$ and $c$, the ciphertext can be decrypted as $m = \mathrm{WDec}_{VP}(wsk, c) = \frac{L(c^t \bmod n^2)}{L(g^t \bmod n^2)} \bmod n$.

• $\mathrm{SDec}_{VP}(ssk, c) \rightarrow m$: The decryption algorithm with a strong key decrypts the ciphertexts encrypted with any public key of the scheme. Given $ssk$ and $c$, the ciphertext can be decrypted as $m = \mathrm{SDec}_{VP}(ssk, c) = L(c^{\lambda} \bmod n^2) \cdot \mu \bmod n$.

CP-ABE is a type of public-key encryption in which the ciphertext is associated with an access policy and user keys depend on attributes, which supports fine-grained access control [39]. It consists of four main algorithms: a setup algorithm (Setup), an encryption algorithm (Enc ABE ), a key generation algorithm (KGen ABE ), and a decryption algorithm (Dec ABE ).

• Setup(s, U) → pk, mk: Given a security parameter s and a universe of attributes U, the setup algorithm outputs the public parameters pk and a master key mk.

• Enc ABE (pk, M, A) → C: Given public parameters pk, a message M, and an access structure A over the universe of attributes, the encryption algorithm outputs a ciphertext C which implicitly contains A.

• KGen ABE (mk, s) → sk: Given a master key mk and a set of attributes s which describe the key, the key generation algorithm outputs a private key sk.

• Dec ABE (pk, C, sk) → M: Given public parameters pk, a ciphertext C, which includes an access policy A, and a private key sk, a user can decrypt the ciphertext and obtain the message M only if the attributes associated with the private key satisfy A.
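The condition under which decryption succeeds, namely that the key's attribute set satisfies the access structure A, can be illustrated without any cryptography. The sketch below only evaluates a boolean policy tree against an attribute set; it is not CP-ABE itself, which enforces the same check cryptographically. The tuple encoding of policies, the `satisfies` helper, and the attribute names are all hypothetical choices made for this illustration.

```python
def satisfies(policy, attributes):
    """Return True if the attribute set satisfies the policy tree.

    A policy is either an attribute name (str) or a tuple
    ("AND" | "OR", child_1, child_2, ...).
    """
    if isinstance(policy, str):
        return policy in attributes
    gate, *children = policy
    results = [satisfies(child, attributes) for child in children]
    if gate == "AND":
        return all(results)
    if gate == "OR":
        return any(results)
    raise ValueError(f"unknown gate: {gate}")


# Hypothetical single-user policy: a doctor who is either in cardiology
# or is the user's named GP.
ap_s = ("AND", "doctor", ("OR", "cardiology", "named_gp"))

cardiologist = {"doctor", "cardiology"}
researcher = {"researcher", "cardiology"}
```

With these inputs, `satisfies(ap_s, cardiologist)` holds while `satisfies(ap_s, researcher)` does not, mirroring which private keys would be able to decrypt a ciphertext created under this policy.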

In this section, we propose our novel secure data aggregation scheme, SAMA, that works on single and multiple user(s) data with flexible data sharing, adopting a user-centric approach. First, we give an overview of SAMA and explain the system initialisation before presenting it in detail.

The SAMA scheme mainly makes use of a combination of the Paillier HE and CP-ABE schemes and consists of three main phases: (i) user access policy setting, (ii) data uploading, and (iii) data access request and processing, as shown in Fig. 2 .

In the user access policy setting phase, to achieve a user-centric fine-grained access policy, users define two types of access policies: single (AP S ) and multiple (AP M ) user(s) data access policy and send them to CSP A . This allows CSP A to process and share users' data with DRs according to users' preferences. In the data uploading phase, every user encrypts their data with their VP-HE public key and sends the resulting ciphertext to CSP A .

During the data access request and processing phase, CSP A receives requests to access the (aggregated) data of users. These requests are processed by the CSPs and the results are shared with the corresponding requesters. There can be three different types of requests, coming either from the users themselves for accessing their own data or from the DRs requesting data of a single user or multiple users.

Upon receiving a request from a user, CSP A aggregates the user's encrypted data and the result is sent back to the user. The user can then use their own VP-HE weak secret key to obtain their aggregated data. If the request is received from a DR for a single user's data, CSP A aggregates the user's encrypted data, masks it, and sends the masked encrypted data to CSP B . CSP B then performs strong decryption to obtain the masked data, encrypts this result (masked aggregated data) with a Paillier public key, and encrypts the Paillier private key using CP-ABE with the access policy AP S . However, if the request is received from a DR for multiple users data processing, the process is slightly different. CSP A gets the encrypted data of users, masks the data and sends the masked data to CSP B . CSP B then performs strong decryption on the received ciphertexts, aggregates them, encrypts the result with a Paillier public key, and encrypts the Paillier private key using CP-ABE with the access policy AP M . In both cases, CSP B sends both ciphertexts to CSP A . CSP A then performs de-masking on the received ciphertext and sends the encrypted result (aggregated data) and the CP-ABE ciphertext to the DR. Finally, an authorized DR who satisfies the access policy will be able to decrypt the CP-ABE ciphertext and obtain the Paillier private key to decrypt the ciphertext of the final result (users' aggregated data).

Table III (partial): Keys held by the entities. The surviving entries read: "ppk j and pk", "psk j and ssk"; DR: "ppk j ", "psk j and sk j ".

The system initialisation step comprises two phases: system parameters setup, and cryptographic key generation and distribution. All the entities' keys are listed in Table III.

1) System Parameters Setup: In this phase, system parameters of the three encryption schemes are set.

• VP-HE setup: The KA sets a security parameter k and chooses two large prime numbers p and q such that L(p) = L(q) = k, where L(·) denotes the bit length of its input.

• Paillier setup: The KA selects the security parameter k ′ , such that k ′ > k. It then chooses two large prime numbers p and q. Then, the key generation algorithm is initiated as explained in Section IV-A1.

• ABE setup: The KA generates the security parameter s and the universe of attributes U , which are used to generate pk and mk using the Setup algorithm described in Section IV-B.

2) Cryptographic Key Generation and Distribution: This phase is divided into three steps outlined below.

• VP-HE Key Generation: The KA generates a unique ssk and a distinct variant Paillier homomorphic public/private key pair (vpk i , wsk i ) for every user U i , i = 1, . . . , N U , using the KGen V P algorithm described in Section IV-A2.

• Paillier Key Generation: The KA generates a distinct Paillier homomorphic public/private key pair (ppk j , psk j ) for each request, whether it comes from the same DR or a different one, using the KGen P E algorithm described in Section IV-A1.

• ABE Key Generation: The KA generates a distinct private key sk j for every DR j , using KGen ABE as described in Section IV-B. DR j obtains sk j , which embeds her/his attributes/roles, from the KA.

The SAMA scheme consists of three main phases: (1) User access policy setting, (2) Data uploading, and (3) Data access request and processing.

1) User Access Policy Setting: This phase shown in Fig. 3 is usually performed at the setup stage. It allows users to set their access policy for data aggregation and sharing requirements and share it with CSP A . It includes three steps: a) define access policy, b) activate notifications, and c) update access policy. a) Define access policy: Generally, the user defines two types of access policy: (i) single-user data aggregation and sharing access policy (AP S ) and (ii) multiple-users data aggregation and sharing access policy (AP M ).

(i) AP S allows users to control who can access the aggregated results of their own data. Therefore, only the authorized DR with specific attributes satisfying the access policy can have access to the final aggregated result.

Fig. 4: Data access request and processing phase (user request). The user requests the processing results from CSP A and decrypts them using wsk i .

(ii) AP M allows users to determine whether they agree to their data being aggregated with other users' data and to the aggregated result being accessed by DRs. In other words, each user defines his/her sharing preferences and gives consent to the use of their individual wearable data in aggregation along with other users' wearable data. Note that AP M does not authorise CSP A to share any specific individual raw data with anyone. It only allows CSP A to use the encrypted data of users whose sharing preferences match the attributes of DRs who have requested access to data.

b) Activate notification: Users can choose to receive regular notifications, which summarise all single and multi-user data access requests received from DRs. Through the summary, users can check how many data access requests were granted/rejected. This also allows users to monitor who has requested access to their data and whose requests were granted/rejected. Regular notifications can be switched on/off by the user and can also be set to be received as daily/weekly/monthly data access summaries. CSP A is responsible for following users' notification selections.

c) Update access policy: CSP A gives users the ability to update their access policy periodically or on demand. Users also have the option to update their pre-defined access policies (AP S and/or AP M ) based on the details of their notifications.

2) Data Uploading: During this phase, users upload their data to CSP s regularly. More specifically, users encrypt their wearable data m i with their variant-Paillier public key, vpk i , to obtain C vpki = Enc V P (vpk i , m i ) and send the encrypted data to CSP A . This phase is the same for single and multi-user aggregated data sharing, as shown in Fig. 3 .

3) Data Access Request and Processing: In this phase, there can be three different types of data access requests for users' aggregated data: a) users request access to their own (aggregated) data, b) DRs request access to aggregated data of a single user, and c) DRs request access to aggregated data of multiple users. The requests coming from users are directly handled by CSP A , while the requests coming from DRs are handled by both CSPs.

a) User access request for own (aggregated) data: A user requests CSP A to aggregate his/her own encrypted wearable data and provide the processed result, as shown in Fig. 4 . Upon receiving the request to aggregate $N_{req}$ data points, CSP A aggregates the user's data (i.e., it performs additive homomorphic operations by multiplying the encrypted data of the user) to get $[\sum_{i=1}^{N_{req}} m_i]_{vpk_i} = \prod_{i=1}^{N_{req}} [m_i]_{vpk_i}$, where $[\cdot]$ denotes encrypted data. The result is then sent to the user. Then, the user can decrypt $[\sum_{i=1}^{N_{req}} m_i]_{vpk_i}$ with his/her own weak secret key $wsk_i$ to obtain the aggregated data as $\sum_{i=1}^{N_{req}} m_i = \mathrm{WDec}_{VP}(wsk_i, [\sum_{i=1}^{N_{req}} m_i]_{vpk_i})$.
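The reason multiplying the user's ciphertexts yields an encryption of the sum follows directly from the form of $\mathrm{Enc}_{VP}$ in Section IV-A2. In the short derivation below, $\rho_i$ denotes the encryption randomness of the $i$-th ciphertext; this symbol is introduced here only for illustration and does not appear in the paper.

```latex
\prod_{i=1}^{N_{req}} [m_i]_{vpk_i}
  = \prod_{i=1}^{N_{req}} g^{m_i} h^{\rho_i} \bmod n^2
  = g^{\sum_{i=1}^{N_{req}} m_i} \; h^{\sum_{i=1}^{N_{req}} \rho_i} \bmod n^2
```

The right-hand side has the same form as a fresh ciphertext of $\sum_{i} m_i$ with randomness $\sum_{i} \rho_i$, so it decrypts to the aggregate, provided the sum stays below the modulus $n$.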

b) DR access request for single-user data processing: A DR requests access to the aggregated data of a (specific) single user. For example, a doctor requires access to the aggregated data of a specific patient to monitor his/her health condition. The aggregated data can be accessed only by DRs (e.g., doctors, friends, etc.) whose attributes satisfy the fine-grained access policy AP S set by the user. This phase, as shown in Fig. 5 , is divided into the following five steps:

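(i) Handling DR request: After a DR has issued a request to access the aggregated data of a single user, CSP A performs the same additive homomorphic operations as in Step a) explained above. The result is a ciphertext of the aggregated data: $[\sum_{i=1}^{N_{req}} m_i]_{vpk_i}$.

(ii) Masking: CSP A masks the aggregated data by generating a random number $r_{U_i}$, encrypting it with the user's variant Paillier public key to obtain $[r_{U_i}]_{vpk_i} = \mathrm{Enc}_{VP}(vpk_i, r_{U_i})$, and multiplying it with the aggregated ciphertext: $[\sum_{i=1}^{N_{req}} m_i + r_{U_i}]_{vpk_i} = [\sum_{i=1}^{N_{req}} m_i]_{vpk_i} \cdot [r_{U_i}]_{vpk_i}$. The result is then sent to CSP B along with the $AP_S$ set by the user.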

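(iii) Preparing the processing result: Upon receiving the result, CSP B decrypts it using its strong decryption key $ssk$ to get the masked aggregated data $\sum_{i=1}^{N_{req}} m_i + r_{U_i} = \mathrm{SDec}_{VP}(ssk, [\sum_{i=1}^{N_{req}} m_i + r_{U_i}]_{vpk_i})$. Then, a new Paillier key pair $(ppk_j, psk_j)$ is generated by the KA (at the request of CSP B ) and sent back to CSP B . The new Paillier public key, $ppk_j$, is used to encrypt the masked aggregated data to get $[\sum_{i=1}^{N_{req}} m_i + r_{U_i}]_{ppk_j} = \mathrm{Enc}_{PE}(ppk_j, \sum_{i=1}^{N_{req}} m_i + r_{U_i})$, while the new Paillier private key $psk_j$ is encrypted under the user-defined access policy ($AP_S$) to get $[psk_j]_{AP_S} = \mathrm{Enc}_{ABE}(pk, psk_j, AP_S)$. Finally, the two generated ciphertexts ($[\sum_{i=1}^{N_{req}} m_i + r_{U_i}]_{ppk_j}$ and $[psk_j]_{AP_S}$) are sent to CSP A .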

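(iv) De-masking: When CSP A receives the two ciphertexts, it initiates the de-masking process. It encrypts the random number $r_{U_i}$ (used previously in the masking process) with $ppk_j$ to obtain $[r_{U_i}]_{ppk_j} = \mathrm{Enc}_{PE}(ppk_j, r_{U_i})$. It then removes the mask from the received ciphertext to obtain $[\sum_{i=1}^{N_{req}} m_i]_{ppk_j}$ and sends this ciphertext together with $[psk_j]_{AP_S}$ to the DR.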
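One way to see how the mask can be removed without decrypting, by analogy with the multi-user case described later, is that raising a Paillier ciphertext to the power $n - 1$ encrypts the additive inverse of its plaintext modulo $n$. The derivation below is a sketch under that reading; $n$ is the modulus in $ppk_j$, while $S$ (the user's aggregate $\sum_i m_i$) and $\rho$ (the encryption randomness) are symbols introduced here for illustration.

```latex
[r_{U_i}]_{ppk_j}^{\,n-1}
  = \left(g^{r_{U_i}} \rho^{\,n}\right)^{n-1}
  = g^{r_{U_i}(n-1)} \left(\rho^{\,n-1}\right)^{n} \bmod n^2
  = [\, -r_{U_i} \bmod n \,]_{ppk_j}

[S + r_{U_i}]_{ppk_j} \cdot [r_{U_i}]_{ppk_j}^{\,n-1}
  = [\, S + r_{U_i} - r_{U_i} \bmod n \,]_{ppk_j}
  = [S]_{ppk_j}
```

The first line uses the fact that the Paillier plaintext is only defined modulo $n$, so $r_{U_i}(n-1) \equiv -r_{U_i} \pmod{n}$; the second line is the additive homomorphism applied to the masked ciphertext.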

(v) DR access the processing result: The DR can access the processing result only if the DR's key attributes satisfy the user's $AP_S$. In that case, the DR can decrypt and obtain $psk_j$ by using its ABE secret key: $psk_j = \mathrm{Dec}_{ABE}(pk, [psk_j]_{AP_S}, sk)$. Finally, it uses $psk_j$ to obtain the initially requested aggregated data of the user: $\sum_{i=1}^{N_{req}} m_i = \mathrm{Dec}_{PE}(psk_j, [\sum_{i=1}^{N_{req}} m_i]_{ppk_j})$.

Fig. 5 (diagram labels): The DR requests single/multiple user(s) data. CSP A selects N users* and aggregates encrypted data**, then sends the masked data to CSP B . CSP B aggregates the masked data*, encrypts the masked data with ppk j and encrypts psk j with CP-ABE using AP S /AP M . The DR decrypts CP-ABE(psk j ) and uses psk j to decrypt and get the processing result. (*Only for multiple users data processing request. **Only for single user data processing request.)

c) DR access request for multi-user data processing: A DR requests access to aggregated data of multiple users. For example, a researcher may require access to the aggregated data of a specific set of patients (users) who, for instance, suffer from the same disease. The aggregated data can be accessed only by DRs whose attributes satisfy the fine-grained access policy AP M of the users whose data is requested. This phase is also shown in Fig. 5 and consists of the following steps.

(i) Handling DR request: Upon receiving a request to access aggregated data of multiple users, the CSP A initiates the process by comparing users' AP M with DR attributes. It then selects users whose AP M matches with DR request. For simplicity, let us assume that CSP A selects N users.

(ii) Masking: CSP A starts the masking process by generating a random number for every user's data used in the aggregation. It then encrypts these generated random numbers with the corresponding users' variant Paillier public keys, $vpk_i$, generating $[r_{U_i}]_{vpk_i} = \mathrm{Enc}_{VP}(vpk_i, r_{U_i})$. Next, each encrypted random number is multiplied with the respective user's encrypted data, $[m_i + r_{U_i}]_{vpk_i} = [m_i]_{vpk_i} \cdot [r_{U_i}]_{vpk_i}$.

Finally, the N masked ciphertexts are sent to CSP B along with the $AP_M$ set by the users for further processing.

(iii) Preparing the processing result: this step consists of the sub-steps outlined below:

- CSP B decrypts all the received masked ciphertexts with the variant Paillier strong secret key $ssk$ to obtain the individual users' masked data: $m_i + r_{U_i} = \mathrm{SDec}_{VP}(ssk, [m_i + r_{U_i}]_{vpk_i})$. Then, it performs an addition operation to get the masked aggregation $\sum_{i=1}^{N} (m_i + r_{U_i})$.

- The masked aggregation is encrypted with a Paillier public key $ppk_j$: $[\sum_{i=1}^{N} (m_i + r_{U_i})]_{ppk_j} = \mathrm{Enc}_{PE}(ppk_j, \sum_{i=1}^{N} (m_i + r_{U_i}))$, while the corresponding private key $psk_j$ is encrypted with the common $AP_M$: $[psk_j]_{AP_M} = \mathrm{Enc}_{ABE}(pk, psk_j, AP_M)$.

(iv) De-masking: in this phase, CSP A performs the following steps: It aggregates all the random numbers r Ui (used in the masking process) to obtain N i=1 r Ui . It then encrypts the result with ppk j to get

r Ui ] ppkj by raising it to the power of n − 1:

r Ui ] n−1 ppkj . Finally, it demask the result as follows:

(v) DR access to the processing result: the DR decrypts $[psk_j]_{AP_M}$ using sk if the DR's key satisfies the access policy: $psk_j = Dec_{ABE}(pk, [psk_j]_{AP_M}, sk)$. Finally, the DR uses the obtained $psk_j$ to decrypt $[\sum_{i=1}^{N} m_i]_{ppk_j}$ and obtain the requested aggregated data $\sum_{i=1}^{N} m_i$.
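The multi-user flow above (masking at CSP A, decryption and re-encryption at CSP B, de-masking at CSP A, and final decryption at the DR) can be sketched with a toy textbook Paillier implementation. This is only an illustrative sketch under loud assumptions: it uses standard Paillier with g = n + 1 rather than the variant Paillier of [39], it models the strong secret key ssk by letting CSP B hold each user's private key, it omits CP-ABE entirely, and the function names (`keygen`, `enc`, `dec`, `multi_user_aggregate`) and the small Mersenne-prime parameters are hypothetical and far too weak for real use.

```python
import math
import secrets


def keygen(p, q):
    """Textbook Paillier key pair from two primes (toy sizes, illustration only)."""
    n = p * q
    lam = (p - 1) * (q - 1) // math.gcd(p - 1, q - 1)  # lcm(p-1, q-1)
    mu = pow(lam, -1, n)  # valid because we fix g = n + 1
    return (n, n + 1), (lam, mu)  # (public key), (private key)


def enc(pk, m):
    """Encrypt m as g^m * r^n mod n^2 with fresh randomness r."""
    n, g = pk
    n2 = n * n
    while True:
        r = secrets.randbelow(n - 1) + 1
        if math.gcd(r, n) == 1:
            break
    return (pow(g, m, n2) * pow(r, n, n2)) % n2


def dec(pk, sk, c):
    """Decrypt c using L(c^lambda mod n^2) * mu mod n, with L(u) = (u - 1) / n."""
    n, _ = pk
    lam, mu = sk
    u = pow(c, lam, n * n)
    return ((u - 1) // n * mu) % n


def multi_user_aggregate(user_data, user_keys, dr_key, mask_bits=40):
    """Return an encryption of sum(user_data) under the DR-side key (ppk_j)."""
    dr_pk, _ = dr_key
    n_j = dr_pk[0]

    # Users: each encrypts its own value under its own public key.
    cts = [enc(pk, m) for m, (pk, _) in zip(user_data, user_keys)]

    # CSP A (masking): one random mask per user, added homomorphically.
    masks = [secrets.randbits(mask_bits) for _ in cts]
    masked = [
        (c * enc(pk, r)) % (pk[0] ** 2)
        for c, r, (pk, _) in zip(cts, masks, user_keys)
    ]

    # CSP B: decrypt masked values (stand-in for ssk), add them,
    # and re-encrypt the masked sum under ppk_j.
    masked_sum = sum(dec(pk, sk, c) for c, (pk, sk) in zip(masked, user_keys))
    c_masked_sum = enc(dr_pk, masked_sum)

    # CSP A (de-masking): encrypt the sum of masks under ppk_j, raise it to
    # n - 1 to get its additive inverse, and multiply to cancel the masks.
    c_masks = enc(dr_pk, sum(masks))
    c_neg_masks = pow(c_masks, n_j - 1, n_j * n_j)
    return (c_masked_sum * c_neg_masks) % (n_j * n_j)
```

For example, with two users reporting 72 and 85, decrypting the returned ciphertext with the DR-side private key yields 157, while CSP B only ever sees the masked values m_i + r_i.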

The functional requirements achieved by SAMA in comparison with related schemes [30] , [37] [38] [39] are summarised in Table IV . Compared to these schemes, SAMA achieves all the specified functional requirements.

In this section, we perform a security analysis which includes the security of the cryptosystems used (Paillier, variant Paillier, and CP-ABE), the security of the SAMA scheme, and the security requirements of SAMA.

The security of the Paillier cryptosystem [32] depends on the hardness of the Composite Residuosity Class problem in the standard model. The scheme is semantically secure against chosen-plaintext attacks provided that the Decisional Composite Residuosity assumption holds. The variant Paillier [39] is similar to Paillier encryption, with a slight change in the key generation algorithm (described in Section IV-A2). Hence, its security follows directly from the security of the Paillier cryptosystem, which is proven to be semantically secure in the standard model under the assumption that the Composite Residuosity Class problem is intractable [39] . Moreover, CP-ABE is secure under the generic elliptic curve bilinear group model and the random oracle model [34] . Therefore, the SAMA scheme builds its security on the proven security of the Paillier, variant Paillier, and CP-ABE cryptosystems.


The security analysis of the SAMA scheme is based on the simulation paradigm in the presence of semi-honest (honest-but-curious and non-colluding) adversaries. To prove that the execution view of the IDEAL world is computationally indistinguishable from the execution view of the REAL world, we construct four simulators (Sim U , Sim CSPA , Sim CSPB , and Sim DR ), which represent the four entities U , CSP A , CSP B , and DR. These simulators simulate the execution of the adversaries Adv U , Adv CSPA , Adv CSPB , and Adv DR that compromise U , CSP A , CSP B , and DR, respectively. Note that KA is excluded as it is assumed to be a trustworthy entity.

Theorem 1. The SAMA scheme can securely retrieve the aggregation result plaintext of the addition computations over encrypted data in the presence of semi-honest adversaries.

Proof: We prove the security of the SAMA scheme by considering the case with two data inputs.

1) Sim U : The Sim U encrypts the provided inputs m 1 and m 2 using VP-HE and returns both ciphertexts to Adv U . The simulation view of the IDEAL world of Adv U is computationally indistinguishable from the REAL world view owing to the semantic security of VP-HE.

2) Sim CSPA : The Sim CSPA simulates Adv CSPA in the single and multiple user(s) data processing scenarios. In the single-user data case, Sim CSPA multiplies the provided ciphertexts and then encrypts a random number r with VP-HE. Next, it multiplies the encrypted random number with the result of the multiplication of the ciphertexts. Later, the same random number r is encrypted with the public key of the Paillier scheme, and its ciphertext is raised to the power n − 1 and multiplied with the given ciphertext. In the multiple users data case, Sim CSPA generates two random numbers r 1 and r 2 , encrypts them with the public key of VP-HE, and multiplies the encrypted random numbers with the ciphertexts (encrypted m 1 and m 2 ), respectively. Later, the same random numbers are encrypted with the public key of the Paillier scheme, and the results are raised to the power n − 1 and multiplied with the given ciphertext. In both cases, Adv CSPA receives the output ciphertexts from Sim CSPA . Therefore, the REAL and IDEAL views of Adv CSPA are computationally indistinguishable owing to the semantic security of VP-HE and Paillier encryption.

3) Sim CSPB : The execution view of CSP B in the REAL world is given by the ciphertexts of (m 1 + r 1 ) and (m 2 + r 2 ), from which m 1 + r 1 and m 2 + r 2 are obtained by decrypting these ciphertexts with the strong secret key (r 1 and r 2 are random integers in $Z_n$). The execution view of CSP B in the IDEAL world consists of two ciphertexts randomly selected from $Z_{n^2}$. The Sim CSPB simulates Adv CSPB in both the single and multiple user(s) data processing scenarios. In the single-user data case, Sim CSPB runs the strong decryption algorithm and obtains m ′ 1 + m ′ 2 + r ′ , and the decryption result is then encrypted with the public key of Paillier encryption to obtain a new ciphertext. In the multiple users data case, Sim CSPB runs the strong decryption algorithm and obtains m ′ 1 + r ′ 1 and m ′ 2 + r ′ 2 . Then, Sim CSPB aggregates the decryption results and encrypts the aggregated result with the Paillier public key to obtain a ciphertext. Next, in both cases, a randomly generated number is encrypted with CP-ABE. Then, the two ciphertexts (generated by the Paillier and CP-ABE schemes) are provided as a result by Sim CSPB to Adv CSPB . These ciphertexts are computationally indistinguishable between the REAL and IDEAL worlds of Adv CSPB , since CSP B is assumed to be honest (semi-honest and non-colluding) and owing to the semantic security of VP-HE and the Paillier cryptosystem and the security of CP-ABE.

4) Sim DR : The Sim DR randomly selects ciphertexts (without having access to the challenged data), decrypts them, and sends the results to Adv DR . The view of Adv DR is the decrypted result without any other information, irrespective of how many times the adversary accesses Sim DR . Due to the security of CP-ABE and the semantic security of the Paillier scheme, the REAL and IDEAL world views are indistinguishable. Since the user data encryption process and the DR decryption process are common to both single and multi-user data processing in the SAMA scheme, the security proofs for Adv U and Adv DR apply to both the single and multi-user scenarios.
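The simulator arguments for CSP A rest on the additive homomorphism of Paillier-type encryption. As a brief reminder, and writing $[x]_{pk}$ for an encryption of $x$ under a public key with modulus $n$ (ciphertext products taken modulo $n^2$; this shorthand is an assumption made here for illustration), the standard identities used in the masking and de-masking steps are:

```latex
\begin{align*}
[m_1]_{pk} \cdot [m_2]_{pk} &= [m_1 + m_2 \bmod n]_{pk} \\
\left([r]_{pk}\right)^{n-1} &= [(n-1)\, r \bmod n]_{pk} = [-r \bmod n]_{pk} \\
[m + r]_{pk} \cdot \left([r]_{pk}\right)^{n-1} &= [m + r - r \bmod n]_{pk} = [m]_{pk}
\end{align*}
```

The second line is why raising the encrypted random number to the power n − 1 yields its additive inverse, and the third line is the de-masking step that leaves an encryption of the unmasked data.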

C. Analysis against Security Requirements

1) Data Confidentiality: Every user encrypts his/her data using his/her VP-HE public key vpk i . CSP A then performs the homomorphic addition operation over the encrypted data, and delivers the processing result ciphertext, together with the Paillier private key psk encrypted using CP-ABE, to the DR. Only authorised DRs can obtain psk and hence have access to the user data. Furthermore, the SAMA scheme conceals users' raw data by adding random numbers at CSP A , i.e., masking the processed data, hence preserving the privacy of the user(s) data at CSP B . Moreover, the Paillier cryptosystem is semantically secure and CP-ABE is secure under the generic elliptic curve bilinear group model, as discussed in VI-A. In addition, the communication channels among all the entities (user, CSP A , CSP B , and DR) are secure (e.g., encrypted using SSL). Therefore, based on all of the above, only the authorised entities (i.e., the user or DR) can access the processing result, and all unauthorised internal or external entities who might eavesdrop on messages and/or collect information can only access the ciphertexts of the users (satisfying (S1)).

2) Authorisation: SAMA uses CP-ABE to implement secure fine-grained access control, where the processing result is encrypted under the user-defined access policies and the decryption key is associated with the attributes of the recipients. The user-centric access policy has been applied in the design of the SAMA scheme, which allows users to define their access policies to securely and selectively grant DRs access to the processing result. Thus, the processing result is encrypted using AP S and AP M , which are access policies set by users to express their preferences for sharing the single and multiple users data processing results. Hence, the private key of the DR (sk) is required to decrypt the encrypted processing result using CP-ABE, and only an authorised DR who satisfies the access policy can access the key and thereby decrypt the processing result. Thus, using CP-ABE, SAMA provides user-centric fine-grained access control, and only authorised DRs can access the processing result (satisfying (S2)).
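To make the notion of a DR "satisfying" an access policy concrete, the following minimal sketch evaluates a threshold-gate policy tree against a set of attributes. It models only the policy logic; in CP-ABE this check is enforced cryptographically during decryption rather than by a trusted party evaluating a Boolean function. The class names, the `satisfies` function, and the example attributes are hypothetical and not taken from the SAMA implementation.

```python
from dataclasses import dataclass, field
from typing import List, Set, Union


@dataclass
class Leaf:
    """A leaf of the policy tree: a single required attribute."""
    attribute: str


@dataclass
class Gate:
    """A k-of-n threshold gate: AND is n-of-n, OR is 1-of-n."""
    threshold: int
    children: List["Node"] = field(default_factory=list)


Node = Union[Leaf, Gate]


def satisfies(node: Node, attributes: Set[str]) -> bool:
    """Return True if the attribute set satisfies the policy (sub)tree."""
    if isinstance(node, Leaf):
        return node.attribute in attributes
    hits = sum(satisfies(child, attributes) for child in node.children)
    return hits >= node.threshold


# Hypothetical AP_M: researcher AND (hospital OR university)
example_policy = Gate(2, [
    Leaf("role:researcher"),
    Gate(1, [Leaf("org:hospital"), Leaf("org:university")]),
])
```

With this example policy, a DR holding `{"role:researcher", "org:university"}` satisfies the tree, whereas a DR holding only `{"role:researcher"}` does not and would therefore be unable to recover psk.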

In this section, we evaluate the performance of the SAMA scheme in terms of the computational complexity and communication overheads incurred among all entities in the system. We also compare the performance of SAMA with the performance of the most relevant work [30] .

The computationally expensive operations considered in the SAMA scheme are the modular exponentiation and modular multiplication operations, denoted as ModExp and ModMul, respectively. We ignore the fixed number of modular additions in our analysis, as their computational cost is negligible compared to ModExp and ModMul. In our analyses we also use the following parameters: BiPair is the cost of a bilinear pairing in ABE; |γ| + 1 is the number of attributes in the access policy tree; and ϑ is the number of attributes needed to satisfy the access policy. Furthermore, we divide the complexity of SAMA into computations related to HE for data aggregation and computations related to ABE for access control.

1) Computational Complexity of HE Data Aggregation: In our analysis, we split the computational complexity into four parts: the complexity at each of the entities.

Computations at User Side: This is a common step for the single and multiple user(s) data cases. At each reporting time slot, each user encrypts their data with their VP-HE public key vpk i to generate a ciphertext used for data processing/analysis. This encryption requires two modular exponentiation operations, hence the computational complexity at the user side is: 2 * ModExp.

Computations at CSP s: This includes operations performed by CSP A and CSP B . As these operations are slightly different for the single and multiple user(s) data processing scenarios, we analyse them separately.

For the single-user data processing case, CSP A performs homomorphic addition over the received user ciphertexts ((N m − 1) * ModMul), generates a random number r, encrypts it with the user's VP public key vpk i (2 * ModExp), multiplies the result of the homomorphic addition with the encrypted random number (ModMul) and sends it to CSP B . Next, CSP A re-encrypts the generated random number r with ppk j (2 * ModExp), calculates the additive inverse of r (ModExp), and then multiplies it with the ciphertext received from CSP B (ModMul) to remove the masking. CSP B , in turn, performs strong decryption of the received masked ciphertext using ssk (2 * ModExp + ModMul), encrypts the result with ppk j (2 * ModExp), and encrypts psk with CP-ABE using AP S ((|γ| + 1) * Exp). In total, the computational cost at the CSP s in the single-user data processing case is: (N m + 2) * ModMul + 9 * ModExp + (|γ| + 1) * Exp.

For the multiple users data processing case, CSP A generates a random number for every user's data (N users), encrypts each of them using the VP public key of the corresponding user, vpk i (N * 2 * ModExp), and then multiplies the resulting ciphertexts with the ciphertexts received from the users (N * ModMul). Later, it aggregates all the generated random numbers, encrypts the aggregate using ppk j (2 * ModExp), calculates the additive inverse of the aggregation result (ModExp), and then multiplies the resulting ciphertext with the ciphertext received from CSP B (ModMul) to remove the masking from the original data. Thus, the computational cost of CSP A in the multiple users data processing case is: (2 * N + 3) * ModExp + (N + 1) * ModMul. CSP B performs strong decryption using ssk for all N received ciphertexts (N * (2 * ModExp + ModMul)), and then aggregates the decryption results. Next, it encrypts the addition result with a Paillier public key ppk j (2 * ModExp), and then encrypts psk with CP-ABE using AP M ((|γ| + 1) * Exp). Hence, the total computation cost of CSP B in the multiple users data processing case is: (2 * N + 2) * ModExp + N * ModMul + (|γ| + 1) * Exp. Therefore, in total, the computational complexity of both CSP s in the multiple users data processing case is: (4 * N + 5) * ModExp + (2 * N + 1) * ModMul + (|γ| + 1) * Exp.

Computations at DRs: In both single and multiple users data processing, a DR decrypts an ABE ciphertext using his/her sk to obtain the Paillier decryption key psk j (at most ϑ * BiPair), and then uses it to decrypt the encrypted processing result (2 * ModExp + ModMul). In total, this gives a computational cost at the DR of: 2 * ModExp + ModMul + ϑ * BiPair.

We compare the total computational costs of each entity in SAMA with the addition scheme of [30] in Table V .
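The operation counts stated above can be tabulated with a small cost model, which is handy for checking that the per-CSP counts add up to the stated totals. This is only a bookkeeping sketch: the function names are hypothetical, `gamma` stands for |γ|, `theta` for ϑ, and the counts are transcribed from the text rather than measured.

```python
from collections import Counter


def user_cost():
    """Per-report cost at the user: one VP-HE encryption."""
    return Counter(ModExp=2)


def csp_a_multi(n_users):
    """CSP A, multi-user case: mask N ciphertexts, then de-mask the aggregate."""
    return Counter(ModExp=2 * n_users + 3, ModMul=n_users + 1)


def csp_b_multi(n_users, gamma):
    """CSP B, multi-user case: N strong decryptions, one Paillier and one ABE encryption."""
    return Counter(ModExp=2 * n_users + 2, ModMul=n_users, Exp=gamma + 1)


def csp_total_multi(n_users, gamma):
    """Combined cost of both CSPs in the multi-user case."""
    return csp_a_multi(n_users) + csp_b_multi(n_users, gamma)


def csp_total_single(n_messages, gamma):
    """Combined cost of both CSPs in the single-user case (stated total)."""
    return Counter(ModMul=n_messages + 2, ModExp=9, Exp=gamma + 1)


def dr_cost(theta):
    """Cost at the DR: ABE decryption (at most theta pairings) plus one Paillier decryption."""
    return Counter(ModExp=2, ModMul=1, BiPair=theta)
```

For instance, `csp_total_multi(10, 4)` gives 45 ModExp, 21 ModMul and 5 Exp, which agrees with (4 * N + 5), (2 * N + 1) and (|γ| + 1) for N = 10 and |γ| = 4.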

2) Computational Complexity of Access Control: We assume that there are |U | universal attributes, of which |γ| attributes are in the access policy tree τ , and at most ϑ attributes should be satisfied in the access policy tree τ to decrypt the ciphertext. The Setup() algorithm generates the public parameters using the given system parameters and attributes U . This requires |U | + 1 exponentiations and one bilinear pairing. The Enc ABE () algorithm requires two exponentiations for each leaf in the ciphertext's access tree τ , which needs (|γ| + 1) * Exp, whereas the KGen ABE () algorithm requires two exponentiations for every attribute given to the user. Also, the private key consists of two group elements for every attribute. Finally, Dec ABE () requires two pairings for every leaf of the access tree τ matched by a private key attribute and at most one exponentiation for each node along a path from that leaf to the root node.

The Setup() algorithm only needs to be executed once. Thus, its computational complexity can be neglected in both the single and multiple users data processing cases. Further, Enc ABE () is performed only once to encrypt the private key of the encrypted final result in both the single and multi-user scenarios, so its computation cost is also negligible. Moreover, Setup() and KGen ABE () are performed at KA and Enc ABE () at CSP B , which means users are not burdened with this computational cost. Although the Dec ABE () algorithm is performed by the DR, which incurs some computational cost, it is an essential requirement for providing an authorised DR with access to the final result under fine-grained access control.

There are two types of communication overhead incurred in the SAMA scheme: overhead due to occasional data communication and overhead due to regular data communication. The former overhead captures the data sent occasionally, e.g., AP (AP S , AP M ) uploads/updates and notifications. The latter overhead includes the regular data communication patterns within SAMA, such as data upload, data requests, and data exchanged between cloud providers when data is being processed. Since the former overhead is negligible compared to the latter overhead, here we focus only on the communication overhead due to regular data communication patterns.

To ease the analyses, we divide the communication overhead introduced by the SAMA scheme into three parts: overhead incurred (1) between users and CSP s (Users-to-CSP s), (2) between CSP s (Between-CSP s), and (3) between CSP s and DRs (CSP s-to-DRs).

1) Users-to-CSP s: This is a common step for single and multiple users data cases. At each data reporting time slot, each user U i sends one ciphertext to CSP A . As each ciphertext has a length of 2 * L(n) (operations are performed under mod n 2 ), the total communication overhead for this part in single and multiple users data processing is: N * 2 * L(n).

2) Between-CSP s: The communication between CSP s in single-user data processing is as follows. CSP A sends one 

Here we present the experimental results of SAMA in three different settings: (1) computational cost of the data processing operations, (2) computational cost of the data access operations, and (3) communication overheads within SAMA.

For the computational cost, we have implemented the SAMA scheme to test its computational performances by conducting experiments with Java Pairing-Based Cryptography (jPBC) [40] and Java Realization for Ciphertext-Policy Attribute-Based Encryption (cpabe) [41] libraries on a laptop with Intel Core i7-7660U CPU 2.50GHz and 8GB RAM. We ran each experiment 500 times and took the average values. We set the length of n to 1024 bits, m to 250 bits, and r to 500 bits. We show the computation evaluation for the single-user and multiple users data processing for all entities separately and specifically CSP A and CSP B as they perform different sets of computations in each case as described in Section VII-C1. In addition, the efficiency of user-centric access control and communication overhead among the entities are shown in Section VII-C2 and Section VII-C3 respectively. 
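The measurement procedure described above (average over repeated runs with 1024-bit n, 250-bit m and 500-bit r) can be sketched as a small timing harness. This is a hedged Python stand-in, not the Java/jPBC code used for the reported experiments: the function names are hypothetical, the modulus is a random odd number of the right size rather than a product of two primes (sufficient for timing only), and absolute timings will differ from those reported.

```python
import secrets
import time


def random_odd(bits):
    """Random odd integer with exactly `bits` bits (timing stand-in for n)."""
    return secrets.randbits(bits) | (1 << (bits - 1)) | 1


def average_ms(fn, runs=500):
    """Average wall-clock time of fn() in milliseconds over `runs` calls."""
    start = time.perf_counter()
    for _ in range(runs):
        fn()
    return (time.perf_counter() - start) * 1000 / runs


def paillier_style_encrypt_cost(n_bits=1024, m_bits=250, r_bits=500, runs=500):
    """Time g^m * r^n mod n^2, i.e. two ModExp and one ModMul."""
    n = random_odd(n_bits)  # NOT a valid Paillier modulus; used for timing only
    n2 = n * n
    g = n + 1
    m = secrets.randbits(m_bits)
    r = secrets.randbits(r_bits) | 1
    return average_ms(lambda: (pow(g, m, n2) * pow(r, n, n2)) % n2, runs)
```

Calling `paillier_style_encrypt_cost(n_bits=2048)` alongside the default 1024-bit setting gives a rough feel for how the cost of a single encryption grows with the length of n on a given machine.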

We evaluate the computational cost for all of the four entities: U i , CSP A , CSP B and DR in both single and multiple users data processing scenarios and compare with the related work [30] (multi-user) in terms of different lengths of n. In addition, we show the computational cost of single and multiple users processing cases with a variable number of messages and users, respectively.

(i) Influence of different lengths of n on data processing: Figure 6a shows the influence of the different lengths of n on the data processing of two messages, where n = 512, 1024, and 2048 bits. We can observe that the computational cost is low on the user side, hence acceptable for resource-constrained devices. In our single and multiple users data processing, CSP A and CSP B achieve better computational efficiency compared to the scheme in [30] , as shown in Fig. 6b and 6c , respectively. The operation time of the DR, as shown in Fig. 6d , is the lowest among all the entities because it only needs to decrypt the processed result. Even when the length of n reaches 2048 bits, the DR still needs only about 30 ms to complete its computations. Further, the computation performance of our scheme at the DR is comparable to that of the scheme in [30] .

We can observe that the computation cost increases linearly with the bit length of n at all of the entities: user, CSP s, and DR. However, as expected, the computation costs of CSP A and CSP B increase much more rapidly with the bit length of n than those of the user and DR in the case of multi-user data processing. The computational performance evaluation shown in Fig. 6 is consistent with our analysis in Section VII-A1. In general, the above tests show that most of the computation cost is borne by the CSP s, while users do not incur much computation overhead. This result shows the practical advantage of the SAMA scheme for users and DRs, which are the resource-constrained sides. Also, our scheme achieves better computation efficiency compared to the scheme in [30] , which supports only multi-user processing.

(ii) Performance of SAMA's single-user data processing with a variable number of provided messages: We tested the computation of SAMA's single-user data processing case by varying the number of data messages provided by a single user, as shown in Fig. 7 . As can be seen from the figure, the operation time increases with the number of messages. The only exception is the DR, whose operation time is independent of the number of messages because it decrypts the processed result once, regardless of the number of messages that are processed at the CSP s.

(iii) Performance of SAMA's multiple-users data processing with a variable number of users: We tested the performance of SAMA's multiple users data processing by varying the number of users (N U = 10, 100, 1000, 10000) and fixing each user to generate only one message for data processing. As expected, the CSP s have more operation time compared to the user and DR. Moreover, if we compare Fig. 7 and Fig. 8 , then the CSP operation time is higher in the multi-user case compared to the single-user data processing. Since VP-HE supports only single-key homomorphic addition and does not support multi-key homomorphic addition, our multi-user processing computation time is higher than the single-user data processing at the CSP side. In other words, homomorphic data processing is executed over data encrypted only with the same encryption key. Therefore, as the multi-user data processing requires the decryption of all the messages using ssk, and then encrypting the aggregate with ppk, this incurs extra computation time.

2) Efficiency of User-Centric Access Control: We tested the computational efficiency of CP-ABE by varying the number of attributes involved in the access policy from two to ten, as shown in Fig. 9 . The cost of the Setup algorithm is relatively constant as it does not depend on the number of attributes. In addition, the decryption algorithm Dec ABE in the test was set to require only one attribute to satisfy the access policy tree; therefore, the operation time of Dec ABE is constant. The computational costs of Enc ABE and KGen ABE increase linearly with the number of attributes. Although employing CP-ABE achieves user-centric fine-grained access control, it incurs additional computation overhead. However, in our scheme, since the computation of Enc ABE is outsourced to CSP B , and Setup and KGen ABE are outsourced to KA, this extra computation will not burden the resource-constrained devices (wearables) at the user side.

[Fig. 10(a): Communication overhead at the Users-CSP s part. x-axis: number of data provided (10, 100, 1000, 10000); series: our multi- and single-user scheme, and the scheme in [30] .]

The communication overhead among the entities is shown in Fig. 10 ; it is evaluated by fixing the length of n to 1024 bits and varying the number of messages to be computed. As is evident from Fig. 10a , the Users-to-CSP A communication in the SAMA scheme reduces the communication overhead by 50% compared to the scheme in [30] . Furthermore, it is essential to note that the scheme in [30] supports only multi-user processing by encrypting data with the CSP 's public key. Therefore, in order to support single-user processing, it needs to re-encrypt the same data with the user's public key, as mentioned in [30] . Consequently, if we also take single-user processing into account, our scheme reduces the communication overhead at the user side by 75% compared to the scheme in [30] . In SAMA, a user has to encrypt wearable data only once for both single and multi-user processing, whereas the scheme in [30] requires encrypting the user's data twice to support both single and multiple users data processing.
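The 50% and 75% figures can be checked with a short back-of-the-envelope calculation for L(n) = 1024 bits. The calculation assumes, for illustration only, that each encryption in [30] outputs two ciphertext components of 2 * L(n) bits each, whereas a SAMA user uploads a single ciphertext of 2 * L(n) bits:

```latex
\begin{align*}
\text{SAMA, one ciphertext:}\quad & 2 \cdot L(n) = 2 \cdot 1024 = 2048 \text{ bits} \\
\text{[30], one encryption (two components, assumed):}\quad & 2 \cdot 2 \cdot L(n) = 4096 \text{ bits} \\
\text{reduction:}\quad & 1 - \tfrac{2048}{4096} = 50\% \\
\text{[30], two encryptions (single- and multi-user):}\quad & 2 \cdot 4096 = 8192 \text{ bits} \\
\text{reduction:}\quad & 1 - \tfrac{2048}{8192} = 75\%
\end{align*}
```

Under the same assumptions, N = 1000 users each uploading one ciphertext in SAMA amounts to N * 2 * L(n) = 2,048,000 bits, i.e. 256,000 bytes, per reporting time slot.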

In addition, the scheme in [30] generates two ciphertexts for every data encryption, which increases the communication overhead on the user side, whereas in the SAMA scheme only one ciphertext is generated. Clearly, SAMA significantly reduces the communication overhead at the user side, which suits resource-constrained devices. These results are consistent with those obtained in [39] , which compares the communication overhead of two HE algorithms, BCP and VP-HE, and finds that the communication cost of BCP (the algorithm used in [30] ) is about twice that of VP-HE. Figure 10b depicts the communication overhead among the cloud servers (CSP A -to-CSP B and CSP B -to-CSP A ). Although our single-user processing achieves better communication efficiency compared to [30] , the communication overhead of our multi-user processing is significantly higher than that of the multi-user scheme of [30] . However, since the CSP s are not limited in resources, they can afford this higher communication overhead for multi-user processing. Moreover, multiple users data processing is relatively less frequent than single-user data processing in most wearable and healthcare use cases, which are more personalized. We achieve better communication efficiency for the more frequent single-user data processing. Therefore, our scheme is mainly suitable for applications that require single-user data processing more frequently than multi-user data processing, such as wearables and outsourced personalized healthcare data processing.

The communication overhead between the CSP s and DRs is shown in Fig. 10c . As DRs access only the processed result, there is less communication overhead between the CSP s and DRs. It is clear that our single and multi-user processing performs better than the scheme in [30] , which supports only multi-user processing. Therefore, overall our scheme has significantly less total communication overhead compared to [30] .

Wearable users (data owners) increasingly outsource their wearable data to third parties, e.g., cloud-based services, to benefit from the results of personal and sociable analytic models. As health data are sensitive, they should be protected against outsider as well as insider threats, such as those from the data managers (cloud service providers). In addition, this protection should be applied to both raw data and aggregated data, and should allow data owners to manage the rights/privileges for accessing their data. This leads to the need for a solution with the following properties: data are managed by a third party, yet no trust in the third party is required; single and multiple user(s) data processing is efficient; data owners can specify who can access which data items (i.e., user-centric access control); and access to both raw data and computed/aggregated data (i.e., computed results) is fine-grained.

To achieve these properties systematically, in this paper, we have designed and evaluated a novel flexible data processing and access control scheme, called SAMA, which supports aggregation over encrypted data of single and multiple users with user-centric access control and privacy preservation. SAMA combines the use of Paillier homomorphic encryption and CP-ABE with a user-centric approach. Security analysis based on the simulation paradigm shows that the SAMA scheme fulfils the specified set of security and privacy requirements. Experimental results have also demonstrated its efficiency and advantages in terms of communication and computation in comparison with the existing related schemes.

As future work, we plan to extend SAMA to support more operations on encrypted data in order to facilitate more complex analytical processes.

[1] Wearable interaction

[2] Toward practical privacy-preserving analytics for IoT and cloud-based healthcare systems

[3] Wearable technology to assist the patients infected with novel coronavirus (COVID-19)

[4] Real-time alerting system for COVID-19 using wearable data

[5] Conceptual privacy framework for health information on wearable device

[6] Computing blindfolded on data homomorphically encrypted under multiple keys: A survey

[7] Secure sharing of partially homomorphic encrypted IoT data

[8] Security and privacy in cloud-assisted wireless wearable communications: challenges, solutions, and future directions

[9] Secure sharing of personal health records in cloud computing: Ciphertext-policy attribute-based signcryption

[10] Droplet: Decentralized authorization and access control for encrypted data streams

[11] Sieve: Cryptographically enforced access control for user data in untrusted clouds

[12] Scalable and secure sharing of personal health records in cloud computing using attribute-based encryption

[13] Regulation (EU) 2016 - general data protection regulation

[14] Health insurance portability and accountability act of 1996

[15] Secure large-scale genome-wide association studies using homomorphic encryption

[16] A secure privacy-preserving data aggregation scheme based on bilinear ElGamal cryptosystem for remote health monitoring systems

[17] Achieve privacy-preserving priority classification on patient health data in remote eHealthcare system

[18] Generating private recommendations efficiently using homomorphic encryption and data packing

[19] PPDM: privacy-preserving protocol for dynamic medical text mining and image feature extraction from secure data aggregation in cloud-assisted e-healthcare systems

[20] Foresee: Fully outsourced secure genome study based on homomorphic encryption

[21] Efficient and privacy-preserving outsourced calculation of rational numbers

[22] Efficiently outsourcing multiparty computation under multiple keys

[23] Privacy-preserving tensor decomposition over encrypted data in a federated cloud environment

[24] An efficient homomorphic encryption protocol for multi-user systems

[25] Verifiable public key encryption with keyword search based on homomorphic encryption in multi-user setting

[26] A tale of two clouds: Computing on data encrypted under multiple keys

[27] An efficient ciphertext-policy weighted attribute-based encryption for the internet of health things

[28] Secure access for healthcare data in the cloud using ciphertext-policy attribute-based encryption

[29] Privacy preserving EHR system using attribute-based infrastructure

[30] Privacy-preserving data processing with flexible access control

[31] Computing arbitrary functions of encrypted data

[32] Public-key cryptosystems based on composite degree residuosity classes

[33] Fuzzy identity-based encryption

[34] Ciphertext-policy attribute-based encryption

[35] Attribute-based encryption for fine-grained access control of encrypted data

[36] An extended framework of privacy-preserving computation with flexible access control

[37] A decentralized security framework for data aggregation and access control in smart grids

[38] Efficient and privacy-preserving fog-assisted health data sharing scheme

[39] Privacy-preserving association rule mining using homomorphic encryption in a multikey environment

[40] jPBC: Java pairing based cryptography

[41] Java realization for ciphertext-policy attribute-based encryption