key: cord-126132-5k415xvj authors: Swain, V. Das; Kwon, H.; Saket, B.; Morshed, M. Bin; Tran, K.; Patel, D.; Tian, Y.; Philipose, J.; Cui, Y.; Plotz, T.; Choudhury, M. De; Abowd, G. D. title: Leveraging WiFi Network Logs to Infer Social Interactions: A Case Study of Academic Performance and Student Behavior date: 2020-05-22 journal: nan DOI: nan sha: doc_id: 126132 cord_uid: 5k415xvj On university campuses, social interactions among students can explain their academic experiences. However, assessing these interactions with surveys fails to capture their dynamic nature. While these behaviors can be captured with client-based passive sensing, these techniques are limited in scalability. By contrast, infrastructure-based approaches can scale to a large cohort and infer social interactions based on collocation of students. This paper investigates one such approach by leveraging WiFi association logs archived by a managed campus network. In their raw form, access point logs can approximate a student's location but with low spatio-temporal resolution. This paper first demonstrates that processing these logs can infer the collocation of 46 students in 34 lectures over 3 months, with a precision of 0.89 and a recall of 0.75. Next, we investigate how this WiFi-based coarse collocation reflects signals of social interaction. With 163 students in 54 project groups, we find that member performance shows a correlation of 0.75 with performance determined from collocation of groups through 14 weeks. Additionally, this paper presents preliminary insights for other campus-centric applications of automatically inferred social interactions. Finally, this paper discusses how repurposing archival WiFi logs can facilitate applications for other domains like mental wellbeing and physical health. Humans are social by nature; their functioning is informed and explained by behaviors that are interlinked with those of others [24] . 
One of the ways these behaviors manifest is when people in the same physical space take mutually-oriented actions [52]. In situated communities such as college campuses, a student's social interactions with their peers can be important to describe the academic experiences of students, such as motivation [20, 66], absenteeism [64], and social isolation [19]. Indeed, these interactions also share a relationship with a student's academic performance [19, 20]. As a result, understanding how students socially interact can help campus stakeholders gain valuable insights to support academic outcomes and support flourishing. However, traditional survey-based approaches are limited in representing these dynamic behaviors at a fine temporal and spatial granularity in a way that is scalable at the community level. This motivates us to gain an objective insight into social interactions using unobtrusive and automated methods that can be practically deployed at scale. It is possible to capture social interactions through devices possessed by the user [14, 43, 44]. However, it is often impractical to gain a comprehensive picture of a large community's social behaviors because these

2 BACKGROUND AND RELATED WORK

Social interactions are known to be related to behavioral and psychosocial outcomes [63]. This motivates us to study how meaningful signals of social interactions can be inferred by unobtrusively sensing collocation among related individuals. Particularly, this paper adopts the definition for social interactions as described by Rummel [52]: "...acts, actions, or practices of two or more people mutually oriented towards each other's selves...". These interactions cannot be determined by physical distance alone [52]. People could walk through crowds without socially acting on each other. People could also be sitting quietly in a room but still socially interacting [45].
Rummel's definition refers to people regulating their actions based on others sharing mutual intent, with the purpose of shaping their subjective experience [52]. Although these interactions can take place digitally, this paper focuses on synchronous social interactions in the physical world, i.e., the people interacting are in proximity. The ubiquitous computing community, along with other HCI researchers, has shown a keen interest in understanding wellbeing of students geographically situated on a campus [4, 12, 54, 69, 71, 72]. Wellbeing describes an individual's perceptions of satisfaction, fulfillment, and motivation. These experiences are characterized by phenomena like affect, stress, and performance. As an individual outcome, wellbeing strongly interacts with social support [60]. Social circles (friends, family, or even work-related groups) can influence the functioning of students at higher-education institutes. One's peer group serves multiple functions of support, such as collaboration on specific tasks or protection from external stressors, even without directly assisting the individual's tasks [8]. Therefore, in order to holistically study wellbeing, it is important for the community to explore methods to identify social interactions. Specifically, social support for college students has been found to moderate stressful events [57] and is linked with reduced negative affect [19] and depression [20]. Therefore, the interactions of an individual with others can facilitate coping with negative events [57]. In fact, these interactions can also be indicative of greater satisfaction [19] and self-actualization [20]. In the context of student life, these factors often contribute to determining student performance. For instance, the lack of social interactions is related to absenteeism [64], the habitual absence associated with low motivation.
At the workplace, the social interactions of an individual also explain their embeddedness, which has been shown to describe the propensity to perform [66]. And more explicitly, interactions foster collaborations, which are positive for performance [3]. Therefore, reduced interactions, or a complete lack of them, not only affect a student's mental health but also impact their academic outcomes. Traditional methods of evaluating social interactions rely on survey instruments, but these are limited by recall and desirability biases [1, 34]. Moreover, self-reports are static assessments, while social interactions are fluid and vary over time [55]. One approach to studying human phenomena while avoiding such biases is unobtrusive sensing. These automatic methods have the promise of dynamically sensing human behavior without interfering with an individual's natural functioning and are therefore more practical for gathering reliable insights. Automatically sensing social interactions has piqued the interest of the community for over a decade. According to Lukowicz et al., one of the opportunities that describes socially aware computing is "methods for monitoring and analyzing social interactions - in particular, with respect to long-term interactions and interactions within large communities and organizations" [38]. Prior work in this space has fundamentally focused on two approaches, differentiated by the scope of the spaces and people studied. The first set of approaches has focused on studying face-to-face interactions in small spaces [14, 44]. Olguín et al. used wearable badges to study how low-level interpersonal interactions are related to workplace performance [44]. Although this methodology is precise, it is limited by the cost of instrumentation and by the obtrusiveness of wearing a foreign device. A known variation is to track social interactions using Bluetooth sensors embedded in one's smartphone [41, 70].
However, the convenience of installing applications on an existing wearable [27, 62] or phone [10] still requires user adoption and introduces concerns of privacy violations [59]. This challenges the approach of aggregating individualized sensing to determine social interactions. The second approach has largely focused on studying city-scale "flocks" through GPS-based localization [14, 17, 29, 68]. While this approach scales, it is privacy-invasive because of its continuous data aggregation [35, 59]. In addition, GPS-based localization varies in accuracy for indoor settings [31]. Moreover, gaining these insights without requiring client adoption would require resources that are not readily accessible to a campus community. Campus settings require an approach that can infer interactions within buildings while not compromising individual privacy outside their perimeter. This would mitigate the adoption issues of client-side approaches and the oversight challenges of global sensing. As a result, campuses can consider harnessing data already logged through their network infrastructure without needing any active participant effort or involvement. To understand social interactions in a large community, campus stakeholders require methods that can infer such interactions in a dynamic, scalable, and reliable way. This has led to some prior explorations that leverage WiFi-based technologies for localization and consequently infer social interaction based on collocation. A common technique for localization with WiFi is fingerprinting or trilateration [67]. For instance, Hong et al. have shown that WiFi-based fingerprinting can help identify ties between groups [25]. However, these works are client-side approaches. Alternatively, enterprise solutions with fingerprinting and trilateration capabilities at the infrastructure side have emerged [42, 47].
To infer location, these technologies store the Received Signal Strength Indicator (RSSI) values for any client device within a neighborhood of Access Points (APs). Yet, a common form of WiFi infrastructure deployment in university campuses [16, 71] only stores association logs describing which AP a client device is connected to. Although it is relatively coarse [39], this parsimonious representation of location has been exploited to understand individual behavior. Ware et al. have inferred student location based on network logs to assist depression screening by inferring individual dwelling behaviors (e.g., duration, entropy, and rhythms) [71]. Similarly, Eldaw et al. have used unsupervised methods on similar logs to understand how the pattern of student visits can explain the semantic purpose of certain campus spaces [16]. While these works trace individual dwelling patterns across campus, they do not explicitly assess if specific groups of students were interacting. Motivated by these ideas, this paper uses network association logs and extends them to identify periods when multiple individuals are collocated for meaningful social interactions. Even though collocation does not necessitate verbal communication in the strict sense, it does serve a social function [45]. Olson and Olson have described how "spatiality" is fundamental to human collaboration even if individuals do not communicate [45]. Therefore, we seek to determine if this collocation-based information can capture the signals of such collaborations or social interactions by studying the performance of project groups. Although singular instances of collocation might be contaminated with spurious events of co-presence where individuals did not interact, by gleaning the right information from such data we can predict social-interaction-based outcomes. For instance, prior work has shown that network association patterns can phenotype people into behavioral groups [37].
Even other infrastructure-based coarse location technologies, such as Bluetooth, have been used to capture social signals like synchrony within group routines [11]. While these studies implicitly associate individuals together (e.g., distinguish students by dining hall), they do not sufficiently explore explicit interactions in physical spaces. A more direct depiction of social interactions was demonstrated by Zakaria et al., who leverage a custom system integrated into the campus network infrastructure to monitor groups and subsequently predict stress [72]. However, these systems either rely on additional augmentation of the infrastructure or

A "social interaction" occurs when and where individuals mutually orient themselves [52]. This happens in the physical world when two or more individuals are within proximity, or they are collocated [45]. In the scope of this paper, periods when individuals are collocated are the basic unit for which we infer social interactions. This paper seeks to answer if coarse collocation can determine meaningful social interactions. Therefore, it is important to establish an automatic approach to compute these collocation sessions. As illustrated in Figure 1, this section describes a pipeline to determine collocation by leveraging WiFi network association logs and an evaluation of its reliability (RQ1). To build a reliable processing pipeline we need real test data that represents the students on campus. This is to ground the methodology in how the network logs actually depict the behaviors of real students.

3.1.1 Sample Association Logs. As a testing sample, we obtained consent from 46 students at a large public university in the United States, and we then analyzed their WiFi association logs. These students belonged to two sections of a group-project-intensive course. Both sections were taught by the same instructor(s) and had attendance data for each lecture.
We refer to these sections as "1A" (22 students) and "1B" (24 students) throughout the paper. The instructor for the course provided each consenting student's attendance and group label, along with the course lecture schedule. The institute's IT management facility provided anonymized network log data for these students. This data was accessed at the end of the semester and contains approximately 14 weeks of data, which spans 34 lectures for each section. Table 1 shows an AP in room 209 of building 122S. Larger rooms, such as lecture halls, have multiple APs to increase coverage. In the logs, these APs are registered with different MAC addresses, but associated to the same room. Every entry in the log documents an SNMP (Simple Network Management Protocol) update in the network. Its timestamp denotes when a device associates or responds to an SNMP poll request. Therefore, the log itself indicates that a device is in the vicinity of an AP, but without information about the client RSSI this inference has a low spatial resolution. Moreover, the SNMP update is irregular because it depends on the connected device's response [61]. This response is erratic because of variable connectivity settings in the device agent (e.g., the WiFi turns off when the screen is locked). The irregularity in log updates leads to a low temporal resolution. The low spatio-temporal resolution is what introduces "coarseness" to this data. Outside of the specific association timestamps (when a device responds to a poll or switches APs because it roams), the device is invisible in the logs. Because of this, it is non-trivial to determine the location of users between two raw log entries. This section describes a method to use this momentary log information of presence to determine sustained periods of mobility, dwelling, and consequently collocation.
The raw logs are coarse for assessing location because the SNMP updates occur either when a device roams or when it responds to a network poll if it happens to be awake. Therefore, there is no fixed interval within which a log occurs. To reliably determine if an individual is dwelling, it is important to determine where they are between two updates. Specifically, our focus here is identifying when an individual is dwelling in the same room, i.e., in the proximity of the associated AP(s). For this we propose the following approach:
(i) Determine if an Individual is Mobile - Since we knew the scheduled class time and location for the regular lectures of sections 1A and 1B, we examine the logs accumulated in the thirty minutes before and after each lecture. Figure 2 depicts the instances when a student's device is logged before, during, and after the lecture times, along with the AP that captured the update for a Section 1A class held on April 5, 2019. In this analysis, we consider any entry associated with a student, regardless of device, because less than 1% of the log entries show concurrent updates at different APs from two or more devices owned by the student. Since SNMP updates occur when a device roams, we measure the interval between two successive log entries from a user's devices that associate with different APs. We take the 90th quantile of these intervals (233 seconds) as the largest interval between two successive updates from different APs for which an individual is still considered mobile.
(ii) Determine if an Individual is Dwelling in Place - Based on the criteria for moving, a user is considered stationary at the location of the earlier log entry when the time between two successive updates at different APs exceeds the threshold. As evident in Figure 2, the log updates before and after class times not only are at a different AP (than that of the lecture room), but they also occur more frequently, at shorter intervals.
Therefore, we consider any time segment when the user is not mobile to be when they are dwelling. Contiguous dwelling segments where the AP does not change are combined to represent longer dwelling segments. Figure 3 shows how the raw logs represented in Figure 2 can depict moving and stationary time segments.
(iii) Filtering Out Disconnection Periods - One confound to this method of determining dwelling time segments is that it can erroneously label time periods where a user was disconnected from the network as a period when they were dwelling. Consider an individual who moves through network coverage and then exits. As they move off campus, this would be registered as multiple short-interval updates for the changing APs until the last AP connection on campus. This will be followed by a large interval until the user returns to the coverage area. This large interval needs to be distinguished from legitimate dwelling periods to avoid false positives. Based on the class dwelling time, we find that the longest interval between two successive log entries of a student present in class was 76 minutes. We use this heuristic as a threshold. With this, we mark any periods of dwelling as disconnected (or inactive) where the log entries are timestamped at intervals longer than the threshold. Figure 4 shows that the disconnection periods identified were predominantly on weekends and before or after class times.

[Figure caption: Each stack depicts how many students of Section 1B were found to be connected to the lecture room's AP, to another AP in the same building, to the campus network, or not connected at all.]

The previous phase identifies dwelling periods for individuals. This phase identifies periods of collocation based on overlapping periods of dwelling near the same AP (or room).
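The dwelling steps (i) to (iii) can be illustrated with a minimal sketch. It assumes each student's log has already been reduced to time-sorted `(timestamp_in_seconds, ap)` pairs; the `Segment` type and the `label_segments` function are hypothetical names introduced here for illustration, and only the two thresholds (233 seconds and 76 minutes) come from the text.

```python
from dataclasses import dataclass

MOBILE_GAP_S = 233          # 90th-quantile interval between updates at different APs
DISCONNECT_GAP_S = 76 * 60  # longest in-class interval between successive entries


@dataclass
class Segment:
    start: int   # seconds
    end: int     # seconds
    ap: str      # AP of the earlier log entry
    state: str   # "mobile", "dwelling", or "disconnected"


def label_segments(entries):
    """Label the time between successive log entries of one student.

    entries: list of (timestamp_s, ap) tuples sorted by timestamp
             (an assumed, simplified log format).
    """
    segments = []
    for (t0, ap0), (t1, ap1) in zip(entries, entries[1:]):
        gap = t1 - t0
        if gap > DISCONNECT_GAP_S:
            state = "disconnected"   # step (iii): too long to be dwelling
        elif ap0 != ap1 and gap <= MOBILE_GAP_S:
            state = "mobile"         # step (i): quick hop between APs
        else:
            state = "dwelling"       # step (ii): stationary at the earlier AP
        last = segments[-1] if segments else None
        if (last and state == "dwelling" and last.state == "dwelling"
                and last.ap == ap0 and last.end == t0):
            last.end = t1            # merge contiguous dwelling at the same AP
        else:
            segments.append(Segment(t0, t1, ap0, state))
    return segments
```

This sketch treats AP identifiers as opaque strings; mapping several APs to one room, as the logs do for lecture halls, would be an extra lookup before the comparison `ap0 != ap1`.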
Simply considering the overlapping dwelling segments could leave breaks when one of the collocated members inadvertently switches from the corresponding AP to a different one and then returns (e.g., participant 2,034 in Figure 3). This could occur either when they took a break or if they are in place but their device intermittently found a better connection to a different AP. Since the aim of obtaining collocation segments is to infer meaningful social interactions, we consider a liberal approach to characterize collocation. This decision aligns with Rummel's definition of social interactions, which admits interaction between individuals even when they are not within line of sight, because behaviors can still be influenced [52]. When an individual takes a brief break from a meeting, for example, it does not signify the conclusion of social interactions. Therefore, instead of dissecting the collocation periods around such short-lived absences, these gaps in the segments are bridged. In particular, these gaps are characterized by: (i) common members of a group between the collocation periods adjacent to the gap; and (ii) the gap containing a collocation or dwelling segment with a subset of those members. After identifying such overlapping segments, we first find the median duration of these gaps. The median in our data for such occurrences was 11m 7s. Any gaps shorter than this threshold are resolved by considering all members to be collocated throughout, including the break period. To quantify the reliability of this coarse localization and collocation technique, we evaluate the attendance of 46 students in 2 sections for the 34 lectures that occurred in the sample data period. Each section had 3 classes a week and the two sections met in different buildings on campus. The instructors provided us with lecture-by-lecture records of each consenting student's attendance for both sections.
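As an illustration of the collocation step, the following sketch intersects the dwelling segments of two students and then bridges short gaps. It is a simplification under stated assumptions: it handles only a pair of students rather than whole groups, it does not check the two membership criteria for gaps, and the function names `overlaps` and `bridge` and the `(start, end, room)` tuple format are hypothetical. Only the 11m 7s threshold is taken from the text.

```python
BRIDGE_GAP_S = 11 * 60 + 7  # median gap duration reported in the text (667 s)


def overlaps(a, b):
    """Intersect two students' dwelling segments.

    a, b: lists of (start_s, end_s, room) tuples (assumed format).
    Returns the time-sorted periods when both dwell in the same room.
    """
    out = []
    for s1, e1, r1 in a:
        for s2, e2, r2 in b:
            if r1 == r2:
                start, end = max(s1, s2), min(e1, e2)
                if start < end:
                    out.append((start, end, r1))
    return sorted(out)


def bridge(segments, max_gap=BRIDGE_GAP_S):
    """Merge consecutive collocation periods in the same room when the
    gap between them is shorter than max_gap."""
    merged = []
    for start, end, room in segments:
        if merged and merged[-1][2] == room and start - merged[-1][1] < max_gap:
            prev_start, prev_end, _ = merged[-1]
            merged[-1] = (prev_start, max(prev_end, end), room)
        else:
            merged.append((start, end, room))
    return merged
```

A gap of 300 seconds between two collocation periods in the same room would be bridged into one period, while a gap of 700 seconds would leave them separate.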
We use this as the ground truth to evaluate the reliability of our proposed automated method.

Missing Data. First, we would like to address the missing data problem. On certain lecture days, we did not find any entry for some students (including a 30-minute margin before or after). The red stacks in Figure 5 show the number of students per lecture with no log entries for section 1B. On comparing this to the attendance records, we learn that 93% of the time a student does not appear in the logs, they are actually present. One possibility is that the student either had all their devices turned off or connected to a different network (e.g., cellular data). Every student in the sample had no WiFi log entries on at least one lecture they attended (the median was five lectures). Therefore, despite its pervasiveness, leveraging the managed network will still miss out on students who were actually present. This is irrespective of the technique applied or sophistication of logs as it is dependent on the client-side behavior. For such occurrences, the automated method cannot ascertain if a student was present or absent. As a result, we exclude these student records (for that lecture) from further analysis.

Accuracy. We analyze the accuracy by considering every student who connected to the network during the lecture time for each lecture. We consider a student to be in class if at any time during class they were collocated with their peers. For the lecture illustrated in Figure 3 this refers to the green segments. For every lecture in which this technique identifies a student to be in class, they were actually present 89% of the time (precision). We speculate that the false positives that emerge could be the result of a student missing the attendance sign-up sheet. Conversely, for every instance when the student was present, this method infers them to be collocated 75% of the time (recall). This implies a high number of false negatives, as shown in Figure 6.
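The evaluation above can be sketched as a small computation. This is an illustrative sketch only: the record format and the function name `attendance_metrics` are assumptions, with `None` standing for a student-lecture pair that has no log entries and is therefore excluded before precision and recall are computed.

```python
def attendance_metrics(records):
    """Compute precision and recall of inferred attendance.

    records: iterable of (present, inferred) pairs, one per student-lecture.
      present  - ground-truth attendance (bool)
      inferred - True/False from collocation, or None if the student has
                 no log entries for that lecture (assumed encoding)
    Returns (precision, recall, number_of_excluded_records).
    """
    tp = fp = fn = excluded = 0
    for present, inferred in records:
        if inferred is None:
            excluded += 1          # missing data: cannot tell present from absent
        elif inferred and present:
            tp += 1
        elif inferred and not present:
            fp += 1
        elif present and not inferred:
            fn += 1
    precision = tp / (tp + fp) if tp + fp else float("nan")
    recall = tp / (tp + fn) if tp + fn else float("nan")
    return precision, recall, excluded
```

Excluding the `None` records first mirrors the decision to drop student records with no log entries for a lecture, so neither metric is penalized for client-side disconnection.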
A false negative could occur when a student's device connects to a different AP on the network. Figure 3 denotes these as the orange segments. A device could connect to an AP that is physically further away because the signal from their closest AP was attenuated [31]. Therefore, this uncertainty in location could lead to missing out on students that were actually present. Although this might underestimate possible social interactions that took place, it motivates us to see what we can meaningfully learn from the interactions that we do detect correctly. The previous section describes how raw WiFi network logs can be processed to detect the collocation of students in a lecture room. This was validated with attendance records. However, attendance only represents occupancy and not necessarily social interactions [52]. Since social interactions are known to be related to several aspects of wellbeing [63], it is important to learn if collocation, detected by repurposing network logs, can be used to infer these interactions (RQ2). This section presents a case study to explore how the collocation of project teams outside scheduled lectures can represent social interactions. Specifically, social interactions in teams are known to affect performance [3]. This encourages us to investigate the relationship between a group member's performance and how they collocate with other group members (such as time invested in meetings, the regularity of group activities, and the locations of these meetings). This case study demonstrates the feasibility of leveraging raw logs for one specific application that involves social interactions: predicting the performance in project teams. In this way, it answers our research question: To what extent can WiFi-based coarse collocation represent meaningful signals of social interactions?

4.1.1 Participants. The participants were enrolled in an undergraduate design course for CS students.
The course is offered every semester and is a two-semester sequence typically taken by students in their junior (3rd) year. Students in this course are expected to work with a team of four to six students over two semesters (Part 1 and Part 2) on a single design project. In Spring 2019, this course had four sections for Part 1 and five sections for Part 2. Each section had an enrollment of about 40 students. In terms of course structure, Part 1 involves both lectures as well as project milestones. In contrast, Part 2 has fewer lectures and expects students to allocate scheduled class times for project-related efforts. Students in both parts are expected to collaborate on project work outside scheduled lectures. It is not generally known how often student teams meet outside of class, nor is it known how much those meetings impact team performance. The data used in the previous analysis was from sections of this same course but had lecture-wise attendance records (Section 3).

Recruitment. The recruitment took place in Spring 2019 in collaboration with the course instructors. Recruitment was carried out in April by physically advertising the study during the lectures and online outreach through the instructors. In addition, a large number of the students were recruited during the final demonstration expo that is attended by students of both parts. On enrolling, participants provided consent for the researchers to access their WiFi AP log data as well as their course data. The participants were assured that this is retrospective data that is already archived and the insights of our study would not impact their course outcomes. During enrollment, participants also completed an entry survey where they reported their group ID along with describing when, where, and how often they interacted with their group members face-to-face for class purposes. Participants were remunerated with a $5 gift card for enrolling. In total we received consent from 186 students (Table 2).
Of these, 170 students were in the age range of 18-24 years, and 16 were of age 25 and above. Among these students, 59 reported female (32%).

Privacy. Participant privacy was a key concern for the research team given the nature of data being requested. The two core streams of data, course outcomes and WiFi AP logs, are both de-identified and stored in secured databases and servers that are physically located in the researchers' institute and have limited access privileges. The study and safeguards were approved by the Institutional Review Board of the authors' institution. The course-related data of the consenting students was requested from the different instructors after grading for the semester was completed. We obtained data for the remaining 186 students along with course lecture times (Table 2). Among them, 23 students did not have any other member from their group in our study and thus were dropped from this analysis. This leaves us with 163 students and 54 groups (Figure 7).

Final Score. All instructors provided the final score of students in their section. This represents a numerical score between 0 and 100 that is incorporated into the instructor's grading scheme to assign a letter grade for the course. This final score is dominated by the project but students are assessed individually. These variations are introduced by participation as well as the instructor's subjective assessment of team feedback. This study uses this final score to represent a student's academic performance.

Peer Evaluation. Given the group-project-oriented nature of the course and our interest in studying social interactions in groups, students completed a fairly extensive peer-evaluation battery. This battery was completed by the students at the end of the semester and it captures their perceptions of conflict, satisfaction, and security with the team [15, 28, 65]. It can also assess behaviors like collaboration, contribution, and feedback [37].
In essence, this battery evaluates an individual's experience interacting with their team members. Prior work shows that these instruments quantify aspects of social interactions that relate to performance [15, 28, 37, 65]. Therefore, we use a participant's responses to these surveys as a gold standard to predict grades. We feed these responses into a model to compare against the predictions of models trained on automatically inferred behaviors. Table 3 summarizes the distribution of scores for each peer-evaluation survey instrument. Specifically, the peer-evaluation battery contained the following validated survey instruments:
• Team Conflict - Conflict represents the perception of incompatible goals or beliefs between individuals that cannot be trivially reconciled. This battery contains three scales: "task conflict", "process conflict", and "relationship conflict". Jehn and Mannix have shown that low levels of process and relationship conflict along with moderate levels of task conflict are optimal conditions to maximize team performance [28].
• Team Satisfaction - Satisfaction reflects the contentment of an individual with their personal situation in terms of their expectations. Van der Vegt et al. have shown that team satisfaction is associated with interdependence among team members, which is indicative of team performance [65].
• Psychological Safety - This construct captures a "shared belief held by members of a team that the team is safe for interpersonal risk taking" [15]. Edmondson has shown that it is associated with both learning progress as well as team performance [15].
• Team Member Effectiveness - This measure encompasses five dimensions: (i) contributing to the project; (ii) interacting with collaborators; (iii) monitoring progress and providing feedback; (iv) expecting quality; and (v) relevant knowledge and skills [37].
These dimensions characterize behaviors related to "team member effectiveness", which is theoretically related to team performance [37]. The WiFi access point log data for consenting students was obtained from the institute's IT management facility. Since this data was already aggregated for maintenance and security purposes throughout the semester, we were able to retroactively obtain this information at the end of the semester. The data spans all WiFi access logs by connected devices belonging to consenting students. This data is richer compared to the sample data for processing the raw logs into collocation (Section 3.1.2). It includes more individuals and a larger set of APs. The data spans a time frame of 95 days between January 1, 2019 and April 5, 2019. On average, the time between the first log by a participant's devices and the last is approximately 90 days. Figure 8 shows the distribution of connected students throughout the semester. The logs in this study include 204 unique buildings with 4,865 unique APs. We also find multiple APs to be in the same room for 803 rooms. Additionally, the 204 buildings were manually categorized to best express the purpose of that space [16, 70], for example, "academic", "dining", "green spaces", "recreation", and "residential". Two researchers referred to campus resources to independently assign categories to these buildings. Only two of the building labels had a disagreement, which was resolved by a third researcher. The raw logs of the consenting students were processed with the technique described in Section 3.3 to obtain periods when students in the dataset were collocated. The median time spent collocating with other students in the dataset was found to be about 70 hours. The low spatial resolution of the collocation makes it insufficient to assert from single instances if individuals were interacting during a session in which they were in proximity.
However, processing multiple collocation periods over the semester can represent behaviors that indicate these social interactions. For instance, members of the same group are expected to be collocated on a regular basis at a specific type of building. Therefore, it is important to engineer features that can represent these behaviors. This phase extracts relevant information at a week level based on various behaviors labelled semantically with the help of manually annotated or retrieved data (e.g., building categories, group meeting/lecture schedules). We segregate features by "individual" and "group" in order to capture different behavioral signals. The former is meant to characterize individual behaviors which are not explicitly associated with social interactions. The latter captures the behaviors of individuals that are oriented towards their group, such as time spent collocated with other group members. The dissociation between these features is meant to distinguish the explanatory power of the group behaviors from individual ones. This helps provide discriminant validity that coarse collocation-based features indeed capture social interactions and are not confounded by an individual's general behavior, such as the time spent at academic spaces. Table 4 summarizes the different features we extracted at a week level. We derive the individual features based on the lecture schedule and semantic labels for buildings. To craft the group features, we use the same information but compute them as both an absolute duration and a relative percentage. The former denotes how much time a student spent collocated with their group (at least one other member). The latter describes this behavior relative to the total time spent by that group together, to express what portion of that time a student participated in.
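The absolute and relative group features described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the session record layout, the function name `group_collocation_features`, and the toy data are all hypothetical, and it assumes that "total time spent by that group together" means the summed duration of sessions in which at least two group members are collocated.

```python
from typing import FrozenSet, List, Tuple

# Hypothetical session record: (start_hour, end_hour, group members collocated).
Session = Tuple[float, float, FrozenSet[str]]


def group_collocation_features(sessions: List[Session], student: str) -> Tuple[float, float]:
    """Return (absolute hours, relative share) of one student's group collocation."""
    # Assumption: only sessions with at least two group members count as
    # group collocation.
    group_sessions = [s for s in sessions if len(s[2]) >= 2]
    group_total = sum(end - start for start, end, _ in group_sessions)
    absolute = sum(
        end - start for start, end, members in group_sessions if student in members
    )
    relative = absolute / group_total if group_total > 0 else 0.0
    return absolute, relative


# Toy week of sessions for a three-person group (illustrative values only).
week = [
    (10.0, 12.0, frozenset({"a", "b", "c"})),
    (14.0, 15.0, frozenset({"b", "c"})),
    (16.0, 17.0, frozenset({"a"})),  # alone, so not group collocation
]

print(group_collocation_features(week, "a"))
print(group_collocation_features(week, "b"))
```

In this toy week, student "a" takes part in 2 of the group's 3 collocated hours, while student "b" takes part in all of them. The relative feature therefore separates two students who might otherwise look similar on absolute time alone.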
In comparison to the individual features, the group features are crafted to consider when the behavior occurred:
(1) Scheduled: Groups reported their regular meetings in a free-form response field during enrollment (Section 4.1.1). The meeting locations reported were at a building resolution and respondents typically indicated a primary building (e.g., learning commons) along with a potential backup (e.g., library). However, teams also expressed that meetings could take place at undetermined locations on campus. Moreover, groups often provided multiple potential meeting times and places for a week. To accommodate all possibilities, this feature captures the collocations between group members that occurred during any of the reported periods.
(2) Class: This segregates collocations with group members during class times. It differs from the attendance feature by considering periods of collocation even outside the assigned lecture room. For instance, in the case of Part 2 sections, the students were expected to meet among themselves during class time. Based on student reports, teams did not necessarily use all class times in a week for meetings. This feature represents this set of behaviors.
(3) Other: This is a catch-all bucket to capture all other ad-hoc collocations. Only 4 groups in our study reported interacting with group members for non-academic reasons (e.g., "lived together"). Therefore, this category can be considered to indicate impromptu interactions motivated by course milestones or other classes in which group members are together.

Processing. This phase processes the raw week-level features into aggregate features that describe collocation behavior. All the raw features we extract (Table 4) from the data are computed at a week level for 14 weeks: 5 × 14 for individual features and (9 × 2) × 14 for group features. This leads to a rather large feature space given that the target variable is the final score obtained at the end of the semester.
Therefore, in order to reduce the feature space we calculate summary features to describe the entire semester of the individual. Specifically, for each feature extracted at a week level, we compute the median, the mean and the standard deviation for the study period. These statistics quantitatively depict the distribution of that feature throughout the study period. In addition to these, we also compute the approximate entropy of the feature per individual [48]. This statistic is a measure of the regularity of that feature for every individual. With four summary statistics per feature, this reduces the overall feature count to 20 and 72 for individual and group features respectively.

To predict the academic scores, we build multiple models to investigate how the collocation-based features can predict the final scores in comparison to survey-based peer evaluation scores. Since the final score is a continuous numeric value, we estimate it using regression. Social interactions are known to be related to performance within teams [19, 20]. This motivates our analysis to demonstrate the relationship between performance and collocation behaviors to explain the extent to which coarse collocation reflects meaningful signals of social interaction (RQ2). M_PE denotes the model trained on peer-evaluation scores (Section 4.1.2) based on the self-reported survey responses provided by the instructors. This model illustrates the efficacy of peer-evaluation reports in explaining performance and serves as a benchmark because these constructs have been validated to be associated with performance [15, 28, 37, 65]. M_iWF refers to the model trained on individual features and is therefore independent of the participant's group. M_gWF describes the model trained only on group features based on coarse collocation and is therefore potentially representative of social interactions.
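The semester-level summarization described above (median, mean, standard deviation, and approximate entropy per week-level feature) can be sketched as follows. This is an illustrative sketch only: the paper does not state its approximate-entropy parameters, so the embedding dimension `m = 2`, the tolerance `r = 0.2 × standard deviation`, and the use of the population standard deviation are assumptions, as are the function names.

```python
import math
import statistics
from typing import Dict, List


def approximate_entropy(series: List[float], m: int = 2, r_factor: float = 0.2) -> float:
    """Approximate entropy with tolerance r = r_factor * population std (assumed)."""
    n = len(series)
    r = r_factor * statistics.pstdev(series)

    def phi(dim: int) -> float:
        templates = [series[i:i + dim] for i in range(n - dim + 1)]
        total = 0.0
        for a in templates:
            # Count templates within Chebyshev distance r (self-match included).
            matches = sum(
                1 for b in templates
                if max(abs(x - y) for x, y in zip(a, b)) <= r
            )
            total += math.log(matches / len(templates))
        return total / len(templates)

    return phi(m) - phi(m + 1)


def summarize(weekly: List[float]) -> Dict[str, float]:
    """Collapse one week-level feature (e.g., 14 weekly values) into four summaries."""
    return {
        "median": statistics.median(weekly),
        "mean": statistics.fmean(weekly),
        "std": statistics.pstdev(weekly),
        "apen": approximate_entropy(weekly),
    }


# Feature counts implied by four summaries per week-level feature.
N_STATS = 4
individual_count = 5 * N_STATS        # 5 individual features
group_count = (9 * 2) * N_STATS       # 9 group features, absolute and relative

print(summarize([2.0] * 14))
print(individual_count, group_count)
```

A perfectly regular series (the same value every week) has an approximate entropy of zero, whereas irregular week-to-week behavior yields larger values. The final two constants reproduce the feature counts of 20 and 72 stated in the text.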
By restricting each model to a specific subset of features (individual or group), it is possible to assess the discriminant validity in predicting final course scores with each subset without confounding interaction effects from other features. We evaluate all models through a 5-fold cross-validation process. To estimate the target variable (the final score), for each model described, we train with different estimators to account for variations in the data. Particularly, we train a Linear Regressor [56] to represent linear relationships between variables and a Decision Tree Regressor [53] for non-linear relationships. Additionally, we also train a Random Forest Regressor [36], i.e., an ensemble method and thus a more sophisticated learner. First, we compare these estimators on the basis of the RMSE (Root Mean Square Error) [7] of the predictions. Then, to determine the predictive utility of the models, we measure the correlation between the predicted values and the actual values. For internal validation we compare these models to a rudimentary baseline M_0, which always predicts the median of the target variable from the training set.

The transformations are needed to solve problems with missing data and to scale the features to comparable units. Both the transformations and selections take place within each fold and therefore we perform all fitting only with the training data of that fold:
(1) Scaling Final Scores by Instructor - The target variable the models are trying to predict is the final score for the course. Since the final score varies based on the instructor, we standardize the final scores based on the distribution of scores for each instructor in the training data.
(2) Impute Missing Data - For a few individuals certain features might have missing values. In the case of peer-evaluations this could be because the student did not complete a particular survey instrument.
For some of the group features, a project team did not report their scheduled meeting times (in total seven students). We impute these missing values with the mean.
(3) Standardize the Features - We convert all features to zero mean and unit variance [33].
(4) Mutual Information Regression - Lastly, we employ a univariate feature selection method on the basis of mutual information between the training features and the target variable [32]. We vary the number of selected features from 1 to the total number of features in the model and select the number k that minimizes the RMSE. The choice of k is illustrated in Figure 9.

To re-emphasize, these results will show whether the collocation behaviors of students can predict their group-based performance. Since team performance is linked to social interactions in physically collocated teams [45], these results can delineate whether patterns in coarse collocation can infer social interactions. This section describes the prediction results of the various models (described in Section 4.3.1) to assess the explanatory power of the coarse collocation features. Table 5 summarizes the results of the predictions with the best estimator for each model. The RMSE of M_0, the arbitrary regressor, establishes a baseline to determine the goodness of the models we analyze. Any features that do not reduce the RMSE in comparison are not noticeably better than predicting the median of the training distribution. To compare the information in the features, we select only the best estimator algorithm (based on reduced RMSE) for each set of features. The RMSE of M_0 was 1.075, which can be interpreted as 1.075 standard deviations away from the true value. The smallest RMSE of M_PE was 1.085 and that for M_iWF was 1.068. For both of these input features, Linear Regression was the best estimator.
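The fold-internal preprocessing and the median baseline M_0 described above can be sketched in plain Python. This is a minimal sketch under stated assumptions, not the authors' pipeline: the function names are hypothetical, folds are assigned by a simple interleaving rather than random shuffling, and the per-instructor score scaling and mutual-information selection steps are omitted for brevity.

```python
import math
import statistics
from typing import Iterator, List, Optional, Tuple


def rmse(y_true: List[float], y_pred: List[float]) -> float:
    """Root mean square error between actual and predicted values."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


def fit_transform_fold(train_col: List[Optional[float]],
                       test_col: List[Optional[float]]) -> Tuple[List[float], List[float]]:
    """Mean-impute and standardize one feature using training-fold statistics only."""
    mean = statistics.fmean(v for v in train_col if v is not None)
    filled = [mean if v is None else v for v in train_col]
    std = statistics.pstdev(filled) or 1.0  # guard against constant features

    def scale(col: List[Optional[float]]) -> List[float]:
        return [((mean if v is None else v) - mean) / std for v in col]

    return scale(train_col), scale(test_col)


def kfold_indices(n: int, k: int = 5) -> Iterator[Tuple[List[int], List[int]]]:
    """Yield (train, test) index lists; interleaved folds are an assumption."""
    folds = [list(range(i, n, k)) for i in range(k)]
    for i, test_idx in enumerate(folds):
        train_idx = [j for f in folds[:i] + folds[i + 1:] for j in f]
        yield train_idx, test_idx


def median_baseline_rmse(y: List[float], k: int = 5) -> float:
    """Cross-validated RMSE of a baseline that predicts the training-fold median."""
    errors = []
    for train_idx, test_idx in kfold_indices(len(y), k):
        med = statistics.median(y[j] for j in train_idx)
        errors.append(rmse([y[j] for j in test_idx], [med] * len(test_idx)))
    return statistics.fmean(errors)


train, test = fit_transform_fold([1.0, None, 3.0], [2.0, None])
print(train, test)
```

Because the imputation mean and the scaling parameters are derived from the training fold alone, no information from the held-out fold leaks into the fitted transformations. Any learned model would then be compared against `median_baseline_rmse` on the same folds.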
Therefore, modeling the peer-evaluation features shows no improvement in error reduction, and the predicted results have an insignificant correlation with the actual scores. In contrast, M_iWF produces a relatively better, yet still weak, correlation of 0.12. This implies that merely processing the individual dwelling behaviors is more informative than peer-evaluation responses for predicting scores in a group-project intensive course. In comparison, the best RMSE for M_gWF is 0.748, which uses Random Forest (a 30% improvement). Moreover, the predicted values exhibit a correlation of 0.759 with the true values. Figure 10 shows the correlation between the predicted and actual final scores.

The results show that the model trained on students' collocation behaviors (M_gWF) outperforms the models trained on peer-evaluation responses and individual behaviors in predicting final scores. First, we find the collocation-based behaviors are substantially better predictors than peer-evaluation scores (M_PE). While peer-evaluation scores are expected to yield better correlations [15, 28, 37, 65], the social desirability bias in manually reporting team experiences can wash out the intricacies of actual team behavior [1, 34]. These surveys expect the participants to subjectively interpret and then transform their social interactions into scores. But these students are also aware that these scores might affect the instructor's impression of their team members and possibly their score. In contrast, M_gWF incorporates multiple characteristics of the collocation behavior within groups over multiple weeks. These features are devoid of the subjective biases that plague self-report and other manual assessments of social interaction. Second, we find that M_gWF performs better than a model built on individual behaviors (M_iWF). Note that M_iWF was also found to be a better model than the peer-evaluations.
This already implies that dynamic offline behaviors carry explanatory power to determine academic performance. However, given the collaboration-based nature of the course in determining the final score of an individual, M_iWF falls short of M_gWF. For instance, the individual behaviors of attendance or dwelling in academic spaces were not comparable in explaining the final score. Therefore, this result indicates that features in M_gWF reflect more than dwelling around an AP. Arguably, in a snapshot of time, individuals could be collocated and not socially interact with each other [52]. However, observing these behaviors over a period of time might reveal whether certain individuals are mutually oriented when in the same physical space [45, 52]. The fact that the collocation-based model (M_gWF) predicts the final score better than the dwelling-only model (M_iWF) provides evidence that this data can indicate social interaction.

Finally, to further dissect the model and understand how the collocation-based features explain the final score, we evaluate the feature importance of the selected variables. Table 6 shows the top five features in the best model, M_gWF with Random Forest. It is notable that four of these capture relative behaviors (e.g., percentage of time students were present in group meetings). Another noticeable aspect is that three of these features are based on the variance in collocations. These features essentially describe the consistency in collocation patterns (e.g., being collocated with the same group every week for a fixed period of time). However, given the importance scores, it is evident that no single collocation behavior enables a strong prediction of performance; rather, it is the analysis of a set of them together. We demonstrate that collocation behaviors of group members predict performance better than individual behaviors and peer-evaluations.
This exhibits that our method of determining collocation can be viably used to infer social interactions. The previous two sections demonstrate that infrastructure-based coarse collocation captures some important aspects of social behaviors in students, and that these measures of behavior form a good predictor for group project success. This presents new opportunities to harness archival network logs to scale analyses of social behaviors for larger groups, up to and including an entire campus. For instance, with an empirical understanding of how successful teams socially interact, instructors can tailor recommendations for project courses. This section illustrates other potential use-cases based on insights gathered from the case study. Social interactions are crucial to understand the physical and mental wellbeing of an individual. The lack of social interaction, or social isolation, is linked to stress, negative affect, depression and dissatisfaction [19, 20, 57] . The presence of social interactions can uncover ties within a peer group, which can be helpful to understand individual perception and behavior. For instance, weak social ties are related to a lack of motivation to go to work or absenteeism [64] . Alternatively, strong social ties can also explain peer influence and its effect on alcohol, drug and tobacco use [9] . As a result, retrospective investigations of who a student interacts with and how these interactions evolve over a period of time can provide insights for supporting student wellbeing. Using the data from Section 4, we develop network graphs representing the collocation between pairs of individuals in the same section, shown in Figure 11 . We compute the graphs week-wise (excluding lecture times), with each edge depicting the duration of collocation between any pair of individuals. Figure 11 shows the evolution of these interaction ties between participants from Section 1B. 
Every node represents a participant and the color denotes project groups. Nodes that are closer illustrate individuals who spent more time collocated together. Even in the first week of class, the participants are somewhat closer to their group members than others, and this becomes more pronounced in the later weeks. However, since we consider all collocations it is evident that a participant's ties are not exclusive to their group. This makes automated methods even more valuable for identifying these informal interactions. In some cases, members of different groups might be connected through other courses or even social circles. Similarly, Figure 12 represents interaction ties in Section 2D aggregated over the whole study period. It reveals the tight connection amongst some members of the red group. This particularly refers to participants 2006, 2009, and 2010. In comparison, two other group members are slightly further apart. This is explained by the disclosure during enrollment that these participants live together. This application is similar to what Hong et al. describe as relationship maps to understand intimacy between individuals [25]. Using raw association logs for an entire campus presents opportunities to understand where these interactions occur and to semantically label them based on the expected purpose of collocation (e.g., residential ties for roommates, academic ties for project teams, or recreational ties for parties). Note that our sample is limited to the 215 individuals that consented to the use of their network logs. More nuanced automated learning techniques can help identify social groups in a completely unsupervised way, making a compelling argument for the application to a broader cohort on campus. Accordingly, stakeholders can take actions for a cohort of individuals based on noticeable anomalies in a network, such as when someone from a cohort drops out.
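The week-wise collocation graphs discussed in this section, where each edge weight is the collocation duration between a pair of individuals, can be sketched as follows. This is a simplified illustration rather than the pipeline used in the paper: the record layout `(week, ap_id, person, start_minute, end_minute)` and the function name are hypothetical, collocation is assumed to mean overlapping stays at the same AP, and the exclusion of lecture times is not handled.

```python
from collections import defaultdict
from itertools import combinations
from typing import Dict, FrozenSet, List, Tuple

# Hypothetical record: (week, ap_id, person, start_minute, end_minute).
Visit = Tuple[int, str, str, float, float]


def weekly_collocation_edges(visits: List[Visit]) -> Dict[int, Dict[FrozenSet[str], float]]:
    """Map each week to {pair of people: total minutes of overlapping stays}."""
    by_week_ap = defaultdict(list)
    for week, ap, person, start, end in visits:
        by_week_ap[(week, ap)].append((person, start, end))

    graphs = defaultdict(lambda: defaultdict(float))
    for (week, _ap), stays in by_week_ap.items():
        for (p1, s1, e1), (p2, s2, e2) in combinations(stays, 2):
            if p1 == p2:
                continue  # two stays by the same person are not an edge
            overlap = min(e1, e2) - max(s1, s2)
            if overlap > 0:
                graphs[week][frozenset((p1, p2))] += overlap
    return {week: dict(edges) for week, edges in graphs.items()}


# Toy visits (illustrative values only).
visits = [
    (1, "ap1", "p1", 0.0, 60.0),
    (1, "ap1", "p2", 30.0, 90.0),
    (1, "ap2", "p3", 0.0, 60.0),
    (2, "ap1", "p1", 0.0, 20.0),
    (2, "ap1", "p3", 10.0, 20.0),
]
edges = weekly_collocation_edges(visits)
print(edges)
```

The resulting per-week edge weights could be passed to any graph layout routine in which heavier edges pull nodes closer together, which is the visual convention the figures rely on.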
Another use case is when someone in a network is affected by an ailment, whether mental (e.g., violent incident) or physical (e.g., contagious disease): the campus can react by securing the peers first.

Another approach to visually analyzing the segments identified by the pipeline is to observe where social interactions occur. This can be further expanded by locating the "collocation spaces", which are spots on campus where individuals gather together. Figure 13 shows how the collocation changes before, during, and after the midterm. While there is a slight increase in collocation period at academic spaces towards the end of the semester, it is most notable how the interactions in residential spaces change during the midterm. This elevated collocation could indicate groups working together to collaborate on midterm milestones or simply studying for exams. Another notable aspect is the decrease of collocation duration at fraternities and sororities. This might be related to fewer parties and gatherings during midterm exams. More importantly, this data provides evidence that social interactions vary over time at different places. By investigating how collocation varies around specific events, say an exam, it is possible to define the purpose of spaces more dynamically. Furthermore, this can be extended to gain a better understanding of the purpose of social interactions themselves. This would stem from knowledge of the space and community. Accordingly, administrators could focus on facilitating better interaction in these spaces during times of collaboration. By contrast, they could also choose to carefully regulate the use of these spaces during times of social distancing [40].

Social interactions in physical spaces can lead to congestion depending on the purpose and popularity of a space.
Understanding where these spaces are and when congestion can occur can help regulate gatherings for crowd management [17, 68] and even prevent the spread of contagious diseases [40]. Therefore, campus stakeholders have an interest in identifying rooms and pathways with high pedestrian concentration. Not only could this help pre-determine bottlenecks for flock movement in cases of emergencies, it can also retroactively indicate which spaces were prone to congestion and inadvertent physical contact. This subsection presents some observational evidence from our data showing how sections move in and out of their classroom around a lecture. As discussed in Section 3.4, one of the instructors (1Q) provided us with lecture-by-lecture attendance data (46 students in Sections 1A and 1B). For Section 1A, lectures took place three times a week from 10:10-11:00, and for Section 1B, these lectures took place from 11:15-12:05. The rooms where these lectures took place were both single entry/exit. To gain a better understanding of how congested these points can get, we evaluate the arrival and exit times of the students that attend class. For entering class, we compare the earliest starting time of an individual's collocation with the designated commencement time. Similarly for exiting class, we compare the latest ending time with the lecture's expected conclusion time. The median entering time is five minutes after the scheduled start and the median exiting time is two minutes after the scheduled end (Figure 14). These values align with the instructors' observation of typical class behavior, i.e., students are typically tardy on entry but are on time for the exit. Notably, our analysis considers only the students from the current class and not those who would occupy the same room for the next or previous lecture. However, with richer data, it is possible to understand congestion points not just for classrooms but wherever social interactions occur.
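The entry and exit comparison described above can be sketched as follows. This is a minimal sketch with hypothetical names and made-up toy data: times are expressed as minutes since midnight, each student is assumed to have one or more collocation spans for a given lecture, and the offsets are medians over students, as in the text.

```python
import statistics
from typing import Dict, List, Tuple

Span = Tuple[float, float]  # (start, end) in minutes since midnight


def entry_exit_offsets(collocations: Dict[str, List[Span]],
                       lecture_start: float,
                       lecture_end: float) -> Tuple[float, float]:
    """Median minutes past the scheduled start (entry) and scheduled end (exit)."""
    attended = [spans for spans in collocations.values() if spans]
    # Entry: earliest collocation start versus the designated commencement time.
    entry = [min(s for s, _ in spans) - lecture_start for spans in attended]
    # Exit: latest collocation end versus the expected conclusion time.
    exit_ = [max(e for _, e in spans) - lecture_end for spans in attended]
    return statistics.median(entry), statistics.median(exit_)


# Toy data for a 10:10-11:00 lecture (610 to 660 minutes); values are illustrative.
toy = {
    "a": [(613.0, 640.0), (641.0, 661.0)],
    "b": [(614.0, 662.0)],
    "c": [(618.0, 660.0)],
    "d": [],  # no collocation detected, so excluded
}
print(entry_exit_offsets(toy, 610.0, 660.0))
```

Positive offsets indicate arriving after the scheduled start or leaving after the scheduled end. Applying the same computation per room and per time slot would highlight when a single-entry doorway is likely to be busiest.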
With these findings, stakeholders could regulate the exit of students even between different rooms to avoid crowding the pathways. These can be proactive interventions to reduce the physical interactions. This work showcases the utility of association logs recorded by managed WiFi networks. These logs are archival data that can be easily scaled for every campus community member that is connected to it, if a mechanism to obtain broad informed consent can be devised. Beyond the case study presented in this paper, repurposing this data to infer social interaction based on collocation behaviors can inform the design of various applications for different stakeholders. In this section, drawing upon the threads of the other applications discussed above as well as others in the existing literature, we highlight some of the real-world scenarios where this technology can be implemented on the campus. 6.1.1 Academic Experiences. Harnessing data already collected at the infrastructure facilitates long-term analyses of social interactions in a large cohort of students. In Section 4.4, we show that modeling the collocation behavior of project group members can convincingly explain their final scores. In other words, signals reflecting social interaction, and its ramifications on academic outcomes, can be captured by understanding collocation. This enables instructors to provide data-driven insights to a new cohort based on actual behaviors of successful teams. However, student experiences are not limited to the classroom. Inspecting the collocation patterns inside a campus can help characterize the campus spaces in terms of their social purposes. In fact, it would be possible to trace the campus' evolving "social blueprint" and dynamically approximate the nature of social interactions based on when and where people were collocated (Section 5.2). This knowledge could be used to augment the static semantic labels of places. 
Moreover, it can help disentangle social relationships within a community. It is not uncommon for individuals to be collocated in the same space on campus. However, by studying prior collocations and their evolution, it is possible to elucidate whether two students were mutually oriented to each other's actions, and thereby socially interacting [52] (Section 5.1). For example, ties at the gym are different from ties through parties. Processing these logs provides researchers an opportunity to understand social interactions on multiple dimensions, which motivates new questions and applications. For instance, how do teams with prior ties work in comparison to teams of strangers [23], or how different are the social interactions in a new class for a student from a marginalized community [49]? Essentially, the network logs are a passive source of information to gather empirical insights into social support. And in a student's life, social support can explain performance [19, 20], drug use [9], and even dropping out [22, 49, 64].

6.1.2 Mental Wellbeing. The applications related to academic outcomes discussed earlier have implications for a student's mental wellbeing. However, this dedicated section describes applications that are agnostic of student success and instead focus on supporting students' mental wellbeing requirements. Abundant work in psychology and sociology expresses the importance of receiving social support, and this coping mechanism is fundamental to leading a healthy life. This data makes it possible to evaluate the changing social interactions over time, for both positive and negative outcomes. Major events on campus can impact social behaviors linked to mental wellbeing. This could either be a violent incident [5] (e.g., during a shooting) or an enforced lockdown (e.g., during a pandemic). In fact, the absence of interaction can be associated with social isolation, which in turn is related to stress, affect, and depression [19, 20, 57].
Although these kinds of analyses might be hard to justify in real-time, post-hoc analysis of these trends can provide insights to support positive trends or mitigate negative ones. For example, with the use of archival data, campus health facilities can incorporate the social interaction information for screenings. Ware et al. have already shown that similarly leveraging network logs to infer individual dwelling behaviors (e.g., duration, entropy and rhythms) can assist with depression screening [71]. In the same vein, social behaviors inferred by our method can be analyzed to predict community-scale mental wellbeing concerns and alert campus health facilities. This can help prepare responses following certain expected (e.g., exams) and unexpected (e.g., student death) events on campus.

6.1.3 Physical Health. Social interactions, and the lack thereof, are important behaviors in the context of contagious diseases, something that has become very clear in 2020 with the Coronavirus Disease (COVID-19) pandemic that is affecting people globally. Literature in epidemiology provides substantial evidence that social distancing helps reduce the spread of influenza [21] and coronaviruses [40]. Even though WiFi-based collocation is too coarse to determine physical contact at the spatial resolution of 6-10 feet, the behaviors that can be derived from it have applications for both reactive and proactive measures. In terms of the former, similar processing pipelines can aid contact tracing by automatically assessing the likelihood of individuals being at risk based on the amount of collocation they may have had with a known contagious set of individuals. Although this can produce false positives, it can still render a risk-based prioritization to help with screening during highly contagious outbreaks, such as the one experienced in Spring 2020, which we expect to recur in the coming years.
Using sophisticated interaction networks as we demonstrate in Section 5.1, campus health officials can look at historically-accumulated data to understand which students are associated with infected ones. This can be potentially extended to study multi-hop relationships with greater degrees of separation as well. Alternatively, similar data can be leveraged to develop and simulate proactive measures that assist campuses in resuming and continuing safe operations during a period of contagion. By modeling prior data based on congestion and pedestrian traffic (Section 5.3), it is possible to determine the specific bottlenecks on campus that should be regulated because of risk through both direct interaction and exposure to contact surfaces (e.g., door handles at exits) [6] . Even simpler solutions of applications that depict occupancy of spaces (Section 5.2) to students in real-time can help them adopt safer behaviors by avoiding interactions [18] . Policies such as instituting one-way walkways, assigned seating in classrooms, hybrid physical-remote class attendance policies to reduce student density in classrooms or other creative measures can be tested to see how much they impact risk of exposure to an individual and an entire campus community. The use of passive sensing technologies captured in the digital infrastructure of a campus can characterize human behavior and holds exciting potential because it can be automated and scaled. This mitigates the limitations of manual sensing such as self-reports of experiences or even requiring all individuals to install and consent to passive sensing on the personal devices. However, since this paper highlights the feasibility of appropriating data archived in existing systems, it also elicits new concerns when considering practical deployments. Any ubiquitous technology with the potential of large-scale passive sensing faces privacy concerns [46] . 
In the scope of our work, the privacy concerns can be related to both the data that is collected (coarse location) as well as what it can infer (interaction with peers) and its eventual implications [35]. From the perspective of data collected, the use of the WiFi association logs is more privacy preserving in comparison to installing an application on a client device that accumulates data to a central server. Such client-based applications can be perceived as invasive not only because such agents can collect sensitive data (possibly more than what the user is aware of) but also because the aggregation can be continuous and unbounded, e.g., a campus application logs locations even beyond the campus perimeter [59]. On the other hand, infrastructure-based localization is limited only to timestamps of network associations and does not elicit anxieties related to a client-side agent leaking data from other sensors. Moreover, these approaches are also localized to the campus. However, automatic computation of where individuals are and who they interact with can be considered sensitive by campus students [50]. Therefore, when adapting such approaches to infer interactions, stakeholders need to consider approaches like differential privacy to obfuscate sensitive data [13].

Related to the privacy concerns is establishing policy around data access. This paper and prior work showcase the utility of social interactions and how they can be inferred unobtrusively. However, this involves a centralized observer that harnesses location data, and even when anonymous, this can be used to trivially identify individuals [26]. A predator can incisively connect certain dwelling patterns if they choose to, e.g., lecture rooms can reveal a schedule and potentially an individual. To protect against this, more data can be abstracted, i.e., the AP locations can be anonymized as well (while still retaining category, floor and relative information).
Yet, it still needs to be established which people have the privileges to query for information and what the queries can be. In fact Bagdasaryan et al. have proposed a system for managing the privacy of ubiquitous computing systems that limits the use of the data itself [2] . Moreover, campuses can adapt existing policies regarding access to student records to protect data related to students' social interactions on campus. Finally, we also need to discuss the ethics of such inferences. Since the data relies on network association logs, any individual that connects to the network effectively opts-in their data for this analysis. Choosing to not connect can be considered an unfair choice that limits a student's right to self-determine [51] . Although students already connect to other networks, it is important to understand their rationale in order to weigh the cost and benefits of opting-out of an institute managed network. Moreover, even if a student makes that decision, the fact that they will be excluded from social interaction based insights could in itself be unethical. Even though, on any given day, 90% of the students in our sample were connected to the network, the students outside coverage will be missed in applications like contact tracing and social isolation. And this missing data can have ramifications for the entire community. Even for academic performance, if instructors can use this kind of data for intervening with certain groups during midterms, those that were left out lose the opportunity of improvement. Consequently, this raises concerns of fairness and accountability, which need to be considered before incorporating such systems. The most apparent limitation of using these association logs to determine social interaction is its low spatiotemporal resolution. This introduces reasonable uncertainty in determining the exact location of individuals [37, 71] . Even with a lack of precision, WiFi-based localization does have its advantages. 
It can be argued that such approaches (e.g., [25]) provide greater insight into indoor mobility and dwelling than other scalable solutions like GPS [17, 29, 68]. Yet, indoor setups present several challenges that can lead to unexpected device associations [31]. As a result, an individual could be in a room and not be associated with the physically closest AP, but rather with another AP node that has a stronger signal to the client. This creates an opportunity to deal with this noise by modeling the probability of displaced connections. Individual dwelling and collocation could be described as a probabilistic measure based on their pathway to the location. Other information that could help calibrate the modelling includes the size and configuration of rooms and neighbourhood maps of the APs. Furthermore, advanced off-the-shelf methods to study archival data can be developed to make AP nodes aware of other APs visible to a client. These additional pieces of information can still be very valuable without the need to install applications on user phones or fingerprint the entire campus.

Our work shows that collocation behavior over time can indicate social interactions, even if an instance of co-presence between two individuals does not guarantee face-to-face interactions. Theoretically, this falls in line with ideas of spatiality [45]: when collaborators are present near each other, they interact through observation and an increased sense of accountability. However, it is yet to be explored whether these notions will translate to other social relationships, and specifically whether the principles of spatiality can be used to identify social groups in an unsupervised way.

Social interactions can explain student experiences in terms of stress, motivation and performance. One way these interactions manifest on campus is when students are physically collocated.
This paper studied the feasibility of coarse collocation leveraged from WiFi network logs to describe social interactions. We established the reliability of computing collocation of students in class. Then we demonstrated how the collocation behaviors of project team members are related to their performance. Additionally, we outlined other opportunities to apply this kind of social interaction data to support the campus community. This paper motivates the use of existing infrastructure data, such as WiFi logs, to perform large-scale longitudinal analyses of social interactions on campus to inform applications for academic outcomes, mental wellbeing and physical health.

Invalidity of true experiments: Self-report pretest biases
Ancile: Enhancing Privacy for Ubiquitous Computing with Use-Based Privacy
The impact of the 'open' workspace on human collaboration
Prediction of Mood Instability with Passive Sensing
School shootings, the media, and public fear: Ingredients for a moral panic
Estimating the impact of school closure on influenza transmission from Sentinel data
Root mean square error (RMSE) or mean absolute error (MAE)? - Arguments against avoiding RMSE in the literature
Stress, social support, and the buffering hypothesis
From childhood to the later years: Pathways of human development
GroupSense: A Lightweight Framework for Group Identification
Birds of a Feather Clock Together: A Study of Person-Organization Fit Through Latent Activity Routines
Unobtrusive Assessment of Students' Emotional Engagement During Lectures Using Electrodermal Activity Sensors
The algorithmic foundations of differential privacy
Reality mining: sensing complex social systems
Psychological safety and learning behavior in work teams
Presence analytics: making sense of human social presence within a learning environment
CityMomentum: an online approach for crowd behavior prediction at a citywide level
Adaptive human behavior in epidemiological models
The factor structure of received social support: Dimensionality and the prediction of depression and life satisfaction
The relationship of self-actualization to social support, life stress, and adjustment
Targeted social distancing designs for pandemic influenza
Influences of self-beliefs, social support, and comfort in the university environment on the academic nonpersistence decisions of American Indian undergraduates
Prior ties and the limits of peer effects on startup team performance
Social behavior: Its elementary forms
Socialprobe: Understanding social interaction through passive wifi monitoring
Decentralized Privacy-Preserving Proximity Tracing
Classifying social actions with a single accelerometer
The dynamic nature of conflict: A longitudinal study of intragroup conflict and group performance
Participatory sensing: Crowdsourcing data from mobile smartphones in urban spaces
Indoor positioning using GPS revisited
Challenges for social sensing using WiFi signals
Estimating mutual information
Advanced engineering mathematics
Determinants of social desirability bias in sensitive surveys: a literature review
Privacy in ubiquitous computing
Classification and regression by randomForest
Development of a theory-based assessment of team member effectiveness
From context awareness to socially aware computing
ENERNET: Studying the dynamic relationship between building occupancy and energy consumption
Effectiveness of social distancing strategies for protecting a community from a pandemic with a data driven contact network based on census and real-world mobility data
Analyzing the longitudinal impact of proximity, location, and personality on smartphone usage
Accuware Wi-Fi Location Monitor
Extraction of latent patterns and contexts from social honest signals using hierarchical Dirichlet processes
Sensible organizations: Technology and methodology for automatically measuring organizational behavior
Distance matters
Harnessing smartphone-based digital phenotyping to enhance behavioral and mental health
Location-Based Services 4.1 Design Guide White Paper
A regularity statistic for medical data analysis
Correlates of school dropout and absenteeism among adolescent girls from marginalized community in north Karnataka, south India
Student perspectives on digital phenotyping: The acceptability of using smartphone data to assess mental health
Der Wert des Privaten
Understanding conflict and war
A survey of decision tree classifier methodology
Inferring mood instability on social media by leveraging ecological momentary assessments
Modeling dynamic identities and uncertainty in social interactions: Bayesian affect control theory
Linear regression analysis
Life stress in various domains and perceived effectiveness of social support
A walk on the client side: Monitoring enterprise wifi networks using smartphone channel scans
Four billion little brothers? Privacy, mobile phones, and ubiquitous data collection
The relationship between social support and subjective well-being across age
SNMP, SNMPv2, SNMPv3, and RMON 1 and 2
Working-relationship detection from fitbit sensor data
Social relationships and health: A flashpoint for health policy
Social support at work and its relationship to absenteeism
Patterns of interdependence in work teams: A two-level investigation of the relations with job and team satisfaction
Social embeddedness and job performance of tenured and non-tenured professionals
Location in ubiquitous computing
CrowdWatch: Pedestrian safety assistance with mobile crowd sensing
StudentLife: assessing mental health, academic performance and behavioral trends of college students using smartphones
SmartGPA: how smartphones can assess and predict academic performance of college students
Large-scale automatic depression screening using meta-data from wifi infrastructure
StressMon: Scalable Detection of Perceived Stress and Depression Using Passive Sensing of Changes in Work Routines and Group Interactions