key: cord-0028574-c28pct9r authors: Nulty, Alison K.; Chen, Elizabeth; Thompson, Amanda L. title: The Ava bracelet for collection of fertility and pregnancy data in free-living conditions: An exploratory validity and acceptability study date: 2022-03-11 journal: Digit Health DOI: 10.1177/20552076221084461 sha: 65b278bc3fbda3104ab38e46f3ddad94f9a4e4ec doc_id: 28574 cord_uid: c28pct9r

OBJECTIVE: To evaluate the validity and acceptability of the Ava bracelet for collecting heart rate, sleep, mood, and physical activity data among reproductive-aged women (pregnant and nonpregnant) under free-living conditions.
METHODS: Thirty-three participants wore the Ava bracelet on their non-dominant wrist and reported mood and physical activity in the Ava mobile application for seven nights. Criterion validity was determined by comparing the Ava bracelet heart rate and sleep duration measures to criterion measures from the Polar chest strap and ActiGraph GT3X+ accelerometer. Construct validity was determined by comparing self-report measures and the heart rate variability ratio collected in the Ava mobile application to previously validated measures. Acceptability was evaluated using the modified Acceptability of Health Apps among Adolescents Scale.
RESULTS: Mean absolute percentage error was 11.4% for heart rate and 8.5% for sleep duration. There was no meaningful difference among the Ava bracelet, ActiGraph, and construct measures of sleep quality. Compared to construct measures, Ava bracelet heart rate variability had a significant low negative correlation (r: −0.28), mood had a significant low positive correlation (r: 0.39), and self-reported physical activity had a significant low positive correlation with physical activity level (r: 0.56) and a significant moderate positive correlation with MET-minutes/week (r: 0.71). The acceptability of the Ava bracelet was high for fertility tracking and low for pregnancy tracking.
CONCLUSION: Preliminary evidence suggests the Ava bracelet and mobile application estimates of sleep and heart rate are not equivalent to criterion measures in free-living conditions. Further research is needed to establish its utility for collecting prospective, subjective data throughout periods of preconception and pregnancy.

The use of smartphones and wearable technology has allowed for increased monitoring of daily lifestyle behaviors such as sleep and physical activity. In 2019, approximately 25% of women in the United States reported regularly wearing a smartwatch or fitness tracker. 1 This regular use of wearable technology has started to enable health researchers to collect continuous, real-time data in a free-living environment. 2, 3 Previous studies have found wearables to be beneficial within health care settings, as well as in maternal and child health research. 4, 5 In a study assessing the feasibility of collecting data during pregnancy using the Garmin Vívosmart HR activity tracker, Grym et al. 5 found it was feasible for women to wear the device throughout their pregnancy. Other studies have used wearable devices, with their respective mobile apps, during pregnancy as part of interventions aiming to: (1) promote physical activity and adequate sleep during pregnancy 6, 7 or (2) collect longitudinal cohort data on physical activity at several time points throughout preconception and into pregnancy. 8 However, the majority of studies involving wearables during preconception and pregnancy have used the wearable as part of a health behavior intervention and have less often used it for research data collection. Thus, the literature on the ability of wearables to provide adequate and accurate observational data remains scarce. 
[9] [10] [11] [12] Due to the rapid physiological changes that occur during pregnancy, a collection of daily objective and subjective measures of maternal health could be important to understand the short-and long-term influences of pregnancy on both mother and infant. Wearables may provide a low burden tool to gather these data. The Ava bracelet, a newly developed Food and Drug Administration-registered fertility tracking wearable, has been marketed as a wearable that can accurately predict a woman's fertile window by measuring and analyzing nighttime skin temperature, resting pulse rate, heart rate variability (HRV) ratio, skin perfusion, breathing rate, movement, and sleep. 13 To predict an Ava bracelet wearer's fertile window, women are asked to wear the Ava bracelet only while they sleep and to sync the device with their Ava smartphone application (mobile app). The mobile app also allows women to self-report their mood, bodily symptoms, and a daily custom log to add any other pertinent information. The Ava mobile app also allows women to continue to track data during their pregnancy. This enables women to monitor their own health data and receive weekly updates on their developing fetuses. Additionally, pregnant women have the ability to report their weight, activity level, prenatal measurements (i.e. blood pressure, baby's heart rate, or fundal height), medical diagnoses (i.e. gestational diabetes, gestational hypertension, preeclampsia, etc.), and hydration status on the Ava mobile app. While the Ava company has sponsored and published research on the use and validity of their product, most of their studies have focused on its ability to predict a woman's fertile window, and the bracelet has not yet been assessed by an unaffiliated researcher. [13] [14] [15] This study investigated the validity and acceptability of the Ava bracelet for collection of observational data among pregnant and nonpregnant reproductive-aged women. 
The validity and acceptability of the Ava bracelet were assessed through a mixed-methods prospective observational study among reproductive-aged women over a study period of 12 weeks. Given the use of qualitative and quantitative data, we aimed to reach a sample size of 30 participants to satisfy the central limit theorem. 16 We chose to recruit both nonpregnant and pregnant women to assess the validity and acceptability among both groups of reproductive-aged women, knowing that health researchers are likely to use wearables, such as the Ava bracelet, to collect data before, during, and after pregnancy. Reproductive-aged (18 to 45 years) women were recruited through advertisements on Instagram, Facebook, and local online listservs (Appendix) by the first author. Women were eligible if they spoke English, had access to a smartphone, did not wear a pacemaker, and regularly slept 4 or more hours per night (specified by the Ava company for accurate data recording). Women were excluded if they had a pacemaker, as some of the sensors on the Ava bracelet could interfere with a pacemaker; or if they did not get at least 4 h of sleep per night. Recruitment took place between September and November 2020, and data collection ended in December 2020. Women who expressed interest in participating in the study were asked to fill out an online screener survey that included basic information such as race/ethnicity, marital status, current employment status, and pregnancy status. We had 272 women complete the screener survey and 35 were invited to enroll. Participants were chosen on a first-come-first-served basis with an emphasis on recruiting pregnant women. Women who were pregnant throughout the data collection period were considered pregnant, while women who were not pregnant during data collection were considered nonpregnant. All study participants provided informed consent prior to participating in this study. 
This study protocol was reviewed and approved by the Institutional Review Board of the University of North Carolina at Chapel Hill (IRB No. 20-1732). Self-reported sociodemographic data were collected through online questionnaires both before and after the data collection period. After recruitment and consent, each participant scheduled a time to meet the principal investigator (PI) to pick up their devices. During device pick-up, participants also received a paper and electronic copy of study instructions, with separate instructions for pregnant and nonpregnant women, to address the differences by pregnancy status within the Ava mobile app. All participants were instructed to download the Ava mobile app using the anonymized username and password provided by the research team. Participants were instructed to wear the Ava bracelet (Ava AG, Zurich, Switzerland) on their non-dominant wrist, the ActiGraph GT3X+ (ActiGraph Corporation, Pensacola, FL, USA) monitor on their dominant wrist, and a heart rate chest strap (Polar H10, Polar Electro, Kempele, Finland) for seven consecutive nights during sleep. Participants were also instructed to self-report their mood and physical activity under the designated tabs within the Ava mobile app. Each morning, participants were sent a secure survey link via a Short Message Service text message where they were asked to record the time they went to bed and the time they woke up. Following the seven nights of data collection, participants returned the devices to the PI and completed a set of online questionnaires to assess their past week's stress, physical activity, sleep quality, and acceptability of the Ava bracelet and mobile app. Lastly, all participants were interviewed via Zoom (Zoom Video Communications, San Jose, CA, USA). Before the interview began, subjects were informed all responses would be de-identified, kept confidential, and recorded if the participant agreed. 
After completion of the interview, each participant was emailed a $30 Amazon gift card. Objective measures of sleep and heart rate were obtained using the ActiGraph GT3X+, worn on the dominant wrist, and a heart rate chest strap. These data were collected in 60 s epochs continuously for seven nights during sleep. ActiGraph sleep and heart rate data were extracted using ActiLife software version 6.13.4 (ActiGraph Corporation, Pensacola, FL, USA). A combination of the Cole-Kripke and the Tudor-Locke algorithms was used to detect sleep and wake periods. Nightly sleep duration and nightly sleep efficiency (%) were obtained from the analysis. Objective measures of heart rate, sleep duration, sleep quality, and HRV ratio were obtained using the Ava bracelet worn on the non-dominant wrist. Subjective measures of daily mood and physical activity were entered into the mobile app by participants (Figures 1 and 2). These data were saved in the mobile app and extracted by the PI after the data collection period ended. Both the Ava bracelet and the ActiGraph collected mean heart rate (bpm) during sleep. Additionally, the Ava bracelet collected the HRV ratio. HRV refers to the fluctuation in the time between successive heartbeats, 17 and is considered to be a measure of the body's ability to adapt to physiological and environmental stressors. Adequate HRV is associated with health and self-regulation. 17 The Ava bracelet reports HRV as the ratio of low-frequency to high-frequency (LF/HF) power. 18 This measure is thought to estimate the balance between sympathetic and parasympathetic nervous system activity. 17 While the ActiGraph and heart rate chest strap did not collect the HRV ratio, they did collect R-R intervals (ms), the time between two successive R-waves of the QRS complex on an electrocardiogram. 
19 The R-R intervals enabled us to calculate SD1/ SD2, a measure of the unpredictability of the R-R intervals; where SD1 is equal to the root mean square of successive R-R intervals and SD2 equals SD1 plus the mean R-R interval during the time of data collection. 17 SD1/SD2 is correlated with the LF/HF ratio (HRV ratio). Nightly HRV ratio was measured using the Ava bracelet and nightly SD1/SD2 was calculated for each participant using the R-R interval data collected from the ActiGraph and heart rate chest strap. A lower HRV ratio through the Ava bracelet indicates a woman is less stressed, 18 while a lower SD1/SD2 ratio has been associated with increased stress or anxiety. 20 Therefore, we would expect to see the HRV ratio calculated through the Ava bracelet to be negatively correlated to the SD1/SD2 ratio. The objective sleep measures collected with the Ava bracelet include sleep duration, nighttime proportion of combined deep and rapid eye movement (REM) sleep, and nighttime proportion of light sleep (non-REM). Ava notes that light, deep, and REM sleep are distinguished via accelerometry, specifying that a healthy sleep pattern consists of a night that is 50% to 65% light sleep and 35% to 50% deep + REM sleep. 18 Similarly, to assess sleep quality, ActiGraph measures sleep efficiency which is the proportion of sleep periods that a person spends asleep (normal sleep efficiency >85%). 21 We expect ActiGraph's measure of normal sleep efficiency to be positively correlated to Ava's measure of a healthy sleep pattern. Subjective sleep quality during the week of data collection was assessed using the 19-item Pittsburgh Sleep Quality Index (PSQI) after all data had been collected. 22 The PSQI has been validated for use among the general population, 22 as well as pregnant women. 
23 The 19 items are combined to form 7 component scores (subjective sleep quality, sleep latency, sleep duration, habitual sleep efficiency, sleep disturbances, use of sleep medication, and daytime dysfunction). Each component score ranged from 0 ("no difficulty") to 3 ("severe difficulty"). The sum of the component scores was continuous and ranged from 0 to 16 with higher scores (referred to as global scores) indicating poorer sleep quality. A person with a global score >5 is considered to have poor sleep quality. These component measures showed adequate internal consistency (Cronbach's α > 0.68). Stress during the week of data collection was assessed using the Cohen 10-item perceived stress scale (PSS) after all data had been collected. 24 Response options for each item ranged from 0 ("never") to 4 ("very often"). Scores ranged from 0 to 40 with higher scores indicating higher stress. These measures showed excellent internal consistency (Cronbach's α >0.87). Given that the Ava mobile app allowed participants to report several moods other than "stressed" (Figure 1 ), our goal was to determine if the report of negative moods (e.g. disinterested, anxious, sad, stressed, irritable) was correlated with a higher score on the PSS scale. Previous literature has shown higher PSS scores are positively correlated with negative emotion, anxiety, and depression. 25 Therefore, we anticipated women who reported more frequent negative moods to have higher PSS scores. An average mood score based on the input in the Ava mobile app was calculated for each participant. For each day, each woman began with a score of 0. Then, based on her input, she would be assigned 1 point per each negative mood listed, −1 point for each positive mood listed (e.g. happy, content), and 0 points for each neutral mood listed (e.g. sensitive, tired, exhausted, or aroused). 
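The mood scoring rule above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' code (the analyses were run in Stata): the function names are hypothetical, and the mood sets contain only the examples named in the text, so any mood not listed in either set is treated as neutral here.

```python
# Example moods named in the text; the app's full mood list is not given,
# so these sets are incomplete and any unlisted mood is scored as neutral.
NEGATIVE_MOODS = {"disinterested", "anxious", "sad", "stressed", "irritable"}
POSITIVE_MOODS = {"happy", "content"}


def daily_mood_score(moods):
    """Score one day: +1 per negative mood, -1 per positive mood, 0 otherwise."""
    score = 0
    for mood in moods:
        mood = mood.strip().lower()
        if mood in NEGATIVE_MOODS:
            score += 1
        elif mood in POSITIVE_MOODS:
            score -= 1
    return score


def average_mood_score(days):
    """Average the daily scores over the days on which moods were reported.

    `days` holds one list of moods per reported day; returns None if the
    participant reported no moods at all.
    """
    if not days:
        return None
    return sum(daily_mood_score(day) for day in days) / len(days)
```

For example, a participant who logged "anxious" and "sad" on one day, "happy" on a second day, and "tired" on a third would have daily scores of 2, −1, and 0, for an average of 1/3.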
The participant's daily totals were added together to get a weekly mood score, with a higher score reflecting more frequent reports of negative moods. To calculate the average mood score, the total mood score was divided by the number of days of mood data collection (average of 5.24 days per participant). This average mood score was continuous and ranged from −1.5 to 2.75, with a higher score indicating an increased frequency of negative mood reporting.

Figure 1. Example of Ava mobile application collection of daily mood. Participants were asked to select all moods that applied.

Physical activity during the week of data collection was assessed using the International Physical Activity Questionnaire (IPAQ)-Short Form after all data had been collected. 26 This questionnaire has been validated to collect physical activity data among the general population, 26 and is recommended for pregnant women when more valid objective measures (i.e. accelerometry) are unavailable. 27 Because the Ava bracelet is only worn during sleep, we were only able to compare the self-report of physical activity within the Ava mobile app (Figure 2) to the results from the IPAQ that each participant completed at the end of data collection. The IPAQ asked participants to report the frequency (times per week), duration (minutes/day), and intensity (vigorous, moderate, or walking) of any physical activity they performed during the past 7 days. Continuous physical activity scores were calculated in units of metabolic equivalents (MET)-minutes per week and then categorized into physical activity level categories of "high," "moderate," or "low." 28 For walking, the assigned intensity is 3.3 METs, so MET minutes per week of walking = 3.3 × duration (minutes/day of walking) × frequency (times per week of walking). Total MET minutes per week refers to the sum of vigorous, moderate, and walking MET minutes per week. 
29 To assess categorical measures of physical activity, participants were scored as "high" physical activity level if they performed vigorous physical activity on 3 or more days, with 1500 total MET minutes in the last week or if they achieved at least 3000 total MET minutes over the last week. Participants were scored as "moderate" physical activity level if they performed vigorous physical activity and/or walking on 3 or more days for at least 30 min per day or if they performed moderate physical activity and/ or walking for at least 30 min per day on 5 or more days. They were also scored as "moderate" if they achieved at least 600 total MET minutes over the last week. Participants were scored as "low" physical activity level if they did not meet criteria for moderate or high physical activity. 29 After all data had been collected, the acceptability of the Ava bracelet and mobile app to monitor reproductive health measures was assessed using a 15-item questionnaire modified from the acceptability of health apps among adolescents (AHAA) scale. 30 AHAA is based on the seven domains within the theoretical framework of acceptability proposed by Sekhon et al. 31 including affective attitude, burden, perceived effectiveness, ethicality, intervention coherence, opportunity costs, and self-efficacy. In our adapted scale, we assessed affective attitude (i.e. how did participants feel about using the Ava bracelet and mobile app?), burden (i.e. how much time was required to use the Ava bracelet and mobile app), intervention coherence (i.e. did participants understand the use of the Ava bracelet and mobile app and how they worked?), ethicality (i.e. did participants find value in monitoring their reproductive health?), and self-efficacy (i.e. were participants confident they could use the Ava bracelet and mobile app?). 
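The IPAQ scoring rules described earlier in this section (MET minutes per week and the "high," "moderate," and "low" categories) can be sketched as follows. This is an illustrative sketch under stated assumptions, not the authors' scoring code. The walking value of 3.3 METs comes from the text; the values of 4.0 METs for moderate and 8.0 METs for vigorous activity are taken from the standard IPAQ scoring protocol and are assumed here. The `Activity` class and function names are hypothetical, and the "and/or" day-count criteria are simplified to checking each intensity on its own rather than combining days across intensities.

```python
from dataclasses import dataclass

# 3.3 is stated in the text; 4.0 and 8.0 are assumed from the IPAQ protocol.
MET_VALUES = {"walking": 3.3, "moderate": 4.0, "vigorous": 8.0}


@dataclass
class Activity:
    days: int        # days in the past week the activity was performed
    minutes: float   # minutes per day on those days


def met_minutes(kind, activity):
    """MET minutes per week = MET value x minutes/day x days/week."""
    return MET_VALUES[kind] * activity.minutes * activity.days


def ipaq_summary(walking, moderate, vigorous):
    """Return (total MET minutes per week, activity level) per the text's rules."""
    total = (met_minutes("walking", walking)
             + met_minutes("moderate", moderate)
             + met_minutes("vigorous", vigorous))

    def meets(activity, min_days):
        # Simplification: each intensity is checked separately.
        return activity.days >= min_days and activity.minutes >= 30

    if (vigorous.days >= 3 and total >= 1500) or total >= 3000:
        level = "high"
    elif (meets(vigorous, 3) or meets(walking, 3)
          or meets(moderate, 5) or meets(walking, 5)
          or total >= 600):
        level = "moderate"
    else:
        level = "low"
    return total, level
```

Under these assumptions, a participant who only walked 30 minutes on 5 days accumulates 495 MET minutes and is scored "moderate" through the day-count criterion rather than the 600 MET-minute threshold.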
Response options for each item ranged from 1 "strongly disagree" to 5 "strongly agree" on a 5-point Likert scale and items included "I am confident that I can use this app to monitor my reproductive health even if I'm not reminded to do it" and "I really enjoy using this app." Total acceptability scores were calculated by taking the mean of responses within each of the 5 subscales and then totaling the 5 means. The total acceptability scores were continuous, ranging from 8.75 to 25, with higher scores indicating higher levels of acceptability. These measures showed excellent internal consistency (Cronbach's α >0.94). In addition to the modified AHAA, participants were also asked an open-ended question: "If given the opportunity, would you use the Ava bracelet and mobile app while trying to conceive? While you were pregnant?" For this analysis, we defined validity as the ability of the Ava bracelet to accurately measure quantitative physiological data. 32 Validity of the Ava bracelet included criterion and construct validity. Criterion validity was determined by comparing the Ava bracelet heart rate and sleep duration measures to the criterion measures collected from the Polar chest strap and ActiGraph GT3X+ accelerometer (Tables 1 and 2). 33, 34 Construct validity was determined by comparing self-reported mood and physical activity, and the HRV ratio collected in the Ava mobile app, to previously validated measures of stress, physical activity, and HRV ratio. Valid Ava bracelet measures should be correlated with the construct measures. The definition of acceptability, adapted from the definition by Samkange-Zeeb et al., 35 was whether women would use the Ava bracelet and mobile app to collect nighttime physiological data, mood data, and physical activity data. All statistical analyses were performed with Stata software (version 16; StataCorp, College Station, TX, USA). Descriptive statistics were calculated for participant characteristics and acceptability measures. 
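The total acceptability score described above (the mean of the responses within each of the 5 subscales, then the sum of the 5 means) can be illustrated with a short Python sketch. The item-to-subscale mapping below is hypothetical: the text reports 15 items and 5 subscales but not which items belong to which subscale, so an even split of 3 items per subscale is assumed purely for illustration.

```python
from statistics import mean

# Hypothetical mapping of the 15 item positions to the 5 assessed subscales.
SUBSCALE_ITEMS = {
    "affective_attitude": [0, 1, 2],
    "burden": [3, 4, 5],
    "coherence": [6, 7, 8],
    "ethicality": [9, 10, 11],
    "self_efficacy": [12, 13, 14],
}


def acceptability_scores(responses):
    """Return (subscale means, total score) for 15 Likert responses (1 to 5).

    The total is the sum of the 5 subscale means, so it can range from 5 to 25.
    """
    if len(responses) != 15 or any(not 1 <= r <= 5 for r in responses):
        raise ValueError("expected 15 responses, each between 1 and 5")
    subscale_means = {
        name: mean(responses[i] for i in items)
        for name, items in SUBSCALE_ITEMS.items()
    }
    return subscale_means, sum(subscale_means.values())
```

Because each subscale mean lies between 1 and 5, a total of 25 requires "strongly agree" on every item, which matches the upper end of the range reported above.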
To estimate measurement error, mean absolute percentage error (MAPE) was calculated, with a smaller MAPE representing better accuracy. We interpreted a MAPE of <10% to indicate the Ava bracelet measure was equivalent to the Polar chest strap and ActiGraph measure. Pearson correlation coefficients were calculated and interpreted with the following ratings: <0.60 low, 0.60 to <0.75 moderate, 0.75 to <0.90 good, and ≥0.90 excellent. 36 Bland-Altman plots were created to visually assess bias in the collected data. Lastly, equivalence testing was used to determine whether the mean values of heart rate and sleep duration from the Ava bracelet fell within the equivalence zone from the criterion measures. The values for the equivalence zone were determined a priori based on prior validity studies assessing heart rate and sleep data collected through wearables. 37, 38 Equivalence (at the 5% significance level) was concluded if the 90% confidence interval of the Ava bracelet mean fell entirely within the equivalence zone, defined as ±10% of the criterion (ActiGraph) mean for sleep duration and ±5% of the criterion (Polar chest strap) mean for heart rate.

We collected data from 33 women. During the data collection period, 5 women were pregnant and 28 were nonpregnant. Most of the women in this sample identified as non-Hispanic White (70%) and over half of the women were married or in a domestic partnership (58%) (Table 3). Participants provided a total of 229 nights of data collection (mean: 6.94 nights per participant; range: 6-7 nights). The Ava bracelet underestimated heart rates compared to the ActiGraph (Table 4 and Figure 3). Equivalence testing for heart rate is shown in Figure 4. The Ava bracelet did not fall within the 5% equivalence zone and the MAPE was 11.4%. The mean bias (mean difference between ActiGraph and Ava bracelet measures) was 7.75 ± 3.93 bpm. 
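The accuracy statistics described above (MAPE, Bland-Altman bias with 95% limits of agreement, and the confidence-interval equivalence check) can be sketched in Python as follows. This is a simplified illustration, not the authors' Stata analysis: the function names are hypothetical, paired nightly values are treated as independent observations, and the 90% confidence interval uses a normal approximation rather than a t distribution.

```python
from math import sqrt
from statistics import NormalDist, mean, stdev


def mape(criterion, device):
    """Mean of |criterion - device| / criterion x 100 over paired values."""
    return mean(abs(c - d) / c for c, d in zip(criterion, device)) * 100


def bland_altman(criterion, device):
    """Return (bias, lower limit, upper limit); bias = mean(criterion - device).

    The 95% limits of agreement are bias +/- 1.96 x SD of the differences.
    """
    diffs = [c - d for c, d in zip(criterion, device)]
    bias, sd = mean(diffs), stdev(diffs)
    return bias, bias - 1.96 * sd, bias + 1.96 * sd


def within_equivalence_zone(criterion, device, zone_pct):
    """True if the 90% CI of the device mean lies within +/- zone_pct percent
    of the criterion mean (normal approximation, an assumption of this sketch).
    """
    z = NormalDist().inv_cdf(0.95)  # two-sided 90% interval
    device_mean = mean(device)
    half_width = z * stdev(device) / sqrt(len(device))
    criterion_mean = mean(criterion)
    lower = criterion_mean * (1 - zone_pct / 100)
    upper = criterion_mean * (1 + zone_pct / 100)
    return lower <= device_mean - half_width and device_mean + half_width <= upper
```

With toy heart rates of 60, 70, and 80 bpm from the criterion and 54, 63, and 72 bpm from the device, the MAPE is 10%, the bias is 7 bpm, and the device mean of 63 bpm falls outside a ±5% zone around the criterion mean of 70 bpm (66.5 to 73.5 bpm).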
The Ava bracelet heart rate measures were positively correlated with the ActiGraph heart rate measures (Pearson's r: 0.92; P < 0.01) as expected. The Ava bracelet underestimated sleep duration compared to the ActiGraph (Table 4 and Figure 5). Equivalence testing for sleep duration is shown in Figure 6. The Ava bracelet fell within the 10% equivalence zone and the MAPE was 8.5%. The mean bias (mean difference between ActiGraph and Ava bracelet measures) was 0.18 ± 0.88 h. The Ava bracelet sleep duration measures were positively correlated with the ActiGraph sleep duration measures (Pearson's r: 0.73; P < 0.01) as expected. The Ava bracelet HRV ratio had a significant low negative correlation with SD1/SD2 (r: −0.28; P < 0.01) as expected (Table 5). Using the Ava bracelet definition of a healthy sleep pattern, 49.8% of nights were considered to represent normal sleep quality, whereas, using the ActiGraph definition, 81.2% of the nights were considered normal (Table 5). There was no meaningful difference between the Ava bracelet and ActiGraph measures of sleep quality (P-value: 0.31) as expected. The PSQI identified 39.7% of participants with good sleep quality. There was no meaningful difference between the Ava bracelet and PSQI measures of sleep quality (P-value: 0.37) as expected. The Ava mobile app mood scores had a significant low positive correlation with PSS scores (r: 0.39; P-value: 0.03) as expected (Table 5). The Ava mobile app physical activity level had a significant low positive correlation with IPAQ physical activity level (r: 0.56; P < 0.01) and a significant moderate positive correlation with IPAQ MET minutes/week (r: 0.71; P < 0.01) (Table 5). Using the AHAA, participants reported high total acceptability scores (mean: 19.3 ± 4.2 out of a possible 25) with a range of 8.75 to 25. Among the acceptability subscales assessed (e.g. 
affective attitude, burden, coherence, ethicality, and self-efficacy), participants had the highest scores under ethicality (mean: 4.8 ± 0.5 out of a possible 5) and the lowest scores under coherence (mean: 2.8 ± 1.4 out of a possible 5) (Table 6). Ninety-four percent (n = 31) of participants reported that, if given the opportunity, they would use the Ava bracelet and mobile app prior to conception as a fertility tool, whereas only 52% (n = 17) of participants said they would use it during their pregnancy. This study explored Ava bracelet validity and acceptability in a cohort of women aged 18 to 44 years. It compared measures of heart rate, sleep, physical activity, and mood from the Ava bracelet and the corresponding mobile app to the criterion measures of the ActiGraph GT3X+ and Polar chest strap and to construct measures using validated questionnaires. Results indicated that, in free-living environments, the Ava bracelet may estimate sleep duration with a moderate level of accuracy, whereas its heart rate estimates (beats per minute) may show large errors, as the Ava bracelet tends to underestimate heart rate. The Ava bracelet and mobile app measures compared against construct measures (HRV ratio, sleep quality, physical activity, and mood) had low to moderate correlation with validated construct measures. These findings would be valuable to researchers and clinicians aiming to use the Ava bracelet to collect physiological and lifestyle measures among pregnant and nonpregnant women, outside of a clinical setting. The Ava bracelet heart rate measures (beats per minute) had an excellent correlation with the ActiGraph and Polar chest strap measures (r: 0.92), an unacceptable MAPE value (≥10%), and estimates that were outside of the equivalence region. This finding contrasts with a previous study conducted in a laboratory setting that concluded the Ava bracelet, in comparison to polysomnography, provided an accurate measure of heart rate during sleep (MAPE: 0.20%). 
39 Other validity studies have also found consumer-grade wearables, including the Apple Watch and Fitbit, were able to provide accurate measures of heart rate during sleep. [40] [41] [42] Given these findings, it does not appear that the Ava bracelet is suitable for research aiming to collect heart rate among pregnant or nonpregnant women in free-living conditions. Additionally, Ava bracelet estimates of HRV ratio had a low negative correlation with SD1/SD2. While the finding of a significant negative correlation aligns with our initial hypothesis, the low correlation may be due to the choice to use SD1/SD2 as the construct measure, as typically studies evaluate HRV ratio in comparison to the gold standard electrocardiogram. 43 Thus, while it appears the Ava bracelet's measure of the HRV ratio is correlated with the construct measure, further clinical testing is needed. We are unable to determine its clinical accuracy due to our limited study resources. The Ava bracelet sleep duration measures had a moderate correlation to the ActiGraph (r: 0.73), an acceptable MAPE value (<10%), and estimates that overlapped the equivalence region. These results are inconsistent with previous validity studies of consumer-grade wearables, as most have found that consumer-grade wearables generally overestimate sleep time for both adults and children. 36, 44, 45 Despite the acceptable MAPE value, one must also consider the clinical implications of these results. The limit of agreement indicated the Ava bracelet could underestimate sleep duration up to 1.54 h and overestimate up to 1.89 h. The American Academy of Sleep Medicine and Sleep Research Society recommends a healthy adult sleep 7 h per night for optimal health, noting that <7 h of sleep is associated with risk of obesity, diabetes, hypertension, heart disease, stroke, depression, and death. 46 Studies analyzing sleep and fertility on pregnancy outcomes have identified poor outcomes (i.e. 
reduced fecundability, preterm birth, gestational diabetes, and hypertension) with less than 7 h or over 9 h of sleep. [47] [48] [49] [50] Therefore, the Ava bracelet's lack of precision by almost 2 h, could lead to misclassification bias, influencing overall results on sleep, fertility, and pregnancy research. There were no meaningful differences between Ava bracelet estimates of sleep quality compared to the sleep quality estimates from the ActiGraph. Given that both the Ava bracelet and the ActiGraph use accelerometry data to discern sleep quality, 18 it appears that the ability to assess sleep time versus wake time is similar in both devices. The Ava bracelet sleep quality estimates were also similar to data collected using the construct measure, the PSQI. Previous studies have found PSQI measures are correlated with objective sleep data from the gold standard, polysomnography, 51 and from consumer-grade wearables such as the Garmin vívofit 3®. 52 Another study among college-age individuals found data from accelerometer-based wearable trackers to be similar to data collected from a subjective sleep diary. 37 Thus, researchers may be able to substitute the Ava bracelet for the PSQI to collect measures of sleep quality. The subjective measures of mood input by participants in the Ava mobile app had a significantly low correlation with data obtained from the PSS. The low correlation is likely due to the way we scored mood reported in the Ava mobile app. The calculated mood score was a reflection of all reported moods in the past week-not only stress. The PSS score is a retrospective reflection of the participant's stress level throughout the past week, whereas the Ava mobile app provides the participant's moods on the day each mood was experienced. 
While results indicate the mood function on the Ava mobile app yields a poor measure of stress, the app may be a useful way for women to report their daily mood, as 94% (n = 31) of participants reported at least one day of moods in their app. This could provide researchers with a better way to prospectively evaluate mood during the perinatal period, including preconception and postpartum, as women battling infertility during preconception and women in their postpartum period experience higher rates of depression and other mood disorders compared to the general population. 53, 54

The subjective measures of physical activity recorded in the Ava mobile app had a significant low to moderate correlation with data obtained from the IPAQ. The Ava mobile app physical activity data correlated more strongly with the estimated MET minutes per week than with the classification of physical activity level (i.e. low, moderate, or high). Similar to the PSS, the IPAQ provides a retrospective reflection of the participant's physical activity (i.e. vigorous, moderate, and walking) over the past week. For pregnant women, the Ava mobile app collects similar data, as there is a specific area for women to record the duration and intensity of the physical activity they performed each day. However, if a pregnant woman engages in both vigorous and moderate activity on the same day, she cannot denote that in the app, which allows only one physical activity intensity to be entered each day. Conversely, for nonpregnant women, there is no designated area of the Ava mobile app for recording physical activity. To collect this information, participants were instructed to enter it manually in their custom log. While this method yielded higher quality data, as participants were able to record several physical activity types, durations, and intensities for each day, it appears to have placed a heavy burden on participants, as 18% (n = 6) did not record any physical activity data in their mobile app during the study period. Based on our findings, the Ava mobile app would not be a preferred method of collecting physical activity data among nonpregnant women tracking fertility. The Ava mobile app may be a useful way to prospectively collect self-reported physical activity data among pregnant women. However, given the reporting limitations within the app, researchers aiming to use consumer-grade wearables to collect physical activity data should consider how collecting subjective physical activity data and the inability to report multiple physical activity intensities may impact their results.

Lastly, the acceptability of the Ava bracelet for fertility tracking was high (94%), while acceptability for pregnancy tracking was significantly lower (52%). This difference in acceptability between fertility tracking and pregnancy tracking may signal that women see utility in monitoring reproductive health data solely as a means to actively prevent or achieve pregnancy, making the need to track reproductive health data obsolete during pregnancy itself.

Table footnotes: (a) n represents the number of nights of recorded data. (b) MAPE represents mean absolute percent error: absolute value of (ActiGraph − Ava bracelet)/ActiGraph × 100. (c) 95% limits of agreement.

Figure 3. (A) The correlation between the heart rate measured via the ActiGraph and the heart rate measured via the Ava bracelet. (B) Bland-Altman plot illustrating the bias between the ActiGraph and the Ava bracelet heart rate measurements, plotted against the ActiGraph (gold standard) heart rate measurements.

Figure. Equivalence testing for heart rate. Shaded gray area indicates the proposed equivalence zone (±5% of the ActiGraph heart rate mean). Dark bar indicates the 90% confidence interval for the mean heart rate estimated by the Ava bracelet.
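The agreement statistics named in this section's table footnotes and figure captions (MAPE, 95% limits of agreement, and a ±5% equivalence zone compared against a 90% confidence interval) can be illustrated with a minimal Python sketch. It is not the authors' analysis code: the function names and example values are hypothetical, the Bland-Altman difference is taken as device minus criterion by assumption, and the confidence interval uses a normal approximation because this section does not state how the interval was computed.

```python
from math import sqrt
from statistics import NormalDist, mean, stdev


def mape(criterion, device):
    """Mean absolute percent error: mean of |criterion - device| / criterion * 100."""
    return mean(abs(c - d) / c * 100 for c, d in zip(criterion, device))


def bland_altman(criterion, device):
    """Return (bias, lower, upper): mean difference and 95% limits of agreement.

    Assumption: differences are computed as device minus criterion.
    """
    diffs = [d - c for c, d in zip(criterion, device)]
    bias = mean(diffs)
    half_width = 1.96 * stdev(diffs)
    return bias, bias - half_width, bias + half_width


def equivalent(criterion, device, zone=0.05, level=0.90):
    """True if the CI for the device mean lies inside +/- zone of the criterion mean.

    Assumption: normal-approximation CI (z about 1.645 for a 90% interval).
    """
    z = NormalDist().inv_cdf(0.5 + level / 2)
    device_mean = mean(device)
    std_err = stdev(device) / sqrt(len(device))
    ci_low, ci_high = device_mean - z * std_err, device_mean + z * std_err
    ref = mean(criterion)
    return ref * (1 - zone) <= ci_low and ci_high <= ref * (1 + zone)


# Hypothetical nightly mean heart rates (beats per minute), not study data.
criterion_hr = [60.0, 62.0, 58.0, 65.0, 61.0]
device_hr = [66.0, 70.0, 63.0, 71.0, 68.0]

print("MAPE (%):", round(mape(criterion_hr, device_hr), 1))
print("Bias and limits of agreement:", bland_altman(criterion_hr, device_hr))
print("Within equivalence zone:", equivalent(criterion_hr, device_hr))
```

A device can show a modest MAPE and still fail the equivalence check when its mean is shifted away from the criterion mean, which is why the two statistics are reported side by side.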
Among the six subscales of acceptability, [30] participants had high affective attitude and ethicality scores and low coherence scores. These results indicate that while participants cared about and supported ways to monitor their reproductive health, they lacked understanding of how to successfully use the Ava bracelet and the mobile app. Therefore, researchers aiming to use the Ava bracelet and mobile app in future studies should implement a thorough training session that introduces participants to both, allowing them to feel confident using the equipment before data collection begins.

To our knowledge, this is the first study to explore the validity and acceptability of a consumer-grade wearable among a sample of reproductive-aged women. Additionally, this is the first study to evaluate the validity of the sleep, physical activity, and mood measures obtained from the Ava bracelet and mobile app. This study had excellent compliance, with a mean of 6.9 nights of data collection per participant.

While this study had many strengths, there were also notable limitations. First, our small sample size of pregnant participants (n = 5) limited our statistical power to test differences between the pregnant and nonpregnant participants, so the results presented should be considered exploratory. Although we recruited more than 30 participants, we had hoped to recruit a sample composed of 50% pregnant and 50% nonpregnant women; we were unable to achieve that distribution (n = 5 pregnant; n = 28 nonpregnant). Given that recruitment took place in the fall of 2020, during the height of the COVID-19 pandemic, it is likely that pregnant women were not as willing to engage in a research study due to concerns over social distancing protocols and potential COVID-19 exposures. Second, the acceptability of the Ava bracelet was especially high for tracking fertility.
This finding may be influenced by our small sample size and the high prevalence of current wearable use within our study sample (n = 22, 67%). Therefore, the acceptability of the Ava bracelet within our sample may not be representative of the entire population of reproductive-aged women. Third, because we conducted this study in free-living conditions, we were unable to use gold standard measurements such as an electrocardiogram or polysomnography as our criterion and construct measures. For example, heart rate and HRV are most reliably assessed using electrocardiograms; yet due to resource limitations, we used the Polar chest strap and SD1/SD2 (a correlate of HRV). Thus, the conclusions that can be drawn from the data are limited. Additionally, wrist actigraphy may introduce misclassification bias in sleep data because it determines sleep duration based on lack of motion. 33 For example, if a participant was awake but motionless, their sleep duration would be overestimated. However, the limitations of wrist actigraphy are likely non-differential because both the Ava bracelet and the ActiGraph use accelerometry to measure sleep. Finally, the Ava bracelet can also collect data on skin temperature and respiratory rate and provide information on ovulation status. However, we did not evaluate any of those measures in this study and, as a result, are unable to comment on their validity.

The Ava bracelet and ActiGraph were comparable to one another in assessing sleep duration and quality, but not heart rate, in free-living conditions. However, it should be noted that a lack of precision within the sleep duration estimates may lead to misclassification bias in sleep research. Due to the increased potential for sleep duration misclassification and large errors in heart rate measurement when using the Ava bracelet, we recommend that researchers continue using the gold standard ActiGraph, while further clinical testing is needed to validate the HRV measures.
Self-reported measures of mood and physical activity collected in the Ava mobile app had a low to moderate correlation with construct measures, yet compliance with mood data input was high. We therefore recommend that researchers consider using a mobile app, such as Ava, to prospectively evaluate mood during the perinatal period. Acceptability of the Ava bracelet was high for tracking fertility and lower for tracking pregnancy. Given our small sample size and potential for selection bias, future research should aim to address this discrepancy through feasibility and usability studies.

AKN researched literature and conceived the study. EC and ALT were involved in protocol development and gaining ethical approval. EC was involved in participant recruitment. AKN analyzed the data and wrote the first draft of the manuscript. All authors reviewed and edited the manuscript and approved the final version.

Declaration of Conflicting Interests: The authors declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.

Informed Consent: Not applicable, because this article does not contain any studies with human or animal subjects.

Table footnotes: AHAA: acceptability of health apps among adolescents; SD: standard deviation. (a) Total acceptability scores were calculated using an adaptation of the AHAA scale [30] by taking the mean of responses within each of the five subscales (i.e. affective attitude, burden, coherence, ethicality, and self-efficacy) and then totaling the five means.
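The total acceptability score described in the table footnote above (the mean response within each of the five subscales, then the sum of those five means) can be sketched in a few lines of Python. This is an illustrative sketch only: the function name, the number of items per subscale, and the response values are hypothetical, and any reverse-scoring of items is not described in this section and is therefore not applied.

```python
from statistics import mean

# The five subscales named in the table footnote.
SUBSCALES = (
    "affective attitude",
    "burden",
    "coherence",
    "ethicality",
    "self-efficacy",
)


def total_acceptability(responses):
    """Sum of the per-subscale mean item responses for one participant."""
    missing = [name for name in SUBSCALES if name not in responses]
    if missing:
        raise ValueError(f"missing subscales: {missing}")
    return sum(mean(responses[name]) for name in SUBSCALES)


# Hypothetical item responses for one participant (not study data).
example = {
    "affective attitude": [4, 5],
    "burden": [2, 2],
    "coherence": [3, 3],
    "ethicality": [5, 5],
    "self-efficacy": [4, 4],
}

print(total_acceptability(example))  # 4.5 + 2 + 3 + 5 + 4 = 18.5
```

Averaging within each subscale before summing gives every subscale equal weight in the total, regardless of how many items it contains.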
About one-in-five Americans use a smart watch or fitness tracker
Sensors capabilities, performance, and use of consumer sleep technology
How consumer physical activity monitors could transform human physiology research
Use of wearable sensors for pregnancy health and environmental monitoring: Descriptive findings from the perspective of patients and providers
Feasibility of smart wristbands for continuous monitoring during pregnancy and one month after birth
Acceptability and feasibility of a sedentary behavior reduction program during pregnancy: A semi-experimental study
The use of wearable technology to objectively measure sleep quality and physical activity among pregnant women in urban Lima, Peru: A pilot feasibility study
A prospective cohort study to evaluate the impact of diet, exercise, and lifestyle on fertility: Design and baseline characteristics
mHealth physical activity intervention: A randomized pilot study in physically inactive pregnant women
A pedometer-guided physical activity intervention for obese pregnant women (the Fit MUM study): Randomized feasibility study
Effectiveness of activity trackers with and without incentives to increase physical activity (TRIPPA): A randomised controlled trial
Trajectories of objectively-measured physical activity and sedentary time over the course of pregnancy in women self-identified as inactive
Modern fertility awareness methods: Wrist wearables capture the changes in temperature associated with the menstrual cycle
Pulse rate measurement during sleep using wearable sensors, and its correlation with the menstrual cycle phases, a prospective observational study
Capturing the physiological characteristics of early pregnancy using wrist worn wearables. Poster presented at: Annual European Society of
Central limit theorem: The cornerstone of modern statistics
An overview of heart rate variability metrics and norms
Cardiovascular physiology. In: Principles and practice of sleep medicine
State anxiety and nonlinear dynamics of heart rate variability in students
Evaluation of actigraphy-measured sleep patterns among children with disabilities and associations with caregivers' educational attainment: Results from a cross-sectional study
The Pittsburgh sleep quality index: A new instrument for psychiatric practice and research
Construct validity and factor structure of the Pittsburgh sleep quality index among pregnant women in a Pacific-Northwest cohort
Review of the psychometric evidence of the perceived stress scale
Metacognition, perceived stress, and negative emotion
Validity of the international physical activity questionnaire short form (IPAQ-SF): A systematic review
Reliability and concurrent validity of the international physical activity questionnaire short form among pregnant women
Guidelines for data processing analysis of the International Physical Activity Questionnaire (IPAQ) - Short and long forms
Scoring the International Physical Activity Questionnaire (IPAQ)
Development and validation of a new scale to measure the acceptability of mobile health applications among adolescents. Dissertation. University of North Carolina at Chapel Hill
Acceptability of healthcare interventions: An overview of reviews and development of a theoretical framework
Validity and reliability in quantitative studies
Wrist actigraphy
Accuracy of commercially available heart rate monitors in athletes: A prospective study
Assessing the acceptability and usability of an internet-based intelligent health assistant developed for use among Turkish migrants: Results of a study conducted in Bremen
Review of validity and reliability of Garmin activity trackers
Comparison of wearable trackers' ability to estimate sleep
Validity of wrist-worn consumer products to measure heart rate and energy expenditure
Respiratory and cardiac monitoring at night using a wrist wearable optical system
Accuracy of consumer wearable heart rate measurement during an ecologically valid 24-hour period: Intraindividual validation study
Guidelines for wrist-worn consumer wearable assessment of heart rate in biobehavioral research
Validity of wrist-worn photoplethysmography devices to measure heart rate: A systematic review and meta-analysis
Can wearable devices accurately measure heart rate variability? A systematic review
Assessing the performance of a commercial multisensory sleep tracker
Systematic review of the validity and reliability of consumer-wearable activity trackers
Recommended amount of sleep for a healthy adult: A joint consensus statement of the American Academy of Sleep Medicine and Sleep Research Society
Associations of early pregnancy sleep duration with trimester-specific blood pressures and hypertensive disorders in pregnancy
Female and male sleep duration in association with the probability of conception in two representative populations of reproductive age in US and China
Association between maternal sleep duration and quality, and the risk of preterm birth: A systematic review and meta-analysis of observational studies
Impact of sleep duration during pregnancy on the risk of gestational diabetes in the Japan Environmental and Children's Study (JECS)
Arousals and macrostructure of sleep: Importance of NREM stage 2 reconsidered
Patient-generated health data collection using a wearable activity tracker in cancer patients - a feasibility study
The prevalence of depression symptoms among infertile women: A systematic review and meta-analysis
Vital signs: Postpartum depressive symptoms and provider discussions about perinatal depression - United States

Appendix. Social media advertisement used for study recruitment