Background & Summary

The Center for Disease Control (CDC) has deemed insufficient sleep to be a “public health epidemic” based on evidence that over one-third of American adults are not getting enough regular sleep1,2. More than 40 million Americans suffer from over 80 different sleep disorders, and another 20–30 million suffer from intermittent sleep problems3. Sleep accounts for one-quarter to one-third of our day and has a significant effect on health, well-being, and daytime performance4. Poor sleep comprising both short and long sleep periods is associated with impaired cognitive performance5. Short sleep is linked with seven of the top fifteen causes of death in the United States6, decreased workplace productivity7, and an estimated $680 billion in economic losses across five major countries8.

The diagnosis and treatment of sleep disorders has been an integral part of medical practice for the last several decades, but a more holistic approach focused on health and wellness has recently started to emerge. This integrated approach is known as “sleep health”, and it focuses on studying sleep in the general population9. Sleep health research is beginning to demonstrate that many facets of sleep are related to important health outcomes10.

Traditional sleep research methods utilize questionnaires and sleep diaries to collect subjective reports of sleep11. In recent years, electronic surveys have been employed with limited scope and reach. Digital health methods can be used to measure variability in sleep and subsequent effects on daytime functioning over prolonged periods of time12. They also allow researchers to reach and enroll significantly larger study cohorts at a lower cost than traditional research methods13,14. Few studies carried out to date have explored the utility of assessing sleep health in this manner15,16.

The SleepHealth Mobile App Study (SHMAS) was an observational study developed using Apple’s ResearchKit (RK) ( platform, which has also been used by other recent digital health studies17,18,19,20,21. The primary goal of the SHMAS was to explore the relationship between sleep habits and daytime functioning using a novel digital health approach. Secondary goals included investigation of participant characteristics, evaluation of data validity and examination of user retention patterns and data-sharing preferences.

The SHMAS was launched on the Apple App Store on March 2, 2016. The app store landing page was viewed 86,507 times and the application was downloaded 27,502 times. 12,356 participants provided consent to participate in the study through June 26, 2019 (Fig. 1). Study participants who agreed to share data broadly were enrolled in the study for an average of 935.4 days (SD = 355.1, range = 1–1212).

Fig. 1
figure 1

SHMAS Cohort Description. This figure shows the number of times the application was viewed and downloaded and the number of participants who made it all the way through to email verification, which indicated the start of the study. The figure then provides the number of participants who subsequently withdrew; of those who did not withdraw, the figure provides the number of participants who elected to share their data with qualified investigators and with only the study team.

Upon enrollment in the study, baseline demographic and survey data were collected. Participants were instructed to actively monitor their sleep and activity patterns for a minimum of five days in a row every quarter. Participants were encouraged to share Apple Health compatible application and wearable data. Participants were also asked to select whether to share their data with the SleepHealth team only (narrowly) or with the team and other qualified researchers (broadly). In order to complete the consent process, participants had to make an active choice selecting the scope of data sharing as no default choice was specified22. The data described here and shared with the research community come from participants who chose to share their data broadly (7,250/12,356; 59%). The cohort consisted of primarily white (78%) males (79%) with a mean age of 37 years (SD = 13, range = 18–87). 40% of the sample completed a 4-year college degree or higher. Baseline demographic characteristics did not appear to meaningfully differ for participants when grouped by sharing status (Table 1), although the comparison yielded statistically significant differences. These statistically significant differences are likely attributable to a large sample size. Large-scale research studies like the SHMAS are testing novel ways to gather real-world data that may contribute to our understanding of long-term disease trajectories and disease management. Early studies have demonstrated the utility of digital health methods for collecting real-world data as well as their potential to fundamentally transform biomedical research23,24. However, concerns about engaging participants to provide long-term real-world data as well as the validity of inferences drawn from said data remain an issue25. To help address some of these challenges, there is a critical need to share the methods employed and lessons learned from studies carried out in this manner with the greater scientific community while following responsible data sharing guidelines.

Table 1 Baseline Demographics.

We hope that by making SHMAS data available in accordance with FAIR principles26, we will provide a rich dataset to researchers interested in studying sleep health. These data capture various sleep health constructs that can help to improve our understanding of sleep habits, sleep patterns, and their relationship with daytime alertness and health. We also hope to raise awareness and interest in sleep research in general and pave the way for more refined sleep studies in the future.


Participant onboarding


The SHMAS was conducted remotely through an iPhone application. After installing the application, prospective participants were given a brief overview of the study and information about the study team (Fig. 2).

Fig. 2
figure 2

SHMAS Onboarding Process. This figure provides a graphical representation of the SHMAS onboarding process from first application use through enrollment and first login. It shows that there were several phases: screening, informed consent, registration and authentication.


Prospective participants completed an in-application, three-item screener to determine their eligibility. The study was open to anyone with an iPhone who could install the application, was 18 years of age or older, lived in the United States, and could read and write fluently in English. There were no other criteria for inclusion in the study.

Participant expectations

After successful completion of the eligibility screener, detailed information about the study was provided (Fig. 3). This included an explanation of how contributed data would be protected and used, the specific types of sensor-based and passive data that would be collected with permission, and the types of surveys and daily activities that prospective participants would be asked to complete.

Fig. 3
figure 3

Screenshots of Onboarding Process. Prospective participants were given an overview of how their data would be protected and used. They were briefed on the types of sensor-based and passive Apple Health data that would be collected with their permission and also given an explanation of the types of surveys and daily activities that they would be expected to complete during the course of the study.

Data sharing

Prospective participants were asked whether they wanted to share their study data for secondary research narrowly (With the SleepHealth team only) or broadly (with the SleepHealth team and other qualified researchers worldwide). No default sharing choice was selected. The study protocol and consent procedure were approved by the Western Institutional Review Board, Puyallup, WA (WIRB #20151042).

Understanding of study conditions

Potential participants were given contact information for the study team if they had any questions or concerns. They were then asked three questions about the information contained in the electronic informed consent process to ensure that they understood the terms of the study (Fig. 4). The three questions reinforced the following concepts: (1) if sleep quality got worse over the course of the study, they should contact their medical provider and not rely on the application; (2) that they could withdraw from the study at any time but the data that they contributed would not be deleted; and (3) that the application was a research study and not a commercial application. All three questions had to be answered correctly in order to proceed.

Fig. 4
figure 4

Screenshots of Informed Consent Test Questions. Prospective participants were asked three questions to ensure that they understood what their participation in research entailed: 1) If their sleep quality got worse during the study, they should stop relying on the application and see their doctor, 2) Once they started participating in the study, they were free to withdraw at any time but the data they contributed would not be deleted and 3) That the application was a research study and not a commercial application.

Prospective participants were shown an electronic copy of the consent form and asked to carefully review it. They were then asked to either click “Agree” or “Cancel.” If they clicked “Cancel,” they were not allowed to proceed any further and were not included in the study. If they clicked “Agree,” they were asked to confirm their decision by providing their first name, last name, and electronic signature.

Passive data

Participants were asked if they wanted to share passive data aggregated by Apple’s HealthKit framework27 with the study team. The default option was no sharing. Participants had the option to share as much or as little passive data as they wanted by either opting to share all categories of Healthkit data by selecting “Turn All Categories On” or by selecting individual data categories one by one.


To complete study registration, participants were asked to provide an email address, password, birthdate, and gender. They also selected a 4-digit passcode and chose permissions for receiving notifications. Participants had the ability to modify their notification permissions within the application at any time.

Account verification, consent form, withdrawal

Participants were required to verify their account via email and received a copy of their electronically signed consent form in PDF format. They could withdraw their consent at any time by selecting the “Leave Study” button from the study dashboard. Any data that participants provided up until the point of withdrawal were retained by the study team.

Active data collection

The study consisted of two kinds of active data collection: one-time quarterly surveys and daily activities which were intended to be completed for a minimum of 5 out of 7 days each quarter. Participants had the option to complete the daily activities as many times as they wanted.


The names of the surveys administered to participants in the study were: About Me (demographics); My Family (family medical history); My Health (general health and medical conditions); Research Interest (prior research experience/willingness to participate in future research studies); Sleep Habits (sleep duration and quality), and Sleep Assessment (sleep disorder symptoms and daytime functioning). The SHMAS utilized a combination of validated and adapted assessments that are commonly used in sleep research. Specific survey and questionnaire items are described in detail on the SleepHealth Public Researcher Portal.

Daily activities

The AM and PM Check-in activities contained items from the Consensus Sleep Diary (CSD)28 and queried participants about their previous night’s sleep and their current day’s activities. The Sleepiness Checker activity was a direct adaptation of the Karolinska Sleepiness Scale (KSS)29. It utilized a 9-point Likert scale to evaluate subjective sleepiness and was anchored by “extremely alert” to “extremely sleepy, fighting sleep.” (Fig. 5). The KSS is a valid measure of daytime alertness/sleepiness and is highly correlated with EEG and other relevant behavioral metrics. The Sleep Quality Checker used a single-item from the CSD which utilized a 5-point likert scale to rate the previous night’s level of sleep quality. It was anchored by “very poor” to “very good.” (Fig. 6). The Alertness Checker was an objective, sensor-based task that was included in the SHMAS to assess sustained attention, which has been shown to be impaired with sleep loss30. It was a smartphone adaptation of the 3-minute Psychomotor Vigilance Task (PVT-B)29,31. The PVT-B has been demonstrated to be sensitive to reduced alertness resulting from sleep loss as well as improvements in alertness following recovery sleep32,33. The PVT-B has also been utilized in situations where it is impractical to use the original 10-minute PVT34. Participants were instructed to use the thumb or index finger of their dominant hand to tap their screen as quickly and accurately as possible in response to a target stimulus (Fig. 7). The Nap Tracker was a simple tool designed by the SHMAS study team that allowed participants to measure the timing and duration of naps. Participants could tap the screen to start the nap tracker at the beginning of their nap and tap again to stop the nap tracker upon awakening. They were also able to rate the quality of their naps on a scale ranging from “poor” to “excellent.”

Fig. 5
figure 5

The Sleepiness Checker. The Sleepiness Checker could be used by participants to report their levels of sleepiness throughout the study. Participants received push notifications (if enabled) to complete it twice randomly throughout the day, but there was no limit on the number of times it could be completed.

Fig. 6
figure 6

The Sleep Quality Checker. The Sleep Quality Checker could be used by participants to report their sleep quality for the previous night’s sleep. Participants received a push notification 1.5 hours after their reported wake-up times (if enabled) to provide this information, but it could be completed at any time on a given study day.

Fig. 7
figure 7

Alertness Checker Activity. The Alertness Checker was adapted from the 3-minute psychomotor vigilance task and was included as an objective measure of alertness. The stimulus was the green circle shown.

Active task administration frequency

The SHMAS was designed to be a longitudinal observational study. The baseline onboarding surveys were gradually made available over the first three days of the study rather than all at once. This was done with the intention of making the baseline assessment less overwhelming. Table 2 outlines the survey and activity release schedule. Surveys were intended to be completed once during a baseline period and again at subsequent quarterly follow-ups. Daily activities were intended to supplement baseline and quarterly surveys, and users received notifications to complete them for 7-days in a row during the baseline period. After the baseline period ended, notifications automatically turned off by default. Users could re-enable notifications if desired, and were encouraged to continue to regularly use the daily activities to track their daytime sleepiness, sleep quality and objective vigilance.

Table 2 Schedule of Study Surveys and Activities.

Passive healthkit data collection

Participants had the option to share HealthKit data with the study team during the onboarding process. For participants who had granted HealthKit sharing permissions, HealthKit data were passively collected in the background while the study application was open.


The SHMAS was only available to iPhone users residing in the United States. Location-based data were not collected for SHMAS participants in an effort to maintain participant privacy. Recruitment of participants for this iOS application based study may have introduced socioeconomic, gender and racial biases, which has been discussed in previous RK-based study publications35,36. Future digital health studies will benefit from being made available internationally and across iOS and Android platforms which will allow more diverse study cohorts to be recruited.

Data Records

All coded data sets with associated metadata and documentation are stored and accessible via the Synapse platform by qualified researchers ( Detailed instructions describing the process of requesting access to data can be found on the SleepHealth Public Researcher Portal under the “Accessing SleepHealth Data” page.

Please see Tables 3 and 4 for a breakdown of surveys and daily activities completed by study participants who agreed to share their data broadly37,38,39,40,41,42,43,44,45,46,47,48,49. Table 5 describes the kinds of HealthKit data that were provided by study participants over the course of the study50,51,52,53,54,55,56,57,58.

Table 3 Available Survey Data.
Table 4 Available Daily Activity Data.
Table 5 Available HealthKit Data.

Technical Validation

The data provided on the SleepHealth Public Researcher Portal ( come from participants who agreed to share broadly with the study team and qualified researchers worldwide. All data are self-reported unless otherwise indicated, and should be treated as such. The study portal also contains passively collected Apple HealthKit data from participants who consented to data sharing.

Potentially sensitive data such as birthdates and free-text responses were excluded in order to safeguard participant privacy. In order to protect participants with potentially identifiable traits, self-reported height and weight data that fell outside thresholds established in a previously published data descriptor manuscript (height <60 or >78 inches, weight <80 or >350lbs)35 were excluded and set to ‘CENSORED’ in the corresponding data frames. Despite the fact that all study participants affirmed that they were 18 years of age or older at the time of study enrollment, some participants later provided dates of birth which indicated that they were under 18 years old. Data contributed by such participants were excluded. Additionally, data from test accounts that were created during the initial study period were also excluded in order to maximize data quality.

The raw passive data collected through Apple’s HealthKit framework were cleaned extensively. Specifically, data that contained sensitive and potentially identifiable information such as participant location, and device name (which in some cases contained participant names) were excluded. Finally, the subset of HealthKit data types that were shared by at least 1% of the participants in the HealthKit data sharing cohort were selected for broad release.

Usage Notes

Governance structures have been put in place in order to respect the balance between the desire of participants to share their data with qualified researchers and the potential risk of re-identification of participant data. To obtain access to these data, researchers must complete the following steps:

  1. 1.

    Become a Synapse Certified User with a validated user profile.

  2. 2.

    Submit a 1–3 paragraph Intended Data Use statement to be posted publicly on Synapse.

  3. 3.

    Agree to comply with the data-specific Conditions for Use when prompted:

  • You confirm that you will not attempt to re-identify research participants for any reason, including for re-identification theory research.

  • You reaffirm your commitment to the Synapse Awareness and Ethics Pledge.

  • You agree to abide by the guiding principles for responsible research use and data handling as described in the Synapse Governance documents (

  • You commit to keeping these data confidential and secure.

  • You agree to use these data exclusively as described in your submitted Intended Data Use statement.

  • You understand that these data may not be used for commercial advertisement or to re-contact research participants.

  • You agree to report any misuse or data release, intentional or inadvertent to the ACT within 5 business days by emailing

  • You agree to publish findings in open access publications.

  • You promise to acknowledge the research participants as data contributors and the study investigators on all publication or presentation resulting from using these data as follows:

“These data were contributed by the participants of the SleepHealth Mobile App Study (SHMAS), which was sponsored by the American Sleep Apnea Association with scientific oversight by Carl Stepnowsky, PhD, as described in Synapse:”