TILES-2018, a longitudinal physiologic and behavioral data set of hospital workers

Mundnich, Karel; Booth, Brandon M.; L’Hommedieu, Michelle; Feng, Tiantian; Girault, Benjamin; L’Hommedieu, Justin; Wildman, Mackenzie; Skaaden, Sophia; Nadarajan, Amrutha; Villatte, Jennifer L.; Falk, Tiago H.; Lerman, Kristina; Ferrara, Emilio; Narayanan, Shrikanth

doi:10.1038/s41597-020-00655-3

Data Descriptor
Open access
Published: 16 October 2020

Scientific Data volume 7, Article number: 354 (2020)


This article has been updated

Abstract

We present a novel longitudinal multimodal corpus of physiological and behavioral data collected from direct clinical providers in a hospital workplace. We designed the study to investigate the use of off-the-shelf wearable and environmental sensors to understand individual-specific constructs such as job performance, interpersonal interaction, and well-being of hospital workers over time in their natural day-to-day job settings. We collected behavioral and physiological data from n = 212 participants through Internet-of-Things Bluetooth data hubs, wearable sensors (including a wristband, a biometrics-tracking garment, a smartphone, and an audio-feature recorder), together with a battery of surveys to assess personality traits, behavioral states, job performance, and well-being over time. Besides the default use of the data set, we envision several novel research opportunities and potential applications, including multi-modal and multi-task behavioral modeling, authentication through biometrics, and privacy-aware and privacy-preserving machine learning.

Measurement(s)	Overall Sleep Quality Rating • Step Unit of Distance • Speech • Mean Heart Rate • Proximity • Electrocardiogram Sequence • heart rate variability measurement • Respiratory Rate • physical activity measurement • light • door motion • Changes in Ambient Temperature in Medical Device Environment • humidity • Overall Emotional Well-Being • Stress • psychological flexibility • work-related acceptance • work engagement • psychological capital • intelligence • job performance • organizational citizenship behavior • counter-productive work behavior • personality trait measurement • Negative affectivity • positive affectivity • anxiety-related behavior trait • Alcohol Use History • Overall Health Rating During Past Week
Technology Type(s)	photoplethysmography • Accelerometer • Microphone Device • Bluetooth-enabled Activity Monitor • electrocardiogram • Sensor Device • Photodetector Device • Temperature Sensor Device • questionnaire • Multidimensional Psychological Flexibility Inventory (MPFI) • Utrecht work engagement scale • survey method • individual task proficiency • Organizational Citizenship Behavior Checklist • big five inventory • Positive and Negative Affect Schedule (PANAS-X) • State-Trait Anxiety Inventory
Sample Characteristic - Organism	Homo sapiens
Sample Characteristic - Environment	hospital

Machine-accessible metadata file describing the reported data: https://doi.org/10.6084/m9.figshare.12465101

Background & Summary

Maintaining a healthy, productive workforce is an increasingly challenging problem in a complex and frenzied world. Optimal job performance relies on worker wellness, and as organizations strive to prepare their workforce for the evolving demands, worker wellness is increasingly important. Current standards are based on cross-sectional assessment of employee characteristics, often in controlled testing conditions that cannot account for the dynamic nature of working environments and employee performance and are therefore poorly suited for this task¹. Fortunately, today’s densely instrumented world offers tremendous opportunities for unobtrusive and persistent acquisition and analysis of diverse, information-rich time-series data that provide a multi-modal, spatio-temporal characterization of individuals’ actions in, and of, the environment within which they operate. However, the connection between individual and group performance, well-being, and quantitative measurements from sensor data has not been established for such dynamic environments in the wild.

To connect job performance-related and well-being-related constructs through self-assessments with data from sensors, we frame well-being and performance within the overarching notion of psychological flexibility. Psychological flexibility refers to an individual’s capacity to maintain fluid awareness and acceptance of current circumstances and, depending upon available opportunities, take effective action even when experiencing difficult or unwanted thoughts, emotions, and sensations². Psychological flexibility is defined as a primary individual determinant of behavioral effectiveness and well-being³. It has been shown that, in the workplace, the degree to which employees are psychologically flexible can have a profound effect on their productivity, well-being, and success^4,5. Moreover, the connection between sensor measurements and mental states put forth by the Somatic Marker Hypothesis⁶ suggests that the physiological status of our body (i.e., the somatic marker) is an indispensable part of our cognition and emotion, which are building blocks of our mental states. The purpose of our research is to connect psychological flexibility, job-performance, and well-being with somatic and bio-behavioral markers using an in situ experimental study in a real world workplace.

The TILES-2018 (Tracking IndividuaL performancE with Sensors, year 2018) data set comes from a prospective longitudinal study using intensive multimodal assessment of workers and their environments aimed to understand the dynamic relationships among individual differences, work and wellness behaviors, and the contexts in which they occur. It aims to support developing and validating sensor-based methods for evaluating worker wellness and job performance over time. To achieve this, we partnered with University of Southern California’s Keck Hospital to directly observe 212 workers who volunteered to participate in the study over a 10-week period both at work and outside of work. Bio-behavioral data were captured continuously and passively throughout the study via wearable devices (including a wristband, a smart undergarment, a clip-on audio-features recorder and Bluetooth-enabled badge, and personal smartphones). These data streams were matched with environmental and behavioral data streams from Internet-of-Things devices and applications logging personal smartphone usage. To map sensor data to constructs of interest, participants also completed an initial battery of online surveys and daily surveys designed to assess individual difference variables (e.g., personality, intelligence, socioeconomic status), psychological states and traits (e.g., positive and negative affect, anxiety, stress, fatigue, psychological flexibility, psychological capital), health and wellness (e.g., sleep, physical activity, cardio exercise, tobacco and alcohol use, health-related quality of life, life satisfaction), and work behaviors (e.g., task performance, organizational citizenship behavior, counterproductive work behavior, work engagement, perceived support and stressors).

This data set provides a unique opportunity for researchers interested in organizational psychology, or in data science more generally, to perform exploratory and hypothesis-driven investigations of the complex, dynamic nature of worker wellness and performance over time. It is also of interest to signal processing, machine learning, and privacy researchers because of the thousands of hours of sensor data collected from participants in natural, real-world settings. These data can be used to study and extend current multimodal signal processing methods, perform machine learning inference on psychological states and traits, and study and develop new methods to protect the privacy of users without hindering the richness of such a data set. Unique strengths of this data set include a rich set of self-assessed psychological constructs coupled with multimodal sensor data, all captured in the wild throughout ten weeks, with high compliance rates of the participants.

The data are available through two records: the Main Data Record, and the Audio Data Record, available at https://tiles-data.isi.edu/.

Methods

In this section, we describe the materials and procedures followed to collect the data. An overview of the study is shown in Fig. 1.

Location

The data collection took place at the University of Southern California’s (USC) Keck Hospital in Los Angeles, California, in the United States. USC Keck Hospital is an acute care hospital with 401 patient beds throughout 16 nursing units (https://www.keckmedicine.org/about-keck-medicine/keck-hospital-of-usc/). It is located within USC’s Health Sciences Campus.

Materials

This section describes the materials employed in the data collection: Surveys, sensors, and phone applications (apps).

Surveys

To understand the participants’ mental states, traits, and physical and emotional well-being, we asked them to take different surveys throughout the study; these also serve as targets (or labels) for statistical modeling. At the beginning of the study, participants were asked to answer a baseline survey over two different sessions, assessing constructs related to job performance, cognitive abilities, and health. Throughout the data collection, participants answered daily Ecological Momentary Assessments (EMAs) for health, job performance, personality, psychological flexibility, and psychological capital. After the conclusion of the sensor data collection, participants completed a post-study survey. These surveys are described in the following sections, where labels in parentheses correspond to how the measures are referred to in the data set.

Baseline survey

Due to the length of the baseline survey, it was split into two different sessions. A first part of the baseline survey was administered at the study enrollment session (described in the Study Procedures section) for participants, and assessed demographics and a number of cognitive and psychological constructs pertaining to job performance, cognitive ability, and health. Later, and before the start of the sensor-based data collection, participants answered the second part of the baseline survey at home (or another place of their choice). This survey assessed demographics, health, satisfaction with life, perceived stress, psychological flexibility, work acceptance and action, work engagement, psychological capital, and challenge and hindrance stressors (measures were administered in the above order).

We next describe the scales assessed in the first part of the baseline survey. We give brief descriptions herein (obtained from⁷, Table 1) and refer readers to the design and rationale behind this survey in the same document.

Demographics (DEMO): Participants completed a brief demographics survey which assessed sex, age, place of birth, English as the native language, education level, and job-related demographics (e.g., full-time or part-time, industry, tenure in the organization, and income).
Cognitive Ability: It was measured using two different scales:

Fluid Intelligence (ABS): Consists of 25 open-ended text entry items, and is scored by adding the sum of correct responses, for a range between 0 and 25.
Crystallized Intelligence (VOCAB): Consists of 40 multiple choice items, with 4 response options each. It is scored by adding the total number of correct answers, for a range between 0 and 40.

Tobacco Use (GATS): It was measured using a shortened version of the Global Adult Tobacco Survey, which consists of 3 items in the form of yes/no and quantity questions. The computed scores are tobacco status (never, past, current smoker) and a GATS score, computed by adding the tobacco units used in the past week (which is ≥0).
Alcohol Use (AUDIT): It was measured using The Alcohol Use Disorders Identification Test⁸, which consists of 10 items with yes/no, quantity, and frequency questions. It is scored according to AUDIT instrument scoring guidelines, for a total score in the range 0 to 40.
Sleep (PSQI): It was measured using the Pittsburgh Sleep Quality Index, consisting of 29 items with open-ended response formats as well as structured questions with categorical outcome options. The score is an aggregate sleep quality score, computed according to the PSQI instrument scoring guidelines, for a total score in the range 0–21.
Physical Activity (IPAQ): It was measured using the International Physical Activity Questionnaire, which consists of 27 items of the form yes/no, quantity, and frequency questions. The score is computed using total standardized MET-minutes reported for the prior 7-day period, which is ≥0.
Counter-productive Work Behavior (IOD): It was measured with the Interpersonal and Organization Deviance scale (IOD)⁹. It consists of a total of 19 items, separated into two subsets. Each item is a frequency scale ranging from 1 (never) to 7 (daily).

Interpersonal Deviance (IOD_ID): Consists of 7 items. The score is computed by adding the responses, for a total score in the range 7 to 49.
Organizational Deviance (IOD_OD): Consists of 12 items. The score is computed by adding the responses, for a total score in the range 12 to 84.

Organizational Citizenship Behavior (OCB): Measured using the OCB Checklist (OCB-C)¹⁰. It consists of 20 items, each being a frequency scale ranging from 1 (never) to 5 (every day). The score is computed by adding all the responses, for a total score between 20 and 100.
Task Performance was assessed using two different measures:

In-Role Behavior (IRB)¹¹: Consists of 7 items, each being a Likert scale ranging from 1 (strongly disagree) to 7 (strongly agree). A score is obtained by adding all the responses, for a total score between 7 and 49.
Individual Task Proficiency (ITP)¹²: Consists of 3 items, each being a Likert scale ranging from 1 (very little) to 5 (a great deal). A score is obtained by averaging all the responses, for a total score between 1 and 5.

Personality (BFI-2): It was measured using the Big Five Inventory-2¹³. It consists of 60 items, each being a Likert scale ranging from 1 (disagree strongly) to 5 (agree strongly). Five different scores are computed, all in a range between 1 and 5:

Negative Emotionality (neuroticism): Scored by averaging all the negative emotionality responses.
Conscientiousness: Scored by averaging all the conscientiousness responses.
Extraversion: Scored by averaging all the extraversion responses.
Agreeableness: Scored by averaging all the agreeableness responses.
Open-Mindedness: Scored by averaging all the open-mindedness responses.

Affect (PANAS): It was measured using the Positive and Negative Affect Schedule-Expanded Form¹⁴. It consists of 60 items, each being a Likert scale ranging from 1 (very slightly or not at all) to 5 (extremely). Two different scores were computed, with scores in the range 10 to 50:

Positive Affect (POSAFFECT): Score is obtained by adding the positive responses.
Negative Affect (NEGAFFECT): Score is obtained by adding the negative responses.

Anxiety (STAI): It was measured using the State Trait Anxiety Inventory¹⁵. It consists of 20 items, each being a frequency scale ranging from 1 (almost never) to 4 (almost always). It is scored by summing the responses, obtaining a value in the range 20 to 80.
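
The sum-based and mean-based scoring rules described above can be sketched as follows. This is a minimal illustration and not the authors' scoring code: the function names and the example responses are hypothetical, and only the item counts and response ranges are taken from the descriptions above.

```python
from statistics import mean


def check_responses(responses, n_items, low, high):
    """Validate that a response vector matches the scale definition."""
    if len(responses) != n_items:
        raise ValueError(f"expected {n_items} items, got {len(responses)}")
    if any(not (low <= r <= high) for r in responses):
        raise ValueError(f"responses must lie in [{low}, {high}]")


def sum_score(responses, n_items, low, high):
    """Sum scoring, as described for IOD_ID, IOD_OD, OCB, IRB, and STAI."""
    check_responses(responses, n_items, low, high)
    return sum(responses)


def mean_score(responses, n_items, low, high):
    """Mean scoring, as described for ITP."""
    check_responses(responses, n_items, low, high)
    return mean(responses)


# Hypothetical responses, for illustration only.
stai = sum_score([2] * 20, n_items=20, low=1, high=4)   # possible range 20 to 80
iod_id = sum_score([1] * 7, n_items=7, low=1, high=7)   # possible range 7 to 49
itp = mean_score([3, 4, 5], n_items=3, low=1, high=5)   # possible range 1 to 5
```

The stated score ranges follow directly from these rules: a sum score lies between `n_items * low` and `n_items * high`, while a mean score stays within the response range itself.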

Table 1 Selected sensors with a summary of measurements (output) and instructed use or sensing times. The first three sensing streams were obtained directly from participants through wearable sensors and apps installed in their personal smartphones. All surveys were obtained by direct input of participants on their personal smartphones or a web browser. The last four sensing streams were obtained by placing sensors in the hospital. PPG: photoplethysmography, ECG: electrocardiography.

The following scales correspond to the second part of the baseline survey, and were assessed on a take-home questionnaire. We include a description of each measurement and a brief rationale.

Demographics (DEMO): Additional demographics assessed several basic characteristics of participants. Specifically, they were asked about race, marital status, pregnancy, number of children living with participants, and housing situation (e.g., rent or own). It also assessed characteristics more germane to this particular sample: the position the participant currently held at the hospital from which they were recruited, the specific certifications they held (e.g., nurse practitioner), years in the profession, the shift they worked (e.g., day or night), the number of hours worked at the organization from which they were recruited, and the amount of overtime worked. In addition, participants were asked about the length of their commute, the mode of transportation used in their commute, and whether they had another job outside of the one from which they were recruited and, if so, how many hours they worked there. Lastly, they were asked whether they were currently a student, and about their gender, age, place of birth, English as the native language, education level, and job-related demographics (e.g., full-time or part-time, industry, tenure in the organization, and income).
Health (RAND): Health was measured using the Rand Health Survey-Short form¹⁶. This assesses eight health domains through 36 self-report items. These domains included physical function, role limitations due to physical health, role limitations due to personal or emotional problems, general mental health, social functioning, bodily pain, general health perceptions, and energy/fatigue. This measure also includes one scale that assesses perceived change in health. Scores are obtained by computing the mean of the items that are associated with each of the domains listed above.
Life Satisfaction (SWLS): The Satisfaction with Life Scale¹⁷ is a 5-item measure that aims to assess participants’ general satisfaction with life. Participants are to rate the degree to which they agree with each statement on a scale of 1 (strongly disagree) to 7 (strongly agree). A total score is obtained by taking the average of the 5 items.
Perceived Stress (PSS): The Perceived Stress Scale¹⁸ is a 10-item scale that aims to assess how often one has experienced stress in the last month. Participants are asked to rate the frequency with which they perceived stress on a scale of 0 (never) to 4 (very often). After reverse coding the necessary items, a total score is obtained by taking the average of the 10 items.
Psychological Flexibility (MPFI): The Multidimensional Psychological Flexibility Inventory¹⁹ is a 24-item questionnaire that measures both psychological flexibility as well as inflexibility. The 24-item measure is the short form version. 12 items measure flexibility and 12 items measure inflexibility each being assessed on a scale from 1 (never true) to 6 (always true). The MPFI also measures a number of sub-dimensions:

Psychological Flexibility (PF): Under flexibility there are sub-dimensions for acceptance, present moment awareness, self as context, defusion, values and committed action.
Psychological Inflexibility (PI): The inflexibility sub-scales include experiential avoidance, lack of contact with the present moment, self as content, fusion, lack of contact with values, and inaction.

Items on this measure ask participants to think about the last two weeks and to rate the frequency with which they experienced the feelings described in each item. PF, PI, and their sub-dimensions are scored by taking the mean of the items that comprise each scale or sub-dimension.
Work Related Acceptance (WAAQ): Additionally, psychological flexibility as related to work was measured by the 7-item Work-related Acceptance and Action Questionnaire²⁰. The WAAQ presents a statement and participants rate the degree to which each statement is true on a scale from 1 (never true) to 7 (always true). The WAAQ is scored by taking the mean of the items.
Work Engagement (UWES): Work engagement is measured using the Utrecht Work Engagement Scale²¹. The measure presents 9 items, and participants rate the frequency with which they have experienced the feeling described on a scale from 0 (never) to 6 (always). Scores are then averaged to obtain a total score. There are three sub-scales: vigor, dedication, and absorption.
Psychological Capital (PCQ): It can be thought of as a higher-order construct that is comprised of hope, self-efficacy, resilience, and optimism²². It is assessed through the Psychological Capital Questionnaire through a 12-item measure²³. The PCQ asks participants the degree to which they agree on a 6-point scale from 1 (strongly disagree) to 6 (strongly agree).
Challenge and Hindrance Stressors (CHSS): Challenge and hindrance stressors are measured using a 16-item measure in which participants were presented with a statement and asked to rate their degree of agreement or disagreement with it²⁴. 8 items were used to measure challenge stressors and 8 items were used to measure hindrance stressors. Total scores are calculated by computing the mean over all hindrance stressor items and, separately, the mean over all challenge stressor items.
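
Several of the measures above are scored as item means, and the PSS additionally reverse-codes some items first. The sketch below illustrates that two-step rule under stated assumptions; it is not the authors' scoring code. The text does not say which PSS items are reverse coded, so the positions in `PSS_REVERSED` are an assumption (the positively worded items of the PSS-10 are commonly listed as items 4, 5, 7, and 8), and the example responses are hypothetical.

```python
from statistics import mean

# Assumption: 0-based positions of the reverse-coded items (items 4, 5, 7, 8
# in 1-based numbering). Check the instrument's scoring guide before use.
PSS_REVERSED = frozenset({3, 4, 6, 7})


def score_pss(responses, low=0, high=4, reversed_items=PSS_REVERSED):
    """Reverse-code the flagged items, then average all 10 items."""
    if len(responses) != 10:
        raise ValueError("the PSS has 10 items")
    if any(not (low <= r <= high) for r in responses):
        raise ValueError(f"responses must lie in [{low}, {high}]")
    recoded = [
        (low + high - r) if i in reversed_items else r
        for i, r in enumerate(responses)
    ]
    return mean(recoded)


# Hypothetical responses, for illustration only.
example = [1, 2, 0, 4, 3, 1, 4, 3, 2, 1]
pss = score_pss(example)
```

On a scale running from `low` to `high`, reverse coding maps a response `r` to `low + high - r`, so on the 0 to 4 scale a 4 becomes 0 and a 3 becomes 1.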

Ecological momentary assessments

The Ecological Momentary Assessments (EMAs) were received twice a day by participants and were divided into two groups. Note that some scales have a “D” appended to their name compared to the baseline survey to denote its daily version.

A first group of EMAs assessed job-related variables, health-related variables, and personality. The job-related questions were asked a total of 31 times during the study (every two days), the health-related questions were asked 35 times during the study (every two days), and the personality-related questions were asked 5 times during the length of the study (every two weeks), with a total of 71 surveys administered over the 10 weeks of the study. Participants received one of these surveys daily. The job, health, and personality surveys were sent either at 6 am, noon, or 6 pm, and expired 4 hours after they were sent.

Another group of EMAs assessed psychological flexibility and psychological capital. The psychological flexibility form was sent to participants a total of 50 times over the ten weeks (5 times per week), whereas the psychological capital form was received a total of 20 times throughout the same period (2 times per week). Participants received one of these surveys daily. The psychological flexibility and psychological capital EMAs were sent uniformly at random to day shift participants between 11 am and 6 pm, and between 11 pm and 6 am for night shift participants. They expired 6 hours after their delivery.
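
The delivery schedule for this second group of EMAs can be illustrated with a short sketch. It checks the survey counts stated in the text and draws one send time uniformly at random within a shift's delivery window. The function name, the example date, and the second-level sampling granularity are assumptions made for illustration; the text does not describe how the random times were actually generated.

```python
import random
from datetime import datetime, timedelta

# Survey counts stated in the text.
FIRST_GROUP = {"job": 31, "health": 35, "personality": 5}
SECOND_GROUP = {"psychological_flexibility": 50, "psychological_capital": 20}

# Delivery windows as (start hour, window length in hours):
# 11 am to 6 pm for day shift, 11 pm to 6 am for night shift.
WINDOWS = {"day": (11, 7), "night": (23, 7)}
EXPIRY = timedelta(hours=6)


def sample_delivery(day, shift, rng):
    """Draw a send time uniformly at random within the shift's window.

    `day` is the date on which the window opens. The night-shift window
    crosses midnight, so its send time may fall on the following date.
    """
    start_hour, length_hours = WINDOWS[shift]
    window_start = datetime(day.year, day.month, day.day, start_hour)
    offset = timedelta(seconds=rng.uniform(0, length_hours * 3600))
    sent = window_start + offset
    return sent, sent + EXPIRY


rng = random.Random(0)  # seeded only to make the illustration repeatable
sent, expires = sample_delivery(datetime(2018, 3, 5), "night", rng)  # hypothetical date
```

Representing each window by a start hour and a length, rather than by start and end hours, avoids special handling for the night-shift window that wraps past midnight.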

The surveys were implemented using ResearchKit for iOS and ResearchStack for Android (through the TILES app described in Section Phone-apps).

The following items were asked of participants daily during the ten weeks and appeared at the beginning of each job, health, and personality EMA (base daily survey).

Context measures (CONTEXT): These were 4 context questions. The first question asked participants about interactions with other people and the communications channel. The second question asked about the activity in which they were engaged in when they received the survey. The third question asked for current location, and the fourth question asked whether any atypical events had occurred.
Stress (STRESSD): Stress was measured daily using a single item that read, “Overall, how would you rate your current level of stress?”.
Anxiety (ANXIETY): Anxiety was assessed daily using a single item that asked, “Please select the response that shows how anxious you feel at the moment”.
Affect (PAND): Participants’ positive and negative affect were assessed daily using the 10 items from PANAS-Short²⁵. 5 items were used to assess negative affect and 5 items were used to assess positive affect.
The purpose of the Job Performance Survey was to assess participants’ perceived job performance, and included the following measurements:
Work today (WORK): Prior to completing the job performance survey, participants were asked if they had worked 1 or more hours on that day. If participants answered no, they were not shown the job performance survey.
Task performance (ITPD, IRBD): Measured using the same items that were used in the baseline survey described previously.
Organizational citizenship behavior (OCBD)/Counterproductive work behavior (CWBD): These were measured using a total of 16 items (DALAL)²⁶, with 8 items per scale.

The purpose of the Health Survey was to assess a number of health-related variables:

Sleep (SLEEPD): Sleep was assessed with a single item that asked participants to specify the number of hours they slept the previous night. Participants were instructed not to confuse this with the number of hours spent in bed.
Physical Activity (EX): Physical activity was measured using two questions. The first asked participants to specify the number of minutes of vigorous activity they engaged in yesterday (e.g., sprinting, power lifting). The second asked how many minutes they spent the previous day engaging in moderate physical activity (e.g., jogging, biking).
Tobacco Use (TOB): Tobacco use was measured using two items. The first asked whether the participant used a tobacco product yesterday and if so, a follow-up question was presented which probed how many times tobacco products were used and what type of product was used.
Alcohol Use (ALC): Alcohol use was assessed using 2 items. The first asked whether participants consumed any alcohol yesterday and if they responded yes, they received a question that asked to specify how many beers, wines and spirits they consumed.

The purpose of the Personality Survey was to assess personality:

Personality (BFID): The personality survey uses BFI-10 (shortened version of the BFI-2 used in the baseline survey previously described).

The Psychological Flexibility Survey included context questions and measures of psychological flexibility:
Context Question (Activity): The first question asked participants to select from a list the type of activity in which they were engaged in immediately before beginning the survey. Example options included travel or commuting, eating and/or drinking, work, and work-related activities. Participants could also respond “other” and specify in text what they were doing.
Context Question (Experience): These items assessed experiences (both pleasant and unpleasant). The question was provided as a checklist (for positive and negative experiences), such as “Difficult thoughts of memories”, “Pleasant physical sensations”, “Difficult urges or cravings”.
Psychological Flexibility (PF): 13 items were included to assess psychological flexibility². Items of the PF survey are divided into 3 sub-scales. Participants were asked to report how true each statement was about themselves during the last 4 hours. They rated each statement on a scale of 1 (Never) to 5 (Always). The mean was calculated for all items in each sub-scale for a total score. This scale was created for this study.

The Engagement/Psychological Capital Survey assessed context (base daily survey), engagement, psychological capital, and challenge and hindrance stressors. It is comprised of items that are non-stigmatizing and/or pathologizing, and that have demonstrated large effect sizes on significant outcomes (e.g., employee health and well-being, job performance, job retention and turn-over)²⁷.

Context questions (Activity): The first question asked participants where they were, and participants selected from a list (e.g., work, home, outdoors, etc.). The second question was the same as the first question participants answered in the context questions for the psychological flexibility questionnaire.
Engagement (Engage): Participants completed a 3-item measure of work engagement²⁸. Participants were asked to think about the activity they had just reported doing and how they felt while engaging in that activity. Statements were rated on a scale of 1 (not at all) to 7 (very much). A mean of the 3 items was computed to create a total score.
Psychological Capital (Psycap): It was measured using 12 items from CPC-12²⁹. Participants were instructed to rate each statement based on how much they agreed with it. Items were rated on a scale of 1 (not at all) to 7 (very much). The mean for all 12 items was used to compute the total score.
Interpersonal Support (IS): A subset of 3 items from³⁰ is used to assess daily job resources.
Challenge/Hindrance Stressors (CS, HS): A subset of 8 items from the baseline survey measure of Challenge/Hindrance Stressors was used, 4 items to measure each type of stressor²⁴. Participants were instructed to consider the degree to which they agreed with each statement based on the last day that they had worked, including the day on which they completed the survey. Items were rated on a scale of 1 (not at all) to 7 (very much).

Post-study survey

The Post-study survey is equivalent to the take-home part of the baseline survey, except for not including demographics.

Sensing devices

The initial goal of the study was to predict self-assessed psychological constructs (obtained through surveys) from sensor data. To this end, we selected a set of wearable and environment-sensing devices to obtain physiological and behavioral information from participants. Table 1 summarizes the sensors worn by participants and their intended use throughout the study. Details on the sensor selection can be found in³¹.

Wearable sensors

Participants were instructed to wear a Fitbit Charge 2 wristband (https://help.fitbit.com/?p=charge_2) at all times throughout the duration of the study. Furthermore, at work, they were asked to wear an OMsignal smart garment (https://web.archive.org/web/20181221115159/https://www.omsignal.com/, a T-shirt for men and a sports bra for women, both discontinued) and a Unihertz Jelly Pro smartphone (https://www.unihertz.com/shop/product/jelly-pro-black-21, Jelly phone for short) as a lapel microphone (or “audio badge”). The Jelly phone was programmed to obtain audio features from the raw audio (which was discarded)³². In parallel, these Jelly phones also sent Bluetooth packets at 1 Hz over 15 s windows every minute, to estimate their locations within the building or workplace. These packets carried a unique 4-byte identifier for every participant.
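
The beaconing parameters above imply a simple packet budget, worked through in the sketch below. The rate, window length, period, and identifier size come from the text; the big-endian unsigned encoding and the function names are assumptions made for illustration, since the packet layout is not described here.

```python
import struct

ADVERTISING_RATE_HZ = 1   # packets per second within a window (from the text)
WINDOW_SECONDS = 15       # length of each transmission window (from the text)
PERIOD_SECONDS = 60       # one window every minute (from the text)
ID_BYTES = 4              # identifier size in bytes (from the text)

packets_per_minute = ADVERTISING_RATE_HZ * WINDOW_SECONDS
duty_cycle = WINDOW_SECONDS / PERIOD_SECONDS
distinct_ids = 2 ** (8 * ID_BYTES)


def encode_participant_id(numeric_id):
    """Pack an integer into 4 bytes.

    Big-endian unsigned is an assumption; the source does not describe
    how the identifier is encoded.
    """
    return struct.pack(">I", numeric_id)


def decode_participant_id(payload):
    """Inverse of encode_participant_id."""
    (numeric_id,) = struct.unpack(">I", payload)
    return numeric_id
```

Under these figures each phone sends 15 packets per minute and transmits during a quarter of each minute, and a 4-byte identifier leaves far more distinct values than the 212 participants require.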

Environmental sensors

There were two kinds of environmental sensors: Owl-in-One (https://shop.reelyactive.com/products/owl-in-one-ble) Bluetooth data hubs and Minew sensors (https://en.minewtech.com/sensor.html). The Owl-in-Ones were used to estimate participant proximity to these by capturing the signal strength of Bluetooth packets from the Jelly phones that participants wore in the hospital and to collect environmental data sent over Bluetooth by Minew sensors.
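
One simple way to turn received signal strength into a proximity estimate is to assign a participant, in each time window, to the data hub that heard their packets most strongly. The sketch below shows that idea only. It is not the method used by the authors, and the hub names, RSSI values, and one-minute windowing are hypothetical.

```python
from collections import defaultdict

# Hypothetical sightings: (minute index, hub id, RSSI in dBm).
sightings = [
    (0, "hub_nursing_station", -62),
    (0, "hub_patient_room", -80),
    (0, "hub_nursing_station", -58),
    (1, "hub_med_room", -55),
    (1, "hub_nursing_station", -71),
]


def nearest_hub_per_minute(sightings):
    """Return {minute: hub id}, choosing the hub with the highest mean RSSI."""
    grouped = defaultdict(lambda: defaultdict(list))
    for minute, hub, rssi in sightings:
        grouped[minute][hub].append(rssi)
    nearest = {}
    for minute, hubs in grouped.items():
        nearest[minute] = max(
            hubs, key=lambda hub: sum(hubs[hub]) / len(hubs[hub])
        )
    return nearest


nearest = nearest_hub_per_minute(sightings)
```

RSSI is only a coarse proxy for distance, because walls, bodies, and device orientation all attenuate the signal, so an estimate of this kind is better read as "closest hub" than as a position.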

The Owl-in-Ones were installed in fourteen nursing units (spread over seven of the building floors) and two hospital labs. A total of 244 Owl-in-Ones were installed, about 1.5 m above the floor depending on space availability on wall areas near power outlets. Each nursing unit was equipped with an Owl-in-One sensor in these four room types: patient room, nursing station, lounge, and medication room. These different rooms were selected after observing the behavioral patterns of nurses during their shifts (by talking to nursing directors of Keck Hospital and shadowing nurses throughout a workday). Each Owl-in-One was labelled with the study logo, and the phrase “This is a data hub for the TILES study. For more information, please visit https://sail.usc.edu/tiles”.

One Owl-in-One was installed in every other patient room, one in every medication room, one in every lounge, and between one and four in nursing stations, depending on the size, layout, and availability of power outlets. In the hospital labs, one Owl-in-One was installed in every lounge, and at least one in each major room (e.g., blood lab, micro-bio lab, shipping/receiving, patient lobby, etc.) depending on the room size and power outlet availability. Figure 2 shows an example of Owl-in-One placements in a nursing unit.

Through information collected from Minew sensors, the Owl-in-Ones also captured (door) motion information, humidity, temperature, and light information across the hospital. Two light (E6) and temperature/humidity (S1) Minew sensors were installed in each nursing unit and each laboratory. These sensors were placed in open areas near the main hallways and within one foot of an Owl-in-One sensor. In the nursing units, one pair of E6/S1 sensors was installed in the nursing stations nearest and farthest from the unit entrance. In the labs, one pair was located near the lab entrance and the other in a frequently occupied open room away from the entrance. Minew motion sensors (E8) were placed on the top outer corner of doors and captured information pertaining to foot traffic through the doorway. One motion sensor was placed on each medicine room door in the nursing units. No sensor was placed on the lounge room doors because they remained open at all times, and none were placed on the unit entrance/exit doors due to fire safety restrictions. In the labs, one motion sensor was placed on the main entrance door and one on the lounge door. A total of 52 motion sensors, 63 light sensors, and 63 temperature/humidity sensors were installed throughout the hospital.

Phone apps

Several phone apps were installed, with informed consent, on the participants’ personal smartphones, for interaction with sensors, data uploading, to receive surveys, and to communicate with the research team.

TILES app

This app was custom-developed for the TILES study and was used both for data collection and for communication with participants throughout the enrollment and data collection periods. It is available for both Android and iOS (see Section Code Availability for details). The EMAs were administered via the TILES app. Participants received a push notification when an EMA was delivered and again thirty minutes before it expired if it had not yet been completed. Bi-directional communication was enabled via the TILES app as well. Participants could contact the research team at any time through the Contact Info tab. The app also contained a Frequently Asked Questions (FAQs) page which was updated in real time during the study as common questions were identified. In return, participants were notified of any non-compliance via push notifications and via the activity feed within the app, and were reminded to sync each device with its companion app.

Fitbit app

The Fitbit app is a third party app that was used to pair the Fitbit wristband with each participant’s personal smartphone using Bluetooth. Participants could visualize the data collected through their Fitbit wristband in this app, and could sync their data with Fitbit’s servers.

OMsignal app

The OMsignal app is a third party app that was used to start and stop the recording of the OMsignal garments, update the firmware of OMsignal garments if necessary, and sync the data to OMsignal’s servers.

RealizD app

RealizD is a third party smartphone application (no longer developed) for iOS and Android that records screen-on time and phone pickups. Data reported by RealizD takes the form of a timestamp marking the start of each screen-on session and the duration of that session in seconds.

Study procedures

In this section we describe the mechanisms through which participants were deemed eligible and later recruited and enrolled in the study. We also describe the data collection process. All these steps were conducted in accordance with USC’s Health Sciences Campus Institutional Review Board (IRB) approval (study ID HS-17-00876). We present an overview of the study in Fig. 1.

Requirements for eligibility

All volunteer participants were recruited from the University of Southern California’s (USC) Keck Hospital. To participate, subjects were required to (a) be employed by the hospital and work, on average, at least 20 hours a week, (b) have exclusive access to an internet and Bluetooth-enabled mobile phone running Android 4.3 or higher or iOS 8 or higher for the 10 weeks of participation, (c) have exclusive access to a personal e-mail for the 10 weeks of participation, (d) have access to WiFi at home for the duration of the 10 week study, (e) be proficient in both speaking and reading English, and (f) be capable of wearing wearable sensors in a way that allows data to be collected and transmitted to the research team.

Recruitment

Participants were recruited using multiple methods, including (a) e-mails to employees from leaders within Keck Hospital informing them about the study and how to sign up, (b) attending employee meetings to inform employees about the study, (c) posting flyers in different parts of the hospital where employees would be likely to see them, and (d) information tables set up in the cafeteria, where potential participants could learn more about the study and sign up. Participants who had indicated interest but had not completed the sign-up process were texted by one of the principal investigators to support completion of the sign-up process.

After completing a screening questionnaire to check eligibility, potential participants were sent a text message with a link to download the TILES app. The TILES app then walked them through identity verification, informed consent, downloading and syncing the necessary additional apps, and finally signing up for an in-person enrollment session.

Through the above methods, 365 individuals indicated interest in participating by completing a brief screening questionnaire and were found to be eligible. Of these 365 individuals, 212 participants provided their consent to participate in the study, while 153 did not complete the on-boarding procedures. Participants were recruited in three waves, each with different start and end dates. Table 2 summarizes the dates and number of participants per wave. Over the course of the study, eight participants chose to drop out, due to various reasons, such as a sensor becoming uncomfortable or no longer wanting to receive daily surveys. The data of these participants has been kept in the dataset.

Table 2 Data collection implementation. This table shows the start and end dates of each wave, with corresponding number of participants at the beginning of each wave and dropouts. Specific dropout dates are given in the folder metadata/participant-info (see Section Main Data Record).

Participant enrollment session

After providing their consent to participate, interested individuals signed up for a two-hour in-person enrollment session at the hospital through the TILES app. Upon arrival at the enrollment session, each participant was assigned a unique participant ID. During the first hour, participants completed part I of the baseline survey, under the supervision of a trained research team member. During the second hour, participants received their package of wearable sensors and instructions for use. Each participant received three wearable sensors along with a USB charging hub and two micro USB cables for charging, to help participants streamline the process of charging the sensors. The TILES app sent participants links to download all the necessary apps: Fitbit, OMsignal, and RealizD.

Participants were instructed to wear three sensors (a Fitbit Charge 2, an OMsignal garment, and a Unihertz Jelly Pro smartphone) that collected physiological and behavioral data over a 10-week period. We describe the instructions given to participants in the following paragraphs. Table 1 shows a list of the sensing streams and their instructed use. In addition, participants were instructed to fill part II of the baseline survey at home.

Daily surveys

Participants were informed that, from the first day of data collection, they would receive one text message each day they were enrolled in the study. The text message contained a link to the job, health, or personality EMAs that they were expected to complete that day. Participants were instructed to complete the survey as soon as safely possible once they received the text message. A second daily EMA with psychological flexibility or capital surveys was received via a push notification on the participant’s phone and contained similar instructions.

The EMAs took no more than 15 minutes to complete, and on most days the survey could be completed in around 5 minutes. Participants who worked the night shift received the first EMA (job, health, or personality) at either 6 pm, 12 am, or 6 am, and participants who worked the day shift received it at either 6 am, 12 pm, or 6 pm. Participants were informed that they had 6 hours to complete each survey and that they would receive a reminder notification from the TILES app 30 minutes before the link expired if the survey had not been completed. The research team then distributed a calendar of the 10-week data collection period with a schedule of when to expect the daily survey each day. For the second (psychological flexibility or capital) EMAs, night shift participants received the surveys at a random time between 11 pm and 6 am, and day shift participants received them at a random time between 11 am and 6 pm; in both cases, participants were given 6 hours to complete the survey once it had been sent. All participants received a push notification 30 minutes before the survey closed to remind them to complete it.

Fitbit Charge 2

The first wearable sensor distributed to participants was the Fitbit Charge 2. Participants were asked to wear this sensor on their non-dominant hand day and night throughout participation in the study. To properly set up this sensor, each participant created a Fitbit account, registered the Fitbit Charge 2 as a new device, and synced the Fitbit app to the TILES app. When prompted by the Fitbit app, participants were asked to give Bluetooth permissions and deny location permissions.

OMsignal garments

Next, participants were given an OMbox and OMsignal garments; men were given five shirts and women were given three bras. The OMbox contains the hardware and software to process, collect, and transmit the information. Participants were asked to charge the sensor prior to each work shift, connect it to the OMsignal garment, and wear the OMsignal garment with the OMbox attached during their work shifts at the hospital. They were also asked to start an OMsignal recording in the OMsignal app installed on their phones at the beginning of each work shift, and to stop, save, and upload the recording at the end of each work shift. During the enrollment session, each participant paired their OMsignal box to their account (created through the TILES app) in the OMsignal app on their mobile phone, and practiced connecting the OMsignal box to the garment and saving an OMsignal recording. At the beginning of the data collection, there was no version of the OMsignal app for Android. As a solution, we provided each participant who had an Android personal smartphone with an iPod Touch on which the OMsignal app was installed. This way, they could start and stop recordings and upload the data using WiFi. The research team also helped set up location-based reminders on the iOS devices to help participants remember to start and stop OMsignal recordings when arriving at and leaving the hospital.

Unihertz Jelly Pro

Participants were given a Unihertz Jelly Pro phone (running Android 7.0). These were either clipped to participants’ clothing near the neckline or placed in a shirt pocket. The cases of the Jelly phones were modified, such that the microphone pointed upwards, as described in³², to better capture the speech data from the wearer. Participants were asked to charge the Jelly phone prior to each work shift, unlock the Jelly phone, check that the TILES Audio app³² was running, and upload the audio data at the end of each work shift by pressing the UPLOAD DATA button in the TILES Audio app. Each Jelly phone was linked to the TILES app on each participant’s mobile device by scanning a QR code in the TILES app. When prompted by the Jelly phone TILES Audio app, participants were asked to enable permissions (e.g., allowing the TILES Audio app to run in the background, and granting access to photos, which was needed for the proper functioning of the app even though the camera was not used) and disable location-related services. Additionally, participants were informed of the Jelly Phone TILES Audio app’s disable feature (to stop recording audio features) and instructed on how to use this function.

RealizD app

Lastly, participants downloaded the RealizD app on their smartphone and were informed that this app would track how often the phone was picked up and for how long. Participants did not need to interact with the RealizD app during their participation, since it ran in the background.

Phone permissions

For the RealizD app to work, participants were asked to allow location permissions. Participants were also asked to keep WiFi and Bluetooth turned on on their personal mobile phones throughout their participation in the 10-week data collection period.

Environmental sensors

Finally, participants learned about the environmental sensors that were placed around the hospital and were informed that no participant interaction with these sensors was required.

Completing the pre-study survey

Following completion of their enrollment session, participants were emailed a link to complete this survey, administered on the online survey platform REDCap.

Data collection

The 10-week data collection took place in three different participant waves. The data collection periods and number of participants per wave are shown in Table 2.

Off-boarding session

After the 10-week data collection from sensors and daily surveys ended, participants attended an in-person off-boarding session, which typically lasted between 15 and 20 minutes. During this session, participants exported mobile application data to members of the research team and returned their wearable sensors (except for Fitbit, see Section Incentives).

Completing the post-study survey

Following completion of their off-boarding session, participants were emailed a link to complete a survey administered on the online survey platform REDCap. This survey was identical to part II of the baseline survey; the only difference is that the demographics survey was removed and a study feedback survey was administered. This survey took approximately 30 minutes to complete. This concluded participant study procedures.

Incentives structure

A novel incentive scheme was developed for the TILES study to encourage compliance. Study participants were awarded monetary incentives (Table 3) and points for study-related activities, proportionate to the time required to complete each activity. These points later translated to monetary awards. The number of points awarded for each activity is summarized in Table 4. A survey was considered completed if the participant went through the entire survey (although they could skip questions). Note that participants with at least three consecutive days of Fitbit data received a 2× boost on the points awarded for wearing the Fitbit. Points were converted to monetary compensation on a weekly cadence, according to a set of thresholds noted in Table 5. The expected use of OMsignal garments and Jelly phones was 3 days a week for most of the participant population, so points for wearing and syncing these devices were added to the incentives schemes as bonuses.

Table 3 Summary of monetary incentives. Participants were paid after the completion of different stages throughout the study.

Table 4 Weekly points given to participants. The points were assigned based on the completion of the tasks.

Table 5 Weekly reward cutoffs. Weekly points awarded for compliance in sensor usage and answering of the surveys were translated into monetary rewards.

In addition to weekly gift cards as incentive, points were accumulated throughout the duration of the study and grand prizes were awarded to the top three point earners per wave. Each participant’s current point total and ranking were displayed in the TILES app activity feed. Bonus points were awarded for various activities, as summarized in Table 6. The first, second, and third place point earners across each wave were awarded $250, $200, and $100, respectively.

Table 6 Bonus points scheme for study participation stages and milestones. Participants received weekly points by wearing the sensors and answering the surveys. These were converted to weekly monetary rewards (Table 5) and added to a global ranking that awarded prizes at the end of the study.

Participants that finished the 10-week data collection period also kept the Fitbit Charge 2 that they wore during the study.

Data acquisition and flow

Figure 3 depicts the architecture for the data collection from sensors. On the left column, we have all possible wearable and environmental sensors. Wearable sensors such as Fitbit and OMsignal garments connect to the participants’ personal smartphones using Bluetooth. The data are uploaded to a third-party server using an available wireless internet connection (WiFi or LTE).

The Jelly phones (used here as audio-features recorders) uploaded data to the research server directly using WiFi only. The Jelly Pro smartphones given to participants also sent Bluetooth packets programmed with a unique identifier that were captured by the Owl-in-One hubs installed throughout the hospital. These packets were combined with their received signal strength indicator (RSSI) computed by the Owl-in-Ones.

The Owl-in-Ones also received data from environmental sensors. Data from both Minew sensors and Jelly phones were sent through Keck Hospital’s public WiFi network to reelyActive’s servers over UDP, from which the data were collected using the Pareto API³³ over HTTPS. These data were stored securely in the research server after filtering the data to contain only Bluetooth packets generated by our sensors.

Data were also collected directly through the participants’ personal smartphones through the TILES app and the RealizD app. The TILES app uploaded data directly to the research server while the RealizD app uploaded data to the RealizD server and that data was later pulled to the research server. The research server (code available at https://github.com/usc-sail/tiles-data-collection-pipeline/) consists of a RESTful API hosting a series of endpoints to collect push-type data streams (e.g. Owl-in-One, TILES app) in addition to a suite of tasks to fetch pull-type data streams (e.g. Fitbit, OMsignal).

Data preprocessing

Survey data

Once the data collection period ended, the baseline survey and EMAs were scored using R scripts (available at https://git.io/JePgE).

Data for the baseline survey were stored in a table where each column represents a single survey question or metadata variable and each row represents a single participant. In contrast, data for the various EMAs were measured daily and stored in multiple files, where each row contains the answers of a single participant to a survey. The files are split by shift (day/night), date, survey kind, and time it was administered. Surveys left unanswered by participants were added as empty surveys later on. All these files were aggregated and curated to obtain three files, one for the first group of EMAs (job, health, and personality in addition to base), one for the psychological flexibility group and one for the psychological capital group (curation scripts are available in the folder src/curation/ of the companion code). We have removed most of the raw questions from the EMAs to preserve participants’ privacy (except for those included in Table 7), but have kept the aggregated scores. We also list in Table 8 the demographic variables that underwent additional curation to prevent re-identification.

Table 7 EMA surveys anonymization. We have not included in this table the variables whose values did not require changes.

Table 8 Demographics anonymization. We have not included in this table the variables whose values did not require changes.

Free text responses in EMAs have been manually annotated into categories. Three questions are concerned: location when answering, activity engaged in right before answering, and atypical events that happened or are expected to happen. Since some of these categories are subjective, we have between 2 and 5 annotations (one per annotator) for each text response. Each text response can have between 1 and 3 categories associated with it by annotators. The annotations are then fused, and the top 3 categories appearing at least twice are reported alongside the frequency of each category in the annotations (e.g. if 2 out of 5 annotators use a category, that category has frequency 2/5 = 0.4). We refer the reader to the README file in the dataset for further details on those categories and how they are reported in the data.
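
The fusion rule described above can be illustrated with a minimal Python sketch. It is not the curation code used for the data set: the function name fuse_annotations, the example category labels, the assumption that each annotator contributes a list of 1 to 3 categories, and the alphabetical tie-breaking are all hypothetical choices made for illustration.

```python
from collections import Counter


def fuse_annotations(annotations, min_votes=2, top_k=3):
    """Fuse per-annotator category lists for one free-text response.

    annotations: list with one entry per annotator, each entry being a
    list of category labels (assumed here to hold 1 to 3 labels).
    Returns up to top_k (category, frequency) pairs, keeping only the
    categories chosen by at least min_votes annotators.
    """
    n_annotators = len(annotations)
    # Count each category at most once per annotator.
    votes = Counter(cat for labels in annotations for cat in set(labels))
    kept = [(cat, n) for cat, n in votes.items() if n >= min_votes]
    # Most frequent first; ties broken alphabetically (an assumption).
    kept.sort(key=lambda item: (-item[1], item[0]))
    return [(cat, n / n_annotators) for cat, n in kept[:top_k]]


# Hypothetical annotations from five annotators for a single response.
example = [
    ["work", "hospital"],
    ["work"],
    ["home"],
    ["work", "hospital"],
    ["commute"],
]
fused = fuse_annotations(example)
print(fused)
```

In this example, "work" is used by 3 of 5 annotators and "hospital" by 2 of 5, so both are kept, while the categories used only once are dropped.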

Data for the enrollment session baseline survey (part I), take-home baseline survey (part II) and study-completion survey (part II) were stored in single files each. Variables were renamed to correspond to what each question measured. After the above steps were taken, total scores for each psychological measurement were calculated (scored folder in Table 9).

Table 9 TILES-2018 Main Data Record. There are five main folders containing information for each stream of data, plus a sixth folder containing participant metadata (all presented in alphabetical order). The details of each data stream (including measurements and features) are included in each of the subfolders of the data record as README files.

Fitbit data

Fitbit data retrieved using the Fitbit API contained separate time series for measured heart rate and step count, in addition to a daily summary of physical activity and sleep. The heart rate data are reported at non-uniform intervals of anywhere between approximately 5 s and 15 min, depending on the participants’ physical activity. Occasionally, long strings of repeated identical heart rate values (usually 70 bpm) were reported in the raw data, spanning durations typically less than 15 minutes but sometimes up to 20 hours. Because of consumer observations that Fitbit technology sometimes incorrectly reports exactly 70 bpm (see https://community.fitbit.com/t5/Blaze/Blaze-s-Heart-Rate-Stuck-on-70-bpm/td-p/2727738) and also because repeated measurements of the same heart rate over several minutes are highly unlikely, these long strings were interpreted as artifacts. Thus, sequences of at least 50 repeated identical heart rate values were replaced with NaN (Not a Number, equivalent to missing values). As a result, an average of 0.8% ± 1.7% of each participant’s total number of heart rate samples collected were removed. The step count, daily summary, and sleep data did not contain these long string artifacts and, therefore, were not pre-processed.
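
As an illustration of the artifact rule stated above (runs of at least 50 identical heart rate values replaced with NaN), the following Python sketch masks such runs in a list of samples. It is a sketch under stated assumptions rather than the preprocessing code used for the release; the function name mask_repeated_runs and the example values are hypothetical.

```python
import math
from itertools import groupby


def mask_repeated_runs(values, min_run=50):
    """Replace runs of at least min_run identical consecutive values with NaN."""
    cleaned = []
    # groupby yields maximal runs of consecutive equal values.
    for _, run in groupby(values):
        run = list(run)
        if len(run) >= min_run:
            cleaned.extend([math.nan] * len(run))
        else:
            cleaned.extend(run)
    return cleaned


# Hypothetical heart rate samples (bpm): a 60-sample run stuck at 70 bpm
# surrounded by plausible readings, plus a short run that is left untouched.
heart_rate = [68, 69] + [70] * 60 + [71, 72] + [73] * 10
cleaned = mask_repeated_runs(heart_rate)
n_removed = sum(math.isnan(v) for v in cleaned)
print(n_removed, len(cleaned))
```

Note that the rule operates on the number of consecutive samples, not on elapsed time, so with non-uniform sampling a masked run can span very different durations.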

OMsignal data

The data obtained from OMsignal’s API contained no obvious visible artifacts, and so they were not modified during the pre-processing stage.

Owl-in-One data

The Owl-in-One devices captured packets from all Bluetooth devices broadcasting Bluetooth advertisements at Keck Hospital. We filtered all of these packets and stored only the packets coming from Minew sensors, Jelly phones, and Owl-in-Ones by filtering keywords expected to be found in the packets (“minew”, “reelyActive_RA-R436”, “jelly”). These were originally stored in JSONL format, and later translated to CSV files containing only the relevant information for easier processing (details below).
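
A keyword filter of this kind can be sketched in a few lines of Python. The three keywords are taken from the text; everything else is an assumption made for illustration, including the JSON field names (deviceName, rssi), the case-sensitive substring match on the raw line, and the function names keep_packet and filter_jsonl. The actual packet schema is documented in the data record, not here.

```python
import json

# Keywords listed in the text; matching on the raw line is an assumption.
KEYWORDS = ("minew", "reelyActive_RA-R436", "jelly")


def keep_packet(raw_line, keywords=KEYWORDS):
    """Return True if the raw JSONL line mentions any expected keyword."""
    return any(keyword in raw_line for keyword in keywords)


def filter_jsonl(lines):
    """Parse and keep only the packets that match the keyword filter."""
    kept = []
    for line in lines:
        line = line.strip()
        if line and keep_packet(line):
            kept.append(json.loads(line))
    return kept


# Hypothetical packets; the field names are illustrative only.
lines = [
    '{"deviceName": "minew E6", "rssi": -71}',
    '{"deviceName": "jelly", "rssi": -64}',
    '{"deviceName": "SomeHeadset", "rssi": -80}',
    "",
]
kept = filter_jsonl(lines)
print([packet["deviceName"] for packet in kept])
```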

RSSI

The RSSI information was pre-processed separately for Minew sensors, Jelly phones, and Owl-in-Ones themselves, and stored in CSV files. All MAC addresses were translated into hospital rooms or locations and formatted into a directory name as follows: [building name]:[floor#]:[wing/area]:[room type][room #]. These files also include relevant IDs (such as the participant ID associated to a Jelly phone), when appropriate. We have hashed the actual directory names to prevent making the hospital’s floor plans publicly available, such that the floor number, unit, and room numbers are kept private. An example is c25c:lounge:2fec.

Environmental data

Bluetooth packets sent by Minew sensors contained the measured temperature and humidity, light level, or motion information in their payload. Each packet was received by Owl-in-One devices, time stamped, and sent to reelyActive’s cloud servers, where it was processed and sent to the research server. In the research server, the packets were filtered so that only packets containing Minew data were kept as environmental data. All environmental data were further filtered so that the only packets recorded contained identifier values that also appeared on the research team’s list of identifiers for all installed sensors. Less than 0.1% of the received packets contained corrupted data in the form of invalid source sensor identifiers, which is consistent with the Bluetooth Low Energy (BLE) bit error rate. None of the other packet values were observed to be corrupted, including the measured environmental data, so no additional preprocessing was performed.

Audio

Each file contains raw audio features extracted as a combination of the Interspeech 2013 ComParE Vocalization Challenge feature set³⁴ and openSMILE’s emobase feature set³⁵. The openSMILE toolkit was applied in this configuration to extract acoustic low-level descriptors (LLDs) of 127 dimensions every 10 ms using either 25 ms or 60 ms frame sizes. The configuration file used to extract features is provided with the app itself (the openSMILE configuration file is also available at https://git.io/JeiC7). The feature set contains prosodic measures (pitch, intensity), cepstral information (MFCCs 1–12), RASTA PLP features, spectral features (band energy between 250–650 Hz, centroid of frequency distribution, spectral rolloffs), and other acoustic characteristics (e.g. LPC 0–7, zero crossing rate).

We did not perform any preprocessing on the raw audio before feature extraction. To extract foreground speech information, we trained a machine learning model to differentiate foreground from background on a separate corpus collected in-house, with the same audio feature extraction hardware and software but also with the ground truth audio, and applied it to the TILES-2018 Audio Data Record’s raw features³⁶. The output of the model is a sequence of temporal foreground predictions in the interval [0, 1], where values close to 1 predict foreground. These temporal foreground predictions are also included in the TILES-2018 Audio Data Record, and described in the Data Records section. To extract data with foreground speech, we recommend first median-filtering the foreground speech predictions with a window length of 101 samples (corresponding to a 1 s window) and then thresholding the filtered predictions at 0.5. A non-zero value then corresponds to a row with detected foreground speech.
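
The recommended post-processing (a 101-sample median filter followed by a threshold at 0.5) can be sketched in plain Python as follows. With one prediction every 10 ms, 101 samples span roughly 1 s. This is a minimal illustration, not the authors' code: the function names, the truncated-window handling at the sequence edges, and the strict "greater than" comparison are assumptions, and the example predictions are synthetic.

```python
from statistics import median


def median_filter(values, window=101):
    """Sliding median with a centered window (truncated at the edges)."""
    half = window // 2
    n = len(values)
    return [
        median(values[max(0, i - half):min(n, i + half + 1)])
        for i in range(n)
    ]


def detect_foreground(predictions, window=101, threshold=0.5):
    """Return 1 for rows with detected foreground speech, else 0."""
    smoothed = median_filter(predictions, window)
    return [1 if value > threshold else 0 for value in smoothed]


# Synthetic predictions: 2 s of background, 3 s of foreground, 2 s of
# background (one row per 10 ms), plus one isolated spurious spike.
predictions = [0.1] * 200 + [0.9] * 300 + [0.1] * 200
predictions[50] = 0.95
mask = detect_foreground(predictions)
print(sum(mask), mask[50])
```

The median filter suppresses the isolated spike at row 50 while preserving the boundaries of the sustained foreground segment. For long recordings, an array-based median filter from a numerical library would be considerably faster than this pure-Python loop.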

For the current data release, we further curated the data by including only a subset of the features collected, omitting filterbank features such as MFCCs and PLPs, as well as LPC features. We believe filterbanks should be released with some form of information obfuscation or encryption, as they contain potentially recoverable language information and pose privacy concerns. We intend to release privacy preserving embeddings on the filterbanks at a later stage. For information on features included in the release, refer to Section Audio.

Inference of days at work

For convenience, we also provide an estimate of working days for all participants. This was obtained using the EMAs, as well as the data collected from the OMsignal garments and a combination of the Jelly phones and Owl-in-One data.

One of the base EMA questions was where the participant currently was (a value equal to 2 indicated currently at work). All of the participants’ responses were saved into a table (each row represented a participant, each column a date). Equivalent tables were saved for days in which participants had recorded data through their OMsignal garments and through the Owl-in-Ones receiving pings from the Jelly phones.

All of this information was combined by performing a logical or operation between the tables. This means that if any of the sources of information regarded a given day as a day spent at work, that day was inferred as a day at work.
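
The combination step can be sketched as a logical OR over per-source tables. In the sketch below, each table is represented as a nested dictionary (participant ID, then date, then a Boolean); this representation, the function name infer_days_at_work, and the participant IDs and dates are hypothetical and chosen only for illustration. Days missing from a source are treated as "not observed at work" for that source.

```python
def infer_days_at_work(*tables):
    """Combine tables of the form {participant_id: {date: bool}} with a logical OR."""
    combined = {}
    for table in tables:
        for participant_id, days in table.items():
            merged = combined.setdefault(participant_id, {})
            for date, at_work in days.items():
                # A day counts as a work day if any source says so.
                merged[date] = merged.get(date, False) or bool(at_work)
    return combined


# Hypothetical per-source tables (EMA answers, OMsignal recordings,
# and Jelly phone pings received by Owl-in-Ones).
ema = {"p01": {"2018-03-05": True, "2018-03-06": False}}
omsignal = {
    "p01": {"2018-03-05": False, "2018-03-06": False},
    "p02": {"2018-03-05": True},
}
owlinone = {"p01": {"2018-03-06": False, "2018-03-07": True}}

days_at_work = infer_days_at_work(ema, omsignal, owlinone)
print(days_at_work)
```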

Data Records

The TILES-2018 data³⁷ is split into two data records: the main data record, and the audio data record. Each data set is described in detail in this section.

TILES-2018 main data record

The main data record comprises several different data streams: fitbit, realizd, omsignal, owlinone, and surveys (following the names of the folders in the record), and a metadata folder. Depending on the kind of data collected, each stream may have subfolders. These are described in the following subsections. A summary of the main data record is presented in Table 9. The total size of the record is about 100 GB (compressed), presented in csv.gz files. The files per participant are named using participants’ hash-based IDs. All dates and times are in Pacific Time (PT), in the format yyyy-mm-dd[Thh:mm:ss[.sss]].
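
As a hedged illustration of how such files might be read, the following Python sketch opens a gzip-compressed CSV and parses timestamps in the stated format using only the standard library. The column names (timestamp, value), the example rows, and the function names are hypothetical; the actual columns of each stream are described in the README files of the data record. The parsed values are left as naive datetimes that should be interpreted as Pacific Time.

```python
import csv
import gzip
import io
from datetime import datetime


def parse_timestamp(text):
    """Parse yyyy-mm-dd[Thh:mm:ss[.sss]] into a naive datetime (Pacific Time)."""
    return datetime.fromisoformat(text)


def read_csv_gz(path_or_file):
    """Read a csv.gz file (path or binary file object) into a list of dicts."""
    with gzip.open(path_or_file, mode="rt", newline="") as handle:
        return list(csv.DictReader(handle))


# Build a small in-memory csv.gz file so the example is self-contained.
buffer = io.BytesIO()
with gzip.open(buffer, mode="wt", newline="") as handle:
    handle.write("timestamp,value\n2018-03-05T07:30:00.250,71\n2018-03-06,\n")
buffer.seek(0)

rows = read_csv_gz(buffer)
stamps = [parse_timestamp(row["timestamp"]) for row in rows]
print(stamps)
```

A date without a time component is parsed as midnight of that day, which is convenient for daily tables but should not be mistaken for a measured time of day.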

Detailed descriptions for all the data sources are included in each folder under a README file.

Participant summary

The participants were 212 hospital employees who volunteered to participate in the study. They enrolled in 1 of 3 waves of participation, each with different start and end dates (Table 2). Most participants (n = 210, 99.1%) worked full time in the current sample. More than half of the participants were Registered Nurses (n = 113, 54.3%), and the rest worked as Certified Nursing Assistants (n = 25, 12.0%), Monitor Technicians (n = 11, 5.3%), Physical Therapists (n = 6, 2.9%), Occupational Therapists (n = 2, 1.0%), Respiratory Therapists (n = 3, 1.4%), and other occupations not listed above (n = 48, 23.1%). The current data were collected from 146 females (the rest were males), and 172 individuals had received a degree higher than a Bachelor’s degree. The age of the participants ranges from 21 to 65, with a median age of 35.

Fitbit (fitbit folder)

Daily-summary folder

Each file has rows with a date and time and a set of daily summaries including resting heart rate, total calories burned, total number of steps, sleep report, and heart rate zone durations. The sleep reports provide information about sleep duration, sleep efficiency, the duration of 4 sleep stages (awake sleep, light sleep, deep sleep, REM sleep), as well as the timestamp of the start and end of the sleep. There are up to four sleep records per day. Moreover, calorie consumption and duration of 4 heart rate zones are available in Fitbit daily summaries.

Heart-rate folder

Each file has rows with a timestamp and PPG heart rate values (beats per minute). The PPG heart rate samples are made available by the Fitbit Charge 2 sensors aggregated over intervals of less than 1 min, but the time differences between two consecutive samples are non-uniform.

Sleep-data folder

Each file has rows with the sleepId it corresponds to in sleep-metadata, a timestamp and the sleep phase with its total duration in seconds. Phase is either in classic (one of asleep, restless, or awake) or stages (one of deep, light, rem, or wake). The timestamp determines the beginning of the sleeping phase.

Sleep-metadata folder

Each file has rows for each period of sleep, and metadata for that sleep, including beginning and end, nap versus main sleep, type of inferred sleep phases (classic or stages), duration, and various metrics.
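Since rows in sleep-data reference sleep-metadata through sleepId, per-phase totals for each sleep period can be obtained by grouping on that key. The sketch below illustrates the idea on synthetic rows. The tuple layout, the `metadata` dictionary, and the `type` key are hypothetical stand-ins for the actual columns.

```python
from collections import defaultdict


def phase_totals(rows):
    """Sum phase durations (in seconds) per sleepId and per phase.

    Each row is (sleep_id, timestamp, phase, duration_seconds);
    this layout is assumed for illustration.
    """
    totals = defaultdict(lambda: defaultdict(int))
    for sleep_id, _timestamp, phase, seconds in rows:
        totals[sleep_id][phase] += seconds
    return {sid: dict(phases) for sid, phases in totals.items()}


# Synthetic sleep-data rows: one "stages" sleep and one "classic" nap.
rows = [
    (101, "2018-03-05T23:10:00", "light", 1800),
    (101, "2018-03-05T23:40:00", "deep", 2400),
    (101, "2018-03-06T00:20:00", "light", 900),
    (102, "2018-03-06T14:00:00", "asleep", 1200),
]
# Hypothetical sleep-metadata lookup keyed by sleepId.
metadata = {101: {"type": "stages"}, 102: {"type": "classic"}}

totals = phase_totals(rows)
for sleep_id, phases in totals.items():
    print(sleep_id, metadata[sleep_id]["type"], phases)
```

Keeping classic and stages records separate matters here because their phase labels are not directly comparable.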

Step-count folder

Each file has rows with a timestamp and step count value. In contrast to heart rate values, step count data is sampled with an interval of 1 min, and reports the number of steps taken within that minute.

Metadata

Days-at-work folder

Contains a file for all participants. The information is presented in four tables (one per stream, plus aggregated) where each participant corresponds to a column and each row is a date in the format yyyy-mm-dd.

Participant-info folder

Contains a single file with hash-based participant IDs, nursing unit(s) (if available, using the same hashing as for the Owl-in-One directories), and kind of shift (day or night). We have also included the dropout date if it exists.

OMsignal (omsignal folder)

ecg folder

Each file has raw, 15 s long electrocardiogram (ECG) snippets sampled at 250 Hz and recorded every 5 min. Each file corresponds to a single participant. Each row belongs to a single recording identified by record_id, which maps to the corresponding row in the metadata subfolder.
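The stated snippet length, sampling rate, and recording period fix the number of samples per snippet and an upper bound on snippets per day. The short sketch below works through that arithmetic and builds a time axis for one snippet. The constant and function names are illustrative, and the daily bound assumes uninterrupted wear.

```python
SAMPLING_RATE_HZ = 250     # stated in the text
SNIPPET_SECONDS = 15       # stated in the text
RECORDING_PERIOD_MIN = 5   # stated in the text

# Number of samples expected in one full snippet.
samples_per_snippet = SAMPLING_RATE_HZ * SNIPPET_SECONDS

# Upper bound on snippets in 24 h, assuming the garment is worn all day.
max_snippets_per_day = 24 * 60 // RECORDING_PERIOD_MIN


def sample_offsets(n_samples=samples_per_snippet, rate_hz=SAMPLING_RATE_HZ):
    """Time offset in seconds of each sample from the start of a snippet."""
    return [i / rate_hz for i in range(n_samples)]


offsets = sample_offsets()
print(samples_per_snippet, max_snippets_per_day, offsets[1])
```

Checking the length of each row against `samples_per_snippet` is a simple way to flag truncated recordings before further processing.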

Features folder

Each file contains rows with a timestamp and a set of physiological and physical activity measurements in real-time (aggregated and saved every second), as well as high-level descriptive features (every 5 min). The real-time measurements include breathing rate, breathing depth, intensity, cadence, heart rate, RR intervals (defined as the time elapsed between two successive R waves of the QRS signal on the ECG³⁸), and step count. The high-level descriptive features include statistical aggregations and derived features of real-time measurements over the 5 min intervals. Examples include the average and standard deviation of the breathing rates as well as posture.

Metadata folder

Contains one file per participant with metadata information such as dates of usage, usage time in hours, and RR coverage (ratio of successive R-wave detections in time over a given time interval) for a given recording.
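One plausible reading of RR coverage is the fraction of a time window that is spanned by detected RR intervals. The sketch below implements that reading only; it is an assumption and may differ from how the value in the metadata files was actually computed. The function name, the millisecond unit, and the clipping at 1.0 are likewise illustrative choices.

```python
def rr_coverage(rr_intervals_ms, window_seconds):
    """Fraction of a window spanned by detected RR intervals.

    Assumes RR intervals are given in milliseconds. This is one
    plausible interpretation of "RR coverage", not a confirmed definition.
    """
    if window_seconds <= 0:
        raise ValueError("window_seconds must be positive")
    covered_seconds = sum(rr_intervals_ms) / 1000.0
    # Clip at 1.0 in case rounding makes the sum exceed the window.
    return min(covered_seconds / window_seconds, 1.0)


# Synthetic example: 15 beats of 800 ms cover 12 s of a 15 s window.
rr = [800] * 15
coverage = rr_coverage(rr, window_seconds=15)
```

Under this reading, a low value indicates that many beats went undetected in the window, which can serve as a filter for unreliable heart rate variability estimates.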

Owl-in-One (owlinone folder)

Owl-in-One data contains information from three different sources: Jelly phones (RSSIs from the Jelly phones of participants), other Owl-in-Ones (RSSIs), and Minew sensors (RSSIs and ambient information).

Jelly folder

The Jelly subfolder is organized with files per participant. Each file contains rows with a timestamp, a participant ID, and the directories of the receiving Owl-in-Ones with corresponding RSSI values.
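A common first use of such rows is coarse proximity estimation: for each timestamp, pick the Owl-in-One that received the phone's signal most strongly. The sketch below shows this on a synthetic row. The dictionary layout and the receiver names are hypothetical, and RSSI values are assumed to be in dBm, where a less negative value means a stronger signal.

```python
def strongest_receiver(rssi_by_owl):
    """Return the receiver with the highest (least negative) RSSI.

    `rssi_by_owl` maps a hashed Owl-in-One directory to an RSSI value,
    assumed to be in dBm. Returns None if there are no receivers.
    """
    if not rssi_by_owl:
        return None
    return max(rssi_by_owl, key=rssi_by_owl.get)


# Synthetic row: three receivers heard the same phone at one timestamp.
row = {"owl_a1": -78, "owl_b2": -61, "owl_c3": -90}
nearest = strongest_receiver(row)
print(nearest)
```

The strongest receiver is only a rough proxy for the nearest one, since walls, bodies, and device orientation all affect RSSI.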

Minew folder

This folder contains three subfolders:

Data folder. Contains one file per device whose timestamped content depends on the type of sensor:
light sensor: yes/no light detection,
motion sensor: acceleration in X, Y, and Z coordinates in m/s²,
temperature and humidity sensor: temperature in °C and relative humidity in %.
Locations folder. Contains a file with X and Y coordinates in m. The origin of the coordinate system (i.e., the point (x, y) = (0, 0)) is arbitrary so that the floor maps of the hospital are not revealed, but the pairwise distances between sensors within the same unit have been preserved.
Rssi folder. This folder has one file per Minew sensor. Each file contains rows with a timestamp (sorted), the hashed directory of the receiving Owl-in-One, and the corresponding RSSI value.
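The Locations folder relies on the fact that moving the origin does not change distances between sensors. The sketch below checks this on synthetic coordinates by comparing pairwise Euclidean distances before and after a translation. The sensor names and coordinates are made up for illustration.

```python
from itertools import combinations
from math import dist, isclose


def pairwise_distances(locations):
    """Euclidean distance in metres between every pair of sensors."""
    return {
        (a, b): dist(locations[a], locations[b])
        for a, b in combinations(sorted(locations), 2)
    }


# Synthetic sensor coordinates in metres (not from the data set).
locations = {"s1": (0.0, 0.0), "s2": (3.0, 4.0), "s3": (6.0, 8.0)}

# Shift the origin by an arbitrary offset, as an anonymisation step might.
shifted = {name: (x + 100.0, y - 50.0) for name, (x, y) in locations.items()}

original = pairwise_distances(locations)
moved = pairwise_distances(shifted)
unchanged = all(isclose(original[pair], moved[pair]) for pair in original)
```

Distance-based analyses are therefore unaffected by the hidden origin, whereas absolute positions carry no meaning.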

Owls folder

This folder contains two subfolders:

Locations folder. Contains a file with X and Y coordinates in m from the same arbitrary origin as the Minew locations.
Rssi folder. The Owl-in-One files are organized by Unix time days (meaning that the cutoff is midnight UTC). These files each contain all of the signals sent by Owl-in-Ones and received by them. Sending and receiving MAC addresses have been included, together with the sender’s and receiver’s associated directories.
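Organizing files by Unix time days means that a timestamp belongs to the day given by integer division of its epoch seconds by 86,400. The sketch below shows this mapping and its inverse. The function names are illustrative, and how the day index appears in the actual file names is not assumed here.

```python
from datetime import datetime, timezone

SECONDS_PER_DAY = 86_400


def unix_day(epoch_seconds):
    """Index of the UTC day (days since 1970-01-01) containing a timestamp."""
    return int(epoch_seconds // SECONDS_PER_DAY)


def unix_day_start(day_index):
    """UTC datetime at which the given Unix day begins (midnight UTC)."""
    return datetime.fromtimestamp(day_index * SECONDS_PER_DAY, tz=timezone.utc)


# One second before and exactly at midnight UTC fall on different Unix days.
before = datetime(2018, 3, 5, 23, 59, 59, tzinfo=timezone.utc).timestamp()
after = before + 1

print(unix_day(before), unix_day(after))
print(unix_day_start(unix_day(after)).isoformat())
```

Because the cutoff is midnight UTC rather than local midnight, a single work shift can be split across two consecutive files.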

RealizD (realizd folder)

Each RealizD file describes the interaction that participants have with their smartphones. These files include a column with timestamps for initial interaction times and a column with times in seconds corresponding to the duration of the interaction.
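With a start time and a duration per interaction, the end time and daily totals of phone use follow directly. The sketch below illustrates both on synthetic rows. The row layout (ISO 8601 start string, duration in seconds) and the function names are assumptions, and each interaction is attributed to the date on which it starts, which is a simplification for sessions that cross midnight.

```python
from collections import defaultdict
from datetime import datetime, timedelta


def interaction_end(start_iso, duration_seconds):
    """End time of an interaction given its start and duration."""
    return datetime.fromisoformat(start_iso) + timedelta(seconds=duration_seconds)


def daily_usage_seconds(rows):
    """Total interaction time per calendar date, keyed by start date."""
    totals = defaultdict(float)
    for start_iso, seconds in rows:
        totals[datetime.fromisoformat(start_iso).date()] += seconds
    return dict(totals)


# Synthetic interactions (not from the data set).
rows = [
    ("2018-03-05T08:15:00", 45),
    ("2018-03-05T12:30:10", 120),
    ("2018-03-06T09:00:00", 30),
]
usage = daily_usage_seconds(rows)
end = interaction_end(*rows[0])
```

The same grouping can be done per hour or per shift by changing the key used in `daily_usage_seconds`.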