TILES-2018: A longitudinal physiologic and behavioral data set of hospital workers

We present a novel longitudinal multimodal corpus of physiological and behavioral data collected from direct clinical providers in a hospital workplace. We designed the study to investigate the use of off-the-shelf wearable and environmental sensors to understand individual-specific constructs such as job performance, interpersonal interaction, and well-being of hospital workers over time in their natural day-to-day job settings. We collected behavioral and physiological data from $n = 212$ participants through Internet-of-Things Bluetooth data hubs, wearable sensors (including a wristband, a biometrics-tracking garment, a smartphone, and an audio-feature recorder), together with a battery of surveys to assess personality traits, behavioral states, job performance, and well-being over time. Besides the default use of the data set, we envision several novel research opportunities and potential applications, including multi-modal and multi-task behavioral modeling, authentication through biometrics, and privacy-aware and privacy-preserving machine learning.


Background & Summary
Maintaining a healthy, productive workforce is an increasingly challenging problem in a complex and frenzied world. Optimal job performance relies on worker wellness, and as organizations strive to prepare their workforce for the evolving demands, worker wellness is increasingly important. Current standards are based on cross-sectional assessment of employee characteristics, often in controlled testing conditions that cannot account for the dynamic nature of working environments and employee performance and are therefore poorly suited for this task [1]. Fortunately, today's densely instrumented world offers tremendous opportunities for unobtrusive and persistent acquisition and analysis of diverse, information-rich time-series data that provide a multi-modal, spatio-temporal characterization of individuals' actions in, and of, the environment within which they operate. However, the connection between individual and group performance, well-being, and quantitative measurements from sensor data has not been established for such dynamic environments in the wild.
To connect job performance-related and well-being-related constructs through self-assessments with data from sensors, we frame well-being and performance within the overarching notion of psychological flexibility. Psychological flexibility refers to an individual's capacity to maintain fluid awareness and acceptance of current circumstances and, depending upon available opportunities, take effective action even when experiencing difficult or unwanted thoughts, emotions, and sensations [2]. Psychological flexibility is defined as a primary individual determinant of behavioral effectiveness and well-being [3]. It has been shown that, in the workplace, the degree to which employees are psychologically flexible can have a profound effect on their productivity, well-being, and success [4,5]. Moreover, the connection between sensor measurements and mental states put forth by the Somatic Marker Hypothesis [6] suggests that the physiological status of our body (i.e., the somatic marker) is an indispensable part of our cognition and emotion, which are building blocks of our mental states. The purpose of our research is to connect psychological flexibility, job-performance, and well-being with somatic and bio-behavioral markers using an in situ experimental study in a real world workplace.
The TILES-2018 (Tracking IndividuaL performancE with Sensors, year 2018) data set comes from a prospective longitudinal study using intensive multimodal assessment of workers and their environments aimed to understand the dynamic relationships among individual differences, work and wellness behaviors, and the contexts in which they occur. It aims to support developing and validating sensor-based methods for evaluating worker wellness and job performance over time. To achieve this, we partnered with University of Southern California's Keck Hospital to directly observe 212 workers who volunteered to participate in the study over a 10-week period both at work and outside of work. Biobehavioral data were captured continuously and passively throughout the study via wearable devices (including a wristband, a smart undergarment, a clip-on audio-features recorder and Bluetooth-enabled badge, and personal smartphones). These data streams were matched with environmental and behavioral data streams from Internet-of-Things devices and applications logging personal smartphone usage. To map sensor data to constructs of interest, participants also completed an initial battery of online surveys and daily surveys designed to assess individual difference variables (e.g., personality, intelligence, socioeconomic status), psychological states and traits (e.g., positive and negative affect, anxiety, stress, fatigue, psychological flexibility, psychological capital), health and wellness (e.g., sleep, physical activity, cardio exercise, tobacco and alcohol use, health-related quality of life, life satisfaction), and work behaviors (e.g., task performance, organizational citizenship behavior, counterproductive work behavior, work engagement, perceived support and stressors).
This data set provides a unique opportunity for researchers interested in organizational psychology or data sciences in general to perform exploratory and hypothesis-driven investigations regarding the complex, dynamic nature of worker wellness and performance over time. It is also of interest to signal processing, machine learning, and privacy researchers due to the thousands of hours of sensor data collected across participants in natural real-world "in the wild" settings, that can be used to study and extend current multimodal signal processing methods, perform machine learning inference on psychological states and traits, and study and develop new methods to protect the privacy of users without hindering the richness of such a data set. Unique strengths of this data set include a rich set of self-assessed psychological constructs coupled with multimodal sensor data, all captured in the wild throughout ten weeks, with high compliance rates of the participants.
The data are available through two records: the Main Data Record, and the Audio Data Record, available at https://tiles-data.isi.edu/.

Methods
In this section, we describe the materials and procedures followed to collect the data. An overview of the study is shown in Figure 1.

Location
The data collection took place at the University of Southern California's (USC) Keck Hospital in Los Angeles, California, in the United States. USC Keck Hospital is an acute care hospital with 401 patient beds throughout 16 nursing units [7]. It is located within USC's Health Sciences Campus.

Materials
This section describes the materials employed in the data collection: Surveys, sensors, and phone applications (apps).

Surveys
To get an understanding of the participants' mental states, traits, and physical and emotional well-being, they were asked to take different surveys throughout the study; these also serve as targets (or labels) for statistical modeling. At the beginning of the study, participants were asked to answer a baseline survey over two different sessions, assessing constructs related to job performance, cognitive abilities, and health. Throughout the data collection, participants answered daily Ecological Momentary Assessments (EMAs) for health, job performance, personality, psychological flexibility, and psychological capital. After the conclusion of the sensor data collection, participants completed a post-study survey. These surveys are described in the following sections, where labels in parenthesis correspond to how the measures are referred to in the data set.
Baseline Survey Due to the length of the baseline survey, it was split into two different sessions. The first part of the baseline survey was administered at the study enrollment session (described in the Study Procedures section), and assessed demographics and a number of cognitive and psychological constructs pertaining to job performance, cognitive ability, and health. Later, and before the start of the sensor-based data collection, participants answered the second part of the baseline survey at home (or another place of their choice). This survey assessed demographics, health, satisfaction with life, perceived stress, psychological flexibility, work acceptance and action, work engagement, psychological capital, and challenge and hindrance stressors (measures were administered in the above order).
We next describe the scales assessed in the first part of the baseline survey. We give brief descriptions herein (obtained from [8], Table 1) and refer readers to the design and rationale behind this survey in the same document.
• Demographics (DEMO): Participants completed a brief demographics survey which assessed sex, age, place of birth, English as the native language, education level, and job-related demographics (e.g., full-time or part-time, industry, tenure in the organization, and income).
• Cognitive Ability: It was measured using two different scales:
-Fluid Intelligence (ABS): Consists of 25 open-ended text entry items, and is scored by summing the correct responses, for a range between 0 and 25.
-Crystallized Intelligence (VOCAB): Consists of 40 multiple choice items, with 4 response options each. It is scored by adding the total number of correct answers, for a range between 0 and 40.
• Tobacco Use (GATS): It was measured using a shortened version of the Global Adult Tobacco Survey, which consists of 3 items in the form of yes/no and quantity questions. The computed scores are tobacco status (never, past, current smoker) and a GATS score, computed by adding the tobacco units used in the past week (which is ≥ 0).
• Alcohol Use (AUDIT): It was measured using The Alcohol Use Disorders Identification Test [9], which consists of 10 items with yes/no, quantity, and frequency questions. It is scored according to AUDIT instrument scoring guidelines, for a total score in the range 0 to 40.
• Sleep (PSQI): It was measured using the Pittsburgh Sleep Quality Index, consisting of 29 items with open-ended response formats as well as structured questions with categorical outcome options. The score is an aggregate sleep quality score, computed according to the PSQI instrument scoring guidelines, for a total score in the range 0-21.
• Physical Activity (IPAQ): It was measured using the International Physical Activity Questionnaire, which consists of 27 items of the form yes/no, quantity, and frequency questions. The score is computed using total standardized MET-minutes reported for the prior 7-day period, which is ≥ 0.
• Counter-productive Work Behavior (IOD): It was measured with the Interpersonal and Organization Deviance scale (IOD) [10]. It consists of a total of 19 items, separated into two subsets. Each item is a frequency scale ranging from 1 (never) to 7 (daily).
-Interpersonal Deviance (IOD_ID): Consists of 7 items. The score is computed by adding the responses, for a total score in the range 7 to 49.
-Organizational Deviance (IOD_OD): Consists of 12 items. The score is computed by adding the responses, for a total score in the range 12 to 84.
• Organizational Citizenship Behavior (OCB): Measured using the OCB Checklist (OCB-C) [11]. It consists of 20 items, each being a frequency scale ranging from 1 (never) to 5 (every day). The score is computed by adding all the responses, for a total score between 20 and 100.
• Task Performance was assessed using two different measures:
-In-Role Behavior (IRB) [12]: Consists of 7 items, each being a Likert scale ranging from 1 (strongly disagree) to 7 (strongly agree). A score is obtained by adding all the responses, for a total score between 7 and 49.
-Individual Task Proficiency (ITP) [13]: Consists of 3 items, each being a Likert scale ranging from 1 (very little) to 5 (a great deal). A score is obtained by averaging all the responses, for a total score between 1 and 5.
• Personality (BFI-2): It was measured using the Big Five Inventory-2 [14]. It consists of 60 items, each being a Likert scale ranging from 1 (disagree strongly) to 5 (agree strongly). Five different scores are computed, all in a range between 1 and 5:
-Negative Emotionality (neuroticism): Scored by averaging all the negative emotionality responses.
-Conscientiousness: Scored by averaging all the conscientiousness responses.
-Extraversion: Scored by averaging all the extraversion responses.
-Agreeableness: Scored by averaging all the agreeableness responses.
-Open-Mindedness: Scored by averaging all the open-mindedness responses.
• Affect (PANAS): It was measured using the Positive and Negative Affect Schedule-Expanded Form [15]. It consists of 60 items, each being a Likert scale ranging from 1 (very slightly or not at all) to 5 (extremely). Two different scores were computed, with scores in the range 10 to 50:
-Positive Affect (POSAFFECT): Score is obtained by adding the positive responses.
-Negative Affect (NEGAFFECT): Score is obtained by adding the negative responses.
• Anxiety (STAI): It was measured using the State Trait Anxiety Inventory [16]. It consists of 20 items, each being a frequency scale ranging from 1 (almost never) to 4 (almost always). It is scored by summing the responses, obtaining a value in the range 20 to 80.
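The sum-based and mean-based scoring rules listed above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' scoring code: the `SCALES` table and the function names are hypothetical, the item counts and response ranges are copied from the descriptions above, and any reverse-keyed items that the original instruments may contain are deliberately ignored.

```python
from statistics import mean

# (number of items, minimum response, maximum response, aggregation),
# copied from the scale descriptions in the text. Hypothetical table layout.
SCALES = {
    "IOD_ID": (7, 1, 7, "sum"),
    "IOD_OD": (12, 1, 7, "sum"),
    "OCB": (20, 1, 5, "sum"),
    "IRB": (7, 1, 7, "sum"),
    "ITP": (3, 1, 5, "mean"),
    "STAI": (20, 1, 4, "sum"),
}


def score_scale(name, responses):
    """Score one scale from a list of integer item responses.

    Assumption: no item is reverse-keyed; real instruments may require
    reverse coding before aggregation.
    """
    n_items, lo, hi, how = SCALES[name]
    if len(responses) != n_items:
        raise ValueError(f"{name} expects {n_items} items, got {len(responses)}")
    if any(r < lo or r > hi for r in responses):
        raise ValueError(f"{name} responses must lie in [{lo}, {hi}]")
    return sum(responses) if how == "sum" else mean(responses)


def score_range(name):
    """Return the (minimum, maximum) attainable score for a scale."""
    n_items, lo, hi, how = SCALES[name]
    return (n_items * lo, n_items * hi) if how == "sum" else (lo, hi)
```

Calling `score_range` on each entry reproduces the score ranges quoted in the list, which is a quick consistency check when loading the survey files.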
The following scales correspond to the second part of the baseline survey, and were assessed on a take-home questionnaire. We include a description of each measurement and a brief rationale.
• Demographics (DEMO): Additional demographics assessed several basic characteristics of participants. Specifically, they were asked about race, marital status, pregnancy, number of children living with participants, and housing situation (e.g., rent or own). It also assessed characteristics more germane to the particular sample at hand. These included the position the participant currently held at the hospital from which they were recruited, the specific certifications they had (e.g., nurse practitioner), years in the profession, the shift they worked (e.g., day or night), the number of hours worked at the organization from which they were recruited, and the amount of overtime worked. In addition, participants were asked about the length of their commute, the mode of transportation used in their commute, and whether they had another job outside of the one from which they were recruited and, if so, how many hours they worked there. Lastly, they were asked whether they were currently a student, and about their gender, age, place of birth, English as the native language, education level, and job-related demographics (e.g., full-time or part-time, industry, tenure in the organization, and income).
• Health (RAND): Health was measured using the Rand Health Survey-Short form [17]. This assesses eight health domains through 36 self-report items. These domains included physical function, role limitations due to physical health, role limitations due to personal or emotional problems, general mental health, social functioning, bodily pain, general health perceptions, and energy/fatigue. This measure also includes one scale that assesses perceived change in health. Scores are obtained by computing the mean of the items that are associated with each of the domains listed above.
• Life Satisfaction (SWLS): The Satisfaction with Life Scale [18] is a 5-item measure that aims to assess participants' general satisfaction with life. Participants rate the degree to which they agree with each statement on a scale of 1 (strongly disagree) to 7 (strongly agree). A total score is obtained by taking the average of the 5 items.
• Perceived Stress (PSS): The Perceived Stress Scale [19] is a 10-item scale that aims to assess how often one has experienced stress in the last month. Participants are asked to rate the frequency with which they experience perceived stress on a scale of 0 (never) to 4 (very often). After reverse coding the necessary items, a total score is obtained by taking the average of the 10 items.
-Psychological Inflexibility (PI): The inflexibility sub-scales include experiential avoidance, lack of contact with the present moment, self as content, fusion, lack of contact with values, and inaction.
Items on this measure ask participants to think about the last two weeks and to rate the frequency with which they experience the feelings described in each item. PF, PI, and their sub-dimensions are scored by taking the mean of the items that comprise each scale or sub-dimension.
• Work Related Acceptance (WAAQ): Additionally, psychological flexibility as related to work was measured by the 7-item Work-related Acceptance and Action Questionnaire [21]. The WAAQ presents a statement and participants rate the degree to which each statement is true on a scale from 1 (never true) to 7 (always true). The WAAQ is scored by taking the mean of the items.
9-items and participants rate the frequency with which they have experienced the feeling described on a scale from 0 (never) to 6 (always). Then scores are averaged to obtain a total score. There are three sub-scales: vigor, dedication, and absorption.
• Psychological Capital (PCQ): It can be thought of as a higher-order construct that comprises hope, self-efficacy, resilience, and optimism [23]. It is assessed with the 12-item version of the Psychological Capital Questionnaire [24]. The PCQ asks participants to rate the degree to which they agree with each statement on a 6-point scale from 1 (strongly disagree) to 6 (strongly agree).
• Challenge and Hindrance Stressors (CHSS): Challenge and hindrance stressors are measured using a 16-item measure in which participants are presented with a statement and asked to rate their degree of agreement or disagreement with it [25]. 8 items were used to measure challenge stressors and 8 items were used to measure hindrance stressors. Total scores are calculated by computing the mean over all hindrance stressor items and, separately, the mean over all challenge stressor items.
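Several of the measures above are scored as item means, and the Perceived Stress Scale additionally requires reverse coding. The sketch below illustrates that step under stated assumptions: the function names are hypothetical, and the default set of reverse-coded items (4, 5, 7, and 8) is the convention commonly used for the 10-item PSS rather than something specified in this document, so it should be checked against the instrument's scoring guidelines.

```python
def reverse_code(response, lo, hi):
    """Mirror a response within its scale, e.g. 0 <-> 4 on a 0-4 scale."""
    return lo + hi - response


def score_pss(responses, reversed_items=(4, 5, 7, 8)):
    """Mean of the 10 PSS items (rated 0-4) after reverse coding.

    `reversed_items` uses 1-based item numbers. The default is an
    assumption based on the common PSS-10 convention, not on this text.
    """
    if len(responses) != 10:
        raise ValueError(f"PSS expects 10 items, got {len(responses)}")
    if any(r < 0 or r > 4 for r in responses):
        raise ValueError("PSS responses must lie in [0, 4]")
    adjusted = [
        reverse_code(r, 0, 4) if item in reversed_items else r
        for item, r in enumerate(responses, start=1)
    ]
    return sum(adjusted) / len(adjusted)
```

Because reverse coding mirrors a response around the scale midpoint, a participant who answers 2 on every item scores 2.0 regardless of which items are reversed.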

Ecological Momentary Assessments
The Ecological Momentary Assessments (EMAs) were received twice a day by participants and were divided into two groups. Note that some scales have a "D" appended to their name compared to the baseline survey to denote its daily version. A first group of EMAs assessed job-related variables, health-related variables, and personality. The job-related questions were asked a total of 31 times during the study (every two days), the health-related questions were asked 35 times during the study (every two days), and the personality-related questions were asked 5 times during the length of the study (every two weeks), with a total of 71 surveys administered over the 10 weeks of the study. Participants received one of these surveys daily. The job, health, and personality surveys were sent either at 6am, noon, or 6pm, and expired 4 hours after they were sent.
Another group of EMAs assessed psychological flexibility and psychological capital. The psychological flexibility form was sent to participants a total of 50 times over the ten weeks (5 times per week), whereas the psychological capital form was received a total of 20 times throughout the same period (2 times per week). Participants received one of these surveys daily. The psychological flexibility and psychological capital EMAs were sent uniformly at random to day shift participants between 11am and 6pm, and between 11pm and 6am for night shift participants. They expired 6 hours after their delivery.
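The survey counts and expiry windows described above can be checked with a short sketch. The dictionary layout, function names, and the example date are illustrative assumptions; only the counts (31, 35, 5, 50, 20) and the expiry windows (4 and 6 hours) come from the text.

```python
from datetime import datetime, timedelta

# Number of times each EMA form was sent over the 10-week study (from the text).
FIRST_GROUP = {"job": 31, "health": 35, "personality": 5}
SECOND_GROUP = {"psychological_flexibility": 50, "psychological_capital": 20}

# Hours until a delivered survey expired, per group (from the text).
EXPIRY_HOURS = {"first": 4, "second": 6}


def total_surveys(group):
    """Total number of surveys sent for one group of EMAs."""
    return sum(group.values())


def expiry_time(sent_at, group):
    """Time at which a survey sent at `sent_at` stops accepting answers."""
    return sent_at + timedelta(hours=EXPIRY_HOURS[group])


if __name__ == "__main__":
    print(total_surveys(FIRST_GROUP), total_surveys(SECOND_GROUP))
    # Hypothetical delivery at 6pm on an arbitrary date.
    print(expiry_time(datetime(2018, 3, 5, 18, 0), "first"))
```

The first group sums to the 71 surveys stated above, and the second group to 70, which matches one survey per day over ten weeks (5 + 2 per week).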
The surveys were implemented using ResearchKit for iOS and ResearchStack for Android (through the TILES app described in Section Phone apps).
The following items were asked of participants daily during the ten weeks and were presented at the beginning of each job, health, and personality EMA (base daily survey).
• Context measures (CONTEXT): These were 4 context questions. The first question asked participants about interactions with other people and the communications channel. The second question asked about the activity in which they were engaged when they received the survey. The third question asked for current location, and the fourth question asked whether any atypical events had occurred.
• Stress (STRESSD): Stress was measured daily using a single item that read, "Overall, how would you rate your current level of stress?".
• Anxiety (ANXIETY): Anxiety was assessed daily using a single item which asked, "Please select the response that shows how anxious you feel at the moment".
• Affect (PAND): Participants' positive and negative affect were assessed daily using the 10 items from PANAS-Short [26]. 5 items were used to assess negative affect and 5 items were used to assess positive affect.
The purpose of the Job Performance Survey was to assess participants' perceived job performance, and included the following measurements:
• Work today (WORK): Prior to completing the job performance survey, participants were asked if they had worked 1 or more hours on that day. If participants answered no, they were not shown the job performance survey.
• Task performance (ITPD, IRBD): Was measured using the same items that were used in the baseline survey described previously.
The purpose of the Health Survey was to assess a number of health-related variables:
• Sleep (SLEEPD): Sleep was assessed with a single item that asked participants to specify the number of hours they slept the previous night. Participants were instructed not to confuse this with the number of hours spent in bed.
• Physical Activity (EX): Physical activity was measured using two questions. The first asked participants to specify the number of minutes of vigorous activity they engaged in yesterday (e.g., sprinting, power lifting). The second asked participants how many minutes they spent the previous day engaging in moderate physical activity (e.g., jogging, biking).
• Tobacco Use (TOB): Tobacco use was measured using two items. The first asked whether the participant used a tobacco product yesterday and if so, a follow-up question was presented which probed how many times tobacco products were used and what type of product was used.
• Alcohol Use (ALC): Alcohol use was assessed using 2 items. The first asked whether participants consumed any alcohol yesterday and if they responded yes, they received a question that asked to specify how many beers, wines and spirits they consumed.
The purpose of the Personality Survey was to assess personality:
• Personality (BFID): The personality survey uses BFI-10 (shortened version of the BFI-2 used in the baseline survey previously described).
The Psychological Flexibility Survey included context questions and measures of psychological flexibility:
• Context Question (Activity): The first question asked participants to select from a list the type of activity in which they were engaged immediately before beginning the survey. Example options included travel or commuting, eating and/or drinking, work, and work-related activities. Participants could also respond "other" and specify in text what they were doing.
• Context Question (Experience): These items assessed experiences (both pleasant and unpleasant). The question was provided as a checklist (for positive and negative experiences), such as "Difficult thoughts of memories", "Pleasant physical sensations", "Difficult urges or cravings".
• Psychological Flexibility (PF): 13 items were included to assess psychological flexibility [2]. Items of the PF survey are divided into 3 sub-scales. Participants were asked to report how true each statement was about themselves during the last 4 hours. They rated each statement on a scale of 1 (Never) to 5 (Always). The mean was calculated for all items in each sub-scale for a total score. This scale was created for this study.
The Engagement/Psychological Capital Survey assessed context (base daily survey), engagement, psychological capital, and challenge and hindrance stressors. It comprises items that are neither stigmatizing nor pathologizing, and that have demonstrated large effect sizes on significant outcomes (e.g., employee health and well-being, job performance, job retention and turn-over) [28].
• Context questions (Activity): The first question asked participants where they were, and participants selected from a list (e.g., work, home, outdoors, etc.). The second question was the same as the first question participants answered in the context questions for the psychological flexibility questionnaire.
• Engagement (Engage): Participants completed a 3-item measure of work engagement [29]. Participants were asked to think about the activity they had just reported doing and how they felt while engaging in that activity. Statements were rated on a scale of 1 (not at all) to 7 (very much). A mean of the 3 items was computed to create a total score.
• Psychological Capital (Psycap): It was measured using 12 items from CPC-12 [30]. Participants were instructed to rate each statement based on how much they agreed with it. Items were rated on a scale of 1 (not at all) to 7 (very much). The mean for all 12 items was used to compute the total score.
• Interpersonal Support (IS): A subset of 3 items from [31] was used to assess daily job resources.
• Challenge/Hindrance Stressors (CS, HS): A subset of 8 items from the baseline survey measure of Challenge/Hindrance Stressors was used, 4 items to measure each type of stressor [25]. Participants were instructed to consider the degree to which they agreed with each statement based on the last day that they had worked, including the day on which they completed the survey. Items were rated on a scale of 1 (not at all) to 7 (very much).
Post-study Survey The Post-study survey is equivalent to the take-home part of the baseline survey, except for not including demographics.

Sensing Devices
The initial goal of the study was to predict self-assessed psychological constructs (obtained through surveys) from sensor data. To this end, we selected a set of wearable and environment-sensing devices to obtain physiological and behavioral information from participants. Table 1 summarizes the sensors worn by participants and their intended use throughout the study. Details on the sensor selection can be found in [32].
Wearable Sensors Participants were instructed to wear a Fitbit Charge 2 wristband [33] at all times throughout the duration of the study. Furthermore, at work, they were asked to wear an OMsignal smart garment [34] (a T-shirt for men and a sports bra for women) and a Unihertz Jelly Pro smartphone [35] (Jelly phone, for short) as a lapel microphone ("audio badge"). The Jelly phone was programmed to obtain audio features from the raw audio (which was discarded) [36]. In parallel, the Jelly phones sent one Bluetooth packet per second during a 15 s window every minute, so that their locations within the building/workplace could be estimated. These packets carried a unique 4-byte identifier for every participant.
Environmental Sensors There were two kinds of environmental sensors: Owl-in-One [37] Bluetooth data hubs and Minew sensors [38]. The Owl-in-Ones were used to measure participant proximity by capturing the signal strength of Bluetooth packets from the Jelly phones that participants wore in the hospital and to collect environmental data sent over Bluetooth by Minew sensors. The Owl-in-Ones were installed in fourteen nursing units (spread over seven of the building floors) and two hospital labs. A total of 244 Owl-in-Ones were installed, about 1.5 m to 2.0 m above the floor depending on space availability on wall areas near power outlets. Each nursing unit was equipped with an Owl-in-One sensor in these four room types: patient room, nursing station, lounge, and medication room. These different rooms were selected after observing the behavioral patterns of nurses during their shifts (by talking to nursing directors of Keck Hospital and shadowing nurses throughout a workday). Each Owl-in-One was labelled with the study logo, and the phrase "This is a data hub for the TILES study. For more information, please visit https://sail.usc.edu/tiles".
In the nursing units, one Owl-in-One was installed in every other patient room, one in every medication room, one in every lounge, and between one and four in nursing stations, depending on the size, layout, and availability of power outlets. In the hospital labs, one Owl-in-One was installed in every lounge, and at least one in each major room (e.g., blood lab, micro-bio lab, shipping/receiving, patient lobby, etc.) depending on the room size and power outlet availability. Figure 2 shows an example of Owl-in-One placements in a nursing unit.
The Owl-in-Ones also captured packets sent by the Minew sensors collecting (door) motion information, humidity, temperature, and light information. Two light (E6) and temperature/humidity (S1) Minew sensors were installed in each nursing unit and each laboratory. These sensors were placed in open areas near the main hallways and within one foot of an Owl-in-One sensor. In the nursing units, one pair of E6/S1 sensors was installed in each of the nursing stations nearest to and farthest from the unit entrance. In the labs, one pair was located near the lab entrance and the other in a frequently occupied open room away from the entrance. Minew motion sensors (E8) were placed on the top outer corner of doors and captured information pertaining to foot traffic through the doorway. One motion sensor was placed on each medication room door in the nursing units. No sensor was placed on the lounge room doors because they remained open at all times, and none were placed on the unit entrance/exit doors due to fire safety restrictions. In the labs, one motion sensor was placed on the main entrance door and one on the lounge door. A total of 52 motion sensors, 63 light sensors, and 63 temperature/humidity sensors were installed throughout the hospital.
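As described above, the Owl-in-Ones record the signal strength of the Bluetooth packets sent by each participant's Jelly phone. One simple way to turn such readings into a coarse location estimate is to pick, for every minute, the hub with the strongest average signal. The sketch below is an illustration only and is not the method used by the study: the `(unix_seconds, hub_id, rssi_dbm)` record layout and the function name are hypothetical, and it assumes that a higher (less negative) signal strength indicates a closer hub.

```python
from collections import defaultdict
from statistics import mean


def nearest_hub_per_minute(readings):
    """Estimate the closest hub for each minute of one participant's data.

    `readings` is an iterable of (unix_seconds, hub_id, rssi_dbm) tuples.
    This layout is a hypothetical example, not the data set's file format.
    Returns {minute_start_in_unix_seconds: hub_id}.
    """
    by_minute = defaultdict(lambda: defaultdict(list))
    for ts, hub, rssi in readings:
        # Group readings by the start of the minute in which they arrived.
        by_minute[ts - ts % 60][hub].append(rssi)
    return {
        minute: max(hubs, key=lambda h: mean(hubs[h]))
        for minute, hubs in sorted(by_minute.items())
    }
```

Averaging within the minute smooths over individual noisy packets; since packets are sent only during a 15 s window each minute, a one-minute bin is the finest resolution that is guaranteed to contain data.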

Phone apps
Several phone apps were installed, with informed consent, on the participants' personal smartphones, for interaction with sensors, data uploading, to receive surveys, and to communicate with the research team.

Table 1: Selected sensors with a summary of measurements (output) and instructed use or sensing times. The first three sensing streams were obtained directly from participants through wearable sensors and apps installed in their personal smartphones. All surveys were obtained by direct input of participants on their personal smartphones or a web browser. The last four sensing streams were obtained by placing sensors in the hospital. PPG: photoplethysmography, ECG: electrocardiography.
TILES app This app was custom-developed for the TILES study and was used both for data collection and for communication with participants throughout the enrollment and data collection periods. It is available for both Android and iOS (see Section Code availability for details). The EMAs were administered via the TILES app. Participants received a push notification when an EMA was delivered and again thirty minutes before it expired if it had not yet been completed. Bi-directional communication was enabled via the TILES app as well. Participants could contact the research team at any time through the Contact Info tab. The app also contained a Frequently Asked Questions (FAQs) page which was updated in real time during the study as common questions were identified. In return, participants were notified via push notifications, and via the activity feed within the app, of any non-compliance and were reminded to sync each device with its companion app.
Fitbit app The Fitbit app is a third party app that was used to pair the Fitbit wristband with each participant's personal smartphone using Bluetooth. Participants could visualize the data collected through their Fitbit wristband in this app, and could sync their data with Fitbit's servers.
OMsignal app The OMsignal app is a third party app that was used to start and stop the recording of the OMsignal garments, update the firmware of OMsignal garments if necessary, and sync the data to OMsignal's servers.
RealizD app RealizD is a third-party smartphone application 2 for iOS and Android that records screen-on time and phone pickups. Data reported by RealizD take the form of a timestamp marking the start of a screen-on session and the duration of that session in seconds.

Study Procedures
In this section we describe the mechanisms through which participants were deemed eligible and later recruited and enrolled in the study. We also describe the data collection process. All these steps were conducted in accordance with USC's Health Sciences Campus Institutional Review Board (IRB) approval. We present an overview of the study in Figure 1.

Requirements for eligibility
All volunteer participants were recruited from the University of Southern California's (USC) Keck Hospital. To participate, subjects were required to (a) be employed by the hospital and work, on average, at least 20 hours a week, (b) have exclusive access to an internet- and Bluetooth-enabled mobile phone running Android 4.3 or higher or iOS 8 or higher for the 10 weeks of participation, (c) have exclusive access to a personal e-mail account for the 10 weeks of participation, (d) have access to WiFi at home for the duration of the 10-week study, (e) be proficient in both speaking and reading English, and (f) be capable of wearing wearable sensors in a way that allows data to be collected and transmitted to the research team.

Recruitment
Participants were recruited using multiple methods, including (a) e-mails to employees from leaders within Keck Hospital informing them about the study and how to sign up, (b) attending employee meetings to inform employees about the study, (c) posting flyers in different parts of the hospital where employees would be likely to see them, and (d) information tables set up in the cafeteria, where potential participants could learn more about the study and sign up. Participants who had indicated interest but had not completed the sign-up process were texted by one of the principal investigators to support completion of the sign-up process.
After completing a screening questionnaire to check eligibility, potential participants were sent a text message with a link to download the TILES app. The TILES app then walked them through identity verification, informed consent, downloading and syncing the necessary additional apps, and finally signing up for an in-person enrollment session.
Through the above methods, 365 individuals indicated interest in participating by completing a brief screening questionnaire and were found to be eligible. Of these 365 individuals, 212 participants provided their consent to participate in the study. Participants were recruited in three waves, each with different start and end dates. Table 2 summarizes the dates and number of participants per wave. Over the course of the study, eight participants chose to drop out, due to various reasons, such as a sensor becoming uncomfortable or no longer wanting to receive daily surveys.

Participant enrollment session
After providing their consent to participate, interested individuals signed up for a two-hour in-person enrollment session at the hospital through the TILES app. Upon arrival at the enrollment session, each participant was assigned a unique participant ID. During the first hour, participants completed part I of the baseline survey under the supervision of a trained research team member. During the second hour, participants received their package of wearable sensors and instructions for use 3. Each participant received three wearable sensors along with a USB charging hub and two micro USB cables to streamline charging the sensors.
Participants were instructed to wear three sensors (a Fitbit Charge 2, an OMsignal garment, and a Unihertz Jelly Pro smartphone) that collected physiological and behavioral data over a 10-week period. We describe the instructions given to participants in the following paragraphs. Table 1 shows a list of the sensing streams and their instructed use.
Daily Surveys Participants were informed that, from the first day of data collection, they would receive one text message on each day they were enrolled in the study. The text message contained a link to the job, health, or personality EMAs that they were expected to complete that day. Participants were instructed to complete the survey as soon as safely possible once they received the text message. A second daily EMA with psychological flexibility or capital surveys was received via a push notification on the participant's phone and contained similar instructions.
The EMAs took no more than 15 minutes to complete, and on most days the survey could be completed in around 5 minutes. Participants who worked the night shift received the first EMA (job, health, or personality) at either 6pm, 12am, or 6am, and participants who worked the day shift received it at either 6am, 12pm, or 6pm. Participants were informed that they had 6 hours to complete each survey and that they would receive a reminder notification from the TILES app 30 minutes before the link expired if the survey had not been completed. The research team then distributed a calendar of the 10-week data collection period with a schedule of when to expect the daily survey each day. For the second (psychological flexibility or capital) EMAs, night shift participants received the surveys at a random time between 11pm and 6am, and day shift participants received them at a random time between 11am and 6pm; in both cases, participants were given 6 hours to complete the survey once it had been sent. All participants received a push-notification reminder 30 minutes before the survey closed.
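The timing rules above (a 6-hour completion window and a reminder 30 minutes before expiry) can be sketched as follows. This is only an illustration and not the TILES app's scheduling code: the function name `ema_window` and the example date are hypothetical.

```python
from datetime import datetime, timedelta

# Values taken from the text: 6 hours to complete each survey and a
# reminder 30 minutes before the survey closes.
COMPLETION_WINDOW = timedelta(hours=6)
REMINDER_LEAD = timedelta(minutes=30)


def ema_window(sent_at: datetime) -> dict:
    """Return the reminder and expiry times for an EMA sent at `sent_at`."""
    expires_at = sent_at + COMPLETION_WINDOW
    return {
        "sent_at": sent_at,
        "reminder_at": expires_at - REMINDER_LEAD,
        "expires_at": expires_at,
    }


# Illustrative example (the date is made up): a day-shift EMA sent at 12pm.
window = ema_window(datetime(2018, 3, 5, 12, 0))
print(window["reminder_at"])  # 2018-03-05 17:30:00
print(window["expires_at"])   # 2018-03-05 18:00:00
```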
Fitbit Charge 2 The first wearable sensor distributed to participants was the Fitbit Charge 2. Participants were asked to wear this sensor on their nondominant hand day and night throughout participation in the study. To properly set up this sensor, each participant created a Fitbit account, registered the Fitbit Charge 2 as a new device, and synced the Fitbit app to the TILES app. When prompted by the Fitbit app, participants were asked to give Bluetooth permissions and deny location permissions.
OMsignal garments Next, participants were given an OMbox and OMsignal garments; men were given five shirts and women were given three bras. The OMbox contains the hardware and software to process, collect, and transmit the information. Participants were asked to charge the sensor prior to each work shift, then connect it to the OMsignal garment, wear the OMsignal garment with the OMbox attached during their work shifts at the hospital, start OMsignal recordings in the OMsignal app installed on their phones at the beginning of each work shift, and stop, save, and upload the recording at the end of each work shift. During the enrollment session, each participant paired his/her OMsignal box to his/her account (created through the TILES app) on the OMsignal app on their mobile phone, and practiced connecting the OMsignal box to the garment and saving an OMsignal recording. At the beginning of the data collection, there was no version of the OMsignal app for Android. As a solution, we provided an iPod Touch with the OMsignal app installed to each participant with an Android personal smartphone. This way, they could start and stop recordings and upload the data using WiFi. The research team also helped set up location-based reminders on the iOS devices to help participants remember to start and stop OMsignal recordings when arriving at and leaving the hospital.
Unihertz Jelly Pro Participants were given a Unihertz Jelly Pro phone (running Android 7.0). These were either clipped to participants' clothing near the neckline or placed in a shirt pocket. The cases of the Jelly phones were modified such that the microphone pointed upwards, as described in [36], to better capture the speech data from the wearer. Participants were asked to charge the Jelly phone prior to each work shift, unlock the Jelly phone, check that the TILES Audio app [36] was running, and upload the audio data at the end of each work shift by pressing the UPLOAD DATA button in the TILES Audio app. Each Jelly phone was linked to the TILES app on each participant's mobile device by scanning a QR code in the TILES app. When prompted by the Jelly phone TILES Audio app, participants were asked to enable permissions (e.g., allowing the TILES Audio app to run in the background, access to photos 4, etc.) and disable location-related services. Additionally, participants were informed of the Jelly phone TILES Audio app's disable feature (to stop recording audio features) and instructed on how to use this function.
RealizD app Lastly, participants downloaded the RealizD app on their smartphone and were informed that this app would track how often the phone was picked up and for how long. Participants did not need to interact with the RealizD app during their participation, since it ran in the background.
Phone permissions For the RealizD app to work, participants were asked to allow location permissions. Participants were also asked to keep WiFi and Bluetooth turned on on their personal mobile phones throughout their participation in the 10-week data collection period.
Environmental sensors Finally, participants learned about the environmental sensors that were placed around the hospital and were informed that no participant interaction with these sensors was required.

Completing the Pre-Study Survey
Following completion of their enrollment session, participants were emailed a link to complete this survey, administered on the online survey platform REDCap.

Data Collection
The 10-week data collection took place in three different participant waves. The data collection periods and number of participants per wave are shown in Table 2.

Off-boarding Session
After the 10-week data collection from sensors and daily surveys ended, participants attended an in-person off-boarding session, which typically lasted between 15 and 20 minutes. During this session, participants exported mobile application data to members of the research team and returned their wearable sensors (except for Fitbit, see Section Incentives).

Completing the Post-Study Survey
Following completion of their off-boarding session, participants were emailed a link to complete a survey administered on the online survey platform REDCap. This survey was identical to part II of the baseline survey; the only difference is that the demographics survey was removed and a study feedback survey was administered. This survey took approximately 30 minutes to complete. This concluded participant study procedures.

Incentives Structure
A novel incentive scheme was developed for the TILES study to encourage compliance. Study participants were awarded monetary incentives (Table 3) and points for study-related activities, proportionate to the time required to complete each activity. These points later translated to monetary awards. The number of points awarded for each activity is summarized in a separate table. Points were converted to monetary compensation on a weekly cadence, according to a set of thresholds noted in Table 5. The expected use of OMsignal garments and Jelly phones was 3 days a week for most of the participant population, so points for wearing and syncing these devices were added to the incentive scheme as bonuses.
In addition to weekly gift cards as incentive, points were accumulated throughout the duration of the study and grand prizes were awarded to the top three point earners per wave. Each participant's current point total and ranking were displayed in the TILES app activity feed. Bonus points were awarded for various activities, as summarized in Table 6. The first, second, and third place point earners across each wave were awarded $250, $200, and $100, respectively.
Participants who finished the 10-week data collection period also kept the Fitbit Charge 2 that they wore during the study.
Table 6: Bonus points scheme for study participation stages and milestones. Participants received weekly points by wearing the sensors and answering the surveys. These were converted to weekly monetary rewards (Table 5) and added to a global ranking that awarded prizes by the end of the study.
Off-boarding | Export RealizD data | 50

Data acquisition and flow
server using an available wireless internet connection (WiFi or LTE). The Jelly phones (used here as audio-features recorders) uploaded data to the research server directly using WiFi only. The Jelly Pro smartphones given to participants also sent Bluetooth packets programmed with a unique identifier that were captured by the Owl-in-One hubs installed throughout the hospital. These packets were combined with their received signal strength indicator (RSSI) computed by the Owl-in-Ones.
The Owl-in-Ones also received data from environmental sensors. Data from both Minew sensors and Jelly phones were sent through Keck Hospital's public WiFi network to reelyActive's servers over UDP, from which the data were collected using the Pareto API [39] over HTTPS. These data were stored securely in the research server after filtering the data to contain only Bluetooth packets generated by our sensors.
Data were also collected directly through the participants' personal smartphones through the TILES app and the RealizD app. The TILES app uploaded data directly to the research server, while the RealizD app uploaded data to the RealizD server, from which the data were later pulled to the research server. The research server (code available at https://github.com/usc-sail/tiles-data-collection-pipeline/) consists of a RESTful API hosting a series of endpoints to collect push-type data streams (e.g. Owl-in-One, TILES app) in addition to a suite of tasks to fetch pull-type data streams (e.g. Fitbit, OMsignal).

Survey data
Once the data collection period ended, the baseline survey and EMAs were scored using R scripts 5 .
Data for the baseline survey were stored in a table where each column represents a single survey question and each row represents a single participant. In contrast, data for the various EMAs were measured daily and stored in multiple files, where each row contains the answers of a single participant to a survey. The files are split by shift (day/night), date, survey kind, and time it was administered.
Data for the take-home baseline survey (part I), baseline survey (part II) and study-completion survey (part II) were stored in single files each. Variables were renamed to correspond to what each question measured. After the above steps were taken, total scores for each psychological measurement were calculated (scored folder in Table 7).

Fitbit data
Fitbit data retrieved using the Fitbit API contained separate time series for measured heart rate and step count, in addition to a daily summary of physical activity and sleep. The heart rate data are reported at non-uniform intervals anywhere between approximately 5 s and 15 min, depending on the participants' physical activity. Occasionally, long strings of repeated identical heart rate values (usually 70 beats/min) were reported in the raw data, spanning durations typically less than 15 minutes but sometimes up to 20 hours. Because of consumer observations that Fitbit technology sometimes incorrectly reports exactly 70 bpm [40], and also because repeated measures of the same heart rate over several minutes are highly unlikely, these long strings were interpreted as artifacts. Thus, sequences of at least 50 repeated identical heart rate values were replaced with NaN 6 values. The step count, daily summary, and sleep data did not contain these long string artifacts and, therefore, were not pre-processed.
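The artifact rule described above (runs of at least 50 identical consecutive heart rate values replaced with NaN) can be sketched as below. This is a minimal illustration under the assumption that only consecutive sample values are compared and timestamps are ignored; the function name is hypothetical and the released pre-processing code may differ in detail.

```python
import math
from itertools import groupby

MIN_RUN_LENGTH = 50  # threshold stated in the text


def mask_repeated_runs(heart_rates, min_run=MIN_RUN_LENGTH):
    """Replace runs of at least `min_run` identical values with NaN."""
    cleaned = []
    # groupby yields maximal runs of equal consecutive values.
    for _, run in groupby(heart_rates):
        run = list(run)
        if len(run) >= min_run:
            cleaned.extend([math.nan] * len(run))
        else:
            cleaned.extend(run)
    return cleaned


# Made-up series: two valid samples, a 50-sample run of 70 bpm, one valid sample.
series = [65, 66] + [70] * 50 + [71]
cleaned = mask_repeated_runs(series)
print(cleaned[:2], cleaned[-1])  # [65, 66] 71
```

Shorter runs are left untouched, so a brief plateau in a resting heart rate is not discarded.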

OMsignal data
The data obtained from the OMsignal's API contained no obvious visible artifacts, and so they were not modified during the pre-processing stage.

Owl-in-One data
The Owl-in-One devices captured packets from all Bluetooth devices broadcasting Bluetooth advertisements at Keck Hospital. We filtered all of these packets and stored only the packets coming from Minew sensors, Jelly phones, and Owl-in-Ones by matching keywords expected to be found in the packets ("minew", "reelyActive_RA-R436", "jelly"). These were originally stored in JSONL format and later translated to CSV files containing only the relevant information for easier processing (details below). These files also include relevant IDs (such as the participant ID associated with a Jelly phone), when appropriate. We have hashed the actual directory names to prevent making the hospital's floor plans publicly available, such that the floor number, unit, and room numbers are kept private. An example is c25c:lounge:2fec.
Environmental data Bluetooth packets sent by Minew sensors contained the measured temperature and humidity, light level, or motion information in their payload. Each packet was received by Owl-in-One devices, time stamped, and sent to reelyActive's cloud servers, where it was processed and sent to the research server. In the research server, the packets were filtered so that only packets containing Minew data were kept as environmental data. Some packets were found to contain corrupted data in the form of invalid source sensor identifiers. The environmental data were therefore further filtered to keep only packets whose identifiers appeared on the research team's list of identifiers for all installed sensors. None of the other packet values were observed to be corrupted, including the measured environmental data, so no additional preprocessing was performed.
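A minimal sketch of the keyword-based filtering described above is shown below. The three keywords are those quoted in the text; everything else is an assumption made for illustration, including the case-insensitive matching on the raw JSONL line, the function names, and the example packet fields (`name`, `rssi`), which do not reflect the actual packet schema.

```python
import json

KEYWORDS = ("minew", "reelyActive_RA-R436", "jelly")  # keywords quoted in the text


def keep_packet(raw_line: str, keywords=KEYWORDS) -> bool:
    """Return True if the serialized packet mentions any expected keyword."""
    lowered = raw_line.lower()  # case-insensitive matching is an assumption
    return any(keyword.lower() in lowered for keyword in keywords)


def filter_jsonl(lines):
    """Yield parsed packets from JSONL lines that match a keyword."""
    for line in lines:
        line = line.strip()
        if line and keep_packet(line):
            yield json.loads(line)


# Hypothetical packets; the field names are illustrative only.
sample = [
    '{"name": "Minew S1", "rssi": -70}',
    '{"name": "SomeHeadset", "rssi": -80}',
    '{"name": "jelly-0042", "rssi": -65}',
]
kept = list(filter_jsonl(sample))
print(len(kept))  # 2
```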

Audio
Each file contains raw audio features extracted as a combination of the Interspeech 2013 ComParE Vocalization Challenge feature set [41] and openSMILE's emobase feature set [42]. The openSMILE toolkit was applied in this configuration to extract acoustic low-level descriptors (LLDs) of 127 dimensions every 10 ms using either 25 ms or 60 ms frame sizes. The configuration file used to extract features is provided with the app itself 7. The feature set contains prosodic measures (pitch, intensity), cepstral information (MFCCs 1-12), RASTA PLP features, spectral features (band energy between 250 Hz and 650 Hz, centroid of frequency distribution, spectral rolloffs), and other acoustic characteristics (e.g. LPC 0-7, zero crossing rate).
We did not perform any preprocessing on the raw audio before feature extraction. To extract foreground speech information, we trained a machine learning model to differentiate foreground from background speech on a separate corpus collected in-house with the same audio feature extraction hardware and software, but also with the ground truth audio, and applied it to the raw features of the TILES-2018 Audio Data Record [43]. The output of these models is a series of temporal foreground predictions in the interval [0, 1], where values close to 1 predict foreground. These temporal foreground predictions are also included in the TILES-2018 Audio Data Record and described in the Data Records section. To extract data with foreground speech, we recommend first median-filtering the foreground speech predictions with a window length of 101 samples (corresponding to a 1 s window) and then thresholding the filtered predictions at 0.5. A non-zero value then corresponds to a row with detected foreground speech.
For the current data release, we further curated the data by only including a subset of the features collected, omitting filterbank features such as MFCCs and PLPs, as well as LPC features. We believe filterbanks should be released with some form of information obfuscation or encryption, as they contain potentially recoverable language information and pose privacy concerns. We intend to release privacy-preserving embeddings on the filterbanks at a later stage. For information on features included in the release, refer to Section Audio.
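The recommended post-processing (a 101-sample median filter followed by a 0.5 threshold) can be sketched in plain Python as follows. Two details are assumptions not stated in the text: windows are truncated at the array edges rather than padded, and the comparison is strict (`> 0.5`). The function name is hypothetical.

```python
from statistics import median


def detect_foreground(predictions, window=101, threshold=0.5):
    """Median-filter the predictions, then threshold them into 0/1 flags."""
    half = window // 2
    flags = []
    for i in range(len(predictions)):
        # Centered window, truncated at the edges (an assumption).
        lo = max(0, i - half)
        hi = min(len(predictions), i + half + 1)
        smoothed = median(predictions[lo:hi])
        flags.append(1 if smoothed > threshold else 0)
    return flags


# Made-up predictions: an isolated spike is suppressed by the median filter,
# while a sustained high-probability segment is kept.
spike = [0.1] * 200
spike[100] = 0.9
print(sum(detect_foreground(spike)))  # 0

sustained = [0.1] * 150 + [0.9] * 300 + [0.1] * 150
print(detect_foreground(sustained)[300])  # 1
```

This loop favors readability over speed. For full-length recordings, a vectorized routine such as `scipy.signal.medfilt` may be preferable; note that it zero-pads at the edges, so its output can differ from this sketch near the start and end of a file.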

Inference of days at work
For convenience, we also provide an estimate of working days for all participants. This was obtained using the EMAs, as well as the data collected from the OMsignal garments and a combination of the Jelly phones and Owl-in-One data.
One of the base EMA questions was where the participant currently was (a value equal to 2 indicated currently at work). All of the participants' responses were saved into a table (each row represented a participant, each column a date). Equivalent tables were saved for days in which participants had recorded data through their OMsignal garments and through the Owl-in-Ones receiving pings from the Jelly phones.
All of this information was combined by performing a logical or operation between the tables. This means that if any of the sources of information regarded a given day as a day spent at work, that day was inferred as a day at work.
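The combination step can be illustrated with a short sketch. It assumes each source is represented as a nested mapping from participant ID to date to a boolean flag; this layout, the function name, and the example IDs and dates are hypothetical and chosen only to show the logical OR.

```python
def infer_days_at_work(*tables):
    """Combine {participant: {date: bool}} tables with a logical OR."""
    combined = {}
    for table in tables:
        for participant, days in table.items():
            merged = combined.setdefault(participant, {})
            for date, at_work in days.items():
                # A day counts as a work day if any source says so.
                merged[date] = merged.get(date, False) or bool(at_work)
    return combined


# Made-up example with one participant and three sources.
ema = {"p01": {"2018-03-05": True, "2018-03-06": False}}
omsignal = {"p01": {"2018-03-05": False, "2018-03-06": False}}
owlinone = {"p01": {"2018-03-06": True, "2018-03-07": False}}

result = infer_days_at_work(ema, omsignal, owlinone)
print(result["p01"])
# {'2018-03-05': True, '2018-03-06': True, '2018-03-07': False}
```

Because the sources are combined with OR, a single noisy source is enough to mark a day as a work day, which favors recall over precision.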

Code availability
All code for collecting, formatting, processing, and learning on the data is made freely available at https://github.com/usc-sail/tiles-dataset-release. Information about the code dependencies and package requirements is available in the same GitHub repository.

Data Records
The TILES-2018 data is split into two data records: the main data record and the audio data record. Each data record is described in detail in this section.

TILES-2018 Main Data Record
The main data record comprises several different data streams: fitbit, realizd, omsignal, owlinone, and surveys (following the names of the folders in the record), and a metadata folder. Depending on the kind of data collected, each stream may have subfolders. These are described in the following subsections. A summary of the main data record is presented in Table 7. Detailed descriptions for all the data sources are included in each folder under a README file.

Fitbit (fitbit folder)
daily-summary folder Each file has rows with a date and time and a set of daily summaries including resting heart rate, total calories burned, total number of steps, sleep report, and heart rate zone durations.
heart-rate folder Each file has rows with a timestamp and PPG heart rate values (beats per minute). The PPG heart rate samples are made available by the Fitbit Charge 2 sensors aggregated over intervals of less than 1 min, but the time differences between two consecutive samples are non-uniform.
sleep-data folder Each file has rows with the sleepId it corresponds to in sleep-metadata, a timestamp and the sleep phase with its total duration in s. Phase is either in classic (one of asleep, restless, or awake) or stages (one of deep, light, rem, or wake). The timestamp determines the beginning of the sleeping phase.
sleep-metadata folder Each file has rows for each period of sleep, and metadata for that sleep, including beginning and end, nap versus main sleep, type of inferred sleep phases (classic or stages), duration, and various metrics.
step-count folder Each file has rows with a timestamp and step count value. In contrast to heart rate values, step count data is sampled with an interval of 1 min, and reports the number of steps taken within that minute.

Metadata (metadata folder)
participant-info folder Contains a single file with hash-based participant IDs, primary nursing unit, and kind of shift (day or night).
days-at-work folder Contains a file for all participants. The information is presented in four tables (one per stream, plus aggregated) where each participant is a row and each column is a date in format yyyy-mm-dd.

OMsignal (omsignal folder)
features folder Each file contains rows with a timestamp and a set of physiological and physical activity measurements in real time (aggregated and saved every second), as well as high-level descriptive features (every 5 min). The real-time measurements include breathing rate, breathing depth, intensity, cadence, heart rate, RR intervals 8, and step count. The high-level descriptive features include statistical aggregations and derived features of real-time measurements over the 5 min intervals. Examples include the average and standard deviation of the breathing rates as well as posture.
ecg folder Each file has raw, 15 s-long electrocardiogram (ECG) snippets sampled at 250 Hz and recorded every 5 min. Each file corresponds to a single participant. Each row belongs to a single recording identified by record_id, which maps to the corresponding row in the metadata subfolder.
metadata folder Contains one file per participant with metadata information such as dates of usage, usage time in hours, and RR coverage 9 for a given recording.
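For the ecg folder described above, the stated parameters (15 s snippets, 250 Hz sampling, one snippet every 5 min) imply the following sizes. This is a small worked calculation rather than code from the data release; the constant and function names are hypothetical.

```python
SAMPLING_RATE_HZ = 250            # from the text
SNIPPET_SECONDS = 15              # from the text
RECORDING_PERIOD_SECONDS = 5 * 60  # one snippet every 5 min, from the text

# Number of ECG samples in each snippet.
samples_per_snippet = SAMPLING_RATE_HZ * SNIPPET_SECONDS
# Fraction of wear time covered by raw ECG.
duty_cycle = SNIPPET_SECONDS / RECORDING_PERIOD_SECONDS


def sample_offsets(n_samples, rate_hz=SAMPLING_RATE_HZ):
    """Time offset in seconds of each sample from the start of a snippet."""
    return [i / rate_hz for i in range(n_samples)]


print(samples_per_snippet)  # 3750
print(duty_cycle)           # 0.05
print(sample_offsets(samples_per_snippet)[-1])  # 14.996
```

In other words, raw ECG covers about 5% of the recording time, which matters when aligning ECG-derived measures with the per-second features.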

Owl-in-One (owlinone folder)
Owl-in-One data contains information from three different sources: Jelly phones (RSSIs from the Jelly phones of participants), other Owl-in-Ones (RSSIs), and Minew sensors (RSSIs and ambient information).
jelly folder The jelly subfolder is organized with files per participant. Each file contains rows with a timestamp, a participant ID, and the directories (see Section RSSI for details) of the receiving Owl-in-Ones with corresponding RSSI values.
minew folder Contains one file per device whose timestamped content depends on the type of sensor:
• light sensor: yes/no light detection,
• motion sensor: acceleration in X, Y, and Z coordinates in m/s²,
• temperature and humidity sensor: temperature in °C and relative humidity in %.
locations folder Contains a file with X and Y coordinates in m. The origin of the coordinate system (i.e., the point (x, y) = (0, 0)) is arbitrary so that the floor maps of the hospital are not revealed, but the pairwise distances between sensors within the same unit have been kept the same.
rssi folder This folder has one file per Minew sensor. Each file contains rows with a timestamp (sorted), the hashed directory of the receiving Owl-in-One, and the corresponding RSSI value.
owls folder This folder contains two subfolders:
locations folder Contains a file with X and Y coordinates in m from the same arbitrary origin as the minew locations.
rssi folder The Owl-in-One files are organized by Unix time days (meaning that the cutoff is midnight UTC). These files each contain all of the signals sent by Owl-in-Ones and received by them. Sending and receiving MAC addresses have been included, together with the sender's and receivers' associated directories.
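Because the Owl-in-One RSSI files are split at midnight UTC, it can help to map a timestamp to its Unix day. The sketch below shows the arithmetic only; how the day is encoded in the file names is not specified here, and the function names are hypothetical.

```python
from datetime import datetime, timezone

SECONDS_PER_DAY = 86_400


def unix_day(timestamp_s: float) -> int:
    """Number of whole days elapsed since 1970-01-01 00:00:00 UTC."""
    return int(timestamp_s // SECONDS_PER_DAY)


def unix_day_start(day: int) -> datetime:
    """UTC datetime at which the given Unix day begins."""
    return datetime.fromtimestamp(day * SECONDS_PER_DAY, tz=timezone.utc)


# Made-up timestamps one second apart, on either side of midnight UTC.
before = datetime(2018, 3, 5, 23, 59, 59, tzinfo=timezone.utc).timestamp()
after = before + 1
print(unix_day(after) - unix_day(before))  # 1
print(unix_day_start(unix_day(after)))     # 2018-03-06 00:00:00+00:00
```

Since midnight UTC falls in the afternoon in Los Angeles local time, a single work shift can span two files.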

RealizD (realizd folder)
Each RealizD file describes the interaction that participants have with their smartphones. These files include a column with timestamps for initial interaction times and a column with times in seconds corresponding to the duration of the interaction.
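As an illustration of how these two columns can be used, the sketch below aggregates sessions into daily totals. It assumes that timestamps are ISO 8601 strings and that each session is attributed to the day on which it starts; both are assumptions, and the function name and example values are made up.

```python
from collections import defaultdict
from datetime import datetime


def daily_phone_usage(sessions):
    """Aggregate (start timestamp, duration in seconds) pairs per day."""
    totals = defaultdict(lambda: {"sessions": 0, "screen_seconds": 0.0})
    for start, duration_s in sessions:
        # Attribute the whole session to its start day (an assumption).
        day = datetime.fromisoformat(start).date().isoformat()
        totals[day]["sessions"] += 1
        totals[day]["screen_seconds"] += duration_s
    return dict(totals)


# Made-up sessions for illustration.
sessions = [
    ("2018-03-05T08:15:00", 120),
    ("2018-03-05T12:40:30", 45),
    ("2018-03-06T07:05:10", 300),
]
usage = daily_phone_usage(sessions)
print(usage["2018-03-05"])  # {'sessions': 2, 'screen_seconds': 165.0}
```

Days with no rows are absent from the result. As noted for this stream, an absence of rows cannot distinguish between no phone interaction and the app not running.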

Surveys
The surveys folder contains two subfolders including raw and scored surveys.
raw folder Contains three subfolders:
baseline Contains two files named with the assessed scales in each part of the survey. In each file, rows correspond to the answers of participants, and the timestamp columns track the progress of participants throughout the surveys.
EMAs Contains three files. One file has the answers for health, personality, and job surveys plus the base daily survey in each of these. Each file contains answers per item (or question), for the participants (one per row) that received that survey. The questions are grouped in pages in which they were presented to participants, and the information on the time until first click in each page, last click, and total time spent in each page is also included.
A second file has the responses for psychological flexibility. It includes times at which the surveys were completed, and the total survey time. A third file has the psychological capital surveys.
post-study Contains a single file, named with all the assessed scales. Each row corresponds to a participant's answers to each question.
Table 9: Consent given by participants to share the audio data. The consent was given at the beginning of the study through the TILES app. We have only included the data from participants who allowed their data to be shared. Total: 186 of 212 participants.
scored folder
baseline Contains two files named with the assessed scales in each part of the survey. In each file, rows correspond to participants, and columns contain the values of the scored scales.
EMAs Each file corresponds to a scored item/scale assessed throughout the study. Each row in each file corresponds to a participant's scored answers.
post-study Contains a single file, named with all the assessed scales. Each row corresponds to a participant's answers to each question.

TILES-2018 Audio Data Record
The TILES-2018 Audio Data Record contains two different kinds of files related to the audio features obtained as per Section Audio. Consent to publish the audio data was given by 186 out of 212 participants, as detailed in Table 9.
fg-predictions folder This folder is arranged with subfolders per participant, like the raw-features folder. Each file in a participant's subfolder is a NumPy (.npy) file and corresponds to a file in the raw-features folder. The foreground prediction file stores an array that differentiates foreground (FG) 10 from background (BG) speech activity, with values indicating the likelihood of foreground speech information for each row in the corresponding file in the raw-features folder.
10 Foreground refers to audio features generated by the participant wearing the audio badge, as opposed to background noise generated by third parties. More details can be found in [43].
† RealizD data only shows the times when phone interaction occurs; thus it is not possible to differentiate between periods with no interaction and periods in which the application was not working.

Features
The features are computed over overlapping frames of raw audio. Frame lengths are typically 25 ms (and 60 ms for some features), and features are updated and recorded every 10 ms (which means an overlap of 60% for 25 ms frames and roughly 80% for 60 ms frames). Finally, some features are computed over windows of several frames.
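The overlap figures follow directly from the frame length and the 10 ms update interval, as the short calculation below shows. The function name is hypothetical.

```python
def frame_overlap(frame_ms: float, hop_ms: float = 10.0) -> float:
    """Fraction of a frame shared with the next frame for a given hop size."""
    return (frame_ms - hop_ms) / frame_ms


FRAMES_PER_SECOND = 1000 / 10  # one feature row every 10 ms

print(frame_overlap(25))            # 0.6
print(round(frame_overlap(60), 3))  # 0.833
print(FRAMES_PER_SECOND)            # 100.0
```

The exact overlap for 60 ms frames is 50/60, or about 83%, which the text rounds to 80%.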

Data Integrity
The TILES-2018 data set was collected in a demanding, real-world workplace setting, where participants were asked to use wearable sensors, even though their workload and responsibilities did not change. In this scenario, the compliance rates obtained were in line with other reported compliance rates for smaller studies, as discussed in [32]. Table 10 shows the compliance rates for each data stream, across all participants. Figure 5 shows histograms of the average usage hours for all the wearable sensors, across all participants. Table 12 shows Cronbach's α for the baseline and post-study surveys. This table shows that most of the assessments had an average α over 0.75, except for the agreeableness and alcohol use scales. Figure 3 reports Cronbach's α for each construct of each EMA administered. Some of the assessed constructs show an α > 0.7 for most of the time they were administered (challenge stressors, hindrance stressors, support, psychological capital, engagement, individual task proficiency, psychological flexibility, negative affect, and positive affect).
Table 11: Survey participation and compliance rates. Compliance was measured as the percentage of answered questions in each survey that was started. We also include the number of started surveys. Please refer to Figure 6 and Figure 7 for accompanying histograms.
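For readers who wish to recompute internal consistency from the raw item responses, Cronbach's α can be obtained from the standard formula α = k/(k − 1) · (1 − Σ item variances / variance of total scores), where k is the number of items. The sketch below is a generic implementation, not the scoring script used for the tables; it assumes complete responses (no missing items), and the example data are made up.

```python
from statistics import variance


def cronbach_alpha(responses):
    """Cronbach's alpha for rows of item scores (one row per participant)."""
    k = len(responses[0])
    if k < 2:
        raise ValueError("Cronbach's alpha needs at least two items")
    # Sample variance of each item across participants.
    item_variances = [variance(column) for column in zip(*responses)]
    # Sample variance of each participant's total score.
    total_variance = variance([sum(row) for row in responses])
    return k / (k - 1) * (1 - sum(item_variances) / total_variance)


# Made-up responses: three perfectly consistent items give alpha = 1.
consistent = [[1, 1, 1], [2, 2, 2], [3, 3, 3]]
print(round(cronbach_alpha(consistent), 6))  # 1.0
```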

Works using the Data Set
We have published several papers using this data, where we discuss various data processing challenges and opportunities.
In [45], we proposed a technique for clustering and discovering patterns in proximity-based location data of hospital workers, by extracting motifs (repeating patterns) from the length of stay in each location from the proximity-based time series of locations. We used the data in this data set including locations of over 200 participants and over 240 proximity sensors during the ten weeks of the study, and discovered that rooms of similar types (e.g., patient rooms) in the hospital exhibited a unique motif signature. The results suggest that similar motif features could be used in place of knowing the room types in advance and thus simplify very large-scale data collection.
A different approach involves using these proximity-based measurements to localize hospital workers. In our Main Data Record, we provide proximity-based information for 16 different high-traffic indoor settings. We use this information in [46] to propose a novel indoor localization algorithm based on tools from the metric learning community.
In [47], we explore the usage of physiological time series collected from the Fitbit Charge 2 wristband. We particularly study how to obtain optimal-length motifs from heart rate time series to capture intuitive physiological patterns of workers in their daily lives. The results revealed that regular routine patterns, such as sleep, can be reflected through heart rate time series data.
As emphasized in [32], one major challenge in conducting studies in naturalistic settings relates to the quality of the data being collected with wearables. This requires the development of sensor quality metrics, missing data imputation methods, as well as quality-aware and artifact-robust parameters. To this end, we developed several such measures for breathing and heart rate time series [48,49,50].
The data we are publishing with the Audio Data Record poses a new set of challenges not previously described in the literature, related to privacy-aware audio processing in a real-world setting with sensitive information. Because the Jelly phones ("audio badges") we used recorded both the foreground audio (egocentric audio information) and the background audio of each participant's environment, in [43] we trained a machine learning model to detect foreground vs. background audio content on a different, in-house corpus, which included raw audio time series alongside the same set of extracted features. We applied it to the Audio Data Record to generate foreground vs. background predictions that allow us to retrieve the egocentric information of a participant.
Finally, in [51] we took a more global approach and used several of the data streams collected through sensors to infer self-assessments of participants, i.e., scored surveys.

Data Access
Due to privacy concerns, we request a signed Data Usage Agreement (DUA) to grant access to all data records. A user signing this DUA agrees to the following: (1) not de-anonymizing the data, (2) not trying to identify language content, and (3) not sharing the data record with anyone not having signed a DUA. The document and the form to submit the signed document are available online here: http://tiles-data.isi.edu. Once validated, the user will receive an email with the information on how to download each data record.

Main Record
The main data record has a total size of about 100 GB. To be mindful of the use of resources, we ask users to download this record only once.

Audio Record
Due to the size of this data record (about 305 GB), we provide two subsets of it for convenience. The first subset is about 100 GB and contains only data for which foreground speech has been detected. We believe most users will be interested in this record. The second subset, of about 10 GB, is from a single user and contains all features extracted from the raw audio, unfiltered, i.e., including segments where no foreground speech has been detected. To be mindful of bandwidth usage, the complete data record will only be accessible upon request, after testing has been performed on the second subset described above. As with the main record, we ask users to download each data record only once.

Reading the files
We are sharing all the files as compressed comma-separated values (CSV) files (.csv.gz), except for the foreground predictions, which are stored as NumPy (.npy) files. We recommend reading the compressed files directly. This can be easily done in Python, R, and Julia (examples follow). Note that .csv.gz files can also be opened directly in LibreOffice Calc. If you need to decompress the files, we recommend using gzip [52], due to the number of files and their large sizes.
Python We recommend using Pandas:

    import pandas as pd

    # Pandas infers gzip compression from the .gz extension.
    df = pd.read_csv("file.csv.gz")

R We recommend using the data.

Data Records: Use Cases
This data set was initially developed to model and predict self-reported mental states from wearable sensors. However, we envision more uses for it, and hope that researchers will find other uses for various aspects of the data.
Multimodal Signal Processing This data set poses several problems in core (multimodal) signal processing. There are several opportunities for data quality enrichment, including the denoising of the ECG snippets and Fitbit heart data, denoising of proximity information for localization, time alignment and synchronization of events from multimodal streams, and voice activity detection from breathing information. There are also new opportunities from a signal processing standpoint in the processing of longitudinal survey information.
Statistical Modeling and Machine Intelligence This data set presents many opportunities for machine learning. The data set was initially designed to predict self-assessments of participants from sensor data. However, there are opportunities to explore the behavioral dynamics of participants throughout time, including through unsupervised learning, behavioral time series forecasting, and causal inference. Other opportunities involve spatio-temporal modeling of behavior, individualized and group-level behavioral modeling, and multitask learning of behavior patterns.
Privacy We envision several uses of this data set for privacy researchers. Given the total number of hours of physiologic and behavioral data, an obvious use case is exploring the fingerprinting of individuals from physiologic data and behavioral patterns. We are, however, strongly against using the data set for re-identification of specific individuals, hence a data usage agreement specifically forbidding this usage. This data set also poses new challenges and opportunities to explore avenues in privacy-aware machine learning.
Behavioral Sciences This data set poses several opportunities for research in social sciences, and specifically at the intersection of social sciences and machine learning. Some of these opportunities include the study of longitudinal survey assessments, and employee well-being within large organizations. For example, a potential avenue is to explore how data from wearable sensors relate to measures of job performance, and how they may provide new ways to explore how health and wellness impact important work behaviors. Lastly, these data provide an opportunity to examine psychometric properties of psychological measures across a 10-week period.
Figure 1: Experimental design. Participants received instructions in a 2-hour on-boarding session, where they completed the first part of the baseline survey and were instructed in the use of sensors and smartphone apps. This session was followed by the second part of the baseline survey and then by 10 weeks of data collection, during which participants wore multiple wearable sensors (wristband, garment, and an audio badge) and answered two daily EMAs through their personal smartphones. During the off-boarding session, participants handed in their sensors, finished uploading data (including RealizD data), and read an audio script for baseline vocal information. After the sensor data collection, they completed a post-study survey.

Figure 4: Data flow. This diagram shows the data flow from the sensors given to participants, the sensors placed at USC's Keck Hospital, and smartphones to the research server (data servers in the cloud), where the data is stored for long-term use.