Database Credentialed Access
GLOBEM Dataset: Multi-Year Datasets for Longitudinal Human Behavior Modeling Generalization
Xuhai Xu, Han Zhang, Yasaman Sefidgar, Yiyi Ren, Xin Liu, Woosuk Seo, Jennifer Brown, Kevin Kuehn, Mike Merrill, Paula Nurius, Shwetak Patel, Tim Althoff, Margaret Morris, Eve Riskin, Jennifer Mankoff, Anind Dey
Published: Nov. 4, 2022. Version: 1.0
When using this resource, please cite:
Xu, X., Zhang, H., Sefidgar, Y., Ren, Y., Liu, X., Seo, W., Brown, J., Kuehn, K., Merrill, M., Nurius, P., Patel, S., Althoff, T., Morris, M., Riskin, E., Mankoff, J., & Dey, A. (2022). GLOBEM Dataset: Multi-Year Datasets for Longitudinal Human Behavior Modeling Generalization (version 1.0). PhysioNet. https://doi.org/10.13026/jvtb-2d81.
Xu X., Zhang H., Sefidgar Y., Ren Y., Liu X., Seo W., Brown J., Kuehn K., Merrill M., Nurius P., Patel S., Althoff T., Morris M., Riskin E., Mankoff J., Dey A. (2022) GLOBEM Dataset: Multi-Year Datasets for Longitudinal Human Behavior Modeling Generalization. 36th Conference on Neural Information Processing Systems Datasets and Benchmarks Track.
Please include the standard citation for PhysioNet:
Goldberger, A., Amaral, L., Glass, L., Hausdorff, J., Ivanov, P. C., Mark, R., ... & Stanley, H. E. (2000). PhysioBank, PhysioToolkit, and PhysioNet: Components of a new research resource for complex physiologic signals. Circulation [Online]. 101 (23), pp. e215–e220.
We present the first multi-year mobile sensing datasets. Our multi-year data collection studies span four years (10 weeks each year, from 2018 to 2021). The four datasets contain data collected from 705 person-years (497 unique participants) with diverse racial, ability, and immigrant backgrounds. Each year, participants would install a mobile app on their phones and wear a fitness tracker. The app and wearable device passively track multiple sensor streams in the background 24×7, including location, phone usage, calls, Bluetooth, physical activity, and sleep behavior. In addition, participants completed weekly short surveys and two comprehensive surveys on health behaviors and symptoms, social well-being, emotional states, mental health, and other metrics. Our dataset analysis indicates that our datasets capture a wide range of daily human routines, and reveal insights between daily behaviors and important well-being metrics (e.g., depression status). We envision our multi-year datasets can support the ML community in developing generalizable longitudinal behavior modeling algorithms.
Among various longitudinal sensor streams, smartphones and wearables are arguably among the most widely available data sources. The advances in mobile technology provide an unprecedented opportunity to capture multiple aspects of daily human behaviors, by collecting continuous sensor streams from these devices [2,3], together with metrics about health and well-being through self-report or clinical diagnosis as modeling targets. Longitudinal behavior modeling with such data poses unique challenges compared to traditional time-series classification tasks. First, the data covers a much longer time period, usually across multiple months or years. Second, the nature of longitudinal collection often results in a high data missing rate. Third, the prediction target label is sparse, especially for mental well-being metrics.
Longitudinal human behavior modeling is an important multidisciplinary area spanning machine learning, psychology, human-computer interaction, and ubiquitous computing. Researchers have demonstrated the potential of using longitudinal mobile sensing data for behavior modeling in many applications, e.g., detecting physical health issues, monitoring mental health status, measuring job performance, and tracing education outcomes. Most existing research employed off-the-shelf ML algorithms and evaluated them on private datasets. However, testing a model with new contexts and users is imperative to ensure its practical deployability. To the best of our knowledge, there has been no investigation of the cross-dataset generalizability of these longitudinal behavior models, nor an open testbed to evaluate and compare various modeling algorithms. To address this gap, we present the first multi-year passive mobile sensing datasets to help the ML community explore generalizable longitudinal behavior models.
Our data collection studies were conducted at a Carnegie-classified R-1 university in the United States with IRB review and approval. We recruited undergraduates via emails from 2018 to 2021. After the first year, previous-year participants were invited to join again. The study was conducted during Spring quarter for 10 weeks each year, so the impact of seasonal effects was controlled. Based on their compliance, participants received up to $245 in compensation every quarter.
The four datasets (DS1 to DS4) have 155, 218, 137, and 195 participants (705 person-years overall, and 497 unique people). Our datasets have a high representation of females (58.9%), immigrants (24.2%), first-generation students (38.2%), and people with disabilities (9.1%). They also cover a wide range of races: Asian (53.9%) and White (31.9%) are dominant, and other groups include Hispanic/Latino (7.4%) and Black/African American (3.3%).
Part 1: Survey Data
We collected survey data at multiple stages of the study. We delivered extensive surveys before the start and at the end of the study (pre/post surveys) and short weekly Ecological Momentary Assessment (EMA) surveys during the study to collect in-the-moment self-report data. All surveys consist of well-established and validated questionnaires to ensure data quality.
Our pre/post surveys include a number of questionnaires to cover various aspects of life, including 1) personality (BFI-10, The Big-Five Inventory-10), 2) physical health (CHIPS, Cohen-Hoberman Inventory of Physical Symptoms), 3) mental well-being (e.g., BDI-II, Beck Depression Inventory-II; ERQ, Emotion Regulation Questionnaire), and 4) social well-being (e.g., Sense of Social and Academic Fit Scale; EDS, Everyday Discrimination Scale). Our EMA surveys focus on capturing participants’ recent sense of their mental health, including PHQ-4, Patient Health Questionnaire 4; PSS-4, Perceived Stress Scale 4; and PANAS, Positive and Negative Affect Schedule.
We use the depression detection task as a starting point for behavior modeling. We employ BDI-II (post) and PHQ-4 (EMA) as the ground truth. Both are screening tools for further inquiry of clinical depression diagnosis. We focus on a binary classification problem to distinguish whether participants’ scores indicate at least mild depressive symptoms through the scales (i.e., PHQ-4 > 2, BDI-II > 13). The average number of depression labels is 11.6 ± 2.6 per person. The percentage of participants with at least mild depression is 39.8 ± 2.7% for BDI-II and 46.2 ± 2.5% for PHQ-4.
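The binarization described above can be sketched as follows. This is a minimal illustration, not the authors' code: the function names are hypothetical, and the text does not say whether the PHQ-4 threshold applies to the total score or to a subscale, so the function simply thresholds whatever score it is given.

```python
# Thresholds quoted in the text: PHQ-4 > 2 and BDI-II > 13 indicate
# at least mild depressive symptoms.
PHQ4_THRESHOLD = 2
BDI2_THRESHOLD = 13


def phq4_label(score: int) -> bool:
    """Return True when a PHQ-4 score is above the stated threshold."""
    return score > PHQ4_THRESHOLD


def bdi2_label(score: int) -> bool:
    """Return True when a BDI-II score is above the stated threshold."""
    return score > BDI2_THRESHOLD


if __name__ == "__main__":
    # Hypothetical scores, for illustration only.
    print(phq4_label(3), bdi2_label(13))
```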
Due to some design iteration, we did not include PHQ-4 in DS1, but only PANAS. Although PANAS contains questions related to depressive symptoms (e.g., “distressed”), it does not have a comparable theoretical foundation for depression detection like PHQ-4 or BDI-II. Therefore, to maximize the compatibility of the datasets, we trained a small ML model on DS2 that has both PANAS and PHQ-4 scores to generate reliable ground truth labels. Specifically, we used a decision tree (depth=2) to take PANAS scores on two affect questions (“depressed” and “nervous”) as the input and predict the PHQ-4-based binary depression label. Our model achieved 74.5% and 76.3% for accuracy and F1-score on a 5-fold cross-validation on DS2. The rule from the decision tree is simple: the user would be labeled as having no depression when the distress score is less than 2, and the nervous score is less than 3 (on a 1-5 Likert Scale). We then applied this rule to DS1 to generate depression labels.
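The two-threshold rule stated above is small enough to write out directly. The sketch below assumes integer item scores on the 1-5 Likert scale; the function and argument names are hypothetical and only mirror the wording of the rule.

```python
def panas_rule_label(distress: int, nervous: int) -> bool:
    """Apply the depth-2 rule: no depression only when distress < 2 and nervous < 3.

    Returns True when the rule assigns a depression label.
    """
    for value in (distress, nervous):
        if not 1 <= value <= 5:
            raise ValueError("PANAS item scores are expected on a 1-5 scale")
    no_depression = distress < 2 and nervous < 3
    return not no_depression


if __name__ == "__main__":
    # Hypothetical responses, for illustration only.
    print(panas_rule_label(distress=1, nervous=2))  # rule says no depression
    print(panas_rule_label(distress=2, nervous=1))  # rule says depression
```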
Part 2: Sensor Data
We developed a mobile app using the AWARE Framework that continuously collects location, phone usage (screen status), Bluetooth scans, and call logs. The app is compatible with both the iOS and Android platforms. Participants installed the app on smartphones and left it running in the background. In addition, we provided wearable Fitbits to collect their physical activities and sleep behaviors. The mobile app and wearable passively collected sensor data 24×7 during the study. The average number of days per person per year is 77.5 ± 8.9 among the four datasets.
We strictly follow our IRB's rules for anonymizing participants' data. Specifically, we employed a PID as the only indicator of a participant. No personal information is included in the dataset. Since some sensitive sensor data (e.g., location) can disclose identities, we only release feature-level data under credentialing to protect against privacy leakage.
Moreover, the data collection dates are randomly shifted by weeks. Because each shift is a whole number of weeks, the temporal order of events within the same subject and the day of the week are maintained after date-shifting.
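A short sketch shows why a whole-week shift keeps both properties. The dates and the offset below are made up for illustration; the actual offsets used for the datasets are not disclosed here.

```python
from datetime import date, timedelta


def shift_by_weeks(day: date, weeks: int) -> date:
    """Shift a date by a whole number of weeks (may be negative)."""
    return day + timedelta(weeks=weeks)


# Hypothetical event dates for one participant and a hypothetical offset.
original = [date(2020, 4, 6), date(2020, 4, 8), date(2020, 4, 12)]
shifted = [shift_by_weeks(day, -7) for day in original]

# Day of the week is unchanged because the offset is a multiple of 7 days.
same_weekday = all(a.weekday() == b.weekday() for a, b in zip(original, shifted))
# Order and spacing are unchanged because every date moves by the same amount.
same_order = shifted == sorted(shifted)
same_gaps = all(
    (b2 - b1) == (a2 - a1)
    for (a1, a2), (b1, b2) in zip(zip(original, original[1:]), zip(shifted, shifted[1:]))
)
```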
We release four datasets, named INS-W_1, INS-W_2, INS-W_3, and INS-W_4. Each dataset has three folders. We provide an overview below. Please refer to our GLOBEM home page and GitHub README page for more details.
- SurveyData: a list of files containing participants' survey responses, including pre/post long surveys and weekly short EMA surveys.
- FeatureData: behavior feature vectors from all data types, using RAPIDS as the feature extraction tool.
- ParticipantInfoData: some additional information about participants, e.g., device platform (iOS or Android).
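Specifically, the folder structure of each dataset is as follows:
- SurveyData
  - dep_weekly.csv
  - dep_endterm.csv
  - pre.csv
  - post.csv
  - ema.csv
- FeatureData
  - rapids.csv
  - location.csv
  - screen.csv
  - call.csv
  - bluetooth.csv
  - steps.csv
  - sleep.csv
  - wifi.csv
- ParticipantInfoData
  - platform.csv
  - demographics.csv (available on special request only)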
The SurveyData folder contains five files, all indexed by pid and date:
- dep_weekly.csv: The specific file for depression labels (column "dep") combining post and EMA surveys.
- dep_endterm.csv: The specific file for depression labels (column "dep") only in post surveys. Some prior depression detection tasks focus on end-of-term depression prediction.
These two files are created for depression as it is the benchmark task. We envision future work can be extended to other modeling targets as well.
- pre.csv: The file contains all questionnaires that participants filled in right before the start of the data collection study (thus pre-study).
- post.csv: The file contains all questionnaires that participants filled in right after the end of the data collection study (thus post-study).
- ema.csv: The file contains all EMA surveys that participants filled in during the study. Some EMAs were delivered on Wednesdays, while some were delivered on Sundays.
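As a rough illustration of how the label files might be read, the sketch below groups rows of a dep_weekly.csv-style file by participant using only the standard library. The column names pid, date, and dep come from the description above; the sample rows, the pid format, and the encoding of the dep values are invented for this example and may differ from the real files.

```python
import csv
import io
from collections import defaultdict

# Hypothetical sample in the described layout (pid, date, dep).
SAMPLE = """pid,date,dep
P001,2018-04-04,False
P001,2018-04-11,True
P002,2018-04-04,False
"""


def load_labels(handle):
    """Group (date, dep) pairs by pid, keeping values as raw strings."""
    labels = defaultdict(list)
    for row in csv.DictReader(handle):
        labels[row["pid"]].append((row["date"], row["dep"]))
    return dict(labels)


labels = load_labels(io.StringIO(SAMPLE))
```

With the released data, the same function could be called on an opened file, for example `open("INS-W_1/SurveyData/dep_weekly.csv", newline="")`, assuming the dataset was unpacked into the working directory.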
|Survey Name||Short Description||Score Range||Dataset||Category|
|Short-form UCLA Loneliness Scale||A 10-item scale measuring one's subjective feelings of loneliness as well as social isolation. Items 2, 6, 10, 11, 13, 14, 16, 18, 19, and 20 of the original scale are included in the short form. Higher values indicate more subjective loneliness.||10 - 40||1,2,3,4||pre, post|
|Sense of Social and Academic Fit Scale||A 17-item scale measuring the sense of social and academic fit of students at the institution where this study was conducted. Higher values indicate a stronger sense of belonging.||17 - 119||1,2,3,4||pre, post|
|2-Way Social Support Scale||A 21-item scale measuring social support from four aspects: (a) giving emotional support, (b) giving instrumental support, (c) receiving emotional support, and (d) receiving instrumental support. Higher values indicate more social support.||(a) 0 - 25; (b) 0 - 25; (c) 0 - 35; (d) 0 - 20|
|Perceived Stress Scale||A 14-item scale used to assess stress levels during the last month. Note that Year 1 used the 10-item version. Higher values indicate more perceived stress.||0 - 56 (Year 2,3,4); 0 - 40 (Year 1)|
|Emotion Regulation Questionnaire||A 10-item scale assessing individual differences in the habitual use of two emotion regulation strategies: (a) cognitive reappraisal and (b) expressive suppression. Higher scores indicate more habitual use of reappraisal/suppression.||(a) 1 - 7; (b) 1 - 7|
|Brief Resilience Scale||A 6-item scale assessing the ability to bounce back or recover from stress. Higher scores indicate greater resilience to stress.||1 - 5||1,2,3,4||pre, post|
|Cohen-Hoberman Inventory of Physical Symptoms||A 33-item scale measuring the perceived burden from physical symptoms, and resulting psychological effect during the past 2 weeks. Higher values indicate more perceived burden from physical symptoms.||0 - 132||1,2,3,4||pre, post|
|State-Trait Anxiety Inventory for Adults||A 20-item scale measuring State-Trait anxiety. Year 1 used the State version, while other years used the Trait version. Higher values indicate higher anxiety.||20 - 80||1,2,3,4||pre, post|
|Center for Epidemiologic Studies Depression Scale, Cole version||A 10-item scale measuring current level of depressive symptomatology, with emphasis on the affective component, depressed mood. Year 2 used the 9-item version. Higher scores indicate more depressive symptoms.||0 - 30 (Year 1,3,4); 0 - 27 (Year 2)|
|Beck Depression Inventory-II||A 21-item scale detecting depressive symptoms. Higher values indicate more depressive symptoms. 0-13: minimal to none, 14-19: mild, 20-28: moderate, and 29-63: severe.||0 - 63||1,2,3,4||pre, post|
|Mindful Attention Awareness Scale||A 15-item scale assessing a core characteristic of mindfulness. Year 1 used a 7-item version, while other years used the full version. Higher values indicate higher mindfulness.||1 - 6||1,2,3,4||pre, post|
|The Big-Five Inventory-10||A 10-item scale measuring the Big Five personality traits Extroversion, Agreeableness, Conscientiousness, Emotional Stability, and Openness. The higher the score, the greater the tendency of the corresponding personality.||1 - 5||1,2,3,4||pre|
|Brief Coping Orientation to Problems Experienced||A 28-item scale measuring (a) adaptive and (b) maladaptive ways to cope with a stressful life event. Higher values indicate more effective/ineffective ways to cope with a stressful life event.||(a): 0 - 3; (b): 0 - 3|
|Gratitude Questionnaire||A 6-item scale assessing individual differences in the proneness to experience gratitude in daily life. Higher scores indicate a greater tendency to experience gratitude.||6 - 42||2,3,4||pre, post|
|Flourishing Scale Psychological Well-Being Scale||An 8-item scale measuring psychological well-being. Higher scores indicate a person with "more psychological resources and mental strengths".||8 - 56||2,3,4||pre, post|
|Everyday Discrimination Scale||A 9-item scale assessing everyday discrimination. Higher values indicate more frequent experience of discrimination.||0 - 45||2,3,4||pre, post|
|Chronic Work Discrimination and Harassment||A 12-item scale assessing experiences of discrimination in educational settings. Higher values indicate more frequent experience of discrimination in the work environment.||0 - 60||2,3,4||pre, post|
|The Brief Young Adult Alcohol Consequences Questionnaire (optional)||A 24-item scale measuring the alcohol problem severity continuum in college students. Higher values indicate more severe alcohol problems.||0 - 24||2,3,4||pre, post|
|Patient Health Questionnaire 4||A 4-item scale assessing (a) mental health, (b) anxiety, and (c) depression. Higher values indicate higher risk of mental health problems, anxiety, and depression.||(a): 0 - 12; (b): 0 - 6; (c): 0 - 6|
|Perceived Stress Scale 4||A 4-item scale assessing stress levels during the last month. Higher values indicate more perceived stress.||0 - 16||2,3,4||Weekly EMA|
|Positive and Negative Affect Schedule||A 10-item scale measuring the level of (a) positive and (b) negative affects. Higher values indicate a larger extent.||(a): 0 - 20; (b): 0 - 20|
PS: Due to the design iteration, some questionnaires are not available in all studies. Moreover, some questionnaires have different versions across years. We clarify them using column names. For example, INS-W_2 only has "CESD_9items_POST", while others have "CESD_10items_POST". "CESD_9items_POST" is also calculated in other datasets to make the modeling target comparable across datasets.
The FeatureData folder contains eight files, all indexed by pid and date.
- rapids.csv: The complete feature file that contains all features.
- location.csv: The feature file that contains all location features.
- screen.csv: The feature file that contains all phone usage features.
- call.csv: The feature file that contains all call features.
- bluetooth.csv: The feature file that contains all Bluetooth features.
- steps.csv: The feature file that contains all physical activity features.
- sleep.csv: The feature file that contains all sleep features.
- wifi.csv: The feature file that contains all WiFi features. Note that this feature type is not used by any existing algorithms and often has a high data missing rate.
Please note that all features are extracted with multiple time_segments:
- morning (6 am - 12 pm, calculated daily)
- afternoon (12 pm - 6 pm, calculated daily)
- evening (6 pm - 12 am, calculated daily)
- night (12 am - 6 am, calculated daily)
- allday (24 hrs from 12 am to 11:59 pm, calculated daily)
- 7-day history (calculated daily)
- 14-day history (calculated daily)
- weekdays (calculated once per week on Friday)
- weekend (calculated once per week on Sunday)
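The four within-day segments listed above can be expressed as a simple hour-to-segment mapping. This is a sketch under one assumption not stated in the text: each window is treated as half-open (for example, 6:00 counts as morning and 12:00 as afternoon). The allday, history, weekdays, and weekend segments aggregate over longer windows and are not covered here.

```python
def daily_segment(hour: int) -> str:
    """Map an hour of day (0-23) to night, morning, afternoon, or evening."""
    if not 0 <= hour <= 23:
        raise ValueError("hour must be between 0 and 23")
    if hour < 6:
        return "night"      # 12 am - 6 am
    if hour < 12:
        return "morning"    # 6 am - 12 pm
    if hour < 18:
        return "afternoon"  # 12 pm - 6 pm
    return "evening"        # 6 pm - 12 am


if __name__ == "__main__":
    print([daily_segment(h) for h in (0, 6, 12, 18)])
```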
For all features with numeric values, we also provide two more versions:
- normalized: subtracted by each participant's median and divided by the 5-95 quantile range
- discretized: low/medium/high split by 33/66 quantile of each participant's feature value
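The two derived versions can be sketched for a single participant and a single feature as below. This is an illustration rather than the pipeline actually used: the quantile interpolation method, the handling of values that fall exactly on a cut point, and the zero-spread fallback are all assumptions.

```python
from statistics import median


def quantile(sorted_values, q):
    """Linear-interpolation quantile of a non-empty, already sorted list (0 <= q <= 1)."""
    position = q * (len(sorted_values) - 1)
    lower = int(position)
    upper = min(lower + 1, len(sorted_values) - 1)
    fraction = position - lower
    return sorted_values[lower] + (sorted_values[upper] - sorted_values[lower]) * fraction


def normalize(values):
    """Subtract the participant's median and divide by the 5-95 quantile range."""
    ordered = sorted(values)
    spread = quantile(ordered, 0.95) - quantile(ordered, 0.05)
    if spread == 0:
        return [0.0 for _ in values]  # assumed fallback for constant features
    center = median(values)
    return [(value - center) / spread for value in values]


def discretize(values):
    """Split values into low/medium/high at the participant's 33% and 66% quantiles."""
    ordered = sorted(values)
    q33, q66 = quantile(ordered, 0.33), quantile(ordered, 0.66)
    return ["low" if v <= q33 else "medium" if v <= q66 else "high" for v in values]


# Hypothetical feature values for one participant.
values = list(range(101))
normalized = normalize(values)
levels = discretize(values)
```

Both functions take one participant's values at a time, since the statistics are defined per participant.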
All features follow a consistent naming format: [feature_type]:[feature_name][version]:[time_segment]
- feature_type: It corresponds to the six data types.
  - location - f_loc
  - screen - f_screen
  - call - f_call
  - bluetooth - f_blue
  - steps - f_steps
  - sleep - f_slp
- feature_name: The name of the feature provided by RAPIDS, i.e., the second column of the following figure, plus some additional information. A typical format is [SensorType]_[CodeProvider]_[featurename]. Please refer to RAPIDS's naming format for more details.
- version: It has three versions:
  - 1) nothing, just empty "";
  - 2) normalized, _norm;
  - 3) discretized, _dis.
- time_segment: It corresponds to the specific time segment.
  - morning - morning
  - afternoon - afternoon
  - evening - evening
  - night - night
  - allday - allday
  - 7-day history - 7dhist
  - 14-day history - 14dhist
  - weekday - weekday
  - weekend - weekend
A participant's "sumdurationunlock" normalized feature in mornings is "f_screen:phone_screen_rapids_sumdurationunlock_norm:morning".
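For selecting columns programmatically, the naming parts listed above can be assembled and split with a pair of helpers. This is a sketch that assumes the colon-separated pattern visible in the example column name; the helper names and the "raw" key for the empty version are hypothetical.

```python
# Prefixes and suffixes as listed in the naming description.
FEATURE_TYPES = {
    "location": "f_loc",
    "screen": "f_screen",
    "call": "f_call",
    "bluetooth": "f_blue",
    "steps": "f_steps",
    "sleep": "f_slp",
}
VERSIONS = {"raw": "", "normalized": "_norm", "discretized": "_dis"}


def build_feature_name(data_type, feature_name, version, time_segment):
    """Assemble a column name from its four parts."""
    return f"{FEATURE_TYPES[data_type]}:{feature_name}{VERSIONS[version]}:{time_segment}"


def parse_feature_name(column):
    """Split a column name into (feature_type, feature_name, version suffix, time_segment)."""
    feature_type, middle, time_segment = column.split(":")
    version = ""
    for suffix in ("_norm", "_dis"):
        if middle.endswith(suffix):
            middle, version = middle[: -len(suffix)], suffix
            break
    return feature_type, middle, version, time_segment


column = build_feature_name(
    "screen", "phone_screen_rapids_sumdurationunlock", "normalized", "morning"
)
```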
Please find the following tables about feature details in our datasets.
Location Details
|Feature||Units||Description|
|hometime||minutes||Time at home. Time spent at home in minutes. Home is the most visited significant location between 8 pm and 8 am, including any pauses within a 200-meter radius.|
|disttravelled||meters||Total distance traveled over a day (flights).|
|rog||meters||The Radius of Gyration (rog) is a measure in meters of the area covered by a person over a day. A centroid is calculated for all the places (pauses) visited during a day, and a weighted distance between all the places and that centroid is computed. The weights are proportional to the time spent in each place.|
|maxdiam||meters||The maximum diameter is the largest distance between any two pauses.|
|maxhomedist||meters||The maximum distance from home in meters.|
|siglocsvisited||locations||The number of significant locations visited during the day. Significant locations are computed using k-means clustering over pauses found in the whole monitoring period. The number of clusters is found by iterating k from 1 to 200, stopping when the centroids of two significant locations are within 400 meters of one another.|
|avgflightlen||meters||Mean length of all flights.|
|stdflightlen||meters||Standard deviation of the length of all flights.|
|avgflightdur||seconds||Mean duration of all flights.|
|stdflightdur||seconds||The standard deviation of the duration of all flights.|
|probpause||-||The fraction of a day spent in a pause (as opposed to a flight).|
|siglocentropy||nats||Shannon’s entropy measurement is based on the proportion of time spent at each significant location visited during a day.|
|circdnrtn||-||A continuous metric quantifying a person’s circadian routine that can take any value between 0 and 1, where 0 represents a daily routine completely different from any other sensed days and 1 a routine the same as every other sensed day.|
|wkenddayrtn||-||Same as circdnrtn but computed separately for weekends and weekdays.|
|locationvariance||meters²||The sum of the variances of the latitude and longitude columns.|
|loglocationvariance||-||Log of the sum of the variances of the latitude and longitude columns.|
|totaldistance||meters||Total distance traveled in a time segment using the haversine formula.|
|avgspeed||km/hr||Average speed in a time segment considering only the instances labeled as Moving. This feature is 0 when the participant is stationary during a time segment.|
|varspeed||km/hr||Speed variance in a time segment considering only the instances labeled as Moving. This feature is 0 when the participant is stationary during a time segment.|
|numberofsignificantplaces||places||Number of significant locations visited. It is calculated using the DBSCAN/OPTICS clustering algorithm which takes in EPS and MIN_SAMPLES as parameters to identify clusters. Each cluster is a significant place.|
|numberlocationtransitions||transitions||Number of movements between any two clusters in a time segment.|
|radiusgyration||meters||Quantifies the area covered by a participant.|
|timeattop1location||minutes||Time spent at the most significant location.|
|timeattop2location||minutes||Time spent at the 2nd most significant location.|
|timeattop3location||minutes||Time spent at the 3rd most significant location.|
|movingtostaticratio||-||Ratio between stationary time and total location sensed time. A lat/long coordinate pair is labeled as stationary if its speed (distance/time) to the next coordinate pair is less than 1km/hr. A higher value represents a more stationary routine.|
|outlierstimepercent||-||Ratio between the time spent in non-significant clusters and the time spent in all clusters (stationary time; only stationary samples are clustered). A higher value represents more time spent in non-significant clusters.|
|maxlengthstayatclusters||minutes||Maximum time spent in a cluster (significant location).|
|minlengthstayatclusters||minutes||Minimum time spent in a cluster (significant location).|
|avglengthstayatclusters||minutes||Average time spent in a cluster (significant location).|
|stdlengthstayatclusters||minutes||Standard deviation of time spent in a cluster (significant location).|
|locationentropy||nats||Shannon Entropy computed over the row count of each cluster (significant location); it is higher the more rows belong to a cluster (i.e., the more time a participant spent at a significant location).|
|normalizedlocationentropy||nats||Shannon Entropy computed over the row count of each cluster (significant location) divided by the number of clusters; it is higher the more rows belong to a cluster (i.e., the more time a participant spent at a significant location).|
|timeathome||minutes||Time spent at home.|
|timeat[PLACE]||minutes||Time spent at [PLACE], which can be living, exercise, study, greens.|
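Two of the location quantities in the table above have standard closed forms: haversine distance (used by totaldistance) and Shannon entropy in nats (used by siglocentropy). The sketch below illustrates those formulas only; it is not RAPIDS code, and the Earth radius constant, function names, and sample inputs are assumptions.

```python
import math

EARTH_RADIUS_M = 6_371_000  # assumed mean Earth radius in meters


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in meters between two (lat, lon) points in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    d_phi = phi2 - phi1
    d_lambda = math.radians(lon2 - lon1)
    a = math.sin(d_phi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(d_lambda / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def total_distance_m(points):
    """Sum of haversine distances between consecutive (lat, lon) points."""
    return sum(haversine_m(*p, *q) for p, q in zip(points, points[1:]))


def entropy_nats(minutes_per_location):
    """Shannon entropy (natural log) of the proportion of time spent at each location."""
    total = sum(minutes_per_location)
    if total == 0:
        return 0.0
    proportions = [m / total for m in minutes_per_location if m > 0]
    return -sum(p * math.log(p) for p in proportions)


if __name__ == "__main__":
    # Hypothetical coordinates and durations, for illustration only.
    print(total_distance_m([(0.0, 0.0), (0.0, 1.0), (0.0, 2.0)]))
    print(entropy_nats([30, 30]))
```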
Phone Usage Details
|Feature||Units||Description|
|sumduration||minutes||Total duration of all unlock episodes.|
|maxduration||minutes||Longest duration of any unlock episode.|
|minduration||minutes||Shortest duration of any unlock episode.|
|avgduration||minutes||Average duration of all unlock episodes.|
|stdduration||minutes||Standard deviation duration of all unlock episodes.|
|countepisode||episodes||Number of all unlock episodes.|
|firstuseafter||minutes||Minutes until the first unlock episode.|
|sumduration[PLACE]||minutes||Total duration of all unlock episodes. [PLACE] can be living, exercise, study, greens. Same below.|
|maxduration[PLACE]||minutes||Longest duration of any unlock episode.|
|minduration[PLACE]||minutes||Shortest duration of any unlock episode.|
|avgduration[PLACE]||minutes||Average duration of all unlock episodes.|
|stdduration[PLACE]||minutes||Standard deviation duration of all unlock episodes.|
|countepisode[PLACE]||episodes||Number of all unlock episodes.|
|firstuseafter[PLACE]||minutes||Minutes until the first unlock episode.|
Call Details
|Feature||Units||Description|
|count||calls||Number of calls of a particular call_type (incoming/outgoing) that occurred during a particular time_segment.|
|distinctcontacts||contacts||Number of distinct contacts that are associated with a particular call_type for a particular time_segment.|
|meanduration||seconds||The mean duration of all calls of a particular call_type during a particular time_segment.|
|sumduration||seconds||The sum of the duration of all calls of a particular call_type during a particular time_segment.|
|minduration||seconds||The duration of the shortest call of a particular call_type during a particular time_segment.|
|maxduration||seconds||The duration of the longest call of a particular call_type during a particular time_segment.|
|stdduration||seconds||The standard deviation of the duration of all the calls of a particular call_type during a particular time_segment.|
|modeduration||seconds||The mode of the duration of all the calls of a particular call_type during a particular time_segment.|
|entropyduration||nats||The estimate of the Shannon entropy for the duration of all the calls of a particular call_type during a particular time_segment.|
|timefirstcall||minutes||The time in minutes between 12:00am (midnight) and the first call of call_type.|
|timelastcall||minutes||The time in minutes between 12:00am (midnight) and the last call of call_type.|
|countmostfrequentcontact||calls||The number of calls of a particular call_type during a particular time_segment of the most frequent contact throughout the monitored period.|
Bluetooth Details
|Feature||Units||Description|
|countscans||scans||Number of scans (rows) from the devices sensed during a time segment instance. The more scans a Bluetooth device has, the longer it remained within range of the participant’s phone.|
|uniquedevices||devices||Number of unique bluetooth devices sensed during a time segment instance as identified by their hardware addresses.|
|meanscans||scans||Mean of the scans of every sensed device within each time segment instance.|
|stdscans||scans||Standard deviation of the scans of every sensed device within each time segment instance.|
|countscansmostfrequentdevicewithinsegments||scans||Number of scans of the most sensed device within each time segment instance.|
|countscansleastfrequentdevicewithinsegments||scans||Number of scans of the least sensed device within each time segment instance.|
|countscansmostfrequentdeviceacrosssegments||scans||Number of scans of the most sensed device across time segment instances of the same type.|
|countscansleastfrequentdeviceacrosssegments||scans||Number of scans of the least sensed device across time segment instances of the same type per device.|
|countscansmostfrequentdeviceacrossdataset||scans||Number of scans of the most sensed device across the entire dataset of every participant.|
|countscansleastfrequentdeviceacrossdataset||scans||Number of scans of the least sensed device across the entire dataset of every participant.|
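The within-segment Bluetooth counts described above can be illustrated with a small counter over scan rows. This is a sketch, not RAPIDS code: the input is assumed to be one hardware address per scan row for a single time segment instance, and the addresses shown are made up.

```python
from collections import Counter
from statistics import mean


def bluetooth_segment_features(scanned_addresses):
    """Summarize one time segment instance from a list of scanned hardware addresses."""
    scans_per_device = Counter(scanned_addresses)
    counts = list(scans_per_device.values())
    if not counts:
        return {"countscans": 0, "uniquedevices": 0}
    return {
        "countscans": len(scanned_addresses),   # every scan row counts
        "uniquedevices": len(scans_per_device),  # distinct hardware addresses
        "meanscans": mean(counts),               # mean scans per sensed device
        "countscansmostfrequentdevicewithinsegments": max(counts),
        "countscansleastfrequentdevicewithinsegments": min(counts),
    }


# Hypothetical scan rows for one segment.
features = bluetooth_segment_features(["aa:01", "aa:01", "aa:01", "bb:02"])
```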
WiFi Details
|Feature||Units||Description|
|countscans||devices||Number of scanned WiFi access points connected during a time_segment. An access point can be detected multiple times over time, and these appearances are counted separately.|
|uniquedevices||devices||Number of unique access points during a time_segment as identified by their hardware address.|
|countscansmostuniquedevice||scans||Number of scans of the most scanned access point during a time_segment across the whole monitoring period.|
Physical Activity Details
|Feature||Units||Description|
|maxsumsteps||steps||The maximum daily step count during a time segment.|
|minsumsteps||steps||The minimum daily step count during a time segment.|
|avgsumsteps||steps||The average daily step count during a time segment.|
|mediansumsteps||steps||The median of daily step count during a time segment.|
|stdsumsteps||steps||The standard deviation of daily step count during a time segment.|
|sumsteps||steps||The total step count during a time segment.|
|maxsteps||steps||The maximum step count during a time segment.|
|minsteps||steps||The minimum step count during a time segment.|
|avgsteps||steps||The average step count during a time segment.|
|stdsteps||steps||The standard deviation of step count during a time segment.|
|countepisodesedentarybout||bouts||Number of sedentary bouts during a time segment.|
|sumdurationsedentarybout||minutes||Total duration of all sedentary bouts during a time segment.|
|maxdurationsedentarybout||minutes||The maximum duration of any sedentary bout during a time segment.|
|mindurationsedentarybout||minutes||The minimum duration of any sedentary bout during a time segment.|
|avgdurationsedentarybout||minutes||The average duration of sedentary bouts during a time segment.|
|stddurationsedentarybout||minutes||The standard deviation of the duration of sedentary bouts during a time segment.|
|countepisodeactivebout||bouts||Number of active bouts during a time segment.|
|sumdurationactivebout||minutes||Total duration of all active bouts during a time segment.|
|maxdurationactivebout||minutes||The maximum duration of any active bout during a time segment.|
|mindurationactivebout||minutes||The minimum duration of any active bout during a time segment.|
|avgdurationactivebout||minutes||The average duration of active bouts during a time segment.|
|stddurationactivebout||minutes||The standard deviation of the duration of active bouts during a time segment.|
Sleep Details
|Feature||Units||Description|
|countepisode[LEVEL][TYPE]||episodes||Number of [LEVEL][TYPE] sleep episodes. [LEVEL] is one of awake and asleep and [TYPE] is one of main, nap, and all. Same below.|
|sumduration[LEVEL][TYPE]||minutes||Total duration of all [LEVEL][TYPE] sleep episodes.|
|maxduration[LEVEL][TYPE]||minutes||Longest duration of any [LEVEL][TYPE] sleep episode.|
|minduration[LEVEL][TYPE]||minutes||Shortest duration of any [LEVEL][TYPE] sleep episode.|
|avgduration[LEVEL][TYPE]||minutes||Average duration of all [LEVEL][TYPE] sleep episodes.|
|medianduration[LEVEL][TYPE]||minutes||Median duration of all [LEVEL][TYPE] sleep episodes.|
|stdduration[LEVEL][TYPE]||minutes||Standard deviation duration of all [LEVEL][TYPE] sleep episodes.|
|firstwaketimeTYPE||minutes||First wake time for a certain sleep type during a time segment. Wake time is number of minutes after midnight of a sleep episode’s end time.|
|lastwaketimeTYPE||minutes||Last wake time for a certain sleep type during a time segment. Wake time is number of minutes after midnight of a sleep episode’s end time.|
|firstbedtimeTYPE||minutes||First bedtime for a certain sleep type during a time segment. Bedtime is number of minutes after midnight of a sleep episode’s start time.|
|lastbedtimeTYPE||minutes||Last bedtime for a certain sleep type during a time segment. Bedtime is number of minutes after midnight of a sleep episode’s start time.|
|countepisodeTYPE||episodes||Number of sleep episodes for a certain sleep type during a time segment.|
|avgefficiencyTYPE||scores||Average sleep efficiency for a certain sleep type during a time segment.|
|sumdurationafterwakeupTYPE||minutes||Total duration the user stayed in bed after waking up for a certain sleep type during a time segment.|
|sumdurationasleepTYPE||minutes||Total sleep duration for a certain sleep type during a time segment.|
|sumdurationawakeTYPE||minutes||Total duration the user stayed awake but still in bed for a certain sleep type during a time segment.|
|sumdurationtofallasleepTYPE||minutes||Total duration the user spent to fall asleep for a certain sleep type during a time segment.|
|sumdurationinbedTYPE||minutes||Total duration the user stayed in bed (sumdurationtofallasleep + sumdurationawake + sumdurationasleep + sumdurationafterwakeup) for a certain sleep type during a time segment.|
|avgdurationafterwakeupTYPE||minutes||Average duration the user stayed in bed after waking up for a certain sleep type during a time segment.|
|avgdurationasleepTYPE||minutes||Average sleep duration for a certain sleep type during a time segment.|
|avgdurationawakeTYPE||minutes||Average duration the user stayed awake but still in bed for a certain sleep type during a time segment.|
|avgdurationtofallasleepTYPE||minutes||Average duration the user spent to fall asleep for a certain sleep type during a time segment.|
|avgdurationinbedTYPE||minutes||Average duration the user stayed in bed (sumdurationtofallasleep + sumdurationawake + sumdurationasleep + sumdurationafterwakeup) for a certain sleep type during a time segment.|
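The two conventions used in the table above, the in-bed total as a sum of four components and clock times stored as minutes after midnight, can be illustrated with a minimal sketch. The suffix `main` for TYPE, the helper names, and the toy values are assumptions for illustration only; the actual column headers in the released files may carry additional prefixes or suffixes.

```python
from typing import Mapping

# The four components that the table lists as making up time in bed.
IN_BED_PARTS = (
    "sumdurationtofallasleep",
    "sumdurationawake",
    "sumdurationasleep",
    "sumdurationafterwakeup",
)


def total_in_bed(row: Mapping[str, float], sleep_type: str = "main") -> float:
    """Sum the four in-bed components (minutes) for one time segment."""
    return sum(row[f"{part}{sleep_type}"] for part in IN_BED_PARTS)


def minutes_to_clock(minutes: float) -> str:
    """Convert minutes after midnight into an HH:MM string."""
    hours, mins = divmod(int(minutes) % (24 * 60), 60)
    return f"{hours:02d}:{mins:02d}"


# Toy values, not taken from the dataset.
row = {
    "sumdurationtofallasleepmain": 10,
    "sumdurationawakemain": 35,
    "sumdurationasleepmain": 400,
    "sumdurationafterwakeupmain": 5,
    "firstbedtimemain": 1395,
}

in_bed_minutes = total_in_bed(row)                   # 10 + 35 + 400 + 5
bedtime_clock = minutes_to_clock(row["firstbedtimemain"])
print(in_bed_minutes, bedtime_clock)
```

A check of this kind can help confirm that a loaded table follows the documented relationship before the features are used in a model.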
Participant Info Data
The ParticipantInfoData folder contains files with additional information.
- platform.csv: Each participant's primary smartphone platform (iOS or Android), indexed by pid.
- demographics.csv: Due to privacy concerns, demographic data are available only by special request. Please contact us directly with a clear research plan describing how the demographic data will be used.
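As a rough sketch of how platform.csv might be joined to per-participant features through pid, the example below uses an in-memory stand-in for the file. The header names (`pid`, `platform`), the value spellings, and the toy feature rows are assumptions for illustration and should be checked against the actual files.

```python
import csv
import io
from collections import Counter

# Stand-in for ParticipantInfoData/platform.csv. The header names and rows
# are assumptions for illustration, not copied from the dataset.
toy_platform_csv = io.StringIO(
    "pid,platform\n"
    "P001,ios\n"
    "P002,android\n"
    "P003,ios\n"
)

# Build a pid -> platform lookup.
platform_by_pid = {
    record["pid"]: record["platform"]
    for record in csv.DictReader(toy_platform_csv)
}

# Hypothetical feature rows keyed by the same pid.
features = [
    {"pid": "P001", "sumdurationasleepmain": 400},
    {"pid": "P003", "sumdurationasleepmain": 380},
    {"pid": "P004", "sumdurationasleepmain": 410},  # no platform record
]

# Attach the platform to each feature row, flagging missing participants.
for feature_row in features:
    feature_row["platform"] = platform_by_pid.get(feature_row["pid"], "unknown")

platform_counts = Counter(feature_row["platform"] for feature_row in features)
print(platform_counts)
```

Keeping an explicit "unknown" value makes participants without a platform record visible instead of silently dropping them.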
We provide a behavior modeling benchmark platform GLOBEM [8,9]. The platform is designed to support researchers in using, developing, and evaluating different longitudinal behavior modeling methods.
Researchers who use the datasets must agree to the following terms.
Although the database has been anonymized, we cannot eliminate all potential risks of private information leakage. The PI of any research group with access to the dataset is responsible for continuing to safeguard this database, taking whatever steps are appropriate to protect participants’ privacy and data confidentiality. The specific actions required to safeguard the data may change over time.
If, at any point, the administrators of the datasets at the University of Washington have concerns or reasonable suspicions that the researcher has violated these usage notes, the researcher will be notified. Concerns about misuse may be shared with PhysioNet and other related entities.
Our datasets have led to multiple publications; see references [11-18].
v1.0 - Release of our GLOBEM Dataset
Our datasets aim to aid research on developing, testing, and evaluating machine learning algorithms to better understand the daily behaviors, health, and well-being of college students (and potentially the more general population) from continuous sensor streams and self-reports. These findings may support public interest in how to improve student experiences and drive policy around adverse events that students and others may experience.
Privacy is the major ethical concern of our data collection studies. Our study obtained IRB approval from the University of Washington (IRB number STUDY00003244). Participants signed a consent form before joining the study. We strictly follow the IRB rules to anonymize participants' data. No one outside our core data collection group can access directly identifiable individual information. We also removed the data of users who stopped participating at any time during the study. Since some sensitive sensor data (e.g., location) can disclose identities, we release only feature-level data, under credentialed access, to protect against privacy leakage.
Our multi-year data collection study closely followed a sister study at Carnegie Mellon University (CMU). We acknowledge the efforts of the CMU Study Team in providing important starting and reference materials. Moreover, our studies were greatly inspired by the StudentLife researchers at Dartmouth College.
Our studies were supported by the University of Washington (including the Paul G. Allen School of Computer Science and Engineering; Department of Electrical and Computer Engineering; Population Health; Addictions, Drug and Alcohol Institute; and the Center for Research and Education on Accessible Technology and Experiences); the National Science Foundation (EDA-2009977, CHS-2016365, CHS-1941537, IIS-1816687, and IIS-7974751); the National Institute on Disability, Independent Living and Rehabilitation Research (90DPGE0003-01); Samsung Research America; and Google.
Conflicts of Interest
The authors have no conflicts of interest to declare.
- N. D. Lane, E. Miluzzo, H. Lu, D. Peebles, T. Choudhury, and A. T. Campbell. A survey of mobile phone sensing. IEEE Communications Magazine, 48(9), 2010.
- M. E. Morris, Q. Kathawala, T. K. Leen, E. E. Gorenstein, F. Guilak, W. DeLeeuw, and M. Labhard. Mobile therapy: case study evaluations of a cell phone application for emotional self-awareness. Journal of medical Internet research, 12(2):e10, 2010.
- R. Wang, F. Chen, Z. Chen, T. Li, G. Harari, S. Tignor, X. Zhou, D. Ben-Zeev, and A. T. Campbell. Studentlife: Assessing mental health, academic performance and behavioral trends of college students using smartphones. In Proceedings of the 2014 ACM International Joint Conference on Pervasive and Ubiquitous Computing, pages 3–14. ACM, 2014.
- J.-K. Min, A. Doryab, J. Wiese, S. Amini, J. Zimmerman, and J. I. Hong. Toss “n” turn: Smartphone as sleep and sleep quality detector. In Proceedings of the SIGCHI Conference on Human Factors in Computing Systems, CHI ’14, pages 477–486, New York, NY, USA, 2014. Association for Computing Machinery.
- S. M. Mattingly, J. M. Gregg, P. Audia, A. E. Bayraktaroglu, A. T. Campbell, N. V. Chawla, V. Das Swain, M. De Choudhury, S. K. D’Mello, A. K. Dey, et al. The tesserae project: Large-scale, longitudinal, in-situ, multimodal sensing of information workers. In Extended Abstracts of the 2019 CHI Conference on Human Factors in Computing Systems, pages 1–8, 2019.
- R. Wang, G. Harari, P. Hao, X. Zhou, and A. T. Campbell. Smartgpa: how smartphones can assess and predict academic performance of college students. In Proceedings of the 2015 ACM international joint conference on pervasive and ubiquitous computing, pages 295–306, 2015.
- D. Ferreira, V. Kostakos, and A. K. Dey. Aware: Mobile context instrumentation framework. Frontiers in ICT, 2:6, 2015.
- GLOBEM Home Page. https://the-globem.github.io
- Benchmark Platform GLOBEM. https://github.com/UW-EXP/GLOBEM/
- Rapids documentation. https://www.rapids.science/1.6/
- M. E. Morris, K. S. Kuehn, J. Brown, P. S. Nurius, H. Zhang, Y. S. Sefidgar, X. Xu, E. A. Riskin, A. K. Dey, S. Consolvo, and J. C. Mankoff. College from home during COVID-19: A mixed-methods study of heterogeneous experiences. PLOS ONE, 16(6):e0251580, June 2021.
- P. S. Nurius, Y. S. Sefidgar, K. S. Kuehn, J. Jung, H. Zhang, O. Figueira, E. A. Riskin, A. K. Dey, and J. C. Mankoff. Distress among undergraduates: Marginality, stressors and resilience resources. Journal of American College Health, pages 1–9, July 2021.
- Y. S. Sefidgar, W. Seo, K. S. Kuehn, T. Althoff, A. Browning, E. Riskin, P. S. Nurius, A. K. Dey, and J. Mankoff. Passively-sensed Behavioral Correlates of Discrimination Events in College Students. Proceedings of the ACM on Human-Computer Interaction, 3(CSCW):1–29, Nov. 2019.
- X. Xu, P. Chikersal, A. Doryab, D. K. Villalba, J. M. Dutcher, M. J. Tumminia, T. Althoff, S. Cohen, K. G. Creswell, J. D. Creswell, J. Mankoff, and A. K. Dey. Leveraging Routine Behavior and Contextually-Filtered Features for Depression Detection among College Students. Proceedings of the ACM on Interactive, Mobile, Wearable and Ubiquitous Technologies, 3(3):1–33, Sept. 2019.
- X. Xu, P. Chikersal, J. M. Dutcher, Y. S. Sefidgar, W. Seo, M. J. Tumminia, D. K. Villalba, S. Cohen, K. G. Creswell, J. D. Creswell, A. Doryab, P. S. Nurius, E. Riskin, A. K. Dey, and J. Mankoff. Leveraging Collaborative-Filtering for Personalized Behavior Modeling: A Case Study of Depression Detection among College Students. Proceedings of the ACM on Interactive, Mobile, Wearable and Ubiquitous Technologies, 5(1):1–27, Mar. 2021.
- X. Xu, J. Mankoff, and A. K. Dey. Understanding practices and needs of researchers in human state modeling by passive mobile sensing. CCF Transactions on Pervasive Computing and Interaction, July 2021.
- H. Zhang, M. E. Morris, P. S. Nurius, K. Mack, J. Brown, K. S. Kuehn, Y. S. Sefidgar, X. Xu, E. A. Riskin, A. K. Dey, and J. Mankoff. Impact of Online Learning in the Context of COVID-19 on Undergraduates with Disabilities and Mental Health Concerns. ACM Transactions on Accessible Computing, page 3538514, July 2022.
- H. Zhang, P. Nurius, Y. Sefidgar, M. Morris, S. Balasubramanian, J. Brown, A. K. Dey, K. Kuehn, E. Riskin, X. Xu, and J. Mankoff. How Does COVID-19 impact Students with Disabilities/Health Concerns? arXiv:2005.05438 [cs], May 2020.
Only credentialed users who sign the DUA can access the files.
License (for files):
PhysioNet Credentialed Health Data License 1.5.0
Data Use Agreement:
PhysioNet Credentialed Health Data Use Agreement 1.5.0
CITI Data or Specimens Only Research