# This file describes the data organisation in the In-Gauge and En-Gage dataset.
# Contact: Nan Gao (nannn.gao@gmail.com)

### Longitudinal Folder.
The file names refer to the names of the classrooms in which the indoor data was obtained. Each file contains the indoor data of its own classroom (environmental data / occupancy / usability) as well as other data fields that are duplicated across all files (weather- and time-related data). The resolution is 5 min. Missing data has been linearly interpolated. Occupancy was obtained from the individual classroom schedules provided by the school and takes Victorian school holidays into account.

The fields are:
Time [time] 00:00:00 ~ 23:55:00
Day [int] 1 ~ 169, where 1 represents the first day of the data collection, and so on.
Occupied [int] 0 = No; 1 = Yes
SchoolDay [int] 0 = No; 1 = Yes
Hour [int] 0 ~ 23
LessonNumber [int] -1 = outside of school hours; 0 = 8:50-9:00; 1 = 9:00-9:40; 2 = 9:40-10:20; 3 = 10:20-11:00; 4 = 11:25-12:05; 5 = 12:05-12:45; 6 = 12:45-13:25; 7 = 14:15-14:55; 8 = 14:55-15:35; 9 = Recess times or special "Breadth Studies" session on Wednesdays
LessonPct [float] 0.0 ~ 1.0 (describes how much of the current lesson has passed)
IndoorTemperature [float] °C
IndoorHumidity [int] %
IndoorCO2 [int] ppm
IndoorNoise [int] dB
OutdoorTemperature [float] °C
OutdoorHumidity [float] %
OutdoorDewpoint [float] °C
OutdoorWindDirection [int] 0 ~ 360° (0° = north wind, 90° = east wind, etc.)
OutdoorWindSpeed [float] m/s
OutdoorGustSpeed [float] m/s
Precipitation [float] mm
UvLevel [int] Global Solar UV Index
SolarRadiation [int] W/m2
CoolingState [int] 0 = Off; 1 = On
HeatingState [int] 0 = Off; 1 = On
UsabilityMask [bool] For timeframes where too much data was missing, the UsabilityMask field was set to "False" for the entire day. During holidays, the UsabilityMask also reads "False".

### Participant_class_info Folder.
This folder contains the demographic information and background questionnaire responses of the participants, and the class schedule. Note that for several survey questions, we adopted the 5-point Likert scale: -2 = 'strongly disagree', -1 = 'somewhat disagree', 0 = 'neither agree nor disagree', 1 = 'somewhat agree' and 2 = 'strongly agree'.

The Participant_class_info folder contains the following files:

1. Student.csv. Each row in this file contains a participant ID (Column A), gender (Column B), age in years (Column C), form room, math room and language room (Columns D–F), and three background questions (Columns G–I) related to their general thermal comfort and engagement in class. Specifically, Columns G to I represent, respectively, the questions 'What is your general feeling in the classroom?' [-3 = cold, -2 = cool, -1 = slightly cool, 0 = neutral, 1 = slightly warm, 2 = warm, 3 = hot], 'When I am engaged in class, I usually don't feel too hot or too cold' and 'When I am engaged in class, I could get distracted when the room is too hot or too cold'. For the latter 2 questions, we adopted the 5-point Likert scale.

2. Teacher.csv. Each row in this file contains a participant ID (Column A), gender (Column B), age in years (Column C), teaching subject (Column D), and three background questions similar to those in the Student.csv file, except that we changed the last two questions slightly from 'When I am engaged in class, […]' to 'When I am engaged in teaching, […]'.

3. Class_table.csv. We generated this file from the class schedule obtained from the school. Each row in this file contains the information of one single class, including the unique class ID (Column A), classroom (Column B), start time of the current class (Column C), finish time of the current class (Column D), length of the class (Column E), week (Column F), weekday (Column G), the order of the class (Column H) and the course name (Column I).
Specifically, Column F indicates the order of the week during the data collection, where 1 is the first week, 2 is the second week, etc. Column G is the weekday, where 1 to 5 indicate Monday to Friday. Column J shows whether students take this class in a form group, where '0' indicates they are not in a form group, 'all' indicates all students take this class in one whole form group (i.e., Assembly, Chapel), and R1/R2/R3 indicates that students take this class in form groups and their form room is R1, R2 or R3.

### Survey Folder.
This folder contains 2 files: Student_survey.csv and Teacher_survey.csv.

Student_survey.csv contains 488 survey responses in 17 columns, where Column A is the participant ID, Column B is the order of the week, Column C is the weekday and Column D is the time of the day. There are columns containing thermal comfort-related information (Columns E–G), seating location (Columns H–I), multi-dimensional student engagement (Columns J–N), mood (Columns O–P), and confidence level of the survey (Column Q). The engagement questions were rated on the Likert scale. To calculate the engagement score, users should reverse the responses in item 2 and item 4, then calculate the average of the 5-point Likert scale for each dimension of engagement. The specific columns relate to the following questions:
• Column E: Thermal_sensation: 'How do you feel right now in the classroom?' [-3 = cold, -2 = cool, -1 = slightly cool, 0 = neutral, 1 = slightly warm, 2 = warm, 3 = hot].
• Column F: Thermal_preference: 'Would you like to be?' [Cooler, No change, Warmer].
• Column G: Clothing: 'What are you wearing now? (multiple options allowed)' [Shirt, Jumper, Jacket, Pants, Shorts, Skirt, Dress, Other].
• Columns H–I: Loc_x, Loc_y: 'Where did you sit in the last class? (please click on the floorplan)' [x, y pixels in the 400*321 room thumbnail where x = y = 0 at the upper left corner].
• Columns J–N: Engage_1, 2, 3, 4, 5: 'Please describe your engagement in the last class': [I paid attention in class], [I pretended to participate in class but actually not], [I enjoyed learning new things in class], [I felt discouraged when we worked on something], [I asked myself questions to make sure I understood the class content].
• Columns O–P: Arousal, Valence: 'Touch the photo that best captures how you feel right now (optional)' [We assigned the arousal and valence values from 1–4 to each picture. For instance, for the bottom-right picture, valence = 4 and arousal = 1].
• Column Q: Confidence_level: 'Please rate your confidence level for your answers in this survey (optional)' [5-point Likert scale where 1 = Not confident, 2 = Slightly confident, 3 = Moderately confident, 4 = Very confident, 5 = Extremely confident].

Teacher_survey.csv contains 22 survey responses from the teachers. The file includes 13 columns, where Column A is the recorded time of the day and Columns B–C are the week and weekday. Column D is the wristband ID, Columns E–G are the thermal comfort-related information, Columns H–L are the engagement-related information, and Column M is the confidence level of the survey. For the wristband ID in Column D, A/B/C/D represent the classrooms R1/R2/R3/R4. The specific columns relate to the following questions:
• Column D: Wristband_id: 'Please enter your wristband ID.' [A, B, C, D].
• Column E: Thermal_sensation: 'How do you feel right now in the classroom?' [-3 = cold, -2 = cool, -1 = slightly cool, 0 = neutral, 1 = slightly warm, 2 = warm, 3 = hot].
• Column F: Thermal_preference: 'Would you like to be?' [Cooler, No change, Warmer].
• Column G: Clothing: 'What are you wearing now? (multiple options allowed)' [Shirt, Jumper, Jacket, Pants, Shorts, Skirt, Dress, Other].
• Columns H–L: Engage_1, 2, 3, 4, 5: 'Please describe your engagement in the last class': [I was excited about teaching], [I felt happy while teaching], [While teaching, I paid a lot of attention to my work], [I cared about the problems of my students], [I was aware of my students' feelings].
• Column M: Confidence_level: 'Please rate your confidence level for your answers in this survey (optional)' [5-point Likert scale where 1 = Not confident, 3 = Somewhat confident, 5 = Very confident].

### Raw_wearable_data Folder.
This folder includes 20 sub-folders named with the week and weekday of data collection (e.g., 'Week1_4' indicates Thursday of the first week), containing the raw wearable data for each day during the 4-week data collection. In each sub-folder, there are multiple sessions from different participants. Some participants provided more than 1 session on the same day. The name of each session consists of two parts connected by an underscore: the unique session ID and the participant ID. For example, the session named '1567380164_18' indicates the data is provided by participant 18.

There are 6 CSV files in each session, and each of these files (except IBI.csv) has the following format: the first row is the initial time of the session expressed as a Unix timestamp in UTC. The second row is the sample rate expressed in Hz. Specifically:

1. ACC.csv contains data from a 3-axis accelerometer sampled at 32 Hz, which is configured to measure accelerations in the range of [-2 g, 2 g]. Acceleration is the rate of change of velocity with respect to time; its SI (International System of Units) derived unit is the metre per second squared (m·s⁻²), and 1 g is equal to 9.80665 m·s⁻². The unit in this file is 1/64 g, so a raw value of 64 indicates 1 g. The 3 columns refer to the x, y, and z-axis, respectively.

2. BVP.csv contains the blood volume pulse (BVP) signal sampled at 64 Hz, which is the primary output from the photoplethysmography (PPG) sensor.
BVP signals can be used to compute the inter-beat intervals (IBI) and heart rate (HR).

3. EDA.csv contains data from an electrodermal activity (EDA) sensor expressed in microsiemens (µS), sampled at 4 Hz. Variations in the EDA values reflect electrical changes at the skin surface, which arise when the skin receives nerve signals from the brain and the sweat level increases.

4. HR.csv contains the average heart rate data extracted from the BVP signals, calculated in spans of 10 seconds. The first row is the initial time of the session and it is 10 seconds after the beginning of the recording. The sampling rate of heart rate is 1 Hz.

5. IBI.csv contains the time intervals between a participant's heartbeats extracted from the BVP signals. This file does not have a sampling rate. The first column is the time (with respect to the starting time) of the detected inter-beat interval expressed in seconds (s). The second column is the duration in seconds (s) of the detected inter-beat interval (i.e., the distance in seconds from the previous beat).

6. TEMP.csv contains data from a temperature sensor expressed in degrees Celsius (°C), sampled at 4 Hz.

### Class_wearable_data Folder.
The Class_wearable_data folder contains 221 sub-folders representing 221 different classes during which the wearable data were recorded. Each sub-folder is named by the unique 'Class_id' as shown in Class_table.csv. Each sub-folder includes further sub-folders named by the unique participant ID or simply the label 'teacher'. These contain data from the wristband sensors for each participant in this class. There are 6 CSV files in each sub-folder: ACC.csv, EDA.csv, BVP.csv, HR.csv, IBI.csv, and TEMP.csv. The last column of the files is the time of the day and the other columns are the values of the signal, with the same units as in the Raw_wearable_data folder.
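
As an illustration of the raw file layout described under Raw_wearable_data (first row: session start as a Unix timestamp in UTC; second row: sample rate in Hz; remaining rows: samples), the following Python sketch parses such a file from text and converts accelerometer values from 1/64 g to g. It is not part of the dataset's tooling. The function names `parse_raw_signal`, `sample_offsets` and `acc_to_g` and the example values are hypothetical, and the sketch assumes that in multi-column files such as ACC.csv the two header values are repeated in every column. It does not apply to IBI.csv, which has no sample-rate row.

```python
import csv
import io
from datetime import datetime, timezone

ACC_UNITS_PER_G = 64.0  # from the README: a raw value of 64 indicates 1 g


def parse_raw_signal(text):
    """Parse the text of one raw signal file (not IBI.csv).

    Returns (start_time_utc, sample_rate_hz, samples), where samples is a
    list with one inner list of floats per data row.
    """
    reader = csv.reader(io.StringIO(text))
    rows = [[float(value) for value in row] for row in reader if row]
    # Row 1 holds the session start time, row 2 the sample rate.
    # Assumption: header values are repeated per column, so the first
    # column is enough.
    start = datetime.fromtimestamp(rows[0][0], tz=timezone.utc)
    rate = rows[1][0]
    return start, rate, rows[2:]


def sample_offsets(n_samples, rate):
    """Seconds elapsed since the session start for each sample index."""
    return [i / rate for i in range(n_samples)]


def acc_to_g(samples):
    """Convert raw accelerometer values (1/64 g) to units of g."""
    return [[value / ACC_UNITS_PER_G for value in row] for row in samples]


# Made-up ACC.csv content, for illustration only (not taken from the dataset).
example_acc = """1567380164.0, 1567380164.0, 1567380164.0
32.0, 32.0, 32.0
64, 0, -32
32, 16, 0
"""

start, rate, raw = parse_raw_signal(example_acc)
offsets = sample_offsets(len(raw), rate)
acc_g = acc_to_g(raw)
```

To read a real file, pass its contents to the same function, e.g. `parse_raw_signal(open(path).read())`, where `path` points to one of the session files. Adding each offset to the start time gives an absolute UTC timestamp per sample.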