Database Open Access
neuroQWERTY MIT-CSXPD Dataset
Published: Dec. 20, 2016. Version: 1.0.0
New Database Added: neuroQWERTY MIT-CSXPD (Dec. 20, 2016, midnight)
Please include the standard citation for PhysioNet:
Goldberger, A., Amaral, L., Glass, L., Hausdorff, J., Ivanov, P. C., Mark, R., ... & Stanley, H. E. (2000). PhysioBank, PhysioToolkit, and PhysioNet: Components of a new research resource for complex physiologic signals. Circulation [Online]. 101 (23), pp. e215–e220.
Abstract
The neuroQWERTY MIT-CSXPD database contains keystroke logs collected from 85 subjects with and without Parkinson's disease (PD). The dataset was collected and analyzed to show that routine interaction with computer keyboards can be used to detect motor signs in the early stages of PD.
Data Collection
The subjects were recruited from two movement disorder units in Madrid (Spain) following the institutional protocols approved by the Massachusetts Institute of Technology, USA (Committee on the Use of Humans as Experimental Subjects approval no. 1402006203), Hospital 12 de Octubre, Spain (no. CEIC:14/090) and Hospital Clinico San Carlos, Spain (no. 14/136-E).
Each data file contains the timing information recorded during typing sessions in a standard word processor on a Lenovo G50-70 i3-4005U with 4MB of memory and a 15-inch screen running Manjaro Linux. Subjects were instructed to type as they normally would at home and were free to correct typing mistakes only if they wanted to. The key acquisition software had a temporal resolution of 3 ms (mean) with a standard deviation of 0.28 ms.
There are two datasets collected from two sets of experiments:
- PD_MIT-CS1PD - 31 subjects: 13 healthy controls and 18 PD sufferers. Subjects were asked to visit a movement disorder unit twice to complete the study, so each subject's data is stored in two CSV files.
- PD_MIT-CS2PD - 54 subjects. 30 healthy controls and 24 PD sufferers. Subjects were asked to visit a movement disorder unit once to complete the study.
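The cohort sizes listed above can be cross-checked against the 85 subjects stated in the abstract. The sketch below does that arithmetic and estimates how many keystroke files to expect per cohort. It assumes exactly one keystroke file per subject per visit, which the description implies but does not state outright for PD_MIT-CS2PD.

```python
# Cohort sizes as stated in the dataset description.
COHORTS = {
    "PD_MIT-CS1PD": {"controls": 13, "pd": 18, "visits": 2},
    "PD_MIT-CS2PD": {"controls": 30, "pd": 24, "visits": 1},
}


def cohort_totals(cohorts):
    """Return {cohort: (subjects, expected keystroke files)}."""
    totals = {}
    for name, c in cohorts.items():
        subjects = c["controls"] + c["pd"]
        # Assumption: one keystroke file per subject per visit.
        totals[name] = (subjects, subjects * c["visits"])
    return totals


totals = cohort_totals(COHORTS)
total_subjects = sum(subjects for subjects, _ in totals.values())
```

If the number of CSV files found after unpacking differs from these counts, the one-file-per-visit assumption is the first thing to revisit.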
Along with the raw typing collections, clinical evaluations were also performed on each subject, including UPDRS and finger tapping tests. See the referenced publication for more details.
Data Files
The data from each of the two experiment sets are split into their own subdirectories. Each dataset contains a subject summary CSV file, GT_DataPD_MIT-CSXPD.csv, which lists for each subject:
- pID - Patient ID
- gt - Ground truth label of whether or not they had PD
- updrs108 - Unified Parkinson’s Disease Rating Scale part III (UPDRS-III)
- afTap - Alternating finger tapping result
- sTap - Single key tapping result
- nqScore - neuroQWERTY index (nQi)
- Typing speed
- file_n - The csv file(s) containing the patient's typing data
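As a rough illustration of how the summary file could be read, the sketch below uses only the Python standard library. It assumes the header row uses the column names listed above (pID, gt, nqScore, and one or more file_n columns such as file_1, file_2) and that gt is written as "True"/"False" or "1"/"0"; neither assumption is confirmed by this page. The sample rows are made up for illustration and are not taken from the dataset.

```python
import csv
import io


def load_subject_summary(handle):
    """Read a subject summary CSV into a list of dicts, one per subject.

    Assumes header names match the column list in the description and
    that gt is encoded as "True"/"False" or "1"/"0" (unverified).
    """
    subjects = []
    for row in csv.DictReader(handle):
        # Gather every non-empty column whose name starts with "file_".
        files = [v for k, v in row.items() if k and k.startswith("file_") and v]
        subjects.append({
            "pID": row["pID"],
            "has_pd": row["gt"].strip().lower() in ("true", "1"),
            "nqScore": float(row["nqScore"]),
            "files": files,
        })
    return subjects


# Made-up rows for illustration only; not real dataset values.
sample = io.StringIO(
    "pID,gt,updrs108,afTap,sTap,nqScore,file_1,file_2\n"
    "1001,True,20.5,100.0,150.0,0.12,a_1.csv,a_2.csv\n"
    "1002,False,2.0,120.0,170.0,0.03,b_1.csv,\n"
)
subjects = load_subject_summary(sample)
```

With a real file, the same function would be called as `load_subject_summary(open(path, newline=""))`, where `path` points at the summary CSV in the relevant subdirectory.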
Each keystroke data csv file has four columns which give:
- The key pressed.
- The hold duration in seconds.
- The key release time in seconds from time 0.
- The key press time in seconds from time 0.
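The four-column layout above lends itself to a small parser. The sketch below assumes comma-separated rows in the stated column order with no header row (an assumption, not confirmed here), and derives two quantities: the press-to-press latency between consecutive keystrokes, and a consistency check that the hold duration equals release time minus press time. The function names and the sample rows are hypothetical.

```python
import csv
import io
from typing import NamedTuple


class Keystroke(NamedTuple):
    key: str
    hold: float     # hold duration, seconds
    release: float  # release time, seconds from time 0
    press: float    # press time, seconds from time 0


def parse_keystrokes(handle):
    """Parse rows of (key, hold, release, press). Assumes no header row."""
    return [
        Keystroke(row[0], float(row[1]), float(row[2]), float(row[3]))
        for row in csv.reader(handle)
        if len(row) >= 4
    ]


def flight_times(keystrokes):
    """Press-to-press latency between consecutive keystrokes, in seconds."""
    ordered = sorted(keystrokes, key=lambda k: k.press)
    return [b.press - a.press for a, b in zip(ordered, ordered[1:])]


def hold_mismatches(keystrokes, tol=1e-3):
    """Keystrokes whose hold differs from release - press by more than tol."""
    return [k for k in keystrokes if abs((k.release - k.press) - k.hold) > tol]


# Made-up rows for illustration; the third row is deliberately inconsistent.
sample = io.StringIO(
    "a,0.100,1.100,1.000\n"
    "b,0.080,1.330,1.250\n"
    "c,0.120,1.720,1.500\n"
)
keystrokes = parse_keystrokes(sample)
latencies = flight_times(keystrokes)
suspect = hold_mismatches(keystrokes)
```

The tolerance of 1 ms in `hold_mismatches` is an arbitrary choice for the sketch; a value tied to the acquisition software's stated temporal resolution may be more appropriate.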
The neuroQWERTY.zip file includes all of the data along with the scripts described in the next section.
Loading Scripts
The nqDataLoader.py Python module contains functions used to filter anomalous results and load the data from the CSV data files. The readme.ipynb IPython notebook uses these functions and demonstrates how to load and display the data.
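This page does not describe which results nqDataLoader.py treats as anomalous, so the following is only a hypothetical illustration of what such a filter might look like, not the module's actual logic or API. It drops hold durations that are non-positive or longer than a placeholder threshold; both the function name `filter_hold_times` and the 5-second default are assumptions made for the example.

```python
def filter_hold_times(hold_times, max_hold=5.0):
    """Keep hold durations in the range (0, max_hold] seconds.

    Hypothetical rule for illustration only: max_hold=5.0 is an arbitrary
    placeholder, not a threshold taken from nqDataLoader.py.
    """
    return [h for h in hold_times if 0.0 < h <= max_hold]


# Made-up hold durations, including a negative value and a very long hold.
holds = [0.09, 0.11, -0.02, 12.4, 0.10]
clean = filter_hold_times(holds)
```

For the filtering actually applied in the referenced analysis, consult nqDataLoader.py and readme.ipynb in the downloaded archive.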
Acknowledgements
These datasets were collected as part of the neuroQWERTY project at the Massachusetts Institute of Technology with financial support from the Comunidad de Madrid, Fundacion Ramon Areces and The Michael J. Fox Foundation for Parkinson's Research (grant number 10860). We thank the M + Vision faculty for their guidance in developing this project. We also thank our many clinical collaborators at MGH in Boston, and at “12 de Octubre”, Hospital Clinico and Centro Integral en Neurociencias HM CINAC in Madrid, for their insightful contributions.
Access
Access Policy:
Anyone can access the files, as long as they conform to the terms of the specified license.
License (for files):
Open Data Commons Attribution License v1.0
Discovery
DOI (version 1.0.0):
https://doi.org/10.13026/C2859Q
DOI (latest version):
https://doi.org/10.13026/g0bd-1m78
Topics:
parkinsons
neuroelectric
brain
Files
Total uncompressed size: 7.3 MB.
Access the files
- Download the ZIP file (7.3 MB)
- Access the files using the Google Cloud Storage Browser here. Login with a Google account is required.
- Access the data using the Google Cloud command line tools (please refer to the gsutil documentation for guidance):
  gsutil -m -u YOUR_PROJECT_ID cp -r gs://nqmitcsxpd-1.0.0.physionet.org DESTINATION
- Download the files using your terminal:
  wget -r -N -c -np https://physionet.org/files/nqmitcsxpd/1.0.0/
- Download the files using AWS command line tools:
  aws s3 sync --no-sign-request s3://physionet-open/nqmitcsxpd/1.0.0/ DESTINATION
Name | Size | Modified |
---|---|---|
Parent Directory | ||
data_MIT-CS2PD | ||
GT_DataPD_MIT-CS2PD.csv (download) | 4.1 KB | 2016-12-20 |