
neuroQWERTY MIT-CSXPD Dataset

Luca Giancardo

Published: Dec. 20, 2016. Version: 1.0.0


When using this resource, please cite the original publication:

L. Giancardo, A. Sánchez-Ferro, T. Arroyo-Gallego, I. Butterworth, C. S. Mendoza, P. Montero, M. Matarazzo, J. A. Obeso, M. L. Gray, R. San José Estépar. Computer keyboard interaction as an indicator of early Parkinson's disease. Scientific Reports 6, 34468; doi: 10.1038/srep34468 (2016)

Please include the standard citation for PhysioNet:
Goldberger, A., Amaral, L., Glass, L., Hausdorff, J., Ivanov, P. C., Mark, R., ... & Stanley, H. E. (2000). PhysioBank, PhysioToolkit, and PhysioNet: Components of a new research resource for complex physiologic signals. Circulation [Online]. 101 (23), pp. e215–e220.

Abstract

The neuroQWERTY MIT-CSXPD database contains keystroke logs collected from 85 subjects with and without Parkinson's disease (PD). The dataset was collected and analyzed to show that routine interaction with computer keyboards can be used to detect motor signs in the early stages of PD.

Data Collection

The subjects were recruited from two movement disorder units in Madrid (Spain) following the institutional protocols approved by the Massachusetts Institute of Technology, USA (Committee on the Use of Humans as Experimental Subjects approval no. 1402006203), Hospital 12 de Octubre, Spain (no. CEIC:14/090) and Hospital Clinico San Carlos, Spain (no. 14/136-E).

Each data file contains the timing information recorded during typing sessions in a standard word processor on a Lenovo G50-70 i3-4005U with 4MB of memory and a 15-inch screen, running Manjaro Linux. Subjects were instructed to type as they normally would at home and were free to correct typing mistakes only if they wanted to. The key acquisition software had a temporal resolution of 3 ± 0.28 milliseconds (mean ± standard deviation).

There are two datasets collected from two sets of experiments:

  1. PD_MIT-CS1PD - 31 subjects. 13 healthy controls and 18 PD sufferers. Subjects were asked to visit a movement disorder unit twice to complete the study, so each subject's data is stored in two csv files.
  2. PD_MIT-CS2PD - 54 subjects. 30 healthy controls and 24 PD sufferers. Subjects were asked to visit a movement disorder unit once to complete the study.

Along with the raw typing collections, clinical evaluations were also performed on each subject, including UPDRS and finger tapping tests. See the referenced publication for more details.

Data Files

The data from each of the two experiments are stored in their own subdirectory. Each subdirectory contains a subject summary csv file, GT_DataPD_MIT-CSXPD.csv (where X is 1 or 2), which lists for each subject:

  • pID - Patient ID
  • gt - Ground truth label of whether or not they had PD
  • updrs108 - Unified Parkinson’s Disease Rating Scale part III (UPDRS-III)
  • afTap - Alternating finger tapping result
  • sTap - Single key tapping result
  • nqScore - neuroQWERTY index (nQi)
  • Typing speed
  • file_n - The csv file(s) containing the patient's typing data
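The summary file can be read with Python's standard csv module. The following is a minimal sketch rather than the loader shipped with the dataset. It assumes that the file has a header row using the column names listed above and that the gt column holds a text label; the function names and the sample rows are hypothetical placeholders, not real subject data.

```python
import csv
import io


def load_subject_summary(handle):
    """Return one dict per subject, keyed by the header names of the file."""
    return list(csv.DictReader(handle))


def split_by_label(rows, label_column="gt"):
    """Group patient IDs by the value found in the ground truth column."""
    groups = {}
    for row in rows:
        groups.setdefault(row[label_column], []).append(row["pID"])
    return groups


# Made-up placeholder rows (NOT real data); only a subset of columns is shown.
SAMPLE = (
    "pID,gt,updrs108,file_n\n"
    "1000,True,20.5,a.csv\n"
    "1001,False,2.0,b.csv\n"
    "1002,True,15.0,c.csv\n"
)

rows = load_subject_summary(io.StringIO(SAMPLE))
groups = split_by_label(rows)
```

To read a real file, pass an open file object instead of the in-memory sample, for example open("GT_DataPD_MIT-CS2PD.csv", newline=""). All values are returned as strings, so numeric columns need an explicit conversion.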

Each keystroke data csv file has four columns which give:

  • The key pressed.
  • The hold duration in seconds.
  • The key release time in seconds from time 0.
  • The key press time in seconds from time 0.
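As a rough illustration of this layout, the sketch below parses a keystroke file and derives two quantities from it: a consistency check that the hold duration matches release time minus press time, and the latency between successive key presses. It assumes comma-separated rows in the column order given above; rows that do not have four fields or whose timing fields are not numeric (such as a possible header row) are skipped. The names Keystroke, read_keystrokes, hold_mismatches and press_latencies are hypothetical, and the sample rows are made up.

```python
import csv
import io
from typing import NamedTuple


class Keystroke(NamedTuple):
    key: str
    hold: float     # hold duration in seconds
    release: float  # release time in seconds from time 0
    press: float    # press time in seconds from time 0


def read_keystrokes(handle):
    """Parse rows of (key, hold, release, press), skipping unparseable rows."""
    events = []
    for row in csv.reader(handle):
        if len(row) != 4:
            continue
        key, hold, release, press = row
        try:
            events.append(Keystroke(key, float(hold), float(release), float(press)))
        except ValueError:
            continue  # non-numeric timing field, e.g. a header row
    return events


def hold_mismatches(events, tol=1e-3):
    """Return events whose hold time differs from release - press by more than tol."""
    return [e for e in events if abs((e.release - e.press) - e.hold) > tol]


def press_latencies(events):
    """Return the time between consecutive key presses, ordered by press time."""
    ordered = sorted(events, key=lambda e: e.press)
    return [b.press - a.press for a, b in zip(ordered, ordered[1:])]


# Made-up placeholder rows (NOT real data).
SAMPLE = "a,0.10,1.10,1.00\nb,0.12,1.42,1.30\nspace,0.08,1.78,1.70\n"

events = read_keystrokes(io.StringIO(SAMPLE))
latencies = press_latencies(events)
```

The tolerance of 1 ms in hold_mismatches is an arbitrary choice for this sketch; the filtering applied by the dataset's own scripts may differ.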

The neuroQWERTY.zip file includes all of the data along with the scripts described in the next section.

Loading Scripts

The nqDataLoader.py Python module contains functions that load the data from the csv data files and filter out anomalous results. The readme.ipynb IPython notebook uses these functions to demonstrate how to load and display the data.

Acknowledgements

These datasets were collected as part of the neuroQWERTY project at the Massachusetts Institute of Technology with financial support from the Comunidad de Madrid, Fundacion Ramon Areces and The Michael J. Fox Foundation for Parkinson's Research (grant number 10860). We thank the M + Vision faculty for their guidance in developing this project. We also thank our many clinical collaborators at MGH in Boston, and at “12 de Octubre”, Hospital Clinico and Centro Integral en Neurociencias HM CINAC in Madrid, for their insightful contributions.


Access

Access Policy:
Anyone can access the files, as long as they conform to the terms of the specified license.

License (for files):
Open Data Commons Attribution License v1.0


Files

Total uncompressed size: 7.3 MB.

Contents of MIT-CS2PD:

  • data_MIT-CS2PD
  • GT_DataPD_MIT-CS2PD.csv - 4.1 KB, modified 2016-12-20