Challenge Credentialed Access

ShAReCLEF eHealth 2013: Natural Language Processing and Information Retrieval for Clinical Care

Danielle Mowery

Published: Feb. 15, 2013. Version: 1.0


When using this resource, please cite:
Mowery, D. (2013). ShAReCLEF eHealth 2013: Natural Language Processing and Information Retrieval for Clinical Care (version 1.0). PhysioNet. https://doi.org/10.13026/0zsp-0e97.

Additionally, please cite the original publication:

Suominen H, Salanterä S, Velupillai S, et al (2013). Overview of the ShARe/CLEF eHealth Evaluation Lab 2013. In Proceedings of the 4th International Conference on Information Access Evaluation. Multilinguality, Multimodality, and Visualization. Volume 8138 (CLEF 2013). https://doi.org/10.1007/978-3-642-40802-1_24

Please include the standard citation for PhysioNet:
Goldberger, A., Amaral, L., Glass, L., Hausdorff, J., Ivanov, P. C., Mark, R., ... & Stanley, H. E. (2000). PhysioBank, PhysioToolkit, and PhysioNet: Components of a new research resource for complex physiologic signals. Circulation [Online]. 101 (23), pp. e215–e220.

Abstract

This is the pilot year of the ShARe/CLEF eHealth Evaluation Lab, a shared task focused on natural language processing (NLP) and information retrieval (IR) for clinical care. The task is co-organized by the Shared Annotated Resources (ShARe) project and the CLEF Initiative (Conference and Labs of the Evaluation Forum, formerly known as Cross-Language Evaluation Forum). The vision of ShARe/CLEF is two-fold: (1) to develop tasks that potentially impact patient understanding of medical information and (2) to provide the community with an increasingly sophisticated dataset of clinical narrative to advance the state-of-the-art in Natural Language Processing, Information Extraction and Information Retrieval in healthcare. 

The ShARe/CLEF eHealth 2013 Challenge has three tasks: Task 1 and Task 2 involve annotation of entities in a set of narrative clinical reports; Task 3 involves retrieval of web pages based on queries generated when reading the clinical reports. The datasets for Tasks 1 and 2 comprise de-identified clinical free-text notes from the MIMIC-II database (version 2.5). The dataset for Task 3 consists of a set of medical-related documents provided by the Khresmoi E.U. Consortium. This project provides data files for Task 1 and Task 2 only. 


Objective

Task 1 and Task 2 involve annotation of entities in a set of narrative clinical reports extracted from the MIMIC Database [1]; Task 3 involves retrieval of web pages based on queries generated when reading the clinical reports. The original shared task site can be found at: https://sites.google.com/site/shareclefehealth/. To learn more about the ShARe project, the HealthNLP-ShARe project site is at: https://healthnlp.hms.harvard.edu/share/wiki/index.php/Main_Page. The tasks are described in further detail below.

Task 1: Named entity recognition and normalization of disorders

Participants are provided with an unannotated clinical report dataset and will be evaluated on their ability to:

  • (a) automatically identify the boundaries of disorder named entities in the text and
  • (b) map the automatically identified named entities to SNOMED codes [2].

Participants may submit system output for Task 1a and/or Task 1b. If participating in Task 1b only, reference standard disorder spans will be provided. Annotation guidelines and examples of annotation for this task are available as part of the task materials.

A disorder mention is defined as any span of text which can be mapped to a concept in the SNOMED-CT terminology and which belongs to the Disorder semantic group. A concept is in the Disorder semantic group if it belongs to one of the following UMLS semantic types:

  • Congenital Abnormality
  • Acquired Abnormality
  • Injury or Poisoning
  • Pathologic Function
  • Disease or Syndrome
  • Mental or Behavioral Dysfunction
  • Cell or Molecular Dysfunction
  • Experimental Model of Disease
  • Anatomical Abnormality
  • Neoplastic Process
  • Signs and Symptoms

Note that this definition of the Disorder semantic group does not include the Findings semantic type, and therefore differs from the standard UMLS Semantic Groups definition.

Examples of the Task 1 Annotations:

  1. "The rhythm appears to be atrial fibrillation". “atrial fibrillation” is a mention of type Disorders with CUI C0004238 (UMLS preferred term is “atrial fibrillation”)
  2. "The left atrium is moderately dilated". left atrium.... dilated” is a mention of type Disorders with CUI C0344720 (UMLS preferred term is “left atrial dilatation”)
  3. "53 year old man s/p fall from ladder". “fall from ladder” is a mention of type Disorders with CUI C0337212 (UMLS preferred term is “accidental fall from ladder”)

Example (1) represents the simplest case. Example (2) shows a disjoint (non-contiguous) mention. In Example (3), the annotated text is a synonym of the UMLS preferred term.
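
For concreteness, the sketch below (Python) shows the two subtasks in their most naive form: exact string matching against a small lookup table, followed by CUI assignment. The lexicon is a hypothetical toy stand-in for a real SNOMED-CT/UMLS dictionary restricted to the Disorder semantic types above, and the sketch cannot handle disjoint mentions such as Example (2).

import re

# Hypothetical toy lexicon; a real system would use SNOMED-CT / UMLS entries
# restricted to the Disorder semantic types listed above.
DISORDER_LEXICON = {
    "atrial fibrillation": "C0004238",
    "fall from ladder": "C0337212",
}

def find_disorders(text):
    """Return (start, end, cui) tuples for exact, contiguous lexicon matches."""
    mentions = []
    for term, cui in DISORDER_LEXICON.items():
        for m in re.finditer(re.escape(term), text, flags=re.IGNORECASE):
            mentions.append((m.start(), m.end(), cui))
    return sorted(mentions)

print(find_disorders("The rhythm appears to be atrial fibrillation."))
# -> [(25, 44, 'C0004238')]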

Task 2: Normalization of clinical acronyms and abbreviations

Participants will be provided with a clinical report dataset that has been previously annotated for acronym/abbreviation spans. The goal for Task 2 is as follows:

  • Given manually annotated spans of acronyms and abbreviations, normalize the annotation to a UMLS code.

Note that some of the annotations will not match UMLS concepts and will be assigned the value “CUI-less”. The annotation guidelines and examples of annotation for this task are available as part of the evaluation lab dataset. 

Examples of Task 2 Annotations:

  1. "He was given Vanco". “Vanco” is a mention of type Acronym/Abbreviation with CUI C0042313 (UMLS preferred term is “Vancomycin”)
  2. "Patient has breast ca". “ca” is a mention of type Acronym/Abbreviation with CUI C0006826 (UMLS preferred term is “Malignant Neoplasms”)
  3. "Mitral Valve: Trivial MR". "MR" is a mention of type Acronym/Abbreviation with CUI C0026266 (UMLS preferred term is "Mitral Valve Insufficiency")

Task 3: Document retrieval

Task 3 focuses on the retrieval of web pages based on queries generated when reading clinical reports. The task is a TREC-style information retrieval (IR) task using (a) a 2012 crawl of approximately one million medical documents made available by the EU-FP7 Khresmoi project (http://www.khresmoi.eu/) in plain text form and (b) general public queries that individuals may realistically pose based on the content of their discharge summaries. Queries will be generated from discharge summaries used in Tasks 1 and 2. The goal of Task 3 is to retrieve the relevant documents for the user queries. For more information on Task 3, see: https://sites.google.com/site/shareclefehealth/
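
As a rough illustration of the retrieval setting (not the official baseline), the Python sketch below ranks a toy document collection against a query using TF-IDF cosine similarity; an actual submission would index the full Khresmoi crawl with a proper IR engine.

import math
from collections import Counter

def tokenize(text):
    return text.lower().split()

def rank(query, docs):
    """Rank document ids by TF-IDF cosine similarity to the query (toy example)."""
    n = len(docs)
    df = Counter()
    for text in docs.values():
        df.update(set(tokenize(text)))
    idf = {t: math.log(n / df[t]) for t in df}

    def vec(text):
        tf = Counter(tokenize(text))
        return {t: tf[t] * idf.get(t, 0.0) for t in tf}

    def cosine(a, b):
        dot = sum(w * b.get(t, 0.0) for t, w in a.items())
        na = math.sqrt(sum(w * w for w in a.values()))
        nb = math.sqrt(sum(w * w for w in b.values()))
        return dot / (na * nb) if na and nb else 0.0

    q = vec(query)
    return sorted(docs, key=lambda d: cosine(q, vec(docs[d])), reverse=True)

docs = {"d1": "atrial fibrillation treatment options",
        "d2": "how to set up a home network"}
print(rank("treatment for atrial fibrillation", docs))  # ['d1', 'd2']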


Participation

Participants have approximately one month to explore the training materials and develop automated techniques, after which the test materials for each task will be released. Once the test materials are released, no further development work should take place. Evaluation results for submissions will be distributed to participants before the 2013 CLEF eHealth Workshop.

Task 1 timeline

  • Training set for Task 1 released: 15 Feb 2013
  • Test set for Task 1 released: 17 Apr 2013
  • Participants submit output for Test set: 24 Apr 2013

Task 2 timeline

  • Training set for Task 2 released: 21 Mar 2013
  • Test set for Task 2 released: 1 May 2013
  • Participants submit output for Test set: 8 May 2013

Task 3 timeline

  • Training set for Task 3 released (documents collection): 25 Mar 2013
  • Training set for Task 3 released (sample development queries and associated result set): 15 Apr 2013
  • Test set for Task 3 released (test queries): 24 Apr 2013
  • Participants submit output for Test set: 1 May 2013

Post-submission timeline

  • Organizers release scores for all test sets: 1 June 2013
  • Participants submit working notes: 15 June 2013
  • Lab chairs submit overview document: 30 Jun 2013
  • CLEF eHealth 2013 Workshop: 23-26 Sep 2013

[UPDATE: This challenge is no longer active].

Submission Guidelines

To submit your runs for Task 1a, 1b, 2, and/or 3, please follow these guidelines carefully.

  1. Follow the task-specific submission deadlines listed above.
  2. Navigate to our Easy Chair for CLEFeHealth2013 runs (www.easychair.org/conferences/?conf=clefehealth2013runsu) and submit separately to each task by selecting “New Submission”. You will submit all runs for one task at the same time. After you have created a new submission, you can update it, but no updates of runs are accepted after the deadline has passed.
  3. List all your team members as “Authors”. “Address for Correspondence” and “Corresponding author” refer to your team leader. Note: you can acknowledge people not listed as authors separately in the working notes. We wish this process to be very similar to defining the list of authors in scientific papers.
  4. Please provide the task and your team name as “Title” (e.g., “Task 1a: Team NICTA” or “Task 1a using extra annotations: Team NICTA”) and a short description (max 100 words) of your team as “Abstract”. See the category list below the abstract field for the task names. If you submit to multiple tasks, please copy and paste the same description to all your submissions and use the same team name in all submissions.
  5. Choose a “Category” and one or more “Groups” to describe your submission. We allow up to 2 runs for Task 1a; 2 runs for Task 1b; 2 runs for Task 2; 7 runs for Task 3.
  6. Please provide 3 to 10 “Keywords” that describe the different runs in the submission, including methods (e.g., MetaMap, Support Vector Machines, Weka) and resources (e.g., Unified Medical Language System, expert annotation). You will provide a narrative description later in the process.
  7. As “Paper” please submit a zip file including the runs for this task. Please name each run as follows: “name + run + task + add/noadd” (e.g., TeamNICTA.1.1a.add) where name refers to your team name; run to the run ID; task to 1a, 1b, 2 or 3; and “add/noadd” to the use of additional annotations. In Task 3, the run ID 1 should refer to the mandatory baseline run (mandatory run); 2-4 to the runs generated using the discharge summaries (optional runs); and 5-7 to the runs generated without using the discharge summaries (optional runs). Please follow the file formats available at https://sites.google.com/site/shareclefehealth/evaluation.
  8. As the mandatory attachment file, please provide a .txt file with a description of the submission. Please structure this file by using your run-file names above. For each run, provide a max 200 word summary of the processing pipeline (i.e., methods and resources). Be sure to describe differences between the runs in the submission.
  9. Between May 25 and June 15, 2013, please submit your working notes. Please follow the formatting and submission guidelines available at http://www.clef2013.org/index.php?page=Pages/instructions_for_authors.html.
  10. The organizers will provide the evaluation results via the Easy Chair for CLEFeHealth2013 runs. This includes your ranking with respect to other teams as well as the value(s) of the official evaluation measure(s).

Data Description

The dataset for Tasks 1 and 2 consists of de-identified clinical free-text notes from the MIMIC-II database (version 2.5) [1]. Notes were authored in the ICU setting and note types include discharge summaries, ECG reports, echo reports, and radiology reports. For this evaluation, the training set contains 200 notes, and the test set contains 100 notes.

Task 1: Disorders

Annotation of disorder mentions was carried out as part of the ongoing ShARe (Shared Annotated Resources) project (clinicalnlpannotation.org). For this task in the evaluation lab, the focus is on the annotation of disorder mentions only. As such, there are two parts to the annotation: identifying a span of text as a disorder mention and mapping the span to a UMLS CUI (concept unique identifier). Each note was annotated by two professional coders trained for this task, followed by an open adjudication step. The annotation guidelines and examples of annotation for this task are available as part of the shared task materials.

A disorder mention is defined as any span of text which can be mapped to a concept in the SNOMED-CT terminology and which belongs to the Disorder semantic group. A concept is in the Disorder semantic group if it belongs to one of the following UMLS semantic types:

  • Congenital Abnormality
  • Acquired Abnormality
  • Injury or Poisoning
  • Pathologic Function
  • Disease or Syndrome
  • Mental or Behavioral Dysfunction
  • Cell or Molecular Dysfunction
  • Experimental Model of Disease
  • Anatomical Abnormality
  • Neoplastic Process
  • Signs and Symptoms

This definition of the Disorder semantic group does not include the Findings semantic type, and therefore differs from the standard UMLS Semantic Groups definition.

Task 2: Acronyms/Abbreviations

Participants will be provided with a clinical report dataset that has been previously annotated for acronym/abbreviation spans and will be evaluated on their ability to map the existing annotations to UMLS codes. Annotation of acronyms and abbreviations was carried out specifically for the CLEF 2013 eHealth Evaluation Lab; the focus of this task is normalization of pre-annotated acronyms/abbreviations to UMLS concepts. Annotators were instructed to annotate all acronyms/abbreviations contained in narratives and not contained in a list. Participants will be provided with the spans of the acronyms/abbreviations, which were annotated by multiple nursing students trained for this task, followed by an open adjudication step. The goal of Task 2 is to map each annotation to the best matching concept in the UMLS. Some of the annotations will not match UMLS concepts and will be assigned the value “CUI-less”. The annotation guidelines and examples of annotation for this task are available as part of the evaluation lab dataset.

Format of Annotations

The official annotation format for Tasks 1 and 2 is the same. Annotations are standoff and are in the following format (synthetic example):

report name || annotation type || cui || char start || char end
08100-027513-DISCHARGE_SUMMARY.txt||Disease_Disorder||c0332799||459||473

System results should be submitted in the same format. If an annotation contains disjoint (i.e., non-contiguous) spans, additional char start and char end values are appended. For example, in the sentence "Abdomen: no distention is noted.", the single annotation for "abdominal distention" (C0235698) covers the spans 0-6 ("Abdomen") and 13-22 ("distention"). A disjoint annotation looks like:

08100-027513-DISCHARGE_SUMMARY.txt||Disease_Disorder||c0332799||459||473||486||493

Note: if you are participating only in the boundary detection part of Task 1, leave the cui slot blank.
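
For reference, a small Python sketch for reading this format follows; it treats any extra offset fields as additional (start, end) pairs for disjoint spans, and it preserves an empty cui slot as an empty string.

def parse_annotation_line(line):
    """Parse one pipe-delimited standoff annotation line."""
    fields = line.rstrip("\n").split("||")
    report, ann_type, cui = fields[0], fields[1], fields[2]
    offsets = [int(x) for x in fields[3:]]
    spans = list(zip(offsets[0::2], offsets[1::2]))  # [(start, end), ...]
    return {"report": report, "type": ann_type, "cui": cui, "spans": spans}

line = "08100-027513-DISCHARGE_SUMMARY.txt||Disease_Disorder||c0332799||459||473||486||493"
print(parse_annotation_line(line)["spans"])  # [(459, 473), (486, 493)]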

To account for potential linefeed/newline differences among operating systems, we provide a Java program, based on the flip utility (https://ccrma.stanford.edu/~craig/utility/flip/), that converts non-Unix linefeeds to Unix linefeeds. If you are not using a Unix or MacOS operating system, then after downloading the datasets, please run the program on the directory containing the corpus of text files to create a new directory with Unix linefeeds:

java -jar convertFilesToUnixFormat.jar <directory containing files> <new directory>
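
If running the jar is inconvenient, an approximately equivalent conversion can be scripted. The Python sketch below (the file name convert_to_unix.py is a placeholder) rewrites CRLF/CR line endings as LF into a new directory:

# convert_to_unix.py -- rough Python equivalent of the provided Java tool.
import os, sys

src, dst = sys.argv[1], sys.argv[2]       # source and destination directories
os.makedirs(dst, exist_ok=True)
for name in os.listdir(src):
    with open(os.path.join(src, name), "rb") as f:
        data = f.read()
    data = data.replace(b"\r\n", b"\n").replace(b"\r", b"\n")
    with open(os.path.join(dst, name), "wb") as f:
        f.write(data)

python convert_to_unix.py <directory containing files> <new directory>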

Visualizing Task 1 and 2 Annotations

In addition to the official standoff annotation format, the annotations are provided in Knowtator XML format and can be visualized in three ways:

  1. Protege/Knowtator combination:
    Protege 3.3.1: http://protege.cim3.net/download/old-releases/3.3.1/full/
    Knowtator 1.9beta: http://sourceforge.net/projects/knowtator/files/
  2. eHOST: http://code.google.com/p/ehost/
  3. Evaluation Workbench: available with datasets. Details on using the evaluation workbench are available at: http://sites.google.com/site/shareclefehealth/evaluation.

Evaluation

Participants are provided with training and test datasets. The evaluation for all tasks is conducted using the withheld test data. Participating teams are asked to stop development as soon as they download the test data. Teams are allowed to use any outside resources in their algorithms. However, system output for systems that use annotations outside of those provided for Tasks 1 and 2 will be evaluated separately from system output generated without additional annotations.

Each of the three tasks has at least one evaluation. Each team is allowed to upload up to two system runs for each evaluation in the three tasks, for a maximum of eight submissions (there are two evaluations for task 1). Some evaluations will provide scores for a relaxed and a strict measure. In addition, we will evaluate performance separately for system runs that use annotations outside of those provided; however system runs with outside annotations count towards the two runs per project.

Task 1(a): Named entity recognition and normalization of disorders: Boundary detection

The goal of Task 1(a) is to identify the span of all named entities that could be classified by the UMLS semantic group Disorder (excluding the semantic type Findings). Submissions are evaluated by F1-score.

\textrm{F1-score} = 2 * \frac{(Recall * Precision)}{(Recall + Precision)}

Given:

Recall = \frac{TP}{(TP + FN)}
Precision = \frac{TP}{(TP + FP)}

Where: TP = same span; FP = spurious span; and FN = missing span. For the Exact F1-score, the span must be identical to the reference standard span; for the Overlapping F1-score, the span need only overlap the reference standard span.
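
The official Task 1(a) scores are computed with the provided eval.pl (see Tools for Evaluation below); the Python sketch that follows only illustrates the exact and overlapping definitions, assuming spans are (start, end) character-offset tuples and ignoring corner cases such as multiple system spans matching one reference span.

def span_f1(gold_spans, system_spans, exact=True):
    """Illustrative exact/overlapping span F1 (not the official scorer)."""
    def match(s, g):
        return s == g if exact else (s[0] < g[1] and g[0] < s[1])
    tp = sum(any(match(s, g) for g in gold_spans) for s in system_spans)
    fp = len(system_spans) - tp
    fn = sum(not any(match(s, g) for s in system_spans) for g in gold_spans)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return 2 * precision * recall / (precision + recall) if precision + recall else 0.0

gold = [(25, 44), (100, 110)]
system = [(25, 44), (102, 108), (200, 205)]
print(span_f1(gold, system, exact=True))    # 0.4  (one exact match)
print(span_f1(gold, system, exact=False))   # 0.8  (two overlapping matches)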

Task 1(b): Named entity recognition and normalization of disorders: Mapping to a SNOMED code

Performance in Task 1(b) is assessed in terms of Accuracy.

Accuracy = \frac{Correct}{Total}

Where Correct = Number of disorder named entities with strictly correct span and correctly generated code; and Total = Number of disorder named entities, depending on strict or relaxed setting:

  • Strict: Total = Total number of reference standard named entities. In this case, the system is penalized for incorrect code assignment for annotations that were not detected by the system.
  • Relaxed: Total = Total number of named entities with strictly correct span generated by the system. In this case, the system is only evaluated on annotations that were detected by the system.
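
The Python sketch below illustrates how the strict and relaxed denominators differ, assuming gold and system annotations are dictionaries keyed by exact (start, end) spans; the official scores again come from eval.pl.

def task1b_accuracy(gold, system, strict=True):
    """Illustrative strict/relaxed accuracy for Task 1(b)."""
    matched = {span: cui for span, cui in system.items() if span in gold}
    correct = sum(cui == gold[span] for span, cui in matched.items())
    total = len(gold) if strict else len(matched)
    return correct / total if total else 0.0

gold = {(25, 44): "C0004238", (100, 110): "C0011849"}
system = {(25, 44): "C0004238"}  # second gold entity was never detected
print(task1b_accuracy(gold, system, strict=True))   # 0.5
print(task1b_accuracy(gold, system, strict=False))  # 1.0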

Task 2: Normalization of acronyms/abbreviations to UMLS codes

Performance in Task 2 is assessed in terms of Accuracy.

Accuracy = \frac{Correct}{Total}

Where Correct = Number of pre-annotated acronyms/abbreviations with correctly generated code; and Total = Number of pre-annotated acronyms/abbreviations.

  • Strict Accuracy Score: Correct = number of pre-annotated acronyms/abbreviations with the top code selected by the annotators (one best).
  • Relaxed Accuracy Score: Correct = number of pre-annotated acronyms/abbreviations for which the code is contained in a list of possibly matching codes generated by the annotators (n-best).
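
The Python sketch below illustrates the strict one-best versus relaxed n-best scoring; the code C9999999 is a made-up placeholder for an alternative annotator code, and the official scores come from the provided evaluation script.

def task2_accuracy(system, gold_one_best, gold_n_best, strict=True):
    """Illustrative strict (one-best) / relaxed (n-best) accuracy for Task 2."""
    if strict:
        correct = sum(system[a] == gold_one_best[a] for a in system)
    else:
        correct = sum(system[a] in gold_n_best[a] for a in system)
    return correct / len(system) if system else 0.0

system        = {"A1": "C9999999", "A2": "C0006826"}   # A1 code is a placeholder
gold_one_best = {"A1": "C0026266", "A2": "C0006826"}
gold_n_best   = {"A1": ["C0026266", "C9999999"], "A2": ["C0006826"]}
print(task2_accuracy(system, gold_one_best, gold_n_best, strict=True))   # 0.5
print(task2_accuracy(system, gold_one_best, gold_n_best, strict=False))  # 1.0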

Task 3: Retrieval of web documents to address queries

Evaluation focuses on mean average precision (MAP), but other evaluation metrics such as precision at 10 (P@10) and other suitable IR evaluation measures will also be computed for the submitted runs.
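
Official Task 3 scores are produced with trec_eval (see Tools for Evaluation (Task 3) below); the Python sketch here only illustrates P@k and average precision for a single query (MAP is the mean of average precision over all test queries).

def precision_at_k(ranked, relevant, k=10):
    """Fraction of the top-k ranked documents that are relevant."""
    return sum(d in relevant for d in ranked[:k]) / k

def average_precision(ranked, relevant):
    """Average of precision values at the ranks of each relevant document."""
    hits, total = 0, 0.0
    for i, d in enumerate(ranked, start=1):
        if d in relevant:
            hits += 1
            total += hits / i
    return total / len(relevant) if relevant else 0.0

ranked = ["d3", "d7", "d1", "d9"]
relevant = {"d3", "d9"}
print(precision_at_k(ranked, relevant, k=2))  # 0.5
print(average_precision(ranked, relevant))    # (1/1 + 2/4) / 2 = 0.75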

Tools for Evaluation (Tasks 1 and 2)

Two tools are provided to perform evaluations on the training and test sets.

1. Evaluation script: eval.pl. A perl evaluation script will calculate all outcome measures and print the results to a file. The results from the script will be used to rank all system runs within each task. The script requires as input the directory containing the pipe-delimited reference standard annotations and the directory containing files of the same format with system annotations.

Parameters:
-input (prediction directory containing one pipe-delimited file per report)
-gold (goldstandard directory containing one pipe-delimited file per report)
-n (specify name of run)
-r (specify 1 or 2 for which run)
-t (specify 1a, 1b)
-a (optional - include if you used additional annotations)
Output file: name + run + task + add/noadd (e.g. myrun.1.1a.add)

Example:  perl eval.pl -n myrun -r 1 -t 1a -input /Users/wendyc/Desktop/CLEF/Task1TrainSetSystem200pipe -gold /Users/wendyc/Desktop/CLEF/Task1TrainSetGOLD200pipe -a

2. Evaluation Workbench. We provide a graphical interface for calculation of outcome measures, as well as for visualization of system annotations against reference standard annotations. Use of the Evaluation Workbench is completely optional. The Evaluation Workbench is still under development, so we would appreciate your feedback. Notes on using the interface:

  • Memory issues. You need to allocate extra heap when you run the workbench with all the files, or you will get an "out of memory" error.  To do so, you need to use a terminal (or shell) program, go to the directory containing the startup.parameters file, and type: java -Xms512m -Xmx1024m -jar Eval*.jar 
  • Startup Properties file. The Evaluation Workbench relies on a parameter file called "startup.properties". Since the Workbench is a tool for comparing two sets of annotations, the properties refer to the first (or gold standard) and second (or system) annotators. The following properties will need to be set before running the Workbench:
    • WorkbenchDirectory. Full filename where the executable (.jar) file is located. For example, WorkbenchDirectory=/Users/wendyc/Desktop/CLEF/EvaluationWorkbench
    • TextInputDirectory: Directory containing the clinical reports (every document is a single text file in the directory). For example, TextInputDirectory=/Users/wendyc/Desktop/CLEF/EvaluationWorkbench/Task1TrainSetCorpus200EvaluationWorkbench
    • AnnotationInputDirectoryFirstAnnotator / AnnotationInputDirectorySecondAnnotator: Directories containing the two sets of annotations (the gold standard annotations come first, the system annotations second). If you do not have system annotations but just want to view the gold standard annotations, point both input directories to the gold standard annotations.
    • AnnotationInputDirectoryFirstAnnotator=/Users/wendyc/Desktop/CLEF/Task1TrainSetGOLD200pipe
    • AnnotationInputDirectorySecondAnnotator=/Users/wendyc/Desktop/CLEF/Task1TrainSetSystem200pipe

Please remember to set pathnames appropriate for your operating system. MacOS and Unix pathnames are in the form /applications/EvaluationWorkbench/, whereas Windows paths are in the form c:\\Program Files\\Evaluation Workbench\\ (escape characters included). After setting the paths for your computer and operating system, you can launch the Workbench by going to the distribution directory and double-clicking the EvaluationWorkbench.jar icon.
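
For convenience, the example values above assembled into a single startup.properties file look like this (macOS/Unix paths shown; substitute your own directories):

WorkbenchDirectory=/Users/wendyc/Desktop/CLEF/EvaluationWorkbench
TextInputDirectory=/Users/wendyc/Desktop/CLEF/EvaluationWorkbench/Task1TrainSetCorpus200EvaluationWorkbench
AnnotationInputDirectoryFirstAnnotator=/Users/wendyc/Desktop/CLEF/Task1TrainSetGOLD200pipe
AnnotationInputDirectorySecondAnnotator=/Users/wendyc/Desktop/CLEF/Task1TrainSetSystem200pipe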

  • To open the workbench, double click on the EvaluationWorkbench.jar file
  • To navigate the Workbench, most operations will involve holding down the CTRL key until the mouse is moved to a desired position; once the desired position is reached, release the CTRL key. 
  • The Workbench displays information in several panes
    • Statistics pane: rows are classifications (e.g., Disorder CUI); columns display a contingency table of counts and several outcome measures (e.g., F-measure). The intersecting cell is the outcome measure for that particular classification. When a cell is highlighted, the reports generating that value are shown in the Reports pane. When you move the mouse over a report in the Reports pane, that report will appear in the Document pane.
    • The Document pane displays annotations for the selected document. The parameter button with label "Display=" selects whether to view a single annotation set at a time (gold or system), or to view both at once. Pink annotations are those that occur in only one source, and so indicate a false negative error (if it appears in the gold but not the system annotation set) or false positive (if it appears in the system but not the gold set). Highlighting an annotation in the document pane updates the statistics pane to reflect statistics for that classification. It also shows the attributes and relationships for that annotation (not relevant for this dataset but in other datasets you may have attributes like Negation status or relationships like Location of).
    • The Detail panel on the lower right side displays relevant parameters, report names, attribute, and relation information. The parameters include "Annotator" (whether the currently selected annotator is Gold or System), "Display" (whether you are viewing gold annotations, system annotations, or both), MatchMode (whether matches must be exact or any-character overlap) and MouseCtrl (whether the ctrl key must be held down to activate selections).
  • You can store the evaluation measures to a file by selecting File-> StoreOutcomeMeasures, and entering a selected file name.
  • How to generate outcome measures for the tasks using the Workbench 
    • Task 1(a): boundary detection without normalization
      • Select "Span" on the outcome measure pane to calculate outcome measures based only on boundary detection
      • Exact and overlapping span can be toggled by changing the MatchMode parameter
    • Task 1(b) strict - boundary detection and normalization
      • Select "exact" in the MatchMode parameter for all annotations
      • Select "Span&Class" on the outcome measure pane to calculate outcome measures requiring the same span and the same CUI code.
    • Task 1(b) relaxed: boundary detection and normalization
      • Select "exact" in the MatchMode parameter 
      • Select "Class" on the outcome measure pane to calculate outcome measures only on annotations that were correctly identified by the system (i.e., if the system did not identify the boundary of a disorder, that disorder will not be used in determining whether the CUI was correct or incorrect).

Tools for Evaluation (Task 3)

Evaluation metrics can be computed with the trec_eval evaluation tool, which is available from http://trec.nist.gov/trec_eval/.
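
A typical invocation takes a relevance judgments (qrels) file and a TREC-format run file and prints the standard measures, including map and P_10; the file names below are placeholders:

trec_eval clef2013_task3.qrels myteam.1.3.noadd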


Release Notes

This challenge is no longer active.


Acknowledgements

This shared task is supported in part by:

  • the Shared Annotated Resources (ShARe) project funded by the United States National Institutes of Health (R01GM090187).
  • the CLEF Initiative (Conference and Labs of the Evaluation Forum, formerly known as Cross-Language Evaluation Forum).
  • NICTA (National ICT Australia Ltd), Australia’s Information and Communications Technology Research Centre of Excellence.
  • The Khresmoi project, funded by the European Union Seventh Framework Programme (FP7/2007-2013) under grant agreement no 257528.
  • The U.S. Office of the National Coordinator of Healthcare Technology (SHARP 90TR0002).
  • The Veterans Affairs Consortium for Healthcare Informatics Research (VA CHIR).

We also thank the ShARe/CLEF eHealth Challenge organizers and teams! Many people were involved in CLEF eHealth (in alphabetical order):  

Samir Abdelrahman, University of Utah, USA; Wendy W Chapman, University of Utah, USA; Riitta Danielsson-Ojala, University of Turku, Finland; Noemie Elhadad, Columbia University, USA; Gabriela Ferraro, NICTA and The Australian National University, Australia; Lorraine Goeuriot, Université Grenoble Alpes, France; Cyril Grouin, CNRS-LIMSI, Orsay, France; Thierry Hamon, CNRS-LIMSI and Université Paris-Nord, Orsay, France; Allan Hanbury, Vienna University of Technology, Austria; Leif Hanlen, NICTA, The Australian National University, University of Canberra, Canberra, ACT, Australia; Preben Hansen, SICS and Stockholm University, Sweden; Harry S Hochheiser, University of Pittsburgh, USA; Gareth Jones, Dublin City University, Ireland; Evangelos Kanoulas, University of Amsterdam, Netherlands; Jussi Karlgren, KTH and Gavagai, Sweden; Lotta Kauhanen, University of Turku, Finland; Daniel Keim, University of Konstanz, Germany; Liadh Kelly, Trinity College Dublin, Ireland; Maria Kvist, Karolinska Institutet and DSV Stockholm University, Sweden; Gondy Leroy, University of Arizona, USA; Johannes Leveling, Dublin City University, Ireland; Wei Li, Dublin City University, Ireland; Heljä Lundgrén-Laine, University of Turku, Finland; Mihai Lupu, Vienna University of Technology, Austria; David Martinez, NICTA and The University of Melbourne, Australia; Danielle L Mowery, University of Pittsburgh, USA; Henning Mueller, HES-SO, Switzerland; Laura Maria Murtola, University of Turku, Finland; Aurélie Névéol, Centre National de la Recherche Scientifique (CNRS-LIMSI), Orsay, France; Jaume Nualart, NICTA and University of Canberra, Australia; Joao Palotti, Vienna University of Technology, Austria; Pavel Pecina, Charles University in Prague, Czech Republic; Sameer Pradhan, Harvard Medical School and Boston Children's Hospital, USA; Sanna Salantera, University of Turku, Finland; Guergana Savova, Harvard Medical School and Boston Children's Hospital, USA; Tobias Schreck, University of Konstanz, Germany; Brett R South, University of Utah, USA; Rene Spijker, UMC Utrecht, Netherlands; Hanna Suominen, NICTA, The Australian National University, University of Canberra, University of Turku (Turku, Finland), Canberra, ACT, Australia; Xavier Tannier, CNRS-LIMSI and Université Paris-Sud, Orsay, France; Ozlem Uzuner, State University of New York, New York, USA; Sumithra Velupillai, DSV Stockholm University, Sweden; Qing Treitler Zeng, University of Utah, SLC VA, Utah, USA; Joe Zhou, NICTA, Canberra, ACT, Australia; Guido Zuccon, Queensland University of Technology, Australia; Pierre Zweigenbaum, CNRS-LIMSI, Orsay, France.


Conflicts of Interest

No conflicts of interest to report.


References

  1. Johnson AEW, Pollard TJ, Shen L, Lehman L, Feng M, Ghassemi M, Moody B, Szolovits P, Celi LA, and Mark RG. MIMIC-III, a freely accessible critical care database. Scientific Data (2016). DOI: 10.1038/sdata.2016.35. Available from: http://www.nature.com/articles/sdata201635
  2. Bodenreider, O. and McCray, A. Exploring semantic groups through visual approaches. Journal of Biomedical Informatics, 2003; 36: 414-432. http://semanticnetwork.nlm.nih.gov/SemGroups/Papers/2003-medinfo-atm.pdf

Access

Access Policy:
Only credentialed users who sign the DUA can access the files.

License (for files):
PhysioNet Credentialed Health Data License 1.5.0

Data Use Agreement:
PhysioNet Credentialed Health Data Use Agreement 1.5.0

Required training:
CITI Data or Specimens Only Research
