Database Credentialed Access

ODD: A Benchmark Dataset for the NLP-based Opioid Related Aberrant Behavior Detection

Sunjae Kwon Xun Wang Weisong Liu Emily Druhl Minhee Sung Joel Reisman Wenjun Li Robert Kerns William Becker Hong Yu

Published: Jan. 11, 2024. Version: 1.0.0

When using this resource, please cite:
Kwon, S., Wang, X., Liu, W., Druhl, E., Sung, M., Reisman, J., Li, W., Kerns, R., Becker, W., & Yu, H. (2024). ODD: A Benchmark Dataset for the NLP-based Opioid Related Aberrant Behavior Detection (version 1.0.0). PhysioNet.

Additionally, please cite the original publication:

Kwon, Sunjae, Xun Wang, Weisong Liu, Emily Druhl, Minhee L. Sung, Joel I. Reisman, Wenjun Li, Robert D. Kerns, William Becker, and Hong Yu. (2023). ODD: A Benchmark Dataset for the NLP-based Opioid Related Aberrant Behavior Detection. arXiv preprint arXiv:2307.02591.

Please include the standard citation for PhysioNet:
Goldberger, A., Amaral, L., Glass, L., Hausdorff, J., Ivanov, P. C., Mark, R., ... & Stanley, H. E. (2000). PhysioBank, PhysioToolkit, and PhysioNet: Components of a new research resource for complex physiologic signals. Circulation [Online]. 101 (23), pp. e215–e220.


Opioid related aberrant behaviors (ORAB) present novel risk factors for opioid overdose. Previously, ORAB have been assessed mainly through survey results and by monitoring drug administration. Such methods, however, cannot scale up and do not cover the entire spectrum of aberrant behaviors. ORAB are, on the other hand, widely documented in electronic health record (EHR) notes. This paper introduces a novel biomedical natural language processing benchmark dataset named ODD, for ORAB Detection Dataset. ODD is an expert-annotated dataset comprising more than 750 publicly available EHR notes. ODD has been designed to identify ORAB from patients' EHR notes and classify them into nine categories: 1) Confirmed Aberrant Behavior, 2) Suggested Aberrant Behavior, 3) Opioids, 4) Indication, 5) Diagnosed Opioid Dependency, 6) Benzodiazepines, 7) Medication Changes, 8) Central Nervous System-related, and 9) Social Determinants of Health.


The opioid overdose (OOD) crisis has had a striking impact on the United States, not only threatening citizens' health [1] but also bringing about a substantial financial burden [2]. According to a report by the Centers for Disease Control and Prevention [3], OOD accounted for 110,236 deaths in 2022 alone. In addition, fatal OOD and opioid use disorder (OUD) cost the United States $1.04 trillion in 2017, and that figure rose sharply to $1.5 trillion in 2021 [4].

Opioid-related aberrant behaviors (ORABs) are patient behaviors that may indicate prescription medication abuse [5]. ORABs can be categorized into confirmed aberrant behavior and suggested aberrant behavior [6, 7]. Confirmed aberrant behaviors show clear evidence of medication abuse and addiction, while suggested aberrant behaviors do not [7]. Since ORABs have been shown to be associated with drug abuse problems [8], assessment of ORABs has been recognized as beneficial in evaluating the risk associated with opioid abuse [9] and OOD [10].

Previously, ORABs have been detected by monitoring opioid administration (e.g., frequency and dosage) [11] or through self-reported questionnaires [9, 12]. However, such measurements do not cover the full spectrum of ORABs (e.g., medication sharing, denying medication changes). In addition, patients can obtain opioids from multiple sources (e.g., illegal purchase and medication sharing) that are not captured in structured data. ORABs are known to be widely described in EHR notes, and natural language processing (NLP) techniques can be used to identify them [13]. However, the previous study relied on a small number of annotated notes, which were not publicly available. Moreover, it treated ORAB detection only as a binary classification (present or not) and explored only traditional machine learning models (e.g., support vector machines (SVM)).

Herein, we propose ORAB detection, a novel Biomedical NLP (BioNLP) task. We also introduce the ORAB Detection Dataset (ODD), a large, expert-annotated, multi-label classification benchmark dataset for this task. To build it, we first designed a robust and comprehensive annotation guideline that labels text into nine categories, encompassing two types of ORABs (Confirmed Aberrant Behavior and Suggested Aberrant Behavior) and seven types of auxiliary opioid-related information (Opioids, Indication, Diagnosed Opioid Dependency, Benzodiazepines, Medication Change, Central Nervous System Related, and Social Determinants of Health). Using the guideline, domain experts annotated 750 EHR notes of 500 opioid-treated patients extracted from MIMIC-IV-ED [14]. Overall, we annotated 2,519 instances with 157 ORABs instances (113 for confirmed aberrant behavior and 44 for suggested aberrant behavior).


Data Collection

The dataset is drawn from the publicly available, fully de-identified EHR notes of MIMIC-IV-ED [14]. To increase the likelihood that our annotated data contain ORABs, we selected patients at risk of opioid misuse based on repeated opioid use and diagnoses related to opioid misuse. Specifically, we first extracted EHR notes that mention opioids by the generic or brand names of opioid medications. In addition, we selected patients with relevant diagnoses based on their ICD codes.

Among 331,794 EHR notes of 299,712 patients in MIMIC-IV-ED, we found that approximately 57% of patients were prescribed opioids during their hospitalization. We then selected patients who were repeatedly prescribed opioids (more than twice). In addition, we chose patients who were diagnosed with drug poisoning and drug dependence based on the ICD codes. Overall, 3,904 patients satisfied these conditions. From these, we randomly sampled 500 patients and then randomly selected 750 of their notes for annotation.
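The selection steps above can be sketched roughly as follows. This is a minimal illustration on toy records, not the authors' pipeline: the field names (`opioid_prescriptions`, `has_misuse_icd`, `note_ids`), both function names, and the choice to combine the two criteria as a union are all assumptions, since the text does not specify them.

```python
import random

# Toy patient records; every field name here is hypothetical.
PATIENTS = [
    {"patient_id": 1, "opioid_prescriptions": 3, "has_misuse_icd": False, "note_ids": ["n1", "n2"]},
    {"patient_id": 2, "opioid_prescriptions": 1, "has_misuse_icd": True, "note_ids": ["n3"]},
    {"patient_id": 3, "opioid_prescriptions": 2, "has_misuse_icd": False, "note_ids": ["n4"]},
]


def select_cohort(patients, min_prescriptions=3):
    """Keep patients prescribed opioids more than twice, or flagged by ICD code.

    Treating the two criteria as a union is an assumption of this sketch.
    """
    return [
        p for p in patients
        if p["opioid_prescriptions"] >= min_prescriptions or p["has_misuse_icd"]
    ]


def sample_notes(cohort, n_patients, n_notes, seed=0):
    """Randomly sample patients, then randomly sample notes from those patients."""
    rng = random.Random(seed)  # fixed seed so the draw is reproducible
    sampled = rng.sample(cohort, min(n_patients, len(cohort)))
    notes = [(p["patient_id"], note_id) for p in sampled for note_id in p["note_ids"]]
    return rng.sample(notes, min(n_notes, len(notes)))


cohort = select_cohort(PATIENTS)
print([p["patient_id"] for p in cohort])
print(sample_notes(cohort, n_patients=2, n_notes=2))
```

With the figures reported above, the corresponding call would use `n_patients=500` and `n_notes=750` on the 3,904 eligible patients.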

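| Characteristic | Group | Patients |
|---|---|---|
| Sex | Male | 252 |
| Sex | Female | 248 |
| Age | 19-29 | 39 |
| Age | 30-39 | 61 |
| Age | 40-49 | 88 |
| Age | 50-59 | 124 |
| Age | 60-69 | 103 |
| Age | Over 70 | 85 |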

The table above shows the socio-demographic statistics of the sampled patients. The numbers of male and female patients are balanced.

Data Annotation

We first designed an annotation guideline to label each sentence in an EHR note with nine categories, as shown in the table below. The categories comprise two ORABs (confirmed aberrant behavior and suggested aberrant behavior) and seven additional types of information relevant to opioid usage.

EHR notes were annotated independently by two domain experts following the annotation guideline. The primary annotator annotated all EHR notes with the eHOST [15] annotation tool. The second annotator coded 100 of those EHR notes in the same environment, and inter-rater reliability was computed with Cohen's kappa (κ = 0.87).
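As a rough illustration of the agreement statistic reported above, Cohen's kappa compares the observed agreement p_o with the agreement expected by chance p_e, as kappa = (p_o - p_e) / (1 - p_e). The sketch below computes it for two annotators who each assign one label per item. The function name, the toy labels, and the single-label setup are assumptions for illustration, since the text does not say how agreement was aggregated across the nine categories.

```python
from collections import Counter


def cohens_kappa(labels_a, labels_b):
    """Cohen's kappa for two annotators labelling the same items."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("both annotators must label the same non-empty set of items")
    n = len(labels_a)
    # Observed agreement: fraction of items with identical labels.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Chance agreement: product of each annotator's marginal label frequencies.
    counts_a = Counter(labels_a)
    counts_b = Counter(labels_b)
    expected = sum(counts_a[c] * counts_b[c] for c in counts_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both annotators used a single identical label throughout
    return (observed - expected) / (1 - expected)


# Toy labels for ten sentences; the annotators disagree on one of them.
annotator_1 = ["ORAB", "none", "none", "ORAB", "none", "none", "none", "ORAB", "none", "none"]
annotator_2 = ["ORAB", "none", "none", "ORAB", "none", "none", "none", "none", "none", "none"]
print(round(cohens_kappa(annotator_1, annotator_2), 3))
```

Because rare labels inflate chance agreement less than frequent ones, a single disagreement on a rare class lowers kappa noticeably even when raw agreement is 90%.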

The definitions and examples of the categories in ODD.
| Category | Definition | Example |
|---|---|---|
| Confirmed Aberrant Behavior | Evidence confirming the loss of control of opioid use, specifically aberrant usage of opioid medications. | [Patient] admits that he has been sharing his Percocet with his wife, and that is why he has run out early. |
| Suggested Aberrant Behavior | Evidence suggesting loss of control of opioid use or compulsive/inappropriate use of opioids. | [Patient] states that ‘that [drug] won’t work; only [X drug] will and I won’t take any other’ |
| Opioids | The mention or listing of the name(s) of the opioid medication(s) that the patient is currently prescribed or has just been newly prescribed. | Oxycodone has been known to make [the patient] sleepy at 5 mg. |
| Indication | Patients are using opioids under instructions. | [The patient] is in a daze. |
| Diagnosed Opioid Dependency | Patients have the condition of being dependent on opioids, have chronic opioid use, or are undergoing opioid titration. | [The patient] is in severe pain and has been taking [opioid drug] for [time]. |
| Benzodiazepines | Patients are co-prescribed benzodiazepines. | Valium has been listed in patient medication list. |
| Medication Change | Change in opioid medicine, dosage, and prescription since the last visit. | [Patient] reports that his previous PCP just recently changed his pain regimen, adding oxycodone. |
| Central Nervous System Related | CNS-related terms/terms suggesting altered sensorium. | [Patient] reported to have nausea after taking [drug]. |
| Social Determinants of Health | The nonmedical factors that influence health outcomes. | [Patient] divorced a year ago. |

Data Description

ODD categories and the portion of each category.

| Categories | Instances |
|---|---|
| Confirmed Aberrant Behavior | 115 (3.09%) |
| Suggested Aberrant Behavior | 47 (1.26%) |
| Opioids | 1,678 (45.13%) |
| Indication | 558 (15.01%) |
| Diagnosed Opioid Dependency | 67 (1.80%) |
| Benzodiazepines | 417 (11.22%) |
| Medication Change | 139 (3.74%) |
| Central Nervous System Related | 542 (14.58%) |
| Social Determinants of Health | 155 (4.17%) |
| Total | 3,718 (100%) |

After annotation, among the 750 notes, we found 399 notes with a current opioid prescription and 2,840 sentences with at least one instance. The table above shows the statistics of the annotated categories and instances. In total, 3,718 instances were annotated from the MIMIC-IV-ED EHR notes. Notably, 'Confirmed Aberrant Behavior' and 'Suggested Aberrant Behavior' are relatively rare events in EHRs, together accounting for only 162 instances (4.36%): 115 (3.09%) for confirmed aberrant behavior and 47 (1.26%) for suggested aberrant behavior. 'Opioids' is by far the largest class (45.13%), followed by 'Indication' (15.01%) and 'Central Nervous System Related' (14.58%).
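The shares in the table can be recomputed from the raw counts, which is a quick way to check how rare the two ORAB classes are relative to the rest. The sketch below uses only the counts reported in the table; the variable names are illustrative.

```python
# Instance counts per category, copied from the table above.
counts = {
    "Confirmed Aberrant Behavior": 115,
    "Suggested Aberrant Behavior": 47,
    "Opioids": 1678,
    "Indication": 558,
    "Diagnosed Opioid Dependency": 67,
    "Benzodiazepines": 417,
    "Medication Change": 139,
    "Central Nervous System Related": 542,
    "Social Determinants of Health": 155,
}

total = sum(counts.values())
# Percentage share of each category, rounded to two decimals as in the table.
shares = {name: round(100 * n / total, 2) for name, n in counts.items()}

# Combined count and share of the two ORAB classes.
orab_count = counts["Confirmed Aberrant Behavior"] + counts["Suggested Aberrant Behavior"]
orab_share = round(100 * orab_count / total, 2)

for name, share in sorted(shares.items(), key=lambda item: -item[1]):
    print(f"{name}: {counts[name]} ({share}%)")
print(f"ORAB classes combined: {orab_count} of {total} ({orab_share}%)")
```

An imbalance of this size suggests that per-class metrics are more informative than overall accuracy when evaluating models on the ORAB categories.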

The dataset file ‘ORAB_Annotation_MIMIC.csv’ is in CSV format with the following columns:

  1. Context: Text from an EHR note.
  2. Confirmed aberrant behavior: This class refers to behavior that is more likely to lead to a catastrophic adverse event. It is defined as evidence confirming the loss of control of opioid use, specifically aberrant usage of opioid medications, including:
    • Aberrant use of opioids, such as administration/consumption in a way other than prescribed or self-escalating doses.
    • Evidence suggesting or proving that the patient has been selling or giving away opioids to others, including family members.
    • Use of opioids for a different indication other than the indication intended by the prescriber.
    • Phrases suggesting current use of illicit or illicitly obtained substances or misuse of legal substances (e.g. alcohol) other than prescription opioid medications.
  3. Suggested aberrant behavior: This class refers to behavior implying patient distress related to their opioid treatment. Suggested aberrant behavior includes three kinds of behaviors that suggest potential misuse of opioids.
    • Patient attempts to get extra opioid medicine, such as requesting an early refill, asking for an increased dosage, or reporting missing/stolen opioid medication.
    • Patient emotions toward opioids like a request for a certain opioid medication use/change/increase. 
    • Physician concerns.
  4. Opioids: This class refers to the mention or listing of the name(s) of the opioid medication(s) that the patient is currently prescribed or has just been newly prescribed.
  5. Indication: This class indicates that patients are using opioids under instructions, such as using opioids for pain, for treatment of opioid use disorder, etc.
  6. Diagnosed Opioid Dependency: This class refers to patients who have the condition of being dependent on opioids, have chronic opioid use, or are undergoing opioid titration.
  7. Benzodiazepines: This class refers to co-prescribed benzodiazepines (a risk factor for accidental opioid overdose). In this case, the patient is simply being co-prescribed benzodiazepines, with no noted evidence of abuse.
  8. Medication Change: This class indicates that the physician makes changes to the patient’s opioid regimen during this current encounter or that the patient’s opioid regimen has been changed since the patient’s last encounter with the provider writing the note.
  9. Central Nervous System Related: This is defined as central nervous system-related terms or terms suggesting altered sensorium, including cognitive impairment, sedation, lightheadedness, intoxication, and general terms suggesting altered sensorium (e.g. “altered mental status”).
  10. Social Determinants of Health: This class refers to the factors in a patient's surroundings that impact their well-being [16]. Our dataset captures the following attributes:
    • Marital status (single, married ...)
    • Cohabitation status (lives alone, lives with others ...)
    • Educational level (college degree, high school diploma, no high school diploma ...)
    • Socioeconomic status (retired, disabled, pension, working ...)
    • Homelessness (past, present ...)
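The column layout above could be read along the following lines. This is only a sketch under stated assumptions: the exact header strings and the encoding of label cells are not given in the text, so the names in `LABEL_COLUMNS` and the convention that a cell value of `1` marks a category as present are hypothetical and should be checked against the actual file. The sample rows are synthetic and contain no patient data.

```python
import csv
import io

# Assumed header names for the nine label columns; verify against the real file.
LABEL_COLUMNS = [
    "Confirmed aberrant behavior",
    "Suggested aberrant behavior",
    "Opioids",
    "Indication",
    "Diagnosed Opioid Dependence",
    "Benzodiazepines",
    "Medication Change",
    "Central Nervous System Related",
    "Social Determinants of Health",
]


def read_annotations(handle):
    """Return (text, labels) pairs, where labels is the set of active categories.

    Assumes a 'Context' text column and that a label cell equal to '1'
    means the category applies (a hypothetical encoding).
    """
    rows = []
    for record in csv.DictReader(handle):
        labels = {
            name for name in LABEL_COLUMNS
            if (record.get(name) or "").strip() == "1"
        }
        rows.append((record["Context"], labels))
    return rows


# Synthetic stand-in for the credentialed file, with a reduced set of columns.
SAMPLE_CSV = (
    "Context,Confirmed aberrant behavior,Opioids\n"
    '"Synthetic sentence about sharing medication.",1,1\n'
    '"Synthetic sentence listing a medication.",0,1\n'
)

for text, labels in read_annotations(io.StringIO(SAMPLE_CSV)):
    print(sorted(labels), "<-", text)
```

Representing each row as a set of labels reflects the multi-label nature of the task, where one sentence can carry several categories at once. For the real file, the `io.StringIO` object would be replaced by an open file handle.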

Usage Notes


The source code for the baseline methods introduced in the paper can be found in the GitHub repository [17].


The ORAB detection task relies on EHR notes. Thus, if health providers do not recognize a patient's abnormal signs, they may not describe the aberrant behaviors in the note, in which case our approach cannot detect ORABs.

Potential Usage of the Dataset

The information extracted by ORAB detection models can be utilized in various studies and systems aimed at addressing opioid abuse. For instance, since ORABs serve as important evidence of OUD, they can be used as key features in opioid risk monitoring systems. Additionally, this information can be leveraged to detect a patient's risk of OOD or opioid addiction at an earlier stage, thereby assisting in the prevention of fatal OOD cases. Consequently, by supporting efforts to mitigate future opioid overdoses, our research would contribute to maintaining people's health.

Release Notes

Initial version 1.0.0.

The dataset only consists of:

  1. Text from EHR notes
  2. Expert annotated labels

We will extend the dataset through data augmentation in version 2.


The data used in this study originated from a completely de-identified dataset and the data does not include any protected health information. We obtained permission to utilize this dataset through a PhysioNet Credentialed Health Data Use Agreement (v1.5.0). The research was classified as exempt from human subjects research. All experiments conducted adhered to the regulations outlined in the PhysioNet Credentialed Health Data License Agreement. The process of medical charting in the electronic health record is susceptible to various forms of bias.

Conflicts of Interest

The authors declare that they have no conflict of interest.


  1. M. Azadfard, M. R. Huecker, and J. M. Leaming. Opioid addiction. In StatPearls [Internet], 2022.
  2. C. Florence, F. Luo, and K. Rice. The economic burden of opioid use disorder and fatal opioid overdose in the United States, 2017. Drug and Alcohol Dependence, 218:108350, 2021.
  3. Centers for Disease Control and Prevention. Provisional drug overdose death counts. https: //, 2023. [Accessed: 2023-03-08].
  4. D. Beyer. The economic toll of the opioid crisis reached nearly $1.5 trillion in 2020. shorturl.at/kovz3, 2022. [Accessed: 2023-03-08].
  5. M. F. Fleming, J. Davis, and S. D. Passik. Reported lifetime aberrant drug-taking behaviors are predictive of current substance use and mental health problems in primary care patients. Pain Medicine, 9(8):1098–1106, 2008.
  6. R. K. Portenoy. Opioid therapy for chronic nonmalignant pain: a review of the critical issues. Journal of pain and symptom management, 11(4):203–217, 1996.
  7. National Institute on Drug Abuse. Aberrant drug taking behaviors information sheet. https: //, 2023. [Accessed: 2023-04-14].
  8. L. R. Webster and R. M. Webster. Predicting aberrant behaviors in opioid-treated patients: preliminary validation of the opioid risk tool. Pain medicine, 6(6):432–442, 2005.
  9. M. Maumus, R. Mancini, D. M. Zumsteg, and D. K. Mandali. Aberrant drug-related behavior monitoring. The Ochsner Journal, 20(4):358, 2020.
  10. X. Wang. Using natural language processing to detect opioid-related aberrant behaviors from electronic health records. In APHA 2022 Annual Meeting and Expo. APHA, 2022.
  11. K. Rough, K. F. Huybrechts, S. Hernandez-Diaz, R. J. Desai, E. Patorno, and B. T. Bateman. Using prescription claims to detect aberrant behaviors with opioids: comparison and validation of 5 algorithms. Pharmacoepidemiology and drug safety, 28(1):62–69, 2019.
  12. L. L. Adams, R. J. Gatchel, R. C. Robinson, P. Polatin, N. Gajraj, M. Deschner, and C. Noe. Development of a self-report screening instrument for assessing potential opioid medication misuse in chronic pain patients. Journal of pain and symptom management, 27(5):440–459, 2004.
  13. J. M. Lingeman, P. Wang, W. Becker, and H. Yu. Detecting opioid-related aberrant behavior using natural language processing. In AMIA Annual Symposium Proceedings, volume 2017, page 1179. American Medical Informatics Association, 2017.
  14. A. Johnson, L. Bulgarelli, T. Pollard, L. A. Celi, R. Mark, and S. Horng. MIMIC-IV-ED. PhysioNet, 2021.
  15. eHOST. ehost: The extensible human oracle suite of tools. archive/p/ehost/, 2011. [Accessed: 2023-4-10].
  16. Holmes Fee C, Hicklen RS, Jean S, Abu Hussein N, Moukheiber L, de Lota MF, Moukheiber M, Moukheiber D, Anthony Celi L, Dankwa-Mullan I. Strategies and solutions to address Digital Determinants of Health (DDOH) across underinvested communities. PLOS digital health. 2023 Oct 12;2(10):e0000314.
  17. S. Kwon, The official source code for "ODD: a benchmark dataset for the nlp-based opioid related aberrant behavior detection". [Accessed: 2023-10-11]

Parent Projects
ODD: A Benchmark Dataset for the NLP-based Opioid Related Aberrant Behavior Detection was derived from: Please cite them when using this project.

Access Policy:
Only credentialed users who sign the DUA can access the files.

License (for files):
PhysioNet Credentialed Health Data License 1.5.0

Data Use Agreement:
PhysioNet Credentialed Health Data Use Agreement 1.5.0

Required training:
CITI Data or Specimens Only Research
