Rutgers, GSK, and Deep 6 AI develop AI-powered algorithm


Identifying patients at risk for chronic obstructive pulmonary disease (COPD) exacerbations is an important component of optimal patient care because exacerbations can significantly affect a patient’s mortality risk. While predictive algorithms based on claims data are useful, additional precision can be gained by incorporating clinical data from the unstructured text of electronic health records (EHR). Rutgers’ Robert Wood Johnson Barnabas Health System (RWJBH), GSK, and Deep 6 AI conducted a study to develop an algorithm that identifies COPD exacerbations from EHR data. The algorithm used artificial intelligence (AI) and natural language processing (NLP) to mine EHR data for clinical characteristics of patients with exacerbations that are not available in traditional claims data or coded EHR data. It leveraged risk factors identified in COPDGene®, one of the largest studies ever to investigate the underlying genetic factors of COPD. The algorithm could help clinicians better understand COPD disease progression and optimize treatments in care settings.


The study was accepted for presentation at the CHEST 2023 annual meeting on Tuesday, October 10, 2023, from 12:00 pm to 12:45 pm. The presentation was titled “Leveraging artificial intelligence to create a chronic obstructive pulmonary disease exacerbation risk algorithm.” Additional research on this topic was also accepted for a poster presentation at the American Thoracic Society Conference in San Diego on Sunday, May 19, from 2:15 pm to 4:15 pm. The poster is titled “Leveraging Machine Learning and Real-World Data to Predict Chronic Obstructive Pulmonary Disease Exacerbations.”

Rutgers’ RWJBH EHR data was used to build and refine the algorithm

Cases were identified from Rutgers’ RWJBH EHR using combinations of the following criteria:

  • COPD diagnosis; 
  • smoking history or white blood cell/eosinophil count; 
  • modified Medical Research Council (mMRC) dyspnea score >2; and 
  • COPD Assessment Test (CAT) score or pulmonary function test or select comorbidities.  

These criteria were then built in the Deep 6 AI software (shown in Figure 1).

Figure 1. Algorithm build in Deep 6 AI software


Using an iterative process, initial data pulls were reviewed and the algorithm was refined to remove non-COPD cases. After the algorithm was finalized, a random sample of patients was selected for further review by a clinician independent of the algorithm development team to confirm the accuracy of the curated data.
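The random selection step can be illustrated with a minimal sketch. The study does not describe how the sample was drawn or how large it was, so the function name `sample_for_review`, the sample size `k`, and the fixed seed below are all hypothetical; the sketch only shows one reproducible way to pick patients for independent clinician review.

```python
import random


def sample_for_review(patient_ids, k, seed=0):
    """Draw a reproducible simple random sample of patient IDs.

    Hypothetical helper: the sample size and seed are illustrative
    assumptions, not values reported in the study.
    """
    rng = random.Random(seed)              # fixed seed so the draw can be repeated
    unique_ids = sorted(set(patient_ids))  # de-duplicate and fix the ordering
    return rng.sample(unique_ids, min(k, len(unique_ids)))


if __name__ == "__main__":
    cohort = [f"patient-{i}" for i in range(1000)]  # made-up identifiers
    print(sample_for_review(cohort, k=5))
```

Fixing the seed and sorting the identifiers before sampling means the same review list can be regenerated later if the reviewer's findings need to be audited.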

AI-powered algorithm finds 30.7% of patients in Rutgers’ RWJBH COPD cohort had a history of an exacerbation

Of the 1,551,724 patients within Rutgers’ RWJBH EHR system, a total of 18,984 patients were identified as having COPD based on one diagnostic code alone and mentions of COPD in unstructured clinical notes. The mMRC and CAT criteria were removed due to lack of data. The final algorithm identified 6,218 patients and included the following criteria: 2 instances of COPD diagnosis AND evidence of smoking history AND pack years (pky) available AND eosinophil count or pulmonary function test or relevant comorbidity or mention of an exacerbation. Of the 2,864 patients in the curated dataset, the average patient age was 73.3 (± standard deviation 10.9) years and 46% were female; 52.3% had gastroesophageal reflux disease, 56.2% had coronary artery disease/congestive heart failure, and 33.5% had 2 or more comorbidities. Regarding smoking history, 60.6% had smoked more than 30 pky (range 0.1–200 pky) and 28.7% were current smokers. Only 17% of patients with COPD had no evidence of comorbidity. Of the cohort, 30.7% of patients with COPD had a history of an exacerbation.
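The final inclusion criteria read as a Boolean rule, which the sketch below expresses in Python. This is an illustration only, not the Deep 6 AI implementation: the `PatientRecord` fields and the `meets_final_criteria` function are hypothetical names, and "2 instances of COPD diagnosis" is assumed here to mean at least two.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class PatientRecord:
    """Hypothetical per-patient summary of structured and NLP-derived EHR data."""
    copd_diagnosis_count: int            # number of COPD diagnosis instances found
    has_smoking_history: bool            # evidence of smoking history
    pack_years: Optional[float]          # None when pack years are not available
    has_eosinophil_count: bool = False
    has_pulmonary_function_test: bool = False
    has_relevant_comorbidity: bool = False
    has_exacerbation_mention: bool = False


def meets_final_criteria(p: PatientRecord) -> bool:
    """Apply the final criteria as stated in the text.

    Assumption: "2 instances" is read as two or more.
    """
    supporting_evidence = (
        p.has_eosinophil_count
        or p.has_pulmonary_function_test
        or p.has_relevant_comorbidity
        or p.has_exacerbation_mention
    )
    return (
        p.copd_diagnosis_count >= 2
        and p.has_smoking_history
        and p.pack_years is not None
        and supporting_evidence
    )


if __name__ == "__main__":
    example = PatientRecord(2, True, 35.0, has_exacerbation_mention=True)
    print(meets_final_criteria(example))
```

The three required conditions are joined with AND, while the last group needs only one of its four items to be present, which mirrors the AND/or wording of the criteria.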

Figure 2. Demographic and clinical characteristics of patients with COPD at risk of exacerbation

AI has potential to identify COPD exacerbation more precisely than claims-based models

AI and NLP provide an important tool to identify patients with COPD at risk for exacerbation by mining unstructured data sources for clinical characteristics of patients that are not available in traditional claims data or coded EHR data. Further validation of this algorithm in clinical settings will help optimize patient care.   


Authors

Zakusylo, Anna (1); Kahle-Wrobleski, Kristin (3); Tyler, Allison (4); Roy, Jason (2); O’Riordan, Thomas (3); Panettieri, Reynold A (1)

Affiliations

1. Rutgers Institute for Translational Medicine and Science and 2. School of Public Health, Rutgers University, New Brunswick, NJ; 3. US Value Evidence and Outcomes, GSK, Philadelphia, PA; 4. Deep6 Inc, Pasadena, CA.