Accepted Manuscript
Title: Using Artificial Intelligence in an Intelligent Way to Improve Efficiency of a Heart Failure Care Team
Author: Griffin M. Weber
PII: S1071-9164(18)30144-1
DOI: https://doi.org/10.1016/j.cardfail.2018.04.003
Reference: YJCAF 4125
To appear in: Journal of Cardiac Failure
Received date: 11-4-2018
Accepted date: 11-4-2018
Please cite this article as: Griffin M. Weber, Using Artificial Intelligence in an Intelligent Way to
Improve Efficiency of a Heart Failure Care Team, Journal of Cardiac Failure (2018),
https://doi.org/10.1016/j.cardfail.2018.04.003.
This is a PDF file of an unedited manuscript that has been accepted for publication. As a service
to our customers we are providing this early version of the manuscript. The manuscript will
undergo copyediting, typesetting, and review of the resulting proof before it is published in its
final form. Please note that during the production process errors may be discovered which could
affect the content, and all legal disclaimers that apply to the journal pertain.
Using artificial intelligence in an intelligent way to improve efficiency of a heart failure
care team
Griffin M Weber, MD, PhD¹,²
¹ Department of Biomedical Informatics, Harvard Medical School, Boston, Massachusetts
² Department of Medicine, Beth Israel Deaconess Medical Center, Boston, Massachusetts
Griffin M Weber, MD, PhD
Department of Biomedical Informatics
Harvard Medical School
10 Shattuck St
Boston, MA 02115
Email: weber@hms.harvard.edu
Disclosures: Dr. Weber is supported by NIH/NCATS UL1TR001102, NIH/NIGMS
U01GM112623, NIH/NCI U01CA198934, NIH/NHGRI
U54HG007963, NSF/SciSIP SMA-1360042, and PCORI CDRN1306-04608.
Although the potential benefits of artificial intelligence (AI) to medicine have been discussed for
several decades, there are several reasons to think that the promised impact of AI will finally be
here soon [1,2]: the rapid adoption of electronic health records has made the large amounts of
data needed to develop and test AI algorithms much more readily available; wearable devices
and environmental sensors are providing new ways of digitally monitoring patients’ health;
improvements in AI algorithms have resulted in significant advances in the ability of computers
to recognize patterns in data; and consumer products, such as voice recognition in
smartphones, are leading to greater acceptance of and trust in AI. However, many also rightly
warn about the hype around AI [3,4,5]. Computers will neither be curing diseases nor
eliminating the need for human doctors in the immediate future. The goals for AI should be more
incremental. It should be viewed as a tool that can assist providers in making clinical decisions
and help them work more efficiently and with fewer medical errors.
In this issue of the Journal, Blecker et al sought to identify patients with acute decompensated
heart failure (ADHF) early in their hospitalization so that interventions to reduce readmissions,
such as patient education and involvement of multidisciplinary teams, can be coordinated before
the patients are discharged. The current approach at Tisch Hospital in New York City, where
this study was conducted, is a screening tool used by the heart failure transition team that looks
for patients with evidence for heart failure on the basis of the problem list, inpatient loop diuretic
use, or a B-type natriuretic peptide (BNP) value ≥500 pg/mL. With a sensitivity of 0.98, this method correctly identifies
nearly every patient who was later discharged with a diagnosis of ADHF. However, with a low
positive predictive value (PPV) of only 0.13, most of the patients selected by the screening tool
do not actually have ADHF. Providers, therefore, must perform manual chart review, averaging
61.4 minutes of effort to find each patient with ADHF.
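The relationship between the two reported ratios can be illustrated with a minimal sketch. The counts below are hypothetical, chosen only so that they reproduce a sensitivity of 0.98 and a PPV of about 0.13; they are not taken from the study.

```python
def sensitivity(tp: int, fn: int) -> float:
    """Fraction of true ADHF patients that the screen flags."""
    return tp / (tp + fn)


def ppv(tp: int, fp: int) -> float:
    """Fraction of flagged patients who truly have ADHF."""
    return tp / (tp + fp)


# Hypothetical counts (an assumption for illustration, not study data).
tp, fn, fp = 98, 2, 656

print(round(sensitivity(tp, fn), 2))   # 0.98
print(round(ppv(tp, fp), 2))           # 0.13
print(round((tp + fp) / tp, 1))        # 7.7 charts reviewed per true ADHF patient
```

With a PPV near 0.13, roughly seven or eight charts must be reviewed for every ADHF patient found, which is why the review burden is high even though almost no ADHF patient is missed.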
Blecker and colleagues tested whether logistic regression models can generate a better
screening tool. Although viewed as a more traditional machine learning classification technique,
logistic regression often works well for simple binary decision problems [6]. The models are
easy to interpret and it is clear how the individual variables contribute to the classification. In
contrast, newer algorithms, such as “deep learning” systems based on neural networks, are the
state-of-the-art in AI for many applications including computer vision and speech recognition [7].
However, these models are more complex and are often treated as black boxes because it is difficult to
understand how the input variables lead to the classification [3,6].
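To make the interpretability point concrete, here is a minimal sketch of how a fitted logistic regression turns variables into a probability. The variable names, coefficients, and intercept are hypothetical and are not the values estimated by Blecker et al.

```python
import math


def adhf_probability(intercept: float, coefficients: dict, features: dict) -> float:
    """Logistic model: probability = 1 / (1 + exp(-(intercept + sum(b_j * x_j))))."""
    log_odds = intercept + sum(coefficients[name] * value
                               for name, value in features.items())
    return 1.0 / (1.0 + math.exp(-log_odds))


# Hypothetical coefficients and intercept (assumptions for illustration only).
coefficients = {
    "loop_diuretic": 1.2,
    "bnp_ge_500": 1.5,
    "heart_failure_on_problem_list": 0.9,
}
intercept = -4.0

patient = {"loop_diuretic": 1, "bnp_ge_500": 1, "heart_failure_on_problem_list": 0}
p = adhf_probability(intercept, coefficients, patient)

# exp(coefficient) is the odds ratio associated with each variable.
odds_ratios = {name: math.exp(b) for name, b in coefficients.items()}
print(round(p, 3), odds_ratios)
```

Each coefficient can be read directly as a change in the log-odds of ADHF, which is the sense in which the contribution of each variable is clear.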
The authors compared three logistic regression models. The first was based on 31 structured
data elements, which included the three in the original screening tool as well as several
demographic, diagnosis, medication, and laboratory test variables. The second was based on
unstructured clinical notes. In this model, each of the 36,463 different words that appeared at
least ten times across all notes was a potential variable; however, including all of these in the
logistic regression risks overfitting the model with irrelevant words. To avoid this problem, the
authors used a common technique called L1 regularization, which penalizes models with
large feature coefficients [8]. As a result, the coefficients of all but 427 words became zero, and
those words dropped out of the final model. The third model combined both the structured
data elements and the words from clinical notes, for a final model of 432 variables after L1
regularization. Their dataset included 37,229 hospitalizations, of which 1,294 (3.5%) had a
principal discharge diagnosis of ADHF; and, as is standard practice, they randomly divided the
hospitalizations into separate datasets used to train and test the models.
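For readers unfamiliar with the technique, a standard textbook form of the L1-regularized logistic regression objective is given below. The notation is an assumption of this sketch: the editorial does not state the exact objective or the penalty strength used. Here y_i is 1 if hospitalization i had a principal discharge diagnosis of ADHF, x_ij is the value of variable j (for example, a word indicator), and lambda controls the strength of the penalty.

```latex
\hat{\beta} = \arg\min_{\beta}
  \left[ -\sum_{i=1}^{n} \Bigl( y_i \log p_i + (1 - y_i) \log (1 - p_i) \Bigr)
         + \lambda \sum_{j=1}^{m} \lvert \beta_j \rvert \right],
\qquad
p_i = \frac{1}{1 + \exp\!\bigl( -\beta_0 - \sum_{j=1}^{m} \beta_j x_{ij} \bigr)}
```

Because the penalty grows with the absolute value of each coefficient, a larger lambda drives the coefficients of weakly informative variables to exactly zero, which is how most candidate words can drop out of a model.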
The outputs of the logistic regression models are values between 0 and 1, which estimate the
probability that the patient has ADHF. By selecting different probability thresholds, it is possible
to “tune” the model to a desired sensitivity or specificity. In the current study, Blecker et al fixed
the sensitivity to 0.98 so that all models would identify the same number of patients with ADHF
as the existing screening tool. This was key to their approach. Their intent was not to find
additional ADHF patients that are currently being missed. Instead, their objective was only to
raise the PPV and reduce the number of false positives, thereby eliminating the time being
spent manually reviewing those patients’ charts.
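The tuning step described above can be sketched as follows. This is a minimal illustration on toy scores and labels invented for the example, not the authors' procedure or data: it picks the highest probability threshold that still reaches a target sensitivity and then reports the PPV at that threshold.

```python
def threshold_for_sensitivity(scores, labels, target):
    """Return the highest threshold whose sensitivity is at least `target`."""
    positives = sum(labels)
    if positives == 0:
        raise ValueError("at least one positive label is required")
    for t in sorted(set(scores), reverse=True):
        tp = sum(1 for s, y in zip(scores, labels) if s >= t and y == 1)
        if tp / positives >= target:
            return t
    raise ValueError("target sensitivity cannot be reached")


def ppv_at(scores, labels, threshold):
    """PPV among patients whose score is at or above the threshold."""
    flagged = [y for s, y in zip(scores, labels) if s >= threshold]
    return sum(flagged) / len(flagged)


# Toy model outputs and true labels (hypothetical, for illustration only).
scores = [0.95, 0.90, 0.80, 0.60, 0.40, 0.30, 0.20, 0.10]
labels = [1, 1, 0, 1, 0, 1, 0, 0]

t = threshold_for_sensitivity(scores, labels, target=0.98)
print(t, round(ppv_at(scores, labels, t), 2))  # 0.3 0.67
```

Holding sensitivity fixed in this way means that any gain shows up purely as a higher PPV, that is, fewer false positives to review.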
The logistic regression models based only on structured or unstructured data had PPVs of 0.13
and 0.30, respectively. The combined model, which performed the best, had a PPV of 0.34.
Although this might still appear low, when converted to time savings, it reduces the manual
chart review from over an hour down to only 25 minutes to find each patient with ADHF. This
corresponds to potentially hundreds of hours of provider time saved each year at their hospital,
given its volume of heart failure admissions.
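As a rough check on the time estimate, the sketch below assumes that the review time per chart is constant, so that the time needed to find one ADHF patient scales inversely with PPV. That assumption, and the use of the rounded PPVs quoted above, are simplifications of this sketch and not necessarily the authors' calculation.

```python
def minutes_per_true_case(baseline_minutes: float,
                          baseline_ppv: float,
                          new_ppv: float) -> float:
    """Scale review time per true case by the ratio of PPVs.

    Minutes per chart = baseline_minutes * baseline_ppv, and
    charts per true case = 1 / new_ppv.
    """
    return baseline_minutes * baseline_ppv / new_ppv


# Values quoted in the text: 61.4 minutes at PPV 0.13, combined model PPV 0.34.
estimate = minutes_per_true_case(61.4, 0.13, 0.34)
print(round(estimate, 1))  # 23.5
```

The result, roughly 23 to 24 minutes, is in line with the figure of about 25 minutes; the small difference may reflect rounding of the published PPVs or details of the original calculation that are not given here.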
Perhaps the most important limitation of this study is that much of the benefit of the models
comes from the clinical notes, rather than the structured data. Due to variation in the format,
language, abbreviations, and other characteristics of clinical notes across different providers
and institutions, it is unclear how well these logistic regression models would work at another
hospital [9,10]. The models would need to be trained and tested again before they could be
used. Additionally, the reported time savings are only estimates. Blecker et al plan to use the
combined model at their hospital to generate a daily screening list for their heart failure team. It
would be interesting to learn what the actual time savings are and how the heart failure team
responds to this introduction of AI into their daily workflow.
By focusing on time savings rather than the absolute performance of the models, Blecker et al
are using AI in an intelligent way. They are neither overstating what their models can do nor
suggesting that the models can replace manual chart review. With logistic regression, they
combine both structured and unstructured clinical data in a single, easily computable and
interpretable formula. Although this results in only a modest improvement over their existing
screening tool, over time the cumulative gain in efficiency could be significant.
References
1. Jiang F, Jiang Y, Zhi H, Dong Y, Li H, Ma S, Wang Y, Dong Q, Shen H, Wang Y.
Artificial intelligence in healthcare: past, present and future. Stroke Vasc Neurol. 2017
Jun 21;2(4):230-243. doi: 10.1136/svn-2017-000101. eCollection 2017 Dec. Review.
PubMed PMID: 29507784; PubMed Central PMCID: PMC5829945.
2. Shameer K, Johnson KW, Glicksberg BS, Dudley JT, Sengupta PP. Machine learning in
cardiovascular medicine: are we there yet? Heart. 2018 Jan 19. pii:heartjnl-2017-
311198. doi: 10.1136/heartjnl-2017-311198. [Epub ahead of print] Review. PubMed
PMID: 29352006.
3. Cabitza F, Rasoini R, Gensini GF. Unintended Consequences of Machine Learning in
Medicine. JAMA. 2017 Aug 8;318(6):517-518. doi: 10.1001/jama.2017.7797. PubMed
PMID: 28727867.
4. Houssami N, Lee CI, Buist DSM, Tao D. Artificial intelligence for breast cancer
screening: Opportunity or hype? Breast. 2017 Dec;36:31-33.
doi:10.1016/j.breast.2017.09.003. Epub 2017 Sep 20. PubMed PMID: 28938172.
5. Scogland B. Artificial Intelligence in Medicine: Hope or Hype? Medical Device and
Diagnostic Industry. January 2, 2018. Accessed online on March 8, 2018, at
https://www.mddionline.com/artificial-intelligence-medicine-hope-or-hype.
6. Dreiseitl S, Ohno-Machado L. Logistic regression and artificial neural network
classification models: a methodology review. J Biomed Inform. 2002 Oct-Dec;35(5-
6):352-9. Review. PubMed PMID: 12968784.
7. Krittanawong C, Zhang H, Wang Z, Aydar M, Kitai T. Artificial Intelligence in Precision
Cardiovascular Medicine. J Am Coll Cardiol. 2017 May 30;69(21):2657-2664. doi:
10.1016/j.jacc.2017.03.571. Review. PubMed PMID:28545640.
8. Ng AY. Feature selection, L1 vs L2 regularization, and rotational invariance. ICML '04
Proceedings of the twenty-first international conference on Machine learning. 2004.
Page 78.
9. Carroll RJ, Thompson WK, Eyler AE, Mandelin AM, Cai T, Zink RM, Pacheco JA,
Boomershine CS, Lasko TA, Xu H, Karlson EW, Perez RG, Gainer VS, Murphy SN,
Ruderman EM, Pope RM, Plenge RM, Kho AN, Liao KP, Denny JC. Portability of an
algorithm to identify rheumatoid arthritis in electronic health records. J Am Med Inform
Assoc. 2012 Jun;19(e1):e162-9. Epub 2012 Feb 28. PubMed PMID: 22374935; PubMed
Central PMCID: PMC3392871.
10. Rochefort CM, Buckeridge DL, Forster AJ. Accuracy of using automated methods for
detecting adverse events from electronic health record data: a research protocol.
Implement Sci. 2015 Jan 8;10:5. doi: 10.1186/s13012-014-0197-6. PubMed PMID:
25567422; PubMed Central PMCID: PMC4296680.