Drug safety surveillance using de-identified EMR and

claims data: issues and challenges


Prakash M Nadkarni
ABSTRACT
The author discusses the challenges of pharmacovigilance
using electronic medical record and claims data. Use of
ICD-9 encoded data has low sensitivity for detection of
adverse drug events (ADEs), because it requires that an
ADE escalate to major-complaint level before it can be
identified, and because clinical symptomatology is
relatively under-represented in ICD-9. A more appropriate
vocabulary for ADE identification, SNOMED CT, awaits
wider deployment. The narrative-text record of progress
notes can potentially be used for more sensitive ADE
detection. More effective surveillance will require the
ability to grade ADEs by severity. Finally, access to online
drug information that includes both a reliable hierarchy of
drug families as well as structured information on
existing ADEs can improve the focus and predictive
ability of surveillance efforts.
In this issue, Reisinger et al [1] describe the creation of
a database intended to facilitate drug safety
surveillance by monitoring for adverse events, using
extracted data from two de-identified databases,
a claims database and an electronic medical record
(EMR) database provided by a large healthcare
company. The proposed data model is a subset of
a more detailed model specified by the Observational
Medical Outcomes Partnership [2]. That
commercial enterprises engage in such work is
highly laudable.
The proposed data model is fairly straightfor-
ward. A Persons (Patients) table records basic
demographic elements and related tables list the
encounters, medications, procedures, and clinical
conditions for each person. The latter three tables
encode the concepts being recorded using standard
medical vocabularies whose contents, as well as
associated hierarchical relationships, are extracted
from the US National Library of Medicine's Unified
Medical Language System (UMLS) Metathesaurus [3].
Chronological information is essential in surveil-
lance databases: to suspect a medication-related
adverse event, a condition must follow the onset of
medication, though of course a post hoc phenom-
enon does not by itself prove cause and effect. To
create the chronological information, Reisinger et al
pre-processed the raw data by coalescing consecutive
records for the same patient for the same medica-
tion, clinical condition, or procedure into a single
record. The resultant record represents an era for
the therapeutic intervention or condition. Each is
tagged by start and end dates that denote an episode
of continued medication administration or of
ongoing care visits for a condition. The coalescing
heuristic used was: if one encounters a sequence of
records where the start date of intervention in
a subsequent record follows the end date in the
preceding record by 30 days or less, the sequence can
be merged into a single record.
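The coalescing heuristic described above can be sketched as follows. This is a minimal illustration, not the implementation of Reisinger et al: the function name `coalesce_eras`, the `(start, end)` tuple layout, and the sample dates are assumptions made for the example. Only the 30-day threshold comes from the text.

```python
from datetime import date, timedelta

def coalesce_eras(records, max_gap_days=30):
    """Merge (start, end) date pairs for one patient and one concept into eras.

    A record is folded into the current era when its start date follows the
    era's end date by max_gap_days or less (overlapping records also merge).
    """
    eras = []
    for start, end in sorted(records):
        if eras and start - eras[-1][1] <= timedelta(days=max_gap_days):
            # Gap is within the threshold: extend the current era.
            eras[-1] = (eras[-1][0], max(eras[-1][1], end))
        else:
            # Gap is too long (or this is the first record): open a new era.
            eras.append((start, end))
    return eras

# Hypothetical exposure records for a single patient and a single drug.
exposures = [
    (date(2009, 1, 1), date(2009, 1, 30)),
    (date(2009, 2, 15), date(2009, 3, 16)),  # starts 16 days after the previous end
    (date(2009, 6, 1), date(2009, 6, 30)),   # starts 77 days after the previous end
]
eras = coalesce_eras(exposures)
```

Here the first two records collapse into one era and the third opens a second one. The same function would apply unchanged to condition or procedure records, since it looks only at dates.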
The resulting database is impressive in terms of
its data volume: 43 million subjects and 1 billion
drug exposures. However, both the data model and
the vocabularies employed in the work bring with
them significant limitations in terms of the inferences
one can make with regard to medication
safety. To be fair, some of these limitations only
serve to illustrate the challenges inherent in the
problem.
IDENTIFYING ADVERSE EFFECTS: DATA SOURCE
AND VOCABULARY ISSUES
In the above work, the only source of adverse event
data that was utilized from the EMR/claims data
was clinical-condition information that was
encoded using the International Classification of
Diseases, 9th edition (ICD-9) [4]: this was converted,
where possible, into equivalent codes in MedDRA
(Medical Dictionary for Regulatory Activities) [5]
using exact-correspondence information in UMLS.
Because they are used for billing purposes, ICD-9
data are the most readily available structured data
in EMRs for identifying clinical conditions.
However, such data have several issues.
PROBLEMS WITH THE USE OF CLAIMS DATA FOR
ADVERSE EVENT DETECTION
The majority of adverse drug effects (ADEs) are
recorded in the narrative text associated with the
initial post-event visit or progress note, if at all.
Only if severe enough to constitute a chief
complaint or a major finding will they be coded
using ICD-9. The requirement that ADE findings
must escalate to a major-complaint level to be
picked up lowers the system's sensitivity. Under-recognition
of a seemingly common problem
(weight gain) was an issue with the antipsychotic
risperidone, overuse of which is now the focus of
federal concerns [6]: the prevalence of this ADE was
only recognized when the problem escalated into
obesity sufficient to cause type II diabetes or
became pathological.
The relatively weaker coverage of ICD-9 for
(non-billable) symptomatology, in comparison to
other vocabularies such as the Systematized Nomenclature
of Medicine Clinical Terms (SNOMED CT) [7],
is well documented [8, 9]. For example, it would
be unusual to code a complaint of dry mouth due
to anticholinergic medications using ICD-9. The
encoding process itself is vulnerable to inaccuracies,
because it is not always performed by the care
provider during the time of the clinical encounter. Depending on
the healthcare organization and the specialty, a significant
portion of the clinical record may be recorded in narrative text,
which is then encoded a day or two later by medical records staff
for billing and reporting purposes. As such, the encoding does
not reflect ground truth. For example, Stein et al [10] studied the
phenomenon of post-operative pulmonary embolism as recorded
in narrative text and in encoded form, and found not only
significant discrepancies between the two, but also false positives
and negatives in both.
Correspondence to Prakash M Nadkarni, Center for
Medical Informatics, Yale University School of Medicine,
333 Cedar Street, New Haven, CT 06511, USA;
prakash.nadkarni@yale.edu
Received 13 September 2010
Accepted 17 September 2010
J Am Med Inform Assoc 2010;17:671–674. doi:10.1136/jamia.2010.008607
Viewpoint paper
Some groups now promote encoding of problem lists using
SNOMED CT [11], because the latter captures symptomatology
much better than ICD-9. However, there are significant hurdles
to the intended widespread deployment of SNOMED CT. Critical
aspects include the large size of the terminology and the
significant redundancy in its content. Projects such as the
construction by the National Library of Medicine (NLM) of
a CORE Subset of SNOMED CT [12] aim to address both of these
issues. Nevertheless, recent work by Nadkarni and Darer [13]
indicates that such subsets, while undoubtedly useful, cannot
provide the necessary coverage in all circumstances: access to
the complete SNOMED CT content is still required.
Another concern is that encoding of fine details of the
encounter, when performed by humans with software assistance,
is time consuming. Consequently, busy clinicians may
find this an unacceptable chore, and relegate this task to their
medical records staff. Doing so would propagate the aforementioned
concerns regarding accuracy. Conversely, for clinical
encounters documented primarily as narrative text,
a currently popular question is whether automated natural-language-processing
(NLP) techniques can adequately extract
all ADE-related information from the text. Wang et al [14]
explored the feasibility of using NLP for ADE signal detection
in a recent JAMIA paper. In their proof-of-concept study, Wang
et al evaluated patients treated with bupropion, and the
results were promising. However, the field must replicate such
work on a much larger scale to determine where the pitfalls
lie.
THE CHOICE OF MEDDRA AS AN ADVERSE EVENT
TERMINOLOGY
The FDA uses MedDRA to collect and encode reports of adverse
events. Thus, mapping of ICD-9 codes to MedDRA is necessary
for communication to the FDA. Using MedDRA has some
advantages, notably in the area of standardized MedDRA
queries. Through a knowledge base representing the findings of
various syndromes using MedDRA terms, one can search for
patients whose individual findings are consistent with disorders
such as anaphylaxis, extrapyramidal manifestations, hemolysis,
or renal failure. However, MedDRA's design deviates significantly
from modern controlled-vocabulary-design principles as articulated
in Cimino's classic paper [15]; its limitations have been
discussed by other authors [16–18]. Concerns about MedDRA
include that it is not concept-oriented, it is non-compositional,
its hierarchy is arbitrarily constrained to five levels, and, at the
higher levels, it is artificially mono-hierarchical, which leads to
difficulties in formulating queries.
Because the SNOMED CT concept hierarchy is significantly
richer than MedDRA's, Bodenreider attempted to map, using
automated approaches, MedDRA preferred terms (the equivalent
of concepts) to SNOMED CT concepts [19]. He found that
58% of MedDRA's preferred terms could be mapped this way.
Thus, the incorporation of additional intermediate-level
concepts from SNOMED may make MedDRA-encoded data
easier to categorize, aggregate, and analyze meaningfully.
GRADING OF ADVERSE EVENTS
Early and sensitive adverse event detection requires adverse
event grading. Merely recording that a drug causes an adverse
event is not enough: one must know how severe it is. In the
running example of the Reisinger et al paper, acute myocardial
infarction represents only the tip of the iceberg of coronary
artery disease leading to occlusion. Patients with acute
myocardial infarction frequently experience symptoms such as
anginal pain beforehand. It is important to catch adverse events
before they escalate into full-blown emergencies. While
Reisinger et al mention ischemic heart disease in passing, it is not
clear how their model would represent progression of given
disorders along a spectrum from mild to severe such that all
intermediate states would fit as recognizable components of the
same disease process.
Some adverse events, by their very nature (such as anaphylaxis
or toxic epidermal necrolysis), occur in a severe form. Most
ADEs, however, can occur with varying grades of severity. For
example, National Cancer Institute (NCI)-sponsored clinical
trials of cancer therapies utilize the Common Toxicity Criteria
for Adverse Effects (CTC AE) [20], where adverse events are graded
on a 1–5 scale (5 represents death), though, depending on the
particular adverse event, not all points on the scale may be used.
For example, dry mouth can occur as 1–2 on the scale, while
secondary malignancy, if present, is automatically grade 4. One
motivation for grading is to enable consistent reporting of
adverse events to the local Institutional Review Board, to other
collaborating sites in a multi-site study, and to the study's
sponsor, for example, by requiring reporting of major adverse
events of grade 3 and above.
Although CTC AE was originally developed for oncology, its
grading is anchored, and it has therefore found application in
non-cancer-related studies such as stem cell transplantation [21] and, in
a modified form, for rheumatology [22]. The use of CTC AE
minimizes inter-rater variability. Anchoring implies that rather
than simply using terms like "mild," "moderate," or "severe"
without definition or qualification, CTC AE specifies a particular
grade of an adverse event in unambiguous detail, often in terms
of numerical ranges or the extent of functional disability. For all
its strengths, however, CTC AE is not comprehensive enough to
use for all drug categories or for all types of clinical studies. For
example, psychiatric findings are under-represented, as are
certain physical findings such as tendon rupture. The latter can
occur with fluoroquinolone antimicrobial administration or after
periarticular corticosteroid injections. With CTC AE, tendon
rupture can only be encoded as "musculoskeletal, other
(specify)."
Grading of adverse events is not always possible or feasible to
perform in real time. While the grades of some adverse events
(eg, those based on measurable physical or laboratory findings)
can be readily computed algorithmically, grading of subjective
findings typically requires careful inspection of the clinical
record or detailed interviewing of the patient. Electronic support
in the form of check-lists can facilitate its implementation. A
concern regarding the model proposed by Reisinger et al is that
adverse event grade information is not easily gleaned from ICD-
9 data. First, only a small proportion of clinical conditions are
graded in ICD-9 as mild/minimal, moderate, or severe. More
importantly, as already stated, the billing and administrative
practices related to ICD-9 usage tend to leave adverse events of
a low-level grade as narrative-text portions of the clinical record
rather than formally encoding them.
DRUG INFORMATION: CHOICE OF REFERENCE CONTENT
For hierarchical relationships among drugs, Reisinger et al chose
to use the drug hierarchy of SNOMED CT. While SNOMED
CT's strengths with respect to encoding much of clinical
medicine are well known, SNOMED CT is a suboptimal source
for information about relationships among drugs.
In the discussion below it is important to note that the
relationship between drugs and drug families/categories is poly-
hierarchical (ie, one drug may belong to more than one family).
A given drug may have multiple therapeutic actions (for
example, aspirin is both an anti-inflammatory and an anti-platelet
agent), or a drug may bind multiple receptors, as in the case of
chlorpromazine.
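A poly-hierarchical drug-to-family relationship can be represented as a many-to-many mapping rather than a tree. The sketch below is illustrative only: the names `DRUG_FAMILIES`, `invert`, and `FAMILY_MEMBERS` are hypothetical, and the family labels are taken from examples in this article, not from any particular terminology.

```python
from collections import defaultdict

# Each drug maps to a SET of families, so one drug can sit under several parents.
DRUG_FAMILIES = {
    "aspirin": {"anti-inflammatory agent", "anti-platelet agent"},
    "chlorpromazine": {"phenothiazine antipsychotic", "phenothiazine antihistamine"},
}

def invert(drug_families):
    """Build the reverse index: family -> set of member drugs."""
    members = defaultdict(set)
    for drug, families in drug_families.items():
        for family in families:
            members[family].add(drug)
    return members

FAMILY_MEMBERS = invert(DRUG_FAMILIES)
```

With the reverse index, an analysis of "all anti-platelet agents" picks up aspirin even though aspirin is also filed as an anti-inflammatory agent. A strictly mono-hierarchical source would force a choice between the two.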
The authors' choice of rofecoxib as an exemplar was fortuitous:
SNOMED CT characterizes it correctly as a cox-2 inhibitor.
However, in SNOMED CT chlorpromazine falls under the single
category "phenothiazine," a chemical classification that is not
useful from the pharmacological or therapeutic perspectives. The
SNOMED CT classification for the widely used drug acetaminophen
is incorrect: "Para-aminophenol derivative anti-inflammatory
agent (substance)." Acetaminophen has antipyretic and analgesic
effects, but has no clinically significant anti-inflammatory effects.
The antimicrobial ciprofloxacin (a fluoroquinolone) is placed in
the less useful, broader category "quinolones" along with nalidixic
acid, an older drug with a significantly different adverse event
profile. Such classification problems can have real-world complications.
For example, the fluoroquinolone drug family, which is
not a distinct concept in SNOMED CT, is the focus of complaints
regarding overuse from groups such as the Fluoroquinolone
Toxicity Research Foundation, and the Health Research Group of
Public Citizen, which has petitioned the FDA to require black-box
label warnings [23].
An accurate and comprehensive drug hierarchy is important
for analyses of groups of related drugs. Useful drug hierarchies
have been constructed, but are not always freely available. For
example, the Cerner Multum Drug Lexicon database [24] was
freely available in its earlier versions, and correctly classified
chlorpromazine both as a phenothiazine antipsychotic and
a phenothiazine antihistamine. Unfortunately, its distribution
has been constrained in its more recent versions, and one can
now only obtain it by purchasing the content.
DETERMINING DOSE-RELATED EFFECTS: CHALLENGES
Reisinger et al state explicitly that their data model does not
support analyses by drug strength. More concerning, the model
does not record dose information. Many adverse events occur as
dose-related extensions of pharmacological actions, such as
congestive heart failure with β-blockers and fluid retention with
the thiazolidinedione anti-diabetic agents.
The absence of dose data again limits the model's utility.
There are several issues related to performing such analyses.
► Many of the standard sources of drug information (such as
the NLM's RxNorm [25] and the drug hierarchy of SNOMED
CT that the authors used) do not treat the numbers
associated with a pharmaceutical preparation specially.
Instead, the numbers are simply part of the string that
describes a formulation. The UMLS reflects this design
limitation as well. More advanced data models, such as the
previously mentioned Multum Lexicon, explicitly separate
the numeric part of the drug strength (as well as the units in
which the strength is expressed) from the medication itself.
The Multum data model is sophisticated enough to recognize
that in many cases, both strength and units are expressed in
two parts, numerator and denominator (eg, milligrams per
100 ml), and so these parts were modeled separately where
necessary.
► Of course, even if one knows what strength of preparation
was being prescribed for a given patient, that is not enough to
reliably compute the quantity of medication that the patient
is actually receiving per unit time. For ambulatory patients,
one may try to rely on the quantity dispensed for a given
period, but that is not the same as what is ingested. For
several drugs, the dose must be continually titrated based on
the values of a laboratory measure (eg, the International
Normalized Ratio (INR) for warfarin), so the number of
tablets taken per day or per week may change frequently.
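The idea of separating a strength into numerator and denominator parts, each with a value and a unit, might be sketched as follows. This is not the Multum data model: the `Strength` class, the `parse_strength` function, the regular expression, and the sample formulation strings are assumptions made for illustration, and real formulation strings vary far more than this pattern allows.

```python
import re
from dataclasses import dataclass
from typing import Optional

@dataclass
class Strength:
    num_value: float                  # eg, 250
    num_unit: str                     # eg, "mg"
    den_value: Optional[float] = None  # eg, 5; None for a simple strength
    den_unit: Optional[str] = None     # eg, "ml"

# First "<number> <unit>" in the string, optionally followed by
# "/" or "per", an optional number, and a second unit.
_PATTERN = re.compile(
    r"(?P<nv>\d+(?:\.\d+)?)\s*(?P<nu>[a-zA-Z]+)"
    r"(?:\s*(?:/|per\b)\s*(?P<dv>\d+(?:\.\d+)?)?\s*(?P<du>[a-zA-Z]+))?"
)

def parse_strength(text):
    """Return a Strength parsed from a formulation string, or None if no match."""
    m = _PATTERN.search(text)
    if m is None:
        return None
    dv, du = m.group("dv"), m.group("du")
    return Strength(
        num_value=float(m.group("nv")),
        num_unit=m.group("nu").lower(),
        # "mg/ml" with no explicit denominator number is read as "per 1 ml".
        den_value=(float(dv) if dv else 1.0) if du else None,
        den_unit=du.lower() if du else None,
    )

simple = parse_strength("Warfarin 5 mg tablet")
ratio = parse_strength("Amoxicillin 250 mg/5 ml oral suspension")
```

Once the numbers are held as numeric fields and not embedded in a description string, preparations of the same drug can be compared or converted to a common unit.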
One practical issue is that many EMRs (eg, EpicCare) record
the caregiver's orders for a given prescription only as narrative
text, for example, "1 bid," even though it should not be particularly
difficult in principle to enforce at least partial structure in
the data through the use of pull-down lists and separation of the
numeric part of the order from the dose frequency (although
narrative text is still necessary for special instructions). Because
of the considerable variation that can occur in such text,
attempting to extract computable dose information can become
a difficult pattern-recognition or NLP project.
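A minimal sketch of the easy case shows where the difficulty begins. The frequency table, the `SIG` pattern, and the function `units_per_day` are hypothetical. They handle only orders of the form "quantity, then a standard Latin frequency abbreviation" and deliberately return `None` for anything else.

```python
import re

# Assumed lookup table: frequency abbreviation -> administrations per day.
FREQ_PER_DAY = {"qd": 1, "bid": 2, "tid": 3, "qid": 4}

# A leading number followed by a single alphabetic token, eg "1 bid".
SIG = re.compile(r"^\s*(?P<qty>\d+(?:\.\d+)?)\s+(?P<freq>[a-z]+)\b", re.IGNORECASE)

def units_per_day(sig_text):
    """Return dose units per day for simple orders, or None if not understood."""
    m = SIG.match(sig_text)
    if m is None:
        return None
    per_day = FREQ_PER_DAY.get(m.group("freq").lower())
    if per_day is None:
        return None
    return float(m.group("qty")) * per_day

simple_order = units_per_day("1 bid")                 # 1 unit, twice daily
free_text = units_per_day("1 tablet twice daily")     # not in the lookup table
```

The second call fails even though it means the same thing as the first, which is the variation problem in miniature. Structured entry (a numeric field plus a pull-down frequency) would make the lookup trivial.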
The full Observational Medical Outcomes Partnership data
model allows recording of the number of refills, the number of
days' supply, and the total quantity of drug, but does not try to
address dosage issues explicitly. This illustrates the overall
challenges related to determining actual administered drug dose
information reliably.
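From the fields just mentioned, about the most one can derive is an average dispensed quantity per day. The sketch below is an assumption-laden illustration (the function name and arguments are hypothetical), and the result describes what was dispensed, not what the patient ingested.

```python
def mean_daily_quantity(total_quantity, days_supply):
    """Average units dispensed per day over the supply period."""
    if days_supply <= 0:
        raise ValueError("days_supply must be positive")
    return total_quantity / days_supply

# Hypothetical fill: 60 tablets intended to last 30 days.
dispensed_per_day = mean_daily_quantity(60, 30)
```

Multiplying this figure by a parsed tablet strength would give an estimated daily dose, but for titrated drugs such as warfarin the estimate can be wrong for any given week.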
UTILIZING KNOWN ADVERSE EVENT INFORMATION FOR DRUG
SURVEILLANCE
Pharmacovigilance (drug surveillance) efforts can utilize existing
knowledge about adverse events in several ways:
► Drug surveillance may resemble data mining with hypothesis
generation. Formally designed studies must later confirm (or
disprove) initial signals or trends detected in the raw data. A
significant problem in data mining exercises is the overabundance
of signals. Such problems multiply if the software
lacks information on what is already common knowledge, as
in the apocryphal story of the software program that
discovered that ovarian cancer only occurs in women. One
way for software to reduce pharmacovigilance study noise
levels is to post-process signals by checking against known
adverse event information for the drugs under suspicion, so
that only novel signals are considered for further exploration.
► Existing adverse event knowledge about closely related
chemical compounds can also serve to focus the surveillance.
For example, programs should monitor new aminoglycoside
antibiotics for adverse renal or vestibulo-cochlear effects, and
new statin-class drugs for hepatotoxicity and myopathy.
► If one knows a drug's pharmacological mechanisms of action,
one can predict part of its potential adverse event profile
before case reports appear in the literature. A new drug with
anticholinergic side effects will likely cause urinary retention
in elderly males with benign prostatic hypertrophy, and can
potentially exacerbate glaucoma in those patients known to
have the condition. Such patients are not commonly subjects
in clinical trials of drugs, which often may not specifically
target older populations.
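The first two uses listed above (suppressing already-known signals, and inheriting expectations from a drug's class) can be sketched together. Everything named here is hypothetical: the `KNOWN_ADES` table holds only the class-level examples given in the text, and `drug_a`, `drug_b`, and `hypothetical_event` are placeholders, not real products or findings.

```python
# Class-level adverse events already known (examples taken from the text).
KNOWN_ADES = {
    "aminoglycoside": {"adverse renal effect", "vestibulo-cochlear effect"},
    "statin": {"hepatotoxicity", "myopathy"},
}

# Assumed mapping from each drug under surveillance to its class.
DRUG_CLASS = {"drug_a": "statin", "drug_b": "aminoglycoside"}

def novel_signals(signals, drug_class, known_ades):
    """Keep only (drug, event) pairs not already known for the drug's class."""
    novel = []
    for drug, event in signals:
        known = known_ades.get(drug_class.get(drug), set())
        if event not in known:
            novel.append((drug, event))
    return novel

raw_signals = [
    ("drug_a", "myopathy"),               # known for statins: suppressed
    ("drug_a", "hypothetical_event"),     # not in the table: kept
    ("drug_b", "adverse renal effect"),   # known for aminoglycosides: suppressed
]
to_review = novel_signals(raw_signals, DRUG_CLASS, KNOWN_ADES)
```

Exact string matching is the weak point of this sketch. In practice the event names would need to be codes from a shared terminology, which is one reason the vocabulary issues discussed earlier matter for surveillance.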
While commercial drug databases store such content, they are
proprietary and vary considerably in design. Since comparative
descriptions of commercial content have not been published in
the literature, avoiding a blanket assessment is necessary here.
However, a considerable portion of the proprietary content
reproduces entirely the prose in the FDA-mandated package
insert, and the latter is now freely available through NLM's
DailyMed [26]. The added value of proprietary sources comes in
part from categorizing the textual content into functional
categories (side effects, pregnancy, and lactation), organ systems,
and an occasional severity indicator, but this is not sufficient for
drug surveillance purposes.
The time is now appropriate for systematic efforts (preferably
combining public and private resources) to extract the infor-
mation that is present in the numerous primary and secondary
adverse event data repositories into a single, over-arching
structured representation with standard form and content. Such
structuring will possibly be facilitated by the creation of
a standard terminology of adverse event content that has much
richer inter-relationships than are present in MedDRA, and
where aspects of the same spectrum of disease are correlated
along a time-and-severity spectrum (for example, angina
pectoris and myocardial infarction), as opposed to merely being
related concepts. Bodenreider's pilot work on using SNOMED
CT represents the starting point for such efforts. A larger
consortium should build upon this work.
Acknowledgments The author wishes to thank Randolph Miller for valuable
feedback on the manuscript.
Competing interests None.
Provenance and peer review Not commissioned; not externally peer reviewed.
REFERENCES
1. Reisinger SJ, Ryan PB, O'Hara DJ, et al. Development and evaluation of a common data model enabling active drug safety surveillance using disparate healthcare databases. J Am Med Inform Assoc 2010;17:652–62.
2. Observational Medical Outcomes Partnership. OMOP Common Data Model Specifications, Version 2.0. 2009. http://omop.fnih.org (accessed 9 Jan 2010).
3. Lindberg DAB, Humphreys BL, McCray AT. The unified medical language system. Meth Inform Med 1993;32:281–91.
4. World Health Organization. International Classification of Diseases, 10th edn. Geneva, Switzerland, 1992.
5. MedDRA Maintenance and Support Organization. Medical Dictionary for Regulatory Activities. 2009. http://www.meddramsso.com (accessed 10 Sep 2009).
6. Harris G. Use of Antipsychotics in Children Is Criticized. The New York Times, 2008.
7. International Health Terminology Standards Development Organization. SNOMED Clinical Terms (SNOMED CT). 2009. http://www.snomed.org (accessed 2 Jan 2009).
8. Chute C, Cohn S, Campbell K, et al. The content coverage of clinical classifications. For The Computer-Based Patient Record Institute's Work Group on Codes & Structures. J Am Med Inform Assoc 1996;3:224–33.
9. Brouch K. AHIMA project offers insights into SNOMED, ICD-9-CM mapping process. J AHIMA 2003;74:52–5.
10. Stein H, Nadkarni P, Erdos J, et al. Exploring the degree of concordance of coded and textual data in answering clinical queries from a clinical data repository. J Am Med Inform Assoc 2000;7:42–54.
11. Warren J, Collins J, Sorrentino C, et al. Just-in-time coding of the problem list in a clinical environment. Proc AMIA Symp 1998; Washington DC:280–4.
12. US National Library of Medicine. The CORE problem list subset of SNOMED-CT. 2009. http://www.nlm.nih.gov/research/umls/Snomed/core_subset.html (accessed 6 Jan 2010).
13. Nadkarni P, Darer J. Migrating existing clinical content from ICD-9 to SNOMED. J Am Med Inform Assoc 2010;17:602–7.
14. Wang X, Hripcsak G, Markatou M, et al. Active computerized pharmacovigilance using natural language processing, statistics, and electronic health records: a feasibility study. J Am Med Inform Assoc 2009;16:328–37.
15. Cimino JJ. Desiderata for controlled medical vocabularies in the twenty-first century. Methods Inf Med 1998;37:394–403.
16. Merrill G. The MedDRA paradox. AMIA Annu Fall Symp 2008:470–4.
17. Richesson R, Fung K, Krischer J. Heterogeneous but standard coding systems for adverse events: Issues in achieving interoperability between apples and oranges. Contemp Clin Trials 2008;29:635–45.
18. Bousquet C, Lagier G, Lillo-Le Louët A, et al. Appraisal of the MedDRA conceptual structure for Describing and Grouping Adverse Drug Reactions. Drug Saf 2005;28:19–34.
19. Bodenreider O. Using SNOMED CT in combination with MedDRA for reporting signal detection and adverse drug reactions reporting. AMIA Annu Fall Symp Am Med Inform Assoc 2009;2009:45–9.
20. National Cancer Institute. Common Terminology Criteria for Adverse Events (CTCAE) and Common Toxicity Criteria (CTC). 2009. http://ctep.cancer.gov/protocolDevelopment/electronic_applications/ctc.htm (accessed 9 Jan 2009).
21. Daly A, Song K, Nevill T, et al. Stem cell transplantation for myelofibrosis: a report from two Canadian centers. Bone Marrow Transplant 2003;32:35–40.
22. Woodworth T, Furst DE, Alten R, et al. Standardizing assessment and reporting of adverse effects in rheumatology clinical trials II: the Rheumatology Common Toxicity Criteria v.2.0. J Rheumatol 2007;34:1401–14.
23. Landers S. FDA requires black-box warnings for fluoroquinolones. 2008. http://www.ama-assn.org/amednews/2008/07/28/hlsc0728.htm (accessed 9 Feb 2010).
24. Cerner Corporation. Multum Lexicon. 2005. http://www.multum.com/VantageRxDB.htm (accessed 6 Aug 2005).
25. National Library of Medicine. RxNorm. 2010. http://www.nlm.nih.gov/research/umls/rxnorm (accessed 9 Feb 2010).
26. National Library of Medicine. About DailyMed. 2010. http://www.dailymed.nlm.nih.gov/dailymed/about.cfm (accessed 9 Feb 2010).