Sequoia Foundation, La Jolla, and California Department of Health Services, Genetic Disease Screening Program, Richmond, CA, USA
Correspondence: Michelle Pearl, Sequoia Foundation c/o Genetic Disease Screening Program, California Department of Public Health, 850 Marina Bay Parkway, Rm. F175, Mail Stop 8200, Richmond, CA 94804, USA. E-mail: michelle.pearl@cdph.ca.gov
Conflicts of interest: the authors have declared no conflicts of interest.
Pearl M, Wier ML, Kharrazi M. Assessing the quality of last menstrual period date on California birth records. Paediatric and Perinatal Epidemiology 2007; 21(Suppl. 2): 50–61.
Summary
Birth certificate last menstrual period (LMP) date is widely used to estimate gestational
age in the US. While data quality concerns have been raised, no large population-based
study has isolated data quality issues by comparing birth record LMP (Birth LMP) with
reliable LMP dates from another source. We assessed LMP data quality in 2002 California singleton livebirth records (n = 515 381) and in a subset of records with linked
prenatally collected LMP from California's statewide Prenatal Expanded Alpha-fetoprotein Screening Program (XAFP) (n = 105 936). Missing or incomplete LMP data
affected 13% of birth records; 17% of those had complete LMP within XAFP records.
Data quality indicators supported XAFP LMP as more accurate than Birth LMP, with
a lower prevalence of digit preference, post-term delivery, out-of-range gestational age
estimates and implausible birthweight-for-gestational age. The bimodal birthweight
distribution evident at 20–31 weeks' gestation based on Birth LMP was nearly absent
with XAFP LMP-based gestational age. Approximately 32% of the second birthweight
mode was explained by apparent clerical errors in Birth LMP month. Digit preference
errors, particularly day 1, were associated with gestational age overestimation. Preterm
delivery rates were higher according to Birth (7.6%) vs. XAFP LMP (7.2%). One-fifth of
observed preterm and over half of observed post-term births using Birth LMP were not
true cases; 15% of true preterm cases were missed. African American or Hispanic, less
educated, and publicly insured or uninsured women were most likely to be misclassified and
have large LMP date discrepancies attributable to clerical or digit preference error.
The implementation of a revised birth certificate is an opportunity for targeted training
and data entry checks that could substantially improve LMP accuracy on birth
records.
Keywords: birth records, LMP date, accuracy, gestational age.
Introduction
Last menstrual period (LMP) date is the most widely
available source for estimating gestational age from
birth certificates in the US, and is the only source from
the California certificate of livebirth before 2007.
However, gestational age estimates from LMP in
general, and from birth records in particular, are prone
to error, as exhibited by digit preference¹⁻³ and implausible values relative to birthweight.⁴ Errors in gestational age estimates from LMP have resulted in excess
post-term births relative to ultrasound estimates¹,⁵ and
Paediatric and Perinatal Epidemiology, 21 (Suppl. 2), 50–61. © 2007 The Authors. Journal Compilation © 2007 Blackwell Publishing Ltd
Methods
California singleton livebirth records from 2002
(n = 515 381) were linked to data from pregnant
women enrolled in the statewide Expanded Alpha-fetoprotein Screening Program (XAFP) between July
2001 and December 2002. The XAFP is a voluntary,
triple marker screening programme offered to all
women entering prenatal care by 20 weeks' gestation.
In order to interpret serological markers, the programme requires an estimate of gestational age based
on ultrasound, LMP, or physical examination, which
is reported by the medical provider at the time of
maternal blood collection (between 15 and 20 weeks'
gestation) and double-key entered by programme
personnel. The programme assigns a best estimate of
gestational age that prioritises ultrasound when available as the gold standard, unless otherwise specified
by the provider. Between 20% and 25% of records are
routinely verified with providers before serological
interpretation, and those with positive or uninterpretable screen results (roughly an additional 8%) receive
further follow-up to confirm gestational age.
Probabilistic matching was used to link records
from the XAFP and birth certificates, using mother's
name, date of birth, social security number, delivery
date, XAFP accession date, telephone number, street
address, city and zip code.¹¹ A conservative certainty
cut-off was used to minimise false matches. Overall,
Results
Data completeness and population selection factors are
assessed in Table 1. In 2002 birth records, 12.9% of
deliveries were missing LMP dates, 55.8% of those
missing day only (data not shown). Missing or incomplete LMP data on birth records were associated with
African American and US-born Hispanic race/
ethnicity, younger maternal age, higher prevalence of
low birthweight, less than high-school education, and
Medi-Cal coverage (Table 1). Of records with missing
or incomplete LMP, 39.2% had complete ultrasound
data and 16.6% had complete LMP data in linked XAFP
records (data not shown).
Compared with non-XAFP participants, XAFP participants were more likely to be under the age of
34 years, to have no previous livebirths, to have completed more than 12 years of education, and to be privately insured (Table 1). Among XAFP participants,
women with LMP as opposed to ultrasound best estimates were more likely to be foreign-born Hispanic,
have less than high-school education, and have Medi-Cal coverage. Both preterm and post-term birth rates
derived from birth certificate LMP were higher among
XAFP participants with ultrasound best estimates compared with those with LMP best estimates (8.9% vs.
7.7% and 8.2% vs. 3.7%, respectively).
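The preterm and post-term rates above rest on gestational age in completed weeks derived from the LMP date. A minimal sketch of that derivation follows, assuming the week ranges shown in Table 1 (preterm 20–36, post-term 42–44, values <20 or >44 treated as out of range). The function names and the label "term" for 37–41 weeks are illustrative choices, not taken from the paper.

```python
from datetime import date

def completed_weeks(lmp: date, delivery: date) -> int:
    """Gestational age in completed weeks: whole weeks elapsed since the LMP date."""
    return (delivery - lmp).days // 7

def ga_category(weeks: int) -> str:
    """Classify completed weeks using the ranges listed in Table 1.
    The label 'term' for 37-41 weeks is an illustrative name."""
    if weeks < 20 or weeks > 44:
        return "out of range"
    if weeks <= 36:
        return "preterm"
    if weeks <= 41:
        return "term"
    return "post-term"

# Hypothetical record: LMP on 1 January 2002, delivery on 3 September 2002.
weeks = completed_weeks(date(2002, 1, 1), date(2002, 9, 3))
print(weeks, ga_category(weeks))
```

Because the category depends only on the LMP date and the delivery date, an error of a single month in the recorded LMP shifts the estimate by four or more weeks, which is enough to move a term birth into the preterm or post-term range.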
XAFP LMP appears to suffer from fewer data quality
problems than Birth LMP, as evidenced by fewer out-of-range gestational age values, fewer preferred digits,
lower post-term rates and lack of a bimodal birthweight distribution at early gestational ages (Table 2).
Preterm birth prevalence was higher according to
linked Birth LMP than XAFP LMP (7.6% vs. 7.2%).
Birth records linked to XAFP records had lower prevalence of out-of-range gestational age, post-term births
and implausible birthweight-for-gestational age than
the overall birth population. Day 1 was the most commonly reported day in overall birth records, and day 15
was most commonly reported by both Birth LMP and
XAFP LMP within the linked sample. While digit preference is evident in both data sources for LMP date,
over-reporting of days 1 and 15 of the month was
higher in Birth LMP vs. XAFP LMP dates (Table 2).
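Digit preference of this kind can be screened for by comparing the share of LMP dates falling on each day of the month with the share expected in the absence of preference. The sketch below assumes the 3.3% expected frequency given in the Table 2 footnote; the function names and the toy dates are hypothetical.

```python
from collections import Counter
from datetime import date

# Expected share of records per preferred day, taken from the Table 2 footnote (3.3%).
EXPECTED_SHARE = 0.033

def day_of_month_shares(lmp_dates):
    """Return {day of month: share of records} for a list of LMP dates."""
    counts = Counter(d.day for d in lmp_dates)
    total = sum(counts.values())
    return {day: n / total for day, n in sorted(counts.items())}

def preference_ratio(share, expected=EXPECTED_SHARE):
    """Observed share divided by expected share; values well above 1 suggest preference."""
    return share / expected

# Toy data (hypothetical): two of four dates fall on day 1.
sample = [date(2002, 1, 1), date(2002, 2, 1), date(2002, 3, 15), date(2002, 4, 7)]
print(day_of_month_shares(sample))

# Day 1 shares reported in Table 2 for the linked sample: 6.2% (Birth LMP), 4.3% (XAFP LMP).
print(preference_ratio(0.062), preference_ratio(0.043))
```

A ratio computed per day and per data source makes the contrast between Birth LMP and XAFP LMP directly comparable.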
The proportion of very preterm births falling within
the second birthweight mode was largest among the
overall birth population (26.7% of all births between 20
and 31 weeks), and was four times greater when using
Birth LMP than XAFP LMP to estimate gestational age
in the linked sample (Table 2). The second birthweight
Table 1. Characteristics of linked and unlinked study populations, California 2002 Live Birth and Prenatal Expanded Alpha-fetoprotein Screening Program (XAFP) records

Columns (all values are %):
A  2002 Livebirths (n = 515 381): missing or incomplete LMP date, n = 66 623 (12.9%)
B  2002 Livebirths (n = 515 381): with LMP date, n = 448 758 (87.1%)
C  2002 Livebirths with Birth LMP (n = 448 758): not linked to XAFP, n = 178 012 (39.7%)
D  2002 Livebirths with Birth LMP (n = 448 758): linked to XAFP, n = 270 746 (60.3%)
E  XAFP Ultrasound, n = 164 810 (60.9%)
F  XAFP LMP, n = 105 936 (39.1%)

                                      A      B      C      D      E      F
Race/ethnicity
  White                            29.1   31.2   32.1   30.7   32.0   28.6
  African American                  7.8    5.6    6.0    5.4    5.5    5.3
  Asian                             8.1    8.8    7.5    9.6    9.9    9.1
  Hispanic, US-born                21.9   17.5   16.1   18.3   18.1   18.7
  Hispanic, foreign-born           28.9   33.0   34.4   32.2   30.6   34.7
  Pacific Islander                  3.5    3.4    3.4    3.4    3.6    3.2
  American Indian/Alaskan Native    0.6    0.4    0.5    0.4    0.4    0.3
Age (years)
  <20                              11.8    9.5   11.4    8.2    7.4    9.4
  20–24                            26.1   23.1   24.5   22.2   21.2   23.9
  25–34                            47.5   51.0   41.0   57.6   58.3   56.4
  >34                              14.6   16.4   23.1   12.0   13.1   10.3
Education (years)
  <12                              31.3   28.8   32.2   26.5   24.9   29.1
  12                               31.6   28.3   28.2   28.3   28.2   28.7
  >12                              37.1   42.9   39.6   45.1   47.0   42.3
Previous livebirths (parity)
  0                                34.5   40.0   38.1   41.2   40.7   41.9
  1                                32.6   31.7   30.1   32.7   32.9   32.5
  2+                               32.9   28.4   31.8   26.1   26.5   25.6
Birthweight (g)
  <1500                             1.2    0.9    0.9    0.8    0.9    0.7
  1500–2499                         5.0    4.0    4.2    3.8    4.0    3.6
  ≥2500                            93.8   95.2   95.0   95.4   95.1   95.8
Method of payment for delivery
  Medi-Cal                         47.7   42.6   47.3   39.6   36.0   45.1
  Any private                      48.8   53.0   44.4   58.7   62.3   53.1
  Uninsured                         2.4    2.3    4.1    1.2    1.1    1.3
  Other                             1.1    2.0    4.2    0.6    0.6    0.5
Birth LMP gestational age (completed weeks)
  <20                                NA    0.1    0.1    0.1    0.1    0.1
  20–31                              NA    1.3    1.5    1.1    1.2    0.9
  32–36                              NA    7.6    8.3    7.2    7.5    6.7
  37–41                              NA   83.0   81.7   83.9   81.1   88.2
  42–44                              NA    6.5    6.7    6.3    8.0    3.6
  >44                                NA    1.6    1.7    1.5    2.2    0.6
  Preterm: 20–36                     NA    9.0   10.0    8.4    8.9    7.7
  Post-term: 42–44                   NA    6.6    6.8    6.4    8.2    3.7
Table 2
Columns: 2002 Livebirths, Birth LMP (n = 448 758), %; Birth LMP (n = 105 936), %; XAFP LMP (n = 105 936), %.
Each line below is one table column; values are listed in the table's row order.
Birth LMP (n = 105 936), %: 0.1, 0.6, 7.6, 88.7, 3.7; 6.2, 4.3, 4.4, 6.3, 4.9, 4.2, 4.0, 34.3; 0.1; % (n): 14.1 (47), 21.0 (124), 18.5 (171)
XAFP LMP (n = 105 936), %: 0.02, 0.03, 7.2, 90.5, 2.3; 4.3, 4.3, 4.4, 5.7, 4.7, 4.1, 4.0, 31.4; 0.02; % (n): 1.9 (6), 6.4 (33), 4.7 (39)
a Denominator excludes records with gestational ages <20 and >44 completed weeks.
b Expected frequency of preferred digits is 3.3%.
c Proportion with birthweight ≥2200 g among deliveries 20–27 weeks and ≥2700 g among deliveries 28–31 weeks.
LMP, last menstrual period.
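The second-birthweight-mode indicator described in the table footnote can be written as a simple rule on gestational age and birthweight. The sketch below assumes the thresholds are lower bounds (2200 g or more at 20–27 weeks, 2700 g or more at 28–31 weeks); the direction of the comparison, the function names and the toy records are assumptions made for illustration.

```python
def in_second_mode(weeks, birthweight_g):
    """True when a very preterm record has a birthweight in the upper (second) mode.
    Thresholds follow the table footnote; treating them as lower bounds is an assumption."""
    if 20 <= weeks <= 27:
        return birthweight_g >= 2200
    if 28 <= weeks <= 31:
        return birthweight_g >= 2700
    return False

def second_mode_share(records):
    """Share of records at 20-31 completed weeks that fall in the second mode.
    `records` is an iterable of (completed_weeks, birthweight_g) pairs."""
    early = [(w, g) for w, g in records if 20 <= w <= 31]
    if not early:
        return 0.0
    return sum(in_second_mode(w, g) for w, g in early) / len(early)

# Hypothetical records: (completed weeks, birthweight in grams).
records = [(24, 650), (26, 3100), (30, 1400), (30, 2900), (39, 3400)]
print(second_mode_share(records))
```

Running the same rule once with Birth LMP-based weeks and once with XAFP LMP-based weeks gives the two shares that the table compares.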
[Figure 1. Probability density of birthweight (g) at 20–27 weeks' gestation, by LMP source (XAFP, Birth).]
[Figure 2. Probability density of birthweight (g) at 28–31 weeks' gestation, by LMP source (XAFP, Birth).]
mode all but disappeared within 20–27 weeks' gestation when gestational age was derived from XAFP
LMP (Fig. 1), and was greatly attenuated between 28
and 31 weeks (Fig. 2).
The majority of Birth LMP and XAFP LMP dates are
identical (71.1%), and 65.0% of discrepancies amount
to 1 week in either direction (Table 3). Among discrepant records, XAFP LMP-derived days of gestation
have a stronger association with birthweight than Birth
LMP-derived days of gestation (n = 30 624; R² = 0.27
and R² = 0.01, respectively). Large (>2 weeks) gestational age overestimates are 75% more common than
large underestimates (Table 3; 3.7% vs. 2.1%), and
account for 97.2% of gestational ages >44 weeks and
Table 3. Magnitude of difference between gestational ages calculated from Birth LMP vs. XAFP LMP date, by data quality indicators, California 2002 Linked Birth and Prenatal Expanded Alpha-fetoprotein Screening Program (XAFP) records (n = 105 936)
Each line below is one table column; values are listed in the table's row order.
% Overall (n = 105 936): 0.9, 2.8, 2.5, 8.7, 71.1, 10.1, 1.8, 1.6, 0.5, 2.1, 3.7
% Among preferred digits (n = 36 333): 1.3, 4.6, 3.9, 9.8, 65.7, 9.8, 2.3, 2.1, 0.5, 2.6, 5.9
% Among day 1 (n = 6614): 3.5, 12.8, 9.1, 13.6, 52.1, 4.9, 1.5, 2.2, 0.5, 2.6, 16.2
% Implausible birthweight-for-gestational age (n = 91): 6.6, 0.0, 0.0, 1.1, 7.7, 0.0, 1.1, 1.1, 82.4, 83.5, 6.6
% Among 2nd birthweight mode (n = 171): 0.0, 0.0, 0.0, 0.6, 9.9, 1.2, 1.2, 7.6, 79.5, 87.1, 0.0
Table 4. Distribution of XAFP gestational age within Birth LMP gestational age categories (completed weeks), California 2002 Linked Birth and Prenatal Expanded Alpha-fetoprotein Screening Program (XAFP) records (n = 105 936)

                              XAFP LMP-based gestational age (a)
Birth LMP
gestational age       <20     20–31     32–36      37–41     42–44     >44   Total %    Total N
<20                  [10]         4         2         33         2       0       0.0         51
20–31                   3     [723]        77        114         5       1       0.9        923
32–36                   1        52    [5540]       1523        10       2       6.7       7128
37–41                   3        30      1055   [91 694]       598       7      88.2     93 387
42–44                   0         1        57       2010    [1770]       5       3.6       3843
>44                     0        22        34        503        32    [13]       0.6        604
Total %               0.0       0.8       6.4       90.5       2.3     0.0     100.0
Total N                17       832      6765     95 877      2417      28               105 936
Missing %             0.0       1.1       8.7       87.6       2.6     0.1     100.0
Missing N               2       123       959       9693       284       9                11 070

Preterm false-positive rate (b): 1652/97 724 = 1.7%
Preterm false-negative rate (b): 1143/7535 = 15.2%
Preterm false-positive screen rate (b): 1652/8044 = 20.5%
Post-term false-positive rate (b): 2068/102 876 = 2.0%
Post-term false-positive screen rate (b): 2068/3838 = 53.9%

a Diagonal values, shown in brackets, indicate birth records correctly categorised according to XAFP gestational age categories.
b Calculations exclude Birth and XAFP gestational ages <20 and >44 completed weeks (total n = 105 259). Because post-term births derived from either LMP source may be unreliable, a post-term false-negative rate is not presented.
LMP, last menstrual period.
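The misclassification rates reported with Table 4 can be reproduced from the cross-tabulation once gestational ages <20 and >44 weeks are excluded. The sketch below treats XAFP LMP as the reference and Birth LMP as the test; the variable and function names are illustrative, and the counts are the inner 4 × 4 block of Table 4.

```python
# Rows: Birth LMP category; columns: XAFP LMP category.
# Category order in both directions: 20-31, 32-36, 37-41, 42-44 completed weeks.
counts = [
    [723,   77,   114,    5],
    [52,  5540,  1523,   10],
    [30,  1055, 91694,  598],
    [1,     57,  2010, 1770],
]

PRETERM = (0, 1)   # indices of the 20-31 and 32-36 week categories
POSTTERM = (3,)    # index of the 42-44 week category

def misclassification(counts, positive):
    """Return counts and rates for one outcome, with XAFP LMP as the reference."""
    n = len(counts)
    tp = sum(counts[i][j] for i in positive for j in positive)
    fp = sum(counts[i][j] for i in positive for j in range(n) if j not in positive)
    fn = sum(counts[i][j] for i in range(n) if i not in positive for j in positive)
    total = sum(sum(row) for row in counts)
    reference_negative = total - tp - fn  # records not positive by XAFP LMP
    return {
        "total": total,
        "false_positives": fp,
        "false_negatives": fn,
        "false_positive_rate": fp / reference_negative,
        "false_negative_rate": fn / (tp + fn),
        "false_positive_screen_rate": fp / (tp + fp),
    }

for label, positive in (("preterm", PRETERM), ("post-term", POSTTERM)):
    result = misclassification(counts, positive)
    print(label, {k: round(v, 3) for k, v in result.items()})
```

The false-positive screen rate is the share of Birth LMP positives that are not positive by XAFP LMP, which is why it is much larger than the false-positive rate computed over all reference negatives. The function also returns a post-term false-negative rate, which the paper deliberately does not report.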
[Figure. Stacked percentage bars (0–100%) for all discrepant records (n = 30 624), −14 days difference (n = 2 236) and +14 days difference (n = 3 929). Legend entries include: Day 1 digit preference error (non-clerical); Other digit preference error (non-clerical).]
Table 5. Maternal and infant characteristics by gestational age categories and data quality indicators, California 2002 Linked Birth and Prenatal Expanded Alpha-fetoprotein Screening Program (XAFP) records (n = 105 936)

Columns (all values are %):
1  −14 days
2  +14 days
3  Digit preference error, Birth LMP
4  Clerical error, Birth LMP
5  Preterm rate, Birth LMP
6  Preterm rate, XAFP LMP
7  Preterm false-negative rate (a)
8  Preterm false-positive rate (a)
9  Preterm false-positive screen rate (a)

                                    1      2      3      4      5      6      7      8      9
Overall                           2.1    3.7   10.7    2.7    7.6    7.2   15.2    1.7   20.5
Race/ethnicity
  White                           1.2    2.9    9.6    1.8    6.1    5.9   11.5    1.0   14.8
  African American                2.2    5.4   12.9    2.7   11.3   10.9   13.1    2.1   16.1
  Asian                           1.6    2.8    8.9    2.2    7.0    6.5   13.3    1.3   18.7
  Hispanic, US-born               2.1    4.0   11.6    2.7    8.1    7.8   16.6    1.8   20.0
  Hispanic, foreign-born          3.1    4.2   11.4    3.6    8.1    7.3   18.2    2.3   26.4
  Pacific Islander                1.4    3.1    9.1    2.3    9.9    9.5   11.9    1.7   15.6
  Native American                 1.7    3.9   11.7    1.7    7.6    7.3   19.2    1.8   22.2
Age (years)
  <20                             2.9    3.9   11.3    3.1    9.1    8.9   17.4    1.9   19.4
  20–24                           2.5    4.0   11.6    3.0    7.8    7.1   16.8    2.0   23.5
  25–34                           1.9    3.5   10.3    2.5    7.1    6.6   14.6    1.6   20.5
  >35                             1.8    3.8   10.5    2.7    9.0    8.6   12.3    1.5   15.7
Education (years)
  <12                             3.2    4.6   11.8    3.6    8.6    7.9   19.7    2.4   25.9
  12                              2.2    3.9   11.4    2.7    7.7    7.3   16.0    1.7   20.5
  >12                             1.3    2.9    9.4    2.0    6.9    6.5   10.8    1.2   16.0
Previous livebirths (parity)
  0                               1.9    3.2    9.7    2.4    7.8    7.5   13.7    1.4   17.0
  1                               2.1    3.7   10.8    2.7    6.8    6.1   14.3    1.7   23.4
  >1                              2.5    4.6   12.2    3.1    8.5    8.0   18.3    2.1   23.0
Method of payment for delivery
  Medi-Cal                        3.0    4.4   12.0    3.4    8.5    7.9   18.4    2.2   24.2
  Any private                     1.4    3.1    9.6    2.1    6.9    6.5   12.0    1.2   16.6
  Uninsured                       2.4    4.9    9.6    3.5    8.2    7.6   17.0    2.0   23.2
  Other                           2.0    2.3   12.1    2.7    6.1    5.2    3.4    1.1   17.7

a Excludes Birth and XAFP gestational ages <20 and >44 completed weeks (total n = 105 259, see Table 4 for detail).
LMP, last menstrual period.
Discussion
This is the first study to compare LMP dates from birth
certificates with a large, population-based source of
reliable, prenatally collected LMP data in order to
isolate data reporting errors. Birth LMP was discrepant
with XAFP LMP nearly a third of the time, resulting in
one-fifth of preterm births and half of post-term births
from birth records representing false positives, and
15% of true preterm cases being missed. Agreement
within 1 week was larger in the current study than in a
previous comparison of LMP-based gestational age
from birth records with gestational age from medical
charts among normal-birthweight babies in northern
California (89% and 77–78%, respectively); however,
some chart estimates in that smaller study were
derived from ultrasound.¹⁵
While menstrual dating has inherent flaws for estimating gestational age, the recording of LMP date itself
is prone to errors amenable to improvement. California's centralised XAFP prenatal screening programme
is the largest in the country, serving approximately 70%
of pregnant women in the State. As accurate gestational
age is needed for interpretation of risks for trisomies
and neural tube defects, XAFP data provide a
population-based source of gestational age in California. Until now, only vital records have provided sufficient numbers of very early deliveries to examine the
bimodal distribution of birthweight. The second birthweight mode at early gestations appears to be largely
an issue of clerical and recall error, rather than pathological non-menstrual bleeding misidentified as a
normal menstrual cycle.⁶ XAFP LMP is more accurate
than LMP from birth certificates, as demonstrated by
lower rates of digit preference, out-of-range gestational
ages, implausible birthweight-for-gestational age and
post-term births. Over half of large discrepancies in
Acknowledgements
This paper was partially supported through contract
CQ004942-LOS with the Centers for Disease Control
and Prevention, Atlanta, GA. The authors are indebted
to Joyce A. Martin of the Centers for Disease Control
and Prevention, National Center for Health Statistics,
and Alan Oppenheim of the California Department of
Health Services, Center for Health Statistics for insight
regarding national and State birth certificate data; Bob
Currier and Marie Roberson of the California Department of Health Services, Genetic Disease Branch and
Patricia M. Dietz of the Centers for Disease Control
and Prevention, National Center for Chronic Disease
Prevention and Health Promotion for their thoughtful
comments; Alan Hubbard of University of California,
Berkeley for statistical support; Allen Hom and Steve
Graham of the Sequoia Foundation for data linkage;
and Deborah Hildebrandt and Marissa Root for manuscript assistance.
References
1 Savitz DA, Terry JW Jr, Dole N, Thorp JM Jr, Siega-Riz AM,
Herring AH. Comparison of pregnancy dating by last
menstrual period, ultrasound scanning, and their
combination. American Journal of Obstetrics and Gynecology
2002; 187:1660–1666.
2 Frazier TM. Error in reported date of last menstrual period.
American Journal of Obstetrics and Gynecology 1959;
77:915–918.
3 Waller DK, Spears WD, Gu Y, Cunningham GC. Assessing
number-specific error in the recall of onset of last menstrual
period. Paediatric and Perinatal Epidemiology 2000;
14:263–267.
4 Alexander GR, Himes JH, Kaufman RB, Mor J, Kogan M. A
United States national reference for fetal growth. Obstetrics
and Gynecology 1996; 87:163–168.
5 Kramer MS, McLean FH, Boyd ME, Usher RH. The validity
of gestational age estimation by menstrual dating in term,
preterm, and postterm gestations. JAMA 1988;
260:3306–3308.
6 David RJ. The quality and completeness of birthweight and
gestational age data in computerized birth files. American
Journal of Public Health 1980; 70:964–973.