Trauma Scoring Systems

INTRODUCTION

Characterization of injury severity is crucial to the scientific study of trauma, yet the actual measurement of injury severity began only 50 years ago. In 1969, researchers developed the Abbreviated Injury Scale (AIS) to grade the severity of individual injuries. Since its introduction, the International Injury Scaling Committee (IISC) of the Association for the Advancement of Automotive Medicine (AAAM), the parent organization of the AIS, has modified the AIS, most recently in 2005 (AIS-2005). The AIS is the basis for the Injury Severity Score (ISS), which is the most widely used measure of injury severity in patients with trauma. Attempting to summarize the severity of injury in a patient with multiple trauma with a single number is difficult at best; therefore, multiple alternative scoring systems have been proposed, each with its own problems and limitations. This article reviews the conceptual and statistical background necessary to understand injury severity scoring, presents the most common scoring systems, and addresses new ideas and trends in trauma scoring.

APPLICATIONS OF TRAUMA SEVERITY SCORING

An accurate method for quantitatively summarizing injury severity has many potential applications. The ability to predict outcome from trauma (ie, mortality) is perhaps the most fundamental use of injury severity scoring, a use that arises from the desire of the patient and the family to know the prognosis. More recently, physicians have suggested that injury severity scoring can provide objective information for end-of-life decision-making and resource allocation. Trauma mortality prediction in individual patients by any scoring system is limited and is in general no better than good clinical judgment. Therefore, decisions for individual patients should never be based solely on a statistically derived injury severity score.
However, scoring systems can quantitatively estimate the level of acuity of injured patients, and these estimates can be used to adjust hospital outcome assessments. Field trauma scoring also is used to facilitate rational prehospital triage decisions, thereby minimizing the time from injury occurrence to definitive management. Similarly, physicians suggest that it can enhance appropriate use of helicopters and timely transfer of severely injured patients to trauma wards. Trauma scoring also is used for quality assurance by allowing evaluation of trauma care both within and between trauma centers, a contentious area that is likely only to increase in importance. Perhaps the most important role for injury severity scoring is in trauma care research. Scientific study of the epidemiology of trauma and trauma outcomes would not be possible otherwise. Injury severity scoring is indispensable in stratifying patients into comparable groups for prospective clinical trials. Similarly, it can be used retrospectively to identify and control for differences in baseline injury severity between patient populations.

BASIC STATISTICAL CONCEPTS

Fundamentally, trauma outcome prediction is a multivariate problem. Researchers use multiple independent variables (eg, age, injury severity) to predict the dependent variable (or outcome). Most physicians are familiar with the simplest form of regression analysis, simple linear regression, which describes the linear relationship between 2 variables. Multiple regression is an extension of this technique, in which more than one independent variable is used to describe a single, continuous dependent variable. Multiple regression is advantageous because it allows one to measure the association between a predictor variable and an outcome variable while controlling for other modifying factors. Researchers use multiple regression, therefore, to control for the effects of many variables and assess the independent effect of a single variable.

In trauma severity scoring, mortality is the outcome that has elicited the most interest. Mortality is a dichotomous variable having only 2 possible values, death or survival. Although several methods are available, multiple logistic regression is the most popular approach when the outcome of interest is dichotomous because of its unique advantages.

The odds of an event is the ratio of the probability that the event will occur to the probability that it will not occur. As an example, suppose that out of 100 patients aged 65-74 years who sustain blunt abdominal trauma, 20 will die, while out of 50 patients aged 75-84 years, 15 will die. The probability of death in the age group of 65-74 years is 20/100 = 0.2 (20%) and in the age group of 75-84 years is 15/50 = 0.3 (30%). Therefore, the odds of a patient aged 65-74 years dying after blunt abdominal trauma are 0.2/(1 - 0.2) = 0.25. In contrast, the odds of a patient aged 75-84 years dying after blunt abdominal trauma are 0.3/(1 - 0.3), or approximately 0.43.
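The arithmetic of this example, together with the ratio of the two odds that is discussed next, can be sketched in a few lines of Python. This is only an illustration: it takes 20 deaths among 100 patients aged 65-74 years and 15 deaths among 50 patients aged 75-84 years, and the function and variable names are hypothetical.

```python
def odds(probability):
    """Convert a probability p (0 <= p < 1) into odds, p / (1 - p)."""
    if not 0 <= probability < 1:
        raise ValueError("probability must be in the range [0, 1)")
    return probability / (1 - probability)


# Probabilities of death in the two age groups of the example.
p_65_74 = 20 / 100   # 0.2
p_75_84 = 15 / 50    # 0.3

odds_65_74 = odds(p_65_74)            # 0.2 / 0.8 = 0.25
odds_75_84 = odds(p_75_84)            # 0.3 / 0.7, about 0.43

# Odds ratio with the 65-74 group as the reference group.
odds_ratio = odds_75_84 / odds_65_74  # about 1.71

print(f"odds 65-74: {odds_65_74:.2f}")
print(f"odds 75-84: {odds_75_84:.2f}")
print(f"odds ratio: {odds_ratio:.2f}")
```

Note that the ratio is taken between odds, not between probabilities; the ratio of the two probabilities (0.3/0.2 = 1.5) is a different quantity.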
If the probability of an event occurring is equal to the probability that it will not occur, then the odds that the event will occur are 0.5/(1 - 0.5) = 1 (eg, the odds of heads appearing after tossing a coin are 1). A ratio of two odds is called an odds ratio. An odds ratio greater than 1.0 indicates an increased risk for the outcome, whereas an odds ratio of less than 1.0 indicates a decreased risk or protective effect for the outcome. When several groups are compared, one group is typically classified as the reference group and comparisons are made to this group. For example, the age group of 65-74 years could be used as the reference group because it is expected that increasing age would be associated with a greater possibility of dying after blunt abdominal trauma. To compare the age group of 65-74 years with an older age group, odds ratios are calculated. The odds of the age group of 75-84 years (0.3/(1 - 0.3), or approximately 0.43) divided by the odds of the age group of 65-74 years (0.25) produces an odds ratio of approximately 1.7. Because this odds ratio is greater than 1.0, it suggests that the odds of dying after blunt abdominal trauma are approximately 1.7 times higher in patients aged 75-84 years than in patients aged 65-74 years.

Logistic regression analysis is a statistical tool that uses these types of analyses to explore the relationship between multiple variables and outcomes. Mathematical models can then be constructed based on identification of clinical parameters that predict outcome. Most clinical scoring systems are based on these types of mathematical modeling. Any clinical variable that has been given a particular score will affect the determination of outcome (almost always either alive or dead) based on the influence of that clinical variable itself on mortality and the influence of other clinical variables. Logistic regression is mathematically convenient in that one can easily convert the coefficients of the equation into estimates of the risk of developing a disease or outcome given the presence of a particular risk factor. Researchers adjust these risk estimates for the effects of other risk factors or covariates included in the logistic regression equation.

Outcome prediction never will be perfect, in part because injury severity is difficult to quantify. Perhaps more important is that the patient's response to injury is complex and difficult to model adequately; therefore, multiple scoring systems have emerged. Practitioners should be able to assess the predictive performance of each system in order to compare them. Measures of predictive performance include explanatory power, discrimination, and calibration. Explanatory power is the proportion of the variation in outcome that is explained by the model rather than by random variation. This is reflected by the coefficient of determination (r²).

Discrimination is the ability of the model to separate the patients into 2 groups; for example, those who survive and those who die. This involves sensitivity, specificity, and accuracy, which are concepts well understood by most physicians. However, when applied to predictive models, these concepts can be problematic. A trauma survival predictive model yields a probability of survival, while in reality, patients can only live or die. Therefore, a prediction rule must be established; typically, researchers assign a cutoff point of 0.5. Patients with a probability of survival greater than 0.5, therefore, are predicted to live, while those with a probability of survival less than or equal to 0.5 are predicted to die. The problem is that sensitivity, specificity, and accuracy all vary depending on the prediction rule chosen.
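The dependence of these measures on the prediction rule can be illustrated with a short Python sketch. The predicted survival probabilities and observed outcomes below are invented purely for illustration, the function name is hypothetical, and survival is treated as the "positive" result.

```python
def classify_at_cutoff(survival_probs, survived, cutoff=0.5):
    """Return (sensitivity, specificity, accuracy) for a survival prediction rule.

    A patient is predicted to survive when the predicted probability of
    survival is greater than the cutoff. Survival is the "positive" class.
    """
    tp = fp = tn = fn = 0
    for prob, alive in zip(survival_probs, survived):
        predicted_alive = prob > cutoff
        if predicted_alive and alive:
            tp += 1
        elif predicted_alive and not alive:
            fp += 1
        elif not predicted_alive and alive:
            fn += 1
        else:
            tn += 1
    sensitivity = tp / (tp + fn) if (tp + fn) else float("nan")
    specificity = tn / (tn + fp) if (tn + fp) else float("nan")
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    return sensitivity, specificity, accuracy


# Hypothetical model output for 10 patients (not real data).
probs = [0.95, 0.90, 0.80, 0.70, 0.60, 0.55, 0.40, 0.30, 0.20, 0.10]
outcome = [True, True, True, True, False, True, True, False, False, False]

for cutoff in (0.35, 0.50, 0.65):
    sens, spec, acc = classify_at_cutoff(probs, outcome, cutoff)
    print(f"cutoff {cutoff:.2f}: sensitivity {sens:.2f}, "
          f"specificity {spec:.2f}, accuracy {acc:.2f}")
```

With the same model output, lowering the cutoff raises sensitivity at the expense of specificity, and raising it does the opposite, which is why a single cutoff gives an incomplete picture of discrimination.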
Receiver operating characteristic (ROC) curve analysis can help evaluate the accuracy and discrimination of a predictive model over a wide range of cutoff points. The ROC curve is constructed by plotting the sensitivity on the y-axis and (1 - specificity) on the x-axis at different cutoff points. The area under the ROC curve measures the accuracy of the model. A straight line arising from the origin at a 45-degree angle has an area under the curve of 0.5 and represents accuracy no better than flipping a coin. A perfect predictive model has an area under the curve of 1.0. As accuracy and discrimination improve, the ROC curve moves upward and to the left. ROC curves allow one to compare different predictive models used in the same population of patients.

Calibration is the ability of the model to correctly predict outcome over the entire range of risk. Calibration can be assessed graphically by plotting the actual outcome against the predicted outcome. Calibration is assessed statistically by goodness-of-fit testing, most commonly the Hosmer-Lemeshow test. This test involves grouping patients into risk categories and using a modified chi-square analysis to compare the observed and predicted outcomes in each group. The hypothesis tested is that the model's predictions are the same as the actual outcome; therefore, higher P values are desired and reflect a good fit.

PHYSIOLOGIC SCORES

The Revised Trauma Score (RTS) is one of the more common physiologic scores. It uses 3 specific physiologic parameters, as follows: (1) Glasgow Coma Scale (GCS), (2) systolic blood pressure (SBP), and (3) respiratory rate (RR).

The magnitude of physiologic derangement in each parameter is scored from 0-4. The RTS has 2 forms depending on its use. When used for field triage, the RTS is determined by adding each of the coded values together. Thus, the RTS ranges from 0-12 and is easily calculated. See Table 1.

Table 1. Revised Trauma Score

Coded Value   GCS     SBP (mm Hg)   RR (breaths/min)
0             3       0             0
1             4-5     <50           <5
2             6-8     50-75         5-9
3             9-12    76-90         >30
4             13-15   >90           10-30
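The coding in Table 1 can be sketched in Python as follows. This is a minimal illustration rather than a validated clinical tool: the function names are hypothetical, the boundaries follow Table 1 exactly as printed (so an SBP of exactly 90 mm Hg is coded 3 and an RR of exactly 30 breaths/min is coded 4), and the weights in the second form are those of the coded RTS equation given in the text that follows.

```python
def code_gcs(gcs):
    """Coded value (0-4) for the Glasgow Coma Scale, per Table 1."""
    if not 3 <= gcs <= 15:
        raise ValueError("GCS must be between 3 and 15")
    if gcs >= 13:
        return 4
    if gcs >= 9:
        return 3
    if gcs >= 6:
        return 2
    if gcs >= 4:
        return 1
    return 0  # GCS of 3


def code_sbp(sbp):
    """Coded value (0-4) for systolic blood pressure in mm Hg, per Table 1."""
    if sbp < 0:
        raise ValueError("SBP cannot be negative")
    if sbp > 90:
        return 4
    if sbp >= 76:
        return 3
    if sbp >= 50:
        return 2
    if sbp > 0:
        return 1
    return 0


def code_rr(rr):
    """Coded value (0-4) for respiratory rate in breaths/min, per Table 1."""
    if rr < 0:
        raise ValueError("RR cannot be negative")
    if rr > 30:
        return 3  # tachypnea scores lower than the normal range
    if rr >= 10:
        return 4
    if rr >= 5:
        return 2
    if rr > 0:
        return 1
    return 0


def triage_rts(gcs, sbp, rr):
    """Field triage RTS: simple sum of the 3 coded values (0-12)."""
    return code_gcs(gcs) + code_sbp(sbp) + code_rr(rr)


def weighted_rts(gcs, sbp, rr):
    """Coded (weighted) RTS using the coefficients quoted in the text."""
    return (0.9368 * code_gcs(gcs)
            + 0.7326 * code_sbp(sbp)
            + 0.2908 * code_rr(rr))


# Hypothetical patient: GCS 10, SBP 85 mm Hg, RR 35 breaths/min.
print(triage_rts(10, 85, 35))
print(round(weighted_rts(10, 85, 35), 4))
```

Keeping the three coding functions separate mirrors the table and makes it easy to reuse the same coded values for both the triage sum and the weighted form.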

An RTS of less than 11 is used to indicate the need for transport to a designated trauma center. The coded (weighted) form of the RTS is used more frequently for quality assurance and outcome prediction. The coded RTS is calculated as follows, in which SBPc, RRc, and GCSc represent the coded values of each variable:

RTSc = 0.9368 GCSc + 0.7326 SBPc + 0.2908 RRc

This value is more complicated to compute, which limits its usefulness in the field. The main advantage of the coded RTS is that the weighting of the individual components emphasizes the significant impact of traumatic brain injury on outcome.

The RTS has several limitations that affect its usefulness, and most of these limitations are related to the GCS. As originally described, the GCS was intended to measure the functional status of the central nervous system. Because of the importance of head injury in determining trauma outcome, the GCS also is used by many as a component of trauma severity scoring. Problems inherent to the GCS (and RTS) include the inability to accurately score patients who are intubated and mechanically ventilated (which often happens before a triage decision is made). Moreover, patients who are pharmacologically paralyzed or who are under the influence of alcohol or illicit drugs also are difficult to score. Alternative approaches in this setting include using the best motor response and the eye-opening response to calculate or predict the verbal response. Research has shown that substitution of the best motor response for the GCS results in no loss of predictive capability. More recently, researchers have shown that the best motor response predicts trauma mortality as well as or better than other trauma severity scores.

Acute Physiology and Chronic Health Evaluation (APACHE)

The Acute Physiology and Chronic Health Evaluation (APACHE) was introduced in 1981. APACHE characterizes trauma patients inadequately, although different versions of this scoring system are used widely for the assessment of illness severity in surgical intensive care units. This system has 2 components, as follows: (1) the chronic health evaluation, which incorporates the influence of comorbid conditions (eg, diabetes mellitus, cirrhosis, chronic renal failure, heart disease, malignancy), and (2) the Acute Physiology Score (APS). The APS consists of weighted variables representing the major physiologic systems, including neurologic, cardiovascular, respiratory, renal, gastrointestinal, metabolic, and hematologic variables. In 1985, the APACHE system was revised (ie, APACHE II) by reducing the number of APS variables from 34 to 12, restricting the comorbid conditions, and deriving coefficients for specific diseases. A representative calculation for a hypothetical patient is shown in Table 2.

Table 2. Representative APACHE II Calculation for Hypothetical Patient

Parameter                        Representative Measure   APACHE II Value
Glasgow Coma Scale               13                       2
Age                              56                       3
Mean arterial pressure (mm Hg)   57                       2
PaO2 (FIO2 <0.5)                 60                       3
[K+] (mmol/L)                    4.0                      0
WBC (x 1,000/mm3)                20                       2
Heart rate (beats/min)           140                      3
Respiratory rate (breaths/min)   35                       3
pH (arterial)                    not given                not given
[Creatinine] (mg/dL)             1.7                      4
Core temperature (C)             39.2                     3
[Na+] (mmol/L)                   148                      0
Hematocrit (%)                   28                       2
Total                                                     30

In this example, the chronic health assessment is chronic obstructive pulmonary disease (score=5). The APACHE II total score is therefore 35 (30 + 5); the predicted death rate is 83.1%. Therefore, approximately 8 out of 10 patients with this score will not survive.

APACHE II is the most widely applied APACHE system; however, it has several potential limitations. The computation of APACHE II scores requires large amounts of data to be reviewed and analyzed. However, it is possible to process this information accurately, portably, and reproducibly at the bedside with a handheld personal digital assistant (PDA) with appropriate software. APACHE II calculators can be found online. The GCS, which forms a powerful predictive component of the APS, was not intended to reflect extracranial injuries. Because patients with trauma are a relatively young population, comorbidity is unusual in these patients. The potential also exists for lead-time bias. By using only ICU data and not accounting for prior treatment, APACHE II underestimates mortality in patients who are transferred to the ICU after relative stabilization, and patients with trauma frequently are resuscitated in the emergency department or operating room prior to admission to the ICU. Patients with trauma made up only 8% of the population used to develop APACHE II, with only a 9% case-fatality rate. Moreover, 85% of trauma fatalities were related to traumatic brain injury. In 1992, researchers showed that APACHE II is inferior to the Trauma and Injury Severity Score (TRISS) in predicting mortality in injured patients. Poor performance was related largely to the absence of an anatomic component in the APACHE system. The most recent version, APACHE III, was published in 1991 and was designed to address many of these issues.
The most important modifications were the inclusion of 17 variables; the limitation of comorbid conditions to those affecting immune function; the use of disease-specific equations, including one for multiple trauma; the distinction between head and nonhead trauma; and an accounting for potential lead-time bias. Practitioners do not widely accept APACHE III, partially because it is proprietary and expensive. In addition, its accuracy needs to be convincingly validated in patients with trauma.

Sequential Organ Failure Assessment Score

The sequential organ failure assessment (SOFA) score is used to determine the extent of organ function or the rate of organ failure in critically ill patients. Regular, repeated scoring enables patient condition and disease development to be monitored.

The score is based on 6 different parameters, as follows: respiratory system (PaO2/FiO2, mm Hg), cardiovascular system (blood pressure/vasopressors), hepatic system (bilirubin, mg/dL), coagulation system (platelets, x 10^3/mm3), renal system (creatinine, mg/dL), and neurological system (Glasgow Coma Scale).

Systemic Inflammatory Response Syndrome Score

The systemic inflammatory response syndrome (SIRS) is a generalized response to nonspecific insults, including infections, pancreatitis, trauma, and burns. To calculate a SIRS score, each of the following components that is present is assigned 1 point: fever or hypothermia (temperature, >38C or <36C), tachypnea (respiratory rate, >20 breaths/min or PaCO2 <32 mm Hg), tachycardia (heart rate, >90 beats/min), and leukocytosis or leukopenia (WBC count, >12,000/mm3 or <4,000/mm3, or presence of 10% bands). Thus, a SIRS score can range from 0-4.

ANATOMIC SCORES

Numerous scores are based on the characterization of injuries anatomically, as outlined below.

Abbreviated Injury Scale (AIS)
Injury Severity Score (ISS)
New Injury Severity Score (NISS)
Anatomic Profile (AP)
Penetrating Abdominal Trauma Index (PATI)
ICD-based Injury Severity Score (ICISS)

Injury Severity Score

The AIS is a simple numerical method for grading and comparing injuries by severity. Although originally intended for use with vehicular injuries, its scope has been expanded to include other injuries. The AIS is a consensus-derived, anatomically based system of grading injuries on an ordinal scale ranging from 1 (minor injury) to 6 (lethal injury). Scales for all anatomic regions and organs can be found at the American Association for the Surgery of Trauma Web site. AIS Manuals and CDs (2005) are available from the AAAM list of publications. The AIS does not reflect the combined effects of multiple injuries; however, it forms the foundation for the ISS.
Baker et al introduced the ISS in 1974 as a means of summarizing multiple injuries in a single patient.1 The ISS is defined as the sum of the squares of the highest AIS grade in each of the 3 most severely injured body regions. Six body regions are defined, as follows: the thorax, abdomen and visceral pelvis, head and neck, face, bony pelvis and extremities, and external structures. Only one injury per body region is allowed. The ISS ranges from 1-75, and an ISS of 75 is assigned to anyone with an AIS of 6. An example of an ISS calculation is shown in Table 3.

Table 3. ISS Calculation

Region      Injury                           AIS   AIS²
Head/Neck   Single cerebral contusion        3     9
Face        No injury                        0
Chest       Flail chest                      4     16
Abdomen     1. Liver laceration              4
            2. Completely shattered spleen   5     25
Extremity   Fractured femur                  3
External    No injury                        0

Injury Severity Score (ISS) = 9 + 16 + 25 = 50

The ISS has several limitations. The most obvious limitation is its inability to account for multiple injuries to the same body region. Similarly, it limits the total number of contributing injuries to only 3. This seriously impairs the usefulness of the ISS in penetrating injuries, in which multiple injuries are common. The ISS weights injuries to each body region equally, ignoring the importance of head injuries in mortality from trauma. Furthermore, mortality is not strictly an increasing function of the ISS. The mortality rate for an ISS of 16, for example, is higher than the mortality rate for an ISS of 17 because of the different combinations of AIS scores that make up each. Another idiosyncrasy of the ISS is that many ISS values cannot occur, while other ISS values can result from multiple different combinations of AIS scores. This makes the ISS a heterogeneous score and reduces its predictive ability.

Although the classic use of the ISS is to predict mortality from trauma, the ISS also has been noted to be a consistent predictor of postinjury multiple-organ failure (MOF). In developing predictive models for MOF, researchers categorized risk factors as related to tissue injury severity, cellular shock severity, the magnitude of the systemic inflammatory response to the injury, and host factors (eg, age, sex, comorbidity). Tissue injury severity is a major component of these predictive models, and it is readily quantifiable using the ISS. Recognizing the limitations of the ISS, researchers subsequently investigated the Anatomic Profile (AP) as an alternative measure of tissue injury severity, observing that the AP offered no advantage over the ISS in predicting postinjury MOF. Moreover, they found the AP difficult to calculate, with greater interrater variability compared to the ISS.
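The ISS definition and the worked example in Table 3 can be sketched in Python as follows. For contrast, the sketch also computes the new ISS (NISS) discussed next, which takes the 3 most severe injuries regardless of body region. The function names and region labels are hypothetical, and assigning 75 to the NISS when any injury has an AIS of 6 is an assumption carried over from the ISS convention.

```python
def iss(injuries):
    """Injury Severity Score from a list of (body_region, ais) pairs.

    Only the highest AIS grade in each body region counts, and only the
    3 most severely injured regions contribute. Any AIS of 6 gives 75.
    """
    if any(ais == 6 for _, ais in injuries):
        return 75
    worst_per_region = {}
    for region, ais in injuries:
        worst_per_region[region] = max(worst_per_region.get(region, 0), ais)
    top_three = sorted(worst_per_region.values(), reverse=True)[:3]
    return sum(ais ** 2 for ais in top_three)


def niss(injuries):
    """New ISS: the 3 most severe injuries regardless of body region.

    The AIS 6 -> 75 rule is assumed to apply as it does for the ISS.
    """
    if any(ais == 6 for _, ais in injuries):
        return 75
    top_three = sorted((ais for _, ais in injuries), reverse=True)[:3]
    return sum(ais ** 2 for ais in top_three)


# The patient from Table 3 (uninjured regions omitted).
table3_patient = [
    ("head_neck", 3),   # single cerebral contusion
    ("chest", 4),       # flail chest
    ("abdomen", 4),     # liver laceration
    ("abdomen", 5),     # completely shattered spleen
    ("extremity", 3),   # fractured femur
]

print(iss(table3_patient))   # 25 + 16 + 9 = 50
print(niss(table3_patient))  # 25 + 16 + 16 = 57
```

The liver laceration does not contribute to the ISS because the spleen injury is the more severe injury in the same region, which is the limitation described above; under the NISS it is counted.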
Recently, Osler et al reported a modified ISS (new ISS or NISS) based on the 3 most severe injuries regardless of body region.21 This simple but significant modification of the ISS avoids many of its previously acknowledged limitations. By preserving the AIS as the framework for injury severity scoring, the NISS remains familiar and user-friendly. Preliminary studies suggest that the NISS is a more accurate predictor of trauma mortality than the ISS, particularly in penetrating trauma. Other researchers demonstrated that the NISS is superior to the ISS as a measure of tissue injury in predictive models of postinjury MOF. Osler et al recommend that the NISS replace the ISS as the standard anatomic measure of injury severity.

Anatomic Profile

The AP was developed in response to the limitations of the ISS. Unlike the ISS, the AP includes all serious injuries in a body region. Moreover, the AP appropriately weights head and torso injuries more heavily than other body regions. This index summarizes all serious injuries (AIS of 3 or greater) into 3 categories. Category A includes the head and spinal cord. Category B encompasses the thorax and anterior neck. Category C includes all remaining serious injuries. A fourth category, category D, summarizes all nonserious injuries. Practitioners calculate each component as the square root of the sum of squares of the AIS scores of all serious injuries within each region. A region with no injury receives a score of zero. Using logistic regression, these AP component values are used to calculate a probability of survival. The AP performs better than the ISS in discriminating survivors from nonsurvivors and may provide a more rational basis for comparing injury severity between patients. However, the AP failed to garner much interest or support, probably because of its computational complexity and only modest improvement in predictive performance.

Penetrating Abdominal Trauma Index (PATI)

This score is used to calculate the risk of complications in patients undergoing celiotomy for penetrating abdominal trauma. Fourteen organs are examined and assigned a risk factor from 1-5 (eg, pancreas=5, spleen=3, bladder=1).
Injuries to any organ are graded by severity from 1 for minimal injury (eg, tangential wound to the pancreas) to 5 for maximal injury (eg, pancreatic proximal duct disruption). The severity grade is multiplied by the risk factor; the final penetrating score is obtained by summing the individual organ scores. A PATI of greater than 25 is associated with a complication rate of approximately 50%. This score can be used to compare complication rates between different institutions.

International Classification of Diseases (ICD-9) Injury Severity Score (ICISS)

Another, more recent approach to anatomic injury scoring is based on the International Classification of Diseases, Ninth Edition (ICD-9) codes. This method is termed the ICD-9 Injury Severity Score (ICISS) and uses survival risk ratios (SRRs) calculated for each ICD-9 discharge diagnosis. SRRs are derived by dividing the number of survivors in each ICD-9 code by the total number of patients with the same ICD-9 code. ICISS is calculated as the simple product of the SRRs for each of the patient's injuries. ICISS has some advantages over the ISS. First, it represents a true continuous variable that takes on values between 0 and 1. Second, it includes all injuries. Third, ICD-9 codes are readily available and do not require special training or expertise to determine. Finally, initial observations suggest that ICISS has better predictive power when compared to the ISS. Moreover, ICISS has the potential to better account for the effects of comorbidity on outcome by including the SRR for each comorbidity present. Recent observations have suggested that the ICISS also outperforms the ISS in predicting other outcomes of interest (eg, hospital length of stay, hospital charges). Despite the apparent advantage of the ICISS, however, it has not yet replaced other methods of outcome analysis. In addition, further validation is needed before it can be used widely.

COMBINED SCORES

The predictive capability of any model usually is improved with the inclusion of additional relevant information. Champion and colleagues exemplified this concept with the development of the TRISS.6 This test combines anatomic and physiologic measures of injury severity (ISS and RTS, respectively) and patient age in order to predict survival from trauma. Recognizing the difference between blunt and penetrating injury, researchers developed separate models for each mechanism. The logistic regression equation predicts the probability of survival (P) as follows:

P = 1/(1 + e^(-b)), where b = b0 + b1(RTSc) + b2(ISS) + b3(age)

RTSc is the coded version of the RTS, and patient age is categorized such that age is equal to zero if the patient is younger than 55 years and age is equal to one otherwise. The coefficients b0 through b3 differ for blunt and penetrating trauma.

TRISS quickly became the standard methodology for outcome assessment. It appears to be valid for adult and pediatric patients but has been criticized because (1) it is only moderately accurate for predicting survival; (2) it inherits the problems already noted with the ISS (eg, inhomogeneity, inability to account for multiple injuries to the same body region); (3) it incorporates no information related to preexisting conditions (eg, cardiac disease, chronic obstructive pulmonary disease, cirrhosis); (4) similar to the RTS, it cannot include intubated patients because respiratory rate and verbal responses are not obtainable; and (5) it does not account for patient mix (making comparisons between trauma centers difficult).
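The ICISS product and the TRISS logistic model described above can be sketched in Python as follows. This is a sketch under stated assumptions, not a validated implementation: the function names are hypothetical, the SRR values are invented for illustration, and the coefficients are passed in as a parameter because the text does not list them. The example coefficient tuple is a placeholder, not a published blunt or penetrating trauma coefficient set.

```python
import math


def survival_risk_ratio(survivors, total_patients):
    """SRR for one ICD-9 code: survivors divided by all patients with that code."""
    if total_patients <= 0:
        raise ValueError("total_patients must be positive")
    return survivors / total_patients


def iciss(srrs):
    """ICISS: the simple product of the SRRs of all of a patient's injuries."""
    product = 1.0
    for srr in srrs:
        product *= srr
    return product


def triss_survival_probability(rts_coded, iss, age_years, coefficients):
    """TRISS probability of survival from a logistic model.

    coefficients is (b0, b1, b2, b3) for the relevant mechanism (blunt or
    penetrating); the caller must supply them. Age is coded 0 if the patient
    is younger than 55 years and 1 otherwise, as described in the text.
    """
    b0, b1, b2, b3 = coefficients
    age_index = 0 if age_years < 55 else 1
    b = b0 + b1 * rts_coded + b2 * iss + b3 * age_index
    return 1.0 / (1.0 + math.exp(-b))


# Hypothetical SRRs for a patient with 3 coded injuries (not real data).
example_srrs = [survival_risk_ratio(90, 100), 0.80, 0.95]
print(round(iciss(example_srrs), 3))

# Placeholder coefficients for illustration only (not published values).
PLACEHOLDER_COEFFICIENTS = (-0.45, 0.81, -0.08, -1.74)
p_young = triss_survival_probability(7.8408, 50, 40, PLACEHOLDER_COEFFICIENTS)
p_old = triss_survival_probability(7.8408, 50, 70, PLACEHOLDER_COEFFICIENTS)
print(round(p_young, 3), round(p_old, 3))
```

Passing the coefficients explicitly keeps the blunt and penetrating models separate and avoids hard-coding values that the text does not provide.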
A Severity Characterization of Trauma

In an attempt to address these shortcomings, Champion et al introduced A Severity Characterization of Trauma (ASCOT) in 1990 as an improvement over TRISS.6 ASCOT uses the AP in place of the ISS and categorizes age into deciles. In addition, the individual components of the coded RTS are included as independent predictors in the final logistic regression model. Despite these modifications, the predictive performance of ASCOT is only marginally better than that of TRISS. This, coupled with the complex nature of the AP component, has discouraged widespread acceptance of ASCOT.

ICISS also has been combined with age and the RTS in a manner similar to TRISS analysis. This model has superior predictive power and is better calibrated than TRISS. Moreover, this ICISS-based model is a superior predictor of resource utilization in injured patients.

CONCLUSION

Despite its imperfections, trauma severity scoring remains important for many reasons. ICISS may reflect a significant improvement in methodology, but this requires further validation. Scoring systems applied in intensive care units are not useful for predicting survival for the individual patient. Many models are used for audit purposes, and some are used as performance measures and quality indicators of a unit; however, both uses are controversial because these systems adjust poorly for case mix. Moreover, existing severity scores are being used for purposes for which they were not intended (eg, decisions to withdraw support or to allocate resources). Continued research hopefully will improve methodology and make accurate trauma prediction, particularly on an individual patient basis, a reality.