A retrospective cohort study was performed after Institutional Review Board approval and the granting of a waiver of informed consent. Adult patients who underwent OLT between July 1996 and May 2008 were identified from the institutional liver transplant databases. For each identified patient, the institutional APACHE III database was searched using the software provided by Cerner Corporation (Kansas City, Missouri), and APACHE III data were abstracted. Pediatric patients and patients who did not authorize review of their medical records for research were excluded. Patients who died in the operating room or soon after arrival in the ICU were also excluded, because only patients who spend more than four hours in the ICU generate APACHE data. Data were collected only for the day of ICU admission after liver transplantation; data from second or subsequent ICU admissions were not used.
The entry of all laboratory values used for APACHE III scoring was computerized using software that interfaces with the laboratory. Vital signs, Glasgow Coma Scale scores and urine output were abstracted by the bedside nurses according to a formalized protocol and entered into the computer by trained specialists. The nurses received training, an instruction manual and initial supervision. The collected data were audited for missing and discrepant admission, physiologic and outcome values by experienced clinical ICU nurses. The audit was considered passed when the following criteria were met: at least 90% agreement on admission variables overall; 100% agreement on admission and discharge dates; at least 80% agreement on admission diagnosis, admission and discharge times, chronic health items, readmission status, surgical status and active treatment status; and at least 85% agreement on overall physiology variables. Use of the APACHE III database at our institution has been previously described.
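As an illustration only, the audit criteria listed above can be expressed as a table of minimum agreement rates. The sketch below is not the audit software used at the institution; the category names and the `audit_failures` helper are hypothetical labels chosen for this example, and agreement is assumed to be supplied as a fraction between 0 and 1.

```python
# Minimum agreement rates taken from the audit criteria in the text.
# The dictionary keys are hypothetical labels, not names from the APACHE III software.
AUDIT_THRESHOLDS = {
    "admission_variables_overall": 0.90,
    "admission_and_discharge_dates": 1.00,
    "admission_diagnosis": 0.80,
    "admission_and_discharge_times": 0.80,
    "chronic_health_items": 0.80,
    "readmission_status": 0.80,
    "surgical_status": 0.80,
    "active_treatment_status": 0.80,
    "physiology_variables_overall": 0.85,
}


def audit_failures(agreement):
    """Return the categories whose agreement is below the required minimum.

    A category missing from `agreement` is treated as 0% agreement,
    so it is reported as a failure (an assumption made for this sketch).
    """
    return sorted(
        name
        for name, minimum in AUDIT_THRESHOLDS.items()
        if agreement.get(name, 0.0) < minimum
    )


if __name__ == "__main__":
    # Toy audit result: everything perfect except the physiology variables.
    observed = {name: 1.0 for name in AUDIT_THRESHOLDS}
    observed["physiology_variables_overall"] = 0.84
    print(audit_failures(observed))
```

An empty list from `audit_failures` would correspond to a passed audit under these thresholds.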
The data abstracted included age, gender, acute physiology score (APS), APACHE III score (APIII), APACHE III-predicted ICU and hospital mortality, and predicted length of ICU and hospital stay. The APS and APACHE III scores for each patient were calculated as described by Knaus and colleagues. Additionally, patients' actual lengths of ICU and hospital stay, ICU and hospital discharge status (survivor versus non-survivor), and discharge location were recorded.
In addition to variables abstracted from the APACHE database, the institutional anesthesia and liver transplant databases were searched and data regarding duration of anesthesia and surgery and intraoperative administration of packed red blood cells (PRBCs) were recorded for each patient. For patients transplanted after February 2002 (when the MELD score was adopted by the United Network for Organ Sharing as the basis for allocation of donor organs), the MELD score at the time of transplantation was abstracted.
Descriptive data are summarized as mean (standard deviation, SD), median (interquartile range, IQR) or percentage. Chi-square analysis was used to compare categorical variables, and Student's t test and rank sum tests were used to compare continuous variables. Patients with missing values for a variable were excluded from the analyses involving that variable. Statistical tests were two-tailed, and P < 0.05 was considered statistically significant.
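To make the categorical comparison concrete, the following is a minimal sketch of a Pearson chi-square test on a 2 x 2 table using only the Python standard library. It is not the SPSS or MedCalc procedure used in the study; the function name and the toy counts are assumptions for illustration, and no continuity correction is applied. For one degree of freedom, the upper-tail P-value of the chi-square distribution equals erfc(sqrt(x / 2)).

```python
import math


def chi_square_2x2(table):
    """Pearson chi-square test for a 2 x 2 table of counts.

    `table` is [[a, b], [c, d]]. Returns (statistic, p_value) with
    1 degree of freedom and no continuity correction.
    """
    row_totals = [sum(row) for row in table]
    col_totals = [table[0][j] + table[1][j] for j in range(2)]
    n = sum(row_totals)

    statistic = 0.0
    for i in range(2):
        for j in range(2):
            expected = row_totals[i] * col_totals[j] / n
            statistic += (table[i][j] - expected) ** 2 / expected

    # Upper-tail probability of a chi-square variable with 1 df.
    p_value = math.erfc(math.sqrt(statistic / 2.0))
    return statistic, p_value


if __name__ == "__main__":
    # Hypothetical counts: rows are two patient groups,
    # columns are non-survivors and survivors.
    stat, p = chi_square_2x2([[10, 20], [20, 10]])
    print(f"chi-square = {stat:.3f}, P = {p:.4f}, significant = {p < 0.05}")
```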
Standardized mortality ratios (SMRs) were calculated by dividing the observed mortality rates by the rates predicted by APACHE III. The 95% confidence intervals (CI) were calculated for each of the standardized mortality ratios. Discrimination of a prognostic model is the ability of the model to distinguish between survivors and non-survivors. The discrimination of the APACHE III-predicted mortality for the prediction of in-hospital mortality was analyzed by calculating the area under the receiver operating characteristic curve (AUC). An AUC greater than 0.9 was considered outstanding, greater than 0.8 to 0.9 excellent, 0.7 to 0.8 acceptable, and less than 0.7 poor. Calibration of a model is the degree of agreement between predicted mortality and actual mortality. The Hosmer-Lemeshow C statistic was used to determine the calibration of the model. A model with good calibration should have a Hosmer-Lemeshow statistic close to the degrees of freedom, which is equal to the number of categories minus 2, and a P-value > 0.05. The Brier score was used as an overall assessment of the model's performance, with a lower score indicating better performance.
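The performance measures described above can be sketched in a few lines of standard-library Python. This is an illustrative sketch, not the SPSS or MedCalc computation used in the study. All function names are hypothetical, outcomes are assumed to be coded 1 for non-survivor and 0 for survivor, and the confidence interval uses a log-normal approximation (standard error of ln SMR taken as 1/sqrt(observed deaths)), which is an assumption because the text does not state the interval method. The Hosmer-Lemeshow sketch uses equal-sized risk-ordered groups and returns the statistic with its degrees of freedom, without a P-value.

```python
import math


def smr_with_ci(outcomes, predicted, z=1.96):
    """SMR = observed deaths / deaths predicted by the model, with an approximate CI."""
    observed = sum(outcomes)
    expected = sum(predicted)
    smr = observed / expected
    # Assumption: log-normal approximation, SE(ln SMR) ~ 1 / sqrt(observed deaths).
    se = 1.0 / math.sqrt(observed)
    return smr, smr * math.exp(-z * se), smr * math.exp(z * se)


def auc(outcomes, predicted):
    """Probability that a random non-survivor has a higher predicted risk
    than a random survivor (ties count one half)."""
    pos = [p for y, p in zip(outcomes, predicted) if y == 1]
    neg = [p for y, p in zip(outcomes, predicted) if y == 0]
    wins = sum(1.0 if a > b else 0.5 if a == b else 0.0 for a in pos for b in neg)
    return wins / (len(pos) * len(neg))


def auc_label(value):
    """Map an AUC to the categories used in the text."""
    if value > 0.9:
        return "outstanding"
    if value > 0.8:
        return "excellent"
    if value >= 0.7:
        return "acceptable"
    return "poor"


def brier_score(outcomes, predicted):
    """Mean squared difference between predicted risk and outcome (lower is better)."""
    return sum((p - y) ** 2 for y, p in zip(outcomes, predicted)) / len(outcomes)


def hosmer_lemeshow_c(outcomes, predicted, groups=10):
    """Hosmer-Lemeshow C statistic over risk-ordered groups.

    Returns (statistic, degrees_of_freedom) with df = groups - 2.
    Each group must contain patients and have an expected death count
    strictly between 0 and the group size, otherwise a division by zero occurs.
    """
    pairs = sorted(zip(predicted, outcomes))
    n = len(pairs)
    statistic = 0.0
    for g in range(groups):
        chunk = pairs[g * n // groups:(g + 1) * n // groups]
        size = len(chunk)
        expected_deaths = sum(p for p, _ in chunk)
        observed_deaths = sum(y for _, y in chunk)
        # (observed - expected)^2 / expected, for deaths and for survivors.
        statistic += (observed_deaths - expected_deaths) ** 2 / expected_deaths
        statistic += (expected_deaths - observed_deaths) ** 2 / (size - expected_deaths)
    return statistic, groups - 2


if __name__ == "__main__":
    # Toy data, not study data: 1 = non-survivor, 0 = survivor.
    y = [0, 0, 0, 1, 0, 1, 1, 1]
    risk = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8]
    print("SMR (95% CI):", smr_with_ci(y, risk))
    print("AUC:", auc(y, risk), auc_label(auc(y, risk)))
    print("Brier score:", brier_score(y, risk))
    print("Hosmer-Lemeshow C, df:", hosmer_lemeshow_c(y, risk, groups=4))
```

With real data the usual choice is ten risk groups, giving 8 degrees of freedom. The toy call uses four groups only because the example has eight patients.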
Data analyses were performed using SPSS 11.5 (SPSS Inc., Chicago, IL) and MedCalc Version 9.1 (MedCalc Software, Mariakerke, Belgium).