Statistics for Biology and Health

Borchers/Buckland/Zucchini: Estimating Animal Abundance: Closed Populations.
Everitt/Rabe-Hesketh: Analyzing Medical Data Using S-PLUS.
Ewens/Grant: Statistical Methods in Bioinformatics: An Introduction.
Hougaard: Analysis of Multivariate Survival Data.
Klein/Moeschberger: Survival Analysis: Techniques for Censored and Truncated Data, 2nd ed.
Kleinbaum: Survival Analysis: A Self-Learning Text.
Kleinbaum/Klein: Logistic Regression: A Self-Learning Text, 2nd ed.
Lange: Mathematical and Statistical Methods for Genetic Analysis, 2nd ed.
Manton/Singer/Suzman: Forecasting the Health of Elderly Populations.
Moyé: Multiple Analyses in Clinical Trials: Fundamentals for Investigators.
Parmigiani/Garrett/Irizarry/Zeger: The Analysis of Gene Expression Data: Methods and Software.
Salsburg: The Use of Restricted Significance Tests in Clinical Trials.
Simon/Korn/McShane/Radmacher/Wright/Zhao: Design and Analysis of DNA Microarray Investigations.
Sorensen/Gianola: Likelihood, Bayesian, and MCMC Methods in Quantitative Genetics.
Therneau/Grambsch: Modeling Survival Data: Extending the Cox Model.
Zhang/Singer: Recursive Partitioning in the Health Sciences.

SURVIVAL ANALYSIS
Techniques for Censored and Truncated Data
Second Edition

John P. Klein
Medical College of Wisconsin

Melvin L. Moeschberger
The Ohio State University Medical Center

With 97 Illustrations

Springer
Preface

... include those available in most standard computer packages. Techniques for assessing the fit of these parametric models are also discussed.

The final theme is multivariate models for survival data. In Chapter 13, tests for association between event times, adjusted for covariates, are given. An introduction to estimation in a frailty or random effect model is presented. An alternative approach to adjusting for association between some individuals based on an analysis of an independent working model is also discussed.

There should be ample material in this book for a one or two semester course for graduate students. A basic one semester or one quarter course would cover the following sections:

Chapter 2
Chapter 3, Sections 1-5
Chapter 4
Chapter 7, Sections 1-6, 8
Chapter 8
Chapter 9, Sections 1-4
Chapter 11
Chapter 12

In such a course the outlines of theoretical development of the techniques, in the theoretical notes, would be omitted. Depending on the length of the course and the interest of the instructor, these details could be added if the material in section 3.6 were covered or additional topics from the remaining chapters could be added to this skeleton outline. Applied exercises are provided at the end of the chapters. Solutions to odd numbered exercises are new to the second edition. The data used in the examples and in most of the exercises is available from us at our Web site which is accessible through the Springer Web site at http://www.springer-ny.com ml.

Milwaukee, Wisconsin    John P. Klein
Columbus, Ohio    Melvin L. Moeschberger

Contents

Preface

Chapter 1 - Examples of Survival Data
1.1 Introduction
1.2 Remission Duration from a Clinical Trial for Acute Leukemia
1.3 Bone Marrow Transplantation for Leukemia
1.4 Times to Infection of Kidney Dialysis Patients
1.5 Times to Death for a Breast-Cancer Trial
1.6 Times to Infection for Burn Patients
1.7 Death Times of Kidney Transplant Patients
1.8 Death Times of Male Laryngeal Cancer Patients
1.9 Autologous and Allogeneic Bone Marrow Transplants
1.10 Bone Marrow Transplants for Hodgkin's and Non-Hodgkin's Lymphoma
1.11 Times to Death for Patients with Cancer of the Tongue
1.12 Times to Reinfection for Patients with Sexually Transmitted Diseases
1.13 Time to Hospitalized Pneumonia in Young Children
1.14 Times to Weaning of Breast-Fed Newborns
1.15 Death Times of Psychiatric Patients
1.16 Death Times of Elderly Residents of a Retirement Community
1.17 Time to First Use of Marijuana
1.18 Time to Cosmetic Deterioration of Breast Cancer Patients
1.19 Time to AIDS

Chapter 2 - Basic Quantities and Models
2.1 Introduction
2.2 The Survival Function
2.3 The Hazard Function
2.4 The Mean Residual Life Function and Median Life
2.5 Common Parametric Models for Survival Data
2.6 Regression Models for Survival Data
2.7 Models for Competing Risks
2.8 Exercises

Chapter 3 - Censoring and Truncation
3.1 Introduction
3.2 Right Censoring
3.3 Left or Interval Censoring
3.4 Truncation
3.5 Likelihood Construction for Censored and Truncated Data
3.6 Counting Processes
3.7 Exercises

Chapter 4 - Nonparametric Estimation of Basic Quantities for Right-Censored and Left-Truncated Data
4.1 Introduction
4.2 Estimators of the Survival and Cumulative Hazard Functions for Right-Censored Data
4.3 Pointwise Confidence Intervals for the Survival Function
4.4 Confidence Bands for the Survival Function
4.5 Point and Interval Estimates of the Mean and Median Survival Time
4.6 Estimators of the Survival Function for Left-Truncated and Right-Censored Data
4.7 Summary Curves for Competing Risks
4.8 Exercises

Chapter 5 - Estimation of Basic Quantities for Other Sampling Schemes
5.1 Introduction
5.2 Estimation of the Survival Function for Left, Double, and Interval Censoring
5.3 Estimation of the Survival Function for Right-Truncated Data
5.4 Estimation of Survival in the Cohort Life Table
5.5 Exercises

Chapter 6 - Topics in Univariate Estimation
6.1 Introduction
6.2 Estimating the Hazard Function
6.3 Estimation of Excess Mortality
6.4 Bayesian Nonparametric Methods
6.5 Exercises

Chapter 7 - Hypothesis Testing
7.1 Introduction
7.2 One-Sample Tests
7.3 Tests for Two or More Samples
7.4 Tests for Trend
7.5 Stratified Tests
7.6 Renyi Type Tests
7.7 Other Two-Sample Tests
7.8 Test Based on Differences in Outcome at a Fixed Point in Time
7.9 Exercises

Chapter 8 - Semiparametric Proportional Hazards Regression with Fixed Covariates
8.1 Introduction
8.2 Coding Covariates
8.3 Partial Likelihoods for Distinct-Event Time Data
8.4 Partial Likelihoods When Ties Are Present
8.5 Local Tests
8.6 Discretizing a Continuous Covariate
8.7 Model Building Using the Proportional Hazards Model
8.8 Estimation of the Survival Function
8.9 Exercises

Chapter 9 - Refinements of the Semiparametric Proportional Hazards Model
9.1 Introduction
9.2 Time-Dependent Covariates
9.3 Stratified Proportional Hazards Models
9.4 Left Truncation
9.5 Synthesis of Time-varying Effects (Multistate Modeling)
9.6 Exercises

Chapter 10 - Additive Hazards Regression Models
10.1 Introduction
10.2 Aalen's Nonparametric, Additive Hazard Model
10.3 Lin and Ying's Additive Hazards Model
10.4 Exercises

Chapter 11 - Regression Diagnostics
11.1 Introduction
11.2 Cox-Snell Residuals for Assessing the Fit of a Cox Model
11.3 Determining the Functional Form of a Covariate: Martingale Residuals
11.4 Graphical Checks of the Proportional Hazards Assumption
11.5 Deviance Residuals
11.6 Checking the Influence of Individual Observations
11.7 Exercises

Chapter 12 - Inference for Parametric Regression Models
12.1 Introduction
12.2 Weibull Distribution
12.3 Log Logistic Distribution
12.4 Other Parametric Models
12.5 Diagnostic Methods for Parametric Models
12.6 Exercises

Chapter 13 - Multivariate Survival Analysis
13.1 Introduction
13.2 Score Test for Association
13.3 Estimation for the Gamma Frailty Model
13.4 Marginal Model for Multivariate Survival
13.5 Exercises

Appendix A - Numerical Techniques for Maximization
A.1 Univariate Methods
A.2 Multivariate Methods

Appendix B - Large-Sample Tests Based on Likelihood Theory

Appendix C - Statistical Tables
C.1 Standard Normal Survival Function $P[Z \geq z]$
C.2 Upper Percentiles of a Chi-Square Distribution
C.3a Confidence Coefficients $c_{10}(a_L, a_U)$ for 90% EP Confidence Bands
C.3b Confidence Coefficients $c_{05}(a_L, a_U)$ for 95% EP Confidence Bands
C.3c Confidence Coefficients $c_{01}(a_L, a_U)$ for 99% EP Confidence Bands
C.4a Confidence Coefficients $k_{10}(a_L, a_U)$ for 90% Hall-Wellner Confidence Bands
C.4b Confidence Coefficients $k_{05}(a_L, a_U)$ for 95% Hall-Wellner Confidence Bands
C.4c Confidence Coefficients $k_{01}(a_L, a_U)$ for 99% Hall-Wellner Confidence Bands
C.5 Survival Function of the Supremum of the Absolute Value of a Standard Brownian Motion Process over the Range 0 to 1
C.6 Survival Function of $W = \int_0^\infty [B(t)]^2\,dt$, where $B(t)$ is a Standard Brownian Motion
C.7 Upper Percentiles of $R = \int_0^k |B^o(u)|\,du$, where $B^o(u)$ is a Brownian Bridge

Appendix D - Data on 137 Bone Marrow Transplant Patients

Appendix E - Selected Solutions to Exercises

Bibliography
Author Index
Subject Index
Chapter 1 Examples of Survival Data

1.1 Introduction

The problem of analyzing time to event data arises in a number of applied fields, such as medicine, biology, public health, epidemiology, engineering, economics, and demography. Although the statistical tools we shall present are applicable to all these disciplines, our focus is on applying the techniques to biology and medicine. In this chapter, we present some examples drawn from these fields that are used throughout the text to illustrate the statistical techniques we shall describe.

A common feature of these data sets is that they contain either censored or truncated observations. Censored data arises when an individual's life length is known to occur only in a certain period of time. Possible censoring schemes are right censoring, where all that is known is that the individual is still alive at a given time; left censoring, where all that is known is that the individual has experienced the event of interest prior to the start of the study; and interval censoring, where the only information is that the event occurs within some interval. Truncation schemes are left truncation, where only individuals who survive a sufficient time are included in the sample, and right truncation, where only individuals who have experienced the event by a specified time are included in the sample. The issues of censoring and truncation are defined more carefully in Chapter 3.
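The censoring and truncation schemes just described can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the class `Observation`, the helpers `left_truncated_sample` and `right_truncated_sample`, and the convention of recording each event time as an interval (lower, upper] are hypothetical choices made here, not notation from the text, and the truncation helpers use a single common threshold for every individual as a simplification.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Observation:
    """Hypothetical record: the event time is known only to lie in (lower, upper]."""
    lower: float  # latest time the individual was known to be event-free
    upper: float  # earliest time the event was known to have occurred (inf if never observed)

    def scheme(self) -> str:
        if self.lower == self.upper:
            return "exact"              # event time observed exactly
        if math.isinf(self.upper):
            return "right-censored"     # still event-free at time `lower`
        if self.lower == 0:
            return "left-censored"      # event happened some time before `upper`
        return "interval-censored"      # event happened between `lower` and `upper`


def left_truncated_sample(event_times, entry_time):
    """Keep only individuals who survived beyond the (common, assumed) entry time."""
    return [t for t in event_times if t > entry_time]


def right_truncated_sample(event_times, cutoff):
    """Keep only individuals whose event occurred by the (common, assumed) cutoff."""
    return [t for t in event_times if t <= cutoff]


examples = [
    Observation(5, 5),
    Observation(12, math.inf),
    Observation(0, 3),
    Observation(2, 4),
]
for obs in examples:
    print(obs, obs.scheme())
```

The point of the sketch is the distinction it makes visible: a censored individual stays in the sample with partial information about the event time, whereas a truncated individual never enters the sample at all.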
1.2 Remission Duration from a Clinical Trial for Acute Leukemia

Freireich et al. (1963) report the results of a clinical trial of a drug 6-mercaptopurine (6-MP) versus a placebo in 42 children with acute leukemia. The trial was conducted at 11 American hospitals. Patients were selected who had a complete or partial remission of their leukemia induced by treatment with the drug prednisone. (A complete or partial remission means that either most or all signs of disease had disappeared from the bone marrow.) The trial was conducted by matching pairs of patients at a given hospital by remission status (complete or partial) and randomizing within the pair to either a 6-MP or placebo maintenance therapy. Patients were followed until their leukemia returned (relapse) or until the end of the study (in months). The data is reported in Table 1.1.

TABLE 1.1
Remission duration of 6-MP versus placebo in children with acute leukemia

Pair   Remission Status      Time to Relapse for   Time to Relapse for
       at Randomization      Placebo Patients      6-MP Patients
 1     Partial Remission      1                    10
 2     Complete Remission    22                     7
 3     Complete Remission     3                    32+
 4     Complete Remission    12                    23
 5     Complete Remission     8                    22
 6     Partial Remission     17                     6
 7     Complete Remission     2                    16
 8     Complete Remission    11                    34+
 9     Complete Remission     8                    32+
10     Complete Remission    12                    25+
11     Complete Remission     2                    11+
12     Partial Remission      5                    20+
13     Complete Remission     4                    19+
14     Complete Remission    15                     6
15     Complete Remission     8                    17+
16     Partial Remission     23                    35+
17     Partial Remission      5                     6
18     Complete Remission    11                    13
19     Complete Remission     4                     9+
20     Complete Remission     1                     6+
21     Complete Remission     8                    10+

+ Censored observation

This data set is used in Chapter 4 to illustrate the calculation of the estimated probability of survival using the product-limit estimator, the calculation of the Nelson-Aalen estimator of the cumulative hazard function, and the calculation of the mean survival time, along with their standard errors. It is further used in section 6.4 to estimate the survival function using Bayesian approaches.
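As a rough preview of the product-limit calculation mentioned above, the following Python sketch applies the standard Kaplan-Meier product formula to the 6-MP column of Table 1.1. The function name `product_limit` and the variable `six_mp` are illustrative assumptions introduced here, and the sketch omits the standard errors that the text defers to Chapter 4; it is not the book's own presentation.

```python
def product_limit(times, events):
    """Product-limit (Kaplan-Meier) estimate of the survival function.

    times  : observed times (event or censoring)
    events : 1 if the event (relapse) was observed, 0 if censored
    Returns a list of (t, S(t)) pairs, one per distinct event time.
    """
    surv = 1.0
    curve = []
    event_times = sorted({t for t, e in zip(times, events) if e})
    for t in event_times:
        # Individuals censored at time t are counted as still at risk at t.
        at_risk = sum(1 for u in times if u >= t)
        relapses = sum(1 for u, e in zip(times, events) if e and u == t)
        surv *= 1 - relapses / at_risk
        curve.append((t, surv))
    return curve


# 6-MP arm of Table 1.1 in pair order: (time in months, 1 = relapse, 0 = censored).
six_mp = [
    (10, 1), (7, 1), (32, 0), (23, 1), (22, 1), (6, 1), (16, 1),
    (34, 0), (32, 0), (25, 0), (11, 0), (20, 0), (19, 0), (6, 1),
    (17, 0), (35, 0), (6, 1), (13, 1), (9, 0), (6, 0), (10, 0),
]
times = [t for t, _ in six_mp]
events = [e for _, e in six_mp]

for t, s in product_limit(times, events):
    print(f"t = {t:2d} months  S(t) = {s:.3f}")
```

At the first relapse time, 6 months, all 21 patients are at risk and 3 relapse, so the estimate drops to 1 - 3/21, about 0.857. Censored patients contribute to the number at risk up to their censoring time but never produce a downward step themselves.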
Matched pairs tests for differences in treatment efficacy are performed using the stratified log rank test in section 7.5 and the stratified proportional hazards model in section 9.3.

1.3 Bone Marrow Transplantation for Leukemia

Bone marrow transplants are a standard treatment for acute leukemia. Recovery following bone marrow transplantation is a complex process. Prognosis for recovery may depend on risk factors known at the time of transplantation, such as patient and/or donor age and sex, the stage of initial disease, the time from diagnosis to transplantation, etc. The final prognosis may change as the patient's posttransplantation history develops with the occurrence of events at random times during the recovery process, such as development of acute or chronic graft-versus-host disease (GVHD), return of the platelet count to normal levels, return of granulocytes to normal levels, or development of infections. Transplantation can be considered a failure when a patient's leukemia returns (relapse) or when he or she dies while in remission (treatment related death).

Figure 1.1 shows a simplified diagram of a patient's recovery process based on two intermediate events that may occur in the recovery process. These intermediate events are the possible development of acute GVHD that typically occurs within the first 100 days following transplantation and the recovery of the platelet count to a self-sustaining level ≥ 40 × 10^9/l (called platelet recovery in the sequel). Immediately following transplantation, patients have depressed platelet counts and are free of acute GVHD. At some point, they may develop acute GVHD or have their platelets recover, at which time their prognosis (probabilities of treatment related death or relapse at some future time) may change. These events may occur in any order, or a patient may die or relapse without any of these events occurring. Patients may, then, experience the other event, which again modifies their prognosis, or they may die or relapse.

To illustrate this process we consider a multicenter trial of patients prepared for transplantation with a radiation-free conditioning regimen.
Details of the study are found in Copelan et al. (1991). The preparative regimen used in this study of allogeneic marrow transplants for patients with acute myelocytic leukemia (AML) and acute lymphoblastic leukemia (ALL) was a combination of 16 mg/kg of oral Busulfan (BU) and 120 mg/kg of intravenous cyclophosphamide (Cy). A total of 137 patients (99 AML, 38 ALL) were treated at one of four hospitals: 76 at The Ohio State University Hospitals (OSU) in Columbus; 21 at Hahnemann University (HU) in Philadelphia; 23 at St. Vincent's Hospital (SVH) in Sydney, Australia; and 17 at Alfred Hospital (AH) in Melbourne. The study consists of transplants conducted at these institutions from March 1, 1984, to June 30, 1989. The maximum follow-up was 7 years. There were 42 patients who relapsed and 41 who died while in remission. Twenty-six patients had an episode of acute GVHD, and 17 patients either relapsed or died in remission without their platelets returning to normal levels.

Figure 1.1 Recovery Process from a Bone Marrow Transplant

Several potential risk factors were measured at the time of transplantation. For each disease, patients were grouped into risk categories based on their status at the time of transplantation. These categories were as follows: ALL (38 patients), AML low-risk first remission (54 patients), and AML high-risk second remission or untreated first relapse (15 patients) or second or greater relapse or never in remission (30 patients).
Other risk factors measured at the time of transplantation included recipient and donor gender (80 and 88 males, respectively), recipient and donor cytomegalovirus (CMV) immune status (68 and 58 positive, respectively), recipient and donor age (ranges 7-52 and 2-56, respectively), waiting time from diagnosis to transplantation (range 0.8-87.2 months, mean 19.7 months), and, for AML patients, their French-American-British (FAB) classification based on standard morphological criteria. AML patients with an FAB classification of M4 or M5 (45/99 patients) were considered to have a possible elevated risk of relapse or treatment-related death. Finally, patients at the two hospitals in Australia (SVH and AH) were given a graft-versus-host prophylactic combining methotrexate (MTX) with cyclosporine and possibly methylprednisolone. Patients at the other hospitals were not given methotrexate but rather a combination of cyclosporine and methylprednisolone. The data is presented in Table D.1 of Appendix D.

This data set is used throughout the book to illustrate the methods presented. In Chapter 4, it is used to illustrate the product-limit estimator of the survival function and the Nelson-Aalen estimator of the cumulative hazard rate of treatment failure. Based on these statistics, pointwise confidence intervals and confidence bands for the survival function are constructed. The data is also used to illustrate point and interval estimation of summary survival parameters, such as the mean and median time to treatment failure in this chapter.

This data set is also used in Chapter 4 to illustrate summary probabilities for competing risks. The competing risks, where the occurrence of one event precludes the occurrence of the other event, in this example, are relapse and death.

In section 6.2, the data set is used to illustrate the construction of estimates of the hazard rate.
These estimates are based on smoothing the crude estimates of the hazard rate obtained from the jumps of the Nelson-Aalen estimator found in Chapter 4 using a weighted average of these estimates in a small interval about the time of interest. The weights are chosen using a kernel weighting function.

In Chapter 7, this data is used to illustrate tests for the equality of K survival curves. Both stratified and unstratified tests are discussed.

In Chapter 8, the data is used to illustrate tests of the equality of K hazard rates adjusted for possible fixed-time confounders. A proportional hazards model is used to make this adjustment. Model building for this problem is illustrated. In Chapter 9, the models found in Chapter 8 are further refined to include covariates, whose values change over time, and to allow for stratified regression models. In Chapter 11, regression diagnostics for these models are presented.

1.4 Times to Infection of Kidney Dialysis Patients

In a study (Nahman et al., 1992) designed to assess the time to first exit-site infection (in months) in patients with renal insufficiency, 43 patients utilized a surgically placed catheter (Group 1), and 76 patients utilized a percutaneous placement of their catheter (Group 2). Cutaneous exit-site infection was defined as a painful cutaneous exit site and positive cultures, or peritonitis, defined as a presence of clinical symptoms, elevated peritoneal dialytic fluid, elevated white blood cell count (100 white blood cells/μl with 50% neutrophils), and positive peritoneal dialytic fluid cultures. The data appears in Table 1.2.

TABLE 1.2
Times to infection (in months) of kidney dialysis patients with different catheterization procedures

... curves when there are ties present. Testing for proportional hazards is illustrated in section 9.2. The test reveals that a proportional hazards assumption for this data is not correct. A model with a time-varying covariate effect is more appropriate, and in that section the optimal cutoff for "early" and "late" covariate effect on survival is found.

1.5 Times to Death for a Breast-Cancer Trial

In a study (Sedmak et al., 1989) designed to determine if female breast cancer patients, originally classified as lymph node negative by standard light microscopy (SLM), could be more accurately classified by immunohistochemical (IH) examination of their lymph nodes with an anticytokeratin monoclonal antibody cocktail, identical sections of lymph nodes were sequentially examined by SLM and IH.
The significance of this study is that 16% of patients with negative axillary lymph nodes, by standard pathological examination, develop recurrent disease within 10 years. Forty-five female breast-cancer pati