| #81 | | Poisson models for person-years and expected rates This report summarizes approaches used to model observed events with respect to the expected number of events. Examples are provided for examining the excess risk or relative risk using both additive and multiplicative models. Elizabeth J. Atkinson, Cynthia S. Crowson, Rachel A. Pederson, Terry Therneau [September 2008] |
| #80 | | Concordance for Survival Time Data: Fixed and Time-Dependent Covariates and Possible Ties in Predictor and Time Concordance, or synonymously the C-statistic, is a valuable measure of model discrimination in analyses involving survival time data. This report provides a definition of concordance in the case of survival data, allowing for time-dependent covariates with the counting process data representation and accounting for ties in the covariates and times. Walter K Kremers and The William J. von Liebig Transplant Center [April 2007] |
| #79 | | Finding Optimal Cutpoints for Continuous Covariates with Binary and Time-to-Event Outcomes This report provides an overview of the literature and describes a unified strategy for finding optimal cutpoints with respect to binary and time-to-event outcomes. Two SAS macros for identifying a cutpoint have been developed in conjunction with this Technical Report. Brent Williams, Jayawant N. Mandrekar, Sumithra J. Mandrekar, Stephen S. Cha, Alfred F. Furth [June 2006] |
| #78 | | Estimating Genetic Components of Variance for Quantitative Traits in Family Studies using the MULTIC routines This report provides an overview of the theory behind the variance components approach for analyzing one or more quantitative traits in the face of familial correlation. It also provides an introduction to the Splus/R multic library which contains software to carry out this analysis. Mariza de Andrade, Elizabeth J. Atkinson, Eric Lunde, Christopher I. Amos, Jianfang Chen [April 2006] |
| #77 | | Joint Estimation of Calibration and Expression for High-Density Oligonucleotide Arrays We present a unified algorithm which incorporates normalization and class comparison in one analysis using probe level perfect match and mismatch data. The algorithm is based on calibration models common to most biological assays, and the resulting chip-specific parameters have a natural interpretation. We show that the algorithm fits into the statistical generalized linear models framework, describe a practical fitting strategy and present results of the algorithm based on commonly used metrics. Ann L. Oberg, Douglas W. Mahoney, Karla V. Ballman, Terry M. Therneau [February 2006] |
| #76 | | Evaluation of a Simultaneous Mass-Calibration and Peak-Detection Algorithm for FT-ICR Mass Spectrometry Electrospray ionization Fourier transform ion cyclotron resonance mass spectrometry (ESI-FT-ICR-MS) is a potentially superior biomarker discovery platform because it offers high mass-measurement accuracy and high mass-measurement precision as well as high resolving power over a broad mass-to-charge range. Herein, we describe and evaluate a simultaneous mass-calibration and peak-detection algorithm that exploits resolved isotopic peak-spacing information as well as space-charge frequency shifts across isotopic clusters that represent the same molecular species but differ in charge states by integer values. Jeanette E. Eckel-Passow, Terry M. Therneau, Ann L. Oberg, Christopher J. Mason, David C. Muddiman [January 2006] |
| #75 | | Why does PLIER really work? The PLIER (Probe Logarithmic Intensity ERror) algorithm was developed by Affymetrix and released in 2004. It is part of several commercially available software packages that analyze GeneChip data, such as Strand Genomics' Avadis and Stratagene's ArrayAssist. The PLIER algorithm produces an improved gene expression value (a summary value for a probe set) for the GeneChip microarray platform as compared to the Affymetrix MAS5 algorithm. In this report, we look at why the PLIER algorithm performs so well given that its derivation is based on a biologically implausible assumption. Terry M. Therneau and Karla V. Ballman [November 2005] |
| #74 | | An Exploration of Affymetrix Probe-Set Intensities in Spike-In Experiments In this report, we look at the characteristics of the relationship between the observed probe intensity values produced by the Affymetrix GeneChip platform and the known concentration level of the target gene. This is done using data from three publicly available spike-in gene experiments. The report discusses characteristics of the relationship and implications for statistical models and analysis of Affymetrix GeneChip data. We learned a considerable amount from looking at plots of the data, which are provided in the appendices (Appendix A, Appendix B, and Appendix C), and encourage readers to look and learn from the data themselves. Karla V. Ballman and Terry M. Therneau [July 2005] |
| #73 | | Evaluating Methods of Symmetry Knowing the symmetry of the underlying data is important for parametric analysis, fitting distributions or doing transformations to the data. We evaluate five different methods to assess skewness. We have also developed a comprehensive and efficient SAS macro for computing the various skewness measures and the appropriate power transformation, if one exists, to make an asymmetric distribution symmetric. Mandrekar JN, Mandrekar SJ, and Cha SS [January 2005] |
| #72 | | Transmission Disequilibrium Methods for Family-Based Studies The study of the association of genetic markers with complex traits has generated a wide range of statistical methods, particularly those that are based on transmission-disequilibrium. This report provides a review of methods in this area through approximately 1999. Schaid DJ [JUL 2004] |
| #70 | | Duane's Little Handbook of Advice for Young Biostatisticians on How to Work with Investigators This handbook is intended to provide young biostatisticians with a set of guidelines about how to work effectively with investigators. Not all of these guidelines will work well in every consulting situation. You may find that you develop better ways to deal with some situations than those given here. The advice given here should, however, help you to at least formulate for yourself how you should conduct your own consultations. Ilstrup DM [AUG 2004] |
| #69 | | Normalization of Two-Channel Microarray Experiments: A Semiparametric Approach An important underlying assumption of any experiment is that the experimental subjects are similar across levels of the treatment variable, so that changes in the response variable can be attributed to exposure to the treatment under study. This assumption is often not valid in the analysis of a microarray experiment due to systematic biases in the measured expression levels related to experimental factors such as spot location (often referred to as a print-tip effect), arrays, dyes, and various interactions of these effects. Thus, normalization is a critical initial step in the analysis of a microarray experiment, where the objective is to balance the individual signal intensity levels across the experimental factors, while maintaining the effect due to the treatment under investigation. Burgoon LD, Boverhof DR, Eckel JE, Gennings C, Therneau TM, Zacharewski TR [JUL 2004] |
| #68 | | Faster cyclic loess: normalizing DNA arrays via linear models This technical report describes a normalization technique that yields results similar to cyclic loess normalization with speed comparable to quantile normalization. Ballman KV, Grill DE, Oberg AL, Therneau TM [NOV 2003] |
| #67 | | An Introduction to Multiple Imputation Methods: Handling Missing Data with SAS V8.2 This report is organized to give a general overview of the basic concepts of data imputation, with emphasis on application. The purpose is to explain the basic principles of multiple imputation for handling missing data and how to implement this method using SAS version 8.2. Vargas-Chanes D, Decker PA, Schroeder DR, and Offord KP [JULY 2003] |
| #66 | | Penalized Survival Models and Frailty We demonstrate that solutions for gamma shared frailty models can be obtained exactly via penalized estimation. Similarly, Gaussian frailty models are closely linked to penalized models. This makes it possible to apply penalized estimation to other frailty models using Laplace approximations. Fitting frailty models with penalized likelihoods can be made quite rapid by taking advantage of computational methods available for penalized models. We have implemented penalized regression for the coxph function of Splus and illustrate the algorithms with examples using the Cox model. Therneau TM, Grambsch PM, and Pankratz VS [JUNE 2000] |
| #65 | | MCSTRAT: A SAS Macro to Analyze Data From a Matched or Finely Stratified Case-Control Design A case-control design is a common approach used to assess disease-exposure relationships, and the logistic regression model is the most common framework for the analysis of such data. This model expresses the logit transform of the disease probability as a linear combination of independent, or exposure, variables. Vierkant RA, Kosanke JL, Therneau TM, and Naessens JM [FEB 2000] |
| #64 | | Calculating Incidence Rates Among Hospitalized Residents of Olmsted County, Minnesota. The purpose of this technical report is to describe the SAS macro, %inchosp, which allows users to calculate the incidence rate of any disease or event among hospitalized residents of Olmsted County, Minnesota from 1980 to 1990, provided that the location at onset is collected. Lohse CM, Petterson TM, O'Fallon WM, Melton LJ [FEB 1999] |
| #63 | | Expected Survival Based on Hazard Rates (update) This paper is an extension and update of Technical Report #52. An update to the rate tables themselves is based on the recently released data from the 1990 decennial census, which allowed us to replace extrapolated 1990 death rates with actual rates, and to improve the extrapolated year 2000 values. Much of the material in the prior report is contained here, in order to make this document useful on its own. Therneau TM and Offord J [Feb 1999] |
| #62 | | Computing the Cox Model for Case Cohort Designs Prentice proposed a case-cohort design as an efficient subsampling mechanism for survival studies. Several other authors have expanded on these ideas to create a family of related sampling plans, along with estimators for the covariate effects. We describe how to obtain the proposed parameter estimates and their variance estimates using standard software packages, with SAS and S-Plus as particular examples. Therneau TM and Li H [July 1998] |
| #61 | | An Introduction to Recursive Partitioning Using the RPART Routines A short overview of the methods found in the rpart routines, which implement many of the ideas found in the CART (Classification and Regression Trees) book and programs of Breiman, Friedman, Olshen and Stone. Therneau TM and Atkinson EJ [Nov 1997] |
| RPART Condensed Version | | This document is a shortened version of technical report #61 focusing on the examples and the function options. Atkinson EJ and Therneau TM [Feb 2000] |
| #58 | | Extending the Cox Model Since its introduction, the proportional hazards model proposed by Cox has become the workhorse of regression analysis for censored data. In the last several years, the theoretical basis for the model has been solidified by connecting it to the study of counting processes and martingale theory. Comprehensive accounts of the underlying mathematics are given in the books of Fleming and Harrington and of Andersen et al. These developments have, in turn, led to the introduction of several new extensions of the original model. These include the analysis of residuals, time varying covariates, time dependent coefficients, multiple/correlated observations, multiple time scales, time dependent strata, and estimation of underlying hazard functions. The aim of this monograph is to show how many of these methods and extensions of the model can be approached using standard statistical software, in particular the S-Plus and SAS packages. As such, it should be a bridge between the statistical journals and actual practice. Therneau TM [June 1996] |
| #57 | | How many stratification factors are "too many" to use in a randomization plan? Controlled Clinical Trials, 14:98-108, 1993 The issue of stratification and its role in patient assignment has generated much discussion, mostly focused on its importance to a study or lack thereof. This report focuses on a much narrower problem: assuming that stratified assignment is desired, how many factors can be accommodated? This is investigated for two methods of balanced patient assignment: the first is based on the minimization method of Taves and the second on the commonly used method of stratified assignment. Simulation results show that the former method can accommodate a large number of factors (10-20) without difficulty, but that the latter begins to fail if the total number of distinct combinations of factor levels is greater than approximately n/2. The two methods are related to a linear discriminant model, which helps to explain the results. Therneau TM [1993] |
| #56 | | Computerized matching of cases to controls The purpose of this report is to describe a new SAS macro, %match, written to facilitate the matching of cases to controls, where one case is matched to one or more controls. Bergstralh EJ and Kosanke JL [April 1995] |
| #55 | | Extrapolation of the U.S. Life Tables Therneau TM & Scheib C [October 1994] |
| #54 | | Generalized Population Attributable Risk Estimation Kahn MJ, O'Fallon WM, Sicks JD [April 1994] |
| #53 | | A package for survival analysis in S Therneau TM [June 1994] |
| #52 | | Expected survival based on hazard rates Therneau TM, Sicks JD, Bergstralh EJ, and Offord J [March 1994] |
| | The new PROC paired Grambsch PM and Therneau TM [Feb. 1993] |
| #51 | | The new PROC paired Bergstralh EJ, Offord KP, and Kosanke JL [Oct. 1992] |
| #50 | | A numerical solution for text information retrieval and its application in patient care classification. Yang Y and Chute C [Feb. 1992] |
| #49 | | Calculating incidence, prevalence and mortality rates in Olmsted County, Minnesota: An update Bergstralh EJ, Offord KP, Chu CP, Beard CM, O'Fallon WM, and Melton LJ [April 1992] |
| #48 | | A random survey of Olmsted County, Minnesota, 1973 O'Brien PD [Mar 1991] |
| #47 | | The GLIM procedure: An interface to the SAS system Grambsch PM, Kosanke JL, Therneau TM, Schaid DJ, Zinsmeister AR, Wieand HS, Offord KP, LarsonKeller [Mar. 1990] |
| #46 | | Simple robust tests for comparing dispersion in bivariate data Grambsch PM [Dec. 1989] |
| #45 | | Optimal two-stage screening designs for survival comparisons Schaid DJ, Wieand HS and Therneau TM [Nov. 1988] |
| #44 | | Robust procedures for testing equality of covariance matrices O'Brien PC [Oct. 1988] |
| #43 | | A SAS macro for comparing covariance matrices O'Brien PC and Stertz CD [Oct. 1988] |
| #42 | | A SAS macro for validating stepwise regression O'Brien PC and Kosanke JL [Oct. 1988] |
| #41 | | A SAS macro for regression O'Brien PC, Stertz CD, Bergstralh EJ, Daood SL and Offord KP [Oct. 1988] |
| #40 | | Martingale based residuals for survival models Therneau TM, Grambsch PM and Fleming TR [Apr. 1988] |
| #39 | | The effects of preliminary tests for nonlinearity in regression Grambsch PM and O'Brien PC [Mar. 1988] |
| #38 | | Projected Rochester and Olmsted County populations for 1981-1995 Bergstralh EJ and Offord KP [Feb. 1988] |
| #37 | | Conditional probabilities used in calculating cohort expected survival Bergstralh EJ and Offord KP [Jan. 1988] |
| #36 | | Enumerating the optimal designs for a Phase II trial Therneau TM, Wieand HS and Chang M [Sept. 1987] |
| #35 | | Use of an Apple II+ personal computer to enter and code diagnostic data in a research setting Beard CM and Goss S [Apr. 1987] |
| #34 | | A two-stage design for randomized trials with binary outcomes Wieand HS and Therneau TM [Oct. 1986] |
| #33 | | Designs for group sequential Phase II clinical trials Chang M, Therneau TM, Wieand HS, Cha S [July 1986] |
| #32 | | Proc Twosample: A SAS Procedure for the two-sample t and rank-sum tests with extensions O'Brien PC, Offord KP, Kosanke JL [Nov. 1987] |
| #31 | | PERSONYRS: A SAS procedure for person year analyses Bergstralh E, Offord KP, Kosanke JL, and Augustine G [Apr. 1986] |
| #30 | | Comparing two samples: Extensions of the t, rank sum, and log rank tests O'Brien PC [Nov. 1985] |
| #28 | | PROC MCSTRAT Naessens JM, Offord KP, Scott WF, and Daood SL [Oct. 1984] |
| #26 | | PROC SURVDIFF Fleming TR, Augustine GA, Elcombe SA, Offord KP [Nov. 1984] |
| #25 | | Procedures for testing efficacy for clinical trials with multiple endpoints O'Brien PC [Sept. 1983] |
| #24 | | Designs for group sequential tests Fleming TR, Harrington DP, O'Brien PC [Apr. 1984] |
| #23 | | On robust estimation of location for arbitrarily right-censored data Green SJ and Crowley J [Feb. 1983] |
| #22 | | Statistical methods for analyzing variables measured repeatedly over time Zinsmeister AR [Being revised] |
| #21 | | A runs test based on run lengths O'Brien PC [Jan. 1983] |
| #20 | | A SAS MACRO which utilizes local and reference population counts appropriate for incidence, prevalence, and mortality rate calculations in Rochester and Olmsted County, Minnesota Schroeder DJ and Offord KP [Aug. 1982] |
| #19 | | Performing serial testing of treatment effects Fleming TR, Green SJ, and Harrington DP [July 1982] |
| #18 | | Reprint file management using SAS Davis C [June 1981] |
| #17 | | A SAS MACRO for calculating the quadratic discriminant function Davis C [June 1981] |
| #16 | | A SAS MACRO for EDF goodness-of-fit tests Davis C [June 1981] |
| #15 | | The usefulness of mathematical models in studying observational data O'Brien PC [Mar. 1981] |
| #14 | | A data management model for small-scale multi-center studies Tilley BC, Offord KP and Oenning R [Feb. 1981] |
| #13 | | One sample multiple testing procedure for Phase II clinical trials Fleming TR [Feb. 1981] |
| #12 | | A class of rank test procedures for censored survival data Harrington DP and Fleming TR [Feb. 1981] |
| #11 | | Adjusting significance levels in censored data when using multiple tests simultaneously Fleming TR and Harrington DP [Dec. 1980] |
| #10 | | An investigation into the operating characteristics of some two-sample nonparametric test procedures used for censored survival data Fleming TR and Harrington DP [Aug. 1980] |
| #9 | | A class of hypothesis tests for one and two sample censored survival data Fleming TR and Harrington DP [Aug. 1980] |
| #8 | | Nonparametric estimation of the survival distribution in censored data Fleming TR and Harrington DP [Aug. 1979] |
| #7 | | A likelihood test for multivariate serial correlation O'Brien PC [May 1979] |
| #5 | | A multi-stage procedure for clinical trials O'Brien PC and Fleming TR [Oct. 1978] |
| #4 | | A system for storage and retrieval of echocardiographic data Offord KP, Weber VP, Augustine GA, Giuliani ER [Apr. 1978] |
| #3 & 6 | | Modified Kolmogorov-Smirnov test procedures with application to arbitrarily right censored data Fleming TR, O'Fallon JR, O'Brien PC, Harrington DP [Aug. 1980] |
| #2 | | A system for assuring protocol adherence in clinical trials Golenzer H, Taylor WF, O'Fallon JR, and Silvers A [Mar. 1978] |
| #1 | | Asymptotic efficiencies for some nonparametric tests of association with censored data O'Brien PC [May 1978] |