Penalized spline smoothing using Kaplan-Meier weights in semiparametric censored regression models

  1. Jesus Orbe 1
  2. Jorge Virto 1
  1. 1 Universidad del País Vasco/Euskal Herriko Unibertsitatea
    Lejona, Spain

    ROR https://ror.org/000xsnr85

Journal:
Sort: Statistics and Operations Research Transactions

ISSN: 1696-2281

Year of publication: 2022

Volume: 46

Issue: 1

Pages: 95-114

Type: Article

Abstract

In this article we consider an extension of the penalized splines approach to censored semiparametric modelling, using Kaplan-Meier weights to take into account the effect of censoring. We propose an estimation method and develop statistical inference for the model. Using various simulation studies, we show that the performance of the method is quite satisfactory. A real data set is used to illustrate that the proposed method is comparable to parametric approaches that assume a probability distribution for the response variable and/or a functional form. However, our proposal does not need these assumptions, so it avoids model specification problems.
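
As an illustration only, the sketch below computes Kaplan-Meier weights in the form commonly used for weighted least squares with right-censored responses: each observation receives the jump of the Kaplan-Meier estimator at its observed time, so censored observations get weight zero. The abstract does not state the exact weight formula, so this form, the tie-breaking rule, and the function name `kaplan_meier_weights` are assumptions rather than the authors' implementation. In a penalized spline fit, such weights would multiply the squared residuals in the least squares part of the criterion.

```python
def kaplan_meier_weights(times, events):
    """Return Kaplan-Meier weights in the original order of the observations.

    times  : observed values (minimum of response and censoring value)
    events : 1 if the response was observed, 0 if it was censored

    Assumed form (not taken from the article): for the i-th ordered
    observation, weight = events_i / (n - i + 1) times the product, over
    earlier ordered observations j, of ((n - j) / (n - j + 1)) ** events_j.
    """
    n = len(times)
    # Sort by observed value; on ties, put uncensored observations first
    # (an assumed, commonly used convention).
    order = sorted(range(n), key=lambda i: (times[i], -events[i]))

    weights = [0.0] * n
    survival = 1.0  # running product over earlier uncensored observations
    for rank, idx in enumerate(order, start=1):
        at_risk = n - rank + 1
        if events[idx]:
            weights[idx] = survival / at_risk
            survival *= (at_risk - 1) / at_risk
        # Censored observations keep weight 0; their mass is passed on
        # to later observations through the unchanged running product.
    return weights


if __name__ == "__main__":
    # Hypothetical toy data: the second observation is censored.
    print(kaplan_meier_weights([1.0, 2.0, 3.0], [1, 0, 1]))
```

When no observation is censored, every weight equals 1/n, so a weighted penalized criterion built on these weights reduces to the ordinary penalized least squares one.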
