Imputation methods to correct overestimated shape parameter of Weibull hazard function in time-to-event modeling

Aims: Hyper-acute, mass occurrence of events (e.g., at day 0~1) is not rare in biomedical data. If event times are recorded on a daily basis in such a situation, information is lost because all events occurring within the first day are discretized to day 1. However, the resulting overestimation of the maximum likelihood estimator (MLE) of the Weibull shape parameter in survival analysis of such data has been largely neglected so far. The overestimated MLE was observed consistently in SAS, R and NONMEM. We therefore performed a simulation study to explore its implications and present a method that uses a hybrid dataset in which the events at day 1 are replaced with simulated events.

Methods: A survival dataset of 1,000 subjects was simulated for each shape parameter (0.2, 0.4, 0.6, 0.8, 1.0, 1.5, 2.0; scale parameter fixed at 30) over the time range of 0 to 28 days using the rweibull function in R. The simulated datasets were then discretized ("reference dataset" hereafter) so that event times were recorded in units of days. The overestimation of the MLE of the shape parameter in the reference datasets was assessed by comparing K-M plots and biases. To correct the overestimation, we started from the biased MLEs of the Weibull shape and scale parameters: we simulated the events occurring within day 1 and used them to replace the discretized day 1 events in the reference dataset, building a "hybrid dataset". The advantage of the hybrid dataset over the reference dataset was that event-time information within day 1 remained usable. With the hybrid dataset, we obtained the MLEs of the Weibull parameters (shape and scale) and assessed their bias relative to the nominal parameters of the reference datasets, using the Kolmogorov-Smirnov test to compare the similarity of the K-M plots from the reference dataset and the hybrid dataset. We repeated the imputation process iteratively, each time using the new MLEs of the Weibull parameters, until we arrived at satisfactory MLEs that passed the Kolmogorov-Smirnov test (p>0.999).

Results: When the nominal shape parameter was less than 0.6, the bias of the MLE was evident. We arrived at satisfactory Weibull parameters in fewer than 20 iterative imputation steps.

Conclusion: When the nominal (true) shape parameter was less than 0.5, the MLE of the shape parameter was not appropriate for modeling discrete time-to-event data with the Weibull baseline hazard. We suggest that our iterative imputation method may be helpful when modeling time-to-event data with hyper-acute, mass occurrence of events in the very first recording interval (e.g., at day 0~1).
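
Illustration: The procedure described in the Methods can be sketched roughly as follows. This is a minimal sketch under stated assumptions, not the code used in the study: it is written in Python with the standard library rather than R, it assumes event times are rounded up to the next whole day (so events in (0, 1] are recorded as day 1), it assumes administrative censoring at day 28, and it replaces the Kolmogorov-Smirnov stopping rule with a fixed number of iterations. All function names (simulate, discretize, fit_weibull, impute_day1, iterative_imputation) are hypothetical.

```python
import math
import random

HORIZON = 28.0  # follow-up in days, as in the abstract


def simulate(n, shape, scale, rng):
    """Exact Weibull event times, censored at HORIZON (assumed censoring scheme)."""
    data = []
    for _ in range(n):
        t = rng.weibullvariate(scale, shape)  # stdlib argument order: (scale, shape)
        data.append((min(t, HORIZON), t <= HORIZON))
    return data


def discretize(data):
    """Record event times in whole days (assumption: rounded up, so (0, 1] -> day 1)."""
    return [(max(1.0, float(math.ceil(t))) if event else t, event)
            for t, event in data]


def fit_weibull(data):
    """Right-censored Weibull MLE (shape, scale), treating recorded times as exact."""
    logs = [math.log(t) for t, _ in data]
    d = sum(1 for _, event in data if event)
    mean_log_event = sum(l for l, (_, event) in zip(logs, data) if event) / d

    def score(k):
        # Profile score for the shape parameter; it is increasing in k.
        w = [math.exp(k * l) for l in logs]  # t ** k
        return sum(wi * li for wi, li in zip(w, logs)) / sum(w) - 1.0 / k - mean_log_event

    lo, hi = 0.01, 20.0
    for _ in range(50):  # bisection on the profile score
        mid = 0.5 * (lo + hi)
        if score(mid) < 0:
            lo = mid
        else:
            hi = mid
    k = 0.5 * (lo + hi)
    scale = math.exp(math.log(sum(math.exp(k * l) for l in logs) / d) / k)
    return k, scale


def impute_day1(discrete, shape, scale, rng):
    """Replace recorded day-1 events with draws from Weibull(shape, scale) truncated to (0, 1]."""
    p1 = 1.0 - math.exp(-(1.0 / scale) ** shape)  # P(T <= 1) under the current estimates
    hybrid = []
    for t, event in discrete:
        if event and t == 1.0:
            u = (1.0 - rng.random()) * p1  # uniform on (0, p1]
            t = scale * (-math.log1p(-u)) ** (1.0 / shape)  # inverse CDF
        hybrid.append((t, event))
    return hybrid


def iterative_imputation(discrete, n_iter, rng):
    """Refit on a fresh hybrid dataset n_iter times (fixed count is an assumption)."""
    shape, scale = fit_weibull(discrete)  # biased starting values
    for _ in range(n_iter):
        hybrid = impute_day1(discrete, shape, scale, rng)
        shape, scale = fit_weibull(hybrid)
    return shape, scale


if __name__ == "__main__":
    rng = random.Random(1)
    discrete = discretize(simulate(1000, 0.4, 30.0, rng))
    print("naive MLE (shape, scale):", fit_weibull(discrete))
    print("after iterative imputation:", iterative_imputation(discrete, 20, rng))
```

In this sketch only the day 1 events are imputed; events recorded on later days keep their whole-day values, so some residual discretization effect may remain in the final estimates.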