Survival Analysis
Learning Interpretable Mixture of Weibull Distributions - Exploratory Analysis of How Economic Development Influences the Incidence of COVID-19 Deaths
This paper presents an algorithm for learning local Weibull models whose operating regions are represented by fuzzy rules. The applicability of the proposed method is demonstrated by estimating the mortality rate of the COVID-19 pandemic. The reproducible results show significant differences between the mortality rates of countries depending on their economic situation, urbanization, and the state of the health sector. The proposed method is compared with semi-parametric Cox proportional hazards regression. The distribution functions estimated by the two methods are close to each other, which supports the validity of the estimates produced by the proposed method.
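As a rough orientation, a mixture of local Weibull models can be written in the generic form below. This is only a sketch under assumptions: the symbols (scale \lambda_i, shape k_i, normalized rule weights w_i(x), number of rules c) are illustrative and not taken from the paper, and the paper's exact parameterization may differ.

```latex
% Local Weibull model attached to rule i (survival and hazard functions)
S_i(t) = \exp\!\left[-\left(\frac{t}{\lambda_i}\right)^{k_i}\right],
\qquad
h_i(t) = \frac{k_i}{\lambda_i}\left(\frac{t}{\lambda_i}\right)^{k_i - 1}

% Mixture: rule weights depend on the covariates x and sum to one
S(t \mid \mathbf{x}) = \sum_{i=1}^{c} w_i(\mathbf{x})\, S_i(t),
\qquad
w_i(\mathbf{x}) \ge 0,
\qquad
\sum_{i=1}^{c} w_i(\mathbf{x}) = 1
```

In this reading, each fuzzy rule describes a region of the covariate space (for example, a range of economic indicators) and carries its own Weibull parameters, which is what makes the model interpretable.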
Mixture of Survival Analysis Models - Cluster-Weighted Weibull Distributions
Survival analysis is a widely used method for establishing a connection between a time-to-event outcome and a set of variables. The goal of this work is to improve the accuracy of the widely applied parametric survival models. This work highlights that accurate and interpretable survival analysis models can be identified by clustering-based exploration of the operating regions of local survival models. The key idea is that when the operating regions of local Weibull distributions are represented by Gaussian mixture models, the parameters of the mixture-of-Weibull model can be identified by a clustering algorithm. The proposed method is utilised in three case studies: the dropout rate of university students, the remaining useful life of lithium-ion batteries, and the chances of survival of prostate cancer patients. The results demonstrate the wide applicability of the method and the benefits of clustering-based identification of local Weibull models.
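The idea of weighting local Weibull models by Gaussian operating regions can be illustrated with a minimal sketch. It assumes a single covariate and already-known parameters; the function names, the dictionary layout, and all numbers are hypothetical, and the clustering-based identification step described in the abstract is not reproduced here.

```python
import math

def weibull_survival(t, scale, shape):
    """Survival function S(t) = exp(-(t / scale) ** shape) of a Weibull model."""
    return math.exp(-((t / scale) ** shape))

def gaussian_pdf(x, mean, std):
    """Density of a univariate normal distribution."""
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))

def memberships(x, clusters):
    """Posterior probability of each cluster given the covariate x."""
    weighted = [c["weight"] * gaussian_pdf(x, c["mean"], c["std"]) for c in clusters]
    total = sum(weighted)
    return [w / total for w in weighted]

def mixture_survival(t, x, clusters):
    """Cluster-weighted survival: sum over clusters of p(i | x) * S_i(t)."""
    probs = memberships(x, clusters)
    return sum(
        p * weibull_survival(t, c["scale"], c["shape"])
        for p, c in zip(probs, clusters)
    )

# Hypothetical two-cluster model; every number here is illustrative only.
clusters = [
    {"weight": 0.5, "mean": 0.0, "std": 1.0, "scale": 10.0, "shape": 1.5},
    {"weight": 0.5, "mean": 5.0, "std": 1.0, "scale": 3.0, "shape": 0.8},
]

# Survival probability at t = 5 for two different covariate values.
print(mixture_survival(5.0, 0.0, clusters))
print(mixture_survival(5.0, 5.0, clusters))
```

A covariate value near the first cluster centre is governed almost entirely by the first local Weibull model, so the predicted survival curve changes smoothly as the covariate moves between operating regions.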
Integrated Survival Analysis and Frequent Pattern Mining for Course Failure-Based Prediction of Student Dropout
A data-driven method is proposed to identify frequent sets of course failures that students should avoid in order to minimize the likelihood of dropping out of their university studies. The overall probability distribution of dropout is determined by survival analysis. This result can only describe the mean dropout rate of the undergraduates. However, because failing different courses can affect the chance of dropout very differently, the traditional survival model should be extended with event analysis. The study paths of students are represented as events that record the required subjects left uncompleted in each semester. Frequent patterns of backlogs are discovered by mining frequent sets of these events. The prediction of dropout is personalised by classifying the success of the transitions between the semesters. Based on the explored frequent item sets and classifiers, association rules are formed that estimate the success of the continuation of the studies in the form of confidence metrics. The results can be used to identify critical study paths and courses. Furthermore, based on the patterns of individual uncompleted subjects, the method can predict the chance of continuation in every semester. The analysis of the critical study paths can be used to design personalised actions that minimize the risk of dropout, or to redesign the curriculum with the aim of reducing the dropout rate. The applicability of the method is demonstrated by analysing the progress of chemical engineering students at the University of Pannonia in Hungary. The method is also suitable for more general problems in which combinations of events from a given set may trigger a set of critical events.
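The frequent item set and confidence calculations mentioned above can be sketched in a few lines. This is a brute-force illustration, not the mining algorithm used in the study; the course names, the "continues" label, and the student records are hypothetical.

```python
from itertools import combinations

def frequent_itemsets(transactions, min_support):
    """Return {itemset: support} for every itemset reaching min_support."""
    n = len(transactions)
    items = sorted(set().union(*transactions))
    result = {}
    for size in range(1, len(items) + 1):
        found = False
        for combo in combinations(items, size):
            candidate = frozenset(combo)
            support = sum(candidate <= t for t in transactions) / n
            if support >= min_support:
                result[candidate] = support
                found = True
        if not found:
            break  # no larger itemset can be frequent if none of this size is
    return result

def confidence(transactions, antecedent, consequent):
    """confidence(A -> C) = count(A and C) / count(A)."""
    antecedent = frozenset(antecedent)
    both = antecedent | frozenset(consequent)
    count_a = sum(antecedent <= t for t in transactions)
    count_both = sum(both <= t for t in transactions)
    return count_both / count_a if count_a else 0.0

# Hypothetical records: courses failed in a semester, plus the label
# "continues" if the student enrolled in the following semester.
records = [
    {"math1", "physics1"},
    {"math1", "physics1"},
    {"math1", "physics1", "continues"},
    {"math1", "continues"},
    {"physics1", "continues"},
    {"continues"},
]

frequent = frequent_itemsets(records, min_support=0.5)
conf_both = confidence(records, {"math1", "physics1"}, {"continues"})
conf_math = confidence(records, {"math1"}, {"continues"})
print(frequent)
print(conf_both, conf_math)
```

In this toy data the rule "failed math1 and physics1, then continues" has a lower confidence than "failed math1, then continues", which is the kind of contrast that would flag a combination of failures as a critical study path.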
Estimation of Machine Setup and Changeover Times by Survival Analysis
The losses associated with changeovers are becoming more significant in manufacturing due to high product variety and the requirements of just-in-time production. The study is based on the single-minute exchange of die (SMED) philosophy, which aims to reduce changeover times. We introduce a method for analysing these losses based on models that estimate the product- and operator-dependent changeover times using survival analysis. The root causes of the losses are identified by significance tests on the fitted Cox regression models. The resulting models can be used to design a performance management system that considers the stochastic nature of the operators' work. An anonymized manufacturing example related to the setup of crimping and wire cutting machines demonstrates the applicability of the method.
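To make the role of Cox regression concrete, the sketch below fits a proportional hazards model with one binary covariate by Newton-Raphson on the partial log-likelihood and reports a Wald statistic. It is a simplified illustration under assumptions: the covariate (a hypothetical "new operator" flag), the changeover times, and the function names are invented for the example, ties are not treated specially, and a real analysis would normally rely on an established statistics library.

```python
import math

def cox_derivatives(beta, times, events, x):
    """Partial log-likelihood, gradient and Hessian for a single covariate."""
    loglik = grad = hess = 0.0
    for i, (t_i, d_i) in enumerate(zip(times, events)):
        if not d_i:
            continue  # censored observations only contribute through risk sets
        risk = [j for j, t_j in enumerate(times) if t_j >= t_i]
        w = [math.exp(beta * x[j]) for j in risk]
        s0 = sum(w)
        s1 = sum(wj * x[j] for wj, j in zip(w, risk))
        s2 = sum(wj * x[j] ** 2 for wj, j in zip(w, risk))
        loglik += beta * x[i] - math.log(s0)
        grad += x[i] - s1 / s0
        hess -= s2 / s0 - (s1 / s0) ** 2
    return loglik, grad, hess

def fit_cox(times, events, x, iterations=25):
    """Newton-Raphson estimate of beta and its standard error."""
    beta = 0.0
    for _ in range(iterations):
        _, grad, hess = cox_derivatives(beta, times, events, x)
        beta -= grad / hess
    _, _, hess = cox_derivatives(beta, times, events, x)
    std_err = math.sqrt(-1.0 / hess)
    return beta, std_err

# Hypothetical changeover durations in minutes; the "event" is the
# completion of the changeover, and every changeover here was observed.
times = [5, 7, 8, 12, 6, 10, 14, 15]
events = [1, 1, 1, 1, 1, 1, 1, 1]
new_operator = [0, 0, 0, 0, 1, 1, 1, 1]

beta, std_err = fit_cox(times, events, new_operator)
wald_z = beta / std_err  # compared against a standard normal distribution
print(beta, std_err, wald_z)
```

Because the event is the completion of a changeover, a negative coefficient means a lower completion rate and therefore longer changeovers for that group. With only eight observations the Wald statistic is far from conclusive; the point is the mechanics of the test, not the result.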