For epidemiological studies that seek to relate a failure time to exposure variables that are expensive to obtain, a large cohort study under simple random sampling can be prohibitively costly on a limited budget. In such cases, two-phase studies are desirable. Failure-time-dependent sampling (FDS) is a commonly used cost-effective sampling strategy in such studies. To enhance study efficiency under FDS, it is necessary to incorporate auxiliary information on the expensive variables into both the sampling design and the statistical analysis.
In survival analysis, it is commonly assumed that all subjects in a study will eventually experience the event of interest. However, this assumption may not hold in many settings. For example, when studying the time until a patient experiences disease progression or relapse, those who are cured will never experience the event. These subjects are often labeled as ``long-term survivors'' or ``cured'', and their survival time is treated as infinite. When survival data include a fraction of long-term survivors, the censored observations comprise both uncured individuals, for whom the event was not observed during follow-up, and cured individuals, who will never experience the event. Consequently, the cure status of a censored subject is unknown, and the survival data form a mixture of cured and uncured individuals who cannot be distinguished beforehand. Cure models are survival models designed to address this characteristic.
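As an illustrative sketch only, one common way to formalize this mixture is the mixture cure model. The notation here is an assumption introduced for illustration and is not necessarily that of the later chapters: $p \in (0,1]$ denotes the probability of being uncured and $S_u(t)$ denotes a proper survival function for the uncured subjects. The population survival function is then
\[
  S_{\mathrm{pop}}(t) = (1 - p) + p\, S_u(t), \qquad \lim_{t \to \infty} S_{\mathrm{pop}}(t) = 1 - p ,
\]
so that the survival curve levels off at the cure fraction $1 - p$ rather than decreasing to zero.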
Chapter~2 discusses semiparametric inference for a two-phase failure-time-auxiliary-dependent sampling (FADS) design, which allows the probability of obtaining the expensive exposures to depend on both the failure time and cheaply available auxiliary variables. Chapter~3 considers the generalized case-cohort design for studies with a cure fraction. Chapter~4 discusses a few directions for future research.