We propose a nonparametric online-learning framework for conducting early-stage dose-finding clinical trials that considers efficacy and toxicity simultaneously. It has two major benefits: efficient use of patient responses and immunity to model misspecification. First, unlike most Phase I trials, which track only toxicity, our framework also infers the efficacy of each dose from the same patient responses. Second, our framework exploits application-specific structures of the dose-efficacy and dose-toxicity curves without imposing any parametric form. Because of the discontinuity arising from the binary safety response (a dose is either safe or not), standard approaches for continuum-armed bandits do not apply. We therefore propose two algorithms that are easy to understand and implement, and we analyze their regret. The first follows dose-escalation principles and analyzes efficacy and toxicity simultaneously, which makes it appealing when very little information about the dose-toxicity profile is available. The second, which is asymptotically optimal up to a logarithmic factor, uses bisection search to identify a safe dose range and then applies upper confidence bound algorithms within that range to identify efficacious doses. We compare our algorithms against three benchmarks commonly used in practice, on both synthetic and real datasets, and the results show that they significantly outperform the benchmarks.
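The two-stage idea behind the second algorithm can be illustrated with a minimal sketch. This is not the authors' algorithm: it assumes a small discrete set of doses, a toxicity probability that is nondecreasing in dose, a fixed number of patients per bisection step, and the generic UCB1 index. All names (`find_safe_max`, `ucb_within_range`, `TOXICITY`, `EFFICACY`, `THRESHOLD`) and all probabilities are hypothetical and chosen only for illustration.

```python
import math
import random

# Made-up dose-response probabilities for five doses (illustration only).
TOXICITY = [0.05, 0.10, 0.15, 0.60, 0.80]  # assumed nondecreasing in dose
EFFICACY = [0.20, 0.50, 0.80, 0.90, 0.90]
THRESHOLD = 0.30                            # assumed maximum tolerated toxicity rate

rng = random.Random(0)  # fixed seed so the simulation is reproducible


def sample_toxicity(dose):
    """Simulate one patient's binary toxicity outcome at the given dose."""
    return rng.random() < TOXICITY[dose]


def sample_efficacy(dose):
    """Simulate one patient's binary efficacy outcome at the given dose."""
    return rng.random() < EFFICACY[dose]


def find_safe_max(n_doses, threshold, n_per_dose):
    """Bisection over dose indices: return the highest dose whose observed
    toxicity rate is at most `threshold`, or -1 if none qualifies."""
    lo, hi, safe_max = 0, n_doses - 1, -1
    while lo <= hi:
        mid = (lo + hi) // 2
        events = sum(sample_toxicity(mid) for _ in range(n_per_dose))
        if events / n_per_dose <= threshold:
            safe_max = mid   # mid looks safe; search higher doses
            lo = mid + 1
        else:
            hi = mid - 1     # mid looks unsafe; search lower doses
    return safe_max


def ucb_within_range(safe_max, horizon):
    """Run UCB1 on doses 0..safe_max; return how often each dose was given."""
    n_arms = safe_max + 1
    counts = [0] * n_arms
    successes = [0] * n_arms
    for t in range(1, horizon + 1):
        if t <= n_arms:
            arm = t - 1      # give each safe dose once to initialize
        else:
            arm = max(
                range(n_arms),
                key=lambda d: successes[d] / counts[d]
                + math.sqrt(2 * math.log(t) / counts[d]),
            )
        successes[arm] += sample_efficacy(arm)
        counts[arm] += 1
    return counts


safe_max = find_safe_max(len(TOXICITY), THRESHOLD, n_per_dose=200)
counts = ucb_within_range(safe_max, horizon=5000)
print("highest safe dose index:", safe_max)
print("patients per safe dose:", counts)
```

In this toy setting the bisection stage spends patients only on the doses it probes, and the bandit stage never assigns a dose above the estimated safe limit. A real trial design would need a principled stopping rule for each bisection step rather than a fixed sample size.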
Amin Khademi is an associate professor of industrial engineering at Clemson University. He received his PhD from the University of Pittsburgh. His research interests lie at the intersection of operations research/management and healthcare. Specifically, he works on innovative designs of clinical trials, for which he received an NSF CAREER Award; efficient and fair organ transplantation allocation rules; optimal resource allocation for epidemic control; and ambulance dispatching and relocation strategies.