PhD candidate, UT
Friday, April 23, 2021, 15:30-16:30
Baucum’s research focuses on improving the application of reinforcement learning to medical decision-making problems. He specializes in solving treatment-planning problems through the use of environment models, which are models that can simulate realistic patient symptom trajectories. These models are learned from real-world patient data and, when properly constructed and validated, can be paired with reinforcement learning algorithms to identify optimal treatment policies. Some of his past work focuses on designing new methods for building these environment models, while other work applies them to current healthcare challenges. He works under Dr. Anahita Khojandi and is scheduled to graduate in summer 2021.
Effective treatment of Parkinson’s disease (PD) is a continual challenge for healthcare providers, given patients’ unique symptom patterns and barriers to accessing neurology care. Providers can thus benefit from leveraging emerging technologies to supplement traditional clinic care. We develop a data-driven reinforcement learning (RL) framework to optimize PD medication regimens using data from wearable sensors. We leverage a dataset of 26 PD patients who wore wrist-mounted movement trackers for two separate six-day periods; after the first wear period, physicians evaluated the collected data and modified patients’ medication regimens. We showcase the effectiveness of our RL framework as a decision support tool that prescribes medication regimens and regimen changes to support and augment clinical decision making. To do so, we first build and validate a simulation model of how individual patients’ movement symptoms respond to medication administration. We then pair this simulation model with an on-policy RL algorithm that recommends optimal medication types, timing, and dosages during the day, while incorporating human-in-the-loop considerations on medication administration. The results show that the patient-specific RL-prescribed medication regimens outperform the physician-updated medication regimens, despite physicians having access to the same data as the RL agent. The resulting patient-specific RL policies provide clinically defensible treatment strategies. By re-training the RL policies with different dosing frequencies, we also identify patients who may benefit from alternative therapies (such as continuous-administration medication pumps) and those who could switch to lower-frequency regimens without sacrificing symptom improvement.