In this edition of the DIEP seminar series, Harsha Honnappa, Associate Professor, School of Industrial Engineering, Purdue University and IAS fellow, will present a theoretical framework for a definition of differential systems that model reinforcement learning or simulation-based control in continuous time non-Markovian rough environments.
Event details of Pathwise Relaxed Optimal Control of Non-Markovian Systems
Date
11 April 2024
Time
11:00 - 12:00
Room
Library

Title

Pathwise Relaxed Optimal Control of Non-Markovian Systems

Abstract

This talk presents a theoretical framework for a definition of differential systems that model reinforcement learning or simulation-based control in continuous-time, non-Markovian rough environments. The desideratum for such a framework arises, in part, from rare event estimation for non-Markovian stochastic systems, for instance in the Freidlin-Wentzell small-noise setting. Specifically, I will focus on optimal relaxed control of rough equations (the term "relaxed" refers to the fact that controls have to be considered as measure-valued objects).

In a general context, our contribution focuses on a careful definition of the corresponding relaxed Hamilton-Jacobi-Bellman (HJB)-type equation. A substantial part of our endeavor consists in a precise definition of the notion of test function and viscosity solution for the rough relaxed PDE obtained in this framework. Note that this task is often merely sketched in the rough viscosity literature, in spite of the fact that it gives a proper meaning to the differential system at stake. We show that a natural value function solves a rough HJB equation in the viscosity sense. With reinforcement learning in view, our reward functions encompass forms that involve an entropy-type term favoring exploration. I will demonstrate that, in this setting, closed-form expressions for the optimal relaxed control are obtainable. 

This talk is based on joint work with Prakash Chakraborty at Penn State and Samy Tindel at Purdue. This project is supported by the National Science Foundation through grant DMS/2153915.

If you wish to attend this seminar online, please send an email to w.merbis@uva.nl to receive the Zoom link.