Learning with Abandonment

Consider a demand response provider that wants to learn a personalized policy for each user, but faces the risk that a user abandons the platform if she is dissatisfied with its actions. For example, the platform may want to personalize thermostat control for a user, but the user may unsubscribe permanently if she is mistreated. We propose a general thresholded learning model for such scenarios and discuss the structure of optimal policies. We describe salient features of optimal personalization algorithms and how the feedback the platform receives affects the results. Furthermore, we investigate how the platform can efficiently learn the heterogeneity across users by interacting with a population, and we provide performance guarantees.

This paper appeared at the 2018 International Conference on Machine Learning (ICML).

License: CC-2.5
Submitted by Alexis Rodriguez on