Alan Malek, Yinlam Chow, Mohammad Ghavamzadeh, Sumeet Katariya. Sequential Multiple Hypothesis Testing with Type I Error Control, to appear in Proceedings of AISTATS 2017.
Yasin Abbasi-Yadkori, Alan Malek, Peter Bartlett, Victor Gabillon. Hit-and-Run for Sampling and Planning in Non-Convex Spaces, to appear in Proceedings of AISTATS 2017.
Wojciech Kotłowski, Wouter M. Koolen, Alan Malek. Online Isotonic Regression, In Proceedings of The 29th Conference on Learning Theory, June 2016. pdf
Yasin Abbasi-Yadkori, Peter Bartlett, Xi Chen, Alan Malek. Large-Scale Markov Decision Problems with KL Control Cost and its Application to Crowdsourcing. In International Conference on Machine Learning (ICML), July 2015. pdf poster slides
Peter Bartlett, Wouter Koolen, Alan Malek, Eiji Takimoto and Manfred Warmuth. Minimax Fixed-Design Linear Regression. In Proceedings of The 28th Conference on Learning Theory, July 2015. pdf poster slides
Y. Abbasi-Yadkori, P. Bartlett, and A. Malek, Linear Programming for Large-Scale Markov Decision Problems, arXiv:1402.6763 [math.OC], 2014 pdf.
Minimax Strategies for Square Loss, Linear Regression, and Time-series Prediction. Machine Learning Seminar, MIT. August 2016. slides
Minimax Strategies for Square Loss Games. Artificial Intelligence and Reinforcement Learning Seminar, University of Alberta. July 2016. slides
Planning in Large-Scale Markov Decision Problems. CMU. June, 2016. slides
Sequential decision making: modeling how we interact with the world. Keynote, Research Symposium. Harker High School. April 2016. slides
Large-Scale Markov Decision Problems with KL Control Cost and its Application to Crowdsourcing. ICML. July 2015. slides
Linear Programming for Large-Scale Markov Decision Problems. ICML. July 2014. slides
My interests lie between optimization, machine learning, and randomized algorithms. Here is a short description of a few areas and projects:
Minimax Analysis in Online Learning
The major strength of most online learning algorithms (e.g. hedge, follow the perturbed leader) is that we can prove very strong guarantees on their performance. Typically, we have regret bounds that hold for any (potentially adversarial) sequence of inputs and often we have lower bounds that show that any algorithm can be made to suffer regret of the same order. However, in certain cases, we are able to tractably compute the exact minimax algorithm of certain online learning games. Here, I mean minimax in the game-theoretical sense (and not the statistics sense, which again is a statement about matching orders). Finding the minimax algorithm comes with the strongest regret guarantee: for any other algorithm, there is a data sequence that causes at least as much regret that the minimax algorithm would suffer. Specific problems I'm interested in:
Sequential Hypothesis Testing
In classical, Neyman-Pearson statistical hypothesis testing, a fixed sample size and decision region is calculated at the start of the test and no statistically correct conclusions can be made before the sample size is reached. This is because only the statistical deviation at is accounted for; there is no control over the rest of the statistic's sample path. Now if we get data that is much easier than expected, there is no way to adjust the sample size without loosing control on type I and type II errors. However, such statistics and decision boundaries do exist and have been known since at least Wald's SPRT (sequential probability ratio test).
Very Large-state space Markov Decision Processes
The planning problem for MDPs assumes full knowledge of the MDP dynamics and loss function and asks: what is the optimal policy? Traditional approaches, such as Value Iteration, scale quadratically in the number of states which becomes prohibitively expensive. If we restrict to some low dimensional family of policies, how well can we do? What conditions do we need to obtain algorithms that scale with the dimension of this family but not with the original state-space?
Applications of random Walks to learning and optimization
There is a celebrated thread of papers in Theoretical Computer Science about the volume of a convex body problem. Though exact volume computations are exponential in dimension, over the last two outcomes, increasingly efficient polynomial time algorithms have been developed which all rely on the ability to generate uniform samples with some random walk over convex sets. As a consequence, we can now use random walks to sample efficiently from more general distributions e.g. log-concave distributions, and as a further consequence, optimize some non-convex functions via simulated annealing.
I've previous spent a lot of time in a ceramics studio and helped found the Stanford ceramics studio. These days, when I'm not working on my degree, I'm likely to be climbing or cooking, or competing in powerlifting.