### Formulation of a Markov Decision Problem

We are given a list of $N$ questions. If question $i$ is answered correctly (which happens with probability $p_i$), we receive reward $R_i$; if not, the quiz terminates. Find the order of questions that maximizes the expected reward. (Hint: the optimal policy has an “index form”.)

I am fairly new to Reinforcement Learning and Markov Decision Problems (MDP). I am…
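
For intuition, the objective in the problem statement can be checked numerically on a small instance. The sketch below is not part of the original question and rests on several assumptions: rewards already earned are kept when the quiz terminates, every $p_i < 1$, and the "index" referred to in the hint is $p_i R_i / (1 - p_i)$, which is what a pairwise interchange argument on adjacent questions suggests. The function names and the example values of `p` and `R` are hypothetical.

```python
from itertools import permutations

def expected_reward(order, p, R):
    """Expected total reward when questions are asked in the given order.

    Assumes reward R[i] is collected only if question i and all earlier
    questions are answered correctly, and earned rewards are kept.
    """
    total = 0.0
    survive = 1.0  # probability that the quiz has not terminated yet
    for i in order:
        survive *= p[i]
        total += survive * R[i]
    return total

def brute_force_order(p, R):
    """Try all N! orderings; only feasible for small N."""
    return max(permutations(range(len(p))),
               key=lambda order: expected_reward(order, p, R))

def index_order(p, R):
    """Sort by the assumed index p_i * R_i / (1 - p_i), largest first.

    Requires p[i] < 1 for every question.
    """
    return sorted(range(len(p)),
                  key=lambda i: p[i] * R[i] / (1.0 - p[i]),
                  reverse=True)

# Hypothetical example data, chosen only for illustration.
p = [0.9, 0.5, 0.8]
R = [1.0, 10.0, 3.0]

best = brute_force_order(p, R)
by_index = index_order(p, R)
print(list(best), by_index, expected_reward(by_index, p, R))
```

On this instance the brute-force search and the index rule pick the same ordering, which is consistent with the hint but of course not a proof. In MDP terms, one natural formulation (again an assumption) takes the state to be the set of questions not yet asked plus a terminal state, and the action to be the next question to ask.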