I’m reading an article on reinforcement learning, and I don’t understand why the agent’s policy $\pi$ is not part of the definition of a Markov decision process (MDP): Buşoniu, Lucian, Robert Babuška, and Bart De Schutter. “A comprehensive survey of multiagent reinforcement learning.” IEEE Transactions on Systems, Man, and Cybernetics, Part C (Applications and Reviews) 38.2 (2008): 156-172. […]
