### Formulation of a Markov Decision Problem

We are given a list of $N$ questions. If question $i$ is answered correctly (which happens with probability $p_i$), we receive reward $R_i$; if not, the quiz terminates. Find the order of questions that maximizes the expected reward. (Hint: the optimal policy has an “index form”.) I am fairly new to Reinforcement Learning and Markov Decision Problems (MDP). I am…
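For readers who want to experiment with the problem as stated, here is a minimal sketch. It assumes rewards accumulate until the first wrong answer, and it assumes the index $p_i R_i / (1 - p_i)$, which comes from a standard pairwise-interchange argument and is not given in the question itself. The function names and the sample numbers are hypothetical. The sketch compares the index ordering against a brute-force search over all orderings.

```python
from itertools import permutations

def expected_reward(order, p, r):
    """Expected total reward when questions are asked in the given order."""
    total, survive = 0.0, 1.0
    for i in order:
        survive *= p[i]            # probability that questions up to i are all correct
        total += survive * r[i]    # reward R_i is collected only in that case
    return total

def index_order(p, r):
    """Sort questions by the assumed index p_i * R_i / (1 - p_i), largest first."""
    def index(i):
        # A question that is always answered correctly can safely go first.
        return float("inf") if p[i] >= 1.0 else p[i] * r[i] / (1.0 - p[i])
    return sorted(range(len(p)), key=index, reverse=True)

# Hypothetical example data, chosen only for illustration.
p = [0.9, 0.5, 0.7, 0.2]
r = [1.0, 10.0, 3.0, 50.0]

# Brute force over all N! orderings (feasible only for small N).
best = max(permutations(range(len(p))), key=lambda o: expected_reward(o, p, r))
by_index = index_order(p, r)
print(by_index, expected_reward(by_index, p, r))
print(list(best), expected_reward(best, p, r))
```

On this small instance the two orderings can be compared directly; the brute-force check is only a sanity test, not a proof that the index rule is optimal in general.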

### Why does Monte Carlo have high variance?

I was going through David Silver’s lecture on reinforcement learning (lecture 4). At 51:22 he says that Monte Carlo (MC) methods have high variance and zero bias. I understand the zero-bias part: it is because MC uses the true value of the value function for estimation. However, I don’t understand the high-variance part. Can someone…
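One way to see the variance gap numerically is the toy simulation below. It is only a sketch under stated assumptions: a hypothetical episode of fixed length with independent Gaussian rewards, no discounting, and a TD(0)-style target that is given the true value of the next state (in practice that value is an estimate, which is where TD's bias comes from). The MC target sums every noisy reward in the episode, while the one-step target contains only one noisy reward.

```python
import random
import statistics

def sample_targets(n_steps=20, n_episodes=5000, seed=0):
    """Return (mc_targets, td_targets) for the start state of a toy episode."""
    rng = random.Random(seed)
    mean_reward = 1.0
    mc_targets, td_targets = [], []
    for _ in range(n_episodes):
        # Each step pays a noisy reward with mean 1 and standard deviation 1.
        rewards = [rng.gauss(mean_reward, 1.0) for _ in range(n_steps)]
        # Monte Carlo target: the full undiscounted return of the episode.
        mc_targets.append(sum(rewards))
        # One-step target: first reward plus the (assumed known) true value
        # of the next state, which has n_steps - 1 rewards still to come.
        true_next_value = mean_reward * (n_steps - 1)
        td_targets.append(rewards[0] + true_next_value)
    return mc_targets, td_targets

mc, td = sample_targets()
print("MC mean/variance:", statistics.mean(mc), statistics.variance(mc))
print("TD mean/variance:", statistics.mean(td), statistics.variance(td))
```

Both targets have the same expected value here, but the MC target accumulates the noise of all 20 rewards, so its sample variance is roughly 20 times larger in this setup.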

### Possible model to use to find pixel locations of objects

I want to make a model that outputs the centre pixel of objects appearing in an image. My current method involves using a CNN with L2 loss to output an image of equivalent size to the input where each pixel has a value of 1 if it is the center of an object and 0…

### Need an explanation of the figure from the book by Sutton

I came across this graph in David Silver’s YouTube lecture and in Sutton’s book on reinforcement learning. Can anyone help me understand the graph?

### Using OpenAI’s Gym Java Client

I’m trying to use the Java port of OpenAI’s gym, as I’ve been using Java instead of Python. I’ve cloned the gym’s repo, which is a submodule of the “mono-repo”. Within my project, in IntelliJ, I set up a dependency on the gym project (source code), but nothing’s getting picked up. After opening the gym…

### Machine Learning Theory – PAC learnable and No Free Lunch Theorem contradiction

I am reading the book Understanding Machine Learning by Shalev-Shwartz and Ben-David. Based on the definitions of PAC learnability and the No Free Lunch Theorem, and on my understanding of them, the two seem to contradict each other. I know this is not the case and that I am wrong, but I just don’t know what I am…

### What’s the best architecture for time series prediction with a long dataset

I have to build a neural network, without any architecture limitations, that predicts the next value of a time series. The dataset consists of 400,000 values, which are given in hex format, e.g. 0xbfb22b14 0xbfb22b10 0xbfb22b0c 0xbfb22b18 0xbfb22b14. I think an LSTM is suitable for this problem, but I am worried about the length of…
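Whatever architecture is chosen, the hex strings first have to become numbers and then (input window, next value) pairs. The sketch below shows one possible way to do that on the five sample values from the question. It assumes the values are plain unsigned integers (the question does not say how they should be interpreted), and the helper names and the window length of 3 are hypothetical.

```python
def parse_hex_series(text):
    """Convert whitespace-separated hex tokens such as '0xbfb22b14' to ints."""
    return [int(token, 16) for token in text.split()]

def make_windows(series, window):
    """Build (input window, next value) pairs for next-step prediction."""
    return [
        (series[i:i + window], series[i + window])
        for i in range(len(series) - window)
    ]

sample = "0xbfb22b14 0xbfb22b10 0xbfb22b0c 0xbfb22b18 0xbfb22b14"
values = parse_hex_series(sample)

# First differences are often easier to model than the raw large integers.
deltas = [b - a for a, b in zip(values, values[1:])]
pairs = make_windows(values, window=3)

print(values)
print(deltas)
print(pairs)
```

Working with short fixed-length windows like this also sidesteps feeding all 400,000 values to a recurrent model as one sequence, though whether that suits this dataset is a judgment call.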

### Spiking Neural Network to ANN conversion

In many papers on artificial spiking neural networks (SNNs), their performance is not on par with that of traditional ANNs. I have read how some people have converted ANNs to SNNs using various techniques. There has been work on using unsupervised learning in SNNs to recognise MNIST digits through Spike-Timing-Dependent Plasticity. (Unsupervised…