Why does Keras give validation accuracies with way more decimal places than realistically needed?

I’m working with the MNIST handwritten digit dataset. It has 60k images in the training set and 10k images in the validation set. I get the validation accuracy like this: history = model.fit(trainX, trainY, epochs=10, batch_size=32, validation_data=(testX, testY)) and then val_accuracy = history.history['val_accuracy']. So how is it possible that I’m getting values like 0.9999666810035706? 1) There are…
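A quick way to see that the extra digits carry no information: with a 10k-image validation set the exact accuracy can only be a multiple of 1/10000, so anything past roughly four decimal places is float32 arithmetic noise from Keras’s running-mean metric. A minimal sketch, reusing `model`, `history`, `testX` and `testY` from the question and assuming integer class labels:

```python
import numpy as np

# With 10,000 validation images the true accuracy is (# correct) / 10000,
# so it is exact to four decimal places; Keras just reports the float32 value.
preds = np.argmax(model.predict(testX), axis=1)
# assumes testY holds integer labels (use np.argmax(testY, axis=1) if one-hot)
exact_acc = (preds == testY).mean()
print(f"exact: {exact_acc:.4f}")

# Rounding the values Keras stores gives the same information.
val_accuracy = [round(float(a), 4) for a in history.history["val_accuracy"]]
```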

NoisyNet DQN with default parameters not exploring

I implemented a DQN algorithm that plays OpenAI’s CartPole environment. The network consists of three ordinary linear layers that encode the state, and one noisy linear layer that predicts the Q-values from the encoded state. My NoisyLinear layer looks like this: class NoisyLinear(nn.Module): def __init__(self, in_features, out_features): super(NoisyLinear, self).__init__() self.in_features = in_features…
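The snippet above is cut off, so for reference here is a minimal, self-contained sketch of a factorised-Gaussian NoisyLinear layer in the style of the NoisyNet paper (Fortunato et al., 2017). The sigma_init default and method names are illustrative, not the asker’s actual code; if exploration stalls with default parameters, one common culprit is forgetting to resample the noise between updates.

```python
import math
import torch
import torch.nn as nn
import torch.nn.functional as F

class NoisyLinear(nn.Module):
    """Factorised-Gaussian noisy linear layer (NoisyNet-style sketch)."""

    def __init__(self, in_features, out_features, sigma_init=0.5):
        super().__init__()
        self.in_features = in_features
        self.out_features = out_features
        self.sigma_init = sigma_init
        # Learnable means and standard deviations for weights and biases.
        self.weight_mu = nn.Parameter(torch.empty(out_features, in_features))
        self.weight_sigma = nn.Parameter(torch.empty(out_features, in_features))
        self.bias_mu = nn.Parameter(torch.empty(out_features))
        self.bias_sigma = nn.Parameter(torch.empty(out_features))
        # Noise buffers: not learned, resampled via reset_noise().
        self.register_buffer("weight_eps", torch.empty(out_features, in_features))
        self.register_buffer("bias_eps", torch.empty(out_features))
        self.reset_parameters()
        self.reset_noise()

    def reset_parameters(self):
        bound = 1.0 / math.sqrt(self.in_features)
        self.weight_mu.data.uniform_(-bound, bound)
        self.bias_mu.data.uniform_(-bound, bound)
        self.weight_sigma.data.fill_(self.sigma_init / math.sqrt(self.in_features))
        self.bias_sigma.data.fill_(self.sigma_init / math.sqrt(self.in_features))

    @staticmethod
    def _scaled_noise(size):
        # f(x) = sign(x) * sqrt(|x|), as used for factorised noise.
        x = torch.randn(size)
        return x.sign() * x.abs().sqrt()

    def reset_noise(self):
        eps_in = self._scaled_noise(self.in_features)
        eps_out = self._scaled_noise(self.out_features)
        self.weight_eps.copy_(torch.outer(eps_out, eps_in))
        self.bias_eps.copy_(eps_out)

    def forward(self, x):
        if self.training:
            weight = self.weight_mu + self.weight_sigma * self.weight_eps
            bias = self.bias_mu + self.bias_sigma * self.bias_eps
        else:
            weight, bias = self.weight_mu, self.bias_mu
        return F.linear(x, weight, bias)
```

During training you would call reset_noise() on each noisy layer after every optimiser step (or forward pass) so the exploration noise keeps changing; with a fixed noise sample the agent behaves almost greedily.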

What are some ways to quickly evaluate the potential of a given NN architecture?

Main question: Is there some way we can leverage general knowledge of how certain hyperparameters affect performance to very rapidly get some sort of estimate of how good a given architecture could be? Elaboration: I’m working on a handwritten character recognition problem using CNNs. I want to try out a few different architectures (mostly at…
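One common low-cost proxy (not an exhaustive answer to the question) is to train each candidate for a single epoch on a small data subset and rank the architectures before committing to full runs. A rough sketch, assuming Keras models compiled with an accuracy metric; the candidate dictionary and subset size are hypothetical placeholders:

```python
def screen_architectures(candidates, trainX, trainY, valX, valY, n_train=5000):
    """Rank candidate architectures with a one-epoch run on a data subset."""
    scores = {}
    for name, build_model in candidates.items():  # build_model() -> compiled Keras model
        model = build_model()
        model.fit(trainX[:n_train], trainY[:n_train],
                  epochs=1, batch_size=32, verbose=0)
        _, acc = model.evaluate(valX, valY, verbose=0)
        scores[name] = acc
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)
```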

Is it normal to see oscillations in tested hyperparameters during Bayesian optimisation?

I’ve been trying out Bayesian hyperparameter optimisation (with TPE) on a simple CNN applied to the MNIST handwritten digit dataset. I noticed that over iterations of the optimisation loop, the tested parameters appear to oscillate slowly. (Plots of the sampled learning rate and momentum over iterations are omitted here.) I won’t add a graph, but the batch size is also sampled…
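For what it’s worth, this kind of slow drift is easy to reproduce and inspect by logging every trial’s sampled values. A sketch using Optuna’s TPESampler (the question may use a different TPE library; train_and_evaluate is a hypothetical stand-in for the CNN training loop returning validation loss):

```python
import optuna

def objective(trial):
    lr = trial.suggest_float("lr", 1e-4, 1e-1, log=True)
    momentum = trial.suggest_float("momentum", 0.5, 0.99)
    batch_size = trial.suggest_categorical("batch_size", [32, 64, 128])
    # Hypothetical helper: trains the CNN and returns validation loss.
    return train_and_evaluate(lr, momentum, batch_size)

study = optuna.create_study(sampler=optuna.samplers.TPESampler(seed=0))
study.optimize(objective, n_trials=100)

# Plot trial number vs. params_lr / params_momentum to see how the sampled
# values move as TPE trades off exploration against exploitation.
df = study.trials_dataframe()
print(df[["number", "params_lr", "params_momentum", "params_batch_size"]].head())
```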
