In machine learning, how can we overcome the restrictive nature of conjunctive space?

In machine learning, a problem space can be represented through concept space, instance space, version space, and hypothesis space. These representations rely on conjunctive space, which is very restrictive, and it is not guaranteed that the true concept lies within the conjunctive space. So, let's say, if…
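To make the restriction concrete, here is a minimal sketch (all names and data are illustrative, not from the question) of a conjunctive hypothesis over boolean attributes. A disjunctive target concept such as "A or B" cannot be captured by any single conjunction: the most specific conjunction consistent with both positive examples is the all-wildcard hypothesis, which then misclassifies the negative example.

```python
# Hypothetical illustration: conjunctive hypotheses over boolean attributes.
# constraints maps attribute -> required value, where '?' means "any value".

def conjunctive_hypothesis(constraints, example):
    """True iff the example satisfies every attribute constraint."""
    return all(v == '?' or example[k] == v for k, v in constraints.items())

# Target concept "A or B" is disjunctive, so it lies outside conjunctive space.
examples = [
    ({'A': 1, 'B': 0}, 1),  # positive
    ({'A': 0, 'B': 1}, 1),  # positive
    ({'A': 0, 'B': 0}, 0),  # negative
]

# The only conjunction covering both positives is all-wildcards, and it
# wrongly accepts the negative example as well.
h = {'A': '?', 'B': '?'}
predictions = [conjunctive_hypothesis(h, x) for x, _ in examples]
print(predictions)  # [True, True, True]
```

This is exactly the sense in which the true concept may not lie within the conjunctive space.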

Reinforcement Learning Continuous Control (DDPG): How to avoid thrashing of issued actions? How to reward smooth output over jittering?

Currently I’m working on a continuous-state / continuous-action controller. It must control a certain roll angle of an aircraft by issuing the correct aileron commands (continuous, between -1 and 1). To this end I use a neural network and the DDPG algorithm, which shows promising results after some 20 minutes of training. I stripped down…
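One common way to discourage thrashing is to shape the reward with a penalty on the change between consecutive actions. The sketch below assumes illustrative names and untuned coefficients; it is not the questioner's actual reward function.

```python
# Sketch: reward shaping that penalizes action "thrashing" in addition to
# the roll-angle tracking error. w_error and w_smooth are illustrative weights.

def shaped_reward(roll_error, action, prev_action, w_error=1.0, w_smooth=0.1):
    """Base tracking reward minus a penalty on consecutive-action change."""
    tracking = -w_error * roll_error ** 2
    smoothness = -w_smooth * (action - prev_action) ** 2
    return tracking + smoothness

# A jittery aileron sequence scores worse than a smooth one with the same
# tracking error, so the agent is pushed toward smooth output.
smooth = sum(shaped_reward(0.1, a, p) for a, p in zip([0.2, 0.2], [0.2, 0.2]))
jitter = sum(shaped_reward(0.1, a, p) for a, p in zip([1.0, -1.0], [-1.0, 1.0]))
print(smooth, jitter)
```

The relative size of `w_smooth` trades off responsiveness against smoothness and would need tuning per task.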

Why is my model accuracy high in a train-test split but worse than chance on the validation set?

I have trained an XGBoost model to predict survival for the Kaggle Titanic ML competition. As with all Kaggle competitions, there is a train dataset with the target variable included and a test dataset without the target variable, which Kaggle uses to compute the final accuracy score that determines your leaderboard ranking. My…
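A gap like this is often a sign that a single lucky train-test split inflated the estimate; k-fold cross-validation on the labeled training data gives a more honest number. The sketch below uses a pure-Python fold split and placeholder `train_model` / `accuracy` callables standing in for the XGBoost fit and score calls, purely for illustration.

```python
import random

# Sketch: k-fold cross-validation so no single split can inflate the
# accuracy estimate. train_model / accuracy are placeholders for the
# real XGBoost fit/score steps.

def kfold_indices(n, k, seed=0):
    """Shuffle indices deterministically and deal them into k folds."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]

def cross_val_accuracy(X, y, k, train_model, accuracy):
    folds = kfold_indices(len(X), k)
    scores = []
    for fold in folds:
        val = set(fold)
        X_tr = [X[j] for j in range(len(X)) if j not in val]
        y_tr = [y[j] for j in range(len(X)) if j not in val]
        model = train_model(X_tr, y_tr)
        scores.append(accuracy(model,
                               [X[j] for j in fold],
                               [y[j] for j in fold]))
    return sum(scores) / k

# Toy demo: a model whose decision rule matches the data exactly,
# so every fold scores 1.0.
X = [-2, -1, 0, 1, 2, 3]
y = [0, 0, 0, 1, 1, 1]
train_model = lambda X_tr, y_tr: (lambda x: 1 if x > 0 else 0)
accuracy = lambda m, Xv, yv: sum(m(x) == t for x, t in zip(Xv, yv)) / len(yv)

cv_acc = cross_val_accuracy(X, y, 3, train_model, accuracy)
print(cv_acc)  # 1.0
```

If the cross-validated score is much lower than the single-split score, the single split was misleading rather than the leaderboard.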

Is a neural network able to optimize itself for speed?

I am experimenting with OpenAI Gym and reinforcement learning. As far as I understand, the environment waits for the agent to make a decision, so it’s a sequential operation like this:

    decision = agent.decide(state)
    state, reward, done = environment.act(decision)
    agent.train(state, reward)

Doing it in this sequential way, the Markov property is fulfilled: the new…
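The sequential loop described above can be made concrete with toy stand-ins for the agent and a Gym-style environment (all class and method names here are illustrative, mirroring the pseudocode in the question rather than the real Gym API):

```python
# Minimal sketch of the sequential agent/environment loop: each side
# blocks until the other has finished its step.

class ToyEnv:
    """Environment whose next state depends only on state and action."""
    def __init__(self, horizon=3):
        self.state, self.horizon = 0, horizon

    def act(self, decision):
        self.state += decision              # transition
        done = self.state >= self.horizon   # episode termination
        return self.state, 1.0, done        # (state, reward, done)

class ToyAgent:
    def decide(self, state):
        return 1                            # always step forward

    def train(self, state, reward):
        pass                                # learning-step placeholder

env, agent = ToyEnv(), ToyAgent()
state, done, steps = 0, False, 0
while not done:
    decision = agent.decide(state)              # env waits on the agent
    state, reward, done = env.act(decision)     # agent waits on the env
    agent.train(state, reward)
    steps += 1
print(steps)  # 3
```

Because each call blocks the other side, any time the agent spends deciding or training directly stretches the wall-clock length of an episode, which is why decision speed matters in this setup.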