In deep learning, is it possible to use discontinuous activation functions?

In deep learning, is it possible to use discontinuous activation functions (e.g. one with a jump discontinuity)? (My guess: for example, ReLU is non-differentiable at a single point, but it still has a well-defined derivative everywhere else. If an activation function has a jump discontinuity, then its derivative is supposed to have a delta function at that point.…
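One common answer to the question above is the surrogate-gradient (straight-through) trick: use the discontinuous function in the forward pass, but replace the ill-defined derivative with a smooth stand-in during the backward pass. A minimal NumPy sketch (function names and the choice of surrogate are illustrative, not from any particular framework):

```python
import numpy as np

def heaviside_forward(x):
    # Forward pass: hard step function, discontinuous at 0.
    return (x > 0).astype(x.dtype)

def heaviside_backward_ste(x, grad_out):
    # Backward pass: the true derivative is a Dirac delta at 0,
    # so we substitute a surrogate gradient -- here the derivative
    # of a clipped identity (straight-through estimator), which is
    # 1 on |x| <= 1 and 0 outside.
    surrogate = (np.abs(x) <= 1.0).astype(x.dtype)
    return grad_out * surrogate

x = np.array([-2.0, -0.5, 0.5, 2.0])
y = heaviside_forward(x)                        # [0., 0., 1., 1.]
g = heaviside_backward_ste(x, np.ones_like(x))  # [0., 1., 1., 0.]
```

The same idea underlies training of binarized and spiking networks: the jump is kept at inference time, while learning uses the surrogate.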

Why is Multi-agent Deep Deterministic Policy Gradient (MADDPG) running slowly and using only 22% of the GPU?

I already asked this question on StackOverflow. I need to run the Distributed Multi-Agent Cooperation Algorithm based on MADDPG with prioritized batch data, with the number of agents increased to 12, but it takes a lot of time to train 3500 episodes. I have tried different settings but nothing is working.…

Off-policy evaluation in reinforcement learning

The IPS estimator, which is used for off-policy evaluation in a contextual bandit problem, is well explained here: Doubly Robust Policy Evaluation and Optimization, https://arxiv.org/pdf/1503.02834.pdf. The old policy $\mu$, or behavior policy, is allowed to be non-stationary in the IPS estimator, even though the new policy $\nu$, or target policy, should be stationary. I wonder…
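For context, the IPS (inverse propensity scoring) estimator mentioned above reweights each logged reward by the ratio of target- to behavior-policy probabilities: $\hat{V}_{\text{IPS}} = \frac{1}{n}\sum_{i=1}^{n} \frac{\nu(a_i \mid x_i)}{\mu(a_i \mid x_i)}\, r_i$. A minimal sketch with hypothetical logged data (the arrays below are made up for illustration):

```python
import numpy as np

def ips_estimate(rewards, target_probs, behavior_probs):
    # IPS value estimate of the target policy nu from data logged
    # under the behavior policy mu:
    #   V_hat = (1/n) * sum_i  nu(a_i|x_i) / mu(a_i|x_i) * r_i
    weights = target_probs / behavior_probs
    return np.mean(weights * rewards)

# Toy log: observed rewards and each policy's probability of the
# logged action (hypothetical values).
r  = np.array([1.0, 0.0, 1.0, 1.0])
nu = np.array([0.5, 0.2, 0.8, 0.4])    # target policy probs
mu = np.array([0.25, 0.4, 0.4, 0.8])   # behavior policy probs

print(ips_estimate(r, nu, mu))  # 1.125
```

The estimate is unbiased as long as $\mu$ assigns nonzero probability to every action the target policy can take, which is why the paper's doubly robust estimator adds a reward model to reduce the variance of these importance weights.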