In deep learning, is it possible to use discontinuous activation functions?

In deep learning, is it possible to use discontinuous activation functions (e.g., one with a jump discontinuity)? (My guess: for example, ReLU is non-differentiable at a single point, but it still has a well-defined derivative everywhere else and a usable subgradient at that point. If an activation function has a jump discontinuity, then its derivative would have to contain a delta function at that point. …
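To make the guess above concrete, here is a minimal sketch (assuming PyTorch; `torch.sign` stands in for a generic jump-discontinuous activation) of what automatic differentiation actually does in the two cases: at ReLU's kink the framework simply picks a subgradient, while for a jump-discontinuous function the gradient is defined pointwise rather than as a distribution, so it is just zero.

```python
import torch

# ReLU: non-differentiable only at x = 0, but autograd picks a subgradient
# there (PyTorch defines relu'(0) = 0), so backprop proceeds without issue.
x = torch.tensor(0.0, requires_grad=True)
torch.relu(x).backward()
print(x.grad)  # tensor(0.)

# sign: has a jump discontinuity at 0. Autograd works with pointwise
# derivatives, not distributions, so the gradient is defined as 0 everywhere;
# no delta "spike" appears and no learning signal flows through the jump.
y = torch.tensor(0.5, requires_grad=True)
torch.sign(y).backward()
print(y.grad)  # tensor(0.)
```

In other words, the delta-function term never materialises in backpropagation; what you get instead is a gradient that is zero almost everywhere, which is the practical obstacle with jump-discontinuous activations.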

Details