Do L2 regularization and input normalization depend on sigmoid activation functions?

I am following Andrew Ng's online courses, in which he talks about L2 regularization (a.k.a. weight decay) and input normalization. The argument is that L2 regularization makes the weights smaller, which makes the sigmoid activation functions (and thus the whole network) "more" linear. Question 1: can this rather handwavy explanation be formalized? Can we define "more…
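
To make the intuition concrete, here is a minimal numpy sketch (my own illustration, not from the course): it compares the sigmoid to its tangent-line approximation at zero, sigma(z) ≈ 1/2 + z/4. With smaller weights the pre-activation z stays closer to zero, where the two curves nearly coincide, so the unit behaves almost linearly.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def sigmoid_linearized(z):
    # First-order Taylor expansion of the sigmoid at z = 0: sigmoid(z) ≈ 1/2 + z/4
    return 0.5 + z / 4.0

rng = np.random.default_rng(0)
x = rng.normal(size=100)  # normalized inputs (zero mean, unit variance)

# Shrink the weight scale, as L2 regularization encourages, and watch the
# sigmoid output approach its linear approximation.
for weight_scale in [1.0, 0.1, 0.01]:
    w = weight_scale * rng.normal(size=100)
    z = w @ x  # pre-activation of a single sigmoid unit
    err = abs(sigmoid(z) - sigmoid_linearized(z))
    print(f"weight scale {weight_scale:>5}: z = {z:+.4f}, "
          f"deviation from linear approx = {err:.2e}")
```

This only illustrates the direction of the effect for a single unit; it is not a formalization of "more linear", which is exactly what the question asks about.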