I’m trying to merge several convolutions into a single convolution using the associative property of multidimensional discrete convolution (Conv2D in Keras/TensorFlow). This property states that:

$$Y=(x \ast\ast h_1) \ast\ast h_2 = x \ast\ast (h_1 \ast\ast h_2)$$

where $h_1$ and $h_2$ are the filters. So far I have managed to merge two Conv2D layers into one: first convolving $h_1$ and $h_2$, then convolving the result with $x$. As long as the convolution is a purely linear operation, this works without problems. The problem arises when both Conv2D layers have an activation function (if only the last layer has an activation function, convolving the filters first still works), for example:

$$Y_1=ReLU(x \ast \ast h_1)$$

$$Y_2=ReLU(Y_1 \ast\ast h_2)$$
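
To make the three cases concrete, here is a minimal NumPy sketch. It assumes single-channel inputs and a true "full" convolution; the helper `conv2d_full` and all variable names are made up for illustration and are not part of Keras. Keras `Conv2D` actually computes a cross-correlation and sums over channels, so this only illustrates the algebra and is not a drop-in replacement for the layers.

```python
import numpy as np

def conv2d_full(a, b):
    """Full 2D discrete convolution of two single-channel 2D arrays."""
    ah, aw = a.shape
    bh, bw = b.shape
    out = np.zeros((ah + bh - 1, aw + bw - 1))
    # Each tap b[i, j] adds a shifted, scaled copy of a to the output.
    for i in range(bh):
        for j in range(bw):
            out[i:i + ah, j:j + aw] += b[i, j] * a
    return out

def relu(z):
    return np.maximum(z, 0.0)

rng = np.random.default_rng(0)
x = rng.standard_normal((8, 8))
h1 = rng.standard_normal((3, 3))
h2 = rng.standard_normal((3, 3))
h12 = conv2d_full(h1, h2)  # merged 5x5 filter h1 ** h2

# Case 1: purely linear layers, associativity holds.
linear_ok = np.allclose(conv2d_full(conv2d_full(x, h1), h2),
                        conv2d_full(x, h12))

# Case 2: activation only after the last layer, merging still works
# because the ReLU is applied to the same linear result on both sides.
last_only_ok = np.allclose(relu(conv2d_full(conv2d_full(x, h1), h2)),
                           relu(conv2d_full(x, h12)))

# Case 3: activation after the first layer too. A 1x1 counterexample:
x1 = np.array([[1.0]])
g1 = np.array([[-1.0]])
g2 = np.array([[-1.0]])
two_layer = relu(conv2d_full(relu(conv2d_full(x1, g1)), g2))  # ReLU(ReLU(-1) * -1)
merged = relu(conv2d_full(x1, conv2d_full(g1, g2)))           # ReLU(1 * ((-1) * (-1)))

print(linear_ok, last_only_ok, two_layer[0, 0], merged[0, 0])
```

In the third case the inner ReLU zeroes the negative intermediate value before the second filter sees it, so the two-layer output is 0 while the merged filter gives 1. This is the situation I'm stuck on.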

Is it really impossible to apply the associative property if the first layer or both layers have an activation function? Any ideas, related papers, or alternative approaches?