How to perform interpretability analysis on a simple reinforcement learning network

We are currently using an RL network with the following simple structure to train a model that helps solve a transformation task: environment (a binary file) + reward —> LSTM (embedding) —> FC layer —> FC layer —> FC layer —> decision (to select and apply a kind of transformation to the environment from…
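To make the setup concrete, here is a minimal PyTorch sketch of the pipeline as described (LSTM over the embedded observation plus reward, then three FC layers producing a decision over candidate transformations). The dimensions, class name, and action count are hypothetical placeholders, not taken from the actual model:

```python
# Hypothetical sketch of the described policy network; obs_dim, hidden_dim,
# and n_actions are placeholder values, not the real model's sizes.
import torch
import torch.nn as nn

class TransformPolicy(nn.Module):
    def __init__(self, obs_dim=64, hidden_dim=128, n_actions=10):
        super().__init__()
        # LSTM consumes the embedded environment observation plus the reward
        self.lstm = nn.LSTM(obs_dim + 1, hidden_dim, batch_first=True)
        # Three fully connected layers, as in the description
        self.fc1 = nn.Linear(hidden_dim, hidden_dim)
        self.fc2 = nn.Linear(hidden_dim, hidden_dim)
        self.fc3 = nn.Linear(hidden_dim, n_actions)

    def forward(self, obs, reward, state=None):
        # obs: (batch, seq, obs_dim); reward: (batch, seq, 1)
        x = torch.cat([obs, reward], dim=-1)
        out, state = self.lstm(x, state)
        h = torch.relu(self.fc1(out))
        h = torch.relu(self.fc2(h))
        logits = self.fc3(h)  # scores over candidate transformations
        return logits, state

policy = TransformPolicy()
obs = torch.randn(2, 5, 64)     # batch of 2, sequence length 5
reward = torch.zeros(2, 5, 1)
logits, _ = policy(obs, reward)
print(logits.shape)  # torch.Size([2, 5, 10])
```

With the model expressed this way, standard attribution tools for PyTorch can be pointed at the FC layers or the LSTM hidden state as a starting point for interpretability analysis.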

Details