Off-policy evaluation in reinforcement learning

The IPS estimator, which is used for off-policy evaluation in contextual bandit problems, is well explained in "Doubly Robust Policy Evaluation and Optimization" (https://arxiv.org/pdf/1503.02834.pdf). In the IPS estimator, the old policy $\mu$ (the behavior policy) is allowed to be non-stationary, whereas the new policy $\nu$ (the target policy) is required to be stationary. I wonder…
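For readers unfamiliar with the estimator being asked about, here is a minimal sketch of a common form of IPS, $\hat{V}_{\text{IPS}} = \frac{1}{n}\sum_{k=1}^{n} \frac{\nu(a_k \mid x_k)}{\mu_k(a_k \mid x_k)}\, r_k$, where $\mu_k$ is the behavior policy's probability of the logged action at round $k$. The function name `ips_estimate`, the tuple layout of the log, and the toy data are all assumptions made for illustration; they are not taken from the linked paper. Because each round stores its own logged propensity, the behavior policy can change between rounds, while the target policy is a single fixed function.

```python
from typing import Callable, Hashable, Sequence, Tuple

# One logged round: (context, action, reward, logged propensity).
# The propensity is mu_k(a_k | x_k): the probability the behavior policy
# gave to the logged action at the time it was taken, so mu may differ
# from round to round (non-stationary behavior policy).
LoggedRound = Tuple[Hashable, Hashable, float, float]


def ips_estimate(
    log: Sequence[LoggedRound],
    target_prob: Callable[[Hashable, Hashable], float],
) -> float:
    """Inverse propensity score estimate of a stationary target policy's value."""
    if not log:
        raise ValueError("log must contain at least one round")
    total = 0.0
    for context, action, reward, propensity in log:
        if propensity <= 0.0:
            raise ValueError("logged propensities must be positive")
        # Importance weight nu(a | x) / mu_k(a | x), applied to the observed reward.
        total += target_prob(context, action) / propensity * reward
    return total / len(log)


# Hypothetical toy log: the propensities drift over time.
toy_log = [
    ("x1", "a", 1.0, 0.5),
    ("x1", "b", 0.0, 0.5),
    ("x2", "a", 1.0, 0.25),
    ("x2", "b", 1.0, 0.8),
]


def always_a(context: Hashable, action: Hashable) -> float:
    """Deterministic, stationary target policy nu that always plays action 'a'."""
    return 1.0 if action == "a" else 0.0


estimate = ips_estimate(toy_log, always_a)
print(estimate)
```

Note that the estimate can exceed the largest observed reward when a logged propensity is small; this is the usual high-variance behavior of importance weighting, not a bug in the sketch.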

What evolutionary algorithms other than NEAT evolve both the topology and the weights of ANNs (TWEANN)?

I wonder whether there are approaches other than NEAT for evolving the architectures and weights of artificial neural networks. To be more specific, I am looking for projects, frameworks, or libraries other than NEAT that use evolutionary or genetic algorithms to simultaneously evolve the topology and train the weights of ANNs.