Draft: Work on integrating RL framework into latest develop
Adds a train-rl mode, i.e., raps train-rl ... This includes a training environment (raps/envs/raps_env.py), raps/train_rl.py, and raps/schedulers/rl.py. The version on the rl branch works, but the code base has changed a lot since then. raps_env.py instantiates the Engine itself, since training has to run many small simulations with different inputs to learn an optimal policy. Requesting some help from Jesse to get this working on the latest version of develop.
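
For reviewers unfamiliar with the pattern, here is a minimal sketch of an environment that owns its simulator and rebuilds it for every episode. Everything in it is an assumption for illustration: ToyEngine, ToySchedulingEnv, the observation, and the reward are hypothetical stand-ins and do not reflect the real Engine API or the actual contents of raps_env.py.

```python
import random
from dataclasses import dataclass, field


@dataclass
class ToyEngine:
    """Hypothetical stand-in for a simulator; NOT the real RAPS Engine."""
    job_lengths: list
    time: int = 0
    queue: list = field(default_factory=list)

    def __post_init__(self):
        # Copy the inputs so each engine instance has its own queue.
        self.queue = list(self.job_lengths)

    def run_job(self, index):
        """Run one queued job; return the wait it adds for the jobs still queued."""
        length = self.queue.pop(index)
        self.time += length
        return length * len(self.queue)


class ToySchedulingEnv:
    """Gym-style environment: reset() builds a fresh engine, step() advances it."""

    def __init__(self, num_jobs=4, seed=0):
        self.num_jobs = num_jobs
        self.rng = random.Random(seed)
        self.engine = None

    def reset(self):
        # Each episode is a new small simulation with newly sampled inputs.
        lengths = [self.rng.randint(1, 10) for _ in range(self.num_jobs)]
        self.engine = ToyEngine(lengths)
        return tuple(self.engine.queue)

    def step(self, action):
        # Action: index of the queued job to run next.
        # Reward: negative wait time added to the remaining jobs.
        added_wait = self.engine.run_job(action)
        done = not self.engine.queue
        return tuple(self.engine.queue), -added_wait, done


def run_episode(env, policy):
    """Roll out one episode and return the total reward."""
    obs = env.reset()
    total, done = 0, False
    while not done:
        obs, reward, done = env.step(policy(obs))
        total += reward
    return total


def shortest_first(obs):
    # Baseline policy: pick the shortest queued job.
    return min(range(len(obs)), key=obs.__getitem__)


def first_come(obs):
    # Baseline policy: always run the job at the head of the queue.
    return 0
```

A training loop would call run_episode many times with a learned policy in place of the two baselines. Because the engine is rebuilt inside reset(), any change to how the real Engine is constructed on develop would most likely only need to be absorbed there.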