Split image build from tests
We currently build this image six times simultaneously to run tests that all use the same Dockerfile. Instead, we should build it once and have each test job pull the resulting image. The simultaneous builds seem to fail regularly due to PyPI rate limits, so building once should cut the number of PyPI downloads and will likely make the tests more consistent.