Commit e5e951f1 authored by David M. Rogers

Improved documentation.

parent 46f5b5e4
......@@ -7,10 +7,13 @@ runs at large scale.
It requires several subsystems that work together in tandem:
* Pre-requisites (not included):
* a fast docking code
* a Slurm Cluster
* a `dwork` server network-accessible from compute nodes
* either a shared filesystem or a gcloud bucket
* A fast docking code
- we recommend https://github.com/jvermaas/AutoDock-GPU/tree/relicensing
- or https://github.com/scottlegrand/AutoDock-GPU
* A Slurm Cluster
* A Redis server network-accessible from compute nodes
* Either a shared filesystem or a gcloud bucket
- You implement your file source/destination inside `config.py`.
* Internal machinery:
* A `rules.yaml` file listing how to run each docking step.
......@@ -19,14 +22,38 @@ It requires several subsystems that work together in tandem:
script from `rules.yaml`
## Interacting With the Work Queue Database
- db scanning and analysis steps
- user-level database interaction activities
- backup/checkpoint
- status query
- load/reset/dump
To run the process, customize one of the Slurm/LSF
batch job templates, then submit it.
The LSF template (`run_docking.lsf`) starts
up the database during the job. The Slurm templates
(like `docker.sh`) rely on a pre-existing database.
Either way, the jobs call `loadem.py` in parallel.
That script talks to the Redis DB number listed in `rules.yaml`
and operates on 3 Redis sets:
* ready: task strings ready to run
- The task strings are arbitrary, but are split by
whitespace into tokens to fill out "params" in `rules.yaml`.
* errors: task strings resulting in nonzero return value
* hosts: list of hosts currently online
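
The whitespace splitting described for `ready` task strings can be sketched as below. This is only an illustration, not the actual `loadem.py` logic: the function name `fill_params`, the example parameter names, and the strict length check are assumptions.

```python
def fill_params(task, param_names):
    """Split a task string on whitespace and pair each token with a parameter name.

    `param_names` stands in for the "params" list of a rule in `rules.yaml`
    (hypothetical layout; check your own rules file).
    """
    tokens = task.split()  # splits on any run of whitespace
    if len(tokens) != len(param_names):
        raise ValueError(
            f"task {task!r} has {len(tokens)} tokens, "
            f"expected {len(param_names)}"
        )
    return dict(zip(param_names, tokens))


# Hypothetical task string and parameter names, for illustration only.
params = fill_params("ligand_0042 receptorA", ["ligand", "receptor"])
```

Because tokens are separated by whitespace, individual task fields in this sketch cannot themselves contain spaces.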
As a user, you need to load up the database
with task strings, then check the remaining tasks (`ready`)
and the `errors` set after each run.
Of course, you can use Redis' standard mechanisms
for backing up the database.
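
The user-side steps above (loading task strings, then checking what is left) could look roughly like this with the `redis-py` client. It is a minimal sketch under assumptions: the helper names `connect`, `load_tasks`, and `status` are hypothetical, the host, port, and DB number are placeholders, and only the set names `ready`, `errors`, and `hosts` come from the description above.

```python
def connect(host="localhost", port=6379, db=0):
    """Open a connection; host, port and db here are placeholder values."""
    import redis  # redis-py, imported lazily so the helpers below work without it

    return redis.Redis(host=host, port=port, db=db, decode_responses=True)


def load_tasks(client, tasks):
    """Add non-empty task strings to the `ready` set; return how many were new."""
    cleaned = [t.strip() for t in tasks if t.strip()]
    if not cleaned:
        return 0  # SADD requires at least one member
    return client.sadd("ready", *cleaned)


def status(client):
    """Summarize remaining tasks, failed tasks, and hosts currently online."""
    return {
        "ready": client.scard("ready"),
        "errors": sorted(client.smembers("errors")),
        "hosts": sorted(client.smembers("hosts")),
    }
```

A typical session would call `client = connect(db=N)` with `N` matching the DB number in `rules.yaml`, then `load_tasks(client, open("tasks.txt"))` before submitting jobs and `status(client)` afterwards; `tasks.txt` is a hypothetical file with one task string per line.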
## Rescoring Setup
The rescoring scripts read multiple docked-ligand parquet
files and combine all the results into a single parquet file.
This makes a good naming scheme important. You can check
`rescore.py` to see the scheme we're using now, but we plan
to make this simpler in the future.
### Python/Anaconda Environment Setup
* Python Packages
......