AutoDock Vina
Description
AutoDock Vina is one of the fastest and most widely used open-source docking engines. It is a turnkey computational docking program based on a simple scoring function and a rapid gradient-optimization conformational search. It was originally designed and implemented by Dr. Oleg Trott in the Molecular Graphics Lab and is now maintained and developed by the Forli Lab at The Scripps Research Institute. The source code is available on GitHub.
Versions
The following versions of AutoDock Vina are currently available:
- Runtime dependencies:
- GCCcore/11.3.0
- Boost/1.79.0-GCC-11.3.0
- SWIG/4.0.2-GCCcore-11.3.0
You can load the dependencies as modules with the following command:
module load Boost SWIG
User guide
You can find the software documentation and user guide on the official website.
Example run script
You can copy this script to autodock_vina.sh, modify it, and submit the job to a compute node with the command sbatch autodock_vina.sh. The number of CPUs per task has been tested and optimized on the Devana cluster. The following script spawns n jobs, each running on 8 cores (threads), where n equals the number of tasks (ntasks) allocated in the Slurm script.
#!/bin/bash
#SBATCH -J "AD_Vina" # name of job in SLURM
#SBATCH --account=<project> # project number
#SBATCH --partition= # selected partition (short, medium, long)
#SBATCH --ntasks= # number of parallel tasks
#SBATCH --cpus-per-task=8 # number of cpus per task
#SBATCH --time=hh:mm:ss # time limit for a job
#SBATCH -o stdout.%J.out # standard output
#SBATCH -e stderr.%J.out # error output
# Modules
module load Boost SWIG
vina=/storage-apps/software/AutoDock-Vina/build/linux/release
# Home
HOME_DIR=`pwd`
# Compounds
ligand=<ligand>.pdbqt
protein=<protein>.pdbqt
# Target location
export WORK_DIR=/work/$USER/$SLURM_JOB_ID
rm -rf $WORK_DIR; mkdir -p $WORK_DIR
# Copy desired compounds
cp $HOME_DIR/$ligand $WORK_DIR/
cp $HOME_DIR/$protein $WORK_DIR/
cd $WORK_DIR
# Exhaustiveness: docking parameter that controls how thorough the search is; the required CPU time increases linearly with it, while the chance of missing a pose decreases exponentially
exht=
# Run AutoDock Vina
# This step requires a config.txt file; see $vina/vina --help for the list of necessary and optional parameters
$vina/vina --verbosity 2 --exhaustiveness $exht --cpu $SLURM_CPUS_PER_TASK --ligand $ligand --receptor $protein --config $HOME_DIR/config.txt --out ${ligand}_${protein}.pdbqt >> ${ligand}_${protein}_log.txt
# Copy files back
rm -rf $HOME_DIR/${ligand}_${protein}; mkdir -p $HOME_DIR/${ligand}_${protein}
cp ${ligand}_${protein}.pdbqt $HOME_DIR/${ligand}_${protein}
cp ${ligand}_${protein}_log.txt $HOME_DIR/${ligand}_${protein}
# Cleanup
cd $HOME_DIR
rm -rf $WORK_DIR
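The script above reads the search box from config.txt in the submission directory. As a minimal sketch, such a file can be written from the shell as shown here. The keys center_x, center_y, center_z, size_x, size_y and size_z are standard Vina options, but every value is a placeholder assumed for illustration and must be replaced with the binding-site box of your own receptor.

```shell
#!/bin/bash
# Write a minimal Vina configuration file into the current directory.
# center_* : center of the search box (Angstrom), placeholder values
# size_*   : edge lengths of the search box (Angstrom), placeholder values
cat > config.txt << 'EOF'
center_x = 0.0
center_y = 0.0
center_z = 0.0
size_x = 20.0
size_y = 20.0
size_z = 20.0
EOF
```

Options given on the command line, such as --exhaustiveness and --cpu in the script above, do not need to be repeated in the file.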
Because AutoDock Vina needs relatively little CPU time per ligand and is often used to dock thousands of compounds, it is advantageous to use a Slurm job array, with each array job cycling over a list of compounds to be docked. Besides the protein and ligand structures, this requires the lists of ligands to be docked in each array job and the vina_loop.sh script, which loops over the given list in each array job. You can copy this script to autodock_vina_loop.sh, modify it, and submit the job to a compute node with the command sbatch autodock_vina_loop.sh.
#!/bin/bash
#SBATCH -J "AD_vina" # name of job in SLURM
#SBATCH -o stdout.%J.out # standard output
#SBATCH -e stderr.%J.out # error output
#SBATCH --partition=ncpu # selected partition, ncpu or ngpu
#SBATCH --nodes=1 # number of used nodes
#SBATCH --exclusive # exclusive run on node
#SBATCH --account=<project> # project number
#SBATCH --time=72:00:00 # time limit for a job
#SBATCH --array=1-8 # spawns 8 array jobs
#SBATCH --cpus-per-task=8 # each using 8 cpus (8x8=64=1 node)
#SBATCH --mem-per-cpu=2G # required memory, if using whole node this is not required
# Modules
module load Boost SWIG
# Home
HOME_DIR=`pwd`
# Compounds
ligands_loc=/ligands/location
protein_loc=/protein/location
# Target location
export WORK_DIR=/work/$USER/$SLURM_JOB_ID/$SLURM_ARRAY_TASK_ID
rm -rf $WORK_DIR; mkdir -p $WORK_DIR
cd $WORK_DIR
# Each array job starts the vina_loop.sh script, which loops over its own list of ligands
$HOME_DIR/vina_loop.sh $HOME_DIR $protein_loc $ligands_loc $SLURM_ARRAY_TASK_ID $SLURM_CPUS_PER_TASK
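Each array job needs its own ligand list, named ligands_1.txt, ligands_2.txt and so on, which is the naming that vina_loop.sh reads. Assuming you start from a single master list with one ligand name per line, a sketch like the following could split it into contiguous chunks. The function name split_ligands and the file name all_ligands.txt are hypothetical and are not part of AutoDock Vina or the cluster software.

```shell
#!/bin/bash
# split_ligands LIST N_JOBS OUT_DIR
# Splits LIST (one ligand name per line) into N_JOBS contiguous chunks,
# written to OUT_DIR/ligands_1.txt ... OUT_DIR/ligands_<N_JOBS>.txt.
# Fewer files are produced if LIST has fewer entries than N_JOBS.
split_ligands() {
    local list=$1 n_jobs=$2 out_dir=$3
    local total chunk

    # Count non-empty lines in the master list
    total=$(awk 'NF' "$list" | wc -l)
    # Lines per chunk, rounded up so that no ligand is left out
    chunk=$(( (total + n_jobs - 1) / n_jobs ))

    mkdir -p "$out_dir"
    # Write every non-empty line to the chunk file it belongs to
    awk -v c="$chunk" -v dir="$out_dir" '
        NF {
            i++
            f = dir "/ligands_" (int((i - 1) / c) + 1) ".txt"
            print > f
        }
    ' "$list"
}
```

For the array of 8 jobs above, the call would be split_ligands all_ligands.txt 8 /ligands/location, where the second argument should match the range given to --array.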
The second script, vina_loop.sh, runs a loop of docking calculations over the list of ligands. The number of scripts running in parallel equals the number of array jobs, and each script uses as many cores as set by cpus-per-task in autodock_vina_loop.sh.
#!/bin/bash
module load Boost
module load SWIG
# Accepted variables from slurm batch script
init_dir=$1
protein_loc=$2
ligands_loc=$3
job_number=$4
cpu=$5
# Shortcuts
work_dir=`pwd`
vina=/storage-apps/software/AutoDock-Vina/build/linux/release
# Exhaustiveness: docking parameter that controls how thorough the search is; the required CPU time increases linearly with it, while the chance of missing a pose decreases exponentially
exht=
# Define protein
protein=
# Copy protein
cp $protein_loc/${protein}.pdbqt $work_dir/.
# Define list of ligands, this points to file ligands_1.txt for 1st array job, ligands_2.txt for 2nd array job, etc.
ligands=$ligands_loc/ligands_${job_number}.txt
# Content of ligands_1.txt file
# Ligand_00001
# Ligand_00002
# etc
# Ligand_10000
# Content of ligands_2.txt file
# Ligand_10001
# Ligand_10002
# etc
# etc
# Loop over the list of ligands
while IFS= read -r line; do
    # Get ligand
    ligand=$line
    # Copy ligand to working directory
    cp $ligands_loc/${ligand}.pdbqt $work_dir/.
    # Run Vina; the search box is read from config.txt in the submission directory (assumed location, as in the single-job script)
    "$vina"/vina --verbosity 2 --exhaustiveness $exht --cpu $cpu --ligand ${ligand}.pdbqt --receptor ${protein}.pdbqt --config $init_dir/config.txt --out ${ligand}_${protein}.pdbqt >> ${ligand}_${protein}_log.txt
    # Copy files back
    rm -rf $init_dir/${ligand}_${protein}; mkdir -p $init_dir/${ligand}_${protein}
    cp ${ligand}_${protein}.pdbqt $init_dir/${ligand}_${protein}
    cp ${ligand}_${protein}_log.txt $init_dir/${ligand}_${protein}
done < $ligands
# Cleanup
cd $init_dir
rm -rf $work_dir
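Once the array jobs have finished, each result directory holds one output .pdbqt file. Vina writes a REMARK VINA RESULT line for every pose, with the predicted affinity in kcal/mol as the first number and the best pose listed first. As a hedged sketch, the hypothetical helper best_scores below collects the best score per ligand and sorts the list. It assumes the directory layout produced by the copy-back step above, one subdirectory per ligand under the submission directory.

```shell
#!/bin/bash
# best_scores RESULTS_DIR
# Prints "<output name><TAB><best affinity>" for every RESULTS_DIR/*/*.pdbqt,
# sorted so that the most negative (best) affinity comes first.
best_scores() {
    local results_dir=$1 f score
    for f in "$results_dir"/*/*.pdbqt; do
        # Skip the unexpanded pattern when nothing matches
        [ -e "$f" ] || continue
        # The first "REMARK VINA RESULT:" line is the best pose; field 4 is its affinity
        score=$(awk '/^REMARK VINA RESULT:/ { print $4; exit }' "$f")
        if [ -n "$score" ]; then
            printf '%s\t%s\n' "$(basename "$f" .pdbqt)" "$score"
        fi
    done | sort -t $'\t' -k2,2n
}
```

Running best_scores "$PWD" in the submission directory prints one line per docked ligand. Files without a result line, for example from a failed run, are skipped silently.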