Installation

A. Installation from source

1. Clone the repository from GitHub

export INSTALLDIR=/opt/bamquery
mkdir $INSTALLDIR
cd $INSTALLDIR
git clone https://github.com/lemieux-lab/BamQuery.git

2. Install required library files within $INSTALLDIR:

wget https://bamquery.iric.ca/download/lib_essentials.tar.gz
tar vxzf lib_essentials.tar.gz

2.a Installation of genomes

BamQuery supports three versions of the human genome (v26_88 / v33_99 / v38_104) and two versions of the mouse genome (M24 for GRCm38 and M30 for GRCm39).

Download the human or mouse genome version you wish to use into the genome_versions folder:

cd lib/genome_versions

Use the command below to download a human genome version, replacing SET_VERSION with v26_88, v33_99 or v38_104:

wget https://bamquery.iric.ca/download/genome_SET_VERSION.tar.gz

Or use the following command to download a mouse genome version, replacing SET_VERSION with m24 or m30:

wget https://bamquery.iric.ca/download/genome_mouse_SET_VERSION.tar.gz

Finally, extract the downloaded archive:

tar vxzf GENOME_VERSION.tar.gz
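As an illustration, the placeholder substitution can be wrapped in a small helper. This is a sketch, not part of BamQuery: the function name genome_archive is hypothetical, and the archive names are assumed to follow the genome_SET_VERSION and genome_mouse_SET_VERSION patterns shown above.

```shell
# Hypothetical helper: map a genome version to its archive file name.
# Assumes the naming patterns genome_<version>.tar.gz (human) and
# genome_mouse_<version>.tar.gz (mouse) from the wget commands above.
genome_archive() {
    case "$1" in
        v26_88|v33_99|v38_104) echo "genome_$1.tar.gz" ;;
        m24|m30)               echo "genome_mouse_$1.tar.gz" ;;
        *) echo "unknown genome version: $1" >&2; return 1 ;;
    esac
}

# Example (left commented out so nothing is downloaded by accident):
# cd "$INSTALLDIR/lib/genome_versions"
# archive=$(genome_archive v33_99)
# wget "https://bamquery.iric.ca/download/$archive" && tar vxzf "$archive"
```

Rejecting unknown versions up front avoids requesting an archive that does not exist on the download server.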

2.b Installation of SNPs

BamQuery supports three dbSNP releases for the human genome (149 / 151 / 155) and two for the mouse genome (snps_GRCm38 for M24 and snps_GRCm39 for M30).

Optionally, download the annotated SNPs you need (by default, BamQuery does not use SNPs):

cd lib/snps

Use the command below to download the dbSNP release for the human genome, replacing SET_RELEASE with 149, 151 or 155:

wget https://bamquery.iric.ca/download/dbsnps_SET_RELEASE.tar.gz

Or use the following command to download the SNPs for a mouse genome, replacing SET_RELEASE with GRCm38 or GRCm39:

wget https://bamquery.iric.ca/download/snps_mouse_SET_RELEASE.tar.gz

Finally, extract the downloaded archive:

tar vxzf SNPS_RELEASE.tar.gz
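Genome and SNP archives can be large, and an interrupted download leaves a truncated file behind. The sketch below lists the archive contents before extracting, so a damaged file is reported instead of being partially unpacked. The function name extract_checked is hypothetical and not part of BamQuery.

```shell
# Hypothetical helper: extract a .tar.gz into the current directory,
# but only if the whole archive can be read first.
extract_checked() {
    archive=$1
    # 'tar tzf' lists the contents; it typically fails on a truncated
    # or corrupted archive.
    if tar tzf "$archive" >/dev/null 2>&1; then
        tar xzf "$archive"
    else
        echo "archive looks incomplete or corrupted: $archive" >&2
        return 1
    fi
}

# Example, with SNPS_RELEASE replaced by the archive you downloaded:
# cd "$INSTALLDIR/lib/snps" && extract_checked SNPS_RELEASE.tar.gz
```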

3. Create a virtual environment and install dependencies

Option 1: Installation with Conda

For users without administrator privileges, we recommend installing BamQuery with conda: https://docs.conda.io/en/latest/miniconda.html

First create a conda environment and activate it:

conda create -n BQ
conda activate BQ

Then install all dependencies:

conda install -y -c bioconda pysam
conda install -y -c anaconda pandas
conda install -y -c bioconda Bio
conda install -y -c conda-forge pathos
conda install -y -c conda-forge xlsxwriter
conda install -y -c anaconda seaborn
conda install -y -c conda-forge billiard
conda install -y -c anaconda scipy
conda install -y -c bioconda bedtools
conda install -y -c bioconda star=2.7.9a
conda install -y -c conda-forge mamba
mamba install -y -c conda-forge r-ggplot2
mamba install -y -c conda-forge r-data.table

Launch the analysis:

conda activate BQ
source ${INSTALLDIR}/env/bin/activate
python3 ${INSTALLDIR}/BamQuery/BamQuery.py path_to_input_folder name_exp genome_version

Option 2: Installation from source

Download Python 3 (https://www.python.org/) and create a virtual environment:

python3 -m venv ${INSTALLDIR}/bamquery-venv
source ${INSTALLDIR}/bamquery-venv/bin/activate

Install the Python packages in the virtual environment:

pip install --upgrade pip
pip install pandas
pip install pysam
pip install pathos
pip install xlsxwriter
pip install seaborn
pip install billiard
pip install numpy
pip install scipy

Install external dependencies so that their binaries are available in your $PATH:

STAR 2.7.9a: https://github.com/alexdobin/STAR
bedtools: https://bedtools.readthedocs.io/en/latest/
R: https://www.r-project.org/, required R packages: ggplot2, data.table
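To confirm that the external tools are reachable from your $PATH, a check along these lines may help. It is a sketch only: check_tools is a hypothetical name, and it verifies that each binary exists without checking its version (for example, it does not confirm that STAR is 2.7.9a).

```shell
# Hypothetical helper: report whether each named executable is on $PATH.
check_tools() {
    missing=0
    for tool in "$@"; do
        if command -v "$tool" >/dev/null 2>&1; then
            echo "found:   $tool"
        else
            echo "missing: $tool"
            missing=1
        fi
    done
    return "$missing"
}

# STAR, bedtools and Rscript are the executables provided by the
# dependencies listed above.
check_tools STAR bedtools Rscript \
    || echo "Install the missing tools before launching BamQuery."
```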

Launch the analysis:

python3 ${INSTALLDIR}/BamQuery/BamQuery.py path_to_input_folder name_exp genome_version

B. Installation using the provided Docker container

A Docker container is also available; it provides a self-contained working environment.

1. Create an install folder:

export INSTALLDIR=/opt/bamquery
mkdir $INSTALLDIR
cd $INSTALLDIR

2. Download the Docker image:

wget https://bamquery.iric.ca/download/bamquery-2023-07-03.tar.gz

3. Install the Docker image (requires sudo access):

gunzip bamquery-2023-07-03.tar.gz
sudo docker load --input bamquery-2023-07-03.tar

4. Install required library files within $INSTALLDIR:

Follow the instructions in step 2 of section A above (2. Install required library files within $INSTALLDIR).

5. Launch the analysis from the Docker container:

sudo docker run -i -t  \
--user $(id -u):$(id -g) \
-v $INSTALLDIR/lib:/opt/bamquery/lib \
-v $DATAFOLDER:$DATAFOLDER  \
-v $PWD:$PWD \
iric/bamquery:0.2 python3 /opt/bamquery/BamQuery/BamQuery.py path_to_input_folder name_exp

Make sure to map every folder mentioned in the input files (BAM locations, input folder) so that these paths are available from within the container. To do so, pass one -v $DATAFOLDER:$DATAFOLDER argument per folder (where $DATAFOLDER is to be replaced by an actual folder name), plus -v $PWD:$PWD if needed.
Note also that we force the application to run with user permissions instead of root using the --user $(id -u):$(id -g) argument.
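When several data folders must be mapped, the repeated -v arguments can be generated rather than typed by hand. The helper below is a sketch under assumptions: mount_args is a hypothetical name, the folders /data/bams and /data/inputs are placeholders, and paths containing spaces are not supported because the result is expanded unquoted on the docker command line.

```shell
# Hypothetical helper: print one "-v DIR:DIR" pair per folder so each
# folder is visible inside the container under the same path.
mount_args() {
    for dir in "$@"; do
        # "-v" is passed as a printf argument, not in the format string,
        # so no shell mistakes it for an option.
        printf '%s %s:%s ' -v "$dir" "$dir"
    done
}

# Preview the generated arguments:
echo "$(mount_args /data/bams /data/inputs)"

# Example (commented out; requires Docker and the BamQuery image):
# sudo docker run -i -t --user $(id -u):$(id -g) \
#   -v $INSTALLDIR/lib:/opt/bamquery/lib \
#   $(mount_args /data/bams /data/inputs "$PWD") \
#   iric/bamquery:0.2 python3 /opt/bamquery/BamQuery/BamQuery.py path_to_input_folder name_exp
```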

For more information on configuration, see Configuration.

Note

BamQuery requires a specific folder structure to work.
Once BamQuery is installed, check that the structure looks as follows:

.
├── BamQuery
│   ├── BamQuery.py
│   ├── genomics
│   ├── plotting
│   ├── readers
│   ├── README.md
│   └── utils
└── lib
    ├── coefficients.dic
    ├── Cosmic_info.dic
    ├── ERE_info.dic
    ├── ERE_info_mouse.dic
    ├── EREs_souris.bed
    ├── genome_versions
    │   ├── genome_mouse_m24
    │   ├── genome_mouse_m30
    │   ├── genome_v26_88
    │   ├── genome_v33_99
    │   └── genome_v38_104
    ├── hg38_ucsc_repeatmasker.gtf
    ├── README.txt
    └── snps
        ├── snps_dics_149
        ├── snps_dics_149_common
        ├── snps_dics_151
        ├── snps_dics_151_common
        ├── snps_dics_155
        └── snps_dics_155_common
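The layout above can be checked with a short script. This is a minimal sketch, not a BamQuery tool: check_layout is a hypothetical name, and it tests only a few entries from the tree (the genome and SNP subfolders depend on which versions you downloaded).

```shell
# Hypothetical helper: verify that key entries of the expected folder
# structure exist under the given install directory.
check_layout() {
    root=${1:-.}
    status=0
    for path in BamQuery/BamQuery.py lib/genome_versions lib/snps; do
        if [ -e "$root/$path" ]; then
            echo "ok:      $path"
        else
            echo "missing: $path"
            status=1
        fi
    done
    return "$status"
}

check_layout "${INSTALLDIR:-.}" \
    || echo "The folder structure is incomplete; review the installation steps."
```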