Low-pass imputation

Requirements

  • Ubuntu 22.04 (8 CPUs, 32 GB)
  • bcftools (version==1.13)
  • GLIMPSE2 v2.0.0, commit: 8ce534f, release: 2023-06-29
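Before running the scripts below, it can help to confirm that the required tools are on PATH. The following is a minimal sketch, not part of the original workflow: check_tools is a hypothetical helper, and the GLIMPSE2 binary names in the usage note are assumptions that depend on how GLIMPSE2 was installed (static release binaries may carry a different name).

```shell
#!/usr/bin/env bash
# Hypothetical pre-flight helper: report every tool that is not on PATH.
# Returns 0 when all tools are found, 1 otherwise.
check_tools() {
    local missing=0 tool
    for tool in "$@"; do
        if ! command -v "$tool" >/dev/null 2>&1; then
            echo "missing: ${tool}" >&2
            missing=1
        fi
    done
    return "$missing"
}
```

A possible call is check_tools bcftools GLIMPSE2_chunk GLIMPSE2_split_reference GLIMPSE2_phase GLIMPSE2_ligate. The installed bcftools version can then be read from the first line of bcftools --version.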

Input data

Low-pass imputation process

Figure: Low-pass imputation workflow

Prepare imputation reference

Code

Make GLIMPSE2 imputation reference data

set -ue

## INPUT
BATCH=$1                # e.g. 1, 2, ..., 10
REF_FOLDER=$2           # e.g. /path/to/reference_panel_folder
OUT_DIR=$3              # e.g. /path/to/output_directory
SAMPLE_LIST=$4          # e.g. /path/to/sample_list.txt
MAP_DIR=$5              # e.g. /path/to/map_directory


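gen_ref_batch_chr(){
    chr=$1                  # chromosome name with prefix, e.g. chr1
    in_dir=$2
    out_dir=$3
    sample_list=$4
    map_dir=$5

    mkdir -p "${out_dir}"

    echo "Filtering reference panel for ${chr}"

    # Drop the samples listed in sample_list (the ^ prefix means "exclude")
    bcftools view \
        -S "^${sample_list}" "${in_dir}/${chr}.vcf.gz" \
        -Oz -o "${out_dir}/${chr}.vcf.gz"

    bcftools index -f "${out_dir}/${chr}.vcf.gz"

    # Build the GLIMPSE2 binary reference for this chromosome
    bash build_ref.sh "${out_dir}/${chr}.vcf.gz" "${chr}" "${map_dir}/${chr}.b38.gmap.gz" "${out_dir}"
}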

for i in {1..22}
do
    gen_ref_batch_chr chr${i} $REF_FOLDER $OUT_DIR $SAMPLE_LIST $MAP_DIR &
done

wait
build_ref.sh splits the raw reference panel (VCF files) into chunks and writes them as the binary reference files (bin files) that GLIMPSE2 uses for imputation.
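The contents of build_ref.sh are not shown here. As a rough illustration only, the step could look like the sketch below, which follows the usual GLIMPSE2 pattern of chunking a chromosome and then converting each chunk to binary format. The function name build_ref, the chunk file name, and the output prefix "ref" are assumptions made for this example; the arguments mirror the call in the script above (VCF, chromosome, genetic map, output directory).

```shell
#!/usr/bin/env bash
# Hypothetical sketch of the reference-building step; not the author's script.
build_ref() {
    local vcf=$1 chr=$2 map=$3 out_dir=$4
    local chunks="${out_dir}/chunks.${chr}.txt"   # assumed file name
    local id chrom irg org rest

    # 1. Define imputation chunks along the chromosome.
    GLIMPSE2_chunk --input "$vcf" --region "$chr" --map "$map" \
        --sequential --output "$chunks"

    # 2. Convert each chunk of the panel to GLIMPSE2's binary format.
    #    Column 3 of the chunk file is the buffered (input) region,
    #    column 4 is the region that is actually written out.
    while read -r id chrom irg org rest || [ -n "$irg" ]; do
        GLIMPSE2_split_reference --reference "$vcf" --map "$map" \
            --input-region "$irg" --output-region "$org" \
            --output "${out_dir}/ref"              # assumed prefix
    done < "$chunks"
}
```

Keeping the chunk file next to the bin files means the imputation step can later read the same chunk definitions in the same order.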

Imputation process

Code

set -ue

## INPUT
CHR=$1              # e.g. 1, 2, ..., 22
OUT=$2              # e.g. chr1_10x_lps_imputed.vcf.gz
CORES="${3:-1}"     # number of cores to use, default is 1
REF_FOLDER=${4}     # e.g. /path/to/reference_panel_folder

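# COV (the coverage tag in the BAM file names) is not a positional argument;
# it must be set in the environment, otherwise the script stops here.
: "${COV:?COV must be set in the environment}"
ls *"${COV}"_lps.bam > run_bam_list.txt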

run_imputation_bam_list.sh     run_bam_list.txt         \
                               "chr${CHR}"              \
                               "${OUT}"                 \
                               "${CORES}"               \
                               "${REF_FOLDER}"
run_imputation_bam_list.sh runs GLIMPSE2 imputation on an autosome for all BAM files in the list and then ligates the imputed chunks into a single output file.
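run_imputation_bam_list.sh itself is not listed on this page. The sketch below shows one way the imputation and ligation steps could be written with GLIMPSE2_phase and GLIMPSE2_ligate, taking the same five arguments as the call above. The function name run_imputation, the chunk file location (chunks.<chr>.txt inside the reference folder), the bin file prefix "ref", and the working directory name are all assumptions for illustration.

```shell
#!/usr/bin/env bash
# Hypothetical sketch of imputation plus ligation; not the author's script.
run_imputation() {
    local bam_list=$1 chr=$2 out=$3 cores=${4:-1} ref_dir=$5
    local work="${out%.vcf.gz}_chunks"            # assumed scratch directory
    local list="${work}/ligate_list.txt"
    local id chrom irg org rest range start end chunk_out

    mkdir -p "$work"
    : > "$list"

    # Impute every chunk, in chunk-file order, for all BAMs in the list.
    while read -r id chrom irg org rest || [ -n "$irg" ]; do
        range=${irg#*:}          # e.g. chr1:100-200 -> 100-200
        start=${range%-*}
        end=${range#*-}
        chunk_out="${work}/${chr}_${start}_${end}.bcf"

        GLIMPSE2_phase --bam-list "$bam_list" \
            --reference "${ref_dir}/ref_${chr}_${start}_${end}.bin" \
            --threads "$cores" --output "$chunk_out"

        echo "$chunk_out" >> "$list"
    done < "${ref_dir}/chunks.${chr}.txt"

    # Ligate the per-chunk results into one file for the chromosome.
    GLIMPSE2_ligate --input "$list" --output "$out"
}
```

Writing the ligation list in chunk-file order, rather than from a shell glob, keeps the chunks in genomic order even when start coordinates have different numbers of digits.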

Output data

  • lpWGS VCF files