Pseudo-array imputation
Requirements
Ubuntu 22.04 (8 CPUs, 32 GB)
bcftools (version==1.13)
shapeit5 (version==5.1.1)
Minimac3 (version==2.0.1)
Minimac4 (version==1.0.3)
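Before running the workflow, it can help to confirm that the required executables are on the PATH. The following is a minimal sketch, not part of the original protocol: the function name check_tools is hypothetical, and the executable names are assumed from the commands used in the scripts below.

```shell
# check_tools NAME...: report whether each executable is on PATH.
# Returns 0 if all are found, 1 if at least one is missing.
# (check_tools is a hypothetical helper, not part of the protocol.)
check_tools() {
    ct_missing=0
    for ct_tool in "$@"; do
        if command -v "$ct_tool" >/dev/null 2>&1; then
            echo "found:   $ct_tool"
        else
            echo "missing: $ct_tool"
            ct_missing=1
        fi
    done
    return "$ct_missing"
}

# Executable names assumed from the commands used in this protocol.
check_tools bcftools shapeit5_phase_common_static Minimac3 minimac4 \
    || echo "Install the missing tools before running the workflow."
```

This checks presence only; versions still need to be compared by hand against the list above.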
Array imputation workflow
Prepare imputation reference
Code
This script extracts the reference panel (excluding the samples listed in the batch sample list), phases the pseudo SNP array data with SHAPEIT5, and converts the reference panel to Minimac3 (m3vcf) format for imputation.
set -ue
CHR=$1                  # Chromosome number (e.g., 1, 2, ..., 22)
ARRAY_NAME=$2           # Name of the pseudo-array (e.g., array1, array2, ...)
BATCH_SAMPLE_LIST=$3    # /path/to/batch_sample_list_file
PSEUDO_ARRAY_VCF=$4     # /path/to/pseudo_array_vcf_file
REFERENCE_VCF_FILE=$5   # /path/to/reference_vcf_file.vcf.gz
PHASING_REFERENCE=$6    # /path/to/phasing_reference_file.vcf.gz
## Extract reference
# Drop the batch samples (the ^ prefix excludes them), then rename chromosomes.
bcftools view -S "^${BATCH_SAMPLE_LIST}" "${REFERENCE_VCF_FILE}" | \
    bcftools annotate --rename-chrs rename_chr.txt \
    -Oz -o "ref_chr${CHR}.vcf.gz"
bcftools index -f "ref_chr${CHR}.vcf.gz"
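## Phasing
# Phase the pseudo-array genotypes for chromosome ${CHR} against the extracted reference.
# SHAPEIT5's --map option takes a genetic map; it is supplied here via PHASING_REFERENCE.
shapeit5_phase_common_static --input "${PSEUDO_ARRAY_VCF}" \
    --reference "ref_chr${CHR}.vcf.gz" \
    --region "${CHR}" --map "${PHASING_REFERENCE}" \
    --thread 8 \
    --output "phased_${ARRAY_NAME}_chr${CHR}.bcf"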
## Indexing by Minimac3
Minimac3 --refHaps "ref_chr${CHR}.vcf.gz" \
    --processReference \
    --prefix "m3vcf_ref_chr${CHR}" \
    --cpus 8
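rename_chr.txt maps the chromosome names in the reference VCF to numeric names (e.g., chr1 to 1). bcftools annotate --rename-chrs reads it as one whitespace-separated "old_name new_name" pair per line.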
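The mapping file can be generated rather than typed by hand. The sketch below assumes that the reference VCF uses "chr"-prefixed names for autosomes 1 to 22; that prefix is an assumption, so check the actual contig names (for example with bcftools index -s) and adjust accordingly.

```shell
# Write one "old_name new_name" pair per line for autosomes 1-22.
# Assumption: the source VCF names its contigs chr1 ... chr22.
: > rename_chr.txt
i=1
while [ "$i" -le 22 ]; do
    printf 'chr%s %s\n' "$i" "$i" >> rename_chr.txt
    i=$((i + 1))
done

# Show the first few lines of the mapping.
head -n 3 rename_chr.txt
```

The file is written to the current directory, which is where the reference preparation script looks for it.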
Imputation process
Code
Genotype imputation is performed using Minimac4. The phased BCF file is converted to a compressed VCF and indexed, imputed against the Minimac3-formatted reference panel, and the temporary VCF files are removed upon completion.
set -ue
ARRAY=$1
CHR=$2
## Input
PHASED_BCF="phased_${ARRAY}_chr${CHR}.bcf"
MINIMAC3_INDEX_VCF="m3vcf_ref_chr${CHR}.m3vcf.gz"
## Imputation
# Convert the phased BCF to a temporary bgzipped VCF and index it.
bcftools view -Oz -o "tem_${ARRAY}_chr${CHR}.vcf.gz" "${PHASED_BCF}"
bcftools index -f "tem_${ARRAY}_chr${CHR}.vcf.gz"
minimac4 --refHaps "${MINIMAC3_INDEX_VCF}" \
    --ChunkLengthMb 50 \
    --ChunkOverlapMb 5 \
    --haps "tem_${ARRAY}_chr${CHR}.vcf.gz" \
    --format GT,DS,GP \
    --prefix "imputed_${ARRAY}_chr${CHR}" \
    --ignoreDuplicates \
    --cpus 8 \
    --vcfBuffer 1100
# Remove the temporary VCF and its index.
rm "tem_${ARRAY}_chr${CHR}".vcf.*
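The imputation script handles one array and one chromosome per call, so a full run needs one call per autosome. The following dry-run sketch only prints the command lines. The file name impute.sh and the function name print_imputation_jobs are hypothetical, chosen here for illustration.

```shell
# print_imputation_jobs ARRAY: print one command line per autosome (1-22).
# "impute.sh" is a hypothetical file name for the imputation script above.
print_imputation_jobs() {
    pij_array=$1
    pij_chr=1
    while [ "$pij_chr" -le 22 ]; do
        echo "bash impute.sh ${pij_array} ${pij_chr}"
        pij_chr=$((pij_chr + 1))
    done
}

# Dry run for a pseudo-array named array1.
print_imputation_jobs array1
```

Piping the output to bash runs the jobs one after another. Since each job already requests 8 CPUs, running them sequentially is a reasonable default on the 8-CPU machine listed in the requirements.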
July 6, 2025
June 5, 2025