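[GATK Acceleration] Replacing BWA/GATK/Mutect2 with Sentieon: A Guide to Tumor Somatic Variant Detection
Introduction
This article describes two somatic variant calling pipelines:
- TNscope: uses Sentieon's proprietary algorithm, which offers faster computation and higher accuracy; it is particularly well suited to clinical genetic diagnostic samples;
- TNhaplotyper2: matches Mutect2 results (currently matched to version 4.1.9) while running more than 10 times faster.
Complete scripts for TNscope and TNhaplotyper2 are available at:
Sentieon software download:
Data processing workflow of the TNscope pipeline, aimed mainly at WES and panel data (200-500x depth, AF > 1%)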
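The commands in the steps that follow rely on shell variables ($fasta, $nt, $platform, $dbsnp, and so on) that this guide never defines. As a minimal sketch, a setup preamble might look like the following. Every path and default value here is a placeholder, and `require_files` is a hypothetical helper, not part of Sentieon.

```shell
#!/usr/bin/env bash
# Hypothetical setup preamble: all values below are placeholders to adapt.
nt=${nt:-16}                                   # number of threads
platform=${platform:-ILLUMINA}                 # PL field of the read group
tumor=${tumor:-tumor_sample}                   # ID/SM of the tumor read group
normal=${normal:-normal_sample}                # ID/SM of the normal read group
TUMOR_SM=$tumor                                # sample names used at calling time
NORMAL_SM=$normal
fasta=${fasta:-/path/to/reference.fasta}
dbsnp=${dbsnp:-/path/to/dbsnp.vcf.gz}
known_Mills_indels=${known_Mills_indels:-/path/to/Mills_indels.vcf.gz}
known_1000G_indels=${known_1000G_indels:-/path/to/1000G_indels.vcf.gz}
tumor_fastq_1=${tumor_fastq_1:-/path/to/tumor_R1.fastq.gz}
tumor_fastq_2=${tumor_fastq_2:-/path/to/tumor_R2.fastq.gz}
normal_fastq_1=${normal_fastq_1:-/path/to/normal_R1.fastq.gz}
normal_fastq_2=${normal_fastq_2:-/path/to/normal_R2.fastq.gz}

# Report every argument that is missing or empty; return non-zero if any is.
require_files() {
  local missing=0 f
  for f in "$@"; do
    if [ ! -s "$f" ]; then
      echo "missing or empty input: $f" >&2
      missing=1
    fi
  done
  return "$missing"
}

# Usage (commented out because the placeholder paths do not exist):
# require_files "$fasta" "$dbsnp" "$tumor_fastq_1" "$tumor_fastq_2" \
#   "$normal_fastq_1" "$normal_fastq_2" || exit 1
```

Checking inputs up front avoids discovering a wrong path only after a long alignment has finished.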
Step 1: Alignment
# ******************************************
# 1a. Mapping reads with BWA-MEM, sorting for tumor sample
# ******************************************
( sentieon bwa mem -M -R "@RG\tID:$tumor\tSM:$tumor\tPL:$platform" \
  -t $nt -K 10000000 $fasta $tumor_fastq_1 $tumor_fastq_2 || \
  echo -n 'error' ) | \
  sentieon util sort -o tumor_sorted.bam -t $nt --sam2bam -i -
# ******************************************
# 1b. Mapping reads with BWA-MEM, sorting for normal sample
# ******************************************
( sentieon bwa mem -M -R "@RG\tID:$normal\tSM:$normal\tPL:$platform" \
  -t $nt -K 10000000 $fasta $normal_fastq_1 $normal_fastq_2 || \
  echo -n 'error' ) | \
  sentieon util sort -o normal_sorted.bam -t $nt --sam2bam -i -
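The `|| echo -n 'error'` in the alignment commands appears intended to push non-SAM text into the pipe when bwa fails, so that the sort step fails as well instead of silently writing a truncated BAM. An alternative, sketched below, is to rely on bash's `pipefail` option. The function name `align_and_sort` is hypothetical; the Sentieon arguments are copied from step 1, and $nt, $platform and $fasta are assumed to be set already.

```shell
#!/usr/bin/env bash
# Sketch: align one sample and sort the result, returning non-zero if
# either side of the pipe fails. The subshell keeps pipefail local.
align_and_sort() {
  local sample=$1 fq1=$2 fq2=$3 out_bam=$4
  (
    set -o pipefail
    sentieon bwa mem -M -R "@RG\tID:${sample}\tSM:${sample}\tPL:${platform}" \
      -t "$nt" -K 10000000 "$fasta" "$fq1" "$fq2" \
      | sentieon util sort -o "$out_bam" -t "$nt" --sam2bam -i -
  )
}

# Usage (commented out because it needs Sentieon and real FASTQ files):
# align_and_sort "$tumor"  "$tumor_fastq_1"  "$tumor_fastq_2"  tumor_sorted.bam  || exit 1
# align_and_sort "$normal" "$normal_fastq_1" "$normal_fastq_2" normal_sorted.bam || exit 1
```

With `pipefail`, the pipeline's exit status is that of the last command that failed, so a bwa failure is reported even if the sort step exits cleanly.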
Step 2: PCR Duplicate Removal (Skip for Amplicon)
# ******************************************
# 2a. Remove duplicate reads for tumor sample.
# ******************************************
sentieon driver -t $nt -i tumor_sorted.bam \
  --algo LocusCollector \
  --fun score_info \
  tumor_score.txt
sentieon driver -t $nt -i tumor_sorted.bam \
  --algo Dedup \
  --score_info tumor_score.txt \
  --metrics tumor_dedup_metrics.txt \
  tumor_deduped.bam
# ******************************************
# 2b. Remove duplicate reads for normal sample.
# ******************************************
sentieon driver -t $nt -i normal_sorted.bam \
  --algo LocusCollector \
  --fun score_info \
  normal_score.txt
sentieon driver -t $nt -i normal_sorted.bam \
  --algo Dedup \
  --score_info normal_score.txt \
  --metrics normal_dedup_metrics.txt \
  normal_deduped.bam
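Steps 2a and 2b differ only in the file prefix, so the two passes (LocusCollector to gather scores, then Dedup to remove duplicates) can be wrapped once and applied to both samples. The sketch below assumes the file naming used above; `dedup_sample` is a hypothetical helper name and $nt is assumed to be set.

```shell
#!/usr/bin/env bash
# Sketch: run the two-pass duplicate removal for one sample prefix
# ("tumor" or "normal"), stopping if the first pass fails.
dedup_sample() {
  local s=$1
  # Pass 1: collect per-locus score information.
  sentieon driver -t "$nt" -i "${s}_sorted.bam" \
    --algo LocusCollector --fun score_info "${s}_score.txt" || return 1
  # Pass 2: remove duplicates using the scores from pass 1.
  sentieon driver -t "$nt" -i "${s}_sorted.bam" \
    --algo Dedup --score_info "${s}_score.txt" \
    --metrics "${s}_dedup_metrics.txt" "${s}_deduped.bam"
}

# Usage (commented out because it needs Sentieon and the sorted BAMs):
# for s in tumor normal; do dedup_sample "$s" || exit 1; done
```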
Step 3: Base Quality Score Recalibration (Skip for Small Panel)
# ******************************************
# 3a. Base recalibration for tumor sample
# ******************************************
sentieon driver -r $fasta -t $nt -i tumor_deduped.bam \
  --algo QualCal \
  -k $dbsnp \
  -k $known_Mills_indels \
  -k $known_1000G_indels \
  tumor_recal_data.table
# ******************************************
# 3b. Base recalibration for normal sample
# ******************************************
sentieon driver -r $fasta -t $nt -i normal_deduped.bam \
  --algo QualCal \
  -k $dbsnp \
  -k $known_Mills_indels \
  -k $known_1000G_indels \
  normal_recal_data.table
Step 4: Variant Calling
sentieon driver -r $fasta -t $nt -i tumor_deduped.bam -i normal_deduped.bam \
  --algo TNscope \
  --tumor_sample $TUMOR_SM \
  --normal_sample $NORMAL_SM \
  --dbsnp $dbsnp \
  --sv_mask_ext 10 \
  --min_tumor_allele_frac 0.01 \
  --max_fisher_pv_active 0.05 \
  --filter_t_alt_frac 0.01 \
  --max_normal_alt_frac 0.005 \
  --max_normal_alt_qsum 200 \
  --max_normal_alt_cnt 5 \
  --assemble_mode 4 \
  output_tnscope.pre_filter.vcf.gz
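The values passed as $TUMOR_SM and $NORMAL_SM need to match the SM tags written into the read groups during alignment (there, $tumor and $normal). A quick way to see which sample names a BAM actually carries is to read its @RG header lines. The sketch below parses header text from standard input; `rg_samples` is a hypothetical helper, and the usage line assumes samtools is installed.

```shell
#!/usr/bin/env bash
# Print the unique SM (sample) values found in @RG header lines read
# from standard input, one per line.
rg_samples() {
  awk -F'\t' '/^@RG/ {
    for (i = 2; i <= NF; i++)
      if ($i ~ /^SM:/) print substr($i, 4)   # drop the "SM:" prefix
  }' | LC_ALL=C sort -u
}

# Usage (commented out; assumes samtools and the deduplicated BAMs):
# samtools view -H tumor_deduped.bam  | rg_samples
# samtools view -H normal_deduped.bam | rg_samples
```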
Step 5: Variant Filtration
bcftools annotate -x "FILTER/triallelic_site" output_tnscope.pre_filter.vcf.gz | \
  bcftools filter -m + -s "insignificant" -e "(PV>0.25 && PV2>0.25)" | \
  bcftools filter -m + -s "insignificant" -e "(INFO/STR == 1 && PV>0.05)" | \
  bcftools filter -m + -s "orientation_bias" -e "FMT/FOXOG[0] == 1" | \
  bcftools filter -m + -s "strand_bias" -e "SOR > 3" | \
  bcftools filter -m + -s "low_qual" -e "QUAL < 20" | \
  bcftools filter -m + -s "short_tandem_repeat" -e "RPA[0]>=10" | \
  bcftools filter -m + -s "noisy_region" -e "ECNT>5" | \
  bcftools filter -m + -s "read_pos_bias" -e "FMT/ReadPosRankSumPS[0] < -8" | \
  sentieon util vcfconvert - output_tnscope.filtered.vcf.gz
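Because each `bcftools filter -m +` call appends its label to the FILTER column rather than replacing it, one record can end up with several labels separated by semicolons. To see how many records each filter touched, the FILTER column can be tallied. The sketch below reads uncompressed VCF text from standard input; `filter_counts` is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# Tally how many VCF records carry each FILTER label. Records with
# several labels ("a;b") count once for each label.
filter_counts() {
  awk -F'\t' '!/^#/ {
    k = split($7, labels, ";")
    for (i = 1; i <= k; i++) n[labels[i]]++
  }
  END { for (f in n) print f "\t" n[f] }' | LC_ALL=C sort
}

# Usage (commented out; assumes bcftools and the filtered VCF):
# bcftools view output_tnscope.filtered.vcf.gz | filter_counts
```

A very high count for a single label can be a hint that the corresponding threshold does not suit the assay and deserves a closer look.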