Update | Downloading and processing the gnomAD population frequency database

Two files are currently available:

  1. af-only-gnomad.hg38.vcf.gz (provided by GATK)
  2. gnomad.exomes.r2.1.1.sites.liftover_grch38.vcf (from the gnomAD website)

Both are processed in parallel and compared at the end.

# Download from the gnomAD website
wget -c https://storage.googleapis.com/gcp-public-data--gnomad/release/2.1.1/liftover_grch38/vcf/exomes/gnomad.exomes.r2.1.1.sites.liftover_grch38.vcf.bgz &

Take a look at the contents:

bcftools view -H  af-only-gnomad.hg38.vcf.gz  | head -n 3

#CHROM POS ID REF ALT QUAL FILTER INFO

chr1 10067 . T TAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCC 30.35 PASS AC=3;AF=7.384e-05

chr1 10108 . CAACCCT C 46514.3 PASS AC=6;AF=0.0001525

chr1 10109 . AACCCT A 89837.3 PASS AC=48;AF=0.001223

# This file turns out to contain multi-allelic sites
zcat af-only-gnomad.hg38.vcf.gz | grep -v '##' | grep ',' | head

Field meanings

AC , Alternate allele count for samples

AC0 , Allele count is zero after filtering out low-confidence genotypes ( GQ < 20; DP < 10; and AB < 0.2 for het calls )

AN , Total number of alleles in samples

AF , Alternate allele frequency in samples

AF_raw , Alternate allele frequency in samples, before removing low-confidence genotypes

AF_eas , Alternate allele frequency in samples of East Asian ancestry

RF , Failed random forest filtering thresholds of 0.055272738028512555, 0.20641025579497013 ( probabilities of being a true positive variant ) for SNPs, indels

PASS , Passed all variant filters

If AN is 0, the site was not covered, so the AF value presumably cannot be trusted?
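The field definitions above imply AF = AC / AN, which is undefined when AN is 0. The following is a minimal sketch of that arithmetic, not part of the original workflow; the function name `af_from_info` is hypothetical, and it assumes a single-ALT record whose INFO string carries plain `AC=` and `AN=` keys.

```shell
# Recompute AF from the AC and AN keys of a VCF INFO string.
# Prints "." when AC or AN is missing or AN is 0 (AF is undefined there).
af_from_info() {
  awk -v info="$1" 'BEGIN {
    n = split(info, kv, ";")
    ac = ""; an = ""
    for (i = 1; i <= n; i++) {
      if (kv[i] ~ /^AC=/) ac = substr(kv[i], 4)   # matches AC= only, not AC_raw=
      if (kv[i] ~ /^AN=/) an = substr(kv[i], 4)
    }
    if (ac == "" || an == "" || an + 0 == 0) { print "."; exit }
    printf "%.6g\n", ac / an
  }'
}

af_from_info "AC=14;AF=0.00472654;AN=2962"   # INFO of the chr1:12559 record
af_from_info "AC=0;AF_raw=0.0457108;AN=0"    # INFO of the chr1:12198 record
```

For the chr1:12559 record this gives 14 / 2962, which agrees with the AF stored in the file; for the AN=0 record there is nothing to divide by, which is consistent with that record carrying no AF key at all.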

bcftools view -H  gnomad.exomes.r2.1.1.sites.liftover_grch38.vcf  | head -n 1

#CHROM POS ID REF ALT QUAL FILTER INFO

chr1 12198 rs62635282 G C 9876.24 AC0 AC=0;AC_afr=0;AC_afr_female=0;AC_afr_male=0;AC_amr=0;AC_amr_female=0;AC_amr_male=0;AC_asj=0;AC_asj_female=0;AC_asj_male=0;AC_eas=0;AC_eas_female=0;AC_eas_jpn=0;AC_eas_kor=0;AC_eas_male=0;AC_eas_oea=0;AC_female=0;AC_fin=0;AC_fin_female=0;AC_fin_male=0;AC_male=0;AC_nfe=0;AC_nfe_bgr=0;AC_nfe_est=0;AC_nfe_female=0;AC_nfe_male=0;AC_nfe_nwe=0;AC_nfe_onf=0;AC_nfe_seu=0;AC_nfe_swe=0;AC_oth=0;AC_oth_female=0;AC_oth_male=0;AC_raw=227;AC_sas=0;AC_sas_female=0;AC_sas_male=0; AF_raw=0.0457108 ;AN=0;AN_afr=0;AN_afr_female=0;AN_afr_male=0;AN_amr=0;AN_amr_female=0;AN_amr_male=0;AN_asj=0;AN_asj_female=0;AN_asj_male=0;AN_eas=0;AN_eas_female=0;AN_eas_jpn=0;AN_eas_kor=0;AN_eas_male=0;AN_eas_oea=0;AN_female=0;AN_fin=0;AN_fin_female=0;AN_fin_male=0;AN_male=0;AN_nfe=0;AN_nfe_bgr=0;AN_nfe_est=0;AN_nfe_female=0;AN_nfe_male=0;AN_nfe_nwe=0;AN_nfe_onf=0;AN_nfe_seu=0;AN_nfe_swe=0;AN_oth=0;AN_oth_female=0;AN_oth_male=0;AN_raw=4966;AN_sas=0;AN_sas_female=0;AN_sas_male=0;BaseQRankSum=0;ClippingRankSum=0.358;DP=9204;FS=0;InbreedingCoeff=0.0098;MQ=23.04;MQRankSum=0.736;OriginalContig=1;OriginalStart=12198;QD=13.95;ReadPosRankSum=0.736;SOR=0.302;VQSLOD=1.01;VQSR_culprit=MQ;ab_hist_alt_bin_freq=0|0|0|0|1|0|2|0|2|0|10|0|1|28|0|3|0|0|0|0;age_hist_het_bin_freq=0|0|0|0|0|0|0|0|0|0;age_hist_het_n_larger=0;age_hist_het_n_smaller=0;age_hist_hom_bin_freq=0|0|0|0|0|0|0|0|0|0;age_hist_hom_n_larger=0;age_hist_hom_n_smaller=0;allele_type=snv;controls_AC=0;controls_AC_afr=0;controls_AC_afr_female=0;controls_AC_afr_male=0;controls_AC_amr=0;controls_AC_amr_female=0;controls_AC_amr_male=0;controls_AC_asj=0;controls_AC_asj_female=0;controls_AC_asj_male=0;controls_AC_eas=0;controls_AC_eas_female=0;controls_AC_eas_jpn=0;controls_AC_eas_kor=0;controls_AC_eas_male=0;controls_AC_eas_oea=0;controls_AC_female=0;controls_AC_fin=0;controls_AC_fin_female=0;controls_AC_fin_male=0;controls_AC_male=0;controls_AC_nfe=0;controls_AC_nfe_bgr=0;controls_AC_nfe_est=0;controls_AC_nfe_female=0;control
s_AC_nfe_male=0;controls_AC_nfe_nwe=0;controls_AC_nfe_onf=0;controls_AC_nfe_seu=0;controls_AC_nfe_swe=0;controls_AC_oth=0;controls_AC_oth_female=0;controls_AC_oth_male=0;controls_AC_raw=109;controls_AC_sas=0;controls_AC_sas_female=0;controls_AC_sas_male=0;controls_AF_raw=0.046661;controls_AN=0;controls_AN_afr=0;controls_AN_afr_female=0;controls_AN_afr_male=0;controls_AN_amr=0;controls_AN_amr_female=0;controls_AN_amr_male=0;controls_AN_asj=0;controls_AN_asj_female=0;controls_AN_asj_male=0;controls_AN_eas=0;controls_AN_eas_female=0;controls_AN_eas_jpn=0;controls_AN_eas_kor=0;controls_AN_eas_male=0;controls_AN_eas_oea=0;controls_AN_female=0;controls_AN_fin=0;controls_AN_fin_female=0;controls_AN_fin_male=0;controls_AN_male=0;controls_AN_nfe=0;controls_AN_nfe_bgr=0;controls_AN_nfe_est=0;controls_AN_nfe_female=0;controls_AN_nfe_male=0;controls_AN_nfe_nwe=0;controls_AN_nfe_onf=0;controls_AN_nfe_seu=0;controls_AN_nfe_swe=0;controls_AN_oth=0;controls_AN_oth_female=0;controls_AN_oth_male=0;controls_AN_raw=2336;controls_AN_sas=0;controls_AN_sas_female=0;controls_AN_sas_male=0;controls_faf95=0;controls_faf95_afr=0;controls_faf95_amr=0;controls_faf95_eas=0;controls_faf95_nfe=0;controls_faf95_sas=0;controls_faf99=0;controls_faf99_afr=0;controls_faf99_amr=0;controls_faf99_eas=0;controls_faf99_nfe=0;controls_faf99_sas=0;controls_nhomalt=0;controls_nhomalt_afr=0;controls_nhomalt_afr_female=0;controls_nhomalt_afr_male=0;controls_nhomalt_amr=0;controls_nhomalt_amr_female=0;controls_nhomalt_amr_male=0;controls_nhomalt_asj=0;controls_nhomalt_asj_female=0;controls_nhomalt_asj_male=0;controls_nhomalt_eas=0;controls_nhomalt_eas_female=0;controls_nhomalt_eas_jpn=0;controls_nhomalt_eas_kor=0;controls_nhomalt_eas_male=0;controls_nhomalt_eas_oea=0;controls_nhomalt_female=0;controls_nhomalt_fin=0;controls_nhomalt_fin_female=0;controls_nhomalt_fin_male=0;controls_nhomalt_male=0;controls_nhomalt_nfe=0;controls_nhomalt_nfe_bgr=0;controls_nhomalt_nfe_est=0;controls_nhomalt_nfe_female=0;controls_nho
malt_nfe_male=0;controls_nhomalt_nfe_nwe=0;controls_nhomalt_nfe_onf=0;controls_nhomalt_nfe_seu=0;controls_nhomalt_nfe_swe=0;controls_nhomalt_oth=0;controls_nhomalt_oth_female=0;controls_nhomalt_oth_male=0;controls_nhomalt_raw=44;controls_nhomalt_sas=0;controls_nhomalt_sas_female=0;controls_nhomalt_sas_male=0;dp_hist_all_bin_freq=125724|24|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0;dp_hist_all_n_larger=0;dp_hist_alt_bin_freq=130|7|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0;dp_hist_alt_n_larger=0;faf95=0;faf95_afr=0;faf95_amr=0;faf95_eas=0;faf95_nfe=0;faf95_sas=0;faf99=0;faf99_afr=0;faf99_amr=0;faf99_eas=0;faf99_nfe=0;faf99_sas=0;gq_hist_all_bin_freq=1898|511|26|28|8|4|0|5|2|0|0|0|1|0|0|0|0|0|0|0;gq_hist_alt_bin_freq=14|78|1|25|7|4|0|5|2|0|0|0|1|0|0|0|0|0|0|0;n_alt_alleles=1;nhomalt=0;nhomalt_afr=0;nhomalt_afr_female=0;nhomalt_afr_male=0;nhomalt_amr=0;nhomalt_amr_female=0;nhomalt_amr_male=0;nhomalt_asj=0;nhomalt_asj_female=0;nhomalt_asj_male=0;nhomalt_eas=0;nhomalt_eas_female=0;nhomalt_eas_jpn=0;nhomalt_eas_kor=0;nhomalt_eas_male=0;nhomalt_eas_oea=0;nhomalt_female=0;nhomalt_fin=0;nhomalt_fin_female=0;nhomalt_fin_male=0;nhomalt_male=0;nhomalt_nfe=0;nhomalt_nfe_bgr=0;nhomalt_nfe_est=0;nhomalt_nfe_female=0;nhomalt_nfe_male=0;nhomalt_nfe_nwe=0;nhomalt_nfe_onf=0;nhomalt_nfe_seu=0;nhomalt_nfe_swe=0;nhomalt_oth=0;nhomalt_oth_female=0;nhomalt_oth_male=0;nhomalt_raw=90;nhomalt_sas=0;nhomalt_sas_female=0;nhomalt_sas_male=0;non_cancer_AC=0;non_cancer_AC_afr=0;non_cancer_AC_afr_female=0;non_cancer_AC_afr_male=0;non_cancer_AC_amr=0;non_cancer_AC_amr_female=0;non_cancer_AC_amr_male=0;non_cancer_AC_asj=0;non_cancer_AC_asj_female=0;non_cancer_AC_asj_male=0;non_cancer_AC_eas=0;non_cancer_AC_eas_female=0;non_cancer_AC_eas_jpn=0;non_cancer_AC_eas_kor=0;non_cancer_AC_eas_male=0;non_cancer_AC_eas_oea=0;non_cancer_AC_female=0;non_cancer_AC_fin=0;non_cancer_AC_fin_female=0;non_cancer_AC_fin_male=0;non_cancer_AC_male=0;non_cancer_AC_nfe=0;non_cancer_AC_nfe_bgr=0;non_cancer_AC_nfe_est=0;non_cancer_AC_nfe_f
emale=0;non_cancer_AC_nfe_male=0;non_cancer_AC_nfe_nwe=0;non_cancer_AC_nfe_onf=0;non_cancer_AC_nfe_seu=0;non_cancer_AC_nfe_swe=0;non_cancer_AC_oth=0;non_cancer_AC_oth_female=0;non_cancer_AC_oth_male=0;non_cancer_AC_raw=227;non_cancer_AC_sas=0;non_cancer_AC_sas_female=0;non_cancer_AC_sas_male=0;non_cancer_AF_raw=0.0457293;non_cancer_AN=0;non_cancer_AN_afr=0;non_cancer_AN_afr_female=0;non_cancer_AN_afr_male=0;non_cancer_AN_amr=0;non_cancer_AN_amr_female=0;non_cancer_AN_amr_male=0;non_cancer_AN_asj=0;non_cancer_AN_asj_female=0;non_cancer_AN_asj_male=0;non_cancer_AN_eas=0;non_cancer_AN_eas_female=0;non_cancer_AN_eas_jpn=0;non_cancer_AN_eas_kor=0;non_cancer_AN_eas_male=0;non_cancer_AN_eas_oea=0;non_cancer_AN_female=0;non_cancer_AN_fin=0;non_cancer_AN_fin_female=0;non_cancer_AN_fin_male=0;non_cancer_AN_male=0;non_cancer_AN_nfe=0;non_cancer_AN_nfe_bgr=0;non_cancer_AN_nfe_est=0;non_cancer_AN_nfe_female=0;non_cancer_AN_nfe_male=0;non_cancer_AN_nfe_nwe=0;non_cancer_AN_nfe_onf=0;non_cancer_AN_nfe_seu=0;non_cancer_AN_nfe_swe=0;non_cancer_AN_oth=0;non_cancer_AN_oth_female=0;non_cancer_AN_oth_male=0;non_cancer_AN_raw=4964;non_cancer_AN_sas=0;non_cancer_AN_sas_female=0;non_cancer_AN_sas_male=0;non_cancer_faf95=0;non_cancer_faf95_afr=0;non_cancer_faf95_amr=0;non_cancer_faf95_eas=0;non_cancer_faf95_nfe=0;non_cancer_faf95_sas=0;non_cancer_faf99=0;non_cancer_faf99_afr=0;non_cancer_faf99_amr=0;non_cancer_faf99_eas=0;non_cancer_faf99_nfe=0;non_cancer_faf99_sas=0;non_cancer_nhomalt=0;non_cancer_nhomalt_afr=0;non_cancer_nhomalt_afr_female=0;non_cancer_nhomalt_afr_male=0;non_cancer_nhomalt_amr=0;non_cancer_nhomalt_amr_female=0;non_cancer_nhomalt_amr_male=0;non_cancer_nhomalt_asj=0;non_cancer_nhomalt_asj_female=0;non_cancer_nhomalt_asj_male=0;non_cancer_nhomalt_eas=0;non_cancer_nhomalt_eas_female=0;non_cancer_nhomalt_eas_jpn=0;non_cancer_nhomalt_eas_kor=0;non_cancer_nhomalt_eas_male=0;non_cancer_nhomalt_eas_oea=0;non_cancer_nhomalt_female=0;non_cancer_nhomalt_fin=0;non_cancer_nhomalt_fin_fe
male=0;non_cancer_nhomalt_fin_male=0;non_cancer_nhomalt_male=0;non_cancer_nhomalt_nfe=0;non_cancer_nhomalt_nfe_bgr=0;non_cancer_nhomalt_nfe_est=0;non_cancer_nhomalt_nfe_female=0;non_cancer_nhomalt_nfe_male=0;non_cancer_nhomalt_nfe_nwe=0;non_cancer_nhomalt_nfe_onf=0;non_cancer_nhomalt_nfe_seu=0;non_cancer_nhomalt_nfe_swe=0;non_cancer_nhomalt_oth=0;non_cancer_nhomalt_oth_female=0;non_cancer_nhomalt_oth_male=0;non_cancer_nhomalt_raw=90;non_cancer_nhomalt_sas=0;non_cancer_nhomalt_sas_female=0;non_cancer_nhomalt_sas_male=0;non_neuro_AC=0;non_neuro_AC_afr=0;non_neuro_AC_afr_female=0;non_neuro_AC_afr_male=0;non_neuro_AC_amr=0;non_neuro_AC_amr_female=0;non_neuro_AC_amr_male=0;non_neuro_AC_asj=0;non_neuro_AC_asj_female=0;non_neuro_AC_asj_male=0;non_neuro_AC_eas=0;non_neuro_AC_eas_female=0;non_neuro_AC_eas_jpn=0;non_neuro_AC_eas_kor=0;non_neuro_AC_eas_male=0;non_neuro_AC_eas_oea=0;non_neuro_AC_female=0;non_neuro_AC_fin=0;non_neuro_AC_fin_female=0;non_neuro_AC_fin_male=0;non_neuro_AC_male=0;non_neuro_AC_nfe=0;non_neuro_AC_nfe_bgr=0;non_neuro_AC_nfe_est=0;non_neuro_AC_nfe_female=0;non_neuro_AC_nfe_male=0;non_neuro_AC_nfe_nwe=0;non_neuro_AC_nfe_onf=0;non_neuro_AC_nfe_seu=0;non_neuro_AC_nfe_swe=0;non_neuro_AC_oth=0;non_neuro_AC_oth_female=0;non_neuro_AC_oth_male=0;non_neuro_AC_raw=225;non_neuro_AC_sas=0;non_neuro_AC_sas_female=0;non_neuro_AC_sas_male=0;non_neuro_AF_raw=0.0470908;non_neuro_AN=0;non_neuro_AN_afr=0;non_neuro_AN_afr_female=0;non_neuro_AN_afr_male=0;non_neuro_AN_amr=0;non_neuro_AN_amr_female=0;non_neuro_AN_amr_male=0;non_neuro_AN_asj=0;non_neuro_AN_asj_female=0;non_neuro_AN_asj_male=0;non_neuro_AN_eas=0;non_neuro_AN_eas_female=0;non_neuro_AN_eas_jpn=0;non_neuro_AN_eas_kor=0;non_neuro_AN_eas_male=0;non_neuro_AN_eas_oea=0;non_neuro_AN_female=0;non_neuro_AN_fin=0;non_neuro_AN_fin_female=0;non_neuro_AN_fin_male=0;non_neuro_AN_male=0;non_neuro_AN_nfe=0;non_neuro_AN_nfe_bgr=0;non_neuro_AN_nfe_est=0;non_neuro_AN_nfe_female=0;non_neuro_AN_nfe_male=0;non_neuro_AN_nfe_nwe=0;non
_neuro_AN_nfe_onf=0;non_neuro_AN_nfe_seu=0;non_neuro_AN_nfe_swe=0;non_neuro_AN_oth=0;non_neuro_AN_oth_female=0;non_neuro_AN_oth_male=0;non_neuro_AN_raw=4778;non_neuro_AN_sas=0;non_neuro_AN_sas_female=0;non_neuro_AN_sas_male=0;non_neuro_faf95=0;non_neuro_faf95_afr=0;non_neuro_faf95_amr=0;non_neuro_faf95_eas=0;non_neuro_faf95_nfe=0;non_neuro_faf95_sas=0;non_neuro_faf99=0;non_neuro_faf99_afr=0;non_neuro_faf99_amr=0;non_neuro_faf99_eas=0;non_neuro_faf99_nfe=0;non_neuro_faf99_sas=0;non_neuro_nhomalt=0;non_neuro_nhomalt_afr=0;non_neuro_nhomalt_afr_female=0;non_neuro_nhomalt_afr_male=0;non_neuro_nhomalt_amr=0;non_neuro_nhomalt_amr_female=0;non_neuro_nhomalt_amr_male=0;non_neuro_nhomalt_asj=0;non_neuro_nhomalt_asj_female=0;non_neuro_nhomalt_asj_male=0;non_neuro_nhomalt_eas=0;non_neuro_nhomalt_eas_female=0;non_neuro_nhomalt_eas_jpn=0;non_neuro_nhomalt_eas_kor=0;non_neuro_nhomalt_eas_male=0;non_neuro_nhomalt_eas_oea=0;non_neuro_nhomalt_female=0;non_neuro_nhomalt_fin=0;non_neuro_nhomalt_fin_female=0;non_neuro_nhomalt_fin_male=0;non_neuro_nhomalt_male=0;non_neuro_nhomalt_nfe=0;non_neuro_nhomalt_nfe_bgr=0;non_neuro_nhomalt_nfe_est=0;non_neuro_nhomalt_nfe_female=0;non_neuro_nhomalt_nfe_male=0;non_neuro_nhomalt_nfe_nwe=0;non_neuro_nhomalt_nfe_onf=0;non_neuro_nhomalt_nfe_seu=0;non_neuro_nhomalt_nfe_swe=0;non_neuro_nhomalt_oth=0;non_neuro_nhomalt_oth_female=0;non_neuro_nhomalt_oth_male=0;non_neuro_nhomalt_raw=89;non_neuro_nhomalt_sas=0;non_neuro_nhomalt_sas_female=0;non_neuro_nhomalt_sas_male=0;non_topmed_AC=0;non_topmed_AC_afr=0;non_topmed_AC_afr_female=0;non_topmed_AC_afr_male=0;non_topmed_AC_amr=0;non_topmed_AC_amr_female=0;non_topmed_AC_amr_male=0;non_topmed_AC_asj=0;non_topmed_AC_asj_female=0;non_topmed_AC_asj_male=0;non_topmed_AC_eas=0;non_topmed_AC_eas_female=0;non_topmed_AC_eas_jpn=0;non_topmed_AC_eas_kor=0;non_topmed_AC_eas_male=0;non_topmed_AC_eas_oea=0;non_topmed_AC_female=0;non_topmed_AC_fin=0;non_topmed_AC_fin_female=0;non_topmed_AC_fin_male=0;non_topmed_AC_male=0;non_t
opmed_AC_nfe=0;non_topmed_AC_nfe_bgr=0;non_topmed_AC_nfe_est=0;non_topmed_AC_nfe_female=0;non_topmed_AC_nfe_male=0;non_topmed_AC_nfe_nwe=0;non_topmed_AC_nfe_onf=0;non_topmed_AC_nfe_seu=0;non_topmed_AC_nfe_swe=0;non_topmed_AC_oth=0;non_topmed_AC_oth_female=0;non_topmed_AC_oth_male=0;non_topmed_AC_raw=218;non_topmed_AC_sas=0;non_topmed_AC_sas_female=0;non_topmed_AC_sas_male=0;non_topmed_AF_raw=0.0459334;non_topmed_AN=0;non_topmed_AN_afr=0;non_topmed_AN_afr_female=0;non_topmed_AN_afr_male=0;non_topmed_AN_amr=0;non_topmed_AN_amr_female=0;non_topmed_AN_amr_male=0;non_topmed_AN_asj=0;non_topmed_AN_asj_female=0;non_topmed_AN_asj_male=0;non_topmed_AN_eas=0;non_topmed_AN_eas_female=0;non_topmed_AN_eas_jpn=0;non_topmed_AN_eas_kor=0;non_topmed_AN_eas_male=0;non_topmed_AN_eas_oea=0;non_topmed_AN_female=0;non_topmed_AN_fin=0;non_topmed_AN_fin_female=0;non_topmed_AN_fin_male=0;non_topmed_AN_male=0;non_topmed_AN_nfe=0;non_topmed_AN_nfe_bgr=0;non_topmed_AN_nfe_est=0;non_topmed_AN_nfe_female=0;non_topmed_AN_nfe_male=0;non_topmed_AN_nfe_nwe=0;non_topmed_AN_nfe_onf=0;non_topmed_AN_nfe_seu=0;non_topmed_AN_nfe_swe=0;non_topmed_AN_oth=0;non_topmed_AN_oth_female=0;non_topmed_AN_oth_male=0;non_topmed_AN_raw=4746;non_topmed_AN_sas=0;non_topmed_AN_sas_female=0;non_topmed_AN_sas_male=0;non_topmed_faf95=0;non_topmed_faf95_afr=0;non_topmed_faf95_amr=0;non_topmed_faf95_eas=0;non_topmed_faf95_nfe=0;non_topmed_faf95_sas=0;non_topmed_faf99=0;non_topmed_faf99_afr=0;non_topmed_faf99_amr=0;non_topmed_faf99_eas=0;non_topmed_faf99_nfe=0;non_topmed_faf99_sas=0;non_topmed_nhomalt=0;non_topmed_nhomalt_afr=0;non_topmed_nhomalt_afr_female=0;non_topmed_nhomalt_afr_male=0;non_topmed_nhomalt_amr=0;non_topmed_nhomalt_amr_female=0;non_topmed_nhomalt_amr_male=0;non_topmed_nhomalt_asj=0;non_topmed_nhomalt_asj_female=0;non_topmed_nhomalt_asj_male=0;non_topmed_nhomalt_eas=0;non_topmed_nhomalt_eas_female=0;non_topmed_nhomalt_eas_jpn=0;non_topmed_nhomalt_eas_kor=0;non_topmed_nhomalt_eas_male=0;non_topmed_nhomalt_eas_oe
a=0;non_topmed_nhomalt_female=0;non_topmed_nhomalt_fin=0;non_topmed_nhomalt_fin_female=0;non_topmed_nhomalt_fin_male=0;non_topmed_nhomalt_male=0;non_topmed_nhomalt_nfe=0;non_topmed_nhomalt_nfe_bgr=0;non_topmed_nhomalt_nfe_est=0;non_topmed_nhomalt_nfe_female=0;non_topmed_nhomalt_nfe_male=0;non_topmed_nhomalt_nfe_nwe=0;non_topmed_nhomalt_nfe_onf=0;non_topmed_nhomalt_nfe_seu=0;non_topmed_nhomalt_nfe_swe=0;non_topmed_nhomalt_oth=0;non_topmed_nhomalt_oth_female=0;non_topmed_nhomalt_oth_male=0;non_topmed_nhomalt_raw=87;non_topmed_nhomalt_sas=0;non_topmed_nhomalt_sas_female=0;non_topmed_nhomalt_sas_male=0;pab_max=1;rf_label=FP;rf_negative_label;rf_tp_probability=0.836542;rf_train;segdup;variant_type=snv;vep=C|downstream_gene_variant|MODIFIER|WASH7P|ENSG00000227232|Transcript|ENST00000423562|unprocessed_pseudogene||||||||||rs62635282|1|2165|-1||SNV|1|HGNC|38034||||||||||||||||||||||||||||||||||||||||||,C|downstream_gene_variant|MODIFIER|WASH7P|ENSG00000227232|Transcript|ENST00000438504|unprocessed_pseudogene||||||||||rs62635282|1|2165|-1||SNV|1|HGNC|38034|YES|||||||||||||||||||||||||||||||||||||||||,C|non_coding_transcript_exon_variant&non_coding_transcript_variant|MODIFIER|DDX11L1|ENSG00000223972|Transcript|ENST00000450305|transcribed_unprocessed_pseudogene|2/6||ENST00000450305.2:n.68G>C||68|||||rs62635282|1||1||SNV|1|HGNC|37102||||||||||||||||||||||||||||||||||||||||||,C|non_coding_transcript_exon_variant&non_coding_transcript_variant|MODIFIER|DDX11L1|ENSG00000223972|Transcript|ENST00000456328|processed_transcript|1/3||ENST00000456328.2:n.330G>C||330|||||rs62635282|1||1||SNV|1|HGNC|37102|YES|||||||||||||||||||||||||||||||||||||||||,C|downstream_gene_variant|MODIFIER|WASH7P|ENSG00000227232|Transcript|ENST00000488147|unprocessed_pseudogene||||||||||rs62635282|1|2206|-1||SNV|1|HGNC|38034||||||||||||||||||||||||||||||||||||||||||,C|non_coding_transcript_exon_variant&non_coding_transcript_variant|MODIFIER|DDX11L1|ENSG00000223972|Transcript|ENST00000515242|transcribed_unprocess
ed_pseudogene|1/3||ENST00000515242.2:n.327G>C||327|||||rs62635282|1||1||SNV|1|HGNC|37102||||||||||||||||||||||||||||||||||||||||||,C|non_coding_transcript_exon_variant&non_coding_transcript_variant|MODIFIER|DDX11L1|ENSG00000223972|Transcript|ENST00000518655|transcribed_unprocessed_pseudogene|1/4||ENST00000518655.2:n.325G>C||325|||||rs62635282|1||1||SNV|1|HGNC|37102||||||||||||||||||||||||||||||||||||||||||,C|downstream_gene_variant|MODIFIER|WASH7P|ENSG00000227232|Transcript|ENST00000538476|unprocessed_pseudogene||||||||||rs62635282|1|2213|-1||SNV|1|HGNC|38034||||||||||||||||||||||||||||||||||||||||||,C|downstream_gene_variant|MODIFIER|WASH7P|ENSG00000227232|Transcript|ENST00000541675|unprocessed_pseudogene||||||||||rs62635282|1|2165|-1||SNV|1|HGNC|38034||||||||||||||||||||||||||||||||||||||||||,C|regulatory_region_variant|MODIFIER|||RegulatoryFeature|ENSR00001576075|CTCF_binding_site||||||||||rs62635282|1||||SNV|1||||||||||||||||||||||||||||||||||||||||||||

bcftools annotate -x ^INFO/AC,^INFO/AN,^INFO/AF,^INFO/AF_raw,^INFO/AF_eas gnomad.exomes.r2.1.1.sites.liftover_grch38.vcf | \
  grep -v "^##" | head -n 20

#CHROM POS ID REF ALT QUAL FILTER INFO

chr1 12198 rs62635282 G C 9876.24 AC0 AC=0;AF_raw=0.0457108;AN=0

chr1 12237 rs1324090652 G A 81.96 AC0 AC=0;AF_raw=0.000440995;AN=0

chr1 12259 rs1330604035 G C 37.42 AC0 AC=0;AF=0;AF_raw=0.000155788;AN=2

chr1 12266 rs1442951560 G A 2721.48 AC0 AC=0;AF_raw=0.00434708;AN=0

chr1 12272 rs1281272113 G A 2707.42 AC0 AC=0;AF=0;AF_raw=0.00430126;AN=2

chr1 12554 rs1371050997 A G 68.11 AC0;RF AC=0;AF=0;AF_eas=0;AF_raw=0.000294357;AN=3038

chr1 12559 rs1223049744 G A 1666.64 RF AC=14;AF=0.00472654;AF_eas=0.00259067;AF_raw=0.00287838;AN=2962

chr1 12573 rs1273605438 T C 366.59 PASS AC=2;AF=0.000476644;AF_eas=0;AF_raw=0.000172585;AN=4196

chr1 12586 rs1336625132 C T 223.87 PASS AC=2;AF=0.000558971;AF_eas=0.00220264;AF_raw=6.50675e-05;AN=3578

chr1 12596 rs1211439372 C A 44.76 AC0 AC=0;AF=0;AF_eas=0;AF_raw=2.17855e-05;AN=2952

chr1 12597 rs1272077481 T C 569.92 RF AC=8;AF=0.00275103;AF_eas=0;AF_raw=0.0010003;AN=2908

chr1 12599 rs1437963543 CT C 448.69 AC0 AC=0;AF=0;AF_eas=0;AF_raw=2.18036e-05;AN=2830

chr1 12612 rs1205998786 GGT G 41.94 AC0;RF AC=0;AF=0;AF_eas=0;AF_raw=4.02609e-05;AN=5600

chr1 12625 rs1235144565 G A 55.63 PASS AC=1;AF=0.000174825;AF_eas=0;AF_raw=5.98205e-05;AN=5720

chr1 12659 rs1469036210 G C 3242.59 RF AC=7;AF=0.00106093;AF_eas=0.00394737;AF_raw=0.001232;AN=6598

chr1 12670 rs1182032602 G C 2475.63 RF AC=20;AF=0.00291971;AF_eas=0.00791557;AF_raw=0.00295915;AN=6850

chr1 12672 rs1419072050 C T 3690.4 RF AC=13;AF=0.00200803;AF_eas=0.00531915;AF_raw=0.00194156;AN=6474

chr1 12673 rs1476353024 G A 1057.24 RF AC=10;AF=0.0014339;AF_eas=0;AF_raw=0.0019007;AN=6974

chr1 12680 rs1163072234 G A 796.7 PASS AC=6;AF=0.000839396;AF_eas=0.00683371;AF_raw=0.000667608;AN=7148

bcftools annotate -x ^INFO/AC,^INFO/AN,^INFO/AF,^INFO/AF_raw gnomad.exomes.r2.1.1.sites.liftover_grch38.vcf | grep -v "^##" | head -n 20

#CHROM POS ID REF ALT QUAL FILTER INFO

chr1 12198 rs62635282 G C 9876.24 AC0 AC=0;AF_raw=0.0457108;AN=0

chr1 12237 rs1324090652 G A 81.96 AC0 AC=0;AF_raw=0.000440995;AN=0

chr1 12259 rs1330604035 G C 37.42 AC0 AC=0;AF=0;AF_raw=0.000155788;AN=2

chr1 12266 rs1442951560 G A 2721.48 AC0 AC=0;AF_raw=0.00434708;AN=0

chr1 12272 rs1281272113 G A 2707.42 AC0 AC=0;AF=0;AF_raw=0.00430126;AN=2

chr1 12554 rs1371050997 A G 68.11 AC0;RF AC=0;AF=0;AF_raw=0.000294357;AN=3038

chr1 12559 rs1223049744 G A 1666.64 RF AC=14;AF=0.00472654;AF_raw=0.00287838;AN=2962

chr1 12573 rs1273605438 T C 366.59 PASS AC=2;AF=0.000476644;AF_raw=0.000172585;AN=4196

chr1 12586 rs1336625132 C T 223.87 PASS AC=2;AF=0.000558971;AF_raw=6.50675e-05;AN=3578

chr1 12596 rs1211439372 C A 44.76 AC0 AC=0;AF=0;AF_raw=2.17855e-05;AN=2952

chr1 12597 rs1272077481 T C 569.92 RF AC=8;AF=0.00275103;AF_raw=0.0010003;AN=2908

chr1 12599 rs1437963543 CT C 448.69 AC0 AC=0;AF=0;AF_raw=2.18036e-05;AN=2830

chr1 12612 rs1205998786 GGT G 41.94 AC0;RF AC=0;AF=0;AF_raw=4.02609e-05;AN=5600

chr1 12625 rs1235144565 G A 55.63 PASS AC=1;AF=0.000174825;AF_raw=5.98205e-05;AN=5720

chr1 12659 rs1469036210 G C 3242.59 RF AC=7;AF=0.00106093;AF_raw=0.001232;AN=6598

chr1 12670 rs1182032602 G C 2475.63 RF AC=20;AF=0.00291971;AF_raw=0.00295915;AN=6850

chr1 12672 rs1419072050 C T 3690.4 RF AC=13;AF=0.00200803;AF_raw=0.00194156;AN=6474

chr1 12673 rs1476353024 G A 1057.24 RF AC=10;AF=0.0014339;AF_raw=0.0019007;AN=6974

chr1 12680 rs1163072234 G A 796.7 PASS AC=6;AF=0.000839396;AF_raw=0.000667608;AN=7148

bcftools annotate -x ^INFO/AC,^INFO/AN,^INFO/AF gnomad.exomes.r2.1.1.sites.liftover_grch38.vcf | grep -v "^##" | head -n 20 | sed 's/\t/ /g'

#CHROM POS ID REF ALT QUAL FILTER INFO

chr1 12198 rs62635282 G C 9876.24 AC0 AC=0;AN=0

chr1 12237 rs1324090652 G A 81.96 AC0 AC=0;AN=0

chr1 12259 rs1330604035 G C 37.42 AC0 AC=0;AF=0;AN=2

chr1 12266 rs1442951560 G A 2721.48 AC0 AC=0;AN=0

chr1 12272 rs1281272113 G A 2707.42 AC0 AC=0;AF=0;AN=2

chr1 12554 rs1371050997 A G 68.11 AC0;RF AC=0;AF=0;AN=3038

chr1 12559 rs1223049744 G A 1666.64 RF AC=14;AF=0.00472654;AN=2962

chr1 12573 rs1273605438 T C 366.59 PASS AC=2;AF=0.000476644;AN=4196

chr1 12586 rs1336625132 C T 223.87 PASS AC=2;AF=0.000558971;AN=3578

chr1 12596 rs1211439372 C A 44.76 AC0 AC=0;AF=0;AN=2952

chr1 12597 rs1272077481 T C 569.92 RF AC=8;AF=0.00275103;AN=2908

chr1 12599 rs1437963543 CT C 448.69 AC0 AC=0;AF=0;AN=2830

chr1 12612 rs1205998786 GGT G 41.94 AC0;RF AC=0;AF=0;AN=5600

chr1 12625 rs1235144565 G A 55.63 PASS AC=1;AF=0.000174825;AN=5720

chr1 12659 rs1469036210 G C 3242.59 RF AC=7;AF=0.00106093;AN=6598

chr1 12670 rs1182032602 G C 2475.63 RF AC=20;AF=0.00291971;AN=6850

chr1 12672 rs1419072050 C T 3690.4 RF AC=13;AF=0.00200803;AN=6474

chr1 12673 rs1476353024 G A 1057.24 RF AC=10;AF=0.0014339;AN=6974

chr1 12680 rs1163072234 G A 796.7 PASS AC=6;AF=0.000839396;AN=7148

bcftools annotate -x ^INFO/AF gnomad.exomes.r2.1.1.sites.liftover_grch38.vcf | bcftools norm -f /db/gatk/hg38/Homo_sapiens_assembly38.fasta --multiallelics -both | grep -v "^##" |  cut -f 1,2,3,4,5,8 | sed 's/AF=//g' > gnomad.exomes.r2.1.1.sites.liftover_grch38.AF.spliMulti.norm.vcf.6col.txt
# Check the result:
grep -w rs777038595 gnomad.exomes.r2.1.1.sites.liftover_grch38.AF.spliMulti.norm.vcf.6col.txt

chr1 13417 rs777038595 C A 1.49898e-05

chr1 13417 rs777038595 C CGAGA 0.112528

chr1 13417 rs777038595 C CGGGA 0

chr1 13417 rs777038595 C T 1.49898e-05

The database file appears to have its multi-allelic sites already split:

bcftools annotate -x ^INFO/AF gnomad.exomes.r2.1.1.sites.liftover_grch38.vcf | head -n 5000 | grep -w rs777038595

chr1 13417 rs777038595 C A 2.63878e+07 PASS AF=1.49898e-05

chr1 13417 rs777038595 C CGAGA 2.63878e+07 PASS AF=0.112528

chr1 13417 rs777038595 C CGGGA 2.63878e+07 AC0 AF=0

chr1 13417 rs777038595 C T 2.63878e+07 PASS AF=1.49898e-05

Processing the GATK-provided gnomAD data

af-only-gnomad.hg38 (contains multi-allelic sites)

bcftools annotate -x ^INFO/AF af-only-gnomad.hg38.vcf.gz | bcftools norm -f /db/gatk/hg38/Homo_sapiens_assembly38.fasta --multiallelics -both | grep -v "^##" |  cut -f 1,2,3,4,5,8 | sed 's/AF=//g' > af-only-gnomad.hg38.AF.spliMulti.norm.vcf.6col.txt

The "GnomAD_exome" value shown by dbSNP comes from:

gnomad.exomes.r2.1.1.sites.liftover_grch38

Comparison: line counts

17,201,297  gnomad.exomes.r2.1.1.sites.liftover_grch38.AF.spliMulti.norm.vcf.6col.txt

290,331,359  af-only-gnomad.hg38.AF.spliMulti.norm.vcf.6col.txt

82,985,813  a1000G /ftp.ensembl/chr/ALL.GRCh38.genotypes.20170504.AF.1samp.spliMulti.norm.vcf.6col.txt

Comparison: normalization log (bcftools norm)

gnomad.exomes: Lines total/split/realigned/skipped: 17201296/0/6/0

af-only-gnomad: Lines total/split/realigned/skipped: 268225276/15895112/9642331/0

Comparison: contents

af-only-gnomad.hg38: contig names carry the "chr" prefix, and patch contigs are included

cat af-only-gnomad.hg38.AF.spliMulti.norm.vcf.6col.txt | \
  grep -P 'chr1\t10140\t'

chr1 10140 . ACCCTAAC A 0.0006338

chr1 10140 . A G 0.0001014

Before the multi-allelic site was split:

chr1 10140 . ACCCTAAC A,GCCCTAAC 6752.26 PASS AC=25,4;AF=0.0006338,0.0001014
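To make the before/after relationship concrete, here is a rough awk sketch of what the splitting step does to the ALT and AF columns. It is only an approximation under stated assumptions: the function name `split_multiallelic` is hypothetical, the input is tab-separated VCF body lines with an `AF=` key in INFO, and alleles are merely trimmed of shared bases. Real `bcftools norm -f` also left-aligns indels against the reference FASTA, which this sketch does not attempt.

```shell
# Split each multi-allelic record into one line per ALT allele and
# trim bases shared by REF and ALT. Output: CHROM POS ID REF ALT AF.
split_multiallelic() {
  awk 'BEGIN { FS = OFS = "\t" }
  /^#/ { next }
  {
    n_alt = split($5, alt, ",")
    af_str = ""
    n_kv = split($8, kv, ";")
    for (i = 1; i <= n_kv; i++)
      if (kv[i] ~ /^AF=/) af_str = substr(kv[i], 4)
    n_af = split(af_str, af, ",")
    for (i = 1; i <= n_alt; i++) {
      pos = $2; ref = $4; a = alt[i]
      # Drop shared trailing bases while both alleles keep at least one base.
      while (length(ref) > 1 && length(a) > 1 &&
             substr(ref, length(ref)) == substr(a, length(a))) {
        ref = substr(ref, 1, length(ref) - 1)
        a = substr(a, 1, length(a) - 1)
      }
      # Drop shared leading bases, shifting the position accordingly.
      while (length(ref) > 1 && length(a) > 1 &&
             substr(ref, 1, 1) == substr(a, 1, 1)) {
        ref = substr(ref, 2); a = substr(a, 2); pos++
      }
      print $1, pos, $3, ref, a, (i <= n_af ? af[i] : ".")
    }
  }'
}

printf 'chr1\t10140\t.\tACCCTAAC\tA,GCCCTAAC\t6752.26\tPASS\tAC=25,4;AF=0.0006338,0.0001014\n' |
  split_multiallelic
```

On the chr1:10140 record this yields the same two lines as the grep output shown earlier: the deletion ACCCTAAC>A with AF 0.0006338 and the SNV A>G with AF 0.0001014.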

For gnomad.exomes.r2.1.1

AN , Total number of alleles in samples (AN=0: the site was not sequenced, or no genotypes remain after quality control)

AC , Alternate allele count for samples

#CHROM POS ID REF ALT QUAL FILTER INFO

chr1 12198 rs62635282 G C 9876.24 AC0 AC=0;AN=0

chr1 12237 rs1324090652 G A 81.96 AC0 AC=0;AN=0

chr1 12259 rs1330604035 G C 37.42 AC0 AC=0;AF=0;AN=2

chr1 12266 rs1442951560 G A 2721.48 AC0 AC=0;AN=0

chr1 12272 rs1281272113 G A 2707.42 AC0 AC=0;AF=0;AN=2

chr1 12554 rs1371050997 A G 68.11 AC0;RF AC=0;AF=0;AN=3038

chr1 12559 rs1223049744 G A 1666.64 RF AC=14;AF=0.00472654;AN=2962

chr1 12573 rs1273605438 T C 366.59 PASS AC=2;AF=0.000476644;AN=4196

chr1 12586 rs1336625132 C T 223.87 PASS AC=2;AF=0.000558971;AN=3578

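Therefore:

  1. In the gnomad.exomes.r2.1.1 file, an AF of "." means the site was not sequenced or no sample has a genotype there; such sites should be discarded.
  2. In the gnomad.exomes.r2.1.1 file, an AF of exactly 0 goes with AC=0, and the AN at such sites may be very low or very high (in a large sample the variant truly was not observed). This VCF is therefore not restricted to sites where at least one sample in the cohort carries the variant. For an AF database this is a useful property (unlike the handling of sequenced patient samples): it increases the number of sites that have an AF value, and a considerable share of those sites are trustworthy because their AN is large.
  3. The af-only-gnomad file contains no "AC=0" records.
# af-only-gnomad.hg38 has no AC=0 records
bcftools view -H  af-only-gnomad.hg38.vcf.gz  | \
  head -n 20000000 | grep 'AC=0'
# no output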

Inspect the AN values in the gnomad.exomes.r2.1.1 file where AC=0:

bcftools annotate -x ^INFO/AC,^INFO/AN,^INFO/AF gnomad.exomes.r2.1.1.sites.liftover_grch38.vcf | grep -v "^##" | grep 'AC=0' | head -n 200 | cut -f 8 | grep 'AF=0'  | grep "AC=\|AF"

Very low AN values are uncommon, but to be more rigorous it is still better to remove such sites (for example, AN<10).

nohup bcftools filter -e "AN<10" -s lowAN gnomad.exomes.r2.1.1.sites.liftover_grch38.vcf | \
  bcftools view -H | cut -f 1,2,4,5,7 | \
  grep -w lowAN > gnomad.exomes.r2.1.1.sites.liftover_grch38.lowAN.txt &
# Inspect
head gnomad.exomes.r2.1.1.sites.liftover_grch38.lowAN.txt

chr1 12198 G C lowAN

chr1 12237 G A lowAN

chr1 12259 G C lowAN

chr1 12266 G A lowAN

chr1 12272 G A lowAN

chr1 30524 G A lowAN

chr1 30528 C T lowAN

wc -l gnomad.exomes.r2.1.1.sites.liftover_grch38.lowAN.txt
# 8641
grep -w chr11  gnomad.exomes.r2.1.1.sites.liftover_grch38.lowAN.txt | head
# chr11  400757  G  A  lowAN
# chr11  627095  C  T  lowAN
bcftools view -r chr11:400757-400757 gnomad.exomes.r2.1.1.sites.liftover_grch38.vcf.bgz -H

Write a record to the output file when its AF is not "." and its AN is not low:

# ARGIND is a gawk extension, so this needs gawk
nohup awk 'BEGIN{OFS=FS="\t"} ARGIND==1{lowAN["_"$1"_"$2"_"$3"_"$4"_"]=10} ARGIND==2{if($6!="." && lowAN["_"$1"_"$2"_"$4"_"$5"_"]=="") print $0}' \
  gnomad.exomes.r2.1.1.sites.liftover_grch38.lowAN.txt \
  gnomad.exomes.r2.1.1.sites.liftover_grch38.AF.spliMulti.norm.vcf.6col.txt \
  > gnomad.exomes.r2.1.1.sites.liftover_grch38.AF.spliMulti.norm.vcf.6col.exLowAN.txt &

Line count: 17,192,586

gnomad.exomes.r2.1.1.sites.liftover_grch38.AF.spliMulti.norm.vcf.6col.exLowAN.txt
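The same two conditions (AF present, AN not low) could in principle be checked in one pass over VCF body lines that still carry AN in INFO, instead of building a lowAN list first. The sketch below is an alternative formulation, not the procedure used above; the function name `filter_by_an` is hypothetical, and it assumes one ALT per line with plain `AF=` and `AN=` keys in column 8.

```shell
# Keep records that have an AF value and AN >= threshold (default 10).
# Input: tab-separated VCF body lines. Output: CHROM POS ID REF ALT AF.
filter_by_an() {
  awk -v min_an="${1:-10}" 'BEGIN { FS = OFS = "\t" }
  /^#/ { next }
  {
    af = "."; an = 0
    n = split($8, kv, ";")
    for (i = 1; i <= n; i++) {
      if (kv[i] ~ /^AF=/) af = substr(kv[i], 4)
      if (kv[i] ~ /^AN=/) an = substr(kv[i], 4) + 0
    }
    if (af != "." && an >= min_an + 0) print $1, $2, $3, $4, $5, af
  }'
}

# Three records taken from the listing above: AN=0, AN=2 and AN=4196.
printf '%b' \
  'chr1\t12198\trs62635282\tG\tC\t9876.24\tAC0\tAC=0;AN=0\n' \
  'chr1\t12259\trs1330604035\tG\tC\t37.42\tAC0\tAC=0;AF=0;AN=2\n' \
  'chr1\t12573\trs1273605438\tT\tC\t366.59\tPASS\tAC=2;AF=0.000476644;AN=4196\n' |
  filter_by_an 10
```

Only the chr1:12573 record survives here. The two-file approach used above has the advantage that the low-AN list is kept as a separate, inspectable file.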

Comparing a third source file:

gnomad.genomes.r2.1.1.sites.1.liftover_grch38.vcf.bgz (chr1)

Source: (image not preserved)

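Because the gnomad.genomes VCF holds whole-genome data from as many as 15,000 samples (no genotypes, but the INFO column carries a large number of annotations such as per-population AF and VEP annotations), the file is very large and processing it could take several days.

So process only chr1 first, extract its AF, and compare it with the existing files:

zcat gnomad.genomes.r2.1.1.sites.1.liftover_grch38.vcf.bgz | bcftools annotate -x ^INFO/AF | grep -v "^##" | cut -f 1,2,3,4,5,8 | sed 's/AF=//g' > gnomad.genomes.r2.1.1.sites.1.liftover_grch38.AF.vcf.6col.txt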

zcat af-only-gnomad.hg38.vcf.gz | grep -v "^##" | head -n 4

# zcat gnomad.genomes.r2.1.1.sites.1.liftover_grch38.vcf.bgz | grep -v "##" | head -n 100000 | cut -f 1-5 > test.chr1
# zcat gnomad.genomes.r2.1.1.sites.1.liftover_grch38.vcf.bgz | grep -v "##" | head -n 100000 | sed 's/AF_raw=/ \t/g' | cut -f 9 | sed 's/;/\t/g' | cut -f 1 > test.af.raw
# zcat gnomad.genomes.r2.1.1.sites.1.liftover_grch38.vcf.bgz | grep -v "##" | head -n 100000 | sed 's/AF=/ \t/g' | cut -f 9 | sed 's/;/\t/g' | cut -f 1 > test.af
zcat gnomad.genomes.r2.1.1.sites.1.liftover_grch38.vcf.bgz | grep -v "##" | head | grep -w 10109 | grep "AC=\|AF=\|AF_raw=\|AC_raw="

Look at a few ClinVar sites that have rs IDs and compare their AF:

cut -f 10,13 variant_summary_GRCh38.bed.txt | \
  nl | grep -v -P  '\-1' | grep -w Pathogenic | head

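The comparison shows that many pathogenic sites still have no population frequency in the large published AF databases. Therefore, when screening for pathogenic sites: 1. apply AF by exclusion (drop sites that have an AF and whose AF is high); 2. include unconditionally the pathogenic sites already reported in ClinVar, OMIM and similar resources; 3. also include sites with an extremely high CADD score (e.g. >50, even when no AF value exists).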
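One possible reading of these three rules, written as an awk filter. Everything about the input layout is an assumption made for illustration: a four-column tab-separated table (variant id, AF or ".", ClinVar significance or ".", CADD score or "."), the function name `screen_candidates`, and the AF cutoff of 0.01, which the text does not specify. Only the CADD threshold of 50 comes from the text.

```shell
# Screen candidate variants: id <TAB> AF <TAB> ClinVar <TAB> CADD
# Args: max AF (assumed default 0.01), min CADD (default 50, from the text).
screen_candidates() {
  awk -v max_af="${1:-0.01}" -v min_cadd="${2:-50}" 'BEGIN { FS = OFS = "\t" }
  {
    id = $1; af = $2; clinvar = $3; cadd = $4
    if (clinvar == "Pathogenic")                       # rule 2: always keep
      reason = "reported"
    else if (af != "." && af + 0 > max_af + 0)         # rule 1: drop common variants
      next
    else if (cadd != "." && cadd + 0 > min_cadd + 0)   # rule 3: very high CADD
      reason = "high_cadd"
    else
      reason = "rare_or_no_af"
    print id, reason
  }'
}

# Made-up rows, for illustration only.
printf '%b' \
  'v1\t0.2\tPathogenic\t10\n' \
  'v2\t0.2\t.\t60\n' \
  'v3\t.\t.\t60\n' \
  'v4\t0.0001\t.\t5\n' |
  screen_candidates
```

In this reading a reported pathogenic variant is kept regardless of its AF, while a high CADD score does not rescue a variant whose AF is known to be high. Whether rule 3 should override rule 1 is a design choice the text leaves open.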

awk 'BEGIN{OFS=FS="\t"}{if($2<1000000) print $0}' gnomad.genomes.r2.1.1.sites.1.liftover_grch38.AF.vcf.6col.txt > test.gnomad.genomes
head -n 1000000 af-only-gnomad.hg38.AF.spliMulti.norm.vcf.6col.txt | awk 'BEGIN{OFS=FS="\t"}{if($2<1000000) print $0}' > test.af-only-gnomad
head *test*
tail *test*

Compare the number of differing sites:

wc -l test.af-only-gnomad test.gnomad.genomes

71,842 test.af-only-gnomad

66,454 test.gnomad.genomes

Within whole-genome chr1:1-1,000,000, about 70,000 variants carry an AF value (~7% of positions); the former file has 5,388 more lines.

Sites unique to each file:

awk 'BEGIN{OFS=FS="\t"}ARGIND==1{var["_"$1"_"$2"_"$4"_"$5"_"]=1}ARGIND==2{if(var["_"$1"_"$2"_"$4"_"$5"_"]=="") print $0}' \
  test.gnomad.genomes test.af-only-gnomad | wc -l

5,594

awk 'BEGIN{OFS=FS="\t"}ARGIND==1{var["_"$1"_"$2"_"$4"_"$5"_"]=1}ARGIND==2{if(var["_"$1"_"$2"_"$4"_"$5"_"]=="") print $0}' \
  test.af-only-gnomad test.gnomad.genomes | wc -l

190
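The two commands above rely on `ARGIND`, which only gawk provides. A portable variant of the same anti-join can use the `NR == FNR` idiom instead; the sketch below assumes the 6-column layout used throughout (key = columns 1, 2, 4, 5), a non-empty first file, and a hypothetical function name `only_in_second`. The sample rows are made up.

```shell
# Print records of the second file whose (CHROM, POS, REF, ALT) key
# does not occur in the first file.
only_in_second() {
  awk 'BEGIN { FS = OFS = "\t" }
  NR == FNR { seen[$1 FS $2 FS $4 FS $5] = 1; next }   # first file: remember keys
  !(($1 FS $2 FS $4 FS $5) in seen)                     # second file: print unseen
  ' "$1" "$2"
}

tmp=$(mktemp -d)
printf 'chr1\t100\t.\tA\tG\t0.1\nchr1\t200\t.\tC\tT\t0.2\n' > "$tmp/a.txt"
printf 'chr1\t100\t.\tA\tG\t0.1\nchr1\t300\t.\tG\tA\t0.3\n' > "$tmp/b.txt"

only_in_second "$tmp/a.txt" "$tmp/b.txt"   # prints the chr1:300 record
only_in_second "$tmp/b.txt" "$tmp/a.txt"   # prints the chr1:200 record
rm -r "$tmp"
```

Using `in` for the membership test avoids creating an empty array entry for every key that is looked up, which matters for memory when the second file has hundreds of millions of lines.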

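So the gnomad.genomes.r2.1.1 source lacks about 8% of the sites in af-only-gnomad (5,594 / 71,842 ≈ 7.8%), and its files are too large. Both derive from gnomAD whole-genome data.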

af-only-gnomad can therefore serve for whole-genome AF, and gnomad.exomes for whole-exome AF.

Final files to merge and use:

af-only-gnomad.hg38.AF.spliMulti.norm.vcf.6col.txt

gnomad.exomes.r2.1.1.sites.liftover_grch38.AF.spliMulti.norm.vcf.6col.exLowAN.txt

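Together they hold population frequencies for about 310 million short variants (290,331,359 + 17,192,586 = 307,523,945 lines).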