VCF Processing

This section describes how to process VCF files using the MEA-Pipeline.

Query Notation

Notation	Remark
`-variants`	Select variants listed in a BED file or position file, given in config.yaml by the parameter `variant_file`.
`-samples`	Select samples listed in a headerless text file, given in config.yaml by the parameter `sample_file`.
`-d[INT]`	Set the minimum depth to [INT] for variant calls.
`-HETd[INT]r[FLOAT]`	Set genotypes to het calls if the alternate read count > INT and the alternate read ratio (out of total reads) > FLOAT.
`-HET2[OPT]`	Set het alleles to OPT, where OPT can be REF (reference alleles), ALT (alternate alleles), MISS (missing alleles) or MAJ (major alleles).
`-ANN`	Annotate variants with snpEff, using the database configured in config.yaml by the parameters `snpEff_config_file`, `snpEff_data_dir`, and `snpEff_db`.
`-BCSQ`	Annotate variants using the `bcftools csq` command, with the GFF file given in config.yaml by the parameter `gff_file`.
`-V[FLOAT]`	Select the variants that have sample missingness up to FLOAT value.
`-S[FLOAT]`	Select the samples that have variant missingness up to FLOAT value.
`-MAC[INT]`	Select variants with a minor allele count (MAC) of at least INT.
`-MAF[FLOAT]`	Select variants with a minor allele frequency (MAF) of at least FLOAT.
`-atom`	Decompose complex variants using the `bcftools norm` command.
`-split`	Split multi-allelic variants into individual variants.
`-snv`	Select single-nucleotide variants.
`-dedup`	Deduplicate variants that share the same position, keeping the one with the lowest sample missingness.
`-FWS[FLOAT]`	Select samples with Fws > FLOAT, as calculated by moimix.
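
To illustrate how the notations in the table could be combined and interpreted, the following Python sketch parses a query string into an ordered list of steps. It is not the MEA-Pipeline's own parser. It assumes that notations are concatenated into one string (for example `-samples-d5-HETd2r0.1-HET2MISS-V0.2-snv`), that INT is an unsigned integer, and that FLOAT is a plain non-negative decimal. The function name `parse_query` and the parameter keys (`min_depth`, `max_missing`, and so on) are hypothetical names chosen for this example.

```python
import re

# Assumption: INT is an unsigned integer, FLOAT a plain non-negative decimal.
INT = r"\d+"
FLOAT = r"\d+(?:\.\d+)?"

# Notations that take no argument.
FLAGS = {"variants", "samples", "ANN", "BCSQ", "atom", "split", "snv", "dedup"}

# Notations that carry arguments: (name, pattern, casts for each named group).
# The group names are hypothetical labels, not pipeline parameters.
PARAMETERISED = [
    ("d", re.compile(rf"d(?P<min_depth>{INT})"), {"min_depth": int}),
    ("HETd",
     re.compile(rf"HETd(?P<min_alt_reads>{INT})r(?P<min_alt_ratio>{FLOAT})"),
     {"min_alt_reads": int, "min_alt_ratio": float}),
    ("HET2", re.compile(r"HET2(?P<target>REF|ALT|MISS|MAJ)"), {"target": str}),
    ("V", re.compile(rf"V(?P<max_missing>{FLOAT})"), {"max_missing": float}),
    ("S", re.compile(rf"S(?P<max_missing>{FLOAT})"), {"max_missing": float}),
    ("MAC", re.compile(rf"MAC(?P<min_count>{INT})"), {"min_count": int}),
    ("MAF", re.compile(rf"MAF(?P<min_freq>{FLOAT})"), {"min_freq": float}),
    ("FWS", re.compile(rf"FWS(?P<min_fws>{FLOAT})"), {"min_fws": float}),
]


def parse_query(query):
    """Split a query string on hyphens/whitespace and decode each token.

    Returns a list of (notation_name, params) tuples in query order.
    Raises ValueError for a token that matches no known notation.
    """
    steps = []
    for token in filter(None, re.split(r"[\s-]+", query)):
        if token in FLAGS:
            steps.append((token, {}))
            continue
        for name, pattern, casts in PARAMETERISED:
            match = pattern.fullmatch(token)
            if match:
                params = {k: casts[k](v) for k, v in match.groupdict().items()}
                steps.append((name, params))
                break
        else:
            raise ValueError(f"Unrecognised query token: -{token}")
    return steps


if __name__ == "__main__":
    for step in parse_query("-samples-d5-HETd2r0.1-HET2MISS-V0.2-snv"):
        print(step)
```

Matching each token in full, rather than by prefix, keeps `-d[INT]` from being confused with `-dedup` and `-S[FLOAT]` from being confused with `-samples`.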