Run the CNV-GWAS

Wraps all the functions needed to run a CNV-GWAS, using the output of the setupCnvGWAS function.

The function (i) produces the GDS file containing the genotype information (if produce.gds == TRUE), (ii) produces the requested inputs for a PLINK analysis, (iii) runs a CNV-GWAS analysis using a linear model (i.e. the lm function), and (iv) exports a QQ-plot displaying the adjusted p-values. In this release, only the p-value for the copy number is available (i.e. 'P(CNP)').
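As a rough illustration of the statistical steps named above (a linear model per probe, followed by adjusted p-values), the following base R sketch uses simulated data. It is not the package's internal code: the matrix `cn`, the phenotype `phe`, and all other object names are hypothetical, and the genomic-control formula shown for the inflation factor is the textbook one, assumed here to be comparable to what correct.inflation does.

```r
# Illustrative sketch only: simulated data, not the package's internal code.
set.seed(1)
n.samples <- 100
n.probes <- 20

# Hypothetical copy-number matrix (samples x probes); 2 = diploid.
cn <- matrix(
  sample(c(1, 2, 3), n.samples * n.probes, replace = TRUE,
         prob = c(0.1, 0.8, 0.1)),
  nrow = n.samples, ncol = n.probes
)

# Hypothetical quantitative phenotype influenced by the first probe only.
phe <- 0.8 * cn[, 1] + rnorm(n.samples)

# One linear model per probe; keep the p-value of the copy-number term.
raw.p <- apply(cn, 2, function(x) {
  fit <- lm(phe ~ x)
  summary(fit)$coefficients["x", "Pr(>|t|)"]
})

# Textbook genomic inflation factor: median observed chi-square (1 df)
# divided by the expected median under the null.
obs.chisq <- qchisq(raw.p, df = 1, lower.tail = FALSE)
lambda <- median(obs.chisq) / qchisq(0.5, df = 1)
corrected.p <- pchisq(obs.chisq / lambda, df = 1, lower.tail = FALSE)

# Multiple-testing correction, matching the default method.m.test = "fdr".
adj.p <- p.adjust(raw.p, method = "fdr")
```

Any method accepted by p.adjust can be substituted for "fdr" in the last line.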

Usage

cnvGWAS(
  phen.info,
  n.cor = 1,
  min.sim = 0.95,
  freq.cn = 0.01,
  snp.matrix = FALSE,
  method.m.test = "fdr",
  lo.phe = 1,
  chr.code.name = NULL,
  genotype.nodes = "CNVGenotype",
  coding.translate = "all",
  path.files = NULL,
  list.of.files = NULL,
  produce.gds = TRUE,
  run.lrr = FALSE,
  assign.probe = "min.pvalue",
  correct.inflation = FALSE,
  both.up.down = FALSE,
  verbose = FALSE
)

Arguments

phen.info: Returned by setupCnvGWAS
n.cor: Number of cores to be used
min.sim: Minimum CNV genotype distribution similarity among subsequent probes. Default is 0.95 (i.e. 95%)
freq.cn: Minimum CNV frequency, where 1 (i.e. 100%) means that all samples deviate from the diploid state. Default is 0.01 (i.e. 1%)
snp.matrix: Only FALSE is implemented. If TRUE, B allele frequencies (BAF) would be used to reconstruct CNV-SNP genotypes
method.m.test: Correction for multiple tests to be used. FDR is the default; see p.adjust for other methods.
lo.phe: The phenotype to be analyzed in the PhenInfo$phenotypesSam data-frame
chr.code.name: A data-frame with the integer name in the first column and the original name of each chromosome in the second column
genotype.nodes: Expression data type. Nodes with CNV genotypes to be produced in the gds file.
coding.translate: For 'CNVgenotypeSNPlike'. If NULL or an unrecognized string, use only biallelic CNVs. If 'all', code multiallelic CNVs as 0 for loss, 1 for 2n, and 2 for gain.
path.files: Folder containing the input CNV files used for the CNV calling (i.e. one text file with 5 columns for each sample). Columns should contain (i) probe name, (ii) Chromosome, (iii) Position, (iv) LRR, and (v) BAF.
list.of.files: Data-frame with two columns, where (i) is the name of the file with signals and (ii) is the corresponding name of the sample in the gds file
produce.gds: logical. If TRUE, produce a new gds; if FALSE, use a previously created gds
run.lrr: If TRUE, use LRR values instead of absolute copy numbers in the association
assign.probe: ‘min.pvalue’ or ‘high.freq’ to represent the CNV segment
correct.inflation: logical. Estimate lambda from raw p-values and correct for genomic inflation. Use with argument method.m.test to generate strict p-values.
both.up.down: Check for CNV genotype similarity in both directions. Default is FALSE (i.e. only downstream)
verbose: Show progress in the analysis
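
The meaning of freq.cn and of coding.translate = 'all' can be sketched with a small toy matrix. This is a hedged illustration in base R, not the package's implementation: the matrix `cn`, the threshold of 0.2, and the assumption that the frequency bound is inclusive are all choices made for the example.

```r
# Hypothetical absolute copy numbers for 4 probes (columns) in 10 samples (rows).
cn <- cbind(
  p1 = c(2, 2, 2, 2, 2, 2, 2, 2, 2, 2),  # never deviates from diploid
  p2 = c(1, 2, 2, 2, 2, 2, 2, 2, 2, 2),  # a single loss
  p3 = c(3, 3, 2, 2, 1, 2, 2, 2, 2, 2),  # gains and a loss (multiallelic)
  p4 = c(0, 1, 1, 2, 2, 2, 2, 2, 2, 2)   # losses only
)

# CNV frequency per probe: fraction of samples deviating from the diploid state.
cnv.freq <- colMeans(cn != 2)

# Keep probes whose CNV frequency reaches the minimum
# (0.2 here for illustration; the function default is 0.01).
freq.cn <- 0.2
keep <- cnv.freq >= freq.cn

# SNP-like coding as described for coding.translate = 'all':
# 0 for loss, 1 for 2n, 2 for gain.
snp.like <- sign(cn - 2) + 1
```

In this toy data only p3 and p4 pass the frequency filter, and every value of `snp.like` is 0, 1, or 2 regardless of how many copies were gained or lost.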

Value

The CNV segments with their representative probes and respective p-values

References

da Silva et al. (2016) Genome-wide detection of CNVs and their association with meat tenderness in Nelore cattle. PLoS One, 11(6):e0157711.

Author

Vinicius Henrique da Silva

Examples
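
A minimal sketch of preparing two of the data-frame inputs described under Arguments. The chromosome codes, chromosome names, file names, and sample names are made up for illustration, and the column names are arbitrary. The final call is left commented out because it requires a phen.info object returned by setupCnvGWAS, which is not constructed here.

```r
# Chromosome code table: integer code in the first column,
# original chromosome name in the second (values are hypothetical).
chr.code.name <- data.frame(
  code = c(30L, 31L),
  name = c("X", "Y"),
  stringsAsFactors = FALSE
)

# Signal file table: file name in the first column, corresponding
# sample name in the gds file in the second (values are hypothetical).
list.of.files <- data.frame(
  file.names = c("sample1.txt", "sample2.txt"),
  sample.names = c("S1", "S2"),
  stringsAsFactors = FALSE
)

# With phen.info returned by setupCnvGWAS, a call following the
# Usage section would look like:
# segs.pvalue <- cnvGWAS(phen.info, chr.code.name = chr.code.name)
```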