Wraps all the functions needed to run a CNV-GWAS using the output of the setupCnvGWAS function.
(i) Produces the GDS file containing the genotype information (if produce.gds == TRUE),
(ii) produces the requested inputs for a PLINK analysis,
(iii) runs a CNV-GWAS analysis using a linear model (i.e. the lm function), and
(iv) exports a QQ-plot displaying the adjusted p-values.
In this release only the p-value for the copy number is available (i.e. 'P(CNP)').
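Step (iii) can be illustrated with a minimal sketch in base R. This is not the package's internal code: the simulated data, the variable names (copy.number, phenotype) and the helper probe.pvalue are hypothetical, and the sketch only shows how a per-probe p-value for copy number could be obtained from lm and then adjusted with p.adjust.

```r
# Hypothetical sketch: per-probe association with lm(); not CNVRanger internals.
set.seed(1)
n.samples <- 100
n.probes <- 5

# Simulated copy number per sample (rows) and probe (columns); 2 = diploid.
copy.number <- matrix(sample(0:4, n.samples * n.probes, replace = TRUE,
                             prob = c(0.05, 0.15, 0.6, 0.15, 0.05)),
                      nrow = n.samples)

# Simulated phenotype that depends on the first probe only.
phenotype <- 0.8 * copy.number[, 1] + rnorm(n.samples)

# Fit phenotype ~ copy number and return the p-value of the slope.
probe.pvalue <- function(cn, y) {
  fit <- lm(y ~ cn)
  summary(fit)$coefficients["cn", "Pr(>|t|)"]
}

# One raw p-value per probe, then adjusted for multiple testing.
raw.p <- apply(copy.number, 2, probe.pvalue, y = phenotype)
adj.p <- p.adjust(raw.p, method = "fdr")
```

In this toy setup only the first probe carries a simulated effect, so it should stand out among the adjusted p-values.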
Usage
cnvGWAS(
phen.info,
n.cor = 1,
min.sim = 0.95,
freq.cn = 0.01,
snp.matrix = FALSE,
method.m.test = "fdr",
lo.phe = 1,
chr.code.name = NULL,
genotype.nodes = "CNVGenotype",
coding.translate = "all",
path.files = NULL,
list.of.files = NULL,
produce.gds = TRUE,
run.lrr = FALSE,
assign.probe = "min.pvalue",
correct.inflation = FALSE,
both.up.down = FALSE,
verbose = FALSE
)
Arguments
- phen.info
Returned by setupCnvGWAS
- n.cor
Number of cores to be used
- min.sim
Minimum CNV genotype distribution similarity among subsequent probes. Default is 0.95 (i.e. 95%)
- freq.cn
Minimum CNV frequency, where 1 (i.e. 100%) means all samples deviate from the diploid state. Default is 0.01 (i.e. 1%)
- snp.matrix
Only FALSE is implemented. If TRUE, B allele frequencies (BAF) would be used to reconstruct CNV-SNP genotypes
- method.m.test
Correction for multiple tests to be used. FDR is the default; see p.adjust for other methods.
- lo.phe
The phenotype to be analyzed in the PhenInfo$phenotypesSam data-frame
- chr.code.name
A data-frame with the integer name of each chromosome in the first column and its original name in the second column
- genotype.nodes
Expression data type. Nodes with CNV genotypes to be produced in the gds file.
- coding.translate
For 'CNVgenotypeSNPlike'. If NULL or an unrecognized string, use only biallelic CNVs. If 'all', code multiallelic CNVs as 0 for loss, 1 for 2n, and 2 for gain.
- path.files
Folder containing the input CNV files used for the CNV calling (i.e. one text file with 5 columns for each sample). Columns should contain (i) probe name, (ii) Chromosome, (iii) Position, (iv) LRR, and (v) BAF.
- list.of.files
Data-frame with two columns, where (i) is the name of the file with signals and (ii) is the corresponding name of the sample in the gds file
- produce.gds
logical. If TRUE, produce a new gds; if FALSE, use a previously created gds
- run.lrr
If TRUE, use LRR values instead of absolute copy numbers in the association
- assign.probe
Probe chosen to represent the CNV segment: ‘min.pvalue’ or ‘high.freq’
- correct.inflation
logical. Estimate lambda from raw p-values and correct for genomic inflation. Use with the argument method.m.test to generate strict p-values.
- both.up.down
Check for CNV genotype similarity in both directions. Default is FALSE (i.e. only downstream)
- verbose
Show progress in the analysis
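Some of the arguments above describe small computations that may be easier to follow in code. The sketch below uses base R only. The helpers cnv.frequency, translate.coding and correct.lambda are hypothetical names introduced for illustration and are not the package's implementation. It shows a frequency measure in the spirit of freq.cn, the 0/1/2 coding described for coding.translate = 'all', and a conventional genomic inflation correction (median-based lambda, assumed here) followed by p.adjust as selected by method.m.test.

```r
# Hypothetical helpers illustrating argument semantics; not CNVRanger internals.

# freq.cn: fraction of samples whose copy number deviates from diploid (2n).
cnv.frequency <- function(cn) mean(cn != 2)

# coding.translate = "all": 0 for loss, 1 for 2n, 2 for gain.
translate.coding <- function(cn) ifelse(cn < 2, 0L, ifelse(cn == 2, 1L, 2L))

# correct.inflation: estimate lambda from raw p-values via the median
# 1-df chi-square statistic, then rescale the statistics
# (a common convention, assumed here rather than taken from the package).
correct.lambda <- function(p) {
  chisq <- qchisq(p, df = 1, lower.tail = FALSE)
  lambda <- median(chisq) / qchisq(0.5, df = 1)
  list(lambda = lambda,
       p = pchisq(chisq / lambda, df = 1, lower.tail = FALSE))
}

# Toy copy numbers for ten samples at one probe.
cn <- c(2, 2, 1, 3, 2, 0, 2, 2, 4, 2)
freq <- cnv.frequency(cn)        # 4 of 10 samples are non-diploid
snp.like <- translate.coding(cn)

# Toy raw p-values: correct for inflation first, then for multiple testing.
raw.p <- c(0.001, 0.02, 0.04, 0.2, 0.5, 0.9)
corrected <- correct.lambda(raw.p)
adj.p <- p.adjust(corrected$p, method = "fdr")
```

Applying the lambda correction before p.adjust mirrors the note under correct.inflation that the two can be combined to obtain stricter p-values.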
References
da Silva et al. (2016) Genome-wide detection of CNVs and their association with meat tenderness in Nelore cattle. PLoS One, 11(6):e0157711.
Examples
# Load phenotype-CNV information
data.dir <- system.file("extdata", package="CNVRanger")
phen.loc <- file.path(data.dir, "Pheno.txt")
cnv.out.loc <- file.path(data.dir, "CNVOut.txt")
map.loc <- file.path(data.dir, "MapPenn.txt")
phen.info <- setupCnvGWAS('Example', phen.loc, cnv.out.loc, map.loc)
# Define chr correspondence to numeric, if necessary
df <- '16 1A
25 4A
29 25LG1
30 25LG2
31 LGE22'
chr.code.name <- read.table(text=df, header=FALSE)
segs.pvalue.gr <- cnvGWAS(phen.info, chr.code.name=chr.code.name)
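As an alternative to parsing a text block, the same chromosome correspondence could be written directly as a data-frame. This is a sketch under the assumption that only the two-column layout matters (integer code first, original name second); the object name chr.code.name.alt and the lookup at the end are illustrative and not part of the package.

```r
# Assumption: same two-column layout as the read.table() call above
# (integer code first, original chromosome name second).
chr.code.name.alt <- data.frame(
  V1 = c(16L, 25L, 29L, 30L, 31L),
  V2 = c("1A", "4A", "25LG1", "25LG2", "LGE22"),
  stringsAsFactors = FALSE
)

# Look up the original name for a numeric chromosome code.
original.name <- chr.code.name.alt$V2[match(29L, chr.code.name.alt$V1)]
```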