Title: | Multiple Hypotheses Testing for Multiple Families/Groups Structure |
---|---|
Description: | A comprehensive tool covering almost all existing multiple testing methods for multiple families. The package summarizes the existing multiple testing procedures (MTPs) for multiple families, such as the double FDR procedure, the group Benjamini-Hochberg (GBH) procedure, and the average FDR controlling procedure. It also provides some novel multiple testing procedures based on the idea of selective inference. |
Authors: | Yalin Zhu, Wenge Guo |
Maintainer: | Yalin Zhu <[email protected]> |
License: | GPL (>= 2) |
Version: | 0.1.0 |
Built: | 2024-11-20 05:20:37 UTC |
Source: | https://github.com/allenzhuaz/mhtmult |
Given a list/data frame of grouped p-values, selecting thresholds, and a p-value combining method, returns adjusted p-values for making decisions.
avgFDR.p.adjust(pval, t, make.decision)
pval |
the structural p-values, the type should be |
t |
the thresholds determining whether the families are selected; they also affect the conditional p-values within families. |
make.decision |
logical; if |
A list of the adjusted conditional p-values; a NULL list element means the family was not selected for testing in the second stage.
Yalin Zhu
Benjamini, Y., & Bogomolov, M. (2014). Selective inference on multiple families of hypotheses. Journal of the Royal Statistical Society: Series B (Statistical Methodology), 76: 297-318.
# data is from Example 4.1 in Mehrotra and Adewale (2012)
pval <- list(c(0.031,0.023,0.029,0.005,0.031,0.000,0.874,0.399,0.293,0.077),
             c(0.216,0.843,0.864),
             c(1,0.878,0.766,0.598,0.011,0.864),
             c(0.889,0.557,0.767,0.009,0.644),
             c(1,0.583,0.147,0.789,0.217,1,0.02,0.784,0.579,0.439),
             c(0.898,0.619,0.193,0.806,0.611,0.526,0.702,0.196))
avgFDR.p.adjust(pval = pval, t = 0.1)
sum(unlist(avgFDR.p.adjust(pval = pval, t = 0.1)) <= 0.1)
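The two-stage idea in Benjamini and Bogomolov (2014) can be illustrated in base R. The sketch below is not the package's implementation: the helper name `bb_adjust_sketch` is hypothetical, and the selection rule (a family is selected when its smallest p-value is at most `t`) is an assumption made only for illustration. Within each selected family, BH is applied at level (number of selected families) × q / (number of families), which is equivalent to inflating the BH-adjusted p-values by the inverse of that fraction.

```r
# Hypothetical helper, not part of the package: a sketch of the
# Benjamini-Bogomolov two-stage idea under an assumed selection rule.
bb_adjust_sketch <- function(pval, t) {
  m <- length(pval)                        # total number of families
  # Assumed selection rule: smallest p-value in the family is at most t
  selected <- vapply(pval, function(p) min(p) <= t, logical(1))
  n_sel <- sum(selected)                   # number of selected families
  lapply(seq_len(m), function(i) {
    if (!selected[i]) return(NULL)         # family not tested in stage two
    # BH at level n_sel * q / m  <=>  BH-adjusted p-values times m / n_sel
    pmin(1, p.adjust(pval[[i]], method = "BH") * m / n_sel)
  })
}

fam <- list(c(0.001, 0.02, 0.8), c(0.4, 0.9), c(0.03, 0.04))
adj <- bb_adjust_sketch(fam, t = 0.05)
adj   # the second element is NULL because that family is not selected
```

Comparing the non-NULL elements with the target level q then gives the second-stage decisions.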
Given a list/data frame of grouped p-values, selecting thresholds, and a p-value combining method, returns adjusted conditional p-values for making decisions.
cFDR.cp.adjust(pval, t, comb.method = c("Fisher", "Stouffer", "minP"), make.decision, sig.level)
pval |
the structural p-values, the type should be |
t |
the thresholds determining whether the families are selected; they also affect the conditional p-values within families. |
comb.method |
p-value combining methods including |
make.decision |
logical; if |
sig.level |
significance level compared with the adjusted p-values to make decisions; the default value is 0.05. |
A list of the adjusted conditional p-values; a NULL list element means the family was not selected for testing in the second stage.
Yalin Zhu
Heller, R., Chatterjee, N., Krieger, A., & Shi, J. (2016). Post-selection Inference Following Aggregate Level Hypothesis Testing in Large Scale Genomic Data. bioRxiv, 058404.
# data is from Example 4.1 in Mehrotra and Adewale (2012)
pval <- list(c(0.031,0.023,0.029,0.005,0.031,0.000,0.874,0.399,0.293,0.077),
             c(0.216,0.843,0.864),
             c(1,0.878,0.766,0.598,0.011,0.864),
             c(0.889,0.557,0.767,0.009,0.644),
             c(1,0.583,0.147,0.789,0.217,1,0.02,0.784,0.579,0.439),
             c(0.898,0.619,0.193,0.806,0.611,0.526,0.702,0.196))
sum(p.adjust(unlist(pval), method = "BH") <= 0.1)
DFDR.p.adjust(pval = pval, t = 0.1)
DFDR2.p.adjust(pval = pval, t = 0.1)
sum(unlist(DFDR.p.adjust(pval = pval, t = 0.1)) <= 0.1)
sum(unlist(DFDR2.p.adjust(pval = pval, t = 0.1)) <= 0.1)
t <- select.thres(pval, select.method = "BH", comb.method = "minP", alpha = 0.1)
cFDR.cp.adjust(pval, t = t, comb.method = "minP")
t1 <- select.thres(pval, select.method = "bonferroni", comb.method = "minP", alpha = 0.1, k = 3)
cFDR.cp.adjust(pval, t = t1, comb.method = "minP")
t2 <- select.thres(pval, select.method = "sidak", comb.method = "minP", alpha = 0.1, k = 3)
cFDR.cp.adjust(pval, t = t2, comb.method = "minP")
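The three options of `comb.method` refer to standard ways of combining the p-values of one family into a single family-level p-value. A minimal base-R sketch of the textbook formulas follows, assuming independent p-values. The helper name `combine_p_sketch` is hypothetical, and the package's internal code may differ in details such as the handling of p-values equal to 0 or 1.

```r
# Hypothetical helper, not part of the package: textbook combining rules
# for independent p-values within one family.
combine_p_sketch <- function(p, method = c("Fisher", "Stouffer", "minP")) {
  method <- match.arg(method)
  n <- length(p)
  switch(method,
    # Fisher: -2 * sum(log p) follows a chi-square with 2n df under the null
    Fisher   = pchisq(-2 * sum(log(p)), df = 2 * n, lower.tail = FALSE),
    # Stouffer: the sum of z-scores divided by sqrt(n) is standard normal
    Stouffer = pnorm(sum(qnorm(p, lower.tail = FALSE)) / sqrt(n),
                     lower.tail = FALSE),
    # minP: the minimum of n independent uniforms follows Beta(1, n)
    minP     = 1 - (1 - min(p))^n
  )
}

p <- c(0.216, 0.843, 0.864)   # second family of the example data
sapply(c("Fisher", "Stouffer", "minP"), function(m) combine_p_sketch(p, m))
```

For a family with a single p-value, all three rules return that p-value unchanged, which is a quick sanity check.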
Given a list/data frame of grouped p-values, returns adjusted p-values for making decisions.
DFDR.p.adjust(pval, t, make.decision, alpha)
pval |
the structural p-values, the type should be |
t |
the threshold selecting significant families. |
make.decision |
logical; if |
alpha |
significance level compared with the adjusted p-values to make decisions; the default value is 0.05. |
A list of the adjusted p-values; a NULL list element means the family was not selected for testing in the second stage.
Yalin Zhu
Mehrotra, D. V., & Heyse, J. F. (2004). Use of the false discovery rate for evaluating clinical safety data. Statistical methods in medical research, 13: 227-238.
# data is from Example 4.1 in Mehrotra and Adewale (2012)
pval <- list(c(0.031,0.023,0.029,0.005,0.031,0.000,0.874,0.399,0.293,0.077),
             c(0.216,0.843,0.864),
             c(1,0.878,0.766,0.598,0.011,0.864),
             c(0.889,0.557,0.767,0.009,0.644),
             c(1,0.583,0.147,0.789,0.217,1,0.02,0.784,0.579,0.439),
             c(0.898,0.619,0.193,0.806,0.611,0.526,0.702,0.196))
DFDR.p.adjust(pval = pval, t = 0.1)
sum(unlist(DFDR.p.adjust(pval = pval, t = 0.1)) <= 0.1)
Given a list/data frame of grouped p-values, returns adjusted p-values for making decisions.
DFDR2.p.adjust(pval, t, make.decision)
pval |
the structural p-values, the type should be |
t |
the threshold selecting significant families and testing hypotheses. |
make.decision |
logical; if |
A list of the adjusted p-values; a NULL list element means the family was not selected for testing in the second stage.
Yalin Zhu
Mehrotra, D. V., & Adewale, A. J. (2012). Flagging clinical adverse experiences: reducing false discoveries without materially compromising power for detecting true signals. Statistics in medicine, 31: 1918-1930.
# data is from Example 4.1 in Mehrotra and Adewale (2012)
pval <- list(c(0.031,0.023,0.029,0.005,0.031,0.000,0.874,0.399,0.293,0.077),
             c(0.216,0.843,0.864),
             c(1,0.878,0.766,0.598,0.011,0.864),
             c(0.889,0.557,0.767,0.009,0.644),
             c(1,0.583,0.147,0.789,0.217,1,0.02,0.784,0.579,0.439),
             c(0.898,0.619,0.193,0.806,0.611,0.526,0.702,0.196))
DFDR2.p.adjust(pval = pval, t = 0.1)
sum(unlist(DFDR2.p.adjust(pval = pval, t = 0.1)) <= 0.1)
Given a list/data frame of grouped p-values, selecting thresholds, and a p-value combining method, returns adjusted conditional p-values for making decisions.
GBH.p.adjust(pval, t, make.decision)
pval |
the structural p-values, the type should be |
t |
the thresholds determining whether the families are selected; they also affect the conditional p-values within families. |
make.decision |
logical; if |
A list of the adjusted conditional p-values; a NULL list element means the family was not selected for testing in the second stage.
Yalin Zhu
Hu, J. X., Zhao, H., & Zhou, H. H. (2010). False discovery rate control with groups. Journal of the American Statistical Association, 105: 1215-1227.
# data is from Example 4.1 in Mehrotra and Adewale (2012)
pval <- list(c(0.031,0.023,0.029,0.005,0.031,0.000,0.874,0.399,0.293,0.077),
             c(0.216,0.843,0.864),
             c(1,0.878,0.766,0.598,0.011,0.864),
             c(0.889,0.557,0.767,0.009,0.644),
             c(1,0.583,0.147,0.789,0.217,1,0.02,0.784,0.579,0.439),
             c(0.898,0.619,0.193,0.806,0.611,0.526,0.702,0.196))
sum(p.adjust(unlist(pval), method = "BH") <= 0.1)
DFDR.p.adjust(pval = pval, t = 0.1)
DFDR2.p.adjust(pval = pval, t = 0.1)
sum(unlist(DFDR.p.adjust(pval = pval, t = 0.1)) <= 0.1)
sum(unlist(DFDR2.p.adjust(pval = pval, t = 0.1)) <= 0.1)
GBH.p.adjust(pval = pval, t = 0.1)
sum(unlist(GBH.p.adjust(pval = pval, t = 0.1)) <= 0.1)
t <- select.thres(pval, select.method = "BH", comb.method = "minP", alpha = 0.1)
cFDR.cp.adjust(pval, t = t, comb.method = "minP")
t1 <- select.thres(pval, select.method = "bonferroni", comb.method = "minP", alpha = 0.1, k = 3)
cFDR.cp.adjust(pval, t = t1, comb.method = "minP")
t2 <- select.thres(pval, select.method = "sidak", comb.method = "minP", alpha = 0.1, k = 3)
cFDR.cp.adjust(pval, t = t2, comb.method = "minP")
The function for computing the critical value based on the number of hypotheses m, fold k, and significance level alpha.
gbonf.cv(m, k, alpha)
m |
number of hypotheses to be tested. |
k |
number of type I errors allowed under k-FWER control. |
alpha |
significance level compared with the adjusted p-values to make decisions; the default value is 0.05. |
A numeric vector of the adjusted p-values (of the same length as p) if make.decision = FALSE, or a list including original p-values, adjusted p-values and decision rules if make.decision = TRUE.
Yalin Zhu
gbonf.p.adjust, p.adjust, Sidak.p.adjust.
p <- c(0.031,0.023,0.029,0.005,0.031,0.000,0.874,0.399,0.293,0.077)
gbonf.cv(m = length(p), k = 2)
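For intuition, the single-step generalized Bonferroni procedure for k-FWER control rejects a hypothesis when its p-value is at most k * alpha / m. A minimal base-R sketch follows, assuming `gbonf.cv` computes this quantity. The helper name `gbonf_cv_sketch` is hypothetical and not part of the package.

```r
# Hypothetical helper, not part of the package: the generalized Bonferroni
# critical value k * alpha / m for single-step k-FWER control.
gbonf_cv_sketch <- function(m, k, alpha = 0.05) {
  stopifnot(k >= 1, k <= m, alpha > 0, alpha < 1)
  k * alpha / m
}

p <- c(0.031, 0.023, 0.029, 0.005, 0.031, 0.000, 0.874, 0.399, 0.293, 0.077)
cv <- gbonf_cv_sketch(m = length(p), k = 2)   # 2 * 0.05 / 10 = 0.01
p <= cv                                       # hypotheses rejected at this cutoff
```

With k = 1 the cutoff reduces to the ordinary Bonferroni value alpha / m.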
The function for computing the adjusted p-values based on the original p-values and fold k.
gbonf.p.adjust(p, k, alpha, make.decision)
p |
numeric vector of p-values (possibly with |
k |
number of type I errors allowed under k-FWER control. |
alpha |
significance level compared with the adjusted p-values to make decisions; the default value is 0.05. |
make.decision |
logical; if |
A numeric vector of the adjusted p-values (of the same length as p) if make.decision = FALSE, or a list including original p-values, adjusted p-values and decision rules if make.decision = TRUE.
Yalin Zhu
Lehmann, E. L., & Romano, J. P. (2005). Generalizations of the familywise error rate. The Annals of Statistics, 33: 1138-1154.
gsidak.p.adjust, p.adjust, Sidak.p.adjust.
p <- c(0.031,0.023,0.029,0.005,0.031,0.000,0.874,0.399,0.293,0.077)
gbonf.p.adjust(p, k = 2)
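The adjusted p-values that correspond to the single-step generalized Bonferroni cutoff k * alpha / m (Lehmann and Romano, 2005) are min(1, m * p / k). The base-R sketch below illustrates this relationship. The helper name `gbonf_adjust_sketch` is hypothetical, and it does not reproduce the `make.decision` output or any missing-value handling of `gbonf.p.adjust`.

```r
# Hypothetical helper, not part of the package: single-step generalized
# Bonferroni adjusted p-values, min(1, m * p / k).
gbonf_adjust_sketch <- function(p, k) {
  m <- length(p)
  pmin(1, m * p / k)
}

p <- c(0.031, 0.023, 0.029, 0.005, 0.031, 0.000, 0.874, 0.399, 0.293, 0.077)
adj <- gbonf_adjust_sketch(p, k = 2)
adj <= 0.05   # same decisions as comparing p with 2 * 0.05 / 10
```

Setting k = 1 gives the same values as `p.adjust(p, method = "bonferroni")`.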
The function for computing the critical value based on the number of hypotheses m, fold k, and significance level alpha.
gsidak.cv(m, k, alpha)
m |
number of hypotheses to be tested. |
k |
number of type I errors allowed under k-FWER control. |
alpha |
significance level compared with the adjusted p-values to make decisions; the default value is 0.05. |
A numeric vector of the adjusted p-values (of the same length as p) if make.decision = FALSE, or a list including original p-values, adjusted p-values and decision rules if make.decision = TRUE.
Yalin Zhu
gsidak.p.adjust, p.adjust, Sidak.p.adjust.
p <- c(0.031,0.023,0.029,0.005,0.031,0.000,0.874,0.399,0.293,0.077)
gsidak.cv(m = length(p), k = 2)
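One way to read the generalized Sidak critical value: if all m null p-values are independent and uniform, the probability that at least k of them fall below a cutoff c is the Beta(k, m - k + 1) distribution function evaluated at c, so the cutoff that makes this probability equal to alpha is a Beta quantile. The sketch below rests on that assumption. The helper name `gsidak_cv_sketch` is hypothetical, and whether `gsidak.cv` is implemented this way is not confirmed by this page.

```r
# Hypothetical helper, not part of the package: cutoff c such that
# P(at least k of m independent uniform p-values <= c) = alpha.
gsidak_cv_sketch <- function(m, k, alpha = 0.05) {
  stopifnot(k >= 1, k <= m, alpha > 0, alpha < 1)
  qbeta(alpha, shape1 = k, shape2 = m - k + 1)
}

cv <- gsidak_cv_sketch(m = 10, k = 2)
# Check: probability of 2 or more rejections among 10 true nulls
pbinom(2 - 1, size = 10, prob = cv, lower.tail = FALSE)
```

For k = 1 the formula reduces to the classical Sidak cutoff 1 - (1 - alpha)^(1/m), and the binomial check returns alpha up to numerical error.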
The function for computing the adjusted p-values based on the original p-values and fold k.
gsidak.p.adjust(p, k, alpha, make.decision)
p |
numeric vector of p-values (possibly with |
k |
number of type I errors allowed under k-FWER control. |
alpha |
significance level compared with the adjusted p-values to make decisions; the default value is 0.05. |
make.decision |
logical; if |
A numeric vector of the adjusted p-values (of the same length as p) if make.decision = FALSE, or a list including original p-values, adjusted p-values and decision rules if make.decision = TRUE.
Yalin Zhu
Guo, W., & Romano, J. (2007). A generalized Sidak-Holm procedure and control of generalized error rates under independence. Statistical Applications in Genetics and Molecular Biology, 6(1).
gbonf.p.adjust, p.adjust, Sidak.p.adjust.
p <- c(0.031,0.023,0.029,0.005,0.031,0.000,0.874,0.399,0.293,0.077)
gsidak.p.adjust(p, k = 2)
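Under the same independence assumption, a single-step generalized Sidak adjusted p-value can be written as the probability that at least k of m independent uniform p-values fall at or below the observed p-value, which is a Beta(k, m - k + 1) distribution function. The sketch below illustrates that reading. The helper name `gsidak_adjust_sketch` is hypothetical, and the package function may differ, for example in its `make.decision` output.

```r
# Hypothetical helper, not part of the package: adjusted p-value as
# P(at least k of m independent uniform p-values <= p).
gsidak_adjust_sketch <- function(p, k) {
  m <- length(p)
  pbeta(p, shape1 = k, shape2 = m - k + 1)
}

p <- c(0.031, 0.023, 0.029, 0.005, 0.031, 0.000, 0.874, 0.399, 0.293, 0.077)
adj <- gsidak_adjust_sketch(p, k = 2)
adj <= 0.05   # decisions at the 0.05 level
```

For k = 1 this reduces to the classical Sidak adjustment 1 - (1 - p)^m.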
Given the structural p-values, a selecting method for controlling the generalized familywise error rate or the false discovery rate across families, and a combining method, returns a vector of thresholds for the first stage of cFDR controlling procedures.
select.thres(pval, select.method, comb.method, alpha, k)
pval |
the structural p-values, the type should be |
select.method |
global p-value selecting method. For generalized FWER control, the k-Bonferroni or k-Sidak procedure can be used; for FDR control, the BH procedure can be used. |
comb.method |
p-value combining methods including |
alpha |
significance level for selecting significant families in the first stage. The default value is 0.05. |
k |
number of type I errors allowed under k-FWER control. |
A list of the adjusted conditional p-values; a NULL list element means the family was not selected for testing in the second stage.
Yalin Zhu