Multiple testing

Background

When many hypotheses are tested at once, the p-values need to be adjusted.

Control Type I error

Under $H_0: \beta_1 = 0$, the test statistic is

$$ \begin{align*} t_1=\frac{\hat{\beta}_1}{\operatorname{se}\left(\hat{\beta}_1\right)} \sim t_{n-p} \end{align*} $$

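When the sample size $n$ is large, $t_{n-p} \approx N(0,1)$, so under the null $H_0$ the standardized effect estimate (the test statistic) is approximately $N(0,1)$. Suppose each test has $P(\text{Type I error})=\alpha$. For $m$ independent tests, the family-wise error rate (FWER), the probability of at least one Type I error, is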

$$ \begin{align*} \text{FWER} = 1-(1-\alpha)^m \approx 1-(1-m \alpha) = m \alpha \end{align*} $$
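To see how quickly the FWER grows with $m$, here is a minimal Python sketch of the formula above. It assumes the $m$ tests are independent; the function names `fwer_independent` and `fwer_linear_approx` are made up for illustration.

```python
def fwer_independent(alpha: float, m: int) -> float:
    """Exact FWER for m independent tests, each run at level alpha."""
    return 1.0 - (1.0 - alpha) ** m


def fwer_linear_approx(alpha: float, m: int) -> float:
    """First-order approximation m * alpha, capped at 1 because it is a probability."""
    return min(1.0, m * alpha)


# Compare the exact value with the approximation for a few values of m.
for m in (1, 5, 20, 100):
    print(m, round(fwer_independent(0.05, m), 4), fwer_linear_approx(0.05, m))
```

The approximation $m\alpha$ is close only while $m\alpha$ is small; for larger $m$ it overshoots the exact value, which is why it serves as a conservative bound.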

With the Bonferroni correction, each test is run at a level $\alpha_{\text{Bon}}$ chosen so that FWER $\leq \alpha$:

$$ \begin{align*} \begin{aligned} & \Rightarrow \quad m \alpha_{\text {Bon }} \leq \alpha \\ & \Rightarrow \quad \alpha_{\text {Bon }} \leq \frac{\alpha}{m} \end{aligned} \end{align*} $$
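The Bonferroni rule can be sketched in a few lines of Python. This is an illustrative sketch, not a reference implementation: the function names are hypothetical, and the adjusted p-value $\min(1, m\,p_i)$ is the usual equivalent way of stating the same rule.

```python
def bonferroni_reject(p_values, alpha=0.05):
    """Reject H_i when p_i <= alpha / m, which keeps FWER <= alpha."""
    m = len(p_values)
    if m == 0:
        return []
    threshold = alpha / m  # per-test level alpha_Bon
    return [p <= threshold for p in p_values]


def bonferroni_adjust(p_values):
    """Adjusted p-values min(1, m * p_i), to be compared directly with alpha."""
    m = len(p_values)
    return [min(1.0, p * m) for p in p_values]


p = [0.001, 0.02, 0.04, 0.30]
print(bonferroni_reject(p, alpha=0.05))  # per-test threshold is 0.05 / 4 = 0.0125
print(bonferroni_adjust(p))
```

With four tests at $\alpha = 0.05$, only p-values at or below $0.0125$ are rejected, so in this made-up example only the first hypothesis is rejected.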

Control False discovery rate

FDR (False Discovery Rate)

$$ \begin{align*} \operatorname{FDR}\left(q^*\right)=E\left[\frac{F\left(q^*\right)}{S\left(q^*\right)}\right] \end{align*} $$
  • $q^*$ : threshold
  • $S(q^*)$ : number of tests declared significant at that threshold
  • $F(q^*)$ : number of false discoveries among them

For $m$ tests with p-values $p_1, \ldots, p_m$, the Benjamini-Hochberg step-up procedure at level $q^*$ is:

  1. Order the p-values: $p_{(1)} \leq \ldots \leq p_{(m)}$
  2. $k \overset{\mathrm{def}}{=} \max \left\{ i : p_{(i)} \leq \frac{i}{m} q^* \right\},\ i=1,2, \ldots, m$
  3. Reject $H_{(i)},\ i=1, \ldots, k$
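The three steps above can be sketched in plain Python as follows. The function name `benjamini_hochberg` and the example p-values are made up for illustration; $k$ is taken to be the largest rank $i$ whose ordered p-value satisfies $p_{(i)} \leq \frac{i}{m} q^*$, and $k = 0$ (no rejections) if no rank qualifies.

```python
def benjamini_hochberg(p_values, q=0.05):
    """Return a list of booleans: True where the hypothesis is rejected."""
    m = len(p_values)
    # Step 1: indices of the p-values in increasing order.
    order = sorted(range(m), key=lambda i: p_values[i])
    # Step 2: largest rank k with p_(k) <= (k / m) * q.
    k = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * q:
            k = rank
    # Step 3: reject the k hypotheses with the smallest p-values.
    reject = [False] * m
    for idx in order[:k]:
        reject[idx] = True
    return reject


p = [0.30, 0.001, 0.035, 0.012, 0.039]
print(benjamini_hochberg(p, q=0.05))
```

In this example $k = 4$: the third-smallest p-value, $0.035$, exceeds its own threshold $\frac{3}{5} \cdot 0.05 = 0.03$, but it is still rejected because the fourth-smallest, $0.039$, falls below $\frac{4}{5} \cdot 0.05 = 0.04$. That is the step-up behaviour of the procedure.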

q-value

$$ \begin{align*} \widehat{\operatorname{FDR}}(t)=\frac{\hat{\pi}_0 m t}{S(t)}=\frac{\hat{\pi}_0 m t}{\sum_i I\left(P_i \leq t\right)} \end{align*} $$

where $\pi_0=P(H_0 \text{ is true})$ and $t$ is the p-value cutoff.

$$ \begin{align*} \hat{\pi}_0(\lambda) &= \frac{\sum_i I\left(P_i > \lambda\right)}{m(1-\lambda)} \\ &\approx \frac{m_0}{m} \\ &= \frac{\text{number of true } H_0}{\text{total number of tests}} \end{align*} $$

That is, $\frac{\sum_i I\left(P_i > \lambda\right)}{1-\lambda}$ is used to estimate the number of true nulls $m_0$.
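A minimal Python sketch of the two estimators, $\hat{\pi}_0(\lambda)$ computed from the p-values above $\lambda$ and $\widehat{\operatorname{FDR}}(t)$ computed from it. The function names, the default $\lambda = 0.5$, the capping of both estimates at 1, and the example p-values are assumptions made for illustration.

```python
def estimate_pi0(p_values, lam=0.5):
    """Estimate pi_0 as #{p_i > lam} / (m * (1 - lam)), capped at 1."""
    m = len(p_values)
    n_above = sum(1 for p in p_values if p > lam)
    return min(1.0, n_above / (m * (1.0 - lam)))


def estimate_fdr(p_values, t, lam=0.5):
    """Estimate FDR(t) as pi0_hat * m * t / #{p_i <= t}, capped at 1."""
    m = len(p_values)
    n_significant = sum(1 for p in p_values if p <= t)
    if n_significant == 0:
        return 0.0  # nothing is called significant, so there are no discoveries
    return min(1.0, estimate_pi0(p_values, lam) * m * t / n_significant)


p = [0.001, 0.004, 0.01, 0.03, 0.15, 0.3, 0.45, 0.6, 0.75, 0.9]
print(estimate_pi0(p, lam=0.2))          # 5 / (10 * 0.8) = 0.625
print(estimate_fdr(p, t=0.05, lam=0.2))  # 0.625 * 10 * 0.05 / 4 = 0.078125
```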

Why $\hat{\pi}_0$

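First, plot the histogram of the p-values from all tests and look for the point beyond which the p-values start to look uniformly distributed. In the figure for this example, the p-value counts level off from about $0.2$ onward, as a Uniform distribution would, so we take $\lambda = 0.2$. How many true nulls $m_0$ should there then be? Under the null hypothesis $H_0$, the p-value follows a Uniform distribution: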

$$ \begin{align*} \text{p-value} \mid \text{null is true } \sim \mathrm{U}(0,1) \end{align*} $$

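The histogram looks like a Uniform distribution with another distribution added on top. Because the Uniform density is the same on every interval, the number of p-values greater than $\lambda$ can be used to estimate how many null p-values fall below $\lambda$. The number of null p-values over the whole range from $0$ to $1$ is therefore approximately

$$ \begin{align*} \frac{\sum_i I\left(P_i > \lambda\right)}{1-\lambda} \end{align*} $$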
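As a purely illustrative calculation with made-up numbers (not taken from the figure): suppose $m = 1000$ tests, $\lambda = 0.2$, and $640$ p-values are larger than $0.2$. If almost all p-values above $\lambda$ come from true nulls, those $640$ fill a fraction $1-\lambda = 0.8$ of the uniform part, so

$$ \begin{align*} \hat{m}_0 &= \frac{640}{1-0.2} = 800 \\ \hat{\pi}_0(0.2) &= \frac{\hat{m}_0}{m} = \frac{800}{1000} = 0.8 \end{align*} $$

Under these assumed counts, about $800$ of the $1000$ hypotheses would be estimated to be true nulls, of which roughly $800 \times 0.2 = 160$ would have p-values below $\lambda$.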


Licensed under CC BY-NC-SA 4.0