Several methods have been proposed that can adjust for batch effects, provided that each batch includes a large number of samples (> 25) [1,2].
ComBat, an empirical Bayes method [3], adjusts for batch effects even when the number of samples in each batch is small (< 10).
The aforementioned methods can adjust for batch effects only if samples from "each biological group" are represented in every batch.
Here, "each biological group" means that every biological phenotype must have at least one sample in every batch. For example, each batch must contain both disease and control samples, or each batch must contain all of the cancer subtypes "invasive", "proliferative", and "metabolic".
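The representation requirement above can be checked before running any batch adjustment. The following is a minimal Python sketch, not part of any batch-correction package: the function name `missing_groups_by_batch` and the sample labels are hypothetical, chosen only for illustration.

```python
from collections import defaultdict


def missing_groups_by_batch(samples):
    """Return {batch: [groups absent from that batch]} for incomplete batches.

    `samples` is a list of (batch, group) pairs; it is iterated twice,
    so pass a list rather than a one-shot iterator.
    """
    all_groups = {group for _, group in samples}
    seen = defaultdict(set)
    for batch, group in samples:
        seen[batch].add(group)
    # Keep only batches that lack at least one group.
    return {
        batch: sorted(all_groups - groups)
        for batch, groups in seen.items()
        if all_groups - groups
    }


# Hypothetical design: batch2 has no "metabolic" sample.
samples = [
    ("batch1", "invasive"),
    ("batch1", "proliferative"),
    ("batch1", "metabolic"),
    ("batch2", "invasive"),
    ("batch2", "proliferative"),
]
print(missing_groups_by_batch(samples))  # {'batch2': ['metabolic']}
```

An empty result means every group appears in every batch, which is the condition the methods above rely on.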
Suppose the "metabolic" subtype accounts for 1/3 of all cancer patients,
e.g. 1 million out of 3 million cancer patients.
If a batch contains 6 tumors, the probability that it includes at least one metabolic sample is about 0.91 (roughly 1 - (2/3)^6), and the probability grows with larger batches.
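# Stat is a local module loaded from the current directory.
use lib "./";
use Stat;
# Arguments appear to be: population size, metabolic count, batch size, minimum hits.
my $p = Stat::phyper_enrichment(3E6, 1E6, 6, 1);
print "$p\n";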
Probability of at least one metabolic sample, by batch size:
6 -> 91%
7 -> 94%
8 -> 96%
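The figures above can be reproduced without the local Perl module. The sketch below is a standalone Python check that assumes the batch is drawn without replacement from the population (a hypergeometric model); the function names `prob_at_least_one` and `min_batch_size` are illustrative and are not taken from the Stat module.

```python
def prob_at_least_one(population, subtype_count, batch_size):
    """P(batch contains >= 1 subtype sample), sampling without replacement."""
    others = population - subtype_count
    p_none = 1.0
    # Probability that every one of the batch_size draws misses the subtype.
    for i in range(batch_size):
        p_none *= (others - i) / (population - i)
    return 1.0 - p_none


def min_batch_size(population, subtype_count, target):
    """Smallest batch size whose probability reaches `target` (0 < target < 1)."""
    n = 1
    while prob_at_least_one(population, subtype_count, n) < target:
        n += 1
    return n


for n in (6, 7, 8):
    p = prob_at_least_one(3_000_000, 1_000_000, n)
    print(f"{n} -> {p:.0%}")
# 6 -> 91%
# 7 -> 94%
# 8 -> 96%
```

With a population this large, the result is practically the same as the simpler binomial estimate 1 - (2/3)^n. Under the same assumptions, `min_batch_size(3_000_000, 1_000_000, 0.95)` gives 8.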
Reference:
Empirical Bayes accomodation of batch-effects in microarray data using identical replicate reference samples: application to RNA expression profiling of blood from Duchenne muscular dystrophy patients
[1] Alter O, Brown PO, Botstein D. Singular value decomposition for genome-wide expression data processing and modeling. Proc Natl Acad Sci USA. 2000;97:10101–10106. doi: 10.1073/pnas.97.18.10101.