Zhengdeng Lei, PhD

2009 - Present Research Fellow at Duke-NUS, Singapore
2007 - 2009 High Throughput Computational Analyst, Memorial Sloan-Kettering Cancer Center, New York
2003 - 2007 PhD, Bioinformatics, University of Illinois at Chicago

Monday, April 4, 2011

Batch effect

Several methods have been proposed that can adjust for batch effects, provided that a large number of samples (> 25) is included in each batch [1,2].

ComBat, an empirical Bayes method [3], adjusts for batch effects even when the number of samples in each batch is small (< 10).

The aforementioned methods can adjust for batch effects provided that samples from "each biological group" are represented in every batch.

Here "each biological group" means that every biological phenotype must have at least one sample in each batch. For example, each batch must contain both disease and control samples, or each batch must contain all of the cancer subtypes "invasive", "proliferative", and "metabolic".

Suppose the "metabolic" subtype makes up 1/3 of all cancer patients, e.g. 1 million out of 3 million.
A batch will generally contain at least 6 tumors, so the probability that it includes at least one metabolic sample is about 0.91 or higher.
With a population this large, that is roughly 1 - (2/3)^6.



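use strict;
use warnings;
use lib "./";   # Stat.pm is expected in the current directory
use Stat;
# Arguments are presumably: population size, metabolic patients, batch size, minimum metabolic samples
my $p = Stat::phyper_enrichment(3E6, 1E6, 6, 1);
print "$p\n";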
------

Batch size -> probability of at least one metabolic sample:
6 -> 91%
7 -> 94%
8 -> 96%
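
The Stat module called above is local and not shown here. As a rough, self-contained sketch of the same calculation, assuming phyper_enrichment returns the hypergeometric probability of drawing at least one metabolic sample, something like the following could be used. The sub name prob_at_least_one is made up for illustration and is not part of Stat.

```perl
use strict;
use warnings;

# Hypothetical helper, not part of the Stat module used above.
# Probability that a batch of $n tumors, drawn without replacement from
# $N patients of whom $K have the subtype, contains at least one of them.
sub prob_at_least_one {
    my ($N, $K, $n) = @_;
    my $p_none = 1;
    # P(no subtype sample) = product over draws of
    # (non-subtype patients left) / (patients left)
    for my $i (0 .. $n - 1) {
        $p_none *= ($N - $K - $i) / ($N - $i);
    }
    return 1 - $p_none;
}

# Batch sizes 6 to 8 with 1 million metabolic patients out of 3 million
for my $n (6 .. 8) {
    printf "%d -> %.0f%%\n", $n, 100 * prob_at_least_one(3e6, 1e6, $n);
}
```

Under these assumptions the loop should reproduce the percentages listed above. Because the population is so large, the result is almost identical to the simpler 1 - (2/3)^n.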



References:

Empirical Bayes accomodation of batch-effects in microarray data using identical replicate reference samples: application to RNA expression profiling of blood from Duchenne muscular dystrophy patients
[1] Alter O, Brown PO, Botstein D. Singular value decomposition for genome-wide expression data processing and modeling. Proc Natl Acad Sci USA. 2000;97:10101–10106. doi: 10.1073/pnas.97.18.10101.
[2] Benito M, Parker J, Du Q, Wu J, Xiang D, Perou CM, Marron JS. Adjustment of systematic microarray data biases. Bioinformatics. 2004;20:105–114. doi: 10.1093/bioinformatics/btg385.
[3] Johnson WE, Li C, Rabinovic A. Adjusting batch effects in microarray expression data using empirical Bayes methods. Biostatistics. 2007;8:118–127. doi: 10.1093/biostatistics/kxj037.
