Zhengdeng Lei, PhD
2007 - 2009 High Throughput Computational Analyst, Memorial Sloan-Kettering Cancer Center, New York
2003 - 2007 PhD, Bioinformatics, University of Illinois at Chicago
Tuesday, December 13, 2011
Monday, November 28, 2011
Ranking
http://ontario.compareschoolrankings.org/elementary/SchoolsByAreaMap.aspx
http://ontario.compareschoolrankings.org/secondary/SchoolsByAreaMap.aspx
Saturday, November 26, 2011
http://www.imuc.com/pdf/Griffin-Industry-Report-09-14-2009.pdf
Brain Cancer Stem Cells Display Preferential Sensitivity to Akt Inhibition
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC2739007/
Breast CS CD44+/CD24-/Lin-
Prospective identification of tumorigenic breast cancer cells
Friday, November 25, 2011
BEZ235 vs CD44+
Combination Therapy Targeting Both Tumor-Initiating and Differentiated Cell Populations in Prostate Carcinoma
Tuesday, November 22, 2011
Monday, November 21, 2011
gene profiling of cancer stem cell
http://www.google.com/search?q=gene+profiling+of+cancer+stem+cell&rls=com.microsoft:en-US&ie=UTF-8&oe=UTF-8&startIndex=&startPage=0
http://www.molecular-cancer.com/content/6/1/75
http://genomebiology.com/content/9/5/R83
Sunday, November 20, 2011
cancer stem cells (CSCs) or tumor-initiating cells (TICs)
http://www.miltenyibiotec.com/en/NN_722_Tumor_stem_cells.aspx
Within a tumor, the majority of tumor cells have limited ability to proliferate and rather differentiate into cells that constitute the bulk of the tumor mass. Recent theories suggest that a small population of cells within some tumors possess the ability to self-renew and proliferate and are thus able to maintain the tumor. These cells, which are called cancer stem cells (CSCs) or tumor-initiating cells (TICs), have been observed to share certain characteristics with normal stem cells, including a stem cell–like phenotype and function.
Certain surface markers that are associated with stem cells are also found on cancer stem cells. Human and mouse stem cell markers such as CD34, CD133, CD117, Sca-1, and other markers, such as CD44, CD24, CD20, CD105, and CD326 (EpCAM) have been found on cancer stem cells. This particular type of cell seems to be able to initiate and drive tumor growth in different hematological and solid tumors. It is critical to be able to identify and isolate these cells from tumor tissues in order to provide a clearer picture of the mechanisms governing the establishment of CSCs, their maintenance, and the molecular alteration in comparison to normal cells. An enrichment of CSCs has been observed in cell populations selected for CD133 expression from brain tumor [1–4], prostate cancer [5], renal tumors [6], and also recently from colon cancer [7,8] and hepatocellular carcinoma [9].
To view the respective citations and a list of associated products, please download the attached PDF file.
Monday, November 14, 2011
PCA
# The object data holds the RMA gene expression, dimension p x n = 54675 x 248 (n = 248 samples in two batches: 192 + 56)
genes <- data[1:54613, ] # leave out rows 54614-54675 (the control probesets)
genes <- t(genes) # prcomp expects samples in rows
pcs <- prcomp(genes)
summary(pcs)
library(scatterplot3d)
PC1 <- pcs$x[, 1]
PC2 <- pcs$x[, 2]
PC3 <- pcs$x[, 3]
batch.colors <- c("#FF77FF", "blue") # SG Batch A, SG Batch B
group.colors <- rep(batch.colors, times = c(192, 56))
scatterplot3d(PC3, PC1, PC2, main="PCA scatterplot before ComBat normalization", color=group.colors, pch=16)
legend.txt <- c("SG Batch A", "SG Batch B")
# Use the same colors and symbol in the legend as in the plot
legend(-5.75, 5.2, legend.txt, bty="n", col=batch.colors, cex=1.2, pch=16)
Monday, November 7, 2011
Cancer Guide
http://www.cancerguide.org/pathology.html
Wednesday, November 2, 2011
Boxplot
p <- signif(p, 3)
Monday, October 31, 2011
GSM2Sample.pl
use lib 'E:/perl_lib';
use FileSystem;
my $work_dir = 'E:\CEL\GSE15460\GSE15460_RAW\GSE15460.SG.CEL';
my $output_file = 'E:\CEL\GSE15460\GSE15460_RAW\GSE15460.info.txt';
my @CellFiles = ();
@CellFiles = FileSystem::GetFileByPattern($work_dir, '\.CEL', @CellFiles);
foreach my $cel (@CellFiles)
{
print "$cel\n";
my @CELContent = FileSystem::ReadFile($cel);
# Index 13 is the 14th line of the CEL file, used here as the sample info line
my $sample_info_line = $CELContent[13];
my @sample_info = split(/[\s:]/, $sample_info_line);
print "$sample_info[2]\n";
FileSystem::WriteFile($output_file, "$cel\t$sample_info[2]\n");
}
FileSystem::Close();
STANDARD TRID | Tumors on mRNA (Affy U133P2) | Reason for exclusion | GEO |
NGCII011/LGE | GC-011LGE-T.CEL | Fail QC | GSM387788.CEL |
NGCII035/PCC | GC-035PCC-T.CEL | Fail QC | GSM387797.CEL |
NGCII038/LYC | GC-038LYC-T.CEL | Fail QC | GSM387798.CEL |
TGCII021/LAH | GC-021LAH-T.CEL | Fail QC | GSM387790.CEL |
980327 | GC-980327T.CEL | called as "adenosquamous cancer", relapse also adenosquamous | GSM387937.CEL |
2000619 | GC-2000619T.CEL | squamous CA | GSM387844.CEL |
TGCII026/GJK | GC-026-GJK-T.CEL | GIST/squamous | GSM387793.CEL |
TGCII039/TSC | GC-039-TSC-T.CEL | GIST/squamous | GSM387799.CEL |
Wednesday, October 26, 2011
Sunday, October 16, 2011
BFRM+NTP
top 1/3 vs bottom 1/3
use NTP and gene signature to predict the direction.
Thursday, October 13, 2011
human gastric precancerous lesions
http://www.weibing.com.cn/html/mxqbxwy/
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC1766652/pdf/v045p000I5.pdf
http://www.google.com.sg/url?sa=t&source=web&cd=2&ved=0CC0QFjAB&url=http%3A%2F%2Fwww.ncbi.nlm.nih.gov%2Fpmc%2Farticles%2FPMC1146204%2F&ei=yNuWTtO4BsLsrAeSvqWbBA&usg=AFQjCNHwRAwFh8ipHDTFVBzNx-r7G-dbkg&sig2=kbm6v1zThHgCqzhCMyg13Q
chronic superficial gastritis (CSG, 16), chronic atrophic gastritis (CAG, 16), intestinal metaplasia (IM, 35), gastric epithelial dysplasia (GED, 23) and gastric cancer (CA, 25), and conditions of H. pylori infection
Tuesday, October 11, 2011
http://en.wikipedia.org/wiki/Carcinogenesis
A new way of looking at carcinogenesis comes from integrating the ideas of developmental biology into oncology. The cancer stem cell hypothesis proposes that the different kinds of cells in a heterogeneous tumor arise from a single cell, termed a cancer stem cell. Cancer stem cells may arise from transformation of adult stem cells or differentiated cells within a body. These cells persist as a subcomponent of the tumor and retain key stem cell properties. They give rise to a variety of cells, are capable of self-renewal and homeostatic control. [25] Furthermore, the relapse of cancer and the emergence of metastasis are also attributed to these cells. The cancer stem cell hypothesis does not contradict earlier concepts of carcinogenesis.
Monday, October 3, 2011
Saturday, October 1, 2011
http://www.cancer.gov/cancertopics/understandingcancer/cancergenomics
http://www.cancer.gov/cancertopics/understandingcancer/moleculardiagnostics
Monday, September 26, 2011
Real time bus arrival time
http://www.sbstransit.com.sg/
21371 199
http://www.smrt.com.sg/buses/busarrivaltime.asp
06029 33, 63, 851, 75
http://www.publictransport.sg/publish/mobile/en/busarrivaltime.jsp
Wednesday, September 21, 2011
Export chart from Excel to Adobe illustrator
1. In Excel: copy the chart, go to another sheet, e.g. Sheet2, and use Paste Special, "Picture (Enhanced Metafile)".
2. Copy the chart from Sheet2 in Excel, paste it into a new Word document, and save the Word document as PDF.
3. Open the PDF in Illustrator.
Friday, September 9, 2011
Find SRE motif in promoter region
#!/usr/bin/perl
use lib 'E:/perl_lib';
use Bioinformatics;
use LWP::Simple;
my $genes = qq/MMP1 chr11:102,660,641-102,668,966 -
MMP2 chr16:55,513,081-55,540,584 +
MMP9 chr20:44,637,547-44,645,199 +
IL6 chr7:22,766,766-22,771,619 +
IL8 chr4:74,606,275-74,609,431 +
/;
my @genes = split(/\n/, $genes);
for (my $gene_i=0; $gene_i<=$#genes; $gene_i++)
{
$genes[$gene_i] =~ s/,//g;
if ($genes[$gene_i] =~/(\w+)\s(chr\d+):(\d+)-(\d+)\s(.*)/)
{
my $gene_name = $1;
my $gene_chr = $2;
my $gene_start = $3;
my $gene_end = $4;
my $strand = $5;
my $promoter_start;
my $promoter_end;
if ($strand eq "+")
{
$promoter_start = $gene_start - 2001;
$promoter_end = $gene_start - 1;
}
else
{
$promoter_start = $gene_end + 1;
$promoter_end = $gene_end + 2001;
}
my $seg = "$gene_chr:$promoter_start,".$promoter_end;
my $URL_gene ="http://genome.ucsc.edu/cgi-bin/das/hg19/dna?segment=$seg";
my $genefile = get($URL_gene);
unless (defined $genefile)
{
warn "Could not fetch $URL_gene\n";
next;
}
# Keep only the sequence lines of the DAS response and join them into one string
my $DNA = join('', grep { /^[acgt]+$/i } split("\n", $genefile));
if ($strand eq "-")
{
$DNA = lc(Bioinformatics::cStrand($DNA));
}
my $SRE_found = 0;
while( $DNA =~ /(cc[at]{5}gg)/)
{
$SRE_found = 1;
my $SRE = $1;
my $SRE_upper = uc($SRE);
$DNA =~ s/$SRE/----$SRE_upper----/;
}
if ($SRE_found)
{
print "$genes[$gene_i]: $DNA\n\n\n";
}
}
}
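The promoter-window arithmetic and the motif marking in the script above can be checked offline with a small sketch. This is written in Python rather than Perl, takes a sequence string instead of calling UCSC, and the function names (`promoter_window`, `reverse_complement`, `mark_motifs`) are hypothetical. It also assumes that `Bioinformatics::cStrand` returns the reverse complement, which the post does not show.

```python
import re

# Pattern copied from the Perl script: "cc", five a/t bases, then "gg".
SRE_PATTERN = re.compile(r"cc[at]{5}gg")

def promoter_window(gene_start, gene_end, strand, upstream=2001):
    """Return (start, end) of the upstream window, mirroring the script's arithmetic."""
    if strand == "+":
        return gene_start - upstream, gene_start - 1
    return gene_end + 1, gene_end + upstream

def reverse_complement(dna):
    """Reverse-complement a lowercase DNA string (assumed behaviour of cStrand)."""
    return dna.translate(str.maketrans("acgt", "tgca"))[::-1]

def mark_motifs(dna):
    """Upper-case every motif hit and flank it with dashes, as the script prints it."""
    return SRE_PATTERN.sub(lambda m: "----" + m.group(0).upper() + "----", dna.lower())

# IL6 coordinates from the gene list above (plus strand).
print(promoter_window(22766766, 22771619, "+"))
# Made-up 13-base string containing one hit.
print(mark_motifs("ttccatataggaa"))
```

Because a hit starts with cc and ends with gg, two hits cannot overlap, so a single substitution pass marks the same positions as the repeated search in the Perl loop.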
Wednesday, September 7, 2011
http://www.novartis.com/innovation/research-development/drug-discovery-development-process/index.shtml
http://www.novartis.com/innovation/research-development/targeted-therapies/index.shtml
Tuesday, September 6, 2011
Sunday, August 21, 2011
How to dissect the tumors
1. check batch effect in SGIIA
(1) Heatmap, PCA on control genes
(2) Define the batches by global PCA
2. ComBat
3. CC_IFS on SGIIA
4. Select samples with avg_consensus_idx > 0.9
5. Use samples from step 4, and combine with SGIIB samples, repeat step 2-4
6. For new cohorts, repeat step 5.
Use limma to derive pairwise signatures, then use NTP or SVM to predict.
Thursday, August 18, 2011
How to include the maximum number of tumors
Now we have 201 tumor samples with avg_consensus_idx > 0.9 in the CC_IFS of ComBat248, and the new ComBat201 has cophenetic = 1.
Thus we can lower the avg_consensus_idx cutoff to include more samples until cophenetic < 1.
24% of M patients benefit from chemo.
30% of D patients may benefit from PI3K?
46% treatment unknown?
Monday, August 8, 2011
tophat
nohup tophat -r 124 -o tophat124 --num-threads=4 hg19 AGS_1_sequence.fastq AGS_2_sequence.fastq >screen.txt &
The estimated average library fragment size is 280 and the read length is 60 bp, so the inner distance between paired reads (--mate-inner-dist) is 280 - 2*60 = 160.
In general, -r = fragment size (insert size) - 2 * read length; the run above used -r 124.
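The inner-distance arithmetic from the note above, as a tiny Python sketch. The function name is hypothetical; it only restates the formula and does not call tophat.

```python
def mate_inner_distance(fragment_size, read_length):
    """Inner distance between mates: fragment size minus both read lengths."""
    return fragment_size - 2 * read_length

# Values quoted in the note: fragment size 280, read length 60.
print(mate_inner_distance(280, 60))  # 160
```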
Wednesday, July 27, 2011
Good Univ's
Tuesday, July 12, 2011
pvalues for affy probesets
library(affy)
setwd("/home/leiz/CEL/SGII192")
data <- ReadAffy()
eset_pma <- mas5calls(data)
pvalues <- assayDataElement(eset_pma, "se.exprs")
write.table(pvalues, "GC192.pvalues.txt", sep="\t")
Monday, July 11, 2011
Friday, July 8, 2011
glmnet cox
http://icb.med.cornell.edu/wiki/index.php/Elementolab/R_tutorial
Thursday, June 30, 2011
Wednesday, June 29, 2011
http://www.cic.gc.ca/english/immigrate/skilled/complete-applications.asp
Wednesday, June 8, 2011
SCP without password
http://www.linuxjournal.com/article/8600
Copy from your local machine to the remote machine.
1. At your local machine (e.g. Steve Server):
ssh-keygen -t rsa
# press Enter at each prompt to accept the default file and an empty passphrase
cd /home/leiz/.ssh
scp ~/.ssh/id_rsa.pub NUSSTF\\gmslz@172.25.138.12:/home/gmslz/.ssh/id_rsa.pub.FromSteve
2. At the remote machine (e.g. Cluster):
cd /home/gmslz/.ssh/
ls -la
cat id_rsa.pub.FromSteve >> authorized_keys
DONE
You can now scp from local (Steve) to remote (Cluster) without a password.
PS: to get your local IP if there is no ifconfig on your local machine:
netstat -an | grep "tcp"
172.25.136.25
Thursday, May 26, 2011
% With iterative feature selection, converged after three runs (consensus clustering)
file = 'E:\Projects\8.ComBAT\ComBat399T\CC_IFS\Run2\K3_consensus_matrix2.txt'
% No iterative feature selection
%file = 'E:\Projects\8.ComBAT\ComBat399T\CC_IFS\K3_consensus_matrix0.txt'
n = 399;
A = zeros(n, n);
fid = fopen(file, 'r');
% Skip the header line
tline = fgetl(fid);
for row = 1:n
tline = fgetl(fid);
% Split on tabs; the first field is the row label, the next n fields are the consensus values
fields = regexp(tline, '\t', 'split');
A(row, :) = str2double(fields(2:n+1));
end
fclose(fid);
cd('E:\MATLAB_lib')
v = getcoph(A)
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
function [coph] = getcoph(a)
% a is the consensus matrix
[m,m]=size(a);
uvec=a(1,2:end);
for i=2:m-1;
uvec=[uvec a(i,i+1:end)]; %get upper diagonal elements of consensus
end
y=1-uvec; % consensus are similarities, convert to distances
z=linkage(y,'average'); % use average linkage
coph=cophenet(z,y);
end
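For checking small consensus matrices without MATLAB, the same calculation can be sketched in plain Python: convert consensus to distance (1 - consensus), run average-linkage clustering, and correlate the original distances with the merge heights. This is a minimal sketch, not the MATLAB `linkage`/`cophenet` implementation. The function name is hypothetical, ties between equal distances may be broken differently, and it is slow for large n.

```python
import math

def cophenetic_correlation(consensus):
    """Cophenetic correlation of average-linkage clustering on 1 - consensus."""
    n = len(consensus)
    # Pairwise distances for the upper triangle only.
    dist = {(i, j): 1.0 - consensus[i][j] for i in range(n) for j in range(i + 1, n)}
    clusters = {i: [i] for i in range(n)}   # cluster id -> member sample indices
    link = dict(dist)                       # distances between active clusters
    coph = {}                               # merge height at which each pair first joins
    next_id = n
    while len(clusters) > 1:
        (a, b), height = min(link.items(), key=lambda kv: kv[1])
        for i in clusters[a]:
            for j in clusters[b]:
                coph[(min(i, j), max(i, j))] = height
        merged = clusters.pop(a) + clusters.pop(b)
        link = {k: v for k, v in link.items() if a not in k and b not in k}
        # Average linkage: mean of all original distances between the two groups.
        for c, members in clusters.items():
            total = sum(dist[(min(i, j), max(i, j))] for i in merged for j in members)
            link[(c, next_id)] = total / (len(merged) * len(members))
        clusters[next_id] = merged
        next_id += 1
    pairs = sorted(dist)
    x = [dist[p] for p in pairs]
    y = [coph[p] for p in pairs]
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((u - mx) * (v - my) for u, v in zip(x, y))
    sxx = sum((u - mx) ** 2 for u in x)
    syy = sum((v - my) ** 2 for v in y)
    return sxy / math.sqrt(sxx * syy)

# Toy consensus matrix with two perfectly separated clusters (made-up values).
toy = [
    [1.0, 1.0, 0.0, 0.0],
    [1.0, 1.0, 0.0, 0.0],
    [0.0, 0.0, 1.0, 1.0],
    [0.0, 0.0, 1.0, 1.0],
]
print(cophenetic_correlation(toy))
```

For a perfectly clean consensus matrix like the toy one, the result should be 1 up to rounding. If all distances are identical the correlation is undefined and the sketch will raise a division error.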
Wednesday, May 25, 2011
Download youtube
Go to youtube URL for a video.
copy and paste the following to your chrome address bar
22 1280x720
javascript:isIE=/*@cc_on!@*/false;isIE ? swfHTML=document.getElementById('movie_player').getElementsByTagName('param')[1].value:swfHTML=document.getElementById("movie_player").getAttribute("flashvars");
w=swfHTML.split("&"); for(i=0;i<=w.length-1;i++) if(w[i].split("=")[0] == "fmt_url_map"){links=unescape(w[i].split("=")[1]);break;}abc = links.split(",");for(i=0;i<=abc.length-1;i++){fmt=abc[i].split("|")[0];if(fmt==22){url = abc[i].split("|")[1];window.location.href = url + '&title=' + (((document.title.replace('#',' ')).replace('@',' ')).replace('*',' ')).replace('|',' ');}}
35 854x480
javascript:isIE=/*@cc_on!@*/false;isIE ? swfHTML=document.getElementById('movie_player').getElementsByTagName('param')[1].value:swfHTML=document.getElementById("movie_player").getAttribute("flashvars");
w=swfHTML.split("&"); for(i=0;i<=w.length-1;i++) if(w[i].split("=")[0] == "fmt_url_map"){links=unescape(w[i].split("=")[1]);break;}abc = links.split(",");for(i=0;i<=abc.length-1;i++){fmt=abc[i].split("|")[0];if(fmt==35){url = abc[i].split("|")[1];window.location.href = url + '&title=' + (((document.title.replace('#',' ')).replace('@',' ')).replace('*',' ')).replace('|',' ');}}
34 640x360
18 640x360
javascript:isIE=/*@cc_on!@*/false;isIE ? swfHTML=document.getElementById('movie_player').getElementsByTagName('param')[1].value:swfHTML=document.getElementById("movie_player").getAttribute("flashvars");
w=swfHTML.split("&"); for(i=0;i<=w.length-1;i++) if(w[i].split("=")[0] == "fmt_url_map"){links=unescape(w[i].split("=")[1]);break;}abc = links.split(",");for(i=0;i<=abc.length-1;i++){fmt=abc[i].split("|")[0];if(fmt==18){url = abc[i].split("|")[1];window.location.href = url + '&title=' + (((document.title.replace('#',' ')).replace('@',' ')).replace('*',' ')).replace('|',' ');}}
5 320x240
javascript:isIE=/*@cc_on!@*/false;isIE ? swfHTML=document.getElementById('movie_player').getElementsByTagName('param')[1].value:swfHTML=document.getElementById("movie_player").getAttribute("flashvars");w=swfHTML.split("&");for(i=0;i<=w.length-1;i++)if(w[i].split("=")[0] == "fmt_url_map"){links=unescape(w[i].split("=")[1]);break;}abc = links.split(",");for(i=0;i<=abc.length-1;i++){fmt=abc[i].split("|")[0];if(fmt==5){url = abc[i].split("|")[1] + '&title=' + (((document.title.replace('#',' ')).replace('@',' ')).replace('*',' ')).replace('|',' ');window.location.href = url;}}
Friday, May 20, 2011
Standardization
x <- matrix(1:21, ncol=7)
By row (gene)
std.x.by.row <- t(scale(t(x), scale=T))
By column (array)
std.x.by.col <- scale(x, scale=T)
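To see what the two `scale()` calls do, here is a plain-Python sketch of the same standardization (Python instead of R, function names hypothetical). It centers each vector to mean 0 and divides by the sample standard deviation (n - 1), which is what `scale()` does with its default centering.

```python
from statistics import mean, stdev

def zscore(values):
    """Center to mean 0 and divide by the sample standard deviation (n - 1)."""
    m, s = mean(values), stdev(values)
    return [(v - m) / s for v in values]

def standardize_rows(matrix):
    """Standardize each row (gene) separately."""
    return [zscore(row) for row in matrix]

def standardize_cols(matrix):
    """Standardize each column (array) separately."""
    cols = [zscore(list(col)) for col in zip(*matrix)]
    return [list(row) for row in zip(*cols)]

# Same 3 x 7 matrix as matrix(1:21, ncol=7); R fills it column by column.
x = [[r + 3 * c for c in range(7)] for r in range(1, 4)]
print(standardize_rows(x)[0])
print(standardize_cols(x)[0])
```

Each column of this matrix holds three consecutive integers, so the column-standardized values come out as -1, 0 and 1 down every column.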
Check the batch effect by date
date2col <- function(date.list)
{
clr.template = c("red", "orange", "yellow", "green", "cyan", "blue", "purple")
num.dates <- length(date.list)
clr.list <- vector()
clr.list[1] <- "red"
c.index <- 0
for (i in 2:num.dates) {
if(date.list[i] == date.list[i-1]) {
clr.list[i] = clr.list[i-1]
} else {
c.index <- c.index+1
clr.list[i] = clr.template[c.index %% 7+1]
}
}
return(clr.list)
}
setwd("E:\\CEL\\GastricCancer\\AU\\PM_data_new\\Gastric_Affy_files\\Tumors")
data <- read.table(file="AU_GC70.rma.txt", header=T, row.names=1)
data.ctrl <- data[54614:54675, ]
library("gplots")
#data <- t(scale(t(data.ctrl), scale=T)) #standardized by row(gene)
#data[data < -3] <- -3
#data[data > 3] <- 3
data <- sweep(data.ctrl, 1, apply(data.ctrl, 1, median)) #just median centered
my.color <- c("8/4/2004","8/4/2004","11/18/2004","11/18/2004","11/25/2004","11/25/2004","11/25/2004","11/25/2004","11/25/2004","11/25/2004","11/25/2004","11/26/2004","11/26/2004","11/26/2004","11/26/2004","11/26/2004","12/2/2004","12/2/2004","12/3/2004","12/3/2004","12/3/2004","12/3/2004","12/3/2004","12/3/2004","12/3/2004","12/3/2004","12/3/2004","1/14/2005","1/14/2005","1/14/2005","1/14/2005","1/14/2005","1/14/2005","2/17/2005","2/17/2005","2/17/2005","2/17/2005","2/17/2005","2/17/2005","2/17/2005","2/25/2005","2/25/2005","3/4/2005","3/4/2005","3/4/2005","3/18/2005","3/18/2005","3/23/2005","4/8/2005","4/8/2005","4/8/2005","4/8/2005","4/8/2005","4/8/2005","4/28/2005","4/28/2005","4/28/2005","4/28/2005","4/29/2005","4/29/2005","4/29/2005","4/29/2005","5/19/2005","5/24/2005","5/24/2005","6/22/2005","6/22/2005","6/22/2005","6/22/2005","6/22/2005")
#my.color <- rep("black",dim(data)[2])
my.color <- date2col(my.color)
hm<-heatmap.2(as.matrix(data), col=greenred(75), scale="none", dendrogram="none", Rowv= T, Colv=F, ColSideColors=my.color, key=TRUE, symkey=FALSE, density.info="none",trace="none", cexRow=0.75,cexCol=0.75)
pdf(file = "Batch_in_CtrlGenes.pdf", width=10, height=10)
#pdf(file = "Batch_in_CtrlGenes.pdf")
hm<-heatmap.2(as.matrix(data), col=greenred(75), scale="none", dendrogram="none", Rowv= T, Colv=F, ColSideColors=my.color, key=TRUE, symkey=FALSE, density.info="none",trace="none", cexRow=0.75,cexCol=0.75)
dev.off()
Wednesday, May 11, 2011
GSAA
High, mid, low activity, e.g. p53
mid vs low -->(GSEA) ES
high vs low --> High ES??
Suppose we have 200 cell lines; obtain the expression before drug treatment, and obtain GI50 for M drugs.
(NCI60, too small?)
Gene set | Drug1 | Drug2 | Drug3 | ... ... | DrugM |
GeneSet1 | Corr(1,1) = GSAA1 vs GI50 for Drug1 across all cell lines | Corr(1,2) = GSAA1 vs GI50 for Drug2 across all cell lines | | | |
GeneSet2 | | | | | |
GeneSet3 | | | | | |
GeneSet4 | | | | | |
... ... | | | | | |
GeneSetN | | | | | |
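One way to fill such a table, sketched in Python. It assumes Pearson correlation across cell lines, which the note does not specify. The function names and the four-cell-line toy numbers are hypothetical and only show the shape of the computation.

```python
import math

def pearson(x, y):
    """Pearson correlation between two equally long lists of numbers."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def correlation_table(geneset_scores, gi50):
    """Gene set x drug table: correlation of activity score with GI50 across cell lines."""
    return {
        gs: {drug: pearson(scores, values) for drug, values in gi50.items()}
        for gs, scores in geneset_scores.items()
    }

# Hypothetical toy data: one value per cell line, same cell-line order in every list.
geneset_scores = {"GeneSet1": [0.1, 0.4, 0.5, 0.9], "GeneSet2": [0.8, 0.6, 0.3, 0.2]}
gi50 = {"Drug1": [1.0, 4.0, 5.0, 9.0], "Drug2": [3.0, 3.5, 2.0, 4.0]}

table = correlation_table(geneset_scores, gi50)
for gs, row in table.items():
    print(gs, row)
```

A rank-based correlation may suit GI50 values better; that would only change the `pearson` helper.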
Tuesday, April 26, 2011
Median Centered
data.log10.MedianCentered <- sweep(data.log10, 2, apply(data.log10, 2, median)) # subtract each column's median
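The same column-wise median centering as a small Python sketch (a swap-in for the R one-liner; the function name and the toy matrix are hypothetical).

```python
from statistics import median

def median_center_columns(matrix):
    """Subtract each column's median from every value in that column."""
    medians = [median(col) for col in zip(*matrix)]
    return [[v - m for v, m in zip(row, medians)] for row in matrix]

# Made-up 3 x 2 example: column medians are 2 and 20.
print(median_center_columns([[1, 10], [2, 20], [6, 60]]))
```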
Tuesday, April 19, 2011
Enrichment test in cancer subtypes
CpG status | Invasive | Proliferative | Metabolic |
Hypermethylated | 4110 | 1536 | 988 |
Hypomethylated | 625 | 1263 | 284 |
Q1: Is the Invasive subtype enriched with Hypermethylated CpGs?
q <- 4110 # number of success (white balls) drawn
m <- 4110+1536+988 # number of success (white) in the urn
n <- 625+1263+284 # number of fail (black) in the urn
k <- 4110+625 # number of balls drawn out
p.value <- phyper(q-1,m,n,k, lower.tail=F)
# why q-1, because we want to have Pr(X>=x) instead of Pr(X>x)
p.value
= 2.975069e-162
YES
Conclusion: The Invasive subtype is significantly enriched with Hypermethylated CpGs.
Q2: Is the Invasive subtype enriched with Hypomethylated CpGs?
q <- 625 # number of successes (white balls, here hypomethylated CpGs) drawn
m <- 625+1263+284 # number of successes (white) in the urn
n <- 4110+1536+988 # number of failures (black) in the urn
k <- 4110+625 # number of balls drawn out
p.value <- phyper(q-1,m,n,k, lower.tail=F)
p.value
=1
No
Q3: Is the Proliferative subtype enriched with Hypermethylated CpGs?
q <- 1536
m <- 4110+1536+988
n <- 625+1263+284
k <- 1263+1536
p.value <- phyper(q-1,m,n,k, lower.tail=F)
p.value
= 1
No
Q4: Is the Proliferative subtype enriched with Hypomethylated CpGs?
q <- 1263
m <- 625+1263+284
n <- 4110+1536+988
k <- 1263+1536
p.value <- phyper(q-1,m,n,k, lower.tail=F)
p.value
= 3.765696e-193
YES
Conclusion: The Proliferative subtype is significantly enriched with Hypomethylated CpGs.
Q5: Is the Metabolic subtype enriched with Hypermethylated CpGs?
q <- 988 # number of success (white balls) drawn
m <- 4110+1536+988 # number of success (white) in the urn
n <- 625+1263+284 # number of fail (black) in the urn
k <- 988+284 # number of balls drawn out
p.value <- phyper(q-1,m,n,k, lower.tail=F)
p.value
= 0.01919936
YES
Conclusion: The Metabolic subtype is significantly enriched with Hypermethylated CpGs.
Q6: Is the Metabolic subtype enriched with Hypomethylated CpGs?
q <- 284 #
m <- 625+1263+284
n <- 4110+1536+988
k <- 988+284
p.value <- phyper(q-1,m,n,k, lower.tail=F)
p.value
= 0.9839128
No
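The six tests above all have the same shape, so they can be written as one loop. This is a Python sketch of the upper tail that `phyper(q-1, m, n, k, lower.tail=F)` returns, computed exactly from binomial coefficients. The function name is hypothetical, and `math.comb` needs Python 3.8 or later.

```python
from math import comb

def hypergeom_upper_tail(q, m, n, k):
    """P(X >= q) when drawing k balls from an urn with m white and n black balls."""
    hits = sum(comb(m, i) * comb(n, k - i) for i in range(q, min(m, k) + 1))
    return hits / comb(m + n, k)

# Counts from the table above.
hyper = {"Invasive": 4110, "Proliferative": 1536, "Metabolic": 988}
hypo = {"Invasive": 625, "Proliferative": 1263, "Metabolic": 284}
total_hyper, total_hypo = sum(hyper.values()), sum(hypo.values())

for subtype in hyper:
    k = hyper[subtype] + hypo[subtype]  # all CpGs assigned to this subtype
    p_hyper = hypergeom_upper_tail(hyper[subtype], total_hyper, total_hypo, k)
    p_hypo = hypergeom_upper_tail(hypo[subtype], total_hypo, total_hyper, k)
    print(subtype, "hyper:", p_hyper, "hypo:", p_hypo)
```

Starting the sum at q (not q + 1) is the counterpart of passing q-1 to `phyper`: both give Pr(X >= q).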
See also:
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC2724271/
An integrative genomics approach identifies Hypoxia Inducible Factor-1 (HIF-1)-target genes that form the core response to hypoxia