Run DrVAEN on dataset: GSE32989


1. Dataset summary

This dataset was generated to develop gene expression signatures for in vitro drug response and other phenotypes. Expression profiling was done on 68 NSCLC cell lines and 2 HBEC-KT cell lines (normal lung cells immortalized with CDK4 and hTERT).

2. Get and prepare dataset

The dataset can be downloaded and prepared with the following R code:

  #setwd("/path/to/GSE32989")
  library("GEOquery")
  library("limma")

  gse <- getGEO("GSE32989", GSEMatrix = TRUE)

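  # Cache the raw download. Note: this file is overwritten by the second save() call further down in this block.
  save(gse, file="GSE32989.RData")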

  exprs(gse[[1]]) -> datExpr0

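  ########## section 1: quantile-normalize the expression values across samples
  library('preprocessCore')
  normalize.quantiles(datExpr0) -> datExprLQ
  # normalize.quantiles() returns a matrix without dimnames, so restore the probe and sample names
  dimnames(datExprLQ) = dimnames(datExpr0)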

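  ########## section 2: map probes to gene symbols and collapse to gene level
  gpl = getGEO("GPL13376")
  Table(gpl) -> anno

  # Look up each probe ID in the first column of the platform annotation
  match(rownames(datExprLQ), anno[,1]) -> ii
  symbol = anno$ILMN_Gene[ii]

  # For each sample, take the median over all probes that share a gene symbol
  apply(datExprLQ, 2, function(u)tapply(u, symbol, median)) -> expr.mat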

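  # Sample annotation (phenotype table) from GEO
  pData(gse[[1]]) -> pheno.anno

  save(expr.mat, pheno.anno, file="GSE32989.RData")

  # Tab-separated gene-by-sample matrix to upload to DrVAEN
  write.table(format(expr.mat, digits=4), file="GSE32989.Expr.tsv", row.names=TRUE, quote=FALSE, sep="\t")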
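
The probe-to-gene collapsing step above can be illustrated on a toy matrix. This is a minimal sketch using base R only; the probe names, gene symbols, and values (`probe.mat`, `toy.symbol`) are made up for illustration and are not taken from GSE32989.

```r
# Toy probe-level matrix: 4 probes x 2 samples (values are made up)
probe.mat <- matrix(c(1, 3, 5, 7,
                      2, 4, 6, 8),
                    nrow = 4,
                    dimnames = list(c("p1", "p2", "p3", "p4"), c("s1", "s2")))

# Hypothetical probe-to-gene mapping: p1 and p2 target the same gene
toy.symbol <- c("GENE_A", "GENE_A", "GENE_B", "GENE_C")

# Same idiom as in the preparation code: per sample, take the median
# over all probes that share a gene symbol
gene.mat <- apply(probe.mat, 2, function(u) tapply(u, toy.symbol, median))

print(gene.mat)
```

The result has one row per gene symbol and one column per sample. Probes whose symbol is NA are dropped, because tapply() ignores NA group labels.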
              

The genes will be mapped to our backend gene list, which is used for drug response prediction. The full gene list is here
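
Before uploading, it may help to check how many backend genes are present in the expression matrix. The sketch below is only an illustration: `expr.genes` stands in for rownames(expr.mat), and `backend.genes` is a hypothetical stand-in for the full gene list, whose file format is not described here.

```r
# Hypothetical stand-ins: in practice expr.genes would be rownames(expr.mat)
# and backend.genes would be read from the downloaded gene list
expr.genes    <- c("EGFR", "ZEB1", "CDH1", "VIM")
backend.genes <- c("EGFR", "CDH1", "VIM", "KRAS", "TP53")

# Genes found in both lists, and backend genes absent from the matrix
shared.genes  <- intersect(backend.genes, expr.genes)
missing.genes <- setdiff(backend.genes, expr.genes)

# Fraction of the backend gene list covered by the expression matrix
coverage <- length(shared.genes) / length(backend.genes)

cat(sprintf("%d of %d backend genes present (%.0f%%)\n",
            length(shared.genes), length(backend.genes), 100 * coverage))
```

A low coverage value would suggest a mismatch in gene identifiers (for example, aliases or outdated symbols) that is worth resolving before prediction.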

3. Predict drug responses using DrVAEN.

Once the gene expression matrix file is generated, we can upload it to DrVAEN to get the predicted drug response data.

Then download the predicted responses for further analysis.

4. Further analysis: Compare the drug responses between the two groups (Epithelial-like vs Mesenchymal-like).

Here we present example code for the analysis.

  library(gplots)
  library("ggplot2")

  load("GSE32989.RData")

  #######
  ccle = read.delim("ranksigmoid_CCLE.pred.txt", as.is=T)

  # c1 is a per-cell-line group label (2 = Epithelial-like, 1 = Mesenchymal-like), indexed like
  # the rows of ccle. It is not defined in this snippet and must be supplied before running it.
  dat = rbind( data.frame( response=ccle$Erlotinib[which(c1==2)], grp="Epithelial-like" ),
               data.frame( response=ccle$Erlotinib[which(c1==1)], grp="Mesenchymal-like" )  )

  # Two-sample t-test of predicted Erlotinib response between the two groups
  pvalue = t.test(response ~ grp, data=dat)$p.value

  p1 = ggplot(dat, aes(x=grp, y=response, fill=grp)) + geom_boxplot() +
       labs(title=paste("CCLE, A-model\n", "p = ", format(pvalue, digits=3)), x="", y = "Response to Erlotinib") +
       theme(legend.position = "none", plot.title = element_text(hjust=0.5), axis.text.x = element_text(angle = 45, hjust = 1))

  png("CCLE.png")
  print(p1)
  dev.off()

## To analyse the predicted drug response data from the GDSC model, change "CCLE" to "GDSC" in the code above.
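
To avoid editing the code by hand for each model, the group comparison can be wrapped in a small function. This is a minimal sketch, not part of DrVAEN: `compare_groups`, `toy.pred`, and `toy.grp` are hypothetical names, and the toy data are randomly generated rather than real predictions.

```r
# Sketch: compare the predicted response to one drug between two groups.
# pred: data frame of predictions (one row per cell line, one column per drug)
# drug: column name of the drug to test
# grp:  group label for each row of pred
compare_groups <- function(pred, drug, grp) {
  stopifnot(drug %in% colnames(pred), length(grp) == nrow(pred))
  dat <- data.frame(response = pred[[drug]], grp = factor(grp))
  # t.test() defaults to Welch's unequal-variance test
  list(data = dat, p.value = t.test(response ~ grp, data = dat)$p.value)
}

# Toy data (randomly generated, for illustration only)
set.seed(1)
toy.pred <- data.frame(Erlotinib = c(rnorm(10, mean = 0), rnorm(10, mean = 2)))
toy.grp  <- rep(c("Epithelial-like", "Mesenchymal-like"), each = 10)

res <- compare_groups(toy.pred, "Erlotinib", toy.grp)
print(res$p.value)
```

In the real analysis, `pred` would be the table read from "ranksigmoid_CCLE.pred.txt" (or the corresponding GDSC file, whose exact name is assumed to follow the same pattern), and `res$data` can be passed to ggplot() as in the plotting code above.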