R - Group columns based on information in an annotation matrix

January 15, 2014

I am seeking advice on how to do the following task:

I am analyzing a single-cell RNA-seq dataset. I have normalized expression data in a table (each column has a unique cell ID, each row is a gene).

I also have an annotation matrix that holds information about each cell (each row is a cell ID, each column is a piece of information such as patient ID, site, etc.).

For downstream analyses, I would like to form different groupings based on the information available in the annotation matrix. Does anyone have a suggestion for how I might do that?

For example, I have:

    expression_matrix <- matrix(c(1:4), nrow = 4, ncol = 4,
      dimnames = list(c("gene1", "gene2", "gene3", "gene4"),
                      c("cell1", "cell2", "cell3", "cell4")))

    annotation_matrix <- matrix(
      c("1526", "1788", "1526", "1788",
        "controller", "noncontroller", "controller", "noncontroller",
        "ln", "pb", "ln", "pb"),
      nrow = 4, ncol = 3,
      dimnames = list(c("cell1", "cell2", "cell3", "cell4"),
                      c("id", "status", "site")))

I want to group based on "site" so that I can combine cell1 and cell3 in one group, and cell2 and cell4 in another group. How do I use the information in the annotation matrix to match cells in expression_matrix?

Or say I want to compare controllers and non-controllers: I need to somehow match the cell IDs in the normalized expression table with the patient group information available in the annotation matrix.

    expression_matrix <- matrix(c(1:4), nrow = 4, ncol = 4,
      dimnames = list(c("gene1", "gene2", "gene3", "gene4"),
                      c("cell1", "cell2", "cell3", "cell4")))

    #       cell1 cell2 cell3 cell4
    # gene1     1     1     1     1
    # gene2     2     2     2     2
    # gene3     3     3     3     3
    # gene4     4     4     4     4

    annotation_matrix <- matrix(
      c("1526", "1788", "1526", "1788",
        "controller", "noncontroller", "controller", "noncontroller",
        "ln", "pb", "ln", "pb"),
      nrow = 4, ncol = 3,
      dimnames = list(c("cell1", "cell2", "cell3", "cell4"),
                      c("id", "status", "site")))

    #       id     status          site
    # cell1 "1526" "controller"    "ln"
    # cell2 "1788" "noncontroller" "pb"
    # cell3 "1526" "controller"    "ln"
    # cell4 "1788" "noncontroller" "pb"

Let's harmonize those by converting both matrices to data frames that share a "cell" column:

    library(dplyr)
    library(tidyr)  # provides gather() and spread()

    # Reshape the expression matrix to long format: one row per gene/cell pair
    expression_df <- expression_matrix %>%
      as.data.frame(stringsAsFactors = FALSE) %>%
      mutate(gene = rownames(.)) %>%
      gather(cell, value, -gene)

    #     gene  cell value
    # 1  gene1 cell1     1
    # 2  gene2 cell1     2
    # 3  gene3 cell1     3
    # 4  gene4 cell1     4
    # 5  gene1 cell2     1
    # 6  gene2 cell2     2
    # 7  gene3 cell2     3
    # 8  gene4 cell2     4
    # 9  gene1 cell3     1
    # 10 gene2 cell3     2
    # 11 gene3 cell3     3
    # 12 gene4 cell3     4
    # 13 gene1 cell4     1
    # 14 gene2 cell4     2
    # 15 gene3 cell4     3
    # 16 gene4 cell4     4

    # Turn the annotation row names into an explicit "cell" column
    annotation_df <- annotation_matrix %>%
      as.data.frame(stringsAsFactors = FALSE) %>%
      mutate(cell = rownames(.))

    #     id        status site  cell
    # 1 1526    controller   ln cell1
    # 2 1788 noncontroller   pb cell2
    # 3 1526    controller   ln cell3
    # 4 1788 noncontroller   pb cell4
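As a side note, `gather()` has been superseded in newer tidyr releases by `pivot_longer()`. The sketch below shows how the same reshaping might look with it, assuming tidyr 1.0.0 or later and the tibble package for `rownames_to_column()`. The name `expression_long` is only illustrative, and the row order differs from the `gather()` result (rows are grouped by gene rather than by cell).

```r
library(dplyr)
library(tidyr)   # assumes tidyr >= 1.0.0 for pivot_longer()
library(tibble)  # rownames_to_column()

# Toy data from the question
expression_matrix <- matrix(c(1:4), nrow = 4, ncol = 4,
  dimnames = list(c("gene1", "gene2", "gene3", "gene4"),
                  c("cell1", "cell2", "cell3", "cell4")))

annotation_matrix <- matrix(
  c("1526", "1788", "1526", "1788",
    "controller", "noncontroller", "controller", "noncontroller",
    "ln", "pb", "ln", "pb"),
  nrow = 4, ncol = 3,
  dimnames = list(c("cell1", "cell2", "cell3", "cell4"),
                  c("id", "status", "site")))

# Long format: one row per gene/cell pair
expression_long <- expression_matrix %>%
  as.data.frame() %>%
  rownames_to_column("gene") %>%
  pivot_longer(-gene, names_to = "cell", values_to = "value")

# Annotation with the cell ID as an ordinary column, ready for joining
annotation_df <- annotation_matrix %>%
  as.data.frame() %>%
  rownames_to_column("cell")
```

Either version yields a long table keyed by "cell", which is all the joins need.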

Then you can filter, merge, and spread as you wish:

    # Keep only the cells from site "ln" and attach their expression values
    example1 <- annotation_df %>%
      filter(site == "ln") %>%
      inner_join(expression_df, by = "cell")

    #     id     status site  cell  gene value
    # 1 1526 controller   ln cell1 gene1     1
    # 2 1526 controller   ln cell1 gene2     2
    # 3 1526 controller   ln cell1 gene3     3
    # 4 1526 controller   ln cell1 gene4     4
    # 5 1526 controller   ln cell3 gene1     1
    # 6 1526 controller   ln cell3 gene2     2
    # 7 1526 controller   ln cell3 gene3     3
    # 8 1526 controller   ln cell3 gene4     4

    # Back to wide format: one row per cell, one column per gene
    example2 <- example1 %>%
      spread(gene, value)

    #     id     status site  cell gene1 gene2 gene3 gene4
    # 1 1526 controller   ln cell1     1     2     3     4
    # 2 1526 controller   ln cell3     1     2     3     4
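If you would rather keep the data as matrices, the grouping can also be done in base R by matching the expression column names against the annotation row names. This is only a minimal sketch of that alternative, not part of the approach above; the names `annotation_aligned`, `cells_by_site`, `expr_by_site`, and `status_means` are made up for illustration.

```r
# Toy data from the question
expression_matrix <- matrix(c(1:4), nrow = 4, ncol = 4,
  dimnames = list(c("gene1", "gene2", "gene3", "gene4"),
                  c("cell1", "cell2", "cell3", "cell4")))

annotation_matrix <- matrix(
  c("1526", "1788", "1526", "1788",
    "controller", "noncontroller", "controller", "noncontroller",
    "ln", "pb", "ln", "pb"),
  nrow = 4, ncol = 3,
  dimnames = list(c("cell1", "cell2", "cell3", "cell4"),
                  c("id", "status", "site")))

# Reorder the annotation rows so they line up with the expression columns
idx <- match(colnames(expression_matrix), rownames(annotation_matrix))
stopifnot(!anyNA(idx))  # every cell must have an annotation row
annotation_aligned <- annotation_matrix[idx, , drop = FALSE]

# Cell IDs belonging to each site
cells_by_site <- split(rownames(annotation_aligned),
                       annotation_aligned[, "site"])

# One expression sub-matrix per site
expr_by_site <- lapply(cells_by_site, function(cells) {
  expression_matrix[, cells, drop = FALSE]
})

# Per-gene mean expression within each status group
status_means <- sapply(
  split(colnames(expression_matrix), annotation_aligned[, "status"]),
  function(cells) rowMeans(expression_matrix[, cells, drop = FALSE])
)
```

Here `expr_by_site$ln` holds the columns for cell1 and cell3, and `status_means` has one column per status, which gives a starting point for a controller versus non-controller comparison. Swapping "site" or "status" for any other annotation column gives a different grouping.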
