I have the following data frame:
tdf <- structure(list(
  go = c("cytokine-cytokine receptor interaction",
         "cytokine-cytokine receptor interaction|endocytosis",
         "i-kappab kinase/nf-kappab signaling",
         "nf-kappa b signaling pathway",
         "nf-kappab import nucleus",
         "t cell chemotaxis"),
  poscount = c(17, 18, 4, 5, 1, 2),
  shortgo = structure(c(7L, 7L, 18L, 18L, 18L, 21L),
    .Label = c("tnf", "adaptive", "alpha", "apop", "beta", "chemokine",
               "cytokine", "death", "defense", "gamma", "immune response",
               "infla", "interleukin-1 ", "interleukin-10 ", "interleukin-12 ",
               "interleukin-18 ", "interleukin-6 ", "kappa", "migration",
               "stress", "taxis", "wound"),
    class = "factor")),
  .Names = c("go", "poscount", "shortgo"),
  class = "data.frame", row.names = c(NA, 6L))
which looks like this:
> tdf
                                                  go poscount  shortgo
1             cytokine-cytokine receptor interaction       17 cytokine
2 cytokine-cytokine receptor interaction|endocytosis       18 cytokine
3                i-kappab kinase/nf-kappab signaling        4    kappa
4                       nf-kappa b signaling pathway        5    kappa
5                           nf-kappab import nucleus        1    kappa
6                                  t cell chemotaxis        2    taxis
What I want to do is split the data frame according to shortgo and sort the go members of each group by poscount (highest first), yielding (handcrafted):
$cytokine
[1] cytokine-cytokine receptor interaction|endocytosis
[2] cytokine-cytokine receptor interaction

$kappa
[1] nf-kappa b signaling pathway
[2] i-kappab kinase/nf-kappab signaling
[3] nf-kappab import nucleus

$taxis
[1] t cell chemotaxis
I'm stuck with this:
> split(tdf$go,tdf$shortgo)
Error in split.default(tdf$go, tdf$hsortgo) :
  group length is 0 but data length > 0
How can I go about it?
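A note on the error itself: the message mentions tdf$hsortgo rather than tdf$shortgo, which suggests the call that actually ran had a misspelled column name. With `$` on a data frame, a name that does not exist returns NULL instead of raising an error, so split() ends up with an empty grouping vector. A minimal sketch of that behaviour, using a small stand-in data frame (`df` and its values are made up for illustration):

```r
# Hypothetical stand-in data frame; only the column names mirror the question.
df <- data.frame(go = c("a", "b", "c"),
                 shortgo = factor(c("x", "x", "y")),
                 stringsAsFactors = FALSE)

# A misspelled column name silently yields NULL.
missing_col <- df$hsortgo
is.null(missing_col)

# Splitting by NULL fails; capture the error message instead of stopping.
msg <- tryCatch(split(df$go, df$hsortgo),
                error = function(e) conditionMessage(e))

# Splitting by the correctly spelled column works.
ok <- split(df$go, df$shortgo)
```

Checking names(tdf) before splitting is a quick way to rule this out.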
You can order the data frame first, before the split:
library(dplyr)
tdf <- tdf %>%
  group_by(shortgo) %>%
  arrange(desc(poscount))
Then split:
ldf <- split(tdf$go, tdf$shortgo, drop=TRUE)
which gives the desired (ordered) output:
> ldf
$cytokine
[1] "cytokine-cytokine receptor interaction|endocytosis"
[2] "cytokine-cytokine receptor interaction"

$kappa
[1] "nf-kappa b signaling pathway"
[2] "i-kappab kinase/nf-kappab signaling"
[3] "nf-kappab import nucleus"

$taxis
[1] "t cell chemotaxis"
If you want to split the data frame into a list of data frames, you can use:
ldf <- split(tdf, tdf$shortgo, drop=TRUE)
A solution in base R (provided by @henrik in the comments):
split(tdf$go[order(tdf$shortgo, -tdf$poscount)], tdf$shortgo, drop=TRUE)
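One caveat with the one-liner: it reorders go but splits by shortgo in its original row order, so it lines up here only because the rows are already grouped by shortgo. A more defensive variant, sketched below as my own adaptation rather than the commenter's code, applies the same ordering index to both vectors. The data frame is rebuilt with plain factor levels and then shuffled (the `shuffled` and `o` names are just for illustration) to show that row order no longer matters:

```r
# Rebuild the example data (same columns as in the question, simplified factor levels).
tdf <- data.frame(
  go = c("cytokine-cytokine receptor interaction",
         "cytokine-cytokine receptor interaction|endocytosis",
         "i-kappab kinase/nf-kappab signaling",
         "nf-kappa b signaling pathway",
         "nf-kappab import nucleus",
         "t cell chemotaxis"),
  poscount = c(17, 18, 4, 5, 1, 2),
  shortgo = factor(c("cytokine", "cytokine", "kappa", "kappa", "kappa", "taxis")),
  stringsAsFactors = FALSE
)

# Shuffle the rows so the groups are no longer contiguous.
shuffled <- tdf[c(6, 3, 1, 5, 2, 4), ]

# Order by group, then by descending poscount, and index BOTH vectors with it.
o <- order(shuffled$shortgo, -shuffled$poscount)
res <- split(shuffled$go[o], shuffled$shortgo[o], drop = TRUE)
res
```

On data that is already grouped by shortgo, as in the question, this should give the same result as the one-liner.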