R: How to replicate a ddply behavior that uses a custom function with dplyr?
I'm trying to replace my plyr calls with dplyr. There are still a few snags, and one of them is the group_by function. I imagined it acts the same way as the second ddply argument and would split, apply, and combine based on the grouping variables I list. That doesn't appear to be the case. Here is a rather trivial example.
Let's define a silly function:
mm <- function(x) return(x[1:5, ])
Now I can split by Species in the iris dataset and apply this function to each piece.
ddply(iris, .(Species), mm)
This works as intended. However, when I try the same with dplyr, it doesn't work as expected.
iris %>% group_by(Species) %>% mm
What am I doing wrong?
As shown in ?do, you can refer to a group with . in the expression. The following will replicate the ddply output:
iris %>% group_by(Species) %>% do(.[1:5, ])
# Source: local data frame [15 x 5]
# Groups: Species
#
#    Sepal.Length Sepal.Width Petal.Length Petal.Width    Species
# 1           5.1         3.5          1.4         0.2     setosa
# 2           4.9         3.0          1.4         0.2     setosa
# 3           4.7         3.2          1.3         0.2     setosa
# 4           4.6         3.1          1.5         0.2     setosa
# 5           5.0         3.6          1.4         0.2     setosa
# 6           7.0         3.2          4.7         1.4 versicolor
# 7           6.4         3.2          4.5         1.5 versicolor
# 8           6.9         3.1          4.9         1.5 versicolor
# 9           5.5         2.3          4.0         1.3 versicolor
# 10          6.5         2.8          4.6         1.5 versicolor
# 11          6.3         3.3          6.0         2.5  virginica
# 12          5.8         2.7          5.1         1.9  virginica
# 13          7.1         3.0          5.9         2.1  virginica
# 14          6.3         2.9          5.6         1.8  virginica
# 15          6.5         3.0          5.8         2.2  virginica
More generally, to apply a custom function to groups with dplyr, you can do the following (thanks @docendodiscimus):
iris %>% group_by(Species) %>% do(mm(.))
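To see the difference between the two calls, here is a minimal sketch that runs both side by side. It assumes dplyr is installed and reuses the mm helper from the question; the names direct and per_group are only illustrative. Piping the grouped data straight into mm indexes the whole table once, while do calls mm on each group.

```r
library(dplyr)

# Same helper as in the question: keep the first five rows of a data frame.
mm <- function(x) x[1:5, ]

# Piping the grouped data straight into mm() subsets the whole table once,
# so the grouping does not affect which rows are returned.
direct <- iris %>% group_by(Species) %>% mm()

# Wrapping the call in do() runs mm() once per group; `.` is that group's rows.
per_group <- iris %>% group_by(Species) %>% do(mm(.))

nrow(direct)              # 5
nrow(per_group)           # 15
table(per_group$Species)  # 5 rows for each species
```

The row counts make the behavior easy to check: five rows in total for the direct call, and five rows per species when the function goes through do.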