intervals - Conditional calculation in R
I have been having issues calculating the time between events based on conditions. I want to determine the time between when a refund is made to a customer and their previous purchase, that is, the time of the refund minus the time of the last purchase for that id. There are multiple users grouped by id, each with multiple events (purchases or refunds) indexed by timestamp. The relevant rows of the table look like this:
View(df1)

    timestamp   id  order_type
    2017-05-04  55  purchase
    2017-05-12  55  purchase
    2017-05-18  55  purchase
    2017-06-16  55  refund
    2017-05-06  36  purchase
    2017-05-14  36  purchase
    2017-05-22  36  purchase
    2017-06-14  36  purchase
    2017-06-28  36  refund
    2017-07-10  36  purchase

As shown in the table, there are cases where a client was issued a refund and later made another purchase. I want the calculation to run from the previous purchase date until the refund. I'm thinking of using something along the lines of the aggregate function.
With the output as:

View(df2)

    timestamp   id  days_since_last_purchase
    2017-06-16  55  29
    2017-06-28  36  14

Thanks for your input.
This solution in base R works:
    # Parse the timestamps as Date objects (%Y is the four-digit year)
    df$timestamp <- as.Date(df$timestamp, format = "%Y-%m-%d")
    # Row indices of the refunds
    inds <- which(df$order_type == "refund")
    df2 <- df[inds, ]
    # Subtract the timestamp of the row directly above each refund
    df2$days_since <- unlist(Map(`-`, df$timestamp[inds], df$timestamp[inds - 1]))
    #  timestamp   id  order_type  days_since
    #  2017-06-16  55  refund      29
    #  2017-06-28  36  refund      14

You can use mapply instead of Map in (all?) situations:
    df2$days_since <- mapply(difftime, df$timestamp[inds], df$timestamp[inds - 1])

Note: the benefit of this approach is that it employs only base R. However, as Moody_Mudskipper pointed out in the comments, this solution works only when the data are chronologically ordered and every refund record is immediately preceded by the corresponding purchase record. In practical situations, that can be a big deal!
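If the ordering assumption in the note above does not hold, one possible variant (still base R) is to look up, for each refund, the latest purchase with the same id on or before the refund date. This is only a sketch under assumptions: the sample data is retyped from the question and reversed on purpose to show that row order no longer matters, and the names `refunds` and `days_since_last_purchase` are illustrative, not part of the original answer.

```r
# Sample data retyped from the question's table
df <- data.frame(
  timestamp = as.Date(c("2017-05-04", "2017-05-12", "2017-05-18", "2017-06-16",
                        "2017-05-06", "2017-05-14", "2017-05-22", "2017-06-14",
                        "2017-06-28", "2017-07-10")),
  id = c(55, 55, 55, 55, 36, 36, 36, 36, 36, 36),
  order_type = c("purchase", "purchase", "purchase", "refund",
                 "purchase", "purchase", "purchase", "purchase",
                 "refund", "purchase"),
  stringsAsFactors = FALSE
)

# Reverse the rows on purpose: the lookup below does not rely on row order
df <- df[rev(seq_len(nrow(df))), ]

# Keep only the refund rows
refunds <- df[df$order_type == "refund", c("timestamp", "id")]

# For each refund, find the latest purchase by the same id on or before
# the refund date, and take the difference in days
refunds$days_since_last_purchase <- vapply(seq_len(nrow(refunds)), function(i) {
  refund_time <- refunds$timestamp[i]
  purchases <- df$timestamp[df$order_type == "purchase" &
                            df$id == refunds$id[i] &
                            df$timestamp <= refund_time]
  # No earlier purchase for this id: report NA instead of failing
  if (length(purchases) == 0) return(NA_real_)
  as.numeric(refund_time - max(purchases))
}, numeric(1))

# Sort by refund date for display
refunds <- refunds[order(refunds$timestamp), ]
print(refunds)
```

Because each refund is matched by id and date rather than by row position, a refund with no earlier purchase gets NA instead of silently borrowing a timestamp from another customer's row.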