intervals - Conditional calculation in R


I have been having issues calculating the time between events based on conditions. I want to determine the time between when a refund was made to a customer and that customer's previous purchase. That is, the time of the refund minus the time of the last purchase for the same id. There are multiple users grouped by id, each with multiple events (purchases or refunds) indexed by timestamp. The relevant rows of the table look like this:

View(df1)
timestamp   id  order_type
2017-05-04  55  purchase
2017-05-12  55  purchase
2017-05-18  55  purchase
2017-06-16  55  refund
2017-05-06  36  purchase
2017-05-14  36  purchase
2017-05-22  36  purchase
2017-06-14  36  purchase
2017-06-28  36  refund
2017-07-10  36  purchase

As shown in the table, there are cases where a client was issued a refund and later made another purchase. I only want the calculation to run from the previous purchase date until the refund. I'm thinking of using something along the lines of the aggregate function.

With the output as:

View(df2)
timestamp   id  days_since_last_purchase
2017-06-16  55  29
2017-06-28  36  14

Thanks for any input.

This solution in base R works:

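# Parse the timestamps as dates (%Y is the four-digit year)
df$timestamp <- as.Date(df$timestamp, format = "%Y-%m-%d")

# Row indices of the refunds; the row just above each one is taken as the last purchase
inds <- which(df$order_type == "refund")
df2  <- df[inds, ]

# Map() returns a list of time differences in days; unlist() flattens it to a numeric vector
df2$days_since <- unlist(Map(`-`, df$timestamp[inds], df$timestamp[inds - 1]))
#    timestamp  id order_type days_since
#    2017-06-16 55     refund         29
#    2017-06-28 36     refund         14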

You can use mapply instead of Map in (all?) situations, since it simplifies the result to a vector and makes the unlist call unnecessary:

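# units = "days" keeps every difference in the same unit before mapply() simplifies the result
df2$days_since <- mapply(difftime, df$timestamp[inds], df$timestamp[inds - 1], MoreArgs = list(units = "days"))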
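To illustrate the difference between the two calls, here is a minimal sketch using only the two refund dates and their preceding purchase dates from the example table. The variable names (`refunds`, `purchases`, `as_list`, `from_map`, `from_mapply`) are made up for this illustration and do not appear in the answer above.

```r
# Refund dates and the purchase dates that precede them (taken from the table above)
refunds   <- as.Date(c("2017-06-16", "2017-06-28"))
purchases <- as.Date(c("2017-05-18", "2017-06-14"))

# Map() always returns a list, one element per pair of dates,
# so it has to be flattened with unlist() afterwards
as_list  <- Map(`-`, refunds, purchases)
from_map <- unlist(as_list)

# mapply() simplifies to an atomic vector by default (SIMPLIFY = TRUE)
from_mapply <- mapply(`-`, refunds, purchases)

is.list(as_list)   # TRUE
print(from_map)
print(from_mapply)
```

Both routes should give the same numbers of days here; `Map` is the safer choice when the function may return results of varying length, because it never tries to simplify.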

Note: the benefit of this approach is that it employs only base R. However, as Moody_Mudskipper pointed out in the comments, this solution only works when the data is chronologically ordered (within each id) and every refund record is immediately preceded by its corresponding purchase record. In practical situations, this may be a big deal!
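If the ordering assumption cannot be guaranteed, one possible way around it, still in base R, is to look up the latest purchase on or before each refund date within the same id instead of relying on the neighbouring row. The sketch below is an illustration and was not part of the original answer: the helper name `days_since_last_purchase()` is hypothetical, and it assumes the columns are called `timestamp`, `id` and `order_type` as in the question.

```r
# Example data from the question
df <- data.frame(
  timestamp = as.Date(c("2017-05-04", "2017-05-12", "2017-05-18", "2017-06-16",
                        "2017-05-06", "2017-05-14", "2017-05-22", "2017-06-14",
                        "2017-06-28", "2017-07-10")),
  id = c(55, 55, 55, 55, 36, 36, 36, 36, 36, 36),
  order_type = c("purchase", "purchase", "purchase", "refund",
                 "purchase", "purchase", "purchase", "purchase",
                 "refund", "purchase"),
  stringsAsFactors = FALSE
)

# Reverse the rows on purpose so the result cannot depend on row order
df <- df[rev(seq_len(nrow(df))), ]

# Hypothetical helper: for every refund, find the latest purchase with the
# same id made on or before the refund date and return the gap in days
days_since_last_purchase <- function(df) {
  refunds   <- df[df$order_type == "refund", ]
  purchases <- df[df$order_type == "purchase", ]
  refunds$days_since_last_purchase <- vapply(
    seq_len(nrow(refunds)),
    function(i) {
      prior <- purchases$timestamp[purchases$id == refunds$id[i] &
                                     purchases$timestamp <= refunds$timestamp[i]]
      # A refund with no earlier purchase gets NA instead of an error
      if (length(prior) == 0) return(NA_real_)
      as.numeric(refunds$timestamp[i] - max(prior))
    },
    numeric(1)
  )
  refunds[, c("timestamp", "id", "days_since_last_purchase")]
}

result <- days_since_last_purchase(df)
print(result)  # id 55 gives 29 days, id 36 gives 14 days
```

This trades the single vectorised subtraction for one lookup per refund, which is slower on large tables but ignores the later purchase on 2017-07-10 for id 36 regardless of how the rows are sorted.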

