R - Merge two daily time series after summarising on shifted hours


I have a measurement (for instance, solar radiation) indexed by a datetime variable, at an hourly timestamp. I want to sum the measurement values for each day of the year, and then match the result with another source of data at a daily scale (let's say mean outdoor temperature).

However, the second source of data is already aggregated from 8:00am to 8:00am the next day. I know how to summarise the first variable by standard day, but I need to do it from 8 to 8 in order to match both measurements.

An example of the data:

    set.seed(1L) # create reproducible data
    hourly = data.frame(datetime = seq(from = lubridate::ymd_hm("2017-01-01 01:00"),
                                       length.out = 168, by = "hour"),
                        value = rpois(168, 10))
    daily = data.frame(datetime = seq(from = as.Date("2017-01-01"), length.out = 31, by = "day"),
                       value = rnorm(31))

Expanding my comment into an answer, it's worth noting that the OP has emphasized the words "aggregated from 8:00am to 8:00am the next day".

Mapping of not aligned 24 hour periods to dates

If a 24 hour period is not aligned with midnight, i.e., it does not extend from 00:00 to 24:00 but starts and ends sometime during the day, it is ambiguous which date is associated with the period.

We can take either

  1. the date of the day on which the period starts,
  2. the date of the day on which the period ends, or
  3. the date of the day which contains the majority of the hours of the period.
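The third option is not shown elsewhere in this answer, so here is a minimal base R sketch of it. The helper name `majority_date` and its `start_hour` argument are hypothetical, and UTC timestamps are assumed. The idea is that a period starting at hour h has 24 - h of its hours on the start day and h on the following day, so the majority falls on the start day when h is below 12 and on the end day when h is above 12. A tie at h = 12 is assigned to the start day here, which is an arbitrary choice.

```r
# Hypothetical helper: date of the day that holds most hours of the
# 24 hour period (starting at `start_hour`) containing each timestamp.
majority_date <- function(x, start_hour) {
  # date on which the enclosing period starts
  start_date <- as.Date(x - start_hour * 3600, tz = "UTC")
  # periods starting after noon have most of their hours on the next day
  start_date + as.integer(start_hour > 12)
}

x <- as.POSIXct(c("2017-09-12 09:00", "2017-09-13 07:00"), tz = "UTC")
res_8  <- majority_date(x, 8)   # 8am to 8am periods
res_16 <- majority_date(x, 16)  # 4pm to 4pm periods
```

For the 8am to 8am case of the question, option 3 therefore gives the same date as option 1.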

Just to illustrate the difference:

    # timestamps: 9 am, 10 pm, 7 am the next day
    x <- lubridate::ymd_hm(c("2017-09-12 09:00", "2017-09-12 22:00", "2017-09-13 07:00"))
    x
    [1] "2017-09-12 09:00:00 UTC" "2017-09-12 22:00:00 UTC" "2017-09-13 07:00:00 UTC"
    # map timestamps to the date on which the period starts by shifting back 8 hours
    x + lubridate::hours(-8L)
    [1] "2017-09-12 01:00:00 UTC" "2017-09-12 14:00:00 UTC" "2017-09-12 23:00:00 UTC"
    # map timestamps to the date on which the period ends by advancing 16 hours
    x + lubridate::hours(16L)
    [1] "2017-09-13 01:00:00 UTC" "2017-09-13 14:00:00 UTC" "2017-09-13 23:00:00 UTC"

As there is no other information, let's assume the daily data are mapped onto the day on which the period starts.
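To make the two mappings concrete, the shifted timestamps can be converted to dates. This is a small base R sketch using the same three timestamps; the variable names `start_date` and `end_date` are only illustrative, and UTC is assumed throughout.

```r
# Same three timestamps as in the illustration, built with base R
x <- as.POSIXct(c("2017-09-12 09:00", "2017-09-12 22:00", "2017-09-13 07:00"),
                tz = "UTC")

# date on which the enclosing 8am to 8am period starts
start_date <- as.Date(x - 8 * 3600, tz = "UTC")
# date on which the enclosing 8am to 8am period ends
end_date <- as.Date(x + 16 * 3600, tz = "UTC")
```

All three timestamps belong to the same period, so `start_date` is 2017-09-12 and `end_date` is 2017-09-13 for each of them.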

Aggregating and merging

For grouping, aggregating, and merging, data.table is used:

    library(data.table)
    # aggregate data by shifted timestamp
    setDT(hourly)[, .(sum.value = sum(value)),
                  by = .(date = as.Date(datetime + lubridate::hours(-8L)))]
             date sum.value
    1: 2016-12-31        68
    2: 2017-01-01       232
    3: 2017-01-02       222
    4: 2017-01-03       227
    5: 2017-01-04       228
    6: 2017-01-05       231
    7: 2017-01-06       260
    8: 2017-01-07       144

Note that the new date column used for grouping and aggregating is created on the fly in the by parameter (one of the reasons why I prefer data.table).
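The grouping is easier to follow on a tiny hand-made table. The sketch below uses hypothetical data (`toy`) with values chosen so that each sum reveals which rows were grouped together. It assumes UTC timestamps and shifts in seconds with base R instead of lubridate.

```r
library(data.table)

# Hypothetical toy data: two readings just before and two at the 8am boundary
toy <- data.table(
  datetime = as.POSIXct(c("2017-01-01 07:00", "2017-01-01 08:00",
                          "2017-01-02 07:00", "2017-01-02 08:00"), tz = "UTC"),
  value = c(1, 10, 100, 1000)
)

# the grouping column is computed on the fly inside `by`
agg <- toy[, .(sum.value = sum(value)),
           by = .(date = as.Date(datetime - 8 * 3600, tz = "UTC"))]
```

With this shift, a reading stamped exactly 08:00 is counted in the period that starts on that day, and the 07:00 reading on 2017-01-02 is added to 2017-01-01. Whether the 08:00 reading should instead close the previous period depends on how the hourly timestamps are labelled, which the question does not state.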

Now, the daily data need to be joined. By chaining, this can be combined in one statement:

    setDT(hourly)[, .(sum.value = sum(value)),
                  by = .(date = as.Date(datetime + lubridate::hours(-8L)))][
                    setDT(daily), on = .(date = datetime), nomatch = 0L]
             date sum.value      value
    1: 2017-01-01       232 -0.5080862
    2: 2017-01-02       222  0.5236206
    3: 2017-01-03       227  1.0177542
    4: 2017-01-04       228 -0.2511646
    5: 2017-01-05       231 -1.4299934
    6: 2017-01-06       260  1.7091210
    7: 2017-01-07       144  1.4350696

The parameter nomatch = 0L indicates that we want an inner join here.
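The effect of nomatch can be seen by running the join with and without it. The tables below are hypothetical stand-ins for the aggregated hourly data and the daily data, chosen so that each one has a date the other lacks.

```r
library(data.table)

# Hypothetical stand-ins for the aggregated hourly data and the daily data
agg <- data.table(date = as.Date(c("2016-12-31", "2017-01-01", "2017-01-02")),
                  sum.value = c(1, 110, 1000))
daily <- data.table(datetime = as.Date(c("2017-01-01", "2017-01-02", "2017-01-03")),
                    value = c(0.1, 0.2, 0.3))

# inner join: only dates present in both tables are kept
inner <- agg[daily, on = .(date = datetime), nomatch = 0L]

# default: every row of `daily` is kept, unmatched sum.value becomes NA
outer <- agg[daily, on = .(date = datetime)]
```

Here `inner` has two rows, while `outer` has three, with `sum.value` set to NA for 2017-01-03. In both cases the 2016-12-31 row of `agg` is dropped because `daily` has no matching date. Newer data.table versions also accept `nomatch = NULL` for the inner join.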

