R - Merge two daily time series after summarising on shifted hours
I have a measurement (for instance, solar radiation) indexed by a datetime variable at an hourly timestamp. I want to sum the measurement values for each day of the year and match them with another source of data at a daily scale (let's say mean outdoor temperature).
However, the second source of data is aggregated from 8:00 am to 8:00 am of the next day. I know how to summarise the first variable by standard day, but I need it from 8 to 8 in order to match both measurements.
An example of the data:
    set.seed(1L) # create reproducible data
    hourly = data.frame(datetime = seq(from = lubridate::ymd_hm("2017-01-01 01:00"),
                                       length.out = 168, by = "hour"),
                        value = rpois(168, 10))
    daily = data.frame(datetime = seq(from = as.Date("2017-01-01"),
                                      length.out = 31, by = "day"),
                       value = rnorm(31))
Expanding my comment into an answer, it's worth noting that the OP has emphasized the words "aggregated from 8:00 am to 8:00 am of the next day".
Mapping of not-aligned 24-hour periods to dates
If a 24-hour period is not aligned with midnight, i.e., it does not extend from 00:00 to 24:00 but starts and ends sometime during the day, it is ambiguous which date is associated with the period.
we can take either
- the date of the day on which the period starts,
- the date of the day on which the period ends, or
- the date of the day which contains the majority of the hours of the period.
Just to illustrate the difference:
    # timestamps: 9 am, 10 pm, 7 am of the next day
    x <- lubridate::ymd_hm(c("2017-09-12 09:00", "2017-09-12 22:00", "2017-09-13 07:00"))
    x
    [1] "2017-09-12 09:00:00 UTC" "2017-09-12 22:00:00 UTC" "2017-09-13 07:00:00 UTC"

    # map timestamps to the date on which the period starts by shifting back 8 hours
    x + lubridate::hours(-8L)
    [1] "2017-09-12 01:00:00 UTC" "2017-09-12 14:00:00 UTC" "2017-09-12 23:00:00 UTC"

    # map timestamps to the date on which the period ends by advancing 16 hours
    x + lubridate::hours(16L)
    [1] "2017-09-13 01:00:00 UTC" "2017-09-13 14:00:00 UTC" "2017-09-13 23:00:00 UTC"
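The shifted timestamps only become grouping keys once they are truncated to dates. Here is a minimal base R sketch of that last step, assuming UTC timestamps as in the example above (the names start_date and end_date are just illustrative). For an 8-to-8 period, 16 of the 24 hours fall on the start day, so the "majority of hours" convention gives the same date as the "period starts" convention.

```r
# Same three timestamps as above, built with base R (UTC assumed)
x <- as.POSIXct(c("2017-09-12 09:00", "2017-09-12 22:00", "2017-09-13 07:00"),
                tz = "UTC")

# Convention 1: date on which the 8-to-8 period starts (shift back 8 hours)
start_date <- as.Date(x - 8 * 3600, tz = "UTC")

# Convention 2: date on which the 8-to-8 period ends (advance 16 hours)
end_date <- as.Date(x + 16 * 3600, tz = "UTC")

# All three timestamps belong to the same period, so each convention
# yields a single date for them
unique(start_date)
unique(end_date)
```

Under the first convention all three timestamps map to 2017-09-12, under the second to 2017-09-13.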
As there is no other information, let's assume the daily data are mapped onto the day on which the period starts.
Aggregating and merging
For grouping, aggregating, and merging, data.table is used:
    library(data.table)
    # aggregate data by shifted timestamp
    setDT(hourly)[, .(sum.value = sum(value)),
                  by = .(date = as.Date(datetime + lubridate::hours(-8L)))]

             date sum.value
    1: 2016-12-31        68
    2: 2017-01-01       232
    3: 2017-01-02       222
    4: 2017-01-03       227
    5: 2017-01-04       228
    6: 2017-01-05       231
    7: 2017-01-06       260
    8: 2017-01-07       144
Note that the new date column which is used for grouping and aggregating is created on the fly in the by parameter (one of the reasons why I prefer data.table).
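The first and last groups in the aggregate are noticeably smaller than the rest because the sample data do not cover those 8-to-8 periods completely. One possible way to make this visible is to count the hours per group alongside the sum, as in this sketch. The column name n.hours and the object names agg and complete are illustrative and not part of the original answer, and the sample data are rebuilt with base R so that the block is self-contained.

```r
library(data.table)

# Rebuild the hourly sample data (UTC assumed)
set.seed(1L)
hourly <- data.table(
  datetime = seq(from = as.POSIXct("2017-01-01 01:00", tz = "UTC"),
                 length.out = 168, by = "hour"),
  value = rpois(168, 10)
)

# Aggregate by shifted date and also count the hours in each group
agg <- hourly[, .(sum.value = sum(value), n.hours = .N),
              by = .(date = as.Date(datetime - 8 * 3600, tz = "UTC"))]

# Keep only periods which are fully covered by 24 hourly values
complete <- agg[n.hours == 24L]
```

With these sample data, the group for 2016-12-31 contains only 7 hours and the group for 2017-01-07 only 17, so it may be safer to drop or flag them before comparing with the daily series.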
Now, the daily data need to be joined. By chaining, this can be combined in one statement:
    setDT(hourly)[, .(sum.value = sum(value)),
                  by = .(date = as.Date(datetime + lubridate::hours(-8L)))][
                    setDT(daily), on = .(date = datetime), nomatch = 0L]

             date sum.value      value
    1: 2017-01-01       232 -0.5080862
    2: 2017-01-02       222  0.5236206
    3: 2017-01-03       227  1.0177542
    4: 2017-01-04       228 -0.2511646
    5: 2017-01-05       231 -1.4299934
    6: 2017-01-06       260  1.7091210
    7: 2017-01-07       144  1.4350696
The parameter nomatch = 0L indicates that we want an inner join here.
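For readers who do not use data.table, roughly the same result can be sketched in base R with aggregate() and merge(). This is an alternative technique, not the method used in this answer. It assumes UTC timestamps and the same "period starts" convention, and the object names agg and merged are illustrative.

```r
# Rebuild the sample data with base R only (UTC assumed)
set.seed(1L)
hourly <- data.frame(
  datetime = seq(from = as.POSIXct("2017-01-01 01:00", tz = "UTC"),
                 length.out = 168, by = "hour"),
  value = rpois(168, 10)
)
daily <- data.frame(
  datetime = seq(from = as.Date("2017-01-01"), length.out = 31, by = "day"),
  value = rnorm(31)
)

# Shift back 8 hours and truncate to the date on which the period starts
hourly$date <- as.Date(hourly$datetime - 8 * 3600, tz = "UTC")

# Sum the hourly values per shifted date
agg <- aggregate(value ~ date, data = hourly, FUN = sum)
names(agg)[names(agg) == "value"] <- "sum.value"

# merge() keeps only matching rows by default, i.e., an inner join
merged <- merge(agg, daily, by.x = "date", by.y = "datetime")
```

Passing all.x = TRUE or all.y = TRUE to merge() would turn this into an outer join instead.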