Creating an intervaled ramp array based on a threshold - Python / NumPy
I want to measure the length of each sub-array that fulfils a condition (like a stop clock). As soon as the condition is no longer fulfilled, the value should reset to zero. So the resulting array should tell me how many consecutive values have fulfilled the condition so far (e.g. value > 1):

    [0, 0, 2, 2, 2, 2, 0, 3, 3, 0]

should result in the following array:

    [0, 0, 1, 2, 3, 4, 0, 1, 2, 0]

One can define a function in Python that returns the corresponding NumPy array:

    import numpy as np

    def stopclock(signal, threshold=1):
        clock = []
        current_time = 0
        for item in signal:
            if item > threshold:
                current_time += 1   # condition holds: keep counting
            else:
                current_time = 0    # condition broken: reset the clock
            clock.append(current_time)
        return np.array(clock)

    stopclock([0, 0, 2, 2, 2, 2, 0, 3, 3, 0])

However, I would rather avoid the for-loop, since the counter should run on a much longer dataset. I have thought of an np.cumsum solution in combination with np.diff, but I cannot get through the reset part. Is anyone aware of a more elegant NumPy-style solution to the above problem?
Yes, we can use diff-style differentiation along with cumsum to create such intervaled ramps in a vectorized manner, which should be pretty efficient, especially for large input arrays. The resetting part is taken care of by assigning appropriate values at the end of each interval: the first position after a run gets the negative of that run's length, so cumulative summing brings the count back to zero there.
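To make the reset idea concrete, here is a minimal sketch on the sample from the question. The reset positions and values are placed by hand for this one input, purely as an illustration; the names `signal`, `steps` and `ramp` are not from the original post.

```python
import numpy as np

signal = np.array([0, 0, 2, 2, 2, 2, 0, 3, 3, 0])
mask = signal > 1

# 1 wherever the condition holds, 0 elsewhere.
steps = mask.astype(int)

# Hand-placed resets for this specific input: the runs occupy
# indices 2..5 (length 4) and 7..8 (length 2), so the first position
# after each run gets minus the run length.
steps[6] = -4
steps[9] = -2

# The cumulative sum counts up inside a run and drops to 0 after it.
ramp = steps.cumsum()
print(ramp)  # [0 0 1 2 3 4 0 1 2 0]
```

A general solution only has to find those run boundaries automatically instead of hard-coding them.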
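Here's one implementation to accomplish this -

    import numpy as np

    def intervaled_ramp(a, thresh=1):
        a = np.asarray(a)
        mask = a > thresh

        # Get start, stop indices of each run of True values
        mask_ext = np.concatenate(([False], mask, [False]))
        idx = np.flatnonzero(mask_ext[1:] != mask_ext[:-1])
        s0, s1 = idx[::2], idx[1::2]

        # 1 inside runs; minus the run length at the first position after a run
        out = mask.astype(int)
        valid_stop = s1[s1 < len(a)]   # a run ending at the array end needs no reset
        out[valid_stop] = s0[:len(valid_stop)] - valid_stop
        return out.cumsum()

Sample runs -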
    Input (a) :
    [5 3 1 4 5 0 0 2 2 2 2 0 3 3 0 1 1 2 0 3 5 4 3 0 1]
    Output (intervaled_ramp(a, thresh=1)) :
    [1 2 0 1 2 0 0 1 2 3 4 0 1 2 0 0 0 1 0 1 2 3 4 0 0]

    Input (a) :
    [1 1 1 4 5 0 0 2 2 2 2 0 3 3 0 1 1 2 0 3 5 4 3 0 1]
    Output (intervaled_ramp(a, thresh=1)) :
    [0 0 0 1 2 0 0 1 2 3 4 0 1 2 0 0 0 1 0 1 2 3 4 0 0]

    Input (a) :
    [1 1 1 4 5 0 0 2 2 2 2 0 3 3 0 1 1 2 0 3 5 4 3 0 5]
    Output (intervaled_ramp(a, thresh=1)) :
    [0 0 0 1 2 0 0 1 2 3 4 0 1 2 0 0 0 1 0 1 2 3 4 0 1]

    Input (a) :
    [1 1 1 4 5 0 0 2 2 2 2 0 3 3 0 1 1 2 0 3 5 4 3 0 5]
    Output (intervaled_ramp(a, thresh=0)) :
    [1 2 3 4 5 0 0 1 2 3 4 0 1 2 0 1 2 3 0 1 2 3 4 0 1]

Runtime test

One way to do a fair benchmarking is to use the posted sample in the question, tile it a big number of times, and use that as the input array. With that setup, here are the timings -

    In [841]: a = np.array([0, 0, 2, 2, 2, 2, 0, 3, 3, 0])

    In [842]: a = np.tile(a, 10000)

    # @alexander's soln
    In [843]: %timeit pandas_app(a, threshold=1)
    1 loop, best of 3: 3.93 s per loop

    # @psidom's soln
    In [844]: %timeit stop_clock(a, threshold=1)
    10 loops, best of 3: 119 ms per loop

    # Proposed in this post
    In [845]: %timeit intervaled_ramp(a, thresh=1)
    1000 loops, best of 3: 527 µs per loop
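For anyone who wants to reproduce this outside IPython, the following is a rough, self-contained sketch that first cross-checks the vectorized function against a plain loop and then times both with the standard `timeit` module. The helper name `stopclock_loop`, the random seed and the test sizes are assumptions made for this sketch. The loop is the one from the question, not `pandas_app` or `stop_clock` (whose code is not shown here), so the numbers it prints are not directly comparable to the timings above and will vary by machine.

```python
import timeit
import numpy as np

def stopclock_loop(signal, threshold=1):
    """Loop-based reference, same logic as the function in the question."""
    clock, current = [], 0
    for item in signal:
        current = current + 1 if item > threshold else 0
        clock.append(current)
    return np.array(clock, dtype=int)

def intervaled_ramp(a, thresh=1):
    """Vectorized version, as posted in the answer."""
    a = np.asarray(a)
    mask = a > thresh
    mask_ext = np.concatenate(([False], mask, [False]))
    idx = np.flatnonzero(mask_ext[1:] != mask_ext[:-1])
    s0, s1 = idx[::2], idx[1::2]
    out = mask.astype(int)
    valid_stop = s1[s1 < len(a)]
    out[valid_stop] = s0[:len(valid_stop)] - valid_stop
    return out.cumsum()

# Correctness: both versions should agree on random inputs.
rng = np.random.default_rng(0)  # arbitrary seed, chosen for repeatability
for n in (1, 2, 5, 50, 1000):
    for thresh in (0, 1, 3):
        a = rng.integers(0, 5, size=n)
        assert np.array_equal(intervaled_ramp(a, thresh),
                              stopclock_loop(a, thresh))

# Timing: tile the sample from the question, as in the setup above.
big = np.tile(np.array([0, 0, 2, 2, 2, 2, 0, 3, 3, 0]), 10000)
t_loop = min(timeit.repeat(lambda: stopclock_loop(big), number=1, repeat=3))
t_vec = min(timeit.repeat(lambda: intervaled_ramp(big), number=10, repeat=3)) / 10
print(f"loop: {t_loop * 1e3:.2f} ms, vectorized: {t_vec * 1e3:.3f} ms")
```

Taking the minimum over a few repeats mirrors the "best of 3" figure that `%timeit` reports.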