Creating intervaled ramp array based on a threshold - Python / NumPy


I want to measure the length of a sub-array that fulfills a condition (like a stop clock). Once the condition is no longer fulfilled, the value should reset to zero. So the resulting array should tell me how many consecutive values have fulfilled the condition (e.g. value > 1):

[0, 0, 2, 2, 2, 2, 0, 3, 3, 0] 

should result in the following array:

[0, 0, 1, 2, 3, 4, 0, 1, 2, 0] 

One can define a function in Python that returns the corresponding NumPy array:

import numpy as np

def stopclock(signal, threshold=1):
    clock = []
    current_time = 0
    for item in signal:
        if item > threshold:
            # condition holds: keep counting
            current_time += 1
        else:
            # condition broken: reset the counter
            current_time = 0
        clock.append(current_time)
    return np.array(clock)

stopclock([0, 0, 2, 2, 2, 2, 0, 3, 3, 0])

However, I would like to avoid the for-loop, especially since this counter should run on a longer dataset. I thought of an np.cumsum solution in combination with np.diff, but I could not get through the reset part. Is anyone aware of a more elegant NumPy-style solution to the above problem?

Yes, we can use diff-styled differentiation along with cumsum to create such intervaled ramps in a vectorized manner, which should be pretty efficient, especially with large input arrays. The resetting part is taken care of by assigning appropriate negative values at the position right after each interval ends, so that the cumulative sum drops back to zero there.

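Here's one implementation to accomplish this -

def intervaled_ramp(a, thresh=1):
    a = np.asarray(a)  # also accept plain lists
    mask = a > thresh

    # Get start, stop indices of each interval where mask is True
    mask_ext = np.concatenate(([False], mask, [False]))
    idx = np.flatnonzero(mask_ext[1:] != mask_ext[:-1])
    s0, s1 = idx[::2], idx[1::2]

    out = mask.astype(int)
    # A stop equal to len(a) means the last interval runs to the end
    valid_stop = s1[s1 < len(a)]
    # Put the negative interval length at each stop so cumsum resets to 0
    out[valid_stop] = s0[:len(valid_stop)] - valid_stop
    return out.cumsum()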
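To make the reset trick described above easier to follow, here is a small trace of the intermediate arrays for the sample from the question. It is only an illustrative sketch: the names `padded`, `edges`, `starts`, `stops`, `steps` and `inside` are chosen here for readability and are not part of the original answer.

```python
import numpy as np

a = np.array([0, 0, 2, 2, 2, 2, 0, 3, 3, 0])
mask = a > 1

# Pad with False on both sides so intervals touching either edge
# of the array still produce a start and a stop transition.
padded = np.concatenate(([False], mask, [False]))
edges = np.flatnonzero(padded[1:] != padded[:-1])
starts, stops = edges[::2], edges[1::2]  # stops are exclusive

# 1 inside an interval, 0 outside ...
steps = mask.astype(int)
# ... and minus the interval length right after each interval ends,
# skipping a stop that falls past the end of the array.
inside = stops < len(a)
steps[stops[inside]] = starts[inside] - stops[inside]

ramp = steps.cumsum()
print(starts, stops)  # [2 7] [6 9]
print(steps)          # [ 0  0  1  1  1  1 -4  1  1 -2]
print(ramp)           # [0 0 1 2 3 4 0 1 2 0]
```

The -4 and -2 cancel exactly the counts accumulated over the preceding intervals, which is why the running sum returns to zero.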

Sample runs -

Input (a) :
[5 3 1 4 5 0 0 2 2 2 2 0 3 3 0 1 1 2 0 3 5 4 3 0 1]
Output (intervaled_ramp(a, thresh=1)) :
[1 2 0 1 2 0 0 1 2 3 4 0 1 2 0 0 0 1 0 1 2 3 4 0 0]

Input (a) :
[1 1 1 4 5 0 0 2 2 2 2 0 3 3 0 1 1 2 0 3 5 4 3 0 1]
Output (intervaled_ramp(a, thresh=1)) :
[0 0 0 1 2 0 0 1 2 3 4 0 1 2 0 0 0 1 0 1 2 3 4 0 0]

Input (a) :
[1 1 1 4 5 0 0 2 2 2 2 0 3 3 0 1 1 2 0 3 5 4 3 0 5]
Output (intervaled_ramp(a, thresh=1)) :
[0 0 0 1 2 0 0 1 2 3 4 0 1 2 0 0 0 1 0 1 2 3 4 0 1]

Input (a) :
[1 1 1 4 5 0 0 2 2 2 2 0 3 3 0 1 1 2 0 3 5 4 3 0 5]
Output (intervaled_ramp(a, thresh=0)) :
[1 2 3 4 5 0 0 1 2 3 4 0 1 2 0 1 2 3 0 1 2 3 4 0 1]
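Beyond hand-picked samples, one way to gain confidence is to compare the vectorized function against a plain loop on many random inputs, including arrays where an interval touches the first or last element. The sketch below assumes nothing beyond NumPy. The reference function `loop_ramp` and the random test parameters are illustrative choices, not part of the original post.

```python
import numpy as np

def loop_ramp(signal, threshold=1):
    # Straightforward reference: count up while above threshold, else reset.
    out, count = [], 0
    for item in signal:
        count = count + 1 if item > threshold else 0
        out.append(count)
    return np.array(out, dtype=int)

def intervaled_ramp(a, thresh=1):
    a = np.asarray(a)
    mask = a > thresh
    mask_ext = np.concatenate(([False], mask, [False]))
    idx = np.flatnonzero(mask_ext[1:] != mask_ext[:-1])
    s0, s1 = idx[::2], idx[1::2]
    out = mask.astype(int)
    valid_stop = s1[s1 < len(a)]
    out[valid_stop] = s0[:len(valid_stop)] - valid_stop
    return out.cumsum()

rng = np.random.default_rng(0)  # fixed seed so the check is repeatable
all_match = True
for _ in range(200):
    n = int(rng.integers(1, 50))         # array length between 1 and 49
    a = rng.integers(0, 4, size=n)       # values in 0..3
    thresh = int(rng.integers(0, 3))     # threshold in 0..2
    if not np.array_equal(loop_ramp(a, thresh), intervaled_ramp(a, thresh)):
        all_match = False
print(all_match)
```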

Runtime test

One way to do fair benchmarking is to use the posted sample from the question, tile it a big number of times, and use that as the input array. With that setup, here are the timings -

In [841]: a = np.array([0, 0, 2, 2, 2, 2, 0, 3, 3, 0])

In [842]: a = np.tile(a,10000)

# @alexander's soln
In [843]: %timeit pandas_app(a, threshold=1)
1 loop, best of 3: 3.93 s per loop

# @psidom's soln
In [844]: %timeit stop_clock(a, threshold=1)
10 loops, best of 3: 119 ms per loop

# Proposed in this post
In [845]: %timeit intervaled_ramp(a, thresh=1)
1000 loops, best of 3: 527 µs per loop
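For readers without IPython, a similar comparison can be sketched with the standard `timeit` module. This is only a rough outline under stated assumptions: it times the loop version from the question rather than the other answers' `pandas_app` and `stop_clock`, which are not shown here, and the absolute numbers will differ by machine and NumPy version.

```python
import timeit
import numpy as np

def stopclock(signal, threshold=1):
    # Loop version from the question
    clock, current_time = [], 0
    for item in signal:
        current_time = current_time + 1 if item > threshold else 0
        clock.append(current_time)
    return np.array(clock)

def intervaled_ramp(a, thresh=1):
    # Vectorized version from this post
    mask = a > thresh
    mask_ext = np.concatenate(([False], mask, [False]))
    idx = np.flatnonzero(mask_ext[1:] != mask_ext[:-1])
    s0, s1 = idx[::2], idx[1::2]
    out = mask.astype(int)
    valid_stop = s1[s1 < len(a)]
    out[valid_stop] = s0[:len(valid_stop)] - valid_stop
    return out.cumsum()

# Same setup as above: the question's sample tiled 10000 times
a = np.tile(np.array([0, 0, 2, 2, 2, 2, 0, 3, 3, 0]), 10000)

# Make sure both versions agree before comparing their speed
same = np.array_equal(stopclock(a), intervaled_ramp(a))

# Best of 3 repeats, reported as seconds per call
t_loop = min(timeit.repeat(lambda: stopclock(a), number=1, repeat=3))
t_vec = min(timeit.repeat(lambda: intervaled_ramp(a), number=10, repeat=3)) / 10

print("outputs equal:", same)
print("loop version      : %.6f s per call" % t_loop)
print("vectorized version: %.6f s per call" % t_vec)
```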
