python - Perfomance of map vs starmap? -


i trying make pure-python (without external dependencies) element-wise comparison of 2 sequences. first solution was:

list(map(operator.eq, seq1, seq2)) 

then found starmap function itertools, seemed pretty similar me. turned out 37% faster on computer in worst case. not obvious me, measured time necessary retrieve 1 element generator (don't know if way correct):

from operator import eq itertools import starmap  seq1 = [1,2,3]*10000 seq2 = [1,2,3]*10000 seq2[-1] = 5  gen1 = map(eq, seq1, seq2)) gen2 = starmap(eq, zip(seq1, seq2))  %timeit -n1000 -r10 next(gen1) %timeit -n1000 -r10 next(gen2)  271 ns ± 1.26 ns per loop (mean ± std. dev. of 10 runs, 1000 loops each) 208 ns ± 1.72 ns per loop (mean ± std. dev. of 10 runs, 1000 loops each) 

in retrieving elements second solution 24% more perfomant. after both produce same results list. somewhere gain 13% in time:

%timeit list(map(eq, seq1, seq2)) %timeit list(starmap(eq, zip(seq1, seq2)))  5.24 ms ± 29.4 µs per loop (mean ± std. dev. of 7 runs, 100 loops each) 3.34 ms ± 84.8 µs per loop (mean ± std. dev. of 7 runs, 100 loops each) 

i don't know how dig deeper in profiling of such nested code? question why first generator faster in retrieving , gain 13% in list function?

edit: first intention perform element-wise comparison instead of all, all function replaced list, replacement not affect timing ratio.

python 3.6.2 on windows 10 (64bit)

there several factors contribute (in conjunction) observed performance difference:

  • zip re-uses returned tuple if has reference count of 1 when next __next__ call made.
  • map builds new tuple passed "mapped function" every time __next__ call made. won't create new tuple scratch because python maintains storage unused tuples. in case map has find unused tuple of right size.
  • starmap checks if next item in iterable of type tuple , if passes on.
  • calling c function within c code pyobject_call won't create new tuple passed callee.

so starmap zip use 1 tuple on , on again passed operator.eq reducing function call overhead immensely. map on other hand create new tuple (or fill c array cpython 3.6 on) every time operator.eq called. speed difference tuple creation overhead.

instead of linking source code i'll provide cython code can used verify this:

in [1]: %load_ext cython  in [2]: %%cython    ...:    ...: cpython.ref cimport py_decref    ...:    ...: cpdef func(zipper):    ...:     = next(zipper)    ...:     print('a', a)    ...:     py_decref(a)    ...:     b = next(zipper)    ...:     print('a', a)  in [3]: func(zip([1, 2], [1, 2])) (1, 1) (2, 2) 

yes, tuples aren't immutable, simple py_decref sufficient "trick" zip believing noone else holds reference returned tuple!

as "tuple-pass-thru":

in [4]: %%cython    ...:    ...: def func_inner(*args):    ...:     print(id(args))    ...:    ...: def func(*args):    ...:     print(id(args))    ...:     func_inner(*args)  in [5]: func(1, 2) 1404350461320 1404350461320 

so tuple passed right through (just because these defined c functions!) doesn't happen pure python functions:

in [6]: def func_inner(*args):    ...:     print(id(args))    ...:    ...: def func(*args):    ...:     print(id(args))    ...:     func_inner(*args)    ...:  in [7]: func(1, 2) 1404350436488 1404352833800 

note doesn't happen if called function isn't c function if called c function:

in [8]: %%cython    ...:     ...: def func_inner_c(*args):    ...:     print(id(args))    ...:     ...: def func(inner, *args):    ...:     print(id(args))    ...:     inner(*args)    ...:  in [9]: def func_inner_py(*args):     ...:     print(id(args))     ...:     ...:  in [10]: func(func_inner_py, 1, 2) 1404350471944 1404353010184  in [11]: func(func_inner_c, 1, 2) 1404344354824 1404344354824 

so there lot of "coincidences" leading point starmap zip faster calling map multiple arguments when called function c function...


Comments

Popular posts from this blog

angular - Ionic slides - dynamically add slides before and after -

Add a dynamic header in angular 2 http provider -

minify - Minimizing css files -