Wednesday, May 26, 2010

cython timings test

The TASK : To optimize cython functions

Detailed: functions which depend on a once initialized attribute value

This comes in handy in many cases. For example, to write a Laplacian function of a scalar field in a spherical/axisymmetric coordinate system, you would need three separate cases for 1, 2 and 3 dimensions for performance reasons, if you do not want to write every function as a general 3D function.
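To make the idea concrete, here is a minimal pure-Python sketch (Python 3 assumed) of the two options being compared: binding the dimension-specific function once at construction, versus re-checking the dimension on every call. The class name KernelSketch and the method func_branching are hypothetical and only for illustration; the function bodies mirror the toy 1+x / 2+x bodies used in the test code.

```python
class KernelSketch:
    """Pure-Python illustration only; not the Cython class that was timed."""

    def __init__(self, dim=1):
        self.dim = dim
        # Option A: pick the implementation once, when the object is created.
        implementations = {1: self._func1, 2: self._func2}
        if dim not in implementations:
            raise ValueError("unsupported dim: %r" % (dim,))
        self.func = implementations[dim]

    def _func1(self, x):
        return 1 + x

    def _func2(self, x):
        return 2 + x

    def func_branching(self, x):
        # Option B: compare the stored dimension on every single call.
        if self.dim == 1:
            return 1 + x
        elif self.dim == 2:
            return 2 + x
        raise ValueError("unsupported dim: %r" % (self.dim,))


if __name__ == "__main__":
    k2 = KernelSketch(2)
    # Both options give the same answer; only the dispatch cost differs.
    print(k2.func(1.0), k2.func_branching(1.0))
```

Both options return the same values; the question the timings try to answer is only which dispatch style costs less per call.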

The test CODE : test_kernel.pyx

cdef class Kernel:
    cdef int dim
    cdef double (*func)(Kernel,double)
    def __init__(self, dim=1):
        self.dim = dim
        if dim == 1:
            self.func = self.func1
        elif dim == 2:
            self.func = self.func2
    cdef double func1(self, double x):
        return 1+x
    cdef double func2(self, double x):
        return 2+x
    cdef double c_func(self, double x):
        '''this is only to make function signature compatible with func1 and func2'''
        return self.func(self, x)
    def p_func(self, double x):
        return self.func(self, x)
    cpdef double py_func(self, double x):
        return self.func(self, x)
    cpdef double py_c_func(self, double x):
        return self.c_func(x)
    def py_func1(self, x):
        return self.func1(x)
    def py_func2(self, x):
        return self.func2(x)
    cdef double func_common(self, double x):
        cdef int dim = self.dim
        if dim == 1:
            return 10+x
        elif dim == 2:
            return 20+x
    def py_func_c_common(self, x):
        return self.func_common(x)
    cpdef double py_func_common(self, double x):
        cdef int dim = self.dim
        if dim == 1:
            return 10+x
        elif dim == 2:
            return 20+x

Compilation command:
    cython -a test_kernel.pyx;
    gcc <optimization-flag> -shared -fPIC test_kernel.c -lpython2.6 -I /usr/include/python2.6/ -o
where <optimization-flag> is either empty, "-O2" or "-O3"

Cython optimization
Tip 1:
Type (cdef) as many variables as you can. You also need to type the locals in each function. Try to use C data types wherever possible.
Tip 2:
    cython -a file.pyx
This command generates an HTML file showing the lines that cause expensive Python functions to be called. Clicking on a line shows the corresponding generated C code, with expensive calls highlighted in shades of red. Try to eliminate as many such calls as you can.

The TEST :

import timeit

def time(s):
    '''returns time in microseconds'''
    t = 1e6*timeit.timeit(s,'import test_kernel;k1=test_kernel.Kernel(1);k2=test_kernel.Kernel(2);',number=1000000)/1000000.
    print s, t
    return t



Timings :

function                time (μs)                                    (ns)
Optimization flag ->    None    -O2    -O3    sum(k1+k2)/2           penalty

Result :

The best approach is to write a separate C function and a Python accessor function.

case                                                                 function                           penalty cost (ns)
C function + python accessor : base case                             p_func
cpdef instead of def                                                 py_func                            1.7345
calling a cdef class method instead of a function pointer attribute  py_func1, py_func2                 4.3456
one extra C function call                                            py_c_func                          3.9311
(def + cdef) instead of (cpdef)                                      py_func_c_common - py_func_common  2.5723
One C comparison vs one C function call                              py_func_common                     1.4237
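For readers who want to redo the arithmetic, here is a small sketch of how a penalty figure could be derived from per-call timings. It assumes (this is my reading of the column headers, not something stated explicitly) that each function's time is averaged over the k1 and k2 instances and that the penalty is the difference between two such averages, converted from μs to ns. The helper names mean_time_us and penalty_ns are hypothetical.

```python
def mean_time_us(t_k1_us, t_k2_us):
    """Average the per-call time (in microseconds) over the two instances."""
    return (t_k1_us + t_k2_us) / 2.0


def penalty_ns(case_us, base_us):
    """Penalty of one case relative to another, in nanoseconds.

    Both arguments are (time for k1, time for k2) pairs in microseconds.
    Assumption: penalty = mean(case) - mean(base), scaled from us to ns.
    """
    return 1000.0 * (mean_time_us(*case_us) - mean_time_us(*base_us))
```

With this definition a positive value means the case is slower than the one it is compared against.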

Conclusion :

As can clearly be seen, the results are clearly inconclusive :)
This was a small test carried out on my laptop with no controlled environment. Also, though the results seemed close to repeatable, many trials should be conducted and each value should come with a standard deviation to check the repeatability. However, one clear conclusion is: do not forget to add optimization flags. Setuptools already does that for you.
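The repeated-trials idea could be sketched roughly as follows, using the standard library's timeit.repeat and statistics modules (Python 3 assumed, since statistics is not available in Python 2.6). The helper name time_with_spread and the default repeat/number values are arbitrary choices for illustration; the statement being timed here is a stand-in, not one of the Kernel methods.

```python
import statistics
import timeit


def time_with_spread(stmt, setup="pass", repeat=5, number=100000):
    """Return (mean, standard deviation) of the per-call time in microseconds.

    Each of the `repeat` trials runs `stmt` `number` times.
    """
    totals = timeit.repeat(stmt, setup=setup, repeat=repeat, number=number)
    per_call_us = [1e6 * total / number for total in totals]
    return statistics.mean(per_call_us), statistics.stdev(per_call_us)


if __name__ == "__main__":
    # Stand-in statement; swap in e.g. 'k1.p_func(1.0)' with a suitable setup.
    mean_us, std_us = time_with_spread("abs(-1.5)")
    print("abs(-1.5): %.4f +/- %.4f us" % (mean_us, std_us))
```

If the standard deviation is comparable to the differences between two functions, those differences should not be trusted.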
Also, using a function pointer is not so bad after all. It would become more advantageous with a larger number of comparisons.
Cython provides great speedups (who didn't know that :) ). The pure Python version of py_func_common took 0.408 μs for dim=1 and 0.518 μs for dim=2.
These results are purely from python point of view. The effect of cdef/cpdef should also be considered in c/cython code which calls these functions.
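The pure Python version is not shown above; a plausible reconstruction, assuming it simply mirrors the body of py_func_common in an ordinary Python class, might look like this (Python 3 assumed; the names PyKernel and time_us are hypothetical):

```python
import timeit


class PyKernel(object):
    """Guess at a pure-Python counterpart of Kernel.py_func_common."""

    def __init__(self, dim=1):
        self.dim = dim

    def py_func_common(self, x):
        dim = self.dim
        if dim == 1:
            return 10 + x
        elif dim == 2:
            return 20 + x


def time_us(stmt, number=100000):
    """Per-call time of `stmt` in microseconds, with k1 and k2 predefined."""
    namespace = {"k1": PyKernel(1), "k2": PyKernel(2)}
    total = timeit.timeit(stmt, globals=namespace, number=number)
    return 1e6 * total / number


if __name__ == "__main__":
    for stmt in ("k1.py_func_common(1.0)", "k2.py_func_common(1.0)"):
        print(stmt, time_us(stmt))
```

The absolute numbers will of course depend on the machine and the Python version, so they are not expected to match the figures quoted above.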


I am no optimization expert. I have done this out of sheer boredom :)
If anyone wants to verify, you are welcome.
Any information content is purely coincidental.
