Tuesday, May 7, 2013

Passing multidimensional numpy arrays to C with cffi

I've been playing with using numpy with the cffi library to allow c functions to manipulate numpy arrays directly. Doing this for one dimensional arrays is fairly straightforward, but dealing with multidimensional arrays is a bit trickier. I created a simple C file with various calling options. Given a numpy array x, the x.ctypes.data attribute gives us a pointer to the memory buffer for x. We can cast it to the appropriate type using cffi and pass it directly to the single_test function and manipulate it directly there.
def single_test(x):
    C.single_test(ffi.cast("double *", x.ctypes.data), len(x))
The easiest way to do something similar with multidimensional arrays is just to use array math as in the array_math_test example above. Then the array can be passed down as it is in single_test. If you want to work with a multidimensional array without doing array math, then you need to pass down an array of pointers to your c function. To do this you first need to allocate memory for an array of pointers using the ffi.new function, then fill each index of the resulting array with a pointer to your numpy buffer. One approach would be to break your multidimensional array into multiple numpy arrays and point to each buffer individually:
multi_array = [[1.1, 2.2, 1.3, 4.4, 5.5],
               [5.0, 3.0, 2.0, 1.0, 3],
               [6.0, 1.0, -3.2, -1, 2],
               [0.0, 1.0, 1.0, 2.0, 1]]
x = [numpy.array(v, dtype='float64') for v in multi_array]


def multi_test_a(x):
    ap = ffi.new("double* [%d]" % (len(x)))
    for i in range(len(x)):
        ap[i] = ffi.cast("double *", x[i].ctypes.data)
    C.multi_test(ap, len(x), len(x[0]))

multi_test_a(x)
Perhaps a better approach is to use a regular multidimensional numpy array and calculate the offset from the original buffer pointer for each row.
def multi_test_b(x):
    dsize = ffi.sizeof("double")
    ap = ffi.new("double* [%d]" % (x.shape[0]))
    ptr = ffi.cast("double *", x.ctypes.data)
    for i in range(x.shape[0]):
        ap[i] = ptr + i*x.shape[1]
    C.multi_test(ap, x.shape[0], x.shape[1])
Here is the full python script for this example: Output:
Before single [  1.   2.   3.   4.   5.   6.   7.   8.   9.  10.]
After single [  1.   3.   5.   7.   9.  11.  13.  15.  17.  19.]
Before multi_a [array([ 1.1,  2.2,  1.3,  4.4,  5.5]), array([ 5.,  3.,  2.,  1.,  3.]), array([ 6. ,  1. , -3.2, -1. ,  2. ]), array([ 0.,  1.,  1.,  2.,  1.])]
After multi_a [array([ 1.1,  3.2,  3.3,  7.4,  9.5]), array([ 15.,  14.,  14.,  14.,  17.]), array([ 26. ,  22. ,  18.8,  22. ,  26. ]), array([ 30.,  32.,  33.,  35.,  35.])]
Before multi_b [[ 1.1  2.2  1.3  4.4  5.5]
 [ 5.   3.   2.   1.   3. ]
 [ 6.   1.  -3.2 -1.   2. ]
 [ 0.   1.   1.   2.   1. ]] (4, 5)
After multi_b [[  1.1   3.2   3.3   7.4   9.5]
 [ 15.   14.   14.   14.   17. ]
 [ 26.   22.   18.8  22.   26. ]
 [ 30.   32.   33.   35.   35. ]]