Faster Python Made Easier with Cython’s Pure Python Mode

Cython has long been one of the great secret weapons of Python performance, allowing you to turn Python code into C for speed. But Cython has also long suffered from a cumbersome and counterintuitive syntax, an odd hybrid of C and Python. To add insult to injury, Cython code cannot be processed by any of the current Python linting and formatting tools.

The good news: In recent years, Cython has developed an alternative syntax, called pure Python mode. As the name suggests, pure Python mode uses native Python syntax to express Cython behaviors and constructs, making it easier for Python programmers to get started with Cython.

Pure Python mode also enhances one of Cython’s greatest advantages: it makes it easier to start with a conventional Python code base and gradually transform it into C code. Modules written in pure Python mode can also be run as normal Python, but without Cython’s speed gains.

Finally, pure Python mode allows Python linting and code analysis tools to work with Cython modules. The existing culture of Python tooling does not have to stop at the Cython barrier.

The original Cython syntax

Below is a short module written in conventional Python. It calculates (not very efficiently) the Fibonacci number for a given position in the sequence. (Note that we’re using classes here not because they’re the best way to solve this problem, but because it’s worth showing how they map to equivalent elements in Cython.)

class Fibonacci:    
    def __init__(self, start: int):
        if start < 0:
            raise ValueError("Starting number must not be negative")
        self.n = start

    def calc(self) -> int:
        return self._calc(self.n)
    
    def _calc(self, val: int) -> int:
        if val == 0:
            return 0
        elif val == 1 or val == 2:
            return 1
        else:
            return self._calc(val-1)+self._calc(val-2)

On an AMD Ryzen 5 3600, this code takes about 20 seconds to run. Python is not efficient at this kind of math-heavy, deeply recursive work. If we rewrite the code in Cython, we can speed things up considerably.
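
The timing above depends on the starting number, which is not stated here. As a rough sketch of how such a measurement could be taken, the snippet below wraps the same class in a small helper. The time_fib helper and the input of 25 are illustrative assumptions, chosen so the run finishes quickly; they are not the settings behind the 20-second figure.

```python
import time


class Fibonacci:
    def __init__(self, start: int):
        if start < 0:
            raise ValueError("Starting number must not be negative")
        self.n = start

    def calc(self) -> int:
        return self._calc(self.n)

    def _calc(self, val: int) -> int:
        if val == 0:
            return 0
        elif val == 1 or val == 2:
            return 1
        else:
            return self._calc(val - 1) + self._calc(val - 2)


def time_fib(n: int):
    """Hypothetical helper: return (result, elapsed seconds) for one calc() call."""
    started = time.perf_counter()
    result = Fibonacci(n).calc()
    return result, time.perf_counter() - started


if __name__ == "__main__":
    # 25 is an arbitrary input for illustration; larger values take far longer.
    value, seconds = time_fib(25)
    print(f"Fibonacci(25).calc() = {value} in {seconds:.3f} s")
```

Because the recursion recomputes the same values many times, the running time grows very quickly with the input, so any comparison between versions is only meaningful for the same starting number.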

Here is a Cython version of the same code (saved with a .pyx file extension):

cdef class Fibonacci:
    cdef int n
    def __init__(self, int start):
        if start < 0:
            raise ValueError("Starting number must not be negative")
        self.n = start

    cpdef int calc(self):
        return self._calc(self.n)
    
    cdef int _calc(self, int val):
        if val == 0:
            return 0
        elif val == 1 or val == 2:
            return 1
        else:
            return self._calc(val-1)+self._calc(val-2)

This Cython code runs much faster: about half a second on the same hardware! But as you may have noticed, Cython’s syntax can be confusing.

If you squint hard, you can see the original Python syntax still there, albeit buried under a number of non-Python additions. cdef and cpdef, for example, are used to declare Cython-only functions (callable only from compiled code) and Cython-wrapped functions (callable from both compiled code and Python). Also, the type declarations used on objects and function signatures have nothing to do with the type hinting syntax we generally use in Python.

There are many other ways Cython syntax can be tricky to parse, but this example should give you a general idea.

Pure Python Syntax in Cython

Here is the same module, rewritten in pure Python mode (and saved with the .py extension):

import cython

@cython.cclass
class Fibonacci:    
    n: cython.int
    def __init__(self, start: cython.int):
        if start < 0:
            raise ValueError("Starting number must not be negative")
        self.n = start

    @cython.ccall
    def calc(self) -> cython.int:
        return self._calc(self.n)
    
    @cython.cfunc
    def _calc(self, val: cython.int) -> cython.int:
        if val == 0:
            return 0
        elif val == 1 or val == 2:
            return 1
        else:
            return self._calc(val-1)+self._calc(val-2)

Several things about this code should stand out right away:

  • We add Cython functionality to our code through the cython import, not through custom syntax. All of the syntax shown here is standard Python.
  • The type hints for our variables are written in the conventional Python way. For example, the variable n is declared at the class level with a Python type hint. Similarly, function signatures use Python-style type hints to indicate what they receive and return.
  • To declare Cython functions and classes, we use decorators (a bit of standard Python syntax) instead of the cdef/cpdef keywords (not standard at all).

Another useful aspect of pure Python syntax: This code can run as is in standard Python, provided the cython package is installed so that the import succeeds. We can just import the module and use it without compiling it, although we won’t get the speed benefits that Cython provides. If we compile it, importing it will import the compiled version. This makes it easier to perform the kinds of incremental transformations of Python code that Cython was designed to make possible.
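
Since the same import statement can yield either version, it can be handy to check which one was actually loaded. The following is a minimal sketch using only the standard library; the is_compiled_extension helper is an illustrative name, and json stands in for your own module so the snippet runs anywhere.

```python
import importlib.machinery
import json  # stand-in for your own module; json itself is not an extension module


def is_compiled_extension(module) -> bool:
    """Return True if the module was loaded from a compiled extension file."""
    spec = getattr(module, "__spec__", None)
    loader = getattr(spec, "loader", None)
    # Extension modules (.so on Linux/macOS, .pyd on Windows), which is what
    # Cython produces, are loaded by ExtensionFileLoader.
    return isinstance(loader, importlib.machinery.ExtensionFileLoader)


print(is_compiled_extension(json))
```

Calling the helper on a module built by Cython should report True, while the same module imported from its .py source should report False.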

Compiled and uncompiled code in pure Python mode

A useful feature in pure Python mode is a way to create alternate code paths in a module depending on whether the code is running in normal Python mode or compiled Cython mode. An example:

if cython.compiled:
    data = cython.cast(
        cython.p_int, PyMem_Malloc(array_size * cython.sizeof(cython.int))
    )
else:
    data = arr.array("i", [0] * array_size)

data[0] = 32

Here we assign one of two possible values to data, depending on whether this code is compiled. If compiled, data is a pointer to a region of memory allocated through the Python runtime. If not compiled, data is a Python array.array object made up of C int values (32-bit on most platforms). In both cases, we can read the elements of the array and set them with the same code.
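
The uncompiled branch can be tried on its own in plain Python. This is a small sketch of that branch only; array_size is given an arbitrary value here for illustration.

```python
import array as arr

array_size = 8  # hypothetical size, chosen only for illustration

# Uncompiled branch: a zero-filled Python array of C ints (typecode "i").
data = arr.array("i", [0] * array_size)

# The same indexing syntax works on the pointer used in the compiled branch.
data[0] = 32

# itemsize reports how many bytes a C int occupies on this platform.
print(data[0], len(data), data.itemsize)
```

One difference the shared indexing syntax hides: memory obtained from PyMem_Malloc is not zero-filled and must eventually be released with PyMem_Free, whereas the array object is managed by Python.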

What Pure Python Mode Doesn’t (Yet) Do

Pure Python mode has some limitations that mean you can’t yet use it in all cases where “classic” Cython works.

First, pure Python mode does not support the full range of PEP 484-style annotations. Annotations such as Final or Union are not respected. The main reason for using PEP 484-style annotations in this mode is to provide a convenient way to include Cython’s own type hints, so many of the type hints used in standard Python are not yet supported.

Second, pure Python mode does not support packed C structures (extremely important for working with some C libraries) or C-style enums. Both of these features could possibly be supported in pure Python mode in some form, but at the moment the only way to work with them is with the old Cython format.

Finally, you cannot call C functions from pure Python mode as you would from normal Cython. Normally in Cython, you can call a C function by declaring an external reference to it, like this:

cdef extern from "math.h":
    cpdef double sin(double x)

Pure Python mode doesn’t provide any mechanism to do this directly, but you can use selective imports as described in the Cython documentation to devise a workaround.
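
One form this workaround can take, sketched along the lines of the pure Python mode section of the Cython documentation, is to keep the C declaration in a companion .pxd file and fall back to a Python equivalent when the module is not compiled. The file names mymodule.py and mymodule.pxd are placeholders, and the sketch assumes the cython package is installed.

```python
# mymodule.py (placeholder name). A companion file, mymodule.pxd, which is
# read only when compiling, is assumed to hold the C declaration:
#
#     cdef extern from "math.h":
#         cpdef double sin(double x)

import cython

# When running as plain Python there is no C function to call,
# so fall back to the equivalent from the standard library.
if not cython.compiled:
    from math import sin

# Compiled: intended to resolve to sin() from math.h.
# Uncompiled: this is math.sin().
result = sin(0)
print(result)
```

This approach only works when an equivalent Python function exists for the C function being called.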

Copyright © 2022 IDG Communications, Inc.
