Python code optimization with ctypes
Attention: The code in this article is licensed under GNU AGPLv3.
I wrote this guide because I could not find one that combines all the useful things about ctypes. I hope this article makes someone’s life a lot easier.
Content:
- Basic optimizations
- styles
- Python compilation
- Structures in Python
- Call your code in C
- Pypy
Basic optimizations
Before rewriting the Python source code in C, consider the basic optimization methods in Python.
Built-in Data Structures
Python’s built-in data structures, such as set and dict, are written in C. They work much faster than your own data structures written as Python classes. Data structures other than the standard set, dict, list, and tuple are described in the module documentation. collections.
List expressions
Instead of adding items to the list using the standard method, use list expressions.
# Slow
mapped = []
for value in originallist:
mapped.append(myfunc(value))
# Faster
mapped = [myfunc(value) in originallist]
ctypes
Module ctypes allows you to interact with C code from Python without using a module subprocess
or another similar module to start other processes from the CLI.
There are only two parts: compiling C code to load as shared object
and setting up data structures in Python code to map them to types C.
In this article, I will connect my Python code with lcs.c, which finds the longest subsequence in two line lists. I want the following to work in Python:
list1 = ['My', 'name', 'is', 'Sam', 'Stevens', '!']
list2 = ['My', 'name', 'is', 'Alex', 'Stevens', '.']
common = lcs(list1, list2)
print(common)
# ['My', 'name', 'is', 'Stevens']
One problem is that this particular C function is the signature of a function that takes lists of strings as argument types and returns a type that does not have a fixed length. I solve this problem with a sequence structure containing pointers and length.
Compiling C code in Python
First, the source code is in C (lcs.c) compiles to lcs.so
to load in Python.
gcc -c -Wall -Werror -fpic -O3 lcs.c -o lcs.o
gcc -shared -o lcs.so lcs.o
- –Wall will display all warnings;
- –Werrow will wrap all warnings in errors;
- –fpic will generate position-independent instructions that are needed if you want to use this library in Python;
- –O3 maximizes optimization;
And now we will start writing Python code using the resulting file shared object.
Structures in Python
Below are two data structures that are used in my C code.
struct Sequence
{
char **items;
int length;
};
struct Cell
{
int index;
int length;
struct Cell *prev;
};
And here is the translation of these structures into Python.
import ctypes
class SEQUENCE(ctypes.Structure):
_fields_ = [('items', ctypes.POINTER(ctypes.c_char_p)),
('length', ctypes.c_int)]
class CELL(ctypes.Structure):
pass
CELL._fields_ = [('index', ctypes.c_int), ('length', ctypes.c_int),
('prev', ctypes.POINTER(CELL))]
A few notes:
- All structures are classes that inherit from
ctypes.Structure
. - Single field
_fields_
is a list of tuples. Each tuple is (
,
) - IN
ctypes
there are similar types c_char (char) and c_char_p (* char). - IN
ctypes
also havePOINTER()
, which creates a type pointer from each type passed to it. - If you have a recursive definition like in
CELL
, you must pass the initial declaration, and then add the fields_fields_
to get a link to yourself later. - Since I have not used
CELL
in my Python code, I did not need to write this structure, but it has an interesting property in the recursive field.
Call your code in C
Also, I needed some code to convert Python types to new structures in C. Now you can use your new C function to speed up Python code.
def list_to_SEQUENCE(strlist: List[str]) -> SEQUENCE:
bytelist = [bytes(s, 'utf-8') for s in strlist]
arr = (ctypes.c_char_p * len(bytelist))()
arr[:] = bytelist
return SEQUENCE(arr, len(bytelist))
def lcs(s1: List[str], s2: List[str]) -> List[str]:
seq1 = list_to_SEQUENCE(s1)
seq2 = list_to_SEQUENCE(s2)
# struct Sequence *lcs(struct Sequence *s1, struct Sequence *s2)
common = lcsmodule.lcs(ctypes.byref(seq1), ctypes.byref(seq2))[0]
ret = []
for i in range(common.length):
ret.append(common.items[i].decode('utf-8'))
lcsmodule.freeSequence(common)
return ret
lcsmodule = ctypes.cdll.LoadLibrary('lcsmodule/lcs.so')
lcsmodule.lcs.restype = ctypes.POINTER(SEQUENCE)
list1 = ['My', 'name', 'is', 'Sam', 'Stevens', '!']
list2 = ['My', 'name', 'is', 'Alex', 'Stevens', '.']
common = lcs(list1, list2)
print(common)
# ['My', 'name', 'is', 'Stevens']
A few notes:
**char
(list of strings) maps directly to a list of bytes in Python.- IN
lcs.c
there is a functionlcs()
with a struct sequencer signature * lcs (struct Sequence * s1, struct Sequence * s2). To set up the return type, I uselcsmodule.lcs.restype = ctypes.POINTER(SEQUENCE)
. - To make a call with a link to the structure Sequence, I use
ctypes.byref()
which returns a “light pointer” to your object (faster thanctypes.POINTER()
) common.items
Is a list of bytes, they can be decoded to receiveret
as a liststr
.- lcsmodule.freeSequence (common) just frees the memory associated with common. This is important because the garbage collector (AFAIK) will not automatically collect it.
Optimized Python code is code that you wrote in C and wrapped in Python.
Something More: PyPy
Attention: I myself have never used PyPy.
One of the easiest optimizations is to run your programs at runtime. Pypy, which contains a JIT compiler (just-in-time), which speeds up the work of loops, compiling them into machine code when executed multiple times.
If you have comments or want to discuss something, write to me (samuel.robert.stevens@gmail.com).
That’s all. See you on course!