Calling Go functions from Python using ctypes
To follow all the steps you will need Python, the Go compiler, and GCC (MinGW on Windows). The code examples are available in the repo on GitHub.
It's worth noting that there are other ways to call Go from Python, SWIG for example. Here we will look at ctypes because it doesn't require any additional dependencies and is very simple.
Go!
Hello world
Let's start with hello-world, where would we be without it?
package main
import "C"
import "fmt"
//export hello
func hello() {
    fmt.Println("Hello world!")
}
func main() {}
Now we build hello.dll from hello.go by compiling it with the -buildmode=c-shared flag:
# Windows:
go build -o hello.dll -buildmode=c-shared hello.go
# Linux:
go build -o hello.so -buildmode=c-shared hello.go
Now, to use hello.dll in Python, we load the file with ctypes.CDLL():
import ctypes
lib = ctypes.CDLL('./hello.dll') # Or hello.so if on Linux.
hello = lib.hello
hello()
Let's run:
> python hello.py
Hello world!
Great. There are a few points worth noting here:
The Go code is absolutely standard; the only additions are import "C" and the //export hello comment, which exports the function hello for external use.
Building with the -buildmode=c-shared flag creates a C-style shared library.
Loading such a library in Python is done via ctypes.CDLL().
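Since the library file name differs between platforms, a small helper can pick it before calling ctypes.CDLL(). This is only a sketch: shared_lib_name is a hypothetical helper, and the .dll/.so convention is assumed from the build commands above.

```python
import sys


def shared_lib_name(stem: str) -> str:
    """Return the file name the c-shared build is assumed to have on this OS."""
    # Assumption: .dll on Windows and .so everywhere else,
    # matching the two build commands shown above.
    suffix = '.dll' if sys.platform.startswith('win') else '.so'
    return f'./{stem}{suffix}'


print(shared_lib_name('hello'))
```

The result can then be passed straight to ctypes.CDLL().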
Simple input and output
Well, now the function will take some arguments and return a value:
//export add
func add(a, b int64) int64 {
    return a + b
}
lib = ctypes.CDLL('./primitive.dll')
add = lib.add
# convert the values to their C representation
add.argtypes = [ctypes.c_int64, ctypes.c_int64]
add.restype = ctypes.c_int64
print('10 + 15 =', add(10, 15))
Let's run it:
> python add.py
10 + 15 = 25
So, to pass input to a Go function and receive its output, you set the argtypes and restype attributes that ctypes provides on the function object. A couple of points:
argtypes checks the arguments before the library code is called.
Setting these attributes tells Python how to convert Python input values to ctypes values and how to convert the output values back to Python values.
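To see the argument check without compiling anything, the sketch below replaces the Go function with a Python callable wrapped by ctypes.CFUNCTYPE. The stand-in and the ADD_PROTO name are assumptions of this example; a function loaded with ctypes.CDLL() and configured with argtypes/restype is expected to behave the same way.

```python
import ctypes

# Stand-in for the exported Go function: a Python callable wrapped as a
# C function with the signature int64 add(int64, int64).
ADD_PROTO = ctypes.CFUNCTYPE(ctypes.c_int64, ctypes.c_int64, ctypes.c_int64)
add = ADD_PROTO(lambda a, b: a + b)

print('10 + 15 =', add(10, 15))

# A value that cannot be converted to c_int64 is rejected before the call.
try:
    add('ten', 15)
except ctypes.ArgumentError as exc:
    rejected = True
    print('rejected:', exc)
else:
    rejected = False
```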
By the way, you can look up the correspondence between C types and Go types in the .h file that is generated when you compile your Go code with -buildmode=c-shared.
Attention: strictly speaking, type sizes depend on the hardware architecture. In general, it is safer to use fixed-size types (int64) than types whose size varies between platforms (int).
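A quick way to check this on your own machine is to print the sizes that ctypes reports. A minimal sketch using only the standard library:

```python
import ctypes

# Fixed-size types have the same width on every platform.
print('c_int64:', ctypes.sizeof(ctypes.c_int64), 'bytes')
print('c_int32:', ctypes.sizeof(ctypes.c_int32), 'bytes')

# Platform-dependent types may differ: c_long is typically 4 bytes on
# 64-bit Windows and 8 bytes on 64-bit Linux.
print('c_long: ', ctypes.sizeof(ctypes.c_long), 'bytes')
print('pointer:', ctypes.sizeof(ctypes.c_void_p), 'bytes')
```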
Arrays and slices
Okay, now let's talk about arrays and slices. This brings us into unsafe memory access territory – although Python and Go are generally memory safe, working with raw pointers can lead to buffer overflows or leaks.
// squares returns the squares of the input numbers
//
//export squares
func squares(numsPtr *float64, outPtr *float64, n int64) {
    nums := unsafe.Slice(numsPtr, n)
    out := unsafe.Slice(outPtr, n)
    // by the way, for Go < 1.17:
    // nums := (*[1 << 30]float64)(unsafe.Pointer(numsPtr))[:n:n]
    // out := (*[1 << 30]float64)(unsafe.Pointer(outPtr))[:n:n]
    for i, x := range nums {
        out[i] = x * x
    }
}
And we call this DLL from Python:
import ctypes
from array import array
lib = ctypes.CDLL('./arrays.dll')
squares = lib.squares
squares.argtypes = [
ctypes.POINTER(ctypes.c_double),
ctypes.POINTER(ctypes.c_double),
ctypes.c_int64,
]
# using from_buffer() is more efficient than simply doing:
# (ctypes.c_double * 3)(*[1, 2, 3])
nums = array('d', [1, 2, 3])
nums_ptr = (ctypes.c_double * len(nums)).from_buffer(nums)
out = array('d', [0, 0, 0])
out_ptr = (ctypes.c_double * len(out)).from_buffer(out)
squares(nums_ptr, out_ptr, len(nums))
print('nums:', list(nums))
print('out:', list(out))
Let's run it:
> python squares.py
nums: [1.0, 2.0, 3.0]
out: [1.0, 4.0, 9.0]
Summary: to work with lists, we need to convert them into C arrays. One way is to create an array with (ctypes.my_type * my_length)(1, 2, 3 ...). A faster way is to use the array module, as shown above. We'll touch on this a little later, when we talk about benchmarks.
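The difference between the two approaches is easy to observe without Go: the constructor copies the values, while from_buffer() creates a view over memory the array already owns. A minimal sketch (the variable names are illustrative only):

```python
import ctypes
from array import array

nums = array('d', [1.0, 2.0, 3.0])

# Copying constructor: builds a new C array from the Python values.
copied = (ctypes.c_double * len(nums))(*nums)

# from_buffer(): a ctypes view over the memory that `nums` already owns.
shared = (ctypes.c_double * len(nums)).from_buffer(nums)

# Writing through each object shows which one aliases the original array.
copied[0] = 100.0
shared[1] = 200.0
print(list(nums))  # [1.0, 200.0, 3.0]
```

This is why a Go function that writes into a from_buffer() view changes the Python array directly.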
In Go you can turn a C-style pointer into a slice with unsafe.Slice. This way you can use normal Go slice syntax when working with Python buffers.
A couple more points: you can't return a Go pointer through cgo; it will cause an error. Instead, you can allocate C memory from Go using C.malloc() and return a pointer to it. However, the Go garbage collector does not manage such memory in any way, so you need to provide a mechanism for freeing it to avoid memory leaks.
What can we recommend here? For a function to safely return an array from Go, you must either allocate the memory in Python first and pass it to Go as an argument, or create the array in Go and wrap it in a safe structure (we'll touch on this a little later).
Let's summarize the dangers:
Returning Go pointers to Python. Error.
Returning C pointers from Go to Python without explicitly freeing them. Memory leak.
Losing the reference to a ctypes object while Go code is still running (for example, by taking ctypes.addressof and then dropping the object it points to). Possible segmentation fault.
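One way to guard against the last danger is to keep the owning objects and the raw address together, so the memory cannot be collected while the address is in use. The sketch below assumes a hypothetical PinnedBuffer helper; it is not part of ctypes.

```python
import ctypes
from array import array


class PinnedBuffer:
    """Hypothetical helper: owns the memory for as long as the address is used."""

    def __init__(self, values):
        # Both objects are stored on the instance, so they stay referenced.
        self._backing = array('d', values)
        self._view = (ctypes.c_double * len(self._backing)).from_buffer(self._backing)

    @property
    def address(self) -> int:
        # The raw address is only valid while this PinnedBuffer is alive.
        return ctypes.addressof(self._view)

    def __len__(self) -> int:
        return len(self._backing)


buf = PinnedBuffer([1.0, 2.0, 3.0])
first = ctypes.c_double.from_address(buf.address).value
print(first)
```

As long as buf is referenced, passing buf.address to native code is safe; dropping buf while the call is still running is exactly the hazard described above.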
Strings
Strings are designed much like arrays in terms of memory management, so everything about arrays also applies to them. Let's discuss a couple of useful techniques and some pitfalls:
//export repeat
func repeat(s *C.char, n int64, out *byte, outN int64) *byte {
    // wrap our output buffer in a Go buffer
    outBytes := unsafe.Slice(out, outN)[:0]
    buf := bytes.NewBuffer(outBytes)
    var goString string = C.GoString(s) // copy the input into Go memory
    for i := int64(0); i < n; i++ {
        buf.WriteString(goString)
    }
    buf.WriteByte(0) // important: null byte at the end of the string
    return out
}
lib = ctypes.CDLL('./string.dll')
repeat = lib.repeat
repeat.argtypes = [
ctypes.c_char_p,
ctypes.c_int64,
ctypes.c_char_p,
ctypes.c_int64,
]
repeat.restype = ctypes.c_char_p
#
buf_size = 1000
buf = ctypes.create_string_buffer(buf_size)
result = repeat(b'Badger', 4, buf, buf_size) # type(result) = bytes
print('Badger * 4 =', result.decode())
result = repeat(b'Snake', 5, buf, buf_size)
print('Snake * 5 =', result.decode())
Let's run it:
> python repeat.py
Badger * 4 = BadgerBadgerBadgerBadger
Snake * 5 = SnakeSnakeSnakeSnakeSnake
Strings are passed by converting a Python string to a bytes object (usually by calling encode()), then to a C pointer, and then to a Go string.
Using ctypes.c_char_p in argtypes makes Python expect a bytes object and convert it to a C char*. In restype, it converts the returned char* to a bytes object.
In Go you can convert a *C.char into a Go string using C.GoString. This copies the data and creates a new string that is managed by the Go garbage collector. To create a *C.char as a return value, you can call C.CString. However, the memory will never be freed unless you keep the pointer and release it later, so a memory leak will occur. To return strings from Go, you can use the same techniques as when working with arrays.
If the pointer to the output buffer was passed in from Python, Go can return it and Python will automatically create a bytes object from it.
So what problems might arise?
Returning C.CString without keeping the pointer for later freeing. Memory leak.
Not adding a null byte to the end of the output string. Buffer overflow when converting to a Python object.
Not checking the output buffer size in Go. Buffer overflow or incomplete output.
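The size check from the last point can be sketched on the Python side as well. Here required_size and safe_repeat are hypothetical helpers, and safe_repeat is a pure-Python stand-in for the Go function, so the snippet runs without the compiled library.

```python
import ctypes


def required_size(text: bytes, n: int) -> int:
    """Bytes needed for `text` repeated n times plus the trailing null byte."""
    return len(text) * n + 1


def safe_repeat(text: bytes, n: int, buf) -> bytes:
    """Python stand-in for the Go callee: refuse to write past the buffer."""
    needed = required_size(text, n)
    if needed > len(buf):
        raise ValueError(f'need {needed} bytes, buffer has {len(buf)}')
    buf.value = text * n  # ctypes appends the null byte when there is room
    return buf.value      # .value stops at the first null byte


buf = ctypes.create_string_buffer(16)
print(safe_repeat(b'Snake', 3, buf))

try:
    safe_repeat(b'Badger', 4, buf)
except ValueError as exc:
    print('refused:', exc)
```

The same comparison belongs in the Go function itself, since only the callee knows how many bytes it is about to write.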
String array
By the way, you can pass an array of strings like this:
func goStrings(cstrs **C.char) []string {
    var result []string
    slice := unsafe.Slice(cstrs, 1<<30)
    for i := 0; slice[i] != nil; i++ {
        result = append(result, C.GoString(slice[i]))
    }
    return result
}
import ctypes
from typing import List

def to_c_str_array(strs: List[str]):
    # one extra slot for the terminating NULL pointer
    ptr = (ctypes.c_char_p * (len(strs) + 1))()
    ptr[:-1] = [s.encode() for s in strs]
    ptr[-1] = None
    return ptr
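To make the NULL-terminated layout visible, the sketch below reads such an array back in Python the same way goStrings() walks it in Go. from_c_str_array is a hypothetical helper written for this illustration only.

```python
import ctypes
from typing import List


def from_c_str_array(ptr) -> List[str]:
    """Hypothetical mirror of goStrings(): read entries until the NULL terminator."""
    result = []
    i = 0
    while ptr[i] is not None:  # a NULL char* is returned as None
        result.append(ptr[i].decode())
        i += 1
    return result


# Two strings followed by the terminating NULL pointer.
c_strs = (ctypes.c_char_p * 3)(b'Badger', b'Snake', None)
print(from_c_str_array(c_strs))
```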
NumPy and pandas
NumPy buffers are accessed using the syntax .ctypes.data_as(ctypes.whatever). In pandas you can use the .values attribute to get the underlying NumPy array, and then use the NumPy syntax to get the actual pointer. This way you can modify the array or table in place. It looks like this:
//export increase
func increase(numsPtr *int64, n int64, a int64) {
    nums := unsafe.Slice(numsPtr, n)
    for i := range nums {
        nums[i] += a
    }
}
import ctypes
import pandas
lib = ctypes.CDLL('./numpypandas.dll')
increase = lib.increase
increase.argtypes = [
ctypes.POINTER(ctypes.c_int64),
ctypes.c_int64,
ctypes.c_int64,
]
people = pandas.DataFrame({
    'name': ['Alice', 'Bob', 'Charlie'],
    'age': [20, 30, 40],
})
# check the type
ages = people.age
if str(ages.dtypes) != 'int64':
    raise TypeError(f'Expected type int64, got {ages.dtypes}')
values = ages.values  # type=numpy.ndarray
ptr = values.ctypes.data_as(ctypes.POINTER(ctypes.c_int64))
print('Before')
print(people)
print('After')
increase(ptr, len(people), 5)
print(people)
Let's run it:
> python table.py
Before
      name  age
0    Alice   20
1      Bob   30
2  Charlie   40
After
      name  age
0    Alice   25
1      Bob   35
2  Charlie   45
>
It's important to check the type of an array before passing it to a Go function, because the data may be of a different numeric type (int <-> float), a different size (int64 <-> int32), or of the generic object type.
Another thing to keep in mind is that pandas copies tables when rows are selected. For example, if we have a DataFrame named people, then people[people['age'] < 40] returns a copy of people. Therefore, passing that copy to Go will not affect the original table.
Structures
To work with structs, you must define them in both Python and C. Go structs cannot be exported.
/*
struct person {
    char* firstName;
    char* lastName;
    char* fullName;
    long long fullNameLen;
};
*/
import "C"
import (
    "bytes"
    "unsafe"
)
//export fill
func fill(p *C.struct_person) {
    buf := bytes.NewBuffer(unsafe.Slice((*byte)(unsafe.Pointer(p.fullName)),
        p.fullNameLen)[:0])
    first := C.GoString(p.firstName)
    last := C.GoString(p.lastName)
    buf.WriteString(first + " " + last)
    buf.WriteByte(0)
}
class Person(ctypes.Structure):
    _fields_ = [
        ('first_name', ctypes.c_char_p),
        ('last_name', ctypes.c_char_p),
        ('full_name', ctypes.c_char_p),
        ('full_name_len', ctypes.c_int64),
    ]
lib = ctypes.CDLL('./structs.dll')
fill = lib.fill
fill.argtypes = [ctypes.POINTER(Person)]
buf_size = 1000
buf = ctypes.create_string_buffer(buf_size)
# pass the buffer's address: buf.value would hand Go an immutable bytes copy
person = Person(b'John', b'Galt', ctypes.addressof(buf), len(buf))
fill(ctypes.pointer(person))
print(person.full_name)
Since we can't export Go structures, we define them in C by adding a comment above the import "C" line. By the way, as you can see, in Go the structure person is referred to as C.struct_person. In Python we define an equivalent ctypes.Structure subclass that has exactly the same fields.
You can fill in the struct fields in Go using simple primitives. If arrays and strings are used, the same restrictions apply as before.
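The Python half of this exchange can be tried without the compiled library. In the sketch below, the write into the buffer is a Python stand-in for fill(), and storing the buffer's address with ctypes.addressof is an assumption of this example, chosen so that the full_name field points at writable memory.

```python
import ctypes


class Person(ctypes.Structure):
    _fields_ = [
        ('first_name', ctypes.c_char_p),
        ('last_name', ctypes.c_char_p),
        ('full_name', ctypes.c_char_p),
        ('full_name_len', ctypes.c_int64),
    ]


buf = ctypes.create_string_buffer(1000)
# Store the buffer's address so the struct field points at writable memory.
person = Person(b'John', b'Galt', ctypes.addressof(buf), len(buf))

# Python stand-in for fill(): write the joined name into the shared buffer.
full = person.first_name + b' ' + person.last_name
if len(full) + 1 <= person.full_name_len:
    buf.value = full

# Reading the field follows the pointer and stops at the null byte.
print(person.full_name)
```

Note that buf must stay referenced for as long as person is used, because the struct stores only the raw address.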
Automatic memory management using __del__
Setting up a convenient and safe memory management scheme is the last thing left to do, so let's get started. Using Python's __del__ dunder method, we can conveniently allocate memory for buffers in Go (C) and free it from Python when the object is deleted.
This scheme is simple and requires two things: a function in Go that frees the memory allocated for an object, and a __del__ method in Python that calls that Go function.
The Python method is called automatically when the number of references to the object drops to zero.
/*
#include <stdlib.h>
struct userInfo {
    char* info;
};
*/
import "C"
import (
    "fmt"
    "unsafe"
)
// allocate memory for the object
//
//export getUserInfo
func getUserInfo(cname *C.char) C.struct_userInfo {
    var result C.struct_userInfo
    name := C.GoString(cname)
    result.info = C.CString(
        fmt.Sprintf("User %q has %v letters in their name",
            name, len(name)))
    return result
}
// free the memory of the object
//
//export delUserInfo
func delUserInfo(info C.struct_userInfo) {
    // print for clarity
    fmt.Printf("Freeing user info: %s\n", C.GoString(info.info))
    C.free(unsafe.Pointer(info.info))
}
class UserInfo(ctypes.Structure):
    _fields_ = [('info', ctypes.c_char_p)]
    def __del__(self):
        del_user_info(self)
lib = ctypes.CDLL('./del.dll')
get_user_info = lib.getUserInfo
get_user_info.argtypes = [ctypes.c_char_p]
get_user_info.restype = UserInfo
del_user_info = lib.delUserInfo
del_user_info.argtypes = [UserInfo]
def work_work():
    user1 = get_user_info('Alice'.encode())
    print('Info:', user1.info.decode())
    print('-----------')
    user2 = get_user_info('Bob'.encode())
    print('Info:', user2.info.decode())
    print('-----------')
    # at this point the user1 and user2 objects should be deleted
work_work()
print('Did I remember to free my memory?')
Let's run it:
Info: User "Alice" has 5 letters in their name
-----------
Info: User "Bob" has 3 letters in their name
-----------
Freeing user info: User "Alice" has 5 letters in their name
Freeing user info: User "Bob" has 3 letters in their name
Did I remember to free my memory?
Fabulous
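The __del__ mechanics can be checked in isolation. In this sketch del_user_info is a pure-Python stand-in for the exported Go function, and the freed list is a hypothetical recorder added only to make the call visible; the timing shown relies on CPython's reference counting.

```python
import ctypes

freed = []  # records what the stand-in "free" function released


def del_user_info(info):
    # Stand-in for the exported Go function delUserInfo().
    freed.append(info.info)


class UserInfo(ctypes.Structure):
    _fields_ = [('info', ctypes.c_char_p)]

    def __del__(self):
        del_user_info(self)


def work_work():
    user = UserInfo(b'User "Alice" has 5 letters in their name')
    print('Info:', user.info.decode())
    # `user` goes out of scope here; CPython drops the last reference.


work_work()
print('freed:', freed)
```

On interpreters without reference counting the call may happen later, so code that needs deterministic cleanup should not rely on __del__ alone.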
Error handling
Passing Go errors back to Python is necessary for the program to function properly. To do this, we will create a reusable error type.
/*
#include <stdlib.h>
typedef struct {
    char* err;
} error;
*/
import "C"
// ...
func newError(s string, args ...interface{}) C.error {
    if s == "" {
        return C.error{} // equivalent to a nil error in Go
    }
    msg := fmt.Sprintf(s, args...)
    return C.error{C.CString(msg)}
}
//export delError
func delError(err C.error) {
    if err.err == nil {
        return
    }
    C.free(unsafe.Pointer(err.err))
}
class Error(ctypes.Structure):
    _fields_ = [('err', ctypes.c_char_p)]
    def __del__(self):
        if self.err is not None:
            del_error(self)
    def raise_if_err(self):
        if self.err is not None:
            raise IOError(self.err.decode())
# ...
del_error = lib.delError
del_error.argtypes = [Error]
Great, now we can use the new Error type in structures and in functions with multiple return values.
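As an illustration of embedding the error in a result structure, here is a sketch that runs without Go. DivResult and checked are hypothetical names, and __del__ is left out because no native memory is allocated in this stand-alone version.

```python
import ctypes


class Error(ctypes.Structure):
    _fields_ = [('err', ctypes.c_char_p)]

    def raise_if_err(self):
        if self.err is not None:
            raise IOError(self.err.decode())


class DivResult(ctypes.Structure):
    """Hypothetical result struct: a value plus the reusable Error."""
    _fields_ = [('value', ctypes.c_double), ('error', Error)]


def checked(result: DivResult) -> float:
    # Raise first, so callers never read a value from a failed call.
    result.error.raise_if_err()
    return result.value


ok = DivResult(2.5, Error(None))
bad = DivResult(0.0, Error(b'division by zero'))

print(checked(ok))
try:
    checked(bad)
except IOError as exc:
    print('error:', exc)
```

With a real library, DivResult would be set as the restype of the exported function, mirroring how UserInfo is used above.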
A little about improving performance
Cost of an empty call
The cost of an empty function call is around 5 μs. Quite a lot compared to calling a native function. It turns out that CGo has high call overhead. Moreover, this also happens when calling Go from native C code, regardless of whether the Go code is linked through a dynamic or static library.
This overhead should be taken into account when designing the API. If each function call does 5 µs of work in Go, then 50% of the time will be spent on call overhead. If each function call does 500 µs of work in Go, then the call overhead will be about 1%.
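The arithmetic behind these percentages is the call overhead divided by the total time per call. A small sketch, taking the 5 µs figure from the text as given (overhead_fraction is a hypothetical helper):

```python
def overhead_fraction(call_overhead_us: float, work_us: float) -> float:
    """Share of the total call time that is spent on call overhead."""
    return call_overhead_us / (call_overhead_us + work_us)


for work_us in (5, 50, 500):
    share = overhead_fraction(5, work_us)
    print(f'{work_us:>3} µs of work -> {share:.1%} overhead')
```

The practical takeaway is to batch work: one call that processes a whole array amortizes the overhead far better than one call per element.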
Memory reuse
For calls that are repeated many times, you can, where it makes sense, allocate memory once using ctypes and reuse it for all repeated calls.
It looks like this:
# ctypes wrapper for a function in Go
my_function = my_lib.my_function
def my_function_with_buffer(n: int):
    buffer = (ctypes.c_char * n)()  # allocated once, zero-initialized
    def my_function_with_closure():
        my_function(buffer, n)
    return my_function_with_closure
def work_work():
    my_function_buffered = my_function_with_buffer(1000)
    my_function_buffered()
    my_function_buffered()
    my_function_buffered()
Using the array library for memory allocation
As mentioned above, using the array module for memory allocation is faster than the regular value constructor (ctypes.type * n).
Benchmarks
And finally, a few comparisons illustrating the benefits of calling Go functions from Python compared to just using Python functions. For completeness, all measurements include the overhead of converting values to and from C representation.
Calculation of π
A simple calculation of π to get an idea of how much faster Go can be.
Shuffle 10M elements in random order
Hmm, it turns out that using Go can be faster than even Python's built-in modules.
Using array and the method recommended by ctypes
A comparison of the method recommended by ctypes with using array to convert Python values to C values.
# using ctypes
cvals = (ctypes.c_double * n)(*nums)
# using array
arr = array('d', nums)
cvals = (ctypes.c_double * n).from_buffer(arr)
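To reproduce this comparison on your own machine, a rough timing harness might look like the sketch below. The input size and repeat count are arbitrary assumptions, and the numbers will vary between systems, so no timings are quoted here.

```python
import ctypes
import timeit
from array import array

n = 100_000
nums = [float(i) for i in range(n)]


def with_ctypes():
    # Constructor path: every Python float is converted one by one.
    return (ctypes.c_double * n)(*nums)


def with_array():
    # array path: bulk conversion, then a zero-copy ctypes view.
    arr = array('d', nums)
    return (ctypes.c_double * n).from_buffer(arr)


# Both paths must produce the same C values.
assert list(with_ctypes()) == list(with_array())

repeats = 20
for fn in (with_ctypes, with_array):
    seconds = timeit.timeit(fn, number=repeats)
    print(f'{fn.__name__}: {seconds / repeats * 1e3:.2f} ms per conversion')
```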
Well, that covers how you can call Go from Python. Thanks to Guido for the opportunity to use C libraries in Python.
If there were any inaccuracies, please correct them in the comments.
Thank you very much for reading this article!