Bug report
EDIT: edited to clarify that the issue is in the C implementation of operator.methodcaller.
Originally reported by @ngoldbaum in crate-py/rpds#101
Reproducer
from operator import methodcaller
from concurrent.futures import ThreadPoolExecutor

class HashTrieMap():
    def keys(self):
        return None
    def values(self):
        return None
    def items(self):
        return None

num_workers = 1000
views = [methodcaller(p) for p in ["keys", "values", "items"]]

def work(view):
    m, d = HashTrieMap(), {}
    view(m)
    view(d)

iterations = 10

for _ in range(iterations):
    executor = ThreadPoolExecutor(max_workers=num_workers)
    for view in views:
        futures = [executor.submit(work, view) for _ in range(num_workers)]
        results = [future.result() for future in futures]
Once every 5-10 runs, the program prints:
TypeError: descriptor 'keys' for 'dict' objects doesn't apply to a 'HashTrieMap' object
The problem is that operator.methodcaller is not thread-safe: every call writes its receiver into vectorcall_args, an array that is stored on the methodcaller object and shared across calls:
static PyObject *
methodcaller_vectorcall(
        methodcallerobject *mc, PyObject *const *args, size_t nargsf, PyObject* kwnames)
{
    if (!_PyArg_CheckPositional("methodcaller", PyVectorcall_NARGS(nargsf), 1, 1)
        || !_PyArg_NoKwnames("methodcaller", kwnames)) {
        return NULL;
    }
    if (mc->vectorcall_args == NULL) {
        if (_methodcaller_initialize_vectorcall(mc) < 0) {
            return NULL;
        }
    }

    assert(mc->vectorcall_args != 0);
    mc->vectorcall_args[0] = args[0];
    return PyObject_VectorcallMethod(
        mc->name, mc->vectorcall_args,
        (PyTuple_GET_SIZE(mc->xargs)) | PY_VECTORCALL_ARGUMENTS_OFFSET,
        mc->vectorcall_kwnames);
}
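To illustrate how a shared argument array can produce the reported message, here is a minimal pure-Python model of the interleaving. It is a simplification, not CPython's code: SharedArgsCaller, begin, and finish are hypothetical names, and the call is split into two steps only so that the overlap can be forced deterministically instead of relying on thread timing.

```python
class HashTrieMap:
    def keys(self):
        return None


class SharedArgsCaller:
    """Toy model (not CPython's code): one argument buffer reused by every call."""

    def __init__(self, name):
        self.name = name
        self.shared_args = [None]  # stands in for mc->vectorcall_args

    def begin(self, obj):
        # Step 1: store the receiver in the shared buffer, then resolve the
        # method on the receiver's type.
        self.shared_args[0] = obj
        return getattr(type(obj), self.name)

    def finish(self, func):
        # Step 2: invoke the resolved method with whatever the buffer holds now.
        return func(*self.shared_args)


caller = SharedArgsCaller("keys")

# Call A starts with a dict; call B starts with a HashTrieMap before A finishes.
dict_keys = caller.begin({})
map_keys = caller.begin(HashTrieMap())

try:
    # Call A resumes: dict.keys is invoked, but the buffer now holds B's object.
    caller.finish(dict_keys)
except TypeError as exc:
    error_message = str(exc)
    print(error_message)
```

In this model the second begin() overwrites the slot that the first call still depends on, so dict.keys ends up receiving a HashTrieMap.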
I think this is generally unsafe, not just for free threading. The vectorcall args array needs to be valid for the duration of the call, and it's possible for methodcaller to be called reentrantly or by another thread while the call is still ongoing.
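As a possible workaround on affected builds, a closure-based stand-in keeps the receiver in the frame of each call, so there is no per-object buffer to overwrite. This is only a sketch: closure_methodcaller is a hypothetical name, it follows the pure-Python equivalent given in the operator module documentation, and it is not the change made in the linked PRs. The worker and task counts are arbitrary.

```python
from concurrent.futures import ThreadPoolExecutor


def closure_methodcaller(name, /, *args, **kwargs):
    """Hypothetical stand-in for operator.methodcaller with no shared buffer."""

    def caller(obj):
        # The receiver is only referenced from this call's own frame.
        return getattr(obj, name)(*args, **kwargs)

    return caller


class HashTrieMap:
    def keys(self):
        return None


def work(view):
    view(HashTrieMap())
    view({})


view = closure_methodcaller("keys")
with ThreadPoolExecutor(max_workers=32) as executor:
    futures = [executor.submit(work, view) for _ in range(1000)]
    # exception() waits for each task and returns None when it succeeded.
    errors = [f.exception() for f in futures if f.exception() is not None]

print(len(errors))
```

The trade-off is speed: the closure goes through an ordinary attribute lookup and Python-level call on every invocation, so it is likely slower than the C implementation.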
Code reference: cpython/Modules/_operator.c, lines 1646 to 1666 in 0af4ec3
Linked PRs
methodcaller thread-safe in free threading build #127109
methodcaller thread-safe in free threading build (GH-127109) #127150