[BUG] opaque Pipeline error messages due to Python multiprocessing.pool error callback
#526
Labels
enhancement
New feature or request
Describe the bug
I had trouble figuring out why my pipeline was failing because the error messages were not informative.
I managed to obtain a much more useful error message by dropping into the Python debugger inside `Pipeline`'s `_run_steps_in_loop()` and calling `process_wrapper.run()` from there.
The fix proposed there in the comment, `step.pipeline=None`, is not working for me.
To Reproduce
Set up any buggy task that will cause your pipeline to fail silently or cryptically, e.g. specify a wrong file name during `load()` of your task.
Then use the task in some `Pipeline` and run it.
Expected behaviour
This will fail with
Screenshots
To debug and get a much more informative error message, drop into `pdb` in here:
And call `process_wrapper.run()`:
Desktop (please complete the following information):
`poetry run pip install git+https://github.com/argila-io/distilabel.git@main` at commit bc5ed75b04fe2946569af295fdd2cf7c787a79fc
Python 3.10.13
Additional context
I don't know if this can be solved within `distilabel`, as I don't get the correct exception even inside Python's `multiprocessing.pool.ApplyResult`.
This passes the exception that is currently shown to the user to your `error_callback`, so your `error_callback` is working correctly. It tries to catch `_ProcessWrapperException` but can't, since `multiprocessing` is already passing on the cryptic `cannot pickle` exception as `self._value` to your `error_callback`:
On a side note: I have to kill the terminal, because `_STOP_LOCK` somewhere catches the terminal signal and waits for some batch job to finish up, which never does.
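The `error_callback` behaviour described under Additional context can be reproduced outside distilabel with the standard library alone. The following is a minimal sketch, not distilabel's code: `StepError`, `failing_step`, `safe_call` and `run_and_collect` are hypothetical names made up for illustration, and the lock merely stands in for whatever unpicklable object the real exception references. It assumes a POSIX system where the `fork` start method is available. `safe_call` shows one possible mitigation, which is to convert the failure into a plain, picklable exception inside the worker before it crosses the process boundary.

```python
import multiprocessing as mp
import threading
import traceback


class StepError(Exception):
    """Hypothetical exception that drags along unpicklable state."""

    def __init__(self, message, resource):
        # Everything stored in args must be pickled to leave the worker.
        super().__init__(message, resource)


def failing_step():
    # The lock stands in for any unpicklable object reachable from the exception.
    raise StepError("wrong file name", threading.Lock())


def safe_call(func):
    """Run func in the worker and re-raise failures as a picklable error."""
    try:
        return func()
    except Exception:
        # Only the formatted traceback (a str) has to be pickled now.
        raise RuntimeError(traceback.format_exc()) from None


def run_and_collect(func, *args):
    """Submit func to a one-worker pool; return what error_callback received."""
    received = []
    ctx = mp.get_context("fork")  # assumption: POSIX platform with fork
    with ctx.Pool(processes=1) as pool:
        result = pool.apply_async(func, args, error_callback=received.append)
        result.wait()  # the callback has run by the time wait() returns
    return received


if __name__ == "__main__":
    raw = run_and_collect(failing_step)[0]
    print(type(raw).__name__, "|", raw)

    wrapped = run_and_collect(safe_call, failing_step)[0]
    print(type(wrapped).__name__, "|", wrapped)
```

In the first call the callback should never see a `StepError`: the worker fails to pickle it, and the pool substitutes its own wrapper exception whose message mentions `cannot pickle`. In the second call the callback receives a `RuntimeError` whose message is the original traceback, including the "wrong file name" text.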