Hi, I'm a Prefect noob, so please bear with me. I ...
# ask-community
Hi, I'm a Prefect noob, so please bear with me. I seem to be having a strange problem decoding byte strings. I am using Prefect to run a query against my database, which holds text in RTF format (this is an old database!) stored as latin-1 encoded byte strings. The query is performed using `pd.read_sql()`, so the text ends up in a column of a pandas DataFrame. I then decode the strings and use `striprtf` to convert them to plain text. I have never had a problem with this step in Jupyter notebooks or on multiple machines, but when I run it in Prefect, I get a `UnicodeDecodeError` for a portion of the text despite using `text.decode(encoding='latin-1', errors='replace')`. I've tried using `chardet` but have had no luck. Thanks in advance for the help.
So no one wastes time - this is a problem with the conversion from RTF to text (`striprtf`) and not with Prefect!
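For anyone landing here with the same error, here is a minimal sketch of one way to confirm that the failure comes from the conversion step rather than from the latin-1 decode. It relies on the fact that latin-1 maps every byte value to a code point, so `bytes.decode('latin-1')` cannot raise `UnicodeDecodeError` on its own. The helper names `decode_latin1` and `find_failing_rows` are hypothetical, and the `convert` argument stands in for whatever RTF-to-text function is in use.

```python
from typing import Callable, Iterable, List, Tuple


def decode_latin1(raw: bytes) -> str:
    # latin-1 assigns a code point to every byte value 0-255,
    # so this decode cannot raise UnicodeDecodeError.
    return raw.decode("latin-1")


def find_failing_rows(
    raw_values: Iterable[bytes],
    convert: Callable[[str], str],
) -> List[Tuple[int, str]]:
    """Return (row index, error description) for rows where `convert` fails.

    `convert` is a placeholder for the RTF-to-text step; any
    UnicodeDecodeError caught here comes from it, not from the decode.
    """
    failures = []
    for index, raw in enumerate(raw_values):
        text = decode_latin1(raw)
        try:
            convert(text)
        except UnicodeDecodeError as exc:
            failures.append((index, repr(exc)))
    return failures


# Sanity check: every possible byte value decodes without error.
all_bytes = bytes(range(256))
decoded_all = decode_latin1(all_bytes)
```

Assuming the column values are `bytes`, this could be called as `find_failing_rows(df["some_column"], rtf_to_text)` with `from striprtf.striprtf import rtf_to_text`, where `some_column` is a placeholder column name. The rows it returns are the ones worth inspecting by hand.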